LDAP High Availability: Load Balancing and Production Architecture

Reading Time: 6 minutes

The Identity Stack, Episode 7
EP06: OpenLDAP → EP07 → EP08: FreeIPA → …


TL;DR

  • LDAP HA means multiple directory servers behind a load balancer — clients connect to a VIP, not to individual servers
  • Read/write split: all writes go to the provider, reads are distributed across consumers — the load balancer enforces this by routing on port or backend check
  • SSSD handles multi-server failover natively (ldap_uri accepts a comma-separated list) — for apps without built-in failover, HAProxy with health checks does the work
  • Connection pooling is critical at scale — nss_ldap and pam_ldap opened a new connection per login; SSSD maintains a pool; apps that use libldap directly must implement their own
  • cn=monitor is the built-in monitoring endpoint — exposes connection counts, operation rates, and backend stats readable via ldapsearch
  • 389-DS (Red Hat Directory Server) is the production choice for >1M entries — purpose-built for large directories with a dedicated replication engine

The Big Picture: Production LDAP Topology

         Clients (SSSD, apps, VPN concentrators)
                      │
              ┌───────▼───────┐
              │   HAProxy VIP  │   ← single endpoint, port 389/636
              │  10.0.0.10     │
              └───────┬───────┘
                      │
          ┌───────────┼───────────┐
          ▼           ▼           ▼
   ldap1.corp.com  ldap2.corp.com  ldap3.corp.com
   (Provider)      (Consumer)      (Consumer)
   Reads + Writes  Reads only      Reads only
          │           ▲               ▲
          └───────────┴───────────────┘
               SyncRepl replication

EP06 built a two-node replicated directory. This episode covers what happens when the directory becomes infrastructure — when it needs to survive a node failure, handle thousands of connections, and be monitored like any other critical service.


HAProxy for LDAP

HAProxy is the standard choice for LDAP load balancing. Unlike HTTP, LDAP is a stateful protocol — once a client binds, subsequent operations on that connection share the authenticated session. The load balancer must use connection persistence, not per-request routing.

# /etc/haproxy/haproxy.cfg

global
    log /dev/log local0
    maxconn 50000

defaults
    mode tcp                  # LDAP is TCP, not HTTP
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    option tcplog

# ── LDAP read/write split ─────────────────────────────────────────────

# Writes → provider only
frontend ldap-write
    bind *:389
    default_backend ldap-provider

backend ldap-provider
    balance first                   # always use first available (provider)
    option tcp-check
    tcp-check connect
    server ldap1 ldap1.corp.com:389 check inter 5s rise 2 fall 3
    server ldap2 ldap2.corp.com:389 check inter 5s rise 2 fall 3 backup

# Reads → all nodes round-robin
frontend ldap-read
    bind *:3389                     # internal read port
    default_backend ldap-consumers

backend ldap-consumers
    balance roundrobin
    option tcp-check
    tcp-check connect
    server ldap1 ldap1.corp.com:389 check inter 5s
    server ldap2 ldap2.corp.com:389 check inter 5s
    server ldap3 ldap3.corp.com:389 check inter 5s

# LDAPS (TLS) — pass-through: slapd terminates TLS, HAProxy forwards raw TCP
# (putting "ssl" on the server lines here would wrap the client's TLS inside
# a second TLS session; use check-ssl so only the health checks speak TLS)
frontend ldaps
    bind *:636
    default_backend ldap-consumers-tls

backend ldap-consumers-tls
    balance roundrobin
    server ldap1 ldap1.corp.com:636 check check-ssl inter 5s verify required ca-file /etc/ssl/certs/ca.pem
    server ldap2 ldap2.corp.com:636 check check-ssl inter 5s verify required ca-file /etc/ssl/certs/ca.pem
    server ldap3 ldap3.corp.com:636 check check-ssl inter 5s verify required ca-file /etc/ssl/certs/ca.pem

The health check (tcp-check connect) just verifies TCP connectivity. For a more precise check — verifying that slapd is actually responding to LDAP requests — use a custom script that runs ldapsearch and checks the result code.
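
HAProxy also ships a built-in option ldap-check that sends an LDAPv3 anonymous bind as the probe, a step up from a bare TCP check. For a full end-to-end probe, here is a minimal external-check sketch; the base DN, script path, and timeout are assumptions to adapt:

#!/usr/bin/env bash
# /usr/local/bin/ldap-health.sh — a real LDAP probe for HAProxy's external-check.
# Wire it up in the backend with:
#   option external-check
#   external-check command /usr/local/bin/ldap-health.sh
# (recent HAProxy versions may also need 'insecure-fork-wanted' in global)
# HAProxy exports HAPROXY_SERVER_ADDR / HAPROXY_SERVER_PORT to the script.
# Exit 0 = healthy: slapd answered a real search within 3 seconds.
ldapsearch -x -o nettimeout=3 \
  -H "ldap://${HAPROXY_SERVER_ADDR}:${HAPROXY_SERVER_PORT}" \
  -b "dc=corp,dc=com" -s base "(objectClass=*)" dn > /dev/null 2>&1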


SSSD Multi-Server Failover

SSSD has native failover — no load balancer required for SSSD-based clients:

# /etc/sssd/sssd.conf
[domain/corp.com]
ldap_uri = ldap://ldap1.corp.com, ldap://ldap2.corp.com, ldap://ldap3.corp.com
# SSSD tries them in order; switches to next on failure
# Switches back to a primary server after failover_primary_timeout (default: 31s)

# For AD, discovery via DNS SRV records is even better:
ad_server = _srv_
# SSSD queries _ldap._tcp.corp.com SRV records and gets all DCs automatically

SSSD monitors the connection health. If the current server becomes unreachable, it switches to the next in the list within seconds. Existing cached data keeps serving during the switchover. Clients using SSSD don’t need a load balancer for basic HA.
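
To confirm which server SSSD is currently bound to, sssctl from the sssd-tools package reports per-domain failover state (exact output shape varies by SSSD version):

# Show SSSD's failover state and active server for the domain
sssctl domain-status corp.com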


Connection Pooling

Every LDAP bind creates an authenticated session on the server, and slapd’s limits are finite: olcConnMaxPending and olcConnMaxPendingAuth (OLC) cap how many operations may queue on an anonymous or authenticated connection, and the process file-descriptor limit caps total connections. Overwhelm any of these and new work gets refused.

The problem: applications that use libldap directly tend to open a new connection per operation. At 500 requests/second, that’s 500 new TCP connections, 500 binds, 500 TLS handshakes per second — a directory that can handle 5000 concurrent connections starts refusing new ones.
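
One way to see this churn on the directory server itself is to count LDAP sockets by TCP state: a flood of short-lived connections shows up as a large TIME-WAIT pile next to a modest ESTABLISHED count. A quick diagnostic sketch:

# Count LDAP sockets by TCP state on the directory server
ss -tan '( sport = :389 or sport = :636 )' | awk 'NR>1 {print $1}' | sort | uniq -c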

The solutions:

SSSD — handles this automatically. SSSD maintains one or a small number of persistent connections per domain and multiplexes all PAM/NSS queries through them.

Application-level pooling — frameworks like python-ldap with connection pooling, ldap3 with connection strategies, or dedicated middleware like 389-DS’s Directory Proxy Server.

Server-side limits in OpenLDAP — there is no single “max connections” knob; olcConnMaxPending (default 100) and olcConnMaxPendingAuth (default 1000) bound per-connection operation queues, and the file-descriptor limit bounds total connections. Tune these deliberately so overload produces a controlled failure mode instead of unbounded queuing.


Monitoring with cn=monitor

OpenLDAP exposes live operational statistics via the cn=monitor database — a virtual LDAP subtree that reflects the server’s current state. Enable it:

# enable-monitor.ldif
dn: cn=module,cn=config
objectClass: olcModuleList
cn: module
olcModulePath: /usr/lib/ldap
olcModuleLoad: back_monitor

dn: olcDatabase=monitor,cn=config
objectClass: olcDatabaseConfig
olcDatabase: monitor
olcAccess: to *
  by dn="cn=admin,dc=corp,dc=com" read
  by * none

Query it:

# Overall statistics
ldapsearch -x -H ldap://localhost \
  -D "cn=admin,dc=corp,dc=com" -w password \
  -b "cn=monitor" -s sub "(objectClass=*)" \
  monitorOpInitiated monitorOpCompleted

# Connection counts
ldapsearch -x -H ldap://localhost \
  -D "cn=admin,dc=corp,dc=com" -w password \
  -b "cn=Connections,cn=monitor" -s one \
  monitorConnectionNumber

# Operations by type
ldapsearch -x -H ldap://localhost \
  -D "cn=admin,dc=corp,dc=com" -w password \
  -b "cn=Operations,cn=monitor" -s one \
  monitorOpInitiated monitorOpCompleted

Useful metrics to export to Prometheus (via prometheus-openldap-exporter or similar):
– monitorOpCompleted per operation type (bind, search, modify)
– monitorConnectionNumber — current connection count
– Backend-specific: olmMDBEntries, olmMDBPagesMax, olmMDBPagesUsed
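
If a full exporter is overkill, the same counters can be scraped by cron into the node_exporter textfile collector. A minimal sketch; the bind credentials and output path are assumptions, and a password file beats -w in production:

# Publish the current connection count in Prometheus exposition format
COUNT=$(ldapsearch -x -H ldap://localhost \
  -D "cn=admin,dc=corp,dc=com" -w password \
  -b "cn=Current,cn=Connections,cn=monitor" -s base monitorCounter \
  | awk '/^monitorCounter:/ {print $2}')
echo "openldap_current_connections ${COUNT}" \
  > /var/lib/node_exporter/textfile/openldap.prom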


389-DS: LDAP at Scale

OpenLDAP is excellent for directories up to a few million entries. When you need:
– 10M+ entries
– High write throughput (more than a few hundred writes/second)
– Fine-grained replication filtering
– A dedicated web-based admin UI

…389-DS (Red Hat Directory Server, community edition) is the production answer. It’s what FreeIPA uses under the hood.

Key architectural differences from OpenLDAP:

Multi-supplier replication — 389-DS’s replication engine uses a dedicated changelog and Change Sequence Numbers (CSNs) for conflict resolution. Multi-supplier (multi-master) replication is first-class, not a bolted-on feature.

Changelog — every change is written to a persistent changelog before being applied. This enables precise replication: a consumer can reconnect after a network partition and get exactly the changes it missed, rather than doing a full resync.

Plugin architecture — 389-DS functionality (replication, managed entries, DNA for automatic UID allocation, memberOf, password policy) is all implemented as plugins that can be enabled/disabled per directory instance.

# Install 389-DS
dnf install -y 389-ds-base

# Create a new instance
dscreate interactive
# — or use a template:
dscreate from-file /path/to/instance.inf

# Manage with dsctl
dsctl slapd-corp status
dsctl slapd-corp start
dsctl slapd-corp stop

# Admin with dsconf
dsconf slapd-corp backend suffix list
dsconf slapd-corp replication status --suffix "dc=corp,dc=com"

The dsconf replication status command gives a live view of replication lag across all suppliers and consumers — something OpenLDAP requires you to compute manually from contextCSN comparisons.


Global Catalog: Cross-Domain Search in AD

When your directory spans multiple AD domains in a forest, the Global Catalog solves a specific problem: a user in emea.corp.com needs to be found by an app that only knows corp.com.

Forest: corp.com
  ├── corp.com       → DC port 389    full directory: 500K entries
  ├── emea.corp.com  → DC port 389    full directory: 200K entries
  └── Global Catalog → GC port 3268  partial replica: 700K entries
                                       (not all attributes — just the most queried ones)

The GC replicates a subset of attributes from every domain in the forest. By default: cn, mail, sAMAccountName, userPrincipalName, memberOf, and about 150 others. Attributes marked with isMemberOfPartialAttributeSet in the schema are replicated to the GC.

If an application is configured to use port 3268 instead of 389, it’s using the GC — and it won’t see attributes not included in the partial attribute set. This surprises teams that add a custom attribute to AD and then wonder why their application can’t see it on 3268 but can on 389.
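
To query the GC directly, point ldapsearch at port 3268; same protocol, forest-wide scope. The hostname and credentials below are placeholders:

# Forest-wide lookup via the Global Catalog
ldapsearch -x -H ldap://dc1.corp.com:3268 \
  -D "svc-reader@corp.com" -w password \
  -b "dc=corp,dc=com" "(sAMAccountName=vamshi)" cn mail userPrincipalName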


⚠ Production Gotchas

HAProxy TCP health checks don’t verify LDAP is responsive. A server can accept TCP connections but have slapd in a degraded state (database corruption, out-of-memory). Build a proper LDAP health check: a script that binds and searches a known entry and checks the result.

Replication lag under write load. SyncRepl consumers can fall behind under sustained write load. Monitor the contextCSN difference between provider and consumers. If consumers are more than a few seconds behind, investigate the provider’s write throughput and the consumer’s processing speed.
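
A minimal lag check is a loop that prints each node’s contextCSN; matching values mean the consumers have caught up (admin credentials assumed, suffix from EP06):

# Print each node's contextCSN side by side
for h in ldap1.corp.com ldap2.corp.com ldap3.corp.com; do
  printf '%-18s ' "$h"
  ldapsearch -x -H "ldap://$h" \
    -D "cn=admin,dc=corp,dc=com" -w password \
    -b "dc=corp,dc=com" -s base contextCSN | awk '/^contextCSN:/ {print $2}'
done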

Directory size and the MDB mapsize. LMDB requires a pre-configured maximum database size (olcDbMaxSize). If the database grows beyond this, slapd starts failing writes. Set it to 2–4x your expected data size and monitor olmMDBPagesUsed / olmMDBPagesMax.


Key Takeaways

  • HAProxy in TCP mode provides LDAP load balancing — use balance first for write routing (provider only), balance roundrobin for reads
  • SSSD has native failover via ldap_uri — for SSSD clients, a load balancer adds HA but isn’t strictly required
  • cn=monitor is the built-in OpenLDAP monitoring endpoint — export its counters to Prometheus for operational visibility
  • 389-DS is the right choice for >1M entries, high write throughput, or multi-supplier replication as a first-class feature
  • Global Catalog (port 3268/3269) is a partial replica of all AD domains — useful for forest-wide searches, but missing non-replicated attributes

What’s Next

This episode covered the infrastructure layer. EP08 zooms out to FreeIPA — what you get when LDAP, Kerberos, DNS, PKI, and HBAC are integrated into a single Linux-native identity stack, and why most Linux shops running their own directory should be running FreeIPA instead of bare OpenLDAP.

Next: FreeIPA: LDAP + Kerberos + PKI in a Single Linux Identity Stack

Get EP08 in your inbox when it publishes → linuxcent.com/subscribe

OpenLDAP Setup and Replication: Running Your Own Directory

Reading Time: 5 minutes

The Identity Stack, Episode 6
EP01 → … → EP05: Kerberos → EP06 → EP07: LDAP HA → …


TL;DR

  • OpenLDAP’s server process is slapd — the backend that stores data is MDB (LMDB), a memory-mapped B-tree that replaced the old Berkeley DB backend
  • Configuration lives in the directory itself: cn=config (OLC — Online Configuration) lets you modify slapd at runtime without restarting
  • SyncRepl is the replication protocol: a consumer subscribes to a provider and stays in sync via either polling (refreshOnly) or a persistent connection (refreshAndPersist)
  • Multi-Provider (formerly Multi-Master) lets multiple nodes accept writes — conflict resolution uses CSN (Change Sequence Number), last-writer-wins
  • The essential tools: slapd, ldapadd, ldapmodify, ldapsearch, slapcat, slaptest
  • Always build indexes on the attributes you search most — uid, cn, memberOf — or every search is a full scan

The Big Picture: slapd Architecture

ldapsearch / ldapadd / SSSD / any LDAP client
              │ TCP 389 / 636
              ▼
         ┌─────────────────────────────────┐
         │  slapd (OpenLDAP server)         │
         │                                 │
         │  Frontend (protocol layer)       │
         │    • parse BER requests          │
         │    • ACL enforcement             │
         │    • schema validation           │
         │                                 │
         │  Backend (storage layer)         │
         │    • MDB (LMDB) — default       │
         │    • memory-mapped file I/O      │
         │    • ACID transactions           │
         └────────────┬────────────────────┘
                      │
              /var/lib/ldap/
              data.mdb   (the directory data)
              lock.mdb   (LMDB lock file)

EP05 showed Kerberos in isolation. OpenLDAP is where you run the identity store that Kerberos references — and where SSSD looks up user and group attributes. This episode builds a working two-node replicated directory from scratch.


Installation

# Ubuntu / Debian
apt-get install -y slapd ldap-utils

# RHEL / Rocky / AlmaLinux
dnf install -y openldap-servers openldap-clients

# After install — Debian/Ubuntu runs a minimal configuration wizard
# Re-run it at any time with: dpkg-reconfigure slapd
# Or answer it once and switch to OLC management afterwards

On RHEL-family systems, slapd is not configured after install — you work entirely through OLC from the start.


OLC: The Directory Configures Itself

The old way was slapd.conf — a static file that required a full restart on every change. OLC (Online Configuration) replaced it: slapd’s own configuration is stored as LDAP entries under cn=config. You modify configuration the same way you modify data — with ldapmodify. Changes take effect immediately.

cn=config                        ← root config entry
├── cn=schema,cn=config          ← schema definitions
│     ├── cn={0}core             ← core schema
│     ├── cn={1}cosine           ← RFC 1274 attributes
│     └── cn={2}inetorgperson    ← inetOrgPerson object class
├── olcDatabase={-1}frontend     ← default settings for all databases
├── olcDatabase={0}config        ← the config database itself
└── olcDatabase={1}mdb           ← your actual directory data
      ├── olcAccess              ← ACLs
      ├── olcSuffix              ← base DN (e.g., dc=corp,dc=com)
      └── olcDbIndex             ← search indexes

Everything under cn=config has attributes prefixed with olc (OpenLDAP Configuration). You query and modify it just like any other LDAP subtree — with one restriction: only the cn=config admin (usually gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth — the local root via SASL EXTERNAL) can write to it.
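
For example, listing the configured databases and their suffixes as local root over the ldapi socket:

# Inspect the live configuration (no restart, no slapd.conf)
ldapsearch -Y EXTERNAL -H ldapi:/// \
  -b "cn=config" "(olcDatabase=*)" olcDatabase olcSuffix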


Bootstrapping a Directory

The quickest way to get a working directory is a set of LDIF files applied in order.

1. Load schemas

# Apply the schemas OpenLDAP ships with
ldapadd -Y EXTERNAL -H ldapi:/// \
  -f /etc/ldap/schema/cosine.ldif
ldapadd -Y EXTERNAL -H ldapi:/// \
  -f /etc/ldap/schema/inetorgperson.ldif
ldapadd -Y EXTERNAL -H ldapi:/// \
  -f /etc/ldap/schema/nis.ldif       # adds posixAccount, posixGroup

2. Configure the MDB database

# mdb-config.ldif
dn: olcDatabase={1}mdb,cn=config
changetype: modify
replace: olcSuffix
olcSuffix: dc=corp,dc=com
-
replace: olcRootDN
olcRootDN: cn=admin,dc=corp,dc=com
-
replace: olcRootPW
olcRootPW: {SSHA}hashed_password_here

Generate the hash: slappasswd -s yourpassword

ldapmodify -Y EXTERNAL -H ldapi:/// -f mdb-config.ldif

3. Add indexes

# indexes.ldif
dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcDbIndex
olcDbIndex: uid eq,pres
olcDbIndex: cn eq,sub
olcDbIndex: sn eq,sub
olcDbIndex: mail eq
olcDbIndex: memberOf eq
olcDbIndex: entryCSN eq
olcDbIndex: entryUUID eq

The last two (entryCSN, entryUUID) are required for SyncRepl replication to work efficiently.
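
If the database already holds entries when an index is added, the existing data may need an offline rebuild; slapindex regenerates index files from the stored entries. A sketch with Debian/Ubuntu paths (RHEL uses /etc/openldap/slapd.d and the ldap user):

# Rebuild indexes offline after changing olcDbIndex on a populated database
systemctl stop slapd
slapindex -F /etc/ldap/slapd.d -b "dc=corp,dc=com"
chown -R openldap:openldap /var/lib/ldap
systemctl start slapd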

4. Load initial data

# base.ldif
dn: dc=corp,dc=com
objectClass: top
objectClass: dcObject
objectClass: organization
o: Corp
dc: corp

dn: ou=people,dc=corp,dc=com
objectClass: organizationalUnit
ou: people

dn: ou=groups,dc=corp,dc=com
objectClass: organizationalUnit
ou: groups

dn: uid=vamshi,ou=people,dc=corp,dc=com
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: shadowAccount
cn: Vamshi Krishna
sn: Krishna
uid: vamshi
uidNumber: 1001
gidNumber: 1001
homeDirectory: /home/vamshi
loginShell: /bin/bash
mail: vamshi@corp.com
userPassword: {SSHA}hashed_password_here

ldapadd -x -H ldap://localhost \
  -D "cn=admin,dc=corp,dc=com" \
  -w adminpassword \
  -f base.ldif

ACLs: Who Can Read What

OpenLDAP ACLs are evaluated top-to-bottom; first match wins.

# acls.ldif — set via OLC
dn: olcDatabase={1}mdb,cn=config
changetype: modify
replace: olcAccess
# Users can change their own passwords
olcAccess: to attrs=userPassword
  by self write
  by anonymous auth
  by * none
# Users can read entries under ou=people
olcAccess: to dn.subtree="ou=people,dc=corp,dc=com"
  by self read
  by users read
  by * none
# Service accounts can read everything (for SSSD)
olcAccess: to *
  by dn="cn=svc-ldap,ou=services,dc=corp,dc=com" read
  by self read
  by * none

A service account (cn=svc-ldap) that SSSD uses to search the directory needs read access to ou=people and ou=groups. Never give SSSD admin (write) access.
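
A quick way to prove the ACLs do what the comments claim is to run the same search under different identities (the service-account password is a placeholder):

# As the service account: should return the entry
ldapsearch -x -H ldap://localhost \
  -D "cn=svc-ldap,ou=services,dc=corp,dc=com" -w svc-password \
  -b "ou=people,dc=corp,dc=com" "(uid=vamshi)" uid

# Anonymous: should return nothing under these ACLs
ldapsearch -x -H ldap://localhost \
  -b "ou=people,dc=corp,dc=com" "(uid=vamshi)" uid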


SyncRepl Replication

SyncRepl is a pull-based replication protocol built on the LDAP Sync operation (RFC 4533). A consumer connects to a provider and requests changes. The provider sends them. The consumer stays in sync.

On the Provider: Enable the syncprov overlay

# syncprov.ldif
# (LDIF comments must be on their own lines — inline # would join the value)
dn: olcOverlay=syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
# checkpoint every 100 ops or 10 minutes
olcSpCheckpoint: 100 10
# keep last 100 changes for delta-sync
olcSpSessionLog: 100

ldapadd -Y EXTERNAL -H ldapi:/// -f syncprov.ldif

On the Consumer: Configure syncrepl

# consumer-config.ldif
# type=refreshAndPersist: persistent connection (vs refreshOnly = polling)
# retry: 5 times every 5s, then every 60s forever
# interval applies only to refreshOnly: sync every 5 minutes
dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcSyncrepl
olcSyncrepl: rid=001
  provider=ldap://ldap1.corp.com:389
  bindmethod=simple
  binddn="cn=repl-svc,dc=corp,dc=com"
  credentials=replication-password
  searchbase="dc=corp,dc=com"
  scope=sub
  schemachecking=on
  type=refreshAndPersist
  retry="5 5 60 +"
  interval=00:00:05:00
-
# redirect writes to provider
add: olcUpdateRef
olcUpdateRef: ldap://ldap1.corp.com

refreshAndPersist keeps a persistent connection open. Changes replicate within milliseconds. refreshOnly polls on an interval — simpler, but adds latency.

Verify Replication

# On provider: check the contextCSN (the sync state token)
ldapsearch -x -H ldap://ldap1.corp.com \
  -D "cn=admin,dc=corp,dc=com" -w password \
  -b "dc=corp,dc=com" -s base contextCSN
# contextCSN: 20260427010000.000000Z#000000#000#000000

# On consumer: should match after sync
ldapsearch -x -H ldap://ldap2.corp.com \
  -D "cn=admin,dc=corp,dc=com" -w password \
  -b "dc=corp,dc=com" -s base contextCSN
# Same CSN = in sync

Multi-Provider: Accepting Writes on Both Nodes

Standard SyncRepl has one provider and one or more consumers — only the provider accepts writes. Multi-Provider (formerly Multi-Master) lets every node accept writes.

# On each node — add mirrormode to the database config
dn: olcDatabase={1}mdb,cn=config
changetype: modify
add: olcMirrorMode
olcMirrorMode: TRUE

With mirrormode enabled — plus a distinct olcServerID on each node — and each node configured as both provider and consumer of the other, writes on either node replicate to the other. Conflict resolution is CSN-based (Change Sequence Number) — a monotonically increasing timestamp. Last write wins at the attribute level.
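
A sketch of that olcServerID prerequisite: the same value list goes on every node, and each server identifies itself by matching its own listener URL:

# serverid.ldif (apply on both nodes before enabling mirrormode)
dn: cn=config
changetype: modify
add: olcServerID
olcServerID: 1 ldap://ldap1.corp.com
olcServerID: 2 ldap://ldap2.corp.com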

Multi-Provider does not prevent split-brain conflicts — if two clients write the same attribute on two different nodes during a network partition, the higher CSN wins when the partition heals. For most directory use cases (user passwords, group memberships), this is acceptable. For others, it requires careful thought.


⚠ Production Gotchas

MDB data file grows monotonically. LMDB never shrinks the data file automatically. Deleted entries leave free space inside the file that gets reused, but the file on disk doesn’t shrink. Use slapcat to export and slapadd to reimport if you need to reclaim disk space.

slapcat is the only safe backup. slapcat reads the MDB database directly and exports LDIF — it does not go through slapd. Run it while slapd is running (LMDB is MVCC-safe for readers), but never copy the raw MDB files while slapd is running.

Schema changes on a replicated directory require coordination. Load the new schema on the provider first. SyncRepl will propagate it to consumers — but if a consumer gets a new entry using the new schema before the schema itself is replicated, the import will fail. Load schemas manually on all nodes before adding entries that use them.


Key Takeaways

  • OpenLDAP uses LMDB (MDB backend) — a memory-mapped, ACID-compliant storage engine with no external dependency
  • OLC (cn=config) is the right way to configure slapd — changes apply without restarts
  • SyncRepl pulls changes from a provider to a consumer — refreshAndPersist for near-real-time, refreshOnly for poll-based
  • Always index uid, cn, entryCSN, and entryUUID — unindexed searches are full scans
  • Multi-Provider allows writes on all nodes with CSN-based last-write-wins conflict resolution

What’s Next

A single OpenLDAP server works. Two nodes with SyncRepl work better. EP07 goes further: how you put multiple LDAP servers behind a load balancer, how connection pooling works, what to monitor, and how 389-DS handles directories with tens of millions of entries.

Next: LDAP High Availability: Load Balancing and Production Architecture

Get EP07 in your inbox when it publishes → linuxcent.com/subscribe

LDAP Internals: The Directory Tree, Schema, and What Travels on the Wire

Reading Time: 12 minutes

The Identity Stack, Episode 2
EP01: What Is LDAP → EP02 → EP03: LDAP Authentication on Linux → …


TL;DR

  • The Directory Information Tree (DIT) is the hierarchical database LDAP stores — every entry lives at a unique path described by its Distinguished Name (DN)
  • Object classes define what attributes an entry is allowed or required to have — posixAccount adds UID, GID, and home directory; inetOrgPerson adds email and display name
  • Schema is the rulebook: which attribute types exist across the entire directory, what syntax each follows, and which object classes require or permit them
  • An LDAP Search sends four things: a base DN, a scope (base/one/sub), a filter like (uid=vamshi), and a list of attributes to return — the server traverses the tree and returns LDIF
  • Every LDAP message on the wire is BER-encoded (Basic Encoding Rules, a subset of ASN.1) — a compact binary format, not text
  • ldapsearch output is LDIF (LDAP Data Interchange Format) — the human-readable representation of what the BER payload carried

The Big Picture: From ldapsearch to Directory Entry

ldapsearch -x -H ldap://dc.corp.com -b "dc=corp,dc=com" "(uid=vamshi)" cn mail uidNumber
     │
     │  TCP port 389 (or 636 for LDAPS)
     │  BER-encoded SearchRequest
     ▼
┌─────────────────────────────────────────────────┐
│  LDAP Server (AD / OpenLDAP / 389-DS / FreeIPA)  │
│                                                   │
│  Directory Information Tree                       │
│                                                   │
│  dc=corp,dc=com                    ← search base  │
│    └── ou=engineers                ← scope: sub   │
│          ├── uid=alice                            │
│          └── uid=vamshi  ← filter match           │
│                cn: vamshi                         │
│                mail: vamshi@corp.com                   │
│                uidNumber: 1001                    │
└─────────────────────────────────────────────────┘
     │
     │  BER-encoded SearchResultEntry
     ▼
# LDIF output on your terminal
dn: uid=vamshi,ou=engineers,dc=corp,dc=com
cn: vamshi
mail: vamshi@corp.com
uidNumber: 1001

LDAP internals are the mechanics between the command you type and the directory entry you get back. EP01 explained why LDAP was invented. This episode explains what it actually does when you run it.


The Directory Information Tree

EP01 introduced the DIT as a concept inherited from X.500. Here’s what it actually looks like inside a directory.

Every LDAP directory has a root — the base DN — from which all entries descend. For a company called Corp with a domain corp.com, the base is typically dc=corp,dc=com. Below that, the tree branches into organizational units, and below those, individual entries for people, groups, services, and anything else the directory administrator decided to model.

dc=corp,dc=com                          ← domain root (base DN)
│
├── ou=people                           ← organizational unit: people
│     ├── uid=alice                     ← user entry
│     ├── uid=vamshi
│     └── uid=bob
│
├── ou=groups                           ← organizational unit: groups
│     ├── cn=engineers
│     └── cn=ops
│
├── ou=services                         ← organizational unit: service accounts
│     ├── cn=jenkins
│     └── cn=gitlab-runner
│
└── ou=hosts                            ← organizational unit: machines
      ├── cn=web01.corp.com
      └── cn=db01.corp.com

This hierarchy is not a file system and not a relational database. It is specifically optimized for reads — the query “give me everything about this user” is the operation the protocol is built around. Writes are infrequent. Reads are constant.

Every entry in the tree has exactly one parent. There are no cross-links between branches, no foreign keys. The tree is the structure. An entry’s position in the tree is what defines it.


Distinguished Names: Reading the Path

The Distinguished Name (DN) is how you address any entry in the directory. It reads right-to-left, from the leaf to the root, with each component separated by a comma.

uid=vamshi,ou=engineers,dc=corp,dc=com

Reading right-to-left:
  dc=corp,dc=com       ← domain: corp.com
  ou=engineers         ← organizational unit: engineers
  uid=vamshi           ← this specific entry: user "vamshi"

Each component of a DN — uid=vamshi, ou=engineers, dc=corp — is a Relative Distinguished Name (RDN). The RDN is the attribute-value pair that uniquely identifies the entry within its parent container. Two users in the same ou=engineers cannot both have uid=vamshi — that would create two entries with identical DNs, which the directory won’t allow.

Common RDN attribute types and what they mean:

Attribute   Stands for            Typical use
dc          Domain Component      Domain name segments (dc=corp,dc=com = corp.com)
ou          Organizational Unit   Container for grouping entries
cn          Common Name           Groups, service accounts, human-readable name
uid         User ID               Linux username — the standard RDN for user entries
o           Organization          Top-level org containers (less common in modern setups)

When your Linux system calls getent passwd vamshi, SSSD translates that into an LDAP Search for an entry where uid=vamshi somewhere under the configured base DN. The full DN comes back with the result, but what your system cares about are the attributes inside it.
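
The translation is easy to see side by side (the directory hostname is assumed):

# The NSS lookup...
getent passwd vamshi

# ...and the search SSSD issues to answer it
ldapsearch -x -H ldap://dc.corp.com -b "dc=corp,dc=com" \
  "(&(objectClass=posixAccount)(uid=vamshi))" \
  uid uidNumber gidNumber homeDirectory loginShell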


Object Classes and Schema

Every entry in the directory has an objectClass attribute — usually several values. Object classes define what attributes the entry is allowed or required to have.

# A typical user entry's object classes
dn: uid=vamshi,ou=engineers,dc=corp,dc=com
objectClass: top
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: shadowAccount

Each object class contributes a set of attributes — some required (MUST), some optional (MAY):

objectClass: posixAccount
  MUST: cn, uid, uidNumber, gidNumber, homeDirectory
  MAY:  userPassword, loginShell, gecos, description

objectClass: inetOrgPerson
  MUST: sn (surname), cn
  MAY:  mail, telephoneNumber, displayName, jpegPhoto, ...

objectClass: shadowAccount
  MUST: uid
  MAY:  shadowLastChange, shadowMin, shadowMax, shadowWarning, ...

When Linux authenticates a user via LDAP, it needs the posixAccount attributes: uidNumber (the numeric UID), gidNumber, homeDirectory, and loginShell. Without posixAccount, the user entry exists in the directory but can’t be used for Linux logins — getent passwd will return nothing.

Groups in LDAP use their own object class:

objectClass: groupOfNames
  MUST: cn, member
  MAY:  description, owner, ...

# A group entry looks like this:
dn: cn=engineers,ou=groups,dc=corp,dc=com
objectClass: groupOfNames
cn: engineers
member: uid=vamshi,ou=engineers,dc=corp,dc=com
member: uid=alice,ou=engineers,dc=corp,dc=com

groupOfNames stores members as full DNs — which is why the SSSD group search filter is (member=uid=vamshi,ou=...) rather than (member=vamshi). The directory stores the exact path to each member entry. posixGroup is the alternative: it stores memberUid as a bare username string instead of a DN. Active Directory’s group objects likewise hold member DNs; pure POSIX environments often use posixGroup, as in the sketch below.
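
For contrast, a sketch of the same group modeled as a posixGroup, with members as bare usernames and the numeric GID that POSIX requires:

dn: cn=engineers,ou=groups,dc=corp,dc=com
objectClass: posixGroup
cn: engineers
gidNumber: 2001
memberUid: vamshi
memberUid: alice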

Object classes are grouped into three kinds:

Structural — defines what the entry fundamentally is. Every entry must have exactly one structural class (its inheritance chain counts as one). inetOrgPerson is structural.

Auxiliary — adds additional attributes to an existing entry. posixAccount and shadowAccount are auxiliary (note the AUXILIARY keyword in the schema output in the next section). You can stack multiple auxiliary classes on a single entry.

Abstract — base classes that other classes inherit from. top is the root abstract class from which every other class ultimately derives. Listed explicitly or not, every entry has it.

Schema: The Directory’s Type System

Schema is the global rulebook for the entire directory. It defines:

  • Attribute type definitions — what each attribute is named, what syntax it uses (a string? an integer? a binary blob?), whether it’s case-sensitive, whether multiple values are allowed
  • Object class definitions — which attributes each class requires or permits
  • Matching rules — how equality comparisons work for each attribute type

The schema is stored in the directory itself, under a special entry at cn=schema,cn=config (OpenLDAP) or cn=Schema,cn=Configuration (Active Directory). You can query it:

# View the schema for the posixAccount object class
ldapsearch -x -H ldap://your-dc \
  -b "cn=schema,cn=config" \
  "(objectClass=olcObjectClasses)" \
  olcObjectClasses | grep -A 10 "posixAccount"

# Output:
# olcObjectClasses: ( 1.3.6.1.1.1.2.0
#   NAME 'posixAccount'
#   DESC 'Abstraction of an account with POSIX attributes'
#   SUP top
#   AUXILIARY
#   MUST ( cn $ uid $ uidNumber $ gidNumber $ homeDirectory )
#   MAY ( userPassword $ loginShell $ gecos $ description ) )

That OID (1.3.6.1.1.1.2.0) is the globally unique identifier for the posixAccount object class. Every object class and attribute type in every LDAP directory on the planet has a unique OID assigned by an authority. This is how schema interoperability works across different directory implementations — OpenLDAP, Active Directory, and 389-DS can all understand each other’s posixAccount entries because they share the same OID.


LDAP Operations: What Actually Runs

LDAP defines a small set of operations; the eight below do nearly all the work (Unbind and Extended round out the protocol). Day-to-day authentication uses two: Bind and Search.

LDAP Operation Set
──────────────────
Bind        ← authenticate (prove identity)
Search      ← query the directory
Add         ← create a new entry
Modify      ← change attributes on an existing entry
Delete      ← remove an entry
ModifyDN    ← rename or move an entry
Compare     ← test if an attribute has a specific value
Abandon     ← cancel an outstanding operation

Bind: Proving Who You Are

Before any authenticated operation, the client sends a Bind request. There are two types:

Simple Bind — the client sends its DN and password in the clear (or over TLS). This is what -x in ldapsearch means: simple authentication.

# Simple bind as a service account
ldapsearch -x \
  -D "cn=svc-ldap-reader,ou=services,dc=corp,dc=com" \
  -w "service-account-password" \
  -H ldap://dc.corp.com \
  -b "dc=corp,dc=com" \
  "(uid=vamshi)"

SASL Bind — the client uses an authentication mechanism registered with SASL (Simple Authentication and Security Layer). Kerberos (via the GSSAPI mechanism) is the most common. EP05 covers Kerberos in detail.

# SASL bind using Kerberos (after kinit)
ldapsearch -Y GSSAPI \
  -H ldap://dc.corp.com \
  -b "dc=corp,dc=com" \
  "(uid=vamshi)"

An anonymous Bind (no DN, no password) is also valid for directories configured to allow anonymous reads. Many public LDAP directories (and some internal ones, misconfigured) allow this.

Search: The Core Operation

A Search request has five required parameters:

baseObject   — where in the DIT to start (e.g., "dc=corp,dc=com")
scope        — how deep to look
               base    = only the base entry itself
               one     = one level below base (immediate children)
               sub     = entire subtree below base (most common)
derefAliases — how to handle alias entries (usually derefAlways)
filter       — what to match (e.g., "(uid=vamshi)")
attributes   — which attributes to return (empty = return all)

When SSSD authenticates a user login, it runs exactly two Search operations:

Search 1 — find the user's entry
  base:       dc=corp,dc=com
  scope:      sub
  filter:     (uid=vamshi)
  attributes: dn, uid, uidNumber, gidNumber, homeDirectory, loginShell

Search 2 — find the user's group memberships
  base:       dc=corp,dc=com
  scope:      sub
  filter:     (member=uid=vamshi,ou=engineers,dc=corp,dc=com)
  attributes: dn, cn, gidNumber

The first search locates the user entry and retrieves the POSIX attributes. The second finds all group entries that contain the user’s DN as a member. These two queries are the complete basis for a Linux login over LDAP.

Search Filters

LDAP filters follow a prefix (Polish notation) syntax. Every filter is wrapped in parentheses:

# Simple equality
(uid=vamshi)

# Presence — entry has this attribute at all
(mail=*)

# Substring match
(cn=vam*)

# Comparison
(uidNumber>=1000)

# Logical AND — both conditions must match
(&(objectClass=posixAccount)(uid=vamshi))

# Logical OR — either condition matches
(|(uid=vamshi)(mail=vamshi@corp.com))

# Logical NOT
(!(uid=guest))

# Combined — posixAccount entries with UID >= 1000 and no disabled flag
(&(objectClass=posixAccount)(uidNumber>=1000)(!(pwdAccountLockedTime=*)))

The & and | operators take any number of operands. Filter syntax looks strange the first time but is unambiguous and compact — which matters when you’re encoding it into BER for the wire.


What Actually Travels on the Wire

Every LDAP message is encoded in BER (Basic Encoding Rules), a binary subset of ASN.1. LDAP is not a text protocol.

When you run ldapsearch, the tool constructs a BER-encoded SearchRequest message and sends it over TCP. The server responds with one or more SearchResultEntry messages (one per matching entry), followed by a SearchResultDone. All of these are BER.

BER uses a type-length-value (TLV) encoding:

Tag byte(s)    — what type of data this is
Length byte(s) — how many bytes of data follow
Value byte(s)  — the actual data

A minimal LDAP SearchRequest for ldapsearch -x -b "dc=corp,dc=com" "(uid=vamshi)" uid looks like this on the wire:

30 3a          ← SEQUENCE (LDAPMessage)
  02 01 01     ← INTEGER 1 (messageID = 1)
  63 35        ← [APPLICATION 3] SearchRequest
    04 0e       ← OCTET STRING: baseObject
      64 63 3d  ← "dc=corp,dc=com" (14 bytes)
      63 6f 72
      70 2c 64
      63 3d 63
      6f 6d
    0a 01 02   ← ENUMERATED: scope = wholeSubtree (2)
    0a 01 03   ← ENUMERATED: derefAliases = derefAlways (3)
    02 01 00   ← INTEGER: sizeLimit = 0 (unlimited)
    02 01 00   ← INTEGER: timeLimit = 0 (unlimited)
    01 01 00   ← BOOLEAN: typesOnly = false
    a3 0d      ← [3] equalityMatch filter
      04 03 75 69 64   ← attributeDesc: "uid"
      04 06 76 61 6d   ← assertionValue: "vamshi"
             73 68 69
    30 05      ← SEQUENCE: AttributeDescriptionList
      04 03 75 69 64   ← "uid"

You don’t need to read BER by hand in practice. But knowing it’s binary — not HTTP, not JSON, not plain text — explains some things:

  • Why tcpdump port 389 shows binary output you can’t read directly
  • Why LDAP on port 389 looks different in Wireshark than HTTP traffic
  • Why ldapsearch output (LDIF) is a transformation of the wire data, not the wire data itself

To see the wire protocol in action:

# Run ldapsearch with debug output (level 1 = protocol tracing)
ldapsearch -d 1 -x \
  -H ldap://ldap.forumsys.com \
  -b "dc=example,dc=com" \
  -D "cn=read-only-admin,dc=example,dc=com" \
  -w readonly \
  "(uid=tesla)" cn

# You'll see output like:
# ldap_connect_to_host: TCP ldap.forumsys.com:389
# ldap_new_connection 1 1 0
# ldap_connect_to_host: Trying ldap.forumsys.com:389
# ldap_pvt_connect: fd: 5 tm: -1 async: 0
# TLS: can't connect.
# ldap_open_defconn: successful
# ber_scanf fmt ({it) ber:     ← BER decoding of the response
# ber_scanf fmt ({) ber:
# ber_scanf fmt (W) ber:
# ...

The ber_scanf lines are the BER decoder working through the server’s response. Each line represents one TLV element being read off the wire.


Reading ldapsearch Output: Every Field

ldapsearch output is LDIF (LDAP Data Interchange Format), defined in RFC 2849. It’s the standard text serialization of LDAP entries.

ldapsearch -x \
  -H ldap://ldap.forumsys.com \
  -b "dc=example,dc=com" \
  -D "cn=read-only-admin,dc=example,dc=com" \
  -w readonly \
  "(uid=tesla)" \
  cn mail uid uidNumber objectClass

Output, annotated:

# extended LDIF
#
# LDAPv3                              ← protocol version confirmed
# base <dc=example,dc=com> with scope subtree
# filter: (uid=tesla)                 ← your search filter echoed back
# requesting: cn mail uid uidNumber objectClass
#

# tesla, example.com                  ← comment: CN, base DN
dn: uid=tesla,dc=example,dc=com      ← Distinguished Name — full path in the tree

objectClass: inetOrgPerson           ← the structural class of this entry
objectClass: organizationalPerson    ← its parent class: adds telephoneNumber etc.
objectClass: person                  ← grandparent class: adds sn (surname)
objectClass: top                     ← root abstract class — every entry has it
cn: Tesla                            ← common name (from inetOrgPerson MUST)
mail: tesla@example.com              ← email (from inetOrgPerson MAY)
uid: tesla                           ← userid (from inetOrgPerson MAY)

# search result
search: 2                            ← messageID of the SearchResultDone
result: 0 Success                    ← 0 = no error; 32 = no such object; 49 = invalid credentials

# numResponses: 2                    ← 1 result entry + 1 SearchResultDone
# numEntries: 1

The result: line is the one to watch when debugging. LDAP result codes:

Code  Meaning                  What it tells you
0     Success                  Query ran, results returned (or no results found — check numEntries)
32    No Such Object           Base DN doesn’t exist in this directory
49    Invalid Credentials      Bind failed — wrong DN, wrong password, or account locked
50    Insufficient Access      Your bind DN doesn’t have read permission on these entries
53    Unwilling to Perform     Server refused the operation (e.g., password policy, anonymous bind disabled)
65    Object Class Violation   Add/Modify would violate schema (missing MUST attribute, unrecognized object class)

Ports: 389, 636, and 3268

Port 389   — LDAP (plaintext, or StartTLS in-session upgrade)
Port 636   — LDAPS (LDAP wrapped in TLS from the start)
Port 3268  — Active Directory Global Catalog (plain)
Port 3269  — Active Directory Global Catalog over TLS

Port 389 vs 636: Both carry the same BER-encoded LDAP protocol. The difference is when TLS starts. On 636 (LDAPS), the TLS handshake happens before the first LDAP message. On 389 with StartTLS, the client sends a plaintext ExtendedRequest with OID 1.3.6.1.4.1.1466.20037 to initiate the TLS upgrade, then both sides continue over TLS. In production, use one or the other — never unencrypted port 389. Your credentials transit the wire on every Bind.
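
The two modes map to different ldapsearch invocations (the hostname is a placeholder):

# StartTLS on 389 (-ZZ fails the command unless the TLS upgrade succeeds)
ldapsearch -ZZ -x -H ldap://dc.corp.com -b "dc=corp,dc=com" "(uid=vamshi)" cn

# LDAPS on 636: TLS from the first byte
ldapsearch -x -H ldaps://dc.corp.com -b "dc=corp,dc=com" "(uid=vamshi)" cn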

Ports 3268/3269 — Active Directory Global Catalog: AD organizes domains into forests. Each domain controller holds the full LDAP tree for its own domain. The Global Catalog is a read-only, partial replica of every domain in the forest — just the most-queried attributes from every object. When an application needs to find a user across domains in the same forest (not just in one domain), it queries the Global Catalog on 3268/3269 instead of a domain-specific DC on 389/636.

Forest: corp.com
  ├── Domain: corp.com       → DC at port 389/636   (full copy of corp.com)
  ├── Domain: emea.corp.com  → DC at port 389/636   (full copy of emea.corp.com)
  └── Global Catalog        → GC at port 3268/3269  (partial copy of ALL domains)

If your SSSD or application is configured to use port 3268 instead of 389, it’s talking to the Global Catalog — useful for forest-wide user lookups, but missing some less-common attributes that aren’t replicated to the GC.


Try It: ldapsearch Against Your Own Directory

If your Linux machine is joined to AD or connected to an LDAP directory, you can run these right now:

# 1. Confirm your SSSD knows where the LDAP server is
grep -E "ldap_uri|ad_domain|krb5_server" /etc/sssd/sssd.conf

# 2. Look up your own user entry
ldapsearch -x \
  -H "$(grep ldap_uri /etc/sssd/sssd.conf | awk -F= '{print $2}' | tr -d ' ' | cut -d, -f1)" \
  -b "dc=$(hostname -d | sed 's/\./,dc=/g')" \
  "(uid=$(whoami))" \
  dn objectClass uid uidNumber gidNumber homeDirectory loginShell

# 3. Find the groups you're in
ldapsearch -x \
  -H ldap://your-dc \
  -b "dc=corp,dc=com" \
  "(member=$(ldapsearch -x ... "(uid=$(whoami))" dn | grep ^dn | cut -d' ' -f2-))" \
  cn gidNumber

# 4. Check what object classes your entry has
ldapsearch -x \
  -H ldap://your-dc \
  -b "dc=corp,dc=com" \
  "(uid=$(whoami))" \
  objectClass

On a machine joined to Active Directory, the ldap_uri in sssd.conf is your domain controller’s address. On FreeIPA or OpenLDAP, it’s the directory server. The same ldapsearch commands work against all of them — because they all speak LDAP v3.


⚠ Common Misconceptions

“The DN is like a file path.” The analogy holds for reading it, but the DIT is not a file system. Entries don’t inherit permissions from parent containers the way files inherit from directories. Access control in LDAP is defined by ACLs on the server — not by position in the tree.

“LDAP is case-sensitive.” It depends on the attribute. Most string attributes (like cn and mail) use case-insensitive matching by default — (cn=Vamshi) and (cn=vamshi) return the same results. But some attributes (like userPassword and most binary types) are case-sensitive. The schema’s matching rules define this per-attribute.

“You need the full DN to search for a user.” No. The Search operation with a sub scope searches the entire subtree below the base DN. You search with a filter like (uid=vamshi) without knowing the full DN. The DN comes back in the result.

“LDAP accounts and Linux accounts are the same thing.” An LDAP user entry becomes a Linux account only if the entry has a posixAccount object class with the required POSIX attributes (uidNumber, gidNumber, homeDirectory). An LDAP entry without posixAccount can exist in the directory but getent passwd will not return it.

“The objectClass attribute can be changed freely.” Structural object classes cannot be changed after an entry is created — you’d have to delete and recreate the entry. Auxiliary classes can be added or removed. This is why correctly choosing the structural class at entry creation time matters.


Framework Alignment

– CISSP Domain 5 (Identity and Access Management): DIT structure, DN addressing, object classes, and schema are the data model underpinning every enterprise identity store — understanding them is foundational to managing directory-based IAM
– CISSP Domain 4 (Communications and Network Security): BER on port 389 is unencrypted; LDAPS (port 636) or StartTLS is required for production — wire-level understanding informs the transport security decision
– CISSP Domain 3 (Security Architecture and Engineering): schema design and DIT hierarchy are architectural decisions with security consequences: overly permissive schemas enable privilege escalation; flat DITs make access delegation harder

Key Takeaways

  • The DIT is a hierarchical database — every entry has a unique DN that describes its path from leaf to root
  • Object classes define the schema rules for each entry: what attributes are required (MUST) vs optional (MAY), and what the entry fundamentally is
  • For a user to be usable for Linux logins, the directory entry needs the posixAccount object class with uidNumber, gidNumber, and homeDirectory populated
  • An LDAP login is two operations: a Bind (authenticate), then a Search (retrieve POSIX attributes and group memberships)
  • Everything on the wire is BER-encoded binary — ldapsearch output is LDIF, a human-readable transformation of what the wire actually carries
  • LDAP result code 0 means success; 49 means bad credentials; 32 means the base DN doesn’t exist — these are the three you’ll debug most often


Run ldapsearch against your own directory and look at the object classes on your entry. Does it have posixAccount? Does it have shadowAccount? What attributes is your SSSD actually reading on every login — and what does it do when the LDAP server is unreachable?


What’s Next

EP02 showed what’s inside the directory: the tree structure, the schema, the operations, and the wire protocol. What it left open is how Linux actually uses this information to grant a login.

LDAP is not, by itself, an authentication protocol. The Bind operation can verify a password — but that’s a tiny piece of what happens when you SSH into a machine joined to Active Directory. The full login flow runs through PAM, NSS, and SSSD before LDAP ever gets queried. EP03 traces that path.

Next: LDAP Authentication on Linux: PAM, NSS, and the Login Stack

Get EP03 in your inbox when it publishes → linuxcent.com/subscribe

What Is LDAP — and Why It Was Invented to Replace Something Worse

Reading Time: 9 minutes

The Identity Stack, Episode 1
EP01 → EP02: LDAP Internals → EP03 → …


TL;DR

  • LDAP (Lightweight Directory Access Protocol) is a protocol for reading and writing directory information — most commonly, who is allowed to do what
  • It was built in 1993 as a “lightweight” alternative to X.500/DAP, which ran over the full OSI stack and was impossible to deploy on anything but mainframe hardware
  • Before LDAP, every server had its own /etc/passwd — 50 machines meant 50 separate user databases, managed manually
  • NIS (Network Information Service) was the first attempt to centralize this — it worked, then became a cleartext-credentials security liability
  • LDAP v3 (RFC 2251, 1997) is the version still in production today — 27 years of backwards compatibility
  • Everything you use today — Active Directory, Okta, Entra ID — is built on top of, or speaks, LDAP

The Big Picture: 50 Years of “Who Are You?”

1969–1980s   /etc/passwd — per-machine, no network auth
     │        50 servers = 50 user databases, managed manually
     │
     ▼
1984         Sun NIS / Yellow Pages — first centralized directory
     │        broadcast-based, no encryption, flat namespace
     │        Revolutionary for its era. A liability by the 1990s.
     │
     ▼
1988         X.500 / DAP — enterprise-grade directory services
     │        OSI protocol stack. Powerful. Impossible to deploy.
     │        Mainframe-class infrastructure required just to run it.
     │
     ▼
1993         RFC 1487 — LDAP v1
     │        Tim Howes, University of Michigan.
     │        Lightweight. TCP/IP. Actually deployable.
     │
     ▼
1997         RFC 2251 — LDAP v3
     │        SASL authentication. TLS. Controls. Referrals.
     │        The version still in production today.
     │
     ▼
2000s–now    Active Directory, OpenLDAP, 389-DS, FreeIPA
             Okta, Entra ID, Google Workspace
             LDAP DNA in every identity system on the planet.

What is LDAP? It’s the protocol that solved one of the most boring and consequential problems in computing: how do you know who someone is, across machines, at scale, without sending their password in cleartext?


The World Before LDAP

Before you understand why LDAP was invented, you need to feel the problem it solved.

Every Unix machine in the 1970s and 1980s managed its own users. When you created an account on a server, your username, UID, and hashed password went into /etc/passwd on that machine. Another machine had no idea you existed. If you needed access to ten servers, an administrator created ten separate accounts — manually, one by one. When you changed your password, each account had to be updated separately.

For a university with 200 machines and 10,000 students, this was chaos. For a company with offices in three cities, it was a full-time job for multiple sysadmins.

Machine A           Machine B           Machine C
/etc/passwd         /etc/passwd         /etc/passwd
vamshi:x:1001       (vamshi unknown)    vamshi:x:1004
alice:x:1002        alice:x:1001        alice:x:1003
bob:x:1003          bob:x:1002          (bob unknown)

Same people, different UIDs, different machines, no central truth.
File permissions become meaningless when UID 1001 means
different users on different hosts.

For every new hire, an admin logged in to every machine and ran useradd. When someone left, you hoped whoever ran the offboarding remembered all the machines. Most organizations didn’t know their own attack surface because there was no single place to look.


Sun NIS: The First Attempt at Centralization

Sun Microsystems released NIS (Network Information Service) in 1984, originally called Yellow Pages — a name they had to drop after a trademark dispute with British Telecom. The idea was elegant: one server holds the authoritative /etc/passwd (and /etc/group, /etc/hosts, and a dozen other maps), and client machines query it instead of reading local files.

For the first time, you could create an account once and have it work across your entire network. For a generation of Unix administrators, NIS was liberating.

       NIS Master Server
       /var/yp/passwd.byname
              │
    ┌─────────┼──────────┐
    ▼         ▼          ▼
 Client A   Client B   Client C
 (query NIS — no local /etc/passwd needed)

NIS worked well — until it didn’t. The failure modes were structural:

No encryption. NIS responses were cleartext UDP. An attacker on the same network segment could capture the full password database with a packet sniffer. In 1984, “the network” meant a trusted corporate LAN. By the mid-1990s, it meant ethernet segments that included lab workstations, and the assumptions no longer held.

Broadcast-based discovery. NIS clients found servers by broadcasting on the local network. This worked on a single flat ethernet. It failed completely across routers, across buildings, and across WAN links. Multi-site organizations ended up running separate NIS domains with no connection between them — which partially defeated the purpose.

Flat namespace. NIS had no organizational hierarchy. One domain. Everything flat. You couldn’t have engineering and finance as separate administrative units. You couldn’t delegate user management to a department. One person — usually one overworked sysadmin — managed the whole thing.

UIDs had to match across all machines. If alice was UID 1002 on one server but UID 1001 on another, NFS file ownership became wrong. NIS enforced consistency, but onboarding a new machine into an existing network required manually auditing UID conflicts across the entire directory. Get one wrong and files end up owned by the wrong person.

NIS worked for thousands of installations from 1984 to the mid-1990s. It also ended careers when it failed. What the industry needed was a hierarchical, structured, encrypted, scalable directory service.


X.500 and DAP: The Right Idea, Wrong Protocol

The OSI (Open Systems Interconnection) standards body had an answer: X.500 directory services. X.500 was comprehensive, hierarchical, globally federated. The ITU-T published the standard in 1988, and it looked like exactly what enterprises needed.

X.500 Directory Information Tree (DIT)
              c=US                   ← country
                │
         o=University                ← organization
                │
         ┌──────┴──────┐
     ou=CS           ou=Physics      ← organizational units
         │
     cn=Tim Howes                    ← common name (person)
     telephoneNumber: +1-734-...
     mail: tim@umich.edu

This data model — the hierarchy, the object classes, the distinguished names — is exactly what LDAP inherited. The DIT, the cn=, ou=, dc= notation in every LDAP query you’ve ever read: all of it came from X.500.

The problem was DAP: the Directory Access Protocol that X.500 used to communicate.

DAP ran over the full OSI protocol stack. Not TCP/IP — OSI. Seven layers, all of which required specialized software that in 1988 only mainframe and minicomputer vendors had implemented. A university department wanting to run X.500 needed hardware and software licenses that cost as much as a small car. The vast majority of workstations couldn’t speak OSI at all.

The data model was sound. The transport was impractical.

X.500 / DAP (1988)              LDAP v1 (1993)
──────────────────              ──────────────
Full OSI stack (7 layers)  →    TCP/IP only
Mainframe-class hardware   →    Any Unix box with a TCP stack
$50,000+ deployment cost   →    Free (reference implementation)
Vendor-specific OSI impl.  →    Standard socket API
Zero internet adoption     →    Universities deployed immediately

The Invention: LDAP at the University of Michigan

Tim Howes was at the University of Michigan in the early 1990s. The university was running X.500 for its directory — faculty, staff, student contact information, credentials. The data model was good. The protocol was the problem.

His insight, working with colleagues Wengyik Yeong and Steve Kille: strip X.500 down to what actually needs to function over a TCP/IP connection. Keep the hierarchical data model. Throw away the OSI transport. The result was the Lightweight Directory Access Protocol.

RFC 1487, published July 1993, described LDAP v1. It preserved the X.500 directory information model — the hierarchy, the object classes, the distinguished name format — and mapped it onto a protocol that could run over a simple TCP socket on port 389.

No specialized hardware. No OSI. If you had a Unix machine and TCP/IP, you could run LDAP. By 1993, that meant virtually every workstation and server in every university and most enterprises.

The University of Michigan deployed it immediately. Within two years, organizations across the internet were running the reference implementation.

LDAP v2 (RFC 1777, 1995) cleaned up the protocol. LDAP v3 (RFC 2251, 1997) is the version in production today — adding SASL authentication (which enables Kerberos integration), TLS support, referrals for federated directories, and extensible controls for server-side operations. The RFC that standardized the internet’s primary identity protocol is 27 years old and still running.


What LDAP Actually Is

LDAP is a client-server protocol for reading and writing a directory — a structured, hierarchical database optimized for reads.

Every entry in the directory has a Distinguished Name (DN) that describes its position in the hierarchy, and a set of attributes defined by its object classes. A person entry looks like this:

dn: cn=vamshi,ou=engineers,dc=linuxcent,dc=com
objectClass: inetOrgPerson
objectClass: posixAccount
cn: vamshi
uid: vamshi
uidNumber: 1001
gidNumber: 1001
homeDirectory: /home/vamshi
loginShell: /bin/bash
mail: vamshi@linuxcent.com

The DN reads right-to-left: domain linuxcent.com (dc=linuxcent,dc=com) → organizational unit engineers → common name vamshi. Every entry in the directory has a unique path through the tree — there’s no ambiguity about which vamshi you mean.

LDAP’s core operations are Bind (authenticate), Search, Add, Modify, Delete, ModifyDN (rename), Compare, and Abandon (plus Unbind and Extended). Most of what a Linux authentication system does with LDAP reduces to two: Bind (prove you are who you say you are) and Search (tell me everything you know about this user).

When your Linux machine authenticates an SSH login against LDAP:

1. User types password
2. PAM calls pam_sss (or pam_ldap on older systems)
3. SSSD issues a Bind to the LDAP server: "I am cn=vamshi, and here is my credential"
4. LDAP server verifies the bind → success or failure
5. SSSD issues a Search: "give me the posixAccount attributes for uid=vamshi"
6. LDAP returns uidNumber, gidNumber, homeDirectory, loginShell
7. PAM creates the session with those attributes

The entire login flow is two LDAP operations: one Bind, one Search.
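
You can reproduce the flow by hand: the -D/-w pair is the Bind, the filter is the Search (server name and password are placeholders):

# The entire LDAP side of a login, as one command
ldapsearch -x -H ldap://your-ldap-server \
  -D "cn=vamshi,ou=engineers,dc=linuxcent,dc=com" -w 'user-password' \
  -b "dc=linuxcent,dc=com" "(uid=vamshi)" \
  uidNumber gidNumber homeDirectory loginShell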


Try It Right Now

You don’t need to set up an LDAP server to run your first query. There’s a public test LDAP directory at ldap.forumsys.com:

# Query a public LDAP server — no setup required
ldapsearch -x \
  -H ldap://ldap.forumsys.com \
  -b "dc=example,dc=com" \
  -D "cn=read-only-admin,dc=example,dc=com" \
  -w readonly \
  "(objectClass=inetOrgPerson)" \
  cn mail uid

# What you get back (abbreviated):
# dn: uid=tesla,dc=example,dc=com
# cn: Tesla
# mail: tesla@example.com
# uid: tesla
#
# dn: uid=einstein,dc=example,dc=com
# cn: Albert Einstein
# mail: einstein@example.com
# uid: einstein

Decode what you just ran:

  • -x — simple authentication (username/password bind, not Kerberos/SASL)
  • -H ldap://ldap.forumsys.com — the LDAP server URI, port 389
  • -b "dc=example,dc=com" — the base DN, the top of the subtree to search
  • -D "cn=read-only-admin,dc=example,dc=com" — the bind DN (who you’re authenticating as)
  • -w readonly — the bind password
  • "(objectClass=inetOrgPerson)" — the search filter: return entries that are people
  • cn mail uid — the attributes to return (default returns all)

That’s a live LDAP query returning real directory entries from a server running RFC 2251 — the same protocol Tim Howes designed in 1993.

On your own Linux system, if you’re joined to AD or LDAP, you can query it the same way with your domain credentials.


Why It Never Went Away

LDAP v3 was finalized in 1997. In 2024, it’s still the protocol every enterprise directory speaks. Why?

Because it became the lingua franca of enterprise identity before any replacement existed. Every application that needs to authenticate users — VPN concentrators, mail servers, network switches, web applications, HR systems — implemented LDAP support. Every directory service Microsoft, Red Hat, Sun, and Novell shipped stored data in an LDAP-accessible tree.

When Microsoft built Active Directory in 1999, they built it on top of LDAP + Kerberos. When your Linux machine joins an AD domain, it speaks LDAP to enumerate users and groups, and Kerberos to verify credentials. When Okta or Entra ID syncs with your on-premises directory, it uses LDAP Sync (or a modern protocol that maps directly to LDAP semantics).

The protocol is old. The ecosystem built on top of it is so deep that replacing LDAP would mean simultaneously replacing every enterprise application that depends on it. Nobody has done that. Nobody has had to.

What happened instead is the stack got taller. LDAP at the bottom, Kerberos for network authentication, SSSD as the local caching daemon, PAM as the Linux integration layer, SAML and OIDC at the top for web-based federation. The directory is still LDAP. The interfaces above it evolved.

That full stack — from the directory at the bottom to Zero Trust at the top — is what this series covers.


⚠ Common Misconceptions

“LDAP is an authentication protocol.” LDAP is a directory protocol. It stores identity information and can verify credentials (via Bind). Authentication in modern stacks is typically Kerberos or OIDC — LDAP provides the directory backing it.

“LDAP is obsolete.” LDAP is the storage layer for Active Directory, OpenLDAP, 389-DS, FreeIPA, and every enterprise IdP’s on-premises sync. It is ubiquitous. What’s changed is the interface layer above it.

“You need Active Directory to run LDAP.” Active Directory uses LDAP. OpenLDAP, 389-DS, FreeIPA, and Apache Directory Server are all standalone LDAP implementations. You can run a directory without Microsoft.

“LDAP and LDAPS are different protocols.” LDAP is the protocol. LDAPS is LDAP over TLS on port 636. StartTLS is LDAP on port 389 with an in-session upgrade to TLS. Same protocol, different transport security.


Framework Alignment

– CISSP Domain 5 (Identity and Access Management): LDAP is the foundational directory protocol for centralized identity stores — the base layer of every enterprise IAM stack
– CISSP Domain 4 (Communications and Network Security): port 389 (LDAP), 636 (LDAPS), 3268/3269 (AD Global Catalog) — transport security decisions affect every directory deployment
– CISSP Domain 3 (Security Architecture and Engineering): DIT hierarchy, schema design, replication topology — directory structure is an architectural security decision
– NIST SP 800-63B: LDAP as a credential service provider (CSP) backing enterprise authenticators

Key Takeaways

  • LDAP was invented to solve a real, painful problem: the authentication chaos that NIS couldn’t fix and X.500/DAP was too expensive to deploy
  • It inherited the right thing from X.500 (the hierarchical data model) and replaced the right thing (the impractical OSI transport with TCP/IP)
  • NIS was the predecessor that worked until it didn’t — its failure modes (no encryption, flat namespace, broadcast discovery) are exactly what LDAP was designed to fix
  • LDAP v3 (RFC 2251, 1997) is still the production standard — 27 years later
  • Active Directory, OpenLDAP, FreeIPA, Okta, Entra ID — every enterprise identity system either runs LDAP or speaks it
  • The full authentication stack is deeper than LDAP: the next 12 episodes peel it apart layer by layer

What’s Next

EP01 stayed at the design level — the problem, the predecessor failures, the invention, the data model.

EP02 goes inside the wire. The DIT structure, DN syntax, object classes, schema, and the BER-encoded bytes that actually travel from the server to your authentication daemon. Run ldapsearch against your own directory and read every line of what comes back.

Next: LDAP Internals: The Directory Tree, Schema, and What Travels on the Wire

Get EP02 in your inbox when it publishes → linuxcent.com/subscribe