What's New in OpenLDAP
Howard Chu
OpenLDAP Project
●

Open source code project

●

Founded 1998

●

Three core team members

●

A dozen or so contributors

●

Feature releases every 12-18 months

●

Maintenance releases roughly monthly
A Word About Symas
●

Founded 1999

●

Founders from Enterprise Software world
– platinum Technology (Locus Computing)
– IBM

●

Howard joined OpenLDAP in 1999
– One of the Core Team members
– Appointed Chief Architect January 2007

●

No debt, no VC investments
Intro
Howard Chu
●

●

Founder and CTO Symas Corp.
Developing Free/Open Source software since
1980s
– GNU compiler toolchain, e.g. "gmake -j", etc.
– Many other projects, check ohloh.net...

●

Worked for NASA/JPL, wrote software for
Space Shuttle, etc.

4
What's New
●

Lightning Memory-Mapped Database (LMDB)
and its knock-on effects
●

Within OpenLDAP code

●

Other projects

●

New HyperDex clustered backend

●

New Samba4/AD integration work

●

Other features

●

What's missing
LMDB
●

Introduced at LDAPCon 2011
●
●

Full ACID transactions
MVCC, readers and writers don't block each
other

●

Ultra-compact, compiles to under 32KB

●

Memory-mapped, lightning fast zero-copy reads

●

Much greater CPU and memory efficiency

●

Much simpler configuration
LMDB Impact
●

Within OpenLDAP
●

●

Revealed other frontend bottlenecks that were
hidden by BerkeleyDB-based backends
Addressed in OpenLDAP 2.5
●

●

Thread pool enhanced, support multiple work
queues to reduce mutex contention
Connection manager enhanced, simplify write
synchronization
OpenLDAP Frontend
●

Testing in 2011 (16 core server):
●

back-hdb, 62000 searches/sec, 1485 % CPU

●

back-mdb, 75000 searches/sec, 1000 % CPU

●

●

back-mdb, 2 slapds, 127000 searches/sec,
1250 % CPU - network limited

We should not have needed two processes to
hit this rate
Efficiency Note
●

back-hdb 62000 searches/sec @ 1485 %
●

●

back-mdb 127000 searches/sec @1250 %
●

●
●

41.75 searches per CPU %
101.60 searches per CPU %

2.433x as many searches per unit of CPU
"Performance" isn't the point, *Efficiency* is
what matters
OpenLDAP Frontend
●

Threadpool contention
●

Analyzed using mutrace

●

Found #1 bottleneck in threadpool mutex

●

Modified threadpool to support multiple queues

●

●
●

On quad-core laptop, using 4 queues reduced mutex
contended time by factor of 6.
Reduced condition variable contention by factor of 3.
Overall 20 % improvement in throughput on quadcore VM
OpenLDAP Frontend
●

Connection Manager
●

●

Also a single thread, accepting new connections and
polling for read/write ready on existing
Now can be split to multiple threads
●

●

Impact depends on number of connections

Polling for write is no longer handled by the listener thread
●
●
●

Removes one level of locks and indirection
Simplifies WriteTimeout implementation
Typically no benchmark impact, only significant when blocking
on writes due to slow clients
OpenLDAP Frontend
Frontend Improvements, Quadcore VM
40000

35000

30000

Ops/Second

25000

SearchRate
AuthRate
ModRate

20000

15000

10000

5000

0
OL 2.4

OL 2.5
LMDB Impact
●

Adoption by many other projects
●

Outperforms all other embedded databases in
common applications
●

●

CFengine, Postfix, PowerDNS, etc.

Has none of the reliability/integrity weaknesses
of other databases

●

Has none of the licensing issues...

●

Integrated into multiple NoSQL projects
●

Redis, SkyDB, Memcached, HyperDex, etc.
LMDB Microbenchmark
●
●

●

Comparisons based on Google's LevelDB
Also tested against Kyoto Cabinet's TreeDB,
SQLite3, and BerkeleyDB
Tested using RAM filesystem (tmpfs), reiserfs
on SSD, and multiple filesystems on HDD
– btrfs, ext2, ext3, ext4, jfs, ntfs, reiserfs, xfs, zfs
– ext3, ext4, jfs, reiserfs, xfs also tested with
external journals
LMDB Microbenchmark
Relative Footprint
text

data

bss

dec

hex

filename

272247

1456

328

274031

42e6f db_bench

1675911

2288

304

1678503

199ca7 db_bench_bdb

90423

1508

304

92235

1684b db_bench_mdb

653480

7768

1688

662936

a2764 db_bench_sqlite3

296572

4808

1096

302476

49d8c db_bench_tree_db

Clearly LMDB has the smallest footprint
– Carefully written C code beats C++ every time
LMDB Microbenchmark
Read Performance

Read Performance

Small Records

Small Records

16000000

800000

14000000

700000

12000000

600000

10000000

500000

8000000

400000

6000000

300000

4000000

200000

2000000

100000

0

0
Sequential
SQLite3

TreeDB

LevelDB

Random
BDB

MDB

SQLite3

TreeDB

LevelDB

BDB

MDB
LMDB Microbenchmark
Read Performance

Read Performance

Large Records

Large Records

35000000

2000000
30303030

30000000

1718213

1800000
1600000

25000000

1400000
1200000

20000000

1000000
15000000

800000
600000

10000000

400000
5000000
0

7402

16514

299133

200000
9133

0

7047

14518

Sequential
SQLite3

TreeDB

LevelDB

15183

8646

Random
BDB

MDB

SQLite3

TreeDB

LevelDB

BDB

MDB
LMDB Microbenchmark
Read Performance

Read Performance

Large Records

Large Records

100000000

30303030

10000000

1718213
1000000

1000000

299133

100000

100000
10000

10000000

7402

16514

10000

9133

7047

14518

15183

8646

1000

1000

100

100
10

10

1

1
Sequential
SQLite3

TreeDB

LevelDB

BDB

Random
MDB

SQLite3

TreeDB

LevelDB

BDB

MDB
LMDB Microbenchmark
Asynchronous Write Performance

Asynchronous Write Performance

Large Records, tmpfs

Large Records, tmpfs

14000

12905

14000

12000

12000

10000

10000

8000

12735

8000
5860

6000
4000
2000

5709

6000
4000

3366
2029

1920

2000

0

2004

1902
742

0
Sequential
SQLite3

TreeDB

LevelDB

Random
BDB

MDB

SQLite3

TreeDB

LevelDB

BDB

MDB
LMDB Microbenchmark
Batched Write Performance

Batched Write Performance

Large Records, tmpfs

Large Records, tmpfs

14000

13215

14000

12000

12000

10000

10000

8000

13099

8000
5860

6000
4000
2000

5709

6000
4000

3138
2068

1952

2000

0

3079
2041

1939

0
Sequential
SQLite3

TreeDB

LevelDB

Random
BDB

MDB

SQLite3

TreeDB

LevelDB

BDB

MDB
LMDB Microbenchmark
Synchronous Write Performance

Synchronous Write Performance

Large Records, tmpfs

Large Records, tmpfs

14000

12916

14000

12000

12000

10000

10000

8000

8000

6000

12665

6000

4000
2000

3121

4000

3368

2026

1913

2000

0

1996

2162

1893
745

0
Sequential
SQLite3

TreeDB

LevelDB

Random
BDB

MDB

SQLite3

TreeDB

LevelDB

BDB

MDB
MemcacheDB

Read Performance

Write Performance

Single Thread, Log Scale

Single Thread, Log Scale
1000

100

msec

1

0.1

100

min
avg
max90th
max95th
max99th
max

10
msec

min
avg
max90th
max95th
max99th
max

10

1

0.1

0.01

0.01
BDB 4.7

MDB

Memcached

BDB 4.7

MDB

Memcached
MemcacheDB

Read Performance

Write Performance

4 Threads, Log Scale

4 Threads, Log Scale

10

1000

msec

1

100

min
avg
max90th
max95th
max99th
max

10
msec

min
avg
max90th
max95th
max99th
max

1

0.1
0.1

0.01

0.01
BDB 4.7

MDB

Memcached

BDB 4.7

MDB

Memcached
HyperDex
●

New generation NoSQL database server
●

http://hyperdex.org

●

Simple configuration/deployment

●

Multidimensional indexing/sharding

●

Efficient distributed search engine

●

●

Built on Google LevelDB, evolved to their fixed
version HyperLevelDB
Ported to LMDB
LMDB, HyperDex
LMDB, HyperDex
●

CPU time used for inserts :
●
●

●

●

LMDB 19:44.52
HyperLevelDB 96:46.96

HyperLevelDB used 4.9x more CPU for same
number of operations
Again, performance isn't the point. Throwing
extra CPU at a job to "make it go faster" is
stupid.
LMDB, HyperDex
LMDB, HyperDex
●

CPU time used for read/update :
– LMDB 1:33.17
– HyperLevelDB 3:37.67

●

HyperLevelDB used 2.3x more CPU for same
number of operations
LMDB, HyperDex
LMDB, HyperDex
●

CPU time used for inserts :
●
●

●

LMDB 227:26
HyperLevelDB 3373:13

HyperLevelDB used 14.8x more CPU for
same number of operations
LMDB, HyperDex
LMDB, HyperDex
●

CPU time used for read/update :
– LMDB 4:21.41
– HyperLevelDB 17:27

●

HyperLevelDB used 4.0x more CPU for same
number of operations
back-hyperdex
●

New clustered backend built on HyperDex
●

●

Existing back-ndb clustered backend is deprecated,
Oracle has refused to cooperate on support

Nearly complete LDAP support
●
●

●

Currently has limited search filter support
Uses flat (back-bdb style) namespace, not
hierarchical
Still in prototype stage as HyperDex API is still in
flux
Samba4/AD
●

Samba4 provides its own ActiveDirectorycompatible LDAP service
●
●

●

built on Samba ldb/tdb libraries
supports AD replication

Has some problems
●
●

●

Incompatible with Samba3+OpenLDAP deployments
Originally attempted to interoperate with OpenLDAP,
but that work was abandoned
Poor performance
Samba4/AD
●

OpenLDAP interop work revived
●

two opposite approaches being pursued in
parallel
●
●

resurrect original interop code
port functionality into slapd overlays

●

currently about 75 % of the test suite passes

●

keep an eye on contrib/slapd-modules/samba4
Other Features
●

cn=config enhancements
●
●

●

Support LDAPDelete op
Support slapmodify/slapdelete offline tools

LDAP transactions
●

●

Needed for Samba4 support

Frontend/overlay restructuring
●

Rationalize Bind and ExtendedOp result handling

●

Other internal API cleanup
What's Missing
●

Deprecated BerkeleyDB-based backends
●

back-bdb was deprecated in 2.4

●

back-hdb deprecated in 2.5

●

both scheduled for deletion in 2.6

●

configure switches renamed, so existing
packager scripts can no longer enable them
without explicit action
Questions?

38
Thanks!

What's New in OpenLDAP

  • 1.
    What's New inOpenLDAP Howard Chu
  • 2.
    OpenLDAP Project ● Open sourcecode project ● Founded 1998 ● Three core team members ● A dozen or so contributors ● Feature releases every 12-18 months ● Maintenance releases roughly monthly
  • 3.
    A Word AboutSymas ● Founded 1999 ● Founders from Enterprise Software world – platinum Technology (Locus Computing) – IBM ● Howard joined OpenLDAP in 1999 – One of the Core Team members – Appointed Chief Architect January 2007 ● No debt, no VC investments
  • 4.
    Intro Howard Chu ● ● Founder andCTO Symas Corp. Developing Free/Open Source software since 1980s – GNU compiler toolchain, e.g. "gmake -j", etc. – Many other projects, check ohloh.net... ● Worked for NASA/JPL, wrote software for Space Shuttle, etc. 4
  • 5.
    What's New ● Lightning Memory-MappedDatabase (LMDB) and its knock-on effects ● Within OpenLDAP code ● Other projects ● New HyperDex clustered backend ● New Samba4/AD integration work ● Other features ● What's missing
  • 6.
    LMDB ● Introduced at LDAPCon2011 ● ● Full ACID transactions MVCC, readers and writers don't block each other ● Ultra-compact, compiles to under 32KB ● Memory-mapped, lightning fast zero-copy reads ● Much greater CPU and memory efficiency ● Much simpler configuration
  • 7.
    LMDB Impact ● Within OpenLDAP ● ● Revealedother frontend bottlenecks that were hidden by BerkeleyDB-based backends Addressed in OpenLDAP 2.5 ● ● Thread pool enhanced, support multiple work queues to reduce mutex contention Connection manager enhanced, simplify write synchronization
  • 8.
    OpenLDAP Frontend ● Testing in2011 (16 core server): ● back-hdb, 62000 searches/sec, 1485 % CPU ● back-mdb, 75000 searches/sec, 1000 % CPU ● ● back-mdb, 2 slapds, 127000 searches/sec, 1250 % CPU - network limited We should not have needed two processes to hit this rate
  • 9.
    Efficiency Note ● back-hdb 62000searches/sec @ 1485 % ● ● back-mdb 127000 searches/sec @1250 % ● ● ● 41.75 searches per CPU % 101.60 searches per CPU % 2.433x as many searches per unit of CPU "Performance" isn't the point, *Efficiency* is what matters
  • 10.
    OpenLDAP Frontend ● Threadpool contention ● Analyzedusing mutrace ● Found #1 bottleneck in threadpool mutex ● Modified threadpool to support multiple queues ● ● ● On quad-core laptop, using 4 queues reduced mutex contended time by factor of 6. Reduced condition variable contention by factor of 3. Overall 20 % improvement in throughput on quadcore VM
  • 11.
    OpenLDAP Frontend ● Connection Manager ● ● Alsoa single thread, accepting new connections and polling for read/write ready on existing Now can be split to multiple threads ● ● Impact depends on number of connections Polling for write is no longer handled by the listener thread ● ● ● Removes one level of locks and indirection Simplifies WriteTimeout implementation Typically no benchmark impact, only significant when blocking on writes due to slow clients
  • 12.
    OpenLDAP Frontend Frontend Improvements,Quadcore VM 40000 35000 30000 Ops/Second 25000 SearchRate AuthRate ModRate 20000 15000 10000 5000 0 OL 2.4 OL 2.5
  • 13.
    LMDB Impact ● Adoption bymany other projects ● Outperforms all other embedded databases in common applications ● ● CFengine, Postfix, PowerDNS, etc. Has none of the reliability/integrity weaknesses of other databases ● Has none of the licensing issues... ● Integrated into multiple NoSQL projects ● Redis, SkyDB, Memcached, HyperDex, etc.
  • 14.
    LMDB Microbenchmark ● ● ● Comparisons basedon Google's LevelDB Also tested against Kyoto Cabinet's TreeDB, SQLite3, and BerkeleyDB Tested using RAM filesystem (tmpfs), reiserfs on SSD, and multiple filesystems on HDD – btrfs, ext2, ext3, ext4, jfs, ntfs, reiserfs, xfs, zfs – ext3, ext4, jfs, reiserfs, xfs also tested with external journals
  • 15.
    LMDB Microbenchmark Relative Footprint text data bss dec hex filename 272247 1456 328 274031 42e6fdb_bench 1675911 2288 304 1678503 199ca7 db_bench_bdb 90423 1508 304 92235 1684b db_bench_mdb 653480 7768 1688 662936 a2764 db_bench_sqlite3 296572 4808 1096 302476 49d8c db_bench_tree_db Clearly LMDB has the smallest footprint – Carefully written C code beats C++ every time
  • 16.
    LMDB Microbenchmark Read Performance ReadPerformance Small Records Small Records 16000000 800000 14000000 700000 12000000 600000 10000000 500000 8000000 400000 6000000 300000 4000000 200000 2000000 100000 0 0 Sequential SQLite3 TreeDB LevelDB Random BDB MDB SQLite3 TreeDB LevelDB BDB MDB
  • 17.
    LMDB Microbenchmark Read Performance ReadPerformance Large Records Large Records 35000000 2000000 30303030 30000000 1718213 1800000 1600000 25000000 1400000 1200000 20000000 1000000 15000000 800000 600000 10000000 400000 5000000 0 7402 16514 299133 200000 9133 0 7047 14518 Sequential SQLite3 TreeDB LevelDB 15183 8646 Random BDB MDB SQLite3 TreeDB LevelDB BDB MDB
  • 18.
    LMDB Microbenchmark Read Performance ReadPerformance Large Records Large Records 100000000 30303030 10000000 1718213 1000000 1000000 299133 100000 100000 10000 10000000 7402 16514 10000 9133 7047 14518 15183 8646 1000 1000 100 100 10 10 1 1 Sequential SQLite3 TreeDB LevelDB BDB Random MDB SQLite3 TreeDB LevelDB BDB MDB
  • 19.
    LMDB Microbenchmark Asynchronous WritePerformance Asynchronous Write Performance Large Records, tmpfs Large Records, tmpfs 14000 12905 14000 12000 12000 10000 10000 8000 12735 8000 5860 6000 4000 2000 5709 6000 4000 3366 2029 1920 2000 0 2004 1902 742 0 Sequential SQLite3 TreeDB LevelDB Random BDB MDB SQLite3 TreeDB LevelDB BDB MDB
  • 20.
    LMDB Microbenchmark Batched WritePerformance Batched Write Performance Large Records, tmpfs Large Records, tmpfs 14000 13215 14000 12000 12000 10000 10000 8000 13099 8000 5860 6000 4000 2000 5709 6000 4000 3138 2068 1952 2000 0 3079 2041 1939 0 Sequential SQLite3 TreeDB LevelDB Random BDB MDB SQLite3 TreeDB LevelDB BDB MDB
  • 21.
    LMDB Microbenchmark Synchronous WritePerformance Synchronous Write Performance Large Records, tmpfs Large Records, tmpfs 14000 12916 14000 12000 12000 10000 10000 8000 8000 6000 12665 6000 4000 2000 3121 4000 3368 2026 1913 2000 0 1996 2162 1893 745 0 Sequential SQLite3 TreeDB LevelDB Random BDB MDB SQLite3 TreeDB LevelDB BDB MDB
  • 22.
    MemcacheDB Read Performance Write Performance SingleThread, Log Scale Single Thread, Log Scale 1000 100 msec 1 0.1 100 min avg max90th max95th max99th max 10 msec min avg max90th max95th max99th max 10 1 0.1 0.01 0.01 BDB 4.7 MDB Memcached BDB 4.7 MDB Memcached
  • 23.
    MemcacheDB Read Performance Write Performance 4Threads, Log Scale 4 Threads, Log Scale 10 1000 msec 1 100 min avg max90th max95th max99th max 10 msec min avg max90th max95th max99th max 1 0.1 0.1 0.01 0.01 BDB 4.7 MDB Memcached BDB 4.7 MDB Memcached
  • 24.
    HyperDex ● New generation NoSQLdatabase server ● http://hyperdex.org ● Simple configuration/deployment ● Multidimensional indexing/sharding ● Efficient distributed search engine ● ● Built on Google LevelDB, evolved to their fixed version HyperLevelDB Ported to LMDB
  • 25.
  • 26.
    LMDB, HyperDex ● CPU timeused for inserts : ● ● ● ● LMDB 19:44.52 HyperLevelDB 96:46.96 HyperLevelDB used 4.9x more CPU for same number of operations Again, performance isn't the point. Throwing extra CPU at a job to "make it go faster" is stupid.
  • 27.
  • 28.
    LMDB, HyperDex ● CPU timeused for read/update : – LMDB 1:33.17 – HyperLevelDB 3:37.67 ● HyperLevelDB used 2.3x more CPU for same number of operations
  • 29.
  • 30.
    LMDB, HyperDex ● CPU timeused for inserts : ● ● ● LMDB 227:26 HyperLevelDB 3373:13 HyperLevelDB used 14.8x more CPU for same number of operations
  • 31.
  • 32.
    LMDB, HyperDex ● CPU timeused for read/update : – LMDB 4:21.41 – HyperLevelDB 17:27 ● HyperLevelDB used 4.0x more CPU for same number of operations
  • 33.
    back-hyperdex ● New clustered backendbuilt on HyperDex ● ● Existing back-ndb clustered backend is deprecated, Oracle has refused to cooperate on support Nearly complete LDAP support ● ● ● Currently has limited search filter support Uses flat (back-bdb style) namespace, not hierarchical Still in prototype stage as HyperDex API is still in flux
  • 34.
    Samba4/AD ● Samba4 provides itsown ActiveDirectorycompatible LDAP service ● ● ● built on Samba ldb/tdb libraries supports AD replication Has some problems ● ● ● Incompatible with Samba3+OpenLDAP deployments Originally attempted to interoperate with OpenLDAP, but that work was abandoned Poor performance
  • 35.
    Samba4/AD ● OpenLDAP interop workrevived ● two opposite approaches being pursued in parallel ● ● resurrect original interop code port functionality into slapd overlays ● currently about 75 % of the test suite passes ● keep an eye on contrib/slapd-modules/samba4
  • 36.
    Other Features ● cn=config enhancements ● ● ● SupportLDAPDelete op Support slapmodify/slapdelete offline tools LDAP transactions ● ● Needed for Samba4 support Frontend/overlay restructuring ● Rationalize Bind and ExtendedOp result handling ● Other internal API cleanup
  • 37.
    What's Missing ● Deprecated BerkeleyDB-basedbackends ● back-bdb was deprecated in 2.4 ● back-hdb deprecated in 2.5 ● both scheduled for deletion in 2.6 ● configure switches renamed, so existing packager scripts can no longer enable them without explicit action
  • 38.
  • 39.