A Backend to tie them all?
Emmanuel Lécharny
Apache Software Foundation member
Chairman of MINA project
PMC of Apache Directory Project
IKTEK Owner (www.iktek.com)
www.iktek.com, elecharny@iktek.com
What is the Backend anyway?

Entries
+
Indexes
Entries/Index

dn: dc=example,dc=com
objectClass: top
objectClass: domain
dc: example

dn: cn=user1,dc=example,dc=com
objectClass: top
objectClass: person
cn: user1
sn: User one

dn: cn=user2,dc=example,dc=com
objectClass: top
objectClass: person
cn: user2
sn: User two
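
The "entries + indexes" idea can be sketched as a master table keyed by DN plus one equality index per attribute. This is only an illustration: the names `master`, `build_index`, and `search_eq` are hypothetical and are not ApacheDS APIs.

```python
from collections import defaultdict

# Master table: DN -> entry (attribute name -> list of values).
master = {
    "dc=example,dc=com": {
        "objectClass": ["top", "domain"], "dc": ["example"]},
    "cn=user1,dc=example,dc=com": {
        "objectClass": ["top", "person"], "cn": ["user1"], "sn": ["User one"]},
    "cn=user2,dc=example,dc=com": {
        "objectClass": ["top", "person"], "cn": ["user2"], "sn": ["User two"]},
}


def build_index(entries, attribute):
    """Map each lower-cased value of `attribute` to the set of DNs holding it."""
    index = defaultdict(set)
    for dn, entry in entries.items():
        for value in entry.get(attribute, []):
            index[value.lower()].add(dn)
    return index


def search_eq(index, value):
    """Return the sorted DNs whose indexed attribute equals `value`."""
    return sorted(index.get(value.lower(), set()))


object_class_index = build_index(master, "objectClass")
cn_index = build_index(master, "cn")
```

A filter such as (objectClass=person) then becomes `search_eq(object_class_index, "person")`, which reads the index instead of scanning every entry.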
Criteria

Performance
Size
Cost
Reliability
In Memory
Backend
In Memory Backend
Internal data structures :
Btrees,
AVLs,
HashMap,
Lists,
...
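
A hash map answers exact-match lookups, while an ordered structure (B-tree, AVL) also answers prefix and range lookups. The sketch below uses a sorted Python list with `bisect` as a stand-in for such a tree. The class name `SortedIndex` is hypothetical, and a real tree would avoid the O(n) cost of inserting into a list.

```python
import bisect


class SortedIndex:
    """Sorted (value, dn) pairs; a stand-in for an AVL tree or B-tree."""

    def __init__(self):
        self._pairs = []

    def add(self, value, dn):
        # Keeps the list ordered; list insertion itself is O(n).
        bisect.insort(self._pairs, (value, dn))

    def prefix(self, start):
        """Return DNs whose value starts with `start`, in value order."""
        # Binary search for the first pair that is >= (start, "").
        lo = bisect.bisect_left(self._pairs, (start, ""))
        result = []
        for value, dn in self._pairs[lo:]:
            if not value.startswith(start):
                break  # matching values are contiguous in sorted order
            result.append(dn)
        return result
```

With this layout a substring filter such as (cn=user*) only touches the matching slice of the index.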
In Memory Backend Usage
Cache
Ultra-fast server
No local storage
Tests
In Memory Backend

Performance
Size
Cost
Reliability
LDIF
Backend
LDIF Backend

version: 1

dn: dc=example,dc=com
objectClass: top
objectClass: domain
dc: example

dn: cn=user1,dc=example,dc=com
objectClass: top
objectClass: person
cn: user1
sn: User one

dn: cn=user2,dc=example,dc=com
objectClass: top
objectClass: person
cn: user2
sn: User two
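
Loading such a file can be sketched as below. This is a deliberately simplified reader, not a full LDIF implementation: it assumes records are separated by blank lines, and it ignores line folding, base64 ("::") values, and change records. The function name `parse_ldif` is hypothetical.

```python
def parse_ldif(text):
    """Parse a simple LDIF string into a list of (dn, attributes) pairs."""
    entries = []
    dn, attrs = None, {}
    # The extra "" acts as a final blank line that flushes the last record.
    for line in text.splitlines() + [""]:
        line = line.rstrip()
        if not line:
            if dn is not None:
                entries.append((dn, attrs))
            dn, attrs = None, {}
            continue
        if line.startswith("#"):
            continue  # comment line
        name, _, value = line.partition(":")
        name, value = name.strip().lower(), value.strip()
        if name == "version" and dn is None:
            continue  # file header, not part of any entry
        if name == "dn":
            dn = value
        else:
            attrs.setdefault(name, []).append(value)
    return entries


SAMPLE = """version: 1

dn: dc=example,dc=com
objectClass: top
objectClass: domain
dc: example

dn: cn=user1,dc=example,dc=com
objectClass: top
objectClass: person
cn: user1
sn: User one
"""
```

Attribute names are lower-cased on load because LDAP attribute names are case-insensitive.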
LDIF Backend Usage
Configuration
Schema
Tests
LDIF Backend

Performance
Size
Cost

Reliability
BTree
Backend
BTree Backend

Various possible implementations:
B-tree, B+ tree, B* tree, ...
BTree Backend

            Average       Worst case
Space       O(n)          O(n)
Search      O(log n)      O(log n)
Insert      O(log n)      O(log n)
Delete      O(log n)      O(log n)
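
The logarithmic search and insert costs come from the tree staying balanced: full nodes are split on the way down, so every leaf sits at the same depth. Below is a minimal textbook-style B-tree sketch (search and insert only, no delete, no persistence). It is not the ApacheDS implementation, and the names `BTree`, `Node`, and the minimum degree `t` are illustrative.

```python
import bisect


class Node:
    def __init__(self, leaf=True):
        self.keys = []
        self.children = []
        self.leaf = leaf


class BTree:
    def __init__(self, t=2):
        self.t = t  # minimum degree: a node holds at most 2t - 1 keys
        self.root = Node()

    def search(self, key):
        node = self.root
        while True:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return True
            if node.leaf:
                return False
            node = node.children[i]

    def insert(self, key):
        if len(self.root.keys) == 2 * self.t - 1:
            # A full root is split first; this is the only way the tree grows.
            new_root = Node(leaf=False)
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        node = self.root
        while not node.leaf:
            i = bisect.bisect_right(node.keys, key)
            if len(node.children[i].keys) == 2 * self.t - 1:
                self._split_child(node, i)
                if key > node.keys[i]:
                    i += 1
            node = node.children[i]
        bisect.insort(node.keys, key)

    def _split_child(self, parent, i):
        """Split the full child at index i, moving its median key up."""
        t = self.t
        full = parent.children[i]
        right = Node(leaf=full.leaf)
        median = full.keys[t - 1]
        right.keys = full.keys[t:]
        full.keys = full.keys[:t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]
        parent.keys.insert(i, median)
        parent.children.insert(i + 1, right)

    def height(self):
        """Number of levels from the root down to the leaves."""
        levels, node = 1, self.root
        while not node.leaf:
            node = node.children[0]
            levels += 1
        return levels
```

A larger `t` gives wider nodes and a shallower tree, which is why on-disk B-trees usually size a node to fit a disk page.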
BTree Backend
Usage:
Pretty much everything
Improvements:
Hashed keys
Cache
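
The slide does not say what is hashed or what is cached, so the following is only one possible reading: a fixed-size digest of the normalized DN used as the B-tree key, and a small LRU cache in front of the backend. The names `hashed_key` and `LruCache` are hypothetical, and the DN normalization is naive (it does not handle escaped commas).

```python
import hashlib
from collections import OrderedDict


def hashed_key(dn):
    """Fixed-size 8-byte key derived from a normalized DN (assumed scheme)."""
    normalized = ",".join(part.strip().lower() for part in dn.split(","))
    return hashlib.sha256(normalized.encode("utf-8")).digest()[:8]


class LruCache:
    """Keeps the most recently used entries; evicts the oldest when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        """Return the cached value, calling load(key) on a miss."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = load(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
        return value
```

Hashed keys are short and uniform in size but lose the natural ordering of DNs, so they suit exact lookups rather than range scans.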
BTree Backend

Performance
Size
Cost
Reliability
MVCC
Backend
MVCC Backend
Usage:
Pretty much everything
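
With MVCC (multi-version concurrency control), a writer never modifies data that a reader may be looking at: each commit publishes a new revision, and readers keep using the revision they started with. The sketch below copies the whole store on every commit to keep the idea visible; an MVCC B-tree would copy only the modified pages from leaf to root. The class name `MvccStore` is hypothetical.

```python
import threading
from types import MappingProxyType


class MvccStore:
    """Each commit publishes a new immutable revision; readers pin one."""

    def __init__(self):
        self._revisions = [MappingProxyType({})]  # revision 0: empty store
        self._write_lock = threading.Lock()       # one writer at a time

    def snapshot(self):
        """Return (revision number, read-only view) of the latest commit."""
        rev = len(self._revisions) - 1
        return rev, self._revisions[rev]

    def commit(self, changes):
        """Apply {key: value, or None to delete}; return the new revision."""
        with self._write_lock:
            data = dict(self._revisions[-1])  # copy, never modify in place
            for key, value in changes.items():
                if value is None:
                    data.pop(key, None)
                else:
                    data[key] = value
            self._revisions.append(MappingProxyType(data))
            return len(self._revisions) - 1
```

A reader that took its snapshot before a commit keeps seeing the old data, without taking any lock. A real backend would also need to discard revisions that no reader still uses.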
MVCC Backend

Performance
Size

Cost

Reliability
RDBMS
Backend
RDBMS Database
RDBMS Backend
Usage:
Existing DBAs...

ApacheDS experimentation:
Oracle Partition

http://svn.apache.org/viewvc/directory/apacheds/branches/ldif-partition/oracle-partition/?pathrev=982537
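
To give an idea of what storing entries in a relational database can look like, here is a minimal sketch using SQLite from the Python standard library. The schema (one table of entries, one table of attribute values) is an assumption chosen for illustration; it is not the schema of the Oracle partition linked above, and the function names are hypothetical.

```python
import sqlite3


def open_store():
    """Create an in-memory database with an entry table and a value table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE entry (
            id INTEGER PRIMARY KEY,
            dn TEXT UNIQUE NOT NULL
        );
        CREATE TABLE attr_value (
            entry_id INTEGER NOT NULL REFERENCES entry(id),
            name     TEXT NOT NULL,
            value    TEXT NOT NULL
        );
        CREATE INDEX attr_value_idx ON attr_value (name, value);
    """)
    return db


def add_entry(db, dn, attributes):
    """Insert one entry; `attributes` maps attribute name -> list of values."""
    cur = db.execute("INSERT INTO entry (dn) VALUES (?)", (dn,))
    rows = [(cur.lastrowid, name.lower(), value)
            for name, values in attributes.items() for value in values]
    db.executemany("INSERT INTO attr_value VALUES (?, ?, ?)", rows)


def search_eq(db, name, value):
    """Return the DNs of entries where attribute `name` equals `value`."""
    rows = db.execute(
        "SELECT e.dn FROM entry e JOIN attr_value a ON a.entry_id = e.id "
        "WHERE a.name = ? AND a.value = ? ORDER BY e.dn",
        (name.lower(), value))
    return [dn for (dn,) in rows]
```

Every equality filter becomes a join between the two tables, which hints at why mapping a hierarchical, multi-valued model onto SQL tends to cost performance.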
RDBMS Backend

Performance

Size
Cost

Reliability
NoSQL
Backend
NoSQL Backend
Distributed Configuration Service

Distributed File System
NoSQL Backend
Usage:
Replicated backend
Huge base

ApacheDS experimentation:

http://www.stefan-seelmann.de/media/presentations/3rdOpenHUG2010_Seelmann_ApacheDirectoryHBase.pdf
http://svn.apache.org/repos/asf/directory/sandbox/seelmann/
NoSQL Backend

Performance

Size
Cost
Reliability
Thanks!

A Backend to tie them all?