Sparse Content Map Storage System

Presentation given at a BOF at the Sakai 2011 Conference in LA. Describes the content system used for user content in Nakamura, the server for Sakai OAE.

Published in: Education

  1. User Content after JCR. Warning: This will be a low level technical presentation. Dr Ian Boston, Timefields Ltd, OAE Server Team Lead
  2. Content. Chart (scale 0 to 100) comparing Enterprise and Social content management across: Writers, Anon Readers, All Readers, Read Concurrency, Write Concurrency, Access Control
  3. JSR-170, JSR-283, JSR-333, JCR. Any guesses what type of content management?
  4. Jackrabbit. Diagram labels: concurrent; Session; ItemCache; concurrent read; Shared Item Cache; Single Threaded; PM
  5. Jackrabbit ACLs. Diagram labels: concurrent; Session; concurrent; ItemCache; Session; concurrent read; ItemCache; Shared Item Cache; Singleton; Single Threaded; Read Bypass; PM
  6. Jackrabbit Cluster. Diagram labels: three PMs, each Single Threaded; Shared Journal, Single Threaded; Lock Step Sequence Replay
  7. Social Content Store Requirements: High Concurrency; No Synchronisation; Lock free; Light Weight Sessions; Short Lived Sessions; Simple Flat Hierarchies; Clustering, Scaling; Versioning, ACLs; Storage Agnostic. Chart: Session footprint in K, Jackrabbit vs. Social Content (scale 0 to 40)
  8. Sparse Map Concept. Map Addressing: KeySpace, ColumnFamily, RowID; Map. Map Storage (column positions lost in extraction): columns cA cB cC cD; row RowID: v1 v2 v9; row RowID: v1 v2
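  9. Sparse Map Concept, updates (column positions lost in extraction). Before: columns cA cB cC cD; row RowID: v1 v2 v9. Update: columns cC cD; values v8, del. After: columns cA cB cC cD; row RowID: v1 v8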
  10. Hierarchy Model. JCR: Parent holds a list of children; iterate nodes in the list of children; Fast Listing. Sparse: each child carries a hash(parent) property; find all with child hash(parent) property; Slow Listing
  11. Hierarchy Model. JCR: /, /child, /child/subchild, ws2:/child/subchild. Sparse: /, /child, /child/subchild, alternate-root1, alternate-root1/childa, alternate-rootn, alternate-rootn/childb
  12. Threading Model. Repository: shared. Session, AccessControlManager, ContentManager, AuthorizableManager: not thread safe, not shared between threads, no sync, no locks, 1K size. Storage Client API: StorageClientPool (shared); StorageClient: Thread Bound, Long Lived Persistence Connection
  13. CachingContentManager. CachingManager; SharedCache; StorageCacheManager.getContentCache(); concurrent, write through, immutable values. Labels: Memory, Bundle, EhCache, Infinispan
  14. Objects. Exposed Objects: Content, User, Group, Authorizable (UserInternal, GroupInternal and InternalContent also shown). Manipulation Objects: AclModification
  15. Data Formats (Authorizables). Map addressing: n:au:ieb (keyspace : Column Family : rowid). Authorizable: id(string), name(string), type(string), principals(string[]). User: password(string). Group: members(string[]). Legend: members(string[]) is a key value pair named members containing a String[]; versionId(string) is a key value pair where the key is a versionId and the value is a String
  16. Data Formats (Content). Map addressing to Structure Map: n:cn:a/path/to/content (keyspace : Column Family : rowid), holding _:cid(string), _path(string), parenthash(string). Map addressing to Content Map: n:cn:d4f3s3g3sft (keyspace : Column Family : rowid), holding _:id(string), _path(string), _blockId, _blockId/streamA. StreamContentHelper: Files. BlockContentHelper: Maps of byte[]
  17. Data Formats (ACL). Map addressing to ACL Map: n:ac:cn;a/path/to/content (keyspace : Column Family : rowid), holding principal@g(int), principal@d(int). Static ACE: ieb@d(int), ieb@g(int). Token ACE: _tp_3ef45tr3w3e@g(int), _secretKey(String). Bitmap: 00000001 read 0x0001; 00000010 write 0x0002; 00000100 delete 0x0004; 0001..... readACL 0x1000; 0010..... writeACL 0x2000; 0100..... deleteACL 0x4000
  18. Content Versioning. Map addressing to Structure Map: n:cn:a/path/to/content (keyspace : Column Family : rowid), holding _:cid(string), _path(string), parenthash(string). Map addressing to Content Map: n:cn:d4f3s3g3sft (keyspace : Column Family : rowid), holding _:id(string), _path(string), _versionHistoryId, _previousVersion. Two further version maps, each holding _:id(string), _path(string), _versionHistoryId, _previousVersion, _nextVersion. Also shown: versionId(string) (twice), ms timestamp
  19. Content Linking. Map addressing to Structure Map: n:cn:a/path/to/content (keyspace : Column Family : rowid), holding _:cid(string), _path(string), parenthash(string). Map addressing to Content Map: n:cn:d4f3s3g3sft (keyspace : Column Family : rowid), holding _:id(string), _path(string), _versionHistoryId, _previousVersion. A further map holding _:cid(string), _path(string), parenthash(string)
  20. Cassandra Driver. keyspace, columnFamily, key; values -> byte[] in columns; incremental updates; bodies are rows of byte[], 64x1MB per row; find operations via lookup. Indexing and Finding: n:au:? user=Ian; user:au:ieb; ieb(ieb), ib236(ib236)
  21. Memory Driver. keyspace, columnFamily, key; values -> byte[] in columns; incremental updates; bodies are rows of byte[], 64x1MB per row; find operations via lookup. Indexing and Finding: n:au:? user=Ian; user:au:ieb; ieb(ieb), ib236(ib236). ConcurrentHashMap
  22. JDBC Driver. keyspace, columnFamily, hash(key); Whole Map -> byte[] in columns; Column Family Selects Table; bodies on Shared Filesystem; find operations via query (rowid, column, value); DB by DDL and SQL file; Derby, Oracle, MySQL, PostgreSQL
  23. Shard JDBC Driver. keyspace, columnFamily, hash(key); lookup SQL Statement; lookup Connection; DB(a-h), DB(g-m), DB(m-z)
  24. Core Sparse Performance. 100% Concurrent, no waits; 1K Sessions. User Adds: Memory 33000/s, Cassandra 3500/s, MySQL 100/s. Pure Jar, can be used without OSGi
  25. Nakamura. Jackrabbit: Application Content, Enterprise Content, Updated every month. Sparse: User Content, Social Content, Updated every ms
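
The incremental update on slide 9 and the ConcurrentHashMap-backed Memory Driver on slide 21 can be sketched roughly as follows. This is a minimal illustration, not the project's actual driver: the class name `SparseMapSketch`, the `DELETE` marker and the method names are hypothetical, and the starting columns (cA=v1, cB=v2, cD=v9) are an assumption, because the column positions were lost when the slide was extracted. The only detail taken from the slides is the keyspace:columnFamily:rowid addressing (for example n:au:ieb).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

final class SparseMapSketch {
    /** Hypothetical marker: a column mapped to this value is removed by update(). */
    static final Object DELETE = new Object();

    // One immutable column map per row, addressed by keyspace:columnFamily:rowid.
    private final Map<String, Map<String, Object>> rows = new ConcurrentHashMap<>();

    static String rowKey(String keySpace, String columnFamily, String rowId) {
        return keySpace + ":" + columnFamily + ":" + rowId;
    }

    /** Applies only the changed columns; untouched columns keep their values. */
    void update(String keySpace, String columnFamily, String rowId,
                Map<String, Object> changes) {
        // compute() is atomic per key, so concurrent writers never see a half-applied row.
        rows.compute(rowKey(keySpace, columnFamily, rowId), (key, current) -> {
            Map<String, Object> next = new HashMap<>();
            if (current != null) {
                next.putAll(current);
            }
            for (Map.Entry<String, Object> change : changes.entrySet()) {
                if (change.getValue() == DELETE) {
                    next.remove(change.getKey());
                } else {
                    next.put(change.getKey(), change.getValue());
                }
            }
            return Collections.unmodifiableMap(next);
        });
    }

    /** Returns the row's columns, or an empty map if the row does not exist. */
    Map<String, Object> get(String keySpace, String columnFamily, String rowId) {
        return rows.getOrDefault(rowKey(keySpace, columnFamily, rowId),
                Collections.<String, Object>emptyMap());
    }

    public static void main(String[] args) {
        SparseMapSketch store = new SparseMapSketch();

        Map<String, Object> initial = new HashMap<>();
        initial.put("cA", "v1");
        initial.put("cB", "v2");
        initial.put("cD", "v9"); // assumed column; see the note above
        store.update("n", "cn", "row1", initial);

        Map<String, Object> changes = new HashMap<>();
        changes.put("cC", "v8");   // set one column
        changes.put("cD", DELETE); // delete another
        store.update("n", "cn", "row1", changes);

        // prints {cA=v1, cB=v2, cC=v8}
        System.out.println(new TreeMap<>(store.get("n", "cn", "row1")));
    }
}
```

Storing an unmodifiable map per row mirrors the "immutable values" note on slide 13: readers can hold a row without locking, and a writer replaces the whole row reference in one step.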

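The permission bitmap on slide 17 lends itself to a small worked example. The sketch below uses the bit values from the slide (read 0x0001 through deleteACL 0x4000) and the `principal@g` / `principal@d` key naming. Reading `@g` as "grant" and `@d` as "deny", and letting a deny bit override a grant bit, are assumptions made for illustration; the class and method names are hypothetical, and combining ACLs across several principals or parent paths is not shown.

```java
import java.util.HashMap;
import java.util.Map;

final class AclBitmapSketch {
    // Bit values as listed on slide 17.
    static final int READ = 0x0001;
    static final int WRITE = 0x0002;
    static final int DELETE = 0x0004;
    static final int READ_ACL = 0x1000;
    static final int WRITE_ACL = 0x2000;
    static final int DELETE_ACL = 0x4000;

    /**
     * Checks whether every bit in {@code permission} is granted and not denied.
     * Assumption: "@g" holds grant bits, "@d" holds deny bits, and deny wins.
     */
    static boolean isAllowed(Map<String, Integer> aclMap, String principal, int permission) {
        int granted = aclMap.getOrDefault(principal + "@g", 0);
        int denied = aclMap.getOrDefault(principal + "@d", 0);
        int effective = granted & ~denied;
        return (effective & permission) == permission;
    }

    public static void main(String[] args) {
        // ACL map for one content path, keyed the way the slide shows (ieb@g, ieb@d).
        Map<String, Integer> aclMap = new HashMap<>();
        aclMap.put("ieb@g", READ | WRITE | READ_ACL);
        aclMap.put("ieb@d", WRITE);

        System.out.println(isAllowed(aclMap, "ieb", READ));            // prints true
        System.out.println(isAllowed(aclMap, "ieb", WRITE));           // prints false
        System.out.println(isAllowed(aclMap, "ieb", READ | READ_ACL)); // prints true
        System.out.println(isAllowed(aclMap, "ieb", DELETE));          // prints false
    }
}
```

Keeping each principal's grants and denies as two ints means an ACL is just another sparse map row, so it can be stored and updated with the same column-level operations as content.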