Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Oak, the Architecture of the new Repository

6,711 views

Published on

Apache Jackrabbit Oak is a new JCR implementation with a completely new architecture. Based on concepts like eventual consistency and multi-version concurrency control, and borrowing ideas from distributed version control systems and cloud-scale databases, the Oak architecture is a major leap ahead for Jackrabbit. This presentation describes the Oak architecture and shows what it means for the scalability and performance of modern content applications. Changes to existing Jackrabbit functionality are described and the migration process is explained.

Published in: Software
  • Be the first to comment

Oak, the Architecture of the new Repository

  1. 1. APACHE SLING & FRIENDS TECH MEETUP BERLIN, 22-24 SEPTEMBER 2014 Oak, the Architecture of the new Repository Michael Dürig, Adobe Research
  2. 2. Design goals  Scalable  Big repositories  Clustering  Customisable, flexible  OSGi friendly adaptTo() 2014 2
  3. 3. Outline  CRUD  Changes  Search adaptTo() 2014 3
  4. 4. Tree model a d b c adaptTo() 2014 4
  5. 5. Updating ? x a d b c adaptTo() 2014 5
  6. 6. MVCC HEAD r1: / r2: / a d b c r1: /d r2: /d r1: /a/b r2: /a/b r2: /d/x adaptTo() 2014 6
  7. 7. Refresh and Garbage Collection adaptTo() 2014 7
  8. 8. Refresh garbage adaptTo() 2014 8
  9. 9. Garbage collection garbage adaptTo() 2014 9
  10. 10. Concurrency and Conflicts adaptTo() 2014 10
  11. 11. Concurrent updates r2a r1 r2b adaptTo() 2014 11
  12. 12. Merging merge r2a upates r1 r3 r2b adaptTo() 2014 12
  13. 13. Conflict handling: serialisation  Fully serialised  Fail, no concurrent update  Partially serialised  Concurrent conflict free updates adaptTo() 2014 13
  14. 14. Conflict handling strategies: merging  Partial merge  Conflict markers, deferred resolution  Full merge  Need to choose victim adaptTo() 2014 14
  15. 15. Replicas and Sharding adaptTo() 2014 15
  16. 16. Replica and caches master copy full replica cache adaptTo() 2014 16
  17. 17. Sharding strategies by path by level by hash with caching adaptTo() 2014 17
  18. 18. Implementations adaptTo() 2014 18
  19. 19. MicroKernel / NodeStore  Tree / Revision model implementation Responsible for Clustering Sharding Caching Conflict handling Not responsible for Validation Access control Search Versioning adaptTo() 2014 19
  20. 20. Current implementations DocumentMK TarMK (SegmentMK) Persistence MongoDB, JDBC Local FS Conflict handling Partial serialisation Full serialisation Clustering MongoDB clustering Simple failover Sharding MongoDB sharding N/A Node Performance Moderate High Key use cases Large deployments (>1TB), concurrent Small/medium deployments, mostly deployments, mostly read adaptTo() 2014 20
  21. 21. Access Control adaptTo() 2014 21
  22. 22. Accessible paths a d b c adaptTo() 2014 22
  23. 23. xistentialism  All paths traversable  Node may not exist  Decorator on NodeStore root.getChildNode("a").exists(); root.getChildNode("a") .getChildNode("b").exists(); ⟹ false ⟹ true adaptTo() 2014 23
  24. 24. Comparing Revisions adaptTo() 2014 24
  25. 25. Content diff  What changed between trees  Cornerstone for  Validation  Indexing  Observation  … adaptTo() 2014 25
  26. 26. What changed? Δ adaptTo() 2014 26
  27. 27. Example: merging r1 r2a r2b r3 Δ r1 ➞ r2a “a” modified “b” removed Δ r1 ➞ r2b “d” modified “x” added adaptTo() 2014 27
  28. 28. Commit Hooks adaptTo() 2014 28
  29. 29. Commit hooks  Key plugin mechanism  Higher level functionality  Validation (node type, access control, …)  Trigger (auto create, defaults, …)  Updates (index, …) adaptTo() 2014 29
  30. 30. Editing a commit Δ Δ + x adaptTo() 2014 30
  31. 31. Commit hooks  Based on content diff  pass a commit  fail a commit  edit a commit  Applied in sequence adaptTo() 2014 31
  32. 32. Type of hooks CommitHook Editor Validator Content diff Optional Always Always Can modify Yes Yes No Programming model Simple Callbacks Callbacks Performance impact High Medium Low adaptTo() 2014 32
  33. 33. Observers adaptTo() 2014 33
  34. 34. Observers  Observe changes  After commit  Often does a content diff  Asynchronous  Optionally synchronous  Local cluster node only adaptTo() 2014 34
  35. 35. Examples  JCR observation  External index update  Cache invalidation  Logging adaptTo() 2014 35
  36. 36. Search adaptTo() 2014 36
  37. 37. Query Engine SELECT WHERE x=y /a//* Parse r parse execute post process Parse r Parser Parser Index Index Index Travers e adaptTo() 2014 37
  38. 38. Index Implementations  Property (ordered)  Reference  Lucene  In-content or file system  Solr  Embedded or external adaptTo() 2014 38
  39. 39. Big Picture adaptTo() 2014 39
  40. 40. Big picture JCR API Oak JCR Oak API Oak Core NodeStore API MicroKernel Plugins adaptTo() 2014 40
  41. 41. Resources http://jackrabbit.apache.org/oak/ adaptTo() 2014 41
  42. 42. Appendix adaptTo() 2014 42
  43. 43. Resources http://jackrabbit.apache.org/oak/ http://jackrabbit.apache.org/oak/docs/ https://svn.apache.org/repos/asf/jackrabbit/oak/trunk/ adaptTo() 2014 43

×