File System On Steroids

Presentation at ApacheCon EU 2008 in Amsterdam

Transcript

  • 1. File system on steroids: an introduction to JCR
      • Jukka Zitting
      • Apache Jackrabbit
  • 2. Agenda
    • Big Picture
    • Content Repository
    • Repository Features
    • Apache Jackrabbit
  • 3. The Big Picture
    • User Interface
    • Processing
    • Storage
  • 4. Our Focus: Storage
    • Main requirements
    • Persistence
    • Consistency
    • Scalability
    • Performance
    • Main alternatives
    • File system
    • Database
    • Network
  • 5. Introducing The Content Repository
    • Diagram comparing: File system, Database, Content Repository
    • Features shown: read, write, transactions, structured, integrity, query, hierarchical, streams, access control, locking, observation, versioning, full text, unstructured
  • 6. JCR, JSR 170, JSR 283
    • Content Repository for Java Technology API
      • Not just the Java API, but also the content repository semantics
      • For comparison: the POSIX file system is defined as a C API
    • Accessible from other environments
      • JVM: Groovy, JRuby, Scala, etc.
      • Network: WebDAV, Ajax (JSON)
      • Ports planned: .NET, PHP
  • 7. Why Something New?
    • Goal: Single API for all storage
      • Universal access
      • No content silos
    • Existing systems don't cover all needs
      • Reiser: “Storage layers above the FS: A sure symptom the FS developer has failed”
    • Solution: Content repository
  • 8. Content Repository Semantics
    • Everything is content
      • Hierarchy of named and typed nodes
      • Content in named and typed properties
    • Superset of file system semantics
      • Can be used to store files and folders, and more
      • Can be mounted as a file system
    • With many database semantics
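The "hierarchy of named and typed nodes" with "named and typed properties" on slide 8 can be pictured with a small in-memory model. This is only a sketch using the Java standard library: `SketchNode`, `SketchNodeDemo`, and the type names "root", "folder", and "file" are hypothetical and are not the JCR API, which the slides do not show.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory model of a content tree; NOT the javax.jcr API. */
class SketchNode {
    final String name;
    final String type; // illustrative type label, e.g. "folder" or "file"
    final Map<String, SketchNode> children = new LinkedHashMap<>();
    final Map<String, Object> properties = new LinkedHashMap<>();

    SketchNode(String name, String type) {
        this.name = name;
        this.type = type;
    }

    /** Adds a named, typed child node and returns it. */
    SketchNode addNode(String childName, String childType) {
        SketchNode child = new SketchNode(childName, childType);
        children.put(childName, child);
        return child;
    }

    /** Stores a named property; the Java type of the value stands in for the property type. */
    SketchNode setProperty(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    /** Resolves a slash-separated relative path such as "docs/readme.txt"; null if absent. */
    SketchNode getNode(String relPath) {
        SketchNode current = this;
        for (String segment : relPath.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            current = current.children.get(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}

class SketchNodeDemo {
    /** Builds a folder with one file that carries both scalar and binary properties. */
    static SketchNode build() {
        SketchNode root = new SketchNode("", "root");
        SketchNode docs = root.addNode("docs", "folder");
        docs.addNode("readme.txt", "file")
            .setProperty("title", "Read me")                                   // scalar string
            .setProperty("size", 5L)                                           // scalar number
            .setProperty("data", "hello".getBytes(StandardCharsets.UTF_8));    // binary content
        return root;
    }

    public static void main(String[] args) {
        SketchNode file = build().getNode("docs/readme.txt");
        System.out.println(file.type + " " + file.properties.keySet());
    }
}
```

Files and folders fit this model as two node types among many, which is the sense in which the slide calls the repository a superset of file system semantics.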
  • 9. Granularity of Content
  • 10. Granularity of Content, 1/2
    • File systems are typically best with coarse grained content
      • Small files in ReiserFS, NTFS, etc.
      • Extended properties in many systems
    • XML & co for fine grained content
      • DJB: “Don't parse”
  • 11. Granularity of Content, 2/2
    • Databases are best with fine grained content
      • Blobs are becoming better supported
      • Often special limitations for search, access, etc.
    • Content repository: Uniform interface for both stream and scalar properties
  • 12. Structure vs. Flexibility
  • 13. Structure vs. Flexibility
    • File systems have no constraints
      • Any file or directory can go anywhere
      • Naming conventions and access control
    • Databases have nothing but constraints
      • Structure of content is predefined
    • Content repository: Both structured and unstructured content
  • 14. Search
  • 15. Search
    • Traditionally no search in file systems
    • Custom indexers and search APIs
      • Google Desktop Search
      • Mac OS X Spotlight
      • Lucene in many applications
    • Content repository: Built-in search with full text indexing
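To make "custom indexers" on slide 15 concrete, here is a minimal sketch of the kind of inverted index an application might bolt on top of a file system. It assumes plain Java collections only; `TinyIndex` and its methods are hypothetical names, and this is neither Lucene nor the search facility of any content repository.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical minimal inverted index: maps each lower-cased word to the paths containing it. */
class TinyIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Tokenizes the text on non-letter, non-digit characters and records the path per token. */
    void add(String path, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(path);
            }
        }
    }

    /** Returns the sorted set of paths whose text contained the word (case-insensitive). */
    Set<String> search(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("/docs/a.txt", "The content repository stores content.");
        index.add("/docs/b.txt", "A file system stores files.");
        System.out.println(index.search("stores"));
    }
}
```

An index like this has to be kept in sync with the files by the application itself, which is the maintenance burden that built-in search removes.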
  • 16. Transactions
  • 17. Transactions
    • File systems have limited support for atomic updates
      • The copy-and-move trick
    • No transactions that cover multiple changes
      • Journaling is internal to the system
    • Content repository: Change sets, distributed transactions
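The "copy-and-move trick" on slide 17 is commonly understood as: write the new content to a temporary file, then rename it over the original so that readers see either the old or the new version. A minimal sketch under that reading, using `java.nio.file`; the class name `AtomicReplace` and the file names are hypothetical, and whether an atomic move replaces an existing target is implementation specific according to the Java documentation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper showing the write-to-temp-then-rename pattern for a single file. */
class AtomicReplace {
    /** Writes content to a temp file in the target's directory, then moves it over the target. */
    static void write(Path target, String content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Same directory as the target, so the move stays within one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            // On POSIX systems this is a rename; replacing an existing target is
            // implementation specific and may fail with an IOException elsewhere.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    /** Overwrites one file twice in a scratch directory and returns its final content. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("atomic-demo");
            Path target = dir.resolve("config.txt");
            write(target, "first");
            write(target, "second");
            String content = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            Files.delete(target);
            Files.delete(dir);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The pattern protects one file at a time. Updating two files together still leaves a window where only one has changed, which is the gap the slide's "no transactions that cover multiple changes" bullet points at.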
  • 18. Versioning
  • 19. Versioning
    • Typically no tracking of previous versions of content
      • Snapshots in ZFS & co.
      • Version control systems
    • Backups for archival vs. restore purpose
      • Mac OS X Time Machine
    • Content repository: Built-in versioning
  • 20. Observation
  • 21. Observation
    • File system change monitoring
      • File Alteration Monitor
      • Polling
      • Event APIs
    • Triggers in databases
    • Content repository: Standard observation API
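The "Polling" bullet on slide 21 can be sketched as taking periodic snapshots of a directory and diffing them. This is an illustrative sketch with the Java standard library; `PollingWatcher`, the "size@mtime" fingerprint, and the event strings are assumptions made for the example, not part of any observation API named in the slides.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical polling-based change detector for a single directory (non-recursive). */
class PollingWatcher {
    /** Maps each entry name in dir to a "size@lastModifiedMillis" fingerprint. */
    static Map<String, String> snapshot(Path dir) {
        Map<String, String> result = new TreeMap<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                result.put(entry.getFileName().toString(),
                        Files.size(entry) + "@" + Files.getLastModifiedTime(entry).toMillis());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Compares two snapshots and reports added, changed, and removed entries. */
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        List<String> events = new ArrayList<>();
        for (String name : after.keySet()) {
            if (!before.containsKey(name)) {
                events.add("added " + name);
            } else if (!before.get(name).equals(after.get(name))) {
                events.add("changed " + name);
            }
        }
        for (String name : before.keySet()) {
            if (!after.containsKey(name)) {
                events.add("removed " + name);
            }
        }
        return events;
    }

    /** Takes a snapshot, creates one file, takes another snapshot, and returns the diff. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("poll-demo");
            Map<String, String> before = snapshot(dir);
            Path created = Files.createFile(dir.resolve("new.txt"));
            List<String> events = diff(before, snapshot(dir));
            Files.delete(created);
            Files.delete(dir);
            return events;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Polling misses changes that happen and revert between two snapshots and costs a full directory scan each round. Event APIs, such as `java.nio.file.WatchService` in Java, avoid the scan but vary by platform, which is the argument for a standard observation API.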
  • 22. Apache Jackrabbit
  • 23. Apache Jackrabbit
    • Fully featured JCR content repository
    • Releases
      • 1.0 in 2006
      • 1.4 available since January 2008
      • 1.5 (with explorer) planned for Q2
      • 2.0 (with JCR 2.0) planned for 2008
    • Focus on conformance and flexibility
  • 24. Image credits
    • Images from the morgueFile archive, used as licensed
      • http://morguefile.com/archive/?display=96733 , Infographe_Elle
      • http://morguefile.com/archive/?display=81906 , msxo
      • http://morguefile.com/archive/?display=132988 , imelenchon
      • http://morguefile.com/archive/?display=95446 , ronnieb
      • http://morguefile.com/archive/?display=175657 , seriousfun
      • http://morguefile.com/archive/?display=135511 , rollingroscoe
      • http://morguefile.com/archive/?display=134540 , cohdra
      • http://morguefile.com/archive/?display=196920 , penywise
      • http://morguefile.com/archive/?display=48096 , bluekdesign
      • http://morguefile.com/archive/?display=128133 , gracey
  • 25. Thank you!
    • Questions / Comments?