Oracle Berkeley DB Java Edition: Simple Java Object Persistence

An in-depth look at the Direct Persistence Layer (DPL), which closely resembles JPA but avoids the overhead of translating objects to their SQL/relational equivalents.

Published in: Technology

  1. 1. Oracle Berkeley DB Java Edition: Simple Transactional Java Object Storage
  2. 2. Agenda
     - Code to store, access, and modify objects
     - Overview of Berkeley DB Java Edition
     - Performance versus ORM Solutions
     - Customers of Berkeley DB Java Edition
     - Q&A
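  3. 3. Prepare an object to be persistent

         import com.sleepycat.persist.model.Persistent;

         // Embeddable value class; stored inside the entities that reference it.
         @Persistent
         class Address {
             String street;
             String city;
             String state;
             int zipCode;
             private Address() {} // default constructor for the DPL
         }

     Diagram: Address (street, city, state, zipCode)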
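  4. 4. Create a persistent Entity object

         import static com.sleepycat.persist.model.Relationship.ONE_TO_ONE;
         import com.sleepycat.persist.model.Entity;
         import com.sleepycat.persist.model.PrimaryKey;
         import com.sleepycat.persist.model.SecondaryKey;

         @Entity
         class Employer {
             @PrimaryKey(sequence="ID") // key assigned from the "ID" sequence
             long id;
             @SecondaryKey(relate=ONE_TO_ONE) // each name maps to one Employer
             String name;
             Address address; // embedded @Persistent object
             private Employer() {}
         }

     Diagram: Employer (id, name, address); Address (street, city, state, zipCode)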
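  5. 5. Relationships between objects

         import static com.sleepycat.persist.model.Relationship.MANY_TO_MANY;
         import static com.sleepycat.persist.model.Relationship.MANY_TO_ONE;
         import com.sleepycat.persist.model.Entity;
         import com.sleepycat.persist.model.PrimaryKey;
         import com.sleepycat.persist.model.SecondaryKey;
         import java.util.Set;

         @Entity
         class Person {
             @PrimaryKey
             String ssn;
             String name;
             Address address;
             // Many children may reference the same parent Person.
             @SecondaryKey(relate=MANY_TO_ONE, relatedEntity=Person.class)
             String parentSsn;
             // A Person may have many employers, and an Employer many Persons.
             @SecondaryKey(relate=MANY_TO_MANY, relatedEntity=Employer.class)
             Set<Long> employerIds;
             Person(String name, String ssn, Address addr) { ... }
             private Person() {} // not on the slide; the DPL needs a default constructor
         }

     Diagram: Person (ssn, name, address, parentSsn, employerIds); Employer (id, name, address); Address (street, city, state, zipCode)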
  6. 6. Quick access to objects as Collections

         EntityStore store = new EntityStore(...);
         PrimaryIndex<String,Person> personBySsn =
             store.getPrimaryIndex(String.class, Person.class);
         Map<String,Person> map = personBySsn.map();

     Map from “ssn” string to “Person” instances:

     ssn | name | address | parentSsn | employerIds
     222-11-9292 | Jon Pont | 21 | 111-11-1111 | { 32 }
     333-99-5534 | Biff Pont | 24 | 443-76-7483 | { 18 }
     666-22-7422 | Jane Doe | 22 | 747-01-1701 | { 32 }
     222-11-9292 | Craig Bon | 23 | 273-01-1132 | { 18, 32 }
  7. 7. Quickly store new object instances

         // Earlier in the code
         storeConfig.setTransactional(true);

         personBySsn.put(new Person("Bob Pont", "111-11-1111", null));

     New “Person” instance added to data store:

     ssn | name | address | parentSsn | employerIds
     222-11-9292 | Jon Pont | 21 | 111-11-1111 | { 32 }
     333-99-5534 | Biff Pont | 24 | 111-11-1111 | { 18 }
     666-22-7422 | Jane Doe | 22 | 747-01-1701 | { 32 }
     222-11-9292 | Craig Bon | 23 | 273-01-1132 | { 18, 32 }
     111-11-1111 | Bob Pont | | | {}
  8. 8. Sub-indexes and cursors

         EntityCursor<Person> children =
             personByParentSsn.subIndex("111-11-1111").entities();
         try {
             for (Person child : children) {
                 System.out.println(child.ssn + ' ' + child.name);
             }
         } finally {
             children.close();
         }

     Children of “Person” with “ssn” 111-11-1111:

     222-11-9292 | Jon Pont | 21 | { 32, 17, 19 }
     333-99-5534 | Biff Pont | 24 | { 18 }
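To make the sub-index idea concrete, here is a minimal plain-JDK model of a primary index plus a MANY_TO_ONE secondary index. It does not use the Berkeley DB JE API: the class `SubIndexSketch`, its `Person` stand-in, and the method `children` are hypothetical names chosen for illustration, and real JE keeps these structures in persistent B-trees rather than in-memory maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-JDK model of a primary index and a MANY_TO_ONE secondary index.
// NOT the Berkeley DB JE API; every name here is hypothetical.
class SubIndexSketch {
    // Minimal stand-in for the slide's Person entity.
    static final class Person {
        final String ssn;
        final String name;
        final String parentSsn; // may be null
        Person(String ssn, String name, String parentSsn) {
            this.ssn = ssn;
            this.name = name;
            this.parentSsn = parentSsn;
        }
    }

    // Primary index: ssn -> Person, kept in key order.
    private final TreeMap<String, Person> bySsn = new TreeMap<>();
    // Secondary index: parentSsn -> (ssn -> Person).
    // Each inner map plays the role of one sub-index.
    private final TreeMap<String, TreeMap<String, Person>> byParentSsn = new TreeMap<>();

    void put(Person p) {
        Person old = bySsn.put(p.ssn, p);
        if (old != null && old.parentSsn != null) {
            // Drop the stale secondary entry when a record is replaced.
            TreeMap<String, Person> sub = byParentSsn.get(old.parentSsn);
            sub.remove(old.ssn);
            if (sub.isEmpty()) {
                byParentSsn.remove(old.parentSsn);
            }
        }
        if (p.parentSsn != null) {
            byParentSsn.computeIfAbsent(p.parentSsn, k -> new TreeMap<>()).put(p.ssn, p);
        }
    }

    // All persons whose secondary key equals parentSsn, in primary-key order.
    List<Person> children(String parentSsn) {
        Map<String, Person> sub = byParentSsn.getOrDefault(parentSsn, new TreeMap<>());
        return new ArrayList<>(sub.values());
    }

    public static void main(String[] args) {
        SubIndexSketch store = new SubIndexSketch();
        store.put(new Person("111-11-1111", "Bob Pont", null));
        store.put(new Person("222-11-9292", "Jon Pont", "111-11-1111"));
        store.put(new Person("333-99-5534", "Biff Pont", "111-11-1111"));
        store.put(new Person("666-22-7422", "Jane Doe", "747-01-1701"));
        for (Person child : store.children("111-11-1111")) {
            System.out.println(child.ssn + ' ' + child.name);
        }
    }
}
```

The point of the model is that a secondary key lookup never scans the primary data: every write keeps the secondary structure in step, so reading one parent's children only touches that parent's sub-index.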
  9. 9. Agenda
     - Code to store, access, and modify objects
     - Overview of Berkeley DB Java Edition
     - Performance versus ORM Solutions
     - Customers of Berkeley DB Java Edition
     - Q&A
  10. 10. Berkeley DB Java Edition vs. RDBMS
      Diagram (labels only): Java Persistence API; JDBC; TCP/IP; TCP/IP; SQL Parser; Query Optimizer; Your Application Code; Direct Persistence Layer; Berkeley DB JE; Storage Manager; Your Application Code
  11. 11. What is Berkeley DB Java Edition?
      - Reliable, scalable, flexible, fast and transactional data management
      - A single, small JAR loaded into the JVM along with your application
      - Enterprise database functionality with:
        - Full transactional access to Java POJO Objects
        - Simple Direct Persistence Layer (DPL) for fast and easy development
        - Ability to use Java Collections as persistent, transactional data
        - Programmatic administration eliminates the need for a DBA
        - Scalable to TB of data, predictable cache size in memory
        - Highly concurrent design, hundreds of threads concurrently transacting
        - Open source, dual license
      - Simple yet sophisticated data management hidden within an application
      - A database engine for your specialized data management needs.
  12. 12. Agenda
      - Code to store, access, and modify objects
      - Overview of Berkeley DB Java Edition
      - Performance versus ORM Solutions
      - Customers of Berkeley DB Java Edition
      - Q&A
  13. 13. JE Compared to Apache Derby
      - Derby is a good Java RDBMS/SQL engine
      - But sometimes SQL and ORM are overhead

      Operation | JE | Derby | % | JE with DPL | Derby with Hibernate | %
      Random Create | 9,082 | 3,571 | 154% | 7,773 | 1,906 | 308%
      Random Read | 101,626 | 13,925 | 630% | 47,824 | 6,094 | 685%
      Random Update | 33,692 | 8,694 | 288% | 18,850 | 4,824 | 291%
      Random Delete | 16,652 | 5,621 | 196% | 16,155 | 3,962 | 308%
      Serial Scan | 131,319 | 114,351 | 15% | 55,617 | 7,612 | 631%
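The slide does not define its % columns or state the unit of the raw figures. The percentages are, however, consistent with the relative gain (JE − Derby) / Derby, rounded to the nearest whole percent. The short check below recomputes three of them; `SpeedupCheck` and `percentFaster` are illustrative names, not part of any library.

```java
// Recomputes the "%" column of the JE vs. Derby comparison,
// assuming it means (JE - Derby) / Derby expressed as a percentage.
class SpeedupCheck {
    // Percentage by which `je` exceeds `other`, rounded to the nearest integer.
    static long percentFaster(long je, long other) {
        return Math.round(100.0 * (je - other) / other);
    }

    public static void main(String[] args) {
        System.out.println(percentFaster(9_082, 3_571));     // Random Create: 154
        System.out.println(percentFaster(101_626, 13_925));  // Random Read: 630
        System.out.println(percentFaster(131_319, 114_351)); // Serial Scan: 15
    }
}
```

Read this way, 630% on Random Read means JE handled roughly 7.3 times the Derby figure, while the 15% on Serial Scan is a much narrower gap.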
  14. 14. The Performance Gap between RDBMS and JE
      - Functionally similar, fundamentally different
      Diagram (labels only): Java Persistence API; JDBC; TCP/IP; TCP/IP; SQL Parser; Query Optimizer; Your Application Code; Direct Persistence Layer; Berkeley DB JE; Storage Manager; Your Application Code
  15. 15. Agenda
      - Code to store, access, and modify objects
      - Overview of Berkeley DB Java Edition
      - Performance versus ORM Solutions
      - Customers of Berkeley DB Java Edition
      - Q&A
  16. 16. Internet Archive – Heritrix and the WayBack Machine
      - Internet Archive (www.archive.org)
        - Library that stores historical copies of the entire Web
        - Over 70 billion URLs and growing
        - Petabytes of data
      - Berkeley DB Java Edition is the repository for Heritrix, the web crawler for the Internet Archive; soon it will manage the index data as well
        - Very large crawls (tens of millions of URLs in queue at a given time) bounded by RAM
        - Berkeley DB Java Edition used to persist data to disk with minimal performance penalty
      - Why Berkeley DB Java Edition?
        - Very high scalability
        - Very high performance
        - Simple Java Collections API
        - Pure Java
      “The Oracle Berkeley DB Java Edition database allows us to crawl and rank all those URLs efficiently. Because the crawler is written entirely in Java, we needed a highly scalable, very fast, 100% pure Java database engine.” Gordon Mohr, Chief Architect, Internet Archive
  17. 17. Sun - OpenDS
      - Open Source
      - Pure Java LDAP/JNDI
      - Sun’s existing non-Java LDAP server uses Berkeley DB (ANSI C)
      - OpenDS is as fast, and faster in some cases
      - JE’s highly concurrent design is exploited by Java 6 and the Niagara line of servers
  18. 18. Cisco License Manager
      - Business critical data
      - Accurate, fast and available at all times
      - A software server requiring fast local persistence
      - Pure Java database
      - Simple schema
      - Queries known in advance, no dynamic requirement
      “Cisco License Manager (CLM) is a secure client/server application that manages Cisco IOS Software activation and licenses for Cisco network devices.” Cisco.com FAQ on the CLM product
  19. 19. Oracle Coherence
      - Coherence & Berkeley DB JE
      - Fast J2EE data grid solution
      - JE manages the grid’s node-local information when data within the data grid exceeds memory
      - Rebuilding the data grid’s distributed information wastes time and resources
      - JE is the consistent, local and recoverable database maintaining the grid’s distributed information
  20. 20. Amazon.com’s Carbonado Framework
      - Database neutral
      - Allows migration
      - Support for
        - Berkeley DB
        - Berkeley DB JE
        - many different RDBMS
      - Allows for dynamic queries, joins, indexes
      - JE is the default storage engine
      - Open Source
  21. 21. TIBCO BusinessEvents Uses Berkeley DB Java Edition
      - TIBCO BusinessEvents
        - Monitors disparate systems for interesting activity
        - Correlates business and IT events based on rules
      - Berkeley DB Java Edition stores:
        - Rules identifying “interesting activity” and “business events”
        - Event descriptions
        - Log for audit and reporting
        - System state saved every 20-30 seconds
      - Why Berkeley DB Java Edition?
        - High throughput and reliability
        - Scalability
        - Integration with Java runtime
      “Berkeley DB Java Edition is the optimal choice for an internal database for BusinessEvents, because it provides fast, transactional persistence in a pure Java package.” Matt Quinn, VP Product Strategy, TIBCO
  22. 22. General Dynamics Command Post of the Future (CPOF)
      - CoMotion Visualization Software – collaborative visualization software for decision-making
      - Applications:
        - Battlefield decision support in Iraq
        - Supply-chain visibility
        - Patient records
        - Financial monitoring
      - Berkeley DB Java Edition provides the repository for CoMotion
      - Why Berkeley DB Java Edition?
        - High performance
        - Reliability
        - Lightweight and pure Java
  23. 23. For More Information http://search.oracle.com or http://www.oracle.com/database/berkeley-db Berkeley DB
  24. 24. The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remain at the sole discretion of Oracle.
  25. 25. Q & A
  26. 27. Fundamental Concepts
      - Fast indexed or sequential lookup
        - API is get/put, or through cursors
        - DPL and Collections APIs for easy access
      - Concurrent threaded access to data
      - Transactions
        - ACID – Atomic, Consistent, Isolated, Durable
        - Recovery after failure
      Conceptual View of Data: data is organized in memory as a B-tree and stored on disk as a set of transactional log files.
      Diagram (labels only): key/data pairs; Cleaner Thread; log files 000001.jdb, 000002.jdb, 000003.jdb, 000004.jdb
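The two access styles named on this slide, keyed get/put and ordered cursor scans, can be modeled with a sorted map from the JDK. This is only a sketch of the access pattern: `KeyValueSketch` and its methods are hypothetical, `TreeMap` is a red-black tree rather than a B-tree, and nothing here is persistent or transactional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-JDK stand-in for an ordered key/data store.
// NOT the Berkeley DB JE API; names are hypothetical.
class KeyValueSketch {
    private final TreeMap<String, String> tree = new TreeMap<>();

    // Indexed access: one key, one record.
    void put(String key, String data) {
        tree.put(key, data);
    }

    String get(String key) {
        return tree.get(key);
    }

    // Cursor-style sequential access: every record with from <= key < to,
    // visited in key order.
    List<String> scan(String from, String to) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : tree.subMap(from, to).entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        KeyValueSketch db = new KeyValueSketch();
        db.put("666-22-7422", "Jane Doe");
        db.put("222-11-9292", "Jon Pont");
        db.put("333-99-5534", "Biff Pont");
        System.out.println(db.get("333-99-5534")); // Biff Pont
        System.out.println(db.scan("2", "4"));     // the 222... and 333... records, in key order
    }
}
```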
  27. 28. Log-based Storage on Disk
      - Data is written only once (even updates)
        - Writes are sequential, not random
        - Disk head generally stays on same track
        - Inserts and updates are fast
      - Data experiences temporal (not spatial) locality
        - Assumption: working set fits in memory
        - Dynamic re-clustering
        - No random I/O on checkpoints
      - Log and data (the “material DB”) are the same
      - Backup and restore are simplified
      - Cleaner reclaims unused disk space
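The append-only idea on this slide can be sketched in a few lines: every write, including an update, is appended to the end of the log, an in-memory index points at the live version of each key, and a cleaner pass later drops the versions nothing points at. This is a deliberately simplified model, not JE's implementation: `AppendOnlyLogSketch` and its methods are hypothetical, the log is an in-memory list rather than `.jdb` files, and the cleaner here rewrites the whole log in one pass.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified in-memory model of log-structured storage.
// NOT Berkeley DB JE's implementation; names are hypothetical.
class AppendOnlyLogSketch {
    static final class Entry {
        final String key;
        final String data;
        Entry(String key, String data) {
            this.key = key;
            this.data = data;
        }
    }

    // Stands in for the on-disk log files.
    private final List<Entry> log = new ArrayList<>();
    // key -> position of the live (most recent) version in the log.
    private final Map<String, Integer> index = new HashMap<>();

    void put(String key, String data) {
        log.add(new Entry(key, data));  // always append, even for updates
        index.put(key, log.size() - 1); // any older version is now obsolete
    }

    String get(String key) {
        Integer pos = index.get(key);
        return pos == null ? null : log.get(pos).data;
    }

    int logSize() {
        return log.size();
    }

    int obsoleteEntries() {
        return log.size() - index.size();
    }

    // Cleaner: keep only the entries the index still points at.
    void clean() {
        List<Entry> live = new ArrayList<>();
        Map<String, Integer> newIndex = new HashMap<>();
        for (int i = 0; i < log.size(); i++) {
            Entry e = log.get(i);
            if (index.get(e.key) == i) {
                newIndex.put(e.key, live.size());
                live.add(e);
            }
        }
        log.clear();
        log.addAll(live);
        index.clear();
        index.putAll(newIndex);
    }

    public static void main(String[] args) {
        AppendOnlyLogSketch db = new AppendOnlyLogSketch();
        db.put("a", "1");
        db.put("b", "2");
        db.put("a", "3"); // update: appended, the first "a" becomes obsolete
        System.out.println(db.get("a") + " " + db.logSize() + " " + db.obsoleteEntries()); // 3 3 1
        db.clean();
        System.out.println(db.get("a") + " " + db.logSize() + " " + db.obsoleteEntries()); // 3 2 0
    }
}
```

The trade-off the model makes visible is that writes never seek back into old data, at the price of obsolete versions accumulating until a cleaner reclaims them.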
  28. 29. Features of Berkeley DB Java Edition
      - Full transactional semantics
        - Transactions may be disabled
        - Multiple durability options
        - All ANSI isolation levels supported
      - Record level locking
        - Locking may be disabled
      - No IPC required for data access
      - High-concurrency BTREE storage
      - Optimized for write performance
      - Architecture neutral on-disk format
      - No SQL, JDBC – no query processing overhead
  29. 30. Highly Concurrent, by Design
      - Record-level locking
        - Finer-grained than page-level locking
      - Logical locking on key/data pairs
        - Provide ACID properties and cursor stability
        - User-visible effects
      - Non-transactional latching on internal nodes
        - Short-term latches improve concurrency
        - Invisible to user; never held across API calls
      - Recoverability and transactional semantics on user data are guaranteed in all cases
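Why record-level locking helps concurrency can be shown with a small model in which each key has its own lock, so threads updating different records never wait on one another. This is an assumption-laden sketch, not JE's lock manager: `RecordLockSketch`, `update`, and `runWorkers` are hypothetical names, and there are no transactions, deadlock handling, or latches here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.UnaryOperator;

// Model of per-record locking using one lock per key.
// NOT Berkeley DB JE's lock manager; names are hypothetical.
class RecordLockSketch {
    private final Map<String, Integer> data = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Read-modify-write one record while holding only that record's lock.
    void update(String key, UnaryOperator<Integer> change) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            data.put(key, change.apply(data.getOrDefault(key, 0)));
        } finally {
            lock.unlock();
        }
    }

    int get(String key) {
        return data.getOrDefault(key, 0);
    }

    // Starts `threads` workers; even-numbered ones increment "a", odd ones "b".
    static RecordLockSketch runWorkers(int threads, int perThread) {
        RecordLockSketch store = new RecordLockSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            String key = (t % 2 == 0) ? "a" : "b";
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    store.update(key, v -> v + 1);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return store;
    }

    public static void main(String[] args) {
        RecordLockSketch store = runWorkers(4, 10_000);
        System.out.println(store.get("a") + " " + store.get("b")); // 20000 20000
    }
}
```

With a single coarse lock (the analogue of page-level locking when both records share a page), the workers on "a" and "b" would serialize; with one lock per record only writers of the same key contend.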
  30. 31. Administration
      - Command line or application-embedded
      - All the utilities you expect:
        - Dump, load, verify, salvage
        - Recover
        - Checkpoint and log file archival and removal
        - Hot backup
        - Deadlock detection
        - Statistics
  31. 32. Berkeley DB Java Edition: the right answer when
      - Persistent Java Objects are not accessed outside of the ORM
      - Queries are known in advance
      - Performance matters
      - Queries are (relatively) simple
      - The system must run without administrator support
      - Build once, deploy with confidence 1000s of times
      - Speed, scale and concurrency are important
