
Datalevin London-meetup2020

Datalevin is a simple, fast and free Datalog database for building stateful applications. This talk introduces Datalevin: its motivation, design, implementation and benchmark results.

  1. Datalevin: A simple, fast and free Datalog database for everyone
     Huahai Yang, Ph.D., Juji, Inc.
     September 22, 2020
  2. Background & Motivation
     • Juji is a conversational AI company
     • Conversational data query (NLDB)
       • Upload a CSV file, then query it
       • Natural language => database query
       • Context sensitive
     • My previous research in NLDB convinced me:
       • NLDB is more of a DB problem than an NL problem
       • Data themselves provide the best context
       • A better DB is the key
  3. Database Design Goals
     • Datalog is the best target query language for NLDB (see the example query below)
       • Declarative
       • Composable
       • Amenable to code generation
     • In-process embedded use
     • Bulk writes, frequent reads
     • Multiple DB paradigms
     • Transparent data replication
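For readers unfamiliar with Datalog, here is a generic Datomic/Datascript-style query; the attributes are made up for illustration and do not come from the talk.

```clojure
;; A generic Datomic/Datascript-style Datalog query, shown only to
;; illustrate why Datalog is attractive as a code-generation target.
;; The attributes (:person/age, :person/name) are made up.
'[:find ?name
  :in $ ?min-age
  :where
  [?e :person/age ?age]
  [(>= ?age ?min-age)]     ; each :where clause composes independently
  [?e :person/name ?name]]
```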
  4. Datalevin Design Principle - Simplicity
     • Simple to use
       • Just a library: add it to deps and start coding (see the sketch below)
       • Simply require a different namespace to get a different DB paradigm
         • Current: Key-value, Datalog
         • Future: Graph, Document
     • Simple to operate
       • No need for complex ops: setup, backup and recovery should be dead simple
       • No DB maintenance threads or processes
       • No need for performance tuning
     • Simple to scale
       • Just provision more physical resources
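As a rough illustration of "just a library" use, a hypothetical setup; the dependency coordinate and namespace names are assumptions rather than taken from the slides, so check the Datalevin README for the real ones.

```clojure
;; Hypothetical embedded-use setup; the dependency coordinate and the
;; namespace name are assumptions, not from the talk.
;;
;; deps.edn:
;;   {:deps {datalevin/datalevin {:mvn/version "..."}}}

(require '[datalevin.core :as d])   ; Datalog paradigm
;; requiring a different namespace would expose the key-value paradigm
```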
  5. Why fork Datascript?
     • Datascript is a great baseline Datalog implementation
       • Comprehensive test coverage
       • Well maintained code base
       • Similar API to Datomic
       • Lots of users
     • We have very different goals from the alternatives
       • No interest in building a Datomic clone
       • Focus on query performance
     • We have plans to go far beyond NLDB
       • Juji slogan for AI: "Symbolic as the bones, machine learning as the flesh"
       • A high performance graph database is the basis of the symbolic AI of the future
  6. Roles of Database
     • Operational
       • Database as the surrogate of the external world
       • ACID is derived from this use: to maintain the illusion of the external world
       • Primary, necessary for most use cases
       • Focus on the present, OLTP
     • Archival
       • Database as a recording of events and facts
       • Doesn't need ACID, eventual consistency is fine
       • Secondary, necessary for many use cases, but not all
       • Focus on provenance and history, OLAP
  7. Merging operational and archival DB is hard
     • More stringent performance requirements
       • History has more data than the present
     • More complex APIs
       • Need to deal with history
       • Need to distinguish history from the present
     • More complex user mental model
       • More things to consider -> less simple
       • The mind needs to forget to work properly
       • Hyperthymesia is a painful condition
  8. Operational DB should be stateful
     • In people's minds, the external world is stateful
     • A wrong assumption about the time model is one of the main sources of programming errors with immutable DBs
       • "Why do I get the wrong data with this query?"
       • "I have to sort by transaction id to get the latest version?"
     • Datalevin is an operational database
       • Meant to be embedded in applications to manage state
  9. Datalevin Architecture
     • LMDB key-value store as the storage
     • Optimized Clojure API for LMDB
     • EAV index on top of key-value
     • User-facing API on top
       • Key-value
       • EAV index access
       • Datalog
     [Diagram: Key-value API, Index Access API and Datalog API layered on top of EAV Index Processing and LMDB Key Value Processing]
  10. LMDB Features
     • Lightning Memory-Mapped DB
     • ACID key-value database
     • The DB is a memory-mapped file
       • Uses the OS filesystem cache
     • B+ tree, optimized for reads
       • The fastest key-value store for reads
       • Performs well when writing large values (>2KB)
     • Works on bytes, supports range queries
     • Supports multiple independent tables (DBIs)
  11. LMDB Design
     • Read and write transactions
       • Single writer
       • Many concurrent readers (MVCC)
       • No locks on reads
       • Scales linearly with reader threads
     • Copy on write
       • Similar to immutable data structures
       • Obsolete pages are reclaimed
     • Reads and writes do not block each other
  12. Datalevin Optimizations
     • Read transaction pool
       • Avoids the cost of allocating read transactions
     • Pre-allocated off-heap buffers in the JVM (see the sketch below)
       • Write buffer (one per DBI)
       • Read buffer
       • Range query start and end buffers
     • Auto-resizing value buffers
       • Re-allocated on overflow
     • Auto-resizing DB size
       • LMDB requires the total DB size to be specified
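A minimal sketch of the pre-allocated, auto-resizing buffer idea, assuming plain java.nio direct buffers; this is not Datalevin's actual code.

```clojure
;; Minimal sketch of a pre-allocated, auto-resizing off-heap buffer,
;; using plain java.nio direct buffers; not Datalevin's actual code.
(import '(java.nio ByteBuffer BufferOverflowException))

(defonce write-buffer (atom (ByteBuffer/allocateDirect 16384)))

(defn put-bytes!
  "Write bs into the shared off-heap buffer, doubling capacity on overflow.
   Note: this sketch discards any bytes already in the old buffer."
  [^bytes bs]
  (let [^ByteBuffer buf @write-buffer]
    (try
      (.put buf bs)
      (catch BufferOverflowException _
        (let [bigger (ByteBuffer/allocateDirect
                      (* 2 (max (.capacity buf) (alength bs))))]
          (reset! write-buffer bigger)
          (.put bigger bs))))))
```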
  13. Datalevin Key-Value API (usage sketch below)
     • Open/close LMDB
     • Open/clear/drop DBI
     • Transact key-values as a batch
       • :put, :del
     • Fetch a single value
       • get-value, get-first
     • Range query
       • get-range
     • Predicate filtering
       • get-some, range-filter
     • Counts
       • entries, range-count, range-filter-count
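A rough usage sketch assembled from the function names on this slide; the namespace, exact signatures and range-spec syntax are assumptions, not the documented API.

```clojure
;; Rough usage sketch built from the function names on this slide;
;; namespace, signatures and range specs are assumptions.
(require '[datalevin.core :as d])

(def lmdb (d/open-kv "/tmp/datalevin-kv-demo"))   ; open an LMDB environment
(d/open-dbi lmdb "sessions")                      ; open a sub-database (DBI)

;; transact a batch of key-values
(d/transact-kv lmdb [[:put "sessions" "user-1" {:last-seen 1600000000}]
                     [:del "sessions" "user-2"]])

(d/get-value lmdb "sessions" "user-1")            ; fetch a single value
(d/get-range lmdb "sessions" [:all])              ; range query over the DBI
(d/entries   lmdb "sessions")                     ; number of entries
(d/close-kv  lmdb)
```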
  14. EAV Index Processing
     • Entity-Attribute-Value data model (illustration below)
       • Versatile
         • Relational model: entity = tuple, attribute = column, value = value
         • Graph model: entity = node, attribute = edge, value = node (ref)
         • RDF triple: entity = subject, attribute = predicate, value = object
       • The triple is called a "datom"
     • Covering indices
       • EAV: row-oriented index, all datoms
       • AEV: column-oriented index, all datoms
       • AVE: supports attribute range queries, all datoms
       • VAE: graph reverse index, only for reference-type datoms
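To make the data model concrete, a few made-up datoms and the view each modeling style takes of them; the attributes are illustrative only.

```clojure
;; Made-up datoms illustrating the EAV data model.
(def datoms
  [[1 :person/name   "Ada"]    ; relational view: row 1, column :person/name
   [1 :person/age    36]
   [1 :person/friend 2]        ; graph view: edge :person/friend from node 1 to node 2
   [2 :person/name   "Grace"]])

;; RDF view of the first datom: subject 1, predicate :person/name, object "Ada".
;; The VAE index answers reverse-reference questions such as
;; "which entities point at entity 2 via some attribute?"
```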
  15. Index Storage
     • In-memory indices as a cache
       • Inherits Datascript's persistent sorted sets
     • On-disk indices as permanent storage
       • Datoms binary-encoded into key-values
       • LMDB's key size is fixed at compile time, default: 511 bytes
       • Each index is stored in its own DBI
     • Key (up to 511 bytes)
       • Small value: encoded datom
       • Large value: encoded datom with (truncated value + hash) to support range queries
     • Value (8 bytes)
       • Small: a sentinel long, indicating a small value
       • Large: a long reference to the key of the full datom in the "giant" DBI
  16. Datom Index Disk Format
     • Attribute id (aid): binary-encoded 32-bit integer
     • Entity id (eid): binary-encoded 64-bit long
     • Value:
       • Data type header byte, using bytes disallowed in UTF-8
         • Data types: int, long, id, boolean, float, double, byte, bytes, keyword, symbol, instant, uuid
       • Potentially truncated prefix bytes of the value
         • Each value data type is encoded differently to ensure: bitwise order = value order (see the sketch below)
       • If truncated, a truncator byte
       • If truncated, a 32-bit Clojure hash of the value
     • A separator byte
     [Diagram: 511-byte key layout with aid, eid, header, value, truncator, separator and hash fields]
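As one concrete illustration of "bitwise order = value order", a standard trick for signed longs is to flip the sign bit before writing big-endian bytes; this is a generic sketch, not necessarily Datalevin's exact encoding.

```clojure
;; Generic order-preserving encoding of a signed long: flip the sign bit,
;; then write big-endian, so unsigned lexicographic byte comparison
;; matches numeric order. Not necessarily Datalevin's exact encoding.
(import '(java.nio ByteBuffer))

(defn long->key-bytes ^bytes [^long x]
  (let [buf (ByteBuffer/allocate 8)]
    (.putLong buf (bit-xor x Long/MIN_VALUE))  ; flip the sign bit
    (.array buf)))

;; e.g. -5 < 3, and (long->key-bytes -5) sorts before (long->key-bytes 3)
;; under unsigned lexicographic byte comparison.
```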
  17. More Disk Storage Details
     • Giants
       • For large values, the full datoms are stored in a "giant" DBI
       • Append-only, fast writes
       • Key: an auto-incremented long (gid)
       • Value: the serialized full datom
     • Schema
       • Stored in a schema DBI
       • Key: attribute name
       • Value: a serialized Clojure map of attribute properties
     • TODO: non-trivial schema migration
  18. Datalog Query
     • Retains most of Datascript's query logic
       • Searches on-disk indices instead of an in-memory cache
       • Leverages indices that Datascript does not enable: AVET and VAET
       • Adopted a few performance optimization PRs that Datascript did not merge
     • Caches all on-disk index access API call results in an LRU cache (sketch below)
       • Main reason for the speed advantage shown in the query benchmarks
     • TODO: move to a more performant query engine
       • The Datascript query engine does hash joins on returned full datoms
       • Nested maps should do less work and be more performant
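A minimal sketch of caching index-access results, assuming a hypothetical fetch-datoms function that reads an index range from LMDB; this is not Datalevin's actual implementation, and the eviction here is crude rather than true LRU.

```clojure
;; Minimal sketch of caching on-disk index access results; `fetch-datoms`
;; is a hypothetical function that reads an index range from LMDB.
(def ^:private cache-capacity 1024)

(defonce ^:private index-cache (atom {}))   ; {access-key -> datoms}

(defn cached-fetch [fetch-datoms access-key]
  (or (get @index-cache access-key)
      (let [res (fetch-datoms access-key)]
        (swap! index-cache
               (fn [c]
                 (cond-> c
                   (>= (count c) cache-capacity) (dissoc (first (keys c)))
                   true                          (assoc access-key res))))
        res)))

;; Cleared after every write transaction so readers never see stale
;; index data (see the next slide).
(defn clear-index-cache! []
  (reset! index-cache {}))
```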
  19. Datalog Transaction
     • Retains Datascript's transaction logic
     • Reads during a transaction: first search the in-memory cache, then search on disk
     • Transact to the in-memory cache
       • Identical to Datascript
       • Cache content is lost when the DB restarts
     • Transact to disk storage
       • Collect the transacted datoms, commit them as a batch
       • Sync to disk after each transaction
     • Clear the on-disk index access cache after a transaction
  20. Status
     • The Index Access API is identical to Datascript's
     • Features missing compared with Datascript
       • Composite tuples (TODO)
       • Persisted transaction functions (TODO)
       • Features that mainly make sense for an in-memory DB (maybe)
         • DB serialization
         • DB pretty print
  21. Benchmark: Write
     • 100K entities of random people information
     • Bulk loading of datoms is fast
     • Bulk transactions are fast too
     • Transacting a small number of datoms at a time is slow
     • Advice: batch as much data as possible into a transaction (see the sketch below)
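A sketch of the batching advice, assuming a Datascript-compatible transact! and an already-opened connection conn; the data and batch size are made up.

```clojure
;; Batching sketch; `conn` is an already-opened Datalevin connection,
;; and the data and batch size are made up.
(require '[datalevin.core :as d])

(def people
  (for [i (range 100000)]
    {:person/name (str "name-" i)
     :person/age  (rand-int 100)}))

;; Slow: one transaction per entity
;; (doseq [p people] (d/transact! conn [p]))

;; Much faster: large batches per transaction
(doseq [batch (partition-all 10000 people)]
  (d/transact! conn (vec batch)))
```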
  22. Benchmark: Read
     • Datalevin is faster than Datascript across the board for all tested Datalog queries
  23. Benchmark: Multi-threaded Read
     • Does LMDB's claim of linear scaling with reader threads hold? Yes
     • Does Datalevin keep the same scaling? Yes (setup sketch below)
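A minimal sketch of a multi-threaded read setup: the same Datalog query issued from n threads against one connection; conn and the query are placeholders, not the benchmark's actual code.

```clojure
;; Multi-threaded read sketch; `conn` and the query are placeholders.
(require '[datalevin.core :as d])

(def query '[:find (count ?e) :where [?e :person/name]])

(defn run-readers [n]
  (->> (repeatedly n #(future (d/q query (d/db conn))))
       doall          ; start all reader threads before dereferencing
       (mapv deref)))
```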
  24. Roadmap
     • 0.4.0 Distributed mode with Raft-based replication
     • 0.5.0 New Datalog query engine with an optimizer
     • 0.6.0 Automatic schema migration
     • 0.7.0 Datalog query parity with Datascript
     • 0.8.0 Implement Loom graph protocols
     • 0.9.0 Auto indexing of document fields
     • 1.0.0 Materialized views and incremental maintenance
  25. Thank you! Questions?
     Huahai Yang
     https://github.com/huahaiy
     @huahaiy
     https://juji.io
