Datomic – A Modern Database - StampedeCon 2014

At StampedeCon 2014, Alex Miller (Cognitect) presented "Datomic – A Modern Database."

Datomic is a distributed database designed to run on next-generation cloud architectures. Datomic stores facts and retractions using a flexible schema, consistent transactions, and a logic-based query language. The focus on facts over time gives you the ability to look at the state of the database at any point in time and traverse your transactional data in many ways.

We’ll take a tour of the Datomic data model, transactions, query language, and architecture to highlight some of the unique attributes of Datomic and why it is an ideal modern database.

Datomic – A Modern Database - StampedeCon 2014 Presentation Transcript

  • Datomic A Modern Database Alex Miller
  • Overview • Rationale • Data model - facts • Schema - entities, attributes • Transactions - assert, retract, excise • Architecture - peer, transactor, storage • Queries - datalog, rules
  • Rationale
  • A database that reconsiders... • Immutable data model • Flexible, extensible schemas • Importance of time in our data • Transactions and queries as data • Deployment for the cloud • Storage as a service
  • Data Model
  • Fact • One piece of information • About one thing • At a specific point in time • An immutable value
  • Fact = Datom
  • Fact = Datom
      Entity  Attribute  Value     Tx    Op
      21005   :name      "Stuart"  1000  assert
      21005   :likes     tea       1000  assert
      21005   :likes     coffee    1022  assert
      21005   :likes     tea       1022  retract
  • Datoms are Values
      Entity  Attribute  Value     Tx    Op
      21005   :name      "Stuart"  1000  assert
      21005   :likes     tea       1000  assert
      21005   :likes     coffee    1022  assert
      21005   :likes     tea       1022  retract
  • Datoms are Efficient
      Entity  Attribute  Value     Tx    Op
      21005   43         "Stuart"  1000  true
      21005   75         3299      1000  true
      21005   75         1730      1022  true
      21005   75         3299      1022  false
  • Database • A collection of facts • At a specific point in time • An immutable value
  • Database is a Value
      Entity  Attribute  Value     Tx    Op
      21005   :name      "Stuart"  1000  assert
      21005   :likes     tea       1000  assert
      21005   :likes     coffee    1022  assert
      21005   :likes     tea       1022  retract
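The datom table above can be read as an append-only log that folds into the set of facts currently true. The following is a minimal sketch in plain Python, not the Datomic API: the `Datom` tuple, the `LOG` list, and `current_facts` are hypothetical names chosen for illustration, and values such as tea are written as strings for readability.

```python
from collections import namedtuple

# Hypothetical model of a datom; field names mirror the slide's columns.
Datom = namedtuple("Datom", ["e", "a", "v", "tx", "added"])

# The four datoms from the slide. Nothing is ever updated in place:
# the retraction of tea is a new datom, not a deletion.
LOG = [
    Datom(21005, ":name", "Stuart", 1000, True),
    Datom(21005, ":likes", "tea", 1000, True),
    Datom(21005, ":likes", "coffee", 1022, True),
    Datom(21005, ":likes", "tea", 1022, False),
]

def current_facts(log):
    """Fold assertions and retractions, in tx order, into the facts that hold now."""
    facts = set()
    for d in sorted(log, key=lambda d: d.tx):
        if d.added:
            facts.add((d.e, d.a, d.v))
        else:
            facts.discard((d.e, d.a, d.v))
    return facts

print(sorted(current_facts(LOG)))
# [(21005, ':likes', 'coffee'), (21005, ':name', 'Stuart')]
```

The log keeps all four datoms, so the earlier "likes tea" fact is still recoverable even though it is no longer part of the current state.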
  • Schema
  • Entities + Attributes • Entities - identity (no "type") • Similar to: nodes in a graph, rows in rdbms • Attributes • Similar to: edges in a graph, cols in rdbms • Relation from entities to values, or • Relation from entities to entities
  • Value Attributes • Entity “Alex Miller” has attribute :handle with value “puredanger”
  • Ref Attributes • Entity “Alex Miller” has attribute :follows pointing to entity “Mario Aquino”
  • Attribute Definition • Entities, just like your data • Required attribute attributes: • ident (keyword) - attribute identifier • valueType (keyword) - string, boolean, long, bigint, float, double, bigdec, instant, uuid, uri, bytes, ref • cardinality (keyword) - one or many
  • Optional Attributes • Optional attribute attributes: • doc (string) - documentation string • unique (keyword) - value or identity • index (boolean) - index this attribute • fulltext (boolean) - searchable • isComponent (boolean) - composite values • noHistory (boolean) - whether to retain history (rarely used, for things like counters)
  • Attribute Definition
      {:db/id #db/id[:db.part/db]
       :db/ident :comment/body
       :db/valueType :db.type/string
       :db/cardinality :db.cardinality/one
       :db.install/_attribute :db.part/db}
  • ERD Modeling (diagram) • Quote: user, text, timestamp • User: handle, email, first-name, last-name, follows • Legend: entity type, attribute, relationship attribute
  • Alternative Views
      Structure  View
      row        datoms sharing a common E
      column     datoms sharing a common A
      document   traversal of attributes
      graph      traversal of reference attributes
  • Transactions
  • Transactions • Set of facts (assertions or retractions) • Applied at a point in time
  • Transactions as Data • Defined as data (not INSERT/UPDATE/DELETE strings) • Transactions are *also* entities! • You can query them - for when they happened and what datoms they include • And add new facts about them!
  • Datoms in a Transaction
      ;; E     A      V        Tx   op
      [21005 :name  "Stuart" 1000 true]
      [21005 :likes 3299     1000 true]
      [21005 :likes 1730     1022 true]
      [21005 :likes 3299     1022 false]
  • Integrity • ACID transactions • Equivalent to isolation level SERIALIZABLE
  • Constraints • Schema constraints enforced on attributes • Transaction functions • From old db value to new db value • Enforce arbitrary constraints • Can reject transactions
  • Uses for Tx Fns • Atomic update • Maintaining integrity constraints (composite keys) • Strict validation • Constructing entities • Annotating transactions
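The idea of a transaction function (take the old database value, return what to add, or reject) can be sketched in plain Python. This is an illustrative model under stated assumptions, not the Datomic API: the database is a simple dict of `(entity, attribute) -> value`, and `increment`, `transact`, and `:account/balance` are hypothetical names.

```python
def increment(db, entity, attr, amount):
    """Hypothetical tx function: reads the old db value and returns facts to assert.

    Raising an exception models rejecting the whole transaction.
    """
    old = db.get((entity, attr), 0)
    new = old + amount
    if new < 0:
        raise ValueError("balance may not go negative")
    return [(entity, attr, new)]

def transact(db, tx_fn, *args):
    """Apply a tx function and return a NEW db value; the old value is untouched."""
    new_db = dict(db)
    for e, a, v in tx_fn(db, *args):
        new_db[(e, a)] = v
    return new_db

db0 = {(1, ":account/balance"): 100}
db1 = transact(db0, increment, 1, ":account/balance", -30)
print(db0[(1, ":account/balance")], db1[(1, ":account/balance")])
# 100 70
```

Because the function runs against a single old value and transactions are applied one at a time, the read-modify-write is atomic in this model; a call that would drive the balance negative raises and produces no new database value.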
  • Architecture
  • Monolithic server (diagram) • App process sends strings (DDL + DML) and receives result sets • Server handles indexing, transactions, query, cache, and storage I/O
  • Peer, transactor, storage (diagram) • App process embeds the Peer Lib: query, cache, live index • Transactor: transactions, indexing • Storage Service: data segments • Data, rather than strings, moves between the components
  • Datomic Components • Peer Library • Transactor • Storage Service
  • Peer Library • Included in your app • Executes queries locally
  • Peer Library • Reads data from storage • Caches locally
  • Scale Horizontally • Each application instance embeds its own Peer Library and cache
  • Transactor • Standalone system • Scales vertically
  • Transactor • Standalone system • Hot standby for failover
  • Transactor • Coordinates writes • Guarantees ACID transactions
  • Transactor • Writes transaction log to storage • Generates indexes
  • Transactor • Broadcasts live updates to every Peer Library
  • Storage • Provided as a service • Many different back-ends
  • Local Storage • Memory • Filesystem
  • Nearby Storage • SQL database (PostgreSQL, Oracle, …)
  • Distributed Storage • DynamoDB • Riak • Couchbase • Infinispan • Cassandra
  • Datomic Components • Multiple applications, each with a Peer Library • Transactor • Storage Service
  • ACID Writes • Peers send transactions to the Transactor • Transactor writes transactions and indexes to storage
  • Live Updates • Transactor sends live updates to the peers
  • Distributed Reads • Peers read directly from the Storage Service
  • Local Caches • Each peer caches what it reads
  • Shared Memcached • Peers can share a Memcached layer for reads, in addition to their local caches
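The read path described above (peers read from storage and cache locally) can be sketched as a read-through cache. This is a toy model in plain Python, not Datomic code: the `Peer` class, the `storage` dict, and the segment keys are hypothetical, and the sketch assumes stored segments are immutable, which is what lets a peer cache them without any invalidation logic.

```python
class Peer:
    """Hypothetical peer: reads immutable segments through a local cache."""

    def __init__(self, storage):
        self.storage = storage      # stands in for the storage service
        self.cache = {}             # local cache, private to this peer
        self.storage_reads = 0      # counts trips to storage

    def segment(self, key):
        # Only go to storage on a cache miss; a cached segment never goes stale
        # because segments are assumed to be immutable.
        if key not in self.cache:
            self.storage_reads += 1
            self.cache[key] = self.storage[key]
        return self.cache[key]

storage = {"seg-a": ("datom-1", "datom-2"), "seg-b": ("datom-3",)}
peer_1, peer_2 = Peer(storage), Peer(storage)

peer_1.segment("seg-a")
peer_1.segment("seg-a")   # served from peer_1's cache
peer_2.segment("seg-b")   # an independent peer with its own cache

print(peer_1.storage_reads, peer_2.storage_reads)
# 1 1
```

Adding more peers adds more caches and more query capacity without involving the transactor, which is the horizontal read scaling the slides describe.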
  • Queries
  • Queries as Data • Datalog • Queries defined as data, not as strings
  • Find User’s Comments
      (d/q '[:find ?comment
             :in $ ?email
             :where [?user :user/email ?email]
                    [?comment :comment/author ?user]]
           db "editor@example.com")
  • Data Pattern • Each :where clause is a pattern matched against datoms • [?user :user/email "editor@example.com"] • [?comment :comment/author ?user]
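To show how data patterns bind variables and join on shared names, here is a deliberately tiny pattern matcher in plain Python. It is a sketch of the idea only, not how the Datomic query engine is implemented: `is_var`, `match_clause`, `query`, and the sample datoms are all hypothetical, and datoms are reduced to `(e, a, v)` triples.

```python
def is_var(term):
    """Variables are strings starting with '?', as in the Datalog query."""
    return isinstance(term, str) and term.startswith("?")

def match_clause(clause, datom, binding):
    """Unify one (e a v) pattern with one datom; return the extended binding or None."""
    result = dict(binding)
    for term, value in zip(clause, datom):
        if is_var(term):
            if term in result and result[term] != value:
                return None          # conflicts with an earlier binding
            result[term] = value
        elif term != value:
            return None              # constant in the pattern does not match
    return result

def query(find, where, datoms, inputs=None):
    """Evaluate the clauses left to right, carrying variable bindings along."""
    bindings = [dict(inputs or {})]
    for clause in where:
        next_bindings = []
        for binding in bindings:
            for datom in datoms:
                extended = match_clause(clause, datom, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return {tuple(b[v] for v in find) for b in bindings}

datoms = [
    (1, ":user/email", "editor@example.com"),
    (2, ":user/email", "reader@example.com"),
    (10, ":comment/author", 1),
    (11, ":comment/author", 2),
    (12, ":comment/author", 1),
]

result = query(
    find=["?comment"],
    where=[("?user", ":user/email", "?email"),
           ("?comment", ":comment/author", "?user")],
    datoms=datoms,
    inputs={"?email": "editor@example.com"},
)
print(sorted(result))
# [(10,), (12,)]
```

The join happens because `?user` appears in both clauses: the second clause only matches datoms whose value equals the user bound by the first.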
  • Join across dbs • “Find me the customers who are also employees.”
      [:find ?customer ?email
       :in $cust $emp
       :where [$cust ?customer :email ?email]
              [$emp _ :email ?email]]
      (d/q query custDb empDb)
    • Data patterns can be led by database names • Using ?email in both patterns is an implicit join
  • Travel Through Time • Database *now* • Database *last week* • Database *if I added some transactions*
  • Views of a database
      Name       Semantics                             Supports
      (default)  current state                         what is the current situation?
      .asOf      state at point in past                how were things in the past?
      .since     state since point in past             how have things changed?
      tx report  before / after / change view of a tx  automated event response
      .with      state with proposed additions         what would happen if we did X?
      .history   timeless view of all history          anything!
  • Time travel
      (d/q '[:find ?customer
             :where [?customer :id]
                    [?customer :orders]]
           (d/as-of db #inst "2013-01-01"))
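The database views listed above can be modeled as filters over the same append-only log. The sketch below is plain Python and only illustrates the semantics, under the assumption that "as of" includes the given transaction and "since" excludes it; `as_of`, `since`, `with_datoms`, and `state` are hypothetical helper names, not Datomic functions.

```python
# Each datom is (entity, attribute, value, tx, added), following the earlier slides.
LOG = [
    (21005, ":name", "Stuart", 1000, True),
    (21005, ":likes", "tea", 1000, True),
    (21005, ":likes", "coffee", 1022, True),
    (21005, ":likes", "tea", 1022, False),
]

def as_of(log, tx):
    """View of the log up to and including transaction tx."""
    return [d for d in log if d[3] <= tx]

def since(log, tx):
    """View containing only datoms added after transaction tx."""
    return [d for d in log if d[3] > tx]

def with_datoms(log, proposed):
    """Speculative view: a new list with proposed datoms; the log is not changed."""
    return log + proposed

def state(log):
    """Fold a view into the set of (e, a, v) facts that hold in it."""
    facts = set()
    for e, a, v, tx, added in sorted(log, key=lambda d: d[3]):
        if added:
            facts.add((e, a, v))
        else:
            facts.discard((e, a, v))
    return facts

print(sorted(state(as_of(LOG, 1000))))
# [(21005, ':likes', 'tea'), (21005, ':name', 'Stuart')]
print(sorted(state(LOG)))
# [(21005, ':likes', 'coffee'), (21005, ':name', 'Stuart')]

what_if = with_datoms(LOG, [(21005, ":likes", "cocoa", 1023, True)])
print(len(LOG), len(what_if))
# 4 5
```

The full `LOG` itself plays the role of the history view: every assertion and retraction remains available, and each other view is just a different way of reading it.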
  • Query Engine is Local • Local query engine, local cache • If working set is in memory, everything is FAST • Reads do not require any transactor interaction • Use your own functions in the query
  • Extension with custom fns • “Find me the customer/product combinations where the shipping cost dominates the product cost.”
      [:find ?customer ?product
       :where [?customer :shipAddress ?addr]
              [?addr :zip ?zip]
              [?product :product/weight ?weight]
              [?product :product/price ?price]
              [(Shipping/estimate ?zip ?weight) ?shipCost]  ; function
              [(<= ?price ?shipCost)]]                      ; predicate
  • Rules
  • Rules • “Products are related if they have a common category.”
      [(relatedProduct ?p1 ?p2)  ; head is true ...
       [?p1 :category ?c]        ; ... if body is true
       [?p2 :category ?c]
       [(!= ?p1 ?p2)]]
  • Rule inputs • “Find all products related to expensive chocolate.”
      q("[:find ?p2 :in $ % :where (expensiveChocolate ?p1) (relatedProduct ?p1 ?p2)]", db, rules)
    • Rules are a kind of input • Rule inputs are named with a leading % • Rule patterns can appear in the :where clause
  • Implicit or • “Products are related if they have the same category, or they have appeared in the same order.”
      [[(relatedProduct ?p1 ?p2)
        [?p1 :category ?c]
        [?p2 :category ?c]
        [(!= ?p1 ?p2)]]
       [(relatedProduct ?p1 ?p2)
        [?o :order/item ?item1]
        [?item1 :order/product ?p1]
        [?o :order/item ?item2]
        [?item2 :order/product ?p2]
        [(!= ?p1 ?p2)]]]
  • Recursive query for graph navigation
      ;; base case: it is a story comment if there is a story with a comment
      [(story-comment ?story ?comment)
       [?story :story/title]
       [?story :news/comments ?comment]]
      ;; recursion: or, it is a story comment if it has a parent comment
      ;; which is itself a story comment
      [(story-comment ?story ?comment)
       [?parent :news/comments ?comment]
       (story-comment ?story ?parent)]
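The recursive rule computes a transitive closure: every comment reachable from a story through the comments attribute. The sketch below expresses the same closure in plain Python with an explicit worklist instead of recursion. It is an illustration only: `story_comments`, `titles`, and `comment_edges` are hypothetical names, and the entity ids are made up.

```python
def story_comments(story, titles, comment_edges):
    """Return every comment under a story, at any depth.

    titles:        dict of entity id -> title (models the [?story :story/title] clause)
    comment_edges: set of (parent, comment) pairs (models the comments attribute)
    """
    if story not in titles:
        return set()                 # base case requires an actual story
    found = set()
    frontier = [story]
    while frontier:
        parent = frontier.pop()
        for p, comment in comment_edges:
            if p == parent and comment not in found:
                found.add(comment)        # base case, or one more recursive step
                frontier.append(comment)  # its replies are story comments too
    return found

titles = {100: "Datomic talk", 300: "Other story"}
comment_edges = {(100, 201), (201, 202), (202, 203), (100, 204), (300, 301)}

print(sorted(story_comments(100, titles, comment_edges)))
# [201, 202, 203, 204]
```

Comment 203 is three levels deep and is still found, while comment 301 belongs to a different story and is not.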
  • Indexes and Logs
  • Direct Index Access • seek-datoms - walk the datoms in a specified index between particular transactions or dates • entid-at - gets a fabricated entity id based on a transaction id or date
  • Direct Log Access • Walk the transaction log directly • Starting from txid or point in time
  • Entities and Graph Walking
  • Entity • A collection of facts • About one thing • At a specific point in time
  • Entity • A collection of datoms • With the same entity ID • At a specific point in time • Can be viewed as a map • Traversable as a graph
  • Entities ⬌ Datoms
      ;; Entity
      {:db/id 21005
       :name "Stuart"
       :likes #{{:db/id 1730 :food/name "coffee"}}}
      ;; Datoms
      [21005 :name "Stuart"]
      [21005 :likes 1730]
      [1730 :food/name "coffee"]
  • Entities • Retrieve entity from the database as a map • Follow references to lazily walk the graph • Follow references in both directions • Use touch to retrieve all entity attributes
      (def ent (d/entity dbval 17592186045417))
      => {:db/id 17592186045417}
      (d/touch ent)
      => {:db/doc "Hello world", :db/id 17592186045417}
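The correspondence between entities and datoms can be sketched by grouping datoms on their entity id. This is a plain-Python model, not the Datomic entity API: `entity` and `referrers` are hypothetical helpers, and for simplicity every attribute value is collected into a set, as if all attributes had cardinality many.

```python
FACTS = [
    (21005, ":name", "Stuart"),
    (21005, ":likes", 1730),
    (1730, ":food/name", "coffee"),
]

def entity(facts, eid):
    """Map view of one entity: all facts that share the same entity id."""
    ent = {":db/id": eid}
    for e, a, v in facts:
        if e == eid:
            ent.setdefault(a, set()).add(v)
    return ent

def referrers(facts, attr, eid):
    """Reverse navigation: which entities point at eid through attr?"""
    return {e for e, a, v in facts if a == attr and v == eid}

stuart = entity(FACTS, 21005)
print(stuart[":name"])
# {'Stuart'}

# Forward: follow the :likes reference only when it is needed.
for liked_id in stuart[":likes"]:
    print(entity(FACTS, liked_id)[":food/name"])
# {'coffee'}

# Backward: who likes entity 1730?
print(referrers(FACTS, ":likes", 1730))
# {21005}
```

References are resolved on demand by calling `entity` again on the referenced id, which is the sense in which the graph is walked lazily in this model.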
  • Sweet Spots • Flexible data model • Audit trail, provenance, time • Transactional data of record • Horizontal query scaling • Cloud deployment
  • Leverage other systems • Blobs (images, sound, movies, giant text) • Write churn (hit counter) • Horizontal write scaling
  • Editions
                                   Datomic Free  Datomic Pro Starter  Datomic Pro
      Cost                         $0            $0                   per peer
      Redistributable?             Yes           No                   No
      Number of Peers              2             2                    by license
      High-Availability Transactor No            No                   Yes
      Storage Services             Local only    All                  All
      Memcached Support            No            No                   Yes
      Support                      Community     Community            Enterprise