Datomic – A Modern Database - StampedeCon 2014

At StampedeCon 2014, Alex Miller (Cognitect) presented "Datomic – A Modern Database."

Datomic is a distributed database designed to run on next-generation cloud architectures. Datomic stores facts and retractions using a flexible schema, consistent transactions, and a logic-based query language. The focus on facts over time gives you the ability to look at the state of the database at any point in time and traverse your transactional data in many ways.

We’ll take a tour of the Datomic data model, transactions, query language, and architecture to highlight some of the unique attributes of Datomic and why it is an ideal modern database.

Datomic – A Modern Database - StampedeCon 2014

  1. 1. Datomic A Modern Database Alex Miller
  2. 2. Overview • Rationale • Data model - facts • Schema - entities, attributes • Transactions - assert, retract, excise • Architecture - peer, transactor, storage • Queries - datalog, rules
  3. 3. Rationale
  4. 4. A database that reconsiders... • Immutable data model • Flexible, extensible schemas • Importance of time in our data • Transactions and queries as data • Deployment for the cloud • Storage as a service
  5. 5. Data Model
  6. 6. Fact • One piece of information • About one thing • At a specific point in time • An immutable value
  7. 7. Fact = Datom
  8. 8. Fact = Datom ;; Entity Attribute Value Tx op [21005 :name "Stuart" 1000 assert] [21005 :likes tea 1000 assert] [21005 :likes coffee 1022 assert] [21005 :likes tea 1022 retract]
  9. 9. Fact = Datom ;; Entity Attribute Value Tx op [21005 :name "Stuart" 1000 assert] [21005 :likes tea 1000 assert] [21005 :likes coffee 1022 assert] [21005 :likes tea 1022 retract]
  10. 10. Fact = Datom ;; Entity Attribute Value Tx op [21005 :name "Stuart" 1000 assert] [21005 :likes tea 1000 assert] [21005 :likes coffee 1022 assert] [21005 :likes tea 1022 retract]
  11. 11. Datoms are Values ;; Entity Attribute Value Tx op [21005 :name "Stuart" 1000 assert] [21005 :likes tea 1000 assert] [21005 :likes coffee 1022 assert] [21005 :likes tea 1022 retract]
  12. 12. Datoms are Efficient ;; Entity Attribute Value Tx op [21005 43 "Stuart" 1000 true] [21005 75 3299 1000 true] [21005 75 1730 1022 true] [21005 75 3299 1022 false]
  13. 13. Database • A collection of facts • At a specific point in time • An immutable value
  14. 14. Database is a Value ;; Entity Attribute Value Tx op [21005 :name "Stuart" 1000 assert] [21005 :likes tea 1000 assert] [21005 :likes coffee 1022 assert] [21005 :likes tea 1022 retract]
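The idea that a database value is just the result of replaying facts can be sketched in plain Clojure, without the Datomic library. This is only an illustrative model: the `current-facts` function is hypothetical, and the values `:tea` and `:coffee` are written as keywords here as an assumption (the slides show them as bare words, then as entity ids).

```clojure
;; Plain-Clojure model of the slide's table; not the Datomic API.
;; Each datom is [entity attribute value tx added?].
(def datoms
  [[21005 :name  "Stuart" 1000 true]
   [21005 :likes :tea     1000 true]
   [21005 :likes :coffee  1022 true]
   [21005 :likes :tea     1022 false]])

(defn current-facts
  "Replays datoms in tx order: an assertion adds [e a v], a retraction removes it."
  [datoms]
  (reduce (fn [facts [e a v _tx added?]]
            (if added?
              (conj facts [e a v])
              (disj facts [e a v])))
          #{}
          (sort-by #(nth % 3) datoms)))

(current-facts datoms)
;; => #{[21005 :name "Stuart"] [21005 :likes :coffee]}
```

The input vector is never modified, so the full history stays available while the derived set answers "what is true now?".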
  15. 15. Schema
  16. 16. Entities + Attributes • Entities - identity (no "type") • Similar to: nodes in a graph, rows in rdbms • Attributes • Similar to: edges in a graph, cols in rdbms • Relation from entities to values, or • Relation from entities to entities
  17. 17. Value Attributes • Entity “Alex Miller” • Attribute :handle • Value “puredanger”
  18. 18. Ref Attributes • Entity “Alex Miller” • Attribute :follows • Entity “Mario Aquino”
  19. 19. Attribute Definition • Entities, just like your data • Required attribute attributes: • ident (keyword) - attribute identifier • valueType (keyword) - string, boolean, long, bigint, float, double, bigdec, instant, uuid, uri, bytes, ref • cardinality (keyword) - one or many
  20. 20. Optional Attributes • Optional attribute attributes: • doc (string) - documentation string • unique (keyword) - value or identity • index (boolean) - index this attribute • fulltext (boolean) - searchable • isComponent (boolean) - composite values • noHistory (boolean) - whether to retain history (rarely used, for things like counters)
  21. 21. Attribute Definition {:db/id #db/id[:db.part/db] :db/ident :comment/body :db/valueType :db.type/string :db/cardinality :db.cardinality/one :db.install/_attribute :db.part/db}
  22. 22. ERD Modeling (diagram) • Entity types: Quote, User • Attributes: user, text, timestamp, handle, email, first-name, last-name, follows • Legend: entity type, attribute, relationship attribute
  23. 23. Alternative Views (structure: view) • row: datoms sharing a common E • column: datoms sharing a common A • document: traversal of attributes • graph: traversal of reference attributes
  24. 24. Transactions
  25. 25. Transactions • Set of facts (assertions or retractions) • Applied at a point in time
  26. 26. Transactions as Data • Defined as data (not INSERT/UPDATE/DELETE strings) • Transactions are *also* entities! • You can query them - for when they happened and what datoms they include • And add new facts about them!
  27. 27. Datoms in a Transaction ;; E A V Tx op [21005 :name "Stuart" 1000 true] [21005 :likes 3299 1000 true] [21005 :likes 1730 1022 true] [21005 :likes 3299 1022 false]
  28. 28. Integrity • ACID transactions • Equivalent to isolation level SERIALIZABLE
  29. 29. Constraints • Schema constraints enforced on attributes • Transaction functions • From old db value to new db value • Enforce arbitrary constraints • Can reject transactions
  30. 30. Uses for Tx Fns • Atomic update • Maintaining integrity constraints (composite keys) • Strict validation • Constructing entities • Annotating transactions
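A transaction function takes the old database value and produces the data for the new one, and it can reject the transaction. The sketch below models that in plain Clojure rather than with the Datomic library. The `[:db/add e a v]` and `[:db/retract e a v]` list forms mirror Datomic's transaction data, but `apply-tx`, `value-of`, `add-to-balance`, and the `:account/balance` attribute are hypothetical names chosen for illustration.

```clojure
;; Plain-Clojure model; a db value is a set of [e a v] facts.
(defn apply-tx
  "Returns a new db value; the old value is left untouched."
  [db tx-data]
  (reduce (fn [db [op e a v]]
            (case op
              :db/add     (conj db [e a v])
              :db/retract (disj db [e a v])))
          db
          tx-data))

(defn value-of
  "Looks up the value of attribute a on entity e, or nil if absent."
  [db e a]
  (some (fn [[e2 a2 v]] (when (and (= e e2) (= a a2)) v)) db))

(defn add-to-balance
  "Hypothetical transaction function: reads the old db value and returns
  transaction data. Throws to reject an update that breaks the constraint."
  [db e amount]
  (let [old     (or (value-of db e :account/balance) 0)
        updated (+ old amount)]
    (when (neg? updated)
      (throw (ex-info "Balance may not go negative" {:entity e :balance updated})))
    [[:db/retract e :account/balance old]
     [:db/add     e :account/balance updated]]))

(def db0 #{[1 :account/balance 100]})
(def db1 (apply-tx db0 (add-to-balance db0 1 -30)))
;; db0 is still #{[1 :account/balance 100]}; db1 is #{[1 :account/balance 70]}
```

Because the function sees a consistent old value and its output is applied as one unit, the read-modify-write behaves as an atomic update in this model.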
  31. 31. Architecture
  32. 32. [Diagram: monolithic server. Labels: App Process, App, Strings (DDL + DML), Result Sets, Server (Indexing, Transactions, Query, I/O, cache), Storage]
  33. 33. [Diagram: monolithic server compared with Datomic's peer, transactor, storage. Labels: App Process, App, Peer Lib, Query, Cache, Live Index, Transactor, Indexing, Transactions, Storage Service, Data Segments]
  34. 34. Peer Library Storage Service Transactor Datomic Components
  35. 35. Your Application Peer Library Storage Service Transactor Peer Library • Included in your app • Executes queries locally
  36. 36. Your Application Peer Library Storage Service Transactor Peer Library • Reads data from storage • Caches locally cache
  37. 37. Your Application Peer Library Storage Service Transactor Scale Horizontally cache Your Application Peer Library Your Application Peer Library cache cache
  38. 38. Your Application Peer Library Storage Service Transactor Transactor • Standalone system • Scales vertically
  39. 39. Your Application Peer Library Storage Service Transactor Transactor • Standalone system • Hot standby for failover
  40. 40. Your Application Peer Library Storage Service Transactor Transactor • Coordinates writes • Guarantees ACID transactions
  41. 41. Your Application Peer Library Storage Service Transactor Transactor • Writes transaction log to storage • Generates indexes
  42. 42. Storage Service Transactor Transactor • Broadcasts live updates Your Application Peer Library cache Your Application Peer Library Your Application Peer Library cache cache
  43. 43. Your Application Peer Library Storage Service Transactor Storage • Provided as a service • Many different back-ends
  44. 44. Your Application Peer Library Storage Service Transactor Local Storage • Memory • Filesystem
  45. 45. PostgreSQL, Oracle, … Your Application Peer Library Storage Service Transactor Nearby Storage • SQL database
  46. 46. Your Application Peer Library Storage Service Transactor Distributed Storage • DynamoDB • Riak • Couchbase • Infinispan • Cassandra
  47. 47. Your Application Peer Library Storage Service Your Application Peer Library Your Application Peer Library Datomic Components Transactor
  48. 48. Your Application Peer Library Storage Service Transactions Your Application Peer Library Your Application Peer Library ACID Writes Transactor Transactions & Indexes
  49. 49. Your Application Peer Library Storage Service Live updates Your Application Peer Library Your Application Peer Library Live Updates Transactor
  50. 50. Your Application Peer Library Storage Service Your Application Peer Library Your Application Peer Library Distributed Reads Transactor Reads
  51. 51. Your Application Peer Library Storage Service Your Application Peer Library Your Application Peer Library Local Caches cache cache cache Transactor Reads
  52. 52. Your Application Peer Library Storage Service Your Application Peer Library Your Application Peer Library Shared Memcached Transactor Reads Memcached cache cache cache
  53. 53. Queries
  54. 54. Queries as Data • Datalog • Queries defined as data, not as strings
  55. 55. Find User’s Comments (d/q '[:find ?comment :in $ ?email :where [?user :user/email ?email] [?comment :comment/author ?user]] db "editor@example.com")
  56. 56. (d/q '[:find ?comment :in $ ?email :where [?user :user/email ?email] [?comment :comment/author ?user]] db "editor@example.com") Data Pattern ?user :user/email "editor@example.com" ?comment :comment/author
  57. 57. [:find ?customer ?email :in $cust $emp :where [$cust ?customer :email ?email] [$emp _ :email ?email]] “Find me the customers who are also employees.” Join across dbs (d/q query custDb empDb) implicit join
  58. 58. [:find ?customer ?email :in $cust $emp :where [$cust ?customer :email ?email] [$emp _ :email ?email]] “Find me the customers who are also employees.” Join across dbs (d/q query custDb empDb) data patterns can be led by database names
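The implicit join works because the same variable, `?email`, appears in a pattern against each database. A rough plain-Clojure equivalent, with no Datomic dependency, is a nested comprehension that keeps only rows where the two email values match. The sample data and the function name `customers-who-are-employees` are made up for this sketch.

```clojure
;; Two independent "databases", each modeled as a vector of [e a v] facts.
(def cust-db [[1 :email "ann@example.com"]
              [2 :email "bob@example.com"]])
(def emp-db  [[77 :email "bob@example.com"]
              [78 :email "cy@example.com"]])

(defn customers-who-are-employees
  "Returns [customer email] pairs whose email also appears in emp-db."
  [cust-db emp-db]
  (set (for [[customer a1 email] cust-db
             :when (= a1 :email)
             [_ a2 emp-email]    emp-db
             :when (and (= a2 :email) (= email emp-email))]
         [customer email])))

(customers-who-are-employees cust-db emp-db)
;; => #{[2 "bob@example.com"]}
```

In the Datalog version the equality test disappears: reusing `?email` in both patterns expresses it.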
  59. 59. Travel Through Time • Database *now* • Database *last week* • Database *if I added some transactions*
  60. 60. Views of a database (name: semantics; supports) • (default): current state; what is the current situation? • .asOf: state at point in past; how were things in the past? • .since: state since point in past; how have things changed? • tx report: before / after / change view of a tx; automated event response • .with: state with proposed additions; what would happen if we did X? • .history: timeless view of all history; anything!
  61. 61. (d/q '[:find ?customer :where [?customer :id] [?customer :orders]] (d/as-of db #inst "2013-01-01")) Time travel
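To make the time-travel views concrete, here is a small plain-Clojure model of "as of" and "with" over a list of datoms. It does not use the Datomic library: `as-of-tx` and `with-datoms` are hypothetical helpers, they filter by transaction number rather than by instant, and the keyword values are assumptions for illustration.

```clojure
;; Plain-Clojure model; datoms are [e a v tx added?].
(def datoms
  [[21005 :name  "Stuart" 1000 true]
   [21005 :likes :tea     1000 true]
   [21005 :likes :coffee  1022 true]
   [21005 :likes :tea     1022 false]])

(defn facts
  "Folds datoms into the set of [e a v] facts that hold after replaying them."
  [datoms]
  (reduce (fn [acc [e a v _ added?]]
            ((if added? conj disj) acc [e a v]))
          #{}
          (sort-by #(nth % 3) datoms)))

(defn as-of-tx
  "Hypothetical helper: keeps only datoms from transactions up to and including t."
  [datoms t]
  (filter (fn [[_ _ _ tx _]] (<= tx t)) datoms))

(defn with-datoms
  "Hypothetical helper: a speculative view with extra datoms; nothing is stored."
  [datoms proposed]
  (concat datoms proposed))

(facts (as-of-tx datoms 1000))
;; => #{[21005 :name "Stuart"] [21005 :likes :tea]}

(facts (with-datoms datoms [[21005 :likes :cocoa 1050 true]]))
;; => #{[21005 :name "Stuart"] [21005 :likes :coffee] [21005 :likes :cocoa]}
```

Each view is just another value derived from the same history, which is why the same query can run unchanged against any of them.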
  62. 62. Query Engine is Local • Local query engine, local cache • If working set is in memory, everything is FAST • Reads do not require any transactor interaction • Use your own functions in the query
  63. 63. Extension with custom fns [:find ?customer ?product :where [?customer :shipAddress ?addr] [?addr :zip ?zip] [?product :product/weight ?weight] [?product :product/price ?price] [(Shipping/estimate ?zip ?weight) ?shipCost] [(<= ?price ?shipCost)]] “Find me the customer/product combinations where the shipping cost dominates the product cost.” predicate
  64. 64. Extension with custom fns [:find ?customer ?product :where [?customer :shipAddress ?addr] [?addr :zip ?zip] [?product :product/weight ?weight] [?product :product/price ?price] [(Shipping/estimate ?zip ?weight) ?shipCost] [(<= ?price ?shipCost)]] “Find me the customer/product combinations where the shipping cost dominates the product cost.” function
  65. 65. Rules
  66. 66. Rules [(relatedProduct ?p1 ?p2) [?p1 :category ?c] [?p2 :category ?c] [(!= ?p1 ?p2)]] “Products are related if they have a common category.”
  67. 67. Rules [(relatedProduct ?p1 ?p2) [?p1 :category ?c] [?p2 :category ?c] [(!= ?p1 ?p2)]] “Products are related if they have a common category.” head is true ...
  68. 68. Rules [(relatedProduct ?p1 ?p2) [?p1 :category ?c] [?p2 :category ?c] [(!= ?p1 ?p2)]] “Products are related if they have a common category.” if body is true
  69. 69. q("[:find ?p2 :in $ % :where (expensiveChocolate ?p1) (relatedProduct ?p1 ?p2)]", db, rules) “Find all products related to expensive chocolate.” Rule inputs: rules are a kind of input
  70. 70. q("[:find ?p2 :in $ % :where (expensiveChocolate ?p1) (relatedProduct ?p1 ?p2)]", db, rules) “Find all products related to expensive chocolate.” Naming rule inputs: rule names begin with %
  71. 71. q("[:find ?p2 :in $ % :where (expensiveChocolate ?p1) (relatedProduct ?p1 ?p2)]", db, rules) “Find all products related to expensive chocolate.” Using rule patterns: rule patterns can appear in :where clause
  72. 72. [[(relatedProduct ?p1 ?p2) [?p1 :category ?c] [?p2 :category ?c] [(!= ?p1 ?p2)]] [(relatedProduct ?p1 ?p2) [?o :order/item ?item1] [?item1 :order/product ?p1] [?o :order/item ?item2] [?item2 :order/product ?p2] [(!= ?p1 ?p2)]]] “Products are related if they have the same category, or they have appeared in the same order.” Implicit or
  73. 73. ;; base case [(story-comment ?story ?comment) [?story :story/title] [?story :news/comments ?comment]] Recursive query for graph navigation: it is a story comment if...
  74. 74. ;; base case [(story-comment ?story ?comment) [?story :story/title] [?story :news/comments ?comment]] Recursive query for graph navigation: ... there is a story ...
  75. 75. ;; base case [(story-comment ?story ?comment) [?story :story/title] [?story :news/comments ?comment]] Recursive query for graph navigation: ... with a comment
  76. 76. ;; recursion [(story-comment ?story ?comment) [?parent :news/comments ?comment] (story-comment ?story ?parent)] Recursive query for graph navigation: or, it is a story comment if...
  77. 77. ;; recursion [(story-comment ?story ?comment) [?parent :news/comments ?comment] (story-comment ?story ?parent)] Recursive query for graph navigation ... it has a parent comment ...
  78. 78. ;; recursion [(story-comment ?story ?comment) [?parent :news/comments ?comment] (story-comment ?story ?parent)] Recursive query for graph navigation: which is itself a story comment
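What the two `story-comment` rule definitions compute together is the set of comments reachable from a story, whether attached directly or nested under another comment. A plain-Clojure sketch of the same traversal follows. It is not Datalog and does not use the Datomic library; the `comments-of` map and its ids are invented sample data, and the recursion assumes the comment graph has no cycles.

```clojure
;; parent id -> ids of its direct comments (stands in for :news/comments).
(def comments-of
  {:story-1   [:comment-a :comment-b]
   :comment-a [:comment-c]
   :comment-c [:comment-d]})

(defn story-comments
  "All comments reachable from `parent`, direct (base case) or nested (recursion)."
  [parent]
  (mapcat (fn [c] (cons c (story-comments c)))
          (get comments-of parent)))

(set (story-comments :story-1))
;; => #{:comment-a :comment-b :comment-c :comment-d}
```

The base case corresponds to reading a parent's direct comments; the recursive call corresponds to the second rule definition, where a comment's parent is itself a story comment.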
  79. 79. Indexes and Logs
  80. 80. Direct Index Access • seek-datoms - walk the datoms in a specified index between particular transactions or dates • entid-at - gets a fabricated entity id based on a transaction id or date
  81. 81. Direct Log Access • Walk the transaction log directly • Starting from txid or point in time
  82. 82. Entities and Graph Walking
  83. 83. Entity • A collection of facts • About one thing • At a specific point in time
  84. 84. Entity • A collection of datoms • With the same entity ID • At a specific point in time
  85. 85. Entity • A collection of datoms • With the same entity ID • At a specific point in time • Can be viewed as a map • Traversable as a graph
  86. 86. Entities ⬌ Datoms ;; Entity {:db/id 21005 :name "Stuart" :likes #{{:db/id 1730 :food/name "coffee"}}} ;; Datoms [21005 :name "Stuart"] [21005 :likes 1730] [1730 :food/name "coffee"]
  87. 87. Entities • Retrieve entity from the database as a map • Follow references to lazily walk the graph • Follow references in both directions • Use touch to retrieve all entity attributes (def ent (d/entity dbval 17592186045417)) => {:db/id 17592186045417} (d/touch ent) => {:db/doc "Hello world", :db/id 17592186045417}
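The correspondence between datoms and an entity map can be modeled in a few lines of plain Clojure. This is a simplified sketch, not how the Datomic entity API is implemented: `entity-map` is a hypothetical helper, and it stores every attribute as a set of values instead of honoring the schema's cardinality.

```clojure
;; Facts from the slide, as [e a v].
(def facts
  [[21005 :name "Stuart"]
   [21005 :likes 1730]
   [1730 :food/name "coffee"]])

(defn entity-map
  "Collects every [e a v] for entity id `e` into a map of attribute -> set of values."
  [facts e]
  (reduce (fn [m [e2 a v]]
            (if (= e e2)
              (update-in m [a] (fnil conj #{}) v)
              m))
          {:db/id e}
          facts))

(entity-map facts 21005)
;; => {:db/id 21005, :name #{"Stuart"}, :likes #{1730}}

;; Following the reference stored under :likes walks one step through the graph.
(entity-map facts (first (:likes (entity-map facts 21005))))
;; => {:db/id 1730, :food/name #{"coffee"}}
```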
  88. 88. Sweet Spots • Flexible data model • Audit trail, provenance, time • Transactional data of record • Horizontal query scaling • Cloud deployment
  89. 89. Leverage other systems • Blobs (images, sound, movies, giant text) • Write churn (hit counter) • Horizontal write scaling
  90. 90. Datomic Free / Datomic Pro Starter / Datomic Pro • Cost: $0 / $0 / per peer • Redistributable?: Yes / No / No • Number of Peers: 2 / 2 / by license • High-Availability Transactor: No / No / Yes • Storage Services: Local only / All / All • Memcached: No / No / Yes • Support: Community / Community / Enterprise
