The Reemergence of Datalog
Return of the    Living   Datalog
I liketurtles
Data
Rectangles
RDL
Rectangulation• Relationship between entities• Sparse data• Multi-valued attributes• PLace-Oriented Programming
Java
Java
Where’s  the Data?          Java
Java
Java
RDL
ORMG!
Code as Code.Data as Data.
Code as Code.Data as Data.
CookiesUser information Protocols   Lisp Schemas  EventsChess moves                   as Data.   TX    ...
Code as Code.Data as Data.
Unification
Your data
punching holes inYour data
fitting the holes inYour data
Variables
Deriving bindings
Substitution
Leaving variables
Related variables
MGU
Prolog
Prolog data
Prolog facts
Prolog rules
Prolog query
Data as Code!
Data as Code! Caveat Emptor
Caveats• Clause-order dependence• Non-termination• Infection of imperative
Clause order dependence
Clause order dependence
Non-termination
Non-termination
Fixed?
Imperative infection•!• fail
Cut
Cut
fail
Prolog is prelude• Powerful• Often beautiful• Not as declarative as we’d like
Datalog
Datalog is...• A query language• Not Turing complete• Explicit• Simple • to use    • ... and implement     • ... kinda
History• 1977: Gallaire and Minkers Symposium on  Logic Data Bases• 1980s: Nail, LDL, Coral• 1995: Stonebraker and Hellers...
History• 2002: Binder, a logic-based security language  by DeTreville• 2000s: Declarative networking, bddbddb,  Orchestra ...
Datalog is also• Declarative logic programming with  termination• Recursive queries• Implicit joins
EAV• “Entities” (objects?) - a grouping of tuples• Make that efficient
Query elements
Patterns
Simple query• Find all language entities with a website entry
Simple query• Find all language entities with a website entry                                          entity             ...
Binding query• Find all language URLs with a website entry                          entity           attribute            ...
Join• Repeating ?language indicates a join
Rules                                              head                                              body• All variables i...
Recursive rules
Recursive rules     Simula   LISP        Smalltalk          Dart
Datomic
Datalog plus• No need for a database• Time travel
Where’s the DB?
Where’s the DB?
Where’s the DB?
Where’s the DB?
now• How do you keep  a notion of time in  a relational  database?• Ever write now()?
Always?
Time• Total ordering of transactions• Every datom retains a reference to its  enclosing transaction• Transactions are first...
Time• You can obtain the value of the db as-of, or  since, a point in time, or both  • without parameterizing your logic w...
Datomic is also• Fully navigable lazy entity maps• Query across databases• Optimistic and pessimistic concurrency• “Upsert...
Dedalus
Datalog plus• Time• State via rules
Time• Tick model• Time is an element of the tuple
Deductive time• “Right now”
Deductive time• “Right now”• All terms have the same time...
Inductive time• “Some other time”
Inductive time• “Some other time”• Next time tick
Async time• Unreliable network
State• At time tick 0
State• At time tick 0• Facts
Update• At time tick 100• Facts
Update• At time tick 300• Facts
Mutable persistence rule
Mutable persistence rule• Update
Cascalog
Datalog plus• Map/reduce processing • Order independence a win
Cascalog
Three stages• Pre-aggregation• Aggregation• Post-aggregation    Pre-                         Post-                Aggregat...
Pre-aggregation• Joins aggregator functions• Applies bindable functions and filters• Dataflow-esque    Pre-                 ...
Aggregation• Partition result tuples along logic variables• Execute aggregators for each logic var    Pre-                ...
Post-aggregation• Execute the the remaining filters and  functions on dependent aggregator output    Pre-                  ...
Example   Pre-                         Post-               Aggregationaggregation                  aggregation
Pre-aggregation   Pre-                        Post-              Aggregationaggregation                 aggregation
Aggregation   Pre-                        Post-              Aggregationaggregation                 aggregation
Post-aggregation   Pre-                        Post-              Aggregationaggregation                 aggregation
Bacwn
Datalog plus• Negation
Delicious!
Bacwn facts
Bacwn rules!
Bacwn query
Bacwn negation
Bacwn negation• Find all non-humans at a location
Openquestions• Query plans• Optimizations
Query plans• The holy grail of DBs is that they do the  right thing• Query plans• No runtime guarantees
Gaming the query• /*+ Hinting */• Prolog - Ordering for termination• Datalog - Ordering for speed
Gaming the query• /*+ Hinting */• Prolog - Ordering for termination• Datalog - Ordering for speed
Wut• Find all descendants of root number #100  with value = 4• SLLLLLLOOOOOOWWWW
Wat• Find all descendants of root number #100  with value = 4• FAST!!
Wat• Find all descendants of root number #100  with value = 4• Most-bound
Order doesn’t   matterExcept when   it does
Pluggable Optimizers• Order agnostic is a win• Some orders are better than others• Plug in your own optimizer • That knows...
• Rich Hickey, Stu         THX!  Halloway, Clojure/core• Clojure/dev• Manning Publications• The fam• You
double-secret slides
?(foo)   engine
?(foo)   engine
?(foo)   engine
?(foo)   engine
Data as Data.
Scalar
Scalar
Client   ?   Server
JSONClient          Server   JSON as Data
Client   ?     ServerClojure as Code         118
Client         ServerClojure as Data         119
as Code.Data
Engine  121
Specification   Engine                 122
Specification   Engine                 123
I’m really                           doing                        something                            coolSpecification   ...
Code       as Data.
Function call      semantics:    fn call            arg                     (println "Hello World")      structure:       ...
Function definition           define a fn   fn name                                    docstring              (defn greet   ...
Its all data           symbol      symbol                                    string               (defn greet             ...
The Return of the Living Datalog
The Return of the Living Datalog
The Return of the Living Datalog
Upcoming SlideShare
Loading in...5
×

The Return of the Living Datalog

6,291

Published on

My talk, "T

Published in: Technology
4 Comments
9 Likes
Statistics
Notes
No Downloads
Views
Total Views
6,291
On Slideshare
0
From Embeds
0
Number of Embeds
9
Actions
Shares
0
Downloads
0
Comments
4
Likes
9
Embeds 0
No embeds

No notes for slide
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • Cancer of the rectangle\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • Mistaking the menu for the meal.\n
  • \n
  • * One Size Fits All: An Idea Whose Time has Come and Gone\n\n
  • * With a severe line between\n
  • \n
  • * We’re surrounded by data\n
  • * We need better ways to leverage our data, in all of its facets\n* Unify the two?\n
  • \n
  • \n
  • \n
  • \n
  • * Ground terms surrounding variables\n
  • \n
  • \n
  • \n
  • \n
  • * The resulting substitution between two forms that provides a symmetry between the two forms\n* Amalgamation of the bindings of either gives the MGU on subst\n
  • \n
  • \n
  • * unromantic view of childbirth\n* genealogy is the killer app\n
  • \n
  • \n
  • * backtracking\n* depth-first tree traversals\n
  • \n
  • * Bill’s descendents?\n
  • * Flip order fixes!\n
  • \n
  • \n
  • * This is not completely bad, but writing Prolog is often many balancing acts.\n* Like this\n
  • *generically cut is a way to prune branches from the search tree, but...\n
  • * This kills the idea of data as code\n
  • * Performing IO\n* This kills the idea of data as code\n
  • * Make some compromises?\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • * Relation patterns unify across the existing data\n
  • \n
  • \n
  • \n
  • \n
  • * \n
  • \n
  • * the gist of datalog\n* datalog is a family of languages\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • * Runs sentence generator and executes splitoperation\n* Binds ?word logic var\n\n
  • * Executes the count aggregator\n* Binds ?count logic var\n\n
  • * Executes the (> ?count 5) filter\n
  • \n
  • * negation or a toy\n* bacwn facts\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • * Stratified negation\n** subtle\n* Soft-stratification\n** PhD dissertation\n\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • * Most DBs prioritze automatic planning\n** 1-bajillion man-years of effort\n* Hinting\n** unstable -- black art on a black box\n\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • * This is the conceptual model for data as code\n
  • \n
  • \n
  • \n
  • \n
  • The Return of the Living Datalog

    1. 1. The Reemergence of Datalog
    2. 2. Return of the Living Datalog
    3. 3. I liketurtles
    4. 4. Data
    5. 5. Rectangles
    6. 6. RDL
    7. 7. Rectangulation• Relationship between entities• Sparse data• Multi-valued attributes• PLace-Oriented Programming
    8. 8. Java
    9. 9. Java
    10. 10. Where’s the Data? Java
    11. 11. Java
    12. 12. Java
    13. 13. RDL
    14. 14. ORMG!
    15. 15. Code as Code.Data as Data.
    16. 16. Code as Code.Data as Data.
    17. 17. CookiesUser information Protocols Lisp Schemas EventsChess moves as Data. TX ...
    18. 18. Code as Code.Data as Data.
    19. 19. Unification
    20. 20. Your data
    21. 21. punching holes inYour data
    22. 22. fitting the holes inYour data
    23. 23. Variables
    24. 24. Deriving bindings
    25. 25. Substitution
    26. 26. Leaving variables
    27. 27. Related variables
    28. 28. MGU
    29. 29. Prolog
    30. 30. Prolog data
    31. 31. Prolog facts
    32. 32. Prolog rules
    33. 33. Prolog query
    34. 34. Data as Code!
    35. 35. Data as Code! Caveat Emptor
    36. 36. Caveats• Clause-order dependence• Non-termination• Infection of imperative
    37. 37. Clause order dependence
    38. 38. Clause order dependence
    39. 39. Non-termination
    40. 40. Non-termination
    41. 41. Fixed?
    42. 42. Imperative infection•!• fail
    43. 43. Cut
    44. 44. Cut
    45. 45. fail
    46. 46. Prolog is prelude• Powerful• Often beautiful• Not as declarative as we’d like
    47. 47. Datalog
    48. 48. Datalog is...• A query language• Not Turing complete• Explicit• Simple • to use • ... and implement • ... kinda
    49. 49. History• 1977: Gallaire and Minkers Symposium on Logic Data Bases• 1980s: Nail, LDL, Coral• 1995: Stonebraker and Hellerstein declare "no practical applications …"• The dark years...
    50. 50. History• 2002: Binder, a logic-based security language by DeTreville• 2000s: Declarative networking, bddbddb, Orchestra CDSS, Doop, SecureBlox, Dedalus, more• 2010: The Declarative Imperative by Hellerstein!• Today: Bloom, Cascalog, Datomic, LogicBlox, more
    51. 51. Datalog is also• Declarative logic programming with termination• Recursive queries• Implicit joins
    52. 52. EAV• “Entities” (objects?) - a grouping of tuples• Make that efficient
    53. 53. Query elements
    54. 54. Patterns
    55. 55. Simple query• Find all language entities with a website entry
    56. 56. Simple query• Find all language entities with a website entry entity attribute
    57. 57. Binding query• Find all language URLs with a website entry entity attribute value
    58. 58. Join• Repeating ?language indicates a join
    59. 59. Rules head body• All variables in the head, must appear in the body
    60. 60. Recursive rules
    61. 61. Recursive rules Simula LISP Smalltalk Dart
    62. 62. Datomic
    63. 63. Datalog plus• No need for a database• Time travel
    64. 64. Where’s the DB?
    65. 65. Where’s the DB?
    66. 66. Where’s the DB?
    67. 67. Where’s the DB?
    68. 68. now• How do you keep a notion of time in a relational database?• Ever write now()?
    69. 69. Always?
    70. 70. Time• Total ordering of transactions• Every datom retains a reference to its enclosing transaction• Transactions are first-class entities, can have their own attributes
    71. 71. Time• You can obtain the value of the db as-of, or since, a point in time, or both • without parameterizing your logic with a time argument• You can also get the entire history of an entity!
    72. 72. Datomic is also• Fully navigable lazy entity maps• Query across databases• Optimistic and pessimistic concurrency• “Upsertting”• http://datomic.com
    73. 73. Dedalus
    74. 74. Datalog plus• Time• State via rules
    75. 75. Time• Tick model• Time is an element of the tuple
    76. 76. Deductive time• “Right now”
    77. 77. Deductive time• “Right now”• All terms have the same time...
    78. 78. Inductive time• “Some other time”
    79. 79. Inductive time• “Some other time”• Next time tick
    80. 80. Async time• Unreliable network
    81. 81. State• At time tick 0
    82. 82. State• At time tick 0• Facts
    83. 83. Update• At time tick 100• Facts
    84. 84. Update• At time tick 300• Facts
    85. 85. Mutable persistence rule
    86. 86. Mutable persistence rule• Update
    87. 87. Cascalog
    88. 88. Datalog plus• Map/reduce processing • Order independence a win
    89. 89. Cascalog
    90. 90. Three stages• Pre-aggregation• Aggregation• Post-aggregation Pre- Post- Aggregation aggregation aggregation
    91. 91. Pre-aggregation• Joins aggregator functions• Applies bindable functions and filters• Dataflow-esque Pre- Post- Aggregation aggregation aggregation
    92. 92. Aggregation• Partition result tuples along logic variables• Execute aggregators for each logic var Pre- Post- Aggregation aggregation aggregation
    93. 93. Post-aggregation• Execute the the remaining filters and functions on dependent aggregator output Pre- Post- Aggregation aggregation aggregation
    94. 94. Example Pre- Post- Aggregationaggregation aggregation
    95. 95. Pre-aggregation Pre- Post- Aggregationaggregation aggregation
    96. 96. Aggregation Pre- Post- Aggregationaggregation aggregation
    97. 97. Post-aggregation Pre- Post- Aggregationaggregation aggregation
    98. 98. Bacwn
    99. 99. Datalog plus• Negation
    100. 100. Delicious!
    101. 101. Bacwn facts
    102. 102. Bacwn rules!
    103. 103. Bacwn query
    104. 104. Bacwn negation
    105. 105. Bacwn negation• Find all non-humans at a location
    106. 106. Openquestions• Query plans• Optimizations
    107. 107. Query plans• The holy grail of DBs is that they do the right thing• Query plans• No runtime guarantees
    108. 108. Gaming the query• /*+ Hinting */• Prolog - Ordering for termination• Datalog - Ordering for speed
    109. 109. Gaming the query• /*+ Hinting */• Prolog - Ordering for termination• Datalog - Ordering for speed
    110. 110. Wut• Find all descendants of root number #100 with value = 4• SLLLLLLOOOOOOWWWW
    111. 111. Wat• Find all descendants of root number #100 with value = 4• FAST!!
    112. 112. Wat• Find all descendants of root number #100 with value = 4• Most-bound
    113. 113. Order doesn’t matterExcept when it does
    114. 114. Pluggable Optimizers• Order agnostic is a win• Some orders are better than others• Plug in your own optimizer • That knows your data• Will not affect other Datalog engine optimization techniques
    115. 115. • Rich Hickey, Stu THX! Halloway, Clojure/core• Clojure/dev• Manning Publications• The fam• You
    116. 116. double-secret slides
    117. 117. ?(foo) engine
    118. 118. ?(foo) engine
    119. 119. ?(foo) engine
    120. 120. ?(foo) engine
    121. 121. Data as Data.
    122. 122. Scalar
    123. 123. Scalar
    124. 124. Client ? Server
    125. 125. JSONClient Server JSON as Data
    126. 126. Client ? ServerClojure as Code 118
    127. 127. Client ServerClojure as Data 119
    128. 128. as Code.Data
    129. 129. Engine 121
    130. 130. Specification Engine 122
    131. 131. Specification Engine 123
    132. 132. I’m really doing something coolSpecification Engine now 124
    133. 133. Code as Data.
    134. 134. Function call semantics: fn call arg (println "Hello World") structure: symbol string list126
    135. 135. Function definition define a fn fn name docstring (defn greet "Returns a friendly greeting" [your-name] (str "Hello, " your-name)) arguments fn body127
    136. 136. Its all data symbol symbol string (defn greet "Returns a friendly greeting" [your-name] (str "Hello, " your-name)) vector list128

    ×