The Return of the Living Datalog

My talk, "T

Published in: Technology

  • Cancer of the rectangle
  • Mistaking the menu for the meal.
  • One Size Fits All: An Idea Whose Time has Come and Gone
  • With a severe line between
  • We’re surrounded by data
  • We need better ways to leverage our data, in all of its facets; unify the two?
  • Ground terms surrounding variables
  • The resulting substitution between two forms provides a symmetry between them; amalgamating the bindings of either gives the MGU on substitution
  • Unromantic view of childbirth; genealogy is the killer app
  • Backtracking; depth-first tree traversals
  • Bill’s descendants?
  • Flipping the order fixes it!
  • This is not completely bad, but writing Prolog often involves many balancing acts, like this
  • Generically, cut is a way to prune branches from the search tree, but...
  • This kills the idea of data as code
  • Performing IO; this kills the idea of data as code
  • Make some compromises?
  • Relation patterns unify across the existing data
  • The gist of Datalog; Datalog is a family of languages
  • Runs the sentence generator and executes the split operation; binds the ?word logic var
  • Executes the count aggregator; binds the ?count logic var
  • Executes the (> ?count 5) filter
  • Negation or a toy; Bacwn facts
  • Stratified negation (subtle); soft-stratification (PhD dissertation)
  • Most DBs prioritize automatic planning (1-bajillion man-years of effort); hinting is unstable, a black art on a black box
  • This is the conceptual model for data as code
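
The unification notes above (ground terms surrounding variables, a substitution between two forms, the MGU) can be sketched in a few lines. This is a minimal Python sketch, not code from the talk: the `?name` convention for variables, the tuple encoding of terms, and the function names `is_var`, `walk` and `unify` are all assumptions made for illustration, and the occurs check is left out for brevity.

```python
def is_var(term):
    # Assumption of this sketch: strings starting with "?" are logic variables.
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    # Follow bindings until reaching a non-variable or an unbound variable.
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst=None):
    # Return a substitution (dict) under which a and b are equal,
    # or None when the two forms clash. No occurs check is performed.
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


# Each form has a "hole" that the other form fills.
bindings = unify(("likes", "?who", "turtles"), ("likes", "fogus", "?what"))

# Ground terms that disagree cannot be unified.
clash = unify(("a", "?x"), ("b", "?x"))

# Related variables: ?x and ?y are tied together, then both resolve to "c".
related = unify(("?x", "?y", "c"), ("?y", "?x", "?x"))
resolved_x = walk("?x", related)
```

Here `bindings` holds one binding per hole, `clash` is `None`, and `resolved_x` shows that bindings made on either side combine into a single substitution.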

The Return of the Living Datalog

    1. The Reemergence of Datalog
    2. Return of the Living Datalog
    3. I like turtles
    4. Data
    5. Rectangles
    6. RDL
    7. Rectangulation • Relationship between entities • Sparse data • Multi-valued attributes • PLace-Oriented Programming
    8. Java
    9. Java
    10. Where’s the Data? Java
    11. Java
    12. Java
    13. RDL
    14. ORMG!
    15. Code as Code. Data as Data.
    16. Code as Code. Data as Data.
    17. Cookies User information Protocols Lisp Schemas Events Chess moves as Data. TX ...
    18. Code as Code. Data as Data.
    19. Unification
    20. Your data
    21. punching holes in Your data
    22. fitting the holes in Your data
    23. Variables
    24. Deriving bindings
    25. Substitution
    26. Leaving variables
    27. Related variables
    28. MGU
    29. Prolog
    30. Prolog data
    31. Prolog facts
    32. Prolog rules
    33. Prolog query
    34. Data as Code!
    35. Data as Code! Caveat Emptor
    36. Caveats • Clause-order dependence • Non-termination • Infection of imperative
    37. Clause order dependence
    38. Clause order dependence
    39. Non-termination
    40. Non-termination
    41. Fixed?
    42. Imperative infection • ! • fail
    43. Cut
    44. Cut
    45. fail
    46. Prolog is prelude • Powerful • Often beautiful • Not as declarative as we’d like
    47. Datalog
    48. Datalog is... • A query language • Not Turing complete • Explicit • Simple • to use • ... and implement • ... kinda
    49. History • 1977: Gallaire and Minker’s Symposium on Logic and Data Bases • 1980s: Nail, LDL, Coral • 1995: Stonebraker and Hellerstein declare "no practical applications …" • The dark years...
    50. History • 2002: Binder, a logic-based security language by DeTreville • 2000s: Declarative networking, bddbddb, Orchestra CDSS, Doop, SecureBlox, Dedalus, more • 2010: The Declarative Imperative by Hellerstein! • Today: Bloom, Cascalog, Datomic, LogicBlox, more
    51. Datalog is also • Declarative logic programming with termination • Recursive queries • Implicit joins
    52. EAV • “Entities” (objects?) - a grouping of tuples • Make that efficient
    53. Query elements
    54. Patterns
    55. Simple query • Find all language entities with a website entry
    56. Simple query • Find all language entities with a website entry (diagram labels: entity, attribute)
    57. Binding query • Find all language URLs with a website entry (diagram labels: entity, attribute, value)
    58. Join • Repeating ?language indicates a join
    59. Rules (diagram labels: head, body) • All variables in the head must appear in the body
    60. Recursive rules
    61. Recursive rules Simula LISP Smalltalk Dart
    62. Datomic
    63. Datalog plus • No need for a database • Time travel
    64. Where’s the DB?
    65. Where’s the DB?
    66. Where’s the DB?
    67. Where’s the DB?
    68. now • How do you keep a notion of time in a relational database? • Ever write now()?
    69. Always?
    70. Time • Total ordering of transactions • Every datom retains a reference to its enclosing transaction • Transactions are first-class entities, can have their own attributes
    71. Time • You can obtain the value of the db as-of, or since, a point in time, or both • without parameterizing your logic with a time argument • You can also get the entire history of an entity!
    72. Datomic is also • Fully navigable lazy entity maps • Query across databases • Optimistic and pessimistic concurrency • “Upserting” • http://datomic.com
    73. Dedalus
    74. Datalog plus • Time • State via rules
    75. Time • Tick model • Time is an element of the tuple
    76. Deductive time • “Right now”
    77. Deductive time • “Right now” • All terms have the same time...
    78. Inductive time • “Some other time”
    79. Inductive time • “Some other time” • Next time tick
    80. Async time • Unreliable network
    81. State • At time tick 0
    82. State • At time tick 0 • Facts
    83. Update • At time tick 100 • Facts
    84. Update • At time tick 300 • Facts
    85. Mutable persistence rule
    86. Mutable persistence rule • Update
    87. Cascalog
    88. Datalog plus • Map/reduce processing • Order independence a win
    89. Cascalog
    90. Three stages • Pre-aggregation • Aggregation • Post-aggregation
    91. Pre-aggregation • Joins aggregator functions • Applies bindable functions and filters • Dataflow-esque
    92. Aggregation • Partition result tuples along logic variables • Execute aggregators for each logic var
    93. Post-aggregation • Execute the remaining filters and functions on dependent aggregator output
    94. Example
    95. Pre-aggregation
    96. Aggregation
    97. Post-aggregation
    98. Bacwn
    99. Datalog plus • Negation
    100. Delicious!
    101. Bacwn facts
    102. Bacwn rules!
    103. Bacwn query
    104. Bacwn negation
    105. Bacwn negation • Find all non-humans at a location
    106. Open questions • Query plans • Optimizations
    107. Query plans • The holy grail of DBs is that they do the right thing • Query plans • No runtime guarantees
    108. Gaming the query • /*+ Hinting */ • Prolog - Ordering for termination • Datalog - Ordering for speed
    109. Gaming the query • /*+ Hinting */ • Prolog - Ordering for termination • Datalog - Ordering for speed
    110. Wut • Find all descendants of root number #100 with value = 4 • SLLLLLLOOOOOOWWWW
    111. Wat • Find all descendants of root number #100 with value = 4 • FAST!!
    112. Wat • Find all descendants of root number #100 with value = 4 • Most-bound
    113. Order doesn’t matter. Except when it does
    114. Pluggable Optimizers • Order agnostic is a win • Some orders are better than others • Plug in your own optimizer • That knows your data • Will not affect other Datalog engine optimization techniques
    115. THX! • Rich Hickey, Stu Halloway, Clojure/core • Clojure/dev • Manning Publications • The fam • You
    116. double-secret slides
    117. ?(foo) engine
    118. ?(foo) engine
    119. ?(foo) engine
    120. ?(foo) engine
    121. Data as Data.
    122. Scalar
    123. Scalar
    124. Client ? Server
    125. JSON Client Server JSON as Data
    126. Client ? Server Clojure as Code
    127. Client Server Clojure as Data
    128. Data as Code.
    129. Engine
    130. Specification Engine
    131. Specification Engine
    132. Specification Engine • I’m really doing something cool now
    133. Code as Data.
    134. Function call • semantics: fn call, arg • (println "Hello World") • structure: symbol, string, list
    135. Function definition • labels: define a fn, fn name, docstring, arguments, fn body • (defn greet "Returns a friendly greeting" [your-name] (str "Hello, " your-name))
    136. It’s all data • labels: symbol, symbol, string, vector, list • (defn greet "Returns a friendly greeting" [your-name] (str "Hello, " your-name))
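
The Datalog slides in the transcript (EAV tuples, patterns, joins through a repeated variable, rules whose head variables all appear in the body, recursive rules, guaranteed termination) can be tied together with a small sketch. The Python below is a hedged illustration rather than the code shown in the talk or the API of Datomic, Cascalog or Bacwn: the helper names `match`, `query` and `run_rules`, the `influenced-by` and `ancestor` attributes, and the three sample facts are assumptions chosen to echo the language names on the recursive-rules slide. Evaluation is naive bottom-up iteration to a fixpoint, which terminates because rules can only recombine a finite set of existing values.

```python
def is_var(term):
    # Assumption of this sketch: strings starting with "?" are logic variables.
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, binding):
    # Extend binding so that pattern equals fact, or return None on a mismatch.
    binding = dict(binding)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding


def query(patterns, facts):
    # Match each (entity, attribute, value) pattern against every fact.
    # A variable repeated across patterns acts as an implicit join.
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for fact in facts
                    if (b2 := match(pattern, fact, b)) is not None]
    return bindings


def run_rules(rules, facts):
    # Naive bottom-up evaluation: apply every rule until nothing new appears.
    facts = set(facts)
    while True:
        derived = {tuple(b.get(t, t) for t in head)
                   for head, body in rules
                   for b in query(body, facts)}
        if derived <= facts:
            return facts
        facts |= derived


# Illustrative facts only; they are not taken from the slides.
facts = {
    ("Smalltalk", "influenced-by", "Simula"),
    ("Smalltalk", "influenced-by", "LISP"),
    ("Dart", "influenced-by", "Smalltalk"),
}

# Each rule is (head, body); every head variable also appears in the body.
rules = [
    (("?x", "ancestor", "?y"), [("?x", "influenced-by", "?y")]),
    (("?x", "ancestor", "?z"), [("?x", "influenced-by", "?y"),
                                ("?y", "ancestor", "?z")]),
]

all_facts = run_rules(rules, facts)
ancestors = sorted(b["?a"] for b in query([("Dart", "ancestor", "?a")], all_facts))
print(ancestors)  # ['LISP', 'Simula', 'Smalltalk']
```

Swapping the two patterns in the recursive rule's body changes how much intermediate work `query` does but not the result, which is the "ordering for speed" rather than "ordering for termination" point made on the query-plan slides.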
