Neo4j -[:LOVES]-> Cypher

The Neo4j graph database lacked a declarative query language.
We wanted to add a humane query language that is easy to read and understand. It borrows from other languages such as SQL and SPARQL but brings its own flavor. Cypher uses ASCII art to describe the graph patterns you are looking for.
We used Scala's parser combinator library, together with functional approaches and lazy evaluation, to develop the Cypher query language.
The talk describes the internals of the Cypher implementation.

Published in: Technology

  1. 1. Geekout, Tallinn, Estonia. Got a Graph Database? Need a Query Language! (Neo4j) -[:LOVES]-> (Cypher)
  2. 2. (Michael) -[:WORKS_ON]-> (Neo4j) [Diagram: ME connected to console, Cypher, community graph, Community, Server, Spring, Cloud]
  3. 3. YOU? SQL, NOSQL, Graph Database
  4. 4. Graphs are everywhere
  5. 5. (Neo4j) -[:IS_A]-> (Graph Database) [Diagram: Neo4j feature graph. Node labels: Lucene, Index, Sharding, Master/Slave, 1 M/s, Server, embedded, ACID TX, 34bn Nodes, MySQL, Mongo, Heroku, Java, Ruby, JS, Clojure, .net. Relationship labels: TRAVERSALS, HIGH_AVAIL., INTEGRATES, PROVIDES, RUNS_AS, LICENSED_LIKE, SCALES_TO, RUNS_ON]
  6. 6. Homework - so Pay Attention • Go to http://bit.ly/geekout-cypher • With your Cypher knowledge • Add yourself to the graph • Determine a path from you to me • Share the console on Twitter by Monday • Win this AR-Drone
  7. 7. Query a Graph with [ASCII-art banner] Whirlwind Tour
  8. 8. A Graph: Can't see the Patterns for the Trees http://maxdemarzi.com/2012/02/13/visualizing-a-network-with-cypher/
  9. 9. Patterns? Where?
  10. 10. Patterns? Describe them!
  11. 11. Patterns? Find them on bound nodes
  12. 12. Patterns? There are more!
  13. 13. Patterns? Make them visible!
  14. 14. How? A Graph Query Language with Pattern Matching
  15. 15. SPARQL, SQL? No, Cypher: developed by and for Neo4j
  16. 16. Can I draw Patterns? A B C
  17. 17. Sure, with ASCII ART! A B C
  18. 18. Sure, with ASCII ART! A B C A --> B --> C, A --> C
  19. 19. Sure, with ASCII ART! A B C A --> B --> C, A --> C A --> B --> C <-- A
  20. 20. Example: Directed Relationship A B
  21. 21. Example: Directed Relationship A B (A) --> (B)
  22. 22. Example: Labeled Directed Relationship LOVES A B
  23. 23. Example: Labeled Directed Relationship LOVES A B (A) -[:LOVES]-> (B)
  24. 24. Example: Transitive Relationship Path A B C
  25. 25. Example: Transitive Relationship Path A B C (A)-->(B)-->(C)
  26. 26. Example: Variable Length Path A B A B A B ...
  27. 27. Example: Variable Length Path A B A B A B ... (A) -[*]-> (B)
  28. 28. Real World Examples
  29. 29. Give me: JOINS
      SQL:
        SELECT skills.*, user_skill.*
        FROM users
        JOIN user_skill ON users.id = user_skill.user_id
        JOIN skills ON user_skill.skill_id = skills.id
        WHERE users.id = 1
      Cypher:
        START user = node(1)
        MATCH user -[r:USER_SKILL]-> skill
        RETURN skill, r
  30. 30. Give me: Old, Influential Friends
        START me = node(...)
        MATCH (me) -[f:FRIEND]- (old_friend) -[:FRIEND]- (fof)
        WHERE ({today} - f.begin) > 365*10
        WITH old_friend, collect(fof.name) AS names
        WHERE length(names) > 100
        RETURN old_friend, names
        ORDER BY old_friend.name ASC
      [Diagram: (me) -[f:FRIEND]- (friend) -[:FRIEND]- (fof)]
  31. 31. Give me: Simple Recommendation
        START me = node(...)
        MATCH (me) -[r1:RATED]-> (thing) <-[r2:RATED]- (someone) -[r3:RATED]-> (cool_thing)
        WHERE ABS(r1.stars - r2.stars) <= 2 AND r3.stars > 3
        RETURN cool_thing, count(*) AS cnt
        ORDER BY cnt DESC LIMIT 10
      [Diagram: (me) -[r1:RATED]-> (thing) <-[r2:RATED]- (so) -[r3:RATED]-> (cool thing)]
  32. 32. Results? • Tables: for Human Brainz & Tools • Graphs: to highlight, visualize, export & refine queries
  33. 33. One Step Back • Goals / Intent • Origins • Decisions • Implementation • Future
  34. 34. What is Cypher? • Graph Query Language for Neo4j • Querying for Humans
  35. 35. Goals, Origins & Design Picking good ideas and having some of our own
  36. 36. Cypher: Some Goals • ASCII-art patterns • Pattern Matching • Declarative • External DSL • Closures • SQL Familiarity
  37. 37. Something new? • Existing Neo4j query mechanisms were not simple enough • Too verbose (Java API) • Too prescriptive (Gremlin) • Other query languages
  38. 38. Java API? • Object oriented • Node, Relationship, Path objects • Imperative • Verbose • Traversers, for-loops, Index • mostly lazy-eval
  39. 39. Gremlin? • DSL for pipes • imperative • a single path expression • loop constructs • side-effects • lots of closures
  40. 40. SQL? • Well known and understood • Unable to express paths • these are crucial for graph-based reasoning • Cumbersome Mutation • Neo4j is schema/table free
  41. 41. SPARQL? • SPARQL designed for a different data model • namespaces / URIs • reified properties as nodes • SPARQL/RDF mostly in academia • developers don't get it • Pattern matching is cool
  42. 42. Cypher: Some Goals • ASCII-art patterns • Pattern Matching • Declarative • External DSL • Closures • SQL Familiarity
  43. 43. Cypher: Some Goals • ASCII-art patterns • Pattern Matching • Declarative • External DSL • Closures • SQL Familiarity
  44. 44. Design Decisions: Closures / Quantifiers
        START london = node(1), moscow = node(2)
        MATCH path = london -[*]-> moscow
        WHERE all(city in nodes(path) where city.capital)
        RETURN path
  45. 45. Cypher: Some Goals • ASCII-art patterns • Pattern Matching • Declarative • External DSL • Closures • SQL Familiarity
  46. 46. Design Decisions: Parsed, not an internal DSL • Execution Semantics • Serialisation • Type System • Portability
  47. 47. Cypher: Some Goals • ASCII-art patterns • Pattern Matching • Declarative • External DSL • Closures • SQL Familiarity
  48. 48. Design Decisions: Familiar for SQL users
        SQL         Cypher
        select      start
        from        match
        where       where
        group by    return
        order by    order by
                    skip limit
  49. 49. Design Decisions: Database vs Application • Design Goal: single user interaction expressible as single query • Queries have enough logic to find required data, not enough to process it
  50. 50. Implementation • Docs from Tests • Execution Plan • Java Bridge • Pattern Matching • Parsing • Scala
  51. 51. CODE
  52. 52. Scala • Parser Combinator • Query Object Creation • Combining Pipes • Pattern Matching • Lazy Evaluation
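
The parser combinator idea named on this slide can be sketched in a few lines. This is a Python illustration only: the talk states that the real parser is built with Scala's parser combinator library, and every name below (`literal`, `regex`, `seq`, `fmap`, `start_clause`, `parse_start`) is hypothetical. A parser is modelled as a function from `(text, pos)` to `(value, new_pos)`, or `None` on failure, and combinators assemble larger parsers from smaller ones.

```python
import re

# A parser is a function (text, pos) -> (value, new_pos), or None on failure.
# All names here are illustrative; none of them come from the Cypher code base.

def literal(word):
    """Match a fixed token case-insensitively, skipping leading whitespace."""
    pattern = re.compile(r"\s*" + re.escape(word), re.IGNORECASE)
    def parse(text, pos):
        m = pattern.match(text, pos)
        return (word, m.end()) if m else None
    return parse

def regex(expr):
    """Match a regular expression, skipping leading whitespace."""
    pattern = re.compile(r"\s*(" + expr + ")")
    def parse(text, pos):
        m = pattern.match(text, pos)
        return (m.group(1), m.end()) if m else None
    return parse

def seq(*parsers):
    """Run parsers one after another; fail as soon as one of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def fmap(parser, fn):
    """Transform the value produced by a successful parse."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parse

identifier = regex(r"[A-Za-z_]\w*")
number = fmap(regex(r"\d+"), int)

# Grammar fragment: START <identifier> = node(<id>)
start_clause = fmap(
    seq(literal("START"), identifier, literal("="),
        literal("node"), literal("("), number, literal(")")),
    lambda v: {"clause": "START", "identifier": v[1], "node_id": v[5]},
)

def parse_start(query):
    """Return a small query object for a START clause, or None."""
    result = start_clause(query, 0)
    return result[0] if result else None
```

Calling `parse_start("START n=node(0)")` yields a dictionary describing the clause, which stands in for the "Query Object Creation" step listed on the slide.
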
  53. 53. Example: Pipes
        START n=node(0)
        MATCH n-[*]->m
        RETURN n, count(*)
        ORDER BY n.name
      Resulting Pipes:
        Parameters()
        Nodes(n)                 // Identifier n
        PatternMatch(n-[*]->m)   // Identifier m
        Extract([n,n.name])
        Sort(n.name ASC)
        EagerAggregation(keys: [n], aggregates: [count(*)])
        ColumnFilter([n.name,n,count(*)])
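
A rough way to picture the pipe chain from the slide above is a stack of Python generators, each consuming the rows of the previous one. This is a simplified sketch, not the Scala implementation: the toy graph, the node names, and all function names are assumptions, the `Parameters`, `Extract` and `ColumnFilter` pipes are left out, and aggregation runs before sorting purely to keep the example short.

```python
# Toy graph (hypothetical data): node id -> ids of outgoing neighbours.
GRAPH = {0: [1, 2], 1: [3], 2: [3], 3: []}
NAMES = {0: "root", 1: "a", 2: "b", 3: "c"}

def nodes_pipe(identifier, node_ids):
    """Source pipe: lazily binds an identifier to each start node."""
    for node_id in node_ids:
        yield {identifier: node_id}

def pattern_match_pipe(source, start, end):
    """Expands each row once per path found for start-[*]->end."""
    def walk(node, used_edges):
        for nxt in GRAPH[node]:
            edge = (node, nxt)
            if edge in used_edges:
                continue  # never reuse a relationship within one path
            yield nxt
            yield from walk(nxt, used_edges | {edge})
    for row in source:
        for target in walk(row[start], frozenset()):
            yield {**row, end: target}

def eager_aggregation_pipe(source, key):
    """Consumes the whole input before emitting anything: laziness ends here."""
    counts = {}
    for row in source:
        counts[row[key]] = counts.get(row[key], 0) + 1
    for value, count in counts.items():
        yield {key: value, "count(*)": count}

def sort_pipe(source, sort_key):
    """Sorting also needs the full input before the first row is emitted."""
    yield from sorted(source, key=sort_key)

def run_query():
    """START n=node(0) MATCH n-[*]->m RETURN n, count(*) ORDER BY n.name"""
    rows = nodes_pipe("n", [0])
    rows = pattern_match_pipe(rows, "n", "m")
    rows = eager_aggregation_pipe(rows, "n")
    rows = sort_pipe(rows, lambda row: NAMES[row["n"]])
    return list(rows)
```

Nothing is computed until `list(rows)` pulls on the last generator, which mirrors the "Lazy Evaluation" and "Combining Pipes" points from the Scala slide.
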
  54. 54. Why Scala? • Fun Language • Parser Combinators • Functional Concepts • Collection library • Immutable Values • Scala Pattern Matching
  55. 55. Problems with Scala? • Scala versions • Slow compilation • Low Collection Write Throughput • Code gets easily complicated • Hard to ramp up for other devs • Big separate library
  56. 56. Implementation: Query Parts • START • MATCH: Pattern Matching • WHERE: Expressions, Predicates • RETURN
  57. 57. [Diagram: graph of Cypher query parts. Node labels: START, WHERE, Aggregation, DELETE, Identifier, Result, SKIP/LIMIT, Graph, Pattern, MATCH, CREATE, RELATE. Relationship labels: BINDS, FILTERS, RESTRUCTURES, REMOVE_FROM, BOUND_TO, USED_IN, PAGINATES, BUILDS_UP, FOUND_IN, DESCRIBES, ADD_TO, CREATES, FIX, COMPLETES]
  58. 58. START
        START dev=node:user(name="Andres")
      • lazy source of identifiers bound to nodes and relationships • each identifier is an iterable • spawns execution per value • cross product between multiple identifiers • index lookup or direct lookup
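
The "cross product between multiple identifiers" point can be illustrated with a small Python sketch. The function name `start_rows` and the sample bindings are hypothetical. Note one difference from the slide: `itertools.product` reads each input iterable completely before producing rows, so only the output side is lazy here.

```python
from itertools import product

def start_rows(bindings):
    """Yield one row per combination of the bound values.

    bindings: dict mapping an identifier to an iterable of nodes,
    for example the result of an index lookup or a direct id lookup.
    """
    names = list(bindings)
    for combo in product(*(bindings[name] for name in names)):
        yield dict(zip(names, combo))

# Two identifiers with 2 and 1 bound nodes give 2 * 1 = 2 rows.
rows = list(start_rows({"london": [1, 7], "moscow": [2]}))
```

Each resulting row then spawns one execution of the rest of the query, as the slide describes.
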
  59. 59. MATCH
        MATCH dev-[:WORKED_ON]->project
      • describe patterns with ASCII art • declare identifiers • paths • var. length paths • graph algorithms • optional relationships • expands results with each found subgraph
  60. 60. Core Activity: Pattern Matching • Match clause with ASCII-art • derive pattern description • bound Nodes and Relationships • finds patterns attached to bound entities • each found Pattern spawns Subgraph Result • recursive pattern matching with backtracking
  61. 61. Pattern Matching • Scala • incremental search with backtracking • allows: variable length paths, filter during matching, optional relationships • powerful but slower
  62. 62. Pattern Matching • Java • existing graph matching library • fast but less capable • integrated with lazy Scala API • NEW Traversal Framework approach • for one or two bound nodes • use Neo4j traversal framework • pattern description & complexity determine Pattern Matcher selection
  63. 63. Implementation • Recursive matching with backtracking
        START x=... MATCH x-->y, x-->z, y-->z, z-->a-->b, z-->b
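
Recursive matching with backtracking, applied to the pattern from the slide above, might look like the following Python sketch. It is not the Scala matcher from the talk: the sample graph, the function name `match_pattern`, and the rule that pattern edges are processed in the order given are all assumptions made for illustration.

```python
# Hypothetical directed graph: node -> set of nodes it points to.
GRAPH = {
    "x0": {"p", "q"},
    "p": {"q"},
    "q": {"r", "s"},
    "r": {"s"},
    "s": set(),
}

# x-->y, x-->z, y-->z, z-->a-->b, z-->b as (source, target) pattern edges.
PATTERN = [("x", "y"), ("x", "z"), ("y", "z"),
           ("z", "a"), ("a", "b"), ("z", "b")]

def match_pattern(graph, pattern, bound):
    """Yield every assignment of pattern identifiers to graph nodes."""

    def candidates(src_id, dst_id, binding):
        """Graph edges that could satisfy one pattern edge right now."""
        src, dst = binding.get(src_id), binding.get(dst_id)
        if src is not None and dst is not None:
            return [(src, dst)] if dst in graph.get(src, ()) else []
        if src is not None:
            return [(src, node) for node in sorted(graph.get(src, ()))]
        if dst is not None:
            return [(node, dst) for node in sorted(graph) if dst in graph[node]]
        raise ValueError("pattern edge is not attached to a bound identifier")

    def solve(index, binding, used_edges):
        if index == len(pattern):
            yield dict(binding)  # every pattern edge is satisfied
            return
        src_id, dst_id = pattern[index]
        for edge in candidates(src_id, dst_id, binding):
            if edge in used_edges:
                continue  # a relationship is matched at most once
            extended = {**binding, src_id: edge[0], dst_id: edge[1]}
            yield from solve(index + 1, extended, used_edges | {edge})
            # Falling through to the next candidate is the backtracking
            # step: the extended binding is simply discarded.

    yield from solve(0, dict(bound), frozenset())

matches = list(match_pattern(GRAPH, PATTERN, {"x": "x0"}))
```

With `x` bound to `x0`, the search first tries `y = p`, succeeds all the way down, then backtracks to `y = q`, which fails because `q` has no edge to `p`. Each successful assignment corresponds to one "Subgraph Result" in the terminology of the earlier slide.
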
  72. 72. WHERE
        WHERE project.name = "Cypher"
      • filters results • single big boolean expression • needs existing identifiers to work with • much like SQL
  73. 73. Expressions • expressions compute values • composite Specification Pattern • input is ExecutionContext Map • have name • declare symbolic dependencies • self & composite • directly derived from parser • probably rewritten in between
  74. 74. Predicates • WHERE clause filters results with a single composed predicate • boolean expressions • composable boolean algebra • Patterns as predicates • Quantifiers (ALL, NONE, SINGLE) • Collections as predicates
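
One way to read "composable boolean algebra" and the quantifiers is as small functions that take a result row and return a boolean. The Python sketch below uses hypothetical names throughout (`prop`, `and_`, `or_`, `not_`, `all_of`, `none_of`, `single_of`, `where`) and plain dictionaries in place of the ExecutionContext map. It re-creates the `all(city in nodes(path) where city.capital)` filter from the Closures / Quantifiers slide.

```python
def prop(identifier, key, expected):
    """Predicate: the named entity has property key equal to expected."""
    return lambda row: row[identifier].get(key) == expected

def and_(*predicates):
    return lambda row: all(p(row) for p in predicates)

def or_(*predicates):
    return lambda row: any(p(row) for p in predicates)

def not_(predicate):
    return lambda row: not predicate(row)

def all_of(collection, item_predicate):
    """ALL quantifier: every item of the collection satisfies the predicate."""
    return lambda row: all(item_predicate(i) for i in collection(row))

def none_of(collection, item_predicate):
    """NONE quantifier: no item satisfies the predicate."""
    return lambda row: not any(item_predicate(i) for i in collection(row))

def single_of(collection, item_predicate):
    """SINGLE quantifier: exactly one item satisfies the predicate."""
    return lambda row: sum(1 for i in collection(row) if item_predicate(i)) == 1

def where(rows, predicate):
    """Lazily keep only the rows for which the composed predicate holds."""
    return (row for row in rows if predicate(row))

# Hypothetical sample data: two candidate paths from London to Moscow.
london = {"name": "London", "capital": True}
paris = {"name": "Paris", "capital": True}
lyon = {"name": "Lyon", "capital": False}
moscow = {"name": "Moscow", "capital": True}
rows = [{"path": [london, paris, moscow]}, {"path": [london, lyon, moscow]}]

nodes_of_path = lambda row: row["path"]
only_capitals = all_of(nodes_of_path, lambda city: city.get("capital", False))
kept = list(where(rows, only_capitals))
```

Because every predicate has the same shape, the WHERE clause can be represented as a single composed predicate, as the slide states.
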
  75. 75. RETURN
        RETURN project, collect(idea), count(fun)
      • determines what to return • column names or aliases with AS • automatic aggregation • when aggregation function exists • all non-aggregated values are grouping keys • laziness is killed with aggregation
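
The automatic grouping rule can be sketched as follows: every returned column that is not an aggregate becomes part of the grouping key. This is a Python illustration with hypothetical names (`return_with_aggregation`, `collect`, `count`) and made-up sample rows. It assumes, as Cypher does, that `count(expr)` and `collect(expr)` skip null values.

```python
def collect(name):
    """Aggregate: list of the non-null values of a column."""
    return lambda members: [m[name] for m in members if m[name] is not None]

def count(name):
    """Aggregate: number of non-null values of a column."""
    return lambda members: sum(1 for m in members if m[name] is not None)

def return_with_aggregation(rows, group_keys, aggregates):
    """Group rows by the non-aggregated columns, then apply the aggregates."""
    groups = {}
    for row in rows:  # the whole input is consumed before any output
        key = tuple(row[k] for k in group_keys)
        groups.setdefault(key, []).append(row)
    for key, members in groups.items():
        result = dict(zip(group_keys, key))
        for column, fn in aggregates.items():
            result[column] = fn(members)
        yield result

# RETURN project, collect(idea), count(fun) over hypothetical rows.
rows = [
    {"project": "Cypher", "idea": "ascii art", "fun": 1},
    {"project": "Cypher", "idea": "pipes", "fun": None},
    {"project": "Neo4j", "idea": "graphs", "fun": 1},
]
result = list(return_with_aggregation(
    rows,
    group_keys=["project"],
    aggregates={"collect(idea)": collect("idea"), "count(fun)": count("fun")},
))
```

The first loop has to read every input row before the first group can be emitted, which is why the slide notes that aggregation ends lazy evaluation.
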
  76. 76. SKIP LIMIT ORDER BY
        SKIP 5 LIMIT 10
        ORDER BY length(ideas) DESC
      • the usual suspects • laziness is killed with ordering
  77. 77. Mutation • Transaction • WITH (Separation) • Idempotence • DRY • Mass Data Handling • Laziness
  78. 78. Mutation • need transactions • must not disturb reads / traversals • separation of read and write query parts • explicit context change and scope: WITH • implicit split for simple queries • granularity of mutation aligned with # of executions • work with parameters
  79. 79. Mutation - Impl. • Query Builder creates UpdateCommands • UpdatePipe ?? • Transaction Scope for whole Query • collect statistics • create new entities • add identifiers to the ExecutionContext • need to iterate through to get all updates executed (no laziness), special Pipe • have to track already deleted entities
  80. 80. Implementation: Mutating Query Parts • WITH • CREATE • RELATE • SET, DELETE • FOREACH
  81. 81. [Diagram: graph of Cypher query parts. Node labels: START, WHERE, Aggregation, DELETE, Identifier, Result, SKIP/LIMIT, Graph, Pattern, MATCH, CREATE, RELATE. Relationship labels: BINDS, FILTERS, RESTRUCTURES, REMOVE_FROM, BOUND_TO, USED_IN, PAGINATES, BUILDS_UP, FOUND_IN, DESCRIBES, ADD_TO, CREATES, FIX, COMPLETES]
  82. 82. WITH
        WITH me, count(friend) as friends
      • syntax like RETURN • separates query parts like a pipe • declares a new scope with new identifiers • all other ids will be gone • useful for HAVING • spawns a new ExecutionContext • continue with read or write part
  83. 83. CREATE
        CREATE me = {name: "Michael"}
      • create new nodes or relationships • can work with map params (or Iterables thereof) • assign new identifiers • can create full paths
  84. 84. RELATE
        RELATE posts-[:POST]->(p {title: ".."})
      • FIX the graph • construct missing relationships and nodes • need at least one bound node • match given properties • try to advance (from multiple sides) and create missing stuff then iterate
  85. 85. DELETE
        DELETE n, rel, m.prop
      • as expected • idempotent deletion • can only delete unconnected nodes • so delete relationships first
        START n=node(*) MATCH n-[r?]-() DELETE n,r
  86. 86. SET
        SET n.name = "Father of " + m.name
      • can work with arbitrary expressions • use coalesce for idempotent defaults • can mass assign with map-parameters
  87. 87. FOREACH
        FOREACH ( f in new_friends : RELATE me-[:FRIEND]->f)
      • iterable loop for mutating operations • saves a lot of repetitive code • wraps current execution context in a temporary proxy
  88. 88. Self Documenting Tests • Provides • Title & Description • Define Sample Graph • Declare & Execute Query • Results
  89. 89. Self Documenting Tests • Asserts • No Syntax Errors • Multiple Cypher Versions • No Execution Errors • Results • Resulting graph
  90. 90. Self Documenting Tests • Generates • Graph Rendering Graph-Viz • Tabular Results • Ascii-doc for Documentation • Live-Console Integration
  91. 91. Self Documenting Tests • Good enough for • Fun to write • Integration Tests • Manual • Blog-ready Cookbook Examples
  92. 92. Self Documenting Tests
  93. 93. Cypher Console(s) • Integrated in Neo4j-Shell • Integrated in Webadmin (Shell & Search) • REPL & GIST - console.neo4j.org • Interactive learning GCLI • Scala spray-can & cypher = bansky • py2neo, bash • Google Document
  94. 94. Live Console • In Memory GDBs in Web Session • Set up with mutating Cypher (or Geoff) • Executes Cypher (also mutating) • Visualizes Graph & Query Results (d3) • Multiple Cypher Versions • Share: short link, tweet, yUML • Embeddable <iframe> • Live Console for Docs
  95. 95. The Rabbithole http://console.neo4j.org This Graph: http://tinyurl.com/7cnvmlq
  96. 96. Homework - Apply your Wits • Go to http://bit.ly/geekout-cypher • With your new Cypher knowledge • Add yourself to the graph • Determine a path from you to me • Share the console on Twitter by Monday • Win this AR-Drone
  97. 97. Graph your World • Neo4j.org • Google Group • Grab your own ego-Graph
  98. 98. Thanks for Listening! Questions? michael.hunger@neotechnology.com @mesirii
