Shutl delivers with Neo4j
Tuesday, 30 July 13
Volker Pacher
senior developer @shutl
@vpacher
http://github.com/vpacher
Problems?
http://xkcd.com/287/
Problems with our previous attempt (v1):
• exponential growth of joins in MySQL as features were added
• code base too complex and unmaintainable
• API response times grew as more data was added
• our fastest delivery was quicker than our slowest query!
The case for graph databases:
• relationships are stored explicitly (an RDBMS has no first-class relationships)
• domain modelling is simplified because adding new ‘subgraphs’ doesn’t affect the existing structure and queries (additive model)
• whiteboard friendly
• schema-less
• database performance remains relatively constant as data grows, because a query is localized to its own portion of the graph: roughly O(1) for the same query
• traversals of relationships are easy and very fast
What is a graph anyway?
a collection of vertices (nodes) connected by edges (relationships)
[diagram: Node 1, Node 2, Node 3, Node 4]
a short history: the seven bridges of Königsberg (1735)
Leonhard Euler
directed graph
each relationship has a direction, i.e. one start node and one end node
[diagram: Node 1, Node 2, Node 3, Node 4]
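As a rough illustration of this definition, a directed graph can be held as a set of (start, end) pairs. The node names match the slide; the edges themselves are invented for the example, since the diagram's edges are not recoverable here.

```python
# Directed graph as a set of (start, end) pairs.
# The edges are hypothetical; only the node names come from the slide.
edges = {
    ("Node 1", "Node 2"),
    ("Node 1", "Node 3"),
    ("Node 3", "Node 4"),
    ("Node 4", "Node 1"),
}


def outgoing(node):
    """Nodes reached by following one relationship away from `node`."""
    return sorted(end for start, end in edges if start == node)


def incoming(node):
    """Nodes that have a relationship pointing at `node`."""
    return sorted(start for start, end in edges if end == node)


print(outgoing("Node 1"))
print(incoming("Node 1"))
```

Because every pair has a fixed start and end, "outgoing" and "incoming" can give different answers for the same node.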
property graph:
• nodes contain properties (key, value)
• relationships have a type and are always directed
• relationships can contain properties too
[diagram: nodes name: Volker, name: Sam, name: Megan, name: Paul; relationship types :friends, :knows, :works_for; relationship property since: 2005]
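The three rules of the property graph model can be sketched with two small Python classes. This is only an illustration of the model, not of how Neo4j stores data; the class names `Node` and `Relationship` and the choice of which two people the relationship joins are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Nodes contain properties (key, value).
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node  # relationships are always directed:
    end: Node    # exactly one start node and one end node
    type: str    # every relationship has a type
    # Relationships can contain properties too.
    properties: dict = field(default_factory=dict)


volker = Node({"name": "Volker"})
sam = Node({"name": "Sam"})

# Which nodes this relationship joins is an assumption for the example.
knows = Relationship(volker, sam, "knows", {"since": 2005})

print(knows.start.properties["name"], knows.type, knows.end.properties["name"])
```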
a graph is its own index (constant query performance)
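One common reading of this slide is index-free adjacency: each node holds direct references to its neighbours, so following a relationship needs no global lookup. The sketch below contrasts that with scanning a join table. The data and function names are made up for illustration and say nothing about Neo4j's actual storage format.

```python
# Relational style: one global table of (person, friend) rows.
# Finding someone's friends has to search the whole table.
friend_rows = [("me", "Steve"), ("me", "Sam"), ("Sam", "Megan")]


def friends_via_table(person):
    return [friend for owner, friend in friend_rows if owner == person]


# Graph style: every node carries its own list of neighbours, so the work
# depends on how many neighbours the node has, not on total graph size.
# The dict lookup stands in for already holding a reference to the node.
adjacency = {"me": ["Steve", "Sam"], "Sam": ["Megan"], "Steve": [], "Megan": []}


def friends_via_graph(person):
    return adjacency[person]


# Both give the same answer; only the cost model differs.
print(friends_via_table("me") == friends_via_graph("me"))
```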
Querying the graph: Cypher
• declarative query language specific to Neo4j
• easy to learn and intuitive
• lets the user describe the pattern to query for (something that looks like ‘this’)
• inspired partly by SQL (WHERE and ORDER BY) and SPARQL (pattern matching)
• focuses on what to query for and not how to query for it
• makes the switch from a MySQL world easier: Cypher can be used without having to learn a traversal framework straight away
Cypher clauses:
• START: Starting points in the graph, obtained via index lookups or by element IDs.
• MATCH: The graph pattern to match, bound to the starting points in START.
• WHERE: Filtering criteria.
• RETURN: What to return.
• CREATE: Creates nodes and relationships.
• DELETE: Removes nodes, relationships and properties.
• SET: Set values to properties.
• FOREACH: Performs updating actions once per element in a list.
• WITH: Divides a query into multiple, distinct parts.
an example graph
• Node 1: me
• Node 2: Steve
• Node 3: Sam
• Node 4: David
• Node 5: Megan
me - [:knows] -> Steve - [:knows] -> David
me - [:knows] -> Sam - [:knows] -> Megan
Megan - [:knows] -> David
the query: find friends of friends
START me=node(1)
MATCH me-[:knows]->()-[:knows]->fof
RETURN fof
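To make the pattern concrete, here is a plain-Python sketch of the same two-hop match over the example graph. It only mimics what the pattern asks for; the dictionary layout and the function name `friends_of_friends` are assumptions for the illustration, not anything Neo4j provides.

```python
# The example graph's :knows relationships as adjacency lists.
knows = {
    "me": ["Steve", "Sam"],
    "Steve": ["David"],
    "Sam": ["Megan"],
    "Megan": ["David"],
    "David": [],
}


def friends_of_friends(start):
    """Follow exactly two outgoing :knows relationships from `start`."""
    return [fof for friend in knows[start] for fof in knows[friend]]


print(friends_of_friends("me"))  # ['David', 'Megan']
```

David is reached via Steve and Megan via Sam; the longer route to David through Megan is three hops, so this pattern does not pick it up.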
START me=node(1)
MATCH me-[:knows*2..]->fof
WHERE fof.name =~ 'Da.*'
RETURN fof
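The `*2..` in this pattern means "two or more :knows hops", and `=~` applies a regular expression to the name. A rough Python equivalent over the example graph is sketched below. The helper `paths_from` is hypothetical, and it avoids revisiting nodes as a simplification (the example graph has no cycles, so this does not change the result here).

```python
import re

# The example graph's :knows relationships as adjacency lists.
knows = {
    "me": ["Steve", "Sam"],
    "Steve": ["David"],
    "Sam": ["Megan"],
    "Megan": ["David"],
    "David": [],
}


def paths_from(start, min_hops):
    """Yield every path of at least `min_hops` outgoing :knows hops."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if len(path) - 1 >= min_hops:
            yield path
        for neighbour in knows[path[-1]]:
            if neighbour not in path:  # simplification: never revisit a node
                stack.append(path + [neighbour])


# Keep only paths whose end node's name matches the whole pattern 'Da.*'.
matches = [
    path for path in paths_from("me", 2)
    if re.fullmatch(r"Da.*", path[-1])
]

for path in matches:
    print(" -> ".join(path))
```

David shows up twice because two different paths reach him: one of length two via Steve and one of length three via Sam and Megan.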
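representing dates/times
[diagram: a tree starting at root (0). Relationships typed by value (2013, 2014) lead from the root to Year nodes, relationships typed 01, 05 and 06 lead to Month nodes, and relationships typed 24, 25 and 26 lead to Day nodes. Day nodes are linked to Event 1, Event 2 and Event 3 by happens relationships.]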
find all events on a specific day
START root=node(0)
MATCH root-[:`2013`]-()-[:`05`]-()-[:`24`]-()-[:happens]-event
RETURN event
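The idea behind the date tree can be sketched with nested dictionaries, where each key plays the role of a relationship type on the path from the root. Which event sits under which day is invented for the example, and `events_on` is a hypothetical helper name.

```python
# Each level is keyed by the relationship type that leads to it.
# The placement of events under dates is invented for the example.
root = {
    "2013": {"05": {"24": ["Event 1"], "25": ["Event 2"]}},
    "2014": {"06": {"26": ["Event 3"]}},
}


def events_on(year, month, day):
    """Walk root -> year -> month -> day; return [] if any step is missing."""
    node = root
    for key in (year, month, day):
        node = node.get(key)
        if node is None:
            return []
    return node


print(events_on("2013", "05", "24"))  # ['Event 1']
```

A lookup always takes three steps from the root, however many events are stored elsewhere in the tree.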
QUESTIONS?
Volker Pacher
volker@shutl.com
www.shutl.com