Ian	
  Robinson,	
  Neo	
  Technology	
  

Graph	
  Search:	
  The	
  Power	
  of	
  
Connected	
  Data	
  
Twi$er	
  
@iansrobinson	
  
#neo4j	
  
Outline	
  
•  Data	
  complexity	
  
•  Graph	
  databases	
  –	
  features	
  and	
  benefits	
  
•  Querying	
  graph	
  data	
  
Data	
  Complexity	
  

complexity = f(size, semi-structure, connectedness)
Data	
  Complexity	
  

complexity = f(size

, semi-structure, connectedness)
Semi-­‐Structure	
  
Semi-­‐Structure	
  
USER_ID	
   FIRST_NAME	
   LAST_NAME	
  
315	
  

Ian	
  

Robinson	
  

EMAIL_1	
  

EMAIL_2	
  

ian@neotechnology.com	
   iansrobinson@gmail.com	
  

Email:	
  ian@neotechnology.com	
  
Email:	
  iansrobinson@gmail.com	
  
Twi$er:	
  @iansrobinson	
  
Skype:	
  iansrobinson	
  

0..n	
  

USER	
  

FACEBOOK	
  
NULL	
  

TWITTER	
  

SKYPE	
  

@iansrobinson	
   iansrobinson	
  

CONTACT	
  

CONTACT_TYPE	
  
Social	
  Network	
  
Network	
  Impact	
  Analysis	
  
Route	
  Finding	
  
Recommenda]ons	
  
Logis]cs	
  
Access	
  Control	
  
Fraud	
  Analysis	
  
Securi]es	
  and	
  Debt	
  

Image:	
  orgnet.com	
  
Graphs	
  Are	
  Everywhere	
  
Graph	
  Databases	
  
•  Store	
  
•  Manage	
  
•  Query	
  

data	
  
Neo4j	
  is	
  a	
  Graph	
  Database	
  
Labeled	
  Property	
  Graph	
  
The	
  Power	
  of	
  Rela]onships	
  
Connectedness	
  
•  Pre-­‐computed	
  joins	
  	
  
•  Several	
  million	
  joins	
  per	
  second	
  per	
  thread	
  per	
  
core	
  
Variable	
  Structure	
  
•  Defined	
  with	
  regard	
  to	
  node	
  instances,	
  not	
  
classes	
  of	
  nodes	
  
–  Contrast	
  with	
  rela]onal	
  schemas,	
  where	
  foreign	
  
key	
  rela]onships	
  apply	
  to	
  all	
  rows	
  in	
  a	
  table	
  
Graph	
  Database	
  Benefits	
  
“Minutes	
  to	
  milliseconds”	
  performance	
  
•  Millions	
  of	
  ‘joins’	
  per	
  second	
  
•  Consistent	
  query	
  ]mes	
  as	
  dataset	
  grows	
  
Fit	
  for	
  the	
  domain	
  
•  Lots	
  of	
  join	
  tables?	
  Connectedness	
  
•  Lots	
  of	
  sparse	
  tables?	
  Semi-­‐structure	
  
Business	
  responsiveness	
  
•  Easy	
  to	
  evolve	
  
Querying	
  Graph	
  Data	
  
•  Describing	
  graphs	
  
•  Crea]ng	
  nodes,	
  rela]onships	
  and	
  proper]es	
  
•  Querying	
  graphs	
  
How	
  to	
  Describe	
  a	
  Graph?	
  
Cypher	
  Padern	
  

(rest)<-[:HAS_SKILL]-(ben)-[:HAS_SKILL]->(neo4j),	
(ben)-[:WORKS_FOR]->(acme)
Create	
  Some	
  Data	
  
CREATE (ben:Person { name:'Ben' }),	
(acme:Company { name:'Acme' }),	
(rest:Skill { name:'REST' }),	
(neo4j:Skill= { name:'Neo4j' }),	
(ben)-[:WORKS_FOR]->(acme),	
(ben)-[:HAS_SKILL]->(rest),	
(ben)-[:HAS_SKILL]->(graphs)	
RETURN ben
Create	
  Nodes	
  
CREATE (ben:Person { name:'Ben' }),	
(acme:Company { name:'Acme' }),	
(rest:Skill { name:'REST' }),	
(neo4j:Skill= { name:'Neo4j' }),	
(ben)-[:WORKS_FOR]->(acme),	
(ben)-[:HAS_SKILL]->(rest),	
(ben)-[:HAS_SKILL]->(graphs)	
Iden]fier	
  
Proper]es	
  
RETURN ben	
Label	
  
Create	
  Rela]onships	
  
CREATE (ben:Person { name:'Ben' }),	
Iden]fier	
  
Iden]fier	
  
(acme:Company { name:'Acme' }),	
Rela]onship	
  
(rest:Skill { name:'REST' }),	
(neo4j:Skill= { name:'Neo4j' }),	
(ben)-[:WORKS_FOR]->(acme),	
(ben)-[:HAS_SKILL]->(rest),	
(ben)-[:HAS_SKILL]->(graphs)	
RETURN ben
Return	
  Node	
  
CREATE (ben:Person { name:'Ben' }),	
(acme:Company { name:'Acme' }),	
(rest:Skill { name:'REST' }),	
(neo4j:Skill= { name:'Neo4j' }),	
(ben)-[:WORKS_FOR]->(acme),	
(ben)-[:HAS_SKILL]->(rest),	
(ben)-[:HAS_SKILL]->(graphs)	
RETURN ben
Querying	
  a	
  Graph	
  
Graph	
  Local	
  
•  Find	
  one	
  or	
  more	
  start	
  nodes	
  
•  Explore	
  surrounding	
  graph	
  
•  Millions	
  of	
  hops	
  per	
  second	
  
Which	
  people,	
  who	
  work	
  for	
  the	
  same	
  
company	
  as	
  me,	
  share	
  my	
  skills?	
  
Cypher	
  Padern	
  

(company)<-[:WORKS_FOR]-(me)-[:HAS_SKILL]->(skill),	
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
Cypher	
  Query	
  
Which	
  people,	
  who	
  work	
  for	
  the	
  same	
  company	
  
as	
  me,	
  have	
  similar	
  skills	
  to	
  me?	
  
	
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),	
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)	
WHERE me.name = 'ian'	
RETURN colleague.name AS name,	
count(skill) AS score,	
collect(skill.name) AS skills	
ORDER BY score DESC
Graph	
  Padern	
  
Which	
  people,	
  who	
  work	
  for	
  the	
  same	
  company	
  
as	
  me,	
  have	
  similar	
  skills	
  to	
  me?	
  
	
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),	
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)	
WHERE me.name = 'ian'	
RETURN colleague.name AS name,	
count(skill) AS score,	
collect(skill.name) AS skills	
ORDER BY score DESC
Anchor	
  Padern	
  in	
  Graph	
  
Which	
  people,	
  who	
  work	
  for	
  the	
  same	
  company	
  
as	
  me,	
  have	
  similar	
  skills	
  to	
  me?	
  
	
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),	
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)	
WHERE me.name = 'ian'	
RETURN colleague.name AS name,	
count(skill) AS score,	
collect(skill.name) AS skills	
ORDER BY score DESC	

Search	
  nodes	
  labelled	
  
‘Person’,	
  matching	
  on	
  
‘name’	
  property	
  
Create	
  Results	
  
Which	
  people,	
  who	
  work	
  for	
  the	
  same	
  company	
  
as	
  me,	
  have	
  similar	
  skills	
  to	
  me?	
  
	
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),	
(company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)	
WHERE me.name = 'ian'	
RETURN colleague.name AS name,	
count(skill) AS score,	
collect(skill.name) AS skills	
ORDER BY score DESC
Results	
  
+--------------------------------------+	
| name
| score | skills
|	
+--------------------------------------+	
| "Ben"
| 2
| ["Neo4j","REST"] |	
| "Charlie" | 1
| ["Neo4j"]
|	
+--------------------------------------+	
2 rows
Case	
  Studies	
  
Network	
  Impact	
  Analysis	
  
•  Which	
  parts	
  of	
  network	
  
does	
  a	
  customer	
  
depend	
  on?	
  
•  Who	
  will	
  be	
  affected	
  if	
  
we	
  replace	
  a	
  network	
  
element?	
  
Asset	
  Management	
  &	
  Access	
  Control	
  
•  Which	
  assets	
  can	
  an	
  
admin	
  control?	
  
•  Who	
  can	
  change	
  my	
  
subscrip]on?	
  
Logis]cs	
  
•  What’s	
  the	
  quickest	
  
delivery	
  route	
  for	
  this	
  
parcel?	
  
Social	
  Network	
  &	
  Recommenda]ons	
  
•  Which	
  assets	
  can	
  I	
  
access?	
  
•  Who	
  shares	
  my	
  
interests?	
  
graphdatabases.com	
  
ts gy
en lo
i m no
pl h
m ec
Co eo T
N
of

Graph
h
Databases
Ian Robinson,
Jim Webber & Emil Eifrem

Thank	
  you	
  
@ianSrobinson	
  
ian@neotechnology.com	
  
	
  
	
  

Graph Search: The Power of Connected Data

  • 1.
    Ian  Robinson,  Neo  Technology   Graph  Search:  The  Power  of   Connected  Data   Twi$er   @iansrobinson   #neo4j  
  • 2.
    Outline   •  Data  complexity   •  Graph  databases  –  features  and  benefits   •  Querying  graph  data  
  • 3.
    Data  Complexity   complexity= f(size, semi-structure, connectedness)
  • 4.
    Data  Complexity   complexity= f(size , semi-structure, connectedness)
  • 5.
  • 6.
    Semi-­‐Structure   USER_ID  FIRST_NAME   LAST_NAME   315   Ian   Robinson   EMAIL_1   EMAIL_2   ian@neotechnology.com   iansrobinson@gmail.com   Email:  ian@neotechnology.com   Email:  iansrobinson@gmail.com   Twi$er:  @iansrobinson   Skype:  iansrobinson   0..n   USER   FACEBOOK   NULL   TWITTER   SKYPE   @iansrobinson   iansrobinson   CONTACT   CONTACT_TYPE  
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
    Securi]es  and  Debt   Image:  orgnet.com  
  • 15.
  • 16.
    Graph  Databases   • Store   •  Manage   •  Query   data  
  • 17.
    Neo4j  is  a  Graph  Database  
  • 18.
  • 19.
    The  Power  of  Rela]onships   Connectedness   •  Pre-­‐computed  joins     •  Several  million  joins  per  second  per  thread  per   core   Variable  Structure   •  Defined  with  regard  to  node  instances,  not   classes  of  nodes   –  Contrast  with  rela]onal  schemas,  where  foreign   key  rela]onships  apply  to  all  rows  in  a  table  
  • 20.
    Graph  Database  Benefits   “Minutes  to  milliseconds”  performance   •  Millions  of  ‘joins’  per  second   •  Consistent  query  ]mes  as  dataset  grows   Fit  for  the  domain   •  Lots  of  join  tables?  Connectedness   •  Lots  of  sparse  tables?  Semi-­‐structure   Business  responsiveness   •  Easy  to  evolve  
  • 21.
    Querying  Graph  Data   •  Describing  graphs   •  Crea]ng  nodes,  rela]onships  and  proper]es   •  Querying  graphs  
  • 22.
    How  to  Describe  a  Graph?  
  • 23.
  • 24.
    Create  Some  Data   CREATE (ben:Person { name:'Ben' }), (acme:Company { name:'Acme' }), (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) RETURN ben
  • 25.
    Create  Nodes   CREATE(ben:Person { name:'Ben' }), (acme:Company { name:'Acme' }), (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) Iden]fier   Proper]es   RETURN ben Label  
  • 26.
    Create  Rela]onships   CREATE(ben:Person { name:'Ben' }), Iden]fier   Iden]fier   (acme:Company { name:'Acme' }), Rela]onship   (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) RETURN ben
  • 27.
    Return  Node   CREATE(ben:Person { name:'Ben' }), (acme:Company { name:'Acme' }), (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) RETURN ben
  • 29.
    Querying  a  Graph   Graph  Local   •  Find  one  or  more  start  nodes   •  Explore  surrounding  graph   •  Millions  of  hops  per  second  
  • 30.
    Which  people,  who  work  for  the  same   company  as  me,  share  my  skills?  
  • 31.
  • 32.
    Cypher  Query   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC
  • 33.
    Graph  Padern   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC
  • 34.
    Anchor  Padern  in  Graph   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC Search  nodes  labelled   ‘Person’,  matching  on   ‘name’  property  
  • 35.
    Create  Results   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC
  • 37.
    Results   +--------------------------------------+ | name |score | skills | +--------------------------------------+ | "Ben" | 2 | ["Neo4j","REST"] | | "Charlie" | 1 | ["Neo4j"] | +--------------------------------------+ 2 rows
  • 38.
  • 39.
    Network  Impact  Analysis   •  Which  parts  of  network   does  a  customer   depend  on?   •  Who  will  be  affected  if   we  replace  a  network   element?  
  • 40.
    Asset  Management  &  Access  Control   •  Which  assets  can  an   admin  control?   •  Who  can  change  my   subscrip]on?  
  • 41.
    Logis]cs   •  What’s  the  quickest   delivery  route  for  this   parcel?  
  • 42.
    Social  Network  &  Recommenda]ons   •  Which  assets  can  I   access?   •  Who  shares  my   interests?  
  • 43.
    graphdatabases.com   ts gy enlo i m no pl h m ec Co eo T N of Graph h Databases Ian Robinson, Jim Webber & Emil Eifrem Thank  you   @ianSrobinson   ian@neotechnology.com