Graph Search: The Power of Connected Data

740 views

Published on

Today’s complex data is big, variably-structured and densely connected. In this session we’ll look at how size, structure and connectedness have converged to change the way we work with data. We’ll then go on to look at some of the new opportunities for creating end-user value that have emerged in a world of connected data, illustrated with graph search examples implemented using the Neo4j graph database.

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
740
On SlideShare
0
From Embeds
0
Number of Embeds
60
Actions
Shares
0
Downloads
22
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Graph Search: The Power of Connected Data

  1. 1. Ian  Robinson,  Neo  Technology   Graph  Search:  The  Power  of   Connected  Data   Twi$er   @iansrobinson   #neo4j  
  2. 2. Outline   •  Data  complexity   •  Graph  databases  –  features  and  benefits   •  Querying  graph  data  
  3. 3. Data  Complexity   complexity = f(size, semi-structure, connectedness)
  4. 4. Data  Complexity   complexity = f(size , semi-structure, connectedness)
  5. 5. Semi-­‐Structure  
  6. 6. Semi-­‐Structure   USER_ID   FIRST_NAME   LAST_NAME   315   Ian   Robinson   EMAIL_1   EMAIL_2   ian@neotechnology.com   iansrobinson@gmail.com   Email:  ian@neotechnology.com   Email:  iansrobinson@gmail.com   Twi$er:  @iansrobinson   Skype:  iansrobinson   0..n   USER   FACEBOOK   NULL   TWITTER   SKYPE   @iansrobinson   iansrobinson   CONTACT   CONTACT_TYPE  
  7. 7. Social  Network  
  8. 8. Network  Impact  Analysis  
  9. 9. Route  Finding  
  10. 10. Recommenda]ons  
  11. 11. Logis]cs  
  12. 12. Access  Control  
  13. 13. Fraud  Analysis  
  14. 14. Securi]es  and  Debt   Image:  orgnet.com  
  15. 15. Graphs  Are  Everywhere  
  16. 16. Graph  Databases   •  Store   •  Manage   •  Query   data  
  17. 17. Neo4j  is  a  Graph  Database  
  18. 18. Labeled  Property  Graph  
  19. 19. The  Power  of  Rela]onships   Connectedness   •  Pre-­‐computed  joins     •  Several  million  joins  per  second  per  thread  per   core   Variable  Structure   •  Defined  with  regard  to  node  instances,  not   classes  of  nodes   –  Contrast  with  rela]onal  schemas,  where  foreign   key  rela]onships  apply  to  all  rows  in  a  table  
  20. 20. Graph  Database  Benefits   “Minutes  to  milliseconds”  performance   •  Millions  of  ‘joins’  per  second   •  Consistent  query  ]mes  as  dataset  grows   Fit  for  the  domain   •  Lots  of  join  tables?  Connectedness   •  Lots  of  sparse  tables?  Semi-­‐structure   Business  responsiveness   •  Easy  to  evolve  
  21. 21. Querying  Graph  Data   •  Describing  graphs   •  Crea]ng  nodes,  rela]onships  and  proper]es   •  Querying  graphs  
  22. 22. How  to  Describe  a  Graph?  
  23. 23. Cypher  Padern   (rest)<-[:HAS_SKILL]-(ben)-[:HAS_SKILL]->(neo4j), (ben)-[:WORKS_FOR]->(acme)
  24. 24. Create  Some  Data   CREATE (ben:Person { name:'Ben' }), (acme:Company { name:'Acme' }), (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) RETURN ben
  25. 25. Create  Nodes   CREATE (ben:Person { name:'Ben' }), (acme:Company { name:'Acme' }), (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) Iden]fier   Proper]es   RETURN ben Label  
  26. 26. Create  Rela]onships   CREATE (ben:Person { name:'Ben' }), Iden]fier   Iden]fier   (acme:Company { name:'Acme' }), Rela]onship   (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) RETURN ben
  27. 27. Return  Node   CREATE (ben:Person { name:'Ben' }), (acme:Company { name:'Acme' }), (rest:Skill { name:'REST' }), (neo4j:Skill= { name:'Neo4j' }), (ben)-[:WORKS_FOR]->(acme), (ben)-[:HAS_SKILL]->(rest), (ben)-[:HAS_SKILL]->(graphs) RETURN ben
  28. 28. Querying  a  Graph   Graph  Local   •  Find  one  or  more  start  nodes   •  Explore  surrounding  graph   •  Millions  of  hops  per  second  
  29. 29. Which  people,  who  work  for  the  same   company  as  me,  share  my  skills?  
  30. 30. Cypher  Padern   (company)<-[:WORKS_FOR]-(me)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
  31. 31. Cypher  Query   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC
  32. 32. Graph  Padern   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC
  33. 33. Anchor  Padern  in  Graph   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC Search  nodes  labelled   ‘Person’,  matching  on   ‘name’  property  
  34. 34. Create  Results   Which  people,  who  work  for  the  same  company   as  me,  have  similar  skills  to  me?   MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill), (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill) WHERE me.name = 'ian' RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills ORDER BY score DESC
  35. 35. Results   +--------------------------------------+ | name | score | skills | +--------------------------------------+ | "Ben" | 2 | ["Neo4j","REST"] | | "Charlie" | 1 | ["Neo4j"] | +--------------------------------------+ 2 rows
  36. 36. Case  Studies  
  37. 37. Network  Impact  Analysis   •  Which  parts  of  network   does  a  customer   depend  on?   •  Who  will  be  affected  if   we  replace  a  network   element?  
  38. 38. Asset  Management  &  Access  Control   •  Which  assets  can  an   admin  control?   •  Who  can  change  my   subscrip]on?  
  39. 39. Logis]cs   •  What’s  the  quickest   delivery  route  for  this   parcel?  
  40. 40. Social  Network  &  Recommenda]ons   •  Which  assets  can  I   access?   •  Who  shares  my   interests?  
  41. 41. graphdatabases.com   ts gy en lo i m no pl h m ec Co eo T N of Graph h Databases Ian Robinson, Jim Webber & Emil Eifrem Thank  you   @ianSrobinson   ian@neotechnology.com      

×