Modelling Data as Graphs (Neo4j)

Modelling data in Neo4j for beginners: common mistakes, frequently asked questions, hardware sizing, and a few extra tips.

Published in: Technology

  1. 1. Modelling Data in Neo4j: a few best practices and lessons learned, by Michal Bachman (GraphAware™)
  2. 2. GraphAware TM
  3. 3. Example Domain: ride-sharing website; history of rides; friendships from Facebook. Aim: build trust between users.
  4. 4. Modelling Data as Graphs: There is no single correct way.
  5. 5. Modelling Data as Graphs: Graphs are very whiteboard friendly.
  6. 6. Diagram: four User nodes (name: “Peter”, “Alice”, “Michael”, “Laura”) connected by two DROVE and two FRIEND_OF relationships.
  7. 7. Diagram: a fifth User (name: “Jenny”) is added; there are now three DROVE and two FRIEND_OF relationships.
  8. 8. Diagram: the same graph, with a date property on each DROVE relationship (2014-01-27, 2014-01-29, 2014-01-29).
  9. 9. Diagram: the same graph, with two RODE_TOGETHER relationships added between Users.
  10. 10. Diagram: two Ride nodes (date: 2014-01-27, from: “Brighton”, to: “Hastings”; date: 2014-01-29, from: “London”, to: “Nottingham”) connected to the Users “Jenny”, “Peter”, “Alice”, “Michael” and “Laura” by DRIVER and PASSENGER relationships; the FRIEND_OF relationships between Users remain.
  11. 11. Nodes vs. Relationships: make important concepts in your domain nodes, and you will gain flexibility.
  12. 12. Diagram: the Ride-node model from slide 10 (Users, Rides, DRIVER, PASSENGER, FRIEND_OF).
  13. 13. Diagram: the same model, with two RATED relationships added (rating: 5 and rating: 3).
  14. 14. Diagram: the model from slide 13, showing a single FRIEND_OF relationship.
  15. 15. Diagram: the model from slide 13, showing a single FRIEND_OF relationship.
  16. 16. Bidirectional Relationships: a common mistake
  17. 17. Ice Hockey: Czech Republic DEFEATED Sweden
  18. 18. Ice Hockey: Czech Republic DEFEATED Sweden
  19. 19. Ice Hockey (Implied Relationship): Czech Republic DEFEATED Sweden; Sweden DEFEATED_BY Czech Republic
  20. 20. Ice Hockey (Implied Relationship): Czech Republic DEFEATED Sweden; Sweden DEFEATED_BY Czech Republic
  21. 21. Company Partnership (Naturally Bidirectional): Neo Technology PARTNER GraphAware (drawn twice)
  22. 22. Company Partnership (Naturally Bidirectional): two PARTNER relationships between Neo Technology and GraphAware
  23. 23. Company Partnership (Naturally Bidirectional): two PARTNER relationships between Neo Technology and GraphAware
  24. 24. Company Partnership (Naturally Bidirectional): a single PARTNER relationship between Neo Technology and GraphAware
  25. 25. Company Partnership (Naturally Bidirectional): a single PARTNER relationship between Neo Technology and GraphAware
  26. 26. Traversal Speed: In Neo4j, the speed of traversal does not depend on the direction of the relationships being traversed.
  27. 27. Why?
  28. 28. Neo4j Data Layout. Node Record in the Node Store (9 bytes): inUse flag (first bit), next relationship (35 bits), next property (36 bits). Relationship Record in the Relationship Store (33 bytes): inUse flag (first bit), unused (second bit), first node (35 bits), second node (35 bits), type (16 bits), first node's previous relationship (35 bits), first node's next relationship (35 bits), second node's previous relationship (35 bits), second node's next relationship (35 bits), next property (36 bits).
  29. 29. Traversal APIs: Neo4j APIs allow developers to completely ignore relationship direction when querying the graph.
  30. 30. Cypher: MATCH (neo)-[:PARTNER]->(partner)
  31. 31. Cypher: MATCH (neo)<-[:PARTNER]-(partner)
  32. 32. Cypher: MATCH (neo)-[:PARTNER]-(partner)
  33. 33. Heads Up! Different quality in each direction => should have two relationships! Geeky Guy LOVES Girl; Girl DOESN’T CARE ABOUT Geeky Guy.
  34. 34. Diagram: the model from slide 13 (Users, Rides, DRIVER, PASSENGER, RATED rating: 5, RATED rating: 3, FRIEND_OF).
  35. 35. Diagram: the same model with one rating shown as “?” and five candidate relationship types: HATED, DISLIKED, NEUTRAL, LIKED, LOVED.
  36. 36. Diagram: the same model with the RATED relationships replaced by typed relationships: LOVED (previously rating: 5) and NEUTRAL (previously rating: 3).
  37. 37. Qualifying Relationships: performance comparison
  38. 38. Qualifying by Properties. Diagram: the model from slide 13, with RATED relationships carrying rating: 5 and rating: 3.
  39. 39. Who liked the ride? (Cypher)
          START ride=node({id})
          MATCH (ride)<-[r:RATED]-(passenger)
          WHERE r.rating > 3
          RETURN passenger
  40. 40. Who liked the ride? (Java)
          for (Relationship r : ride.getRelationships(INCOMING, RATED)) {
              if ((int) r.getProperty("rating") > 3) {
                  Node passenger = r.getStartNode(); // do something with it
              }
          }
  41. 41. Qualifying by Relationship Type. Diagram: the model from slide 36, with LOVED and NEUTRAL relationships in place of RATED.
  42. 42. Who liked the ride? (Cypher)
          START ride=node({id})
          MATCH (ride)<-[r:LIKED|LOVED]-(passenger)
          RETURN passenger
  43. 43. Who liked the ride? (Java)
          for (Relationship r : ride.getRelationships(INCOMING, LIKED, LOVED)) {
              Node passenger = r.getStartNode(); // do something with it
          }
  44. 44. GraphAware TM
  45. 45. GraphAware TM
  46. 46. Winner! Diagram: the relationship-type model from slide 36 (LOVED and NEUTRAL in place of RATED).
  47. 47. Other interesting info?
  48. 48. Hardware Sizing: frequently asked question
  49. 49. Neo4j Architecture (diagram). JVM: Other APIs, Core API, Transaction Management, Object Cache. Operating System: File System Cache. HDD: Record Files (Nodes, Relationships, Relationship Types, Properties) and the Neo4j Transaction Log.
  50. 50. Disk Space: > cd data > ls -ah
  51. 51. Disk Space
          drwxr-xr-x  5 bachmanm  wheel  170B 19 Oct 12:56 index
          -rw-r--r--  1 bachmanm  wheel   31K 19 Oct 12:56 messages.log
          -rw-r--r--  1 bachmanm  wheel   69B 19 Oct 12:56 neostore
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.id
          -rw-r--r--  1 bachmanm  wheel  8.8K 19 Oct 12:56 neostore.nodestore.db
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.nodestore.db.id
          -rw-r--r--  1 bachmanm  wheel   39M 19 Oct 12:56 neostore.propertystore.db
          -rw-r--r--  1 bachmanm  wheel  153B 19 Oct 12:56 neostore.propertystore.db.arrays
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.arrays.id
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.id
          -rw-r--r--  1 bachmanm  wheel   43B 19 Oct 12:56 neostore.propertystore.db.index
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.index.id
          -rw-r--r--  1 bachmanm  wheel  140B 19 Oct 12:56 neostore.propertystore.db.index.keys
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.index.keys.id
          -rw-r--r--  1 bachmanm  wheel  154B 19 Oct 12:56 neostore.propertystore.db.strings
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.strings.id
          -rw-r--r--  1 bachmanm  wheel   31M 19 Oct 12:56 neostore.relationshipstore.db
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.relationshipstore.db.id
          -rw-r--r--  1 bachmanm  wheel   38B 19 Oct 12:56 neostore.relationshiptypestore.db
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.relationshiptypestore.db.id
          -rw-r--r--  1 bachmanm  wheel  140B 19 Oct 12:56 neostore.relationshiptypestore.db.names
          -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.relationshiptypestore.db.names.id
  52. 52. Disk Space: node 14B, relationship 33B, property 41B
  53. 53. Disk Space (Example): 1,000 nodes x 14B = 13.7 kB; 1,000,000 rels x 33B = 31.5 MB; 2,010,000 props x 41B = 78.6 MB; TOTAL = 110.1 MB
  54. 54. Low Level Cache: How about low level cache? Any guesses?
  55. 55. Low Level Cache: Same as disk space
  56. 56. High Level Cache: node 344B, relationship 208B, property 116B, ...
  57. 57. Other interesting info?
  58. 58. Java API vs. Cypher: Cypher is great! Cypher is improving. But don’t be afraid of writing some Java.
  59. 59. Conclusion: Experiment, Measure, Analyse, Ask
  60. 60. Thanks! www.graphaware.com @graph_aware
  61. 61. Next meetup: The transport graph, with “Roads, Nodes and Automobiles” (Jacqui Read) and “Transport Network Route Finding Using A Graph” (Ian Cartwright & Ben Earlham). 26th February 2014. Here!
  62. 62. GraphAware TM
  63. 63. Graph Databases, by Ian Robinson, Jim Webber & Emil Eifrem (book cover, with the banner “Compliments of Neo Technology”)
  64. 64. Take me to the pub…
  65. 65. Thanks! www.graphaware.com @graph_aware
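
Slide 28 lists the bit widths of the fields in the node and relationship records. As a quick arithmetic check, the sketch below sums those widths and confirms the 9-byte and 33-byte record sizes quoted on the slide. The class and method names (`RecordLayout`, `bytesFor`) are invented for this illustration and are not part of any Neo4j API; the field order is an assumption, since only the widths affect the total.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not a Neo4j class, just arithmetic on the slide's numbers.
class RecordLayout {

    /** Sums the field widths (in bits) and converts the total to whole bytes. */
    static int bytesFor(Map<String, Integer> fieldBits) {
        int bits = 0;
        for (int width : fieldBits.values()) {
            bits += width;
        }
        if (bits % 8 != 0) {
            throw new IllegalArgumentException("not byte aligned: " + bits + " bits");
        }
        return bits / 8;
    }

    /** Node record fields as listed on slide 28. */
    static Map<String, Integer> nodeRecord() {
        Map<String, Integer> f = new LinkedHashMap<>();
        f.put("inUse flag", 1);
        f.put("next relationship", 35);
        f.put("next property", 36);
        return f;
    }

    /** Relationship record fields as listed on slide 28 (order assumed). */
    static Map<String, Integer> relationshipRecord() {
        Map<String, Integer> f = new LinkedHashMap<>();
        f.put("inUse flag", 1);
        f.put("unused", 1);
        f.put("first node", 35);
        f.put("second node", 35);
        f.put("type", 16);
        f.put("first node's previous relationship", 35);
        f.put("first node's next relationship", 35);
        f.put("second node's previous relationship", 35);
        f.put("second node's next relationship", 35);
        f.put("next property", 36);
        return f;
    }

    public static void main(String[] args) {
        System.out.println("node record: " + bytesFor(nodeRecord()) + " bytes");
        System.out.println("relationship record: " + bytesFor(relationshipRecord()) + " bytes");
    }
}
```

The relationship record holds previous and next pointers for both of its end nodes, which fits the claim on slide 26 that a relationship can be followed from either end.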
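
Slides 21 to 32 argue that a naturally bidirectional relationship such as PARTNER needs to be stored only once, because queries can ignore direction. The sketch below is a toy in-memory model of that idea, assuming a relationship is stored once and indexed under both of its end nodes. It is not Neo4j's storage engine or API: `TinyGraph`, `Dir`, `relate` and `neighbours` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: illustrates direction-agnostic lookup, not Neo4j internals.
class TinyGraph {
    enum Dir { OUTGOING, INCOMING, BOTH }

    static final class Rel {
        final String start, type, end;
        Rel(String start, String type, String end) {
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    // Each relationship object exists once but is reachable from both end nodes.
    private final Map<String, List<Rel>> relsByNode = new HashMap<>();

    void relate(String start, String type, String end) {
        Rel rel = new Rel(start, type, end);
        relsByNode.computeIfAbsent(start, k -> new ArrayList<>()).add(rel);
        if (!start.equals(end)) {
            relsByNode.computeIfAbsent(end, k -> new ArrayList<>()).add(rel);
        }
    }

    /** Returns the nodes on the other end of matching relationships. */
    List<String> neighbours(String node, String type, Dir dir) {
        List<String> result = new ArrayList<>();
        for (Rel rel : relsByNode.getOrDefault(node, Collections.<Rel>emptyList())) {
            if (!rel.type.equals(type)) {
                continue;
            }
            boolean outgoing = rel.start.equals(node);
            boolean incoming = rel.end.equals(node);
            if (dir == Dir.BOTH
                    || (dir == Dir.OUTGOING && outgoing)
                    || (dir == Dir.INCOMING && incoming)) {
                result.add(outgoing ? rel.end : rel.start);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TinyGraph g = new TinyGraph();
        g.relate("Neo Technology", "PARTNER", "GraphAware"); // stored once
        // Ignoring direction finds the partner from either side.
        System.out.println(g.neighbours("Neo Technology", "PARTNER", Dir.BOTH));
        System.out.println(g.neighbours("GraphAware", "PARTNER", Dir.BOTH));
    }
}
```

In this model a second PARTNER relationship in the opposite direction would add no information, which is the mistake slides 22 and 23 illustrate. Where the two directions mean different things, as on slide 33, two relationships are still needed.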
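
Slides 34 to 46 replace a RATED relationship carrying a numeric rating with one of five relationship types, turning rating 5 into LOVED and rating 3 into NEUTRAL. The sketch below shows that conversion and checks that both models answer "who liked the ride?" identically. It assumes ratings run from 1 to 5 and map in order onto HATED, DISLIKED, NEUTRAL, LIKED, LOVED; the slides confirm only the 5 and 3 cases. All class names are hypothetical, the pairing of passengers with ratings is invented, and the code says nothing about how either model performs in Neo4j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration of the two modelling options; no Neo4j API is used.
class RideRatings {
    // The five types from slide 35, assumed to be ordered worst to best.
    enum RatingType { HATED, DISLIKED, NEUTRAL, LIKED, LOVED }

    /** Property-qualified edge: (passenger)-[:RATED {rating}]->(ride). */
    static final class RatedEdge {
        final String passenger;
        final int rating;
        RatedEdge(String passenger, int rating) {
            this.passenger = passenger;
            this.rating = rating;
        }
    }

    /** Type-qualified edge: (passenger)-[:LOVED]->(ride) and so on. */
    static final class TypedEdge {
        final String passenger;
        final RatingType type;
        TypedEdge(String passenger, RatingType type) {
            this.passenger = passenger;
            this.type = type;
        }
    }

    /** Assumed mapping: 1 -> HATED ... 5 -> LOVED. */
    static RatingType typeFor(int rating) {
        if (rating < 1 || rating > 5) {
            throw new IllegalArgumentException("rating must be between 1 and 5");
        }
        return RatingType.values()[rating - 1];
    }

    static TypedEdge convert(RatedEdge edge) {
        return new TypedEdge(edge.passenger, typeFor(edge.rating));
    }

    /** Mirrors the query on slide 39: filter on the rating property. */
    static List<String> likedByProperty(List<RatedEdge> edges) {
        List<String> passengers = new ArrayList<>();
        for (RatedEdge edge : edges) {
            if (edge.rating > 3) {
                passengers.add(edge.passenger);
            }
        }
        return passengers;
    }

    /** Mirrors the query on slide 42: select by relationship type. */
    static List<String> likedByType(List<TypedEdge> edges) {
        Set<RatingType> wanted = EnumSet.of(RatingType.LIKED, RatingType.LOVED);
        List<String> passengers = new ArrayList<>();
        for (TypedEdge edge : edges) {
            if (wanted.contains(edge.type)) {
                passengers.add(edge.passenger);
            }
        }
        return passengers;
    }

    public static void main(String[] args) {
        // Which passenger gave which rating is made up for this example.
        List<RatedEdge> rated = Arrays.asList(
                new RatedEdge("Alice", 5), new RatedEdge("Michael", 3));
        List<TypedEdge> typed = new ArrayList<>();
        for (RatedEdge edge : rated) {
            typed.add(convert(edge));
        }
        System.out.println(likedByProperty(rated));
        System.out.println(likedByType(typed));
    }
}
```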
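
The sizing arithmetic on slides 52 to 56 can be turned into a small calculator. The sketch below uses the per-record sizes from the slides (14B, 33B and 41B on disk; 344B, 208B and 116B in the high level cache) and reproduces the 110.1 MB disk total from slide 53, which works out when a megabyte is taken as 1024 x 1024 bytes. `StoreSizing` and its methods are hypothetical names. Slide 56 ends with "...", so the cache figure is at best a partial estimate covering only the three listed record kinds.

```java
import java.util.Locale;

// Back-of-the-envelope sizing using the per-record figures from the slides.
class StoreSizing {
    // On-disk record sizes in bytes (slide 52).
    static final long NODE_DISK = 14, REL_DISK = 33, PROP_DISK = 41;
    // High level cache sizes in bytes (slide 56; the slide's list is incomplete).
    static final long NODE_CACHE = 344, REL_CACHE = 208, PROP_CACHE = 116;

    /** Disk space, which slide 55 says also equals the low level cache size. */
    static long diskBytes(long nodes, long rels, long props) {
        return nodes * NODE_DISK + rels * REL_DISK + props * PROP_DISK;
    }

    /** Partial high level cache estimate from the three listed sizes only. */
    static long highLevelCacheBytes(long nodes, long rels, long props) {
        return nodes * NODE_CACHE + rels * REL_CACHE + props * PROP_CACHE;
    }

    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // The example graph from slide 53.
        long nodes = 1_000, rels = 1_000_000, props = 2_010_000;
        System.out.println(String.format(Locale.ROOT, "disk / low level cache: %.1f MB",
                toMegabytes(diskBytes(nodes, rels, props))));
        System.out.println(String.format(Locale.ROOT, "high level cache (partial): %.1f MB",
                toMegabytes(highLevelCacheBytes(nodes, rels, props))));
    }
}
```

As the conclusion slide suggests, numbers like these are a starting point to measure against rather than a guarantee.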
