Modelling Data in Neo4j (plus a few tips)


Modelling Data in Neo4j, bidirectional relationships, qualifying relationships with properties vs. relationship types (performance comparison), Neo4j hardware sizing, Cypher vs. Java API


Transcript

  • 1. Modelling Data in Neo4j plus a few best practices and lessons learned by Michal Bachman GraphAware TM
  • 3-8. Contents: quick intro, 1x mistake, 1x experiment, 1x FAQ, 1x case study
  • 9-13. Data Has Changed: larger volumes, less structured, more interconnected. Polyglot persistence.
  • 14-18. NoSQL: key-value stores, column-family stores, document databases, graph databases
  • 19. Graph Databases: The first three use aggregate data models; graph databases work with simple records and complex interconnections.
  • 20-24. Neo4j: open-source, schema-less, JVM-based, fully ACID
  • 25. Property Graph [figure]
      Nodes:
        name: "Thriller", type: "genre"
        name: "Drama", type: "genre"
        name: "Pulp Fiction", year: 1994, type: "movie"
        name: "Director", type: "occupation"
        name: "Actor", type: "occupation"
        name: "Quentin Tarantino", type: "person"
        name: "Samuel L. Jackson", type: "person"
      Relationships:
        (Pulp Fiction)-[IS_OF_GENRE]->(Thriller)
        (Pulp Fiction)-[IS_OF_GENRE]->(Drama)
        (Quentin Tarantino)-[IS_A]->(Director)
        (Quentin Tarantino)-[IS_A]->(Actor)
        (Samuel L. Jackson)-[IS_A]->(Actor)
        (Quentin Tarantino)-[DIRECTED]->(Pulp Fiction)
        (Quentin Tarantino)-[ACTED_IN, role: "Jimmie Dimmick"]->(Pulp Fiction)
        (Samuel L. Jackson)-[ACTED_IN, role: "Jules Winnfield"]->(Pulp Fiction)
  • 26. Traversal [figure: the same graph as on slide 25]
  • 27. Modeling Data as Graphs: There is no single correct way.
  • 28. One Way [figure: the graph from slide 25, with genres and occupations modelled as separate nodes]
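  • 29. Another Way [figure]
      Nodes:
        name: "Pulp Fiction", year: 1994, type: "movie", genres: "Drama", "Thriller"
        name: "Quentin Tarantino", type: "person", occupation: "Actor", "Director"
        name: "Samuel L. Jackson", type: "person", occupation: "Actor"
        name: "Jimmie Dimmick", type: "role"
        name: "Jules Winnfield", type: "role"
      Relationships:
        (Quentin Tarantino)-[DIRECTED]->(Pulp Fiction)
        (Quentin Tarantino)-[ACTED_AS]->(Jimmie Dimmick)
        (Samuel L. Jackson)-[ACTED_AS]->(Jules Winnfield)
        (Jimmie Dimmick)-[CHARACTER_IN]->(Pulp Fiction)
        (Jules Winnfield)-[CHARACTER_IN]->(Pulp Fiction)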
  • 30. Bidirectional Relationships: a common mistake
  • 31-32. Ice Hockey: (Czech Republic)-[DEFEATED]->(Sweden)
  • 33-34. Ice Hockey (Implied Relationship): (Czech Republic)-[DEFEATED]->(Sweden) and (Sweden)-[DEFEATED_BY]->(Czech Republic)
  • 35. Traversals: In Neo4j, the speed of traversal does not depend on the direction of the relationships being traversed.
  • 36. Why?
  • 37. Neo4j Data Layout
      Node record in the node store (9 bytes); first bit = inUse flag:
        next relationship (35 bits)
        next property (36 bits)
      Relationship record in the relationship store (33 bytes); first bit = inUse flag, second bit unused:
        first node (35 bits)
        second node (35 bits)
        relationship type (16 bits)
        first node's previous relationship (35 bits)
        first node's next relationship (35 bits)
        second node's previous relationship (35 bits)
        second node's next relationship (35 bits)
        next property (36 bits)
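
As a quick sanity check on the data-layout slide, the listed field widths can be added up to confirm the stated record sizes. The sketch below is only arithmetic over the numbers on the slide; the class and method names (`RecordLayout`, `totalBytes`, and so on) are hypothetical and not part of any Neo4j API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: sums the field widths from the "Neo4j Data Layout" slide.
class RecordLayout {

    // Node record fields and their widths in bits, as listed on the slide.
    static Map<String, Integer> nodeFields() {
        Map<String, Integer> fields = new LinkedHashMap<>();
        fields.put("inUse flag", 1);
        fields.put("next relationship", 35);
        fields.put("next property", 36);
        return fields;
    }

    // Relationship record fields and their widths in bits, as listed on the slide.
    static Map<String, Integer> relationshipFields() {
        Map<String, Integer> fields = new LinkedHashMap<>();
        fields.put("inUse flag", 1);
        fields.put("unused bit", 1);
        fields.put("first node", 35);
        fields.put("second node", 35);
        fields.put("relationship type", 16);
        fields.put("first node's previous relationship", 35);
        fields.put("first node's next relationship", 35);
        fields.put("second node's previous relationship", 35);
        fields.put("second node's next relationship", 35);
        fields.put("next property", 36);
        return fields;
    }

    // Adds up all widths and rounds up to whole bytes.
    static int totalBytes(Map<String, Integer> fields) {
        int bits = 0;
        for (int width : fields.values()) {
            bits += width;
        }
        return (bits + 7) / 8;
    }

    public static void main(String[] args) {
        System.out.println("node record: " + totalBytes(nodeFields()) + " bytes");                 // node record: 9 bytes
        System.out.println("relationship record: " + totalBytes(relationshipFields()) + " bytes"); // relationship record: 33 bytes
    }
}
```

Each relationship record points at both its first and its second node and sits in the relationship chain of each, which fits the claim on slide 35 that traversal speed does not depend on direction.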
  • 38. Company Partnership (Naturally Bidirectional): (Neo Technology)-[PARTNER]->(GraphAware) or (Neo Technology)<-[PARTNER]-(GraphAware)
  • 39-40. Company Partnership (Naturally Bidirectional): two PARTNER relationships between Neo Technology and GraphAware, one in each direction
  • 41-42. Company Partnership (Naturally Bidirectional): a single PARTNER relationship between Neo Technology and GraphAware
  • 43. Why? Neo4j APIs allow developers to completely ignore relationship direction when querying the graph.
  • 44. Cypher
      MATCH (neo)-[:PARTNER]->(partner)
  • 45. Cypher
      MATCH (neo)<-[:PARTNER]-(partner)
  • 46. Cypher
      MATCH (neo)-[:PARTNER]-(partner)
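
The idea behind the three Cypher patterns above can be illustrated with a small in-memory toy, assuming nothing about Neo4j's actual implementation: one stored relationship can be matched as outgoing, incoming, or in either direction. `DirectionDemo`, `ToyGraph`, `relate` and `neighbours` are hypothetical names invented for this sketch, and the toy scans a plain list instead of following linked records.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: a relationship is stored once and can be read from either end.
class DirectionDemo {

    enum Direction { OUTGOING, INCOMING, BOTH }

    // One stored relationship: start node, type, end node.
    static final class Rel {
        final String start;
        final String type;
        final String end;

        Rel(String start, String type, String end) {
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    static final class ToyGraph {
        private final List<Rel> rels = new ArrayList<>();

        void relate(String start, String type, String end) {
            rels.add(new Rel(start, type, end));
        }

        // Returns the nodes connected to `node` by relationships of `type` in the given direction.
        List<String> neighbours(String node, String type, Direction dir) {
            List<String> result = new ArrayList<>();
            for (Rel r : rels) {
                if (!r.type.equals(type)) {
                    continue;
                }
                if (dir != Direction.INCOMING && r.start.equals(node)) {
                    result.add(r.end);   // relationship leaves `node`
                }
                if (dir != Direction.OUTGOING && r.end.equals(node)) {
                    result.add(r.start); // relationship arrives at `node`
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        // A single PARTNER relationship, stored in one (arbitrary) direction.
        graph.relate("Neo Technology", "PARTNER", "GraphAware");

        // Ignoring direction, each company finds the other.
        System.out.println(graph.neighbours("Neo Technology", "PARTNER", Direction.BOTH));
        System.out.println(graph.neighbours("GraphAware", "PARTNER", Direction.BOTH));
    }
}
```

Because a query can ignore direction, a second PARTNER relationship pointing the other way would add storage without adding information.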
  • 47. Qualifying Relationships: performance comparison
  • 48. Qualifying by Properties [figure]
      (Daniela)-[RATED, rating: 4]->(Pulp Fiction)
      (Michal)-[RATED, rating: 5]->(Pulp Fiction)
      (Mark)-[RATED, rating: 1]->(Pulp Fiction)
  • 49. Who liked Pulp Fiction? (Cypher)
      START  pulpFiction=node({id})
      MATCH  (pulpFiction)<-[r:RATED]-(fan)
      WHERE  r.rating > 3
      RETURN fan
  • 50. Who liked Pulp Fiction? (Java)
      // Visit every incoming RATED relationship and keep only ratings above 3.
      for (Relationship r : pulpFiction.getRelationships(INCOMING, RATED)) {
          if ((int) r.getProperty("rating") > 3) {
              Node fan = r.getStartNode(); // do something with it
          }
      }
  • 51. Qualifying by Relationship Type [figure]
      (Daniela)-[LIKED]->(Pulp Fiction)
      (Michal)-[LOVED]->(Pulp Fiction)
      (Mark)-[HATED]->(Pulp Fiction)
  • 52. Who liked Pulp Fiction? (Cypher)
      START  pulpFiction=node({id})
      MATCH  (pulpFiction)<-[r:LIKED|LOVED]-(fan)
      RETURN fan
  • 53. Who liked Pulp Fiction? (Java)
      // Only LIKED and LOVED relationships are visited; no property lookup is needed.
      for (Relationship r : pulpFiction.getRelationships(INCOMING, LIKED, LOVED)) {
          Node fan = r.getStartNode(); // do something with it
      }
  • 56. Winner! Qualifying by relationship type [figure: Daniela LIKED, Michal LOVED, Mark HATED Pulp Fiction]
  • 57. Other interesting info?
  • 58. Hardware Sizing: frequently asked question
  • 59. Neo4j Architecture [figure]
      JVM: Other APIs, Core API, Transaction Management, Object Cache
      Operating System: File System Cache
      HDD: Record Files (Nodes, Relationships, Relationship Types, Properties), Neo4j Transaction Log
  • 60. Disk Space
      > cd data
      > ls -ah
  • 61. Disk Space
      drwxr-xr-x  5 bachmanm  wheel  170B 19 Oct 12:56 index
      -rw-r--r--  1 bachmanm  wheel   31K 19 Oct 12:56 messages.log
      -rw-r--r--  1 bachmanm  wheel   69B 19 Oct 12:56 neostore
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.id
      -rw-r--r--  1 bachmanm  wheel  8.8K 19 Oct 12:56 neostore.nodestore.db
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.nodestore.db.id
      -rw-r--r--  1 bachmanm  wheel   39M 19 Oct 12:56 neostore.propertystore.db
      -rw-r--r--  1 bachmanm  wheel  153B 19 Oct 12:56 neostore.propertystore.db.arrays
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.arrays.id
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.id
      -rw-r--r--  1 bachmanm  wheel   43B 19 Oct 12:56 neostore.propertystore.db.index
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.index.id
      -rw-r--r--  1 bachmanm  wheel  140B 19 Oct 12:56 neostore.propertystore.db.index.keys
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.index.keys.id
      -rw-r--r--  1 bachmanm  wheel  154B 19 Oct 12:56 neostore.propertystore.db.strings
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.propertystore.db.strings.id
      -rw-r--r--  1 bachmanm  wheel   31M 19 Oct 12:56 neostore.relationshipstore.db
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.relationshipstore.db.id
      -rw-r--r--  1 bachmanm  wheel   38B 19 Oct 12:56 neostore.relationshiptypestore.db
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.relationshiptypestore.db.id
      -rw-r--r--  1 bachmanm  wheel  140B 19 Oct 12:56 neostore.relationshiptypestore.db.names
      -rw-r--r--  1 bachmanm  wheel    9B 19 Oct 12:56 neostore.relationshiptypestore.db.names.id
  • 62. Disk Space: node 9B, relationship 33B, property 41B
  • 63. Disk Space (Example)
          1,000 nodes  x  9B  =   8.8 kB
      1,000,000 rels   x 33B  =  31.5 MB
      2,010,000 props  x 41B  =  78.6 MB
      TOTAL                   = 110.1 MB
  • 64. Low Level Cache: How about low level cache? Any guesses?
  • 65. Low Level Cache: Same as disk space
  • 66. High Level Cache: node 344B, relationship 208B, property 116B, ...
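
The sizing arithmetic on the slides above can be reproduced with a few multiplications. The sketch below uses only the per-record sizes quoted on slides 62 and 66 and the example counts from slide 63; `StoreSizeEstimate` and its methods are hypothetical names. The slide figures match when "MB" is read as 1024 x 1024 bytes, and the object-cache total is an extrapolation that covers only the three record kinds listed on slide 66.

```java
import java.util.Locale;

// Rough sizing sketch based on the per-record figures quoted on the slides.
class StoreSizeEstimate {

    // Bytes per record on disk (slide 62).
    static final long NODE_BYTES = 9;
    static final long REL_BYTES = 33;
    static final long PROP_BYTES = 41;

    // Bytes per object in the high level (object) cache (slide 66).
    static final long NODE_CACHE_BYTES = 344;
    static final long REL_CACHE_BYTES = 208;
    static final long PROP_CACHE_BYTES = 116;

    // Disk space, and per slide 65 also the low level cache needed to hold everything.
    static long storeBytes(long nodes, long rels, long props) {
        return nodes * NODE_BYTES + rels * REL_BYTES + props * PROP_BYTES;
    }

    // High level cache estimate for the three listed record kinds only.
    static long objectCacheBytes(long nodes, long rels, long props) {
        return nodes * NODE_CACHE_BYTES + rels * REL_CACHE_BYTES + props * PROP_CACHE_BYTES;
    }

    // Formats a byte count as megabytes of 1024 * 1024 bytes with one decimal place.
    static String megabytes(long bytes) {
        return String.format(Locale.ROOT, "%.1f MB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        long nodes = 1_000;
        long rels = 1_000_000;
        long props = 2_010_000;

        System.out.println("disk / low level cache: " + megabytes(storeBytes(nodes, rels, props))); // disk / low level cache: 110.1 MB
        System.out.println("high level cache (partial): " + megabytes(objectCacheBytes(nodes, rels, props)));
    }
}
```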
  • 67. Other interesting info?
  • 68. Java API vs. Cypher: case study
  • 69. Data Model [figure: User 1, User 2, User 3 and User 4 connected by weighted relationships: TRAVELLED_WITH (weight: 3), TRAVELLED_WITH (weight: 4), TRAVELLED_TOGETHER (weight: 1), FRIEND (weight: 5)]
  • 70-71. Cypher query (execution time: > 1 second)
      START    from=node:node_auto_index(user_id="{FROM}"),
               to=node:node_auto_index(user_id="{TO}")
      MATCH    p = from-[r*1..5]->to
      RETURN   extract(n in nodes(p) : n.user_id),
               extract(rel in relationships(p) : rel.weight),
               extract(rel in relationships(p) : type(rel))
      ORDER BY length(p),
               reduce(totalWeight = 0, rel in relationships(p) : totalWeight + rel.weight)
      LIMIT    3
  • 72. 10 - 20 ms
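
The slides do not show the code behind the faster timing. To give a feel for what the query asks for, here is a plain-Java sketch of the same logic over an in-memory map: collect all directed paths of 1 to 5 hops between two users, order them by length and then by total weight, and keep the first three. It does not use the Neo4j Java API; `WeightedPaths`, `relate`, `bestPaths` and the sample data in `main` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory sketch of "shortest, then lightest" path search; not the Neo4j API.
class WeightedPaths {

    static final class Edge {
        final String from;
        final String type;
        final String to;
        final int weight;

        Edge(String from, String type, int weight, String to) {
            this.from = from;
            this.type = type;
            this.weight = weight;
            this.to = to;
        }
    }

    static final class Path {
        final List<Edge> edges;

        Path(List<Edge> edges) {
            this.edges = new ArrayList<>(edges); // copy, the search reuses its buffer
        }

        int length() {
            return edges.size();
        }

        int totalWeight() {
            int sum = 0;
            for (Edge e : edges) {
                sum += e.weight;
            }
            return sum;
        }
    }

    private final Map<String, List<Edge>> outgoing = new HashMap<>();

    void relate(String from, String type, int weight, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(from, type, weight, to));
    }

    // Returns up to `limit` paths of 1..maxHops outgoing relationships, ordered by length, then weight.
    List<Path> bestPaths(String from, String to, int maxHops, int limit) {
        List<Path> found = new ArrayList<>();
        collect(from, to, maxHops, new ArrayList<>(), found);
        found.sort(Comparator.<Path>comparingInt(Path::length).thenComparingInt(Path::totalWeight));
        return new ArrayList<>(found.subList(0, Math.min(limit, found.size())));
    }

    // Depth-first enumeration; a relationship is never used twice within one path.
    private void collect(String current, String target, int hopsLeft, List<Edge> soFar, List<Path> found) {
        if (!soFar.isEmpty() && current.equals(target)) {
            found.add(new Path(soFar));
        }
        if (hopsLeft == 0) {
            return;
        }
        for (Edge e : outgoing.getOrDefault(current, Collections.<Edge>emptyList())) {
            if (soFar.contains(e)) {
                continue; // identity comparison: Edge does not override equals
            }
            soFar.add(e);
            collect(e.to, target, hopsLeft - 1, soFar, found);
            soFar.remove(soFar.size() - 1);
        }
    }

    public static void main(String[] args) {
        WeightedPaths graph = new WeightedPaths();
        // Hypothetical sample data; the endpoints on the data-model slide are not recoverable.
        graph.relate("user1", "FRIEND", 5, "user2");
        graph.relate("user1", "TRAVELLED_WITH", 3, "user3");
        graph.relate("user3", "TRAVELLED_WITH", 4, "user2");
        graph.relate("user1", "TRAVELLED_TOGETHER", 1, "user2");

        for (Path p : graph.bestPaths("user1", "user2", 5, 3)) {
            System.out.println("hops=" + p.length() + " totalWeight=" + p.totalWeight());
        }
    }
}
```

Hand-written traversal code like this can stop early or prune branches in ways a general query planner may not, which is one plausible reason for a gap like the one reported above.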
  • 73-76. Java API vs. Cypher: Cypher is great! Cypher is improving. But don't be afraid of writing some Java.
  • 77. Thanks! www.graphaware.com @graph_aware