Redis Day TLV 2018 - Graph Distribution

A graph database module for Redis

Published in: Technology

  1. 1. Graph Distribution
  2. 2. Graph Database: SRC -[Relation]-> DEST
  3. 3. Graph Database use cases: fraud detection, recommendation engines, social networks...
  4. 4. RedisGraph: ● Property graph ● Labeled entities ● Schemaless ● Cypher query language ● Aggregations, arithmetic expressions, sort... ● Tabular result set
  5. 5. Structure
  6. 6. Tables. Person (Name, Age, Height): (Roi, 33, 187), (Hila, 33, 170), (Shany, 23, 167), (Amit, 31, 180). Country (Name, Population): (Israel, 8.5M), (Japan, 127M), (Italy, 60M). Visit (SRC, DEST): (1, 2), (2, 2), (2, 3), (4, 1), (4, 3).
  7. 7. Documents: { ID: 1, Name: ‘Roi’, Age: 33, Height: 187, Visited: [6] } { ID: 6, Name: ‘Japan’, Population: 127M }
  8. 8. Graph structure 101
  9. 9. Adjacency list 12 3 4 1 2 3 4
  10. 10. Adjacency matrix: 1 0 1 0 1 0 0 0 0 1 0 0 1 1 0 1 0 0 1 0 1 0 0 0 1 1 0 1 1 0 1 1 1 0 1 0 1 1 0 0. Node i is connected to node j if A[i,j] = 1.
  11. 11. Hexastore: 6 orderings of every triplet (SPO, SOP, PSO, POS, OSP, OPS), where S = Subject, P = Predicate, O = Object.
  12. 12. Graph structure: Hexastore triplets for Michael (S), Boss (P), Jim (O): SPO:Michael:Boss:Jim, SOP:Michael:Jim:Boss, OPS:Jim:Boss:Michael, OSP:Jim:Michael:Boss, PSO:Boss:Michael:Jim, POS:Boss:Jim:Michael
  13. 13. Node property set Entities - Key value store. Person node with attributes: { ‘name’: ‘Bruce Buffer’, ‘age’: 60, ‘gender’: ‘male’ }
  14. 14. Problem: 2 billion users, 338 friends per user on average, 676 billion edges. 152 terabytes ~= 1024*32 bytes per user + 64*2 bytes per edge.
  15. 15. Partitioning
  16. 16. Entities distribution Property set 1 Property set 2 Graph index
  17. 17. Query: Find friends of mine who’ve visited places I’ve been to and are older than me. MATCH (ME:person)-[friend]->(F:person)-[visited]->(C:country)<-[visit]-(ME) WHERE ME.ID = 33 AND F.age > ME.age RETURN F.name, C.name
  18. 18. (ME:person) ME.ID = 33 Graph traversal Graph index
  19. 19. Graph traversal (ME:person)-[friend]->(F:person) Graph index
  20. 20. (F:person)-[visited]->(C:country) Graph traversal Graph index
  21. 21. (C:country)<-[visit]-(ME) Graph traversal Graph index
  22. 22. Resultset (Friend ID, Friend name, Country ID, Country name): (70, ?, 25, ?), (92, ?, 55, ?), (56, ?, 4, ?)
  23. 23. Query WHERE F.age > ME.age RETURN F.name, C.name NETWORK! Index Entities Fetch age for ID 33
  24. 24. Query example continued WHERE F.age > ME.age RETURN F.name, C.name NETWORK! Index Entities Fetch name of every entity in (IDs) Entity’s age > 29
  25. 25. Resultset (Friend ID, Friend name, Country ID, Country name): (70, Noam, 25, Japan)
  26. 26. Index distribution Friend relation Visit relation Graph index
  27. 27. Query: Find all posts liked by friends of friends of mine, written by author X. MATCH (ME:person)-[friend]->(:person)-[friend]->(F:person)-[like]->(post)<-[author]-(A:author) WHERE ME.ID=46 AND A.ID=71070 RETURN A.name, F.name
  28. 28. 1. Node X contains FRIEND relations. 2. Seek to my ID in Node X (1 RPC). Retrieve a list of friend uids. 3. Do multiple seeks for each of the friend uids, to generate a list of friends of friends uids. result set 1 Query Friend Index Query executor (ME:person)-[friend]->(:person)-[friend]->(F:person)
  29. 29. Resultset 1, friends of friends (Friend ID, Friend name): (70, ?), (92, ?), (56, ?)
  30. 30. 1. Node Y contains posting list for predicate LIKE. 2. Ship result set 1 to Node Y (1 RPC), and do seeks to generate a list of all posts liked by result set 1. result set 2 Query Like Index Query executor (F:person)-[like]->(post) Resultset 1
  31. 31. Resultset 2, liked posts (Friend ID, Friend name, Post ID): (70, ?, 534), (70, ?, 431), (92, ?, 8964), (56, ?, 12), (56, ?, 5356)
  32. 32. Query Node Z contains relations for predicate AUTHOR. Ship result set 2 to Node Z (1 RPC). Seek to author X, and generate a list of posts authored by X. result set 3 Author Index Query executor (post)<-[author]-(A:author) Resultset 2
  33. 33. Resultset 4, intersection of resultsets 2 and 3 (Friend ID, Friend name, Post ID, Author ID, Author name): (70, ?, 534, 71070, ?), (92, ?, 8964, 71070, ?)
  34. 34. Node N contains names for all uids, ship result set 4 to Node N (1 RPC), and convert uids to names by doing multiple seeks. Query Author Index Query executor RETURN A.name, F.name Resultset 4
  35. 35. Resultset 4, intersection of resultsets 2 and 3 (Friend ID, Friend name, Post ID, Author ID, Author name): (70, Ailon, 534, 71070, Omri), (92, Boaz, 8964, 71070, Omri)
  36. 36. RedisGraph: not distributed yet. Work in progress: compact distributed index; concurrent, fast, independent traversals.
  37. 37. (you)-[ask]->(question) @roilipman
  38. 38. Solutions: JanusGraph (successor of Titan) ● Relies on a storage backend, e.g. Cassandra. ● Provides a graph interface on top of a table. ● Delegates storing, replicating, distributing and persisting a graph to the underlying storage backend. Takes a mature application from a similar domain and introduces a new data type API on top of an existing data structure (not optimal).
  39. 39. Solutions: Dgraph. Uses the concept of RDF N-Quads to represent connections and Badger as its key-value store. Both the graph index and the entities are distributed.
  40. 40. Solutions: ArangoDB. From my understanding, this multi-model database uses documents to represent all three data models: documents, key-value store and graph. Not sure how it distributes its data, but it uses Raft to ensure consistency. It is ACID.
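
Slides 9 and 10 name two basic graph layouts, the adjacency list and the adjacency matrix. A minimal Python sketch of both follows. The four nodes and their edges are hypothetical, chosen only for illustration, and are not the graph drawn on the slides.

```python
# Hypothetical 4-node directed graph (illustrative edges, not from the slides).
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 4), (4, 1)]


def adjacency_list(nodes, edges):
    """Map every node to the list of nodes it points to."""
    adj = {n: [] for n in nodes}
    for src, dest in edges:
        adj[src].append(dest)
    return adj


def adjacency_matrix(nodes, edges):
    """Build A where A[i][j] == 1 when node i is connected to node j."""
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]
    for src, dest in edges:
        matrix[index[src]][index[dest]] = 1
    return matrix


adj = adjacency_list(nodes, edges)
mat = adjacency_matrix(nodes, edges)
```

The list stores only existing edges, while the matrix reserves one cell per node pair, which is the usual trade-off between memory and constant-time edge lookup.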
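
The hexastore on slides 11 and 12 stores six permutations of every triplet so that any access pattern becomes a prefix scan over sorted keys. The sketch below reproduces the six keys for Michael, Boss, Jim. The helper names `hexastore_keys` and `prefix_scan`, and the second triplet (Michael, Boss, Pam), are assumptions added for illustration.

```python
import bisect
from itertools import permutations


def hexastore_keys(subject, predicate, obj):
    """Return the six 'ORDER:a:b:c' keys for one triplet."""
    parts = {"S": subject, "P": predicate, "O": obj}
    keys = []
    for order in permutations("SPO"):
        prefix = "".join(order)
        keys.append(":".join([prefix] + [parts[c] for c in order]))
    return keys


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with `prefix`; they are contiguous in sorted order."""
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


michael_jim = hexastore_keys("Michael", "Boss", "Jim")
# Hypothetical second triplet, so the scan returns more than one row.
index = sorted(michael_jim + hexastore_keys("Michael", "Boss", "Pam"))

managed_by_michael = prefix_scan(index, "SPO:Michael:Boss:")
bosses_of_jim = prefix_scan(index, "OPS:Jim:Boss:")
```

Each question picks the ordering whose known parts come first: "whom does Michael manage" scans the SPO keys, while "who is Jim's boss" scans the OPS keys.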
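
The size estimate on slide 14 can be checked with a few lines of arithmetic. The sketch assumes decimal terabytes (10^12 bytes), which the slide does not state. The variable names are illustrative.

```python
USERS = 2_000_000_000          # 2 billion users
AVG_FRIENDS = 338              # average friends per user
BYTES_PER_USER = 1024 * 32     # per-user size, as given on the slide
BYTES_PER_EDGE = 64 * 2        # per-edge size, as given on the slide

edges = USERS * AVG_FRIENDS                 # 676 billion edges
user_bytes = USERS * BYTES_PER_USER         # storage for user entities
edge_bytes = edges * BYTES_PER_EDGE         # storage for edges
total_bytes = user_bytes + edge_bytes

# Assumption: 1 TB = 10**12 bytes.
total_tb = total_bytes / 10**12
print(f"{edges:,} edges, about {total_tb:.0f} TB")
```

Under these figures the edges account for more than half of the total, so the graph index and the entities both need to be distributed.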
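
The distributed query on slides 27 to 35 can be simulated in memory to see where the four RPCs go. In this sketch each predicate (FRIEND, LIKE, AUTHOR) lives on its own index node and names live on an entity node. The classes `IndexNode` and `EntityNode`, the intermediate friend IDs 10 and 11, and the choice to intersect in the executor are assumptions made for brevity. The other IDs and names come from the result sets on the slides.

```python
class IndexNode:
    """A storage node holding one predicate's postings: src id -> dest ids."""

    def __init__(self, postings):
        self.postings = postings
        self.rpcs = 0  # round trips received from the query executor

    def expand(self, ids, hops=1):
        """One RPC: follow the predicate `hops` times, seeking locally."""
        self.rpcs += 1
        frontier = list(ids)
        for _ in range(hops):
            reached = []
            for src in frontier:
                for dest in self.postings.get(src, []):
                    if dest not in reached:
                        reached.append(dest)
            frontier = reached
        return frontier

    def pairs(self, ids):
        """One RPC: return every (src, dest) edge leaving one of `ids`."""
        self.rpcs += 1
        return [(s, d) for s in ids for d in self.postings.get(s, [])]


class EntityNode:
    """A storage node holding entity names keyed by uid."""

    def __init__(self, names):
        self.names = names
        self.rpcs = 0

    def lookup(self, ids):
        """One RPC: convert a batch of uids to names."""
        self.rpcs += 1
        return {i: self.names[i] for i in ids}


ME, AUTHOR = 46, 71070
# Friend IDs 10 and 11 are hypothetical intermediate hops.
node_x = IndexNode({46: [10, 11], 10: [70, 92], 11: [56]})       # FRIEND
node_y = IndexNode({70: [534, 431], 92: [8964], 56: [12, 5356]})  # LIKE
node_z = IndexNode({71070: [534, 8964]})                          # AUTHOR
node_n = EntityNode({70: "Ailon", 92: "Boaz", 71070: "Omri"})

rs1 = node_x.expand([ME], hops=2)        # friends of friends
rs2 = node_y.pairs(rs1)                  # (friend, liked post)
rs3 = node_z.expand([AUTHOR])            # posts written by the author
rs4 = [(f, p) for f, p in rs2 if p in rs3]  # intersected in the executor
names = node_n.lookup([AUTHOR] + [f for f, _ in rs4])
rows = [(names[AUTHOR], names[f]) for f, _ in rs4]
total_rpcs = node_x.rpcs + node_y.rpcs + node_z.rpcs + node_n.rpcs
```

Batching every hop into a single call per node keeps the number of network round trips tied to the number of predicates in the pattern, not to the size of the intermediate result sets.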
