Graphs, Edges & Nodes: Untangling the Social Web

Published in: Technology

  1. Graphs, Edges & Nodes: Untangling the social web. (Wednesday, March 9, 2011)
  2. What’s a graph?
  3. Graph
  6. Graph [figure: example graph with numeric labels]
  7. Graph [figure: larger example graph with numeric labels]
  8. Simple: At most one edge between any pair of nodes.
  9. Multigraph: Multiple edges between vertices allowed.
  10. Pseudograph: Self-loops are permitted.
  11. G = (V, E): a graph G is a set of vertices V together with a set of edges E connecting them.
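
As a rough illustration of G = (V, E) and the three graph types defined above, here is a minimal Python sketch. The function name `classify` and the sample edge lists are made up for this example; edges are assumed to be undirected pairs, and any graph containing a self-loop is reported as a pseudograph.

```python
from collections import Counter


def classify(vertices, edges):
    """Classify an undirected graph given as (V, E).

    `edges` is a list of 2-tuples; a tuple (v, v) is a self-loop.
    Returns "simple", "multigraph", or "pseudograph".
    """
    # Every edge endpoint must be a member of V.
    for u, v in edges:
        if u not in vertices or v not in vertices:
            raise ValueError(f"edge ({u}, {v}) uses a vertex outside V")
    # Any self-loop makes the graph a pseudograph.
    if any(u == v for u, v in edges):
        return "pseudograph"
    # Undirected: (u, v) and (v, u) count as the same edge.
    counts = Counter(frozenset(edge) for edge in edges)
    if any(n > 1 for n in counts.values()):
        return "multigraph"
    return "simple"


V = {1, 2, 3, 4}
simple_edges = [(1, 2), (1, 3), (1, 4), (3, 4)]
multi_edges = simple_edges + [(2, 1)]   # second edge between 1 and 2
pseudo_edges = simple_edges + [(4, 4)]  # self-loop on vertex 4

print(classify(V, simple_edges), classify(V, multi_edges), classify(V, pseudo_edges))
```
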
  13. What’s a node? Also called: vertex, point, junction, 0-simplex.
  16. What’s an edge? Also called: arc, branch, line, link, 1-simplex.
  17. Directed
  20. Undirected
  26. Data Structures
  27. Finite simple graph on the vertices 1, 2, 3, 4 [figure]
  28. Adjacency Matrix (2d array); rows and columns are the vertices 1 to 4:
        0 1 1 1
        1 0 0 0
        1 0 0 1
        1 0 1 0
  31. Adjacency list: array entries (vertices) [1, 2, 3, 4] point to singly linked lists:
        1 -> 2 -> 3 -> 4
        2 -> 1
        3 -> 1 -> 4
        4 -> 1 -> 3
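
The two representations can be sketched in Python as follows. This assumes the four-vertex example graph has the undirected edges 1-2, 1-3, 1-4 and 3-4, which is how the matrix values on the slide read; the function names are illustrative only, and plain Python lists stand in for the singly linked lists.

```python
def adjacency_matrix(n, edges):
    """Return an n x n 0/1 matrix for an undirected graph on vertices 1..n."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        # Undirected edge: set both (u, v) and (v, u).
        matrix[u - 1][v - 1] = 1
        matrix[v - 1][u - 1] = 1
    return matrix


def adjacency_list(n, edges):
    """Return a dict mapping each vertex 1..n to its sorted neighbour list."""
    adj = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return {v: sorted(neighbours) for v, neighbours in adj.items()}


# Assumed edge set for the 4-vertex example graph.
EDGES = [(1, 2), (1, 3), (1, 4), (3, 4)]
MATRIX = adjacency_matrix(4, EDGES)
LISTS = adjacency_list(4, EDGES)

for row in MATRIX:
    print(row)
print(LISTS)
```

The matrix answers "is there an edge between u and v?" with a single lookup but always stores n x n entries; the list stores one entry per edge endpoint, which is usually much smaller for sparse graphs such as social networks.
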
  32. Visualizations
  33. You are here.
  35. (Graph does not include Justin Bieber)
  36. Social Graphs
  46. User-based item recommendations
  47. Recommend items to me that are popular amongst my friends [figure: People (me, friends) linked to Items]
  54. 2-step path (me -> friends -> items) on homogeneous bipartite graph.
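
A minimal sketch of this kind of recommendation in Python, assuming the graph is held as two plain dictionaries (person to friends, person to liked items). The function name `recommend`, the sample people and the sample items are all invented for illustration.

```python
from collections import Counter


def recommend(me, friends, likes, top_n=3):
    """Rank items by how many of `me`'s friends like them.

    friends: dict person -> set of people (person-person edges)
    likes:   dict person -> set of items  (person-item edges)
    Items that `me` already likes are skipped.
    """
    mine = likes.get(me, set())
    counts = Counter()
    for friend in friends.get(me, set()):      # step 1: me -> friend
        for item in likes.get(friend, set()):  # step 2: friend -> item
            if item not in mine:
                counts[item] += 1
    # Sort by count (descending), then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


friends = {"me": {"alice", "bob", "carol"}}
likes = {
    "me": {"book"},
    "alice": {"book", "film", "album"},
    "bob": {"film", "album"},
    "carol": {"film"},
}

print(recommend("me", friends, likes))  # ['film', 'album']
```
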
  55. Strong Connection Problem (SCP)
  56. There are many of these ‘fundamental’ graph units:
        - tripartite graphs (user/asset/tag)
        - folksonomies
        - multicolor-multiparity graph
        - etc.
  57. Graph Storage Engines
  58. Neo4j: “An embedded, disk-based, fully transactional Java persistence engine that stores data structured in graphs rather than in tables.” http://neo4j.org
  59. HypergraphDB: “A general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. It is a graph database designed specifically for artificial intelligence and semantic web projects.” http://kobrix.org/hgdb.jsp
  60. Special Purpose Storage Engines
  61. FlockDB: “FlockDB is a database that stores graph data, but it isn’t a database optimized for graph-traversal operations. Instead, it’s optimized for very large adjacency lists, fast reads and writes, and page-able set arithmetic queries.” http://engineering.twitter.com/2010/05/introducing-flockdb.html
  62. Redis: “Redis is an advanced key-value store. [...] the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All this data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server side union, intersection, difference between sets, etc.” http://code.google.com/p/redis
  63. A Redis Friends/Followers Example
  64. Redis makes you think in terms of data structures, and operations on those structures.
  65. Set: Finite (for our cases) collection of objects in which order has no significance and multiplicity is generally ignored.
        S = { Alice, Bob, Carol }
      List: Finite (for our cases) collection of objects in which order *is* significant and multiplicity is allowed.
        L = [ X, Y, X, Z, Q ]
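
The difference between the two definitions is easy to see with Python's built-in `set` and `list` types, used here only as an analogy for the Redis types; the sample names are arbitrary.

```python
names = ["Alice", "Bob", "Carol", "Alice"]

as_set = set(names)    # duplicates collapse; order carries no meaning
as_list = list(names)  # duplicates and order are both preserved

print(len(as_set), len(as_list))                      # 3 4
print(as_set == {"Carol", "Bob", "Alice"})            # True
print(as_list == ["Alice", "Alice", "Bob", "Carol"])  # False
```

This is why a set is a natural fit for "who follows me": following the same person twice should not create a second edge.
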
  66. Storing a user’s name (a plain string value, not a set):
        SET uid:1000:username jperras
        (Command, Key, Value)
  67. Use sets for denoting my followers/people I follow.
  68. Adding a new follower:
        SADD uid:1000:following 1001
        SADD uid:1001:followers 1000
        (Command, Key, Value)
  69. Posting Updates
        $r = Redis();
        // Allocate a new post id and store the post as "userid|timestamp|status".
        $postid = $r->incr("global:nextPostId");
        $post = $User['id'] . "|" . time() . "|" . $status;
        $r->set("post:$postid", $post);
        // Collect the followers who should receive the post.
        $followers = $r->smembers("uid:" . $User['id'] . ":followers");
        if ($followers === false) $followers = array();
        $followers[] = $User['id']; /* Add the post to our own posts too */
        foreach ($followers as $fid) {
            $r->push("uid:$fid:posts", $postid, false);
        }
        # Push the post on the timeline, and trim the timeline to the
        # newest 1000 elements (LTRIM indexes are inclusive, so 0 to 999).
        $r->push("global:timeline", $postid, false);
        $r->ltrim("global:timeline", 0, 999);
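
The posting flow above is a fan-out on write: each new post id is pushed onto the list of every follower. The Python sketch below mimics that flow against a tiny in-memory stand-in rather than a real Redis server, so it runs anywhere. `TinyStore` and `post_update` are hypothetical names; only the handful of operations the slide uses are emulated, and the push is assumed to prepend (LPUSH-style) so that the newest post sits at index 0.

```python
import time
from collections import defaultdict


class TinyStore:
    """In-memory stand-in for the few Redis operations used on the slide."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.strings = {}
        self.sets = defaultdict(set)
        self.lists = defaultdict(list)

    def incr(self, key):
        self.counters[key] += 1
        return self.counters[key]

    def lpush(self, key, value):
        # Prepend, so index 0 is always the newest element.
        self.lists[key].insert(0, value)

    def ltrim(self, key, start, stop):
        # Like Redis LTRIM, `stop` is inclusive (non-negative indexes only).
        self.lists[key] = self.lists[key][start:stop + 1]


def post_update(store, user_id, status, timeline_size=1000):
    """Store a post and fan its id out to the author's followers."""
    post_id = store.incr("global:nextPostId")
    store.strings[f"post:{post_id}"] = f"{user_id}|{int(time.time())}|{status}"
    # Every follower receives the post, and so does the author.
    recipients = set(store.sets[f"uid:{user_id}:followers"]) | {user_id}
    for fid in recipients:
        store.lpush(f"uid:{fid}:posts", post_id)
    # Keep only the newest `timeline_size` ids on the global timeline.
    store.lpush("global:timeline", post_id)
    store.ltrim("global:timeline", 0, timeline_size - 1)
    return post_id


store = TinyStore()
store.sets["uid:1001:followers"].add(1000)  # user 1000 follows user 1001
first_id = post_update(store, 1001, "hello")
print(first_id, store.lists["uid:1000:posts"], store.lists["uid:1001:posts"])
```
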
  70. Common followers? - Set intersections!
        SINTER uid:1000:followers uid:1001:followers
        (Command, Key 1, Key 2)
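
The follower bookkeeping and the intersection can be sketched with Python sets standing in for Redis set keys. The key layout `uid:<id>:followers` / `uid:<id>:following` follows the earlier slides; the helper names `follow` and `common_followers` and the user ids 1002 to 1004 are made up for this example.

```python
from collections import defaultdict

# key -> set of user ids; a stand-in for Redis set keys
sets = defaultdict(set)


def follow(follower_id, followee_id):
    """Record one directed edge under both keys (the two SADD calls)."""
    sets[f"uid:{follower_id}:following"].add(followee_id)
    sets[f"uid:{followee_id}:followers"].add(follower_id)


def common_followers(a, b):
    """Set intersection, the in-memory analogue of SINTER on two keys."""
    return sets[f"uid:{a}:followers"] & sets[f"uid:{b}:followers"]


follow(1002, 1000)
follow(1003, 1000)
follow(1003, 1001)
follow(1004, 1001)
follow(1003, 1000)  # repeating a follow changes nothing: sets ignore duplicates

print(common_followers(1000, 1001))  # {1003}
```
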
  71. A MySQL Example (simplified)
  72. # Mutual Friends
      select f.friend_id
        from friends f
        join friends m
          on m.user_id = f.friend_id and m.friend_id = f.user_id
       where f.user_id = 1234;

      # Following (for directed graphs)
      select f.friend_id
        from friends f
        left join friends m
          on m.user_id = f.friend_id and m.friend_id = f.user_id
       where f.user_id = 1234 and m.user_id is null;

      # Followers (for directed graphs)
      select f.user_id
        from friends f
        left join friends m
          on m.user_id = f.friend_id and m.friend_id = f.user_id
       where f.friend_id = 1234 and m.user_id is null;
  73. Not too bad.
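
To try the three queries without a MySQL server, the sketch below runs them against an in-memory SQLite database from Python's standard library. The table layout (`friends(user_id, friend_id)`, where a row means "user_id lists friend_id") and the sample rows are assumptions made for illustration. Note that the followers query here selects `f.user_id`: after filtering on `m.user_id is null`, every column of `m` is NULL, so the follower's id has to come from `f`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table friends (user_id integer, friend_id integer)")
# A row (u, f) means "u lists f as a friend", i.e. u follows f.
conn.executemany(
    "insert into friends values (?, ?)",
    [
        (1234, 1), (1, 1234),  # 1234 and 1 follow each other (mutual)
        (1234, 2),             # 1234 follows 2, not reciprocated
        (3, 1234),             # 3 follows 1234, not reciprocated
    ],
)

MUTUAL = """
select f.friend_id from friends f
  join friends m on m.user_id = f.friend_id and m.friend_id = f.user_id
 where f.user_id = ?"""

FOLLOWING = """
select f.friend_id from friends f
  left join friends m on m.user_id = f.friend_id and m.friend_id = f.user_id
 where f.user_id = ? and m.user_id is null"""

FOLLOWERS = """
select f.user_id from friends f
  left join friends m on m.user_id = f.friend_id and m.friend_id = f.user_id
 where f.friend_id = ? and m.user_id is null"""


def ids(query, user_id):
    """Run one of the queries and return the matching ids, sorted."""
    return sorted(row[0] for row in conn.execute(query, (user_id,)))


print(ids(MUTUAL, 1234), ids(FOLLOWING, 1234), ids(FOLLOWERS, 1234))  # [1] [2] [3]
```
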
  74. Relational databases can work for the simplest of cases, but are not always the best solution for many graph operations/algorithms.
  75. Graphs and graph databases are only going to become more useful.
  76. However, graph algorithms are hard. So don’t write your own. And make sure you use a persistent storage engine that is best suited for the type of queries you will be performing.
  77. Resources
  78. Resources:
        - The Algorithm Design Manual, Steven S. Skiena
        - Programming Collective Intelligence, Toby Segaran
        - Introduction to Algorithms, Cormen, Leiserson, Rivest
  79. @jperras
  80. Photo Credits:
        - Graph of the internet, circa 2003: http://www.duniacyber.com/freebies/education/what-is-internet-lookslike/ (built from partial troll of public servers using traceroute)
        - My real friends for letting me use their Facebook profile images.
  81. References:
        - Large Scale Graph Algorithms (class lectures), Yuri Lifshits, Steklov Institute of Mathematics at St. Petersburg
        - http://mathworld.wolfram.com/Set.html
        - Programming Collective Intelligence, Toby Segaran
        - The Algorithm Design Manual, Steven S. Skiena
