GDM 2011 Talk

GDM 2011 keynote slides


  1. Horton: online query execution on large distributed graphs
     Sameh Elnikety
     Microsoft Research
  2. People
     Team
     Interns
       Mohamed Sarwat (UMN), Jiaqing Du (EPFL)
  3. Large Graphs Are Important
     Large graphs appear in many application domains
       Examples: social networks, knowledge representation, …
  4. Example: Social Network
     Scale
       LinkedIn: 70 million users
       Facebook: 500 million users, 65 billion photos
     Queries
       Find Alice’s friends
       Find Alice’s photos with friends
     Rich graph
       Types, attributes
  5. Objective: Management & Querying
     Manage and query large graphs online
     Today
       Expensive hardware (limited)
       Cluster of machines (ad hoc)
     Tomorrow
       Growth area
       Analogy: provide system support
       Analogy: DBMS manages application data
  6. Key Design Decisions
     Interactive queries
       Graph stored in random access memory
     Graph too large for one machine
       Distributed problem
       Partitioning: slice graph into pieces
       Replication: for fault tolerance
     Long time to build graph
       Support updates
  7. Three Research Problems
     How to partition the graph
       Static and dynamic techniques
       E.g., evolving-set partitioning
     How to query the graph
       Data model & query language
       Execution engine
       Query optimizations
     How to update the graph
       Multi-version concurrency control
       E.g., Generalized Snapshot Isolation
  8. Concurrency Control
     Generalized Snapshot Isolation (GSI)
     Replicas agree
       On which update transactions commit
       On their commit order
     Example (bank database)
       T1: set balance = $1000
       T2: set balance = $2000
       Replica 1: sees T1 then T2 → balance = $2000
       Replica 2: sees T2 then T1 → balance = $1000
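
The bank example above can be replayed in a few lines. This is a minimal sketch, not Horton's replication code: `apply_in_order` and the transaction tuples are hypothetical names chosen for illustration, and each transaction is treated as a blind write of the balance.

```python
# Hypothetical illustration: each update transaction overwrites the balance.
T1 = ("T1", 1000)
T2 = ("T2", 2000)


def apply_in_order(transactions, balance=0):
    """Apply blind writes in the given commit order and return the final balance."""
    for _name, new_balance in transactions:
        balance = new_balance
    return balance


# Replicas that apply the same commits in different orders diverge.
replica1 = apply_in_order([T1, T2])
replica2 = apply_in_order([T2, T1])

# With one agreed commit order, every replica ends in the same state.
global_order = [T1, T2]
replicas = [apply_in_order(global_order) for _ in range(3)]
```

Here `replica1` and `replica2` end with different balances, while the three entries of `replicas` are identical, which is why the replicas must agree on the commit order.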
  10. Replication
      [Figure: a single graph server compared with Replicas 1 to 3 behind a load balancer]
  11. Read Tx
      [Figure: the load balancer routes read transaction T to one of Replicas 1 to 3]
  12. Update Tx
      [Figure: the load balancer routes update transaction T to one replica; arrows labeled ws connect it to the other replicas]
  13. Replication Middleware
      [Figure: Tx A, Tx B, and Tx C arrive at Replicas 1, 2, and 3; the replication middleware (global ordering) gives every replica the same order A → B → C; durability is marked OFF at each replica and kept at the middleware]
  14. Partitioning – Key Ideas
      Objective
        Maximize temporal and spatial locality
      Static
        Linear time
        Evolving-set partitioning
        Streaming algorithm
        During loading
      Dynamic
        Incremental
        Affinity propagation
  15. Three Research Problems
      How to partition the graph
        Static and dynamic techniques
        E.g., evolving-set partitioning
      How to query the graph
        Data model & query language
        Execution engine
        Query optimizations
      How to update the graph
        Multi-version concurrency control
        E.g., Generalized Snapshot Isolation
  16. Data Model
      Node
        ID, type, attributes
      Edge
        Connects two nodes
        Direction, type, attributes
      [Figure labels: App, Horton]
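
The data model above maps naturally onto two small record types. The sketch below is an assumption about how it could be written down, not Horton's actual schema: the field names `source` and `target` (used to encode direction) and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    type: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: int  # ID of the node the edge leaves (encodes direction)
    target: int  # ID of the node the edge enters
    type: str
    attributes: dict = field(default_factory=dict)


# Sample data (made up): a photo that tags Alice.
alice = Node(1, "person", {"name": "Alice"})
photo = Node(2, "photo")
tag = Edge(source=photo.id, target=alice.id, type="tags")
```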
  17. Query Language
      Trade-off
        Expressiveness vs. efficiency
      Regular language reachability
        Query is a regular expression
        Sequence of node and edge predicates
      Example
        Alice’s photos
        Photo, tags, Alice
        Node: type=photo, edge: type=tags, node: type=person, name=Alice
        Result: matching paths
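
To make "sequence of node and edge predicates" concrete, the sketch below checks one candidate path against the query `Photo, tags, Alice`. Representing predicates and graph elements as dictionaries is an assumption made for illustration; `matches` and `path_matches` are hypothetical helper names.

```python
def matches(predicate, element):
    """True if every key/value pair in the predicate appears in the element."""
    return all(element.get(k) == v for k, v in predicate.items())


# "Photo, tags, Alice" as alternating node and edge predicates.
query = [
    {"type": "photo"},
    {"type": "tags"},
    {"type": "person", "name": "Alice"},
]


def path_matches(query, path):
    """A path (node, edge, node, ...) matches if it satisfies each predicate in turn."""
    return len(query) == len(path) and all(
        matches(p, e) for p, e in zip(query, path)
    )


# Two made-up candidate paths.
path = [
    {"type": "photo", "id": 8},
    {"type": "tags"},
    {"type": "person", "name": "Alice"},
]
other = [
    {"type": "photo", "id": 9},
    {"type": "tags"},
    {"type": "person", "name": "Bob"},
]
```

`path_matches(query, path)` holds, while `other` is rejected by the last node predicate.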
  18. Query Language - Operators
      Projection
        Alice’s photos
        Photo, tags, Alice
        Node: type=photo, edge: type=tags, node: type=person, name=Alice
        SELECT photo FROM photo, tags, Alice
      OR
        (Photo | video), tags, Alice
      Kleene star
        Alice’s org chart
        Alice, (manages, person)*
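
The Kleene star query `Alice, (manages, person)*` asks for everyone reachable from Alice through zero or more `manages` edges. A minimal sketch of that reachability, assuming a made-up `manages` adjacency map and returning the reachable people instead of full paths:

```python
from collections import deque

# Made-up org chart: manager -> direct reports.
manages = {"Alice": ["Bob", "Carol"], "Bob": ["Dave"], "Carol": [], "Dave": []}


def org_chart(root):
    """Everyone reachable from root via zero or more 'manages' edges, in BFS order."""
    seen, queue = [root], deque([root])
    while queue:
        for report in manages.get(queue.popleft(), []):
            if report not in seen:
                seen.append(report)
                queue.append(report)
    return seen


people = org_chart("Alice")
```

Zero repetitions of `(manages, person)` match Alice herself, so she is part of the result.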
  19. Example App: CodeBook - Graph
  20. Example App: CodeBook - Queries
      Person, FileOwner>, TFSFile, FileOwner<, Person
      Person, DiscussionOwner>, Discussion, DiscussionOwner<, Person
      Person, WorkItemOwner>, TFSWorkItem, WorkItemOwner<, Person
      Person, Manages<, Person, Manages>, Person
      Person, WorkItemOwner>, TFSWorkItem, Mentions>, TFSFile, Mentions>, TFSWorkItem, WorkItemOwner<, Person
      Person, WorkItemOwner>, TFSWorkItem, Mentions>, TFSFile, FileOwner<, Person
      Person, FileOwner>, TFSFile, Mentions>, TFSWorkItem, Mentions>, TFSFile, FileOwner<, Person
  21. Query Language - Future Extensions
      From: path
        Sequence of predicates
      To: sub-graph
        Graph pattern
        Sub-graph isomorphism
  22. Talk Outline
      Graph query language
      Graph query execution engine
      Graph query optimizer
      Initial experimental results
      Demo
  26. Execution Engine
      Build an FSM from the input query
      Optimize the FSM
      Execute the FSM using distributed BFS
  27. Centralized Query Execution
      Query: Alice, Tags, Photo
      Breadth First Search
      Answer Paths:
        Alice, Tags, Photo1
        Alice, Tags, Photo8
      [Figure: FSM with states S0 to S3 and transitions Alice, Tags, Photo]
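
The centralized execution above can be sketched as a breadth-first search that advances one query predicate per hop. This is an illustrative sketch under stated assumptions, not Horton's engine: the toy graph is made up, the FSM state is represented simply by the position in the predicate sequence, and `run_query` and `satisfies` are hypothetical names.

```python
from collections import deque

# Toy graph (assumed): node id -> attributes, and outgoing typed edges.
nodes = {
    "alice": {"type": "person", "name": "Alice"},
    "bob": {"type": "person", "name": "Bob"},
    "photo1": {"type": "photo"},
    "photo8": {"type": "photo"},
}
edges = {
    "alice": [("tags", "photo1"), ("tags", "photo8"), ("friendOf", "bob")],
}


def satisfies(pred, attrs):
    return all(attrs.get(k) == v for k, v in pred.items())


def run_query(node_preds, edge_types):
    """BFS over partial paths; a path with i edges is in FSM state i."""
    frontier = deque(
        [n] for n, attrs in nodes.items() if satisfies(node_preds[0], attrs)
    )
    answers = []
    while frontier:
        path = frontier.popleft()
        step = len(path) - 1  # number of edges matched so far
        if step == len(edge_types):  # accepting state reached
            answers.append(path)
            continue
        for etype, target in edges.get(path[-1], []):
            if etype == edge_types[step] and satisfies(
                node_preds[step + 1], nodes[target]
            ):
                frontier.append(path + [target])
    return answers


# "Alice, Tags, Photo"
result = run_query([{"name": "Alice"}, {"type": "photo"}], ["tags"])
```

On this toy graph the search returns the two answer paths from the slide, ending in `photo1` and `photo8`; the `friendOf` edge is pruned because it does not match the edge predicate.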
  28. Distributed Execution Engine
  29. Distributed Query Execution
      Query: Alice, Tags, Photo, Tags, Hillary
      [Figure: graph split across Partition 1 and Partition 2]
  30. Distributed Query Execution
      Query: Alice, Tags, Photo, Tags, Hillary
      [Figure: FSM with states S0 to S5 executed across Partition 1 and Partition 2. Step 1 matches Alice, Step 2 matches Photo1 and Photo8, Step 3 matches Hillary.]
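
The step-by-step execution across partitions can be imitated in a single process. The sketch below is an assumption-laden illustration, not Horton's protocol: the `owner` map, the toy edges, and the idea of routing each extended path to the partition that owns its last node are all chosen for the example.

```python
# Assumed toy data: partition 1 owns Alice and the photos, partition 2 owns Hillary.
owner = {"alice": 1, "photo1": 1, "photo8": 1, "hillary": 2}
nodes = {
    "alice": {"type": "person", "name": "Alice"},
    "photo1": {"type": "photo"},
    "photo8": {"type": "photo"},
    "hillary": {"type": "person", "name": "Hillary"},
}
edges = {
    "alice": [("tags", "photo1"), ("tags", "photo8")],
    "photo1": [("tags", "hillary")],
    "photo8": [("tags", "hillary")],
}


def satisfies(pred, attrs):
    return all(attrs.get(k) == v for k, v in pred.items())


def distributed_bfs(node_preds, edge_types):
    # inbox[p] holds partial paths whose last node is owned by partition p.
    inbox = {1: [], 2: []}
    for n, attrs in nodes.items():
        if satisfies(node_preds[0], attrs):
            inbox[owner[n]].append([n])
    for step, wanted in enumerate(edge_types):
        outbox = {1: [], 2: []}
        for paths in inbox.values():  # each partition expands only its own paths
            for path in paths:
                for etype, target in edges.get(path[-1], []):
                    if etype == wanted:
                        outbox[owner[target]].append(path + [target])
        # The receiving partition checks the node predicate on its local data.
        inbox = {
            part: [p for p in paths if satisfies(node_preds[step + 1], nodes[p[-1]])]
            for part, paths in outbox.items()
        }
    return [p for paths in inbox.values() for p in paths]


# "Alice, Tags, Photo, Tags, Hillary"
paths = distributed_bfs(
    [{"name": "Alice"}, {"type": "photo"}, {"name": "Hillary"}],
    ["tags", "tags"],
)
```

Each loop iteration corresponds to one step in the figure: the frontier stays on partition 1 for the photos and crosses to partition 2 only when the paths reach Hillary.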
  31. Graph Query Optimization
  32. Graph Query Optimization
      Input
        Graph statistics + query plan
      Output
        Optimized query plan (execution sequence)
      Action
        Enumerate execution plans
        Evaluate their costs using graph statistics
        Find the plan with minimum cost
  33. Graph Query Optimization Techniques
      Predicate ordering
      Derived edges
      Fragment reuse
  34. Predicate Ordering
      Find Mike’s photo that is also tagged by at least one of his friends
      Mike, Tagged, Photo, Tagged, Person, FriendOf, Mike
      Execute left to right: total cost = 14
      Execute right to left: total cost = 7
      Different predicate orders can result in different execution costs.
  40. How to Decide Predicate Ordering?
      Enumerate execution sequences of predicates
      Estimate their costs using graph statistics
      Find the sequence with minimum cost
  42. Cost Estimation using Graph Statistics
      Left to right: Mike, Tagged, Photo, Tagged, Person, FriendOf, Mike
      EstimatedCost = 1                    [find Mike]
                    + (1 * 2.2)            [find Mike, Tagged, Photo]
                    + (2.2 * 1.6)          [find Mike, Tagged, Photo, Tagged, Person]
                    + (2.2 * 1.6 * 1.2)    [find Mike, Tagged, Photo, Tagged, Person, FriendOf, Mike]
                    = 10.944 ≈ 11
      [Figure: graph statistics]
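
The estimate above is a running product summed at every step. The sketch below reproduces that arithmetic, reading the factors 2.2, 1.6, and 1.2 as average fan-outs taken from the graph statistics; that reading, and the name `estimate_cost`, are assumptions made for illustration.

```python
def estimate_cost(start_cardinality, fanouts):
    """Sum the estimated number of partial paths after each traversal step."""
    cost = frontier = start_cardinality
    for fanout in fanouts:
        frontier *= fanout  # estimated partial paths after this hop
        cost += frontier
    return cost


# 1 [find Mike], then fan-outs for Tagged, Tagged, FriendOf.
cost = estimate_cost(1, [2.2, 1.6, 1.2])
print(round(cost, 3))
```

The result agrees with the slide's total once rounded to the nearest integer.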
  46. Plan Enumeration
      Find Mike’s photo that is also tagged by at least one of his friends
      Plan1: Mike, Tagged, Photo, Tagged, Person, FriendOf, Mike
      Plan2: Mike, FriendOf, Person, Tagged, Photo, Tagged, Mike
      Plan3: (Mike, FriendOf, Person) ⋈ (Person, Tagged, Photo, Tagged, Mike)
      Plan4: (Mike, FriendOf, Person, Tagged, Photo) ⋈ (Photo, Tagged, Mike)
      …
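
One simple way to generate plans of the shapes listed above is to take both traversal directions and, for each, split the query at every interior node into two fragments that are joined on that node. This is a hedged sketch of such an enumerator, not the optimizer's actual search: `enumerate_plans` is a hypothetical name, and reversing the token list ignores edge direction.

```python
def enumerate_plans(query):
    """query alternates node and edge predicates: node, edge, node, ..."""
    plans = []
    for seq in (query, query[::-1]):  # left to right, then right to left
        plans.append([seq])  # a single traversal
        for i in range(2, len(seq) - 1, 2):  # split at each interior node
            plans.append([seq[: i + 1], seq[i:]])  # two fragments joined on seq[i]
    return plans


query = ["Mike", "Tagged", "Photo", "Tagged", "Person", "FriendOf", "Mike"]
plans = enumerate_plans(query)
for plan in plans:
    print(" ⋈ ".join("(" + ", ".join(fragment) + ")" for fragment in plan))
```

For the seven-token query this yields six plans, including the four shown on the slide; a cost estimate per plan would then pick the cheapest.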
  47. Derived Edges
      Derived edge: Friends-photo-tag
      Mike, Tagged, Photo, Tagged, Person, FriendOf, Mike
        Cost = 7
      Mike, Tagged, Photo, Friends-photo-tag, Mike
        Cost = 5
  48. Query Fragment Reuse
      Query1: Mike, Tagged, Photo
      Query2: Person, FriendOf, Mike
      Query3: Mike, Tagged, Photo, Tagged, Person, FriendOf, Mike
      Execute Query3 based on cached results of Query1 and Query2:
        [Query1], Tagged, Person, FriendOf, Mike
        [Query2], Tagged, Photo, Tagged, Mike
        [Query1] ⋈ [Photo, Tagged, Person] ⋈ [Query2]
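
The first reuse option, `[Query1], Tagged, Person, FriendOf, Mike`, amounts to finding a cached prefix of the new query and executing only the remainder. A minimal sketch of that lookup, assuming a dictionary cache keyed by the query tuple; the cache layout, the cached paths, and `longest_cached_prefix` are hypothetical.

```python
# Hypothetical cache: query (as a tuple) -> previously computed answer paths.
cache = {
    ("Mike", "Tagged", "Photo"): [["Mike", "Photo1"], ["Mike", "Photo8"]],  # made-up results
}


def longest_cached_prefix(query):
    """Return the longest cached prefix of the query that ends on a node predicate."""
    for end in range(len(query), 0, -2):  # prefixes of odd length end on a node
        prefix = tuple(query[:end])
        if prefix in cache:
            return prefix
    return None


query3 = ["Mike", "Tagged", "Photo", "Tagged", "Person", "FriendOf", "Mike"]
prefix = longest_cached_prefix(query3)
remaining = query3[len(prefix):] if prefix else query3
```

Execution would then start from the cached paths of `prefix` and traverse only the predicates in `remaining`.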
  50. Summary of Query Optimization
      Predicate ordering
      Derived edges
      Reuse of query fragments
  51. Experimental Evaluation
      Graph
        CodeBook application
        1 million nodes
        10 million edges
      Machines
        7 machines
        Intel Core 2 Duo 2.26 GHz, 4 GB RAM
        Graph partitioned on 6 machines
  52. Initial Results
  53. Summary
      Graph query language
        Regular language expression
      Graph query execution engine
        Distributed breadth-first search
      Graph query optimizer
        Predicate ordering, derived edges, query fragment reuse
      Implementation
        Initial experimental results
        Demo
  54. Questions?