Pregel:A System for Large-Scale Graph ProcessingGrzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik,James C. Dehnert, Il...
Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Futur...
Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Futur...
Introduction (1/2)Source: SIGMETRICS ’09 Tutorial – MapReduce: The Programming Model and Practice, by Jerry Zhao          ...
Introduction (2/2) Many practical computing problems concern large graphs      Large graph data                        Gr...
Single Source Shortest Path (SSSP) Problem   – Find shortest path from a source node to all target nodes Solution   – Si...
Example: SSSP – Dijkstra’s Algorithm                               1              10          0        2   3           9  ...
Example: SSSP – Dijkstra’s Algorithm                       10           1              10          0        2        3    ...
Example: SSSP – Dijkstra’s Algorithm                       8           1               14              10          0      ...
Example: SSSP – Dijkstra’s Algorithm                       8            1               13              10          0     ...
Example: SSSP – Dijkstra’s Algorithm                       8            1               9              10          0      ...
Example: SSSP – Dijkstra’s Algorithm                       8            1               9              10          0      ...
Single Source Shortest Path (SSSP) Problem   – Find shortest path from a source node to all target nodes Solution   – Si...
MapReduce Execution Overview                   14
Example: SSSP – Parallel BFS in MapReduce Adjacency matrix                                    B                       C  ...
Example: SSSP – Parallel BFS in MapReduce Map input: <node ID, <dist, adj list>>                                B        ...
Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist>                      B                       C<A,...
Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist>                      B                       C<A,...
Example: SSSP – Parallel BFS in MapReduce Reduce output: <node ID, <dist, adj list>>                             B       ...
Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist>                    B                        C<A, ...
Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist>                    B                        C<A, ...
Example: SSSP – Parallel BFS in MapReduce Reduce output: <node ID, <dist, adj list>>       B                       C   = ...
Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Futur...
Computation Model (1/3)             Input                           Supersteps                           (a sequence of it...
Computation Model (2/3)  “Think like a vertex”  Inspired by Valiant’s Bulk Synchronous Parallel model (1990)Source: http...
Computation Model (3/3) Superstep: the vertices compute in parallel   – Each vertex        Receives messages sent in the...
Example: SSSP – Parallel BFS in Pregel                                1              10          0        2   3           ...
Example: SSSP – Parallel BFS in Pregel                                     1                   10              10         ...
Example: SSSP – Parallel BFS in Pregel                       10            1              10          0        2        3 ...
Example: SSSP – Parallel BFS in Pregel                        10            1           11                                ...
Example: SSSP – Parallel BFS in Pregel                       8            1               11              10          0   ...
Example: SSSP – Parallel BFS in Pregel                       8            1           9        11              10         ...
Example: SSSP – Parallel BFS in Pregel                       8            1               9              10          0    ...
Example: SSSP – Parallel BFS in Pregel                       8            1                9              10          0   ...
Example: SSSP – Parallel BFS in Pregel                       8            1               9              10          0    ...
Differences from MapReduce Graph algorithms can be written as a series of chained  MapReduce invocation Pregel   – Keeps...
Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Futur...
C++ API Writing a Pregel program   – Subclassing the predefined Vertex class                                 Override thi...
Example: Vertex Class for SSSP                     39
Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Futur...
System Architecture Pregel system also uses the master/worker model   – Master        Maintains worker        Recovers ...
Execution of a Pregel Program1. Many copies of the program begin executing on a cluster of machines2. The master assigns a...
Fault Tolerance Checkpointing   – The master periodically instructs the workers to save the state of their     partitions...
Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Futur...
Experiments Environment   – H/W: A cluster of 300 multicore commodity PCs   – Data: binary trees, log-normal random graph...
Experiments SSSP – 1 billion vertex binary tree: varying # of worker tasks                                  46
Experiments SSSP – binary trees: varying graph sizes on 800 worker tasks                                 47
Experiments SSSP – Random graphs: varying graph sizes on 800 worker tasks                               48
Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Futur...
Conclusion & Future Work Pregel is a scalable and fault-tolerant platform with an API that is  sufficiently flexible to e...
Thank You!Any questions or comments?
Upcoming SlideShare
Loading in …5
×

2010-Pregel

2,805 views

Published on

0 Comments
6 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
2,805
On SlideShare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
133
Comments
0
Likes
6
Embeds 0
No embeds

No notes for slide

2010-Pregel

  1. 1. Pregel:A System for Large-Scale Graph ProcessingGrzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik,James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz CzajkwoskiGoogle, Inc.SIGMOD ’10 7 July 2010 Taewhi Lee
  2. 2. Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work 2
  3. 3. Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work 3
  4. 4. Introduction (1/2)Source: SIGMETRICS ’09 Tutorial – MapReduce: The Programming Model and Practice, by Jerry Zhao 4
  5. 5. Introduction (2/2) Many practical computing problems concern large graphs Large graph data Graph algorithms Web graph PageRank Transportation routes Shortest path Citation relationships Connected components Social networks Clustering techniques MapReduce is ill-suited for graph processing – Many iterations are needed for parallel graph processing – Materializations of intermediate results at every MapReduce iteration harm performance 5
  6. 6. Single Source Shortest Path (SSSP) Problem – Find shortest path from a source node to all target nodes Solution – Single processor machine: Dijkstra’s algorithm 6
  7. 7. Example: SSSP – Dijkstra’s Algorithm 1 10 0 2 3 9 4 6 5 7 2 7
  8. 8. Example: SSSP – Dijkstra’s Algorithm 10 1 10 0 2 3 9 4 6 5 7 5 2 8
  9. 9. Example: SSSP – Dijkstra’s Algorithm 8 1 14 10 0 2 3 9 4 6 5 7 5 7 2 9
  10. 10. Example: SSSP – Dijkstra’s Algorithm 8 1 13 10 0 2 3 9 4 6 5 7 5 7 2 10
  11. 11. Example: SSSP – Dijkstra’s Algorithm 8 1 9 10 0 2 3 9 4 6 5 7 5 7 2 11
  12. 12. Example: SSSP – Dijkstra’s Algorithm 8 1 9 10 0 2 3 9 4 6 5 7 5 7 2 12
  13. 13. Single Source Shortest Path (SSSP) Problem – Find shortest path from a source node to all target nodes Solution – Single processor machine: Dijkstra’s algorithm – MapReduce/Pregel: parallel breadth-first search (BFS) 13
  14. 14. MapReduce Execution Overview 14
  15. 15. Example: SSSP – Parallel BFS in MapReduce Adjacency matrix B C A B C D E 1 A 10 5 B 1 2 10 A C 4 D 3 9 2 0 2 3 9 4 6 E 7 6 5 7 Adjacency ListA: (B, 10), (D, 5) 2B: (C, 1), (D, 2) D EC: (E, 4)D: (B, 3), (C, 9), (E, 2)E: (A, 7), (C, 6) 15
  16. 16. Example: SSSP – Parallel BFS in MapReduce Map input: <node ID, <dist, adj list>> B C<A, <0, <(B, 10), (D, 5)>>> 1<B, <inf, <(C, 1), (D, 2)>>><C, <inf, <(E, 4)>>> 10 A<D, <inf, <(B, 3), (C, 9), (E, 2)>>><E, <inf, <(A, 7), (C, 6)>>> 0 2 3 9 4 6 5 7 Map output: <dest node ID, dist><B, 10> <D, 5> <A, <0, <(B, 10), (D, 5)>>> 2<C, inf> <D, inf> <B, <inf, <(C, 1), (D, 2)>>> D E<E, inf> <C, <inf, <(E, 4)>>><B, inf> <C, inf> <E, inf> <D, <inf, <(B, 3), (C, 9), (E, 2)>>><A, inf> <C, inf> <E, <inf, <(A, 7), (C, 6)>>> Flushed to local disk!! 16
  17. 17. Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist> B C<A, <0, <(B, 10), (D, 5)>>> 1<A, inf> 10<B, <inf, <(C, 1), (D, 2)>>> A<B, 10> <B, inf> 0 2 3 9 4 6<C, <inf, <(E, 4)>>><C, inf> <C, inf> <C, inf> 5 7<D, <inf, <(B, 3), (C, 9), (E, 2)>>> 2<D, 5> <D, inf> D E<E, <inf, <(A, 7), (C, 6)>>><E, inf> <E, inf> 17
  18. 18. Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist> B C<A, <0, <(B, 10), (D, 5)>>> 1<A, inf> 10<B, <inf, <(C, 1), (D, 2)>>> A<B, 10> <B, inf> 0 2 3 9 4 6<C, <inf, <(E, 4)>>><C, inf> <C, inf> <C, inf> 5 7<D, <inf, <(B, 3), (C, 9), (E, 2)>>> 2<D, 5> <D, inf> D E<E, <inf, <(A, 7), (C, 6)>>><E, inf> <E, inf> 18
  19. 19. Example: SSSP – Parallel BFS in MapReduce Reduce output: <node ID, <dist, adj list>> B C = Map input for next iteration 1 10<A, <0, <(B, 10), (D, 5)>>> Flushed to DFS!!<B, <10, <(C, 1), (D, 2)>>> 10 A<C, <inf, <(E, 4)>>> 0 2 3 9 4 6<D, <5, <(B, 3), (C, 9), (E, 2)>>><E, <inf, <(A, 7), (C, 6)>>> 5 7 Map output: <dest node ID, dist><B, 10> <D, 5> <A, <0, <(B, 10), (D, 5)>>> 5 2<C, 11> <D, 12> <B, <10, <(C, 1), (D, 2)>>> D E<E, inf> <C, <inf, <(E, 4)>>><B, 8> <C, 14> <E, 7> <D, <5, <(B, 3), (C, 9), (E, 2)>>> <E, <inf, <(A, 7), (C, 6)>>> Flushed to local disk!!<A, inf> <C, inf> 19
  20. 20. Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist> B C<A, <0, <(B, 10), (D, 5)>>> 10 1<A, inf> 10<B, <10, <(C, 1), (D, 2)>>> A<B, 10> <B, 8> 0 2 3 9 4 6<C, <inf, <(E, 4)>>><C, 11> <C, 14> <C, inf> 5 7<D, <5, <(B, 3), (C, 9), (E, 2)>>> 5 2<D, 5> <D, 12> D E<E, <inf, <(A, 7), (C, 6)>>><E, inf> <E, 7> 20
  21. 21. Example: SSSP – Parallel BFS in MapReduce Reduce input: <node ID, dist> B C<A, <0, <(B, 10), (D, 5)>>> 10 1<A, inf> 10<B, <10, <(C, 1), (D, 2)>>> A<B, 10> <B, 8> 0 2 3 9 4 6<C, <inf, <(E, 4)>>><C, 11> <C, 14> <C, inf> 5 7<D, <5, <(B, 3), (C, 9), (E, 2)>>> 5 2<D, 5> <D, 12> D E<E, <inf, <(A, 7), (C, 6)>>><E, inf> <E, 7> 21
  22. 22. Example: SSSP – Parallel BFS in MapReduce Reduce output: <node ID, <dist, adj list>> B C = Map input for next iteration 1 8 11<A, <0, <(B, 10), (D, 5)>>> Flushed to DFS!!<B, <8, <(C, 1), (D, 2)>>> 10 A<C, <11, <(E, 4)>>> 0 2 3 9 4 6<D, <5, <(B, 3), (C, 9), (E, 2)>>><E, <7, <(A, 7), (C, 6)>>> 5 7 … the rest omitted … 5 7 2 D E 22
  23. 23. Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work 23
  24. 24. Computation Model (1/3) Input Supersteps (a sequence of iterations) Output 24
  25. 25. Computation Model (2/3)  “Think like a vertex”  Inspired by Valiant’s Bulk Synchronous Parallel model (1990)Source: http://en.wikipedia.org/wiki/Bulk_synchronous_parallel 25
  26. 26. Computation Model (3/3) Superstep: the vertices compute in parallel – Each vertex  Receives messages sent in the previous superstep  Executes the same user-defined function  Modifies its value or that of its outgoing edges  Sends messages to other vertices (to be received in the next superstep)  Mutates the topology of the graph  Votes to halt if it has no further work to do – Termination condition  All vertices are simultaneously inactive  There are no messages in transit 26
  27. 27. Example: SSSP – Parallel BFS in Pregel 1 10 0 2 3 9 4 6 5 7 2 27
  28. 28. Example: SSSP – Parallel BFS in Pregel 1 10 10 0 2 3 9 4 6 5 7 5 2 28
  29. 29. Example: SSSP – Parallel BFS in Pregel 10 1 10 0 2 3 9 4 6 5 7 5 2 29
  30. 30. Example: SSSP – Parallel BFS in Pregel 10 1 11 14 10 8 0 2 3 9 4 6 5 12 7 5 2 7 30
  31. 31. Example: SSSP – Parallel BFS in Pregel 8 1 11 10 0 2 3 9 4 6 5 7 5 7 2 31
  32. 32. Example: SSSP – Parallel BFS in Pregel 8 1 9 11 10 13 0 14 2 3 9 4 6 5 7 15 5 7 2 32
  33. 33. Example: SSSP – Parallel BFS in Pregel 8 1 9 10 0 2 3 9 4 6 5 7 5 7 2 33
  34. 34. Example: SSSP – Parallel BFS in Pregel 8 1 9 10 0 2 3 9 4 6 5 7 13 5 7 2 34
  35. 35. Example: SSSP – Parallel BFS in Pregel 8 1 9 10 0 2 3 9 4 6 5 7 5 7 2 35
  36. 36. Differences from MapReduce Graph algorithms can be written as a series of chained MapReduce invocation Pregel – Keeps vertices & edges on the machine that performs computation – Uses network transfers only for messages MapReduce – Passes the entire state of the graph from one stage to the next – Needs to coordinate the steps of a chained MapReduce 36
  37. 37. Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work 37
  38. 38. C++ API Writing a Pregel program – Subclassing the predefined Vertex class Override this! in msgs out msg 38
  39. 39. Example: Vertex Class for SSSP 39
  40. 40. Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work 40
  41. 41. System Architecture Pregel system also uses the master/worker model – Master  Maintains worker  Recovers faults of workers  Provides Web-UI monitoring tool of job progress – Worker  Processes its task  Communicates with the other workers Persistent data is stored as files on a distributed storage system (such as GFS or BigTable) Temporary data is stored on local disk 41
  42. 42. Execution of a Pregel Program1. Many copies of the program begin executing on a cluster of machines2. The master assigns a partition of the input to each worker – Each worker loads the vertices and marks them as active3. The master instructs each worker to perform a superstep – Each worker loops through its active vertices & computes for each vertex – Messages are sent asynchronously, but are delivered before the end of the superstep – This step is repeated as long as any vertices are active, or any messages are in transit4. After the computation halts, the master may instruct each worker to save its portion of the graph 42
  43. 43. Fault Tolerance Checkpointing – The master periodically instructs the workers to save the state of their partitions to persistent storage  e.g., Vertex values, edge values, incoming messages Failure detection – Using regular “ping” messages Recovery – The master reassigns graph partitions to the currently available workers – The workers all reload their partition state from most recent available checkpoint 43
  44. 44. Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work 44
  45. 45. Experiments Environment – H/W: A cluster of 300 multicore commodity PCs – Data: binary trees, log-normal random graphs (general graphs) Naïve SSSP implementation – The weight of all edges = 1 – No checkpointing 45
  46. 46. Experiments SSSP – 1 billion vertex binary tree: varying # of worker tasks 46
  47. 47. Experiments SSSP – binary trees: varying graph sizes on 800 worker tasks 47
  48. 48. Experiments SSSP – Random graphs: varying graph sizes on 800 worker tasks 48
  49. 49. Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work 49
  50. 50. Conclusion & Future Work Pregel is a scalable and fault-tolerant platform with an API that is sufficiently flexible to express arbitrary graph algorithms Future work – Relaxing the synchronicity of the model  Not to wait for slower workers at inter-superstep barriers – Assigning vertices to machines to minimize inter-machine communication – Caring dense graphs in which most vertices send messages to most other vertices 50
  51. 51. Thank You!Any questions or comments?

×