Pregel - Paper Review


For the course: Advanced Topics in Distributed Computing (KTH)

Pregel: A System for Large-Scale Graph Processing
Paper Review
Maria Stylianou
November 2, 2012

1 Motivation

Large-scale graphs, such as the Web graph and social networks, are among the main sources of new computing problems, and processing them efficiently can be a challenge. MapReduce can be a solution, but a very inefficient one, because the entire state of the graph has to be passed from one stage to the next. Hence, the authors propose Pregel, a distributed programming model designed specifically for processing large-scale graphs while preserving efficiency, scalability and fault-tolerance [1].

2 Contributions

So far, there has been a gap in the area of frameworks for large-scale graph processing that offer scalability while being distributed and fault-tolerant. Pregel is designed with exactly these characteristics. The authors designed Pregel for the Google cluster architecture, in which clusters are interconnected and geographically distributed, and each cluster contains thousands of commodity machines. Their main contributions include:

1. The design of a fault-tolerant distributed programming framework that enables graph algorithms to execute in parallel over thousands of machines.
2. The provision of an API with direct message passing among vertices, combiners for reducing overhead, aggregators for global communication and monitoring, and topology mutations with resolution of conflicting requests.

3 Solution

Pregel operates as a repeated, synchronized computation on vertices. When a graph is given as input, it is divided into partitions, each consisting of a set of vertices and their outgoing edges. The partitions are assigned to machines, and one of the machines acts as the master, coordinating the worker machines. The workers then go through a series of iterations, called supersteps. In every superstep, all vertices in each worker execute the same user-defined function, which can (a) receive messages sent during the previous superstep, (b) modify the state of the vertex and its outgoing edges (vertices and edges are kept on the machines) and (c) send messages to be delivered during the next superstep. At the end of each superstep, a global synchronization point occurs. Vertices can become inactive, and the sequence of iterations terminates when all vertices are inactive and there are no messages in transit. During the computation, the master also sends ping messages to detect worker failures. Because vertices and edges stay on the machines that process them, the network is used only for sending messages, which significantly reduces the communication overhead and makes the system more efficient.

4 Strong Points

S1 Fault-tolerance is achieved with checkpoints, in which the state of the workers' partitions is saved to persistent storage. Upon a machine failure during the computation, the remaining machines reload their partition state from the most recent checkpoint.

S2 Combiners are an optimization that reduces network traffic and have to be enabled explicitly by the user. With this option, several messages intended for the same vertex can be combined and sent as a single message, reducing the overhead.

S3 Aggregators are a mechanism for global communication and monitoring. They have different uses, such as statistics, global coordination or even more advanced implementations.

...

5 Weak Points

W1 The user has to write a considerable amount of code in order to adapt Pregel to his/her needs. More precisely, the user has to write code to enable combiners and to customize aggregators. Additionally, the user is responsible for resolving conflicting requests. He/She needs to define handlers, which increases the complexity of the system.

W2 No failure detection is mentioned for the master, making it a single point of failure.

W3 The evaluation presented in the paper is very limited and comes with very little explanation. There is no clear comparison with other systems. An experimental comparison with MapReduce would be an interesting approach. Also, there is no experiment evaluating the fault-tolerance of the system.

...

References

[1] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, “Pregel: a system for large-scale graph processing,” in Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, SIGMOD ’10, (New York, NY, USA), pp. 135–146, ACM, 2010.
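
As an illustration of the superstep model summarized in Section 3, the following is a minimal single-process sketch in Python. It is not Pregel's API: the names run_supersteps, max_value_compute and combiner are hypothetical and chosen only for this example, and partitioning, the master, checkpoints and aggregators are left out. The sketch assumes that every vertex votes to halt at the end of each superstep and is reactivated only when a message arrives, and it uses maximum-value propagation as the vertex function.

```python
from collections import defaultdict


def max_value_compute(value, messages, superstep):
    """Hypothetical user-defined vertex function.

    Returns (new_value, message), where message is None if the vertex
    has nothing new to tell its neighbours.
    """
    new_value = max([value] + messages)
    if superstep == 0 or new_value > value:
        return new_value, new_value  # first superstep or value grew: notify neighbours
    return new_value, None           # nothing changed: stay silent


def run_supersteps(values, out_edges, compute, combiner=None, max_supersteps=100):
    """Run vertex computations in synchronized supersteps.

    values:    dict mapping vertex -> initial value
    out_edges: dict mapping vertex -> list of target vertices
    Returns (final values, number of supersteps executed).
    """
    values = dict(values)   # work on a copy; the input is left untouched
    inbox = {}              # messages delivered in the current superstep
    active = set(values)    # every vertex starts active
    superstep = 0
    while active and superstep < max_supersteps:
        outbox = defaultdict(list)
        for v in active:
            values[v], message = compute(values[v], inbox.get(v, []), superstep)
            if message is not None:
                for target in out_edges.get(v, ()):
                    outbox[target].append(message)
        if combiner is not None:
            # Combiner: merge all messages for one vertex into a single message.
            outbox = {t: [combiner(msgs)] for t, msgs in outbox.items()}
        # Synchronization point: messages sent now become visible only in the
        # next superstep. All vertices have voted to halt (assumption of this
        # sketch); only vertices with incoming messages are reactivated.
        inbox = dict(outbox)
        active = set(inbox)
        superstep += 1
    return values, superstep
```

Calling run_supersteps with a small strongly connected graph, max_value_compute and combiner=max leaves every vertex holding the largest initial value. The loop ends exactly when no vertex is active and no messages are in transit, which mirrors the termination condition described in Section 3.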