Pregel
Google's graph processing framework.

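  • What is locality? Simply put, it is the property that frequently accessed or related data is stored together, or close by, when data or resources are accessed.
  • Each cluster consists of thousands of commodity PCs organized into racks with high intra-rack bandwidth. Clusters are interconnected but geographically distributed.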
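  • With the (1 - d) form of the PageRank formula, the sum of all PageRanks = number of pages.
  • With the (1 - d)/N form, as in the Compute() code, the sum of all PageRanks = 1.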

Pregel: Presentation Transcript

  • Pregel: A System for Large-Scale Graph Processing. Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (pp. 135-146). ACM.
  • Source: SIGMETRICS ’09 Tutorial – MapReduce: The Programming Model and Practice, by Jerry Zhao 2
  • Outline • Introduction • Computation Model • Writing a Pregel Program • System Implementation • Applications • Experiments • Related Work • Conclusion & Future Work 3
  • The Problem • Many practical computing problems concern large graphs. Large graph data: web graph, transportation routes, citation relationships, social networks. Graph algorithms: PageRank, shortest path, connected components, clustering techniques. • Efficient processing of large graphs is challenging: poor locality of memory access; very little work per vertex; changing degree of parallelism; running over many machines makes the problem worse. 4
  • Want to Process a Large-Scale Graph? The Options: 1. Crafting a custom distributed infrastructure: substantial engineering effort. 2. Relying on an existing distributed platform, e.g. MapReduce: inefficient, because graph state must be stored at each stage, which causes too much communication between stages. 3. Using a single-computer graph algorithm library: not scalable. 4. Using an existing parallel graph system: not fault tolerant. 5
  • Pregel • To overcome these challenges, Google developed Pregel. It provides scalability, fault tolerance, and the flexibility to express arbitrary algorithms. • The high-level organization of Pregel programs is inspired by Valiant’s Bulk Synchronous Parallel model [45]. [45] Leslie G. Valiant, A Bridging Model for Parallel Computation. Comm. ACM 33(8), 1990 6
  • Bulk Synchronous Parallel [Figure: input, a series of supersteps, all vertices vote to halt, output] • Series of iterations (supersteps). • Each vertex V invokes a function in parallel. • Can read messages sent in the previous superstep (S-1). • Can send messages, to be read at the next superstep (S+1). • Can modify the state of outgoing edges. 7
  • Advantages of the Vertex-Centric Approach • Users focus on a local action. • Each item is processed independently. • Ensures that Pregel programs are inherently free of the deadlocks and data races common in asynchronous systems. 8
  • Model of Computation [Figure: input, supersteps, all vote to halt, output] • A directed graph is given to Pregel. • It runs the computation at each vertex. • The computation repeats until all vertices vote to halt. • Pregel gives you a directed graph back. 10
  • Vertex State Machine • Algorithm termination is based on every vertex voting to halt. • In superstep 0, every vertex is in the active state. • A vertex deactivates itself by voting to halt. • It can be reactivated by receiving an (external) message. 11
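The state machine on this slide can be sketched in a few lines. This is only an illustration, not Pregel's implementation: `VertexState`, `OnMessage`, and `ShouldTerminate` are hypothetical names, and the "no messages in transit" condition is the termination rule described in the Pregel paper rather than on the slide itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative model of one vertex's activity flag (names are hypothetical).
class VertexState {
 public:
  bool active() const { return active_; }
  // The vertex has no more work unless a message arrives.
  void VoteToHalt() { active_ = false; }
  // Receiving a message wakes a halted vertex up again.
  void OnMessage() { active_ = true; }

 private:
  bool active_ = true;  // every vertex starts active in superstep 0
};

// The whole computation ends only when every vertex is inactive and no
// messages are still waiting to be delivered.
bool ShouldTerminate(const std::vector<VertexState>& vertices,
                     std::size_t messages_in_transit) {
  if (messages_in_transit > 0) return false;
  for (const VertexState& v : vertices) {
    if (v.active()) return false;
  }
  return true;
}
```

A halted vertex costs nothing in later supersteps, which is why voting to halt early matters for algorithms where most vertices settle quickly.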
  • Example: Finding the largest value in a graph [Figure: four vertices with initial values 3, 6, 2, 1 exchange values over successive supersteps until every vertex holds 6. Blue arrows are messages; blue vertices have voted to halt.] 12
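To make the largest-value example concrete, here is a minimal single-process simulation of the superstep loop. It is a sketch under assumptions: `PropagateMax` is a hypothetical helper, the graph is passed as plain adjacency lists, and the edges used in any call are chosen by the caller since the figure's edges are not recoverable from the transcript.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: every vertex announces its value in superstep 0, then
// forwards a value only when it has just learned a larger one.
std::vector<int> PropagateMax(
    std::vector<int> values,
    const std::vector<std::vector<std::size_t>>& out_edges) {
  const std::size_t n = values.size();
  assert(out_edges.size() == n);
  std::vector<std::vector<int>> inbox(n);
  std::vector<bool> active(n, true);
  for (int superstep = 0;; ++superstep) {
    // A pending message reactivates a vertex that had voted to halt.
    bool any_active = false;
    for (std::size_t v = 0; v < n; ++v) {
      if (!inbox[v].empty()) active[v] = true;
      any_active = any_active || active[v];
    }
    if (!any_active) break;  // all halted, nothing in transit

    std::vector<std::vector<int>> outbox(n);
    for (std::size_t v = 0; v < n; ++v) {
      if (!active[v]) continue;
      bool changed = (superstep == 0);
      for (int m : inbox[v]) {
        if (m > values[v]) {
          values[v] = m;
          changed = true;
        }
      }
      if (changed) {
        for (std::size_t t : out_edges[v]) outbox[t].push_back(values[v]);
      } else {
        active[v] = false;  // vote to halt
      }
    }
    inbox = std::move(outbox);  // delivered at the start of the next superstep
  }
  return values;
}
```

On a strongly connected graph every vertex ends up holding the global maximum; on other graphs a vertex only learns the largest value among the vertices that can reach it.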
  • The C++ API • The user subclasses the predefined Vertex class and writes a Compute() method, which is executed at each active vertex in every superstep. • Can get/set the vertex value: GetValue() / MutableValue() • Can get/set outgoing edge values: GetOutEdgeIterator() • Can send/receive messages: SendMessageTo() to send; received messages are passed to Compute() 14
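  • The C++ API – Vertex Class [Figure: listing of the Vertex class template with callouts: three value types (vertex, edge, message), the method to override, incoming messages, outgoing message.] 15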
  • The C++ API Message passing: void SendMessageTo(const string& dest_vertex, const MessageValue& message); • No guaranteed message delivery order. • Messages are delivered exactly once. • Can send messages to any vertex. • If dest_vertex doesn’t exist, a user-defined handler is called. 16
  • The C++ API Combiners (not active by default): • Sending a message to a vertex on a different machine incurs some overhead. • The user specifies a way to reduce many messages into one value (à la Reduce in MapReduce) by overriding the Combine() method. The operation must be commutative and associative. • Very useful in certain contexts (e.g., a more than fourfold reduction in message traffic for single-source shortest paths). 17
  • The C++ API Aggregators: • A mechanism for global communication, monitoring, and data. Each vertex can provide a value to an aggregator in superstep S; the aggregated value is available to all vertices in superstep S+1. • Aggregators can be used for statistics and for global communication. E.g., a Sum aggregator applied to the out-edge count of each vertex yields the total number of edges in the graph and communicates it to all vertices. 18
  • The C++ API Topology mutations: • Some graph algorithms need to change the graph's topology, e.g. a clustering algorithm may need to replace a cluster with a single vertex. • Vertices can create and destroy vertices and edges at will. • Resolving conflicting requests: Partial ordering: edge removal, vertex removal, vertex addition, edge addition. User-defined handlers: you resolve the remaining conflicts yourself. 19
  • The C++ API Input and output: • Readers/Writers are provided for common formats: text files, vertices in a relational DB, rows in BigTable. • Users can support new inputs/outputs by subclassing the Reader/Writer classes. 20
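A combiner can be pictured as a reduction applied to all messages that share a destination before they leave the sending machine. The sketch below uses `min`, which suits shortest-path messages; `Message` and `CombineMin` are hypothetical names for illustration and are not the Pregel Combine() interface.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// (destination vertex id, value); a stand-in for a real message type.
using Message = std::pair<std::string, int>;

// Collapses all messages bound for the same destination into one message
// carrying the minimum value. min is commutative and associative, so the
// result does not depend on the order in which messages are combined.
std::map<std::string, int> CombineMin(const std::vector<Message>& outgoing) {
  std::map<std::string, int> combined;
  for (const Message& m : outgoing) {
    auto it = combined.find(m.first);
    if (it == combined.end()) {
      combined.emplace(m.first, m.second);
    } else {
      it->second = std::min(it->second, m.second);
    }
  }
  return combined;
}
```

For example, three candidate distances 5, 3, and 9 bound for the same vertex leave the machine as the single value 3, so network traffic scales with the number of distinct destinations rather than the number of messages.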
  • Implementation • Pregel was designed for the Google cluster architecture. • Persistent data is stored as files on a distributed storage system such as GFS or BigTable. • Temporary data is stored on local disk. • Vertices are assigned to machines based on their vertex ID ( hash(ID) ), so any machine can determine which partition a vertex belongs to. 22
  • System Architecture • The executable is copied to many machines. • One machine becomes the Master: it coordinates the workers, recovers from worker failures, and provides a web UI for monitoring job progress. • The other machines become Workers: each processes its own task and communicates with the other workers. 23
  • Pregel Execution 1. The user program is copied to the machines. 2. One machine becomes the master. The other machines find the master through the name service and register with it. The master determines how many partitions the graph has. 3. The master assigns one or more partitions and a portion of the user input to each worker. 4. The workers run the Compute() function for active vertices and send messages asynchronously. There is one thread per partition in each worker. When the superstep finishes, workers tell the master how many vertices will be active in the next superstep. 24
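Hash-based assignment means any machine can compute a vertex's partition locally, without a lookup table. The sketch below assumes the partition is the hash taken modulo the number of partitions; `PartitionFor` is a hypothetical name and `std::hash` merely stands in for whatever hash function a real system would use.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Maps a vertex ID to a partition index in [0, num_partitions).
// Every caller that uses the same hash function and partition count gets the
// same answer, so messages can be routed without asking a central directory.
std::size_t PartitionFor(const std::string& vertex_id,
                         std::size_t num_partitions) {
  assert(num_partitions > 0);
  return std::hash<std::string>{}(vertex_id) % num_partitions;
}
```

Note that `std::hash` is only guaranteed to be stable within one program run; a distributed system would need a hash that is identical across machines and runs.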
  • Source: http://www.cnblogs.com/huangfox/archive/2013/01/03/2843103.html 25
  • Fault Tolerance • Checkpointing: the master periodically instructs the workers to save the state of their partitions (e.g., vertex values, edge values, incoming messages) to persistent storage. • Failure detection: using regular “ping” messages. • Recovery: the master reassigns graph partitions to the currently available workers, and all workers reload their partition state from the most recent available checkpoint. 26
  • Application – Page Rank • A = a given page • T1 … Tn = pages that point to page A (citations) • d = damping factor between 0 and 1 (usually 0.85) • C(T) = number of links going out of T • PR(A) = the PageRank of page A: $PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \frac{PR(T_2)}{C(T_2)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)$ 28
  • Application – Page Rank Source: Wikipedia 29
  • Application – Page Rank
    // Vertex<double, void, double>: the vertex value stores the PageRank and
    // messages carry it; edges have no value.
    class PageRankVertex : public Vertex<double, void, double> {
     public:
      virtual void Compute(MessageIterator* msgs) {
        if (superstep() >= 1) {
          double sum = 0;
          for (; !msgs->Done(); msgs->Next())
            sum += msgs->Value();
          *MutableValue() = 0.15 / NumVertices() + 0.85 * sum;
        }
        if (superstep() < 30) {
          const int64 n = GetOutEdgeIterator().size();
          SendMessageToAllNeighbors(GetValue() / n);
        } else {
          VoteToHalt();
        }
      }
    };
    For convergence, either there is a limit on the number of supersteps or aggregators are used to detect convergence. 30
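The same update rule can be tried outside Pregel with a small single-process loop. This is a sketch, not the Pregel program: `PageRank` is a hypothetical free function, the graph is plain adjacency lists, every rank starts at 1/N, and vertices without out-edges simply send nothing (an assumption; real implementations treat such dangling vertices in various ways).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Runs a fixed number of rank updates, mirroring the 0.15/N + 0.85 * sum rule.
std::vector<double> PageRank(
    const std::vector<std::vector<std::size_t>>& out_edges,
    int iterations = 30, double d = 0.85) {
  const std::size_t n = out_edges.size();
  assert(n > 0);
  std::vector<double> rank(n, 1.0 / static_cast<double>(n));
  for (int s = 0; s < iterations; ++s) {
    // "Send" phase: each vertex splits its rank evenly over its out-edges.
    std::vector<double> incoming(n, 0.0);
    for (std::size_t v = 0; v < n; ++v) {
      if (out_edges[v].empty()) continue;
      const double share = rank[v] / static_cast<double>(out_edges[v].size());
      for (std::size_t t : out_edges[v]) incoming[t] += share;
    }
    // "Compute" phase: combine the damping term with the received shares.
    for (std::size_t v = 0; v < n; ++v) {
      rank[v] = (1.0 - d) / static_cast<double>(n) + d * incoming[v];
    }
  }
  return rank;
}
```

When every vertex has at least one out-edge, the ranks keep summing to 1 after each update, which is a quick sanity check on any implementation of this form.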
  • Application – Shortest Path
    // INF: a constant larger than any feasible distance.
    class ShortestPathVertex : public Vertex<int, int, int> {
      void Compute(MessageIterator* msgs) {
        // In the 1st superstep, only the source vertex will update its value
        // (from INF to zero).
        int mindist = IsSource(vertex_id()) ? 0 : INF;
        for (; !msgs->Done(); msgs->Next())
          mindist = min(mindist, msgs->Value());
        if (mindist < GetValue()) {
          *MutableValue() = mindist;
          OutEdgeIterator iter = GetOutEdgeIterator();
          for (; !iter.Done(); iter.Next())
            SendMessageTo(iter.Target(), mindist + iter.GetValue());
        }
        VoteToHalt();
      }
    };
    31
  • Example: SSSP in Pregel [Figures: a step-by-step trace of single-source shortest paths on a small weighted directed graph, with tentative distances updated superstep by superstep; the graph layout is not recoverable from the transcript.] 32-40
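The superstep-by-superstep behaviour of SSSP can be reproduced with a small single-process simulation. This is a sketch under assumptions: `Edge`, `kInf`, and `ShortestPaths` are hypothetical names, edge weights are assumed non-negative, and the graph in any call is supplied by the caller rather than taken from the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

struct Edge {
  std::size_t target;
  int weight;  // assumed non-negative
};

// Stands in for INF: larger than any feasible distance.
constexpr int kInf = std::numeric_limits<int>::max();

std::vector<int> ShortestPaths(const std::vector<std::vector<Edge>>& graph,
                               std::size_t source) {
  const std::size_t n = graph.size();
  assert(source < n);
  std::vector<int> dist(n, kInf);
  std::vector<std::vector<int>> inbox(n);
  bool first_superstep = true;
  bool messages_sent = true;
  while (messages_sent) {
    messages_sent = false;
    std::vector<std::vector<int>> outbox(n);
    for (std::size_t v = 0; v < n; ++v) {
      // After superstep 0, only vertices that received a message are active.
      if (!first_superstep && inbox[v].empty()) continue;
      int mindist = (v == source) ? 0 : kInf;
      for (int m : inbox[v]) mindist = std::min(mindist, m);
      if (mindist < dist[v]) {
        dist[v] = mindist;  // mindist is finite here, so the sum cannot overflow from kInf
        for (const Edge& e : graph[v]) {
          outbox[e.target].push_back(mindist + e.weight);
          messages_sent = true;
        }
      }
      // Every vertex votes to halt at the end of its Compute step.
    }
    inbox = std::move(outbox);
    first_superstep = false;
  }
  return dist;
}
```

Vertices that are unreachable from the source keep the value `kInf`, and the loop stops as soon as a superstep sends no messages.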
  • Experiments • 300 multicore commodity PCs used. • Only running time is counted; checkpointing is disabled. • Measures scalability with the number of worker tasks. • Measures scalability w.r.t. the number of vertices, in binary trees and log-normal random graphs. • Naïve single-source shortest paths (SSSP) implementation, with all edge weights = 1. 42
  • SSSP - 1 billion vertex binary tree: the number of Pregel workers varies from 50 to 800. Runtime drops from 174 s to 17.3 s: 16 times as many workers give a speedup of about 10. 43
  • SSSP – binary trees: varying graph sizes on 800 worker tasks. Runtime grows from 17.3 s to 702 s. For graphs with a low average outdegree, the runtime increases linearly in the graph size. 44
  • SSSP – log-normal random graphs (mean outdegree = 127.1): varying graph sizes on 800 worker tasks. The runtime increases linearly in the graph size here, too. 45
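The speedup figure follows directly from the two reported runtimes. In the worked arithmetic below, the symbols $T_{50}$, $T_{800}$, $S$, and $E$ are notation introduced here for illustration (runtime with 50 and 800 workers, speedup, and parallel efficiency); the efficiency figure is derived, not reported on the slide.

```latex
\[
S = \frac{T_{50}}{T_{800}} = \frac{174\ \text{s}}{17.3\ \text{s}} \approx 10.1,
\qquad
E = \frac{S}{800 / 50} = \frac{10.1}{16} \approx 0.63
\]
```

So each worker in the 800-worker run is used roughly 63% as effectively as in the 50-worker run.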
  • Related Work • MapReduce: Pregel is similar in concept to MapReduce, but with a natural graph API and much more efficient support for iterative computations over the graph. • Bulk Synchronous Parallel model: implementations include the Oxford BSP Library [38], Green BSP library [21], BSPlib [26] and the Paderborn University BSP library. Their scalability and fault tolerance have not been evaluated beyond several dozen machines, and none of them provides a graph-specific API. 47
  • Related Work • The closest matches to Pregel are: Parallel Boost Graph Library [22], [23] (unlike it, Pregel provides fault tolerance) and CGMgraph [8] (object-oriented programming style at some performance cost). • Few systems have reported experimental results for graphs at the scale of billions of vertices. 48
  • Conclusion & Future Work • Pregel is a scalable and fault-tolerant platform with an API that is sufficiently flexible to express arbitrary graph algorithms. • Future work: relaxing the synchronicity of the model, so as not to wait for slower workers at inter-superstep barriers; assigning vertices to machines to minimize inter-machine communication; handling dense graphs in which most vertices send messages to most other vertices. 50
  • Comment • No comparison with other systems. • The user has to modify Pregel a lot in order to personalize it to his/her needs. • No failure detection is mentioned for the master, making it a single point of failure. 51
  • Any questions? THANK YOU 52