
Published on

http://strataconf.com/strata2014/public/schedule/detail/32137

Graph analytics has applications beyond large web-scale organizations. Many computing problems can be efficiently expressed and processed as a graph, leading to useful insights that drive product and business decisions.

While you can express graph algorithms as SQL queries in Hive or as Hadoop MapReduce programs, an API designed specifically for graph processing makes many iterative graph computations (such as PageRank, connected components, label propagation, and graph-based clustering) simpler and easier to express and understand. Apache Giraph provides such a native graph processing API, runs on existing Hadoop infrastructure, and can directly access HDFS and/or Hive tables.

This talk describes our efforts at Facebook to scale Apache Giraph to very large graphs of up to one trillion edges and how we run Apache Giraph in production. We will also talk about several algorithms that we have implemented and their use cases.


- 1. Graph Analysis with One Trillion Edges on Apache Giraph. Avery Ching, Facebook. Strata, 2/13/2014
- 2. Motivation
- 3. Apache Giraph • Inspired by Google's Pregel, but runs on Hadoop • "Think like a vertex" • Maximum-value vertex example (diagram: two processors exchange vertex values across supersteps until every vertex holds the maximum value, 5)
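The maximum-value example on this slide can be simulated outside of Giraph. Below is an illustrative Python sketch (Giraph itself is Java) of the "think like a vertex" loop: each vertex keeps the largest value it has seen, forwards updates to its neighbors, and effectively votes to halt when nothing changes. The function name and data layout are mine, not Giraph's API.

```python
def max_value(values, edges):
    """values: {vertex: initial value}; edges: {vertex: [out-neighbors]}."""
    values = dict(values)
    inbox = {v: [values[v]] for v in values}   # superstep 0: every vertex is active
    superstep = 0
    changed = True
    while changed:
        changed = False
        outbox = {v: [] for v in values}
        for v, msgs in inbox.items():
            if not msgs:
                continue                        # halted vertex with no messages
            best = max(msgs)
            if superstep == 0 or best > values[v]:
                values[v] = max(values[v], best)
                for n in edges.get(v, []):      # send my value to all neighbors
                    outbox[n].append(values[v])
                changed = True
        inbox = outbox
        superstep += 1
    return values
```

On the slide's three-vertex example, the value 5 propagates around the graph in a few supersteps until all vertices hold it.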
- 4. Giraph on Hadoop / YARN (diagram: Giraph runs over MapReduce on Hadoop 0.20.x, 0.20.203, and 1.x, and over YARN on Hadoop 2.0.x)
- 5. Apache Giraph data flow (diagram): the master assigns input splits to workers, which load the graph through the input format into in-memory partitions; each superstep, workers compute and send messages, then send stats to the master to decide whether to iterate; finally, workers write their partitions through the output format
- 6. Beyond Pregel: sharded aggregators, master computation, composable computation
- 7. Use case: k-means clustering. Cluster input vectors into k clusters • Assign each input vector to the closest centroid • Update centroid locations based on assignments (diagram: random centroid locations, assignment to centroids c0-c2, centroid update)
- 8. k-means in Giraph: partitioning the problem. Input vectors → vertices • Partitioned across machines. Centroids → aggregators • Shared data across all machines. Problem solved... right?
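To make the vertices-plus-aggregators split concrete, here is a minimal Python sketch of one k-means round in this style (the real Giraph implementation is Java; the function and the list-based "aggregation" here are illustrative stand-ins for Giraph aggregators):

```python
def kmeans_superstep(vectors, centroids):
    """One assign + update round; vectors and centroids are (x, y) tuples."""
    k = len(centroids)
    sums = [[0.0, 0.0] for _ in range(k)]
    counts = [0] * k
    # Vertex compute: assign each input vector to the closest centroid and
    # contribute to that centroid's running sum (an aggregator in Giraph).
    for x, y in vectors:
        c = min(range(k),
                key=lambda i: (x - centroids[i][0]) ** 2 + (y - centroids[i][1]) ** 2)
        sums[c][0] += x
        sums[c][1] += y
        counts[c] += 1
    # Master side: update centroid locations from the aggregated sums.
    return [((sums[i][0] / counts[i], sums[i][1] / counts[i])
             if counts[i] else centroids[i])
            for i in range(k)]
```

Iterating this function until the centroids stop moving is the whole algorithm; the next slides show why the naive aggregator approach breaks at Facebook scale.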
- 9. Problem 1: Massive dimensions. Cluster Facebook members by friendships? • 1 billion members (dimensions) • k clusters. Each worker sends the master up to 1B * 2 bytes (enough for a friend count capped at 5k) * k = 2k GB; the master receives up to 2k * workers GB • Saturated network link • OOM
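A quick sanity check of the slide's arithmetic (the function names are mine, chosen only for this illustration):

```python
# Each worker ships one aggregator value per dimension per cluster to the
# master: 1B dimensions * 2 bytes (a friend count capped at 5,000 fits in
# 2 bytes) * k clusters.
dimensions = 1_000_000_000
bytes_per_dim = 2

def per_worker_gb(k):
    """GB sent by one worker to the master for k clusters."""
    return dimensions * bytes_per_dim * k / 1e9

def master_gb(k, workers):
    """GB the master receives in the worst case."""
    return per_worker_gb(k) * workers
```

Even a modest k and a few hundred workers put the master in terabyte territory, which is why the next slide shards the aggregators.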
- 10. Sharded aggregators. Instead of the master handling all aggregators, each aggregator is sharded to an owning worker: workers send their partial values to the owner, the owner combines them into the final value, and the final value is distributed back to the master and the other workers (diagram: partial and final aggregator flow across master and workers) • Shares aggregator load across workers • Future work: tree-based optimizations (not yet a problem)
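The sharding idea can be sketched in a few lines of Python (illustrative only; Giraph's actual implementation is Java and distributes this over the network). Each aggregator name is hashed to an owning worker, owners combine the partial values, and the combined finals are what would be broadcast back:

```python
def sharded_aggregate(partials, num_workers, combine=sum):
    """partials: {worker_id: {agg_name: partial_value}}.
    Each aggregator is owned by hash(name) % num_workers; the owner
    combines all partials for that name into the final value."""
    owned = {w: {} for w in range(num_workers)}
    for worker, aggs in partials.items():
        for name, value in aggs.items():
            owner = hash(name) % num_workers       # route partial to its owner
            owned[owner].setdefault(name, []).append(value)
    finals = {}
    for worker, aggs in owned.items():             # owners combine their shard
        for name, vals in aggs.items():
            finals[name] = combine(vals)           # then broadcast to everyone
    return finals
```

The point of the routing step is balance: no single node touches every aggregator, so the 2k GB from the previous slide is spread across the cluster.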
- 11. Problem 2: Edge-cut metric. Clusters should reduce the number of cut edges. Two phases • Send all out-edges your cluster id • Aggregate edges with different cluster ids. Calculate no more than once an hour?
- 12. Master computation: serial computation on the master • Communicates with workers via aggregators • Added to Giraph by the Stanford GPS team (diagram: over time, the master interleaves k-means supersteps with start-cut/end-cut supersteps across the workers)
- 13. Problem 3: More phases, more problems. Add a stage to initialize the centroids: add random input vectors to centroids • Add a few random friends. Two phases • Randomly sample input vertices to add • Send messages to a few random neighbors
- 14. Problem 3 (continued): a single vertex compute method cannot easily support different message types and combiners per phase, and the code gets messy: if (phase == INITIALIZE_SELF) // Randomly add to centroid; else if (phase == INITIALIZE_FRIEND) // Add my vector to centroid if a friend selected me; else if (phase == K_MEANS) // Do k-means; else if (phase == START_EDGE_CUT) ...
- 15. Composable computation: decouple the vertex from the computation. The master sets the computation and combiner classes per superstep; computations are reusable and composable:

  | Computation | In message | Out message | Combiner |
  |---|---|---|---|
  | Add random centroid / random friends | Null | Centroid message | N/A |
  | Add to centroid | Centroid message | Null | N/A |
  | K-means | Null | Null | N/A |
  | Start edge cut | Null | Cluster | Cluster combiner |
  | End edge cut | Cluster | Null | N/A |
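The structural shift can be sketched as follows (a hypothetical Python toy, not Giraph's Java API): instead of one compute method full of phase branches, each phase is a small computation object, and the master picks which class runs each superstep.

```python
# Each phase is its own tiny, reusable computation class.
class KMeans:
    def compute(self, vertex, messages):
        return f"k-means on {vertex}"

class StartEdgeCut:
    def compute(self, vertex, messages):
        return f"start edge cut on {vertex}"

# Illustrative master-defined plan: which computation runs on which superstep.
SCHEDULE = [KMeans, KMeans, StartEdgeCut]

def run_superstep(superstep, vertices):
    computation = SCHEDULE[superstep % len(SCHEDULE)]()  # master sets the class
    return [computation.compute(v, []) for v in vertices]
```

Because the computation (and, in Giraph, the message type and combiner) is chosen per superstep, no phase needs to know about any other.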
- 16. Composable computation (cont.): balanced label propagation • Compute candidates to move to partitions • Probabilistically move vertices • Continue if the halting condition is not met (e.g., < n vertices moved?)
- 17. Composable computation (cont.): affinity propagation • Calculate and send responsibilities • Calculate and send availabilities • Update exemplars • Continue if the halting condition is not met (e.g., < n vertices changed exemplars?)
- 18. Faster than Hive?

  | Application | Graph size | CPU time speedup | Elapsed time speedup |
  |---|---|---|---|
  | Page rank (single iteration) | 400B+ edges | 26x | 120x |
  | Friends-of-friends score | 71B+ edges | 12.5x | 48x |
- 19. Apache Giraph scalability (charts: runtime stays close to the ideal line as workers scale from 50 to 300 on a 200B-edge graph, and as edges scale from 1E+09 to 2E+11 on 50 workers)
- 20. A billion edges isn’t cool. You know what’s cool? A TRILLION edges.
- 21. Page rank on 200 machines with 1 trillion (1,000,000,000,000) edges: < 4 minutes / iteration! * Results from 6/30/2013 with one-to-all messaging + request processing improvements
- 22. Why balanced partitioning: random partitioning == good balance, BUT it ignores entity affinity (diagram: related vertices scattered across partitions)
- 23. Balanced partitioning application. Results from one service: cache hit rate grew from 70% to 85%, bandwidth cut in half (diagram: related vertices co-located in the same partitions)
- 24. Balanced label propagation results * Loosely based on Ugander and Backstrom, "Balanced label propagation for partitioning massive graphs," WSDM '13
- 25. Avoiding out-of-core. Example: mutual-friends calculation between neighbors: 1. Send your friends a list of your friends 2. Intersect with your friend list. At scale: 1.23B members (as of 1/2014), 200+ average friends (2011 S1), 8-byte ids (longs) = 394 TB of messages; at 100 GB per machine that is 3,940 machines (not including the graph)
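The slide's back-of-the-envelope numbers check out, and the two supersteps themselves fit in a few lines. Below is a Python sketch (the constants come from the slide; the `mutual_friends` helper is my illustration, not production code):

```python
# Every member ships its friend list (avg_friends ids at 8 bytes each)
# to each of its ~avg_friends friends.
members = 1_230_000_000                # 1.23B members as of 1/2014
avg_friends = 200                      # 200+ average friends (2011 S1)
id_bytes = 8                           # 8-byte ids (longs)
message_tb = members * avg_friends * avg_friends * id_bytes / 1e12  # ~394 TB
machines = message_tb * 1e12 / (100 * 1e9)  # ~3,940 at 100 GB per machine

# A tiny model of the two supersteps (friends: {vertex: set of friends}):
def mutual_friends(friends):
    inbox = {v: [] for v in friends}
    for v, fs in friends.items():          # superstep 1: send friend lists
        for f in fs:
            inbox[f].append((v, fs))
    return {v: {src: fs & friends[v]       # superstep 2: intersect with own list
                for src, fs in msgs}
            for v, msgs in inbox.items()}
```

The 394 TB is transient message state, which is exactly what superstep splitting on the next slide bounds.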
- 26. Superstep splitting: process subsets of source/destination edges per superstep (diagram: with sources and destinations each split into halves A and B, the four on/off combinations run across four supersteps) * Currently manual; future work: automatic!
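A minimal Python sketch of the splitting idea (the partitioning by hash and the function name are mine; Giraph's mechanism is configured differently): vertices are divided into groups, and each (source group, destination group) pair gets its own superstep, so only a fraction of the messages is in flight at once.

```python
from itertools import product

def superstep_splitting(edges, splits=2):
    """edges: list of (src, dst). Partition vertices into `splits` groups;
    return one edge batch per (source group, destination group) pair, so
    with splits=2 each superstep carries ~1/4 of the messages."""
    group = lambda v: hash(v) % splits
    plan = []
    for s, d in product(range(splits), range(splits)):
        batch = [(u, v) for u, v in edges if group(u) == s and group(v) == d]
        plan.append(batch)
    return plan
```

Every edge lands in exactly one batch, so running the batches in sequence produces the same messages as one giant superstep, at a fraction of the peak memory.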
- 27. Debugging with GiraphicJam
- 28. Giraph in production: over 1.5 years in production • Over 100 jobs processed a week • 30+ applications in our internal application repository • Sample production job: 700B+ edges. Very stable • Checkpointing disabled (highly loaded HDFS adds instability) • Retries handle intermittent failures
- 29. Giraph roadmap: 2/12 - 0.1 • 5/13 - 1.0 • Spring 2014 - 1.1 • Relaxing BSP - 1.2? (Giraph++ from IBM Research, Giraphx from the University at Buffalo, SUNY)
- 30. Future work: evaluate alternative computing models • Performance • Lower the barrier to entry • Applications
- 31. Our team: Pavan Athivarapu, Avery Ching, Maja Kabiljo, Greg Malewicz, Sambavi Muthukrishnan
