The Thorny Path to Large-Scale Graph Processing
Zinoviev Alexey
About 
• I am a <graph theory, machine learning, traffic jam prediction, BigData algorithms> scientist 
• But I'm a <Java, JavaScript, Android, NoSQL, Hadoop, Spark> programmer
BigData & Graph Theory 
Big Data of old times 
• Astronomy 
• Weather 
• Trading 
• Sea routes 
• Battles
And now ... 
• Web graph 
• Facebook friend network 
• Gmail email graph 
• EU road network 
• Citation graph 
• PayPal transaction graph
Graph                    | Vertices    | Edges       | Volume | Data per day
Web graph                | 1.5 * 10^12 | 1.2 * 10^13 | 100 PB | 300 TB
Facebook (friends graph) | 1.1 * 10^9  | 160 * 10^9  | 1 PB   | 15 TB
EU road graph            | 18 * 10^6   | 42 * 10^6   | 20 GB  | 50 MB
Road graph of this city  | 250 000     | 460 000     | 500 MB | 100 KB
Problems 
• Popularity rank (PageRank) 
• Determining popular users, news, jobs, etc. 
• Shortest paths 
• Max flow 
• How are users, groups connected? 
• Clustering, semi-clustering 
• Max clique, triangle closure, label propagation algorithms 
• Finding related people, groups, interests
Node Centrality Problem 
• Vertices with high impact 
• Removal of important vertices reduces reliability 
Cases: 
• Bioinformatics 
• Social connections 
• Road network 
• Spam detection 
• Recommendation system
Small World Problem 
Network                  | Avg. path length | Users | Edges
Facebook                 | 4.74             | 712 M | 69 G
Twitter                  | 3.67             | ----  | 5 G follows
MSN Messenger (1 month)  | 6.6              | 180 M | 1.3 G arcs
Large graph processing tools 
Think like a vertex… 
• The majority of graph algorithms are iterative and traverse the graph in some way 
• Classic MapReduce overheads: job startup/shutdown, reloading data from HDFS, shuffling 
• High complexity of reducing graph problems to the key-value model 
• Iterative algorithms turn into multiple chained M/R jobs, with the full state saved and re-read between iterations
Why not use MapReduce/Hadoop? 
• Example: PageRank, Google's famous algorithm for measuring the authority of a webpage based on the underlying network of hyperlinks 
• Defined recursively: each vertex distributes its authority to its neighbors in equal proportions
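For reference, the standard form of this recursion (damping factor d, N vertices; the notation is the usual one, not taken from the slide) is:

PR(v) = (1 - d) / N + d * Σ_{u -> v} PR(u) / outdeg(u)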
Google Pregel 
• Distributed system especially developed for large scale graph 
processing 
• Bulk Synchronous Parallel (BSP) as execution model 
• Supersteps are atomic units of parallel computation 
• Any superstep can be restarted from a checkpoint (need not be user 
defined) 
• A new superstep provides an opportunity for rebalancing of 
components among available resources
Superstep in BSP
Vertex-centric BSP 
• Each vertex has an id, a value, a list of its adjacent vertex ids and the 
corresponding edge values 
• Each vertex is invoked in each superstep, can recompute its value and 
send messages to other vertices, which are delivered over superstep 
barriers 
• Advanced features: termination votes, combiners, aggregators, topology mutations
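A minimal, self-contained Java sketch of this vertex-centric model, using a toy in-memory graph and PageRank as the per-vertex computation (illustrative only; this is not the Pregel or Giraph API):

```java
import java.util.*;

// Toy, single-machine illustration of vertex-centric BSP PageRank.
// Assumptions: unweighted directed graph as adjacency lists, a fixed number
// of supersteps, damping factor 0.85. Not the actual Pregel/Giraph API.
public class VertexCentricPageRank {

    public static void main(String[] args) {
        // vertex id -> outgoing neighbor ids (made-up 3-vertex graph)
        Map<Integer, List<Integer>> adj = Map.of(
            0, List.of(1, 2),
            1, List.of(2),
            2, List.of(0));

        int n = adj.size();
        double d = 0.85;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);

        // Each superstep: every vertex sends rank/outdeg to its neighbors,
        // then recomputes its value from the received messages. The barrier
        // between the send and recompute phases is implicit here.
        for (int superstep = 0; superstep < 30; superstep++) {
            double[] incoming = new double[n];           // message "inbox"
            for (Map.Entry<Integer, List<Integer>> e : adj.entrySet()) {
                double share = rank[e.getKey()] / e.getValue().size();
                for (int target : e.getValue()) {
                    incoming[target] += share;           // sendMessage(target, share)
                }
            }
            for (int v = 0; v < n; v++) {
                rank[v] = (1 - d) / n + d * incoming[v]; // compute() over messages
            }
        }
        System.out.println(Arrays.toString(rank));
    }
}
```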
C++ API, Pregel
Apache Giraph 
Why Apache Giraph 
Pregel is proprietary, but: 
• Apache Giraph is an open source implementation of Pregel 
• Runs on standard Hadoop infrastructure 
• Computation is executed in memory 
• Can be a job in a pipeline (MapReduce, Hive) 
• Uses Apache ZooKeeper for synchronization
Why Apache Giraph 
• No locks: message-based communication 
• No semaphores: global synchronization 
• Iteration isolation: massively parallelizable
ZooKeeper in Apache Giraph 
ZooKeeper: responsible for 
computation state 
• Partition/worker mapping 
• Global state: superstep 
• Checkpoint paths, aggregator 
values, statistics
Master in Apache Giraph 
Master: responsible for coordination 
• Assigns partitions to workers 
• Coordinates synchronization 
• Requests checkpoints 
• Aggregates aggregator values 
• Collects health statuses
Worker in Apache Giraph 
Worker: responsible for vertices 
• Invokes active vertices' compute() functions 
• Sends, receives and assigns 
messages 
• Computes local aggregation 
values
Scaling Giraph to a trillion edges
Fault tolerance 
No single point of failure from Giraph threads 
• With multiple master threads, if the current master dies, a new 
one will automatically take over. 
• If a worker thread dies, the application is rolled back to a 
previously checkpointed superstep. 
• If a ZooKeeper server dies, the application can proceed as long as a quorum remains 
Hadoop single points of failure still exist (NameNode, JobTracker)
Worker Scalability, 250m nodes
Vertex scalability, 300 workers
Vertex/workers scalability
MapReduce vs Giraph 
6 machines, each with 2 x 8-core Opteron CPUs, 4 x 1 TB disks and 32 GB RAM; 1 Giraph worker per core 
Wikipedia page link graph (6 million vertices, 200 million edges) 
PageRank on Hadoop/Mahout 
• 10 iterations approx. 29 minutes 
• average time per iteration: approx. 3 minutes 
PageRank on Giraph 
• 30 iterations took approx. 15 minutes 
• average time per iteration: approx. 30 seconds 
10x performance improvement
Okapi 
• Apache Mahout for graphs 
• Graph-based recommenders: ALS, 
SGD, SVD++, etc. 
• Graph analytics: Graph 
partitioning, Community Detection, 
K-Core, etc.
Giraph’s killer
Spark 
• MapReduce in memory 
• Up to 50x faster than Hadoop 
• Support for Shark (like Hive), MLlib 
(Machine learning), GraphX (graph 
processing) 
• RDD is a basic building block 
(immutable distributed collections of 
objects)
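A minimal Java sketch of the RDD idea (the classes are Spark's Java API; the data and app name are made up):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.Arrays;

// An RDD is an immutable, partitioned collection; cache() keeps it in memory,
// so the iterations below reuse it instead of re-reading input from disk --
// the property that makes iterative graph algorithms cheaper than chained
// MapReduce jobs.
public class RddSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<Integer> degrees = sc.parallelize(Arrays.asList(3, 1, 4, 1, 5)).cache();

        for (int i = 0; i < 3; i++) {
            long highDegree = degrees.filter(x -> x > 2).count();  // lazy transform + action
            System.out.println("iteration " + i + ": " + highDegree);
        }
        sc.stop();
    }
}
```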
Spark in Hadoop old family
GraphX 
Supported algorithms 
● PageRank 
● Connected components 
● Label propagation 
● SVD++ 
● Strongly connected components 
● Triangle count
GraphChi 
• Asynchronous, disk-based version of GraphLab 
• Uses the Parallel Sliding Windows method 
• Very small number of non-sequential accesses to the disk 
• Works even when the graph does not fit in memory 
• Input graph is split into P disjoint intervals to balance edges, each associated with a shard 
• For Home deals ...
Road Networks 
Definition 
• Edge weights > 0 
• A few classes of roads 
• Lat/Lon attributes for each vertex 
• Subgraphs for cross-roads 
• Not as big as the web graph 
• Static
Shortest path problem
(Search-space illustrations: A*, full Dijkstra, bi-directional search)
We need a fast system! 
• Response < 10 ms (with high accuracy) 
• Shortest-path (SP) queries in O(n) 
• Preprocessing phase 
• Don't store all shortest paths: that's O(n^2) 
• Use geo attributes 
• Use compression and re-encoding for disk storage 
• The network is stable
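As a baseline for the EU road network numbers below, a plain Dijkstra query with no preprocessing might look like this (illustrative sketch; the graph and weights are made up):

```java
import java.util.*;

// Minimal Dijkstra baseline: O((V + E) log V) with a binary heap and no
// preprocessing. The speed-up techniques in the next table answer the same
// query orders of magnitude faster.
public class DijkstraBaseline {

    // edge of the adjacency list: target vertex and non-negative weight
    record Edge(int to, double w) {}

    static double[] shortestPaths(List<List<Edge>> adj, int source) {
        double[] dist = new double[adj.size()];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[source] = 0.0;

        // priority queue of {distance, vertex}, smallest distance first
        PriorityQueue<double[]> pq =
            new PriorityQueue<>(Comparator.comparingDouble((double[] a) -> a[0]));
        pq.add(new double[]{0.0, source});

        while (!pq.isEmpty()) {
            double[] top = pq.poll();
            double d = top[0];
            int u = (int) top[1];
            if (d > dist[u]) continue;               // stale queue entry
            for (Edge e : adj.get(u)) {
                if (d + e.w() < dist[e.to()]) {      // relax the edge
                    dist[e.to()] = d + e.w();
                    pq.add(new double[]{dist[e.to()], e.to()});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // tiny made-up road graph: 0 -> 1 -> 2, plus a slower direct 0 -> 2
        List<List<Edge>> adj = List.of(
            List.of(new Edge(1, 2.0), new Edge(2, 9.0)),
            List.of(new Edge(2, 3.0)),
            List.of());
        System.out.println(Arrays.toString(shortestPaths(adj, 0))); // [0.0, 2.0, 5.0]
    }
}
```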
EU Road network: average query time (µs)

Method    | Dijkstra  | ALT    | RE    | HH    | CH   | TN  | HL
Query, µs | 2 008 300 | 24 656 | 2 444 | 462.0 | 94.0 | 1.8 | 0.3

• ALT: [Goldberg & Harrelson 05], [Delling & Wagner 07] 
• RE: [Gutman 05], [Goldberg et al. 07] 
• HH: [Sanders & Schultes 06] 
• CH: [Geisberger et al. 08] 
• TN: [Geisberger et al. 08] 
• HL: [Abraham et al. 11]
A* with landmarks (ALT)
Reach (RE)
Transit nodes (TN) 
• Divide graph G into subgraphs G_i 
• Find R (a subset of G_i) for each G_i 
• All shortest paths out of G_i pass through R 
• Build pairs (v_i, r_k) for each v_i, where r_k is the closest transit node 
• Calculate shortest paths between transit nodes in R 
• Save them!
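A hedged sketch of the query this preprocessing enables: combine the saved vertex-to-access-node distances with the transit-node distance table D (the names AccessNode and D are assumptions for illustration, not from the talk):

```java
import java.util.List;

// Query-time use of the precomputed transit-node data.
public class TransitNodeQuery {

    // distance from a vertex to one of its transit (access) nodes,
    // saved during preprocessing
    record AccessNode(int transitNode, double dist) {}

    static double query(List<AccessNode> fromSource, List<AccessNode> toTarget, double[][] D) {
        double best = Double.POSITIVE_INFINITY;
        for (AccessNode a : fromSource) {
            for (AccessNode b : toTarget) {
                // d(s, a) + D[a][b] + d(b, t)
                best = Math.min(best,
                    a.dist() + D[a.transitNode()][b.transitNode()] + b.dist());
            }
        }
        return best; // correct for long-range queries; short local queries need a fallback search
    }
}
```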
TN + ALT
Special Cases 
Optimization problems 
• Unstable graph 
• Preprocessing phase is meaningless 
• How to invest $1B in a road network to minimize the time people lose in traffic jams 
• How to invest $1M in a road network to improve reliability before a flood
Last steps ... 
• I/O-Efficient Algorithms and Data Structures 
• Graphs and Memory Errors
Omsk
Novosibirsk
Novosibirsk, TN preprocessing
twitter + G+ + VK
