A Lightweight Infrastructure for Graph Analytics

A Lightweight Infrastructure for
Graph Analytics
Donald Nguyen
Andrew Lenharth and Keshav Pingali
The University of Texas at Austin

What is Graph Analytics?
• Algorithms to compute properties of
graphs
– Connected components, shortest
paths, centrality measures,
diameter, PageRank, …
• Many applications
– Google, path routing, friend
recommendations, network analysis
• Difficult to implement on a large
scale
– Data sets are large, data accesses
are irregular
– Need parallelismand efficient
runtimes
Bridges of Konigsberg
The Social Network

Contributions
• A lightweight infrastructure for graph
analytics based on improvements to the
Galois system
– Example: scalable priority scheduling
• Orders of magnitude faster than third-party
graph DSLs for many applications and inputs
– Galois permits better algorithms to be written
– Galois infrastructure more scalable
3

Outline
1. Abstraction for graph analytics applications
and implementation in Galois system
2. One piece of infrastructure: priority
scheduling
3. Evaluation of Galois system versus graph
analytics DSLs
4

Example: SSSP
• Find the shortest distance from source
node to all other nodes in a graph
– Label nodes with tentativedistance
– Assume non-negativeedge weights
• Algorithms
– Chaotic relaxation O(2V)
– Bellman-Ford O(VE)
– Dijkstra’s algorithm O(E log V)
• Uses priority queue
– Δ-stepping
• Uses sequence of bags to prioritize work
• Δ=1, O(E log V)
• Δ=∞, O(VE)
• Different algorithms are different
schedules for applying relaxations
– SSSP needs priority scheduling for work
efficiency
1
9
2
3
1∞
∞
0
1 9
4
3
1
3
Operator
Edge relaxation
∞
2
5

Example: SSSP
• Algorithms
– Δ-stepping
• Δ=∞, O(VE)
efficiency
1
9
2
3
1∞
∞
0
Active edge
1 9
4
3
1
3
Operator
Edge relaxation
∞
2
6

Example: SSSP
• Algorithms
– Δ-stepping
• Δ=∞, O(VE)
efficiency
1
9
2
3
1∞
∞
0
Active edge
Neighborhood
Activity
1 9
4
3
1
3
Operator
Edge relaxation
∞
2
7

Parallel Program = Operator + Schedule + Parallel Data Structure
Algorithm
8

• What is the operator?
– Ligra, PowerGraph:only vertex programs
– Galois: Unrestricted, may even morph graph by
adding/removing nodes and edges
• Where/When does it execute?
– Autonomous scheduling: activities execute
transactionally
– Coordinated scheduling: activities execute in rounds
• Read values refer to previous rounds
• Multiple updates to the same location are resolved with
reduction, etc.
Algorithm
9

transactionally
reduction, etc.
Algorithm
10

transactionally
reduction, etc.
Algorithm
11

transactionally
reduction, etc.
Algorithm
ba
c
12

Galois System
Operators Schedules Data StructureAPI Calls
Schedulers Data Structures
AllocatorThread Primitives
Multicore
User Program
Galois System
13

Outline
scheduling
3. Evaluation of Galois system versus DSLs
14

Galois Infrastructure for Scheduling
• Library of different schedulers
– FIFO, LIFO, Priority, …
– DSL for specifying scheduling policies
• Nguyen and Pingali (ASPLOS ’11)
15

Priority Scheduling
• Concurrent priority queue
– Implemented in TBB, java.concurrent
– Does not scale: our tasks are very fine-grain
• Sequence of bags
– One bag per priority level
– High overhead for finding non-empty, urgent-
priority bags
16

Ordered by Metric (OBIM)
• Sparse sequence of Bags
– One Bag per priority level
– To reduce synchronization
• Sequence is replicated among cores
• Bag is distributed between cores
– To reduce overhead of finding work
• A reduction tree over machine topology to get
estimate of current urgent priority
17

Scheduling Bag
c1 c2
Head
Package 1
18
Bag
c1 c2
Head
Package 2

Scheduling Bag
c1 c2
Head
Package 1
19
Bag
c1 c2
Head
Package 2

Scheduling Bag
c1 c2
Head
Package 1
20
Bag
c1 c2
Head
Package 2

Scheduling Bag
c1 c2
Head
Package 1
21
Bag
c1 c2
Head
Package 2

Scheduling Bag
c1 c2
Head
Package 1
22
Bag
c1 c2
Head
Package 2

Scheduling Bag
c1 c2
Head
Package 1
23
Bag
c1 c2
Head
Package 2

Scheduling Bag
c1 c2
Head
Package 1
24
Bag
c1 c2
Head
Package 2

Scheduling Bag
c1 c2
Head
Package 1
25
Bag
c1 c2
Head
Package 2

Priority Map
Global Map
Scheduling
Bags
1 30
26

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
27

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
28

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
13
29

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
13
30

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
13
3
31

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
1
3
32

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
1
3
15
33

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
1
3
15
34

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
1
3
15
5
5
35

Priority Map
0 1
Global Map
Local Maps
Core 1 Core 2
Scheduling
Bags
1 3
1 30
1
3
1
5
5
36

Outline
scheduling
3. Evaluation of Galois system versus graph
analytics DSLs
37

Graph Analytics DSLs
• Easy to implement their APIs on top of Galois
system
– Galois implementations called PowerGraph-g,
Ligra-g, etc.
– About 200-300 lines of code each
38
Ÿ GraphLab Low et al. (UAI ’10)
Ÿ PowerGraph Gonzalez et al. (OSDI ’12)
Ÿ GraphChi Kyrola et al. (OSDI ’12)
Ÿ Ligra Shun and Blelloch (PPoPP ’13)
Ÿ Pregel Malewicz et al. (SIGMOD ‘10)
Ÿ …

Evaluation
• Platform
– 40-core system
• 4 socket, Xeon E7-4860
(Westmere)
– 128 GB RAM
• Applications
– Breadth-first search (bfs)
– Connected components (cc)
– Approximate diameter (dia)
– PageRank (pr)
– Single-source shortest paths
(sssp)
• Inputs
– twitter50 (50 M nodes, 2 B
edges, low-diameter)
– road (20 M nodes, 60 M
edges, high-diameter)
• Comparison with
– Ligra (shared memory)
– PowerGraph (distributed)
• Runtimes with 64 16-core
machines (1024 cores) does not
beat one 40-core machine using
Galois
39

40
Ligra runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
Ligra runtime
Ligra⌫g runtime
1
2
1
2
4
6
bfs cc dia pr sssp
Range
<= 1
> 1

41
Ligra runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
Ligra runtime
Ligra⌫g runtime
1
2
1
2
4
6
bfs cc dia pr sssp
Range
<= 1
> 1

42
PowerGraph runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
PowerGraph runtime
PowerGraph⇠g runtime
1
2
1
2
4
6
bfs cc dia pr sssp

43
PowerGraph runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
PowerGraph runtime
1
2
1
2
4
6
bfs cc dia pr sssp

PowerGraph runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
PowerGraph runtime
1
2
1
2
4
6
bfs cc dia pr sssp
44

PowerGraph runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
PowerGraph runtime
1
2
1
2
4
6
bfs cc dia pr sssp
• The best algorithm may
require application-
specific scheduling
– Priority scheduling for
SSSP
not be expressible as a
vertex program
– Connected components
with union-find
• Autonomous scheduling
required for high-
diameter graphs
– Coordinated scheduling
uses many rounds and
has too much overhead
45

PowerGraph runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
PowerGraph runtime
1
2
1
2
4
6
bfs cc dia pr sssp
specific scheduling
SSSP
vertex program
with union-find
required for high-
diameter graphs
46

PowerGraph runtime
Galois runtime
1
10
100
1000
10000
1
10
100
1000
10000
bfs cc dia pr sssp
PowerGraph runtime
1
2
1
2
4
6
bfs cc dia pr sssp
specific scheduling
SSSP
vertex program
with union-find
required for high-
diameter graphs
47

See Paper For
• More inputs, applications
• Detailed discussion of performance results
• More infrastructure: non-priority scheduling,
memory management
• Results for out-of-core DSLs
48

Summary
• Galois programming model is general
– Permits efficient algorithms to be written
• Galois infrastructure is lightweight
• Simpler graph DSLs can be layered on top
– Can perform better than existing systems
• Download at http://iss.ices.utexas.edu/galois
49

A Lightweight Infrastructure for Graph Analytics

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to A Lightweight Infrastructure for Graph Analytics

Similar to A Lightweight Infrastructure for Graph Analytics (20)

Recently uploaded

Recently uploaded (20)

A Lightweight Infrastructure for Graph Analytics