Machine Learning on Graphs

Joseph Gonzalez
Co-Founder, GraphLab Inc.
joseph@graphlab.com
Postdoc, UC Berkeley AMPLab
jegonzal@eecs.berkeley.edu
Big Data Graphs:
More Signal, More Noise
Social Media · Science · Advertising · Web

Graphs encode relationships between:
People, Products, Ideas, Facts, Interests
Big: billions of vertices and edges & rich metadata
- Facebook (10/2012): 1B users, 144B friendships
- Twitter (2011): 15B follower edges
Graphs are Essential to
Data Mining and Machine Learning



Identify influential people and information
Find communities
Understand people’s shared interests
Model complex data dependencies
Predicting User Behavior
[Figure: a social network in which a few users are labeled Liberal or Conservative and the rest are unknown (?), linked to one another through posts. The labels and links form a Conditional Random Field, solved with Belief Propagation.]
  
Finding Communities
Count triangles passing through each vertex:

[Figure: example graph with per-vertex triangle counts 1, 2, 3, 4]

Measures "cohesiveness" of the local community:
- Fewer triangles → weaker community
- More triangles → stronger community
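As a concrete illustration, here is a minimal sketch of per-vertex triangle counting over an adjacency-set representation; plain Python on a toy dict-of-sets graph, not the GraphLab API:

    # Count the triangles passing through each vertex: every common
    # neighbor of an edge (i, j) closes a triangle containing both.
    def triangle_counts(adj):
        counts = {v: 0 for v in adj}
        for i in adj:
            for j in adj[i]:
                counts[i] += len(adj[i] & adj[j])
        # Each triangle at vertex i was found twice (once per incident edge).
        return {v: c // 2 for v, c in counts.items()}

    # Toy graph: vertices 0-1-2 form a triangle; 3 hangs off vertex 2.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(triangle_counts(adj))  # {0: 1, 1: 1, 2: 1, 3: 0}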
Recommending Products

[Figure: bipartite graph of Users connected to Items by Ratings]
Recommending Products

Low-Rank Matrix Factorization: approximate the Netflix ratings matrix as the product of User Factors (U) and Movie Factors (M), giving each user i and movie j a latent factor vector f(i), f(j).

Iterate:

    f[i] = \arg\min_{w \in \mathbb{R}^d} \sum_{j \in \mathrm{Nbrs}(i)} \left( r_{ij} - w^\top f[j] \right)^2 + \|w\|_2^2
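A minimal NumPy sketch of one such update: the closed-form ridge-regression solve for a single user's factor vector, assuming a toy list of (user, movie, rating) triples rather than GraphLab's data structures:

    import numpy as np

    def update_factor(i, f, ratings, d, reg=1.0):
        # Solve f[i] = argmin_w sum_j (r_ij - w.T f[j])^2 + reg * ||w||^2
        # in closed form: (sum_j f[j] f[j].T + reg*I) w = sum_j r_ij f[j]
        A = reg * np.eye(d)
        b = np.zeros(d)
        for (u, j, r) in ratings:
            if u == i:
                A += np.outer(f[j], f[j])
                b += r * f[j]
        return np.linalg.solve(A, b)

    # Toy example with d = 2 latent dimensions.
    f = {"m1": np.array([1.0, 0.0]), "m2": np.array([0.0, 1.0])}
    ratings = [("u1", "m1", 5.0), ("u1", "m2", 1.0)]
    f["u1"] = update_factor("u1", f, ratings, d=2)  # [2.5, 0.5] with reg=1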
Identifying Leaders

    R[i] = 0.15 + \sum_{j \in \mathrm{Nbrs}(i)} w_{ji} \, R[j]

Rank of user i = weighted sum of neighbors' ranks.

- Everyone starts with equal ranks
- Update ranks in parallel
- Iterate until convergence
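The same update as a minimal synchronous Python loop; the damping-style weights w_ji = 0.85 / outdegree(j) are an assumption here, since the slide leaves w_ji abstract:

    def pagerank(adj, tol=1e-6):
        # adj[j] = set of vertices that j links to.
        R = dict.fromkeys(adj, 1.0)          # everyone starts with equal ranks
        while True:
            newR = dict.fromkeys(adj, 0.15)  # R[i] = 0.15 + sum_j w_ji * R[j]
            for j, out in adj.items():
                for i in out:
                    newR[i] += 0.85 * R[j] / len(out)   # assumed w_ji
            if max(abs(newR[v] - R[v]) for v in adj) < tol:
                return newR                  # iterate until convergence
            R = newR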
  
Graph-Parallel Algorithms

Model / algorithm state is placed on the graph; computation at each vertex depends only on its neighbors.
  
Many More Graph Algorithms

•  Collaborative Filtering
   –  Alternating Least Squares
   –  Stochastic Gradient Descent
   –  Tensor Factorization
   –  SVD
•  Structured Prediction
   –  Loopy Belief Propagation
   –  Max-Product Linear Programs
   –  Gibbs Sampling
•  Semi-supervised ML
   –  Graph SSL
   –  CoEM
•  Graph Analytics
   –  PageRank
   –  Shortest Path
   –  Triangle-Counting
   –  Graph Coloring
   –  K-core Decomposition
   –  Personalized PageRank
•  Classification
   –  Neural Networks
   –  Lasso
   –  …
How should we program
graph-parallel algorithms?
Structure of Computation

Data-Parallel: a table of independent rows, processed in any order to produce a result.
Graph-Parallel: a dependency graph, where each vertex's computation depends on its neighbors.
How should we program
graph-parallel algorithms?

"Think like a Vertex."
- Pregel [SIGMOD'10]
The Graph-Parallel Abstraction

A user-defined Vertex-Program runs on each vertex.
Graph constrains interaction along edges:
- Using messages (e.g., Pregel [PODC'09, SIGMOD'10])
- Through shared state (e.g., GraphLab [UAI'10, VLDB'12])
Parallelism: run multiple vertex programs simultaneously.
The GraphLab Vertex Program

Vertex Programs directly access adjacent vertices and edges:

    GraphLab_PageRank(i):
        // Compute sum over neighbors
        total = 0
        foreach (j in neighbors(i)):
            total = total + R[j] * wji

        // Update the PageRank
        R[i] = 0.15 + total

        // Trigger neighbors to run again
        if R[i] not converged then
            signal nbrsOf(i) to be recomputed

[Figure: vertex 1 summing contributions such as R[4] * w41 from its neighbors 2, 3, and 4]

Signaled vertices are recomputed eventually.
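One way to read "signaled vertices are recomputed eventually" is as a worklist-driven engine. A sketch in plain Python, as a serial stand-in for GraphLab's scheduler, with hypothetical in_nbrs / out_nbrs / w inputs:

    from collections import deque

    def dynamic_pagerank(in_nbrs, out_nbrs, w, tol=1e-3):
        R = dict.fromkeys(in_nbrs, 1.0)
        queue = deque(in_nbrs)               # initially, signal every vertex
        queued = set(in_nbrs)
        while queue:
            i = queue.popleft()
            queued.discard(i)
            total = sum(R[j] * w[j, i] for j in in_nbrs[i])
            old, R[i] = R[i], 0.15 + total
            if abs(R[i] - old) > tol:        # if R[i] not converged then
                for k in out_nbrs[i]:        # signal nbrsOf(i)
                    if k not in queued:
                        queue.append(k)
                        queued.add(k)
        return R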
  
Convergence of Dynamic PageRank

[Figure: histogram of number of updates per vertex. Num-vertices on a log scale from 1 to 10^8 vs. 0-70 updates; fewer updates is better. 51% of vertices were updated only once.]
  
Adaptive Belief Propagation

Challenge = Boundaries

[Figure: noisy "Sunset" image, its graphical model, and the cumulative vertex updates under Splash scheduling: many updates along boundaries, few updates elsewhere]

The algorithm identifies and focuses on the hidden sequential structure.

Graph-Parallel Abstractions: Better for Machine Learning

- Messaging (e.g., Pregel): synchronous
- Shared state (e.g., GraphLab): dynamic, asynchronous
  
Natural Graphs

Graphs derived from natural phenomena.
Properties of Natural Graphs

[Figure: a regular mesh vs. a natural graph]

Natural graphs have a power-law degree distribution.
Power-Law Degree Distribution

[Figure: "star-like" motif, e.g. President Obama and his followers]
Challenges of High-Degree Vertices

- Sequentially process edges
- Touch a large fraction of the graph
- Provably difficult to partition
  
Random Partitioning

- GraphLab resorts to random (hashed) partitioning on natural graphs.
- While fast and easy to implement, random placement cuts most of the edges (Theorem 5.1): if vertices are randomly assigned to p machines, the expected fraction of edges cut is

    \mathbb{E}\!\left[ \frac{|\text{Edges Cut}|}{|E|} \right] = 1 - \frac{1}{p}

- For example, with just two machines, half the edges are cut, requiring order |E|/2 communication.
- 10 machines → 90% of edges cut
- 100 machines → 99% of edges cut!

[Figure: edges cut between Machine 1 and Machine 2]
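A quick simulation confirming the 1 - 1/p formula; hash each vertex to one of p machines and count cut edges (illustrative only, with made-up graph sizes):

    import random

    def cut_fraction(p, n_vertices=10_000, n_edges=100_000):
        # Randomly (hash-)assign vertices to machines, then measure the
        # fraction of edges whose endpoints land on different machines.
        machine = [random.randrange(p) for _ in range(n_vertices)]
        cut = 0
        for _ in range(n_edges):
            u, v = random.sample(range(n_vertices), 2)
            cut += machine[u] != machine[v]
        return cut / n_edges

    for p in (2, 10, 100):
        print(p, round(cut_fraction(p), 3), 1 - 1 / p)  # ~0.5, ~0.9, ~0.99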
  
Program For This, Run on This

[Figure: a single high-degree vertex, split so that Machine 1 and Machine 2 each own part of its edges]

•  Split high-degree vertices
•  New abstraction → equivalence on split vertices
  
A Common Pattern in Vertex Programs

    GraphLab_PageRank(i):
        // Compute sum over neighbors        <- Gather information about neighborhood
        total = 0
        foreach (j in neighbors(i)):
            total = total + R[j] * wji

        // Update the PageRank               <- Update vertex
        R[i] = total

        // Trigger neighbors to run again    <- Signal neighbors & modify edge data
        priority = |R[i] - oldR[i]|
        if R[i] not converged then
            signal neighbors(i) with priority
  
GAS Decomposition

[Figure: a high-degree vertex Y split across four machines. The Master copy lives on Machine 1; Mirrors live on Machines 2-4. Gather: each machine computes a partial sum Σ1, Σ2, Σ3, Σ4 over its local edges, and the partials are combined at the master (Σ = Σ1 + Σ2 + Σ3 + Σ4). Apply: the master computes the new value Y'. Scatter: Y' is replicated to the mirrors, which update their local edges.]
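A conceptual sketch of one GAS step for PageRank on a split vertex, with the in-edges of vertex i spread over several machines (plain serial Python, not the PowerGraph C++ API; edge_partitions is a hypothetical list of per-machine adjacency dicts):

    def gas_step(i, edge_partitions, R, w):
        # Gather: each machine computes a partial sum over its local in-edges.
        partials = [sum(R[j] * w[j, i] for j in part.get(i, ()))
                    for part in edge_partitions]
        total = sum(partials)   # master combines: Σ = Σ1 + Σ2 + ...
        # Apply: the master computes the new value Y'.
        R[i] = 0.15 + total
        # Scatter: the new value would now be replicated to the mirrors,
        # which update their local edges (elided in this serial sketch).
        return R[i]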
  
Minimizing Communication in PowerGraph

Communication is linear in the number of machines each vertex spans.
A vertex-cut minimizes the number of machines each vertex spans.
Percolation theory suggests that power-law graphs have good vertex cuts. [Albert et al. 2000]
Machine Learning and Data-Mining Toolkits

Toolkits: Graph Analytics, Graphical Models, Computer Vision, Clustering, Topic Modeling, Collaborative Filtering
Built on: GraphLab2 System
Running on: MPI/TCP-IP, PThreads, HDFS, EC2 HPC Nodes

http://graphlab.org
Apache 2 License
PageRank on Twitter Follower Graph

Natural graph with 40M users, 1.4 billion links.

[Chart: runtime per iteration, 0-200 s, for Hadoop, Twister, Piccolo, GraphLab, and PowerGraph]

PowerGraph gains an order of magnitude by exploiting properties of natural graphs.

Hadoop results from [Kang et al. '11]; Twister (in-memory MapReduce) from [Ekanayake et al. '10].
GraphLab2 is Scalable

Yahoo Altavista Web Graph (2002): one of the largest publicly available web graphs, with 1.4 billion webpages and 6.6 billion links.

On 64 HPC nodes (1024 cores, 2048 HT):
- 7 seconds per iteration
- 1B links processed per second
- 30 lines of user code
Topic Modeling

English language Wikipedia:
–  2.6M documents, 8.3M words, 500M tokens
–  Computationally intensive algorithm

[Chart: million tokens per second, 0-160, comparing Smola et al. with PowerGraph]

- Smola et al.: 100 Yahoo! machines, specifically engineered for this task
- PowerGraph: 64 cc2.8xlarge EC2 nodes, 200 lines of code & 4 human hours
  
Triangle Counting on Twitter

40M users, 1.4 billion links. Counted: 34.8 billion triangles.

- Hadoop [WWW'11]: 1536 machines, 423 minutes
- 64 machines: 15 seconds, 1000x faster

S. Suri and S. Vassilvitskii, "Counting triangles and the curse of the last reducer," WWW'11.
By exploiting common patterns in graph data and computation:

- New ways to represent real-world graphs
- New ways to execute graph algorithms
- Orders of magnitude improvements over existing systems
Possibility. Scalability. Usability.

Exciting Time to Work in ML

☺ Unique opportunities to change the world!
  "With ML, I will cure cancer!!!"
  "With ML I will find true love."
  "Why won't ML read my mind???"

☹ Building scalable learning systems requires experts …
But…

- Even the basics of scalable ML can be challenging
- ML is key to any new service we want to build
- 6 months from prototype to production
- State-of-the-art ML algorithms trapped in research papers

Goal of GraphLab 3:
Make large-scale machine learning accessible to all! ☺
Adding a Python Layer

Python API
Toolkits: Graph Analytics, Graphical Models, Computer Vision, Clustering, Topic Modeling, Collaborative Filtering
GraphLab2 System
MPI/TCP-IP, PThreads, EC2 HPC Nodes, HDFS
Learning ML with GraphLab Notebook

https://beta.graphlab.com/examples
Prototype to Production with Python GraphLab:

- Easily install and prototype locally
- Deploy to the cluster in one step
Learn: GraphLab Notebook
Prototype: pip install graphlab → local prototyping
Production: the same code scales to an EC2 cluster
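For flavor, a hypothetical session; the names below (SFrame, SGraph, pagerank.create) follow the later GraphLab Create Python package and are assumptions here, not something shown on the slides:

    # Hypothetical sketch: exact names/signatures may differ in the beta.
    import graphlab as gl

    edges = gl.SFrame.read_csv('follows.csv')   # columns: src, dst
    g = gl.SGraph().add_edges(edges, src_field='src', dst_field='dst')
    pr = gl.pagerank.create(g)                  # scalable PageRank toolkit
    ranks = pr['pagerank']                      # SFrame of per-vertex ranks
    print(ranks.topk('pagerank', k=10))         # ten most influential users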
GraphLab Toolkits

Highly scalable, state-of-the-art machine learning straight from Python:
Graph Analytics, Graphical Models, Computer Vision, Clustering, Topic Modeling, Collaborative Filtering
Machine Learning on Graphs
partners@graphlab.com

NIPS Workshop on Big Learning: biglearn.org
Lake Tahoe, December 9th

Joseph Gonzalez
Co-Founder, GraphLab Inc.
joseph@graphlab.com
