
Localized methods for diffusions in large graphs


I describe a few ongoing research projects on diffusions in large graphs and how we can design efficient matrix computations to evaluate them.



  1. Localized methods for diffusions in large graphs. David F. Gleich, Purdue University. Joint work with Kyle Kloster @ Purdue & Michael Mahoney @ Berkeley. Supported by NSF CAREER CCF-1149756. Code: www.cs.purdue.edu/homes/dgleich/codes/nexpokit and www.cs.purdue.edu/homes/dgleich/codes/l1pagerank. (MMDS 2014)
  2. Image from rockysprings, deviantart, CC share-alike. "Everything in the world can be explained by a matrix, and we see how deep the rabbit hole goes. The talk ends, you believe -- whatever you want to."
  3. (image slide)
  4. Graph diffusions.
     $f = \sum_{k=0}^{\infty} \alpha_k P^k s$
     A – adjacency matrix; D – degree matrix; P – column-stochastic operator; s – the "seed" (a sparse vector); f – the diffusion result; $\alpha_k$ – the path weights.
     $P = AD^{-1}$, so $(Px)_i = \sum_{j \to i} \frac{1}{d_j} x_j$.
     (Figure: a network or mesh from a typical problem in scientific computing, colored from high to low.)
     Graph diffusions help with: 1. attribute prediction; 2. community detection; 3. "ranking"; 4. finding small-conductance sets.
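     A minimal sketch of this definition in Python, assuming scipy; the 4-cycle graph, the seed, and the geometric path weights $\alpha_k = (1-\beta)\beta^k$ are illustrative choices, not the talk's algorithms:

          import numpy as np
          import scipy.sparse as sp

          # Toy undirected graph: a 4-cycle, as an adjacency matrix A.
          A = sp.csr_matrix(np.array([[0, 1, 0, 1],
                                      [1, 0, 1, 0],
                                      [0, 1, 0, 1],
                                      [1, 0, 1, 0]], dtype=float))
          d = np.asarray(A.sum(axis=0)).ravel()   # degrees
          P = A @ sp.diags(1.0 / d)               # column-stochastic P = A D^{-1}

          s = np.zeros(4); s[0] = 1.0             # seed vector e_0

          # Truncated diffusion f = sum_k alpha_k P^k s with the
          # PageRank-style weights alpha_k = (1 - beta) beta^k.
          beta, K = 0.85, 100
          f, term = np.zeros(4), (1 - beta) * s
          for k in range(K):
              f += term
              term = beta * (P @ term)
          print(f)                                # the diffusion result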
  5. Graph diffusions.
     PageRank: $x = (1-\beta) \sum_{k=0}^{\infty} \beta^k P^k s$, i.e., $(I - \beta P)x = (1-\beta)s$.
     Heat kernel: $h = e^{-t} \sum_{k=0}^{\infty} \frac{t^k}{k!} P^k s = e^{-t} \exp\{tP\} s$.
     $P = AD^{-1}$, $(Px)_i = \sum_{j \to i} \frac{1}{d_j} x_j$.
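     The series and linear-system forms of PageRank above agree, and the heat kernel follows from its Taylor series; a small self-contained check on the same kind of toy graph (the truncation length 60 is an arbitrary choice):

          import numpy as np
          import scipy.sparse as sp
          import scipy.sparse.linalg as spla

          A = sp.csr_matrix(np.array([[0, 1, 0, 1], [1, 0, 1, 0],
                                      [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float))
          P = A @ sp.diags(1.0 / np.asarray(A.sum(axis=0)).ravel())
          s = np.array([1.0, 0, 0, 0])

          # PageRank: solve (I - beta P) x = (1 - beta) s.
          beta = 0.85
          x = spla.spsolve(sp.eye(4, format="csc") - beta * P.tocsc(), (1 - beta) * s)

          # Heat kernel: h = e^{-t} sum_k (t^k / k!) P^k s, via its Taylor series.
          t, h, term = 5.0, np.zeros(4), s.copy()
          for k in range(1, 60):
              h += term
              term = (t / k) * (P @ term)
          h = np.exp(-t) * h
          print(x, h)   # both (approximately) sum to 1: they are distributions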
  6. Graph diffusions.
     PageRank: $x = (1-\beta) \sum_{k=0}^{\infty} \beta^k P^k s$; heat kernel: $h = e^{-t} \exp\{tP\} s$.
     (Figure: path-weight vs. path-length for the heat kernel at t = 1, 5, 15 and for PageRank at alpha = 0.85, 0.99.)
  7. Uniformly localized solutions in livejournal.
     $x = \exp(P) e_c$, with nnz(x) = 4,815,948.
     (Figures: plot(x) of the entry magnitudes, and the 1-norm error as only the largest non-zeros are retained.)
     Gleich & Kloster, arXiv:1310.3423.
  8. Our mission: find the solution with work roughly proportional to its localization, not the size of the matrix.
  9. Two types of localization for $x \approx x^*$.
     Uniform (strong): $\|x - x^*\|_1 \le \varepsilon$. Localized vectors are not sparse, but they can be approximated by sparse vectors; a good global approximation using only a local region. "Hard" to prove; "needs" a graph property.
     Entry-wise (weak): $\|D^{-1}(x - x^*)\|_\infty \le \varepsilon$. A good approximation for cuts and communities. "Easy" to prove; "fast" algorithms.
  10. We have four results:
      1. A new interpretation of the PageRank diffusion in relationship with a mincut problem.
      2. A new understanding of the scalable, localized PageRank "push" method.
      3. A new algorithm for the heat kernel diffusion in a degree-weighted norm.
      4. Algorithms for diffusions as functions of matrices (K. Kloster's poster on Thurs.).
      (Scope notes on the slide: undirected graphs only; entry-wise localization; directed, uniform localization.)
  11. Our algorithms for uniform localization: www.cs.purdue.edu/homes/dgleich/codes/nexpokit
      (Figures: 1-norm error vs. non-zeros for gexpm, gexpmq, expmimv.)
      work $= O\big(\log(\tfrac{1}{\varepsilon})\,(\tfrac{1}{\varepsilon})^{3/2}\, d^2 (\log d)^2\big)$, nnz $= O\big(\log(\tfrac{1}{\varepsilon})\,(\tfrac{1}{\varepsilon})^{3/2}\, d \log d\big)$
  12. PageRank, mincuts, and the push method via algorithmic anti-differentiation. Gleich & Mahoney, ICML 2014.
  13. The PageRank problem & the Laplacian on undirected graphs. Combinatorial Laplacian L = D - A.
      The PageRank random surfer: 1. with probability beta, follow a random-walk step; 2. with probability (1-beta), jump randomly according to the distribution s. Goal: find the stationary distribution x.
      $x = (1-\beta) \sum_{k=0}^{\infty} \beta^k P^k s$
      1. $(I - \beta AD^{-1})x = (1-\beta)s$;  2. $[\alpha D + L]z = \alpha s$, where $\beta = 1/(1+\alpha)$ and $x = Dz$.
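      A quick numerical check of the two formulations under the substitutions on the slide ($\beta = 1/(1+\alpha)$, $x = Dz$); the toy graph is an illustrative assumption:

          import numpy as np

          # Toy undirected graph and its combinatorial Laplacian L = D - A.
          A = np.array([[0, 1, 1, 0], [1, 0, 1, 0],
                        [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
          D = np.diag(A.sum(axis=0))
          L = D - A
          n = A.shape[0]
          s = np.array([1.0, 0, 0, 0])

          alpha = 0.1
          beta = 1.0 / (1.0 + alpha)

          # Formulation 1: (I - beta A D^{-1}) x = (1 - beta) s.
          x = np.linalg.solve(np.eye(n) - beta * A @ np.linalg.inv(D),
                              (1 - beta) * s)

          # Formulation 2: [alpha D + L] z = alpha s, with x = D z.
          z = np.linalg.solve(alpha * D + L, alpha * s)
          print(np.allclose(x, D @ z))   # True: the two formulations agree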
  14. The s-t min-cut problem:
      minimize $\|Bx\|_{C,1} = \sum_{ij \in E} C_{i,j} |x_i - x_j|$
      subject to $x_s = 1$, $x_t = 0$, $x \ge 0$,
      where B is the unweighted incidence matrix and C is the diagonal capacity matrix. In the unweighted case, solve via max-flow; in the weighted case, solve via network simplex or an industrial LP.
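      A hedged sketch of this cut problem as a linear program, assuming scipy's linprog; the auxiliary variables $y_e \ge C_{i,j}|x_i - x_j|$ linearize the absolute values in the usual way, and the graph and capacities are toy choices:

          import numpy as np
          from scipy.optimize import linprog

          # Toy weighted graph: edges (i, j) with capacities C_ij; s = 0, t = 3.
          edges = [(0, 1, 2.0), (0, 2, 1.0), (1, 2, 1.0), (1, 3, 1.0), (2, 3, 2.0)]
          n, m = 4, len(edges)
          s_node, t_node = 0, 3

          # Variables: x_0..x_{n-1}, then y_0..y_{m-1}; minimize sum_e y_e.
          c = np.concatenate([np.zeros(n), np.ones(m)])
          A_ub, b_ub = [], []
          for e, (i, j, cap) in enumerate(edges):
              row = np.zeros(n + m)               #  C_ij (x_i - x_j) <= y_e
              row[i], row[j], row[n + e] = cap, -cap, -1.0
              A_ub.append(row); b_ub.append(0.0)
              row = np.zeros(n + m)               # -C_ij (x_i - x_j) <= y_e
              row[i], row[j], row[n + e] = -cap, cap, -1.0
              A_ub.append(row); b_ub.append(0.0)

          bounds = [(0, 1)] * n + [(0, None)] * m
          bounds[s_node] = (1, 1)                 # x_s = 1
          bounds[t_node] = (0, 0)                 # x_t = 0

          res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=bounds)
          print(res.fun, res.x[:n])               # cut value and indicator x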
  15. The localized cut graph. Related to a construction used in "FlowImprove", Andersen & Lang (2007); and Orecchia & Zhu (2014).
      $A_S = \begin{bmatrix} 0 & \alpha d_S^T & 0 \\ \alpha d_S & A & \alpha d_{\bar{S}} \\ 0 & \alpha d_{\bar{S}}^T & 0 \end{bmatrix}$
      Connect s to vertices in S with weight $\alpha \cdot$ degree; connect t to vertices in $\bar{S}$ with weight $\alpha \cdot$ degree.
  16. The localized cut graph. Connect s to vertices in S and t to vertices in $\bar{S}$, each with weight $\alpha \cdot$ degree.
      $B_S = \begin{bmatrix} e & -I_S & 0 \\ 0 & B & 0 \\ 0 & -I_{\bar{S}} & e \end{bmatrix}$
      Solve the s-t min-cut: minimize $\|B_S x\|_{C(\alpha),1}$ subject to $x_s = 1$, $x_t = 0$, $x \ge 0$.
  17. The localized cut graph. Same construction, but solve the "electrical flow" s-t min-cut: minimize $\|B_S x\|_{C(\alpha),2}$ subject to $x_s = 1$, $x_t = 0$.
  18. s-t min-cut → PageRank. Proof: square and expand the objective into a Laplacian, then apply the constraints.
      The PageRank vector z that solves $(\alpha D + L)z = \alpha s$ with $s = d_S/\mathrm{vol}(S)$ is a renormalized solution of the electrical cut computation: minimize $\|B_S x\|_{C(\alpha),2}$ subject to $x_s = 1$, $x_t = 0$. Specifically, if x is the solution, then $x = [1;\ \mathrm{vol}(S)z;\ 0]$.
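      Filling in the squaring step in the slide's own notation (my expansion, following the proof outline):
      $\|B_S x\|_{C(\alpha),2}^2 = x^T B_S^T C(\alpha) B_S x = x^T \mathcal{L}(A_S)\, x$,
      since $B^T C B$ is the Laplacian of the graph with adjacency $A_S$. Writing $x = [x_s;\ x_G;\ x_t]$ and substituting the constraints $x_s = 1$, $x_t = 0$ leaves the unconstrained quadratic $x_G^T\big((1+\alpha)D - A\big)x_G - 2\alpha d_S^T x_G + \text{const}$, whose minimizer satisfies $(\alpha D + L)x_G = \alpha d_S$. Dividing by $\mathrm{vol}(S)$ gives $(\alpha D + L)z = \alpha s$ with $s = d_S/\mathrm{vol}(S)$ and $x_G = \mathrm{vol}(S)\,z$, matching the statement above.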
  19. PageRank → s-t min-cut. That equivalence works if s is degree-weighted. What if s is the uniform vector?
      $A(s) = \begin{bmatrix} 0 & \alpha s^T & 0 \\ \alpha s & A & \alpha(d-s) \\ 0 & \alpha(d-s)^T & 0 \end{bmatrix}$
  20. Insight 1: PageRank implicitly approximates the solution of these s-t mincut problems.
  21. The push algorithm for PageRank. Proposed (in closest form) in Andersen, Chung, Lang (also by McSherry, Jeh & Widom) for personalized PageRank. Strongly related to Gauss-Seidel on Ax = b (see my talk at Simons). Derived to show improved runtime for balanced solvers.
      The push method (parameters $\tau$, $\rho$):
      1. $x^{(1)} = 0$, $r^{(1)} = (1-\beta)e_i$, $k = 1$
      2. while any $r_j > \tau d_j$ ($d_j$ is the degree of node j):
      3.   $x^{(k+1)} = x^{(k)} + (r_j - \tau d_j \rho)\, e_j$
      4.   $r_i^{(k+1)} = \begin{cases} \tau d_j \rho & i = j \\ r_i^{(k)} + \beta (r_j - \tau d_j \rho)/d_j & i \sim j \\ r_i^{(k)} & \text{otherwise} \end{cases}$
      5. $k \leftarrow k + 1$
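      A minimal Python sketch of the push loop above, assuming a dictionary-of-lists graph; the queue-based sweep order and the value of rho are illustrative choices (the slide leaves them as parameters):

          import collections

          def push_pagerank(G, seed, beta=0.85, tau=1e-4, rho=0.5):
              # G: dict mapping each node to a list of its neighbors.
              d = {u: len(G[u]) for u in G}        # degrees
              x = collections.defaultdict(float)   # sparse solution
              r = collections.defaultdict(float)   # sparse residual
              r[seed] = 1.0 - beta
              Q = collections.deque([seed])
              while Q:
                  j = Q.popleft()
                  if r[j] <= tau * d[j]:
                      continue                     # no longer violates the rule
                  delta = r[j] - tau * d[j] * rho  # step 3: move mass into x_j
                  x[j] += delta
                  r[j] = tau * d[j] * rho          # step 4, case i = j
                  for i in G[j]:                   # step 4, case i ~ j
                      below = r[i] <= tau * d[i]
                      r[i] += beta * delta / d[j]
                      if below and r[i] > tau * d[i]:
                          Q.append(i)              # i now violates the rule
              return x

          # Usage on a toy 4-cycle:
          G = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
          print(sorted(push_pagerank(G, seed=0).items()))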
  22. Why do we care about push? 1. Used for empirical studies of "communities". 2. Local Cheeger inequality. 3. Used for "fast PageRank approximation". 4. It produces weakly localized approximations to PageRank: $\|D^{-1}(x - x^*)\|_\infty \le \varepsilon$ after exploring at most $\frac{1}{(1-\beta)\varepsilon}$ edges.
      (Figure: Newman's netscience graph, 379 vertices, 1828 nnz; the approximation is "zero" on most of the nodes; s has a single one here.)
  23. The push method revisited. Let x be the output from the push method with $0 < \beta < 1$, $v = d_S/\mathrm{vol}(S)$, $\rho = 1$, and $\tau > 0$. Set $\alpha = (1-\beta)/\beta$, $\kappa = \tau\,\mathrm{vol}(S)/\beta$, and let $z_G$ solve:
      minimize $\frac{1}{2}\|B_S z\|_{C(\alpha),2}^2 + \kappa \|Dz\|_1$ subject to $z_s = 1$, $z_t = 0$, $z \ge 0$,
      where $z = [1;\ z_G;\ 0]$. Then $x = D z_G / \mathrm{vol}(S)$.
      Proof: write out the KKT conditions and show that the push method solves them. Slackness was "tricky". The 1-norm term is regularization for sparsity; the division by vol(S) is the needed normalization.
  24. Insight 2: the PageRank push method implicitly solves a 1-norm regularized 2-norm cut approximation.
  25. Insight 2': we get 3 digits of accuracy on the original problem P, but 16 digits of accuracy on the regularized problem P'.
  26. (Figure, from "Anti-differentiating Approximation Algorithms": examples of the different cut vectors on a portion of the netscience graph, with its vertices enlarged: the set S, the mincut solution, the push solution, and the PageRank solution. Large values are dark; white vertices with outlines are numerically non-zero. The true min-cut set is large in all of them. Push's sparsity helps it identify the "right" graph feature with many fewer non-zeros than the vanilla PageRank problem.)
  27. The push method revisited. (Same setup as slide 23: push with $0 < \beta < 1$, $v = d_S/\mathrm{vol}(S)$, $\rho = 1$, $\tau > 0$ solves the 1-norm regularized problem with $z = [1;\ z_G;\ 0]$ and $x = D z_G/\mathrm{vol}(S)$.) The regularization gives sparsity in both the solution and the residual: the push method is scalable because it gives us sparse solutions AND sparse residuals r.
  28. This is a case of algorithmic anti-differentiation!
  29. Algorithmic anti-differentiation: given heuristic H, is there a problem P' such that H is an algorithm for P'?
      The real world: given "find-communities", hack around, and write a paper presenting "three steps of the power method on P finds communities".
      To understand why H works: derive a characterization of heuristic H, guess and check until you find something H solves, then show heuristic H solves P'. E.g., Mahoney & Orecchia; Dhillon et al. (Graclus); Saunders.
  30. Without these insights, we'd draw the wrong conclusion. Our s-t mincut framework extends to many diffusions used in semi-supervised learning. Gleich & Mahoney, submitted.
  31. Without these insights, we'd draw the wrong conclusion. (Figure: error rate vs. average training samples per class for K2, RK2, K3, RK3 with an off-the-shelf SSL procedure.)
  32. Without these insights, we'd draw the wrong conclusion. (Figures: the same error-rate curves for the off-the-shelf SSL procedure and for rank-rounded SSL.)
  33. Recap so far. 1. Used the relationship between PageRank and mincut to get a new understanding of the implicit properties of the push method. 2. Showed that this insight helps improve semi-supervised learning. (Next: a new algorithm for the heat kernel diffusion in a degree-weighted norm.)
  34. Graph diffusions. PageRank: $x = (1-\beta)\sum_{k=0}^{\infty} \beta^k P^k s$, $(I - \beta P)x = (1-\beta)s$. Heat kernel: $h = e^{-t}\sum_{k=0}^{\infty} \frac{t^k}{k!} P^k s = e^{-t}\exp\{tP\}s$.
      Many "empirically useful" properties of PageRank also hold for the heat kernel diffusion; e.g., Chung (2007) showed a local Cheeger inequality. But there was no "local" algorithm until a randomized method by Simpson & Chung (2013).
  35. We can turn the heat kernel into a linear system. Direct expansion:
      $x = \exp(P)\, e_c \approx \sum_{k=0}^{N} \frac{1}{k!} P^k e_c = x_N$
      $\begin{bmatrix} I & & & \\ -P/1 & I & & \\ & \ddots & \ddots & \\ & & -P/N & I \end{bmatrix} \begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_N \end{bmatrix} = \begin{bmatrix} e_c \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad x_N = \sum_{i=0}^{N} v_i$
      Equivalently, $(I \otimes I_N - S_N \otimes P)\,v = e_1 \otimes e_c$.
      Lemma: we approximate $x_N$ well if we approximate v well. Kloster & Gleich, WAW2013.
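      A small dense-matrix check of this block system, assuming numpy; it verifies that summing the blocks of v reproduces the truncated Taylor series $x_N$ (the toy graph and N are illustrative):

          import numpy as np

          # Toy column-stochastic P and seed e_c.
          A = np.array([[0, 1, 0, 1], [1, 0, 1, 0],
                        [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
          P = A / A.sum(axis=0)
          n, N = 4, 20
          ec = np.array([1.0, 0, 0, 0])

          # Block system: identity blocks on the diagonal, -P/k below it.
          M = np.eye(n * (N + 1))
          for k in range(1, N + 1):
              M[k*n:(k+1)*n, (k-1)*n:k*n] = -P / k
          rhs = np.zeros(n * (N + 1)); rhs[:n] = ec

          v = np.linalg.solve(M, rhs)
          xN = v.reshape(N + 1, n).sum(axis=0)    # x_N = sum_i v_i

          # Compare against the direct truncated Taylor series of exp(P) e_c.
          x, term = np.zeros(n), ec.copy()
          for k in range(1, N + 1):
              x += term
              term = (P @ term) / k
          x += term
          print(np.allclose(xN, x))               # True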
  36. There is a fast deterministic adaptation of the push method. Kloster & Gleich, KDD 2014.
      Pseudo-code for our algorithm as working Python code; the graph is stored as a dictionary-of-sets:

          import collections, math
          # G is graph as dictionary-of-sets,
          # seed is an array of seeds,
          # t, eps, N, psis are precomputed
          x = {}  # store x, r as dictionaries
          r = {}  # initialize residual
          Q = collections.deque()  # initialize queue
          for s in seed:
              r[(s, 0)] = 1. / len(seed)
              Q.append((s, 0))
          while len(Q) > 0:
              (v, j) = Q.popleft()  # v has r[(v,j)] ...
              rvj = r[(v, j)]
              # perform the hk-relax step
              if v not in x:
                  x[v] = 0.
              x[v] += rvj
              r[(v, j)] = 0.
              mass = (t * rvj / (float(j) + 1.)) / len(G[v])
              for u in G[v]:  # for neighbors of v
                  next = (u, j + 1)  # in the next block
                  if j + 1 == N:  # last step, add to soln
                      x[u] = x.get(u, 0.) + rvj / len(G[v])
                      continue
                  if next not in r:
                      r[next] = 0.
                  thresh = math.exp(t) * eps * len(G[u])
                  thresh = thresh / (N * psis[j + 1]) / 2.
                  if r[next] < thresh and r[next] + mass >= thresh:
                      Q.append(next)  # add u to queue
                  r[next] = r[next] + mass

      Guarantee: let $h = e^{-t}\exp\{tP\}s$ and let x be the output of hk-push($\varepsilon$); then $\|D^{-1}(x - h)\|_\infty \le \varepsilon$ after looking at $2Ne^t/\varepsilon$ edges. We believe that the bound $N \le 2t \log(1/\varepsilon)$ suffices.
  37. PageRank vs. heat kernel. (Figures: runtime and conductance vs. log10(|V|+|E|) for hkgrow and pprgrow, with 25%, 50%, 75% quantiles.) On large graphs, our heat kernel takes slightly longer than a localized PageRank, but produces sets with smaller (better) conductance scores. Our Python code on clueweb12 (72B edges) via libbvg: 99 seconds to load, 1 second to compute.
  38. References and ongoing work.
      Gleich and Kloster – Relaxation methods for the matrix exponential, submitted.
      Kloster and Gleich – Heat kernel based community detection, KDD 2014.
      Gleich and Mahoney – Algorithmic anti-differentiation, ICML 2014.
      Gleich and Mahoney – Regularized diffusions, submitted.
      www.cs.purdue.edu/homes/dgleich/codes/nexpokit
      www.cs.purdue.edu/homes/dgleich/codes/l1pagerank
      Ongoing: improved localization bounds for functions of matrices; asynchronous and parallel "push"-style methods.
      Supported by NSF CAREER CCF-1149756. www.cs.purdue.edu/homes/dgleich