(PageRank) Centrality
of dynamic graph
structures
David F. Gleich
Computer Science
Purdue University
David Gleich · Pu...
Models and algorithms for high performance
matrix and network computations
AN14 · MS59
David Gleich · Purdue 
...
I hope to add power-grid networks soon! 
Centrality measures
“relative importance in a
network” –Wikipedia
“it’s a guess about what
might be important” -Me
They te...
Centrality measures of
dynamic graphs
Something about my network is changing, what
should I do? 

1.  Recompute at each ch...
What else to do???
“If the optimization is hard, you should be
solving a different optimization problem”
–Cris Moore
1. ...
Smart centrality for the
smart grid?
You need to adapt your centrality measure for
your application! (Or try to get luck...
Application to the power grid
Prior work 
•  Kim, Obah, 2007; Jin et al., 2010; Adolf et al., 2011; Halappanavar et
al., 2...
1.  Perspectives on PageRank
2.  PageRank as a dynamical system and
time-dependent teleportation
3.  Predicting using Page...
The random surfer model!
At a node …
1.  follow edges with prob α
2.  do something else with prob (1-α)
Google’s PageRank ...
My preferred version
of PageRank
A PageRank vector x is the solution of the linear system:
(I – αP) x = (1 – α) v
where ...
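A minimal sketch of this definition, assuming the 6-node column-stochastic matrix P that accompanies this slide in the transcript further down the page. The values α = 0.85 and a uniform v are assumptions made for illustration, and the small `solve` helper is hypothetical, written here only to keep the example dependency-free. Exact rational arithmetic makes it easy to confirm that the solution is itself a probability vector.

```python
from fractions import Fraction as F

# Column-stochastic matrix P from the 6-node example (each column sums to 1).
P = [
    [F(1, 6), F(1, 2), 0, 0, 0, 0],
    [F(1, 6), 0, 0, F(1, 3), 0, 0],
    [F(1, 6), F(1, 2), 0, F(1, 3), 0, 0],
    [F(1, 6), 0, F(1, 2), 0, 0, 0],
    [F(1, 6), 0, F(1, 2), F(1, 3), 0, 1],
    [F(1, 6), 0, 0, 0, 1, 0],
]
n = len(P)
alpha = F(85, 100)   # assumed; the slides only give a typical range
v = [F(1, n)] * n    # assumed uniform teleportation distribution


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact arithmetic."""
    size = len(A)
    M = [[F(a) for a in row] + [F(bi)] for row, bi in zip(A, b)]
    for col in range(size):
        # Find a row with a nonzero pivot and move it into place.
        piv = next(r for r in range(col, size) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        pivot = M[col][col]
        M[col] = [e / pivot for e in M[col]]
        for r in range(size):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[-1] for row in M]


# Build (I - alpha P) and (1 - alpha) v, then solve for the PageRank vector x.
A = [[(1 if i == j else 0) - alpha * P[i][j] for j in range(n)] for i in range(n)]
b = [(1 - alpha) * vi for vi in v]
x = solve(A, b)

print([float(xi) for xi in x])
```

Because P is column stochastic and v sums to one, summing the rows of the system gives (1 − α) eᵀx = 1 − α, so x sums to one without any separate normalization step.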
This definition applies to a
remarkable variety of problems
1.  GeneRank 
2.  ProteinRank 
3.  FoodRank 
4.  SportsRank 
5....
The teleportation distribution v
models where surfers “restart”

What if this changes with time?
Let’s look at how PageRank
evolves with iterations
\Delta x^{(k)} = x^{(k+1)} - x^{(k)}
\Delta x^{(k)} = \alpha P x^{(k)} + (1 - \alpha) v - x^{(k)}
\Delta x^{(k)} = (1 - \alpha) v - (I - \alpha P) x^{(k)}
x'(t) = ...
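The step from the iteration to an ODE can be checked numerically. The sketch below assumes the 6-node matrix P from the transcript of this deck, α = 0.85, and a uniform v (the last two are illustrative assumptions). It shows that one PageRank iteration equals one forward Euler step of size h = 1 on x'(t) = (1 − α)v − (I − αP)x(t), and that the iteration settles at a point where the right-hand side vanishes.

```python
ALPHA = 0.85  # assumed value for illustration
P = [
    [1 / 6, 1 / 2, 0, 0, 0, 0],
    [1 / 6, 0, 0, 1 / 3, 0, 0],
    [1 / 6, 1 / 2, 0, 1 / 3, 0, 0],
    [1 / 6, 0, 1 / 2, 0, 0, 0],
    [1 / 6, 0, 1 / 2, 1 / 3, 0, 1],
    [1 / 6, 0, 0, 0, 1, 0],
]
N = len(P)
V = [1 / N] * N  # assumed uniform teleportation


def matvec(M, x):
    """Dense matrix-vector product on plain lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def ode_rhs(x):
    """Right-hand side (1 - alpha) v - (I - alpha P) x."""
    Px = matvec(P, x)
    return [(1 - ALPHA) * vi - (xi - ALPHA * pxi) for vi, xi, pxi in zip(V, x, Px)]


def pagerank_step(x):
    """One iteration x <- alpha P x + (1 - alpha) v."""
    return [ALPHA * pxi + (1 - ALPHA) * vi for pxi, vi in zip(matvec(P, x), V)]


def euler_step(x, h):
    """One forward Euler step of size h on the ODE."""
    return [xi + h * di for xi, di in zip(x, ode_rhs(x))]


# Iterate to (numerical) convergence and measure how far from steady state we are.
x = [1.0] + [0.0] * (N - 1)
for _ in range(300):
    x = pagerank_step(x)
residual = max(abs(d) for d in ode_rhs(x))
print(x, residual)
```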
A dynamical system for
time-dependent teleportation
+ Easy to integrate
+ Easy to understand
+ Possible to treat analyti...
Need a symplectic integrator
(or self-correcting…)
We use a standard RK integrator
(ode45 in Matlab)
We used the formula...
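A rough sketch of this integration step. The transcript of this slide gives the self-correcting form x'(t) = (1 − α)v(t) − (γI − αP)x(t) with γ = (1 − α)eᵀv(t) + αeᵀx(t); the name γ is an assumption here, since the symbol was lost in extraction. The slide uses Matlab's adaptive `ode45`; the code below swaps in a plain fixed-step classical RK4 so that it runs with the standard library only. The matrix P, α = 0.85, the step size, and the static v used for the check are all illustrative assumptions.

```python
ALPHA = 0.85  # assumed value for illustration
P = [
    [1 / 6, 1 / 2, 0, 0, 0, 0],
    [1 / 6, 0, 0, 1 / 3, 0, 0],
    [1 / 6, 1 / 2, 0, 1 / 3, 0, 0],
    [1 / 6, 0, 1 / 2, 0, 0, 0],
    [1 / 6, 0, 1 / 2, 1 / 3, 0, 1],
    [1 / 6, 0, 0, 0, 1, 0],
]
N = len(P)


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def rhs(t, x, v_of_t):
    """(1 - alpha) v(t) - (gamma I - alpha P) x, with the self-correcting gamma."""
    v = v_of_t(t)
    gamma = (1 - ALPHA) * sum(v) + ALPHA * sum(x)
    Px = matvec(P, x)
    return [(1 - ALPHA) * vi - (gamma * xi - ALPHA * pxi)
            for vi, xi, pxi in zip(v, x, Px)]


def axpy(x, h, k):
    return [xi + h * ki for xi, ki in zip(x, k)]


def rk4(v_of_t, x0, t_end, h):
    """Fixed-step classical Runge-Kutta; a stand-in for an adaptive solver."""
    x, t = list(x0), 0.0
    for _ in range(round(t_end / h)):
        k1 = rhs(t, x, v_of_t)
        k2 = rhs(t + h / 2, axpy(x, h / 2, k1), v_of_t)
        k3 = rhs(t + h / 2, axpy(x, h / 2, k2), v_of_t)
        k4 = rhs(t + h, axpy(x, h, k3), v_of_t)
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x


def v_static(t):
    """Assumed static, uniform teleportation (ignores t)."""
    return [1.0 / N] * N


# Integrate from a point mass on page 1 and compare with the static PageRank vector.
x_final = rk4(v_static, [1.0] + [0.0] * (N - 1), 200.0, 0.1)

x_ref = [1.0 / N] * N
for _ in range(500):
    x_ref = [ALPHA * pxi + (1 - ALPHA) / N for pxi in matvec(P, x_ref)]

print(max(abs(a - b) for a, b in zip(x_final, x_ref)))
```

With this choice of γ, summing the ODE gives d(eᵀx)/dt = γ(1 − eᵀx), so any drift of eᵀx away from one is pulled back. That is one way to read "self-correcting" on the slide.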
Where is this model realistic?
On Wikipedia, we have
hourly visit data that provides
a coarse measure of outside
interest
...
Now PageRank values are
time-series, not static scores
[Figure: grid of per-page time-series panels for Wikipedia pages, including MainPage, 501(c), and Earthquake]
Austr...
Some quick theory
x(t) = \exp[-(I - \alpha P) t] \, x(0) + (1 - \alpha) \int_0^t \exp[-(I - \alpha P)(t - \tau)] \, v(\tau) \, d\tau.
x'(t) = (1 - \alpha) v(t) - (I - \alpha P) x(t)
\int_0^t \exp...
Thus we recover
the original PageRank vector
if interest stops changing.
Modeling cyclical behavior
Cyclically switch between teleportation vectors vj 
v(t) = \frac{1}{k} \sum_{j=1}^{k} v_j ( \cos(t + (j-1) 2\pi / k) ...
[Figure: dynamic PageRank versus time (0 to 20) for Pages 1 to 4]
Cyclical behavior in the time-
dependent...
Modeling cyclical behavior
Cyclically switch between teleportation vectors vj 
v(t) = \frac{1}{k} \sum_{j=1}^{k} v_j ( \cos(t + (j-1) 2\pi / k) ...
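The transcript of this slide states the eventual solution as x(t) = x + Re{s exp(it)}, where (I − αP)x = (1 − α)(1/k)Ve and (I − α/(1+i) P)s = (1 − α)/(k(1+i)) V exp(if). The sketch below checks that claim numerically under several assumptions: the 6-node matrix P from earlier in the deck, α = 0.85, k = 2 hypothetical teleportation vectors concentrated on pages 1 and 4, and f read as the vector of phase offsets f_j = (j − 1)2π/k. Both systems are solved by a simple fixed-point iteration, which is a convenience chosen here and not necessarily what the author used.

```python
import cmath
import math

ALPHA = 0.85  # assumed value for illustration
P = [
    [1 / 6, 1 / 2, 0, 0, 0, 0],
    [1 / 6, 0, 0, 1 / 3, 0, 0],
    [1 / 6, 1 / 2, 0, 1 / 3, 0, 0],
    [1 / 6, 0, 1 / 2, 0, 0, 0],
    [1 / 6, 0, 1 / 2, 1 / 3, 0, 1],
    [1 / 6, 0, 0, 0, 1, 0],
]
N = len(P)

# Hypothetical teleportation vectors v_1, v_2 (all mass on page 1, then page 4).
VS = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
]
K = len(VS)
PHASE = [2 * math.pi * j / K for j in range(K)]  # (j - 1) 2 pi / k for j = 1..k


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def solve_fixed_point(c, b, iters=400):
    """Solve (I - c P) y = b by iterating y <- c P y + b; converges since |c| < 1."""
    y = list(b)
    for _ in range(iters):
        y = [c * pyi + bi for pyi, bi in zip(matvec(P, y), b)]
    return y


def v_of_t(t):
    """Cyclical teleportation v(t) = (1/k) sum_j v_j (cos(t + phase_j) + 1)."""
    return [sum(VS[j][i] * (math.cos(t + PHASE[j]) + 1) for j in range(K)) / K
            for i in range(N)]


# PageRank with the average teleportation vector.
b_mean = [(1 - ALPHA) * sum(VS[j][i] for j in range(K)) / K for i in range(N)]
x_mean = solve_fixed_point(ALPHA, b_mean)

# PageRank-like system with complex "teleportation".
b_osc = [(1 - ALPHA) / (K * (1 + 1j))
         * sum(VS[j][i] * cmath.exp(1j * PHASE[j]) for j in range(K))
         for i in range(N)]
s = solve_fixed_point(ALPHA / (1 + 1j), b_osc)

# Per-page oscillation magnitude.
amplitude = [abs(si) for si in s]


def x_of_t(t):
    """Eventual solution x(t) = x_mean + Re{s exp(i t)}."""
    return [xm + (si * cmath.exp(1j * t)).real for xm, si in zip(x_mean, s)]


print(amplitude)
```

The entries of `amplitude` are the quantity the summary slide describes as the magnitude of the oscillation for each node.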
Summary
If you have cyclical interest on a node, we have
a NEW centrality measure that provides the
magnitude of the oscil...
Thus we can determine
the size of the oscillation
for the case of cyclical
teleportation
Is it useful? Let’s try and
predict retweets on Twitter 
We crawled Twitter and gathered
a graph of who follows whom and ...
… and then there are details I can go into …
The results
Dataset   Type         θ      Error ratio, by timescale s
                              s = 1   s = 2   s = 6   s = ∞
TWITTER   stationary   0.01   0.635   0.929   0.913   0.996
                       0.50   0.636   0....
Using Granger Causality to study link
relationships on Wikipedia
[Figure: grid of per-page time-series panels for Wikipedia pages, including Science, Gackt, ...]
To the power grid … 
Line failures in the grid
can be anticipated via
linearized DC
dynamics 


Hines et al.?
...
The PageRank problem &
the Laplacian
Combinatorial
Laplacian
1. (I - \alpha A D^{-1}) x = (1...
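A short worked derivation of how the PageRank system connects to the combinatorial Laplacian L = D − A. It assumes an undirected graph with adjacency matrix A and degree matrix D. The symbol γ is a label chosen here, because the corresponding symbol did not survive extraction in the transcript; the coefficient on the right-hand side follows from the algebra and is not read off the slide.

```latex
\begin{aligned}
(I - \alpha A D^{-1})\, x &= (1 - \alpha)\, v \\
(D - \alpha A)\, z &= (1 - \alpha)\, v
  && \text{substitute } x = D z \\
\bigl[(1 - \alpha) D + \alpha (D - A)\bigr] z &= (1 - \alpha)\, v
  && \text{split } D = (1 - \alpha) D + \alpha D \\
\bigl[\gamma D + L\bigr] z &= \gamma\, v
  && \text{divide by } \alpha,\ \gamma = \tfrac{1 - \alpha}{\alpha},\ L = D - A
\end{aligned}
```

The relation γ = (1 − α)/α is the same as α = 1/(1 + γ), which matches the parameter change given in the transcript. As α → 1 the shift γ → 0, which suggests why the limit involves the pseudoinverse of L.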
Some potential applications
1.  PageRank can be thought of as a type of
regularization; often helps improve on simple
cent...
Results on the power grid 
… pending … 
Questions, Conclusions, and
References!
Questions!
How to validate some of these
ideas?
Too simplistic?
Other power-grid p...
PageRank Centrality of dynamic graph structures

A talk I gave at the SIAM Annual Meeting Mini-symposium on the mathematics of the power grid organized by Mahantesh Halappanavar. I discuss a few ideas on how our dynamic centrality could help analyze such situations.

Published in: Technology, Education

PageRank Centrality of dynamic graph structures

  1. (PageRank) Centrality of dynamic graph structures. David F. Gleich, Computer Science, Purdue University.
  2. Models and algorithms for high performance matrix and network computations
     [Figures: cropped excerpts from earlier papers, including plots of model error against prediction standard deviation and a network alignment diagram (edge overlap with matrix methods; triangle matching with tensor methods)]
     Tensor eigenvalues and a power method: maximize \sum_{ijk} T_{ijk} x_i x_j x_k subject to \|x\|_2 = 1, with update [x^{(next)}]_i = \rho \cdot ( \sum_{jk} T_{ijk} x_j x_k + x_i ), where \rho ensures the 2-norm constraint. SSHOPM method due to Kolda and Mayo.
     Big data methods: SIMAX ‘09, SISC ‘11, MapReduce ‘11, ICASSP ’12
     Network alignment: ICDM ‘09, SC ‘11, TKDE ‘13
     Fast & scalable network centrality: SC ‘05, WAW ‘07, SISC ‘10, WWW ’10, …
     Data clustering: WSDM ‘12, KDD ‘12, CIKM ’13 …
     Ax = b; min \|Ax - b\|; Ax = \lambda x
     Massive matrix computations on multi-threaded and distributed architectures
  3. I hope to add power-grid networks soon!
  4. Centrality measures: “relative importance in a network” –Wikipedia; “it’s a guess about what might be important” –Me. They tell us something about a network considering its topology. They need to be deployed with extreme care! [Figure from Wikipedia]
  5. Centrality measures of dynamic graphs. Something about my network is changing, what should I do?
     1. Recompute at each change
     2. Batch up changes, and periodically recompute
     3. Efficiently update (i.e. recompute smartly!)
     4. Approximately update/compute
     5. Do something else.
  6. What else to do??? “If the optimization is hard, you should be solving a different optimization problem” –Cris Moore
     1. Des Higham et al.: adapt the fundamentals to discrete time
     2. Use dynamical system generalizations: Gleich and Rossi 2012/2014; and Des Higham et al. 2014
     3. Likely more too…
  7. Smart centrality for the smart grid? You need to adapt your centrality measure for your application! (Or try to get lucky!)
  8. Application to the power grid. Prior work (Kim, Obah, 2007; Jin et al., 2010; Adolf et al., 2011; Halappanavar et al., 2012) has found that graph properties have important correlations with power-grid vulnerabilities and contingency analysis.
  9. Outline
     1. Perspectives on PageRank
     2. PageRank as a dynamical system and time-dependent teleportation
     3. Predicting using PageRank
     4. Applications to the power-grid?
  10. The random surfer model. At a node … 1. follow edges with prob α; 2. do something else with prob (1-α). Google’s PageRank is one possible answer.
      PageRank by Google [Figure: example graph with nodes 1 to 6]
      The Model: 1. follow edges uniformly with probability α, and 2. randomly jump with probability 1 - α; we’ll assume everywhere is equally likely.
      The places we find the surfer most often are important pages. The important pages are the places we are most likely to find the random surfer.
  11. My preferred version of PageRank. A PageRank vector x is the solution of the linear system (I – αP) x = (1 – α) v, where P is a column stochastic matrix, 0 ≤ α < 1, and v is a probability vector.
      P = \begin{bmatrix}
        1/6 & 1/2 & 0 & 0 & 0 & 0 \\
        1/6 & 0 & 0 & 1/3 & 0 & 0 \\
        1/6 & 1/2 & 0 & 1/3 & 0 & 0 \\
        1/6 & 0 & 1/2 & 0 & 0 & 0 \\
        1/6 & 0 & 1/2 & 1/3 & 0 & 1 \\
        1/6 & 0 & 0 & 0 & 1 & 0
      \end{bmatrix}
      Just three ingredients! P_{ij} \ge 0, e^T P = e^T; v_i \ge 0, e^T v = 1; \alpha usually 0.5 to 0.99.
  12. This definition applies to a remarkable variety of problems: 1. GeneRank 2. ProteinRank 3. FoodRank 4. SportsRank 5. HostRank 6. TrustRank 7. BadRank 8. ObjectRank 9. ItemRank 10. ArticleRank 11. BookRank 12. FutureRank 13. TimedPageRank 14. SocialPageRank 15. DiffusionRank 16. ImpressionRank 17. TweetRank 18. TwitterRank 19. ReversePageRank 20. PageTrust 21. PopRank 22. CiteRank 23. FactRank 24. InvestorRank 25. ImageRank 26. VisualRank 27. QueryRank 28. BookmarkRank 29. StoryRank 30. PerturbationRank 31. ChemicalRank 32. RoadRank 33. PaperRank 34. Etc…
  13. The teleportation distribution v models where surfers “restart”. What if this changes with time?
  14. Let’s look at how PageRank evolves with iterations
      \Delta x^{(k)} = x^{(k+1)} - x^{(k)}
      \Delta x^{(k)} = \alpha P x^{(k)} + (1 - \alpha) v - x^{(k)}
      \Delta x^{(k)} = (1 - \alpha) v - (I - \alpha P) x^{(k)}
      x'(t) = (1 - \alpha) v - (I - \alpha P) x(t)
      PageRank is the steady-state solution of the ODE.
  15. A dynamical system for time-dependent teleportation
      x'(t) = (1 - \alpha) v(t) - (I - \alpha P) x(t)
      + Easy to integrate
      + Easy to understand
      + Possible to treat analytically!
      – Need to “model time” (not dimensionless)
      – Still useful to have a data assimilation model
  16. Need a symplectic integrator (or self-correcting…). We use a standard RK integrator (ode45 in Matlab). We used the formulation
      x'(t) = (1 - \alpha) v(t) - (\gamma I - \alpha P) x(t), \quad \gamma = (1 - \alpha) e^T v(t) + \alpha e^T x(t)
      to maintain x(t) as a probability distribution.
  17. Where is this model realistic? On Wikipedia, we have hourly visit data that provides a coarse measure of outside interest.
  18. Now PageRank values are time-series, not static scores. [Figure: importance versus time for the Earthquake page, annotated “Australian Earthquake occurs!”, and for the Main page; the background shows a grid of per-page panels]
  19. Some quick theory, for x'(t) = (1 - \alpha) v(t) - (I - \alpha P) x(t).
      For general v(t):
      x(t) = \exp[-(I - \alpha P) t] \, x(0) + (1 - \alpha) \int_0^t \exp[-(I - \alpha P)(t - \tau)] \, v(\tau) \, d\tau.
      For static v(t) = v:
      \int_0^t \exp[-(I - \alpha P)(t - \tau)] \, v \, d\tau = (I - \alpha P)^{-1} v - \exp[-(I - \alpha P) t] (I - \alpha P)^{-1} v
      x(t) = \exp[-(I - \alpha P) t] (x(0) - x) + x, where x is the original PageRank vector.
  20. Thus we recover the original PageRank vector if interest stops changing.
  21. Modeling cyclical behavior. Cyclically switch between teleportation vectors v_j:
      v(t) = \frac{1}{k} \sum_{j=1}^{k} v_j \left( \cos\left(t + \frac{(j-1) 2\pi}{k}\right) + 1 \right)
      [Figure: time-dependent teleportation versus time (0 to 80) for Pages 1 to 4, alternating between v_1 and v_2]
  22. Cyclical behavior in the time-dependent PageRank scores. [Figures: dynamic PageRank versus time (0 to 20) for Pages 1 to 4; a 4-node example graph; time-dependent teleportation versus time (0 to 80) for Pages 1 to 4]
  23. Modeling cyclical behavior. Cyclically switch between teleportation vectors v_j:
      v(t) = \frac{1}{k} \sum_{j=1}^{k} v_j \left( \cos\left(t + \frac{(j-1) 2\pi}{k}\right) + 1 \right)
      Then the eventual solution is x(t) = x + \mathrm{Re}\{ s \exp(i t) \}, where
      (I - \alpha P) x = (1 - \alpha) \frac{1}{k} V e (PageRank vector with average teleportation)
      (I - \frac{\alpha}{1 + i} P) s = (1 - \alpha) \frac{1}{k (1 + i)} V \exp(i f) (PageRank with complex teleportation)
  24. Summary. If you have cyclical interest on a node, we have a NEW centrality measure that provides the magnitude of the oscillation based on PageRank with complex valued “teleportation.”
  25. Thus we can determine the size of the oscillation for the case of cyclical teleportation.
  26. Is it useful? Let’s try and predict retweets on Twitter. We crawled Twitter and gathered a graph of who follows whom and how active each user is in a month. This yields a graph and 6 vectors v. Our goal is to predict how many tweets you’ll send next month based on the current month!
  27. … and then there are details I can go into …
  28. The results (entries are the error ratio for each timescale s)
      Dataset   Type             θ      s = 1   s = 2   s = 6   s = ∞
      TWITTER   stationary       0.01   0.635   0.929   0.913   0.996
                                 0.50   0.636   0.735   0.854   0.939
                                 1.00   0.522   0.562   0.710   0.963
                non-stationary   0.01   0.461   0.841   1.001   0.992
                                 0.50   0.261   0.608   0.585   0.929
                                 1.00   0.137   0.605   0.617   0.918
      Err Ratio = SMAPE(tweets + Time-dependent PR) / SMAPE(tweets only)
      If this ratio < 1, then using Time-dependent PR helps.
      Stationary nodes are those with small maximum change in scores.
      Non-stationary nodes are those with large maximum change in scores.
  29. Using Granger Causality to study link relationships on Wikipedia. [Figure: grid of per-page time-series panels, with the Earthquake and Richter Mag. pages highlighted] Earthquake causes Richter Mag.? Of course! We build this into the model. But the question is, which of these are preserved after incorporating the effects of page view data?
  30. To the power grid … Line failures in the grid can be anticipated via linearized DC dynamics (Hines et al.?): c = \mathrm{diag}(B L^{+} B^T)
  31. The PageRank problem & the Laplacian (combinatorial Laplacian L)
      1. (I - \alpha A D^{-1}) x = (1 - \alpha) v;
      2. (I - \alpha \mathcal{A}) y = (1 - \alpha) D^{-1/2} v, where \mathcal{A} = D^{-1/2} A D^{-1/2} and x = D^{1/2} y; and
      3. [\gamma D + L] z = \gamma v, where \alpha = 1/(1 + \gamma) and x = D z.
      Let x(\alpha) solve PageRank and let v^T e = 0. Then \lim_{\alpha \to 1} x(\alpha) = S L^{+} v, where S is a scaling matrix.
  32. Some potential applications
      1. PageRank can be thought of as a type of regularization; often helps improve on simple centrality baselines
      2. Limits of PageRank interpolate between centrality and spectral clustering [Mahoney, Orecchia, and Vishnoi]
      3. Time dependent teleportation models; adaptations to node dropouts possible.
      4. Use PageRank on the line graph?
  33. Results on the power grid … pending …
  34. Questions, Conclusions, and References
      Questions: How to validate some of these ideas? Too simplistic? Other power-grid problems where similar ideas may be able to help? Collaborators?
      Dear David, Please remember to repeat the question!
      Paper: Gleich & Rossi, Internet Mathematics, 2014
      Code: https://www.cs.purdue.edu/homes/dgleich/codes/dynsyspr-im
      Conclusions: Centrality is more complicated than just one method. It’s possible to tune centrality measures to different structures and this makes it a flexible setup.