- 1. (PageRank) Centrality of dynamic graph structures. David F. Gleich, Computer Science, Purdue University. (David Gleich · Purdue · AN14 · MS59)
- 2. Models and algorithms for high performance matrix and network computations. Tensor eigenvalues and a power method: maximize $\sum_{ijk} T_{ijk} x_i x_j x_k$ subject to $\|x\|_2 = 1$, using the SSHOPM method due to Kolda and Mayo, $[x^{(\text{next})}]_i = \rho \cdot \bigl(\sum_{jk} T_{ijk} x_j x_k + \gamma x_i\bigr)$, where $\rho$ enforces the 2-norm constraint. Big data methods: SIMAX ’09, SISC ’11, MapReduce ’11, ICASSP ’12. Network alignment: ICDM ’09, SC ’11, TKDE ’13. Fast & scalable network centrality: SC ’05, WAW ’07, SISC ’10, WWW ’10, … Data clustering: WSDM ’12, KDD ’12, CIKM ’13, … Massive matrix computations ($Ax = b$, $\min \|Ax - b\|$, $Ax = \lambda x$) on multi-threaded and distributed architectures. [Figures: simulation error and standard-deviation plots; network alignment by edge overlap and by triangle matching.]
- 3. I hope to add power-grid networks soon!
- 4. Centrality measures: “relative importance in a network” (Wikipedia); “it’s a guess about what might be important” (me). They tell us something about a network considering its topology. They need to be deployed with extreme care! [Figure from Wikipedia.]
- 5. Centrality measures of dynamic graphs. Something about my network is changing; what should I do? 1. Recompute at each change. 2. Batch up changes and periodically recompute. 3. Efficiently update (i.e. recompute smartly!). 4. Approximately update/compute. 5. Do something else.
- 6. What else to do? “If the optimization is hard, you should be solving a different optimization problem” (Cris Moore). 1. Des Higham et al.: adapt the fundamentals to discrete time. 2. Use dynamical system generalizations: Gleich and Rossi 2012/2014, and Des Higham et al. 2014. 3. Likely more too…
- 7. Smart centrality for the smart grid? You need to adapt your centrality measure for your application! (Or try to get lucky!)
- 8. Application to the power grid. Prior work (Kim, Obah, 2007; Jin et al., 2010; Adolf et al., 2011; Halappanavar et al., 2012) has found that graph properties have important correlations with power-grid vulnerabilities and contingency analysis.
- 9. 1. Perspectives on PageRank 2. PageRank as a dynamical system and time-dependent teleportation 3. Predicting using PageRank 4. Applications to the power grid?
- 10. The random surfer model. At a node: 1. follow edges with probability $\alpha$; 2. do something else with probability $1 - \alpha$. Google’s PageRank is one possible answer: 1. follow edges uniformly with probability $\alpha$, and 2. randomly jump with probability $1 - \alpha$ (we’ll assume everywhere is equally likely). The important pages are the places we are most likely to find the random surfer. [Figure: six-node example graph.]
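
The random surfer model on this slide can be simulated directly. The following is a minimal sketch, not the author's code: the out-link lists are read off the columns of the six-node matrix on the next slide (an assumption about the pictured graph), node 1 has no out-links and so always jumps, and `simulate_surfer`, `OUT_LINKS`, and the choice $\alpha = 0.85$ are hypothetical.

```python
import random

# Out-links of the six-node example, read off the columns of the matrix P
# on the next slide (an assumption about the pictured graph). Nodes are 0-based.
OUT_LINKS = {0: [], 1: [0, 2], 2: [3, 4], 3: [1, 2, 4], 4: [5], 5: [4]}


def simulate_surfer(out_links, alpha, steps, seed=0):
    """Return the fraction of steps a random surfer spends at each node."""
    rng = random.Random(seed)
    n = len(out_links)
    visits = [0] * n
    node = rng.randrange(n)
    for _ in range(steps):
        visits[node] += 1
        links = out_links[node]
        if links and rng.random() < alpha:
            node = rng.choice(links)   # follow an out-link uniformly
        else:
            node = rng.randrange(n)    # jump; everywhere is equally likely
    return [count / steps for count in visits]


# alpha = 0.85 is an assumed value, not taken from the slide.
frequencies = simulate_surfer(OUT_LINKS, alpha=0.85, steps=200_000)
for number, freq in enumerate(frequencies, start=1):
    print(f"node {number}: {freq:.3f}")
```

The visit frequencies approximate the PageRank scores; nodes 5 and 6 link only to each other, so the surfer is found there most often.
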
- 11. My preferred version of PageRank. A PageRank vector $x$ is the solution of the linear system $(I - \alpha P) x = (1 - \alpha) v$, where $P$ is a column-stochastic matrix ($P_{ij} \ge 0$, $e^T P = e^T$), $0 \le \alpha < 1$ (usually 0.5 to 0.99), and $v$ is a probability vector ($v_i \ge 0$, $e^T v = 1$). Just three ingredients! For the six-node example, $P = \begin{bmatrix} 1/6 & 1/2 & 0 & 0 & 0 & 0 \\ 1/6 & 0 & 0 & 1/3 & 0 & 0 \\ 1/6 & 1/2 & 0 & 1/3 & 0 & 0 \\ 1/6 & 0 & 1/2 & 0 & 0 & 0 \\ 1/6 & 0 & 1/2 & 1/3 & 0 & 1 \\ 1/6 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}$.
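
To make the three ingredients concrete, here is a small sketch that solves $(I - \alpha P)x = (1 - \alpha)v$ for the six-node matrix on this slide. The values $\alpha = 0.85$ and uniform $v$ are assumptions, and `solve` and `pagerank` are hypothetical helper names; a plain Gaussian elimination stands in for whatever solver one would use in practice.

```python
ALPHA = 0.85  # assumed; the slide only says "usually 0.5 to 0.99"

# Column-stochastic matrix P from the slide (every column sums to 1).
P = [
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
]
N = len(P)
V = [1.0 / N] * N  # uniform teleportation vector (an assumption)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def pagerank(p, alpha, v):
    """Return x with (I - alpha P) x = (1 - alpha) v."""
    n = len(v)
    a = [[(1.0 if i == j else 0.0) - alpha * p[i][j] for j in range(n)]
         for i in range(n)]
    return solve(a, [(1.0 - alpha) * vi for vi in v])


x = pagerank(P, ALPHA, V)
print([round(xi, 4) for xi in x])
```

Because $e^T P = e^T$ and $e^T v = 1$, the solution sums to one, so it can be read as a probability distribution over the nodes.
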
- 12. This definition applies to a remarkable variety of problems: 1. GeneRank 2. ProteinRank 3. FoodRank 4. SportsRank 5. HostRank 6. TrustRank 7. BadRank 8. ObjectRank 9. ItemRank 10. ArticleRank 11. BookRank 12. FutureRank 13. TimedPageRank 14. SocialPageRank 15. DiffusionRank 16. ImpressionRank 17. TweetRank 18. TwitterRank 19. ReversePageRank 20. PageTrust 21. PopRank 22. CiteRank 23. FactRank 24. InvestorRank 25. ImageRank 26. VisualRank 27. QueryRank 28. BookmarkRank 29. StoryRank 30. PerturbationRank 31. ChemicalRank 32. RoadRank 33. PaperRank 34. Etc…
- 13. The teleportation distribution $v$ models where surfers “restart”. What if this changes with time?
- 14. Let’s look at how PageRank evolves with iterations: $\Delta x^{(k)} = x^{(k+1)} - x^{(k)} = \alpha P x^{(k)} + (1 - \alpha) v - x^{(k)} = (1 - \alpha) v - (I - \alpha P) x^{(k)}$. In continuous time, $x'(t) = (1 - \alpha) v - (I - \alpha P) x(t)$. PageRank is the steady-state solution of the ODE.
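
The link between the iteration and the ODE can be checked numerically: a forward Euler step of size $h = 1$ on $x'(t) = (1 - \alpha)v - (I - \alpha P)x(t)$ is exactly one PageRank iteration. The sketch below assumes the six-node matrix from slide 11, $\alpha = 0.85$, and uniform $v$; the function names are hypothetical.

```python
ALPHA = 0.85  # assumed value
P = [
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
]
V = [1.0 / len(P)] * len(P)  # uniform teleportation (an assumption)


def matvec(p, x):
    """Return the matrix-vector product p x."""
    return [sum(pij * xj for pij, xj in zip(row, x)) for row in p]


def ode_rhs(x):
    """x'(t) = (1 - alpha) v - (I - alpha P) x(t)."""
    px = matvec(P, x)
    return [(1 - ALPHA) * vi - (xi - ALPHA * pxi)
            for vi, xi, pxi in zip(V, x, px)]


def euler_step(x, h):
    """One forward Euler step of size h."""
    return [xi + h * di for xi, di in zip(x, ode_rhs(x))]


def pagerank_step(x):
    """One step of the iteration x <- alpha P x + (1 - alpha) v."""
    return [ALPHA * pxi + (1 - ALPHA) * vi for pxi, vi in zip(matvec(P, x), V)]


x = list(V)
for _ in range(200):
    x = euler_step(x, 1.0)  # identical to pagerank_step(x)
print("largest |x'| at the end:", max(abs(d) for d in ode_rhs(x)))
```

At the fixed point the right-hand side vanishes, which is the sense in which PageRank is the steady state.
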
- 15. A dynamical system for time-dependent teleportation: $x'(t) = (1 - \alpha) v(t) - (I - \alpha P) x(t)$. + Easy to integrate. + Easy to understand. + Possible to treat analytically! − Need to “model time” (not dimensionless). − Still useful to have a data assimilation model.
- 16. Need a symplectic integrator (or self-correcting…). We use a standard RK integrator (ode45 in Matlab). We used the formulation $x'(t) = (1 - \alpha) v(t) - (\gamma I - \alpha P) x(t)$, with $\gamma = (1 - \alpha) e^T v(t) + \alpha e^T x(t)$, to maintain $x(t)$ as a probability distribution.
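
As an illustration of the self-correcting formulation, the sketch below integrates it with a fixed-step classical fourth-order Runge-Kutta scheme. This is a stand-in for the adaptive `ode45` the slide mentions, not a reproduction of it. The matrix is the six-node example from slide 11; $\alpha = 0.85$, the step size, and the two teleportation vectors `V1` and `V2` are assumptions made for the example.

```python
import math

ALPHA = 0.85  # assumed value
P = [
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
]
V1 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical: all interest on node 1
V2 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  # hypothetical: all interest on node 4


def teleport(t):
    """A smooth, periodic blend of V1 and V2 that always sums to 1."""
    w = 0.5 * (math.cos(t) + 1.0)
    return [w * a + (1.0 - w) * b for a, b in zip(V1, V2)]


def rhs(t, x):
    """x' = (1 - alpha) v(t) - (gamma I - alpha P) x, with the slide's gamma."""
    v = teleport(t)
    gamma = (1.0 - ALPHA) * sum(v) + ALPHA * sum(x)
    px = [sum(pij * xj for pij, xj in zip(row, x)) for row in P]
    return [(1.0 - ALPHA) * vi - (gamma * xi - ALPHA * pxi)
            for vi, xi, pxi in zip(v, x, px)]


def rk4_step(f, t, x, h):
    """One classical RK4 step for x' = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def integrate(x0, t_end, h=0.05):
    """Return [(t, x(t)), ...] on a uniform grid from 0 to t_end."""
    t, x = 0.0, list(x0)
    path = [(t, x)]
    while t < t_end - 1e-12:
        x = rk4_step(rhs, t, x, h)
        t += h
        path.append((t, x))
    return path


path = integrate([1.0 / 6] * 6, t_end=20.0)
print("final sum of x:", sum(path[-1][1]))
```

With $e^T v(t) = 1$, the sum $e^T x(t)$ obeys a scalar equation whose right-hand side is zero at $e^T x = 1$, so an $x(0)$ that sums to one keeps summing to one.
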
- 17. Where is this model realistic? On Wikipedia, we have hourly visit data that provides a coarse measure of outside interest.
- 18. Now PageRank values are time-series, not static scores. [Figure: importance over time for Wikipedia pages, including Main Page and Earthquake, annotated “Australian earthquake occurs!”]
- 19. Some quick theory for $x'(t) = (1 - \alpha) v(t) - (I - \alpha P) x(t)$. For general $v(t)$: $x(t) = \exp[-(I - \alpha P)t]\, x(0) + (1 - \alpha) \int_0^t \exp[-(I - \alpha P)(t - \tau)]\, v(\tau)\, d\tau$. For static $v(t) = v$: $\int_0^t \exp[-(I - \alpha P)(t - \tau)]\, v\, d\tau = (I - \alpha P)^{-1} v - \exp[-(I - \alpha P)t]\,(I - \alpha P)^{-1} v$, so $x(t) = \exp[-(I - \alpha P)t]\,(x(0) - x) + x$, where $x$ is the original PageRank vector.
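
The step from the general solution to the static one can be spelled out as follows. This is a sketch of the algebra, using the shorthand $M = I - \alpha P$ (an abbreviation introduced here, not on the slide) and the fact that $M^{-1}$ commutes with $e^{-Mt}$.

```latex
\begin{aligned}
M &:= I - \alpha P, \qquad x := (1 - \alpha) M^{-1} v \\
\int_0^t e^{-M(t - \tau)} \, d\tau \; v
  &= \Bigl[ M^{-1} e^{-M(t - \tau)} \Bigr]_{\tau = 0}^{\tau = t} v
   = M^{-1} v - e^{-Mt} M^{-1} v \\
x(t) &= e^{-Mt} x(0) + (1 - \alpha) \bigl( M^{-1} v - e^{-Mt} M^{-1} v \bigr) \\
     &= e^{-Mt} x(0) + x - e^{-Mt} x \\
     &= e^{-Mt} \bigl( x(0) - x \bigr) + x
\end{aligned}
```

Since $P$ is column stochastic and $\alpha < 1$, every eigenvalue of $M$ has real part at least $1 - \alpha > 0$, so $e^{-Mt} \to 0$ and $x(t) \to x$ as $t$ grows.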
- 20. Thus we recover the original PageRank vector if interest stops changing.
- 21. Modeling cyclical behavior. Cyclically switch between teleportation vectors $v_j$: $v(t) = \frac{1}{k} \sum_{j=1}^{k} v_j \left( \cos\left( t + \frac{(j-1) 2\pi}{k} \right) + 1 \right)$. [Figure: time-dependent teleportation vs. time for Pages 1 to 4, switching between $v_1$ and $v_2$.]
- 22. Cyclical behavior in the time-dependent PageRank scores. [Figures: dynamic PageRank vs. time for Pages 1 to 4 on a four-node graph; time-dependent teleportation vs. time for Pages 1 to 4.]
- 23. Modeling cyclical behavior. Cyclically switch between teleportation vectors $v_j$: $v(t) = \frac{1}{k} \sum_{j=1}^{k} v_j \left( \cos\left( t + \frac{(j-1) 2\pi}{k} \right) + 1 \right)$. Then the eventual solution is $x(t) = x + \operatorname{Re}\{ s \exp(\imath t) \}$, where $(I - \alpha P) x = (1 - \alpha) \frac{1}{k} V e$ (PageRank vector with average teleportation) and $\left( I - \frac{\alpha}{1 + \imath} P \right) s = (1 - \alpha) \frac{1}{k(1 + \imath)} V \exp(\imath f)$ (PageRank with complex teleportation).
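
The two linear systems on this slide can be solved with ordinary complex arithmetic. The sketch below makes several assumptions that are not on the slide: $V = [v_1, \dots, v_k]$ collects the teleportation vectors as columns, $f_j = (j - 1) 2\pi / k$ are the phase offsets from $v(t)$, the graph is the six-node example from slide 11, $\alpha = 0.85$, and `VS` holds two hypothetical teleportation vectors. Both systems are solved by a simple fixed-point iteration, which converges because the scale factor has modulus below one.

```python
import cmath
import math

ALPHA = 0.85  # assumed value
P = [
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
]
# Hypothetical teleportation vectors v_1 and v_2 (interest on node 1 or node 4).
VS = [[1.0, 0, 0, 0, 0, 0], [0, 0, 0, 1.0, 0, 0]]
K, N = len(VS), len(P)
PHASES = [2.0 * math.pi * j / K for j in range(K)]  # f_j with 0-based j


def teleport(t):
    """v(t) = (1/k) sum_j v_j (cos(t + f_j) + 1)."""
    return [sum(VS[j][i] * (math.cos(t + PHASES[j]) + 1.0) for j in range(K)) / K
            for i in range(N)]


def fixed_point(scale, b, iters=300):
    """Solve (I - scale P) y = b by iterating y <- scale P y + b."""
    y = list(b)
    for _ in range(iters):
        y = [scale * sum(pij * yj for pij, yj in zip(row, y)) + bi
             for row, bi in zip(P, b)]
    return y


# Average teleportation (1/k) V e and complex teleportation (1/k) V exp(i f).
v_avg = [sum(VS[j][i] for j in range(K)) / K for i in range(N)]
v_cplx = [sum(VS[j][i] * cmath.exp(1j * PHASES[j]) for j in range(K)) / K
          for i in range(N)]

x_bar = fixed_point(ALPHA, [(1 - ALPHA) * vi for vi in v_avg])
s = fixed_point(ALPHA / (1 + 1j), [(1 - ALPHA) / (1 + 1j) * vi for vi in v_cplx])


def x_eventual(t):
    """x(t) = x_bar + Re{ s exp(i t) }, valid once transients have decayed."""
    return [xb + (si * cmath.exp(1j * t)).real for xb, si in zip(x_bar, s)]


amplitude = [abs(si) for si in s]  # size of each node's oscillation
print([round(a, 4) for a in amplitude])
```

The modulus of each entry of $s$ gives the size of that node's oscillation around its average score, and its argument gives the phase lag relative to the teleportation cycle.
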
- 24. Summary. If you have cyclical interest on a node, we have a NEW centrality measure that provides the magnitude of the oscillation based on PageRank with complex-valued “teleportation.”
- 25. Thus we can determine the size of the oscillation for the case of cyclical teleportation.
- 26. Is it useful? Let’s try and predict retweets on Twitter. We crawled Twitter and gathered a graph of who follows who and how active each user is in a month. This yields a graph and 6 vectors $v$. Our goal is to predict how many tweets you’ll send next month based on the current month!
- 27. … and then there are details I can go into …
- 28. The results. Error ratio on the TWITTER dataset, by node type, $\theta$, and timescale $s$:

  | Type | $\theta$ | $s = 1$ | $s = 2$ | $s = 6$ | $s = \infty$ |
  |---|---|---|---|---|---|
  | stationary | 0.01 | 0.635 | 0.929 | 0.913 | 0.996 |
  | stationary | 0.50 | 0.636 | 0.735 | 0.854 | 0.939 |
  | stationary | 1.00 | 0.522 | 0.562 | 0.710 | 0.963 |
  | non-stationary | 0.01 | 0.461 | 0.841 | 1.001 | 0.992 |
  | non-stationary | 0.50 | 0.261 | 0.608 | 0.585 | 0.929 |
  | non-stationary | 1.00 | 0.137 | 0.605 | 0.617 | 0.918 |

  Err Ratio = (SMAPE using tweets + time-dependent PR) / (SMAPE using tweets only). If this ratio is < 1, then using time-dependent PR helps. Stationary nodes are those with small maximum change in scores. Non-stationary nodes are those with large maximum change in scores.
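
For readers unfamiliar with the metric, the error ratio can be sketched as below. SMAPE has several variants in the literature; this sketch assumes the mean of $|F - A| / ((|A| + |F|)/2)$, which may differ from the variant used for the table. The function names and the toy numbers are hypothetical and are not data from the talk.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error (one common variant)."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        if denom > 0.0:  # a pair of zeros contributes no error
            total += abs(f - a) / denom
    return total / len(actual)


def error_ratio(actual, with_pr, tweets_only):
    """SMAPE of the model with time-dependent PR over SMAPE of the baseline."""
    return smape(actual, with_pr) / smape(actual, tweets_only)


# Toy numbers, made up purely to exercise the functions.
actual = [10.0, 20.0, 5.0]
tweets_only = [15.0, 10.0, 9.0]
with_pr = [11.0, 19.0, 6.0]
print(round(error_ratio(actual, with_pr, tweets_only), 3))
```

A ratio below one means the forecast that also uses time-dependent PageRank has the smaller SMAPE.
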
- 29. Using Granger causality to study link relationships on Wikipedia. Does Earthquake cause Richter Mag.? Of course! We build this into the model. But the question is, which of these are preserved after incorporating the effects of page view data? [Figure: time-series thumbnails for Wikipedia pages.]
- 30. To the power grid … Line failures in the grid can be anticipated via linearized DC dynamics (Hines et al.?): $c = \operatorname{diag}(B L^{+} B^T)$.
- 31. The PageRank problem & the Laplacian. 1. $(I - \alpha A D^{-1}) x = (1 - \alpha) v$; 2. $(I - \alpha \mathcal{A}) y = (1 - \alpha) D^{-1/2} v$, where $\mathcal{A} = D^{-1/2} A D^{-1/2}$ and $x = D^{1/2} y$; and 3. $[\beta D + L] z = \beta v$, where $\alpha = 1/(1 + \beta)$, $x = D z$, and $L$ is the combinatorial Laplacian. Let $x(\alpha)$ solve PageRank and let $v^T e = 0$. Then $\lim_{\alpha \to 1} x(\alpha) \to S L^{+} v$, where $S$ is a scaling matrix.
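
The relation between form 1 and form 3 can be checked on a small graph. The sketch below assumes a hypothetical four-node undirected graph, $\alpha = 0.85$, uniform $v$, and $L = D - A$; both systems are solved by simple iterations (the PageRank iteration for form 1, a Jacobi iteration for form 3), and the helper names are made up.

```python
ALPHA = 0.85                    # assumed value
BETA = (1.0 - ALPHA) / ALPHA    # chosen so that alpha = 1 / (1 + beta)

# Hypothetical undirected graph: a triangle 1-2-3 with a pendant node 4.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
N = len(A)
DEG = [sum(row) for row in A]   # diagonal of D
V = [1.0 / N] * N               # uniform right-hand side (an assumption)


def pagerank_form(iters=500):
    """Form 1: iterate x <- alpha A D^{-1} x + (1 - alpha) v."""
    x = list(V)
    for _ in range(iters):
        x = [ALPHA * sum(A[i][j] * x[j] / DEG[j] for j in range(N))
             + (1 - ALPHA) * V[i] for i in range(N)]
    return x


def laplacian_form(iters=500):
    """Form 3: Jacobi iteration for (beta D + L) z = beta v, with L = D - A."""
    z = [0.0] * N
    for _ in range(iters):
        z = [(BETA * V[i] + sum(A[i][j] * z[j] for j in range(N)))
             / ((1 + BETA) * DEG[i]) for i in range(N)]
    return z


x = pagerank_form()
z = laplacian_form()
print(max(abs(x[i] - DEG[i] * z[i]) for i in range(N)))  # gap between x and D z
```

Substituting $x = Dz$ into form 1 gives $(D - \alpha A) z = (1 - \alpha) v$, and writing $D - \alpha A = (1 - \alpha) D + \alpha L$ and dividing by $\alpha$ yields form 3.
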
- 32. Some potential applications. 1. PageRank can be thought of as a type of regularization; it often helps improve on simple centrality baselines. 2. Limits of PageRank interpolate between centrality and spectral clustering [Mahoney, Orecchia, and Vishnoi]. 3. Time-dependent teleportation models; adaptations to node dropouts are possible. 4. Use PageRank on the line graph?
- 33. Results on the power grid … pending …
- 34. Questions, Conclusions, and References. Questions: How to validate some of these ideas? Too simplistic? Other power-grid problems where similar ideas may be able to help? Collaborators? Conclusions: Centrality is more complicated than just one method. It’s possible to tune centrality measures to different structures, and this makes it a flexible setup. (Dear David, please remember to repeat the question!) Paper: Gleich & Rossi, Internet Mathematics, 2014. Code: https://www.cs.purdue.edu/homes/dgleich/codes/dynsyspr-im
