Dynamic PageRank using Evolving Teleportation
WAW12

Published in: Technology, Education

Transcript

  • 1. Dynamic PageRank using Evolving Teleportation. Ryan A. Rossi, David F. Gleich. [Figure: the nodes Tunisia, Egypt, and Libya shown at successive points in time]
  • 2. Problem: Importance of nodes is NOT static (as static PageRank assumes). It is evolving in reality! Ryan Rossi (Purdue), Dynamic PageRank
  • 3. Problem: Importance of nodes is NOT static. It is evolving in reality! Formulate PageRank as a dynamical system! This gives a dynamic generalization of PageRank and helps in prediction. [Figure: importance of 100 nodes changing over time]
  • 4. Dynamic ranks are useful for: detecting dynamic anomalies (a spike when the Australian earthquake occurs); prediction (observed values versus future values); modeling human dynamics (does Earthquake cause Richter Mag.?); clustering nodes with similar time-series patterns (for example, a cluster of TV-show and “American Idol” pages such as TimAllen, KrisAllen, and AsherRoth).
  • 5. Static PageRank Model. At a node, a random surfer can: (1) follow edges uniformly with probability α, and (2) randomly jump with probability 1 − α (for now, assume v_i = 1/n). The nodes that are visited most often are important! This induces a Markov chain model (random walk), or equivalently a linear system. [Equation not captured in the transcript] [Figure: example graph with nodes 1 to 5]
  • 6. Static PageRank Model. At a node, a random surfer can: (1) follow edges uniformly with probability α, and (2) randomly jump with probability 1 − α (for now, assume v_i = 1/n). The nodes that are visited most often are important! This induces a Markov chain model (random walk), or equivalently a linear system. [Equation not captured in the transcript] Too simplistic! Graph & attributes evolve! Importance continuously changes!
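The random-surfer model on these slides is usually written as the linear system (I − αP)x = (1 − α)v, where P is the column-stochastic transition matrix of the graph; that is presumably what the missing slide equation showed. A minimal sketch in pure Python follows. The 5-node graph, α = 0.85, and all function names are illustrative assumptions, and the sketch assumes every node has at least one out-link.

```python
# Static PageRank sketch in pure Python (no third-party packages).
# The 5-node graph, alpha = 0.85, and all names are illustrative assumptions.
# Assumes every node has at least one out-link (no dangling nodes).

def apply_P(out_links, x):
    """Return P x, where column j of P spreads x[j] evenly over j's out-links."""
    y = [0.0] * len(x)
    for j, targets in enumerate(out_links):
        for i in targets:
            y[i] += x[j] / len(targets)
    return y

def pagerank(out_links, alpha=0.85, v=None, tol=1e-12, max_iter=10_000):
    """Iterate x <- alpha * P x + (1 - alpha) * v until the update is tiny."""
    n = len(out_links)
    v = v if v is not None else [1.0 / n] * n
    x = list(v)
    for _ in range(max_iter):
        Px = apply_P(out_links, x)
        new_x = [alpha * pxi + (1 - alpha) * vi for pxi, vi in zip(Px, v)]
        if sum(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x

def residual(out_links, x, alpha, v):
    """Largest entry of |(I - alpha P) x - (1 - alpha) v|."""
    Px = apply_P(out_links, x)
    return max(abs(xi - alpha * pxi - (1 - alpha) * vi)
               for xi, pxi, vi in zip(x, Px, v))

# out_links[j] lists the nodes that node j links to (0-based labels).
out_links = [[1, 2], [2], [0], [2], [3, 0]]
v = [0.2] * 5
x = pagerank(out_links, alpha=0.85, v=v)
res = residual(out_links, x, 0.85, v)
```

Here `x` is the stationary visit frequency of the random surfer, and `res` checks that it also solves the linear system.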
  • 7. The majority of work focuses on static networks! Combining PageRank with the crawling process: S. Abiteboul, M. Preda, & G. Cobena, Adaptive on-line page importance computation. Walks on dynamic graphs: P. Grindrod, D. Higham, M. Parsons, & E. Estrada, Communicability Across Evolving Networks. Other work: J. O’Madadhain & P. Smyth, EventRank: A framework for ranking time-varying networks.
  • 8. None of these techniques is placed in the context of a dynamical system. We want to gain additional flexibility by recasting these problems as continuous dynamical systems.
  • 9. Evolving teleportation (e.g. pageviews). Importance continuously changes as the external influence evolves! [Figure: six nodes with pageviews 96, 105, 281, 42, 11, 27 at the first time step, with Dynamic PageRank evolving over time]
  • 10. Evolving teleportation (e.g. pageviews). Importance continuously changes as the external influence evolves! [Figure: the same six nodes with pageviews 113, 139, 397, 64, 16, 21 at the second time step]
  • 11. Importance continuously changes as the external influence evolves! [Figure: the same six nodes with pageviews 103, 125, 331, 53, 12, 39 at the third time step]
  • 12. Dynamical system with dynamic teleportation: how changes in PageRank values evolve. [Equation not captured in the transcript]
  • 13. Dynamic Teleportation Model. A generalization of static PageRank: if v(t) = v stops changing, then we recover the original PageRank vector x as the steady-state solution. [Equation not captured in the transcript]
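The slide's equations did not survive extraction. One way to write a model with the stated property, assuming the standard PageRank system (I − αP)x = (1 − α)v with column-stochastic P, is to let the teleportation vector depend on time and let x(t) relax toward the corresponding solution. The notation below is a reconstruction under that assumption, not a transcription of the slide:

```latex
\begin{align*}
  x'(t) &= (1-\alpha)\, v(t) - (I - \alpha P)\, x(t)
  && \text{(dynamic teleportation model)} \\
  v(t) = v,\; x'(t) = 0
  \;&\Longrightarrow\; (I - \alpha P)\, x = (1-\alpha)\, v
  && \text{(steady state = static PageRank)}
\end{align*}
```

Because every eigenvalue of αP has modulus at most α < 1, the matrix −(I − αP) has eigenvalues with negative real part, so for constant v the solution converges to this steady state from any starting point.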
  • 14. A principled dynamical system framework for studying these problems; flexibility to choose our algorithm to solve it; determines the effective length scale; seamlessly generalizes PageRank for dynamics; we can easily and naturally incorporate the complete set of dynamic components.
  • 15. To evolve the dynamical system, select any standard method! Classical methods: forward Euler, the family of Runge-Kutta methods (RK2, …, RK4, …). Adaptive methods. Many others!
  • 16. Evolve the dynamical system with forward Euler. [Equation not captured in the transcript]
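The forward Euler update itself was lost in extraction. Assuming the dynamical system has the form x'(t) = (1 − α)v(t) − (I − αP)x(t) (a reconstruction, not a transcription), one Euler step of size h is x(t + h) = x(t) + h[(1 − α)v(t) − (I − αP)x(t)]. The sketch below uses a hypothetical 5-node graph and hypothetical function names; it also checks that a step with h = 1 coincides with one power-method step.

```python
# Forward Euler sketch for x'(t) = (1 - alpha) v(t) - (I - alpha P) x(t).
# The ODE form, the graph, and all names are assumptions for illustration.

def apply_P(out_links, x):
    """Return P x, where column j of P spreads x[j] evenly over j's out-links."""
    y = [0.0] * len(x)
    for j, targets in enumerate(out_links):
        for i in targets:
            y[i] += x[j] / len(targets)
    return y

def euler_step(out_links, x, v, alpha=0.85, h=1.0):
    """One forward Euler step of size h."""
    Px = apply_P(out_links, x)
    return [xi + h * ((1 - alpha) * vi - (xi - alpha * pxi))
            for xi, vi, pxi in zip(x, v, Px)]

out_links = [[1, 2], [2], [0], [2], [3, 0]]  # hypothetical graph, no dangling nodes
alpha = 0.85
v = [0.2] * 5

# With constant v, repeated small steps settle at the static PageRank vector.
x = list(v)
for _ in range(400):
    x = euler_step(out_links, x, v, alpha, h=0.5)
Px = apply_P(out_links, x)
steady_residual = max(abs(xi - alpha * pxi - (1 - alpha) * vi)
                      for xi, pxi, vi in zip(x, Px, v))

# With h = 1 the step reduces to x <- alpha * P x + (1 - alpha) * v.
x0 = [0.5, 0.1, 0.1, 0.2, 0.1]
euler_h1 = euler_step(out_links, x0, v, alpha, h=1.0)
power_step = [alpha * pxi + (1 - alpha) * vi
              for pxi, vi in zip(apply_P(out_links, x0), v)]
```

Higher-order methods such as RK4 would replace only `euler_step`; the rest of the loop stays the same.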
  • 17. How we map updates to v into the dynamical-system time determines the effective length-scale that we are looking at. What is the relationship between the time-scale of the dynamical system and the time-scale of the application? Does x(1) correspond to 1 sec, 1 min, ...?
  • 18. How we map updates to v into the dynamical-system time determines the effective length-scale that we are looking at. What is the relationship between the time-scale of the dynamical system and the time-scale of the application? Does x(1) correspond to 1 sec, 1 min, ...? Example 1: h = 1, t = 1 (1 min), time-scale = 1 (application): 60 iterations between each hour, which is equivalent to running the power method until convergence each hour! Example 2: h = 1, t = 1 (20 min), time-scale = 1 (application): 3 iterations after each hourly change.
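The effect of the time-scale mapping can be sketched numerically. The code below takes the three pageview snapshots shown on the earlier evolving-teleportation slides, normalizes each to a teleportation vector (an assumption about how v(t) is built), and applies a fixed number of h = 1 steps after each hourly change. The 6-node graph and all names are hypothetical.

```python
# Sketch: how many h = 1 steps are taken per hourly update of v(t).
# The graph, the normalization v = p / sum(p), and all names are assumptions.

def apply_P(out_links, x):
    """Return P x, where column j of P spreads x[j] evenly over j's out-links."""
    y = [0.0] * len(x)
    for j, targets in enumerate(out_links):
        for i in targets:
            y[i] += x[j] / len(targets)
    return y

def evolve(out_links, pageviews, steps_per_update, alpha=0.85):
    """Return x at the end of each hour, carrying the state across hours."""
    n = len(out_links)
    x = [1.0 / n] * n
    history = []
    for p in pageviews:
        total = sum(p)
        v = [pi / total for pi in p]          # teleportation for this hour
        for _ in range(steps_per_update):     # h = 1 step = power-method step
            Px = apply_P(out_links, x)
            x = [alpha * pxi + (1 - alpha) * vi for pxi, vi in zip(Px, v)]
        history.append(x)
    return history

out_links = [[1, 2], [2], [0, 3], [4], [5], [0]]   # hypothetical 6-node graph
pageviews = [
    [96, 105, 281, 42, 11, 27],    # hour 1 (values from the slides)
    [113, 139, 397, 64, 16, 21],   # hour 2
    [103, 125, 331, 53, 12, 39],   # hour 3
]

hist3 = evolve(out_links, pageviews, steps_per_update=3)
hist60 = evolve(out_links, pageviews, steps_per_update=60)
converged = evolve(out_links, pageviews, steps_per_update=2000)

def l1(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

err3 = l1(hist3[-1], converged[-1])
err60 = l1(hist60[-1], converged[-1])
```

With 60 steps per hour the state is essentially the static PageRank vector for that hour's v, so the history between hours is forgotten; with 3 steps the state still carries information from earlier hours.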
  • 19. v(t) changes at fixed intervals. A better idea might be to smooth out these “jumps”! Feature of the new model! Utilize this information from the evolution. Setting: h = 1, t = 1 (12 min), time-scale = 1 hour (application), i.e. 5 iterations after each hourly change. [Figure: convergence measure (0 to 0.2) versus iteration (0 to 50)]
  • 20. Ranking from time-series. Transient: instantaneous values of x(t). Summary & cumulative: any summary function s(⋅) of the time-series, such as integral, min, max, variance. Difference rank. Among many others...
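These ranking functions are simple to compute once each node has a score time-series. The sketch below uses made-up scores for three pages; the definition of the difference rank as max minus min over the series is an assumption, since the slide does not define it, and the integral is approximated by a plain sum with unit time steps.

```python
# Sketch of per-node summaries of a dynamic score time-series.
# Scores are made up; "difference" = max - min is an assumed definition.
from statistics import pvariance

def summaries(ts):
    """Summary functions s(.) of one node's time-series."""
    return {
        "cumulative": sum(ts),          # integral with unit time steps
        "min": min(ts),
        "max": max(ts),
        "variance": pvariance(ts),
        "difference": max(ts) - min(ts),
    }

def transient(ts, t):
    """Instantaneous value at time index t."""
    return ts[t]

scores = {
    "Earthquake": [0.10, 0.12, 0.30, 0.40],
    "Gold":       [0.20, 0.21, 0.20, 0.20],
    "Rome":       [0.15, 0.15, 0.16, 0.14],
}

table = {page: summaries(ts) for page, ts in scores.items()}
ranked_by_difference = sorted(table, key=lambda p: table[p]["difference"],
                              reverse=True)
ranked_by_cumulative = sorted(table, key=lambda p: table[p]["cumulative"],
                              reverse=True)
```

Different summaries give different rankings: a steadily popular page scores well on the cumulative measure, while a page whose importance jumps scores well on the difference measure.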
  • 21. Datasets. Wikipedia: hyperlink graph with hourly pageviews. Twitter: who-follows-whom graph with monthly tweet rates.
       Dataset   | Nodes     | Edges      | t_max | Period | Average p_i | Max p_i
       Wikipedia | 4,143,840 | 72,718,664 | 20    | hours  | 1.3225      | 334,650
       Twitter   | 465,022   | 835,424    | 6     | months | 0.5569      | 1056
  • 22. Nope, pageviews and degree are uncorrelated (correlation = 0.02)! Some pages have high degree but low pageviews; others have high pageviews but low degree. [Figure: scatter plot of in-degree (log) versus total pageviews (log)]
  • 23. Main Finding: Combining the external influence with the graph produces something new that is not captured by the other methods.
  • 24. Learn the model as an exponential moving average. [Equation not captured in the transcript] Predict p(t+1). [Equation not captured in the transcript] Evaluate the models by total error. [Equation not captured in the transcript]
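Since the three formulas on this slide were not captured, the following is only a generic sketch of forecasting with an exponential moving average, not the authors' exact model. It uses simple exponential smoothing with a hypothetical smoothing weight `beta`, predicts p(t+1) as the current smoothed value, and scores the forecasts by total absolute error (an assumed error measure). The example series is the hourly pageview sequence 132, 172, 764, 3406 from the backup slides, read here as belonging to the Earthquake page.

```python
# Generic exponential-moving-average forecasting sketch.
# beta, the error measure, and the function names are assumptions.

def ema_forecasts(series, beta=0.5):
    """forecasts[k] is the prediction of series[k + 1] made from series[:k + 1]."""
    smoothed = series[0]
    forecasts = []
    for p in series[:-1]:
        smoothed = beta * p + (1 - beta) * smoothed
        forecasts.append(smoothed)
    return forecasts

def total_abs_error(series, forecasts):
    """Sum of |actual - predicted| over all forecast targets."""
    return sum(abs(actual - pred)
               for actual, pred in zip(series[1:], forecasts))

pageviews = [132, 172, 764, 3406]
forecasts = ema_forecasts(pageviews, beta=0.5)
error = total_abs_error(pageviews, forecasts)
```

A model driven by pageviews alone reacts to a spike only after it happens, which is why the error is dominated by the last hour. The comparison on the following slides adds the Dynamic PageRank time-series as a second input.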
  • 25. Base Model: only pageviews (or tweet-rates). Dynamic PageRank: pageviews and Dynamic PageRank time-series.
       Dataset   | Forecasting    | Dynamic PageRank | Base Model
       Wikipedia | Non-stationary | 0.4349           | 0.5028
       Wikipedia | Stationary     | 0.3672           | 0.4373
       Twitter   | Non-stationary | 0.4852           | 1.2333
       Twitter   | Stationary     | 0.6690           | 0.9180
    Main Finding. The Dynamic PageRank time-series provides valuable information for forecasting future pageviews (or tweet-rates).
  • 26. Many applications, such as actively adapting caches in large DB systems and dynamically recommending pages. (The forecasting comparison from slide 25 is repeated on this slide.)
  • 27. Top 100 pages that fluctuate the most! Dynamic PageRank identifies interesting pages that pertain to recent external interest.
  • 28. Top 100 pages that fluctuate the most! Pages related to a recent Australian earthquake!
  • 29. Top 100 pages that fluctuate the most! Just-released movie “Watchmen”.
  • 30. Top 100 pages that fluctuate the most! Famous co-host/musician who died.
  • 31. Top 100 pages that fluctuate the most! Recent “American Idol” gossip.
  • 32. Top 100 pages that fluctuate the most! A remembrance of Eve Carson from a contestant on “American Idol”. Recent “American Idol” gossip.
  • 33. Top 100 pages that fluctuate the most! Main Finding. These examples reveal the ability of our Dynamic PageRank to mesh the network structure with changes in external interest!
  • 34. Clustering PageRank trends; Granger causality; better algorithms (RK4, …); put more theoretical teeth behind these results.
  • 35. Centroids of Temporal Patterns 1 to 5: well-separated and unique! Most nodes are stationary! [Figure: normalized Dynamic PageRank (0 to 0.25) versus time (0 to 20) for the five cluster centroids]
  • 36. Non-stationary nodes (and clusters) are potential anomalies: large-scale disasters, breaking news. Most nodes are stationary! [Figure: normalized Dynamic PageRank (0 to 0.25) versus time (0 to 20) for the centroids of Temporal Patterns 1 to 5]
  • 37. Allows us to identify nodes that become important around similar times (nodes with similar trends of importance may be related). [Figure: two example clusters listed by truncated page name. One includes TimAllen, KrisAllen, AsherRoth, SaraPaxton, BobbyBrown, Sting, and Octopussy; the other includes Chile, WorldWarII, Iraq, Brazil, Caribbean, Judaism, Rome, Gold, God, Wiktionary, and Mammal. Plot: normalized Dynamic PageRank versus time for the centroids of Temporal Patterns 1 to 5]
  • 38. Question: Does an earthquake at time t cause people to visit the Richter magnitude page at t+1? (Earthquake → Richter Mag.: causes?) Statement on Granger causality (stronger version): (1) the cause must occur before the effect; (2) the cause contains information about the effect; (3) cause and effect must be linked in the graph.
  • 39. Multivariate regression with a lag, relating the vector of response variables, the regression coefficients to estimate, and the vector of errors. [Equation not captured in the transcript] Granger causality exists if the error when using the time-series x in the forecast model is smaller than the error without considering x. [Inequality not captured in the transcript] The significance of the difference in error is measured using the F-test.
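The test described on this slide can be sketched for the simplest case of one lag: fit y(t) from its own past, fit it again with the past of x added, and compare the residual sums of squares with an F statistic. This is a generic textbook construction on synthetic data, not the authors' exact regression; all names are hypothetical. Turning F into a p-value needs the F distribution, for example `scipy.stats.f.sf(F, q, n - k)`, which is left out to keep the sketch dependency-free.

```python
# Granger-style F statistic with ordinary least squares, pure Python.
# Synthetic data and all function names are illustrative assumptions.
import random

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rss(rows, y):
    """Residual sum of squares of the least-squares fit of y on rows."""
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(b * ri for b, ri in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def granger_f(x, y, lag=1):
    """F statistic for 'past x improves the forecast of y beyond past y'."""
    times = range(lag, len(y))
    target = [y[t] for t in times]
    restricted = [[1.0] + [y[t - l] for l in range(1, lag + 1)] for t in times]
    full = [r + [x[t - l] for l in range(1, lag + 1)]
            for r, t in zip(restricted, times)]
    rss_r, rss_u = rss(restricted, target), rss(full, target)
    n, k = len(target), 2 * lag + 1
    return ((rss_r - rss_u) / lag) / (rss_u / (n - k))

# Synthetic example: y follows x with a one-step delay plus small noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.1) for t in range(1, 200)]

f_xy = granger_f(x, y)   # large: past x helps predict y
f_yx = granger_f(y, x)   # small: past y does not help predict x
```

The slide's third condition, that cause and effect must be linked in the graph, would be applied as a filter on which (x, y) pairs are tested at all.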
  • 40. Earthquake → Richter Mag.: p-value 0.000406***. Significant! Other pages caused by Earthquake (in Australia):
       Page                     | p-value
       Earthquake preparedness  | 0.000607***
       Aftershock               | 0.009619**
       Asperity                 | 0.001601**
       Stick-slip phenomenon    | 0.002312**
       Landslide dam            | 0.004820**
    Legend: p-value < 0.05 (*), 0.01 (**), 0.001 (***).
  • 41. Main Finding. Allows us to identify the pages that influence the others with regard to how users find information. (The p-value table from slide 40 is repeated on this slide.)
  • 42. Introduced a dynamical system framework for PageRank; stated a dynamic generalization of PageRank; Dynamic PageRank can help in prediction; useful for many other applications.
  • 43. Thanks! Questions? rrossi@purdue.edu http://www.cs.purdue.edu/homes/rrossi
  • 45. Hourly pageviews. [Figure: linked pages Earthquake Preparedness, Earthquake, Richter Mag., and Charles Richter; over the first two hours Earthquake has 132 and 172 pageviews, Richter Mag. has 35 and 31]
  • 46. Spike in the number of pageviews for that given hour! [Figure: Earthquake 132, 172, 764; Richter Mag. 35, 31, 56]
  • 47. ΔPR: importance substantially increases! Spike in the number of pageviews for that given hour! [Figure: Earthquake 132, 172, 764; Richter Mag. 35, 31, 56]
  • 48. ΔPR: importance substantially increases! After a few iterations, importance diffuses from Earthquake to Richter Mag.! This is a direct result of meshing the graph with pageviews! [Figure: Earthquake 132, 172, 764; Richter Mag. 35, 31, 56]
  • 49. ΔPR: importance substantially increases! After a few iterations, importance diffuses from Earthquake to Richter Mag.! This is a direct result of meshing the graph with pageviews! Hence, Richter magnitude receives a high dynamic PageRank score, becoming increasingly important at this time, while its pageviews are not significantly increasing. [Figure: Earthquake 132, 172, 764; Richter Mag. 35, 31, 56]
  • 50. In the next hour, we find that the pageviews of Richter spike! Reinforcing the importance! [Figure: Earthquake 132, 172, 764, 3406; Richter Mag. 35, 31, 56, 1447]
  • 51. Dynamic PageRank is predictive (by definition)! The importance of Richter magnitude was captured by dynamic PageRank an hour earlier than when it actually became important (spike in pageviews). In the next hour, we find that the pageviews of Richter spike! Reinforcing the importance! [Figure: Earthquake 132, 172, 764, 3406; Richter Mag. 35, 31, 56, 1447]
  • 52. Real-world networks are naturally dynamic: information networks (e.g., Wikipedia: article-links-article), social networks (e.g., Twitter: who-follows-whom), biological networks, … ⇒ Importance changes! Static methods fail to capture the temporal flow of information and lead to misleading or simply incorrect conclusions.
  • 53. Dynamic networks. Components: Graph. [Figure: network snapshots over time]
  • 54. Dynamic networks. Components: Graph; Attributes; ✓ External Influence (e.g., pageviews). [Figure: network snapshots over time with per-node pageview counts]