Two numerical graph algorithms


  1. Two matrix computations for numerical graph problems: PageRank and Network Alignment. David F. Gleich, Sandia National Labs, Livermore, CA. IBM Almaden Seminar, San Jose, CA, January 17th, 2011. In collaboration with Andrew Gray (UBC), Chen Greif (UBC), Tracy Lau (UBC/IBM?), Mohsen Bayati (Stanford), Ying Wang (Stanford), Margot Gerritsen (Stanford), Amin Saberi (Stanford). Supported by the Library of Congress and a Microsoft Live Labs Fellowship.
  2. Sketch of talk. Two algorithms: inner-outer and belief propagation. Two problems: PageRank and network alignment. Big graphs for both; iterative matrix computations for both; multi-core parallel results for inner-outer only. Standard flow: problem → algorithm → theory (hopefully) → empirical results, except "fun" results first. Some open questions at the end.
  3. A PageRank algorithm. Instead of the power method,
         x^(k+1) = αPx^(k) + (1 − α)v,
     use an outer iteration
         (I − βP)x^(k+1) = (α − β)Px^(k) + (1 − α)v ≡ f,
     with the inner iteration
         y^(j+1) = βPy^(j) + f.
     It's faster!
         Web Data (α = 0.99): Nodes 105,896,555; Edges 3,738,733,648.
             Power Method: 964 its, 5.15 hrs.  Inner-Outer: 857 its, 4.45 hrs.
         Network-Alignment Data (α = 0.95): Nodes 4,219,893,141; Edges 91,886,357,440.
             Power Method: 271 its, 54.6 hrs.  Inner-Outer: 188 its, 36.2 hrs.
     Codes and data available.
     Note: the web data is uk-2006 from UNIMI's (Univ. Milano) DSI group.
  4. Network Alignment. A is about 200,000 vertices; B is about 300,000 vertices; L has around 5,000,000 edges. A 5-million-variable integer QP, reaching ∼90% of optimality in minutes. Codes and data available. DEMO.
     [Figure: a "square" formed by an edge in A, an edge in B, and two matching edges r, s in L connecting their endpoints t.]
  5. [Outline: PageRank · PageRank Algorithms · Inner-outer Performance · Network Alignment Motivation · Network Alignment · Network Alignment Algorithms · Results · Conclusion. Current section: PageRank.]
  6. PageRank is a ... modified Markov chain, ... damped random walk on a graph, ... pinball game on the reverse web, or ... random surfer model. Proposed by Brin and Page in 1998, but similar ideas appear earlier (Sebastiano Vigna is working on tracing the history; the current history dates to 1949). Langville and Meyer (2006) is a good general reference; Berkhin (2005) has lots of goodies; and Des Higham called it pinball.
  7. The PageRank Random Surfer. Important pages ↔ highly probable to visit. The surfer
     1. follows out-edges uniformly with probability α, and
     2. randomly jumps according to v with probability 1 − α; we'll assume v = (1/n)e.
     This induces a Markov chain model,
         [αP + (1 − α)veᵀ] x(α) = x(α),
     or the linear system
         (I − αP) x(α) = (1 − α)v.
     [Figure: a 6-node example graph with its column-stochastic transition matrix P.]
     But it's just a model.
     Note: I'm omitting important details about dangling nodes; I'll mention them a bit later.
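To make the two formulations concrete, here is a minimal numpy sketch that solves the PageRank linear system on a toy graph. The 4-node adjacency matrix is a stand-in example, not the 6-node graph on the slide.

```python
# Solve (I - alpha P) x = (1 - alpha) v on a toy graph with no dangling nodes.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)        # A[i, j] = 1 for an edge i -> j

P = (A / A.sum(axis=1, keepdims=True)).T          # column-stochastic transition matrix
n = P.shape[0]
alpha = 0.85
v = np.full(n, 1.0 / n)                           # uniform teleportation, v = (1/n) e

x = np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * v)
print(x, x.sum())                                 # x is the PageRank vector; e^T x = 1
```

Because P is stochastic here, the solution automatically sums to one, matching the eigenvector formulation [αP + (1 − α)veᵀ]x = x.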
  8. What is α?
         Author                   α
         Brin and Page (1998)     0.85
         Najork et al. (2007)     0.85
         Litvak et al. (2006)     0.5
         Katz (1953)              0.5
         Experiment (2009)        0.63 ≈ 0.85 · 0.5
         Algorithms (...)         ≥ 0.85
     Our regime: α ≥ 0.85 (otherwise the power method is fast), and P available only for mat-vec (otherwise custom techniques are possible).
     [Figure: density of α estimated from browser data, fit by InfBeta(3.2, 2.0, 1.9e−05, 0.0019).]
     Constantine, Flaxman, Gleich, Gunawardana, Tracking the Random Surfer, WWW2010; Constantine and Gleich, Random Alpha PageRank, Internet Math.
  9. [Section divider: PageRank Algorithms.]
  10. PageRank formulations and theory. A graph or web graph gives a substochastic matrix, which (via different dangling-node treatments) yields strongly preferential PageRank, weakly preferential PageRank, sink preferential PageRank, PseudoRank, or other transformations; these connect to eigensystems and linear systems.
          v   teleportation vector
          P̄   substochastic matrix (for algorithms)
          d   dangling node vector (d = e − P̄ᵀe)
          P̄ + vdᵀ → P   strongly preferential PageRank
          P̄ + udᵀ → P   weakly preferential PageRank (u need not equal v)
          P   PageRank stochastic matrix (for theory)
          (I − αP)x = (1 − α)v   PageRank linear system
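The strongly preferential correction never needs to be formed densely: since P = P̄ + vdᵀ, a mat-vec is Px = P̄x + v(dᵀx). A minimal sketch, assuming a column-oriented sparse P̄; the 3-node Pbar below is a hypothetical toy with one dangling node.

```python
# Strongly preferential mat-vec: P x = Pbar x + v (d^T x), never forming v d^T.
import numpy as np
import scipy.sparse as sp

def strongly_preferential_matvec(Pbar, v, x):
    """Pbar: column-substochastic sparse matrix; v: teleportation vector."""
    d = 1.0 - np.asarray(Pbar.sum(axis=0)).ravel()   # dangling indicator, d = e - Pbar^T e
    return Pbar @ x + v * (d @ x)

# Hypothetical toy: node 2 is dangling (its column of Pbar is empty).
Pbar = sp.csc_matrix(np.array([[0.0, 0.5, 0.0],
                               [0.5, 0.0, 0.0],
                               [0.5, 0.5, 0.0]]))
v = np.full(3, 1.0 / 3)
print(strongly_preferential_matvec(Pbar, v, v))      # one corrected mat-vec
```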
  11. Motivation. Why another PageRank algorithm? An ideal algorithm is (methods run from fancy to simple):
          1. reliable
          2. fast over a range of α's       → use Matlab's "\"   (fancy)
          3. efficient for big problems     → use Gauss-Seidel or a custom Richardson method
          4. uses only mat-vec products     → use the inner-outer iteration
          5. uses only 2 vectors of memory  → use the power method   (simple)
  12. Simple algorithms.
      The power method: for Ax = λx, the iteration
          x^(k+1) = Ax^(k) / ‖Ax^(k)‖
      computes the largest eigenpair. The PageRank Markov chain eigenvector problem is
          [αP + (1 − α)veᵀ]x = x.
      If eᵀx^(0) = 1 and x^(j) ≥ 0, then eᵀx^(k) = 1 and the power method is
          x^(k+1) = αPx^(k) + (1 − α)v.
      The Richardson method: for Ax = b, the iteration
          x^(k+1) = x^(k) + ω(b − Ax^(k))      (the correction is the residual)
      computes x. The PageRank linear system is
          (I − αP)x = (1 − α)v.
      For ω = 1,
          x^(k+1) = αPx^(k) + (1 − α)v,
      so the Richardson iteration is the power method.
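A quick numeric check of that equivalence: the normalized power method and the ω = 1 Richardson iteration share one and the same update. A minimal sketch on a random toy graph (the graph itself is arbitrary):

```python
# For PageRank, one power-method step on [aP + (1-a)ve^T] equals one
# Richardson step (omega = 1) on (I - aP)x = (1-a)v.
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 5, 0.85
A = (rng.random((n, n)) < 0.5).astype(float)
A[A.sum(axis=1) == 0, :] = 1.0                    # patch any dangling row in the toy
P = (A / A.sum(axis=1, keepdims=True)).T          # column-stochastic
v = np.full(n, 1.0 / n)

x = v.copy()
for _ in range(100):
    x = alpha * (P @ x) + (1 - alpha) * v         # the shared iteration
residual = np.linalg.norm((1 - alpha) * v - (x - alpha * (P @ x)), 1)
print(x, residual)                                # residual shrinks like alpha^k
```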
  13. Inner-Outer. Note: PageRank is easier when α is smaller. Thus, solve PageRank with itself, using β < α!
      Outer:  (I − βP)x^(k+1) = (α − β)Px^(k) + (1 − α)v ≡ f^(k)
      Inner:  y^(0) = x^(k);  y^(j+1) = βPy^(j) + f^(k)
      A new parameter? What is β? Use 0.5. How many inner iterations? Until a residual of 10⁻².
      Gleich, Gray, Greif, Lau, SISC 2010.
  14. Inner-Outer algorithm.
      Input: P, v, α, τ, (β = 0.5, η = 10⁻²).  Output: x.
       1: x ← v
       2: y ← Px
       3: while ‖αy + (1 − α)v − x‖₁ ≥ τ
       4:   f ← (α − β)y + (1 − α)v
       5:   repeat
       6:     x ← f + βy
       7:     y ← Px
       8:   until ‖f + βy − x‖₁ < η
       9: end while
      10: x ← αy + (1 − α)v
      Uses only three vectors of memory. Convergence? Yes, if 0 ≤ β ≤ α with "exact" inner iteration, but also (small theorem) with any η! Parameters? β = 0.5, η = 10⁻² is often faster than the power method (or just a titch slower).
      Note: the inner loop checks its condition after doing one iteration, so an inexact iteration is always at least as good as one step of the power method.
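A direct transcription of the pseudocode above into Python. P_matvec and inner_outer_pagerank are names introduced here for illustration; the routine assumes dangling nodes are already folded in, so each mat-vec preserves the 1-norm.

```python
# Inner-outer PageRank, following lines 1-10 of the pseudocode above.
import numpy as np

def inner_outer_pagerank(P_matvec, v, alpha, tol=1e-7, beta=0.5, eta=1e-2):
    x = v.copy()                                   # 1: x <- v
    y = P_matvec(x)                                # 2: y <- Px
    while np.linalg.norm(alpha * y + (1 - alpha) * v - x, 1) >= tol:   # 3
        f = (alpha - beta) * y + (1 - alpha) * v   # 4
        while True:                                # 5: repeat (checked after one pass)
            x = f + beta * y                       # 6
            y = P_matvec(x)                        # 7
            if np.linalg.norm(f + beta * y - x, 1) < eta:              # 8
                break
    return alpha * y + (1 - alpha) * v             # 10
```

With a sparse matrix P, P_matvec = lambda z: P @ z is enough; the graph is touched only through mat-vec products, which is the whole point of the method.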
  15. Inner-Outer Parameters. Question: what parameters should we pick?
      [Figure: two plots of multiplications vs. η (for several fixed β) and vs. β (for several fixed η), each compared against the power method; α = 0.99, in-2004 graph (1.3M nodes, 16.9M edges).]
      Just use β = 0.5 and η = 10⁻²!
      Note: many similar plots appear in my thesis.
  16. The Competition. Our requirement: only Px is available!
      • Quadratic extrapolation (Kamvar, Haveliwala, et al.)
      • Aggregation/disaggregation (Langville and Meyer; Stewart)
      • Permutations/strong components (Del Corso, Gulli, and Romani; Langville and Meyer)
      • Krylov methods (Gleich, Zhukov, Berkhin; Del Corso, Gulli, and Romani)
      • Padé-type extrapolation (Brezinski and Redivo-Zaglia)
      • Arnoldi methods (Greif and Golub)
      • Gauss-Seidel (Arasu, Novak, Tomkins, and Tomlin)
  17. [Section divider: Inner-outer Performance.]
  18. Datasets.
      name          size          nonzeros        avg nz/row
      ubc-cs-2006   51,681        673,010         13.0
      ubc-2006      339,147       4,203,811       12.4
      eu-2005       862,664       19,235,140      22.3
      in-2004       1,382,908     16,917,053      12.2
      wb-edu        9,845,725     57,156,537      5.8
      arabic-2005   22,744,080    639,999,458     28.1
      sk-2005       50,636,154    1,949,412,601   38.5
      uk-2007       105,896,555   3,738,733,648   35.3
  19. One example.
      [Figure: residual vs. multiplications for the power method and inner-outer ("inout") on wb-edu at α = 0.85 and α = 0.99; inner-outer reaches each residual level in fewer multiplications. τ = 10⁻⁷, β = 0.5, η = 10⁻²; wb-edu graph (9.8M nodes, 57.2M edges).]
  20. Advantage Inner-Outer (α = 0.99, β = 0.5, η = 10⁻²).
      tol.   graph         work (mults.)             time (secs.)
                           power  in/out  gain       power     in/out    gain
      10⁻³   ubc-cs-2006   226    141     37.6%      1.9       1.2       35.2%
             ubc           242    141     41.7%      13.6      8.3       38.4%
             in-2004       232    129     44.4%      51.1      30.4      40.5%
             eu-2005       149    150     -0.7%      26.9      28.3      -5.3%
             wb-edu        221    130     41.2%      291.2     184.6     36.6%
             arabic-2005   213    139     34.7%      779.2     502.5     35.5%
             sk-2005       156    144     7.7%       1718.2    1595.9    7.1%
             uk-2007       145    125     13.8%      2802.0    2359.3    15.8%
      10⁻⁵   ubc-cs-2006   574    432     24.7%      4.7       3.6       22.9%
             ubc           676    484     28.4%      37.7      27.8      26.2%
             in-2004       657    428     34.9%      144.3     97.5      32.4%
             eu-2005       499    476     4.6%       89.3      87.4      2.1%
             wb-edu        647    417     35.5%      850.6     572.0     32.8%
             arabic-2005   638    466     27.0%      2333.5    1670.0    28.4%
             sk-2005       523    460     12.0%      5729.0    5077.1    11.4%
             uk-2007       531    463     12.8%      10225.8   8661.9    15.3%
      10⁻⁷   ubc-cs-2006   986    815     17.3%      8.0       6.8       15.4%
             ubc           1121   856     23.6%      62.5      49.0      21.6%
             in-2004       1108   795     28.2%      243.1     179.8     26.0%
             eu-2005       896    814     9.2%       159.9     148.6     7.1%
             wb-edu        1096   777     29.1%      1442.9    1059.0    26.6%
             arabic-2005   1083   843     22.2%      3958.8    3012.9    23.9%
             sk-2005       951    828     12.9%      10393.3   9122.9    12.2%
             uk-2007       964    857     11.1%      18559.2   16016.7   13.7%
  21. Parallelization. The parallel Px kernel, per vertex i:
          xi = x[i] / degree(i);
          for (j in edges of i) { atomic(y[j] += xi); }
      [Figure: speedup relative to the best 1-processor run vs. number of processors (1-8), for power and inner-outer at tolerances 10⁻³, 10⁻⁵, 10⁻⁷, against a linear-speedup reference.]
  22. [Section divider: Network Alignment Motivation.]
  23. [Image-only slide.]
  24. [Image-only slide.]
  25. Alignment and overlap: the goal.
      [Figure: matching Wikipedia categories to LCSH subject headings (e.g. Educational psychology, Psychiatric hospitals, Mental health, Health organizations, Health); one alignment is better than another when its matched edges form more "squares". A square is an edge in A, an edge in B, and two matching edges in L connecting their endpoints.]
      Maximize squares/overlap in a 1-1 matching. Find a good mapping to investigate similarity!
  26. [Section divider: Network Alignment.]
  27. Integrating matching and overlap: a QP. Squares produce overlap → a bonus when edges i and j are both matched.
      Variables and data: xi = indicator for edge ei ∈ L; wi = weight of edge ei; S = square-indicator matrix (Sij = 1 when ei and ej form a square).
          maximize  (matched weight) + (number of squares)       ↔       maximize  wᵀx + (1/2)xᵀSx
          subject to the match is a 1-1 matching                          subject to Ax ≤ e, xi ∈ {0, 1}
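A small sketch of the data behind this formulation: build the square-indicator matrix S for a toy instance and evaluate the objective for a feasible 0/1 matching. The two 3-node graphs and the candidate list L are invented here purely for illustration.

```python
# Build S (S_ij = 1 when candidate matches e_i, e_j form a square) and
# evaluate w^T x + (1/2) x^T S x for a toy alignment instance.
import itertools
import numpy as np

edges_A = {(1, 2), (2, 3)}                        # edges of graph A
edges_B = {(1, 2), (2, 3)}                        # edges of graph B
L = [(1, 1), (2, 2), (3, 3), (1, 3)]              # candidate matches (vertex in A, vertex in B)
w = np.array([1.0, 1.0, 1.0, 0.5])                # match weights

def is_edge(E, u, v):
    return (u, v) in E or (v, u) in E

n = len(L)
S = np.zeros((n, n))
for i, j in itertools.combinations(range(n), 2):
    (a1, b1), (a2, b2) = L[i], L[j]
    # e_i and e_j form a square iff (a1,a2) is an edge of A and (b1,b2) an edge of B
    if is_edge(edges_A, a1, a2) and is_edge(edges_B, b1, b2):
        S[i, j] = S[j, i] = 1.0

x = np.array([1.0, 1.0, 1.0, 0.0])                # a feasible 1-1 matching
print(w @ x + 0.5 * x @ S @ x)                    # weight 3.0 + 2 squares -> 5.0
```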
  28. An example with overlap.
      [Figure: a small alignment instance with its ordered edge list, the square-indicator matrix S, the weight vector w, and the matching constraint matrix A written out explicitly.]
  29. Network alignment.
          maximize   αwᵀx + (β/2)xᵀSx
          subject to Ax ≤ e, xi ∈ {0, 1}
      History: QUADRATIC ASSIGNMENT, MAXIMUM COMMON SUBGRAPH, PATTERN RECOGNITION, ONTOLOGY MATCHING, BIOINFORMATICS.
      Sparse problems: sparse L is often ignored (a few exceptions); our paper tackles that case explicitly. We do large problems, too.
      Conte et al., Thirty years of graph matching, 2004; Melnik et al., Similarity flooding, 2004; Blondel et al., SIREV 2004; Singh et al., RECOMB 2007; Klau, BMC Bioinformatics 10:S59, 2009.
  30. [Section divider: Network Alignment Algorithms.]
  31. Algorithms.
      1. LP: convert to LP, relax, solve (skipped)
      2. TIGHTLP: improve the LP (skipped)
      3. ISORANK: use a PageRank heuristic (Singh et al. 2007)
      4. BP: max-product belief propagation for the LP
      5. TIGHTBP: BP for the TIGHTLP (skipped)
      6. MR: sub-gradient descent on TIGHTLP (Klau 2009; skipped)
      Note: not discussed here are an early heuristic (Flannick et al., Genome Research 16:1169-1181, 2006) and an independent BP algorithm (Bradde et al., arXiv:0905.1893, 2009). Singh et al., RECOMB 2007; Klau, 2009.
  32. IsoRank.
          maximize   αwᵀx + (β/2)xᵀSx
          subject to 0 ≤ Ax ≤ e, xi ∈ {0, 1}
      Solve PageRank on S and w!
      1. Normalize S to a stochastic P.
      2. Normalize w to a stochastic v.
      3. Compute power iterations, rounding at each step.
      4. Output the best solution.
      Need to evaluate a range of PageRank α. Designed for a complete bipartite L.
      Singh et al., RECOMB 2007; Ninove, Ph.D. thesis, Louvain, 2008.
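A sketch of the four steps, with one simplification: greedy rounding stands in for a proper bipartite-matching rounding step. The function isorank is a name introduced here, reusing S, w, and L from the earlier toy sketch; nothing below is the authors' code.

```python
# IsoRank sketch: PageRank power iterations on column-normalized S with
# teleportation v ~ w, rounding the iterate to a matching at every step.
import numpy as np

def isorank(S, w, L, alpha=0.95, iters=50):
    P = S / np.maximum(S.sum(axis=0), 1.0)        # 1. normalize S (guard empty columns)
    v = w / w.sum()                               # 2. normalize w
    x, best, best_score = v.copy(), None, -np.inf
    for _ in range(iters):                        # 3. power iterations + rounding
        x = alpha * (P @ x) + (1 - alpha) * v
        used_a, used_b, m = set(), set(), np.zeros_like(x)
        for i in np.argsort(-x):                  # greedy rounding by decreasing x
            a, b = L[i]
            if a not in used_a and b not in used_b:
                used_a.add(a); used_b.add(b); m[i] = 1.0
        score = w @ m + 0.5 * m @ S @ m           # objective of the rounded matching
        if score > best_score:
            best, best_score = m, score
    return best, best_score                       # 4. output the best solution
```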
  33. Inner-outer for this problem? Only on the cores of the two graphs.
      Dataset         Size            Non-zeros
      LCSH-2          59,849          227,464
      WC-3            70,509          403,960
      Product graph   4,219,893,141   91,886,357,440
      α = 0.95, w from text similarity.
      Inner-Outer: 188 mat-vecs, 36.2 hours. Power: 271 mat-vecs, 54.6 hours.
      Caveat: I'm ignoring all the details of actually using this technique.
  34. Belief propagation: our algorithm.
      Summary: construct a probability model where the most likely state is the solution! Locally update information; it's like a generalized dynamic program; a convergent algorithm for max-weight matching. It works. Most likely, it won't converge.
      History: BP is used for computing marginal probabilities and the maximum a posteriori probability; wildly successful at solving satisfiability problems.
      Bayati et al., 2005.
  35. Max-product messages on the factor graph. Variables have state 0 or 1; function nodes compute a product; messages are the belief (local objective) about a node being in a state.
      Variable-to-function:
          M_{x→fⱼ}(x = s) = ∏_{fₖ ∈ N(x)\{fⱼ}} M_{fₖ→x}(x = s)
      Variable x tells function fⱼ what it thinks about being in state s: the product of what all the other functions tell x about being in state s.
      Function-to-variable:
          M_{fⱼ→x}(x = s) = max_{y : yₓ = s} [ fⱼ(y) ∏_{x′ ∈ N(fⱼ)\{x}} M_{x′→fⱼ}(x′ = y_{x′}) ]
      Function fⱼ tells variable x what it thinks about being in state s: locally maximize fⱼ over all possible choices of the other variables. Note yₓ = s always (too cumbersome to include in the notation).
  36. NetAlign factor graph: loopy BP.
      [Figure: the factor graph, with variable nodes xii′ for the edges of L and function nodes f (matching constraints on A's side), g (matching constraints on B's side), and h (square bonuses).]
      Note: it's pretty hairy to put all the stuff I should put here on a single slide. Most of it is in the paper. The rest is just "turning the crank" with standard tricks in BP algorithms.
  37. Get tropical: in the max-plus sense (work in the max-plus semiring, where products of messages become sums).
  38. Belief propagation: a view.
      Define the elementwise clamp
          bound_{a,b}(z) ≡ min(b, max(a, z)) = { a if z < a;  z if a ≤ z ≤ b;  b if z > b },
      and for A : m×n, x : n×1, the max-product "mat-vec" with entries maxⱼ aᵢ,ⱼxⱼ, where A stacks the row constraints Ar and column constraints Ac.
      NETALIGNBP ALGORITHM (β̃ = β/2):
          y(0) = 0, z(0) = 0, S(0) = 0
          while t = 1, ... do
              d = bound_{0,β̃}(S(t−1)ᵀ + β̃S) · e
              y(t) = αw − bound_{0,∞}[(AᵀrAr − I) z(t−1)] + d
              z(t) = αw − bound_{0,∞}[(AᵀcAc − I) y(t−1)] + d
              S(t) = (Y(t) + Z(t) − αW − D)ᵀ · S − bound_{0,β̃}(S(t−1)ᵀ + β̃S)
          end while
      Note: α = 1, β = 2, γ = 0.99 damping, with max-weight matching rounding, gives 15,214 overlap and 56,361 weight in 10 mins.
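The exact message algebra above is spelled out in the paper; the two reusable ingredients are easy to state. A minimal sketch of the elementwise clamp bound_{a,b} and of generic convex message damping (the γ = 0.99 damping mentioned in the note; the paper's exact damping scheme may differ from this assumed form):

```python
# The elementwise clamp used throughout NetAlignBP, plus generic BP damping.
import numpy as np

def bound(a, b, z):
    """bound_{a,b}(z) = min(b, max(a, z)), applied elementwise."""
    return np.minimum(b, np.maximum(a, z))

def damp(msg_new, msg_old, gamma=0.99):
    """Convex damping of successive messages, a standard BP stabilizer."""
    return gamma * msg_new + (1 - gamma) * msg_old

print(bound(0.0, 1.0, np.array([-1.0, 0.3, 2.5])))   # -> [0.  0.3 1. ]
```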
  39. [Section divider: Results.]
  40. Synthetic experiments: BP does well!
      [Figure: two plots vs. the expected degree of noise in L (p·n), comparing MR (with its upper bound MR-upper), BP, BPSC, and IsoRank; left panel: fraction correct, right panel: rounded objective values.]
  41. Biological data: a close tie.
      [Figure: overlap vs. weight for BP, SCBP, IsoRank, and MR on two protein-interaction problems; overlap upper bounds 381 (best achieved 376) and 1087 (best achieved 1076); max weight 671.551 and 2733.]
      Problem          |VA|   |EA|    |VB|   |EB|    |EL|
      dmela-scere      9459   25636   5696   31261   34582
      Mus M.-Homo S.   3247   2793    9695   32890   15810
  42. Real dataset.
      [Figure: overlap vs. weight for BP, SCBP, IsoRank, and MR on lcsh2wiki; overlap upper bound 17608, best achieved 16836; max weight 60119.8.]
      Problem     |VA|      |EA|      |VB|      |EB|      |EL|
      lcsh2wiki   297,266   248,230   205,948   382,353   4,971,629
  43. Matching results: a little too hot!
      LCSH                                WC
      Science fiction television series   Science fiction television programs
      Turing test                         Turing test
      Machine learning                    Machine learning
      Hot tubs                            Hot dog
  44. Foreign subject headings. The US uses LCSH for subject headings (342k vertices, 258k edges). France uses Rameau for subject headings (155k vertices, 156k edges). Generate L by automatic translation and text matching, using Google's automatic translation service (translate.google.com); this produces 22,195,304 possible links based on text.
                    cardinality   overlap   correct        %
      Manual        54,259        39,749
      MWM           125,609       17,134    29,133     50.54%
      NetAlignBP    121,316       46,534    32,467     56.32%
      NetAlignMR    119,120       45,977    25,086     43.52%
      Upper                       50,753
      Note: NetAlignBP with α = 1, β = 2, γ = 0.99 for 100 iterations; NetAlignMR with α = 0, β = 1 for 1000 iterations.
  45. [Section divider: Conclusion.]
  46. Philosophy. Why matrix computations?
      • Simple, iterative methods
      • "Easy" to code
      • "Easy" to parallelize
      • "Often" apply to graph problems
  47. Summary and future ideas.
      Inner-outer iterations for PageRank: robust analysis; good for general graphs; can combine with other techniques; works for Gauss-Seidel; works for non-stationary iterations.
      Future work: Gauss-Seidel performance? OPEN: asymptotic performance of inner-outer? Dynamic β and η?
      BP algorithms for network alignment: fast and scalable; good results on biology PPI networks; reasonable results with Rameau to LCSH.
      Future work: no vertex label information for matches? Are "overlap" scores significant? Are LCSH and Wikipedia really similar? OPEN: an approximation algorithm?
  48. PAPER 1: stanford.edu/~dgleich/publications/2009/gleich-2009-inner-outer.html (SIAM J. Scientific Computing; Google "inner outer gleich")
      CODE: stanford.edu/~dgleich/publications/2009/innout (Google "innout gleich")
      PAPER 2: arxiv.org/abs/0907.3338 (ICDM 2009; Google "network alignment gleich")
      CODE: stanford.edu/~dgleich/publications/2009/netalign (Google "netalign gleich")
