  • I am surprised that this approach (MapReduce) is being compared to GraphChi and PowerGraph, since there are studies suggesting MapReduce may not be well suited to graph computations.


  • 1. Massive MapReduce Matrix Computations & Multicore Graph Algorithms. David F. Gleich, Computer Science, Purdue University.
  • 2. Approximating Personalized PageRank with Minimal Use of Web Graph Data. David Gleich and Marzia Polito, Internet Mathematics Vol. 3, No. 3: 257-294. From an Intel internship in 2005 in the Application Research Lab in Santa Clara, resulting in one of my favorite papers! Could you run your own search engine and crawl the web to compute your own PageRank vector if you are highly concerned with privacy? Yes! Theory, experiments, implementation! Abstract: In this paper, we consider the problem of calculating fast and accurate approximations to the personalized PageRank score of a webpage. We focus on improving speed by limiting the amount of web graph data we need to access. Our algorithms provide both the approximation to the personalized PageRank as well as guidance in using only the necessary information, and therefore reduce not only the computational cost of the algorithm but also the memory and memory-bandwidth requirements. We report experiments with these algorithms on web graphs of up to 118 million pages and prove a theoretical approximation bound for all.
  • 3. Massive MapReduce Matrix Computations. With Yangyang Hou (Purdue, CS); Paul G. Constantine, Austin Benson, Joe Nichols (Stanford University); James Demmel (UC Berkeley); Joe Ruthruff, Jeremy Templeton (Sandia CA). Funded by the Sandia National Labs CSAR project.
  • 4. By 2013(?) all Fortune 500 companies will have a data computer.
  • 5. Data computers I've worked with:
      Magellan Cluster @ NERSC: 128GB/core storage, 80 nodes, 640 cores, InfiniBand.
      Student Cluster @ Stanford: 3TB/core storage, 11 nodes, 44 cores, GB Ethernet, cost $30k.
      Nebula Cluster @ Sandia CA: 2TB/core storage, 64 nodes, 256 cores, GB Ethernet, cost $150k.
    These systems are good for working with enormous matrix data!
  • 6. How do you program them?
  • 7. MapReduce and Hadoop overview.
  • 8. MapReduce in a picture: map tasks run in parallel, then a shuffle (like an MPI all-to-all), then reduce tasks run in parallel.
  • 9. Computing a histogram: a simple MapReduce example.
      Input: key = ImageId, value = pixels. Output: key = color, value = # of pixels.
      Map(ImageId, Pixels): for each pixel, emit key = (r,g,b), value = 1.
      Reduce(Color, Values): emit key = Color, value = sum(Values).
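The map and reduce functions on this slide can be sketched in plain Python. The in-memory run_mapreduce driver below is hypothetical, standing in for Hadoop's shuffle by grouping mapper output by key:

```python
from collections import defaultdict

def map_histogram(image_id, pixels):
    """Map: emit (color, 1) for every pixel of one image."""
    for color in pixels:
        yield color, 1

def reduce_histogram(color, counts):
    """Reduce: sum the counts for one color."""
    return color, sum(counts)

def run_mapreduce(records, mapper, reducer):
    """Simulate the shuffle: group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in records:
        for k, v in mapper(key, value):
            groups[k].append(v)
    return dict(reducer(k, vs) for k, vs in groups.items())

# Two "images", each a list of pixel colors.
images = [("img1", ["red", "red", "blue"]), ("img2", ["blue", "red"])]
hist = run_mapreduce(images, map_histogram, reduce_histogram)
```

In the real system the groups never materialize on one machine; the shuffle routes each key's values to a reducer.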
  • 10. Why a limited computational model? Data scalability and fault tolerance. The idea: bring the computations to the data; MapReduce can schedule map functions without moving data. (War stories: the last page of a 136-page error dump; after waiting in the queue for a month and after 24 hours of finding eigenvalues, one node randomly hiccups.)
  • 11. Tall-and-skinny matrices (m ≫ n): many rows (like a billion), a few columns (under 10,000). Used in regression and general linear models with many samples (e.g., from the tinyimages collection), block iterative methods, panel factorizations, simulation data analysis, and big-data SVD/PCA.
  • 12. Scientific simulations as tall-and-skinny matrices. Input: parameters s; output: the time history of the simulation, f, ~100GB. The simulation as a vector: f(s) = [q(x_1, t_1, s); ...; q(x_n, t_1, s); q(x_1, t_2, s); ...; q(x_n, t_2, s); ...; q(x_n, t_k, s)]. The simulation as a matrix: a space-by-time tall-and-skinny matrix A. A database of simulations s_1 -> f_1, s_2 -> f_2, ..., s_k -> f_k is a very tall-and-skinny matrix.
  • 13. Model reduction (Constantine & Gleich, ICASSP 2012). A large-scale example: a nonlinear heat transfer model with 80k nodes, 300 time-steps, and 104 basis runs; the SVD of a 24M x 104 data matrix gives a 500x reduction in wall-clock time (100x including the SVD).
  • 14. PCA of 80,000,000 images (Constantine & Gleich, MapReduce 2010). 80,000,000 images, 1000 pixels each. Zero-mean the rows, run TSQR in MapReduce, then post-process: the SVD of R gives V (the principal components) and the top 100 singular values. The first 16 columns of V shown as images.
  • 15. All that these applications need is Tall-and-Skinny QR (TSQR).
  • 16. Quick review of QR. Let A be m x n, real. The QR factorization is A = QR, where Q is m x n orthogonal (Q^T Q = I) and R is n x n upper triangular. QR is a block normalization: "normalize a vector" generalizes to computing Q in the QR factorization. Using QR for regression: the least-squares solution of min ||Ax - b|| is given by solving Rx = Q^T b. Current MapReduce algorithms use the normal equations, A^T A = R^T R via Cholesky with Q = A R^{-1}, which can limit numerical accuracy.
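A small numpy experiment (illustrative, not from the slides) shows why the normal-equations route can limit accuracy: forming A^T A squares the condition number, so the Q recovered from the Cholesky factor loses orthogonality on an ill-conditioned A, while Householder QR does not:

```python
import numpy as np

np.random.seed(0)
# Build A with condition number ~1e6 via a thin SVD-like construction.
U, _ = np.linalg.qr(np.random.randn(500, 5))
V, _ = np.linalg.qr(np.random.randn(5, 5))
A = U @ np.diag(np.logspace(0, -6, 5)) @ V.T

# Normal equations: A^T A = R^T R via Cholesky, then Q = A R^{-1}.
R_chol = np.linalg.cholesky(A.T @ A).T
Q_chol = A @ np.linalg.inv(R_chol)
err_chol = np.linalg.norm(Q_chol.T @ Q_chol - np.eye(5))

# Householder QR for comparison.
Q_qr, R_qr = np.linalg.qr(A)
err_qr = np.linalg.norm(Q_qr.T @ Q_qr - np.eye(5))
```

The orthogonality error of the Cholesky route grows like eps * cond(A)^2, many orders of magnitude worse than Householder QR on this matrix.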
  • 17. There are good MPI implementations. Why MapReduce?
  • 18. Full TSQR code in hadoopy:

        import random, numpy, hadoopy

        class SerialTSQR:
            def __init__(self, blocksize, isreducer):
                self.bsize = blocksize
                self.data = []
                if isreducer:
                    self.__call__ = self.reducer
                else:
                    self.__call__ = self.mapper

            def compress(self):
                R = numpy.linalg.qr(numpy.array(self.data), 'r')
                # reset data and re-initialize to R
                self.data = []
                for row in R:
                    self.data.append([float(v) for v in row])

            def collect(self, key, value):
                self.data.append(value)
                if len(self.data) > self.bsize * len(self.data[0]):
                    self.compress()

            def close(self):
                self.compress()
                for row in self.data:
                    key = random.randint(0, 2000000000)
                    yield key, row

            def mapper(self, key, value):
                self.collect(key, value)

            def reducer(self, key, values):
                for value in values:
                    self.mapper(key, value)

        if __name__ == '__main__':
            mapper = SerialTSQR(blocksize=3, isreducer=False)
            reducer = SerialTSQR(blocksize=3, isreducer=True)
            hadoopy.run(mapper, reducer)
  • 19. Tall-and-skinny matrix storage in MapReduce. A is m x n with m ≫ n. The key is an arbitrary row-id; the value is the 1 x n array for a row. Each submatrix A_i is the input to a map task.
  • 20. Numerical stability was a problem for prior approaches: previous methods couldn't ensure that the matrix Q was orthogonal. [Figure: norm(Q^T Q - I) vs. condition number (10^5 to 10^20) for AR^{-1} (Constantine & Gleich, MapReduce 2010), AR^{-1} + iterative refinement, and Direct TSQR (Benson, Gleich, Demmel, submitted).]
  • 21. Communication-avoiding QR (Demmel et al. 2008) on MapReduce (Constantine and Gleich, 2010). Data: rows of a matrix. Map: QR factorization of the rows (each mapper runs a serial TSQR over its blocks A_1, A_2, ..., emitting an R factor). Reduce: QR factorization of the collected R factors. A "manual reduce" can make it faster by adding a second iteration. This computes only R and not Q; Q can be obtained via Q = AR^{-1} with another MR iteration. (Use the standard Householder method?)
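The map/reduce structure of the algorithm can be sketched serially in numpy (an illustrative sketch, not the MapReduce implementation): factor each row block independently, stack the small R factors, and factor the stack:

```python
import numpy as np

def tsqr_r(A, nblocks=4):
    """Communication-avoiding TSQR, R-only: QR each row block ("map"),
    then QR the stacked R factors ("reduce"). As on the slide, only R
    is computed; Q can be recovered afterwards as Q = A R^{-1}."""
    blocks = np.array_split(A, nblocks, axis=0)
    Rs = [np.linalg.qr(B, mode='r') for B in blocks]   # map step
    return np.linalg.qr(np.vstack(Rs), mode='r')       # reduce step

np.random.seed(1)
A = np.random.rand(400, 6)
R = tsqr_r(A)
```

Because each intermediate factorization preserves A^T A = R^T R, the final R is a valid triangular factor of A up to row signs.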
  • 22. Taking care of business by keeping track of Q. 1. Output the local Q and R in separate files. 2. Collect R on one node and compute the Q piece for each R. 3. Distribute the pieces of Q and form the true Q.
  • 23. The price is right! Based on a performance model and tests on the NERSC Magellan computer (80 nodes, 640 processors, 80TB disk): Direct TSQR is faster than refinement for few columns (800M-by-10, 7.5B-by-4) and not any slower for many columns (150M-by-100, 500M-by-50).
  • 24. Ongoing work. Make AR^{-1} stable with targeted quad-precision arithmetic to get a numerically orthogonal Q; the performance model says it's feasible! How to handle more than ~10,000 columns? Some randomized methods? Do we need quad-precision for big data? Standard error analysis gives an n·ε bound for computing a sum; I've seen this matter with PageRank computations!
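The n·ε point can be seen with a small (illustrative) summation experiment: naive recursive summation of n terms has an error that can grow like n·ε, while compensated (Kahan) summation keeps it near ε. This is the kind of accuracy that targeted extra precision would also buy:

```python
def kahan_sum(xs):
    """Compensated summation: carry the rounding error of each addition."""
    total, c = 0.0, 0.0
    for x in xs:
        y = x - c
        t = total + y
        c = (t - total) - y   # the low-order bits lost in total + y
        total = t
    return total

n = 10**6
xs = [0.1] * n
naive = sum(xs)
compensated = kahan_sum(xs)

# Reference: n copies of the double nearest 0.1 sum to very nearly n * 0.1.
err_naive = abs(naive - 0.1 * n)
err_kahan = abs(compensated - 0.1 * n)
```

On a million terms the naive sum drifts by roughly a microunit while the compensated sum stays within an ulp or two of the reference.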
  • 25. Multicore Graph Algorithms. With Assefaw Gebremedhin, Arif Khan, Alex Pothen, Ryan Rossi (Purdue, CS); Mahantesh Halappanavar (PNNL); Chen Greif, David Kurokawa (Univ. British Columbia); Mohsen Bayati, Amin Saberi, Ying Wang (now Google) (Stanford). Funded by the DOE CSCAPES Institute grant (DE-FC02-08ER25864), NSF CAREER grant 1149756-CCF, and the Center for Adaptive Supercomputing Software Multithreaded Architectures (CASS-MT) at PNNL.
  • 26. Network alignment: what is the best way of matching graph A to graph B? [Figure: graph A with vertices r, s, t, u; graph B with vertices v, w.]
  • 27. [Figure 2, adapted from Sharan and Ideker: the NetworkBLAST local network alignment algorithm. Given two input networks, a network alignment graph is constructed. Nodes in this graph correspond to pairs of sequence-similar proteins, one from each species, and edges correspond to conserved interactions. A search algorithm identifies highly similar subnetworks that follow a prespecified interaction pattern.] From Sharan and Ideker, Modeling cellular machinery through biological network comparison. Nat. Biotechnol. 24, 4 (Apr. 2006), 427-433.
  • 28. Network alignment: what is the best way of matching graph A to graph B using only the edges in L?
  • 29. Network alignment. Matching? A 1-1 relationship. Best? Highest weight and overlap.
  • 30. Our contributions. A new belief propagation method (Bayati et al. 2009, 2013) that outperformed state-of-the-art PageRank-based and optimization-based heuristic methods. High-performance C++ implementations (Khan et al. 2012): 40 times faster overall (C++ ~3x, complexity ~2x, threading ~8x); 5 million edge alignments in ~10 sec. www.cs.purdue.edu/~dgleich/codes/netalignmc
  • 31. [Figure slide.]
  • 32. Each iteration involves matrix-vector-ish computations with a sparse matrix (sparse matrix-vector products in a semiring, dot products, axpy, etc.) and a bipartite max-weight matching using a different weight vector at each iteration. Let x[i] be the score for each pair-wise match in L: for i = 1 to ..., update x[i] to y[i]; compute a max-weight match with y; update y[i] to x[i] (using the match, in MR). There is no "convergence": 100-1000 iterations.
  • 33. The methods: belief propagation. Listing 2, a belief-propagation message-passing procedure for network alignment (see the text for a description of othermax and the rounding heuristic):
      y(0) = 0, z(0) = 0, d(0) = 0, S(0) = 0
      for k = 1 to niter:
        F = bound_{0,β}[S + S(k)]              (Step 1: compute F)
        d = αw + Fe                            (Step 2: compute d)
        y(k) = d - othermaxcol(z(k-1))         (Step 3: othermax)
        z(k) = d - othermaxrow(y(k-1))
        S(k) = diag(y(k) + z(k) - d)S - F      (Step 4: update S)
        (y(k), z(k), S(k)) <- γ_k (y(k), z(k), S(k)) + (1 - γ_k)(y(k-1), z(k-1), S(k-1))   (Step 5: damping)
        round heuristic(y(k))                  (Step 6: matching)
        round heuristic(z(k))
      end
      return y(k) or z(k) with the largest objective value
    In the message-passing interpretation, the weight vectors are usually called messages as they communicate the "beliefs" of each "agent." In this particular problem, the neighborhood of an agent represents all of the other edges in graph L incident on the same vertex in graph A (1st vector), all edges in L incident on the same vertex in graph B (2nd vector), or the edges in L that are ...
  • 34. The NEW methods: parallel belief propagation. The iteration is the same as Listing 2, except that approximate bipartite max-weight matching is used in the rounding step (Step 6) instead of exact matching.
  • 35. Approximation doesn't hurt. Test problems: aligning the Library of Congress subject headings with Wikipedia categories (lcsh-wiki) and with the French National Library's subject headings, Rameau (lcsh-rameau); weights in L are computed via heading strings (and via translated headings for Rameau). These problems are larger than others in the literature. Also: randomly perturb one power-law graph to get A, generate L by the true matching plus random edges, and evaluate the fraction of the correct match recovered versus the expected degree of noise in L (p·n). [Fig. 2: BP and ApproxBP are indistinguishable; alignment with a power-law graph shows the large effect that approximate rounding can have on solutions from Klau's method (MR): exact rounding yields the identity matching for all problems, whereas the approximation (ApproxMR) does not.]
  • 36. A local dominating edge method for bipartite matching. A locally dominating edge is an edge heavier than all neighboring edges. The method guarantees a ½-approximation and a maximal matching; it is based on work by Preis (1999), Manne and Bisseling (2008), and Halappanavar et al. (2012). For bipartite graphs: work on the smaller side only.
  • 37. The method: queue all vertices; until the queue is empty, in parallel over vertices, match each vertex to its heaviest edge; if there's a conflict, check the winner and find an alternative for the loser; add the endpoints of non-dominating edges to the queue.
  • 38. Implementation details: a customized first iteration (with all vertices); OpenMP locks to update choices; __sync_fetch_and_add for queue updates.
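A serial sketch of the matching idea (illustrative; the slides' version is parallel with locks and a queue): processing edges in decreasing weight order makes each accepted edge locally dominant among the remaining edges, which yields a maximal matching with at least half the weight of the optimum:

```python
def dominating_edge_matching(edges):
    """Serial sketch of the locally-dominant edge method.
    edges: list of (u, v, weight) tuples for a bipartite graph.
    Each accepted edge is heavier than every remaining edge touching
    its endpoints, i.e. locally dominant, giving a maximal matching
    with a 1/2-approximation guarantee on total weight."""
    matched = set()
    matching = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching

# Vertices r, s on one side and t, u on the other, as in the slides' figure.
edges = [("r", "t", 5.0), ("r", "u", 2.0), ("s", "t", 3.0), ("s", "u", 4.0)]
m = dominating_edge_matching(edges)
```

The parallel version avoids the global sort: every vertex points at its heaviest edge, mutual pointers are matched, and losers retry from the queue.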
  • 39. The remaining multi-threading procedures are straightforward. Standard OpenMP for the matrix computations; use schedule(dynamic) to handle skew. We can batch the matching procedures in the BP method for additional parallelism: for i = 1 to ..., update x[i] to y[i] and save y[i] in a buffer; when the buffer is full, compute max-weight matches for everything in the buffer and save the best.
  • 40. Performance evaluation. A (2x4)-socket, 10-core Intel E7-8870 at 2.4 GHz (80 cores); 16 GB memory per processor (128 GB total). Scaling study: 1. thread binding, scattered vs. compact; 2. memory binding, interleaved vs. bind.
  • 41. Scaling BP with no batching (lcsh-rameau, 400 iterations), with scatter thread binding and interleaved memory: 1450 seconds on 1 thread vs. 115 seconds on 40 threads (about a 12.6x speedup).
  • 42. Ongoing work. Better memory handling: numactl and affinity are insufficient for full scaling. Better models: these get to be much bigger computations. Distributed memory: trying to get an MPI version, and looking into GraphLab.
  • 43. PageRank details: PageRank was created by Google to rank web-pages. The model: 1. follow edges uniformly with probability α; 2. randomly jump with probability 1 - α, and we'll assume everywhere is equally likely, so the jump vector is v = [1/n, ..., 1/n]^T with e^T v = 1. Here P is the column-stochastic transition matrix of the graph (P_ij ≥ 0, e^T P = e^T). Markov chain form: (αP + (1 - α)ve^T)x = x, with the unique x ≥ 0, e^T x = 1. Linear system form: (I - αP)x = (1 - α)v. The places where we find the surfer most often are the important pages. Dangling nodes are patched back to v (ignored in the algorithms later).
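The two formulations on this slide agree: iterating x <- αPx + (1 - α)v converges to the solution of (I - αP)x = (1 - α)v. A minimal numpy sketch (dangling nodes assumed already patched back to v, as on the slide):

```python
import numpy as np

def pagerank(P, alpha=0.85, tol=1e-10):
    """Power iteration for x = alpha*P*x + (1-alpha)*v with uniform v.
    P must be column-stochastic (each column sums to 1)."""
    n = P.shape[0]
    v = np.full(n, 1.0 / n)
    x = v.copy()
    while True:
        x_new = alpha * (P @ x) + (1 - alpha) * v
        if np.abs(x_new - x).sum() < tol:
            return x_new
        x = x_new

# Tiny 3-node cycle 0 -> 1 -> 2 -> 0, written column-stochastically.
P = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.]])
x = pagerank(P)
```

On the cycle the scores are uniform, as symmetry demands; each iteration contracts the error by a factor of α, so convergence is geometric.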
  • 44. Other uses for PageRank (sensitivity?): what else people use PageRank for. ProteinRank, GeneRank, ObjectRank, EventRank, IsoRank, clustering (graph partitioning), sports ranking, food webs, centrality, teaching. [Figure: a GeneRank example on a microarray gene network, solving (I - αGD^{-1})x = w; the links find "nearby" important genes, and the jump keeps the genes that were examined and understood.] Conjectured new papers: TweetRank (done, WSDM 2010), WaveRank, PaperRank, UniversityRank, LabRank; I think the last one involves a ...
  • 45. Multicore PageRank: ... a similar story ... Serialized preprocessing; parallelize the linear algebra via an asynchronous Gauss-Seidel iterative method; ~10x scaling on the same (80-core) machine (1M nodes, 15M edges, synthetic).
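A serial sketch of the Gauss-Seidel sweep being parallelized (illustrative; the multicore version runs the row updates asynchronously across threads): each x[i] is overwritten in place using the freshest values of the other entries.

```python
import numpy as np

def pagerank_gauss_seidel(P, alpha=0.85, sweeps=200):
    """Gauss-Seidel sweeps for (I - alpha*P) x = (1 - alpha)*v with
    uniform v. Assumes P is column-stochastic with a zero diagonal
    (no self-links), so row i of the system solves directly for x[i]."""
    n = P.shape[0]
    v = np.full(n, 1.0 / n)
    x = v.copy()
    for _ in range(sweeps):
        for i in range(n):
            # Row i of x = alpha*P*x + (1-alpha)*v, using already-updated entries.
            x[i] = alpha * (P[i] @ x) + (1 - alpha) * v[i]
    return x

# A small column-stochastic example with a zero diagonal.
P = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
x = pagerank_gauss_seidel(P)
```

Because the updates use fresh values immediately, Gauss-Seidel typically converges in fewer sweeps than the power method; the asynchronous threaded version tolerates stale reads at the cost of a deterministic update order.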
  • 46. Questions? Papers on my webpage: www.cs.purdue.edu/homes/dgleich. Codes: github.com/arbenson/mrtsqr, www.cs.purdue.edu/homes/dgleich/codes/netalignmc, github.com/dgleich/prpack.