Sparse matrix computations in MapReduce

Slides from my presentation on sparse matrix methods in MapReduce.

  1. ICME MapReduce Workshop
     April 29 – May 1, 2013
     David F. Gleich, Computer Science, Purdue University
     Paul G. Constantine, Center for Turbulence Research, Stanford University
     Website: www.stanford.edu/~paulcon/icme-mapreduce-2013
  2. Goals
     Learn the basics of MapReduce & Hadoop.
     Be able to process large volumes of data from science and engineering applications.
     … help enable you to explore on your own!
  3. Workshop overview
     Monday
       Me: Sparse matrix computations in MapReduce
       Austin Benson: Tall-and-skinny matrix computations in MapReduce
     Tuesday
       Joe Buck: Extending MapReduce for scientific computing
       Chunsheng Feng: Large scale video analytics on Pivotal Hadoop
     Wednesday
       Joe Nichols: Post-processing CFD dynamics data in MapReduce
       Lavanya Ramakrishnan: Evaluating MapReduce and Hadoop for science
  4. Sparse matrix computations in MapReduce
     David F. Gleich, Computer Science, Purdue University
     Slides online soon!
     Code: https://github.com/dgleich/mapreduce-matrix-tutorial
  5. How to compute with big matrix data: a tale of two computers
     ORNL 2010 supercomputer: 224k cores, 10 PB drive, 1.7 Pflops, 7 MW, custom
     interconnect, $104M. That is 45 GB/core: high CPU to disk.
     Google's 2010(?) data computer: 80k cores, 50 PB drive, ? Pflops, ? MW, GB
     ethernet, $?? M. That is 625 GB/core: high disk to CPU.
  6. My data computers
     Nebula Cluster @ Sandia CA: 2 TB/core storage, 64 nodes, 256 cores, GB ethernet.
     Cost: $150k.
     ICME Hadoop @ Stanford: 3 TB/core storage, 11 nodes, 44 cores, GB ethernet.
     Cost: $30k.
     These systems are good for working with enormous matrix data!
  7. My data computers
     (The previous slide, amended:) These systems are good, but not great, for working
     with some enormous matrix data.
  8. By 2013(?) all Fortune 500 companies will have a data computer.
  9. How do you program them?
  10. MapReduce and Hadoop overview
  11. MapReduce is designed to solve a different set of problems from standard
      parallel libraries.
  12. The MapReduce programming model
      Input: a list of (key, value) pairs
      Map: apply a function f to all pairs
      Reduce: apply a function g to all values with key k (for all k)
      Output: a list of (key, value) pairs
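      To make the model concrete, here is a tiny single-process simulation of the
      map-shuffle-reduce pattern (a sketch of my own, not part of the workshop code):

          # Minimal single-process model of MapReduce: map, shuffle (group by key), reduce.
          from itertools import groupby

          def mapreduce(pairs, f, g):
              mapped = [kv for pair in pairs for kv in f(*pair)]   # Map: f on every pair
              mapped.sort(key=lambda kv: kv[0])                    # Shuffle: group by key
              return [out
                      for k, grp in groupby(mapped, key=lambda kv: kv[0])
                      for out in g(k, [v for _, v in grp])]        # Reduce: g per key

          # Word count in this model: f emits (word, 1); g sums the counts per word.
          f = lambda _, line: [(w, 1) for w in line.split()]
          g = lambda word, counts: [(word, sum(counts))]
          print(mapreduce([(1, "a b a"), (2, "b")], f, g))   # [('a', 2), ('b', 2)]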
  13. Computing a histogram: a simple MapReduce example
      Input: key = ImageId, value = Pixels
      Map(ImageId, Pixels): for each pixel, emit key = (r, g, b), value = 1
      Reduce(Color, Values): emit key = Color, value = sum(Values)
      Output: key = Color, value = # of pixels
      (Diagram: pixel values flow through Map, the shuffle groups equal colors, and
      Reduce sums the 1s for each color.)
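      Written out, the histogram job is just this pair of functions (a sketch in the
      slide's notation; reading and decoding an image into pixels is elided):

          # Hypothetical map/reduce pair for the color histogram.
          def histogram_map(image_id, pixels):
              for (r, g, b) in pixels:       # one (color, 1) record per pixel
                  yield (r, g, b), 1

          def histogram_reduce(color, values):
              yield color, sum(values)       # number of pixels with this color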
  14. Many matrix computations are possible in MapReduce
      Column sums are easy.
      Input: key = (i, j), value = A_ij ("coordinate storage", e.g.,
        (3,4) -> 5, (1,2) -> -6.0, (2,3) -> -1.2, (1,1) -> 3.14, …)
      Map((i, j), val): emit key = j, value = val
      Reduce(j, Values): emit key = j, value = sum(Values)
      Other basic methods can use common parallel/out-of-core algorithms:
      sparse matrix-vector products y = Ax, sparse matrix-matrix products C = AB.
  15. Many matrix computations are possible in MapReduce
      (The previous slide, with one caution added:) Beware of un-thoughtful ideas!
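      In code, the column-sum job above is a one-line mapper and reducer over
      coordinate-storage records (a sketch; the record layout follows the slide):

          # Column sums over coordinate storage: input records are ((i, j), A_ij).
          def colsum_map(key, val):
              i, j = key
              yield j, val                   # re-key each nonzero by its column

          def colsum_reduce(j, values):
              yield j, sum(values)           # the sum of column j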
  16. Why so many limitations?
  17. The MapReduce programming model
      Input: a list of (key, value) pairs
      Map: apply a function f to all pairs
      Reduce: apply a function g to all values with key k (for all k)
      Output: a list of (key, value) pairs
      The map function f must be side-effect free: all map functions run in parallel.
      The reduce function g must be side-effect free: all reduce functions run in
      parallel.
  18. A graphical view of the MapReduce programming model
      (Diagram: each block of data feeds a Map task, which emits (key, value) pairs;
      the shuffle groups all values with the same key; each Reduce task consumes one
      key and its list of values and emits output data.)
  19. Data scalability
      The idea: bring the computations to the data. MR can schedule map functions
      without moving data.
      (Diagram: five input blocks, each mapped in place; the shuffle routes the map
      output to two reducers.)
  20. After waiting in the queue for a month, and after 24 hours of finding
      eigenvalues, one node randomly hiccups. Heartbreak on node rs252.
  21. Fault tolerant
      Redundant input helps make maps data-local.
      Just one type of communication: the shuffle.
      (Diagram: input stored in triplicate; map output persisted to disk before the
      shuffle; reduce input/output on disk.)
  22. Fault injection
      (Plot: time to completion (secs.) against 1/Prob(failure), the mean number of
      successes per failure, from 10 to 1000; curves for faults and no faults on a
      200M-by-200 and an 800M-by-10 problem.)
      With 1/5 of tasks failing, the job only takes twice as long.
  23. Data scalability
      (Repeated from slide 19.) The idea: bring the computations to the data. MR can
      schedule map functions without moving data.
  24. Computing a histogram: a simple MapReduce example (revisited)
      The entire dataset is "transposed" from images to pixels. This moves the data
      to the computation! Using a combiner helps to reduce the data moved, but it
      cannot always be used; see the sketch below.
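      The combiner is a reducer-like function applied to each map task's local output
      before the shuffle. In mrjob it can be declared alongside the mapper and reducer
      (a sketch; it assumes the input protocol already delivers decoded pixels):

          from mrjob.job import MRJob

          class MRColorHistogram(MRJob):
              def mapper(self, image_id, pixels):
                  for r, g, b in pixels:         # assumes decoded (r, g, b) triples
                      yield (r, g, b), 1

              def combiner(self, color, counts):
                  yield color, sum(counts)       # local partial sums, before the shuffle

              def reducer(self, color, counts):
                  yield color, sum(counts)       # global sums

          if __name__ == '__main__':
              MRColorHistogram.run()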
  25. Hadoop and MapReduce are bad systems for some matrix computations.
  26. How should you evaluate a MapReduce algorithm?
      Build a performance model!
      Measure the worst mapper: usually not too bad.
      Measure the data moved: could be very bad.
      Measure the worst reducer: could be very bad.
  27. Tools I like
      hadoop streaming, dumbo, mrjob, hadoopy, C++
  28. Tools I don't use but other people seem to like …
      pig, java, hbase, mahout, Eclipse, Cassandra
  29. hadoop streaming
      The map function is a program: (key, value) pairs are sent via stdin, and output
      (key, value) pairs go to stdout.
      The reduce function is a program: (key, value) pairs are sent via stdin, keys
      are grouped, and output (key, value) pairs go to stdout.
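      For example, a streaming word count can be two ordinary Python programs (a
      minimal sketch of my own, not from the workshop code; the file names mapper.py
      and reducer.py are placeholders):

          #!/usr/bin/env python
          # mapper.py -- emit a tab-separated (word, 1) pair for each word on stdin.
          import sys

          for line in sys.stdin:
              for word in line.split():
                  print("%s\t1" % word.lower())

          #!/usr/bin/env python
          # reducer.py -- sum the counts for each word. Hadoop streaming sorts the
          # mapper output by key, so all lines for a given word arrive consecutively.
          import sys
          from itertools import groupby

          def parse(stdin):
              for line in stdin:
                  key, _, value = line.rstrip("\n").partition("\t")
                  yield key, int(value)

          for word, group in groupby(parse(sys.stdin), key=lambda kv: kv[0]):
              print("%s\t%d" % (word, sum(v for _, v in group)))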
  30. mrjob, from Yelp: a wrapper around hadoop streaming for map and reduce
      functions in Python

          from mrjob.job import MRJob

          class MRWordFreqCount(MRJob):
              def mapper(self, _, line):
                  for word in line.split():
                      yield (word.lower(), 1)

              def reducer(self, word, counts):
                  yield (word, sum(counts))

          if __name__ == '__main__':
              MRWordFreqCount.run()
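      (Saved as, say, word_freq.py (a placeholder name), a job like this runs locally
      with "python word_freq.py input.txt" and on a cluster by adding "-r hadoop".)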
  31. How can Hadoop streaming possibly be fast?
      Synthetic data test: a 100,000,000-by-500 matrix (~500 GB). Codes implemented
      in MapReduce streaming; the matrix is stored as TypedBytes lists of doubles.
      The Python frameworks use NumPy+ATLAS; a custom C++ TypedBytes reader/writer
      uses ATLAS; there is a new non-streaming Java implementation too. All timing
      results are from the Hadoop job tracker.

                     Iter 1 QR (secs.)  Iter 1 Total (secs.)  Iter 2 Total (secs.)  Overall Total (secs.)
          Dumbo           67.725                960                   217                  1177
          Hadoopy         70.909                612                   118                   730
          C++             15.809                350                    37                   387
          Java             n/a                  436                    66                   502

      C++ in streaming beats a native Java implementation. (From my MapReduce 2011
      talk; example available from github.com/dgleich/mrtsqr for verification.)
      mrjob could be faster if it used typedbytes for intermediate storage; see
      https://github.com/Yelp/mrjob/pull/447
  32. Code samples and short tutorials at
      github.com/dgleich/mrmatrix
      github.com/dgleich/mapreduce-matrix-tutorial
  33. Matrix-vector product
      Ax = y,   y_i = Σ_k A_ik x_k
      Follow along! mapreduce-matrix-tutorial/codes/smatvec.py
  34. Matrix-vector product
      A is stored by row:
          $ head samples/smat_5_5.txt
          0 0 0.125 3 1.024 4 0.121
          1 0 0.597
          2 2 1.247
          3 4 -1.45
          4 2 0.061
      x is stored entry-wise:
          $ head samples/vec_5.txt
          0 0.241
          1 -0.98
          2 0.237
          3 -0.32
          4 0.080
      Follow along! mapreduce-matrix-tutorial/codes/smatvec.py
  35. Matrix-vector product (in pictures)
      Input → Map 1: align A and x on columns → Reduce 1: output A_ik * x_k, keyed on
      row i → Reduce 2: output sum_k(A_ik * x_k) → y
  36. Matrix-vector product (in code)
      Map 1: align on columns.

          def joinmap(self, key, line):
              vals = line.split()
              if len(vals) == 2:
                  # the vector
                  yield (vals[0],              # row
                         (float(vals[1]),))    # x_i
              else:
                  # the matrix
                  row = vals[0]
                  for i in xrange(1, len(vals), 2):
                      yield (vals[i],          # column
                             (row,             # (i, A_ij)
                              float(vals[i+1])))
  37. Matrix-vector product (in code)
      Reduce 1: output A_ik * x_k, keyed on row i.

          def joinred(self, key, vals):
              vecval = 0.
              matvals = []
              for val in vals:
                  if len(val) == 1:
                      vecval += val[0]
                  else:
                      matvals.append(val)
              for val in matvals:
                  yield (val[0], val[1]*vecval)

      Note that you should use a secondary sort to avoid reading both into memory.
  38. Matrix-vector product (in code)
      Reduce 2: output sum_k(A_ik * x_k), giving y.

          def sumred(self, key, vals):
              yield (key, sum(vals))
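      For orientation, the three functions chain together as a two-step job. Here is a
      minimal self-contained sketch using mrjob's current MRStep API (the class name
      SMatVec and this exact wiring are my assumptions; see codes/smatvec.py for the
      real version):

          from mrjob.job import MRJob
          from mrjob.step import MRStep

          class SMatVec(MRJob):
              def joinmap(self, key, line):
                  vals = line.split()
                  if len(vals) == 2:                   # vector entry: "k  x_k"
                      yield vals[0], (float(vals[1]),)
                  else:                                # matrix row: "i  j1 v1  j2 v2 ..."
                      row = vals[0]
                      for i in range(1, len(vals), 2):
                          yield vals[i], (row, float(vals[i + 1]))

              def joinred(self, key, vals):
                  vecval = 0.
                  matvals = []
                  for val in vals:
                      if len(val) == 1:
                          vecval += val[0]
                      else:
                          matvals.append(val)
                  for row, aij in matvals:
                      yield row, aij * vecval          # A_ik * x_k, keyed on row i

              def sumred(self, key, vals):
                  yield key, sum(vals)                 # y_i = sum_k A_ik x_k

              def steps(self):
                  return [MRStep(mapper=self.joinmap, reducer=self.joinred),
                          MRStep(reducer=self.sumred)]

          if __name__ == '__main__':
              SMatVec.run()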
  39. Move the computations to the data? Not really!
      Map 1 aligns on columns: the data is copied once, now aligned by column.
      Reduce 1 outputs A_ik * x_k keyed on row i: the data is copied again, now
      aligned by row. Reduce 2 then outputs the sums, y.
  40. Matrix-matrix product
      AB = C,   C_ij = Σ_k A_ik B_kj
      Follow along! mapreduce-matrix-tutorial/codes/matmat.py
  41. Matrix-matrix product
      A is stored by row:
          $ head samples/smat_10_5_A.txt
          0 0 0.599 4 -1.53
          1
          2 2 0.260
          3
          4 0 0.267 1 0.839
      B is stored by row:
          $ head samples/smat_5_5.txt
          0 0 0.125 3 1.024 4 0.121
          1 0 0.597
          2 2 1.247
      Follow along! mapreduce-matrix-tutorial/codes/matmat.py
  42. Matrix-matrix product (in pictures)
      A, B → Map 1: align A's columns with B's rows → Reduce 1: output A_ik * B_kj,
      keyed on (i, j) → Reduce 2: output sum_k(A_ik * B_kj) → C
  43. Matrix-matrix product (in code)
      Map 1: align on columns.

          def joinmap(self, key, line):
              mtype = self.parsemat()
              vals = line.split()
              row = vals[0]
              rowvals = [(vals[i], float(vals[i+1]))
                         for i in xrange(1, len(vals), 2)]
              if mtype == 1:
                  # matrix A, output by column
                  for val in rowvals:
                      yield (val[0], (row, val[1]))
              else:
                  # matrix B, output the whole row
                  yield (row, (rowvals,))
  44. Matrix-matrix product (in code)
      Reduce 1: output A_ik * B_kj, keyed on (i, j).

          def joinred(self, key, vals):
              # load the data into memory
              brow = []
              acol = []
              for val in vals:
                  if len(val) == 1:
                      brow.extend(val[0])
                  else:
                      acol.append(val)
              for (bcol, bval) in brow:
                  for (arow, aval) in acol:
                      yield ((arow, bcol), aval*bval)
  45. Matrix-matrix product (in code)
      Reduce 2: output sum_k(A_ik * B_kj), giving C.

          def sumred(self, key, vals):
              yield (key, sum(vals))
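      As a sanity check, the same join can be simulated in one process on a tiny
      example (a standalone sketch of my own, not from matmat.py):

          # Single-process simulation of the matmat pipeline on a small example.
          from collections import defaultdict

          A = {(0, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}   # sparse A as {(i, k): value}
          B = {(0, 0): 4.0, (1, 0): 5.0, (1, 1): 6.0}   # sparse B as {(k, j): value}

          # Map 1 + shuffle: align A's columns with B's rows on the shared index k.
          byk = defaultdict(lambda: ([], []))
          for (i, k), v in A.items():
              byk[k][0].append((i, v))
          for (k, j), v in B.items():
              byk[k][1].append((j, v))

          # Reduce 1 (emit A_ik * B_kj keyed on (i, j)) fused with Reduce 2 (sum per key).
          C = defaultdict(float)
          for acol, brow in byk.values():
              for i, av in acol:
                  for j, bv in brow:
                      C[(i, j)] += av * bv

          print(sorted(C.items()))
          # [((0, 0), 14.0), ((0, 1), 12.0), ((1, 0), 15.0), ((1, 1), 18.0)]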
  46. Why is MapReduce so popular?
      600 lines of gross code in order to load a sparse matrix into memory, streaming
      from one processor. MapReduce offers a better alternative.

          if (root) {
              PetscInt cur_nz=0;
              unsigned char *root_nz_buf;
              unsigned int *root_nz_buf_i, *root_nz_buf_j;
              double *root_nz_buf_v;
              PetscMalloc((sizeof(unsigned int)*2 + sizeof(double))*root_nz_bufsize,
                          &root_nz_buf);
              PetscMalloc(sizeof(unsigned int)*root_nz_bufsize, &root_nz_buf_i);
              PetscMalloc(sizeof(unsigned int)*root_nz_bufsize, &root_nz_buf_j);
              PetscMalloc(sizeof(double)*root_nz_bufsize, &root_nz_buf_v);

              unsigned long long int nzs_to_read = total_nz;

              while (send_rounds > 0) {
                  /* check if we are near the end of the file
                     and just read that amount */
                  size_t cur_nz_read = root_nz_bufsize;
                  if (cur_nz_read > nzs_to_read) {
                      cur_nz_read = nzs_to_read;
                  }
                  PetscInfo2(PETSC_NULL, "reading %i non-zeros of %lli\n",
                             cur_nz_read, nzs_to_read);
                  ...
  47. Thoughts on a better system
      Default quadruple precision.
      Matrix computations without indexing.
      Easy setup of MPI data jobs.
      (Diagram: the initial data load of any MPI job, followed by the compute task.)
  48. Double-precision floating point was designed for the era where "big" was
      1000-10000.
  49. Error analysis of summation
      s = 0; for i = 1 to n: s = s + x[i]
      A simple summation formula has error that is not always small if n is a billion:
      fl(x + y) = (x + y)(1 + δ), |δ| ≤ μ, so
      |fl(Σ_i x_i) − Σ_i x_i| ≤ n·μ·Σ_i |x_i|,   μ ≈ 10^-16.
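      The effect is easy to observe (a small demonstration of my own; math.fsum is the
      accurately rounded reference):

          # Naive left-to-right summation vs. an accurately rounded sum.
          import math, random

          x = [random.uniform(0, 1) for _ in range(10**7)]
          s = 0.0
          for xi in x:          # every s + x[i] rounds: fl(s + x[i]) = (s + x[i])(1 + d)
              s += xi
          print(abs(s - math.fsum(x)))   # nonzero, and it grows with n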
  50. If your application matters, then watch out for this issue. Use quad-precision
      arithmetic or compensated summation instead.
  51. Compensated summation
      ("Kahan summation algorithm" on Wikipedia)

          s = 0.; c = 0.
          for i = 1 to n:
              y = x[i] - c
              t = s + y
              c = (t - s) - y
              s = t

      Mathematically, c is always zero. On a computer, c can be non-zero. The
      parentheses matter!
      |fl(csum(x)) − Σ_i x_i| ≤ (μ + n·μ²)·Σ_i |x_i|,   μ ≈ 10^-16.
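      As runnable Python, with a tiny case where the difference is visible (a sketch
      of my own):

          # Kahan (compensated) summation: c carries the rounding error of each step.
          def kahan_sum(x):
              s = 0.0
              c = 0.0
              for xi in x:
                  y = xi - c        # re-inject the low-order bits lost last time
                  t = s + y
                  c = (t - s) - y   # (t - s) is the part of y that made it into s
                  s = t
              return s

          vals = [1.0, 1e-16, 1e-16, 1e-16, 1e-16]
          print(sum(vals))          # 1.0: the small terms vanish in naive summation
          print(kahan_sum(vals))    # 1.0000000000000004: the compensated sum keeps them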
  52. Summary
      MapReduce is a powerful but limited tool that has a role in the future of
      computational math … but it should be used carefully! See Austin's talk next!
      Code samples and short tutorials at
      github.com/dgleich/mrmatrix
      github.com/dgleich/mapreduce-matrix-tutorial
