# Distinguishing the signal from noise in an SVD of simulation data

My talk at the massive data and signal processing workshop at ICASSP 2012 in Kyoto, Japan.


### Distinguishing the signal from noise in an SVD of simulation data

1. Distinguishing signal from noise in an SVD of simulation data. David F. Gleich, Purdue University, Computer Science Department; Paul G. Constantine, Stanford University. (David Gleich · Purdue · ICASSP)
2. A large-scale, non-linear, time-dependent heat transfer problem: $10^5$ nodes, $10^3$ time steps, 30 minutes on 16 cores, ~1 GB of output per run. Questions: What is the probability of failure? Which input values cause failure?
3. Insight and confidence require multiple runs, which hits the curse of dimensionality. The problem: a simulation run is time-consuming. Our solution: use "big-data" techniques and platforms.
4. We store a few runs … and build an interpolant from the data. Supercomputer: run 100–1,000 simulations. Data computing cluster: store them on the MapReduce cluster. Engineer: run 10,000–100,000 interpolated simulations for approximate statistics and for computational steering.
5. The database. Input parameters $s$ (5–10 of them) map to the time history of a simulation $f$ ("a few gigabytes" each): $s_1 \to f_1$, $s_2 \to f_2$, …, $s_k \to f_k$. A single simulation, stacked over nodes and time steps, is the vector
   $$f(s) = \begin{bmatrix} q(x_1, t_1, s) & \cdots & q(x_n, t_1, s) & q(x_1, t_2, s) & \cdots & q(x_n, t_2, s) & \cdots & q(x_n, t_k, s) \end{bmatrix}^T.$$
   The database as a matrix: $X = [\, f(s_1) \;\; f(s_2) \;\; \dots \;\; f(s_p) \,]$, 100 GB – 100 TB.
6. A one-dimensional test problem: $X_{i,j} = f(x_i, s_j)$ with
   $$f(x, s) = \frac{\log[1 + 4s(x^2 - x)]}{8s}.$$
   "plot(X)" shows the columns $f_1, \dots, f_5$ as curves in $x$; "imagesc(X)" shows the matrix itself.
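As a small NumPy sketch (my illustration, not code from the talk), the test-problem database can be assembled column by column and fed to an SVD; the grid sizes and parameter range are assumptions, with $s < 1$ so the argument of the log stays positive:

```python
import numpy as np

# Build the 1-D test database X[i, j] = f(x_i, s_j) column by column.
# Grid sizes and the parameter range are illustrative assumptions;
# s stays below 1 so the argument of the log stays positive.
x = np.linspace(0.0, 1.0, 100)           # spatial grid
s_samples = np.linspace(0.1, 0.9, 5)     # five parameter samples: f1, ..., f5

def f(x, s):
    return np.log(1.0 + 4.0 * s * (x**2 - x)) / (8.0 * s)

X = np.column_stack([f(x, s) for s in s_samples])
sigma = np.linalg.svd(X, compute_uv=False)
print(X.shape)   # one column per run: (100, 5)
```

The rapid decay of `sigma` for smooth data like this is what makes the low-rank surrogate on the later slides viable.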
7. The interpolant. Let the data give you the basis: $X = [\, f(s_1) \;\; f(s_2) \;\; \dots \;\; f(s_p) \,]$. Then find the right combination:
   $$f(s) \approx \sum_{j=1}^{r} u_j \, \alpha_j(s),$$
   where the $u_j$ are the left singular vectors of $X$. Motivation: this idea was inspired by the success of other reduced-order models like POD, and by Paul's residual-minimizing idea.
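A minimal sketch of this kind of interpolant, under my own assumptions (a smooth stand-in response in place of a real simulation, and plain linear interpolation of the coefficients):

```python
import numpy as np

# Sketch of the data-driven interpolant f(s) ~ sum_j u_j * alpha_j(s),
# where alpha_j(s) interpolates sigma_j * v_j across the parameter samples.
# The smooth stand-in response and linear interpolation are assumptions.
x = np.linspace(0.0, 1.0, 200)
s_samples = np.linspace(0.1, 0.9, 9)
f = lambda s: np.exp(-s * x) * np.sin(np.pi * x)   # stand-in smooth response

X = np.column_stack([f(s) for s in s_samples])
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

def surrogate(s_new, r=4):
    # alpha_j(s) = sigma_j * (interpolated v_j(s)); linear interp for brevity
    alpha = np.array([sigma[j] * np.interp(s_new, s_samples, Vt[j])
                      for j in range(r)])
    return U[:, :r] @ alpha

s_new = 0.47
err = np.linalg.norm(surrogate(s_new) - f(s_new)) / np.linalg.norm(f(s_new))
print(err)
```

With a smooth response, a small truncation rank `r` already gives small relative error at unseen parameter values.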
8. Why the SVD? It splits "space-time" from "parameters". With $x$ as the "space-time" index,
   $$f(x_i, s_j) = \sum_{\ell=1}^{r} U_{i,\ell} \, \sigma_\ell \, V_{j,\ell} = \sum_{\ell=1}^{r} u_\ell(x_i) \, \sigma_\ell \, v_\ell(s_j).$$
   Treat each right singular vector as samples of an unknown basis function; this splits $x$ from a general parameter $s$:
   $$f(x_i, s) = \sum_{\ell=1}^{r} u_\ell(x_i) \, \sigma_\ell \, v_\ell(s), \qquad v_\ell(s) \approx \sum_{j=1}^{p} v_\ell(s_j) \, \phi_j^{(\ell)}(s).$$
   Interpolate $v_\ell$ any way you wish … and it has a "smoothness" property.
9. MapReduce and interpolation. From the database ($s_1 \to f_1$, $s_2 \to f_2$, …, $s_k \to f_k$), use an SVD on the MapReduce cluster to get the singular-vector basis. Interpolation of the coefficients runs on just one machine. Then, to produce new samples ($s_a \to f_a$, $s_b \to f_b$, $s_c \to f_c$), form a linear combination of the singular vectors on the MapReduce cluster: the surrogate.
10. A quiz! Which section would you rather try to interpolate, A or B?
11. How predictable is a singular vector? Folk theorem (O'Leary 2011): the singular vectors of a matrix of "smooth" data become more oscillatory as the index increases. Implication: the gradient of the singular vectors increases as the index increases. So $v_1(s), \dots, v_t(s)$ are predictable signal, while $v_{t+1}(s), \dots, v_r(s)$ are unpredictable noise. [Figure: the singular-vector functions $v_1, v_2, v_3, v_7$ from the example in Section 3, interpreted as functions $v_\ell(s)$. We might have some confidence in an interpolation of $v_1(s)$ and $v_2(s)$; interpolating $v_3(s)$ is problematic for some $s$; interpolating $v_7(s)$ anywhere is dubious, as a finer discretization confirms.]
12. A refined method with an error model: don't even try to interpolate the unpredictable modes. Split
    $$f(s) \approx \underbrace{\sum_{j=1}^{t(s)} u_j \, \alpha_j(s)}_{\text{predictable}} + \underbrace{\sum_{j=t(s)+1}^{r} u_j \, \sigma_j \, \eta_j}_{\text{unpredictable}}, \qquad \eta_j \sim N(0, 1),$$
    so that
    $$\mathrm{Variance}[f] = \mathrm{diag}\Big( \sum_{j=t(s)+1}^{r} \sigma_j^2 \, u_j u_j^T \Big).$$
    But now, how to choose $t(s)$?
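The diagonal of that tail covariance never needs to be formed as a full matrix; a small NumPy sketch (random data, my own illustration) shows the cheap column-wise computation agreeing with the explicit sum of outer products:

```python
import numpy as np

# Pointwise variance of the unpredictable tail in the error model:
# Variance[f] = diag( sum_{j>t} sigma_j^2 u_j u_j^T ).
# Random data here just illustrates the computation.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

t = 3                                   # modes 1..t treated as predictable
tail = U[:, t:] * sigma[t:]             # columns sigma_j * u_j for j > t
var_f = np.sum(tail**2, axis=1)         # diagonal of sum sigma_j^2 u_j u_j^T

# Check against forming the covariance explicitly
cov = sum(sigma[j]**2 * np.outer(U[:, j], U[:, j]) for j in range(t, 8))
print(np.allclose(var_f, np.diag(cov)))  # True
```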
13. Our current approach to choosing the predictability: $t(s)$ is the largest $\tau$ such that
    $$\sum_{i=1}^{\tau} \frac{1}{\sigma_i} \left| \frac{\partial v_i}{\partial s} \right| < \text{threshold}.$$
    We can use more black gradients than red gradients, so the error will be higher for red. Better ideas? Come talk to me! [Figure: the same singular-vector functions $v_1, v_2, v_3, v_7$ as before, with the gradient contributions marked.]
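One way this rule could be implemented is sketched below (my illustration: the finite-difference gradient estimate, the synthetic oscillatory $v_i$, the singular values, and the threshold value are all assumptions, not the talk's code):

```python
import numpy as np

# Sketch of the predictability rule: t(s) is the largest tau with
# sum_{i<=tau} (1/sigma_i) * |dv_i/ds| below a threshold.
def choose_t(sigma, V, s_samples, s, threshold=5.0):
    # V[:, i] holds v_i at the parameter samples; estimate |dv_i/ds| at s
    # with finite differences of the sampled values.
    t = 0
    total = 0.0
    for i in range(len(sigma)):
        dv = np.gradient(V[:, i], s_samples)          # dv_i/ds at the samples
        grad_at_s = np.interp(s, s_samples, np.abs(dv))
        total += grad_at_s / sigma[i]
        if total >= threshold:
            break
        t = i + 1
    return t

s_samples = np.linspace(0.0, 1.0, 21)
# Oscillatory family: v_i(s) = sin((i+1)*pi*s) gets steeper as i grows,
# mimicking the folk theorem's increasingly oscillatory singular vectors.
V = np.column_stack([np.sin((i + 1) * np.pi * s_samples) for i in range(6)])
sigma = np.array([1.0, 0.5, 0.25, 0.12, 0.06, 0.03])
t = choose_t(sigma, V, s_samples, 0.3)
print(t)
```

Raising the threshold admits more modes as predictable; lowering it pushes modes into the noise tail.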
14. An experimental test case: a heat equation problem with two parameters that control the material properties.
15. Where the error is the worst. [Figure: our reduced-order model next to the truth, with a histogram of errors; the errors range from roughly $10^{-3}$ to $10^{-2}$.]
16. A large-scale example: a nonlinear heat transfer model with 80k nodes and 300 time steps, $10^4$ basis runs, and an SVD of a 24M × $10^4$ data matrix. Result: a 500× reduction in wall-clock time (100× including the SVD).
17. SVD from QR: the R-SVD. An old algorithm: let $A = QR$ and $R = U_R \Sigma_R V_R^T$; then $A = (Q U_R) \, \Sigma_R \, V_R^T$. This helps when $A$ is tall and skinny.
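In NumPy the R-SVD is only a few lines; the tall-skinny random matrix below is just for illustration:

```python
import numpy as np

# The R-SVD: for tall-skinny A, factor A = QR, take the SVD of the
# small factor R, and recombine: A = (Q @ U_R) Sigma_R V_R^T.
rng = np.random.default_rng(1)
A = rng.standard_normal((10_000, 20))        # tall and skinny

Q, R = np.linalg.qr(A)                       # Q: 10000x20, R: 20x20
U_R, Sigma_R, V_Rt = np.linalg.svd(R)        # SVD of the tiny 20x20 factor
U = Q @ U_R                                  # left singular vectors of A

print(np.allclose((U * Sigma_R) @ V_Rt, A))  # True
```

The expensive part, the QR of the tall matrix, is exactly what the TSQR slide below parallelizes; the SVD itself touches only a tiny square matrix.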
18. Intro to MapReduce. Originated at Google for indexing web pages and computing PageRank. The idea: bring the computations to the data. Express algorithms in data-local operations, and implement one type of communication, the shuffle, which moves all data with the same key to the same reducer. The result is data-scalable, with fault tolerance by design: input is stored in triplicate, reduce input/output lives on disk, and map output is persisted to disk before the shuffle.
19. MapReduce TSQR summary: MapReduce is great for TSQR! Data: a tall-and-skinny (TS) matrix, stored by rows. Map: QR factorization of local rows. Reduce: QR factorization of local rows. Demmel et al. showed that this construction computes a QR factorization with minimal communication. Input: a 500,000,000-by-100 matrix, each record a 1-by-100 row, HDFS size 423.3 GB. Time to compute the norm of each column: 161 sec. Time to compute R in qr(A): 387 sec. (On a 64-node Hadoop cluster with 4×2 TB disks, one Core i7-920, and 12 GB RAM per node.)
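The map/reduce structure of TSQR can be sketched serially in NumPy (a stand-in for the cluster; the block count and matrix sizes are arbitrary choices of mine):

```python
import numpy as np

# Serial sketch of TSQR's structure: each "map task" QRs a block of rows
# and keeps only its small R factor; the "reduce task" QRs the stacked Rs.
rng = np.random.default_rng(2)
A = rng.standard_normal((4_000, 50))

# Map: local QR of each row block, keep only the 50x50 R factors
blocks = np.array_split(A, 8)
local_Rs = [np.linalg.qr(b)[1] for b in blocks]

# Reduce: QR of the stacked R factors gives R of the full matrix
R = np.linalg.qr(np.vstack(local_Rs))[1]

# Same R (up to the signs of its rows) as a direct QR of A
R_direct = np.linalg.qr(A)[1]
print(np.allclose(np.abs(R), np.abs(R_direct)))
```

Only the small R factors ever cross the network, which is why the construction achieves minimal communication.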
20. Key limitations. TSQR computes only R and not Q. We can get Q via $Q = A R^{+}$ with another MapReduce iteration (we currently use this for computing the SVD), but the result is not numerically orthogonal; iterative refinement helps. We are working on better ways to compute Q (with Austin Benson and Jim Demmel).
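A small NumPy sketch of the $Q = A R^{+}$ reconstruction and one refinement step; the ill-conditioned test matrix is my own construction, chosen so the loss of orthogonality is actually visible:

```python
import numpy as np

# Recover Q from R via Q = A R^{-1}, as the extra MapReduce pass does,
# then re-orthogonalize once.  The ill-conditioned A (cond ~ 1e10) is
# constructed so the loss of orthogonality shows up.
rng = np.random.default_rng(3)
U0 = np.linalg.qr(rng.standard_normal((5_000, 30)))[0]
V0 = np.linalg.qr(rng.standard_normal((30, 30)))[0]
A = (U0 * np.logspace(0, -10, 30)) @ V0.T

R = np.linalg.qr(A)[1]                    # pretend this R came from TSQR
Q = A @ np.linalg.inv(R)                  # Q = A R^+; R is square, full rank
err0 = np.linalg.norm(Q.T @ Q - np.eye(30))

# One step of refinement: factor Q again and absorb its R factor
Q2 = Q @ np.linalg.inv(np.linalg.qr(Q)[1])
err1 = np.linalg.norm(Q2.T @ Q2 - np.eye(30))
print(err0, err1)
```

The first error grows with the conditioning of A; a single refinement pass drives it back toward machine precision.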
21. Our vision! To enable analysts and engineers to hypothesize from data computations instead of expensive HPC computations. (Pictured: Paul G. Constantine.)