
# The power and Arnoldi methods in an algebra of circulants

My talk from the CCAM seminar on April 19th on our NLA paper with Chen Greif and Jim Varah (http://dx.doi.org/10.1002/nla.1845)

Published in: Technology, Education

### The power and Arnoldi methods in an algebra of circulants

1. The power and Arnoldi methods in an algebra of circulants. David F. Gleich, Computer Science, Purdue University. CCAM Seminar, April 19th, 2013. In collaboration with Chen Greif and Jim Varah (UBC). Supported by a research grant from NSERC and the Sandia National Labs John von Neumann fellowship.
2. Introduction. Kilmer, Martin, and Perrone (2008) presented a circulant algebra: a set of operations that generalize matrix algebra to three-way data, and provided an SVD. The essence of this approach amounts to viewing three-dimensional objects as two-dimensional arrays (i.e., matrices) of one-dimensional arrays (i.e., vectors). Braman (2010) developed spectral and other decompositions. We have extended this algebra with the ingredients required for iterative methods such as the power method and the Arnoldi method, and have characterized the behavior of these algorithms.
3. A look at the power method

    - Require $A$, $x^{(0)}$, $\tau$
    - $x^{(0)} \leftarrow x^{(0)} \|x^{(0)}\|^{-1}$
    - for $k = 1, \ldots,$ until convergence
        - $y^{(k)} \leftarrow A x^{(k-1)}$
        - $\alpha^{(k)} \leftarrow \|y^{(k)}\|$
        - $x^{(k)} \leftarrow y^{(k)} (\alpha^{(k)})^{-1}$
        - if $\|\operatorname{sign}(x_1^{(k)})\, x^{(k)} - \operatorname{sign}(x_1^{(k-1)})\, x^{(k-1)}\| < \tau$, return $x^{(k)}$
    - end for

    Requires a scalar inverse, a norm, an absolute value, ...
4. Three-way arrays. Given an $m \times n \times k$ table of data, we view this data as an $m \times n$ matrix where each "scalar" is a vector of length $k$: $A \in \mathbb{K}_k^{m \times n}$. We denote the space of length-$k$ scalars as $\mathbb{K}_k$. These scalars interact like circulant matrices.
5. Circulants. Circulant matrices are a commutative, closed class under the standard matrix operations.

    $$\begin{bmatrix} \alpha_1 & \alpha_k & \cdots & \alpha_2 \\ \alpha_2 & \alpha_1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \alpha_k \\ \alpha_k & \cdots & \alpha_2 & \alpha_1 \end{bmatrix}$$

    We'll see more of their properties shortly!
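6. The circ operation. We denote the space of length-$k$ scalars as $\mathbb{K}_k$. These scalars interact like circulant matrices. For $\alpha = \{\alpha_1 \; \ldots \; \alpha_k\} \in \mathbb{K}_k$,

    $$\alpha \leftrightarrow \operatorname{circ}(\alpha) \equiv \begin{bmatrix} \alpha_1 & \alpha_k & \cdots & \alpha_2 \\ \alpha_2 & \alpha_1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \alpha_k \\ \alpha_k & \cdots & \alpha_2 & \alpha_1 \end{bmatrix}.$$

    Addition and multiplication follow the matrix operations: $\alpha + \beta \leftrightarrow \operatorname{circ}(\alpha) + \operatorname{circ}(\beta)$ and $\alpha \circ \beta \leftrightarrow \operatorname{circ}(\alpha)\operatorname{circ}(\beta)$. The additive and multiplicative identities are $0 = \{0 \; 0 \; \ldots \; 0\}$ and $1 = \{1 \; 0 \; \ldots \; 0\}$. $\mathbb{K}_k$ is the ring of length-$k$ circulants.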
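The correspondence between scalars in the ring and circulant matrices can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `circ` and `circ_mul` are illustrative and not from the talk.

```python
import numpy as np

def circ(alpha):
    """Return the k-by-k circulant matrix whose first column is alpha."""
    alpha = np.asarray(alpha)
    k = alpha.shape[0]
    # Entry (i, j) is alpha[(i - j) mod k], matching the layout on the slide.
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return alpha[idx]

def circ_mul(alpha, beta):
    """alpha o beta, read off as the first column of circ(alpha) @ circ(beta)."""
    return (circ(alpha) @ circ(beta))[:, 0]

alpha = np.array([2.0, 3.0, 1.0])
beta = np.array([8.0, -2.0, 0.0])
one = np.array([1.0, 0.0, 0.0])  # multiplicative identity {1 0 ... 0}

prod = circ_mul(alpha, beta)
print(circ(alpha))
print(prod)
```

Because the product of two circulants is again a circulant, storing only the first column loses no information, which is why a length-k vector is enough to represent each scalar.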
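7. The circ operation on matrices

    $$A \circ x = \begin{bmatrix} \sum_{j=1}^{n} A_{1,j} \circ x_j \\ \vdots \\ \sum_{j=1}^{n} A_{m,j} \circ x_j \end{bmatrix} \leftrightarrow \begin{bmatrix} \operatorname{circ}(A_{1,1}) & \cdots & \operatorname{circ}(A_{1,n}) \\ \vdots & \ddots & \vdots \\ \operatorname{circ}(A_{m,1}) & \cdots & \operatorname{circ}(A_{m,n}) \end{bmatrix} \begin{bmatrix} \operatorname{circ}(x_1) \\ \vdots \\ \operatorname{circ}(x_n) \end{bmatrix}.$$

    Define

    $$\operatorname{circ}(A) \equiv \begin{bmatrix} \operatorname{circ}(A_{1,1}) & \cdots & \operatorname{circ}(A_{1,n}) \\ \vdots & \ddots & \vdots \\ \operatorname{circ}(A_{m,1}) & \cdots & \operatorname{circ}(A_{m,n}) \end{bmatrix}, \qquad \operatorname{circ}(x) \equiv \begin{bmatrix} \operatorname{circ}(x_1) \\ \vdots \\ \operatorname{circ}(x_n) \end{bmatrix}.$$

    Then $A \circ x \leftrightarrow \operatorname{circ}(A)\operatorname{circ}(x)$ gives matrix-vector products and $x \circ \alpha \leftrightarrow \operatorname{circ}(x)\operatorname{circ}(\alpha)$ gives vector-scalar products. This is equivalent to Kilmer, Martin, and Perrone (2008).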
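8. A look at the power method

    - Require $A$, $x^{(0)}$, $\tau$ $\longrightarrow$ $A$, $x^{(0)}$, $\tau$
    - $x^{(0)} \leftarrow x^{(0)} \|x^{(0)}\|^{-1}$ $\longrightarrow$ $x^{(0)} \circ \|x^{(0)}\|^{-1} \leftrightarrow \operatorname{circ}(x^{(0)}) \operatorname{circ}(\|x^{(0)}\|)^{-1}$
    - for $k = 1, \ldots,$ until convergence
        - $y^{(k)} \leftarrow A x^{(k-1)}$ $\longrightarrow$ $y^{(k)} \leftarrow A \circ x^{(k-1)} \leftrightarrow \operatorname{circ}(A) \operatorname{circ}(x^{(k-1)})$
        - $\alpha^{(k)} \leftarrow \|y^{(k)}\|$ $\longrightarrow$ $\ldots$
        - $x^{(k)} \leftarrow y^{(k)} (\alpha^{(k)})^{-1}$
        - if $\|\operatorname{sign}(x_1^{(k)})\, x^{(k)} - \operatorname{sign}(x_1^{(k-1)})\, x^{(k-1)}\| < \tau$, return $x^{(k)}$
    - end for

    Requires a scalar inverse, a norm (?), an absolute value (?), ...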
9. Circulants and Fourier transforms. Let $C$ be a $k \times k$ circulant matrix. Then the eigenvector matrix of $C$ is given by the $k \times k$ discrete Fourier transform matrix $F$, where $F_{ij} = \frac{1}{\sqrt{k}}\,\omega^{(i-1)(j-1)}$ and $\omega = e^{2\pi\iota/k}$. This matrix is complex symmetric, $F^T = F$, and unitary, $F^* = F^{-1}$. Thus, $C = F D F^*$ with $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_k)$.

    - Multiplying a vector by $F$ or $F^*$ can be accomplished via the fast Fourier transform in $O(k \log k)$ time instead of $O(k^2)$ for the typical matrix-vector product algorithm.
    - Computing the matrix $D$ can be done in $O(k \log k)$ time as well: d = fft(a).
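The diagonalization on this slide can be verified directly. This is a small sketch, assuming NumPy and assuming that `a` in d = fft(a) denotes the first column of the circulant; `np.fft.fft` uses the negative-exponent convention, which pairs with the $F$ defined above.

```python
import numpy as np

def circ(alpha):
    """k-by-k circulant matrix whose first column is alpha."""
    alpha = np.asarray(alpha)
    k = alpha.shape[0]
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return alpha[idx]

k = 3
a = np.array([2.0, 3.0, 1.0])
C = circ(a)

# Unitary DFT matrix with entries omega^((i-1)(j-1)) / sqrt(k), omega = exp(2*pi*i/k).
idx = np.arange(k)
F = np.exp(2j * np.pi * np.outer(idx, idx) / k) / np.sqrt(k)

D = F.conj().T @ C @ F   # dense O(k^3) route to the eigenvalues
d = np.fft.fft(a)        # O(k log k) route: FFT of the first column

print(np.round(D, 12))
print(d)
```

The dense product is only there as a check; in practice the FFT of the first column is all that is needed.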
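10. cft and icft. We define the "Circulant Fourier Transform", or cft, $\operatorname{cft} : \alpha \in \mathbb{K}_k \to \mathbb{C}^{k \times k}$, and its inverse, $\operatorname{icft} : \mathbb{C}^{k \times k} \to \mathbb{K}_k$, as follows:

    $$\operatorname{cft}(\alpha) \equiv \begin{bmatrix} \hat{\alpha}_1 & & \\ & \ddots & \\ & & \hat{\alpha}_k \end{bmatrix} = F^* \operatorname{circ}(\alpha) F, \qquad \operatorname{icft}\left( \begin{bmatrix} \hat{\alpha}_1 & & \\ & \ddots & \\ & & \hat{\alpha}_k \end{bmatrix} \right) \equiv \alpha \leftrightarrow F \operatorname{cft}(\alpha) F^*,$$

    where the $\hat{\alpha}_j$ are the eigenvalues of $\operatorname{circ}(\alpha)$ as produced in the Fourier transform. These transformations satisfy $\operatorname{icft}(\operatorname{cft}(\alpha)) = \alpha$ and provide a convenient way of moving between operations in $\mathbb{K}_k$ and the more familiar environment of diagonal matrices in $\mathbb{C}^{k \times k}$.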
11. Operations. Let $\alpha, \beta \in \mathbb{K}_k$. Note that $\alpha + \beta = \operatorname{icft}(\operatorname{cft}(\alpha) + \operatorname{cft}(\beta))$ and $\alpha \circ \beta = \operatorname{icft}(\operatorname{cft}(\alpha)\operatorname{cft}(\beta))$. In the Fourier space (the output of the cft operation), these operations both take $O(k)$ time because they occur between diagonal matrices. These simplifications generalize to matrix-based operations too. For example, $A \circ x = \operatorname{icft}(\operatorname{cft}(A)\operatorname{cft}(x))$.
12. Operations (cont.) In the Fourier space, this system is a series of independent matrix-vector products:

    $$\operatorname{cft}(A)\operatorname{cft}(x) = \begin{bmatrix} \hat{A}_1 & & \\ & \ddots & \\ & & \hat{A}_k \end{bmatrix} \begin{bmatrix} \hat{x}_1 \\ \vdots \\ \hat{x}_k \end{bmatrix} = \begin{bmatrix} \hat{A}_1 \hat{x}_1 \\ \vdots \\ \hat{A}_k \hat{x}_k \end{bmatrix}.$$

    We use $\hat{A}_j$ and $\hat{x}_j$ to denote the blocks of Fourier coefficients, or equivalently, circulant eigenvalues. This formulation takes $\underbrace{O(mnk \log k + nk \log k)}_{\text{cft and icft}} + \underbrace{O(kmn)}_{\text{matvecs}}$ operations instead of $O(mnk^2)$ using the circ formulation.
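The two routes to $A \circ x$ can be compared in a few lines. This sketch assumes NumPy and a storage layout chosen here for illustration: $A$ is an array of shape (m, n, k) and $x$ an array of shape (n, k), so the cft is an FFT along the last axis. The function names are hypothetical.

```python
import numpy as np

def circ(alpha):
    """k-by-k circulant matrix whose first column is alpha."""
    alpha = np.asarray(alpha)
    k = alpha.shape[0]
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return alpha[idx]

def circ_matvec_fft(A, x):
    """A o x computed as k independent matrix-vector products in Fourier space."""
    A_hat = np.fft.fft(A, axis=-1)                  # shape (m, n, k)
    x_hat = np.fft.fft(x, axis=-1)                  # shape (n, k)
    y_hat = np.einsum("ijl,jl->il", A_hat, x_hat)   # A_hat_l @ x_hat_l for each slice l
    return np.fft.ifft(y_hat, axis=-1)

def circ_matvec_dense(A, x):
    """A o x computed from the mk-by-nk matrix circ(A); O(m n k^2) work."""
    m, n, k = A.shape
    big = np.block([[circ(A[i, j]) for j in range(n)] for i in range(m)])
    # The first column of circ(A) circ(x) is circ(A) applied to the stacked x_j.
    return (big @ x.reshape(n * k)).reshape(m, k)

A = np.array([[[2, 3, 1], [8, -2, 0]],
              [[-2, 0, 2], [3, 1, 1]]], dtype=float)
x = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)

y_fft = circ_matvec_fft(A, x)
y_dense = circ_matvec_dense(A, x)
print(y_dense)
```

The einsum call is the "series of independent matrix-vector products" from the slide, one per Fourier slice.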
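13. Operations (cont.) More operations are simplified in the Fourier space too. Let $\operatorname{cft}(\alpha) = \operatorname{diag}[\hat{\alpha}_1, \ldots, \hat{\alpha}_k]$. Because the $\hat{\alpha}_j$ values are the eigenvalues of $\operatorname{circ}(\alpha)$, we have:

    - $\operatorname{abs}(\alpha) = \operatorname{icft}(\operatorname{diag}[|\hat{\alpha}_1|, \ldots, |\hat{\alpha}_k|])$,
    - $\overline{\alpha} = \operatorname{icft}(\operatorname{diag}[\overline{\hat{\alpha}_1}, \ldots, \overline{\hat{\alpha}_k}]) = \operatorname{icft}(\operatorname{cft}(\alpha)^*)$, and
    - $\operatorname{angle}(\alpha) = \operatorname{icft}(\operatorname{diag}[\hat{\alpha}_1/|\hat{\alpha}_1|, \ldots, \hat{\alpha}_k/|\hat{\alpha}_k|])$.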
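These three definitions translate almost literally into FFT calls. The sketch below assumes NumPy, stores only the diagonal of each cft as a length-k vector, and uses hypothetical helper names (`c_abs`, `c_conj`, `c_angle`, `c_mul`). The angle is only defined when every Fourier coefficient is nonzero.

```python
import numpy as np

def c_mul(alpha, beta):
    """alpha o beta as an elementwise product of Fourier coefficients."""
    return np.fft.ifft(np.fft.fft(alpha) * np.fft.fft(beta))

def c_abs(alpha):
    """abs(alpha): absolute value of each Fourier coefficient."""
    return np.fft.ifft(np.abs(np.fft.fft(alpha)))

def c_conj(alpha):
    """conj(alpha): conjugate of each Fourier coefficient."""
    return np.fft.ifft(np.conj(np.fft.fft(alpha)))

def c_angle(alpha):
    """angle(alpha): unit-modulus Fourier coefficients (requires all nonzero)."""
    a_hat = np.fft.fft(alpha)
    return np.fft.ifft(a_hat / np.abs(a_hat))

alpha = np.array([2.0, 3.0, 1.0])
print(c_abs(alpha))
print(c_conj(alpha))
print(c_mul(c_angle(alpha), c_abs(alpha)))  # recovers alpha
```

For a real-valued scalar, the conjugate corresponds to the transpose of the circulant, so its first column is the first row of the original.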
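14. Decompositional interpretation of cft. Algebraically, the cft operation for a matrix $A \in \mathbb{K}_k^{m \times n}$ is

    $$\operatorname{cft}(A) = P_m (I_m \otimes F^*) \operatorname{circ}(A) (I_n \otimes F) P_n^T,$$

    where $P_m$ and $P_n$ are permutation matrices. We can equivalently write this directly in terms of the eigenvalues of each of the circulant blocks of $\operatorname{circ}(A)$:

    $$\operatorname{cft}(A) \equiv \begin{bmatrix} \hat{A}_1 & & \\ & \ddots & \\ & & \hat{A}_k \end{bmatrix}, \qquad \hat{A}_j = \begin{bmatrix} \lambda_j^{1,1} & \cdots & \lambda_j^{1,n} \\ \vdots & \ddots & \vdots \\ \lambda_j^{m,1} & \cdots & \lambda_j^{m,n} \end{bmatrix},$$

    where $\lambda_1^{r,s}, \ldots, \lambda_k^{r,s}$ are the diagonal elements of $\operatorname{cft}(A_{r,s})$. The inverse operation, icft, takes a block diagonal matrix and returns the matrix in $\mathbb{K}_k^{m \times n}$:

    $$\operatorname{icft}(\mathbf{A}) \leftrightarrow (I_m \otimes F) P_m^T \mathbf{A} P_n (I_n \otimes F^*).$$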
15. Back to figure
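16. Example. Let $A = \begin{bmatrix} \{2 \; 3 \; 1\} & \{8 \; {-2} \; 0\} \\ \{-2 \; 0 \; 2\} & \{3 \; 1 \; 1\} \end{bmatrix}$. The results of the circ and cft operations are:

    $$\operatorname{circ}(A) = \left[\begin{array}{rrr|rrr} 2 & 1 & 3 & 8 & 0 & -2 \\ 3 & 2 & 1 & -2 & 8 & 0 \\ 1 & 3 & 2 & 0 & -2 & 8 \\ \hline -2 & 2 & 0 & 3 & 1 & 1 \\ 0 & -2 & 2 & 1 & 3 & 1 \\ 2 & 0 & -2 & 1 & 1 & 3 \end{array}\right],$$

    $$(I \otimes F^*)\operatorname{circ}(A)(I \otimes F) = \left[\begin{array}{ccc|ccc} 6 & & & 6 & & \\ & -\sqrt{3}\iota & & & 9 + \sqrt{3}\iota & \\ & & \sqrt{3}\iota & & & 9 - \sqrt{3}\iota \\ \hline 0 & & & 5 & & \\ & -3 + \sqrt{3}\iota & & & 2 & \\ & & -3 - \sqrt{3}\iota & & & 2 \end{array}\right],$$

    $$\operatorname{cft}(A) = \left[\begin{array}{cc|cc|cc} 6 & 6 & & & & \\ 0 & 5 & & & & \\ \hline & & -\sqrt{3}\iota & 9 + \sqrt{3}\iota & & \\ & & -3 + \sqrt{3}\iota & 2 & & \\ \hline & & & & \sqrt{3}\iota & 9 - \sqrt{3}\iota \\ & & & & -3 - \sqrt{3}\iota & 2 \end{array}\right].$$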
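The blocks $\hat{A}_1, \hat{A}_2, \hat{A}_3$ of this example can be reproduced with a single FFT along the third axis. This is a sketch assuming NumPy and the (m, n, k) array layout, which is a storage choice made here for illustration.

```python
import numpy as np

# A[i, j] holds the length-3 scalar in row i, column j.
A = np.array([[[2, 3, 1], [8, -2, 0]],
              [[-2, 0, 2], [3, 1, 1]]], dtype=float)

A_hat = np.fft.fft(A, axis=-1)                    # one FFT per scalar entry
blocks = [A_hat[:, :, j] for j in range(A.shape[-1])]

for j, block in enumerate(blocks, start=1):
    print(f"A_hat_{j} =")
    print(np.round(block, 4))
```

Since the entries of $A$ are real, the second and third blocks are complex conjugates of each other.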
17. A look at the power method (the algorithm of slide 3, repeated). Requires a scalar inverse, a norm, an absolute value, ...
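18. These operations can now be straightforwardly defined

    - Inverse of a scalar: $\alpha^{-1} \leftrightarrow \operatorname{circ}(\alpha)^{-1}$.
    - More generally, function of a scalar: $f(\alpha) \leftrightarrow f(\operatorname{circ}(\alpha))$.
    - Angle: $\operatorname{angle}(x)\,|x| = x$, so $\operatorname{angle}(\alpha) \leftrightarrow \operatorname{circ}(\operatorname{abs}(\alpha))^{-1}\operatorname{circ}(\alpha)$.
    - The norm of a vector in $\mathbb{K}_k^n$ produces a scalar in $\mathbb{K}_k$: $\|x\| \leftrightarrow (\operatorname{circ}(x)^*\operatorname{circ}(x))^{1/2} = \left(\sum_{i=1}^{n} \operatorname{circ}(x_i)^*\operatorname{circ}(x_i)\right)^{1/2}$.
    - Inner product: $\langle x, y \rangle \leftrightarrow \operatorname{circ}(y)^*\operatorname{circ}(x)$.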
19. Example. Run the power method on $\begin{bmatrix} \{2 \; 3 \; 1\} & \{0 \; 0 \; 0\} \\ \{0 \; 0 \; 0\} & \{3 \; 1 \; 1\} \end{bmatrix}$. Result: (shown on the next slide).
20. Example. Run the power method on $\begin{bmatrix} \{2 \; 3 \; 1\} & \{0 \; 0 \; 0\} \\ \{0 \; 0 \; 0\} & \{3 \; 1 \; 1\} \end{bmatrix}$. Result: $\lambda = (1/3)\{10 \; 4 \; 4\}$.
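The run on this slide can be imitated with a short script. This is a sketch under several assumptions rather than the authors' implementation: it uses NumPy, carries out every operation in Fourier space, replaces the sign normalization by a per-slice phase fix on the largest entry (`fix_phase`, a choice made here so that the test is well defined even when the first component tends to zero), and reads the eigenvalue off a Rayleigh quotient. All function names are hypothetical.

```python
import numpy as np

def cft(a):
    return np.fft.fft(a, axis=-1)

def icft(a_hat):
    return np.fft.ifft(a_hat, axis=-1)

def slice_norm(v_hat):
    """cft of the circulant-algebra norm: one Euclidean norm per Fourier slice."""
    return np.sqrt(np.sum(np.abs(v_hat) ** 2, axis=0))

def fix_phase(v_hat):
    """Rotate each slice so its largest-magnitude entry is real and positive."""
    idx = np.argmax(np.abs(v_hat), axis=0)
    pivot = v_hat[idx, np.arange(v_hat.shape[1])]
    return v_hat * (np.abs(pivot) / pivot)

def circ_power_method(A, x0, tol=1e-12, maxit=1000):
    A_hat = cft(A)                       # shape (n, n, k)
    x_hat = cft(x0)                      # shape (n, k)
    x_hat = x_hat / slice_norm(x_hat)    # x <- x o ||x||^{-1}
    for _ in range(maxit):
        y_hat = np.einsum("ijl,jl->il", A_hat, x_hat)   # y <- A o x
        x_new = y_hat / slice_norm(y_hat)               # x <- y o ||y||^{-1}
        change = slice_norm(fix_phase(x_new) - fix_phase(x_hat))
        x_hat = x_new
        if np.all(change < tol):         # "<" is taken elementwise in Fourier space
            break
    # Rayleigh quotient per slice gives the Fourier coefficients of lambda.
    lam_hat = np.einsum("il,ijl,jl->l", x_hat.conj(), A_hat, x_hat)
    return icft(lam_hat), icft(fix_phase(x_hat))

A = np.zeros((2, 2, 3))
A[0, 0] = [2, 3, 1]
A[1, 1] = [3, 1, 1]
x0 = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

lam, x = circ_power_method(A, x0)
print(np.round(lam.real, 6))
print(np.round(x.real, 6))
```

The starting vector has a nonzero component along the dominant eigenvector in every Fourier slice, which is what each of the independent power iterations needs in order to converge.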
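21. Example. Let $A = \begin{bmatrix} \{2 \; 3 \; 1\} & \{0 \; 0 \; 0\} \\ \{0 \; 0 \; 0\} & \{3 \; 1 \; 1\} \end{bmatrix}$, so that

    $$\hat{A}_1 = \begin{bmatrix} 6 & 0 \\ 0 & 5 \end{bmatrix}, \quad \hat{A}_2 = \begin{bmatrix} -\iota\sqrt{3} & 0 \\ 0 & 2 \end{bmatrix}, \quad \hat{A}_3 = \begin{bmatrix} \iota\sqrt{3} & 0 \\ 0 & 2 \end{bmatrix}.$$

    Four of the eigenvalues are

    - $\lambda_1 = \operatorname{icft}(\operatorname{diag}[6 \;\; 2 \;\; 2]) = (1/3)\{10 \; 4 \; 4\}$
    - $\lambda_2 = \operatorname{icft}(\operatorname{diag}[5 \;\; {-\iota\sqrt{3}} \;\; \iota\sqrt{3}]) = (1/3)\{5 \; 8 \; 2\}$
    - $\lambda_3 = \operatorname{icft}(\operatorname{diag}[6 \;\; {-\iota\sqrt{3}} \;\; \iota\sqrt{3}]) = \{2 \; 3 \; 1\}$
    - $\lambda_4 = \operatorname{icft}(\operatorname{diag}[5 \;\; 2 \;\; 2]) = \{3 \; 1 \; 1\}$.

    The corresponding eigenvectors are

    $$x_1 = \begin{bmatrix} \{1/3 \;\; 1/3 \;\; 1/3\} \\ \{2/3 \;\; {-1/3} \;\; {-1/3}\} \end{bmatrix}; \quad x_2 = \begin{bmatrix} \{2/3 \;\; {-1/3} \;\; {-1/3}\} \\ \{1/3 \;\; 1/3 \;\; 1/3\} \end{bmatrix}; \quad x_3 = \begin{bmatrix} \{1 \; 0 \; 0\} \\ \{0 \; 0 \; 0\} \end{bmatrix}; \quad x_4 = \begin{bmatrix} \{0 \; 0 \; 0\} \\ \{1 \; 0 \; 0\} \end{bmatrix}.$$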
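22. Canonical set. There are more eigenvalues:

    - $\lambda_5 = \operatorname{icft}(\operatorname{diag}[6 \;\; {-\iota\sqrt{3}} \;\; 2])$, $\lambda_6 = \operatorname{icft}(\operatorname{diag}[6 \;\; 2 \;\; \iota\sqrt{3}])$,
    - $\lambda_7 = \operatorname{icft}(\operatorname{diag}[5 \;\; {-\iota\sqrt{3}} \;\; 2])$, $\lambda_8 = \operatorname{icft}(\operatorname{diag}[5 \;\; 2 \;\; \iota\sqrt{3}])$.

    Altogether there is a polynomial number of eigenvalues, which exceeds the dimension of the matrix.

    Definition. A canonical set of eigenvalues and eigenvectors is a set of minimum size, ordered such that $\operatorname{abs}(\lambda_1) \ge \operatorname{abs}(\lambda_2) \ge \ldots \ge \operatorname{abs}(\lambda_k)$, which contains the information to reproduce any eigenvalue or eigenvector of $A$.

    In this case, the only canonical set is $\{(\lambda_1, x_1), (\lambda_2, x_2)\}$. (We need two, and we have $\operatorname{abs}(\lambda_1) \ge \operatorname{abs}(\lambda_2)$.)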
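One way to see where the extra eigenvalues come from is to pick one eigenvalue of each $\hat{A}_j$ independently and map each combination back with icft. The sketch below does this for the 2-by-2 example, assuming NumPy and the (n, n, k) array layout used for illustration; the variable names are hypothetical.

```python
import itertools
import numpy as np

A = np.zeros((2, 2, 3))
A[0, 0] = [2, 3, 1]
A[1, 1] = [3, 1, 1]

A_hat = np.fft.fft(A, axis=-1)
k = A.shape[-1]

# Ordinary eigenvalues of each Fourier slice A_hat_j.
slice_eigs = [np.linalg.eigvals(A_hat[:, :, j]) for j in range(k)]

# Every choice of one eigenvalue per slice is the cft of an eigenvalue in K_k.
eigenvalues = [np.fft.ifft(np.array(choice))
               for choice in itertools.product(*slice_eigs)]

print(len(eigenvalues))
for lam in eigenvalues:
    print(np.round(lam, 4))
```

With two eigenvalues in each of three slices there are 2 * 2 * 2 = 8 combinations, which matches the eight eigenvalues counted on these slides; only some of them are real-valued.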
23. Keeping it real... Let $A \in \mathbb{K}_k^{n \times n}$ be real-valued with diagonalizable $\hat{A}_j$ matrices. If $k$ is odd, then the eigendecomposition $X \circ \Lambda \circ X^{-1}$ is real-valued if and only if $\hat{A}_1$ has real-valued eigenvalues. If $k$ is even, then $X \circ \Lambda \circ X^{-1}$ is real-valued if and only if $\hat{A}_1$ and $\hat{A}_{k/2+1}$ have real-valued eigenvalues.
24. The power method converges. Let $A \in \mathbb{K}_k^{n \times n}$ have a canonical set of eigenvalues $\lambda_1, \ldots, \lambda_n$ where $|\lambda_1| > |\lambda_2|$. Then the power method in the circulant algebra converges to an eigenvector $x_1$ with eigenvalue $\lambda_1$. Here we use the ordering $\alpha < \beta \leftrightarrow \operatorname{cft}(\alpha) < \operatorname{cft}(\beta)$ elementwise.
25. The Arnoldi process. Let $A$ be an $n \times n$ matrix with real-valued entries. Then the Arnoldi method is a technique to build an orthogonal basis for the Krylov subspace $\mathcal{K}_t(A, v) = \operatorname{span}\{v, Av, \ldots, A^{t-1}v\}$, where $v$ is an initial vector. We have the decomposition $A Q_t = Q_{t+1} H_{t+1,t}$, where $Q_t$ is an $n \times t$ matrix and $H_{t+1,t}$ is a $(t+1) \times t$ upper Hessenberg matrix.

    - Using our repertoire of operations, the Arnoldi method in the circulant algebra is equivalent to individual Arnoldi processes on each matrix $\hat{A}_j$.
    - It is equivalent to a block Arnoldi process.
    - Using the cft and icft operations, we produce an Arnoldi factorization: $A \circ Q_t = Q_{t+1} \circ H_{t+1,t}$.
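The equivalence with independent Arnoldi processes suggests a direct implementation: run a standard Arnoldi iteration on every Fourier slice and stack the results. The following is a sketch of that idea, assuming NumPy, a random real test problem, and modified Gram-Schmidt orthogonalization; the function names `arnoldi` and `circ_arnoldi` are hypothetical, and no breakdown handling is included.

```python
import numpy as np

def arnoldi(M, v, t):
    """Standard t-step Arnoldi (modified Gram-Schmidt) for a complex matrix M."""
    n = v.shape[0]
    Q = np.zeros((n, t + 1), dtype=complex)
    H = np.zeros((t + 1, t), dtype=complex)
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(t):
        w = M @ Q[:, j]
        for i in range(j + 1):
            H[i, j] = np.vdot(Q[:, i], w)   # conjugates the first argument
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    return Q, H

def circ_arnoldi(A, v, t):
    """Arnoldi in the circulant algebra: one ordinary Arnoldi run per Fourier slice."""
    A_hat = np.fft.fft(A, axis=-1)
    v_hat = np.fft.fft(v, axis=-1)
    parts = [arnoldi(A_hat[:, :, j], v_hat[:, j], t) for j in range(A.shape[-1])]
    Q_hat = np.stack([Q for Q, _ in parts], axis=-1)   # shape (n, t+1, k)
    H_hat = np.stack([H for _, H in parts], axis=-1)   # shape (t+1, t, k)
    return A_hat, Q_hat, H_hat

rng = np.random.default_rng(0)
n, k, t = 4, 3, 2
A = rng.standard_normal((n, n, k))
v = rng.standard_normal((n, k))

A_hat, Q_hat, H_hat = circ_arnoldi(A, v, t)

# Check A o Q_t = Q_{t+1} o H_{t+1,t} slice by slice in Fourier space.
lhs = np.einsum("ijl,jsl->isl", A_hat, Q_hat[:, :t, :])
rhs = np.einsum("isl,stl->itl", Q_hat, H_hat)
residual = np.max(np.abs(lhs - rhs))

Q = np.fft.ifft(Q_hat, axis=-1)   # icft: the basis as elements of K_k
print(residual, np.max(np.abs(Q.imag)))
```

For real input the Fourier slices come in conjugate pairs, so the basis returned by icft is real up to rounding.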
26. Example. Consider

    $$-\Delta u(x, y) = f(x, y), \qquad u(x, 0) = u(x, 1), \quad u(0, y) = u(1, y) = 0,$$

    for $(x, y) \in [0, 1] \times [0, 1]$ with a uniform mesh and the standard 5-point discrete Laplacian:

    $$-\Delta u(x_i, y_j) \approx -u(x_{i-1}, y_j) - u(x_i, y_{j-1}) + 4u(x_i, y_j) - u(x_{i+1}, y_j) - u(x_i, y_{j+1}).$$

    After applying the boundary conditions and organizing the unknowns of $u$ in $y$-major order, an approximate solution $u$ is given by solving an $N(N-1) \times N(N-1)$ block-tridiagonal, circulant-block system.
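27. The Linear System

    $$\underbrace{\begin{bmatrix} C & -I & & \\ -I & C & \ddots & \\ & \ddots & \ddots & -I \\ & & -I & C \end{bmatrix}}_{\mathbf{A}} \underbrace{\begin{bmatrix} u(x_1, \cdot) \\ u(x_2, \cdot) \\ \vdots \\ u(x_{N-1}, \cdot) \end{bmatrix}}_{\mathbf{u}} = \underbrace{\begin{bmatrix} f(x_1, \cdot) \\ f(x_2, \cdot) \\ \vdots \\ f(x_{N-1}, \cdot) \end{bmatrix}}_{\mathbf{f}}, \qquad C = \begin{bmatrix} 4 & -1 & & -1 \\ -1 & 4 & \ddots & \\ & \ddots & \ddots & -1 \\ -1 & & -1 & 4 \end{bmatrix}_{N \times N}.$$

    That is, $\mathbf{A}\mathbf{u} = \mathbf{f}$, or $A \circ u = f$, where $A$ is an $(N-1) \times (N-1)$ matrix of $\mathbb{K}_N$ elements, $u$ and $f$ have compatible sizes, and $\mathbf{A} = \operatorname{circ}(A)$, $\mathbf{u} = \operatorname{vec}(u)$, $\mathbf{f} = \operatorname{vec}(f)$.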
28. The canonical eigenvalues of $A$ are $\lambda_j = \{4 + 2\cos(j\pi/N), -1, 0, \ldots, 0, -1\}$. To see this result, let $\lambda(\mu) = \{\mu, -1, 0, \ldots, 0, -1\}$. Then

    $$(A - \lambda(\mu) \circ I) = \begin{bmatrix} (4 - \mu) \circ 1 & -1 \circ 1 & & \\ -1 \circ 1 & (4 - \mu) \circ 1 & \ddots & \\ & \ddots & \ddots & -1 \circ 1 \\ & & -1 \circ 1 & (4 - \mu) \circ 1 \end{bmatrix}.$$
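The stated eigenvalues can be checked against the Fourier slices of the discrete Laplacian. This sketch assumes NumPy, a small mesh size N = 6 picked only for illustration, and the (n, n, k) array layout; it compares the ordinary eigenvalues of each slice $\hat{A}_l$ with the Fourier coefficients of the claimed $\lambda_j$.

```python
import numpy as np

N = 6  # illustrative mesh size

# A is an (N-1) x (N-1) matrix of K_N elements, stored as an array of shape (N-1, N-1, N).
A = np.zeros((N - 1, N - 1, N))
for i in range(N - 1):
    A[i, i, 0] = 4.0       # diagonal scalar {4, -1, 0, ..., 0, -1} <-> block C
    A[i, i, 1] = -1.0
    A[i, i, -1] = -1.0
    if i + 1 < N - 1:
        A[i, i + 1, 0] = -1.0   # off-diagonal scalar {-1, 0, ..., 0} <-> block -I
        A[i + 1, i, 0] = -1.0

# Claimed canonical eigenvalues lambda_j = {4 + 2 cos(j pi / N), -1, 0, ..., 0, -1}.
j = np.arange(1, N)
lam = np.zeros((N - 1, N))
lam[:, 0] = 4.0 + 2.0 * np.cos(j * np.pi / N)
lam[:, 1] = -1.0
lam[:, -1] = -1.0

A_hat = np.fft.fft(A, axis=-1)
lam_hat = np.fft.fft(lam, axis=-1)

max_err = 0.0
for l in range(N):
    eigs = np.sort(np.linalg.eigvalsh(A_hat[:, :, l]))   # slices are Hermitian
    predicted = np.sort(lam_hat[:, l].real)
    max_err = max(max_err, np.max(np.abs(eigs - predicted)))

print(max_err)
```

Each slice is a tridiagonal Toeplitz matrix with a shifted diagonal, which is why the cosine formula for its eigenvalues applies slice by slice.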
29. [Figure: "Eigenvalue Error" and "Eigenvector Change" curves; x-axis: iteration; y-axis: magnitude. Fit lines are labeled $\left(\frac{2 + 2\cos(2\pi/n)}{2 + 2\cos(\pi/n)}\right)^{2i}$, $\left(\frac{6 + 2\cos(2\pi/n)}{6 + 2\cos(\pi/n)}\right)^{2i}$, and $\left(\frac{6 + 2\cos(2\pi/n)}{6 + 2\cos(\pi/n)}\right)^{i}$.] Figure: The convergence behavior of the power method in the circulant algebra. The gray lines show the error in each eigenvalue in Fourier space. These curves track the predictions made based on the eigenvalues as discussed in the text. The red line shows the magnitude of the change in the eigenvector. We use this as the stopping criterion. It also decays as predicted by the ratio of eigenvalues. The blue fit lines have been visually adjusted to match the behavior in the convergence tail.

    [Figure: "Absolute error" and "Residual magnitude" curves; x-axis: Arnoldi iteration; y-axis: magnitude.] Figure: The convergence behavior of a GMRES procedure using the circulant Arnoldi process. The gray lines show the error in each Fourier component and the red line shows the magnitude of the residual. We observe poor convergence in one Fourier component until the Arnoldi basis captures all of the eigenvalues after $N/2 + 1 = 26$ iterations.

    These results show how the two computations are performing individual power methods or Arnoldi processes in Fourier space.
30. The End. Paper available online from http://www.cs.ubc.ca/~greif: "The power and Arnoldi methods in an algebra of circulants", David Gleich, Chen Greif and Jim Varah. Thank you!