Among the many methods for solving linear systems, each technique has its own merits and drawbacks. For large systems it is particularly important to choose a method that reduces convergence time and produces accurate results. In this project we use the Steepest Descent (SD) method and the Conjugate Gradient (CG) method to solve high-dimensional linear systems. Other techniques exist, for example Gaussian elimination or Cholesky factorization, but these direct methods are unsuitable for very large systems, whereas SD and CG are well suited to sparse systems of linear equations. Among the available formulations, the normal equations are a convenient way to handle large non-invertible systems; although this formulation is more ill-conditioned, it works well with certain iterative methods. We therefore introduce preconditioning of the system so that the condition number is close to one. For minimizing convex quadratic functions, CG gives optimal results with fast convergence, so we apply the preconditioned Conjugate Gradient method with the normal equations to the modified system. In this study we compare three methods: least squares via SD, GMRES with Arnoldi's iteration, and PCGNE; PCGNE is the main focus of this project. We use plots and tables to identify the better method. Finally, we use the Block Conjugate Gradient algorithm for least squares (BCGLS) to obtain better and faster convergence.
2. Outline
• Linear systems and the least squares problem
• Arnoldi's iteration, the GMRES method, and its convergence
• Preconditioned Conjugate Gradient with the normal equations (PCGNE)
• Block Conjugate Gradient algorithms for least squares problems
3. Linear Systems & Sparse Matrix
[Figure: sparsity patterns of the coefficient matrix $A$, the unknown $x$, and the right-hand side $b$ in the system $Ax = b$]
Data source: Harwell-Boeing test matrix $A$, describing connections in a model of a diffraction column in a chemical plant; $A$ is not symmetric positive definite (SPD) [1]. Dimension of the data set: 479 × 479.
5. Least Squares Problem
• Minimize $\|r_n\|_2^2 = \|b - Ax_n\|_2^2$;
• let $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$. Take the initial guess $x_0 = 0$, so the initial residual is $r_0 = b$;
• the relative residual error is $RRV = \|r_n\| / \|b\|$.
• LS stopped at iteration 20 without converging to the desired tolerance 1e-06 because the maximum number of iterations was reached.
• The iterate returned (number 20) has relative residual 0.0017.
• Elapsed time is 1.922524 seconds.
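As a concrete companion to these bullets, here is a minimal MATLAB sketch of the steepest-descent least-squares iteration; the tolerance and iteration cap mirror the run reported above, and $A$, $b$ are assumed to be already loaded in the workspace.

    % Steepest descent for min ||b - A*x||_2^2 (a sketch; assumes A and b
    % are already loaded, e.g. from the Harwell-Boeing collection).
    tol = 1e-6; maxit = 20;
    x = zeros(size(A,2), 1);            % initial guess x0 = 0
    r = b;                              % initial residual r0 = b
    for k = 1:maxit
        g = A' * r;                     % negative gradient of 0.5*||b - A*x||^2
        Ag = A * g;
        alpha = (g' * g) / (Ag' * Ag);  % exact line search along g
        x = x + alpha * g;
        r = r - alpha * Ag;
        if norm(r) / norm(b) < tol      % relative residual test (RRV)
            break
        end
    end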
6. Arnoldi’s iteration and GMRES
• The order-$n$ Krylov subspace of $A$ generated by $b$ is
$K_n(A, b) = \mathrm{span}\{b, Ab, A^2 b, \dots, A^{n-1} b\}$ for $n = 1, 2, \dots$
If the vectors $b, Ab, A^2 b, \dots, A^{n-1} b$ are linearly independent, then they form a basis for $K_n(A, b)$. However, as $n \to \infty$, $A^n b$ tends toward the dominant eigenvector of $A$, which fills the basis with many nearly parallel vectors; Arnoldi's iteration avoids this by building an orthonormal basis $Q_n$ of $K_n(A, b)$.
• Finally, $\| \, \|b\| e_1 - \tilde{H}_n z \, \|_2^2 \to \min$.
• $\|r_n\|_2^2 = \|b - A x_n\|_2^2 = \|r_0 - A Q_n z\|_2^2 \to \min$, where $x_n = Q_n z$.
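For reference, a minimal MATLAB sketch of Arnoldi's iteration, which produces the orthonormal basis $Q$ and the $(n+1) \times n$ Hessenberg matrix $\tilde{H}_n$ appearing in the minimization above (the function name is ours):

    function [Q, H] = arnoldi_sketch(A, b, n)
    % Build an orthonormal basis Q of K_n(A,b) and the (n+1)-by-n
    % Hessenberg matrix H with A*Q(:,1:n) = Q*H (a sketch).
    Q = zeros(length(b), n+1);
    H = zeros(n+1, n);
    Q(:,1) = b / norm(b);
    for k = 1:n
        v = A * Q(:,k);                   % expand the Krylov subspace
        for j = 1:k                       % modified Gram-Schmidt
            H(j,k) = Q(:,j)' * v;
            v = v - H(j,k) * Q(:,j);
        end
        H(k+1,k) = norm(v);
        if H(k+1,k) == 0, break; end      % lucky breakdown
        Q(:,k+1) = v / H(k+1,k);
    end
    end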
[Figure: relative residual vs. iteration number (semilog scale, $10^{-8}$ to $10^{-2}$) for GMRES with no preconditioner and with an ILU preconditioner, together with the tolerance line]
• Solve the preconditioned system $M^{-1} A x = M^{-1} b$, with $M = LU$ from an incomplete LU factorization, by specifying $L$ and $U$ as inputs to GMRES.
• Elapsed time is 0.253678 seconds.
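A hedged MATLAB sketch of the comparison behind the figure above; the ILU options, the no-restart choice, and maxit are our assumptions, and $A$, $b$ are assumed to be already loaded (the 479 × 479 Harwell-Boeing matrix described earlier).

    % GMRES without and with an ILU preconditioner (a sketch).
    tol = 1e-6; maxit = 100;
    [~,~,~,~,rv0] = gmres(A, b, [], tol, maxit);            % no preconditioner
    [L,U] = ilu(A, struct('type','ilutp','droptol',1e-6));  % M = L*U (options assumed)
    [~,~,~,~,rv1] = gmres(A, b, [], tol, maxit, L, U);      % solves M^{-1}A x = M^{-1}b
    semilogy(0:numel(rv0)-1, rv0/norm(b), '-o'); hold on
    semilogy(0:numel(rv1)-1, rv1/norm(b), '-x')
    yline(tol, 'r--');                                      % tolerance line (R2018b+)
    xlabel('Iteration number'); ylabel('Relative residual')
    legend('No preconditioner GMRES', 'ILU preconditioner GMRES', 'Tolerance')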
7. Convergence of GMRES
While GMRES convergence is difficult to predict in advance, theory identifies situations in which we can expect rapid convergence of GMRES:
1. the eigenvalues of $A$ are far from the origin,
2. the eigenvector matrix of $A$ is well conditioned, and
3. the eigenvalues of $A$ are clustered.
9. Conjugate Gradient for Normal Equations
Conjugate Gradient for the normal equations: $A^T A x = A^T b$.
For our purposes it is useful to recast this problem as an equivalent optimization problem:
$\min_x f(x) = \frac{1}{2} x^T A x - b^T x + c$ (5)
Note that $\nabla f(x) = A x - b$, so minimizing $f(x)$ is the same as solving
$0 = \nabla f(x) = A x - b$. (6)
(This requires $A$ to be SPD; since our $A$ is not, we substitute $A^T A$ for $A$ and $A^T b$ for $b$, and $A^T A$ is symmetric positive definite whenever $A$ has full column rank.)
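Written out for the normal equations themselves, the quadratic and its gradient become (a direct substitution, shown here for completeness):

    f(x) = \tfrac{1}{2}\, x^T (A^T A)\, x - (A^T b)^T x + c,
    \qquad
    \nabla f(x) = A^T A\, x - A^T b = -A^T (b - A x) = -A^T r .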
10. Algorithm
Algorithm for PCG:
Take $x_0 = 0$, so $r_0 = b$
Solve $M \tilde{r}_0 = r_0$ for $\tilde{r}_0$
Set $P_0 = \tilde{r}_0$
for $n = 1, 2, 3, \dots$
    $\alpha_n = r_{n-1}^T \tilde{r}_{n-1} / P_{n-1}^T A P_{n-1}$; # step length
    $x_n = x_{n-1} + \alpha_n P_{n-1}$; # approximate solution
    $r_n = r_{n-1} - \alpha_n A P_{n-1}$; # update the residual
    Solve $M \tilde{r}_n = r_n$ for $\tilde{r}_n$
    $\beta_n = r_n^T \tilde{r}_n / r_{n-1}^T \tilde{r}_{n-1}$; # gradient correction factor
    $P_n = \tilde{r}_n + \beta_n P_{n-1}$; # new search direction
end
• PCG converged at iteration 52 to a solution with relative residual 9.1e-08.
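A line-by-line MATLAB transcription of this algorithm (a sketch; the stopping test and the assumption that $M$ is given as a matrix are ours):

    function x = pcg_sketch(A, b, M, tol, maxit)
    % PCG as on this slide; assumes A and M are symmetric positive definite.
    x = zeros(size(A,2), 1);         % x0 = 0
    r = b;                           % r0 = b - A*x0 = b
    z = M \ r;                       % solve M*z0 = r0
    P = z;
    for n = 1:maxit
        AP = A * P;
        alpha = (r'*z) / (P'*AP);    % step length
        x = x + alpha * P;           % approximate solution
        rz_old = r' * z;
        r = r - alpha * AP;          % update the residual
        if norm(r)/norm(b) < tol, break; end
        z = M \ r;                   % solve M*z_n = r_n
        beta = (r'*z) / rz_old;      % gradient correction factor
        P = z + beta * P;            % new search direction
    end
    end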
12. Preconditioning technique
A standard way to choose the preconditioner $M$ is such that $\tilde{A} = M^{-1} A$ has better-clustered eigenvalues for the GMRES method. Applied to the normal equations,
$M^{-1} A^T A x = M^{-1} A^T b$;  $\tilde{A} \tilde{x} = \tilde{b}$, (9)
which converges faster. The number of iterations required for the conjugate gradient algorithm to converge grows like $\sqrt{\kappa(A)}$, so we want $\kappa(\tilde{A}) \ll \kappa(A)$. To preserve the symmetry and positive definiteness of $\tilde{A}$ we take $M^{-1} = L L^T$. The new residual is then
$\hat{r}_n = \tilde{b} - \tilde{A} \hat{x}_n = L^T A^T b - L^T A^T A x_n = L^T r_n$,
and correspondingly $\hat{P}_n = L^{-1} P_n$ and $\tilde{r}_n = M^{-1} r_n$.
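One hedged way to realize $M^{-1} = L L^T$ in practice is an incomplete Cholesky factorization of $A^T A$; this particular choice is our assumption, not necessarily the project's.

    % Split preconditioner setup (a sketch).
    N  = A' * A;                  % SPD when A has full column rank
    Lc = ichol(sparse(N));        % incomplete Cholesky: N is approximately Lc*Lc'
    % The preconditioner solve M*z = r in the iteration then amounts to
    %   z = Lc' \ (Lc \ r);       % i.e. z = (Lc*Lc')^{-1} * r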
13. Algorithm PCGNE
Take $x_0 = 0$, so $r_0 = A^T b$
Solve $M \tilde{r}_0 = r_0$ for $\tilde{r}_0$
Set $P_0 = \tilde{r}_0$
for $n = 1, 2, 3, \dots$
    $\alpha_n = r_{n-1}^T \tilde{r}_{n-1} / P_{n-1}^T A^T A P_{n-1}$; # step length
    $x_n = x_{n-1} + \alpha_n P_{n-1}$; # approximate solution
    $r_n = r_{n-1} - \alpha_n A^T A P_{n-1}$; # update the residual
    Solve $M \tilde{r}_n = r_n$ for $\tilde{r}_n$
    $\beta_n = r_n^T \tilde{r}_n / r_{n-1}^T \tilde{r}_{n-1}$; # gradient correction factor
    $P_n = \tilde{r}_n + \beta_n P_{n-1}$; # new search direction
end
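The same transcription for PCGNE, i.e., PCG applied to the normal equations $A^T A x = A^T b$ (a sketch; the function name and stopping rule are ours):

    function x = pcgne_sketch(A, b, M, tol, maxit)
    % PCGNE as on this slide (a sketch).
    x = zeros(size(A,2), 1);         % x0 = 0
    r = A' * b;                      % normal-equations residual, r0 = A'*(b - A*x0)
    z = M \ r;                       % solve M*z0 = r0
    P = z;
    for n = 1:maxit
        AP = A * P;
        alpha = (r'*z) / (AP'*AP);   % step length: r'*z / (P'*A'*A*P)
        x = x + alpha * P;           % approximate solution
        rz_old = r' * z;
        r = r - alpha * (A' * AP);   % update the residual
        if norm(b - A*x)/norm(b) < tol, break; end
        z = M \ r;
        beta = (r'*z) / rz_old;      % gradient correction factor
        P = z + beta * P;            % new search direction
    end
    end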
15. Block Conjugate Gradient algorithms for least squares problems
• The Conjugate Gradient Least Squares (CGLS) algorithm in block form, the so-called Block Conjugate Gradient Least Squares (BCGLS) method, was introduced by Ji and Li (2017) [5].
• BCGLS considers a linear system in block form with $s$ right-hand sides collected in the block matrix $B$, i.e., $A X = B$.
• BCGLS constructs the Krylov subspace in block form:
$K_n(A^T A, A^T R_0) = \mathrm{span}\{A^T R_0, (A^T A) A^T R_0, \dots, (A^T A)^{n-1} A^T R_0\}$
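A minimal MATLAB sketch of a block CGLS iteration in this spirit; it omits the rank-deficiency and breakdown handling that [5] develops, so it assumes the $s \times s$ Gram matrices stay well conditioned.

    function X = bcgls_sketch(A, B, tol, maxit)
    % Block CGLS for A*X = B with s right-hand sides (a sketch).
    X = zeros(size(A,2), size(B,2));  % one column per right-hand side
    R = B;                            % block residual for X0 = 0
    S = A' * R;                       % block normal-equations residual
    P = S;
    gam = S' * S;                     % s-by-s Gram matrix
    for n = 1:maxit
        Q = A * P;
        alpha = (Q'*Q) \ gam;         % block step: solves (Q'Q)*alpha = S'S
        X = X + P * alpha;
        R = R - Q * alpha;
        if max(vecnorm(R) ./ vecnorm(B)) < tol, break; end
        S = A' * R;
        gam_new = S' * S;
        beta = gam \ gam_new;         % block direction coefficient
        P = S + P * beta;
        gam = gam_new;
    end
    end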
17. Results
[Figure: colormap of the $A^T A$-orthogonality between search matrices in the first 31 iterations (left), condition number of $Q_i^T Q_i$ (upper right), and maximum and minimum relative residual norms of the columns of $X_i$ (lower right), for a block linear system with 100 right-hand sides using "gre_1107" as the coefficient matrix, along BCGLS iterations. [5]]
18. References
[1] Matrix Market, Harwell-Boeing collection, https://math.nist.gov/MatrixMarket/collections/hb.html
[2] Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856–869.
[3] D. C. Sorensen, Implicit application of polynomial filters in a k-step Arnoldi method, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 357–385.
[4] W. A. Joubert, On the convergence behavior of the restarted GMRES algorithm for solving nonsymmetric linear systems, Numer. Linear Algebra Appl., 1 (1994), pp. 427–447.
[5] H. Ji and Y. Li, Block conjugate gradient algorithms for least squares problems, J. Comput. Appl. Math., 317 (2017), pp. 203–217.