Among the many methods for solving linear systems, each technique has its own merits and drawbacks. For large systems it is particularly important to choose a method that reduces convergence time and produces accurate results. In this project we use the Steepest Descent (SD) method and the Conjugate Gradient (CG) method to solve high-dimensional linear systems. Other techniques exist, for example Gaussian elimination or Cholesky factorization, but these direct methods are unsuitable for very large systems, whereas SD and CG are well suited to sparse systems of linear equations. Among the available formulations, the normal equations are a convenient way to handle large non-invertible systems; although this formulation is more ill-conditioned, it works well with certain iterative methods. We therefore introduce preconditioning of the system so that the condition number is close to one. For minimizing convex quadratic functions, CG gives optimal results with fast convergence, so we apply the preconditioned Conjugate Gradient method with the normal equations to the modified system. In this study we compare three methods: least squares via SD, GMRES with Arnoldi's iteration, and PCGNE; PCGNE is the main focus of this project. We use plots and tables to identify the better method. Finally, we use the Block Conjugate Gradient algorithm for least squares (BCGLS) to obtain better and faster convergence.
2. Outline
• Linear systems and the least squares problem
• Arnoldi's iteration, the GMRES method, and its convergence
• Preconditioned Conjugate Gradient with the normal equations (PCGNE)
• Block Conjugate Gradient algorithms for least squares problems
3. Linear Systems & Sparse Matrix
[Figure: sparsity patterns of the coefficient matrix $A$, the unknown $x$, and the right-hand side $b$ in the system $Ax = b$]
Data source: Harwell-Boeing test matrix $A$, describing connections in a model of a diffraction column in a chemical plant; $A$ is not symmetric positive definite (SPD) [1]. Dimension of the data set: 479 × 479.
5. Least Squares Problem
• Minimize $\|r_n\|_2^2 = \|b - Ax_n\|_2^2$;
• let $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$. Take the initial guess $x_0 = 0$, so the initial residual is $r_0 = b$;
• the relative residual error is $RRV = \|r_n\| / \|b\|$.
• LS stopped at iteration 20 without converging to the desired tolerance 1e-06 because the maximum number of iterations was reached.
• The iterate returned (number 20) has relative residual 0.0017.
• Elapsed time is 1.922524 seconds.
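As a concrete companion to these bullets, here is a minimal MATLAB sketch of the steepest-descent least-squares iteration; the tolerance and iteration cap mirror the run reported above, and $A$, $b$ are assumed to be already loaded in the workspace.

    % Steepest descent for min ||b - A*x||_2^2 (a sketch; assumes A and b
    % are already loaded, e.g. from the Harwell-Boeing collection).
    tol = 1e-6; maxit = 20;
    x = zeros(size(A,2), 1);            % initial guess x0 = 0
    r = b;                              % initial residual r0 = b
    for k = 1:maxit
        g = A' * r;                     % negative gradient of 0.5*||b - A*x||^2
        Ag = A * g;
        alpha = (g' * g) / (Ag' * Ag);  % exact line search along g
        x = x + alpha * g;
        r = r - alpha * Ag;
        if norm(r) / norm(b) < tol      % relative residual test (RRV)
            break
        end
    end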
6. Arnoldi’s iteration and GMRES
• The order-$n$ Krylov subspace of $A$ generated by $b$ is
$K_n(A, b) = \mathrm{span}\{b, Ab, A^2 b, \dots, A^{n-1} b\}$ for $n = 1, 2, \dots$
If the vectors $b, Ab, A^2 b, \dots, A^{n-1} b$ are linearly independent, then they form a basis for $K_n(A, b)$. However, as $n \to \infty$, $A^n b$ tends toward the dominant eigenvector of $A$, which fills the basis with many nearly parallel vectors; Arnoldi's iteration avoids this by building an orthonormal basis $Q_n$ of $K_n(A, b)$.
• Finally, $\| \, \|b\| e_1 - \tilde{H}_n z \, \|_2^2 \to \min$.
• $\|r_n\|_2^2 = \|b - A x_n\|_2^2 = \|r_0 - A Q_n z\|_2^2 \to \min$, where $x_n = Q_n z$.
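For reference, a minimal MATLAB sketch of Arnoldi's iteration, which produces the orthonormal basis $Q$ and the $(n+1) \times n$ Hessenberg matrix $\tilde{H}_n$ appearing in the minimization above (the function name is ours):

    function [Q, H] = arnoldi_sketch(A, b, n)
    % Build an orthonormal basis Q of K_n(A,b) and the (n+1)-by-n
    % Hessenberg matrix H with A*Q(:,1:n) = Q*H (a sketch).
    Q = zeros(length(b), n+1);
    H = zeros(n+1, n);
    Q(:,1) = b / norm(b);
    for k = 1:n
        v = A * Q(:,k);                   % expand the Krylov subspace
        for j = 1:k                       % modified Gram-Schmidt
            H(j,k) = Q(:,j)' * v;
            v = v - H(j,k) * Q(:,j);
        end
        H(k+1,k) = norm(v);
        if H(k+1,k) == 0, break; end      % lucky breakdown
        Q(:,k+1) = v / H(k+1,k);
    end
    end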
[Figure: relative residual vs. iteration number (semilog scale, $10^{-8}$ to $10^{-2}$) for GMRES with no preconditioner and with an ILU preconditioner, together with the tolerance line]
• Solve the preconditioned system $M^{-1} A x = M^{-1} b$, with $M = LU$ from an incomplete LU factorization, by specifying $L$ and $U$ as inputs to GMRES.
• Elapsed time is 0.253678 seconds.
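A hedged MATLAB sketch of the comparison behind the figure above; the ILU options, the no-restart choice, and maxit are our assumptions, and $A$, $b$ are assumed to be already loaded (the 479 × 479 Harwell-Boeing matrix described earlier).

    % GMRES without and with an ILU preconditioner (a sketch).
    tol = 1e-6; maxit = 100;
    [~,~,~,~,rv0] = gmres(A, b, [], tol, maxit);            % no preconditioner
    [L,U] = ilu(A, struct('type','ilutp','droptol',1e-6));  % M = L*U (options assumed)
    [~,~,~,~,rv1] = gmres(A, b, [], tol, maxit, L, U);      % solves M^{-1}A x = M^{-1}b
    semilogy(0:numel(rv0)-1, rv0/norm(b), '-o'); hold on
    semilogy(0:numel(rv1)-1, rv1/norm(b), '-x')
    yline(tol, 'r--');                                      % tolerance line (R2018b+)
    xlabel('Iteration number'); ylabel('Relative residual')
    legend('No preconditioner GMRES', 'ILU preconditioner GMRES', 'Tolerance')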
7. Convergence of GMRES
While GMRES convergence is difficult to predict in advance, theory identifies situations in which we can expect rapid convergence of GMRES:
1. the eigenvalues of $A$ are far from the origin,
2. the eigenvector matrix of $A$ is well conditioned, and
3. the eigenvalues of $A$ are clustered.
9. Conjugate Gradient for Normal Equations
Conjugate Gradient for the normal equations: $A^T A x = A^T b$.
For our purposes it is useful to recast this problem as an equivalent optimization problem:
$\min_x f(x) = \frac{1}{2} x^T A x - b^T x + c$ (5)
Note that $\nabla f(x) = A x - b$, so minimizing $f(x)$ is the same as solving
$0 = \nabla f(x) = A x - b$. (6)
(This requires $A$ to be SPD; since our $A$ is not, we substitute $A^T A$ for $A$ and $A^T b$ for $b$, and $A^T A$ is symmetric positive definite whenever $A$ has full column rank.)
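Written out for the normal equations themselves, the quadratic and its gradient become (a direct substitution, shown here for completeness):

    f(x) = \tfrac{1}{2}\, x^T (A^T A)\, x - (A^T b)^T x + c,
    \qquad
    \nabla f(x) = A^T A\, x - A^T b = -A^T (b - A x) = -A^T r .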
10. Algorithm
Algorithm for PCG:
Take $x_0 = 0$, so $r_0 = b$
Solve $M \tilde{r}_0 = r_0$ for $\tilde{r}_0$
Set $P_0 = \tilde{r}_0$
for $n = 1, 2, 3, \dots$
    $\alpha_n = r_{n-1}^T \tilde{r}_{n-1} / P_{n-1}^T A P_{n-1}$; # step length
    $x_n = x_{n-1} + \alpha_n P_{n-1}$; # approximate solution
    $r_n = r_{n-1} - \alpha_n A P_{n-1}$; # update the residual
    Solve $M \tilde{r}_n = r_n$ for $\tilde{r}_n$
    $\beta_n = r_n^T \tilde{r}_n / r_{n-1}^T \tilde{r}_{n-1}$; # gradient correction factor
    $P_n = \tilde{r}_n + \beta_n P_{n-1}$; # new search direction
end
• PCG converged at iteration 52 to a solution with relative residual 9.1e-08.
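A line-by-line MATLAB transcription of this algorithm (a sketch; the stopping test and the assumption that $M$ is given as a matrix are ours):

    function x = pcg_sketch(A, b, M, tol, maxit)
    % PCG as on this slide; assumes A and M are symmetric positive definite.
    x = zeros(size(A,2), 1);         % x0 = 0
    r = b;                           % r0 = b - A*x0 = b
    z = M \ r;                       % solve M*z0 = r0
    P = z;
    for n = 1:maxit
        AP = A * P;
        alpha = (r'*z) / (P'*AP);    % step length
        x = x + alpha * P;           % approximate solution
        rz_old = r' * z;
        r = r - alpha * AP;          % update the residual
        if norm(r)/norm(b) < tol, break; end
        z = M \ r;                   % solve M*z_n = r_n
        beta = (r'*z) / rz_old;      % gradient correction factor
        P = z + beta * P;            % new search direction
    end
    end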
12. Preconditioning technique
A standard way to choose the preconditioner $M$ is such that $\tilde{A} = M^{-1} A$ has better-clustered eigenvalues for the GMRES method. Applied to the normal equations,
$M^{-1} A^T A x = M^{-1} A^T b$;  $\tilde{A} \tilde{x} = \tilde{b}$, (9)
which converges faster. The number of iterations required for the conjugate gradient algorithm to converge grows like $\sqrt{\kappa(A)}$, so we want $\kappa(\tilde{A}) \ll \kappa(A)$. To preserve the symmetry and positive definiteness of $\tilde{A}$ we take $M^{-1} = L L^T$. The new residual is then
$\hat{r}_n = \tilde{b} - \tilde{A} \hat{x}_n = L^T A^T b - L^T A^T A x_n = L^T r_n$,
and correspondingly $\hat{P}_n = L^{-1} P_n$ and $\tilde{r}_n = M^{-1} r_n$.
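One hedged way to realize $M^{-1} = L L^T$ in practice is an incomplete Cholesky factorization of $A^T A$; this particular choice is our assumption, not necessarily the project's.

    % Split preconditioner setup (a sketch).
    N  = A' * A;                  % SPD when A has full column rank
    Lc = ichol(sparse(N));        % incomplete Cholesky: N is approximately Lc*Lc'
    % The preconditioner solve M*z = r in the iteration then amounts to
    %   z = Lc' \ (Lc \ r);       % i.e. z = (Lc*Lc')^{-1} * r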
13. Algorithm PCGNE
Take $x_0 = 0$, so $r_0 = A^T b$
Solve $M \tilde{r}_0 = r_0$ for $\tilde{r}_0$
Set $P_0 = \tilde{r}_0$
for $n = 1, 2, 3, \dots$
    $\alpha_n = r_{n-1}^T \tilde{r}_{n-1} / P_{n-1}^T A^T A P_{n-1}$; # step length
    $x_n = x_{n-1} + \alpha_n P_{n-1}$; # approximate solution
    $r_n = r_{n-1} - \alpha_n A^T A P_{n-1}$; # update the residual
    Solve $M \tilde{r}_n = r_n$ for $\tilde{r}_n$
    $\beta_n = r_n^T \tilde{r}_n / r_{n-1}^T \tilde{r}_{n-1}$; # gradient correction factor
    $P_n = \tilde{r}_n + \beta_n P_{n-1}$; # new search direction
end
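The same transcription for PCGNE, i.e., PCG applied to the normal equations $A^T A x = A^T b$ (a sketch; the function name and stopping rule are ours):

    function x = pcgne_sketch(A, b, M, tol, maxit)
    % PCGNE as on this slide (a sketch).
    x = zeros(size(A,2), 1);         % x0 = 0
    r = A' * b;                      % normal-equations residual, r0 = A'*(b - A*x0)
    z = M \ r;                       % solve M*z0 = r0
    P = z;
    for n = 1:maxit
        AP = A * P;
        alpha = (r'*z) / (AP'*AP);   % step length: r'*z / (P'*A'*A*P)
        x = x + alpha * P;           % approximate solution
        rz_old = r' * z;
        r = r - alpha * (A' * AP);   % update the residual
        if norm(b - A*x)/norm(b) < tol, break; end
        z = M \ r;
        beta = (r'*z) / rz_old;      % gradient correction factor
        P = z + beta * P;            % new search direction
    end
    end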
15. Block Conjugate Gradient algorithms for least squares problems
• The Conjugate Gradient Least Squares (CGLS) algorithm in block form, the so-called Block Conjugate Gradient Least Squares (BCGLS) method, was introduced by Ji and Li (2017) [5].
• BCGLS considers a linear system in block form with $s$ right-hand sides collected in the block matrix $B$, i.e., $A X = B$.
• BCGLS constructs the Krylov subspace in block form:
$K_n(A^T A, A^T R_0) = \mathrm{span}\{A^T R_0, (A^T A) A^T R_0, \dots, (A^T A)^{n-1} A^T R_0\}$
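A minimal MATLAB sketch of a block CGLS iteration in this spirit; it omits the rank-deficiency and breakdown handling that [5] develops, so it assumes the $s \times s$ Gram matrices stay well conditioned.

    function X = bcgls_sketch(A, B, tol, maxit)
    % Block CGLS for A*X = B with s right-hand sides (a sketch).
    X = zeros(size(A,2), size(B,2));  % one column per right-hand side
    R = B;                            % block residual for X0 = 0
    S = A' * R;                       % block normal-equations residual
    P = S;
    gam = S' * S;                     % s-by-s Gram matrix
    for n = 1:maxit
        Q = A * P;
        alpha = (Q'*Q) \ gam;         % block step: solves (Q'Q)*alpha = S'S
        X = X + P * alpha;
        R = R - Q * alpha;
        if max(vecnorm(R) ./ vecnorm(B)) < tol, break; end
        S = A' * R;
        gam_new = S' * S;
        beta = gam \ gam_new;         % block direction coefficient
        P = S + P * beta;
        gam = gam_new;
    end
    end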
17. Results
[Figure: colormap of the $A^T A$-orthogonality between search matrices in the first 31 iterations (left), condition number of $Q_i^T Q_i$ (upper right), and maximum and minimum relative residual norms of the columns of $X_i$ (lower right), for a block linear system with 100 right-hand sides using "gre_1107" as the coefficient matrix, along BCGLS iterations. [5]]
18. References
[1] Matrix Market, Harwell-Boeing collection, https://math.nist.gov/MatrixMarket/collections/hb.html
[2] Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856–869.
[3] D. C. Sorensen, Implicit application of polynomial filters in a k-step Arnoldi method, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 357–385.
[4] W. A. Joubert, On the convergence behavior of the restarted GMRES algorithm for solving nonsymmetric linear systems, Numer. Linear Algebra Appl., 1 (1994), pp. 427–447.
[5] H. Ji and Y. Li, Block conjugate gradient algorithms for least squares problems, J. Comput. Appl. Math., 317 (2017), pp. 203–217.