Solution of
Macroeconometric Models
Giorgio Pauletto
November 1995
Department of Econometrics
University of Geneva
Contents
1 Introduction 1
2 A Review of Solution Techniques 4
2.1 LU Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.1 Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.2 Computational Complexity . . . . . . . . . . . . . . . . . 8
2.1.3 Practical Implementation . . . . . . . . . . . . . . . . . . 8
2.2 QR Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Computational Complexity . . . . . . . . . . . . . . . . . 10
2.2.2 Practical Implementation . . . . . . . . . . . . . . . . . . 10
2.3 Direct Methods for Sparse Matrices . . . . . . . . . . . . . . . . . 10
2.3.1 Data Structures and Storage Schemes . . . . . . . . . . . 11
2.3.2 Fill-in in Sparse LU . . . . . . . . . . . . . . . . . . . . . 13
2.3.3 Computational Complexity . . . . . . . . . . . . . . . . . 14
2.3.4 Practical Implementation . . . . . . . . . . . . . . . . . . 14
2.4 Stationary Iterative Methods . . . . . . . . . . . . . . . . . . . . 14
2.4.1 Jacobi Method . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.2 Gauss-Seidel Method . . . . . . . . . . . . . . . . . . . . . 15
2.4.3 Successive Overrelaxation Method . . . . . . . . . . . . . 16
2.4.4 Fast Gauss-Seidel Method . . . . . . . . . . . . . . . . . . 17
2.4.5 Block Iterative Methods . . . . . . . . . . . . . . . . . . . 17
2.4.6 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4.7 Computational Complexity . . . . . . . . . . . . . . . . . 20
2.5 Nonstationary Iterative Methods . . . . . . . . . . . . . . . . . . 21
2.5.1 Conjugate Gradient . . . . . . . . . . . . . . . . . . . . . 21
2.5.2 Preconditioning . . . . . . . . . . . . . . . . . . . . . . . . 22
2.5.3 Conjugate Gradient Normal Equations . . . . . . . . . . . 24
2.5.4 Generalized Minimal Residual . . . . . . . . . . . . . . . . 25
2.5.5 BiConjugate Gradient Method . . . . . . . . . . . . . . . 27
2.5.6 BiConjugate Gradient Stabilized Method . . . . . . . . . 28
2.5.7 Implementation of Nonstationary Iterative Methods . . . 29
2.6 Newton Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.1 Computational Complexity . . . . . . . . . . . . . . . . . 31
2.6.2 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7 Finite Difference Newton Method . . . . . . . . . . . . . . . . . . 32
2.7.1 Convergence of the Finite Difference Newton Method . . 33
2.8 Simplified Newton Method . . . . . . . . . . . . . . . . . . . . . . 34
2.8.1 Convergence of the Simplified Newton Method . . . . . . 35
2.9 Quasi-Newton Methods . . . . . . . . . . . . . . . . . . . . . . . 35
2.10 Nonlinear First-order Methods . . . . . . . . . . . . . . . . . . . 37
2.10.1 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.11 Solution by Minimization . . . . . . . . . . . . . . . . . . . . . . 39
2.12 Globally Convergent Methods . . . . . . . . . . . . . . . . . . . . 41
2.12.1 Line-search . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.12.2 Model-trust Region . . . . . . . . . . . . . . . . . . . . . . 43
2.13 Stopping Criteria and Scaling . . . . . . . . . . . . . . . . . . . . 44
3 Solution of Large Macroeconometric Models 46
3.1 Blocktriangular Decomposition of the Jacobian Matrix . . . . . . 47
3.2 Orderings of the Jacobian Matrix . . . . . . . . . . . . . . . . . . 48
3.2.1 The Logical Framework of the Algorithm . . . . . . . . . 50
3.2.2 Practical Considerations . . . . . . . . . . . . . . . . . . . 56
3.3 Point Methods versus Block Methods . . . . . . . . . . . . . . . . 56
3.3.1 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.3.2 Discussion of the Block Method . . . . . . . . . . . . . . . 57
3.3.3 Ordering and Convergence for First-order Iterations . . . 59
3.4 Essential Feedback Vertex Sets and the Newton Method . . . . . 61
4 Model Simulation on Parallel Computers 62
4.1 Introduction to Parallel Computing . . . . . . . . . . . . . . . . . 62
4.1.1 A Taxonomy for Parallel Computers . . . . . . . . . . . . 63
4.1.2 Communication Tasks . . . . . . . . . . . . . . . . . . . . 67
4.1.3 Synchronization Issues . . . . . . . . . . . . . . . . . . . . 69
4.1.4 Speedup and Efficiency of an Algorithm . . . . . . . . . . 70
4.2 Model Simulation Experiences . . . . . . . . . . . . . . . . . . . . 71
4.2.1 Econometric Models and Solution Algorithms . . . . . . . 71
4.2.2 Parallelization Potential for Solution Algorithms . . . . . 73
4.2.3 Practical Results . . . . . . . . . . . . . . . . . . . . . . . 76
5 Rational Expectations Models 82
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1.1 Formulation of RE Models . . . . . . . . . . . . . . . . . . 85
5.1.2 Uniqueness and Stability Issues . . . . . . . . . . . . . . . 86
5.2 The Model MULTIMOD . . . . . . . . . . . . . . . . . . . . . . . 89
5.2.1 Overview of the Model . . . . . . . . . . . . . . . . . . . . 89
5.2.2 Equations of a Country Model . . . . . . . . . . . . . . . 90
5.2.3 Structure of the Complete Model . . . . . . . . . . . . . . 92
5.3 Solution Techniques for RE Models . . . . . . . . . . . . . . . . . 92
5.3.1 Extended Path . . . . . . . . . . . . . . . . . . . . . . . . 93
5.3.2 Stacked-time Approach . . . . . . . . . . . . . . . . . . . 94
5.3.3 Block Iterative Methods . . . . . . . . . . . . . . . . . . . 97
5.3.4 Newton Methods . . . . . . . . . . . . . . . . . . . . . . . 107
A Appendix 122
A.1 Finite Precision Arithmetic . . . . . . . . . . . . . . . . . . . . . 122
A.2 Condition of a Problem . . . . . . . . . . . . . . . . . . . . . . . 123
A.3 Complexity of Algorithms . . . . . . . . . . . . . . . . . . . . . . 125
List of Tables
4.1 Complexity of communication tasks on a linear array and a hypercube with p processors. . . . . . . . . . 69
4.2 Execution times of Gauss-Seidel and Jacobi algorithms. . . . . . 77
4.3 Execution time on CM2 and Sun ELC. . . . . . . . . . . . . . . . 80
4.4 Execution time on Sun ELC and CM2 for the Newton-like algorithm. . . . . . . . . . 81
5.1 Labels for the zones/countries considered in MULTIMOD. . . . 90
5.2 Spectral radii for point and block Gauss-Seidel. . . . . . . . . . . 100
5.3 Operation count in Mflops for Newton combined with SGE and
MATLAB’s sparse solver, and Gauss-Seidel. . . . . . . . . . . . 113
5.4 Average number of Mflops for BiCGSTAB. . . . . . . . . . . . . 118
5.5 Average number of Mflops for QMR. . . . . . . . . . . . . . . . 119
5.6 Average number of Mflops for GMRES(m). . . . . . . . . . . . . 120
5.7 Average number of Mflops for MATLAB’s sparse LU. . . . . . . 121
List of Figures
2.1 A one dimensional function F(x) with a unique zero and its corresponding function f(x) with multiple local minima. . . . . . . . . . 41
2.2 The quadratic model ˆg(ω) built to determine the minimum ˆω. . . 43
3.1 Blockrecursive pattern of a Jacobian matrix. . . . . . . . . . . . 49
3.2 Sparsity pattern of the reordered Jacobian matrix. . . . . . . . . 49
3.3 Situations considered for the transformations. . . . . . . . . . . . 53
3.4 Tree T = (S, U). . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.5 Numerical example showing the structure is not sufficient. . . . 60
4.1 Shared memory system. . . . . . . . . . . . . . . . . . . . . . . . 64
4.2 Distributed memory system. . . . . . . . . . . . . . . . . . . . . 65
4.3 Linear Array. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.4 Ring. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.5 Mesh. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6 Torus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.7 Hypercubes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.8 Complete graph. . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.9 Long communication delays between two processors. . . . . . . . 70
4.10 Large differences in the workload of two processors. . . . . . . . 70
4.11 Original and ordered Jacobian matrix and corresponding DAG. . 75
4.12 Blockrecursive pattern of the model’s Jacobian matrix. . . . . . . 77
4.13 Matrix L for the Gauss-Seidel algorithm. . . . . . . . . . . . . . . 78
5.1 Linkages of the country models in the complete version of MULTIMOD. . . . . . . . . . 92
5.2 Incidence matrix of D in MULTIMOD. . . . . . . . . . . . . . . 93
5.3 Incidence matrices E3 to E1, D and A1 to A5. . . . . . . . . . . 100
5.4 Alignment of data in memory. . . . . . . . . . . . . . . . . . . . 104
5.5 Elapsed time for 4 processors and for a single processor. . . . . 106
5.6 Relation between r and κ2 in the submodel for Japan for MULTIMOD. . . 112
5.7 Scheduling of operations for the solution of the linear system as
computed on page 110. . . . . . . . . . . . . . . . . . . . . . . . . 116
5.8 Incidence matrix of the stacked system for T = 10. . . . . . . . 117
Acknowledgements
This thesis is the result of my research at the Department of Econometrics of
the University of Geneva, Switzerland.
First and foremost, I wish to express my deepest gratitude to Professor Manfred
Gilli, my thesis supervisor for his constant support and help. He has shown a
great deal of patience, availability and humane qualities beyond his professional
competence.
I would like to thank Professor Andrew Hughes-Hallett for agreeing to read and evaluate this work. His research also led me to discover and take an interest in the field of simulation of large macroeconometric models.
I am also grateful to Professor Fabrizio Carlevaro for accepting the presidency of the jury and for reading my thesis.
Moreover, I thank Professor Jean-Philippe Vial and Professor Gerhard Wanner
for being part of the jury and evaluating my work.
I am happy to be able to show my gratitude to my colleagues and friends of
the Department of Econometrics for creating a pleasant and enjoyable work-
ing environment. David Miceli provided constant help and kind understanding
during all the stages of my research.
I am grateful to Pascale Mignon for helping me proofread my text.
Finally, I wish to thank my parents for their kindness and encouragement, without which I could never have achieved my goals.
Geneva, November 1995.
Chapter 1
Introduction
The purpose of this book is to present the available methodologies for the solu-
tion of large-scale macroeconometric models. This work reviews classical solu-
tion methods and introduces more recent techniques, such as parallel computing
and nonstationary iterative algorithms.
The development of new and more efficient computational techniques has sig-
nificantly influenced research and practice in macroeconometric modeling. Our
aim here is to supply practitioners and researchers with both a general presen-
tation of numerical solution methods and specific discussions about particular
problems encountered in the field.
An econometric model is a simplified representation of actual economic phe-
nomena. Economic behavior is typically represented by a set of algebraic equations forming a system. The latter involves endogenous
variables, which are determined by the system itself, and exogenous variables,
which influence but are not determined by the system. The model also con-
tains parameters that we will assume are already estimated by an adequate
econometric technique.
We may express the econometric model in matrix form for a given period t as
F(yt, zt, β) = εt ,
where F is a vector of n functions fi, yt is a vector of n endogenous variables,
zt is a vector of m exogenous variables, β is a vector of k parameters and εt is
a vector of n stochastic disturbances with zero mean.
In this work, we will concentrate on the solution of the model with respect to
the endogenous variables yt. Hence, we will solve a system such as
F(yt, zt) = 0 . (1.1)
Such a model will be solved period after period for some horizon, generally
outside the sample range used for estimation. Therefore, we usually drop the
index t.
A particular class of models, which contain anticipated variables, are described
in Chapter 5. In this case, the solution has to be computed simultaneously for
the periods considered.
Traditionally, in the practice of solving large macroeconometric models, two
kinds of solution algorithms have been used. The most popular ones are proba-
bly first-order iterative techniques and related methods like Gauss-Seidel. One
obvious reason for this is their ease of implementation. Another reason is that
their computational complexity is in general quite low, mainly because Gauss-
Seidel naturally exploits the sparse structure of the system of equations. The
convergence of these methods depends on the particular quantification of the
equations and their ordering. Convergence is not guaranteed and its speed is
linear.
Newton-type methods constitute a second group of techniques commonly used
to solve models. These methods use the information about the derivatives of
the equations. The major advantages are then a quadratic convergence, the
fact that the equations do not need to be normalized and that the ordering
does not influence the convergence rate. The computational cost comprises the
evaluation of the derivatives forming the Jacobian matrix and the solution of
the linear system. If the linear system is solved using a classical direct method
based on LU or QR decomposition, the complexity of the whole method is $O(n^3)$.
This promises interesting savings in computations if size n can be reduced. A
common technique consists then in applying the Newton method only to a subset
of equations, for instance the equations formed by the spike variables.
This leads to a block method, i.e. a first-order iterative method where only a
subsystem of equations is solved with a Newton method. A recursive system
constitutes the first block and the second block (in general much
smaller) is solved by a Newton method.
However, such a method brings us back to the problem of convergence for the
outer loop. Moreover, for macroeconometric models in most cases the block of
spike variables is also recursive, which then results in carrying out unnecessary
computations.
Thus, the block method tries to take advantage of both the sparse structure
of the system under consideration and the desirable convergence properties of
Newton-type algorithms. However, as explained above, this approach relapses
into the convergence problem existing in the framework of a block method.
This suggests that the sparsity should be exploited when solving the linear
system in the Newton method, which can be achieved by using appropriate
sparse techniques.
This work presents methods for the solution of large macroeconometric models.
The classical approaches mentioned above are presented with a particular em-
phasis on the problem of the ordering of the equations. We then look into more
recent developments in numerical techniques.
The solution of a linear system is a basic task of most solution algorithms
for systems of nonlinear equations. Therefore, we pay special attention to the
solution of linear systems. A central characteristic of the linear systems arising
in macroeconometric modeling is their sparsity. Hence, methods able to take
advantage of a sparse structure are of crucial importance.
A more recent set of tools available for the solution of linear equations are
nonstationary methods. We explore their performance for a particular class of
models in economics.
The last decade has revealed that parallel computation is now practical and has
a significant impact on how large scale computation is performed. This tech-
nology is therefore available to solve large numerical problems in economics. A
consequence of this trend is that the efficient use of parallel machines may re-
quire new algorithm development. We therefore address some practical aspects
concerning parallel computation.
A particular class of macroeconometric models are models containing forward-looking variables. Such models naturally give rise to very large systems of
equations, the solution of which requires heavy computations. Thus such models
constitute an interesting testing ground for the numerical methods addressed in
this research.
This work is organized into five chapters. Chapter 2 reviews solution tech-
niques for linear and nonlinear systems. First, we discuss direct methods with a
particular stress on the sparse case. This is followed by the presentation of iter-
ative methods for linear systems, displaying both stationary and nonstationary
techniques.
For the nonlinear case, we concentrate on the Newton method and some of
its principal variants. Then, we examine the nonlinear versions of first-order
iterative techniques and quasi-Newton methods. The alternative approach of
residual minimization and issues about global convergence are also analyzed.
The macroeconometric models we consider are large and sparse and therefore
analyzing their logical structure is relevant. Chapter 3 introduces a graph-
theoretical approach to perform this analysis. We first introduce the method to
investigate the recursive structures. Later, original techniques are developed to
analyze interdependent structures, in particular by an algorithm for computing
minimal feedback sets. These techniques are then used to seek a block
decomposition of a model and we conclude with a comparison of computational
complexity of point methods versus block methods.
Chapter 4 addresses the main issues concerning the type of computer and the solution
technique used in parallel computation. Practical aspects are also examined
through the application of parallel techniques to the simulation of a medium
sized macroeconometric model.
In Chapter 5, we present the theoretical framework of rational expectations models. In the first part, we discuss issues concerning the existence and uniqueness of the solution. In the second part, we present a multi-region econometric model with forward-looking variables. Then, different solution techniques are experimented with to solve this model.
Chapter 2
A Review of Solution
Techniques
This chapter reviews classic and well implemented solution techniques for linear
and nonlinear systems. First, we discuss direct and iterative methods for linear
systems. Some of these methods are part of the fundamental building blocks
for many techniques for solving nonlinear systems presented later. The topic
has been extensively studied and many methods have been analyzed in scientific
computing literature, see e.g. Golub and Van Loan [56], Gill et al. [47], Barrett
et al. [8] and Hageman and Young [60].
Second, the nonlinear case is addressed essentially presenting methods based on
Newton iterations.
First, direct methods for solving linear systems of equations are presented. The first section presents the LU factorization—or Gaussian elimination technique—and the second section describes an orthogonalization decomposition leading to the QR factorization. The cases of dense and sparse systems are then addressed. Other direct methods also exist, such as the Singular Value
Decomposition (SVD) which can be used to solve linear systems. Even though
this can constitute an interesting and useful approach we do not resort to it
here.
Section 2.4 introduces stationary iterative methods such as Jacobi, Gauss-Seidel,
SOR techniques and their convergence characteristics.
Nonstationary iterative methods—such as the conjugate gradient, generalized minimal residual and biconjugate gradient—are a more recently developed class of techniques and constitute the topic of Section 2.5.
Section 2.10 presents nonlinear first-order methods that are quite popular in
macroeconometric modeling. The topic of Section 2.11 is an alternative ap-
proach to the solution of a system of nonlinear equations: a minimization of the
residuals norm.
To overcome the nonconvergent behavior of the Newton method in some circum-
stances, two globally convergent modifications are introduced in Section 2.12.
Finally, we discuss stopping criteria and scaling.
2.1 LU Factorization
For a linear model, finding a vector of solutions amounts to solving for x a
system written in matrix form
Ax = b , (2.1)
where A is an n × n real matrix and b an n × 1 real vector.
System (2.1) can be solved by the Gaussian elimination method, a widely used algorithm; here, we present its application to a dense matrix A with no particular structure.
The basic idea of Gaussian elimination is to transform the original system into
an equivalent triangular system. Then, we can easily find the solution of such
a system. The method is based on the fact that replacing an equation by a
linear combination of the others leaves the solution unchanged. First, this idea
is applied to get an upper triangular equivalent system. This stage is called the
forward elimination of the system. Then, the solution is found by solving the
equations in reverse order. This is the back substitution phase.
To describe the process with matrix algebra, we need to define a transformation
that will take care of zeroing the elements below the diagonal in a column of
matrix $A$. Let $x \in \mathbb{R}^n$ be a column vector with $x_k \neq 0$. We can define
$$\tau^{(k)} = [\,0 \;\cdots\; 0 \;\; \tau^{(k)}_{k+1} \;\cdots\; \tau^{(k)}_n\,]' \quad\text{with}\quad \tau^{(k)}_i = x_i / x_k \;\text{ for } i = k+1, \ldots, n \; .$$
Then, the matrix $M_k = I - \tau^{(k)} e_k'$, with $e_k$ being the $k$-th standard basis vector of $\mathbb{R}^n$, represents a Gauss transformation. The vector $\tau^{(k)}$ is called a Gauss vector. By applying $M_k$ to $x$, we check that we get
$$M_k x = \begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & -\tau^{(k)}_{k+1} & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & -\tau^{(k)}_n & 0 & \cdots & 1
\end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_k \\ x_{k+1} \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} x_1 \\ \vdots \\ x_k \\ 0 \\ \vdots \\ 0 \end{bmatrix} .$$
Practically, applying such a transformation is carried out without explicitly building $M_k$ or resorting to matrix multiplications. For example, in order to multiply $M_k$ by a matrix $C$ of size $n \times r$, we only need to perform an outer product and a matrix subtraction:
$$M_k C = (I - \tau^{(k)} e_k') C = C - \tau^{(k)} (e_k' C) \; . \qquad (2.2)$$
The product $e_k' C$ selects the $k$-th row of $C$, and the outer product $\tau^{(k)} (e_k' C)$ is subtracted from $C$. However, only the rows from $k+1$ to $n$ of $C$ have to be updated, as the first $k$ elements in $\tau^{(k)}$ are zeros. We denote by $A^{(k)}$ the matrix $M_k \cdots M_1 A$, i.e. the matrix $A$ after the $k$-th elimination step.
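To make the outer-product update (2.2) concrete, here is a minimal NumPy sketch; the function name and the 3 × 3 example are ours and only illustrate the operation, using 0-based indices.

    import numpy as np

    def apply_gauss_transform(C, tau, k):
        # Apply M_k = I - tau e_k' to C, as in Equation (2.2).
        # Only rows k+1, ..., n-1 change, since the first k+1 entries of tau are zero.
        C[k+1:, :] -= np.outer(tau[k+1:], C[k, :])
        return C

    A = np.array([[2.0, 1.0, 1.0],
                  [4.0, 3.0, 3.0],
                  [8.0, 7.0, 9.0]])
    tau = np.zeros(3)
    tau[1:] = A[1:, 0] / A[0, 0]                     # tau_i = a_i1 / a_11
    print(apply_gauss_transform(A.copy(), tau, 0))   # first column becomes [2, 0, 0]'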
To triangularize the system, we need to apply $n-1$ Gauss transformations, provided that the Gauss vector can be found. This is true if all the divisors $a^{(k)}_{kk}$—called pivots—used to build $\tau^{(k)}$ for $k = 1, \ldots, n-1$ are different from zero.
If for a real $n \times n$ matrix $A$ the process of zeroing the elements below the diagonal is successful, we have
$$M_{n-1} M_{n-2} \cdots M_1 A = U \; ,$$
where $U$ is an $n \times n$ upper triangular matrix. Using the Sherman-Morrison-Woodbury formula, we can easily find that if $M_k = I - \tau^{(k)} e_k'$ then $M_k^{-1} = I + \tau^{(k)} e_k'$, and so defining $L = M_1^{-1} M_2^{-1} \cdots M_{n-1}^{-1}$ we can write
$$A = LU \; .$$
As each matrix $M_k$ is unit lower triangular, each $M_k^{-1}$ also has this property; therefore, $L$ is unit lower triangular too. By developing the product defining $L$, we have
$$L = (I + \tau^{(1)} e_1')(I + \tau^{(2)} e_2') \cdots (I + \tau^{(n-1)} e_{n-1}') = I + \sum_{k=1}^{n-1} \tau^{(k)} e_k' \; .$$
So $L$ contains ones on the main diagonal and the vector $\tau^{(k)}$ in the $k$-th column below the diagonal for $k = 1, \ldots, n-1$, and we have
$$L = \begin{bmatrix}
1 & & & & \\
\tau^{(1)}_2 & 1 & & & \\
\tau^{(1)}_3 & \tau^{(2)}_3 & 1 & & \\
\vdots & \vdots & \ddots & \ddots & \\
\tau^{(1)}_n & \tau^{(2)}_n & \cdots & \tau^{(n-1)}_n & 1
\end{bmatrix} .$$
By applying the Gaussian elimination to A we found a factorization of A into a
unit lower triangular matrix L and an upper triangular matrix U. The existence
and uniqueness conditions as well as the result are summarized in the following
theorem.
Theorem 1 $A \in \mathbb{R}^{n \times n}$ has an LU factorization if the first $n-1$ leading principal minors are different from 0. If the LU factorization exists and $A$ is nonsingular, then the LU factorization is unique and $\det(A) = u_{11} \cdots u_{nn}$.
The proof of this theorem can be found for instance in Golub and Van Loan [56,
p. 96]. Once the factorization has been found, we obtain the solution for the
system Ax = b, by first solving Ly = b by forward substitution and then solving
Ux = y by back substitution.
Forward substitution for a unit lower triangular matrix is easy to perform. The
first equation gives y1 = b1 because L contains ones on the diagonal. Substitut-
ing y1 in the second equation gives y2. Continuing thus, the triangular system
Ly = b is solved by substituting all the known yj to get the next one.
Back substitution works similarly, but we start with xn since U is upper trian-
gular. Proceeding backwards, we get xi by replacing all the known xj (j > i)
in the i-th equation of Ux = y.
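As an illustration, a minimal NumPy sketch of the two triangular solves described above; the function names and the small example are ours.

    import numpy as np

    def forward_substitution(L, b):
        # Solve Ly = b for a unit lower triangular L (ones on the diagonal).
        n = len(b)
        y = np.zeros(n)
        for i in range(n):
            y[i] = b[i] - L[i, :i] @ y[:i]
        return y

    def back_substitution(U, y):
        # Solve Ux = y for a nonsingular upper triangular U, proceeding backwards.
        n = len(y)
        x = np.zeros(n)
        for i in reversed(range(n)):
            x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
        return x

    L = np.array([[1.0, 0.0], [0.5, 1.0]])
    U = np.array([[2.0, 1.0], [0.0, 1.5]])
    b = np.array([3.0, 4.0])
    print(back_substitution(U, forward_substitution(L, b)))   # solves (LU)x = b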
2.1.1 Pivoting
As described above, the Gaussian elimination breaks down when a pivot is equal
to zero. In such a situation, a simple exchange of the equations leading to a
nonzero pivot may get us round the problem. However, the condition that all the pivots be different from zero does not suffice to ensure a numerically reliable result: at this stage, the Gaussian elimination method is still numerically unstable. This means that, because of cancellation errors, the
process described can lead to catastrophic results. The problem lies in the size
of the elements of the Gauss vector τ. If they are too large compared to the
elements from which they are subtracted in Equation (2.2), rounding errors may
be magnified thus destroying the numerical accuracy of the computation.
To overcome this difficulty a good strategy is to exchange the rows of the matrix
during the process of elimination to ensure that the elements of τ will always
be smaller or equal to one in magnitude. This is achieved by choosing the
permutation of the rows so that
$$|a^{(k)}_{kk}| = \max_{i \geq k} |a^{(k)}_{ik}| \; . \qquad (2.3)$$
Such an exchange strategy is called partial pivoting and can be formalized in
matrix language as follows.
Let Pi be a permutation matrix of order n, i.e. the identity matrix with its rows
reordered. To ensure that no element in τ is larger than one in absolute value,
we must permute the rows of A before applying the Gauss transformation. This
is applied at each step of the Gaussian elimination process, which leads to the
following theorem:
Theorem 2 If Gaussian elimination with partial pivoting is used to compute
the upper triangularization
$$M_{n-1} P_{n-1} \cdots M_1 P_1 A = U \; ,$$
then $PA = LU$ where $P = P_{n-1} \cdots P_1$ and $L$ is a unit lower triangular matrix with $|\ell_{ij}| \leq 1$.
Thus, when solving a linear system Ax = b, we first compute the vector y =
Mn−1Pn−1 · · · M1P1b and then solve Ux = y by back substitution. This method
is much more stable and it is very unlikely to find catastrophic cancellation
problems. The proof of Theorem 2 is given in Golub and Van Loan [56, p. 112].
Going one step further would imply permuting not only the rows but also the
columns of A so that in the k-th step of the Gaussian elimination the largest
element of the submatrix to be transformed is used as pivot. This strategy is
called complete pivoting. However, applying complete pivoting is costly because
one needs to search for the largest element in a matrix instead of a vector at
each elimination step. This overhead does not justify the gain one may obtain
in the stability of the method in practice. Therefore, the algorithm of choice
for solving Ax = b, when A has no particular structure, is Gaussian elimination
with partial pivoting.
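The following rough NumPy sketch illustrates Gaussian elimination with partial pivoting, returning P, L and U with PA = LU; it is meant only to make the row exchanges and the Gauss vectors explicit, not to replace a carefully tested library routine.

    import numpy as np

    def plu(A):
        # Gaussian elimination with partial pivoting: PA = LU.
        A = A.astype(float).copy()
        n = A.shape[0]
        piv = np.arange(n)
        L = np.eye(n)
        for k in range(n - 1):
            p = k + np.argmax(np.abs(A[k:, k]))     # row of the largest |a_ik|, i >= k
            A[[k, p]] = A[[p, k]]                   # exchange rows of the working matrix
            piv[[k, p]] = piv[[p, k]]
            L[[k, p], :k] = L[[p, k], :k]           # keep earlier multipliers aligned
            tau = A[k+1:, k] / A[k, k]              # Gauss vector, entries <= 1 in magnitude
            L[k+1:, k] = tau
            A[k+1:, k:] -= np.outer(tau, A[k, k:])
        return np.eye(n)[piv], L, np.triu(A)

    A = np.array([[2.0, 1.0, 1.0],
                  [4.0, 3.0, 3.0],
                  [8.0, 7.0, 9.0]])
    P, L, U = plu(A)
    print(np.allclose(P @ A, L @ U))                # True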
2.1.2 Computational Complexity
The number of elementary arithmetic operations (flops) for the Gaussian elim-
ination is $\frac{2}{3} n^3 - \frac{1}{2} n^2 - \frac{1}{6} n$, and therefore this method is $O(n^3)$.
2.1.3 Practical Implementation
In the case where one is only interested in the solution vector, it is not necessary
to explicitly build matrix L. It is possible to directly compute the y vector
(solution of Ly = b) while transforming matrix A into an upper triangular
matrix U.
Despite the fact that Gaussian elimination seems to be easy to code, it is cer-
tainly not advisable to write our own code. A judicious choice is to rely on
carefully tested software such as the routines in the LAPACK library. These routines
are publicly available on NETLIB1
and are also used by the software MATLAB2
which is our main computing environment for the experiments we carried out.
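For instance, assuming SciPy is available, its LAPACK-based wrappers can be called as follows (a usage sketch, not part of the original text):

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve    # wrappers around LAPACK routines

    A = np.array([[4.0, 3.0], [6.0, 3.0]])
    b = np.array([10.0, 12.0])
    lu, piv = lu_factor(A)                          # PA = LU with partial pivoting
    x = lu_solve((lu, piv), b)                      # forward and back substitution
    print(x)                                        # x satisfies A @ x = b, here [1, 2]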
2.2 QR Factorization
The QR factorization is an orthogonalization method that can be applied to
square or rectangular matrices. Usually this is a key algorithm for computing
eigenvalues or least-squares solutions and it is less applied to find the solution
of a square linear system. Nevertheless, there are at least 3 reasons (see Golub
and Van Loan [56]) why orthogonalization methods, such as QR, might be
considered:
• The orthogonal methods have guaranteed numerical stability which is not
the case for Gaussian elimination.
• In case of ill-conditioning, orthogonal methods give an added measure of
reliability.
• The flop count tends to exaggerate the Gaussian elimination advantage.3
(Particularly for parallel computers, memory traffic and other overheads
tend to reduce this advantage.)
Another advantage that might favor the QR factorization is the possibility of up-
dating the factors Q and R corresponding to a rank one modification of matrix
A in $O(n^2)$ operations. This is also possible for the LU factorization; however, the implementation is much simpler with QR, see Gill et al. [47]. Updating techniques will prove particularly useful in the quasi-Newton algorithm presented in Section 2.9.

1NETLIB can be accessed through the World Wide Web at http://www.netlib.org/ and collects mathematical software, articles and databases useful for the scientific community. In Europe the URL is http://www.netlib.no/netlib/master/readme.html or http://elib.zib-berlin.de/netlib/master/readme.html .
2MATLAB High Performance Numeric Computation and Visualization Software is a product and registered trademark of The MathWorks, Inc., Cochituate Place, 24 Prime Park Way, Natick MA 01760, USA. URL: http://www.mathworks.com/ .
3In the application discussed in Section 4.2.2 we used the QR factorization available in the libraries of the CM2 parallel computer.
These reasons suggest that QR probably is, especially on parallel devices, a possible alternative to LU for solving square systems. The QR factorization can be applied to any rectangular matrix, but we will focus on the case of an n × n
real matrix A.
The goal is to apply to A successive orthogonal transformation matrices Hi,
i = 1, 2, . . . , r to get an upper triangular matrix R, i.e.
Hr · · · H1A = R .
The orthogonal transformations presented in the literature are usually based
upon Givens rotations or Householder reflections. This latter choice leads to
algorithms involving fewer arithmetic operations and is therefore presented in the
following.
A Householder transformation is a matrix of the form
$$H = I - 2ww' \quad\text{with}\quad w'w = 1 \; .$$
Such a matrix is symmetric, orthogonal and its determinant is −1. Geometrically, this matrix represents a reflection with respect to the hyperplane defined by $\{x \,|\, w'x = 0\}$. By properly choosing the reflection plane, it is possible to zero
particular elements in a vector.
Let us partition our matrix $A$ into $n$ column vectors $[a_1 \cdots a_n]$. We first look for a matrix $H_1$ such that all the elements of $H_1 a_1$ except the first one are zeros. We define
$$\begin{aligned}
s_1 &= -\operatorname{sign}(a_{11})\, \|a_1\| \\
\mu_1 &= (2 s_1^2 - 2 a_{11} s_1)^{-1/2} \\
u_1 &= [(a_{11} - s_1) \;\, a_{21} \,\cdots\, a_{n1}]' \\
w_1 &= \mu_1 u_1 \; .
\end{aligned}$$
Actually the sign of $s_1$ is free, but it is chosen to avoid the catastrophic cancellation that may otherwise appear in computing $\mu_1$. As $w_1'w_1 = 1$, we can let $H_1 = I - 2 w_1 w_1'$ and verify that $H_1 a_1 = [s_1 \; 0 \,\cdots\, 0]'$.
Computationally, it is more efficient to calculate the product H1A in the fol-
lowing manner
$$H_1 A = A - 2 w_1 w_1' A = A - 2 w_1 [\, w_1'a_1 \;\, w_1'a_2 \,\cdots\, w_1'a_n \,] \; ,$$
so the $i$-th column of $H_1 A$ is $a_i - 2(w_1'a_i) w_1 = a_i - (c_1\, u_1'a_i)\, u_1$ with $c_1 = 2\mu_1^2 = (s_1^2 - s_1 a_{11})^{-1}$.
We continue this process in a similar way on a matrix A where we have removed
the first row and column. The vectors w2 and u2 will now be of dimension
(n − 1) × 1 but we can complete them with zeros to build
$$H_2 = I - 2 \begin{bmatrix} 0 \\ w_2 \end{bmatrix} \begin{bmatrix} 0 & w_2' \end{bmatrix} \; .$$
After n − 1 steps, we have Hn−1 · · · H2H1A = R. As all the matrices Hi are
orthogonal, their product is orthogonal too and we get
A = QR ,
with $Q = (H_{n-1} \cdots H_1)' = H_1 \cdots H_{n-1}$. In practice, one will neither form the
vectors wi nor calculate the Q matrix as all the information is contained in the
ui vectors and the si scalars for i = 1, . . . , n.
The possibility of choosing the sign of s1 so that there never is a subtraction in the computation of µ1 is the key to the good numerical behavior of the QR
factorization. We notice that the computation of u1 also involves a subtraction.
It is possible to permute the column with the largest sum of squares below row
i − 1 into column i during the i-th step in order to minimize the risk of digit
cancellation. This then leads to a factorization
$$AP = QR \; ,$$
where $P$ is a permutation matrix recording the column exchanges.
Using this factorization of matrix $A$, it is easy to find the solution of the system $Ax = b$: we first compute $y = Q'b$, solve $Rz = y$ by back substitution, and recover $x = Pz$.
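A minimal sketch of the QR solve described above, using NumPy's QR factorization (no column permutation, so A = QR directly); the example system is ours.

    import numpy as np
    from scipy.linalg import solve_triangular

    A = np.array([[2.0, -1.0], [1.0, 3.0]])
    b = np.array([0.0, 7.0])
    Q, R = np.linalg.qr(A)                 # A = QR with Q orthogonal, R upper triangular
    y = Q.T @ b                            # y = Q'b
    x = solve_triangular(R, y)             # back substitution on Rx = y
    print(x)                               # here [1, 2]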
2.2.1 Computational Complexity
The computational complexity of the QR algorithm for a square matrix of order
n is $\frac{4}{3} n^3 + O(n^2)$. Hence the method is of $O(n^3)$ complexity.
2.2.2 Practical Implementation
Again as for the LU decomposition, the explicit computation of matrix Q is not
necessary as we may build vector y during the triangularization process. Only
the back substitution phase is needed to get the solution of the linear system
Ax = b.
As has already been mentioned, the routines for computing a QR factoriza-
tion (or solving a system via QR) are readily available in LAPACK and are
implemented in MATLAB.
2.3 Direct Methods for Sparse Matrices
In many cases, matrix A of the linear system contains numerous zero entries.
This is particularly true for linear systems derived from large macroeconometric
models. Such a situation may be exploited in order to organize the computations
in a way that involves only the nonzero elements. These techniques are known as
sparse direct methods (see e.g. Duff et al. [30]) and are crucial for the efficient solution of linear systems in a wide class of practical applications.
2.3.1 Data Structures and Storage Schemes
The interest of considering sparse structures is twofold: first, the information
can be stored in a much more compact way; second, the computations may
be performed avoiding redundant arithmetic operations involving zeros. These
two aspects are somehow conflicting as a compact storage scheme may involve
more time consuming addressing operations for performing the computations.
However, this conflict vanishes quickly when large problems are considered. In
order to define our idea more clearly, let us define the density of a matrix as
the ratio between its nonzero entries and its total number of entries. Generally,
when the size of the system gets larger, the density of the corresponding matrix
decreases. In other words, the larger the problem is, the sparser its structure
becomes.
Several storage structures exist for a same sparse matrix. There is no one
best data structure since the choice depends both on the data manipulations
the computations imply and on the computer architecture and/or language in
which these are implemented.
The following three data structures are generally used:
• coordinate scheme,
• list of successors (collection of sparse vectors),
• linked list.
The following example best illustrates these storage schemes. We consider the
5 × 5 sparse matrix
$$A = \begin{bmatrix}
0 & -2 & 0 & 0 & 0.5 \\
0 & 5 & 0 & 7 & 0 \\
0 & 0 & 1.7 & 0 & 6 \\
3.1 & 0 & 0 & -0.2 & 0 \\
0 & 0 & 1.2 & -3 & 0
\end{bmatrix} .$$
Coordinate Scheme
In this case, three arrays are used: two integer arrays for the row and column
indices—respectively r and c—and a real array x containing the elements.
For our example we have
r 4 1 2 3 5 2 4 5 1 3
c 1 2 2 3 3 4 4 4 5 5
x 3.1 −2 5 1.7 1.2 7 −0.2 −3 0.5 6
.
Each entry of A is represented by a triplet and corresponds to a column in
the table above. Such a storage scheme needs less memory than a full storage
if the density of A is less than 1/3. The insertion and deletion of elements are
easy to perform, whereas the direct access of elements is relatively complex.
Many computations in linear algebra involve successive scans of the columns of
a matrix which is difficult to carry out using this representation.
List of successors (Collection of Sparse Vectors)
With this storage scheme, the sparse matrix A is stored as the concatenation
of the sparse vectors representing its columns. Each sparse vector consists of a
real array containing the nonzero entries and an integer array of corresponding
row indices. A second integer array gives the locations in the other arrays of
the first element in each column.
For our matrix A, this representation is
index  1  2  3  4  5  6
h      1  2  4  6  9  11

index  1    2   3  4    5    6  7     8   9    10
ℓ      4    1   2  3    5    2  4     5   1    3
x      3.1  −2  5  1.7  1.2  7  −0.2  −3  0.5  6

The integer array h contains the addresses of the list of row elements in ℓ and x. For instance, the nonzero entries in column 4 of A are stored at positions h(4) = 6 to h(5) − 1 = 9 − 1 = 8 in x. Thus, the entries are x(6) = 7, x(7) = −0.2 and x(8) = −3. The row indices are given by the same locations in array ℓ, i.e. ℓ(6) = 2, ℓ(7) = 4 and ℓ(8) = 5.
MATLAB mainly uses this data structure to store its sparse matrices, see Gilbert
et al. [44]. The main advantage is that columns can be easily accessed, which
is very important for numerical linear algebra algorithms. The disadvantage
of such a representation is the difficulty of inserting new entries. This arises for
instance when adding a row to another.
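As an illustration, the example matrix A can be built in SciPy's compressed sparse column format, which is essentially this list-of-successors scheme with 0-based indices (a sketch assuming SciPy is available):

    import numpy as np
    from scipy.sparse import csc_matrix

    # Nonzeros of A stored column by column.
    data   = np.array([3.1, -2, 5, 1.7, 1.2, 7, -0.2, -3, 0.5, 6])
    rows   = np.array([3, 0, 1, 2, 4, 1, 3, 4, 0, 2])   # array "l" shifted to 0-based indices
    indptr = np.array([0, 1, 3, 5, 8, 10])              # array "h" shifted to 0-based indices
    A = csc_matrix((data, rows, indptr), shape=(5, 5))
    print(A.toarray())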
Linked List
The third alternative that is widely used for storing sparse matrices is the linked
list. Its particularity is that we define a pointer (named head) to the first entry
and each entry is associated to a pointer pointing to the next entry or to the
null pointer (named 0) for the last entry. If the matrix is stored by columns,
we start a new linked list for each column and therefore we have as many head
pointers as there are columns. Each entry is composed of two pieces: the row
index and the value of the entry itself.
This is represented schematically as follows: head 1 points to the single entry (4, 3.1) of column 1 and then to the null pointer, while head 5 points to the entry (1, 0.5), which points to (3, 6), which in turn ends with the null pointer; the remaining columns are chained in the same way.
The structure can be implemented as before with arrays and we get
index 1 2 3 4 5
head 4 5 9 1 7
index 1 2 3 4 5 6 7 8 9 10
row 2 4 5 4 1 2 1 3 3 5
entry 7 −0.2 −3 3.1 −2 5 0.5 6 1.7 1.2
link 2 3 0 0 6 0 8 0 10 0
For instance, to retrieve the elements of column 3, we begin to read head(3)=9.
Then row(9)=3 gives the row index, the entry value is entry(9)=1.7 and the
pointer link(9)=10 gives the next index address. The values row(10)=5, en-
try(10)=1.2 and link(10)=0 indicate that the element 1.2 is at row number 5
and is the last entry of the column.
The obvious advantage is the ease with which elements can be inserted and
deleted: the pointers are simply updated to take care of the modification. This
data structure is close to the list of successors representation, but does not
necessitate contiguous storage locations for the entries of a same column.
In practice it is often necessary to switch from one representation to another.
We can also note that the linked list and the list of successors can similarly be
defined row-wise rather than column wise.
2.3.2 Fill-in in Sparse LU
Given a storage scheme, one could think of executing a Gaussian elimination as
described in Section 2.1. However, by doing so we may discover that the sparsity
of our initial matrix A is lost and we may obtain relatively dense matrices L
and U.
Indeed, depending on the choice of the pivots, the number of entries in L and
U may vary. From Equation (2.2), we see that at step k of the Gaussian elimi-
nation algorithm, we subtract two matrices in order to zero the elements below
the diagonal of the k-th column. Depending on the Gauss vector τ(k)
, matrix
τ(k)
ekC may contain nonzero elements which do not exist in matrix C. This
creation of new elements is called fill-in.
A crucial problem is then to minimize the fill-in as the number of operations is
proportional to the density of the submatrix to be triangularized. Furthermore,
a dense matrix U will result in an expensive back substitution phase.
A minimum fill-in may however conflict with the pivoting strategy, i.e. the
pivot chosen to minimize the fill-in may not correspond to the element with
maximum magnitude among the elements below the k-th diagonal as defined
by Equation (2.3). A common tradeoff to limit the loss of numerical stability
of the sparse Gaussian elimination is to accept a pivot element satisfying the
following threshold inequality
$$|a^{(k)}_{kk}| \geq u \max_{i > k} |a^{(k)}_{ik}| \; ,$$
where u is the threshold parameter and belongs to (0, 1]. A choice for u suggested
by Duff, Erisman and Reid [30] is u = 0.1 . This parameter heavily influences
the fill-in and hence the complexity of the method.
2.3.3 Computational Complexity
It is not easy to establish an exact operation count for the sparse LU. The count
depends on the particular structure of matrix A and on the chosen pivoting
strategy. For a good implementation, we may expect a complexity of $O(c^2 n)$, where c is the average number of elements in a row and n is the order of matrix A.
2.3.4 Practical Implementation
A widely used code for the direct solution of sparse linear systems is the Harwell
MA28 code available on NETLIB, see Duff [29]. A new version called MA48 is
presented in Duff and Reid [31].
The software MATLAB has its own implementation using partial pivoting and
minimum-degree ordering for the columns to reduce fill-in, see Gilbert et al. [44]
and Gilbert and Peierls [45].
Other direct sparse solvers are also available through NETLIB (e.g. Y12MA,
UMFPACK, SuperLU, SPARSE).
2.4 Stationary Iterative Methods
Iterative methods form an important class of solution techniques for solving
large systems of equations. They can be an interesting alternative to direct
methods because they take into account the sparsity of the system and are
moreover easy to implement.
Iterative methods may be divided into two classes: stationary and nonstationary.
The former rely on invariant information from an iteration to another, whereas
the latter modify their search by using the results of previous iterations.
In this section, we present stationary iterative methods such as Jacobi, Gauss-
Seidel and SOR techniques.
The solution $x^*$ of the system $Ax = b$ can be approximated by replacing $A$ by a simpler nonsingular matrix $M$ and by rewriting the system as
$$Mx = (M - A)x + b \; .$$
In order to solve this equivalent system, we may use the following recurrence formula from a chosen starting point $x^{(0)}$,
$$M x^{(k+1)} = (M - A) x^{(k)} + b \; , \qquad k = 0, 1, 2, \ldots \; . \qquad (2.4)$$
At each step $k$ the system (2.4) has to be solved, but this task can be easy depending on the choice of $M$.
The convergence of the iterates to the solution is not guaranteed. However, if the sequence of iterates $\{x^{(k)}\}_{k=0,1,2,\ldots}$ converges to a limit $x^{(\infty)}$, then we have $x^{(\infty)} = x^*$, since relation (2.4) becomes $M x^{(\infty)} = (M - A) x^{(\infty)} + b$, that is $A x^{(\infty)} = b$.
The iterations should be carried out an infinite number of times to reach the
solution, but we usually obtain a good approximation of x∗
after a fairly small
number of iterations.
There is a tradeoff between the ease in computing x(k+1)
from (2.4) and the
speed of convergence of the stationary iterative method. The simplest choice
for M would be to take M = I and the fastest convergence would be obtained
by setting M = A. Of course, the choices of M that are of interest to us lie
between these two extreme cases.
Let us split the original system matrix A into
A = L + D + U ,
where D is the diagonal of matrix A and L and U are the strictly lower and
upper triangular parts of A, defined respectively by dii = aii for all i, lij = aij
for i > j and uij = aij for i < j.
2.4.1 Jacobi Method
One of the simplest iterative procedures is the Jacobi method, which is found
by setting M = D. If we assume that the diagonal elements of A are nonzero,
then solving the system Dx(k+1)
= c for x(k+1)
is easy; otherwise, we need to
permute the equations to find such a matrix D. We can note that when the
model is normalized, we have D = I and the iterations are further simplified.
The sequence of Jacobi’s iterates is defined in matrix form by
$$D x^{(k+1)} = -(L + U) x^{(k)} + b \; , \qquad k = 0, 1, 2, \ldots \; ,$$
or by
Algorithm 1 Jacobi Method
Given a starting point $x^{(0)} \in \mathbb{R}^n$
for k = 0, 1, 2, . . . until convergence
    for i = 1, . . . , n
        $x^{(k+1)}_i = \bigl(b_i - \sum_{j \neq i} a_{ij} x^{(k)}_j\bigr) / a_{ii}$
end
end
In this method, all the entries of the vector x(k+1)
are computed using only the
entries of x(k)
. Hence, two separate vectors must be stored to carry out the
iterations.
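A minimal dense NumPy sketch of Algorithm 1; the function name, the test problem and the stopping rule (which anticipates Section 2.13) are ours.

    import numpy as np

    def jacobi(A, b, x0, tol=1e-8, max_iter=500):
        # Jacobi iteration: x_i <- (b_i - sum_{j != i} a_ij x_j) / a_ii.
        x = x0.astype(float).copy()
        d = np.diag(A)
        R = A - np.diagflat(d)              # off-diagonal part L + U
        for _ in range(max_iter):
            x_new = (b - R @ x) / d
            if np.max(np.abs(x_new - x)) < tol:
                return x_new
            x = x_new
        return x

    A = np.array([[4.0, 1.0], [2.0, 5.0]])
    b = np.array([5.0, 7.0])
    print(jacobi(A, b, np.zeros(2)))        # converges to approximately [1, 1]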
2.4.2 Gauss-Seidel Method
In the Gauss-Seidel method (GS), we use the most recently available information
to update the iterates. In this case, the i-th component of $x^{(k+1)}$ is computed using the $(i-1)$ first entries of $x^{(k+1)}$ that have already been obtained and the $(n-i)$ remaining entries from $x^{(k)}$.
This process amounts to using M = L + D and leads to the formula
$$(L + D) x^{(k+1)} = -U x^{(k)} + b \; ,$$
or to the following algorithm:
Algorithm 2 Gauss-Seidel Method
Given a starting point $x^{(0)} \in \mathbb{R}^n$
for k = 0, 1, 2, . . . until convergence
    for i = 1, . . . , n
        $x^{(k+1)}_i = \bigl(b_i - \sum_{j < i} a_{ij} x^{(k+1)}_j - \sum_{j > i} a_{ij} x^{(k)}_j\bigr) / a_{ii}$
end
end
The matrix formulation of the iterations is useful for theoretical purposes, but
the actual computation will generally be implemented component-wise as in
Algorithm 1 and Algorithm 2.
2.4.3 Successive Overrelaxation Method
A third useful technique called SOR for Successive Overrelaxation method is
very closely related to the Gauss-Seidel method. The update is computed as an
extrapolation of the Gauss-Seidel step as follows: let $x^{(k+1)}_{GS}$ denote the $(k+1)$ iterate for the GS method; the new iterates can then be written as in the next
algorithm.
Algorithm 3 Successive Overrelaxation Method
Given a starting point $x^{(0)} \in \mathbb{R}^n$
for k = 0, 1, 2, . . . until convergence
    Compute $x^{(k+1)}_{GS}$ by Algorithm 2
    for i = 1, . . . , n
        $x^{(k+1)}_i = x^{(k)}_i + \omega \bigl(x^{(k+1)}_{GS,i} - x^{(k)}_i\bigr)$
end
end
The scalar ω is called the relaxation parameter and its optimal value, in order to
achieve the fastest convergence, depends on the characteristics of the problem
in question. A necessary condition for the method to converge is that ω lies in the open interval (0, 2). When ω < 1, the GS step is dampened and this is sometimes
referred to as under-relaxation.
In matrix form, the SOR iteration is defined by
$$(\omega L + D) x^{(k+1)} = ((1 - \omega) D - \omega U) x^{(k)} + \omega b \; , \qquad k = 0, 1, 2, \ldots \; . \qquad (2.5)$$
When ω is unity, the SOR method collapses to GS.
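A minimal component-wise NumPy sketch of the SOR iteration (2.5); setting omega = 1.0 reduces it to the Gauss-Seidel method of Algorithm 2. The function name, test problem and stopping rule are ours.

    import numpy as np

    def sor(A, b, x0, omega=1.0, tol=1e-8, max_iter=500):
        # One sweep updates x_i using new values for j < i and old values for j > i,
        # then relaxes the Gauss-Seidel step by the factor omega.
        x = x0.astype(float).copy()
        n = len(b)
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                gs = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x_old[i+1:]) / A[i, i]
                x[i] = x_old[i] + omega * (gs - x_old[i])
            if np.max(np.abs(x - x_old)) < tol:
                return x
        return x

    A = np.array([[4.0, 1.0], [2.0, 5.0]])
    b = np.array([5.0, 7.0])
    print(sor(A, b, np.zeros(2), omega=1.0))   # Gauss-Seidel; approximately [1, 1]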
2.4.4 Fast Gauss-Seidel Method
The idea of extrapolating the step size to improve the speed of convergence
can also be applied to SOR iterates and gives rise to the Fast Gauss-Seidel
method (FGS) or Accelerated Over Relaxation method, see Hughes Hallett [68]
and Hadjidimos [59].
Let us denote by $x^{(k+1)}_{SOR}$ the $(k+1)$ iterate obtained by Equation (2.5); then the FGS iterates are defined by
Algorithm 4 FGS Method
Given a starting point $x^{(0)} \in \mathbb{R}^n$
for k = 0, 1, 2, . . . until convergence
    Compute $x^{(k+1)}_{SOR}$ by Algorithm 3
    for i = 1, . . . , n
        $x^{(k+1)}_i = x^{(k)}_i + \gamma \bigl(x^{(k+1)}_{SOR,i} - x^{(k)}_i\bigr)$
end
end
This method may be seen as a second-order method, since it uses a SOR iterate
as an intermediate step to compute its next guess, and the SOR step itself already uses the information from a GS step. It is easy to see that when γ = 1, we find
the SOR method.
Like ω in the SOR part, the choice of the value for γ is not straightforward.
For some problems, the optimal choice of ω can be explicitly found (this is dis-
cussed in Hageman and Young [60]). However, it cannot be determined a priori
for general matrices. There is no way of computing the optimal value for γ
cheaply and some authors (e.g. Hughes Hallett [69], Yeyios [103]) offered ap-
proximations of γ. However, numerical tests produced variable outcomes: some-
times the approximation gave good convergence rates, sometimes poor ones, see
Hughes-Hallett [69]. As for the ω parameter, the value of γ is usually chosen by experimentation, according to the characteristics of the system at hand.
2.4.5 Block Iterative Methods
Certain problems can naturally be decomposed into a set of subproblems with
more or less tight linkages.4
In economic analysis, this is particularly true for
multi-country macroeconometric models where the different country models are
linked together by a relatively small number of trade relations for example (see
Faust and Tryon [35]). Another such situation is the case of disaggregated
multi-sectorial models where the links between the sectors are relatively weak.
In other problems where such a decomposition does not follow from the con-
struction of the system, one may resort to a partition where the subsystems are
easier to solve.
A block iterative method is then a technique where one iterates over the sub-
systems. The technique to solve the subsystem is free and not relevant for the
4The original problem is supposed to be indecomposable in the sense described in Sec-
tion 3.1.
discussion.
Let us suppose the matrix of our system is partitioned in the form
$$A = \begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2N} \\
\vdots & \vdots & & \vdots \\
A_{N1} & A_{N2} & \cdots & A_{NN}
\end{bmatrix}$$
where the diagonal blocks Aii i = 1, 2, . . ., N are square. We define the block
diagonal matrix D, the block lower triangular matrix L and the block upper
triangular matrix U such that A = D + L + U:
$$D = \begin{bmatrix}
A_{11} & 0 & \cdots & 0 \\
0 & A_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A_{NN}
\end{bmatrix}, \quad
L = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
A_{21} & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
A_{N1} & A_{N2} & \cdots & 0
\end{bmatrix}, \quad
U = \begin{bmatrix}
0 & A_{12} & \cdots & A_{1N} \\
0 & 0 & \cdots & A_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix} .$$
If we write the problem Ay = b under the same partitioned form, we have
$$\begin{bmatrix}
A_{11} & \cdots & A_{1N} \\
\vdots & & \vdots \\
A_{N1} & \cdots & A_{NN}
\end{bmatrix}
\begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}
=
\begin{bmatrix} b_1 \\ \vdots \\ b_N \end{bmatrix}
\quad\text{or else}\quad
\sum_{j=1}^{N} A_{ij}\, y_j = b_i \; , \quad i = 1, 2, \ldots, N \; .$$
Suppose the Aii i = 1, 2, . . ., N are nonsingular, then the following solution
scheme may be applied:
Algorithm 5 Block Jacobi method (BJ)
Given a starting point $y^{(0)} \in \mathbb{R}^n$
for k = 0, 1, 2, . . . until convergence
    Solve for $y^{(k+1)}_i$:
        $A_{ii}\, y^{(k+1)}_i = b_i - \sum_{\substack{j=1 \\ j \neq i}}^{N} A_{ij}\, y^{(k)}_j \; , \quad i = 1, 2, \ldots, N$
end
As we only use the information of step k to compute $y^{(k+1)}_i$, this scheme is called
a block iterative Jacobi method (BJ).
We can certainly use the most recent available information on the y’s for up-
dating y(k+1)
and this leads to the block Gauss-Seidel method (BGS):
Algorithm 6 Block Gauss-Seidel method (BGS)
Given a starting point $y^{(0)} \in \mathbb{R}^n$
for k = 0, 1, 2, . . . until convergence
    Solve for $y^{(k+1)}_i$:
        $A_{ii}\, y^{(k+1)}_i = b_i - \sum_{j=1}^{i-1} A_{ij}\, y^{(k+1)}_j - \sum_{j=i+1}^{N} A_{ij}\, y^{(k)}_j \; , \quad i = 1, 2, \ldots, N$
end
Similarly to the presentation in Section 2.4.3, the SOR option can also be applied
as follows:
Algorithm 7 Block successive over relaxation method (BSOR)
Given a starting point $y^{(0)} \in \mathbb{R}^n$
for k = 0, 1, 2, . . . until convergence
    Solve for $y^{(k+1)}_i$:
        $A_{ii}\, y^{(k+1)}_i = A_{ii}\, y^{(k)}_i + \omega \Bigl( b_i - \sum_{j=1}^{i-1} A_{ij}\, y^{(k+1)}_j - \sum_{j=i+1}^{N} A_{ij}\, y^{(k)}_j - A_{ii}\, y^{(k)}_i \Bigr) \; , \quad i = 1, 2, \ldots, N$
end
We assume that the systems Aii yi = ci can be solved by either direct or iterative
methods.
The interest of such block methods is to offer possibilities of splitting the problem
in order to solve one piece at a time. This is useful when the size of the problem
is such that it cannot entirely fit in the memory of the computer. Parallel
computing also allows taking advantage of a block Jacobi implementation, since
different processors can simultaneously take care of different subproblems and
thus speed up the solution process, see Faust and Tryon [35].
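A rough NumPy sketch of the block Gauss-Seidel scheme (Algorithm 6) for a matrix partitioned into diagonal blocks of given sizes, each block system being solved directly; the partition and the test matrix are hypothetical.

    import numpy as np

    def block_gauss_seidel(A, b, sizes, y0, tol=1e-8, max_iter=200):
        # Sweep over the diagonal blocks A_ii, solving each block system with the
        # most recent values of the other blocks (block Gauss-Seidel).
        starts = np.cumsum([0] + list(sizes))
        blocks = [slice(starts[i], starts[i + 1]) for i in range(len(sizes))]
        y = y0.astype(float).copy()
        for _ in range(max_iter):
            y_old = y.copy()
            for blk in blocks:
                rhs = b[blk] - A[blk, :] @ y + A[blk, blk] @ y[blk]   # b_i - sum_{j != i} A_ij y_j
                y[blk] = np.linalg.solve(A[blk, blk], rhs)
            if np.max(np.abs(y - y_old)) < tol:
                return y
        return y

    A = np.array([[4.0, 1.0, 0.5, 0.0],
                  [1.0, 3.0, 0.0, 0.5],
                  [0.5, 0.0, 5.0, 1.0],
                  [0.0, 0.5, 1.0, 4.0]])
    b = A @ np.ones(4)                        # exact solution is a vector of ones
    print(block_gauss_seidel(A, b, sizes=(2, 2), y0=np.zeros(4)))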
2.4.6 Convergence
Let us now study the convergence of the stationary iterative techniques intro-
duced in the last section.
The error at iteration $k$ is defined by $e^{(k)} = x^{(k)} - x^*$, and subtracting Equation (2.4) evaluated at $x^*$ from the same equation evaluated at $x^{(k)}$, we get
$$M e^{(k)} = (M - A) e^{(k-1)} \; .$$
We can now relate $e^{(k)}$ to $e^{(0)}$ by writing
$$e^{(k)} = B e^{(k-1)} = B^2 e^{(k-2)} = \cdots = B^k e^{(0)} \; ,$$
where $B$ is the matrix defined as $M^{-1}(M - A)$. Clearly, the convergence of $\{x^{(k)}\}_{k=0,1,2,\ldots}$ to $x^*$ depends on the powers of matrix $B$: if $\lim_{k \to \infty} B^k = 0$, then $\lim_{k \to \infty} x^{(k)} = x^*$. It is not difficult to show that
$$\lim_{k \to \infty} B^k = 0 \iff |\lambda_i| < 1 \quad \forall i \; .$$
Indeed, if $B = P J P^{-1}$ where $J$ is the Jordan canonical form of $B$, then $B^k = P J^k P^{-1}$ and $\lim_{k \to \infty} B^k = 0$ if and only if $\lim_{k \to \infty} J^k = 0$. The matrix $J$ is formed of Jordan blocks $J_i$, and we see that the $k$-th power (for $k$ larger than the size of the block) of $J_i$ is
$$(J_i)^k = \begin{bmatrix}
\lambda_i^k & k \lambda_i^{k-1} & \binom{k}{2} \lambda_i^{k-2} & \cdots & \binom{k}{n-1} \lambda_i^{k-n+1} \\
 & \lambda_i^k & k \lambda_i^{k-1} & \ddots & \vdots \\
 & & \ddots & \ddots & \binom{k}{2} \lambda_i^{k-2} \\
 & & & \lambda_i^k & k \lambda_i^{k-1} \\
 & & & & \lambda_i^k
\end{bmatrix} ,$$
and therefore that the powers of J tend to zero if and only if |λi| < 1 for all i.
We can write the different matrices governing the convergence for each station-
ary iterative method as follows:
$$\begin{aligned}
B_J &= -D^{-1}(L + U) && \text{for Jacobi's method,} \\
B_{GS} &= -(L + D)^{-1} U && \text{for Gauss-Seidel,} \\
B_\omega &= (\omega L + D)^{-1} ((1 - \omega) D - \omega U) && \text{for SOR.}
\end{aligned}$$
Therefore, the speed of convergence of such methods depends on the spectral
radius of B, denoted by ρ(B) = maxi |λi| where λi stands for the i-th eigenvalue
of matrix B. The FGS method converges for some γ > 0, if the real part of the
eigenvalues of the matrix Bω is less than unity.
Given that the method converges, i.e. that ρ(B) < 1, the number of iterations
is approximately
$$\frac{\log \epsilon}{\log \rho(B)} \; ,$$
with a convergence criterion5 expressed as
$$\max_i \frac{|x^{(k)}_i - x^{(k-1)}_i|}{|x^{(k-1)}_i|} < \epsilon \; .$$
Hence, to minimize the number of iterations, we seek a splitting of matrix A
and parameters that yield a matrix B with the lowest possible spectral radius.
Different row-column permutations of A influence ρ(B) when GS, SOR and FGS
methods are applied, whereas the Jacobi method is invariant to such permutations.
These issues are discussed in more detail in Section 3.3. For matrices without
special structure, these problems do not have a practical solution so far.
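As an illustration, a small NumPy sketch that computes ρ(B) for the Jacobi and Gauss-Seidel splittings of a given matrix, together with the corresponding iteration estimate for a hypothetical tolerance ε = 1e-6:

    import numpy as np

    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 5.0]])
    D = np.diag(np.diag(A))
    L = np.tril(A, -1)
    U = np.triu(A, 1)

    B_J  = -np.linalg.solve(D, L + U)        # Jacobi iteration matrix
    B_GS = -np.linalg.solve(L + D, U)        # Gauss-Seidel iteration matrix

    eps = 1e-6
    for name, B in [("Jacobi", B_J), ("Gauss-Seidel", B_GS)]:
        rho = max(abs(np.linalg.eigvals(B)))
        print(name, rho, np.log(eps) / np.log(rho))   # spectral radius, iteration estimate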
2.4.7 Computational Complexity
The number of elementary operations for an iteration of Jacobi or Gauss-Seidel
is (2c + 1)n where c is the average number of elements in a row of A. For SOR,
the count is (2c + 4)n and for FGS (2c + 7)n.
Therefore, iterative methods become competitive with sparse direct methods if
the number of iterations K needed to converge is of order c or less.
5See Section 2.13 for a discussion of stopping criteria.
2.5 Nonstationary Iterative Methods
Nonstationary methods have been more recently developed. They use infor-
mation that changes from iteration to iteration unlike the stationary methods
discussed in Section 2.4. These methods are computationally attractive as the
operations involved can easily be executed on sparse matrices and also require little storage. They also generally show a better convergence speed than station-
ary iterative methods. Presentations of nonstationary iterative methods can
be found for instance in Freund et al. [39], Barrett et al. [8], Axelsson [7] and
Kelley [73].
First, we present some algorithms that solve particular systems, such as symmetric positive definite ones, from which the nonstationary iterative methods for solving the general linear systems we are interested in were derived.
2.5.1 Conjugate Gradient
The first and perhaps best known of the nonstationary methods is the Con-
jugate Gradient (CG) method proposed by Hestenes and Stiefel [64]. This
technique solves symmetric positive definite systems Ax = b by using only
matrix-vector products, inner products and vector updates. The method may
also be interpreted as arising from the minimization of the quadratic function
$q(x) = \frac{1}{2} x'Ax - x'b$, where $A$ is the symmetric positive definite matrix and $b$ the
right-hand side of the system. As the first order conditions for the minimization
of q(x) give the original system, the two approaches are equivalent.
The idea of the CG method is to update the iterates $x^{(i)}$ in the direction $p^{(i)}$ and to compute the residuals $r^{(i)} = b - A x^{(i)}$ in such a way as to ensure that we achieve the largest decrease in terms of the objective function $q$ and, furthermore, that the direction vectors $p^{(i)}$ are A-orthogonal.
The largest decrease in q at x(0)
is obtained by choosing an update in the
direction −Dq(x(0)
) = b−Ax(0)
. We see that the direction of maximum decrease
is the residual of x(0)
defined by r(0)
= b − Ax(0)
. We can look for the optimum
step length in the direction r(0)
by solving the line search problem
min
α
q(x(0)
+ αr(0)
) .
As the derivative with respect to α is
D_α q(x^{(0)} + α r^{(0)}) = x^{(0)'} A r^{(0)} + α r^{(0)'} A r^{(0)} − b' r^{(0)}
= (x^{(0)'} A − b') r^{(0)} + α r^{(0)'} A r^{(0)}
= −r^{(0)'} r^{(0)} + α r^{(0)'} A r^{(0)} ,
the optimal α is
α_0 = r^{(0)'} r^{(0)} / (r^{(0)'} A r^{(0)}) .
The method described up to now is just a steepest descent algorithm with exact line search on q. To avoid the convergence problems which are likely to arise with this technique, it is further imposed that the update directions p^{(i)} be A-orthogonal (or conjugate with respect to A), in other words that we have
p^{(i)'} A p^{(j)} = 0   for i ≠ j .   (2.6)
It is therefore natural to choose a direction p^{(i)} that is closest to r^{(i−1)} and satisfies Equation (2.6). It is possible to show that explicit formulas for such a p^{(i)} can be found, see e.g. Golub and Van Loan [56, pp. 520–523]. These solutions can be expressed in a computationally efficient way involving only one matrix-vector multiplication per iteration.
The CG method can be formalized as follows:
Algorithm 8 Conjugate Gradient
Compute r^{(0)} = b − Ax^{(0)} for some initial guess x^{(0)}
for i = 1, 2, . . . until convergence
    ρ_{i−1} = r^{(i−1)'} r^{(i−1)}
    if i = 1 then
        p^{(1)} = r^{(0)}
    else
        β_{i−1} = ρ_{i−1}/ρ_{i−2}
        p^{(i)} = r^{(i−1)} + β_{i−1} p^{(i−1)}
    end
    q^{(i)} = A p^{(i)}
    α_i = ρ_{i−1}/(p^{(i)'} q^{(i)})
    x^{(i)} = x^{(i−1)} + α_i p^{(i)}
    r^{(i)} = r^{(i−1)} − α_i q^{(i)}
end
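As an illustration, a minimal NumPy transcription of Algorithm 8 could read as follows (my own sketch, not the thesis's code; the stopping rule on the relative residual norm and the iteration cap are arbitrary choices).

import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    # Algorithm 8 for a symmetric positive definite matrix A.
    n = b.size
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    r = b - A @ x                              # r^(0) = b - A x^(0)
    max_iter = n if max_iter is None else max_iter
    rho_prev = None
    for i in range(1, max_iter + 1):
        rho = r @ r                            # rho_{i-1} = r' r
        if i == 1:
            p = r.copy()
        else:
            beta = rho / rho_prev
            p = r + beta * p
        q = A @ p
        alpha = rho / (p @ q)
        x = x + alpha * p
        r = r - alpha * q
        rho_prev = rho
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
    return x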
In the conjugate gradient method, the i-th iterate x^{(i)} can be shown to be the vector minimizing (x^{(i)} − x*)' A (x^{(i)} − x*) among all vectors in the affine subspace x^{(0)} + span{r^{(0)}, Ar^{(0)}, . . . , A^{i−1} r^{(0)}}. This subspace is called the Krylov subspace.
Convergence of the CG Method
In exact arithmetic, the CG method yields the solution in at most n iterations, see Luenberger [78, p. 248, Theorem 2]. In particular we have the following relation for the error in the k-th CG iteration
$$\|x^{(k)} - x^*\|_2 \le 2\sqrt{\kappa}\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^k \|x^{(0)} - x^*\|_2 \; ,$$
where κ = κ_2(A), the condition number of A in the two norm. However, in finite precision and with a large κ, the method may fail to converge.
2.5.2 Preconditioning
As explained above, the convergence speed of the CG method is linked to the condition number of the matrix A. To improve the convergence speed of CG-type methods, the matrix A is often preconditioned, that is transformed into Â = SAS', where S is a nonsingular matrix. The system solved is then Âx̂ = b̂, where x̂ = (S')^{−1}x and b̂ = Sb. The matrix S is chosen so that the condition number of matrix Â is smaller than the condition number of the original matrix A, which speeds up the convergence.
To avoid the explicit computation of Â and the destruction of the sparsity pattern of A, the methods are usually formalized in order to use the original matrix A directly. We can build a preconditioner
M = (S'S)^{−1}
and apply the preconditioning step by solving the system M r̃ = r. Since κ_2(S'Â(S')^{−1}) = κ_2(S'SA) = κ_2(M^{−1}A), we do not actually form M from S but rather directly choose a matrix M. The choice of M is constrained to being a symmetric positive definite matrix.
The preconditioned version of the CG is described in the following algorithm.
Algorithm 9 Preconditioned Conjugate Gradient
Compute r^{(0)} = b − Ax^{(0)} for some initial guess x^{(0)}
for i = 1, 2, . . . until convergence
    Solve M r̃^{(i−1)} = r^{(i−1)}
    ρ_{i−1} = r^{(i−1)'} r̃^{(i−1)}
    if i = 1 then
        p^{(1)} = r̃^{(0)}
    else
        β_{i−1} = ρ_{i−1}/ρ_{i−2}
        p^{(i)} = r̃^{(i−1)} + β_{i−1} p^{(i−1)}
    end
    q^{(i)} = A p^{(i)}
    α_i = ρ_{i−1}/(p^{(i)'} q^{(i)})
    x^{(i)} = x^{(i−1)} + α_i p^{(i)}
    r^{(i)} = r^{(i−1)} − α_i q^{(i)}
end
As the preconditioning speeds up the convergence, the question of how to choose
a good preconditioner naturally arises. There are two conflicting goals in the
choice of M. First, M should reduce the condition number of the system solved
as much as possible. To achieve this, we would like to choose an M as close
to matrix A as possible. Second, since the system M ˜r = r has to be solved
at each iteration of the algorithm, this system should be as easy as possible
to solve. Clearly, the preconditioner will be chosen between the two extreme
cases M = A and M = I. When M = I, we obtain the unpreconditioned
version of the method, and when M = A, the complete system is solved in
the preconditioning step. One possibility is to take M = diag(a11, . . . , ann). This is not useful if the system is normalized, as is sometimes the case for macroeconometric systems.
Other preconditioning methods do not explicitly construct M. Some authors,
for instance Dubois et al. [28] and Adams [1], suggest to take a given number
of steps of an iterative method such as Jacobi. We can note that taking one
step of Jacobi amounts to doing a diagonal scaling M = diag(a11, . . . , ann), as
mentioned above.
Another common approach is to perform an incomplete LU factorization (ILU)
of matrix A. This method is similar to the LU factorization except that it
respects the pattern of nonzero elements of A in the lower triangular part of
L and the upper triangular part of U. In other words, we apply the following
algorithm:
Algorithm 10 Incomplete LU factorization
Set L = I_n    (the identity matrix of order n)
for k = 1, . . . , n
    for i = k + 1, . . . , n
        if a_ik = 0 then    (respect the sparsity pattern of A)
            ℓ_ik = 0
        else
            ℓ_ik = a_ik/a_kk
            for j = k + 1, . . . , n
                if a_ij ≠ 0 then    (respect the sparsity pattern of A)
                    a_ij = a_ij − ℓ_ik a_kj    (Gaussian elimination)
                end
            end
        end
    end
end
Set U = upper triangular part of A
This factorization can be written as A = LU + R, where R is a matrix containing the fill-in elements of L and U and is not actually computed. The approximate system LU r̃ = r is then solved using forward and backward substitution in the preconditioning step of the nonstationary method used.
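A dense-matrix sketch of Algorithm 10 in Python is given below; it is only an illustration (a practical implementation would of course operate on a sparse storage scheme as described in Section 2.3).

import numpy as np

def ilu0(A):
    # Incomplete LU factorization restricted to the sparsity pattern of A (Algorithm 10).
    A = A.astype(float).copy()
    n = A.shape[0]
    pattern = A != 0                  # nonzero pattern of the original matrix
    L = np.eye(n)
    for k in range(n):
        for i in range(k + 1, n):
            if not pattern[i, k]:
                continue              # respect the pattern: l_ik stays zero
            L[i, k] = A[i, k] / A[k, k]
            for j in range(k + 1, n):
                if pattern[i, j]:     # update only entries inside the pattern
                    A[i, j] -= L[i, k] * A[k, j]
    U = np.triu(A)
    return L, U                       # A ~ L U, fill-in (the matrix R) is discarded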
A more detailed analysis of preconditioning and other incomplete factorizations
may be found in Axelsson [7].
2.5.3 Conjugate Gradient Normal Equations
In order to deal with nonsymmetric systems, it is necessary either to convert the
original system into a symmetric positive definite equivalent one, or to generalize
the CG method. The next sections discuss these possibilities.
The first approach, and perhaps the easiest, is to transform Ax = b into a symmetric positive definite system by multiplying the original system by A'. As A is assumed to be nonsingular, A'A is symmetric positive definite and the CG algorithm can be applied to A'Ax = A'b. This method is known as the Conjugate Gradient Normal Equation (CGNE) method.
A somewhat similar approach is to solve AA'y = b by the CG method and then to compute x = A'y. The difference between the two approaches is discussed in Golub and Ortega [55, pp. 397ff].
Besides the computation of the matrix-matrix and matrix-vector products, these methods have the disadvantage of increasing the condition number of the system solved since κ_2(A'A) = (κ_2(A))^2. This in turn increases the number of iterations of the method, see Barrett et al. [8, p. 16] and Golub and Van Loan [56].
However, since the transformation and the coding are easy to implement, the
method might be appealing in certain circumstances.
2.5.4 Generalized Minimal Residual
Paige and Saunders [86] proposed a variant of the CG method that minimizes the residual r = b − Ax in the 2-norm. It only requires the system to be symmetric, not necessarily positive definite. It can also be extended to unsymmetric systems if some more information is kept from step to step. This method is called GMRES (Generalized Minimal Residual) and was introduced by Saad and Schultz [90].
The difficulty is not to lose the orthogonality property of the direction vectors p^{(i)}. To achieve this goal, all previously generated vectors have to be kept in order to build a set of orthogonal directions, using for instance a modified Gram-Schmidt orthogonalization process.
However, this method requires the storage and computation of an increasing
amount of information. Thus, in practice, the algorithm is very limited because
of its prohibitive cost.
To overcome these difficulties, the method may be restarted after a chosen
number of iterations m; the information is erased and the current intermediate
results are used as a new starting point. The choice of m is critically important
for the restarted version of the method, usually referred to as GMRES(m).
The pseudo-code for this method is given hereafter.
Algorithm 11 Preconditioned GMRES(m)
Choose an initial guess x^{(0)} and initialize an (m + 1) × m matrix H̄_m to h_ij = 0
for k = 1, 2, . . . until convergence
    Solve for r^{(k−1)}:  M r^{(k−1)} = b − Ax^{(k−1)}
    β = ‖r^{(k−1)}‖_2 ; v^{(1)} = r^{(k−1)}/β ; q = m
    for j = 1, . . . , m
        Solve for w:  M w = A v^{(j)}
        for i = 1, . . . , j    (orthonormal basis by modified Gram-Schmidt)
            h_ij = w' v^{(i)}
            w = w − h_ij v^{(i)}
        end
        h_{j+1,j} = ‖w‖_2
        if h_{j+1,j} is sufficiently small then
            q = j
            exit from loop on j
        end
        v^{(j+1)} = w/h_{j+1,j}
    end
    V_m = [v^{(1)} . . . v^{(q)}]
    y_m = argmin_y ‖β e_1 − H̄_m y‖_2    (use the method given below to compute y_m)
    x^{(k)} = x^{(k−1)} + V_m y_m    (update the approximate solution)
end
Apply Givens rotations to triangularize H̄_m in order to solve the least-squares problem involving the upper Hessenberg matrix H̄_m:
d = β e_1    (e_1 is [1 0 . . . 0]')
for i = 1, . . . , q    (compute the sine and cosine values of the rotation)
    if h_ii = 0 then
        c = 1 ; s = 0
    else
        if |h_{i+1,i}| > |h_ii| then
            t = −h_ii/h_{i+1,i} ; s = 1/√(1 + t²) ; c = s t
        else
            t = −h_{i+1,i}/h_ii ; c = 1/√(1 + t²) ; s = c t
        end
    end
    t = c d_i ; d_{i+1} = −s d_i ; d_i = t
    h_ii = c h_ii − s h_{i+1,i} ; h_{i+1,i} = 0
    for j = i + 1, . . . , m    (apply the rotation to zero the subdiagonal of H̄_m)
        t_1 = h_ij ; t_2 = h_{i+1,j}
        h_ij = c t_1 − s t_2 ; h_{i+1,j} = s t_1 + c t_2
    end
end
Solve the triangular system H̄_m y_m = d by back substitution.
Another issue with GMRES is the use of the modified Gram-Schmidt method
which is fast but not very reliable, see Golub and Van Loan [56, p. 219]. For
ill-conditioned systems, a Householder orthogonalization process is certainly
a better alternative, even if it leads to an increase in the complexity of the
algorithm.
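In practice one rarely codes GMRES(m) from scratch. Assuming SciPy's sparse solver interface (an assumption of this sketch, not something used in the thesis; note also that its spilu routine is a drop-tolerance incomplete factorization rather than the ILU(0) of Algorithm 10), a restarted, preconditioned run might look like this.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 500                                            # random test system standing in for a model
A = sp.rand(n, n, density=0.01, format="csc") + 4.0 * sp.identity(n, format="csc")
b = np.ones(n)

ilu = spla.spilu(A)                                # incomplete LU preconditioner
M = spla.LinearOperator((n, n), matvec=ilu.solve)  # applies an approximation of A^{-1}

x, info = spla.gmres(A, b, M=M, restart=20, maxiter=200)
print("converged" if info == 0 else "info = %d" % info)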
Convergence of GMRES
The convergence properties of GMRES(m) are given in the original paper which
introduces the method, see Saad and Schultz [90].
A necessary and sufficient condition for GMRES(m) to converge appears in the
results of recent research, see Strikwerda and Stodder [95]:
Theorem 3 A necessary and sufficient condition for GMRES(m) to converge is that the set of vectors
V_m = {v | v' A^j v = 0 for 1 ≤ j ≤ m}
contains only the vector 0.
Specifically, it follows that for a symmetric or skew-symmetric matrix A, GM-
RES(2) converges.
Another important result stated in [95] is that, if GMRES(m) converges, it does
so with a geometric rate of convergence:
Theorem 4 If r^{(k)} is the residual after k steps of GMRES(m), then
$$\|r^{(k)}\|_2^2 \le (1 - \rho_m)^k \, \|r^{(0)}\|_2^2$$
where
$$\rho_m = \min_{\|v^{(1)}\|=1} \frac{\sum_{j=1}^{m} \left(v^{(1)'} A v^{(j)}\right)^2}{\sum_{j=1}^{m} \|A v^{(j)}\|_2^2}$$
and the vectors v^{(j)} are the unit vectors generated by GMRES(m).
Similar conditions and rate of convergence estimates are also given for the pre-
conditioned version of GMRES(m).
2.5.5 BiConjugate Gradient Method
The BiConjugate Gradient method (BiCG) takes a different approach based upon generating two mutually orthogonal sequences of residual vectors {r̃^{(i)}} and {r^{(j)}} and A-orthogonal sequences of direction vectors {p̃^{(i)}} and {p^{(j)}}. The interpretation in terms of the minimization of the residuals r^{(i)} is lost. The updates for the residuals and for the direction vectors are similar to those of the CG method but are performed not only using A but also A'. The scalars α_i and β_i ensure the bi-orthogonality conditions r̃^{(i)'} r^{(j)} = p̃^{(i)'} A p^{(j)} = 0 if i ≠ j.
The algorithm for the Preconditioned BiConjugate Gradient method is given
hereafter.
Algorithm 12 Preconditioned BiConjugate Gradient
Compute r^{(0)} = b − Ax^{(0)} for some initial guess x^{(0)}
Set r̃^{(0)} = r^{(0)}
for i = 1, 2, . . . until convergence
    Solve M z^{(i−1)} = r^{(i−1)}
    Solve M' z̃^{(i−1)} = r̃^{(i−1)}
    ρ_{i−1} = z^{(i−1)'} r̃^{(i−1)}
    if ρ_{i−1} = 0 then the method fails
    if i = 1 then
        p^{(i)} = z^{(i−1)}
        p̃^{(i)} = z̃^{(i−1)}
    else
        β_{i−1} = ρ_{i−1}/ρ_{i−2}
        p^{(i)} = z^{(i−1)} + β_{i−1} p^{(i−1)}
        p̃^{(i)} = z̃^{(i−1)} + β_{i−1} p̃^{(i−1)}
    end
    q^{(i)} = A p^{(i)}
    q̃^{(i)} = A' p̃^{(i)}
    α_i = ρ_{i−1}/(p̃^{(i)'} q^{(i)})
    x^{(i)} = x^{(i−1)} + α_i p^{(i)}
    r^{(i)} = r^{(i−1)} − α_i q^{(i)}
    r̃^{(i)} = r̃^{(i−1)} − α_i q̃^{(i)}
end
The disadvantages of the method are the potential erratic behavior of the norm of the residuals r^{(i)} and its unstable behavior if ρ_i is very small, i.e. if the vectors r^{(i)} and r̃^{(i)} are nearly orthogonal. Another potential breakdown situation is when p̃^{(i)'} q^{(i)} is zero or close to zero.
Convergence of BiCG
The convergence of BiCG may be irregular, but when the norm of the residual
is significantly reduced, the method is expected to be comparable to GMRES.
The breakdown cases may be avoided by sophisticated strategies, see Barrett et
al. [8] and references therein. Few other convergence results are known for this
method.
2.5.6 BiConjugate Gradient Stabilized Method
A version of the BiCG method which tries to smooth the convergence was intro-
duced by van der Vorst [99]. This more sophisticated method is called BiCon-
jugate Gradient Stabilized method (BiCGSTAB) and its algorithm is formalized
in the following.
Algorithm 13 BiConjugate Gradient Stabilized
Compute r^{(0)} = b − Ax^{(0)} for some initial guess x^{(0)}
Set r̃ = r^{(0)}
for i = 1, 2, . . . until convergence
    ρ_{i−1} = r̃' r^{(i−1)}
    if ρ_{i−1} = 0 then the method fails
    if i = 1 then
        p^{(i)} = r^{(i−1)}
    else
        β_{i−1} = (ρ_{i−1}/ρ_{i−2})(α_{i−1}/ω_{i−1})
        p^{(i)} = r^{(i−1)} + β_{i−1}(p^{(i−1)} − ω_{i−1} v^{(i−1)})
    end
    Solve M p̂ = p^{(i)}
    v^{(i)} = A p̂
    α_i = ρ_{i−1}/(r̃' v^{(i)})
    s = r^{(i−1)} − α_i v^{(i)}
    if ‖s‖ is small enough then
        x^{(i)} = x^{(i−1)} + α_i p̂
        stop
    end
    Solve M ŝ = s
    t = A ŝ
    ω_i = (t' s)/(t' t)
    x^{(i)} = x^{(i−1)} + α_i p̂ + ω_i ŝ
    r^{(i)} = s − ω_i t
    (for continuation it is necessary that ω_i ≠ 0)
end
The method is more costly in terms of required operations than BiCG, but does
not involve the transpose of matrix A in the computations; this can sometimes
be an advantage. The other main advantages of BiCGSTAB are to avoid the
irregular convergence pattern of BiCG and usually to show a better convergence
speed.
2.5.7 Implementation of Nonstationary Iterative Methods
The codes for conjugate gradient type methods are easy to implement, but the
interested user should first check the NETLIB repository. It contains the pack-
age SLAP 2.0 that solves sparse and large linear systems using preconditioned
iterative methods.
We used the MATLAB programs distributed with the Templates book by Bar-
rett et al. [8] as a basis and modified this code for our experiments.
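For quick experiments with these methods today, a library routine can also stand in for hand-written code; the sketch below is an assumption about SciPy's interface (not part of the original setup) and applies BiCGSTAB with a simple diagonal preconditioner.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 300
A = sp.rand(n, n, density=0.02, format="csr") + 5.0 * sp.identity(n, format="csr")
b = np.ones(n)

d = A.diagonal()                                        # Jacobi preconditioner M = diag(A)
M = spla.LinearOperator((n, n), matvec=lambda r: r / d)

x, info = spla.bicgstab(A, b, M=M, maxiter=500)
print("converged" if info == 0 else "info = %d" % info)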
2.6 Newton Methods
In this section and the following ones, we present classical methods for the
solution of systems of nonlinear equations. The following notation will be used
for nonlinear systems. Let F : R^n → R^n represent a multivariable function.
The solution of the nonlinear system amounts then to finding the vector x* ∈ R^n such that
$$F(x^*) = 0 \;\longleftrightarrow\; \begin{cases} f_1(x^*) = 0 \\ f_2(x^*) = 0 \\ \;\;\vdots \\ f_n(x^*) = 0 \, . \end{cases} \qquad (2.7)$$
We assume F to be continuously differentiable in an open convex set U ⊂ R^n.
In the next section, we discuss the classical Newton method, recalling the main results about convergence. We then turn to modifications of the classical method where the exact Jacobian matrix is replaced by some approximation.
The classical Newton method proceeds by approximating iteratively x* by a sequence {x^{(k)}}_{k=0,1,2,...}.
Given a point x^{(k)} ∈ R^n and an evaluation of F(x^{(k)}) and of the Jacobian matrix DF(x^{(k)}), we can construct a better approximation, called x^{(k+1)}, of x*.
We may approximate F(x) in the neighborhood of x^{(k)} by an affine function and get
F(x) ≈ F(x^{(k)}) + DF(x^{(k)})(x − x^{(k)}) .   (2.8)
We can solve this local model to obtain the value of x that satisfies
F(x^{(k)}) + DF(x^{(k)})(x − x^{(k)}) = 0 ,
i.e. the point
x = x^{(k)} − (DF(x^{(k)}))^{−1} F(x^{(k)}) .   (2.9)
The value for x computed in (2.9) is again used to define a local model such as (2.8). This leads to building the iterates
x^{(k+1)} = x^{(k)} − (DF(x^{(k)}))^{−1} F(x^{(k)}) .   (2.10)
The algorithm that implements these iterations is called the classical Newton method and is formalized as follows:
Algorithm 14 Classical Newton Method
Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
for k = 0, 1, 2, . . . until convergence
    Compute DF(x^{(k)})
    Check that DF(x^{(k)}) is sufficiently well conditioned
    Solve for s^{(k)}:  DF(x^{(k)}) s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
end
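A compact NumPy sketch of Algorithm 14 follows; the two-equation test system, the tolerance and the iteration cap are my own illustrations, not taken from the thesis.

import numpy as np

def newton(F, DF, x0, tol=1e-10, max_iter=50):
    # Classical Newton method: solve DF(x) s = -F(x) and update x.
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        s = np.linalg.solve(DF(x), -F(x))
        x = x + s
        if np.linalg.norm(F(x)) < tol:
            break
    return x

F = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0])
DF = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [x[1], x[0]]])
print(newton(F, DF, np.array([2.0, 0.5])))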
Geometrically, x^{(k+1)} may be interpreted as the intersection of the n hyperplanes tangent to the functions f_1, f_2, . . . , f_n with the hyperplane on which the function values are zero. This local solution of the linearized problem is then used as a new starting point for the next guess.
The main advantage of the Newton method is its quadratic convergence behav-
ior, if appropriate conditions stated later are satisfied. However, this technique
may not converge to a solution for starting points outside some neighborhood
of the solution.
The classical Newton algorithm is also computationally intensive since it requires at every iteration the evaluation of the Jacobian matrix DF(x^{(k)}) and the solution of a linear system.
2.6.1 Computational Complexity
The computationally expensive steps in the Newton algorithm are the evaluation of the Jacobian matrix and the solution of the linear system. Hence, for a dense Jacobian matrix, the complexity of the latter task determines an order of O(n^3) arithmetic operations.
If the system of equations is sparse, as is the case in large macroeconometric models, we obtain an O(c^2 n) complexity (see Section 2.3) for the linear system. An analytical evaluation of the Jacobian matrix will automatically exploit the sparsity of the problem. Particular attention must be paid in case of a numerical evaluation of DF as this could introduce an O(n^2) operation count.
An iterative technique may also be utilized to approximate the solution of the
linear system arising. Such techniques save computational effort, but the num-
ber of iterations needed to satisfy a given convergence criterion is not known in
advance. Possibly, the number of iterations of the iterative technique may be
fixed beforehand.
2.6.2 Convergence
To discuss the convergence of the Newton method, we need the following defi-
nition and theorem.
Definition 1 A function G : R^n → R^{n×m} is said to be Lipschitz continuous on an open set U ⊂ R^n if for all x, y ∈ U there exists a constant γ such that ‖G(y) − G(x)‖_a ≤ γ ‖y − x‖_b, where ‖·‖_a is a norm on R^{n×m} and ‖·‖_b a norm on R^n.
The value of the constant γ depends on the norms chosen and the scale of DF.
Theorem 5 Let F : R^n → R^m be continuously differentiable in the open convex set U ⊂ R^n, x ∈ U and let DF be Lipschitz continuous at x in the neighborhood U. Then, for any x + p ∈ U,
‖F(x + p) − F(x) − DF(x)p‖ ≤ (γ/2) ‖p‖^2 ,
where γ is the Lipschitz constant.
where γ is the Lipschitz constant.
This theorem gives a bound on how close the affine F(x)+DF(x)p is to F(x+p).
This bound contains the Lipschitz constant γ which measures the degree of non-
linearity of F. A proof of Theorem 5 can be found in Dennis and Schnabel [26,
p. 75] for instance.
The conditions for the convergence of the classical Newton method are then
stated in the following theorem.
Theorem 6 If F is continuously differentiable in an open convex set U ⊂ R^n containing x* with F(x*) = 0, DF is Lipschitz continuous in a neighborhood of x* and DF(x*) is nonsingular and such that ‖DF(x*)^{−1}‖ ≤ β, β > 0, then the iterates of the classical Newton method satisfy
‖x^{(k+1)} − x*‖ ≤ βγ ‖x^{(k)} − x*‖^2 ,   k = 0, 1, 2, . . . ,
for a starting guess x^{(0)} in a neighborhood of x*.
Two remarks are triggered by this theorem. First, the method converges fast
as the error of step k + 1 is guaranteed to be less than some proportion of the
square of the error of step k, provided all the assumptions are satisfied. For this
reason, the method is said to be quadratically convergent. We refer to Dennis
and Schnabel [26, p. 90] for the proof of the theorem. The original works and
further references are cited in Ortega and Rheinboldt [85, p. 316].
The constant βγ gives a bound for the relative nonlinearity of F and is a scale-free measure since β is an upper bound for the norm of (DF(x*))^{−1}. Therefore Theorem 6 tells us that the smaller this measure of relative nonlinearity, the faster the Newton method converges.
The second remark concerns the conditions needed to verify a quadratic convergence. Even if the Lipschitz continuity of DF is verified, the choice of a starting point x^{(0)} lying in a convergent neighborhood of the solution x* may be an a priori difficult problem.
For macroeconometric models the starting values can naturally be chosen as
the last period solution, which in many cases is a point not too far from the
current solution. Macroeconometric models do not generally show a high level
of nonlinearity and, therefore, the Newton method is generally suitable to solve
them.
2.7 Finite Difference Newton Method
An alternative to an analytical Jacobian matrix is to replace the exact deriva-
tives by finite difference approximations. Even though nowadays software for
symbolic derivation is readily available, there are situations where one might
prefer to, or have to, resort to an approach which is easy to implement and only
requires function evaluations.
A circumstance where finite differences are certainly attractive occurs if the
Newton algorithm is implemented on a SIMD computer. Such an example is
discussed in Section 4.2.1.
We may approximate the partial derivatives in DF(x) by the forward difference formula
(DF(x))_{·j} ≈ (F(x + h_j e_j) − F(x)) / h_j = J_{·j} ,   j = 1, . . . , n .   (2.11)
The discretization error introduced by this approximation verifies the following bound
‖J_{·j} − (DF(x))_{·j}‖_2 ≤ (γ/2) max_j |h_j| ,
where F is a function satisfying Theorem 5. This suggests taking h_j as small as possible to minimize the discretization error.
A central difference approximation for DF(x) can also be used,
J̄_{·j} = (F(x + h_j e_j) − F(x − h_j e_j)) / (2h_j) .   (2.12)
The bound of the discretization error is then lowered to max_j (γ/6) h_j^2 at the cost of twice as many function evaluations.
Finally, the choice of h_j also has to be discussed in the framework of the numerical accuracy one can obtain on a digital computer. The approximation theory suggests taking h_j as small as possible to reduce the discretization error in the approximation of DF. However, since the numerator of (2.11) evaluates to function values that are close, a cancellation error might occur so that the elements J_ij may have very few or even no significant digits.
According to the theory, e.g. Dennis and Schnabel [26, p. 97], one may choose h_j so that F(x + h_j e_j) differs from F(x) in at least the leftmost half of its significant digits. Assuming that the relative error in computing F(x) is u, defined as in Section A.1, then we would like to have
|f_i(x + h_j e_j) − f_i(x)| / |f_i(x)| ≥ √u   for all i, j .
The best guess is then h_j = √u x_j in order to cope with the different sizes of the elements of x, the discretization error and the cancellation error. In the case of central difference approximations, the choice for h_j is modified to h_j = u^{2/3} x_j.
The finite difference Newton algorithm can then be expressed as follows.
Algorithm 15 Finite Difference Newton Method
Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
for k = 0, 1, 2, . . . until convergence
    Evaluate J^{(k)} according to (2.11) or (2.12)
    Solve J^{(k)} s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
end
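A forward difference Jacobian along the lines of (2.11), with the step h_j = √u x_j suggested above, might be sketched as follows (my own illustration; u is taken as the machine precision and the safeguard for x_j = 0 is an added assumption).

import numpy as np

def fd_jacobian(F, x, u=np.finfo(float).eps):
    # Forward difference approximation of DF(x), column by column (formula 2.11).
    n = x.size
    Fx = F(x)
    J = np.empty((n, n))
    for j in range(n):
        h = np.sqrt(u) * x[j]
        if h == 0.0:
            h = np.sqrt(u)            # safeguard when x_j = 0 (assumption)
        e = np.zeros(n)
        e[j] = 1.0
        J[:, j] = (F(x + h * e) - Fx) / h
    return J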
2.7.1 Convergence of the Finite Difference Newton Method
When replacing the analytically evaluated Jacobian matrix by a finite difference
approximation, it can be shown that the convergence of the Newton iterative
process remains quadratic if the finite difference step size is chosen to satisfy
conditions specified in the following. (Proofs can be found, for instance, in
Dennis and Schnabel [26, p. 95] or Ortega and Rheinboldt [85, p. 360].)
If the finite difference step size h^{(k)} is invariant with respect to the iterations k, then the discretized Newton method shows only a linear rate of convergence. (We drop the subscript j for convenience.)
If a decreasing sequence h^{(k)} is imposed, i.e. lim_{k→∞} h^{(k)} = 0, the method achieves a superlinear rate of convergence.
Furthermore, if one of the following conditions is verified,
there exist constants c_1 and k_1 such that |h^{(k)}| ≤ c_1 ‖x^{(k)} − x*‖ for all k ≥ k_1 ,
there exist constants c_2 and k_2 such that |h^{(k)}| ≤ c_2 ‖F(x^{(k)})‖ for all k ≥ k_2 ,   (2.13)
then the convergence is quadratic, as is the case in the classical Newton method.
The limit condition on the sequence h^{(k)} may be interpreted as an improvement in the accuracy of the approximations of DF as we approach x*. Conditions (2.13) ensure a tight approximation of DF and therefore lead to the quadratic convergence of the method. In practice, however, none of the conditions (2.13) can be tested as neither x* nor c_1 and c_2 are known. They are nevertheless important from a theoretical point of view since they show that for good enough approximations of the Jacobian matrix, the finite difference Newton method will behave as well as the classical Newton method.
2.8 Simplified Newton Method
To avoid the repeated evaluation of the Jacobian matrix DF(x^{(k)}) at each step k, one may reuse the first evaluation DF(x^{(0)}) for all subsequent steps k = 1, 2, . . . . This method is called the simplified Newton method and it is attractive when the level of nonlinearity of F is not too high, since then the Jacobian matrix does not vary too much.
Another advantage of this simplification is that the coefficient matrix of the linear system to be solved is the same at every step and only the right-hand side changes, leading to significant savings in the computational work.
As discussed before, the computationally expensive steps in the Newton method
are the evaluation of the Jacobian matrix and the solution of the corresponding
linear system. If a direct method is applied in the simplified method, these two
steps are carried out only once and, for subsequent iterations, only the forward
and back substitution phases are needed.
In the one dimensional case, this technique corresponds to a parallel-chord method. The first chord is taken to be the tangent at the point with coordinates (x^{(0)}, F(x^{(0)})), and for the next iterations this chord is simply shifted in a parallel way.
To improve the convergence of this method, DF may occasionally be reevaluated by choosing an increasing integer function p(k) with values in the interval [0, k] and solving the linear system DF(x^{(p(k))}) s^{(k)} = −F(x^{(k)}).
In the extreme case where p(k) = 0 for all k we have the simplified Newton method and, at the other end, when p(k) = k for all k we have the classical Newton
method. The choice of the function p(k), i.e. the reevaluation scheme, has to be
determined experimentally.
Algorithm 16 Simplified Newton Method
Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
for k = 0, 1, 2, . . . until convergence
    Compute DF(x^{(p(k))}) if needed
    Solve for s^{(k)}:  DF(x^{(p(k))}) s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
end
2.8.1 Convergence of the Simplified Newton Method
The kind of simplification presented leads to a degradation of the speed of
convergence as the Jacobian matrix is not updated at each step.
However, one may note that for some macroeconometric models the nonlinearities are often such that this type of technique may prove advantageous compared to classical Newton iterations because of the computational savings that can be made.
In the classical Newton method, the direction s^{(k)} = −(DF(x^{(k)}))^{−1} F(x^{(k)}) is a guaranteed descent direction for the function f(x) = ½ F(x)'F(x) = ½ ‖F(x)‖_2^2 since
Df(x) = DF(x)' F(x)
and
Df(x^{(k)})' s^{(k)} = −F(x^{(k)})' DF(x^{(k)}) (DF(x^{(k)}))^{−1} F(x^{(k)})   (2.14)
= −F(x^{(k)})' F(x^{(k)}) < 0   for all F(x^{(k)}) ≠ 0 .   (2.15)
In the simplified Newton method the direction of update is
s^{(k)} = −(DF(x^{(0)}))^{−1} F(x^{(k)}) ,
which is a descent direction for the function f(x) as long as the matrix
(DF(x^{(0)}))^{−1} DF(x^{(k)})
is positive definite. If s^{(k)} is not a descent direction, then the Jacobian matrix has to be reevaluated at x^{(k)} and the method restarted from this point.
2.9 Quasi-Newton Methods
The methods discussed previously did not use the exact evaluation of the Jacobian matrix but resorted to approximations. We will limit our presentation in this section to Broyden's method, which belongs to the class of so-called Quasi-Newton methods.
Quasi-Newton methods start either with an analytical or with a finite difference evaluation of the Jacobian matrix at the starting point x^{(0)}, and therefore
compute x^{(1)} like the classical Newton method does. For the successive steps, DF(x^{(0)}), or an approximation J^{(0)} to it, is updated using (x^{(0)}, F(x^{(0)})) and (x^{(1)}, F(x^{(1)})). The matrix DF(x^{(1)}) can then be approximated at little additional cost by a secant method.
The secant approximation A^{(1)} satisfies the equation
A^{(1)} (x^{(1)} − x^{(0)}) = F(x^{(1)}) − F(x^{(0)}) .   (2.16)
Matrix A^{(1)} is obviously not uniquely defined by relation (2.16).
Broyden [20] introduced a criterion which leads to choosing, at the generic step k, a matrix A^{(k+1)} defined as
A^{(k+1)} = A^{(k)} + ((y^{(k)} − A^{(k)} s^{(k)}) s^{(k)'}) / (s^{(k)'} s^{(k)})   (2.17)
where y^{(k)} = F(x^{(k+1)}) − F(x^{(k)}) and s^{(k)} = x^{(k+1)} − x^{(k)}.
Broyden's method updates matrix A^{(k)} by a rank one matrix computed only from the information of the current step and the preceding step.
Algorithm 17 Quasi-Newton Method using Broyden's Update
Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
Evaluate A^{(0)} by DF(x^{(0)}) or J^{(0)}
for k = 0, 1, 2, . . . until convergence
    Solve for s^{(k)}:  A^{(k)} s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
    y^{(k)} = F(x^{(k+1)}) − F(x^{(k)})
    A^{(k+1)} = A^{(k)} + ((y^{(k)} − A^{(k)} s^{(k)}) s^{(k)'})/(s^{(k)'} s^{(k)})
end
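A NumPy sketch of Algorithm 17 is given below as an illustration only; for simplicity it refactorizes A^(k) at every step instead of updating a factorization as discussed in the following paragraphs.

import numpy as np

def broyden(F, x0, A0, tol=1e-10, max_iter=100):
    # Quasi-Newton iteration with Broyden's rank-one update (2.17).
    x = x0.astype(float).copy()
    A = A0.astype(float).copy()
    Fx = F(x)
    for _ in range(max_iter):
        s = np.linalg.solve(A, -Fx)                  # A^(k) s^(k) = -F(x^(k))
        x = x + s
        Fx_new = F(x)
        y = Fx_new - Fx                              # y^(k) = F(x^(k+1)) - F(x^(k))
        A = A + np.outer(y - A @ s, s) / (s @ s)     # rank-one update
        Fx = Fx_new
        if np.linalg.norm(Fx) < tol:
            break
    return x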
Broyden's method may generate sequences of matrices {A^{(k)}}_{k=0,1,...} which do not converge to the Jacobian matrix DF(x*), even though the method produces a sequence {x^{(k)}}_{k=0,1,...} converging to x*.
Dennis and Moré [25] have shown that the convergence behavior of the method is superlinear under the same conditions as for Newton-type techniques. The underlying reason that enables this favorable behavior is that ‖A^{(k)} − DF(x^{(k)})‖ stays sufficiently small.
From a computational standpoint, Broyden's method is particularly attractive since the solution of the successive linear systems A^{(k)} s^{(k)} = −F(x^{(k)}) can be determined by updating the initial factorization of A^{(0)} for k = 1, 2, . . . . Such an update necessitates O(n^2) operations, therefore reducing the original O(n^3) cost of a complete refactorization.
Practically, the QR factorization update is easier to implement than the LU update, see Gill et al. [47, pp. 125–150]. For sparse systems, however, the advantage of the updating process vanishes.
A software reference for Broyden's method is MINPACK by Moré, Garbow and Hillstrom, available on NETLIB.
2.10 Nonlinear First-order Methods
The iterative techniques for the solution of linear systems described in Sections 2.4.1 to 2.4.4 can be extended to nonlinear equations.
If we interpret the stationary iterations in Algorithms 1 to 4 in terms of obtaining x_i^{(k+1)} as the solution of the i-th equation with the other (n − 1) variables held fixed, we may immediately apply the same idea to the nonlinear case.
The first issue is then the existence of a one-to-one mapping between the set
of equations {fi , i = 1, . . . , n} and the set of variables {xi , i = 1, . . . , n}.
This mapping is also called a matching and it can be shown that its existence
is a necessary condition for the solution to exist, see Gilli [50] or Gilli and
Garbely [51].
A matching m must be provided in order to define the variable m(i) that has
to be solved from equation i. For the method to make sense, the solution of
the i-th equation with respect to xm(i) must exist and be unique. This solution
can then be computed using a one dimensional solution algorithm—e.g. a one
dimensional Newton method.
We can then formulate the nonlinear Jacobi algorithm.
Algorithm 18 Nonlinear Jacobi Method
Given a matching m and a starting point x^{(0)} ∈ R^n
Set up the equations so that m(i) = i for i = 1, . . . , n
for k = 0, 1, 2, . . . until convergence
    for i = 1, . . . , n
        Solve for x_i:  f_i(x_1^{(k)}, . . . , x_i, . . . , x_n^{(k)}) = 0
        and set x_i^{(k+1)} = x_i
    end
end
The nonlinear Gauss-Seidel is obtained by modifying the “solve” statement.
Algorithm 19 Nonlinear Gauss-Seidel Method
Given a matching m and a starting point x^{(0)} ∈ R^n
Set up the equations so that m(i) = i for i = 1, . . . , n
for k = 0, 1, 2, . . . until convergence
    for i = 1, . . . , n
        Solve for x_i:  f_i(x_1^{(k+1)}, . . . , x_{i−1}^{(k+1)}, x_i, x_{i+1}^{(k)}, . . . , x_n^{(k)}) = 0
        and set x_i^{(k+1)} = x_i
    end
end
In order to keep notation simple, we will assume from now on that the equations
and variables have been set up so that we have m(i) = i for i = 1, . . . , n.
The nonlinear SOR and FGS6 algorithms are obtained by a straightforward modification of the corresponding linear versions.
6As already mentioned in the linear case, the FGS method should be considered as a second-order method.
If it is possible to isolate xi from fi(x1, . . . , xn) for all i, then we have a normal-
ized system of equations. This is often the case in systems of equations arising
in macroeconometric modeling. In such a situation, each variable is isolated as
follows,
xi = gi(x1, . . . , xi−1, xi+1, . . . , xn), i = 1, 2, . . ., n . (2.18)
The “solve” statement in Algorithm 18 and Algorithm 19 is now dropped since
the solution is given in an explicit form.
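For a normalized system of the form (2.18), a nonlinear Gauss-Seidel sweep takes only a few lines; the sketch below is my own illustration with a made-up two-equation model, and it updates each variable in place so that the latest available values are used.

import numpy as np

def nonlinear_gauss_seidel(g, x0, tol=1e-8, max_iter=200):
    # Nonlinear Gauss-Seidel for x_i = g_i(x), i = 1, ..., n (equation 2.18).
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i, gi in enumerate(g):
            x[i] = gi(x)              # components 0..i-1 already hold iterate k+1
        if np.max(np.abs(x - x_old) / np.maximum(np.abs(x_old), 1e-12)) < tol:
            break
    return x

g = [lambda x: 0.5 * x[1] + 1.0,      # hypothetical normalized equations
     lambda x: 0.25 * x[0] ** 2]
print(nonlinear_gauss_seidel(g, np.array([0.0, 0.0])))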
2.10.1 Convergence
The matrix form of these nonlinear iterations can be found by linearizing the equations around x^{(k)}, which yields
A^{(k)} x^{(k)} = b^{(k)} ,
where A^{(k)} = DF(x^{(k)}) and b^{(k)} denotes the constant part of the linearization of F. As the path of the iterates {x^{(k)}}_{k=0,1,2,...} leads to different matrices A^{(k)} and vectors b^{(k)}, the nonlinear versions of the iterative methods can no longer be considered stationary methods. Each system A^{(k)} x^{(k)} = b^{(k)} will have a different convergence behavior not only according to the splitting of A^{(k)} and the updating technique chosen, but also because the values of the elements in matrix A^{(k)} and vector b^{(k)} change from one iteration to another.
It follows that convergence criteria can only be stated for starting points x^{(0)} within a neighborhood of the solution x*. Similarly to what has been presented in Section 2.4.6, we can evaluate the matrix B that governs the convergence at x* and state that, if ρ(B) < 1, then the method is likely to converge. The difficulty is that now the eigenvalues, and hence the spectral radius, vary with the solution path. The same is also true for the optimal values of the parameters ω and γ of the SOR and FGS methods.
In such a framework, Hughes Hallett [69] suggests several ways of computing approximate optimal values for γ during the iterations. The simplest form is to take
γ_k = (1 ± |x_i^{(k)}/x_i^{(k−2)}|)^{−1} ,
the sign being positive if the iterations are cycling and negative if the iterations are monotonic; x_i is the element which violates the convergence criterion the most.
One should also constrain γk to lie in the interval [0, 2], which is a necessary
condition for the FGS to converge. To avoid large fluctuations of γk, one may
smooth the sequence by the formula
˜γk = αkγk + (1 − αk)γk−1 ,
where αk is chosen in the interval [0, 1]. We may note that such strategies can
also be applied in the linear case to automatically set the value for γ.
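A small sketch of this adaptive choice of γ (my own illustration of the formulas above; the smoothing weight α and the way cycling is detected are assumptions left to the caller):

def adaptive_gamma(x_k, x_km2, i, cycling, gamma_prev, alpha=0.5):
    # Approximate optimal FGS damping following the rule above.
    # i is the index of the variable violating the convergence criterion most,
    # cycling tells whether the iterations oscillate (True) or are monotonic (False).
    ratio = abs(x_k[i] / x_km2[i])
    if cycling:
        gamma = 1.0 / (1.0 + ratio)
    else:
        gamma = 1.0 / (1.0 - ratio) if ratio != 1.0 else 2.0
    gamma = min(max(gamma, 0.0), 2.0)                  # keep gamma in [0, 2]
    return alpha * gamma + (1.0 - alpha) * gamma_prev  # smoothed value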
2.11 Solution by Minimization
In the preceding sections, methods for the solution of nonlinear systems of equations have been considered. An alternative way to compute a solution of F(x) = 0 is to minimize the following objective function
f(x) = ‖F(x)‖_a ,   (2.19)
where ‖·‖_a denotes a norm in R^n.
A reason that motivates such an alternative is that it introduces a criterion to decide whether x^{(k+1)} is a better approximation to x* than x^{(k)}. As at the solution F(x*) = 0, we would like to compare the vectors F(x^{(k+1)}) and F(x^{(k)}), and to do so we compare their respective norms. What is required7 is that
‖F(x^{(k+1)})‖_a < ‖F(x^{(k)})‖_a ,
which then leads us to the minimization of the objective function (2.19).
7The iterate x^{(k+1)} computed, for instance, by a classical Newton step does not necessarily satisfy this requirement.
A convenient choice is the standard Euclidean norm, since it permits an analytical development of the problem. The minimization problem then reads
min_x f(x) = ½ F(x)'F(x) ,   (2.20)
where the factor 1/2 is added for algebraic convenience.
Thus methods for nonlinear least-squares problems, such as Gauss-Newton or Levenberg-Marquardt, can immediately be applied to this framework. Since the system is square and has a solution, we expect to have a zero residual function f at x*.
In general it is advisable to take advantage of the structure of F to directly
approach the solution of F(x) = 0. However, in some circumstances, resorting to
the minimization of f(x) constitutes an interesting alternative. This is the case
when the nonlinear equations contain numerical inaccuracies preventing F(x) =
0 from having a solution. If the residual f(x) is small, then the minimization
approach is certainly preferable.
To devise a minimization algorithm for f(x), we need the gradient Df(x) and the Hessian matrix D^2 f(x), that is
Df(x) = DF(x)' F(x)   (2.21)
D^2 f(x) = DF(x)' DF(x) + Q(x)   (2.22)
with
Q(x) = Σ_{i=1}^{n} f_i(x) D^2 f_i(x) .
We recall that F(x) = [f_1(x) . . . f_n(x)]', that each f_i(x) is a function from R^n into R and that each D^2 f_i(x) is therefore the n × n Hessian matrix of f_i(x).
The Gauss-Newton method approaches the solution by computing a Newton step for the first order conditions of (2.20), Df(x) = 0. At step k the Newton direction s^{(k)} is determined by
D^2 f(x^{(k)}) s^{(k)} = −Df(x^{(k)}) .
Replacing (2.21) and (2.22) in the former expression we get
(DF(x^{(k)})' DF(x^{(k)}) + Q(x^{(k)})) s^{(k)} = −DF(x^{(k)})' F(x^{(k)}) .   (2.23)
For x^{(k)} sufficiently close to the solution x*, the term Q(x^{(k)}) in the Hessian matrix is negligible and we may obtain the approximate step s_GN^{(k)}, called the Gauss-Newton step, from
DF(x^{(k)})' DF(x^{(k)}) s_GN^{(k)} = −DF(x^{(k)})' F(x^{(k)}) .   (2.24)
Computing s_GN^{(k)} using (2.24) explicitly would require calculating the solution of a symmetric positive definite linear system. With such an approach the condition number of the linear system, involving DF(x^{(k)})' DF(x^{(k)}), is squared compared to the following alternative.
The system (2.24) constitutes the set of normal equations, and its solution can be obtained by solving
DF(x^{(k)}) s_GN^{(k)} = −F(x^{(k)})
via a QR factorization.
We notice that this development leads to the same step as in the classical New-
ton method, see Algorithm 14. It is worth mentioning that the Gauss-Newton
method does not yield the same iterates as the Newton method for a direct
minimization of f(x).
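As a small illustration (mine, not the thesis's), the Gauss-Newton step of (2.24) can be obtained through a QR factorization of DF(x^(k)) so that the normal equations are never formed explicitly.

import numpy as np

def gauss_newton_step(F, DF, x):
    # Gauss-Newton step from (2.24): minimize ||F(x) + DF(x) s||_2 via QR.
    Q, R = np.linalg.qr(DF(x))             # DF(x) = Q R
    return np.linalg.solve(R, -Q.T @ F(x))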
The Levenberg-Marquardt method is closely related to Gauss-Newton and to a modification of Newton's method for nonlinear equations that is globally convergent, see Section 2.12. The Levenberg-Marquardt step s_LM^{(k)} is computed as the solution of
(DF(x^{(k)})' DF(x^{(k)}) + λ_k I) s_LM^{(k)} = −DF(x^{(k)})' F(x^{(k)}) .
It can be shown that solving this equation for s_LM^{(k)} is equivalent to computing
s_LM^{(k)} = argmin_s ‖F(x^{(k)}) + DF(x^{(k)}) s‖_2   subject to ‖s‖_2 ≤ δ .
This method is therefore a trust-region technique, which is presented in Section 2.12.
We immediately see that if λ_k is zero, then s_LM^{(k)} = s_GN^{(k)}; whereas when λ_k becomes very large, s_LM^{(k)} is the steepest descent update for minimizing f at x^{(k)}, namely −Df(x^{(k)}) = −DF(x^{(k)})' F(x^{(k)}).
We may also note that every solution of F(x) = 0 is a solution to problem (2.20). However, the converse is not true since there may be local minimizers of f(x). Such a situation is illustrated in Figure 2.1 and can be explained by recalling that the gradient of f(x), given by Equation (2.21), may vanish either when F(x) = 0 or when DF(x) is singular.
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t
t

More Related Content

What's hot

Stochastic Programming
Stochastic ProgrammingStochastic Programming
Stochastic ProgrammingSSA KPI
 
The Cellular Automaton Interpretation of Quantum Mechanics
The Cellular Automaton Interpretation of Quantum MechanicsThe Cellular Automaton Interpretation of Quantum Mechanics
The Cellular Automaton Interpretation of Quantum MechanicsHunter Swart
 
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...valentincivil
 
Applied Stochastic Processes
Applied Stochastic ProcessesApplied Stochastic Processes
Applied Stochastic Processeshuutung96
 
Calculus Research Lab 2: Integrals
Calculus Research Lab 2: IntegralsCalculus Research Lab 2: Integrals
Calculus Research Lab 2: IntegralsA Jorge Garcia
 
Stochastic Processes in R
Stochastic Processes in RStochastic Processes in R
Stochastic Processes in RTomas Gonzalez
 
Modeling_Future_All_Optical_Networks_without_Buff
Modeling_Future_All_Optical_Networks_without_BuffModeling_Future_All_Optical_Networks_without_Buff
Modeling_Future_All_Optical_Networks_without_BuffMiguel de Vega, Ph. D.
 
Szalas cugs-lectures
Szalas cugs-lecturesSzalas cugs-lectures
Szalas cugs-lecturesHanibei
 

What's hot (16)

Stochastic Programming
Stochastic ProgrammingStochastic Programming
Stochastic Programming
 
The Cellular Automaton Interpretation of Quantum Mechanics
The Cellular Automaton Interpretation of Quantum MechanicsThe Cellular Automaton Interpretation of Quantum Mechanics
The Cellular Automaton Interpretation of Quantum Mechanics
 
Diederik Fokkema - Thesis
Diederik Fokkema - ThesisDiederik Fokkema - Thesis
Diederik Fokkema - Thesis
 
dcorreiaPhD
dcorreiaPhDdcorreiaPhD
dcorreiaPhD
 
Erlangga
ErlanggaErlangga
Erlangga
 
Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
 
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
 
Machine learning-cheat-sheet
Machine learning-cheat-sheetMachine learning-cheat-sheet
Machine learning-cheat-sheet
 
Applied Stochastic Processes
Applied Stochastic ProcessesApplied Stochastic Processes
Applied Stochastic Processes
 
Optimal control systems
Optimal control systemsOptimal control systems
Optimal control systems
 
Calculus Research Lab 2: Integrals
Calculus Research Lab 2: IntegralsCalculus Research Lab 2: Integrals
Calculus Research Lab 2: Integrals
 
Stochastic Processes in R
Stochastic Processes in RStochastic Processes in R
Stochastic Processes in R
 
thesis
thesisthesis
thesis
 
Modeling_Future_All_Optical_Networks_without_Buff
Modeling_Future_All_Optical_Networks_without_BuffModeling_Future_All_Optical_Networks_without_Buff
Modeling_Future_All_Optical_Networks_without_Buff
 
phd_unimi_R08725
phd_unimi_R08725phd_unimi_R08725
phd_unimi_R08725
 
Szalas cugs-lectures
Szalas cugs-lecturesSzalas cugs-lectures
Szalas cugs-lectures
 

Similar to t

Lecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determinationLecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determinationErnst Schrama
 
Fundamentals of computational fluid dynamics
Fundamentals of computational fluid dynamicsFundamentals of computational fluid dynamics
Fundamentals of computational fluid dynamicsAghilesh V
 
Stochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning PerspectiveStochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning Perspectivee2wi67sy4816pahn
 
Memoire antoine pissoort_aout2017
Memoire antoine pissoort_aout2017Memoire antoine pissoort_aout2017
Memoire antoine pissoort_aout2017Antoine Pissoort
 
Methods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdfMethods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdfComrade15
 
(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...
(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...
(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...ssuserfa7e73
 
Reading Materials for Operational Research
Reading Materials for Operational Research Reading Materials for Operational Research
Reading Materials for Operational Research Derbew Tesfa
 
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxnumpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxsin3divcx
 
Business Mathematics Code 1429
Business Mathematics Code 1429Business Mathematics Code 1429
Business Mathematics Code 1429eniacnetpoint
 
Clustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory EClustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory EGabriele Pompa, PhD
 
matconvnet-manual.pdf
matconvnet-manual.pdfmatconvnet-manual.pdf
matconvnet-manual.pdfKhamis37
 

Similar to t (20)

Lecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determinationLecture notes on planetary sciences and orbit determination
Lecture notes on planetary sciences and orbit determination
 
Fundamentals of computational fluid dynamics
Fundamentals of computational fluid dynamicsFundamentals of computational fluid dynamics
Fundamentals of computational fluid dynamics
 
Stochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning PerspectiveStochastic Processes and Simulations – A Machine Learning Perspective
Stochastic Processes and Simulations – A Machine Learning Perspective
 
Memoire antoine pissoort_aout2017
Memoire antoine pissoort_aout2017Memoire antoine pissoort_aout2017
Memoire antoine pissoort_aout2017
 
MS_Thesis
MS_ThesisMS_Thesis
MS_Thesis
 
Methods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdfMethods for Applied Macroeconomic Research.pdf
Methods for Applied Macroeconomic Research.pdf
 
Barret templates
Barret templatesBarret templates
Barret templates
 
(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...
(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...
(Springer optimization and its applications 37) eligius m.t. hendrix, boglárk...
 
Reading Materials for Operational Research
Reading Materials for Operational Research Reading Materials for Operational Research
Reading Materials for Operational Research
 
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxnumpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
numpyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 
Business Mathematics Code 1429
Business Mathematics Code 1429Business Mathematics Code 1429
Business Mathematics Code 1429
 
thesis
thesisthesis
thesis
 
book.pdf
book.pdfbook.pdf
book.pdf
 
Matconvnet manual
Matconvnet manualMatconvnet manual
Matconvnet manual
 
2010_09_rs_ea
2010_09_rs_ea2010_09_rs_ea
2010_09_rs_ea
 
Clustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory EClustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory E
 
lapointe_thesis
lapointe_thesislapointe_thesis
lapointe_thesis
 
Differential equations
Differential equationsDifferential equations
Differential equations
 
matconvnet-manual.pdf
matconvnet-manual.pdfmatconvnet-manual.pdf
matconvnet-manual.pdf
 
andershuss2015
andershuss2015andershuss2015
andershuss2015
 

More from Giorgio Pauletto

IDEOU From Ideas To Action
IDEOU From Ideas To ActionIDEOU From Ideas To Action
IDEOU From Ideas To ActionGiorgio Pauletto
 
Giorgio Pauletto - Design Thinking - Unow
Giorgio Pauletto - Design Thinking - UnowGiorgio Pauletto - Design Thinking - Unow
Giorgio Pauletto - Design Thinking - UnowGiorgio Pauletto
 
Giorgio Pauletto - Innovation de rupture - MOOC Unow
Giorgio Pauletto - Innovation de rupture - MOOC UnowGiorgio Pauletto - Innovation de rupture - MOOC Unow
Giorgio Pauletto - Innovation de rupture - MOOC UnowGiorgio Pauletto
 
Giorgio Pauletto - Design Kit Facilitator's Guide
Giorgio Pauletto - Design Kit Facilitator's GuideGiorgio Pauletto - Design Kit Facilitator's Guide
Giorgio Pauletto - Design Kit Facilitator's GuideGiorgio Pauletto
 
Giorgio_Pauletto_Leading_for_Creativity
Giorgio_Pauletto_Leading_for_CreativityGiorgio_Pauletto_Leading_for_Creativity
Giorgio_Pauletto_Leading_for_CreativityGiorgio Pauletto
 
Giorgio_Pauletto_A_Crash_Course_on_Creativity
Giorgio_Pauletto_A_Crash_Course_on_CreativityGiorgio_Pauletto_A_Crash_Course_on_Creativity
Giorgio_Pauletto_A_Crash_Course_on_CreativityGiorgio Pauletto
 
Giorgio_Pauletto_Gamification
Giorgio_Pauletto_GamificationGiorgio_Pauletto_Gamification
Giorgio_Pauletto_GamificationGiorgio Pauletto
 
Giorgio_Pauletto_Design_Thinking_Action_Lab
Giorgio_Pauletto_Design_Thinking_Action_LabGiorgio_Pauletto_Design_Thinking_Action_Lab
Giorgio_Pauletto_Design_Thinking_Action_LabGiorgio Pauletto
 
Giorgio_Pauletto_Scaling_Up_Your_Venture_Without_Screwing_Up
Giorgio_Pauletto_Scaling_Up_Your_Venture_Without_Screwing_UpGiorgio_Pauletto_Scaling_Up_Your_Venture_Without_Screwing_Up
Giorgio_Pauletto_Scaling_Up_Your_Venture_Without_Screwing_UpGiorgio Pauletto
 
Entreprise agile de demain 2013-11
Entreprise agile de demain 2013-11Entreprise agile de demain 2013-11
Entreprise agile de demain 2013-11Giorgio Pauletto
 
Tendances technolgiques, business et société - 10-2014
Tendances technolgiques, business et société - 10-2014Tendances technolgiques, business et société - 10-2014
Tendances technolgiques, business et société - 10-2014Giorgio Pauletto
 
Tendances technolgiques, business et société - 04-2015
Tendances technolgiques, business et société - 04-2015Tendances technolgiques, business et société - 04-2015
Tendances technolgiques, business et société - 04-2015Giorgio Pauletto
 
Tendances futures dans la santé
Tendances futures dans la santéTendances futures dans la santé
Tendances futures dans la santéGiorgio Pauletto
 
4 tendances de 2014 qui vont refaire le monde
4 tendances de 2014 qui vont refaire le monde4 tendances de 2014 qui vont refaire le monde
4 tendances de 2014 qui vont refaire le mondeGiorgio Pauletto
 
Comment les réseaux sociaux peuvent-ils favoriser la créativité et l'innovati...
Comment les réseaux sociaux peuvent-ils favoriser la créativité et l'innovati...Comment les réseaux sociaux peuvent-ils favoriser la créativité et l'innovati...
Comment les réseaux sociaux peuvent-ils favoriser la créativité et l'innovati...Giorgio Pauletto
 
Le PC est mort. Vive le PC!
Le PC est mort. Vive le PC!Le PC est mort. Vive le PC!
Le PC est mort. Vive le PC!Giorgio Pauletto
 
makeopendata.ch 30 & 31 mars Genève
makeopendata.ch 30 & 31 mars Genèvemakeopendata.ch 30 & 31 mars Genève
makeopendata.ch 30 & 31 mars GenèveGiorgio Pauletto
 
Génération Y et sécurité: quels risques, quelles opportunités?
Génération Y et sécurité: quels risques, quelles opportunités?Génération Y et sécurité: quels risques, quelles opportunités?
Génération Y et sécurité: quels risques, quelles opportunités?Giorgio Pauletto
 
Innovation dans le secteur public
Innovation dans le secteur publicInnovation dans le secteur public
  • 3. CONTENTS ii 2.5.4 Generalized Minimal Residual . . . . . . . . . . . . . . . . 25 2.5.5 BiConjugate Gradient Method . . . . . . . . . . . . . . . 27 2.5.6 BiConjugate Gradient Stabilized Method . . . . . . . . . 28 2.5.7 Implementation of Nonstationary Iterative Methods . . . 29 2.6 Newton Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 2.6.1 Computational Complexity . . . . . . . . . . . . . . . . . 31 2.6.2 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 31 2.7 Finite Difference Newton Method . . . . . . . . . . . . . . . . . . 32 2.7.1 Convergence of the Finite Difference Newton Method . . 33 2.8 Simplified Newton Method . . . . . . . . . . . . . . . . . . . . . . 34 2.8.1 Convergence of the Simplified Newton Method . . . . . . 35 2.9 Quasi-Newton Methods . . . . . . . . . . . . . . . . . . . . . . . 35 2.10 Nonlinear First-order Methods . . . . . . . . . . . . . . . . . . . 37 2.10.1 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.11 Solution by Minimization . . . . . . . . . . . . . . . . . . . . . . 39 2.12 Globally Convergent Methods . . . . . . . . . . . . . . . . . . . . 41 2.12.1 Line-search . . . . . . . . . . . . . . . . . . . . . . . . . . 41 2.12.2 Model-trust Region . . . . . . . . . . . . . . . . . . . . . . 43 2.13 Stopping Criteria and Scaling . . . . . . . . . . . . . . . . . . . . 44 3 Solution of Large Macroeconometric Models 46 3.1 Blocktriangular Decomposition of the Jacobian Matrix . . . . . . 47 3.2 Orderings of the Jacobian Matrix . . . . . . . . . . . . . . . . . . 48 3.2.1 The Logical Framework of the Algorithm . . . . . . . . . 50 3.2.2 Practical Considerations . . . . . . . . . . . . . . . . . . . 56 3.3 Point Methods versus Block Methods . . . . . . . . . . . . . . . . 56 3.3.1 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . 57 3.3.2 Discussion of the Block Method . . . . . . . . . . . . . . . 57 3.3.3 Ordering and Convergence for First-order Iterations . . . 59 3.4 Essential Feedback Vertex Sets and the Newton Method . . . . . 61 4 Model Simulation on Parallel Computers 62 4.1 Introduction to Parallel Computing . . . . . . . . . . . . . . . . . 62 4.1.1 A Taxonomy for Parallel Computers . . . . . . . . . . . . 63 4.1.2 Communication Tasks . . . . . . . . . . . . . . . . . . . . 67 4.1.3 Synchronization Issues . . . . . . . . . . . . . . . . . . . . 69
  • 4. CONTENTS iii 4.1.4 Speedup and Efficiency of an Algorithm . . . . . . . . . . 70 4.2 Model Simulation Experiences . . . . . . . . . . . . . . . . . . . . 71 4.2.1 Econometric Models and Solution Algorithms . . . . . . . 71 4.2.2 Parallelization Potential for Solution Algorithms . . . . . 73 4.2.3 Practical Results . . . . . . . . . . . . . . . . . . . . . . . 76 5 Rational Expectations Models 82 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 5.1.1 Formulation of RE Models . . . . . . . . . . . . . . . . . . 85 5.1.2 Uniqueness and Stability Issues . . . . . . . . . . . . . . . 86 5.2 The Model MULTIMOD . . . . . . . . . . . . . . . . . . . . . . . 89 5.2.1 Overview of the Model . . . . . . . . . . . . . . . . . . . . 89 5.2.2 Equations of a Country Model . . . . . . . . . . . . . . . 90 5.2.3 Structure of the Complete Model . . . . . . . . . . . . . . 92 5.3 Solution Techniques for RE Models . . . . . . . . . . . . . . . . . 92 5.3.1 Extended Path . . . . . . . . . . . . . . . . . . . . . . . . 93 5.3.2 Stacked-time Approach . . . . . . . . . . . . . . . . . . . 94 5.3.3 Block Iterative Methods . . . . . . . . . . . . . . . . . . . 97 5.3.4 Newton Methods . . . . . . . . . . . . . . . . . . . . . . . 107 A Appendix 122 A.1 Finite Precision Arithmetic . . . . . . . . . . . . . . . . . . . . . 122 A.2 Condition of a Problem . . . . . . . . . . . . . . . . . . . . . . . 123 A.3 Complexity of Algorithms . . . . . . . . . . . . . . . . . . . . . . 125
  • 5. List of Tables 4.1 Complexity of communication tasks on a linear array and a hy- percube with p processors. . . . . . . . . . . . . . . . . . . . . . 69 4.2 Execution times of Gauss-Seidel and Jacobi algorithms. . . . . . 77 4.3 Execution time on CM2 and Sun ELC. . . . . . . . . . . . . . . . 80 4.4 Execution time on Sun ELC and CM2 for the Newton-like algo- rithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 5.1 Labels for the zones/countries considered in MULTIMOD. . . . 90 5.2 Spectral radii for point and block Gauss-Seidel. . . . . . . . . . . 100 5.3 Operation count in Mflops for Newton combined with SGE and MATLAB’s sparse solver, and Gauss-Seidel. . . . . . . . . . . . 113 5.4 Average number of Mflops for BiCGSTAB. . . . . . . . . . . . . 118 5.5 Average number of Mflops for QMR. . . . . . . . . . . . . . . . 119 5.6 Average number of Mflops for GMRES(m). . . . . . . . . . . . . 120 5.7 Average number of Mflops for MATLAB’s sparse LU. . . . . . . 121
  • 6. List of Figures 2.1 A one dimensional function F(x) with a unique zero and its cor- responding function f(x) with multiple local minima. . . . . . . . 41 2.2 The quadratic model ˆg(ω) built to determine the minimum ˆω. . . 43 3.1 Blockrecursive pattern of a Jacobian matrix. . . . . . . . . . . . 49 3.2 Sparsity pattern of the reordered Jacobian matrix. . . . . . . . . 49 3.3 Situations considered for the transformations. . . . . . . . . . . . 53 3.4 Tree T = (S, U). . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.5 Numerical example showing the structure is not sufficient. . . . 60 4.1 Shared memory system. . . . . . . . . . . . . . . . . . . . . . . . 64 4.2 Distributed memory system. . . . . . . . . . . . . . . . . . . . . 65 4.3 Linear Array. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 4.4 Ring. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 4.5 Mesh. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 4.6 Torus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 4.7 Hypercubes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.8 Complete graph. . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.9 Long communication delays between two processors. . . . . . . . 70 4.10 Large differences in the workload of two processors. . . . . . . . 70 4.11 Original and ordered Jacobian matrix and corresponding DAG. . 75 4.12 Blockrecursive pattern of the model’s Jacobian matrix. . . . . . . 77 4.13 Matrix L for the Gauss-Seidel algorithm. . . . . . . . . . . . . . . 78 5.1 Linkages of the country models in the complete version of MUL- TIMOD. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 5.2 Incidence matrix of D in MULTIMOD. . . . . . . . . . . . . . . 93 5.3 Incidence matrices E3 to E1, D and A1 to A5. . . . . . . . . . . 100
  • 7. LIST OF FIGURES vi 5.4 Alignment of data in memory. . . . . . . . . . . . . . . . . . . . 104 5.5 Elapsed time for 4 processors and for a single processor. . . . . 106 5.6 Relation between r and κ2 in submodel for Japan for MULTIMOD.112 5.7 Scheduling of operations for the solution of the linear system as computed on page 110. . . . . . . . . . . . . . . . . . . . . . . . . 116 5.8 Incidence matrix of the stacked system for T = 10. . . . . . . . 117
  • 8. Acknowledgements
This thesis is the result of my research at the Department of Econometrics of the University of Geneva, Switzerland.
First and foremost, I wish to express my deepest gratitude to Professor Manfred Gilli, my thesis supervisor, for his constant support and help. He has shown a great deal of patience, availability and humane qualities beyond his professional competence.
I would like to thank Professor Andrew Hughes-Hallett for agreeing to read and evaluate this work. His research also led me to discover and take an interest in the field of simulation of large macroeconometric models. I am also grateful to Professor Fabrizio Carlevaro for accepting the presidency of the jury and for reading my thesis. Moreover, I thank Professor Jean-Philippe Vial and Professor Gerhard Wanner for being part of the jury and evaluating my work.
I am happy to be able to show my gratitude to my colleagues and friends of the Department of Econometrics for creating a pleasant and enjoyable working environment. David Miceli provided constant help and kind understanding during all the stages of my research. I am grateful to Pascale Mignon for helping me proofread my text.
Finally, I wish to thank my parents for their kindness and encouragement, without which I could never have achieved my goals.
Geneva, November 1995.
  • 9. Chapter 1 Introduction
The purpose of this book is to present the available methodologies for the solution of large-scale macroeconometric models. This work reviews classical solution methods and introduces more recent techniques, such as parallel computing and nonstationary iterative algorithms.
The development of new and more efficient computational techniques has significantly influenced research and practice in macroeconometric modeling. Our aim here is to supply practitioners and researchers with both a general presentation of numerical solution methods and specific discussions of particular problems encountered in the field.
An econometric model is a simplified representation of actual economic phenomena. Real economic behavior is typically represented by a set of algebraic equations forming a system. The system involves endogenous variables, which are determined by the system itself, and exogenous variables, which influence but are not determined by the system. The model also contains parameters that we will assume are already estimated by an adequate econometric technique. We may express the econometric model in matrix form for a given period t as
F(y_t, z_t, β) = ε_t ,
where F is a vector of n functions f_i, y_t is a vector of n endogenous variables, z_t is a vector of m exogenous variables, β is a vector of k parameters and ε_t is a vector of n stochastic disturbances with zero mean.
In this work, we will concentrate on the solution of the model with respect to the endogenous variables y_t. Hence, we will solve a system such as
F(y_t, z_t) = 0 .   (1.1)
Such a model will be solved period after period for some horizon, generally outside the sample range used for estimation. Therefore, we usually drop the index t. A particular class of models, those containing anticipated variables, is described in Chapter 5. In this case, the solution has to be computed simultaneously for the periods considered.
  • 10. Introduction 2
Traditionally, in the practice of solving large macroeconometric models, two kinds of solution algorithms have been used. The most popular ones are probably first-order iterative techniques and related methods like Gauss-Seidel. One obvious reason for this is their ease of implementation. Another reason is that their computational complexity is in general quite low, mainly because Gauss-Seidel naturally exploits the sparse structure of the system of equations. The convergence of these methods depends on the particular quantification of the equations and on their ordering. Convergence is not guaranteed and its speed is linear.
Newton-type methods constitute a second group of techniques commonly used to solve models. These methods use the information about the derivatives of the equations. Their major advantages are quadratic convergence, the fact that the equations do not need to be normalized, and the fact that the ordering does not influence the convergence rate. The computational cost comprises the evaluation of the derivatives forming the Jacobian matrix and the solution of the linear system. If the linear system is solved using a classical direct method based on an LU or QR decomposition, the complexity of the whole method is O(n^3). This promises interesting savings in computations if the size n can be reduced. A common technique then consists in applying the Newton method only to a subset of equations, for instance the equations corresponding to the spike variables. This leads to a block method, i.e. a first-order iterative method where only a subsystem of equations is solved with a Newton method. The first block constitutes a recursive system and the second block (in general much smaller) is solved by a Newton method. However, such a method brings us back to the problem of convergence for the outer loop. Moreover, for macroeconometric models the block of spike variables is in most cases also recursive, which then results in carrying out unnecessary computations.
Thus, the block method tries to take advantage of both the sparse structure of the system under consideration and the desirable convergence properties of Newton-type algorithms. However, as explained above, this approach relapses into the convergence problem existing in the framework of a block method. This suggests that the sparsity should instead be exploited when solving the linear system within the Newton method, which can be achieved by using appropriate sparse techniques.
This work presents methods for the solution of large macroeconometric models. The classical approaches mentioned above are presented with a particular emphasis on the problem of the ordering of the equations. We then look into more recent developments in numerical techniques. The solution of a linear system is a basic task of most solution algorithms for systems of nonlinear equations. Therefore, we pay special attention to the solution of linear systems. A central characteristic of the linear systems arising in macroeconometric modeling is their sparsity. Hence, methods able to take advantage of a sparse structure are of crucial importance.
A more recent set of tools available for the solution of linear equations are nonstationary methods. We explore their performance for a particular class of
  • 11. Introduction 3
models in economics.
The last decade has revealed that parallel computation is now practical and has a significant impact on how large-scale computation is performed. This technology is therefore available to solve large numerical problems in economics. A consequence of this trend is that the efficient use of parallel machines may require new algorithm development. We therefore address some practical aspects concerning parallel computation.
A particular class of macroeconometric models consists of models containing forward-looking variables. Such models naturally give rise to very large systems of equations, the solution of which requires heavy computations. Thus such models constitute an interesting testing ground for the numerical methods addressed in this research.
This work is organized into five chapters. Chapter 2 reviews solution techniques for linear and nonlinear systems. First, we discuss direct methods with a particular stress on the sparse case. This is followed by the presentation of iterative methods for linear systems, covering both stationary and nonstationary techniques. For the nonlinear case, we concentrate on the Newton method and some of its principal variants. Then, we examine the nonlinear versions of first-order iterative techniques and quasi-Newton methods. The alternative approach of residual minimization and issues about global convergence are also analyzed.
The macroeconometric models we consider are large and sparse, and analyzing their logical structure is therefore relevant. Chapter 3 introduces a graph-theoretical approach to perform this analysis. We first introduce the method used to investigate recursive structures. Then, original techniques are developed to analyze interdependent structures, in particular an algorithm for computing minimal feedback sets. These techniques are then used to seek a block decomposition of a model, and we conclude with a comparison of the computational complexity of point methods versus block methods.
Chapter 4 addresses the main issues concerning the type of computer and the solution technique used in parallel computation. Practical aspects are also examined through the application of parallel techniques to the simulation of a medium-sized macroeconometric model.
In Chapter 5, we present the theoretical framework of rational expectations models. In the first part, we discuss issues concerning the existence and uniqueness of the solution. In the second part, we present a multi-region econometric model with forward-looking variables. Different solution techniques are then tested on this model.
  • 12. Chapter 2 A Review of Solution Techniques
This chapter reviews classic and well-implemented solution techniques for linear and nonlinear systems. First, we discuss direct and iterative methods for linear systems. Some of these methods are fundamental building blocks for many of the techniques for solving nonlinear systems presented later. The topic has been extensively studied and many methods have been analyzed in the scientific computing literature, see e.g. Golub and Van Loan [56], Gill et al. [47], Barrett et al. [8] and Hageman and Young [60]. Second, the nonlinear case is addressed, essentially by presenting methods based on Newton iterations.
Direct methods for solving linear systems of equations are presented first. The first section covers the LU factorization, or Gaussian elimination technique, and the second section describes an orthogonalization decomposition leading to the QR factorization. The cases of dense and sparse systems are then addressed. Other direct methods also exist, such as the Singular Value Decomposition (SVD), which can be used to solve linear systems. Even though this can constitute an interesting and useful approach, we do not resort to it here.
Section 2.4 introduces stationary iterative methods, such as the Jacobi, Gauss-Seidel and SOR techniques, and their convergence characteristics. Nonstationary iterative methods, such as the conjugate gradient, generalized minimal residual and biconjugate gradient methods, are a class of more recently developed techniques and constitute the topic of Section 2.5.
Section 2.10 presents nonlinear first-order methods that are quite popular in macroeconometric modeling. The topic of Section 2.11 is an alternative approach to the solution of a system of nonlinear equations: the minimization of the norm of the residuals. To overcome the nonconvergent behavior of the Newton method in some circumstances, two globally convergent modifications are introduced in Section 2.12.
  • 13. 2.1 LU Factorization 5 Finally, we discuss stopping criteria and scaling. 2.1 LU Factorization For a linear model, finding a vector of solutions amounts to solving for x a system written in matrix form Ax = b , (2.1) where A is a n × n real matrix and b a n × 1 real vector. System (2.1) can be solved by the Gaussian elimination method which is a widely used algorithm and here, we present its application for a dense matrix A with no particular structure. The basic idea of Gaussian elimination is to transform the original system into an equivalent triangular system. Then, we can easily find the solution of such a system. The method is based on the fact that replacing an equation by a linear combination of the others leaves the solution unchanged. First, this idea is applied to get an upper triangular equivalent system. This stage is called the forward elimination of the system. Then, the solution is found by solving the equations in reverse order. This is the back substitution phase. To describe the process with matrix algebra, we need to define a transformation that will take care of zeroing the elements below the diagonal in a column of matrix A. Let x ∈ Rn be a column vector with xk = 0. We can define τ(k) = [0 . . . 0 τ (k) k+1 . . . τ(k) n ] with τ (k) i = xi/xk for i = k + 1, . . . , n . Then, the matrix Mk = I −τ(k) ek with ek being the k-th standard vector of Rn , represents a Gauss transformation. The vector τ(k) is called a Gauss vector. By applying Mk to x, we check that we get Mkx =            1 · · · 0 0 · · · 0 ... ... ... ... ... 0 · · · 1 0 · · · 0 0 · · · −τ (k) k+1 1 · · · 0 ... ... ... ... ... 0 · · · −τ (k) n 0 · · · 1                      x1 ... xk xk+1 ... xn           =           x1 ... xk 0 ... 0           . Practically, applying such a transformation is carried out without explicitly building Mk or resorting to matrix multiplications. For example, in order to multiply Mk by a matrix C of size n × r, we only need to perform an outer product and a matrix subtraction: MkC = (I − τ(k) ek)C = C − τ(k) (ekC) . (2.2) The product ekC selects the k-th row of C, and the outer product τ(k) (ekC) is subtracted from C. However, only the rows from k + 1 to n of C have to be updated as the first k elements in τ(k) are zeros. We denote by A(k) the matrix Mk · · · M1A, i.e. the matrix A after the k-th elimination step.
  • 14. 2.1 LU Factorization 6 To triangularize the system, we need to apply n − 1 Gauss transformations, provided that the Gauss vector can be found. This is true if all the divisors a (k) kk —called pivots—used to build τ(k) for k = 1, . . . , n are different from zero. If for a real n×n matrix A the process of zeroing the elements below the diagonal is successful, we have Mn−1Mn−2 · · · M1A = U , where U is a n × n upper triangular matrix. Using the Sherman-Morrison- Woodbury formula, we can easily find that if Mk = I − τ(k) ek then M−1 k = I + τ(k) ek and so defining L = M−1 1 M−1 2 · · · M−1 n−1 we can write A = LU . As each matrix Mk is unit lower triangular, each M−1 k also has this property; therefore, L is unit lower triangular too. By developing the product defining L, we have L = (I + τ(1) e1)(I + τ(2) e2) · · · (I + τ(n−1) en−1) = I + n−1 k=1 τ(k) ek . So L contains ones on the main diagonal and the vector τ(k) in the k-th column below the diagonal for k = 1, . . . , n − 1 and we have L =         1 τ (1) 1 1 τ (1) 2 τ (2) 1 1 ... ... ... τ (1) n−1 τ (2) n−2 · · · τ (n−1) 1 1         . By applying the Gaussian elimination to A we found a factorization of A into a unit lower triangular matrix L and an upper triangular matrix U. The existence and uniqueness conditions as well as the result are summarized in the following theorem. Theorem 1 A ∈ Rn×n has an LU factorization if the determinants of the first n − 1 principal minors are different from 0. If the LU factorization exists and A is nonsingular, then the LU factorization is unique and det(A) = u11 · · · unn. The proof of this theorem can be found for instance in Golub and Van Loan [56, p. 96]. Once the factorization has been found, we obtain the solution for the system Ax = b, by first solving Ly = b by forward substitution and then solving Ux = y by back substitution. Forward substitution for a unit lower triangular matrix is easy to perform. The first equation gives y1 = b1 because L contains ones on the diagonal. Substitut- ing y1 in the second equation gives y2. Continuing thus, the triangular system Ly = b is solved by substituting all the known yj to get the next one. Back substitution works similarly, but we start with xn since U is upper trian- gular. Proceeding backwards, we get xi by replacing all the known xj (j > i) in the i-th equation of Ux = y.
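As an illustration, the two substitution phases just described can be written as simple loops in MATLAB, the computing environment used later in this work. The factors L (unit lower triangular) and U (upper triangular) and the right-hand side b are assumed to be given, so this is only a sketch rather than production code.

  % Forward substitution: solve Ly = b, exploiting the unit diagonal of L
  n = length(b);
  y = zeros(n, 1);
  for i = 1:n
      y(i) = b(i) - L(i, 1:i-1) * y(1:i-1);
  end
  % Back substitution: solve Ux = y, proceeding from the last equation upwards
  x = zeros(n, 1);
  for i = n:-1:1
      x(i) = (y(i) - U(i, i+1:n) * x(i+1:n)) / U(i, i);
  end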
  • 15. 2.1 LU Factorization 7 2.1.1 Pivoting As described above, the Gaussian elimination breaks down when a pivot is equal to zero. In such a situation, a simple exchange of the equations leading to a nonzero pivot may get us round the problem. However, the condition that all the pivots have to be different than zero does not suffice to ensure a numerically reliable result. Moreover at this stage, the Gaussian elimination method, is still numerically unstable. This means that because of cancellation errors, the process described can lead to catastrophic results. The problem lies in the size of the elements of the Gauss vector τ. If they are too large compared to the elements from which they are subtracted in Equation (2.2), rounding errors may be magnified thus destroying the numerical accuracy of the computation. To overcome this difficulty a good strategy is to exchange the rows of the matrix during the process of elimination to ensure that the elements of τ will always be smaller or equal to one in magnitude. This is achieved by choosing the permutation of the rows so that |a (k) kk | = max i>k |a (k) ik | . (2.3) Such an exchange strategy is called partial pivoting and can be formalized in matrix language as follows. Let Pi be a permutation matrix of order n, i.e. the identity matrix with its rows reordered. To ensure that no element in τ is larger than one in absolute value, we must permute the rows of A before applying the Gauss transformation. This is applied at each step of the Gaussian elimination process, which leads to the following theorem: Theorem 2 If Gaussian elimination with partial pivoting is used to compute the upper triangularization Mn−1Pn−1 · · · M1P1A = U , then PA = LU where P = Pn−1 · · · P1 and L is a unit lower triangular matrix with | ij| ≤ 1. Thus, when solving a linear system Ax = b, we first compute the vector y = Mn−1Pn−1 · · · M1P1b and then solve Ux = y by back substitution. This method is much more stable and it is very unlikely to find catastrophic cancellation problems. The proof of Theorem 2 is given in Golub and Van Loan [56, p. 112]. Going one step further would imply permuting not only the rows but also the columns of A so that in the k-th step of the Gaussian elimination the largest element of the submatrix to be transformed is used as pivot. This strategy is called complete pivoting. However, applying complete pivoting is costly because one needs to search for the largest element in a matrix instead of a vector at each elimination step. This overhead does not justify the gain one may obtain in the stability of the method in practice. Therefore, the algorithm of choice for solving Ax = b, when A has no particular structure, is Gaussian elimination with partial pivoting.
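In practice one relies on a library routine for the factorization itself. The following MATLAB fragment sketches the complete solution of Ax = b by LU with partial pivoting; the matrix and right-hand side are made-up data for the example.

  A = [2 1 1; 4 3 3; 8 7 9];    % example data (assumed)
  b = [4; 10; 24];
  [L, U, P] = lu(A);            % factorization with partial pivoting: P*A = L*U
  y = L \ (P*b);                % forward substitution
  x = U \ y;                    % back substitution
  norm(A*x - b)                 % residual check

The backslash operator recognizes the triangular structure of L and U and performs the corresponding substitutions, so no inverse is ever formed explicitly.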
  • 16. 2.2 QR Factorization 8 2.1.2 Computational Complexity The number of elementary arithmetic operations (flops) for the Gaussian elim- ination is 2 3 n3 − 1 2 n2 − 1 6 n and therefore this methods is O(n3 ). 2.1.3 Practical Implementation In the case where one is only interested in the solution vector, it is not necessary to explicitly build matrix L. It is possible to directly compute the y vector (solution of Ly = b) while transforming matrix A into an upper triangular matrix U. Despite the fact that Gaussian elimination seems to be easy to code, it is cer- tainly not advisable to write our own code. A judicious choice is to rely on carefully tested software as the routines in the LAPACK library. These routines are publicly available on NETLIB1 and are also used by the software MATLAB2 which is our main computing environment for the experiments we carried out. 2.2 QR Factorization The QR factorization is an orthogonalization method that can be applied to square or rectangular matrices. Usually this is a key algorithm for computing eigenvalues or least-squares solutions and it is less applied to find the solution of a square linear system. Nevertheless, there are at least 3 reasons (see Golub and Van Loan [56]) why orthogonalization methods, such as QR, might be considered: • The orthogonal methods have guaranteed numerical stability which is not the case for Gaussian elimination. • In case of ill-conditioning, orthogonal methods give an added measure of reliability. • The flop count tends to exaggerate the Gaussian elimination advantage.3 (Particularly for parallel computers, memory traffic and other overheads tend to reduce this advantage.) Another advantage that might favor the QR factorization is the possibility of up- dating the factors Q and R corresponding to a rank one modification of matrix A in O(n2 ) operations. This is also possible for the LU factorization; however, 1NETLIB can be accessed through the World Wide Web at http://www.netlib.org/ and collects mathematical software, articles and databases useful for the scientific com- munity. In Europe the URL is http://www.netlib.no/netlib/master/readme.html or http://elib.zib-berlin.de/netlib/master/readme.html . 2MATLAB High Performance Numeric Computation and Visualization Software is a prod- uct and registered trademark of The MathWorks, Inc., Cochituate Place, 24 Prime Park Way, Natick MA 01760, USA. URL: http://www.mathworks.com/ . 3In the application discussed in Section 4.2.2 we used the QR factorization available in the libraries of the CM2 parallel computer.
  • 17. 2.2 QR Factorization 9 the implementation is much simpler with QR, see Gill et al. [47]. Updating tech- niques will prove particularly useful in the quasi-Newton algorithm presented in Section 2.9. These reasons suggest that QR probably are, especially on parallel devices, a possible alternative to LU to solve square systems. The QR factorization can be applied to any rectangular matrix, but we will focus on the case of a n × n real matrix A. The goal is to apply to A successive orthogonal transformation matrices Hi, i = 1, 2, . . . , r to get an upper triangular matrix R, i.e. Hr · · · H1A = R . The orthogonal transformations presented in the literature are usually based upon Givens rotations or Householder reflections. This latter choice leads to algorithms involving less arithmetic operations and is therefore presented in the following. A Householder transformation is a matrix of the form H = I − 2ww with w w = 1 . Such a matrix is symmetric, orthogonal and its determinant is −1. Geometri- cally, this matrix represents a reflection with respect to the hyperplane defined by {x|w x = 0}. By properly choosing the reflection plane, it is possible to zero particular elements in a vector. Let us partition our matrix A in n column vectors [a1 · · · an]. We first look for a matrix H1 such as all the elements of H1a1 except the first one are zeros. We define s1 = −sign(a11) a1 µ1 = (2s2 1 − 2a11s1)−1/2 u1 = [(a11 − s1) a21 · · · an1] w1 = µ1u1 . Actually the sign of s1 is free, but it is chosen to avoid catastrophic cancellation that may otherwise appear in computing µ1. As w1w1 = 1, we can let H1 = I − 2w1w1 and verify that H1a1 = [s1 0 · · · 0] . Computationally, it is more efficient to calculate the product H1A in the fol- lowing manner H1A = A − 2w1w1A = A − 2w1 w1a1 w1a2 · · · w1am so the i-th column of H1A is ai − 2(w1ai)w1 = ai − (c1 u1ai)w1 and c1 = 2µ2 1 = (s2 1 − s1a11)−1 . We continue this process in a similar way on a matrix A where we have removed the first row and column. The vectors w2 and u2 will now be of dimension (n − 1) × 1 but we can complete them with zeros to build H2 = I − 0 w2 0 w2 .
  • 18. 2.3 Direct Methods for Sparse Matrices 10 After n − 1 steps, we have Hn−1 · · · H2H1A = R. As all the matrices Hi are orthogonal, their product is orthogonal too and we get A = QR , with Q = (Hn−1 · · · H1) = H1 · · · Hn−1. In practice, one will neither form the vectors wi nor calculate the Q matrix as all the information is contained in the ui vectors and the si scalars for i = 1, . . . , n. The possibility to choose the sign of s1 such that there never is a subtraction in the computation of µ1 is the key for the good numerical behavior of the QR factorization. We notice that the computation of u1 also involves a subtraction. It is possible to permute the column with the largest sum of squares below row i − 1 into column i during the i-th step in order to minimize the risk of digit cancellation. This then leads to a factorization PA = QR , where P is a permutation matrix. Using this factorization of matrix A, it is easy to find a solution for the system Ax = b. We first compute y = Q b and then solve Rx = y by back substitution. 2.2.1 Computational Complexity The computational complexity of the QR algorithm for a square matrix of order n is 4 3 n3 + O(n2 ). Hence the method is of O(n3 ) complexity. 2.2.2 Practical Implementation Again as for the LU decomposition, the explicit computation of matrix Q is not necessary as we may build vector y during the triangularization process. Only the back substitution phase is needed to get the solution of the linear system Ax = b. As has already been mentioned, the routines for computing a QR factoriza- tion (or solving a system via QR) are readily available in LAPACK and are implemented in MATLAB. 2.3 Direct Methods for Sparse Matrices In many cases, matrix A of the linear system contains numerous zero entries. This is particularly true for linear systems derived from large macroeconometric models. Such a situation may be exploited in order to organize the computations in a way that involves only the nonzero elements. These techniques are known as sparse direct methods (see e.g. Duff et al. [30]) and crucial for efficient solution of linear systems in a wide class of practical applications.
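Before turning to the sparse storage schemes, here is the dense solve via QR described in Section 2.2.2, again sketched in MATLAB with assumed data.

  A = [2 1 1; 4 3 3; 8 7 9];    % example data (assumed)
  b = [4; 10; 24];
  [Q, R] = qr(A);               % A = Q*R with Q orthogonal
  x = R \ (Q' * b);             % form Q'*b, then back substitution
  % [Q, R, E] = qr(A) would in addition return a permutation E with A*E = Q*R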
  • 19. 2.3 Direct Methods for Sparse Matrices 11 2.3.1 Data Structures and Storage Schemes The interest of considering sparse structures is twofold: first, the information can be stored in a much more compact way; second, the computations may be performed avoiding redundant arithmetic operations involving zeros. These two aspects are somehow conflicting as a compact storage scheme may involve more time consuming addressing operations for performing the computations. However, this conflict vanishes quickly when large problems are considered. In order to define our idea more clearly, let us define the density of a matrix as the ratio between its nonzero entries and its total number of entries. Generally, when the size of the system gets larger, the density of the corresponding matrix decreases. In other words, the larger the problem is, the sparser its structure becomes. Several storage structures exist for a same sparse matrix. There is no one best data structure since the choice depends both on the data manipulations the computations imply and on the computer architecture and/or language in which these are implemented. The following three data structures are generally used: • coordinate scheme, • list of successors (collection of sparse vectors), • linked list. The following example best illustrates these storage schemes. We consider the 5 × 5 sparse matrix A =      0 −2 0 0 0.5 0 5 0 7 0 0 0 1.7 0 6 3.1 0 0 −0.2 0 0 0 1.2 −3 0      . Coordinate Scheme In this case, three arrays are used: two integer arrays for the row and column indices—respectively r and c—and a real array x containing the elements. For our example we have r 4 1 2 3 5 2 4 5 1 3 c 1 2 2 3 3 4 4 4 5 5 x 3.1 −2 5 1.7 1.2 7 −0.2 −3 0.5 6 . Each entry of A is represented by a triplet and corresponds to a column in the table above. Such a storage scheme needs less memory than a full storage if the density of A is less than 1 3 . The insertion and deletion of elements are easy to perform, whereas the direct access of elements is relatively complex. Many computations in linear algebra involve successive scans of the columns of a matrix which is difficult to carry out using this representation.
  • 20. 2.3 Direct Methods for Sparse Matrices 12 List of successors (Collection of Sparse Vectors) With this storage scheme, the sparse matrix A is stored as the concatenation of the sparse vectors representing its columns. Each sparse vector consists of a real array containing the nonzero entries and an integer array of corresponding row indices. A second integer array gives the locations in the other arrays of the first element in each column. For our matrix A, this representation is index 1 2 3 4 5 6 h 1 2 4 6 9 11 index 1 2 3 4 5 6 7 8 9 10 4 1 2 3 5 2 4 5 1 3 x 3.1 −2 5 1.7 1.2 7 −0.2 −3 0.5 6 The integer array h contains the addresses of the list of row elements in and x. For instance, the nonzero entries in column 4 of A are stored at positions h(4) = 6 to h(5)−1 = 9−1 = 8 in x. Thus, the entries are x(6) = 7, x(7) = −0.2 and x(8) = −3. The row indices are given by the same locations in array , i.e. (6) = 2, (7) = 4 and (8) = 5. MATLAB mainly uses this data structure to store its sparse matrices, see Gilbert et al. [44]. The main advantage is that columns can be easily accessed, which is of very important for numerical linear algebra algorithms. The disadvantage of such a representation is the difficulty of inserting new entries. This arises for instance when adding a row to another. Linked List The third alternative that is widely used for storing sparse matrices is the linked list. Its particularity is that we define a pointer (named head) to the first entry and each entry is associated to a pointer pointing to the next entry or to the null pointer (named 0) for the last entry. If the matrix is stored by columns, we start a new linked list for each column and therefore we have as many head pointers as there are columns. Each entry is composed of two pieces: the row index and the value of the entry itself. This is represented by the picture: head 1 4 3.1 0 head 5 1 0.5 3 6 0 E E E ... • The structure can be implemented as before with arrays and we get
  • 21. 2.3 Direct Methods for Sparse Matrices 13 index 1 2 3 4 5 head 4 5 9 1 7 index 1 2 3 4 5 6 7 8 9 10 row 2 4 5 4 1 2 1 3 3 5 entry 7 −0.2 −3 3.1 −2 5 0.5 6 1.7 1.2 link 2 3 0 0 6 0 8 0 10 0 For instance, to retrieve the elements of column 3, we begin to read head(3)=9. Then row(9)=3 gives the row index, the entry value is entry(9)=1.7 and the pointer link(9)=10 gives the next index address. The values row(10)=5, en- try(10)=1.2 and link(10)=0 indicate that the element 1.2 is at row number 5 and is the last entry of the column. The obvious advantage is the ease with which elements can be inserted and deleted: the pointers are simply updated to take care of the modification. This data structure is close to the list of successors representation, but does not necessitate contiguous storage locations for the entries of a same column. In practice it is often necessary to switch from one representation to another. We can also note that the linked list and the list of successors can similarly be defined row-wise rather than column wise. 2.3.2 Fill-in in Sparse LU Given a storage scheme, one could think of executing a Gaussian elimination as described in Section 2.1. However, by doing so we may discover that the sparsity of our initial matrix A is lost and we may obtain relatively dense matrices L and U. Indeed, depending on the choice of the pivots, the number of entries in L and U may vary. From Equation (2.2), we see that at step k of the Gaussian elimi- nation algorithm, we subtract two matrices in order to zero the elements below the diagonal of the k-th column. Depending on the Gauss vector τ(k) , matrix τ(k) ekC may contain nonzero elements which do not exist in matrix C. This creation of new elements is called fill-in. A crucial problem is then to minimize the fill-in as the number of operations is proportional to the density of the submatrix to be triangularized. Furthermore, a dense matrix U will result in an expensive back substitution phase. A minimum fill-in may however conflict with the pivoting strategy, i.e. the pivot chosen to minimize the fill-in may not correspond to the element with maximum magnitude among the elements below the k-th diagonal as defined by Equation (2.3). A common tradeoff to limit the loss of numerical stability of the sparse Gaussian elimination is to accept a pivot element satisfying the following threshold inequality |a (k) kk | ≥ u max i>k |a (k) ik | , where u is the threshold parameter and belongs to (0, 1]. A choice for u suggested by Duff, Erisman and Reid [30] is u = 0.1 . This parameter heavily influences the fill-in and hence the complexity of the method.
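The storage schemes of Section 2.3.1 can be tried out directly in MATLAB, which accepts a matrix given in coordinate (triplet) form and keeps it internally as a collection of sparse column vectors. The triplets below reproduce the 5 × 5 example matrix used above; the right-hand side is an assumption for the sketch.

  r = [4 1 2 3 5 2 4 5 1 3];                 % row indices
  c = [1 2 2 3 3 4 4 4 5 5];                 % column indices
  v = [3.1 -2 5 1.7 1.2 7 -0.2 -3 0.5 6];    % nonzero values
  A = sparse(r, c, v, 5, 5);                 % stored column-wise (list of successors)
  nnz(A)                                     % number of stored nonzeros: 10
  b = ones(5, 1);                            % assumed right-hand side
  x = A \ b;                                 % solved with a sparse LU factorization

Only the nonzero entries and their indices are kept, so it is the density of the matrix rather than its order that drives the storage requirement.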
  • 22. 2.4 Stationary Iterative Methods 14 2.3.3 Computational Complexity It is not easy to establish an exact operation count for the sparse LU. The count depends on the particular structure of matrix A and on the chosen pivoting strategy. For a good implementation, we may expect a complexity of O(c2 n) where c is the average number of elements in a row and n is the order of matrix A. 2.3.4 Practical Implementation A widely used code for the direct solution of sparse linear systems is the Harwell MA28 code available on NETLIB, see Duff [29]. A new version called MA48 is presented in Duff and Reid [31]. The software MATLAB has its own implementation using partial pivoting and minimum-degree ordering for the columns to reduce fill-in, see Gilbert et al. [44] and Gilbert and Peierls [45]. Other direct sparse solvers are also available through NETLIB (e.g. Y12MA, UMFPACK, SuperLU, SPARSE). 2.4 Stationary Iterative Methods Iterative methods form an important class of solution techniques for solving large systems of equations. They can be an interesting alternative to direct methods because they take into account the sparsity of the system and are moreover easy to implement. Iterative methods may be divided into two classes: stationary and nonstationary. The former rely on invariant information from an iteration to another, whereas the latter modify their search by using the results of previous iterations. In this section, we present stationary iterative methods such as Jacobi, Gauss- Seidel and SOR techniques. The solution x∗ of the system Ax = b can be approximated by replacing A by a simpler nonsingular matrix M and by rewriting the systems as, Mx = (M − A)x + b . In order to solve this equivalent system, we may use the following recurrence formula from a chosen starting point x0, Mx(k+1) = (M − A)x(k) + b , k = 0, 1, 2, . . . . (2.4) At each step k the system (2.4) has to be solved, but this task can be easy according to the choice of M. The convergence of the iterates to the solution is not guaranteed. However, if the sequence of iterates {x(k) }k=0,1,2,... converges to a limit x(∞) , then we have x(∞) = x∗ , since relation (2.4) becomes Mx(∞) = (M − A)x(∞) + b, that is Ax(∞) = b.
  • 23. 2.4 Stationary Iterative Methods 15 The iterations should be carried out an infinite number of times to reach the solution, but we usually obtain a good approximation of x∗ after a fairly small number of iterations. There is a tradeoff between the ease in computing x(k+1) from (2.4) and the speed of convergence of the stationary iterative method. The simplest choice for M would be to take M = I and the fastest convergence would be obtained by setting M = A. Of course, the choices of M that are of interest to us lie between these two extreme cases. Let us split the original system matrix A into A = L + D + U , where D is the diagonal of matrix A and L and U are the strictly lower and upper triangular parts of A, defined respectively by dii = aii for all i, lij = aij for i > j and uij = aij for i < j. 2.4.1 Jacobi Method One of the simplest iterative procedures is the Jacobi method, which is found by setting M = D. If we assume that the diagonal elements of A are nonzero, then solving the system Dx(k+1) = c for x(k+1) is easy; otherwise, we need to permute the equations to find such a matrix D. We can note that when the model is normalized, we have D = I and the iterations are further simplified. The sequence of Jacobi’s iterates is defined in matrix form by Dx(k+1) = −(L + U)x(k) + b , k = 0, 1, 2, . . . , or by Algorithm 1 Jacobi Method Given a starting point x(0) ∈ Rn for k = 0, 1, 2, . . . until convergence for i = 1, . . . , n x (k+1) i = (bi − j=i aijx (k) i )/aii end end In this method, all the entries of the vector x(k+1) are computed using only the entries of x(k) . Hence, two separate vectors must be stored to carry out the iterations. 2.4.2 Gauss-Seidel Method In the Gauss-Seidel method (GS), we use the most recently available information to update the iterates. In this case, the i-th component of x(k+1) is computed using the (i − 1) first entries of x(k+1) that have already been obtained and the (n − i − 1) other entries from x(k) .
  • 24. 2.4 Stationary Iterative Methods 16 This process amounts to using M = L + D and leads to the formula (L + D)x(k+1) = −Ux(k) + b , or to the following algorithm: Algorithm 2 Gauss-Seidel Method Given a starting point x(0) ∈ Rn for k = 0, 1, 2, . . . until convergence for i = 1, . . . , n x (k+1) i = (bi − j<i aijx (k+1) i ) − j>i aij x (k) i )/aii end end The matrix formulation of the iterations is useful for theoretical purposes, but the actual computation will generally be implemented component-wise as in Algorithm 1 and Algorithm 2. 2.4.3 Successive Overrelaxation Method A third useful technique called SOR for Successive Overrelaxation method is very closely related to the Gauss-Seidel method. The update is computed as an extrapolation of the Gauss-Seidel step as follows: let x (k+1) GS denote the (k + 1) iterate for the GS method; the new iterates can then be written as in the next algorithm. Algorithm 3 Successive Overrelaxation Method Given a starting point x(0) ∈ Rn for k = 0, 1, 2, . . . until convergence Compute x (k+1) GS by Algorithm 2 for i=1,. . . ,n x (k+1) i = x (k) i + ω(x (k+1) GS,i − x (k) i ) end end The scalar ω is called the relaxation parameter and its optimal value, in order to achieve the fastest convergence, depends on the characteristics of the problem in question. A necessary condition for the method to converge is that ω lies in the interval (0, 2]. When ω < 1, the GS step is dampened and this is sometimes referred to as under-relaxation. In matrix form, the SOR iteration is defined by (ωL + D)x(k+1) = ((1 − ω)D − ωU)x(k) + ωb , k = 0, 1, 2, . . . . (2.5) When ω is unity, the SOR method collapses to GS.
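The three schemes can be compared with a few lines of MATLAB based on the matrix splittings given above. Here A, b, a starting vector x and a relaxation parameter w are assumed to be already defined, and the sweep to be tested is selected by uncommenting the corresponding line.

  D = diag(diag(A));  Lo = tril(A, -1);  Up = triu(A, 1);    % A = Lo + D + Up
  tol = 1e-8;  maxit = 500;
  for k = 1:maxit
      xnew = D \ (b - (Lo + Up) * x);                        % Jacobi
      % xnew = (Lo + D) \ (b - Up * x);                      % Gauss-Seidel
      % xnew = (w*Lo + D) \ (((1 - w)*D - w*Up) * x + w*b);  % SOR, cf. (2.5)
      if norm(xnew - x, inf) < tol * norm(xnew, inf)
          break
      end
      x = xnew;
  end
  x = xnew;

The matrix form is used here only for conciseness; as noted above, an actual implementation would use the component-wise loops of Algorithms 1 to 3 and exploit the sparsity of A.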
  • 25. 2.4 Stationary Iterative Methods 17 2.4.4 Fast Gauss-Seidel Method The idea of extrapolating the step size to improve the speed of convergence can also be applied to SOR iterates and gives rise to the Fast Gauss-Seidel method (FGS) or Accelerated Over Relaxation method, see Hughes Hallett [68] and Hadjidimos [59]. Let us denote by x (k+1) SOR the (k + 1) iterate obtained by Equation (3); then the FGS iterates are defined by Algorithm 4 FGS Method Given a starting point x(0) ∈ Rn for k = 0, 1, 2, . . . until convergence Compute x (k+1) SOR by Algorithm 3 for i = 1, . . . , n x (k+1) i = x (k) i + γ(x (k+1) SOR,i − x (k) i ) end end This method may be seen as a second-order method, since it uses a SOR iterate as an intermediate step to compute its next guess, and that the SOR already uses the information from a GS step. It is easy to see that when γ = 1, we find the SOR method. Like ω in the SOR part, the choice of the value for γ is not straightforward. For some problems, the optimal choice of ω can be explicitly found (this is dis- cussed in Hageman and Young [60]). However, it cannot be determined a priori for general matrices. There is no way of computing the optimal value for γ cheaply and some authors (e.g. Hughes Hallett [69], Yeyios [103]) offered ap- proximations of γ. However, numerical tests produced variable outcomes: some- times the approximation gave good convergence rates, sometimes poor ones, see Hughes-Hallett [69]. As for the ω parameter, the value of γ is usually chosen by experimentation on the characteristics of system at stake. 2.4.5 Block Iterative Methods Certain problems can naturally be decomposed into a set of subproblems with more or less tight linkages.4 In economic analysis, this is particularly true for multi-country macroeconometric models where the different country models are linked together by a relatively small number of trade relations for example (see Faust and Tryon [35]). Another such situation is the case of disaggregated multi-sectorial models where the links between the sectors are relatively weak. In other problems where such a decomposition does not follow from the con- struction of the system, one may resort to a partition where the subsystems are easier to solve. A block iterative method is then a technique where one iterates over the sub- systems. The technique to solve the subsystem is free and not relevant for the 4The original problem is supposed to be indecomposable in the sense described in Sec- tion 3.1.
  • 26. 2.4 Stationary Iterative Methods 18 discussion. Let us suppose the matrix of our system is partitioned in the form A =      A11 A12 · · · A1N A21 A22 · · · A2N ... ... ... AN1 AN2 · · · ANN      where the diagonal blocks Aii i = 1, 2, . . ., N are square. We define the block diagonal matrix D, the block lower triangular matrix L and the block upper triangular matrix U such that A = D + L + U: D =      A11 0 · · · 0 0 A22 · · · 0 . . . . . . ... . . . 0 0 · · · ANN      , L =      0 0 · · · 0 A21 0 · · · 0 . . . . . . ... . . . AN1 AN2 · · · 0      , U =      0 A12 · · · A1N 0 0 · · · A2N . . . . . . ... . . . 0 0 · · · 0      . If we write the problem Ay = b under the same partitioned form, we have    A11 · · · A1N ... ... AN1 · · · ANN       y1 ... yN    =    b1 ... bN    or else N j=1 Aij yj = bi , i = 1, 2, . . . , N . Suppose the Aii i = 1, 2, . . ., N are nonsingular, then the following solution scheme may be applied: Algorithm 5 Block Jacobi method (BJ) Given a starting point x(0) ∈ Rn for k = 0, 1, 2, . . . until convergence Solve for y (k+1) i : Aii y (k+1) i = bi − N j=1 j=i Aij y (k) j , i = 1, 2, . . . , N end As we only use the information of step k to compute y (k+1) i , this scheme is called a block iterative Jacobi method (BJ). We can certainly use the most recent available information on the y’s for up- dating y(k+1) and this leads to the block Gauss-Seidel method (BGS):
  • 27. 2.4 Stationary Iterative Methods 19 Algorithm 6 Block Gauss-Seidel method (BGS) Given a starting point x(0) ∈ Rn for k = 0, 1, 2, . . . until convergence Solve for y (k+1) i : Aii y (k+1) i = bi − i−1 j=1 Aij y (k+1) j − N j=i+1 Aij y (k) j , i = 1, 2, . . . , N end Similarly to the presentation in Section 2.4.3, the SOR option can also be applied as follows: Algorithm 7 Block successive over relaxation method (BSOR) Given a starting point x(0) ∈ Rn for k = 0, 1, 2, . . . until convergence Solve for y (k+1) i : Aii y (k+1) i = Aii y (k) i + ω bi − i−1 j=1 Aij y (k+1) j − N j=i+1 Aij y (k) j − Aii y (k) i , i = 1, 2, . . . , N end We assume that the systems Aii yi = ci can be solved by either direct or iterative methods. The interest of such block methods is to offer possibilities of splitting the problem in order to solve one piece at a time. This is useful when the size of the problem is such that it cannot entirely fit in the memory of the computer. Parallel computing also allows taking advantage of a block Jacobi implementation, since different processors can simultaneously take care of different subproblems and thus speed up the solution process, see Faust and Tryon [35]. 2.4.6 Convergence Let us now study the convergence of the stationary iterative techniques intro- duced in the last section. The error at iteration k is defined by e(k) = (x(k) − x∗ ) and subtracting Equa- tion 2.4 evaluated at x∗ to the same evaluated at x(k) , we get Me(k) = (M − A)e(k−1) . We can now relate e(k) to e(0) by writing e(k) = Be(k−1) = B2 e(k−2) = · · · = Bk e(0) , where B is a matrix defined to be M−1 (M − A). Clearly, the convergence of {x(k) }k=0,1,2,... to x∗ depends on the powers of matrix B: if limk→∞ Bk = 0, then limk→∞ x(k) = x∗ . It is not difficult to show that lim k→∞ Bk = 0 ⇐⇒ |λi| < 1 ∀i .
  • 28. 2.4 Stationary Iterative Methods 20 Indeed, if B = PJP−1 where J is the Jordan canonical form of B, then Bk = PJk P−1 and limk→∞ Bk = 0 if and only if limk→∞ Jk = 0. The matrix J is formed of Jordan blocks Ji and we see that the k-th power (for k larger than the size of the block) of Ji is (Ji)k =            λk i kλk−1 i k 2 λk−2 i · · · k n − 1 λk−n+1 i ... ... .. . k 2 λk−2 i kλk−1 i λk i            , and therefore that the powers of J tend to zero if and only if |λi| < 1 for all i. We can write the different matrices governing the convergence for each station- ary iterative method as follows: BJ = −D−1 (L + U) for Jacobi’s method, BGS = −(L + D)−1 U for Gauss-Seidel, Bω = (ωL + D)−1 ((1 − ω)D − ωU) for SOR. Therefore, the speed of convergence of such methods depends on the spectral radius of B, denoted by ρ(B) = maxi |λi| where λi stands for the i-th eigenvalue of matrix B. The FGS method converges for some γ > 0, if the real part of the eigenvalues of the matrix Bω is less than unity. Given that the method converges, i.e. that ρ(B) < 1, the number of iterations is approximately log log ρ(B) , with a convergence criterion5 expressed as max i |x (k) i − x (k−1) i | |x (k−1) i | < . Hence, to minimize the number of iterations, we seek a splitting of matrix A and parameters that yield a matrix B with the lowest possible spectral radius. Different row-column permutations of A influence ρ(B) when GS, SOR and FGS methods are applied, whereas Jacobi method is invariant to such permutations. These issues are discussed in more detail in Section 3.3. For matrices without special structure, these problems do not have a practical solution so far. 2.4.7 Computational Complexity The number of elementary operations for an iteration of Jacobi or Gauss-Seidel is (2c + 1)n where c is the average number of elements in a row of A. For SOR, the count is (2c + 4)n and for FGS (2c + 7)n. Therefore, iterative methods become competitive with sparse direct methods if the number of iterations K needed to converge is of order c or less. 5See Section 2.13 for a discussion of stopping criteria.
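When the system is small enough for an eigenvalue computation to be affordable, the quantities governing convergence can be checked explicitly. The MATLAB fragment below forms the iteration matrices defined above and gives a rough estimate of the number of iterations needed to reach a relative tolerance epsilon; A, w and epsilon are assumptions for the sketch.

  D = diag(diag(A));  Lo = tril(A, -1);  Up = triu(A, 1);
  BJ  = -D \ (Lo + Up);                      % Jacobi iteration matrix
  BGS = -(Lo + D) \ Up;                      % Gauss-Seidel iteration matrix
  Bw  = (w*Lo + D) \ ((1 - w)*D - w*Up);     % SOR iteration matrix
  rho = max(abs(eig(BGS)));                  % spectral radius: must be < 1 to converge
  epsilon = 1e-6;
  iters = log(epsilon) / log(rho)            % approximate number of iterations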
  • 29. 2.5 Nonstationary Iterative Methods 21 2.5 Nonstationary Iterative Methods Nonstationary methods have been more recently developed. They use infor- mation that changes from iteration to iteration unlike the stationary methods discussed in Section 2.4. These methods are computationally attractive as the operations involved can easily be executed on sparse matrices and also require few storage. They also generally show a better convergence speed than station- ary iterative methods. Presentations of nonstationary iterative methods can be found for instance in Freund et al. [39], Barrett et al. [8], Axelsson [7] and Kelley [73]. First, we have to present some algorithms that solve particular systems, such as symmetric positive definite ones, from which were derived the nonstationary iterative methods for solving the general linear systems we are interested in. 2.5.1 Conjugate Gradient The first and perhaps best known of the nonstationary methods is the Con- jugate Gradient (CG) method proposed by Hestenes and Stiefel [64]. This technique solves symmetric positive definite systems Ax = b by using only matrix-vector products, inner products and vector updates. The method may also be interpreted as arising from the minimization of the quadratic function q(x) = 1 2 x Ax − x b where A is the symmetric positive definite matrix and b the right-hand side of the system. As the first order conditions for the minimization of q(x) give the original system, the two approaches are equivalent. The idea of the CG method is to update the iterates x(i) in the direction p(i) and to compute the residuals r(i) = b−Ax(i) in such a way as to ensure that we achieve the largest decrease in terms of the objective function q and furthermore that the direction vectors p(i) are A-orthogonal. The largest decrease in q at x(0) is obtained by choosing an update in the direction −Dq(x(0) ) = b−Ax(0) . We see that the direction of maximum decrease is the residual of x(0) defined by r(0) = b − Ax(0) . We can look for the optimum step length in the direction r(0) by solving the line search problem min α q(x(0) + αr(0) ) . As the derivative with respect to α is Dαq(x(0) + αr(0) ) = x(0) Ar(0) + αr(0) Ar(0) − b r(0) = (x(0) A − b )r(0) + αr(0) Ar(0) = r(0) r(0) + αr(0) Ar(0) , the optimal α is α0 = − r(0) r(0) r(0) Ar(0) . The method described up to now is just a steepest descent algorithm with exact line search on q. To avoid the convergence problems which are likely to arise with this technique, it is further imposed that the update directions p(i) be
A-orthogonal (or conjugate with respect to A), in other words that we have

  p^{(i)'} A p^{(j)} = 0   for i ≠ j .   (2.6)

It is therefore natural to choose a direction p^{(i)} that is closest to r^{(i−1)} and satisfies Equation (2.6). Explicit formulas for such a p^{(i)} can be derived, see e.g. Golub and Van Loan [56, pp. 520–523]. These solutions can be expressed in a computationally efficient way involving only one matrix-vector multiplication per iteration. The CG method can be formalized as follows:

Algorithm 8 Conjugate Gradient
  Compute r^{(0)} = b − Ax^{(0)} for some initial guess x^{(0)}
  for i = 1, 2, . . . until convergence
    ρ_{i−1} = r^{(i−1)'} r^{(i−1)}
    if i = 1 then
      p^{(1)} = r^{(0)}
    else
      β_{i−1} = ρ_{i−1}/ρ_{i−2}
      p^{(i)} = r^{(i−1)} + β_{i−1} p^{(i−1)}
    end
    q^{(i)} = A p^{(i)}
    α_i = ρ_{i−1} / (p^{(i)'} q^{(i)})
    x^{(i)} = x^{(i−1)} + α_i p^{(i)}
    r^{(i)} = r^{(i−1)} − α_i q^{(i)}
  end

In the conjugate gradient method, the i-th iterate x^{(i)} can be shown to be the vector minimizing (x^{(i)} − x*)' A (x^{(i)} − x*) among all vectors in the affine subspace x^{(0)} + span{r^{(0)}, Ar^{(0)}, . . . , A^{i−1} r^{(0)}}. This subspace is called the Krylov subspace.

Convergence of the CG Method

In exact arithmetic, the CG method yields the solution in at most n iterations, see Luenberger [78, p. 248, Theorem 2]. In particular, we have the following relation for the error in the k-th CG iteration:

  ‖x^{(k)} − x*‖_2 ≤ 2√κ ( (√κ − 1)/(√κ + 1) )^k ‖x^{(0)} − x*‖_2 ,

where κ = κ_2(A), the condition number of A in the two norm. However, in finite precision and with a large κ, the method may fail to converge.

2.5.2 Preconditioning

As explained above, the convergence speed of the CG method is linked to the condition number of the matrix A. To improve the convergence speed of the CG-type methods, the matrix A is often preconditioned, that is transformed into Â = SAS', where S is a nonsingular matrix. The system solved is then Âx̂ = b̂
  • 31. 2.5 Nonstationary Iterative Methods 23 where ˆx = (S )−1 x and ˆb = Sb. The matrix S is chosen so that the condition number of matrix ˆA is smaller than the condition number of the original matrix A and, hence, speeds up the convergence. To avoid the explicit computation of ˆA and the destruction of the sparsity pattern of A, the methods are usually formalized in order to use the original matrix A directly. We can build a preconditioner M = (S S)−1 and apply the preconditioning step by solving the system M ˜r = r. Since κ2(S ˆA(S )−1 ) = κ2(S SA) = κ2(MA), we do not actually form M from S but rather directly choose a matrix M. The choice of M is constrained to being a symmetric positive definite matrix. The preconditioned version of the CG is described in the following algorithm. Algorithm 9 Preconditioned Conjugate Gradient Compute r(0) = b − Ax(0) for some initial guess x(0) for i = 1, 2, . . . until convergence Solve M ˜r(i−1) = r(i−1) ρi−1 = r(i−1) ˜r(i−1) if i = 1 then p(1) = ˜r(0) else βi−1 = ρi−1/ρi−2 p(i) = ˜r(i−1) + βi−1p(i−1) end q(i) = Ap(i) αi = ρi−1/(p(i) q(i) ) x(i) = x(i−1) + αip(i) r(i) = r(i−1) − αiq(i) end As the preconditioning speeds up the convergence, the question of how to choose a good preconditioner naturally arises. There are two conflicting goals in the choice of M. First, M should reduce the condition number of the system solved as much as possible. To achieve this, we would like to choose an M as close to matrix A as possible. Second, since the system M ˜r = r has to be solved at each iteration of the algorithm, this system should be as easy as possible to solve. Clearly, the preconditioner will be chosen between the two extreme cases M = A and M = I. When M = I, we obtain the unpreconditioned version of the method, and when M = A, the complete system is solved in the preconditioning step. One possibility is to take M = diag(a11, . . . , ann). This is not useful if the system is normalized, as it is sometimes the case for macroeconometric systems. Other preconditioning methods do not explicitly construct M. Some authors, for instance Dubois et al. [28] and Adams [1], suggest to take a given number of steps of an iterative method such as Jacobi. We can note that taking one step of Jacobi amounts to doing a diagonal scaling M = diag(a11, . . . , ann), as mentioned above. Another common approach is to perform an incomplete LU factorization (ILU)
  • 32. 2.5 Nonstationary Iterative Methods 24 of matrix A. This method is similar to the LU factorization except that it respects the pattern of nonzero elements of A in the lower triangular part of L and the upper triangular part of U. In other words, we apply the following algorithm: Algorithm 10 Incomplete LU factorization Set L = In The identity matrix of order n for k = 1, . . . , n for i = k + 1, . . . , n if aki = 0 then Respect the sparsity pattern of A ik = 0 else ik = aik/akk for j = k + 1, . . . , n if aij = 0 then Respect the sparsity pattern of A aij = aij − ikakj Gaussian elimination end end end end end Set U = upper triangular part of A This factorization can be written as A = LU + R where R is a matrix con- taining the elements that would fill-in L and U and is not actually computed. The approximate system LU ˜r = r is then solved using forward and backward substitution in the preconditioning step of the nonstationary method used. A more detailed analysis of preconditioning and other incomplete factorizations may be found in Axelsson [7]. 2.5.3 Conjugate Gradient Normal Equations In order to deal with nonsymmetric systems, it is necessary either to convert the original system into a symmetric positive definite equivalent one, or to generalize the CG method. The next sections discuss these possibilities. The first approach, and perhaps the easiest, is to transform Ax = b into a symmetric positive definite system by multiplying the original system by A . As A is assumed to be nonsingular, A A is symmetric positive definite and the CG algorithm can be applied to A Ax = A b. This method is known as the Conjugate Gradient Normal Equation (CGNE) method. A somewhat similar approach is to solve AA y = b by the CG method and then to compute x = A y. The difference between the two approaches is discussed in Golub and Ortega [55, pp. 397ff]. Besides the computation of the matrix-matrix and matrix-vector products, these methods have the disadvantage of increasing the condition number of the system solved since κ2(A A) = (κ2(A))2 . This in turn increases the number of iterations of the method, see Barrett et al. [8, p. 16] and Golub and Van Loan [56]. However, since the transformation and the coding are easy to implement, the
  • 33. 2.5 Nonstationary Iterative Methods 25 method might be appealing in certain circumstances. 2.5.4 Generalized Minimal Residual Paige and Saunders [86] proposed a variant of the CG method that minimizes the residual r = b − Ax in the 2-norm. It only requires the system to be symmetric and not positive definite. It can also be extended to unsymmetric systems if some more information is kept from step to step. This method is called GMRES (Generalized Minimal Residual) and was introduced by Saad and Schultz [90]. The difficulty is not to loose the orthogonality property of the direction vectors p(i) . To achieve this goal, all previously generated vectors have to be kept in order to build a set of orthogonal directions, using for instance a modified Gram-Schmidt orthogonalization process. However, this method requires the storage and computation of an increasing amount of information. Thus, in practice, the algorithm is very limited because of its prohibitive cost. To overcome these difficulties, the method may be restarted after a chosen number of iterations m; the information is erased and the current intermediate results are used as a new starting point. The choice of m is critically important for the restarted version of the method, usually referred to as GMRES(m). The pseudo-code for this method is given hereafter.
  • 34. 2.5 Nonstationary Iterative Methods 26 Algorithm 11 Preconditioned GMRES(m) Choose an initial guess x(0) and initialize an (m + 1) × m matrix ¯Hm to hij = 0 for k = 1, 2, . . . until convergence Solve for r(k−1) Mr(k−1) = b − Ax(k−1) β = r(k−1) 2 ; v(1) = r(k−1) /β ; q = m for j = 1, . . . , m Solve for w Mw = Av(j) for i = 1, . . . , j Orthonormal basis by modified Gram-Schmidt hij = w v(i) w = w − hijv(i) end hj+1,j = w 2 if hj+1,j is sufficiently small then q = j exit from loop on j end v(j+1) = w/hj+1,j end Vm = [v(1) . . . v(q) ] ym = argminy βe1 − ¯Hmy 2 Use the method given below to compute ym x(k) = x(k−1) + Vmym Update the approximate solution end Apply Givens rotations to triangularize ¯Hm to solve the least-squares prob- lem involving the upper Hessenberg matrix ¯Hm d = β e1 e1 is [1 0 . . . 0] for i = 1, . . . , q Compute the sine and cosine values of the rotation if hii = 0 then c = 1 ; s = 0 else if |hi+1,i| > |hii| then t = −hii/hi+1,i ; s = 1/ √ 1 + t2 ; c = s t else t = −hi+1,i/hii ; c = 1/ √ 1 + t2 ; s = c t end end t = c di ; di+1 = −s di ; di = t hij = c hij − s hi+1,j ; hi+1,j = 0 for j = i + 1, . . . , m Apply rotation to zero the subdiagonal of ¯Hm t1 = hij ; t2 = hi+1,j hij = c t1 − s t2 ; hi+1,j = s t1 + c t2 end end Solve the triangular system ¯Hmym = d by back substitution. Another issue with GMRES is the use of the modified Gram-Schmidt method which is fast but not very reliable, see Golub and Van Loan [56, p. 219]. For ill-conditioned systems, a Householder orthogonalization process is certainly a better alternative, even if it leads to an increase in the complexity of the algorithm.
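The following Python sketch (added for illustration; it is not the implementation referred to in this work) shows a restarted GMRES(m) in the spirit of Algorithm 11. For brevity it solves the small least-squares problem with a dense least-squares routine instead of the Givens rotations spelled out above, and it omits the preconditioning step; the tolerance and restart settings are arbitrary.

import numpy as np

def gmres_restarted(A, b, x0=None, m=20, tol=1e-8, max_restarts=50):
    """Unpreconditioned restarted GMRES(m) with modified Gram-Schmidt."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    bnorm = np.linalg.norm(b)
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta <= tol * max(bnorm, 1e-30):
            return x
        V = np.zeros((n, m + 1))            # orthonormal Krylov basis
        H = np.zeros((m + 1, m))            # upper Hessenberg matrix H_bar
        V[:, 0] = r / beta
        q = m
        for j in range(m):
            w = A @ V[:, j]
            for i in range(j + 1):          # modified Gram-Schmidt orthogonalization
                H[i, j] = w @ V[:, i]
                w = w - H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:         # lucky breakdown: solution lies in the subspace
                q = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        e1 = np.zeros(q + 1)
        e1[0] = beta
        # Small least-squares problem min_y || beta e1 - H_bar y ||_2 (dense solver here).
        y = np.linalg.lstsq(H[:q + 1, :q], e1, rcond=None)[0]
        x = x + V[:, :q] @ y                # update the approximate solution
    return x

A = np.array([[3.0, 1.0, 0.0], [0.0, 2.0, 0.5], [1.0, 0.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(gmres_restarted(A, b, m=3))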
Convergence of GMRES

The convergence properties of GMRES(m) are given in the original paper which introduces the method, see Saad and Schultz [90]. A necessary and sufficient condition for GMRES(m) to converge appears in the results of more recent research, see Strikwerda and Stodder [95]:

Theorem 3 A necessary and sufficient condition for GMRES(m) to converge is that the set of vectors

  V_m = { v | v'A^j v = 0 for 1 ≤ j ≤ m }

contains only the vector 0.

Specifically, it follows that for a symmetric or skew-symmetric matrix A, GMRES(2) converges.

Another important result stated in [95] is that, if GMRES(m) converges, it does so with a geometric rate of convergence:

Theorem 4 If r^{(k)} is the residual after k steps of GMRES(m), then

  ‖r^{(k)}‖_2^2 ≤ (1 − ρ_m)^k ‖r^{(0)}‖_2^2 ,

where

  ρ_m = min_{‖v‖=1}  Σ_{j=1}^m (v^{(1)'} A v^{(j)})^2 / Σ_{j=1}^m ‖A v^{(j)}‖_2^2

and the vectors v^{(j)} are the unit vectors generated by GMRES(m).

Similar conditions and rate of convergence estimates are also given for the preconditioned version of GMRES(m).

2.5.5 BiConjugate Gradient Method

The BiConjugate Gradient method (BiCG) takes a different approach, based upon generating two mutually orthogonal sequences of residual vectors {r̃^{(i)}} and {r^{(j)}} and A-orthogonal sequences of direction vectors {p̃^{(i)}} and {p^{(j)}}. The interpretation in terms of the minimization of the residuals r^{(i)} is lost. The updates for the residuals and for the direction vectors are similar to those of the CG method, but are performed not only using A but also A'. The scalars α_i and β_i ensure the bi-orthogonality conditions

  r̃^{(i)'} r^{(j)} = p̃^{(i)'} A p^{(j)} = 0   if i ≠ j .

The algorithm for the Preconditioned BiConjugate Gradient method is given hereafter.
  • 36. 2.5 Nonstationary Iterative Methods 28 Algorithm 12 Preconditioned BiConjugate Gradient Compute r(0) = b − Ax(0) for some initial guess x(0) Set ˜r(0) = r(0) for i = 1, 2, . . . until convergence Solve Mz(i−1) = r(i−1) Solve M ˜z(i−1) = ˜r(i−1) ρi−1 = z(i−1) ˜r(i−1) if ρi−1 = 0 then the method fails if i = 1 then p(i) = z(i−1) ˜p(i) = ˜z(i−1) else βi−1 = ρi−1/ρi−2 p(i) = z(i−1) + βi−1p(i−1) ˜p(i) = ˜z(i−1) + βi−1 ˜p(i−1) end q(i) = Ap(i) ˜q(i) = A ˜p(i) αi = ρi−1/(˜p(i) q(i) ) x(i) = x(i−1) + αip(i) r(i) = r(i−1) − αiq(i) ˜r(i) = ˜r(i−1) − αi ˜q(i) end The disadvantages of the method are the potential erratic behavior of the norm of the residuals ri and unstable behavior if ρi is very small, i.e. the vectors r(i) and ˜r(i) are nearly orthogonal. Another potential breakdown situation is when ˜p(i) q(i) is zero or close to zero. Convergence of BiCG The convergence of BiCG may be irregular, but when the norm of the residual is significantly reduced, the method is expected to be comparable to GMRES. The breakdown cases may be avoided by sophisticated strategies, see Barrett et al. [8] and references therein. Few other convergence results are known for this method. 2.5.6 BiConjugate Gradient Stabilized Method A version of the BiCG method which tries to smooth the convergence was intro- duced by van der Vorst [99]. This more sophisticated method is called BiCon- jugate Gradient Stabilized method (BiCGSTAB) and its algorithm is formalized in the following.
  • 37. 2.6 Newton Methods 29 Algorithm 13 BiConjugate Gradient Stabilized Compute r(0) = b − Ax(0) for some initial guess x(0) Set ˜r = r(0) for i = 1, 2, . . . until convergence ρi−1 = ˜r r(i−1) if ρi−1 = 0 then the method fails if i = 1 then p(i) = r(i−1) else βi−1 = (ρi−1/ρi−2)(αi−1/wi−1) p(i) = r(i−1) + βi−1(p(i−1) − wi−1v(i−1) ) end Solve M ˆp = p(i) v(i) = Aˆp αi = ρi−1/˜r v(i) s = r(i−1) − αiv(i) if s is small enough then x(i) = x(i−1) + αi ˆp stop end Solve Mˆs = s t = Aˆs wi = (t s)/(t t) x(i) = x(i−1) + αi ˆp + wi ˆs r(i) = s − wit For continuation it is necessary that wi = 0 end The method is more costly in terms of required operations than BiCG, but does not involve the transpose of matrix A in the computations; this can sometimes be an advantage. The other main advantages of BiCGSTAB are to avoid the irregular convergence pattern of BiCG and usually to show a better convergence speed. 2.5.7 Implementation of Nonstationary Iterative Methods The codes for conjugate gradient type methods are easy to implement, but the interested user should first check the NETLIB repository. It contains the pack- age SLAP 2.0 that solves sparse and large linear systems using preconditioned iterative methods. We used the MATLAB programs distributed with the Templates book by Bar- rett et al. [8] as a basis and modified this code for our experiments. 2.6 Newton Methods In this section and the following ones, we present classical methods for the solution of systems of nonlinear equations. The following notation will be used for nonlinear systems. Let F : Rn → Rn represent a multivariable function.
The solution of the nonlinear system then amounts to finding the vector x* ∈ R^n such that

  F(x*) = 0  ⟺  { f_1(x*) = 0 , f_2(x*) = 0 , . . . , f_n(x*) = 0 } .   (2.7)

We assume F to be continuously differentiable in an open convex set U ⊂ R^n.

In the next section, we discuss the classical Newton method, recalling the main results about convergence. We then turn to modifications of the classical method where the exact Jacobian matrix is replaced by some approximation.

The classical Newton method proceeds by approximating x* iteratively with a sequence {x^{(k)}}_{k=0,1,2,...}. Given a point x^{(k)} ∈ R^n and an evaluation of F(x^{(k)}) and of the Jacobian matrix DF(x^{(k)}), we can construct a better approximation of x*, denoted x^{(k+1)}. We may approximate F(x) in the neighborhood of x^{(k)} by an affine function and get

  F(x) ≈ F(x^{(k)}) + DF(x^{(k)})(x − x^{(k)}) .   (2.8)

We can solve this local model to obtain the value of x that satisfies F(x^{(k)}) + DF(x^{(k)})(x − x^{(k)}) = 0, i.e. the point

  x = x^{(k)} − (DF(x^{(k)}))^{-1} F(x^{(k)}) .   (2.9)

The value for x computed in (2.9) is again used to define a local model such as (2.8). This leads to building the iterates

  x^{(k+1)} = x^{(k)} − (DF(x^{(k)}))^{-1} F(x^{(k)}) .   (2.10)

The algorithm that implements these iterations is called the classical Newton method and is formalized as follows:

Algorithm 14 Classical Newton Method
  Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
  for k = 0, 1, 2, . . . until convergence
    Compute DF(x^{(k)})
    Check that DF(x^{(k)}) is sufficiently well conditioned
    Solve for s^{(k)}:  DF(x^{(k)}) s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
  end

Geometrically, x^{(k+1)} may be interpreted as the intersection of the n hyperplanes tangent to the functions f_1, f_2, . . . , f_n with the hyperplane {(x, y) | y = 0}. This local solution of the linearized problem is then used as a new starting point for the next guess.

The main advantage of the Newton method is its quadratic convergence behavior, if appropriate conditions stated later are satisfied. However, this technique
may not converge to a solution for starting points outside some neighborhood of the solution.

The classical Newton algorithm is also computationally intensive, since it requires at every iteration the evaluation of the Jacobian matrix DF(x^{(k)}) and the solution of a linear system.

2.6.1 Computational Complexity

The computationally expensive steps in the Newton algorithm are the evaluation of the Jacobian matrix and the solution of the linear system. Hence, for a dense Jacobian matrix, the complexity of the latter task determines an order of O(n^3) arithmetic operations. If the system of equations is sparse, as is the case in large macroeconometric models, we obtain an O(c^2 n) complexity (see Section 2.3) for the linear system.

An analytical evaluation of the Jacobian matrix will automatically exploit the sparsity of the problem. Particular attention must be paid in the case of a numerical evaluation of DF, as this could introduce an O(n^2) operation count.

An iterative technique may also be utilized to approximate the solution of the linear system arising. Such techniques save computational effort, but the number of iterations needed to satisfy a given convergence criterion is not known in advance. Possibly, the number of iterations of the iterative technique may be fixed beforehand.

2.6.2 Convergence

To discuss the convergence of the Newton method, we need the following definition and theorem.

Definition 1 A function G : R^n → R^{n×m} is said to be Lipschitz continuous on an open set U ⊂ R^n if for all x, y ∈ U there exists a constant γ such that

  ‖G(y) − G(x)‖_a ≤ γ ‖y − x‖_b ,

where ‖·‖_a is a norm on R^{n×m} and ‖·‖_b a norm on R^n.

The value of the constant γ depends on the norms chosen and on the scale of DF.

Theorem 5 Let F : R^n → R^m be continuously differentiable in the open convex set U ⊂ R^n, x ∈ U, and let DF be Lipschitz continuous at x in the neighborhood U. Then, for any x + p ∈ U,

  ‖F(x + p) − F(x) − DF(x)p‖ ≤ (γ/2) ‖p‖^2 ,

where γ is the Lipschitz constant.

This theorem gives a bound on how close the affine model F(x) + DF(x)p is to F(x + p). The bound contains the Lipschitz constant γ, which measures the degree of nonlinearity of F. A proof of Theorem 5 can be found in Dennis and Schnabel [26, p. 75] for instance.
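Before stating the convergence theorem, a minimal Python transcription of Algorithm 14 may help fix ideas. It is an added sketch, not the code used in this work: the conditioning check is reduced to a crude estimate, the stopping rule on the step length is an arbitrary choice, and the two-equation test system is invented for the example.

import numpy as np

def newton(F, DF, x0, tol=1e-10, maxit=50):
    """Classical Newton method (Algorithm 14)."""
    x = x0.astype(float).copy()
    for _ in range(maxit):
        J = DF(x)
        if np.linalg.cond(J) > 1e12:     # check that DF(x) is sufficiently well conditioned
            raise RuntimeError("ill-conditioned Jacobian")
        s = np.linalg.solve(J, -F(x))    # solve DF(x) s = -F(x)
        x = x + s
        if np.linalg.norm(s) <= tol * (1.0 + np.linalg.norm(x)):
            break
    return x

# Invented test system: x0^2 + x1^2 = 1 and x1 = x0^3.
F  = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[1] - x[0]**3])
DF = lambda x: np.array([[2*x[0], 2*x[1]], [-3*x[0]**2, 1.0]])
print(newton(F, DF, np.array([1.0, 1.0])))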
The conditions for the convergence of the classical Newton method are then stated in the following theorem.

Theorem 6 If F is continuously differentiable in an open convex set U ⊂ R^n containing x* with F(x*) = 0, if DF is Lipschitz continuous in a neighborhood of x* and if DF(x*) is nonsingular with ‖DF(x*)^{-1}‖ ≤ β for some β > 0, then the iterates of the classical Newton method satisfy

  ‖x^{(k+1)} − x*‖ ≤ βγ ‖x^{(k)} − x*‖^2 ,   k = 0, 1, 2, . . . ,

for a starting guess x^{(0)} in a neighborhood of x*.

Two remarks are triggered by this theorem. First, the method converges fast, as the error of step k + 1 is guaranteed to be less than some proportion of the square of the error of step k, provided all the assumptions are satisfied. For this reason, the method is said to be quadratically convergent. We refer to Dennis and Schnabel [26, p. 90] for the proof of the theorem. The original works and further references are cited in Ortega and Rheinboldt [85, p. 316]. The constant βγ gives a bound for the relative nonlinearity of F and is a scale-free measure, since β is an upper bound for the norm of (DF(x*))^{-1}. Theorem 6 therefore tells us that the smaller this measure of relative nonlinearity, the faster the Newton method converges.

The second remark concerns the conditions needed to obtain quadratic convergence. Even if the Lipschitz continuity of DF is verified, the choice of a starting point x^{(0)} lying in a convergent neighborhood of the solution x* may be an a priori difficult problem. For macroeconometric models the starting values can naturally be chosen as the last period's solution, which in many cases is a point not too far from the current solution. Macroeconometric models do not generally show a high level of nonlinearity and, therefore, the Newton method is generally suitable to solve them.

2.7 Finite Difference Newton Method

An alternative to an analytical Jacobian matrix is to replace the exact derivatives by finite difference approximations. Even though nowadays software for symbolic derivation is readily available, there are situations where one might prefer to, or have to, resort to an approach which is easy to implement and only requires function evaluations. A circumstance where finite differences are certainly attractive occurs if the Newton algorithm is implemented on a SIMD computer. Such an example is discussed in Section 4.2.1.

We may approximate the partial derivatives in DF(x) by the forward difference formula

  (DF(x))_{·j} ≈ ( F(x + h_j e_j) − F(x) ) / h_j = J_{·j} ,   j = 1, . . . , n .   (2.11)
The discretization error introduced by this approximation satisfies the bound

  ‖J_{·j} − (DF(x))_{·j}‖_2 ≤ (γ/2) max_j |h_j| ,

where F is a function satisfying Theorem 5. This suggests taking h_j as small as possible to minimize the discretization error.

A central difference approximation for DF(x) can also be used,

  J̄_{·j} = ( F(x + h_j e_j) − F(x − h_j e_j) ) / (2 h_j) .   (2.12)

The bound on the discretization error is then lowered to max_j (γ/6) h_j^2, at the cost of twice as many function evaluations.

Finally, the choice of h_j also has to be discussed in the framework of the numerical accuracy one can obtain on a digital computer. Approximation theory suggests taking h_j as small as possible to reduce the discretization error in the approximation of DF. However, since the numerator of (2.11) involves function values that are close to each other, a cancellation error might occur, so that the elements J_{ij} may have very few or even no significant digits. According to the theory, e.g. Dennis and Schnabel [26, p. 97], one may choose h_j so that F(x + h_j e_j) differs from F(x) in at least the leftmost half of its significant digits. Assuming that the relative error in computing F(x) is u, defined as in Section A.1, we would like to have

  |f_i(x + h_j e_j) − f_i(x)| / |f_i(x)| ≤ √u   for all i, j .

The best guess is then h_j = √u x_j in order to cope with the different sizes of the elements of x, the discretization error and the cancellation error. In the case of central difference approximations, the choice for h_j is modified to h_j = u^{2/3} x_j.

The finite difference Newton algorithm can then be expressed as follows.

Algorithm 15 Finite Difference Newton Method
  Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
  for k = 0, 1, 2, . . . until convergence
    Evaluate J^{(k)} according to (2.11) or (2.12)
    Solve J^{(k)} s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
  end

2.7.1 Convergence of the Finite Difference Newton Method

When replacing the analytically evaluated Jacobian matrix by a finite difference approximation, it can be shown that the convergence of the Newton iterative process remains quadratic if the finite difference step size is chosen to satisfy the conditions specified in the following. (Proofs can be found, for instance, in Dennis and Schnabel [26, p. 95] or Ortega and Rheinboldt [85, p. 360].)
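Before turning to these conditions, the sketch below (added for illustration, in Python) shows the forward difference approximation (2.11) with the step choice h_j = √u x_j discussed above, plugged into Algorithm 15. The fallback step for a zero component x_j and the stopping rule are assumptions of the example, not prescriptions of the text.

import numpy as np

def fd_jacobian(F, x, u=np.finfo(float).eps):
    """Forward difference approximation of DF(x), Equation (2.11), with h_j = sqrt(u)*x_j."""
    n = x.shape[0]
    Fx = F(x)
    J = np.zeros((n, n))
    for j in range(n):
        h = np.sqrt(u) * x[j] if x[j] != 0.0 else np.sqrt(u)   # fallback for x_j = 0 (assumption)
        e = np.zeros(n)
        e[j] = 1.0
        J[:, j] = (F(x + h * e) - Fx) / h
    return J

def fd_newton(F, x0, tol=1e-8, maxit=50):
    """Finite difference Newton method (Algorithm 15)."""
    x = x0.astype(float).copy()
    for _ in range(maxit):
        s = np.linalg.solve(fd_jacobian(F, x), -F(x))
        x = x + s
        if np.linalg.norm(s) <= tol * (1.0 + np.linalg.norm(x)):
            break
    return x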
  • 42. 2.8 Simplified Newton Method 34 If the finite difference step size h(k) is invariant with respect to the iterations k, then the discretized Newton method shows only a linear rate of convergence. (We drop the subscript j for convenience.) If a decreasing sequence h(k) is imposed, i.e. limk→∞ h(k) = 0, the method achieves a superlinear rate of convergence. Furthermore, if one of the following conditions is verified, there exist constants c1 and k1 such that |h(k) | ≤ c1 x(k) − x∗ ∀k ≥ k1 , there exist constants c2 and k2 such that |h(k) | ≤ c2 F(x(k) ) ∀k ≥ k2 , (2.13) then the convergence is quadratic, as it is the case in the classical Newton method. The limit condition on the sequence h(k) may be interpreted as an improve- ment in the accuracy of the approximations of DF as we approach x∗ . Con- ditions (2.13) ensure a tight approximation of DF and therefore lead to the quadratic convergence of the method. In practice, however, none of the con- ditions (2.13) can be tested as neither x∗ nor c1 and c2 are known. They are nevertheless important from a theoretical point of view since they show that for good enough approximations of the Jacobian matrix, the finite difference Newton method will behave as well as the classical Newton method. 2.8 Simplified Newton Method To avoid the repeated evaluation of the Jacobian matrix DF(x(k) ) at each step k, one may reuse the first evaluation DF(x(0) ) for all subsequent steps k = 1, 2, . . . . This method is called simplified Newton method and it is attractive when the level of nonlinearity of F is not too high, since then the Jacobian matrix does not vary too much. Another advantage of this simplification is that the linear system to be solved at each step is the same for different right-hand sides, leading to significant savings in the computational work. As discussed before, the computationally expensive steps in the Newton method are the evaluation of the Jacobian matrix and the solution of the corresponding linear system. If a direct method is applied in the simplified method, these two steps are carried out only once and, for subsequent iterations, only the forward and back substitution phases are needed. In the one dimensional case, this technique corresponds to a parallel-chord method. The first chord is taken to be the tangent to the point at coordi- nates (x(0) , F(x(0) )) and for the next iterations this chord is simply shifted in a parallel way. To improve the convergence of this method, DF may occasionally be reevaluated by choosing an integer increasing function p(k) with values in the interval [0, k] and the linear system DF(x(p(k)) )s(k) = −F(x(k) ) solved. In the extreme case where p(k) = 0 , ∀k we have the simplified Newton method and, at the other end, when p(k) = k , ∀k we have the classical Newton
method. The choice of the function p(k), i.e. the reevaluation scheme, has to be determined experimentally.

Algorithm 16 Simplified Newton Method
  Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
  for k = 0, 1, 2, . . . until convergence
    Compute DF(x^{(p(k))}) if needed
    Solve for s^{(k)}:  DF(x^{(p(k))}) s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
  end

2.8.1 Convergence of the Simplified Newton Method

The kind of simplification presented leads to a degradation of the speed of convergence, as the Jacobian matrix is not updated at each step. However, one may note that for some macroeconometric models the nonlinearities are often such that this type of technique may prove advantageous compared to the classical Newton iterations, because of the computational savings that can be made.

In the classical Newton method, the direction s^{(k)} = −(DF(x^{(k)}))^{-1} F(x^{(k)}) is a guaranteed descent direction for the function f(x) = (1/2) F(x)'F(x) = (1/2) ‖F(x)‖_2^2, since Df(x) = DF(x)'F(x) and

  Df(x^{(k)})' s^{(k)} = −F(x^{(k)})' DF(x^{(k)}) (DF(x^{(k)}))^{-1} F(x^{(k)})   (2.14)
                       = −F(x^{(k)})' F(x^{(k)}) < 0   for all F(x^{(k)}) ≠ 0 .   (2.15)

In the simplified Newton method the direction of update is

  s^{(k)} = −(DF(x^{(0)}))^{-1} F(x^{(k)}) ,

which is a descent direction for the function f(x) as long as the matrix

  (DF(x^{(0)}))^{-1} DF(x^{(k)})

is positive definite. If s^{(k)} is not a descent direction, then the Jacobian matrix has to be reevaluated at x^{(k)} and the method restarted from this point.

2.9 Quasi-Newton Methods

The methods discussed previously did not use the exact evaluation of the Jacobian matrix but resorted to approximations. We limit our presentation in this section to Broyden's method, which belongs to the class of so-called Quasi-Newton methods.

Quasi-Newton methods start either with an analytical or with a finite difference evaluation of the Jacobian matrix at the starting point x^{(0)}, and therefore
compute x^{(1)} exactly as the classical Newton method does. For the successive steps, DF(x^{(0)})—or an approximation J^{(0)} of it—is updated using (x^{(0)}, F(x^{(0)})) and (x^{(1)}, F(x^{(1)})). The matrix DF(x^{(1)}) can then be approximated at little additional cost by a secant method. The secant approximation A^{(1)} satisfies the equation

  A^{(1)} (x^{(1)} − x^{(0)}) = F(x^{(1)}) − F(x^{(0)}) .   (2.16)

Matrix A^{(1)} is obviously not uniquely defined by relation (2.16). Broyden [20] introduced a criterion which leads to choosing—at the generic step k—a matrix A^{(k+1)} defined as

  A^{(k+1)} = A^{(k)} + ( (y^{(k)} − A^{(k)} s^{(k)}) s^{(k)'} ) / ( s^{(k)'} s^{(k)} ) ,   (2.17)

where y^{(k)} = F(x^{(k+1)}) − F(x^{(k)}) and s^{(k)} = x^{(k+1)} − x^{(k)}. Broyden's method thus updates matrix A^{(k)} by a rank one matrix computed only from the information of the current step and the preceding step.

Algorithm 17 Quasi-Newton Method using Broyden's Update
  Given F : R^n → R^n continuously differentiable and a starting point x^{(0)} ∈ R^n
  Evaluate A^{(0)} by DF(x^{(0)}) or J^{(0)}
  for k = 0, 1, 2, . . . until convergence
    Solve for s^{(k)}:  A^{(k)} s^{(k)} = −F(x^{(k)})
    x^{(k+1)} = x^{(k)} + s^{(k)}
    y^{(k)} = F(x^{(k+1)}) − F(x^{(k)})
    A^{(k+1)} = A^{(k)} + ((y^{(k)} − A^{(k)} s^{(k)}) s^{(k)'}) / (s^{(k)'} s^{(k)})
  end

Broyden's method may generate sequences of matrices {A^{(k)}}_{k=0,1,...} which do not converge to the Jacobian matrix DF(x*), even though the method produces a sequence {x^{(k)}}_{k=0,1,...} converging to x*. Dennis and Moré [25] have shown that the convergence behavior of the method is superlinear under the same conditions as for Newton-type techniques. The underlying reason enabling this favorable behavior is that ‖A^{(k)} − DF(x^{(k)})‖ stays sufficiently small.

From a computational standpoint, Broyden's method is particularly attractive since the solution of the successive linear systems A^{(k)} s^{(k)} = −F(x^{(k)}) can be determined by updating the initial factorization of A^{(0)} for k = 1, 2, . . . . Such an update requires O(n^2) operations, therefore reducing the original O(n^3) cost of a complete refactorization. In practice, the QR factorization update is easier to implement than the LU update, see Gill et al. [47, pp. 125–150]. For sparse systems, however, the advantage of the updating process vanishes.

A software reference for Broyden's method is MINPACK by Moré, Garbow and Hillstrom, available on NETLIB.
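The sketch below (added; the reference implementation cited above is MINPACK) follows Algorithm 17 literally: it refactorizes nothing and simply solves with the current dense A^{(k)}, so it illustrates the update formula (2.17) rather than the O(n^2) factorization update discussed above. The initial matrix A0 is supplied by the caller (analytical or finite difference Jacobian), and the test system reuses the invented example from the Newton sketch.

import numpy as np

def broyden(F, x0, A0, tol=1e-10, maxit=100):
    """Quasi-Newton iteration with Broyden's rank-one update (Algorithm 17)."""
    x = x0.astype(float).copy()
    A = A0.astype(float).copy()
    Fx = F(x)
    for _ in range(maxit):
        s = np.linalg.solve(A, -Fx)                 # solve A^(k) s^(k) = -F(x^(k))
        x_new = x + s
        F_new = F(x_new)
        y = F_new - Fx
        A = A + np.outer(y - A @ s, s) / (s @ s)    # Broyden update (2.17)
        x, Fx = x_new, F_new
        if np.linalg.norm(Fx) <= tol:
            break
    return x

F  = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[1] - x[0]**3])
DF = lambda x: np.array([[2*x[0], 2*x[1]], [-3*x[0]**2, 1.0]])
x0 = np.array([1.0, 1.0])
print(broyden(F, x0, A0=DF(x0)))   # A0 evaluated analytically at x0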
  • 45. 2.10 Nonlinear First-order Methods 37 2.10 Nonlinear First-order Methods The iterative techniques for the solution of linear systems described in sec- tions 2.4.1 to 2.4.4 can be extended to nonlinear equations. If we interpret the stationary iterations in algorithms 1 to 4 in terms of obtaining x (k+1) i as the solution of the j-th equation with the other (n − 1) variables held fixed, we may immediately apply the same idea to the nonlinear case. The first issue is then the existence of a one-to-one mapping between the set of equations {fi , i = 1, . . . , n} and the set of variables {xi , i = 1, . . . , n}. This mapping is also called a matching and it can be shown that its existence is a necessary condition for the solution to exist, see Gilli [50] or Gilli and Garbely [51]. A matching m must be provided in order to define the variable m(i) that has to be solved from equation i. For the method to make sense, the solution of the i-th equation with respect to xm(i) must exist and be unique. This solution can then be computed using a one dimensional solution algorithm—e.g. a one dimensional Newton method. We can then formulate the nonlinear Jacobi algorithm. Algorithm 18 Nonlinear Jacobi Method Given a matching m and a starting point x(0) ∈ Rn Set up the equations so that m(i) = i for i = 1, . . . , n for k = 0, 1, 2, . . . until convergence for i = 1, . . . , n Solve for xi fi(x (k) 1 , . . . , xi, . . . , x (k) n ) = 0 and set x (k+1) i = xi end end The nonlinear Gauss-Seidel is obtained by modifying the “solve” statement. Algorithm 19 Nonlinear Gauss-Seidel Method Given a matching m and a starting point x(0) ∈ Rn Set up the equations so that m(i) = i for i = 1, . . . , n for k = 0, 1, 2, . . . until convergence for i = 1, . . . , n Solve for xi fi(x (k+1) 1 , . . . , x (k+1) i−1 , xi, x (k) i+1, . . . , x (k) n ) = 0 and set x (k+1) i = xi end end In order to keep notation simple, we will assume from now on that the equations and variables have been set up so that we have m(i) = i for i = 1, . . . , n. The nonlinear SOR and FGS6 algorithms are obtained by a straightforward 6As already mentioned in the linear case, the FGS method should be considered as a second
order method.

modification of the corresponding linear versions.

If it is possible to isolate x_i from f_i(x_1, . . . , x_n) for all i, then we have a normalized system of equations. This is often the case in systems of equations arising in macroeconometric modeling. In such a situation, each variable is isolated as follows:

  x_i = g_i(x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_n) ,   i = 1, 2, . . . , n .   (2.18)

The "solve" statement in Algorithm 18 and Algorithm 19 is then dropped, since the solution is given in explicit form.

2.10.1 Convergence

The matrix form of these nonlinear iterations can be found by linearizing the equations around x^{(k)}, which yields A^{(k)} x^{(k)} = b^{(k)}, where A^{(k)} = DF(x^{(k)}) and b^{(k)} denotes the constant part of the linearization of F. As the path of the iterates {x^{(k)}}_{k=0,1,2,...} leads to different matrices A^{(k)} and vectors b^{(k)}, the nonlinear versions of the iterative methods can no longer be considered stationary methods.

Each system A^{(k)} x^{(k)} = b^{(k)} will have a different convergence behavior, not only according to the splitting of A^{(k)} and the updating technique chosen, but also because the values of the elements in matrix A^{(k)} and vector b^{(k)} change from one iteration to another. It follows that convergence criteria can only be stated for starting points x^{(0)} within a neighborhood of the solution x*. Similarly to what has been presented in Section 2.4.6, we can evaluate the matrix B that governs the convergence at x* and state that, if ρ(B) < 1, then the method is likely to converge. The difficulty is that now the eigenvalues, and hence the spectral radius, vary with the solution path. The same is also true for the optimal values of the parameters ω and γ of the SOR and FGS methods.

In such a framework, Hughes Hallett [69] suggests several ways of computing approximate optimal values for γ during the iterations. The simplest form is to take

  γ_k = ( 1 ± |x_i^{(k)} / x_i^{(k−2)}| )^{-1} ,

the sign being positive if the iterations are cycling and negative if the iterations are monotonic; x_i is the element which violates the convergence criterion the most. One should also constrain γ_k to lie in the interval [0, 2], which is a necessary condition for the FGS to converge. To avoid large fluctuations of γ_k, one may smooth the sequence by the formula

  γ̃_k = α_k γ_k + (1 − α_k) γ_{k−1} ,

where α_k is chosen in the interval [0, 1]. We may note that such strategies can also be applied in the linear case to automatically set the value for γ.
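For a normalized system (2.18), Algorithm 19 reduces to sweeping through the equations and overwriting each component in turn. The Python sketch below (an added illustration, with a made-up two-equation model) implements that sweep together with an SOR-type relaxation parameter ω; the convergence test mirrors the relative-change criterion used throughout this chapter, and ω = 1 gives the plain nonlinear Gauss-Seidel method.

import numpy as np

def nonlinear_sor(g, x0, omega=1.0, tol=1e-8, maxit=500):
    """Nonlinear Gauss-Seidel / SOR sweep on a normalized system x_i = g_i(x)."""
    x = x0.astype(float).copy()
    for _ in range(maxit):
        x_old = x.copy()
        for i in range(x.shape[0]):
            # Gauss-Seidel: g_i already sees the components updated earlier in this sweep.
            x[i] = (1.0 - omega) * x[i] + omega * g[i](x)
        if np.max(np.abs(x - x_old) / np.maximum(np.abs(x_old), 1e-12)) < tol:
            break
    return x

# Invented normalized model, for illustration only: x0 = 0.5 cos(x1), x1 = 0.5 sin(x0) + 0.1.
g = [lambda x: 0.5 * np.cos(x[1]),
     lambda x: 0.5 * np.sin(x[0]) + 0.1]
print(nonlinear_sor(g, np.array([0.0, 0.0])))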
2.11 Solution by Minimization

In the preceding sections, methods for the solution of nonlinear systems of equations have been considered. An alternative way to compute a solution of F(x) = 0 is to minimize the objective function

  f(x) = ‖F(x)‖_a ,   (2.19)

where ‖·‖_a denotes a norm in R^n. A reason that motivates such an alternative is that it introduces a criterion to decide whether x^{(k+1)} is a better approximation to x* than x^{(k)}. As at the solution F(x*) = 0, we would like to compare the vectors F(x^{(k+1)}) and F(x^{(k)}), and to do so we compare their respective norms. What is required^7 is that

  ‖F(x^{(k+1)})‖_a < ‖F(x^{(k)})‖_a ,

which then leads us to the minimization of the objective function (2.19).

^7 The iterate x^{(k+1)} computed, for instance, by a classical Newton step does not necessarily satisfy this requirement.

A convenient choice is the standard euclidian norm, since it permits an analytical development of the problem. The minimization problem then reads

  min_x f(x) = (1/2) F(x)'F(x) ,   (2.20)

where the factor 1/2 is added for algebraic convenience. Thus, methods for the nonlinear least-squares problem, such as Gauss-Newton or Levenberg-Marquardt, can immediately be applied in this framework. Since the system is square and has a solution, we expect a zero residual function f at x*.

In general it is advisable to take advantage of the structure of F to directly approach the solution of F(x) = 0. However, in some circumstances, resorting to the minimization of f(x) constitutes an interesting alternative. This is the case when the nonlinear equations contain numerical inaccuracies preventing F(x) = 0 from having a solution. If the residual f(x) is small, then the minimization approach is certainly preferable.

To devise a minimization algorithm for f(x), we need the gradient Df(x) and the Hessian matrix D^2 f(x), that is

  Df(x) = DF(x)'F(x)   (2.21)
  D^2 f(x) = DF(x)'DF(x) + Q(x)   (2.22)

with

  Q(x) = Σ_{i=1}^n f_i(x) D^2 f_i(x) .

We recall that F(x) = [f_1(x) . . . f_n(x)]', that each f_i(x) is a function from R^n into R and that each D^2 f_i(x) is therefore the n × n Hessian matrix of f_i(x).

The Gauss-Newton method approaches the solution by computing a Newton step for the first order conditions of (2.20), Df(x) = 0. At step k the Newton
direction s^{(k)} is determined by

  D^2 f(x^{(k)}) s^{(k)} = −Df(x^{(k)}) .

Replacing (2.21) and (2.22) in the former expression, we get

  ( DF(x^{(k)})'DF(x^{(k)}) + Q(x^{(k)}) ) s^{(k)} = −DF(x^{(k)})'F(x^{(k)}) .   (2.23)

For x^{(k)} sufficiently close to the solution x*, the term Q(x^{(k)}) in the Hessian matrix is negligible and we may obtain the approximate step s_GN^{(k)}, called the Gauss-Newton step, from

  DF(x^{(k)})'DF(x^{(k)}) s_GN^{(k)} = −DF(x^{(k)})'F(x^{(k)}) .   (2.24)

Computing s_GN^{(k)} from (2.24) explicitly would require solving a symmetric positive definite linear system whose condition number is the square of that of the following alternative: the system (2.24) constitutes the set of normal equations, and its solution can be obtained by solving

  DF(x^{(k)}) s_GN^{(k)} = −F(x^{(k)})

via a QR factorization. We notice that this development leads to the same step as in the classical Newton method, see Algorithm 14. It is worth mentioning that the Gauss-Newton method does not yield the same iterates as the Newton method applied to the direct minimization of f(x).

The Levenberg-Marquardt method is closely related to Gauss-Newton and to a modification of Newton's method for nonlinear equations that is globally convergent, see Section 2.12. The Levenberg-Marquardt step s_LM^{(k)} is computed as the solution of

  ( DF(x^{(k)})'DF(x^{(k)}) + λ_k I ) s_LM^{(k)} = −DF(x^{(k)})'F(x^{(k)}) .

It can be shown that solving this equation for s_LM^{(k)} is equivalent to computing

  s_LM^{(k)} = argmin_s ‖F(x^{(k)}) + DF(x^{(k)}) s‖_2   subject to   ‖s‖_2 ≤ δ .

This method is therefore a trust-region technique, which is presented in Section 2.12. We immediately see that if λ_k is zero, then s_LM^{(k)} = s_GN^{(k)}; whereas when λ_k becomes very large, s_LM^{(k)} turns toward the steepest descent direction for minimizing f at x^{(k)}, −Df(x^{(k)}) = −DF(x^{(k)})'F(x^{(k)}).

We may also note that every solution of F(x) = 0 is a solution to problem (2.20). However, the converse is not true, since there may be local minimizers of f(x). Such a situation is illustrated in Figure 2.1 and can be explained by recalling that the gradient of f(x), given by Equation (2.21), may vanish either when F(x) = 0 or when DF(x) is singular.
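To illustrate the two steps, the sketch below (added, using the notation of this section) computes the Gauss-Newton step from a QR factorization of DF(x), which avoids squaring the condition number, and the Levenberg-Marquardt step from the regularized normal equations. Choosing and updating λ_k belongs to the globalization strategies of Section 2.12 and is not addressed here; the numerical values are arbitrary examples.

import numpy as np

def gauss_newton_step(J, Fx):
    """Solve DF(x) s = -F(x) in the least-squares sense via QR."""
    Q, R = np.linalg.qr(J)
    return np.linalg.solve(R, -(Q.T @ Fx))

def levenberg_marquardt_step(J, Fx, lam):
    """Solve (DF(x)'DF(x) + lambda I) s = -DF(x)'F(x)."""
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + lam * np.eye(n), -(J.T @ Fx))

# With lam -> 0 the LM step tends to the Gauss-Newton step; for large lam it
# turns toward the steepest descent direction -DF(x)'F(x), scaled by 1/lam.
J  = np.array([[2.0, 0.5], [0.0, 1.0]])
Fx = np.array([0.3, -0.2])
print(gauss_newton_step(J, Fx))
print(levenberg_marquardt_step(J, Fx, 1e-8))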