Jordan Canonical Form:
Application to
Differential Equations
Copyright © 2008 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations in
printed reviews, without the prior permission of the publisher.
Jordan Canonical Form: Application to Differential Equations
Steven H. Weintraub
www.morganclaypool.com
ISBN: 9781598298048 paperback
ISBN: 9781598298055 ebook
DOI 10.2200/S00146ED1V01Y200808MAS002
A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON MATHEMATICS AND STATISTICS
Lecture #2
Series Editor: Steven G. Krantz, Washington University, St. Louis
Series ISSN
Synthesis Lectures on Mathematics and Statistics
ISSN pending.
Jordan Canonical Form:
Application to
Differential Equations
Steven H. Weintraub
Lehigh University
SYNTHESIS LECTURES ON MATHEMATICS AND STATISTICS #2
Morgan & Claypool Publishers
ABSTRACT
Jordan Canonical Form (JCF) is one of the most important, and useful, concepts in linear algebra.
In this book we develop JCF and show how to apply it to solving systems of differential equations.
We first develop JCF, including the concepts involved in it–eigenvalues, eigenvectors, and chains
of generalized eigenvectors. We begin with the diagonalizable case and then proceed to the general
case, but we do not present a complete proof. Indeed, our interest here is not in JCF per se, but in
one of its important applications. We devote the bulk of our attention in this book to showing how
to apply JCF to solve systems of constant-coefficient first order differential equations, where it is
a very effective tool. We cover all situations: homogeneous and inhomogeneous systems; real and
complex eigenvalues. We also treat the closely related topic of the matrix exponential. Our discussion
is mostly confined to the 2-by-2 and 3-by-3 cases, and we present a wealth of examples that illustrate
all the possibilities in these cases (and of course, a wealth of exercises for the reader).
KEYWORDS
Jordan Canonical Form, linear algebra, differential equations, eigenvalues, eigenvectors,
generalized eigenvectors, matrix exponential
Contents
Contents  v
Preface  vii
1  Jordan Canonical Form  1
   1.1  The Diagonalizable Case  1
   1.2  The General Case  7
2  Solving Systems of Linear Differential Equations  25
   2.1  Homogeneous Systems with Constant Coefficients  25
   2.2  Homogeneous Systems with Constant Coefficients: Complex Roots  40
   2.3  Inhomogeneous Systems with Constant Coefficients  46
   2.4  The Matrix Exponential  53
A  Background Results  69
   A.1  Bases, Coordinates, and Matrices  69
   A.2  Properties of the Complex Exponential  75
B  Answers to Odd-Numbered Exercises  79
Index  85
Preface
Jordan Canonical Form (JCF) is one of the most important, and useful, concepts in linear
algebra. In this book, we develop JCF and show how to apply it to solving systems of differential
equations.
In Chapter 1, we develop JCF. We do not prove the existence of JCF in general, but
we present the ideas that go into it—eigenvalues and (chains of generalized) eigenvectors. In
Section 1.1, we treat the diagonalizable case, and in Section 1.2, we treat the general case. We
develop all possibilities for 2-by-2 and 3-by-3 matrices, and illustrate these by examples.
In Chapter 2, we apply JCF. We show how to use JCF to solve systems Y' = AY + G(x) of
constant-coefficient first-order linear differential equations. In Section 2.1, we consider homoge-
neous systems Y' = AY. In Section 2.2, we consider homogeneous systems when the characteristic
polynomial of A has complex roots (in which case an additional step is necessary). In Section 2.3,
we consider inhomogeneous systems Y' = AY + G(x) with G(x) nonzero. In Section 2.4, we
develop the matrix exponential eAx and relate it to solutions of these systems. Also in this chapter
we provide examples that illustrate all the possibilities in the 2-by-2 and 3-by-3 cases.
Appendix A has background material. Section A.1 gives background on coordinates for
vectors and matrices for linear transformations. Section A.2 derives the basic properties of the
complex exponential function. This material is relegated to the Appendix so that readers who
are unfamiliar with these notions, or who are willing to take them on faith, can skip it and still
understand the material in Chapters 1 and 2.
Our numbering system for results is fairly standard: Theorem 2.1, for example, is the first
Theorem found in Section 2 of Chapter 1.
As is customary in textbooks, we provide the answers to the odd-numbered exercises here.
Instructors may contact me at shw2@lehigh.edu and I will supply the answers to all of the exercises.
Steven H. Weintraub
Lehigh University
Bethlehem, PA USA
July 2008
CHAPTER 1
Jordan Canonical Form
1.1 THE DIAGONALIZABLE CASE
Although, for simplicity, most of our examples will be over the real numbers (and indeed over the
rational numbers), we will consider that all of our vectors and matrices are defined over the complex
numbers C. It is only with this assumption that the theory of Jordan Canonical Form (JCF) works
completely. See Remark 1.9 for the key reason why.
Definition 1.1. If v ≠ 0 is a vector such that, for some λ,
Av = λv ,
then v is an eigenvector of A associated to the eigenvalue λ.
Example 1.2. Let A be the matrix A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix}. Then, as you can check, if v_1 = \begin{pmatrix} 7 \\ 2 \end{pmatrix}, then Av_1 = 3v_1, so v_1 is an eigenvector of A with associated eigenvalue 3, and if v_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, then Av_2 = -2v_2, so v_2 is an eigenvector of A with associated eigenvalue -2.
We note that the definition of an eigenvalue/eigenvector can be expressed in an alternate form.
Here I denotes the identity matrix:
Av = λv
Av = λIv
(A − λI)v = 0 .
For an eigenvalue λ of A, we let Eλ denote the eigenspace of λ,
Eλ = {v | Av = λv} = {v | (A − λI)v = 0} = Ker(A − λI) .
(The kernel Ker(A − λI) is also known as the nullspace NS(A − λI).)
We also note that this alternate formulation helps us find eigenvalues and eigenvectors. For if
(A − λI)v = 0 for a nonzero vector v, the matrix A − λI must be singular, and hence its determinant
must be 0. This leads us to the following definition.
Definition 1.3. The characteristic polynomial of a matrix A is the polynomial det(λI − A).
Remark 1.4. This is the customary definition of the characteristic polynomial. But note that, if A is
an n-by-n matrix, then the matrix λI − A is obtained from the matrix A − λI by multiplying each
of its n rows by −1, and hence det(λI − A) = (−1)n det(A − λI). In practice, it is most convenient
to work with A − λI in finding eigenvectors—this minimizes arithmetic—and when we come to
find chains of generalized eigenvectors in Section 1.2, it is (almost) essential to use A − λI, as using
λI − A would introduce lots of spurious minus signs.
Example 1.5. Returning to the matrix A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix} of Example 1.2, we compute that det(λI − A) = λ^2 − λ − 6 = (λ − 3)(λ + 2), so A has eigenvalues 3 and −2. Computation then shows that the eigenspace E_3 = Ker(A − 3I) has basis \left\{ \begin{pmatrix} 7 \\ 2 \end{pmatrix} \right\}, and that the eigenspace E_{-2} = Ker(A − (−2)I) has basis \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\}.
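Readers who like to check such computations by machine may find a short numerical sketch helpful. The following assumes Python with NumPy (not part of the original text) and simply verifies the eigenvalue and eigenvector computations of Example 1.5.

```python
import numpy as np

A = np.array([[5.0, -7.0],
              [2.0, -4.0]])

# Coefficients of det(lambda*I - A) = lambda^2 - lambda - 6
print(np.poly(A))                      # [ 1. -1. -6.]

# Check that (7, 2) and (1, 1) are eigenvectors for the eigenvalues 3 and -2
v1, v2 = np.array([7.0, 2.0]), np.array([1.0, 1.0])
print(np.allclose(A @ v1, 3 * v1))     # True
print(np.allclose(A @ v2, -2 * v2))    # True
```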
We now introduce two important quantities associated to an eigenvalue of a matrix A.
Definition 1.6. Let a be an eigenvalue of a matrix A. The algebraic multiplicity of the eigenvalue
a is alg-mult(a) = the multiplicity of a as a root of the characteristic polynomial det(λI − A). The
geometric multiplicity of the eigenvalue a is geom-mult(a) = the dimension of the eigenspace Ea.
It is common practice to use the word multiplicity (without a qualifier) to mean algebraic
multiplicity.
We have the following relationship between these two multiplicities.
Lemma 1.7. Let a be an eigenvalue of a matrix A. Then
1 ≤ geom-mult(a) ≤ alg-mult(a) .
Proof. By the definition of an eigenvalue, there is at least one eigenvector v with eigenvalue a, and
so Ea contains the nonzero vector v, and hence dim(Ea) ≥ 1.
For the proof that geom-mult(a) ≤ alg-mult(a), see Lemma 1.12 in Appendix A. □
Corollary 1.8. Let a be an eigenvalue of A and suppose that a has algebraic multiplicity 1. Then a also
has geometric multiplicity 1.
Proof. In this case, applying Lemma 1.7, we have
1 ≤ geom-mult(a) ≤ alg-mult(a) = 1 ,
so geom-mult(a) = 1. □
Remark 1.9. Let A be an n-by-n matrix. Then its characteristic polynomial det(λI − A) has degree
n. Since we are considering A to be defined over the complex numbers, we may apply the Fundamental
Theorem of Algebra, which states that an nth degree polynomial has n roots, counting multiplicities.
Hence, we see that, for any n-by-n matrix A, the sum of the algebraic multiplicities of the eigenvalues
of A is equal to n.
Lemma 1.10. Let A be an n-by-n matrix. The following are equivalent:
(1) For each eigenvalue a of A, geom-mult(a) = alg-mult(a).
(2) The sum of the geometric multiplicities of the eigenvalues of A is equal to n.
Proof. Let A have eigenvalues a_1, a_2, \ldots, a_m. For each i between 1 and m, let s_i = geom-mult(a_i) and t_i = alg-mult(a_i). Then, by Lemma 1.7, s_i ≤ t_i for each i, and by Remark 1.9, \sum_{i=1}^m t_i = n. Thus, if s_i = t_i for each i, then \sum_{i=1}^m s_i = n, while if s_i < t_i for some i, then \sum_{i=1}^m s_i < n. □
Proposition 1.11. (1) Let a_1, a_2, \ldots, a_m be distinct eigenvalues of A (i.e., a_i ≠ a_j for i ≠ j). For each
i between 1 and m, let vi be an associated eigenvector.Then {v1, v2, . . . , vm} is a linearly independent set
of vectors.
(2) More generally, let a_1, a_2, \ldots, a_m be distinct eigenvalues of A. For each i between 1 and m, let S_i be a linearly independent set of eigenvectors associated to a_i. Then S = S_1 ∪ \ldots ∪ S_m is a linearly independent set of vectors.
Proof. (1) Suppose we have a linear combination 0 = c_1v_1 + c_2v_2 + \ldots + c_mv_m. We need to show that c_i = 0 for each i. To do this, we begin with an observation: If v is an eigenvector of A associated to the eigenvalue a, and b is any scalar, then (A − bI)v = Av − bv = av − bv = (a − b)v. (Note that this answer is 0 if a = b and nonzero if a ≠ b.)
We now go to work, multiplying our original relation by (A − a_mI). Of course, (A − a_mI)0 = 0, so:
0 = (A − amI)(c1v1 + c2v2 + . . . + cm−2vm−2 + cm−1vm−1 + cmvm)
= c1(A − amI)v1 + c2(A − amI)v2 + . . .
+ cm−2(A − amI)vm−2 + cm−1(A − amI)vm−1 + cm(A − amI)vm
= c1(a1 − am)v1 + c2(a2 − am)v2 + . . .
+ cm−2(am−2 − am)vm−2 + cm−1(am−1 − am)vm−1 .
We now multiply this relation by (A − am−1I). Again, (A − am−1I)0 = 0, so:
0 = (A − am−1I)(c1(a1 − am)v1 + c2(a2 − am)v2 + . . .
+ cm−2(am−2 − am)vm−2 + cm−1(am−1 − am)vm−1)
= c1(a1 − am)(A − am−1I)v1 + c2(a2 − am)(A − am−1I)v2 + . . .
+ cm−2(am−2 − am)(A − am−1I)vm−2 + cm−1(am−1 − am)(A − am−1I)vm−1
= c1(a1 − am)(a1 − am−1)v1 + c2(a2 − am)(a2 − am−1)v2 + . . .
+ cm−2(am−2 − am)(am−2 − am−1)vm−2 .
Proceed in this way, until at the last step we multiply by (A − a2I). We then obtain:
0 = c1(a1 − a2) · · · (a1 − am−1)(a1 − am)v1 .
But v_1 ≠ 0, as by definition an eigenvector is nonzero. Also, the product (a_1 − a_2) \cdots (a_1 − a_{m-1})(a_1 − a_m) is a product of nonzero numbers and is hence nonzero. Thus, we must have c_1 = 0.
Proceeding in the same way, multiplying our original relation by (A − a_mI), (A − a_{m-1}I), \ldots, (A − a_3I), and finally by (A − a_1I), we obtain c_2 = 0, and, proceeding in this vein, we obtain c_i = 0 for all i, and so the set \{v_1, v_2, \ldots, v_m\} is linearly independent.
(2) To avoid complicated notation, we will simply prove this when m = 2 (which illustrates the general case). Thus, let m = 2, let S_1 = \{v_{1,1}, \ldots, v_{1,i_1}\} be a linearly independent set of eigenvectors associated to the eigenvalue a_1 of A, and let S_2 = \{v_{2,1}, \ldots, v_{2,i_2}\} be a linearly independent set of eigenvectors associated to the eigenvalue a_2 of A. Then S = \{v_{1,1}, \ldots, v_{1,i_1}, v_{2,1}, \ldots, v_{2,i_2}\}. We want to show that S is a linearly independent set. Suppose we have a linear combination 0 = c_{1,1}v_{1,1} + \ldots + c_{1,i_1}v_{1,i_1} + c_{2,1}v_{2,1} + \ldots + c_{2,i_2}v_{2,i_2}. Then:
0 = (c_{1,1}v_{1,1} + \ldots + c_{1,i_1}v_{1,i_1}) + (c_{2,1}v_{2,1} + \ldots + c_{2,i_2}v_{2,i_2}) = v_1 + v_2
where v_1 = c_{1,1}v_{1,1} + \ldots + c_{1,i_1}v_{1,i_1} and v_2 = c_{2,1}v_{2,1} + \ldots + c_{2,i_2}v_{2,i_2}. But v_1 is a vector in E_{a_1}, so Av_1 = a_1v_1; similarly, v_2 is a vector in E_{a_2}, so Av_2 = a_2v_2. Then, as in the proof of part (1),
0 = (A − a_2I)0 = (A − a_2I)(v_1 + v_2) = (A − a_2I)v_1 + (A − a_2I)v_2 = (a_1 − a_2)v_1 + 0 = (a_1 − a_2)v_1
so 0 = v_1; similarly, 0 = v_2. But 0 = v_1 = c_{1,1}v_{1,1} + \ldots + c_{1,i_1}v_{1,i_1} implies c_{1,1} = \ldots = c_{1,i_1} = 0, as, by hypothesis, \{v_{1,1}, \ldots, v_{1,i_1}\} is a linearly independent set; similarly, 0 = v_2 implies c_{2,1} = \ldots = c_{2,i_2} = 0. Thus, c_{1,1} = \ldots = c_{1,i_1} = c_{2,1} = \ldots = c_{2,i_2} = 0 and S is linearly independent, as claimed. □
Definition 1.12. Two square matrices A and B are similar if there is an invertible matrix P with
A = PBP −1.
Definition 1.13. A square matrix A is diagonalizable if A is similar to a diagonal matrix.
Here is the main result of this section.
Theorem 1.14. Let A be an n-by-n matrix over the complex numbers. Then A is diagonalizable if
and only if, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = PJP−1 where
J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic
multiplicity, and P is a matrix whose columns are eigenvectors forming bases for the associated eigenspaces.
Proof. We give a proof by direct computation here. For a more conceptual proof, see Theorem 1.10
in Appendix A.
First let us suppose that for each eigenvalue a of A, geom-mult(a) = alg-mult(a).
Let A have eigenvalues a_1, a_2, \ldots, a_n. Here we do not insist that the a_i's are distinct; rather, each eigenvalue appears the same number of times as its algebraic multiplicity. Then J is the diagonal matrix
J = \begin{pmatrix} j_1 & j_2 & \cdots & j_n \end{pmatrix}
and we see that j_i, the ith column of J, is the vector
j_i = \begin{pmatrix} 0 & \cdots & 0 & a_i & 0 & \cdots & 0 \end{pmatrix}^T ,
with ai in the ith position, and 0 elsewhere.
We have
P = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} ,
a matrix whose columns are eigenvectors forming bases for the associated eigenspaces. By hypothesis, geom-mult(a) = alg-mult(a) for each eigenvalue a of A, so there are as many columns of P that are eigenvectors for the eigenvalue a as there are diagonal entries of J that are equal to a. Furthermore, by Lemma 1.10, the matrix P indeed has n columns.
We first show by direct computation that AP = PJ. Now
AP = A\begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}
so the ith column of AP is Avi. But
Avi = aivi
as vi is an eigenvector of A with associated eigenvalue ai.
On the other hand,
PJ = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} J
and the ith column of PJ is Pj_i,
Pj_i = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} j_i .
Remembering what the vector j_i is, and multiplying, we see that
Pj_i = a_iv_i
as well.
Thus, every column of AP is equal to the corresponding column of PJ, so
AP = PJ .
By Proposition 1.11, the columns of the square matrix P are linearly independent, so P is
invertible. Multiplying on the right by P^{-1}, we see that
A = PJP^{-1} ,
completing the proof of this half of the Theorem.
Now let us suppose that A is diagonalizable, A = PJP^{-1}. Then AP = PJ. We use the same
notation for P and J as in the first half of the proof. Then, as in the first half of the proof, we
compute AP and PJ column-by-column, and we see that the ith column of AP is Avi and that
the ith column of PJ is aivi, for each i. Hence, Avi = aivi for each i, and so vi is an eigenvector
of A with associated eigenvalue ai.
For each eigenvalue a of A, there are as many columns of P that are eigenvectors for a as
there are diagonal entries of J that are equal to a, and these vectors form a basis for the eigenspace
associated to the eigenvalue a, so we see that for each eigenvalue a of A, geom-mult(a) = alg-mult(a),
completing the proof. □
For a general matrix A, the condition in Theorem 1.14 may or may not be satisfied, i.e.,
some but not all matrices are diagonalizable. But there is one important case when this condition is
automatic.
Corollary 1.15. Let A be an n-by-n matrix over the complex numbers all of whose eigenvalues are
distinct (i.e., whose characteristic polynomial has no repeated roots). Then A is diagonalizable.
Proof. By hypothesis, for each eigenvalue a of A, alg-mult(a) = 1. But then, by Corollary 1.8, for each eigenvalue a of A, geom-mult(a) = alg-mult(a), so the hypothesis of Theorem 1.14 is satisfied. □
Example 1.16. Let A be the matrix A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix} of Examples 1.2 and 1.5. Then, referring to Example 1.5, we see
\begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix} = \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix}^{-1} .
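As a quick check (a sketch assuming Python with NumPy, not part of the original text), one can verify this factorization numerically:

```python
import numpy as np

A = np.array([[5.0, -7.0], [2.0, -4.0]])
P = np.array([[7.0, 1.0], [2.0, 1.0]])
J = np.diag([3.0, -2.0])               # the diagonal matrix of eigenvalues

# A should equal P J P^{-1}
print(np.allclose(P @ J @ np.linalg.inv(P), A))   # True
```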
As we have indicated, we have developed this theory over the complex numbers, as JCF works
best over them. But there is an analog of our results over the real numbers—we just have to require
that all the eigenvalues of A are real. Here is the basic result on diagonalizability.
Theorem 1.17. Let A be an n-by-n real matrix. Then A is diagonalizable if and only if all the
eigenvalues of A are real numbers, and, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that
case, A = PJP−1 where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing
according to its algebraic multiplicity (and hence is a real matrix), and P is a real matrix whose columns
are eigenvectors forming bases for the associated eigenspaces.
1.2 THE GENERAL CASE
Let us begin this section by describing what a matrix in JCF looks like. A matrix in JCF is composed
of “Jordan blocks,” so we first see what a single Jordan block looks like.
Definition 2.1. A k-by-k Jordan block associated to the eigenvalue λ is a k-by-k matrix of the form
J = \begin{pmatrix} λ & 1 & & & \\ & λ & 1 & & \\ & & \ddots & \ddots & \\ & & & λ & 1 \\ & & & & λ \end{pmatrix} .
In other words, a Jordan block is a matrix with all the diagonal entries equal to each other, all
the entries immediately above the diagonal equal to 1, and all the other entries equal to 0.
Definition 2.2. A matrix J in Jordan Canonical Form (JCF) is a block diagonal matrix
J = \begin{pmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_\ell \end{pmatrix}
with each J_i a Jordan block.
Remark 2.3. Note that every diagonal matrix is a matrix in JCF, with each Jordan block a 1-by-1
block.
In order to understand and be able to use JCF, we must introduce a new concept, that of a
generalized eigenvector.
Definition 2.4. If v ≠ 0 is a vector such that, for some λ,
(A − λI)^k (v) = 0
for some positive integer k, then v is a generalized eigenvector of A associated to the eigenvalue λ.
The smallest k with (A − λI)^k(v) = 0 is the index of the generalized eigenvector v.
Let us note that if v is a generalized eigenvector of index 1, then
(A − λI)(v) = 0
(A)v = (λI)v
Av = λv
and so v is an (ordinary) eigenvector.
Recall that, for an eigenvalue λ of A, Eλ denotes the eigenspace of λ,
Eλ = {v | Av = λv} = {v | (A − λI)v = 0} .
We let \tilde{E}_λ denote the generalized eigenspace of λ,
\tilde{E}_λ = \{v \mid (A − λI)^k(v) = 0 \text{ for some } k\} .
It is easy to check that ˜Eλ is a subspace.
Since every eigenvector is a generalized eigenvector, we see that
Eλ ⊆ ˜Eλ .
The following result (which we shall not prove) is an important fact about generalized
eigenspaces.
Proposition 2.5. Let λ be an eigenvalue of the n-by-n matrix A of algebraic multiplicity m. Then, ˜Eλ
is a subspace of Cn of dimension m.
Example 2.6. Let A be the matrix A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix}. Then, as you can check, if u = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, then (A − 2I)u = 0, so u is an eigenvector of A with associated eigenvalue 2 (and hence a generalized eigenvector of index 1 of A with associated eigenvalue 2). On the other hand, if v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, then (A − 2I)^2 v = 0 but (A − 2I)v ≠ 0, so v is a generalized eigenvector of index 2 of A with associated eigenvalue 2.
In this case, as you can check, the vector u is a basis for the eigenspace E_2, so E_2 = \{cu \mid c ∈ \mathbb{C}\} is one dimensional.
On the other hand, u and v are both generalized eigenvectors associated to the eigenvalue 2, and are linearly independent (the equation c_1u + c_2v = 0 only has the solution c_1 = c_2 = 0, as you can readily check), so \tilde{E}_2 has dimension at least 2. Since \tilde{E}_2 is a subspace of \mathbb{C}^2, it must have dimension exactly 2, and \tilde{E}_2 = \mathbb{C}^2 (and \{u, v\} is indeed a basis for \mathbb{C}^2).
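A brief numerical sketch of these computations (Python with NumPy assumed, not part of the original text):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-4.0, 4.0]])
N = A - 2 * np.eye(2)                      # A - 2I

v = np.array([1.0, 0.0])
print(N @ v)                               # [-2. -4.], nonzero: v is not an ordinary eigenvector
print(N @ (N @ v))                         # [0. 0.]: (A - 2I)^2 v = 0, so v has index 2
print(2 - np.linalg.matrix_rank(N))        # 1 = dim E_2
```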
Let us next consider a generalized eigenvector vk of index k associated to an eigenvalue λ, and
set
vk−1 = (A − λI)vk .
We claim that vk−1 is a generalized eigenvector of index k − 1 associated to the eigenvalue λ.
To see this, note that
(A − λI)^{k-1} v_{k-1} = (A − λI)^{k-1}(A − λI)v_k = (A − λI)^k v_k = 0
but
(A − λI)^{k-2} v_{k-1} = (A − λI)^{k-2}(A − λI)v_k = (A − λI)^{k-1} v_k ≠ 0 .
Proceeding in this way, we may set
v_{k-2} = (A − λI)v_{k-1} = (A − λI)^2 v_k
v_{k-3} = (A − λI)v_{k-2} = (A − λI)^2 v_{k-1} = (A − λI)^3 v_k
\vdots
v_1 = (A − λI)v_2 = \cdots = (A − λI)^{k-1} v_k
and note that each vi is a generalized eigenvector of index i associated to the eigenvalue λ. A
collection of generalized eigenvectors obtained in this way gets a special name:
Definition 2.7. If {v1, . . . , vk} is a set of generalized eigenvectors associated to the eigenvalue λ of
A, such that vk is a generalized eigenvector of index k and also
v_{k-1} = (A − λI)v_k, \; v_{k-2} = (A − λI)v_{k-1}, \; v_{k-3} = (A − λI)v_{k-2}, \; \ldots, \; v_2 = (A − λI)v_3, \; v_1 = (A − λI)v_2 ,
then {v1, . . . , vk} is called a chain of generalized eigenvectors of length k. The vector vk is called
the top of the chain and the vector v1 (which is an ordinary eigenvector) is called the bottom of the
chain.
Example 2.8. Let us return to Example 2.6. We saw there that v = \begin{pmatrix} 1 \\ 0 \end{pmatrix} is a generalized eigenvector of index 2 of A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix} associated to the eigenvalue 2. Let us set v_2 = v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. Then v_1 = (A − 2I)v_2 = \begin{pmatrix} -2 \\ -4 \end{pmatrix} is a generalized eigenvector of index 1 (i.e., an ordinary eigenvector), and \{v_1, v_2\} is a chain of length 2.
Remark 2.9. It is important to note that a chain of generalized eigenvectors {v1, . . . , vk} is entirely
determined by the vector vk at the top of the chain. For once we have chosen vk, there are no other
choices to be made: the vector vk−1 is determined by the equation vk−1 = (A − λI)vk; then the
vector vk−2 is determined by the equation vk−2 = (A − λI)vk−1, etc.
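Remark 2.9 can be turned directly into a computation: starting from the chosen top vector, repeatedly apply (A − λI) until the zero vector is reached. A sketch (Python with NumPy assumed, not part of the original text), using the data of Example 2.8:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-4.0, 4.0]])
N = A - 2 * np.eye(2)                      # A - 2I

chain = [np.array([1.0, 0.0])]             # v_2, the chosen top of the chain
while not np.allclose(N @ chain[-1], 0):
    chain.append(N @ chain[-1])            # v_{i-1} = (A - 2I) v_i

# The chain from top to bottom; the last vector is an ordinary eigenvector.
print(chain)                               # [array([1., 0.]), array([-2., -4.])]
```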
With this concept in hand, let us return to JCF. As we have seen, a matrix J in JCF has a
number of blocks J_1, J_2, \ldots, J_\ell, called Jordan blocks, along the diagonal. Let us begin our analysis with the case when J consists of a single Jordan block. So suppose J is a k-by-k matrix
J = \begin{pmatrix} λ & 1 & & & \\ & λ & 1 & & \\ & & \ddots & \ddots & \\ & & & λ & 1 \\ & & & & λ \end{pmatrix} .
Then,
J − λI = \begin{pmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ & & & & 0 \end{pmatrix} .
Let e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \ldots, e_k = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} .
Then direct calculation shows:
(J − λI)e_k = e_{k-1}
(J − λI)e_{k-1} = e_{k-2}
\vdots
(J − λI)e_2 = e_1
(J − λI)e_1 = 0
and so we see that \{e_1, \ldots, e_k\} is a chain of generalized eigenvectors. We also note that \{e_1, \ldots, e_k\} is a basis for \mathbb{C}^k, and so
\tilde{E}_λ = \mathbb{C}^k .
We first see that the situation is very analogous when we consider any k-by-k matrix with a
single chain of generalized eigenvectors of length k.
Proposition 2.10. Let {v1, . . . , vk} be a chain of generalized eigenvectors of length k associated to the
eigenvalue λ of a matrix A. Then {v1, . . . , vk} is linearly independent.
Proof. Suppose we have a linear combination
c_1v_1 + c_2v_2 + \cdots + c_{k-1}v_{k-1} + c_kv_k = 0 .
We must show each c_i = 0.
By the definition of a chain, v_{k-i} = (A − λI)^i v_k for each i, so we may write this equation as
c_1(A − λI)^{k-1} v_k + c_2(A − λI)^{k-2} v_k + \cdots + c_{k-1}(A − λI)v_k + c_kv_k = 0 .
Now let us multiply this equation on the left by (A − λI)^{k-1}. Then we obtain the equation
c_1(A − λI)^{2k-2} v_k + c_2(A − λI)^{2k-3} v_k + \cdots + c_{k-1}(A − λI)^k v_k + c_k(A − λI)^{k-1} v_k = 0 .
Now (A − λI)^{k-1}v_k = v_1 ≠ 0. However, (A − λI)^k v_k = 0, and then also (A − λI)^{k+1}v_k = (A − λI)(A − λI)^k v_k = (A − λI)(0) = 0, and then similarly (A − λI)^{k+2}v_k = 0, \ldots, (A − λI)^{2k-2}v_k = 0, so every term except the last one is zero and this equation becomes
c_kv_1 = 0 .
Since v_1 ≠ 0, this shows c_k = 0, so our linear combination is
c_1v_1 + c_2v_2 + \cdots + c_{k-1}v_{k-1} = 0 .
Repeat the same argument, this time multiplying by (A − λI)^{k-2} instead of (A − λI)^{k-1}. Then we obtain the equation
c_{k-1}v_1 = 0 ,
and, since v_1 ≠ 0, this shows that c_{k-1} = 0 as well. Keep going to get
c_1 = c_2 = \cdots = c_{k-1} = c_k = 0 ,
so \{v_1, \ldots, v_k\} is linearly independent. □
Theorem 2.11. Let A be a k-by-k matrix and suppose that \mathbb{C}^k has a basis \{v_1, \ldots, v_k\} consisting of a single chain of generalized eigenvectors of length k associated to an eigenvalue a. Then
A = PJP^{-1}
where
J = \begin{pmatrix} a & 1 & & & \\ & a & 1 & & \\ & & \ddots & \ddots & \\ & & & a & 1 \\ & & & & a \end{pmatrix}
is a matrix consisting of a single Jordan block and
P = \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix}
is a matrix whose columns are generalized eigenvectors forming a chain.
Proof. We give a proof by direct computation here. (Note the similarity of this proof to the proof
of Theorem 1.14.) For a more conceptual proof, see Theorem 1.11 in Appendix A.
Let P be the given matrix. We will first show by direct computation that AP = PJ.
It will be convenient to write
J = \begin{pmatrix} j_1 & j_2 & \cdots & j_k \end{pmatrix}
and we see that j_i, the ith column of J, is the vector
j_i = \begin{pmatrix} 0 & \cdots & 0 & 1 & a & 0 & \cdots & 0 \end{pmatrix}^T
with 1 in the (i − 1)st position, a in the ith position, and 0 elsewhere.
We show that AP = PJ by showing that their corresponding columns are equal.
Now
AP = A\begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix}
so the ith column of AP is Av_i. But
Av_i = (A − aI + aI)v_i = (A − aI)v_i + aIv_i = v_{i-1} + av_i \text{ for } i > 1, \quad = av_i \text{ for } i = 1 .
On the other hand,
PJ = \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix} J
and the ith column of PJ is Pj_i,
Pj_i = \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix} j_i .
Remembering what the vector j_i is, and multiplying, we see that
Pj_i = v_{i-1} + av_i \text{ for } i > 1, \quad = av_i \text{ for } i = 1
as well.
Thus, every column of AP is equal to the corresponding column of PJ, so
AP = PJ .
But Proposition 2.10 shows that the columns of P are linearly independent, so P is invertible. Multiplying on the right by P^{-1}, we see that
A = PJP^{-1} . □
Example 2.12. Applying Theorem 2.11 to the matrix A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix} of Examples 2.6 and 2.8, we see that
\begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix} = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix}^{-1} .
Here is the key theorem to which we have been heading. This theorem is one of the most
important (and useful) theorems in linear algebra.
Theorem 2.13. Let A be any square matrix defined over the complex numbers. Then A is similar to a
matrix in Jordan Canonical Form. More precisely, A = PJP −1, for some matrix J in Jordan Canonical
Form. The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose columns
are chains of generalized eigenvectors of A.
Proof. (Rough outline) In general, the JCF of a matrix A does not consist of a single block, but will
have a number of blocks, of varying sizes and associated to varying eigenvalues.
But in this situation we merely have to “assemble” the various blocks (to get the matrix J)
and the various chains of generalized eigenvectors (to get a basis and hence the matrix P). Actually,
the word “merely” is a bit misleading, as the argument that we can do so is, in fact, a subtle one, and
we shall not give it here. □
In lieu of proving Theorem 2.13, we shall give a number of examples that illustrate the
situation. In fact, in order to avoid complicated notation we shall merely illustrate the situation for
2-by-2 and 3-by-3 matrices.
Theorem 2.14. Let A be a 2-by-2 matrix. Then one of the following situations applies:
(i) A has two eigenvalues, a and b, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a and let v be an eigenvector associated to the eigenvalue b. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u & v \end{pmatrix} .
(Note, in this case, A is diagonalizable.)
(ii) A has a single eigenvalue a of algebraic multiplicity 2.
(a) A has two linearly independent eigenvectors u and v. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u & v \end{pmatrix} .
(Note, in this case, A is diagonalizable. In fact, in this case E_a = \mathbb{C}^2 and A itself is the matrix \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}.)
(b) A has a single chain \{v_1, v_2\} of generalized eigenvectors. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} v_1 & v_2 \end{pmatrix} .
Theorem 2.15. Let A be a 3-by-3 matrix. Then one of the following situations applies:
(i) A has three eigenvalues, a, b, and c, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a, v be an eigenvector associated to the eigenvalue b, and w be an eigenvector associated to the eigenvalue c. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u & v & w \end{pmatrix} .
(Note, in this case, A is diagonalizable.)
(ii) A has an eigenvalue a of algebraic multiplicity 2 and an eigenvalue b of algebraic multiplicity 1.
(a) A has two independent eigenvectors, u and v, associated to the eigenvalue a. Let w be an eigenvector associated to the eigenvalue b. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u & v & w \end{pmatrix} .
(Note, in this case, A is diagonalizable.)
(b) A has a single chain \{u_1, u_2\} of generalized eigenvectors associated to the eigenvalue a. Let v be an eigenvector associated to the eigenvalue b. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u_1 & u_2 & v \end{pmatrix} .
(iii) A has a single eigenvalue a of algebraic multiplicity 3.
(a) A has three linearly independent eigenvectors, u, v, and w. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u & v & w \end{pmatrix} .
(Note, in this case, A is diagonalizable. In fact, in this case E_a = \mathbb{C}^3 and A itself is the matrix \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix}.)
(b) A has a chain \{u_1, u_2\} of generalized eigenvectors and an eigenvector v with \{u_1, u_2, v\} linearly independent. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u_1 & u_2 & v \end{pmatrix} .
(c) A has a single chain \{u_1, u_2, u_3\} of generalized eigenvectors. Then A = PJP^{-1} with
J = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix} .
Remark 2.16. Suppose that A has JCF J = aI, a scalar multiple of the identity matrix. Then A = PJP^{-1} = P(aI)P^{-1} = a(PIP^{-1}) = aI = J. This justifies the parenthetical remark in Theorems 2.14 (ii) (a) and 2.15 (iii) (a).
Remark 2.17. Note that Theorems 2.14 (i), 2.14 (ii) (a), 2.15 (i), 2.15 (ii) (a), and 2.15 (iii) (a) are
all special cases of Theorem 1.14, and in fact Theorems 2.14 (i) and 2.15 (i) are both special cases
of Corollary 1.15.
Now we would like to apply Theorems 2.14 and 2.15. In order to do so, we need to have an
effective method to determine which of the cases we are in, and we give that here (without proof).
Definition 2.18. Let λ be an eigenvalue of A. Then for any positive integer i,
E^i_λ = \{v \mid (A − λI)^i(v) = 0\} = Ker((A − λI)^i) .
Note that E^i_λ consists of generalized eigenvectors of index at most i (and the 0 vector), and is a subspace. Note also that
E_λ = E^1_λ ⊆ E^2_λ ⊆ \ldots ⊆ \tilde{E}_λ .
In general, the JCF of A is determined by the dimensions of all the spaces E^i_λ, but this determination can be a bit complicated. For eigenvalues of multiplicity at most 3, however, the situation is simpler: we need only consider the eigenspaces E_λ. This is a consequence of the following general result.
Proposition 2.19. Let λ be an eigenvalue of A. Then the number of blocks in the JCF of A corresponding to λ is equal to dim E_λ, i.e., to the geometric multiplicity of λ.
Proof. (Outline) Suppose there are ℓ such blocks. Since each block corresponds to a chain of generalized eigenvectors, there are ℓ such chains. Now the bottom of each chain is an (ordinary) eigenvector, so we get ℓ eigenvectors in this way. It can be shown that these eigenvectors are always linearly independent and that they always span E_λ, i.e., that they are a basis of E_λ. Thus, E_λ has a basis consisting of ℓ vectors, so dim E_λ = ℓ. □
We can now determine the JCF of 1-by-1, 2-by-2, and 3-by-3 matrices, using the following
consequences of this proposition.
Corollary 2.20. Let λ be an eigenvalue of A of algebraic multiplicity 1. Then dim E^1_λ = 1, i.e., λ has geometric multiplicity 1, and the submatrix of the JCF of A corresponding to the eigenvalue λ is a single 1-by-1 block.
Corollary 2.21. Let λ be an eigenvalue of A of algebraic multiplicity 2. Then there are the following
possibilities:
(a) dim E^1_λ = 2, i.e., λ has geometric multiplicity 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of two 1-by-1 blocks.
(b) dim E^1_λ = 1, i.e., λ has geometric multiplicity 1. Also, dim E^2_λ = 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of a single 2-by-2 block.
Corollary 2.22. Let λ be an eigenvalue of A of algebraic multiplicity 3. Then there are the following
possibilities:
(a) dim E^1_λ = 3, i.e., λ has geometric multiplicity 3. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of three 1-by-1 blocks.
(b) dim E^1_λ = 2, i.e., λ has geometric multiplicity 2. Also, dim E^2_λ = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a 2-by-2 block and a 1-by-1 block.
(c) dim E^1_λ = 1, i.e., λ has geometric multiplicity 1. Also, dim E^2_λ = 2, and dim E^3_λ = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a single 3-by-3 block.
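Since dim E^i_λ = n − rank((A − λI)^i), these corollaries can be applied mechanically. The following sketch (Python with NumPy assumed, not part of the original text) computes these dimensions; applied to the matrix of Example 2.27 below it returns [1, 2, 3], placing us in case (c) of Corollary 2.22.

```python
import numpy as np

def generalized_eigenspace_dims(A, lam, max_power=3):
    """Return [dim E^1_lam, dim E^2_lam, ...], computed as n - rank((A - lam*I)^i)."""
    n = A.shape[0]
    N = A - lam * np.eye(n)
    return [n - np.linalg.matrix_rank(np.linalg.matrix_power(N, i))
            for i in range(1, max_power + 1)]

A = np.array([[5.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [-7.0, 1.0, 0.0]])
print(generalized_eigenspace_dims(A, 2.0))   # [1, 2, 3]
```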
Now we shall do several examples.
Example 2.23. A = \begin{pmatrix} 2 & -3 & -3 \\ 2 & -2 & -2 \\ -2 & 1 & 1 \end{pmatrix} .
A has characteristic polynomial det(λI − A) = (λ + 1)(λ)(λ − 2). Thus, A has eigenvalues −1, 0, and 2, each of multiplicity one, and so we are in the situation of Theorem 2.15 (i). Computation shows that the eigenspace E_{-1} = Ker(A − (−I)) has basis \left\{ \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \right\}, the eigenspace E_0 = Ker(A) has basis \left\{ \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} \right\}, and the eigenspace E_2 = Ker(A − 2I) has basis \left\{ \begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix} \right\}. Hence, we see that
\begin{pmatrix} 2 & -3 & -3 \\ 2 & -2 & -2 \\ -2 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix}^{-1} .
Example 2.24. A = \begin{pmatrix} 3 & 1 & 1 \\ 2 & 4 & 2 \\ 1 & 1 & 3 \end{pmatrix} .
A has characteristic polynomial det(λI − A) = (λ − 2)^2(λ − 6). Thus, A has an eigenvalue 2 of multiplicity 2 and an eigenvalue 6 of multiplicity 1. Computation shows that the eigenspace E_2 = Ker(A − 2I) has basis \left\{ \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \right\}, so dim E_2 = 2 and we are in the situation of Corollary 2.21 (a). Further computation shows that the eigenspace E_6 = Ker(A − 6I) has basis \left\{ \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \right\}. Hence, we see that
\begin{pmatrix} 3 & 1 & 1 \\ 2 & 4 & 2 \\ 1 & 1 & 3 \end{pmatrix} = \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 2 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{pmatrix} \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 2 \\ 0 & 1 & 1 \end{pmatrix}^{-1} .
Example 2.25. A = \begin{pmatrix} 2 & 1 & 1 \\ 2 & 1 & -2 \\ -1 & 0 & -2 \end{pmatrix} .
A has characteristic polynomial det(λI − A) = (λ + 1)^2(λ − 3). Thus, A has an eigenvalue −1 of multiplicity 2 and an eigenvalue 3 of multiplicity 1. Computation shows that the eigenspace E_{-1} = Ker(A − (−I)) has basis \left\{ \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix} \right\}, so dim E_{-1} = 1 and we are in the situation of Corollary 2.21 (b). Then we further compute that E^2_{-1} = Ker((A − (−I))^2) has basis \left\{ \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\}, and is therefore two-dimensional, as we expect. More to the point, we may choose any generalized eigenvector of index 2, i.e., any vector in E^2_{-1} that is not in E^1_{-1}, as the top of a chain. We choose u_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, and then we have u_1 = (A − (−I))u_2 = \begin{pmatrix} 1 \\ -2 \\ -1 \end{pmatrix}, and \{u_1, u_2\} form a chain.
We also compute that, for the eigenvalue 3, the eigenspace E_3 has basis \left\{ v = \begin{pmatrix} -5 \\ -6 \\ 1 \end{pmatrix} \right\}. Hence, we see that
\begin{pmatrix} 2 & 1 & 1 \\ 2 & 1 & -2 \\ -1 & 0 & -2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & -5 \\ -2 & 0 & -6 \\ -1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 & -5 \\ -2 & 0 & -6 \\ -1 & 1 & 1 \end{pmatrix}^{-1} .
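A quick numerical check of this factorization (a sketch assuming Python with NumPy, not part of the original text):

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0], [2.0, 1.0, -2.0], [-1.0, 0.0, -2.0]])
P = np.array([[1.0, 0.0, -5.0], [-2.0, 0.0, -6.0], [-1.0, 1.0, 1.0]])
J = np.array([[-1.0, 1.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 3.0]])

print(np.allclose(P @ J @ np.linalg.inv(P), A))   # True
```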
Example 2.26. A = \begin{pmatrix} 2 & 1 & 1 \\ -2 & -1 & -2 \\ 1 & 1 & 2 \end{pmatrix} .
A has characteristic polynomial det(λI − A) = (λ − 1)^3, so A has one eigenvalue 1 of multiplicity three. Computation shows that E_1 = Ker(A − I) has basis \left\{ \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \right\}, so dim E_1 = 2 and we are in the situation of Corollary 2.22 (b). Computation then shows that dim E^2_1 = 3 (i.e., (A − I)^2 = 0 and E^2_1 is all of \mathbb{C}^3) with basis \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\}. We may choose u_2 to be any vector in E^2_1 that is not in E^1_1, and we shall choose u_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. Then u_1 = (A − I)u_2 = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}, and \{u_1, u_2\} form a chain. For the third vector, v, we may choose any vector in E_1 such that \{u_1, v\} is linearly independent. We choose v = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}. Hence, we see that
\begin{pmatrix} 2 & 1 & 1 \\ -2 & -1 & -2 \\ 1 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & -1 \\ -2 & 0 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & -1 \\ -2 & 0 & 0 \\ 1 & 0 & 1 \end{pmatrix}^{-1} .
Example 2.27. A = \begin{pmatrix} 5 & 0 & 1 \\ 1 & 1 & 0 \\ -7 & 1 & 0 \end{pmatrix} .
A has characteristic polynomial det(λI − A) = (λ − 2)^3, so A has one eigenvalue 2 of multiplicity three. Computation shows that E_2 = Ker(A − 2I) has basis \left\{ \begin{pmatrix} -1 \\ -1 \\ 3 \end{pmatrix} \right\}, so dim E^1_2 = 1 and we are in the situation of Corollary 2.22 (c). Then computation shows that E^2_2 = Ker((A − 2I)^2) has basis \left\{ \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix}, \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix} \right\}. (Note that \begin{pmatrix} -1 \\ -1 \\ 3 \end{pmatrix} = (3/2)\begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix} + (1/2)\begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix}.) Computation then shows that dim E^3_2 = 3 (i.e., (A − 2I)^3 = 0 and E^3_2 is all of \mathbb{C}^3) with basis \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\}.
We may choose u_3 to be any vector in \mathbb{C}^3 that is not in E^2_2, and we shall choose u_3 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. Then u_2 = (A − 2I)u_3 = \begin{pmatrix} 3 \\ 1 \\ -7 \end{pmatrix} and u_1 = (A − 2I)u_2 = \begin{pmatrix} 2 \\ 2 \\ -6 \end{pmatrix}, and then \{u_1, u_2, u_3\} form a chain. Hence, we see that
\begin{pmatrix} 5 & 0 & 1 \\ 1 & 1 & 0 \\ -7 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 3 & 1 \\ 2 & 1 & 0 \\ -6 & -7 & 0 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 2 & 3 & 1 \\ 2 & 1 & 0 \\ -6 & -7 & 0 \end{pmatrix}^{-1} .
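Again, a short numerical check of the factorization just found (a sketch assuming Python with NumPy, not part of the original text):

```python
import numpy as np

A = np.array([[5.0, 0.0, 1.0], [1.0, 1.0, 0.0], [-7.0, 1.0, 0.0]])
P = np.array([[2.0, 3.0, 1.0], [2.0, 1.0, 0.0], [-6.0, -7.0, 0.0]])
J = np.array([[2.0, 1.0, 0.0], [0.0, 2.0, 1.0], [0.0, 0.0, 2.0]])

print(np.allclose(P @ J @ np.linalg.inv(P), A))   # True
```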
As we have mentioned, we need to work over the complex numbers in order for the theory
of JCF to fully apply. But there is an analog over the real numbers, and we conclude this section by
stating it.
Theorem 2.28. Let A be a real square matrix (i.e., a square matrix with all entries real numbers), and
suppose that all of the eigenvalues of A are real numbers. Then A is similar to a real matrix in Jordan
Canonical Form. More precisely, A = PJP−1 with P and J real matrices, for some matrix J in Jordan
Canonical Form.The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose
columns are chains of generalized eigenvectors of A.
EXERCISES FOR CHAPTER 1
For each matrix A, write A = PJP−1 with P an invertible matrix and J a matrix in JCF.
1. A = \begin{pmatrix} 75 & 56 \\ -90 & -67 \end{pmatrix}, det(λI − A) = (λ − 3)(λ − 5).
2. A = \begin{pmatrix} -50 & 99 \\ -20 & 39 \end{pmatrix}, det(λI − A) = (λ + 6)(λ + 5).
3. A = \begin{pmatrix} -18 & 9 \\ -49 & 24 \end{pmatrix}, det(λI − A) = (λ − 3)^2.
4. A = \begin{pmatrix} 1 & 1 \\ -16 & 9 \end{pmatrix}, det(λI − A) = (λ − 5)^2.
5. A = \begin{pmatrix} 2 & 1 \\ -25 & 12 \end{pmatrix}, det(λI − A) = (λ − 7)^2.
6. A = \begin{pmatrix} -15 & 9 \\ -25 & 15 \end{pmatrix}, det(λI − A) = λ^2.
7. A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 2 & -3 \\ 1 & -1 & 0 \end{pmatrix}, det(λI − A) = (λ + 1)(λ − 1)(λ − 3).
8. A = \begin{pmatrix} 3 & 0 & 2 \\ 1 & 3 & 1 \\ 0 & 1 & 1 \end{pmatrix}, det(λI − A) = (λ − 1)(λ − 2)(λ − 4).
9. A = \begin{pmatrix} 5 & 8 & 16 \\ 4 & 1 & 8 \\ -4 & -4 & -11 \end{pmatrix}, det(λI − A) = (λ + 3)^2(λ − 1).
10. A = \begin{pmatrix} 4 & 2 & 3 \\ -1 & 1 & -3 \\ 2 & 4 & 9 \end{pmatrix}, det(λI − A) = (λ − 3)^2(λ − 8).
11. A = \begin{pmatrix} 5 & 2 & 1 \\ -1 & 2 & -1 \\ -1 & -2 & 3 \end{pmatrix}, det(λI − A) = (λ − 4)^2(λ − 2).
12. A = \begin{pmatrix} 8 & -3 & -3 \\ 4 & 0 & -2 \\ -2 & 1 & 3 \end{pmatrix}, det(λI − A) = (λ − 2)^2(λ − 7).
13. A = \begin{pmatrix} -3 & 1 & -1 \\ -7 & 5 & -1 \\ -6 & 6 & -2 \end{pmatrix}, det(λI − A) = (λ + 2)^2(λ − 4).
14. A = \begin{pmatrix} 3 & 0 & 0 \\ 9 & -5 & -18 \\ -4 & 4 & 12 \end{pmatrix}, det(λI − A) = (λ − 3)^2(λ − 4).
15. A = \begin{pmatrix} -6 & 9 & 0 \\ -6 & 6 & -2 \\ 9 & -9 & 3 \end{pmatrix}, det(λI − A) = λ^2(λ − 3).
16. A = \begin{pmatrix} -18 & 42 & 168 \\ 1 & -7 & -40 \\ -2 & 6 & 27 \end{pmatrix}, det(λI − A) = (λ − 3)^2(λ + 4).
17. A = \begin{pmatrix} -1 & 1 & -1 \\ -10 & 6 & -5 \\ -6 & 3 & -2 \end{pmatrix}, det(λI − A) = (λ − 1)^3.
18. A = \begin{pmatrix} 0 & -4 & 1 \\ 2 & -6 & 1 \\ 4 & -8 & 0 \end{pmatrix}, det(λI − A) = (λ + 2)^3.
19. A = \begin{pmatrix} -4 & 1 & 2 \\ -5 & 1 & 3 \\ -7 & 2 & 3 \end{pmatrix}, det(λI − A) = λ^3.
20. A = \begin{pmatrix} -4 & -2 & 5 \\ -1 & -1 & 1 \\ -2 & -1 & 2 \end{pmatrix}, det(λI − A) = (λ + 1)^3.
CHAPTER 2
Solving Systems of Linear Differential Equations
2.1 HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS
We will now see how to use Jordan Canonical Form (JCF) to solve systems Y' = AY. We begin by describing the strategy we will follow throughout this section.
Consider the matrix system
Y' = AY .
Step 1. Write A = PJP^{-1} with J in JCF, so the system becomes
Y' = (PJP^{-1})Y
Y' = PJ(P^{-1}Y)
P^{-1}Y' = J(P^{-1}Y)
(P^{-1}Y)' = J(P^{-1}Y) .
(Note that, since P^{-1} is a constant matrix, we have that (P^{-1}Y)' = P^{-1}Y'.)
Step 2. Set Z = P^{-1}Y, so this system becomes
Z' = JZ
and solve this system for Z.
Step 3. Since Z = P^{-1}Y, we have that
Y = PZ
is the solution to our original system.
Examining this strategy, we see that we already know how to carry out Step 1, and also that Step 3 is very easy—it is just matrix multiplication. Thus, the key to success here is being able to carry out Step 2. This is where JCF comes in. As we shall see, it is (relatively) easy to solve Z' = JZ when J is a matrix in JCF.
You will note that throughout this section, in solving Z' = JZ, we write the solution as Z = M_Z C, where M_Z is a matrix of functions, called the fundamental matrix of the system, and C is a vector of arbitrary constants. The reason for this will become clear later. (See Remarks 1.12 and 1.14.)
Although it is not logically necessary—we may regard a diagonal matrix as a matrix in JCF
in which all the Jordan blocks are 1-by-1 blocks—it is illuminating to handle the case when J is
diagonal first. Here the solution is very easy.
Theorem 1.1. Let J be a k-by-k diagonal matrix,
J = \begin{pmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_k \end{pmatrix} .
Then the system Z' = JZ has the solution
Z = \begin{pmatrix} e^{a_1x} & & & \\ & e^{a_2x} & & \\ & & \ddots & \\ & & & e^{a_kx} \end{pmatrix} C = M_Z C
where C = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{pmatrix} is a vector of arbitrary constants c_1, c_2, \ldots, c_k.
Proof. Multiplying out, we see that the system Z' = JZ is just the system
\begin{pmatrix} z_1' \\ z_2' \\ \vdots \\ z_k' \end{pmatrix} = \begin{pmatrix} a_1z_1 \\ a_2z_2 \\ \vdots \\ a_kz_k \end{pmatrix} .
But this system is “uncoupled”, i.e., the equation for z_i' only involves z_i and none of the other functions. Now this equation is very familiar. In general, the differential equation z' = az has solution z = ce^{ax}, and applying that here we find that Z' = JZ has solution
Z = \begin{pmatrix} c_1e^{a_1x} \\ c_2e^{a_2x} \\ \vdots \\ c_ke^{a_kx} \end{pmatrix} ,
which is exactly the above product M_Z C. □
Example 1.2. Consider the system
Y' = AY where A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix} .
We saw in Example 1.16 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix} and J = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} .
Then Z' = JZ has solution
Z = \begin{pmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = M_Z C = \begin{pmatrix} c_1e^{3x} \\ c_2e^{-2x} \end{pmatrix}
and so Y = PZ = PM_Z C, i.e.,
Y = \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 7e^{3x} & e^{-2x} \\ 2e^{3x} & e^{-2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 7c_1e^{3x} + c_2e^{-2x} \\ 2c_1e^{3x} + c_2e^{-2x} \end{pmatrix} .
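One can confirm symbolically that this Y solves Y' = AY; a sketch using Python's SymPy library (an assumption of ours, not part of the text):

```python
import sympy as sp

x, c1, c2 = sp.symbols('x c1 c2')
A = sp.Matrix([[5, -7], [2, -4]])
Y = sp.Matrix([7*c1*sp.exp(3*x) + c2*sp.exp(-2*x),
               2*c1*sp.exp(3*x) + c2*sp.exp(-2*x)])

# Y' - AY should be the zero vector
print(sp.simplify(Y.diff(x) - A * Y))   # Matrix([[0], [0]])
```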
Example 1.3. Consider the system
Y' = AY where A = \begin{pmatrix} 2 & -3 & -3 \\ 2 & -2 & -2 \\ -2 & 1 & 1 \end{pmatrix} .
We saw in Example 2.23 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix} and J = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix} .
Then Z' = JZ has solution
Z = \begin{pmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = M_Z C
and so Y = PZ = PM_Z C, i.e.,
Y = \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} e^{-x} & 0 & -e^{2x} \\ 0 & -1 & -e^{2x} \\ e^{-x} & 1 & e^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} c_1e^{-x} - c_3e^{2x} \\ -c_2 - c_3e^{2x} \\ c_1e^{-x} + c_2 + c_3e^{2x} \end{pmatrix} .
We now see how to use JCF to solve systems Y' = AY where the coefficient matrix A is not diagonalizable.
The key to understanding systems is to investigate a system Z' = JZ where J is a matrix consisting of a single Jordan block. Here the solution is not as easy as in Theorem 1.1, but it is still not too hard.
Theorem 1.4. Let J be a k-by-k Jordan block with eigenvalue a,
J = \begin{pmatrix} a & 1 & & & \\ & a & 1 & & \\ & & \ddots & \ddots & \\ & & & a & 1 \\ & & & & a \end{pmatrix} .
Then the system Z' = JZ has the solution
Z = e^{ax} \begin{pmatrix} 1 & x & x^2/2! & x^3/3! & \cdots & x^{k-1}/(k-1)! \\ & 1 & x & x^2/2! & \cdots & x^{k-2}/(k-2)! \\ & & 1 & x & \cdots & x^{k-3}/(k-3)! \\ & & & \ddots & \ddots & \vdots \\ & & & & 1 & x \\ & & & & & 1 \end{pmatrix} C = M_Z C
where C = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{pmatrix} is a vector of arbitrary constants c_1, c_2, \ldots, c_k.
Proof. We will prove this in the cases k = 1, 2, and 3, which illustrate the pattern. As you will see, the proof is a simple application of the standard technique for solving first-order linear differential equations.
The case k = 1: Here we are considering the system
[z_1'] = [a][z_1]
which is nothing other than the differential equation
z_1' = az_1 .
This differential equation has solution
z_1 = c_1e^{ax} ,
which we can certainly write as
[z_1] = e^{ax}[1][c_1] .
The case k = 2: Here we are considering the system
\begin{pmatrix} z_1' \\ z_2' \end{pmatrix} = \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} ,
which is nothing other than the pair of differential equations
z_1' = az_1 + z_2
z_2' = az_2 .
We recognize the second equation as having the solution
z_2 = c_2e^{ax}
and we substitute this into the first equation to get
z_1' = az_1 + c_2e^{ax} .
To solve this, we rewrite this as
z_1' − az_1 = c_2e^{ax}
and recognize that this differential equation has integrating factor e^{-ax}. Multiplying by this factor, we find
e^{-ax}(z_1' − az_1) = c_2
(e^{-ax}z_1)' = c_2
e^{-ax}z_1 = \int c_2 \, dx = c_1 + c_2x
so
z_1 = e^{ax}(c_1 + c_2x) .
Thus, our solution is
z_1 = e^{ax}(c_1 + c_2x)
z_2 = e^{ax}c_2 ,
which we see we can rewrite as
\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = e^{ax} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} .
The case k = 3: Here we are considering the system
\begin{pmatrix} z_1' \\ z_2' \\ z_3' \end{pmatrix} = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix} ,
which is nothing other than the triple of differential equations
z_1' = az_1 + z_2
z_2' = az_2 + z_3
z_3' = az_3 .
If we just concentrate on the last two equations, we see we are in the k = 2 case. Referring to that case, we see that our solution is
z_2 = e^{ax}(c_2 + c_3x)
z_3 = e^{ax}c_3 .
Substituting the value of z_2 into the equation for z_1, we obtain
z_1' = az_1 + e^{ax}(c_2 + c_3x) .
To solve this, we rewrite this as
z_1' − az_1 = e^{ax}(c_2 + c_3x)
and recognize that this differential equation has integrating factor e^{-ax}. Multiplying by this factor, we find
e^{-ax}(z_1' − az_1) = c_2 + c_3x
(e^{-ax}z_1)' = c_2 + c_3x
e^{-ax}z_1 = \int (c_2 + c_3x) \, dx = c_1 + c_2x + c_3(x^2/2)
so
z_1 = e^{ax}(c_1 + c_2x + c_3(x^2/2)) .
Thus, our solution is
z_1 = e^{ax}(c_1 + c_2x + c_3(x^2/2))
z_2 = e^{ax}(c_2 + c_3x)
z_3 = e^{ax}c_3 ,
which we see we can rewrite as
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix} = e^{ax} \begin{pmatrix} 1 & x & x^2/2 \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} . □
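The matrix appearing in Theorem 1.4 is exactly the matrix exponential e^{Jx} of Remark 1.14 and Section 2.4. A numerical spot-check of this claim (a sketch assuming Python with NumPy and SciPy, not part of the text), for a 3-by-3 Jordan block with a = 2 at x = 0.7:

```python
import numpy as np
from scipy.linalg import expm

a, x = 2.0, 0.7
J = np.array([[a, 1.0, 0.0],
              [0.0, a, 1.0],
              [0.0, 0.0, a]])

M_Z = np.exp(a * x) * np.array([[1.0, x, x**2 / 2],
                                [0.0, 1.0, x],
                                [0.0, 0.0, 1.0]])

print(np.allclose(M_Z, expm(J * x)))   # True
```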
Remark 1.5. Suppose that Z' = JZ where J is a matrix in JCF but one consisting of several blocks, not just one block. We can see that this system decomposes into several systems, one corresponding to each block, and that these systems are uncoupled, so we may solve them each separately, using Theorem 1.4, and then simply assemble these individual solutions together to obtain a solution of the general system.
We now illustrate this (confining our illustrations to the case that A is not diagonalizable, as
we have already illustrated the diagonalizable case).
Example 1.6. Consider the system
Y' = AY where A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix} .
We saw in Example 2.12 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} and J = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} .
Then Z' = JZ has solution
Z = e^{2x} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} e^{2x} & xe^{2x} \\ 0 & e^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = M_Z C = \begin{pmatrix} c_1e^{2x} + c_2xe^{2x} \\ c_2e^{2x} \end{pmatrix}
and so Y = PZ = PM_Z C, i.e.,
Y = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} \begin{pmatrix} e^{2x} & xe^{2x} \\ 0 & e^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} -2e^{2x} & -2xe^{2x} + e^{2x} \\ -4e^{2x} & -4xe^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} (-2c_1 + c_2)e^{2x} - 2c_2xe^{2x} \\ -4c_1e^{2x} - 4c_2xe^{2x} \end{pmatrix} .
Example 1.7. Consider the system
Y' = AY where A = \begin{pmatrix} 2 & 1 & 1 \\ 2 & 1 & -2 \\ -1 & 0 & -2 \end{pmatrix} .
We saw in Example 2.25 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 1 & 0 & -5 \\ -2 & 0 & -6 \\ -1 & 1 & 1 \end{pmatrix} and J = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 3 \end{pmatrix} .
Then Z' = JZ has solution
Z = \begin{pmatrix} e^{-x} & xe^{-x} & 0 \\ 0 & e^{-x} & 0 \\ 0 & 0 & e^{3x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = M_Z C
and so Y = PZ = PM_Z C, i.e.,
Y = \begin{pmatrix} 1 & 0 & -5 \\ -2 & 0 & -6 \\ -1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^{-x} & xe^{-x} & 0 \\ 0 & e^{-x} & 0 \\ 0 & 0 & e^{3x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} e^{-x} & xe^{-x} & -5e^{3x} \\ -2e^{-x} & -2xe^{-x} & -6e^{3x} \\ -e^{-x} & -xe^{-x} + e^{-x} & e^{3x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} c_1e^{-x} + c_2xe^{-x} - 5c_3e^{3x} \\ -2c_1e^{-x} - 2c_2xe^{-x} - 6c_3e^{3x} \\ (-c_1 + c_2)e^{-x} - c_2xe^{-x} + c_3e^{3x} \end{pmatrix} .
Example 1.8. Consider the system
Y' = AY where A = \begin{pmatrix} 2 & 1 & 1 \\ -2 & -1 & -2 \\ 1 & 1 & 2 \end{pmatrix} .
We saw in Example 2.26 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 1 & 1 & -1 \\ -2 & 0 & 0 \\ 1 & 0 & 1 \end{pmatrix} and J = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} .
Then Z' = JZ has solution
Z = \begin{pmatrix} e^{x} & xe^{x} & 0 \\ 0 & e^{x} & 0 \\ 0 & 0 & e^{x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = M_Z C
and so Y = PZ = PM_Z C, i.e.,
Y = \begin{pmatrix} 1 & 1 & -1 \\ -2 & 0 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} e^{x} & xe^{x} & 0 \\ 0 & e^{x} & 0 \\ 0 & 0 & e^{x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} e^{x} & xe^{x} + e^{x} & -e^{x} \\ -2e^{x} & -2xe^{x} & 0 \\ e^{x} & xe^{x} & e^{x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} (c_1 + c_2 - c_3)e^{x} + c_2xe^{x} \\ -2c_1e^{x} - 2c_2xe^{x} \\ (c_1 + c_3)e^{x} + c_2xe^{x} \end{pmatrix} .
Example 1.9. Consider the system
Y' = AY where A = \begin{pmatrix} 5 & 0 & 1 \\ 1 & 1 & 0 \\ -7 & 1 & 0 \end{pmatrix} .
We saw in Example 2.27 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 2 & 3 & 1 \\ 2 & 1 & 0 \\ -6 & -7 & 0 \end{pmatrix} and J = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix} .
Then Z' = JZ has solution
Z = \begin{pmatrix} e^{2x} & xe^{2x} & (x^2/2)e^{2x} \\ 0 & e^{2x} & xe^{2x} \\ 0 & 0 & e^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = M_Z C
and so Y = PZ = PM_Z C, i.e.,
Y = \begin{pmatrix} 2 & 3 & 1 \\ 2 & 1 & 0 \\ -6 & -7 & 0 \end{pmatrix} \begin{pmatrix} e^{2x} & xe^{2x} & (x^2/2)e^{2x} \\ 0 & e^{2x} & xe^{2x} \\ 0 & 0 & e^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 2e^{2x} & 2xe^{2x} + 3e^{2x} & x^2e^{2x} + 3xe^{2x} + e^{2x} \\ 2e^{2x} & 2xe^{2x} + e^{2x} & x^2e^{2x} + xe^{2x} \\ -6e^{2x} & -6xe^{2x} - 7e^{2x} & -3x^2e^{2x} - 7xe^{2x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} (2c_1 + 3c_2 + c_3)e^{2x} + (2c_2 + 3c_3)xe^{2x} + c_3x^2e^{2x} \\ (2c_1 + c_2)e^{2x} + (2c_2 + c_3)xe^{2x} + c_3x^2e^{2x} \\ (-6c_1 - 7c_2)e^{2x} + (-6c_2 - 7c_3)xe^{2x} - 3c_3x^2e^{2x} \end{pmatrix} .
We conclude this section by showing how to solve initial value problems. This is just one more step, given what we have already done.
Example 1.10. Consider the initial value problem
Y' = AY where A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix}, and Y(0) = \begin{pmatrix} 3 \\ -8 \end{pmatrix} .
In Example 1.6, we saw that this system has the general solution
Y = \begin{pmatrix} (-2c_1 + c_2)e^{2x} - 2c_2xe^{2x} \\ -4c_1e^{2x} - 4c_2xe^{2x} \end{pmatrix} .
Applying the initial condition (i.e., substituting x = 0 in this matrix) gives
\begin{pmatrix} 3 \\ -8 \end{pmatrix} = Y(0) = \begin{pmatrix} -2c_1 + c_2 \\ -4c_1 \end{pmatrix}
with solution
\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 7 \end{pmatrix} .
Substituting these values in the above matrix gives
Y = \begin{pmatrix} 3e^{2x} - 14xe^{2x} \\ -8e^{2x} - 28xe^{2x} \end{pmatrix} .
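The closed-form answer can also be compared against a direct numerical integration of the initial value problem; a sketch using SciPy's solve_ivp (our assumption, not part of the text):

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-4.0, 4.0]])
sol = solve_ivp(lambda x, y: A @ y, (0.0, 1.0), [3.0, -8.0],
                rtol=1e-10, atol=1e-10)

x = sol.t[-1]
exact = np.array([3*np.exp(2*x) - 14*x*np.exp(2*x),
                  -8*np.exp(2*x) - 28*x*np.exp(2*x)])
print(np.allclose(sol.y[:, -1], exact, rtol=1e-6))   # True
```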
Example 1.11. Consider the initial value problem
Y' = AY where A = \begin{pmatrix} 2 & 1 & 1 \\ 2 & 1 & -2 \\ -1 & 0 & -2 \end{pmatrix}, and Y(0) = \begin{pmatrix} 8 \\ 32 \\ 5 \end{pmatrix} .
In Example 1.7, we saw that this system has the general solution
Y = \begin{pmatrix} c_1e^{-x} + c_2xe^{-x} - 5c_3e^{3x} \\ -2c_1e^{-x} - 2c_2xe^{-x} - 6c_3e^{3x} \\ (-c_1 + c_2)e^{-x} - c_2xe^{-x} + c_3e^{3x} \end{pmatrix} .
Applying the initial condition (i.e., substituting x = 0 in this matrix) gives
\begin{pmatrix} 8 \\ 32 \\ 5 \end{pmatrix} = Y(0) = \begin{pmatrix} c_1 - 5c_3 \\ -2c_1 - 6c_3 \\ -c_1 + c_2 + c_3 \end{pmatrix}
with solution
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} -7 \\ 1 \\ -3 \end{pmatrix} .
Substituting these values in the above matrix gives
Y = \begin{pmatrix} -7e^{-x} + xe^{-x} + 15e^{3x} \\ 14e^{-x} - 2xe^{-x} + 18e^{3x} \\ 8e^{-x} - xe^{-x} - 3e^{3x} \end{pmatrix} .
Remark 1.12. There is a variant on our method of solving systems or initial value problems.
We have written our solution of Z' = JZ as Z = M_Z C. Let us be more explicit here and write this solution as
Z(x) = M_Z(x)C .
This notation reminds us that Z(x) is a vector of functions, M_Z(x) is a matrix of functions, and C is a vector of constants. The key observation is that M_Z(0) = I, the identity matrix. Thus, if we wish to solve the initial value problem
Z' = JZ, Z(0) = Z_0 ,
we find that, in general,
Z(x) = M_Z(x)C
and, in particular,
Z_0 = Z(0) = M_Z(0)C = IC = C ,
so the solution to this initial value problem is
Z(x) = M_Z(x)Z_0 .
Now suppose we wish to solve the system Y' = AY. Then, if A = PJP^{-1}, we have seen that this system has solution Y = PZ = PM_Z C. Let us manipulate this a bit:
Y = PM_Z C = PM_Z IC = PM_Z(P^{-1}P)C = (PM_Z P^{-1})(PC) .
Now let us set M_Y = PM_Z P^{-1}, and also let us set Γ = PC. Note that M_Y is still a matrix of functions, and that Γ is still a vector of arbitrary constants (since P is an invertible constant matrix and C is a vector of arbitrary constants). Thus, with this notation, we see that
Y' = AY has solution Y = M_Y Γ .
Now suppose we wish to solve the initial value problem
Y' = AY, Y(0) = Y_0 .
Rewriting the above solution of Y' = AY to explicitly include the independent variable, we see that we have
Y(x) = M_Y(x)Γ
and, in particular,
Y_0 = Y(0) = M_Y(0)Γ = PM_Z(0)P^{-1}Γ = PIP^{-1}Γ = Γ ,
so we see that
Y' = AY, Y(0) = Y_0 has solution Y(x) = M_Y(x)Y_0 .
This variant method has pros and cons. It is actually less effective than our original method for solving a single initial value problem (as it requires us to compute P^{-1} and do some extra matrix multiplication), but it has the advantage of expressing the solution directly in terms of the initial conditions. This makes it more effective if the same system Y' = AY is to be solved for a variety of initial conditions. Also, as we see from Remark 1.14 below, it is of considerable theoretical importance.
Let us now apply this variant method.
Example 1.13. Consider the initial value problem
Y' = AY where A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix}, and Y(0) = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} .
As we have seen in Example 1.6, A = PJP^{-1} with P = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} and J = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}. Then M_Z(x) = \begin{pmatrix} e^{2x} & xe^{2x} \\ 0 & e^{2x} \end{pmatrix} and
M_Y(x) = PM_Z(x)P^{-1} = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} \begin{pmatrix} e^{2x} & xe^{2x} \\ 0 & e^{2x} \end{pmatrix} \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} e^{2x} - 2xe^{2x} & xe^{2x} \\ -4xe^{2x} & e^{2x} + 2xe^{2x} \end{pmatrix}
so
Y(x) = M_Y(x)\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} e^{2x} - 2xe^{2x} & xe^{2x} \\ -4xe^{2x} & e^{2x} + 2xe^{2x} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} a_1e^{2x} + (-2a_1 + a_2)xe^{2x} \\ a_2e^{2x} + (-4a_1 + 2a_2)xe^{2x} \end{pmatrix} .
In particular, if Y(0) = \begin{pmatrix} 3 \\ -8 \end{pmatrix}, then Y(x) = \begin{pmatrix} 3e^{2x} - 14xe^{2x} \\ -8e^{2x} - 28xe^{2x} \end{pmatrix}, recovering the result of Example 1.10. But also, if Y(0) = \begin{pmatrix} 2 \\ 5 \end{pmatrix}, then Y(x) = \begin{pmatrix} 2e^{2x} + xe^{2x} \\ 5e^{2x} + 2xe^{2x} \end{pmatrix}, and if Y(0) = \begin{pmatrix} -4 \\ 15 \end{pmatrix}, then Y(x) = \begin{pmatrix} -4e^{2x} + 23xe^{2x} \\ 15e^{2x} + 46xe^{2x} \end{pmatrix}, etc.
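By Remark 1.14 (below), the matrix M_Y(x) computed here is e^{Ax}, so the variant method can be checked against a library matrix exponential. A sketch (Python with NumPy and SciPy assumed, not part of the text):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-4.0, 4.0]])
x = 0.5
M_Y = np.array([[np.exp(2*x) - 2*x*np.exp(2*x), x*np.exp(2*x)],
                [-4*x*np.exp(2*x), np.exp(2*x) + 2*x*np.exp(2*x)]])

print(np.allclose(M_Y, expm(A * x)))      # True
print(M_Y @ np.array([3.0, -8.0]))        # Y(0.5) for the initial data of Example 1.10
```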
Remark 1.14. In Section 2.4 we will define the matrix exponential, and, with this definition, M_Z(x) = e^{Jx} and M_Y(x) = PM_Z(x)P^{-1} = e^{Ax}.
EXERCISES FOR SECTION 2.1
For each exercise, see the corresponding exercise in Chapter 1. In each exercise:
(a) Solve the system Y' = AY.
(b) Solve the initial value problem Y' = AY, Y(0) = Y_0.
1. A = \begin{pmatrix} 75 & 56 \\ -90 & -67 \end{pmatrix} and Y_0 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
2. A = \begin{pmatrix} -50 & 99 \\ -20 & 39 \end{pmatrix} and Y_0 = \begin{pmatrix} 7 \\ 3 \end{pmatrix}.
3. A = \begin{pmatrix} -18 & 9 \\ -49 & 24 \end{pmatrix} and Y_0 = \begin{pmatrix} 41 \\ 98 \end{pmatrix}.
4. A = \begin{pmatrix} 1 & 1 \\ -16 & 9 \end{pmatrix} and Y_0 = \begin{pmatrix} 7 \\ 16 \end{pmatrix}.
5. A = \begin{pmatrix} 2 & 1 \\ -25 & 12 \end{pmatrix} and Y_0 = \begin{pmatrix} -10 \\ -75 \end{pmatrix}.
6. A = \begin{pmatrix} -15 & 9 \\ -25 & 15 \end{pmatrix} and Y_0 = \begin{pmatrix} 50 \\ 100 \end{pmatrix}.
7. A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 2 & -3 \\ 1 & -1 & 0 \end{pmatrix} and Y_0 = \begin{pmatrix} 6 \\ -10 \\ 10 \end{pmatrix}.
8. A = \begin{pmatrix} 3 & 0 & 2 \\ 1 & 3 & 1 \\ 0 & 1 & 1 \end{pmatrix} and Y_0 = \begin{pmatrix} 0 \\ 3 \\ 3 \end{pmatrix}.
9. A = \begin{pmatrix} 5 & 8 & 16 \\ 4 & 1 & 8 \\ -4 & -4 & -11 \end{pmatrix} and Y_0 = \begin{pmatrix} 0 \\ 2 \\ -1 \end{pmatrix}.
10. A = \begin{pmatrix} 4 & 2 & 3 \\ -1 & 1 & -3 \\ 2 & 4 & 9 \end{pmatrix} and Y_0 = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}.
11. A = \begin{pmatrix} 5 & 2 & 1 \\ -1 & 2 & -1 \\ -1 & -2 & 3 \end{pmatrix} and Y_0 = \begin{pmatrix} -3 \\ 2 \\ 9 \end{pmatrix}.
12. A = \begin{pmatrix} 8 & -3 & -3 \\ 4 & 0 & -2 \\ -2 & 1 & 3 \end{pmatrix} and Y_0 = \begin{pmatrix} 5 \\ 8 \\ 7 \end{pmatrix}.
13. A = \begin{pmatrix} -3 & 1 & -1 \\ -7 & 5 & -1 \\ -6 & 6 & -2 \end{pmatrix} and Y_0 = \begin{pmatrix} -1 \\ 3 \\ 6 \end{pmatrix}.
14. A = \begin{pmatrix} 3 & 0 & 0 \\ 9 & -5 & -18 \\ -4 & 4 & 12 \end{pmatrix} and Y_0 = \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix}.
15. A = \begin{pmatrix} -6 & 9 & 0 \\ -6 & 6 & -2 \\ 9 & -9 & 3 \end{pmatrix} and Y_0 = \begin{pmatrix} 1 \\ 3 \\ -6 \end{pmatrix}.
16. A = \begin{pmatrix} -18 & 42 & 168 \\ 1 & -7 & -40 \\ -2 & 6 & 27 \end{pmatrix} and Y_0 = \begin{pmatrix} 7 \\ -2 \\ 1 \end{pmatrix}.
17. A = \begin{pmatrix} -1 & 1 & -1 \\ -10 & 6 & -5 \\ -6 & 3 & -2 \end{pmatrix} and Y_0 = \begin{pmatrix} 3 \\ 10 \\ 18 \end{pmatrix}.
18. A = \begin{pmatrix} 0 & -4 & 1 \\ 2 & -6 & 1 \\ 4 & -8 & 0 \end{pmatrix} and Y_0 = \begin{pmatrix} 2 \\ 5 \\ 8 \end{pmatrix}.
19. A = \begin{pmatrix} -4 & 1 & 2 \\ -5 & 1 & 3 \\ -7 & 2 & 3 \end{pmatrix} and Y_0 = \begin{pmatrix} 6 \\ 11 \\ 9 \end{pmatrix}.
20. A = \begin{pmatrix} -4 & -2 & 5 \\ -1 & -1 & 1 \\ -2 & -1 & 2 \end{pmatrix} and Y_0 = \begin{pmatrix} 9 \\ 5 \\ 8 \end{pmatrix}.
2.2 HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS: COMPLEX ROOTS
In this section, we show how to solve a homogeneous system Y' = AY where the characteristic polynomial of A has complex roots. In principle, this is the same as the situation where the characteristic polynomial of A has real roots, which we dealt with in Section 2.1, but in practice, there is an extra step in the solution.
We will begin by doing an example, which will show us where the difficulty lies, and then we
will overcome that difficulty. But first, we need some background.
Definition 2.1. For a complex number z, the exponential e^z is defined by
e^z = 1 + z + z^2/2! + z^3/3! + \cdots .
The complex exponential has the following properties.
Theorem 2.2. (1) (Euler) For any θ,
e^{iθ} = cos(θ) + i sin(θ) .
(2) For any a,
(d/dz)(e^{az}) = a e^{az} .
(3) For any z_1 and z_2,
e^{z_1 + z_2} = e^{z_1} e^{z_2} .
(4) If z = s + it, then
e^z = e^s (cos(t) + i sin(t)) .
(5) For any z,
\overline{e^z} = e^{\bar{z}} .
Proof. For the proof, see Theorem 2.2 in Appendix A. □
The following lemma will save us some computations.
Lemma 2.3. Let A be a matrix with real entries, and let v be an eigenvector of A with associated eigenvalue λ. Then \bar{v} is an eigenvector of A with associated eigenvalue \bar{λ}.
Proof. We have that Av = λv, by hypothesis. Let us take the complex conjugate of each side of this equation. Then
\overline{Av} = \overline{λv},
\bar{A}\bar{v} = \bar{λ}\bar{v},
A\bar{v} = \bar{λ}\bar{v} (as \bar{A} = A since all the entries of A are real),
as claimed. □
Now for our example.
Example 2.4. Consider the system
Y' = AY where A = \begin{pmatrix} 2 & -17 \\ 1 & 4 \end{pmatrix}.
A has characteristic polynomial λ^2 - 6λ + 25 with roots λ_1 = 3 + 4i and λ_2 = \bar{λ}_1 = 3 - 4i, each of multiplicity 1. Thus, λ_1 and λ_2 are the eigenvalues of A, and we compute that the eigenspace E_{3+4i} = Ker(A - (3 + 4i)I) has basis { v_1 = \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} }, and hence, by Lemma 2.3, that the eigenspace E_{3-4i} = Ker(A - (3 - 4i)I) has basis { v_2 = \bar{v}_1 = \begin{pmatrix} -1-4i \\ 1 \end{pmatrix} }. Hence, just as before,
A = PJP^{-1} with P = \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} and J = \begin{pmatrix} 3+4i & 0 \\ 0 & 3-4i \end{pmatrix}.
We continue as before, but now we use F to denote a vector of arbitrary constants. (This is just for neatness. Our constants will change, as you will see, and we will use the vector C to denote our final constants, as usual.) Then Z' = JZ has solution
Z = \begin{pmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = M_Z F = \begin{pmatrix} f_1 e^{(3+4i)x} \\ f_2 e^{(3-4i)x} \end{pmatrix}
and so Y = PZ = P M_Z F, i.e.,
Y = \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = f_1 e^{(3+4i)x} \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} + f_2 e^{(3-4i)x} \begin{pmatrix} -1-4i \\ 1 \end{pmatrix}.
Now we want our differential equation to have real solutions, and in order for this to be the case, it turns out that we must have f_2 = \bar{f}_1. Thus, we may write our solution as
Y = f_1 e^{(3+4i)x} \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} + \bar{f}_1 e^{(3-4i)x} \begin{pmatrix} -1-4i \\ 1 \end{pmatrix} = f_1 e^{(3+4i)x} \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} + \overline{ f_1 e^{(3+4i)x} \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} },
where f_1 is an arbitrary complex constant.
This solution is correct but unacceptable. We want to solve the system Y' = AY, where A has real coefficients, and we have a solution which is indeed a real vector, but this vector is expressed in terms of complex numbers and functions. We need to obtain a solution that is expressed totally in terms of real numbers and functions. In order to do this, we need an extra step.
In order not to interrupt the flow of exposition, we simply state here what we need to do, and
we justify this after the conclusion of the example.
We therefore do the following: We simply replace the matrix P M_Z by the matrix whose first column is the real part Re(e^{λ_1 x} v_1) = Re( e^{(3+4i)x} \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} ), and whose second column is the imaginary part Im(e^{λ_1 x} v_1) = Im( e^{(3+4i)x} \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} ), and the vector F by the vector C of arbitrary real constants. We compute
e^{(3+4i)x} \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} = e^{3x}(cos(4x) + i sin(4x)) \begin{pmatrix} -1+4i \\ 1 \end{pmatrix} = e^{3x} \begin{pmatrix} -cos(4x) - 4 sin(4x) \\ cos(4x) \end{pmatrix} + i e^{3x} \begin{pmatrix} 4 cos(4x) - sin(4x) \\ sin(4x) \end{pmatrix}
and so we obtain
Y = \begin{pmatrix} e^{3x}(-cos(4x) - 4 sin(4x)) & e^{3x}(4 cos(4x) - sin(4x)) \\ e^{3x} cos(4x) & e^{3x} sin(4x) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} (-c_1 + 4c_2)e^{3x} cos(4x) + (-4c_1 - c_2)e^{3x} sin(4x) \\ c_1 e^{3x} cos(4x) + c_2 e^{3x} sin(4x) \end{pmatrix}.
Now we justify the step we have done.
Lemma 2.5. Consider the system Y' = AY, where A is a matrix with real entries. Let this system have general solution of the form
Y = P M_Z F = \begin{pmatrix} v_1 & \bar{v}_1 \end{pmatrix} \begin{pmatrix} e^{λ_1 x} & 0 \\ 0 & e^{\bar{λ}_1 x} \end{pmatrix} \begin{pmatrix} f_1 \\ \bar{f}_1 \end{pmatrix} = \begin{pmatrix} e^{λ_1 x} v_1 & \overline{e^{λ_1 x} v_1} \end{pmatrix} \begin{pmatrix} f_1 \\ \bar{f}_1 \end{pmatrix},
where f_1 is an arbitrary complex constant. Then this system also has general solution of the form
Y = \begin{pmatrix} Re(e^{λ_1 x} v_1) & Im(e^{λ_1 x} v_1) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},
where c_1 and c_2 are arbitrary real constants.
Proof. First note that for any complex number z = x + iy, x = Re(z) = (1/2)(z + \bar{z}) and y = Im(z) = (1/(2i))(z - \bar{z}), and similarly, for any complex vector.
Now Y' = AY has general solution Y = P M_Z F = P M_Z (R R^{-1}) F = (P M_Z R)(R^{-1} F) for any invertible matrix R. We now (cleverly) choose
R = \begin{pmatrix} 1/2 & 1/(2i) \\ 1/2 & -1/(2i) \end{pmatrix}.
With this choice of R,
P M_Z R = \begin{pmatrix} Re(e^{λ_1 x} v_1) & Im(e^{λ_1 x} v_1) \end{pmatrix}.
Then
R^{-1} = \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}.
Since f_1 is an arbitrary complex constant, we may (cleverly) choose to write it as f_1 = (1/2)(c_1 + ic_2) for arbitrary real constants c_1 and c_2, and with this choice
R^{-1} F = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},
yielding a general solution as claimed. □
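A quick numerical illustration of this lemma, for the matrix of Example 2.4, is sketched below (it assumes NumPy is available; note that numpy's eigenvector normalization need not match the v_1 chosen above):

    import numpy as np

    A = np.array([[2.0, -17.0], [1.0, 4.0]])   # the matrix of Example 2.4
    lams, vecs = np.linalg.eig(A)
    k = np.argmax(lams.imag)                   # the eigenvalue with positive imaginary part
    lam, v = lams[k], vecs[:, k]

    x = 0.3                                    # any sample value of x
    w = np.exp(lam*x) * v                      # the complex solution e^{lambda x} v
    # (e^{lambda x} v)' = lambda e^{lambda x} v = A e^{lambda x} v, and A is real,
    # so Re(w) and Im(w) each satisfy Y' = AY:
    print(np.allclose(A @ w.real, (lam*w).real))   # expected: True
    print(np.allclose(A @ w.imag, (lam*w).imag))   # expected: True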
We now solve Y' = AY where A is a real 3-by-3 matrix with a pair of complex eigenvalues and a third, real eigenvalue. As you will see, we use the idea of Lemma 2.5 to simply replace the "relevant" columns of P M_Z in order to obtain our final solution.
Example 2.6. Consider the system
Y' = AY where A = \begin{pmatrix} 15 & -16 & 8 \\ 10 & -10 & 5 \\ 0 & 1 & 2 \end{pmatrix}.
A has characteristic polynomial (λ^2 - 2λ + 5)(λ - 5) with roots λ_1 = 1 + 2i, λ_2 = \bar{λ}_1 = 1 - 2i, and λ_3 = 5, each of multiplicity 1. Thus, λ_1, λ_2, and λ_3 are the eigenvalues of A, and we compute that the eigenspace E_{1+2i} = Ker(A - (1 + 2i)I) has basis { v_1 = \begin{pmatrix} -2+2i \\ -1+2i \\ 1 \end{pmatrix} }, and hence, by Lemma 2.3, that the eigenspace E_{1-2i} = Ker(A - (1 - 2i)I) has basis { v_2 = \bar{v}_1 = \begin{pmatrix} -2-2i \\ -1-2i \\ 1 \end{pmatrix} }. We further compute that the eigenspace E_5 = Ker(A - 5I) has basis { v_3 = \begin{pmatrix} 4 \\ 3 \\ 1 \end{pmatrix} }. Hence, just as before,
A = PJP^{-1} with P = \begin{pmatrix} -2+2i & -2-2i & 4 \\ -1+2i & -1-2i & 3 \\ 1 & 1 & 1 \end{pmatrix} and J = \begin{pmatrix} 1+2i & 0 & 0 \\ 0 & 1-2i & 0 \\ 0 & 0 & 5 \end{pmatrix}.
Then Z' = JZ has solution
Z = \begin{pmatrix} e^{(1+2i)x} & 0 & 0 \\ 0 & e^{(1-2i)x} & 0 \\ 0 & 0 & e^{5x} \end{pmatrix} \begin{pmatrix} f_1 \\ \bar{f}_1 \\ c_3 \end{pmatrix} = M_Z F = \begin{pmatrix} f_1 e^{(1+2i)x} \\ \bar{f}_1 e^{(1-2i)x} \\ c_3 e^{5x} \end{pmatrix}
and so Y = PZ = P M_Z F, i.e.,
Y = \begin{pmatrix} -2+2i & -2-2i & 4 \\ -1+2i & -1-2i & 3 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^{(1+2i)x} & 0 & 0 \\ 0 & e^{(1-2i)x} & 0 \\ 0 & 0 & e^{5x} \end{pmatrix} \begin{pmatrix} f_1 \\ \bar{f}_1 \\ c_3 \end{pmatrix}.
Now
e^{(1+2i)x} \begin{pmatrix} -2+2i \\ -1+2i \\ 1 \end{pmatrix} = e^x(cos(2x) + i sin(2x)) \begin{pmatrix} -2+2i \\ -1+2i \\ 1 \end{pmatrix} = \begin{pmatrix} e^x(-2 cos(2x) - 2 sin(2x)) \\ e^x(-cos(2x) - 2 sin(2x)) \\ e^x cos(2x) \end{pmatrix} + i \begin{pmatrix} e^x(2 cos(2x) - 2 sin(2x)) \\ e^x(2 cos(2x) - sin(2x)) \\ e^x sin(2x) \end{pmatrix}
and of course
e^{5x} \begin{pmatrix} 4 \\ 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 4e^{5x} \\ 3e^{5x} \\ e^{5x} \end{pmatrix},
so, replacing the relevant columns of P M_Z, we find
Y = \begin{pmatrix} e^x(-2 cos(2x) - 2 sin(2x)) & e^x(2 cos(2x) - 2 sin(2x)) & 4e^{5x} \\ e^x(-cos(2x) - 2 sin(2x)) & e^x(2 cos(2x) - sin(2x)) & 3e^{5x} \\ e^x cos(2x) & e^x sin(2x) & e^{5x} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
= \begin{pmatrix} (-2c_1 + 2c_2)e^x cos(2x) + (-2c_1 - 2c_2)e^x sin(2x) + 4c_3 e^{5x} \\ (-c_1 + 2c_2)e^x cos(2x) + (-2c_1 - c_2)e^x sin(2x) + 3c_3 e^{5x} \\ c_1 e^x cos(2x) + c_2 e^x sin(2x) + c_3 e^{5x} \end{pmatrix}.
EXERCISES FOR SECTION 2.2
In Exercises 1–4:
(a) Solve the system Y' = AY.
(b) Solve the initial value problem Y' = AY, Y(0) = Y_0.
In Exercises 5 and 6, solve the system Y' = AY.
1. A = \begin{pmatrix} 3 & 5 \\ -2 & 5 \end{pmatrix}, det(λI - A) = λ^2 - 8λ + 25, and Y_0 = \begin{pmatrix} 8 \\ 13 \end{pmatrix}.
2. A = \begin{pmatrix} 3 & 4 \\ -2 & 7 \end{pmatrix}, det(λI - A) = λ^2 - 10λ + 29, and Y_0 = \begin{pmatrix} 3 \\ 5 \end{pmatrix}.
3. A = \begin{pmatrix} 5 & 13 \\ -1 & 9 \end{pmatrix}, det(λI - A) = λ^2 - 14λ + 58, and Y_0 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
4. A = \begin{pmatrix} 7 & 17 \\ -4 & 11 \end{pmatrix}, det(λI - A) = λ^2 - 18λ + 145, and Y_0 = \begin{pmatrix} 5 \\ 2 \end{pmatrix}.
5. A = \begin{pmatrix} 37 & 10 & 20 \\ -59 & -9 & -24 \\ -33 & -12 & -21 \end{pmatrix}, det(λI - A) = (λ^2 - 4λ + 29)(λ - 3).
6. A = \begin{pmatrix} -4 & -42 & 15 \\ 4 & 25 & -10 \\ 6 & 32 & -13 \end{pmatrix}, det(λI - A) = (λ^2 - 6λ + 13)(λ - 2).
2.3 INHOMOGENEOUS SYSTEMS WITH CONSTANT
COEFFICIENTS
In this section, we show how to solve an inhomogeneous system Y' = AY + G(x) where G(x) is a vector of functions. (We will often abbreviate G(x) by G.) We use a method that is a direct generalization of the method we used for solving a homogeneous system in Section 2.1.
Consider the matrix system
Y' = AY + G .
Step 1. Write A = PJP^{-1} with J in JCF, so the system becomes
Y' = (PJP^{-1})Y + G
Y' = PJ(P^{-1}Y) + G
P^{-1}Y' = J(P^{-1}Y) + P^{-1}G
(P^{-1}Y)' = J(P^{-1}Y) + P^{-1}G .
(Note that, since P^{-1} is a constant matrix, we have that (P^{-1}Y)' = P^{-1}Y'.)
Step 2. Set Z = P^{-1}Y and H = P^{-1}G, so this system becomes
Z' = JZ + H
and solve this system for Z.
Step 3. Since Z = P^{-1}Y, we have that
Y = PZ
is the solution to our original system.
Again, the key to this method is to be able to perform Step 2, and again this is straightforward. Within each Jordan block, we solve from the bottom up. Let us focus our attention on a single k-by-k block. The equation for the last function z_k in that block is an inhomogeneous first-order differential equation involving only z_k, and we go ahead and solve it. The equation for the next-to-last function z_{k-1} in that block is an inhomogeneous first-order differential equation involving only z_{k-1} and z_k. We substitute in our solution for z_k to obtain an inhomogeneous first-order differential equation for z_{k-1} involving only z_{k-1}, and we go ahead and solve it, etc.
In principle, this is the method we use. In practice, using this method directly is solving each system "by hand," and instead we choose to "automate" this procedure. This leads us to the following method. In order to develop this method we must begin with some preliminaries.
For a fixed matrix A, we say that the inhomogeneous system Y' = AY + G(x) has associated homogeneous system Y' = AY. By our previous work, we know how to find the general solution of Y' = AY. First we shall see that, in order to find the general solution of Y' = AY + G(x), it suffices to find a single solution of that system.
Lemma 3.1. Let Y_i be any solution of Y' = AY + G(x). If Y_h is any solution of the associated homogeneous system Y' = AY, then Y_h + Y_i is also a solution of Y' = AY + G(x), and every solution of Y' = AY + G(x) is of this form.
Consequently, the general solution of Y' = AY + G(x) is given by Y = Y_H + Y_i, where Y_H denotes the general solution of Y' = AY.
Proof. First we check that Y = Y_h + Y_i is a solution of Y' = AY + G(x). We simply compute
Y' = (Y_h + Y_i)' = Y_h' + Y_i' = (AY_h) + (AY_i + G) = A(Y_h + Y_i) + G = AY + G
as claimed.
Now we check that every solution Y of Y' = AY + G(x) is of this form. So let Y be any solution of this inhomogeneous system. We can certainly write Y = (Y - Y_i) + Y_i = Y_h + Y_i where Y_h = Y - Y_i. We need to show that Y_h defined in this way is indeed a solution of Y' = AY. Again we compute
Y_h' = (Y - Y_i)' = Y' - Y_i' = (AY + G) - (AY_i + G) = A(Y - Y_i) = AY_h
as claimed. □
(It is common to call Yi a particular solution of the inhomogeneous system.)
Let us now recall our work from Section 2.1, and keep our previous notation. The homogeneous system Y' = AY has general solution Y_H = P M_Z C where C is a vector of arbitrary constants. Let us set N_Y = N_Y(x) = P M_Z(x) for convenience, so Y_H = N_Y C. Then Y_H' = (N_Y C)' = N_Y' C, and then, substituting in the equation Y' = AY, we obtain the equation N_Y' C = A N_Y C. Since this equation must hold for any C, we conclude that
N_Y' = A N_Y .
We use this fact to write down a solution to Y' = AY + G. We will verify by direct computation that the function we write down is indeed a solution. This verification is not a difficult one, but nevertheless it is a fair question to ask how we came up with this function. Actually, it can be derived in a very natural way, but the explanation for this involves the matrix exponential and so we defer it until Section 2.4. Nevertheless, once we have this solution (no matter how we came up with it) we are certainly free to use it.
It is convenient to introduce the following nonstandard notation. For a vector H(x), we let ∫_0 H(x)dx denote an arbitrary but fixed antiderivative of H(x). In other words, in obtaining ∫_0 H(x)dx, we simply ignore the constants of integration. This is legitimate for our purposes, as by Lemma 3.1 we only need to find a single solution to an inhomogeneous system, and it doesn't matter which one we find—any one will do. (Otherwise said, we can "absorb" the constants of integration into the general solution of the associated homogeneous system.)
Theorem 3.2. The function
Y_i = N_Y ∫_0 N_Y^{-1} G dx
is a solution of the system Y' = AY + G.
Proof. We simply compute Y_i'. We have
Y_i' = ( N_Y ∫_0 N_Y^{-1} G dx )'
= N_Y' ∫_0 N_Y^{-1} G dx + N_Y ( ∫_0 N_Y^{-1} G dx )'   (by the product rule)
= N_Y' ∫_0 N_Y^{-1} G dx + N_Y (N_Y^{-1} G)   (by the definition of the antiderivative)
= N_Y' ∫_0 N_Y^{-1} G dx + G
= (A N_Y) ∫_0 N_Y^{-1} G dx + G   (as N_Y' = A N_Y)
= A ( N_Y ∫_0 N_Y^{-1} G dx ) + G
= A Y_i + G
as claimed. □
We now do a variety of examples: a 2-by-2 diagonalizable system, a 2-by-2 nondiagonalizable system, a 3-by-3 diagonalizable system, and a 2-by-2 system in which the characteristic polynomial has complex roots. In all these examples, when it comes to finding N_Y^{-1}, it is convenient to use the fact that N_Y^{-1} = (P M_Z)^{-1} = M_Z^{-1} P^{-1}.
Example 3.3. Consider the system
Y' = AY + G where A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix} and G = \begin{pmatrix} 30e^x \\ 60e^{2x} \end{pmatrix}.
We saw in Example 1.2 that
P = \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix} and M_Z = \begin{pmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{pmatrix},
and N_Y = P M_Z. Then
N_Y^{-1} G = \begin{pmatrix} e^{-3x} & 0 \\ 0 & e^{2x} \end{pmatrix} (1/5) \begin{pmatrix} 1 & -1 \\ -2 & 7 \end{pmatrix} \begin{pmatrix} 30e^x \\ 60e^{2x} \end{pmatrix} = \begin{pmatrix} 6e^{-2x} - 12e^{-x} \\ -12e^{3x} + 84e^{4x} \end{pmatrix}.
Then
∫_0 N_Y^{-1} G = \begin{pmatrix} -3e^{-2x} + 12e^{-x} \\ -4e^{3x} + 21e^{4x} \end{pmatrix}
and
Y_i = N_Y ∫_0 N_Y^{-1} G = \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{pmatrix} \begin{pmatrix} -3e^{-2x} + 12e^{-x} \\ -4e^{3x} + 21e^{4x} \end{pmatrix} = \begin{pmatrix} -25e^x + 105e^{2x} \\ -10e^x + 45e^{2x} \end{pmatrix}.
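For readers who want to check Example 3.3 symbolically, here is a minimal sketch assuming SymPy is available; it implements the formula of Theorem 3.2 directly:

    import sympy as sp

    x = sp.symbols('x')
    A = sp.Matrix([[5, -7], [2, -4]])
    G = sp.Matrix([30*sp.exp(x), 60*sp.exp(2*x)])
    P = sp.Matrix([[7, 1], [2, 1]])
    MZ = sp.diag(sp.exp(3*x), sp.exp(-2*x))
    NY = P * MZ

    # Theorem 3.2: a particular solution is Y_i = N_Y * (antiderivative of N_Y^{-1} G)
    Yi = sp.simplify(NY * (NY.inv() * G).integrate(x))
    print(Yi)                                    # -25 e^x + 105 e^{2x}, -10 e^x + 45 e^{2x}
    print(sp.simplify(Yi.diff(x) - A*Yi - G))    # zero vector, so Y_i' = A Y_i + G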
Example 3.4. Consider the system
Y' = AY + G where A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix} and G = \begin{pmatrix} 60e^{3x} \\ 72e^{5x} \end{pmatrix}.
We saw in Example 1.6 that
P = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} and M_Z = e^{2x} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix},
and N_Y = P M_Z. Then
N_Y^{-1} G = e^{-2x} \begin{pmatrix} 1 & -x \\ 0 & 1 \end{pmatrix} (1/4) \begin{pmatrix} 0 & -1 \\ 4 & -2 \end{pmatrix} \begin{pmatrix} 60e^{3x} \\ 72e^{5x} \end{pmatrix} = \begin{pmatrix} -18e^{3x} - 60xe^{x} + 36xe^{3x} \\ 60e^{x} - 36e^{3x} \end{pmatrix}.
Then
∫_0 N_Y^{-1} G = \begin{pmatrix} 60e^{x} - 60xe^{x} - 10e^{3x} + 12xe^{3x} \\ 60e^{x} - 12e^{3x} \end{pmatrix}
and
Y_i = N_Y ∫_0 N_Y^{-1} G = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} e^{2x} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 60e^{x} - 60xe^{x} - 10e^{3x} + 12xe^{3x} \\ 60e^{x} - 12e^{3x} \end{pmatrix} = \begin{pmatrix} -60e^{3x} + 8e^{5x} \\ -240e^{3x} + 40e^{5x} \end{pmatrix}.
Example 3.5. Consider the system
Y' = AY + G where A = \begin{pmatrix} 2 & -3 & -3 \\ 2 & -2 & -2 \\ -2 & 1 & 1 \end{pmatrix} and G = \begin{pmatrix} e^x \\ 12e^{3x} \\ 20e^{4x} \end{pmatrix}.
We saw in Example 1.3 that
P = \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix} and M_Z = \begin{pmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{pmatrix},
and N_Y = P M_Z. Then
N_Y^{-1} G = \begin{pmatrix} e^{x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{-2x} \end{pmatrix} \begin{pmatrix} 0 & 1 & 1 \\ 1 & -2 & -1 \\ -1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^x \\ 12e^{3x} \\ 20e^{4x} \end{pmatrix} = \begin{pmatrix} 12e^{4x} + 20e^{5x} \\ e^x - 24e^{3x} - 20e^{4x} \\ -e^{-x} + 12e^{x} + 20e^{2x} \end{pmatrix}.
Then
∫_0 N_Y^{-1} G = \begin{pmatrix} 3e^{4x} + 4e^{5x} \\ e^x - 8e^{3x} - 5e^{4x} \\ e^{-x} + 12e^{x} + 10e^{2x} \end{pmatrix}
and
Y_i = N_Y ∫_0 N_Y^{-1} G = \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{pmatrix} \begin{pmatrix} 3e^{4x} + 4e^{5x} \\ e^x - 8e^{3x} - 5e^{4x} \\ e^{-x} + 12e^{x} + 10e^{2x} \end{pmatrix} = \begin{pmatrix} -e^x - 9e^{3x} - 6e^{4x} \\ -2e^x - 4e^{3x} - 5e^{4x} \\ 2e^x + 7e^{3x} + 9e^{4x} \end{pmatrix}.
Example 3.6. Consider the system
Y' = AY + G where A = \begin{pmatrix} 2 & -17 \\ 1 & 4 \end{pmatrix} and G = \begin{pmatrix} 200 \\ 160e^x \end{pmatrix}.
We saw in Example 2.4 that
P = \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} and M_Z = \begin{pmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{pmatrix},
and N_Y = P M_Z. Then
N_Y^{-1} G = \begin{pmatrix} e^{-(3+4i)x} & 0 \\ 0 & e^{-(3-4i)x} \end{pmatrix} (1/(8i)) \begin{pmatrix} 1 & 1+4i \\ -1 & -1+4i \end{pmatrix} \begin{pmatrix} 200 \\ 160e^x \end{pmatrix} = \begin{pmatrix} -25i e^{(-3-4i)x} + 20(4-i)e^{(-2-4i)x} \\ 25i e^{(-3+4i)x} + 20(4+i)e^{(-2+4i)x} \end{pmatrix}.
Then
∫_0 N_Y^{-1} G = \begin{pmatrix} (4+3i)e^{(-3-4i)x} + (-4+18i)e^{(-2-4i)x} \\ (4-3i)e^{(-3+4i)x} + (-4-18i)e^{(-2+4i)x} \end{pmatrix}
and
Y_i = N_Y ∫_0 N_Y^{-1} G = \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{pmatrix} \begin{pmatrix} (4+3i)e^{(-3-4i)x} + (-4+18i)e^{(-2-4i)x} \\ (4-3i)e^{(-3+4i)x} + (-4-18i)e^{(-2+4i)x} \end{pmatrix}
= \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} (4+3i) + (-4+18i)e^x \\ (4-3i) + (-4-18i)e^x \end{pmatrix} = \begin{pmatrix} -32 - 136e^x \\ 8 - 8e^x \end{pmatrix}.
(Note that in this last example we could do arithmetic with complex numbers directly, i.e., without having to convert complex exponentials into real terms.)
Once we have done this work, it is straightforward to solve initial value problems. We do a
single example that illustrates this.
Example 3.7. Consider the initial value problem
Y' = AY + G, Y(0) = \begin{pmatrix} 7 \\ 17 \end{pmatrix}, where A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix} and G = \begin{pmatrix} 30e^x \\ 60e^{2x} \end{pmatrix}.
We saw in Example 1.2 that the associated homogeneous system has general solution
Y_H = \begin{pmatrix} 7c_1 e^{3x} + c_2 e^{-2x} \\ 2c_1 e^{3x} + c_2 e^{-2x} \end{pmatrix}
and in Example 3.3 that the original system has a particular solution
Y_i = \begin{pmatrix} -25e^x + 105e^{2x} \\ -10e^x + 45e^{2x} \end{pmatrix}.
Thus, our original system has general solution
Y = Y_H + Y_i = \begin{pmatrix} 7c_1 e^{3x} + c_2 e^{-2x} - 25e^x + 105e^{2x} \\ 2c_1 e^{3x} + c_2 e^{-2x} - 10e^x + 45e^{2x} \end{pmatrix}.
We apply the initial condition to obtain the linear system
Y(0) = \begin{pmatrix} 7c_1 + c_2 + 80 \\ 2c_1 + c_2 + 35 \end{pmatrix} = \begin{pmatrix} 7 \\ 17 \end{pmatrix}
with solution c_1 = -11, c_2 = 4. Substituting, we find that our initial value problem has solution
Y = \begin{pmatrix} -77e^{3x} + 4e^{-2x} - 25e^x + 105e^{2x} \\ -22e^{3x} + 4e^{-2x} - 10e^x + 45e^{2x} \end{pmatrix}.
EXERCISES FOR SECTION 2.3
In each exercise, find a particular solution Y_i of the system Y' = AY + G(x), where A is the matrix of the correspondingly numbered exercise for Section 2.1, and G(x) is as given.
1. G(x) = \begin{pmatrix} 2e^{8x} \\ 3e^{4x} \end{pmatrix}.
2. G(x) = \begin{pmatrix} 2e^{-7x} \\ 6e^{-8x} \end{pmatrix}.
3. G(x) = \begin{pmatrix} e^{4x} \\ 4e^{5x} \end{pmatrix}.
4. G(x) = \begin{pmatrix} e^{6x} \\ 9e^{8x} \end{pmatrix}.
5. G(x) = \begin{pmatrix} 9e^{10x} \\ 25e^{12x} \end{pmatrix}.
6. G(x) = \begin{pmatrix} 5e^{-x} \\ 12e^{2x} \end{pmatrix}.
7. G(x) = \begin{pmatrix} 1 \\ 3e^{2x} \\ 5e^{4x} \end{pmatrix}.
8. G(x) = \begin{pmatrix} 8 \\ 3e^{3x} \\ 3e^{5x} \end{pmatrix}.
2.4 THE MATRIX EXPONENTIAL
In this section, we will discuss the matrix exponential and its use in solving systems Y' = AY.
Our first task is to ask what it means to take a matrix exponential. To answer this, we are guided by ordinary exponentials. Recall that, for any complex number z, the exponential e^z is given by
e^z = 1 + z + z^2/2! + z^3/3! + z^4/4! + \cdots .
With this in mind, we define the matrix exponential as follows.
Definition 4.1. Let T be a square matrix. Then the matrix exponential e^T is defined by
e^T = I + T + (1/2!)T^2 + (1/3!)T^3 + (1/4!)T^4 + \cdots .
(For this definition to make sense we need to know that this series always converges, and it does.)
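The convergence is also easy to observe experimentally; the following sketch (assuming NumPy and SciPy are available) compares partial sums of the defining series with SciPy's built-in matrix exponential for a sample matrix:

    import numpy as np
    from scipy.linalg import expm

    T = np.array([[5.0, -7.0], [2.0, -4.0]])   # any square matrix will do

    S, term = np.eye(2), np.eye(2)
    for k in range(1, 30):                     # partial sum I + T + T^2/2! + ... + T^29/29!
        term = term @ T / k
        S = S + term
    print(np.allclose(S, expm(T)))             # expected: True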
Recall that the differential equation y' = ay has the solution y = ce^{ax}. The situation for Y' = AY is very analogous. (Note that we use Γ rather than C to denote a vector of constants for reasons that will become clear a little later. Note that Γ is on the right in Theorem 4.2 below, a consequence of the fact that matrix multiplication is not commutative.)
Theorem 4.2.
(1) Let A be a square matrix. Then the general solution of
Y' = AY
is given by
Y = e^{Ax} Γ
where Γ is a vector of arbitrary constants.
(2) The initial value problem
Y' = AY, Y(0) = Y_0
has solution
Y = e^{Ax} Y_0 .
Proof. (Outline) (1) We first compute e^{Ax}. In order to do so, note that (Ax)^2 = (Ax)(Ax) = (AA)(xx) = A^2 x^2 as matrix multiplication commutes with scalar multiplication, and (Ax)^3 = (Ax)^2(Ax) = (A^2 x^2)(Ax) = (A^2 A)(x^2 x) = A^3 x^3, and similarly, (Ax)^k = A^k x^k for any k. Then, substituting in Definition 4.1, we have that
Y = e^{Ax} Γ = (I + Ax + (1/2!)A^2 x^2 + (1/3!)A^3 x^3 + (1/4!)A^4 x^4 + \cdots) Γ .
To find Y', we may differentiate this series term-by-term. (This claim requires proof, but we shall not give it here.) Remembering that A and Γ are constant matrices, we see that
Y' = (A + (1/2!)A^2 (2x) + (1/3!)A^3 (3x^2) + (1/4!)A^4 (4x^3) + \cdots) Γ
= (A + A^2 x + (1/2!)A^3 x^2 + (1/3!)A^4 x^3 + \cdots) Γ
= A(I + Ax + (1/2!)A^2 x^2 + (1/3!)A^3 x^3 + \cdots) Γ
= A(e^{Ax} Γ) = AY
as claimed.
(2) By (1) we know that Y' = AY has solution Y = e^{Ax} Γ. We use the initial condition to solve for Γ. Setting x = 0, we have:
Y_0 = Y(0) = e^{A·0} Γ = e^{0} Γ = I Γ = Γ
(where e^0 means the exponential of the zero matrix, and the value of this is the identity matrix I, as is apparent from Definition 4.1), so Γ = Y_0 and Y = e^{Ax} Γ = e^{Ax} Y_0. □
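A numerical illustration of Theorem 4.2, using the matrix and initial vector of Example 1.10 (a sketch assuming NumPy and SciPy are available):

    import numpy as np
    from scipy.linalg import expm
    from scipy.integrate import solve_ivp

    A = np.array([[0.0, 1.0], [-4.0, 4.0]])
    Y0 = np.array([3.0, -8.0])

    # integrate Y' = AY numerically and compare with Y(1) = e^{A*1} Y0 from Theorem 4.2
    sol = solve_ivp(lambda x, Y: A @ Y, (0.0, 1.0), Y0, rtol=1e-10, atol=1e-12)
    print(sol.y[:, -1])
    print(expm(A * 1.0) @ Y0)   # the two printed vectors should agree to high accuracy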
In the remainder of this section we shall see how to translate the theoretical solution of
Y' = AY given by Theorem 4.2 into a practical one. To keep our notation simple, we will stick to
2-by-2 or 3-by-3 cases, but the principle is the same regardless of the size of the matrix.
One case is relatively easy.
Lemma 4.3. If J is a diagonal matrix,
J = \begin{pmatrix} d_1 & & & \\ & d_2 & & \\ & & \ddots & \\ & & & d_n \end{pmatrix}
(with all off-diagonal entries 0), then e^{Jx} is the diagonal matrix
e^{Jx} = \begin{pmatrix} e^{d_1 x} & & & \\ & e^{d_2 x} & & \\ & & \ddots & \\ & & & e^{d_n x} \end{pmatrix}.
Proof. Suppose, for simplicity, that J is 2-by-2,
J = \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}.
Then you can easily compute that J^2 = \begin{pmatrix} d_1^2 & 0 \\ 0 & d_2^2 \end{pmatrix}, J^3 = \begin{pmatrix} d_1^3 & 0 \\ 0 & d_2^3 \end{pmatrix}, and similarly, J^k = \begin{pmatrix} d_1^k & 0 \\ 0 & d_2^k \end{pmatrix} for any k.
Then, as in the proof of Theorem 4.2,
e^{Jx} = I + Jx + (1/2!)J^2 x^2 + (1/3!)J^3 x^3 + (1/4!)J^4 x^4 + \cdots
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix} x + (1/2!) \begin{pmatrix} d_1^2 & 0 \\ 0 & d_2^2 \end{pmatrix} x^2 + (1/3!) \begin{pmatrix} d_1^3 & 0 \\ 0 & d_2^3 \end{pmatrix} x^3 + \cdots
= \begin{pmatrix} 1 + d_1 x + (1/2!)(d_1 x)^2 + (1/3!)(d_1 x)^3 + \cdots & 0 \\ 0 & 1 + d_2 x + (1/2!)(d_2 x)^2 + (1/3!)(d_2 x)^3 + \cdots \end{pmatrix}
which we recognize as
= \begin{pmatrix} e^{d_1 x} & 0 \\ 0 & e^{d_2 x} \end{pmatrix}. □
Example 4.4. We wish to find the general solution of Y' = JY where
J = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}.
To do so we directly apply Theorem 4.2 and Lemma 4.3. The solution is given by
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = Y = e^{Jx} Γ = \begin{pmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{pmatrix} \begin{pmatrix} γ_1 \\ γ_2 \end{pmatrix} = \begin{pmatrix} γ_1 e^{3x} \\ γ_2 e^{-2x} \end{pmatrix}.
Now suppose we want to find the general solution of Y' = AY where A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix}. We may still apply Theorem 4.2 to conclude that the solution is Y = e^{Ax} Γ. We again try to calculate e^{Ax}. Now we find
A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix}, A^2 = \begin{pmatrix} 11 & -7 \\ 2 & 2 \end{pmatrix}, A^3 = \begin{pmatrix} 41 & -49 \\ 14 & -22 \end{pmatrix}, \ldots
so
e^{Ax} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix} x + (1/2!) \begin{pmatrix} 11 & -7 \\ 2 & 2 \end{pmatrix} x^2 + (1/3!) \begin{pmatrix} 41 & -49 \\ 14 & -22 \end{pmatrix} x^3 + \cdots ,
which looks like a hopeless mess. But, in fact, the situation is not so hard!
Lemma 4.5. Let S and T be two matrices and suppose
S = P T P^{-1}
for some invertible matrix P. Then
S^k = P T^k P^{-1} for every k
and
e^S = P e^T P^{-1} .
Proof. We simply compute
S^2 = SS = (P T P^{-1})(P T P^{-1}) = P T (P^{-1} P) T P^{-1} = P T I T P^{-1} = P T T P^{-1} = P T^2 P^{-1},
S^3 = S^2 S = (P T^2 P^{-1})(P T P^{-1}) = P T^2 (P^{-1} P) T P^{-1} = P T^2 I T P^{-1} = P T^2 T P^{-1} = P T^3 P^{-1},
S^4 = S^3 S = (P T^3 P^{-1})(P T P^{-1}) = P T^3 (P^{-1} P) T P^{-1} = P T^3 I T P^{-1} = P T^3 T P^{-1} = P T^4 P^{-1},
etc.
Then
e^S = I + S + (1/2!)S^2 + (1/3!)S^3 + (1/4!)S^4 + \cdots
= P I P^{-1} + P T P^{-1} + (1/2!)P T^2 P^{-1} + (1/3!)P T^3 P^{-1} + (1/4!)P T^4 P^{-1} + \cdots
= P(I + T + (1/2!)T^2 + (1/3!)T^3 + (1/4!)T^4 + \cdots)P^{-1}
= P e^T P^{-1}
as claimed. □
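A quick numerical check of Lemma 4.5, using the matrices that appear in Example 4.6 below (a sketch assuming NumPy and SciPy are available):

    import numpy as np
    from scipy.linalg import expm

    T = np.diag([3.0, -2.0])
    P = np.array([[7.0, 1.0], [2.0, 1.0]])
    S = P @ T @ np.linalg.inv(P)               # S = P T P^{-1}

    print(np.allclose(expm(S), P @ expm(T) @ np.linalg.inv(P)))   # expected: True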
With this in hand let us return to our problem.
Example 4.6. (Compare Example 1.2.) We wish to find the general solution of Y' = AY where
A = \begin{pmatrix} 5 & -7 \\ 2 & -4 \end{pmatrix}.
We saw in Example 1.16 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix} and J = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}.
Then
e^{Ax} = P e^{Jx} P^{-1} = \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{pmatrix} \begin{pmatrix} 7 & 1 \\ 2 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} (7/5)e^{3x} - (2/5)e^{-2x} & -(7/5)e^{3x} + (7/5)e^{-2x} \\ (2/5)e^{3x} - (2/5)e^{-2x} & -(2/5)e^{3x} + (7/5)e^{-2x} \end{pmatrix}
and
Y = e^{Ax} Γ = e^{Ax} \begin{pmatrix} γ_1 \\ γ_2 \end{pmatrix} = \begin{pmatrix} ((7/5)γ_1 - (7/5)γ_2)e^{3x} + (-(2/5)γ_1 + (7/5)γ_2)e^{-2x} \\ ((2/5)γ_1 - (2/5)γ_2)e^{3x} + (-(2/5)γ_1 + (7/5)γ_2)e^{-2x} \end{pmatrix}.
Example 4.7. (Compare Example 1.3.) We wish to find the general solution of Y' = AY where
A = \begin{pmatrix} 2 & -3 & -3 \\ 2 & -2 & -2 \\ -2 & 1 & 1 \end{pmatrix}.
We saw in Example 2.23 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix} and J = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
Then
e^{Ax} = P e^{Jx} P^{-1} = \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{pmatrix} \begin{pmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{pmatrix}^{-1}
= \begin{pmatrix} e^{2x} & e^{-x} - e^{2x} & e^{-x} - e^{2x} \\ -1 + e^{2x} & 2 - e^{2x} & 1 - e^{2x} \\ 1 - e^{2x} & e^{-x} - 2 + e^{2x} & e^{-x} - 1 + e^{2x} \end{pmatrix}
and
Y = e^{Ax} Γ = e^{Ax} \begin{pmatrix} γ_1 \\ γ_2 \\ γ_3 \end{pmatrix} = \begin{pmatrix} (γ_2 + γ_3)e^{-x} + (γ_1 - γ_2 - γ_3)e^{2x} \\ (-γ_1 + 2γ_2 + γ_3) + (γ_1 - γ_2 - γ_3)e^{2x} \\ (γ_2 + γ_3)e^{-x} + (γ_1 - 2γ_2 - γ_3) + (-γ_1 + γ_2 + γ_3)e^{2x} \end{pmatrix}.
Now suppose we want to solve the initial value problem Y' = AY, Y(0) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. Then
Y = e^{Ax} Y(0) = \begin{pmatrix} e^{2x} & e^{-x} - e^{2x} & e^{-x} - e^{2x} \\ -1 + e^{2x} & 2 - e^{2x} & 1 - e^{2x} \\ 1 - e^{2x} & e^{-x} - 2 + e^{2x} & e^{-x} - 1 + e^{2x} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} e^{2x} \\ -1 + e^{2x} \\ 1 - e^{2x} \end{pmatrix}.
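A symbolic check of Example 4.7 (a sketch assuming SymPy is available):

    import sympy as sp

    x = sp.symbols('x')
    A = sp.Matrix([[2, -3, -3], [2, -2, -2], [-2, 1, 1]])
    P = sp.Matrix([[1, 0, -1], [0, -1, -1], [1, 1, 1]])

    eAx = sp.simplify(P * sp.diag(sp.exp(-x), 1, sp.exp(2*x)) * P.inv())
    print(eAx.subs(x, 0))                       # the identity matrix, as e^{A*0} = I
    print(sp.simplify(eAx.diff(x) - A*eAx))     # zero matrix, so (e^{Ax})' = A e^{Ax}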
Remark 4.8. Let us compare the results of our method here with that of our previous method. In the case of Example 4.6, our previous method gives the solution
Y = P \begin{pmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{pmatrix} C = P e^{Jx} C where J = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix},
while our method here gives
Y = P e^{Jx} P^{-1} Γ .
But note that these answers are really the same! For P^{-1} is a constant matrix, so if Γ is a vector of arbitrary constants, then so is P^{-1} Γ, and we simply set C = P^{-1} Γ.
Similarly, in the case of Example 4.7, our previous method gives the solution
Y = P \begin{pmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{pmatrix} C = P e^{Jx} C where J = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix},
while our method here gives
Y = P e^{Jx} P^{-1} Γ
and again, setting C = P^{-1} Γ, we see that these answers are the same.
While these two methods are in principle the same, we may ask which is preferable in practice. In this regard we see that our earlier method is better, as the use of the matrix exponential requires us to find P^{-1}, which may be a considerable amount of work. However, this advantage is (partially) negated if we wish to solve initial value problems, as the matrix exponential method immediately gives the unknown constants Γ, as Γ = Y(0), while in the former method we must solve a linear system to obtain the unknown constants C.
Now let us consider the nondiagonalizable case. Suppose Z' = JZ where J is a matrix consisting of a single Jordan block. Then by Theorem 4.2 this has the solution Z = e^{Jx} Γ. On the other hand, in Theorem 1.1 we already saw that this system has solution Z = M_Z C. In this case, we simply have C = Γ, so we must have e^{Jx} = M_Z. Let us see that this is true by computing e^{Jx} directly.
Theorem 4.9. Let J be a k-by-k Jordan block with eigenvalue a,
J = \begin{pmatrix} a & 1 & & & \\ & a & 1 & & \\ & & \ddots & \ddots & \\ & & & a & 1 \\ & & & & a \end{pmatrix}
(with all other entries 0). Then
e^{Jx} = e^{ax} \begin{pmatrix} 1 & x & x^2/2! & x^3/3! & \cdots & x^{k-1}/(k-1)! \\ & 1 & x & x^2/2! & \cdots & x^{k-2}/(k-2)! \\ & & 1 & x & \cdots & x^{k-3}/(k-3)! \\ & & & \ddots & \ddots & \vdots \\ & & & & 1 & x \\ & & & & & 1 \end{pmatrix}.
Proof. First suppose that J is a 2-by-2 Jordan block,
J = \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix}.
Then J^2 = \begin{pmatrix} a^2 & 2a \\ 0 & a^2 \end{pmatrix}, J^3 = \begin{pmatrix} a^3 & 3a^2 \\ 0 & a^3 \end{pmatrix}, J^4 = \begin{pmatrix} a^4 & 4a^3 \\ 0 & a^4 \end{pmatrix}, \ldots
so
e^{Jx} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix} x + (1/2!) \begin{pmatrix} a^2 & 2a \\ 0 & a^2 \end{pmatrix} x^2 + (1/3!) \begin{pmatrix} a^3 & 3a^2 \\ 0 & a^3 \end{pmatrix} x^3 + (1/4!) \begin{pmatrix} a^4 & 4a^3 \\ 0 & a^4 \end{pmatrix} x^4 + \cdots = \begin{pmatrix} m_{11} & m_{12} \\ 0 & m_{22} \end{pmatrix},
and we see that
m_{11} = m_{22} = 1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + (1/4!)(ax)^4 + (1/5!)(ax)^5 + \cdots = e^{ax},
and
m_{12} = x + ax^2 + (1/2!)a^2 x^3 + (1/3!)a^3 x^4 + (1/4!)a^4 x^5 + \cdots = x(1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + (1/4!)(ax)^4 + \cdots) = xe^{ax},
and so we conclude that
e^{Jx} = \begin{pmatrix} e^{ax} & xe^{ax} \\ 0 & e^{ax} \end{pmatrix} = e^{ax} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}.
Next suppose that J is a 3-by-3 Jordan block,
J = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix}.
Then J^2 = \begin{pmatrix} a^2 & 2a & 1 \\ 0 & a^2 & 2a \\ 0 & 0 & a^2 \end{pmatrix}, J^3 = \begin{pmatrix} a^3 & 3a^2 & 3a \\ 0 & a^3 & 3a^2 \\ 0 & 0 & a^3 \end{pmatrix}, J^4 = \begin{pmatrix} a^4 & 4a^3 & 6a^2 \\ 0 & a^4 & 4a^3 \\ 0 & 0 & a^4 \end{pmatrix}, J^5 = \begin{pmatrix} a^5 & 5a^4 & 10a^3 \\ 0 & a^5 & 5a^4 \\ 0 & 0 & a^5 \end{pmatrix}, \ldots
so
e^{Jx} = I + Jx + (1/2!)J^2 x^2 + (1/3!)J^3 x^3 + (1/4!)J^4 x^4 + (1/5!)J^5 x^5 + \cdots = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ 0 & m_{22} & m_{23} \\ 0 & 0 & m_{33} \end{pmatrix},
and we see that
m_{11} = m_{22} = m_{33} = 1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + (1/4!)(ax)^4 + (1/5!)(ax)^5 + \cdots = e^{ax},
and
m_{12} = m_{23} = x + ax^2 + (1/2!)a^2 x^3 + (1/3!)a^3 x^4 + (1/4!)a^4 x^5 + \cdots = xe^{ax}
as we saw in the 2-by-2 case. Finally,
m_{13} = (1/2!)x^2 + (1/2!)ax^3 + (1/2!)((1/2!)a^2 x^4) + (1/2!)((1/3!)a^3 x^5) + \cdots
(as 6/4! = 1/4 = (1/2!)(1/2!) and 10/5! = 1/12 = (1/2!)(1/3!), etc.)
= (1/2!)x^2 (1 + ax + (1/2!)(ax)^2 + (1/3!)(ax)^3 + \cdots) = (1/2!)x^2 e^{ax},
so
e^{Jx} = \begin{pmatrix} e^{ax} & xe^{ax} & (1/2!)x^2 e^{ax} \\ 0 & e^{ax} & xe^{ax} \\ 0 & 0 & e^{ax} \end{pmatrix} = e^{ax} \begin{pmatrix} 1 & x & x^2/2! \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix},
and similarly, for larger Jordan blocks. □
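A numerical check of Theorem 4.9 for a sample 3-by-3 Jordan block (a sketch assuming NumPy and SciPy are available):

    import numpy as np
    from scipy.linalg import expm

    a, x = 2.0, 0.5                            # sample eigenvalue and sample value of x
    J = np.array([[a, 1.0, 0.0],
                  [0.0, a, 1.0],
                  [0.0, 0.0, a]])              # a 3-by-3 Jordan block

    closed = np.exp(a*x) * np.array([[1.0, x, x**2/2],
                                     [0.0, 1.0, x],
                                     [0.0, 0.0, 1.0]])
    print(np.allclose(expm(J*x), closed))      # expected: True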
Let us see how to apply this theorem in a couple of examples.
Example 4.10. (Compare Examples 1.6 and 1.10.) Consider the system
Y' = AY where A = \begin{pmatrix} 0 & 1 \\ -4 & 4 \end{pmatrix}.
Also, consider the initial value problem Y' = AY, Y(0) = \begin{pmatrix} 3 \\ -8 \end{pmatrix}.
We saw in Example 2.12 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} and J = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.
Then
e^{Ax} = P e^{Jx} P^{-1} = \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix} \begin{pmatrix} e^{2x} & xe^{2x} \\ 0 & e^{2x} \end{pmatrix} \begin{pmatrix} -2 & 1 \\ -4 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} (1-2x)e^{2x} & xe^{2x} \\ -4xe^{2x} & (1+2x)e^{2x} \end{pmatrix},
and so
Y = e^{Ax} Γ = e^{Ax} \begin{pmatrix} γ_1 \\ γ_2 \end{pmatrix} = \begin{pmatrix} γ_1 e^{2x} + (-2γ_1 + γ_2)xe^{2x} \\ γ_2 e^{2x} + (-4γ_1 + 2γ_2)xe^{2x} \end{pmatrix}.
The initial value problem has solution
Y = e^{Ax} Y_0 = e^{Ax} \begin{pmatrix} 3 \\ -8 \end{pmatrix} = \begin{pmatrix} 3e^{2x} - 14xe^{2x} \\ -8e^{2x} - 28xe^{2x} \end{pmatrix}.
Example 4.11. (Compare Examples 1.7 and 1.11.) Consider the system
Y' = AY where A = \begin{pmatrix} 2 & 1 & 1 \\ 2 & 1 & -2 \\ -1 & 0 & -2 \end{pmatrix}.
Also, consider the initial value problem Y' = AY, Y(0) = \begin{pmatrix} 8 \\ 32 \\ 5 \end{pmatrix}.
We saw in Example 2.25 in Chapter 1 that A = PJP^{-1} with
P = \begin{pmatrix} 1 & 0 & -5 \\ -2 & 0 & -6 \\ -1 & 1 & 1 \end{pmatrix} and J = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 3 \end{pmatrix}.
Then
e^{Ax} = P e^{Jx} P^{-1} = \begin{pmatrix} 1 & 0 & -5 \\ -2 & 0 & -6 \\ -1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^{-x} & xe^{-x} & 0 \\ 0 & e^{-x} & 0 \\ 0 & 0 & e^{3x} \end{pmatrix} \begin{pmatrix} 1 & 0 & -5 \\ -2 & 0 & -6 \\ -1 & 1 & 1 \end{pmatrix}^{-1}
= \begin{pmatrix} (3/8)e^{-x} + (1/2)xe^{-x} + (5/8)e^{3x} & -(5/16)e^{-x} - (1/4)xe^{-x} + (5/16)e^{3x} & xe^{-x} \\ -(3/4)e^{-x} - xe^{-x} + (3/4)e^{3x} & (5/8)e^{-x} + (1/2)xe^{-x} + (3/8)e^{3x} & -2xe^{-x} \\ (1/8)e^{-x} - (1/2)xe^{-x} - (1/8)e^{3x} & (1/16)e^{-x} + (1/4)xe^{-x} - (1/16)e^{3x} & e^{-x} - xe^{-x} \end{pmatrix}
and so
Y = e^{Ax} Γ = e^{Ax} \begin{pmatrix} γ_1 \\ γ_2 \\ γ_3 \end{pmatrix} = \begin{pmatrix} ((3/8)γ_1 - (5/16)γ_2)e^{-x} + ((1/2)γ_1 - (1/4)γ_2 + γ_3)xe^{-x} + ((5/8)γ_1 + (5/16)γ_2)e^{3x} \\ (-(3/4)γ_1 + (5/8)γ_2)e^{-x} + (-γ_1 + (1/2)γ_2 - 2γ_3)xe^{-x} + ((3/4)γ_1 + (3/8)γ_2)e^{3x} \\ ((1/8)γ_1 + (1/16)γ_2 + γ_3)e^{-x} + (-(1/2)γ_1 + (1/4)γ_2 - γ_3)xe^{-x} + (-(1/8)γ_1 - (1/16)γ_2)e^{3x} \end{pmatrix}.
The initial value problem has solution
Y = e^{Ax} Y_0 = e^{Ax} \begin{pmatrix} 8 \\ 32 \\ 5 \end{pmatrix} = \begin{pmatrix} -7e^{-x} + xe^{-x} + 15e^{3x} \\ 14e^{-x} - 2xe^{-x} + 18e^{3x} \\ 8e^{-x} - xe^{-x} - 3e^{3x} \end{pmatrix}.
Now we solve Y' = AY in an example where the matrix A has complex eigenvalues. As you
will see, our method is exactly the same.
Example 4.12. (Compare Example 2.4.) Consider the system
Y' = AY where A = \begin{pmatrix} 2 & -17 \\ 1 & 4 \end{pmatrix}.
We saw in Example 2.4 that A = PJP^{-1} with
P = \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} and J = \begin{pmatrix} 3+4i & 0 \\ 0 & 3-4i \end{pmatrix}.
Then
e^{Ax} = P e^{Jx} P^{-1} = \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{pmatrix} \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix}^{-1}
= \begin{pmatrix} -1+4i & -1-4i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{pmatrix} (1/(8i)) \begin{pmatrix} 1 & 1+4i \\ -1 & -1+4i \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix}
where
m_{11} = (1/(8i))((-1+4i)e^{(3+4i)x} + (-1-4i)(-e^{(3-4i)x}))
= (1/(8i))((-1+4i)e^{3x}(cos(4x) + i sin(4x)) - (-1-4i)e^{3x}(cos(4x) - i sin(4x)))
= (1/(8i))(ie^{3x}(4 cos(4x) - sin(4x))(2)) = e^{3x}(cos(4x) - (1/4) sin(4x)),
m_{12} = (1/(8i))((-1+4i)(1+4i)e^{(3+4i)x} + (-1-4i)(-1+4i)e^{(3-4i)x})
= (1/(8i))(ie^{3x}(-17 sin(4x))(2)) = e^{3x}((-17/4) sin(4x)),
m_{21} = (1/(8i))(e^{(3+4i)x} - e^{(3-4i)x})
= (1/(8i))(ie^{3x}(sin(4x))(2)) = e^{3x}((1/4) sin(4x)),
m_{22} = (1/(8i))((1+4i)e^{(3+4i)x} + (-1+4i)e^{(3-4i)x})
= (1/(8i))((1+4i)e^{3x}(cos(4x) + i sin(4x)) + (-1+4i)e^{3x}(cos(4x) - i sin(4x)))
= (1/(8i))(ie^{3x}(4 cos(4x) + sin(4x))(2)) = e^{3x}(cos(4x) + (1/4) sin(4x)) .
Thus,
e^{Ax} = e^{3x} \begin{pmatrix} cos(4x) - (1/4) sin(4x) & -(17/4) sin(4x) \\ (1/4) sin(4x) & cos(4x) + (1/4) sin(4x) \end{pmatrix}
and
Y = e^{Ax} Γ = \begin{pmatrix} γ_1 e^{3x} cos(4x) + (-(1/4)γ_1 - (17/4)γ_2)e^{3x} sin(4x) \\ γ_2 e^{3x} cos(4x) + ((1/4)γ_1 + (1/4)γ_2)e^{3x} sin(4x) \end{pmatrix}.
Remark 4.13. Our procedure in this section is essentially that of Remark 1.12. (Compare Exam-
ple 4.10 with Example 1.13.)
Remark 4.14. As we have seen, for a matrix J in JCF, eJx = MZ, in the notation of Section 2.1.
But also, in the notation of Section 2.1, if A = PJP−1, then eAx = PeJxP −1 = PMZP −1 = MY .
Remark 4.15. Now let us see how to use the matrix exponential to solve an inhomogeneous system Y' = AY + G(x). Since we already know how to solve homogeneous systems, we need only, by Lemma 3.1, find a (single) particular solution Y_i of this inhomogeneous system, and that is what we do. We shall again use our notation from Section 2.3, that ∫_0 H(x)dx denotes an arbitrary (but fixed) antiderivative of H(x).
Thus, consider Y' = AY + G(x). Then, proceeding analogously as for an ordinary first-order linear differential equation, we have
Y' = AY + G(x)
Y' - AY = G(x)
and, multiplying this equation by the integrating factor e^{-Ax}, we obtain
e^{-Ax}(Y' - AY) = e^{-Ax} G(x)
(e^{-Ax} Y)' = e^{-Ax} G(x)
with solution
e^{-Ax} Y_i = ∫_0 e^{-Ax} G(x)
Y_i = e^{Ax} ∫_0 e^{-Ax} G(x) .
Let us compare this with the solution we found in Theorem 3.2. By Remark 4.14, we can rewrite this solution as Y_i = M_Y ∫_0 M_Y^{-1} G(x). This is almost, but not quite, what we had in Theorem 3.2. There we had the solution Y_i = N_Y ∫_0 N_Y^{-1} G(x), where N_Y = P M_Z. But these solutions are the same, as M_Y = P M_Z P^{-1} = N_Y P^{-1}. Then M_Y^{-1} = P M_Z^{-1} P^{-1} and N_Y^{-1} = M_Z^{-1} P^{-1}, so M_Y^{-1} = P N_Y^{-1}. Substituting, we find
Y_i = M_Y ∫_0 M_Y^{-1} G(x) = N_Y P^{-1} ∫_0 P N_Y^{-1} G(x) ,
and, since P is a constant matrix, we may bring it outside the integral to obtain
Y_i = N_Y P^{-1} P ∫_0 N_Y^{-1} G(x) = N_Y ∫_0 N_Y^{-1} G(x)
as claimed.
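A symbolic sketch of this integrating-factor formula, applied to the data of Example 3.3 (it assumes SymPy is available; up to simplification it should reproduce the particular solution found there):

    import sympy as sp

    x = sp.symbols('x')
    A = sp.Matrix([[5, -7], [2, -4]])
    G = sp.Matrix([30*sp.exp(x), 60*sp.exp(2*x)])

    eAx = (A*x).exp()                                    # symbolic matrix exponential e^{Ax}
    Yi = sp.simplify(eAx * ((-A*x).exp() * G).integrate(x))
    print(sp.simplify(Yi.diff(x) - A*Yi - G))            # zero vector: Y_i is a particular solution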
Remark 4.16. In applying this method we must compute M_Z^{-1} = (e^{Jx})^{-1} = e^{-Jx} = e^{J(-x)}, and, as an aid to calculation, it is convenient to make the following observation. Suppose, for simplicity, that J consists of a single Jordan block. Then we compute: in the 1-by-1 case,
(e^{ax})^{-1} = e^{-ax} ;
in the 2-by-2 case,
( e^{ax} \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} )^{-1} = e^{-ax} \begin{pmatrix} 1 & -x \\ 0 & 1 \end{pmatrix} ;
in the 3-by-3 case,
( e^{ax} \begin{pmatrix} 1 & x & x^2/2! \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} )^{-1} = e^{-ax} \begin{pmatrix} 1 & -x & (-x)^2/2! \\ 0 & 1 & -x \\ 0 & 0 & 1 \end{pmatrix} = e^{-ax} \begin{pmatrix} 1 & -x & x^2/2! \\ 0 & 1 & -x \\ 0 & 0 & 1 \end{pmatrix}, etc.
EXERCISES FOR SECTION 2.4
In each exercise:
(a) Find e^{Ax} and the solution Y = e^{Ax} Γ of Y' = AY.
(b) Use part (a) to solve the initial value problem Y' = AY, Y(0) = Y_0.
Exercises 1–24: In Exercise n, for 1 ≤ n ≤ 20, the matrix A and the initial vector Y0 are the
same as in Exercise n of Section 2.1. In Exercise n, for 21 ≤ n ≤ 24, the matrix A and the initial
vector Y0 are the same as in Exercise n − 20 of Section 2.2.
A P P E N D I X A
Background Results
A.1 BASES, COORDINATES, AND MATRICES
In this section of the Appendix,we review the basic facts on bases for vector spaces and on coordinates
for vectors and matrices for linear transformations.Then we use these to (re)prove some of the results
in Chapter 1.
First we see how to represent vectors, once we have chosen a basis.
Theorem 1.1. Let V be a vector space and let B = {v1, . . . , vn} be a basis of V . Then any vector v in
V can be written as v = c1v1 + . . . + cnvn in a unique way.
This theorem leads to the following definition.
Definition 1.2. Let V be a vector space and let B = {v_1, . . . , v_n} be a basis of V. Let v be a vector in V and write v = c_1 v_1 + . . . + c_n v_n. Then the vector
[v]_B = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}
is the coordinate vector of v in the basis B.
Remark 1.3. In particular, we may take V = C^n and consider the standard basis E = {e_1, . . . , e_n} where
e_i = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \end{pmatrix},
with 1 in the ith position, and 0 elsewhere.
Then, if
v = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_{n-1} \\ c_n \end{pmatrix},
we see that
v = c_1 \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix} + . . . + c_n \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} = c_1 e_1 + . . . + c_n e_n
so we then see that
[v]_E = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_{n-1} \\ c_n \end{pmatrix}.
(In other words, a vector in C^n "looks like" itself in the standard basis.)
Next we see how to represent linear transformations, once we have chosen a basis.
Theorem 1.4. Let V be a vector space and let B = {v_1, . . . , v_n} be a basis of V. Let T : V −→ V be a linear transformation. Then there is a unique matrix [T]_B such that, for any vector v in V,
[T(v)]_B = [T]_B [v]_B .
Furthermore, the matrix [T]_B is given by
[T]_B = ( [T(v_1)]_B  [T(v_2)]_B  . . .  [T(v_n)]_B ).
Similarly, this theorem leads to the following definition.
Definition 1.5. Let V be a vector space and let B = {v1, . . . , vn} be a basis of V . Let T : V −→ V
be a linear transformation. Let [T ]B be the matrix defined in Theorem 1.4.Then [T ]B is the matrix
of the linear transformation T in the basis B.
Remark 1.6. In particular, we may take V = Cn and consider the standard basis E = {e1, . . . , en}.
Let A be an n-by-n square matrix and write A = ( a_1  a_2  . . .  a_n ). If T_A is the linear transformation given by T_A(v) = Av, then
[T_A]_E = ( [T_A(e_1)]_E  [T_A(e_2)]_E  . . .  [T_A(e_n)]_E )
= ( [Ae_1]_E  [Ae_2]_E  . . .  [Ae_n]_E )
= ( [a_1]_E  [a_2]_E  . . .  [a_n]_E )
= ( a_1  a_2  . . .  a_n )
= A .
(In other words, the linear transformation given by multiplication by a matrix “looks like” that same
matrix in the standard basis.)
What is essential to us is the ability to compare the situation in different bases. To that end,
we have the following theorem.
Theorem 1.7. Let V be a vector space, and let B = {v_1, . . . , v_n} and C = {w_1, . . . , w_n} be two bases of V. Let P_{C←B} be the matrix
P_{C←B} = ( [v_1]_C  [v_2]_C  . . .  [v_n]_C ).
This matrix has the following properties:
(1) For any vector v in V,
[v]_C = P_{C←B} [v]_B .
(2) This matrix is invertible and
(P_{C←B})^{-1} = P_{B←C} = ( [w_1]_B  [w_2]_B  . . .  [w_n]_B ).
(3) For any linear transformation T : V −→ V,
[T]_C = P_{C←B} [T]_B P_{B←C} = P_{C←B} [T]_B (P_{C←B})^{-1} = (P_{B←C})^{-1} [T]_B P_{B←C} .
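A small numerical illustration of Theorem 1.7(1) (a sketch assuming NumPy; the two bases below are hypothetical, chosen only for the illustration):

    import numpy as np

    # hypothetical bases B and C of R^2 (each basis vector written in standard coordinates)
    B_mat = np.column_stack([np.array([1.0, 1.0]), np.array([1.0, -1.0])])
    C_mat = np.column_stack([np.array([2.0, 0.0]), np.array([0.0, 3.0])])

    # columns of P_{C<-B} are the B-basis vectors expressed in C-coordinates
    P_CB = np.linalg.solve(C_mat, B_mat)

    v_B = np.array([4.0, -2.0])                # [v]_B for some vector v
    v_C = P_CB @ v_B                           # Theorem 1.7(1): [v]_C = P_{C<-B} [v]_B
    print(np.allclose(C_mat @ v_C, B_mat @ v_B))   # both sides are v itself; expected: True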
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)
[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)

More Related Content

What's hot

Book linear
Book linearBook linear
Book linear
Tamojit Das
 
IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...
IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...
IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...
IRJET Journal
 
C:\Documents And Settings\Pen Drive\Desktop\Maths Module
C:\Documents And Settings\Pen Drive\Desktop\Maths ModuleC:\Documents And Settings\Pen Drive\Desktop\Maths Module
C:\Documents And Settings\Pen Drive\Desktop\Maths Module
Nidhi
 
Compiled Report
Compiled ReportCompiled Report
Compiled ReportSam McStay
 
Machine learning cheat sheet
Machine learning cheat sheetMachine learning cheat sheet
Machine learning cheat sheet
Hany Sewilam Abdel Hamid
 
Finite elements for 2‐d problems
Finite elements  for 2‐d problemsFinite elements  for 2‐d problems
Finite elements for 2‐d problems
Tarun Gehlot
 
201977 1-1-4-pb
201977 1-1-4-pb201977 1-1-4-pb
201977 1-1-4-pb
AssociateProfessorKM
 
Exact solutions for weakly singular Volterra integral equations by using an e...
Exact solutions for weakly singular Volterra integral equations by using an e...Exact solutions for weakly singular Volterra integral equations by using an e...
Exact solutions for weakly singular Volterra integral equations by using an e...
iosrjce
 
Free Ebooks Download ! Edhole
Free Ebooks Download ! EdholeFree Ebooks Download ! Edhole
Free Ebooks Download ! Edhole
Edhole.com
 
Constant strain triangular
Constant strain triangular Constant strain triangular
Constant strain triangular
rahul183
 
Quantum Axioms and commutation(Jacobi identity)
Quantum Axioms and commutation(Jacobi identity)Quantum Axioms and commutation(Jacobi identity)
Quantum Axioms and commutation(Jacobi identity)
Rajabumachaly
 
Isoparametric mapping
Isoparametric mappingIsoparametric mapping
Isoparametric mapping
Linh Tran
 
TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...
TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...
TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...
ijceronline
 
Spectral methods for solving differential equations
Spectral methods for solving differential equationsSpectral methods for solving differential equations
Spectral methods for solving differential equations
Rajesh Aggarwal
 
Cálculo lambda
Cálculo lambdaCálculo lambda
Cálculo lambda
XequeMateShannon
 

What's hot (17)

Book linear
Book linearBook linear
Book linear
 
IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...
IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...
IRJET- Wavelet based Galerkin Method for the Numerical Solution of One Dimens...
 
C:\Documents And Settings\Pen Drive\Desktop\Maths Module
C:\Documents And Settings\Pen Drive\Desktop\Maths ModuleC:\Documents And Settings\Pen Drive\Desktop\Maths Module
C:\Documents And Settings\Pen Drive\Desktop\Maths Module
 
Ai unit-3
Ai unit-3Ai unit-3
Ai unit-3
 
Project
ProjectProject
Project
 
Compiled Report
Compiled ReportCompiled Report
Compiled Report
 
Machine learning cheat sheet
Machine learning cheat sheetMachine learning cheat sheet
Machine learning cheat sheet
 
Finite elements for 2‐d problems
Finite elements  for 2‐d problemsFinite elements  for 2‐d problems
Finite elements for 2‐d problems
 
201977 1-1-4-pb
201977 1-1-4-pb201977 1-1-4-pb
201977 1-1-4-pb
 
Exact solutions for weakly singular Volterra integral equations by using an e...
Exact solutions for weakly singular Volterra integral equations by using an e...Exact solutions for weakly singular Volterra integral equations by using an e...
Exact solutions for weakly singular Volterra integral equations by using an e...
 
Free Ebooks Download ! Edhole
Free Ebooks Download ! EdholeFree Ebooks Download ! Edhole
Free Ebooks Download ! Edhole
 
Constant strain triangular
Constant strain triangular Constant strain triangular
Constant strain triangular
 
Quantum Axioms and commutation(Jacobi identity)
Quantum Axioms and commutation(Jacobi identity)Quantum Axioms and commutation(Jacobi identity)
Quantum Axioms and commutation(Jacobi identity)
 
Isoparametric mapping
Isoparametric mappingIsoparametric mapping
Isoparametric mapping
 
TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...
TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...
TYPE-2 FUZZY LINEAR PROGRAMMING PROBLEMS WITH PERFECTLY NORMAL INTERVAL TYPE-...
 
Spectral methods for solving differential equations
Spectral methods for solving differential equationsSpectral methods for solving differential equations
Spectral methods for solving differential equations
 
Cálculo lambda
Cálculo lambdaCálculo lambda
Cálculo lambda
 

Viewers also liked

Infinite sets and cardinalities
Infinite sets and cardinalitiesInfinite sets and cardinalities
Infinite sets and cardinalitieswalkerlj
 
Axiom of Choice
Axiom of Choice Axiom of Choice
Axiom of Choice
gizemk
 
Axioma Eleccion or Axiom of Choice
Axioma Eleccion or Axiom of ChoiceAxioma Eleccion or Axiom of Choice
Axioma Eleccion or Axiom of Choice
Luis Ospina
 
Jordan canonical form 01
Jordan canonical form 01Jordan canonical form 01
Jordan canonical form 01
HanpenRobot
 
types of sets
types of setstypes of sets
types of sets
jayzorts
 
Set concepts
Set conceptsSet concepts
Set concepts
Malti Aswal
 
SET THEORY
SET THEORYSET THEORY
SET THEORYLena
 

Viewers also liked (7)

Infinite sets and cardinalities
Infinite sets and cardinalitiesInfinite sets and cardinalities
Infinite sets and cardinalities
 
Axiom of Choice
Axiom of Choice Axiom of Choice
Axiom of Choice
 
Axioma Eleccion or Axiom of Choice
Axioma Eleccion or Axiom of ChoiceAxioma Eleccion or Axiom of Choice
Axioma Eleccion or Axiom of Choice
 
Jordan canonical form 01
Jordan canonical form 01Jordan canonical form 01
Jordan canonical form 01
 
types of sets
types of setstypes of sets
types of sets
 
Set concepts
Set conceptsSet concepts
Set concepts
 
SET THEORY
SET THEORYSET THEORY
SET THEORY
 

Similar to [Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)

Algorithmic Mathematics.
Algorithmic Mathematics.Algorithmic Mathematics.
Algorithmic Mathematics.
Dr. Volkan OBAN
 
Independence Complexes
Independence ComplexesIndependence Complexes
Independence ComplexesRickard Fors
 
Bachelor's Thesis
Bachelor's ThesisBachelor's Thesis
Bachelor's Thesis
Bastiaan Frerix
 
signalsandsystemsnotes.pdf
signalsandsystemsnotes.pdfsignalsandsystemsnotes.pdf
signalsandsystemsnotes.pdf
SubbuSiva1
 
signalsandsystemsnotes.pdf
signalsandsystemsnotes.pdfsignalsandsystemsnotes.pdf
signalsandsystemsnotes.pdf
SubbuSiva1
 
Numerical solution of eigenvalues and applications 2
Numerical solution of eigenvalues and applications 2Numerical solution of eigenvalues and applications 2
Numerical solution of eigenvalues and applications 2
SamsonAjibola
 
(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics
(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics
(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics
Antonio Vinnie
 
Notes for signals and systems
Notes for signals and systemsNotes for signals and systems
Notes for signals and systems
Palestine Technical College
 
Cálculo de Tensores
Cálculo de TensoresCálculo de Tensores
Cálculo de Tensores
ARQUITECTOTUNQUN
 
Deturck wilf
Deturck wilfDeturck wilf
Deturck wilf
CAALAAA
 
Mathematical logic
Mathematical logicMathematical logic
Mathematical logic
ble nature
 
On the Numerical Solution of Differential Equations
On the Numerical Solution of Differential EquationsOn the Numerical Solution of Differential Equations
On the Numerical Solution of Differential Equations
Kyle Poe
 
History of the Dirichlet Problem for Laplace's equation.pdf
History of the Dirichlet Problem for Laplace's equation.pdfHistory of the Dirichlet Problem for Laplace's equation.pdf
History of the Dirichlet Problem for Laplace's equation.pdf
hutong1
 

Similar to [Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org) (20)

Algorithmic Mathematics.
Algorithmic Mathematics.Algorithmic Mathematics.
Algorithmic Mathematics.
 
Independence Complexes
Independence ComplexesIndependence Complexes
Independence Complexes
 
MAINPH
MAINPHMAINPH
MAINPH
 
Bachelor's Thesis
Bachelor's ThesisBachelor's Thesis
Bachelor's Thesis
 
signalsandsystemsnotes.pdf
signalsandsystemsnotes.pdfsignalsandsystemsnotes.pdf
signalsandsystemsnotes.pdf
 
signalsandsystemsnotes.pdf
signalsandsystemsnotes.pdfsignalsandsystemsnotes.pdf
signalsandsystemsnotes.pdf
 
Numerical solution of eigenvalues and applications 2
Numerical solution of eigenvalues and applications 2Numerical solution of eigenvalues and applications 2
Numerical solution of eigenvalues and applications 2
 
(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics
(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics
(Ebook pdf) - physics - introduction to tensor calculus and continuum mechanics
 
Notes for signals and systems
Notes for signals and systemsNotes for signals and systems
Notes for signals and systems
 
SMA206_NOTES
SMA206_NOTESSMA206_NOTES
SMA206_NOTES
 
Cálculo de Tensores
Cálculo de TensoresCálculo de Tensores
Cálculo de Tensores
 
Deturck wilf
Deturck wilfDeturck wilf
Deturck wilf
 
tamuthesis
tamuthesistamuthesis
tamuthesis
 
Thesis 2015
Thesis 2015Thesis 2015
Thesis 2015
 
Mathematical logic
Mathematical logicMathematical logic
Mathematical logic
 
VECTOR_QUNTIZATION
VECTOR_QUNTIZATIONVECTOR_QUNTIZATION
VECTOR_QUNTIZATION
 
VECTOR_QUNTIZATION
VECTOR_QUNTIZATIONVECTOR_QUNTIZATION
VECTOR_QUNTIZATION
 
On the Numerical Solution of Differential Equations
On the Numerical Solution of Differential EquationsOn the Numerical Solution of Differential Equations
On the Numerical Solution of Differential Equations
 
History of the Dirichlet Problem for Laplace's equation.pdf
History of the Dirichlet Problem for Laplace's equation.pdfHistory of the Dirichlet Problem for Laplace's equation.pdf
History of the Dirichlet Problem for Laplace's equation.pdf
 
thesis
thesisthesis
thesis
 

Recently uploaded

A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
RitikBhardwaj56
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
taiba qazi
 
Assignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docxAssignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docx
ArianaBusciglio
 
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdfMASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
goswamiyash170123
 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
deeptiverma2406
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
chanes7
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
amberjdewit93
 
Landownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptxLandownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptx
JezreelCabil2
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
David Douglas School District
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
ArianaBusciglio
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
Advantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO PerspectiveAdvantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO Perspective
Krisztián Száraz
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 

Recently uploaded (20)

A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
 
Assignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docxAssignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docx
 
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdfMASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
MASS MEDIA STUDIES-835-CLASS XI Resource Material.pdf
 
Best Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDABest Digital Marketing Institute In NOIDA
Best Digital Marketing Institute In NOIDA
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
Landownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptxLandownership in the Philippines under the Americans-2-pptx.pptx
Landownership in the Philippines under the Americans-2-pptx.pptx
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
Advantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO PerspectiveAdvantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO Perspective
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 

[Steven h. weintraub]_jordan_canonical_form_appli(book_fi.org)

Preface

Jordan Canonical Form (JCF) is one of the most important, and useful, concepts in linear algebra. In this book, we develop JCF and show how to apply it to solving systems of differential equations.

In Chapter 1, we develop JCF. We do not prove the existence of JCF in general, but we present the ideas that go into it: eigenvalues and (chains of generalized) eigenvectors. In Section 1.1, we treat the diagonalizable case, and in Section 1.2, we treat the general case. We develop all possibilities for 2-by-2 and 3-by-3 matrices, and illustrate these by examples.

In Chapter 2, we apply JCF. We show how to use JCF to solve systems Y′ = AY + G(x) of constant-coefficient first-order linear differential equations. In Section 2.1, we consider homogeneous systems Y′ = AY. In Section 2.2, we consider homogeneous systems when the characteristic polynomial of A has complex roots (in which case an additional step is necessary). In Section 2.3, we consider inhomogeneous systems Y′ = AY + G(x) with G(x) nonzero. In Section 2.4, we develop the matrix exponential e^(Ax) and relate it to solutions of these systems. Also in this chapter we provide examples that illustrate all the possibilities in the 2-by-2 and 3-by-3 cases.

Appendix A has background material. Section A.1 gives background on coordinates for vectors and matrices for linear transformations. Section A.2 derives the basic properties of the complex exponential function. This material is relegated to the Appendix so that readers who are unfamiliar with these notions, or who are willing to take them on faith, can skip it and still understand the material in Chapters 1 and 2.

Our numbering system for results is fairly standard: Theorem 2.1, for example, is the first Theorem found in Section 2 of Chapter 1. As is customary in textbooks, we provide the answers to the odd-numbered exercises here. Instructors may contact me at shw2@lehigh.edu and I will supply the answers to all of the exercises.

Steven H. Weintraub
Lehigh University
Bethlehem, PA USA
July 2008
CHAPTER 1
Jordan Canonical Form

1.1 THE DIAGONALIZABLE CASE

Although, for simplicity, most of our examples will be over the real numbers (and indeed over the rational numbers), we will consider that all of our vectors and matrices are defined over the complex numbers C. It is only with this assumption that the theory of Jordan Canonical Form (JCF) works completely. See Remark 1.9 for the key reason why.

Definition 1.1. If v ≠ 0 is a vector such that, for some λ, Av = λv, then v is an eigenvector of A associated to the eigenvalue λ.

Example 1.2. Let A be the matrix A = [[5, −7], [2, −4]]. Then, as you can check, if v1 = (7, 2), then Av1 = 3v1, so v1 is an eigenvector of A with associated eigenvalue 3, and if v2 = (1, 1), then Av2 = −2v2, so v2 is an eigenvector of A with associated eigenvalue −2.

We note that the definition of an eigenvalue/eigenvector can be expressed in an alternate form. Here I denotes the identity matrix:

Av = λv
Av = λIv
(A − λI)v = 0.

For an eigenvalue λ of A, we let Eλ denote the eigenspace of λ,

Eλ = {v | Av = λv} = {v | (A − λI)v = 0} = Ker(A − λI).

(The kernel Ker(A − λI) is also known as the nullspace NS(A − λI).) We also note that this alternate formulation helps us find eigenvalues and eigenvectors. For if (A − λI)v = 0 for a nonzero vector v, the matrix A − λI must be singular, and hence its determinant must be 0. This leads us to the following definition.

Definition 1.3. The characteristic polynomial of a matrix A is the polynomial det(λI − A).
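For readers who want to check such computations by machine, here is a minimal SymPy sketch for the matrix of Example 1.2; it recovers the characteristic polynomial of Definition 1.3 and bases for the eigenspaces. (SymPy is a convenience assumption here; any computer algebra system will do.)

```python
# Checking Example 1.2 with SymPy: eigenvalues and eigenvectors of A.
import sympy as sp

lam = sp.symbols('lambda')
A = sp.Matrix([[5, -7],
               [2, -4]])

# Characteristic polynomial det(lambda*I - A); it factors as (lambda - 3)(lambda + 2).
char_poly = A.charpoly(lam).as_expr()
print(sp.factor(char_poly))

# Eigenvalues with algebraic multiplicities and bases for the eigenspaces E_lambda.
for value, alg_mult, basis in A.eigenvects():
    print(value, alg_mult, [list(v) for v in basis])

# Direct check that v1 = (7, 2) is an eigenvector with eigenvalue 3.
v1 = sp.Matrix([7, 2])
assert A * v1 == 3 * v1
```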
  • 10. 2 CHAPTER 1. JORDAN CANONICAL FORM Remark 1.4. This is the customary definition of the characteristic polynomial. But note that, if A is an n-by-n matrix, then the matrix λI − A is obtained from the matrix A − λI by multiplying each of its n rows by −1, and hence det(λI − A) = (−1)n det(A − λI). In practice, it is most convenient to work with A − λI in finding eigenvectors—this minimizes arithmetic—and when we come to find chains of generalized eigenvectors in Section 1.2, it is (almost) essential to use A − λI, as using λI − A would introduce lots of spurious minus signs. Example 1.5. Returning to the matrix A = 5 −7 2 −4 of Example 1.2, we compute that det(λI − A) = λ2 − λ − 6 = (λ − 3)(λ + 2), so A has eigenvalues 3 and −2. Computation then shows that the eigenspace E3 = Ker(A − 3I) has basis 7 2 , and that the eigenspace E−2 = Ker(A − (−2)I) has basis 1 1 . We now introduce two important quantities associated to an eigenvalue of a matrix A. Definition 1.6. Let a be an eigenvalue of a matrix A. The algebraic multiplicity of the eigenvalue a is alg-mult(a) = the multiplicity of a as a root of the characteristic polynomial det(λI − A). The geometric multiplicity of the eigenvalue a is geom-mult(a) = the dimension of the eigenspace Ea. It is common practice to use the word multiplicity (without a qualifier) to mean algebraic multiplicity. We have the following relationship between these two multiplicities. Lemma 1.7. Let a be an eigenvalue of a matrix A. Then 1 ≤ geom-mult(a) ≤ alg-mult(a) . Proof. By the definition of an eigenvalue, there is at least one eigenvector v with eigenvalue a, and so Ea contains the nonzero vector v, and hence dim(Ea) ≥ 1. For the proof that geom-mult(a) ≤ alg-mult(a), see Lemma 1.12 in Appendix A. 2 Corollary 1.8. Let a be an eigenvalue of A and suppose that a has algebraic multiplicity 1. Then a also has geometric multiplicity 1. Proof. In this case, applying Lemma 1.7, we have 1 ≤ geom-mult(a) ≤ alg-mult(a) = 1 , so geom-mult(a) = 1. 2
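Both multiplicities of Definition 1.6 are easy to compute: the algebraic multiplicity is read off from the factored characteristic polynomial, and the geometric multiplicity is the dimension of Ker(A − aI). A minimal sketch follows, using an illustrative matrix chosen for this purpose (it is not one of the book's examples); it exhibits a case where the inequality of Lemma 1.7 is strict.

```python
# A matrix whose only eigenvalue is 2, with algebraic multiplicity 2
# but geometric multiplicity 1 (so, by Theorem 1.14 below, it is not diagonalizable).
import sympy as sp

A = sp.Matrix([[2, 1],
               [0, 2]])

# {eigenvalue: algebraic multiplicity}
print(A.eigenvals())                        # {2: 2}

# geometric multiplicity = dim Ker(A - 2I) = number of nullspace basis vectors
geom_mult = len((A - 2 * sp.eye(2)).nullspace())
print(geom_mult)                            # 1
```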
  • 11. 1.1. THE DIAGONALIZABLE CASE 3 Remark 1.9. Let A be an n-by-n matrix.Then its characteristic polynomial det(λI − A) has degree n. Since we are considering A to be defined over the complex numbers, we may apply the Fundamental Theorem of Algebra, which states that an nth degree polynomial has n roots, counting multiplicities. Hence,we see that,for any n-by-n matrix A,the sum of the algebraic multiplicities of the eigenvalues of A is equal to n. Lemma 1.10. Let A be an n-by-n matrix. The following are equivalent: (1) For each eigenvalue a of A, geom-mult(a) = alg-mult(a). (2) The sum of the geometric multiplicities of the eigenvalues of A is equal to n. Proof. Let A have eigenvalues a1, a2, . . . , am. For each i between 1 and m, let si = geom-mult(ai) and ti = alg-mult(ai). Then, by Lemma 1.7, si ≤ ti for each i, and by Remark 1.9, m i=1 ti = n. Thus, if si = ti for each i, then m i=1 si = n, while if si < ti for some i, then m i=1 si < n. 2 Proposition 1.11. (1) Let a1, a2, . . . , am be distinct eigenvalues of A (i.e., ai = aj for i = j). For each i between 1 and m, let vi be an associated eigenvector.Then {v1, v2, . . . , vm} is a linearly independent set of vectors. (2) More generally, let a1, a2, . . . , am be distinct eigenvalues of A. For each i between 1 and m, let Si be a linearly independent set of eigenvectors associated to ai.Then S = S1 ∪ . . . Sm is a linearly independent set of vectors. Proof. (1) Suppose we have a linear combination 0 = c1v1 + c2v2 + . . . + cmvm. We need to show that ci = 0 for each i.To do this, we begin with an observation: If v is an eigenvector of A associated to the eigenvalue a, and b is any scalar, then (A − bI)v = Av − bv = av − bv = (a − b)v. (Note that this answer is 0 if a = b and nonzero if a = b.) We now go to work,multiplying our original relation by (A − amI).Of course,(A − amI)0 = 0, so: 0 = (A − amI)(c1v1 + c2v2 + . . . + cm−2vm−2 + cm−1vm−1 + cmvm) = c1(A − amI)v1 + c2(A − amI)v2 + . . . + cm−2(A − amI)vm−2 + cm−1(A − amI)vm−1 + cm(A − amI)vm = c1(a1 − am)v1 + c2(a2 − am)v2 + . . . + cm−2(am−2 − am)vm−2 + cm−1(am−1 − am)vm−1 .
  • 12. 4 CHAPTER 1. JORDAN CANONICAL FORM We now multiply this relation by (A − am−1I). Again, (A − am−1I)0 = 0, so: 0 = (A − am−1I)(c1(a1 − am)v1 + c2(a2 − am)v2 + . . . + cm−2(am−2 − am)vm−2 + cm−1(am−1 − am)vm−1) = c1(a1 − am)(A − am−1I)v1 + c2(a2 − am)(A − am−1I)v2 + . . . + cm−2(am−2 − am)(A − am−1I)vm−2 + cm−1(am−1 − am)(A − am−1I)vm−1 = c1(a1 − am)(a1 − am−1)v1 + c2(a2 − am)(a2 − am−1)v2 + . . . + cm−2(am−2 − am)(am−2 − am−1)vm−2 . Proceed in this way, until at the last step we multiply by (A − a2I). We then obtain: 0 = c1(a1 − a2) · · · (a1 − am−1)(a1 − am)v1 . But v1 = 0, as by definition an eigenvector is nonzero. Also, the product (a1 − a2) · · · (a1 − am−1)(a1 − am) is a product of nonzero numbers and is hence nonzero.Thus, we must have c1 = 0. Proceeding in the same way, multiplying our original relation by (A − amI), (A − am−1I), (A − a3I),and finally by (A − a1I),we obtain c2 = 0,and,proceeding in this vein,we obtain ci = 0 for all i, and so the set {v1, v2, . . . , vm} is linearly independent. (2)To avoid complicated notation,we will simply prove this when m = 2 (which illustrates the general case).Thus,let m = 2,let S1 = {v1,1, . . . , v1,i1 } be a linearly independent set of eigenvectors associated to the eigenvalue a1 of A, and let S2 = {v2,1, . . . , v2,i2 } be a linearly independent set of eigenvectors associated to the eigenvalue a2 of A. Then S = {v1,1, . . . , v1,i1 , v2,1, . . . , v2,i2 }. We want to show that S is a linearly independent set. Suppose we have a linear combination 0 = c1,1v1,1 + . . . + c1,i1 v1,i1 + c2,1v2,1 + . . . + c2,i2 v2,i2 . Then: 0 = c1,1v1,1 + . . . + c1,i1 v1,i1 + c2,1v2,1 + . . . + c2,i2 v2,i2 = (c1,1v1,1 + . . . + c1,i1 v1,i1 ) + (c2,1v2,1 + . . . + c2,i2 v2,i2 ) = v1 + v2 where v1 = c1,1v1,1 + . . . + c1,i1 v1,i1 and v2 = c2,1v2,1 + . . . + c2,i2 v2,i2 . But v1 is a vector in Ea1 , so Av1 = a1v1; similarly, v2 is a vector in Ea2 , so Av2 = a2v2. Then, as in the proof of part (1), 0 = (A − a2I)0 = (A − a2I)(v1 + v2) = (A − a2I)v1 + (A − a2I)v2 = (a1 − a2)v1 + 0 = (a1 − a2)v1 so 0 = v1; similarly, 0 = v2. But 0 = v1 = c1,1v1,1 + . . . + c1,i1 v1,i1 implies c1,1 = . . . c1,i1 = 0, as, by hypothesis, {v1,1, . . . , v1,i1 } is a linearly independent set; similarly, 0 = v2 implies c2,1 = . . . = c2,i2 = 0. Thus, c1,1 = . . . = c1,i1 = c2,1 = . . . = c2,i2 = 0 and S is linearly independent, as claimed. 2 Definition 1.12. Two square matrices A and B are similar if there is an invertible matrix P with A = PBP −1.
  • 13. 1.1. THE DIAGONALIZABLE CASE 5 Definition 1.13. A square matrix A is diagonalizable if A is similar to a diagonal matrix. Here is the main result of this section. Theorem 1.14. Let A be an n-by-n matrix over the complex numbers. Then A is diagonalizable if and only if, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = PJP−1 where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic multiplicity, and P is a matrix whose columns are eigenvectors forming bases for the associated eigenspaces. Proof. We give a proof by direct computation here. For a more conceptual proof, see Theorem 1.10 in Appendix A. First let us suppose that for each eigenvalue a of A, geom-mult(a) = alg-mult(a). Let A have eigenvalues a1, a2, …, an. Here we do not insist that the ai’s are distinct; rather, each eigenvalue appears the same number of times as its algebraic multiplicity.Then J is the diagonal matrix J = j1 j2 . . . jn and we see that ji, the ith column of J, is the vector ji = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 0 ... 0 ai 0 ... ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ , with ai in the ith position, and 0 elsewhere. We have P = v1 v2 . . . vn , a matrix whose columns are eigenvectors forming bases for the associated eigenspaces.By hypothesis, geom-mult(a) = alg-mult(a) for each eigenvector a of A, so there are as many columns of P that are eigenvectors for the eigenvalue a as there are diagonal entries of J that are equal to a. Furthermore, by Lemma 1.10, the matrix P indeed has n columns. We first show by direct computation that AP = PJ. Now AP = A v1 v2 . . . vn
  • 14. 6 CHAPTER 1. JORDAN CANONICAL FORM so the ith column of AP is Avi. But Avi = aivi as vi is an eigenvector of A with associated eigenvalue ai. On the other hand, PJ = v1 v2 . . . vn J and the ith column of PJ is Pji, Pji = v1 v2 . . . vn ji . Remembering what the vector ji is, and multiplying, we see that Pji = aivi as well. Thus, every column of AP is equal to the corresponding column of PJ, so AP = PJ . By Proposition 1.11, the columns of the square matrix P are linearly independent, so P is invertible. Multiplying on the right by P−1, we see that A = PJP −1 , completing the proof of this half of the Theorem. Now let us suppose that A is diagonalizable, A = PJP −1.Then AP = PJ.We use the same notation for P and J as in the first half of the proof. Then, as in the first half of the proof, we compute AP and PJ column-by-column, and we see that the ith column of AP is Avi and that the ith column of PJ is aivi, for each i. Hence, Avi = aivi for each i, and so vi is an eigenvector of A with associated eigenvalue ai. For each eigenvalue a of A, there are as many columns of P that are eigenvectors for a as there are diagonal entries of J that are equal to a, and these vectors form a basis for the eigenspace associatedoftheeigenvaluea,soweseethatforeacheigenvaluea ofA,geom-mult(a) = alg-mult(a), completing the proof. 2 For a general matrix A, the condition in Theorem 1.14 may or may not be satisfied, i.e., some but not all matrices are diagonalizable. But there is one important case when this condition is automatic. Corollary 1.15. Let A be an n-by-n matrix over the complex numbers all of whose eigenvalues are distinct (i.e., whose characteristic polynomial has no repeated roots). Then A is diagonalizable.
  • 15. 1.2. THE GENERAL CASE 7 Proof. By hypothesis, for each eigenvalue a of A, alg-mult(a) = 1. But then, by Corollary 1.8, for each eigenvalue a of A, geom-mult(a) = alg-mult(a), so the hypothesis ofTheorem 1.14 is satisfied. 2 Example 1.16. Let A be the matrix A = 5 −7 2 −4 of Examples 1.2 and 1.5. Then, referring to Example 1.5, we see 5 −7 2 −4 = 7 1 2 1 3 0 0 −2 7 1 2 1 −1 . As we have indicated, we have developed this theory over the complex numbers, as JFC works best over them. But there is an analog of our results over the real numbers—we just have to require that all the eigenvalues of A are real. Here is the basic result on diagonalizability. Theorem 1.17. Let A be an n-by-n real matrix. Then A is diagonalizable if and only if all the eigenvalues of A are real numbers, and, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = PJP−1 where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic multiplicity (and hence is a real matrix), and P is a real matrix whose columns are eigenvectors forming bases for the associated eigenspaces. 1.2 THE GENERAL CASE Let us begin this section by describing what a matrix in JCF looks like. A matrix in JCF is composed of “Jordan blocks,” so we first see what a single Jordan block looks like. Definition 2.1. A k-by-k Jordan block associated to the eigenvalue λ is a k-by-k matrix of the form J = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ λ 1 λ 1 λ 1 ... ... λ 1 λ ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ .
  • 16. 8 CHAPTER 1. JORDAN CANONICAL FORM In other words, a Jordan block is a matrix with all the diagonal entries equal to each other, all the entries immediately above the diagonal equal to 1, and all the other entries equal to 0. Definition 2.2. A matrix J in Jordan Canonical Form (JCF) is a block diagonal matrix J = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ J1 J2 J3 ... J ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ with each Ji a Jordan block. Remark 2.3. Note that every diagonal matrix is a matrix in JCF, with each Jordan block a 1-by-1 block. In order to understand and be able to use JCF, we must introduce a new concept, that of a generalized eigenvector. Definition 2.4. If v = 0 is a vector such that, for some λ, (A − λI)k (v) = 0 for some positive integer k, then v is a generalized eigenvector of A associated to the eigenvalue λ. The smallest k with (A − λI)k(v) = 0 is the index of the generalized eigenvector v. Let us note that if v is a generalized eigenvector of index 1, then (A − λI)(v) = 0 (A)v = (λI)v Av = λv and so v is an (ordinary) eigenvector. Recall that, for an eigenvalue λ of A, Eλ denotes the eigenspace of λ, Eλ = {v | Av = λv} = {v | (A − λI)v = 0} . We let ˜Eλ denote the generalized eigenspace of λ, ˜Eλ = {v | (A − λI)k (v) = 0 for some k} . It is easy to check that ˜Eλ is a subspace.
  • 17. 1.2. THE GENERAL CASE 9 Since every eigenvector is a generalized eigenvector, we see that Eλ ⊆ ˜Eλ . The following result (which we shall not prove) is an important fact about generalized eigenspaces. Proposition 2.5. Let λ be an eigenvalue of the n-by-n matrix A of algebraic multiplicity m. Then, ˜Eλ is a subspace of Cn of dimension m. Example 2.6. Let A be the matrix A = 0 1 −4 4 . Then, as you can check, if u = 1 2 , then (A − 2I)u = 0, so u is an eigenvector of A with associated eigenvalue 2 (and hence a generalized eigenvector of index 1 of A with associated eigenvalue 2). On the other hand, if v = 1 0 , then (A − 2I)2v = 0 but (A − 2I)v = 0,so v is a generalized eigenvector of index 2 of A with associated eigenvalue 2. In this case,as you can check,the vector u is a basis for the eigenspace E2,so E2 = { cu | c ∈ C} is one dimensional. On the other hand, u and v are both generalized eigenvectors associated to the eigenvalue 2, and are linearly independent (the equation c1u + c2v = 0 only has the solution c1 = c2 = 0, as you can readily check), so ˜E2 has dimension at least 2. Since ˜E2 is a subspace of C2, it must have dimension exactly 2, and ˜E2 = C2 (and {u, v} is indeed a basis for C2). Let us next consider a generalized eigenvector vk of index k associated to an eigenvalue λ, and set vk−1 = (A − λI)vk . We claim that vk−1 is a generalized eigenvector of index k − 1 associated to the eigenvalue λ. To see this, note that (A − λI)k−1 vk−1 = (A − λI)k−1 (A − λI)vk = (A − λI)k vk = 0 but (A − λI)k−2 vk−1 = (A − λI)k−2 (A − λI)vk = (A − λI)k−1 vk = 0 . Proceeding in this way, we may set vk−2 = (A − λI)vk−1 = (A − λI)2 vk vk−3 = (A − λI)vk−2 = (A − λI)2 vk−1 = (A − λI)3 vk ... v1 = (A − λI)v2 = · · · = (A − λI)k−1 vk
  • 18. 10 CHAPTER 1. JORDAN CANONICAL FORM and note that each vi is a generalized eigenvector of index i associated to the eigenvalue λ. A collection of generalized eigenvectors obtained in this way gets a special name: Definition 2.7. If {v1, . . . , vk} is a set of generalized eigenvectors associated to the eigenvalue λ of A, such that vk is a generalized eigenvector of index k and also vk−1 =(A − λI)vk, vk−2 = (A − λI)vk−1, vk−3 = (A − λI)vk−2, · · · , v2 = (A − λI)v3, v1 = (A − λI)v2 , then {v1, . . . , vk} is called a chain of generalized eigenvectors of length k. The vector vk is called the top of the chain and the vector v1 (which is an ordinary eigenvector) is called the bottom of the chain. Example 2.8. Let us return to Example 2.6.We saw there that v = 1 0 is a generalized eigenvector of index 2 of A = 0 1 −4 4 associated to the eigenvalue 2. Let us set v2 = v = 1 0 . Then v1 = (A − 2I)v2 = −2 −4 is a generalized eigenvector of index 1 (i.e., an ordinary eigenvector), and {v1, v2} is a chain of length 2. Remark 2.9. It is important to note that a chain of generalized eigenvectors {v1, . . . , vk} is entirely determined by the vector vk at the top of the chain. For once we have chosen vk, there are no other choices to be made: the vector vk−1 is determined by the equation vk−1 = (A − λI)vk; then the vector vk−2 is determined by the equation vk−2 = (A − λI)vk−1, etc. With this concept in hand, let us return to JCF. As we have seen, a matrix J in JCF has a number of blocks J1, J2, . . . , J , called Jordan blocks, along the diagonal. Let us begin our analysis with the case when J consists of a single Jordan block. So suppose J is a k-by-k matrix J = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ λ 1 λ 1 0 λ 1 ... ... 0 λ 1 λ ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ .
  • 19. 1.2. THE GENERAL CASE 11 Then, J − λI = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 0 1 0 1 0 1 ... ... 0 1 0 ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ . Let e1= ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 1 0 0 ... 0 ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ , e2= ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 0 1 0 ... 0 ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ , e3= ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 0 0 1 ... 0 ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ , …, ek= ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 0 0 0 ... 1 ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ . Then direct calculation shows: (J − λI)ek = ek−1 (J − λI)ek−1 = ek−2 ... (J − λI)e2 = e1 (J − λI)e1 = 0 and so we see that {e1, . . . , ek} is a chain of generalized eigenvectors. We also note that {e1, . . . , ek} is a basis for Ck, and so ˜Eλ = Ck . We first see that the situation is very analogous when we consider any k-by-k matrix with a single chain of generalized eigenvectors of length k. Proposition 2.10. Let {v1, . . . , vk} be a chain of generalized eigenvectors of length k associated to the eigenvalue λ of a matrix A. Then {v1, . . . , vk} is linearly independent. Proof. Suppose we have a linear combination c1v1 + c2v2 + · · · + ck−1vk−1 + ckvk = 0 . We must show each ci = 0. By the definition of a chain, vk−i = (A − λI)ivk for each i, so we may write this equation as c1(A − λI)k−1 vk + c2(A − λI)k−2 vk + · · · + ck−1(A − λI)vk + ckvk = 0 .
  • 20. 12 CHAPTER 1. JORDAN CANONICAL FORM Now let us multiply this equation on the left by (A − λI)k−1. Then we obtain the equation c1(A − λI)2k−2 vk + c2(A − λI)2k−3 vk + · · · + ck−1(A − λI)k vk + ck(A − λI)k−1 vk = 0 . Now (A − λI)k−1vk = v1 = 0. However, (A − λI)kvk = 0, and then also (A − λI)k+1vk = (A − λI)(A − λI)kvk = (A − λI)(0) = 0, and then similarly (A − λI)k+2vk = 0, . . . , (A − λI)2k−2vk = 0, so every term except the last one is zero and this equation becomes ckv1 = 0 . Since v1 = 0, this shows ck = 0, so our linear combination is c1v1 + c2v2 + · · · + ck−1vk−1 = 0 . Repeat the same argument, this time multiplying by (A − λI)k−2 instead of (A − λI)k−1. Then we obtain the equation ck−1v1 = 0 , and, since v1 = 0, this shows that ck−1 = 0 as well. Keep going to get c1 = c2 = · · · = ck−1 = ck = 0 , so {v1, . . . , vk} is linearly independent. 2 Theorem 2.11. Let A be a k-by-k matrix and suppose that Ck has a basis {v1, . . . , vk} consisting of a single chain of generalized eigenvectors of length k associated to an eigenvalue a. Then A = PJP −1 where J = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ a 1 a 1 a 1 ... ... a 1 a ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ is a matrix consisting of a single Jordan block and P = v1 v2 . . . vk is a matrix whose columns are generalized eigenvectors forming a chain.
  • 21. 1.2. THE GENERAL CASE 13 Proof. We give a proof by direct computation here. (Note the similarity of this proof to the proof of Theorem 1.14.) For a more conceptual proof, see Theorem 1.11 in Appendix A. Let P be the given matrix. We will first show by direct computation that AP = PJ. It will be convenient to write J = j1 j2 . . . jk and we see that ji, the ith column of J, is the vector ji = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 0 ... 1 a 0 ... ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ with 1 in the (i − 1)st position, a in the ith position, and 0 elsewhere. We show that AP = PJ by showing that their corresponding columns are equal. Now AP = A v1 v2 . . . vk so the ith column of AP is Avi. But Avi = (A − aI + aI)vi = (A − aI)vi + aIvi = vi−1 + avi for i > 1, = avi for i = 1 . On the other hand, PJ = v1 v2 . . . vk J and the ith column of PJ is Pji, Pji = v1 v2 . . . vk ji . Remembering what the vector ji is, and multiplying, we see that Pji = vi−1 + avi for i > 1, = avi for i = 1 as well.
  • 22. 14 CHAPTER 1. JORDAN CANONICAL FORM Thus, every column of AP is equal to the corresponding column of PJ, so AP = PJ . But Proposition 2.10 shows that the columns of P are linearly independent, so P is invertible. Multiplying on the right by P−1, we see that A = PJP−1 . 2 Example 2.12. Applying Theorem 2.11 to the matrix A = 0 1 −4 4 of Examples 2.6 and 2.8, we see that 0 1 −4 4 = −2 1 −4 0 2 1 0 2 −2 1 −4 0 −1 . Here is the key theorem to which we have been heading. This theorem is one of the most important (and useful) theorems in linear algebra. Theorem 2.13. Let A be any square matrix defined over the complex numbers. Then A is similar to a matrix in Jordan Canonical Form. More precisely, A = PJP −1, for some matrix J in Jordan Canonical Form. The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose columns are chains of generalized eigenvectors of A. Proof. (Rough outline) In general, the JCF of a matrix A does not consist of a single block, but will have a number of blocks, of varying sizes and associated to varying eigenvalues. But in this situation we merely have to “assemble” the various blocks (to get the matrix J) and the various chains of generalized eigenvectors (to get a basis and hence the matrix P). Actually, the word “merely” is a bit misleading, as the argument that we can do so is, in fact, a subtle one, and we shall not give it here. 2 In lieu of proving Theorem 2.13, we shall give a number of examples that illustrate the situation. In fact, in order to avoid complicated notation we shall merely illustrate the situation for 2-by-2 and 3-by-3 matrices. Theorem 2.14. Let A be a 2-by-2 matrix. Then one of the following situations applies:
  • 23. 1.2. THE GENERAL CASE 15 (i) A has two eigenvalues, a and b, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a and let v be an eigenvector associated to the eigenvalue b.Then A = PJP−1 with J = a 0 0 b and P = u v . (Note, in this case, A is diagonalizable.) (ii) A has a single eigenvalue a of algebraic multiplicity 2. (a) A has two linearly independent eigenvectors u and v. Then A = PJP−1 with J = a 0 0 a and P = u v . (Note, in this case, A is diagonalizable. In fact, in this case Ea = C2 and A itself is the matrix a 0 0 a .) (b) A has a single chain {v1, v2} of generalized eigenvectors. Then A = PJP−1 with J = a 1 0 a and P = v1 v2 . Theorem 2.15. Let A be a 3-by-3 matrix. Then one of the following situations applies: (i) A has three eigenvalues, a, b, and c, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a, v be an eigenvector associated to the eigenvalue b, and w be an eigenvector associated to the eigenvalue c. Then A = PJP−1 with J = ⎡ ⎣ a 0 0 0 b 0 0 0 c ⎤ ⎦ and P = u v w . (Note, in this case, A is diagonalizable.) (ii) A has an eigenvalue a of algebraic multiplicity 2 and an eigenvalue b of algebraic multiplicity 1. (a) A has two independent eigenvectors, u and v, associated to the eigenvalue a. Let w be an eigenvector associated to the eigenvalue b. Then A = PJP−1 with J = ⎡ ⎣ a 0 0 0 a 0 0 0 b ⎤ ⎦ and P = u v w . (Note, in this case, A is diagonalizable.)
  • 24. 16 CHAPTER 1. JORDAN CANONICAL FORM (b) A has a single chain {u1, u2} of generalized eigenvectors associated to the eigenvalue a. Let v be an eigenvector associated to the eigenvalue b. Then A=PJP −1 with J = ⎡ ⎣ a 1 0 0 a 0 0 0 b ⎤ ⎦ and P = u1 u2 v . (iii) A has a single eigenvalue a of algebraic multiplicity 3. (a) A has three linearly independent eigenvectors, u, v, and w. Then A = PJP−1 with J = ⎡ ⎣ a 0 0 0 a 0 0 0 a ⎤ ⎦ and P = u v w . (Note, in this case, A is diagonalizable. In fact, in this case Ea = C3 and A itself is the matrix ⎡ ⎣ a 0 0 0 a 0 0 0 a ⎤ ⎦.) (b) A has a chain {u1, u2} of generalized eigenvectors and an eigenvector v with {u1, u2, v} linearly independent. Then A = PJP −1 with J = ⎡ ⎣ a 1 0 0 a 0 0 0 a ⎤ ⎦ and P = u1 u2 v . (c) A has a single chain {u1, u2, u3} of generalized eigenvectors. Then A =PJP−1 with J = ⎡ ⎣ a 1 0 0 a 1 0 0 a ⎤ ⎦ and P = u1 u2 u3 . Remark 2.16. Suppose that A has JCF J = aI, a scalar multiple of the identity matrix. Then A = PJP−1 = P(aI)P−1 = a(P IP−1) = aI = J.ThisjustifiestheparentheticalremarkinThe- orems 2.14 (ii) (a) and 2.15 (iii) (a). Remark 2.17. Note that Theorems 2.14 (i), 2.14 (ii) (a), 2.15 (i), 2.15 (ii) (a), and 2.15 (iii) (a) are all special cases of Theorem 1.14, and in fact Theorems 2.14 (i) and 2.15 (i) are both special cases of Corollary 1.15.
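The case distinctions of Theorems 2.14 and 2.15 can also be checked by machine. SymPy's jordan_form method returns a pair (P, J) with A = PJP^(-1); the minimal sketch below applies it to the matrix of Examples 2.6, 2.8, and 2.12, which falls under case (ii)(b) of Theorem 2.14. Note that the ordering of the Jordan blocks, and hence of the columns of P, may differ from a hand computation.

```python
# JCF of the matrix from Examples 2.6/2.8/2.12: one 2-by-2 Jordan block for eigenvalue 2.
import sympy as sp

A = sp.Matrix([[0, 1],
               [-4, 4]])

P, J = A.jordan_form()
print(J)      # Matrix([[2, 1], [0, 2]])
print(P)      # columns form a chain of generalized eigenvectors

# Check the factorization A = P J P^(-1).
assert sp.simplify(P * J * P.inv() - A) == sp.zeros(2, 2)
```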
  • 25. 1.2. THE GENERAL CASE 17 Now we would like to apply Theorems 2.14 and 2.15. In order to do so, we need to have an effective method to determine which of the cases we are in, and we give that here (without proof). Definition 2.18. Let λ be an eigenvalue of A. Then for any positive integer i, Ei λ = {v | (A − λI)i (v) = 0} = Ker((A − λI)i ) . Note that Ei λ consists of generalized eigenvectors of index at most i (and the 0 vector), and is a subspace. Note also that Eλ = E1 λ ⊆ E2 λ ⊆ . . . ⊆ ˜Eλ . In general, the JCF of A is determined by the dimensions of all the spaces Ei λ, but this determination can be a bit complicated. For eigenvalues of multiplicity at most 3, however, the situationissimpler—weneedonlyconsidertheeigenspacesEλ.Thisisaconsequenceofthefollowing general result. Proposition 2.19. Let λ be an eigenvalue of A.Then the number of blocks in the JCF of A corresponding to λ is equal to dim Eλ, i.e., to the geometric multiplicity of λ. Proof. (Outline) Suppose there are such blocks. Since each block corresponds to a chain of gener- alized eigenvectors,there are such chains.Now the bottom of the chain is an (ordinary) eigenvector, so we get eigenvectors in this way. It can be shown that these eigenvectors are always linearly independent and that they always span Eλ, i.e., that they are a basis of Eλ. Thus, Eλ has a basis consisting of vectors, so dim Eλ = . 2 We can now determine the JCF of 1-by-1, 2-by-2, and 3-by-3 matrices, using the following consequences of this proposition. Corollary 2.20. Let λ be an eigenvalue of A of algebraic multiplicity 1. Then dim E1 λ = 1, i.e., a has geometric multiplicity 1, and the submatrix of the JCF of A corresponding to the eigenvalue λ is a single 1-by-1 block. Corollary 2.21. Let λ be an eigenvalue of A of algebraic multiplicity 2. Then there are the following possibilities: (a) dim E1 λ = 2, i.e., a has geometric multiplicity 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of two 1-by-1 blocks.
  • 26. 18 CHAPTER 1. JORDAN CANONICAL FORM (b) dim E1 λ = 1, i.e., a has geometric multiplicity 1. Also, dim E2 λ = 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of a single 2-by-2 block. Corollary 2.22. Let λ be an eigenvalue of A of algebraic multiplicity 3. Then there are the following possibilities: (a) dim E1 λ = 3, i.e., a has geometric multiplicity 3. In this case, the submatrix of the JCF of A corre- sponding to the eigenvalue λ consists of three 1-by-1 blocks. (b) dim E1 λ = 2, i.e., a has geometric multiplicity 2. Also, dim E2 λ = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a 2-by-2 block and a 1-by-1 block. (c) dim E1 λ = 1, i.e., a has geometric multiplicity 1. Also, dim E2 λ = 2, and dim E3 λ = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a single 3-by-3 block. Now we shall do several examples. Example 2.23. A = ⎡ ⎣ 2 −3 −3 2 −2 −2 −2 1 1 ⎤ ⎦ . A has characteristic polynomial det (λI − A) = (λ + 1)(λ)(λ − 2).Thus, A has eigenvalues −1,0,and2,eachofmultiplicityone,andsoweareinthesituationof Theorem2.15(i).Computation shows that the eigenspace E−1 = Ker(A − (−I)) has basis ⎧ ⎨ ⎩ ⎡ ⎣ 1 0 1 ⎤ ⎦ ⎫ ⎬ ⎭ , the eigenspace E0 = Ker(A) has basis ⎧ ⎨ ⎩ ⎡ ⎣ 0 1 −1 ⎤ ⎦ ⎫ ⎬ ⎭ , and the eigenspace E2 = Ker(A − 2I) has basis ⎧ ⎨ ⎩ ⎡ ⎣ −1 −1 1 ⎤ ⎦ ⎫ ⎬ ⎭ . Hence, we see that ⎡ ⎣ 2 −3 −3 2 −2 −2 −2 1 1 ⎤ ⎦ = ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ ⎡ ⎣ −1 0 0 0 0 0 0 0 2 ⎤ ⎦ ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ −1 . Example 2.24. A = ⎡ ⎣ 3 1 1 2 4 2 1 1 3 ⎤ ⎦ .
  • 27. 1.2. THE GENERAL CASE 19 A has characteristic polynomial det (λI − A) = (λ − 2)2(λ − 6).Thus, A has an eigenvalue 2 of multiplicity 2 and an eigenvalue 6 of multiplicity 1. Computation shows that the eigenspace E2 = Ker(A − 2I) has basis ⎧ ⎨ ⎩ ⎡ ⎣ −1 1 0 ⎤ ⎦ , ⎡ ⎣ −1 0 1 ⎤ ⎦ ⎫ ⎬ ⎭ , so dim E2 = 2 and we are in the situation of Corollary 2.21 (a). Further computation shows that the eigenspace E6 = Ker(A − 6I) has basis⎧ ⎨ ⎩ ⎡ ⎣ 1 2 1 ⎤ ⎦ ⎫ ⎬ ⎭ . Hence, we see that ⎡ ⎣ 3 1 1 2 4 2 1 1 3 ⎤ ⎦ = ⎡ ⎣ −1 −1 1 1 0 2 0 1 1 ⎤ ⎦ ⎡ ⎣ 2 0 0 0 2 0 0 0 6 ⎤ ⎦ ⎡ ⎣ −1 −1 1 1 0 2 0 1 1 ⎤ ⎦ −1 . Example 2.25. A = ⎡ ⎣ 2 1 1 2 1 −2 −1 0 −2 ⎤ ⎦ . A has characteristic polynomial det (λI − A) = (λ + 1)2(λ − 3).Thus, A has an eigenvalue −1 of multiplicity 2 and an eigenvalue 3 of multiplicity 1. Computation shows that the eigenspace E−1 = Ker(A − (−I)) has basis ⎧ ⎨ ⎩ ⎡ ⎣ −1 2 1 ⎤ ⎦ ⎫ ⎬ ⎭ so dim E−1 = 1 and we are in the situation of Corol- lary 2.21 (b).Then we further compute that E2 −1 = Ker((A − (−I))2) has basis ⎧ ⎨ ⎩ ⎡ ⎣ −1 2 0 ⎤ ⎦ , ⎡ ⎣ 0 0 1 ⎤ ⎦ ⎫ ⎬ ⎭ , therefore is two-dimensional, as we expect. More to the point, we may choose any generalized eigen- vector of index 2, i.e., any vector in E2 −1 that is not in E1 −1, as the top of a chain. We choose u2 = ⎡ ⎣ 0 0 1 ⎤ ⎦ , and then we have u1 = (A − (−I))u2 = ⎡ ⎣ 1 −2 −1 ⎤ ⎦ , and {u1, u2} form a chain. We also compute that, for the eigenvalue 3, the eigenspace E3 has basis ⎧ ⎨ ⎩ v = ⎡ ⎣ −5 −6 1 ⎤ ⎦ ⎫ ⎬ ⎭ . Hence, we see that ⎡ ⎣ 2 1 1 2 1 −2 −1 0 2 ⎤ ⎦ = ⎡ ⎣ 1 0 −5 −2 0 −6 −1 1 1 ⎤ ⎦ ⎡ ⎣ −1 1 0 0 −1 0 0 0 3 ⎤ ⎦ ⎡ ⎣ 1 0 −5 −2 0 −6 −1 1 1 ⎤ ⎦ −1 .
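The dimension counts behind Corollary 2.21, and the chain built in Example 2.25, can be verified directly: E^i_λ is the kernel of (A − λI)^i, and each assertion of the example is a short matrix computation. A minimal SymPy sketch of that verification:

```python
# Verifying Example 2.25: dim E^1_{-1} = 1, dim E^2_{-1} = 2, and the chain {u1, u2}.
import sympy as sp

A = sp.Matrix([[2, 1, 1],
               [2, 1, -2],
               [-1, 0, -2]])
I3 = sp.eye(3)

# dimensions of E^i_{-1} = Ker((A + I)^i)
print(len(((A + I3) ** 1).nullspace()))    # 1, so a single 2-by-2 block for eigenvalue -1
print(len(((A + I3) ** 2).nullspace()))    # 2

# the chain from Example 2.25: u2 has index 2, and u1 = (A + I)u2
u2 = sp.Matrix([0, 0, 1])
u1 = (A + I3) * u2
print(list(u1))                            # [1, -2, -1]
assert (A + I3) * u1 == sp.zeros(3, 1)     # u1 is an ordinary eigenvector for -1

# eigenvector for the eigenvalue 3
v = sp.Matrix([-5, -6, 1])
assert A * v == 3 * v

# assemble P and J and confirm A = P J P^(-1)
P = sp.Matrix.hstack(u1, u2, v)
J = sp.Matrix([[-1, 1, 0],
               [0, -1, 0],
               [0, 0, 3]])
assert sp.simplify(P * J * P.inv() - A) == sp.zeros(3, 3)
```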
  • 28. 20 CHAPTER 1. JORDAN CANONICAL FORM Example 2.26. A = ⎡ ⎣ 2 1 1 −2 −1 −2 1 1 2 ⎤ ⎦ . A has characteristic polynomial det (λI − A) = (λ − 1)3, so A has one eigenvalue 1 of multiplicity three. Computation shows that E1 = Ker(A − I) has basis ⎧ ⎨ ⎩ ⎡ ⎣ −1 0 1 ⎤ ⎦ , ⎡ ⎣ −1 1 0 ⎤ ⎦ ⎫ ⎬ ⎭ , so dim E1 = 2 and we are in the situation of Corollary 2.22 (b). Computation then shows that dim E2 1 = 3(i.e.,(A − I)2 = 0 andE2 1 isallof C3)withbasis ⎧ ⎨ ⎩ ⎡ ⎣ 1 0 0 ⎤ ⎦ , ⎡ ⎣ 0 1 0 ⎤ ⎦ , ⎡ ⎣ 0 0 1 ⎤ ⎦ ⎫ ⎬ ⎭ .Wemaychoose u2 to be any vector in E2 1 that is not in E1 1,and we shall choose u2 = ⎡ ⎣ 1 0 0 ⎤ ⎦ .Then u1 = (A − I)u2 = ⎡ ⎣ 1 −2 1 ⎤ ⎦ , and {u1, u2} form a chain. For the third vector, v, we may choose any vector in E1 such that {u1, v} is linearly independent. We choose v = ⎡ ⎣ −1 0 1 ⎤ ⎦ . Hence, we see that ⎡ ⎣ 2 1 1 −2 −1 2 1 1 2 ⎤ ⎦ = ⎡ ⎣ 1 1 −1 −2 0 0 1 0 1 ⎤ ⎦ ⎡ ⎣ 1 1 0 0 1 0 0 0 1 ⎤ ⎦ ⎡ ⎣ 1 1 −1 −2 0 0 1 0 1 ⎤ ⎦ −1 . Example 2.27. A = ⎡ ⎣ 5 0 1 1 1 0 −7 1 0 ⎤ ⎦ . A has characteristic polynomial det (λI − A) = (λ − 2)3,so A has one eigenvalue 2 of multi- plicity three. Computation shows that E2 = Ker(A − 2I) has basis ⎧ ⎨ ⎩ ⎡ ⎣ −1 −1 3 ⎤ ⎦ ⎫ ⎬ ⎭ , so dim E1 2 = 1 and we are in the situation of Corollary 2.22 (c). Then computation shows that E2 2 = Ker((A − 2I)2) has basis ⎧ ⎨ ⎩ ⎡ ⎣ −1 0 2 ⎤ ⎦ , ⎡ ⎣ −1 2 0 ⎤ ⎦ ⎫ ⎬ ⎭ . (Note that ⎡ ⎣ −1 −1 3 ⎤ ⎦ = 3/2 ⎡ ⎣ −1 0 2 ⎤ ⎦ + 1/2 ⎡ ⎣ −1 2 0 ⎤ ⎦.) Computation then
  • 29. 1.2. THE GENERAL CASE 21 shows that dim E3 2 = 3 (i.e., (A − 2I)3 = 0 and E3 2 is all of C3) with basis ⎧ ⎨ ⎩ ⎡ ⎣ 1 0 0 ⎤ ⎦ , ⎡ ⎣ 0 1 0 ⎤ ⎦ , ⎡ ⎣ 0 0 1 ⎤ ⎦ ⎫ ⎬ ⎭ . We may choose u3 to be any vector in C3 that is not in E2 2, and we shall choose u3 = ⎡ ⎣ 1 0 0 ⎤ ⎦ . Then u2 = (A − 2I)u3 = ⎡ ⎣ 3 1 −7 ⎤ ⎦ and u1 = (A − 2I)u2 = ⎡ ⎣ 2 2 −6 ⎤ ⎦ , and then {u1, u2, u3} form a chain. Hence, we see that ⎡ ⎣ 5 0 1 1 1 0 −7 1 0 ⎤ ⎦ = ⎡ ⎣ 2 3 1 2 1 0 −6 −7 0 ⎤ ⎦ ⎡ ⎣ 2 1 0 0 2 1 0 0 2 ⎤ ⎦ ⎡ ⎣ 2 3 1 2 1 0 −6 −7 0 ⎤ ⎦ −1 . As we have mentioned, we need to work over the complex numbers in order for the theory of JCF to fully apply. But there is an analog over the real numbers, and we conclude this section by stating it. Theorem 2.28. Let A be a real square matrix (i.e., a square matrix with all entries real numbers), and suppose that all of the eigenvalues of A are real numbers. Then A is similar to a real matrix in Jordan Canonical Form. More precisely, A = PJP−1 with P and J real matrices, for some matrix J in Jordan Canonical Form.The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose columns are chains of generalized eigenvectors of A. EXERCISES FOR CHAPTER 1 For each matrix A, write A = PJP−1 with P an invertible matrix and J a matrix in JCF. 1. A = 75 56 −90 −67 , det(λI − A) = (λ − 3)(λ − 5). 2. A = −50 99 −20 39 , det(λI − A) = (λ + 6)(λ + 5). 3. A = −18 9 −49 24 , det(λI − A) = (λ − 3)2.
  • 30. 22 CHAPTER 1. JORDAN CANONICAL FORM 4. A = 1 1 −16 9 , det(λI − A) = (λ − 5)2. 5. A = 2 1 −25 12 , det(λI − A) = (λ − 7)2. 6. A = −15 9 −25 15 , det(λI − A) = λ2. 7. A = ⎡ ⎣ 1 0 0 1 2 −3 1 −1 0 ⎤ ⎦, det(λI − A) = (λ + 1)(λ − 1)(λ − 3). 8. A = ⎡ ⎣ 3 0 2 1 3 1 0 1 1 ⎤ ⎦, det(λI − A) = (λ − 1)(λ − 2)(λ − 4). 9. A = ⎡ ⎣ 5 8 16 4 1 8 −4 −4 −11 ⎤ ⎦, det(λI − A) = (λ + 3)2(λ − 1). 10. A = ⎡ ⎣ 4 2 3 −1 1 −3 2 4 9 ⎤ ⎦, det(λI − A) = (λ − 3)2(λ − 8). 11. A = ⎡ ⎣ 5 2 1 −1 2 −1 −1 −2 3 ⎤ ⎦, det(λI − A) = (λ − 4)2(λ − 2). 12. A = ⎡ ⎣ 8 −3 −3 4 0 −2 −2 1 3 ⎤ ⎦, det(λI − A) = (λ − 2)2(λ − 7). 13. A = ⎡ ⎣ −3 1 −1 −7 5 −1 −6 6 −2 ⎤ ⎦, det(λI − A) = (λ + 2)2(λ − 4). 14. A = ⎡ ⎣ 3 0 0 9 −5 −18 −4 4 12 ⎤ ⎦, det(λI − A) = (λ − 3)2(λ − 4).
  • 31. 1.2. THE GENERAL CASE 23 15. A = ⎡ ⎣ −6 9 0 −6 6 −2 9 −9 3 ⎤ ⎦, det(λI − A) = λ2(λ − 3). 16. A = ⎡ ⎣ −18 42 168 1 −7 −40 −2 6 27 ⎤ ⎦, det(λI − A) = (λ − 3)2(λ + 4). 17. A = ⎡ ⎣ −1 1 −1 −10 6 −5 −6 3 −2 ⎤ ⎦, det(λI − A) = (λ − 1)3. 18. A = ⎡ ⎣ 0 −4 1 2 −6 1 4 −8 0 ⎤ ⎦, det(λI − A) = (λ + 2)3. 19. A = ⎡ ⎣ −4 1 2 −5 1 3 −7 2 3 ⎤ ⎦, det(λI − A) = λ3. 20. A = ⎡ ⎣ −4 −2 5 −1 −1 1 −2 −1 2 ⎤ ⎦, det(λI − A) = (λ + 1)3.
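A hand computation for any of these exercises can be checked mechanically: P must be invertible and PJP^(-1) must reproduce A. A minimal sketch of such a check follows, applied to the factorization already obtained in Example 2.12 (the helper name check_jcf is an illustrative choice, not the book's notation).

```python
# Check a candidate factorization A = P J P^(-1): P invertible, product equal to A.
import sympy as sp

def check_jcf(A, P, J):
    if P.det() == 0:
        return False
    return sp.simplify(P * J * P.inv() - A) == sp.zeros(*A.shape)

# The factorization from Example 2.12.
A = sp.Matrix([[0, 1], [-4, 4]])
P = sp.Matrix([[-2, 1], [-4, 0]])
J = sp.Matrix([[2, 1], [0, 2]])
print(check_jcf(A, P, J))    # True
```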
  • 33. 25 C H A P T E R 2 Solving Systems of Linear Differential Equations 2.1 HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS We will now see how to use Jordan Canonical Form (JCF) to solve systems Y = AY. We begin by describing the strategy we will follow throughout this section. Consider the matrix system Y = AY . Step 1. Write A = PJP −1 with J in JCF, so the system becomes Y = (PJP−1 )Y Y = PJ(P−1 Y) P −1 Y = J(P−1 Y) (P −1 Y) = J(P−1 Y) . (Note that, since P −1 is a constant matrix, we have that (P −1Y) = P−1Y .) Step 2. Set Z = P −1Y, so this system becomes Z = JZ and solve this system for Z. Step 3. Since Z = P −1Y, we have that Y = PZ is the solution to our original system. Examining this strategy, we see that we already know how to carry out Step 1, and also that Step 3 is very easy—it is just matrix multiplication. Thus, the key to success here is being able to carry out Step 2.This is where JCF comes in. As we shall see, it is (relatively) easy to solve Z = JZ when J is a matrix in JCF.
  • 34. 26 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS You will note that throughout this section, in solving Z = JZ, we write the solution as Z = MZC, where MZ is a matrix of functions, called the fundamental matrix of the system, and C is a vector of arbitrary constants. The reason for this will become clear later. (See Remarks 1.12 and 1.14.) Although it is not logically necessary—we may regard a diagonal matrix as a matrix in JCF in which all the Jordan blocks are 1-by-1 blocks—it is illuminating to handle the case when J is diagonal first. Here the solution is very easy. Theorem 1.1. Let J be a k-by-k diagonal matrix, J = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ a1 a2 0 a3 ... 0 ak−1 ak ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ . Then the system Z = JZ has the solution Z = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ea1x ea2x 0 ea3x ... 0 eak−1x eakx ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ C = MZC where C = ⎡ ⎢ ⎢ ⎢ ⎣ c1 c2 ... ck ⎤ ⎥ ⎥ ⎥ ⎦ is a vector of arbitrary constants c1, c2, . . . , ck. Proof. Multiplying out, we see that the system Z = JZ is just the system ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ z1 z2 ... zk ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ a1z1 a2z2 ... akzk ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ .
  • 35. 2.1. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 27 But this system is “uncoupled”, i.e., the equation for zi only involves zi and none of the other functions. Now this equation is very familiar. In general, the differential equation z = az has solution z = ceax, and applying that here we find that Z = JZ has solution Z = ⎡ ⎢ ⎢ ⎢ ⎣ c1ea1x c2ea2x ... ckeakx ⎤ ⎥ ⎥ ⎥ ⎦ , which is exactly the above product MZC. 2 Example 1.2. Consider the system Y = AY where A = 5 −7 2 −4 . We saw in Example 1.16 in Chapter 1 that A = PJP−1 with P = 7 1 2 1 and J = 3 0 0 −2 . Then Z = JZ has solution Z = e3x 0 0 e−2x c1 c2 = MZC = c1e3x c2e−2x and so Y = PZ = PMZC, i.e., Y = 7 1 2 1 e3x 0 0 e−2x c1 c2 = 7e3x e−2x 2e3x e−2x c1 c2 = 7c1e3x + c2e−2x 2c1e3x + c2e−2x . Example 1.3. Consider the system Y = AY where A = ⎡ ⎣ 2 −3 −3 2 −2 −2 −2 1 1 ⎤ ⎦ .
  • 36. 28 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS We saw in Example 2.23 in Chapter 1 that A = PJP−1 with P = ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ and J = ⎡ ⎣ −1 0 0 0 0 0 0 0 2 ⎤ ⎦ . Then Z = JZ has solution Z = ⎡ ⎣ e−x 0 0 0 1 0 0 0 e2x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = MZC and so Y = PZ = PMZC, i.e., Y = ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ ⎡ ⎣ e−x 0 0 0 1 0 0 0 e2x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ e−x 0 −e2x 0 −1 −e2x e−x 1 e2x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ c1e−x − c3e2x −c2 − c3e2x c1e−x + c2 + c3e2x ⎤ ⎦ . We now see how to use JCF to solve systems Y = AY where the coefficient matrix A is not diagonalizable. The key to understanding systems is to investigate a system Z = JZ where J is a matrix consisting of a single Jordan block. Here the solution is not as easy as in Theorem 1.1, but it is still not too hard. Theorem 1.4. Let J be a k-by-k Jordan block with eigenvalue a, J = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ a 1 a 1 0 a 1 ... ... 0 a 1 a ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ .
  • 37. 2.1. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 29 Then the system Z = JZ has the solution Z = eax ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 1 x x2/2! x3/3! · · · xk−1/(k − 1)! 1 x x2/2! · · · xk−2/(k − 2)! 1 x · · · xk−3/(k − 3)! ... ... x 1 ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ C = MZC where C = ⎡ ⎢ ⎢ ⎢ ⎣ c1 c2 ... ck ⎤ ⎥ ⎥ ⎥ ⎦ is a vector of arbitrary constants c1, c2, . . . , ck. Proof. We will prove this in the cases k = 1, 2, and 3, which illustrate the pattern. As you will see, the proof is a simple application of the standard technique for solving first-order linear differential equations. The case k = 1: Here we are considering the system [z1] = [a][z1] which is nothing other than the differential equation z1 = az1 . This differential equation has solution z1 = c1eax , which we can certainly write as [z1] = eaz [1][c1] . The case k = 2: Here we are considering the system z1 z2 = a 1 0 a z1 z2 , which is nothing other than the pair of differential equations z1 = az1 + z2 z2 = az2 .
  • 38. 30 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS We recognize the second equation as having the solution z2 = c2eax and we substitute this into the first equation to get z1 = az1 + c2eax . To solve this, we rewrite this as z1 − az1 = c2eax and recognize that this differential equation has integrating factor e−ax. Multiplying by this factor, we find e−ax (z1 − az1) = c2 (e−ax z1) = c2 e−ax z1 = c2 dx = c1 + c2x so z1 = eax (c1 + c2x) . Thus, our solution is z1 = eax (c1 + c2x) z2 = eax c2 , which we see we can rewrite as z1 z2 = eax 1 x 0 1 c1 c2 . The case k = 3: Here we are considering the system ⎡ ⎣ z1 z2 z3 ⎤ ⎦ = ⎡ ⎣ a 1 0 0 a 1 0 0 a ⎤ ⎦ ⎡ ⎣ z1 z2 z3 ⎤ ⎦ , which is nothing other than the triple of differential equations z1 = az1 + z2 z2 = az2 + z3 z3 = az3 .
  • 39. 2.1. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 31 If we just concentrate on the last two equations, we see we are in the k = 2 case. Referring to that case, we see that our solution is z2 = eax (c2 + c3x) z3 = eax c3 . Substituting the value of z2 into the equation for z1, we obtain z1 = az1 + eax (c2 + c3x) . To solve this, we rewrite this as z1 − az1 = eax (c2 + c3x) and recognize that this differential equation has integrating factor e−ax. Multiplying by this factor, we find e−ax (z1 − az1) = c2 + c3x (e−ax z1) = c2 + c3x e−ax z1 = (c2 + c3x) dx = c1 + c2x + c3(x2 /2) so z1 = eax (c1 + c2x + c3(x2 /2)) . Thus, our solution is z1 = eax (c1 + c2x + c3(x2 /2)) z2 = eax (c2 + c3x) z3 = eax c3 , which we see we can rewrite as ⎡ ⎣ z1 z2 z3 ⎤ ⎦ = eax ⎡ ⎣ 1 x x2/2 0 1 x 0 0 1 ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ . 2 Remark 1.5. Suppose that Z = JZ where J is a matrix in JCF but one consisting of several blocks, not just one block. We can see that this systems decomposes into several systems, one corresponding to each block, and that these systems are uncoupled, so we may solve them each separately, using Theorem 1.4, and then simply assemble these individual solutions together to obtain a solution of the general system.
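The pattern established in Theorem 1.4 can be confirmed symbolically: for a 3-by-3 Jordan block, the fundamental matrix M_Z(x) = e^(ax) [[1, x, x^2/2], [0, 1, x], [0, 0, 1]] satisfies M_Z′ = J M_Z and M_Z(0) = I, so Z = M_Z C solves Z′ = JZ for every constant vector C. A minimal SymPy sketch of that check:

```python
# Check of Theorem 1.4 for k = 3: the fundamental matrix satisfies M' = J M and M(0) = I.
import sympy as sp

x, a = sp.symbols('x a')

J = sp.Matrix([[a, 1, 0],
               [0, a, 1],
               [0, 0, a]])

M = sp.exp(a * x) * sp.Matrix([[1, x, x**2 / 2],
                               [0, 1, x],
                               [0, 0, 1]])

assert sp.simplify(M.diff(x) - J * M) == sp.zeros(3, 3)
assert M.subs(x, 0) == sp.eye(3)
```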
  • 40. 32 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS We now illustrate this (confining our illustrations to the case that A is not diagonalizable, as we have already illustrated the diagonalizable case). Example 1.6. Consider the system Y = AY where A = 0 1 −4 4 . We saw in Example 2.12 in Chapter 1 that A = PJP−1 with P = −2 1 −4 0 and J = 2 1 0 2 . Then Z = JZ has solution Z = e2x 1 x 0 1 c1 c2 = e2x xe2x 0 e2x c1 c2 = MZC = c1e2x + c2xe2x c2e2x and so Y = PZ = PMZC, i.e., Y = −2 1 −4 0 e2x 1 x 0 1 c1 c2 = −2 1 −4 0 e2x xe2x 0 e2x c1 c2 = −2e2x −2xe2x + e2x −4e2x −4xe2x c1 c2 = (−2c1 + c2)e2x − 2c2xe2x −4c1e2x − 4c2xe2x . Example 1.7. Consider the system Y = AY where A = ⎡ ⎣ 2 1 1 2 1 −2 −1 0 −2 ⎤ ⎦ . We saw in Example 2.25 in Chapter 1 that A = PJP−1 with P = ⎡ ⎣ 1 0 −5 −2 0 −6 −1 1 1 ⎤ ⎦ and J = ⎡ ⎣ −1 1 0 0 −1 0 0 0 3 ⎤ ⎦ .
  • 41. 2.1. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 33 Then Z = JZ has solution Z = ⎡ ⎣ e−x xe−x 0 0 e−x 0 0 0 e3x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = MZC and so Y = PZ = PMZC, i.e., Y = ⎡ ⎣ 1 0 −5 −2 0 −6 −1 1 1 ⎤ ⎦ ⎡ ⎣ e−x xe−x 0 0 e−x 0 0 0 e3x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ e−x xe−x −5e3x −2e−x −2xe−x −6e3x −e−x −xe−x + e−x e3x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ c1e−x + c2xe−x − 5c3e3x −2c1e−x − 2c2xe−x − 6c3e3x (−c1 + c2)e−x − c2xe−x + c3e3x ⎤ ⎦ . Example 1.8. Consider the system Y = AY where A = ⎡ ⎣ 2 1 1 −2 −1 −2 1 1 2 ⎤ ⎦ . We saw in Example 2.26 in Chapter 1 that A = PJP−1 with P = ⎡ ⎣ 1 1 1 −2 0 0 1 0 1 ⎤ ⎦ and J = ⎡ ⎣ 1 1 0 0 1 0 0 0 1 ⎤ ⎦ . Then Z = JZ has solution Z = ⎡ ⎣ ex xex 0 0 ex 0 0 0 ex ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = MZC and so Y = PZ = PMZC, i.e., Y = ⎡ ⎣ 1 1 1 −2 0 0 1 0 1 ⎤ ⎦ ⎡ ⎣ ex xex 0 0 ex 0 0 0 ex ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦
  • 42. 34 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS = ⎡ ⎣ ex xex + ex ex −2ex −2xex 0 ex xex ex ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ (c1 + c2 + c3)ex + c2xex −2c1ex − 2c2xex (c1 + c3)ex + c2xex ⎤ ⎦ . Example 1.9. Consider the system Y = AY where A = ⎡ ⎣ 5 0 1 1 1 0 −7 1 0 ⎤ ⎦ . We saw in Example 2.27 in Chapter 1 that A = PJP−1 with P = ⎡ ⎣ 2 3 1 2 1 0 −6 −7 0 ⎤ ⎦ and J = ⎡ ⎣ 2 1 0 0 2 1 0 0 2 ⎤ ⎦ . Then Z = JZ has solution Z = ⎡ ⎣ e2x xe2x (x2/2)e2x 0 e2x xe2x 0 0 e2x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = MZC and so Y = PZ = PMZC, i.e., Y = ⎡ ⎣ 2 3 1 2 1 0 −6 −7 0 ⎤ ⎦ ⎡ ⎣ e2x xe2x (x2/2)e2x 0 e2x xe2x 0 0 e2x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ 2e2x 2xe2x + 3e2x x2e2x + 3xe2x + e2x 2e2x 2xe2x + e2x x2e2x + xe2x −6e2x −6xe2x − 7e2x −3x2e2x − 7xe2x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ (2c1 + 3c2 + c3)e2x + (2c2 + 3c3)xe2x + c3x2e2x (2c1 + c2)e2x + (2c2 + c3)xe2x + c3x2e2x (−6c1 − 7c2)e2x + (−6c2 − 7c3)xe2x − 3c3x2e2x ⎤ ⎦ .
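Each of these general solutions can be verified by substituting back into the system. A minimal sketch of that check for the solution found in Example 1.6 follows; the same few lines verify Examples 1.7 through 1.9 after changing A and Y.

```python
# Verify that the general solution from Example 1.6 satisfies Y' = AY.
import sympy as sp

x, c1, c2 = sp.symbols('x c1 c2')

A = sp.Matrix([[0, 1],
               [-4, 4]])

Y = sp.Matrix([(-2*c1 + c2)*sp.exp(2*x) - 2*c2*x*sp.exp(2*x),
               -4*c1*sp.exp(2*x) - 4*c2*x*sp.exp(2*x)])

assert sp.simplify(Y.diff(x) - A * Y) == sp.zeros(2, 1)
```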
We conclude this section by showing how to solve initial value problems. This is just one more step, given what we have already done.

Example 1.10. Consider the initial value problem Y′ = AY where A = [[0, 1], [−4, 4]], and Y(0) = (3, −8). In Example 1.6, we saw that this system has the general solution

Y = ((−2c1 + c2)e^(2x) − 2c2xe^(2x), −4c1e^(2x) − 4c2xe^(2x)).

Applying the initial condition (i.e., substituting x = 0 in this matrix) gives

(3, −8) = Y(0) = (−2c1 + c2, −4c1),

with solution (c1, c2) = (2, 7). Substituting these values in the above matrix gives

Y = (3e^(2x) − 14xe^(2x), −8e^(2x) − 28xe^(2x)).

Example 1.11. Consider the initial value problem Y′ = AY where A = [[2, 1, 1], [2, 1, −2], [−1, 0, −2]], and Y(0) = (8, 32, 5). In Example 1.7, we saw that this system has the general solution

Y = (c1e^(−x) + c2xe^(−x) − 5c3e^(3x), −2c1e^(−x) − 2c2xe^(−x) − 6c3e^(3x), (−c1 + c2)e^(−x) − c2xe^(−x) + c3e^(3x)).

Applying the initial condition (i.e., substituting x = 0 in this matrix) gives

(8, 32, 5) = Y(0) = (c1 − 5c3, −2c1 − 6c3, −c1 + c2 + c3)
  • 44. 36 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS with solution ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ −7 1 −3 ⎤ ⎦ . Substituting these values in the above matrix gives Y = ⎡ ⎣ −7e−x + xe−x + 15e3x 14e−x − 2xe−x + 18e3x 8e−x − xe−x − 3e3x ⎤ ⎦ . Remark 1.12. There is a variant on our method of solving systems or initial value problems. We have written our solution of Z = JZ as Z = MZC. Let us be more explicit here and write this solution as Z(x) = MZ(x)C . This notation reminds us that Z(x) is a vector of functions, MZ(x) is a matrix of functions, and C is a vector of constants. The key observation is that MZ(0) = I, the identity matrix. Thus, if we wish to solve the initial value problem Z = JZ, Z(0) = Z0 , we find that, in general, Z(x) = MZ(x)C and, in particular, Z0 = Z(0) = MZ(0)C = IC = C , so the solution to this initial value problem is Z(x) = MZ(x)Z0 . Now suppose we wish to solve the system Y = AY.Then, if A = PJP−1, we have seen that this system has solution Y = PZ = PMZC. Let us manipulate this a bit: Y = PMZC = PMZIC = PMZ(P−1 P)C = (PMZP −1 )(PC) . Now let us set MY = PMZP −1, and also let us set = PC. Note that MY is still a matrix of functions, and that is still a vector of arbitrary constants (since P is an invertible constant matrix and C is a vector of arbitrary constants). Thus, with this notation, we see that Y = AY has solution Y = MY .
  • 45. 2.1. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 37 Now suppose we wish to solve the initial value problem Y = AY, Y(0) = Y0 . Rewriting the above solution of Y = AY to explicitly include the independent variable, we see that we have Y(x) = MY (x) and, in particular, Y0 = Y(0) = MY (0) = PMZ(0)P−1 = PIP−1 = , so we see that Y = AY, Y(0) = Y0 has solution Y(x) = MY (x)Y0 . This variant method has pros and cons. It is actually less effective than our original method for solving a single initial value problem (as it requires us to compute P −1 and do some extra matrix multiplication), but it has the advantage of expressing the solution directly in terms of the initial conditions. This makes it more effective if the same system Y = AY is to be solved for a variety of initial conditions. Also, as we see from Remark 1.14 below, it is of considerable theoretical importance. Let us now apply this variant method. Example 1.13. Consider the initial value problem Y = AY where A = 0 1 −4 4 , and Y(0) = a1 a2 . As we have seen in Example 1.6,A = PJP−1 with P = −2 1 −4 0 and J = 2 1 0 2 .Then MZ(x) = e2x xe2x 0 e2x and MY (x) = PMZ(x)P −1 = −2 1 −4 0 e2x xe2x 0 e2x −2 1 −4 0 −1 = e2x − 2xe2x xe2x −4xe2x e2x + 2xe2x so Y(x) = MY (x) a1 a2 = e2x − 2xe2x xe2x −4xe2x e2x + 2xe2x a1 a2 = a1e2x + (−2a1 + a2)xe2x a2e2x + (−4a1 + 2a2)xe2x .
  • 46. 38 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS In particular, if Y(0) = 3 −8 , then Y(x) = 3e2x − 14xe2x −8e2x − 28xe2x , recovering the result of Exam- ple 1.10. But also, if Y(0) = 2 5 , then Y(x) = 2e2x + xe2x 5e2x + 2te2x , and if Y(0) = −4 15 , then Y(x) = −4e2x + 23xe2x 15e2x + 46xe2x , etc. Remark 1.14. In Section 2.4 we will define the matrix exponential, and, with this definition, MZ(x) = eJx and MY (x) = PMZ(x)P−1 = eAx. EXERCISES FOR SECTION 2.1 For each exercise, see the corresponding exercise in Chapter 1. In each exercise: (a) Solve the system Y = AY. (b) Solve the initial value problem Y = AY, Y(0) = Y0. 1. A = 75 56 −90 −67 and Y0 = 1 −1 . 2. A = −50 99 −20 39 and Y0 = 7 3 . 3. A = −18 9 −49 24 and Y0 = 41 98 . 4. A = 1 1 −16 9 and Y0 = 7 16 . 5. A = 2 1 −25 12 and Y0 = −10 −75 . 6. A = −15 9 −25 15 and Y0 = 50 100 .
  • 47. 2.1. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 39 7. A = ⎡ ⎣ 1 0 0 1 2 −3 1 −1 0 ⎤ ⎦ and Y0 = ⎡ ⎣ 6 −10 10 ⎤ ⎦. 8. A = ⎡ ⎣ 3 0 2 1 3 1 0 1 1 ⎤ ⎦ and Y0 = ⎡ ⎣ 0 3 3 ⎤ ⎦. 9. A = ⎡ ⎣ 5 8 16 4 1 8 −4 −4 −11 ⎤ ⎦ and Y0 = ⎡ ⎣ 0 2 −1 ⎤ ⎦. 10. A = ⎡ ⎣ 4 2 3 −1 1 −3 2 4 9 ⎤ ⎦ and Y0 = ⎡ ⎣ 3 2 1 ⎤ ⎦. 11. A = ⎡ ⎣ 5 2 1 −1 2 −1 −1 −2 3 ⎤ ⎦ and Y0 = ⎡ ⎣ −3 2 9 ⎤ ⎦. 12. A = ⎡ ⎣ 8 −3 −3 4 0 −2 −2 1 3 ⎤ ⎦ and Y0 = ⎡ ⎣ 5 8 7 ⎤ ⎦. 13. A = ⎡ ⎣ −3 1 −1 −7 5 −1 −6 6 −2 ⎤ ⎦ and Y0 = ⎡ ⎣ −1 3 6 ⎤ ⎦. 14. A = ⎡ ⎣ 3 0 0 9 −5 −18 −4 4 12 ⎤ ⎦ and Y0 = ⎡ ⎣ 2 −1 1 ⎤ ⎦.
  • 48. 40 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS 15. A = ⎡ ⎣ −6 9 0 −6 6 −2 9 −9 3 ⎤ ⎦ and Y0 = ⎡ ⎣ 1 3 −6 ⎤ ⎦. 16. A = ⎡ ⎣ −18 42 168 1 −7 −40 −2 6 27 ⎤ ⎦ and Y0 = ⎡ ⎣ 7 −2 1 ⎤ ⎦. 17. A = ⎡ ⎣ −1 1 −1 −10 6 −5 −6 3 2 ⎤ ⎦ and Y0 = ⎡ ⎣ 3 10 18 ⎤ ⎦. 18. A = ⎡ ⎣ 0 −4 1 2 −6 1 4 −8 0 ⎤ ⎦ and Y0 = ⎡ ⎣ 2 5 8 ⎤ ⎦. 19. A = ⎡ ⎣ −4 1 2 −5 1 3 −7 2 3 ⎤ ⎦ and Y0 = ⎡ ⎣ 6 11 9 ⎤ ⎦. 20. A = ⎡ ⎣ −4 −2 5 −1 −1 1 −2 −1 2 ⎤ ⎦ and Y0 = ⎡ ⎣ 9 5 8 ⎤ ⎦. 2.2 HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS: COMPLEX ROOTS In this section, we show how to solve a homogeneous system Y = AY where the characteristic polynomial of A has complex roots. In principle, this is the same as the situation where the characteristic polynomial of A has real roots, which we dealt with in Section 2.1, but in practice, there is an extra step in the solution.
  • 49. 2.2. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 41 We will begin by doing an example, which will show us where the difficulty lies, and then we will overcome that difficulty. But first, we need some background. Definition 2.1. For a complex number z, the exponential ez is defined by ez = 1 + z + z2 /2! + z3 /3! + . . . . The complex exponential has the following properties. Theorem 2.2. (1) (Euler) For any θ, eiθ = cos(θ) + i sin(θ) . (2) For any a, d dz (eaz ) = aeaz . (3) For any z1 and z2, ez1+z2 = ez1 ez2 . (4) If z = s + it, then ez = es (cos(t) + i sin(t)) . (5) For any z, ez = ez . Proof. For the proof, see Theorem 2.2 in Appendix A. 2 The following lemma will save us some computations. Lemma 2.3. Let A be a matrix with real entries, and let v be an eigenvector of A with associated eigenvalue λ. Then v is an eigenvector of A with associated eigenvalue λ. Proof. We have that Av = λv, by hypothesis. Let us take the complex conjugate of each side of this equation. Then Av = λv, Av = λv, Av = λv (as A = A since all the entries of A are real) , as claimed. 2
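Lemma 2.3 is easy to see in a computation. A minimal sketch follows, using an illustrative real matrix chosen for this purpose (it is not one of the book's examples): its eigenvalues form a conjugate pair, and conjugating an eigenvector for one eigenvalue produces an eigenvector for the other.

```python
# The eigenvalues of this real matrix are i and -i, a conjugate pair, and the
# conjugate of an eigenvector for i is an eigenvector for -i, as Lemma 2.3 asserts.
import sympy as sp

A = sp.Matrix([[0, -1],
               [1, 0]])

print(A.eigenvals())                       # {I: 1, -I: 1}

v = sp.Matrix([1, -sp.I])                  # eigenvector for the eigenvalue i
assert sp.simplify(A * v - sp.I * v) == sp.zeros(2, 1)

vbar = v.conjugate()                       # its conjugate, (1, i)
assert sp.simplify(A * vbar - (-sp.I) * vbar) == sp.zeros(2, 1)
```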
  • 50. 42 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS Now for our example. Example 2.4. Consider the system Y = AY where A = 2 −17 1 4 . A has characteristic polynomial λ2 − 6λ + 25 with roots λ1 = 3 + 4i and λ2 = λ1 = 3 − 4i, each of multiplicity 1. Thus, λ1 and λ2 are the eigenvalues of A, and we compute that the eigenspace E3+4i = Ker(A − (3 + 4i)I) has basis v1 = −1 + 4i 1 , and hence, by Lemma 2.3, that the eigenspace E3−4i = Ker(A − (3 − 4i)I) has basis v2 = v1 = −1 − 4i 1 . Hence, just as before, A = PJP−1 with P = −1 + 4i −1 − 4i 1 1 and J = 3 + 4i 0 0 3 − 4i . We continue as before, but now we use F to denote a vector of arbitrary constants. (This is just for neatness. Our constants will change, as you will see, and we will use the vector C to denote our final constants, as usual.) Then Z = JZ has solution Z = e(3+4i)x 0 0 e(3−4i)x f1 f2 = MZF = f1e(3+4i)x f2e(3−4i)x and so Y = PZ = PMZF, i.e., Y = −1 + 4i −1 − 4i 1 1 e(3+4i)x 0 0 e(3−4i)x f1 f2 = f1e(3+4i)x −1 + 4i 1 + f2e(3−4i)x −1 − 4i 1 . Now we want our differential equation to have real solutions, and in order for this to be the case, it turns out that we must have f2 = f1. Thus, we may write our solution as Y = f1e(3+4i)x −1 + 4i 1 + f1e(3−4i)x −1 − 4i 1 = f1e(3+4i)x −1 + 4i 1 + f1e(3+4i)x −1 + 4i 1 , where f1 is an arbitrary complex constant. This solution is correct but unacceptable. We want to solve the system Y = AY, where A has real coefficients, and we have a solution which is indeed a real vector, but this vector is expressed in
  • 51. 2.2. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 43 terms of complex numbers and functions. We need to obtain a solution that is expressed totally in terms of real numbers and functions. In order to do this, we need an extra step. In order not to interrupt the flow of exposition, we simply state here what we need to do, and we justify this after the conclusion of the example. We therefore do the following: We simply replace the matrix PMZ by the matrix whose first column is the real part Re(eλ1xv1) = Re e(3+4i)x −1 + 4i 1 , and whose second column is the imaginary part Im(eλ1xv1) = Im e(3+4i)x −1 + 4i 1 , and the vector F by the vector C of arbitrary real constants. We compute e(3+4i)x −1 + 4i 1 = e3x (cos(4x) + i sin(4x)) −1 + 4i 1 = e3x − cos(4x) − 4 sin(4x) cos(4x) + ie3x 4 cos(4x) − sin(4x) sin(4x) and so we obtain Y = e3x(− cos(4x) − 4 sin(4x)) e3x(4 cos(4x) − sin(4x)) e3x cos(4x) e3x sin(4x) c1 c2 = (−c1 + 4c2)e3x cos(4x) + (−4c1 − c2)e3x sin(4x) c1e3x cos(4x) + c2e3x sin(4x) . Now we justify the step we have done. Lemma 2.5. Consider the system Y = AY, where A is a matrix with real entries. Let this system have general solution of the form Y = PMZF = v1 v1 eλ1x 0 0 eλ1x f1 f1 = eλ1xv1 eλ1xv1 f1 f1 , where f1 is an arbitrary complex constant. Then this system also has general solution of the form Y = Re(eλ1xv1) Im(eλ1xv1) c1 c2 , where c1 and c2 are arbitrary real constants. Proof. First note that for any complex number z = x + iy,x = Re(z) = 1 2 (z + z) and y = Im(z) = 1 2i (z − z), and similarly, for any complex vector.
Now $Y' = AY$ has general solution $Y = PM_ZF = PM_Z(RR^{-1})F = (PM_ZR)(R^{-1}F)$ for any invertible matrix $R$. We now (cleverly) choose
$$R = \begin{pmatrix} 1/2 & 1/(2i) \\ 1/2 & -1/(2i) \end{pmatrix}.$$
With this choice of $R$,
$$PM_ZR = \begin{pmatrix} \mathrm{Re}(e^{\lambda_1 x}v_1) & \mathrm{Im}(e^{\lambda_1 x}v_1) \end{pmatrix}.$$
Then
$$R^{-1} = \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}.$$
Since $f_1$ is an arbitrary complex constant, we may (cleverly) choose to write it as $f_1 = \frac{1}{2}(c_1 - ic_2)$ for arbitrary real constants $c_1$ and $c_2$, and with this choice
$$R^{-1}F = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},$$
yielding a general solution as claimed. □

We now solve $Y' = AY$ where $A$ is a real 3-by-3 matrix with a pair of complex eigenvalues and a third, real eigenvalue. As you will see, we use the idea of Lemma 2.5 to simply replace the "relevant" columns of $PM_Z$ in order to obtain our final solution.

Example 2.6. Consider the system $Y' = AY$ where
$$A = \begin{pmatrix} 15 & -16 & 8 \\ 10 & -10 & 5 \\ 0 & 1 & 2 \end{pmatrix}.$$
$A$ has characteristic polynomial $(\lambda^2 - 2\lambda + 5)(\lambda - 5)$ with roots $\lambda_1 = 1 + 2i$, $\lambda_2 = \bar{\lambda}_1 = 1 - 2i$, and $\lambda_3 = 5$, each of multiplicity 1. Thus, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the eigenvalues of $A$, and we compute that the eigenspace $E_{1+2i} = \mathrm{Ker}(A - (1+2i)I)$ has basis $\left\{ v_1 = \begin{pmatrix} -2+2i \\ -1+2i \\ 1 \end{pmatrix} \right\}$, and hence, by Lemma 2.3, that the eigenspace $E_{1-2i} = \mathrm{Ker}(A - (1-2i)I)$ has basis $\left\{ v_2 = \bar{v}_1 = \begin{pmatrix} -2-2i \\ -1-2i \\ 1 \end{pmatrix} \right\}$. We further compute that the eigenspace $E_5 = \mathrm{Ker}(A - 5I)$ has basis $\left\{ v_3 = \begin{pmatrix} 4 \\ 3 \\ 1 \end{pmatrix} \right\}$. Hence, just as
  • 53. 2.2. HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 45 before, A = PJP−1 with P = ⎡ ⎣ −2 + 2i −2 − 2i 4 −1 + 2i −1 − 2i 3 1 1 1 ⎤ ⎦ and J = ⎡ ⎣ 1 + 2i 0 0 0 1 − 2i 0 0 0 5 ⎤ ⎦ . Then Z = JZ has solution Z = ⎡ ⎣ e(1+2i)x 0 0 0 e(1−2i)x 0 0 0 e5x ⎤ ⎦ ⎡ ⎣ f1 f1 c3 ⎤ ⎦ = MZF = ⎡ ⎢ ⎣ f1e(1+2i)x f1e(1+2i)x c3e5x ⎤ ⎥ ⎦ and so Y = PZ = PMZF, i.e., Y = ⎡ ⎣ −2 + 2i −2 − 2i 4 −1 + 2i −1 − 2i 3 1 1 1 ⎤ ⎦ ⎡ ⎣ e(1+2i)x 0 0 0 e(1−2i)x 0 0 0 e5x ⎤ ⎦ ⎡ ⎣ f1 f1 c3 ⎤ ⎦ . Now e(1+2i)x ⎡ ⎣ −2 + 2i −1 + 2i 1 ⎤ ⎦ = ex (cos(2x) + i sin(2x)) ⎡ ⎣ −2 + 2i −1 + 2i 1 ⎤ ⎦ = ⎡ ⎣ ex(−2 cos(2x) − 2 sin(2x)) ex(− cos(2x) − 2 sin(2x)) ex cos(2x) ⎤ ⎦ + i ⎡ ⎣ ex(2 cos(2x) − 2 sin(2x)) ex(2 cos(2x) − sin(2x)) ex sin(2x) ⎤ ⎦ and of course e5x ⎡ ⎣ 4 3 1 ⎤ ⎦ = ⎡ ⎣ 4e5x 3e5x e5x ⎤ ⎦ , so, replacing the relevant columns of PMZ, we find Y = ⎡ ⎣ ex(−2 cos(2x) − 2 sin(2x)) ex(2 cos(2x) − 2 sin(2x)) 4e5x ex(− cos(2x) − 2 sin(2x)) ex(2 cos(2x) − sin(2x)) 3e5x ex cos(2x) ex sin(2x) e5x ⎤ ⎦ ⎡ ⎣ c1 c2 c3 ⎤ ⎦ = ⎡ ⎣ (−2c1 + 2c2)ex cos(2x) + (−2c1 − 2c2)ex sin(2x) + 4c3e5x (−c1 + 2c2)ex cos(2x) + (−2c1 − c2)ex sin(2x) + 3c3e5x c1ex cos(2x) + c2ex sin(2x) + c3e5x ⎤ ⎦ .
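One can check this construction numerically. The following sketch, assuming NumPy and SciPy are available, builds the real matrix of functions used above, with columns Re(e^{λ1x}v1), Im(e^{λ1x}v1), and e^{5x}v3, and confirms that it agrees with e^{Ax} applied to its value at x = 0, so its columns really are solutions of Y' = AY; the function name is ours.

```python
# A numerical check of Example 2.6 (sketch only).
import numpy as np
from scipy.linalg import expm

A = np.array([[15.0, -16.0, 8.0],
              [10.0, -10.0, 5.0],
              [ 0.0,   1.0, 2.0]])
lam1 = 1 + 2j
v1 = np.array([-2 + 2j, -1 + 2j, 1 + 0j])
v3 = np.array([4.0, 3.0, 1.0])

def fundamental(x):
    """Columns: Re(e^{lam1 x} v1), Im(e^{lam1 x} v1), e^{5x} v3."""
    w = np.exp(lam1 * x) * v1
    return np.column_stack([w.real, w.imag, np.exp(5 * x) * v3])

x = 0.3
# Each column solves Y' = AY, so the whole matrix must equal e^{Ax} times its value at 0.
assert np.allclose(fundamental(x), expm(A * x) @ fundamental(0))
print("real fundamental matrix for Example 2.6 checks out")
```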
  • 54. 46 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS EXERCISES FOR SECTION 2.2 In Exercises 1–4: (a) Solve the system Y = AY. (b) Solve the initial value problem Y = AY, Y(0) = Y0. In Exercises 5 and 6, solve the system Y = AY. 1. A = 3 5 −2 5 , det(λI − A) = λ2 − 8λ + 25, and Y0 = 8 13 . 2. A = 3 4 −2 7 , det(λI − A) = λ2 − 10λ + 29, and Y0 = 3 5 . 3. A = 5 13 −1 9 , det(λI − A) = λ2 − 14λ + 58, and Y0 = 2 1 . 4. A = 7 17 −4 11 , det(λI − A) = λ2 − 18λ + 145, and Y0 = 5 2 . 5. A = ⎡ ⎣ 37 10 20 −59 −9 −24 −33 −12 −21 ⎤ ⎦, det(λI − A) = (λ2 − 4λ + 29)(λ − 3). 6. A = ⎡ ⎣ −4 −42 15 4 25 −10 6 32 −13 ⎤ ⎦, det(λI − A) = (λ2 − 6λ + 13)(λ − 2). 2.3 INHOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS In this section, we show how to solve an inhomogeneous system Y = AY + G(x) where G(x) is a vector of functions. (We will often abbreviate G(x) by G). We use a method that is a direct generalization of the method we used for solving a homogeneous system in Section 2.1. Consider the matrix system Y = AY + G .
  • 55. 2.3. INHOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 47 Step 1. Write A = PJP−1 with J in JCF, so the system becomes Y = (PJP−1 )Y + G Y = PJ(P−1 Y) + G P −1 Y = J(P−1 Y) + P −1 G (P−1 Y) = J(P−1 Y) + P −1 G . (Note that, since P −1 is a constant matrix, we have that (P −1Y) = P−1Y .) Step 2. Set Z = P −1Y and H = P−1G, so this system becomes Z = JZ + H and solve this system for Z. Step 3. Since Z = P −1Y, we have that Y = PZ is the solution to our original system. Again,the key to this method is to be able to perform Step 2,and again this is straightforward. Within each Jordan block,we solve from the bottom up.Let us focus our attention on a single k-by-k block.The equation for the last function zk in that block is an inhomogeneous first-order differential equation involving only zk,and we go ahead and solve it.The equation for the next to the last function zk−1 in that block is an inhomogeneous first-order differential equation involving only zk−1 and zk. We substitute in our solution for zk to obtain an inhomogeneous first-order differential equation for zk−1 involving only zk−1, and we go ahead and solve it, etc. In principle, this is the method we use. In practice, using this method directly is solving each system “by hand,” and instead we choose to “automate” this procedure.This leads us to the following method. In order to develop this method we must begin with some preliminaries. For a fixed matrix A, we say that the inhomogeneous system Y = AY + G(x) has associated homogeneous system Y = AY. By our previous work, we know how to find the general solution of Y = AY. First we shall see that, in order to find the general solution of Y = AY + G(x), it suffices to find a single solution of that system. Lemma 3.1. Let Yi be any solution of Y = AY + G(x). If Yh is any solution of the associated ho- mogeneous system Y = AY, then Yh + Yi is also a solution of Y = AY + G(x), and every solution of Y = AY + G(x) is of this form. Consequently,thegeneralsolutionof Y = AY + G(x)isgivenbyY = YH + Yi,whereYH denotes the general solution of Y = AY.
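Before turning to the proof, the content of the lemma can be illustrated concretely. The sketch below, assuming SymPy is available, uses closed forms that will be obtained later: the general homogeneous solution from Example 1.2 and the particular solution from Example 3.3 (both quoted again in Example 3.7), and checks that their sum solves the inhomogeneous system.

```python
# An illustration of Lemma 3.1 with data that reappears in Examples 3.3 and 3.7.
import sympy as sp

x, c1, c2 = sp.symbols('x c1 c2')
A = sp.Matrix([[5, -7], [2, -4]])
G = sp.Matrix([30 * sp.exp(x), 60 * sp.exp(2 * x)])

# General homogeneous solution (Example 1.2) and one particular solution (Example 3.3).
YH = sp.Matrix([7 * c1 * sp.exp(3 * x) + c2 * sp.exp(-2 * x),
                2 * c1 * sp.exp(3 * x) + c2 * sp.exp(-2 * x)])
Yi = sp.Matrix([-25 * sp.exp(x) + 105 * sp.exp(2 * x),
                -10 * sp.exp(x) + 45 * sp.exp(2 * x)])

Y = YH + Yi
assert sp.simplify(YH.diff(x) - A * YH) == sp.zeros(2, 1)       # YH solves Y' = AY
assert sp.simplify(Y.diff(x) - A * Y - G) == sp.zeros(2, 1)     # YH + Yi solves Y' = AY + G
print("Lemma 3.1 illustrated: YH + Yi solves the inhomogeneous system")
```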
  • 56. 48 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS Proof. First we check that Y = Yh + Yi is a solution of Y = AY + G(x). We simply compute Y = (Yh + Yi) = Yh + Yi = (AYh) + (AYi + G) = A(Yh + Yi) + G = AY + G as claimed. Now we check that every solution Y of Y = AY + G(x) is of this form. So let Y be any solution of this inhomogeneous system.We can certainly write Y = (Y − Yi) + Yi = Yh + Yi where Yh = Y − Yi. We need to show that Yh defined in this way is indeed a solution of Y = AY. Again we compute Yh = (Y − Yi) = Y − Yi = (AY + G) − (AYi + G) = A(Y − Yi) = AYh as claimed. 2 (It is common to call Yi a particular solution of the inhomogeneous system.) Let us now recall our work from Section 2.1, and keep our previous notation. The homoge- neous system Y = AY has general solution YH = PMZC where C is a vector of arbitrary constants. Let us set NY = NY (x) = PMZ(x) for convenience, so YH = NY C. Then YH = (NY C) = NY C, and then, substituting in the equation Y = AY, we obtain the equation NY C = ANY C. Since this equation must hold for any C, we conclude that NY = ANY . We use this fact to write down a solution to Y = AY + G. We will verify by direct computation that the function we write down is indeed a solution. This verification is not a difficult one, but nevertheless it is a fair question to ask how we came up with this function. Actually, it can be derived in a very natural way, but the explanation for this involves the matrix exponential and so we defer it until Section 2.4. Nevertheless, once we have this solution (no matter how we came up with it) we are certainly free to use it. It is convenient to introduce the following nonstandard notation. For a vector H(x), we let 0 H(x)dx denote an arbitrary but fixed antiderivative of H(x). In other words, in obtaining 0 H(x)dx, we simply ignore the constants of integration. This is legitimate for our purposes, as by Lemma 3.1 we only need to find a single solution to an inhomogeneous system, and it doesn’t matter which one we find—any one will do. (Otherwise said, we can “absorb” the constants of integration into the general solution of the associated homogeneous system.) Theorem 3.2. The function Yi = NY 0 N−1 Y G dx is a solution of the system Y = AY + G.
  • 57. 2.3. INHOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 49 Proof. We simply compute Yi . We have Yi = NY 0 N−1 Y G dx = NY N−1 Y G dx + NY 0 N−1 Y G dx by the product rule = NY 0 N−1 Y G dx + NY (N−1 Y G) by the definition of the antiderivative = NY 0 N−1 Y G dx + G = (ANY ) 0 N−1 Y G dx + G as NY = ANY = A NY 0 N−1 Y G dx + G = AYi + G as claimed. 2 We now do a variety of examples: a 2-by-2 diagonalizable system, a 2-by-2 nondiagonalizable system, a 3-by-3 diagonalizable system, and a 2-by-2 system in which the characteristic polynomial has complex roots. In all these examples, when it comes to finding N−1 Y , it is convenient to use the fact that N−1 Y = (PMZ)−1 = M−1 Z P −1. Example 3.3. Consider the system Y = AY + G where A = 5 −7 2 −4 and G = 30ex 60e2x . We saw in Example 1.2 that P = 7 1 2 1 and MZ = e3x 0 0 e−2x , and NY = PMZ. Then N−1 Y G = e−3x 0 0 e2x (1/5) 1 −1 −2 7 30ex 60e2x = 6e−2x − 12e−x −12e3x + 84e4x . Then 0 N−1 Y G = −3e−2x + 12e2x −4e3x + 21e4x
  • 58. 50 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS and Yi = NY 0 N−1 Y G = 7 1 2 1 e3x 0 0 e−2x −3e−2x + 12e2x −4e3x + 21e4x = −25ex + 105e2x −10ex + 45e2x . Example 3.4. Consider the system Y = AY + G where A = 0 1 −4 4 and G = 60e3x 72e5x . We saw in Example 1.6 that P = −2 1 4 0 and MZ = e2x 1 x 0 1 , and NY = PMZ. Then N−1 Y G = e−2x 1 −x 0 1 (1/4) 0 −1 4 −2 60e3x 72e5x = −18e3x − 60xex + 36xe3x 60ex − 36e3x . Then 0 N−1 Y G = 60ex − 60xex − 10e3x + 12xe3x 60ex − 12e3x and Yi = NY 0 N−1 Y G = −2 1 −4 0 e2x 1 x 0 1 60ex − 60xex − 10e3x + 12xe3x 60ex − 12e3x = −60e3x + 8e5x −240e3x + 40e5x . Example 3.5. Consider the system Y = AY + G where A = ⎡ ⎣ 2 −3 −3 2 −2 −2 −2 1 1 ⎤ ⎦ and G = ⎡ ⎣ ex 12e3x 20e4x ⎤ ⎦ .
  • 59. 2.3. INHOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS 51 We saw in Example 1.3 that P = ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ and MZ = ⎡ ⎣ e−x 0 0 0 1 0 0 0 e2x ⎤ ⎦ , and NY = PMZ. Then N−1 Y G = ⎡ ⎣ ex 0 0 0 1 0 0 0 e−2x ⎤ ⎦ ⎡ ⎣ 0 1 1 1 −2 −1 −1 1 1 ⎤ ⎦ ⎡ ⎣ ex 12e3x 20e4x ⎤ ⎦ = ⎡ ⎣ 12e4x + 20e5x ex − 24e3x − 20e4x −e−x + 12ex + 20e2x ⎤ ⎦ . Then 0 N−1 Y G = ⎡ ⎣ 3e4x + 4e5x ex − 8e3x − 5e4x e−x + 12ex + 10e2x ⎤ ⎦ and Yi = NY 0 N−1 Y G = ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ ⎡ ⎣ e−x 0 0 0 1 0 0 0 e2x ⎤ ⎦ ⎡ ⎣ 3e4x + 4e5x ex − 8e3x − 5e4x e−x + 12ex + 10e2x ⎤ ⎦ = ⎡ ⎣ −ex − 9e3x − 6e4x −2ex − 4e3x − 5e4x 2ex + 7e3x + 9e4x ⎤ ⎦ . Example 3.6. Consider the system Y = AY + G where A = 2 −17 1 4 and G = 200 160ex . We saw in Example 2.4 that P = −1 + 4i −1 − 4i 1 1 and MZ = e(3+4i)x 0 0 e(3−4i)x , and NY = PMZ. Then N−1 Y G = e−(3+4i)x 0 0 e−(3−4i)x (1/(8i)) 1 1 + 4i 1 1 − 4i 200 160ex = −25e(−3−4i)x + 20(4 − i)e(−2−4i)x 25e(−3+4i)x + 20(4 + i)e(−2+4i)x .
  • 60. 52 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS Then 0 N−1 Y G = (4 + 3i)e(−3−4i)x + (−4 + 18i)e(−2−4i)x (4 − 3i)e(−3+4i)x + (−4 − 18i)e(−2+4i)x and Yi = NY 0 N−1 Y G = −1 + 4i −1 − 4i 1 1 e(3+4i)x 0 0 e(3−4i)x (4 + 3i)e(−3−4i)x + (−4 + 18i)e(−2−4i)x (4 − 3i)e(−3+4i)x + (−4 − 18i)e(−2+4i)x = −1 + 4i −1 − 4i 1 1 (4 + 3i) + (−4 + 18i)ex (4 − 3i) + (−4 − 18i)ex = −32 − 136ex 8 − 8ex . (Note that in this last example we could do arithmetic with complex numbers directly, i.e., without having to convert complex exponentials into real terms.) Once we have done this work, it is straightforward to solve initial value problems. We do a single example that illustrates this. Example 3.7. Consider the initial value problem Y = AY + G, Y(0) = 7 17 , where A = 5 −7 2 −4 and G = 30ex 60e2x . We saw in Example 1.2 that the associated homogenous system has general solution YH = 7c1e3x + c2e−2x 2c1e3x + c2e−2x and in Example 3.3 that the original system has a particular solution Yi = −25ex + 105e2x −10ex + 45e2x . Thus, our original system has general solution Y = YH + Yi = 7c1e3x + c2e−2x − 25ex + 105e2x 2c1e3x + c2e−2x − 10ex + 45e2x . We apply the initial condition to obtain the linear system Y(0) = 7c1 + c2 + 80 2c1 + c2 + 35 = 7 17
  • 61. 2.4. THE MATRIX EXPONENTIAL 53 with solution c1 = −11, c2 = 4. Substituting, we find that our initial value problem has solution Y = −77e3x + 4e−2x − 25ex + 105e2x −22e3x + 4e−2x − 10ex + 45e2x . EXERCISES FOR SECTION 2.3 In each exercise, find a particular solution Yi of the system Y = AY + G(x), where A is the matrix of the correspondingly numbered exercise for Section 2.1, and G(x) is as given. 1. G(x) = 2e8x 3e4x . 2. G(x) = 2e−7x 6e−8x . 3. G(x) = e4x 4e5x . 4. G(x) = e6x 9e8x . 5. G(x) = 9e10x 25e12x . 6. G(x) = 5e−x 12e2x . 7. G(x) = ⎡ ⎣ 1 3e2x 5e4x ⎤ ⎦. 8. G(x) = ⎡ ⎣ 8 3e3x 3e5x ⎤ ⎦. 2.4 THE MATRIX EXPONENTIAL In this section, we will discuss the matrix exponential and its use in solving systems Y = AY. Our first task is to ask what it means to take a matrix exponential. To answer this, we are guided by ordinary exponentials. Recall that, for any complex number z, the exponential ez is given by ez = 1 + z + z2 /2! + z3 /3! + z4 /4! + . . . .
  • 62. 54 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS With this in mind, we define the matrix exponential as follows. Definition 4.1. Let T be a square matrix. Then the matrix exponential eT is defined by eT = I + T + 1 2! T 2 + 1 3! T 3 + 1 4! T 4 + . . . . (For this definition to make sense we need to know that this series always converges, and it does.) Recall that the differential equation y = ay has the solution y = ceax. The situation for Y = AY is very analogous. (Note that we use rather than C to denote a vector of constants for reasons that will become clear a little later. Note that is on the right in Theorem 4.2 below, a consequence of the fact that matrix multiplication is not commutative.) Theorem 4.2. (1) Let A be a square matrix. Then the general solution of Y = AY is given by Y = eAx where is a vector of arbitrary constants. (2) The initial value problem Y = AY, Y(0) = Y0 has solution Y = eAx Y0 . Proof. (Outline) (1) We first compute eAx. In order to do so, note that (Ax)2 = (Ax)(Ax) = (AA)(xx) = A2x2 as matrix multiplication commutes with scalar multiplication, and (Ax)3 = (Ax)2(Ax) = (A2x2)(Ax) = (A2A)(x2x) = A3x3, and similarly, (Ax)k = Akxk for any k. Then, substituting in Definition 4.1, we have that Y = eAx = (I + Ax + 1 2! A2 x2 + 1 3! A3 x3 + 1 4! A4 x4 + . . .) .
  • 63. 2.4. THE MATRIX EXPONENTIAL 55 To find Y , we may differentiate this series term-by-term. (This claim requires proof, but we shall not give it here.) Remembering that A and are constant matrices, we see that Y = (A + 1 2! A2 (2x) + 1 3! A3 (3x2 ) + 1 4! A4 (4x3 ) + . . .) = (A + A2 x + 1 2! A3 x2 + 1 3! A4 x3 + . . .) = A(I + Ax + 1 2! A2 x2 + 1 3! A3 x3 + . . .) = A(eAx ) = AY as claimed. (2) By (1) we know that Y = AY has solution Y = eAx . We use the initial condition to solve for . Setting x = 0, we have: Y0 = Y(0) = eA0 = e0 = I = (where e0 means the exponential of the zero matrix, and the value of this is the identity matrix I, as is apparent from Definition 4.1), so = Y0 and Y = eAx = eAxY0. 2 In the remainder of this section we shall see how to translate the theoretical solution of Y = AY given by Theorem 4.2 into a practical one. To keep our notation simple, we will stick to 2-by-2 or 3-by-3 cases, but the principle is the same regardless of the size of the matrix. One case is relatively easy. Lemma 4.3. If J is a diagonal matrix, J = ⎡ ⎢ ⎢ ⎢ ⎣ d1 d2 0 0 ... dn ⎤ ⎥ ⎥ ⎥ ⎦ then eJx is the diagonal matrix eJx = ⎡ ⎢ ⎢ ⎢ ⎣ ed1x ed2x 0 0 ... ednx ⎤ ⎥ ⎥ ⎥ ⎦ .
  • 64. 56 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS Proof. Suppose, for simplicity, that J is 2-by-2, J = d1 0 0 d2 . Then you can easily compute that J2 = d1 2 0 0 d2 2 , J3 = d1 3 0 0 d2 3 , and similarly, Jk = d1 k 0 0 d2 k for any k. Then, as in the proof of Theorem 4.2, eJx = I + Jx + 1 2! J2 x2 + 1 3! J3 x3 + 1 4! J4 x4 + . . . = 1 0 0 1 + d1 0 0 d2 x + 1 2! d1 2 0 0 d2 2 x2 + 1 3! d1 3 0 0 d2 3 x3 + . . . = 1 + d1x + 1 2! (d1x)2 + 1 3! (d1x)3 + . . . 0 0 1 + d2x + 1 2! (d2x)2 + 1 3! (d2x)3 + . . . which we recognize as = ed1x 0 0 ed2x . 2 Example 4.4. We wish to find the general solution of Y = JY where J = 3 0 0 −2 . To do so we directly apply Theorem 4.2 and Lemma 4.3. The solution is given by y1 y2 = Y = eJx = e3x 0 0 e−2x γ1 γ2 = γ1e3x γ2e−2x . Now suppose we want to find the general solution of Y = AY where A = 5 −7 2 −4 .We may still apply Theorem 4.2 to conclude that the solution is Y = eAx . We again try to calculate eAx. Now we find A = 5 −7 2 −4 , A2 = 11 −7 2 2 , A3 = 41 −49 14 −22 , . . .
  • 65. 2.4. THE MATRIX EXPONENTIAL 57 so eAx = 1 0 0 1 + 5 −7 2 −4 x + 1 2! 11 −7 2 2 x2 + 1 3! 41 −49 14 −22 x3 + . . . , which looks like a hopeless mess. But, in fact, the situation is not so hard! Lemma 4.5. Let S and T be two matrices and suppose S = PT P−1 for some invertible matrix P. Then Sk = PT k P −1 for every k and eS = PeT P −1 . Proof. We simply compute S2 = SS = (PT P−1 )(PT P−1 ) = PT (P−1 P)T P−1 = PT IT P−1 = PT T P−1 = PT 2 P −1 , S3 = S2 S = (PT 2 P −1 )(PT P−1 ) = PT 2 (P −1 P)T P −1 = PT 2 IT P −1 = PT 2 T P −1 = PT 3 P−1 , S4 = S3 S = (PT 3 P −1 )(PT P−1 ) = PT 3 (P−1 P)T P−1 = PT 3 IT P −1 = PT 3 T P −1 = PT 4 P−1 , etc. Then eS = I + S + 1 2! S2 + 1 3! S3 + 1 4! S4 + . . . = PIP −1 + PT P −1 + 1 2! PT 2 P −1 + 1 3! PT 3 P −1 + 1 4! PT 4 P −1 + . . . = P(I + T + 1 2! T 2 + 1 3! T 3 + 1 4! T 4 + . . .)P−1 = PeT P−1 as claimed. 2
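A quick numerical check of Lemma 4.5 is possible with the matrices that appear in Example 4.6 below. The sketch assumes NumPy and SciPy are available; SciPy's expm supplies the left-hand side.

```python
# Numerical illustration of Lemma 4.5: since A = P J P^{-1}, e^{Ax} = P e^{Jx} P^{-1}.
import numpy as np
from scipy.linalg import expm

A = np.array([[5.0, -7.0], [2.0, -4.0]])
P = np.array([[7.0, 1.0], [2.0, 1.0]])
J = np.diag([3.0, -2.0])

assert np.allclose(P @ J @ np.linalg.inv(P), A)    # A = P J P^{-1}

x = 0.8
lhs = expm(A * x)
rhs = P @ np.diag(np.exp(np.diag(J) * x)) @ np.linalg.inv(P)   # e^{Jx} is diagonal (Lemma 4.3)
assert np.allclose(lhs, rhs)
print("e^{Ax} = P e^{Jx} P^{-1} verified numerically")
```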
  • 66. 58 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS With this in hand let us return to our problem. Example 4.6. (Compare Example 1.2.) We wish to find the general solution of Y = AY where A = 5 −7 2 −4 . We saw in Example 1.16 in Chapter 1 that A = PJP−1 with P = 7 1 2 1 and J = 3 0 0 −2 . Then eAx = PeJx P −1 = 7 1 2 1 e3x 0 0 e−2x 7 1 2 1 −1 = 7 5 e3x − 2 5 e−2x −7 5 e3x + 7 5 e−2x 2 5 e3x − 2 5 e−2x − 2 5 e3x + 7 5 e−2x and Y = eAx = eAx γ1 γ2 = (7 5 γ1 − 7 5 γ2)e3x + (−2 5 γ1 + 7 5 γ2)e−2x (2 5 γ1 − 2 5 γ2)e3x + (−2 5 γ1 + 7 5 γ2)e−2x . Example 4.7. (Compare Example 1.3.) We wish to find the general solution of Y = AY where A = ⎡ ⎣ 2 −3 −3 2 −2 −2 −2 1 1 ⎤ ⎦ . We saw in Example 2.23 in Chapter 1 that A = PJP−1 with P = ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ and J = ⎡ ⎣ −1 0 0 0 0 0 0 0 2 ⎤ ⎦ .
  • 67. 2.4. THE MATRIX EXPONENTIAL 59 Then eAx = PeJx P −1 = ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ ⎡ ⎣ e−x 0 0 0 1 0 0 0 e2x ⎤ ⎦ ⎡ ⎣ 1 0 −1 0 −1 −1 1 1 1 ⎤ ⎦ −1 = ⎡ ⎣ e2x e−x − e2x e−x − e2x −1 + e2x 2 − e2x 1 − e2x 1 − e2x e−x − 2 + e2x e−x − 1 + e2x ⎤ ⎦ and Y = eAx = eAx ⎡ ⎣ γ1 γ2 γ3 ⎤ ⎦ = ⎡ ⎣ (γ2 + γ3)e−x + (γ1 − γ2 − γ3)e2x (−γ1 + 2γ2 + γ3) + (γ1 − γ2 − γ3)e2x (γ2 + γ3)e−x + (γ1 − 2γ2 − γ3) + (−γ1 + γ2 + γ3)e2x ⎤ ⎦ . Now suppose we want to solve the initial value problem Y = AY, Y(0) = ⎡ ⎣ 1 0 0 ⎤ ⎦. Then Y = eAx Y(0) = ⎡ ⎣ e2x e−x − e2x e−x − e2x −1 + e2x 2 − e2x 1 − e2x 1 − e2x e−x − 2 + e2x e−x − 1 + e2x ⎤ ⎦ ⎡ ⎣ 1 0 0 ⎤ ⎦ = ⎡ ⎣ e2x −1 + e2x 1 − e2x ⎤ ⎦ . Remark 4.8. Let us compare the results of our method here with that of our previous method. In the case of Example 4.6, our previous method gives the solution Y = P e3x 0 0 e−2x C = PeJx C
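One can confirm this closed-form answer numerically. The following sketch, assuming SciPy's expm is available, compares it with e^{Ax}Y(0) at a few sample points.

```python
# A quick check of the initial value problem at the end of Example 4.7:
# Y(0) = (1, 0, 0)^T should give Y(x) = (e^{2x}, -1 + e^{2x}, 1 - e^{2x})^T.
import numpy as np
from scipy.linalg import expm

A = np.array([[ 2.0, -3.0, -3.0],
              [ 2.0, -2.0, -2.0],
              [-2.0,  1.0,  1.0]])
Y0 = np.array([1.0, 0.0, 0.0])

for x in (0.0, 0.5, 1.2):
    Y = expm(A * x) @ Y0
    expected = np.array([np.exp(2 * x), -1 + np.exp(2 * x), 1 - np.exp(2 * x)])
    assert np.allclose(Y, expected)
print("closed-form solution of Example 4.7 confirmed at sample points")
```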
  • 68. 60 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS where J = 3 0 0 −2 , while our method here gives Y = PeJx P −1 . But note that these answers are really the same! For P−1 is a constant matrix, so if is a vector of arbitrary constants, then so is P −1 , and we simply set C = P−1 . Similarly, in the case of Example 4.7, our previous method gives the solution Y = P ⎡ ⎣ e−x 0 0 0 1 0 0 0 e2x ⎤ ⎦ C = PeJx C where J = ⎡ ⎣ −1 0 0 0 0 0 0 0 2 ⎤ ⎦, while our method here gives Y = PeJx P −1 and again, setting C = P−1 , we see that these answers are the same. So the point here is not that the matrix exponential enables us to solve new problems, but rather that it gives a new viewpoint about the solutions that we have already obtained. While these two methods are in principle the same, we may ask which is preferable in practice. In this regard we see that our earlier method is better, as the use of the matrix exponential requires us to find P−1, which may be a considerable amount of work. However, this advantage is (partially) negated if we wish to solve initial value problems, as the matrix exponential method immediately gives the unknown constants , as = Y(0), while in the former method we must solve a linear system to obtain the unknown constants C. Now let us consider the nondiagonalizable case. Suppose Z = JZ where J is a matrix con- sisting of a single Jordan block.Then by Theorem 4.2 this has the solution Z = eJx . On the other
  • 69. 2.4. THE MATRIX EXPONENTIAL 61 hand,inTheorem 1.1 we already saw that this system has solution Z = MZC.In this case,we simply have C = , so we must have eJx = MZ. Let us see that this is true by computing eJx directly. Theorem 4.9. Let J be a k-by-k Jordan block with eigenvalue a, J = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ a 1 a 1 0 a 1 ... ... 0 a 1 a ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ . Then eJx = eax ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 1 x x2/2! x3/3! · · · xk−1/(k − 1)! 0 1 x x2/2! · · · xk−2/(k − 2)! 1 x · · · xk−3/(k − 3)! ... ... x 1 ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ . Proof. First suppose that J is a 2-by-2 Jordan block, J = a 1 0 a . Then J2 = a2 2a 0 a2 , J3 = a3 3a2 0 a3 , J4 = a4 4a3 0 a4 , … so eJx = 1 0 0 1 + a 1 0 a x + 1 2! a2 2a 0 a2 x2 + 1 3! a3 3a2 0 a3 x3 + 1 4! a4 4a3 0 a4 x4 + . . . = m11 m12 0 m22 , and we see that m11 = m22 = 1 + ax + 1 2! (ax)2 + 1 3! (ax)3 + 1 4! (ax)4 + 1 5! (ax)5 + . . . = eax ,
  • 70. 62 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS and m12 = x + ax2 + 1 2! a2 x3 + 1 3! a3 x4 + 1 4! a4 x5 + . . . = x(1 + ax + 1 2! (ax)2 + 1 3! (ax)3 + 1 4! (ax)4 + . . .) = xeax and so we conclude that eJx = eax xeax 0 eax = eax 1 x 0 1 . Next suppose that J is a 3-by-3 Jordan block, J = ⎡ ⎣ a 1 0 0 a 1 0 0 1 ⎤ ⎦ . Then J2 = ⎡ ⎣ a2 2a 1 0 a2 2a 0 0 a2 ⎤ ⎦, J3 = ⎡ ⎣ a3 3a2 3a 0 a3 3a2 0 0 a3 ⎤ ⎦, J4 = ⎡ ⎣ a4 4a3 6a2 0 a4 4a3 0 0 a4 ⎤ ⎦, J5 = ⎡ ⎣ a5 5a4 10a3 0 a5 5a4 0 0 a5 ⎤ ⎦, … so eJx = ⎡ ⎣ 1 0 0 0 1 0 0 0 1 ⎤ ⎦ + ⎡ ⎣ a 1 0 0 a 1 0 0 a ⎤ ⎦ x + 1 2! ⎡ ⎣ a2 2a 1 0 a2 2a 0 0 a2 ⎤ ⎦ x2 + 1 3! ⎡ ⎣ a3 3a2 3a 0 a3 3a2 0 0 a3 ⎤ ⎦ x3 + 1 4! ⎡ ⎣ a4 4a3 6a2 0 a4 4a3 0 0 a4 ⎤ ⎦ x4 + 1 5! ⎡ ⎣ a5 5a4 10a3 0 a5 5a4 0 0 a5 ⎤ ⎦ x5 + . . . = ⎡ ⎣ m11 m12 m13 0 m22 m23 0 0 m33 ⎤ ⎦ , and we see that m11 = m22 = m33 = 1 + ax + 1 2! (ax)2 + 1 3! (ax)3 + 1 4! (ax)4 + 1 5! (ax)5 + . . . = eax , and m12 = m23 = x + ax2 + 1 2! a2 x3 + 1 3! a3 x4 + 1 4! a4 x5 + . . . = xeax
  • 71. 2.4. THE MATRIX EXPONENTIAL 63 as we saw in the 2-by-2 case. Finally, m13 = 1 2! x2 + 1 2! ax3 + 1 2! ( 1 2! a2 x4 ) + 1 2! ( 1 3! a3 x5 ) + . . . (as 6/4! = 1/4 = (1/2!)(1/2!) and 10/5! = 1/12 = (1/2!)(1/3!), etc.) = 1 2! x2 (1 + ax + 1 2! (ax)2 + 1 3! (ax)3 + . . .) = 1 2! x2 eax , so eJx = ⎡ ⎣ eax xeax 1 2! x2eax 0 eax xeax 0 0 eax ⎤ ⎦ = eax ⎡ ⎣ 1 x x2/2! 0 1 x 0 0 1 ⎤ ⎦ , and similarly, for larger Jordan blocks. 2 Let us see how to apply this theorem in a couple of examples. Example 4.10. (Compare Examples 1.6 and 1.10.) Consider the system Y = AY where A = 0 1 −4 4 . Also, consider the initial value problem Y = AY, Y(0) = 3 −8 . We saw in Example 2.12 in Chapter 1 that A = PJP−1 with P = −2 1 −4 0 and J = 2 1 0 2 . Then eAx = PeJx P−1 = −2 1 −4 0 e2x xe2x 0 e2x −2 1 −4 0 −1 = (1 − 2x)e2x xe2x −4xe2x (1 + 2x)e2x ,
  • 72. 64 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS and so Y = eAx = eAx γ1 γ2 = γ1e2x + (−2γ1 + γ2)xe2x γ2e2x + (−4γ1 + 2γ2)xe2x . The initial value problem has solution Y = eAx Y0 = eAx 3 −8 = 3e2x − 14xe2x 8e2x − 28xe2x . Example 4.11. (Compare Examples 1.7 and 1.11.) Consider the system Y = AY where A = ⎡ ⎣ 2 1 1 2 1 −2 −1 0 2 ⎤ ⎦ . Also, consider the initial value problem Y = AY, Y(0) = ⎡ ⎣ 8 32 5 ⎤ ⎦. We saw in Example 2.25 in Chapter 1 that A = PJP−1 with P = ⎡ ⎣ 1 0 −5 −2 0 6 −1 1 1 ⎤ ⎦ and J = ⎡ ⎣ −1 1 0 0 −1 0 0 0 3 ⎤ ⎦ . Then eAx = PeJx P −1 = ⎡ ⎣ 1 0 −5 −2 0 6 −1 1 1 ⎤ ⎦ ⎡ ⎣ e−x xe−x 0 0 e−x 0 0 0 e3x ⎤ ⎦ ⎡ ⎣ 1 0 −5 −2 0 6 −1 1 1 ⎤ ⎦ −1
  • 73. 2.4. THE MATRIX EXPONENTIAL 65 = ⎡ ⎢ ⎢ ⎣ 3 8 e−x + 1 2 xe−x + 5 8 e3x − 5 16 e−x − 1 4 xe−x + 5 16 e3x xe−x −3 4 e−x − xe−x + 3 4 e3x 5 8 e−x + 1 2 xe−x + 3 8 e3x −2xe−x 1 8 e−x − 1 2 xe−x − 1 8 e3x 1 16 e−x + 1 4 xe−x − 1 16 e3x e−x − xe−x ⎤ ⎥ ⎥ ⎦ and so Y = eAx = eAx ⎡ ⎣ γ1 γ2 γ3 ⎤ ⎦ = ⎡ ⎢ ⎢ ⎣ (3 8 γ1 − 5 16 γ2)e−x + (1 2 γ1 − 1 4 γ2 + γ3)xe−x + (5 8 γ1 + 5 16 γ2)e3x (−3 4 γ1 + 5 8 γ2)e−x + (−γ1 + 1 2 γ2 − 2γ3)xe−x + (3 4 γ1 + 3 8 γ2)e3x (1 8 γ1 + 1 16 γ2 + γ3)e−x + (−1 2 γ1 + 1 4 γ2 − γ3)xe−x + (−1 8 γ1 − 1 16 γ2)e3x ⎤ ⎥ ⎥ ⎦ . The initial value problem has solution Y = eAx Y0 = eAx ⎡ ⎣ 8 32 5 ⎤ ⎦ = ⎡ ⎣ −7e−x + xe−x + 15e3x 14e−x − 2xe−x + 18e3x 8e−x − xe−x − 3e3x ⎤ ⎦ . Now we solve Y = AY in an example where the matrix A has complex eigenvalues. As you will see, our method is exactly the same. Example 4.12. (Compare Example 2.4.) Consider the system Y = AY where A = 2 −17 1 4 . We saw in Example 2.4 that A = PJP−1 with P = −1 + 4i −1 − 4i 1 1 and J = 3 + 4i 0 0 3 − 4i .
  • 74. 66 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS Then eAx = PeJx P −1 = −1 + 4i −1 − 4i 1 1 e(3+4i)x 0 0 e(3−4i)x −1 + 4i −1 − 4i 1 1 −1 = −1 + 4i −1 − 4i 1 1 e(3+4i)x 0 0 e(3−4i)x (1/(8i)) 1 1 + 4i −1 −1 + 4i = m11 m12 m21 m22 where m11 = (1/(8i))((−1 + 4i)e(3+4i)x + (−1 − 4i)(−e(3−4i)x )) = (1/(8i))((−1 + 4i)e3x (cos(4x) + i sin(4x)) − (−1 − 4i)e3x (cos(4x) − i sin(4x)) = (1/(8i))(ie3x (4 cos(4x) − sin(4x))(2)) = e3x (cos(4x) − (1/4) sin(4x)), m12 = (1/(8i))((−1 + 4i)(1 + 4i)e(3+4i)x + (−1 − 4i)(−1 + 4i)e(3−4i)x ) = (1/(8i))(ie3x (−17 sin(4x))(2)) = e3x ((−17/4) sin(4x)), m21 = (1/(8i))(e(3+4i)x − e(3−4i)x ) = (1/(8i))(ie3x (sin(4x))(2)) = e3x ((1/4) sin(4x)), m22 = (1/(8i))((1 + 4i)e(3+4i)x + (−1 + 4i)e(3−4i)x ) = (1/(8i))((1 + 4i)e3x (cos(4x) + i sin(4x)) + (−1 + 4i)e3x (cos(4x) − i sin(4x)) = (1/(8i))(ie3x (4 cos(4x) + sin(4x))(2)) = e3x (cos(4x) + (1/4) sin(4x)) . Thus, eAx = e3x cos(4x) − 1 4 sin(4x) −17 4 sin(4x) 1 4 sin(4x) cos(4x) + 1 4 sin(4x) and Y = eAx = γ1e3x cos(4x) + (−1 4 γ1 + −17 4 γ2)e3x sin(4x) γ2e3x cos(4x) + (1 4 γ1 + 1 4 γ2)e3x sin(4x) . Remark 4.13. Our procedure in this section is essentially that of Remark 1.12. (Compare Exam- ple 4.10 with Example 1.13.) Remark 4.14. As we have seen, for a matrix J in JCF, eJx = MZ, in the notation of Section 2.1. But also, in the notation of Section 2.1, if A = PJP−1, then eAx = PeJxP −1 = PMZP −1 = MY .
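The fact that e^{Ax} comes out real, even though P and J are complex, is easy to confirm numerically. The following sketch, assuming NumPy is available, also checks the cosine and sine formula obtained in Example 4.12 at a sample point.

```python
# A numerical check of Example 4.12 (sketch only).
import numpy as np

A = np.array([[2.0, -17.0], [1.0, 4.0]])
P = np.array([[-1 + 4j, -1 - 4j], [1, 1]], dtype=complex)
J = np.diag([3 + 4j, 3 - 4j])

x = 0.4
eJx = np.diag(np.exp(np.diag(J) * x))
eAx = P @ eJx @ np.linalg.inv(P)

assert np.allclose(eAx.imag, 0)             # the result is a real matrix
c, s = np.cos(4 * x), np.sin(4 * x)
closed_form = np.exp(3 * x) * np.array([[c - s / 4, -17 * s / 4],
                                        [s / 4,      c + s / 4]])
assert np.allclose(eAx.real, closed_form)
print("complex-eigenvalue matrix exponential of Example 4.12 verified")
```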
  • 75. 2.4. THE MATRIX EXPONENTIAL 67 Remark 4.15. Now let us see how to use the matrix exponential to solve an inhomogeneous system Y = AY + G(x). Since we already know how to solve homogeneous systems, we need only, by Lemma 3.1, find a (single) particular solution Yi of this inhomogeneous system, and that is what we do. We shall again use our notation from Section 2.3, that 0 H(x)dx denotes an arbitrary (but fixed) antiderivative of H(x). Thus, consider Y = AY + G(x).Then, proceeding analogously as for an ordinary first-order linear differential equation, we have Y = AY + G(x) Y − AY = G(x) and, multiplying this equation by the integrating factor e−Ax, we obtain e−Ax (Y − AY) = e−Ax G(x) (e−Ax Y) = e−Ax G(x) with solution e−Ax Yi = 0 e−Ax G(x) Yi = eAx 0 e−Ax G(x) . Let us compare this with the solution we found in Theorem 3.2. By Remark 4.14, we can rewrite this solution as Yi = MY 0 M−1 Y G(x). This is almost, but not quite, what we had in Theo- rem 3.2. There we had the solution Yi = NY 0 N−1 Y G(x), where NY = PMZ. But these solutions are the same, as MY = PMZP −1 = NY P −1. Then M−1 Y = PM−1 Z P−1 and N−1 Y = M−1 Z P −1, so M−1 Y = PN−1 Y . Substituting, we find Yi = MY 0 M−1 Y G(x) = NY P −1 0 PN−1 Y G(x) , and, since P is a constant matrix, we may bring it outside the integral to obtain Yi = NY P −1 P 0 N−1 Y G(x) = NY 0 N−1 Y G(x) as claimed.
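For the data of Example 3.3, this can be carried out symbolically. The following sketch, assuming SymPy is available, assembles e^{Ax} as P e^{Jx} P^{-1} with the P and J of Example 1.2, computes Yi = e^{Ax} ∫0 e^{-Ax} G(x) dx, and verifies that it solves Y' = AY + G; as in our ∫0 notation, no constants of integration are added.

```python
# A symbolic sketch of Remark 4.15 for the data of Example 3.3.
import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[5, -7], [2, -4]])
G = sp.Matrix([30 * sp.exp(x), 60 * sp.exp(2 * x)])
P = sp.Matrix([[7, 1], [2, 1]])
eJx = sp.diag(sp.exp(3 * x), sp.exp(-2 * x))       # J = diag(3, -2)

eAx = P * eJx * P.inv()
Yi = eAx * (eAx.inv() * G).integrate(x)            # antiderivative taken without constants

# Check that Yi really solves Y' = AY + G, and print it.
assert sp.simplify(Yi.diff(x) - A * Yi - G) == sp.zeros(2, 1)
print(sp.simplify(Yi))   # expect -25*exp(x) + 105*exp(2*x), -10*exp(x) + 45*exp(2*x)
```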
  • 76. 68 CHAPTER 2. SOLVING SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS Remark 4.16. In applying this method we must compute M−1 Z = (eJx)−1 = e−Jx = eJ(−x), and, as an aid to calculation, it is convenient to make the following observation. Suppose, for simplicity, that J consists of a single Jordan block. Then we compute: in the 1-by-1 case, eax −1 = e−ax ; in the 2-by-2 case, eax 1 x 0 1 −1 = e−ax 1 −x 0 1 ; in the 3-by-3 case, ⎛ ⎝eax ⎡ ⎣ 1 x x2/2! 0 1 x 0 0 1 ⎤ ⎦ ⎞ ⎠ −1 = e−ax ⎡ ⎣ 1 −x (−x)2/2! 0 1 −x 0 0 1 ⎤ ⎦ = e−ax ⎡ ⎣ 1 −x x2/2! 0 1 −x 0 0 1 ⎤ ⎦, etc. EXERCISES FOR SECTION 2.4 In each exercise: (a) Find eAx and the solution Y = eAx of Y = AY. (b) Use part (a) to solve the initial value problem Y = AY, Y(0) = Y0. Exercises 1–24: In Exercise n, for 1 ≤ n ≤ 20, the matrix A and the initial vector Y0 are the same as in Exercise n of Section 2.1. In Exercise n, for 21 ≤ n ≤ 24, the matrix A and the initial vector Y0 are the same as in Exercise n − 20 of Section 2.2.
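The observation in Remark 4.16 is also easy to confirm numerically. The following sketch, assuming NumPy and SciPy are available, checks for a 3-by-3 Jordan block that the inverse of e^{Jx} is e^{J(-x)}, and that it has the explicit form displayed in that remark; the particular values of a and x are ours.

```python
# Remark 4.16 numerically: (e^{Jx})^{-1} = e^{J(-x)} for a 3-by-3 Jordan block.
import numpy as np
from scipy.linalg import expm

a, x = -1.5, 0.6
J = np.array([[a, 1, 0],
              [0, a, 1],
              [0, 0, a]])

assert np.allclose(np.linalg.inv(expm(J * x)), expm(J * (-x)))

# The explicit form: e^{-ax} with -x in place of x above the diagonal.
explicit = np.exp(-a * x) * np.array([[1, -x, x**2 / 2],
                                      [0,  1, -x],
                                      [0,  0,  1]])
assert np.allclose(np.linalg.inv(expm(J * x)), explicit)
print("inverse of e^{Jx} equals e^{-Jx}")
```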
  • 77. 69 A P P E N D I X A Background Results A.1 BASES, COORDINATES, AND MATRICES In this section of the Appendix,we review the basic facts on bases for vector spaces and on coordinates for vectors and matrices for linear transformations.Then we use these to (re)prove some of the results in Chapter 1. First we see how to represent vectors, once we have chosen a basis. Theorem 1.1. Let V be a vector space and let B = {v1, . . . , vn} be a basis of V . Then any vector v in V can be written as v = c1v1 + . . . + cnvn in a unique way. This theorem leads to the following definition. Definition 1.2. Let V be a vector space and let B = {v1, . . . , vn} be a basis of V . Let v be a vector in V and write v = c1v1 + . . . + cnvn. Then the vector [v]B = ⎡ ⎢ ⎣ c1 ... cn ⎤ ⎥ ⎦ is the coordinate vector of v in the basis B. Remark 1.3. In particular, we may take V = Cn and consider the standard basis E = {e1, . . . , en} where ei = ⎡ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ 0 ... 0 1 0 ... ⎤ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ , with 1 in the ith position, and 0 elsewhere.
Then, if $v = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_{n-1} \\ c_n \end{pmatrix}$, we see that
$$v = c_1\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix} + \cdots + c_n\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} = c_1e_1 + \cdots + c_ne_n$$
so we then see that $[v]_E = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_{n-1} \\ c_n \end{pmatrix}$. (In other words, a vector in $\mathbb{C}^n$ "looks like" itself in the standard basis.)

Next we see how to represent linear transformations, once we have chosen a basis.

Theorem 1.4. Let $V$ be a vector space and let $B = \{v_1, \ldots, v_n\}$ be a basis of $V$. Let $T: V \longrightarrow V$ be a linear transformation. Then there is a unique matrix $[T]_B$ such that, for any vector $v$ in $V$,
$$[T(v)]_B = [T]_B[v]_B .$$
Furthermore, the matrix $[T]_B$ is given by
$$[T]_B = \begin{pmatrix} [T(v_1)]_B & [T(v_2)]_B & \cdots & [T(v_n)]_B \end{pmatrix}.$$

Similarly, this theorem leads to the following definition.

Definition 1.5. Let $V$ be a vector space and let $B = \{v_1, \ldots, v_n\}$ be a basis of $V$. Let $T: V \longrightarrow V$ be a linear transformation. Let $[T]_B$ be the matrix defined in Theorem 1.4. Then $[T]_B$ is the matrix of the linear transformation $T$ in the basis $B$.
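The recipe of Theorem 1.4 is easy to carry out numerically: express each T(v_i) in the basis B and use those coordinate vectors as columns. The sketch below, assuming NumPy is available, does this for a made-up basis of C^2 and a made-up linear transformation; the helper name is ours.

```python
# Illustration of Theorem 1.4 / Definition 1.5: building [T]_B column by column
# and checking [T(v)]_B = [T]_B [v]_B.
import numpy as np

B = np.array([[1.0,  1.0],
              [1.0, -1.0]])          # basis vectors b1, b2 as columns
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])           # the linear transformation T(v) = A v

def coords(v, basis):
    """Coordinate vector [v]_B: solve (basis) c = v."""
    return np.linalg.solve(basis, v)

# [T]_B has columns [T(b1)]_B, [T(b2)]_B.
T_B = np.column_stack([coords(A @ B[:, j], B) for j in range(B.shape[1])])

v = np.array([3.0, -2.0])
assert np.allclose(coords(A @ v, B), T_B @ coords(v, B))   # [T(v)]_B = [T]_B [v]_B
print("[T]_B =\n", T_B)
```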
  • 79. A.1. BASES, COORDINATES, AND MATRICES 71 Remark 1.6. In particular, we may take V = Cn and consider the standard basis E = {e1, . . . , en}. Let A be an n-by-n square matrix and write A = a1 a2 . . . an . If TA is the linear transfor- mation given by TA(v) = Av, then [TA]E = [TA(e1)]E [TA(e2)]E . . . [TA(en)]E = [Ae1]E [Ae2]E . . . [Aen]E = [a1]E [a2]E . . . [an]E = a1 a2 . . . an = A . (In other words, the linear transformation given by multiplication by a matrix “looks like” that same matrix in the standard basis.) What is essential to us is the ability to compare the situation in different bases. To that end, we have the following theorem. Theorem 1.7. Let V be a vector space, and let B = {v1, . . . , vn} and C = {w1, . . . , wn} be two bases of V . Let PC←B be the matrix PC←B = [v1]C [v2]C . . . [vn]C . This matrix has the following properties: (1) For any vector v in V , [v]C = PC←B[v]B . (2) This matrix is invertible and (PC←B)−1 = PB←C = [w1]B [w2]B . . . [wn]B . (3) For any linear transformation T : V −→ V , [T ]C = PC←B[T ]BPB←C = PC←B[T ]B(PC←B)−1 = (PB←C)−1 [T ]BPB←C .
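The properties of Theorem 1.7 can likewise be checked in a few lines. In the sketch below, assuming NumPy is available and using made-up bases B and C of C^2, the change-of-basis matrix P_{C<-B} is built column by column, and properties (1) and (3) are verified.

```python
# Theorem 1.7 in code: the change-of-basis matrix and the conjugation formula.
import numpy as np

B = np.array([[1.0, 1.0], [1.0, -1.0]])   # basis B, vectors as columns
C = np.array([[2.0, 0.0], [1.0, 1.0]])    # basis C, vectors as columns
A = np.array([[2.0, 1.0], [0.0, 3.0]])    # the linear map T(v) = A v

coords = lambda basis, v: np.linalg.solve(basis, v)   # [v] in the given basis

# P_{C<-B}: columns are the C-coordinates of the B-basis vectors.
P_CB = np.column_stack([coords(C, B[:, j]) for j in range(2)])

v = np.array([3.0, -2.0])
assert np.allclose(coords(C, v), P_CB @ coords(B, v))        # (1): [v]_C = P_{C<-B} [v]_B

T_B = np.linalg.inv(B) @ A @ B                               # [T]_B (cf. Remark 1.6, Theorem 1.4)
T_C = np.linalg.inv(C) @ A @ C
assert np.allclose(T_C, P_CB @ T_B @ np.linalg.inv(P_CB))    # (3): [T]_C = P_{C<-B} [T]_B P_{C<-B}^{-1}
print("change-of-basis identities verified")
```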