Chapter 1
Vectors and Vector Spaces
Introduction
In engineering applications we use two kinds of quantities: scalars and vectors.
• A scalar is a quantity that is determined by its magnitude alone.
Examples: length, area, temperature, mass, etc.
• A vector is a quantity that is determined by both its magnitude and its direction.
Examples: velocity, acceleration, force, etc.
The concept of a vector is basic for the study of functions of several variables.
It provides geometric motivation for everything that follows.
1.1 Scalars and Vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$
Definition of points in space
We know that a number $x$ can be used to represent a point on a line. A pair of numbers $(x, y)$ can be used to represent a point in the plane. A triple of numbers $(x, y, z)$ can be used to represent a point in space.
Definition 1.1.1. If $n$ is a positive integer, then an ordered $n$-tuple is a sequence of $n$ real numbers $(x_1, x_2, \ldots, x_n)$. The set of all ordered $n$-tuples is called $n$-space and is denoted by $\mathbb{R}^n$.
Vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$
Every pair of distinct points $A$ and $B$ in $\mathbb{R}^2$ or $\mathbb{R}^3$ determines a directed line segment with initial point $A$ and terminal point $B$. We call such a directed line segment a vector and denote it by $\overrightarrow{AB}$.
[Figure: the vector $\overrightarrow{AB}$, drawn from the initial point $A$ to the terminal point $B$.]
The length of the line segment is the magnitude of the vector and the arrow indicates its direction. We denote a vector by printing a letter in boldface ($\mathbf{v}$) or by putting an arrow above the letter ($\vec{v}$).
Definition 1.1.2. A position vector is a vector whose initial point is at the
origin.
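Definition 1.1.3. Two vectors $u$ and $v$ in $\mathbb{R}^2$ or $\mathbb{R}^3$ are said to be equal if they have the same magnitude and direction; we then write $u = v$.
[Figure: left, the position vector $v = (v_1, v_2)$ with terminal point $(v_1, v_2)$; right, two equal vectors $\vec{u} = \vec{v}$.]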
Definition 1.1.4 (Component form). If $v$ is a two-dimensional vector in the plane with initial point $P(x_1, y_1)$ and terminal point $Q(x_2, y_2)$, then the component form of $v$ is
$$v = \overrightarrow{PQ} = (x_2 - x_1,\; y_2 - y_1).$$
If $v$ is a three-dimensional vector in space with initial point $P(x_1, y_1, z_1)$ and terminal point $Q(x_2, y_2, z_2)$, then the component form of $v$ is
$$v = \overrightarrow{PQ} = (x_2 - x_1,\; y_2 - y_1,\; z_2 - z_1).$$
The zero vector is the vector $0 = (0, 0)$ in $\mathbb{R}^2$ and $0 = (0, 0, 0)$ in $\mathbb{R}^3$.
Example 1.1.1. Find the component form of a vector with initial point
P(−3, −2, −1) and terminal point Q(1, −2, 3).
Solution. The component form of $\overrightarrow{PQ}$ is
$$\overrightarrow{PQ} = (1 - (-3),\; -2 - (-2),\; 3 - (-1)) = (4, 0, 4).$$
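The component form can also be checked numerically. The following is a minimal sketch, assuming points are stored as plain Python tuples; the function name `component_form` is an illustrative choice, not notation from these notes.

```python
def component_form(initial, terminal):
    """Return the component form of the vector from `initial` to `terminal`."""
    if len(initial) != len(terminal):
        raise ValueError("points must have the same dimension")
    # Subtract the initial point from the terminal point, coordinate by coordinate.
    return tuple(q - p for p, q in zip(initial, terminal))


P = (-3, -2, -1)
Q = (1, -2, 3)
print(component_form(P, Q))  # (4, 0, 4)
```

The same function works in the plane, since it only assumes that both points have the same number of coordinates.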
1.2 Addition and scalar multiplication
Definition 1.2.1. Let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ be vectors in $\mathbb{R}^2$ and let $c$ be a real number. Then the sum of $u$ and $v$ is defined as the vector
$$u + v = (u_1 + v_1,\; u_2 + v_2)$$
and the scalar multiple of $u$ by $c$ is defined as the vector
$$cu = (cu_1,\; cu_2).$$
Definition 1.2.2 (Parallel Vectors). Two non-zero vectors $u$ and $v$ are said to be parallel if they are scalar multiples of one another. In other words, $u$ and $v$ are parallel, denoted by $u \parallel v$, if there exists a scalar $c$ such that $u = cv$.
[Figure: the vectors $\vec{u}$, $\vec{v}$, $2\vec{u}$ and $-\vec{v}$; $2\vec{u}$ is parallel to $\vec{u}$ and $-\vec{v}$ is parallel to $\vec{v}$.]
Triangle law (the head-to-tail rule): Given vectors $u$ and $v$ in $\mathbb{R}^2$, translate $v$ so that its tail coincides with the head of $u$. The sum $u + v$ is the vector from the tail of $u$ to the head of $v$.
Parallelogram rule: The sum of two vectors in the plane can be represented by the diagonal of the parallelogram having $u$ and $v$ as its adjacent sides, as shown in the figure below.
Ambo University
DEPARTMENT OF MATHEMATICS 5
[Figure: the head-to-tail rule (left) and the parallelogram rule (right), each showing the sum $w = u + v$.]
Example 1.2.1. Find the sum of the vectors.
a. u = (4, 3), v = (2, −1)
b. u = (3, −2), v = (2, −3)
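The componentwise definitions translate directly into code. This is a small sketch, assuming vectors are represented as tuples; the helper names `add` and `scale` are hypothetical and chosen only for illustration.

```python
def add(u, v):
    """Componentwise sum of two vectors of the same dimension."""
    return tuple(a + b for a, b in zip(u, v))


def scale(c, u):
    """Scalar multiple c*u."""
    return tuple(c * a for a in u)


print(add((4, 3), (2, -1)))   # (6, 2)
print(add((3, -2), (2, -3)))  # (5, -5)
print(scale(2, (4, 3)))       # (8, 6)
```

The first two calls evaluate the sums asked for in parts a and b of the example.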
Theorem 1.2.1 (Properties of Vector Addition and Scalar Multiplication in the Plane). Let $u$, $v$ and $w$ be vectors in $\mathbb{R}^2$, and let $c$ and $d$ be scalars.
1. $u + v \in \mathbb{R}^2$. Closure under addition
2. $u + v = v + u$. Commutative property of addition
3. $u + (v + w) = (u + v) + w$. Associative property of addition
4. $u + 0 = u$. Additive identity property
5. $u + (-u) = 0$. Additive inverse property
6. $cu \in \mathbb{R}^2$. Closure under scalar multiplication
7. $c(u + v) = cu + cv$. Distributive property
8. $(c + d)u = cu + du$. Distributive property
9. $c(du) = (cd)u$. Associative property of multiplication
10. $1(u) = u$. Multiplicative identity property
The zero vector $0$ in $\mathbb{R}^2$ is called the additive identity in $\mathbb{R}^2$. Similarly, the vector $-v$ is called the additive inverse of $v$.
Theorem 1.2.2 (Properties of Additive Identity and Additive Inverse). Let $v$ be a vector in $\mathbb{R}^2$, and let $c$ be a scalar. Then the following properties are true.
1. The additive identity is unique. That is, if $u + v = v$, then $u = 0$.
2. The additive inverse of $v$ is unique. That is, if $u + v = 0$, then $u = -v$.
3. 0v = 0.
4. c0 = 0.
5. If cv = 0, then c = 0 or v = 0.
6. −(−v) = v.
Standard basis vectors
Let i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1). The vectors i, j, k are called the standard basis vectors. They have length 1 and point in the directions of the positive x-, y- and z-axes respectively. Similarly, in two dimensions we define i = (1, 0) and j = (0, 1).
If a = (a1, a2, a3), then we can write
a = (a1, a2, a3) = (a1, 0, 0) + (0, a2, 0) + (0, 0, a3)
= a1(1, 0, 0) + a2(0, 1, 0) + a3(0, 0, 1)
= a1i + a2j + a3k
Thus, any vector in space can be expressed in terms of i, j, and k.
Example 1.2.2. If a = i + 2j − 3k and b = 4i + 7k, express the vector 2a + 3b in terms of i, j and k.
[Figure: the standard basis vectors i = (1, 0) and j = (0, 1) in the plane, and i, j, k along the x-, y- and z-axes in space.]
Solution.
2a + 3b = 2(i + 2j − 3k) + 3(4i + 7k)
= 2i + 4j − 6k + 12i + 21k
= 14i + 4j + 15k
1.3 Scalar product
Definition 1.3.1. Let $u = (u_1, u_2, u_3)$ be a vector in $\mathbb{R}^3$. Then the magnitude (norm) of $u$, denoted by $\|u\|$, is defined by
$$\|u\| = \sqrt{u_1^2 + u_2^2 + u_3^2}.$$
Similarly, for a vector $v = (v_1, v_2) \in \mathbb{R}^2$, its norm is given by
$$\|v\| = \sqrt{v_1^2 + v_2^2}.$$
Example 1.3.1. a. If $v = (-1, 4, 3)$, find $\|v\|$.
b. If $\|u\| = 6$, find $x$ such that $u = (-1, x, 5)$.
Theorem 1.3.1. If $v$ is a vector in $\mathbb{R}^2$ or $\mathbb{R}^3$, then:
i. $\|v\| \ge 0$
ii. $\|v\| = 0$ if and only if $v = 0$
iii. $\|v\| = \|-v\|$
iv. If $c \in \mathbb{R}$, then $\|cv\| = |c|\,\|v\|$.
Definition 1.3.2 (Unit vector). Any vector $v$ satisfying $\|v\| = 1$ is called a unit vector.
Example 1.3.2. The vectors $(0, 1)$, $(-1, 0)$, $\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$ and $(1, 0, 0)$ are examples of unit vectors.
Theorem 1.3.2. For any non-zero vector $v$, the unit vector $u$ in the direction of $v$ can be obtained as $u = \dfrac{v}{\|v\|}$.
Example 1.3.3. Let $a = (1, 1, 1)$. Find the corresponding unit vector.
Solution. The unit vector $u$ in the direction of $a$ is
$$u = \frac{a}{\|a\|} = \frac{(1, 1, 1)}{\sqrt{3}} = \left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right).$$
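The norm and the normalization step can be sketched in a few lines. The code below is only an illustration, assuming tuple vectors and floating-point arithmetic; `norm` and `unit` are hypothetical helper names.

```python
import math


def norm(v):
    """Euclidean norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


def unit(v):
    """Unit vector in the direction of a non-zero vector v."""
    n = norm(v)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / n for x in v)


a = (1, 1, 1)
u = unit(a)
print(u)                           # each component is 1/sqrt(3), about 0.577
print(math.isclose(norm(u), 1.0))  # True
```

Because the result is computed in floating point, the length of `u` is compared with 1 using `math.isclose` rather than `==`.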
Exercise 1.3.3. Find the unit vector in the direction of
a. −3i + 7j
b. 8i − j + 4k
Definition 1.3.3. If $u$ and $v$ are points in $\mathbb{R}^2$ or $\mathbb{R}^3$, then the distance between $u$ and $v$, denoted by $d(u, v)$, is defined to be
$$d(u, v) = \|u - v\|.$$
The dot product
Definition 1.3.4. If $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$, then the dot product of $a$ and $b$ is the number $a \cdot b$ given by
$$a \cdot b = a_1 b_1 + a_2 b_2 + a_3 b_3.$$
The dot product of two-dimensional vectors is defined in a similar fashion:
$$(a_1, a_2) \cdot (b_1, b_2) = a_1 b_1 + a_2 b_2.$$
Theorem 1.3.4. The dot product of two non-zero vectors $a$ and $b$ is the number
$$a \cdot b = \|a\|\,\|b\| \cos\theta,$$
where $\theta$ is the angle between $a$ and $b$, $0 \le \theta \le \pi$. If either $a$ or $b$ is $0$, we define $a \cdot b = 0$.
Remark 1.3.1. The dot product of two vectors is a scalar quantity, and its
value is maximum when θ = 0 and minimum if θ = π.
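Example 1.3.4. If $u = (1, -2, 3)$ and $v = (0, 1, -5)$, find $u \cdot v$, $u \cdot u$ and $(u + v) \cdot v$.
Solution. $u \cdot v = (1)(0) + (-2)(1) + (3)(-5) = -17$ and $u \cdot u = \|u\|^2 = 1 + 4 + 9 = 14$. Since $u + v = (1, -1, -2)$, we get $(u + v) \cdot v = (1)(0) + (-1)(1) + (-2)(-5) = 9$.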
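The three dot products in this example can be verified with a short script. This is a sketch under the assumption that vectors are tuples of numbers; `dot` and `add` are illustrative helper names.

```python
def dot(u, v):
    """Dot product: sum of products of corresponding components."""
    return sum(a * b for a, b in zip(u, v))


def add(u, v):
    """Componentwise sum."""
    return tuple(a + b for a, b in zip(u, v))


u = (1, -2, 3)
v = (0, 1, -5)
print(dot(u, v))          # -17
print(dot(u, u))          # 14
print(dot(add(u, v), v))  # 9
```

The last value also agrees with the distributive property, since $u \cdot v + v \cdot v = -17 + 26 = 9$.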
Properties of Dot Product
If u, v and w are vectors with the same dimensions and c ∈ R, then
1. $u \cdot u = \|u\|^2$
2. u · v = v · u
3. u · (v + w) = u · v + u · w
4. (cu) · v = c(u · v) = u · (cv)
5. 0 · u = 0
1.3.1 Angle between two vectors
If $\theta$ is the angle between two non-zero vectors $u$ and $v$, then $\theta$ can be obtained from
$$\cos\theta = \frac{u \cdot v}{\|u\|\,\|v\|} \quad\Rightarrow\quad \theta = \cos^{-1}\left(\frac{u \cdot v}{\|u\|\,\|v\|}\right).$$
Theorem 1.3.5 (Cauchy-Schwarz inequality). If $u$ and $v$ are vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$, then
$$|u \cdot v| \le \|u\|\,\|v\|.$$
Proof. Exercise
Theorem 1.3.6 (Triangle inequality). If $u$, $v$ and $w$ are vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$, then:
i. $\|u + v\| \le \|u\| + \|v\|$
ii. $d(u, v) \le d(u, w) + d(w, v)$
Proof. Exercise
Theorem 1.3.7 (Parallelogram equation for vectors). If $u$ and $v$ are vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$, then
$$\|u + v\|^2 + \|u - v\|^2 = 2\left(\|u\|^2 + \|v\|^2\right).$$
Proof. Exercise
Theorem 1.3.8. If $u$ and $v$ are vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$ with the Euclidean inner product, then
$$u \cdot v = \frac{1}{4}\|u + v\|^2 - \frac{1}{4}\|u - v\|^2.$$
Proof. Exercise
Definition 1.3.5. Two non-zero vectors $u$ and $v$ are said to be orthogonal (perpendicular), denoted by $u \perp v$, if and only if $u \cdot v = 0$, that is, if the angle between them is $\theta = \frac{\pi}{2}$.
Example 1.3.5. Show that 2i + 2j − k is perpendicular to 5i − 4j + 2k.
Solution. Since (2i + 2j − k) · (5i − 4j + 2k) = 10 − 8 − 2 = 0, we see that the vectors are perpendicular.
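Example 1.3.6. Find the angle between $a$ and $b$, where $a = (1, 1, -1)$ and $b = (1, 0, 0)$.
Solution.
$$\cos\theta = \frac{a \cdot b}{\|a\|\,\|b\|} = \frac{1}{\sqrt{3}\,(1)} = \frac{\sqrt{3}}{3} \quad\Rightarrow\quad \theta = \cos^{-1}\left(\frac{\sqrt{3}}{3}\right) \approx 0.955 \text{ rad} \approx 54.7^\circ.$$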
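The angle formula can be evaluated numerically as a check. The sketch below assumes tuple vectors and uses only the standard `math` module; the clamping step is an added safeguard against floating-point round-off and is not part of the formula itself.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def angle(u, v):
    """Angle in radians between two non-zero vectors, in [0, pi]."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] so round-off cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


theta = angle((1, 1, -1), (1, 0, 0))
print(math.degrees(theta))  # about 54.74 degrees

# The perpendicular pair from Example 1.3.5 gives a right angle.
print(math.isclose(angle((2, 2, -1), (5, -4, 2)), math.pi / 2))  # True
```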
Theorem 1.3.9 (Pythagoras theorem). If $a$ and $b$ are orthogonal vectors, then $\|a + b\|^2 = \|a\|^2 + \|b\|^2$.
Proof. Suppose $a \perp b$. Then
$$\begin{aligned} \|a + b\|^2 &= (a + b) \cdot (a + b) \\ &= a \cdot a + a \cdot b + b \cdot a + b \cdot b \\ &= \|a\|^2 + 2\,a \cdot b + \|b\|^2 \\ &= \|a\|^2 + \|b\|^2, \quad \text{since } a \cdot b = 0. \end{aligned}$$
Theorem 1.3.10. Given two vectors $u$ and $v$ in space, $\|u + v\| = \|u - v\|$ if and only if $u$ and $v$ are orthogonal vectors.
1.3.2 Orthogonal projection
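Definition 1.3.6. Let $a = \overrightarrow{PQ}$ and $b = \overrightarrow{PR}$ be two vectors. If $S$ is the foot of the perpendicular from $R$ to the line containing $P$ and $Q$, then the vector from $P$ to $S$ is called the vector projection of $b$ onto $a$ and is denoted by $\operatorname{proj}_a^b$.
[Figure: the vectors $a = \overrightarrow{PQ}$ and $b = \overrightarrow{PR}$ with angle $\theta$ at $P$; $S$ is the foot of the perpendicular from $R$, and $\operatorname{proj}_a^b = \overrightarrow{PS}$.]
The scalar projection of $b$ onto $a$ (also called the component of $b$ along $a$) is defined to be the signed length of $\operatorname{proj}_a^b$, which is equal to $\|b\| \cos\theta$ and is denoted by $\operatorname{comp}_a^b$. Thus
$$\operatorname{comp}_a^b = \frac{a \cdot b}{\|a\|} \quad\text{and}\quad \operatorname{proj}_a^b = \frac{a \cdot b}{\|a\|^2}\, a.$$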
Example 1.3.7. Find the scalar projection and vector projection of $b = (1, 1, 2)$ onto $a = (-2, 3, 1)$.
Solution. Since $\|a\| = \sqrt{(-2)^2 + 3^2 + 1^2} = \sqrt{14}$, the scalar projection of $b$ onto $a$ is
$$\operatorname{comp}_a^b = \frac{a \cdot b}{\|a\|} = \frac{-2 + 3 + 2}{\sqrt{14}} = \frac{3}{\sqrt{14}}.$$
The vector projection is this scalar projection times the unit vector in the direction of $a$:
$$\operatorname{proj}_a^b = \frac{3}{\sqrt{14}} \cdot \frac{a}{\|a\|} = \frac{3}{14}(-2, 3, 1) = \left(-\frac{3}{7}, \frac{9}{14}, \frac{3}{14}\right).$$
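As a check on the arithmetic, the projection formulas can be evaluated exactly. This is a minimal sketch, assuming vectors with integer components so that `fractions.Fraction` keeps the result exact; the function names are hypothetical.

```python
import math
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def scalar_projection(b, a):
    """Component of b along a: (a . b) / ||a||."""
    return dot(a, b) / math.sqrt(dot(a, a))


def vector_projection(b, a):
    """Vector projection of b onto a: ((a . b) / ||a||^2) a, as exact fractions."""
    k = Fraction(dot(a, b), dot(a, a))  # assumes integer components
    return tuple(k * x for x in a)


a = (-2, 3, 1)
b = (1, 1, 2)
print(scalar_projection(b, a))  # 3/sqrt(14), about 0.80
print(vector_projection(b, a))  # (Fraction(-3, 7), Fraction(9, 14), Fraction(3, 14))
```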
Remark 1.3.2. The vector projection of $b$ onto $a$ is the scalar projection times the unit vector in the direction of $a$.
Exercise 1.3.11. Let $a = (-1, 3, 1)$ and $b = (2, 4, 3)$. Find $\operatorname{proj}_a^b$, $\operatorname{proj}_b^a$ and $\operatorname{comp}_a^b$.
1.3.3 Direction angles
Let $a = a_1 i + a_2 j + a_3 k$ be a vector positioned at the origin in $\mathbb{R}^3$, making angles $\alpha$, $\beta$ and $\gamma$ with the positive x-, y- and z-axes respectively. Then the angles $\alpha$, $\beta$ and $\gamma$ are called the direction angles of $a$, and the quantities $\cos\alpha$, $\cos\beta$ and $\cos\gamma$ are called the direction cosines of $a$, which can be computed as follows:
$$\cos\alpha = \frac{a_1}{\|a\|}, \quad \cos\beta = \frac{a_2}{\|a\|}, \quad \cos\gamma = \frac{a_3}{\|a\|}, \qquad \alpha, \beta, \gamma \in [0, \pi].$$
Remark 1.3.3. $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$
Exercise 1.3.12. Let $a = (-1, 2, 2)$. Find the direction cosines of $a$.
[Figure: the vector $a$ at the origin, making angles $\alpha$, $\beta$ and $\gamma$ with the x-, y- and z-axes.]
1.4 Cross product
Definition 1.4.1. Suppose that $a = (a_1, a_2, a_3) = a_1 i + a_2 j + a_3 k$ and $b = (b_1, b_2, b_3) = b_1 i + b_2 j + b_3 k$ are two vectors in $\mathbb{R}^3$. Then the cross product $a \times b$ of the two vectors is defined as
$$a \times b = (a_2 b_3 - a_3 b_2)\, i + (a_3 b_1 - a_1 b_3)\, j + (a_1 b_2 - a_2 b_1)\, k = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$

… = i(−10 − 3) − j(4 + 4) + k(−6 + 20) = −13i − 8j + 14k
Remark 1.4.1. For two non-zero vectors $a$ and $b$,
i. $a \times b$ is a vector which is orthogonal to both $a$ and $b$
ii. $a \times b$ is not defined for $a, b \in \mathbb{R}^2$
iii. $i \times j = -(j \times i) = k$
$j \times k = -(k \times j) = i$
$k \times i = -(i \times k) = j$
Theorem 1.4.1 (Properties of Cross Product). Let $a$, $b$ and $c$ be vectors in $\mathbb{R}^3$ and $\alpha$ be any scalar. Then
1. $a \times 0 = 0 \times a = 0$, where $0 = (0, 0, 0)$
2. $a \times b = -(b \times a)$
3. $a \times (b \times c) \ne (a \times b) \times c$ in general
4. $(\alpha a) \times b = a \times (\alpha b) = \alpha(a \times b)$
5. $a \times (b + c) = a \times b + a \times c$
6. $a \cdot (a \times b) = b \cdot (a \times b) = 0$
7. If $a$ and $b$ are parallel, then $a \times b = 0$
8. $\|a \times b\| = \|a\|\,\|b\| \sin\theta$, $\theta \in [0, \pi]$
9. $\|a \times b\|^2 = \|a\|^2\|b\|^2 - (a \cdot b)^2$
Example 1.4.2. If $\|a\| = 2$, $\|b\| = 4$ and $\theta = \frac{\pi}{4}$ for two vectors $a$ and $b$, find $\|a \times b\|$.
Solution. $\|a \times b\| = \|a\|\,\|b\| \sin\theta = 2 \times 4 \times \sin\frac{\pi}{4} = 4\sqrt{2}$.
For two non-zero vectors $a$ and $b$, the angle $\theta$ between them satisfies $\sin\theta = \dfrac{\|a \times b\|}{\|a\|\,\|b\|}$.
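The component formula for the cross product, together with properties 6 and 9 of Theorem 1.4.1, can be sketched as follows. The sample vectors and the helper names `cross` and `dot` are illustrative assumptions.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(a, b):
    """Cross product of two vectors in R^3, using the component formula."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j))  # (0, 0, 1), i.e. i x j = k

a, b = (1, 2, -1), (-2, 2, 2)
n = cross(a, b)
print(n)                     # (6, 0, 6)
print(dot(a, n), dot(b, n))  # 0 0, so n is orthogonal to both a and b
# Property 9: ||a x b||^2 = ||a||^2 ||b||^2 - (a . b)^2
print(dot(n, n) == dot(a, a) * dot(b, b) - dot(a, b) ** 2)  # True
```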
Definition 1.4.2 (Scalar Triple Product). Let $a$, $b$ and $c$ be vectors in $\mathbb{R}^3$. Their scalar triple product is given by $a \cdot (b \times c)$, which is a scalar.
Applications of cross product
1. Area: The area of a parallelogram whose adjacent sides coincide with the vectors $a$ and $b$ is given by $\|a \times b\| = \|a\|\,\|b\| \sin\theta$.
[Figure: a parallelogram with adjacent sides $a$ and $b$, angle $\theta$ between them, and height $h = \|b\| \sin\theta$.]
Note 1. The area of the triangle having $a$ and $b$ as its adjacent sides is given by area $= \frac{1}{2}\|a \times b\|$.
2. Volume: The volume $V$ of a parallelepiped with the three vectors $a = (a_1, a_2, a_3)$, $b = (b_1, b_2, b_3)$ and $c = (c_1, c_2, c_3)$ in $\mathbb{R}^3$ as three of its adjacent edges is given by
$$V = |a \cdot (b \times c)| = \left|\, \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \,\right|.$$
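Example 1.4.3. Find the area of a triangle whose vertices are $A(1, -1, 0)$, $B(2, 1, -1)$ and $C(-1, 1, 2)$.
Solution. Two sides of the triangle $\triangle ABC$ are given by the vectors $\overrightarrow{AB} = (1, 2, -1)$ and $\overrightarrow{AC} = (-2, 2, 2)$. Since $\overrightarrow{AB} \times \overrightarrow{AC} = (6, 0, 6)$, the area of the triangle $ABC$ is
$$A = \frac{1}{2}\left\|\overrightarrow{AB} \times \overrightarrow{AC}\right\| = \frac{1}{2}\sqrt{72} = 3\sqrt{2} \text{ square units.}$$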
Example 1.4.4. Find the volume of the parallelepiped with edges u = i + k, v = 2i + j + 4k, and w = j + k.
Solution. Since $v \times w = (-3, -2, 2)$, we get $V = |u \cdot (v \times w)| = |-3 + 0 + 2| = 1$.
If the volume of the parallelepiped determined by $a$, $b$ and $c$ is 0, then the vectors must lie in the same plane; that is, they are coplanar.
Example 1.4.5. Use the scalar triple product to show that the vectors $a = (1, 4, -7)$, $b = (2, -1, 4)$ and $c = (0, -9, 18)$ are coplanar.
Solution. The scalar triple product gives
$$V = |a \cdot (b \times c)| = \left|\, \begin{vmatrix} 1 & 4 & -7 \\ 2 & -1 & 4 \\ 0 & -9 & 18 \end{vmatrix} \,\right| = |1(18) - 4(36) - 7(-18)| = 0.$$
∴ $a$, $b$ and $c$ are coplanar.
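Both the volume computation and the coplanarity test reduce to evaluating a scalar triple product. The following is a short sketch, assuming tuple vectors in space; `triple` is a hypothetical helper name.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))


# Volume of the parallelepiped with edges u, v, w from Example 1.4.4.
print(abs(triple((1, 0, 1), (2, 1, 4), (0, 1, 1))))  # 1

# The vectors of Example 1.4.5 are coplanar: the triple product vanishes.
print(triple((1, 4, -7), (2, -1, 4), (0, -9, 18)))   # 0
```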
1.5 Lines and planes in $\mathbb{R}^3$
Lines
We know that in the plane ($\mathbb{R}^2$) a line is determined by a point on the line and the slope of the line, i.e.
$$y - y_0 = m(x - x_0).$$
In three-dimensional space ($\mathbb{R}^3$), a line $L$ is determined by a point $P_0(x_0, y_0, z_0)$ on $L$ and a vector $v$ giving the direction of the line.
Let $P(x, y, z)$ be an arbitrary point on $L$ and let $r_0$ and $r$ be the position vectors of $P_0$ and $P$ respectively.
If $a = \overrightarrow{P_0P}$, then $r = r_0 + a$.
But since $a$ and $v$ are parallel vectors, there is a scalar $t$ such that $a = tv$. Thus
$$r = r_0 + tv,$$
which is the vector equation of $L$.
If $v = (a, b, c)$, then $tv = (ta, tb, tc)$.
Let $r = (x, y, z)$ and $r_0 = (x_0, y_0, z_0)$. Then the vector equation becomes
$$(x, y, z) = (x_0, y_0, z_0) + (ta, tb, tc) = (x_0 + ta,\; y_0 + tb,\; z_0 + tc), \quad t \in \mathbb{R}.$$
This leads to
$$x = x_0 + ta, \quad y = y_0 + tb, \quad z = z_0 + tc,$$
which are called the parametric equations of the line.
[Figure: the line L through P0(x0, y0, z0) and P(x, y, z), with position vectors r0 and r, the vector r − r0 and the direction vector v.]
Example 1.5.1. a. Find the vector equation and parametric equations for the line that passes through the point (5, 1, 3) and is parallel to the vector i + 4j − 2k.
b. Find two other points on the line.
Solution. a. Here r0 = (5, 1, 3) = 5i + j + 3k and v = i + 4j − 2k.
The vector equation is
r = r0 + tv = 5i + j + 3k + t(i + 4j − 2k)
= (5 + t)i + (1 + 4t)j + (3 − 2t)k
Parametric equations are
x = 5 + t, y = 1 + 4t, z = 3 − 2t
b. Choosing t = 1 and t = 2 gives the two points (6, 5, 1) and (7, 9, −1).
Symmetric Equation
The vector equation and parametric equations of a line are not unique. If we
change the point or the parameter or choose a different parallel vector, then
the equations change. If a vector v = (a, b, c) is used to describe the direction
of a line L, then a, b and c are called direction numbers of L.
If none of $a$, $b$ or $c$ is 0, then we can solve each of the three parametric equations for $t$, equate the results and obtain the symmetric equations
$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}.$$
Example 1.5.2. a. Find parametric equations and symmetric equations of
the line that passes through the points A(2, 4, −3) and B(3, −1, 1).
b. At what point does this line intersect the xy-plane?
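Solution. a. The vector $\overrightarrow{AB} = (1, -5, 4)$ is parallel to the line $L$.
If $P_0 = (2, 4, -3)$, then the parametric equations are
$$x = 2 + t, \quad y = 4 - 5t, \quad z = -3 + 4t.$$
The symmetric equations are
$$\frac{x - 2}{1} = \frac{y - 4}{-5} = \frac{z + 3}{4}.$$
b. Setting $z = 0$ gives
$$\frac{x - 2}{1} = \frac{y - 4}{-5} = \frac{3}{4} \quad\Rightarrow\quad x = \frac{11}{4} \text{ and } y = \frac{1}{4},$$
so the line intersects the xy-plane at the point $\left(\frac{11}{4}, \frac{1}{4}, 0\right)$.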
Example 1.5.3. Show that the lines $L_1$ and $L_2$ with parametric equations x = 1 + t, y = −2 + 3t, z = 4 − t and x = 2s, y = 3 + s, z = −3 + 4s are skew lines; that is, they do not intersect and are not parallel.
Solution. The lines are not parallel because their direction vectors (1, 3, −1) and (2, 1, 4) are not parallel. If $L_1$ and $L_2$ had a point of intersection, then there would be values of t and s such that
1 + t = 2s
−2 + 3t = 3 + s
4 − t = −3 + 4s.
Solving the first two equations gives $t = \frac{11}{5}$ and $s = \frac{8}{5}$, but these values do not satisfy the third equation, since $4 - \frac{11}{5} = \frac{9}{5}$ while $-3 + 4 \cdot \frac{8}{5} = \frac{17}{5}$.
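The same check can be scripted: solve the x- and y-equations for the parameters, then compare the z-coordinates. The sketch below assumes integer data and a non-singular 2×2 system, and uses Cramer's rule, which is a choice made here rather than a method prescribed by the notes; `intersect_params` is a hypothetical name.

```python
from fractions import Fraction


def intersect_params(p1, d1, p2, d2):
    """Solve the x- and y-equations of p1 + t*d1 = p2 + s*d2 for (t, s).

    Uses Cramer's rule on  t*d1 - s*d2 = p2 - p1  (first two coordinates).
    Returns None when that 2x2 system is singular.
    """
    a11, a12 = d1[0], -d2[0]
    a21, a22 = d1[1], -d2[1]
    r1, r2 = p2[0] - p1[0], p2[1] - p1[1]
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None
    t = Fraction(r1 * a22 - a12 * r2, det)
    s = Fraction(a11 * r2 - a21 * r1, det)
    return t, s


p1, d1 = (1, -2, 4), (1, 3, -1)   # L1: point and direction
p2, d2 = (0, 3, -3), (2, 1, 4)    # L2: point and direction

t, s = intersect_params(p1, d1, p2, d2)
print(t, s)    # 11/5 8/5
z1 = p1[2] + t * d1[2]
z2 = p2[2] + s * d2[2]
print(z1, z2)  # 9/5 17/5, so the z-coordinates disagree and the lines do not meet
```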
We know that the vector equation of a line through the point with position vector $r_0$ in the direction of a vector $v$ is $r = r_0 + tv$. If the line also passes through the point with position vector $r_1$, then we can take $v = r_1 - r_0$ and so its vector equation is
$$r = r_0 + t(r_1 - r_0) = (1 - t)r_0 + t r_1.$$
The line segment from $r_0$ to $r_1$ is given by the vector equation
$$r(t) = (1 - t)r_0 + t r_1, \quad 0 \le t \le 1.$$
Planes
A plane in space is determined by a point P0(x0, y0, z0) in the plane and a
vector n that is orthogonal (perpendicular) to the plane. This orthogonal
vector n is called a normal vector.
Let P(x, y, z) be an arbitrary point in the plane, and let r0 and r be the position vectors of P0 and P. The normal vector n is orthogonal to every vector in the given plane. In particular, n is orthogonal to r − r0.
[Figure: the plane through P0(x0, y0, z0) and P(x, y, z), with position vectors r0 and r, the vector r − r0 lying in the plane, and the normal vector n.]
Thus n · (r − r0) = 0, or n · r = n · r0. This equation is called a vector equation of the plane. If n = (a, b, c), r = (x, y, z), and r0 = (x0, y0, z0), then
(a, b, c) · (x − x0, y − y0, z − z0) = 0
⇒ a(x − x0) + b(y − y0) + c(z − z0) = 0
⇒ ax + by + cz + d = 0, where d = −(ax0 + by0 + cz0)
This equation is called the equation of the plane through the point (x0, y0, z0) with normal vector n = (a, b, c).
Example 1.5.4. Find an equation of the plane through the point (2, 4, −1)
with normal vector n = (2, 3, 4). Find the intercepts and sketch the plane.
Solution. Put P0 = (2, 4, −1) and n = (2, 3, 4). Hence an equation of the plane is
2(x − 2) + 3(y − 4) + 4(z + 1) = 0 ⇒ 2x + 3y + 4z = 12
For the x-intercept, set y = z = 0: 2x = 12 ⇒ x = 6.
Similarly, the y-intercept is y = 4 and the z-intercept is z = 3.
Example 1.5.5. Find an equation of the plane that passes through the points
P(1, 3, 2), Q(3, −1, 6) and R(5, 2, 0).
Solution. Let $a = \overrightarrow{PQ} = (2, -4, 4)$ and $b = \overrightarrow{PR} = (4, -1, -2)$. Since both $a$ and $b$ lie in the plane, their cross product $a \times b$ is orthogonal to the plane and can be taken as the normal vector. Thus
$$n = a \times b = \begin{vmatrix} i & j & k \\ 2 & -4 & 4 \\ 4 & -1 & -2 \end{vmatrix} = 12i + 20j + 14k.$$
The equation is
12(x − 1) + 20(y − 3) + 14(z − 2) = 0
⇒ 12x + 20y + 14z = 100
⇒ 6x + 10y + 7z = 50
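The construction used in this solution (two edge vectors, their cross product, then the constant term) can be sketched as a small function. It assumes the three points are not collinear; the name `plane_through` and the return convention (normal $n$ and constant $d$ with $n \cdot (x, y, z) = d$) are illustrative choices.

```python
def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def plane_through(P, Q, R):
    """Return (n, d) such that the plane through P, Q, R is n . (x, y, z) = d."""
    n = cross(sub(Q, P), sub(R, P))  # normal vector from two edge vectors
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    return n, dot(n, P)


n, d = plane_through((1, 3, 2), (3, -1, 6), (5, 2, 0))
print(n, d)  # (12, 20, 14) 100, i.e. 12x + 20y + 14z = 100
```

Dividing through by 2 gives the simplified equation 6x + 10y + 7z = 50.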
[Figure: sketch of the plane 2x + 3y + 4z = 12 from Example 1.5.4, with intercepts (6, 0, 0), (0, 4, 0) and (0, 0, 3).]
Angle between Planes
Two planes are parallel if their normal vectors are parallel.
If two planes are not parallel, then they intersect in a straight line and the angle between the planes is defined as the acute angle between their normal vectors.
Example 1.5.6. Find the angle of intersection between the planes x + y + z = 1 and x − 2y + 3z = 1. Find symmetric equations for the line of intersection L of the planes.
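Solution. The normal vectors are
n1 = (1, 1, 1), n2 = (1, −2, 3)
Thus if $\theta$ is the angle between the planes, then
$$\cos\theta = \frac{n_1 \cdot n_2}{\|n_1\|\,\|n_2\|} = \frac{1 - 2 + 3}{\sqrt{3}\sqrt{14}} = \frac{2}{\sqrt{42}}$$
$$\therefore\ \theta = \cos^{-1}\left(\frac{2}{\sqrt{42}}\right) \approx 72^\circ.$$
To find a point on $L$, take $z = 0$. Then
$$\begin{cases} x + y = 1 \\ x - 2y = 1 \end{cases} \quad\Rightarrow\quad y = 0 \text{ and } x = 1.$$
So the point (1, 0, 0) lies on L. Since L lies in both planes, it is perpendicular to both of the normal vectors. Thus a vector $v$ parallel to L is given by
$$v = n_1 \times n_2 = \begin{vmatrix} i & j & k \\ 1 & 1 & 1 \\ 1 & -2 & 3 \end{vmatrix} = 5i - 2j - 3k.$$
So the symmetric equations of L can be written as
$$\frac{x - 1}{5} = \frac{y}{-2} = \frac{z}{-3}.$$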
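The two computations in this example, the angle between the normals and the direction of the line of intersection, can be reproduced numerically. This is a sketch using only the standard library; taking the absolute value of the cosine is one way to obtain the acute angle and is an assumption of this sketch.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


n1 = (1, 1, 1)    # normal of x + y + z = 1
n2 = (1, -2, 3)   # normal of x - 2y + 3z = 1

# Acute angle between the planes, from the angle between their normals.
cos_theta = abs(dot(n1, n2)) / math.sqrt(dot(n1, n1) * dot(n2, n2))
theta = math.degrees(math.acos(cos_theta))
print(round(theta, 1))  # 72.0

# Direction of the line of intersection: orthogonal to both normals.
v = cross(n1, n2)
print(v)  # (5, -2, -3)

# The point (1, 0, 0) satisfies both plane equations.
point = (1, 0, 0)
print(dot(n1, point) == 1 and dot(n2, point) == 1)  # True
```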
Distance in Space
a. Distance from a point to a line: The distance $D$ from a point $P_1$ (not on $L$) to a line $L$ in space is given by
$$D = \frac{\left\|v \times \overrightarrow{P_0P_1}\right\|}{\|v\|},$$
where $v$ is a direction vector of $L$ and $P_0$ is any point on $L$.
Proof. Exercise
b. Distance from a point to a plane: The perpendicular distance $D$ from a point $P_1(x_1, y_1, z_1)$ in space to the plane with equation $ax + by + cz + d = 0$ is given by
$$D = \frac{|ax_1 + by_1 + cz_1 + d|}{\sqrt{a^2 + b^2 + c^2}} = \frac{\left|n \cdot \overrightarrow{OP_1}\right|}{\|n\|},$$
where $n = (a, b, c)$ is the normal vector and $O$ is a point in the plane (for instance, the foot of $n$).
Proof. Exercise
Exercise 1.5.1. 1. Find the distance from the point P1(−1, 3, 0) to the line L with parametric equations x = 1, y = 1 + 3t, z = −1 + 2t.
2. Find the distance from the point P(1, 2, 3) to the plane π : 3x + 5y − 4z + 37 = 0.
3. Find the distance between the planes π1 : x + 2y − 2z = 3 and π2 : 2x + 4y − 4z = 7.
c. Distance between two parallel planes: Given two parallel planes π1 : ax + by + cz = d1 and π2 : ax + by + cz = d2, the distance between π1 and π2 is the same as the distance from an arbitrary point P(x0, y0, z0) of π1 to the plane π2, and is given by
$$D = \frac{|ax_0 + by_0 + cz_0 - d_2|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|d_1 - d_2|}{\sqrt{a^2 + b^2 + c^2}}.$$
Example 1.5.7. Find the distance between the parallel planes
π1 : 10x + 2y − 2z = 5 and
π2 : 5x + y − z = 1
Solution. The planes are parallel because their normal vectors
(10, 2, −2) and (5, 1, −1) are parallel. The distance between π1 and π2 is
the distance between any point in π1 and the plane π2.
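Let $y = z = 0$ in π1; then $x = \frac{1}{2}$. Hence $\left(\frac{1}{2}, 0, 0\right)$ lies in π1.
Now the distance between the point $\left(\frac{1}{2}, 0, 0\right)$ and the plane π2 is
$$D = \frac{\left|5\left(\frac{1}{2}\right) + 1(0) - 1(0) - 1\right|}{\sqrt{5^2 + 1^2 + (-1)^2}} = \frac{3/2}{3\sqrt{3}} = \frac{\sqrt{3}}{6}.$$
∴ the distance between the planes is $\frac{\sqrt{3}}{6}$.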
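The point-to-plane formula lends itself to a quick numerical check. The sketch below assumes the plane is written as $n \cdot (x, y, z) + d = 0$, so the plane 5x + y − z = 1 has d = −1; `distance_point_plane` is a hypothetical helper name.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def distance_point_plane(point, normal, d):
    """Distance from `point` to the plane normal . (x, y, z) + d = 0."""
    return abs(dot(normal, point) + d) / math.sqrt(dot(normal, normal))


# Point (1/2, 0, 0) on the first plane, measured against 5x + y - z - 1 = 0.
D = distance_point_plane((0.5, 0, 0), (5, 1, -1), -1)
print(math.isclose(D, math.sqrt(3) / 6))  # True

# Same result from |d1 - d2| / ||n|| after writing the first plane as 5x + y - z = 5/2.
print(math.isclose(abs(2.5 - 1) / math.sqrt(27), D))  # True
```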
1.6 Vector space; Subspaces
Definition 1.6.1. Let V be a set on which two operations, called addition
and scalar multiplication, have been defined. If u and v are in V , the sum of
u and v is denoted by u + v, and if c is a scalar, the scalar multiple of u by
c is denoted by cu. If the following axioms hold for all u, v and w in V and
for all scalars c and d, then V is called a vector space and its elements are
called vectors.
1. u + v is in V . Closure under addition
2. u + v = v + u. Commutativity
3. u + (v + w) = (u + v) + w. Associativity
4. There exists an element 0 in V , called a zero vector, such that u+0 = u.
5. For each u in V , there is an element −u in V such that u + (−u) = 0.
6. cu is in V . Closure under scalar multiplication
7. c(u + v) = cu + cv. Distributivity
8. (c + d)u = cu + du. Distributivity
9. c(du) = (cd)u.
10. 1(u) = u.
Example 1.6.1. For any $n \ge 1$, $\mathbb{R}^n$ is a vector space over $\mathbb{R}$.
Example 1.6.2. The set $S = \{(x, y, z) : x, y, z \in \mathbb{Q}\}$ is not a vector space over $\mathbb{R}$: if we take $c = \sqrt{2} \in \mathbb{R}$ and $u = (1, 3, 0) \in S$, then $cu = (\sqrt{2}, 3\sqrt{2}, 0)$ is not in $S$.
Subspace
Definition 1.6.2. A non-empty subset W of a vector space V is called a
subspace of V if W is itself a vector space with the same scalars, addition and
scalar multiplication as V .
Theorem 1.6.1. Let V be a vector space and let W be a non-empty subset
of V . Then W is a subspace of V if and only if the following conditions hold:
a. If u and v are in W, then u + v is in W.
b. If u is in W and c is a scalar, then cu is in W.
Example 1.6.3.
1. V and {0} are the trivial subspaces of any vector space V.
2. Let $V = \mathbb{R}^3 = \{(x, y, z) : x, y, z \in \mathbb{R}\}$, a vector space over $\mathbb{R}$. Then the set $W = \{(x, y, 0) : x, y \in \mathbb{R}\}$ is a subspace of $V$. (Verify!)
3. Every line passing through the origin, $L = \{(x, y) : ax + by = 0\}$ with $a, b \in \mathbb{R}$ fixed and not both zero, is a subspace of the vector space $V = \mathbb{R}^2$.
Exercise 1.6.2. Is the set $W = \{(x, y) : x - 4y = 1\}$ a subspace of $V = \mathbb{R}^2$? Justify.
1.7 Linear Dependence and independence; Basis of a vector space
Definition 1.7.1. A vector u in a vector space V is called a linear combination of the vectors v1, v2, . . . , vn in V when u can be written in the form
u = α1v1 + α2v2 + · · · + αnvn
where α1, α2, . . . , αn are scalars.
Definition 1.7.2. A set of vectors {v1, v2, . . . , vn} in a vector space V is linearly dependent (LD) if there are scalars α1, α2, . . . , αn, at least one of which is not zero, such that
α1v1 + α2v2 + · · · + αnvn = 0
A set of vectors that is not linearly dependent is said to be linearly independent (LI). Equivalently, a set of vectors {v1, v2, . . . , vn} in a vector space V is linearly independent if α1v1 + α2v2 + · · · + αnvn = 0 implies α1 = 0, α2 = 0, . . . , αn = 0.
Example 1.7.1. Determine whether the following sets of vectors in the vector space $V = \mathbb{R}^3$ are linearly dependent or independent.
i. {(1, 0, 0), (0, 1, 0), (0, 0, 3)}
ii. {(2, 6, 0), (2, 4, 1), (1, 1, 1)}
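One way to test three vectors in space is with a determinant. The sketch below relies on the standard fact, not stated in the definitions above, that three vectors in $\mathbb{R}^3$ are linearly independent exactly when the determinant of the matrix having them as rows is non-zero; `det3` is a hypothetical helper name.

```python
def det3(rows):
    """Determinant of a 3x3 matrix given as three rows (cofactor expansion)."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


set_i = [(1, 0, 0), (0, 1, 0), (0, 0, 3)]
set_ii = [(2, 6, 0), (2, 4, 1), (1, 1, 1)]

print(det3(set_i))   # 3, non-zero: linearly independent
print(det3(set_ii))  # 0: linearly dependent
```

For set ii an explicit dependence relation is (2, 6, 0) − 2(2, 4, 1) + 2(1, 1, 1) = (0, 0, 0), which matches Definition 1.7.2 with coefficients 1, −2, 2.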
Theorem 1.7.1. Let S = {v1, v2, . . . , vr} be a set of vectors in $\mathbb{R}^n$. If $r > n$, then S is linearly dependent.
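Definition 1.7.3. If $f_1 = f_1(x), f_2 = f_2(x), \ldots, f_n = f_n(x)$ are functions that are $n - 1$ times differentiable on the interval $(-\infty, \infty)$, then the determinant
$$W(x) = \begin{vmatrix} f_1(x) & f_2(x) & \cdots & f_n(x) \\ f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\ \vdots & \vdots & \ddots & \vdots \\ f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x) \end{vmatrix}$$
is called the Wronskian of $f_1, f_2, \ldots, f_n$.
Theorem 1.7.2. If the functions $f_1, f_2, \ldots, f_n$ have $n - 1$ continuous derivatives on the interval $(-\infty, \infty)$ and if the Wronskian of these functions is not identically zero on $(-\infty, \infty)$, then these functions form a linearly independent set of vectors in $C^{(n-1)}(-\infty, \infty)$.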
Example 1.7.2. Let V be the vector space of all real-valued functions of the variable t. Which of the following sets of functions are LD/LI? Justify!
a. $\{t, t^2, \sin t\}$
b. $\{\cos^2 t, \sin^2 t, 1\}$
Theorem 1.7.3. A set of vectors {v1, v2, . . . , vn} in a vector space V is linearly dependent if and only if at least one of the vectors can be expressed as a linear combination of the others.
Proof. Exercise
Basis of a vector space
Definition 1.7.4. If S = {v1, v2, . . . , vn} is a set of vectors in a vector
space V , then the set of all linear combinations of v1, v2, . . . , vn is called the
span of v1, v2, . . . , vn and is denoted by span(v1, v2, . . . , vn) or span(S). If
V = span(S), then S is called a spanning set for V and V is said to be spanned
by S.
Definition 1.7.5. A subset B of a vector space V is a basis for V if
1. B spans V and
2. B is linearly independent.
Definition 1.7.6. A vector space V is called finite dimensional if it has a basis consisting of finitely many vectors. The dimension of V, denoted by dim V, is the number of vectors in a basis for V. The dimension of the zero vector space {0} is defined to be zero. A vector space that has no finite basis is called infinite dimensional.
Example 1.7.3. 1. Show that the set S = {(1, 0, 0), (0, 1, 0), (0, 0, 5)} forms a basis of the vector space $\mathbb{R}^3$.
2. Determine whether the set S = {(0, 1), (1, 0), (2, 5)} is a basis of $\mathbb{R}^2$.
Theorem 1.7.4. If a vector space V has one basis with n vectors, then every basis for V has n vectors.
Theorem 1.7.5. Let B = {v1, v2, . . . , vn} be a basis for a vector space V.
a. Any set of more than n vectors in V must be linearly dependent.
b. Any set of fewer than n vectors in V cannot span V .
Theorem 1.7.6. Let V be a vector space of dimension n.
1. If S = {v1, v2, . . . , vn} is a linearly independent set of vectors in V , then
S is a basis for V .
2. If S = {v1, v2, . . . , vn} spans V , then S is a basis for V .
Chapter 2
Matrices and determinants
2.1 Definition of matrix and basic operations
Definition 2.1.1. An m×n matrix A is a rectangular array of numbers, real
or complex, with m rows and n columns.
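The following are all examples of matrices:
$$\begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}, \quad \begin{bmatrix} \sqrt{5} & -2 & 1 \\ \pi & 3 & i \end{bmatrix}, \quad [2], \quad \begin{bmatrix} 1 & 1 & 1 & 1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$$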
The size of a matrix is a description of the numbers of rows and columns it
has. A matrix is called m × n (pronounced m by n) if it has m rows and n
columns.
A 1×m matrix is called a row matrix (or row vector), and an n×1 matrix
is called a column matrix (or column vector).
A general m × n matrix A has the form
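$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$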
The diagonal entries of A are a11, a22, a33, . . . , and if m = n (that is, if A has
the same number of rows as columns), then A is called a square matrix.
A square matrix whose nondiagonal entries are all zero is called a diagonal
matrix. A diagonal matrix all of whose diagonal entries are the same is called
a scalar matrix. If the scalar on the diagonal is 1, the scalar matrix is called
an identity matrix.
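For example, let
$$A = \begin{bmatrix} 2 & 4 & 5 \\ 0 & 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 1 \\ 4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
The diagonal entries of A are 2 and 3, but A is not square; B is a square matrix of size 2 × 2 with diagonal entries 3 and 5; C is a diagonal matrix; D is a 3 × 3 identity matrix. The n × n identity matrix is denoted by $I_n$ (or simply $I$ if its size is understood).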
Remark 2.1.1. Two matrices are equal if they have the same size and if their corresponding entries are equal. Thus, if $A = [a_{ij}]_{m \times n}$ and $B = [b_{ij}]_{r \times s}$, then $A = B$ if and only if $m = r$, $n = s$ and $a_{ij} = b_{ij}$ for all $i$ and $j$.
Matrix Addition and scalar Multiplication
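If $A = [a_{ij}]$ and $B = [b_{ij}]$ are m × n matrices, their sum A + B is the m × n matrix obtained by adding the corresponding entries:
$$A + B = [a_{ij} + b_{ij}]$$
Example 2.1.1. Let
$$A = \begin{bmatrix} 1 & 4 & 0 \\ -2 & 6 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} -3 & 1 & -1 \\ 3 & 0 & 2 \end{bmatrix} \quad\text{and}\quad C = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix}$$
Then
$$A + B = \begin{bmatrix} -2 & 5 & -1 \\ 1 & 6 & 7 \end{bmatrix}$$
but neither A + C nor B + C is defined.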
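If A is an m × n matrix and c is a scalar, then the scalar multiple cA is the m × n matrix obtained by multiplying each entry of A by c:
$$cA = c[a_{ij}] = [ca_{ij}]$$
Example 2.1.2. For the matrix A in Example 2.1.1,
$$2A = \begin{bmatrix} 2 & 8 & 0 \\ -4 & 12 & 10 \end{bmatrix}, \quad \frac{1}{2}A = \begin{bmatrix} \frac{1}{2} & 2 & 0 \\ -1 & 3 & \frac{5}{2} \end{bmatrix}, \quad (-1)A = \begin{bmatrix} -1 & -4 & 0 \\ 2 & -6 & -5 \end{bmatrix}$$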
The matrix (−1)A is written as −A and called the negative of A. As with
vectors, we can use this fact to define the difference of two matrices: If A and
B are the same size, then
A − B = A + (−B)
2.2 Product of matrices and some algebraic
properties; Transpose of a matrix
Matrix Multiplication
Definition 2.2.1. If A is an m × n matrix and B is an n × r matrix, then the product C = AB is an m × r matrix. The (i, j) entry of the product is computed from row i of A and column j of B as follows:
$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}$$
[Figure: row $i$ of $A$ ($m \times n$) times column $j$ of $B$ ($n \times r$) gives the entry $c_{ij}$ of $C = AB$ ($m \times r$).]
Note 2. For AB to exist, the number of columns of A must equal the number of rows of B.
[Figure: sizes in a product: $A$ is $m \times n$ and $B$ is $n \times r$; the inner sizes must be the same, and the outer sizes give the size $m \times r$ of $AB$.]
Remark 2.2.1. If A is an m × n matrix and B is an n × r matrix, then AB will be an m × r matrix.
Example 2.2.1. Let $A = \begin{bmatrix} 1 & 3 \\ 2 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 0 & 1 \\ 3 & -2 & 6 \end{bmatrix}$. Determine AB and BA, if the products exist.
Solution. A has two columns and B has two rows; thus AB exists. Interpret A in terms of its rows and B in terms of its columns and multiply the rows by the columns. We find that
$$AB = \begin{bmatrix} 1 & 3 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 5 & 0 & 1 \\ 3 & -2 & 6 \end{bmatrix} = \begin{bmatrix} 14 & -6 & 19 \\ 10 & 0 & 2 \end{bmatrix}$$
BA does not exist because B has three columns and A has two rows.
We see that the order in which two matrices are multiplied is important. Unlike multiplication of real numbers, matrix multiplication is not commutative. In general, for two matrices A and B, $AB \ne BA$.
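The row-by-column rule and the size condition can be sketched directly. The code below is an illustration only, assuming matrices are stored as lists of rows; `matmul` is a hypothetical name, and raising `ValueError` for incompatible sizes is a design choice of this sketch.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x r matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    r = len(B[0])
    # Entry (i, j) is the sum of a_ik * b_kj over k.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(len(A))]


A = [[1, 3], [2, 0]]
B = [[5, 0, 1], [3, -2, 6]]
print(matmul(A, B))  # [[14, -6, 19], [10, 0, 2]]

try:
    matmul(B, A)  # B is 2 x 3 and A is 2 x 2, so BA is not defined
except ValueError as err:
    print("BA does not exist:", err)
```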
Matrix Multiplication in Terms of Columns
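Consider the product AB where A is an m × n matrix and B is an n × r matrix (so that AB exists). Let the columns of B be the matrices $B_1, B_2, \ldots, B_r$. Write B as $\begin{bmatrix} B_1 & B_2 & \cdots & B_r \end{bmatrix}$. Thus
$$AB = A \begin{bmatrix} B_1 & B_2 & \cdots & B_r \end{bmatrix}$$
Matrix multiplication implies that the columns of the product are $AB_1, AB_2, \ldots, AB_r$. We can write
$$AB = \begin{bmatrix} AB_1 & AB_2 & \cdots & AB_r \end{bmatrix}$$
For example, suppose $A = \begin{bmatrix} 2 & 0 \\ 1 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 1 & 3 \\ 0 & 2 & -1 \end{bmatrix}$. Then
$$AB = \begin{bmatrix} \begin{bmatrix} 2 & 0 \\ 1 & 5 \end{bmatrix} \begin{bmatrix} 4 \\ 0 \end{bmatrix} & \begin{bmatrix} 2 & 0 \\ 1 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} & \begin{bmatrix} 2 & 0 \\ 1 & 5 \end{bmatrix} \begin{bmatrix} 3 \\ -1 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 8 & 2 & 6 \\ 4 & 11 & -2 \end{bmatrix}$$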
Theorem 2.2.1. Let A, B and C be matrices and r and s be scalars. Assume
that the sizes of the matrices are such that the operations can be performed.
Properties of Matrix Addition and Scalar Multiplication
1. A + B = B + A
2. A + (B + C) = (A + B) + C
3. A + 0 = 0 + A = A (where 0 is the appropriate zero matrix)
4. r(A + B) = rA + rB
5. (r + s)C = rC + sC
6. r(sC) = (rs)C
Properties of Matrix Multiplication
1. A(BC) = (AB)C
2. A(B + C) = AB + AC
3. (A + B)C = AC + BC
4. AI = IA = A (where I is the appropriate identity matrix)
5. r(AB) = (rA)B = A(rB)
Note 3. $AB \ne BA$ in general. Multiplication of matrices is not commutative.
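Example 2.2.2. Compute the product ABC of the following three matrices.
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 1 & 3 \\ -1 & 0 & -2 \end{bmatrix}, \quad
C = \begin{bmatrix} 4 \\ -1 \\ 0 \end{bmatrix}
\]
Solution. Let us check that the product ABC exists before we spend time multiplying matrices. The sizes are
\[
\underbrace{A}_{2 \times 2}\;\underbrace{B}_{2 \times 3}\;\underbrace{C}_{3 \times 1} \;=\; \underbrace{ABC}_{2 \times 1}
\]
The inner sizes match at each step, so the product exists and will be a 2 × 1 matrix. Since matrix multiplication is associative, the matrices in the product ABC can be grouped together in any manner for multiplying, as long as the order is maintained. Let us use the grouping (AB)C. This is probably the most natural. We get
\[
AB = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 3 \\ -1 & 0 & -2 \end{bmatrix}
= \begin{bmatrix} -2 & 1 & -1 \\ 1 & 3 & 11 \end{bmatrix}
\]
and
\[
(AB)C = \begin{bmatrix} -2 & 1 & -1 \\ 1 & 3 & 11 \end{bmatrix}
\begin{bmatrix} 4 \\ -1 \\ 0 \end{bmatrix}
= \begin{bmatrix} -9 \\ 1 \end{bmatrix}.
\]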
Exercise 2.2.2. Compute each of the following expressions for
\[
A = \begin{bmatrix} 2 & 0 \\ -1 & 5 \end{bmatrix}, \quad
B = \begin{bmatrix} -1 & 1 \\ 2 & 4 \end{bmatrix}, \quad
C = \begin{bmatrix} 3 & 4 \\ 0 & 2 \end{bmatrix}
\]
1. $A - 3B^2$
2. $A^2B + 2C^3$
The Transpose of a Matrix
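Definition 2.2.2. The transpose of an m × n matrix A is the n × m matrix $A^T$ obtained by interchanging the rows and columns of A. That is, the ith column of $A^T$ is the ith row of A for all i.
Example 2.2.3. Let
\[
A = \begin{bmatrix} 1 & 3 & 2 \\ 5 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad
C = \begin{bmatrix} 5 & -1 & 2 \end{bmatrix}
\]
Then their transposes are
\[
A^T = \begin{bmatrix} 1 & 5 \\ 3 & 0 \\ 2 & 1 \end{bmatrix}, \quad
B^T = \begin{bmatrix} a & c \\ b & d \end{bmatrix}, \quad
C^T = \begin{bmatrix} 5 \\ -1 \\ 2 \end{bmatrix}
\]
Definition 2.2.3. A square matrix A is symmetric if $A^T = A$; that is, if A is equal to its own transpose.
Example 2.2.4. Let
\[
A = \begin{bmatrix} 1 & 3 & 2 \\ 3 & 5 & 0 \\ 2 & 0 & 4 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 1 & 2 \\ -1 & 3 \end{bmatrix}
\]
Then A is symmetric, since $A^T = A$; but B is not symmetric, since
\[
B^T = \begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix} \neq B.
\]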
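Definitions 2.2.2 and 2.2.3 translate almost word for word into code. The sketch below is only an illustration; the function names transpose and is_symmetric are ours, and matrices are assumed to be lists of rows.

```python
def transpose(A):
    """Row i of A becomes column i of the result."""
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    """A square matrix is symmetric when it equals its own transpose."""
    return len(A) == len(A[0]) and transpose(A) == A

A = [[1, 3, 2], [3, 5, 0], [2, 0, 4]]
B = [[1, 2], [-1, 3]]

print(transpose([[1, 3, 2], [5, 0, 1]]))  # [[1, 5], [3, 0], [2, 1]]
print(is_symmetric(A), is_symmetric(B))   # True False
```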
2.3 Elementary operations and its properties
Definition 2.3.1. A matrix is in row echelon form if it satisfies the following
properties:
1. Any rows consisting entirely of zeros are at the bottom.
2. In each non-zero row, the first non-zero entry (called the leading entry)
is in a column to the left of any leading entries below it.
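Example 2.3.1. The following matrices are in row echelon form:
\[
\begin{bmatrix} 2 & 4 & 1 \\ 0 & -1 & 2 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 5 \\ 0 & 0 & 4 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]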
Elementary Row Operations
Definition 2.3.2. The following elementary row operations can be performed
on a matrix:
1. Interchange two rows.
2. Multiply a row by a non-zero constant.
3. Add a multiple of a row to another row.
We will use the following shorthand notation for the three elementary row
operations:
1. Ri ←→ Rj means interchange rows i and j.
2. kRi means multiply row i by k.
3. Ri + kRj means add k times row j to row i (and replace row i with the
result).
The process of applying elementary row operations to bring a matrix into row echelon form is called row reduction.
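Example 2.3.2. Reduce the following matrix to echelon form:
\[
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 2 & 4 & 0 & 0 & 2 \\ 2 & 3 & 2 & 1 & 5 \\ -1 & 1 & 3 & 6 & 5 \end{bmatrix}
\]
Solution.
\[
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 2 & 4 & 0 & 0 & 2 \\ 2 & 3 & 2 & 1 & 5 \\ -1 & 1 & 3 & 6 & 5 \end{bmatrix}
\xrightarrow{\substack{R_2 - 2R_1 \\ R_3 - 2R_1 \\ R_4 + R_1}}
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 0 & 0 & 8 & 8 & -8 \\ 0 & -1 & 10 & 9 & -5 \\ 0 & 3 & -1 & 2 & 10 \end{bmatrix}
\xrightarrow{R_2 \leftrightarrow R_3}
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 0 & -1 & 10 & 9 & -5 \\ 0 & 0 & 8 & 8 & -8 \\ 0 & 3 & -1 & 2 & 10 \end{bmatrix}
\]
\[
\xrightarrow{R_4 + 3R_2}
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 0 & -1 & 10 & 9 & -5 \\ 0 & 0 & 8 & 8 & -8 \\ 0 & 0 & 29 & 29 & -5 \end{bmatrix}
\xrightarrow{\frac{1}{8}R_3}
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 0 & -1 & 10 & 9 & -5 \\ 0 & 0 & 1 & 1 & -1 \\ 0 & 0 & 29 & 29 & -5 \end{bmatrix}
\xrightarrow{R_4 - 29R_3}
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 0 & -1 & 10 & 9 & -5 \\ 0 & 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 & 24 \end{bmatrix}
\]
With this final step, we have reduced our matrix to echelon form.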
Definition 2.3.3. Matrices A and B are row equivalent if there is a sequence
of elementary row operations that converts A into B.
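The matrices in Example 2.3.2,
\[
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 2 & 4 & 0 & 0 & 2 \\ 2 & 3 & 2 & 1 & 5 \\ -1 & 1 & 3 & 6 & 5 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 1 & 2 & -4 & -4 & 5 \\ 0 & -1 & 10 & 9 & -5 \\ 0 & 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 & 24 \end{bmatrix},
\]
are row equivalent.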
Definition 2.3.4. The rank of a matrix is the number of non-zero rows in
its row echelon form.
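Row reduction and the rank can be sketched in Python as follows. This is one possible implementation, not the procedure of the notes: it uses only row interchanges and row replacements (no scaling), so for the matrix of Example 2.3.2 its third row is 0 0 8 8 −8 rather than 0 0 1 1 −1. Echelon forms are not unique, but the number of non-zero rows is the same. The names row_echelon and rank are our own, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def row_echelon(M):
    """Return a row echelon form of M (list of rows) using elementary row operations."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0                                   # index of the next pivot row
    for c in range(cols):
        # look for a row at or below r with a non-zero entry in column c
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]     # interchange two rows
        for i in range(r + 1, rows):        # R_i - k R_r clears the entry below the pivot
            k = A[i][c] / A[r][c]
            A[i] = [a - k * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return A

def rank(M):
    """Number of non-zero rows in a row echelon form of M."""
    return sum(any(x != 0 for x in row) for row in row_echelon(M))

M = [[1, 2, -4, -4, 5],
     [2, 4, 0, 0, 2],
     [2, 3, 2, 1, 5],
     [-1, 1, 3, 6, 5]]
E = row_echelon(M)
print(rank(M))  # 4
```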
Definition 2.3.5. A matrix is in reduced row echelon form if it satisfies the
following properties:
1. It is in row echelon form.
2. The leading entry in each non-zero row is a 1 (called a leading 1 ).
3. Each column containing a leading 1 has zeros everywhere else.
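For 2 × 2 matrices, the possible reduced row echelon forms are
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & * \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
where ∗ can be any number.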
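Exercise 2.3.1. Determine whether the given matrix is in row echelon form. If it is, state whether it is also in reduced row echelon form.
\[
\text{a. } \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 3 \\ 0 & 1 & 0 \end{bmatrix} \quad
\text{b. } \begin{bmatrix} 7 & 0 & 1 & 0 \\ 0 & 1 & -1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad
\text{c. } \begin{bmatrix} 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad
\text{d. } \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
\[
\text{e. } \begin{bmatrix} 1 & 0 & 3 & -4 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 5 & 0 & 1 \end{bmatrix} \quad
\text{f. } \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \quad
\text{g. } \begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]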
2.4 Inverse of a matrix and its properties
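Definition 2.4.1. An n × n matrix A is invertible if there exists an n × n matrix B such that $AB = BA = I_n$. Such a matrix B is called an inverse of A.
Example 2.4.1. Prove that the matrix
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\quad\text{has an inverse}\quad
B = \begin{bmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{bmatrix}.
\]
Solution. We have that
\[
AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\begin{bmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2
\]
and
\[
BA = \begin{bmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2.
\]
Thus $AB = BA = I_2$, proving that the matrix A has an inverse B.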
Theorem 2.4.1. If A is an invertible matrix, then its inverse is unique.
Proof. Let B and C be inverses of A. Thus $AB = BA = I_n$ and $AC = CA = I_n$. Multiply both sides of the equation $AB = I_n$ on the left by C and use the algebraic properties of matrices:
\[
\begin{aligned}
C(AB) &= CI_n \\
(CA)B &= C \\
I_nB &= C \\
B &= C
\end{aligned}
\]
Thus an invertible matrix has only one inverse.
Definition 2.4.2. If an n × n matrix A is invertible, then $A^{-1}$ is called the inverse of A and denotes the unique n × n matrix such that $AA^{-1} = A^{-1}A = I_n$.
Determining the Inverse of a Matrix
We now derive a method for finding the inverse of a matrix. The method is based on the Gauss-Jordan algorithm. Let A be an invertible matrix. Then $AA^{-1} = I_n$. Let the columns of $A^{-1}$ be $X_1, X_2, \ldots, X_n$ and the columns of $I_n$ be $e_1, e_2, \ldots, e_n$. Express $A^{-1}$ and $I_n$ in terms of their columns,
\[
A^{-1} = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}
\quad\text{and}\quad
I_n = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}.
\]
We shall find $A^{-1}$ by finding $X_1, X_2, \ldots, X_n$. Write the equation $AA^{-1} = I_n$ in the form
\[
A \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}.
\]
Using the column form of matrix multiplication,
\[
\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix} = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}.
\]
Thus
\[
AX_1 = e_1, \quad AX_2 = e_2, \quad \ldots, \quad AX_n = e_n.
\]
Therefore $X_1, X_2, \ldots, X_n$ are solutions to the systems $AX = e_1, AX = e_2, \ldots, AX = e_n$, all of which have the same matrix of coefficients A. Solve these systems by using Gauss-Jordan elimination on the large augmented matrix $[A : e_1 \; e_2 \; \cdots \; e_n]$. Since the solutions $X_1, X_2, \ldots, X_n$ are unique (they are the columns of $A^{-1}$),
\[
[A : e_1 \; e_2 \; \cdots \; e_n] \approx \cdots \approx [I_n : X_1 \; X_2 \; \cdots \; X_n].
\]
Thus, when $A^{-1}$ exists,
\[
[A : I_n] \approx \cdots \approx [I_n : B] \quad\text{where } B = A^{-1}.
\]
On the other hand, if the reduced echelon form of $[A : I_n]$ is computed and the first part is not of the form $I_n$, then A has no inverse.
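The method can be sketched in a few lines of Python. This is a minimal illustration under our own naming (the function inverse is not from the notes); it builds [A : I], applies the three elementary row operations with exact fractions, and returns None when the left block cannot be reduced to the identity. The test matrix is the one used in Example 2.4.2.

```python
from fractions import Fraction

def inverse(A):
    """Row-reduce [A : I] to [I : B]; return B, or None if A has no inverse."""
    n = len(A)
    # augmented matrix [A : I_n] with exact rational entries
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            return None                       # left block cannot become I_n
        M[c], M[pivot] = M[pivot], M[c]       # interchange rows
        p = M[c][c]
        M[c] = [x / p for x in M[c]]          # scale the row to get a leading 1
        for i in range(n):                    # clear the rest of column c
            if i != c:
                k = M[i][c]
                M[i] = [a - k * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]             # right block is the inverse

A = [[1, 2, -1], [2, 2, 4], [1, 3, -3]]
B = inverse(A)
for row in B:
    print([str(x) for x in row])
```

A singular input such as [[1, 2], [2, 4]] makes the function return None, which corresponds to the case where the first part of the reduced matrix is not the identity.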
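Example 2.4.2. Find the inverse of
\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 2 & 4 \\ 1 & 3 & -3 \end{bmatrix}
\]
if it exists.
Solution. Gauss-Jordan elimination produces
\[
[A \mid I] =
\left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 2 & 2 & 4 & 0 & 1 & 0 \\ 1 & 3 & -3 & 0 & 0 & 1 \end{array}\right]
\xrightarrow{\substack{R_2 - 2R_1 \\ R_3 - R_1}}
\left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & -2 & 6 & -2 & 1 & 0 \\ 0 & 1 & -2 & -1 & 0 & 1 \end{array}\right]
\]
\[
\xrightarrow{-\frac{1}{2}R_2}
\left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & 1 & -3 & 1 & -\frac{1}{2} & 0 \\ 0 & 1 & -2 & -1 & 0 & 1 \end{array}\right]
\xrightarrow{R_3 - R_2}
\left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & 1 & -3 & 1 & -\frac{1}{2} & 0 \\ 0 & 0 & 1 & -2 & \frac{1}{2} & 1 \end{array}\right]
\]
\[
\xrightarrow{\substack{R_1 + R_3 \\ R_2 + 3R_3}}
\left[\begin{array}{rrr|rrr} 1 & 2 & 0 & -1 & \frac{1}{2} & 1 \\ 0 & 1 & 0 & -5 & 1 & 3 \\ 0 & 0 & 1 & -2 & \frac{1}{2} & 1 \end{array}\right]
\xrightarrow{R_1 - 2R_2}
\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 9 & -\frac{3}{2} & -5 \\ 0 & 1 & 0 & -5 & 1 & 3 \\ 0 & 0 & 1 & -2 & \frac{1}{2} & 1 \end{array}\right]
\]
Therefore,
\[
A^{-1} = \begin{bmatrix} 9 & -\frac{3}{2} & -5 \\ -5 & 1 & 3 \\ -2 & \frac{1}{2} & 1 \end{bmatrix}
\]
(You should always check that $AA^{-1} = I$ by direct multiplication.)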
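Example 2.4.3. Determine the inverse of the matrix
\[
\begin{bmatrix} 1 & -1 & -2 \\ 2 & -3 & -5 \\ -1 & 3 & 5 \end{bmatrix}
\]
Exercise 2.4.2. Find the inverse of
\[
\begin{bmatrix} 2 & 1 & -4 \\ -4 & -1 & 6 \\ -2 & 2 & -2 \end{bmatrix}
\]
if it exists.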
2.5 Determinant of a matrix and its properties
Definition 2.5.1. The determinant of a 2 × 2 matrix A is denoted |A| and is given by
\[
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}
\]
Observe that the determinant of a 2 × 2 matrix is given by the difference of the products of the two diagonals of the matrix.
The notation det(A) is also used for the determinant of A.
Example 2.5.1. Find the determinant of the matrix
\[
\begin{bmatrix} 2 & 4 \\ -3 & 1 \end{bmatrix}
\]
Solution. Applying the above definition we get
\[
\begin{vmatrix} 2 & 4 \\ -3 & 1 \end{vmatrix} = (2 \times 1) - (4 \times (-3)) = 2 + 12 = 14
\]
The determinant of a 3 × 3 matrix is defined in terms of determinants of 2 × 2 matrices. The determinant of a 4 × 4 matrix is defined in terms of determinants of 3 × 3 matrices, and so on. For these definitions we need the following concepts of minor and cofactor.
Definition 2.5.2. Let A be a square matrix.
The minor of the element $a_{ij}$ is denoted $M_{ij}$ and is the determinant of the matrix that remains after deleting row i and column j of A.
The cofactor of $a_{ij}$ is denoted $C_{ij}$ and is given by
\[
C_{ij} = (-1)^{i+j} M_{ij}
\]
Note that the minor and cofactor differ in at most sign.
Example 2.5.2. Determine the minors and cofactors of the elements $a_{11}$ and $a_{31}$ of the following matrix A.
\[
A = \begin{bmatrix} 1 & 0 & 3 \\ 4 & -1 & 2 \\ 0 & -2 & 1 \end{bmatrix}
\]
Solution. Applying the above definitions we get the following.
Minor of $a_{11}$:
\[
M_{11} = \begin{vmatrix} -1 & 2 \\ -2 & 1 \end{vmatrix} = (-1 \times 1) - (2 \times (-2)) = 3
\]
Cofactor of $a_{11}$: $C_{11} = (-1)^{1+1} M_{11} = (-1)^2 \cdot 3 = 3$
Minor of $a_{31}$:
\[
M_{31} = \begin{vmatrix} 0 & 3 \\ -1 & 2 \end{vmatrix} = (0 \times 2) - (3 \times (-1)) = 3
\]
Cofactor of $a_{31}$: $C_{31} = (-1)^{3+1} M_{31} = (-1)^4 \cdot 3 = 3$
Definition 2.5.3. The determinant of a square matrix is the sum of the
products of the elements of the first row and their cofactors.
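\[
\begin{aligned}
\text{If A is } 3 \times 3,\quad |A| &= a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} \\
\text{If A is } 4 \times 4,\quad |A| &= a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} + a_{14}C_{14} \\
&\;\;\vdots \\
\text{If A is } n \times n,\quad |A| &= a_{11}C_{11} + a_{12}C_{12} + \cdots + a_{1n}C_{1n}
\end{aligned}
\]
These equations are called cofactor expansions of |A|.
Example 2.5.3. Evaluate the determinant of the following matrix A.
\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 0 & 1 \\ 4 & 2 & 1 \end{bmatrix}
\]
Solution. Using the elements of the first row and their corresponding cofactors we get
\[
\begin{aligned}
|A| &= a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} \\
&= 1(-1)^2 \begin{vmatrix} 0 & 1 \\ 2 & 1 \end{vmatrix} + 2(-1)^3 \begin{vmatrix} 3 & 1 \\ 4 & 1 \end{vmatrix} + (-1)(-1)^4 \begin{vmatrix} 3 & 0 \\ 4 & 2 \end{vmatrix} \\
&= [(0 \times 1) - (1 \times 2)] - 2[(3 \times 1) - (1 \times 4)] - [(3 \times 2) - (0 \times 4)] \\
&= -2 + 2 - 6 = -6
\end{aligned}
\]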
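Theorem 2.5.1. The determinant of a square matrix is the sum of the products of the elements of any row or column and their cofactors.
\[
\begin{aligned}
i\text{th row expansion:}\quad |A| &= a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in} \\
j\text{th column expansion:}\quad |A| &= a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj}
\end{aligned}
\]
There is a useful rule that can be used to give the sign part, $(-1)^{i+j}$, of the cofactors in these expansions. The rule is summarized in the following array.
\[
\begin{bmatrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ \vdots & & & & \end{bmatrix}
\]
Example 2.5.4. Find the determinant of the following matrix using the second row.
\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 0 & 1 \\ 4 & 2 & 1 \end{bmatrix}
\]
Solution. Expanding the determinant in terms of the second row we get
\[
\begin{aligned}
|A| &= a_{21}C_{21} + a_{22}C_{22} + a_{23}C_{23} \\
&= -3 \begin{vmatrix} 2 & -1 \\ 2 & 1 \end{vmatrix} + 0 \begin{vmatrix} 1 & -1 \\ 4 & 1 \end{vmatrix} - 1 \begin{vmatrix} 1 & 2 \\ 4 & 2 \end{vmatrix} \\
&= -3[(2 \times 1) - (-1 \times 2)] + 0[(1 \times 1) - (-1 \times 4)] - 1[(1 \times 2) - (2 \times 4)] \\
&= -12 + 0 + 6 = -6
\end{aligned}
\]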
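Theorem 2.5.1 says that every row gives the same value. The recursive sketch below (function names minor and det are ours; indices are 0-based, which leaves the sign $(-1)^{i+j}$ unchanged) expands along a chosen row and can be used to check Examples 2.5.3 and 2.5.4. It is meant for small matrices only, since cofactor expansion becomes slow as n grows.

```python
def minor(A, i, j):
    """Matrix that remains after deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A, row=0):
    """Determinant by cofactor expansion along the given row (0-based)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** (row + j) * A[row][j] * det(minor(A, row, j))
               for j in range(len(A)))

A = [[1, 2, -1], [3, 0, 1], [4, 2, 1]]
print(det(A, row=0), det(A, row=1), det(A, row=2))  # -6 -6 -6
```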
Exercise 2.5.2. Evaluate the determinant of the following 4 × 4 matrix.
\[
A = \begin{bmatrix} 2 & 1 & 0 & 4 \\ 0 & -1 & 0 & 2 \\ 7 & -2 & 3 & 5 \\ 0 & 1 & 0 & -3 \end{bmatrix}
\]
Computing Determinants of 2 × 2 and 3 × 3 Matrices
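The determinants of 2 × 2 and 3 × 3 matrices can be found quickly using diagonals. For a 2 × 2 matrix the actual diagonals are used, while in the case of a 3 × 3 matrix the diagonals of an array consisting of the matrix with its first two columns added to the right are used. A determinant is equal to the sum of the diagonal products that go from left to right minus the sum of the diagonal products that go from right to left, as follows.
\[
\text{2 × 2 matrix A:}\quad
\begin{array}{cc} a_1 & b_1 \\ a_2 & b_2 \end{array}
\qquad\qquad
\text{3 × 3 matrix A:}\quad
\begin{array}{ccc|cc} a_1 & b_1 & c_1 & a_1 & b_1 \\ a_2 & b_2 & c_2 & a_2 & b_2 \\ a_3 & b_3 & c_3 & a_3 & b_3 \end{array}
\]
\[
\begin{aligned}
\text{2 × 2 matrix:}\quad |A| &= a_1b_2 - a_2b_1 \\
\text{3 × 3 matrix:}\quad |A| &= \underbrace{a_1b_2c_3 + b_1c_2a_3 + c_1a_2b_3}_{\text{diagonal products from left to right}} - \underbrace{(c_1b_2a_3 + a_1c_2b_3 + b_1a_2c_3)}_{\text{diagonal products from right to left}}
\end{aligned}
\]
For example:
\[
A = \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix},
\qquad
B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 1 \\ 5 & 2 & 6 \end{bmatrix}
\quad\text{with array}\quad
\begin{array}{ccc|cc} 1 & 2 & 3 & 1 & 2 \\ 4 & 0 & 1 & 4 & 0 \\ 5 & 2 & 6 & 5 & 2 \end{array}
\]
\[
|A| = 2 - 12 = -10, \qquad |B| = 0 + 10 + 24 - 0 - 2 - 48 = -16
\]
There are no such short cuts for computing determinants of larger matrices.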
Theorem 2.5.3 (Properties of determinants). Let A be an n × n matrix and
c be a non-zero scalar
a. If a matrix B is obtained from A by multiplying the elements of a row
(column) by c then |B| = c|A|
b. If a matrix B is obtained from A by interchanging two rows (columns) then
|B| = −|A|
c. If a matrix B is obtained from A by adding a multiple of one row (column)
to another row (column), then |B| = |A|
Example 2.5.5. Evaluate the determinant [determinant not preserved in the source].
Solution. We examine the rows and columns of the determinant to see if we can create zeros in a row or column using the above operations. Note that we can create zeros in the second column by adding twice the third column to it: [intermediate determinants not preserved in the source]
\[
= (-3)(9 - 2) = -21
\]
Definition 2.5.4. A square matrix A is said to be singular if |A| = 0. A is nonsingular if |A| ≠ 0.
Theorem 2.5.4. Let A be a square matrix. A is singular if
a. all the elements of a row (column) are zero.
b. two rows (columns) are equal.
c. two rows (columns) are proportional.
[Note that (b) is a special case of (c), but we list it to give it special emphasis.]
Example 2.5.6. Show that the following matrices are singular.
\[
\text{(a) } A = \begin{bmatrix} 2 & 0 & -7 \\ 3 & 0 & 1 \\ -4 & 0 & 9 \end{bmatrix}, \qquad
\text{(b) } B = \begin{bmatrix} 2 & -1 & 3 \\ 1 & 2 & 4 \\ 2 & 4 & 8 \end{bmatrix}
\]
Solution. (a) All the elements in column 2 of A are zero. Thus |A| = 0.
(b) Observe that every element in row 3 of B is twice the corresponding element in row 2. We write
(row 3) = 2(row 2)
Row 2 and row 3 are proportional. Thus |B| = 0.
Theorem 2.5.5. Let A and B be n × n matrices and c be a nonzero scalar.
a. Determinant of a scalar multiple: $|cA| = c^n |A|$
b. Determinant of a product: $|AB| = |A||B|$
c. Determinant of a transpose: $|A^T| = |A|$
d. Determinant of an inverse: $|A^{-1}| = \frac{1}{|A|}$ (assuming $A^{-1}$ exists)
Exercise 2.5.6. Prove that $|A^{-1}A^TA| = |A|$.
Remark 2.5.1. The determinant of a triangular matrix is the product of its
diagonal elements.
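Example 2.5.7. Evaluate the determinant [determinant not preserved in the source].
Definition 2.5.5. Let A be an n × n matrix and $C_{ij}$ be the cofactor of $a_{ij}$. The matrix whose (i, j)th element is $C_{ij}$ is called the matrix of cofactors of A. The transpose of this matrix is called the adjoint of A and is denoted adj(A).
\[
\underbrace{\begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix}}_{\text{matrix of cofactors}}
\qquad
\underbrace{\begin{bmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{bmatrix}}_{\text{adjoint matrix}}
\]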
Determinants and Matrix Inverses
Theorem 2.5.8. Let A be a square matrix with |A| ≠ 0. Then A is invertible with
\[
A^{-1} = \frac{1}{|A|}\,\mathrm{adj}(A)
\]
Theorem 2.5.9. A square matrix A is invertible if and only if |A| ≠ 0.
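Example 2.5.8. Use the formula for the inverse of a matrix to compute the inverse of the matrix
\[
A = \begin{bmatrix} 2 & 0 & 3 \\ -1 & 4 & -2 \\ 1 & -3 & 5 \end{bmatrix}
\]
Solution. |A| = 25. Thus the inverse of A exists.
\[
\mathrm{adj}(A) = \begin{bmatrix} 14 & -9 & -12 \\ 3 & 7 & 1 \\ -1 & 6 & 8 \end{bmatrix}
\]
\[
A^{-1} = \frac{1}{25}\,\mathrm{adj}(A) = \begin{bmatrix} \frac{14}{25} & -\frac{9}{25} & -\frac{12}{25} \\ \frac{3}{25} & \frac{7}{25} & \frac{1}{25} \\ -\frac{1}{25} & \frac{6}{25} & \frac{8}{25} \end{bmatrix}
\]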
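The adjoint formula of Theorem 2.5.8 can be checked numerically. The sketch below is an illustration with helper names of our own (minor, det, adjoint, inverse_by_adjoint); it builds the matrix of cofactors, transposes it, and divides by the determinant using exact fractions.

```python
from fractions import Fraction

def minor(A, i, j):
    """Matrix left after deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjoint(A):
    """Transpose of the matrix of cofactors."""
    n = len(A)
    cofactors = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
                 for i in range(n)]
    return [list(col) for col in zip(*cofactors)]

def inverse_by_adjoint(A):
    """A^{-1} = adj(A) / |A|, valid only when |A| is non-zero."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return [[Fraction(x, d) for x in row] for row in adjoint(A)]

A = [[2, 0, 3], [-1, 4, -2], [1, -3, 5]]
print(det(A))      # 25
print(adjoint(A))  # [[14, -9, -12], [3, 7, 1], [-1, 6, 8]]
```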
2.6 Solving systems of linear equations
One of the applications of determinants is to find the solution to a linear system Ax = b when A is an invertible square matrix.
2.6.1 Cramer's rule
Before stating the theorem, we need to introduce some notation. If $A = [a_1 \; a_2 \; \cdots \; a_n]$ is an n × n matrix and b is in $R^n$, then let $A_i$ denote the matrix obtained from A by replacing the column $a_i$ with b. That is,
\[
A_i = [a_1 \; \cdots \; a_{i-1} \;\; b \;\; a_{i+1} \; \cdots \; a_n]
\]
Theorem 2.6.1 (Cramer's Rule). Let A be an invertible n × n matrix. Then the components of the unique solution x to Ax = b are given by
\[
x_i = \frac{\det(A_i)}{\det(A)} \quad\text{for } i = 1, 2, \ldots, n
\]
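Example 2.6.1. Use Cramer's Rule to find the solution to the system
\[
\begin{aligned}
3x_1 + x_2 &= 5 \\
-x_1 + 2x_2 + x_3 &= -2 \\
-x_2 + 2x_3 &= -1
\end{aligned}
\]
Solution. The system is equivalent to Ax = b, where
\[
A = \begin{bmatrix} 3 & 1 & 0 \\ -1 & 2 & 1 \\ 0 & -1 & 2 \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} 5 \\ -2 \\ -1 \end{bmatrix}
\]
We have
\[
A_1 = \begin{bmatrix} 5 & 1 & 0 \\ -2 & 2 & 1 \\ -1 & -1 & 2 \end{bmatrix}, \quad
A_2 = \begin{bmatrix} 3 & 5 & 0 \\ -1 & -2 & 1 \\ 0 & -1 & 2 \end{bmatrix}, \quad
A_3 = \begin{bmatrix} 3 & 1 & 5 \\ -1 & 2 & -2 \\ 0 & -1 & -1 \end{bmatrix}
\]
Computing determinants gives us det(A) = 17, det(A1) = 28, det(A2) = 1 and det(A3) = −8. Therefore, by Cramer's Rule, the solution to Ax = b is
\[
x_1 = \frac{\det(A_1)}{\det(A)} = \frac{28}{17}, \quad
x_2 = \frac{\det(A_2)}{\det(A)} = \frac{1}{17}, \quad
x_3 = \frac{\det(A_3)}{\det(A)} = \frac{-8}{17}
\]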
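Cramer's rule is short enough to write out directly. The sketch below assumes A is square with non-zero determinant; the names det and cramer are ours, and the determinant is computed by first-row cofactor expansion, which is adequate for small systems.

```python
from fractions import Fraction

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cramer(A, b):
    """Solve Ax = b via x_i = det(A_i) / det(A), where A_i has column i replaced by b."""
    d = det(A)
    if d == 0:
        raise ValueError("Cramer's rule needs an invertible matrix")
    solution = []
    for i in range(len(A)):
        A_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), d))
    return solution

A = [[3, 1, 0], [-1, 2, 1], [0, -1, 2]]
b = [5, -2, -1]
x = cramer(A, b)
print([str(v) for v in x])  # ['28/17', '1/17', '-8/17']
```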
2.6.2 Gaussian method
When row reduction is applied to the augmented matrix of a system of linear equations, we create an equivalent system that can be solved by back substitution. The entire process is known as Gaussian elimination.
Gaussian Elimination
1. Write the augmented matrix of the system of linear equations.
2. Use elementary row operations to reduce the augmented matrix to row
echelon form.
3. Using back substitution, solve the equivalent system that corresponds to
the row-reduced matrix.
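Example 2.6.2. Solve the system
\[
\begin{aligned}
2x_2 + 3x_3 &= 8 \\
2x_1 + 3x_2 + x_3 &= 5 \\
x_1 - x_2 - 2x_3 &= -5
\end{aligned}
\]
Solution. The augmented matrix is
\[
\left[\begin{array}{rrr|r} 0 & 2 & 3 & 8 \\ 2 & 3 & 1 & 5 \\ 1 & -1 & -2 & -5 \end{array}\right]
\]
We reduce the matrix to row echelon form, starting with a row interchange:
\[
\left[\begin{array}{rrr|r} 0 & 2 & 3 & 8 \\ 2 & 3 & 1 & 5 \\ 1 & -1 & -2 & -5 \end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_3}
\left[\begin{array}{rrr|r} 1 & -1 & -2 & -5 \\ 2 & 3 & 1 & 5 \\ 0 & 2 & 3 & 8 \end{array}\right]
\]
We now create a second zero in the first column, using the leading 1:
\[
\xrightarrow{R_2 - 2R_1}
\left[\begin{array}{rrr|r} 1 & -1 & -2 & -5 \\ 0 & 5 & 5 & 15 \\ 0 & 2 & 3 & 8 \end{array}\right]
\xrightarrow{\frac{1}{5}R_2}
\left[\begin{array}{rrr|r} 1 & -1 & -2 & -5 \\ 0 & 1 & 1 & 3 \\ 0 & 2 & 3 & 8 \end{array}\right]
\]
We now need another zero at the bottom of column 2:
\[
\xrightarrow{R_3 - 2R_2}
\left[\begin{array}{rrr|r} 1 & -1 & -2 & -5 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & 2 \end{array}\right]
\]
The augmented matrix is now in row echelon form, and we move to step 3. The corresponding system is
\[
\begin{aligned}
x_1 - x_2 - 2x_3 &= -5 \\
x_2 + x_3 &= 3 \\
x_3 &= 2
\end{aligned}
\]
and back substitution gives $x_3 = 2$, then $x_2 = 3 - x_3 = 3 - 2 = 1$, and finally $x_1 = -5 + x_2 + 2x_3 = -5 + 1 + 4 = 0$. We write the solution in vector form as
\[
\begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}
\]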
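The three steps of Gaussian elimination can be sketched as follows. This is a simplified version under stated assumptions: the coefficient matrix is square and the system has a unique solution. The function name gaussian_solve is ours. The code picks the first available non-zero pivot, so its intermediate matrices differ from those of Example 2.6.2 (it swaps rows 1 and 2 rather than rows 1 and 3), but the solution is the same.

```python
from fractions import Fraction

def gaussian_solve(A, b):
    """Solve Ax = b by forward elimination and back substitution (unique solution assumed)."""
    n = len(A)
    # step 1: augmented matrix [A | b] with exact rational entries
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    # step 2: reduce to row echelon form
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("system does not have a unique solution")
        M[c], M[pivot] = M[pivot], M[c]
        for i in range(c + 1, n):
            k = M[i][c] / M[c][c]
            M[i] = [a - k * p for a, p in zip(M[i], M[c])]
    # step 3: back substitution, starting from the last equation
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

A = [[0, 2, 3], [2, 3, 1], [1, -1, -2]]
b = [8, 5, -5]
x = gaussian_solve(A, b)
print([str(v) for v in x])  # ['0', '1', '2']
```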
Gauss-Jordan Elimination
1. Write the augmented matrix of the system of linear equations.
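2. Use elementary row operations to reduce the augmented matrix to reduced row echelon form.
3. If the resulting system is consistent, solve for the leading variables in terms of any remaining free variables.
Example 2.6.3. Solve the system in Example 2.6.2 by Gauss-Jordan elimination.
Solution. The reduction proceeds as it did in Example 2.6.2 until we reach the echelon form:
\[
\left[\begin{array}{rrr|r} 1 & -1 & -2 & -5 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & 2 \end{array}\right]
\]
We now create zeros above the leading 1s, working from the last column back:
\[
\xrightarrow{\substack{R_1 + 2R_3 \\ R_2 - R_3}}
\left[\begin{array}{rrr|r} 1 & -1 & 0 & -1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 2 \end{array}\right]
\xrightarrow{R_1 + R_2}
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 2 \end{array}\right]
\]
The matrix is now in reduced row echelon form, and we read off $x_1 = 0$, $x_2 = 1$, $x_3 = 2$, in agreement with Example 2.6.2.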
2.6.3 Inverse matrix method
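We now see that the matrix inverse enables us to conveniently express the solutions to certain systems of linear equations.
Theorem 2.6.2. Let AX = Y be a system of n linear equations in n variables. If $A^{-1}$ exists, the solution is unique and is given by $X = A^{-1}Y$.
Example 2.6.4. Solve the following system of equations using the inverse of the matrix of coefficients.
\[
\begin{aligned}
x_1 - x_2 - 2x_3 &= 1 \\
2x_1 - 3x_2 - 5x_3 &= 3 \\
-x_1 + 3x_2 + 5x_3 &= -2
\end{aligned}
\]
Solution. This system can be written in the following matrix form,
\[
\begin{bmatrix} 1 & -1 & -2 \\ 2 & -3 & -5 \\ -1 & 3 & 5 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}
\]
If the matrix of coefficients is invertible, the unique solution is
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 1 & -1 & -2 \\ 2 & -3 & -5 \\ -1 & 3 & 5 \end{bmatrix}^{-1}
\begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}
\]
This inverse has already been found in Example 2.4.3. Using that result we get
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 1 \\ 5 & -3 & -1 \\ -3 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}
= \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}
\]
The unique solution is $x_1 = 1$, $x_2 = -2$, $x_3 = 1$.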
2.7 Eigenvalues and Eigenvectors
The focus of this section is eigenvalues and eigenvectors, which are characteristics of matrices and linear transformations. Eigenvalues and eigenvectors arise in a wide range of fields, including finance, quantum mechanics, image processing, and mechanical engineering.
Definition 2.7.1. Let A be an n × n matrix. Then a nonzero vector u is an eigenvector of A if there exists a scalar λ such that
\[
Au = \lambda u \qquad (2.7.1)
\]
The scalar λ is called an eigenvalue of A.
When λ and u are related as in equation (2.7.1), we say that λ is the eigenvalue associated with u and that u is an eigenvector associated with λ.
The next theorem shows how to use determinants to find eigenvalues.
Theorem 2.7.1. Let A be an n × n matrix. Then λ is an eigenvalue of A if and only if $\det(A - \lambda I_n) = 0$.
Theorem 2.7.2. Let A be a square matrix, and suppose that u is an eigenvector of A associated with eigenvalue λ. Then for any scalar c ≠ 0, cu is also an eigenvector of A associated with λ.
Example 2.7.1. Find the eigenvalues for
\[
A = \begin{bmatrix} 3 & 3 \\ 6 & -4 \end{bmatrix}.
\]
Solution. Our aim is to determine the values of λ that satisfy $\det(A - \lambda I_2) = 0$. We have
\[
A - \lambda I_2 = \begin{bmatrix} 3 & 3 \\ 6 & -4 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}
= \begin{bmatrix} 3 - \lambda & 3 \\ 6 & -4 - \lambda \end{bmatrix}.
\]
Next, we compute the determinant,
\[
\det(A - \lambda I_2) = (3 - \lambda)(-4 - \lambda) - 18 = \lambda^2 + \lambda - 30
\]
Setting $\det(A - \lambda I_2) = 0$, we have
\[
\lambda^2 + \lambda - 30 = 0 \;\Rightarrow\; (\lambda - 5)(\lambda + 6) = 0 \;\Rightarrow\; \lambda = 5 \text{ or } \lambda = -6
\]
Thus the eigenvalues for A are λ = 5 and λ = −6.
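For a 2 × 2 matrix the equation det(A − λI) = 0 is the quadratic λ² − (a + d)λ + (ad − bc) = 0, so the eigenvalues follow from the quadratic formula. The sketch below covers only this 2 × 2 case with real eigenvalues; the function name eigenvalues_2x2 is our own.

```python
import math

def eigenvalues_2x2(A):
    """Real roots of det(A - t I) = t^2 - (a + d) t + (ad - bc) for A = [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    trace = a + d
    determinant = a * d - b * c
    disc = trace * trace - 4 * determinant
    if disc < 0:
        return []                      # no real eigenvalues
    root = math.sqrt(disc)
    return sorted([(trace - root) / 2, (trace + root) / 2])

A = [[3, 3], [6, -4]]
print(eigenvalues_2x2(A))  # [-6.0, 5.0]
```

As a quick check of Definition 2.7.1, u = (3, 2) satisfies Au = (15, 10) = 5u, so u is an eigenvector associated with λ = 5.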
Exercise 2.7.3. Find the eigenvalues and eigenvectors of the following matrices.
\[
\text{(A) } L = \begin{bmatrix} 4 & 4 & -2 \\ 1 & 4 & -1 \\ 3 & 6 & -1 \end{bmatrix} \qquad
\text{(B) } G = \begin{bmatrix} 3 & -4 \\ 1 & 3 \end{bmatrix} \qquad
\text{(C) } L = \begin{bmatrix} -1 & 3 & -4 \\ -2 & 3 & -4 \\ 1 & 1 & 3 \end{bmatrix}
\]
Chapter 3
Limit and continuity
3.1 Definition of limit
Definition 3.1.1 (Informal definition of limit). Suppose f(x) is defined when x is near the number a. (This means that f is defined on some open interval that contains a, except possibly at a itself.) Then we write
\[
\lim_{x \to a} f(x) = L
\]
and say “the limit of f(x), as x approaches a, equals L” if we can make the values of f(x) arbitrarily close to L (as close to L as we like) by taking x to be sufficiently close to a (on either side of a) but not equal to a.
Example 3.1.1. Guess the value of $\lim_{x \to 1} \frac{x - 1}{x^2 - 1}$.
Solution. Note that the function $f(x) = \frac{x - 1}{x^2 - 1}$ is not defined at x = 1. Consider the following table:

x < 1     f(x)        x > 1     f(x)
0.5       0.666667    1.5       0.400000
0.9       0.526316    1.1       0.476190
0.99      0.502513    1.001     0.499750
0.999     0.500250    1.0001    0.499975

On the basis of these values we guess that $\lim_{x \to 1} \frac{x - 1}{x^2 - 1} = 0.5$.
Example 3.1.1 is illustrated by the graph of f in the figure below. Now let's change f slightly by giving it the value 2 when x = 1 and calling the resulting function g:
\[
g(x) = \begin{cases} \dfrac{x - 1}{x^2 - 1} & \text{if } x \neq 1 \\ 2 & \text{if } x = 1 \end{cases}
\]
This new function g still has the same limit as x approaches 1:
\[
\lim_{x \to 1} g(x) = 0.5
\]
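A table like the one in Example 3.1.1 is easy to generate. The short Python sketch below evaluates f at points approaching 1 from each side; the particular sample points are our choice. Such a table only suggests the limit, it does not prove it.

```python
def f(x):
    """f(x) = (x - 1) / (x^2 - 1), undefined at x = 1."""
    return (x - 1) / (x**2 - 1)

left = [0.5, 0.9, 0.99, 0.999]       # x approaching 1 from below
right = [1.5, 1.1, 1.001, 1.0001]    # x approaching 1 from above

for xl, xr in zip(left, right):
    print(f"{xl:<8} {f(xl):.6f}    {xr:<8} {f(xr):.6f}")
```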
Example 3.1.2. Investigate $\lim_{x \to 0} \sin\left(\frac{\pi}{x}\right)$.
Exercise 3.1.1. Describe the behavior of the function $f(x) = \frac{x^2 - 1}{x - 1}$ near x = 1 and find the limit of f(x) at x = 1.
[Figure: graph of a function near x = 1; image not preserved]
Definition 3.1.2 (The formal definition of a limit). Let f be a function defined on some open interval that contains the number a, except possibly at a itself. Then we say that the limit of f(x) as x approaches a is L, and we write
\[
\lim_{x \to a} f(x) = L
\]
if for every number ε > 0 there is a number δ > 0 such that
\[
\text{if } 0 < |x - a| < \delta \text{ then } |f(x) - L| < \varepsilon.
\]
Example 3.1.3. Show that $\lim_{x \to 3} (4x - 5) = 7$.
Solution. Let ε be a given positive number. We want to find a number δ > 0 such that
\[
\text{if } 0 < |x - 3| < \delta \text{ then } |(4x - 5) - 7| < \varepsilon.
\]
But $|(4x - 5) - 7| = |4x - 12| = 4|x - 3|$. Therefore we want δ such that
\[
\text{if } 0 < |x - 3| < \delta \text{ then } 4|x - 3| < \varepsilon,
\]
that is,
\[
\text{if } 0 < |x - 3| < \delta \text{ then } |x - 3| < \frac{\varepsilon}{4}.
\]
This suggests that we should choose $\delta = \frac{\varepsilon}{4}$.
We now show that this δ works. Given ε > 0, choose $\delta = \frac{\varepsilon}{4}$. If $0 < |x - 3| < \delta$, then
\[
|(4x - 5) - 7| = |4x - 12| = 4|x - 3| < 4\delta = 4\left(\frac{\varepsilon}{4}\right) = \varepsilon.
\]
Thus
\[
\text{if } 0 < |x - 3| < \delta \text{ then } |(4x - 5) - 7| < \varepsilon.
\]
Therefore, by the definition of a limit,
\[
\lim_{x \to 3} (4x - 5) = 7.
\]
Example 3.1.4. Use the formal definition of a limit to prove that $\lim_{x \to 4} \frac{x^2 - 2x - 8}{x - 4} = 6$.
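One possible line of argument for Example 3.1.4 is sketched below. It follows the pattern of Example 3.1.3; the choice δ = ε is ours, and any smaller positive δ would work as well.

```latex
For $x \neq 4$ we can cancel the common factor:
\[
\frac{x^2 - 2x - 8}{x - 4} = \frac{(x - 4)(x + 2)}{x - 4} = x + 2,
\qquad\text{so}\qquad
\left| \frac{x^2 - 2x - 8}{x - 4} - 6 \right| = |(x + 2) - 6| = |x - 4|.
\]
Given $\varepsilon > 0$, choose $\delta = \varepsilon$. If $0 < |x - 4| < \delta$, then $x \neq 4$ and
\[
\left| \frac{x^2 - 2x - 8}{x - 4} - 6 \right| = |x - 4| < \delta = \varepsilon .
\]
Hence, by Definition 3.1.2, $\displaystyle \lim_{x \to 4} \frac{x^2 - 2x - 8}{x - 4} = 6$.
```

The condition 0 < |x − 4| is what allows the cancellation, since the quotient is not defined at x = 4.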
3.2 Basic limit theorems
Theorem 3.2.1 (Limit laws). Suppose that c is a constant and the limits
\[
\lim_{x \to a} f(x) \quad\text{and}\quad \lim_{x \to a} g(x)
\]
exist. Then
1. Limit of a sum: $\lim_{x \to a} [f(x) + g(x)] = \lim_{x \to a} f(x) + \lim_{x \to a} g(x)$
2. Limit of a difference: $\lim_{x \to a} [f(x) - g(x)] = \lim_{x \to a} f(x) - \lim_{x \to a} g(x)$
3. Limit of a multiple: $\lim_{x \to a} c f(x) = c \lim_{x \to a} f(x)$
4. Limit of a product: $\lim_{x \to a} f(x) g(x) = \lim_{x \to a} f(x) \cdot \lim_{x \to a} g(x)$
5. Limit of a quotient: $\lim_{x \to a} \dfrac{f(x)}{g(x)} = \dfrac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}$ if $\lim_{x \to a} g(x) \neq 0$
Example 3.2.1. Evaluate the following limits.
a. $\lim_{x \to 0} (x^2 + \cos x)$
b. $\lim_{x \to 9} \frac{1}{2}\sqrt{x}$
c. $\lim_{x \to 0} x \cos x$
d. $\lim_{x \to 0} \dfrac{x \cos x}{x^2 + \cos x}$
Theorem 3.2.2.
6. Power law: $\lim_{x \to a} [f(x)]^n = [\lim_{x \to a} f(x)]^n$ where n is a positive integer.
7. $\lim_{x \to a} c = c$
8. $\lim_{x \to a} x = a$
9. $\lim_{x \to a} x^n = a^n$ where n is a positive integer.
10. $\lim_{x \to a} \sqrt[n]{x} = \sqrt[n]{a}$ where n is a positive integer (if n is even, we assume that a > 0).
11. Root law: $\lim_{x \to a} \sqrt[n]{f(x)} = \sqrt[n]{\lim_{x \to a} f(x)}$ where n is a positive integer (if n is even, we assume that $\lim_{x \to a} f(x) > 0$).
Example 3.2.2. Evaluate $\lim_{x \to -2} \dfrac{x^3 + 3x + 1}{x^2 - \sqrt[3]{5x}}$.
Theorem 3.2.3 (Limits of Polynomials and Rational Functions).
1. If P(x) is a polynomial and a is any number, then
\[
\lim_{x \to a} P(x) = P(a).
\]
2. If P(x) and Q(x) are polynomials and Q(a) ≠ 0, then
\[
\lim_{x \to a} \frac{P(x)}{Q(x)} = \frac{P(a)}{Q(a)}.
\]
Example 3.2.3. Evaluate
\[
\lim_{x \to -1} (4x^3 - 6x^2 - 9x)
\]
Remark 3.2.1. If f(x) = g(x) when x ≠ a, then $\lim_{x \to a} f(x) = \lim_{x \to a} g(x)$, provided the limits exist.
Example 3.2.4. Find $\lim_{x \to 1} \dfrac{x - 1}{x^2 - 1}$.
Example 3.2.5. Find $\lim_{x \to -2} \dfrac{x^3 + 2x^2 - x - 2}{x^2 - 4}$.
Theorem 3.2.4. If f(x) ≤ g(x) when x is near a (except possibly at a) and the limits of f and g both exist as x approaches a, then
\[
\lim_{x \to a} f(x) \le \lim_{x \to a} g(x)
\]
Example 3.2.6. Find $\lim_{x \to 0} \sqrt{1 - x^2}$.
Theorem 3.2.5 (The Squeeze theorem). Assume f(x) ≤ g(x) ≤ h(x) for all x in some open interval about a, except possibly a itself. If $\lim_{x \to a} f(x) = \lim_{x \to a} h(x) = L$, then $\lim_{x \to a} g(x)$ exists and $\lim_{x \to a} g(x) = L$.
Example 3.2.7. Show that
a. $\lim_{x \to 0} \dfrac{\sin x}{x} = 1$.
b. $\lim_{x \to 0} \dfrac{\cos x - 1}{x} = 0$.
Exercise 3.2.6. Evaluate the following.
a. $\lim_{x \to -2} \dfrac{x^2 + x - 2}{x^2 + 5x + 6}$
b. $\lim_{x \to a} \dfrac{\frac{1}{x} - \frac{1}{a}}{x - a}$
c. $\lim_{x \to 4} \dfrac{\sqrt{x} - 2}{x^2 - 16}$
3.3 One sided limits
Definition 3.3.1 (Left-hand limit). We write
\[
\lim_{x \to a^-} f(x) = L
\]
and say the left-hand limit of f(x) as x approaches a [or the limit of f(x) as x approaches a from the left] is equal to L if we can make the values of f(x) arbitrarily close to L by taking x to be sufficiently close to a and x less than a. In this case we consider only x < a.
Definition 3.3.2 (Right-hand limit). If the values of f(x) can be made as close as we like to L by making x sufficiently close to a (but greater than a), then we write
\[
\lim_{x \to a^+} f(x) = L
\]
which is read as “the limit of f(x) as x approaches a from the right is L”.
The two-sided limit exists exactly when both one-sided limits exist and agree:
\[
\lim_{x \to a} f(x) = L \quad\text{iff}\quad \lim_{x \to a^-} f(x) = L = \lim_{x \to a^+} f(x).
\]
Example 3.3.1. Show that $\lim_{x \to 0} |x| = 0$.
Example 3.3.2. Let $f(x) = \dfrac{|x|}{x}$. Find
a. $\lim_{x \to 0^-} f(x)$
b. $\lim_{x \to 0^+} f(x)$
c. $\lim_{x \to 0} f(x)$
if it exists.
3.4 Infinite limits, limits at infinity and asymptotes
3.4.1 Limits at infinity (negative infinity)
Definition 3.4.1. If the function f is defined on an interval (a, ∞) and if we can ensure that f(x) is as close as we want to the number L by taking x large enough, then we say that f(x) approaches the limit L as x approaches infinity, and we write
\[
\lim_{x \to \infty} f(x) = L
\]
If f is defined on an interval (−∞, b) and if we can ensure that f(x) is as close as we want to the number M by taking x negative and large enough in absolute value, then we say that f(x) approaches the limit M as x approaches negative infinity, and we write
\[
\lim_{x \to -\infty} f(x) = M.
\]
Example 3.4.1. Evaluate $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$ for $f(x) = \dfrac{x}{\sqrt{x^2 + 1}}$.
3.4.2 Infinite limits
Example 3.4.2 (A two-sided infinite limit). Describe the behaviour of the function $f(x) = \dfrac{1}{x^2}$ near x = 0.
Solution. As x approaches 0 from either side, the values of f(x) are positive and grow larger and larger. Thus
\[
\lim_{x \to 0} f(x) = \lim_{x \to 0} \frac{1}{x^2} = \infty
\]
Figure 3.1: Graph of $y = \frac{1}{x^2}$
Example 3.4.3 (One-sided infinite limits). Describe the behaviour of the function $f(x) = \dfrac{1}{x}$ near x = 0.
Solution. As x approaches 0 from the right, the values of f(x) become larger and larger positive numbers, and we say that f has right-hand limit infinity at x = 0:
\[
\lim_{x \to 0^+} f(x) = \infty
\]
Similarly, the values of f(x) become larger and larger negative numbers as x approaches 0 from the left, so f has left-hand limit −∞ at x = 0:
\[
\lim_{x \to 0^-} f(x) = -\infty
\]
These statements do not say that the one-sided limits exist; they do not exist, because ∞ and −∞ are not numbers.
Example 3.4.4 (Polynomial behaviour at infinity).
a. $\lim_{x \to \infty} (3x^3 - x^2 + 2) = \infty$
b. $\lim_{x \to -\infty} (3x^3 - x^2 + 2) = -\infty$
Figure 3.2: Graph of $y = \frac{1}{x}$
The highest-degree term of a polynomial dominates the other terms as |x| grows large, so the limits of this term at ∞ and −∞ determine the limits of the whole polynomial. Thus
\[
3x^3 - x^2 + 2 = 3x^3 \left( 1 - \frac{1}{3x} + \frac{2}{3x^3} \right).
\]
As x → ∞ the factor in parentheses approaches 1 while $3x^3$ grows without bound, so
\[
\lim_{x \to \infty} (3x^3 - x^2 + 2) = \infty.
\]
The same factorization gives the limit −∞ as x → −∞, because $3x^3 \to -\infty$.
3.4.3 Asymptotes
Definition 3.4.2. The line x = a is called a vertical asymptote of the curve y = f(x) if at least one of the following statements is true.
a. $\lim_{x \to a} f(x) = \infty$
b. $\lim_{x \to a^-} f(x) = \infty$
c. $\lim_{x \to a^+} f(x) = \infty$
d. $\lim_{x \to a} f(x) = -\infty$
e. $\lim_{x \to a^-} f(x) = -\infty$
f. $\lim_{x \to a^+} f(x) = -\infty$
For instance, in Examples 3.4.2 and 3.4.3 above the y-axis is a vertical asymptote and the x-axis is a horizontal asymptote.
Exercise 3.4.1.
a. Determine the infinite limits $\lim_{x \to -3^+} \dfrac{x + 2}{x + 3}$ and $\lim_{x \to -3^-} \dfrac{x + 2}{x + 3}$.
b. Find the vertical asymptotes of the function $y = \dfrac{x^2 + 1}{3x - 2x^2}$.
c. Evaluate $\lim_{x \to 0^-} \dfrac{|x|}{x}$ and $\lim_{x \to 0^+} \dfrac{|x|}{x}$.
3.4.4 Continuity
One sided continuity
Continuity from the left and right
Definition 3.4.3. A function f is continuous from the left at a point c if
\[
\lim_{x \to c^-} f(x) = f(c)
\]
and is continuous from the right at a point c if
\[
\lim_{x \to c^+} f(x) = f(c)
\]
Definition 3.4.4. A function f is said to be continuous at a point c if the following conditions are satisfied:
1. f(c) is defined
2. $\lim_{x \to c} f(x)$ exists
3. $\lim_{x \to c} f(x) = f(c)$
Example 3.4.5. Determine whether the following functions are continuous at the point x = 2.
a. $f(x) = \dfrac{x^2 - 4}{x - 2}$
b. $g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2} & \text{if } x \neq 2 \\ 3 & \text{if } x = 2 \end{cases}$
c. $h(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2} & \text{if } x \neq 2 \\ 4 & \text{if } x = 2 \end{cases}$
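Theorem 3.4.2. Polynomials are continuous everywhere.
Example 3.4.6. Show that |x| is continuous everywhere. Recall that
\[
|x| = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -x & \text{if } x < 0 \end{cases}
\]
So |x| is continuous everywhere because $\lim_{x \to c} |x| = |c|$ for every number c.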
Theorem 3.4.3. If the functions f and g are continuous at c, then
• f + g is continuous at c
• f − g is continuous at c
• f × g is continuous at c
• f ÷ g is continuous at c if g(c) ≠ 0 and has a discontinuity at c if g(c) = 0
For the quotient, since f and g are continuous at c, $\lim_{x \to c} f(x) = f(c)$ and $\lim_{x \to c} g(x) = g(c)$. Thus, when g(c) ≠ 0,
\[
\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{\lim_{x \to c} f(x)}{\lim_{x \to c} g(x)} = \frac{f(c)}{g(c)}
\]
Theorem 3.4.4. A rational function is continuous everywhere except at the points where the denominator is zero.
Example 3.4.7. For what values of x is there a hole or a gap in the graph of $y = \dfrac{x^2 - 9}{x^2 - 5x + 6}$?
Theorem 3.4.5. If f is continuous at b and $\lim_{x \to a} g(x) = b$, then $\lim_{x \to a} f(g(x)) = f(b)$. In other words,
\[
\lim_{x \to a} f(g(x)) = f\left( \lim_{x \to a} g(x) \right)
\]
Example 3.4.8. If $g(x) = 5 - x^2$ and f(x) = |x|, then show that f(g(x)) is continuous at x = 3.
Theorem 3.4.6. If g is continuous at a and f is continuous at g(a), then the composite function f ◦ g given by (f ◦ g)(x) = f(g(x)) is continuous at a.
Continuity on a closed interval
Definition 3.4.5. A function f is said to be continuous on a closed interval
[a, b] if the following conditions are satisfied.
1. f is continuous on (a, b).
2. f is continuous from the right at a.
3. f is continuous from the left at b.
Here f is continuous on (a, b) if f is continuous at every point of (a, b).
Example 3.4.9. What can you say about the continuity of the function $f(x) = \sqrt{9 - x^2}$?
1. If c ∈ (−3, 3), then $\lim_{x \to c} f(x) = \lim_{x \to c} \sqrt{9 - x^2} = \sqrt{9 - c^2} = f(c)$, which shows f is continuous at each point in (−3, 3).
2. $\lim_{x \to -3^+} f(x) = \lim_{x \to -3^+} \sqrt{9 - x^2} = \sqrt{\lim_{x \to -3^+} (9 - x^2)} = 0 = f(-3)$
3. $\lim_{x \to 3^-} f(x) = \lim_{x \to 3^-} \sqrt{9 - x^2} = \sqrt{\lim_{x \to 3^-} (9 - x^2)} = 0 = f(3)$
Thus f is continuous on the closed interval [−3, 3].
3.5 Intermediate value theorem
Theorem 3.5.1 (Intermediate value theorem). If f is continuous on a closed interval [a, b] and k is any number between f(a) and f(b) inclusive, then there is at least one number x in the interval [a, b] such that f(x) = k.
Theorem 3.5.2. If f is continuous on [a, b] and if f(a) and f(b) are non-zero and have opposite signs, then there is at least one solution of the equation f(x) = 0 in the interval (a, b).
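Theorem 3.5.2 guarantees that a root exists but does not say where it is. A common way to locate one numerically is the bisection method, which is not part of these notes: halve the interval repeatedly, each time keeping the half on which f changes sign. The sketch below is a minimal version; the function name bisect, the tolerance, and the sample function x³ − x − 1 are all our own illustrative choices.

```python
def bisect(f, a, b, tol=1e-10):
    """Approximate a root of a continuous f on [a, b], given a sign change at the endpoints."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must be non-zero and have opposite signs")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m                 # landed exactly on a root
        if fa * fm < 0:
            b = m                    # sign change is in [a, m]
        else:
            a, fa = m, fm            # sign change is in [m, b]
    return (a + b) / 2

# f(1) = -1 and f(2) = 5 have opposite signs, so a root lies in (1, 2)
root = bisect(lambda x: x**3 - x - 1, 1, 2)
print(round(root, 6))
```

Each pass keeps an interval on which the hypotheses of Theorem 3.5.2 still hold, which is why the method cannot lose the root.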
Chapter 4
Derivatives and application of derivatives
4.1 Definition of derivatives; basic rules
Many real-world phenomena involve changing quantities - the speed of a
rocket, the inflation of currency, the number of bacteria in a culture, the
shock intensity of an earthquake, the voltage of an electrical signal, and so
forth. In this chapter we will develop the concept of a “derivative, which is
the mathematical tool for studying the rate at which one quantity changes
relative to another. The study of rates of change is closely related to the
geometric concept of a tangent line to a curve, so we will also be discussing
the general definition of a tangent line and methods for finding its slope and
equation.
Definition 4.1.1. The derivative of a function f at a number a , denoted by
f0
(a), is
f0
(a) = lim
h→0
f(a + h) − f(a)
h
if this limit exists.
If we write x = a + h , then we have h = x − a and h approaches 0 if and
only if x approaches a. Therefore an equivalent way of stating the definition
of the derivative, as we saw in finding tangent lines, is
f0
(a) = lim
x→a
f(x) − f(a)
x − a
.
Two interpretations of the derivative are as follows.
1. Geometric Interpretation of the Derivative: The derivative $f'$ of a
function f is a measure of the slope of the tangent line to the graph of f
at any point P(x, f(x)), provided that the derivative exists.
Example 4.1.1. Find an equation of the tangent line to the parabola
$y = x^2$ at the point P(1, 1).
2. Physical Interpretation of the Derivative: The derivative $f'$ of a
function f measures the instantaneous rate of change of f at x.
Example 4.1.2. Find the derivative of the function f(x) = x at the number a.
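Returning to Example 4.1.1, one possible worked solution applies Definition 4.1.1 directly with $f(x) = x^2$ and a = 1. It is a sketch of the standard approach, not necessarily the solution the notes intend.

```latex
\begin{align*}
f'(1) &= \lim_{h \to 0} \frac{(1 + h)^2 - 1^2}{h}
       = \lim_{h \to 0} \frac{2h + h^2}{h}
       = \lim_{h \to 0} (2 + h) = 2, \\
y - 1 &= 2(x - 1) \quad\Longrightarrow\quad y = 2x - 1 .
\end{align*}
```

The slope of the tangent line at P(1, 1) is 2, and the point-slope form gives the equation of the line.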
Exercise 4.1.1. Find the derivative of the following functions with respect
to x using Definition 4.1.1.
A. $f(x) = \sqrt{x}$   B. $g(x) = \frac{1}{x + 1}$   C. $h(x) = x^3 - x$   D. $h(x) = \frac{1 - x}{2 + x}$
Theorem 4.1.2 (Derivative of a Constant Function). If c is a constant, then
$$\frac{d}{dx}(c) = 0.$$
Example 4.1.3. a) If f(x) = 6, then $f'(x) = \frac{d}{dx}(6) = 0$.
b) If $f(x) = \pi^2$, then $f'(x) = \frac{d}{dx}(\pi^2) = 0$.
Theorem 4.1.3 (The Power Rule). If r is any real number and $f(x) = x^r$,
then
$$f'(x) = \frac{d}{dx}(x^r) = r x^{r-1}.$$
Exercise 4.1.4. Find the derivative of the following functions with respect to
x using the power rule.
A. $f(x) = x^{100}$   B. $g(x) = \sqrt[3]{x}$   C. $h(x) = \frac{1}{x^2}$   D. $l(x) = x^{\pi}$
Theorem 4.1.5 (The Constant Multiple Rule). If f is a differentiable function
and c is a constant, then
$$\frac{d}{dx}[c f(x)] = c f'(x).$$
Exercise 4.1.6. Find the derivative of the following functions with respect to
x using the power rule.
A. $f(x) = \frac{5}{3x^5}$   B. $g(x) = \frac{\pi}{x}$   C. $\frac{1}{x^2}$
Theorem 4.1.7 (The Sum and Difference Rule). If f and g are differentiable
functions, then
$$\frac{d}{dx}[f(x) \pm g(x)] = f'(x) \pm g'(x).$$
Theorem 4.1.8 (The Product Rule). If f and g are differentiable functions,
then
$$\frac{d}{dx}[f(x) g(x)] = f'(x) g(x) + g'(x) f(x).$$
Example 4.1.4. a) Find dy/dx if $y = (4x^2 - 1)(7x^3 + x)$.
b) Find ds/dt if $s = (1 + t)\sqrt{t}$.
c) Find dy/dx if $y = \left(2\sqrt{x} + \frac{3}{x}\right)\left(2\sqrt{x} - \frac{2}{x}\right)$.
d) Let y = uv be the product of the functions u and v. Find $y'(2)$ if u(2) = 2,
$u'(2) = -5$, v(2) = 1 and $v'(2) = 3$.
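Part d) needs only the Product Rule and the four given values. One way to write out the computation:

```latex
\begin{align*}
y'(2) &= u'(2)\,v(2) + u(2)\,v'(2) \\
      &= (-5)(1) + (2)(3) = 1 .
\end{align*}
```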
Theorem 4.1.9 (The Quotient Rule). If f and g are differentiable functions
and $g(x) \neq 0$, then
$$\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{g(x) f'(x) - f(x) g'(x)}{[g(x)]^2}.$$
Example 4.1.5. a) Find dy/dx if $y = \frac{1}{x^2 + 1}$.
b) Find ds/dt if $s = \frac{\sqrt{t}}{1 - 5t}$.
c) Find df/dθ if $f(\theta) = \frac{a + b\theta}{m + n\theta}$.
d) Find equations of all lines that pass through the point (−1, 0) and are
tangent to the curve $y = \frac{x - 1}{x + 1}$.
Theorem 4.1.10 (Chain Rule). If g is differentiable at x and f is differentiable
at g(x), then the composite function $F = f \circ g$ defined by
F(x) = f(g(x)) is differentiable at x and $F'$ is given by the product
$$F'(x) = f'(g(x)) \cdot g'(x).$$
In Leibniz notation, if y = f(u) and u = g(x) are both differentiable functions,
then
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.$$
Example 4.1.6. a) Differentiate $y = \cos 4x$.
b) Find $F'(x)$ if $F(x) = \sqrt{x^2 + 3}$.
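A possible worked solution for both parts, using the Leibniz form of the Chain Rule with the inner function named u (the substitution names are chosen here for illustration):

```latex
\begin{align*}
\text{a)}\quad & y = \cos u,\ u = 4x: &
\frac{dy}{dx} &= \frac{dy}{du}\cdot\frac{du}{dx} = (-\sin u)(4) = -4\sin 4x, \\
\text{b)}\quad & F = \sqrt{u},\ u = x^2 + 3: &
F'(x) &= \frac{1}{2\sqrt{u}}\cdot 2x = \frac{x}{\sqrt{x^2 + 3}} .
\end{align*}
```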
The Power Rule Combined with the Chain Rule: If n is any real number
and u = g(x) is differentiable, then
$$\frac{d}{dx}(u^n) = n u^{n-1} \frac{du}{dx}.$$
Alternatively,
$$\frac{d}{dx}[g(x)]^n = n [g(x)]^{n-1} \cdot g'(x).$$
Example 4.1.7. Differentiate
A. $y = (x^3 - 1)^{100}$   B. $g(x) = \sin(\cos(\tan x))$   C. $h(x) = \frac{1}{\sqrt[3]{x^2 + x + 1}}$   D. $l(x) = e^{\sin x}$
4.2 Derivatives of inverse functions
Definition 4.2.1. A function f is called a one-to-one function if it never
takes on the same value twice; that is,
$f(x_1) \neq f(x_2)$ whenever $x_1 \neq x_2$.
HORIZONTAL LINE TEST: A function f is one-to-one if and only if
no horizontal line intersects its graph more than once.
Definition 4.2.2. Let f be a one-to-one function with domain A and range
B. Then its inverse function $f^{-1}$ has domain B and range A and is
defined by
$$f^{-1}(y) = x \iff f(x) = y$$
for any y in B.
domain of $f^{-1}$ = range of f
range of $f^{-1}$ = domain of f
Example 4.2.1. Find the inverse of $f(x) = \sqrt{3x - 2}$.
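One way to carry out Example 4.2.1 is to solve y = f(x) for x and then interchange the roles of x and y. This is a sketch of that standard procedure, keeping track of the domain restriction that the square root imposes:

```latex
\begin{align*}
y = \sqrt{3x - 2},\ y \ge 0
  &\;\Longrightarrow\; y^2 = 3x - 2
   \;\Longrightarrow\; x = \frac{y^2 + 2}{3}, \\
f^{-1}(x) &= \frac{x^2 + 2}{3}, \qquad x \ge 0 .
\end{align*}
```

The restriction x ≥ 0 on the inverse reflects the fact that the range of f is [0, ∞).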
4.2.1 Inverse trigonometric functions
Definition 4.2.3 (Inverse Trigonometric Functions).
Function                                           Domain
$y = \sin^{-1} x$ if and only if $x = \sin y$      [−1, 1]
$y = \cos^{-1} x$ if and only if $x = \cos y$      [−1, 1]
$y = \tan^{-1} x$ if and only if $x = \tan y$      (−∞, ∞)
$y = \csc^{-1} x$ if and only if $x = \csc y$      (−∞, −1] ∪ [1, ∞)
$y = \sec^{-1} x$ if and only if $x = \sec y$      (−∞, −1] ∪ [1, ∞)
$y = \cot^{-1} x$ if and only if $x = \cot y$      (−∞, ∞)
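Theorem 4.2.1 (Derivatives of inverse trigonometric functions). Let u be a
differentiable function of x.
• $\frac{d}{dx}(\sin^{-1} u) = \frac{u'}{\sqrt{1 - u^2}}$
• $\frac{d}{dx}(\cos^{-1} u) = \frac{-u'}{\sqrt{1 - u^2}}$
• $\frac{d}{dx}(\tan^{-1} u) = \frac{u'}{1 + u^2}$
• $\frac{d}{dx}(\cot^{-1} u) = \frac{-u'}{1 + u^2}$
• $\frac{d}{dx}(\sec^{-1} u) = \frac{u'}{|u|\sqrt{u^2 - 1}}$
• $\frac{d}{dx}(\csc^{-1} u) = \frac{-u'}{|u|\sqrt{u^2 - 1}}$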
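The formulas in Theorem 4.2.1 can be spot-checked numerically for the special case u = x (so u' = 1). The sketch below is illustrative: the helper names, the test point 0.3, and the convention $\cot^{-1} x = \pi/2 - \tan^{-1} x$ are assumptions made here, not statements from the notes.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def arccot(x):
    # Convention assumed here: cot^-1 x = pi/2 - tan^-1 x, with range (0, pi).
    return math.pi / 2 - math.atan(x)


x = 0.3  # arbitrary test point inside (-1, 1)

# name -> (function, closed-form derivative from Theorem 4.2.1 with u = x)
checks = {
    "arcsin": (math.asin, 1 / math.sqrt(1 - x**2)),
    "arccos": (math.acos, -1 / math.sqrt(1 - x**2)),
    "arctan": (math.atan, 1 / (1 + x**2)),
    "arccot": (arccot, -1 / (1 + x**2)),
}

errors = {name: abs(central_difference(f, x) - exact)
          for name, (f, exact) in checks.items()}
for name, err in errors.items():
    print(f"{name}: |numerical - formula| = {err:.2e}")
```

A small discrepancy for every entry is consistent with the formulas, including the negative signs on the derivatives of $\cos^{-1}$ and $\cot^{-1}$.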
4.2.2 Hyperbolic and inverse hyperbolic functions
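Definition 4.2.4 (DEFINITION OF THE HYPERBOLIC FUNCTIONS).
$\sinh x = \frac{e^x - e^{-x}}{2}$          $\operatorname{csch} x = \frac{1}{\sinh x}$
$\cosh x = \frac{e^x + e^{-x}}{2}$          $\operatorname{sech} x = \frac{1}{\cosh x}$
$\tanh x = \frac{\sinh x}{\cosh x}$         $\coth x = \frac{\cosh x}{\sinh x}$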
Hyperbolic Identities
$\sinh(-x) = -\sinh x$,   $\cosh(-x) = \cosh x$
$\cosh^2 x - \sinh^2 x = 1$,   $1 - \tanh^2 x = \operatorname{sech}^2 x$
$\sinh(x + y) = \sinh x \cosh y + \cosh x \sinh y$
$\cosh(x + y) = \cosh x \cosh y + \sinh x \sinh y$
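Derivatives of Hyperbolic Functions
• $\frac{d}{dx}(\sinh x) = \cosh x$
• $\frac{d}{dx}(\cosh x) = \sinh x$
• $\frac{d}{dx}(\tanh x) = \operatorname{sech}^2 x$
• $\frac{d}{dx}(\operatorname{csch} x) = -\operatorname{csch} x \coth x$
• $\frac{d}{dx}(\operatorname{sech} x) = -\operatorname{sech} x \tanh x$
• $\frac{d}{dx}(\coth x) = -\operatorname{csch}^2 x$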
Inverse Hyperbolic Functions
$y = \sinh^{-1} x \iff \sinh y = x$
$y = \cosh^{-1} x \iff \cosh y = x$ and $y \ge 0$
$y = \tanh^{-1} x \iff \tanh y = x$
Since the hyperbolic functions are defined in terms of exponential functions,
it's not surprising to learn that the inverse hyperbolic functions can be
expressed in terms of logarithms.
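• $\sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right)$,   x ∈ R
• $\cosh^{-1} x = \ln\left(x + \sqrt{x^2 - 1}\right)$,   x ∈ [1, ∞)
• $\tanh^{-1} x = \frac{1}{2} \ln\frac{1 + x}{1 - x}$,   x ∈ (−1, 1)
• $\coth^{-1} x = \frac{1}{2} \ln\frac{x + 1}{x - 1}$,   x ∈ (−∞, −1) ∪ (1, ∞)
• $\operatorname{sech}^{-1} x = \ln\frac{1 + \sqrt{1 - x^2}}{x}$,   x ∈ (0, 1]
• $\operatorname{csch}^{-1} x = \ln\left(\frac{1}{x} + \frac{\sqrt{1 + x^2}}{|x|}\right)$,   x ∈ (−∞, 0) ∪ (0, ∞)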
Example 4.2.2. Show that $\sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right)$.
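A sketch of one standard argument for Example 4.2.2: write $y = \sinh^{-1} x$, use the exponential definition of sinh, and solve the resulting quadratic in $e^y$.

```latex
\begin{align*}
x = \sinh y = \frac{e^{y} - e^{-y}}{2}
  &\;\Longrightarrow\; e^{2y} - 2x\,e^{y} - 1 = 0 \\
  &\;\Longrightarrow\; e^{y} = x \pm \sqrt{x^2 + 1} .
\end{align*}
```

Since $e^y > 0$ and $x - \sqrt{x^2 + 1} < 0$, only the plus sign is possible, so $y = \ln\left(x + \sqrt{x^2 + 1}\right)$.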
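Derivatives of Inverse Hyperbolic Functions
• $\frac{d}{dx}(\sinh^{-1} x) = \frac{1}{\sqrt{1 + x^2}}$
• $\frac{d}{dx}(\cosh^{-1} x) = \frac{1}{\sqrt{x^2 - 1}}$
• $\frac{d}{dx}(\tanh^{-1} x) = \frac{1}{1 - x^2}$
• $\frac{d}{dx}(\coth^{-1} x) = \frac{1}{1 - x^2}$
• $\frac{d}{dx}(\operatorname{sech}^{-1} x) = -\frac{1}{x\sqrt{1 - x^2}}$
• $\frac{d}{dx}(\operatorname{csch}^{-1} x) = -\frac{1}{|x|\sqrt{x^2 + 1}}$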
Example 4.2.3. Show that $\frac{d}{dx}(\sinh^{-1} x) = \frac{1}{\sqrt{1 + x^2}}$.
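One possible route for Example 4.2.3 is to differentiate the relation $\sinh y = x$ with respect to x and then use the identity $\cosh^2 y - \sinh^2 y = 1$:

```latex
\begin{align*}
y = \sinh^{-1} x
  &\;\Longrightarrow\; \sinh y = x
   \;\Longrightarrow\; \cosh y \cdot \frac{dy}{dx} = 1, \\
\frac{dy}{dx} &= \frac{1}{\cosh y}
   = \frac{1}{\sqrt{1 + \sinh^2 y}}
   = \frac{1}{\sqrt{1 + x^2}} .
\end{align*}
```

The positive square root is taken because $\cosh y > 0$ for every y.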
4.3 Higher order derivatives
The derivative $f'$ of a function f is itself a function. As such, we may consider
differentiating the function $f'$. The derivative of $f'$, if it exists, is denoted
by $f''$ and is called the second derivative of f. Continuing in this fashion, we
are led to the third, fourth, fifth, and higher-order derivatives of f, whenever
they exist. Notations for the first, second, third, and in general, the n-th
derivative of f are
$$f', \; f'', \; f''', \; \ldots, \; f^{(n)}$$
or
$$\frac{d}{dx}[f(x)], \; \frac{d^2 f}{dx^2}, \; \frac{d^3 f}{dx^3}, \; \ldots, \; \frac{d^n f}{dx^n}$$
or
$$D_x f, \; D_x^2 f, \; D_x^3 f, \; \ldots, \; D_x^n f.$$
Example 4.3.1. Find the derivatives of all orders of $f(x) = x^4 - 3x^3 + x^2 - 2x + 8$.
Example 4.3.2. Find the third derivative of $y = \frac{1}{x}$.
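For a polynomial such as the one in Example 4.3.1, repeated differentiation can be mimicked by applying the power rule to a list of coefficients. The following is a minimal sketch; the coefficient-list representation and the function name `differentiate` are choices made here for illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...] (lowest degree first).

    Each term a_k x^k becomes k * a_k x^(k-1); the constant term drops out.
    An empty list stands for the zero polynomial.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]


# f(x) = x^4 - 3x^3 + x^2 - 2x + 8, stored lowest degree first.
f = [8, -2, 1, -3, 1]

derivatives = []
current = f
while current:
    current = differentiate(current)
    derivatives.append(current)

for order, coeffs in enumerate(derivatives, start=1):
    print(f"derivative of order {order}: {coeffs}")
```

Reading the lists lowest degree first gives $f'(x) = 4x^3 - 9x^2 + 2x - 2$, $f''(x) = 12x^2 - 18x + 2$, $f'''(x) = 24x - 18$, $f^{(4)}(x) = 24$, and zero from the fifth derivative onward.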
4.4 Implicit differentiation
Suppose y = f(x) is a differentiable function of x; we have seen that $\frac{dy}{dx} = f'(x)$.
That was the case in which y is written explicitly in terms of x. On the other hand,
assume y is a differentiable function of x defined by an equation F(x, y) = 0
in which y is not expressed explicitly, for instance $x^3 + y^3 = 2xy$. The process
of differentiating such functions, without first having to write y in terms
of x, is called implicit differentiation.
Guidelines for Implicit Differentiation
1. Differentiate both sides of the equation with respect to x.
2. Collect all terms involving $\frac{dy}{dx}$ on the left side of the equation and move
all other terms to the right side of the equation.
3. Factor $\frac{dy}{dx}$ out of the left side of the equation.
4. Solve for $\frac{dy}{dx}$.
Example 4.4.1. Use implicit differentiation to find $\frac{dy}{dx}$ if $5y^2 + \sin y = x^2$.
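Following the four guidelines above, one possible worked solution of Example 4.4.1 is:

```latex
\begin{align*}
\frac{d}{dx}\left(5y^2 + \sin y\right) &= \frac{d}{dx}\left(x^2\right) \\
10y\,\frac{dy}{dx} + \cos y\,\frac{dy}{dx} &= 2x \\
\left(10y + \cos y\right)\frac{dy}{dx} &= 2x \\
\frac{dy}{dx} &= \frac{2x}{10y + \cos y}, \qquad 10y + \cos y \neq 0 .
\end{align*}
```

The Chain Rule is what produces the factor $\frac{dy}{dx}$ on each term that involves y.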
Exercise 4.4.1. (a) Use implicit differentiation to find $\frac{dy}{dx}$ for the Folium of
Descartes, $x^3 + y^3 = 3xy$.
(b) Find an equation for the tangent line to the Folium of Descartes at the
point $\left(\frac{3}{2}, \frac{3}{2}\right)$.
(c) At what points is the tangent line to the Folium of Descartes horizontal?
4.5 Application of derivatives
4.5.1 Extrema of a function
Definition 4.5.1 (Extrema of a Function). A function f has an absolute
maximum at c if f(c) ≥ f(x) for all x in the domain D of f. The number f(c) is called
the maximum value of f on D. Similarly, f has an absolute minimum at c
if f(c) ≤ f(x) for all x in D. The number f(c) is called the minimum value of f on D.
The absolute maximum and absolute minimum values of f on D are called
the extreme values, or extrema, of f on D.
Definition 4.5.2 (Relative Extrema of a Function). A function f has a rel-
ative (or local) maximum at c if f(c) ≥ f(x) for all values of x in some open
interval containing c . Similarly, f has a relative (or local) minimum at c if
f(c) ≤ f(x) for all values of x in some open interval containing c.
Theorem 4.5.1 (Fermat’s Theorem). If f has a relative extremum at c, then
either $f'(c) = 0$ or $f'(c)$ does not exist.
Definition 4.5.3 (Critical Number of f). A critical number of a function f
is any number c in the domain of f at which $f'(c) = 0$ or $f'(c)$ does not exist.
Finding the Extreme Values of a Continuous Function on a Closed Interval
Theorem 4.5.2 (The Extreme Value Theorem). If f is continuous on a
closed interval [a, b], then f attains an absolute maximum value f(c) for some
number c in [a, b] and an absolute minimum value f(d) for some number d in
[a, b].
Guidelines for Finding the Extrema of a Continuous Function f on [a, b]
1. Find the critical numbers of f that lie in (a, b).
2. Compute the value of f at each of these critical numbers, and also com-
pute f(a) and f(b).
3. The absolute maximum value of f and the absolute minimum value of f
are precisely the largest and the smallest numbers found in Step 2.
Example 4.5.1. Find the extreme values of the function $f(x) = 3x^4 - 4x^3 - 8$
on [−1, 2].
Example 4.5.2. Find the extreme values of the function $f(x) = 2\cos x - x$
on [0, 2π].
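The three-step guideline for a closed interval can be traced on Example 4.5.1. In the sketch below the critical numbers are found by hand from $f'(x) = 12x^3 - 12x^2 = 12x^2(x - 1)$ and then supplied to the code; the variable names are illustrative.

```python
def f(x):
    return 3 * x**4 - 4 * x**3 - 8


# f'(x) = 12x^3 - 12x^2 = 12x^2 (x - 1), so the critical numbers are 0 and 1.
a, b = -1, 2
critical_numbers = [0, 1]

# Steps 1-2: keep the critical numbers inside (a, b) and add the endpoints.
candidates = [c for c in critical_numbers if a < c < b] + [a, b]
values = {x: f(x) for x in candidates}

# Step 3: the largest and smallest values are the absolute extrema.
x_max = max(values, key=values.get)
x_min = min(values, key=values.get)
print(f"absolute maximum f({x_max}) = {values[x_max]}")
print(f"absolute minimum f({x_min}) = {values[x_min]}")
```

The candidate values are f(−1) = −1, f(0) = −8, f(1) = −9 and f(2) = 8, so the absolute maximum is 8 at x = 2 and the absolute minimum is −9 at x = 1.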
4.5.2 Mean value theorem
Theorem 4.5.3 (ROLLE’S THEOREM). Let f be a function that satisfies
the following three hypotheses:
1. f is continuous on the closed interval [a, b].
2. f is differentiable on the open interval (a, b).
3. f(a) = f(b).
Then there is a number c in (a, b) such that $f'(c) = 0$.
Example 4.5.3. Let $f(x) = x^3 - x$ for x in [−1, 1].
• Show that f satisfies the hypotheses of Rolle’s Theorem on [−1, 1].
• Find the number(s) c in (−1, 1) such that $f'(c) = 0$ as guaranteed by
Rolle’s Theorem.
Theorem 4.5.4 (THE MEAN VALUE THEOREM). Let f be a function
that satisfies the following hypotheses:
1. f is continuous on the closed interval [a, b].
2. f is differentiable on the open interval (a, b).
Then there is a number c in (a, b) such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}$$
or, equivalently,
$$f(b) - f(a) = f'(c)[b - a].$$
Example 4.5.4. Let $f(x) = x^3$.
a. Show that f satisfies the hypotheses of the Mean Value Theorem on
[−1, 1].
b. Find the number(s) c in (−1, 1) that satisfy $f'(c) = \frac{f(b) - f(a)}{b - a}$, as guaranteed
by the Mean Value Theorem.
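A possible worked solution of Example 4.5.4, taking a = −1 and b = 1. Since f is a polynomial, it is continuous on [−1, 1] and differentiable on (−1, 1), so both hypotheses hold. For part b:

```latex
\begin{align*}
\frac{f(1) - f(-1)}{1 - (-1)} &= \frac{1 - (-1)}{2} = 1, \\
f'(c) = 3c^2 = 1 &\;\Longrightarrow\; c = \pm\frac{1}{\sqrt{3}} .
\end{align*}
```

Both values lie in (−1, 1), so in this example the theorem is satisfied by two numbers c.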
4.6 First and second derivative tests
Definition 4.6.1 (Increasing and Decreasing Functions). A function f is
increasing on an interval I if, for every pair of numbers $x_1$ and $x_2$ in I,
$x_1 < x_2$ implies that $f(x_1) < f(x_2)$;
f is decreasing on I if, for every pair of numbers $x_1$ and $x_2$ in I,
$x_1 < x_2$ implies that $f(x_1) > f(x_2)$;
f is monotonic on I if it is either increasing or decreasing on I.
Theorem 4.6.1. Suppose f is differentiable on an open interval (a, b).
a. If $f'(x) > 0$ for all x in (a, b), then f is increasing on (a, b).
b. If $f'(x) < 0$ for all x in (a, b), then f is decreasing on (a, b).
c. If $f'(x) = 0$ for all x in (a, b), then f is constant on (a, b).
Determining the Intervals Where a Function Is Increasing or Decreasing
1. Find all the values of x for which $f'(x) = 0$ or $f'(x)$ does not exist. Use
these values of x to partition the domain of f into open intervals.
2. Select a test number c in each interval I found in Step 1, and determine
the sign of $f'(c)$ in that interval.
a. If $f'(c) > 0$, then f is increasing on that interval.
b. If $f'(c) < 0$, then f is decreasing on that interval.
c. If $f'(x) = 0$ for all x in that interval, then f is constant on that interval.
Example 4.6.1. Determine the intervals where the function $f(x) = x^3 - 3x^2 + 2$
is increasing and where it is decreasing.
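The two-step procedure above can be traced on Example 4.6.1. In this sketch the derivative $f'(x) = 3x^2 - 6x = 3x(x - 2)$ is computed by hand, and the test numbers −1, 1 and 3 are arbitrary choices, one from each interval.

```python
def f_prime(x):
    # Derivative of f(x) = x^3 - 3x^2 + 2.
    return 3 * x**2 - 6 * x


# f'(x) = 3x(x - 2) vanishes at 0 and 2, which split the real line into
# three open intervals. The test numbers below are arbitrary choices.
intervals = {
    "(-inf, 0)": -1.0,
    "(0, 2)": 1.0,
    "(2, inf)": 3.0,
}

behavior = {}
for name, c in intervals.items():
    slope = f_prime(c)
    behavior[name] = "increasing" if slope > 0 else "decreasing"
    print(f"{name}: f'({c}) = {slope} -> {behavior[name]}")
```

The signs indicate that f is increasing on (−∞, 0) and (2, ∞) and decreasing on (0, 2).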
We will now see how the derivative of a function can be used to help us find
the relative extrema of f.
Theorem 4.6.2 (The First Derivative Test). Let c be a critical number of a
continuous function f in the interval (a, b), and suppose that f is differentiable
at every number in (a, b) with the possible exception of c itself.
a. If $f'(x) > 0$ on (a, c) and $f'(x) < 0$ on (c, b), then f has a relative
maximum at c.
b. If $f'(x) < 0$ on (a, c) and $f'(x) > 0$ on (c, b), then f has a relative minimum
at c.
c. If $f'(x)$ has the same sign on (a, c) and (c, b), then f does not have a
relative extremum at c.
Theorem 4.6.3 (The Second Derivative Test). Suppose that f has a continuous
second derivative on an interval (a, b) containing a critical number c of
f.
a. If $f''(c) < 0$, then f has a relative maximum at c.
b. If $f''(c) > 0$, then f has a relative minimum at c.
c. If $f''(c) = 0$, then the test is inconclusive.
Example 4.6.2. Find the relative extrema of $f(x) = x^3 - 3x^2 - 24x + 32$
using the Second Derivative Test.
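One possible worked solution of Example 4.6.2: find the critical numbers from $f'$, then test the sign of $f''$ at each of them.

```latex
\begin{align*}
f'(x) &= 3x^2 - 6x - 24 = 3(x - 4)(x + 2) = 0
  \;\Longrightarrow\; x = -2 \text{ or } x = 4, \\
f''(x) &= 6x - 6, \\
f''(-2) &= -18 < 0 \;\Longrightarrow\; \text{relative maximum } f(-2) = 60, \\
f''(4) &= 18 > 0 \;\Longrightarrow\; \text{relative minimum } f(4) = -48 .
\end{align*}
```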
4.6.1 Concavity and inflection point
Definition 4.6.2 (Concavity of the Graph of a Function). Suppose f is
differentiable on an open interval I. Then
a. the graph of f is concave upward on I if $f'$ is increasing on I.
b. the graph of f is concave downward on I if $f'$ is decreasing on I.
Theorem 4.6.4. Suppose f has a second derivative on an open interval I.
a. If $f''(x) > 0$ for all x in I, then the graph of f is concave upward on I.
b. If $f''(x) < 0$ for all x in I, then the graph of f is concave downward on
I.
Determining the Intervals of Concavity of a Function
1. Find all values of x for which $f''(x) = 0$ or $f''(x)$ does not exist. Use
these values of x to partition the domain of f into open intervals.
2. Select a test number c in each interval found in Step 1 and determine the
sign of $f''(c)$ in that interval.
a. If $f''(c) > 0$, the graph of f is concave upward on that interval.
b. If $f''(c) < 0$, the graph of f is concave downward on that interval.
Definition 4.6.3. A point P on a curve y = f(x) is called an inflection point
if f is continuous there and the curve changes from concave upward to concave
downward or from concave downward to concave upward at P.
Finding Inflection Points
1. Find all numbers c in the domain of f for which $f''(c) = 0$ or $f''(c)$ does
not exist. These numbers give rise to candidates for inflection points.
2. Determine the sign of $f''(x)$ to the left and to the right of each number c
found in Step 1. If the sign of $f''(x)$ changes, then the point P(c, f(c)) is
an inflection point of f, provided that the graph of f has a tangent line
at P.
Example 4.6.3. Find the points of inflection of $f(x) = x^4 - 4x^3 + 12$.
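The two steps for finding inflection points can be traced on Example 4.6.3. In this sketch the second derivative $f''(x) = 12x^2 - 24x = 12x(x - 2)$ is computed by hand; the offset `delta` used to sample each side of a candidate is a hypothetical choice, small enough not to reach the neighboring candidate.

```python
def f(x):
    return x**4 - 4 * x**3 + 12


def f_second(x):
    # Second derivative of f: 12x^2 - 24x = 12x(x - 2).
    return 12 * x**2 - 24 * x


candidates = [0, 2]   # solutions of f''(x) = 0
delta = 0.5           # hypothetical offset used to sample each side

inflection_points = []
for c in candidates:
    left, right = f_second(c - delta), f_second(c + delta)
    if left * right < 0:          # the sign of f'' changes across c
        inflection_points.append((c, f(c)))

print(inflection_points)
```

The sign of $f''$ changes at both candidates, and f is a polynomial, so its graph has a tangent line at each point; the inflection points are (0, 12) and (2, −4).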
4.6.2 Curve sketching
We have seen on many occasions how the graph of a function can help us to
visualize the properties of the function. From a practical point of view, the
graph of a function also gives, at one glance, a complete summary of all the
information captured by the function.
Guidelines for Curve Sketching
1. Find the domain of f.
2. Find the x- and y-intercepts of f. The y-intercept is f(0) and this tells
us where the curve intersects the y-axis. To find the x-intercepts, we set
y = 0 and solve for x. (You can omit this step if the equation is difficult
to solve.)
3. Determine whether the graph of f is symmetric with respect to the y-axis
or the origin.
i) If f(x) = f(−x) for all x in D, that is, the equation of the curve is
unchanged when x is replaced by −x, then f is an even function and
the curve is symmetric about the y-axis. This means that our work
is cut in half. If we know what the curve looks like for x ≥ 0, then
we need only reflect about the y-axis to obtain the complete curve.
ii) If f(x) = −f(−x) for all x in D, then f is an odd function and
the curve is symmetric about the origin. Again we can obtain the
complete curve if we know what it looks like for x ≥ 0. Rotate 180°
about the origin.
iii) If f(x + p) = f(x) for all x in D, where p is a positive constant,
then f is called a periodic function and the smallest such number p
is called the period.
4. Determine the behavior of f for large absolute values of x.