University of KwaZulu-Natal
Pietermaritzburg
© C. Zaverdinos, 2010.
All rights reserved. No part of this book may be reproduced, in any form or by any means,
without permission in writing from the author.
Preface
This course has grown out of lectures going back well over 30 years. Names which come to
mind are J.Nevin, R.Watson, G.Parish and J. Raftery. I have been involved with teaching
linear algebra to engineers since the early 90s and some of my ideas have been incorporated in
the course. What is new in these notes is mainly my approach to the theoretical side of the
subject. Several of the numerical examples and exercises come from earlier notes, as do some theoretical ones.
I would like to thank Dr. Paddy Ewer for showing me how to use the program LatexCad for
the diagrams and last, but not least, the Reverend George Parish for proof-reading an earlier
version and also for elucidating No.5 of Exercise 2, as well as Professors John and Johann van
den Berg for their encouragement in developing these notes.
Note 1 Some exercises (particularly those marked with an asterisk *) are harder and, at the
discretion of the instructor, can be omitted or postponed to a later stage.
Bibliography
The following books can be consulted but they should be used with caution since different
authors have a variety of starting points and use different notation. Many books also have a
tendency to become too abstract too early. Unless you are a mature reader, books can confuse
rather than help you.
5. D. Lay: Linear Algebra and its Applications (3rd Edition, Pearson). This book
treats matrix multiplication in much the same way as we do, but its treatment of geometric
aspects is less thorough.
It has over 560 pages, becomes abstract and advanced after p.230, but will probably be
useful in later years of study.
6. H. Anton: Elementary Linear Algebra (6th Edition, John Wiley and Sons).
7. E.M. Landesman & M.R. Hestenes: Linear Algebra for Mathematics, Science and
Engineering (Prentice-Hall). This is quite advanced.
I recommend especially (12) for Chapter 1 and (4) and (5) for later chapters of the notes.
Because of its elementary nature, (2) is good for Chapter 2.
Some of the above books have been used on the Durban campus and, if consulted with care,
can be helpful.
C. Zaverdinos
Pietermaritzburg, January 24, 2011
Chapter 1
Two and Three-Dimensional Analytic Geometry. Vectors
From school you are already familiar with the Cartesian plane. The x−axis and y−axis
meet at right-angles at the origin O. Every point A in the x − y plane is uniquely represented
by an ordered pair (a, b) of real numbers, where a is the x−coordinate and b is the y−coordinate
as in Figure 1.1. We write A = (a, b). Here M is the foot of the perpendicular from A to the
x−axis and likewise N is the foot of the perpendicular from A to the y−axis.
Notice that M = (a, 0) and N = (0, b).
[Figure 1.1: the point A = (a, b), with M = (a, 0) the foot of the perpendicular to the x−axis and N = (0, b) the foot of the perpendicular to the y−axis.]
In order to represent a point A in space we add the z−axis which is perpendicular to both the
x− and y−axes as in Figure 1.2. Here A = (a, b, c) is a typical point and a is the x−coordinate,
b is the y−coordinate and c is the z−coordinate of the point A. In the diagram P is the foot of the
perpendicular from A to the y − z plane. Similarly, Q and R are the feet of the perpendiculars
from A to the z − x and x − y planes respectively.
[Figure 1.2: the point A = (a, b, c) with the x−, y− and z−axes and the feet P, Q, R of the perpendiculars from A to the y − z, z − x and x − y planes.]
[Diagram: the cyclic order x → y → z → x of the axes in a right-hand system.]
Have a look at yourself in the mirror while opening a bottle of wine and observe that what
is clockwise for you is anticlockwise for the person in the mirror. Thus a right-hand system of
axes observed in a mirror does not follow the corkscrew rule and is called a left-hand system.
The usual convention is to use only right-hand systems.
AB = (b1 − a1, b2 − a2, b3 − a3) (1.1)
The magnitude of this displacement is |AB| = √((b1 − a1)² + (b2 − a2)² + (b3 − a3)²). To see this, see Figure 1.2 and use the theorem of Pythagoras.
Because the displacement AB has both direction and magnitude it is called a vector and
pictured as an arrow going from A to B. The tail of the arrow is at A and its head is at B
as in Figure 1.3.
As an example, let A = (−1, 2.3, −4.5) m and B = (2.1, 3.6, −2.5) m. Then
AB = (2.1 − (−1) , 3.6 − 2.3, −2.5 − (−4.5)) = (3.1, 1.3, 2.0) m.
Two displacement vectors AB and CD are equal if they are equal as displacements. Hence
if C = (−3.5, 1.1, 3.3) and D = (−.4, 2.4, 5.3) and A, B are as above then
CD = (−.4, 2.4, 5.3) − (−3.5, 1.1, 3.3) = (3.1, 1.3, 2.0) = AB.
The magnitude of this vector is |AB| = |CD| = √((3.1)² + (1.3)² + (2.0)²) = 3.9115 m.
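Arithmetic like this is easy to verify mechanically. The sketch below (Python, used here purely as an illustration and not part of the original notes) recomputes AB, CD and the common magnitude:

```python
import math

def displacement(A, B):
    """Displacement vector AB = B - A, componentwise."""
    return tuple(b - a for a, b in zip(A, B))

def magnitude(v):
    """Length |v| = sqrt(v . v)."""
    return math.sqrt(sum(x * x for x in v))

A, B = (-1, 2.3, -4.5), (2.1, 3.6, -2.5)
C, D = (-3.5, 1.1, 3.3), (-0.4, 2.4, 5.3)

AB = displacement(A, B)   # (3.1, 1.3, 2.0)
CD = displacement(C, D)   # the same displacement, so AB = CD as vectors
mag = magnitude(AB)       # approximately 3.9115 m
```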
Various vectors (displacement, velocity, acceleration, force) play an important role in mechan-
ics and have their own units. For example, if positions are measured in meters and time is
measured in seconds, a velocity vector will have units ms−1 . So if a particle moves in such a
way that each second it displaces (−1, 2, 3) meters, we say its velocity is constantly (−1, 2, 3)
ms−1 .
Depending on the geometric interpretation, the tail (head) of displacement vectors may be
at any point. Vectors are often written as u = (u1 , u2 , u3 ), v = (v1 , v2 , v3 ), a = (a1 , a2 , a3 ),
b = (b1 , b2 , b3 ) etc.
We now have three ways of expressing the position of a point A in space. If A has
coordinates a = (a1 , a2 , a3 ), then
A = (a1 , a2 , a3 ) = a = OA (1.3)
Although equation (1.3) identifies a point with its position, geometrically speaking we like
to distinguish the point A itself from its position vector OA. When we say “a is the position
[Figure 1.3: the displacement vector AB drawn from A to B, with the coordinates a1, a2, a3 of A and b1, b2, b3 of B marked on the axes.]
vector of point A” it is understood that the tail of a is at the origin and its head is at the point
A.
Our convention is that we use capital letters A, B, . . . to denote points in space.
Given the point C, there is a unique point D in space such that CD = AB (see Exercise 2,
No.4). There is also a unique point D such that OD = AB and we may write D = AB. In
that case the tail of D can only be O. The notation OD brings out the vectorial nature of the
displacement from O to D.
u − v = u + (−v) = (u1 − v1 , u2 − v2 , u3 − v3 )
1. u + (v + w) = (u + v) + w (associative law)
2. u + v = v + u (commutative law)
3. 0 + u = u (the vector 0 = (0, 0, 0) behaves like the number 0)
4. u + (−u) = 0
AB + BC = AC
To see this, let the position vectors of A, B and C be a, b and c respectively. Then AB = b−a
and BC = c − b. Hence by the properties of addition,
AB + BC = (b − a) + (c − b) = (b − b) + (c − a)
= 0 + c − a = c − a = AC
Figure 1.4 illustrates this geometrical result in which it is understood that the head B of
AB is the tail of BC etc.
The point D is chosen so that AD = BC. It follows that DC = AB (why?). We have a
geometric interpretation of the above commutative law (known as the parallelogram law for
addition of vectors):
BC + AB = AD + DC = AC = AB + BC
[Figure 1.4: AC = AB + BC. The point D is chosen so that AD = BC and AB = DC.]
For a scalar α and a vector u = (u1, u2, u3) we define αu = (αu1, αu2, αu3).
The vector αu is said to be a (scalar) multiple of u. We usually omit the qualification ‘scalar’.
1. α (u + v) = αu + αv
2. (α + β) u = αu + βu
3. α (βu) = (αβ) u
4. 1u = u
5. αu = 0 if, and only if, α = 0 or u = 0
6. |αu| = |α| |u| (the length of αu is |α| times the length of u)
The proofs of the first four are left to you. We prove the last:
1. u ∥ u
2. u ∥ v implies v ∥ u
These properties are geometrically evident, while analytic proofs are left as an exercise. For example, to see property (2) analytically, let u = αv. As α ≠ 0, v = (1/α)u, so v ∥ u, as expected.
Example 1.1.1 Let u = AB, where A is the point (−1, 7, −4) and B is the point (−4, 11, 1),
so that u = (−3, 4, 5). Let v = CD, where C = (2, 9, −11) and D = (−7, 21, 4), so CD =
(−9, 12, 15) = 3u. Hence u and v have the same direction while u and −v have opposite
directions but are parallel.
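A quick check of Example 1.1.1, again as an illustrative Python sketch (not part of the original notes):

```python
def displacement(A, B):
    """Displacement vector AB = B - A, componentwise."""
    return tuple(b - a for a, b in zip(A, B))

u = displacement((-1, 7, -4), (-4, 11, 1))    # (-3, 4, 5)
v = displacement((2, 9, -11), (-7, 21, 4))    # (-9, 12, 15)

# v = 3u, so u and v are parallel with the same direction,
# while u and -v are parallel with opposite directions.
assert v == tuple(3 * x for x in u)
```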
1. u · v = v · u
2. α (u · v) = (αu) · v = u · (αv)
3. (αu + βv) · w = α (u · w) + β (v · w)
4. u · u = |u|², so |u| = +√(u · u)
5. u · u ≥ 0, and u · u = 0 if, and only if, u = 0.
The proofs of these properties are left as exercises (Exercise 2 No.8).
To see this result, see Figure 1.5 and use AB + BC = AC, so that BC = AC − AB = v − u
and
|BC|² = (v − u) · (v − u)
= v · v + (−u) · (−u) − v · u − u · v
= |v|² + |u|² − 2u · v
Then by the cosine rule (which applies whether θ is acute or obtuse),
|BC|² = |AB|² + |AC|² − 2 |AB| |AC| cos θ
|v|² + |u|² − 2u · v = |u|² + |v|² − 2 |u| |v| cos θ (from the previous equation)
Cancelling |v|² + |u|² from both sides leads to the required result (1.5).
[Figure 1.5: the triangle with u = AB, v = AC and the angle θ at A, drawn for θ acute and for θ obtuse.]
Example 1.1.2 Let A = (1, 0, 1), B = (2, 0, 3), C = (4, √10, 2) and D = (2, 2, −2). Find the angle θ between AB and AC. Decide if the angle φ between AB and AD is acute or obtuse.
Solution: AB = (2, 0, 3) − (1, 0, 1) = (1, 0, 2) and AC = (4, √10, 2) − (1, 0, 1) = (3, √10, 1). Hence
cos θ = (AB · AC) / (|AB| |AC|) = ((1, 0, 2) · (3, √10, 1)) / (√(1² + 0² + 2²) √(3² + (√10)² + 1²)) = 1/2
and θ = 60 degrees or π/3 radians.
Similarly, AD = (1, 2, −3) and
cos φ = (AB · AD) / (|AB| |AD|) = ((1, 0, 2) · (1, 2, −3)) / (√(1² + 0² + 2²) √(1² + 2² + (−3)²)) = −√(5/14)
As cos φ < 0, the angle φ is obtuse.
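The two cosines can be confirmed numerically. An illustrative Python sketch (not part of the original notes):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

A, B, C, D = (1, 0, 1), (2, 0, 3), (4, math.sqrt(10), 2), (2, 2, -2)
AB = tuple(b - a for a, b in zip(A, B))
AC = tuple(c - a for a, c in zip(A, C))
AD = tuple(d - a for a, d in zip(A, D))

cos_theta = dot(AB, AC) / (norm(AB) * norm(AC))   # 1/2, so theta = 60 degrees
cos_phi = dot(AB, AD) / (norm(AB) * norm(AD))     # negative, so phi is obtuse
```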
The first result follows from the fact that AB ⊥ AC if, and only if, cos θ = 0.
The second result is left as an exercise (Exercise 2 No.8).
To see the third result, use (2) and consider
|u + v|² = (u + v) · (u + v)
= u · u + v · v + 2u · v
≤ u · u + v · v + 2 |u| |v|
= |u|² + |v|² + 2 |u| |v|
= (|u| + |v|)²
As |u + v|² ≤ (|u| + |v|)², it follows that |u + v| ≤ |u| + |v|.
Example 1.1.3 Find a non-zero vector perpendicular to both u = (2, 1, −1) and v = (3, 3, −1).
Solution:
Let x = (x1, x2, x3) satisfy x ⊥ u and x ⊥ v. Then
2x1 + x2 − x3 = 0 and 3x1 + 3x2 − x3 = 0
By subtraction −x1 − 2x2 = 0, or x1 = −2x2. Thus from the first equation, x3 = 2x1 + x2 =
−3x2 . We may let x2 be any non-zero number, e.g. x2 = 1. Then
x = (−2, 1, −3)
is perpendicular to u and to v.
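A two-line check that x really is perpendicular to both vectors (illustrative Python, not part of the original notes):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = (2, 1, -1), (3, 3, -1)
x = (-2, 1, -3)

# Both dot products vanish, so x is perpendicular to u and to v.
assert dot(x, u) == 0 and dot(x, v) == 0
```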
We will see later why it is always possible to find a vector x ≠ 0 perpendicular to two non-zero vectors. This fact will also come into our study of planes (see 1.3.4).
(See Figure 1.6). Notice that if π/2 < θ ≤ π then the component (1.6) is negative (as is only natural).
The projection of f on c is the vector
(f · ĉ) ĉ = (f · c / |c|²) c   (1.7)
Figure 1.7 shows the geometric interpretations of the projection of the non-zero vector f = P Q
on c and on −c. The projection P R is the same in both cases: P R is the projection (in the
natural sense) of the vector f on the line parallel to c passing through the tail P of f .
More generally, if d ∥ c, the projection of f on c is the same as the projection of f on d. (See Exercise 2, No.13a.)
[Figure 1.6: the vector f = PQ (f ≠ 0) making angle θ with c = PS; the component of f along c is |f| cos θ = PR, and |f| sin θ is the component perpendicular to c.]
[Figure 1.7: the projections of f = PQ on c = PS (angle θ) and on −c = PT (angle π − θ); the projection PR is the same in both cases.]
The situation with components is different. As noted, in Figure 1.7 the component of f along −c is −|f| cos θ. In general, if d and c are parallel with the same direction then the components of f along c and d are equal. If d and c have opposite directions then the component of f along d is the negative of its component along c.
We return to projections in subsections 1.2.4 and 1.2.5.
You may think of f as a vector representing a force that is dragging a box placed at its tail. Then |f| sin θ is the lifting effect of f while |f| cos θ is the horizontal dragging effect of the force.
2. Let f = (11, −3, 4) and c = (5, −1, −2). The length of c is |c| = √(5² + (−1)² + (−2)²) = √30.
The unit vector ĉ in the direction of c is
ĉ = (1/√30) (5, −1, −2)
a = a1 i + a2 j + a3 k (1.8)
The scalars a1 , a2 and a3 are the components of a in the directions i, j and k respectively. See
also No.17 of Exercise 2 below.
2. Interpret the equation (r − (2, −1, 5)) · (r − (2, −1, 5)) = 49 geometrically.
Solution: If r = (x, y, z) this reads (x − 2)² + (y + 1)² + (z − 5)² = 49, so the equation represents the surface of a sphere centered at (2, −1, 5) and having radius 7.
4. (Generalization of No.3b). Let u be a given vector. Show that if the tail A of u is given
then the head B of u is uniquely determined. State and prove a similar result if the head
of u is given. Draw a picture.
Solution: If the head B of u is given then the tail A of u is uniquely determined.
For, AB = OB − OA = u, so A = OB − u.
5. See Figure 1.8. Imagine that in each case the axes are part of a room in which you are
standing and looking at the corner indicated. Decide which of the systems of mutually
orthogonal axes are left-handed and which right-handed. If you are outside the room,
what would your answer be?
Solution: The first and third are right-handed systems, the other two are left-handed. If
you were looking at the room from the outside, exactly the opposite would apply.
6. Let A0 ,...,A9 be ten points in space and put am = Am−1 Am for m = 1, · · · , 9 and a10 = A9 A0 .
Find the sum a1 + · · · + a10 .
Solution: The answer is the zero vector 0, as the ten-sided vector-figure is closed.
Hint 3 These properties just reflect similar properties of numbers. For example, the
commutative law of addition of vectors is proved by
u+v = (u1 + v1 , u2 + v2 , u3 + v3 )
= (v1 + u1 , v2 + u2 , v3 + u3 )
= v+u
[Figure 1.8: four corner views of mutually orthogonal axes, with the axes variously labelled x, y and z.]
α(u + v) = α(u1 + v1 , u2 + v2 , u3 + v3 )
= (α(u1 + v1 ), α(u2 + v2 ), α(u3 + v3 ))
= (αu1 + αv1 , αu2 + αv2 , αu3 + αv3 )
= αu + αv
(a) The unit vectors â and b̂.
Solution: â = (1/√(9 + 1 + 4)) (3, −1, 2) = (1/√14) (3, −1, 2), b̂ = (1/√(49 + 9 + 1)) (−7, 3, −1) = (1/√59) (−7, 3, −1).
(b) The component of a along b and the component of b along a.
Solution: The component of a along b is a · b̂ = (3, −1, 2) · (1/√59) (−7, 3, −1) = −26/√59.
The component of b along a is b · â = (−7, 3, −1) · (1/√14) (3, −1, 2) = −26/√14.
The projection of b on a is
(b · a / |a|²) a = −(26/14) (3, −1, 2) = −(13/7) (3, −1, 2)
(f · γc / |γc|²) γc = (γ² f · c / γ² |c|²) c = (f · c / |c|²) c
(f · f / |f|²) f = f
14. A man walks from a point A in the N-E direction for 15 km to a point B. Find the
component of the displacement AB in the following directions
(a) East
(b) South
(c) N30◦ E
Solution: Taking the unit vectors i and j due East and due North respectively, AB = (15/√2) (1, 1). Hence the respective answers are
(a) (15/√2) (1, 1) · (1, 0) = (15/2)√2 km,
(b) (15/√2) (1, 1) · (0, −1) = −(15/2)√2 km and
(c) (15/√2) (1, 1) · (1/2, √3/2) = (15/4)√2 (√3 + 1) = 14.489 km.
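The three components can be checked numerically, converting the compass directions to (East, North) unit vectors as in the solution. An illustrative Python sketch (not part of the original notes):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Unit vectors: i due East, j due North, so AB = (15/sqrt(2))(1, 1).
AB = (15 / math.sqrt(2), 15 / math.sqrt(2))

east = dot(AB, (1, 0))                  # component due East: (15/2)*sqrt(2)
south = dot(AB, (0, -1))                # component due South is negative
# N30E is 30 degrees east of north: unit vector (sin 30, cos 30).
n30e = dot(AB, (math.sin(math.radians(30)), math.cos(math.radians(30))))
```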
15. A wind is blowing at 10 km/h in the direction N30◦ W. What is its component in the
direction S15◦ W?
Solution: Using the same unit vectors as before, the velocity of the wind is w = 10 (−1/2, √3/2). A unit vector in the direction S15°W is (cos(−105°), sin(−105°)). The required component is 10 (−1/2, √3/2) · (cos(−105°), sin(−105°)) = −7.0711 km/h.
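A numerical check of the wind component (illustrative Python, not part of the original notes; angles in degrees are converted to radians):

```python
import math

# Wind: 10 km/h toward N30W; in (East, North) components this is
# 10 * (-sin 30, cos 30).
w = (-10 * math.sin(math.radians(30)), 10 * math.cos(math.radians(30)))

# Unit vector toward S15W: 105 degrees clockwise from East, i.e. angle -105.
d = (math.cos(math.radians(-105)), math.sin(math.radians(-105)))

component = w[0] * d[0] + w[1] * d[1]   # about -7.071 km/h
```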
16. Let u and v be two vectors. The set of all linear combinations of u and v is called the
span of the vectors and is denoted by sp (u, v). (This is discussed more fully in item 2 of
subsection 3.1.2 of Chapter 3).
(a) Show that if u = (−1, 2, 3) and v = (2, 4, −5), then sp (u, v) consists of all vectors of
the form (−α + 2β, 2α + 4β, 3α − 5β), where α and β are arbitrary scalars.
(b) How do you think the span of three vectors u, v, w should be defined?
Describe sp ((−2, 3, 1), (5, −4, 2), (−1, 1, −3)).
Hint 4 Take the dot product of both sides of the above equation with each of l, m and n.
The coefficients for the specific example are the components a · l = γ1 etc. A longer
procedure to find the coefficients for the specific example is to use the method of 3d above.
18. What familiar result from Euclidean Geometry does |b − a| ≤ |a| + |b| represent?
Solution: “The sum of two sides of a triangle is larger than the third”. If |b − a| =
|a| + |b|, what is happening? Then (b − a) · (b − a) = (|a| + |b|)2 and a · b = − |a| |b|: The
vertices of the triangle are on a straight line - the triangle is degenerate.
19. Let u and v have the same length: |u| = |v|. Show that (u − v) · (u + v) = 0. In other
words, if u − v and u + v are not zero they are at right angles.
Solution: (u − v) · (u + v) = (u − v) · u + (u − v) · v = u · u − v · u + u · v − v · v =
−v · u + u · v = 0.
20. How are our general results affected (and simplified) when one of the coordinates is re-
stricted to be zero for all vectors under consideration?
[Figure 1.9: the point R on the line ℓ through A in the direction AB, with r = OR = OA + tAB.]
As r = r (t) represents a general point on the line ℓ, the equation (1.9) is called a generic or
parametric equation of the line through A and B. The variable t is called the parameter.
r = r(t) = a + t (b − a) = (1 − t) a + tb
The point r(t), where 0 ≤ t ≤ 1, lies between A and B, and the set AB of these points is called the line segment joining A and B.
If A ≠ B and 0 < t < 1, the point r(t) is said to lie strictly between A and B.
The line segment AB is a set of points and it must be distinguished from the vector AB
and also from the distance |AB|.
4. Let u ≠ 0 define a direction and suppose a is the position vector of the point A. Then a
generic equation of the line ℓ through A with direction u is
r (t) = a + tu (1.10)
A typical point a + tu on the line is called a generic point. As t varies over the real
numbers in the generic equation (1.10) we obtain a set of points which we identify with
ℓ and (1.10) is its analytic definition.
5. A generic equation is not unique.
(a) If C ≠ D are two points on the line given by equation (1.9), then since these points
also determine the line, another generic equation (now with parameter s) is
r (s) = OC + sCD
The parameters t and s are of course related. See Example 1.2.1 of subsection 1.2.3
and No.1b of Exercise 5.
(b) When do two generic equations like (1.10) define the same line? Let u and v be
non-zero vectors and suppose that a + tu is a generic equation of line ℓ1 and b + sv
a generic equation of line ℓ2 .
Then ℓ1 = ℓ2 if, and only if, u ∥ v and a − b is a multiple of u. (See No.4 of Exercise 5).
Example 1.2.3 Find the foot Q of the perpendicular from the point P = (2, −1, 4) to the line
ℓ of Example 1.2.1. Hence find the shortest distance between P and ℓ.
Solution:
Let Q = (1 − 2t, 2 − t, 3 + t) be the required foot of the perpendicular from P = (2, −1, 4) to the line ℓ. Then PQ = (−1 − 2t, 3 − t, −1 + t) must be perpendicular to the direction (−2, −1, 1) of ℓ:
(−1 − 2t)(−2) + (3 − t)(−1) + (−1 + t)(1) = 6t − 2 = 0
so t = 1/3, Q = (1/3, 5/3, 10/3) and the shortest distance is |PQ| = |(−5/3, 8/3, −2/3)| = √(31/3).
[Figure 1.10: the point P = (2, −1, 4) and the generic point r = (1 − 2t, 2 − t, 3 + t) on the line ℓ.]
[Figure 1.11: the line ℓ through A with direction u; AP = p − a, and Q is the foot of the perpendicular from P with AQ = q − a.]
There is a better way to visualize the position vector q of the foot Q of the perpendicular from P to the line ℓ with generic equation r = a + tu. From Figure 1.11 we see that AQ = q − a is the projection of AP = p − a on the vector u. Using equation (1.7) with f = p − a and c = u we obtain
q = a + (1/|u|²) ((p − a) · u) u   (1.11)
|u|
Note that (as in example 1.2.3) the shortest distance from P to ℓ is |P Q|.
Example 1.2.4 Let us apply the formula (1.11) to the above example 1.2.3.
Solution:
For the line (1, 2, 3) + t (−2, −1, 1) we have |u|² = 6 and the foot of the perpendicular from P = (2, −1, 4) to the line has position vector
q = (1, 2, 3) + (1/6) (((2, −1, 4) − (1, 2, 3)) · (−2, −1, 1)) (−2, −1, 1)
= (1/3, 5/3, 10/3)
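Equation (1.11) translates directly into a small function; the sketch below (illustrative Python, not part of the original notes) reproduces the answer of this example:

```python
def foot_of_perpendicular(a, u, p):
    """Equation (1.11): q = a + ((p - a) . u / |u|^2) u."""
    pa = tuple(pi - ai for ai, pi in zip(a, p))
    t = sum(x * y for x, y in zip(pa, u)) / sum(x * x for x in u)
    return tuple(ai + t * ui for ai, ui in zip(a, u))

# Example 1.2.4: line (1, 2, 3) + t(-2, -1, 1), point P = (2, -1, 4).
q = foot_of_perpendicular((1, 2, 3), (-2, -1, 1), (2, -1, 4))
# q = (1/3, 5/3, 10/3)
```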
Example 1.2.5 Find the foot Q of the perpendicular from a general point P = (p1 , p2 , p3 ) to
the line r = a + tu if a = (1, 2, 3) and u = (−2, −1, 1).
Solution:
Using equation (1.11), effectively equation (1.7), we obtain (p − a) · (−2, −1, 1) = −2p1 − p2 + p3 + 1, and so
q = (2/3 + (2/3)p1 + (1/3)p2 − (1/3)p3, 11/6 + (1/3)p1 + (1/6)p2 − (1/6)p3, 19/6 − (1/3)p1 − (1/6)p2 + (1/6)p3)
Example 1.2.6 Find the projection of P = (p1 , p2 , p3 ) on the line t (−2, −1, 1).
Solution:
We can find the required projection by letting a = 0 in example 1.2.5 or directly from (1.12):
Example 1.2.7 A straight line lies in the x − y plane and passes through the origin. The
line makes an angle of 30◦ with the positive x−axis. Find the foot Q of the projection from
P = (p1 , p2 ) on the line.
Solution:
Let u = (cos(π/6), sin(π/6)) = (√3/2, 1/2). The parametric equation of the line is r = t (√3/2, 1/2) and the foot Q of the perpendicular from P to the line is the projection of p = (p1, p2) on the unit vector u:
q = ((√3 p1 + p2)/2) (√3/2, 1/2) = ((3/4) p1 + (√3/4) p2, (√3/4) p1 + (1/4) p2)   (1.14)
We shall return to this example in subsection 1.4 below and again in Chapter 3 Example 3.2.1.
Example 1.2.8 Show that the diagonals of a parallelogram bisect each other.
Solution:
In Figure 1.12 ABCD is a parallelogram with AB = DC and BC = AD. It is understood that
the parallelogram is non-degenerate in that no three of A, B, C, D are collinear.
[Figure 1.12: the parallelogram ABCD with AB = DC and AD = BC; the diagonals meet at X with AX = XC.]
[Figure 1.13: the skew lines ℓ1 and ℓ2 and their common perpendicular PQ.]
A familiar example of skew lines is provided by a road passing under a railway bridge. A
train and car can only crash at a level crossing!
In the case of skew lines there will be points P on ℓ1 and Q on ℓ2 such that PQ is perpendicular to the directions of both lines and the distance |PQ| is minimum. See Figure 1.13.
Example 1.2.9 Let ℓ1 be the line through A = (0, 0, 0) and B = (1, 2, −1) and suppose ℓ2 is
the line through C = (1, 1, 1) and D = (2, 4, 3). Are the lines skew? In any case, find the
shortest distance between them.
Solution: Generic equations for ℓ1 and ℓ2 are t(1, 2, −1) and (1, 1, 1) + s(1, 3, 2) respectively.
Since (1, 2, −1) ∦ (1, 3, 2), the lines are not parallel (and so cannot possibly be identical). They will meet if for some t and s
t(1, 2, −1) = (1, 1, 1) + s(1, 3, 2)
This means that the three equations t = 1 + s, 2t = 1 + 3s and −t = 1 + 2s must hold. From the first and last equations, 1 + s = −(1 + 2s) and thus s = −2/3, t = 1/3. But then 2t = 2/3 ≠ 1 + 3(−2/3) = −1 and the second equation fails. It follows that ℓ1 and ℓ2 are skew.
Let P = t(1, 2, −1) and Q = (1, 1, 1) + s(1, 3, 2) be the points where |PQ| is minimum. Then PQ = (1, 1, 1) + s(1, 3, 2) − t(1, 2, −1) is perpendicular to both directions (1, 2, −1) and (1, 3, 2). Hence 2 + 5s − 6t = 0 and 6 + 14s − 5t = 0, so s = −26/59 and t = −2/59, giving PQ = (1/59)(35, −15, 5) and shortest distance |PQ| = 5/√59.
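The two perpendicularity conditions form a 2 × 2 linear system in s and t. The sketch below (illustrative Python, not part of the original notes) solves it by Cramer's rule and computes the shortest distance between the skew lines:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = (1, 2, -1), (1, 3, 2)   # directions of l1 and l2
c = (1, 1, 1)                  # Q - P = c + s*v - t*u

# Perpendicularity to u and to v gives the linear system
#   dot(v,u)*s - dot(u,u)*t = -dot(c,u)
#   dot(v,v)*s - dot(u,v)*t = -dot(c,v)
a11, a12, b1 = dot(v, u), -dot(u, u), -dot(c, u)
a21, a22, b2 = dot(v, v), -dot(u, v), -dot(c, v)
det = a11 * a22 - a12 * a21
s = (b1 * a22 - b2 * a12) / det
t = (a11 * b2 - a21 * b1) / det

PQ = tuple(ci + s * vi - t * ui for ci, vi, ui in zip(c, v, u))
dist = math.sqrt(dot(PQ, PQ))   # shortest distance between the skew lines
```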
1. Let A = (1, 2, −3), B = (−1, 1, 2), C = (5, 4, −13), D = (−5, −1, 12) and E = (−1, −2, 3).
Answer the following questions.
(a) Find a generic equation with parameter t of the straight line ℓ through A and B in
the form of equation (1.9).
Solution: Generic equation is (1, 2, −3) + t ((−1, 1, 2) − (1, 2, −3)) = (1 − 2t, 2 − t, −3 + 5t).
(Direction with this parametrization is (−2, −1, 5)).
(b) Is E on ℓ? Show that C and D are on ℓ and find a corresponding generic equation
of the line with parameter s. Find the relationship between s and t.
Solution: AC = (5, 4, −13) − (1, 2, −3) = (4, 2, −10) = (−2) (−2, −1, 5)
AD = (−5, −1, 12) − (1, 2, −3) = (−6, −3, 15) = 3 (−2, −1, 5). Hence C and D are
on ℓ.
Another generic equation for ℓ is (5, 4, −13)+s ((−5, −1, 12) − (5, 4, −13)) = (5 − 10s, 4 − 5s, −13 + 25s)
The relationship between the two parameters is 1 − 2t = 5 − 10s or 5s − t = 2.
AE = (−1, −2, 3) − (1, 2, −3) = (−2, −4, 6), which is not a multiple of (−2, −1, 5). So E is not on ℓ.
(c) Find two points on ℓ at a distance √24 from E.
Solution: E = (−1, −2, 3). Let (1 − 2t, 2 − t, −3 + 5t) be such a point. Then (1 − 2t, 2 − t, −3 + 5t) − (−1, −2, 3) = (2 − 2t, 4 − t, −6 + 5t).
We want |(2 − 2t, 4 − t, −6 + 5t)| = √24, or (2 − 2t)² + (4 − t)² + (−6 + 5t)² = 24.
The solutions are t = 8/15 and t = 2. With t = 2 the point is (1 − 2t, 2 − t, −3 + 5t) = (−3, 0, 7), and with t = 8/15 it is (−1/15, 22/15, −1/3).
(d) Find an equation describing the line segment joining A and B. In particular find the
midpoint of the segment and also the three points strictly between A and B dividing
the segment into four equal parts.
Solution: (1 − 2t, 2 − t, −3 + 5t) (0 ≤ t ≤ 1) describes the segment AB. t = 1/2 gives the midpoint (1 − 2t, 2 − t, −3 + 5t) = (0, 3/2, −1/2). The required three points dividing the segment into 4 equal parts are the midpoint and the points corresponding to t = 1/4 and t = 3/4, i.e. (1/2, 7/4, −7/4) and (−1/2, 5/4, 3/4) respectively.
2. Let P, Q and R be three distinct points on the line of equation (1.10). Show that they are collinear, i.e., PQ ∥ PR.
3. Let the generic equation r(t) = a + tu represent a line ℓ, where u ≠ 0, as in the previous
question. Suppose that P with position-vector p is any point. Show that P lies on ℓ
(a) if, and only if, p − a is a multiple of u.
(b) if, and only if, the equation
a + tu = p
has a unique solution for t.
(Compare this with No.12 below for another criterion).
4. * Let u and v be non-zero vectors and suppose that a + tu and b + sv are generic equations
of lines ℓ1 and ℓ2 respectively.
Show that necessary and sufficient conditions for the generic equations to represent the
same set of points (i.e. ℓ1 = ℓ2 ) are
(i) u ∥ v and
(ii) a − b is a multiple of v.
Using this result show that two lines sharing two distinct points are identical.
Solution:
Suppose that ℓ1 = ℓ2 . Then a = b + tv for some t (showing (ii) holds). Furthermore,
a + u = b + t1 v for some t1 . Subtracting, u = (t1 − t) v, so (i) holds.
Conversely, let the conditions (i) and (ii) hold. Then u = βv for some scalar β and
a − b = γv for a scalar γ. So, given t, we can solve
a − b + tu = (γ + tβ) v = sv
(uniquely) for s. This shows that all points on ℓ1 are on ℓ2. Since the conditions (i) and (ii) are symmetrical (that is, v ∥ u and b − a is a multiple of u), a symmetric argument shows that all points on ℓ2 are on ℓ1. Thus ℓ1 = ℓ2.
Let a + tu = b + sv and a + t1u = b + s1v, where t ≠ t1 and s ≠ s1. Then (t − t1)u = (s − s1)v. It follows that (i) and (ii) hold and that the lines are identical.
5. * Use the results of the previous exercise to show that the four conditions in subsection
1.2.6 are in fact mutually exclusive and exhaustive.
Solution:
Either (A) u ∥ v or (B) not. If case (A) holds, suppose the lines have a point in common,
so that a + tu = b + sv has a solution for s and t. Then it follows that a − b is a multiple
of v and so the lines are identical.
If (B) is the case the lines either meet or do not meet.
6. Using equation (1.11), or otherwise, find the point Q of the line ℓ with parametric equation
(1 − 2t, 2 − t, −3 + 5t) which is closest to
7. Find the projections of the following points P on the line (−2t, −t, 5t).
(a) P = (4, 6, 7)
Solution: The projection is
8. Find the projection of the point P with position vector p = (p1 , p2 , p3 ) on the line (−2t, −t, 5t).
Solution. Direction of line is (−2, −1, 5). Projection is
((p1, p2, p3) · (−2, −1, 5) / |(−2, −1, 5)|²) (−2, −1, 5) = ((−2p1 − p2 + 5p3)/30) (−2, −1, 5)
9. * Derive the formula (1.11) that finds the foot Q of the perpendicular from P = (p1 , p2 , p3 )
onto the line a + tu using the technique of example 1.2.3.
Hint 7 Show that the solution to (a + tu − p) · u = 0 is
t = −((a − p) · u) / |u|²
10. Show that the closest the line ℓ with parametric equation a + tu comes to the point P is
d = (1/|u|) √(|p − a|² |u|² − ((p − a) · u)²)   (1.15)
Solution: From equation (1.11) we get the square d² of the required distance as follows. Let a − p = s; then
d² = |q − p|² = |a − p − (((a − p) · u)/|u|²) u|²
= (s − ((s · u)/|u|²) u) · (s − ((s · u)/|u|²) u)
= |s|² − 2 ((s · u)/|u|²) (s · u) + ((s · u)/|u|²)² |u|²
= |s|² − (s · u)²/|u|²
= (1/|u|²) (|a − p|² |u|² − ((a − p) · u)²)
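Formula (1.15) can be tested against Example 1.2.4, where the foot of the perpendicular from P = (2, −1, 4) to the line (1, 2, 3) + t(−2, −1, 1) is q = (1/3, 5/3, 10/3); the distance from (1.15) must then equal |q − P|. An illustrative Python sketch (not part of the original notes):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def distance_to_line(a, u, p):
    """Equation (1.15): d = sqrt(|p - a|^2 |u|^2 - ((p - a) . u)^2) / |u|."""
    s = tuple(pi - ai for ai, pi in zip(a, p))
    return math.sqrt(dot(s, s) * dot(u, u) - dot(s, u) ** 2) / math.sqrt(dot(u, u))

p, q = (2, -1, 4), (1 / 3, 5 / 3, 10 / 3)
d = distance_to_line((1, 2, 3), (-2, -1, 1), p)
direct = math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))
# d and direct agree: both are the shortest distance from P to the line.
```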
11. Show that |r(t)|² in equation (1.10) is a quadratic in t. How does your knowledge of quadratics tie up with the formula (1.11) and equation (1.15)?
Solution: |r(t)|² = (a + tu) · (a + tu) = |a|² + 2t a · u + t² |u|². The minimum of this is when
t = −(a · u)/|u|²
In other words, r(t) comes closest to the origin for this value of t. But r(t) comes closest to the point p just when r(t) − p comes closest to the origin. This gives
t = −((a − p) · u)/|u|²
exactly as before.
12. Deduce from equation (1.15) that ℓ passes through the point P if, and only if, (p − a) · u = ±|p − a| |u|. Note that geometrically this is clear in view of the formula for the angle between p − a and u if p − a ≠ 0.
Solution: ℓ passes through the point P if, and only if, the shortest distance d from P to ℓ is 0, which is the case if, and only if, |a − p|² |u|² − ((a − p) · u)² = 0, if and only if (a − p) · u = ±|a − p| |u|.
13. * If t is time in seconds and u is the displacement in meters per second (the constant
velocity), then equation (1.10) can be interpreted as the position of (say) an aircraft in
meters at time t.
(a) In a situation like this we would be interested in how close the aircraft comes to a
certain point P for t ≥ 0. Find a condition for (1.15) to hold at some time t ≥ 0.
If this condition fails, what is the closest the aircraft gets to P and when does this
occur?
Solution. Equation (1.15) holds for some t ≥ 0 just when (a − p) · u ≤ 0. If this condition fails, the closest the aircraft gets to P is at t = 0 and the closest distance is |a − p|.
(b) How can your findings be used to find the closest distance two aircraft travelling
with constant velocities get to each other, assuming that as time t ≥ 0 increases the
distance between the aircraft decreases?
Hint 8 Let a + tu and b + tv describe the positions of the two aircraft at time t.
Consider the line b − a + t (v − u). You want its closest distance from the origin. By
assumption, why is (b − a) · (v − u) < 0?
Solution: (b − a) · (v − u) < 0 is the condition found in (a), since the aircraft get closer and closer as t increases.
14. A rhombus is a (non-degenerate) parallelogram all of whose sides have the same length.
Show that the diagonals of a rhombus intersect at right angles.
Hint 9 In Example 1.2.8 (Figure 1.12) use |AB| = |BC| as well as AC = AB + BC and
AB + BD = AD, so BD = AD − AB = BC − AB. Now use the result of Exercise 2
No.19.
Solution: Follows straight from the hint: (AB + BC) · (BC − AB) = 0
15. *(Generalization of previous exercise). Let a + tu and b + sv represent two straight lines
and suppose that A = a + t1 u, B = b + s1 v, C = a + t2 u, and D = b + s2 v are points on
these lines with A 6= C and B 6= D.
Show that |AB|2 + |CD|2 = |BC|2 + |DA|2 if, and only if, the lines are perpendicular.
16. Let A, B and C be three points not on a line and so forming the distinct vertices of a
triangle. The median from A is the line joining A to the midpoint of the opposite side
BC. Show that all three medians intersect in a single point M . This point is called the
centroid of the triangle and is the centre of mass of three equal masses placed at the
vertices of the triangle.
Hint 11 Let a, b, and c be the position vectors of A, B and C respectively. Then the midpoint of BC has position vector (1/2)(b + c). A generic equation of the line through A and the midpoint of BC is
a + t ((1/2)(b + c) − a)
Put t = 2/3 and simplify the result. You should conclude that the centroid lies two thirds of the way from any vertex along its median toward the opposite side.
Solution:
a + (2/3) ((1/2)(b + c) − a) = (1/3)(a + b + c)
The symmetry of this equation proves that the medians intersect at the point M = (1/3)(a + b + c). The distance of A from M is
(2/3) |(1/2)(b + c) − a|
17. (Cartesian equations for a straight line). Let equation (1.10) define a straight line, where u = (u1, u2, u3) and each ui ≠ 0. Show that r = (x, y, z) is on the line if, and only if,
(x − a1)/u1 = (y − a2)/u2 = (z − a3)/u3
What happens if one or two of the ui = 0? Find Cartesian equations for the line of
Question 1a above. Find generic and Cartesian equations for the x−, y− and z−axis.
Solution: We have
(x, y, z) = (a1, a2, a3) + t(u1, u2, u3)
This says (x − a1)/u1 = (y − a2)/u2 = (z − a3)/u3 = t, and the result follows. If say u1 = 0, then the
Cartesian equations are x = a1 and (y − a2)/u2 = (z − a3)/u3. If u1 = u2 = 0, the equations are x = a1,
y = a2.
For the line (1 − 2t, 2 − t, −3 + 5t) = (1, 2, −3) + t (−2, −1, 5) of 1a the Cartesian equa-
tions are
(x − 1)/(−2) = (y − 2)/(−1) = (z + 3)/5
Generic equations for the x−, y− and z−axis are r(1, 0, 0), s(0, 1, 0), and t(0, 0, 1); Carte-
sian equations are y = z = 0, x = z = 0 and x = y = 0 respectively.
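As a numerical check of the symmetric (Cartesian) form (an added illustration, not part of the original notes), the following Python sketch tests whether a point satisfies (x − a1)/u1 = (y − a2)/u2 = (z − a3)/u3 for the line of Question 1a:

```python
from fractions import Fraction as F

a, u = (1, 2, -3), (-2, -1, 5)    # line of Question 1a: r = a + t u

def on_line_cartesian(p):
    # Test (x - a1)/u1 = (y - a2)/u2 = (z - a3)/u3 exactly.
    ratios = [(pi - ai) / F(ui) for pi, ai, ui in zip(p, a, u)]
    return ratios[0] == ratios[1] == ratios[2]

# Points generated from the generic equation satisfy the Cartesian equations ...
for t in (F(0), F(1), F(-3), F(7, 2)):
    p = tuple(ai + t * ui for ai, ui in zip(a, u))
    assert on_line_cartesian(p)
# ... and a point off the line does not.
assert not on_line_cartesian((0, 0, 0))
```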
18. Let ℓ1 be the line through A = (−1, −2, 6) and B = (3, 2, −2) and ℓ2 the line through
C = (1, 3, 1) and D = (1, 1, 5). Solve the following problems.
(a) Show that ℓ1 and ℓ2 are skew.
Solution: Try to solve for s and t:
(−1, −2, 6) + s(4, 4, −8) = (1, 3, 1) + t(0, −2, 4)
The first coordinate gives s = 1/2, and the second then gives t = 3/2, but the third coordinate gives 6 − 8s = 2 while 1 + 4t = 7, a contradiction. Since the direction vectors (4, 4, −8) and (0, −2, 4) are not parallel, the lines are not parallel either, so ℓ1 and ℓ2 are skew.
19. Let ℓ1 be the line through A = (2, −1, 4) and B = (3, 0, 5) and ℓ2 the line through
C = (−1, 0, 1) and D = (−2, 1, 0). Show that ℓ1 and ℓ2 meet and find the point of
intersection.
Solution: Solve for s and t: (2, −1, 4)+s ((3, 0, 5) − (2, −1, 4)) = (−1, 0, 1)+t ((−2, 1, 0) − (−1, 0, 1)).
We have (3 + s + t, −1 + s − t, 3 + s + t) = (0, 0, 0). The solution is given by s − t = 1,
s + t = −3 so s = −1, t = −2. Point of intersection is (1, −2, 3).
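The computation in No.19 can be sketched in Python (an added illustration): solve for s and t from two coordinates and then confirm that the third coordinate agrees, which is exactly the test that distinguishes intersecting lines from skew ones.

```python
from fractions import Fraction as F

A, B = (2, -1, 4), (3, 0, 5)
C, D = (-1, 0, 1), (-2, 1, 0)
u = tuple(q - p for p, q in zip(A, B))    # direction of l1
v = tuple(q - p for p, q in zip(C, D))    # direction of l2

# Solve s*u - t*v = C - A using the first two coordinates (Cramer's rule).
rhs = [c - a for a, c in zip(A, C)]
det = u[0] * (-v[1]) - (-v[0]) * u[1]
s = F(rhs[0] * (-v[1]) - (-v[0]) * rhs[1], det)
t = F(u[0] * rhs[1] - rhs[0] * u[1], det)

# The third coordinate agrees as well, so the lines really meet.
P1 = tuple(a + s * ui for a, ui in zip(A, u))
P2 = tuple(c + t * vi for c, vi in zip(C, v))
assert P1 == P2 == (1, -2, 3)
```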
20. How are our general results affected (and simplified) when one of the coordinates (say the
z− coordinate) is restricted to be zero? In particular what does equation (1.11) reduce to?
1.3 Planes
Figure 14: R in the plane through A, B and C, with sAB = AP = QR and tAC = AQ = PR, so that APRQ is a parallelogram.
Suppose that R lies in the same plane as A, B and C. Let the line through R parallel to AC
meet the line through A and B at the point P . Then AP = sAB for some scalar s. Similarly,
let the line through R parallel to AB meet the line through A and C at the point Q, so that
AQ = tAC for some scalar t. The figure AP RQ forms a parallelogram and
OR = OA + AR = OA + (AP + PR) = OA + AP + AQ = OA + sAB + tAC   (1.16)
This equation describes the position vector r = OR of a general point on the plane through
A, B and C and is called a generic equation of the plane through A, B and C.
Example 1.3.1 Let A = (1, −1, 2), B = (1, 2, 3) and C = (−1, 0, 1) be three given points.
They are not collinear since the vectors AB = (0, 3, 1) and AC = (−2, 1, −1) are not parallel.
Hence a generic equation of the plane through A, B and C is
r = (1, −1, 2) + s(0, 3, 1) + t(−2, 1, −1)
In equation (1.16) the vectors u = AB and v = AC are non-zero and non-parallel and are
thought of as parallel to the plane described. This suggests the following definition:
n1x1 + n2x2 + n3x3 − (n1a1 + n2a2 + n3a3)   (1.18)
= n1x1 + n2x2 + n3x3 + c = 0
where c = −(n1a1 + n2a2 + n3a3)
A Cartesian equation is unique up to non-zero constant multiples of (n1 , n2 , n3 , c) since
evidently multiplying n1, n2, n3 and c by β ≠ 0 cannot change a solution (x1, x2, x3).
Figure 15: The plane (r − a) · n = 0, with normal n and general point R = (x1, x2, x3); the vector AR is parallel to the plane.
Conversely, if equation (1.19) holds then n must be a normal (see Exercise 14, No.5 below).
Because equation (1.19) involves two equations for three unknowns, a solution n ≠ 0 can always
be found. See Chapter 2, 2.2.15 for a proof of this.
Example 1.3.2 Find a normal n 6= 0 to the plane of Example 1.3.1 and hence its Cartesian
equation.
Solution:
We solve n · AB = 3n2 + n3 = 0 and n · AC = −2n1 + n2 − n3 = 0, since any such n ≠ 0
is perpendicular (normal) to the plane. We may take n2 = 1, so that n3 = −3 and n1 = 2, giving n = (2, 1, −3). With
A = (1, −1, 2) we get a Cartesian equation for the plane: 2x + y − 3z + 5 = 0.
Consider again example 1.3.1 with generic equation r = (1, −1, 2) + s(0, 3, 1) + t(−2, 1, −1).
Writing x = 1 − 2t, y = −1 + 3s + t and z = 2 + s − t, we can eliminate s and t as follows.
Substituting t = (1 − x)/2 into y = −1 + 3s + t and z = 2 + s − t gives y = −1 + 3s + (1 − x)/2 and
z = 2 + s − (1 − x)/2. Hence
y − 3z = −1 + 3s + (1 − x)/2 − 3(2 + s − (1 − x)/2) = −2x − 5
and the Cartesian equation of the plane is 2x + y − 3z + 5 = 0.
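The normal found in Example 1.3.2 can be recomputed in a few lines of Python (an added illustration; it follows the same "solve two perpendicularity equations with n2 = 1" route as the text):

```python
from fractions import Fraction as F

A = (1, -1, 2)
AB, AC = (0, 3, 1), (-2, 1, -1)

dot = lambda p, q: sum(pi * qi for pi, qi in zip(p, q))

# Fix n2 = 1 and solve n . AB = 0 and n . AC = 0 for n1 and n3.
n2 = F(1)
n3 = -3 * n2                # from 3*n2 + n3 = 0
n1 = (n2 - n3) / 2          # from -2*n1 + n2 - n3 = 0
n = (n1, n2, n3)
assert dot(n, AB) == 0 and dot(n, AC) == 0
assert n == (2, 1, -3)

# Cartesian equation n . r + c = 0 through A gives c = -n . a.
c = -dot(n, A)
assert c == 5               # i.e. 2x + y - 3z + 5 = 0
```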
Example 1.3.3 Find a generic equation for the line of intersection of the planes
x1 + x2 + x3 + 1 = 0
2x1 + x2 − x3 + 2 = 0
Solution:
Note that (1, 1, 1) ∦ (2, 1, −1) so the planes are not parallel. Eliminating x1 gives x2 + 3x3 = 0,
or x2 = −3x3. Substituting x2 = −3x3 into the first (or second) equation gives x1 = 2x3 − 1.
Hence, with t = x3 the parameter,
r = (2t − 1, −3t, t) = (−1, 0, 0) + t(2, −3, 1)
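Since x2 = −3x3 and x1 = 2x3 − 1, points of the intersection line have the form (2t − 1, −3t, t) with t = x3. A short Python check (an added illustration) confirms that every such point satisfies both plane equations:

```python
from fractions import Fraction as F

plane1 = lambda x1, x2, x3: x1 + x2 + x3 + 1
plane2 = lambda x1, x2, x3: 2 * x1 + x2 - x3 + 2

# With t = x3, the solution x2 = -3*x3 and x1 = 2*x3 - 1 gives the line
# r = (2t - 1, -3t, t); every such point lies on both planes.
for t in (F(0), F(1), F(-2), F(5, 3)):
    r = (2 * t - 1, -3 * t, t)
    assert plane1(*r) == 0 and plane2(*r) == 0
```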
Example 1.3.4 Find a generic equation of the plane with Cartesian equation
x1 − 2x2 + 5x3 = 10   (1.20)
Solution:
By far the easiest solution is
r = (10 + 2x2 − 5x3 , x2 , x3 )
Here the independent parameters are x2 and x3 and they can vary freely over ℜ. We could
of course let s = x2 and t = x3 , but this is only an artificial device and changes nothing.
A more long-winded approach is to find three non-collinear points A, B and C on the plane
and then proceed as in Example (1.3.1).
For instance A = (10, 0, 0), B = (0, 0, 2) and C = (0, −5, 0) lie on the plane and are not collinear, giving
r = OA + sAB + tAC = (10 − 10s − 10t, −5t, 2s)
Remark 13 Instead of A, B and C we could have chosen any three non-collinear points sat-
isfying equation (1.20). The parametric equation will be different but will represent the same
plane.
Example 1.3.5 Find two non-parallel vectors u and v both perpendicular to n = (1, −2, 5).
Solution:
From the previous example two such vectors are u = AB and v = AC
Example 1.3.6 Find the point of intersection of the line through (1, −1, 0) and (0, 1, 2) and
the plane
3x1 + 2x2 + x3 + 1 = 0
Solution:
For some t the point (1, −1, 0) + t(−1, 2, 2) = (1 − t, −1 + 2t, 2t) lies on the plane:
3(1 − t) + 2(−1 + 2t) + 2t + 1 = 3t + 2 = 0
This gives t = −2/3, so the point of intersection is (5/3, −7/3, −4/3).
Example 1.3.7 Find the foot Q of the perpendicular from the point P = (4, −1, 3) to the plane
x + 2y + z = −4. Hence find the shortest distance from P to the given plane.
Solution:
The vector (1, 2, 1) is normal to the plane so the line ℓ defined by (4, −1, 3) + t (1, 2, 1) =
(4 + t, −1 + 2t, 3 + t) passes through P and has direction normal to the plane. Geometry tells
us that the line ℓ hits the plane at the required foot Q of the perpendicular. Then P Q will
be the shortest distance of P from the plane.
To find t, substitute (4 + t, −1 + 2t, 3 + t) into the Cartesian equation of the plane:
(4 + t) + 2(−1 + 2t) + (3 + t) = −4
and solve for t. This gives t = −3/2, so Q has position vector
q = (4 − 3/2, −1 + 2(−3/2), 3 − 3/2) = (5/2, −4, 3/2)
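The foot of the perpendicular and the shortest distance can be checked in Python (an added illustration; the closed-form distance |n · p + c|/|n| used in the last assertion is the formula the notes call (1.22)):

```python
import math
from fractions import Fraction as F

P = (4, -1, 3)
n, c = (1, 2, 1), 4                  # the plane x + 2y + z + 4 = 0

dot = lambda p, q: sum(pi * qi for pi, qi in zip(p, q))

# Solve n . (p + t n) + c = 0 for t, then Q = P + t n.
t = F(-(c + dot(n, P)), dot(n, n))
Q = tuple(pi + t * ni for pi, ni in zip(P, n))
assert t == F(-3, 2) and Q == (F(5, 2), -4, F(3, 2))

# The distance |PQ| agrees with the formula |n . p + c| / |n|.
dist = math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(P, Q)))
assert abs(dist - abs(dot(n, P) + c) / math.sqrt(dot(n, n))) < 1e-12
```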
Figure 16: The point P and the plane Π.
Example 1.3.8 Find the projection of the point P = (4, −1, 3) onto the plane x + 2y + z = 0.
Solution:
The plane passes through the origin (that is why we call the foot Q of the perpendicular the
projection). As before, substitute (4 + t, −1 + 2t, 3 + t) into the Cartesian equation x + 2y + z = 0:
(4 + t) + 2(−1 + 2t) + (3 + t) = 0
This gives t = −5/6 and the projection of P onto the plane x + 2y + z = 0 is the point
Q = (4 − 5/6, −1 + 2(−5/6), 3 − 5/6) = (19/6, −8/3, 13/6).
Figure 17: Foot Q of perpendicular from P to Π, with normal n at Q and X a point of Π.
PQ = (PX · (n/|n|)) (n/|n|)
= (1/(n1² + n2² + n3²)) ((x1 − p1, x2 − p2, x3 − p3) · (n1, n2, n3)) (n1, n2, n3)
= −((c + n · p)/|n|²) n
Notice that these formulas are independent of the choice of X. See Exercise 14, (8) for an
alternative proof of equation (1.21).
Example 1.3.9 Find the foot Q of the perpendicular from a general point P = (p1, p2, p3) to (1) the plane x + 2y + z = −4 and (2) the plane x + 2y + z = 0.
Solution
1. By equation (1.21) with n = (1, 2, 1) and c = 4,
q = (p1, p2, p3) − ((4 + p1 + 2p2 + p3)/6)(1, 2, 1)   (1.23)
2. The projection of P on x + 2y + z = 0 is
q = (p1, p2, p3) − (((1, 2, 1) · (p1, p2, p3))/|(1, 2, 1)|²)(1, 2, 1)
= (p1, p2, p3) − ((p1 + 2p2 + p3)/6)(1, 2, 1)
= (5p1/6 − p2/3 − p3/6, −p1/3 + p2/3 − p3/3, −p1/6 − p2/3 + 5p3/6)   (1.24)
1. Show that A = (1, 2, 1), B = (0, 1, 3) and C = (−1, 2, 5) are not collinear and so define a
unique plane Π. Find:
Solutions to Question 1.
A = (1, 2, 1), B = (0, 1, 3) and C = (−1, 2, 5) AB = (0, 1, 3) − (1, 2, 1) = (−1, −1, 2)
AC = (−1, 2, 5) − (1, 2, 1) = (−2, 0, 4).
As AB ∦ AC the points A, B, C are not collinear.
1a
Generic equation of plane is r = (1, 2, 1) + s (−1, −1, 2) + t (−2, 0, 4)
1b
Spot a normal: n = (2, 0, 1), so a Cartesian equation of the plane is 2x + z + c = 0.
Substitute A = (1, 2, 1) in the equation: 2(1) + 1 + c = 0, so c = −3. The Cartesian equation of the
plane is 2x + z − 3 = 0.
1c
Using equation (1.21), the foot of the perpendicular from P = (1, 2, 3) is at
(1, 2, 3) − ((−3 + (2, 0, 1) · (1, 2, 3))/5)(2, 0, 1) = (1 − 4/5, 2, 3 − 2/5) = (1/5, 2, 13/5)
The shortest distance from P to the plane Π is therefore
|(1, 2, 3) − (1/5, 2, 13/5)| = |(4/5, 0, 2/5)| = (1/5)√20 = 2/√5.
3. Show that the plane r = r(s, t) = a + su + tv passes through the origin if, and only if, a is
a linear combination of u and v.
Solution: One way is clear, so let a = αu+βv for scalars α and β. Then r(−α, −β) = 0.
4. Find generic and Cartesian equations for the x–y and y–z planes.
Solution: A generic equation of the x–y plane is r = (x, y, 0); of the y–z plane, r = (0, y, z).
Cartesian equations are respectively z = 0 and x = 0.
5. Prove that if n satisfies equation (1.19) then it is normal to the plane given by the generic
equation (1.17). In other words, show that n will be perpendicular to every vector
parallel to the plane and so to each line lying in the plane.
Solution: Let r1 = a + s1u + t1v and r2 = a + s2u + t2v represent two points R1 and R2
on the plane. Then, since equation (1.19) says n · u = n · v = 0,
n · (r2 − r1) = (s2 − s1)(n · u) + (t2 − t1)(n · v) = 0
6. Use equation (1.21) to deduce the shortest distance formula equation (1.22).
Solution: |p − q| = |p − (p − ((c + n · p)/|n|²)n)| = (|n · p + c|/|n|²)|n| = |n · p + c|/|n|
q = (p1, p2, p3) − ((4 + p1 + 2p2 + p3)/6)(1, 2, 1)
For the projection we must substitute (p1, p2, p3) + t(1, 2, 1) = (p1 + t, p2 + 2t, p3 + t) into
the equation x + 2y + z = 0 of the plane.
Hint 15 Solve
n · (p + tn) + c = n · p + t|n|² + c = 0
for t.
11. Find a generic equation of the line of intersection of the planes 2x + 3y + z + 1 = 0 and
x − 4y + 5z − 2 = 0.
Solution: We have 2x + 3y + z + 1 − 2(x − 4y + 5z − 2) = 11y − 9z + 5 = 0. Take z as the
parameter: y = (9/11)z − 5/11, and so x − 4((9/11)z − 5/11) + 5z − 2 = 0, giving x = −(19/11)z + 2/11. A
generic equation of the line is
r = (−(19/11)z + 2/11, (9/11)z − 5/11, z)
12. (Hirst, p.36). In 12a - 12c find the intersection of the three given planes. In each case
give a brief geometric description.
Note: If you have difficulty in solving three equations for three unknowns, you may
postpone this question to Exercise 32 No.(2) in Chapter 2.
(a)
3x − 2y + 2z = 5
4x + y − z = 3
x+y+z = 6
(b)
x + 2y − z = 3
2x + 4y − 2z = 5
x−y+z = 1
(c)
x + 2y − z = 2
x−y+z = 1
3x + 3y − z = 5
Solution to 12a:
let (i) 3x − 2y + 2z = 5 (ii) 4x + y − z = 3 (iii) x + y + z = 6
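A Python sketch of 12a (an added illustration, using exact rational arithmetic and a generic forward elimination rather than any particular sequence of eros from the text):

```python
from fractions import Fraction as F

# Augmented array of system 12a.
M = [[F(3), F(-2), F(2), F(5)],
     [F(4), F(1), F(-1), F(3)],
     [F(1), F(1), F(1), F(6)]]
n = 3

# Forward elimination to row echelon form.
for i in range(n):
    piv = next(r for r in range(i, n) if M[r][i] != 0)   # non-zero pivot
    M[i], M[piv] = M[piv], M[i]
    for r in range(i + 1, n):
        f = M[r][i] / M[i][i]
        M[r] = [a - f * b for a, b in zip(M[r], M[i])]

# Back-substitution.
x = [F(0)] * n
for i in reversed(range(n)):
    x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]

assert x == [1, 2, 3]   # the three planes meet in the single point (1, 2, 3)
```

Geometrically: a unique solution means the three planes of 12a meet in exactly one point.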
Solution:
In the notation of 1.3.3 suppose that Π1 = Π2 . Since a1 ∈ Π2 , a1 −a2 = αu2 +βv 2 for some
scalars α and β. Hence condition (ii) holds. Again, for the same reason, a1 − a2 + u1 =
su2 + tv 2 for some scalars s and t. Subtracting gives u1 = (s − α) u2 + (t − β) v 2 . Hence
u1 ∈ sp (u2 , v 2 ). Similarly, v 1 ∈ sp (u2 , v 2 ), so that sp (u1 , v 1 ) ⊆ sp (u2 , v 2 ). As Π1 = Π2
a symmetric argument shows sp (u2 , v 2 ) ⊆ sp (u1 , v 1 ), so that sp (u1 , v 1 ) = sp (u2 , v 2 ).
This shows that (i) holds.
44CHAPTER 1. TWO AND THREE-DIMENSIONAL ANALYTIC GEOMETRY. VECTORS
Conversely, let (i) and (ii) hold. Then for certain scalars α, β, γ, δ, ε and η we have:
a1 − a2 = αu2 + βv2, u1 = γu2 + δv2 and v1 = εu2 + ηv2. So for any scalars s and t,
14. (See Figure 1.18) Let n · r + c = 0 and m · r + d = 0 represent non-parallel planes Π1 and
Π2 respectively. Show that the Cartesian equation of a plane Π that contains the line of
intersection of Π1 and Π2 has the form
λ (n · r + c) + µ (m · r + d) (1.26)
= (λn + µm) · r + λc + µd = 0
where λ and µ are two scalars not both of which are zero.
Hint 17 First note that because n ≠ 0 and m ≠ 0 are not parallel, the linear combination
λn + µm is also non-zero if one or both of λ, µ are non-zero. In that case equation
(1.26) represents a plane Π. Note that if (e.g.) λ ≠ 0 and µ = 0 then Π is just Π1.
Now check that a point on Π1 and Π2 is also on Π.
15. Find the form of a Cartesian equation of the plane Π that contains the intersection of the
planes
2x1 + 3x2 + x3 + 1 = 0 and x1 − 4x2 + 5x3 − 2 = 0
If the plane Π also contains the point (1, −1, 2), find its Cartesian equation.
Figure 1.17: The plane Π containing the line of intersection of the planes Π1 and Π2.
Solution: The two planes are 2x1 + 3x2 + x3 + 1 = 0 and x1 − 4x2 + 5x3 − 2 = 0.
They are not parallel, so by No.14 the plane Π has an equation of the form λ(2x1 + 3x2 + x3 + 1) + µ(x1 − 4x2 + 5x3 − 2) = 0. Substituting the point (1, −1, 2) gives 2λ + 13µ = 0, so we may take λ = 13 and µ = −2, and the Cartesian equation of Π is 24x1 + 47x2 + 3x3 + 17 = 0.
1.4 Reflections in a Line or a Plane
Figure 18: The reflection P′ of P in a line ℓ or a plane Π.
The reflection P′ of P in a line or a plane is given by
p′ = 2q − p   (1.27)
where Q is the foot of the perpendicular from P to the line or plane and we have written p′
for the position vector of P′.
Example 1.4.1 Consider Example 1.2.7 of a line in the x–y plane. Using equation (1.14) and
equation (1.27), the reflection of p = (p1, p2) in the line is
p′ = 2q − p = 2((3p1 + √3p2)/4, (√3p1 + p2)/4) − (p1, p2)   (1.28)
= ((1/2)p1 + (√3/2)p2, (√3/2)p1 − (1/2)p2)
Example 1.4.2 Consider (1) of Example 1.3.9 where we found the foot of the perpendicular
from p to the plane x + 2y + z = −4 given by equation (1.23):
q = (p1, p2, p3) − ((4 + p1 + 2p2 + p3)/6)(1, 2, 1)
From this it follows that the reflection P′ of P in the plane is the point with position vector
p′ = 2q − p = p − ((4 + p1 + 2p2 + p3)/3)(1, 2, 1)
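A numerical check of this reflection formula (an added illustration; the point P = (4, −1, 3) is the one from Example 1.3.7):

```python
from fractions import Fraction as F

P = (4, -1, 3)
n, c = (1, 2, 1), 4                      # the plane x + 2y + z + 4 = 0

dot = lambda u, v: sum(a * b for a, b in zip(u, v))

# p' = p - 2((c + n . p)/|n|^2) n
factor = F(2 * (c + dot(n, P)), dot(n, n))
P_ref = tuple(pi - factor * ni for pi, ni in zip(P, n))
assert P_ref == (1, -7, 0)

# Sanity checks: the midpoint of P and P' lies on the plane,
# and reflecting twice returns P.
mid = tuple(F(a + b, 2) for a, b in zip(P, P_ref))
assert dot(n, mid) + c == 0
factor2 = F(2 * (c + dot(n, P_ref)), dot(n, n))
assert tuple(a - factor2 * b for a, b in zip(P_ref, n)) == P
```

The two sanity checks are exactly the geometric characterisation of a reflection: Q is the midpoint of PP′ and lies on the plane.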
1. Using equation (1.11) - which finds the foot Q of the perpendicular from P to the line ℓ
with parametric equation r = a + tu - find a formula for the reflection P ′ of P in the line
ℓ.
Solution: q = a + (((p − a) · u)/|u|²)u is the foot of the perpendicular from P to the line. The reflection
formula is
p′ = 2q − p = 2(a + (((p − a) · u)/|u|²)u) − p
2. Using equation (1.21) - which finds the foot Q of the perpendicular from P to the plane
with Cartesian equation n · r + c = 0 - find a formula for the reflection P ′ of P in the
given plane.
Solution: q = p − ((c + n · p)/|n|²)n is the foot of the perpendicular from P to the plane. The reflection formula
is
p′ = 2q − p = 2(p − ((c + n · p)/|n|²)n) − p = p − 2((c + n · p)/|n|²)n
When the plane passes through the origin (c = 0) this becomes
p′ = p − ((2n · p)/|n|²)n
4. Using the answers to Questions (2) and (3), or otherwise, find formulae for reflections
in the planes x + 2y + z = 0, 3x − y − 2z = 1 and 3x − y − 2z = 0.
Solution: For both of the last two planes n = (3, −1, −2), with |n|² = 14. The formula for reflection in the plane 3x − y − 2z = 1 (so c = −1)
is
p′ = p − ((−1 + (3, −1, −2) · p)/7)(3, −1, −2)
= p − ((−1 + 3p1 − p2 − 2p3)/7)(3, −1, −2)
5. In the x − y plane find a formula for reflection in the line passing through the origin and
making an angle of 60◦ with the positive x−axis.
Solution: A parametric equation of the line is t(1/2, √3/2). The foot Q of the perpendicular from P =
p = (p1, p2) to the line is q = ((p1, p2) · (1/2, √3/2))(1/2, √3/2) = ((1/2)p1 + (√3/2)p2)(1/2, √3/2). The formula for reflection is
p′ = 2q − p = (p1 + √3p2)(1/2, √3/2) − (p1, p2) = (−(1/2)p1 + (√3/2)p2, (√3/2)p1 + (1/2)p2)
6. * In the x − y plane find a formula for reflection in the line passing through the origin
and making an angle of θ radians with the positive x−axis.
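The formulas of Questions 5 and 6 can be checked numerically (an added illustration, assuming the unit direction (cos θ, sin θ) for the line; the cos 2θ / sin 2θ matrix in the second half is one standard way to answer the starred Question 6):

```python
import math

def reflect_in_line(p, theta):
    # Reflection of p = (p1, p2) in the line through the origin at angle theta:
    # p' = 2(p . u)u - p with unit direction u = (cos theta, sin theta).
    c, s = math.cos(theta), math.sin(theta)
    d = p[0] * c + p[1] * s
    return (2 * d * c - p[0], 2 * d * s - p[1])

# For theta = 60 degrees this reproduces the formula of Question 5.
p = (2.0, -3.0)
r3 = math.sqrt(3)
expected = (-p[0] / 2 + r3 / 2 * p[1], r3 / 2 * p[0] + p[1] / 2)
got = reflect_in_line(p, math.pi / 3)
assert all(abs(g - e) < 1e-12 for g, e in zip(got, expected))

# Expanding 2(p . u)u - p gives a matrix form with entries
# cos 2θ and sin 2θ, which answers Question 6.
c2, s2 = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
alt = (c2 * p[0] + s2 * p[1], s2 * p[0] - c2 * p[1])
assert all(abs(g - a) < 1e-12 for g, a in zip(got, alt))
```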
a + b = (a1 + b1 , a2 + b2 , a3 + b3 )
a · b = a1 b 1 + a2 b 2 + a3 b 3
r = a + tu (t ∈ ℜ)
1.5.3 Planes
(See section 1.3).
A generic equation of the plane containing two non-parallel vectors u and v and passing
through the point A is
r = a + su + tv (s, t ∈ ℜ)
1.5. SUMMARY OF CHAPTER 1
A vector n 6= 0 is normal to the plane if it is perpendicular to every line lying in the plane.
A Cartesian equation of the plane has the form
n · r + c = n1x1 + n2x2 + n3x3 + c = 0
where c is a constant.
The foot Q of the perpendicular from a point P to the plane is given by equation (1.21):
q = p − ((c + n · p)/|n|²)n
When a line or plane goes through the origin the foot Q of the perpendicular from P to the
line or plane is called the projection of P on the line or plane.
The reflection P ′ of a point P in a line or plane is given by
p′ = 2q − p
Linear algebra is used in almost every branch of pure and applied mathematics. It is indis-
pensable for all branches of engineering, the physical sciences (including physics, chemistry
and biology), statistics and for operations research (mathematics applied to industry and com-
merce).
Although most results in this course are valid for various number systems (e.g. rational, real
or complex numbers), it is assumed that all numbers and number-variables dealt with are real,
i.e. range over the real number system ℜ. Numbers will also be referred to as scalars.
CHAPTER 2. MATRICES AND THE SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS
ii (Second equivalent system). In the new system subtract (a) from (b), to obtain:
It is now obvious from (b) that x2 = −1. By back-substitution in (a) we get x1 = 1. For
reasons that will become clear later we write this unique solution as a column vector
x = [ x1 ]  =  [  1 ]
    [ x2 ]     [ −1 ]
x1 + x2 + 3x3 = −1 (a)
x1 + x2 + 4x3 = −3 (b) (2.1)
−x1 + x2 − 2x3 = 1 (c)
Solution:
x1 + x2 + 3x3 = −1 (a)
2x2 + x3 = 0 (b) (2.2)
x3 = −2 (c)
Because the coefficients of x1 in (b) and (c) are zero and also because the coefficient of x2
in (c) is zero, we say the system is in (row) echelon or (upper) triangular form. From (c),
x3 = −2. Back-substitution into (b) gives x2 = 1. Finally, back-substitution of x3 = −2 and
x2 = 1 into (a) gives x1 = 4. The solution is
x = [ x1 ]   [  4 ]
    [ x2 ] = [  1 ]   (2.3)
    [ x3 ]   [ −2 ]
Solution:
There are only three equations for four unknowns and the equations are already in echelon
form. One of the unknowns (say x4) can be any number and back-substitution gives in turn
x3 = x4 − 2, x2 = −(1/2)(x3 + x4) = 1 − x4 and x1 = −1 − x2 − 3x3 − 2x4 = 4 − 4x4. It is
customary to let x4 = t, so
x = [ x1 ]   [ 4 − 4t ]
    [ x2 ] = [ 1 − t  ]   (2.5)
    [ x3 ]   [ t − 2  ]
    [ x4 ]   [ t      ]
is the general solution. The parameter t ∈ ℜ can be any real number.
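That every value of t gives a solution can be verified directly (an added Python illustration; the three equations are read off from the echelon array (2.18)):

```python
from fractions import Fraction as F

def x_of(t):
    # The general solution (2.5).
    return (4 - 4 * t, 1 - t, t - 2, t)

def residuals(x):
    # The echelon-form equations read off from array (2.18).
    x1, x2, x3, x4 = x
    return (x1 + x2 + 3 * x3 + 2 * x4 + 1,
            2 * x2 + x3 + x4,
            x3 - x4 + 2)

# Every value of the parameter t gives a solution.
for t in (F(0), F(1), F(-5), F(7, 3)):
    assert residuals(x_of(t)) == (0, 0, 0)
```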
Remark 20 Here we have 3 equations for 4 unknowns and the solution shows that we have
an infinity of solutions. In Example 1.3.2 of Chapter 1, we found a vector n ≠ 0 normal to
a plane. For β ≠ 0, βn is also a normal, so again we have an infinite number of solutions,
this time of two equations in three unknowns. Example 1.3.4 of Chapter 1 involved solving one
equation for three unknowns and this involved two parameters.
These examples and also Example 2.1.3 illustrate a general result:
If a system of equations has more unknowns than equations and there is a solution, then the
system has infinitely many solutions. See, for example, 2.2.15 below. In Exercise 53, No.5 you
will get a more complete picture of this theorem.
Example 2.1.4 For which values of the variables does the following system have a solution?
x1 + x2 + 3x3 = −1 (a)
x1 + 3x2 + 4x3 = −1 (b)
2x1 + 4x2 + 7x3 = 5 (c)
Solution:
The system cannot have a solution: adding (a) and (b) gives 2x1 + 4x2 + 7x3 = −2, which contradicts (c). The system is
inconsistent (or the equations are incompatible) and the solution set is empty.
2.2 Matrices
A rectangular array A of numbers with m rows and n columns is called an m × n (“m by n”)
matrix. Matrices will form a new kind of algebra and it is customary to place brackets (we
will use square ones) around such arrays.
For example, let
A = [ 2  3 −1 ]
    [ 1 −4  5 ]
The (1, 1) entry is a11 = 2. It is the first entry in column 1 and also the first entry in row 1.
The (2, 1) entry a21 = 1 is the second row entry of column 1 and the first entry in row 2. The
numbers a12 = 3, a22 = −4, are the first and second entries of column 2 respectively. Row 2
has entries a21 = 1, a22 = −4, a23 = 5 respectively. The (1, 3) entry of A is a13 = −1 the (2, 3)
entry is a23 = 5.
It is important to keep in mind that aij is in row i and in column j of A.
Sometimes we write A = Am×n to indicate that A is m × n. Thus if A has two rows and
three columns, we may write A = A2×3 .
(b)
A = [  2 −3  4 −5 ]
    [ −3  4 −5  6 ]
    [  4 −5  6 −7 ]
    [ −5  6 −7  8 ]
Solution:
(a) aij = (−1)i+j for i, j = 1, 2, 3, 4.
(b) aij = (−1)i+j (i + j) for i, j = 1, 2, 3, 4.
H3 = [ 1/(1+1−1)  1/(1+2−1)  1/(1+3−1) ]   [ 1/1 1/2 1/3 ]
     [ 1/(2+1−1)  1/(2+2−1)  1/(2+3−1) ] = [ 1/2 1/3 1/4 ]
     [ 1/(3+1−1)  1/(3+2−1)  1/(3+3−1) ]   [ 1/3 1/4 1/5 ]
The system of Example 2.1.2 (equations (2.1)) has for coefficient matrix
A = [  1 1  3 ]
    [  1 1  4 ]
    [ −1 1 −2 ]
Using the general 4 × 5 matrix of (2.6), the following matrix equation expresses a general
system of 4 equations in 5 unknowns x1 , x2 , ... , x5 :
[ a11 a12 a13 a14 a15 ] [ x1 ]   [ b1 ]
[ a21 a22 a23 a24 a25 ] [ x2 ]   [ b2 ]
[ a31 a32 a33 a34 a35 ] [ x3 ] = [ b3 ]   (2.11)
[ a41 a42 a43 a44 a45 ] [ x4 ]   [ b4 ]
                        [ x5 ]
(b)
7x1 − 3x2 + 2x3 = 8
−3x1 + 9x2 − 15x3 = 1
(c)
−11x1 + 13x2 − x3 = −2
21x1 − (5/2)x2 + 13x3 = 29/2
(2/3)x1 − (1/3)x2 − (5/9)x3 = −2
(d)
5x + 2y = −2
−7x + 3y = 12
−(3/2)x + 6y = −6
(e)
−(13/3)x1 − (1/2)x2 + x3 − 7x4 = −21
5x1 + (1/4)x2 − 17x3 + 9x4 = 0
−4x1 + (7/2)x2 + (1/9)x3 + 101x4 = 2
(1/8)x1 − (7/2)x2 − 5x3 + 17x4 = 5
2. Write out in full the simultaneous equations represented by the following matrix equations
(a)
[  9 0 2 ] x = [  8 ]
[ −1 3 4 ]     [ 17 ]
(b)
[ a b ] x = [ −5 ]
[ c d ]     [  6 ]
(c)
[ −4    3   12  ]     [ p ]
[ −6   −2   −5  ] x = [ q ]
[ 11  −1/3  2/5 ]     [ r ]
(d)
[ −8   9  17 ]     [  3 ]
[  5   3 −23 ] x = [ −1 ]
[  2  −1  13 ]     [  1 ]
[ −11  3  −5 ]     [ −2 ]
3. Let
A = [ a11 a12 a13 a14 ]        x = [ x1 ]
    [ a21 a22 a23 a24 ]            [ x2 ]
    [ a31 a32 a33 a34 ]            [ x3 ]
    [ a41 a42 a43 a44 ]            [ x4 ]
(a)
[ a11 a12 a13 a14 ] [ x1 ]   [ a11x1  a12x2  a13x3  a14x4 ]
[ a21 a22 a23 a24 ] [ x2 ] = [ a21x1  a22x2  a23x3  a24x4 ] = Ax
[ a31 a32 a33 a34 ] [ x3 ]   [ a31x1  a32x2  a33x3  a34x4 ]
[ a41 a42 a43 a44 ] [ x4 ]   [ a41x1  a42x2  a43x3  a44x4 ]
(b)
Ax = [ x1a11  x2a12  x3a13  x4a14 ]
     [ x1a21  x2a22  x3a23  x4a24 ]
     [ x1a31  x2a32  x3a33  x4a34 ]
     [ x1a41  x2a42  x3a43  x4a44 ]
(c)
[ x1 x2 x3 x4 ] A = Ax
(d)
[ a11 a12 a13 a14 ]     [ x1a11  x1a12  x1a13  x1a14 ]
[ a21 a22 a23 a24 ] x = [ x2a21  x2a22  x2a23  x2a24 ]
[ a31 a32 a33 a34 ]     [ x3a31  x3a32  x3a33  x3a34 ]
[ a41 a42 a43 a44 ]     [ x4a41  x4a42  x4a43  x4a44 ]
Likewise, the left-hand sides of the first and second equations are respectively
[ 1 1 3 ] [ x1 ]                    [ 1 1 4 ] [ x1 ]
          [ x2 ] = x1 + x2 + 3x3              [ x2 ] = x1 + x2 + 4x3
          [ x3 ]                              [ x3 ]
Similarly, the left-hand side of the first equation of Example 2.1.3 is the product
[ 1 1 3 2 ] [ x1 ]
            [ x2 ] = x1 + x2 + 3x3 + 2x4
            [ x3 ]
            [ x4 ]
The left-hand side of equation (2.10) in Example 2.1.3 is the column with three products
[ 1 1 3  2 ]       [ x1 + x2 + 3x3 + 2x4 ]
[ 0 2 1  1 ] x  =  [ 2x2 + x3 + x4       ]
[ 0 0 1 −1 ]       [ x3 − x4             ]
Remark 23 The last examples show that we are dealing with a product which is a general-
ization of the dot product of Chapter 1 (subsection 1.1.12).
As remarked earlier, we will later develop special notations for row i and column j of A. What
is more, we will emphasize Ax as the product of the matrices A and x. (See subsection 3.1.6
in Chapter 3).
2  3 −1
1 −4  5      (2.15)
We have omitted brackets around the array. Here x1 and x2 can be thought of as labels
of the first two columns of the array. The third column of (2.15) is the right-hand side of
(2.14). It is clear that two equations in two unknowns determine and are determined by
such an array. Instead of performing operations on the given equations, we perform the
same operations on this array.
In the case of Example 2.1.1 these steps are:
2 3 −1
2 −8 10 2R2 (multiply Row 2 by 2)
2 3 −1
0 −11 11 −R1 + R2 (add (−1) × Row 1 to Row 2)
From the last array we read off −11x2 = 11, so x2 = −1. The first row reads 2x1 + 3x2 =
−1 and x1 = 1. This is the same solution we found before.
2. Example 2.1.2 has an expression as the matrix equation (2.8) and as an array of detached
coefficients the equations are:
1 1 3 −1
1 1 4 −3
−1 1 −2 1
Here x1 , x2 , and x3 label the first three columns. The steps leading to the solution are:
1 1 3 −1
0 0 1 −2 −R1 + R2
−1 1 −2 1
1 1 3 −1
0 0 1 −2
0 2 1 0 R1 + R3
Finally, we interchange rows 2 and 3:
1 1 3 −1
0 2 1 0 R2 ↔ R3 (2.16)
0 0 1 −2
The equations are in echelon form and we can read off the solution:
x3 = −2, 2x2 + x3 = 0, so x2 = 1 and finally from x1 + x2 + 3x3 = −1 we get x1 = 4. We
have
[ x1 ]   [  4 ]
[ x2 ] = [  1 ]
[ x3 ]   [ −2 ]
This is equation (2.3), found earlier.
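The three eros used above can be replayed in code (an added illustration; the operations are exactly −R1 + R2, R1 + R3 and R2 ↔ R3 as in the text):

```python
from fractions import Fraction as F

# Augmented array of Example 2.1.2 and the eros used in the text:
# -R1 + R2, then R1 + R3, then R2 <-> R3.
A = [[F(1), F(1), F(3), F(-1)],
     [F(1), F(1), F(4), F(-3)],
     [F(-1), F(1), F(-2), F(1)]]

A[1] = [a - b for a, b in zip(A[1], A[0])]    # -R1 + R2
A[2] = [a + b for a, b in zip(A[2], A[0])]    # R1 + R3
A[1], A[2] = A[2], A[1]                       # R2 <-> R3

assert A == [[1, 1, 3, -1],
             [0, 2, 1, 0],
             [0, 0, 1, -2]]                   # the array (2.16)

# Back-substitution recovers the solution (2.3).
x3 = A[2][3] / A[2][2]
x2 = (A[1][3] - A[1][2] * x3) / A[1][1]
x1 = (A[0][3] - A[0][1] * x2 - A[0][2] * x3) / A[0][0]
assert (x1, x2, x3) == (4, 1, -2)
```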
Observation: It is important to observe that the right-hand side of the original set
of equations is a linear combination of the columns of the coefficient matrix with
coefficients x1 , x2 and x3 :
[  1 ]     [ 1 ]     [  3 ]        [ −1 ]
[  1 ] 4 + [ 1 ] 1 + [  4 ] (−2) = [ −3 ]   (2.17)
[ −1 ]     [ 1 ]     [ −2 ]        [  1 ]
Linear combinations were introduced for row vectors in Chapter 1, subsection 1.1.18 and
will be systematically developed in Chapter 3. See subsection 3.1.2 and especially 3.1.9.
Note that it is natural to write the coefficients on the right of the column vectors since x
appears on the right of A when we express the equations in the form Ax = b.
3. Example 2.1.3 (equations (2.4)) has the matrix expression (2.10). The corresponding
array of detached coefficients is already in echelon form:
1 1 3 2 −1
0 2 1 1 0 (2.18)
0 0 1 −1 −2
The general solution can be read off as before by putting x4 = t and solving in turn for
x3 , x2 and x1 :
x = [ x1 ]   [ 4 − 4t ]
    [ x2 ] = [ 1 − t  ]
    [ x3 ]   [ t − 2  ]
    [ x4 ]   [ t      ]
This is just the solution (2.5) found earlier. A particular solution is found by giving the
parameter t some value, e.g. t = 3. In that case x1 = −8, x2 = −2, x3 = 1 and x4 = 3.
As a linear combination we have
1 1 3 2 −1
0 (−8) + 2 (−2) + 1 + 1 3 = 0 (2.19)
0 0 1 −1 −2
4. Example 2.1.4 has two expressions, one as equation (2.7), and also as the array of detached
coefficients (augmented matrix)
1 1 3 −1
1 3 4 −1
2 4 7 5
The steps leading to the solution are:
1 1 3 −1
0 2 1 0   −R1 + R2
0 2 1 7   −2R1 + R3
then
1 1 3 −1
0 2 1 0
0 0 0 7   −R2 + R3
The last row reads 0x1 + 0x2 + 0x3 = 7, an impossibility. The given equations are
inconsistent (incompatible); they do not have a solution.
Remark 24 Our convention is that an operation performed on row i is written next to row i
after the operation has been done.
Exercise 25 1. Write each expression as a product of a row and a column: (this can be
done in more than one way).
Solutions:
(a)
[ 5 2 −6 ] [ x1 ]
           [ x2 ]
           [ x3 ]
(b)
[ a c ] [ b ]   or   [ b d ] [ a ]   or   [ a d ] [ b ]   or ....
        [ d ]                [ c ]                [ c ]
(c) The usual way is
[ −8 −1 1 −2 3 ] [ x1 ]
                 [ x2 ]
                 [ x3 ]
                 [ x4 ]
                 [ x5 ]
(d)
[ x1 −7x2 9x3 ] [ y1 ]   or   [ y1 −7y2 9y3 ] [ x1 ]   or   [ y1 y2 y3 ] [ x1   ]   etc.
                [ y2 ]                        [ x2 ]                     [ −7x2 ]
                [ y3 ]                        [ x3 ]                     [ 9x3  ]
(a) = 0
(a)
[ 2  ]
[ 1  ]
[ 16 ]
(b)
[ −2 ]
[ 13 ]
[ −4 ]
(c)
[ −2x1 + 6x2 + x3  ]
[ 3x1 − 7x2 − 3x3  ]
[ 4x1 − 2x2 + 5x3  ]
(d)
[ −2x1 + 6x2 − 8x3 ]
[ 3x1 − 7x2 + 8x3  ]
[ 4x1 − 2x2 + 8x3  ]
4. In the following express (i) the column b as a single matrix expression in the form b = Ax,
and (ii) each row i of Ax as a product of row i of A and x. (See Chapter 3, subsection
3.1.6, where we look at this more thoroughly.)
(a)
b = [ −2 ] 5 + [  6 ] 2
    [  3 ]     [ −7 ]
    [  4 ]     [ −2 ]
(b)
b = [ −2 ] 5 + [  6 ] 2 + [  1 ] (−4)
    [  3 ]     [ −7 ]     [ −3 ]
    [  4 ]     [ −2 ]     [  5 ]
(c)
b = [ −2 ] x1 + [  6 ] x2 + [  1 ] x3
    [  3 ]      [ −7 ]      [ −3 ]
    [  4 ]      [ −2 ]      [  5 ]
Solutions:
(a)
b = [ −2  6 ] [ 5 ]    with rows   [ −2 6 ] [ 5 ] ,  [ 3 −7 ] [ 5 ] ,  [ 4 −2 ] [ 5 ]
    [  3 −7 ] [ 2 ]                         [ 2 ]             [ 2 ]             [ 2 ]
    [  4 −2 ]
(b)
b = [ −2  6  1 ] [  5 ]
    [  3 −7 −3 ] [  2 ]
    [  4 −2  5 ] [ −4 ]
with row i of b the product of row i of the matrix and the column (5, 2, −4).
(c)
b = [ −2  6  1 ] [ x1 ]
    [  3 −7 −3 ] [ x2 ]
    [  4 −2  5 ] [ x3 ]
with rows [ −2 6 1 ] x, [ 3 −7 −3 ] x and [ 4 −2 5 ] x.
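The row picture and the column picture of b = Ax give the same answer, which can be checked for part (b) in Python (an added illustration):

```python
# Part (b): A has rows (-2, 6, 1), (3, -7, -3), (4, -2, 5) and x = (5, 2, -4).
A = [[-2, 6, 1],
     [3, -7, -3],
     [4, -2, 5]]
x = [5, 2, -4]

# Row picture: entry i of b = Ax is (row i of A) . x.
b_rows = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

# Column picture: b is the linear combination of the columns with coefficients x.
cols = list(zip(*A))
b_cols = [sum(col[i] * xi for col, xi in zip(cols, x)) for i in range(3)]

assert b_rows == b_cols == [-2, 13, -4]
```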
3. For any scalar γ and i 6= j, add γ times equation i to equation j (add γ times row Ri of
the corresponding array to row Rj ).
The above three operations done on an array are called elementary row operations
(abbreviated as eros). Note that forming βRi is just like multiplying a vector in 3-space by a
scalar, except that now the vector has n + 1 entries. The type (3) operation replaces Rj with
γRi + Rj , where addition of rows is addition of vectors.
What is obvious (and assumed at school) is that if x is a simultaneous solution of the m
given equations before any one of the above eros is performed, then x remains a solution after
the operation is performed.
To see that the reverse is true, suppose we have just performed the elementary row operation
of type (3) and that x is a solution to the new system of equations. Adding −γ times equation
i (−γ times row Ri of the augmented array) to equation j (row Rj of the array) in the new
system of equations (array) brings us back to the system before the type (3) operation was
performed. This shows that x is also a solution to the old system.
For example, consider example 2.1.2 again and the first step toward a solution:
1 1 3 −1
1 1 4 −3
−1 1 −2 1
1 1 3 −1
0 0 1 −2 −R1 + R2
−1 1 −2 1
Adding row 1 to row 2 of this array restores the original array:
1 1 3 −1
1 1 4 −3 R1 + R2
−1 1 −2 1
The following more general setting refers to the array (2.21). Suppose we add γ times Row
2 to Row 3, obtaining
a11 a12 a13 a14 a15 b1
a21 a22 a23 a24 a25 b2
γa21 + a31 γa22 + a32 γa23 + a33 γa24 + a34 γa25 + a35 γb2 + b3 γR2 + R3
a41 a42 a43 a44 a45 b4
Then,
a11 a12 a13 a14 a15 b1
a21 a22 a23 a24 a25 b2
a31 a32 a33 a34 a35 b3 −γR2 + R3
a41 a42 a43 a44 a45 b4
brings us back to the original array.
Remark 26 For convenience we have been using Ri for row i of an array R. In the next
chapter we will use a slightly different notation for row i of a matrix.
(b) for the non-zero rows, the first non-zero entry in row i + 1 does not come earlier than
the first non-zero entry in row i.
Suppose that only the first r rows are non-zero and in row echelon form. The first non-zero
entry in row r is called the pivot element and row r is the pivot or reference row. The
reference row will be used in the next step.
We assume that (a) and (b) hold. If the remaining rows are zero we are finished. Otherwise,
multiples of row r are added to rows below it in such a way that all entries below the pivot
element become zeros. The first r + 1 rows are then in echelon form and this remains the case
after arranging rows r + 1, r + 2,... so that (a) and (b) hold.
The process ends when the whole array is in row echelon form and then each non-zero row has
a pivot element.
The columns of the array containing a pivot element are called pivot columns. (This will be
used in 2.2.15 below and also in subsection 3.4.13).
[ −3 −10 2 ]
[  7   5 6 ]
Solution (method 1)
1  10/3  −2/3    −(1/3)R1
7   5     6

1  10/3   −2/3    ref row, pivot = 1
0  −55/3  32/3    −7R1 + R2
Remark 28 This illustrates how the pivot element can always be made 1.
Solution (method 2)
−21 −70 14    7R1
 21  15 18    3R2
then
−21 −70 14
  0 −55 32    R1 + R2
Remark 29 This example illustrates how it is possible to obtain a fraction-free answer when
all entries are integers or even fractions.
Example 2.2.2 Reduce the array of detached coefficients to row echelon form and hence solve
[  1 0   2 ]     [ −1 ]
[  3 0   5 ] x = [  1 ]
[ −5 1 −10 ]     [  2 ]
Solution:
The augmented matrix (without brackets) is
1 0 2 −1 ref row
3 0 5 1
−5 1 −10 2
Elementary row operations (eros) bringing the detached array into echelon form are:
1 0 2 −1 1 0 2 −1
0 0 −1 4 −3R1 + R2 then 0 1 0 −3 R2 ↔ R3
0 1 0 −3 5R1 + R3 0 0 −1 4
From this row echelon form we find the unique solution
x = [  7 ]
    [ −3 ]
    [ −4 ]
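The solution can be verified by multiplying back (an added Python illustration):

```python
A = [[1, 0, 2],
     [3, 0, 5],
     [-5, 1, -10]]
b = [-1, 1, 2]
x = [7, -3, -4]

# Each row of Ax must reproduce the corresponding entry of b.
Ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]
assert Ax == b
```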
Solution:
The initial array of detached coefficients is
3 −3 1 10 2
2 −3 −1 3 0
−1 2 0 3 −6
1 −1 1 2 4
After R1 ↔ R4, then −2R1 + R2, R1 + R3 and −3R1 + R4:
1 −1 1 2 4
0 −1 −3 −1 −8
0 1 1 5 −2
0 0 −2 4 −10
then
1 −1 1 2 4
0 −1 −3 −1 −8
0 0 −2 4 −10   R2 + R3  new ref row
0 0 −2 4 −10
1 −1 1 2 4
0 −1 −3 −1 −8
0 0 −2 4 −10
0 0 0 0 0 − R3 + R4
This is in echelon form but we can simplify still further:
1 −1 1  2  4
0  1 3  1  8    −R2
0  0 1 −2  5    −(1/2)R3     (2.23)
0  0 0  0  0
Remark 30 Each column vector has 4 entries instead of 3 and so we cannot visualize these
vectors in space. Nevertheless, this should not be a problem. In Chapter 3 we will consider n-
dimensional vectors. See subsections 3.1.2 and 3.1.9.
Solution:
In order for (b) in 2.2.8 to hold, first interchange rows 1 and 4 of A:
2 −3 −4 3 −3 R1 ↔ R4 ref row
2 −2 −2 −3 −2
0 0 3 −2 −2
0 0 −3 −3 −1
Next:
68CHAPTER 2. MATRICES AND THE SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS
2 −3 −4 3 −3
0 1 2 −6 1 −R1 + R2
0 0 3 −2 −2 new ref row
0 0 −3 −3 −1
Finally,
2 −3 −4 3 −3
0 1 2 −6 1
(2.26)
0 0 3 −2 −2
0 0 0 −5 −3 R3 + R4
The matrix is now in row echelon form.
1. Express the following simultaneous equations in the form Ax = b and use Gauss-reduction
and detached coefficients to find solutions. Display the final array in row echelon form.
Also, express b as a linear combination of the columns of A whenever this is possible.
(a) i.
x1 − 2x2 = −1
3x1 − x2 = 1
ii.
x − 2y = −3
2x − 4y = −1
iii.
x − 2y = −3
2x − 4y = −6
iv.
x1 − 2x2 = −3
3x1 − x2 = 1
4x1 − 3x2 = 5
v.
x − 2y = −3
2x + y = 4
3x − y = 1
vi.
0x + 0y = 0
vii.
3x + 0y = −3
(b) i.
x1 − 2x2 − x3 = 0
3x2 + x3 = 1
ii.
x1 + x2 + x3 − x4 = 1
2x2 − x3 + 4x4 = −1
x4 = −2
iii.
x1 + 2x2 + x3 = −1
iv.
x1 + 2x2 + x3 − 5x4 = 0
3. (a) In solving simultaneous equations, if we drop an equation the resulting solution set
contains the original solution set and is usually larger. Illustrate this with an
example.
(b) Rule 3 of permissible row operations in 2.2.6 says: For any scalar γ and i ≠ j, add
γ times row Ri of the array to row Rj . Why must we have i ≠ j?
(c) In an array with rows R1 , . . . , Rm , let i ≠ j. Suppose that at least one of the numbers
α, β is not zero and form the row αRi + βRj . Under what conditions can this new
row replace row i or row j? Describe what is going on in terms of the strict rules of
2.2.6.
Solution:
Suppose that β ≠ 0. First form βRj . Now, with row i as the reference row, add αRi
to βRj to get the new row. Thus αRi + βRj can replace Rj if β ≠ 0. This will not
work if β = 0 (why?). If both α and β are non-zero, αRi + βRj can replace Ri or
Rj . In practice this can speed up calculations.
4. * (Simplified Elementary Row Operations). Consider the following simplified set of two
elementary row operations that can be applied to an array. Show that each row operation
in 2.2.6 can be obtained by applying a sequence of these two operations. In other words,
the two row operations are equivalent to those in 2.2.6.
Solution: To add α times row i to row j, first multiply row i by α, then add row i to
row j. The next thing to do is to multiply row i by α−1 . The process can be seen as

Ri    αRi      αRi        α−1 αRi       Ri
   →       →          →             =
Rj     Rj    αRi + Rj     αRi + Rj   αRi + Rj

To interchange rows Ri and Rj for i ≠ j do (for example) the following in order: Add
row i to row j, multiply row i by −1, add row j to row i, subtract row i from row j:

Ri        Ri        −Ri      Ri + Rj − Ri       Rj              Rj         Rj
   →           →          →                =         →                  =
Rj     Ri + Rj    Ri + Rj      Ri + Rj       Ri + Rj    −Rj + Ri + Rj      Ri
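The interchange sequence above can be replayed mechanically. Below is a sketch (using NumPy, not part of the original notes) in which the two simplified operations are the only primitives, and a row swap is built out of them exactly as in the text; subtraction is done as scale-by-(−1), add, scale back:

```python
import numpy as np

# The two "simplified" elementary row operations, acting on a copy:
# (1) multiply a row by a non-zero scalar, (2) add one row to another.
def scale(M, i, c):
    M = M.copy(); M[i] = c * M[i]; return M

def add_row(M, src, dst):
    M = M.copy(); M[dst] = M[dst] + M[src]; return M

M = np.array([[1.0, 2.0], [3.0, 4.0]])

S = add_row(M, 0, 1)    # (R0,      R0 + R1)
S = scale(S, 0, -1.0)   # (-R0,     R0 + R1)
S = add_row(S, 1, 0)    # (R1,      R0 + R1)
S = scale(S, 0, -1.0)   # (-R1,     R0 + R1)
S = add_row(S, 0, 1)    # (-R1,     R0)      <- "subtract row 0 from row 1"
S = scale(S, 0, -1.0)   # (R1,      R0)      the rows are interchanged

print(S)
```

Tracing the comments shows the swap uses nothing beyond the two simplified operations.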
This is the homogeneous system corresponding to Example 2.2.3. We found the row echelon
form of the detached coefficients for that example as (2.23). The row EF for the homogeneous
system (2.27) is therefore
1 −1 1 2 0
0 1 3 1 0
0 0 1 −2 0
0 0 0 0 0
(a)
     1  2 −3 −1
    −3  4  1  2
A = −2  6 −2  1
     4 −2 −4 −3

Solution:
row EF:
1  2 −3 −1
0 10 −8 −1
0  0  0  0
0  0  0  0

row REF:
1 0 −7/5   −4/5
0 1 −4/5  −1/10
0 0   0      0
0 0   0      0
(b)
     1 −3 −2  4
     2  4  6 −2
B = −3  1 −2 −4
    −1  2  1 −3

Solution:
row EF:
1 −3 −2  4
0 −1 −1  1
0  0  0  0
0  0  0  0

row REF:
1 0 1  1
0 1 1 −1
0 0 0  0
0 0 0  0
(c) How are the matrices A and B related? Solution:
The rows of B are written as the columns of A. In the language of Chapter 3, B is
the transpose of A.
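Since B is the transpose of A, the two matrices reduce to echelon forms with the same number of non-zero rows. A quick check of this (a NumPy sketch, not part of the original notes; "rank" is the Chapter 3 name for that number):

```python
import numpy as np

# A from part (a); B's rows are A's columns, i.e. B = A^T.
A = np.array([[1, 2, -3, -1],
              [-3, 4, 1, 2],
              [-2, 6, -2, 1],
              [4, -2, -4, -3]])
B = A.T

# Both row EFs above have exactly two non-zero rows.
rank_A = np.linalg.matrix_rank(A)
rank_B = np.linalg.matrix_rank(B)
print(rank_A, rank_B)
```

This anticipates the general fact (Remark 51) that row results and column results come in pairs.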
2. Let:
     1  1  2 1 −3
     2  1  5 2 −7
A =  0 −1  3 5 −4
    −2  1 −3 9  3

The columns of the above A are said to be linearly dependent since one column
can be expressed as a linear combination of the others.
Linear independence and dependence will be studied formally in Chapter 3, section
3.4. See also the theorem 2.2.15 below.
3. (a) Reduce the array of detached coefficients to echelon form and solve

 1 −1  1 2         4
 2 −2 −1 3   x =   0
−1  1  0 3        −6
 1 −2  2 5         5
Solution:
Row REF:
1 0 0 −13/2    1/2
0 1 0   5/2   −1/2
0 0 1   5/2   −3/2
0 0 0    0      0

General Solution:
     1/2 + (13/2)t
    −1/2 − (5/2)t
x = −3/2 − (5/2)t
          t
Solution:
     13/2
    −5/2
x = −5/2   t
      1
1 −1  1  2   4
0  0 −3 −1  −8

Row REF:
1 −1 0 5/3  4/3
0  0 1 1/3  8/3

Solution is:
    4/3 + s − (5/3)t
          s
x = 8/3 − (1/3)t
          t
Row EF:
1 −1  1  2  0
0  0 −3 −1  0

Row REF:
1 −1 0 5/3  0
0  0 1 1/3  0

The homogeneous solution is
    s + 5t
      s
x =   t
    −3t
Note that

4/3 + s − (5/3)t     4/3     s1 + 5t1
        s         =   0   +     s1
8/3 − (1/3)t         8/3        t1
        t             0      −3t1

This is just a simple change of variables: t = −3t1 , s = s1 .
6. (a) What are the solutions x to
3 1 1 4 −1
2 5 1 7 x = −1
1 9 1 10 2
?
Solution:
There are no solutions.
(b) Solve the corresponding homogeneous system.
Solution:
            s
     (1/4)s − (3/4)t
x = −(13/4)s − (13/4)t
            t

Here s and t are parameters.
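The two-parameter formula can be checked directly: for any choice of s and t it must land in the null space of the coefficient matrix. A short verification (NumPy sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[3, 1, 1, 4],
              [2, 5, 1, 7],
              [1, 9, 1, 10]], dtype=float)

# The solution formula from (b), for arbitrary parameters s and t:
def x(s, t):
    return np.array([s,
                     s / 4 - 3 * t / 4,
                     -13 * s / 4 - 13 * t / 4,
                     t])

for s, t in [(1.0, 0.0), (0.0, 1.0), (2.5, -3.0)]:
    print(A @ x(s, t))  # each product is the zero vector
```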
7. If you have one solution to 4a and the general solution of 4b, show that you can write
down the general solution of 4a.
8. The same for 5a and 5b.
9. Solve the following system for x and relate your solution to that of its homogeneous
counterpart.
 2  1  3  5         −1
−2 −1 −1 −2   x =    0
 4  2  7 11         −3
 6  3 10 16         −4
Solution
Detached coefficients:
2 1 3 5 −1
−2 −1 −1 −2 0
4 2 7 11 −3
6 3 10 16 −4
Row EF:
2 1 3 5 −1
0 0 4 6 −2
0 0 0 −2 −2
0 0 0 0 0
Row REF:
1 1/2 0 0  0
0  0  1 0 −2
0  0  0 1  1
0  0  0 0  0

General solution:
x1      t
x2    −2t
x3  =  −2
x4      1
Homogeneous equations
Detached: Last column (which could be omitted) becomes all zeros :
2 1 3 5 0
−2 −1 −1 −2 0
4 2 7 11 0
6 3 10 16 0
Row EF:
2 1 3 5 0
0 0 4 6 0
0 0 0 −2 0
0 0 0 0 0
Row REF:
1 1/2 0 0 0
0  0  1 0 0
0  0  0 1 0
0  0  0 0 0

Homogeneous solution:
x1      t
x2    −2t
x3  =   0
x4      0

General solution is
 0      t
 0    −2t
−2  +   0
 1      0
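The decomposition into a particular part plus a homogeneous part can be verified for No. 9 numerically (a NumPy sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[2, 1, 3, 5],
              [-2, -1, -1, -2],
              [4, 2, 7, 11],
              [6, 3, 10, 16]], dtype=float)
b = np.array([-1.0, 0.0, -3.0, -4.0])

# General solution: particular part (0, 0, -2, 1) plus the homogeneous
# part t*(1, -2, 0, 0), for any value of t.
for t in [0.0, 2.0, -3.5]:
    x = np.array([t, -2.0 * t, -2.0, 1.0])
    print(np.allclose(A @ x, b))
```

Every value of t produces a solution, matching the one-parameter family above.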
10. Can you state a general theorem of which Nos.(7), (8) and (9) are special cases?
Suppose that you know (any) one particular solution to Example 2.2.3 (matrix equation
(2.22)) and add to this the general solution (2.28) of the homogeneous Example 2.2.5.
What is the result? Consider equation (2.24). We will return to this in chapter 3. (See
Exercise 53, No.5).
11. (a) Solve the system for x, y, z:

 1  1 −1             −3
−2  3  4     x        7
−1  4  1     y   =   −2
 4 −1 −2     z        2
Solution:
Detached array:
1 1 −1 −3
−2 3 4 7
−1 4 1 −2
4 −1 −2 2
Row EF
1 1 −1 −3
0 5 2 1
0 0 1 3
0 0 0 1
There is no solution.
13. * (Allenby, p.30) For any a, b, c and d discuss the possible solutions x of

1 2 −1  3 −1         a
3 4  2  5 −2   x =   b
1 0  4 −1  0         c
3 2  7  1 −1         d
Detached:
1 2 −3 α
3 −1 2 β
1 −5 8 −α
Row EF:
1 2 −3 α
0 −7 11 β − 3α
0 0 0 −7α + 7β
Solution exists only for α = β. This gives
1 2 −3 α
0 −7 11 −2α
0 0 0 0
Hence
x3 = t
−7x2 + 11t = −2α,  so  x2 = (11/7)t + (2/7)α
x1 + 2((11/7)t + (2/7)α) − 3t = α,  so  x1 = −(1/7)t + (3/7)α

x1     −(1/7)t + (3/7)α
x2  =  (11/7)t + (2/7)α
x3            t
Solution
Detached:
1 1  λ 2
3 4  2 λ
2 3 −1 1

Row EF:
1 1    λ        2
0 1 2 − 3λ    λ − 6
0 0 3 − λ     λ − 3

(i) Unique solution if λ ≠ 3. (ii) For λ = 3 the array becomes

1 1  3  2
0 1 −7 −3
0 0  0  0

General solution to
1 1  3     x1       2
0 1 −7     x2   =  −3
0 0  0     x3       0

is
x1     5 − 10t
x2  =  −3 + 7t
x3        t
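For the unique-solution case λ ≠ 3 a numerical spot check is easy; here λ = 0 is chosen purely for illustration (a NumPy sketch, not part of the original notes):

```python
import numpy as np

# Take lambda = 0 (an arbitrary value different from 3).
lam = 0.0
A = np.array([[1.0, 1.0, lam],
              [3.0, 4.0, 2.0],
              [2.0, 3.0, -1.0]])
b = np.array([2.0, lam, 1.0])

x = np.linalg.solve(A, b)
print(x)  # the unique solution; note x3 = -1
```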
16. Let Ax = b represent a system of m equations for n unknowns. Which of the following is
a true statement? Give reasons, in particular, a counterexample if a statement is false.
(Usually the simplest counterexample is the best one.)
Solution:

              x1
(a)  [0  0]        =  [1]   has no solution.
              x2
2. If aij (the pivot entry) is the first non-zero entry in row i, then
2.2.13 Any matrix can be brought into row reduced echelon form by
suitable elementary row operations
Let A be in row echelon form. Suppose that A has r non-zero rows. We first divide each
non-zero row by its leading (first non-zero) entry. So we may suppose that the leading entry of
row r is arj = 1. Use row r for reference row and arj as a pivot element to reduce all entries
above it to zeros. Now repeat the process with row r − 1 etc., the last reference row being row
2. The resulting array is in row reduced echelon form (row REF).
Note that in converting a row EF to a row REF all the pivot elements remain in the same
positions but become 1s.
The process is known as the Gauss-Jordan method.
Remark 37 Although a matrix does not in general have a unique row echelon form (which
ones do?), it can be shown that the row reduced echelon form is unique.
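The Gauss-Jordan method described above can be sketched in code. The following is an illustrative implementation (not the notes' algorithm verbatim), using exact rational arithmetic from the standard library so no rounding intrudes; it is applied to the echelon form (2.23)'s coefficient columns:

```python
from fractions import Fraction

def rref(rows):
    """Gauss-Jordan elimination over exact rationals: bring the array
    to row reduced echelon form (row REF). A sketch of the method
    described above, not production code."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivot_row = 0
    for col in range(n):
        # find a row at or below pivot_row with a non-zero entry in col
        pr = next((r for r in range(pivot_row, m) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]        # interchange
        piv = M[pivot_row][col]
        M[pivot_row] = [x / piv for x in M[pivot_row]]   # make pivot 1
        for r in range(m):                               # clear the column
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == m:
            break
    return M

# The row EF of Example 2.2.3's coefficients reduces further to (2.30):
R = rref([[1, -1, 1, 2], [0, 1, 3, 1], [0, 0, 1, -2], [0, 0, 0, 0]])
print(R)
```

The pivots stay in the same positions and become 1s, exactly as noted above.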
Step 1:
10
1 3 − 23
0 1 − 32
55
3
− 55 R2 ref row
Step 2:
14
1 0 11 − 10
3 R2 + R1
0 1 − 32
55
Example 2.2.7 As a second example consider Example 2.1.2. Its row echelon form was found
in (2.16):
1 1 3 −1
0 2 1 0
0 0 1 −2
Example 2.2.8 Consider Example 2.1.3 again. From (2.18) we already have the row EF of
detached coefficients:
1 1 3 2 −1
0 2 1 1 0
0 0 1 −1 −2
Example 2.2.9 Example 2.1.4 (equation (2.7)) has a row echelon form (2.20):
1 1 3 −1
0 2 1 0
0 0 0 7
1 1  3   0   R3 + R1
0 1 1/2  0
0 0  0   1

1 0 5/2  0   −R2 + R1
0 1 1/2  0                                               (2.29)
0 0  0   1
Example 2.2.10 As a final example, consider Example 2.2.3. We found the row echelon form
(2.23):
1 −1 1 2 4
0 1 3 1 8
0 0 1 −2 5 ref row
0 0 0 0 0
Step 2
1 0 0 11 −8 R2 + R1
0 1 0 7 −7
0 0 1 −2 5
0 0 0 0 0
The array of detached coefficients of Example 2.2.3 therefore has row REF
1 0 0 11 −8
0 1 0 7 −7
0 0 1 −2 5
0 0 0 0 0
The required solution to Example 2.2.3 (matrix equation (2.22)) is (unsurprisingly) as before.
We note that the row REF of the coefficient matrix of the same example is
1 0 0 11
0 1 0 7
(2.30)
0 0 1 −2
0 0 0 0
The homogeneous solution (2.28) to Example 2.2.5 can again be read off this array.
Here the non-pivot columns are columns 2, 3 and 6. The general solution to Ax = 0 is
    −3x2 + 5x3 − 7x6     −3t1 + 5t2 − 7t3
          x2                    t1
          x3                    t2
x =      2x6          =        2t3
        −3x6                  −3t3
          x6                    t3
This shows that we are actually dealing with 3 equations for 4 unknowns. Column 4 of
the coefficient matrix is the only non-pivot column. Hence x4 = t can have any value and
we get the same solution as before (equation (2.28)).
Remark 39 This theorem will be used to prove the basic theorem on linear independence in
Chapter 3.
1. Find the row reduced echelon form of the matrices and arrays of detached coefficients
in Exercise (35), numbers 1a, 1b, 2, 3, 4, 9, 11 , 12 and 13. Solve (once more) the
problems involving systems of equations Ax = b using the row REF. Write down the row
REF of the corresponding coefficient matrix A. Illustrate the above theorem 2.2.15 in the
homogeneous cases.
Solution: Done in answers to previous exercise.
2. Let matrices A and B have the same number of rows. Let [A, B] be the matrix formed by
attaching the columns of B to A. Suppose that [A, B] has been reduced to row (reduced)
echelon form [A′ , B ′ ]. Show that A′ is then also in row (reduced) echelon form.
Hint 41 This should be clear from the observation that a zero row of A′ comes after a
non-zero row of A′ .
3. Consider the general problem of solving two linear equations for two unknowns x1 and x2 :
a11 a12 x1 b1
= (2.31)
a21 a22 x2 b2
(a) In (2.31) all the entries except x1 and x2 are supposed known. Using formal Gauss-
reduction find formulae for the unknowns x1 and x2 in terms of the other symbols.
For the purposes of reduction you may assume any number you like is non-zero.
In fact, conclude that if only d = a11 a22 − a21 a12 6= 0, then you have a formula for
x. Can you see that the solution must be unique?
Remark 42 The number a11 a22 − a21 a12 is called the determinant of the matrix
A = [aij ] and is denoted by |A|. For example,

3 −2
7  4   = (3)(4) − (7)(−2) = 26
Hint 43 Consider the array of detached coefficients and, assuming a11 ≠ 0, the step

a11            a12                       b1
 0   −(a21 a12)/a11 + a22    −(a21 b1)/a11 + b2    −(a21/a11)R1 + R2        (2.32)
Now assume

−(a21 a12)/a11 + a22 ≠ 0

and solve for x2 . Next, find x1 .

x2 = (−(a21 b1)/a11 + b2) / (−(a21 a12)/a11 + a22)
   = (−a21 b1 + b2 a11) / (−a21 a12 + a22 a11)

      a11  b1
   =  a21  b2    divided by |A|
Now substitute x2 = (−a21 b1 + b2 a11)/(−a21 a12 + a22 a11) into the equation
a11 x1 + a12 x2 = b1 and solve for x1 :

x1 = (b1 − a12 x2)/a11
   = (b1 − a12 (−a21 b1 + b2 a11)/(−a21 a12 + a22 a11))/a11
   = (b1 a22 − a12 b2)/(−a21 a12 + a22 a11)

      b1  a12
   =  b2  a22    divided by |A|
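The two determinant formulas just derived make a tiny solver; the sketch below (not part of the original notes) uses exact rationals from the standard library and is applied to the worked example that follows:

```python
from fractions import Fraction

def solve2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 system by the determinant formulas derived above.
    Assumes d = a11*a22 - a21*a12 is non-zero."""
    d = a11 * a22 - a21 * a12
    x1 = Fraction(b1 * a22 - a12 * b2, d)   # |b1 a12; b2 a22| / |A|
    x2 = Fraction(a11 * b2 - a21 * b1, d)   # |a11 b1; a21 b2| / |A|
    return x1, x2

# The example below: -5 x1 + 11 x2 = -2,  3 x1 + 7 x2 = 13
print(solve2(-5, 11, 3, 7, -2, 13))
```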
2.3. EXISTENCE THEOREM FOR SOLUTIONS TO LINEAR EQUATIONS 85
Solution:
−5 11    x1      −2                          157/68
              =      ,   Solution is:  x =
 3  7    x2      13                           59/68
2.4.3 Elementary row operations (eros). Row echelon and row re-
duced echelon form
(See subsections 2.2.6, 2.2.8 and 2.2.11).
Elementary row operations (eros), known as Gauss reduction, are done on the array C = [A, b],
reducing it to row echelon form (row EF):

C′ = [A′ , b′ ]

Further eros (Gauss-Jordan) done on C′ reduce it to row reduced echelon form (row REF):

C′′ = [A′′ , b′′ ]

The important fact about elementary row operations on a system of equations (or its detached
array of coefficients) is that they leave the solution set unchanged. For any particular x
the equation Ax = b is satisfied if, and only if, the equation A′ x = b′ is satisfied if, and only if,
the equation A′′ x = b′′ is satisfied.
A system of homogeneous equations with more unknowns than equations always has a non-
trivial solution (see 2.2.15).
Section 2.1 deals with the basic theorem on the existence of solutions to a general system of
linear equations.
Chapter 3

Linear Transformations and Matrices

In this chapter we will learn how solving a set of linear equations is the inverse
operation to applying a matrix A as a transformation or mapping. In this
sense A transforms a vector x into another vector y = Ax. Examples in ordinary space are
projections on planes and lines through the origin, reflections in such objects and rotations
about an axis through O (important for mechanics).
Next, we deal with the difficult idea of linear independence, and it is followed by a section
on subspaces that can be regarded as an introduction to the idea of a general ‘vector space’.
Finally, there is a section on inverses of matrices.
Alternative notation If A = [aij ], we will also find it most convenient to use the
notation
[A]ij = aij (3.1)
2. From the definitions it follows that ℜn×1 is the set of columns with n entries. We call
90 CHAPTER 3. LINEAR TRANSFORMATIONS AND MATRICES
For typographical reasons this row vector is sometimes written [a1 , ..., an ]. It is traditional
to use ℜn for both ℜn×1 (column vectors) and ℜ1×n (row vectors). We will only use the
notation ℜn if the context makes it quite clear whether rows or columns are meant.
3. The m × n zero matrix has 0 for all its entries and is written Om×n , or simply as O
when the size is understood.
4. Addition of matrices
Let A = [aij ] and B = [bij ] be of the same m × n size. Their sum A + B = C where the
(i, j)-entry of C is cij = aij + bij . In brief, using the notation of equation (3.1),

[A + B]ij = [A]ij + [B]ij
It is essential to note that the addition A + B makes sense only if A and B have the same
size.
5. Multiplication of a matrix by a scalar.
Let A = [aij ] and s ∈ ℜ. The product sA = As has the number saij in the (i, j) −position:
Remark 44 Notice that it makes no difference if we write the scalar s on the left or on
the right of A. The matrix −A is defined as (−1) A and has for (i, j) entry the number
−aij i.e., [(−1) A]ij = − [A]ij . The difference A − B of two matrices of the same size
means the same thing as A + (−B).
We will see that if a is a row vector (3.3) then it is more natural to write
sa = [sa1  sa2  ...  san ] rather than as.
On the other hand, for a column b, as in equation (3.2), it is better to write bs. See, for
example, Equation (3.23) below.
Remark 45 The symbols for row i and column j are consistent with our other notation.
Thus [A]ij = aij is at the intersection of row Ai• and column A•j .
Other notations in use for the ith row of A are Ai and [A]i , and for the j th column of
A, A(j) or [A]j . However, we will not use these.
3.1. THE ALGEBRA OF VECTORS AND MATRICES 91
8. The unit column vectors ej ∈ ℜn×1 (j = 1, ..., n) have a 1 in position j and zeros
elsewhere:

       0
       .
       .
ej =   1    ← row j                                      (3.7)
       .
       .
       0
9. The n × n Identity matrix is the n × n matrix with columns e1 , ... , en (in that order).
It is denoted by In (or simply I if n is understood):

       1 0 ... 0
       0 1 ... 0
In =   . .  .  .                                         (3.8)
       . .   . .
       0 0 ... 1
As we will see, In behaves like the number 1 when we multiply matrices. (See Exercise
63 No.6). We note that, writing I = In , by definition I•j = ej for j = 1, · · · , n. The unit
row vectors are

Ii• = [0  · · ·  1 (position i)  · · ·  0]   (i = 1, · · · , n)
10. The transpose of the m × n matrix A = [aij ] is denoted by AT (“A transpose”). The
transpose AT has for its (i, j) entry the number aji . Thus AT is an n × m matrix and
can also be loosely described as ‘the matrix A with its rows and columns interchanged’. In
fact, using the notation (3.1),

[AT ]ji = [A]ij    (i = 1, ..., n; j = 1, 2, ..., m)     (3.9)
11. The product a b of the row a with n entries and the column b with n entries is

                          b1
a b = [a1 a2 ... an ]     b2    = a1 b1 + a2 b2 + · · · + an bn      (3.10)
                          ..
                          bn
Remark 46 This was introduced in subsection 2.2.3 of Chapter 2. You can, if you wish,
call this a ‘dot product’ since it is a generalization of the dot product Equation (1.4) of
Chapter 1, but there is no need here for the ‘dot’. Note the order in a b: first a, then b.
This is important, as ba will turn out to be something quite different: in fact, an n × n
matrix. (See No.4 of Exercise 63).
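The warning in Remark 46 is easy to see numerically. In the sketch below (NumPy, not part of the original notes) a is a 1 × 3 row and b a 3 × 1 column, so a b is 1 × 1 while b a is 3 × 3:

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0]])       # a row with 3 entries (1 x 3)
b = np.array([[4.0], [5.0], [6.0]])   # a column with 3 entries (3 x 1)

ab = a @ b   # 1 x 1: the scalar a1*b1 + a2*b2 + a3*b3 = 32
ba = b @ a   # 3 x 3: a genuinely different object, as Remark 46 warns

print(ab)
print(ba.shape)
```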
v = u1 s1 + u2 s2 + · · · + uk sk (3.11)
We then say that v depends linearly on u1 , u2 ,. . . ,uk . Of course, we can also write
v = s1 u1 + s2 u2 + · · · + sk uk , with the coefficients on the left, but, as remarked, we prefer
to write the coefficients of columns on the right. If v and u1 , u2 ,. . . ,uk are row vectors
all of the same size our preference would be
v = s1 u1 + s2 u2 + · · · + sk uk
s1 A1 + s2 A2 + ... + sk Ak
4.
−1  5/4  13       2  −7/4  −7       1  −1/2    6
               +                 =
 2  −4   −7      −3   5/2  −7      −1  −3/2  −14

5.
−1  5/4  13                   −1  5/4  13      −100  125  1300
              (100) = (100)                 =
 2  −4   −7                    2  −4   −7       200 −400  −700
6. (a) Let
     −1    2  −3/2
A =  5/4  −4    5                                        (3.12)
     13   −7   3/8

Then
A1• = [−1  2  −3/2]
A3• = [13  −7  3/8]
(b) Let
      7/2
     −3/2
X =    9
     −5/6

then
X2• = [−3/2]
X4• = [−5/6]

Remark 47 We usually use an ‘underline’ notation like (3.2) for column vectors but
will occasionally write X2 = X2• = −3/2 if it is understood that X is a column matrix
(i.e. a column vector). Similar remarks apply to row vectors (i.e. row matrices).
(b) If
Y = [5  −7/2  −9  10  −23]
then
Y•2 = [−7/2]
Y•5 = [−23]
9. We have
1 0 0
I3 = 0 1 0
0 0 1
and
      1 0 0 0
      0 1 0 0
I4 =  0 0 1 0
      0 0 0 1
(b) Another, using a general 2 × 3 matrix A = [aij ], and a general 3 × 4 matrix B = [bij ]
is
Ai• B•j = ai1 b1j + ai2 b2j + ai3 b3j (i = 1, 2; j = 1, 2, 3, 4)
Note that such a product is only possible when the number of columns of A equals
the number of rows of B. The significance of such products will be seen in subsection
3.3.4 below.
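The significance promised in 3.3.4 can be previewed: arranging all the products Ai• B•j in a table gives exactly the matrix product. A sketch (NumPy, not part of the original notes):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(2, 3)).astype(float)   # a general 2 x 3 matrix
B = rng.integers(-5, 5, size=(3, 4)).astype(float)   # a general 3 x 4 matrix

# Each (row i of A) . (column j of B) is one product Ai.B.j; arranged
# in matrix form they coincide with the matrix product A B.
P = np.array([[A[i, :] @ B[:, j] for j in range(4)] for i in range(2)])
print(np.allclose(P, A @ B))
```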
1. (a) In ℜ4×1
7
−5 2 11 35
1 − 32 2 1 − 17
2 (−6) + 2
−3 9 (3) + −4 − 2 = 47
2
3 − 56 −1 −6
(b) In ℜ1×5
(−2) [p  q  1/2  −3  s] + 4 [−1  r + 2  −1/4  5/2  1]
    = [−2p − 4   −2q + 4r + 8   −2   16   −2s + 4]
The span of the columns of A is the whole of ℜ4×1 , since for any choices of b1 , b2 , b3 and
b4 ,

b1     1             1             1             1
b2     0             1             1             1
b3  =  0 (b1 − b2) + 0 (b2 − b3) + 1 (b3 − b4) + 1 b4
b4     0             0             0             1
See also subsections 3.1.6 and 3.1.9 below.
3. In ℜ2×2
−1 2          0  8             −1 2           0  8      −7 −10
      (7) +        (−3) = (7)        + (−3)          =
 3 6         −2  5              3 6          −2  5      27  27
4. The row space of
   −1  3 1
    2 −4 3
consists of all row vectors of the form α [−1 3 1] + β [2 −4 3], where α and β are
arbitrary scalars.
x = e1 x1 + e2 x2 + ... + en xn
(Compare 1.1.17 in Chapter 1). Hence it is trivial that every x depends linearly on the
unit vectors e1 , e2 ,..., en , in other words that these unit vectors span ℜn×1 . However,
there are many other sets of vectors that have a similar property, as we shall see. In
Chapter 1, No.17 of Exercise 2 we indicated that any three mutually orthogonal vectors
in ℜ3 span the whole of space. You are asked to show this rigorously in Exercise 76, No.9.
11. (AT )T = A for any matrix A.

12. Referring to the product (3.10),

(a b)T = bT aT                                           (3.13)
13. (a) Let u and v be column vectors with n entries and a a row with n entries. Then
a (u + v) = a u + a v (3.14)
(b) If s is a scalar,
a (us) = (a u) s = (sa) u (3.15)
(c) Equations (3.14) and (3.15) combine so that for all scalars s and t:

a (us + vt) = (a u) s + (a v) t                          (3.16)
(d) More generally, Equation (3.16) can be generalized as follows: Suppose u1 , u2 , .., uk
are column vectors in ℜn×1 and s1 , s2 , .., sk are scalars and a ∈ ℜ1×n is a row
vector. Then

a (u1 s1 + u2 s2 + · · · + uk sk ) = (a u1 ) s1 + (a u2 ) s2 + · · · + (a uk ) sk      (3.17)
14. (a) Quite similarly, if a and b are row vectors with n entries and u is a column with n
entries,
(a + b) u = a u + b u (3.18)
(b) and for scalars t,

(ta) u = t (a u)                                         (3.19)

(c) Equations (3.18) and (3.19) combine so that for all scalars s and t:

(sa + tb) u = s (a u) + t (b u)                          (3.20)

(d) The result (3.20) can be generalized as follows: Suppose a1 , a2 , .., ak are rows in
ℜ1×n and t1 , t2 , .., tk are scalars and u ∈ ℜn×1 is a column, then

(t1 a1 + t2 a2 + · · · + tk ak ) u = t1 (a1 u) + t2 (a2 u) + · · · + tk (ak u)         (3.21)
The (i, j) entry in s (A + B) is s (aij + bij ). But s (aij + bij ) = saij + sbij , which is the
(i, j)-entry in sA + sB. Since s (A + B) and sA + sB have the same entries, they are equal.
Using the alternative notation (equation 3.1),

[s (A + B)]ij = s [A + B]ij = s ([A]ij + [B]ij )
             = s [A]ij + s [B]ij = [sA]ij + [sB]ij = [sA + sB]ij
a (u + v) = a1 (u1 + v1 ) + · · · + an (un + vn )
= (a1 u1 + a1 v1 ) + · · · + (an un + an vn )
= (a1 u1 + · · · + an un ) + (a1 v1 + · · · + an vn )
= au + av
Or, using Σ-notation,

             n                  n
a (u + v) =  Σ ai (ui + vi ) =  Σ (ai ui + ai vi )
            i=1                i=1

             n          n
          =  Σ ai ui +  Σ ai vi
            i=1        i=1

          = au + av
Proof of equation (3.15). Again with obvious notation, the j th entry in the vector us is
uj s. Hence
Proof of equation (3.16). By equation (3.14), a (us + vt) = a ( us) + a (vt). Using equation
(3.15), a ( us) + a (vt) = (a u) s + (a v) t.
1. Consider 21 in Chapter 2 and the matrices in 2.2.1 referred to there. For each of these
as well as for the following A ∈ ℜm×n find m and n and, using our notation in equation
(3.5) and equation (3.6), write down all rows and columns of A.
        −2    5
(a) A =  3  −7/4
        −1   1/2

         4   7   0    5
(b) A =  0   9  −1   3/4
       −3/5  1  2/3   0

        p f c
(c) A = b q p
        d f r

        −3  −1   8   0
         2  −7   9   1
(d) A =  0  2/3 −2   0
         1  −4   5  −6
1 −4 5 −6
3. Find:
1
−1 2 1 0
3 1 −2 3
(a) 2 5
2 2 + − 2 (−6) + 1 5 +
(−1)
3 10 −1
0 − 61 −1 4
6 −2 11 0 0 6
(b) 4 −1 8 + 3 2 1 −2 9 4
7 2 −4 5 −3 7
1
2
(c) 2 + p 2 −4 6 3 − 3 (2p + q) −3 9 18 −6 + 41 (p − q) 0 −24 8 6
Solution:
[ 1 + 6p + 2q   −2 − 22p   3 − 16p − 14q   3/2 + (25/2)p + (5/2)q ]
4. Verify equation (3.16) in case a = [−3  1/2  1/2  −1], u = [2  −1  −3  4]T ,
v = [−5/6  1/3  −1/2  1]T , s = 2 and t = −6.
5.
Find the transposes
of the matrices
− 56 −2 12
1 v −3 7 a
8 −9 , x −u c d and .
1 −9 w 0 z
−7 2 3
6. For the matrices A in No.5 compare Ai• and (AT )•i as well as A•j and (AT )j• . Can
we say that [Ai• ]T = (AT )•i and [A•j ]T = (AT )j• ? Are these true for any matrix A?
Solution: Yes.
7. Prove the above basic properties in subsection 3.1.4. For equation (3.17) and equation
(3.21) you may assume k=3, since the proofs of the general cases are quite similar.
Hint 49 Study the above proofs in 3.1.5 as well as your own proofs of the statements in
1.1.6 and 1.1.9 of Chapter 1. Apart from the fact that Chapter 1 concentrated on vectors
in ℜ1×3 or ℜ1×2 most of the proofs for general vectors go through almost word-for-word.
9. Show that, provided A and B are of the same size, (A + B)T = AT + B T . (Use the
notation of equation 3.1).
Solution:
[(A + B)T ]ji = [A + B]ij = [A]ij + [B]ij = [AT ]ji + [B T ]ji = [AT + B T ]ji .
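The entry-by-entry argument can be confirmed on concrete matrices (a NumPy sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, 0.5], [0.0, 4.0]])
B = np.array([[2.0, 2.0], [-1.0, 1.0], [5.0, -3.0]])

# Entry by entry this is the computation in the solution above:
# [(A+B)^T]_{ji} = [A]_{ij} + [B]_{ij} = [A^T + B^T]_{ji}.
print(np.array_equal((A + B).T, A.T + B.T))
```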
10. For the following pairs A, B of matrices find all possible products Ai• B•j and arrange
them in matrix form in what you think is a nice way.
(a)
        2 −9            8  2 −1
    A =       ,   B =
        3  1           −5  4  7
(b)
        −1  5
                        −1 2  1 0
    A =  3  0  ,  B =
                         3 0 −4 7
         2 −4
Solution:
61 −32 −65
19  10   4
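Reading pair (a) as A = [2 −9; 3 1] and B = [8 2 −1; −5 4 7] (an assumption, since the printed display is ambiguous), the products arranged in matrix form match the stated solution; a NumPy sketch, not part of the original notes:

```python
import numpy as np

# Pair (a), as read above: the nice arrangement of the six products
# Ai.B.j is exactly the matrix product A B.
A = np.array([[2, -9], [3, 1]])
B = np.array([[8, 2, -1], [-5, 4, 7]])

print(A @ B)
```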
11. Show that with appropriate definitions in ℜn×1 (or in ℜ1×n ) properties 1 - 5 of subsection
1.1.13 in Chapter 1 are valid. Notice in particular No.4, which allows one to define the
length |x| of a vector x ∈ ℜn×1 as |x| = √(xT x). This in turn gives us a definition of the
distance between a and b as |a − b|.
Solution:
The key fact is that |x| = 0 only if x = 0.
This view was anticipated in Chapter 2, for example in equations (2.17), (2.19) and (2.25).
Like the previous view, it can immediately be seen from the definitions. The following examples
should make this clear.
So
            x1
 −3  6  5   x2      −3x1 + 6x2 + 5x3      −3        6       5
                =                     =       x1 +     x2 +    x3
  2 −4  7   x3       2x1 − 4x2 + 7x3       2       −4       7
Example 3.1.2 As a more general example, consider the general system of 4 equations for five
unknowns, equation (2.11) of the previous chapter. We found there equation (2.12):

       A1• x     a11 x1 + a12 x2 + a13 x3 + a14 x4 + a15 x5
       A2• x     a21 x1 + a22 x2 + a23 x3 + a24 x4 + a25 x5
Ax =         =
       A3• x     a31 x1 + a32 x2 + a33 x3 + a34 x4 + a35 x5
       A4• x     a41 x1 + a42 x2 + a43 x3 + a44 x4 + a45 x5
Example 3.1.3 In the previous example let x = ej be the unit vector of equation (3.7) with
n = 5 entries. Then Ae2 = A•2 (the second column of A) since x2 = 1 and x1 = x3 = x4 =
x5 = 0. Similarly, for j = 1, · · · , 5, we have Aej = A•j .
Solving a system of equations Ax = b for x is exactly the same as looking for coefficients xj so
that b is a linear combination of the columns of A:
Ax = b if, and only if b = A•1 x1 + A•2 x2 + ... + A•n xn
In terms of the concept of ‘span’ (see item 2 in subsection 3.1.1), this is saying that b is in the
span of the columns of A:
b ∈ sp(A•1 , A•2 , . . . , A•n )
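The column-combination reading of Ax can be checked directly (a NumPy sketch, not part of the original notes; the particular numbers are chosen only for illustration):

```python
import numpy as np

A = np.array([[-3.0, 6.0, 5.0],
              [2.0, -4.0, 7.0]])
x = np.array([1.0, -2.0, 0.5])

# A x equals the linear combination of A's columns with coefficients x_j.
combo = sum(A[:, j] * x[j] for j in range(A.shape[1]))
print(np.allclose(A @ x, combo))
```

So solving Ax = b really is the search for coefficients placing b in the span of the columns.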
Solution:
To answer this, we change the vectors into column vectors (i.e. we use uT , v T , aT ) and try to
solve the following matrix equation for x:
1 0               1
3 1      x1       5
0 1      x2   =   2
5 1               7
In other words, we try to express the column on the right as a linear combination of the
columns of the 4 × 2 coefficient matrix. Use the array of detached coefficients:
1 0 1
3 1 5
0 1 2
5 1 7
1 0 1
0 1 2 −3R1 + R2
0 1 2
0 1 2 −5R1 + R4
This is equivalent to
1 0 1
0 1 2
0 0 0 −R2 + R3
0 0 0 −R2 + R4
Hence x2 = 2 and x1 = 1, and aT = uT + v T 2, so a = u + 2v and a depends linearly on u and
v.
Remark 50 We are in the habit of solving equations in the form Ax = b (so we are taking
linear combinations of columns), but we could just as well have left the vectors as rows: Then
we would be taking linear combinations of rows and using elementary column operations.
Remark 51 Any result about columns has a corresponding result about rows and vice-versa.
This is a fact you will appreciate more and more as we go ahead.
Solution:
Solve the homogeneous linear system
2 5 −11 x1 0
−3 −6 12 x2 = 0
4 7 −13 x3 0
x = 0 is in any case the (trivial) solution. As usual, to find the general solution, use Gauss-
reduction to find a row echelon form of the coefficient matrix A:
2 5 −11
0 1 −3 (3.26)
0 0 0
and the general solution with parameter t is
x1 −2t
x2 = 3t (3.27)
x3 t
Since each coefficient −2, 3 and 1 is non-zero, every column of A is linearly dependent on
the other two columns. For example,

        2            5            −11
A•1 =  −3  = (3/2)  −6   + (1/2)   12
        4            7            −13
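That column relation is easy to confirm numerically (a NumPy sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[2.0, 5.0, -11.0],
              [-3.0, -6.0, 12.0],
              [4.0, 7.0, -13.0]])

# The non-trivial homogeneous solution (-2, 3, 1) rearranges to
# A.1 = (3/2) A.2 + (1/2) A.3:
lhs = A[:, 0]
rhs = 1.5 * A[:, 1] + 0.5 * A[:, 2]
print(np.allclose(lhs, rhs))
```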
Solution:
Reduce A to row-echelon form:

2  5   1                       2  5   1                2 5 1
0 3/2 3/2   (3/2)R1 + R2   ,   0 3/2 3/2           ,   0 1 1   (2/3)R2
0 −3  −4    (−2)R1 + R3        0  0  −1   2R2 + R3     0 0 1   (−1)R3
So a row EF for A is
2 5 1
C= 0 1 1
0 0 1
Hence no column of A can depend linearly on the other two. For example, should we have
A•1 = A•2 β + A•3 γ, then

      1
x =  −β
     −γ

would be a non-trivial solution to Ax = 0 because 1 ≠ 0.
Remark 52 Questions of the sort “which columns of A are linearly dependent on other columns
of A?” are further explored below in 3.4.11.
When Ax = 0 has only the trivial solution x = 0 we will call the columns of A linearly inde-
pendent. See subsection 3.4.1.
Proof
Consider the ith entry of the left-hand side of (3.30). By equation (3.16) this is
As (Ai• u1 ) s1 + (Ai• u2 ) s2 is by definition the ith entry of the right-hand side of equation
(3.30), the result follows.
More generally, equation (3.30) easily extends to the following general result:
Let A be an m × n matrix and suppose u1 , u2 , ..,uk are vectors in ℜn = ℜn×1 and s1 , s2 ,
..,.sk are scalars. Then
2. Let A be m × n. Show that βA2• + A1• is a linear combination of A2• , · · · , Am• if, and
only if, A1• is a linear combination of A2• , · · · , Am• .
(a) Show that if the system is homogeneous i.e. b = 0, then w is also a solution, i.e.
Aw = 0
Solution:
Aw = A (v 1 α1 + v 2 α2 + · · · + v k αk ) = (Av 1 ) α1 + (Av 2 ) α2 + · · · + (Av k ) αk = 0α1 +
· · · 0αk = 0
(b) Find a simple example to show that (a) may fail if b ≠ 0.
Solution:
Simplest of the simple: take A to be 1 × 1, A = [1] and let b = [1]. Then x = [1] is a
solution to Ax = b, but A (x + x) ≠ b.
(c) If α1 + α2 + · · · + αk = 1 show that Aw = b.
Solution:
Aw = A (v 1 α1 + v 2 α2 + · · · + v k αk ) = (Av 1 ) α1 + (Av 2 ) α2 + · · · + (Av k ) αk
= bα1 + · · · + bαk = b (α1 + · · · + αk ) = b1 = b.
4. (a) Consider Exercise 35 of Chapter 2, Numbers 3, 4a, 5a and 6, 11a and 11b. These
require solutions to a matrix equation Ax = b. For which of these is b a linear
combination of the columns of A?
Consider those for which b is a linear combination of the columns of A. For which
of these are the coefficients in the linear combination unique? For those which are
not, express b in two ways as linear combinations of the columns of A.
Solution: See solutions to this exercise in Chapter 2.
(b) * In Problem 13 from Exercise 35 of Chapter 2. For which values of a, b, c, d is
T
a b c d a linear combination of the columns of A?
Solution: See solutions to this exercise in Chapter 2.
5. Prove the following theorem, which was anticipated in No.10 of Exercise 35 in Chapter 2.
Let Ax = b represent m equations in n unknowns. Suppose that the system has at least
one particular solution, say p. Then if Av = 0 so is x = p + v a solution and all solutions
to Ax = b have this form.
Conclude that if m < n and Ax = b has at least one solution, then the system has
infinitely many solutions. In fact, if there are k non-pivot columns then the general
solution contains k independent parameters. (Use 2.2.15 from Chapter 2.)
Solution:
If Ap = b and Av = 0 then A (p + v) = Ap + Av = b + 0 = b. Conversely, suppose that
x = q is a solution, i.e., Aq = b. Let v = q − p. Then Av = 0 and q = p + v. The
conclusion follows directly from 2.2.15. The k parameters are those of the homogeneous
solution.
Remark 54 If c is a fixed column vector with m entries then x → c+Ax is also a mapping from
ℜn to ℜm . However, we will concentrate mainly on the case c = 0 (linear transformations).
Example 3.2.1 Consider the Cartesian plane ℜ2 . In equation (1.28) of Chapter 1 we found
the reflection of the point (p1 , p2 ) in the line ℓ lying in the x − y plane and which passes through
the origin making an angle of 30◦ with the positive x−axis:

p′ = ( (1/2)p1 + (√3/2)p2 ,  (√3/2)p1 − (1/2)p2 )

Hence considered as a mapping, the matrix

      1/2    √3/2
A =
      √3/2  −1/2
transforms the point x = (x1 , x2 )T into its reflection in the above line:

         x1      (1/2)x1 + (√3/2)x2
y = A        =
         x2      (√3/2)x1 − (1/2)x2

(We have written p and p′ as columns x and y respectively). See Figure 3.1. Geometrically, it
is clear that the range of this reflection is ℜ2 . This can be seen algebraically by showing that,
whatever the values of y1 and y2 , the matrix equation

(1/2)x1 + (√3/2)x2      y1
                     =
(√3/2)x1 − (1/2)x2      y2

can be solved for x1 and x2 .
Figure 3.1: the reflection y = Ax of a point in the line ℓ through O making 30◦ with the x1 -axis.
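Two geometric facts about this reflection are easy to verify numerically (a NumPy sketch, not part of the original notes): points on ℓ are left fixed, and reflecting twice returns every point to itself, i.e. AA = I:

```python
import numpy as np

s3 = np.sqrt(3.0)
A = np.array([[0.5, 0.5 * s3],
              [0.5 * s3, -0.5]])   # reflection in the 30-degree line

# A point on the line itself is left fixed...
p = np.array([np.cos(np.pi / 6), np.sin(np.pi / 6)])
print(np.allclose(A @ p, p))

# ...and reflecting twice is the identity: A A = I.
print(np.allclose(A @ A, np.eye(2)))
```

The second fact also shows algebraically why the range is all of ℜ2 : every y is the image of Ay.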
Example 3.2.4 Consider a fixed column vector a with three entries and non-zero column ma-
trix

      u1
A =   u2
      u3

The mapping A : ℜ = ℜ1 → ℜ3 , t → a + At, represents a parametric equation of a straight
line in space passing through the point a, as in Chapter 1, 1.2. The only difference is that here
we are using columns instead of rows.
and represents a rotation of the point with position vector x through 90◦ (anticlockwise looking
down on the plane). Convince yourselves of this by looking at a few values of x, but we will
return to this below in 3.3.11.
Example 3.2.8 The matrix

     1  α
A =
     0  1

interpreted as a mapping, leaves x2 fixed and moves x1 to x1 + αx2 . It is known as a shear
transformation.

       x1 + αx2
Ax =
          x2
See Figure 3.2 where α > 0. The box OBCD is transformed into the quadrilateral OEF D.
The range is again ℜ2 , as can very easily be seen. (Exercise 57 No.3).
Figure 3.2: the box OBCD and its image under the shear, the quadrilateral OEF D.
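The action on the box can be traced corner by corner (a NumPy sketch, not part of the original notes; the value of α is chosen only for illustration):

```python
import numpy as np

alpha = 1.5
A = np.array([[1.0, alpha],
              [0.0, 1.0]])

# Corners of the unit box OBCD as columns: x2 is unchanged by the
# shear, while x1 moves to x1 + alpha * x2, so the top edge slides
# sideways by alpha.
corners = np.array([[0, 0], [0, 1], [1, 1], [1, 0]], dtype=float).T
print(A @ corners)
```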
The mapping (3.33) is a linear transformation if, and only if, for all u and v in ℜn and all
scalars s,

T (u + v s) = T (u) + T (v) s                            (3.34)

For it is obvious that if (3.34) holds then so will

T (u + v) = T (u) + T (v)                                (3.35)

and

T (u s) = T (u) s                                        (3.36)

On the other hand, suppose that these last two conditions hold for the mapping T . Then for
any u1 and u2 and scalars s1 and s2 ,
T (u1 s1 + u2 s2 ) = T (u1 s1 ) + T (u2 s2 ) = T (u1 ) s1 + T (u2 ) s2
T (x) = Ax    (x ∈ ℜn )                                  (3.39)

Here the matrix A is defined by A•j = T (ej ) for j = 1, · · · , n. This is saying that A is the
m × n matrix

A = [ T (e1 )  T (e2 )  ...  T (en ) ]

The matrix A in (3.39) is obviously unique.
3.3.3 A linear transformation T is completely determined by its effect T (ej ) on the
unit vectors ej

This is just a restatement of equation (3.38).
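This determination can be demonstrated constructively: apply T to the unit vectors, stack the results as columns, and the resulting matrix reproduces T everywhere. A sketch (NumPy, not part of the original notes; the 90-degree rotation is chosen just for illustration):

```python
import numpy as np

# A linear T, here rotation through 90 degrees anticlockwise.
def T(x):
    return np.array([-x[1], x[0]])

n = 2
# Columns A.j = T(e_j), built from the unit vectors (rows of the identity).
A = np.column_stack([T(e) for e in np.eye(n)])

x = np.array([3.0, -4.0])
print(np.allclose(A @ x, T(x)))  # the matrix reproduces T on every x
```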
3.3. LINEAR TRANSFORMATIONS 109
Solution:
Writing x1 = p1 , x2 = p2 , x3 = p3 and using column vectors, the above projection formula
becomes

q1       2/3   1/3  −1/3     x1
q2   =   1/3   1/6  −1/6     x2
q3      −1/3  −1/6   1/6     x3

This is the mapping x → q = Ax and is indeed a linear transformation.
Example 3.3.2 Find a matrix representation of the mapping which sends the point P to the
foot of the perpendicular from P to the plane x1 + 2x2 + x3 = −4. Show that this mapping is
not linear but that the projection of P on the plane x1 + 2x2 + x3 = 0 is a linear transformation.
Find the matrix that reflects in this plane.
Solution:
From equation (1.23) in Chapter 1, with P = (p1 , p2 , p3 ) we found for the foot Q
\[
q = (p_1 , p_2 , p_3 ) - \frac{4 + p_1 + 2p_2 + p_3}{6}\,(1, 2, 1)
\]
Letting x1 = p1 , x2 = p2 , x3 = p3 , and again using column vectors, this becomes
\[
\begin{pmatrix} q_1 \\ q_2 \\ q_3 \end{pmatrix} = \begin{pmatrix} x_1 - \tfrac23 - \tfrac{x_1 + 2x_2 + x_3}{6} \\ x_2 - \tfrac43 - \tfrac{2(x_1 + 2x_2 + x_3)}{6} \\ x_3 - \tfrac23 - \tfrac{x_1 + 2x_2 + x_3}{6} \end{pmatrix}
\]
1. Find a matrix formula for the foot Q of the perpendicular from P = x ∈ ℜ3 to the plane
2x + z − 3 = 0. Do the same for the plane 2x + z = 0, showing in the latter case that we
get a linear transformation. Write down the matrix A of this transformation, as well as
the matrix that reflects in the plane 2x + z = 0.
Hint 58 Refer to 1d, equation (1.25) from Exercise 14 in Chapter 1. There you found
the foot Q of the perpendicular from (p1 , p2 , p3 ) to the plane 2x + z − 3 = 0:
\[
q = \left( \tfrac15 p_1 - \tfrac25 p_3 + \tfrac65 ,\;\; p_2 ,\;\; \tfrac45 p_3 - \tfrac25 p_1 + \tfrac35 \right)
\]
Solution:
\[
q = \begin{pmatrix} \tfrac65 \\ 0 \\ \tfrac35 \end{pmatrix} + \begin{pmatrix} \tfrac15 & 0 & -\tfrac25 \\ 0 & 1 & 0 \\ -\tfrac25 & 0 & \tfrac45 \end{pmatrix} x
\]
Not linear, as the value of q at x = 0 is not 0. For the second plane, which passes through
O, the projection is
\[
q = \begin{pmatrix} \tfrac15 & 0 & -\tfrac25 \\ 0 & 1 & 0 \\ -\tfrac25 & 0 & \tfrac45 \end{pmatrix} x
\]
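As a quick numerical check, the two maps can be compared at x = 0. This is a sketch in plain Python with exact fractions (the helper name `foot` and the parametrised constant term are our choices, not the text's):

```python
from fractions import Fraction as F

# Foot of the perpendicular from p to the plane 2x + z = d, normal n = (2, 0, 1):
# q = p - ((n . p - d) / |n|^2) n, with |n|^2 = 5.
def foot(p, d):
    n = [F(2), F(0), F(1)]
    t = (sum(ni * pi for ni, pi in zip(n, p)) - d) / F(5)
    return [pi - t * ni for pi, ni in zip(p, n)]

zero = [F(0), F(0), F(0)]
q_affine = foot(zero, F(3))  # plane 2x + z - 3 = 0: sends 0 to (6/5, 0, 3/5)
q_linear = foot(zero, F(0))  # plane 2x + z = 0: sends 0 to 0, as a linear map must
```

The first map moves the origin, so it cannot be linear; the second fixes it, in line with the solution above.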
2. (a) Consider equation (1.12) in subsection 1.2.5 of Chapter 1 for the projection of a
point on a line ℓ that passes through the origin. Using column vectors find a matrix
A such that the foot Q of the perpendicular from the point x is given by q = Ax.
Solution:
From Chapter 1,
\[
q = \frac{1}{|u|^2}\,(p \cdot u)\, u = \frac{1}{|u|^2}\,(p_1 u_1 + p_2 u_2 + p_3 u_3 )\,(u_1 , u_2 , u_3 ).
\]
The first coordinate is \(\frac{1}{|u|^2}(p_1 u_1 + p_2 u_2 + p_3 u_3 )\, u_1\) and the others are similar. The
matrix is
\[
A = \frac{1}{|u|^2}\begin{pmatrix} u_1^2 & u_1 u_2 & u_1 u_3 \\ u_2 u_1 & u_2^2 & u_2 u_3 \\ u_3 u_1 & u_3 u_2 & u_3^2 \end{pmatrix}
\]
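The matrix A = u uᵀ/|u|² can be checked mechanically. The sketch below (pure Python with exact fractions; the helper names and the sample direction u = (2, 1, −1) are our choices) builds it and verifies the projection property A² = A:

```python
from fractions import Fraction as F

def proj_matrix(u):
    # Entry (i, j) of u u^T / |u|^2 is u_i u_j / |u|^2.
    norm2 = sum(c * c for c in u)
    return [[ui * uj / norm2 for uj in u] for ui in u]

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = proj_matrix([F(2), F(1), F(-1)])
# Projecting a second time changes nothing, so A^2 = A.
```

The idempotence A² = A is exactly the geometric fact of Exercise 57 No.6: once a point lies on the line, projecting again leaves it fixed.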
(b) Consider equation (1.21) from Exercise 14 in Chapter 1. Assume that the plane goes
through the origin. Using column vectors find a matrix equation q = Ax for the foot
Q of the perpendicular from the point x onto the plane (the projection of P onto the
plane).
The matrices in 2a and 2b are called projection matrices.
Solution:
From Chapter 1, Exercise 14, No.8, the foot Q from the point P to the plane is
\[
q = p - \frac{1}{|n|^2}\,(n \cdot p)\, n = p - \frac{1}{|n|^2}\,(n_1 p_1 + n_2 p_2 + n_3 p_3 )\, n
\]
In matrix terms,
\[
q = \begin{pmatrix} 1 - \frac{n_1^2}{|n|^2} & -\frac{n_1 n_2}{|n|^2} & -\frac{n_1 n_3}{|n|^2} \\ -\frac{n_2 n_1}{|n|^2} & 1 - \frac{n_2^2}{|n|^2} & -\frac{n_2 n_3}{|n|^2} \\ -\frac{n_3 n_1}{|n|^2} & -\frac{n_3 n_2}{|n|^2} & 1 - \frac{n_3^2}{|n|^2} \end{pmatrix} x
\]
(c) Specialize (a) in case the transformation is restricted to the Cartesian plane, i.e. is
from ℜ2 to ℜ2 .
Solution:
\[
A = \begin{pmatrix} \frac{u_1^2}{|u|^2} & \frac{u_1 u_2}{|u|^2} \\ \frac{u_2 u_1}{|u|^2} & \frac{u_2^2}{|u|^2} \end{pmatrix} = \frac{1}{|u|^2}\begin{pmatrix} u_1^2 & u_1 u_2 \\ u_2 u_1 & u_2^2 \end{pmatrix}
\]
3. Find the range of the shear transformation (Example 3.2.8) and prove that your statement
is correct.
4. Use Exercise 18 from Chapter 1 to find linear transformations describing the following:
Remark 59 Observe that this matrix actually represents a rotation through π radi-
ans about the given line. Contrast this with the situation when we restrict ourselves
to the Cartesian plane in Exercise 63, No.9, where a reflection cannot be a rotation.
(b) The reflection in a plane that passes through the origin.
Solution:
This is done in much the same way as the previous exercise. Let A be the matrix of
the projection of No.2b. The matrix of the reflection is 2A − I3 :
\[
\begin{pmatrix} 1 - \frac{2n_1^2}{|n|^2} & -\frac{2n_1 n_2}{|n|^2} & -\frac{2n_1 n_3}{|n|^2} \\ -\frac{2n_2 n_1}{|n|^2} & 1 - \frac{2n_2^2}{|n|^2} & -\frac{2n_2 n_3}{|n|^2} \\ -\frac{2n_3 n_1}{|n|^2} & -\frac{2n_3 n_2}{|n|^2} & 1 - \frac{2n_3^2}{|n|^2} \end{pmatrix}
\]
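Both constructions can be spot-checked in code. The sketch below (pure Python with exact fractions; helper names ours, normal n = (1, 2, 1) chosen to match Example 3.3.2) builds P = I − n nᵀ/|n|² and M = 2P − I and verifies P² = P and M² = I:

```python
from fractions import Fraction as F

def plane_projection(n):
    # P = I - n n^T / |n|^2 projects onto the plane through O with normal n.
    norm2 = sum(c * c for c in n)
    return [[(1 if i == j else 0) - n[i] * n[j] / norm2
             for j in range(3)] for i in range(3)]

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(3)) for j in range(3)]
            for i in range(3)]

P = plane_projection([F(1), F(2), F(1)])
# The reflection in the same plane is M = 2P - I; reflecting twice restores x.
M = [[2 * P[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]
I3 = [[F(1) if i == j else F(0) for j in range(3)] for i in range(3)]
```

Algebraically, M² = 4P² − 4P + I = I follows from P² = P, which is Exercise 63 No.17 below.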
112 CHAPTER 3. LINEAR TRANSFORMATIONS AND MATRICES
(c) Specialize (a) in case the transformation is restricted to the Cartesian plane, i.e. is
from ℜ2 to ℜ2 .
Solution:
The reflection matrix is
\[
2\begin{pmatrix} \frac{u_1^2}{|u|^2} & \frac{u_1 u_2}{|u|^2} \\ \frac{u_2 u_1}{|u|^2} & \frac{u_2^2}{|u|^2} \end{pmatrix} - I_2 = \begin{pmatrix} \frac{2u_1^2}{|u|^2} - 1 & \frac{2u_1 u_2}{|u|^2} \\ \frac{2u_2 u_1}{|u|^2} & \frac{2u_2^2}{|u|^2} - 1 \end{pmatrix}
\]
5. In the x1 - x2 plane let A be the rotation matrix of Example 3.2.7 and suppose B is the
2 × 2 matrix that reflects in the x1 −axis.
(a) Find matrices C and D such that for all vectors x ∈ ℜ2 we have Cx = B (Ax) and
Dx = A (Bx).
(b) Show that C and D are reflection matrices and describe them geometrically.
Remark 60 In the next section, we will regard C as the product of B and A, in that
order: C = BA. Similarly, D = AB. See Example ??
6. Let A be one of the projection matrices from the previous exercises. Show geometrically
why you expect that A (Ax) = Ax for all points x. Now suppose that A is one of the
reflection matrices. What should A (Ax) be? See also Exercise 63, No.11 below.
B : ℜn → ℜk and A : ℜk → ℜm .
Consequently, for x ∈ ℜn
x → A (Bx) (3.41)
defines a transformation from ℜn to ℜm . See Figure 3.3. Symbolically
\[
x \xrightarrow{\;B\;} Bx \xrightarrow{\;A\;} A(Bx)
\]
It is essential to observe that (3.41) only makes sense if the number of columns of A
equals the number of rows of B (here both are k).
Remark 61 We have to read this as: first apply B to x, then apply A to the result Bx. This
“backward” reading is due to our traditional functional notation: For ordinary functions f and
g we write the composition of f and g as (f ◦ g) (x) = f (g (x)).
P x = A (Bx) (x ∈ ℜn ) (3.42)
Figure 3.3 (the point x in ℜn is sent by B to Bx in ℜk , which is sent by A to A(Bx) in ℜm )
3.3.5 There is one, and only one, matrix P satisfying equation (3.42)
Proof
Suppose (3.42) holds. Put x = ej , one of the n unit vectors (3.7). Then
P•j = P ej = A (Bej ) = AB•j (j = 1, · · · , n) (3.43)
In other words, for (3.42) to hold, (3.43) is the only choice for P . On the other hand, (3.41) is
a linear transformation (see No.13 in Exercise 63). Therefore its matrix can only have columns
(3.43). We therefore define the matrix AB as follows:
3.3.6 Definition
If A is m × k and B is k × n, then
AB = AB•1 AB•2 ··· AB•n (3.44)
Equivalently,
(AB) x = A (Bx) (x ∈ ℜn ) (3.45)
The entry in row i and column j of AB is the product of row Ai• and column B•j :
[AB]ij = Ai• B•j (3.46)
Equation (3.46) can just as well be taken as the definition of AB and in fact is the standard definition.
This equation was anticipated in Exercise 48, No.10a above.
Whatever is true for columns (rows) has its exact counterpart for rows (columns). From
(3.46) we see that
[AB]i• = Ai• B (i = 1, · · · , m) (3.47)
or more fully,
\[
AB = \begin{pmatrix} A_{1\bullet} B \\ A_{2\bullet} B \\ \vdots \\ A_{m\bullet} B \end{pmatrix}
\]
Expanding (3.46), using the usual convention for matrix entries and Σ-notation, we get
\[
[AB]_{ij} = \sum_{r=1}^{k} a_{ir} b_{rj} = a_{i1} b_{1j} + \cdots + a_{ik} b_{kj}
\]
As already remarked, this equation only makes sense if the number of columns of A equals
the number of rows of B. We may write
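The entry formula translates directly into code. Here is a minimal sketch in Python (the function name and the nested-list representation of matrices are our choices):

```python
def matmul(A, B):
    # [AB]_ij = sum over r of a_ir * b_rj; defined only when the number of
    # columns of A equals the number of rows of B.
    k = len(B)
    assert all(len(row) == k for row in A), "shapes do not match"
    return [[sum(A[i][r] * B[r][j] for r in range(k))
             for j in range(len(B[0]))] for i in range(len(A))]

# The 2 x 3 product worked out in the example below:
AB = matmul([[2, -9], [3, 1]], [[8, 2, -1], [-5, 4, 7]])
```

The shape check at the top is the point made in the text: the product only makes sense when the inner dimensions agree.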
Example 3.3.3 Let B be the reflection matrix of Example 3.2.1 and A that of the rotation
matrix A of Example 3.2.7. Find the matrix of (i) first rotating then reflecting and (ii) the
other way round, first reflecting then rotating.
See No.5 in Exercise 57.
Solution (i)
If we first rotate through 90◦ then reflect in the x1 −axis we get the product
\[
BA = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}
\]
Solution (ii)
If we first reflect and then rotate we obtain
\[
AB = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]
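The two answers can be reproduced numerically. This sketch (plain Python; the helper name is ours) computes both products and shows they disagree:

```python
B = [[1, 0], [0, -1]]   # reflection in the x1-axis
A = [[0, -1], [1, 0]]   # rotation through 90 degrees anticlockwise

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(2)) for j in range(2)]
            for i in range(2)]

BA = matmul(B, A)  # rotate first, then reflect
AB = matmul(A, B)  # reflect first, then rotate
# The two orders give different matrices: matrix multiplication is not commutative.
```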
Find AB.
Solution:
From (3.46),
\[
AB = \begin{pmatrix} 2 & -9 \\ 3 & 1 \end{pmatrix}\begin{pmatrix} 8 & 2 & -1 \\ -5 & 4 & 7 \end{pmatrix} = \begin{pmatrix} 61 & -32 & -65 \\ 19 & 10 & 4 \end{pmatrix}
\]
Example 3.3.5 Find yB, where y = (y1 y2 ) and B is as in the previous example:
\[
yB = \begin{pmatrix} y_1 & y_2 \end{pmatrix}\begin{pmatrix} 8 & 2 & -1 \\ -5 & 4 & 7 \end{pmatrix} \tag{3.48}
\]
Solution:
\[
yB = \begin{pmatrix} yB_{\bullet 1} & yB_{\bullet 2} & yB_{\bullet 3} \end{pmatrix} = \begin{pmatrix} 8y_1 - 5y_2 & 2y_1 + 4y_2 & -y_1 + 7y_2 \end{pmatrix}
\]
Solution:
In (3.23) we found that the product Ax is a linear combination of the columns of A with
coefficients xk . In exactly the same way we find the product (3.48) to be a linear combination
of the rows of B with coefficients y1 and y2 :
\[
y_1 B_{1\bullet} + y_2 B_{2\bullet} = y_1 \begin{pmatrix} 8 & 2 & -1 \end{pmatrix} + y_2 \begin{pmatrix} -5 & 4 & 7 \end{pmatrix} = \begin{pmatrix} 8y_1 - 5y_2 & 2y_1 + 4y_2 & -y_1 + 7y_2 \end{pmatrix} = yB
\]
The reasoning is just as in the case of equation (3.23) in subsection 3.1.6 except that now
we are multiplying on the left of B by a row y and taking a linear combination of rows of B.
Like equation (3.22) we have
yB = yB•1 yB•2 · · · yB•n
1.
[AB]i• = ai1 B1• + ai2 B2• + · · · + aik Bk• (i = 1, · · · , m) (3.50)
2.
[AB]•j = A•1 b1j + A•2 b2j + · · · + A•k bkj (3.51)
To see equation (3.50), use equations (3.47) and (3.49) with y = Ai• .
In words: “Row i of AB is a linear combination of the rows of B with coefficients taken
from row i of A”.
To see equation (3.51) use equations (3.45) and our earlier result equation (3.23) with
x = B•j .
In words: “Column j of AB is a linear combination of the columns of A with coefficients
taken from column j of B”.
The above examples illustrate these statements, as do the following.
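Statements (3.50) and (3.51) can be spot-checked on the matrices of the earlier worked example, A = (2 −9; 3 1) and B = (8 2 −1; −5 4 7). The sketch below, in plain Python, recomputes one row and one column of AB as linear combinations:

```python
A = [[2, -9], [3, 1]]
B = [[8, 2, -1], [-5, 4, 7]]

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

AB = matmul(A, B)

# (3.50): row 1 of AB is a_11 * B_1. + a_12 * B_2.
row1 = [A[0][0] * b1 + A[0][1] * b2 for b1, b2 in zip(B[0], B[1])]

# (3.51): column 3 of AB is A_.1 * b_13 + A_.2 * b_23
col3 = [a_row[0] * B[0][2] + a_row[1] * B[1][2] for a_row in A]
```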
Example 3.3.7 Find AB in two ways: (i) in terms of linear combinations of rows of B and
(ii) in terms of linear combinations of columns of A, given that
\[
A = \begin{pmatrix} 2 & 1 \\ -1 & 3 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 5 & 4 \\ -7 & 0 \end{pmatrix}
\]
Solution:
and
\[
AC = \begin{pmatrix} 7A_{\bullet 1} - 37A_{\bullet 2} & 17A_{\bullet 1} + 19A_{\bullet 2} - 7A_{\bullet 3} \end{pmatrix}
\]
(AC has two columns).
Solution:
\[
B = \begin{pmatrix} 67 & -5 & 92 \\ -4 & 0 & 9 \\ 21 & 8 & 0 \end{pmatrix} \quad \text{and} \quad C = \begin{pmatrix} 7 & 17 \\ -37 & 19 \\ 0 & -7 \end{pmatrix}
\]
Example 3.3.9 What is the matrix of the transformation T : ℜ3 → ℜ3 that first projects onto
the plane x1 + 2x2 + x3 = 0 as in (3.40) and then projects onto the x1 − x2 plane as in Example
3.2.3?
Solution:
The required matrix is the product
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} \tfrac56 & -\tfrac13 & -\tfrac16 \\ -\tfrac13 & \tfrac13 & -\tfrac13 \\ -\tfrac16 & -\tfrac13 & \tfrac56 \end{pmatrix} = \begin{pmatrix} \tfrac56 & -\tfrac13 & -\tfrac16 \\ -\tfrac13 & \tfrac13 & -\tfrac13 \end{pmatrix}
\]
Considered as mappings, both sides of (3.52) have the same meaning: First apply C, then
B, then A. Therefore both sides are equal as matrix products.
Powers of A if A is square. Let A be m × m. Then (AA)A and A(AA) are both defined
and are equal by (3.52), so we can write A3 for this product. Similarly, if n > 1 is an integer
we have An = AA · · · A (n times), where the bracketing does not matter. By convention,
A0 = Im (the m × m identity matrix).
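The power convention can be sketched in code; associativity is exactly what makes the unbracketed product Aⁿ meaningful. A plain-Python sketch (the shear matrix is our sample choice):

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][r] * Y[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def power(A, n):
    # A^0 = I by convention; A^n multiplies n copies of A together.
    m = len(A)
    result = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    for _ in range(n):
        result = matmul(result, A)
    return result

A = [[1, 1], [0, 1]]  # the shear of Example 3.2.8 with alpha = 1
# (AA)A and A(AA) agree by (3.52), so the bracketing of A^3 does not matter.
```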
We have given a motivated and especially simple proof of the associative law (3.52). The usual
proof given in textbooks is the following:
\[
[A(BC)]_{ij} = \sum_{r=1}^{k} [A]_{ir} [BC]_{rj} = \sum_{r=1}^{k} [A]_{ir} \left( \sum_{s=1}^{n} [B]_{rs} [C]_{sj} \right)
= \sum_{r=1}^{k} \sum_{s=1}^{n} [A]_{ir} [B]_{rs} [C]_{sj} = \sum_{s=1}^{n} \left( \sum_{r=1}^{k} [A]_{ir} [B]_{rs} \right) [C]_{sj}
= \sum_{s=1}^{n} [AB]_{is} [C]_{sj} = [(AB) C]_{ij}
\]
Let X be a point in the Cartesian plane ℜ2 = ℜ2×1 and suppose that R rotates OX = x
through the angle θ about the origin anticlockwise looking down the x3 − axis towards O. This
is a clockwise rotation of θ looking from the origin O along the positive x3 −axis. Let R (x) be
the effect of R on x.
Figure 3.4 (the parallelogram with sides a = OA, b = OB and diagonal a + b = OC is rotated to the parallelogram with sides R(a) = OA′ , R(b) = OB ′ and diagonal R(a + b) = OC ′ )
R (a + b) = R (a) + R (b)
Thus the transformation R satisfies the first condition in equation (3.35) for linearity. It is
pretty obvious that if we multiply a by a scalar s then R (a) will be multiplied by the same
factor, i.e. that R (as) = R (a) s, so that the second condition for linearity as expressed in
equation (3.36) also holds. We have shown that R : ℜ2 → ℜ2 is a linear transformation.
Since these are the first and second columns of Rθ respectively, this proves (3.53).
Figure 3.5 (the unit vectors (1, 0) and (0, 1) are rotated through θ to (cos θ, sin θ) and (− sin θ, cos θ) respectively)
Example 3.3.10 Write down the linear transformations that rotate π/3 and 2π/3 radians
about the origin in the Cartesian plane.
Solution:
\[
R_{\pi/3} = \begin{pmatrix} \cos\frac{\pi}{3} & -\sin\frac{\pi}{3} \\ \sin\frac{\pi}{3} & \cos\frac{\pi}{3} \end{pmatrix} = \begin{pmatrix} \tfrac12 & -\tfrac12\sqrt3 \\ \tfrac12\sqrt3 & \tfrac12 \end{pmatrix}
\]
\[
R_{2\pi/3} = \begin{pmatrix} \cos\frac{2\pi}{3} & -\sin\frac{2\pi}{3} \\ \sin\frac{2\pi}{3} & \cos\frac{2\pi}{3} \end{pmatrix} = \begin{pmatrix} -\tfrac12 & -\tfrac12\sqrt3 \\ \tfrac12\sqrt3 & -\tfrac12 \end{pmatrix}
\]
Multiplying the two matrices on the left gives the addition formulae for sine and cosine:
\[
\begin{pmatrix} \cos\theta\cos\phi - \sin\theta\sin\phi & -\cos\theta\sin\phi - \sin\theta\cos\phi \\ \sin\theta\cos\phi + \cos\theta\sin\phi & \cos\theta\cos\phi - \sin\theta\sin\phi \end{pmatrix} = \begin{pmatrix} \cos(\theta+\phi) & -\sin(\theta+\phi) \\ \sin(\theta+\phi) & \cos(\theta+\phi) \end{pmatrix}
\]
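The identity Rθ Rφ = Rθ+φ is easy to confirm numerically, for instance with θ = φ = π/3 as in Example 3.3.10. A plain-Python sketch (floating-point, so the comparison is up to rounding error):

```python
import math

def R(t):
    # The rotation matrix through angle t.
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(2)) for j in range(2)]
            for i in range(2)]

lhs = matmul(R(math.pi / 3), R(math.pi / 3))
rhs = R(2 * math.pi / 3)
close = all(abs(lhs[i][j] - rhs[i][j]) < 1e-12
            for i in range(2) for j in range(2))
```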
1. Find
\[
\begin{pmatrix} \lambda & 0 & 0 & 0 \\ 0 & \lambda & 0 & 0 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \lambda \end{pmatrix}^{n}
\]
(n is a positive integer).
Solution
\[
\begin{pmatrix} \lambda^n & 0 & 0 & 0 \\ 0 & \lambda^n & 0 & 0 \\ 0 & 0 & \lambda^n & 0 \\ 0 & 0 & 0 & \lambda^n \end{pmatrix}
\]
2. Find all possible products of the following matrices, whenever the product is defined, and
when this is the case, express the rows and columns of the product as linear combinations,
as in Example 3.3.7.
\[
A = \begin{pmatrix} -1 & 3 \\ 2 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} -2 & 5 \\ 4 & -7 \\ 0 & 3 \end{pmatrix}, \quad C = \begin{pmatrix} 2 & 0 & -5 \\ -4 & 3 & 11 \end{pmatrix}, \quad D = \begin{pmatrix} -3 & 4 & -1 \end{pmatrix}, \quad E = \begin{pmatrix} 2 \\ -1 \\ 6 \end{pmatrix}
\]
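Before computing anything, one can list which of these products exist purely from the shapes: A is 2 × 2, B is 3 × 2, C is 2 × 3, D is 1 × 3 and E is 3 × 1. A short sketch (the dictionary encoding is our own device):

```python
shapes = {"A": (2, 2), "B": (3, 2), "C": (2, 3), "D": (1, 3), "E": (3, 1)}

# XY is defined exactly when the number of columns of X equals
# the number of rows of Y (squares such as AA are included).
defined = sorted(x + y
                 for x, (_, xcols) in shapes.items()
                 for y, (yrows, _) in shapes.items()
                 if xcols == yrows)
```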
−3B1•
BC = B•1 (−2) + B•2 (3) + B•3 B•2 − B•1
Note that BC has two columns.
Solution:
The matrix A must be 4 × 3. One such matrix is:
\[
A = \begin{pmatrix} 1 & 0 & -2 \\ 0 & 2 & 5 \\ 1 & 4 & 7 \\ -3 & 0 & 0 \end{pmatrix}
\]
6. Let A be an m × n matrix and Im and In identity matrices (see equation (3.8)). Show
that
Im A = A = AIn
Conclude that if A is n × n and I = In , then IA = AI = A.
7. In No.8 of Exercise 5 of Chapter 1 you found the formula for the projection of a point
(p1 , p2 , p3 ) on the line (−2t, −t, 5t). Find its matrix B as a linear transformation. In No.1
of Exercise 57 above you found the matrix A of the projection on the plane 2x + z = 0.
Find
(a) the matrix of the transformation that first projects on the plane 2x + z = 0 then on
the line (−2t, −t, 5t).
Solution:
The matrix that projects on the plane 2x + z = 0 is
\[
A = \begin{pmatrix} \tfrac15 & 0 & -\tfrac25 \\ 0 & 1 & 0 \\ -\tfrac25 & 0 & \tfrac45 \end{pmatrix}
\]
(b) the matrix of the transformation that first projects on the line (−2t, −t, 5t) then on
the plane 2x + z = 0.
The answer is
\[
AB = \begin{pmatrix} \tfrac15 & 0 & -\tfrac25 \\ 0 & 1 & 0 \\ -\tfrac25 & 0 & \tfrac45 \end{pmatrix} \cdot \frac{1}{30}\begin{pmatrix} 4 & 2 & -10 \\ 2 & 1 & -5 \\ -10 & -5 & 25 \end{pmatrix} = \begin{pmatrix} \tfrac{4}{25} & \tfrac{2}{25} & -\tfrac25 \\ \tfrac{1}{15} & \tfrac{1}{30} & -\tfrac16 \\ -\tfrac{8}{25} & -\tfrac{4}{25} & \tfrac45 \end{pmatrix}
\]
8. (a) Complete the proof of the addition formulae for sine and cosine by multiplying out
equation (3.54).
(b) Check by direct multiplication that Rθ R−θ = I2 .
Solution:
\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} \cos^2\theta + \sin^2\theta & 0 \\ 0 & \sin^2\theta + \cos^2\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]
9. *
(a) Let ℓ be a line in the Cartesian plane making an angle θ with the positive x1 −axis.
Find the projection matrix Pθ on ℓ and deduce that the matrix Mθ which reflects in
ℓ is given by
\[
M_\theta = \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix}
\]
Observe that, in spite of superficial appearances, Mθ is not a rotation matrix. Why?
Solution:
\[
P x = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}\begin{pmatrix} \cos\theta & \sin\theta \end{pmatrix} x = \begin{pmatrix} (x_1 \cos\theta + x_2 \sin\theta)\cos\theta \\ (x_1 \cos\theta + x_2 \sin\theta)\sin\theta \end{pmatrix} = \begin{pmatrix} \cos^2\theta & \cos\theta\sin\theta \\ \cos\theta\sin\theta & \sin^2\theta \end{pmatrix} x
\]
Thus
\[
P = \begin{pmatrix} \cos^2\theta & \cos\theta\sin\theta \\ \cos\theta\sin\theta & \sin^2\theta \end{pmatrix}
\]
The reflection matrix is
\[
M_\theta = 2P - I = \begin{pmatrix} 2\cos^2\theta - 1 & 2\cos\theta\sin\theta \\ 2\cos\theta\sin\theta & 2\sin^2\theta - 1 \end{pmatrix} = \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix}
\]
We offer two proofs that a reflection in a line in the Cartesian plane cannot be a
rotation.
Geometric proof:
Let x = OX and Mθ x = OX ′ . Then the arm OX ′ is OX rotated through a certain
angle, say 2γ. But this angle depends on the direction of OX relative to the line through O
making an angle of θ with the x1 −axis, and so Mθ cannot represent a rotation. (See the figure below.)
Figure (the vector x, its reflection Mθ x in the line at angle θ to the x1 -axis, and Mφ Mθ x, with the angles θ and α marked)
ii. algebraically.
Hint 64 For the geometric solution, let Mθ rotate x through 2γ. Then Mφ
rotates Mθ x through 2(α − γ). For the algebraic solution use the expression for
Mθ found in the previous exercise.
Algebraic solution
\[
M_\phi M_\theta = \begin{pmatrix} \cos 2\phi & \sin 2\phi \\ \sin 2\phi & -\cos 2\phi \end{pmatrix}\begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix}
= \begin{pmatrix} \cos 2\phi\cos 2\theta + \sin 2\phi\sin 2\theta & \cos 2\phi\sin 2\theta - \sin 2\phi\cos 2\theta \\ \sin 2\phi\cos 2\theta - \cos 2\phi\sin 2\theta & \sin 2\phi\sin 2\theta + \cos 2\phi\cos 2\theta \end{pmatrix}
= \begin{pmatrix} \cos(2\phi - 2\theta) & -\sin(2\phi - 2\theta) \\ \sin(2\phi - 2\theta) & \cos(2\phi - 2\theta) \end{pmatrix} = R_{2(\phi-\theta)}
\]
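The identity Mφ Mθ = R2(φ−θ) can also be confirmed numerically for sample angles (a plain-Python sketch; floating-point comparison up to rounding error):

```python
import math

def M(t):  # reflection in the line through O at angle t to the x1-axis
    return [[math.cos(2 * t),  math.sin(2 * t)],
            [math.sin(2 * t), -math.cos(2 * t)]]

def R(t):  # rotation through t
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(2)) for j in range(2)]
            for i in range(2)]

phi, theta = 0.9, 0.3
lhs = matmul(M(phi), M(theta))
rhs = R(2 * (phi - theta))
close = all(abs(lhs[i][j] - rhs[i][j]) < 1e-12
            for i in range(2) for j in range(2))
```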
10. Following the corkscrew rule, find matrices that represent rotations through θ about the
x−, y− and z− axes.
Hint 65 Recall the corkscrew rule: looking along the specified axis, rotate clockwise
through θ. The required rotation about the z−axis is
\[
\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
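For the other two axes the pattern repeats with the roles of the coordinates cycled. This sketch writes out all three; the x- and y-axis versions are our rendering of the same convention and should be checked against the corkscrew rule:

```python
import math

def Rx(t):  # rotation through t about the x-axis (our cycled version)
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def Ry(t):  # rotation through t about the y-axis (our cycled version)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def Rz(t):  # rotation through t about the z-axis, as in the hint
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

# Each matrix leaves its own axis fixed, e.g. Rz(t) maps (0, 0, 1) to itself.
```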
11. (Compare Exercise 57, No.6 above). Let M be the matrix of a reflection in a plane
going through the origin and P the matrix of a projection on the same plane. Explain
geometrically why you expect the following to be true. Draw diagrams!
(a) M 2 = I3
(b) P 2 = P
(c) M P = P = P M
Hint 66 [(AB)T ]ji = [AB]ij = Ai• B•j . Now use equation (3.13) and Exercise 48 No.6.
13. Prove that the composite transformation x → A (Bx) of (3.41) is indeed linear. Can you
see why a similar result holds for general linear transformations?
Hint 68 Let the m × k matrix A have columns a1 , . . . , ak and let the m × n matrix B
have columns b1 , . . . , bn . By assumption, there is an n × k matrix C such that A = BC.
Since x = Au for some column vector u with k entries, the result follows. Fill in the
remaining details.
For the row case, think in terms of the transposes of the above matrices.
15. Show that the column case of No.14 can also be proved using Σ-notation as follows:
Write \(x = \sum_{j=1}^{k} a_j u_j\) and \(a_j = \sum_{i=1}^{n} b_i c_{ij}\) for j = 1, . . . , k. Then expand and rearrange
the sum
\[
x = \sum_{j=1}^{k} \left( \sum_{i=1}^{n} b_i c_{ij} \right) u_j
\]
16. (a) If the matrix sum A + B and the matrix product C (A + B) are defined, show that
C (A + B) = CA + CB.
(b) If the matrix sum A + B and the matrix product (A + B) C are defined, show
that (A + B) C = AC + BC.
(c) If AB is defined, and s is a scalar, show s(AB) = (sA)B and (AB)s = A(Bs) and
that both are equal. Conclude with statements and proofs of more general results
than (a) and (b).
Hint 69 To show (a) use the definition of matrix multiplication and equation (3.14):
[C (A + B)]ij = Ci• [A + B]•j = Ci• (A•j + B•j ) = Ci• A•j + Ci• B•j = · · ·
17. *
(a) Let C be an n × n matrix satisfying C 2 = C, where I = In .
Show that (I − C)2 = I − C and that (2C − I)2 = I.
The rest of the question refers to No.11 above and to Exercise 57, No.2a, No.2b and
No.4.
(b) Let P be the projection matrix of Exercise 57, No.2a and write the vector u as a
column vector u = (u1 , u2 , u3 )T .
Show that
i. \(P = \frac{1}{|u|^2}\, u\, u^T\),
ii. P 2 = P using matrix multiplication,
iii. the matrix M = 2P − I reflecting in the line satisfies M 2 = I.
(c) Let P be the projection matrix of Exercise 57, No.2b onto a plane through the origin
with normal n = (n1 , n2 , n3 )T .
Show that
i. \(P = I - \frac{1}{|n|^2}\, n\, n^T\),
ii. P 2 = P using matrix multiplication,
iii. the matrix M = 2P − I reflecting in the plane satisfies M 2 = I.
(d) Complete analytical proofs of No.11 above.
Hint 70 Note that we are forming matrix products of a column and a row as in No.4.
Also make use of No.17a.
Ax = 0 (3.55)
the vectors a1 , a2 , · · · , an are said to be linearly dependent, otherwise they are linearly
independent.
To put it another way, the columns of the m × n matrix A are linearly independent if, and only
if, for all column vectors x with n entries,
Ax = 0 implies x = 0 (3.56)
To say the same thing once more, if we can find a linear combination such that
a1 x1 + a2 x2 + · · · + an xn = 0 (3.57)
in which one or more of the coefficients xj are not zero, then the vectors a1 , a2 ,..., an are
linearly dependent (there is a non-trivial linear relationship among them), otherwise they are
independent.
Let n ≥ 2 and suppose the vectors are linearly dependent. Then (say) x1 ≠ 0 in equation
(3.57). In that case a1 depends linearly on a2 ,..., an :
\[
a_1 = a_2 \left( -\frac{x_2}{x_1} \right) + \cdots + a_n \left( -\frac{x_n}{x_1} \right)
\]
Similarly, if xj ≠ 0, then aj depends linearly on the vectors ai for i ≠ j.
Conversely, if (say) there are scalars s2 , . . . , sn such that
a1 = a2 s2 + · · · + an sn
Then
a1 + a2 (−s2 ) + · · · + an (−sn ) = 0
Since the coefficient of a1 is 1 ≠ 0, the vectors are linearly dependent, as required.
For a slightly refined version of this theorem, see Exercise 76, No.1(c)v.
The definition of linear independence for a list of row vectors is almost identical to that for
column vectors. See 3.4.14 below.
The columns of A are linearly dependent, as can be seen from (3.27) by letting t ≠ 0, say
putting t = 1.
The columns of the matrix (3.29) in example 3.1.7 are linearly independent since Ax = 0 has
only the trivial solution x = 0.
The columns of A in Exercise 35 No.2b are linearly independent.
3.4.4 If a matrix has more columns than rows, its columns are linearly
dependent; if more rows than columns then its rows are linearly
dependent
This is just a rephrasing of the result of subsection 2.2.15 in Chapter 2: A homogeneous system
with more unknowns than equations has a non-trivial solution.
The row case is quite similar.
Row-rank The row-rank of a matrix A is defined similarly, except that ‘rows’ replace ‘columns’.
See 3.4.14 below for more details.
Given a list of vectors of the same size (either rows or columns), it should be clear what is
meant by their rank: it is the maximum number of linearly independent vectors among them.
Below we show that the column- and row-ranks of a matrix are equal. (See subsection 3.4.17).
Any dependency relation among the columns of A also holds for A′ and vice-versa:
In particular, certain columns of A are linearly independent if, and only if, the same columns
of A′ are linearly independent. It follows that the column ranks of A and A′ are the same.
Also, column A•j is a linear combination of other columns of A if, and only if, A′•j is a linear
combination of the corresponding columns of A′ with the same coefficients.
For these reasons, questions regarding dependence or independence of the columns of A are
best answered by considering the row reduced echelon form (row REF) of A.
Let k be the number of non-zero rows in an echelon form of the m × n matrix A and suppose
that A′′ is its row REF. Recall that the pivot columns (see 2.2.8) of A′′ are columns of the
identity matrix Im , which are certainly linearly independent. Hence the column-rank of A′′ is
at least k. The k × n matrix of the non-zero rows of A′′ cannot have column-rank more than k
by 3.4.4. Since the other rows of A′′ are zero, the column rank of A′′ is exactly k. This is also
the column-rank of A.
3.4. LINEAR INDEPENDENCE 129
Suppose that u and v are two solutions, i.e. Au = b and Av = b. Then Au = Av and so
Au − Av = A (u − v) = 0. As the columns of A are linearly independent, u − v = 0 and u = v.
Consider the matrix mapping x → Ax of (3.32). To say that this mapping is one-to-one
means that for any two column vectors u and v with n entries, we can have Au = Av only if
u = v.
In other words, two linear combinations of the columns of A are equal only if they have identical
coefficients. The mapping x → Ax is one-to-one if, and only if, the column-rank of A is n.
Example 3.4.2 A row echelon form (2.2) of the coefficient matrix A of Example 2.1.2 in
Chapter 2 is
\[
\begin{pmatrix} 1 & 1 & 3 \\ 0 & 2 & 1 \\ 0 & 0 & 1 \end{pmatrix}
\]
from ℜ3 to ℜ3 is one-to-one.
All this becomes even clearer if we look at the row-reduced echelon form in Example 2.2.7 of
Chapter 2. This is the 3 × 3 identity matrix and so Ax = b has a solution for any b ∈ ℜ3 : The
mapping x → Ax has range the whole of ℜ3 .
The example illustrates the following remark.
if, and only if, Ax = b has a solution. Here, A is the m × n matrix having the vectors aj as
columns.
Proof : The proof depends the existence theorem regarding solutions to Ax = b (section 2.3 in
chapter 2).
Suppose that the column-rank of [A, b] is the same as that of A. Then the same applies to the
REF: The column-ranks of [A′′ , b′′ ] and A′′ are equal. Since the last non-zero row of [A′′ , b′′ ]
cannot have the form 0 . . . 0 b′r where b′r ≠ 0, it follows that Ax = b has a solution.
Conversely, let Ax = b have a solution. Then the REFs [A′′ , b′′ ] and A′′ have the same number
of non-zero rows and so the same column-rank.
Remark 71 We emphasize that the above theorem 3.4.11 about column vectors is equally valid
for row vectors. The statement 3.4.11 goes through word-for-word.
For those proceeding to a more advanced study of linear algebra an independent proof of the
fundamental theorem is provided in subsection 3.6.
Proof :
The rank of a1 , a2 , . . . , an , b is either n or n + 1. If it is n, then these vectors are linearly
dependent, and there exist constants x1 ,...,xn , β not all zero such that
a1 x1 + a2 x2 + · · · + an xn + bβ = 0
We must have β ≠ 0, otherwise one or more of the scalars xj would have to be non-zero,
contradicting the linear independence of the vectors a1 , a2 , . . . , an . Therefore,
\[
b = a_1 \left( -x_1 \beta^{-1} \right) + a_2 \left( -x_2 \beta^{-1} \right) + \cdots + a_n \left( -x_n \beta^{-1} \right)
\]
Example 3.4.3 Consider example 3.1.6 once more. Can we find a set of linearly independent
columns of A in (3.25) such that every column of A is a linear combination of these independent
columns? What is the column-rank of A? Illustrate the statements of 3.4.8 and 3.4.13.
Solution:
The original matrix (3.28) is
\[
A = \begin{pmatrix} 2 & 5 & -11 \\ -3 & -6 & 12 \\ 4 & 7 & -13 \end{pmatrix}
\]
The echelon form (3.26),
\[
\begin{pmatrix} 2 & 5 & -11 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \end{pmatrix},
\]
shows that the first two columns of A are linearly independent. The row REF shows things
even more clearly:
\[
\begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \end{pmatrix} \tag{3.58}
\]
Every column of A is a linear combination of the first two, which are pivot columns. The
column rank is 2.
Example 3.4.4 Let
\[
A = \begin{pmatrix} -1 & -4 & -3 & 2 \\ 3 & 4 & 1 & 2 \\ 2 & 9 & 7 & -5 \\ -5 & -1 & 4 & -9 \end{pmatrix}
\]
Illustrate subsections 3.4.7, 3.4.8 and 3.4.13.
Solution:
We first reduce the matrix A to row REF. Adding 3R1 to R2 , 2R1 to R3 and −5R1 to R4 :
\[
\begin{pmatrix} -1 & -4 & -3 & 2 \\ 0 & -8 & -8 & 8 \\ 0 & 1 & 1 & -1 \\ 0 & 19 & 19 & -19 \end{pmatrix}
\]
Multiplying R2 by 1/8 and R4 by 1/19:
\[
\begin{pmatrix} -1 & -4 & -3 & 2 \\ 0 & -1 & -1 & 1 \\ 0 & 1 & 1 & -1 \\ 0 & 1 & 1 & -1 \end{pmatrix}
\]
Adding R2 to R3 and to R4 :
\[
\begin{pmatrix} -1 & -4 & -3 & 2 \\ 0 & -1 & -1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
Replacing R2 by −R2 , then adding 4R2 to R1 :
\[
\begin{pmatrix} -1 & 0 & 1 & -2 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
Finally, multiplying R1 by −1 gives the row REF A′′ of A as
\[
E = \begin{pmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
The pivot columns of E are E•1 and E•2 and they are linearly independent, and therefore so
are A•1 and A•2 . The column-rank of A is 2. Clearly, every column of E is a linear combination
of E•1 and E•2 . Therefore, every column of A depends linearly on A•1 and A•2 . For example,
E•4 = E•1 (2) + E•2 (−1), and so A•4 = A•1 (2) + A•2 (−1), much as in the previous example.
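The dependency relations read off from E can be verified directly on A. A plain-Python sketch (`col` is our helper):

```python
A = [[-1, -4, -3, 2],
     [3, 4, 1, 2],
     [2, 9, 7, -5],
     [-5, -1, 4, -9]]

def col(M, j):
    return [row[j] for row in M]

# E gave E.4 = E.1(2) + E.2(-1), so the same relation must hold for A:
combo4 = [2 * a - b for a, b in zip(col(A, 0), col(A, 1))]
# Likewise E.3 = E.1(-1) + E.2(1) gives A.3 = -A.1 + A.2:
combo3 = [-a + b for a, b in zip(col(A, 0), col(A, 1))]
```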
yA = 0 implies y = 0
In other words, if we can find a row vector y ≠ 0 such that the linear combination
yA = y1 A1• + y2 A2• + · · · + ym Am• = 0,
then the rows of A are linearly dependent, otherwise they are linearly independent.
The row-rank of A is the maximum number of linearly independent rows of A.
3.4.15 The non-zero rows of a matrix in row echelon form are linearly
independent
To see this, let C be an m × n matrix in row echelon form, and suppose that C1• , C2• ,...., Ck• ,
are its non-zero rows. Let
y1 C1• + y2 C2• + · · · + yk Ck• = 0
where 0 is the zero row with n entries. The first non-zero entry in C1• has only zeros below
it. Therefore, y1 = 0 and y2 C2• + · · · + yk Ck• = 0. As the first non-zero entry in C2• has only
zeros below it, y2 = 0. Carrying on the argument, y1 = y2 = · · · = yk = 0. The claim is proved.
For example, the first two rows of
\[
\begin{pmatrix} -1 & -4 & -3 & 2 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
are linearly independent.
Solution:
This is the matrix of Example 2.2.4 in Chapter 2. We found its row echelon form in equation
(2.26):
\[
\begin{pmatrix} 2 & -3 & -4 & 3 & -3 \\ 0 & 1 & 2 & -6 & 1 \\ 0 & 0 & 3 & -2 & -2 \\ 0 & 0 & 0 & -5 & -3 \end{pmatrix}
\]
Therefore, r(A) = 4.
Ax = 0 implies Bx = 0 (3.59)
Remark 72 We have actually shown a bit more: if the condition (3.59) holds, and certain
columns of B are linearly independent, then the corresponding columns of A are also linearly
independent.
Example 3.5.1 Let C be the 3 × 3 matrix of Example 3.4.2 of rank 3 and let A be the 3 × 3
matrix of example 3.4.3 with column-rank 2. From the corollary we can conclude that the
column-rank of the product
\[
CA = \begin{pmatrix} 1 & 1 & 3 \\ -1 & 1 & -2 \\ 1 & 1 & 4 \end{pmatrix}\begin{pmatrix} 2 & 5 & -11 \\ -3 & -6 & 12 \\ 4 & 7 & -13 \end{pmatrix} = \begin{pmatrix} 11 & 20 & -38 \\ -13 & -25 & 49 \\ 15 & 27 & -51 \end{pmatrix}
\]
is also 2.
Remark 73 As with the Column Lemma, we have shown that if the condition (3.5.3) holds
and certain rows of B are linearly independent, the corresponding rows of A are also linearly
independent.
Let a1 , a2 , . . . , an , b be any list of n column vectors, each with m entries. Then the rank of
a1 , a2 , . . . , an , b is not more than the rank of a1 , a2 , . . . , an if, and only if, b depends linearly on
a1 , a2 , . . . , an .
Put slightly differently, the theorem says that
column-rank [A, b] = column-rank A
if, and only if, Ax = b has a solution. Here, A is the m × n matrix having the vectors aj as
columns.
Proof :
Let the maximum number of linearly independent vectors among a1 , a2 , . . . , an be s.
3.6. THEOREM*: THE FUNDAMENTAL PROPERTY OF LINEAR INDEPENDENCE REVISITED 135
Corollary 75 Let the matrix B consist of k linearly independent columns of a matrix A which
cannot be enlarged by columns of A without destroying independence. Then
1. Every column of A is a linear combination of columns of B (this follows from the Lemma)
2. k is the column-rank of A.
1. In most of the elementary questions that follow, you may assume the given vectors are
either rows or columns, each with the same number of entries. Prove the given statements.
2. How does Lemma 3.4.12 show why 3.4.13 follows directly from 3.4.8?
3. Let a, b, c be the position vectors of the points A, B and C respectively. Show that the
three vectors are linearly independent if, and only if, A, B, C and O do not lie on a plane.
(This was mentioned in the preamble 3.4).
Solution:
If C lies on a plane through O, A and B, then c = sa + tb for certain scalars s and t and
so a, b and c are linearly dependent. Conversely, if these vectors are linearly dependent,
then one of them (say c) is a linear combination of the other two. This shows at
once that C lies on the plane through O, A and B.
4. Let A = [C, D] where C and D are matrices with the same number of rows. Show that if
the rows of C (or of D) are independent, so are those of A. Conclude that in any case
r (A) ≥ r (C) and r (A) ≥ r (D).
5. For any matrix product AB, say why r(AB) ≤ r(A) and r(AB) ≤ r(B).
Solution:
These follow directly from the corollaries to the Row and Column Lemmas (See 3.5.4 and
3.5.2) and the fact that the row - and column-ranks of a matrix are equal.
8. The following questions refer you to the matrices in Exercise 35 of Chapter 2. To answer
them you should consult the row reduced echelon forms (row REF) found in Exercise 40.
Occasionally you may need to find the new row reduced echelon forms.
(a) Consider the matrix A of No.1a. Show that the first two columns of A form a maxi-
mum independent set of columns of A and that every column of A is a linear combi-
nation of A•1 and A•2 . Can you find three linearly independent columns of A? Find
the rank r(A).
Show that the first two rows of A form a maximum number of independent rows of
A.
Hint 79 Either show directly that A1• and A2• are linearly independent or use the
matrix B of No.1b in Exercise 35.
Solution:
A is 4 × 4 and the rows of A together with c can have rank at most four. Since
the rows of A are linearly independent (r(A) = 4 from the row REF), the answer
is ‘yes’ by the row version of Lemma 3.4.12.
9. Prove that in 3-space mutually orthogonal vectors are linearly independent. Hence com-
plete a rigorous proof of No.17 of Exercise 2 in Chapter 1.
Solution:
Let l, m, n be three non-zero mutually orthogonal vectors as in the exercise and suppose
αl + βm + γn = 0
Take the dot product of both sides with l. Since l · m = l · n = 0, it follows that αl · l = 0
and, as l · l ≠ 0, that α = 0. Similarly the other coefficients are zero and the result follows.
To complete the proof of No.17 in Exercise 2, we need to show that every vector in ℜ3
is a linear combination of the vectors l, m, n. This follows by the Lemma 3.4.12 and the
fact that four vectors in ℜ3 are linearly dependent.
10. * Give rigorous proofs of the statements about planes in subsection 1.3.6.
Solutions:
(a) The normal uniquely determines a plane (item 1). Let n · x + c = 0 be the
Cartesian Equation, where n ≠ 0. If n1 ≠ 0, we can let x2 and x3 be parameters to
get a generic equation of the plane.
(b) Consider item 3: Two non-parallel planes meet in a line.
We have two non-parallel (linearly independent) normals to the planes say, m and
n, written as row-vectors. Let the planes be defined by m x = c and n x = d, where
the column vector x represents a point in space.
We want the points common to the two planes, so let these two equations be expressed
as the single matrix equation Ax = b. As A has rank 2, we may suppose that A•1
and A•2 are linearly independent. Since three vectors are linearly dependent in ℜ2 ,
for any value of t, the equation A•1 x1 + A•2 x2 = −tA•3 + b has a (unique) solution
for x1 and x2 . With x3 = t (the parameter), Ax = b has a one-parameter solution,
which is a generic equation of the line of intersection of the two planes.
Alternatively, in a row echelon form of the augmented matrix [A, b] there are two
pivot and one non-pivot columns in A. Suppose A•3 is the non-pivot column and let
x3 = t be the parameter. Then Ax = b has a one-parameter solution, expressing the
generic equation of the line of intersection of the two planes. (This can most easily
be seen from the row REF of [A, b]).
(Question No.22c below shows that a system Ax = b in which A has independent
rows always has a solution).
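The two-plane argument above can be sketched in code: set x3 = t and solve the remaining 2 × 2 system for x1 and x2. The two planes below are a hypothetical made-up pair (m, c, n, d are not data from the text), chosen so that the normals are linearly independent.

```python
from fractions import Fraction as F

# Hypothetical planes m.x = c and n.x = d (made-up data, not from the text);
# the normals m and n are linearly independent, so the planes are not parallel.
m, c = (F(1), F(1), F(1)), F(1)    # x1 + x2 + x3 = 1
n, d = (F(1), F(-1), F(2)), F(0)   # x1 - x2 + 2x3 = 0

def point_on_line(t):
    """Set x3 = t and solve the remaining 2x2 system for x1, x2."""
    b1, b2 = c - m[2] * t, d - n[2] * t
    det = m[0] * n[1] - n[0] * m[1]   # nonzero since the first two columns are independent
    x1 = (b1 * n[1] - b2 * m[1]) / det
    x2 = (m[0] * b2 - n[0] * b1) / det
    return (x1, x2, t)

# Two points on the line give its direction vector u.
p0, p1 = point_on_line(F(0)), point_on_line(F(1))
u = tuple(q - p for p, q in zip(p0, p1))
```

Every value of t yields a point lying on both planes, and u is orthogonal to both normals, as the argument predicts.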
138 CHAPTER 3. LINEAR TRANSFORMATIONS AND MATRICES
(c) Item 4 of 1.3.6 says that a line not parallel to a plane meets it in exactly one
point.
Let x = a + tu be a generic equation of a line that is not parallel to the plane with
Cartesian equation n x = c, where c is a constant. Since n u ≠ 0, the equation
n (a + tu) = c
has the unique solution t = (c − n a)/(n u), giving exactly one point of intersection.
(d) There are infinitely many planes through a given line (item 6). Let the line
have generic equation a + tu (column vectors). Regarding the row vector n as the
unknown, n u = 0 has a two-parameter solution. (E.g. if u1 ≠ 0, we can take n2 and
n3 as parameters). For any such n, let c = n a. Then, corresponding to a choice of
the normal, n x = c is the Cartesian equation of a plane containing the given line.
There are clearly infinitely many such planes.
11. * Let I be the m × m identity matrix. The matrix J resulting from I after performing an
elementary row operation on I is called an m × m elementary matrix. Hence J is the
result of doing one of the following: (i) multiplying row r of I by β ≠ 0, (ii) interchanging
rows r and s of I, and (iii) changing row Is• to βIr• + Is• if s ≠ r.
Hint 80 Use [JA]i• = Ji• A. So, for example, let J be the result of a type (iii)
operation. Then for i ≠ s, [JA]i• = Ai• , while if i = s, [JA]s• = (βIr• + Is• )A = βAr• + As• .
(c) Define the notion of an “elementary column operation”. In the above what corre-
sponds to elementary column operations?
(d) Give a description of "column echelon form" and "column reduced echelon form" of
a matrix A.
12. Show that A, AT , AT A and AAT all have the same rank.
Hint 81 First of all, recall that if a is a column vector and aT a = 0, then a = 0 (See
Exercise 48 No.11).
Now suppose AT Ax = 0. Then xT AT Ax = (Ax)T Ax = 0.
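No.12 can be spot-checked computationally with exact rational row reduction. Everything below is a sketch: the matrix A is an arbitrary made-up example, and rank is computed by Gaussian elimination over the rationals.

```python
from fractions import Fraction as F

def rank(M):
    """Row-reduce a copy of M over the rationals and count the pivots."""
    M = [[F(x) for x in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def T(M):
    """Transpose."""
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]   # arbitrary test matrix; its rank is 2
```

The four ranks r(A), r(AT), r(AT A), r(A AT) all agree, as the exercise asserts.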
Hint 82 One way: Use AT and B T in place of A and B in the previous Lemma 3.5.1.
For a more pleasant direct proof, we may suppose that the first k = r(B) rows of B are
independent. Let y be any row vector such that yk+1 = · · · = ym = 0 and yA = 0. Now
proceed as in the Column Lemma 3.5.1, except use rows in place of columns.
Solution:
By assumption, yB = 0. As the first k = r(B) rows of B are independent, y = 0. Hence
the first k rows of A are independent.
14. Complete the following alternative proof of the corollary to the Column Lemma (Corollary
3.5.2) which says r(BC) ≤ r(C). If k columns (say [BC]•1 , · · · , [BC]•k ) of BC are
linearly independent then so are C•1 , · · · , C•k . Do the same for the row version.
15. In the following assume the relevant matrix products are defined. Prove the statements.
17. * Using (among other things) the Column Lemma 3.5.1, show that if the condition (3.59)
holds then B = CA for some C. (This is essentially the converse of the corollary 3.5.2).
State and prove the corresponding result for the Row Lemma 3.5.3.
Hint 83 Let d = Bi• be a row of B and let A′ be the augmented matrix formed by
adding row d to A. Then A′ x = 0 if, and only if, Ax = 0. Hence A and A′ have the same
column ranks and so the same row ranks. By the row version of the basic theorem 3.4.11
on linear independence, d is a linear combination of the rows of A.
18. Find the ranks of the following matrices and prove your results correct.
(a) A is a projection matrix onto a line that passes through the origin.
Solution:
Let the projection matrix A project onto the line ℓ that passes through O. For x ∈ ℜ3 ,
(b) A is a projection matrix onto a plane that passes through the origin.
Solution:
Let the projection matrix A project onto the plane Π that passes through the origin
O. As usual,
Ax = A•1 x1 + A•2 x2 + A•3 x3
A•1 , A•2 and A•3 are points on Π and we suppose A•1 ≠ 0. The columns A•2 and
A•3 cannot both be multiples of A•1 , otherwise the range of x → Ax would
be a line. We may suppose that [A•1 , A•2 ] has rank 2. But then all points on Π
are linear combinations of A•1 and A•2 . Hence r(A) = 2, in agreement with our
comments in 3.4.10.
(c) A is a matrix that reflects in a plane or line that passes through the origin.
Solution:
Since A is a reflection matrix, the mapping x → Ax is one-to-one. Hence r(A) = 3.
(Alternatively, r(A) cannot be 0, 1 or 2 (why?), so r(A) = 3).
This again illustrates the remarks in 3.4.10.
20. * Use the results of Exercise 63, No.17 to give analytic proofs for No.19 above.
21. If Rθ is the 2 × 2 rotation matrix of equation (3.53), prove algebraically that r(Rθ ) = 2.
Solution:
Consider No.8b of Exercise 63. By No.5, r(Rθ ) = 2.
(If you are familiar with 2 × 2 determinants, you will notice that |Rθ | = cos² θ + sin² θ =
1 ≠ 0, so again r(Rθ ) = 2).
(a) Describe the row REF A′′ of A if r (A) = n. What is A′′ if r (A) = m = n?
Solution:
The first n rows of A′′ form the identity matrix In and any remaining rows are zero. If in addition m = n, A′′ = In .
(b) Describe in general terms what A′′ looks like if r(A) = m.
Solution:
m of the columns of A′′ are the columns of the identity matrix Im (in their natural
order); the remaining (non-pivot) columns may contain arbitrary entries.
(c) Let Ax = b be a system of m equations for n unknowns. If the rows of A are
independent then the system always has a solution. Show that this is a consequence
of the Fundamental Theorem 3.4.11.
Solution:
As mentioned in the hint, if the rows of A are independent, then the (row) ranks
satisfy m = r(A) = r [A, b]. The column-ranks satisfy the same equation so that by
the Fundamental Theorem 3.4.11, the column vector b is a linear combination of the
columns of A.
(d) Let yA = c be a system of n equations for m unknowns y1 , · · · , ym . If the columns
of A are independent then the system always has a solution. Show that this is also
a consequence of the Fundamental Theorem 3.4.11.
Solution:
Very similar to the previous one.
(e) What can be said of the solutions in 22c and 22d if m = n?
3.7. SQUARE MATRICES 141
3.7.6 Solving yA = c
If A has an inverse, then for any row vector c with n entries, the unique solution to
yA = c (3.63)
is
y = cA−1
The proof can be derived from 3.7.5 using transposes (see equation 3.9). Alternatively, the
proof parallels that of the previous section. The unique solution is found by multiplying both
sides of equation (3.63) on the right by A−1 .
Solution:
Use the result of Example 3.7.1:
    [ 3  7 ]−1 [  9  ]   [  5  −7 ] [  9  ]   [ 122 ]
x = [ 2  5 ]   [ −11 ] = [ −2   3 ] [ −11 ] = [ −51 ]
1. r(A) = n.
10. There is a column vector b with n entries such that Ax = b has a unique solution.
12. For every row vector c with n entries yA = c has a unique solution.
13. There is a row vector c with n entries such that yA = c has a unique solution.
Proof
The first seven properties are equivalent from the definition of linear independence, the fact
that row-rank is equal to column-rank, and because the row (column) REF of A is I if, and
only if, the columns (rows) of A are independent.
Now assume r(A) = n. Since the addition of column b (row c) to A does not change the column
(row) rank, Ax = b has a solution by Lemma 3.4.12, and similarly so does yA = c. These
solutions are necessarily unique on account of the independence of the columns (rows) of A.
Assume next that (8) holds. Then we can solve the n matrix equations AX•j = I•j for the
columns X•j and obtain an n × n matrix X such that AX = I. Similarly, if (11) holds we can
find an n × n matrix Y such that Y A = I.
By No.5 of Exercise 76, if either AX = I for some X or Y A = I for some Y , necessarily
r(A) = n.
Thus all the above properties (1) - (13) are equivalent.
Finally, suppose (14) holds. Then, as the properties are equivalent, Y A = I for some Y . By
3.7.3, X = Y and X is the inverse of A.
A similar argument shows that if (15) holds, necessarily Y = A−1 .
Solution:
We found there that an echelon form has three non-zero rows. Therefore, r (A) = 3 and A−1
exists. Find A−1 by row reduction.
Solution:
Start with
[ 2  5   1 | 1 0 0 ]
[ −3 −6  0 | 0 1 0 ]
[ 4  7  −2 | 0 0 1 ]

[ 2  5    1   | 1   0 0 ]
[ 0  3/2  3/2 | 3/2 1 0 ]   (3/2)R1 + R2
[ 0  −3   −4  | −2  0 1 ]   (−2)R1 + R3

[ 2  5    1   | 1   0 0 ]
[ 0  3/2  3/2 | 3/2 1 0 ]
[ 0  0   −1   | 1   2 1 ]   2R2 + R3

[ 2  5  1 | 1   0   0  ]
[ 0  1  1 | 1   2/3 0  ]   (2/3)R2
[ 0  0  1 | −1 −2  −1  ]   (−1)R3

[ 2  5  0 | 2   2    1 ]   −R3 + R1
[ 0  1  0 | 2   8/3  1 ]   −R3 + R2
[ 0  0  1 | −1 −2   −1 ]

[ 2  0  0 | −8 −34/3 −4 ]   −5R2 + R1
[ 0  1  0 | 2   8/3   1 ]
[ 0  0  1 | −1 −2    −1 ]

[ 1  0  0 | −4 −17/3 −2 ]   (1/2)R1
[ 0  1  0 | 2   8/3   1 ]
[ 0  0  1 | −1 −2    −1 ]

Hence

      [ −4 −17/3 −2 ]
A−1 = [ 2   8/3   1 ]
      [ −1  −2   −1 ]
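The reduction above can be reproduced mechanically. A minimal Gauss-Jordan sketch, in exact rational arithmetic, applied to the same matrix A:

```python
from fractions import Fraction as F

def inverse(A):
    """Row-reduce [A | I] to [I | A^-1] over the rationals (assumes A is invertible)."""
    n = len(A)
    M = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[piv] = M[piv], M[c]          # bring a nonzero pivot into place
        M[c] = [x / M[c][c] for x in M[c]]   # scale the pivot row to 1
        for i in range(n):
            if i != c:
                f = M[i][c]                  # eliminate the rest of the column
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]

A = [[2, 5, 1], [-3, -6, 0], [4, 7, -2]]
Ainv = inverse(A)
```

The result agrees entry for entry with the A−1 obtained by hand above.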
1. Find the inverses of the following matrices (where possible), by reduction to I, and solve
the related problems.
(a)
    [ 2 1 ]
A = [ 3 2 ]

Solution:

      [ 2  −1 ]
A−1 = [ −3  2 ]
(b) i.
    [ 3 2  1 ]
B = [ 0 2  2 ]
    [ 0 0 −1 ]

Solution:

      [ 1/3 −1/3 −1/3 ]
B−1 = [ 0    1/2  1   ]
      [ 0    0   −1   ]
ii. Find the linear transformation T that maps column B•j of B to the correspond-
ing column Y•j of Y for j = 1, 2, 3, where
    [ 1  0  4 ]
Y = [ −1 1 −2 ]
    [ 2  3  1 ]
(c)
    [ 1  6  2 ]
C = [ −2 3  5 ]
    [ 7 12 −4 ]

Solution: C does not have an inverse.
(d) i.
    [ 1  3 −1 ]
D = [ 2  0 −3 ]
    [ −1 4  2 ]

Solution:

      [ 12 −10 −9 ]
D−1 = [ −1  1   1 ]
      [ 8  −7  −6 ]
ii. Use your inverse to solve
x1 + 3x2 − x3 = 3
2x1 − 3x3 = −1
−x1 + 4x2 + 2x3 = 2
Solution:
[ x1 ]   [ 12 −10 −9 ] [ 3  ]   [ 28 ]
[ x2 ] = [ −1  1   1 ] [ −1 ] = [ −2 ]
[ x3 ]   [ 8  −7  −6 ] [ 2  ]   [ 19 ]
y1 + 2y2 − y3 = 4
3y1 + 4y3 = 5
−y1 − 3y2 + 2y3 = −1
Solution:
                          [ 12 −10 −9 ]
[ y1 y2 y3 ] = [ 4 5 −1 ] [ −1  1   1 ] = [ 35 −28 −25 ]
                          [ 8  −7  −6 ]
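Both solved systems can be verified directly with D and the inverse found in this exercise:

```python
D = [[1, 3, -1], [2, 0, -3], [-1, 4, 2]]
Dinv = [[12, -10, -9], [-1, 1, 1], [8, -7, -6]]   # D^-1 from the solution above

def matvec(M, v):
    """Column-vector product M v."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def vecmat(v, M):
    """Row-vector product v M."""
    return [sum(x * M[i][j] for i, x in enumerate(v)) for j in range(len(M[0]))]

x = matvec(Dinv, [3, -1, 2])    # unique solution of Dx = b
y = vecmat([4, 5, -1], Dinv)    # unique solution of yD = c
```

Substituting x back into Dx = b and y back into yD = c recovers the original right-hand sides.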
2. Prove that an invertible matrix maps linearly independent vectors to linearly independent
vectors and state and prove the converse.
Solution: (The result follows from the Corollary 3.5.2 to the Column Lemma, but we
present a proof using inverses). Let the n × n matrix A have an inverse and suppose the
columns of the n × k matrix B are linearly independent. Let (AB)x = 0. Then
0 = A−1 [(AB) x] = (A−1 A)(Bx) = I(Bx) = Bx
Since the columns of B are independent, x = 0. Hence the columns of AB are linearly independent.
3. Let A and B be n × n matrices. Show that AB is invertible if, and only if, both A and
B are, in which case (AB)−1 = B −1 A−1 . Generalize this result to products of three or
more matrices.
4. Let the n × n matrix A be invertible. Show that AT is also invertible and that
(AT )−1 = (A−1 )T .
5. Let A be n × n. As a linear transformation, the mapping A : ℜn → ℜn is one-to-one if,
and only if, A maps ℜn onto the whole of ℜn .
Prove the statement. (Compare Example 3.4.2).
6. If M is a matrix that reflects in a plane or line passing through the origin, show alge-
braically that M −1 = M .
Hint 87 First reduce A to row REF. Now use elementary column operations and the pivot
columns to reduce the non-pivot columns to zero columns. Apply suitable permutations to
the columns and finally use the results of Exercise 76, No.11. (Compare No.6 of the same
exercise).
Solution:
Since A((7/2)A³ − (5/2)A + (3/2)I) = I, it follows that
A−1 = (7/2)A³ − (5/2)A + (3/2)I
Column j is written
      [ a1j ]
A•j = [ a2j ]   (j = 1, · · · , n)
      [ ... ]
      [ amj ]
Addition of two matrices of the same size is defined naturally, as is multiplication by a scalar
α:
[αA]ij = α [A]ij
If A is m × n and x a column vector with n entries xi , we have
     [ A1• x ]
Ax = [ A2• x ] = A•1 x1 + A•2 x2 + ... + A•n xn
     [  ...  ]
     [ Am• x ]
Here
                              [ x1 ]
Ai• x = [ ai1 ai2 · · · ain ] [ x2 ] = ai1 x1 + · · · + ain xn
                              [ ... ]
                              [ xn ]
and
         [ a1j xj ]
A•j xj = [ a2j xj ]
         [  ...   ]
         [ amj xj ]
The span of the vectors a1 , . . . , am (all of the same size) is the set of linear combinations of
a1 , . . . , am . It is denoted by sp(a1 , . . . , am ).
The row space of a matrix A is the span of its rows (see also exercise ??).
The column space of a matrix A is the span of its columns.
4.1 Determinants
The determinant of a 1 × 1 matrix A = [a] is simply |A| = a (not |a|).
BA = AB = I
152 CHAPTER 4. DETERMINANTS AND THE CROSS PRODUCT
Or,
x1 = | b1 a12 | / |A|
     | b2 a22 |

x2 = | a11 b1 | / |A|
     | a21 b2 |
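These quotients translate directly into code. The sketch below (cramer2 is a hypothetical helper name) is checked on the system of Example 3.7.1, whose solution (122, −51) was found above via the inverse.

```python
def cramer2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x1 + a12*x2 = b1, a21*x1 + a22*x2 = b2 by the determinant quotients."""
    det = a11 * a22 - a21 * a12
    if det == 0:
        raise ValueError("|A| = 0: Cramer's rule does not apply")
    x1 = (b1 * a22 - b2 * a12) / det    # numerator: b replaces column 1
    x2 = (a11 * b2 - a21 * b1) / det    # numerator: b replaces column 2
    return x1, x2
```

Cramer's rule and the inverse method necessarily agree whenever |A| ≠ 0, since both give the unique solution.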
Proof :
| A2• |   | a21 a22 |
| A1• | = | a11 a12 | = a21 a12 − a11 a22 = − |A|
4.1. DETERMINANTS 153
Proof :
| λA1• |   | λa11 λa12 |
| A2•  | = | a21  a22  | = λa11 a22 − a21 λa12 = λ |A|
Similarly,
| A1•  |
| λA2• | = λ |A|
4. Let C = [ c1 c2 ]; then we have the expansion
| A1• + C |   | A1• |   | C   |
| A2•     | = | A2• | + | A2• |
Proof :
| A1• + C |   | a11 + c1  a12 + c2 |
| A2•     | = | a21       a22      | = (a11 + c1 )a22 − a21 (a12 + c2 )
= a11 a22 − a21 a12 + c1 a22 − a21 c2
  | a11 a12 |   | c1  c2  |   | A1• |   | C   |
= | a21 a22 | + | a21 a22 | = | A2• | + | A2• |
6. Adding a multiple of one row to another row leaves the determinant invariant:
| A1•        |
| λA1• + A2• | = |A|
Proof : By (4) and (5),
| A1•        |   | A1•  |   | A1• |
| λA1• + A2• | = | λA1• | + | A2• | = 0 + |A| = |A|
8. The above properties hold for columns in place of rows. In No.1 of Exercise 91 you are
asked to formally state and prove this.
4.1.5 3 × 3 determinants
4.1.6 The search for a formula solving simultaneous equations
Let us try to solve formally
[ a11 a12 a13 ] [ x1 ]   [ b1 ]
[ a21 a22 a23 ] [ x2 ] = [ b2 ]
[ a31 a32 a33 ] [ x3 ]   [ b3 ]
The detached array for this system is
Multiply numerator and denominator of (4.4) by a11² and use property 3 of subsection 4.1.4
above:
     | a11 b2 − a21 b1   a23 a11 − a21 a13 |   | a11 a22 − a21 a12   a23 a11 − a21 a13 |
x2 = | a11 b3 − a31 b1   a33 a11 − a31 a13 | / | a11 a32 − a31 a12   a33 a11 − a31 a13 |
Consider the denominator:
(a11 a22 − a21 a12 )(a33 a11 − a31 a13 ) − (a11 a32 − a31 a12 )(a23 a11 − a21 a13 )
= a11 a11 a22 a33 − a11 a13 a22 a31 − a11 a12 a21 a33 + a12 a21 a31 a13
−a11 a11 a23 a32 + a11 a13 a21 a32 + a11 a12 a23 a31 − a12 a13 a21 a31
= a11 (a11 a22 a33 − a13 a22 a31 − a12 a21 a33 − a11 a23 a32 + a13 a21 a32 + a12 a23 a31 ).
The numerator for x2 is found by substituting b1 for a12 , b2 for a22 and b3 for a32 :
a11 (a11 b2 a33 − a13 b2 a31 − b1 a21 a33 − a11 a23 b3 + a13 a21 b3 + b1 a23 a31 ).
Finally, we find
a11 b2 a33 − a13 b2 a31 − b1 a21 a33 − a11 a23 b3 + a13 a21 b3 + b1 a23 a31
x2 = (4.5)
a11 a22 a33 − a13 a22 a31 − a12 a21 a33 − a11 a23 a32 + a13 a21 a32 + a12 a23 a31
Definition:
The denominator of equation (4.5) is defined as the determinant |A| of the 3 × 3 matrix A:
|A| = a11 a22 a33 − a13 a22 a31 − a12 a21 a33 − a11 a23 a32 + a13 a21 a32 + a12 a23 a31 (4.6)
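Definition (4.6) is a sum of six signed products, one for each permutation of the column indices. The sketch below implements that sum for any n, computing the sign of a permutation by counting inversions, and reproduces determinants computed elsewhere in these notes.

```python
from itertools import permutations
from math import prod

def det(A):
    """Sum over permutations s of sign(s) * a[0][s(0)] * ... * a[n-1][s(n-1)]."""
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        # sign = (-1)^(number of inversions of s)
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])
        total += (-1) ** inversions * prod(A[i][s[i]] for i in range(n))
    return total
```

For n = 2 this reduces to a11 a22 − a12 a21, and for n = 3 to the six terms of (4.6); the cost grows like n!, so this is a definition, not a practical algorithm, and row reduction is preferred for larger matrices.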
4.1.7 Discussion
Each term of (4.6) is of the form
±a1i a2j a3k
The column indices i, j, and k form an arrangement (permutation) ijk of the numbers 1, 2,
and 3. In fact all six permutations occur and each has its own sign according to the scheme:
+ − +
− + −
+ − +
= a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a22 a31 )
= a11 | a22 a23 | − a12 | a21 a23 | + a13 | a21 a22 |
      | a32 a33 |       | a31 a33 |       | a31 a32 |
Hence
|A| = a11 A1|1 + a12 A1|2 + a13 A1|3
Expansion by row 2:
|A| = a11 a22 a33 − a11 a23 a32 + a12 a23 a31 − a12 a21 a33 + a13 a21 a32 − a13 a22 a31
= −a21 (a12 a33 − a13 a32 ) + a22 (a11 a33 − a13 a31 ) − a23 (a11 a32 − a12 a31 )
= −a21 | a12 a13 | + a22 | a11 a13 | − a23 | a11 a12 |
       | a32 a33 |       | a31 a33 |       | a31 a32 |
Hence
|A| = a21 A2|1 + a22 A2|2 + a23 A2|3
Expansion by row 3:
|A| = a11 a22 a33 − a11 a23 a32 + a12 a23 a31 − a12 a21 a33 + a13 a21 a32 − a13 a22 a31
= a31 (a12 a23 − a13 a22 ) − a32 (a11 a23 − a13 a21 ) + a33 (a11 a22 − a12 a21 )
= a31 | a12 a13 | − a32 | a11 a13 | + a33 | a11 a12 |
      | a22 a23 |       | a21 a23 |       | a21 a22 |
Hence:
|A| = a31 A3|1 + a32 A3|2 + a33 A3|3
Solution:
Proof :
|A| = a11 a22 a33 − a13 a22 a31 − a12 a21 a33 − a11 a23 a32 + a13 a21 a32 + a12 a23 a31
= a11 a22 a33 − a31 a22 a13 − a21 a12 a33 − a11 a32 a23 + a21 a32 a13 + a31 a12 a23 .
Writing bij = aji (so that B = AT ), this equals
b11 b22 b33 − b13 b22 b31 − b12 b21 b33 − b11 b23 b32 + b12 b23 b31 + b13 b21 b32
= |B|
This proves the result.
Hence, if we interchange A2• and A3• , the 2 × 2 determinants will change sign and so
will |A| .
3. Multiplying a row of A by a scalar multiplies the determinant by the same scalar:
| λA1• |   | A1•  |   | A1•  |
| A2•  | = | λA2• | = | A2•  | = λ |A|
| A3•  |   | A3•  |   | λA3• |
Proof :
Consider for example multiplying row 2 by λ. Expand the result by row 2:
| A1•  |
| λA2• | = (λa21 )A2|1 + (λa22 )A2|2 + (λa23 )A2|3 = λ (a21 A2|1 + a22 A2|2 + a23 A2|3 ) = λ |A|
| A3•  |
Proof :
| A1• + B |
| A2•     | = (a11 + b1 )A1|1 + (a12 + b2 )A1|2 + (a13 + b3 )A1|3
| A3•     |
= a11 A1|1 + a12 A1|2 + a13 A1|3 + b1 A1|1 + b2 A1|2 + b3 A1|3
  | A1• |   | B   |
= | A2• | + | A2• |
  | A3• |   | A3• |
Using property (2) this can be proved directly from the first result as follows
| A1•     |     | A2• + B |     | A2• |   | B   |   | A1• |   | A1• |
| A2• + B | = − | A1•     | = − | A1• | − | A1• | = | A2• | + | B   |
| A3•     |     | A3•     |     | A3• |   | A3• |   | A3• |   | A3• |
First suppose that two rows are equal, say A1• = A2• . Then interchanging rows 1 and
2 of A changes the sign of the determinant by property (2). Hence |A| = − |A| and so
|A| = 0. Now let A1• = λA2• . By property (3), |A| is λ times a determinant with equal
rows 1 and 2 and so vanishes.
6. Adding a multiple of one row to a different row leaves |A| unchanged, e.g.
| A1•        |
| λA1• + A2• | = |A|
| A3•        |
Proof :
Use property 5:
| A1•        |   | A1• |   | A1•  |   | A1• |
| λA1• + A2• | = | A2• | + | λA1• | = | A2• | + 0 = |A|
| A3•        |   | A3• |   | A3•  |   | A3• |
(c) Multiplying a column of A by a scalar multiplies the determinant by the same scalar.
(d) If one column of A is a multiple of another, then |A| = 0.
(e) Adding a multiple of one column to another leaves |A| unchanged.
Simplification:
We may also simplify the evaluation of the determinant by adding a multiple of a row to
another:
| 1  3  −1 |
| 2  4   0 |
| −5 −4  0 |   (−3)R1 + R3
Expanding by the third column gives
|A| = (−1)^{1+3} (−1) | 2  4 | = −12
                      | −5 −4 |
For columns:
Proof
We know that if i = j in (4.10) the left-hand side is just the expansion of |A| by row i.
Now let i 6= j, say i = 2 and j = 1. Then
This means that we can first find the matrix B with [B]ij = Ai|j and then adj(A) = B T .
Solution:
|A| = a21 A2|1 + a22 A2|2 + a23 A2|3
    = 0 + | 2 3 | − (−1) | 2 4 | = 3
          | 3 7 |        | 3 5 |
As |A| ≠ 0, A−1 exists. We find
A1|1 = | 1 −1 | = 12,   A1|2 = − | 0 −1 | = −3,   A1|3 = | 0 1 | = −3
       | 5  7 |                  | 3  7 |                | 3 5 |
A2|1 = − | 4 3 | = −13,   A2|2 = | 2 3 | = 5,   A2|3 = − | 2 4 | = 2
         | 5 7 |                 | 3 7 |                 | 3 5 |
A3|1 = | 4  3 | = −7,   A3|2 = − | 2  3 | = 2,   A3|3 = | 2 4 | = 2
       | 1 −1 |                  | 0 −1 |               | 0 1 |
1 −1 0 1
Hence
                              [ A1|1 A1|2 A1|3 ]T
A−1 = (1/|A|) adj A = (1/|A|) [ A2|1 A2|2 A2|3 ]
                              [ A3|1 A3|2 A3|3 ]

            [ 12  −3 −3 ]T         [ 12 −13 −7 ]
    = (1/3) [ −13  5  2 ]  = (1/3) [ −3   5   2 ]
            [ −7   2  2 ]          [ −3   2   2 ]
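The adjoint method can also be checked in code. The matrix A used below is inferred from the cofactors worked above (the text does not restate it at this point, so treat it as an assumption); the sketch builds the cofactor matrix, transposes it, and divides by |A|.

```python
from fractions import Fraction as F

def cofactor(A, i, j):
    """A_{i|j}: (-1)^(i+j) times the 2x2 minor with row i and column j deleted (0-based)."""
    m = [[A[r][c] for c in range(3) if c != j] for r in range(3) if r != i]
    return (-1) ** (i + j) * (m[0][0] * m[1][1] - m[0][1] * m[1][0])

def inverse_by_adjoint(A):
    d = sum(A[0][j] * cofactor(A, 0, j) for j in range(3))  # |A| by expansion along row 1
    # adj(A) is the TRANSPOSE of the matrix of cofactors, so entry (i, j) uses cofactor (j, i)
    return [[F(cofactor(A, j, i), d) for j in range(3)] for i in range(3)]

A = [[2, 4, 3], [0, 1, -1], [3, 5, 7]]   # assumed matrix, reconstructed from the cofactors above
Ainv = inverse_by_adjoint(A)
```

With this A, all nine cofactors and |A| = 3 match the hand computation, and A multiplied by the result gives the identity.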
Ai|j = (−1)^{i+j} times the determinant formed by crossing out row i and column j
+ − + −
− + − +
+ − + −
− + − +
Solution:
      | 3  −2 −3  6 |
|A| = | 1  −2  2  4 |
      | −1  3  4 −9 |
      | 0   1  0  0 |
Here we have added column 2 to column 3 and (−2)(column 2) to column 4. Expanding
by row 4:
                 | 3  −3  6 |
|A| = (−1)^{4+2} | 1   2  4 |
                 | −1  4 −9 |
    | 0 −3  0 |
  = | 3  2  8 | = (−1)^{1+2} (−3) | 3  8 | = −81
    | 3  4 −1 |                   | 3 −1 |
We have added column 2 to column 1 and 2(column 2) to column 3, then expanded by row 1.
1. Using property (1) for 2 × 2 determinants, state and prove similar results for columns in
(2)-(6).
2. Evaluate the determinants of the matrices in Chapter 3, Exercise 84, No. 1. Find the
inverses of the matrices by the adjoint method if they exist.
The values of the determinants are:
2 1
|A| = =1
3 2
3 2 1
|B| = 0 2 2 = −6
0 0 −1
1 0 4
|Y | = −1 1 −2 = −13
2 3 1
       | 1/3   −1/3  −13/3 |
|T | = | −1/3   5/6   10/3 | = 13/6
       | 2/3    5/6   4/3  |
1 6 2
|C| = −2 3 5 = 0
7 12 −4
1 3 −1
|D| = 2 0 −3 = 1
−1 4 2
3. Evaluate the determinants
1 −3 0 −2
3 −12 −2 −6
(a)
−2 10 2 5
−1 6 1 3
Solution: The value of the determinant is −1.
−3 7 8 4
2 1 0 3
(b)
6 −1 −2 −1
−3 4 2 11
Solution: The value of the determinant is 24.
−3 7 8 4 −7
2 1 0 3 −1
(c) 6 −1 −2 −1 1
−3 4 2 11 −4
−3175 − 25 17 59 12
2
5. Let A be a 3 × 3 matrix and k a scalar. Show that |kA| = k 3 |A|. What do you think the
general result is?
Solution:
Let A be n × n. Since in kA each row of A gets multiplied by k, |kA| = k^n |A|.
6. For the 3 × 3 identity I = I3×3 , show that |I| = 1. Again, what do you think the general
result is?
7. Suppose the n × n matrix A is not invertible. Show that A(adj A) is the zero matrix.
Solution:
Since A is not invertible, |A| = 0 by 4.1.18. The result follows from equation (4.12).
8. State and prove Cramer’s rule for 3 equations in 3 unknowns. Do the same for n equations
in n unknowns, using the fact that higher-order determinants have the same properties as
2 × 2 and 3 × 3 determinants.
9. (Jacobi)
Let the n × n matrix A have a non-zero determinant. Show that aij = |A| (A−1 )j|i .
10. (a) Let A = [aij ] be a 3 × 3 matrix and β a scalar. Replace the entry a23 with β and let
the resulting matrix be B. Show that |B| = |A| + (β − a23 ) A2|3 , and find a criterion
for B to be invertible.
(b) Generalize the previous result.
12. Let |A| be a 3 × 3 determinant. We can permute the rows A1• , A2• and A3• in six ways.
For which of these does the determinant keep its sign and for which does it change its
sign?
State and prove a similar result for the columns A•1 , A•2 and A•3 of A.
Solution:
If the order of the rows is ijk, then the determinant of A gets multiplied by the sign of
ijk. A similar result holds for permutations of the columns of A. The permutations, other
than 123, keeping the sign of the determinant are 312 and 231.
a × b = | a2 a3 | i − | a1 a3 | j + | a1 a2 | k     (4.14)
        | b2 b3 |     | b1 b3 |     | b1 b2 |
      = ( | a2 a3 | , − | a1 a3 | , | a1 a2 | )
          | b2 b3 |     | b1 b3 |   | b1 b2 |
This is sometimes called the vector product and is used constantly in mechanics.
Treating i, j and k as though they are scalars (which they are not), the cross product can be
remembered by the formula
        | i  j  k  |
a × b = | a1 a2 a3 |
        | b1 b2 b3 |
Example 4.3.1
                          | i  j  k |
(−1, 2, 4) × (2, −3, 1) = | −1 2  4 |
                          | 2 −3  1 |
= | 2  4 | i − | −1 4 | j + | −1  2 | k
  | −3 1 |     | 2  1 |     | 2  −3 |
= (14, 9, −1)
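Formula (4.14) in code, checked against this example:

```python
def cross(a, b):
    """The componentwise 2x2 determinants of (4.14)."""
    return (a[1] * b[2] - a[2] * b[1],      # | a2 a3 ; b2 b3 |
            -(a[0] * b[2] - a[2] * b[0]),   # -| a1 a3 ; b1 b3 |
            a[0] * b[1] - a[1] * b[0])      # | a1 a2 ; b1 b2 |
```

The same three-line function also reproduces j × k = i and the answers of Exercise No.1 below.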
        | i j k |
j × k = | 0 1 0 |
        | 0 0 1 |
= | 1 0 | i − | 0 0 | j + | 0 1 | k = i
  | 0 1 |     | 0 1 |     | 0 0 |

        | i j k |
k × i = | 0 0 1 |
        | 1 0 0 |
= | 0 1 | i − | 0 1 | j + | 0 0 | k = j
  | 0 0 |     | 1 0 |     | 1 0 |
1.
b × a = − (a × b)
Proof :
In b × a we interchange the rows in the cofactors defining a × b. Since these all change
sign, the result follows.
2. (The triple scalar product)
c · (a × b) = b · (c × a) = a · (b × c) (4.15)
Proof :
First note that
c · (a × b) = c1 | a2 a3 | − c2 | a1 a3 | + c3 | a1 a2 |     (4.16)
                 | b2 b3 |      | b1 b3 |      | b1 b2 |
              | c1 c2 c3 |
            = | a1 a2 a3 |
              | b1 b2 b3 |
Here we have expanded the determinant by its first row. The other properties follow by
rearranging the rows of the above determinant according to the permutations bca and abc
of cab. These are all even and so the other determinants are all equal. (See Exercise 91,
No.12).
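The equalities in (4.15) can be spot-checked numerically; the three sample vectors below are arbitrary made-up data.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a, b, c = (1, 2, 0), (0, 1, 3), (2, -1, 1)   # arbitrary sample vectors
triple = dot(c, cross(a, b))                 # the 3x3 determinant with rows c, a, b
```

The three cyclic rearrangements c · (a × b), b · (c × a) and a · (b × c) all give the same value, as the even-permutation argument predicts.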
4.3. THE CROSS PRODUCT A × B 167
3.
a · (a × b) = b · (a × b) = 0
Proof :
4.
β(a × b) = (βa) × b = a × (βb)
5.
(β + γ)(a × b) = β(a × b) + γ(a × b)
6.
(αa + βb) × c = (αa × c) + (βb × c)
a × b = (|a||b| sin θ) ĉ

Figure 4.1
Suppose that 0 < θ < π is the angle between the non-zero and non-parallel vectors a and b.
Let ĉ be the unit vector perpendicular to a and b, with direction found by using the corkscrew
rule: rotate from a to b through θ; then the direction of ĉ goes the way a corkscrew would go.
See Chapter 1, subsection 1.1.1 and Figure 4.1.
Geometric Interpretation
a × b = (|a||b| sin θ) ĉ     (4.17)
We have
|a × b|² + (a · b)²
= | a2 a3 |² + | a1 a3 |² + | a1 a2 |² + (a1 b1 + a2 b2 + a3 b3 )²
  | b2 b3 |    | b1 b3 |    | b1 b2 |
= (a2 b3 − a3 b2 )² + (a1 b3 − a3 b1 )² + (a1 b2 − a2 b1 )² + (a1 b1 + a2 b2 + a3 b3 )²
= a2²b3² + a3²b2² + a1²b3² + a3²b1² + a1²b2² + a2²b1² + a1²b1² + a2²b2² + a3²b3²
= (a1² + a2² + a3²)(b1² + b2² + b3²) = |a|² |b|²
It follows that
|a × b|²/(|a|² |b|²) = 1 − (a · b)²/(|a|² |b|²) = 1 − cos² θ = sin² θ
Hence |a × b| = |a||b| sin θ. Since a × b is parallel to ĉ, we have a × b = ± (|a||b| sin θ) ĉ.
If a = i and b = j, then from Example 4.3.2 we have from the Corkscrew Rule ĉ = k and
i × j = +k. This suggests that the corkscrew rule applied to any cross product a × b gives the
same direction as ĉ, and we will be satisfied with this. Equation (4.17) now follows.
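The algebraic identity |a × b|² + (a · b)² = |a|²|b|² used in this derivation can be spot-checked on arbitrary sample vectors:

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            -(a[0] * b[2] - a[2] * b[0]),
            a[0] * b[1] - a[1] * b[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a, b = (3, -1, 2), (1, 4, -2)   # arbitrary sample vectors
lhs = dot(cross(a, b), cross(a, b)) + dot(a, b) ** 2   # |a x b|^2 + (a.b)^2
rhs = dot(a, a) * dot(b, b)                            # |a|^2 |b|^2
```

Since the computation is over integers, the two sides agree exactly, with no rounding involved.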
From Example 4.3.2 we see how the geometric interpretation gives, besides i × j = k, also
the other products j × k = i and k × i = j.
Compare this with 1.1.1 on right-handed systems in Chapter 1.
1. Find (−1, 5, 7) × (2, −3, 4) and (−2, 4, −3) × (6, −1, 2).
Solution:
(−1, 5, 7) × (2, −3, 4) = (41, 18, −7)
(−2, 4, −3) × (6, −1, 2) = (5, −14, −22)
3. Consider three points A, B and C in space. If they are not collinear they form a triangle.
Show that its area is (1/2)|AB × AC|.
Solution:
If θ is the angle between the vectors AB and AC, the area of the triangle is
(1/2)|AB| |AC| sin θ = (1/2)|AB × AC|
4. Find the area of the triangle with vertices A = (3, 4, 5), B = (4, 4, 6), C = (5, 5, 4).
Solution:
AB = (4, 4, 6) − (3, 4, 5) = (1, 0, 1), AC = (5, 5, 4) − (3, 4, 5) = (2, 1, −1). The area is
(1/2)|(1, 0, 1) × (2, 1, −1)| = (1/2)|(−1, 3, 1)| = √11/2
5. Show that if the non-zero vectors a and b are at right angles then |a × b| = |a||b|. What
of the converse?
Solution:
Use the geometric interpretation with θ = π/2.
6. Let l and m be unit vectors and put n = l × m. Using the geometric interpretation of the
cross product show that l, m and n form a right-handed system (see 1.1.1). In fact,
Hint 93 For property 7, consider the case when a and b are both non-zero. If they are
parallel, use properties of 2 × 2 determinants in 4.1.4 to show that the cross product is
zero. For the converse use the geometric interpretation (4.17) to show that a × b ≠ 0
when the vectors are not parallel.
8. Let a, b and c be three vectors in ℜ3 . Show that they are linearly independent if, and only
if, the triple scalar product a · (b × c) is not zero.
9. Consider the parallelogram ABCD in Figure 4.2. The line EF passing through Q is
D H C
E Q F
A G B
Figure 4.2
parallel to AB and the line GH passing through Q is parallel to AD. Using vector methods
show that the parallelograms EQHD and GBF Q have equal areas if, and only if, Q lies
on the diagonal AC. Conclude that this is the case if, and only if, the parallelograms
AGQE and QF CH are similar. (A theorem going back to Euclid).
Solution:
We have AQ = AE + EQ and QC = QF + F C. Hence, using the fact that the cross
product of parallel vectors is zero,
AQ × QC = AE × QF + EQ × F C
= AE × QF − F C × EQ
The product AQ × QC is zero if, and only if, the vectors are parallel; equivalently, the
points A, Q and C are collinear. This is the case if, and only if, EQHD and GBF Q
have equal areas.
10. A parallelepiped is a ‘squashed box’: it has six faces with opposite faces congruent paral-
lelograms. Let u = AD, v = AB, and w = AE be three of its linearly independent edges.
Show that its volume is |(u × v) · w|.
Hint 95 Let its base be ABCD. This has area α = |AD × AB|. The height h of the
parallelepiped is the absolute value of the component of AE along AD × AB. The volume
is then hα. Make a drawing.
11. * (The triple vector product). Let a, b and c be three given vectors. Show that
a × (b × c) = (a · c) b − (a · b) c
Hint 96 Show the result is true if a is i, j or k. Now use the result of No.2.
13. * (Continuation) Consider an axis ξ passing through Q and having the same direction
as a given unit vector l. Let D and E be points on the line of action of F and the
axis ξ respectively such that the distance |ED| is minimum. (Thus ED, if not zero,
is perpendicular to both lines.) Let m be the unit vector with direction ED, so that
ED = µm, where µ = |ED|. Put n = l × m. In the sense of the right-hand rule, the
torque of F about the axis ξ is the component F · n times the distance µ = |ED| (draw
a picture to see this).
Show that this component is the triple scalar product M · l = QP × F · l.
Show further that if P ′ and Q′ are any points on the line of action of F and the axis ξ
respectively, then
Q′ P ′ × F · l = QP × F · l
This shows that the torque of a force about an axis is purely a property of the force with
its line of action and the axis and does not depend on selected points on these lines.
Solution:
See the figure for this question. The vectors l, m and n form a right-handed system
(see No.6 above). Hence we can express F as a linear combination of these unit vectors:
F = αl + γn. (The coefficient of m is zero as this vector and the force are perpendicular).
Using QD = λl + µm for some λ, we get
Figure 4.3 Torque about an axis
4.4.2 2 × 2 determinants
In Exercise 3a of Chapter 2 we introduced the idea of the determinant: |A| = a11 a22 − a21 a12 .
In subsection 4.1.4 we developed some basic properties of 2 × 2 determinants.
4.4.3 3 × 3 determinants
For a 3 × 3 matrix A, its determinant is given by equation (4.7):
|A| = Σσ sgn(σ) a1σ1 a2σ2 a3σ3
Here the sum extends over all 3! = 6 permutations σ = (σ1, σ2, σ3) of 1, 2, 3 and sgn(σ) is
the sign of the permutation σ.
A similar definition holds for an n × n matrix A.
a × b = (|a||b| sin θ) ĉ
Here 0 < θ < π is the angle between the non-zero and non-parallel vectors a and b and ĉ is the
unit vector perpendicular to a and b with direction found by using the corkscrew rule: rotate
from a to b through θ.
The cross product can also be seen geometrically as an area (exercise 3) and the triple scalar
product as a volume (exercise 10).