LECTURE 1
Books:
S. Lipschutz – Schaum’s Outline of Linear Algebra
S.I. Grossman – Elementary Linear Algebra
Key example
Rn , the space of n-tuples of real numbers, with elements u = (u1 , . . . , un ).
If u = (u1 , . . . , un ) and v = (v1 , . . . , vn ), then u + v = (u1 + v1 , . . . , un + vn ).
Also if a ∈ R, then au = (au1 , . . . , aun ).
These operations satisfy the familiar rules, for example:
• a(u + v) = au + av and
• (a + b)u = au + bu (these are the distributive laws);
• 1u = u (identity property).
Note our vectors were in Rn , our scalars in R. Instead of R we can use any set
in which we have all the usual rules of arithmetic (a field).
Examples
(a) V = F n , where F = Q, R, C or F2 .
(b) V = Mm,n , all m × n matrices with entries in F .
(c) V = Pn , polynomials of degree at most n, i.e., p(t) = a0 + a1 t + . . . + an tn ,
with a0 , a1 , . . . , an ∈ F .
(d) V = F X . Let X be any set; then F X is the collection of functions from X
into F .
Define (f + g)(x) = f (x) + g(x) and (af )(x) = af (x), for f, g ∈ V and a ∈ F .
The proofs are mostly omitted, but are short. For example, a0 = a(0 + 0) = a0 + a0.
Add −(a0) to both sides and we get 0 = a0 + a0 + (−(a0)) = a0 + 0 = a0, so a0 = 0.
LECTURE 2
Subspaces
1.4 Definition
Let V be a vector space over a field F and W a subset of V . Then W is a subspace
if it satisfies:
(i) 0 ∈ W .
(ii) For all v, w ∈ W we have v + w ∈ W .
(iii) For all a ∈ F and w ∈ W we have aw ∈ W .
That is, W contains 0 and is closed under the vector space operations. It’s easy
to see that then W is also a vector space, i.e., satisfies the properties of (1.1).
For example −w = (−1)w ∈ W if w ∈ W .
1.5 Examples
(i) Every vector space V has two trivial subspaces, namely {0} and V .
(ii) Take any v ∈ V , not the zero vector. Then span{v} = {av : a ∈ F } is a
subspace.
For example, in R2 we get a line through the origin [DIAGRAM]. These are the
only subspaces of R2 apart from the trivial ones.
(iii) In R3 we have the possibilities in (i) and (ii) above, but we also have planes
through the origin, e.g.,
W = {(x, y, z) ∈ R3 : x − 2y + 3z = 0}.
The general solution is obtained by fixing y and z, and then x is uniquely deter-
mined, e.g., z = a, y = b and x = −3a + 2b. So
W = {(−3a + 2b, b, a) : a, b ∈ R}
= {a(−3, 0, 1) + b(2, 1, 0) : a, b ∈ R}.
So we can see W either as all vectors orthogonal to (1, −2, 3), or all “linear
combinations” of (−3, 0, 1) and (2, 1, 0) (two parameters).
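The two descriptions of W can be checked numerically. Here is a small NumPy sketch (illustrative, not part of the notes) verifying that the spanning vectors found above are orthogonal to the normal vector (1, −2, 3):

```python
import numpy as np

# The plane W = {(x, y, z) : x - 2y + 3z = 0} has normal (1, -2, 3);
# the spanning vectors found by the parametrisation should be orthogonal to it.
normal = np.array([1, -2, 3])
w1 = np.array([-3, 0, 1])
w2 = np.array([2, 1, 0])

# Both spanning vectors satisfy the defining equation (dot product is 0).
print(normal @ w1, normal @ w2)   # 0 0

# A general linear combination a*w1 + b*w2 also lies in W, e.g. a = 2, b = -1:
v = 2 * w1 - 1 * w2
print(normal @ v)                 # 0
```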
1.6 Definition
Given a set S of vectors in V , the smallest subspace of V containing S is written
W = span(S) or lin(S), and called the linear span of S.
It consists of all linear combinations a1 s1 + a2 s2 + . . .+ an sn , where a1 , . . . , an ∈ F
and s1 , . . . , sn ∈ S. It includes 0, the “empty combination”.
Note that all these combinations must lie in any subspace containing S, and if
we add linear combinations or multiply by scalars, we still get a combination. So
this is the smallest subspace containing S.
1.7 Proposition
Let V be a vector space over F , and let U and W be subspaces of V . Then U ∩W
is also a subspace of V .
Proof: (i) 0 ∈ U ∩ W , since 0 ∈ U and 0 ∈ W .
(ii) If u, v ∈ U ∩ W , then u + v ∈ U and u + v ∈ W , since each of u and v are,
so u + v ∈ U ∩ W .
(iii) Similarly if a ∈ F and u ∈ U ∩ W , then au ∈ U and au ∈ W so au ∈ U ∩ W .
However, U ∪ W doesn’t need to be a subspace. For example, in R2 , take
U = {(x, 0) : x ∈ R} and W = {(0, y) : y ∈ R}. [DIAGRAM]
Then (1, 0) ∈ U ∪ W and (0, 1) ∈ U ∪ W , but their sum is (1, 1) ∉ U ∪ W .
LECTURE 3
Sums of subspaces
1.8 Definition
Let V be a vector space over a field F and U, W subspaces of V . Then
U + W = {u + w : u ∈ U, w ∈ W }.
1.9 Proposition
U + W is a subspace of V , and is the smallest subspace containing both U and W .
Proof: (i) 0 = 0 + 0 ∈ U + W .
(ii) If v1 = u1 + w1 and v2 = u2 + w2 , with u1 , u2 ∈ U and w1 , w2 ∈ W , then
v1 + v2 = (u1 + u2 ) + (w1 + w2 ) ∈ U + W,
since u1 + u2 ∈ U and w1 + w2 ∈ W .
(iii) If v = u + w ∈ U + W and a ∈ F , then
av = au + aw ∈ U + W,
since au ∈ U and aw ∈ W .
Finally, U ⊆ U + W , since each u ∈ U gives u = u + 0 ∈ U + W ; similarly W ⊆ U + W .
Any subspace containing both U and W must contain every sum u + w, so U + W is
the smallest such subspace.
1.10 Definition
In a vector space V with subspaces U and W , we say that U + W is a direct sum,
written U ⊕ W , if U ∩ W = {0}.
Examples
In R3 , let U = {(a, 0, 0) : a ∈ R}, W = {(0, b, 0) : b ∈ R} and T = {(c, d, −c) : c, d ∈ R}.
As above, U ∩ W = {0}, since if (a, 0, 0) = (0, b, 0), then both are 0.
So U ⊕ W = {(a, b, 0) : a, b ∈ R}.
W ∩ T consists of all vectors (0, b, 0) = (c, d, −c) for some b, c, d, which is all
vectors (0, b, 0), i.e., W again.
So W + T is not a direct sum, and the notation W ⊕ T is incorrect here.
1.11 Proposition
V = U ⊕ W if and only if for each v ∈ V there are unique u ∈ U and w ∈ W
with v = u + w.
Proof:
“⇒” Each v can certainly be written as u + w, since V = U + W . The u and w
are unique, since if u1 + w1 = u2 + w2 , then u1 − u2 = w2 − w1 , where u1 − u2 ∈ U
and w2 − w1 ∈ W . So this vector lies in U ∩ W = {0}, giving u1 = u2 and w1 = w2 .
“⇐” If v ∈ U ∩ W , then v = v + 0 = 0 + v gives two ways of writing v as an
element of U plus an element of W ; by uniqueness, v = 0. So U ∩ W = {0}.
LECTURE 4
2.2 Definition
A set of vectors S = {v1 , . . . , vn } is linearly independent if the only solution to
a1 v1 + . . . + an vn = 0 is a1 = a2 = . . . = an = 0.
This is the same as saying that we can’t express any vector in S as a linear
combination of the others.
2.3 Examples:
1) In R3 , the vectors v1 = (1, 0, 0), v2 = (0, 1, 0) and v3 = (0, 0, 1) are indepen-
dent, since
a1 v1 + a2 v2 + a3 v3 = (a1 , a2 , a3 ) = 0 only if a1 = a2 = a3 = 0.
2) In R3 , the vectors v1 = (1, 0, 2), v2 = (1, 1, 0) and v3 = (−1, −2, 2) are linearly
dependent, since v1 − 2v2 − v3 = 0. We can write any vector in terms of the
others, e.g. v3 = v1 − 2v2 .
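The dependence relation in Example 2.3(2) can be confirmed numerically; this NumPy sketch (illustrative, not part of the notes) also checks dependence via the rank of the matrix whose rows are the three vectors:

```python
import numpy as np

# The vectors of Example 2.3(2); v1 - 2*v2 - v3 should be the zero vector.
v1 = np.array([1, 0, 2])
v2 = np.array([1, 1, 0])
v3 = np.array([-1, -2, 2])

print(v1 - 2 * v2 - v3)                        # [0 0 0]

# Rank below 3 means the three vectors are linearly dependent.
rank = np.linalg.matrix_rank(np.array([v1, v2, v3]))
print(rank)                                    # 2
```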
2.4 Definition
A set {v1 , . . . , vn } spans V if every v ∈ V can be written as
v = a1 v1 + . . . + an vn for some a1 , . . . , an ∈ F .
2.5 Examples
1) (1, 0, 0), (0, 1, 0) and (0, 0, 1) span R3 .
2) See after (1.6). The set {(1, 1), (2, 3)} spans R2 (the set of linear combinations
is all of R2 ), whereas {(1, 1), (2, 2)} doesn’t.
2.6 Definition
If a set {v1 , . . . , vn } spans V and is linearly independent, then it is called a basis
of V .
2.7 Proposition
{v1 , . . . , vn } is a basis of V if and only if every v ∈ V can be written as a unique
linear combination v = a1 v1 + . . . + an vn .
Proof:
Suppose that it is a basis. If there were two such ways of writing
v = a1 v1 + . . . + an vn = b1 v1 + . . . + bn vn , then
0 = (a1 − b1 )v1 + . . . + (an − bn )vn , and by linear independence we get
a1 − b1 = 0, . . . , an − bn = 0, which is uniqueness.
Conversely, if we always have uniqueness, we need to show the vectors are inde-
pendent. But if a1 v1 + . . . + an vn = 0, we know already that 0v1 + . . . + 0vn = 0,
and so by uniqueness, a1 = . . . = an = 0, as required.
2.8 Examples
1) Clearly {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is a basis of R3 .
2) Let’s check whether {(1, 1, 1), (1, 1, 0), (1, 0, 0)} is a basis of R3 . We need to
solve x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (a, b, c) for any given a, b, c.
That is
x + y + z = a
x + y = b
x = c,
with solution x = c, y = b − c, z = a − b (solve from the bottom upwards). This
is unique, so it’s a basis.
3) Similarly, let’s check whether {(1, 1, 2), (1, 2, 0), (3, 4, 4)} is a basis of R3 .
We need to solve
x + y + 3z = a
x + 2y + 4z = b
2x + 4z = c.
Row-reduce the augmented matrix:
( 1 1 3 | a )                       ( 1 1 3 | a )
( 1 2 4 | b )  R2 − R1, R3 − 2R1 →  ( 0 1 1 | b − a )
( 2 0 4 | c )                       ( 0 −2 −2 | c − 2a )
and then
            ( 1 1 3 | a )
R3 + 2R2 →  ( 0 1 1 | b − a )
            ( 0 0 0 | c + 2b − 4a ),
which is equivalent to
x + y + 3z = a
y + z = b − a
0 = c + 2b − 4a.
So there is no solution unless c + 2b − 4a = 0: the vectors do not span R3 ,
and they are not a basis.
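The conclusion of the row reduction can be double-checked numerically; this NumPy sketch (illustrative, not part of the notes) shows the coefficient matrix has rank 2, so the three column vectors cannot span R3:

```python
import numpy as np

# Coefficient matrix of the system above: columns are the three vectors tested.
A = np.array([[1, 1, 3],
              [1, 2, 4],
              [2, 0, 4]])

# Rank 2 < 3 confirms the columns do not span R^3, so they are not a basis.
print(np.linalg.matrix_rank(A))   # 2
```

For instance, (a, b, c) = (0, 0, 1) violates the condition c + 2b − 4a = 0, so Ax = (0, 0, 1) has no solution.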
LECTURE 5
We are aiming to show that all bases of a vector space have the same number of
elements.
2.9 Lemma
Let V be a vector space over a field F , and suppose that {u1 , . . . , un } spans V .
Let v ∈ V with v ≠ 0. Then we can replace uj by v, for some j with 1 ≤ j ≤ n,
so that the new set still spans V .
2.10 Theorem
Let V be a vector space, let {v1 , . . . , vk } be an independent set in V , and
let {u1 , . . . , un } be a spanning set. Then n ≥ k and we can delete k of the
{u1 , . . . , un }, replacing them by {v1 , . . . , vk }, so that the new set spans.
2.11 Example
Take V = R3 . Then u1 = (1, 0, 0), u2 = (0, 1, 0), u3 = (0, 0, 1) span, and
v1 = (1, 1, 0) and v2 = (1, 2, 0) are independent.
2.12 Theorem
Let V be a vector space and let {v1 , . . . , vk } and {u1 , . . . , un } be bases of V .
Then k = n.
2.13 Definition
A vector space V has dimension n, if it has a basis with exactly n elements (here
n ∈ {1, 2, 3, . . .}). It has dimension 0, if V = {0}, only.
We call V finite-dimensional in this case, and write dim V = n, where
n ∈ {0, 1, 2, . . .}.
2.14 Examples
(i) F n has dimension n over F , since it has basis
{(1, 0, . . . , 0), (0, 1, 0, . . . , 0), . . . , (0, . . . , 0, 1)}, the standard basis.
2.15 Theorem
Let V be a vector space of dimension n. Then any independent set has ≤ n
elements, and, if it has exactly n, then it is a basis.
Any spanning set has ≥ n elements, and if it has exactly n, then it is a basis.
LECTURE 6
Proof: Let {v1 , . . . , vn } be a basis of V (i.e., spanning and independent). Let
{u1 , . . . , uk } be an independent set. By (2.10), we have k ≤ n. We can replace
k of the v’s by u’s, so that it spans. So if k = n, then {u1 , . . . , un } spans and
hence it is a basis.
Now let {w1 , . . . , wm } be a spanning set. Then (2.10) tells us that m ≥ n.
Suppose now that m = n. If {w1 , . . . , wm } is not independent, then a1 w1 + . . . +
am wm = 0, where at least one ai ≠ 0. But then wi is a linear combination of
the others, so we can delete it and the set of n − 1 remaining vectors still spans.
This is a contradiction, since a spanning set must always be at least as big as
any independent set, by (2.10).
2.16 Examples
In R3 , the set {(1, 2, 3), (0, 1, 0)} is independent, but can’t be a basis, as it has
too few elements to span.
Also, {(1, 2, 3), (4, 5, 6), (0, 1, 0), (0, 0, 1)} spans, but can’t be a basis, as it has
too many elements and is not independent.
2.17 Theorem
Let V be an n-dimensional vector space and W a subspace of V . Then W has
finite dimension, and any basis of W can be extended to a basis for V by adding
in more elements. So if W ≠ V , then dim W < dim V .
Proof:
We suppose WLOG that W ≠ {0} and W ≠ V (those cases are easy). Let {w1 , . . . , wk }
be an independent set in W chosen to have as many elements as possible. (There
are at most n, since dim V = n.) We claim it’s a basis for W .
For any w ∈ W , the set {w, w1, . . . , wk } is not independent, so
a1 w1 + . . . + ak wk + bw = 0, say, with not all the coefficients being zero. But b
can’t be 0, since {w1 , . . . , wk } is independent. So we can write w as a combina-
tion of {w1 , . . . , wk }, and so they span W , and hence form a basis for it.
Now let {v1 , . . . , vn } be a basis for V . By (2.10) we can replace k of the v’s by
w’s and it still spans V . This is the same as extending {w1 , . . . , wk } to a basis
of n elements, since any n-element spanning set for V is a basis, by (2.15).
2.18 Example
Let W = {(x, y, z) ∈ R3 : x − 2y + 4z = 0}, with general solution z = a, y = b,
x = 2b − 4a, i.e., (2b − 4a, b, a) = a(−4, 0, 1) + b(2, 1, 0).
We can extend it to a basis for R3 by adding in something chosen from the basis
{(1, 0, 0), (0, 1, 0), (0, 0, 1)}. Indeed, (1, 0, 0) isn’t in the subspace, so that will do.
Thus {(−4, 0, 1), (2, 1, 0), (1, 0, 0)} is a basis of R3 containing the basis for W .
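The extension in Example 2.18 can be verified numerically; this NumPy sketch (illustrative, not part of the notes) checks that the three vectors form an invertible matrix, hence a basis of R3:

```python
import numpy as np

# Rows: the basis of W extended by (1, 0, 0). Rank 3 (nonzero determinant)
# means these three vectors form a basis of R^3.
B = np.array([[-4, 0, 1],
              [2, 1, 0],
              [1, 0, 0]])

print(np.linalg.matrix_rank(B))        # 3
print(round(abs(np.linalg.det(B))))    # 1, in particular nonzero
```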
2.19 Theorem
Let V be a finite-dimensional vector space and U a subspace of V . Then there is
a subspace W such that V = U ⊕ W .
Proof: Take a basis {u1 , . . . , uk } of U , and extend it to a basis
{u1 , . . . , uk , wk+1 , . . . , wn } of V , by (2.17). Let W = span{wk+1 , . . . , wn }.
If v ∈ V , we can write
v = (a1 u1 + . . . + ak uk ) + (ak+1 wk+1 + . . . + an wn ),
with the first bracket in U and the second in W , for some a1 , . . . , an ∈ F .
So V = U + W .
Moreover, if v ∈ U ∩ W , then writing v in terms of the basis in two ways (once
using only the u’s, once using only the w’s) and using uniqueness of coordinates
(2.7) shows v = 0. So U ∩ W = {0} and V = U ⊕ W .
LECTURE 7
2.20 Example
Let V = R3 and U = {a(1, 2, 3) + b(1, 0, 6) : a, b ∈ R}, a plane through 0.
For W we can take any line not in the plane that passes through 0, e.g. the x-axis
{c(1, 0, 0) : c ∈ R}. The complement is not unique.
3 Linear mappings
3.1 Definition
Let U, V be vector spaces over F . Then a mapping T : U → V is called a linear
mapping, or linear transformation, if:
(i) T (u1 + u2 ) = T (u1 ) + T (u2 ) for all u1 , u2 ∈ U ;
(ii) T (au) = aT (u) for all a ∈ F and u ∈ U .
3.2 Examples
(i) Let A be an m × n matrix of real numbers. Then we define T : Rn → Rm by
y = T (x) = Ax, i.e.,
( y1 )   ( a11 . . . a1n ) ( x1 )
( .. ) = ( ..         .. ) ( .. )
( ym )   ( am1 . . . amn ) ( xn ),
where the sizes are (m × 1), (m × n) and (n × 1) respectively.
We shall see that for finite-dimensional vector spaces, all linear mappings can
be represented by matrices, once we have chosen bases for the spaces U and V
involved.
3.3 Definition
Let T : U → V be a linear mapping between vector spaces.
The null-space, or kernel, of T is ker T = {u ∈ U : T (u) = 0}, and is a subset of
U.
The image, or range, of T is im T = T (U) = {T (u) : u ∈ U}, and is a subset of V .
3.4 Proposition
Let T : U → V be a linear mapping between vector spaces. Then ker T is a
subspace of U and im T is a subspace of V .
Proof: (i) Start with ker T . Note that T (0) = T (0 + 0) = T (0) + T (0), by
linearity, and this shows that T (0) = 0. So 0 ∈ ker T .
If u1 , u2 ∈ ker T , then T (u1 + u2 ) = T (u1 ) + T (u2 ) (linearity), which equals 0 + 0
or 0. So u1 + u2 ∈ ker T .
If u ∈ ker T and a ∈ F , then T (au) = aT (u) (linearity), which equals a0 or 0.
So au ∈ ker T .
Hence ker T is a subspace of U.
(ii) For im T : we have 0 = T (0) ∈ im T ; if v1 = T (u1 ) and v2 = T (u2 ) lie in
im T , then v1 + v2 = T (u1 + u2 ) ∈ im T ; and if a ∈ F , then av1 = T (au1 ) ∈ im T .
Hence im T is a subspace of V .
3.5 Definition
For T : U → V linear, the nullity of T is dim(ker T ), and written n(T ).
The rank of T is dim(im T ), and written r(T ).
LECTURE 8
3.6 Theorem
Let U, V be vector spaces over F and T : U → V linear. If U is finite-dimensional,
then r(T ) + n(T ) = dim U.
Proof: Let {w1 , . . . , wk } be a basis of ker T , and extend it to a basis
{w1 , . . . , wk , uk+1 , . . . , un } of U, by (2.17). We claim that {T (uk+1 ), . . . , T (un )}
is a basis of im T ; this gives n(T ) = k and r(T ) = n − k, so r(T ) + n(T ) = n = dim U.
Spanning. If v ∈ im T , then v = T u for some u ∈ U, and we can find
b1 , . . . , bn ∈ F such that u = b1 w1 + . . . + bk wk + bk+1 uk+1 + . . . + bn un , us-
ing our basis for U.
Apply T , and we get v = bk+1 T (uk+1 ) + . . . + bn T (un ), since the T (wi ) are all 0.
So the set spans im T .
Independence. If ck+1 T (uk+1 ) + . . . + cn T (un ) = 0, then T (ck+1 uk+1 + . . . + cn un ) = 0,
so ck+1 uk+1 + . . . + cn un ∈ ker T and is therefore a linear combination of
w1 , . . . , wk . By the independence of the whole basis of U, all the coefficients are
0, so ck+1 = . . . = cn = 0.
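The rank-nullity theorem (3.6) can be illustrated numerically; the following NumPy sketch uses a sample matrix (chosen for illustration, not taken from the notes) and checks that rank plus nullity equals the dimension of the domain:

```python
import numpy as np

# Sample 2x3 matrix, so T : R^3 -> R^2. (Illustrative choice.)
A = np.array([[1, 1, 1],
              [2, 2, 2]])

r = np.linalg.matrix_rank(A)    # r(T) = dim im T
n_minus_r = A.shape[1] - r      # n(T) = dim ker T, by Theorem 3.6

print(r, n_minus_r)             # 1 2, and 1 + 2 = 3 = dim U
```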
4.2 Example
If e1 = (1, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, . . . , 0, 1) is the standard
basis of Rn , then x = (x1 , . . . , xn ) has coordinates x1 , . . . , xn .
Note yi = ai1 x1 + . . . + ain xn for i = 1, . . . , m, where A has entries aij
(i = 1, . . . , m; j = 1, . . . , n).
4.4 Proposition
Let U, V, u1 , . . . , un , v1 , . . . , vm be as in (4.3). Any map T for which the
coordinates y1 , . . . , ym of T (x) with respect to the v’s are given by
( y1 )     ( x1 )
( .. ) = A ( .. )
( ym )     ( xn )
(where x1 , . . . , xn are the coordinates of x with respect to the u’s) is linear,
since
( x1 + x′1 )     ( x1 )     ( x′1 )
A (   ..   ) = A ( .. ) + A ( ..  )
( xn + x′n )     ( xn )     ( x′n )
and similarly A(ax) = a(Ax).
How do we find A? If T (u1 ) = b1 v1 + . . . + bm vm , then (b1 , b2 , . . . , bm ) is the
first column of A, since applying A to the coordinate vector (1, 0, . . . , 0) of u1
picks out the first column. Similarly, applying A to the ith standard coordinate
vector (0, . . . , 0, 1, 0, . . . , 0), with 1 in the ith place, gives the ith column
(c1 , c2 , . . . , cm ), the coordinates of T (ui ).
LECTURE 9
Example. Suppose that T : U → V , where U has basis {u1 , u2 } and V has basis
{v1 , v2 , v3 }, and that T (u1 ) = 3v1 + 4v2 + 5v3 , while T (u2 ) = v1 + v2 + 9v3 .
Then we fill in the columns to get
    ( 3 1 )
A = ( 4 1 ),
    ( 5 9 )
and note that A applied to (1, 0) gives (3, 4, 5), the first column, while A applied
to (0, 1) gives (1, 1, 9), the second.
Note that the identity mapping I : U → U with I(u) = u corresponds to the
identity matrix (with 1’s on the diagonal and 0’s elsewhere) of size dim U ,
written I, or In if it is n × n.
This gives us:
4.5 Proposition
The matrix A representing T with respect to u1 , . . . , un and v1 , . . . , vm is the
one whose ith column is (a1i , a2i , . . . , ami ), where
T (ui ) = a1i v1 + a2i v2 + . . . + ami vm .
Proof: For a typical vector x = x1 u1 + . . . + xn un , we have
T (x) = x1 T (u1 ) + . . . + xn T (un )   (by linearity)
      = y1 v1 + . . . + ym vm ,
where yj = aj1 x1 + aj2 x2 + . . . + ajn xn for j = 1, . . . , m, i.e.,
( y1 )     ( x1 )
( .. ) = A ( .. ).
( ym )     ( xn )
4.6 Example
Find the matrix of the linear mapping T : R3 → R2 , with
T (x, y, z) = (x + y + z, 2x + 2y + 2z),
(i) with respect to the standard bases of R3 (call it A);
(ii) with respect to the bases {(1, 0, 0), (1, −1, 0), (0, 1, −1)} of R3 and {(1, 2), (1, 0)}
of R2 (call it B).
Solution. (i) T (1, 0, 0) = (1, 2), T (0, 1, 0) = (1, 2) and T (0, 0, 1) = (1, 2), so
fill in columns to get
A = ( 1 1 1 )
    ( 2 2 2 ).
(ii)
T (1, 0, 0) = (1, 2) = 1(1, 2) + 0(1, 0)
T (1, −1, 0) = (0, 0) = 0(1, 2) + 0(1, 0)
T (0, 1, −1) = (0, 0) = 0(1, 2) + 0(1, 0)
Filling in columns we get
B = ( 1 0 0 )
    ( 0 0 0 ).
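The matrix B in part (ii) can be recomputed mechanically: for each new basis vector u of R3, express T(u) in terms of the new basis {(1, 2), (1, 0)} of R2 by solving a small linear system. A NumPy sketch (illustrative, not part of the notes):

```python
import numpy as np

# The map of Example 4.6.
def T(v):
    x, y, z = v
    return np.array([x + y + z, 2 * x + 2 * y + 2 * z])

# New basis of R^3 (domain) and new basis of R^2 (codomain, as columns).
U = [np.array([1, 0, 0]), np.array([1, -1, 0]), np.array([0, 1, -1])]
V = np.array([[1, 1],
              [2, 0]])    # columns are (1, 2) and (1, 0)

# Solving V c = T(u) gives the coordinates of T(u) in the new basis;
# stacking them as columns gives B.
B = np.column_stack([np.linalg.solve(V, T(u)) for u in U])
print(B)   # columns (1, 0), (0, 0), (0, 0), up to floating point
```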
4.7 Theorem
Let T : U → V be a linear mapping between vector spaces, and suppose that
dim U = n, dim V = m. Then we can find bases u1 , . . . , un of U and v1 , . . . , vm
of V so that the matrix of T has the canonical form
A = ( Ir O )
    ( O  O ),
where r = r(T ) is the rank of T .
Proof: Recall that ker T has dimension n − r, the nullity of T , by (3.6).
So take a basis ur+1 , . . . , un of ker(T ) and extend to a basis u1 , . . . , un of U.
Let v1 = T (u1 ), . . . , vr = T (ur ).
As in (3.6), v1 , . . . , vr is a basis of im T and we can extend to a basis
v1 , . . . , vm of V .
Now T (u1 ) = v1 , so the first column of the matrix will be (1, 0, . . . , 0); and so
on, until T (ur ) = vr , so the rth column will be (0, . . . , 0, 1, 0, . . . , 0), with the 1
in the rth row. Finally, T (ur+1 ) = . . . = T (un ) = 0, as these vectors are in the
kernel; so the remaining columns are all 0.
LECTURE 10
4.8 Example
Let T : R2 → R2 be given by T (x, y) = (x − 2y, 2x − 4y), i.e., T (x) = Ax with
A = ( 1 −2 )
    ( 2 −4 ).
Now ker T is all multiples of (2, 1), so take u2 = (2, 1), and we can take
u1 = (1, 0) so that {u1 , u2 } is a basis for R2 .
Then T (u1 ) = (1, 2), which gives a basis for im T , and so we let v1 = (1, 2).
Extend with v2 = (0, 1) (say) to another basis for R2 .
Now T has the matrix
( 1 0 )
( 0 0 )
with respect to the bases {u1 , u2 } and {v1 , v2 }.
Also r(T ) = 1 and n(T ) = 1.
4.9 Proposition
Let A be an m × n matrix with real entries. Let T : Rn → Rm be defined by
T (x) = Ax. Then
n(T ) is the dimension of the solution space of the equations Ax = 0, and
r(T ) is the dimension of the subspace of Rm spanned by the columns of A.
Proof:
The result on n(T ) is just the definition of the kernel: ker T is exactly the
solution space of Ax = 0.
For r(T ): im T is spanned by T (e1 ), . . . , T (en ), and these are precisely the
columns of A; so r(T ) is the dimension of the subspace they span.
4.10 Corollary
The row rank of a matrix (number of independent rows) equals the column rank
(number of independent columns).
Composition of mappings.
4.11 Proposition
Let U, V, W be vector spaces over F and let T : U → V and S : V → W be linear
mappings. Then ST is a linear mapping from U to W .
4.12 Example
Let U = R2 , V = R3 and W = R4 . We define T (x1 , x2 ) = (x1 + x2 , x1 , x2 ) and
S(y1 , y2 , y3 ) = (y1 + y2 , y1 , y2 , y3 ).
Then ST (x1 , x2 ) = (2x1 + x2 , x1 + x2 , x1 , x2 ).
4.13 Proposition
Let U, V, W be finite-dimensional vector spaces over F with bases {u1 , . . . , un },
{v1 , . . . , vm } and {w1 , . . . , wℓ }, and let T : U → V and S : V → W be linear
mappings. Let T be represented by B and S by A with respect to the given
bases. Then ST is represented by the matrix AB, i.e.,
U −→ V −→ W,
where T : U → V has matrix B and S : V → W has matrix A.
Proof: T (uj ) = b1j v1 + . . . + bmj vm and S(vi ) = a1i w1 + . . . + aℓi wℓ , as in
Proposition 4.5. So
ST (uj ) = b1j S(v1 ) + . . . + bmj S(vm ),
and the coefficient of wk in this is
ak1 b1j + ak2 b2j + . . . + akm bmj = (AB)kj ,
the (k, j) entry of AB. So ST is represented by AB.
This “explains” the rule for multiplying matrices.
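Example 4.12 and Proposition 4.13 can be checked together numerically; this NumPy sketch (illustrative, not part of the notes) uses the standard bases, so the matrices of S and T are filled in columnwise as above:

```python
import numpy as np

# T(x1, x2) = (x1 + x2, x1, x2), matrix B (3x2, standard bases).
B = np.array([[1, 1],
              [1, 0],
              [0, 1]])

# S(y1, y2, y3) = (y1 + y2, y1, y2, y3), matrix A (4x3, standard bases).
A = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])

# Composition ST is represented by the product AB.
x = np.array([3, 5])
print(A @ (B @ x))     # [11  8  3  5], i.e. (2*3+5, 3+5, 3, 5)
print((A @ B) @ x)     # the same
```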
LECTURE 11
Isomorphisms
4.15 Definition
An isomorphism is a linear mapping T : U → V of vector spaces over the same
field, for which there is an inverse mapping T −1 : V → U satisfying T −1 T =
IU and T T −1 = IV , where IU and IV are the identity mappings on U and V
respectively.
4.16 Theorem
Two finite-dimensional vector spaces U and V are isomorphic if and only if
dim U = dim V .
Proof idea: if dim U = dim V = n, choose bases {u1 , . . . , un } of U and
{v1 , . . . , vn } of V , and define
T (a1 u1 + . . . + an un ) = a1 v1 + . . . + an vn ,
which is linear, with a linear inverse defined the same way with the roles of the
bases exchanged. Conversely, an isomorphism takes a basis of U to a basis of V ,
so dim U = dim V .
Example. Let U = R2 , and let V be the space of all real polynomials p of degree ≤ 2 such
that p(1) = 0. Clearly dim U = 2. For V , we note that
V = {a0 + a1 t + a2 t2 : a0 + a1 + a2 = 0}. Setting a0 = c and a1 = d we have
a2 = −c − d, so
V = {c + dt + (−c − d)t2 : c, d ∈ R} = {c(1 − t2 ) + d(t − t2 ) : c, d ∈ R}, with
basis {1 − t2 , t − t2 }.
Hence dim V = 2, and there is an isomorphism between U and V defined by
T (a, b) = T (a(1, 0) + b(0, 1)) = a(1 − t2 ) + b(t − t2 ).
4.17 Remark
If A represents T with respect to some given bases of U and V , then A−1 repre-
sents T −1 , since if B is the matrix of T −1 , we have:
BA = matrix of T −1 T = In , and
AB = matrix of T T −1 = In , by (4.13).
5 Matrices and change of bases
The idea of this section is to choose bases for U and V so that T : U → V has
the simplest possible matrix, namely
( Ir O )
( O  O ),
as in (4.7). How is this related to the original matrix?
5.1 Proposition
Let V be an n-dimensional vector space over F . Let {v1 , . . . , vn } be a basis of V
and {w1 , . . . , wn } be any set of n vectors, not necessarily distinct, in V . Then
(i) There is a unique linear mapping S : V → V such that Svj = wj for each j.
(ii) There is a unique square matrix P representing S in the basis {v1 , . . . , vn },
namely the one whose jth column gives the coordinates of wj :
wj = p1j v1 + . . . + pnj vn , for j = 1, . . . , n.
(iii) {w1 , . . . , wn } is a basis of V if and only if P is non-singular, i.e., invertible.
Proof: (i) Define S(x1 v1 + . . . + xn vn ) = x1 w1 + . . . + xn wn . This is clearly
linear, and it is the only possibility.
(ii) We write wj = p1j v1 + . . . + pnj vn using the basis {v1 , . . . , vn }. This
determines the matrix P , which is the matrix of S, as in (4.5).
LECTURE 12
5.2 Theorem
Let U and V be finite-dimensional vector spaces over F and T : U → V a linear
mapping represented by a matrix A with respect to bases {u1 , . . . , un } of U and
{v1 , . . . , vm } of V . Then the matrix B representing T with respect to new bases
{u′1 , . . . , u′n } of U and {v1′ , . . . , vm′ } of V is given by B = Q−1 AP , where
P is the matrix of the identity mapping on U with respect to the bases
{u′1 , . . . , u′n } and {u1 , . . . , un }, i.e., u′j = p1j u1 + . . . + pnj un (so it writes
the new basis in terms of the old one), and similarly
Q is the matrix of the identity mapping on V with respect to the bases
{v1′ , . . . , vm′ } and {v1 , . . . , vm }, i.e., vk′ = q1k v1 + . . . + qmk vm .
So B = Q−1 AP .
5.3 Definition
Two m×n matrices A, B with entries in F are equivalent, if there are non-singular
square matrices P , Q with entries in F such that the product Q−1 AP is defined
and equals B. (So P must be n × n and Q is m × m.)
Writing R = Q−1 , we could also say B = RAP , with R, P nonsingular.
If A and B are equivalent, we write A ≡ B.
5.4 Proposition
Equivalence of matrices is an equivalence relation; i.e., for m × n matrices over
F , we have
A ≡ A, A ≡ B =⇒ B ≡ A, [A ≡ B, B ≡ C] =⇒ A ≡ C.
5.5 Theorem
Let U, V be vector spaces over F , with dim U = n and dim V = m, and let A be
an m × n matrix. Then
(i) Given bases {u1 , . . . , un } and {v1 , . . . , vm } of U and V , there is a linear map-
ping T that is represented by A with respect to these bases.
5.6 Example
Take
A = ( 1 1 0 )
    ( 1 1 2 ).
Find r and nonsingular P and Q such that
Q−1 AP = ( Ir O )
         ( O  O ).
Note that P must be 3 × 3 and Q must be 2 × 2.
Solution. Row and column reduce to get it into canonical form. We’ll do rows
first.
( 1 1 0 ) → ( 1 1 0 ) → ( 1 1 0 )
( 1 1 2 )   ( 0 0 2 )   ( 0 0 1 ),
where we did r2 − r1 and then r2 /2. Next, columns.
( 1 1 0 ) → ( 1 0 0 ) → ( 1 0 0 )
( 0 0 1 )   ( 0 0 1 )   ( 0 1 0 ),
doing c2 − c1 and c2 ↔ c3 . So r = 2.
For Q−1 start with the 2 × 2 identity matrix I2 , and do the same row operations.
( 1 0 ) → ( 1  0 ) → ( 1    0   )
( 0 1 )   ( −1 1 )   ( −1/2 1/2 ).
For P start with I3 and do the same column operations.
( 1 0 0 )   ( 1 −1 0 )   ( 1 0 −1 )
( 0 1 0 ) → ( 0 1  0 ) → ( 0 0 1  )
( 0 0 1 )   ( 0 0  1 )   ( 0 1 0  ).
We can get Q by inverting Q−1 , and the answer is
Q = ( 1 0 )
    ( 1 2 ).
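The answer to Example 5.6 can be verified numerically; this NumPy sketch (illustrative, not part of the notes) checks that Q⁻¹AP is the canonical form with I₂ in the top-left corner:

```python
import numpy as np

# Data from Example 5.6.
A = np.array([[1, 1, 0],
              [1, 1, 2]])
P = np.array([[1, 0, -1],
              [0, 0, 1],
              [0, 1, 0]])
Q = np.array([[1, 0],
              [1, 2]])

# Should be the canonical form [[1, 0, 0], [0, 1, 0]].
print(np.linalg.inv(Q) @ A @ P)   # up to floating point
```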
LECTURE 13
Alternatively. From first principles, we have T (x, y, z) = (x + y, x + y + 2z),
so the kernel consists of all vectors with z = 0 and x = −y, i.e., it has basis
{(−1, 1, 0)}.
Extend to a basis for R3 , say by adding in (1, 0, 0) and (0, 0, 1).
Fill in the basis vectors as columns, with the kernel vector last. So
P = ( 1 0 −1 )
    ( 0 0 1  )
    ( 0 1 0  ),
as before.
Also T (1, 0, 0) = (1, 1) and T (0, 0, 1) = (0, 2), which already gives a basis for
R2 . Fill in columns, so
Q = ( 1 0 )
    ( 1 2 ),
also as before.
Now consider a linear mapping T : V → V from a space to itself. Using the same
basis {v1 , . . . , vn } on both sides, T is represented by a matrix A such that
[v = x1 v1 + . . . + xn vn ] =⇒ [T (v) = y1 v1 + . . . + yn vn ],
where (y1 , . . . , yn ) = A(x1 , . . . , xn ). Here A is n × n (square).
Also T (vj ) = a1j v1 + . . . + anj vn , and (a1j , . . . , anj ) is the jth column of A.
Consider the linear mapping T : R2 → R2 given by x ↦ Ax with
A = ( 1  1 )
    ( −2 4 ).
If we use the basis consisting of v1 = (1, 1) and v2 = (1, 2), then
T (v1 ) = (2, 2) = 2v1 and T (v2 ) = (3, 6) = 3v2 .
Hence with respect to this basis T has the diagonal matrix
B = ( 2 0 )
    ( 0 3 ).
Can we always represent a linear mapping by such a simple matrix?
6.1 Theorem
Let V be an n-dimensional vector space over F , and T : V → V a linear mapping
represented by the matrix A with respect to the basis {v1 , . . . , vn }. Then T is
represented by the matrix B with respect to the basis {v1′ , . . . , vn′ } if and only
if B = P −1 AP , where P = (pij ) is nonsingular and vj′ = p1j v1 + . . . + pnj vn ,
i.e., P is the matrix of the identity map on V with respect to the bases
{v1′ , . . . , vn′ } and {v1 , . . . , vn }.
6.2 Definition
Two n × n matrices A and B over F are similar or conjugate if there is a non-
singular square matrix P with B = P −1 AP . We write A ∼ B. This happens if
they represent the same linear mapping with respect to two bases.
Thus, in our example,
( 1  1 )   ( 2 0 )
( −2 4 ) ∼ ( 0 3 ).
6.3 Proposition
Similarity is an equivalence relation on the set of n × n matrices over F , i.e.,
A ∼ A, A ∼ B =⇒ B ∼ A, [A ∼ B, B ∼ C] =⇒ A ∼ C.
6.4 Example
2 2 2 0
Take T : R → R with T (v) = Av, where A = = 2I2 . Now P −1AP =
0 2
P −1 (2I2 )P = P −1(2P ) = 2P −1P = 2I2 , for every P , so that whatever basis we
use, T has to have the matrix A. Note T (x) = 2x for all v, so that 2 is an
“eigenvalue” – this turns out to be the key.
6.5 Definition
Let V be a vector space over F and T : V → V a linear mapping. An eigenvector
of T is an element v ≠ 0 such that T (v) = λv for some λ ∈ F . The scalar λ is
then called an eigenvalue.
For a matrix A with entries in F , a scalar λ ∈ F is an eigenvalue if there is a
non-zero x ∈ F n such that Ax = λx, and then x is an eigenvector of A.
For the mapping T : F n → F n given by T (x) = Ax, this is the same definition
of course.
In our example, λ = 2 and 3 were eigenvalues, with eigenvectors (1, 1) and (1, 2)
respectively.
LECTURE 14
6.6 Proposition
A scalar λ ∈ F is an eigenvalue of an n × n matrix A if and only if λ satisfies the
characteristic equation,
χ(λ) = det(λIn − A) = 0,
N.B. Some people use det(A − λIn ) as the definition of χ(λ). We accept either
definition, as they only differ by a factor of (−1)n , and so have the same roots.
6.7 Proposition
Similar matrices have the same characteristic equation, and hence the same eigen-
values.
6.8 Proposition
Let V be an n-dimensional vector space over a field F , and T : V → V a linear
mapping. Let A represent T with respect to the basis {v1 , . . . , vn } of V . Then
A and T have the same eigenvalues.
Proof:
Well, λ ∈ F is an eigenvalue of A
⇔ A(x1 , . . . , xn ) = λ(x1 , . . . , xn ) for some x1 , . . . , xn not all 0 in F
⇔ T (x1 v1 + . . . + xn vn ) = λ(x1 v1 + . . . + xn vn ) for some x1 , . . . , xn not all 0 in F
⇔ T (v) = λv for some v ≠ 0 in V
⇔ λ is an eigenvalue of T .
If we can find a basis of eigenvectors, the matrix of T has a particularly nice
form. We’ll prove one little result, then see some examples.
6.9 Proposition
Let V be an n-dimensional vector space over a field F , and let T : V → V be a
linear mapping. Suppose that {v1 , . . . , vn } is a basis of eigenvectors of T . Then,
with respect to this basis, T is represented by a diagonal matrix whose diagonal
entries are eigenvalues of T .
6.10 Example
Define T : R2 → R2 by
T (x) = ( 11 −2 ) x,
        ( 3  4  )
and take v1 = (1, 3) and v2 = (2, 1). Now
T (v1 ) = (11 − 6, 3 + 12) = (5, 15) = 5v1 , and
T (v2 ) = (22 − 2, 6 + 4) = (20, 10) = 10v2 .
So using {v1 , v2 } the matrix of T is
( 5 0  )
( 0 10 ).
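The eigenpairs in Example 6.10 can be checked with NumPy; a short sketch (illustrative, not part of the notes):

```python
import numpy as np

# The matrix and the two claimed eigenvectors of Example 6.10.
A = np.array([[11, -2],
              [3, 4]])
v1 = np.array([1, 3])
v2 = np.array([2, 1])

print(A @ v1)   # [ 5 15] = 5 * v1
print(A @ v2)   # [20 10] = 10 * v2

# The eigenvalues computed directly, approximately [5.0, 10.0].
print(sorted(np.linalg.eigvals(A).real))
```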
However, not all matrices have a basis of eigenvectors.
6.11 Example
Take T : R2 → R2 , defined by
T (x) = ( 0 −1 ) x.
        ( 1 0  )
This is a rotation of the plane through 90◦ anti-clockwise. Now
det ( −λ −1 ) = λ² + 1,
    ( 1  −λ )
so there are no real eigenvalues (roots of λ² + 1 = 0), and so no eigenvectors
in R2 (which is also obvious geometrically).
FACT. Over C every polynomial can be factorized into linear factors, and so has
a full set of complex roots. This is the Fundamental Theorem of Algebra (see
MATH 2090). So over C we can always find an eigenvalue.
LECTURE 15
6.12 Example
Consider T : C2 → C2 (or indeed R2 → R2 ) defined by
T (x) = ( 2 1 ) x.
        ( 0 2 )
Then
det ( 2 − λ 1     ) = (λ − 2)²,
    ( 0     2 − λ )
so 2 is the only eigenvalue. Solving T (x) = 2x gives
( 2 1 ) ( x1 ) = 2 ( x1 ), i.e., ( 0 1 ) ( x1 ) = ( 0 ),
( 0 2 ) ( x2 )     ( x2 )        ( 0 0 ) ( x2 )   ( 0 )
so x2 = 0 and x1 is arbitrary, and the eigenvectors are {(a, 0) : a ∈ F, a ≠ 0},
with F = R or C.
So there is no basis of eigenvectors, and
( 2 1 )
( 0 2 )
is not similar to a diagonal matrix.
6.13 Theorem
Let V be a vector space over F , let T : V → V be linear, and suppose that
{v1 , . . . , vk } are eigenvectors of T corresponding to distinct eigenvalues
{λ1 , . . . , λk }. Then {v1 , . . . , vk } is a linearly independent set. If k = dim V , then
it’s a basis.
Proof: Suppose that a1 v1 + . . . + ak vk = 0, and apply
(T − λ2 I)(T − λ3 I) . . . (T − λk I) to both sides.
Each term is sent to zero except for the first, which becomes
(λ1 − λ2 ) . . . (λ1 − λk )a1 v1 = 0.
But v1 ≠ 0, and the other factors are nonzero, so a1 = 0.
Hence a2 v2 +. . .+ak vk = 0. Applying (T −λ3 I) . . . (T −λk I), we deduce similarly
that a2 = 0. Continuing, we see that the aj are all 0 and the set is independent.
Finally, if we have k independent vectors in a k-dimensional space, then it is
automatically a basis, by (2.15).
6.14 Corollary
(i) Let V be an n-dimensional vector space over a field F , and T : V → V a
linear transformation. If T has n distinct eigenvalues in F , then V has a basis of
eigenvectors of T , and T can be represented by a diagonal matrix, whose diagonal
entries are the eigenvalues; this is unique up to reordering the eigenvalues.
A matrix similar to a diagonal matrix is called diagonalisable. Not all matrices
are, e.g., if
A = ( 0 1 )
    ( 0 0 ),
then A² = O, so if
P −1 AP = ( a1 0  )
          ( 0  a2 ),
then P −1 A²P = (P −1 AP )(P −1 AP ), which is the diagonal matrix with entries a1²
and a2². Since A² = O, we have a1 = a2 = 0. This would mean that
A = P OP −1 = O, which is a contradiction.
Of course the identity matrix In is diagonalisable (it is already diagonal), even
though it has repeated eigenvalues (all equal to 1), so (6.14) isn’t the only way
a matrix can be diagonalisable.
6.15 Example
Let
A = ( 1 3 )
    ( 4 2 ).
Find D such that A ∼ D, and P such that D = P −1 AP .
Eigenvalues.
det(λI − A) = det ( λ − 1 −3    ) = (λ − 1)(λ − 2) − 12 = λ² − 3λ − 10 = (λ − 5)(λ + 2),
                  ( −4    λ − 2 )
so the eigenvalues are 5 and −2, and we can take
D = ( 5 0  )
    ( 0 −2 ).
For λ = 5 we solve (A − 5I)x = 0, or
( −4 3  ) ( x1 ) = ( 0 ),
( 4  −3 ) ( x2 )   ( 0 )
i.e., (x1 , x2 ) = ((3/4)a, a) for a ∈ R. So we take v1 = (3, 4), say.
For λ = −2, we solve (A + 2I)x = 0, or
( 3 3 ) ( x1 ) = ( 0 ),
( 4 4 ) ( x2 )   ( 0 )
i.e., (x1 , x2 ) = (−b, b) for b ∈ R. So we take v2 = (−1, 1), say.
So
P = ( 3 −1 )
    ( 4 1  )
will do. We can check that P −1 AP = D, or, what is simpler and equivalent,
that AP = P D.
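Both checks suggested at the end of Example 6.15 can be run numerically; a NumPy sketch (illustrative, not part of the notes):

```python
import numpy as np

# Data of Example 6.15: P has the eigenvectors as columns, D the eigenvalues.
A = np.array([[1, 3],
              [4, 2]])
P = np.array([[3, -1],
              [4, 1]])
D = np.diag([5, -2])

# The simpler check: AP = PD (exact integer arithmetic).
print(A @ P)                      # equals P @ D
# The direct check: P^{-1} A P = D, up to floating point.
print(np.linalg.inv(P) @ A @ P)
```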
LECTURE 16
7 Polynomials
7.1 Definition
Let V be a vector space over F and T : V → V a linear mapping. Let
p(t) = a0 + a1 t + . . . + am tm be a polynomial with coefficients in F . Then
we define
p(T ) = a0 I + a1 T + a2 T 2 + . . . + am T m , i.e.,
p(T )v = a0 v + a1 T (v) + a2 T (T (v)) + . . . + am T (T . . . (v)).
7.2 Definition
Let V be an n-dimensional vector space over F , and T : V → V linear. Suppose
that T is represented by an n × n matrix A with respect to some basis. Then
the characteristic polynomial of T is χ(λ) = det(λIn − A). This is independent
of the choice of basis, since if B = P −1AP represents T with respect to another
basis, then det(λIn − B) = det(λIn − A) by (6.7).
N.B. Some people use det(A − λIn ) as the definition of χ(λ). We accept either
definition, as they only differ by a factor of (−1)n , and so have the same roots.
7.3 Theorem (Cayley–Hamilton)
If χ is the characteristic polynomial of the n × n matrix A, then χ(A) = O.
Proof: To be discussed later. Note that χ(λ) = det(λIn − A), but we can’t
simply substitute λ = A into the determinant.
Example
Take
A = ( 1 3 )
    ( 4 2 ),
as in (6.15). Then χ(λ) = λ² − 3λ − 10 = (λ − 5)(λ + 2), and
(A − 5I)(A + 2I) = ( −4 3  ) ( 3 3 ) = ( 0 0 )
                   ( 4  −3 ) ( 4 4 )   ( 0 0 ).
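The Cayley–Hamilton check in this example can be confirmed numerically; a NumPy sketch (illustrative, not part of the notes) verifies both the factorized and the expanded form:

```python
import numpy as np

# chi(A) = (A - 5I)(A + 2I) = A^2 - 3A - 10I should be the zero matrix.
A = np.array([[1, 3],
              [4, 2]])
I = np.eye(2, dtype=int)

print((A - 5 * I) @ (A + 2 * I))   # [[0 0], [0 0]]
print(A @ A - 3 * A - 10 * I)      # [[0 0], [0 0]]
```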
Similarly, if T : V → V is a linear mapping on an n-dimensional vector space,
then χ(T ) = 0, because χ(λ) is defined in terms of a matrix representing T .
7.4 Definition
The minimum or minimal polynomial of a square matrix A is the monic poly-
nomial of least degree such that µ(A) = O. (“Monic” means that the leading
coefficient is 1.) We write it µ or µA .
Example.
A = ( 4 2 0 )
    ( 0 4 0 )
    ( 0 0 4 )
has χA (λ) = det(λI − A) = (λ − 4)³ (check). So (A − 4I)³ = O. But in fact
(A − 4I)² = ( 0 2 0 ) ( 0 2 0 )   ( 0 0 0 )
            ( 0 0 0 ) ( 0 0 0 ) = ( 0 0 0 ),
            ( 0 0 0 ) ( 0 0 0 )   ( 0 0 0 )
while A − 4I ≠ O, so the minimum polynomial is µA (λ) = (λ − 4)².
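Both facts used to pin down the minimum polynomial here can be checked numerically; a NumPy sketch (illustrative, not part of the notes):

```python
import numpy as np

# (A - 4I)^2 should vanish while A - 4I itself does not,
# so the minimum polynomial is (lambda - 4)^2, not (lambda - 4).
A = np.array([[4, 2, 0],
              [0, 4, 0],
              [0, 0, 4]])
N = A - 4 * np.eye(3, dtype=int)

print(np.any(N != 0))       # True : (lambda - 4) does not annihilate A
print(np.all(N @ N == 0))   # True : (lambda - 4)^2 does
```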
Now, given the characteristic polynomial, we can show that there are only a small
number of possibilities to test for the minimum polynomial.
7.5 Theorem
Let A be a square matrix; then:
(i) Every eigenvalue of A is a root of the minimum polynomial;
(ii) The minimum polynomial divides the characteristic polynomial exactly.
Proof of (ii): Let µ be the minimum and χ the characteristic polynomial. By long
division we can write χ = µq + r for polynomials q (quotient) and r (remainder),
with deg r < deg µ.
Now
χ(A) = µ(A)q(A) + r(A),
where χ(A) = O by Cayley–Hamilton (7.3) and µ(A) = O by the definition of µ,
so r(A) = O. Since deg r < deg µ and µ is the minimum polynomial, we must have
r ≡ 0, i.e., µ divides χ exactly.
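The long-division step can be illustrated with numpy's polynomial helpers, dividing the χ and µ of the example in (7.4) (coefficient arrays are listed lowest degree first):

```python
import numpy as np
from numpy.polynomial import polynomial as P

# From the example above: chi(t) = (t - 4)^3, mu(t) = (t - 4)^2.
chi = P.polypow([-4, 1], 3)
mu = P.polypow([-4, 1], 2)

q, r = P.polydiv(chi, mu)   # chi = mu * q + r
print(q)                    # the quotient t - 4
print(np.allclose(r, 0))    # True: mu divides chi exactly
```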
LECTURE 17
Example.
Let A = [ 3 1 0 0 ]
        [ 0 3 0 0 ]
        [ 0 0 6 0 ]
        [ 0 0 0 6 ].

Calculate µA and χA .
Then

det(λI − A) = | λ−3  −1    0    0  |
              |  0   λ−3   0    0  | = (λ − 3)^2 (λ − 6)^2 .
              |  0    0   λ−6   0  |
              |  0    0    0   λ−6 |

So the eigenvalues are 3 and 6 only. We have χ(t) = (t − 3)^2 (t − 6)^2 , and µ(t)
has to divide it; also, both (t − 3) and (t − 6) must be factors of µ(t). So the
candidates are:
(t − 3)(t − 6), degree 2,
(t − 3)^2 (t − 6), degree 3,
(t − 3)(t − 6)^2 , degree 3,
(t − 3)^2 (t − 6)^2 , degree 4.
First,

(A − 3I)(A − 6I) = [ 0 −3 0 0 ]
                   [ 0  0 0 0 ]
                   [ 0  0 0 0 ] ≠ O,
                   [ 0  0 0 0 ]

so the degree-2 candidate fails. Next try

(A − 3I)^2 (A − 6I) = [ 0 1 0 0 ] [ 0 −3 0 0 ]
                      [ 0 0 0 0 ] [ 0  0 0 0 ]
                      [ 0 0 3 0 ] [ 0  0 0 0 ] = O,
                      [ 0 0 0 3 ] [ 0  0 0 0 ]

so the minimum polynomial is µA (t) = (t − 3)^2 (t − 6).
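These two candidate checks can be reproduced numerically with numpy:

```python
import numpy as np

# The 4x4 matrix from the example above (Jordan blocks for 3 and 6).
A = np.array([[3, 1, 0, 0],
              [0, 3, 0, 0],
              [0, 0, 6, 0],
              [0, 0, 0, 6]], dtype=float)
I = np.eye(4)

# The degree-2 candidate (t - 3)(t - 6) does not annihilate A...
cand2 = (A - 3 * I) @ (A - 6 * I)
# ...but (t - 3)^2 (t - 6) does, so mu_A(t) = (t - 3)^2 (t - 6).
cand3 = np.linalg.matrix_power(A - 3 * I, 2) @ (A - 6 * I)
print(np.allclose(cand2, 0), np.allclose(cand3, 0))  # False True
```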
7.6 Proposition
Similar matrices have the same minimum polynomial. Hence if A is diagonalis-
able, then µA (t) has no repeated roots.
Proof: Suppose that B = P⁻¹AP . Then for any polynomial p we have
p(B) = P⁻¹ p(A)P . Hence p(B) = O ⇐⇒ p(A) = O. So µB (t) = µA (t).
Now, if B is a diagonal matrix, say
B = diag(λ1 , . . . , λ1 , λ2 , . . . , λ2 , . . . , λk , . . . , λk )
(with possibly repeated diagonal entries, λ1 , . . . , λk being the distinct values),
then (B − λ1 I)(B − λ2 I) . . . (B − λk I) = O, so µB (t) has no repeated roots
since it must divide the polynomial (t − λ1 )(t − λ2 ) . . . (t − λk ). But any
diagonalisable A is similar to a matrix B of this form, so µA has no repeated
roots.
8.1 Definition
A Jordan block matrix is a square matrix of the form

Jλ = [ λ 1 0 . . . 0 ]
     [ 0 λ 1 . . . 0 ]
     [ . . . . . . 0 ]
     [ 0 . . . λ 1 ]
     [ 0 . . . 0 λ ],

with λ on the diagonal and 1 just above the diagonal (for some fixed scalar λ).
There is also the trivial 1 × 1 case.
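A Jordan block is easy to build with numpy; `jordan_block` below is an illustrative helper written for these notes, not a library function:

```python
import numpy as np

def jordan_block(lam, size):
    """size x size Jordan block: lam on the diagonal, 1 just above it."""
    return lam * np.eye(size) + np.eye(size, k=1)

J = jordan_block(5.0, 3)
print(J)
# [[5. 1. 0.]
#  [0. 5. 1.]
#  [0. 0. 5.]]
```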
8.2 Proposition
Suppose that V is an n-dimensional vector space, and T a linear transforma-
tion on V , represented by a Jordan block matrix A with respect to some basis
{v1 , . . . , vn }.
Then χ(t) = det(tI − A) = (t − λ)^n , and we also have
T v1 = λv1 ,
T v2 = v1 + λv2 ,
T v3 = v2 + λv3 ,
... ...
T vn = vn−1 + λvn .
LECTURE 18
8.3 Definition
Let V be a vector space over F and T : V → V a linear mapping. Let λ ∈ F
be an eigenvalue of T . The non-zero vector v ∈ V is said to be a generalized
eigenvector of T corresponding to λ if (T − λI)^k v = 0 for some k ≥ 1.
Similarly for n × n matrices A, a column vector x is a generalized eigenvector if
(A − λI)^k x = 0 for some k ≥ 1.
Example.

For A = [ 4 1 ]
        [ 0 4 ],

we have v1 = (1, 0) an eigenvector with eigenvalue λ = 4, since

[ 4 1 ] [ 1 ]   [ 4 ]
[ 0 4 ] [ 0 ] = [ 0 ] = 4v1 .

Now v2 = (0, 1) is not an eigenvector, since

[ 4 1 ] [ 0 ]   [ 1 ]
[ 0 4 ] [ 1 ] = [ 4 ] = v1 + 4v2 .

So (A − 4I)v2 = v1 , and then (A − 4I)^2 v2 = (A − 4I)v1 = 0; so v2 is a
generalized eigenvector.
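The same chain of computations in numpy:

```python
import numpy as np

# The 2x2 example above: v1 is an eigenvector, v2 only a generalized one.
A = np.array([[4, 1], [0, 4]], dtype=float)
I = np.eye(2)
v1 = np.array([1.0, 0.0])
v2 = np.array([0.0, 1.0])

print((A - 4 * I) @ v1)                 # 0: v1 is an eigenvector
print((A - 4 * I) @ v2)                 # equals v1, so v2 is not
print((A - 4 * I) @ (A - 4 * I) @ v2)   # 0: v2 is a generalized eigenvector
```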
8.4 Definition
A square matrix A is said to be in Jordan canonical form (or Jordan normal
form) if it consists of Jordan block matrices strung out on the diagonal, with
zeroes elsewhere. The diagonal entries are the eigenvalues of A.
8.5 Examples
Any Jordan block matrix is in JCF. So are:

A = [ −1 0 0 ]      B = [ −1 0 0 ]      C = [ 2 1 0 0 0 ]
    [  0 2 0 ],         [  0 2 1 ],         [ 0 2 1 0 0 ]
    [  0 0 2 ]          [  0 0 2 ]          [ 0 0 2 0 0 ]
                                            [ 0 0 0 2 1 ]
                                            [ 0 0 0 0 2 ].

Here:
A has 3 blocks, all of size 1 × 1;
B has 1 block of size 1 × 1 and 1 of size 2 × 2;
C has 1 block of size 3 × 3 and 1 block of size 2 × 2.
8.6 Theorem
Let A be an n × n matrix with entries in C. Then A is similar to a matrix in Jor-
dan canonical form, unique up to re-ordering the blocks. If A is an n × n matrix
with real entries, then A is similar to a real matrix in JCF (i.e., B = P⁻¹AP
with B real) if and only if all the roots of the characteristic equation are real.
Proof: Omitted, but the “only if” follows from the fact that if
B = diag(B1 , . . . , BN ) is in JCF and each Bi is a Jordan block of size mi
with diagonal elements λi , then
χB (λ) = χB1 (λ) . . . χBN (λ) = (λ − λ1 )^m1 . . . (λ − λN )^mN ,
so if all the blocks are real then χ has only real roots.
1. For each λ, the power of (t − λ) in χ(t) is the total size of the Jordan blocks
using λ. See (8.6).
2. The number of λ-blocks is the dimension of the eigenspace ker(A − λI), since
each block gives one new eigenvector, by (8.2).
8.8 Example
Take A = [ 5 0 −1 ]
         [ 2 3 −1 ]
         [ 4 0  1 ].

Find the Jordan canonical form of A.
N.B. We will do this at the very end of the course for everything up to 4 × 4
matrices.
Here χA (t) = (t − 3)^3 (check), so 3 is the only eigenvalue, and
dim ker(A − 3I) = 3 − rank(A − 3I) = 3 − 1 = 2.
This tells us that there are 2 blocks, i.e., one of size 2 and one of size 1.
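The block count can be checked numerically, using rank–nullity to get the eigenspace dimension:

```python
import numpy as np

A = np.array([[5, 0, -1],
              [2, 3, -1],
              [4, 0, 1]], dtype=float)

# chi_A(t) = (t - 3)^3: all eigenvalues equal 3 (up to rounding).
print(np.linalg.eigvals(A))

# The number of Jordan blocks for 3 is dim ker(A - 3I).
nullity = 3 - np.linalg.matrix_rank(A - 3 * np.eye(3))
print(nullity)   # 2, so the blocks have sizes 2 + 1
```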
LECTURE 19
8.9 The Cayley–Hamilton theorem via Jordan canonical matrices
Let A be a real or complex matrix and χ(t) the characteristic polynomial. Then
χ(A) = O.
Proof: Suppose that M is a Jordan canonical matrix, M = diag(B1 , . . . , Bm ),
with B1 , . . . , Bm Jordan blocks. Then M^2 = diag(B1^2 , . . . , Bm^2 ), and
similarly for higher powers of M . So χ(M ) = diag(χ(B1 ), . . . , χ(Bm )), and we
need to show that χ(Bi ) = O for each i.
Now suppose that B is a Jordan block of size k, with λ on the diagonal and 1
just above it. Then (B − λI)^k = O as in (8.2).
Thus χ(B) = O for each of the blocks B making up A in Jordan form (since in
each case (t − λ)^k divides χ(t)).
Now χ(M ) = diag(χ(B1 ), . . . , χ(Bm )) = O.
Finally, any matrix A is similar to a matrix M in Jordan canonical form with the
same characteristic polynomial, by (6.7), i.e., M = P⁻¹AP , and
χ(A) = P χ(M )P⁻¹ = O.
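As a numerical sanity check, we can evaluate χ(A) for the matrix of (8.8) by Horner's scheme with matrix products (np.poly returns the coefficients of det(tI − A), highest power first):

```python
import numpy as np

# The matrix from (8.8); chi_A(t) = (t - 3)^3.
A = np.array([[5, 0, -1],
              [2, 3, -1],
              [4, 0, 1]], dtype=float)

coeffs = np.poly(A)            # coefficients of det(tI - A), leading 1
chi_A = np.zeros_like(A)
for c in coeffs:               # Horner's scheme, with matrix products
    chi_A = chi_A @ A + c * np.eye(3)

print(np.allclose(chi_A, 0, atol=1e-6))  # True, as Cayley-Hamilton predicts
```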
8.10 Theorem
If no eigenvalue λ appears with multiplicity greater than 3 in the characteristic
equation for A, then we can write down the JCF knowing just the characteristic
and minimum polynomials (χ(t) and µ(t)).
The power of (t − λ) in the minimum polynomial is the size of the largest Jordan
block associated with λ.
If the multiplicity of (t − λ) in χ(t) is 1, then there is just one block, of size 1.
The power of (t − λ) in both χ(t) and µ(t) is then 1.
8.11 Example
A matrix A has characteristic polynomial χ(t) = (t − 1)^3 (t − 2)^3 (t − 3)^2 and
minimal polynomial µ(t) = (t − 1)^3 (t − 2)(t − 3)^2 . Find a matrix in Jordan form
similar to A.
Solution: For λ = 1, there is one block size 3; for λ = 2, there are 3 blocks size
1; for λ = 3 there is one block size 2.
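This block data can be assembled into an explicit 8 × 8 matrix; `jordan_block` and `block_diag` below are illustrative helpers written for these notes, not library functions:

```python
import numpy as np

def jordan_block(lam, size):
    """size x size Jordan block: lam on the diagonal, 1 just above it."""
    return lam * np.eye(size) + np.eye(size, k=1)

def block_diag(blocks):
    """Place the given square blocks down the diagonal, zeroes elsewhere."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

# One 3-block for eigenvalue 1, three 1-blocks for 2, one 2-block for 3:
J = block_diag([jordan_block(1, 3)] +
               [jordan_block(2, 1)] * 3 +
               [jordan_block(3, 2)])
print(J.shape)  # (8, 8)
```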
For multiplicity 4 this can fail: if the power of (t − λ) in χ(t) is 4 and the
power in µ(t) is 2, the λ-blocks could be either 2 + 2 or 2 + 1 + 1,
since both have the largest block of size 2.
However, it is still possible to determine quickly which case we are in, as in the
first case there are 2 blocks, and the eigenspace ker(A − λI) has dimension 2; in
the other case, 3 blocks, and it has dimension 3.
LECTURE 20
8.12 Worked example (from the 2003 paper)
You are given that the matrix

A = [  7 1  1  1 ]
    [  0 8  0  0 ]
    [  0 0  6 −2 ]
    [ −1 1 −1  7 ]

has characteristic polynomial χA (t) = (t − 6)^2 (t − 8)^2 . Find its Jordan
canonical form and its minimum polynomial.
Solution: We see that the eigenvalues are 6 and 8. Let’s go for the minimum
polynomial µ first. This has roots 6 and 8, and divides χ. Since it is monic, it is
therefore one of:
(t − 6)(t − 8), (t − 6)^2 (t − 8), (t − 6)(t − 8)^2 , or (t − 6)^2 (t − 8)^2 .
We calculate

(A − 6I)(A − 8I) = [  1 1  1  1 ] [ −1 1  1  1 ]   [ −2  2 −2 −2 ]
                   [  0 2  0  0 ] [  0 0  0  0 ]   [  0  0  0  0 ]
                   [  0 0  0 −2 ] [  0 0 −2 −2 ] = [  2 −2  2  2 ] ≠ O,
                   [ −1 1 −1  1 ] [ −1 1 −1 −1 ]   [  0  0  0  0 ]

which eliminates the first possibility. Next,

(A − 6I)^2 (A − 8I) = [  1 1  1  1 ] [ −2  2 −2 −2 ]   [ 0 0 0 0 ]
                      [  0 2  0  0 ] [  0  0  0  0 ]   [ 0 0 0 0 ]
                      [  0 0  0 −2 ] [  2 −2  2  2 ] = [ 0 0 0 0 ] = O,
                      [ −1 1 −1  1 ] [  0  0  0  0 ]   [ 0 0 0 0 ]

so the minimum polynomial is (t − 6)^2 (t − 8). (If it hadn’t been zero, we would
then have tried (t − 6)(t − 8)^2 .)
Each eigenvalue has multiplicity 2 in χ, so for each we have to work out whether
the blocks are one of size 2 or two of size 1. Since we needed (t − 6)^2 , the
largest 6-block has size 2, so there is a single block of size 2; since we only
needed (t − 8)^1 , every 8-block has size 1, so there are two blocks of size 1.
The Jordan form is therefore:
B = [ 6 1 0 0 ]
    [ 0 6 0 0 ]
    [ 0 0 8 0 ]
    [ 0 0 0 8 ].
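The whole worked example can be verified in a few lines of numpy:

```python
import numpy as np

A = np.array([[7, 1, 1, 1],
              [0, 8, 0, 0],
              [0, 0, 6, -2],
              [-1, 1, -1, 7]], dtype=float)
I = np.eye(4)

# (t - 6)(t - 8) does not annihilate A, but (t - 6)^2 (t - 8) does:
print(np.allclose((A - 6 * I) @ (A - 8 * I), 0))               # False
mu_check = np.linalg.matrix_power(A - 6 * I, 2) @ (A - 8 * I)
print(np.allclose(mu_check, 0))                                # True

# Eigenspace dimensions give the number of blocks for each eigenvalue:
print(4 - np.linalg.matrix_rank(A - 6 * I))   # 1 (a single 2x2 block)
print(4 - np.linalg.matrix_rank(A - 8 * I))   # 2 (two 1x1 blocks)
```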
THE END