
EECS 16A Designing Information Devices and Systems I

Spring 2016 Official Lecture Notes Note 21

Introduction

In this lecture note, we will introduce the last topics of this semester: change of basis and diagonalization.
We will introduce the mathematical foundations for these two topics. Although we will not go through
their applications this semester, these are important concepts that we will use to analyze signals and linear
time-invariant systems in EE16B.

By the end of these notes, you should be able to:

1. Change a vector from one basis to another.


2. Take a matrix representation for a linear transformation in one basis and express that linear transformation in another basis.
3. Understand the importance of a diagonalizing basis and its properties.
4. Identify if a matrix is diagonalizable and if so, to diagonalize it.

Change of Basis for Vectors

Previously, we have seen that matrices can be interpreted as linear transformations between vector spaces.
In particular, an m × n matrix A can be viewed as a function A : U → V mapping a vector ~u from the vector space
U = Rn to the vector A~u in the vector space V = Rm. In this note, we explore a different interpretation of square,
invertible matrices: as a change of basis.
" # " #
u 4
Let’s first start with an example. Consider the vector ~u = 1 = . When we write a vector in this form,
u2 3
" # " #
1 0
implicitly we are representing it in the standard basis for R2 , ~e1 = and ~e2 = . This means that we
0 1
can write ~u = 4~e1 + 3~e2 . Geometrically,~e1 and~e2 define a grid in R2 , and ~u is represented by the coordinates
in the grid, as shown in the figure below:



" #
1
What if we want to represent ~u as a linear combination of another set of basis vectors, say ~a1 = and
1
" #
0
~a2 = ? This means that we need to find scalars ua1 and ua2 such that ~u = ua1~a1 + ua2~a2 . We can write
1
this equation in matrix form:
 
| | " # " #
 ua1 u
~a1 ~a2  = 1

ua2 u2
| |
" #" # " #
1 0 ua1 4
=
1 1 ua2 3
" #
1 0
Thus we can find ua1 and ua2 by solving a system of linear equations. Since the inverse of is
1 1
" #
1 0
, we get ua1 = 4 and ua2 = −1. Then we can write ~u as 4~a1 −~a2 . Geometrically, ~a1 and ~a2 defines
−1 1
a skewed grid from which the new coordinates are computed.

Notice that the same vector ~u can be represented in multiple ways. In the standard basis ~e1 ,~e2 , the coordi-
nates for ~u are 4, 3. In the skewed basis ~a1 ,~a2 , the coordinates for ~u are 4, −1. Same vector geometrically,
but different coordinates.
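As a quick numerical sanity check of the example above, here is an illustrative sketch in numpy (the code is ours, not part of the original notes); the columns of A are the basis vectors ~a1 and ~a2:

import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0]])   # columns are a1 = [1, 1] and a2 = [0, 1]
u = np.array([4.0, 3.0])     # coordinates of u in the standard basis

u_a = np.linalg.solve(A, u)  # solve A u_a = u rather than forming A^{-1} explicitly
print(u_a)                   # [ 4. -1.]  ->  u = 4 a1 - 1 a2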

In general, suppose we are given a vector ~u ∈ Rn in the standard basis and want to change to a different basis
with linearly independent basis vectors ~a1, · · · ,~an. If we denote the vector in the new basis as
$\vec{u}_a = \begin{bmatrix} u_{a1} \\ \vdots \\ u_{an} \end{bmatrix}$,
we solve the equation A~ua = ~u, where A is the matrix $\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$. Therefore the change of
basis is given by:

~ua = A−1~u

Now that we know how to change a vector from the standard basis to any basis, how do we change a vector
~ua in the basis ~a1, · · · ,~an back to a vector ~u in the standard basis? We simply reverse the change of basis
transformation, thus ~u = A~ua.

Pictorially, the relationship between any two bases and the standard basis is given by:

$$\text{Basis } \vec{a}_1, \cdots, \vec{a}_n \;\underset{A^{-1}}{\overset{A}{\rightleftarrows}}\; \text{Standard Basis} \;\underset{B}{\overset{B^{-1}}{\rightleftarrows}}\; \text{Basis } \vec{b}_1, \cdots, \vec{b}_n$$



For example, given a vector ~u = ua1~a1 + · · · + uan~an represented as a linear combination of the basis vectors
~a1 , · · · ,~an , we can represent it as a different linear combination of basis vectors ~b1 , · · · ,~bn , ~u = ub1~b1 + · · · +
ubn~bn by writing:

B~ub = ub1~b1 + · · · + ubn~bn = ~u = ua1~a1 + · · · + uan~an = A~ua


~ub = B−1 A~ua

Thus the change of basis transformation from basis ~a1 , · · · ,~an to basis ~b1 , · · · ,~bn is given by B−1 A.
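Here is a small illustrative sketch of this basis-to-basis conversion in numpy (the particular bases are made-up examples, chosen only to be invertible):

import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0]])   # columns are a1, a2
B = np.array([[2.0, 1.0], [0.0, 1.0]])   # columns are b1, b2

u_a = np.array([4.0, -1.0])              # coordinates of some vector u in the a-basis
u = A @ u_a                              # the same vector in the standard basis
u_b = np.linalg.solve(B, u)              # coordinates in the b-basis, i.e. B^{-1} A u_a

# Both coordinate vectors describe the same geometric vector:
assert np.allclose(B @ u_b, A @ u_a)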

Change of Basis for Linear Transformations

Now that we know how to change the basis of vectors, let’s shift our attention to linear transformations.
We will answer these questions in this section: how do we change the basis of linear transformations, and
what does this mean? First, let's review linear transformations. Suppose we have a linear transformation T
represented by an n × n matrix that transforms ~u ∈ Rn to ~v ∈ Rn:

~v = T~u

The important thing to notice is that T maps vectors to vectors in a linear manner. It also happens to
be represented as a matrix. Implicit in this representation is the choice of a coordinate system. Unless
stated otherwise, we always assume that the coordinate system is that defined by the standard basis vectors
~e1 ,~e2 , . . . ,~en .

What is this transformation T in another basis? Suppose we have basis vectors ~a1, · · · ,~an ∈ Rn. If
~u = ua1~a1 + · · · + uan~an is represented by the coordinate vector ~ua and ~v = va1~a1 + · · · + van~an is represented by ~va in this basis,
what is the transformation T represented by in this basis, i.e., what is the matrix Ta such that Ta~ua = ~va? If we define the matrix
A = $\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$, we can write our vectors ~u and ~v as a linear combination of the basis vectors: ~u = A~ua and
~v = A~va. This is exactly the change of basis relationship between the standard basis and the ~a1, · · · ,~an basis.

T~u = ~v
TA~ua = A~va
A−1TA~ua = ~va

Since we want Ta~ua = ~va , Ta = A−1 TA. In fact, the correspondences stated above are all represented in the
following diagram.

                T
      ~u  ─────────→  ~v
       │ ↑             │ ↑
   A−1 │ │ A       A−1 │ │ A
       ↓ │             ↓ │
      ~ua ─────────→  ~va
                Ta

For example, there are two ways to get from ~ua to ~va . First is the transformation Ta . Second, we can trace
out the longer path, applying transformations A, T and A−1 in order. This is represented in matrix form as
Ta = A−1 TA (note that the order is reversed since this matrix acts on a vector ~ua on the right).



By the same logic, we can go in the reverse direction. And then T = ATa A−1 . Notice that this makes sense.
A−1 takes a vector written in the standard basis and returns the coordinates in the A basis. Ta then acts on it
and returns new coordinates in the A basis. Multiplying by A returns a weighted sum of the columns of A,
in other words, giving us the vector back in the original standard basis.
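The diagram can also be checked numerically. The sketch below is illustrative only (T and A are arbitrary invertible examples, not from the notes); it verifies that the short path Ta and the long path A−1TA give the same result:

import numpy as np

T = np.array([[2.0, 1.0], [0.0, 3.0]])   # a transformation in the standard basis
A = np.array([[1.0, 0.0], [1.0, 1.0]])   # columns are the new basis vectors a1, a2
A_inv = np.linalg.inv(A)

T_a = A_inv @ T @ A                      # the same transformation in the a-basis

u_a = np.array([4.0, -1.0])              # some vector, written in the a-basis
v_a_direct = T_a @ u_a                   # path 1: transform directly in the a-basis
v_a_long = A_inv @ (T @ (A @ u_a))       # path 2: to the standard basis, apply T, come back
assert np.allclose(v_a_direct, v_a_long)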

A Diagonalizing Basis

Now that we know what a linear transformation looks like under a different basis, is there a special basis under
which the transformation attains a nice form? Let's suppose that the basis vectors ~a1, · · · ,~an are linearly
independent eigenvectors of the transformation matrix T, with eigenvalues λ1, · · · , λn. What does T~u look
like now? Recall that ~u can be written in the new basis: ~u = ua1~a1 + · · · + uan~an.

$$\begin{aligned}
T\vec{u} &= T(u_{a1}\vec{a}_1 + \cdots + u_{an}\vec{a}_n) \\
&= u_{a1}T\vec{a}_1 + \cdots + u_{an}T\vec{a}_n \\
&= u_{a1}\lambda_1\vec{a}_1 + \cdots + u_{an}\lambda_n\vec{a}_n \\
&= \begin{bmatrix} | & \cdots & | \\ \vec{a}_1 & \cdots & \vec{a}_n \\ | & \cdots & | \end{bmatrix} \begin{bmatrix} \lambda_1 u_{a1} \\ \vdots \\ \lambda_n u_{an} \end{bmatrix} \\
&= AD\vec{u}_a \\
&= ADA^{-1}\vec{u}
\end{aligned}$$

Where D is the diagonal matrix of eigenvalues and A is the matrix with the corresponding eigenvectors as its
columns. Thus we have shown that T = ADA−1. In particular, Ta = A−1TA = D, so the counterpart
of T in the eigenvector basis is a diagonal matrix.

This gives us many useful properties. Recall that matrix multiplication does not commute in general, but
diagonal matrices commute. Can we construct other non-diagonal matrices that commute under multiplication?
Suppose T1 and T2 can both be expressed in the same diagonalizing basis, that is, they have the same
eigenvectors but possibly different eigenvalues. Then T1 = AD1A−1 and T2 = AD2A−1. We show that T1 and
T2 commute:

T1 T2 = (AD1 A−1 )(AD2 A−1 )


= AD1 D2 A−1
= AD2 D1 A−1
= (AD2 )(A−1 A)(D1 A−1 )
= (AD2 A−1 )(AD1 A−1 )
= T2 T1
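A small illustrative check in numpy (A, D1, D2 below are arbitrary example matrices, not from the notes):

import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # shared eigenvector matrix (invertible)
D1 = np.diag([2.0, 5.0])                 # eigenvalues of T1
D2 = np.diag([-1.0, 3.0])                # eigenvalues of T2
A_inv = np.linalg.inv(A)

T1 = A @ D1 @ A_inv
T2 = A @ D2 @ A_inv
assert np.allclose(T1 @ T2, T2 @ T1)     # same eigenvectors  =>  they commute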

One of the other great properties has to do with raising a matrix to a power. If we can diagonalize a matrix,
then this is really easy. Can you see for yourself why the argument above tells us that if T = ADA−1, then
T^k = AD^kA−1? (Hint: all the terms in the middle cancel.) But raising a diagonal matrix to a power is just
raising its individual diagonal entries to that same power.
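As an illustrative sketch (assuming the matrix is diagonalizable; this is our code, not part of the notes), here is how this plays out numerically with numpy, whose np.linalg.eig returns the eigenvalues and an eigenvector matrix A:

import numpy as np

def power_via_diagonalization(T, k):
    # Assumes T is diagonalizable; computes T^k = A D^k A^{-1}.
    eigvals, A = np.linalg.eig(T)
    D_k = np.diag(eigvals ** k)          # raise only the diagonal entries to the k-th power
    return A @ D_k @ np.linalg.inv(A)

T = np.array([[3.0, 2.0, 4.0],
              [2.0, 0.0, 2.0],
              [4.0, 2.0, 3.0]])
assert np.allclose(power_via_diagonalization(T, 5), np.linalg.matrix_power(T, 5))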



Diagonalization

We saw from the previous section the usefulness of representing a matrix (i.e. a linear transformation) in
a basis so that it is diagonal. So under what circumstances is a matrix diagonalizable? Recall from before
that an n × n matrix T is diagonalizable if it has n linearly independent eigenvectors. If it has n linearly
independent eigenvectors ~a1, · · · ,~an with eigenvalues λ1, · · · , λn, then we can write:

T = ADA−1
Where A = $\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$ and D is the diagonal matrix of eigenvalues.
" #
1 0
Example 21.1 (A 2 × 2 matrix that is diagonalizable): Let T = . To find the eigenvalues, we
0 −1
solve the equation:

det(T − λ I) = 0
(1 − λ )(−1 − λ ) = 0
λ = 1, −1
" #
1
The eigenvector corresponding to λ = 1 is ~a1 = . And the eigenvalues corresponding to λ = −1 is
0
" # " # " #
0 1 0 1 0
~a2 = . Now we have the diagonalization of T = ADA−1 where A = ,D= , and A−1 =
1 0 1 0 −1
" #
1 0
.
0 1
"
#
1 0
Example 21.2 (A 2 × 2 matrix that is not diagonalizable): Let T = . To find the eigenvalues,
−1 1
we solve the equation:

det(T − λ I) = 0
(1 − λ )2 = 0
λ =1
" #
0
The eigenvector corresponding to λ = 1 is ~a = . Since this matrix only has 1 eigenvector, it is not
1
diagonalizable.
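Numerically, one can see the shortage of eigenvectors by checking the dimension of the nullspace of T − λI (an illustrative sketch, not part of the original notes):

import numpy as np

T = np.array([[1.0, 0.0],
              [-1.0, 1.0]])
lam = 1.0
rank = np.linalg.matrix_rank(T - lam * np.eye(2))
print(2 - rank)   # 1: only one linearly independent eigenvector for lambda = 1,
                  # so T is not diagonalizable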

Example 21.3 (Diagonalization of a 3 × 3 matrix):

$$T = \begin{bmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ 4 & 2 & 3 \end{bmatrix}$$



To find the eigenvalues, we solve the equation:

det(T − λ I) = 0
λ^3 − 6λ^2 − 15λ − 8 = 0
(λ − 8)(λ + 1)^2 = 0
λ = −1, 8

If λ = −1, we need to find ~a such that:

$$\begin{bmatrix} 4 & 2 & 4 \\ 2 & 1 & 2 \\ 4 & 2 & 4 \end{bmatrix}\vec{a} = 0 \implies \begin{bmatrix} 2 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\vec{a} = 0$$

Thus the dimension of the nullspace of T − (−1)I is 2, and we can find two linearly independent vectors in
this nullspace:

$$\vec{a}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \qquad \vec{a}_2 = \begin{bmatrix} 1 \\ -2 \\ 0 \end{bmatrix}$$

If λ = 8, we need to find ~a such that:

$$\begin{bmatrix} -5 & 2 & 4 \\ 2 & -8 & 2 \\ 4 & 2 & -5 \end{bmatrix}\vec{a} = 0 \implies \begin{bmatrix} 1 & 0 & -1 \\ 0 & 2 & -1 \\ 0 & 0 & 0 \end{bmatrix}\vec{a} = 0$$

Thus the dimension of the nullspace of T − 8I is 1, and we find the vector in the nullspace:

$$\vec{a}_3 = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$$

Now we define:

$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & -2 & 1 \\ -1 & 0 & 2 \end{bmatrix} \qquad D = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 8 \end{bmatrix} \qquad A^{-1} = \frac{1}{9}\begin{bmatrix} 4 & 2 & -5 \\ 1 & -4 & 1 \\ 2 & 1 & 2 \end{bmatrix}$$

Then T can be diagonalized as T = ADA−1.
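As an illustrative check (our code, not from the notes), numpy confirms this diagonalization:

import numpy as np

T = np.array([[3.0, 2.0, 4.0],
              [2.0, 0.0, 2.0],
              [4.0, 2.0, 3.0]])
A = np.array([[ 1.0,  1.0, 2.0],
              [ 0.0, -2.0, 1.0],
              [-1.0,  0.0, 2.0]])      # columns are a1, a2, a3
D = np.diag([-1.0, -1.0, 8.0])

assert np.allclose(A @ D @ np.linalg.inv(A), T)   # T = A D A^{-1}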

(Extra) Application - the DFT Basis

We would like to offer a sneak peek into why diagonalization can be useful in signals and linear time-invariant
systems. You will not be responsible for the material introduced in this section; we just
want to give you some motivation behind the topic.



Any finite-time linear time-invariant system can be described by a matrix of the form

$$S = \begin{bmatrix} s_0 & s_{n-1} & s_{n-2} & \cdots & s_1 \\ s_1 & s_0 & s_{n-1} & \cdots & s_2 \\ s_2 & s_1 & s_0 & \cdots & s_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{n-1} & s_{n-2} & s_{n-3} & \cdots & s_0 \end{bmatrix} \tag{1}$$

These kinds of matrices are called circulant matrices. Observe that the second column is the first column
shifted by one time step, the third column is the first column shifted by two time steps, . . . , and the nth
column is the first column shifted by n − 1 time steps. As we can see, we need exactly n parameters
$\vec{s} = \begin{bmatrix} s_0 & s_1 & \cdots & s_{n-1} \end{bmatrix}^T$ to uniquely define the system.
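For instance, a circulant matrix can be built from its first column by cyclically shifting it; the sketch below is an illustration with made-up values of ~s, not something from the original notes:

import numpy as np

s = np.array([1.0, 2.0, 3.0, 4.0])      # example parameters s0, s1, s2, s3
n = len(s)
# Column k is the first column delayed by k time steps (with wrap-around):
S = np.column_stack([np.roll(s, k) for k in range(n)])
print(S)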
For example, if the input signal to the system is $\vec{x} = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^T$, then the output of the system would
be

$$S\vec{x} = \begin{bmatrix} s_0 & s_{n-1} & s_{n-2} & \cdots & s_1 \\ s_1 & s_0 & s_{n-1} & \cdots & s_2 \\ s_2 & s_1 & s_0 & \cdots & s_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{n-1} & s_{n-2} & s_{n-3} & \cdots & s_0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_{n-1} \end{bmatrix}. \tag{2}$$

For a general signal $\vec{x} = \begin{bmatrix} x_0 & x_1 & \cdots & x_{n-1} \end{bmatrix}^T$, the output signal would be

$$S\vec{x} = \begin{bmatrix} s_0 & s_{n-1} & s_{n-2} & \cdots & s_1 \\ s_1 & s_0 & s_{n-1} & \cdots & s_2 \\ s_2 & s_1 & s_0 & \cdots & s_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{n-1} & s_{n-2} & s_{n-3} & \cdots & s_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}. \tag{3}$$

However, doing such a matrix multiplication is tedious. Is there a way to transform S and ~x to a different basis
so that in this basis S becomes a diagonal matrix S′ and ~x becomes ~x′? In this case, computing the matrix
multiplication is simple:

$$S'\vec{x}' = \begin{bmatrix} s'_0 & & \\ & \ddots & \\ & & s'_{n-1} \end{bmatrix} \begin{bmatrix} x'_0 \\ \vdots \\ x'_{n-1} \end{bmatrix} = \begin{bmatrix} s'_0 x'_0 \\ \vdots \\ s'_{n-1} x'_{n-1} \end{bmatrix}. \tag{4}$$

In other words, we would like to be able to diagonalize S, i.e., find U such that S = US′U−1, where the
columns of U are the desired basis vectors.

To find U, we need to find the eigenvectors and eigenvalues of S. Recall that we can usually start by finding
the eigenvalues by solving the equation det(S − λI) = 0; however, in this case, this does not give us much
information. Instead, let's try to identify some patterns by solving the n = 3 case:



 
$$S = \begin{bmatrix} s_0 & s_2 & s_1 \\ s_1 & s_0 & s_2 \\ s_2 & s_1 & s_0 \end{bmatrix} \tag{5}$$

Let $\vec{v} = [w_0, w_1, w_2]^T$. Then

$$S\vec{v} = \begin{bmatrix} s_0 w_0 + s_1 w_2 + s_2 w_1 \\ s_0 w_1 + s_1 w_0 + s_2 w_2 \\ s_0 w_2 + s_1 w_1 + s_2 w_0 \end{bmatrix} \tag{6}$$

For ~v to be an eigenvector, S~v = λ~v. What properties should w0 , w1 and w2 have so that ~v is an eigenvector?
We see an intriguing pattern in S~v. If wi has the following two properties:

1. $w_i w_j = w_{i+j \,(\mathrm{mod}\, n)}$

2. w0 = 1

Then we can factor out the eigenvalue:

$$S\vec{v} = \begin{bmatrix} s_0 w_0 + s_1 w_2 + s_2 w_1 \\ s_0 w_1 + s_1 w_0 + s_2 w_2 \\ s_0 w_2 + s_1 w_1 + s_2 w_0 \end{bmatrix} = \begin{bmatrix} (s_0 w_0 + s_1 w_2 + s_2 w_1)\,w_0 \\ (s_0 w_0 + s_1 w_2 + s_2 w_1)\,w_1 \\ (s_0 w_0 + s_1 w_2 + s_2 w_1)\,w_2 \end{bmatrix} = (s_0 w_0 + s_1 w_2 + s_2 w_1)\begin{bmatrix} w_0 \\ w_1 \\ w_2 \end{bmatrix} \tag{7}$$

Therefore ~v is an eigenvector with eigenvalue s0 w0 + s1 w2 + s2 w1 .

This guessing and checking method we used might seem quite magical, but it is often a great way to find
eigenvectors of matrices with a lot of symmetry.

To really find the eigenvectors, we need to find a set of numbers w0, w1, w2 with the above-mentioned
properties. Real numbers wouldn't do the trick, as they don't have the "wrap-around" property needed. For
all m = 0, 1, . . . , n − 1, complex exponentials of the form $w_k = e^{i\frac{2\pi m}{n}k}$ have all the required properties.

Why? Because $e^{i\frac{2\pi}{n}k}$ steps along the n-th roots of unity as we let k vary from 0, 1, . . . , n − 1. This is a simple
consequence of Euler's identity $e^{i\theta} = \cos\theta + i\sin\theta$ geometrically, or of the fact that $(e^{i\frac{2\pi}{n}})^n = e^{i\frac{2\pi}{n}n} = e^{i2\pi} = 1$.

How could we have come up with this guess? The 3x3 case is not impossible to solve. We could’ve just
written out the matrix for s0 = s2 = 0 and s1 = 1. (This is the shift-by-one matrix that simply cyclically
delays the signal it acts on by 1 time step.) At this point, you could explicitly evaluate det(S − λI) to
find the characteristic polynomial and see that we need to find the roots of λ^3 − 1. This implies that the
eigenvalues are the 3rd roots of unity. From there, you can find the eigenvectors by direct computation and
you would find:

The 3 eigenvectors are:


     
$$\vec{v}_0 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \vec{v}_1 = \begin{bmatrix} 1 \\ e^{\frac{2\pi i}{3}} \\ e^{\frac{4\pi i}{3}} \end{bmatrix}, \quad \vec{v}_2 = \begin{bmatrix} 1 \\ e^{\frac{4\pi i}{3}} \\ e^{\frac{2\pi i}{3}} \end{bmatrix} \tag{8}$$
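Here is an illustrative numerical check (with made-up values s0, s1, s2; our code, not from the notes) that these three vectors are indeed eigenvectors of a 3 × 3 circulant matrix:

import numpy as np

s = np.array([1.0, 2.0, 3.0])                       # example values s0, s1, s2
S = np.column_stack([np.roll(s, k) for k in range(3)])

omega = np.exp(2j * np.pi / 3)
for l in range(3):
    v = omega ** (l * np.arange(3))                 # v_l = [1, omega^l, omega^(2l)]
    Sv = S @ v
    lam = Sv[0] / v[0]                              # the corresponding eigenvalue
    assert np.allclose(Sv, lam * v)                 # v_l is an eigenvector of S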



You would then notice that the same eigenvectors work for the case s0 = s1 = 0 and s2 = 1. This is the delay-by-2
matrix which, by inspection, can be seen to be D^2 if D is the delay-by-one matrix. At this point, you
know that since every 3×3 circulant matrix is a linear combination of the identity matrix (no delay)
and these two pure delay matrices, and all three of these matrices share the same eigenvectors, these must
in fact be the eigenvectors for all 3×3 circulant matrices.

Generalizing, if we have n-dimensional signals, all LTI systems that act on such signals have the
same n eigenvectors:

$$\vec{v}_l = [\omega^{l\cdot 0}, \omega^{l\cdot 1}, \ldots, \omega^{l\cdot(n-1)}]^T \tag{9}$$

Where $\omega = e^{\frac{2\pi i}{n}}$. To normalize these, we take $\vec{u}_l = \frac{1}{\sqrt{n}}\vec{v}_l$; as we will see, the extra $\frac{1}{\sqrt{n}}$ term makes the norm
be 1.

Therefore we can write U as:

$$U = \begin{bmatrix} \vec{u}_0 & \vec{u}_1 & \cdots & \vec{u}_{n-1} \end{bmatrix} = \frac{1}{\sqrt{n}}\begin{bmatrix} 1 & 1 & \cdots & 1 & 1 \\ 1 & \omega & \cdots & \omega^{n-2} & \omega^{n-1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 1 & \omega^{n-2} & \cdots & \omega^{(n-2)(n-2)} & \omega^{(n-1)(n-2)} \\ 1 & \omega^{n-1} & \cdots & \omega^{(n-2)(n-1)} & \omega^{(n-1)(n-1)} \end{bmatrix} \tag{10}$$

The basis in which S is diagonalized is called the Fourier basis or DFT basis (also called the frequency
domain), and U as a linear transformation is called the inverse discrete Fourier transform that brings
signals from the frequency domain into the time domain. The conjugate transpose of U, namely U ∗ , maps
signals from the time-domain to the frequency domain and as a transformation is called the discrete Fourier
transform or DFT.
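To close the loop, here is an illustrative sketch (for a small n, with made-up values of ~s; our code, not from the notes) showing that changing basis with U indeed diagonalizes a circulant matrix:

import numpy as np

n = 4
s = np.array([1.0, 2.0, 3.0, 4.0])                  # example first column of S
S = np.column_stack([np.roll(s, k) for k in range(n)])

omega = np.exp(2j * np.pi / n)
U = np.array([[omega ** (k * l) for l in range(n)]
              for k in range(n)]) / np.sqrt(n)      # entry (k, l) is omega^{kl} / sqrt(n)

S_prime = U.conj().T @ S @ U                        # S in the DFT basis
off_diagonal = S_prime - np.diag(np.diag(S_prime))
assert np.allclose(off_diagonal, 0)                 # S_prime is (numerically) diagonal

The diagonal entries that appear are the eigenvalues of S; with the conventions above they are the DFT of the first column ~s.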

