
GATE Material for Mathematics

Lesson 1: Linear Algebra: Introduction to Matrices: Matrices and Determinants were discovered and developed in the eighteenth and nineteenth centuries. Initially, their development dealt with transformation of geometric objects and solution of systems of linear equations. Historically, the early emphasis was on the determinant, not the matrix. In modern treatments of linear algebra, matrices are considered first. We will not speculate much on this issue. Matrices provide a theoretically and practically useful way of approaching many types of problems including:

Solution of Systems of Linear Equations, Equilibrium of Rigid Bodies (in physics), Graph Theory, Theory of Games, Leontief Economics Model, Forest Management, Computer Graphics, and Computed Tomography, Genetics, Cryptography, Electrical Networks, Fractals.

Introduction and Basic Operations: Matrices, though they may appear to be strange objects at first, are a very important tool for expressing and discussing problems which arise from real life. Our first example deals with economics. Consider two families A and B (though we may easily take more than two). Every month, the two families have expenses such as utilities, health, entertainment, food, etc. Let us restrict ourselves to food, utilities, and health. How would one represent the data collected? Many ways are available, but one of them has the advantage of combining the data so that it is easy to manipulate. Indeed, we write the monthly expenses in a rectangular array, with one row per family and one column per expense item. If we have no problem confusing the names of the families and of the expense items, then we may write the numbers alone, as a block.

This is what we call a Matrix. The size of the matrix, as a block, is defined by the number of Rows and the number of Columns. In this case, the above matrix has 2 rows and 3 columns. You may easily come up with a matrix which has m rows and n columns. In this case, we say that the matrix is an (mxn) matrix (pronounced m-by-n matrix). Keep in mind that the first entry (meaning m) is the number of rows while the second entry (n) is the number of columns. Our above matrix is a (2x3) matrix. When the numbers of rows and columns are equal, we call the matrix a square matrix. A square matrix of order n is an (nxn) matrix.

Back to our example, let us assume that we form such matrices J, F, and M for the months of January, February, and March. To make sure that the reader knows what these numbers mean, you should be able to read off the Health expenses for family A and the Food expenses for family B during the month of February; in our tables the answers are 250 and 600. The next question may sound easy to answer, but requires a new concept in the matrix context: what is the matrix of expenses for the two families for the first quarter? The idea is to add the three matrices above. Since it is easy to determine the total expenses for each family and each item, the answer is obtained by adding the corresponding entries of J, F, and M. So how do we add matrices? The answer suggested by this example is to add the entries one by one. Clearly, if you want to double a matrix, it is enough to add the matrix to itself, and every entry gets doubled.

This suggests the following rule: more generally, for any number λ, the matrix λM is obtained by multiplying every entry of M by λ. Let us summarize these two rules about matrices.

Addition of Matrices: In order to add two matrices, we add the entries one by one. Note: matrices involved in the addition operation must have the same size.

Multiplication of a Matrix by a Number: In order to multiply a matrix by a number, you multiply every entry by the given number. Keep in mind that we always write the number to the left and the matrix to the right (in the case of multiplication).

What about subtracting two matrices? It is easy, since subtraction is a combination of the two rules above. Indeed, if M and N are two matrices, then we write

M - N = M + (-1)N

So first you multiply the matrix N by -1, and then add the result to the matrix M.

Example. Consider the three matrices J, F, and M from above, and evaluate expressions such as J - M and J - F + 2M. To compute J - M, we first form (-1)M and then, since J - M = J + (-1)M, add it to J. Finally, for J - F + 2M, we have a choice. Here we would like to emphasize the fact that an addition of matrices may involve more than two matrices, and in that case the calculations may be performed in any order; this is called associativity of the operations. We may first take care of -F and 2M and, since J - F + 2M = J + (-1)F + 2M, add the three matrices entry by entry; or we may first evaluate J - F and then add 2M to the result. Both orders give the same final matrix.

For the addition of matrices, one special matrix plays a role similar to the number zero. Indeed, if we consider the matrix with all its entries equal to 0, then it is easy to check that this matrix behaves like the number zero: adding it to any matrix M gives back M.
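As a quick numerical illustration of these rules, here is a minimal NumPy sketch; the entries below are invented for the example and are not the expense tables J, F, and M discussed above.

import numpy as np

# Two illustrative 2x3 expense-style matrices (rows: families, columns: items).
A = np.array([[650.0, 125.0, 50.0],
              [600.0, 150.0, 60.0]])
B = np.array([[700.0, 130.0, 250.0],
              [650.0, 200.0, 80.0]])

print(A + B)        # addition: entries are added one by one
print(2 * A)        # multiplication by a number: every entry is doubled
print(A - B)        # subtraction: A + (-1)B
Z = np.zeros((2, 3))
print(np.array_equal(A + Z, A))   # True: the zero matrix behaves like the number 0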

What about multiplying two matrices? Such an operation exists, but the calculations involved are more complicated.

Multiplication of Matrices: Before we give the formal definition of how to multiply two matrices, we will discuss an example from a real life situation. Consider a city with two kinds of population: the inner city population and the suburb population. We assume that every year 40% of the inner city population moves to the suburbs, while 30% of the suburb population moves to the inner part of the city. Let I (resp. S) be the initial population of the inner city (resp. the suburban area). So after one year, the population of the inner part is

0.6 I + 0.3 S,

while the population of the suburbs is

0.4 I + 0.7 S.

After two years, the population of the inner city is

0.6 (0.6 I + 0.3 S) + 0.3 (0.4 I + 0.7 S),

and the suburban population is given by

0.4 (0.6 I + 0.3 S) + 0.7 (0.4 I + 0.7 S).

Is there a nice way of representing the two populations after a certain number of years? Let us show how matrices may be helpful in answering this question. Let us represent the two populations in one table (meaning a column object with two entries):

So after one year, the table which gives the two populations is

[ 0.6 I + 0.3 S ]
[ 0.4 I + 0.7 S ]

If we consider the following rule (the product of two matrices),

[ a  b ] [ x ]   [ a x + b y ]
[ c  d ] [ y ] = [ c x + d y ]

then the populations after one year are given by the formula

[ 0.6  0.3 ] [ I ]
[ 0.4  0.7 ] [ S ]

After two years, the populations are obtained by applying the same rule to the populations after one year. Combining this formula with the above result, we get the populations after two years by multiplying the initial column

[ I ]
[ S ]

twice by the matrix

[ 0.6  0.3 ]
[ 0.4  0.7 ]

In other words, the populations after two years are given by the square of this matrix applied to the initial column.
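The whole discussion condenses into a few lines of NumPy. The transition matrix is the one read off from the percentages above; the initial populations are arbitrary illustrative values.

import numpy as np

A = np.array([[0.6, 0.3],
              [0.4, 0.7]])          # inner-city/suburb transition matrix from the text
x = np.array([100_000.0, 50_000.0]) # illustrative initial populations [I, S]

one_year = A @ x                    # populations after one year
two_years = A @ (A @ x)             # populations after two years
also_two_years = np.linalg.matrix_power(A, 2) @ x   # same thing, via the square of A

print(one_year)
print(np.allclose(two_years, also_two_years))        # True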

In fact, we do not need two matrices of the same size to multiply them. Above, we multiplied a (2x2) matrix with a (2x1) matrix (which gave a (2x1) matrix). The general rule says that in order to perform the multiplication AB, where A is an (mxn) matrix and B a (kxl) matrix, we must have n = k. The result will be an (mxl) matrix. For example, we have

Remember that though we were able to perform the above multiplication, it is not possible to perform the multiplication in the opposite order.

So we have to be very careful about multiplying matrices. Sentences like "multiply the two matrices A and B" do not make sense. You must know which of the two matrices will be to the right (of your multiplication) and which one will be to the left; in other words, we have to know whether we are asked to perform AB or BA. Even if both multiplications do make sense (as in the case of square matrices of the same size), we still have to be very careful. Indeed, consider the two matrices

If we compute the two products AB and BA, we find two different results. So what is the conclusion behind this example? Matrix multiplication is not commutative: the order in which matrices are multiplied is important. In fact, this little setback is a major problem in playing around with matrices, and it is something that you must always be careful with. Let us show you another setback. We have examples where

The product of two non-zero matrices may be equal to the zero-matrix.
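Both setbacks are easy to reproduce numerically; the matrices below are small standard examples chosen for the sketch, not the ones displayed above.

import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])
print(A @ B)      # [[2, 1], [1, 1]]
print(B @ A)      # [[1, 1], [1, 2]]  -> AB is not equal to BA

C = np.array([[0, 1],
              [0, 0]])
print(C @ C)      # the zero matrix, although C itself is not zero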

Algebraic Properties of Matrix Operations: In this section, we give some general results about the three operations: addition, multiplication, and multiplication with numbers, called scalar multiplication. From now on, we will not write (mxn) but mxn.

Properties involving Addition. Let A, B, and C be mxn matrices. We have
1. A + B = B + A
2. (A + B) + C = A + (B + C)
3. A + O = A, where O is the mxn zero-matrix (all its entries are equal to 0);
4. A + B = O if and only if B = -A.

Properties involving Multiplication. 1. Let A, B, and C be three matrices. If you can perform the products AB, (AB)C, BC, and A(BC), then we have

(AB)C = A (BC)

Note, for example, that if A is 2x3, B is 3x3, and C is 3x1, then the above products are possible (in this case, (AB)C is a 2x1 matrix).

2. If α and β are numbers, and A is a matrix, then we have

α (β A) = (α β) A

3. If α is a number, and A and B are two matrices such that the product AB is possible, then we have

α (A B) = (α A) B = A (α B)

4. If A is an nxm matrix and O is the mxk zero-matrix, then A O is the nxk zero-matrix. Note that if n is different from m, the two zero-matrices are different.

Properties involving Addition and Multiplication. 1. Let A, B, and C be three matrices. If you can perform the appropriate products, then we have

(A+B)C = AC + BC

and

A(B+C) = AB + AC

2. If α and β are numbers, and A and B are matrices, then we have

α (A + B) = α A + α B

and

(α + β) A = α A + β A

Example. Consider three matrices A, B, and C for which the products below are possible. Evaluate (AB)C and A(BC), and check that you get the same matrix. Answer. We first compute AB and then multiply the result by C; on the other hand, we compute BC and then multiply A by the result. Both computations give the same final matrix.

Example. Consider the product of a matrix with a column vector. It is easy to check that the result can be written as a sum of the columns of the matrix, each multiplied by the corresponding entry of the column vector; a similar formula expresses the product of a row vector with a matrix in terms of the rows of the matrix.

These two formulas are called linear combinations. More on linear combinations will be discussed on a different page. We have seen that matrix multiplication is different from normal multiplication (between numbers). Are there some similarities? For example, is there a matrix which plays a role similar to that of the number 1? The answer is yes. Indeed, consider the nxn matrix In which has 1's on the diagonal and 0's everywhere else.

In particular, for n = 2 and n = 3 we have

I2 = [ 1  0 ]        I3 = [ 1  0  0 ]
     [ 0  1 ]             [ 0  1  0 ]
                          [ 0  0  1 ]

The matrix In has a behavior similar to that of the number 1. Indeed, for any nxn matrix A, we have

A In = In A = A

The matrix In is called the Identity Matrix of order n.

Example. Consider the matrices above once more. Then it is easy to check that multiplying each of them by the identity matrix of the appropriate size gives back the same matrix.

The identity matrix behaves like the number 1 not only among the matrices of the form nxn. Indeed, for any nxm matrix A, we have

In A = A   and   A Im = A

In particular, these identities hold for rectangular matrices as well as for square ones.
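A short numerical check of this behavior, using an arbitrary 2x3 matrix:

import numpy as np

A = np.random.rand(2, 3)          # any 2x3 matrix
I2 = np.eye(2)
I3 = np.eye(3)
print(np.allclose(I2 @ A, A))     # True: I2 A = A
print(np.allclose(A @ I3, A))     # True: A I3 = A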

Invertible Matrices: Invertible matrices are very important in many areas of science. For example, decrypting a coded message uses invertible matrices (see the coding application below). The problem of finding the inverse of a matrix will be discussed in a different section. Definition. An nxn matrix A is called nonsingular or invertible iff there exists an nxn matrix B such that

A B = B A = In

where In is the identity matrix. The matrix B is called the inverse matrix of A. Example. Let A and B be a pair of matrices for which one may easily check that A B = B A = I2. Hence A is invertible and B is its inverse. Notation. A common notation for the inverse of a matrix A is A⁻¹. So B = A⁻¹.

Example. Find the inverse of a given 2x2 matrix A. Write the unknown inverse B with unknown entries. Since A B must equal I2, we get a system of equations in the entries of B. Easy algebraic manipulations give the entries, or equivalently the matrix B = A⁻¹.

The inverse matrix is unique when it exists. So if A is invertible, then A⁻¹ is also invertible and

(A⁻¹)⁻¹ = A

The following basic property is very important:

If A and B are invertible matrices, then AB is also invertible and

(A B)⁻¹ = B⁻¹ A⁻¹

Remark. In the definition of an invertible matrix A, we required both AB and BA to be equal to the identity matrix. In fact, we need only one of the two. In other words, for a square matrix A, if there exists a matrix B such that AB = In (or BA = In), then A is invertible and B = A⁻¹.
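Numerically, the inverse and the defining property can be checked as follows; the matrix used is just an arbitrary invertible example.

import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])              # determinant 1, so A is invertible
B = np.linalg.inv(A)
print(B)                                # [[ 3., -1.], [-5.,  2.]]
print(np.allclose(A @ B, np.eye(2)))    # True
print(np.allclose(B @ A, np.eye(2)))    # True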

Applications:
Application of Invertible Matrices: Coding. There are many ways to encrypt a message, and the use of coding has become particularly significant in recent years (due to the explosion of the internet, for example). One way to encrypt or code a message uses matrices and their inverses. Indeed, consider a fixed invertible matrix A. Convert the message into a matrix B such that the product AB is possible to perform. Send the message generated by AB. At the other end, the receiver will need to know A⁻¹ in order to decrypt or decode the message sent. Indeed, we have

A⁻¹ (A B) = (A⁻¹ A) B = B

which is the original message. Keep in mind that whenever an undesired intruder finds A, we must be able to change it. So we should have a mechanical way of generating simple matrices A which are invertible and have simple inverse matrices. Note that, in general, the inverse of a matrix involves fractions which are not easy to send in an electronic form. The best is to have both A and its inverse with integer entries. In fact, we can use our previous knowledge to generate such a class of matrices: if A is a matrix such that its determinant is ±1 and all its entries are integers, then A⁻¹ will have entries which are integers. So how do we generate such a class of matrices? One practical way is to start with an upper triangular matrix with ±1 on the diagonal and integer entries. Then we use elementary row operations to change the matrix while keeping the determinant unchanged. Do not multiply rows by non-integers while doing elementary row operations. Let us illustrate this with an example. Example. Consider the matrix

First we keep the first row and add it to the second as well as to the third rows. We obtain

Next we keep the first row again, we add the second to the third, and finally add the last one to the first multiplied by -2. We obtain

This is our matrix A. Easy calculations will give det(A) = -1, which we knew since the above elementary operations did not change the determinant from the original triangular matrix which obviously has -1 as its determinant. We leave the details of the calculations to the reader. The inverse of A is

Back to our original problem. Consider the message

To every letter we will associate a number. The easiest way to do that is to associate 0 to a blank or space, 1 to A, 2 to B, etc. Another way is to associate 0 to a blank or space, 1 to A, -1 to B, 2 to C, -2 to D, etc. Let us use the second choice. So our message is given by the string

Now we rearrange these numbers into a matrix B. For example, we have

Then we perform the product AB, where A is the matrix found above. We get

The encrypted message to be sent is
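The complete encode/decode round trip can be sketched as follows. The encoding matrix A below is an assumed integer matrix with determinant 1 (not the matrix constructed above, whose entries did not survive extraction), and the letter-to-number scheme used is the simpler first choice (space -> 0, A -> 1, B -> 2, ...); the point is only that multiplying by A⁻¹ recovers the original message exactly.

import numpy as np

# An assumed 3x3 integer matrix with det = 1, so its inverse also has integer entries.
A = np.array([[1, 2, 3],
              [0, 1, 4],
              [0, 0, 1]])

msg = "ATTACK AT DAWN"
nums = [0 if ch == " " else ord(ch) - ord("A") + 1 for ch in msg]   # space -> 0, A -> 1, ...
nums += [0] * (-len(nums) % 3)                # pad so the length is a multiple of 3
B = np.array(nums).reshape(-1, 3).T           # arrange the numbers into a 3-row matrix

encrypted = A @ B                             # the message actually sent
decrypted = np.rint(np.linalg.inv(A) @ encrypted).astype(int)
print(np.array_equal(decrypted, B))           # True: the original message is recovered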

Complex Numbers as Matrices: In this section, we use matrices to give a representation of complex numbers. Indeed, consider the set of 2x2 matrices of the form

Ma,b = [ a  -b ]
       [ b   a ]

where a and b are real numbers. Clearly, the set is not empty; for example, it contains the identity matrix I2 = M1,0. In particular, we have the matrices Ma,b and Mc,d in this set for any real numbers a, b, c, and d. Algebraic Properties: 1. Addition: For any real numbers a, b, c, and d, we have Ma,b + Mc,d = Ma+c,b+d. In other words, if we add two elements of the set, we still get a matrix in the set. In particular, we have -Ma,b = M-a,-b.

2. Multiplication by a number: We have

α Ma,b = Mαa,αb

So multiplying an element of the set by a number gives a matrix in the set.

3. Multiplication: For any real numbers a, b, c, and d, we have

Ma,b Mc,d = Mac-bd, ad+bc

In other words, the product of two elements of the set is again an element of the set. This is an extraordinary formula: given the difficult form of matrix multiplication, it is quite conceivable that, a priori, the product of two elements of the set may not be in the set again; but, in this case, it turns out to be true. The above properties give the set a very nice structure. The next natural question to ask, in this case, is whether a nonzero element of the set is invertible. Indeed, for any real numbers a and b, we have

det(Ma,b) = a² + b²

So, if (a, b) ≠ (0, 0), the matrix Ma,b is invertible and

Ma,b⁻¹ = M a/(a²+b²), -b/(a²+b²)

In other words, any nonzero element Ma,b of the set is invertible, and its inverse is still in the set, since it is again of the form Mx,y with real x and y. In order to define division in the set, we will use the inverse. Indeed, recall that dividing by an element amounts to multiplying by its inverse. So for the set, we have

Ma,b / Mc,d = Ma,b Mc,d⁻¹ = Ma,b M c/(c²+d²), -d/(c²+d²)

which implies

Ma,b / Mc,d = M (ac+bd)/(c²+d²), (bc-ad)/(c²+d²)

The matrix Ma,-b is called the conjugate of Ma,b. Note that the conjugate of the conjugate of Ma,b is Ma,b itself. Fundamental Equation. For any Ma,b in the set, we have

Ma,b = a M1,0 + b M0,1 = a I2 + b M0,1

Note that M0,1 M0,1 = M-1,0 = - I2.

Remark. If we introduce an imaginary number i such that i² = -1, then the matrix Ma,b may be identified with the complex number a + bi. A lot more can be said about this set, but we advise you to consult a discussion of complex numbers.
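A small sketch of this correspondence, assuming the convention Ma,b = [[a, -b], [b, a]] used above (the transposed convention works just as well):

import numpy as np

def M(a, b):
    # Matrix representation of the complex number a + bi (one of the two usual conventions).
    return np.array([[a, -b],
                     [b,  a]], dtype=float)

z, w = 2 + 3j, 1 - 4j
prod = z * w                       # complex multiplication: (2+3i)(1-4i) = 14 - 5i
P = M(2, 3) @ M(1, -4)             # matrix multiplication of the representatives
print(prod)                        # (14-5j)
print(P)                           # [[14, 5], [-5, 14]] = M(14, -5), i.e. 14 - 5i
print(np.allclose(M(0, 1) @ M(0, 1), -np.eye(2)))   # M(0,1)^2 = -I2, playing the role of i^2 = -1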

Markov Chains: In a previous section, we studied the movement between the city and the suburbs. Indeed, if I and S are the initial populations of the inner city and the suburban area, and if we assume that every year 40% of the inner city population moves to the suburbs, while 30% of the suburb population moves to the inner part of the city, then after one year the populations are given by

[ 0.6  0.3 ] [ I ]
[ 0.4  0.7 ] [ S ]

The matrix

[ 0.6  0.3 ]
[ 0.4  0.7 ]

is very special. Indeed, the entries of each column vector are positive and their sum is 1. Such vectors are called probability vectors. A matrix for which all the column vectors are probability vectors is called a transition or stochastic matrix. Andrei Markov, a Russian mathematician, was the first to study these matrices. At the beginning of the twentieth century he developed the fundamentals of Markov chain theory. A Markov chain is a process that consists of a finite number of states and some known probabilities pij, where pij is the probability of moving from state j to state i. In the example above, we have two states: living in the city and living in the suburbs. The number pij represents the probability of moving from state j to state i in one year. We may have more than two states. For example, political affiliation: Democrat, Republican, and Independent. Here pij would represent the probability of a son belonging to party i if his father belonged to party j. Of particular interest is a probability vector p such that

A p = p,

that is, an eigenvector of A associated to the eigenvalue 1. Such a vector is called a steady state vector. In the example above, the steady state vectors are given by the system

This system reduces to the equation -0.4 x + 0.3 y = 0. It is easy to see that, if we set, for example, x = 3, then y = 4. So the vector (3, 4) (or the probability vector (3/7, 4/7) obtained by rescaling it) is a steady state vector of the matrix above. So if the populations of the city and the suburbs are proportional to this vector, after one year the proportions remain the same (though the people may move between the city and the suburbs). Let us discuss another example on population dynamics.

Example: Age Distribution of Trees in a Forest. Trees in a forest are assumed in this simple model to fall into four age groups: b(k) denotes the number of baby trees in the forest (age group 0-15 years) at a given time period k; similarly y(k), m(k) and o(k) denote the number of young trees (16-30 years of age), middle-aged trees (age 31-45), and old trees (older than 45 years of age), respectively. The length of one time period is 15 years. How does the age distribution change from one time period to the next? The model makes the following three assumptions:

A certain percentage of trees in each age group dies. Surviving trees enter into the next age group; old trees remain old. Lost trees are replaced by baby trees.

Note that the total tree population does not change over time. We obtain the following difference equations:

b(k+1) = db b(k) + dy y(k) + dm m(k) + do o(k)   (1)
y(k+1) = (1 - db) b(k)                           (2)
m(k+1) = (1 - dy) y(k)                           (3)
o(k+1) = (1 - dm) m(k) + (1 - do) o(k)           (4)

Here 0 < db, dy, dm, do < 1 denote the loss rates in each age group. Let

x(k) = ( b(k), y(k), m(k), o(k) )ᵀ

be the "age distribution vector". Consider the matrix

A = [ db     dy     dm     do    ]
    [ 1-db   0      0      0     ]
    [ 0      1-dy   0      0     ]
    [ 0      0      1-dm   1-do  ]

Then we have

x(k+1) = A x(k)

Note that the matrix A is a stochastic matrix! If db = 0.1, dy = 0.2, dm = 0.3 and do = 0.4, then

A = [ 0.1   0.2   0.3   0.4 ]
    [ 0.9   0     0     0   ]
    [ 0     0.8   0     0   ]
    [ 0     0     0.7   0.6 ]

After easy calculations, we find the steady state vector for the age distribution in the forest:

Assume a total tree population of 50,000 trees. Suppose the forest is newly planted, i.e. x(0) = (50000, 0, 0, 0)ᵀ.

After 15 years, the age distribution in the forest is given by

After 30 years, we have

and after 45 years

After 15n years, where n = 1, 2, 3, ..., the age distribution in the forest is given by x(n) = Aⁿ x(0).

So the problem is to find the nth power of the matrix A. We have seen that the diagonalization technique may be helpful to solve this problem. Another problem deals with the long term behavior of the sequence x(n) when n gets large. The calculations on the example above become tedious. Let us illustrate the problem on a small matrix. Example. Consider the stochastic matrix

Note this is a symmetric matrix. The characteristic polynomial of A is

An eigenvector associated to 1 is

and an eigenvector associated to 0.6 is

If we set P to be the matrix whose columns are these two eigenvectors, then P⁻¹ A P = D, the diagonal matrix with diagonal entries 1 and 0.6. So we have Aⁿ = P Dⁿ P⁻¹, where Dⁿ is obtained by raising the diagonal entries to the nth power. When n gets large, 0.6ⁿ goes to 0, so the matrices Aⁿ get closer to a limit matrix. So the sequence of vectors defined by x(n) = Aⁿ x(0) will get closer, when n gets large, to the product of this limit matrix with the initial vector x(0). Note that the limiting vector is proportional to the unique steady state vector of A.

This is not surprising. In fact there is a general result similar to the one above for any stochastic matrix.
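For the city/suburb matrix this long-term behavior is easy to verify directly. The sketch below finds the steady state vector as the eigenvector associated to the eigenvalue 1 and compares it with the effect of a high power of A.

import numpy as np

A = np.array([[0.6, 0.3],
              [0.4, 0.7]])

# Steady state: the eigenvector of A associated to the eigenvalue 1, scaled to sum to 1.
vals, vecs = np.linalg.eig(A)
p = vecs[:, np.argmin(np.abs(vals - 1.0))].real
p = p / p.sum()
print(p)                                   # approximately [0.4286, 0.5714], i.e. (3/7, 4/7)

# Any probability vector converges to p under repeated application of A.
x = np.array([0.9, 0.1])
print(np.linalg.matrix_power(A, 50) @ x)   # approximately the same vector p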

Systems of Linear Equations: Introduction


Many books on linear algebra will introduce matrices via systems of linear equations. We tried a different approach; we hope this way you will appreciate matrices as a powerful tool, useful not only for solving linear systems of equations. Basically, the problem of finding some unknowns linked to each other via equations is called a system of equations. For example,

and

are systems of two equations with two unknowns (x and y), while

is a system of two equations with three unknowns (x, y, and z).

These systems of equations occur naturally in many real life problems. For example, consider a nutritious drink which consists of whole egg, milk, and orange juice. The food energy and protein for each of the ingredients are given by the table:

A natural question to ask is how much of each ingredient we need to produce a drink of 540 calories and 25 grams of protein. In order to answer that, let x be the number of eggs, y the amount of milk (in cups), and z the amount of orange juice (in cups). Then we need to have

The task of solving a system consists of finding the unknowns, here: x, y and z. A solution is a set of numbers that, once substituted for the unknowns, satisfies the equations of the system. For example, (2, 1, 2) and (0.325, 2.25, 1.4) are solutions to the system above. The fundamental problem associated to any system is to find all the solutions. One way is to study the structure of its set of solutions, which, in some cases, may help in finding the solutions. Indeed, for example, in order to find the solutions to a linear system, it is enough to find just a few of them. This is possible because of the rich structure of the set of solutions.

Systems of Linear Equations: Gaussian Elimination It is quite hard to solve non-linear systems of equations, while linear systems are quite easy to study. There are numerical techniques which help to approximate nonlinear systems with linear ones in the hope that the solutions of the linear systems are close enough to the solutions of the nonlinear systems. We will not discuss this here. Instead, we will focus our attention on linear systems. For the sake of simplicity, we will restrict ourselves to three, at most four, unknowns. The reader interested in the case of more unknowns may easily extend the following ideas.

Definition. The equation ax+by+cz+dw = h, where a, b, c, d, and h are known numbers, while x, y, z, and w are unknown numbers, is called a linear equation. If h = 0, the linear equation is said to be homogeneous. A linear system is a set of linear equations and a homogeneous linear system is a set of homogeneous linear equations. For example,

and

are linear systems, while

is a nonlinear system (because of y2). The system

is an homogeneous linear system.

Matrix Representation of a Linear System Matrices are helpful in rewriting a linear system in a very simple form. The algebraic properties of matrices may then be used to solve systems. First, consider the linear system

Set the matrices

Using matrix multiplication, we can rewrite the linear system above as the matrix equation

A X = C

As you can see, this is far nicer than the equations. But sometimes it is worth solving the system directly without going through the matrix form. The matrix A is called the coefficient matrix of the linear system. The matrix C is called the nonhomogeneous term. When C = 0, the linear system is homogeneous. The matrix X is the unknown matrix; its entries are the unknowns of the linear system. The augmented matrix associated with the system is the matrix [A|C], where

In general, if the linear system has n equations with m unknowns, then the coefficient matrix will be an nxm matrix and the augmented matrix an nx(m+1) matrix. Now we turn our attention to the solutions of a system.

Definition. Two linear systems with n unknowns are said to be equivalent if and only if they have the same set of solutions. This definition is important since the idea behind solving a system is to find an equivalent system which is easy to solve. You may wonder how we will come up with such a system. Easy: we do that through elementary operations. Indeed, it is clear that if we interchange two equations, the new system is still equivalent to the old one. If we multiply an equation by a nonzero number, we obtain a new system still equivalent to the old one. And finally, replacing one equation by the sum of itself and another equation, we again obtain an equivalent system. These operations are called elementary operations on systems. Let us see how it works in a particular case. Example. Consider the linear system

The idea is to keep the first equation and work on the last two. In doing that, we will try to kill one of the unknowns and solve for the other two. For example, if we keep the first and second equation, and subtract the first one from the last one, we get the equivalent system

Next we keep the first and the last equation, and we subtract the first from the second. We get the equivalent system

Now we focus on the second and the third equation. We repeat the same procedure: try to kill one of the two unknowns (y or z). Indeed, we keep the first and second equation, and we add the second to the third after multiplying it by 3. We get

This obviously implies z = -2. From the second equation, we get y = -2, and finally from the first equation we get x = 4. Therefore the linear system has one solution:

x = 4,  y = -2,  z = -2.

Going from the last equation to the first while solving for the unknowns is called backsolving. Keep in mind that linear systems for which the coefficient matrix is upper-triangular are easy to solve. This is particularly true if the matrix is in echelon form. So the trick is to perform elementary operations to transform the initial linear system into another one for which the coefficient matrix is in echelon form. Using our knowledge about matrices, is there any way we can rewrite what we did above in matrix form which will make our notation (or representation) easier? Indeed, consider the augmented matrix

Let us perform some elementary row operations on this matrix. Indeed, if we keep the first and second row, and subtract the first one from the last one we get

Next we keep the first and the last rows, and we subtract the first from the second. We get

Then we keep the first and second row, and we add the second to the third after multiplying it by 3 to get

This is a triangular matrix which is not in echelon form. The linear system for which this matrix is an augmented one is

As you can see we obtained the same system as before. In fact, we followed the same elementary operations performed above. In every step the new matrix was exactly the augmented matrix associated to the new system. This shows that instead of writing the systems over and over again, it is easy to play around with the elementary row operations and, once we obtain a triangular matrix, write the associated linear system and then solve it. This is known as Gaussian Elimination. Let us summarize the procedure: Gaussian Elimination. Consider a linear system. 1. Construct the augmented matrix for the system; 2. Use elementary row operations to transform the augmented matrix into a triangular one; 3. Write down the new linear system for which the triangular matrix is the associated augmented matrix; 4. Solve the new system. You may need to assign some parametric values to some unknowns, and then apply the method of back substitution to solve the new system. Example. Solve the following system via Gaussian elimination

The augmented matrix is

We use elementary row operations to transform this matrix into a triangular one. We keep the first row and use it to produce all zeros elsewhere in the first column. We have

Next we keep the first and second row and try to have zeros in the second column. We get

Next we keep the first three rows. We add the last one to the third to get

This is a triangular matrix. Its associated system is

Clearly we have v = 1. Set z = s and w = t; then the remaining unknowns are determined by s and t. The first equation implies

x = 2 + y + z - w - v.

Using algebraic manipulations, we get

x = -s - t.

Putting all the stuff together, we have

Example. Use Gaussian elimination to solve the linear system

The associated augmented matrix is

We keep the first row and subtract the first row multiplied by 2 from the second row. We get

This is a triangular matrix. The associated system is

Clearly the second equation implies that this system has no solution. Therefore this linear system has no solution. Definition. A linear system is called inconsistent or overdetermined if it does not have a solution. In other words, the set of solutions is empty. Otherwise the linear system is called consistent. Following the example above, we see that if we perform elementary row operations on the augmented matrix of the system and get a matrix with one of its rows of the form (0 0 ... 0 | b), where b is not 0, then the system is inconsistent.
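The whole procedure is easy to automate. The sketch below runs a plain forward elimination with back substitution on a small invertible system invented for illustration; in practice one can simply call numpy.linalg.solve.

import numpy as np

def gauss_solve(A, c):
    # Forward elimination on the augmented matrix [A|c], then back substitution.
    M = np.hstack([A.astype(float), c.reshape(-1, 1).astype(float)])
    n = len(c)
    for i in range(n):
        pivot = i + np.argmax(np.abs(M[i:, i]))   # partial pivoting: pick the largest pivot
        M[[i, pivot]] = M[[pivot, i]]             # elementary operation: interchange two rows
        for j in range(i + 1, n):
            M[j] -= (M[j, i] / M[i, i]) * M[i]    # elementary operation: add a multiple of a row
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):                # backsolving from the last equation up
        x[i] = (M[i, -1] - M[i, i+1:n] @ x[i+1:]) / M[i, i]
    return x

A = np.array([[1.0, 1.0, 1.0],
              [2.0, -1.0, 1.0],
              [1.0, 2.0, -1.0]])
c = np.array([6.0, 3.0, 2.0])
print(gauss_solve(A, c))         # [1. 2. 3.]
print(np.linalg.solve(A, c))     # same answer from the library routine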

Systems of linear equations in two variables: A system of equations is a collection of two or more equations with the same set of unknowns. In solving a system of equations, we try to find values for each of the unknowns that will satisfy every equation in the system. The equations in the system can be linear or non-linear. This tutorial reviews systems of linear equations. A problem can be expressed in narrative form or in algebraic form. Let's start with an example stated in narrative form. We'll convert it to an equivalent statement in algebraic form, and then we will solve it.

Example 1: A total of $12,000 is invested in two funds paying 9% and 11% simple interest. If the yearly interest is $1,180, how much of the $12,000 is invested at each rate?

Before you work this problem, you must know the definition of simple interest: simple interest is calculated by multiplying the amount invested by the interest rate.

Solution: We have two unknowns: the amount of money invested at 9% and the amount of money invested at 11%. Our objective is to find these two numbers. Sentence (1), ''A total of $12,000 is invested in two funds paying 9% and 11% simple interest.'', can be restated as (The amount of money invested at 9%) + (The amount of money invested at 11%) = $12,000. Sentence (2), ''If the yearly interest is $1,180, how much of the $12,000 is invested at each rate?'', can be restated as (The amount of money invested at 9%) x 9% + (The amount of money invested at 11%) x 11% = total interest of $1,180.

It is going to get tiresome writing the two phrases (The amount of money invested at 9%) and (The amount of money invested at 11%) over and over again. So let's write them in shortcut form: call the phrase (The amount of money invested at 9%) by the symbol x and call the phrase (The amount of money invested at 11%) by the symbol y.

Let's rewrite sentences (1) and (2) in shortcut form:

(1)  x + y = 12,000
(2)  0.09 x + 0.11 y = 1,180

We have converted a narrative statement of the problem to an equivalent algebraic statement of the problem. Let's solve this system of equations. A system of linear equations can be solved four different ways:

Substitution,

Elimination,

Matrices,

Graphing.

The Method of Substitution:

The method of substitution involves five steps:

Step 1: Solve for y in equation (1).

Step 2: Substitute this value for y in equation (2). This will change equation (2) to an equation with just one variable, x.

Step 3: Solve for x in the translated equation (2).

Step 4: Substitute this value of x in the y equation you obtained in Step 1.

Step 5: Check your answers by substituting the values of x and y in each of the original equations. If, after the substitution, the left side of the equation equals the right side of the equation, you know that your answers are correct.

The Method of Elimination:

The process of elimination involves five steps:

In a two-variable problem rewrite the equations so that when theequations are added, one of the variables is eliminated, and then solve for the remaining variable.

Step 1: Change equation (1) by multiplying equation (1) by -0.09 to obtain a new and equivalent equation (1).

Step 2: Add new equation (1) to equation (2) to obtain equation (3), and solve it for y.

Step 3: Substitute the resulting value y = $5,000 in equation (1) and solve for x.

Step 4: Check your answers in equation (2). Does 0.09 ($7,000) + 0.11 ($5,000) = $1,180? Yes: $630 + $550 = $1,180.

The Method of Matrices:

This method is essentially a shortcut for the method of elimination.

Rewrite equations (1) and (2) without the variables and operators. The left column contains the coefficients of the x's, the middle column contains the coefficients of the y's, and the right column contains the constants.

The objective is to reorganize the original matrix into one that looks like

[ 1  0  a ]
[ 0  1  b ]

where a and b are the solutions to the system.

Step 1: Manipulate the matrix so that the number in cell 11 (row 1, column 1) is 1. In this case, we don't have to do anything. The number 1 is already in the cell.

Step 2: Manipulate the matrix so that the number in cell 21 is 0. To do this we rewrite the matrix by keeping row 1 and creating a new row 2 by adding -0.09 x row 1 to row 2.

Step 3: Manipulate the matrix so that cell 22 is 1. Do this by multiplying row 2 by 50.

Step 4: Manipulate the matrix so that cell 12 is 0. Do this by adding -1 x row 2 to row 1.

You can read the answers off the matrix as x = $7,000 and y = $5,000.
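Any of the four methods gives the same answer; here is a sketch that solves the system x + y = 12,000, 0.09x + 0.11y = 1,180 with the matrix approach.

import numpy as np

A = np.array([[1.0, 1.0],
              [0.09, 0.11]])
c = np.array([12_000.0, 1_180.0])
x, y = np.linalg.solve(A, c)
print(x, y)                      # 7000.0 5000.0
print(0.09 * x + 0.11 * y)       # 1180.0 (check in equation (2))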

Determinants:
Introduction to Determinants: For any square matrix of order 2, we have found a necessary and sufficient condition for invertibility. Indeed, consider the matrix

A = [ a  b ]
    [ c  d ]

The matrix A is invertible if and only if ad - bc ≠ 0. We call this number the determinant of A. It is clear from this that we would like to have a similar result for bigger matrices (meaning higher orders). So is there a similar notion of determinant for any square matrix, which determines whether a square matrix is invertible or not?

In order to generalize such a notion to higher orders, we will need to study the determinant and see what kind of properties it satisfies. First let us use the following notation for the determinant:

det(A) = | a  b |
         | c  d |

Properties of the Determinant:

1. Any matrix A and its transpose have the same determinant, meaning det(A) = det(A^T).

This is interesting since it implies that whenever we use rows, a similar behavior will result if we use columns. In particular, we will see how elementary row operations are helpful in finding the determinant. Therefore, we have similar conclusions for elementary column operations.

2. The determinant of a triangular matrix is the product of the entries on the diagonal, that is

3. If we interchange two rows, the determinant of the new matrix is the opposite of the old one, that is

4. If we multiply one row with a constant, the determinant of the new matrix is the determinant of the old one multiplied by the constant, that is

In particular, if all the entries in one row are zero, then the determinant is zero. 5. If we add one row to another one multiplied by a constant, the determinant of the new matrix is the same as the old one, that is

Note that whenever you want to replace a row by something (through elementary operations), do not multiply the row itself by a constant. Otherwise, you will easily make errors (due to Property 4).

6. We have det(AB) = det(A) det(B). In particular, if A is invertible (which happens if and only if det(A) ≠ 0), then det(A⁻¹) = 1/det(A). If A and B are similar, then det(A) = det(B).

Let us look at an example to see how these properties work.

Example. Evaluate the determinant of a given 2x2 matrix. Let us transform this matrix into a triangular one through elementary operations: we keep the first row and add to the second row the first row multiplied by a suitable constant. Using Property 2, we then get the determinant as the product of the diagonal entries of the resulting triangular matrix. Therefore, we have the value of the original determinant, which one may check easily.

Determinants of Matrices of Higher Order: As we said before, the idea is to assume that the previous properties satisfied by the determinant of matrices of order 2 are still valid in general. In other words, we assume:

1. Any matrix A and its transpose have the same determinant, meaning det(A) = det(A^T).

2. The determinant of a triangular matrix is the product of the entries on the diagonal. 3. If we interchange two rows, the determinant of the new matrix is the opposite of the old one. 4. If we multiply one row with a constant, the determinant of the new matrix is the determinant of the old one multiplied by the constant. 5. If we add one row to another one multiplied by a constant, the determinant of the new matrix is the same as the old one. 6. We have det(AB) = det(A) det(B). In particular, if A is invertible (which happens if and only if det(A) ≠ 0), then det(A⁻¹) = 1/det(A).

So let us see how this works in the case of a matrix of order 4. Example. Evaluate

We have

If we subtract every row multiplied by the appropriate number from the first row, we get

We do not touch the first row and work with the other rows. We interchange the second with the third to get

If we subtract every row multiplied by the appropriate number from the second row, we get

Using previous properties, we have

If we multiply the third row by 13 and add it to the fourth, we get

which is equal to 3. Putting all the numbers together, we get

These calculations seem to be rather lengthy. We will see later on that a general formula for the determinant does exist. Example. Evaluate

In this example, we will not give the details of the elementary operations. We have

Example. Evaluate

We have

General Formula for the Determinant: Let A be a square matrix of order n. Write A = (aij), where aij is the entry on row number i and column number j, for 1 ≤ i ≤ n and 1 ≤ j ≤ n. For any i and j, set Aij (called the cofactors) to be the determinant of the square matrix of order (n-1) obtained from A by removing row number i and column number j, multiplied by (-1)^(i+j). We have

det(A) = ai1 Ai1 + ai2 Ai2 + ... + ain Ain

for any fixed i, and

det(A) = a1j A1j + a2j A2j + ... + anj Anj

for any fixed j. In other words, we have two types of formulas: along a row (number i) or along a column (number j). Any row or any column will do. The trick is to use a row or a column which has a lot of zeros. In particular, we have along the first row

det(A) = a11 A11 + a12 A12 + ... + a1n A1n

or along the second row

det(A) = a21 A21 + a22 A22 + ... + a2n A2n

and so on.

As an exercise write the formulas along the columns. Example. Evaluate

We will use the general formula along the third row. We have

Which technique for evaluating a determinant is easier? The answer depends on the person who is evaluating the determinant. Some like the elementary row operations and some like the general formula. All that matters is to get the correct answer.
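Both techniques are easy to code. The sketch below implements the cofactor expansion along the first row and checks it against numpy.linalg.det on an arbitrary example matrix.

import numpy as np

def det_cofactor(A):
    # Cofactor expansion along the first row: det(A) = sum over j of a_1j * A_1j.
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # remove row 1 and column j
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

A = np.array([[2.0, 1.0, 3.0],
              [0.0, -1.0, 4.0],
              [5.0, 2.0, 1.0]])
print(det_cofactor(A))           # 17.0
print(np.linalg.det(A))          # same value, up to rounding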

Determinant and Inverse of Matrices: Finding the inverse of a matrix is very important in many areas of science. For example, decrypting a coded message uses the inverse of a matrix. The determinant may be used to answer this problem. Indeed, let A be a square matrix. We know that A is invertible if and only if det(A) ≠ 0. Also, if A has order n, then the cofactor Aij is defined as the determinant of the square matrix of order (n-1) obtained from A by removing row number i and column number j, multiplied by (-1)^(i+j). Recall that

det(A) = ai1 Ai1 + ai2 Ai2 + ... + ain Ain

for any fixed i, and

det(A) = a1j A1j + a2j A2j + ... + anj Anj

for any fixed j. Define the adjoint of A, denoted adj(A), to be the transpose of the matrix whose ijth entry is Aij. Example. Let A be a given square matrix. We can compute its cofactors and hence adj(A). Let us evaluate the product A adj(A). We find a multiple of the identity matrix, and the multiple is precisely det(A). Therefore, we have

A adj(A) = det(A) In

Is this formula only true for this matrix, or does a similar formula exist for any square matrix? In fact, we do have a similar formula. Theorem. For any square matrix A of order n, we have

A adj(A) = adj(A) A = det(A) In

In particular, if det(A) ≠ 0, then

A⁻¹ = (1/det(A)) adj(A)

For a square matrix of order 2,

A = [ a  b ]
    [ c  d ]

we have

adj(A) = [  d  -b ]
         [ -c   a ]

which gives

A⁻¹ = 1/(ad - bc) [  d  -b ]
                  [ -c   a ]

This is a formula which we used on a previous page.
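The theorem can be checked numerically. The sketch below builds adj(A) from the cofactors of an arbitrary 3x3 example and verifies that A adj(A) = det(A) I3 and that the inverse equals adj(A)/det(A).

import numpy as np

def adjoint(A):
    # Transpose of the matrix of cofactors A_ij.
    n = A.shape[0]
    C = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[1.0, 2.0, 0.0],
              [3.0, -1.0, 2.0],
              [0.0, 1.0, 1.0]])
d = np.linalg.det(A)
print(np.allclose(A @ adjoint(A), d * np.eye(3)))        # True
print(np.allclose(np.linalg.inv(A), adjoint(A) / d))     # True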

Eigenvalues and Eigenvectors:


The eigenvalue problem is a problem of considerable theoretical interest and wide-ranging application. For example, this problem is crucial in solving systems of differential equations, analyzing population growth models, and calculating powers of matrices (in order to define the exponential matrix). Other areas such as physics, sociology, biology, economics and statistics have focused considerable attention on "eigenvalues" and "eigenvectors", their applications and their computations. Before we give the formal definition, let us introduce these concepts with an example. Example. Consider the matrix

Consider the three column matrices

We have

In other words, we have

Next consider the matrix P for which the columns are C1, C2, and C3, i.e.,

We have det(P) = 84. So this matrix is invertible. Easy calculations give

Next we evaluate the matrix P⁻¹AP. We leave the details to the reader to check that we have

In other words, we have

Using the matrix multiplication, we obtain

which implies that A is similar to a diagonal matrix. In particular, we have

for n = 1, 2, 3, .... Note that it is almost impossible to find A⁷⁵ directly from the original form of A.

This example is so rich in conclusions that many questions impose themselves in a natural way. For example, given a square matrix A, how do we find column matrices which have behaviors similar to the above ones? In other words, how do we find the column matrices which will help us build an invertible matrix P such that P⁻¹AP is a diagonal matrix? From now on, we will call column matrices vectors. So the above column matrices C1, C2, and C3 are now vectors. We have the following definition.

Definition. Let A be a square matrix. A non-zero vector C is called an eigenvector of A if and only if there exists a number (real or complex) λ such that

A C = λ C

If such a number λ exists, it is called an eigenvalue of A. The vector C is called an eigenvector associated to the eigenvalue λ. Remark. The eigenvector C must be non-zero, since we have

A 0 = 0 = λ 0

for any number λ.

Example. Consider the matrix

We have seen that

A C1 = 0 C1,   A C2 = -4 C2,   A C3 = 3 C3

where C1, C2, and C3 are the columns considered above. So C1 is an eigenvector of A associated to the eigenvalue 0, C2 is an eigenvector of A associated to the eigenvalue -4, while C3 is an eigenvector of A associated to the eigenvalue 3. It may be interesting to know whether we found all the eigenvalues of A in the above example. Next, we discuss this question as well as how to find the eigenvalues of a square matrix.

Computation of Eigenvalues: For a square matrix A of order n, the number λ is an eigenvalue if and only if there exists a non-zero vector C such that

A C = λ C

Using the matrix multiplication properties, we obtain

(A - λ In) C = 0

This is a linear system for which the coefficient matrix is A - λ In. We also know that this system has exactly one solution if and only if the coefficient matrix is invertible, i.e. det(A - λ In) ≠ 0. Since the zero-vector is a solution and C is not the zero vector, we must have

det(A - λ In) = 0

Example. Consider the matrix

The equation det(A - λ I2) = 0 translates into

which is equivalent to the quadratic equation

Solving this equation leads to

In other words, the matrix A has only two eigenvalues. In general, for a square matrix A of order n, the equation

det(A - λ In) = 0

will give the eigenvalues of A. This equation is called the characteristic equation or characteristic polynomial of A. It is a polynomial function in λ of degree n. So we know that this equation will not have more than n roots or solutions. So a square matrix A of order n will not have more than n eigenvalues. Example. Consider the diagonal matrix

D = [ a  0  0  0 ]
    [ 0  b  0  0 ]
    [ 0  0  c  0 ]
    [ 0  0  0  d ]

Its characteristic polynomial is

det(D - λ I4) = (a - λ)(b - λ)(c - λ)(d - λ)

So the eigenvalues of D are a, b, c, and d, i.e. the entries on the diagonal. This result is valid for any diagonal matrix of any size. So depending on the values you have on the diagonal, you may have one eigenvalue, two eigenvalues, or more. Anything is possible.
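A quick numerical check of this fact, with arbitrary entries:

import numpy as np

D = np.diag([2.0, -1.0, 3.0, 3.0])   # diagonal matrix with a repeated diagonal entry
print(np.linalg.eigvals(D))          # 2, -1, 3, 3: the diagonal entries (possibly reordered)

A = np.array([[1.0, 2.0],
              [4.0, 3.0]])
print(np.linalg.eigvals(A))          # 5 and -1, the roots of the characteristic equation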

Remark. It is quite amazing to see that any square matrix A has the same eigenvalues as its transpose A^T, because

det(A - λ In) = det((A - λ In)^T) = det(A^T - λ In)

For any square matrix A of order 2,

A = [ a  b ]
    [ c  d ]

the characteristic polynomial is given by the equation

λ² - (a + d) λ + (ad - bc) = 0

The number (a+d) is called the trace of A (denoted tr(A)), and clearly the number (ad-bc) is the determinant of A. So the characteristic polynomial of A can be rewritten as

λ² - tr(A) λ + det(A) = 0

Let us evaluate the matrix B = A² - tr(A) A + det(A) I2.

We leave the details to the reader to check that B is the zero matrix. In other words, we have

A² - tr(A) A + det(A) I2 = O

This equation is known as the Cayley-Hamilton theorem. It is true for any square matrix A of any order, i.e.

p(A) = O

where p(λ) is the characteristic polynomial of A.
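The order-2 case is easy to verify numerically on any example matrix:

import numpy as np

A = np.array([[1.0, 2.0],
              [4.0, 3.0]])
B = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
print(B)                      # the zero matrix, as the Cayley-Hamilton theorem predicts
print(np.allclose(B, 0))      # True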

We now list some properties of the eigenvalues of a matrix. Theorem. Let A be a square matrix of order n. If λ is an eigenvalue of A, then:

1. λ^m is an eigenvalue of A^m, for m = 1, 2, ...
2. If A is invertible, then 1/λ is an eigenvalue of A⁻¹.
3. A is not invertible if and only if 0 is an eigenvalue of A.
4. If α is any number, then λ + α is an eigenvalue of A + α In.
5. If A and B are similar, then they have the same characteristic polynomial (which implies they also have the same eigenvalues).

Computation of Eigenvectors: Let A be a square matrix of order n and λ one of its eigenvalues. Let X be an eigenvector of A associated to λ. We must have

A X = λ X, that is, (A - λ In) X = 0

This is a linear system for which the coefficient matrix is A - λ In. Since the zero-vector is a solution, the system is consistent. In fact, the structure of the solution set of this system is very rich; here we will basically discuss how to find the solutions. Remark. It is quite easy to notice that if X is a vector which satisfies A X = λ X, then the vector Y = cX (for any arbitrary number c) satisfies the same equation, i.e. A Y = λ Y. In other words, if we know that X is an eigenvector, then cX is also an eigenvector associated to the same eigenvalue. Let us start with an example. Example. Consider the matrix

First we look for the eigenvalues of A. These are given by the characteristic equation det(A - λ I3) = 0, i.e.

If we develop this determinant using the third column, we obtain

Using easy algebraic manipulations, we get

which implies that the eigenvalues of A are 0, -4, and 3. Next we look for the eigenvectors.

1. Case λ = 0: The associated eigenvectors are given by the linear system

A X = 0

which may be rewritten as

Many ways may be used to solve this system. The third equation is identical to the first. Since, from the second equation, we have y = 6x, the first equation reduces to 13x + z = 0. So this system is equivalent to

So the unknown vector X is given by

Therefore, any eigenvector X of A associated to the eigenvalue 0 is given by

where c is an arbitrary number.

2. Case λ = -4: The associated eigenvectors are given by the linear system

A X = -4 X

which may be rewritten as

In this case, we will use elementary operations to solve it. First we consider the augmented matrix, i.e.

Then we use elementary row operations to reduce it to an upper-triangular form. First we interchange the first row with another row to get

Next, we use the first row to eliminate the 5 and 6 on the first column. We obtain

If we cancel the 8 and 9 from the second and third row, we obtain

Finally, we subtract the second row from the third to get

Next, we set z = c. From the second row, we get y = 2z = 2c. The first row will imply x = 2y+3z = -c. Hence

Therefore, any eigenvector X of A associated to the eigenvalue -4 is given by

where c is an arbitrary number.

3. Case λ = 3: The details for this case will be left to the reader. Using ideas similar to those described above, one may easily show that any eigenvector X of A associated to the eigenvalue 3 is given by

where c is an arbitrary number. Remark. In general, the eigenvalues of a matrix are not all distinct from each other (see the discussion of eigenvalues for more details). In the next two examples, we discuss this problem. Example. Consider the matrix

The characteristic equation of A is given by

Hence the eigenvalues of A are -1 and 8. For the eigenvalue 8, it is easy to show that any eigenvector X is given by

where c is an arbitrary number. Let us focus on the eigenvalue -1. The associated eigenvectors are given by the linear system

which may be rewritten by

Clearly, the third equation is identical to the first one, which is also a multiple of the second equation. In other words, this system is equivalent to the system reduced to the one equation 2x + y + 2z = 0. To solve it, we need to fix two of the unknowns and deduce the third one. For example, if we set y = s and z = t, we obtain x = -(s + 2t)/2. Therefore, any eigenvector X of A associated to the eigenvalue -1 is given by

In other words, any eigenvector X of A associated to the eigenvalue -1 is a linear combination of the two eigenvectors

Example. Consider the matrix

The characteristic equation is given by

Hence the matrix A has one eigenvalue, i.e. -3. Let us find the associated eigenvectors. These are given by the linear system

which may be rewritten by

This system is equivalent to the one-equation system x - y = 0. So if we set x = c, then any eigenvector X of A associated to the eigenvalue -3 is given by

Let us summarize what we did in the above examples. Summary: Let A be a square matrix. Assume λ is an eigenvalue of A. In order to find the associated eigenvectors, we do the following steps:

1. Write down the associated linear system (A - λ In) X = 0.
2. Solve the system.
3. Rewrite the unknown vector X as a linear combination of known vectors.

The above examples assume that the eigenvalue λ is a real number. So one may wonder whether any eigenvalue is always real. In general, this is not the case except for symmetric matrices. The proof of this in general is very technical; for square matrices of order 2, however, the proof is quite easy. Let us give it here for the sake of completeness. Consider the symmetric square matrix

A = [ a  b ]
    [ b  c ]

Its characteristic equation is given by

λ² - (a + c) λ + (ac - b²) = 0

This is a quadratic equation. The nature of its roots (which are the eigenvalues of A) depends on the sign of the discriminant

Δ = (a + c)² - 4 (ac - b²)

Using algebraic manipulations, we get

Δ = (a - c)² + 4 b²

Therefore, Δ is a non-negative number, which implies that the eigenvalues of A are real numbers.
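This is easy to test numerically on a randomly generated symmetric matrix:

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
S = (M + M.T) / 2                      # symmetrize to get a symmetric matrix
vals = np.linalg.eigvals(S)
print(np.allclose(vals.imag, 0))       # True: a symmetric matrix has real eigenvalues
print(np.linalg.eigvalsh(S))           # eigvalsh is the routine meant for symmetric matrices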

Remark. Note that the matrix A will have one eigenvalue, i.e. one double root, if and only if Δ = 0. But this is possible only if a = c and b = 0. In other words, we have A = a I2.

The Case of Complex Eigenvalues: First let us convince ourselves that there exist matrices with complex eigenvalues. Example. Consider the matrix

The characteristic equation is given by

This quadratic equation has complex roots given by

λ = 1 + 2i   and   λ = 1 - 2i

Therefore the matrix A has only complex eigenvalues. The trick is to treat the complex eigenvalue as a real one, meaning that we deal with it as a number and do the normal calculations for the eigenvectors. Let us see how it works on the above example.

We will do the calculations for λ = 1 + 2i. The associated eigenvectors are given by the linear system A X = (1 + 2i) X, which may be rewritten as

In fact the two equations are identical since (2+2i)(2-2i) = 8. So the system reduces to one equation (1-i)x - y = 0. Set x=c, then y = (1-i)c. Therefore, we have

where c is an arbitrary number. Remark. It is clear that one should expect to have complex entries in the eigenvectors. We have seen that (1 - 2i) is also an eigenvalue of the above matrix. Since the entries of the matrix A are real, one may easily show that if λ is a complex eigenvalue, then its conjugate is also an eigenvalue. Moreover, if X is an eigenvector of A associated to λ, then the vector obtained from X by taking the complex conjugate of the entries of X is an eigenvector associated to the conjugate eigenvalue. So the eigenvectors of the above matrix A associated to the eigenvalue (1 - 2i) are given by

where c is an arbitrary number.
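Here is a numerical illustration using a different matrix than the one above (whose entries did not survive extraction): the matrix below has the complex conjugate eigenvalues 1 + 2i and 1 - 2i.

import numpy as np

A = np.array([[1.0, 2.0],
              [-2.0, 1.0]])            # an assumed example, not the matrix from the text
vals, vecs = np.linalg.eig(A)
print(vals)                            # 1+2j and 1-2j (in some order): a conjugate pair
print(vecs)                            # eigenvectors associated to the two conjugate eigenvalues
print(np.allclose(A @ vecs[:, 0], vals[0] * vecs[:, 0]))   # A X = lambda X holds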

Let us summarize what we did in the above example. Summary: Let A be a square matrix. Assume λ is a complex eigenvalue of A. In order to find the associated eigenvectors, we do the following steps:

1. Write down the associated linear system (A - λ In) X = 0.
2. Solve the system. The entries of X will be complex numbers.
3. Rewrite the unknown vector X as a linear combination of known vectors with complex entries.
4. If A has real entries, then the conjugate of λ is also an eigenvalue. The associated eigenvectors are given by the same equation found in 3, except that we should take the conjugate of the entries of the vectors involved in the linear combination.

In general, it is normal to expect that a square matrix with real entries may still have complex eigenvalues. One may wonder if there exists a class of matrices with only real eigenvalues. This is the case for symmetric matrices. The proof in general is very technical; for square matrices of order 2, it is quite easy, and it was given at the end of the previous section.

