In Section 2-7, we study functions of a square matrix. The minimal polynomial and the Cayley-Hamilton theorem are introduced. In the last section, the concepts of inner product and norm are introduced.

This chapter is intended to be self-contained. The reader is assumed to have some basic knowledge of matrix theory (determinants, matrix addition, multiplication, and inversion). The matrix identities introduced below will also be used. Let A, B, C, D be n x m, m x r, l x n, and r x p constant matrices, respectively. Let a_i be the ith column of A, and let b_j be the jth row of B. Then we have

    AB = [a_1 a_2 ... a_m] [b_1; b_2; ...; b_m] = a_1 b_1 + a_2 b_2 + ... + a_m b_m    (2-1)
    CA = C [a_1 a_2 ... a_m] = [C a_1  C a_2 ... C a_m]    (2-2)
    BD = [b_1; b_2; ...; b_m] D = [b_1 D; b_2 D; ...; b_m D]    (2-3)

where a semicolon separates the rows of a partitioned matrix. These identities can be easily checked. Note that a_i b_i is an n x r matrix: it is the product of an n x 1 matrix a_i and a 1 x r matrix b_i.

The material presented here is well known and can be found in References 5, 38, 39, 43 to 45, 77, 86, and 116.² However, our presentation is different. We emphasize the distinction between a vector and its representations [see Equation (2-12) and Definition 2-7]. Once this distinction is stressed, the concepts of the matrix representation of an operator and of the similarity transformation follow naturally.

²Numbers correspond to the References at the end of the book.

2-2 Linear Spaces over a Field

In the study of mathematics we must first specify a collection of objects that forms the center of study. This collection of objects or elements is called a set. For example, in arithmetic we study the set of real numbers. In Boolean algebra we study the set {0, 1}, which consists of only two elements. Other examples of sets include the set of complex numbers, the set of positive integers, the set of all polynomials of degree less than 5, and the set of all 2 x 2 real constant matrices. In this section, when we discuss a set of objects, the set could be any one of those just mentioned or any other the reader wishes to specify.

Consider the set of real numbers. The operations of addition and multiplication, with the commutative and associative properties, are defined for the set. The sum and product of any two real numbers are real numbers. The set has the elements 0 and 1. Any real number α has an additive inverse (−α) and a multiplicative inverse (1/α, except α = 0) in the set. Any set with these properties is called a field. We give a formal definition of a field in the following.

Definition 2-1
A field consists of a set, denoted by F, of elements called scalars and two operations called addition "+" and multiplication "·"; the two operations are defined over F such that they satisfy the following conditions:
1. To every pair of elements α and β in F, there correspond an element α + β in F, called the sum of α and β, and an element α·β or αβ in F, called the product of α and β.
2. Addition and multiplication are respectively commutative: For any α, β in F,
       α + β = β + α        α·β = β·α
3. Addition and multiplication are respectively associative: For any α, β, γ in F,
       (α + β) + γ = α + (β + γ)        (α·β)·γ = α·(β·γ)
4. Multiplication is distributive with respect to addition: For any α, β, γ in F,
       α·(β + γ) = (α·β) + (α·γ)
5. F contains an element, denoted by 0, and an element, denoted by 1, such that α + 0 = α and 1·α = α for every α in F.
6. To every α in F, there is an element β in F such that α + β = 0. The element β is called the additive inverse.
7. To every α in F which is not the element 0, there is an element γ in F such that α·γ = 1. The element γ is called the multiplicative inverse.
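As a quick numerical check of the identities (2-1) and (2-3) — a sketch assuming the NumPy library is available; the dimensions chosen here are arbitrary:

```python
import numpy as np

n, m, r, p = 2, 3, 4, 2
A = np.random.rand(n, m)
B = np.random.rand(m, r)
D = np.random.rand(r, p)

# (2-1): AB as a sum of outer products a_i b_i (ith column of A times ith row of B)
AB = sum(np.outer(A[:, i], B[i, :]) for i in range(m))
assert np.allclose(AB, A @ B)

# (2-3): BD stacks the rows b_i D; identity (2-2) is the analogous column version.
BD = np.vstack([B[i:i+1, :] @ D for i in range(m)])
assert np.allclose(BD, B @ D)
```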
We give some examples to illustrate this concept.

Example 1
Consider the set of numbers that consists of 0 and 1. The set {0, 1} does not form a field if we use the usual definitions of addition and multiplication, because the element 1 + 1 = 2 is not in the set {0, 1}. However, if we define 0 + 0 = 1 + 1 = 0, 1 + 0 = 1 and 0·1 = 0·0 = 0, 1·1 = 1, then it can be verified that {0, 1} with the defined addition and multiplication satisfies all the conditions listed for a field. Hence the set {0, 1} with the defined operations forms a field. It is called the field of binary numbers.

Example 2
Consider the set of all 2 x 2 matrices of the form

    [x  −y]
    [y   x]

where x and y are arbitrary real numbers. The set with the usual definitions of matrix addition and multiplication forms a field. The elements 0 and 1 of the field are, respectively,

    [0 0]        and        [1 0]
    [0 0]                   [0 1]

Note that the set of all 2 x 2 matrices does not form a field.

From the foregoing examples, we see that the set of objects that forms a field could be anything, so long as the two operations can be defined for these objects. The fields we shall encounter in this book are fortunately the most familiar ones: the field of real numbers, the field of complex numbers, and the field of rational functions with real coefficients. The additions and multiplications of these fields are defined in the usual ways. The reader is advised to show that they satisfy all the conditions required for a field. We use R and C to denote the field of real numbers and the field of complex numbers, respectively, and use R(s) to denote the field of rational functions with real coefficients and with indeterminate s. Note that the set of positive real numbers does not form a field because it has no additive inverse. The set of integers and the set of polynomials do not form a field because they have no multiplicative inverse.³

Before introducing the concept of vector space, let us consider the ordinary two-dimensional geometric plane. If the origin is chosen, then every point in the plane can be considered as a vector: it has direction as well as magnitude. A vector can be shrunk or extended. Any two vectors can be added, but the product of two points or vectors is not defined. Such a plane, in the mathematical terminology, is called a linear space, or a vector space, or a linear vector space.

Definition 2-2
A linear space over a field F, denoted by (X, F), consists of a set, denoted by X, of elements called vectors, a field F, and two operations called vector addition and scalar multiplication. The two operations are defined over X and F such that they satisfy all the following conditions:
1. To every pair of vectors x_1 and x_2 in X, there corresponds a vector x_1 + x_2 in X, called the sum of x_1 and x_2.
2. Addition is commutative: For any x_1, x_2 in X, x_1 + x_2 = x_2 + x_1.
3. Addition is associative: For any x_1, x_2, and x_3 in X, (x_1 + x_2) + x_3 = x_1 + (x_2 + x_3).
4. X contains a vector, denoted by 0, such that 0 + x = x for every x in X. The vector 0 is called the zero vector or the origin.
5. To every x in X, there is a vector x̄ in X such that x + x̄ = 0.
6. To every α in F and every x in X, there corresponds a vector αx in X, called the scalar product of α and x.
7. Scalar multiplication is associative: For any α, β in F and any x in X, α(βx) = (αβ)x.

³A set with all properties of a field except property 7 in Definition 2-1 is called a ring or, more precisely, a commutative ring with (multiplicative) identity. The set of integers forms a ring, as does the set of polynomials with real coefficients.
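The field of Example 2 is the field of complex numbers in disguise: the matrix [x −y; y x] plays the role of x + iy. A small sketch (NumPy assumed) checks that products and inverses do not leave the set:

```python
import numpy as np

def M(x, y):
    """The 2 x 2 matrix [[x, -y], [y, x]] encoding the complex number x + iy."""
    return np.array([[x, -y], [y, x]], dtype=float)

a, b = M(1.0, 2.0), M(3.0, -1.0)
# The product stays in the set: it has the same [[x, -y], [y, x]] pattern.
c = a @ b
assert np.isclose(c[0, 0], c[1, 1]) and np.isclose(c[0, 1], -c[1, 0])
# Every nonzero element has a multiplicative inverse of the same pattern.
inv = np.linalg.inv(a)
assert np.isclose(inv[0, 0], inv[1, 1]) and np.isclose(inv[0, 1], -inv[1, 0])
```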
8. Scalar multiplication is distributive with respect to vector addition: For any α in F and any x_1, x_2 in X, α(x_1 + x_2) = αx_1 + αx_2.
9. Scalar multiplication is distributive with respect to scalar addition: For any α, β in F and any x in X, (α + β)x = αx + βx.
10. For any x in X, 1x = x, where 1 is the element 1 in F.

Example 1
A field forms a vector space over itself, with the vector addition and scalar multiplication defined as the corresponding operations in the field. For example, (R, R) and (C, C) are vector spaces. Note that (C, R) is a vector space but not (R, C). (Why?) We note that (R(s), R(s)) and (R(s), R) are also vector spaces, but not (R, R(s)).

Example 2
The set of all real-valued piecewise continuous functions defined over (−∞, ∞) forms a linear space over the field of real numbers. The addition and scalar multiplication are defined in the usual way. It is called a function space.

Example 3
Given a field F, let F^n be the set of all n-tuples of scalars written as columns

    x_i = [x_{1i}; x_{2i}; ...; x_{ni}]    (2-4)

where the first subscript denotes the various components of x_i and the second subscript denotes different vectors in F^n. If the vector addition and the scalar multiplication are defined componentwise,

    x_i + x_j = [x_{1i} + x_{1j}; x_{2i} + x_{2j}; ...; x_{ni} + x_{nj}]        αx_i = [αx_{1i}; αx_{2i}; ...; αx_{ni}]    (2-5)

then (F^n, F) is a vector space. If F = R, then (R^n, R) is called the n-dimensional real vector space; if F = C, then (C^n, C) is called the n-dimensional complex vector space; if F = R(s), then (R^n(s), R(s)) is called the n-dimensional rational vector space.

Example 4
Consider the set R_n[s] of all polynomials of degree less than n with real coefficients.⁴ Let the vector addition and the scalar multiplication be defined as

    Σ_{i=0}^{n−1} α_i s^i + Σ_{i=0}^{n−1} β_i s^i = Σ_{i=0}^{n−1} (α_i + β_i) s^i
    α (Σ_{i=0}^{n−1} α_i s^i) = Σ_{i=0}^{n−1} (αα_i) s^i

It is easy to verify that (R_n[s], R) is a linear space. Note that (R_n[s], R(s)) is not a linear space.

⁴Note that R(s), with parentheses, denotes the field of rational functions with real coefficients, whereas R[s], with brackets, denotes the set of polynomials with real coefficients.

Example 5
Let X denote the set of all solutions of the homogeneous differential equation ẍ + 2ẋ + 3x = 0. Then (X, R) is a linear space with the vector addition and the scalar multiplication defined in the usual way. If the differential equation is not homogeneous, then (X, R) is not a linear space. (Why?)

We introduce one more concept to conclude this section.

Definition 2-3
Let (X, F) be a linear space and let Y be a subset of X. Then (Y, F) is said to be a subspace of (X, F) if, under the operations of (X, F), Y itself forms a vector space over F.

We remark on the conditions for a subset of X to form a subspace. Since the vector addition and scalar multiplication have been defined for the linear space (X, F), they satisfy conditions 2, 3, and 7 through 10 listed in Definition 2-2. Hence we need to check only conditions 1 and 4 through 6 to determine whether a set Y is a subspace of (X, F). It is easy to verify that if α_1 y_1 + α_2 y_2 is in Y for any y_1, y_2 in Y and any α_1, α_2 in F, then conditions 1 and 4 through 6 are satisfied. Hence we conclude that a set Y is a subspace of (X, F) if α_1 y_1 + α_2 y_2 is in Y for any y_1, y_2 in Y and any α_1, α_2 in F.

Example 6
In the two-dimensional real vector space (R², R), every straight line passing through the origin is a subspace of (R², R). That is, the set {[x; αx] : x in R} for any fixed real α is a subspace of (R², R).
Example 7
The real vector space (R^n, R) is a subspace of the vector space (C^n, R).

2-3 Linear Independence, Bases, and Representations

Every geometric plane has two coordinate axes, which are mutually perpendicular and of the same scale. The reason for having a coordinate system is to have some reference or standard to specify a point or vector in the plane. In this section we extend this concept of coordinate to general linear spaces. In linear spaces a coordinate system is called a basis. The basis vectors are generally not perpendicular to each other and may have different scales. Before proceeding, we need the concept of linear independence of vectors.

Definition 2-4
A set of vectors x_1, x_2, ..., x_n in a linear space over a field F, (X, F), is said to be linearly dependent if and only if there exist scalars α_1, α_2, ..., α_n in F, not all zero, such that

    α_1 x_1 + α_2 x_2 + ... + α_n x_n = 0    (2-6)

If the only set of α_i for which (2-6) holds is α_1 = 0, α_2 = 0, ..., α_n = 0, then the set of vectors x_1, x_2, ..., x_n is said to be linearly independent.

Given any set of vectors, Equation (2-6) always holds for α_1 = 0, α_2 = 0, ..., α_n = 0. Therefore, in order to show the linear independence of the set, we have to show that α_1 = 0, α_2 = 0, ..., α_n = 0 is the only set of α_i for which (2-6) holds; that is, if any one of the α_i's is different from zero, then the left-hand side of (2-6) cannot be a zero vector. If a set of vectors is linearly dependent, there are generally infinitely many sets of α_i, not all zero, that satisfy Equation (2-6). However, it is sufficient to find one set of α_i, not all zero, to conclude the linear dependence of the set of vectors.

Example 1
Consider the set of vectors x_1, x_2, ..., x_n in which x_1 = 0. This set of vectors is always linearly dependent, because we may choose α_1 = 1, α_2 = 0, α_3 = 0, ..., α_n = 0, and Equation (2-6) holds.

Example 2
Consider the set that consists of only one vector x_1. The set is linearly independent if and only if x_1 ≠ 0. If x_1 ≠ 0, the only way to have α_1 x_1 = 0 is α_1 = 0. If x_1 = 0, we may choose α_1 = 1.

If we introduce the notation

    α_1 x_1 + α_2 x_2 + ... + α_n x_n ≜ [x_1 x_2 ... x_n] [α_1; α_2; ...; α_n] ≜ [x_1 x_2 ... x_n] α    (2-7)

then the linear independence of a set of vectors can also be stated as in the following definition.

Definition 2-4'
A set of vectors x_1, x_2, ..., x_n in (X, F) is said to be linearly independent if and only if the equation

    [x_1 x_2 ... x_n] α = 0

implies α = 0, where every component of α is an element of F or, correspondingly, α can be considered as a vector in F^n.

Observe that linear dependence depends not only on the set of vectors but also on the field. For example, consider the set of vectors {x_1, x_2} in R²(s), where

    x_1 = [1/(s+1); 1]        x_2 = [(s+2)/((s+1)(s+3)); (s+2)/(s+3)]

This set is linearly dependent in the field of rational functions with real coefficients. Indeed, if we choose

    α_1 = −1        and        α_2 = (s+3)/(s+2)

then α_1 x_1 + α_2 x_2 = 0. However, this set of vectors is linearly independent in the field of real numbers, for there exist no α_1 and α_2 in R that are different from zero such that α_1 x_1 + α_2 x_2 = 0. In other words, x_1 and x_2 are linearly independent in (R²(s), R) but are linearly dependent in (R²(s), R(s)).

It is clear from the definition of linear dependence that if the vectors x_1, x_2, ..., x_n are linearly dependent, then at least one of them can be written as a linear combination of the others. However, it is not necessarily true that every one of them can be expressed as a linear combination of the others.
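Over the field of real numbers, linear independence of numerical vectors can be tested by stacking them as columns and computing a rank, in the spirit of Definition 2-4'. A sketch assuming NumPy (a floating-point rank test is reliable only up to numerical tolerance):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 1.0])
x2 = np.array([2.0, 4.0, 2.0])   # = 2 * x1, so {x1, x2} is linearly dependent
x3 = np.array([0.0, 1.0, 3.0])

# The vectors are independent iff [x1 x2 ... xn] a = 0 only for a = 0,
# i.e., iff the stacked matrix has full column rank.
print(np.linalg.matrix_rank(np.column_stack([x1, x2])))   # 1 -> dependent
print(np.linalg.matrix_rank(np.column_stack([x1, x3])))   # 2 -> independent
```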
Definition 2-5
The maximal number of linearly independent vectors in a linear space (X, F) is called the dimension of the linear space (X, F).

In the previous section we introduced the n-dimensional real vector space (R^n, R). The meaning of "n-dimensional" is now clear: in (R^n, R) there are at most n linearly independent vectors (over the field R). In the two-dimensional real vector space (R², R), one cannot find three linearly independent vectors. (Try!)

Example 3
Consider the function space that consists of all real-valued piecewise continuous functions defined over (−∞, ∞). The zero vector in this space is the function that is identically zero on (−∞, ∞). For every positive integer n one can exhibit n linearly independent functions in this space; hence the dimension of this function space is infinite.

Definition 2-6
A set of linearly independent vectors of a linear space (X, F) is said to be a basis of X if every vector in X can be expressed as a unique linear combination of these vectors.

Theorem 2-1
In an n-dimensional linear space, any set of n linearly independent vectors qualifies as a basis.

Proof
Let e_1, e_2, ..., e_n be any n linearly independent vectors in X, and let x be an arbitrary vector in X. Since X is n-dimensional, the n + 1 vectors x, e_1, e_2, ..., e_n are linearly dependent; hence there exist scalars α, α_1, ..., α_n, not all zero, such that

    αx + α_1 e_1 + ... + α_n e_n = 0    (2-8)

We claim that α ≠ 0. If α = 0, Equation (2-8) reduces to

    α_1 e_1 + α_2 e_2 + ... + α_n e_n = 0    (2-9)

which, together with the linear independence assumption of e_1, e_2, ..., e_n, implies that α_1 = 0, α_2 = 0, ..., α_n = 0. This contradicts the assumption that not all α, α_1, ..., α_n are zero. If we define β_i ≜ −α_i/α for i = 1, 2, ..., n, then (2-8) becomes

    x = β_1 e_1 + β_2 e_2 + ... + β_n e_n    (2-10)

This shows that every vector x in X can be expressed as a linear combination of e_1, e_2, ..., e_n. Now we show that this combination is unique. Suppose there is another linear combination, say

    x = β̄_1 e_1 + β̄_2 e_2 + ... + β̄_n e_n    (2-11)

Then by subtracting (2-11) from (2-10), we obtain

    0 = (β_1 − β̄_1) e_1 + (β_2 − β̄_2) e_2 + ... + (β_n − β̄_n) e_n

which, together with the linear independence of {e_i}, implies that β_i = β̄_i for i = 1, 2, ..., n. This completes the proof of this theorem. Q.E.D.

This theorem has a very important implication. In an n-dimensional vector space (X, F), if a basis is chosen, then every vector in X can be uniquely represented by a set of n scalars β_1, β_2, ..., β_n in F. If we use the notation of (2-7), we may write (2-10) as

    x = [e_1 e_2 ... e_n] β    (2-12)

where β = [β_1, β_2, ..., β_n]' and the prime denotes the transpose. The n x 1 vector β can be considered as a vector in (F^n, F). Consequently, there is a one-to-one correspondence between any n-dimensional vector space (X, F) and the linear space (F^n, F) of the same dimension, once a basis is chosen for (X, F).

Definition 2-7
In an n-dimensional vector space (X, F), if a basis {e_1, e_2, ..., e_n} is chosen, then every vector x in X can be uniquely written in the form of (2-12). β is called the representation of x with respect to the basis {e_1, e_2, ..., e_n}.

Example 4
The geometric plane shown in Figure 2-1 can be considered as a two-dimensional real vector space. Any point in the plane is a vector. Theorem 2-1 states that any set of two linearly independent vectors forms a basis. Observe that we have freedom not only in choosing the directions of the basis vectors (as long as they do not lie on the same line) but also in choosing the magnitude (scale) of these vectors. Therefore, given a vector in (R², R), for different bases we have different representations of the same vector. For example, the representations of the vector b in Figure 2-1 with respect to the basis {e_1, e_2} and the basis {ē_1, ē_2} are, respectively, [1 3]' and [−1 2]' (where the "prime" symbol denotes the transpose). The representations of the vectors b, e_1, e_2, ē_1, and ē_2 with respect to the bases {e_1, e_2} and {ē_1, ē_2} are summarized in Table 2-1; the entries not shown are read off from Figure 2-1 in the same way.

Figure 2-1 A two-dimensional real vector space.

Table 2-1 Different Representations of Vectors

    Basis          b         e_1       e_2       ē_1       ē_2
    {e_1, e_2}     [1 3]'    [1 0]'    [0 1]'    ·         ·
    {ē_1, ē_2}     [−1 2]'   ·         ·         [1 0]'    [0 1]'
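Finding the representation β of a vector amounts to solving the linear equation [e_1 e_2] β = b of (2-12). The following sketch uses hypothetical coordinates for e_1, e_2, and b (the actual vectors of Figure 2-1 are not reproduced here), chosen so that the representation comes out as [1 3]'; NumPy assumed:

```python
import numpy as np

# Hypothetical basis vectors and target vector, expressed in some fixed
# coordinate frame; illustrative stand-ins for the vectors of Figure 2-1.
e1, e2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
b = np.array([4.0, 3.0])

E = np.column_stack([e1, e2])
beta = np.linalg.solve(E, b)        # representation of b w.r.t. {e1, e2}
assert np.allclose(E @ beta, b)     # b = beta[0]*e1 + beta[1]*e2, uniquely
print(beta)                         # [1. 3.]
```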
Example 5
Consider the linear space (R_4[s], R), where R_4[s] is the set of all real polynomials of degree less than 4 with indeterminate s. Let e_1 = s³, e_2 = s², e_3 = s, and e_4 = 1. Clearly the vectors e_i, i = 1, 2, 3, 4, are linearly independent and qualify as basis vectors. With this set of basis vectors, the vector x = 3s³ + 2s² − 2s + 10 can be written as

    x = [e_1 e_2 e_3 e_4] [3; 2; −2; 10]

hence [3 2 −2 10]' (where the "prime" denotes the transpose) is the representation of x with respect to {e_1, e_2, e_3, e_4}. If we choose ē_1 = s³ − s², ē_2 = s² − s, ē_3 = s − 1, and ē_4 = 1 as the basis vectors, then

    x = 3s³ + 2s² − 2s + 10 = 3(s³ − s²) + 5(s² − s) + 3(s − 1) + 13·1 = [ē_1 ē_2 ē_3 ē_4] [3; 5; 3; 13]

Hence the representation of x with respect to {ē_1, ē_2, ē_3, ē_4} is [3 5 3 13]'.

In this example there is a sharp distinction between vectors and representations. However, this is not always the case. Consider the n-dimensional real vector space (R^n, R), complex vector space (C^n, C), or rational vector space (R^n(s), R(s)); a vector is an n-tuple of real numbers, complex numbers, or real rational functions, written as

    x = [β_1; β_2; ...; β_n]

This array of n numbers can be interpreted in two ways: (1) it is defined as such; that is, it is a vector and is independent of basis; or (2) it is a representation of a vector with respect to some fixed but unspecified basis. Given an array of numbers, unless it is tied to some basis, we shall always consider it as a vector. However, we shall also introduce, unless stated otherwise, the following vectors⁵

    n_1 = [1; 0; ...; 0]    n_2 = [0; 1; 0; ...; 0]    ...    n_n = [0; ...; 0; 1]    (2-13)

as the basis of (R^n, R), (C^n, C), and (R^n(s), R(s)). In this case, an array of numbers can be interpreted as a vector or as the representation of a vector with respect to the basis {n_1, n_2, ..., n_n}, because with respect to this particular basis the representation and the vector itself are identical; that is,

    [β_1; β_2; ...; β_n] = [n_1 n_2 ... n_n] [β_1; β_2; ...; β_n]    (2-14)

⁵This set of vectors is called an orthonormal set.
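Working in the coefficient coordinates of (R_4[s], R), the two representations of Example 5 can be recomputed by solving linear systems; a sketch assuming NumPy:

```python
import numpy as np

# x = 3s^3 + 2s^2 - 2s + 10, stored as coefficients of [s^3, s^2, s, 1]
x = np.array([3.0, 2.0, -2.0, 10.0])

# Basis vectors written in the same coefficient coordinates:
# {s^3, s^2, s, 1} and {s^3 - s^2, s^2 - s, s - 1, 1}
E    = np.column_stack([[1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1]]).astype(float)
Ebar = np.column_stack([[1,-1,0,0], [0,1,-1,0], [0,0,1,-1], [0,0,0,1]]).astype(float)

print(np.linalg.solve(E, x))      # [ 3.  2. -2. 10.]
print(np.linalg.solve(Ebar, x))   # [ 3.  5.  3. 13.]
```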
Change of basis. We have shown that a vector x in (X, F) has different representations with respect to different bases. It is natural to ask what the relationships between these different representations of the same vector are. This problem is studied in this subsection. Let the representations of a vector x in (X, F) with respect to {e_1, e_2, ..., e_n} and {ē_1, ē_2, ..., ē_n} be β and β̄, respectively; that is,⁶

    x = [e_1 e_2 ... e_n] β = [ē_1 ē_2 ... ē_n] β̄    (2-15)

⁶One might be tempted to write β̄ = [ē_1 ē_2 ... ē_n]⁻¹ [e_1 e_2 ... e_n] β. However, [ē_1 ē_2 ... ē_n]⁻¹ may not be defined, as can be seen from Example 5.

In order to derive the relationship between β and β̄, we need either the representations of e_i, for i = 1, 2, ..., n, with respect to the basis {ē_1, ē_2, ..., ē_n}, or the representations of ē_i, for i = 1, 2, ..., n, with respect to the basis {e_1, e_2, ..., e_n}. Let the representation of e_i with respect to {ē_1, ē_2, ..., ē_n} be [p_{1i} p_{2i} ... p_{ni}]'; that is,

    e_i = [ē_1 ē_2 ... ē_n] [p_{1i}; p_{2i}; ...; p_{ni}] ≜ Ē p_i        i = 1, 2, ..., n    (2-16)

where Ē ≜ [ē_1 ē_2 ... ē_n] and p_i ≜ [p_{1i} p_{2i} ... p_{ni}]'. Using matrix notation, we write

    [e_1 e_2 ... e_n] = [Ē p_1  Ē p_2 ... Ē p_n]    (2-17)

which, by using (2-2), can be written as

    [e_1 e_2 ... e_n] = Ē [p_1 p_2 ... p_n] = [ē_1 ē_2 ... ē_n] P    (2-18)

where P is the n x n matrix [p_1 p_2 ... p_n] with entries p_{ij}. Substituting (2-18) into (2-15), we obtain

    x = [ē_1 ē_2 ... ē_n] P β = [ē_1 ē_2 ... ē_n] β̄    (2-19)

Since the representation of x with respect to the basis {ē_1, ē_2, ..., ē_n} is unique, (2-19) implies

    β̄ = P β    (2-20)

where the ith column of P is the representation of e_i with respect to {ē_1, ē_2, ..., ē_n}    (2-21)

This establishes the relationship between β and β̄. If, in (2-16), the representation of ē_i with respect to {e_1, e_2, ..., e_n} is used instead, then we obtain

    β = Q β̄    (2-22)

where the ith column of Q is the representation of ē_i with respect to {e_1, e_2, ..., e_n}    (2-23)

Different representations of a vector are related by (2-20) or (2-22). Therefore, given two bases, if the representation of a vector with respect to one basis is known, the representation of the same vector with respect to the other basis can be computed by using either (2-20) or (2-22). Since β̄ = P β and β = Q β̄, we have β̄ = P Q β̄ for all β̄; hence we conclude that

    PQ = I        or        P = Q⁻¹    (2-24)

Example 6
Consider the two sets of basis vectors of (R_4[s], R) in Example 5. It can be readily verified that

    [e_1 e_2 e_3 e_4] = [ē_1 ē_2 ē_3 ē_4] [1 0 0 0; 1 1 0 0; 1 1 1 0; 1 1 1 1] ≜ [ē_1 ē_2 ē_3 ē_4] P

and

    [ē_1 ē_2 ē_3 ē_4] = [e_1 e_2 e_3 e_4] [1 0 0 0; −1 1 0 0; 0 −1 1 0; 0 0 −1 1] ≜ [e_1 e_2 e_3 e_4] Q

Clearly we have PQ = I and β̄ = P β, or

    [3; 5; 3; 13] = [1 0 0 0; 1 1 0 0; 1 1 1 0; 1 1 1 1] [3; 2; −2; 10]
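The matrices P and Q of Example 6 and the relation (2-20) admit a direct numerical check; a sketch assuming NumPy:

```python
import numpy as np

beta = np.array([3.0, 2.0, -2.0, 10.0])        # x w.r.t. {s^3, s^2, s, 1}
P = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [1, 1, 1, 1]], dtype=float)       # ith column: e_i w.r.t. {e-bar}
Q = np.array([[ 1,  0,  0, 0],
              [-1,  1,  0, 0],
              [ 0, -1,  1, 0],
              [ 0,  0, -1, 1]], dtype=float)    # ith column: e-bar_i w.r.t. {e}

assert np.allclose(P @ Q, np.eye(4))            # (2-24): PQ = I
print(P @ beta)                                 # [ 3.  5.  3. 13.] = beta-bar
```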
2-4 Linear Operators and Their Representations

The concept of a function is basic to all parts of analysis. Given two sets X and Y, if we assign to each element of X one and only one element of Y, then the rule of assignment is called a function. For example, the rule of assignment in Figure 2-2(a) is a function, but not the one in Figure 2-2(b). A function is usually denoted by f : X → Y, and the element of Y that is assigned to the element x of X is denoted by y = f(x). The set X on which a function is defined is called the domain of the function. The subset of Y that is assigned to some element of X is called the range of the function. For example, the domain of the function shown in Figure 2-2(a) is the positive real line; the range of the function is the set [−1, 1], which is a subset of the entire real line Y.

Figure 2-2 Examples in which (a) the curve represents a function and (b) the curve does not represent a function.

The functions we shall study in this section belong to a restricted class of functions, called linear functions or, more often, linear operators, linear mappings, or linear transformations. The sets associated with linear operators are required to be linear spaces over the same field, say (X, F) and (Y, F). A linear operator is denoted by L : (X, F) → (Y, F). In words, L maps (X, F) into (Y, F).

Definition 2-8
A function L that maps (X, F) into (Y, F) is said to be a linear operator if and only if

    L(α_1 x_1 + α_2 x_2) = α_1 L x_1 + α_2 L x_2

for any vectors x_1, x_2 in X and any scalars α_1, α_2 in F.

Note that the vectors L x_1 and L x_2 are elements of Y. The reason for requiring that Y be defined over the same field as X is to ensure that α_1 L x_1 and α_2 L x_2 be defined.

Example 1
Consider the transformation that rotates a point in a geometric plane counterclockwise by 90° about the origin, as shown in Figure 2-3. Given any two vectors in the plane, it is easy to verify that the sum of the two vectors after rotation is equal to the rotation of the sum of the two vectors before rotation. Hence the transformation is a linear transformation. The spaces (X, F) and (Y, F) of this example are both equal to (R², R).

Figure 2-3 The transformation that rotates a vector counterclockwise 90°.

Example 2
Let U be the set of all real-valued piecewise continuous functions defined over [0, T] for some finite T > 0. It is clear that (U, R) is a linear space whose dimension is infinity (see Example 3, Section 2-3). Let g be a continuous function defined over [0, T]. Then the transformation

    y(t) = ∫₀ᵗ g(t − τ) u(τ) dτ    (2-25)

is a linear transformation. The spaces (X, F) and (Y, F) of this example are both equal to (U, R).

Matrix representations of a linear operator. We see from the above two examples that the spaces (X, F) and (Y, F) on which a linear operator is defined may be of finite or infinite dimension. We show in the following that every linear operator that maps a finite-dimensional (X, F) into a finite-dimensional (Y, F) has matrix representations with coefficients in the field F. If (X, F) and (Y, F) are of infinite dimension, a representation of a linear operator can still be found; however, the representation will be a matrix of infinite order or a form similar to (2-25). This is outside the scope of this text and will not be discussed.

Theorem 2-2
Let (X, F) and (Y, F) be n- and m-dimensional vector spaces, respectively, over the same field. Let x_1, x_2, ..., x_n be a set of linearly independent vectors in X. Then the linear operator L : (X, F) → (Y, F) is uniquely determined by the n pairs of mappings y_i = L x_i, for i = 1, 2, ..., n. Furthermore, with respect to the basis {x_1, x_2, ..., x_n} of X and a basis {u_1, u_2, ..., u_m} of Y, L can be represented by an m x n matrix A with coefficients in the field F. The ith column of A is the representation of y_i with respect to the basis {u_1, u_2, ..., u_m}.

Proof
Let x be an arbitrary vector in X. Since x_1, x_2, ..., x_n are linearly independent, the set qualifies as a basis. Consequently, the vector x can be expressed uniquely as x = α_1 x_1 + α_2 x_2 + ... + α_n x_n (Theorem 2-1). By the linearity of L, we have

    L x = α_1 L x_1 + α_2 L x_2 + ... + α_n L x_n = α_1 y_1 + α_2 y_2 + ... + α_n y_n

which implies that for any x in X, L x is uniquely determined by y_i = L x_i, for i = 1, 2, ..., n. This proves the first part of the theorem. Let the representation of y_i with respect to {u_1, u_2, ..., u_m} be [a_{1i} a_{2i} ... a_{mi}]'; that is,

    y_i = [u_1 u_2 ... u_m] [a_{1i}; a_{2i}; ...; a_{mi}]        i = 1, 2, ..., n    (2-26)

where the a_{ji}'s are elements of F. Let us write, as in (2-17) and (2-18),

    [y_1 y_2 ... y_n] = [u_1 u_2 ... u_m] [a_{11} a_{12} ... a_{1n}; a_{21} a_{22} ... a_{2n}; ...; a_{m1} a_{m2} ... a_{mn}] ≜ [u_1 u_2 ... u_m] A    (2-27)

Note that the elements of A are in the field F and the ith column of A is the representation of y_i with respect to the basis of Y. With respect to the basis {x_1, x_2, ..., x_n} of (X, F) and the basis {u_1, u_2, ..., u_m} of (Y, F), the linear operator y = L x can be written as

    [u_1 u_2 ... u_m] β = L [x_1 x_2 ... x_n] α = [y_1 y_2 ... y_n] α    (2-28)

where β ≜ [β_1 β_2 ... β_m]' and α ≜ [α_1 α_2 ... α_n]' are the representations of y and x, respectively. After the bases are chosen, there is no difference between specifying x, y and specifying α, β; hence in studying y = L x, we may just study the relationship between β and α. By substituting (2-27) into (2-28), we obtain

    [u_1 u_2 ... u_m] β = [u_1 u_2 ... u_m] A α    (2-29)

which, together with the uniqueness of a representation, implies that

    β = A α    (2-30)

Hence we conclude that if bases of (X, F) and (Y, F) are chosen, the operator can be represented by a matrix with coefficients in F. Q.E.D.
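Theorem 2-2 is constructive: the ith column of A is obtained by applying L to the ith basis vector and representing the result. A sketch for the rotation operator of Example 1, with respect to the orthonormal basis (2-13), so that representations coincide with the vectors themselves; NumPy assumed:

```python
import numpy as np

def L(x):
    """Rotate a point of the plane counterclockwise by 90 degrees."""
    return np.array([-x[1], x[0]])

basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
# ith column of A = representation of L(x_i) with respect to the basis of Y
# (here the basis is the orthonormal one, so that representation is L(x_i) itself).
A = np.column_stack([L(x) for x in basis])
print(A)          # [[ 0. -1.]
                  #  [ 1.  0.]]
```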
We see from (2-30) that the matrix A gives the relation between the representations α and β, not the vectors x and y. We also see that A depends on the bases chosen. Hence, for different bases, we have different representations of the same operator.

We study in the following an important subclass of linear operators: those that map a linear space (X, F) into itself, that is, L : (X, F) → (X, F). In this case, the same basis is always used for these two copies of the linear space. If a basis of X, say {e_1, e_2, ..., e_n}, is chosen, then a matrix representation A of the linear operator L can be obtained by using Theorem 2-2. For a different basis {ē_1, ē_2, ..., ē_n}, we shall obtain a different representation Ā of the same operator L. We now establish the relationship between A and Ā. Consider Figure 2-4: x is an arbitrary vector in X; α and ᾱ are the representations of x with respect to the basis {e_1, e_2, ..., e_n} and the basis {ē_1, ē_2, ..., ē_n}, respectively. Since the vector y = L x is in the same space, its representations with respect to the chosen bases, say β and β̄, can also be found. The matrix representations A and Ā can be computed by using Theorem 2-2. The relationships between α and ᾱ and between β and β̄ have been established in (2-20); they are related by ᾱ = P α and β̄ = P β, where P is a nonsingular matrix with coefficients in the field F whose ith column is the representation of e_i with respect to the basis {ē_1, ē_2, ..., ē_n}. From Figure 2-4, we have

    β = A α        and        β̄ = P β = P A α = P A P⁻¹ ᾱ

Hence, by the uniqueness of a representation with respect to a specific basis, we have Ā ᾱ = P A P⁻¹ ᾱ. Since this relation holds for any ᾱ, we conclude that

    Ā = P A P⁻¹ = Q⁻¹ A Q    (2-31a)
or
    A = P⁻¹ Ā P = Q Ā Q⁻¹    (2-31b)

where Q ≜ P⁻¹.

Figure 2-4 Relationships between different representations of the same operator: the ith column of A is the representation of L e_i with respect to {e_1, ..., e_n}; the ith column of Ā is the representation of L ē_i with respect to {ē_1, ..., ē_n}; the ith column of P is the representation of e_i with respect to {ē_1, ..., ē_n}; and the ith column of Q is the representation of ē_i with respect to {e_1, ..., e_n}.

Two matrices A and Ā are said to be similar if there exists a nonsingular matrix P satisfying (2-31). The transformation defined in (2-31) is called a similarity transformation. Clearly, all the matrix representations (with respect to different bases) of the same operator are similar.

Example 3
Consider the linear operator L of Example 1, shown in Figure 2-3, and note from the figure that x_2 = L x_1. If we choose {x_1, x_2} as a basis, then

    y_1 = L x_1 = x_2 = [x_1 x_2] [0; 1]        and        y_2 = L x_2 = −x_1 = [x_1 x_2] [−1; 0]

Hence the representation of L with respect to the basis {x_1, x_2} is

    Ā = [0 −1]
        [1  0]

It is easy to verify that the representation of y = L x, where x has the representation [1.5 0.5]' with respect to {x_1, x_2}, is equal to

    [0 −1] [1.5]   [−0.5]
    [1  0] [0.5] = [ 1.5]

or y = [x_1 x_2] [−0.5; 1.5]. If, instead of {x_1, x_2}, we choose {x_1, x_3} of Figure 2-3 as a basis, then the representations of L x_1 and L x_3 with respect to {x_1, x_3} can be read off from the figure in the same manner, and they yield a different representation of the same operator L, together with a different representation of x. The reader is advised to find the matrix P for this example and verify Ā = P A P⁻¹.

In matrix theory, a matrix is introduced as an array of numbers. With the concepts of linear operator and representation, we shall now give a new interpretation of a matrix.
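The similarity transformation (2-31) can be checked numerically for the rotation operator. The basis vector x_1 below is a hypothetical stand-in for the one drawn in Figure 2-3, with x_2 = L x_1 as in Example 3; NumPy assumed:

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])           # rotation by 90 degrees, standard basis

x1 = np.array([1.0, 0.5])             # illustrative choice, not Figure 2-3's exact vector
x2 = A @ x1                           # x2 = L x1
Q = np.column_stack([x1, x2])         # ith column: q_i in standard coordinates
Abar = np.linalg.inv(Q) @ A @ Q       # (2-31): representation w.r.t. {x1, x2}
print(np.round(Abar, 12))             # [[ 0. -1.]
                                      #  [ 1.  0.]]
```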
Given an n x n matrix A with coefficients in a field F, if it is not specified to be a representation of some operator, we shall consider it as a linear operator that maps (F^n, F) into itself.⁷ The matrix A is independent of the basis chosen for (F^n, F). However, if the set of vectors n_1, n_2, ..., n_n in Equation (2-13) is chosen as a basis of (F^n, F), then the representation of the linear operator A is identical to the linear operator A (a matrix) itself. This can be checked by using the fact that the ith column of the representation is equal to the representation of A n_i with respect to the basis {n_1, n_2, ..., n_n}. If a_i is the ith column of A, then A n_i = a_i. Now the representation of a_i with respect to the basis (2-13) is identical to a_i itself. Therefore we conclude that the representation of a matrix (a linear operator) with respect to the basis (2-13) is identical to the matrix itself.

⁷This interpretation can be extended to nonsquare matrices.

For a matrix (an operator), Figure 2-4 can be modified as in Figure 2-5. The equation Q = [q_1 q_2 ... q_n] follows from the fact that the ith column of Q is the representation of q_i with respect to the basis {n_1, n_2, ..., n_n}. If a basis {q_1, q_2, ..., q_n} is chosen for (F^n, F), a matrix A has a representation Ā = Q⁻¹ A Q.

Figure 2-5 Different representations of a matrix (an operator): with respect to the basis {n_1, n_2, ..., n_n} the representation is A itself; with respect to the basis {q_1, q_2, ..., q_n} it is Ā = Q⁻¹ A Q, where Q = [q_1 q_2 ... q_n] = P⁻¹ and the ith column of Ā is the representation of A q_i with respect to {q_1, q_2, ..., q_n}.

From Figure 2-5, we see that the matrix representation Ā may be computed either from Theorem 2-2 or from a similarity transformation. In most of the problems encountered in this book, it is much easier to compute Ā from Theorem 2-2 than from a similarity transformation.

Example 4
Consider the following matrix with coefficients in R:

    A = [ 3  2 −1]            b = [0]
        [−2  1  0]                [0]
        [ 4  3  1]                [1]

Then⁸

    Ab = [−1]        A²b = [−4]        A³b = [ −5]
         [ 0]              [ 2]              [ 10]
         [ 1]              [−3]              [−13]

⁸A² ≜ AA, A³ ≜ AAA.

It can be shown that the following relation holds (check!):

    A³b = 5A²b − 15Ab + 17b    (2-32)

Since the set of vectors b, Ab, and A²b is linearly independent, it qualifies as a basis. We now compute the representation of A with respect to this basis. It is clear that

    A(b) = [b Ab A²b] [0; 1; 0]
    A(Ab) = [b Ab A²b] [0; 0; 1]
    A(A²b) = [b Ab A²b] [17; −15; 5]

The last equation is obtained from (2-32). Hence the representation of A with respect to the basis {b, Ab, A²b} is

    Ā = [0 0  17]
        [1 0 −15]
        [0 1   5]

The matrix Ā can also be obtained from Q⁻¹AQ, but this requires the inversion of a matrix and n³ multiplications. However, we may use Ā = Q⁻¹AQ or, more easily, QĀ = AQ to check our result. The reader is asked to verify

    [0 −1 −4] [0 0  17]   [ 3 2 −1] [0 −1 −4]
    [0  0  2] [1 0 −15] = [−2 1  0] [0  0  2]
    [1  1 −3] [0 1   5]   [ 4 3  1] [1  1 −3]

Example 5
We extend Example 4 to the general case. Let A be an n x n square matrix with real coefficients. If there exists a real vector b such that the vectors b, Ab, ..., A^{n−1}b are linearly independent and if

    A^n b = −α_n b − α_{n−1} Ab − ... − α_1 A^{n−1} b

(see Section 2-7), then the representation of A with respect to the basis {b, Ab, ..., A^{n−1}b} is

    Ā = [0 0 ... 0 −α_n    ]
        [1 0 ... 0 −α_{n−1}]
        [0 1 ... 0 −α_{n−2}]
        [. .       .       ]
        [0 0 ... 1 −α_1    ]    (2-33)

A matrix of the form shown in (2-33), or its transpose, is said to be in the companion form. See Problem 2-26. This form will arise constantly in this text.

As an aid in memorizing Figure 2-5, we write Ā = Q⁻¹AQ as QĀ = AQ. Since Q = [q_1 q_2 ... q_n], this can be further written as

    [q_1 q_2 ... q_n] Ā = [A q_1  A q_2 ... A q_n]    (2-34)
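Equation (2-34) gives a convenient numerical check of Example 4; a sketch assuming NumPy:

```python
import numpy as np

A = np.array([[ 3.0, 2.0, -1.0],
              [-2.0, 1.0,  0.0],
              [ 4.0, 3.0,  1.0]])
b = np.array([0.0, 0.0, 1.0])

Q = np.column_stack([b, A @ b, A @ A @ b])   # basis {b, Ab, A^2 b}
Abar = np.linalg.solve(Q, A @ Q)             # = Q^{-1} A Q
print(np.round(Abar, 12))
# [[  0.   0.  17.]
#  [  1.   0. -15.]
#  [  0.   1.   5.]]   -- the companion form of (2-33)
assert np.allclose(Q @ Abar, A @ Q)          # Q A-bar = A Q, i.e., (2-34)
```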
From (2-34), we see that the ith column of Ā is indeed the representation of A q_i with respect to the basis {q_1, q_2, ..., q_n}.

We pose the following question to conclude this section: Since a linear operator has many representations, is it possible to choose one set of basis vectors such that the representation is nice and simple? The answer is affirmative. In order to give a solution, we must first study linear algebraic equations.

2-5 Systems of Linear Algebraic Equations

Consider the set of linear equations

    a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n = y_1
    a_{21} x_1 + a_{22} x_2 + ... + a_{2n} x_n = y_2
    ..........................................
    a_{m1} x_1 + a_{m2} x_2 + ... + a_{mn} x_n = y_m    (2-35)

where the given a_{ij}'s and y_i's are assumed to be elements of a field F and the unknown x_j's are also required to be in the same field F. This set of equations can be written in matrix form as

    A x = y    (2-36)

where

    A ≜ [a_{11} a_{12} ... a_{1n}; a_{21} a_{22} ... a_{2n}; ...; a_{m1} a_{m2} ... a_{mn}]        x ≜ [x_1; x_2; ...; x_n]        y ≜ [y_1; y_2; ...; y_m]

Clearly, A is an m x n matrix, x is an n x 1 vector, and y is an m x 1 vector. No restriction is made on the integer m; it may be larger than, equal to, or smaller than the integer n. Two questions can be raised in regard to this set of equations: first, the existence of a solution and, second, the number of solutions. More specifically, suppose the matrix A and the vector y in Equation (2-36) are given. The first question is concerned with the conditions on A and y under which at least one vector x exists such that A x = y. If solutions exist, the second question is concerned with the number of linearly independent vectors x such that A x = y. In order to answer these questions, the rank and the nullity of the matrix A have to be introduced.

We agreed in the previous section to consider the matrix A as a linear operator that maps (F^n, F) into (F^m, F). Recall that the linear space (F^n, F) that undergoes the transformation is called the domain of A.

Definition 2-9
The range of a linear operator A is the set R(A) defined by

    R(A) = {all the elements y of (F^m, F) for which there exists at least one vector x in (F^n, F) such that y = A x}

Theorem 2-3
The range of a linear operator A is a subspace of (F^m, F).

Proof
If y_1 and y_2 are elements of R(A), then by definition there exist vectors x_1 and x_2 such that y_1 = A x_1 and y_2 = A x_2. We claim that for any α_1 and α_2 in F, the vector α_1 y_1 + α_2 y_2 is also an element of R(A). Indeed, by the linearity of A, it is easy to show that α_1 y_1 + α_2 y_2 = A(α_1 x_1 + α_2 x_2); and since α_1 x_1 + α_2 x_2 is an element of (F^n, F), the vector α_1 y_1 + α_2 y_2 belongs to R(A). Hence the range R(A) is a subspace of (F^m, F) (see the remark following Definition 2-3). Q.E.D.

Let the ith column of A be denoted by a_i; that is, A = [a_1 a_2 ... a_n]. Then the matrix equation (2-36) can be written as

    y = x_1 a_1 + x_2 a_2 + ... + x_n a_n    (2-37)

where x_i, for i = 1, 2, ..., n, are the components of x and are elements of F. The range space R(A) is, by definition, the set of y such that y = A x for some x in (F^n, F). This is the same as saying that R(A) is the set of y in (2-37) with x_1, x_2, ..., x_n ranging through all possible values in F. Therefore we conclude that R(A) is the set of all possible linear combinations of the columns of A. Since R(A) is a linear space, its dimension is defined and is equal to the maximum number of linearly independent vectors in R(A). Hence the dimension of R(A) is the maximum number of linearly independent columns in A.

Definition 2-10
The rank of a matrix A, denoted by ρ(A), is the maximum number of linearly independent columns in A or, equivalently, the dimension of the range space of A.
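Example 1 below computes a rank by inspecting the columns; the same number can be obtained numerically from a rank routine — a sketch assuming NumPy:

```python
import numpy as np

A = np.array([[0.0, 1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0, 4.0],
              [2.0, 0.0, 2.0, 0.0]])

print(np.linalg.matrix_rank(A))       # 2
# Rank = maximal number of independent columns = dimension of the range space;
# the same number is obtained from the rows (rank A = rank A').
print(np.linalg.matrix_rank(A.T))     # 2
```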
' Example 1 Consider the matrix one Noe oAN 28 LINEAR SPACES AND LINEAR OPERATORS The range space of A is all the possible linear combinations of all the columns of A, or correspondingly, all the possible linear combinations of the first two columns of A, because the third and the fourth columns of A are linearly depen- dent on the first two columns. Hence the rank of A is 2. ' The rank of a matrix can be computed by using a sequence of elementary transformations (see Appendix A). This is based on the property that the rank of a matrix remains unchanged after pre- or postmultiplications of elementary matrices (Theorem 2-7). Once a matrix is transformed into the upper triangular form as shown in (A-6), then the rank is equal to the number of nonzero rows in (A-6). From the form, it is also easy to verify that the number of linear inde- pendent columns of a matrix is equal to the number of independent rows. Consequently, if A is an n x m matrix, then rank A=no. of linear independent columns =no. of linear independent rows p(B) ~n+plA). QED. Hf B is an » x » matrix and nonsingular, then p(A)+ p(B) —n = p(A) +|Bq/? Hence a* A* Aa =0 implies B; =0, for i=1,2,...,m; or. equivalently, Aa =0, which, in turn, implies # =0 from the assumption of p(A)=n. Therefore we conclude that p(A*A) =n. 2. This part can be similarly proved or directly deduced from the foregoing by using the fact p(A) = p(A*). QED. 2-6 Eigenvectors, Generalized Eigenvectors, and Jordan- Form Representations of a Linear Operator With the background of Section 2-5, we are now ready to study the problem posed at the end of Section 2-4. We discuss in this section only linear operators that map (C", C) into itself with the understanding that the results are applicable to any operator that maps a finite-dimensional linear space over C into itself. The reason for restricting the field to the field of complex numbers will be seen immediately. Let A be an n xn matrix with coefficients in the field C. We have agreed to consider A as a linear operator that maps (C”, C) into (C", C). Definition 2-12 Let A be a linear operator that maps (C”, €) into itself. Then a scalar A in C is called an eigenvalue of A if there exists a nonzero vector x in C" such that Ax =4x. Any nonzero vector x satisfying Ax = Ax is called an eigenvector of A associated with the eigenvalue /."' 1 In order to find an eigenvalue of A, we write Ax = Ax as (A—ADx =0 (2-40) where I is the unit matrix of order. We see that for any fixed 4 in C, Equation (2-40) is a set of homogeneous linear equations. The matrix (A — AI) is an nxn square matrix. From Corollary 2-5, we know that Equation (2-40) has a nontrivial solution if and only if det (A —AI)=0. It follows that a scalar 2 is an eigenvalue of A if and only if it isa solution of A(2) & det (AL—A)=0. A(A) is a polynomial of degree n in 4 and is called the characteristic polynomial of A. Since A(A) is of degree n, the n x n matrix A has n eigenvalues (not necessarily all distinct), '* [is also called a right eigenvector of A. Ifa 1 xn nonzero vector y exists such that yA = Ay, then y is called a lefi eigenvector of A associated with 2. Re LINEAR SPACES AND LINEAR OPERATORS, Example 1 Consider the matrix 1-1 a-[) “i (2-41) which maps (IR?, ) into itself. We like to check whether Definition 2-12 can be modified and applied to a linear operator that maps (R", R) into (R", R). A modified version of Definition 2-12 reads as a scalar / in R is an eigenvalue of A if there exists a nonzero vector x such that Ax = 4x. 
Clearly / is an eigenvalue of A if and only if it is a solution of det (AI—A)=0. Now ; a-1 1 > det (AI =A)=aet| A al=4 +1 which has no real-valued solution. Consequently, the matrix A has no eigen- value in R. Since the set of real numbers is a part of the field of complex numbers, there is no reason that we cannot consider the matrix A in (2-41) as a linear operator that maps (C?: C) into itself. In so doing, then the matrix A has eigenvalues +iand —i where i? A -1. . The constant matrices we shall encounter in this book are all real-valued. However in order to ensure the existence of eigenvalues, we shall consider them as linear operators under the field of complex numbers. With these preliminaries, we are ready to introduce a set of basis vectors such that a linear operator has a diagonal or almost diagonal representation. We study first the case in which all the eigenvalues of A are distinct; the case where A has repeated eigenvalues will then be studied. Case 1: All the eigenvalues of A are distinct Let 4,,43,...,4, be the eigenvalues of A, and let v, be an eigenvector of A associated with 2,, for i=1,2,...,n; that is, Av; =4,¥;. We shall use the set of vectors {V;,V2,...,¥,} aS a basis of (C", C). In order to do so, we have to show that the set is linearly independent and qualifies as a basis. Theorem 2-9 Let A,,42,...,4, be the distinct eigenvalues of A, and let v; be an eigenvector of A associated with 4;, for i=1,2,...,n. Then the set {v,,¥2,...,¥,} is linearly independent (over C). Proof We prove the theorem by contradiction. Suppose v,,¥5,...,¥, are linearly dependent; then there exist «,,02,..., a, (not all zero) in € such that OV, HQV2°°* +0G,V,=0 (2-42) We assume a, #0. If «,=0, we may reorder A; in such a way that «, #0. REPRESENTATIONS OF A LINEAR OPERATOR 35 Equation (2-42) implies that (A=ADMA 214200 ( F xm)=a (2-43) Since (A-ADy=Gi— ay iti Fi and (A-AD = the left-hand side of (2-43) can be reduced to (Ay —AgMAy Aa) Ay — AW =O By assumption the 1's, for i=1,2,...,n, are all distinct; hence the equation a; [] (41 —A)v, =0 implies 2, =0. This isa contradiction. Thus, the set of vectors {V¥1,V2,--..¥,} Vn} is linearly independent and qualifies as a basis. QE.D. Let A be the representation of A with respect to the basis {¥,.¥2,....¥,}. Recall from Figure 2-5 that a ith column of A is the representation of Av; = Aw; with respect to {V,, V2, nj—that is.[0 --- 0 4; O --: OJ’, where 4, is located at the ith entry. “ence the representation of A with respect to Wi, Vope es Voh iS A, 0 0 0 , [o 4 0 0 A=|0 O ase 0 (2-44) This can also be checked by using a similarity transformation. From Figure 2-5, we have Q= In v2 7 vs] Since Yoo) WJ=[Av, Av, «17 Avy] dave dg] = QA we have A=Q-'AQ We conclude that if the eigenvalues of a linear operator A that maps (C”, C) into itself are all distinct, then by choosing the set of eigenvectors as a basis. the operator A has a diagonal matrix representation with the eigenvalues on the diagonal. Example 2 Consider 36 LINEAR SPACES AND LINEAR OPERATORS The characteristic polynomial of A is 4? +1. Hence the eigenvalues of A are +iand —i. The eigenvector associated with 2, =i can be obtained by solving the following homogeneous equation: (A-A,Dy, -[' a = liz |-9 21 Clearly the vector v; =[1 1 —J]' is a solution. Similarly, the vector v, = {t 1 + i] can be shown to be an eigenvector of 4, = —i. Hence the repre- sentation of A with respect to {¥,, ¥2} is ale | The reader is advised to verify this by a similarity transformation. . 
Case 2: The eigenvalues of A are not all distinct Unlike the previous case, if an operator A has repeated eigenvalues, it is not always possible to find a diagonal matrix representation. We shall use examples to illustrate the difficulty that may arise for matrices with repeated eigenvalues. Example 3 Consider 1 Oe! A=|0 1 0 0 0 2 The eigenvalues of A are 1, = 1,4, =1,and4,;=2. The eigenvectors associated with 4, can be obtained by solving the following homogeneous equations: 0 0 -1 (A-ADv=|0 0 O|v=0 (2-45) 0 Oo 2 Note that the matrix (A —/,]) has rank 1; therefore, two linearly independent vector solutions can be fourid-foi' (2-43) (see Corollary 2-5). Clearly, v, = [1 0 OJ andv,=[0 1 OJ’ are two linearly independent eigenvectors associated with 14,=A,;=1. An eigenvector associated with 2,=2 can be found as v;=[—-1 0 1]. Since the set of vectors {v,,V2,V} is linearly independent, it qualifies as a basis. The representation of A with respect to {¥1, V2. Va} is 100 A=|0 1 0 002 In this example, although A has repeated eigenvalues, it can still be diag- onalized. However, this is not always the case, as can be seen from the following example. REPRESENTATIONS OF A LINEAR OPERATOR 7 1 12 A=|0 1 3 (2-46) 0 0 2 The eigenvalues of Aare 2, =1, 4, =1,and 4,=2. The eigenvectors associated with 2, =1 can be found by solving 012 (A—A,Dv=|0 0 3|v=0 001 Since the matrix (A — 4,1) has rank 2, the null space of (A — 4,1) has dimension 1. Consequently, we can find only one linearly independent eigenvector, say ¥,=[1 0 OY, associated with 2;=/,=1. An eigenvector associated with 4, =2 can be found as v;=[5 3 1]. Clearly the two eigenvectors are not ~ sufficient to form a basis of (C3, C). 1 Example 4 Consider From this example, we see that if an n xn matrix A has repeated eigen- values, it is not always possible to find n linearly independent eigenvectors. Consequently, the A cannot be transformed into a diagonal form. However, it is possible to find a special set of basis vectors so that the new representation is almost a diagonal form, called a Jordan canonical form. The form has the eigenvalues of A on the diagonal and either 0 or 1 on the superdiagonal. For example, if A has an eigenvalue 2, with multiplicity 4 and an eigenvalue /> with multiplicity 1, then the new representation will assume one of the following (2-47) Which form it will assume depends on the characteristics of A and will be pdiscussed in the next subsection. The matrices in (2-47) are all of block-diagonal 38 LINEAR SPACES AND LINEAR OPERATORS, form. The blocks on the diagonal are of the form A100 0 O41 00 000-10 ae 000-41 000-04 with the same eigenvalue on the main diagonal and I’s on the diagonal just above the main diagonal. A matrix of this form is called a Jordan block associ- ated with 4. A matrix is said to be in the Jordan canonical form, or the Jordan Jorm, if its principal diagonal consists of Jordan blocks and the remaining elements are zeros. The fourth matrix in (2-47) has two Jordan blocks associated with 4, (one with order 3, the other with order 1) and one Jordan block associ- ated with 4,. A diagonal matrix is clearly a special case of the Jordan form: all of its Jordan blocks are of order 1. Every matrix which maps (C”, C) into itself has a Jordan-form representa- tion. The use of Jordan form is very convenient in developing a number of concepts and results; hence it will be extensively used in the remainder of this chapter. D Derivation of a Jordan-form representation.'? 
Derivation of a Jordan-form representation.¹²
In this subsection we discuss how to find a set of basis vectors such that the representation of A with respect to this set of basis vectors is in a Jordan form. The basis vectors to be used are called generalized eigenvectors.

¹²This section may be skipped without loss of continuity. However, it is suggested that the reader glance through it to gain a better feeling for the Jordan-form representation.

Definition 2-13
A vector v is said to be a generalized eigenvector of grade k of A associated with λ if and only if¹³

    (A − λI)^k v = 0        and        (A − λI)^{k−1} v ≠ 0

¹³(A − λI)^k ≜ (A − λI)(A − λI) ... (A − λI) (k terms), (A − λI)⁰ ≜ I.

Note that if k = 1, Definition 2-13 reduces to (A − λI) v = 0 and v ≠ 0, which is the definition of an eigenvector. Hence the term "generalized eigenvector" is well justified. Let v be a generalized eigenvector of grade k associated with the eigenvalue λ. Define

    v_k ≜ v
    v_{k−1} ≜ (A − λI) v = (A − λI) v_k    (2-48)
and, in general,
    v_i ≜ (A − λI)^{k−i} v = (A − λI) v_{i+1}        i = 1, 2, ..., k − 1    (2-49)

This set of vectors {v_1, v_2, ..., v_k} is called a chain of generalized eigenvectors of length k.

Let N_i denote the null space of (A − λI)^i; that is, N_i consists of all x such that (A − λI)^i x = 0. It is clear that if x is in N_i, then it is in N_{i+1}; hence N_i is a subspace of N_{i+1}, denoted N_i ⊂ N_{i+1}. Clearly, the v defined in Definition 2-13 is in N_k but not in N_{k−1}. In fact, for i = 1, 2, ..., k, the vector v_i = (A − λI)^{k−i} v defined in (2-49) is in N_i but not in N_{i−1}. Indeed, we have

    (A − λI)^i v_i = (A − λI)^i (A − λI)^{k−i} v = (A − λI)^k v = 0
and
    (A − λI)^{i−1} v_i = (A − λI)^{k−1} v ≠ 0

hence v_i is in N_i but not in N_{i−1}.

Let A be an n x n matrix with an eigenvalue λ of multiplicity m. We discuss in the following how to find m linearly independent generalized eigenvectors of A associated with λ. This is achieved by searching for chains of generalized eigenvectors of various lengths. First we compute the ranks of (A − λI)^i, i = 0, 1, 2, ..., until the rank of (A − λI)^k is equal to n − m. In order not to be overwhelmed by notation, we assume n = 10, m = 8, k = 4, and the ranks of (A − λI)^i, i = 0, 1, 2, 3, 4, as shown in Table 2-2. The nullity ν_i is the dimension of the null space N_i and, following Theorem 2-5, is equal to n − ρ((A − λI)^i). Because of N_0 ⊂ N_1 ⊂ N_2 ⊂ ..., we have 0 = ν_0 ≤ ν_1 ≤ ν_2 ≤ ... ≤ ν_k = m.

Table 2-2 Chains of Generalized Eigenvectors, where B ≜ A − λI

    i                     0     1     2     3     4
    ρ(B^i)               10     7     4     3     2
    ν_i = 10 − ρ(B^i)     0     3     6     7     8

    Vectors in N_i but not in N_{i−1} (ν_i − ν_{i−1} of them):
    in N_1:  u_1 = B³u    v_1 = Bv    w_1 = Bw
    in N_2:  u_2 = B²u    v_2 = v     w_2 = w        (two chains of length 2)
    in N_3:  u_3 = Bu
    in N_4:  u_4 = u                                  (one chain of length 4)

Now we are ready to find the m = 8 linearly independent generalized eigenvectors of A associated with λ. Because of N_3 ⊂ N_4 and ν_4 − ν_3 = 1, we can find one and only one linearly independent vector u in N_4 but not in N_3 such that

    B⁴u = 0        and        B³u ≠ 0        where B ≜ A − λI

From this u, we can generate a chain of four generalized eigenvectors as

    u_1 = B³u        u_2 = B²u        u_3 = Bu        u_4 = u    (2-50)

Because of N_2 ⊂ N_3 and ν_3 − ν_2 = 1, there is only one linearly independent vector in N_3 but not in N_2. The u_3 in (2-50) is such a vector; therefore we cannot find any other linearly independent vector in N_3 but not in N_2. Consider now the vectors in N_2 but not in N_1. Because ν_2 − ν_1 = 3, there are three such linearly independent vectors. Since u_2 in (2-50) is one of them, we can find two vectors v and w such that {u_2, v, w} is linearly independent and

    B²v = 0, Bv ≠ 0        and        B²w = 0, Bw ≠ 0

From v and w, we can generate two chains of generalized eigenvectors of length 2, as shown in Table 2-2. As can be seen from Table 2-2, the number of linearly independent vectors in N_1 is equal to ν_1 − ν_0 = 3; hence there is no need to search for other vectors in N_1. This completes the search for the eight generalized eigenvectors of A associated with λ.
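The search procedure rests entirely on the nullities ν_i = n − ρ((A − λI)^i): ν_1 is the total number of chains, and ν_i − ν_{i−1} is the number of chains of length at least i. A sketch assuming NumPy (the 3 x 3 matrix is an illustrative choice, not the 10 x 10 matrix of Table 2-2):

```python
import numpy as np

def nullities(A, lam, kmax):
    """nu_i = n - rank((A - lam*I)^i), the dimension of N_i, for i = 0..kmax."""
    n = A.shape[0]
    B = A - lam * np.eye(n)
    out, M = [0], np.eye(n)
    for _ in range(kmax):
        M = M @ B
        out.append(n - np.linalg.matrix_rank(M))
    return out

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])
# nu = [0, 2, 3, 3]: two chains in total (nu_1 = 2), one of length >= 2
# (nu_2 - nu_1 = 1), so the chain lengths are 2 and 1.
print(nullities(A, 2.0, 3))
```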
Theorem 2-10
The generalized eigenvectors of A associated with λ generated as in Table 2-2 are linearly independent.

Proof
First we show that if {u_2, v, w} is linearly independent, then {u_1, v_1, w_1} is linearly independent. Suppose {u_1, v_1, w_1} is not linearly independent; then there exist c_i, i = 1, 2, 3, not all zero, such that c_1 u_1 + c_2 v_1 + c_3 w_1 = 0. However, we have

    0 = c_1 u_1 + c_2 v_1 + c_3 w_1 = c_1 B u_2 + c_2 B v + c_3 B w = B(c_1 u_2 + c_2 v + c_3 w) ≜ B y

Since y is a vector in N_2 which, by the way u_2, v, and w are chosen, does not lie in N_1 unless it is zero, the only way to have B y = 0 is y = 0. Since {u_2, v, w} is linearly independent by assumption, y = 0 implies c_i = 0, i = 1, 2, 3. This is a contradiction. Hence if {u_2, v, w} is linearly independent, so is {u_1, v_1, w_1}.

Now we show that the generalized eigenvectors {u_i, i = 1, 2, 3, 4; v_j, w_j, j = 1, 2} are linearly independent. Consider

    c_1 u_1 + c_2 u_2 + c_3 u_3 + c_4 u_4 + c_5 v_1 + c_6 v_2 + c_7 w_1 + c_8 w_2 = 0    (2-51)

The application of B³ = (A − λI)³ to (2-51) yields c_4 B³ u_4 = 0, which implies, because of B³ u_4 ≠ 0, c_4 = 0. Similarly, we can show c_3 = 0 by applying B² to (2-51). With c_3 = c_4 = 0, the application of B to (2-51) yields

    c_2 u_1 + c_6 v_1 + c_8 w_1 = 0

which implies, because of the linear independence of {u_1, v_1, w_1}, c_2 = c_6 = c_8 = 0. Finally, we have c_1 = c_5 = c_7 = 0, again following from the linear independence of {u_1, v_1, w_1}. This completes the proof of this theorem. Q.E.D.

Theorem 2-11
The generalized eigenvectors of A associated with different eigenvalues are linearly independent.

This theorem can be proved as in Theorem 2-10 by repeatedly applying factors of the form (A − λ_i I). The proof is left as an exercise.

Now we discuss the representation of A with respect to

    Q ≜ [u_1 u_2 u_3 u_4 v_1 v_2 w_1 w_2 x x̄]

where the last two vectors are eigenvectors of A associated with the other eigenvalues. The first four columns of the new representation Â are the representations of A u_i, i = 1, 2, 3, 4, with respect to {u_1, u_2, u_3, u_4, v_1, v_2, w_1, w_2, x, x̄}. Because (A − λI) u_1 = 0, (A − λI) u_2 = u_1, (A − λI) u_3 = u_2, and (A − λI) u_4 = u_3, we have

    A u_1 = λ u_1 = Q [λ 0 0 0 0 ... 0]'
    A u_2 = u_1 + λ u_2 = Q [1 λ 0 0 0 ... 0]'
    A u_3 = u_2 + λ u_3 = Q [0 1 λ 0 0 ... 0]'
    A u_4 = u_3 + λ u_4 = Q [0 0 1 λ 0 ... 0]'

where the prime denotes the transpose. Proceeding similarly, the new representation Â can be obtained as

    Â = [λ 1 0 0 :     :     :    ]
        [0 λ 1 0 :     :     :    ]
        [0 0 λ 1 :     :     :    ]
        [0 0 0 λ :     :     :    ]
        [........: λ 1 :     :    ]
        [        : 0 λ :     :    ]
        [              : λ 1 :    ]
        [              : 0 λ :    ]
        [                    : · ·]    (2-52)

where all unmarked entries are zeros and the last two diagonal entries are the eigenvalues associated with the eigenvectors x and x̄. This is a Jordan-form matrix. Note that the number of Jordan blocks associated with λ is equal to ν_1 = 3, the dimension of the null space of (A − λI).¹⁴

¹⁴This number is called the geometric multiplicity of λ in Reference 86. In other words, the geometric multiplicity is the number of Jordan blocks, and the (algebraic) multiplicity is the sum of the orders of all Jordan blocks associated with λ.
If we use the set of vectors {¥,,¥2,¥3} as a basis, then the ith column of the new representation A is the representation of Av; with respect to the basis {¥,, V2, ¥3}- Since Av; =A,¥;,Av,=V, +4,¥,, and Av,=A3¥3, the representations of Av,, Ay,, and Av, with respect to the basis {v,,¥2,¥3} are, respectively, EH) Gl where 2, =1,4;=2. Hence we have , fi to A=|0 110] (2-53) This can also be obtained by using the similarity transformation A=Q-'AQ where Q=f), wh oon ono RON oe - REPRESENTATIONS OF A LINEAR OPERATOR, 43 Example 5 Transform the following matrix into the Jordan form: 3-1 1 1 0 0 1 1-1 -1 0 0 {6 0 2 0 1 1 , A=lo 0 0 2 -1 -1 eae 0 0 0 0 1 41 o 0 0 0 1 1 1. Compute the eigenvalues of A.'* det (A —Al) =[(3 — AI — 2) +1] — 2° [ — 2? - 1] = Hence A has eigenvalue 2 with multiplicity 5 and samatue 0 with multi- plicity 1. 2. Compute (A — 21), for i= 1,2,...,as follows: ‘ 1-1 1 1 0 0 1-1 -1 -1 0 0 / 0 0 0 0 1 1 (A — 21) - 4(A-21= pl BSA-W=)9 9 9 0 -1 -1 vy, =6-4=2 0 0 0 oO -1 1 0 0 0 0 1 -1 0 0 2 2 0 6 0 0 2 2 0 0 0 0 0 0 0 0 (A — 21)? = (A — 21)? = pl : Y=l0 6 0 0 0 0 v2 =4 0 0 0 0 2 -2 0 0 0 0-2 2 0 0 0 0 0 0 0 0 0 0 0 0 _yp_|9 0 0 0 0 0 p(A — 21) = A =19 9 0 0 0 OO] a 0 0 0 0-4 4 0 0 0 0 4 ~4 Since p(A —21)* =n —m =1, we stop here. Because v; —v =1, we can find a gerieralized eigenvector u of grade 3 such that B*u =0 and Bu #0. It is 45 We use the fact that A BI det =det A det C 0 Cc where A and C are square matrices, not necessarily of the same order. 44 LINEAR SPACES AND LINEAR OPERATORS easy to verify thatu=[0 0 1 0 O OJ is sucha vector. Define 2 1 0 2 -1 0 0 0 1 a, 2Bu= 0 u, 4 Bu= a u,; Qu= 0 0 0 0 0 0. 0. This is a chain of generalized eigenvectors of length 3. Because of v, — vy, =2, there are two linearly independent vectors in AW but not in .W”,. The vector u, is one of them. We search a vector v which is independent of #, and has the property B?v=0 and Bv#0. It can be readily verified that v= fo oO 1 1 1] is such a vector. Define Now we have found five generalized eigenvectors of A associated with 1 = 2. 3. Compute an eigenvector associated with 2, =0. Let w be an eigenvector of A associated with 4, =0; then Clearly, w=(0 0 0 0 1 ~1Jisasolution. 4. With respect to the basis {u,,U2,U,¥;.¥2, W},A has the following Jordan- form representation: (2-55) 5. This may be checked by using A=Q°'AQ or QA=AQ FUNCTIONS OF A SQUARE MATRIX 45 where uoY¥, Vv. wi ocoorce mm cocoo 0 0 1 lll 1 1 oconnoso In this example, if we reorder the basis {u,, U2, U3, ¥,,V¥2,w} and use {W, Vo, Vy, U3, U2, Hy} as a new basis, then the representation will be (2-56) This is also called a Jordan-form representation. Comparing it with Equation (2-55), we see that the new Jordan block in (2-56) has 1’s on the diagonal just below the main diagonal as a result of the different ordering of the basis vectors. In this book, we use mostly the Jordan block of the form in (2-55). Certainly everything discussed for this form can be modified and be applied to the form given in Equation (2-56). A Jordan-form representation of any linear operator A that maps (C”, C) into itself is unique up to the ordering of Jordan blocks. That is, the number of Jordan blocks and the order of each Jordan block are uniquely determined by A. However, because of different orderings of basis vectors, we may have different Jordan-form representations of the same matrix. 2-7 Functions of a Square Matrix In this section we shall study functions of a square matrix or a linear transforma- tion that maps (C", C) into itself. 
2-7 Functions of a Square Matrix

In this section we shall study functions of a square matrix or a linear transformation that maps $(\mathbb{C}^n, \mathbb{C})$ into itself. We shall use the Jordan-form representation extensively, because in terms of this representation almost all properties of a function of a matrix can be visualized. We study first polynomials of a square matrix and then define functions of a matrix in terms of polynomials of the matrix.

Polynomials of a square matrix. Let $A$ be a square matrix that maps $(\mathbb{C}^n, \mathbb{C})$ into itself. If $k$ is a positive integer, we define

$$A^k \triangleq AA \cdots A \quad (k \text{ terms}) \tag{2-57a}$$

and

$$A^0 \triangleq I \tag{2-57b}$$

where $I$ is a unit matrix. Let $f(\lambda)$ be a polynomial in $\lambda$ of finite degree; then $f(A)$ can be defined in terms of (2-57). For example, if $f(\lambda) = \lambda^3 + 2\lambda^2 + 6$, then

$$f(A) \triangleq A^3 + 2A^2 + 6I$$

We have shown in the preceding section that every square matrix $A$ that maps $(\mathbb{C}^n, \mathbb{C})$ into itself has a Jordan-form representation; equivalently, there exists a nonsingular constant matrix $Q$ such that $A = Q\hat{A}Q^{-1}$ with $\hat{A}$ in a Jordan canonical form. Since

$$A^k = (Q\hat{A}Q^{-1})(Q\hat{A}Q^{-1}) \cdots (Q\hat{A}Q^{-1}) = Q\hat{A}^k Q^{-1}$$

we have

$$f(A) = Q f(\hat{A}) Q^{-1} \qquad \text{or} \qquad f(\hat{A}) = Q^{-1} f(A) Q \tag{2-58}$$

for any polynomial $f(\lambda)$. One of the reasons to use the Jordan-form matrix is that if

$$\hat{A} = \begin{bmatrix} \hat{A}_1 & 0 \\ 0 & \hat{A}_2 \end{bmatrix} \tag{2-59}$$

where $\hat{A}_1$ and $\hat{A}_2$ are square matrices, then

$$f(\hat{A}) = \begin{bmatrix} f(\hat{A}_1) & 0 \\ 0 & f(\hat{A}_2) \end{bmatrix} \tag{2-60}$$

This can be easily verified by observing that

$$\hat{A}^k = \begin{bmatrix} \hat{A}_1^k & 0 \\ 0 & \hat{A}_2^k \end{bmatrix}$$

Definition 2-14
The minimal polynomial of a matrix $A$ is the monic polynomial¹⁶ $\psi(\lambda)$ of least degree such that $\psi(A) = 0$.

¹⁶ A monic polynomial is a polynomial the coefficient of whose highest power is 1. For example, $3x + 1$ and $-x^2 + 2x + 4$ are not monic polynomials, but $x^2 - 4x + 7$ is.

Note that the 0 in $\psi(A) = 0$ is an $n \times n$ square matrix whose entries are all zero. A direct consequence of (2-58) is that $f(A) = 0$ if and only if $f(\hat{A}) = 0$. Consequently, the matrices $A$ and $\hat{A}$ have the same minimal polynomial; more generally, similar matrices have the same minimal polynomial. Computing the minimal polynomial of a matrix is generally not a simple job; however, if the Jordan-form representation of the matrix is available, its minimal polynomial can be readily found.

Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the distinct eigenvalues of $A$ with multiplicities $n_1, n_2, \ldots, n_m$, respectively. It is the same as saying that the characteristic polynomial of $A$ is

$$\Delta(\lambda) = \det(\lambda I - A) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{n_i} \tag{2-61}$$

Let $\hat{A}$ be a Jordan-form representation of $A$:

$$\hat{A} = \operatorname{diag}(\hat{A}_1, \hat{A}_2, \ldots, \hat{A}_m) \tag{2-62}$$

where the $n_i \times n_i$ matrix $\hat{A}_i$ denotes all the Jordan blocks associated with $\lambda_i$.

Definition 2-15
The largest order of the Jordan blocks associated with $\lambda_i$ in $\hat{A}$ is called the index of $\lambda_i$ in $A$.

The multiplicity of $\lambda_i$ is denoted by $n_i$; the index of $\lambda_i$ is denoted by $\bar{n}_i$. For the matrix in (2-52), $n_1 = 8$, $\bar{n}_1 = 4$; for the matrix in (2-53), $n_1 = \bar{n}_1 = 2$, $n_2 = \bar{n}_2 = 1$; for the matrix in (2-55), $n_1 = 5$, $\bar{n}_1 = 3$, $n_2 = \bar{n}_2 = 1$. It is clear that $\bar{n}_i \leq n_i$.
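These numbers can be read off mechanically once a Jordan form is available. As an illustration of Definition 2-15, here is a sketch assuming Python with SymPy (an illustrative addition, not part of the original text), applied to the matrix of Example 5, for which $n_1 = 5$ and $\bar{n}_1 = 3$:

```python
import sympy as sp

A = sp.Matrix([[3, -1,  1,  1,  0,  0],
               [1,  1, -1, -1,  0,  0],
               [0,  0,  2,  0,  1,  1],
               [0,  0,  0,  2, -1, -1],
               [0,  0,  0,  0,  1,  1],
               [0,  0,  0,  0,  1,  1]])

P, J = A.jordan_form()   # A = P * J * P**(-1)
sp.pprint(J)             # blocks of order 3 and 2 at lambda = 2, order 1 at lambda = 0
# Multiplicity of lambda = 2: n_1 = 3 + 2 = 5; index: nbar_1 = max(3, 2) = 3.
```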
Theorem 2-12
The minimal polynomial of $A$ is

$$\psi(\lambda) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{\bar{n}_i}$$

where $\bar{n}_i$ is the index of $\lambda_i$ in $A$.

Proof
Since the matrices $A$ and $\hat{A}$ have the same minimal polynomial, it is the same as showing that $\psi(\lambda)$ is the polynomial of least degree such that $\psi(\hat{A}) = 0$. We first show that the minimal polynomial of $\hat{A}_i$ is $\psi_i(\lambda) = (\lambda - \lambda_i)^{\bar{n}_i}$. Suppose $\hat{A}_i$ consists of $r$ Jordan blocks associated with $\lambda_i$. Then

$$\hat{A}_i = \operatorname{diag}(\hat{A}_{i1}, \hat{A}_{i2}, \ldots, \hat{A}_{ir})$$

and

$$\psi_i(\hat{A}_i) = \begin{bmatrix} \psi_i(\hat{A}_{i1}) & 0 & \cdots & 0 \\ 0 & \psi_i(\hat{A}_{i2}) & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & \psi_i(\hat{A}_{ir}) \end{bmatrix} \tag{2-63}$$

If the matrix $(\hat{A}_{ij} - \lambda_i I)$ has dimension $n_{ij}$, then we have

$$(\hat{A}_{ij} - \lambda_i I) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \tag{2-64a}$$

$$(\hat{A}_{ij} - \lambda_i I)^2 = \begin{bmatrix} 0 & 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \tag{2-64b}$$

$$(\hat{A}_{ij} - \lambda_i I)^{n_{ij}-1} = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & \cdots & 0 & 0 \end{bmatrix} \tag{2-64c}$$

and

$$(\hat{A}_{ij} - \lambda_i I)^k = 0 \qquad \text{for any integer } k \geq n_{ij} \tag{2-64d}$$

By definition, $\bar{n}_i$ is the largest order of the Jordan blocks in $\hat{A}_i$, or equivalently, $\bar{n}_i = \max(n_{i1}, n_{i2}, \ldots, n_{ir})$. Hence $(\hat{A}_{ij} - \lambda_i I)^{\bar{n}_i} = 0$ for $j = 1, 2, \ldots, r$. Consequently, $\psi_i(\hat{A}_i) = 0$. It is easy to see from (2-63) and (2-64) that if $\bar{\psi}(\lambda) = (\lambda - \bar{\lambda})^k$ with either $\bar{\lambda} \neq \lambda_i$ or $k < \bar{n}_i$, then $\bar{\psi}(\hat{A}_i) \neq 0$. Hence $\psi_i(\lambda) = (\lambda - \lambda_i)^{\bar{n}_i}$ is the minimal polynomial of $\hat{A}_i$, and $\psi(\lambda) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{\bar{n}_i}$ is the minimal polynomial of $\hat{A}$ and, hence, of $A$. Q.E.D.

Note that matrices with the same characteristic polynomial may have different minimal polynomials, depending on the orders of their Jordan blocks. For example, $4 \times 4$ Jordan-form matrices with the single eigenvalue $\lambda_1$ and with Jordan blocks of orders $\{4\}$, $\{3, 1\}$, $\{2, 2\}$, $\{2, 1, 1\}$, and $\{1, 1, 1, 1\}$ all have the characteristic polynomial $(\lambda - \lambda_1)^4$ but have, respectively, $(\lambda - \lambda_1)^4$, $(\lambda - \lambda_1)^3$, $(\lambda - \lambda_1)^2$, $(\lambda - \lambda_1)^2$, and $(\lambda - \lambda_1)$ as minimal polynomials.

Because the characteristic polynomial is always divisible without remainder by the minimal polynomial, we have the following very important corollary of Theorem 2-12.

Corollary 2-12 (Cayley-Hamilton theorem)
Let

$$\Delta(\lambda) \triangleq \det(\lambda I - A) = \lambda^n + \alpha_1 \lambda^{n-1} + \cdots + \alpha_{n-1}\lambda + \alpha_n$$

be the characteristic polynomial of $A$. Then

$$\Delta(A) = A^n + \alpha_1 A^{n-1} + \cdots + \alpha_{n-1}A + \alpha_n I = 0$$

The Cayley-Hamilton theorem can also be proved directly without using Theorem 2-12 (see Problems 2-39 and 2-40). The reason for introducing the concept of minimal polynomial will be seen in the following theorem.

Theorem 2-13
Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the distinct eigenvalues of $A$ with indices $\bar{n}_1, \bar{n}_2, \ldots, \bar{n}_m$. Let $f$ and $g$ be two polynomials. Then the following statements are equivalent.

1. $f(A) = g(A)$.
2. Either $f = h_1\psi + g$ or $g = h_2\psi + f$, where $\psi$ is the minimal polynomial of $A$ and $h_1$ and $h_2$ are some polynomials.
3. $f^{(l)}(\lambda_i) = g^{(l)}(\lambda_i)$ for $l = 0, 1, 2, \ldots, \bar{n}_i - 1$; $i = 1, 2, \ldots, m$ (2-65)

where $f^{(l)}(\lambda_i) \triangleq \left.\dfrac{d^l f(\lambda)}{d\lambda^l}\right|_{\lambda = \lambda_i}$ and $g^{(l)}(\lambda_i)$ is similarly defined.

Proof
The equivalence of statements 1 and 2 follows directly from the fact that $\psi(A) = 0$. Statements 2 and 3 are equivalent following $\psi(\lambda) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{\bar{n}_i}$. Q.E.D.

In order to apply this theorem, we must know the minimal polynomial of $A$. The minimal polynomial can be obtained by transforming $A$ into a Jordan form or by direct computation (Problem 2-42). Both methods are complicated. Therefore it is desirable to modify Theorem 2-13 so that the use of the minimal polynomial can be avoided.

Corollary 2-13
Let the characteristic polynomial of $A$ be

$$\Delta(\lambda) \triangleq \det(\lambda I - A) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{n_i}$$

Let $f$ and $g$ be two arbitrary polynomials. If

$$f^{(l)}(\lambda_i) = g^{(l)}(\lambda_i) \qquad \text{for } l = 0, 1, 2, \ldots, n_i - 1;\ i = 1, 2, \ldots, m \tag{2-66}$$

then $f(A) = g(A)$.

This follows immediately from Theorem 2-13 by observing that the condition (2-66) implies (2-65), since $\bar{n}_i \leq n_i$. The set of numbers $f^{(l)}(\lambda_i)$, for $i = 1, 2, \ldots, m$ and $l = 0, 1, 2, \ldots, n_i - 1$ (there are totally $n = \sum_{i=1}^{m} n_i$ of them), are called the values of $f$ on the spectrum of $A$. Corollary 2-13 implies that any two polynomials that have the same values on the spectrum of $A$ define the same matrix function. To state it in a different way: given $n$ numbers, if we can construct a polynomial which gives these numbers on the spectrum of $A$, then this polynomial defines uniquely a matrix-valued function of $A$. It is well known that given any $n$ numbers, it is possible to find a polynomial $g(\lambda)$ of degree $n - 1$ that gives these $n$ numbers at some preassigned $\lambda$. Hence if $A$ is of order $n$, for any polynomial $f(\lambda)$ we can construct a polynomial of degree $n - 1$,

$$g(\lambda) = \alpha_0 + \alpha_1 \lambda + \cdots + \alpha_{n-1}\lambda^{n-1} \tag{2-67}$$

such that $g(\lambda) = f(\lambda)$ on the spectrum of $A$. Hence any polynomial of $A$ can be expressed as

$$f(A) = g(A) = \alpha_0 I + \alpha_1 A + \cdots + \alpha_{n-1}A^{n-1}$$

This fact can also be deduced directly from Corollary 2-12 (Problem 2-38).

Corollary 2-13 is useful in computing any polynomial and, as will be discussed, any function of $A$. If $A$ is of order $n$, the polynomial $g(\lambda)$ can be chosen as in (2-67) or as any polynomial of degree $n - 1$ with $n$ independent parameters. For example, if all eigenvalues $\lambda_i$, $i = 1, 2, \ldots, n$, of $A$ are distinct, then $g(\lambda)$ can be chosen as

$$g(\lambda) = \sum_{i=1}^{n} \beta_i \prod_{\substack{j=1 \\ j \neq i}}^{n} (\lambda - \lambda_j)
\qquad \text{or} \qquad
g(\lambda) = \sum_{i=1}^{n} \beta_i \prod_{j=1}^{i-1} (\lambda - \lambda_j)$$

In conclusion, the form of $g(\lambda)$ can be chosen to facilitate the computation.
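Before turning to examples, Theorem 2-12 and Corollary 2-12 can be made concrete on the matrix of Example 5, whose characteristic polynomial is $\lambda(\lambda - 2)^5$ and whose minimal polynomial, by Theorem 2-12, is $\lambda(\lambda - 2)^3$. A minimal sketch assuming Python with SymPy (an illustrative addition, not part of the original text):

```python
import sympy as sp

A = sp.Matrix([[3, -1,  1,  1,  0,  0],
               [1,  1, -1, -1,  0,  0],
               [0,  0,  2,  0,  1,  1],
               [0,  0,  0,  2, -1, -1],
               [0,  0,  0,  0,  1,  1],
               [0,  0,  0,  0,  1,  1]])
I6 = sp.eye(6)

# The minimal polynomial psi(lambda) = lambda*(lambda - 2)^3 annihilates A ...
print(A * (A - 2*I6)**3 == sp.zeros(6, 6))   # True
# ... and no lower index works:
print(A * (A - 2*I6)**2 == sp.zeros(6, 6))   # False
# Cayley-Hamilton: the characteristic polynomial also annihilates A.
print(A * (A - 2*I6)**5 == sp.zeros(6, 6))   # True
```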
Example 2
Compute $A^{100}$, where

$$A = \begin{bmatrix} 0 & 1 \\ -1 & 2 \end{bmatrix}$$

In other words, given $f(\lambda) = \lambda^{100}$, compute $f(A)$. The characteristic polynomial of $A$ is $\Delta(\lambda) = \det(\lambda I - A) = (\lambda - 1)^2$. Let $g(\lambda)$ be a polynomial of degree $n - 1 = 1$, say

$$g(\lambda) = \alpha_0 + \alpha_1 \lambda$$

Now, from Corollary 2-13, if $f(\lambda) = g(\lambda)$ on the spectrum of $A$, then $f(A) = g(A)$. On the spectrum of $A$, we have

$$f(1) = g(1): \quad (1)^{100} = \alpha_0 + \alpha_1$$
$$f'(1) = g'(1): \quad 100 \cdot (1)^{99} = \alpha_1$$

Solving these two equations, we obtain $\alpha_1 = 100$ and $\alpha_0 = -99$. Hence

$$A^{100} = g(A) = \alpha_0 I + \alpha_1 A = -99\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + 100\begin{bmatrix} 0 & 1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} -99 & 100 \\ -100 & 101 \end{bmatrix}$$

Obviously $A^{100}$ can also be obtained by multiplying $A$ 100 times or by using a different $g(\lambda)$ such as $g(\lambda) = \alpha_0 + \alpha_1(\lambda - 1)$ (Problem 2-33).

Functions of a square matrix

Definition 2-16
Let $f(\lambda)$ be a function (not necessarily a polynomial) that is defined on the spectrum of $A$. If $g(\lambda)$ is a polynomial that has the same values as $f(\lambda)$ on the spectrum of $A$, then the matrix-valued function $f(A)$ is defined as $f(A) \triangleq g(A)$.

This definition is an extension of Corollary 2-13 to include functions. To be precise, functions of a matrix should be defined by using the conditions in (2-65). The conditions in (2-66) are used because the characteristic polynomial is easier to obtain than the minimal polynomial. Of course, both conditions will lead to the same result. If $A$ is an $n \times n$ matrix, given the $n$ values of $f$ on the spectrum of $A$, we can find a polynomial of degree $n - 1$,

$$g(\lambda) = \alpha_0 + \alpha_1 \lambda + \cdots + \alpha_{n-1}\lambda^{n-1}$$

which is equal to $f$ on the spectrum of $A$. Hence from this definition we know that every function of $A$ can be expressed as

$$f(A) = \alpha_0 I + \alpha_1 A + \cdots + \alpha_{n-1}A^{n-1}$$

We summarize the procedure of computing a function of a matrix: Given an $n \times n$ matrix $A$ and a function $f(\lambda)$, we first compute the characteristic polynomial of $A$, say

$$\Delta(\lambda) = \prod_{i=1}^{m} (\lambda - \lambda_i)^{n_i}$$

Let

$$g(\lambda) = \alpha_0 + \alpha_1 \lambda + \cdots + \alpha_{n-1}\lambda^{n-1}$$

where $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are $n$ unknowns. Next we use the $n$ equations in (2-66) to compute these $\alpha$'s in terms of the values of $f$ on the spectrum of $A$. Then we have $f(A) = g(A)$. We note that other polynomials $g(\lambda)$ of degree $n - 1$ with $n$ independent parameters can also be used.

Example 3
Let

$$A_1 = \begin{bmatrix} 0 & 0 & -2 \\ 0 & 1 & 0 \\ 1 & 0 & 3 \end{bmatrix}$$

Compute $e^{A_1 t}$. Or, equivalently, if $f(\lambda) = e^{\lambda t}$, what is $f(A_1)$?

The characteristic polynomial of $A_1$ is $(\lambda - 1)^2(\lambda - 2)$. Let

$$g(\lambda) = \alpha_0 + \alpha_1 \lambda + \alpha_2 \lambda^2$$

Then

$$f(1) = g(1): \quad e^t = \alpha_0 + \alpha_1 + \alpha_2$$
$$f'(1) = g'(1): \quad te^t = \alpha_1 + 2\alpha_2 \quad \text{(note that the derivative is with respect to } \lambda, \text{ not } t\text{)}$$
$$f(2) = g(2): \quad e^{2t} = \alpha_0 + 2\alpha_1 + 4\alpha_2$$

Solving these equations, we obtain $\alpha_0 = -2te^t + e^{2t}$, $\alpha_1 = 3te^t + 2e^t - 2e^{2t}$, and $\alpha_2 = e^{2t} - e^t - te^t$. Hence, we have

$$e^{A_1 t} = g(A_1) = (-2te^t + e^{2t})I + (3te^t + 2e^t - 2e^{2t})A_1 + (e^{2t} - e^t - te^t)A_1^2 = \begin{bmatrix} 2e^t - e^{2t} & 0 & 2e^t - 2e^{2t} \\ 0 & e^t & 0 \\ e^{2t} - e^t & 0 & 2e^{2t} - e^t \end{bmatrix}$$

Example 4
Let

$$A_2 = \begin{bmatrix} 0 & 2 & -2 \\ 0 & 1 & 0 \\ 1 & -1 & 3 \end{bmatrix}$$

Its characteristic polynomial is $\Delta(\lambda) = (\lambda - 1)^2(\lambda - 2)$, which is the same as the one of $A_1$ in Example 3. Hence we have the same $g(\lambda)$ as in Example 3. Consequently, we have

$$e^{A_2 t} = g(A_2) = \begin{bmatrix} 2e^t - e^{2t} & 2te^t & 2e^t - 2e^{2t} \\ 0 & e^t & 0 \\ e^{2t} - e^t & -te^t & 2e^{2t} - e^t \end{bmatrix}$$

Example 5
Given

$$A = \begin{bmatrix} \lambda_1 & 1 & 0 & \cdots & 0 \\ 0 & \lambda_1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & \lambda_1 \end{bmatrix} \quad (n \times n) \tag{2-68}$$

The characteristic polynomial of $A$ is $(\lambda - \lambda_1)^n$. Let the polynomial $g(\lambda)$ be of the form

$$g(\lambda) = \alpha_0 + \alpha_1(\lambda - \lambda_1) + \alpha_2(\lambda - \lambda_1)^2 + \cdots + \alpha_{n-1}(\lambda - \lambda_1)^{n-1}$$

Then the conditions in (2-66) give immediately

$$\alpha_0 = f(\lambda_1), \quad \alpha_1 = f'(\lambda_1), \quad \alpha_2 = \frac{f^{(2)}(\lambda_1)}{2!}, \quad \ldots, \quad \alpha_{n-1} = \frac{f^{(n-1)}(\lambda_1)}{(n-1)!}$$

Hence,

$$f(A) = g(A) = f(\lambda_1)I + f'(\lambda_1)(A - \lambda_1 I) + \cdots + \frac{f^{(n-1)}(\lambda_1)}{(n-1)!}(A - \lambda_1 I)^{n-1}$$

$$= \begin{bmatrix} f(\lambda_1) & f'(\lambda_1)/1! & f^{(2)}(\lambda_1)/2! & \cdots & f^{(n-1)}(\lambda_1)/(n-1)! \\ 0 & f(\lambda_1) & f'(\lambda_1)/1! & \cdots & f^{(n-2)}(\lambda_1)/(n-2)! \\ 0 & 0 & f(\lambda_1) & \cdots & f^{(n-3)}(\lambda_1)/(n-3)! \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & f(\lambda_1) \end{bmatrix} \tag{2-69}$$

Here in the last step we have used (2-64). If $f(\lambda) = e^{\lambda t}$, then

$$e^{At} = \begin{bmatrix} e^{\lambda_1 t} & te^{\lambda_1 t} & t^2 e^{\lambda_1 t}/2! & \cdots & t^{n-1}e^{\lambda_1 t}/(n-1)! \\ 0 & e^{\lambda_1 t} & te^{\lambda_1 t} & \cdots & t^{n-2}e^{\lambda_1 t}/(n-2)! \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & e^{\lambda_1 t} \end{bmatrix} \tag{2-70}$$

Note that the derivatives in (2-69) are taken with respect to $\lambda_1$, not to $t$.
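The closed forms in Examples 3 and 4 can be compared against a direct numerical evaluation of the matrix exponential. A minimal sketch, assuming Python with NumPy and SciPy (an illustrative addition, not part of the original text):

```python
import numpy as np
from scipy.linalg import expm

A1 = np.array([[0., 0., -2.],
               [0., 1.,  0.],
               [1., 0.,  3.]])
t = 0.7
et, e2t = np.exp(t), np.exp(2 * t)
closed_form = np.array([[2*et - e2t, 0.,  2*et - 2*e2t],
                        [0.,         et,  0.],
                        [e2t - et,   0.,  2*e2t - et]])
print(np.allclose(expm(A1 * t), closed_form))   # True: matches Example 3
```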
A function of a matrix is defined through a polynomial of the matrix; therefore, the relations that hold for polynomials can also be applied to functions of a matrix. For example, if $A = Q\hat{A}Q^{-1}$, then

$$f(A) = Q f(\hat{A}) Q^{-1}$$

and if

$$\hat{A} = \begin{bmatrix} \hat{A}_1 & 0 \\ 0 & \hat{A}_2 \end{bmatrix}
\qquad \text{then} \qquad
f(\hat{A}) = \begin{bmatrix} f(\hat{A}_1) & 0 \\ 0 & f(\hat{A}_2) \end{bmatrix} \tag{2-71}$$

for any function $f$ that is defined on the spectrum of $\hat{A}$. Using (2-69) and (2-71), any function of a Jordan-canonical-form matrix can be obtained immediately.

Example 6
Consider

$$\hat{A} = \begin{bmatrix} \lambda_1 & 1 & 0 & 0 & 0 \\ 0 & \lambda_1 & 1 & 0 & 0 \\ 0 & 0 & \lambda_1 & 0 & 0 \\ 0 & 0 & 0 & \lambda_2 & 1 \\ 0 & 0 & 0 & 0 & \lambda_2 \end{bmatrix} \tag{2-72}$$

If $f(\lambda) = e^{\lambda t}$, then

$$f(\hat{A}) = e^{\hat{A}t} = \begin{bmatrix} e^{\lambda_1 t} & te^{\lambda_1 t} & t^2 e^{\lambda_1 t}/2! & 0 & 0 \\ 0 & e^{\lambda_1 t} & te^{\lambda_1 t} & 0 & 0 \\ 0 & 0 & e^{\lambda_1 t} & 0 & 0 \\ 0 & 0 & 0 & e^{\lambda_2 t} & te^{\lambda_2 t} \\ 0 & 0 & 0 & 0 & e^{\lambda_2 t} \end{bmatrix} \tag{2-73}$$

If $f(\lambda) = (s - \lambda)^{-1}$, where $s$ is a complex variable, then

$$f(\hat{A}) = (sI - \hat{A})^{-1} = \begin{bmatrix} \dfrac{1}{s - \lambda_1} & \dfrac{1}{(s - \lambda_1)^2} & \dfrac{1}{(s - \lambda_1)^3} & 0 & 0 \\ 0 & \dfrac{1}{s - \lambda_1} & \dfrac{1}{(s - \lambda_1)^2} & 0 & 0 \\ 0 & 0 & \dfrac{1}{s - \lambda_1} & 0 & 0 \\ 0 & 0 & 0 & \dfrac{1}{s - \lambda_2} & \dfrac{1}{(s - \lambda_2)^2} \\ 0 & 0 & 0 & 0 & \dfrac{1}{s - \lambda_2} \end{bmatrix} \tag{2-74}$$

Functions of a matrix defined by means of power series. We have used a polynomial of finite degree to define a function of a matrix. We shall now use an infinite series to give an alternative expression of a function of a matrix.

Definition 2-17
Let the power series representation of a function $f$ be

$$f(\lambda) = \sum_{i=0}^{\infty} \alpha_i \lambda^i \tag{2-75}$$

with the radius of convergence $\rho$. Then the function $f$ of a square matrix $A$ is defined as

$$f(A) \triangleq \sum_{i=0}^{\infty} \alpha_i A^i \tag{2-76}$$

if the absolute values of all the eigenvalues of $A$ are smaller than $\rho$, the radius of convergence, or the matrix $A$ has the property $A^k = 0$ for some positive integer $k$.

This definition is meaningful only if the infinite series in (2-76) converges. If $A^k = 0$ for some positive integer $k$, then (2-76) reduces to

$$f(A) = \sum_{i=0}^{k-1} \alpha_i A^i$$

If the absolute values of all the eigenvalues of $A$ are smaller than $\rho$, it can also be shown that the infinite series converges. For a proof, see Reference 77. Instead of proving that Definitions 2-16 and 2-17 lead to exactly the same matrix function, we shall demonstrate this by using Definition 2-17 to derive (2-69).

Example 7
Consider the Jordan-form matrix $A$ given in (2-68). Let

$$f(\lambda) = f(\lambda_1) + f'(\lambda_1)(\lambda - \lambda_1) + \frac{f^{(2)}(\lambda_1)}{2!}(\lambda - \lambda_1)^2 + \cdots$$

then

$$f(A) = f(\lambda_1)I + f'(\lambda_1)(A - \lambda_1 I) + \frac{f^{(2)}(\lambda_1)}{2!}(A - \lambda_1 I)^2 + \cdots + \frac{f^{(n-1)}(\lambda_1)}{(n-1)!}(A - \lambda_1 I)^{n-1} + \cdots \tag{2-77}$$

Since $(A - \lambda_1 I)^i$ is of the form of (2-64), and in particular $(A - \lambda_1 I)^i = 0$ for $i \geq n$, the matrix function (2-77) reduces immediately to (2-69).

Example 8
The exponential function

$$e^{\lambda t} = 1 + \lambda t + \frac{\lambda^2 t^2}{2!} + \cdots = \sum_{k=0}^{\infty} \frac{\lambda^k t^k}{k!}$$

converges for all finite $\lambda$ and $t$. Hence for any $A$, we have

$$e^{At} = \sum_{k=0}^{\infty} \frac{1}{k!} t^k A^k \tag{2-78}$$

A remark is in order concerning the computation of $e^{At}$. If $e^{At}$ is computed by using Definition 2-16, a closed-form matrix can be obtained. However, it requires the computation of the eigenvalues of $A$. This step can be avoided if the infinite series (2-78) is used. Clearly, the disadvantage of using (2-78) is that the resulting matrix may not be in a closed form. However, since the series (2-78) converges very fast, the series is often used to compute $e^{At}$ on a digital computer.
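The following sketch, assuming Python with NumPy (an illustrative addition, not part of the original text), implements the truncated series (2-78) directly and compares it with the closed form obtained for the matrix of Example 2 via Definition 2-16. Production routines refine this idea with scaling and squaring, but naive truncation already agrees well for moderate $\|At\|$:

```python
import numpy as np

def expm_series(A, t, terms=25):
    """Approximate e^{At} by the truncated power series (2-78)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ (A * t) / k        # term is now (At)^k / k!
        result = result + term
    return result

A = np.array([[0., 1.], [-1., 2.]])      # the matrix of Example 2
t = 0.5
# Closed form from g(lambda) = a0 + a1*lambda on the spectrum {1, 1}:
closed = np.exp(t) * np.array([[1 - t, t], [-t, 1 + t]])
print(np.allclose(expm_series(A, t), closed))   # True
```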
We derive some important properties of exponential functions of matrices to close this section. Using (2-78), it can be shown that

$$e^0 = I$$
$$e^{A(t+s)} = e^{At}e^{As} \tag{2-79}$$
$$e^{(A+B)t} = e^{At}e^{Bt} \qquad \text{if and only if} \qquad AB = BA \tag{2-80}$$

In (2-79), if we choose $s = -t$, then from the fact that $e^0 = I$, we have

$$[e^{At}]^{-1} = e^{-At} \tag{2-81}$$

By differentiation, term by term, of (2-78), we have

$$\frac{d}{dt}e^{At} = \sum_{k=1}^{\infty} \frac{1}{(k-1)!}t^{k-1}A^k = A\left[\sum_{k=0}^{\infty} \frac{1}{k!}t^k A^k\right] = Ae^{At} = e^{At}A \tag{2-82}$$

Here we have used the fact that functions of the same matrix commute (see Problem 2-36).

The Laplace transform of a function $f$ defined on $[0, \infty)$ is defined as

$$\hat{f}(s) \triangleq \mathcal{L}[f(t)] \triangleq \int_0^{\infty} f(t)e^{-st}\,dt \tag{2-83}$$

It is easy to show that $\mathcal{L}[t^k/k!] = s^{-(k+1)}$. By taking the Laplace transform of (2-78), we have

$$\mathcal{L}[e^{At}] = \sum_{k=0}^{\infty} s^{-(k+1)}A^k = s^{-1}\sum_{k=0}^{\infty} (s^{-1}A)^k \tag{2-84}$$

It is well known that the infinite series

$$f(\lambda) = 1 + \lambda + \lambda^2 + \cdots = (1 - \lambda)^{-1}$$

converges for $|\lambda| < 1$. Now if $s$ is chosen sufficiently large, the absolute values of all the eigenvalues of $s^{-1}A$ are smaller than 1. Hence from Definition 2-17, we have

$$(I - s^{-1}A)^{-1} = \sum_{k=0}^{\infty} (s^{-1}A)^k \tag{2-85}$$

Hence from (2-84) we have

$$\mathcal{L}[e^{At}] = s^{-1}(I - s^{-1}A)^{-1} = (sI - A)^{-1} \tag{2-86}$$

In this derivation, Equation (2-86) holds only for sufficiently large $s$. However, it can be shown by analytic continuation that Equation (2-86) does hold for all $s$ except at the eigenvalues of $A$. Equation (2-86) can also be established from (2-82). Because of $\mathcal{L}[dh(t)/dt] = s\mathcal{L}[h(t)] - h(0)$, the application of the Laplace transform to (2-82) yields

$$s\mathcal{L}[e^{At}] - e^0 = A\mathcal{L}[e^{At}] \qquad \text{or} \qquad (sI - A)\mathcal{L}[e^{At}] = I$$

which yields immediately (2-86). For the matrices in (2-73) and (2-74), we can also readily establish (2-86).

2-8 Norms and Inner Product¹⁷

¹⁷ May be skipped without loss of continuity. The material in this section is used only in Chapter 8, and its study may be coupled with that chapter.

All the concepts introduced in this section are applicable to any linear space over the field of complex numbers or over the field of real numbers. However, for convenience in the discussion, we restrict ourselves to the complex vector space $(\mathbb{C}^n, \mathbb{C})$.

The concept of the norm of a vector $x$ in $(\mathbb{C}^n, \mathbb{C})$ is a generalization of the idea of length. Any real-valued function of $x$, denoted by $\|x\|$, can be defined as a norm if it has the properties that for any $x$ in $(\mathbb{C}^n, \mathbb{C})$ and any $\alpha$ in $\mathbb{C}$:

1. $\|x\| \geq 0$ and $\|x\| = 0$ if and only if $x = 0$.
2. $\|\alpha x\| = |\alpha| \|x\|$.
3. $\|x_1 + x_2\| \leq \|x_1\| + \|x_2\|$.

The last inequality is called the triangular inequality. Let $x = [x_1\ x_2\ \cdots\ x_n]'$. Then the norm of $x$ can be chosen as

$$\|x\|_1 \triangleq \sum_{i=1}^{n} |x_i| \tag{2-87}$$

or

$$\|x\|_2 \triangleq \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2} \tag{2-88}$$

or

$$\|x\|_\infty \triangleq \max_i |x_i| \tag{2-89}$$

It is easy to verify that each of them satisfies all the properties of a norm. The norm $\|\cdot\|_2$ is called the euclidean norm. In this book, the concept of norm is used mainly in the stability study; we use the fact that $\|x\|$ is finite if and only if all the components of $x$ are finite.

The concept of norm can be extended to linear operators that map $(\mathbb{C}^n, \mathbb{C})$ into itself, or equivalently, to square matrices with complex coefficients. The norm of a matrix $A$ is defined as

$$\|A\| \triangleq \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \sup_{\|x\| = 1} \|Ax\| \tag{2-90}$$

where "sup" stands for supremum, the largest possible value of $\|Ax\|$ or the least upper bound of $\|Ax\|$. An immediate consequence of the definition of $\|A\|$ is, for any $x$ in $(\mathbb{C}^n, \mathbb{C})$,

$$\|Ax\| \leq \|A\| \|x\| \tag{2-91}$$

The norm of $A$ is defined through the norm of $x$; hence it is called an induced norm. For different $\|x\|$, we have different $\|A\|$. For example, if $\|x\|_1$ is used, then

$$\|A\|_1 = \max_j \left(\sum_{i=1}^{n} |a_{ij}|\right)$$

where $a_{ij}$ is the $ij$th element of $A$. If $\|x\|_2$ is used, then

$$\|A\|_2 = (\lambda_{\max}(A^*A))^{1/2}$$

where $A^*$ is the complex conjugate transpose of $A$ and $\lambda_{\max}(A^*A)$ denotes the largest eigenvalue of $A^*A$ (see Appendix E). If $\|x\|_\infty$ is used, then

$$\|A\|_\infty = \max_i \left(\sum_{j=1}^{n} |a_{ij}|\right)$$

These norms are all different, as can be seen from Figure 2-7. The norm of a matrix has the following properties:

$$\|A + B\| \leq \|A\| + \|B\| \qquad \|AB\| \leq \|A\| \|B\| \tag{2-92}$$
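A short computation illustrating the three induced norms follows; it is a sketch assuming Python with NumPy (an illustrative addition, not part of the original text). NumPy's np.linalg.norm implements exactly these induced norms for ord = 1, 2, inf:

```python
import numpy as np

A = np.array([[1., -2.],
              [3.,  4.]])

norm1   = np.max(np.sum(np.abs(A), axis=0))    # largest column sum, ||A||_1
norm2   = np.sqrt(np.max(np.linalg.eigvalsh(A.conj().T @ A)))  # (lambda_max(A*A))^(1/2)
norminf = np.max(np.sum(np.abs(A), axis=1))    # largest row sum, ||A||_inf

print(norm1, norm2, norminf)
# Agrees with the library's induced norms:
print(np.linalg.norm(A, 1), np.linalg.norm(A, 2), np.linalg.norm(A, np.inf))
```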
The concept of the inner product of two vectors is also useful. Any scalar-valued function of $x$ and $y$ in $(\mathbb{C}^n, \mathbb{C})$, denoted by $\langle x, y\rangle$, can be defined as an inner product if it has the properties

1. $\langle x, y\rangle = \overline{\langle y, x\rangle}$
2. $\langle x, \alpha y_1 + \beta y_2\rangle = \alpha\langle x, y_1\rangle + \beta\langle x, y_2\rangle$
3. $\langle x, x\rangle > 0$ for all $x \neq 0$

where the "overbar" denotes the complex conjugate of a number. The first property implies that $\langle x, x\rangle$ is a real number. The first two properties imply that $\langle \alpha x, y\rangle = \bar{\alpha}\langle x, y\rangle$. In the complex vector space $(\mathbb{C}^n, \mathbb{C})$, the inner product is always taken to be

$$\langle x, y\rangle = x^*y = \sum_{i=1}^{n} \bar{x}_i y_i \tag{2-93}$$

where $x^*$ is the complex conjugate transpose of $x$. Hence, for any square matrix $A$, we have

$$\langle x, Ay\rangle = x^*Ay \qquad \text{and} \qquad \langle A^*x, y\rangle = (A^*x)^*y = x^*Ay$$

Consequently we have

$$\langle x, Ay\rangle = \langle A^*x, y\rangle \tag{2-94}$$

The inner product provides a natural norm for a vector $x$: $\|x\| = (\langle x, x\rangle)^{1/2}$. In fact, this is the norm defined in Equation (2-88).

Theorem 2-14 (Schwarz inequality)
If we define $\|x\| \triangleq (\langle x, x\rangle)^{1/2}$, then

$$|\langle x, y\rangle| \leq \|x\| \|y\|$$

Proof
The inequality is obviously true if $y = 0$. Assume now $y \neq 0$. For any $\alpha$, clearly we have

$$0 \leq \langle x + \alpha y, x + \alpha y\rangle = \langle x, x\rangle + \bar{\alpha}\langle y, x\rangle + \alpha\langle x, y\rangle + \bar{\alpha}\alpha\langle y, y\rangle \tag{2-95}$$

Let $\alpha = -\langle y, x\rangle/\langle y, y\rangle$; then (2-95) becomes

$$0 \leq \langle x, x\rangle - \frac{\langle x, y\rangle\langle y, x\rangle}{\langle y, y\rangle} = \langle x, x\rangle - \frac{|\langle x, y\rangle|^2}{\langle y, y\rangle}$$

which gives the Schwarz inequality. Q.E.D.
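Both (2-94) and the Schwarz inequality are easy to spot-check numerically; NumPy's np.vdot computes exactly the inner product (2-93), conjugating its first argument. A minimal sketch assuming Python with NumPy (an illustrative addition, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

inner = np.vdot                            # <x, y> = x*y as in (2-93)
norm = lambda u: np.sqrt(inner(u, u).real)

print(abs(inner(x, y)) <= norm(x) * norm(y))                   # Schwarz: True
print(np.isclose(inner(x, A @ y), inner(A.conj().T @ x, y)))   # (2-94): True
```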
