Eigenvalues of Matrices
Books in the Classics in Applied Mathematics series are monographs and textbooks declared out
of print by their original publishers, though they are of continued importance and interest to the
mathematical community. SIAM publishes this series to ensure that the information presented in
these texts is not lost to today's students and researchers.
Editor-in-Chief
Robert E. O'Malley, Jr., University of Washington
Editorial Board
John Boyd, University of Michigan
Susanne Brenner, Louisiana State University
Bernard Deconinck, University of Washington
William G. Faris, University of Arizona
Nicholas J. Higham, University of Manchester
Peter Olver, University of Minnesota
Philip Protter, Cornell University
Matthew Stephens, The University of Chicago
Divakar Viswanath, University of Michigan
Gerhard Wanner, L'Université de Genève
Mark Kot, University of Washington
Jean Dickinson Gibbons, Ingram Olkin, and Milton Sobel, Selecting and Ordering Populations: A New
Statistical Methodology
James A. Murdock, Perturbations: Theory and Methods
Ivar Ekeland and Roger Temam, Convex Analysis and Variational Problems
Ivar Stakgold, Boundary Value Problems of Mathematical Physics, Volumes I and II
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables
David Kinderlehrer and Guido Stampacchia, An Introduction to Variational Inequalities and Their
Applications
F. Natterer, The Mathematics of Computerized Tomography
Avinash C. Kak and Malcolm Slaney, Principles of Computerized Tomographic Imaging
R. Wong, Asymptotic Approximations of Integrals
O. Axelsson and V. A. Barker, Finite Element Solution of Boundary Value Problems: Theory and Computation
David R. Brillinger, Time Series: Data Analysis and Theory
Joel N. Franklin, Methods of Mathematical Economics: Linear and Nonlinear Programming, Fixed-Point Theorems
Philip Hartman, Ordinary Differential Equations, Second Edition
Michael D. Intriligator, Mathematical Optimization and Economic Theory
Philippe G. Ciarlet, The Finite Element Method for Elliptic Problems
Jane K. Cullum and Ralph A. Willoughby, Lanczos Algorithms for Large Symmetric Eigenvalue
Computations, Vol. 1: Theory
M. Vidyasagar, Nonlinear Systems Analysis, Second Edition
Robert Mattheij and Jaap Molenaar, Ordinary Differential Equations in Theory and Practice
Shanti S. Gupta and S. Panchapakesan, Multiple Decision Procedures: Theory and Methodology
of Selecting and Ranking Populations
Eugene L. Allgower and Kurt Georg, Introduction to Numerical Continuation Methods
Leah Edelstein-Keshet, Mathematical Models in Biology
Heinz-Otto Kreiss and Jens Lorenz, Initial-Boundary Value Problems and the Navier-Stokes Equations
J. L. Hodges, Jr. and E. L. Lehmann, Basic Concepts of Probability and Statistics, Second Edition
George F. Carrier, Max Krook, and Carl E. Pearson, Functions of a Complex Variable: Theory and Technique
Friedrich Pukelsheim, Optimal Design of Experiments
Israel Gohberg, Peter Lancaster, and Leiba Rodman, Invariant Subspaces of Matrices with Applications
Lee A. Segel with G. H. Handelman, Mathematics Applied to Continuum Mechanics
Rajendra Bhatia, Perturbation Bounds for Matrix Eigenvalues
Barry C. Arnold, N. Balakrishnan, and H. N. Nagaraja, A First Course in Order Statistics
Charles A. Desoer and M. Vidyasagar, Feedback Systems: Input-Output Properties
Stephen L. Campbell and Carl D. Meyer, Generalized Inverses of Linear Transformations
Alexander Morgan, Solving Polynomial Systems Using Continuation for Engineering and Scientific Problems
I. Gohberg, P. Lancaster, and L. Rodman, Matrix Polynomials
Galen R. Shorack and Jon A. Wellner, Empirical Processes with Applications to Statistics
Richard W. Cottle, Jong-Shi Pang, and Richard E. Stone, The Linear Complementarity Problem
Rabi N. Bhattacharya and Edward C. Waymire, Stochastic Processes with Applications
Robert J. Adler, The Geometry of Random Fields
Mordecai Avriel, Walter E. Diewert, Siegfried Schaible, and Israel Zang, Generalized Concavity
Rabi N. Bhattacharya and R. Ranga Rao, Normal Approximation and Asymptotic Expansions
Françoise Chatelin, Spectral Approximation of Linear Operators
(continued)
Classics in Applied Mathematics (continued)
Yousef Saad, Numerical Methods for Large Eigenvalue Problems, Revised Edition
Achi Brandt and Oren E. Livne, Multigrid Techniques: 1984 Guide with Applications to Fluid Dynamics,
Revised Edition
Bernd Fischer, Polynomial Based Iteration Methods for Symmetric Linear Systems
Pierre Grisvard, Elliptic Problems in Nonsmooth Domains
E. J. Hannan and Manfred Deistler, The Statistical Theory of Linear Systems
Françoise Chatelin, Eigenvalues of Matrices, Revised Edition
Eigenvalues
of Matrices
REVISED EDITION
Françoise Chatelin
CERFACS and the University of Toulouse
Toulouse, France
With exercises by
Mario Ahues
Université de Saint-Étienne, France
and
Françoise Chatelin
Society for Industrial and Applied Mathematics
Philadelphia
Copyright © 2012 by the Society for Industrial and Applied Mathematics
This SIAM edition is a revised republication of the work first published by John
Wiley & Sons, Inc., in 1993.
This book was originally published in two separate volumes by Masson, Paris:
Valeurs propres de matrices (1988) and Exercises de valeurs propres de matrices (1989).
10 9 8 7 6 5 4 3 2 1
All rights reserved. Printed in the United States of America. No part of this book
may be reproduced, stored, or transmitted in any manner without the written
permission of the publisher. For information, write to the Society for Industrial and
Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688
USA.
SIAM is a registered trademark.
To
Hypatia of Alexandria,
a.d. 370–415,
stoned to death by the mob.
Contents
Preface to the Classics Edition xiii
Preface xv
Preface to the English Edition xix
Notation xxi
List of Errata xxiii
Appendices 351
A Solution to Exercises 351
B References for Exercises 395
C References 399
Index 406
Preface to the Classics Edition
The original French version of this book was published by Masson, Paris, in 1988.
The 24 years which have elapsed since 1988 until the present SIAM republication
of the English translation (Wiley, 1993) by Professor Ledermann have confirmed the
essential role played by matrices in intensive scientific computing. They lie at the
foundation of the digital revolution that is taking place worldwide at lightning speed.
During the past quarter of a century, the new field called qualitative computing
has emerged in mathematical computation, which can be viewed as a first step in the
direction of the polymorphic information theory that is required to decipher life phenomena on the planet. In this broader perspective, the backward analysis, which was
devised by Givens and Wilkinson in the late 1950s to assess the validity of matrix
computations performed in the finite precision arithmetic of scientific computers, becomes mandatory even when the arithmetic of the theory is exact. This is because
classical linear algebra may yield local results which disagree with the global nonlinear algebraic context. Consequently, square matrices play, via their eigenvalues and
singular values, an even more fundamental role than that which was envisioned in
Chapter 3 of the original version.
This Classics Revised Edition describes this deeper role in a postface taking the
form of Chapter 8, which is accompanied by an updated bibliography. This is my third
book devoted to computational spectral theory to be published by SIAM, following
Lectures on Finite Precision Computations in 1996 (co-authored with Valérie Frayssé)
and Spectral Approximation of Linear Operators in 2011 (Classics 65). These books
form a trilogy which contains the theoretical and practical knowledge necessary to
acquire a sound understanding of the central role played by eigenvalues of matrices
in life information theory.
My gratitude goes to Beresford Parlett, U.C. Berkeley, for his perceptive reading
of a draft of the Postface. It is again my pleasure to acknowledge the highly professional support provided by Sara Murphy, Developmental and Acquisitions Editor at
SIAM.
Françoise Chatelin
CERFACS and University of Toulouse,
July 2012.
Preface
Helmholtz (...) advises us to observe for a long time the waves of the sea and the
wakes of ships, especially at the moment when the waves cross each other (...). Through
such understanding one must arrive at this new perception, which brings more order
into the phenomena.
The reader who wants to obtain a deeper understanding of this area will find
the study of the following classical books very enriching: Golub-Van Loan
(Chapters 7, 8 and 9), Parlett and Wilkinson.
The present book is a work on numerical analysis in depth. It is addressed
especially to second-year students of the Maîtrise, to pupils of the Magistère, as
well as to those of the Grandes Écoles. It is assumed that the reader is familiar
with the basic facts of numerical analysis covered in the book Introduction à l'Analyse Numérique Matricielle et à l'Optimisation* by P. G. Ciarlet. The Recueil d'Exercices† is an indispensable pedagogic complement to the main text. It
consists of exercises of four types:
*An English translation was published in 1989 by the Cambridge University Press (Translator's footnote).
†This collection of exercises is incorporated in the present volume (Translator's footnote).
I am very pleased to have this opportunity to acknowledge the very fine work
accomplished by Professor Ledermann. He certainly worked much harder than
is commonly expected from a translator, to transform a terse French text into
a more flowing English one. He even corrected some mathematical mistakes!
Two paragraphs have also been added in Chapter 4 to keep up to date with
new developments about the influence of non-normality and the componentwise
stability analysis. The list of references has been updated accordingly.
Notation
ℕ  set of integers
ℝ  set of real numbers
ℂ  set of complex numbers
A = (aᵢⱼ)  matrix with element aᵢⱼ in the ith row and jth column (1 ≤ i ≤ n, 1 ≤ j ≤ m); linear map of ℂᵐ into ℂⁿ
Aᵀ = (aⱼᵢ)  transposed matrix (denoted by ᵗA in algebra)
A* = (āⱼᵢ)  transposed conjugate matrix
ℂⁿˣᵐ  set of n by m matrices over ℂ
x = (ξ₁, …, ξₙ)ᵀ  column vector of ℂⁿ
{x₁, …, xᵣ} = {xᵢ}₁ʳ  set of r vectors
A = [a₁, …, aₘ]  matrix of column vectors {aᵢ}₁ᵐ
sp(A)  spectrum of A
{λᵢ}₁ᵈ  set of distinct eigenvalues of A, d ≤ n
{μᵢ}₁ⁿ  set of eigenvalues, possibly repeated, each counted with its algebraic multiplicity
res(A) = ℂ − sp(A)  resolvent set of A
ρ(A) = maxᵢ |λᵢ|  spectral radius of A
det A  determinant of A
tr A  trace of A = Σᵢ₌₁ⁿ aᵢᵢ
r(A)  rank of A
adj A = (Aᵢⱼ)  adjoint of A: Aᵢⱼ is the cofactor of aⱼᵢ when A = (aᵢⱼ)
π(λ) = det(λI − A)  the characteristic polynomial of A
‖x‖₂ = (Σᵢ |ξᵢ|²)^{1/2}  Euclidean norm of x
List of Errata
Preface
p. xv line(—22) helicopters
p. xvi line(—2) indispensable
Notation
p. xxii definition A is regular ⟺ A is invertible ⟺ det A ≠ 0
Chapter 1
p. 3 line(-12) orthonormal
p. 10 line(13) basis
p. 11 line(2) x*x = 1
line(−7) ‖(I − π_N)π_M‖ in (1.4.6)
p. 16 line(2) ck
line(10) as required.
p. 18 line(2) basis
line(5) = 1/(2i) ⋯
line(8) A(ε) ~ (      x        −ε sin(2/ε) )
               ( −ε sin(2/ε)        x      )
p. 21 line(−7) , w*)
p. 22 line(15) 1 ≤ j < i ≤ m
p. 23 line(9) = 0
p. 31 line(13) . . . of X, but it actually goes back to Laplace in the
first Supplement, pp. 505-512, to Theorie analy-
tique des Probabilites, 3rd edition, Paris, 1820.
p. 32 footnote 1854-
XXIV LIST OF ERRATA
p. 38 line(18) \\&z\\
footnote *Karl Adolf Hessenberg, 1904-1959, born and died
in Frankfurt am Main.
p. 39 line(1) ‖T⁻¹‖_F
line(11) δ < 1
line(12) cond₂(X)
p. 40 line(16) \\AA* - A*A\\F
line(-8) of order r (not to be confused with the order of B
taken to be 1)
p. 42 line(—7) set of finite eigenvalues
p. 43 line(−13) Poincaré
p. 45 line(9) - f^) in 1.1.10
p. 46 line(13) ‖P‖₁ in 1.1.19
line(−3) θ_max = π/2 in 1.2.2
p. 47 line(−2) [U, Ũ] in 1.2.6
p. 48 line(10) If ‖(P − Q)P‖₂ < 1 in 1.3.4
line(−5) ( Σᵢ₌₁ʳ sin²θᵢ )^{1/2} in 1.4.2
Chapter 2
p. 65 line(-3) constant
p. 66 Proposition 2.2.7 z ↦ S(z)
line(11) k ∈ ℕ
line(12)
p. 68 line(7) An = APv
line(16) z ∈ Γ
p. 70 line(9) −p/(z − λ) in (2.2.5)
line(16) = −Σₖ₌₀^{l−1}
p. 72 line(5) = P(A*, λ̄)
p. 73 line(2) P(A*, λ̄) =
line(4) dz̄ = −dz
LIST OF ERRATA XXV
p. 75 line(14)
p. 78 line(3) (b) >δ-'
p. 79 line(7) σ
p. 80 line(-l) A-zI
p. 83 line(14) (2.6.2)
line(−10) in exact arithmetic.
p. 84 line(−3) λ(t) =
p. 86 line(3) k
p. 87 line(-6)
έί
p. 88 line(4) S(t) = [⋯]⁻¹
p. 89 PROOF Take s = p defined in Theorem 2.9.3.
p. 90 line(12) (I - Y*X)B = 0.
line(−6) = Γ ∪ {0}.
p. 91 line(14) = 0
p. 92 line(19) B = Y*AU
p. 93 line(3) \\vk\\ = *k
p. 96 lines(—7, —6) l(U)
p. 98 line(2) = Κχ^ + · · ·
line(5) +A'- 1 6
p. 103 line(−3) B in 2.9.1
p. 104 line(4) A*y − ξ̄y in 2.9.1
line(6) ⋯ ‖Q‖₂ in (*)
p. 105 line(7) delete of in 2.10.2
p. 108 line(16) coefficient (1, n) in J is a₁ₙ in 2.11.5
Chapter 3
line(−8) A = Σ ⋯
line(−7) xᵢ = (1 + r)fᵢ
p. 126 line(−7) of eigenvalues
XXVI LIST OF ERRATA
p. 132 line(9) π be the vector in 3.2.3
p. 135 line(−2) click in 3.2.9
p. 136 line(8) converge in 3.2.9
p. 140 line(5) S = [S₁, …] in 3.4.8
p. 145 3.7.1 [B:11]
3.7.2 [B:4, 11]
p. 146 3.7.4 [B:3, 11]
Chapter 4
Chapter 5
Chapter 6
p. 253 line(9) Gᵢ
p. 254 line(13)
p. 255 line(17) ≤ cαᵢ.
p. 258 line(3) 𝒦ₙ = lin(…, A^{n−1}u)
line(−2) bⱼ
line(−1) u := u −
p. 260 line(8) Let
χχνιιι LIST OF ERRATA
p. 261 line(-l) H
p. 262 line(13) P(λ₁) = ⋯ =
p. 267 line(4) J^l
line(14) (a) delete A
p. 269 line(4) sₖ ∈ S
line(11) sₖ = xₖ + Σⱼ Pⱼsₖ
line(−1) (γ₁)
p. 271 line(14) required
line(15) same
p. 272 line(6) u := u — · · ·
p. 273 line(l) M
p. 275 line(3) sp(A) − {λ}
line(-4) centre c,
line(-2) denominator
laiui 7}
±i-\ _i y(-— 0
p. 276 line(9) ≤ cαᵢ
line(−4) Delete the end of the proof and replace it by the following:
When dealing with ‖(Pₗ − P)x‖ in Theorem 6.3.4 we did not use the assumption that A is Hermitian. Therefore the result sin θₗ ≤ ‖(Pₗ − P)x‖ ≤ cαₗ is valid. To bound λ − λₗ we also follow the proof of Theorem 6.3.4, where xₗ (respectively xₗ′) is replaced by x̂ₗ (respectively x̂ₗ′). The conclusion follows accordingly.
p. 278 line(9) ioi = j
line(16) i = i* under Σ
line(19) ignore sketch of Hi
line(-7) \Χί
line(-5) h+i ι
line(-l) Hi
p. 279 lines(−11, −8, −1) wᵢ
line(−8) (6.7.1)
p. 281 line(-8) Sib in 6.1.2
p. 282 line(-12) Theorem 6.2.6 in 6.2.5
p. 283 line(l) (#*uj+i)i/i in 6.3.3
p. 284 line(8) change m to l in Tₗ in 6.3.5
p. 285 line(8)
p. 287 line(5) Exercise 6.3.13 in 6.3.14
p. 289 line(3) d₀, …, d_{m−1} in 6.3.18
line(10) r̃ⱼᵀrⱼ = 0 in 6.3.18
line(15) tridiagonal matrix Tₖ in 6.3.18
LIST OF ERRATA XXIX
Chapter 7
Appendix A
Appendix B
p. 395 line(−12) [15] de la méthode ...
Appendix C
p. 399 Balas
Barra
Baumgärtel, H. (1985)
p. 400 line(20) Bauwens
p. 402 line(24) . . . Sijthoff and Noordhoff.
line(25) Meyer, C.D. Jr. ...
p. 403 line(20) Oeuvres
line(−11) Rutishauser, H. (1969) 'Computational aspects of F. L. Bauer's simultaneous iteration method', Numer. Math., 13, 4–13.
CHAPTER 1
The norms
‖x‖₁ = Σⱼ₌₁ⁿ |ξⱼ| and ‖x‖_∞ = maxⱼ |ξⱼ|
are also useful. Unless the contrary is indicated, ‖·‖ denotes an arbitrary norm on ℂⁿ.
The scalar product on ℂⁿ is given by
(x, y) = y*x.
When (x, y) = 0, the vectors x and y are said to be orthogonal.
which is represented by the matrix A when ℂᵐ and ℂⁿ are referred to their respective canonical bases.
Example 1.1.1 The unit matrix Iₙ represents the identity map ℂⁿ → ℂⁿ. When there is no ambiguity we shall write I.
The matrix
A* = (āⱼᵢ)
is the transposed conjugate complex of A; it reduces to the transpose
Aᵀ = (aⱼᵢ)
(or ᵗA) when A is real. Let A be an n × n square matrix. The trace of A is given by
tr A = Σᵢ₌₁ⁿ aᵢᵢ.
An n × n square matrix Q is called unitary if
Q*Q = Iₙ.
Example 1.1.2 (see Isaacson and Keller, 1966, pp. 9–10) When the norm ‖·‖₁ is used for both ℂᵐ and ℂⁿ, then
‖A‖₁ = maxⱼ Σᵢ |aᵢⱼ|.
When the norm ‖·‖_∞ is used for both ℂᵐ and ℂⁿ, then
‖A‖_∞ = maxᵢ Σⱼ |aᵢⱼ| = ‖Aᵀ‖₁.
cⱼ = cos θⱼ,  0 ≤ θⱼ ≤ π/2  (j = 1, …, r).
We introduce the diagonal matrix
Θ = diag(θ₁, …, θᵣ).
The canonical angle
Proposition 1.2.2
U*Q ~ cos Θ.
In particular we deduce that
‖U*Q‖₂ = ‖cos Θ‖₂ and ‖U*Q‖_F = ‖cos Θ‖_F,
because the spectral norm ‖·‖₂ and the Frobenius norm ‖·‖_F depend only on the singular values.
In Exercise 1.2.1 the reader will be introduced to the case in which the common
dimension of M and N exceeds n/2.
We now suppose that M is referred to an orthonormal basis Q and that N is
referred to the adjoint basis Y (if it exists). Thus
Q*Q = Y*Q = I.
The following lemma ensures the existence of the adjoint basis provided that
that is
B*U*Q = I.
Thus B* exists if and only if U*Q is invertible. By Proposition 1.2.2 (see Exercise 1.1.8), the matrix U*Q is invertible if and only if
cos θ_max > 0,
and then
B⁻¹ = (U*Q)* = Q*U.
Lemma 1.2.4 Let X, Y and X′, Y′ be two pairs of adjoint bases for M and N. Then there exists an invertible matrix C of order r such that
X′ = XC and Y′ = Y(C⁻¹)*.
PROOF Since X and X' are bases for M, there exists an invertible matrix C such
that
X' = XC.
Similarly, there exists an invertible matrix D satisfying
Y′ = YD.
By hypothesis, the bases for M and N are adjoint in pairs; thus
Y*X = I and (Y′)*X′ = I.
Hence
(YD)*(XC) = D*Y*XC = D*C = I,
so that
D = (C*)⁻¹ = (C⁻¹)*.
Proposition 1.2.5 Let Y and Q be adjoint bases of M and N respectively, Q being orthonormal, and let Θ be the matrix of canonical angles between M and N. Then, if θ_max < π/2,
Y ~ (cos Θ)⁻¹ and Q − Y ~ tan Θ.
Figure 1.2.1
‖y‖₂ = 1/cos θ,
‖q − y‖₂ = tan θ,
‖ŷ‖₂ = cos θ,
‖ŷ − y‖₂ = sin θ.
1.3 PROJECTIONS
A projection P is a linear idempotent map:
P² = P.
To each projection there corresponds a decomposition of ℂⁿ into a direct sum: ℂⁿ = Im P ⊕ Ker P, and we can write
x = Px + (x − Px),
where Px ∈ Im P and x − Px ∈ Ker P. We say that P is a projection on M parallel to W. Conversely, if a direct decomposition of ℂⁿ is given, we can define a projection by stipulating that P is the identity map on M and the zero map on W. We put
W = N^⊥ = {x ∈ ℂⁿ : x*y = 0 for all y ∈ N}.
Lemma 1.3.1 The space C" can be decomposed into the direct sum
ℂⁿ = M ⊕ N^⊥
if and only if θ_max < π/2, where θ_max is the maximal angle between M and N.
PROOF According to Exercise 1.2.2 the equality θ_max = π/2 is equivalent to the existence of a non-zero vector that belongs to both M and N^⊥ (see Figure 1.3.1).
Proposition 1.3.2 Let X and Y be adjoint bases for M and N (which exist when θ_max < π/2). Then the matrix
P = XY*  (1.3.1)
represents the projection on M parallel to N^⊥ in the canonical basis of ℂⁿ.
PROOF By hypothesis
Y*X = I.  (1.3.2)
Hence
P² = XY*XY* = XY* = P;
so P is a projection. Let x ∈ ℂⁿ; then
Figure 1.3.1
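Proposition 1.3.2 is easy to check numerically: from any pair of adjoint bases X, Y with Y*X = I, the matrix P = XY* is idempotent and acts as the identity on M = Im X. A minimal NumPy sketch; the matrices here are randomly generated for illustration, not taken from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 2

# Basis X of the subspace M, and an auxiliary matrix W spanning N.
X = rng.standard_normal((n, r))
W = rng.standard_normal((n, r))

# Adjoint basis Y chosen so that Y*X = I (possible when W*X is invertible).
Y = W @ np.linalg.inv(W.T @ X).T          # Y = W (W*X)^{-*}
assert np.allclose(Y.T @ X, np.eye(r))    # adjointness (1.3.2)

P = X @ Y.T                               # projection (1.3.1): P = XY*
assert np.allclose(P @ P, P)              # idempotent: P^2 = P
assert np.allclose(P @ X, X)              # identity on M = Im X
```

Unless W is chosen orthogonal to X, this P is an oblique (non-Hermitian) projection, which is exactly the situation of the proposition.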
Proposition 1.4.1
ω(M, N) = max { max_{x ∈ M, ‖x‖₂=1} dist(x, N), max_{y ∈ N, ‖y‖₂=1} dist(y, M) }.
PROOF We shall prove that ‖π_M − π_N‖ < 1 implies that dim M ≤ dim N. Let x₁, …, xᵣ be a basis of M. We shall show that the vectors π_N x₁, …, π_N xᵣ are linearly independent. If not, suppose that
y = Σᵢ αᵢxᵢ.
Corollary 1.4.3
ω(Μ, N) < 1 implies that dim M = dim N.
Theorem 1.4.4 Suppose that dim M = dim N = r < n/2. Then the 2r eigenvalues of π_M − π_N, which are not necessarily zero, are equal to
± sin θᵢ  (i = 1, …, r).
PROOF Let [Q, Q̃] and [U, Ũ] be the bases of ℂⁿ defined in Exercise 1.2.6. Relative to the orthonormal basis [Q, Q̃], the projection π_M is represented by
( Iᵣ  0  0 )
( 0   0  0 )
( 0   0  0 )
and the projection π_N is represented by
(  C )
( −S ) ( C  −S  0 ).
(  0 )
After a permutation this matrix decomposes into r two-by-two blocks
(  sⱼ²    cⱼsⱼ )
(  cⱼsⱼ  −sⱼ²  ),
each of trace 0 and determinant −sⱼ²; the non-zero eigenvalues are therefore ±sⱼ (j = 1, …, r). Since Π = π_M − π_N is symmetric, the sⱼ are also the singular values of Π.
Corollary 1.4.5
ω(M, N) = sin θ_max.
PROOF We remark that π_M − π_N and sin Θ have the same non-zero singular values, whence the result follows at once.
where θ_max^{(k)} is the maximal acute angle between Mₖ and M, or, again, if and only if
sin θ_max^{(k)} → 0,
or, equivalently, if and only if
‖π_{Mₖ} − π_M‖₂ → 0,
that is
PROOF By hypothesis,
ω(Mₖ, M) = ‖π_{Mₖ} − π_M‖₂ → 0.
Thus, for sufficiently great k,
‖π_{Mₖ} − π_M‖₂ < 1.
The result now follows from Corollary 1.4.3.
Theorem 1.5.2 Let Qₖ and Q be orthonormal bases for the subspaces Mₖ and M respectively (k = 1, 2, …). Without loss of generality, assume that
dim Mₖ = dim M = r.
Then
Mₖ → M as k → ∞
if and only if there exists a sequence of unitary matrices
Uₖ  (k = 1, 2, …)
of order r such that
QₖUₖ → Q as k → ∞.
PROOF
(a) Assume that Mₖ → M so that π_{Mₖ} → π_M, or, in terms of the corresponding matrices,
QₖQₖ* − QQ* → 0.
Multiplying on the right by Q and using the fact that Q*Q = I, we obtain
QₖCₖ − Q → 0,
where
Cₖ = Qₖ*Q.
Our hypothesis implies that Cₖ*Cₖ → I; in particular
|det Cₖ|² → 1.
Hence, for sufficiently great values of k, the Hermitian matrix Cₖ*Cₖ is positive definite. Every positive definite Hermitian matrix possesses at least one positive definite square root (see Horn and Johnson, 1990, p. 405). We shall denote a positive definite square root of Cₖ*Cₖ by (Cₖ*Cₖ)^{1/2} and its inverse by (Cₖ*Cₖ)^{−1/2}.
It is easy to verify that
Uₖ = Cₖ(Cₖ*Cₖ)^{−1/2}
is unitary; indeed,
UₖUₖ* = Cₖ(Cₖ*Cₖ)⁻¹Cₖ* = I.
In accordance with our assumption, all eigenvalues of Cₖ*Cₖ tend to unity as k → ∞. The same is true for (Cₖ*Cₖ)^{1/2}; hence
(Cₖ*Cₖ)^{1/2} → I as k → ∞.
We have
Qₖ(Cₖ − Uₖ) = QₖUₖ[(Cₖ*Cₖ)^{1/2} − I].
Since Qₖ and Uₖ remain bounded in the spectral norm, it follows that
QₖUₖ → Q,
as required.
(b) Conversely, suppose there exist unitary matrices Uₖ (k = 1, 2, …) such that
QₖUₖ → Q.
We deduce that
(QₖUₖ)(QₖUₖ)* = QₖQₖ* → QQ*,
which is equivalent to Mₖ → M.
PROOF
(a) By Theorem 1.5.2 there exists a sequence {Uₖ} of unitary matrices of order r such that
QₖUₖ → Q.
Every unitary matrix is of spectral norm unity. Hence {Uₖ} is a bounded sequence (in the spectral norm) and therefore possesses a convergent subsequence, say
Uₗ → U,
where l runs through an increasing sequence of positive integers. Evidently, U is a unitary matrix. We now have that
QₗUₗ → Q and Qₗ → QU* = V,
say, whence
V*V = I;
so V is an orthonormal basis for M.
(b) This follows at once by putting
Vₖ = QₖUₖ.
By Theorem 1.5.2 we have Vₖ → Q, as required.
Proposition 1.5.4 Suppose that the subspaces Mₖ (k = 1, 2, …) and M are equipped with orthonormal bases Qₖ and Q respectively. Let Xₖ and X be arbitrary bases for Mₖ and M. Then the existence of unitary matrices Uₖ such that QₖUₖ → Q is equivalent to the existence of invertible matrices Fₖ such that XₖFₖ → X.
PROOF
{ ( cos(1/ε) )   ( −sin(1/ε) ) }
{ ( sin(1/ε) ),  (  cos(1/ε) ) }   (1.5.3)
where ε is real, has no limit when ε tends to zero. The subsequence obtained by putting εₖ = 1/(2kπ) does converge when k tends to infinity. Indeed, (1.5.3) is equal to {e₁, e₂} for k = 1, 2, …. We remark that (1.5.3) consists of the eigenvectors of the matrix
A(ε) = ( 1 + ε cos(2/ε)    ε sin(2/ε)     )
       ( ε sin(2/ε)        1 − ε cos(2/ε) )
corresponding to the eigenvalues 1 + ε and 1 − ε. When ε tends to zero, the matrix A(ε) tends to I and both eigenvalues tend to 1, the double eigenvalue of I.
(xᵢ*)* xⱼ = δᵢⱼ  (i, j = 1, …, n).  (1.6.2)
Now Axᵢ = μᵢxᵢ (i = 1, …, n) can be written
AX = XD or A = XDX⁻¹.
Thus A is diagonalisable. Now AX = XD is equivalent to X⁻¹A = DX⁻¹; so
A*xᵢ* = μ̄ᵢxᵢ*  (i = 1, …, n).
The xᵢ* are the eigenvectors of A* normalized by the relations (1.6.2). Moreover, X⁎ = [xᵢ*] is the adjoint basis of X = [xᵢ] and X⁎*X = I is equivalent to X⁎* = X⁻¹. Conversely, if A is diagonalisable, it possesses n linearly independent eigenvectors. We leave the proof to the reader.
Thus A is diagonalisable if and only if its eigenvalues are semi-simple. In
that case A, too, is termed semi-simple. When A is not diagonalisable, it is called
defective.
Theorem 1.6.2 There exists a unitary matrix Q such that Q*AQ is an upper
triangular matrix whose diagonal elements are the eigenvalues μ₁, …, μₙ in this
order.
( μ₁   x₁*AU )
( 0    U*AU  )
The eigenvalues of U*AU are μ₂, …, μₙ. The result follows by induction.
The chosen order of the eigenvalues {μᵢ} determines the Schur basis Q apart
from a unitary block-diagonal matrix (see Exercise 1.6.5).
Theorem 1.6.3 Let A be an m by n matrix and put q = min(m, n). There exist unitary matrices U and V of orders m and n respectively such that U*AV = diag(σᵢ) is of order m by n and σ₁ ≥ σ₂ ≥ ⋯ ≥ σ_q ≥ 0. [This means that U*AV = (sᵢⱼ) where sᵢᵢ = σᵢ (i = 1, …, q) and sᵢⱼ = 0 otherwise.]
PROOF There are vectors x ∈ ℂⁿ and y ∈ ℂᵐ such that ‖x‖₂ = ‖y‖₂ = 1 and Ax = σ₁y, where
σ₁ = ‖A‖₂.
We construct unitary matrices U and V of the form
U = [y, U₁] and V = [x, V₁].
Then
A₁ = U*AV = ( σ₁  w* )
            ( 0   B  ),
where
w* = y*AV₁ and B = U₁*AV₁.
Let φ = (σ₁² + w*w)^{−1/2}. The vector
u = φ ( σ₁ )
      ( w  )
is a unit vector such that
A₁u = φ ( σ₁² + w*w )
        ( Bw        ).
We deduce that
‖A₁u‖₂² ≥ φ²(σ₁² + w*w)² = σ₁² + w*w.
However,
σ₁² = ‖A‖₂² = ‖A₁‖₂² ≥ ‖A₁u‖₂².
It follows that σ₁² ≥ σ₁² + w*w, and so w = 0. The proof is now completed by induction by virtue of the fact that ‖A‖₂ ≥ ‖B‖₂.
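Theorem 1.6.3 is the singular value decomposition, available directly in NumPy. A sketch with a random rectangular matrix:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 3
A = rng.standard_normal((m, n))

# U*AV = diag(sigma_i); numpy returns the equivalent A = U diag(s) V*.
U, s, Vh = np.linalg.svd(A, full_matrices=False)

assert np.all(s[:-1] >= s[1:]) and np.all(s >= 0)   # sigma_1 >= ... >= sigma_q >= 0
assert np.allclose(U @ np.diag(s) @ Vh, A)          # reconstruction
assert np.isclose(s[0], np.linalg.norm(A, 2))       # sigma_1 = ||A||_2, as in the proof
```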
Returning to similarity transformations of square matrices and employing
transformations that are not necessarily unitary, we shall put an arbitrary matrix
(diagonalisable or defective) into a particular block-diagonal form, known as
Jordan's* form.
( R₁  S  )
( 0   R₂ ),
where
R₁ = λ₁I + U₁.
There exists a matrix B with the property that
( I  B ) ( R₁  S  ) ( I  −B )   ( R₁  0  )
( 0  I ) ( 0   R₂ ) ( 0   I ) = ( 0   R₂ )
if and only if
S = R₁B − BR₂.  (1.6.3)
It will be shown in Section 1.12 (Proposition 1.12.1) that equation (1.6.3) has a unique solution for B provided that
sp(R₁) ∩ sp(R₂) = ∅,
as is indeed the case in this situation.
Then
Ee₁ = 0,
Eeᵢ₊₁ = eᵢ  (i = 1, …, k − 1),
I − EᵀE = e₁e₁ᵀ.
Lemma 1.6.6 Let U be a strictly upper triangular matrix of order m. Then there exists an invertible matrix Y such that
Y⁻¹UY = N = diag(Eⱼ),
where the block Eⱼ is of order kⱼ and these orders are arranged in decreasing order of magnitude. (When kⱼ = 1, then Eⱼ is the zero matrix of order unity.)
assume that it holds for all strictly upper triangular matrices of order less than m.
Let
By the induction hypothesis we may assume that U has been brought to the form
( 0  u₁ᵀ  u₂ᵀ )
( 0  E₁   0   )
( 0  0    N₂  ),
where the blocks of N₂ are arranged in decreasing order, and
u₁ᵀ(I − E₁ᵀE₁) = u₁ᵀe₁e₁ᵀ = σe₁ᵀ,
say, where
σ = u₁ᵀe₁.
We now have to distinguish two cases. First, when σ ≠ 0, we verify that
( σ⁻¹  0  0    ) ( 0  σe₁ᵀ  u₂ᵀ ) ( σ  0  0  )   ( 0  e₁ᵀ  u₂ᵀ )
(  0   I  0    ) ( 0  E₁    0   ) ( 0  I  0  ) = ( 0  E₁   0   )
(  0   0  σ⁻¹I ) ( 0  0     N₂  ) ( 0  0  σI )   ( 0  0    N₂  ),
where the first two block rows and columns now form the nilpotent block
N = ( 0  e₁ᵀ )
    ( 0  E₁  ),
whose order exceeds that of E₁ by unity. Put
sᵢᵀ = u₂ᵀN₂^{i−1}  (i = 1, 2, …, k₂ + 1).
We observe that
N₂^{k₂} = 0,
because the blocks of N₂ are arranged in decreasing order; on the other hand,
order (N) > order (E₂).
Next, suppose that σ = 0. Then a simple permutation of the rows and columns, which amounts to a similarity transformation, shows that U is similar to
( E₁  0  0   )
( 0   0  u₂ᵀ )
( 0   0  N₂  ),
so U is similar to
( E₁  0   )
( 0   N₂′ ),
where N₂′ has the block-diagonal form required.
Theorem 1.6.7 Let A be a defective matrix of order n with distinct eigenvalues
λ₁, …, λ_d  (d < n).
Then there exists an invertible matrix X such that
X⁻¹AX = diag(Jᵢⱼ),
where
Jᵢⱼ = λᵢI + Eᵢⱼ,
and Eᵢⱼ is the nilpotent matrix of order kᵢⱼ with units on the first superdiagonal and zeros elsewhere (j = 1, …, gᵢ); that is, there are gᵢ blocks Eᵢⱼ corresponding to a particular λᵢ.
The set of Jordan blocks Jᵢⱼ (j = 1, …, gᵢ) associated with the same eigenvalue λᵢ constitutes the Jordan box associated with λᵢ. Its order is
mᵢ = kᵢ₁ + ⋯ + kᵢ_{gᵢ};
it contains gᵢ blocks and so
gᵢ ≤ mᵢ.  (1.6.4)
Let ℓᵢ be the dimension of the largest Jordan block associated with λᵢ. Then
(Bᵢ − λᵢI)^{ℓᵢ} = 0,
for there are no more than ℓᵢ − 1 consecutive units along the first superdiagonal of Bᵢ. As always, let λ₁, …, λ_d be the distinct eigenvalues of A.
Theorem 1.6.7 shows that
ℂⁿ = ⊕ᵢ₌₁ᵈ Mᵢ,
where
Mᵢ = Ker (A − λᵢI)^{ℓᵢ};
we then have
dim Mᵢ = mᵢ,
which is the algebraic multiplicity of λᵢ. We call ℓᵢ the index of λᵢ.
It can be proved that the Jordan form is unique apart from the arrangements of the blocks along the diagonal.
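Because the Jordan form is discontinuous in the matrix entries, it should be computed in exact arithmetic; SymPy's Matrix.jordan_form does this. A sketch, assuming SymPy is available; the 4 by 4 integer matrix is an arbitrary defective example, not one from the book:

```python
from sympy import Matrix

# A defective integer matrix (exact arithmetic keeps the Jordan form meaningful).
A = Matrix([[5, 4, 2, 1],
            [0, 1, -1, -1],
            [-1, -1, 3, 0],
            [1, 1, -1, 2]])

P, J = A.jordan_form()        # A = P J P^{-1}, J block-diagonal Jordan form

assert A == P * J * P.inv()   # exact similarity
# J is upper triangular; its first superdiagonal holds only 0s and 1s.
assert all(J[i, j] == 0 for i in range(4) for j in range(4) if j < i)
assert all(J[i, i + 1] in (0, 1) for i in range(3))
```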
For example, for an eigenvalue λ of algebraic multiplicity 3 the possible Jordan boxes are
( λ 0 0 )   ( λ 1 0 )   ( λ 1 0 )
( 0 λ 0 ),  ( 0 λ 0 ),  ( 0 λ 1 ).
( 0 0 λ )   ( 0 0 λ )   ( 0 0 λ )
This is the spectral projection associated with λᵢ and is illustrated by the following diagram:
    ( 0  0   0 )
X   ( 0  Bᵢ  0 )  X⁻¹
    ( 0  0   0 )
where
A* = Σᵢ₌₁ᵈ (λ̄ᵢPᵢ* + Dᵢ*).
PROOF The proof is immediate, as can be seen from the following example:
J = ( λ 1 0 0 )        ( λ̄ 0 0 0 )
    ( 0 λ 0 0 )        ( 1 λ̄ 0 0 )
    ( 0 0 λ 1 ),  J* = ( 0 0 λ̄ 0 ) = PJ′P,
    ( 0 0 0 λ )        ( 0 0 1 λ̄ )
where
J′ = ( λ̄ 1 0 0 )       ( 0 1 0 0 )
     ( 0 λ̄ 0 0 )       ( 1 0 0 0 )
     ( 0 0 λ̄ 1 ),  P = ( 0 0 0 1 ).
     ( 0 0 0 λ̄ )       ( 0 0 1 0 )
Generally, the determination of P is given in Exercise 1.6.16.
We deduce from Proposition 1.7.3 that λᵢ and λ̄ᵢ have the same multiplicities and indices in A and A* respectively.
The matrix
A = ( 1  1 )
    ( 4  1 )
has the eigenvalues λ₁ = 3 and λ₂ = −1; the corresponding eigenvectors can be taken to be
x₁ = ( 1 )    x₂ = (  1 )
     ( 2 ),        ( −2 ).
The matrix A* has the eigenvectors
x₁* = ¼ ( 2 )    x₂* = ¼ (  2 )
        ( 1 ),           ( −1 ),
normalized so that (xᵢ*)* xⱼ = δᵢⱼ.
Lemma 1.7.4 Let (X, Y) and (X, Y') be two pairs of adjoint bases in (M,N) and
(M, N′) respectively, where M is an invariant subspace for A. Then
B = Y*AX = Y′*AX.
PROOF
Lemma 1.8.1 The rank of X is equal to the number of non-zero singular values of X.
Proposition 1.8.2 Let r(X) = m. Then there exist an n × m orthonormal matrix Q and an upper triangular invertible matrix R such that
X = QR.  (1.8.1)
PROOF The proof can be found in Horn and Johnson (1990, p. 112).
The formula (1.8.1) is called the Schmidt, or QR, factorization of X. Several algorithms exist for obtaining the factorization (1.8.1). We mention the methods of Gram,* Schmidt, the modified Gram–Schmidt processes and those of Householder and Givens (see Golub and Van Loan, 1989, pp. 146–62).
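Of the algorithms mentioned, the modified Gram–Schmidt process is the easiest to write down. A sketch in NumPy; this is the textbook MGS loop, not code from the book:

```python
import numpy as np

def mgs_qr(X):
    """QR factorization X = QR by modified Gram-Schmidt (X of full column rank)."""
    n, m = X.shape
    Q = X.astype(float).copy()
    R = np.zeros((m, m))
    for k in range(m):
        R[k, k] = np.linalg.norm(Q[:, k])
        Q[:, k] /= R[k, k]
        # Orthogonalize the remaining columns against q_k immediately.
        R[k, k + 1:] = Q[:, k] @ Q[:, k + 1:]
        Q[:, k + 1:] -= np.outer(Q[:, k], R[k, k + 1:])
    return Q, R

rng = np.random.default_rng(5)
X = rng.standard_normal((7, 4))
Q, R = mgs_qr(X)

assert np.allclose(Q.T @ Q, np.eye(4))      # Q orthonormal
assert np.allclose(np.triu(R), R)           # R upper triangular
assert np.allclose(Q @ R, X)                # X = QR
```

The "modified" variant updates the remaining columns as soon as each qₖ is formed, which is numerically more stable than the classical process.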
Proposition 1.8.3 If r(X) = r < m, there exists a permutation matrix Π such that
XΠ = QR,
where Q is orthonormal,
Definitions Let
σ₁ ≥ σ₂ ≥ ⋯ ≥ σₘ
be the singular values of X and let ε be a given positive number.
(a) The matrix X is said to be of ε-rank r if exactly r singular values satisfy
σᵢ ≥ ε  (i = 1, …, r).
(b) The m column vectors of X are dependent within ε if there exists an invertible matrix B of order m such that XB is of ε-rank less than m. A matrix X which is of rank m but of ε-rank r is said to have a numerical rank equal to r.
Then
μₖ = min_{Vₖ} max {x*Ax : x ∈ Vₖ, x*x = 1}
(k = 1, …, n), where Vₖ ranges over all k-dimensional subspaces of ℂⁿ.
PROOF See, for example, Ciarlet (1989, Theorem 1.3.1, p. 16). The max-min
characterization, which is due to Courant* and Weyl*, is demonstrated in Exercise 1.9.5.
The number
R(x) = x*Ax / (x*x),
defined for x ≠ 0, is called the Rayleigh* quotient of A for the vector x. This
number plays a very important part in the calculation of the eigenvalues of
Hermitian matrices. In particular, it is an immediate consequence of Theorem
1.9.1 that
μ₁ ≤ x*Ax ≤ μₙ,
where x is any vector satisfying x*x = 1.
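The bounds μ₁ ≤ x*Ax ≤ μₙ for unit vectors, and their attainment at the extreme eigenvectors, are easy to observe numerically. A sketch with a random symmetric matrix (NumPy; the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2                     # a random symmetric (Hermitian) matrix

mu = np.linalg.eigvalsh(A)            # eigenvalues mu_1 <= ... <= mu_n

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)      # Rayleigh quotient x*Ax / x*x

for _ in range(100):
    x = rng.standard_normal(6)
    q = rayleigh(A, x)
    assert mu[0] - 1e-12 <= q <= mu[-1] + 1e-12

# The extremes are attained at the corresponding eigenvectors.
w, V = np.linalg.eigh(A)
assert np.isclose(rayleigh(A, V[:, 0]), mu[0])
assert np.isclose(rayleigh(A, V[:, -1]), mu[-1])
```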
1.10 N O N - N E G A T I V E MATRICES
A matrix is said to be non-negative if all its elements are either positive or zero.
Such matrices occur in the numerical treatment of partial differential equations,
in the theory of probability, in physics, chemistry and economics (see Chapter 3).
We quote the principal result, which is known as the Perron*–Frobenius
theorem; it is concerned with what are called irreducible matrices.
This notion is defined as follows: an n by n matrix A = (aᵢⱼ) is said to be reducible if the indices 1, 2, …, n can be split into two non-empty disjoint sets
i₁, …, iᵣ; j₁, …, jₛ  (r + s = n)
in such a way that
a_{iα jβ} = 0  (α = 1, …, r; β = 1, …, s).
If such a splitting is impossible the matrix is said to be irreducible.
PROOF The reader is referred to the book by Varga (1962, p. 30) for the
definitions and the proof or to the book by Gantmacher (1960, Vol. II, Ch. 13)
or to Horn and Johnson (1990, Ch. 8).
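For a non-negative matrix, irreducibility can be tested by checking that (I + A)^{n−1} has no zero entry, and the Perron–Frobenius theorem then guarantees an eigenvalue of modulus ρ(A) with a positive eigenvector. A sketch; the 3 by 3 example is ours, not taken from the book:

```python
import numpy as np

A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 1., 0.]])          # non-negative and irreducible

n = A.shape[0]
# Irreducibility <=> the associated graph is strongly connected <=> (I + A)^(n-1) > 0.
reach = np.linalg.matrix_power(np.eye(n) + A, n - 1)
assert np.all(reach > 0)

w, V = np.linalg.eig(A)
k = np.argmax(np.abs(w))              # index of the Perron root: |lambda| = rho(A)
assert np.isclose(w[k].imag, 0)       # the Perron root is real

perron_root, x = w[k].real, V[:, k].real
x = x if x.sum() > 0 else -x          # fix the sign of the eigenvector

assert np.all(x > 0)                              # positive Perron vector
assert np.allclose(A @ x, perron_root * x)
```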
1.12.1 Block-diagonalisation of T = ( A  C )
                                    ( 0  B )
Let A and B be square matrices of orders n and r respectively. Suppose that the n × r matrix Z is a solution of Sylvester's* equation:
AZ − ZB = C.  (1.12.1)
It is easy to verify that, if
T = ( A  C )
    ( 0  B ),
then
( I  Z ) ( A  C ) ( I  −Z )   ( A  0 )
( 0  I ) ( 0  B ) ( 0   I ) = ( 0  B ).
The matrices
( Iₙ )        ( Iₙ )
( 0  )  and   ( Z* )
are bases for the right and left invariant subspaces of T respectively, both being associated with the matrix A. The corresponding spectral projection is given by
Let
Z = [z₁, …, z_r],
where z₁, …, z_r are the columns of Z; then
vec Z = (z₁; …; z_r) ∈ ℂ^{nr}.
Put
z = vec Z, c = vec C.
Then
AZ − ZB = C
is equivalent to
𝒜z = c, (1.12.2)
where
𝒜 = I_r ⊗ A − Bᵀ ⊗ I_n,
the symbol ⊗ denoting the tensor product. Explicitly, the (i, j)th n×n block of 𝒜 is δ_{ij}A − b_{ji}I_n:
𝒜 = (A − b₁₁I_n ⋯ −b_{r1}I_n; ⋮ ⋱ ⋮; −b_{1r}I_n ⋯ A − b_{rr}I_n). (1.12.3)
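Equation (1.12.2) can be formed directly with Kronecker products; column-major ('F'-order) flattening in NumPy is exactly the vec operation above (a sketch; names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 4, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((r, r)) + 5 * np.eye(r)   # keeps sp(A), sp(B) apart
C = rng.standard_normal((n, r))

vec = lambda M: M.flatten(order="F")   # stacks the columns, as in the text

# Equation (1.12.2):  (I_r (x) A  -  B^T (x) I_n) vec Z = vec C
K = np.kron(np.eye(r), A) - np.kron(B.T, np.eye(n))
Z = np.linalg.solve(K, vec(C)).reshape((n, r), order="F")
residual = np.linalg.norm(A @ Z - Z @ B - C)
```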
Further, let
z′ = vec Z′ and c′ = vec C′.
Then equation (1.12.5) is equivalent to
𝒯′z′ = c′,
where
𝒯′ = I_r ⊗ A − T̄ ⊗ I_n = (A − μ₁I_n 0 ⋯; −t₁₂I_n A − μ₂I_n ⋯; ⋮ ⋱; ⋯ A − μ_rI_n)
is a block-triangular matrix. More explicitly, we can write the equation as the
system
(A − μ₁I)z′₁ = c′₁,
(A − μ₂I)z′₂ = c′₂ + t₁₂z′₁, (1.12.6)
⋮
(A − μ_rI)z′_r = c′_r + Σ_{i=1}^{r−1} t_{ir}z′_i.
This system may be solved recursively for z′₁, …, z′_r provided that each of the
matrices
A − μ₁I_n, …, A − μ_rI_n
is invertible. The original unknowns can then be determined with the aid of
the equation
Z = Z′Q*.
Hence we have the following proposition.
1.12.4 Algorithms
Two algorithms are used in practice to solve equations (1.12.6):
(a) The algorithm of Bartels and Stewart (1972) which utilizes the Schur form
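The Bartels–Stewart algorithm reduces to triangular form so that a recursion of the type (1.12.6) applies. A minimal NumPy sketch for the case where B is already upper triangular (the function name is ours; each step requires A − b_{jj}I to be invertible, i.e. sp(A) and sp(B) disjoint):

```python
import numpy as np

def sylvester_triangular(A, B, C):
    """Solve A Z - Z B = C column by column when B is upper triangular,
    the recursion behind equations (1.12.6)."""
    n, r = C.shape
    Z = np.zeros((n, r))
    for j in range(r):
        rhs = C[:, j] + Z[:, :j] @ B[:j, j]
        Z[:, j] = np.linalg.solve(A - B[j, j] * np.eye(n), rhs)
    return Z

rng = np.random.default_rng(3)
n, r = 5, 3
A = rng.standard_normal((n, n))
B = np.triu(rng.standard_normal((r, r))) + 5 * np.eye(r)
C = rng.standard_normal((n, r))
Z = sylvester_triangular(A, B, C)
residual = np.linalg.norm(A @ Z - Z @ B - C)
```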
PROOF The expression for 𝒯′ makes it plain that its eigenvalues, and therefore
those of 𝒯, are λ_i − μ_j, where λ_i ∈ sp(A) and μ_j ∈ sp(B). Hence ρ(𝒯^{−1}) = δ^{−1}
and ρ(𝒯^{−1}) ≤ ‖𝒯^{−1}‖ for any induced norm ‖·‖ (see Exercise 1.1.4).
Next, if A and B are normal, so is 𝒯; this may be deduced from (1.12.3) by
showing that 𝒯*𝒯 = 𝒯𝒯*, provided that A*A = AA* and B*B = BB*.
A normal matrix can be diagonalised by a unitary similarity transformation.
Since the norm ‖·‖₂ is invariant under unitary transformations we may assume
for the present purpose that 𝒯 is in diagonal form. It now becomes plain that
ρ(𝒯^{−1}) = ‖𝒯^{−1}‖₂ = δ^{−1}.
1.12.5.1 Case r = 1
Let δ = dist[b, sp(A)] = min_{λ∈sp(A)} |b − λ| > 0. The Jordan form of A − bI leads to blocks
G = (μ − b 1; ⋱ 1; 0 μ − b)
of order less than or equal to the index ℓ_μ of the eigenvalue μ ∈ sp(A). For such
a block, the least singular value is the square root of the least eigenvalue of
H = G*G. Put ε = μ − b, where |ε| ≥ δ. Then
H = (|ε|² ε̄; ε 1 + |ε|² ⋱; ⋱ ε̄; ε 1 + |ε|²)
is tridiagonal. We wish to obtain a lower bound for the eigenvalues of H. Now
det H = |det G|² = |ε|^{2ℓ},
where ℓ is the order of the block. By Gershgorin's theorem (Theorem 4.5.1) the eigenvalues α_i of H satisfy
0 < α_i ≤ 1 + |ε|² + 2|ε| = (1 + |ε|)².
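A quick numerical check of both facts about H (a sketch; the parameter values are ours):

```python
import numpy as np

eps, m = 0.3, 4
# G = eps*I + nilpotent shift; H = G*G is the tridiagonal matrix of the text
G = eps * np.eye(m) + np.diag(np.ones(m - 1), 1)
H = G.conj().T @ G
alphas = np.linalg.eigvalsh(H)

# det H = |det G|^2 = |eps|^(2m); Gershgorin gives alpha_i <= (1 + |eps|)^2
det_ok = np.isclose(np.linalg.det(H), abs(eps) ** (2 * m))
upper = (1 + abs(eps)) ** 2
```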
Therefore
‖(A − bI)^{−1}‖₂ ≤ cond₂(X) max_{μ∈sp(A)} [ (1 + |ε|)^{ℓ_μ−1} |ε|^{−ℓ_μ} ]. (1.12.7)
The graph of the function
|ε| ↦ (1/|ε|) ((1 + |ε|)/|ε|)^{ℓ−1}
is given in Figure 1.12.1, when ℓ > 1. This is an increasing function with respect
to the exponent ℓ.
If we confine ourselves to values of |ε| less than unity, we have
(1/|ε|) ((1 + |ε|)/|ε|)^{ℓ−1} ≤ (1/δ) ((1 + δ)/δ)^{L−1},
where
L = max [ ℓ_μ : μ ∈ sp(A), |ε| < 1 ].
When δ is sufficiently small, the maximum that appears in (1.12.7) is attained
at δ (= |ε|), the term in δ^{−L} being dominant.
When A is normal, cond₂(X) = 1. Exercise 1.6.19 shows that cond₂(X)
increases as a function of ν(A) = ‖AA* − A*A‖_F, the departure from normality
of A. Similarly, it is seen in Exercise 1.6.20 that ‖N‖_F increases simultaneously
with ν(A), where N is the strictly upper triangular part of the Schur form of A.
Consider the matrix
A = (a c; 0 b).
Figure 1.12.1
When a ≠ b, a basis of eigenvectors is given by the matrix
X = (1 1; 0 (b − a)/c),
and when a = b, the Jordan basis is given by the matrix
X = (1 0; 0 1/c).
It can be verified that cond₂(X) → ∞ when |c| → ∞.
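This growth of cond₂(X) with |c| is easy to observe numerically (a sketch; names ours):

```python
import numpy as np

def cond2(X):
    s = np.linalg.svd(X, compute_uv=False)
    return s[0] / s[-1]

a, b = 1.0, 2.0
conds = []
for c in [1.0, 10.0, 100.0, 1000.0]:
    A = np.array([[a, c], [0.0, b]])
    X = np.array([[1.0, 1.0], [0.0, (b - a) / c]])   # eigenvector basis
    D_last = np.linalg.solve(X, A @ X)               # X^{-1} A X, diagonal
    conds.append(cond2(X))
```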
1.12.5.2 Case r > 1
‖(A, B)^{−1}‖_F depends on cond₂(X), cond₂(V) and on δ = min dist[sp(A), sp(B)],
where X and V are the Jordan bases (bases of eigenvectors) of A and B
respectively. Let
B = (a c; 0 b).
Since
𝒯 = (A − aI_n 0; −cI_n A − bI_n),
it is clear that cond₂(𝒯) depends on |c| and on cond₂(A − μI), where μ = a or
b. We shall examine a particular case. Let
A = (1 α; 1 α; ⋱ ⋱; 1 α; 1)
be upper bidiagonal of order 6. We note that |α| is related to cond₂(X) and |c| is related to cond₂(V);
let δ = min(|a − 1|, |b − 1|). Table 1.12.1 shows the dependence of y = ‖(A, B)^{−1}‖_F.
Table 1.12.1.
α: −1, −5, −10, −15
y: 5×10⁵, 2×10⁸, 2.3×10⁸, 7×10⁸
c: 0, 1, 10, 100
y: 1.6×10⁴, 5×10⁵, 5×10⁶, 5×10⁷
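The qualitative behaviour of the table — y grows both with the non-normality parameter |α| and with the coupling |c| — can be reproduced with a rough NumPy sketch (function names and parameter values are ours, not those used for the table):

```python
import numpy as np

def inv_norm(A, B):
    """||T^{-1}||_F for T = I (x) A - B^T (x) I, the quantity y above."""
    n, r = A.shape[0], B.shape[0]
    T = np.kron(np.eye(r), A) - np.kron(B.T, np.eye(n))
    return np.linalg.norm(np.linalg.inv(T))

def bidiag(alpha, n=6):
    # A of the text: ones on the diagonal, alpha on the superdiagonal
    return np.eye(n) + alpha * np.diag(np.ones(n - 1), 1)

a, b = 1.1, 1.2                       # sp(B) close to sp(A) = {1}
ys_alpha = [inv_norm(bidiag(al), np.array([[a, 0.0], [0.0, b]]))
            for al in (-1.0, -5.0, -10.0, -15.0)]
ys_c = [inv_norm(bidiag(-1.0), np.array([[a, c], [0.0, b]]))
        for c in (0.0, 1.0, 10.0, 100.0)]
```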
Example 1.13.1
(a) A = (1 2; 0 3), B = (1 0; 0 0), so that sp[A, B] = {1};
(b) A = (1 2; 0 3), B = (0 1; 0 0), so that sp[A, B] = ∅;
(c) A = (1 2; 0 0), B = (1 0; 0 0), so that sp[A, B] = ℂ.
In applications, A and B are often symmetric. Without further assumptions
about A and B, the spectrum sp[A, B] may be complex.
Example 1.13.2
EXERCISES
1.1.4 [A] Prove that:
‖A‖₂ ≤ ‖A‖_F ≤ √n ‖A‖₂, ∀A ∈ ℂ^{n×n},
‖A‖₂ ≤ ‖A‖_F ≤ √(r(A)) ‖A‖₂, ∀A ∈ ℂ^{n×n},
where r(A) denotes the rank of A,
every λ ∈ sp(A) is of the form (u*Au)/(u*u) for some u ∈ ℂⁿ\{0},
ρ(A) ≤ ‖A‖, ∀A ∈ ℂ^{n×n}, ∀ induced norms.
x
1.1.5 [B:8,11] Let A ∈ ℂ^{n×n} be Hermitian with spectrum
sp(A) = {λ₁, …, λ_d}.
Prove that:
(a) λ_i ∈ ℝ (i = 1, …, d).
(b) The singular values of A are σ_i = |λ_i| (i = 1, …, d).
(c) There exists an orthonormal basis of ℂⁿ that consists of the eigenvectors of A.
(d) ‖A‖₂ = ρ(A).
1.1.6 [A] Show that, for all A in ℂ^{n×r},
‖A‖₂² = ρ(A*A).
1.1.7 [A] Let Q ∈ ℂ^{n×r} and suppose that the columns of Q form an orthonormal
system. Prove that ‖Q‖₂ = 1. Deduce that, if Q is a unitary matrix, then
cond₂(Q) = 1.
1.1.8 [A] Prove that if A is a singular matrix, then at least one of its singular
values is equal to zero.
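The facts asserted in Exercises 1.1.5 and 1.1.8 are immediate to verify numerically (a sketch; the test matrices are ours):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                          # Hermitian (real symmetric)
lam = np.linalg.eigvalsh(A)
sig = np.linalg.svd(A, compute_uv=False)   # decreasing order

# (b) the singular values are the |lambda_i|; (d) ||A||_2 = rho(A)
svals_match = np.allclose(np.sort(np.abs(lam)), np.sort(sig))
norm_is_rho = np.isclose(np.linalg.norm(A, 2), np.max(np.abs(lam)))

# 1.1.8: a singular matrix has at least one zero singular value
B = np.outer([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])   # rank one, hence singular
smin = np.linalg.svd(B, compute_uv=False)[-1]
```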
σ_i = sup_{dim V = i} inf { ((x*A*Ax)/(x*x))^{1/2} : x ∈ V, x ≠ 0 },
tr(AB) ≤ Σ_i σ_i(A) σ_i(B),
dim M = dim N ^ -
2
and let 0max be the greatest canonical angle between M and N. Show that if
#max < π /2, then MnN1 contains at least one non-zero vector.
are given by
2
Put
Θ = diag(θ₁, …, θ_r).
Let
T = YY*X.
Prove that T is a basis of N and that the singular values of T and of T − X
are cos θ_i and sin θ_i (i = 1, …, r) respectively.
1.2.4 [B:6] Let M and N be subspaces of ℂⁿ such that
Let θ₁ ≤ θ₂ ≤ ⋯ ≤ θ_r be the canonical angles between M and N. Show that
cos θ_{r−i+1} = |y_i* x_i| (i = 1, …, r).
1.2.5 [D] Let θ₁ ≥ ⋯ ≥ θ_r be the canonical angles between the subspaces M
and N of dimension r. Show that
sin θ₁ = max_{x∈M} min_{y∈N} { ‖x − y‖₂ : x*x = 1 }.
1.2.6 [A] Let Θ be the diagonal matrix of the canonical angles between the
subspaces M and N of ℂⁿ of dimension r ≤ n/2. Put C = cos Θ and S = sin Θ.
Prove that there exist orthonormal bases Q of M, Q̃ of M^⊥, U of N and Ũ of
N^⊥ such that
[Q Q̃]*[U Ũ] = (C −S 0; S C 0; 0 0 I_{n−2r}).
be the canonical angles between them, where dim M = dim N ≤ n/2. Prove that if Π_M and Π_N are the
orthogonal projections on M and N respectively, then
‖Π_M − Π_N‖₂ = sin θ_max.
Deduce that amin is the distance of A from the nearest singular matrix.
1.6.8 [B:9] Let A ∈ ℂ^{m×n}, r(A) = r and D = diag(σ₁, …, σ_r), where the σ_i are the
non-zero singular values of A, and let U and V be the matrices of the SVD:
1.6.9 [D] Let ℓ be the order of the greatest Jordan block associated with an
eigenvalue λ of A ∈ ℂ^{n×n}. Prove that if ℓ > 1, then
i ≤ ℓ ⇒ Ker(A − λI)^{i−1} ⊂ Ker(A − λI)^i,
the inclusion being strict, and
i ≥ ℓ ⇒ Ker(A − λI)^i = Ker(A − λI)^{i+1} = M,
where M is the maximal invariant subspace under A associated with λ.
1.6.10 [B:11] By using the theorem on the Jordan form establish the following
results:
ρ(A) = inf_{k≥1} ‖A^k‖^{1/k} = lim_{k→∞} ‖A^k‖^{1/k},
lim_{k→∞} A^k = 0 ⟺ ρ(A) < 1.
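Both limits can be observed directly (a sketch; the scaling trick fixes ρ(A) = 1/1.1 < 1):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
A /= 1.1 * np.max(np.abs(np.linalg.eigvals(A)))   # now rho(A) = 1/1.1
rho = np.max(np.abs(np.linalg.eigvals(A)))

# ||A^k||^{1/k} -> rho(A), and A^k -> 0 since rho(A) < 1
approx = np.linalg.norm(np.linalg.matrix_power(A, 200), 2) ** (1.0 / 200)
tail = np.linalg.norm(np.linalg.matrix_power(A, 400))
```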
Prove that
a_{n−1} = −tr A and a₀ = (−1)ⁿ det A.
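Both coefficient identities check out against NumPy's characteristic polynomial, whose convention det(λI − A) is monic (a sketch):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
A = rng.standard_normal((n, n))
coeffs = np.poly(A)          # monic characteristic polynomial det(lambda*I - A)
a_n1 = coeffs[1]             # coefficient of lambda^{n-1}
a_0 = coeffs[-1]             # constant term
```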
1.6.16 [A] Show that every Jordan block J is similar to J*:
J* = P^{−1}JP,
where P is a permutation matrix. Determine P.
1.6.17 [B:8] This exercise furnishes an alternative proof of the Jordan form.
Let L ∈ ℂ^{n×n} be a nilpotent matrix of index ℓ, that is
L^{ℓ−1} ≠ 0, but L^ℓ = 0.
Define
M_i = Ker L^i, N_i = Im L^i, L⁰ = I.
(a) Show that
M_i ⊂ M_{i+1} (strict inclusion)
when i = 0, 1, …, ℓ − 1.
(b) Prove that there exists a basis of ℂⁿ in which L is represented by the block-diagonal matrix
J = diag(N₁^{(ℓ)}, …, N_{p_ℓ}^{(ℓ)}, …, N₁^{(1)}, …, N_{p₁}^{(1)}),
where
N₁^{(j)} = N₂^{(j)} = ⋯ = N_{p_j}^{(j)} = (ν_{αβ}) ∈ ℂ^{j×j},
ν_{αβ} = 1 if β = α + 1, and 0 otherwise,
when j = 2, 3, …, ℓ, and with the convention that the blocks N₁^{(j)}, …, N_{p_j}^{(j)} are
omitted when p_j = 0.
(c) Let A ∈ ℂ^{n×n} be an arbitrary matrix. Prove that A can be represented by a
matrix
(A₁ 0; 0 B₁),
where A₁ is nilpotent and B₁ is regular.
(d) Let sp(A) = {λ₁, …, λ_d} be the spectrum of A. Prove that A can be represented
by a block-diagonal matrix
(A₁ 0; ⋱; 0 A_d),
where A_i − λ_iI_{m_i} is nilpotent, m_i being the algebraic multiplicity of λ_i.
(e) Deduce the existence of the Jordan form.
1.6.18 [D] Prove that the Jordan form is unique apart from the ordering of
the diagonal blocks.
1.6.19 [A] Suppose that A is diagonalisable by X:
D = X^{−1}AX,
and that Q is a Schur basis:
Q*AQ = D + N,
where N is a strictly upper triangular matrix.
Prove the inequality
cond₂²(X) ≥ 1 + ν(A)/(2‖A‖_F²),
where
ν(A) = ‖A*A − AA*‖_F.
1.6.20 [A] Let A = QRQ* be the Schur form of A, where R is an upper triangular
matrix and N its strictly upper triangular part. Establish the bounds
ν²(A)/(4‖A‖_F²) ≤ ‖N‖_F² ≤ ν(A) √((n³ − n)/12),
where
ν(A) = ‖A*A − AA*‖_F.
1.6.21 [D] Prove that two diagonalisable matrices are similar if they have the
same spectrum. What can be said about defective matrices?
1.6.22 [D] Let D be a diagonal matrix of order n and let X be a regular matrix
of order n. Consider the matrix
A= X~lDX.
(a) Determine a similarity transformation Y that diagonalises the matrix
(A I_n; I_n A)
of order 2n.
(b) Prove that Y diagonalises
B = (p(A) q(A); q(A) p(A)),
where p and q are arbitrary polynomials.
(c) Express the eigenvalues of B in terms of those of A.
1.6.23 [D] Determine the connection between the singular values of X and
the Schur factorization of the matrices
X*X, XX*
and
(0 X*; X 0).
Section 1.7 Spectral Decomposition
P_i P_j = δ_{ij} P_j,
D_i D_j = 0 when i ≠ j,
AP_i = P_i A = P_i A P_i = λ_i P_i + D_i,
D_i = (A − λ_i I) P_i.
1.7.3 [D] Let A ∈ ℂ^{n×n}. Prove the existence of a basis in which the nilpotent parts are represented by blocks of the form
Λ = (0 1; 0 0).
Section 1.8 Rank and Linear Independence
1.8.1 [A] Let X ∈ ℂ^{n×m}, where m ≤ n, and suppose that r(X) = r < m. Prove that
there exists a permutation matrix Π such that XΠ = QR, where Q is orthonormal
and
R = (R₁₁ R₁₂; 0 0),
R₁₁ being a regular upper triangular matrix of order r.
1.8.2 [D] Prove that if X = QR, where Q is orthonormal, then
r(X) = r(R) and cond₂(X) = cond₂(R).
1.8.3 [D] Suppose that X = QR, where Q is orthonormal and
R = (R₁₁ R₁₂; 0 R₂₂)
is an upper triangular matrix. If R₁₁ is of order r and σ₁, σ₂, … are the singular
values of X arranged in decreasing order, prove that
1.8.4 [D] Suppose that the ε-rank of the matrix X is equal to r for every
where p = 2 or F.
1.8.5 [B:39] The Householder algorithm is defined as follows. Let A(l) = A be
a given matrix.
(*) Given A^{(k)} = (a^{(k)}_{ij}):
If k = n, STOP.
If k < n, define
α = (|a^{(k)}_{kk}|² + ⋯ + |a^{(k)}_{nk}|²)^{1/2},
H_k = I − 2uu*/(u*u), A^{(k+1)} = H_k A^{(k)}; GOTO (*).
(a) Prove that H_k is symmetric and orthogonal.
(b) Prove that the matrix H_k and the vector u satisfy the following equations:
H₁u = −α e₁,
H_k u = Σ_{j=1}^{k−1} u_j e_j − α e_k if k ≥ 2.
(d) Let
R = H_{n−1}H_{n−2}⋯H₂H₁A,
Q = H₁H₂⋯H_{n−2}H_{n−1}.
Prove that R is an upper triangular matrix and that Q is orthogonal.
(e) Prove that A = QR.
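A minimal runnable sketch of the Householder algorithm of Exercise 1.8.5 (real case; the sign choice for α avoids cancellation; the function name is ours):

```python
import numpy as np

def householder_qr(A):
    """QR factorization by Householder reflections H_k = I - 2 u u*/(u*u)."""
    A = A.astype(float).copy()
    n = A.shape[0]
    Q = np.eye(n)
    for k in range(n - 1):
        x = A[k:, k]
        alpha = -np.sign(x[0]) * np.linalg.norm(x)   # avoids cancellation
        u = x.copy()
        u[0] -= alpha
        if np.linalg.norm(u) < 1e-15:
            continue
        u /= np.linalg.norm(u)
        Hk = np.eye(n)
        Hk[k:, k:] -= 2.0 * np.outer(u, u)           # symmetric and orthogonal
        A = Hk @ A
        Q = Q @ Hk
    return Q, A                                       # A is now upper triangular

rng = np.random.default_rng(8)
M = rng.standard_normal((5, 5))
Q, R = householder_qr(M)
```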
1.8.6 [D] Let σ_min(X) be the least singular value of X. Prove that there exists
a permutation matrix Π such that, if XΠ = QR is the Schmidt factorization, then
1.8.7 [D] Let A ∈ ℂ^{n×p}, where n ≥ p.
(a) Prove that there exists a factorization, known as the polar decomposition,
A = QH,
where Q ∈ ℂ^{n×p}, Q*Q = I_p and where H ∈ ℂ^{p×p} is symmetric and positive
semi-definite.
(b) Prove that the matrix Q in (a) satisfies the conditions
‖A − Q‖_j = min { ‖A − U‖_j : lin U = lin A and U*U = I_p },
where j = 2 or F.
(c) Compare the applications of the polar decomposition and of the Schmidt
factorization in relation to the orthonormalization of a set of linearly
independent vectors.
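The polar factors of (a) can be obtained from the SVD, A = U S V* = (U V*)(V S V*) = QH (a sketch; NumPy only):

```python
import numpy as np

rng = np.random.default_rng(9)
n, p = 6, 3
A = rng.standard_normal((n, p))

# Polar decomposition via the SVD
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Q = U @ Vt                       # n x p, orthonormal columns
H = Vt.T @ np.diag(s) @ Vt       # symmetric positive semi-definite
```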
S = { x ∈ ℝⁿ : x_i ≥ 0, Σ_{i=1}^{n} x_i = 1 }.
Put
T(x) = Ax,
Suppose now that A and B are Hermitian and positive definite and that
(C₀ −S₁; S₀ C₁),
where
(a) C₀ > 0 and C₁ > 0, and
(b) S₁ = S₀*.
Prove that if N′ = Im(I − Q) and M′ = Im(I − P) satisfy the conditions
M ∩ N′ = M′ ∩ N = {0},
then the direct rotation of M on N exists; it is unique and (a) implies (b).
f(z₀) = (1/2iπ) ∮_Γ f(z)/(z − z₀) dz.
By differentiation,
f^{(k)}(z₀) = (k!/2iπ) ∮_Γ f(z)/(z − z₀)^{k+1} dz.
The expansion of f as a Taylor series in the neighbourhood of z₀ is as follows:
Figure 2.1.1
it converges absolutely and uniformly with respect to z in the interior of any disk
that lies inside Γ. Conversely, every series of the form
f(z) = Σ_{k=0}^{∞} a_k (z − z₀)^k
defines a function that is holomorphic in the open disk {z : |z − z₀| < ρ}, where
ρ = (lim sup_k |a_k|^{1/k})^{−1}.
This series converges absolutely and uniformly with respect to z in every disk
{z : |z − z₀| ≤ r}, where r < ρ. Moreover, this series is uniquely determined by f
because
a_k = f^{(k)}(z₀)/k!, k = 0, 1, ….
The coefficients a_k of the Taylor expansion can be bounded by Cauchy's
inequalities:
|a_k| ≤ M r^{−k}, k ≥ 0, where M = max_{|z−z₀|=r} |f(z)|.
Next we suppose that f is holomorphic in the annulus
{z : α < |z − z₀| < β}, α > 0.
Then f can be expanded in a Laurent series in the neighbourhood of z₀; thus
f(z) = Σ_{k=−∞}^{+∞} a_k (z − z₀)^k.
said to be an essential singularity of f. In the opposite case, z₀ is a pole of f; the
greatest integer ℓ such that a_{−ℓ} ≠ 0 is called the order of the pole.
The definitions and properties that have been recalled above can be extended
without difficulty to a function f with values in the vector space ℂ^{n×n}, that is,
for example, to a square matrix A of order n whose n² coefficients depend on the
complex variable z. It suffices to replace the absolute value |·| on ℂ by the chosen
norm ‖·‖ on ℂ^{n×n}. In particular, one can apply Liouville's theorem and
Cauchy's integral formula.
Lemma 2.2.1 The resolvent R(z) satisfies two identities, known as the first and
second resolvent equations respectively:
R(z₁) − R(z₂) = (z₁ − z₂)R(z₁)R(z₂) = (z₁ − z₂)R(z₂)R(z₁) (2.2.1)
for all z₁ and z₂ in res(A), and
R(A₁, z) − R(A₂, z) = R(A₁, z)(A₂ − A₁)R(A₂, z)
= R(A₂, z)(A₂ − A₁)R(A₁, z) (2.2.2)
for z ∈ res(A₁) ∩ res(A₂).
Proposition 2.2.2 The resolvent R(z) is holomorphic throughout res(A), where it
possesses the following expansion as a Taylor series in a neighbourhood of z₀:
R(z) = R(z₀) Σ_{k=0}^{∞} [(z − z₀)R(z₀)]^k,
which converges absolutely.
a_k/k → inf_k (a_k/k) = b.
The inequality
‖A^{m+k}‖ ≤ ‖A^m‖ ‖A^k‖
implies that
a_{m+k} ≤ a_m + a_k.
When m is a fixed positive integer, we can put k = mq + r, where q and r are
integers such that 0 ≤ r < m. Hence
a_k/k ≤ (q a_m + a_r)/k → a_m/m as k → ∞.
Hence lim sup_k (a_k/k) ≤ a_m/m, where m is arbitrary. Therefore lim sup_k (a_k/k) ≤ b. On
the other hand, since a_k/k ≥ b, it follows that
lim inf_k (a_k/k) ≥ b.
PROOF By Theorem 2.2.3, ‖A^k‖^{1/k} → a when k → ∞. Hence, if |z| ≥ a + ε, where
ε > 0, we have, for k sufficiently large,
|z|^{−k−1} ‖A^k‖ ≤ (a + ε)^{−k−1} (a + ½ε)^k,
and so the series
S(z) = − Σ_{k=0}^{∞} A^k z^{−k−1}
converges when |z| > a. On multiplying on the left or on the right by A − zI, we
verify that
(A − zI)S(z) = S(z)(A − zI) = I.
This proves equation (2.2.4).
The identity (2.2.4) is the expansion of R(z) as a Taylor series in the
neighbourhood of z = ∞, the radius of convergence being lim sup_k ‖A^k‖^{1/k} = a.
Hence the series (2.2.4) diverges when |z| < a.
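The Neumann series (2.2.4) for the resolvent is easy to verify numerically outside the spectral disk (a sketch; the truncation level is ours):

```python
import numpy as np

rng = np.random.default_rng(10)
A = rng.standard_normal((4, 4))
rho = np.max(np.abs(np.linalg.eigvals(A)))
z = 2.0 * rho                                     # well outside the spectrum

R = np.linalg.inv(A - z * np.eye(4))
# truncated series  R(z) = - sum_k A^k / z^{k+1},  valid for |z| > rho(A)
S = sum(-np.linalg.matrix_power(A, k) / z ** (k + 1) for k in range(200))
err = np.linalg.norm(R - S)
```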
Corollary 2.2.5 The sets res(A) and sp(A) are not empty.
‖R(z)‖ ≤ Σ_{k=0}^{∞} ‖A‖^k / |z|^{k+1} = (|z| − ‖A‖)^{−1}.
Hence ‖R(z)‖ → 0 when |z| → ∞. If sp(A) were empty, then R(z) would be analytic
and bounded throughout ℂ. By Liouville's theorem R(z) would be constant, and
this constant would be the zero matrix. This would lead to the contradiction that
I = (A − zI)R(z) = 0.
PROOF We shall show that there exists at least one point of sp(A) (that is, an
eigenvalue of A) on the circle {z : |z| = a}. Since the domain of convergence of
equation (2.2.4) is {z : |z| > a}, there is at least one singularity of R(z) on the circle
of convergence, provided that a > 0.
On the other hand, if a = 0, then sp(A) = {0}, unless the spectrum of A is empty,
which is impossible. Hence we conclude that a = ρ(A).
PROOF For every k ∈ ℕ, the function z ↦ ‖S^k(z)‖^{1/k} is continuous in G. For every
ε > 0 and z in G, there exists ν ∈ ℕ such that
‖S^ν(z)‖^{1/ν} ≤ ρ(S(z)) + ½ε.
There exists δ > 0 such that
|z′ − z| < δ ⇒ ‖S^ν(z′)‖^{1/ν} ≤ ‖S^ν(z)‖^{1/ν} + ½ε
≤ ρ(S(z)) + ε.
Since
ρ(S(z′)) = inf_{k≥1} ‖S^k(z′)‖^{1/k},
we obtain
|z′ − z| < δ ⇒ ρ(S(z′)) ≤ ρ(S(z)) + ε.
We have established the fact that R(z) is holomorphic in the exterior of the
disk {z : |z| ≤ ρ(A)}, which contains the spectrum of A (see Figure 2.2.1), and that
R(z) has the expansions (2.2.3) and (2.2.4). Next we are going to establish the form
of the Laurent expansion of R(z) in the neighbourhood of an eigenvalue λ, where
|λ| ≤ ρ(A).
Let Γ and Γ′ be two Jordan curves surrounding λ. The curve Γ′ lies in the
exterior of Γ. Both curves lie in the set res(A) and contain no other point of sp(A).
Figure 2.2.1
Figure 2.2.2
The matrix
P = −(1/2iπ) ∮_Γ R(z) dz
has the following properties:
(a) P is a projection on M = Im P, along M̂ = Ker P.
(b) M and M̂ are invariant subspaces of A.
(c) A restricted to M has the spectrum {λ} and A restricted to M̂ has the spectrum τ = sp(A) − {λ}.
PROOF
(a) We have
P² = (−1/2iπ)² ∮_Γ ∮_{Γ′} R(z)R(z′) dz′ dz,
where z ∈ Γ and z′ ∈ Γ′. By equation (2.2.1) we obtain
P² = (−1/2iπ)² ∮_Γ ∮_{Γ′} [R(z) − R(z′)]/(z′ − z) dz′ dz.
We remark that
∮_{Γ′} dz′/(z′ − z) = 2iπ and ∮_Γ dz/(z′ − z) = 0.
On changing the order of integration we immediately deduce that
Also, carrying out the integration with respect to z′ first,
P² = −(1/2iπ) ∮_Γ R(z) dz = P.
(b) Put M = Im P and M̂ = Ker P. We will show that M is invariant under A.
Since AR(z) = R(z)A, we deduce that PA = AP. If u ∈ Im P, then u = Pv for
some v. Hence Au = APv = PAv ∈ M. Thus
AM ⊆ M.
By a similar argument (see Section 1.3), it is shown that
M̂ = Ker P = Im(I − P)
is invariant under A.
(c) Let [X, X̂] and [X_*, X̂_*] be adjoint bases of ℂⁿ such that X and X̂ are bases
of M and M̂ respectively. Relative to these bases the map A restricted to M is
represented by the matrix B = X_*^*AX. Similarly, the map A restricted to M̂ is
represented by the matrix B̂ = X̂_*^*AX̂.
When z ∈ res(A), t ∈ Γ and z lies in the exterior of Γ, we obtain
R(z)P = −(1/2iπ) ∮_Γ R(t)/(z − t) dt.
Relative to the adjoint bases, R(z) is represented by the block-diagonal matrix
((B − zI)^{−1} 0; 0 (B̂ − zI)^{−1}),
and sp(A) = sp(B) ∪ sp(B̂). Since P = XX_*^* (see Exercise 2.2.3) we deduce that
R(z)P = X(B − zI)^{−1}X_*^* and R(z)(I − P) = X̂(B̂ − zI)^{−1}X̂_*^*.
Lemma 2.2.9 The matrix
(B − zI)^{−1} = − Σ_{k=0}^{ℓ−1} (z − λ)^{−k−1} (B − λI)^k,
so that
R(z)P = − Σ_{k=0}^{ℓ−1} (z − λ)^{−k−1} X(B − λI)^k X_*^*
= − [ P/(z − λ) + Σ_{k=1}^{ℓ−1} D^k/(z − λ)^{k+1} ].
Furthermore,
R(z)(I − P) = X̂(B̂ − zI)^{−1}X̂_*^* = Σ_{k=0}^{∞} (z − λ)^k X̂(B̂ − λI)^{−k−1}X̂_*^*
is the Taylor series for R(z)(I − P), valid near λ. On using the definition
S = X̂(B̂ − λI)^{−1}X̂_*^*, we infer that R(z)P + R(z)(I − P) satisfies equation (2.2.5).
Without referring to the characteristic polynomial we have established the fact
that a pole λ of order ℓ of the resolvent R(z) is an eigenvalue of A of index ℓ and
algebraic multiplicity m = dim M.
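The spectral projection P = −(1/2iπ)∮_Γ R(z) dz can be computed numerically by quadrature on a small circle around λ; the trapezoidal rule on a periodic integrand converges extremely fast (a sketch; the test matrix and node count are ours):

```python
import numpy as np

rng = np.random.default_rng(11)
A = np.diag([1.0, 2.0, 5.0]) + 0.1 * rng.standard_normal((3, 3))
lams = np.linalg.eigvals(A)
lam = lams[np.argmin(np.abs(lams - 1.0))]        # simple eigenvalue near 1

# P = -(1/2 pi i) * contour integral of R(z) over a circle around lam
m, radius = 64, 0.5
P = np.zeros((3, 3), dtype=complex)
for theta in 2 * np.pi * np.arange(m) / m:
    z = lam + radius * np.exp(1j * theta)
    dz = 1j * radius * np.exp(1j * theta) * (2 * np.pi / m)   # quadrature step
    P += -np.linalg.inv(A - z * np.eye(3)) * dz
P /= 2j * np.pi
```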
We will now mention the form that Cauchy's integral formula takes in this
context. Let Γ be a Jordan curve lying in res(A) and enclosing sp(A), and let f
be a function that is holomorphic in the neighbourhood of sp(A). Then Cauchy's
integral formula enables us to define
f(A) = −(1/2iπ) ∮_Γ f(z)R(z) dz.
PROOF Let Γ be a Jordan curve enclosing the set {λ_i} (1 ≤ i ≤ d). Then
Σ_{i=1}^{d} P_i = −(1/2iπ) ∮_Γ R(z) dz.
Since R(z) is holomorphic in the exterior of Γ, we can use the expansion (2.2.4)
of R(z) and make the change of variable z = 1/t. From the identity
R(z) dz = Σ_{k=0}^{∞} t^{k−1} A^k dt
and from the fact that t traverses a Jordan curve Γ′ in the negative sense
(z = ρe^{iθ} ⇒ z^{−1} = ρ^{−1}e^{−iθ}) we deduce that
−(1/2iπ) ∮_Γ R(z) dz = −(1/2iπ) ∮_{Γ′_−} t^{−1} dt · I = (−1/2iπ)(−2iπ) I = I.
= ∮_Γ y*R*(A, z)x dz = ∮_Γ [R(A, z)y]*x dz.
Hence
[ ∮_Γ R(A, z) dz ]* = ∮_{Γ_−} R*(A, z) dz.
Figure 2.2.3
Now
−(1/2iπ) ∮_{Γ_−} R*(A, z) dz = [ −(1/2iπ) ∮_Γ R(A, z) dz ]*.
PROOF Since R*(z) = R(z̄), it follows from equation (2.2.1) that R*(z)R(z) =
R(z)R*(z) and
‖R(z)‖₂ = ρ[(A − zI)^{−1}] = dist^{−1}[z, sp(A)].
Since λ is real, it may be assumed that the contour in Figure 2.2.3 is symmetric
with respect to the real axis; thus P* = P. Now
D = (A − λI)P = D*.
Since D is a Hermitian nilpotent matrix it is zero, that is, D = 0 and ℓ = 1.
Put
δ = dist[λ, sp(A) − {λ}] > 0.
PROOF
Lemma 2.3.2 Suppose that X_*^*b = 0 and that λ ≠ 0. Then the unique solution z
of system (2.3.1) is a solution of the system
(I − P)Az − λz = b. (2.3.2)
PROOF In the adjoint bases the system reads
−λX_*^*z = 0,
(B̂ − λI)X̂_*^*z = X̂_*^*b.
When λ ≠ 0, we obtain X_*^*z = 0 and X̂_*^*z = (B̂ − λI)^{−1}X̂_*^*b, whence z = Sb.
System (2.3.2), of rank n, can be solved in a standard fashion.
of a partial inverse which requires only the knowledge of the right invariant
subspace M.
Let X be a basis for M and let Y be an adjoint basis for a subspace N, which
need not be M_*. We suppose that ω(M, N) < 1, that is, θ_max < π/2. Then Π = XY*
is the projection on M along N^⊥. Let [X, X̂] and [Y, Ŷ] be adjoint bases of
ℂⁿ. The matrix A is similar to
sp(B) ∩ sp(B̂) = ∅.
relative to the adjoint bases [X, X̂] and [Y, Ŷ], then there exist adjoint bases [X, X̃]
and [X_*, X̃_*] defined by
X̃ = X̂ − XR, X_* = Y + ŶR*, X̃_* = Ŷ,
where R = (B, B̂)^{−1}C, such that A becomes block-diagonal of the form
(B 0; 0 B̂).
PROOF It is easy to verify that
if and only if R = (B, B̂)^{−1}C.
With the notation of Section 2.3.1 we have
B = X%AX = Y*A(X - XR) = Y*AX = B.
As we shall see in Proposition 2.3.4, this is due to a particular choice of the bases
X̃, X̃_*, starting from the bases X̂, Ŷ in Theorem 2.3.3.
When arbitrary bases [X, X̂] and [X_*, X̂_*] are associated with the direct
decompositions M ⊕ N^⊥ and N ⊕ M^⊥, the corresponding matrices B = Y*AX and
B̃ = X_*^*AX are similar because they have the same spectral structure (see Exercise
2.3.2). We shall study this similarity more precisely in the following proposition.
Proposition 2.3.4 Let (X, X_*) and (X, Y) be a pair of adjoint bases. Then there
neighbouring eigenvalues, and the resolvent that is associated with each of these
eigenvalues individually is ill-conditioned because their distance from the
remainder of the spectrum is small. One may therefore wish to treat globally the
cluster of eigenvalues which are close to a multiple eigenvalue.
Let {μ_i : 1 ≤ i ≤ r} be the block of the r eigenvalues of A, counted with their
multiplicity and distinct from the rest of the spectrum; we wish to treat the
eigenvalues {μ_i} simultaneously. The corresponding right invariant subspace is
given by
M = ⊕_{i=1}^{r} M_i.
Put
σ = {μ_i : 1 ≤ i ≤ r}, τ = sp(B̂)
and
δ = dist(σ, τ),
which is positive by hypothesis.
In the case of a block of eigenvalues, the eigenvalues of B may include distinct
ones. We will generalize the notion of a reduced resolvent in the following
manner.
PROOF
where
wxuwi^iwxutwi
Hence
Next,
H^ll^ll^llltli^llHll^llillfll?·
i= 1
ΙΙΙ/ΙΙΡ^ΙΚΑΒΓΜΙΡΙΙ^ΖΙΙΡ^ΙΙΑΒΓΜΙΡΙΙ^Ι^ΙΙΖΙΙΡ·
whence the result follows:
(b) This is a consequence of Proposition 1.12.2 and the fact that
sp(Σ) = sp[(B, B̂)^{−1}] ∪ {0}.
(c) We can choose bases X̂ and X̂_* in M^⊥ and M_*^⊥ respectively such that
X̂_*^*X̂ = X̂*X̂ = I,
because ω(M^⊥, M_*^⊥) < 1. Then by Proposition 1.2.5 we have
X̂_* = X̂ cos^{−1}Θ,
where Θ is the diagonal matrix of the canonical angles. We deduce that ‖X̂‖₂ = 1
and ‖X̂_*‖₂ ≥ 1. If M = M_*, then ‖X̂‖₂ = ‖X̂_*‖₂ = 1. If A is Hermitian, so
are B and B̂ if we choose an orthonormal basis [Q, Q̂]. Thus
‖S‖_p = ‖(B, B̂)^{−1}‖_p = δ^{−1}.
Lemma 2.4.2 Let R be an n by r matrix such that X_*^*R = 0. If B is regular, the
equations
AZ − ZB = R,
X_*^*Z = 0,
and
(I − P)AZ − ZB = R,
where P = XX_*^*, have the same solution, namely Z = SR.
PROOF This is analogous to the proof of Lemma 2.3.2. The equation λX_*^*z = 0
is replaced by (X_*^*Z)B = 0, which has the unique solution X_*^*Z = 0 if and only if
B is regular.
Definition The linear map Σ = X(B, B̂)^{−1}Y* is called the block partial inverse
with respect to {μ_i : 1 ≤ i ≤ r}, defined in N^⊥, where
B = Y*AX.
In particular, we may choose N = M. When Q is an orthonormal basis of M,
we have B = Q*AQ and Σ^⊥ = Q(B, B̂)^{−1}Q*. We leave to the reader the task of
verifying that
‖Σ^⊥‖_p = ‖(B, B̂)^{−1}‖_p when p = 2 or F.
P(t) = −(1/2iπ) ∮_Γ R(t, z) dz
PROOF We have
HR′(z) = (A′ − A)R′(z) = I − (A − zI)R′(z).
On the other hand, from the definition of ρ(A) as lim sup_k ‖A^k‖^{1/k} it follows that
ρ[HR′(z)] = ρ[R′(z)H].
See also Section 2.12.
PROOF We have
A(t) − zI = A′ − zI − tH = (A′ − zI)[I − tR′(z)H].
Hence A(t) − zI is invertible
if and only if
|t| ρ[R′(z)H] < 1.
We deduce that t = 1 belongs to the domain of analyticity provided that
ρ[R′(z)H] < 1. This holds under the classic condition
‖H‖ < ‖R′(z)‖^{−1};
but it may also hold for matrices of greater norm, as the following example shows.
but it may also hold for matrices of greater norm, as the following example shows.
and
ρ[HR(z)] = | xy / (z(2 − z)) |^{1/2},
max_{z∈Γ} ‖R(z)‖₂ = max_{z∈Γ} max( 1/|z|, 1/|2 − z| ).
As z describes Γ, the condition
Figure 2.5.1
The corresponding sets of points in the (x, y) plane are bounded by the sides
of the square and by the two hyperbolas shown in Figure 2.5.1. If
we choose x = √n, y = 1/n, then xy = 1/√n → 0 as n → ∞; but ‖H‖₂ = n^{1/2} → ∞.
x(t) = Σ_{k=0}^{∞} t^k y_k
converges, where y₀ = x′ and
y_k = [R′(z)H]^k x′ = R′(z)H y_{k−1} (k ≥ 1).
PROOF The assertion follows immediately from Lemma 2.5.2.
If ρ[HR′(z)] < 1, then
x_k = Σ_{i=0}^{k} y_i → x as k → ∞.
Also
x₀ = y₀ = x′
and
x_k = x′ + R′(z)(A′ − A)x_{k−1} (k ≥ 1). (2.6.2)
ρ(I − A′^{−1}A) < 1. This is true in exact arithmetic. In floating-point arithmetic
we have convergence towards an approximate solution with the precision used
in calculating r_k. We pose the question: 'When calculating the residual in double
precision, can we obtain the solution x in double precision whilst solving the
equation (2.6.3) in single precision?' The answer is 'yes' if we proceed as follows:
we repeat the procedure until the residual is zero in double precision (see
Exercise 2.6.1).
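This is classical iterative refinement. A minimal sketch, simulating the single-precision solver with float32 factorizations while the residual is accumulated in float64 (all names and parameters are ours):

```python
import numpy as np

rng = np.random.default_rng(12)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
x_true = rng.standard_normal(n)
b = A @ x_true

A32 = A.astype(np.float32)                        # 'single precision' solver
x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

for _ in range(10):
    r = b - A @ x                                 # residual in double precision
    d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
    x = x + d                                     # correction step

err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

After a few corrections the error reaches the double-precision level even though every linear solve is done in single precision.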
P(t) = −(1/2iπ) ∮_Γ R(t, z) dz
is analytic in the disk δ′_Γ.
PROOF The matrix P(t) is defined for each t in δ′_Γ. Let t₀ ∈ δ′_Γ and consider its
neighbourhood
Lemma 2.7.2 If the projection P(t) depends continuously on t, when t varies over
a connected region of ℂ, then the dimension of Im P(t) remains constant.
Proposition 2.7.3 The functions l(t) and B(t) are analytic in δ'Γ.
A(t) = (0 1; t 0)
has the eigenvalues ±√t and
A(0) = (0 1; 0 0)
has zero as an eigenvalue of index 2. We remark that the arithmetic mean of the
eigenvalues remains constant at zero. The matrix
A(t) = (0 1; t t)
has the eigenvalues ½(t ± √(t² + 4t)), whose arithmetic mean is equal to ½t.
where
PROOF The reader is referred to the proof given in Kato (1976, pp. 74-80).
On account of the complexity of their coefficients these expansions are chiefly
of theoretical interest. In what follows we shall meet the Rayleigh-Schrödinger*
expansions which possess the double advantage of having coefficients that can
be calculated recursively and of covering the case in which Γ encloses several
eigenvalues {μ$\9 of A'.
Proposition 2.9.1 The coefficients Z_k and C_k are formally the solutions of the
recurrence relations
C₀ = B′, C_k = Y*(A′Z_k − HZ_{k−1}) (k ≥ 1).
Remark If we choose Y = X′_*, then Π′ and Σ′ become P′ and S′ respectively and
X′_*^*A′Z_k = B′X′_*^*Z_k = 0,
whence
Lemma 2.9.2 If ρ[(P(t) − P′)Π′] < 1 when t lies in δ_Γ, then the matrix
S(t) = [Y*P(t)X′]^{−1}
is analytic in δ′_Γ and Π(t) = P(t)X′S(t)Y* defines the projection on M(t) along N^⊥.
Then the disk {t : |t| < 1/ρ} belongs to the domain of analyticity of X(t) and B(t).
We deduce that
ρ{[P(t) − P′]Π′} ≤ ‖[P(t) − P′]Π′‖₂ < 1.
On the other hand, since |t| ‖H‖₂ r′_Γ < ½, it is clear that t ∈ δ′_Γ. Hence, by Lemma
2.9.2, P(t) is analytic and so also are X(t) and B(t) = Y*(A′ − tH)X(t), provided that
(|Γ|/2π) ‖Π′‖₂ r′_Γ² ‖H‖₂ < 1. (2.9.1)
PROOF The condition 1/s > 1 is satisfied under the hypothesis (2.9.1), which can
also be written as
‖H‖₂ < 2π/(|Γ| ‖Π′‖₂ r′_Γ²) ≤ 1/(2r′_Γ),
by virtue of the relations mentioned at the beginning of the proof of Theorem
2.9.3.
Let q be such that s < q < 1. Now X(t) and B(t) are analytic on the circle {t : |t| = 1/q}. Let
α = max( ‖X(t) − X′‖₂ : |t| = 1/q ),
β = max( ‖B(t) − B′‖₂ : |t| = 1/q ).
By Cauchy's inequalities (page 62) we have
‖Z_k‖₂ ≤ α q^k, ‖C_k‖₂ ≤ β q^k.
This demonstrates that the convergence is at least as fast as that of a geometric
progression of common ratio q, where q is arbitrarily close to s provided that
|| H || 2 satisfies condition (2.9.1).
If we choose Y = X′ = Q′, where Q′ is an orthonormal basis of the invariant
subspace M′, then Π′ is an orthogonal projection and ‖Π′‖₂ = 1 in condition
(2.9.1). This condition is satisfied when ‖H‖₂ is sufficiently small. This furnishes
a theoretical framework which suffices for the error analysis that we shall
undertake in Chapter 4. Nevertheless, it is interesting to remark that the sufficient
condition for analyticity given in Lemma 2.9.2 may be satisfied even when ‖H‖₂
is not 'small' (see Exercise 2.9.1).
Theorem 2.10.1 If 0 ∉ σ, the basis X which satisfies AX = XB and which is
normalized by Y*X = I is a solution of
F(X) = AX − X(Y*AX) = 0. (2.10.1)
PROOF Equation (2.10.1) expresses the fact that there exists a matrix B such that
AX = XB. Hence X spans a subspace invariant under A. On multiplying on the left by Y* we
deduce that
(I − Y*X)B = 0,
which implies that Y*X = I since B is regular by virtue of the assumption
0 ∉ σ = sp(B).
The Fréchet differential of F, which is a quadratic function of X, is easily
calculated, namely
Z ↦ (D_X F)Z = J(X)Z = (I − XY*)AZ − Z(Y*AX)
= (I − Π)AZ − ZB, (2.10.2)
where Z ranges over ℂ^{n×r}.
PROOF
(a) Let τ = sp(A) − σ. Now
sp((I − Π)A) = τ ∪ {0}.
Therefore
σ ∩ sp((I − Π)A) = ∅,
because 0 ∉ σ.
(b) Let X₁ and X₂ be elements of ℂ^{n×r}. Then
[J(X₁) − J(X₂)]Z = (X₂ − X₁)Y*AZ + ZY*A(X₂ − X₁)
and so
‖J(X₁) − J(X₂)‖ ≤ 2 ‖Y*A‖ ‖X₂ − X₁‖.
We define
𝒲 = { Z : Y*Z = 0 };
this is a subspace of ℂ^{n×r}.
Lemma 2.10.3 Let V = [v₁, …, v_r] be a matrix satisfying Y*V = I. Then F(V) ∈ 𝒲.
If J(V) and Y*AV are invertible, the space 𝒲 is invariant under J^{−1}(V).
PROOF We have
Y*F(V) = Y*(I − VY*)AV = 0,
which proves the first assertion.
Next consider the equation
(I − VY*)AZ − Z(Y*AV) = C, (2.10.3)
where C is such that Y*C = 0. Since we suppose that J(V) is invertible, we have
sp((I − VY*)A) ∩ sp(Y*AV) = ∅.
The solution Z of equation (2.10.3) exists and is such that
Y*Z(Y*AV) = Y*C = 0.
This implies that Y*Z = 0 because Y*AV is invertible. Thus we have shown that
C ∈ 𝒲 implies that J^{−1}(V)C ∈ 𝒲.
Theorem 2.10.4 On the assumption that 0 ∉ σ, there exists ρ > 0 such that, for
every U satisfying ‖X − U‖ < ρ, the sequence defined in (2.10.4) is meaningful and
converges quadratically to X as k tends to infinity.
PROOF Using Lemma 2.10.3 we show recursively that Y*X_k = I, that is, X_{k+1} − X_k
lies in 𝒲, if we suppose that, at each step, J(X_k) and Y*AX_k are invertible. Since
we assume that 0 ∉ σ, the matrix B = Y*AX is invertible. Now the function
B ↦ det B is continuous: hence for some a > 0 there exists ρ₁ such that, for all V
satisfying ‖X − V‖ < ρ₁, we have |det(Y*AV)| ≥ a > 0.
On the other hand, since J(X) is invertible and J satisfies a Lipschitz condition,
there exists ρ₂ ≤ ρ₁ such that, for every U such that ‖X − U‖ < ρ₂, the sequence
(2.10.4) converges quadratically to X (see Exercises 2.10.1 and 2.10.2).
Finally, we remark that when X_k → X, then F(X_k) tends to zero and, in practice,
must therefore be calculated with increasing precision (see Sections 2.6 and 2.12).
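A minimal sketch of the Newton iteration for F(X) = AX − X(Y*AX) = 0 in the simplest case r = 1 (a single eigenvector, normalization yᵀx = 1; the test matrix and all names are ours, and the Jacobian (2.10.2) becomes an ordinary n×n matrix):

```python
import numpy as np

rng = np.random.default_rng(13)
n = 6
A = np.diag(np.arange(1.0, n + 1)) + 0.1 * rng.standard_normal((n, n))

y = np.zeros(n); y[0] = 1.0        # fixed normalization vector
x = np.zeros(n); x[0] = 1.0        # rough starting guess

for _ in range(20):
    mu = y @ A @ x                  # B = y* A x (a scalar when r = 1)
    F = A @ x - mu * x
    # Jacobian (2.10.2): Z -> (I - x y*) A Z - mu Z
    J = (np.eye(n) - np.outer(x, y)) @ A - mu * np.eye(n)
    x = x - np.linalg.solve(J, F)   # Newton step

mu = y @ A @ x
residual = np.linalg.norm(A @ x - mu * x)
eigs = np.linalg.eigvals(A)
```

The normalization yᵀx = 1 is preserved automatically, as in Lemma 2.10.3, because each correction lies in 𝒲.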
2.11 MODIFIED METHODS
Iterations of the type (2.10.4) are costly because they require the solution of a
different Sylvester equation at each step. We can modify this type of iteration by
keeping the system that has to be solved fixed in the course of the iterations. Thus
we define a family of modified Newton methods (with fixed gradient) whose
convergence is linear. We present these methods in terms of the deviation
V_k = X_k − X⁰, for the study of convergence is simpler in this version.
Figure 2.11.1
for Y*AV_k = SV_k.
Define a sequence (x_k) by putting x₀ = 0 and
x_{k+1} = ε(1 + x_k)² (k ≥ 0), where ε = γs‖V₁‖.
It can be shown inductively that (x_k) is a monotonic increasing sequence (k ≥ 0);
it tends to x, where
1 + x = g(ε)
(see Figure 2.11.2). Since x_k < x, we conclude that
‖V_k‖ ≤ π₁(1 + x_k) ≤ π₁(1 + x) = g(ε)π₁.
PROOF We have
G(V) − G(V′) = J₀^{−1}[VY*AV − V′Y*AV′]
= J₀^{−1}[(V − V′)Y*AV + V′Y*A(V − V′)] = ξ.
By hypothesis, max(‖V‖, ‖V′‖) ≤ g(ε)‖V₁‖ = g(ε)π₁. Hence
‖ξ‖ ≤ 2 g(ε) γs π₁ ‖V − V′‖ = 2εg(ε) ‖V − V′‖.
Hence
α = 2εg(ε) < 4ε < 1
when ε < ¼.
Consequently, G possesses a unique fixed point V in ℬ which satisfies
A(U + V) = (U + V)Y*A(U + V),
or
AX = XB.
One can calculate V by successive approximations starting from V₀ = 0; the iteration (2.11.1) converges linearly to V provided that ‖R‖ < 1/(4γ²s).
Since α is the contraction constant of G, we have
‖V − Vₖ‖ ≤ [αᵏ/(1 − α)]‖V₁‖  (k ≥ 1),
where
α = 2εg(ε),  ε = γs‖V₁‖,
and
‖B − Bₖ‖ = ‖Y*A(V − Vₖ)‖ ≤ [αᵏ/(1 − α)]s‖V₁‖.
Corollary 2.11.3 The upper bounds ‖V − X‖ ≤ g(ε)‖J⁻¹R‖ and ‖B̃ − B‖ ≤ g(ε)s‖J₀⁻¹R‖ are valid when ε < 1/4.
V₁ = J′⁻¹HX′,  (2.11.3)
Vₖ₊₁ = V₁ + J′⁻¹[VₖY*AVₖ + HVₖ − VₖY*HX′].
In Chapter 4, this iteration will be used to obtain bounds for the error linking
the eigenelements of A with those of A'.
(b) We can modify J₀ in the following way:
Z ↦ J̃Z = (I − UY*)AZ − ZB̃,
where B̃ is defined as follows: let
T = Q*BQ
be the Schur form of
B = Y*AU.
Put
T̃ = T + diag(ζ − θᵢ),
where the θᵢ are the r eigenvalues of B and ζ is their arithmetic mean. Then define
B̃ = QT̃Q*.
V₁ = J̃⁻¹R,  (2.11.4)
ℬ = {y: ‖y − x‖ ≤ ρ}.
Suppose, further, that we are able to evaluate the map F and that we know an
approximate inverse G defined in the following manner.
Definition We say that G is an approximate local inverse of F if and only if the following three conditions are satisfied:
(a) F(ℬ) ⊂ Dom G.
(b) G(0) ∈ ℬ.
(c) U = 1 − G∘F is a contraction map on ℬ.
The method of residual correction consists in producing the sequence
x⁽⁰⁾ = G(0),  x⁽ᵏ⁺¹⁾ = G(0) + U(x⁽ᵏ⁾)  (k ≥ 0).  (2.12.2)
PROOF The existence of lim x⁽ᵏ⁾ is assured because the sequence defined in (2.12.2) is a Cauchy sequence. Let
x = lim x⁽ᵏ⁾ as k → ∞.
We shall show that x is a fixed point of the map
K(y) = G(0) + U(y).
On letting k tend to ∞ in (2.12.2) we obtain
x = G(0) + U(x).
Hence
We leave it to the reader to verify that the iteration of the residual correction
x⁽⁰⁾ = Hb,  x⁽ᵏ⁺¹⁾ = Kx⁽ᵏ⁾ + Hb
reverts to the three well-known iterative methods.
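As a concrete illustration (an assumption of this note, not an example from the text), take F(x) = Ax − b and let the approximate inverse G be multiplication by the inverse of diag(A); the residual-correction scheme then reduces to the Jacobi iteration.

```python
# Residual correction for Ax = b with G = diag(A)^{-1}; with this
# choice the scheme is exactly the Jacobi iteration. Data are made up.
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 5.0]])
b = np.array([1.0, 3.0])
G = np.diag(1.0 / np.diag(A))      # approximate inverse of A

x = G @ b                          # x^(0)
for _ in range(100):
    x = x + G @ (b - A @ x)        # add the corrected residual
```

Convergence holds because I − GA is a contraction here (its spectral radius is about 0.32).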
EXERCISES
Section 2.1 Revision of Some Properties of Functions of a Complex Variable
2.1.1 [B:41] Let Ω be an open set in ℂ. Suppose the maps f: Ω → ℂ, p: ℝ² → ℝ, q: ℝ² → ℝ and g: ℝ² → ℂ are such that
f(z) = p(x, y) + iq(x, y) = g(x, y)
for all z = x + iy in Ω.
EXERCISES 99
Prove that, if f is analytic, then
∂²p/∂x² + ∂²p/∂y² = 0
in Ω.
2.1.2 [D] Let Ω be an open set in ℂ. Prove that
z ∈ Ω ↦ A(z) = [aᵢⱼ(z)] ∈ ℂᵐˣⁿ
is analytic if and only if the mn functions z ∈ Ω ↦ aᵢⱼ(z) are analytic.
d/dz R(z) = −[R(z)]²
and, more generally,
dᵏ/dzᵏ R(z) = k!(−1)ᵏ[R(z)]ᵏ⁺¹  (k = 1, 2, …).
2.2.3 [B:63] Show directly that the function z ↦ ρ(R(z)) is upper semi-continuous on the resolvent set.
2.2.3 [D] Let A ∈ ℂⁿˣⁿ and let M be the invariant subspace associated with an eigenvalue λ of A. Let X be a basis of M and let X₊ be a basis of the orthogonal complement of the complementary invariant subspace M̃ (M₊ = M̃⊥, M ⊕ M̃ = ℂⁿ), such that X₊*X = I. Prove that
XX₊* = −(1/2πi) ∫_Γ R(z) dz,
where Γ is a Jordan curve lying in res(A) and isolating λ.
2.2.4 [B:35] Let P be the spectral projection associated with an eigenvalue λ. Prove that
lim_{z→λ} R(z)(I − P) = S,
2.2.6 [B:10] Investigate sufficient conditions for matrices A and B to satisfy the
equation
2.2.7 [D] Let A be a square matrix. Let λ ∈ sp(A) and let P be the associated spectral projection. Prove that if 0 ∉ sp(A), then A⁻¹ exists, λ⁻¹ is an eigenvalue of A⁻¹ and P is the spectral projection of A⁻¹ associated with λ⁻¹.
X = −∫₀^∞ e^{At} C e^{Bt} dt.
(b) Let α be a non-zero number in
res(A) ∩ res(B).
Define
f(z) = (z + α)(z − α)⁻¹,
U = f(A),  V = f(B),
D = −(1/2α)(U − I)C(V − I).
Prove that X is a solution of equation (1) if and only if X is a solution of
X − UXV = D.  (2)
(c) Prove that if the eigenvalues of A and of B have negative real parts, then ρ(U) < 1 and ρ(V) < 1.
(d) Prove that if ρ(U)ρ(V) < 1, then the solution of equation (2) can be expanded in the series
X = Σ_{n=1}^∞ Uⁿ⁻¹DVⁿ⁻¹.
(1/2πi) ∫_Γ R(z)/(z − λ) dz,
where Γ is a closed Jordan curve which isolates λ⁻¹ from the rest of the spectrum of A⁻¹.
2.3.5 [C] Let
22
A-( V—21 Λ
53/
(a) Verify that λ = 25 is an eigenvalue of A.
(b) Calculate the reduced resolvent S associated with λ.
(c) Calculate the partial inverse Σ₁ associated with λ and the orthogonal projection upon the invariant subspace.
(d) Compare ‖S‖₂ with ‖Σ₁‖₂.
2.3.6 [A] Let λ ∈ sp(A). Let M be the invariant subspace associated with λ, Π any projection on M, Π₁ the orthogonal projection on M, and Σ(Π) the partial inverse associated with λ and Π. Prove that
‖Σ(Π₁)‖ⱼ = min_Π ‖Σ(Π)‖ⱼ  (j = 2 or F).
2.3.7 [B:9] Identify S with the Drazin inverse (A − λI)^D and, when λ is semi-simple, with the group inverse (A − λI)^#. If A is normal, prove that
(A − λI)^† = (A − λI)^D = (A − λI)^# = S,
where † denotes the Moore–Penrose inverse, or pseudo-inverse (Exercise 1.6.8).
2.6.3 [D] Suggest an algorithm based on Exercise 2.5.1 for solving the equation
AX - XB = C
when B is almost triangular.
2.6.4 [C] Solve the system
(-; TO-C)
by successive iterations, starting from ( I, which is the solution of
("I ?XK>
Section 2.7 Analyticity of the Spectral Projection
2.7.1 [B:35] Let P(t) be the spectral projection of the perturbed matrix A(t) =
A − tH. Give an example which shows that Kato's proof for the expansion of P(t) cannot be extended to the case in which the Jordan curve Γ defining P(t) encloses several distinct eigenvalues of the matrix A(0) = A.
ίχ =
g{r)
l - yX-r ^ 4 r ,
2r
u = Ax − ξx,
v = A*y − ξ̄y.
Suppose there exists a (≥ σ) and ε such that
(*) |v*Σᵏu| ≤ aᵏ‖Q‖ε  (k ≥ 1).
Define
ε̂ = a²‖Q‖ε.
(b) Prove that if ε̂ < 1/4, then there exists a simple eigenvalue λ of A such that
|λ − ξ| ≤ g(ε̂)|v*Σu|,
which is the only eigenvalue of A in the disk
|z − ξ| ≤ a/2.
Let P be the spectral projection of A associated with λ and suppose that y*Px ≠ 0. Prove that there exists an eigenvector φ of A associated with λ and normalized by y*φ = 1 such that
‖φ − x‖ ≤ g(ε̂)‖Σu‖.
2.9.2 [D] Verify that in the case of a simple eigenvalue the expansions of
Rellich-Kato and Rayleigh-Schrödinger can be made to coincide, and that this
is also true in the case of a semi-simple eigenvalue.
2.9.3 [D] In the proof of Proposition 2.9.1 (page 87) we calculated first Zk and
then Ck. Suggest a way of first calculating Ck and then Zk.
2.9.4 [D] Verify the identity
n(i) = P(i)Ar,S(i)y*
by using the Rellich-Kato series for P(t) and S(t) and the Rayleigh-Schrödinger
series for Tl(t) = X(t)Y*.
Prove that there exists p > 0 such that, for every x0 satisfying ||x* — x 0 II < P, the
sequence
ρ = (1 − √(1 − 2mc))/m,
γ = (1 − √(1 − 2mc))/(1 + √(1 − 2mc)),
ν = 2√(1 − 2mc)/m.
Prove that:
(a) ∃x* ∈ {x ∈ B: ‖x − x₀‖ ≤ ρ} such that F(x*) = 0.
(b) 0 < γ < 1.
(c) The Newton sequence {xₖ} satisfies
‖xₖ − x₀‖ < ρ
and
‖xₖ − x*‖ ≤ νγ^(2ᵏ)/(1 − γ^(2ᵏ)).
2.10.5 [B:19] Consider the equation (2.10.1) (page 90)
AX − X(Y*AX) = 0,
where Y is of full rank m. Change the basis in such a way that the matrix Y is replaced by Ỹ = [Iₘ 0]ᵀ.
(a) Prove that relative to this basis the unknown X is replaced by
(a) Show that if κ < 1/4, then the proposed method of iteration converges linearly towards a solution R which is the only solution in the closed ball with centre 0 and radius
(1 − √(1 − 4κ))/2.
(b) Show that if κ < 1/12, then Newton's method, applied to Riccati's equation,
converges in a quadratic manner towards a unique solution in the closed ball
defined above.
F(x, λ) = 0,
where λ is a simple eigenvalue of A.
(a) Write out Newton's method, applied to this problem.
(b) Propose a simplified method with fixed gradient.
(c) Examine the convergence of these methods.
2.11.2 [A] Give sufficient conditions for the convergence of the method (2.11.3)
(page 95).
2.11.3 [A] Give sufficient conditions for the convergence of the method (2.11.4)
(page 95). Establish the contraction constant of the map
G: V ↦ V₁ + J̃⁻¹[V Y*AV + V(B̃ − B)],
in a manner similar to Exercise 2.11.2, provided that ‖B̃ − B‖ and ‖R‖ are sufficiently small.
2.11.4 [A] Let F be an operator and let x* be a point in its domain such that
F(x*) = 0.
Suppose that F is differentiable in a neighbourhood of x* and that T is a regular linear operator such that
γ = sup_{‖x−x*‖<ρ} ‖T − F′(x)‖ < ‖T⁻¹‖⁻¹,
where ‖·‖ is a vector norm and also the corresponding induced norm for linear operators, and where ρ is a given positive real number. Prove that if ‖x₀ − x*‖ < ρ
then the sequence
xₖ₊₁ = xₖ − T⁻¹F(xₖ)  (k ≥ 0)
converges to x*.
where B(λ, u) denotes the matrix obtained by replacing the last column of A − λI by −u.
(b) Show that B(λ, u) is singular if the eigenvalue λ is of multiplicity ≥ 2.
(c) Apply Newton's method with fixed slope to
(d) Extend the preceding results to the case of a double eigenvalue by taking the
operator
(AV -UB\
where
E = (eₙ₋₁, eₙ)  and  B̃ = ( a b ; c d ).
…Analytical mechanics is much more than an efficient tool for the solution of dynamical problems that we encounter in physics and engineering. There is hardly any other branch of the mathematical sciences in which abstract mathematical speculation and concrete physical evidence go so beautifully together and complement each other so perfectly.
…There is a tremendous treasure of philosophical meaning behind the great theories of Euler and Lagrange, and of Hamilton and Jacobi … a source of the greatest intellectual enjoyment to every mathematically-minded person.
The eigenvalues of matrices or linear operators play a part in a very large number
of applications, both theoretical and practical. We shall try to convey an idea of
the extent of applications by citing examples deliberately chosen from very diverse
disciplines: they range from mathematics to chemistry and to the dynamics of
structures; they touch on economics.
While the theoretical applications are fundamental, the industrial applications are no less important. We mention only the accident of the suspension bridge at Tacoma, in the State of Washington on the West Coast of the United States. This bridge, with a span of 700 m, collapsed in 1940 under the effect of aeroelastic vibrations, only four months after it was brought into service. At the moment of collapse it showed a torsion of 45° against the horizontal in both directions under the effect of a 70 km per hour wind.
dv/dt = Jv,  (3.1.2)
where J is the Jordan form of A; let ℓ be the size of the greatest block.
The reader will verify (Exercise 3.1.1) that the solution of (3.1.1) for which
u(0) = u0 is given by
u(t) = e^{At}u₀ = Xe^{Jt}X⁻¹u₀,  (3.1.3)
where e^{Jt} is an upper-triangular matrix whose elements are of the form tʲe^{λᵢt}.
(b) The system is unstable and u(t) is unbounded if there exists an eigenvalue λ such that
Re λ > 0.
(c) When maxᵢ Re λᵢ = 0, the solution u(t) is bounded or unbounded according to whether the eigenvalues for which Re λᵢ = 0 are semi-simple or include at least one defective eigenvalue (see Exercise 3.1.3).
Remark The equation (3.1.1) models a diffusion problem when the time is regarded
as a continuous variable. If one considers only discrete values of the time, then the
formulation involves a linear recurrence equation.
(b) The system is unstable and uₖ is unbounded when there exists an eigenvalue λ such that
|λ| > 1.
(c) When ρ(A) = 1, the solution uₖ is bounded or unbounded as k → ∞, according as the eigenvalues λⱼ for which |λⱼ| = 1 are semi-simple or include at least one defective eigenvalue.
Example 3.1.1 Consider the Fibonacci sequence 0, 1, 1, 2, 3, 5, 8, 13, …:
fₖ₊₁ = fₖ + fₖ₋₁  (k ≥ 1),
f₀ = 0, f₁ = 1.  (3.1.5)
Put
w₀ = (1, 0)ᵀ,  wₖ = [1 1; 1 0] wₖ₋₁  (k ≥ 1).
The eigenvalues of the matrix are (1 ± √5)/2.
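The iteration of Example 3.1.1 can be run directly; a minimal sketch (all names illustrative):

```python
# Fibonacci numbers generated by the matrix iteration w_k = A w_{k-1}.
A = [[1, 1],
     [1, 0]]
w = [1, 0]                          # w = (f_1, f_0)
fib = [0, 1]
for _ in range(10):
    w = [A[0][0] * w[0] + A[0][1] * w[1],
         A[1][0] * w[0] + A[1][1] * w[1]]
    fib.append(w[0])

golden = (1 + 5 ** 0.5) / 2         # dominant eigenvalue (1 + sqrt(5))/2
```

The ratio of consecutive Fibonacci numbers converges to the dominant eigenvalue (1 + √5)/2, an instance of the power method.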
Example 3.1.2 Consider the method for calculating √2 proposed by Theon of Smyrna (second century A.D.). Starting from (1, 1), iterate the transformation
x ↦ x + 2y,
y ↦ x + y.
It will be found that x²/y² → 2. The procedure can be formalized as uₖ = Auₖ₋₁, where A = [1 2; 1 1].
The reader should verify that xₖ/yₖ → √2 and that the values of this quotient are alternately greater and less than √2.
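The reader's verification can be carried out numerically; a short sketch of Theon's iteration:

```python
# Theon of Smyrna's iteration (x, y) -> (x + 2y, x + y), started from
# (1, 1); the quotients x/y approximate sqrt(2) from alternating sides.
x, y = 1, 1
ratios = [x / y]
for _ in range(8):
    x, y = x + 2 * y, x + y
    ratios.append(x / y)
```

After eight steps the quotient is 1393/985, already correct to about six decimal places, and the successive quotients straddle √2.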
According to whether a phenomenon is modelled by a system of differential
equations (continuous time) or by a difference system (discrete time), the stability
of the system depends, respectively, on the real parts or on the moduli of the
eigenvalues of the matrix describing the system.
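Both stability tests reduce to a single eigenvalue computation; the following sketch assumes numpy, and the test matrix is made up for illustration.

```python
import numpy as np

def stable_continuous(A):
    """Differential system du/dt = Au: all eigenvalues must satisfy
    Re(lambda) < 0."""
    return bool(np.all(np.linalg.eigvals(A).real < 0))

def stable_discrete(A):
    """Difference system u_{k+1} = A u_k: all eigenvalues must lie
    strictly inside the unit circle."""
    return bool(np.max(np.abs(np.linalg.eigvals(A))) < 1)

A = np.array([[-1.0, 4.0],
              [0.0, -2.0]])   # triangular: eigenvalues -1 and -2
```

This A is stable as a continuous-time system but not as a discrete-time one, since |−1| = 1 and |−2| > 1.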
Example 3.2.1 Random walk on a triangular grid of side n + 1 (see Figure 3.2.1). A particle moves at random on the grid by jumping from one point to one of its (at most four) neighbouring points (N–S–E–W).
Figure 3.2.1
probabilities.
We put
q⁽ᵏ⁾ = (q₁⁽ᵏ⁾, …, qₙ⁽ᵏ⁾),
where
qⱼ⁽ᵏ⁾ = P(Xₖ = j)  (k = 0, 1, …).
We remark that
Σⱼ qⱼ⁽ᵏ⁾ = ‖q⁽ᵏ⁾‖₁ = 1.
Proposition 3.2.1
q⁽ᵏ⁾ = q⁽⁰⁾Pᵏ.  (3.2.1)
PROOF We have
PROOF
(a) Let p = (p₁, …, pₙ) be a price system. The production cost of one unit of j is
cⱼ = Σᵢ₌₁ⁿ aᵢⱼpᵢ + wtⱼ = Σᵢ₌₁ⁿ (aᵢⱼ + tⱼdᵢ)pᵢ = Σᵢ₌₁ⁿ bᵢⱼpᵢ.
It is required that pⱼ = (1 + r)cⱼ. Hence p satisfies (3.3.1) if and only if
pB = ρp,  p > 0,  where  ρ = 1/(1 + r).
Moreover, ρ < 1, because by hypothesis all branches are profitable (Exercise 3.3.2). Hence r = 1/ρ − 1 > 0.
(b) Let x = (x₁, …, xₙ)ᵀ be the production structure. The total quantity of article i required by the production is given by
dᵢ = Σⱼ₌₁ⁿ (aᵢⱼ + tⱼdᵢ)xⱼ = Σⱼ₌₁ⁿ bᵢⱼxⱼ.
forecasting models comprising a group of countries or the whole world, then the
matrices involved are structured in blocks, but they are not symmetric and are
of gigantic size. To give an indication, the input-output array for the years 1970
to 1979 supplied by INSEE for France corresponds to a division into about 600
branches of activity. Aggregated versions exist comprising 91, 35 and 15 branches respectively.
S̄ = Σⱼ aⱼSⱼ  and  Xⱼ = Sⱼ − S̄.
The matrix
X = [X₁, …, Xₙ]
of order k by n represents the data referred to the mass centre. The principal components analysis of the cluster of points consists in projecting them on to the plane which minimizes their dispersion in ℝᵏ in the sense of the norm defined by B. We use the notation
A = diag(aᵢ).
The method consists in calculating the two greatest eigenvalues of the matrix
U = ΧΑΧΎΒ
together with the corresponding two eigenvectors. In general, the matrix U is
not symmetric.
Lemma 3.4.1 (Barra, 1981) Let X, A and B be three matrices of orders k × n, n × n and k × k respectively, where A and B are symmetric positive definite. We suppose that k ≤ n. The matrices U = XAXᵀB and V = XᵀBXA, of orders k and n respectively, have s (≤ k) positive eigenvalues in common; they are the non-zero eigenvalues of the positive semi-definite matrix
W = B^{1/2}XAXᵀB^{1/2}
of order k.
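Lemma 3.4.1 is easy to check numerically; the sketch below uses arbitrary data (A = I and a diagonal B, chosen purely for simplicity — nothing here is taken from the book's examples).

```python
# Check that U = X A X^T B and W = B^{1/2} X A X^T B^{1/2} have the
# same (real, non-negative) eigenvalues.
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 5
X = rng.standard_normal((k, n))
A = np.eye(n)                      # symmetric positive definite
B = np.diag([1.0, 2.0, 3.0])       # symmetric positive definite

U = X @ A @ X.T @ B                # not symmetric in general
Bh = np.sqrt(B)                    # B^{1/2} of a diagonal matrix
W = Bh @ X @ A @ X.T @ Bh          # symmetric positive semi-definite

eig_U = np.sort(np.linalg.eigvals(U).real)
eig_W = np.sort(np.linalg.eigvalsh(W))
```

Since U = B^{−1/2}WB^{1/2}, the two matrices are similar, which is the substance of the proof.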
with λ ≠ 0. Then
Uu = XAXᵀBu = λu.
Put
v = XᵀBu.
Then
XAv = λu ≠ 0.
Hence v ≠ 0 and
M 0 λ
={:}· ' - ( : : ) - ·■( 0 -KJ
122 WHY COMPUTE EIGENVALUES?
3.6 CHEMISTRY
Φ = Σᵢ₌₁ cᵢχᵢ.
Ac = c.
β-Εσ
Thus E and c can be deduced from the eigenelements of the matrix A, which
corresponds to the graph of bonds between the carbon atoms.
The interest of this approach lies in the fact that the size of A is obviously reduced. However, although the results obtained by this method are qualitatively good, they are far less precise than those obtained by the method of the preceding section.
y(t, 0) = y(t, 1) = B/A.
The system (3.6.2) has the trivial stationary solution x = A, y = B/A. The linear stability of (3.6.2) around the equilibrium solution is studied by putting
The stability is therefore related to that of the Jacobian of the right-hand side of
(3.6.2), evaluated at the equilibrium solution.
Let J be the linear operator. There exists a stable periodic solution if the
eigenvalues of J with largest real part are pure imaginary (and semi-simple). The
reader will verify that
J = [ D₁ d²/dr² + B − 1    A²
      −B                   D₂ d²/dr² − A² ].
K: x(t) ↦ (Kx)(t) = ∫₀¹ k(t, s)x(s) ds  (0 ≤ t ≤ 1).
The scalar λₙ and the vector xₙ = [xₙ(sⱼ)]₁ⁿ are calculated by discretizing the variable t at the same points tᵢ = sᵢ. We obtain
Σⱼ₌₁ⁿ wⱼk(tᵢ, sⱼ)xₙ(sⱼ) = λₙxₙ(sᵢ)  (i = 1, …, n).
Let Aₙ be the matrix with elements wⱼk(tᵢ, sⱼ) (i, j = 1, …, n). Then λₙ and xₙ are solutions of
Aₙxₙ = λₙxₙ.
positive real parts and so on). The methods presented in Chapter 5 (dense
matrices of moderate size) and those of Chapters 6 and 7 (large sparse matrices)
will enable us to carry out these numerical calculations.
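The Fredholm (Nyström) approximation just described can be sketched for the illustrative kernel k(t, s) = min(t, s)(1 − max(t, s)), the Green kernel of −u″ with u(0) = u(1) = 0, whose exact eigenvalues are 1/(mπ)²; this kernel choice is an assumption of the sketch, not taken from the text.

```python
# Nystrom discretization: the eigenvalues of (A_n)_{ij} = w_j k(t_i, s_j)
# approximate the eigenvalues of the integral operator.
import numpy as np

n = 200
s = (np.arange(n) + 0.5) / n                   # midpoint nodes, t_i = s_i
w = 1.0 / n                                    # equal quadrature weights
k = np.minimum.outer(s, s) * (1.0 - np.maximum.outer(s, s))
An = w * k

lam = np.sort(np.linalg.eigvals(An).real)[::-1]
```

The largest computed eigenvalue approximates 1/π² ≈ 0.10132, with an error of the order of the quadrature error.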
EXERCISES
a constant matrix of order n. Let J be the Jordan form of A and let V be the
corresponding basis. Show that
u(t) = Ve^{Jt}V⁻¹u₀,
and determine the elements of e^{Jt}. In particular, discuss the case in which A is diagonalisable.
3.1.2 [C] Compute u(t) when the data in Exercise 3.1.1 are
/0
1 2\
A= 0 0 1 and un = 1
v1/
3.1.3 [D] Show that the system of differential equations proposed in Exercise
3.1.1 can be bounded or unbounded according to whether the eigenvalues of A
having a zero real part are semi-simple or not.
3.1.4 [D] Consider the one-dimensional heat equation
∂u/∂t = ∂²u/∂x²  (0 < x < 1, t > 0)
with the boundary conditions
u(t, 0) = u(t, 1) = 0  (t > 0)
and the initial condition
u(0, x) = f(x)  (0 ≤ x ≤ 1).
(a) Write down the problem that arises when the second derivative is discretized by finite differences:
∂²u(t, x)/∂x² ≈ [u(t, x − h) − 2u(t, x) + u(t, x + h)]/h².
φ(t) = (u₁(t), …, u_N(t))ᵀ,
uᵢ(t) being an approximate value of u(t, ih). Put tⱼ = jΔt (j = 0, 1, …) and let uₖʲ be an approximation to uₖ(jΔt). Integrate with respect to the time over [tⱼ, tⱼ₊₁] and put
ūₖʲ = (1/Δt) ∫_{tⱼ}^{tⱼ₊₁} uₖ(t) dt.
(c) Rewrite the system. Use the trapezium rule to evaluate approximately the
integrals
(1/(d − c)) ∫_c^d v(t) dt ≈ ½[v(c) + v(d)]  (c < d).
(d) Show that the system can now be written as
(A + (2/Δt)I)uʲ⁺¹ = (−A + (2/Δt)I)uʲ,
where
uʲ = (ū₁ʲ, …, ū_Nʲ)ᵀ.
(e) Given that the eigenvalues of A are
λₖ = (4/h²) sin²(kπh/2)  (k = 1, 2, …, N),
show that the sequence uʲ is bounded provided that h and Δt are sufficiently small.
3.1.5 [B:39] Discretize the equation
−(∂²u/∂x² + ∂²u/∂y²) = f in Ω = (0, 1)²,
u = 0 on Γ,
where Γ is the boundary of Ω, using finite differences and a step of h = 1/N for both x and y.
(a) Let uᵢⱼ be approximations of u(ih, jh) and fᵢⱼ = f(ih, jh). Show that the resulting system becomes
and x₀ = 1, y₀ = z₀ = 0. Put
uₖ = (xₖ, yₖ, zₖ)ᵀ,  u₀ = (1, 0, 0)ᵀ.
Determine the matrix A such that
uₖ₊₁ = Auₖ.
3.1.9 [D] Study the discretization through finite differences of the eigenvalue problem
−d²u(x)/dx² = λu(x)  (0 < x < 1),
with u(0) = u(1) = 0.
Calculate the exact solutions and compare them with the results furnished by a discretization at five points (associated matrix of order 3).
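A worked sketch of this comparison (numpy assumed): with h = 1/4 the discretized operator is (1/h²)·tridiag(−1, 2, −1) of order 3, whose eigenvalues undershoot the exact values (mπ)².

```python
# Compare the exact eigenvalues (m*pi)^2 of -u'' on (0,1) with those of
# the order-3 finite-difference matrix (five grid points, h = 1/4).
import numpy as np

h = 0.25
T = (1.0 / h**2) * (2 * np.eye(3) - np.eye(3, k=1) - np.eye(3, k=-1))
approx = np.sort(np.linalg.eigvalsh(T))
exact = np.array([(m * np.pi) ** 2 for m in (1, 2, 3)])
```

The discrete eigenvalues are (4/h²) sin²(mπh/2); for instance approx[0] = 16(2 − √2) ≈ 9.37 against π² ≈ 9.87.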
3.1.10 [B:11] Let T be a bounded linear operator in Hilbert space (H, ⟨·,·⟩). Let
V = (v₁, …, vₙ)
be an orthonormal system in H and let
S = lin V
be the subspace generated by V. The orthogonal projection on S is denoted by π. Consider the eigenvalue problem
Tφ = λφ,  0 ≠ φ ∈ H,  λ ∈ ℂ
and the approximation, named after Galerkin, that is associated with the subspace S:
π(Tφₙ − λₙφₙ) = 0,  0 ≠ φₙ ∈ S,  λₙ ∈ ℂ.
Show that the approximate problem is equivalent to a matrix problem
Au = ωu,  0 ≠ u ∈ ℂⁿ,  ω ∈ ℂ.
Determine the matrix A and the relation between u and φη.
P = (pᵢⱼ)
be the associated transition matrix. Suppose that the chain is irreducible and
non-periodic. Kolmogorov's equations
π = πP,  πe = 1,
have a unique solution π*. Jacobi's iteration can be written as
(a) Show that Jacobi's iteration corresponds to the power method applied to PT
with the normalization condition
π⁽ᵏ⁾e = 1,
where
e = (1, 1, …, 1)ᵀ.
Let {Ω(i): i = 1, …, p} be a partition of the set {1, 2, …, n}. With each state π of the chain such that πᵢ > 0 (i = 1, …, n) we associate a matrix P̃(π) (called the aggregated matrix) defined by
p̃ᵢⱼ(π) = [Σ_{k∈Ω(i)} Σ_{l∈Ω(j)} πₖpₖₗ] / [Σ_{k∈Ω(i)} πₖ]  (1 ≤ i, j ≤ p).
and let
p π*
π= Σ v - J n- GJn>
i=i Σ f
where
Gj= Σ <νϊ·
(c) Show that πe = 1. The new stationary state of the chain is defined by one Jacobi step
π̃ = πP.
(d) Show that
** = Σ ^ τ Σ MV* (K*<n).
^ΩΟ)
3.2.3 [D] We retain the notations of Exercise 3.2.2. Consider a Markov chain
which is almost completely reducible:
P = D + E,
where
D = diag(D₁, D₂, …, D_p),  ‖E‖₂ = ε;
Dᵢ is the transition matrix of an irreducible non-periodic chain, whose stationary state is denoted by πᵢ and satisfies the condition
πᵢe = 1.
Let π be a vector of blocks πᵢ. Consider one step in the aggregation/disaggregation method based on π:
p̃ᵢⱼ = Σ_{k∈Ω(i)} Σ_{l∈Ω(j)} πₖpₖₗ  (1 ≤ i, j ≤ p),
π̃e = 1,
where
π̃ = πP̃.
Show that
‖π − π*‖₂ = O(ε).
3.2.4 [C] A message has to go from a point A to a point B by passing through
n intermediary points. Suppose the message can take only two states: either 0 or
1. Each intermediary has a probability p of correctly transmitting the message received and a probability q = 1 − p of transmitting the opposite message. We say the system is in the state E(0) at the kth stage if the intermediary k transmits 0 to the next intermediary, and that it is in the state E(1) if the intermediary transmits 1.
(a) Prove that the sequence of observed states is a Markov chain.
(b) Calculate the transition matrix.
(c) Calculate the probability of receiving the correct message at B and determine
the limit of this probability when the number, n, of intermediaries tends to
infinity.
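A numerical sketch of parts (b) and (c), taking p = 0.75 purely for illustration (the exercise's numerical value is not legible in this copy):

```python
# Two-state chain {E(0), E(1)}; P[i, j] is the probability that a relay
# receiving symbol i forwards symbol j. With p the probability of
# correct transmission, the probability of receiving the original
# message after n relays tends to 1/2 as n grows.
import numpy as np

p = 0.75                      # assumed value, for illustration only
P = np.array([[p, 1 - p],
              [1 - p, p]])    # transition matrix

def prob_correct(n):
    """Probability that the message is unchanged after n relays."""
    return float(np.linalg.matrix_power(P, n)[0, 0])
```

In closed form prob_correct(n) = 1/2 + (2p − 1)ⁿ/2, which makes the limit 1/2 apparent for any 0 < p < 1.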
unit. Let Ã be the new matrix obtained in this way. Prove that
Ã = D⁻¹AD,
where
D = diag(d₁, …, dₙ).
3.3.2 [B:37] Let Ā be the matrix of technical coefficients defined in monetary units. Suppose there exists a price system that makes each branch of the economy profitable. Show that
ρ(Ā) < 1
and hence ρ(A) < 1, where A is the matrix of coefficients defined in whatever system of units.
3.3.3 [B:37] Suppose the number of employees in the branch j of the economy
decreases, thereby causing the intensity of work in this branch to increase. Prove
that the rate of profit and the rate of growth will then increase.
3.3.4 [B:37] In the Marx-von Neumann model wages are indexed by prices:
w = pd,
where d is the employees' basket of consumption goods. Suppose that d is varied
by an amount Ad.
(a) Show that an increase of consumption by the employees is equivalent to an increase in the wage costs.
(b) Show that an increase of the wage costs implies a decrease in the rates of
profit and growth.
3.3.5 [B:5] We present here what is known as the closed Leontiev model. The
set of goods is supposed to be equal to the set of products. The matrix A of
technical coefficients is a non-negative square matrix. If x is the vector of products
and y is the vector of goods, then
y = Ax.
The system is viable if y ≤ x, and the equilibrium of the quantities is given by
(I − A)x = 0  (x ≥ 0).
(a) Determine a sufficient condition for equilibrium when the matrix A is irreducible.
Let p be the row vector of the prices of the goods. The row vector of the
costs of manufacturing the goods is therefore
c = pA.
Hence the equilibrium of the prices is given by
p(I − A) = 0  (p > 0).
(b) Show that, when A is irreducible, the equilibrium of the prices is equivalent
to the equilibrium of the quantities.
3.3.6 [B:37] Next, we shall present the open model of Leontiev. We now have
n goods which are also products, but there exists a type of goods which is not a
product (in general, the work). The matrix A of technical coefficients is non-negative and irreducible. The net produce is given by the equation
q = (I-A)x.
(a) Given a demand vector c ≥ 0, determine a sufficient condition for the existence of a vector x ≥ 0 such that
q = c.
(b) Prove that if there exists a row vector p > 0 such that
pA < p,
then (I − A)⁻¹ exists and is positive.
(c) Examine and interpret the sequence
x⁽⁰⁾ = c,  x⁽ᵏ⁺¹⁾ = c + Ax⁽ᵏ⁾  (k ≥ 0),
when the dominant eigenvalue λ* of A is such that 0 < λ* < 1.
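The sequence in part (c) can be examined numerically; the matrix and demand vector below are made-up data with dominant eigenvalue λ* = 0.5, so the iterates converge to (I − A)⁻¹c, the sum of the Neumann series.

```python
# x^(0) = c, x^(k+1) = c + A x^(k): the iterates are the partial sums
# of the Neumann series c + Ac + A^2 c + ... = (I - A)^{-1} c,
# convergent when rho(A) < 1.
import numpy as np

A = np.array([[0.2, 0.3],
              [0.4, 0.1]])   # non-negative, irreducible, rho(A) = 0.5
c = np.array([1.0, 2.0])

x = c.copy()
for _ in range(200):
    x = c + A @ x
```

Each iterate adds one more round of indirect input requirements; the limit is the total (direct plus indirect) production needed to deliver the net demand c.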
3.3.7 [B:37] In the growth model of von Neumann, the production is defined
by two matrices:
A = coefficient matrix of the goods,
B = coefficient matrix of the products,
where it is supposed that there are m techniques for the production of n goods.
During a given period of time we consider a column vector x ∈ ℝᵐ of the activity level of the techniques and a row vector p, pᵀ ∈ ℝⁿ, of the prices of the goods. If α
is the growth rate and β is the interest rate, then
(B − αA)x ≥ 0  (x ≥ 0),
p(B − βA) ≤ 0  (p ≥ 0).
(a) Show that the surplus has zero price and that, if the profit is less than the
interest rate, the activity level is nil.
(b) Show that, if the technology (B, A) consists of non-negative irreducible matrices, there exists a unique number α* = β* > 0 such that
α*Ax ≤ Bx  (x > 0),
β*pA ≥ pB  (p > 0)
and
p(α*A − B)x = 0.
(c) What can be said about the maximum rate of growth in relation to the
minimum rate of interest?
3.3.8 [C] We shall treat here the case of a farmer whose economy is confined to
the raising of chickens. We are concerned with two goods (chickens and eggs)
and two processes (laying and brooding). It will be assumed that a laying hen
will lay a dozen eggs per month, while a brooding hen will hatch four eggs per
month.
(a) Show that the matrices A and B of Exercise 3.3.7 are in this case
(b) Study the farmer's situation at the end of two months after he started with
three chickens and eight eggs.
(c) Repeat part (b) for the case when he started with two chickens and four eggs.
(d) Calculate the rate of growth when the economy is balanced.
(e) Study the balance of prices when one chicken is worth 10 units and an egg is
worth 1 unit.
(f) Repeat part (e) for the case when the price of a chicken is 6 units and that of
an egg is 1 unit, and calculate the rate of interest.
3.3.9 [B:37] Consider the Marx-von Neumann model defined in Section 3.3
(page 117): we formalize here the process of absolute price formation and hence
the propagation of inflation. Given two vectors
x = (x₁, …, xₙ) and y = (y₁, …, yₙ)
we define the vector
z = (z₁, …, zₙ) = max{x, y}
by putting
zᵢ = max{xᵢ, yᵢ}  (i = 1, …, n).
Let s be the marginal rate and put
λ = 1/(1 + s).
Define the following sequence of row vectors:
pₖ₊₁ = max{pₖ, λpₖB}.
This formalizes the 'cliquet' (ratchet) effect, that is, the downward rigidity of prices.
(a) Show that if B is irreducible and if p = (p⁽¹⁾, …, p⁽ⁿ⁾) > 0 is the price system,
1-,—'-
1+r
and
P(i)
a = max -£-.
which is the delay of one year between the evolution of consumption and revenue
(with vital minimum zero).
(a) Taking dₖ = 1 for all k, show that the national revenue satisfies the equation
rₖ₊₂ − s(1 + v)rₖ₊₁ + svrₖ = 1.
(b) Study the solution of this equation in relation to the values of (s, v). It is convenient to consider the four regions in the (s, v) plane defined by the curves
s = 1/v  and  s = 4v/(1 + v)².
3.3.11 [C] Consider an economy that is divided into N regions. We study the
interregional movements of the immigrant workforce. Its redistribution during
the period k (0 ≤ k ≤ T) is given by a row vector
xₖ = (xₖ₁, …, x_{kN}),
where xₖⱼ is the effective workforce of the immigrant population in the region j during the period k. We suppose that the workers freely change their location according to taste and different market conditions for work. Let Aₖ = (aᵢⱼ⁽ᵏ⁾) be the 'matrix of migration', where aᵢⱼ⁽ᵏ⁾ is the rate of migration of workers from the region i to the region j during the period k.
(a) Show that
xₖ₊₁ = x₀A₀A₁ ⋯ Aₖ.
(a) Show that the associated eigenvectors uᵢ, wᵢ, vᵢ and zᵢ satisfy the equations
uᵢ = B^{−1/2}wᵢ,  vᵢ = A^{−1/2}zᵢ.
(b) Show that the SVD (Exercise 1.6.8) of E is
E = Σᵢ₌₁ √λᵢ zᵢwᵢᵀ,
where max{k, n} ≤ N, and define X = SᵀT, A⁻¹ = TTᵀ, B⁻¹ = SSᵀ. Let θᵢ be the ith canonical angle between the subspaces lin Sᵀ and lin Tᵀ in ℝᴺ. Show that
√λᵢ = cos θᵢ  (i = 1, …, k).
3.4.4 [B:14] In the method of correspondence analysis the matrix X of order
k x n represents a contingency table:
xᵢⱼ ≥ 0  and  Σᵢ,ⱼ xᵢⱼ = 1,
A = diag(aᵢ⁻¹),  B = diag(bᵢ⁻¹).
(c) Show that U₀ is associated with the triplet (X₀A, A⁻¹, B) as well as with the triplet (X₀, A, B).
(d) Interpret factorial correspondence analysis of the columns of X as determining the principal inertial axes in ℝᵏ equipped with the norm B of the set of points defined by the columns of X′ = X₀A weighted by {aᵢ}₁ⁿ. The row analysis is obtained by duality.
3.4.5 [B:14] Continuing Exercise 3.4.4, assume now that xᵢⱼ represents the probability of the event {I = i and J = j}, where I and J are two discrete random
variables. Suppose that the vectors f and g of Exercise 3.4.2 represent the corresponding functions
f: {1, …, k} → ℝ,
g: {1, …, n} → ℝ.
Associated with the variables I and J we have the functions f(I) and g(J). Establish the following results:
fᵀXg = 𝔼[f(I)g(J)],
fᵀB⁻¹f = 𝔼[f²(I)].
Show that the method determines uᵢᵀS, the linear combination of the components of S that has the greatest variance and is uncorrelated with uⱼᵀS when j < i. In this situation, is it necessarily true that λᵢ ∈ [0, 1]?
For the geometric interpretation, we consider n weights {aⱼ}₁ⁿ such that Σⱼ₌₁ⁿ aⱼ = 1 and n vectors {Sⱼ}₁ⁿ in ℝᵏ, centred with respect to the {aⱼ}₁ⁿ. Set X = [S₁, …, Sₙ] and A = diag(aⱼ); the norm B on ℝᵏ is given. The principal component analysis (PCA) of the set of n points {Sⱼ}₁ⁿ finds the principal inertial axes in ℝᵏ, normed with B, of the n points weighted by {aⱼ}₁ⁿ. The two particular choices B = I [respectively B = diag(bᵢ⁻¹)] lead to the unitary PCA [respectively the normalized PCA with bᵢ = Σⱼ₌₁ⁿ aⱼSᵢⱼ², which is the empirical variance].
3.4.7 [B:14] Consider the method of discriminant analysis. We retain the notations of Exercises 3.4.1 and 3.4.2. Let S be a random centred vector in ℝᵏ and
let J be an integer random variable. Put π(j) = Prob(J = j). We define
π = (π(1), …, π(n)),
A⁻¹ = diag(π(j)),
B⁻¹ = 𝔼(SSᵀ).
(a) Show that
fᵀB⁻¹f = σ²(fᵀS),
gᵀA⁻¹g = 𝔼[g²(J)].
(b) Show that sp(U) ⊂ [0, 1].
(c) Show that ρ(f, g) is the correlation coefficient of fᵀS and g(J). The canonical
h = 1/(n + 1). Let uᵢ denote the approximation for u(ih) and put aᵢ₊₁/₂ = a((i + ½)h). The discretized problem can be written as
u₀ = uₙ₊₁ = 0.
(b) Show that this discretization is equivalent to a matrix problem
Au = λu.
Determine the matrix A.
3.5.2 [B:66] The lower ends of the two bars in Figure 3.5.1 are equipped with springs in such a way that, in the absence of forces, the positions of the bars are in the same vertical line. A downward vertical force F is applied to the upper end of the second bar, which causes the angles θ₁ and θ₂ to appear. Both bars are of length l and mass m, and the two springs have the same characteristic constant k.
(a) Show that, if the force of gravity is neglected, the kinetic energy K and the potential energy V are given by
K = (ml²/6)[4θ̇₁² + 3θ̇₁θ̇₂ cos(θ₁ − θ₂) + θ̇₂²],
V = ½kθ₁² + ½k(θ₂ − θ₁)² − Fl(2 − cos θ₁ − cos θ₂).
(b) Write down Lagrange's equations
(d/dt)(∂K/∂θ̇ᵢ) − ∂K/∂θᵢ + ∂V/∂θᵢ = 0  (i = 1, 2)
for this case. Discuss the solution that corresponds to the initial perturbed
Figure 3.5.1
conditions
θᵢ(0) = εaᵢ,  θ̇ᵢ(0) = εbᵢ  (i = 1, 2).
Assume the existence of a solution θᵢ(t, ε) which is differentiable with respect to ε at ε = 0. Put
φᵢ(t) = ∂θᵢ(t, ε)/∂ε |_{ε=0}.
(c) Show that φᵢ satisfies the equation
Σⱼ (∂²K/∂θ̇ᵢ∂θ̇ⱼ)φ̈ⱼ + Σⱼ (∂²V/∂θᵢ∂θⱼ)φⱼ = 0  (i = 1, 2).
(d) Write the preceding system of differential equations in matrix form:
Bφ̈ + Aφ = 0,
and show that A and B are symmetric and that B is positive definite.
(e) Show that if F > 0 and k > 0, then the roots of the polynomial
p(λ) = det(A − λB)
are real and distinct.
3.5.3 [B:66] Generalize Problem 3.5.2 to the case of n bars. In this case the kinetic and potential energies (neglecting gravity) are given by
K = (ml²/12) Σᵢ,ⱼ₌₁ⁿ (6n + 3 − 6 max{i, j} − δᵢⱼ) θ̇ᵢθ̇ⱼ cos(θᵢ − θⱼ).
3.5.4 [B:66] Consider an elastic solid. When the equations of elasticity are linearized, they take the form
σᵢⱼ = ½ Σₖ,ₗ₌₁³ Aᵢⱼₖₗ (∂uₖ/∂xₗ + ∂uₗ/∂xₖ),
Aᵢⱼₖₗ = Aⱼᵢₖₗ = Aᵢⱼₗₖ = Aₖₗᵢⱼ,
Σⱼ₌₁³ ∂σᵢⱼ/∂xⱼ = ρ ∂²uᵢ/∂t²,
where u(x) is the displacement vector and p(x) is the density of the material.
Write down the system of differential equations that enable us to determine
the normal modes of vibration of the form
u(x, t) = exp(−i√λ t) w(x).
3.5.5 [D] Consider the vibrations of an elastic disk (homogeneous and isotropic) whose normal displacement component u(x, t) is a solution of
∂²u/∂t² + Δ²u = 0,
where Δ² is the biharmonic operator (the Laplacian applied twice). The method of the separation of variables yields two normal modes of vibration of the form
u(x, t, λ) = exp(i√λ t) w(x).
Write down the equation satisfied by w.
3.5.6 [A] Consider the differential equation
Mu″ + Bu′ + Ku = 0,
whose unknown u is a vector function of a real variable t > 0. Find a solution u
of the form
u(t) = e^{λt} φ,
where φ is a constant vector. Prove that the pair (λ, φ) satisfies the equation
(λ²M + λB + K)φ = 0.
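In the scalar case M = m, B = b, K = k (with φ = 1), the pencil reduces to the quadratic mλ² + bλ + k = 0, so the claim is easy to check numerically. A minimal sketch; the coefficients used are hypothetical:

```python
import cmath

def quadratic_eigenvalues(m, b, k):
    """Roots of m*lam**2 + b*lam + k = 0, the scalar case of
    (lam**2 M + lam B + K) phi = 0."""
    disc = cmath.sqrt(b * b - 4 * m * k)
    return [(-b + disc) / (2 * m), (-b - disc) / (2 * m)]
```

Each root λ satisfies the pencil identically: mλ² + bλ + k = 0.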
3.5.7 [D] Consider the differential equation
Mu″ + Bu′ + Ku = 0
with the initial conditions
u(0) = u₀,   u′(0) = u₁.
Define the polynomial
p(λ) = det(λ²M + λB + K).
Suppose that M, B and K are Hermitian. Prove that:
(a) If M, B and K are positive semi-definite and if M and K are positive definite,
then no root of p(λ) has a positive real part.
(b) If M and K are positive semi-definite and B is positive definite, then λ = 0 is
the only root with a zero real part.
3.5.8 [B:22] We present here a method known as static condensation in relation
to the problem
(P) Kq = ω²Mq,   0 ≠ q ∈ Cⁿ,
which models the natural frequencies and modal forms of a structure considered
globally; K is the rigidity matrix and M is the matrix of masses (page 121).
We choose a subset qc of coordinates that are to be eliminated and we denote
by qR the subset of coordinates that are to be retained. This induces a partitioning
M̃_RR = M_RR − M_RC K_CC⁻¹ K_CR − K_RC K_CC⁻¹ M̃_CR,
M̃_CR = M_CR − M_CC K_CC⁻¹ K_CR.
(b) Show that q_D satisfies the equation
(K_CC − ω²M_CC) q_D = ω²M_CR q_R.
Let q_R = 0 and let (μᵢ, φᵢ) be the solutions of
K_CC φ = μ²M_CC φ   (φ ≠ 0).
Suppose that
μ₁² ≤ μ₂² ≤ ⋯
and
φᵢ* M_CC φⱼ = δᵢⱼ.
Let ε > 0 be the order of magnitude of the error acceptable for the modal
forms associated with the low frequencies.
(c) Show that the method of static condensation furnishes acceptable approximations
for the solutions (ω, q) such that
ω² = εμ₁² ≪ 1.
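Static condensation can be illustrated on the smallest possible case, one retained and one condensed coordinate. This sketch uses the standard Guyan transformation q_C = −K_CC⁻¹K_CR q_R; the 2 × 2 data in the example are hypothetical:

```python
def guyan_condense(K, M):
    """Static (Guyan) condensation of a 2x2 pencil (K, M) with one retained
    coordinate (index 0) and one condensed coordinate (index 1).
    Returns the condensed scalars (K_tilde, M_tilde)."""
    kRR, kRC, kCR, kCC = K[0][0], K[0][1], K[1][0], K[1][1]
    mRR, mRC, mCR, mCC = M[0][0], M[0][1], M[1][0], M[1][1]
    t = -kCR / kCC                      # static relation q_C = t * q_R
    K_t = kRR + kRC * t                 # = kRR - kRC * kCC**-1 * kCR
    mCR_t = mCR + mCC * t               # condensed mass coupling
    M_t = mRR + mRC * t + t * mCR_t     # Guyan-reduced mass
    return K_t, M_t
```

For K = (2 −1; −1 2) and M = I this gives K̃ = 1.5 and M̃ = 1.25, so ω² ≈ 1.2, an overestimate of the exact lowest eigenvalue 1, as expected of a Ritz-type reduction.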
3.6.2 [D] Verify that the Jacobian of the right-hand side of (3.6.2) (page 124) is
given by
d2 \
A2
J =
d2
-B D
^
\ 1
(a) Determine the associated Green kernel and formulate the eigenvalue problem
associated with the integral operator.
(b) Show that the discretization by finite differences for the differential problem
is equivalent to Fredholm's approximation for the integral problem.
3.7.2 [B:6,12] We present here what is known as the collocation method by
discussing an example. Let B = C⁰[0, 1] be the Banach space of functions
x : t ∈ [0, 1] ↦ x(t) ∈ C
which are continuous on [0, 1] and are equipped with the uniform norm
‖x‖ = max_{0≤t≤1} |x(t)|.
e₁(t) = 1 − t/h          if 0 ≤ t ≤ h,
e₁(t) = 0                otherwise;
when j = 2, …, n − 1:
eⱼ(t) = 1 − (1/h)|t − tⱼ|   if tⱼ₋₁ ≤ t ≤ tⱼ₊₁,
eⱼ(t) = 0                   otherwise;
eₙ(t) = 1 − (1 − t)/h    if 1 − h ≤ t ≤ 1,
eₙ(t) = 0                otherwise.
f(x) = ∫₀¹ x(t) dt.
Let T : C⁰[0, 1] → C⁰[0, 1] be the integral operator
(Tx)(t) = ∫₀¹ k(t, s) x(s) ds
having a continuous kernel k.
We define the Nyström approximation of T associated with the given quadrature
formula, namely
(Tₙx)(t) = Σⱼ₌₁ⁿ ωⱼₙ k(t, tⱼₙ) x(tⱼₙ).
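The finite-dimensional part of the Nyström approximation is the matrix (ωⱼₙ k(tᵢₙ, tⱼₙ))ᵢⱼ, whose eigenvalues approximate those of T. A minimal sketch, using hypothetical trapezoidal nodes and weights on [0, 1]:

```python
def nystrom_matrix(kernel, nodes, weights):
    """Matrix with entries w_j * k(t_i, t_j); its eigenvalues
    approximate those of the integral operator T."""
    n = len(nodes)
    return [[weights[j] * kernel(nodes[i], nodes[j]) for j in range(n)]
            for i in range(n)]

# Constant kernel k = 1: T has the single nonzero eigenvalue 1 (constant
# eigenfunction), and every row of the Nystrom matrix sums to sum(weights) = 1.
A = nystrom_matrix(lambda t, s: 1.0, [0.0, 0.5, 1.0], [0.25, 0.5, 0.25])
```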
max_{1≤j≤n} |xⱼₙ| = 1.
φₙ(t) = Σⱼ₌₁ⁿ ωⱼₙ k(t, tⱼₙ) xⱼₙ
Error Analysis
This chapter begins with a topic of great practical importance, namely the
stability of a spectral problem and the notion that is derived from it: the spectral
conditioning for a set of distinct eigenvalues and for the associated invariant
subspace. This will be considered in the most general case involving a non-
normal matrix and defective eigenvalues.
The analysis of a priori errors is based on spectral theory, which enables us
to give concise and elegant proofs. The analysis of a posteriori errors furnishes
bounds that are fairly easy to calculate as a function of the residual matrix
AU − UC constructed upon the matrix C of order m and the m vectors of U.
There exists a perturbation ΔA, with
‖ΔA‖/‖A‖ = 1/cond A,
such that A + ΔA is singular (see Exercise 4.1.1; also see Chapter 1, Section 1.12).
Before solving a linear system it is useful in practice to scale the matrix in order
to reduce its condition number. The scaling process of A consists in finding
As we have seen, the stability of the solution of a linear system depends on the
regularity of A. The situation is much more complex for the problem of eigenvalues.
The notion that corresponds to the regularity of A is the property of A to be
diagonalisable or non-defective.
is defective. Put
A(ε) = ( 2  1 )
       ( ε  2 )   (ε > 0).
The eigenvalues of A(ε) are
λ₁(ε) = 2 + √ε,   λ₂(ε) = 2 − √ε,
so that
dλ₁(ε)/dε = 1/(2√ε),   dλ₂(ε)/dε = −1/(2√ε).
The rate of change of the eigenvalues at 0 is infinite. However, A(ε) is much nearer
to non-diagonalisability than might be predicted from the distance between the
eigenvalues:
‖A − A(ε)‖ = ε   and   λ₁(ε) − λ₂(ε) = 2√ε ≫ ε
for small ε.
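The square-root behaviour can be checked directly. Assuming the perturbed matrix A(ε) = (2 1; ε 2), consistent with the eigenvalues 2 ± √ε stated above, its characteristic polynomial is (2 − λ)² − ε, so the shift is exactly √ε. A small numerical sketch:

```python
import math

def eigs_of_A(eps):
    """Eigenvalues of A(eps) = [[2, 1], [eps, 2]]: roots of (2 - lam)**2 = eps."""
    s = math.sqrt(eps)
    return 2 - s, 2 + s

# An entrywise perturbation of size 1e-8 moves the eigenvalues by 1e-4,
# i.e. 10**4 times the size of the perturbation.
lo, hi = eigs_of_A(1e-8)
```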
Figure 4.2.1
Let ξ be the acute angle between the eigendirections lin(x) and lin(x*)
(see Figure 4.2.1). If [x, Q] is a unitary basis of Cⁿ such that Q is a basis of
M^⊥, then
Σ₁ = Q(B − λI)⁻¹Q*,
where
B = Q*AQ.
Put
δ = dist[λ, sp(A) − {λ}].
Figure 4.2.2
Definitions
(a) The spectral condition number of the simple eigenvalue λ is defined by
csp(λ) = ‖x*‖₂.
(b) The spectral condition number of the eigendirection lin(x) is defined by
csp(x) = ‖Σ₁‖₂.
We recall that
‖x*‖₂ = ‖P‖₂ = (cos ξ)⁻¹
and
δ⁻¹ ≤ ‖(B − λI)⁻¹‖₂ = ‖Σ₁‖₂ ≤ 2 cond₂(V) δ^{−ℓ},
where ℓ is the index of the eigenvalue of B that is nearest to λ, provided that δ is
sufficiently small (see Exercise 4.2.1), and where V is the Jordan basis for B.
When B is diagonalisable,
δ⁻¹ ≤ ‖Σ₁‖₂ ≤ cond₂(V) δ⁻¹,
where V is the basis of the eigenvectors of B. Then
(a) If there exists a vector of M that is almost parallel to x (that is, if ξ is almost
a right angle), then λ is ill-conditioned.
(b) If δ is small and/or cond₂(V) is large, then x is ill-conditioned.
When A is Hermitian or normal, then
M₁ = M^⊥,   ‖x*‖₂ = 1   and   ‖Σ₁‖₂ = δ⁻¹.
In this case, the sole cause for the ill-conditioning of lin(x) is δ, the distance of λ
from the rest of the spectrum of A. For an arbitrary matrix, the departure of B
from normality also plays a part (see Chapter 1, Section 1.12).
The matrix
T = (  a            ε )
    ( −(b − a)²/4ε  b )
has a defective double eigenvalue ½(a + b), and the Jordan basis is given by
V = (  1           0   )
    ( (b − a)/2ε   1/ε ).
If (1/ε)(b − a) is moderate, then cond₂(V) is moderate. This condition number
is large when (1/ε)(b − a) is large; in this case V is of rank unity up to the term
ε/(b − a), and the departure of T from normality is (b − a)/ε and is therefore
large.
Figure 4.2.3
Figure 4.2.4
(b) a′₂₂ = 179.997 769: the eigenvalues are now 1.550 945 6 ± i × 7.999 21 × 10⁻² and
2.895 877 9 (Figure 4.2.4).
It can be seen that a small perturbation of a₂₂ can produce very different
perturbations of the eigenvalues. It can be verified that csp(λᵢ) ≈ 10³ (i = 1, 2, 3).
Then
P = QX*
is the spectral projection on M.
Proposition 4.2.2 If the matrix A is subjected to the perturbation ΔA, then σ and
M become σ′ and M′ respectively, which (to the first order) are defined as follows:
(a) σ′ is the spectrum of B′ = B + X*ΔA Q.
(b) M′ has the basis X′, normalized by Q*X′ = Iₘ, such that X′ = Q − Σ₁ΔA Q.
Property 4.2.3 For each λ′ ∈ σ′, there exists λ ∈ σ of index ℓ such that
|λ′ − λ| ≤ 2[ cond₂(V) ‖X*ΔA Q‖₂ ]^{1/ℓ}
for sufficiently small ‖ΔA‖₂.
λ̄ = (1/m) Σ_{μ∈σ} μ,   λ̄′ = (1/m) Σ_{μ∈σ′} μ.
Definitions
(a) The global spectral condition number of σ is defined as
csp(σ) = cond₂(V) ‖X*‖₂.
(b) The spectral condition number of the invariant subspace M is defined as
csp(M) = ‖Σ₁‖_F.
These definitions are independent of the choice of bases in M and M^⊥. We
recall that
‖X*‖₂ = ‖P‖₂ = ‖(cos Θ)⁻¹‖₂ = (cos θ_max)⁻¹
and that
‖Σ₁‖_F = ‖(B₁, B)⁻¹‖_F ≥ δ⁻¹.
In the special case, in which A is Hermitian or normal, we obtain the following
simplifications:
X* = Q,   V is unitary,   cond₂(V) = 1,
Θ = 0,   P = P*,   S = Σ₁,   ‖S‖_F = δ⁻¹.
Example 4.2.4 Consider the special case in which σ contains a single eigenvalue
λ of multiplicity m; then λ is globally well-conditioned when cond₂(V)‖X*‖₂ is
moderate (note that we may put V = Iₘ if λ is semi-simple).
A defective eigenvalue is ill-conditioned when cond₂(V)‖X*‖₂ is large (see
Exercise 4.2.2). On the other hand, it may be well-conditioned in contrast to the
case in which it is treated individually (see Exercise 4.2.1).
Consider the matrix
A = ( 1   1        )
    ( 0   1 − 10⁻⁴ ),
whose basis of eigenvectors is
V = ( 1    1     )
    ( 0   −10⁻⁴  ),
so that cond₂(V) ~ 10⁴.
It is easy to verify that
A′ = (  1         1        )
     ( −10⁻⁸/4    1 − 10⁻⁴ )
is defective; it has 1 − 10⁻⁴/2 as a double eigenvalue and the Jordan basis
V′ = (  1          0      )
     ( −10⁻⁴/2    10⁻⁴/2  ),
for which cond₂(V′) ~ 10⁴.
We remark that the departure from normality of A is of the order 10⁴ and that
the relative distance between the eigenvalues is δ/‖A‖ ~ 10⁻⁴.
Let
A = ( 1   10⁴ )
    ( 0    2  )
and
Δ = ( 1   0    )
    ( 0   10⁻⁴ ).
Then
A′ = Δ⁻¹AΔ = ( 1   1 )
             ( 0   2 ).
It can be verified that the balancing of A by Δ has diminished the condition
number of the basis of eigenvectors, as well as the departure from normality of
A (see Exercise 4.2.3).
has the separated eigenvalues {1, 0, ½}. The eigenvalues 1 and 0 are ill-conditioned
(condition number of the order 10⁴) while the corresponding eigenvectors
(1, 0, 0)ᵀ and (1, −10⁻⁴, 0)ᵀ
are well-conditioned (condition number of the order 1). In fact, the matrix
A′ = ( 1           10⁴   0 )
     ( 1.1 × 10⁻⁵   0    0 )
     ( 2 × 10⁻⁵     0    ½ )
has the eigenvalues {1.1; −0.1; ½}. The first two eigenvectors are (1, 10⁻⁵, ⅓ × 10⁻⁴)ᵀ
and (1, −1.1 × 10⁻⁴, −⅓ × 10⁻⁴)ᵀ.
When we group the two ill-conditioned eigenvalues into σ = {0, 1}, we find
that csp(σ) ~ 10⁴ and csp(M) ~ 10⁴, where M = lin(e₁, e₂) is the associated
invariant subspace. This is due to the fact that the matrix
B = ( 1   10⁴ )
    ( 0    0  )
has as its matrix of eigenvectors
V = ( 1    1     )
    ( 0   −10⁻⁴  ),
which has the property that cond₂(V) ~ 10⁴.
Denote by X′ the basis of the invariant subspace associated with σ′ = {−0.1;
1.1} and normalized by Q*X′ = I, where Q = [e₁, e₂]. The reader can verify that
X′ = ( 1          0   )
     ( 0          1   )
     ( 10⁻³/36   5/9  )
(see Exercise 4.2.7). The grouping did not improve the spectral condition numbers,
because cond₂(V) remains unchanged. One may ponder on the apparent paradox
that makes it possible for two well-conditioned eigenvectors to generate an
ill-conditioned invariant subspace.
It should be remarked that the grouped eigenvalues {0, 1} are not consecutive:
the eigenvalue ½ lies between them. The relative distance 1/‖A‖ ~ 10⁻⁴ is small.
The matrix Sᵥ is block upper triangular, with the 2 × 2 diagonal blocks
(  xₖ   yₖ )
( −yₖ   xₖ )   (k = 1, …, p,  n = 2p)
and with the parameter ν in the entries above the diagonal blocks.
It is easy to check that the d.f.n. of Sᵥ is larger than √(n − 2) ν², so that it increases
as the parameter ν increases, when n is fixed.
The real values xₖ, yₖ have been chosen so that the eigenvalues xₖ ± iyₖ lie on
the parabola x = −10y², that is
xₖ = −(2k − 1)²/1000,   yₖ = (2k − 1)/100   (k = 1, …, p).
Now let Aᵥ = QSᵥQ, where Q is the symmetric orthonormal matrix consisting of
the eigenvectors of the second-order difference matrix:
qᵢⱼ = √(2/(n + 1)) sin( ijπ/(n + 1) )   (i, j = 1, …, n).
The matrix Aᵥ has the same spectrum and d.f.n. as Sᵥ. For ν = 1, 10, 10², 10³,
and n fixed equal to 20, the following computations on Aᵥ were performed by
means of the QR algorithm (see Chapter 5), under MATLAB* on a workstation
working with a machine precision of the order of 2 × 10⁻¹⁶. Figure 4.2.5 shows
the exact (+) and computed (°) spectra of the four 20 × 20 matrices Aᵥ. The
increasing instability of the spectrum is clear as ν increases. The * represents the
exact and computed means of the eigenvalues; they are equal within machine
precision, as a consequence of Corollary 4.2.5.
Apart from two of them, 18 of the 20 computed eigenvalues lie, for ν = 10² and
10³, on a circle centred at this arithmetic mean. This suggests that the matrix Aᵥ
behaves approximately like one Jordan block of size 18, completed by two
diagonal elements.
In order to test this hypothesis, we compute the spectra of a sample of matrices
A' = A + A A, randomly perturbed from A, in the following componentwise way.
♦MATLAB is a numeric computation system, trade mark of The Math Works Inc.
STABILITY OF A SPECTRAL PROBLEM 163
Figure 4.2.5
Thus aᵢⱼ becomes a′ᵢⱼ = aᵢⱼ(1 + αt), where α is a random variable taking the values
±1 with probability ½, and t = 2⁻ᵏ, the integer k varying from 40 to 50. Therefore
|ΔA| = t|A|, where t ranges from 2⁻⁴⁰ ~ 10⁻¹² to 2⁻⁵⁰ ~ 10⁻¹⁵. For each t, the
sample size is 30, so that the total number of matrices is 30 × 11 = 330. The
superposition of the corresponding 330 spectra is plotted in Figure 4.2.6. The
transformation of the perturbed spectra as ν varies is dramatic. The 36 spikes
around 1 for ν = 10² and 10³ confirm the hypothesis that Aᵥ becomes computationally
close to a Jordan form as ν increases. The computed eigenvalues λ′ are
solutions of (λ′ − λ)¹⁸ = ε = O(t), ε being positive or negative with equal probability.
The computed eigenvalues appear at the vertices of two families of regular
polygons with 18 sides, which are symmetric with respect to the vertical axis—
Figure 4.2.6 Perturbed spectra
hence the 36 spikes. Note that, although the spectra for ν = 10² and 10³ are
qualitatively similar, the one for ν = 10³ is more ill-conditioned than the one for
ν = 10². The fact that the matrix approximately behaves like a Jordan block of
size 18 (rather than 20, for example) is a consequence of the particular structure
of Sᵥ. It remains true under normwise perturbations (Fraysse 1992).
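The polygon structure described above is easy to reproduce: the m-th roots of ε, shifted to the mean, are the exact solutions of (λ′ − λ)ᵐ = ε. A sketch, with λ = 1 and m = 18 as in the example:

```python
import cmath

def perturbed_jordan_eigs(lam, m, eps):
    """The m solutions of (lam' - lam)**m = eps: a regular m-gon of
    radius |eps|**(1/m) centred at lam."""
    r = abs(eps) ** (1.0 / m)
    phase = cmath.phase(complex(eps)) / m
    return [lam + r * cmath.exp(1j * (phase + 2 * cmath.pi * k / m))
            for k in range(m)]
```

For ε = 2⁻⁴⁰ and m = 18 the radius is (2⁻⁴⁰)^{1/18} ≈ 0.21, which explains why a perturbation at roundoff level spreads the computed eigenvalues over a clearly visible circle.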
The perturbed spectra are part of the componentwise ε pseudo-spectrum, with
ε = 2⁻⁴⁰. However, the information on the underlying Jordan structure, in finite
arithmetic, of Aᵥ (which is diagonalisable in exact arithmetic) is too detailed to
be retrievable by looking at the global pseudo-spectrum. The interested reader
can find other suggestive examples in Chatelin (1989).
Example 4.2.11 shows that truly difficult problems arise when the d.f.n. is
A PRIORI ANALYSIS OF ERRORS 165
unbounded under the variation of some parameter. The parameter can be merely
the size n of the matrix, as is the case when the matrix is the discretization of a
highly non-normal operator.
The true difficulty of computing the spectrum in the absence of normality has
been known to mathematicians for a long time. Its practical implications should
not be overlooked, since such examples appear in essential industrial applications,
as well as in physics—fluid dynamics and plasma physics, for example.
PROOF Let Γ be a closed Jordan curve isolating λ and lying in res(A). Consider
P′ − P = −(1/2πi) ∫_Γ [R′(z) − R(z)] dz,
where
R′(z) = (A′ − zI)⁻¹,
R(z) = (A − zI)⁻¹,
R′(z) − R(z) = R′(z)(A − A′)R(z).
Hence
‖P′ − P‖ ≤ (meas Γ / 2π) max_{z∈Γ} ‖R′(z)‖ ‖R(z)‖ ‖H‖ = cε,
where meas Γ denotes the Lebesgue* measure of the curve Γ (see Exercise 4.3.1).
If ε is such that ‖P′ − P‖ < 1, then
dim Im P′ = dim Im P = m,
and A′ has m eigenvalues μ′ᵢ inside Γ.
We define M′ = Im P′, the subspace invariant under A′ and associated with the μ′ᵢ.
Lemma 4.3.3 If ε is sufficiently small, then P′ defines a bijection of M on to M′.
PROOF Let F be the map P′|_M : M → M′. Suppose that x ∈ M and ‖x‖ = 1. Then
|1 − ‖P′x‖| = |‖Px‖ − ‖P′x‖| ≤ ‖(P − P′)Px‖ ≤ ‖P − P′‖ ≤ ½
if ε is sufficiently small. Hence
‖F‖ ≤ 3/2   and   ‖F⁻¹‖ ≤ 2.
We note that A|_M and F⁻¹A′F are maps of M into itself. Let B = Y*AX and
B′ = Y*F⁻¹A′FX be square matrices of order m representing these maps in a
chosen basis X of M and an adjoint basis Y.
(F⁻¹A′F)F⁻¹φ′ = μ′F⁻¹φ′,
that is
B′η′ = μ′η′,
where η′ = Y*F⁻¹φ′ and η′ ≠ 0 because F⁻¹φ′ ≠ 0. We conclude that sp(B′) =
{μ′₁, …, μ′ₘ}.
Therefore
‖B − B′‖ ≤ cε.
Let f be a function of a complex variable z, and suppose that f is holomorphic
in a neighbourhood of λ, which constitutes the spectrum of B. By using Cauchy's
integral formula (2.2.6) we define
f(B) = −(1/2πi) ∫_Γ f(z)(B − zI)⁻¹ dz.
Lemma 4.3.5
because, for sufficiently small ε, the contour Γ contains sp(B′). Finally, we apply
the inequality
(1/m)|tr C| ≤ ρ(C) ≤ ‖C‖
to the matrix
C = f(B′) − f(B).
PROOF
where
1 ≤ j < k ≤ ℓ ≤ m,
and so
x′ⱼ − xₖ = (I − Π)x′ⱼ.
Hence
‖x′ⱼ − xₖ‖ ≤ c‖(A − λI)ᵏ(x′ⱼ − xₖ)‖ = c‖(A − λI)ᵏx′ⱼ‖,
and
(A − λI)ᵏ = [ (A′ − λ′I) + (λ′ − λ)I ]ᵏ = Σᵢ₌₀ᵏ Cₖⁱ (λ′ − λ)ⁱ(A′ − λ′I)ᵏ⁻ⁱ.
We deduce that
‖(A − λI)ᵏx′ⱼ − (A′ − λ′I)ᵏx′ⱼ‖ ≤ cε.
Also, when j < k, we have
PROOF We remark that (4.3.3) is an improvement of (4.3.1) only if g/m > 1/ℓ.
Suppose now that m < gℓ, which is satisfied when the Jordan box B_λ, associated
with λ, contains g blocks of different sizes.
We choose a Jordan basis of B in M. Then B′ is similar to C = B_λ + εK, where
εK is the perturbation matrix induced by H, and we may suppose that
‖K‖ = O(1). We shall show that there exists at least one eigenvalue μ′ of C (and
∏ᵢ₌₁ᵐ |λ − μ′ᵢ| ≤ cε^g,
and there exists necessarily at least one μ′ such that |λ − μ′| ≤ cε^{g/m}.
4.4.1
In this section, the norm ‖·‖ on Cⁿ is supposed to be monotonic, that is, for each
diagonal matrix D = diag(λ₁, …, λₙ) the induced norm satisfies ‖D‖ = maxᵢ|λᵢ|.
The Euclidean norm and the norm ‖·‖_∞ are monotonic.
The definition of monotonicity we have given is equivalent to the following
characterization. Let x = (ξᵢ) and y = (ηᵢ) be vectors in Cⁿ; then ‖·‖ is monotonic
if and only if |ξᵢ| ≤ |ηᵢ| (i = 1, …, n) implies that ‖x‖ ≤ ‖y‖ (see Horn and Johnson,
1990, p. 310). Let α be a scalar and let u be a vector such that ‖u‖ = 1. In order
to test how accurately these data represent eigenelements of A, it is natural to
consider the residual vector r = Au − αu.
A POSTERIORI ANALYSIS OF ERRORS 171
Theorem 4.4.1 Let α and u be given, where ‖u‖ = 1. Put r = Au − αu. Then there
exists an eigenvalue λ of A of index ℓ ≥ 1, such that:
(a) If A is diagonalisable,
|λ − α| ≤ cond(X)‖r‖.   (4.4.1)
(b) If A is not diagonalisable,
|λ − α|^ℓ / (1 + |λ − α|)^{ℓ−1} ≤ cond(X)‖r‖.   (4.4.2)
Theorem 4.4.2 When A′ = A + H, then for each eigenvalue λ′ of A′, there exists
an eigenvalue λ of A, of index ℓ ≥ 1, such that:
(a) If A is diagonalisable, then
|λ′ − λ| ≤ cond(X)‖H‖.   (4.4.3)
(b) If A is not diagonalisable, then
|λ′ − λ|^ℓ / (1 + |λ′ − λ|)^{ℓ−1} ≤ cond(X)‖H‖.   (4.4.4)
4.4.2
Suppose we know U = [u₁, …, uₘ], which is a basis of the subspace M̃ close to
the invariant subspace M; we suppose further that the vectors are normalized by
Y*U = Iₘ,
where Y is a given matrix. The matrices U and Y are augmented to [U, Ũ] and
[Y, Ỹ] so as to become adjoint bases of Cⁿ. Relative to these bases, A takes the
form
Theorem 4.4.3 If sp(D) ∩ sp(B̃) = ∅ and if γsw < ¼, then there exists a basis X of
M normalized by Y*X = I such that
‖U − X‖ ≤ g(ε)w
and
‖B̃ − B‖ ≤ g(ε)sw,
where ε = γsw and 1 ≤ g(ε) ≤ 2.
Let
A = ( 0   −a  )
    ( a   1/b )
and let u = e₁. Then e₁*Ae₁ = 0 and e₁e₁* is the orthogonal projection on the
direction lin(e₁). We have
‖Σ‖ = γ = |b|,   SΣr = −a²b,   ‖Σr‖ = w = |ab|.
The eigenvalues of A are λ = (1/2b)(1 ± √(1 − 4a²b²)). Let θ be the acute angle
between the eigenvectors and e₁, e₂. Then
tan θ = |λ|/|a|.
We put ε = γsw = (ab)². Then the bounds given in Theorem 4.4.3 are attained:
|λ| = g(ε)|b|a²   and   tan θ = g(ε)|ab|.
The bounds that were established in Theorem 4.4.3 can be computed from the
solution W of the equation
(I − UY*)AW − WB̃ = R,
and they apply when ρ is sufficiently small. It should be borne in mind that it is
w, and not ρ, that serves as a good indicator for the quality of the approximation
of X by U and of B by B̃. If Y also happens to be an approximate basis of the
left invariant subspace, then s, too, is small and the Rayleigh quotient B̃
approximates B to the second order O(sw).
In the very special case in which A is almost triangular, it is possible to obtain
bounds that can be computed from the eigenvalues without knowledge of the
approximate eigenvectors (Exercise 4.4.3).
4.4.3
We now suppose that we know the matrix A′ = A + H, close to A, for which M′
is an exact invariant subspace with a given orthonormal basis Q′. We put
B′ = Q′*A′Q′.
Let Σ′ = Q̃′(B̃′, B′)⁻¹Q̃′*; the right residual matrix for A, based upon B′ and Q′,
is defined as
AQ′ − Q′B′ = (A − A′)Q′ = −HQ′.
We put γ′ = ‖Σ′‖ and s′ = ‖Q̃′*AQ′‖; Θ is the diagonal
matrix of canonical angles between M and M′.
For a proof the reader is referred to Exercise 2.11.2. It is easy to deduce that
‖X − Q′‖ = ‖W′‖ ≤ 2‖Σ′HQ′‖.
On the other hand, the identity
B − B′ = Q′*AX − Q′*A′Q′
= Q′*A(X − Q′) + Q′*(A − A′)Q′
= Q′*AQ̃′Q̃′*(X − Q′) + Q′*HQ′
furnishes a bound for ‖B − B′‖.
Corollary 4.4.6 With the Euclidean norm, and under the hypothesis that
‖H‖₂ ≤ [2(γ′s′ + γ′)]⁻¹,
we have
‖B − B′‖₂ ≤ (2γ′s′ + 1)‖H‖₂.
The bounds established in Theorem 4.4.5 and Corollary 4.4.6 can be computed
from the solution W′ = Σ′HQ′ of the equation
(I − Q′Q′*)A′W′ − W′B′ = (I − Q′Q′*)HQ′.
Finding an estimate for γ′ can turn out to be costly when the matrix A′ is not
normal. In Section 4.6 we shall get to know the simplifications brought about
by the assumption that A′ is Hermitian.
4.4.4
Having established the a posteriori bounds for ‖B − B̃‖ and ‖B − B′‖, one would
like to deduce bounds for the distance between their respective spectra. This is
a very difficult question, for which no solution has yet been found that is
sufficiently simple to be altogether satisfactory.
In Exercise 4.3.2 the following qualitative result will be established:
dist[sp(A), sp(A′)] ≤ c‖A − A′‖^{1/n},
if ‖A − A′‖ is small enough, where n is the order of A and of A′. The constant c
is difficult to estimate in the general case (see, for example, Ostrowski, 1957) and
is of little use in practice because it increases with the order n of the matrix. The
exponent 1/n is deleted when A and A′ are assumed to be diagonalisable. In
particular, when A and A′ are Hermitian,
dist[sp(A), sp(A′)] ≤ ‖A − A′‖₂,
as we shall see in Section 4.6.
The simplicity of such a result is preserved even in the context of arbitrary
matrices if we replace the distance between the spectra by the distance between
the arithmetic means of the eigenvalues:
λ̄ = (1/n) tr A,   λ̄′ = (1/n) tr A′.
A IS ALMOST DIAGONAL 177
In fact,
|λ̄ − λ̄′| = (1/n)|tr(A − A′)| ≤ ‖A − A′‖.
Let us now return to the matrices B, B̃ and B′ in which we are interested. We
denote their spectra by
{μᵢ}₁ᵐ, {μ̃ᵢ}₁ᵐ and {μ′ᵢ}₁ᵐ
respectively, together with the corresponding arithmetic means.
Theorem 4.5.1 Each eigenvalue of A = (aᵢⱼ) lies in at least one of the Gershgorin
disks
Dᵢ = { z ∈ C : |z − aᵢᵢ| ≤ Σ_{j≠i} |aᵢⱼ| }   (i = 1, …, n).
Corollary 4.5.2 If each pair of the n Gershgorin disks has an empty intersection,
then each disk contains exactly one eigenvalue of A, which is therefore simple.
PROOF By virtue of the hypothesis, the aᵢᵢ are distinct. Put A(ε) = D + εH, where
0 ≤ ε ≤ 1. When ε = 0, the disks reduce to the points aᵢᵢ. By continuity, as ε
increases, each disk contains one eigenvalue as long as the disks remain disjoint.
When applied to a matrix A which is almost diagonal, that is, for which
‖H‖_∞ = maxᵢ Σ_{j≠i} |aᵢⱼ| is small, the above results enable us to find bounds for
|λ − aᵢᵢ|, provided that the disks are disjoint. In some cases of non-empty
intersections we can use diagonal similarity transformations on A in order to render
the disks disjoint (see Exercises 4.5.1 and 4.5.2).
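The disks themselves are cheap to form from the rows of A. A minimal sketch of the test; the 2 × 2 matrix used below is hypothetical, with known eigenvalues 1 and 3:

```python
def gershgorin_disks(A):
    """(centre, radius) of each Gershgorin disk of a square matrix A."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

def in_some_disk(z, disks, tol=1e-12):
    """True if z lies in the union of the disks (up to a tolerance)."""
    return any(abs(z - c) <= r + tol for c, r in disks)
```

Every eigenvalue must pass `in_some_disk`; points far from the union cannot be eigenvalues.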
We shall now establish a result that enables us to treat the case of a matrix A
that is close to a block-diagonal matrix.
where, generally,
‖X‖ = Σᵢ,ⱼ |xᵢⱼ|.
Let
X( i₁, …, iₘ | 1, …, m )
be the m by m matrix extracted from X by selecting the elements in the rows
i₁, …, iₘ and the columns 1, 2, …, m.
We permute the rows of X in such a way that
X = ( X₁₁ )
    ( X₂₁ ),
where X₁₁ is the square matrix with the property that
It follows that
tr^-trT^-tr^X^*-/).
Put
Y₂₁ = X₂₁X₁₁⁻¹,   or   Y₂₁X₁₁ = X₂₁.
By Cramer's* rule, the general element of Y₂₁ is
ηᵢⱼ = det X( 1, …, i, …, m | 1, …, j, …, m ) / det X₁₁.
4.6 A IS HERMITIAN
Numerous simplifications occur in this context, which enable us to obtain more
precise results.
We recall what Theorem 4.4.1 becomes: if α and u are given such that ‖u‖₂ = 1
and r = Au − αu, there exists an eigenvalue λ of A such that |λ − α| ≤ ‖r‖₂. The
Rayleigh quotient ρ = u*Au, constructed with u, possesses the following optimality
property: the minimization problem
min_{z∈C} ‖Au − zu‖₂
is solved by z = ρ = u*Au.
PROOF We have
‖Au − zu‖₂² = (u*A* − z̄u*)(Au − zu)
= u*A*Au − zu*A*u − z̄u*Au + z̄z
= u*A*Au − u*Au·u*A*u + (u*Au − z)(u*A*u − z̄)
= u*A*(Au − u·u*Au) + |u*Au − z|²,
the minimum of which is attained for z = u*Au = ρ.
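The optimality of the Rayleigh quotient is easy to observe numerically. A sketch for a real 2 × 2 case; the matrix and vector are hypothetical:

```python
def residual_norm(A, u, z):
    """|| A u - z u ||_2 for a real 2x2 matrix A and a vector u."""
    r0 = A[0][0] * u[0] + A[0][1] * u[1] - z * u[0]
    r1 = A[1][0] * u[0] + A[1][1] * u[1] - z * u[1]
    return (r0 * r0 + r1 * r1) ** 0.5

def rayleigh_quotient(A, u):
    """rho = u* A u for a real unit vector u."""
    Au0 = A[0][0] * u[0] + A[0][1] * u[1]
    Au1 = A[1][0] * u[0] + A[1][1] * u[1]
    return u[0] * Au0 + u[1] * Au1
```

For A = (2 1; 1 3) and u = e₁, ρ = 2, and any other shift z gives a residual at least as large.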
In particular, Theorem 4.4.1 now asserts that, given u such that ‖u‖₂ = 1, there
exists an eigenvalue λ of A with the property that
|λ − ρ| ≤ ‖Au − ρu‖₂.
This property is often attributed to Krylov (Krylov and Bogolioubov, 1929) and
Weinstein (1934).
If additional information is available about the distance of p from the other
eigenvalues of A, then the Krylov-Weinstein inequality can be improved; this
will now be done after a preliminary lemma. We put
ε = ‖Au − ρu‖₂.
Lemma 4.6.2 Let a and b be two real numbers such that a < ρ < b, and suppose
that the open interval (a, b) contains no eigenvalues of A. Then
(b − ρ)(ρ − a) ≤ ε².
PROOF Let u = Σᵢ ξᵢvᵢ be the expansion of u with respect to an orthonormal basis
of eigenvectors of A (Avᵢ = μᵢvᵢ). Then
Σᵢ (μᵢ − b)(μᵢ − a)|ξᵢ|²
= ε² + (ρ − b)(ρ − a) ≥ 0,
because (μᵢ − b)(μᵢ − a) ≥ 0 for all i.
We denote by θ the acute angle made by the direction of u and the eigensubspace
M associated with λ (see Figure 4.6.1).
Figure 4.6.1
Theorem 4.6.3 Suppose the open interval (λ̲, λ̄) contains ρ and precisely one
eigenvalue λ of A. Then
ρ − ε²/(λ̄ − ρ) ≤ λ ≤ ρ + ε²/(ρ − λ̲)   (4.6.1)
and
sin θ ≤ (2/(λ̄ − λ̲)) [ (ρ − ½(λ̲ + λ̄))² + ε² ]^{1/2}.   (4.6.2)
PROOF
(a) If λ̲ < λ ≤ ρ, put a = λ and b = λ̄. Then
ρ − ε²/(λ̄ − ρ) ≤ λ ≤ ρ.
(b) If ρ < λ < λ̄, put a = λ̲ and b = λ. Then
ρ ≤ λ ≤ ρ + ε²/(ρ − λ̲).
Hence (4.6.1) is always true when λ̲ < ρ < λ̄. With the notations of Lemma 4.6.2
we have
(Au − λ̲u)*(Au − λ̄u) = ε² + (ρ − λ̲)(ρ − λ̄)
= (Dv − λ̲v)*(Dv − λ̄v)
= Σᵢ (μᵢ − λ̲)(μᵢ − λ̄)|ξᵢ|².
It follows that
sin²θ ( ½(λ̄ − λ̲) )² ≤ (ρ − ½(λ̲ + λ̄))² + ε² + (ρ − λ̲)(ρ − λ̄),
whence (4.6.2) is readily deduced, the last term on the right being non-positive.
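The enclosure (4.6.1) is sharp enough to see on a 2 × 2 Hermitian example with closed-form eigenvalues. A sketch; the data below are hypothetical:

```python
import math

def sym2x2_eigs(a, b, c):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, c]]."""
    mid, rad = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
    return mid - rad, mid + rad

def kato_temple_enclosure(rho, eps, lam_lo, lam_hi):
    """Interval (4.6.1):
    rho - eps**2/(lam_hi - rho) <= lam <= rho + eps**2/(rho - lam_lo)."""
    return rho - eps**2 / (lam_hi - rho), rho + eps**2 / (rho - lam_lo)

# u = e1 for A = [[2, 0.1], [0.1, 5]]: rho = 2, eps = ||Au - rho*u||_2 = 0.1,
# and the interval (1, 3) contains rho and exactly one eigenvalue of A.
lam = sym2x2_eigs(2.0, 0.1, 5.0)[0]
lo, hi = kato_temple_enclosure(2.0, 0.1, 1.0, 3.0)
```

Here the interval [1.99, 2.01] indeed contains the exact eigenvalue λ ≈ 1.9967, a much tighter statement than the Krylov–Weinstein bound |λ − ρ| ≤ ε = 0.1.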
, , , 2χ10~10
\-yfix 10"5
Theorem 4.6.5 There exists an ordered set of m eigenvalues {μᵢ}₁ᵐ of A such that
(a) maxᵢ |μᵢ − αᵢ| ≤ ‖R(C)‖₂,   (4.6.3)
PROOF
Ã = G*AG = ( B̃   S̃* )
           ( S̃   Ẽ  ),
where
‖R̃(C)‖₂ = ‖R(C)‖₂
and
Q̃ = G*Q = ( Iₘ )
          ( 0  ).
We now adopt the basis defined by G and apply the dilation theorem (see
Exercise 4.6.1) to the matrix R̃(C); thus we construct the matrix
H = H* = ( B̃ − C   S̃* )
         ( S̃       W̃  ),
so that
Ã − H = ( C   0      )
        ( 0   Ẽ − W̃  ).
The spectrum of Ã − H is denoted by {αᵢ}₁ⁿ, with the proviso that
{αᵢ}₁ᵐ = sp(C).
By Weyl's theorem (Exercise 1.9.6) there is a set {μᵢ}₁ᵐ of eigenvalues of A
such that
|μᵢ − αᵢ| ≤ ‖H‖₂ = ‖R̃(C)‖₂ = ‖R(C)‖₂   (i = 1, …, m).
(b) We diagonalise A and C; thus
V*AV = D = diag(μᵢ),
U*CU = Λ = diag(αᵢ),
and we put
Q̃ = V*QU,   R̃(C) = V*R(C)U.
Hence
R̃(C) = DQ̃ − Q̃Λ
and
‖R̃(C)‖_F = ‖R(C)‖_F.
We may therefore confine ourselves to the case in which A and C are diagonal
matrices D and Λ respectively.
Next, consider the change of basis defined by the unitary matrix
Q̂ = [Q, Q̌].
We put
wᵢⱼ = |q̂ᵢⱼ|²,   Q̂ = (q̂ᵢⱼ),
and
dᵢⱼ = (μᵢ − αⱼ)²   when 1 ≤ j ≤ m, 1 ≤ i ≤ n,
dᵢⱼ = 0            when j > m, 1 ≤ i ≤ n.
Put R = R(C) = DQ − QΛ. Then
‖R‖²_F = tr R*R = tr(Q*D²Q − ΛQ*DQ − Q*DQΛ + Λ²)
= Σⱼ₌₁ᵐ ( Σᵢ₌₁ⁿ μᵢ²|qᵢⱼ|² − 2αⱼ Σᵢ₌₁ⁿ μᵢ|qᵢⱼ|² + αⱼ² )
= Σⱼ₌₁ᵐ Σᵢ₌₁ⁿ wᵢⱼ(μᵢ − αⱼ)²,   because Σᵢ₌₁ⁿ wᵢⱼ = 1,
= Σⱼ₌₁ⁿ Σᵢ₌₁ⁿ wᵢⱼdᵢⱼ,   because dᵢⱼ = 0 when j > m.
The minimum of
Σᵢ,ⱼ wᵢⱼdᵢⱼ
over the doubly stochastic matrices W = (wᵢⱼ) is attained at a permutation matrix,
for which the sum reduces to
Σᵢ dᵢᵢ.
Hence
min ‖R(C)‖²_F = Σᵢ (μᵢ − αᵢ)².
PROOF We have
(AQ − QZ)*(AQ − QZ) = Q*A²Q − Z*Q*AQ − (Q*AQ)Z + Z*Z
= Q*A²Q + (B − Z)*(B − Z) − B²
= (AQ − QB)*(AQ − QB) + (B − Z)*(B − Z).
Put
F = AQ − QZ,   G = AQ − QB,   H = B − Z.
(a) ‖F‖₂² = ρ(F*F); since H*H is positive semi-definite, it follows that
ρ(F*F) ≥ ρ(G*G) (see Exercise 1.9.1).
‖R‖_F = ‖AQ − QB‖_F = ‖AU − UΛ‖_F = ( Σᵢ₌₁ᵐ εᵢ² )^{1/2},
where εᵢ = ‖Auᵢ − ρᵢuᵢ‖₂.
Theorem 4.6.7 Let (λ̲, λ̄) be an open interval that contains sp(B) and precisely the
set σ of eigenvalues of A. Then there exists a numbering {μᵢ}₁ᵐ of these eigenvalues
such that
ρᵢ − Σⱼ₌₁ᵐ εⱼ²/(λ̄ − ρᵢ) ≤ μᵢ ≤ ρᵢ + Σⱼ₌₁ᵐ εⱼ²/(ρᵢ − λ̲)   (i = 1, …, m).
PROOF The reader is referred to Kato's article (1949). We remark that in the
denominators there are no quantities of the type ρᵢ − ρⱼ, which could be small in
the case of neighbouring eigenvalues.
The above inequalities use the knowledge of a basis of eigenvectors of B.
We deduce from them other inequalities that are less precise but require only a
knowledge of ‖R‖_F and (λ̲, λ̄).
If ‖R‖_F ≤ δ, then
maxᵢ |ρᵢ − μᵢ| ≤ ‖R‖²_F / δ.
PROOF Evident.
The various results that we have enunciated in this section generalize those
established in Section 4.6.1, which deal with the relationship between a single
eigenvalue and the Rayleigh quotient. In order to complete the analogy it only
remains for us to mention the results which enable us to find bounds for the
matrix of the canonical angles between eigenspaces.
Let
U = Y ( D ; 0 ) X*
be the decomposition of U into singular values, where D = diag(σ₁, …, σₘ) and
Y and X are unitary matrices of orders n and m respectively. Put
Ã = Y*AY,   C̃ = X*CX,   R̃(C, U) = Y*R(C, U)X,   Ũ = Y*UX = ( D ; 0 ).
Then
R̃(C, U) = ÃŨ − ŨC̃
and
‖R̃(C, U)‖ₚ = ‖R(C, U)‖ₚ   (p = 2 or F).
Consider the partitioning
Ã = ( B′   S′* )
    ( S′   E′  ),
whence
R̃(C, U) = ( B′D − DC̃ )
          ( S′D       ).
(a) On applying (4.6.3) to Ã, we obtain
maxᵢ |μᵢ − αᵢ| ≤ ‖L‖₂,
where
L = R̃(C, U)D⁻¹ = ( B′ − DC̃D⁻¹ )
                 ( S′          ).
Now
‖L‖₂² = ‖L*L‖₂ ≤ (1/σₘ²)( ‖B′D − DC̃‖₂² + ‖S′D‖₂² ) ≤ (2/σₘ²)‖R‖₂².
(b) It is clear that
Example 4.2.8 and Theorem 4.4.3 were inspired by Stewart (1971). The analysis
of a priori errors presented in Section 4.3 was previously given by Wilkinson
(1965) using Gershgorin's disks. The inequalities (4.4.2) and (4.4.4) were inspired
by Kahan, Parlett and Jiang (1982). The a posteriori bounds given in Theorem
4.4.5 are new. Theorem 4.5.3 was proved by Ruhe (1970a). The proof of Theorems
4.6.5 and 4.6.12, due to W. Kahan, is here published for the first time following
his report 'Inclusion theorems for clusters of eigenvalues of Hermitian matrices',
Computer Science Department, University of Toronto, Ontario (1967), kindly
supplied by the author.
EXERCISES
‖ΔA‖₂ < ‖A‖₂ / cond₂(A).
(a) A = ( …   … )
        ( 0   ε ),
(b) A = ( … ),
(c) A = ( 1   10⁴ )
        ( 0   ε   ).
4.1.3 [D] Consider a rectangular matrix A ∈ Cⁿˣᵐ, where n ≥ m, and suppose
that the columns of A are linearly independent. Define
(c) Let
A = ( 1 1 … 1 ; … ).
Compute κ₂(A).
4.1.4 [D] Let A′ = A + εH, where ‖H‖ = 1 with respect to the induced norm
‖·‖. Suppose that A is a regular matrix such that
0 < ε < …
and
Σ₁ = Q(B − λI)⁻¹Q*.
Let ℓ be the index of the eigenvalue of B which is nearest to λ. Prove that, if δ is
sufficiently small, then
δ⁻¹ ≤ ‖(B − λI)⁻¹‖₂ = ‖Σ₁‖₂ ≤ 2 cond₂(V) δ^{−ℓ},
where V is the Jordan basis of B.
4.2.2 [A] Let A ∈ Cⁿˣⁿ, ε ∈ C and H ∈ Cⁿˣⁿ be such that ‖H‖₂ = 1. Define
A(ε) = A + εH.
where
‖H₁‖₂ = …,
P being the spectral projection associated with λ.
(c) Let λ(ε) be the eigenvalue of B(ε) that is closest to λ, and suppose that λ(ε) ≠ λ.
Prove that
1 ≤ ‖[B − λ(ε)I]⁻¹[B(ε) − B]‖₂
and deduce that
A = ( 1   10⁴ )        Δ = ( 1   0    )
    ( 0    2  )   and      ( 0   10⁻⁴ ).
Verify that the balancing of A by Δ decreases the condition number of the basis
of eigenvectors as well as the departure from normality.
4.2.4 [B:25] Let x and x₊ be the right and left eigenvectors associated with a
simple eigenvalue λ of a matrix A. Prove that if no normalization is imposed on x
and x₊, then
csp(λ) = ‖x‖₂ ‖x₊‖₂ / |x₊* x|.
4.2.5 [D] Suppose that A is diagonalisable in a basis V whose columns are unit
vectors in the Euclidean norm. Let |sᵢ|⁻¹ be the condition number of the
eigenvalue λᵢ, as it was defined in Exercise 4.2.4. Prove that when all eigenvalues
are simple, then
cond_F(V) = Σᵢ₌₁ⁿ |sᵢ|⁻¹.
Verify that the basis X′ of the invariant subspace M′ of A′ which is associated
with the block σ′ = {−0.1; 1.1} and normalized by Q*X′ = I, where Q = (e₁, e₂),
is equal to
X′ = ( 1          0   )
     ( 0          1   )
     ( 10⁻³/36   5/9  ).
4.2.11 [A] Investigate the departure from normality of the Schur form as a
function of the condition number (relative to inversion) of the matrix representing
the Jordan form when the block σ consists of a double eigenvalue λ or two distinct
eigenvalues λ and μ.
4.2.12 [D] Verify that the Jordan form is numerically unstable. The computation
of the Jordan form is an ill-posed problem.
4.2.13 [A] Let A have the simple eigenvalue λ:
Ax = λx   (x ≠ 0),
where ‖·‖* is the dual norm of ‖·‖. Deduce the value of the relative condition
number for λ ≠ 0:
K(λ) = lim_{‖ΔA‖→0} (|Δλ|/|λ|)(‖A‖/‖ΔA‖).
4.2.14 [B:20,23] The space of matrices is now equipped with the relative
componentwise distance:
ε = min( ω : |ΔA| ≤ ω|A| )   with ΔA = A − A′,
where the inequality is taken componentwise. Show that, for the same eigenproblem,
the relative condition number for λ ≠ 0 is given by
K_C(λ) = lim_{ε→0} (1/ε)(|Δλ|/|λ|) = ( |x₊*| |A| |x| ) / ( |x₊* x| |λ| ),
where |A| denotes the matrix with the (i, j)th element equal to |aᵢⱼ|.
4.2.15 [A] Let y be an arbitrary vector, non-orthogonal to x. We suppose that
x and the perturbed eigenvector x′ are normalized such that y*x = y*x′ = 1, so
that
Δx = −ΣΔAx lies in lin(y)^⊥.
Show that
lim_{‖ΔA‖→0} ‖Δx‖/‖ΔA‖ ≤ ‖Σ‖ ‖x‖.
4.2.16 [B:20,23] With the distance defined in Exercise 4.2.14, show that
K_C(x) = lim_{ε→0} (1/ε) ‖Δx‖_∞/‖x‖_∞ = ‖ |Σ| |A| |x| ‖_∞ / ‖x‖_∞.
4.2.17 [C] Compute the normwise relative condition numbers defined in
Exercises 4.2.13 and 4.2.15 for the three norms ‖·‖₁, ‖·‖₂, ‖·‖_∞, and the matrix
104
A=
2
Compute the componentwise condition numbers and compare with the normwise
ones obtained with ‖·‖_∞.
4.2.18 [B:20] Let λ′, x′ be approximate eigenelements for A; r = Ax′ − λ′x′ is the
residual vector. The backward error is defined as the minimal size of a perturbation
ΔA such that (A + ΔA)x′ = λ′x′. Prove that the normwise backward error is given
by ‖r‖/(‖A‖ ‖x′‖), for any subordinate norm, and that the componentwise
backward error is given by max_{1≤i≤n} |rᵢ|/(|A| |x′|)ᵢ. Verify that the backward
errors are independent of the normalization chosen for x′.
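The two backward errors can be evaluated directly; the sketch below is an illustration under my own naming (not the book's code):

```python
import numpy as np

# Hedged sketch of the backward errors of Exercise 4.2.18 for an
# approximate eigenpair (lam_a, x_a): normwise ||r|| / (||A|| ||x'||) and
# componentwise max_i |r_i| / (|A| |x'|)_i.  Names are illustrative.
def backward_errors(A, lam_a, x_a):
    r = A @ x_a - lam_a * x_a
    eta_n = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_a))
    eta_c = np.max(np.abs(r) / (np.abs(A) @ np.abs(x_a)))
    return eta_n, eta_c

A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, X = np.linalg.eigh(A)
exact_n, exact_c = backward_errors(A, lam[0], X[:, 0])   # exact pair: ~ machine eps
pert_n, pert_c = backward_errors(A, 1.1, X[:, 0])        # perturbed eigenvalue
```

Both errors are invariant under rescaling of x′, which is what the exercise asks one to verify.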
4.2.19 [D] We suppose that A is diagonalisable: A = XDX⁻¹. Prove the
following componentwise version of the Bauer–Fike theorem:
min_{λ∈sp(A)} |Δλ| ≤ ‖ |X⁻¹| |ΔA| |X| ‖
≤ ε ‖ |X⁻¹| |A| |X| ‖
for all ΔA such that |ΔA| ≤ ε|A|.
Apply this result to the highly non-normal matrix
A = (  2        10⁹   −2×10⁹
      −10⁻⁹     5     −3
       2×10⁻⁹  −3      2  )
with |ΔA| ≤ ε|A|.
Check that
|X⁻¹| |A| |X| = ( 19  20  14
                  20  21  15
                  14  15  11 )
and compare with ‖X⁻¹‖ ‖X‖ ‖A‖. Conclude.
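A numerical check of this componentwise bound is easy to run; the sketch below uses a small upper triangular matrix and a random componentwise perturbation of my own choosing (not the matrix of the exercise):

```python
import numpy as np

# Hedged numerical check of the componentwise Bauer-Fike bound of
# Exercise 4.2.19 on a small diagonalisable matrix (the matrix and the
# perturbation here are illustrative).
rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])
lam, X = np.linalg.eig(A)
eps = 1e-6
dA = eps * np.abs(A) * rng.uniform(-1.0, 1.0, A.shape)   # |dA| <= eps |A|
lam_p = np.linalg.eigvals(A + dA)

# Bound in the infinity norm, which satisfies || |M| || = ||M||.
bound = eps * np.linalg.norm(np.abs(np.linalg.inv(X)) @ np.abs(A) @ np.abs(X),
                             np.inf)
worst = max(min(abs(mu - l) for l in lam) for mu in lam_p)
```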
4.2.20 [B:43] We suppose that the elements of A are complex-valued differentiable
functions of a complex parameter t varying in D ⊂ ℂ. The matrix A = A(t)
admits the eigenelements λ = λ(t), x = x(t), defined in D. Let t₀ ∈ D be such that
λ(t₀) is simple and A′, λ′, x′ exist. If we impose the Euclidean normalization
x*(t)x(t) = 1 in D, then prove that the derivative x′ at t = t₀ is
x′ = (x*SA′x)x − SA′x,
where S is the reduced resolvent.
Suppose now that A(t) = A + tΔA for t in a neighbourhood of t = 0 including
t = 1. Check that the first-order Taylor expansion around t = 0 is
x(t) = x + t[(x*SΔAx)x − SΔAx] + O(t²).
Set A₁ = A + ΔA, x(1) = x₁, and interpret geometrically the identity x₁ − x =
(I − P₁)(−SΔAx) = (I − P₁)Δx, where P₁ = xx*, and Δx = x₂ − x is the
variation on x induced by the Wilkinson normalization x₊*x₂ = 1, where x₊ is
the left eigenvector.
Check that
‖x₁‖₂² = 1 + ‖(I − P₁)x₂‖₂² > 1.
4.2.21 [A] Let λ belong to the normwise ε-pseudospectrum of A. Show that
this is equivalent to any of the following three statements:
(a) There exists a vector y such that
‖(A − λI)y‖ ≤ ε‖A‖,  ‖y‖ = 1.
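For the Euclidean norm, statement (a) is equivalent to σ_min(A − λI) ≤ ε‖A‖₂, which gives a simple membership test; the sketch below (names and test matrix mine) uses that equivalence:

```python
import numpy as np

# Hedged sketch: membership of z in the normwise relative
# eps-pseudospectrum, tested via the smallest singular value:
# sigma_min(A - zI) <= eps ||A||_2, equivalent to statement (a) in the
# Euclidean norm.
def in_pseudospectrum(A, z, eps):
    smin = np.linalg.svd(A - z * np.eye(A.shape[0]), compute_uv=False)[-1]
    return smin <= eps * np.linalg.norm(A, 2)

A = np.array([[1.0, 100.0],
              [0.0,   2.0]])
near = in_pseudospectrum(A, 1.0 + 1e-4, 1e-3)   # close to an eigenvalue
far = in_pseudospectrum(A, 50.0, 1e-6)          # far from the spectrum
```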
Suppose that A′ satisfies ‖A′ − A‖₂ = γ < 1/c(Γ);
then the matrix R′(z) = (A′ − zI)⁻¹ exists and
max_{z∈Γ} ‖R′(z)‖₂ ≤ c(Γ)/(1 − γ c(Γ)).
4.2.23 [A] The distance between two finite sets σ and τ is defined as
… = O(ε),
… = O(ε).
(d) Define the Rayleigh quotient associated with ⟨·,·⟩_B and a non-zero vector x
as follows:
R_B(S, x) = ⟨Sx, x⟩_B / ‖x‖_B².
Prove that
min_μ ‖Sx − μx‖_B = ‖Sx − R_B(S, x)x‖_B
and
α ≤ R_B(S, x) ≤ β.
Show that
R_B(S, x) − Δ_β ≤ μᵢ ≤ R_B(S, x) + Δ_α,
where
Δ_α = ‖Sx − R_B(S, x)x‖_B² / ([R_B(S, x) − α] ‖x‖_B²)
and
Δ_β = ‖Sx − R_B(S, x)x‖_B² / ([β − R_B(S, x)] ‖x‖_B²).
(f) Deduce the a posteriori bound:
min_i |μᵢ − R_B(S, x)| ≤ [R_B(S², x) − R_B(S, x)²]^{1/2}.
4.4.5 [D] Consider the results of Section 4.4.2. Find bounds for
‖X₁ − X‖, ‖X′₁ − X‖ and ‖B₁ − B‖,
where
X₁ = V + W, X′₁ = Q + W and B₁ = Y*AX₁.
and
r isi r&
(a) With the help of Theorem 4.5.3 obtain a bound of the type
|λ − λ̂| ≤ c ‖H‖₁.
(b) Complete the study of the localization of the eigenvalues of A by means of
Corollary 4.5.2 when ‖H‖₁ is sufficiently small.
4.5.4 [B:51] Let A = (a_ij) ∈ ℂⁿˣⁿ and suppose that μ ∈ ℂ differs from all the
diagonal elements of A. Define the map
μ ↦ [R₁(μ), …, Rₙ(μ)],
where
R₁(μ) = Σ_{j=2}ⁿ |a₁ⱼ|,
Rᵢ(μ) = Σ_{j=1}^{i−1} |aᵢⱼ| Rⱼ(μ)/|aⱼⱼ − μ| + Σ_{j=i+1}ⁿ |aᵢⱼ| (i = 2, …, n − 1)
and
Rₙ(μ) = Σ_{j=1}^{n−1} |aₙⱼ| Rⱼ(μ)/|aⱼⱼ − μ|.
The set
Kᵢ = {μ ∈ ℂ : |μ − aᵢᵢ| ≤ Rᵢ(μ)} (i = 1, 2, …, n)
is called the ith Gudkov region. The Gershgorin disks are denoted by Gᵢ
(i = 1, 2, …, n).
(a) Prove that
sp(A) ⊆ ⋃ᵢ₌₁ⁿ Kᵢ ⊆ ⋃ᵢ₌₁ⁿ Gᵢ.
(b) Construct the Gershgorin disks and the Gudkov regions associated with the
matrix
A = ( −1  1  0
       1  1  1
       2  0  3 )
and compare their precision of localization.
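The Gershgorin part of this comparison is easy to compute; the sketch below (my code, not the book's) builds the disks for the matrix of part (b) and checks that the spectrum is covered:

```python
import numpy as np

# Hedged sketch for part (b): Gershgorin disks |z - a_ii| <= r_i, with
# r_i the sum of the off-diagonal moduli of row i.  Every eigenvalue of A
# must lie in the union of the disks.
A = np.array([[-1.0, 1.0, 0.0],
              [ 1.0, 1.0, 1.0],
              [ 2.0, 0.0, 3.0]])
centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)
eigs = np.linalg.eigvals(A)
covered = all(any(abs(l - c) <= r + 1e-12 for c, r in zip(centers, radii))
              for l in eigs)
```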
(c) Put Kᵢ = Kᵢ(A), Gᵢ = Gᵢ(A) and D = diag(d₁, …, dₙ). Consider the minimal
sets
G(A) = ⋂_{D>0} ⋃ᵢ₌₁ⁿ Gᵢ(D⁻¹AD),
K(A) = ⋂_{D>0} ⋃ᵢ₌₁ⁿ Kᵢ(D⁻¹AD).
- ( ;
where A is Hermitian. Show that there exists a Hermitian matrix W such that
T-■ = ( A
" )
satisfies
‖T̃‖₂ = ‖T‖₂.
This result is known as the dilation theorem.
4.6.2 [D] In the case of a simple eigenvalue, establish an inequality, computable
with the help of the generalized Rayleigh quotient, as a function of
ρ = ‖Au − ξu‖₂ and s = ‖A*v − ξ̄v‖₂,
where
ξ = v*Au and v*u = u*u = 1.
4.6.3 [B:34] Let u ∈ ℝⁿ, α ∈ ℝ and
r(α) = Au − αu,
where u*u = 1 and A is symmetric. Define
ℋ = {H symmetric : (A − H)u = αu}.
(a) Prove that
min_{H∈ℋ} ‖H‖₂ = ‖r(α)‖₂,
min_{H∈ℋ} ‖H‖_F² = 2‖r(α)‖₂² − (α − u*Au)².
(b) Prove that if ρ = u*Au and r = r(ρ), then the minima are attained for
H = ru* + ur*,
and ‖r(ρ)‖₂ is the minimum of ‖r(α)‖₂.
For a non-symmetric matrix A the situation is as follows: let u and v be
vectors in ℂⁿ such that v*u = u*u = 1. For α ∈ ℂ define
r(α) = Au − αu, s(α) = A*v − ᾱv
and
ℋ = {H : (A − H)u = αu, (A* − H*)v = ᾱv}.
Then
min_{H∈ℋ} ‖H‖₂ = max{‖r(α)‖₂, ‖s(α)‖₂},
min_{H∈ℋ} ‖H‖_F² = ‖r(α)‖₂² + ‖s(α)‖₂² − |α − v*Au|².
(b) Prove that the vertices of 𝒳 are the permutation matrices and the matrices
belonging to 𝒲.
(c) Use this result to prove the following property: let D and Δ be diagonal
matrices and let P be a matrix such that
‖D − P*ΔP‖_F = min{‖D − V*ΔV‖_F : V*V = I}.
Then P is a permutation matrix.
4.6.9 [D] What do the bounds of Theorem 4.4.3 imply when A is Hermitian?
4.6.10 [B:34] Let U and V be two orthonormal bases in ℂⁿˣᵐ such that V*U
is regular. Prove that there exist a matrix H and a diagonal matrix D such that
(A − H)U = UC and V*(A − H) = DV*,
where
C = (V*U)⁻¹D(V*U).
4.6.11 [B:34] With the notation of the preceding Exercise 4.6.10, prove that
there exist matrices H of minimal norm:
‖H‖₂ = max{‖R‖₂, ‖S*‖₂},
‖H‖_F² = ‖R‖_F² + ‖S*‖_F² − ‖Z‖_F²,
where
R = AU − UC, S = V*A − DV* and Z = V*R = S*U.
4.6.12 [B:17,33] Let ρ be the Rayleigh approximation of the largest eigenvalue
of A, which we assume to be simple. Prove that the bound
sin θ ≤ ε/δ
can be improved to
tan θ ≤ ε/δ.
Foundations of Methods
for Computing Eigenvalues
The present methods for computing eigenvalues are based on the convergence
of a Krylov sequence of subspaces towards the dominant invariant subspace of
the same dimension. The fundamental QR method is interpreted as a collection
of methods of subspace iteration. In this chapter we shall give a geometric
presentation of the convergence of these methods with the help of the tools
introduced in Chapter 1. This presentation will complement and illuminate in a
new light the traditional algebraic study of convergence. For example, it will
enable us to give a natural explanation of the condition upon the matrix of
eigenvectors of A which is necessary and sufficient for the convergence of the
basic QR algorithm.
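The basic QR algorithm referred to above can be sketched in a few lines (a hedged illustration; the iteration count and the symmetric test matrix are my own choices):

```python
import numpy as np

# Hedged sketch of the basic (unshifted) QR algorithm: factor A_k = Q_k R_k
# and set A_{k+1} = R_k Q_k.  Every iterate is unitarily similar to A; under
# the convergence conditions studied in this chapter, the iterates tend to
# upper triangular form with the eigenvalues on the diagonal.
def qr_algorithm(A, iters=200):
    Ak = A.astype(float).copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
T = qr_algorithm(A)
```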
The vectors of the Jordan basis are denoted by {xⱼ}₁ⁿ and the vectors of the adjoint
basis by {x*ⱼ}₁ⁿ.
AᵏU = XBᵏF + X̃B̃ᵏG.
Note that the eigenvalues of B and B̃ are {μᵢ}₁ʳ and {μᵢ}ⁿ_{r+1} respectively, so
that B is invertible by virtue of (5.1.1).
Hence
AᵏU(F⁻¹B⁻ᵏ) = X + X̃(B̃ᵏGF⁻¹B⁻ᵏ).
CONVERGENCE OF A KRYLOV SEQUENCE OF SUBSPACE 207
Now
‖B̃ᵏGF⁻¹B⁻ᵏ‖ ≤ ‖B̃ᵏ‖ ‖B⁻ᵏ‖ ‖GF⁻¹‖.
For every ε > 0, there exists an integer K such that k ≥ K implies that
(‖B̃ᵏ‖ ‖B⁻ᵏ‖)^{1/k} ≤ ρ(B̃)ρ(B⁻¹) + ε = |μ_{r+1}|/|μ_r| + ε.
We can choose ε such that |μ_{r+1}/μ_r| + ε < 1, which implies that ω(AᵏS, M) → 0;
for, given the bases Uₖ of AᵏS and X of M, there exists a regular matrix of
order r, namely F⁻¹B⁻ᵏ, such that
‖Uₖ(F⁻¹B⁻ᵏ) − X‖ = O(|μ_{r+1}/μ_r|ᵏ).
The convergence is linear at the rate |μ_{r+1}/μ_r|.
(b) Suppose there exists a sequence of regular matrices Fₖ such that
AᵏUFₖ → X, and so PAᵏUFₖ → X.
Now
PAᵏU = AᵏPU = AᵏXF = XBᵏF.
Hence
XBᵏFFₖ → X.
On multiplying on the left by X* we obtain
BᵏFFₖ → I_r as k → ∞,
and we conclude that F is necessarily regular.
The necessary and sufficient condition that
dim PS = r
is satisfied by a particular choice of the matrix A and the subspace S, which, as
we shall see later, is of great theoretical and practical importance.
Let E_r = lin(e₁, …, e_r) = [e₁, …, e_r], where the eᵢ are the first r vectors of the
canonical basis of ℂⁿ. We are interested in the Krylov sequence HᵏE_r.
Lemma 5.1.2 Under the hypothesis (5.1.1) we have
dim PEr = r,
where P is the dominant spectral projection of rank r associated with an irreducible
Hessenberg matrix H.
PROOF Suppose that x ∈ E_r but x ∉ E_{r−1}. Then Hx ∈ E_{r+1} but Hx ∉ E_r. Repeating
this argument, we deduce that the n − r + 1 vectors x, Hx, …, H^{n−r}x are linearly
independent: every subspace that is invariant under H and contains x must be
of dimension greater than n − r. Hence E_r has zero intersection with every
invariant subspace of dimension less than or equal to n − r. In particular, Ker P,
which is of dimension n − r and is invariant under H, has the property that
ℂⁿ = E_r ⊕ Ker P.
Now
Im P = Pℂⁿ = PE_r
and so
ℂⁿ = Im P ⊕ Ker P,
which proves that dim PE_r = r.
Corollary 5.1.3 If H is an irreducible Hessenberg matrix and if (5.1.1) holds, then
ω(HᵏE_r, M) → 0 as k → ∞.
Theorem 5.2.1 Suppose (5.1.1) holds. Given orthonormal bases Qk and Q of AkS
and M respectively, there exists a sequence of unitary matrices Zk such that
QₖZₖ → Q
if and only if dim PS = r.
PROOF It is clear that the matrix Qₖ defined in (5.2.1) is a basis of AᵏS. The
assertion follows from Theorem 1.5.2 because ω(AᵏS, M) → 0 as k → ∞.
The matrix Bₖ = Qₖ*AQₖ is a matrix of order r whose spectrum converges
towards the r dominant eigenvalues.
Corollary 5.2.2 If (5.1.1) holds and if dim PS = r, then sp(Bₖ) converges to {μᵢ}₁ʳ
as k → ∞.
PROOF The matrix QₖZₖ furnishes an orthonormal basis for AᵏS. We note that
B′ₖ = Zₖ*Qₖ*AQₖZₖ is similar to Bₖ and B′ₖ → B,
where B = Q*AQ is the matrix that represents the map A restricted to M relative
to the basis Q. Hence
sp(B) = {μᵢ}₁ʳ.
Since B′ₖ → B we have
sp(Bₖ) → sp(B).
The linear convergence of Qₖ towards Q and of sp(Bₖ) towards {μᵢ}₁ʳ is
controlled by |μ_{r+1}/μ_r|. In Chapter 6 we shall meet more precise results: in fact,
the rate of convergence of μ₁⁽ᵏ⁾ towards μ₁ is of the order of |μ_{r+1}/μ₁|ᵏ, when μ₁ is
simple. An iteration on the subspace S = S_r carries with it simultaneously an
iteration on each of the subspaces
S_f = lin(u₁, …, u_f), 1 ≤ f ≤ r.
This remarkable fact will have important consequences: suppose the eigenvalues
are such that
|λ₁| > |λ₂| > ⋯ > |λ_r| > |λ_{r+1}| ≥ ⋯ ≥ |λ_n| > 0. (5.2.2)
On this assumption we define a strictly increasing sequence of invariant
subspaces Mf where
M_f = lin(x₁, …, x_f) (1 ≤ f ≤ r).
210 FOUNDATIONS OF METHODS FOR COMPUTING EIGENVALUES
Theorem 5.2.3 Suppose that (5.2.2) holds. Given orthonormal bases Qₖ of AᵏS and
a Schur basis Q of M, there exists a sequence of unitary diagonal matrices Dₖ such
that
QₖDₖ → Q
if and only if X₊*U is regular and its r − 1 principal minors are non-zero.
PROOF Let U_f = [u₁, …, u_f]. When 1 ≤ f ≤ r, the condition that dim P_f S_f = f
is equivalent to the assertion that X₊*U_f is regular. Now X₊*U_f is the principal
submatrix of order f extracted from X₊*U. Hence we know that, when 1 ≤ f ≤ r,
ω(AᵏS_f, M_f) → 0.
By a recurrence argument on f we can show that the matrix Zₖ of Theorem 5.2.1
can be taken to be in diagonal form by a suitable choice of Q.
Corollary 5.2.4 Suppose that (5.2.2) holds. If X₊*U is regular and if its r − 1
principal minors are non-zero, then the limit form of Bₖ, as k tends to infinity, is an
upper triangular matrix whose diagonal consists of the eigenvalues {λᵢ}₁ʳ in this
order.
We can say that, modulo a unitary diagonal matrix, the matrix Bₖ converges
towards an upper triangular matrix. This is the 'essential' convergence of
Wilkinson (see Ciarlet, 1989, p. 205, and Wilkinson, 1965, p. 517).
The condition (5.2.2) is not satisfied when there exist eigenvalues having the
same modulus. Without further investigation we can no longer deduce the
convergence ω(AᵏS_f, M_f) → 0 when f is an index corresponding to a set of f
eigenvalues that includes eigenvalues having the same modulus. The matrix Dₖ of
Theorem 5.2.3 becomes block-diagonal unitary; the limit form of Bₖ becomes
block-triangular, although it remains triangular in certain cases.
An important case in which there exist eigenvalues having the same modulus
is the case of real matrices which may have pairs of conjugate complex
eigenvalues. We shall return to this question in greater detail when we discuss
the power method in Section 5.3.
The method of subspace iteration is characterized by the subspace AᵏS, the
kth iterate by A of the subspace S. Starting from there one can study the
convergence of the method and, if it converges, at what speed it does so.
The reader will verify that the method is still defined when r = n and A is
regular; we then have
dim S_n = dim M_n = dim AᵏS = n,
and ω(AᵏS_n, M_n) = 0 whatever the value of k.
In practice, several ways of constructing a basis for AᵏS can be thought of. In
(5.2.1) we presented the construction of an orthonormal basis Qₖ arising from the
Schmidt factorization QR.
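That orthonormal construction can be sketched as follows (a hedged illustration; the symmetric test matrix and the iteration count are mine):

```python
import numpy as np

# Hedged sketch of subspace iteration with Schmidt (QR)
# re-orthonormalisation as in (5.2.1): Q_{k+1} R = A Q_k.  The columns of
# Q_k span A^k S, and B_k = Q_k* A Q_k carries the dominant eigenvalues.
def subspace_iteration(A, U, iters=100):
    Q, _ = np.linalg.qr(U)
    for _ in range(iters):
        Q, _ = np.linalg.qr(A @ Q)
    return Q, Q.T @ A @ Q

A = np.diag([5.0, 3.0, 1.0, 0.5]) + 0.1 * np.ones((4, 4))  # symmetric example
U = np.eye(4)[:, :2]                                       # dim S = r = 2
Q, B = subspace_iteration(A, U)
ritz = np.linalg.eigvalsh(B)[::-1]   # approximations of the 2 dominant eigenvalues
```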
By way of an example we shall give the 'staircase' iteration ('Treppeniteration')
of Bauer (1957), which rests on the Gaussian LR factorization (see Ciarlet,
1989, p. 138).
[Pattern of the LR (Treppeniteration) factors: a unit lower triangular matrix,
with 1 on the diagonal and entries × below it, times an upper triangular matrix
with entries × on and above the diagonal.]
stable. It is for this reason that we presented Example 5.2.1 only for its historical
interest.
and the algorithm becomes identical with the subspace iteration (5.2.1) when
r = n; in this case Xₖ is unitary and AXₖ = X_{k+1}R_{k+1}.
The method of subspace iteration is most frequently used in conjunction with
the auxiliary techniques of deflation and spectral preconditioning whose principles
we are going to describe. These are techniques that modify the spectrum of A,
but not the eigenvectors or the Schur basis; their purpose is to facilitate the
computation of the spectrum.
5.2.1 Deflation
The object is to eliminate those eigenvalues that have already been computed,
the computation being carried out one by one.
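For a computed simple eigenpair, one deflation step can be sketched as follows (an illustration of the construction of Proposition 5.2.5 below; the matrix and the variable names are my own):

```python
import numpy as np

# Hedged sketch of deflation with the left eigenvector:
# A' = A - sigma * x x_*^*, with x_*^* x = 1.  The eigenvalue lam is
# shifted to lam - sigma while the rest of the spectrum is untouched.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])                 # eigenvalues 5 and 2
lam, X = np.linalg.eig(A)
i = int(np.argmax(lam.real))               # deflate lam = 5
x = X[:, i]
mu, Y = np.linalg.eig(A.T)                 # left eigenvectors of the real matrix A
j = int(np.argmin(np.abs(mu - lam[i])))
xs = Y[:, j] / (Y[:, j] @ x)               # normalise so that x_*^T x = 1
sigma = 5.0
Ad = A - sigma * np.outer(x, xs)           # spectrum becomes {0, 2}
```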
Proposition 5.2.5 Let x and x₊ be the right and left eigenvectors corresponding to
the simple eigenvalue λ, normalized so that x₊*x = x*x = 1. The matrices
A′ = A − σxx₊* and Ã = A − σxx*
have the same eigenvalues as A except λ, which is replaced by λ − σ.
When A is diagonalisable, A′ and A have the same eigenbasis.
The matrices Ã and A have the same Schur basis.
PROOF
Hub Mfc-ill*
is such that
5.3.1
What happens when condition (5.3.1) is not satisfied? In the following we shall
suppose that A is diagonalisable. Two cases are possible:
(a) There exists a multiple semi-simple eigenvalue and the method still converges.
(b) There exist distinct eigenvalues with the same modulus, and, in general, the
method fails to converge.
We discuss these cases in more detail:
(a) μ₁ = ⋯ = μ_r, |μ₁| > |μ_{r+1}| ≥ ⋯.
If
i
then
If there exists a rational number p = t/s such that θ = 2π/p, then there exist
t subsequences that converge to the vectors
e^{2ikπs/t} [(x*₁u)x₁ + (x*₂u)x₂] (k = 1, …, t)
respectively.
5.3.2
In this section we present some aspects of the behaviour of subspace iteration,
subject to condition (5.1.1), in the light of the behaviour of the power method.
First of all, if the necessary and sufficient condition for convergence, that is
dim PS = r, is not satisfied, then convergence will occur towards an invariant
subspace that is no longer dominant.
Example 5.3.3 Let
where
sp(A) = {3, 2, 1}.
When r = 2,
Choosing
U = ( 1  0
      1  1
      0  0 )
we have
Ή:
As we should expect from the preceding discussion, we notice the convergence
of Bₖ to an upper triangular matrix having the eigenvalues {1, 3} (in this order)
upon its diagonal. In fact, the power method applied to A with the initial vector
u = (1, 1, 0)ᵀ yields the eigenvalue 1, since u happens to be the eigenvector
associated with 1.
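The power method itself takes only a few lines; the sketch below uses an upper triangular matrix of my own choosing (not the example of the text) to show the Rayleigh quotients converging to the dominant eigenvalue:

```python
import numpy as np

# Hedged sketch of the power method: u_{k+1} = A u_k / ||A u_k||.  With a
# dominant eigenvalue and a starting vector not orthogonal to the left
# dominant eigenvector, the Rayleigh quotients converge to it.
def power_method(A, u0, iters=200):
    u = u0 / np.linalg.norm(u0)
    for _ in range(iters):
        v = A @ u
        u = v / np.linalg.norm(v)
    return u, u @ A @ u                    # direction and Rayleigh quotient

A = np.array([[3.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])
u, mu = power_method(A, np.array([1.0, 1.0, 1.0]))
```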
When there are eigenvalues having the same modulus, we can obtain several
distinct block-triangular limit matrices.
and
-0.447 0 -0.894 0 V
0.365 0 -0.183 -0.913/'
but only a single limit block
0.600 -2.94 \
-2.94 -0.600/
The reader will verify that this peculiarity is due to the fact that in this case the
eigenvectors associated with 3 and −3 are orthogonal.
Lemma 5.4.1 If the vector x is well-conditioned, the error made in solving (5.4.1)
is mainly in the direction generated by x, which is the direction required.
PROOF Let Q be a basis for (x)^⊥ such that [x, Q] is unitary. Then
[x, Q]* (A − σI) [x, Q] = ( λ − σ   x*AQ
                            0       B − σI ),
where
B = Q*AQ
and
Σ^⊥ = Q(B − λI)⁻¹Q*, csp(x) = ‖Σ^⊥‖₂.
(A − σI − H)y = u + f,
where H and f are of small norm. We write
(A − σI)y = u + Hy + f = u + g,
where g = Hy + f. The error made on y is therefore e = (A − σI)⁻¹g. Now
(A − σI)⁻¹ = [x, Q] ( (λ − σ)⁻¹   −(λ − σ)⁻¹ x*AQ(B − σI)⁻¹
                       0            (B − σI)⁻¹               ) [x, Q]*.
Therefore
e = x(x*e) + Q(Q*e),
where
(**'l·
V 0.
"u«e#-
Figure 5.4.1
THE METHOD OF INVERSE ITERATION 219
Note that
σ-l and ^ = (1 + 1 0 - 2 0 Γ 1 / 2 ( 1 0 ^ 1 0 )
2
A' = ( "1010 ^
V - 1010/(1 + 1(T 20 ) 2 - 1(T 2 7(1 4- lO" 2 0 )/
and ‖A′ − A‖₂ ≈ 10⁻¹⁰. Now A has the double eigenvalue λ = 2 and |σ − λ| = 1,
which is not small. However,
Mil
A = ( 1       1
      10⁻¹⁰   1 )
and σ = 1. The exact eigenvalues are 1 ± 10⁻⁵. We compute the iterates defined
by (5.4.1), starting with
uᵀ = (0, 1).
Put
rₖ = (A − I)qₖ.
We obtain
[Table of the iterates qₖ and residuals rₖ, k = 0, …, 3.]
Put
Ä* = ß i - ß * and ak = Rk...Rt.
Then it can be verified that
We assume that the eigenvalues are simple and of strictly positive distinct
moduli; thus
|λ₁| > |λ₂| > ⋯ > |λₙ| > 0. (5.5.1)
We point out immediately that the assumption 0 ∉ sp(A) is not restrictive, since
one can satisfy it by making a translation of the spectrum.
Lemma 5.5.2 The first r columns of 𝒬ₖ generate the subspace AᵏE_r, where
E_r = lin(e₁, …, e_r), r = 1, …, n.
PROOF This follows from the triangular form of the matrix ℛₖ which appears
in the factorization Aᵏ = 𝒬ₖℛₖ; we observe that both A and ℛₖ are regular.
Theorem 5.5.4 Under the hypothesis (5.5.1) the QR algorithm, when applied to
an irreducible Hessenberg matrix, produces a sequence of unitarily similar
Hessenberg matrices which converges (modulo a unitary diagonal matrix) to an
upper triangular matrix whose diagonal consists of the eigenvalues {λᵢ}₁ⁿ in this
order.
and
'■■-Kb
Theorem 5.5.5 On the assumption (5.5.1) the QR algorithm, when applied to A,
produces a sequence of unitarily similar matrices whose limit form is an upper
triangular matrix having {λᵢ}₁ⁿ as its diagonal elements in this order, under the
necessary and sufficient condition that the n − 1 principal minors of X⁻¹ are
non-zero.
PROOF See Parlett (1968) and Parlett and Poole (1972). Parlett's article of 1968
gives a necessary and sufficient condition for convergence to a quasi-triangular
form, that is having blocks on the diagonal of order at most two.
The subspaces
S, AS, A²S, …
and
S^⊥, (A*)⁻¹S^⊥, (A*)⁻²S^⊥, …
are orthogonal complements in pairs. Note that if E_r = lin(e₁, …, e_r), then
E_r^⊥ = lin(e_{r+1}, …, eₙ).
The QR algorithm, which we have interpreted as a collection of n subspace
iteration methods for A, can also be interpreted with the help of iterations for
A~*.
where
/ = R~*
(a) σₖ = aₙₙ⁽ᵏ⁾ = eₙ*Aₖeₙ.
(b) σₖ is the eigenvalue of the submatrix E₂ᵀAₖE₂ that is closest to aₙₙ⁽ᵏ⁾, where
E₂ = [eₙ₋₁, eₙ]. This shift is known as Wilkinson's shift.
There exists no global result on convergence of the shifted QR algorithm for a
non-normal Hessenberg matrix. With strategy (a) the convergence of aₙₙ⁽ᵏ⁾ to an
eigenvalue is asymptotically quadratic. When strategy (b) is applied to a
symmetric tridiagonal matrix, the convergence of aₙₙ⁽ᵏ⁾ to an eigenvalue is at least
quadratic and almost always cubic.
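A single-shift QR step with strategy (a) can be sketched as follows (my own minimal version, without the deflation a practical code would use):

```python
import numpy as np

# Hedged sketch of the shifted QR step with strategy (a), the corner shift
# sigma_k = (A_k)_{nn}: factor A_k - sigma_k I = Q_k R_k and set
# A_{k+1} = R_k Q_k + sigma_k I (no deflation, for simplicity).
def shifted_qr(A, iters=60):
    Ak = A.astype(float).copy()
    n = Ak.shape[0]
    for _ in range(iters):
        s = Ak[n - 1, n - 1]
        Q, R = np.linalg.qr(Ak - s * np.eye(n))
        Ak = R @ Q + s * np.eye(n)
    return Ak

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
T = shifted_qr(A)
```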
Strategy (a) is related to the iteration of the Rayleigh quotient in the following
manner.
Lemma 5.5.8 If we choose σₖ = aₙₙ⁽ᵏ⁾, then Qₖeₙ is proportional to (Aₖ* − σ̄ₖI)⁻¹eₙ.
implies that
Qₖ* = Rₖ(Aₖ − σₖI)⁻¹.
Hence
Qₖeₙ = (Aₖ* − σ̄ₖI)⁻¹Rₖ*eₙ
     = (Aₖ* − σ̄ₖI)⁻¹ r̄ₙₙ⁽ᵏ⁾ eₙ,
where rₙₙ⁽ᵏ⁾ (≠ 0) is the element of Rₖ in the position (n, n). Hence
qₖ = (Aₖ* − σ̄ₖI)⁻¹eₙ / ‖(Aₖ* − σ̄ₖI)⁻¹eₙ‖₂.
Starting from qₖ₋₁ and ρₖ₋₁ = aₙₙ⁽ᵏ⁾, one iteration of the Rayleigh quotient upon
A* yields qₖ = Qₖeₙ and
Theorem 5.7.1 There exist unitary matrices Q and Z such that Q*AZ = T and
Q*BZ = S are upper triangular matrices. If for some value of i we have
tᵢᵢ = sᵢᵢ = 0, then sp[A, B] = ℂ; otherwise
sp[A, B] = {tᵢᵢ/sᵢᵢ : sᵢᵢ ≠ 0}.
NEWTON'S METHOD AND THE RAYLEIGH QUOTIENT ITERATION 227
x⁰ = u/(y*u), z = x^{k+1} − x^k,
(I − x^k y*)Az − z(y*Ax^k) = −Ax^k + x^k(y*Ax^k) (k ≥ 0),
where the superscript k represents the number of the iteration. We deduce that
Ax^{k+1} − x^{k+1}(y*Ax^k) = x^k [y*A(x^{k+1} − x^k)] (k ≥ 0).
q₀ = u/‖u‖, (A − νₖ₋₁I)zₖ = qₖ₋₁,
νₖ = (y*Azₖ)/(y*zₖ), qₖ = zₖ/‖zₖ‖ (5.8.2)
(that is, only the right-hand vector in the Rayleigh quotient is modified).
We shall show inductively that μₖ = νₖ and that the vectors xₖ and qₖ generate
the same direction; they differ only in their normalizations, which are y*xₖ = 1
and ‖qₖ‖ = 1 respectively. This is true when k = 0; indeed,
μ₀ = (y*Au)/(y*u) = ν₀.
Suppose the assertion is true for the (k − 1)st iteration; hence the vector
5.9.1
We continue to assume that λ is a simple eigenvalue and that the corresponding
eigenvector x is normalized by y*x = 1. We consider Newton's modified method:
x₀ = u/(y*u), z = xₖ₊₁ − xₖ (k ≥ 0), (5.9.1)
(I − xₖy*)Az − z(y*Ax₀) = −Axₖ + xₖ(y*Axₖ) (k ≥ 0),
NEWTON'S METHOD AND SIMULTANEOUS INVERSE ITERATIONS 229
or else, if ζ = y*Ax₀,
(A − ζI)xₖ₊₁ = xₖ [y*A(xₖ₊₁ − x₀)].
Remark The two methods are mathematically equivalent, but not numerically.
In fact we have seen in Proposition 5.4.2 that one cannot surpass the machine
precision for qₖ when it is calculated by the inverse iteration method. On the other
hand, we can obtain xₖ to a precision equal to that used to compute the residual
Axₖ − xₖ(y*Axₖ), while the linear systems are solved in single precision.
PROOF This is entirely analogous to the proof of Lemma 5.4.1. The system we
wish to solve can be written
X
where B = Q*AQ and B̃ = Q̃*AQ̃. It remains to compare ‖(B, B̃)‖_F with
1
Vo a
l
be of order r. If ε = |a| is sufficiently small, then G is of rank unity up to ε^{1/2}.
Vo
(-i) r + 1 or 1
(-l)'af~r-2 1
= (-l)r+1a'r
·. ( - l ) r a r ~ 2
V .(-I)'*1«'-1
r+1 r
= (-l) oT K,say.
Let Π be the matrix e₁eᵣᵀ, which maps ℂʳ onto lin(e₁). Π is of rank unity: one
of its singular values is equal to unity while the other r − 1 singular values are
equal to zero. In fact, ΠᵀΠ = eᵣe₁ᵀe₁eᵣᵀ = eᵣeᵣᵀ, which is the diagonal matrix
diag(0, …, 0, 1).
On the other hand, G⁻¹ and K have the same rank: the columns of G⁻¹ are
multiplied by aʳ, apart from a sign, in order to obtain K. Now
‖K − Π‖₂ ≤ c(2ε + 3ε² + ⋯ + rε^{r−1}) ≤ cε,
where c is a generic constant depending on r.
We conclude that the difference between the singular values of K and those of
Π is of the order of ε^{1/2} (see Exercise 5.9.2).
In view of Lemma 5.9.2 we are interested in that part of the solution Y of (5.9.3)
which lies in M, that is
Q*Y = (B − σI)⁻¹Q*[U − AQ̃(B̃ − σI)⁻¹Q̃*U].
For that reason we study the solution Z of the system (A — σΙ)Ζ = Q.
Theorem 5.9.4 If λ is defective, the m vectors that are the solutions of
(A − σI)Z = Q
are linearly dependent up to ε^{1/2}, for small enough ε.
Table 5.9.1.
Then the bases calculated by (5.9.2) and by (5.9.4) generate the same subspace
(A − σI)⁻ᵏS.
we suppose that the basis U is chosen to be orthonormal. Then W*W = F*F and
cond₂(W) = σ_max(F)/σ_min(F),
where σ_max(F) and σ_min(F) are the greatest and the least singular values of F
respectively.
Let E_λ be the eigenspace of A associated with the eigenvalue λ. We have
dim E_λ = g < m.
Let Π_λ be the matrix representing the orthogonal projection on E_λ. The matrix
G = Π_λF is of rank g because each of its columns is an eigenvector of A. On the
other hand,
F − G = (I − Π_λ)F.
Hence on putting F = [f₁, …, f_m] we have
‖(I − Π_λ)fᵢ‖₂ = dist(fᵢ, E_λ) (i = 1, …, m).
The reader can verify that E_λ is also the eigenspace of A₀ = XBY*, where
B = Y*AX.
We deduce from Theorem 4.3.7 that
‖F − G‖_F² = Σᵢ₌₁ᵐ ‖(I − Π_λ)fᵢ‖₂² ≤ c η².
Hence
cond₂(W) ≥ c η^{−1/2},
where c is a generic constant; the quantity η^{−1/2} is greater the smaller η is. As a
result, B cannot be well approximated by the diagonal σI.
The main interest in the iteration (5.9.2) lies in the fact that the matrix A — σΐ
remains fixed throughout the iterations; in contrast, the iteration (2.11.1) requires
the solution of a Sylvester equation. We now put forward a modification of
(2.11.1) that does not require such an expensive step, while remaining stable when
λ is defective. When λ is defective, as we have seen, B is not well approximated
by σI. Let B = QTQ* be the Schur decomposition of B, where
T = diag(τᵢ) + N,
B̃ = QT̃Q*.
Then ‖B − B̃‖₂ = ‖T − T̃‖₂ = maxᵢ |τᵢ − σ|.
This enables us to put forward a modified Newton method which is stable
when λ is defective and which has almost the same complexity as (5.9.2):
X₀ = U, Z = Xₖ₊₁ − Xₖ, (5.9.5)
(I − UY*)AZ − ZB̃ = −F(Xₖ) (k ≥ 0).
In this context a natural choice for σ is the arithmetic mean
σ = (1/m) Σᵢ₌₁ᵐ τᵢ.
In contrast to (5.9.2), the interest in (5.9.5) lies in the fact that it allows us to
compute the set of all m basis vectors of M with the precision required. See
Exercise 2.11.3 for a sufficient condition of convergence for (5.9.5). This sufficient
condition requires that λ is not too ill-conditioned.
5.10 BIBLIOGRAPHICAL COMMENTS
The geometrical illumination of Sections 5.1 to 5.5 was inspired by the fundamental
article of Parlett and Poole (1973) and by Watkins' paper (1982). The presentation
of the inverse iteration method was adapted from the article by Peters and
Wilkinson (1979), in particular Proposition 5.4.2 and Example 5.4.2. The study
of the simultaneous inverse iterations for calculating a defective eigenvalue is
new (Chatelin, 1986).
Here is a point of nomenclature: the method of simultaneous iterations
(Rutishauser, 1969) has different names according to the context: subspace
iteration in Parlett's (1980) book (following the custom of structural engineers)
and orthogonal iteration in the book by Golub and Van Loan (1989, Ch. 7). The
practical implementation of this method always involves a step of projection (see
Chapter 6).
EXERCISES
[Two matrix patterns with entries *, ×, and 0; u₀ = e ∈ ℂⁿ.]
where * is a non-zero element, x is an element that is not necessarily zero and
all other elements are zero.
5.3.2 [D] Suppose that A is Hermitian and that the conditions of Theorem 5.3.1
are satisfied. Prove that
|μₖ − λ₁| = O(|λ₂/λ₁|^{2k}).
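This squared rate is easy to observe numerically; the sketch below (a diagonal test matrix of my own choosing) tracks the error ratios, which approach (λ₂/λ₁)² = 1/4:

```python
import numpy as np

# Hedged numerical illustration of Exercise 5.3.2: for a Hermitian matrix
# the Rayleigh-quotient estimates of the power method converge at the
# squared rate |lambda_2/lambda_1|^{2k}.  Here lambda_2/lambda_1 = 1/2,
# so consecutive error ratios should approach 1/4.
A = np.diag([4.0, 2.0, 1.0])
u = np.ones(3) / np.sqrt(3.0)
errs = []
for _ in range(10):
    v = A @ u
    u = v / np.linalg.norm(v)
    errs.append(abs(u @ A @ u - 4.0))
ratio = errs[-1] / errs[-2]
```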
5.3.3 [D] How can the power method be used to compute the eigenvalue(s) of
least modulus of a regular matrix?
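One answer is to apply the power method to A⁻¹, solving a linear system at each step; the sketch below (my code, with an illustrative symmetric positive definite matrix) does exactly that:

```python
import numpy as np

# Hedged sketch answering Exercise 5.3.3: the power method applied to
# A^{-1} (solving A v = u at each step rather than forming the inverse)
# yields the dominant eigenvalue of A^{-1}, i.e. the reciprocal of the
# eigenvalue of A of least modulus.  In practice A is factored once and
# the factors are reused.
def inverse_power(A, u0, iters=100):
    u = u0 / np.linalg.norm(u0)
    for _ in range(iters):
        v = np.linalg.solve(A, u)
        u = v / np.linalg.norm(v)
    return u @ A @ u                       # Rayleigh quotient

A = np.array([[5.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 0.5]])            # symmetric positive definite
lam_min = inverse_power(A, np.ones(3))
```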
5.3.4 [C] Study the behaviour of the power method for the matrices
Prove that every pair of linearly independent vectors chosen from the limit
vectors of the power method enables us to construct a 2 x 2 matrix with spectrum
{λ₁, λ₂}.
5.3.9 [B:67] Revert to Exercise 5.3.8 in the case in which A is real and λ₁ = −λ₂.
Prove that if v and w are two limits of convergent subsequences, then v + w and
v − w are the corresponding eigenvectors.
- G -?>
Show that one obtains two distinct constant subsequences. Comment.
5.5.4 [D] Let A ∈ ℂⁿˣⁿ. Consider the following algorithm, known as additive
reduction (AR):
A₀ = A,
Aₖ₊₁ = Eₖ⁻¹AₖEₖ,
where Eₖ is the lower triangular part of Aₖ, including the diagonal.
(a) Prove that if A is an irreducible lower Hessenberg matrix with eigenvalues
of distinct moduli, and if the matrices Aₖ = (aᵢⱼ⁽ᵏ⁾) generated by the AR
algorithm are such that
∀k: |a₁₁⁽ᵏ⁾| > |a₂₂⁽ᵏ⁾| > ⋯ > |aₙₙ⁽ᵏ⁾| > 0,
then, as k → ∞, the diagonal elements of Aₖ tend to the eigenvalues of A.
(b) Compare the complexity of this algorithm with that of the QR method.
5.5.5 [D] Examine the potential instability of the AR algorithm (Exercise 5.5.4).
In particular, examine the case of defective eigenvalues and the case of matrices
with a greatly extended spectrum.
5.5.6 [A] Compare the basis Qk defined by (5.2.1) with the basis Qk of the QR
method.
5.5.7 [B:21,26] Let 𝒫 be the vector space of polynomials with real coefficients,
endowed with a scalar product ⟨·,·⟩. Consider an orthonormal system
(p₀, p₁, …, pₙ, …), where pₖ is of degree k.
(a) Prove that the polynomials pₖ satisfy the relation
pₙ₊₁(x) = (Aₙx + Bₙ)pₙ(x) − Cₙpₙ₋₁(x).
(b) Prove that if aₖ and bₖ are such that
pₖ(x) = aₖxᵏ + bₖx^{k−1} + ⋯,
then
Aₖ = aₖ₊₁/aₖ,
Bₖ = Aₖ(bₖ₊₁/aₖ₊₁ − bₖ/aₖ),
Cₖ = aₖ₊₁aₖ₋₁/aₖ².
(c) Define
αₖ = −Bₖ/Aₖ and βₖ = (Cₖ₊₁/(AₖAₖ₊₁))^{1/2},
and construct the symmetric tridiagonal matrix
T = ( α₀  β₁
      β₁  α₁  β₂
          β₂  α₂  ⋱
              ⋱   ⋱ ).
⟨p, q⟩ = ∫ w(x)p(x)q(x) dx,
P\\22w=\ vv(x)|p(x)|2dx
' - - ! >
exists.
(d) Show that, for each k ≥ 1, the polynomial pₖ has k simple real roots in the
interval [a, b]; we denote these roots by x_{j,k} (j = 1, 2, …, k).
(e) Prove the identity
where the weights w_{j,n} are such that the error Eₙ(f) is zero when f is a
polynomial of degree ≤ 2n − 1.
(f) Show that the weights w_{j,n} can be deduced from the first components of the
eigenvectors φₙ(x_{j,n}) provided that the moments of the function w, that is the
integrals
∫ w(x)xᵏ dx,
are known.
mₖ = ∫ₐᵇ w(x)xᵏ dx, k ≥ 0.
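In the Legendre case (w ≡ 1 on [−1, 1]) this construction, often called the Golub–Welsch algorithm, can be sketched as follows; the recurrence coefficients used below are the standard Legendre ones, an assumption of mine rather than the book's notation:

```python
import numpy as np

# Hedged sketch of part (f) for w = 1 on [-1, 1]: the Gauss nodes are the
# eigenvalues of the symmetric tridiagonal (Jacobi) matrix T and each
# weight is mu_0 times the squared first component of the corresponding
# normalised eigenvector, with mu_0 = integral of w = 2.
n = 5
k = np.arange(1, n)
beta = k / np.sqrt(4.0 * k**2 - 1.0)       # Legendre recurrence coefficients
T = np.diag(beta, 1) + np.diag(beta, -1)   # alpha_k = 0 by symmetry of w
nodes, V = np.linalg.eigh(T)
weights = 2.0 * V[0, :]**2
approx = np.sum(weights * nodes**4)        # Gauss rule applied to x^4
```

A 5-point Gauss rule is exact for polynomials of degree ≤ 9, so the computed value equals the integral of x⁴ over [−1, 1], namely 2/5.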
5.5.8 [B:65] We generalize here the basic QR algorithm with the help of the
notion of isospectral flow.
(a) Show that every matrix A ∈ ℂⁿˣⁿ has a unique decomposition
A = π₁(A) + π₂(A),
where π₁(A) is skew-Hermitian [π₁(A)* = −π₁(A)] and π₂(A) is an upper
triangular matrix with real diagonal elements.
For all B and X in ℂⁿˣⁿ define
[B, X] = BX − XB.
Let B₀ ∈ ℂⁿˣⁿ and suppose that f is analytic in an open set containing the
it produces a sequence
Λₖ = e^{f(B(k))} (k = 1, 2, …).
We recover the QR method for B₀ by taking f(z) = ln z.
A = ( α₁  β₁
      β₁  α₂  β₂
          ⋱   ⋱   ⋱
              βₙ₋₁  αₙ ).
Define the Wilkinson shift by
ω = α₁ − β₁ if α₁ = α₂,
ω = α₁ − sgn(δ)β₁²/(|δ| + √(δ² + β₁²)) otherwise,
where δ = (α₁ − α₂)/2. Define the vector p by
(A − ωI)p = e₁
and the vector q by
(A − ωI)q = τp,
where
τ = 1/‖p‖₂.
(d) Prove that
\\(Α-ω^ί\\22 = ^^τηϊη(2β2ι,β22Αβ1βι\/^β)'
Deduce that if A is an irreducible real symmetric tridiagonal matrix, then the
QL algorithm, with Wilkinson's shift of origin, generates a sequence of
irreducible real symmetric tridiagonal matrices
Aₖ = ( α₁⁽ᵏ⁾  β₁⁽ᵏ⁾
       β₁⁽ᵏ⁾  α₂⁽ᵏ⁾  ⋱
              ⋱      ⋱   βₙ₋₁⁽ᵏ⁾
                     βₙ₋₁⁽ᵏ⁾  αₙ⁽ᵏ⁾ )
such that
σ = ω − π_T(ω)/π′_T(ω) (Saad),
where π_T is the characteristic polynomial of T.
U*AV = ( A₁₁  A₁₂          U*BV = ( B₁₁  B₁₂
         0    A₂₂ ),                0    B₂₂ ),
where A₁₁, B₁₁ ∈ ℂᵐˣᵐ. Define
dif(A₁₁, B₁₁; A₂₂, B₂₂) = min_{‖X‖_F = ‖Y‖_F = 1} max{Δ(A₁₁, A₂₂), Δ(B₁₁, B₂₂)},
where
Δ(A₁₁, A₂₂) = ‖A₂₂Y − XA₁₁‖_F
and
Δ(B₁₁, B₂₂) = ‖B₂₂Y − XB₁₁‖_F.
Now consider two arbitrary unitary matrices in ℂⁿˣⁿ:
U = (U₁, U₂) and V = (V₁, V₂),
where U₁, V₁ ∈ ℂⁿˣᵐ. Define
Aᵢⱼ = Uᵢ*AVⱼ and Bᵢⱼ = Uᵢ*BVⱼ,
and, for given X, Y in ℂ⁽ⁿ⁻ᵐ⁾ˣᵐ:
U₁′ = (U₁ + U₂X)(I + X*X)^{−1/2},
U₂′ = (U₂ − U₁X*)(I + XX*)^{−1/2},
V₁′ = (V₁ + V₂Y)(I + Y*Y)^{−1/2},
(f) Prove that the matrices X and Y of parts (c) and (d) satisfy the conditions
max{||Jf||F,||y||F}<2^
o
5.7.2 [B:53] Let A ∈ ℝⁿˣⁿ and B ∈ ℝⁿˣⁿ be two symmetric matrices. We suppose
that B is positive definite. Consider the Rayleigh quotient
μ(x) = (x*Ax)/(x*Bx) (x ≠ 0).
For a given vector xk we use the notation
to = Μ**)>
Ck = A- μβ.
Let
C* = Ö* — Ek — Fk
be the descomposition of Ck into a diagonal matrix (£>k), the strictly lower
triangular part ( — Ek) and the strictly upper triangular part ( — Fk). For a given
value of the parameter ω(>0) we define
$$M_k = I - \omega(D_k - \omega E_k)^{-1}C_k.$$
Consider the iteration
$$x_{k+1} = M_kx_k.$$
If
$$\mu_0 = \min_i \frac{a_{ii}}{b_{ii}} = \min_i \mu(e_i),$$
then
$$\lim_{k\to\infty} f_k = 0,$$
where
$$f_k = (A - \mu_kB)x_k$$
and
A 1
Φο = Φ'*
$$\lambda_k = \psi_k^*A\varphi_k,$$
5.9.2 [A] Consider Lemma 5.9.3. Prove that the difference between the singular values of the matrices $K$ and $\Pi$ is of order $\varepsilon^{1/2}$.
5.9.3 [A] Consider Proposition 5.9.7. Let $\sigma_i^2$ be the eigenvalues of $F^*F$. Prove that
$$\sigma_i = O(\eta^{1/2}).$$
5.9.4 [C] Let
$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$
and let $M$ be the invariant subspace associated with the defective eigenvalue
$\lambda = 1$: $M = \operatorname{lin} X$, where $X = (e_2, e_3)$. In the method (2.11.1) take $Y = X$. Choose
The numerical methods for large matrices are based on the principle of projection
onto an appropriate subspace; they require only the product of the matrix $A$
with a vector, the matrix itself being stored in secondary memory. The methods
we are going to propose in this and the next chapter are at present the most
efficient ones for computers of traditional construction (sequential computers),
possibly equipped with a vector unit.
What is a large eigenvalue problem? Evidently, there exists no precise and
absolute answer to this question, for the notion of size depends on the computer
used. We could propose the following answer: an eigenvalue problem is regarded
as large when it is much cheaper to compute only those eigenvalues and
eigenvectors that are required than to compute them all.
The eigenvalue problems of large sparse matrices arise mainly from the
discretisation of partial differential equations. The most frequent requirements are
to find (a) the least eigenvalues of a symmetric matrix or (b) the eigenvalues of
greatest real part of a non-symmetric matrix. For example, in structural
mechanics one may wish to compute several hundred eigenvalues of matrices
whose orders exceed $10^5$. In quantum chemistry the order may reach $10^6$ and
more. The majority of the spectral problems that have been solved up to now
are symmetric, but the share of non-symmetric matrices is increasing (problems
of stability and bifurcation are considered in Chapter 3).
The next two chapters present the state of the art with regard to the algorithms
for large eigenvalue problems. Several theoretical questions are open at present,
which explains the heuristic aspect of some of the algorithms that will be
described.
Chapter 6 is concerned with the extreme eigenvalues while Chapter 7 treats
the eigenvalues of greatest real part when the matrix is non-symmetric.
Lemma 6.2.1 Let $\dim PS = r$ and let $x_i$ be an eigenvector associated with $\mu_i$. For
every $\varepsilon > 0$, there exists a unique vector $s_i$ in $S$ and an index $k$ such that
Since the $\{Pu_j\}_1^r$ are linearly independent, there exists for any $x_i \in M$ a unique
$s_i$ in $S$ such that $Ps_i = x_i$. In what follows $x_i$ will be an eigenvector:
$$Ax_i = \mu_ix_i.$$
By definition,
$$\|(I - \pi)x_i\|_2 = \min_{y \in S}\|x_i - y\|_2 \le \|x_i - y_i\|_2.$$
Hence
$$\|x_i - y_i\|_2 \le \|(I - P)s_i\|_2.$$
Put
$$C = \mu_i^{-1}A(I - P), \qquad \rho(C) = |\mu_{r+1}/\mu_i|.$$
For every $\varepsilon > 0$, there exists an integer $k$ such that, when $l \ge k$, we have
$$\|C^l\|^{1/l} \le \rho(C) + \varepsilon.$$
When $A$ is diagonalisable, $A = XDX^{-1}$ and $\|C^l\| \le \operatorname{cond}_2(X)\rho^l(C)$. See Exercise
6.2.1 for a study of the constant when $A$ is not diagonalisable.
When $l \to \infty$, $\operatorname{dist}(x_i, S_i)$ tends to zero like $|\mu_{r+1}/\mu_i|^l$. The constant
$\|x_i - s_i\|_2$ diminishes as the acute angle between the eigenvector $x_i$ and the initial
subspace $S$ becomes smaller (see Figure 6.2.1 for the case $r = 1$).
Figure 6.2.1
THE METHOD OF SUBSPACE ITERATION REVISITED 255
PROOF The equation $\pi Ax = \lambda x$ ($\lambda \neq 0$) implies that $x = \pi x$, and so $x \in S$. Let $B$
be the matrix of $\pi A\pi$ in an orthonormal basis $V$ of $S$. Put $x = V\xi$; then $B\xi = \lambda\xi$.
Figure 6.2.2
and
$$\|(\pi - \pi_l)x\|_2 \le c\,\|(A - A_l)\pi_lx\|_2.$$
We now have
$$(A - A_l)\pi_lx = A(I - \pi_l)x + (I - \pi_l)A(\pi_lx - x),$$
which implies that
Theorem 6.2.4 On the assumption that (6.2.1) holds and that $\dim PS = r$, the
method of simultaneous iterations on $r$ vectors converges. Moreover, if the $i$th
dominant eigenvalue $\mu_i$ is simple, then the convergence rate of the $i$th pair of
eigenelements of $A_l$ is of the order of $|\mu_{r+1}/\mu_i|$, $i = 1, \ldots, r$.
When $A$ is Hermitian, the convergence rate of the $i$th eigenvalue becomes
$|\mu_{r+1}/\mu_i|^2$.
PROOF The assertions are simple consequences of Lemmas 6.2.1 and 6.2.3. See
Exercise 6.2.3.
PROOF The proof is left to the reader, who will observe that the bases $Q_l$ in
(6.2.2) and in (5.2.1) are different.
PROOF We use Corollary 1.5.5 to assert that there exists a sequence of regular
matrices $\Delta_l$ such that $X_l\Delta_l \to X$.
The proof that $\Delta_l$ is block-diagonal is carried out by induction (see Exercise
6.2.5).
We suppose that $\mu_i$ is a simple eigenvalue of $A$. Then $\mu_i^{(l)} \to \mu_i$, where $\mu_i^{(l)}$ is the
$i$th (simple) eigenvalue of $B_l$ associated with the eigenvector $\xi_i^{(l)}$. We put
In practice one does not calculate $B_l$ at each iteration. If the step of projection
takes place at every $k$ iterations, this amounts to projecting on the subspace $A^{kl}S$.
Since the dimension $r$ of $B_l$ is moderate in comparison with $n$, the methods of
Chapter 5 can be employed in order to diagonalise $B_l$ (for example by the QR
algorithm or by inverse iteration).
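The scheme just described, power steps with a Rayleigh–Ritz projection every $k$ iterations, can be sketched as follows (NumPy assumed; names and parameter choices are ours):

```python
import numpy as np

def subspace_iteration(A, r, k=5, iters=40, rng=np.random.default_rng(0)):
    """Simultaneous iteration on r vectors; every k steps the projected
    matrix B = Q^T A Q (of moderate order r) is diagonalised."""
    n = A.shape[0]
    Q, _ = np.linalg.qr(rng.standard_normal((n, r)))
    for it in range(iters):
        Q, _ = np.linalg.qr(A @ Q)           # power step + orthonormalisation
        if (it + 1) % k == 0:
            B = Q.T @ A @ Q                  # small r x r projected matrix
            theta, Y = np.linalg.eigh(B)     # Ritz values / vectors
            Q = Q @ Y
    return np.sort(theta)[::-1]              # dominant Ritz values

A = np.diag([10.0, 5.0, 2.0, 1.0, 0.5])
ritz = subspace_iteration(A, r=2)
# ritz approximates the two dominant eigenvalues, 10 and 5
```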
$$v_j = \frac{u_{j-1}}{b_j}, \qquad u = Av_j - b_jv_{j-1}, \qquad a_j = v_j^*u, \qquad (6.3.1)$$
$$u_j = u - a_jv_j, \qquad b_{j+1} = \sqrt{u_j^*u_j}.$$
THE LANCZOS METHOD 259
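The three-term recurrence (6.3.1) can be run directly (NumPy assumed; names are ours, and no reorthogonalization is attempted):

```python
import numpy as np

def lanczos(A, u0, l):
    """Lanczos recurrence: orthonormal basis V of the Krylov subspace and
    the tridiagonal matrix T = V^T A V."""
    n = A.shape[0]
    V = np.zeros((n, l))
    a = np.zeros(l)          # diagonal entries a_j
    b = np.zeros(l + 1)      # off-diagonal entries b_j (b[0] unused)
    v_prev = np.zeros(n)
    v = u0 / np.linalg.norm(u0)
    for j in range(l):
        V[:, j] = v
        u = A @ v - b[j] * v_prev
        a[j] = v @ u
        u = u - a[j] * v
        b[j + 1] = np.linalg.norm(u)
        v_prev, v = v, u / b[j + 1]
    T = np.diag(a) + np.diag(b[1:l], 1) + np.diag(b[1:l], -1)
    return V, T

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 30)); A = (A + A.T) / 2
V, T = lanczos(A, rng.standard_normal(30), l=12)
# the extreme eigenvalues of T (Ritz values) approximate the extreme
# eigenvalues of A
```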
We suppose that the quantities $b_j$ $(j = 2, \ldots, l)$ are positive, that is to say, that
$\dim \mathcal{K}_l = l$. This assumption is not restrictive, for if it is not fulfilled, the
eigenvalue problem for $A$ reduces to two subproblems (see Exercise 6.3.1).
The dimension $l$ of the approximate problem is either fixed in advance or else
is determined dynamically by the algorithm according to the value of $b_{l+1}$ (see
Section 6.3.5).
The tridiagonal matrix $T_l$ is then diagonalised:
$$T_l = Y_lD_lY_l^*.$$
The eigenvalues $D_l$ and the eigenvectors $X_l = V_lY_l$ of $A_l = \pi_lA$ are called the
Ritz values and Ritz vectors of $A$; they are the required approximations of certain
eigenelements of $A$, as we shall see. They possess the global optimal properties
envisaged in Section 4.6 of Chapter 4.
In the basic Lanczos method we seek to keep $l$ small in relation to $n$, and there
can be no question of convergence in the classical sense of this term since $l$ takes
only a finite number of values. When $l$ is of modest size compared with $n$, we
shall see that $\mathcal{K}_l$ contains vectors which are sufficiently close to certain
eigenvectors associated with extreme eigenvalues of $A$. This property of approximation
justifies the choice of the Krylov subspace $\mathcal{K}_l$, but it is not the only
justification. There is also a computational reason: the Schmidt orthogonalization
process is particularly simple in the Krylov subspace; this is the reason why the
matrix $T_l$ is tridiagonal (see Exercise 6.3.3).
We shall now show that the Lanczos method can approximate only one
eigendirection associated with a multiple eigenvalue. The irreducible tridiagonal
Hermitian matrix $T_l$ possesses only simple real eigenvalues (Exercise 5.1.1), which
may be arranged in decreasing order of magnitude:
$$\lambda_1^{(l)} > \lambda_2^{(l)} > \cdots > \lambda_l^{(l)}.$$
Let $\{\lambda_i;\ 1 \le i \le d\}$ be the distinct eigenvalues of $A$, also arranged in decreasing
order of magnitude:
$$\lambda_1 > \lambda_2 > \cdots > \lambda_d = \lambda_{\min}.$$
Let $P_i$ be the eigenprojection associated with $\lambda_i$ and let $E$ be the subspace
generated by the vectors $\{P_iu:\ 1 \le i \le d\}$; thus
$$E = \operatorname{lin}(P_1u, \ldots, P_du).$$
If these vectors are not zero, they are eigenvectors of A corresponding to the
distinct eigenvalues; they are linearly independent.
PROOF $x_i$ is an eigenvector associated with $\lambda_i$. Put $v = q(A)u$, $q \in P_{l-1}$; it follows
that
$$u = P_iu + \sum_{j\neq i} P_ju \qquad\text{and}\qquad v = q(\lambda_i)P_iu + \sum_{j\neq i} q(\lambda_j)P_ju.$$
Hence
$$\tan^2\theta(P_iu, v) = \frac{\sum_{j\neq i} q^2(\lambda_j)\|P_ju\|_2^2}{q^2(\lambda_i)\|P_iu\|_2^2}.$$
If $(I - P_i)u \neq 0$, we have
$$\sum_{j\neq i} q^2(\lambda_j)\|P_ju\|_2^2 \le \max_{j\neq i} q^2(\lambda_j)\,\|(I - P_i)u\|_2^2.$$
Writing $p(t) = q(t)/q(\lambda_i)$, we obtain
$$\tan\theta(P_iu, v) \le \min_{\substack{p \in P_{l-1}\\ p(\lambda_i) = 1}} \|p(A)y_i\|_2\ \frac{\|(I - P_i)u\|_2}{\|P_iu\|_2}.$$
Finally, we observe that
$$\tan\theta(P_iu, u) = \frac{\|(I - P_i)u\|_2}{\|P_iu\|_2}.$$
It remains to obtain an estimate for the number
$$t_{il} = \min_{\substack{p \in P_{l-1}\\ p(\lambda_i) = 1}} \|p(A)y_i\|_2.$$
This can be accomplished with the help of Chebyshev polynomials. We recall
that for real $t$ and $|t| > 1$, the Chebyshev polynomial of the first kind and of
degree $k$ is defined by
$$T_k(t) = \tfrac12\big[(t + \sqrt{t^2 - 1})^k + (t + \sqrt{t^2 - 1})^{-k}\big].$$
In Chapter 7, Sections 7.2 and 7.3, we shall collect the properties of Chebyshev
polynomials of a real or complex variable which are required in Chapters 6
and 7.
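Both branches of the definition and the three-term recurrence collected in Chapter 7 can be checked numerically (a sketch; the function name is ours):

```python
import math

def cheb(k, t):
    """T_k(t): cosine form for |t| <= 1, closed form for |t| > 1,
    extended to t < -1 by the parity T_k(-t) = (-1)^k T_k(t)."""
    if abs(t) <= 1:
        return math.cos(k * math.acos(t))
    s = abs(t) + math.sqrt(t * t - 1)
    val = 0.5 * (s**k + s**(-k))
    return val if t > 0 or k % 2 == 0 else -val

# three-term recurrence check: T_k = 2 t T_{k-1} - T_{k-2}
t = 1.7
t0, t1 = 1.0, t
for k in range(2, 8):
    t0, t1 = t1, 2 * t * t1 - t0
# now t1 equals T_7(1.7), and cheb(7, 1.7) agrees
```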
where
$$\Delta_1 = 1, \qquad \Delta_i = \prod_{j<i} \frac{\lambda_j - \lambda_{\min}}{\lambda_j - \lambda_i} \quad (i > 1)$$
and
$$\gamma_i = 1 + 2\,\frac{\lambda_i - \lambda_{i+1}}{\lambda_{i+1} - \lambda_{\min}}.$$
262 NUMERICAL METHODS FOR LARGE MATRICES
PROOF We have
$$t_{il} = \min_{\substack{p \in P_{l-1}\\ p(\lambda_i) = 1}} \|p(A)y_i\|_2 \le \min_{\substack{p \in P_{l-1}\\ p(\lambda_i) = 1}} \max_{j>i} |p(\lambda_j)|.$$
By Theorem 7.2.1,
$$\min_{\substack{p \in P_{l-1}\\ p(\lambda_1) = 1}} \max_{\lambda_n \le t \le \lambda_2} |p(t)| = \frac{1}{T_{l-1}(\gamma_1)},$$
where
$$\gamma_1 = 1 + 2\,\frac{\lambda_1 - \lambda_2}{\lambda_2 - \lambda_{\min}}.$$
For $i > 1$,
$$t_{il} \le \min_{\substack{p \in P_{l-1}\\ p(\lambda_1) = \cdots = p(\lambda_{i-1}) = 0\\ p(\lambda_i) = 1}} \max_{j>i} |p(\lambda_j)|.$$
For such a polynomial, writing
$$p(t) = \prod_{k<i} \frac{\lambda_k - t}{\lambda_k - \lambda_i}\ q(t), \qquad q \in P_{l-i},\ q(\lambda_i) = 1,$$
we obtain
$$\max_{j>i} |p(\lambda_j)| \le \Big(\prod_{k<i} \frac{\lambda_k - \lambda_{\min}}{\lambda_k - \lambda_i}\Big) \min_{\substack{q \in P_{l-i}\\ q(\lambda_i) = 1}} \max_{j>i} |q(\lambda_j)|.$$
We conclude that
$$t_{il} \le \frac{\Delta_i}{T_{l-i}(\gamma_i)}.$$
When $l$ increases, the decrease of $\tan\theta(x_i, \mathcal{K}_l)$ is of the order of the decrease
of $1/T_{l-i}(\gamma_i)$. The quantity $\gamma_i$ depends on the relative distance $(\lambda_i - \lambda_{i+1})/(\lambda_{i+1} - \lambda_{\min})$.
Put
$$\tau_i = \gamma_i + \sqrt{\gamma_i^2 - 1}.$$
For sufficiently great $l$, the value of $T_{l-i}(\gamma_i)$ is of the order of $\tfrac12\tau_i^{l-i}$, and the rate
of decrease of $\theta(x_i, \mathcal{K}_l)$ is $1/\tau_i$. This rate is the better the greater $\gamma_i$ is, and $\gamma_i$ is the
greater the smaller $i$ is in comparison with $l$; that is, when $x_i$ is an eigenvector associated
with one of the greatest eigenvalues of $A$.
6.3.4 Approximation
We seek to estimate the precision of the Lanczos method as a function of $l$, $u$ and
the spectrum of $A$, for a pair of eigenelements $\lambda$, $x$, $\|x\|_2 = 1$. We put
$$\alpha_l = \|(I - \pi_l)x\|_2 = \sin\theta(x, \mathcal{K}_l) \le \tan\theta(x, \mathcal{K}_l),$$
which can be majorized with the help of Theorem 6.3.3.
Theorem 6.3.4 Suppose that $\lambda$ and $x$ are given. If $\alpha_l$ is sufficiently small, there
exist eigenelements $\lambda_l$ and $x_l$ of $A_l$ such that $|\lambda - \lambda_l| \le c\alpha_l^2$ and $\sin\theta_l \le c\alpha_l$, where
$c$ is a generic constant and where $\theta_l$ is the acute angle formed by the eigendirections
$\operatorname{lin}(x_l)$ and $\operatorname{lin}(x)$.
We have
$$(P_l - P)x = \frac{1}{2\pi i}\oint_{\Gamma_\lambda} R_l(z)(I - \pi_l)AR(z)x\,dz = \frac{1}{2\pi i}\oint_{\Gamma_\lambda} \frac{R_l(z)}{\lambda - z}\,dz\,(I - \pi_l)Ax,$$
because
$$R(z)x = \frac{x}{\lambda - z}.$$
Now
$$\|(P_l - P)x\| \le \frac{\mu_l}{2\pi}\,c_l\,d_l\,\alpha_l,$$
where
$$\mu_l = \operatorname{meas}(\Gamma_\lambda), \qquad c_l = \max_{z\in\Gamma_\lambda}\|R_l(z)\|, \qquad d_l = [\operatorname{dist}(\lambda, \Gamma_\lambda)]^{-1}.$$
If we put
$$c = \frac{\mu_l}{2\pi}\max_{1\le l\le n}(c_ld_l),$$
we conclude that
$$\|(P_l - P)x\| \le c\,\alpha_l.$$
Let $x_l = P_lx$ and let $\theta_l$ be the acute angle between $\operatorname{lin}(x)$ and $\operatorname{lin}(x_l)$. Then
$$\|x_l\|_2 = \cos\theta_l, \qquad x_l^*x = \cos^2\theta_l$$
and
$$\|x_l - x\|_2 = \sin\theta_l \le c\,\alpha_l$$
(see Figure 6.3.1). Put
$$\tilde x_l = \frac{x_l}{(x_l^*x_l)^{1/2}}.$$
We have the relations
$$A_lx_l = \lambda_lx_l, \qquad Ax = \lambda x.$$
Figure 6.3.1
where
$$\gamma_i' = 1 + 2\,\frac{\lambda_i - \lambda_{i+2}}{\lambda_{i+2} - \lambda_{\min}},$$
but the constant $c_i$, which contains $(\lambda_i - \lambda_{i+1})^{-1}$, is large.
When $\lambda$ is a multiple eigenvalue, the Lanczos method (in exact arithmetic) does
not enable us to compute the whole set of eigenvectors associated with $\lambda$. However, in
practice, rounding errors have the effect that from a certain number of iterations
onwards, the Lanczos method is applied to a neighbouring matrix having
neighbouring eigenvalues that are distinct (no longer multiple). In fact, it will be noticed
that a second copy of $\lambda$ appears which corresponds to a second eigenvector that
is not proportional to the first. This second copy appears as a result of the
Lanczos method being applied with an initial vector that has a zero component
(to machine precision) in the desired eigenspace. When $\lambda$ is of multiplicity $m$, then
$m$ copies will appear successively as $l$ increases.
We shall return to the question of multiple eigenvalues in Section 6.4, where
we shall present the block Lanczos method.
whence
$$\|Ax_i^{(l)} - \lambda_i^{(l)}x_i^{(l)}\|_2 = b_{l+1}\,|\xi_{li}|.$$
There exists an eigenvalue $\lambda_j$ of $A$ such that
which is a block tridiagonal matrix, the order of each block being r. Moreover,
the blocks form a band of size r + 1 (see Figure 6.4.1).
Write
$$W_l = [Q_0, \ldots, Q_{l-1}],$$
where the orthonormal basis $Q_j$ of $A^jS$ is constructed from the orthonormal
basis $Q_0$ of $S$ in the following manner:
(a) $X_1 = Q_0R_0$, $B_1 = 0$, $Q_{-1} = 0$;
(b) when $j = 1, 2, \ldots, l-1$,
(i) put $X_j = AQ_{j-1} - Q_{j-1}A_j - Q_{j-2}B_j^*$,
(ii) carry out the Schmidt factorization $X_j = Q_jR_j$.
Figure 6.4.1
Lemma 6.4.1 The eigenvalues of $T_l$ are of multiplicity less than or equal to $r$.
PROOF Since the matrices $R_j$ are regular, the matrix $T_l$ is of rank $\ge n - r$. For
each eigenvalue $\lambda$ of $A$, the matrix $T_l - \lambda I$ is still of rank $\ge n - r$. Hence
$$\dim\operatorname{Ker}(T_l - \lambda I) \le r.$$
Now let $E$ be the subspace generated by the $\{P_iQ_0\}_1^d$.
Lemma 6.4.2 The block Lanczos process amounts to approximating the eigenvalues
of $A' = A_{|E}$, whose eigenvalues are of multiplicity $\le r$.
and
$$s = \sum_{j=1}^{r} \xi_ju_j,$$
THE BLOCK LANCZOS METHOD 269
and so
$$Ps = \sum_{j=1}^{r} \xi_jPu_j.$$
Since the $r$ vectors $\{Pu_j\}_1^r$ are independent by hypothesis, there exists a unique
vector $s_k \in S$ such that $Ps_k = x_k$ for each given eigenvector $x_k$ $(k \in I)$. We put
$$v_k = (I - P)s_k = s_k - x_k.$$
Hence
$$\|s_k - x_k\|_2 = \tan\theta(x_k, s_k).$$
For a given $x_k$ we consider the vector $v \in \mathcal{K}_l$ which can be written in the form
$v = q(A)s_k$, where $q \in P_{l-1}$. Since
$$s_k = x_k + \sum_{j\neq k} P_js_k,$$
we have
(a) The case $k = 1$. Here $s_1$ is defined by $x_1$ and
$$\frac{\|(I - P)v\|_2^2}{\|v\|_2^2} = \frac{\sum_{j\ge 1+r} q^2(\mu_j)\|P_js_1\|_2^2}{\|v\|_2^2}.$$
The minimum of the right-hand side for $q \in P_{l-1}$ is attained for $p$. We put
$s = p(A)s_1 \in \mathcal{K}_l$,
$$p(t) = T_{l-1}\Big(1 + 2\,\frac{t - \mu_{1+r}}{\mu_{1+r} - \mu_n}\Big).$$
Hence, for $j \ge 1 + r$,
$$p^2(\mu_j) \le 1,$$
and
$$\sum_{j\ge 1+r} \|P_js_1\|_2^2 = \|(I - P)s_1\|_2^2 = \|s_1 - x_1\|_2^2.$$
$$p_i(t) = \Big[\prod_{j<i}(t - \mu_j)\Big] q(t);$$
then $p_i(\mu_j) = 0$ when $j < i$. Let $s = p_i(A)s_k$, where $s_k$ is defined by $x_k$. Now
$$\tan^2\theta(x_k, \mathcal{K}_l) \le c_k\,\frac{\|x_k - s_k\|_2^2}{T_{l-k}^2(\gamma_k)}.$$
We remark that the bounds of Theorem 6.4.2 reduce to those of Theorem 6.3.3
when $r = 1$ and the eigenvalues are distinct. The angle $\theta(x_k, \mathcal{K}_l)$ decreases like
$T_{l-k}^{-1}(\gamma_k)$, where $\gamma_k$ depends on the distance $\mu_k - \mu_{k+r}$. The generalization of the
Lanczos method to the block Lanczos method has an effect that is comparable
to the transition from the power method to the method of simultaneous iterations
(see Section 6.2).
Bounds for $|\mu_k - \mu_k^{(l)}|$ and $\|x_k - x_k^{(l)}\|_2$ $(k \in I)$ can be established as in Theorem
6.3.4.
where the product is not evaluated explicitly for large matrices. Equation (6.5.1)
is equivalent to
$$Ay = \lambda y,$$
where $x = R^{-1}y$. This reduction preserves the eigenvalues. Suppose we wish to
compute some of the smallest eigenvalues; the convergence factor for the least
eigenvalue $\lambda_1$ is determined by
Now this number may be very small. In structural mechanics, it is not rare to
have $\lambda_1 = 10^5$, $\lambda_2 = 2\times 10^5$ and $\lambda_{\max} = 10^{19}$, which leads to $\gamma_1 = 10^{-14}$. It
requires about $n$ Lanczos steps to separate $\lambda_1$ from $\lambda_2$, even in exact arithmetic.
An efficient remedy is provided by the spectral transformation, which is a natural
generalization of inverse iteration.
$$A = (K - \sigma M)^{-1}M \qquad\text{and}\qquad \nu = \frac{1}{\lambda - \sigma}.$$
However, A is no longer symmetric with respect to the Euclidean scalar
product. We have the following lemma.
Lemma 6.5.1 The matrix A is self-adjoint with respect to the scalar product
defined by M.
The additional cost in relation to (6.3.1) at each step consists in the evaluation
of $w = Mu$ and in the solution of $(K - \sigma M)u = v$. This solution is carried out with
the help of the factorization $K - \sigma M = LDL^T$, where $L$ is a lower triangular
matrix with a unit diagonal. The strategy of complete reorthogonalization is here
recommended in order to keep $l$ as small as possible. The spectral transformation
$$t \mapsto \frac{1}{t - \sigma}$$
transforms the part of the spectrum that is close to $\sigma$ into the extremities of the
spectrum of $A$. Hence the algorithm (6.5.3) will efficiently compute the eigenvalues
in an interval containing $\sigma$. If required, one could use different shifts $\sigma$. This
method enables us to determine any eigenvalue whatsoever in the interior of the
spectrum if we know an approximation $\sigma$.
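The transformation can be sketched as an operator usable inside the Lanczos recurrence (NumPy assumed; names are ours, and a small dense pre-inverse stands in for the $LDL^T$ solve used in practice):

```python
import numpy as np

def shift_invert_operator(K, M, sigma):
    """Spectral transformation t -> 1/(t - sigma).  In practice K - sigma*M is
    factorized once and each application is a solve; for this small dense
    sketch we simply pre-invert."""
    C = np.linalg.inv(K - sigma * M)
    return lambda v: C @ (M @ v)

K = np.diag([1.0, 4.0, 9.0, 16.0])
M = np.eye(4)
op = shift_invert_operator(K, M, sigma=8.5)
# an eigenvalue lambda of the pencil maps to nu = 1/(lambda - sigma);
# lambda = 9, the closest to sigma, gives the dominant |nu| = 2
x = np.array([0.0, 0.0, 1.0, 0.0])   # eigenvector for lambda = 9
```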
Remark For very large problems, the triangular factorization cannot be kept in
the central memory. If transfer time between the central and the secondary memories
is important then it may be advantageous to use the block Lanczos method, which
involves the solution of $r$ systems of the form $(K - \sigma M)u_i = w_i$ $(i = 1, \ldots, r)$.
For the sake of simplicity we have assumed that M is positive definite, but in
practice it may be singular (see Exercises 6.5.1 and 6.5.2).
$$\hat v_{j+1} = Av_j - \sum_{i=1}^{j} h_{ij}v_i, \qquad h_{ij} = v_i^*Av_j, \qquad v_{j+1} = \frac{\hat v_{j+1}}{h_{j+1,j}}, \qquad h_{j+1,j} = \|\hat v_{j+1}\|_2.$$
The algorithm terminates when $\hat v_{j+1} = 0$, which is impossible if the minimal
polynomial of $A$ with respect to $u$ is of degree $> l$. If this condition is satisfied,
$H_l = (h_{ij})$ is an irreducible Hessenberg matrix.
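The Arnoldi process just described can be sketched as follows (NumPy assumed; names are ours):

```python
import numpy as np

def arnoldi(A, u, l):
    """Arnoldi process: orthonormal basis V of the Krylov subspace and the
    (l+1) x l Hessenberg matrix H with A V[:, :l] = V H."""
    n = A.shape[0]
    V = np.zeros((n, l + 1))
    H = np.zeros((l + 1, l))
    V[:, 0] = u / np.linalg.norm(u)
    for j in range(l):
        w = A @ V[:, j]
        for i in range(j + 1):               # Gram-Schmidt against v_1..v_{j+1}
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(2)
A = rng.standard_normal((20, 20))            # non-symmetric
V, H = arnoldi(A, rng.standard_normal(20), l=8)
# the eigenvalues of the square part H[:8, :8] are the Ritz values of A
```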
In what follows we shall suppose that $A$ (and hence $A_l$) is diagonalisable with
eigenvalues $\{\lambda_j\}_1^n$. Since $H_l$ is diagonalisable, it necessarily possesses $l$ simple
eigenvalues, which we shall denote by $\{\lambda_j^{(l)}\}_1^l$. Let $P_i$ be the eigenprojection
associated with $\lambda_i$. If $P_iu \neq 0$, we put $x_i = P_iu/\|P_iu\|_2$.
PROOF We have
$$\|(I - \pi_l)x_i\|_2 = \operatorname{dist}_2(x_i, \mathcal{K}_l) = \min_{q\in P_{l-1}} \|x_i - q(A)u\|_2;$$
we obtain
$$\|(I - \pi_l)x_i\|_2 \le \frac{1}{\|P_iu\|_2}\min_{\substack{p\in P_{l-1}\\ p(\lambda_i)=1}} \|p(A(I - P_i))(I - P_i)u\|_2.$$
Now $A$ is diagonalisable, say $A = XDX^{-1}$, and so $A(I - P_i) = XD'X^{-1}$, where
$D'$ is a diagonal matrix consisting of the eigenvalues $\lambda_j$ $(j \neq i)$ of $A$. Hence
$$\|p(A(I - P_i))\|_2 \le \max_{j\neq i}|p(\lambda_j)|\ \operatorname{cond}_2(X).$$
This is the uniform norm of the best approximation to the zero function on the
set $\operatorname{sp}(A) - \{\lambda_i\}$ by means of polynomials in a complex variable of degree $< l$
satisfying the condition $p(\lambda_i) = 1$.
In Chapter 7, Theorem 7.1.6 and Example 7.1.2, it will be shown that if the
spectrum of $A$ consists of $d$ distinct eigenvalues, then among the $d - 1$ eigenvalues
of $A$ that are distinct from $\lambda_i$, there exist $l$ eigenvalues, denoted by $\lambda_1, \ldots, \lambda_l$, such
that
6.6.3
The discussion of the decrease of $\varepsilon^{(l)}$ when $l$ increases is a difficult problem in the
approximation theory of functions of a complex variable. Except in particular
cases in which the spectrum is of a very special form, it is not easy to establish
upper bounds of $\varepsilon^{(l)}$ that are both simple and precise. The two examples below
show that $\varepsilon^{(l)}$ depends on $\operatorname{sp}(A)$ in an important manner.
Example 6.6.1 When the eigenvalues are uniformly distributed over [0, 1], we
have
$$\lambda_j = \frac{j-1}{n-1} \quad (j = 1, \ldots, n) \qquad\text{and}\qquad \varepsilon^{(n-1)} = \frac{1}{2^{n-1} - 1}.$$
Example 6.6.2 When the eigenvalues are uniformly distributed over the circle
$|z| = 1$, we have $\lambda_j = \exp[2(j-1)\pi i/n]$ $(j = 1, \ldots, n)$ and $\varepsilon^{(l)} = 1/l$.
It is seen that the decrease of $\varepsilon^{(l)}$ with increasing $l$ can be quite moderate for
ARNOLDI'S METHOD 275
certain cases of spectral distribution. The study of $\varepsilon^{(l)}$ is pursued by studying the
upper bound $\eta^{(l)}$; this is obtained by letting $z$ vary in a domain $D$ which contains
$\operatorname{sp}(A) - \{\lambda\}$ and excludes $\lambda$. In fact,
$$\max_{\operatorname{sp}(A) - \{\lambda\}} |p(z)| \le \max_{D} |p(z)|$$
and
$$\varepsilon^{(l)} \le \eta^{(l)} = \min_{\substack{p\in P_{l-1}\\ p(\lambda)=1}}\max_{z\in D} |p(z)|.$$
When the matrix A is real, its spectrum is symmetric with respect to the real
axis. If the eigenvalue λ is also real, we may choose for D a domain that is
symmetric with respect to the real axis.
The following theorem determines η{1) in three particular cases: λ is real and
D consists of
(a) a line segment,
(b) a disk,
(c) the interior of an ellipse with real major axis.
Theorem 6.6.2 We have the following characterizations, where $a$, $\lambda - c$, $e$ and $\rho$
are positive real numbers:
(a) When $D$ is the real interval $\{t;\ |t - c| \le a\}$,
(c) When $D$ is bounded by the ellipse with centre $c$, focal distance $e$ and semi-major
axis $a$,
Figure 6.6.1
6.6.4 Approximation
We shall prove the analogue of Theorem 6.3.4 after putting
$$\alpha_l = \|(I - \pi_l)x\|_2.$$
Theorem 6.6.3 Suppose that $\lambda$ and $x$ are given and that $\lambda$ is simple. Then if $\alpha_l$ is
sufficiently small, there exist eigenelements $\lambda_l$ and $x_l$ of $A_l$ such that $|\lambda - \lambda_l| \le c\alpha_l$
and $\sin\theta_l \le c\alpha_l$, where $c$ is a generic constant.
PROOF We revert to the proof of Theorem 6.3.4, where it was shown that
$\|P_l - P\|_2 \le c\alpha_l$. We suppose that the eigenvectors $x$ and $x_l$ are such that
$\|x\|_2 = \|x_l\|_2 = 1$.
We introduce a vector $x^{(l)}$ which is proportional to $x$ and satisfies $x_l^*x^{(l)} = 1$, so that
$$P_lx^{(l)} = x_l.$$
Similarly, define
$$P_lx = x_l'$$
(see Figure 6.6.2). Now
$$\|(P_l - P)x\|_2 = \|x_l' - x\|_2 = \|x_l'\|_2\,\|x^{(l)} - x\|_2.$$
We obtain
$$\lambda - \lambda_l = \frac{1}{\|x_l\|_2^2}\big[x_l^*(I - \pi_l)Ax + x_l^*\pi_lA(x^{(l)} - x)\big].$$
Figure 6.6.3
Hence
$$|\lambda - \lambda_l| \le c\Big(\max_{1\le l\le n}\frac{1}{\|x_l\|_2}\Big)\alpha_l \le c\,\alpha_l.$$
Suppose we wish to approximate the dominant eigenvalue $\lambda = \lambda_1$. We assume
that the remainder of the spectrum is real (or nearly real; see Figure 6.6.3). Then
Theorem 6.6.2 enables us to deduce an error bound equal (or nearly equal) to
that of the Lanczos method, which amounts to $1/T_{l-1}(\gamma_1)$ (without, however,
the exponent two for the eigenvalues).
Supposing that the dominant eigenvalue is real, we can obtain an approximation
for the $i$th eigenvalue which is still very close to that provided by Lanczos,
on the condition that the remainder of the spectrum is real or nearly real.
When A possesses complex eigenvalues, the study of the precision of Arnoldi's
algorithm is far less conclusive than that of the Lanczos method. The reader will
understand that to a large extent this is due to the lesser degree of perfection of
the theory of uniform approximation on a compact set in the complex
plane.
and
Theorem 6.6.4 The vectors defined in (6.6.2) form a basis $W_l$ for $\mathcal{K}_l$ such that
$w_i^*w_j = \delta_{ij}$ when $|i - j| \le q + 1$.
PROOF Put
$$i^* = \max(1, j - q) = \begin{cases} 1 & \text{if } j \le q, \\ j - q & \text{if } j > q; \end{cases}$$
when $j = 1, \ldots, l$, we have
$$\hat w_{j+1} = Aw_j - \sum_{i=i^*}^{j} h_{ij}w_i.$$
Let $H_l$ be the band Hessenberg matrix such that its non-zero elements are $h_{ij}$
when $i - 1 \le j \le i + q$;
we have the identity
$$AW_l = W_lH_l + h_{l+1,l}\,w_{l+1}e_l^*. \qquad (6.6.3)$$
with respect to the adjoint bases $W_l$ and $W_lG_l^{-*}$ (Exercise 6.6.3). On multiplying
the identity (6.6.3) by $\tilde W_l^*$ we deduce that
$$\tilde w_l^*(Ax_l - \lambda_lx_l) = 0.$$
The problem (6.7.1) is known as the Petrov approximation of (6.1.1) (Chatelin,
1983, p. 64 and Ch. 4).
We construct orthonormal bases $V_l^1$ and $V_l^2$ in $G_l^1$ and $G_l^2$ respectively.
Equation (6.7.1) becomes
$$(V_l^{2*}AV_l^1)\xi_l = \lambda_lV_l^{2*}V_l^1\xi_l,$$
which is a generalized eigenvalue problem.
The reader can verify (Exercise 6.7.1) that the orthogonal projection $\omega_l$ on $G_l^2$
Figure 6.7.1
defines an oblique projection $\pi_l'$ on $G_l^1$ (see Figure 6.7.1); this justifies the name
of oblique projection for this method.
$$c_{j+1} = \big(|\tilde v_{j+1}^*\hat v_{j+1}|\big)^{1/2}, \qquad v_{j+1} = \frac{\hat v_{j+1}}{c_{j+1}}.$$
EXERCISES
6.1.2 [D] Let $N > n$. Consider three matrices $a_\alpha \in \mathbb{C}^{n\times n}$, $a_\beta \in \mathbb{C}^{N\times n}$ and $a_\gamma \in \mathbb{C}^{n\times N}$
such that
$$a_\alpha = ra_\beta = a_\gamma p,$$
where $p \in \mathbb{C}^{N\times n}$ and $r \in \mathbb{C}^{n\times N}$ are such that
$$rp = I_n.$$
Define the following square matrices of order $N$:
$$A_\alpha = pa_\alpha r, \qquad A_\beta = a_\beta r, \qquad A_\gamma = pa_\gamma.$$
Let $\mu$ be a non-zero eigenvalue of algebraic multiplicity $m$. Let $u \in \mathbb{C}^{n\times m}$ be a basis
of the right invariant subspace and let $v \in \mathbb{C}^{n\times m}$ be a basis such that $v^*u = I_m$.
Define
$$\sigma = v^*a_\alpha u \qquad\text{and}\qquad \Pi = pr.$$
Let $s_\alpha$ be the block-reduced resolvent of $a_\alpha$ associated with the eigenvalue $\mu$.
Let $r_\alpha(z)$ be the inverse operator of $y \mapsto a_\alpha y - yz$, where $z \in \mathbb{C}^{m\times m}$ is a given matrix
whose spectrum is disjoint from that of $a_\alpha$.
(a) Prove that $\sigma$ is regular.
(b) Prove that for each $x \in \mathbb{C}^{n\times m}$ we have
$$s_\alpha(x) = \lim_{z\to\mu I_m} r_\alpha(z)\big[(I_n - uv^*)x\big].$$
(c) Prove that $\mu$ is an eigenvalue of algebraic multiplicity $m$ of the matrices $A_\alpha$, $A_\beta$
and $A_\gamma$.
(d) Obtain the spectral projections for $A_\alpha$, $A_\beta$ and $A_\gamma$ as functions of $a_\alpha$, $a_\beta$, $a_\gamma$,
$p$, $r$, $u$ and $v$.
(e) Prove that for each $X \in \mathbb{C}^{N\times m}$ the block-reduced resolvents of $A_\alpha$, $A_\beta$ and $A_\gamma$
associated with $\mu$ are given by the formulae
$$S_\alpha(X) = p\,s_\alpha(rX) - (I_N - \Pi)X\sigma^{-1},$$
$$S_\beta(X) = \big[a_\beta\,s_\alpha(rX) - (I_N - a_\beta u\sigma^{-1}v^*r)X\big]\sigma^{-1},$$
$$S_\gamma(X) = \big[p\,s_\alpha(a_\gamma X) - (I_N - pu\sigma^{-1}v^*a_\gamma)X\big]\sigma^{-1}$$
respectively.
$$y_{j+1} = u_{j+1} - \sum_{i=1}^{j}(y_i^*u_{j+1})y_i,$$
where $v_1$ is such that $\|v_1\|_2 = 1$. Let $x_j$ be basis vectors constructed by the Lanczos
algorithm and let $y_j$ be the vectors obtained in the Gram–Schmidt orthogonalization
process.
Show that, for $j = 1, 2, \ldots, n$, there exists a real non-negative number $\theta_j$ such
that
$$y_j = \theta_jx_j.$$
$$\hat v_{j+1} = Av_j - \alpha_jv_j - \beta_jv_{j-1}, \qquad v_{j+1} = \frac{\hat v_{j+1}}{\beta_{j+1}}, \qquad \beta_{j+1}^2 = \hat v_{j+1}^*\hat v_{j+1}.$$
$$\mathcal{K}_l(A, v_1) = \operatorname{lin}(v_1, Av_1, \ldots, A^{l-1}v_1),$$
$$\mathcal{K}_l(A^*, w_1) = \operatorname{lin}(w_1, A^*w_1, \ldots, (A^*)^{l-1}w_1),$$
$$T_l = \begin{pmatrix} \alpha_1 & \beta_2 & 0 & \cdots & 0 \\ \delta_2 & \alpha_2 & \beta_3 & \ddots & \vdots \\ 0 & \ddots & \ddots & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & \beta_l \\ 0 & \cdots & 0 & \delta_l & \alpha_l \end{pmatrix}.$$
(c) Prove that if the algorithm terminates at the $l$th step ($\delta_{j+1} \neq 0$, $j = 1, 2, \ldots, l$),
then
$$W_l^*V_l = I_l,$$
$$\operatorname{Im}(V_l) = \mathcal{K}_l(A, v_1),$$
$$\operatorname{Im}(W_l) = \mathcal{K}_l(A^*, w_1),$$
$$AV_l = V_lT_l + \delta_{l+1}v_{l+1}e_l^*,$$
$$A^*W_l = W_lT_l^* + \bar\beta_{l+1}w_{l+1}e_l^*,$$
$$T_l = W_l^*AV_l.$$
(d) What happens if $\delta_{j+1} = 0$?
(e) Interpret the matrix $T_l$ in relation to a representation of the linear map $A$.
6.3.6 [A] Find estimates for the constants in the bounds given in Theorem
6.3.4.
6.3.7 [B:46] Suppose the calculations are carried out in finite precision arithmetic
with machine error of order $\varepsilon$. Thus the recurrence formulae (6.3.1) become
$$AV_l = V_lT_l + b_{l+1}v_{l+1}e_l^* + F_l,$$
$$V_l^*V_l = L_l + I + L_l^*,$$
where $L_l$ is a lower triangular matrix. Suppose that there exists local orthogonality:
$$v_l \perp \operatorname{lin}(v_{l-1}, v_{l-2}),$$
in such a way that the diagonal and the first subdiagonal of $L_l$ are zero.
Λ + ι 4 0=1,2,...,/),
Pn
where
$$\gamma_{ik} = \xi_i^{(l)*}K_l\,\xi_k^{(l)},$$
$K_l$ being the strictly triangular part of $F_l^*V_l - V_l^*F_l$.
(c) Show that $\|K_l\|_2 = O(\varepsilon\|A\|_2)$.
(d) Show that the relations
$$\gamma_{ii} = O(\varepsilon\|A\|_2) \qquad\text{and}\qquad \ldots$$
imply that
$$\beta_{li} = O(\varepsilon\|A\|_2).$$
Deduce that 'the loss of orthogonality entails convergence'.
6.3.8 [D] Retain the notation of Exercise 6.3.7.
(a) Prove that if $i \neq k$ and $i, k \le l$, then
(b) Deduce that the Ritz vectors $x_i^{(l)}$ and $x_k^{(l)}$, which are not good approximations
for the eigenvectors $x_i$ and $x_k$ (because $\xi_{li}$ and $\xi_{lk}$ are too great), are
orthogonal up to machine precision.
6.3.9 [B:15] Given a real symmetric matrix $A = (a_{ij})$ of order $n$, choose an
arbitrary vector $v_1^{(1)}$ such that $\|v_1^{(1)}\|_2 = 1$, and an integer $k_0 \ll n$.
Let
(h) d = — ^ <+i>
ΚΊ1ΙΙ2
K? + 1> = x g (/+-/+1).
Prove the following inequalities:
$$\|v_k^{(l)}\|_2 = 1, \qquad \lambda_k^{(l)} \le \lambda_{\max}(A)\ \text{(the greatest eigenvalue of } A), \qquad \|r_k^{(l)}\|_2 \le \|A\|_2.$$
6.3.10 [D] Consider the algorithm of Exercise 6.3.9. Prove that
(a) $V_k^{(l)}V_k^{(l)*}$ is an orthogonal projection for all $l$ and $k$.
(b) If $\lambda \mapsto C(\lambda)$ is continuous on a compact set containing the spectrum of $A$, then
the sequence $C_k^{(l)}$ is bounded with respect to $l$ and $k$.
6.3.11 [D] Investigate the convergence of the algorithm proposed in Exercise
6.3.9 when $\lambda_k^{(l)}$ is the greatest eigenvalue of the matrix $H_k^{(l)}$:
(a) Show that, for each $k \in \{1, 2, \ldots, k_0\}$, the sequence $(\lambda_k^{(l)})_{l\in\mathbb{N}}$ is increasing and
bounded.
(b) Show that
independently of $k$.
6.3.12 [D] Show that if in Exercise 6.3.9 $C(\lambda)$ is symmetric and positive or
negative definite, then the sequences $r_k^{(l)}$ and $r_{k_0}^{(l)}$ tend to zero as $l$ tends to infinity.
6.3.13 [C] Study the behaviour of the algorithm described in Exercise 6.3.9
when
A=
$$C(\lambda) = (\lambda I - D)^{-1},$$
where $D$ is the diagonal of $A$.
6.3.14 [B:15,16] The choice made in Exercise 6.3.9, namely
$$C(\lambda) = (\lambda I - D)^{-1},$$
where $D$ is the diagonal of $A$, corresponds to what is called Davidson's algorithm.
We suppose the aim is to compute the greatest eigenvalue of $A$. Show that if $v_1^{(1)}$
is such that $\lambda_1^{(1)}I - D$ is positive definite, then the algorithm converges.
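Davidson's algorithm as specified here, expansion by the preconditioned residual $(\theta I - D)^{-1}r$, can be sketched as follows (NumPy assumed; names, tolerances and the restart-free structure are ours):

```python
import numpy as np

def davidson(A, v0, steps=30):
    """Davidson's algorithm for the greatest eigenvalue of a symmetric A:
    the search space is expanded with the preconditioned residual
    (theta - D)^{-1} r, where D = diag(A).  Sketch without restarts."""
    D = np.diag(A)
    V = v0[:, None] / np.linalg.norm(v0)
    for _ in range(steps):
        Hm = V.T @ A @ V
        w, Y = np.linalg.eigh(Hm)
        theta, y = w[-1], Y[:, -1]            # greatest Ritz pair
        x = V @ y
        r = A @ x - theta * x
        if np.linalg.norm(r) < 1e-10:
            break
        denom = theta - D                     # diagonal preconditioner
        denom[np.abs(denom) < 1e-10] = 1e-10  # guard against division by ~0
        t = r / denom
        t = t - V @ (V.T @ t)                 # orthogonalise (twice, for safety)
        t = t - V @ (V.T @ t)
        nt = np.linalg.norm(t)
        if nt < 1e-12:
            break
        V = np.hstack([V, (t / nt)[:, None]])
    return theta, x

rng = np.random.default_rng(3)
A = np.diag(np.arange(1.0, 21.0)) + 0.01 * rng.standard_normal((20, 20))
A = (A + A.T) / 2
theta, x = davidson(A, rng.standard_normal(20))
```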
6.3.15 [B:15] Let $(\lambda, v)$ be a pair of eigenelements of $A$, where $\lambda$ is not the
greatest eigenvalue of $A$. Let $w \in \mathbb{R}^n$ and let $\varepsilon$ be a non-zero real number. Put
$v_\varepsilon = v + \varepsilon w$.
(a) Show that
$$\frac{v_\varepsilon^TAv_\varepsilon}{\|v_\varepsilon\|_2^2} = \lambda + \varepsilon^2\,\frac{w^TAw - \lambda\|w\|_2^2}{\|v_\varepsilon\|_2^2}.$$
Define
$$S_+ = \{w \in \mathbb{R}^n : w^TAw - \lambda\|w\|_2^2 > 0\},$$
$$S_- = \mathbb{R}^n \setminus S_+.$$
(b) Show that $S_+$ is a non-empty open cone.
(c) Consider the algorithm defined in Exercise 6.3.9. Show that the convergence
of $x_1^{(l)}$ towards $v$ can take place only if
$$x_1^{(l)} - v \in S_-.$$
(d) Deduce that the method is unstable when λ is not the greatest eigenvalue of A.
6.3.16 [D] Consider the basis $V_k^{(l)}$ of Exercise 6.3.9. Can this basis be associated
with a Krylov subspace?
6.3.17 [A] Consider the classical Davidson algorithm (see Exercises 6.3.9 and
6.3.14) when applied to the real symmetric sparse matrix $A = (a_{ij})$. Let $i_0$ be an
index such that
$$\gamma_k = q_k^TAq_k, \qquad u_k = Aq_k - \gamma_kq_k - \delta_kq_{k-1},$$
$$\alpha_k = \frac{\|r_k\|_2^2}{d_k^TAd_k}, \qquad x_{k+1} = x_k + \alpha_kd_k, \qquad r_{k+1} = r_k - \alpha_kAd_k,$$
$$\beta_k = \frac{\|r_{k+1}\|_2^2}{\|r_k\|_2^2}, \qquad d_{k+1} = r_{k+1} + \beta_kd_k.$$
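The conjugate gradient recurrences written above can be run directly (NumPy assumed; names are ours):

```python
import numpy as np

def conjugate_gradient(A, b, steps=50):
    """Conjugate gradient for an SPD system A x = b, using the recurrences
    alpha_k, beta_k and the updates of x_k, r_k, d_k as written above."""
    x = np.zeros_like(b)
    r = b - A @ x
    d = r.copy()
    for _ in range(steps):
        Ad = A @ d
        alpha = (r @ r) / (d @ Ad)
        x = x + alpha * d
        r_new = r - alpha * Ad
        beta = (r_new @ r_new) / (r @ r)
        d = r_new + beta * d
        r = r_new
        if np.linalg.norm(r) < 1e-12:
            break
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
# on an n x n SPD system, CG terminates in at most n steps in exact arithmetic
```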
(e) Prove that the minimum of the function $g_k(\alpha) = (x_k - x + \alpha d_k)^TA(x_k - x + \alpha d_k)$
is attained at $\alpha_k$.
Consider the tridiagonal symmetric matrix
$$T_k = \begin{pmatrix} \gamma_0 & \delta_1 & & 0 \\ \delta_1 & \gamma_1 & \ddots & \\ & \ddots & \ddots & \delta_{k-1} \\ 0 & & \delta_{k-1} & \gamma_{k-1} \end{pmatrix}.$$
Let
$$D_k = \operatorname{diag}(\alpha_0^{-1}, \ldots, \alpha_{k-1}^{-1}), \qquad Q_k = (q_0, \ldots, q_{k-1}), \qquad \tau_j = -\sqrt{\beta_j} \quad (0 \le j \le k-1),$$
$$L_k = \begin{pmatrix} 1 & & & 0 \\ \tau_0 & 1 & & \\ & \ddots & \ddots & \\ 0 & & \tau_{k-2} & 1 \end{pmatrix}.$$
(f) Show that
$$T_k = L_kD_kL_k^T$$
and deduce the relations between the parameters $\gamma_i$, $\delta_i$, $\beta_i$ and $\alpha_i$.
(g) Show that the iterate xk of the conjugate gradient method can be obtained
from the Lanczos method by the equation
6.5.2 [D] Generalize the study made in Exercise 6.5.1 to a matrix M which is
real symmetric semi-definite.
6.5.3 [D] Consider the problem
$$Kx = \lambda Mx,$$
where $K$ and $M$ are symmetric, $K$ is regular and $M$ is positive semi-definite
singular. Let $X$ be the basis of eigenvectors normalized by $X^TMX = I$.
(a) What is the result of the inverse iteration
$$(K - \sigma M)z = My_k, \qquad y_{k+1} = \frac{z}{\|z\|}\,?$$
(b) Use Exercise 1.13.2 to show that when $y$ is arbitrary and
$$z = (K - \sigma M)^{-1}My,$$
then either
(i) $K^{-1}M$ is non-defective (eigenvalue 0 of index 1) and $z \in \operatorname{Im}X$;
or
(ii) $K^{-1}M$ is defective (eigenvalue zero of index 2) and $(K - \sigma M)^{-1}Mz \in \operatorname{Im}X$.
(c) Deduce that, whatever the initial vector $y_0$, after at most two iterations the
vectors $y_k$ lie in $\operatorname{Im}X$ and everything takes place as if $M$ were regular.
Chebyshev's Iterative Methods
$$\min_{\substack{p\in P_k\\ p(\lambda)=1}}\max_{z\in S}|p(z)|,$$
where $S$ is a set in the complex plane that does not contain $\lambda$ and is bounded by
an ellipse.
In this chapter we have collected a certain number of methods inspired by this
principle in order to compute the eigenvalues of greatest real part of a
non-symmetric matrix.
Definitions
(a) $v^*$ is a best approximation of $f$ over $S$ in $V$ if
$$\min_{v\in V}\max_{z\in S}|f(z) - v(z)| = \|f - v^*\|_\infty.$$
PROOF The reader is referred to Rivlin (1990, p. 74). The $\{z_i\}_1^r$ are the
critical points of the error $f - v^*$.
such that
$$\sum_{i=1}^{r}\varepsilon_i\lambda_i\,v(z_i) = 0 \qquad (\forall v \in V),$$
where
(a) $\varepsilon_i = \operatorname{sgn}(f - v^*)(z_i)$,
(b) $\varepsilon_i(f - v^*)(z_i) = \|f - v^*\|_\infty$ $(i = 1, \ldots, r)$.
Definition The subspace V of dimension k is said to satisfy the Haar (or the
Chebyshev) condition on S, if every non-zero function of V possesses at most
k — 1 zeros in S.
$$v = \sum_{i=1}^{k} a_iv_i,$$
$$v(t_i) = y_i \quad (i = 1, \ldots, k).$$
Example 7.1.1 For every given $\lambda \in \mathbb{C}$, the set $V = \{p \in P_k,\ p(\lambda) = 0\}$ is a
vector space of dimension $k$ which satisfies the Haar condition on every compact
subset of $\mathbb{C}$ that does not contain $\lambda$.
Theorem 7.1.5 (Haar) Every function $f \in C(S)$ possesses a unique best approximation
$v^*$ in $V$ if and only if $V$ satisfies the Haar condition.
$$\|1 - q^*\|_\infty = \min_{q\in V}\max_{z\in S}|1 - q(z)|.$$
Theorem 7.1.6 When $k < d$, there exist $k + 1$ points $\lambda_1, \ldots, \lambda_{k+1}$ of $S$ such that
$$\|p^*\|_\infty = \Bigg(\sum_{j=1}^{k+1}\ \prod_{\substack{i=1\\ i\neq j}}^{k+1}\bigg|\frac{\lambda_i - \lambda}{\lambda_i - \lambda_j}\bigg|\Bigg)^{-1}.$$
PROOF By Theorem 7.1.1 there exists a subset of $r$ points $\{\lambda_i\}_1^r$ of $S$ that are
critical points of the error $p^* = 1 - q^*$, where $k + 1 \le r \le 2k + 1$. We shall show that for this particular problem we have
$r = k + 1$.
Choose $k + 1$ points $\lambda_1, \ldots, \lambda_{k+1}$ among the $r \ge k + 1$ critical points of $p^*$.
Consider the following basis of $P_k$:
$$\omega_j(z) = (z - \lambda)l_j(z) \quad (j = 1, \ldots, k),$$
where $l_j$ is the Lagrange polynomial such that
$$l_j(\lambda_j) = 1, \qquad l_j(\lambda_i) = 0 \quad (i \neq j \text{ and } i \neq k+1), \qquad l_j(\lambda_{k+1}) \neq 0.$$
We verify immediately that
$$\omega_j(\lambda_j) = \lambda_j - \lambda, \qquad \omega_j(\lambda_{k+1}) \neq 0,$$
and $\omega_j(\lambda_i) = 0$ when $i \neq j$ and $i \neq k + 1$. By virtue of Haar's condition we have
$$\det(\omega_j(\lambda_i)) \neq 0 \quad (i, j = 1, \ldots, k).$$
Hence the system
r=l A : - /f
ELEMENTS OF THE THEORY OF UNIFORM APPROXIMATION 297
such that
$$l_j'(\lambda_j) = 1, \qquad l_j'(\lambda_t) = 0 \quad (t \neq j)$$
$(j = 1, \ldots, k+1)$. We verify that the system (7.1.2) has a particular solution
$\beta_j = l_j'(\lambda) \neq 0$ $(j = 1, \ldots, k+1)$. In fact, the $j$th equation can be written as
β^-β^ψ^ήψ^ o=i,...,fc).
The reader will verify that $\beta_j$ can be identified with
r = i A.· — Af
where
$$e^{i\theta_s} = \frac{\beta_s}{|\beta_s|} \quad (s = 1, \ldots, k+1), \qquad \rho = \Bigg[\sum_{s=1}^{k+1}|\beta_s|\Bigg]^{-1}.$$
It is clear that
$$p^*(\lambda_s) = \rho e^{i\theta_s} = \rho\,\frac{\beta_s}{|\beta_s|}.$$
On the other hand, $\rho$ is a positive real number; in fact
$$\rho = \Bigg|\sum_{j=1}^{k+1}[\operatorname{sgn}l_j'(\lambda)]\,l_j'(\lambda)\Bigg|^{-1} = \Bigg[\sum_{j=1}^{k+1}|l_j'(\lambda)|\Bigg]^{-1} > 0.$$
$$|\beta_s|\,p^*(\lambda_s) = \rho\beta_s \qquad (\rho > 0).$$
This proves that, when $k < d$, the polynomial $q = 1 - p$ is the best approximation
required: $q^* = 1 - p^*$, therefore $p^* = p$. For this optimal polynomial we have
Example 7.1.2 Suppose that $S = \operatorname{sp}(A) - \{\lambda\}$, where $\operatorname{sp}(A) = \{\lambda_i\}_1^d$ represents
the $d$ distinct eigenvalues of a matrix $A$. In Chapter 6, Section 6.6, we defined
$$\varepsilon^{(l)} = \min_{\substack{p\in P_{l-1}\\ p(\lambda)=1}}\max_{S}|p(z)|.$$
When all the eigenvalues, other than $\lambda$, are in a circle that is well separated from
$\lambda$ (see Figure 7.1.1), then $\varepsilon^{(l)}$ is small.
Figure 7.1.1
In order to obtain a bound for ε^{(l)} which involves only a localization of the spectrum (not all the eigenvalues) we consider a compact connected region D containing S. Then:
CHEBYSHEV POLYNOMIALS OF A REAL VARIABLE 299
(a) ε^{(l)} ≤ min_{p ∈ P_{l−1}, p(λ)=1} max_{z∈D} |p(z)|, and
(b) the maximum is attained on the boundary ∂D because the polynomial p is analytic in D.
Particular optimal results were cited in Theorem 6.6.2. They involve the
Chebyshev polynomials of the first kind which we are now going to study.
7.2.1 Definition
The kth Chebyshev polynomial of the first kind, T_k(t), is defined as follows:

T_k(t) = cos(k cos⁻¹ t)   when |t| ≤ 1,
T_k(t) = cosh(k cosh⁻¹ t)   when |t| > 1.
7.2.2 Properties
T_k(−t) = (−1)^k T_k(t),
T_0(t) = 1,  T_1(t) = t,  T_k(t) = 2t T_{k−1}(t) − T_{k−2}(t)   (k = 2, 3, ...),
|T_k(t)| ≤ 1 when |t| ≤ 1.
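These properties can be checked numerically. The following Python sketch (our illustration, not part of the original text; the function names are ours) compares the three-term recurrence with the cos/cosh definition:

```python
import math

def cheb_T(k, t):
    """T_k(t) via the recurrence T_k = 2 t T_{k-1} - T_{k-2}."""
    T_prev, T_curr = 1.0, t
    if k == 0:
        return T_prev
    for _ in range(2, k + 1):
        T_prev, T_curr = T_curr, 2.0 * t * T_curr - T_prev
    return T_curr

def cheb_T_trig(k, t):
    """T_k(t) from the cos/cosh definition; T_k(-t) = (-1)^k T_k(t) handles t < -1."""
    if abs(t) <= 1.0:
        return math.cos(k * math.acos(t))
    sign = 1.0 if t > 0 else (-1.0) ** k
    return sign * math.cosh(k * math.acosh(abs(t)))

for k in range(6):
    for t in (-1.5, -0.3, 0.0, 0.7, 2.0):
        assert abs(cheb_T(k, t) - cheb_T_trig(k, t)) < 1e-9
assert cheb_T(4, 0.5) == -0.5   # 8 t^4 - 8 t^2 + 1 at t = 0.5
```

The recurrence form is the one used in the iterative methods of this chapter, since it extends verbatim to matrix arguments.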
300 CHEBYSHEV'S ITERATIVE METHODS
is attained by

t_k(t) = T_k[1 + 2(t − b)/(b − a)] / T_k[1 + 2(λ − b)/(b − a)]

and

‖t_k‖∞ = 1 / T_k[1 + 2(λ − b)/(b − a)].
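This min–max value is easy to verify numerically. In the sketch below (ours; a, b, λ and k are arbitrary illustrative choices), sampling t_k on [a, b] recovers the norm 1/T_k[1 + 2(λ − b)/(b − a)]:

```python
import math

def cheb_T(k, t):
    # three-term recurrence, valid for all real t
    T0, T1 = 1.0, t
    if k == 0:
        return T0
    for _ in range(2, k + 1):
        T0, T1 = T1, 2.0 * t * T1 - T0
    return T1

a, b, lam, k = 1.0, 2.0, 4.0, 6            # spectrum in [a, b]; eigenvalue lam > b
denom = cheb_T(k, 1.0 + 2.0 * (lam - b) / (b - a))

def t_k(t):
    return cheb_T(k, 1.0 + 2.0 * (t - b) / (b - a)) / denom

samples = [a + (b - a) * i / 2000 for i in range(2001)]
max_abs = max(abs(t_k(t)) for t in samples)
assert abs(max_abs - 1.0 / denom) < 1e-9   # ||t_k||_inf on [a, b]
assert abs(t_k(lam) - 1.0) < 1e-12         # normalisation at lam
```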
Definition Let E = E(c, e, a) denote the ellipse with centre c, semi-major axis a, and focal distance e, where c is real and a, e, λ − c > 0; let 𝓔 denote the region bounded by E (see Figure 7.3.1).
Lemma 7.3.1

η_k = min_{p∈P_k, p(λ)=1} max_{z∈𝓔} |p(z)| ≤ T_k(a/e) / T_k[(λ − c)/e].
Figure 7.3.1
CHEBYSHEV POLYNOMIALS OF A COMPLEX VARIABLE 301
PROOF Put z′ = (z − c)/e. Then z′ lies in the region 𝓔′ which is bounded by the ellipse E′(0, 1, a/e) of centre 0, focal distance 1 and semi-major axis a/e. Therefore, by the maximum principle,

η_k ≤ max_{z′∈E′} |T_k(z′)| / T_k[(λ − c)/e].

Then z′ ∈ E′(0, 1, a/e) if and only if

z′ = ½(w + w^{−1}),  w = ρ e^{iθ},  ρ = a/e + [(a/e)² − 1]^{1/2}.

On the other hand, T_k(z′) = (w^k + w^{−k})/2, whence

max_{z′∈E′} |T_k(z′)| = max_{w∈C_ρ} ½|w^k + w^{−k}| = max_{0≤θ<2π} ½|ρ^k e^{ikθ} + ρ^{−k} e^{−ikθ}| = ½(ρ^k + ρ^{−k}) = T_k(a/e).
Figure 7.3.2
(As regards the existence and uniqueness of p*, the reader is referred to Exercise
7.3.9.)
Proposition 7.3.3
lim_{k→∞} max_{z∈E′} |p*_k(z)|^{1/k} = lim_{k→∞} max_{z∈E′} |t_k(z)|^{1/k}.
The inequality on the right is evident. Suppose that the inequality on the left is false:

max_{z∈E′} |p*_k(z)| < min_{z∈E′} |t_k(z)|

implies that

|p*_k(z)| < |t_k(z)| when z ∈ E′.

By Rouché's theorem, t_k(z) − p*_k(z) has as many zeros in the interior of E′ as t_k(z). Now t_k has k zeros on the segment joining the foci c − e and c + e (Exercise 7.3.5). On the other hand, t_k(λ) − p*_k(λ) = 0 and λ is exterior to E′. This proves that t_k − p*_k is the zero polynomial, because its degree does not exceed k and it has at least k + 1 distinct zeros. Thus t_k(z) = p*_k(z) on E′, which contradicts our hypothesis.
By virtue of (7.3.2) it suffices to prove that

lim_{k→∞} min_{z∈E′} |t_k(z)|^{1/k} = lim_{k→∞} max_{z∈E′} |t_k(z)|^{1/k}.

All the points of the ellipse E′ are such that lim_{k→∞} |t_k(z)|^{1/k} is constant for z ∈ E′.
min_{p∈Q_k} ∫₀^{2π} |p(re^{iθ})|² dθ.

Let

q(z) = Σ_{i=0}^{k} a_i z^i.

Then

∫₀^{2π} |q(re^{iθ})|² dθ = 2π Σ_{i=0}^{k} |a_i|² r^{2i},
starting from q₀ = u/‖u‖. This iteration can also be written y_k = β_k A^k u, k ≥ 1.
One might think of using a more general polynomial iteration yk = pk(A)u,
where pkePk is a polynomial of degree k.
Suppose A is diagonalisable and has the eigenvectors {x_i}₁ⁿ. Moreover, we make the assumption (5.3.1) that

|λ₁| > max_{i≥2} |μ_i|.
Now

u = Σ_{i=1}^{n} ξ_i x_i

and

p_k(A)u = Σ_{i=1}^{n} ξ_i p_k(μ_i) x_i.
We suppose that the dominant eigenvalue λ is real and that the remainder of the spectrum lies in the ellipse E(c, e, a), where c, e, a are real and λ − c > a.
By Theorem 7.3.2 the optimum polynomial is

t_k(z) = T_k[(z − c)/e] / T_k[(λ − c)/e].

On putting

ρ_k = T_k[(λ − c)/e]   (k = 0, 1, 2, ...)

we obtain

ρ_{k+1} t_{k+1}(z) = T_{k+1}[(z − c)/e].

Again, on putting σ_{k+1} = ρ_k/ρ_{k+1} we have

t_{k+1}(z) = 2σ_{k+1} ((z − c)/e) t_k(z) − σ_k σ_{k+1} t_{k−1}(z).   (7.4.2)
7.5.1 Definition
The following method of calculating y_k is known as the Chebyshev iteration:
(a) y_0 = u,  y_1 = (σ_1/e)(A − cI)u, where σ_1 = e/(λ − c);
(b) For j = 1, 2, ..., k − 1 put

σ_{j+1} = (2/σ_1 − σ_j)^{−1},
y_{j+1} = (2σ_{j+1}/e)(A − cI)y_j − σ_j σ_{j+1} y_{j−1}.   (7.5.1)
Remarks
(a) We have assumed that the parameters c, e and a, which define the ellipse, are real. If E, still centred on the real axis, has its major axis parallel to the imaginary axis, then e and a are imaginary. Nevertheless, the computations of (7.5.1) can always be carried out in real arithmetic. In fact, the σ_j are pure imaginary, and so σ_{j+1}/e and σ_j σ_{j+1} are real.
We remark that when a and e are imaginary, the polynomial

t_k(z) = T_k[(z − c)/e] / T_k[(λ − c)/e]

is no longer optimal but remains asymptotically optimal for large k.
(b) When the eigenvalues are real and |μ_i − c| ≤ a (i = 2, ..., n), then the optimal polynomial is

t_k(t) = T_k[(t − c)/a] / T_k[(λ − c)/a],

by virtue of Theorem 7.2.1. Therefore, in order to obtain the Chebyshev iteration in this case it suffices to replace e by a.
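As a concrete illustration of (7.5.1) in the real-spectrum setting of remark (b) (e replaced by a), the following sketch is ours: the matrix, the parameters and the iteration count are arbitrary toy choices, and λ is assumed known.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy symmetric matrix: dominant eigenvalue lam = 4, the rest inside [c - a, c + a] = [1, 2]
n = 200
others = rng.uniform(1.0, 2.0, size=n - 1)
lam = 4.0
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (Q * np.concatenate(([lam], others))) @ Q.T

c, a = 1.5, 0.5
u = rng.standard_normal(n)

# y_j = t_j(A) u via the three-term recurrence (7.4.2), with e replaced by a
sigma1 = a / (lam - c)
sigma = sigma1
y_prev = u.copy()
y = (sigma1 / a) * (A @ u - c * u)             # y_1 = t_1(A) u
for _ in range(1, 40):
    sigma_next = 1.0 / (2.0 / sigma1 - sigma)  # sigma_{j+1} = (2/sigma_1 - sigma_j)^(-1)
    y_prev, y = y, (2.0 * sigma_next / a) * (A @ y - c * y) - sigma * sigma_next * y_prev
    sigma = sigma_next

y /= np.linalg.norm(y)
rayleigh = y @ (A @ y)                         # should approximate lam
assert abs(rayleigh - lam) < 1e-6
```

Because t_k(λ) = 1 while |t_k| ≤ 1/T_k[(λ − c)/a] on [c − a, c + a], the unwanted components are damped far faster than with the plain power iteration y_k = β_k A^k u.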
7.5.2 Convergence
The Chebyshev iteration (7.5.1) can be interpreted as a method of projection on the direction generated by ū_k = t_k(A)u (k = 1, 2, ...), where t_k is the normalised Chebyshev polynomial of (7.4.2).
THE CHEBYSHEV ITERATION METHOD 307
a + √(a² − e²)

and

a = (λ₂ − λ_min)/2.
Figure 7.5.1
Hence, we have just shown the remarkable property that, without knowledge of c or e, the Lanczos method determines automatically the vector u_k = t_k(A)u in the space

𝒦_{k+1} = {p(A)u; p ∈ P_k},

which is the best possible vector with regard to the speed of convergence towards lin(x₁).
This remains true when A is no longer symmetric and λ is the dominant eigenvalue, replacing the Lanczos by the Arnoldi method (when i = 1): the convergence rate towards λ, that is

(a + √(a² − e²)) / ((λ − c) + √((λ − c)² − e²)),

is the same as that for the Arnoldi method (i = 1), defined for 𝒦_{k+1} when k is sufficiently great (see Exercise 7.5.2).
It should be borne in mind that such a performance of the Chebyshev iteration
can be attained only when optimal parameters c and e are used. It is unrealistic
in practice to assume that these quantities are known. It is necessary to determine
them dynamically in the course of the iteration. This will be treated in Section 7.7.
Figure 7.6.1 (r = 4)
SIMULTANEOUS CHEBYSHEV ITERATIONS 309
dimension m ≥ r, the constants σ_1, k and ε being given. This is carried out as follows:

(a) U_0 = U,  U_1 = (σ_1/e)(A − cI)U;

(b) when j = 1, ..., k − 1, put

σ_{j+1} = (2/σ_1 − σ_j)^{−1},
U_{j+1} = (2σ_{j+1}/e)(A − cI)U_j − σ_j σ_{j+1} U_{j−1};   (7.6.1)

(c) U_k = Q_k R_k;

(d) B_k = Q_k* A Q_k = F_k D_k F_k^{−1} (projection and diagonalization).
From the m eigenvalues of B_k retain the r eigenvalues of greatest real parts; form the diagonal matrix D′_k with them and let F′_k comprise the associated r eigenvectors. Put X′_k = Q_k F′_k.

(e) If ‖A X′_k − X′_k D′_k‖_F > ε, then U = Q_k F′_k; substitute in (a).
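Steps (a)–(e) can be sketched compactly for a symmetric matrix (so e is replaced by the semi-axis a, as in Section 7.5). Everything below — the matrix, the sizes n, m, r, k, and the fixed number of restarts standing in for the test in step (e) — is an illustrative choice of ours, not the book's code:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, r, k = 120, 6, 3, 30
wanted = np.array([10.0, 9.0, 8.0])            # the r dominant eigenvalues
rest = rng.uniform(-1.0, 1.0, size=n - r)      # unwanted spectrum inside [-1, 1]
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (Q0 * np.concatenate((wanted, rest))) @ Q0.T

c, a = 0.0, 1.0                                # interval [c - a, c + a] = [-1, 1]
lam = wanted[0]
sigma1 = a / (lam - c)
U = rng.standard_normal((n, m))

for _ in range(3):                             # outer restarts (step (e), simplified)
    sigma = sigma1
    U_prev = U
    U_curr = (sigma1 / a) * (A @ U - c * U)    # step (a)
    for _ in range(1, k):                      # step (b): Chebyshev filtering
        sigma_next = 1.0 / (2.0 / sigma1 - sigma)
        U_prev, U_curr = U_curr, ((2.0 * sigma_next / a) * (A @ U_curr - c * U_curr)
                                  - sigma * sigma_next * U_prev)
        sigma = sigma_next
    Qk, _ = np.linalg.qr(U_curr)               # step (c)
    Bk = Qk.T @ A @ Qk                         # step (d): projection
    d, F = np.linalg.eigh(Bk)
    idx = np.argsort(d)[-r:]                   # retain the r largest Ritz values
    U = Qk @ F                                 # restart from the Ritz vectors

assert np.allclose(np.sort(d[idx]), np.sort(wanted), atol=1e-6)
```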
For the computation we use the polynomial

t_k(z) = T_k[(z − c)/e] / T_k[(λ − c)/e],

because t_k(λ) = 1 [see (7.4.2) and (7.4.3)]. We assume that the matrix A is diagonalisable. The orthogonal projection on S_k is denoted by π_k.
The following result is a consequence of Lemma 6.2.1.
Lemma 7.6.1 Suppose that dim PS = r. Then for each eigenvector x_i associated with μ_i there exists a unique vector s_i of S such that P s_i = x_i.
Since A is diagonalisable, we have

‖t_k[A(I − P)]‖₂ ≤ max_i |t_k(μ_i)| cond₂(X) ≤ cond₂(X) max_{z∈E₀} |t_k(z)|.

The constant c_i has the value cond₂(X) ‖(I − P)s_i‖₂.
Corollary 7.6.2 Suppose that the eigenvalues {μ_i}₁ⁿ are real and that they are arranged in decreasing order. Then under the hypotheses of Lemma 7.6.1 we have

‖(I − π_k)x_i‖₂ ≤ c_i / T_k(γ_i)   (i = 1, ..., r),

where γ_i = (μ_i − c)/a.

PROOF There are n − r eigenvalues in the interval [μ_n, μ_{r+1}]. Apply Lemma 7.6.1 with

c = (μ_{r+1} + μ_n)/2,  a = (μ_{r+1} − μ_n)/2,  γ_i = (μ_i − c)/a.
Theorem 7.6.3 Suppose that the assumptions of Lemma 7.6.1 are satisfied. Then the method of simultaneous Chebyshev iterations with optimal parameters converges:
(a) If the ith eigenvalue of greatest real part is simple and if the {μ_j}_{r+1}^n lie in the ellipse E(c, e, a), then the error bounds for the ith pair of eigenelements are of the order

T_k(a/e) / |T_k[(μ_i − c)/e]|.

(b) If A is Hermitian, the bound for the ith greatest eigenvalue becomes of order T_k^{−2}(γ_i).
PROOF ω(S_k, M) → 0.
The convergence rates that we have found are, respectively, those of the block Arnoldi method (when the dominant eigenvalues are real and positive) and
DETERMINATION OF THE OPTIMAL PARAMETERS 311
those of the block Lanczos method. For a non-symmetric matrix the cost of simultaneous Chebyshev iterations is well below that of the block Arnoldi method. The former method will be preferred in practice if a satisfactory technique is available for estimating the optimal parameters.
The gain due to the Chebyshev acceleration is measured by comparing |μ_{r+1}/μ_i|^k with T_k(a/e)/T_k[(μ_i − c)/e], which is equivalent to (max_{j>r} |w_j|/|w_i|)^k (i = 1, ..., r), when k is sufficiently great.
We assume that the eigenvalue of greatest real part is real. The set sp(A) − {λ} is symmetric with respect to the real axis.
The problem (7.7.1) consists in seeking the minimum of a finite number of functions of two real variables c and e if sp(A) − {λ} is supposed to be known. When λ = 0, the problem has been studied in Manteuffel (1977), where an algorithm for the computation of c and e is proposed.
When r > 1, a natural idea consists in seeking to solve
Figure 7.7.1 (r = 4)
Figure 7.7.2 (r = 5)
where S is a set in the complex plane containing the spectrum of A except λ. (If the problem is to solve a system, then λ = 0 and S contains the whole spectrum.) When S is bounded by an ellipse, the solution of (7.8.1) is the Chebyshev polynomial t_k(z), whence the term 'Chebyshev iterations'.
However, the problem (7.8.1) has many other applications apart from the acceleration of linear systems. In fact, in very diverse contexts we meet the more general problem of determining a polynomial that is large (in a certain sense) on some eigenvalues {μ_i}₁^r of A while it is as small as possible on the remainder τ of the spectrum. For example, we mention the filtering or the techniques of preconditioning.
LEAST SQUARES POLYNOMIALS ON A POLYGON 313
Figure 7.8.1
In the case of a complex spectrum the uniform norm that appears in (7.8.1) is
not necessarily the norm that leads to the best polynomial in practice. The
Chebyshev polynomial depends on the optimal ellipse which contains the part
τ of the spectrum that is to be eliminated. This ellipse may turn out to be far too
large in relation to τ (see Figure 7.8.1).
It might be more interesting to consider the polygon H which is the convex hull of the set of eigenvalues in τ and to determine the least squares polynomial that satisfies

min_{p∈P_k, Σ_{i=1}^{r} α_i p(μ_i) = 1} ‖p‖_w,   (7.8.2)

where the {α_i}₁^r are given coefficients and ‖·‖_w is the L² norm relative to a weight function w defined on the boundary ∂H.
Theorem 7.8.1 Let {s_j}₀^k be the first k + 1 orthogonal polynomials with respect to w. The polynomial that satisfies (7.8.2) can be written as

q*(z) = c Σ_{j=0}^{k} t̄_j s_j(z),

where

t_j = Σ_{i=1}^{r} α_i s_j(μ_i)   (j = 0, ..., k).
PROOF This generalizes the known result for r = 1. Consider the degenerate kernel

l_k(t, z) = Σ_{j=0}^{k} s̄_j(t) s_j(z).

Then

⟨p(·), l_k(t, ·)⟩_w = ∫_{∂H} p(z) l̄_k(t, z) w(z) dz = p(t)

for every p ∈ P_k.
q*(z) = c Σ_{i=1}^{r} ᾱ_i l_k(μ_i, z),

where

c = [Σ_{j=0}^{k} |t_j|²]^{−1}.

Let p ∈ P_k and suppose that

Σ_{i=1}^{r} α_i p(μ_i) = 1.

We put p = q* + e; clearly

Σ_{i=1}^{r} α_i e(μ_i) = 0.

On the other hand,

‖p‖²_w = ‖q*‖²_w + ‖e‖²_w + 2 Re(⟨e, q*⟩_w).

Now

⟨e, q*⟩_w = c̄ Σ_{i=1}^{r} α_i e(μ_i) = 0.
Remarks
(a) It is unnecessary to compute Bk explicitly in order to apply Arnoldi's method
to it; only the product pk(A)x is required, where x is a given vector.
(b) The following observations refer to each of the two methods we have just
described.
(i) The Chebyshev polynomial associated with the ellipse containing the
unwanted eigenvalues may be replaced by the least squares polynomial
associated with the convex hull of these eigenvalues.
(ii) If the number r of required eigenvalues exceeds a certain size, then it may
be of interest to use a deflation technique (Exercises 7.9.1 and 7.9.2).
(c) In addition, the spectral transformation λ ↦ (λ − σ)^{−1} may be used for preconditioning. In order to solve (A − σI)x = y, we employ a direct method (Gaussian factorization with pivoting) or an iterative method of the conjugate gradient type with preconditioning (see Golub and Van Loan, 1989, p. 373). The minimal residual algorithm of Saad and Schultz (1986) makes no particular hypothesis about the matrix A − σI.
EXERCISES
L_{k−1}(x) = Σ_j f(x_j) l_j(x),

where the l_j are the Lagrange polynomials of degree k − 1. Let p*_{k−1} be a best approximation of f in P_{k−1}. Define

ρ_k = ‖f − L_{k−1}‖∞,
ε_k = ‖f − p*_{k−1}‖∞.

Show that

ρ_k ≤ [1 + max_x Σ_j |l_j(x)|] ε_k.
7.2.1 [C] Show that the first five Chebyshev polynomials are
T_0(t) = 1,
T_1(t) = t,
T_2(t) = 2t² − 1,
T_3(t) = 4t³ − 3t,
T_4(t) = 8t⁴ − 8t² + 1.
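As an aside of ours (not part of the exercise), these coefficients can be cross-checked against NumPy's Chebyshev class:

```python
from numpy.polynomial import Chebyshev, Polynomial

# expected monomial coefficients (ascending powers) of T_0 ... T_4
expected = [
    [1],
    [0, 1],
    [-1, 0, 2],
    [0, -3, 0, 4],
    [1, 0, -8, 0, 8],
]
for k, coeffs in enumerate(expected):
    p = Chebyshev.basis(k).convert(kind=Polynomial)  # T_k in the monomial basis
    assert list(p.coef) == coeffs
```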
7.2.2 [B:50] Show that the Chebyshev polynomials satisfy
T_k(t) = 2t T_{k−1}(t) − T_{k−2}(t),
T_0(t) = 1,  T_1(t) = t.

Deduce that T_k is of degree k and that the coefficient of t^k in T_k(t) is equal to 2^{k−1} (k ≥ 1).
7.2.3 [C] Show that the Chebyshev polynomials satisfy
∫_{−1}^{1} T_l(t) T_k(t) dt/√(1 − t²) = 0   if l ≠ k,
 = π/2   if l = k ≠ 0,
 = π   if l = k = 0.
7.2.4 [D] Show that the Chebyshev polynomials satisfy
(1 − t²) T′_k(t) = k T_{k−1}(t) − k t T_k(t)   (k ≥ 1),
(1 − t²) T″_k(t) = t T′_k(t) − k² T_k(t)   (k ≥ 0).
(t − c)/e ∈ E(0, 1, d) ⟺ t ∈ E(c, e, de).
G_k(z) = T_{k+1}(z)/T_k(z)   (z ∈ ℂ),

when k tends to infinity.
7.3.11 [D] Show that, when the ellipse E(c, e, a) becomes a circle, we have e → 0 and t_k(z) → [(z − c)/(λ − c)]^k.
7.3.12 [A] Prove the existence and uniqueness of the polynomial p* such that
max_{z∈D} |p*(z)| = min_{p∈P_k, p(λ)=1} max_{z∈D} |p(z)|.
(max_{j≥2} |w_j|) / |w_1| ≤ T_k(a/e) / T_k[(λ − c)/e].
7.5.3 [D] Consider the Chebyshev iteration method. Show that if the eigenvalues of A, other than λ, lie in the disk |z| ≤ ρ, then the convergence of the power method is not improved by Chebyshev acceleration.
Im(Q) = M₊,

where M₊ is the invariant subspace of A₁ associated with the eigenvalues λ₁, ..., λ_r of A₁ with greatest real parts.
(a) Propose one or more algorithms to compute the basis Q of this partial Schur
factorization of A1.
(b) Show that the choice

f = Qs,  s ∈ ℂ^r,  t = Q^T b,

reduces the proposed problem to the following problem of size r.
Find s ∈ ℂ^r such that R^T − t s^T has the eigenvalues μ₁, ..., μ_r. This problem is called the partial pole assignment in control theory.
x_n = x_{n−1} + Σ_i γ_i r_i,

where

r_i = b − A x_i.
Let the error be denoted by

r(λ) = |(c − λ) + [(c − λ)² − e²]^{1/2}| / |c + (c² − e²)^{1/2}|;

we call optimal parameters those that minimize max_{λ∈sp(A)} r(λ).
(e) Show that for a given pair (c, e) one has to compute the sequence

x_0 ∈ ℝⁿ,  r_0 = b − A x_0,
Δ_0 = (1/c) r_0,  x_1 = x_0 + Δ_0,
Δ_n = α_n r_n + β_n Δ_{n−1},  x_{n+1} = x_n + Δ_n   (n ≥ 1),

where

α_1 = 2c/(2c² − e²),
β_1 = c α_1 − 1,
α_{n+1} = [c − (e²/4) α_n]^{−1},
β_n = c α_n − 1.
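The recursion of (e) can be tried out directly. In the sketch below (our toy example, not from the exercise), A is symmetric positive definite with spectrum inside [c − e, c + e], so the error is damped like 1/T_n(c/e):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
eigs = rng.uniform(1.0, 3.0, size=n)       # spectrum in [c - e, c + e] = [1, 3]
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (Q * eigs) @ Q.T
b = rng.standard_normal(n)

c, e = 2.0, 1.0
x = np.zeros(n)
r = b - A @ x
delta = r / c                              # Delta_0 = (1/c) r_0
x = x + delta
alpha = 2.0 * c / (2.0 * c * c - e * e)    # alpha_1
for _ in range(60):
    r = b - A @ x
    beta = c * alpha - 1.0                 # beta_n = c alpha_n - 1
    delta = alpha * r + beta * delta       # Delta_n = alpha_n r_n + beta_n Delta_{n-1}
    x = x + delta
    alpha = 1.0 / (c - 0.25 * e * e * alpha)  # alpha_{n+1} = [c - (e^2/4) alpha_n]^{-1}

assert np.linalg.norm(b - A @ x) < 1e-10 * np.linalg.norm(b)
```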
Polymorphic Information
Processing with Matrices
The 25 years which separate the writing in 1987 of the original French version of this textbook and the present Classics Revised Edition have enabled scientists and software developers to progress significantly in the understanding of the role played by matrices in intensive scientific computing. The evolution in computing know-how is fuelled by the necessity to translate mathematical computation into numerical software which should be fast and reliable enough to meet the ever-growing demands of high-tech industries. In this endeavour, computations which rest upon an explicit or implicit spectral decomposition of highly non-normal matrices represent a formidable challenge. The spectra are inherently unstable: this is convincingly illustrated by Example 4.2.11, pp. 162-164 in Section 4.2.7. And the subject is developed more thoroughly in Chapter 10 of the book [4], which addresses the specific difficulties created by high non-normality for the necessary backward assessment of finite precision computations. In practice, high non-normality does arise in matrix computation when the underlying equations express a strong coupling between two phenomena observed in physics or technology. This coupling creates mathematical instabilities which have a serious impact on the computed results. Various tools such as pseudo-spectra [4,21] have been designed to assess the validity of such results when obtained with a reliable numerical software run on a computer with a classical architecture. But when codes are run on massively parallel architectures, the validity assessment of computer simulations remains today an open, yet pressing, problem. An extensive survey of concurrent advances in numerical software for eigenvalues is found in [18]. All references which do not appear in Appendix B or in C are listed in the text as [n] and appear at the end of the chapter for reference number n. An analysis of what works best in software practice is a precious source of information about the theoretical reasons why matrices are such dependable tools in scientific computing.
This chapter attempts to present some of these reasons in a way which does justice to the power of mathematics to compute and model our world. Thus it is equally important to consider what conceptual tools are at work in today's (2012) view of the phenomenological world presented by theoretical physics [1,2]. To set the scene, let us review the rich variety of basic building blocks, generically called scalars, which are used in computation understood in a broad enough sense, running from engineering practice to physical theory.
resulting from considering the imaginary unit i = (0, 1), i² = −1, Arg i = π/2, and circular trigonometry on the unit circle.
The 1848 proposal by J. Cockle to equip ℝ² with an alternative ring structure related to hyperbolic trigonometry on the two dual unit hyperbolas x² − y² = ±1 is not widely appreciated. It rests on the introduction of a non-real unipotent u = (0, 1), u² = 1, u ≠ ±1, such that w = x + yu ∈ 𝓗.
Some properties of complex vs. hyperbolic numbers are contrasted below at a point M = (x, y) in ℝ²:

ℂ: i, i² = −1        𝓗: u, u² = 1, u ≠ ±1
z = x + iy           w = x + yu
z̄ = x − iy           w* = x − yu
|i| = 1              |u|_h = ±i
The algebraic structures ℂ (field) and 𝓗 (ring) yield respective information at the point M which differ markedly:
• The Euclidean norm or modulus |z| ∈ ℝ₊ is replaced by a threefold mark function w ↦ |w|_h ∈ {ℝ₊, iℝ₊} which is not a norm and depends on the location of w for its expression. The mark relates implicitly 𝓗 to the axes ℝ and uℝ.
These differences are induced by the two quadratic forms x² + y² vs. x² − y². The plane ℝ² is treated as a whole by means of the positive definite form x² + y²; it is divided into four distinct quadrants by the two asymptotes y = ±x on which the indefinite form x² − y² is 0. For example, in the quadrant where 0 < y < x, the angles θ (Euclidean) and φ (hyperbolic) are related by 0 < tan θ = tanh φ = A′A with OA′ = 1 in Figure 8.2.1. The point B = e^{iθ} on the unit circle defines OB′ = cos θ < 1, B′B = sin θ; alternatively the point C = e^{uφ} on the unit hyperbola defines OC′ = cosh φ > 1 and C′C = sinh φ. Moreover, θ < π/4 and φ > 0 represent respectively twice the area of the circular and hyperbolic sectors OA′B and OA′C.
Figure 8.2.1
The structure 𝓗 equips the plane ℝ² with a model of hyperbolic geometry. The hyperbolic number w = x + yu models the addition of the two heterogeneous numbers x1 and yu, where 1 and u span two distinct real categories such as space and time. Not surprisingly, hyperbolic numbers are well-suited to describe special relativity in one spatial dimension [19].
SQUARE MATRICES ARE MACRO-SCALARS 327
The standard model for particle physics is set in the multiplicative and associative algebras due to Clifford [1,9]. However we cannot leave this foundational topic [8,9] without touching on non-associativity, i.e. on algebraic structures beyond rings. This is because recursive complexification of x leads to the non-associative algebras A_k, k ≥ 3, proposed by Graves (k = 3) and Dickson to go beyond A₂ = ℍ. These weaker algebraic structures offer new computational possibilities which may appear at odds with classical logic [3]. Computation paradoxes should not be feared: they reveal new phenomena which invite us to extend the current logic of computation [3,7,8,9]. We shall be facing yet another computational contradiction later in Section 8.8.
The smallest non-associative Dickson algebra is the division algebra A₃ = 𝔾 consisting of octonions in ℝ⁸. This algebra would be a field if it were associative. It stands at the crossroads of many computational phenomena in geometry, theoretical physics and number theory [1,2,8].
But it is high time to refocus on our central theme: the associative ring of square matrices defined over ℝ or ℂ.
Let ‖·‖_p denote the Hölder norm where p is specialised to be p = 1, 2 or ∞. Thus for x = (x_i) ∈ ℂⁿ, ‖x‖₁ = Σ_{i=1}^{n} |x_i|, ‖x‖_∞ = max_i |x_i|.
Lemma 8.4.1 ‖μ‖_p ≤ ‖σ‖_p, p = 1, 2, ∞, with equality iff A is normal.
A = HLUL = URHR,
where the modules HL and HR are Hermitian positive semi-definite, and the unitary
matrices UL and UR are called left and right phase factors.
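Both polar forms follow from a single SVD A = UΣV*: the common recipe (a sketch of ours, not the book's algorithm) takes W = UV* as phase factor, with left module H_L = UΣU* and right module H_R = VΣV*.

```python
import numpy as np

def polar_forms(A):
    """Return (H_L, W, H_R) with A = H_L W = W H_R, W unitary, H_* Hermitian PSD."""
    U, s, Vh = np.linalg.svd(A)
    W = U @ Vh                        # phase factor
    H_L = (U * s) @ U.conj().T        # left module  (U diag(s) U*)
    H_R = (Vh.conj().T * s) @ Vh      # right module (V diag(s) V*)
    return H_L, W, H_R

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
H_L, W, H_R = polar_forms(A)

assert np.allclose(H_L @ W, A) and np.allclose(W @ H_R, A)
assert np.allclose(W @ W.conj().T, np.eye(5))
assert np.allclose(H_L, H_L.conj().T) and np.linalg.eigvalsh(H_L).min() > 0
```

For an invertible A the phase factor is unique; for singular A it is not, as discussed below.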
8.5.2 A Is Normal
The factors H and U commute iff A is normal (Statement 208, p. 184, Chapter III in [12] or Proposition 1, p. 191, Section 5.7 in Lancaster and Tismenetsky, 1985).
Therefore the polar factorisation has a unique form: left = right.
Lemma 8.5.2 The factors H and U of A normal have a common orthonormal eigenbasis together with A.
POLAR REPRESENTATIONS OF A OF ORDER n 331
PROOF H and U are semi-simple and commute. See Exercise 6, p. 240 in Lancaster and Tismenetsky (1985) or Statement 193, p. 127, Chapter II in [12].
PROOF Left to the reader (Exercise 2, p. 192 in Lancaster and Tismenetsky 1985).
• The form of the representation is unique iff A is normal ⟺ ‖μ‖_p = ‖σ‖_p, p = 1, 2, ∞. Otherwise ‖μ‖_p < ‖σ‖_p; the left and right forms differ.
Remark The absolute condition numbers for H and U are studied with the Frobenius norm in [5] in the case of a rectangular matrix A ∈ ℂ^{m×n}, r(A) = n ≤ m. Interestingly, for m = n the explicit values for the condition numbers of the phase factor U depend on the ground field for A. More precisely (1991),

C_ℝ(U) = 2/(σ_{n−1} + σ_n) > C_ℂ(U) = 1/σ_n,

whereas

C(H) = √2 (1 + κ(A)²)^{1/2} / (1 + κ(A)),

for m ≥ n, κ(A) = σ_1/σ_n, does not depend on the field ℝ or ℂ [5]. We mention for future reference in Section 8.6 the relation (8.5.1).
Θ(A, u) is the unique canonical (or principal) angle between the complex lines spanned by u and Au (Definition p. 5, Section 1.2). Below we are interested in this acute "angle" Θ(A, u) defined for Au ≠ 0 and its maximal value φ(A), 0 ≤ φ(A) ≤ π/2.
The argument φ(A) for the yield is the maximal dynamical play produced by A.
Theorem 8.6.1 Let A be Hermitian positive definite. The components of a(A) are

cos φ(A) = 2√(λ₁λ_d)/(λ₁ + λ_d) = min_{u≠0} cos Θ(A, u)

and

sin φ(A) = (λ_d − λ₁)/(λ₁ + λ_d) = min_ε ‖εA − I‖₂, achieved for ε = 2/(λ₁ + λ_d).
Ax = 0 are trivially orthogonal when x ∈ Ker A. In the limit λ₁ → 0, one can set Θ(A, u) = π/2 for all eigenvectors u in Ker A which are associated with 0. On the other hand, Θ(A, x) = 0 for all eigenvectors x associated with λ > 0. This introduces a sharp distinction between the kernel Ker A and all eigenspaces Ker(A − λI), λ > 0. In the following examples, the positive (semi-) definite matrices of interest are the modules of a non-Hermitian matrix.
Example 8.6.1 Let A invertible and non-normal have two polar representations: A = H_L U_L = U_R H_R. The eigenvalues of H_L and H_R are the singular values of A; the extreme ones define φ.
Assuming that d ≥ 2, we set w₁² = λ_d/(λ₁ + λ_d) and w_d² = λ₁/(λ₁ + λ_d), 0 < w_d² ≤ 1/2 ≤ w₁² < 1. We choose the square roots w₁ ≥ 1/√2 and 0 < w_d ≤ 1/√2. Then cos φ(A) = 2w₁w_d.
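The coupling formula can be verified directly: for v = w₁x₁ + w_d x_d one computes cos Θ(A, v) = ⟨Av, v⟩/(‖Av‖ ‖v‖) = 2√(λ₁λ_d)/(λ₁ + λ_d). A numerical sketch (ours, with an arbitrary SPD test matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)               # symmetric positive definite

lam, X = np.linalg.eigh(A)                # eigenvalues ascending: lam[0] = l1, lam[-1] = ld
l1, ld = lam[0], lam[-1]
w1 = np.sqrt(ld / (l1 + ld))
wd = np.sqrt(l1 / (l1 + ld))
v = w1 * X[:, 0] + wd * X[:, -1]          # candidate antieigenvector

Av = A @ v
cos_phi = (Av @ v) / (np.linalg.norm(Av) * np.linalg.norm(v))
assert abs(cos_phi - 2.0 * np.sqrt(l1 * ld) / (l1 + ld)) < 1e-12

for _ in range(200):                      # no sampled unit vector does better
    u = rng.standard_normal(n)
    Au = A @ u
    assert (Au @ u) / (np.linalg.norm(Au) * np.linalg.norm(u)) >= cos_phi - 1e-9
```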
Example 8.6.2 Let us apply this trigonometric view to the Hermitian factor H for
A = HU (say) not invertible. Then φ(H) = π/2 and the phase factor is not uniquely
defined (Proposition 8.5.1). The value π/2 for φ(Η) signals the singularity of H and
hence of A.
Now let us go back to the Remark in Section 8.5.3 and let m = n = r(A). We
can readily interpret (8.5.1) as
Let A = UΣV* ∈ ℂ^{n×n} and ΔA = UBV*, and let H′(A) denote the Fréchet derivative of H : A ↦ (AA*)^{1/2}; then ‖ΔA‖_F = ‖B‖_F. It can be shown that C(H) is achieved for the rank-2 matrix B = w₁e₁e_n^T + w_d e_n e₁^T, w₁² = λ_d/(λ₁ + λ_d) and w_d² = λ₁/(λ₁ + λ_d); hence ‖B‖²_F = tr BB^T = w₁² + w_d² = 1 and ‖B‖₂ = w₁ < 1. Moreover, B² = w₁w_d(e₁e₁^T + e_n e_n^T), where w₁w_d = ½ cos φ(H). Thus 0 < √(w₁w_d) = ρ(B) ≤ 1/√2 ≤ ‖B‖₂ = w₁ < 1.
When λ₁ → 0, so does w_d and B tends to the nilpotent matrix e₁e_n^T; r(B) = 2 drops to 1 in the limit.
More details are found in [5] and in Section 4.6 of [13]. Actually the practising
numerical analyst will find ample food for theoretical thought in all of Chapter 4 of
[13].
Example 8.6.3 Let A represent the real matrices L_x and R_x defined in Section 8.5.4. The common module is H_x. For k ≤ 3, H_x = ‖x‖ I_n. Therefore φ(H_x) = 0: there is no dynamical play for the multiplication maps. Alternatively the yield is real: a(H_x) = 1.
A positive play occurs for multiplication by non-alternative vectors in higher dimensional Dickson algebras. Then the norm stops being multiplicative: ‖x × y‖ ≠ ‖x‖ ‖y‖ [8]. If x is a zerodivisor, φ(H_x) = π/2 and the yield is pure imaginary: a(H_x) = i.
The above discussion shows that the linear map which multiplies by x in A_k is best (respectively worst) conditioned when x is alternative (respectively x is a zerodivisor, k ≥ 4).
Going back to A Hermitian, we only assume from now on that it is positive semi-definite, so that 0 ≤ φ(A) ≤ π/2. If A = λI, λ = λ₁ = λ_d and φ(A) = 0 because all vectors are eigenvectors. In conclusion, 0 < φ(A) < π/2 iff A is invertible and A ≠ λI.
Let x₁ and x_d be a choice of normalised eigenvectors in ℂⁿ for A associated with λ₁ and λ_d respectively. We denote S = S(x₁, x_d) = {u ∈ ℂⁿ; u = z x₁ + z′ x_d, z, z′ ∈ ℂ, |z|² + |z′|² = 1} ≅ S³, the unit sphere in ℝ⁴, and S₀ = S₀(x₁, x_d) = {v ∈ ℂⁿ; v = e^{iθ} w₁ x₁ + e^{iθ′} w_d x_d, θ, θ′ ∈ ℝ} ≅ S¹ × S¹, where S¹ is the unit circle in ℝ². Note that e^{iθ} w₁ and e^{iθ′} w_d represent arbitrary square roots of λ_d/(λ₁ + λ_d) and λ₁/(λ₁ + λ_d) respectively.
Theorem 8.6.2 Let A be invertible; then sin φ(A) = ‖(2/(λ₁ + λ_d)) Au − u‖ for any u ∈ S, and cos φ(A) is achieved for any v in S₀ ⊂ S.
PROOF By an easy adaptation to the complex case of Theorem 3.1 on pp. 31-32
in [13].
When A is symmetric rather than Hermitian, the eigenvectors x₁ and x_d, as well as u and v, are real vectors in ℝⁿ. The analogues of S and S₀ are respectively R = R(x₁, x_d) = {u ∈ ℝⁿ; u = αx₁ + α′x_d, α, α′ ∈ ℝ, α² + α′² = 1} ≅ S¹, and R₀ = R₀(x₁, x_d) reduced to the four points {v ∈ ℝⁿ; v = ±w₁x₁ ± w_d x_d} lying on R.

Corollary 8.6.3 When A is symmetric positive definite, the sets S and S₀ in Theorem 8.6.2 are replaced by the sets R and R₀ respectively.
PROOF Clear.
Several remarks are in order:
(1) If A is singular (λ₁ = w_d = 0), the set S (respectively R) is unchanged, with Au = λ_d z′ x_d (respectively λ_d α′ x_d), so that sin φ(A) = 1. The sets S₀ and R₀ are reduced, according to w_d = 0, w₁ = 1, to be subsets of Ker A, leading to cos φ(A) = 0, as expected.
(2) When d = 1, λ₁ = λ_d and w₁ = w_d = 1/√2.
(3) The different behaviour of cos Θ(A, u) and n(A, u, ε) with respect to minimisation is quite remarkable. It deserves further study [10].
(4) cos φ(A) is called the first "antieigenvalue" with "antieigenvectors" in R₀ (A real) in Gustafson's parlance. And the general study of cos φ(A) and sin φ(A) is referred to as matrix trigonometry by its inventor. More generally, over ℂ, φ(A) is merely a particular case of canonical angle between complex subspaces of dimension r = 1 generated by u and Au. For 1 ≤ r ≤ n, the associated trigonometry was presented earlier in Chapter 1, Sections 1.2 to 1.5, as a handy tool devised to quantify the convergence of eigensolvers in Chapters 5 to 7. But Gustafson had a different goal in mind for A symmetric.
The difference is mentioned in Exercise 5, Section 6.7, pp. 119-120 of [13]. However, the maximal canonical angle shown in Figure 1.3.1, p. 9 has lost in [13] the proper reference to its twofold origin: statistical over ℝ (Afriat 1957) and numerical over ℂ (Section 1.14 on p. 43). Many examples of the use of the angle φ(A) to analyse the convergence of iterative linear solvers are provided in Chapter 4 of [13].
(5) Unlike the Euclidean angle ∠(u, Au) in the real 2D-plane, the canonical angle Θ(A, u) derived from the complex Euclidean scalar product has no familiar interpretation in real 4D-geometry.
Therefore the map u ↦ Au expresses, when u is not an eigenvector, a change of direction only over ℝ. Over ℂ, the change appears, in real terms, as an evolution of a different, more intricate, nature. It incorporates the complex structure of the ground field ℂ in a computational manner to be studied elsewhere [10].
YIELD OF A HERMITIAN POSITIVE SEMI-DEFINITE UNDER SPECTRAL COUPLING 337
(6) The argument φ(A) measures the maximal dynamical play that the matrix A can induce on a vector u. This viewpoint is a move away from the more familiar search for colinearity, that is, for eigenvectors of A. This alternative viewpoint does not look for the directional invariance expressed by A through its eigenvectors. It looks rather for the directional laxity that A can express by an inner coupling of any two distinct eigenvalues, as we shall see in Section 8.6.3, the maximal laxity being achieved by φ(A) for the extreme eigenvalues λ_min, λ_max. The larger the φ(A), the greater the evolution that is possible under A by coupling λ₁, λ_d. The number λ̄ = (λ₁ + λ_d)/2 represents the middle point in the spectrum. Let A = QDQ*; the maximal value (λ_d − λ₁)/2 = ‖A − λ̄I‖ = ‖D − λ̄I‖ = max_{1≤i≤n} |λ_i − λ̄| is achieved by ‖(A − λ̄I)u‖ for u ∈ S.
A²u/⟨A²u, u⟩ − 2Au/⟨Au, u⟩ + u = 0,   Au ≠ 0.   (8.6.1)
PROOF See Section 3.3, pp. 33-37 in [13].
Observe that A²u is a real linear combination of Au and u.
Corollary 8.6.5 The Euler equation for cos Θ(A, u) is satisfied by eigenvectors of A as well as by linear combinations of normalised eigenvectors x_k and x_l associated with 0 < λ_k < λ_l, 1 ≤ k < l ≤ d, lying in S₀(x_k, x_l) = {v = e^{iθ} w_k x_k + e^{iθ′} w_l x_l}, where w_k² = λ_l/(λ_k + λ_l) ≥ 1/2, w_l² = λ_k/(λ_k + λ_l) ≤ 1/2.
It is clear that φ_min(A) ≤ φ_kl ≤ φ(A) = φ_max(A) < π/2, where φ_min corresponds to min_{λ_k<λ_l} (λ_l/λ_k, 1 ≤ k, l ≤ d), just like φ_max corresponds to max(λ_l − λ_k) = λ_d − λ₁.
Another useful geometric picture is given in Figure 8.6.2 by means of the trapezium ABCD, where the opposite sides AB and CD are parallel with lengths λ and λ′. The diagonal lines BD and AC meet at E; the parallel line through E to the sides AB and CD meets the concurrent sides at F and G. Then Thales tells us that 2FE = 2EG = FG = 2λλ′/(λ + λ′) = h, the harmonic mean of the side lengths λ and λ′. Observe that Figure 8.6.2 concerns the particular case of a complete quadrilateral for which one pair of lines meets at infinity.
Definition π(z) = det A(z) is the homotopic polynomial with degree d, 0 ≤ d ≤ n − r. Its zeroset is Z = {z ∈ ℂ; π(z) = 0}.
The resolvent R(t, z) is analytic in t around 0 (|t| small enough) and ∞ (|t| large enough) for z in res(A)\F(A, E), where F(A, E) = {z ∈ res(A); r(M_z) < r} = Z ∩ res(A) = Lim ∩ res(A).
The points in F(A, E) are frontier points, i.e. those points in res(A) where R(t, z) is not analytic around |t| = ∞: they are the limits of X(t) which exist in res(A). As |t| → ∞ they attract the flow of spectral information consisting of the spectral rays t = |t|e^{iθ} ↦ X(t) for θ fixed, which do not escape to ∞ or to an eigenvalue of A.
Critical points are particular frontier points z where M_z is nilpotent ⟺ ρ(M_z) = 0. At such a point z, the map t ↦ R(t, z) is a matrix polynomial in t of degree < r. The critical points repel the spectral flow for |t| < ∞ but become asymptotically attractive as |t| → ∞. Observe that if r = 1, all frontier points are critical because M_z is reduced to be a complex scalar. In computational practice, the asymptotic regime is reached as soon as |t| > 300 or so.
This sketchy summary attempts to convey the flavour of the easy part of HD theory, which is able to characterise Lim ∩ res(A) with a few more analytic tools than the classical spectral theory presented in Chapter 2. The study of Lim ∩ sp(A) is more subtle: it calls for a significant generalisation of spectral theory and uncovers new computational phenomena. For example, when 2 ≤ r < n, it is possible that a limit eigenvalue λ (lim_{|t|→∞} λ(t) = λ ∈ sp(A)) does not belong to Z (π(λ) ≠ 0).
When this is the case, a local bottom-up organisation of the information occurs at λ. This contrasts with the top-down organisation which is the rule at all frontier points in res(A).
The interested reader is referred to Chapter 7, pp. 247-346 in [8] for an in-depth treatment of HD. Reference [6] treats an example arising from the discretisation of the acoustic wave equation, where the homotopy parameter is the complex admittance. The critical points correspond to frequencies for which no finite value of the admittance can cause a resonance.
\[
\begin{pmatrix} I_n & A \\ 0 & I_n \end{pmatrix}
\begin{pmatrix} 0_n & 0 \\ B & BA \end{pmatrix}
\begin{pmatrix} I_n & -A \\ 0 & I_n \end{pmatrix}
=
\begin{pmatrix} AB & 0 \\ B & 0_n \end{pmatrix}.
\]
The Jordan forms for AB and BA differ only at the eigenvalue 0, for which the sizes s_i of the Jordan blocks satisfy |s_i(AB) − s_i(BA)| ≤ 1, i = 1, …, max(g(AB), g(BA)), while keeping

m = Σ_{i=1}^{g(AB)} s_i(AB) = Σ_{i=1}^{g(BA)} s_i(BA)

invariant as the common algebraic multiplicity of 0 [11].
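The smallest illustration of this phenomenon can be checked directly. In the sketch below (matrices are illustrative choices), AB and BA are both nilpotent with algebraic multiplicity m = 2 at 0, yet AB has two 1 × 1 Jordan blocks while BA has one 2 × 2 block, so the block sizes differ by exactly 1.

```python
import numpy as np

# AB = 0 (two 1x1 Jordan blocks at 0), BA = [[0,1],[0,0]] (one 2x2 block).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])
AB, BA = A @ B, B @ A

def nilpotency_index(M, tol=1e-12):
    """Smallest k with M^k = 0; for a nilpotent matrix this equals the
    size of its largest Jordan block at 0."""
    P, k = np.eye(M.shape[0]), 0
    while np.linalg.norm(P) > tol:
        P, k = P @ M, k + 1
    return k

print(nilpotency_index(AB), nilpotency_index(BA))   # 1 2
```

The two indices differ by exactly 1, in agreement with |s_i(AB) − s_i(BA)| ≤ 1.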
Let 0 be a defective eigenvalue of AB characterised by the three integers g, l, m, with g, l ∈ [1, m]. Then the structural matrix C_0(AB) of size g × l for AB defined in Section 8.4.2 can be associated with finitely many possible structural matrices C_0(BA) of size g′ × l′ for BA. The total number N(g, l, m) of possibilities is minimal at 1 when AB and BA are similar. When the similarity exists only at the augmented level 2n, N can grow exponentially with m. The reader can find in [16] an algorithmic description of the possibilities C_0(AB) ↦ C_0(BA).
We exploit below the associativity of the matrix product A(BA) = (AB)A and denote E = AB and F = BA. The resolvents R_E and R_F are linked by the bond

B R_E(z) A = zR_F(z) + I_n, z ∈ res(E) = res(F), (8.8.1)

which follows from B(AB − zI)^{−1} = (BA − zI)^{−1}B.
Lemma 8.8.2 If A and B are invertible, the spectral projections associated with 0 ≠ λ ∈ sp(E) = sp(F) satisfy

B P_E A = λ P_F. (8.8.2)

PROOF Let (C) be a Jordan curve isolating λ ≠ 0 from the rest of the spectrum. Then (BA)^{−1} = A^{−1}B^{−1} and B(AB − zI)^{−1}A = (I − zF^{−1})^{−1} = −(1/z) R_{F^{−1}}(1/z). We set u = 1/z, du = −dz/z², and thus ∫_C B R_E(z) A dz = ∫_{C′} (1/u) R_{F^{−1}}(u) du over the image contour (C′). It follows that B P_E A = λ P_{F^{−1}} = λ P_F.
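The conclusion of Lemma 8.8.2 can be checked numerically. The sketch below assumes that the lemma's (garbled) displayed relation is B P_E A = λ P_F, which holds exactly when λ is a simple nonzero eigenvalue; the random matrices are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))   # generic, hence invertible
E, F = A @ B, B @ A               # sp(E) = sp(F)

def spectral_projection(M, lam):
    """P = x y* / (y* x) for the simple eigenvalue of M closest to lam."""
    w, V = np.linalg.eig(M)
    wl, U = np.linalg.eig(M.conj().T)       # left eigenvectors of M
    k = np.argmin(np.abs(w - lam))
    j = np.argmin(np.abs(wl.conj() - w[k]))
    x, y = V[:, k], U[:, j]
    return w[k], np.outer(x, y.conj()) / (y.conj() @ x)

lam0 = np.linalg.eigvals(E)[0]
lamE, PE = spectral_projection(E, lam0)
lamF, PF = spectral_projection(F, lam0)
print(np.linalg.norm(B @ PE @ A - lamE * PF))   # ~ 0
```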
Proposition 8.8.3 The identification of the matrix coefficients of 1/(z − λ)^k, k ≥ 0, in the expansions (2.2.5) for R_E and R_F satisfying (8.8.1) entails the following relations:
The paradox induced by Cauchy integration could become even more puzzling if we were to progress further in the direction (i) with only the incomplete knowledge of the existence of λ, wrongly taken to be the unique singularity of the resolvents R_E and R_F. One would be led to the (most often spurious) conclusion that

AB = BA = λI. (8.8.5)
The equalities (8.8.2) and (8.8.5) are valid only if l = 1 and sp(E) = sp(F) = {λ}, respectively. In mathematical theory the first contradiction has gone undetected so far because spectral consequences have not been drawn from the bond (8.8.1). Conventional wisdom treats the two matrices AB and BA as separate global entities. The second contradiction is easily resolved by processing local and global information at the same time. The mind has access to the global information about the spectrum provided by the common characteristic polynomial π_E(z) = π_F(z). Such is not the case in the experimental sciences, where experimentalists have access only to the local phenomenological reality. The computational paradox stemming from local contour integration is a welcome warning against the dangers of a naive inference from local to global information.
Remark The warning may not be as far-out as it looks if we ponder the matrices L_x and R_x representing the left and right multiplications by x in A_k, k ≥ 0. For k ≤ 3, these matrices have the simple module H_x = ‖x‖ I_{2^k}, corresponding to the unique singular value ‖x‖. And the development of our physical intuition of the manifested world is primarily based on the validity in ℝ, ℂ, ℍ and 𝔾 of this extremely special situation. It is therefore reasonable to expect that our 3D-based intuition can be challenged in higher-dimensional algebras (k ≥ 4), when the multiplication maps have not only module factors which carry polymorphic information about themselves [3, 8] but also phase factors which can be ambiguously defined.
Proposition 8.8.4 Spectral analysis of (8.8.1) around λ ≠ 0 entails for k ≥ 1 the infinite sequence of relations
8.9 CONCLUSION
This chapter has offered a selection of snapshots taken in the booming domain of matrix computation which contain eigenvalues in their inner core, in an explicit and, at times, implicit fashion. Given the fast, bush-like evolution of the domain (from pure algebra to finance and the Internet), many other aspects could have been presented. Admittedly, the chapter betrays the author's personal views on the evolution of mathematical computation over the ages [7, 8, 9, 10].
In her view of computation, matrices will prove themselves to be even more essential tools than vectors in the scientific understanding of the ever-changing scheme of living organisms. The polymorphic and possibly ambiguous character of the dynamical information that matrices carry makes them versatile macro-scalars upon which a complex multilevel processing of information can be built.
EXERCISES
Section 8.5 Polar Representations of A
8.5.1[C] When r(A) = n − 1 for A ∈ ℂ^{n×n}, show that there exist exactly two left phase factors U₁ and U₂ = (I − T)U₁, where T = 2aa*, a a unit vector in Ker A*. Interpret ½T. Deduce that ‖U₁ − U₂‖₂ = 2.
8.5.2[D] When r(A) ≤ n − 2, show that there exist uncountably many distinct phase factors for A.
8.5.3[D] Use Exercises 8.5.1 and 8.5.2 to show that for A ∈ ℂ^{n×n}, the number of distinct phase factors for A can take the values {1, 2} for n = 2 and {1, 2, ∞} for n ≥ 3. In the latter case, ∞ is uncountable.
8.5.4[D] When 1 ≤ r(A) < n, show that the matrix T defined in Proposition 8.5.1 satisfies max ‖T‖₂ = 2.
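Exercise 8.5.1 admits a tiny numerical companion. The sketch below uses the illustrative choice A = diag(1, 0), which has rank n − 1 = 1, and exhibits its two left phase factors explicitly.

```python
import numpy as np

# Polar form A = U H with H = (A*A)^{1/2}: when r(A) = n - 1 there are
# exactly two left phase factors, differing in spectral norm by 2.
A = np.diag([1.0, 0.0])           # rank n - 1 (illustrative choice)
H = np.diag([1.0, 0.0])           # here H = (A*A)^{1/2} = A
U1 = np.eye(2)
U2 = np.diag([1.0, -1.0])         # U2 = (I - T) U1, T = 2 a a*, a = e2
for U in (U1, U2):
    assert np.allclose(U @ H, A)              # both satisfy A = U H
    assert np.allclose(U.T @ U, np.eye(2))    # both are unitary
print(np.linalg.norm(U1 - U2, 2))             # 2.0, as the exercise claims
```

Here ½T = aa* is the orthogonal projection onto Ker A*.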
8.6.1[C] Let 0 < λ < λ′ be two distinct eigenvalues of the Hermitian positive semi-definite matrix A, with associated normalised eigenvectors x, x′ which span the invariant subspace M = lin(x, x′). Let P = QQ* be the orthogonal projection onto M, Q*Q = I₂. Let B = Q*AQ = ℬ(Q) represent the 2 × 2 Rayleigh quotient of A on M; see p. 34 in Chapter 1, Section 1.11.1.
(1) Show that B is Hermitian positive definite with sp(B) = {λ, λ′}.
(2) Set B = \(\begin{pmatrix} a & b \\ \bar b & c \end{pmatrix}\). Show that tr B = a + c = λ + λ′ > 0 and det B = λλ′ = ac − |b|² > 0.
(3) Prove that |b|² > 0 ⟺ λ < {a, c} < λ′.
(4) Suppose that λ < a ≤ (λ + λ′)/2 ≤ c < λ′. Show that 0 < |b| ≤ (λ′ − λ)/2.
(5) Show that sgn b = b/|b|, b ≠ 0, is not specified by the triple {a, λ, λ′}.
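A quick numerical companion to Exercise 8.6.1 (the matrix below is an illustrative construction with prescribed spectrum):

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam, lamp = 6, 1.0, 4.0
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
d = np.array([lam, lamp, 0.0, 0.5, 2.0, 3.0])
A = Q0 @ np.diag(d) @ Q0.T            # Hermitian PSD with eigenvalues d
x, xp = Q0[:, 0], Q0[:, 1]            # eigenvectors for lam and lamp
Q = np.column_stack([x, xp])          # orthonormal basis of M = lin(x, x')
B = Q.T @ A @ Q                       # the 2 x 2 Rayleigh quotient on M
evB = np.sort(np.linalg.eigvalsh(B))
print(evB)                            # sp(B) = {lam, lamp}
print(np.trace(B), np.linalg.det(B))  # tr B = lam + lamp, det B = lam*lamp
```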
Solution to Exercises
CHAPTER 1
1.1.1 The columns of (V^{−1})* are the adjoint basis of the columns of V when V is non-singular. We remark that when V is not a square matrix, the adjoint basis either does not exist or else is not unique. For example, when

\[
V = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{pmatrix},
\]

there exist at least two adjoint bases, namely

\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
Thus

t ≥ 2r − n ≥ 0.

Let Q ∈ ℂ^{n×r} be an orthonormal basis of M whose first t columns form a basis of M ∩ N. Let U ∈ ℂ^{n×r} be an orthonormal basis of N whose first t columns are those of Q. Then

\[
U^*Q = \begin{pmatrix} I_t & 0 \\ 0 & W \end{pmatrix},
\]

where W is of order r − t.
This shows that U*Q has at least t singular values which are equal to unity; hence there are at most r − t non-zero canonical angles between M and N. However,

r − t ≤ n − r ≤ n/2.

Put

Q = (V, Q′) and U = (V, U′),

where

lin(V) = M ∩ N.

Let

M′ = lin(Q′) and N′ = lin(U′);

then

M′ ∩ N′ = {0}, M ∩ N + M′ + N′ = M + N,

and the non-zero canonical angles between M and N are the same as those between M′ and N′.
1.2.2 Let Q ∈ ℂ^{n×m} be an orthonormal basis of M and let U ∈ ℂ^{n×m} be an orthonormal basis of N. Let θ₁ be the greatest canonical angle between M and N, and put c₁ = cos θ₁. Then

θ₁ = π/2 ⟺ c₁ = 0 ⟺ U*Q is singular
⟺ ∃u ∈ ℂ^m such that u ≠ 0 and U*Qu = 0
⟺ ∃x ∈ ℂ^n such that x ≠ 0 and x ∈ M ∩ N^⊥.
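The canonical angles of Exercise 1.2.2 are computed in practice from the SVD of U*Q: the singular values are the cosines. A minimal sketch (the subspaces below are illustrative random choices, built so that M and N share one direction):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 7, 3
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
W = np.column_stack([Q[:, 0], rng.standard_normal((n, m - 1))])
U, _ = np.linalg.qr(W)            # N = lin(U) shares one direction with M
c = np.linalg.svd(U.T @ Q, compute_uv=False)   # cosines of the angles
theta = np.arccos(np.clip(c, -1.0, 1.0))
print(theta)    # smallest angle ~ 0 because M and N intersect
```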
where k ≤ r. We define

C′ = diag(c₁, …, c_k),

and therefore

V₁*W₁₁*W₁₁V₁ + (W₂₁V₁)*(W₂₁V₁) = I_r,

and so

(W₂₁V₁)*(W₂₁V₁) = diag(s₁², …, s_k², 0, …, 0),

where

s_i ≥ 0 and s_i² + c_i² = 1 (i = 1, …, k).

Let Z₂ ∈ ℂ^{(n−r)×(n−r)} be a unitary matrix whose first k columns are those of W₂₁V₁ when normalized. Then

Z₂*W₂₁V₁ = S, where S = diag(s₁, …, s_k; 0, …, 0) ∈ ℂ^{(n−r)×r}.

Let S′ = diag(s₁, …, s_k). Then S′ is regular and we therefore have

\[
\begin{pmatrix} I_r & 0 \\ 0 & Z_2^* \end{pmatrix}
\begin{pmatrix} W_{11} \\ W_{21} \end{pmatrix} V_1
= \begin{pmatrix} C \\ S \end{pmatrix}.
\]

In an analogous manner we determine a unitary matrix Z₁ ∈ ℂ^{(n−r)×(n−r)} such that

Z₁*W₁₂V₂ = (T, 0),

where T is a diagonal matrix with non-positive elements such that

T² + C² = I_r.

Thus T = −S. Let

\[
Z_3 = \begin{pmatrix} X_{44} & X_{45} \\ X_{54} & X_{55} \end{pmatrix} \in \mathbb{C}^{(n-r-k)\times(n-r-k)}
\]

be unitary; then

\[
Z^*WV =
\begin{pmatrix}
C' & 0 & -S' & 0 \\
0 & I_{r-k} & 0 & 0 \\
S' & 0 & C' & 0 \\
0 & 0 & 0 & Z_3
\end{pmatrix}
=
\begin{pmatrix}
I_k & 0 & 0 & 0 \\
0 & I_{r-k} & 0 & 0 \\
0 & 0 & I_k & 0 \\
0 & 0 & 0 & Z_3
\end{pmatrix}
\begin{pmatrix}
C' & 0 & -S' & 0 \\
0 & I_{r-k} & 0 & 0 \\
S' & 0 & C' & 0 \\
0 & 0 & 0 & I_{n-r-k}
\end{pmatrix}.
\]
Put the two sums together, whence

Σ_{k=1}^{j−1} … − Σ_{k=i+1}^{n} … = 0,

with the convention Σ_{k=1}^{0} = Σ_{k=n+1}^{n} = 0. Put j = 1 and i = n; we find that u_{n1} = 0. Suppose that for k ≥ 2 we have

i ∈ {k, k+1, …, n−1, n} ⟹ u_{i1} = 0;

we deduce that u_{k−1,1} = 0. This shows that

u_{i1} = 0 when i = 2, 3, …, n.

Now suppose that for j = 2, 3, …, l and k ≥ j + 1 we have

i ∈ {k, k+1, …, n−1, n} ⟹ u_{ij} = 0.

It then follows that

u_{ij} = 0 when j = 1, 2, …, n−1 and i = j+1, …, n.
We leave to the reader the task of verifying that in the presence of repeated
eigenvalues, the diagonal of U will contain blocks.
p_{ij} = 1 if i + j = n + 1, and 0 otherwise.

Then

P^{−1} = P = P*

and

J* = P*JP.
1.6.19 Let X be a basis of eigenvectors of A and let Q be a Schur basis:

A = XDX^{−1}, Q*AQ = D + N,

where D is the diagonal matrix of eigenvalues and N is a strictly upper triangular matrix. Then

‖N‖_F² = ‖A‖_F² − ‖D‖_F²

and ‖A‖_F ≤ ‖X‖₂ ‖X^{−1}‖₂ ‖D‖_F. Hence

cond₂(X) ≥ ‖A‖_F/‖D‖_F = (1 − ‖N‖_F²/‖A‖_F²)^{−1/2}.

On the other hand,

‖A*A‖_F ≤ ‖X^{−1}‖₂² ‖D*X*XD‖_F ≤ ‖X^{−1}‖₂² ‖X‖₂² ‖DD*‖_F.

However,

‖DD*‖_F = ‖D*D‖_F = ‖D²‖_F ≤ ‖A²‖_F;

hence

cond₂²(X) ≥ ‖A*A‖_F/‖A²‖_F.

Moreover,

‖A*A − AA*‖_F² = 2(‖A*A‖_F² − ‖A²‖_F²),

whence

cond₂⁴(X) ≥ 1 + ‖A*A − AA*‖_F²/(2‖A²‖_F²).
j=2
Since
we conclude that
Ι Ι ^ Ι Ι ρ ^ - γ - ) Ί ΐ + - - ^ - ) ; 2 2 + - + — -7«n.
V
12 Ä
However,
/4*/ί - AA* = βΓβ*.
Hence
and so
<,♦»-*,»-.(,-£♦»)«(,-*♦')
hence
i>2
α2«ζ1
l+2c2
Since c² < 1, we have a² ≤ 1 − b²/2, that is
-^<ii"ii2.
6MII 2
1.8.1 Let X ∈ ℂ^{n×m} and suppose that X is of rank r < m. Hence there exists a permutation matrix Π such that the singular value decomposition of XΠ can be written

\[
V^* X \Pi U = \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix},
\]

where V is a unitary matrix of order n, U is a unitary matrix of order m and Σ is a non-singular diagonal matrix of order r. We may write

\[
U = \begin{pmatrix} U_{11} & U_{12} \\ U_{21} & U_{22} \end{pmatrix},
\]

where U₁₁ is of order r, and hence

\[
X\Pi = QR, \qquad R = \begin{pmatrix} R_{11} & 0 \\ 0 & 0 \end{pmatrix},
\]

Q being unitary and R an upper triangular matrix.
1.9.1 Let C = A + B, where A and B are Hermitian and B is positive semi-definite. For all u ∈ ℂⁿ such that ‖u‖₂ = 1, we have u*Bu ≥ 0 and

u*Cu = u*Au + u*Bu ≥ u*Au.

On taking the maximum on the left subject to ‖u‖₂ = 1, we obtain

λ₁(C) ≥ u*Au,

and we conclude that λ₁(C) ≥ λ₁(A). Writing A = B + C with C = A − B, we get

u*Au = u*Bu + u*Cu.

Now

λ₁(C) = max_{‖u‖₂=1} u*Cu ≤ ‖C‖₂;

hence

|λ₁(A) − λ₁(B)| ≤ ‖C‖₂.

If C is positive semi-definite, then λ_n(C) ≥ 0, and so λ₁(A) ≥ λ₁(B).
Then, with the eigenvalues in increasing order,

λ_j(A) = max_{dim S = n−j+1} min_{u ∈ S, u ≠ 0} u*Au/u*u
= max_{dim S₁ = j−1} min_{u ⟂ S₁, u ≠ 0} u*Au/u*u.
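The Courant-Fischer characterisation can be probed numerically. In the sketch below (random symmetric matrix, illustrative choice) any subspace of dimension n − j + 1 gives an inner minimum at most λ_j, and the span of the eigenvectors associated with λ_j, …, λ_n attains it.

```python
import numpy as np

rng = np.random.default_rng(3)
n, j = 6, 3
M = rng.standard_normal((n, n)); A = (M + M.T) / 2
w, V = np.linalg.eigh(A)                      # w[0] <= ... <= w[n-1]

# any subspace of dimension n - j + 1: inner min <= lambda_j = w[j-1]
S_rand, _ = np.linalg.qr(rng.standard_normal((n, n - j + 1)))
m_rand = np.linalg.eigvalsh(S_rand.T @ A @ S_rand).min()
# the maximising subspace: eigenvectors of lambda_j, ..., lambda_n
S_opt = V[:, j - 1:]
m_opt = np.linalg.eigvalsh(S_opt.T @ A @ S_opt).min()
print(m_rand <= w[j - 1] + 1e-12, np.isclose(m_opt, w[j - 1]))
```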
CHAPTER 2
2.2.3 First we state the following facts. If J = (x_{αβ}) is a square matrix such that

x_{αβ} = λ if α = β, 1 if β = α + 1, and 0 otherwise,

then:
(a) J is non-singular if and only if λ ≠ 0;
(b) if λ ≠ 0 and J^{−1} = (y_{αβ}), then

y_{αβ} = 0 if α > β, and y_{αβ} = (−1)^{β−α} λ^{α−β−1} if α ≤ β.
A'"«"1
where
Ö;W=
VΣ (-ö/
lj being the least integer such that
Finally, let Γ_k be a Jordan curve which isolates λ_k from the rest of the spectrum of A. Then

(1/2πi) ∫_{Γ_k} dz/(λ_j − z)^{α+1} = 0 (λ_j ≠ λ_k; α ≥ 0).

Hence

−(1/2πi) ∫_{Γ_k} (A − zI)^{−1} dz = P_{λ_k} = Σ_i x_i y_i*.
where T is an upper triangular matrix of order n. The matrix T is obtained by
premultiplying in turn by permutation matrices and Gaussian elementary
matrices. If, instead of the latter, we use Householder matrices (Exercise 1.8.5),
then we obtain the Schmidt factorization.
The final structure is of the same form.
2.3.2 It is trivial that diagonalisable matrices with the same spectrum are similar. Defective matrices are similar if and only if they possess the same spectral structure (eigenvalues with the same algebraic and geometric multiplicities and the same indices); that is, if and only if they possess the same Jordan form.
2.3.3 We show that λ is not an eigenvalue of (I − Π)A: if λ were an eigenvalue of (I − Π)A, we should have

(I − Π)Au = λu

for some u ≠ 0. If λ ≠ 0, then

0 = Π(I − Π)Au = λΠu ⟹ Πu = 0,

whence

u = (I − Π)u ≠ 0

and

(I − Π)A(I − Π)u = λu;

that is, λ is an eigenvalue of (I − Π)A(I − Π), which is impossible because λ is not an eigenvalue of B. We deduce that the unique solution is

z = Σb.
2.3.6 We refer to the identities

(I − Π)(I − Π₁) = I − Π, (I − Π₁)(I − Π) = I − Π₁,

‖Σ(Π₁)‖₂ ≥ ‖Σ(Π)‖₂, ‖Σ(Π₁)‖_F ≥ ‖Σ(Π)‖_F.
(I − P)AY − YJ = RV.

An easy computation yields

Y₁ = SRV₁,
Y₂ = SRV₂ + αSY₁ = SRV₂ + αS²RV₁.

Hence the reduced resolvent S and the block-reduced resolvent Ŝ satisfy the relation

ŜR = X = YV^{−1} = SR + αS²R(0 V₁)V^{−1}.
\[
\begin{pmatrix} 0.986 & 0.579 \\ 0.409 & 0.237 \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}
\approx
\begin{pmatrix} 0.235 \\ 0.107 \end{pmatrix}.
\]
If the Jacobian matrix satisfies the Lipschitz condition, then p = 1 and the convergence is quadratic.
2.11.2 We denote γ′ = ‖J′^{−1}‖, p = ‖HX′‖, t = ‖X′‖, u = ‖Y*‖, s = ‖Y*A(I − X′Y*)‖, V₁ = J′^{−1}HX′, with ‖V₁‖ ≤ γ′p = π₁ by definition. We suppose that ‖V_k‖ ≤ π_k, and set

ε₁ = ‖J′^{−1}H‖, ε₂ = ‖Y*HX′‖, η = ε₁ + γ′ε₂ and ε = γ′²sp.

Then

‖V_{k+1}‖ ≤ π₁ + ε₁π_k + γ′ε₂π_k + γ′sπ_k² = π₁ + (ε₁ + γ′ε₂)π_k + γ′sπ_k² = π_{k+1} (say).

Set π_k = π₁(1 + x_k) for k ≥ 1. For k = 1, x₁ = 0; for k = 2,

π₂ = π₁(1 + η + γ′sπ₁) = π₁(1 + η + ε) = π₁(1 + x₂),

and

π_{k+1} = π₁[1 + η(1 + x_k) + ε(1 + x_k)²] = π₁[1 + x_{k+1}],

which defines the recurrence relation x₁ = 0, x_{k+1} = η + ε + (η + 2ε)x_k + εx_k², k ≥ 1.
The limit x satisfies x = f(x) = εx² + (η + 2ε)x + η + ε; x = f(x) has two real roots if 2√ε + η < 1. (One can verify that 2√ε + η < 1 implies that the discriminant is positive.) Let x* denote the smallest root. When k → ∞, x_k converges monotonically from x₁ = 0 towards x*, and π_k converges to π₁(1 + x*).
Let

G′: V ↦ V₁ + J′^{−1}[HV − V(Y*HX′) + V(Y*AV)].

We determine a sufficient condition under which G′ is a contractive map in the closed ball

𝔅 = {V; ‖V‖ ≤ (1 + x*)π₁};

G′(V) − G′(V′) = J′^{−1}[H(V − V′) − (V − V′)Y*HX′ + (V − V′)Y*AV + V′Y*A(V − V′)].

Now

B − B′ = Y*AX − Y*A′X′ = Y*A(X − X′) + Y*(A − A′)X′

implies

‖B − B′‖ ≤ s‖X − X′‖ + u‖HX′‖.

The condition η + 2ε < ¼ is rewritten as

γ′‖H‖ + γ′‖H‖tu + 2γ′²s‖H‖t < ¼,

that is

γ′‖H‖[1 + tu + 2γ′st] < ¼.

If we choose the Euclidean norm ‖·‖₂ and the bases Y = X′ = Q to be orthonormal, then t = u = 1, and the sufficient condition becomes 2γ′‖H‖(1 + γ′s) < ¼.
CHAPTER 3
3.3.1 If J = V^{−1}AV, then J^k = V^{−1}A^kV, and we deduce that

e^A = V e^J V^{−1}.

The result follows from the identity

e^{J(t+h)} − e^{Jt} = (e^{Jh} − I)e^{Jt} = hJe^{Jt}[I + O(h)].

The computation of the elements of e^{Jt} follows from Exercise 3.1.6.
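The identity e^A = V e^J V^{−1} can be checked directly in the diagonalisable case; the sketch below (random matrix, illustrative choice) compares the eigen-decomposition route with a truncated power series.

```python
import numpy as np

# e^A = V e^D V^{-1} for a diagonalisable A, versus the series sum A^k / k!
rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
w, V = np.linalg.eig(A)
expA = (V * np.exp(w)) @ np.linalg.inv(V)   # columns of V scaled by e^{w_i}

S, term = np.eye(4), np.eye(4)              # truncated exponential series
for k in range(1, 40):
    term = term @ A / k
    S = S + term
print(np.linalg.norm(expA - S))   # ~ 0
```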
CHAPTER 4
4.1.1 Let u be such that ‖u‖₂ = 1 and

‖A^{−1}u‖₂ = max_{‖x‖₂=1} ‖A^{−1}x‖₂ = ‖A^{−1}‖₂.

Let

v = A^{−1}u/‖A^{−1}‖₂ and ΔA = −uv*/‖A^{−1}‖₂.
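The construction of Exercise 4.1.1 can be verified numerically: the rank-one perturbation has norm 1/‖A^{−1}‖₂ = σ_min(A), the distance of A to singularity, and A + ΔA is exactly singular. (The random matrix is an illustrative choice.)

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))
Ainv = np.linalg.inv(A)
_, s, Vh = np.linalg.svd(Ainv)
u = Vh[0]                          # ||A^{-1} u||_2 = ||A^{-1}||_2 = s[0]
v = Ainv @ u / s[0]                # unit vector in the direction A^{-1} u
dA = -np.outer(u, v) / s[0]        # the minimal perturbation
sigma_min = np.linalg.svd(A, compute_uv=False)[-1]
print(np.linalg.norm(dA, 2), sigma_min)            # equal
print(np.linalg.svd(A + dA, compute_uv=False)[-1]) # ~ 0: A + dA singular
```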
where

sp(B) = {λ},

and

B̂ = Q̂*AQ̂,

where

sp(B̂) = sp(A)\{λ}.

Let

δ = min_{μ ∈ sp(B̂)} |μ − λ|.

Hence

δ^{−1} ≤ max_{μ ∈ sp(B̂)} 1/|μ − λ| ≤ ‖(B̂ − λI)^{−1}‖₂,

and

‖Σ₁‖₂ = ‖(B̂ − λI)^{−1}‖₂.

Let λ̂ ∈ sp(B̂) be such that δ = |λ̂ − λ|, and let

J = V^{−1}B̂V

be the Jordan form of B̂. Hence

(B̂ − λI)^{−1} = V(J − λI)^{−1}V^{−1},

and so

‖(B̂ − λI)^{−1}‖₂ ≤ cond₂(V) ‖(J − λI)^{−1}‖₂.

Let l be the index of λ̂ and let J(λ̂) be the corresponding l × l Jordan block. Hence

‖(J − λI)^{−1}‖₂ ≤ max_i ‖(J_{ii} − λI)^{−1}‖₂,

where the J_{ii} are the different Jordan blocks of J. For sufficiently small δ the last maximum is attained by J_{ii} = J(λ̂), whence we have the result that

δ^{−1} ≤ ‖(B̂ − λI)^{−1}‖₂ = ‖Σ₁‖₂ ≤ 2 cond₂(V) δ^{−l}.
4.2.2 The function

P(ε) = −(1/2πi) ∫_Γ (A(ε) − zI)^{−1} dz

is analytic for

|ε| < min_{z∈Γ} {ρ[R(z)H]}^{−1},

where

R(z) = (A − zI)^{−1}

and Γ is a Jordan curve isolating λ. Hence

lim_{ε→0} ‖P(ε) − P‖₂ = 0

and

x(ε) = P(ε)φ

can be normalized as follows:

φ(ε) = [φ_*^* x(ε)]^{−1} x(ε),

where

A*φ_* = λ̄φ_*, φ_*^*φ = 1.

In fact, for sufficiently small ε, we have φ_*^* x(ε) ≠ 0, because

lim_{ε→0} |φ_*^* x(ε) − 1| = 0.

Hence

θ(ε) = φ_*^* A(ε) φ(ε),

and we can prove that

dθ/dε |_{ε=0} = φ_*^* H φ,

whence

|θ(ε) − λ| ≤ ‖φ_*‖₂ ‖H‖₂ |ε| + O(ε²).

This proves (b).
The inequality (9) is proved as follows. Since
QNQ* = Vny-it
it follows that
l^°nd*(K)ll*(£)-^maxil,- l
\λ(ε)-λ\ l \λ(ε)-λ\'
(see [B:1, 25]).
4.2.3 We calculate

\[
B = \Delta^{-1} A \Delta = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix},
\qquad \Delta = \mathrm{diag}(1, 10^{-4}).
\]

The departures from normality are

ν(A) = ‖A*A − AA*‖_F = 10⁴[2(1 + 10⁸)]^{1/2} ≈ √2 × 10⁸,
ν(B) = ‖B*B − BB*‖_F = 2.

The bases of eigenvectors are

\[
X(A) = \begin{pmatrix} 1 & 1 \\ 0 & -10^{-4} \end{pmatrix}
\quad\text{and}\quad
X(B) = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix},
\]

respectively, whence

cond₂[X(B)] = (3 + √5)/2 < 2.62.
ε_K = ‖ΔA‖₂/‖A‖₂.

On the other hand, for each eigenvalue λ ≠ 0 of a non-singular matrix we have

0 < 1/|λ| ≤ ‖A^{−1}‖₂.

Hence when λ is semi-simple (l = 1 and V = I_m) we have

|Δλ|/|λ| ≤ cond₂(A) ‖P‖₂ ε_K + O(ε²).
\[
J = V^{-1} B V = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix},
\qquad
R = \begin{pmatrix} a & c \\ 0 & b \end{pmatrix},
\]

whence

\[
R^{-1} = \frac{1}{ab} \begin{pmatrix} b & -c \\ 0 & a \end{pmatrix},
\qquad
T = Q^* B Q = R J R^{-1} = \begin{pmatrix} \lambda & a/b \\ 0 & \lambda \end{pmatrix}.
\]

Clearly, if b is small (the Jordan vectors almost dependent), then cond₂(B) is large and so is the departure from normality of T, which grows like |a/b|².
4.2.13 (A + ΔA)(x + Δx) = (λ + Δλ)(x + Δx) gives, to the first order,

Δλ x = ΔAx + (A − λI)Δx;

hence

(x_*^*x) Δλ = x_*^* ΔA x.

Then

|Δλ| ≤ ‖x_*‖ ‖ΔA‖ ‖x‖ / |x_*^*x|,

by making use of the Schwarz inequality |(y, x)| ≤ ‖y‖_* ‖x‖ for x ∈ X a normed space, y ∈ X* the adjoint space (Kato, 1976, p. 13). We show that the upper bound is attainable with the perturbation ΔA = ε‖A‖wv*, where the vectors v and w are chosen so that

v*x = ‖v‖_* ‖x‖, x_*^*w = ‖x_*‖ ‖w‖, ‖v‖_* = ‖w‖ = 1, ‖ΔA‖ = ε‖A‖.

We get

Δλ = ε‖A‖ (x_*^*w)(v*x)/(x_*^*x) = ‖ΔA‖ ‖x_*‖ ‖x‖ / (x_*^*x).

The result follows.
4.2.15 Δx = −Σ ΔA x implies ‖Δx‖ ≤ ‖Σ‖ ‖ΔA‖ ‖x‖. We show that equality is reached with ΔA = ε‖A‖σv*, where the vectors σ and v satisfy

‖Σσ‖ = ‖Σ‖ ‖σ‖, v*x = ‖v‖_* ‖x‖, ‖σ‖ = ‖v‖_* = 1, ‖ΔA‖ = ε‖A‖;

then

Δx = −ε‖A‖ (Σσ)(v*x),
‖Δx‖ = ‖ΔA‖ ‖Σ‖ ‖x‖.
4.2.21
(a) A + ΔA − λI is singular ⟺ ∃y ≠ 0 such that

(A + ΔA − λI)y = 0, (A − λI)y = −ΔAy.

(a) ⟹ (b): (A − λI)y = z with ‖y‖ = 1, ‖z‖ ≤ ε‖A‖.

(A − λI)^{−1}z = y, ‖(A − λI)^{−1}‖ ≥ ‖y‖/‖z‖ ≥ 1/(ε‖A‖).

(b) ⟹ (a): ‖ΔA‖ = dist(A − λI, singularity) = 1/‖(A − λI)^{−1}‖ ≤ ε‖A‖.
(c) ‖(λI − A)^{−1}‖₂ = σ_max[(λI − A)^{−1}] = 1/σ_min(λI − A).
4.3.1 If the inequalities

0 < ε ≤ ν < 1/c(Γ)

are satisfied, then

ρ[R(z)H] ≤ ‖R(z)‖₂ ‖H‖₂ < 1

for all z ∈ Γ and the series Σ_k [R(z)H]^k converges, whence its sum is bounded on Γ by

c(Γ)/(1 − νc(Γ)).
4.3.2 Suppose that ‖A − A′‖₂ ≤ 1. Hence for every index l of an eigenvalue of A or of A′ we have l ≤ n. Hence

‖A − A′‖₂^{1/l} ≤ ‖A − A′‖₂^{1/n},

and so

dist(sp(A), sp(A′)) ≤ c ‖A − A′‖₂^{1/n}.
4.4.1 If ‖ΔA‖₂ is sufficiently small, then Exercise 4.2.2 shows that

|λ − λ′| < 1,

whence

(1 + |λ − λ′|)^{(l−1)/l} ≤ 2.
4.4.3 Let R be the triangular part of A:

A = R − H,

where ε = ‖H‖₁ is small. Hence B^{−1} exists and Σ is well defined, since no two diagonal elements are equal to a₁₁. Hence

T_k = (I − Q)B^{−k}(I − Q).

Let
Äi=F[(/-ß)(Ä-%/)]l { . f } .:{ei} 1 -W J l .
Hi = L(I-Q)m\{ei]x:{ei}1-+{ei}\
We obtain
|λ − 1| ≤ 4 × 10⁻⁸, |λ − 2| ≤ 0.2501, |λ − 3| ≤ 10⁻⁴.

These three disks are mutually disjoint. Hence by Corollary 4.5.2 each contains precisely one eigenvalue of A. If we take

d₁ = 10⁻⁴, d₂ = 0.5, d₃ = 10⁻⁴,

we localize an eigenvalue λ of A by the inequality

|λ − 2| ≤ 4 × 10⁻⁸.

Finally, if we take

d₁ = d₂ = 10⁻⁴, d₃ = 0.25,

we localize an eigenvalue λ of A by the inequality

|λ − 3| ≤ 4 × 10⁻⁸.
4.6.1 Put

\[
\tilde T = \begin{pmatrix} A & B^* \\ B & W \end{pmatrix}.
\]

It is easy to show that, for every matrix W, we have

‖T‖₂ ≤ ‖T̃‖₂.

Let p > ‖T‖₂. Hence p > ‖A‖₂ and the matrix p²I − A² is positive definite Hermitian. Hence there exist matrices

K = B(p²I − A²)^{−1} and W = −KAB*.

Using this matrix W we shall show that ‖T̃‖₂ ≤ p; to this end we show that the Hermitian matrix p²I − T̃² is positive definite, by reducing it with the congruence

\[
\begin{pmatrix} I & 0 \\ KA & I \end{pmatrix}.
\]

The elements of W, and hence of T̃, are bounded, so they have a limit when p tends to ‖T‖₂ (from the right).
4.6.5 Let U and Q be two orthonormal bases of the subspace S = lin(Q) of dimension m. Then there exists a unitary matrix B such that

U = QB.

It can be shown that

‖AU − UU*AU‖₂ = ‖AQ − QQ*AQ‖₂.

By Lemma 4.6.6 we now have

‖AQ − QQ*AQ‖₂ = min{‖AU − UZ‖₂ ; lin(U) = S, U*U = I_m, Z ∈ ℂ^{m×m}}.

Let V ∈ ℂ^{m×m} be an orthonormal basis of eigenvectors of Q*AQ. Then

VΔV* = Q*AQ,

where Δ is the diagonal matrix of eigenvalues of Q*AQ. It follows that

min_{U,Z} ‖AU − UZ‖ = ‖AQ − QVΔV*‖₂
= ‖(AQV − QVΔ)V*‖₂
= ‖A(QV) − (QV)Δ‖₂.

We remark that Δ is the diagonal matrix of approximate eigenvalues and that QV represents the associated vectors.
4.6.13 First consider the spectral norm. It is known that ‖B′ − C‖₂ or −‖B′ − C‖₂ is an eigenvalue of the Hermitian matrix B′ − C. Let v be a vector such that ‖v‖₂ = 1 and

(B′ − C)v = ε‖B′ − C‖₂ v,

where ε = 1 or −1. Then

v*(B′D − DC)v = ε‖B′ − C‖₂ v*Dv + v*(B′D − DB′)v.

Now ε‖B′ − C‖₂ v*Dv is a real number and v*(B′D − DB′)v is pure imaginary. We deduce that

‖B′D − DC‖₂ ≥ |v*(B′D − DC)v| ≥ ‖B′ − C‖₂ v*Dv.

However v*Dv ≥ σ_min(D) = σ, whence

‖B′D − DC‖₂ ≥ σ‖B′ − C‖₂.

The inequality concerning the Frobenius norm is proved as follows: suppose that i and j are such that d_i > σ, and put
μ = —,
(|b - /^c|2 + |fo^ - c i 2 ) ^ Y ^ (Ic|2 -+- |fo|2)(l -l· ^2) - 2/z(5c ^ fee)
>2\b-c\\
which implies the inequality required.
CHAPTER 5
5.1.1 Let H = (h_{ij}) ∈ ℂ^{n×n} be an irreducible Hessenberg matrix. Let λ ∈ ℂ. Since h_{i+1,i} ≠ 0 (i = 1, 2, …, n − 1), the first n − 1 columns of H − λI are linearly independent. Hence

dim Im(H − λI) ≥ n − 1

and therefore

dim Ker(H − λI) ≤ 1.

On the other hand, if λ ∈ sp(H), we have

dim Ker(H − λI) ≥ 1.

We conclude that:
(a) The geometric multiplicity of each eigenvalue of an irreducible Hessenberg matrix is equal to unity.
(b) If an irreducible Hessenberg matrix is diagonalisable, then all its eigenvalues are necessarily simple. For example, this is the case for an irreducible tridiagonal symmetric matrix.
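This rank argument is easy to test numerically. The sketch below builds an unreduced Hessenberg matrix (companion-like, with an illustrative first row) and checks that H − λI has rank exactly n − 1 at every eigenvalue.

```python
import numpy as np

# Unreduced Hessenberg matrix: nonzero subdiagonal forces
# rank(H - lambda I) = n - 1, i.e. geometric multiplicity 1.
n = 5
H = np.diag(np.ones(n - 1), -1)      # subdiagonal of ones
H[0, :] = np.arange(1.0, n + 1.0)    # arbitrary (illustrative) first row
for lam in np.linalg.eigvals(H):
    sv = np.linalg.svd(H - lam * np.eye(n), compute_uv=False)
    rank = int((sv > 1e-8).sum())    # numerical rank with explicit cutoff
    assert rank == n - 1
print("every eigenvalue of H is geometrically simple")
```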
5.3.1 Given an eigenvector φ associated with λ₁ = μ₁ and given the sequence q_k, there exists a sequence of complex numbers α_k such that

lim α_k q_k = φ.

However,

ω(lin φ, lin q_k) = O(‖α_k q_k − φ‖₂).

If we suppose that ‖q_k‖₂ = ‖φ‖₂ = 1, then we may choose the α_k such that |α_k| = 1, and so

q_k*Aq_k = (α_k q_k)*A(α_k q_k) for all k ≥ 0.

However, φ*Aφ = λ₁. Hence

|q_k*Aq_k − λ₁| = |(α_k q_k)*A(α_k q_k) − φ*Aφ|
≤ |[(α_k q_k)* − φ*]Aφ| + |(α_k q_k)*A(α_k q_k − φ)|
≤ 2‖A‖₂ ‖α_k q_k − φ‖₂ → 0.
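The convergence of the Rayleigh quotients q_k*Aq_k along the power iteration can be observed directly; the symmetric matrix below is an illustrative choice with a well-separated dominant eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(5)
M = 0.05 * rng.standard_normal((3, 3))
A = np.diag([4.0, 2.0, 1.0]) + (M + M.T) / 2   # dominant eigenvalue near 4
q = rng.standard_normal(3); q /= np.linalg.norm(q)
for _ in range(200):                            # power iteration
    q = A @ q
    q /= np.linalg.norm(q)
lam1 = np.linalg.eigvalsh(A)[-1]                # dominant eigenvalue
print(q @ A @ q, lam1)                          # the two values agree
```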
then V = (V₁, V₂) is unitary. Let U₂ ∈ ℂ^{n×(n−m)} be such that its columns are orthonormal and belong to the subspace [A(S) + B(S)]^⊥, whose dimension is not less than n − m. Finally, let U₁ be such that U = (U₁, U₂) is unitary. Then

A₂₁ = B₂₁ = 0.

(b) We compute U′*U′ by taking into account the orthonormality relations

U₁*U₁ = I, U₂*U₂ = I, U₁*U₂ = 0, U₂*U₁ = 0.

It will be found that U′*U′ = I, and similarly for V′.
(c) For example, U₂′*AV₁′ = 0 is equivalent to

(I + XX*)^{−1/2}(U₂* − XU₁*)A(V₁ + V₂Y)(I + Y*Y)^{−1/2} = 0,

which is equivalent to

(U₂* − XU₁*)A(V₁ + V₂Y) = 0,

or again

A₂₁ + A₂₂Y − XA₁₁ − XA₁₂Y = 0.

(d) We shall construct a sequence (X_i, Y_i) that converges to (X, Y). We use the following recursive definition: (X₀, Y₀) is a solution of the system

A₂₂X₀ − X₀A₁₁ = A₂₁, B₂₂Y₀ − Y₀B₁₁ = B₂₁.
Now

max{‖X₀‖_F, ‖Y₀‖_F} ≤ p₀.

If

max{‖X_i‖_F, ‖Y_i‖_F} ≤ p_i,

we have

max{‖X_{i+1}‖_F, ‖Y_{i+1}‖_F} ≤ p_{i+1}.

If we define

k_{i+1} = k_i(1 + k_i)²,

then

p_i = p₀(1 + k_i)

and

k_i < lim_{i→∞} k_i = 2k₁/(1 − 2k₁ + √(1 − 4k₁)) < 1.

On the other hand, if p = lim_{i→∞} p_i, then

max{‖X_{i+1} − X_i‖_F, ‖Y_{i+1} − Y_i‖_F} ≤ 2kp · max{‖X_i − X_{i−1}‖_F, ‖Y_i − Y_{i−1}‖_F},
(B − λI)z_m = c_m + Σ_{i=1}^{m−1} n_{im} z_i.

The conditioning of such systems depends on ‖(B − λI)^{−1}‖₂ and on ‖N‖_F. Hence ‖B‖_F and ‖B^{−1}‖_F moderate imply ‖N‖_F and ‖(B − λI)^{−1}‖₂ moderate. Therefore ‖(B − σI)^{−1}‖₂ is moderate for σ close to λ.
5.9.2 Let A = K*K and B = Π*Π. Since ‖K − Π‖₂ = O(ε), it follows that

‖A − B‖₂ ≤ (‖K‖₂ + ‖Π‖₂)‖K − Π‖₂ = O(ε),

whence

|λ_i(A) − λ_i(B)| = O(ε).

Now

λ_i(A) = σ_i(K)², λ_i(B) = σ_i(Π)²,

and there are r − 1 values σ_i(Π) that are zero. Let

I₀ = {i : σ_i(Π)σ_i(K) = 0}.
CHAPTER 6
6.1.1 Let V₁ be an orthonormal basis of G₁. The map Π₁A|_{G₁}: G₁ → G₁ is represented in this basis by the matrix

B₁ = V₁*AV₁.

Let μ ≠ 0 be an eigenvalue of B₁. If x₁ is an eigenvector of B₁ associated with μ, then

V₁V₁*A(V₁x₁) = V₁B₁x₁ = (V₁x₁)μ,

that is,

Π₁Ay₁ = y₁μ with y₁ = V₁x₁.

Conversely, if Π₁Ay₁ = y₁μ with μ ≠ 0, then

y₁ ∈ G₁, Π₁y₁ = y₁ and V₁*y₁ ≠ 0.

Hence

(V₁*AV₁)(V₁*y₁) = (V₁*y₁)μ.

It follows that x₁ = V₁*y₁ is an eigenvector of B₁ and μ is the corresponding eigenvalue.
6.2.1 It is proved in [B:62] that for every square matrix B we have

‖B^k‖₂ ≤ a ρ(B)^k k^{L−1},

where a does not depend on k (and may be chosen to be greater than or equal to unity) and where L is the order of the largest Jordan block of B. Hence we may rewrite the bound in Lemma 6.2.1 as follows:

‖(I − Π_i)x_i‖₂ ≤ a |μ_{r+1}/μ_r|^i ‖x_i − s_i‖₂

when A is a diagonalisable matrix. Moreover, in the general case, when ε > 0 is given, we can determine the integer k in such a way that the corresponding bound holds, with |μ_{r+1}/μ_r| replaced by |μ_{r+1}/μ_r| + ε, for all l > k.
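The bound ‖B^k‖₂ ≤ a ρ(B)^k k^{L−1} can be watched in action on a single Jordan block: the powers first grow (the transient) and then decay at the rate ρ(B)^k, modulated by the polynomial factor k^{L−1}. (The values of ρ and L below are illustrative.)

```python
import numpy as np

rho, L = 0.9, 3
B = rho * np.eye(L) + np.diag(np.ones(L - 1), 1)   # one Jordan block
for k in (1, 5, 20, 80):
    nk = np.linalg.norm(np.linalg.matrix_power(B, k), 2)
    bound = rho**k * k**(L - 1)
    print(k, nk, bound)        # nk / bound stays bounded as k grows
```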
6.2.3 Let λ be an eigenvalue chosen among the r dominant eigenvalues. We suppose that the assumption (6.2.1) is satisfied, as well as dim PS = r. Hence there exists an eigenvector x associated with λ such that

α_i = ‖(I − Π_i)x‖₂ → 0.

Hence, for sufficiently large i,

‖x − x_i‖₂ ≤ cα_i

and, when A is Hermitian,

|λ − λ_i| ≤ cα_i²,

where c is a generic constant.
6.2.4 Let U = (u₁, …, u_r). By definition,

S = lin(U) and dim S = r.

Since R₀ is an upper triangular non-singular matrix and since Q is unitary, …

β_j = 0.

Hence the tridiagonal matrix T_i consists of two blocks. How is the second block constructed?
6.3.3 First, we remark that

v_i*Av_j = β_j when i = j − 1; α_j when i = j; β_{j+1} when i = j + 1; 0 otherwise.
By construction, y₁ = x₁ and θ₁ = 1. Suppose that

y_j = θ_j x_j when 1 ≤ j ≤ k,

where θ_j is a real non-negative number. On the other hand, when j = k,

A^k x₁ = θ_{k+1} v_{k+1} + Σ_{j=1}^{k} (v_j* A^k x₁) v_j,

whence

A^k x₁ − Σ_{i=1}^{k−1} Σ_{j=i−1}^{k} (v_i* A^{k−1} x₁)(v_j* A v_i) v_j = θ_{k+1}(A v_k − α_k v_k − β_k v_{k−1}).

However,

A v_i − α_i v_i − β_i v_{i−1} = β_{i+1} v_{i+1}.

Hence y_{k+1} is proportional to x_{k+1}, which completes the induction.
We suppose that A = A_i and A_j = A_i^{(j)} (in accordance with the numbering given on page 259). The constants that occur in these bounds are as follows:

γ_i = (λ_i − λ_{i+1})/(λ_{i+1} − λ_min),
η = ‖(I − π_k)Aπ_k‖₂,

and T_k is the Chebyshev polynomial of the first kind of degree k. The proof of these bounds is rather complicated from the technical point of view. The inequality concerning the eigenvectors is deduced from the majorization

‖(π − P_{i−1})x‖₂ ≤ … ,

where

κ_i = γ_i + √(γ_i² − 1).
6.3.7 If j₀ does not exist, then a_{i₀i₀} is an eigenvalue of A and the problem is divided into two parts. The matrix H^{(2)} contains the entry a_{j₀j₀}. The matrix (λI − D) is positive definite because

λ = v^{(2)T} A v^{(2)} > a_{ii} (1 ≤ i ≤ n).
CHAPTER 7
7.1.1 Let {φ₁, …, φ_k} be a basis of V. For a given f ∈ C(S) we define the map

(a_i) ∈ ℂ^k ↦ ‖f − Σ_{i=1}^{k} a_i φ_i‖_∞ ∈ ℝ,

for which ‖Σ_i a_i⁰ φ_i − f‖_∞ ≤ ‖f‖_∞ at a minimiser.
On the other hand, T_k(cos θ) = cos kθ, which vanishes for θ_j = (2j − 1)π/2k. Hence T_k(z) has k real zeros, namely

z_j = cos θ_j = cos((2j − 1)π/2k) ∈ [−1, 1] (j = 1, …, k).
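The Chebyshev zeros z_j = cos((2j − 1)π/2k) are easily verified numerically:

```python
import numpy as np

# The k zeros of the Chebyshev polynomial T_k of the first kind.
k = 7
j = np.arange(1, k + 1)
zeros = np.cos((2 * j - 1) * np.pi / (2 * k))
Tk = np.polynomial.chebyshev.Chebyshev.basis(k)   # T_k as a polynomial
print(np.max(np.abs(Tk(zeros))))                  # ~ 0
```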
7.3.9 Let c be real, but e = ic/10 is pure imaginary. For every semi-major axis a such that 0 < a < c we have

max_{z∈E(c,e,a)} |t₂(z)| < max_{z∈E(c,e,a)} |t₁(z)|,

where

t_k(z) = T_k[(z − c)/e] / T_k[(λ − c)/e].
7.3.12 Let V = {q ∈ P_k : q(λ) = 0} and let D be a compact subset of ℂ that does not contain λ. By Example 7.1.1, V satisfies the Haar condition. By Exercise 7.1.1, there exists a best approximation q* of the function 1 ∈ C(D) in V; thus

‖1 − q*‖_∞ ≤ ‖1 − q‖_∞ for all q ∈ V.

Hence there exists a unique polynomial p* = 1 − q* that satisfies

max_{z∈D} |p*(z)| = min_{p∈P_k, p(λ)=1} max_{z∈D} |p(z)|.
T_k(x) = ½[(x + √(x² − 1))^k + (x + √(x² − 1))^{−k}].

Since a/e > 1 and (λ − c)/e > 1, we have

T_k(a/e)/T_k[(λ − c)/e] ≤ ( [a + √(a² − e²)] / [|λ − c| + √((λ − c)² − e²)] )^k.
7.8.1 We remark that

r_i = A(x − x_i) = Ae_i.

(a) Since r_i = Ae_i and

p_n(D) = diag(p_n(μ₁), …, p_n(μ_n)),

where the μ_i are the eigenvalues of A.
(c) If V is the Jordan basis and J is the Jordan form of A, then, for a Jordan block J_i of order t_i, p_n(J_i) is the upper triangular Toeplitz matrix with entries p_n(λ_i), p_n′(λ_i), …, p_n^{(t_i−1)}(λ_i)/(t_i − 1)! along its successive diagonals. Hence ‖p_n(A)‖₂ → 0 as n → ∞ if and only if |p_n^{(j)}(λ_i)| → 0 as n → ∞ for each j < t_i and for each Jordan block.
The reader will verify that when p_n is defined by

p_n(λ) = T_n[(c − λ)/e] / T_n(c/e),

we have

|p_n(λ_i)| → 0 ⟹ |p_n^{(j)}(λ_i)| → 0

for all j < t_i.
(d) It suffices to consider that

p_n(λ) = (exp{n cosh^{−1}[(c − λ)/e]} + exp{−n cosh^{−1}[(c − λ)/e]}) / (exp[n cosh^{−1}(c/e)] + exp[−n cosh^{−1}(c/e)])

and to use the logarithmic definition of cosh^{−1}.
(e) The proposed algorithm follows from the recurrence relation proved in
Exercise 7.3.4.
7.9.1 Let U = (u₁, …, u_j) ∈ ℂ^{n×j} be a matrix such that U*U = I_j and

AU = UR,

where R ∈ ℂ^{j×j} is an upper triangular matrix. We suppose that the diagonal of R consists of the eigenvalues λ₁, …, λ_j of A. Let Σ_j = diag(σ₁, …, σ_j). Then the eigenvalues of

A_j = A − UΣ_jU*

are

λ_i − σ_i if 1 ≤ i ≤ j, and λ_i if j < i ≤ n.

In fact, let

E_j = (e₁, …, e_j)

and suppose that R̄ = Ū*AŪ is the Schur form of A, where Ū is unitary. Then

A_jŪ = Ū(R̄ − E_jΣ_jE_j*),

which proves the result.
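The spectrum shift under deflation is easy to confirm numerically. The sketch below uses a Hermitian matrix so that eigenvectors can play the role of the Schur vectors U (an illustrative simplification of the exercise's setting).

```python
import numpy as np

# Deflation A_j = A - U Sigma_j U*: shifts the eigenvalues attached to the
# invariant subspace lin(U) by sigma_i, leaving the others unchanged.
rng = np.random.default_rng(6)
n, j = 6, 2
M = rng.standard_normal((n, n)); A = (M + M.T) / 2
w, V = np.linalg.eigh(A)               # ascending eigenpairs
U = V[:, ::-1][:, :j]                  # eigenvectors of the j largest
sig = np.array([10.0, 20.0])           # illustrative shifts
Aj = A - U @ np.diag(sig) @ U.T
expected = np.sort(np.concatenate([w[::-1][:j] - sig, w[::-1][j:]]))
print(np.allclose(np.sort(np.linalg.eigvalsh(Aj)), expected))   # True
```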
7.9.2 We suppose that we possess an algorithm for calculating the eigenvalues with the greatest real part and the associated eigenvectors (for example: Arnoldi, Arnoldi-Chebyshev, Arnoldi with least squares, preconditioned Arnoldi and so on).
The progressive deflation consists of the following steps:
(a) j = 0, U₀ = (0), Σ₀ = 0.
(b) Use the algorithm to calculate the eigenvalue λ_{j+1} with the greatest real part of A_j.
[1] Ahues, M. (1989) 'Spectral condition numbers for defective eigenelements of linear operators in Hilbert spaces', Numer. Funct. Anal. and Optim., 10, 843-61.
[2] Ahues, M. and Telias, M. (1986) 'Refinement methods of Newton type for approximate eigenelements of integral operators', SIAM J. Numer. Anal., 23, 114-59.
[3] Atkinson, K. (1973) 'Iterative variants of the Nyström method for the numerical
solution of integral equations', Numer. Math., 22, 17-31.
[4] Atkinson, K. (1976) A Survey of Numerical Methods for the Solution of Fredholm
Integral Equations of the Second Kind, SIAM Philadelphia, Pennsylvania.
[5] Aubin, J. P. (1984) L'Analyse Non-linéaire et ses Motivations Économiques, Masson, Paris.
[6] Bjorck, A. and Golub, G. (1973) 'Numerical methods for computing angles between
linear subspaces' Math. Comp., 27, 579-94.
[7] Brandt, A. (1977) 'Multi-level adaptive solutions to boundary-value problems',
Math. Comp., 31, 333-90.
[8] Brinkmann, H. and Klotz, E. (1971) Linear Algebra and Analytic Geometry, Addison-
Wesley, Massachusetts.
[9] Campbell, S. L. and Meyer, C. D. Jr (1991) Generalized Inverses of Linear Transformations, Dover, Mineola, N.Y.
[10] Cartan, H. (1972) Calcul Différentiel, Hermann, Paris.
[11] Chatelin, F. (1983) Spectral Approximation of Linear Operators, Academic Press,
New York.
[12] Chatelin, F. (1984) 'Iterative aggregation/disaggregation methods', International Workshop on Applied Mathematics and Performance/Reliability Models of Computer/Communication Systems, North Holland, Amsterdam.
[13] Chatelin, F. (1992) 'The influence of nonnormality on matrix computations', in Linear Algebra, Markov Chains, and Queueing Models, C. D. Meyer and R. J. Plemmons (eds.), IMA Volumes in Mathematics and its Applications, Springer-Verlag, New York.
[14] Chatelin, F. and Belaid, D. (1987) 'Numerical analysis for factorial data analysis.
Part I: numerical software—the Package INDA for microcomputers', Appl. Stoch.
Mod. and Data Anal, 3, 193-206.
[15] Crouzeix, M. and Sadkane, M. (1989) 'Sur la convergence de la méthode de Davidson', CRAS, 308, Série I, 189-91.
[16] Davidson, E. (1983) 'Matrix eigenvector methods', in Methods in Computational
Molecular Physics, Diercksen and Wilson (eds.), pp. 95-113, D. Reidel.
[17] Davis, Ch. and Kahan, W. M. (1970) 'The rotation of eigenvectors by a perturbation III', SIAM J. Numer. Anal., 7, 1-46.
[18] Debreu, G. and Herstein, I. N. (1953) 'Nonnegative square matrices', Econometrica,
21, 597-607.
396 APPENDIX B
[19] Demmel, J. (1987) 'Three methods for refining estimates of invariant subspaces',
Computing, 38, 43-57.
[20] Fraysse, V. (1992) 'Reliability of computer solutions', PhD thesis, Institut National
Polytechnique de Toulouse, France.
[21] Fröberg, C.-E. (1985) Numerical Mathematics. Theory and Computer Applications,
Benjamin/Cummings, California.
[22] Geradin, M. and Carnoy, E. (1979) 'On the practical use of eigenvalue bracketing
in finite element applications to vibration and stability problems', in Euromech
112, pp. 151-71, Hungarian Academy of Science, Budapest.
[23] Geurts, A. J. (1982) 'A contribution to the theory of conditions', Numer. Math., 39,
85-96.
[24] Golub, G., Nash, S. and Van Loan, C. (1979) 'A Hessenberg-Schur method for
the problem AX + XB = C', IEEE Trans. Autom. Control, AC-24, 909-13.
[25] Golub, G. and Van Loan, C. (1989) Matrix Computations, 2nd edition, Johns
Hopkins University Press, Baltimore, Maryland.
[26] Golub, G. and Welsch, J. (1969) 'Calculation of Gauss quadrature rules', Math.
Comp., 23, 221-30.
[27] Graham, A. (1981) Kronecker Products and Matrix Calculus with Applications, Ellis
Horwood, Chichester.
[28] Hackbusch, W. (1981) 'On the convergence of multigrid iterations', Beiträge Numer.
Math., 9, 213-39.
[29] Ho, D. (1990) 'Tchebychev iteration and its optimal ellipse for nonsymmetric
matrices', Numer. Math., 56, 721-34.
[30] Ho, D., Chatelin, F. and Bennani, M. (1990) 'Arnoldi-Tchebychev procedure for
large scale nonsymmetric matrices', M2AN, Math. Mod. Numer. Anal., 24(1),
53-65.
[31] Hoffman, K. and Kunze, R. (1971) Linear Algebra, Prentice-Hall, Englewood Cliffs,
N.J.
[32] Hoffman, A. J. and Wielandt, H. W. (1953) 'The variation of the spectrum of a
normal matrix', Duke Math. J., 20, 37-9.
[33] Kahan, W. M. (1967) 'Inclusion theorems for clusters of eigenvalues of Hermitian
matrices', Computer Science Department, University of Toronto.
[34] Kahan, W. M., Parlett, B. N. and Jiang, E. (1982) 'Residual bounds on approximate
eigensystems of nonnormal matrices', SIAM J. Numer. Anal., 19, 470-84.
[35] Kato, T. (1976) Perturbation Theory for Linear Operators, Springer-Verlag, Berlin,
Heidelberg, New York.
[36] Kreweras, G. (1972) Graphes, Chaînes de Markov et Quelques Applications
Économiques, Dalloz, Paris.
[37] Laganier, J. (1983) 'Croissance diversifiée de l'économie mondiale', Cours DEUG,
Université de Grenoble, France.
[38] Lancaster, P. (1970) 'Explicit solutions of linear matrix equations', SIAM Rev., 12,
544-66.
[39] Lascaux, P. and Theodor, R. (1986) Analyse Numérique Matricielle Appliquée à
l'Art de l'Ingénieur, Masson, Paris.
[40] Mardones, V. and Telias, M. (1986) 'Raffinement d'éléments propres approchés de
grandes matrices', in Innovative Numerical Methods in Engineering, Shaw, Periaux,
Chaudouet, Wu, Marino and Brebbia (eds.), pp. 153-8, Springer-Verlag, Berlin.
[41] Markushevich, A. (1970) Théorie des Fonctions Analytiques, Mir, Moscow.
[42] Matthies, H. G. (1985) 'Computable error bounds for the generalized symmetric
eigenproblem', Comm. Appl. Numer. Meth., 1, 33-8.
REFERENCES FOR EXERCISES
References
Jennings, A. (1977) Matrix Computations for Engineers and Scientists, John Wiley, New
York.
Kahan, W., Parlett, B. N. and Jiang, E. (1982) 'Residual bounds on approximate
eigensystems of nonnormal matrices', SIAM J. Numer. Anal., 19, 470-84.
Kato, T. (1949) 'On the upper and lower bounds of eigenvalues', J. Phys. Soc. Japan, 4,
334-9.
Kato, T. (1976) Perturbation Theory for Linear Operators, 2nd edition, Springer-Verlag,
Berlin and New York.
Kavenoky, A. and Lautard, J. J. (1983) 'State of the art in using finite element method
for neutron diffusion calculation', Proc. Adv. in Reactor Computations, 1, 28-31.
Kerner, W. (1989) 'Large scale complex eigenvalue problems', J. Comp. Phys., 85, 1-85.
Krylov, N. and Bogolioubov, N. (1929) Bull. Acad. Sci. USSR, Phys. Math., Leningrad, 471.
Kulisch, U. and Miranker, W. (1981) Computer Arithmetic in Theory and Practice,
Academic Press, New York.
Lancaster, P. and Tismenetsky, M. (1985) Theory of Matrices, 2nd edition, Academic
Press, New York.
Lanczos, C. (1950) 'An iteration method for the solution of the eigenvalue problem of
linear differential and integral operators', J. Res. Nat. Bur. Stand., 45, 255-82.
LAPACK Users' Guide (1992) SIAM, Philadelphia, Pa.
Laurent, P. J. (1972) Approximation et Optimisation, Hermann, Paris.
Leontiev, W. W. (1941) The Structure of the American Economy, Harvard University Press.
Manteuffel, T. A. (1977) 'The Tchebychev iteration for nonsymmetric linear systems',
Numer. Math., 28, 307-27.
Meirovitch, L. (1980) Computational Methods in Structural Dynamics, Sijthoff and
Noordhoff.
Meyer, C. D. Jr and Stewart, G. W. (1988) 'Derivatives and perturbations of eigenvectors',
SIAM J. Numer. Anal., 25, 679-91.
Moler, C. B. and Stewart, G. W. (1973) 'An algorithm for generalized matrix eigenvalues
problems', SIAM J. Numer. Anal., 10, 241-56.
Morishima, M. (1971) Marx's Economics: A Dual Theory of Value and Growth, Cambridge
University Press.
Nicolis, G. and Prigogine, I. (1977) Self-organization in Non-equilibrium Systems: From
Dissipative Structures to Order through Fluctuations, Wiley, New York.
Noble, B. and Daniel, J. W. (1977) Applied Linear Algebra, 2nd edition, Prentice-Hall,
Englewood Cliffs, N. J.
Nour-Omid, B., Parlett, B. N. and Taylor, D. (1983) 'Lanczos versus subspace iteration
for solution of eigenvalue problems', Int. J. Numer. Meth. Engng, 19, 859-71.
Nyström, E. J. (1930) 'Über die praktische Auflösung von Integralgleichungen mit
Anwendungen auf Randwertaufgaben', Acta Math., 54, 185-204.
Ortega, J. M. (1972) Numerical Analysis: A Second Course, Academic Press, New York.
Ortega, J. M. and Rheinboldt, W. C. (1971) Iterative Solution of Nonlinear Equations
in Several Variables, Academic Press, New York.
Ostrowski, A. M. (1957) 'On the continuity of characteristic roots as functions of the
elements of a matrix', Jber. Deutsch. Math.-Verein., 1, 40-2.
Ostrowski, A. M. (1958-9) 'On the convergence of the Rayleigh quotient iteration for
the computation of characteristic roots and vectors', Arch. Rational Mech. Anal., 1,
233-41; 2, 423-8; 3, 325-40, 341-7, 472-81; 4, 153-65.
Paige, C. (1971) 'The computation of eigenvalues and eigenvectors of very large sparse
matrices', Ph.D. thesis, Institute of Computer Science, London University.
Parlett, B. N. (1968) 'Global convergence of the basic QR algorithm on Hessenberg
matrices', Math. Comp., 22, 803-17.
Saad, Y. (1982b) 'Projection methods for solving large sparse eigenvalue problems', in
Matrix Pencils, Pite Havsbad, B. Kågström and A. Ruhe (eds.), Lecture Notes in
Mathematics 973, pp. 121-44, Springer-Verlag.
Saad, Y. (1984) 'Chebyshev acceleration techniques for solving nonsymmetric eigenvalue
problems', Math. Comp., 42, 567-88.
Saad, Y. (1987) 'Least squares polynomials in the complex plane, and their use for solving
sparse nonsymmetric linear problems', SIAM J. Numer. Anal., 24, 155-69.
Saad, Y. (1989) 'Numerical solution of large nonsymmetric eigenvalue problems', Comp.
Phys. Comm., 53, 71-90.
Saad, Y. and Schultz, M. (1986) 'GMRES: a generalized minimal residual algorithm for
solving nonsymmetric linear systems', SIAM J. Sci. Stat. Comp., 7, 856-69.
Sanchez-Palencia, E. (1980) Non-homogeneous Media and Vibration Theory, Lecture
notes in Physics 127, Springer-Verlag.
Schur, I. (1909) 'Über die charakteristischen Wurzeln einer linearen Substitution mit
einer Anwendung auf die Theorie der Integralgleichungen', Math. Ann., 66, 488-510.
Simon, H. D. (1984) 'The Lanczos algorithm with partial reorthogonalization', Math.
Comp., 42, 115-42.
Settari, A. and Aziz, K. (1973) 'A generalization of the additive correction methods for
the iterative solution of matrix equations', SIAM J. Numer. Anal., 10, 506-21.
Stetter, H. J. (1978) 'The defect correction principle and discretization methods', Numer.
Math., 29, 425-43.
Stewart, G. W. (1971) 'Error bounds for approximate invariant subspaces of closed linear
operators', SIAM J. Numer. Anal., 8, 796-808.
Stewart, G. W. (1973a) 'Error and perturbation bounds for subspaces associated with
certain eigenvalue problems', SIAM Rev., 15, 727-64.
Stewart, G. W. (1973b) Introduction to Matrix Computations, Academic Press, New York.
Stewart, W. J. and Jennings, A. (1981) 'Algorithm 570. LOPSI: a simultaneous iteration
algorithm for real matrices', ACM Trans. Math. Software, 7, 230-2.
Stoer, J. and Bulirsch, R. (1980) Introduction to Numerical Analysis, Springer-Verlag,
Berlin and New York.
Strang, G. (1980) Linear Algebra and Its Applications, 2nd edition, Academic Press, New
York.
Temple, G. (1928) 'The computation of characteristic numbers and characteristic
functions', Proc. London Math. Soc., 29, 257-80.
Thompson, J. M. T. (1982) Instabilities and Catastrophes in Science and Engineering,
Wiley, Chichester.
Trefethen, L. N. (1991) 'Pseudospectra of matrices', in Proceedings of the 14th Dundee
Biennial Conference on Numerical Analysis, D. F. Griffiths and G. A. Watson (eds.).
Varah, J. M. (1968) 'The calculation of the eigenvectors of a general complex matrix by
inverse iteration', Math. Comp., 22, 785-91.
Varah, J. M. (1979) 'On the separation of two matrices', SIAM J. Numer. Anal., 16,
216-22.
Varga, R. S. (1957) 'A comparison of successive over-relaxation and semi-iterative
methods using Chebyshev polynomials', SIAM J. Numer. Anal., 5, 39-46.
Varga, R. S. (1962) Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, N.J.
von Neumann, J. (1945-6) 'A model of a general economic equilibrium', Rev. Econ. Stud.,
13, 10-18.
Wachspress, E. (1966) Iterative Solution of Elliptic Systems and Applications to the Neutron
Diffusion Equations of Reactor Physics, Prentice-Hall, Englewood Cliffs, N.J.
Wasow, W. (1978) 'Topics in the theory of linear ordinary differential equations having
singularities with respect to a parameter', IRMA, Universite Louis Pasteur, Strasbourg.
Watkins, D. S. (1982) 'Understanding the QR algorithm', SIAM Rev., 24, 427-40.
Watkins, D. S. (1984) 'Isospectral flows', SIAM Rev., 26, 379-91.
Weber, H. (1869) 'Über die Integration der partiellen Differentialgleichung', Math.
Ann., 1, 1-36.
Weinstein, D. H. (1934) 'Modified Ritz methods', Proc. Nat. Acad. Sci. USA, 20,
529-32.
Weyl, H. (1911) 'Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller
Differentialgleichungen', Math. Ann., 71, 441-79.
Wilkinson, J. H. (1963) Rounding Errors in Algebraic Processes, Prentice-Hall, Englewood
Cliffs, N.J.
Wilkinson, J. H. (1965) The Algebraic Eigenvalue Problem, Oxford University Press
(Clarendon), London and New York.
Wilkinson, J. H. and Reinsch, C. (1971) Handbook for Automatic Computation. Linear
Algebra, Vol. 2, Springer-Verlag, Berlin and New York.
Wrigley, H. E. (1963) 'Accelerating the Jacobi method for solving simultaneous equations
by Chebyshev extrapolation when the eigenvalues of the iteration matrix are complex',
Comp. J., 6, 169-76.
Index