
# PREFACE

"Mathematics," wrote Alfred North Whitehead, "is the most powerful technique for the understanding of pattern and for the analysis of the relations of patterns." In its pursuit of pattern, however, mathematics itself exhibits pattern; the mathematics on the printed page often has visual appeal. Spatial arrangements embodied in formulae can be a source of mathematical inspiration and aesthetic delight.

The theory of matrices exhibits much that is visually attractive. Thus, diagonal matrices, symmetric matrices, (0, 1) matrices, and the like are attractive independently of their applications. In the same category are the circulants. A circulant matrix is one in which a basic row of numbers is repeated again and again, but with a shift in position. Circulant matrices have many connections to problems in physics, to image processing, to probability and statistics, to numerical analysis, to number theory, to geometry. The built-in periodicity means that circulants tie in with Fourier analysis and group theory.

A different reason may be advanced for the study of circulants. The theory of circulants is a relatively easy one. Practically every matrix-theoretic question for circulants may be resolved in "closed form." Thus the circulants constitute a nontrivial but simple set of objects that the reader may use to practice, and ultimately deepen, a knowledge of matrix theory.

Writers on matrix theory appear to have given circulants short shrift, so that the basic facts are rediscovered over and over again. This book is intended to serve as a general reference on circulants as well as to provide alternate or supplemental material for intermediate courses in matrix theory. The reader will need to be familiar with the geometry of the complex plane and with the elementary portions of matrix theory up through unitary matrices and the diagonalization of Hermitian matrices. In a few places the Jordan form is used.

This work contains some general discussion of matrices (block matrices, Kronecker products, the UDV theorem, generalized inverses). These topics have been included because of their application to circulants and because they are not always available in general books on linear algebra and matrix theory. More than 200 problems of varying difficulty have been included.

It would have been possible to develop the theory of circulants and their generalizations from the point of view of finite abelian groups and group matrices. However, my interest in the subject has a strong numerical and geometric base, which pointed me in the direction taken. The interested reader will find references to these algebraic matters.

Closely related to circulants are the Toeplitz matrices. This theory and its applications constitute a world of its own, and a few references will have to suffice. The bibliography also contains references to applications of circulants in physics and to the solution of differential equations.

I acknowledge the help and advice received from Professor Emilie V. Haynsworth. At every turn she has provided me with information, elegant proofs, and encouragement.

I have profited from numerous discussions with Professors J. H. Ahlberg and Igor Najfeld and should like to thank them for their interest in this essay. Philip R. Thrift suggested some important changes. Thanks are also due to Gary Rosen for the Calcomp plots of the iterated n-gons and to Eleanor Addison for the figures. Katrina Avery, Frances Beagan, Ezoura Fonseca, and Frances Gajdowski have helped me enormously in the preparation of the manuscript, and I wish to thank them for this work, as well as for other help rendered in the past.

The Canadian Journal of Mathematics has allowed me to reprint portions of an article of mine and I would like to acknowledge this courtesy.

Finally, I would like to thank Beatrice Shube for inviting me to join her distinguished roster of scientific authors and the staff of John Wiley and Sons for their efficient and skillful handling of the manuscript.
Philip J. Davis
Providence, Rhode Island
April, 1979
CONTENTS
Notation
Chapter 1 An Introductory Geometrical Application
1.1 Nested triangles
1.2 The transformation $\sigma$
1.3 The transformation $\sigma$, iterated with different values of s
1.4 Nested polygons
Chapter 2 Introductory Matrix Material
2.1 Block operations
2.2 Direct sums
2.3 Kronecker product
2.4 Permutation matrices
2.5 The Fourier matrix
2.6 Hadamard matrices
2.7 Trace
2.8 Generalized inverse
2.9 Normal matrices, quadratic forms, and field of values
Chapter 3 Circulant Matrices
3.1 Introductory properties
3.2 Diagonalization of circulants
3.3 Multiplication and inversion of circulants
3.4 Additional properties of circulants
3.5 Circulant transforms
3.6 Convergence questions
Chapter 4 Some Geometric Applications of Circulants
4.1 Circulant quadratic forms arising in geometry
4.2 The isoperimetric inequality for isosceles polygons
4.3 Quadratic forms under side conditions
4.4 Nested n-gons
4.5 Smoothing and variation reduction
4.6 Applications to elementary plane geometry: n-gons and $K_r$-grams
4.7 The special case: circ(s, t, 0, 0, ..., 0)
4.8 Elementary geometry and the Moore-Penrose inverse
Chapter 5 Generalizations of Circulants: g-Circulants and Block Circulants
5.1 g-circulants
5.2 0-circulants
5.3 PD-matrices
5.4 An equivalence relation on {1, 2, ..., n}
5.5 Jordanization of g-circulants
5.6 Block circulants
5.7 Matrices with circulant blocks
5.8 Block circulants with circulant blocks
5.9 Further generalizations
Chapter 6 Centralizers and Circulants
6.1 The leitmotiv
6.2 Systems of linear matrix equations. The centralizer
6.3 t algebras
6.4 Some classes $Z(P_\sigma, P_\tau)$
6.5 Circulants and their generalizations
6.6 The centralizer of J; magic squares
6.7 Kronecker products of I, $\pi$, and J
6.8 Best approximation by elements of centralizers
Appendix
Bibliography
Index of Authors
Index of Subjects
NOTATION
$C$   the complex number field
$C_{m \times n}$   the set of $m \times n$ matrices whose elements are in $C$
$A^T$   transpose of A
$\bar{A}$   conjugate of A
$A^*$   conjugate transpose of A
$A \otimes B$   direct (Kronecker) product of A and B
$A \circ B$   Hadamard (element by element) product of A and B
$A^{\div}$   Moore-Penrose generalized inverse of A
$r(A)$   rank of A
If A is square,
$\det(A)$   determinant of A
$\mathrm{tr}(A)$   trace of A
$\lambda(A)$   eigenvalues of A: individually or as a set
$A^{-1}$   inverse of A
$\rho(A)$   spectral radius of A
1
AN INTRODUCTORY GEOMETRICAL APPLICATION
1.1 NESTED TRIANGLES
(4) Given a $T_2$, there is a unique triangle $T_1$ whose midpoint triangle it is.
(5) The area of $T_2$ is minimum among all triangles $T$ that are inscribed in $T_1$ and whose vertices divide the sides of $T_1$ in a fixed ratio, cyclically.
(6) If the midpoint triangle of $T_2$ is $T_3$, and successively for $T_4$, $T_5$, ..., this nested set of triangles converges to the center of gravity of $T_1$ with geometric rapidity. [By the center of gravity (c.g.) of a triangle whose vertices have rectangular coordinates $(x_i, y_i)$, $i = 1, 2, 3$, is meant the point $\frac{1}{3}(x_1 + x_2 + x_3,\ y_1 + y_2 + y_3)$.]
Figure 1.1.2
PROBLEMS
1. Prove that the triangles $T_n$ are all similar.
2. Prove that the medians of $T_n$, $n = 2, 3, \ldots$, lie along the medians of $T_1$.
3. Prove that the c.g. of $T_n$, $n = 2, 3, \ldots$, coincides with the c.g. of $T_1$.
4. Prove that area $T_{n+1}$ = 1/4 area $T_n$.
5. Prove that the perimeter of $T_{n+1}$ = 1/2 perimeter of $T_n$.
6. Conclude, on this basis, that $T_n$ converges to c.g. $T_1$ (Figure 1.1.2).
7. Describe the situation when $T_1$ is a right triangle; when $T_1$ is equilateral.
8. Given a triangle $T_1$, construct a triangle $T_0$ such that $T_1$ is its midpoint triangle.
9. The midpoint triangle of $T_1$ divides $T_1$ into four subtriangles. Suppose that $T_2$ designates one of these, selected arbitrarily. Now let $T_n$ designate the sequence of triangles that result from an iteration of this process. Prove that $T_n$ converges to a point. Prove that every point inside $T_1$ and on its sides is the limit of an appropriate sequence $T_n$.
10. Systematize, in some way, the selection process in Problem 9.
11. If two triangles have the same area and the same perimeter are they necessarily congruent?
12. Let P be an arbitrary point lying in the triangle $T_1 = \Delta A_1B_1C_1$. Let $T_2 = \sigma(T_1)$ be determined as in Figure 1.1.3. Determine the rate at which $\sigma^n(T_1)$ converges to P.
Figure 1.1.3
1.2 THE TRANSFORMATION $\sigma$
As a first generalization, consider the following transformation $\sigma$ of the triangle $T_1$. Select a nonnegative number $s$: $0 < s < 1$, and set
(1.2.1) $s + t = 1$.
Let $A_2$, $B_2$, $C_2$ be the points on the sides of the triangle $T_1$ such that
In this equation $A_1A_2$ designates the length of the line segment from $A_1$ to $A_2$, and so on. Thus the points $A_2$, $B_2$, $C_2$ divide the sides of $T_1$ into the ratio $s/t$, working consistently in a counterclockwise fashion. (See Figure 1.2.1.)
(1.2.3) $T_2 = \Delta A_2B_2C_2 = \sigma(T_1)$
and in general
(1.2.4) $T_{n+1} = \sigma(T_n) = \sigma^n(T_1)$, $n = 1, 2, 3, \ldots$.
Figure 1.2.2 illustrates the sequence $T_n$ for $s = 1/4$, $t = 3/4$.
Figure 1.2.2
The transformation $\sigma$ depends, of course, on the parameter $s$, and we shall write $\sigma_s$ when it is necessary to distinguish the parameter.
To analyze this situation, one might work with vectors, but it is particularly convenient in the case of plane figures to place the triangle $T_1$ in the complex plane. We write $z = x + iy$, $\bar{z} = x - iy$, $i = \sqrt{-1}$, and designate the coordinates of $T_n$ systematically by $z_{1,n}$, $z_{2,n}$, $z_{3,n}$. Write, for simplicity, $z_{1,1} = z_1$, $z_{2,1} = z_2$, $z_{3,1} = z_3$. The transformation $\sigma$ operating successively on $T_1$, $T_2$, ..., is therefore given by
so that
(1.2.14) $\lim_{n \to \infty} z_{i,n} = 0$ for $i = 1, 2, 3$.
We have therefore proved the following theorem.
Theorem 1.2.1. Let $0 < s < 1$ be fixed and let $T_n$ be the sequence of nested triangles given by $T_n = \sigma^n(T_1)$, $n = 1, 2, \ldots$. Then $T_n$ converges to c.g.$(T_1)$.
The function $V(T)$ is a simple example of a Lyapunov function for a system of difference equations. The c.g. is known as the limit set of the process.
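The convergence asserted in Theorem 1.2.1 can be observed numerically. The sketch below is illustrative only: it assumes that one step of $\sigma$ sends each vertex $z_k$ to $s z_k + t z_{k+1}$ with $t = 1 - s$ (indices taken cyclically), and the function names `sigma`, `centroid`, and `iterate` are hypothetical helpers, not notation from the text.

```python
def sigma(z, s):
    """One step of the transformation, under the assumed update rule:
    vertex k moves to s*z[k] + t*z[k+1], with t = 1 - s."""
    t = 1 - s
    n = len(z)
    return [s * z[k] + t * z[(k + 1) % n] for k in range(n)]


def centroid(z):
    """Center of gravity of the vertices."""
    return sum(z) / len(z)


def iterate(z, s, steps):
    """Apply sigma repeatedly with a fixed parameter s."""
    for _ in range(steps):
        z = sigma(z, s)
    return z


T1 = [0 + 0j, 4 + 0j, 1 + 3j]          # an arbitrary starting triangle
cg = centroid(T1)
Tn = iterate(T1, 0.25, 200)
spread = max(abs(v - cg) for v in Tn)  # farthest vertex from the c.g.
print("c.g. of T1:", cg)
print("largest distance from c.g. after 200 steps:", spread)
```

Because $s + t = 1$, every step leaves the center of gravity unchanged, so the shrinking triangles can only collapse onto that point.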
It is also of interest to see how the area of $T_1$ changes under $\sigma$. Designate the area by $\mu(T_1)$. Assuming, as we have, that $z_1 = x_1 + iy_1$, $z_2 = x_2 + iy_2$, $z_3 = x_3 + iy_3$ are the vertices of $T_1$ in counterclockwise order, we have
Theorem 1.2.2. $\min_{0 \le s \le 1} \mu(\sigma(T_1))$ occurs uniquely when $s = 1/2$ and equals $(1/4)\mu(T_1)$.
Proof. The minimum value of $g(s) = 1 - 3s + 3s^2$ occurs uniquely when $s = 1/2$ and equals 1/4.
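The role of $g(s) = 1 - 3s + 3s^2$ as an area ratio can be checked directly. The following sketch assumes the same update rule as before ($z_k \mapsto s z_k + t z_{k+1}$, $t = 1 - s$) and computes areas with the shoelace formula written in complex form; the helper names are hypothetical.

```python
def sigma(z, s):
    """Assumed update rule: vertex k moves to s*z[k] + (1 - s)*z[k+1]."""
    n = len(z)
    return [s * z[k] + (1 - s) * z[(k + 1) % n] for k in range(n)]


def area(z):
    """Signed area (positive for counterclockwise vertices):
    one half of the imaginary part of sum conj(z_k) * z_{k+1}."""
    n = len(z)
    return 0.5 * sum((z[k].conjugate() * z[(k + 1) % n]).imag for k in range(n))


def g(s):
    return 1 - 3 * s + 3 * s * s


T1 = [0 + 0j, 4 + 0j, 1 + 3j]
ratios = {s: area(sigma(T1, s)) / area(T1) for s in (0.1, 0.25, 0.5, 0.9)}
for s, ratio in ratios.items():
    print(s, ratio, g(s))   # the last two columns should agree
```

For a triangle the ratio does not depend on the shape of $T_1$, which is why minimizing the area reduces to minimizing $g$.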
PROBLEMS
1. Interpret the transformation $\sigma$ geometrically when $s$ is real but does not satisfy $0 < s < 1$. What does $\sigma$ do when $s = 1$?
2. Interpret the transformation $\sigma$ geometrically when $s$ and $t$ are complex.
3. In this case, find a formula for $V(\sigma(T_1))$.
4. Let $V(T_1)$ designate the polar moment of inertia of $T_1$ about its center of gravity, regarding $T_1$ as a lamina of unit density. Prove that
5. Let $\sigma(T_1)$ have vertices $A_2$, $B_2$, $C_2$. Then the lines $A_1B_2$, $B_1C_2$, $C_1A_2$ are concurrent if and only if $s = t = 1/2$. (Use Ceva's theorem.)
6. Let $T$ be an equilateral triangle. Then for any $s$, $\sigma_s(T)$ is equilateral. Interpret this as an eigenvalue property of
$$\begin{pmatrix} s & t & 0 \\ 0 & s & t \\ t & 0 & s \end{pmatrix}.$$
Thus the equilateral triangles are "eigenfigures" of $\sigma$. Generalize. Hint: Let the vertices of $T$ in counterclockwise order be $z_1$, $z_2$, $z_3$. Then $T$ is equilateral if and only if $z_1 + \omega z_2 + \omega^2 z_3 = 0$, where $\omega = \exp(2\pi i/3)$.
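1.3 THE TRANSFORMATION $\sigma$, ITERATED WITH DIFFERENT VALUES OF s
As observed, the transformation $\sigma$ depends on the selection of the parameter $s$. Let us indicate this by writing $\sigma_s$. Begin with the triangle $T_1$ and form
$$T_2 = \sigma_{s_1}(T_1).$$
Now iterate this, using different values of the parameter $s$. We obtain
$$T_3 = \sigma_{s_2}(T_2) = \sigma_{s_2}\sigma_{s_1}(T_1),$$
so that, in general,
$$T_{n+1} = \sigma_{s_n}\sigma_{s_{n-1}} \cdots \sigma_{s_1}(T_1).$$
We then have from (1.2.10)
$$V(T_{n+1}) = \Big(\prod_{k=1}^{n} g(s_k)\Big) V(T_1).$$
Whether or not $V(T_n)$ converges to 0 depends on the behavior of the infinite product $\prod_{k=1}^{\infty} g(s_k) = \prod_{k=1}^{\infty} (1 - 3s_k + 3s_k^2)$.
Let $p_k = 3s_k - 3s_k^2 = 3s_k(1 - s_k)$. Then $\prod_{k=1}^{\infty} g(s_k) = \prod_{k=1}^{\infty} (1 - p_k)$. Assuming that $0 < s_k < 1$, we have $0 < p_k \le 3/4$. As is well known, if $\sum_{k=1}^{\infty} p_k < \infty$, then $\lim_{n \to \infty} \prod_{k=1}^{n} (1 - p_k)$ exists and is not zero. On the other hand, if $\sum_{k=1}^{\infty} p_k = \infty$, then $\lim_{n \to \infty} \prod_{k=1}^{n} (1 - p_k) = 0$. (See, e.g., Knopp, 1928, pp. 219-221.) Thus we must investigate the convergence of $\sum_{k=1}^{\infty} s_k(1 - s_k)$. To this end, for $0 < s < 1$, introduce
$$s^* = \min(s, 1 - s).$$
Lemma. $\sum_{k=1}^{\infty} (s_k - s_k^2) < \infty$ if and only if $\sum_{k=1}^{\infty} s_k^* < \infty$.
Proof. $s_k(1 - s_k)$ is at most $s_k$ and at most $1 - s_k$, so that
$$0 < s_k - s_k^2 = s_k(1 - s_k) \le \min(s_k, 1 - s_k) = s_k^*.$$
Hence $\sum_{k=1}^{\infty} s_k^* < \infty$ implies $\sum_{k=1}^{\infty} (s_k - s_k^2) < \infty$. On the other hand, since $\max(s_k, 1 - s_k) \ge 1/2$,
$$s_k - s_k^2 = s_k^* \max(s_k, 1 - s_k) \ge \tfrac{1}{2} s_k^*.$$
Hence $\sum_{k=1}^{\infty} s_k^* = \infty$ implies $\sum_{k=1}^{\infty} (s_k - s_k^2) = \infty$.
This leads to
Theorem 1.3.1
(a) If $\sum_{k=1}^{\infty} s_k^* = \infty$, then $\lim_{n \to \infty} V(T_n) = 0$.
(b) If $\sum_{k=1}^{\infty} s_k^* < \infty$, then $\lim_{n \to \infty} V(T_n) > 0$.
In case (a), as before, $\lim_{n \to \infty} T_n$ = c.g.$(T_1)$. In case (b), one conjectures that $\{T_n\}$ approach a nontrivial limiting triangle $T_\infty$ (see Figure 1.3.1). We shall return to this point in Section 3.6 for a more complete analysis.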
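The dichotomy between the two cases shows up clearly in the partial products of $g(s_k)$. The sketch below compares a summable choice, $s_k = 1/(k+1)^2$, with the constant choice $s_k = 1/2$; the function names are hypothetical and the sequences are merely convenient examples.

```python
def g(s):
    return 1 - 3 * s + 3 * s * s


def partial_product(s_of_k, n):
    """Product of g(s_k) for k = 1, ..., n."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= g(s_of_k(k))
    return prod


def summable(k):
    """s_k = 1/(k+1)^2: the series of s_k* converges."""
    return 1.0 / (k + 1) ** 2


def constant_half(k):
    """s_k = 1/2: the series of s_k* diverges."""
    return 0.5


p1000 = partial_product(summable, 1000)
p2000 = partial_product(summable, 2000)
q50 = partial_product(constant_half, 50)
print("summable case, n = 1000 and 2000:", p1000, p2000)
print("constant case, n = 50:", q50)
```

In the summable case the partial products settle at a positive value, while in the constant case they decay like $(1/4)^n$.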
PROBLEMS
1. Let $s_k = 1/(k + 1)^2$, $k = 1, 2, \ldots$. Compute, approximately, $\lim_{k \to \infty} \mu(T_k)/\mu(T_1)$.
2. Do the same with $s_k = \exp(-pk)$, $p > 0$, $k = 1, 2, \ldots$.
Figure 1.3.1
1.4 NESTED POLYGONS
We pass now from triangles to polygons. Let $z_1$, $z_2$, ..., $z_p$ be ordered vertices of a polygon $P$ (assumed to be located in the complex plane). We make no restrictions on the complex numbers $z_k$, so that $P$ may be convex or nonconvex, simply covered or not; furthermore, the points $z_k$ are not necessarily distinct so that the polygon may have "multiple vertices." All geometric constructions described below are to be interpreted appropriately with this in mind. We shall also call such a figure a p-gon. We shall assume, however, that the center of gravity of $P$, $\frac{1}{p}(z_1 + \cdots + z_p)$, is at the origin. This means that
$$z_1 + z_2 + \cdots + z_p = 0.$$
Each side of $P$ is now divided in length into the ratio $s/t$, $0 < s < 1$, $t = 1 - s$, proceeding cyclically counterclockwise. The points of division form the vertices of a new polygon $\sigma(P)$. (See Figure 1.4.1.) We wish to discuss what happens when this transformation is iterated.
Figure 1.4.1
Let $P_n = \sigma^n(P_1)$, let the vertices of $P_n$ have the coordinates $z_{1,n}$, $z_{2,n}$, ..., $z_{p,n}$, and for simplicity write $z_{1,1} = z_1$, ..., $z_{p,1} = z_p$. The transformation $\sigma$ may obviously be written in matrix form as
If one writes
$$Z_n = (z_{1,n}, z_{2,n}, \ldots, z_{p,n})^T$$
and abbreviates the $p \times p$ matrix in the right hand of (1.4.2) by $G$, then
$$Z_{n+1} = GZ_n; \quad Z_1 = \text{a given initial vector}.$$
This is a linear autonomous system of difference equations, that is, $G$ is independent of $n$. The solution of this iteration is
$$Z_{n+1} = G^n Z_1.$$
Thus the limiting behavior of $P_n$ (i.e., $Z_n$) as $n \to \infty$ depends substantially on the behavior of $G^n$ as $n \to \infty$.
The matrix $G$ is a circulant matrix; that is, in each successive row the elements move to the right one position (with wraparound at the edges). It is also true that the matrix $G$ is a nonnegative, doubly stochastic, irreducible, and normal matrix. In this essay we emphasize the circulant aspect of $G$. We postpone further discussion of the p-gon problem until we have somewhat developed the theory of circulants.
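The properties claimed for $G$ are easy to test on a small case. The sketch below assumes that $G$ has first row $(s, t, 0, \ldots, 0)$ with $t = 1 - s$, each later row being the previous one shifted right with wraparound; that specific form, and the helper names `circ`, `matmul`, `transpose`, and `apply`, are assumptions made for illustration.

```python
def circ(first_row):
    """Circulant matrix: row i is first_row shifted right by i places."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def apply(G, z):
    """Matrix-vector product G z."""
    return [sum(g * v for g, v in zip(row, z)) for row in G]


p, s = 5, 0.5
G = circ([s, 1 - s] + [0.0] * (p - 2))      # assumed form of G

row_sums = [sum(row) for row in G]          # doubly stochastic: all sums are 1
col_sums = [sum(col) for col in zip(*G)]
GGt = matmul(G, transpose(G))               # normal: G G^T equals G^T G
GtG = matmul(transpose(G), G)

Z = [2 + 1j, -1 + 3j, -3 - 1j, 0 - 2j, 2 - 1j]   # vertices sum to zero
for _ in range(300):
    Z = apply(G, Z)
radius = max(abs(v) for v in Z)             # the p-gon shrinks to the origin
print(row_sums, col_sums, radius)
```

With the center of gravity at the origin, the iterates $Z_{n+1} = GZ_n$ contract toward 0, in line with the triangle case.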
PROBLEMS
1. Let $G = (g_{ij})$ be a $p \times p$ matrix. Let the p-gon $Z_1$ be transformed into the p-gon $Z_2$ linearly by means of $Z_2 = GZ_1$. What are necessary and sufficient conditions on $G$ that it preserve centers of gravity? Express as an eigenvalue-vector condition.
2. Let $G$ (as in Problem 1) satisfy $G^k = I$ for some positive integer $k$. Describe the geometric situation upon iteration.
3. Suppose that $Z_0$ is given and that
$$Z_{3n+1} = G_3 Z_{3n}, \quad Z_{3n+2} = G_1 Z_{3n+1}, \quad Z_{3n+3} = G_2 Z_{3n+2},$$
for $n = 0, 1, \ldots$. Find a formula for $Z_n$.
4. Generalize this section to space p-gons (in three dimensions).
5. Develop analytical apparatus for generalizing this section to nested polyhedra. In particular, let $T_1$ be a tetrahedron. Let $T_2$ be the tetrahedron whose vertices are the c.g.'s of the faces of $T_1$. Iterate this.
REFERENCES
Convergence of nested polygons: Berlekamp et al.; Rosenman; Huston; Schoenberg [1].
p-gons in a general setting: Bachmann and Schmidt; Davis [1], [2].
Liapunov functions, limit sets: LaSalle.
2
INTRODUCTORY MATRIX MATERIAL
2.1 BLOCK OPERATIONS
It is very often convenient in both theoretical and computer work to partition a matrix into submatrices. This can be done in numerous ways as suggested by this example:
Each submatrix or block can be labeled by subscripts, and we can display the original matrix with submatrices or blocks for its elements. The general form of a partitioned matrix therefore is
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1\ell} \\ A_{21} & A_{22} & \cdots & A_{2\ell} \\ \vdots & \vdots & & \vdots \\ A_{k1} & A_{k2} & \cdots & A_{k\ell} \end{pmatrix}.$$
Dotted lines, bars, commas are all used in an obvious way to indicate partitions. The size of the blocks must be such that they all fit together properly. This means that the number of rows in each $A_{ij}$ must be the same for each $i$ and the number of columns must be the same for each $j$. The size of $A_{ij}$ is therefore $m_i \times n_j$ for certain integers $m_i$ and $n_j$. We indicate this by writing
(2.1.1) $A = (A_{ij})$, where the $i$th row of blocks has $m_i$ rows, $i = 1, \ldots, k$, and the $j$th column of blocks has $n_j$ columns, $j = 1, \ldots, \ell$.
A square matrix $A$ of order $n$ is often partitioned symmetrically. Suppose that $n = n_1 + n_2 + \cdots + n_r$ with $n_i \ge 1$. Partition $A$ as
(2.1.2) $$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ \vdots & \vdots & & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rr} \end{pmatrix}$$
where size $A_{ij} = n_i \times n_j$. The diagonal blocks $A_{ii}$ are square matrices of order $n_i$.
Example. X X X
X X X
X X X
X X X
X X X X X X
X X X
t X X X
is a symmetric partition of a $6 \times 6$ matrix.
Square matrices are often built up, or compounded, of square blocks all of the same size.
$$\left(\begin{array}{ccc|ccc} x & x & x & x & x & x \\ x & x & x & x & x & x \\ x & x & x & x & x & x \\ \hline x & x & x & x & x & x \\ x & x & x & x & x & x \\ x & x & x & x & x & x \end{array}\right)$$
If a square matrix $A$ of order $nk$ is composed of $n \times n$ square submatrices all of order $k$, it is termed an $(n, k)$ matrix. Thus the matrix depicted above is a (2, 3) matrix.
Subject to certain conformability conditions on the blocks, the operations of scalar product, transpose, conjugation, addition, and multiplication are carried out in the same way when expressed in block notation as when they are expressed in element notation. This means
(2.1.3) $c(A_{ij}) = (cA_{ij})$,
(2.1.4) $(A_{ij})^T = (A_{ji}^T)$,
(2.1.5) $(A_{ij})^* = (A_{ji}^*)$,
(2.1.6) $(A_{ij}) + (B_{ij}) = (A_{ij} + B_{ij})$,
(2.1.7) $(A_{ij})(B_{ij}) = (C_{ij})$,
where $C_{ij} = \sum_{r=1}^{\ell} A_{ir}B_{rj}$. Here $T$ designates the transpose and $*$ the conjugate transpose.
In (2.1.6) the size of each $A_{ij}$ must be the size of the corresponding $B_{ij}$.
In (2.1.7), designate the size of $A_{ij}$ by $\alpha_i \times \beta_j$ and the size of $B_{ij}$ by $\gamma_i \times \delta_j$. Then, if $\beta_r = \gamma_r$ for $1 \le r \le \ell$, the product $A_{ir}B_{rj}$ can be formed and produces an $\alpha_i \times \delta_j$ matrix, independently of $r$. The sum can then be found as indicated and the $C_{ij}$ are $\alpha_i \times \delta_j$ matrices and together constitute a partition. Note that the rule for forming the blocks $C_{ij}$ of the matrix product is the same as when $A_{ij}$ and $B_{ij}$ are single numbers.
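The block multiplication rule $C_{ij} = \sum_r A_{ir}B_{rj}$ can be tried out on a small conformable partition. The sketch below is a plain-Python illustration; `split`, `assemble`, and `block_product` are hypothetical helper names, and the matrices and partition sizes are arbitrary choices.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def split(M, row_sizes, col_sizes):
    """Partition M into a grid of blocks with the given row and column sizes."""
    blocks, r0 = [], 0
    for m in row_sizes:
        row_blocks, c0 = [], 0
        for n in col_sizes:
            row_blocks.append([row[c0:c0 + n] for row in M[r0:r0 + m]])
            c0 += n
        blocks.append(row_blocks)
        r0 += m
    return blocks


def assemble(blocks):
    """Glue a grid of blocks back into one matrix."""
    rows = []
    for row_blocks in blocks:
        for i in range(len(row_blocks[0])):
            rows.append([x for blk in row_blocks for x in blk[i]])
    return rows


def block_product(Ab, Bb):
    """Blocks C_ij = sum over r of A_ir B_rj."""
    result = []
    for i in range(len(Ab)):
        row = []
        for j in range(len(Bb[0])):
            C = matmul(Ab[i][0], Bb[0][j])
            for r in range(1, len(Bb)):
                C = matadd(C, matmul(Ab[i][r], Bb[r][j]))
            row.append(C)
        result.append(row)
    return result


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # 3 x 4
B = [[1, 0, 2], [0, 1, 1], [3, 1, 0], [2, 2, 1]]     # 4 x 3
Ab = split(A, [1, 2], [3, 1])    # column sizes of A's blocks: 3, 1
Bb = split(B, [3, 1], [2, 1])    # row sizes of B's blocks must match: 3, 1
C = assemble(block_product(Ab, Bb))
print(C)
```

The column sizes chosen for the blocks of $A$ must equal the row sizes chosen for the blocks of $B$; this is the conformability condition $\beta_r = \gamma_r$.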
Example. If A and B are n x n matrices and if
then
2.2 DIRECT SUMS
PROBLEMS
1. Let $A = A_1 \oplus A_2 \oplus \cdots \oplus A_k$. Prove that $\det A = \prod_{i=1}^{k} \det A_i$ and that for integer $p$, $A^p = A_1^p \oplus A_2^p \oplus \cdots \oplus A_k^p$.
2. Give a linear algebra interpretation of the direct sum along the following lines. Let $V$ be a finite-dimensional vector space and let $L$ and $M$ be subspaces. Write $V = L \oplus M$ if and only if every vector $x \in V$ can be written uniquely in the form $x = y + z$ with $y \in L$, $z \in M$. Show that $V = L \oplus M$ if and only if
(a) $\dim V = \dim L + \dim M$, $L \cap M = \{0\}$.
(b) if $\{x_1, \ldots, x_\ell\}$ and $\{y_1, \ldots, y_m\}$ are bases for $L$ and $M$, then $\{x_1, \ldots, x_\ell, y_1, \ldots, y_m\}$ is a basis for $V$.
3. The fundamental theorem of rank-canonical form for square matrices tells us that if $A$ is an $n \times n$ matrix of rank $r$, then there exist nonsingular matrices $P$, $Q$ such that $PAQ = I_r \oplus O_{n-r}$. Verify this formulation.
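2.3 KRONECKER PRODUCT
Let $A$ and $B$ be $m \times n$ and $p \times q$ respectively. Then the Kronecker product (or tensor, or direct product) of $A$ and $B$ is that $mp \times nq$ matrix defined by
$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{pmatrix}.$$
Important properties of the Kronecker product are as follows (indicated operations are assumed to be defined):
(1) $(\alpha A) \otimes B = A \otimes (\alpha B) = \alpha(A \otimes B)$; $\alpha$ scalar.
(2) $(A + B) \otimes C = (A \otimes C) + (B \otimes C)$.
(3) $A \otimes (B + C) = (A \otimes B) + (A \otimes C)$.
(4) $A \otimes (B \otimes C) = (A \otimes B) \otimes C$.
(5) $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$.
(6) $\overline{A \otimes B} = \bar{A} \otimes \bar{B}$.
(7) $(A \otimes B)^T = A^T \otimes B^T$; $(A \otimes B)^* = A^* \otimes B^*$.
We now assume that $A$ and $B$ are square and of orders $m$ and $n$. Then
(9) $\mathrm{tr}(A \otimes B) = (\mathrm{tr}(A))(\mathrm{tr}(B))$.
(10) If $A$ and $B$ are nonsingular, so is $A \otimes B$ and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.
(11) $\det(A \otimes B) = (\det A)^n (\det B)^m$.
(12) There exists a permutation matrix $P$ (see Section 2.4) depending only on $m$, $n$, such that $B \otimes A = P^*(A \otimes B)P$.
(13) Let $p(x, y)$ designate the polynomial
$$p(x, y) = \sum_{j,k} a_{jk} x^j y^k.$$
Let $p(A; B)$ designate the $mn \times mn$ matrix
$$p(A; B) = \sum_{j,k} a_{jk} A^j \otimes B^k.$$
Then the eigenvalues of $p(A; B)$ are $p(\lambda_r, \mu_s)$, $r = 1, 2, \ldots, m$, $s = 1, 2, \ldots, n$, where $\lambda_r$ and $\mu_s$ are the eigenvalues of $A$ and $B$ respectively. In particular, the eigenvalues of $A \otimes B$ are $\lambda_r \mu_s$, $r = 1, 2, \ldots, m$, $s = 1, 2, \ldots, n$.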
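Several of the listed properties can be spot-checked with exact integer arithmetic. The sketch below assumes the convention that $A \otimes B$ is the block matrix with blocks $a_{ij}B$; the helper names and the sample matrices are illustrative choices, and the determinant is computed by cofactor expansion, which is adequate only for small orders.

```python
def kron(A, B):
    """Kronecker product under the assumed convention: blocks a_ij * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def det(A):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))


A = [[1, 2], [3, 4]]                      # order m = 2
B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]     # order n = 3
C = [[0, 1], [1, 1]]
D = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]

mixed_lhs = matmul(kron(A, B), kron(C, D))      # property (5), left side
mixed_rhs = kron(matmul(A, C), matmul(B, D))    # property (5), right side
trace_pair = (trace(kron(A, B)), trace(A) * trace(B))          # property (9)
det_pair = (det(kron(A, B)), det(A) ** 3 * det(B) ** 2)        # property (11)
print(mixed_lhs == mixed_rhs, trace_pair, det_pair)
```

Note the crossing of exponents in property (11): the determinant of the order-$m$ factor is raised to the power $n$, and conversely.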
PROBLEMS
1. Show that $I_m \otimes I_n = I_{mn}$.
2. Describe the matrices $I \otimes A$, $A \otimes I$.
3. If $A$ is $m \times m$ and $B$ is $n \times n$, then $A \otimes B = (A \otimes I_n)(I_m \otimes B) = (I_m \otimes B)(A \otimes I_n)$.
4. If $A$ and $B$ are upper (or lower) triangular, then so is $A \otimes B$.
5. If $A \otimes B \ne 0$ is diagonal, so are $A$ and $B$.
6. Let $A$ and $B$ have orders $m$, $n$ respectively. Show that the matrix $(I_m \otimes B) + (A \otimes I_n)$ has the eigenvalues $\lambda_r + \mu_s$, $r = 1, 2, \ldots, m$, $s = 1, 2, \ldots, n$, where $\lambda_r$ and $\mu_s$ are the eigenvalues of $A$ and $B$. This matrix is often called the Kronecker sum of $A$ and $B$.
7. Let $A$ and $B$ be of orders $m$ and $n$. If $A$ and $B$ both are (1) normal, (2) Hermitian, (3) positive definite, (4) positive semidefinite, and (5) unitary, then $A \otimes B$ has the corresponding property. See Section 2.9.
8. Kronecker powers: Let $A^{[2]} = A \otimes A$ and, in general, $A^{[k+1]} = A \otimes A^{[k]}$. Prove that $A^{[k+\ell]} = A^{[k]} \otimes A^{[\ell]}$.
9. Prove that $(AB)^{[k]} = A^{[k]} B^{[k]}$.
10. Let $Ax = \lambda x$ and $By = \mu y$, $x = (x_1, \ldots, x_m)^T$. Define $z$ by $z^T = [x_1 y^T, x_2 y^T, \ldots, x_m y^T]$. Prove that $(A \otimes B)z = \lambda\mu z$.
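2.4 PERMUTATION MATRICES
By a permutation $\sigma$ of the set $N = \{1, 2, \ldots, n\}$ is meant a one-to-one mapping of $N$ onto itself. Including the identity permutation there are $n!$ distinct permutations of $N$. One can indicate a typical permutation by
(2.4.1) $\sigma(1) = i_1$, $\sigma(2) = i_2$, ..., $\sigma(n) = i_n$
which is often written as
(2.4.2) $$\sigma: \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}.$$
The inverse permutation is designated by $\sigma^{-1}$. Thus $\sigma^{-1}(i_k) = k$.
Let $E_j$ designate the unit (row) vector of $n$ components which has a 1 in the $j$th position and 0's elsewhere:
(2.4.3) $E_j = (0, \ldots, 0, 1, 0, \ldots, 0)$.
By a permutation matrix of order $n$ is meant a matrix of the form
(2.4.4) $P = P_\sigma = (a_{ij})$, where $a_{i,\sigma(i)} = 1$, $i = 1, 2, \ldots, n$, and $a_{ij} = 0$ otherwise.
The $i$th row of $P$ has a 1 in the $\sigma(i)$th column and 0's elsewhere. The $j$th column of $P$ has a 1 in the $\sigma^{-1}(j)$th row and 0's elsewhere. Thus each row and each column of $P$ has precisely one 1 in it.
It is easily seen that
(2.4.6) $P_\sigma (x_1, x_2, \ldots, x_n)^T = (x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)})^T$,
that is, $P_\sigma A$ is $A$ with its rows permuted by $\sigma$. Moreover,
$$(x_1, x_2, \ldots, x_n) P_\sigma = (x_{\sigma^{-1}(1)}, x_{\sigma^{-1}(2)}, \ldots, x_{\sigma^{-1}(n)}),$$
so that if $A = (a_{ij})$ is $r \times n$,
(2.4.8) $AP_\sigma = (a_{i,\sigma^{-1}(j)})$.
That is, $AP_\sigma$ is $A$ with its columns permuted by $\sigma^{-1}$. Note also that
(2.4.9) $P_\sigma P_\tau = P_{\sigma\tau}$,
where the product of the permutations $\sigma$, $\tau$ is applied from left to right. Furthermore,
$$P_\sigma^T = P_{\sigma^{-1}};$$
hence
$$P_\sigma P_\sigma^T = P_\sigma P_{\sigma^{-1}} = I.$$
Therefore
(2.4.12) $P_\sigma^* = P_\sigma^T = P_\sigma^{-1}$.
The permutation matrices are thus unitary, forming a subgroup of the unitary group.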
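The row and column rules for permutation matrices are easy to confuse, so a small check may help. The sketch below uses 0-based indices (the text counts from 1), represents $\sigma$ as a list with `sigma[i]` the image of `i`, and builds $P_\sigma$ with a 1 in position $(i, \sigma(i))$; the helper names and sample permutations are illustrative.

```python
def perm_matrix(sigma):
    """P_sigma: row i has its single 1 in column sigma[i] (0-based)."""
    n = len(sigma)
    return [[1 if j == sigma[i] else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


sigma = [2, 0, 3, 1]
tau = [1, 2, 0, 3]
P, Q = perm_matrix(sigma), perm_matrix(tau)
A = [[10 * i + j for j in range(4)] for i in range(4)]

PA = matmul(P, A)     # rows of A permuted by sigma
AP = matmul(A, P)     # columns of A permuted by the inverse of sigma
sigma_inv = [sigma.index(j) for j in range(4)]
composed = [tau[sigma[i]] for i in range(4)]   # sigma first, then tau
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

print(PA)
print(AP)
print(matmul(P, Q) == perm_matrix(composed))
print(matmul(P, transpose(P)) == identity)
```

The composition check reflects the left-to-right convention: in $P_\sigma P_\tau$ the permutation $\sigma$ acts first and $\tau$ second.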
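From (2.4.6), (2.4.8) and (2.4.12) it follows that if $A$ is $n \times n$
(2.4.13) $P_\sigma A P_\sigma^* = (a_{\sigma(i),\sigma(j)})$,
so that the similarity transformation $P_\sigma A P_\sigma^*$ causes a consistent renumbering of the rows and columns of $A$ by the permutation $\sigma$.
Among the permutation matrices, the matrix
(2.4.14) $$\pi = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
plays a fundamental role in the theory of circulants. This corresponds to the forward shift permutation $\sigma(1) = 2$, $\sigma(2) = 3$, ..., $\sigma(n-1) = n$, $\sigma(n) = 1$, that is, to the cycle $\sigma = (1, 2, 3, \ldots, n)$ generating the cyclic group of order $n$ ($\pi$ is for "push"). One has
(2.4.15) $$\pi^2 = \begin{pmatrix} 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \end{pmatrix}$$
corresponding to $\sigma^2$ for which $\sigma^2(1) = 3$, $\sigma^2(2) = 4$, ..., $\sigma^2(n) = 2$. Similarly for $\pi^k$ and $\sigma^k$. The matrix $\pi^n$ corresponds to $\sigma^n = I$, so that
(2.4.16) $\pi^n = I$.
Note also that
(2.4.17) $\pi^T = \pi^* = \pi^{-1} = \pi^{n-1}$.
A particular instance of (2.4.13) is
(2.4.18) $\pi A \pi^T = (a_{i+1,j+1})$
where $A = (a_{ij})$ and the subscripts are taken mod $n$.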
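The relations for the push matrix can be confirmed on a small case. The sketch below builds $\pi$ with 0-based indices, so that row $i$ has its 1 in column $i + 1 \pmod n$; the names `push` and `matpow` are hypothetical helpers.

```python
def push(n):
    """The matrix pi of order n: row i has a 1 in column (i + 1) mod n."""
    return [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matpow(M, k):
    """M raised to the nonnegative integer power k."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result


n = 5
pi = push(n)
identity = matpow(pi, 0)
A = [[10 * i + j for j in range(n)] for i in range(n)]

shifted = matmul(matmul(pi, A), transpose(pi))      # pi A pi^T
expected = [[A[(i + 1) % n][(j + 1) % n] for j in range(n)] for i in range(n)]

print(matpow(pi, n) == identity)                    # pi^n = I
print(transpose(pi) == matpow(pi, n - 1))           # pi^T = pi^(n-1)
print(shifted == expected)                          # entries a_(i+1, j+1)
```

The last check is the statement that conjugating by $\pi$ slides every entry of $A$ one step up and one step to the left, with wraparound.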
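Here is a second instance. Let $L = (\lambda_1, \lambda_2, \ldots, \lambda_n)^T$. Then, for any permutation matrix $P_\sigma$,
(2.4.19) $P_\sigma (\mathrm{diag}\, L) P_\sigma^* = \mathrm{diag}(P_\sigma L)$.
A second permutation matrix of importance is
(2.4.20) $$\Gamma = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 1 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \end{pmatrix}$$
which corresponds to the permutation $\sigma(1) = 1$, $\sigma(2) = n$, $\sigma(3) = n - 1$, ..., $\sigma(j) = n - j + 2$, ..., $\sigma(n) = 2$. Exhibited as a product of cycles, $\sigma = (1)(2, n)(3, n - 1), \ldots, (n, 2)$. It follows that $\sigma^2 = I$, hence that
(2.4.21) $\Gamma^2 = I$.
Also,
(2.4.22) $\Gamma^* = \Gamma^T = \Gamma = \Gamma^{-1}$.
Again, as an instance of (2.4.13),
(2.4.23) $\Gamma (\mathrm{diag}\, L) \Gamma = \mathrm{diag}(\Gamma L)$.
Finally, we cite the counteridentity $K$, which has 1's on the main counterdiagonal and 0's elsewhere:
(2.4.24) $$K = \begin{pmatrix} 0 & \cdots & 0 & 1 \\ 0 & \cdots & 1 & 0 \\ \vdots & & \vdots & \vdots \\ 1 & \cdots & 0 & 0 \end{pmatrix}.$$
One has $K = K^*$, $K^2 = I$, $K = K^{-1}$.
Let $P = P_\sigma$ designate an $n \times n$ permutation matrix. Now $\sigma$ may be factored into a product of disjoint cycles. This factorization is unique up to the arrangement of factors. Suppose that the cycles in the product have lengths $p_1$, $p_2$, ..., $p_m$ ($p_1 + p_2 + \cdots + p_m = n$). Let $\pi_{p_k}$ designate the $\pi$ matrix (2.4.14) of order $p_k$. By a rearrangement of rows and columns, the cycles in $P_\sigma$ can be brought into the form of involving only contiguous indices, that is, indices that are successive integers. By (2.4.13), then, there exists a permutation matrix $R$ of order $n$ such that
(2.4.25) $RPR^* = RPR^{-1} = \pi_{p_1} \oplus \pi_{p_2} \oplus \cdots \oplus \pi_{p_m}$.
Since the characteristic polynomial of $\pi_{p_k}$ is $(-1)^{p_k}(\lambda^{p_k} - 1)$, it follows that the characteristic polynomial of $RPR^*$, hence of $P$, is $\prod_{k=1}^{m} (-1)^{p_k}(\lambda^{p_k} - 1)$. The eigenvalues of the permutation matrix $P$ are therefore the roots of unity comprised in the totality of roots of the $m$ equations:
$$\lambda^{p_1} = 1, \quad \lambda^{p_2} = 1, \quad \ldots, \quad \lambda^{p_m} = 1.$$
Example. Let $\sigma$ be the permutation of 1, 2, 3, 4, 5, 6 for which $\sigma(1) = 5$, $\sigma(2) = 1$, $\sigma(3) = 6$, $\sigma(4) = 4$, $\sigma(5) = 2$, $\sigma(6) = 3$. Then $\sigma$ can be factored into cycles as $\sigma = (152)(4)(36)$. Therefore, $m = 3$ and $p_1 = 3$, $p_2 = 1$, $p_3 = 2$. The matrix $P_\sigma$ is
$$P_\sigma = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{pmatrix}.$$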
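For the permutation of this example, the cycle structure and the resulting eigenvalues can be verified without any eigenvalue routine. The sketch below uses 0-based indices and, for each cycle of length $p$ and each $p$th root of unity $\zeta$, builds a vector supported on that cycle and checks that $P$ maps it to $\zeta$ times itself. The helper names are hypothetical.

```python
import cmath


def perm_matrix(sigma):
    """Row i has its single 1 in column sigma[i] (0-based)."""
    n = len(sigma)
    return [[1 if j == sigma[i] else 0 for j in range(n)] for i in range(n)]


def cycles(sigma):
    """Disjoint cycles of sigma, each listed as i, sigma(i), sigma(sigma(i)), ..."""
    seen, out = set(), []
    for start in range(len(sigma)):
        if start not in seen:
            cyc, k = [], start
            while k not in seen:
                seen.add(k)
                cyc.append(k)
                k = sigma[k]
            out.append(cyc)
    return out


def eigenpairs(sigma):
    """One (eigenvalue, eigenvector) pair per cycle and per root of unity."""
    n, pairs = len(sigma), []
    for cyc in cycles(sigma):
        p = len(cyc)
        for r in range(p):
            zeta = cmath.exp(2j * cmath.pi * r / p)
            v = [0j] * n
            for j, idx in enumerate(cyc):
                v[idx] = zeta ** j
            pairs.append((zeta, v))
    return pairs


# 0-based form of sigma(1)=5, sigma(2)=1, sigma(3)=6, sigma(4)=4, sigma(5)=2, sigma(6)=3
sigma = [4, 0, 5, 3, 1, 2]
P = perm_matrix(sigma)
pairs = eigenpairs(sigma)
residuals = [max(abs(sum(P[i][j] * v[j] for j in range(6)) - zeta * v[i])
                 for i in range(6))
             for zeta, v in pairs]
print(sorted(len(c) for c in cycles(sigma)))
print([zeta for zeta, _ in pairs])
print(max(residuals))
```

The six eigenvalues found this way are the three cube roots of unity, the number 1 (from the fixed point), and the pair $\pm 1$ (from the 2-cycle).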
2.5 THE FOURIER MATRIX
32 Introductory Matrix Material I The Fourier Matrix 33
(2.5.2) (a) $w^n = 1$,
(b) $w\bar{w} = 1$,
(c) $\bar{w} = w^{-1}$,
(d) $\bar{w}^k = w^{-k} = w^{n-k}$,
(e) $1 + w + w^2 + \cdots + w^{n-1} = 0$.
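By the Fourier matrix of order n, we shall mean
the matrix F (= $F_n$) where
(2.5.3) $F^* = \frac{1}{\sqrt{n}}\,(w^{(i-1)(j-1)})$, i, j = 1, 2, ..., n.
Note the star on the left-hand member. The sequence
$w^k$, k = 0, 1, ..., is periodic; hence there are only n
distinct elements in F. F can therefore be written
alternatively as
(2.5.4) $F^* = \frac{1}{\sqrt{n}} \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w & w^2 & \cdots & w^{n-1} \\ 1 & w^2 & w^4 & \cdots & w^{n-2} \\ \vdots & & & & \vdots \\ 1 & w^{n-1} & w^{n-2} & \cdots & w \end{pmatrix}$.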
It is easily established that F and F* are
symmetric:
(2.5.5) $F = F^T$, $F^* = (F^*)^T = \bar{F}$, $F = \bar{F}^*$.
It is of fundamental importance that
Theorem 2.5.1. F is unitary:
(2.5.6) $FF^* = F^*F = I$ or $F^{-1} = F^*$ or
$F\bar{F} = \bar{F}F = I$ or $F^{-1} = \bar{F}$.
Proof. This is a result of the geometric series
identity
$\sum_{r=0}^{n-1} w^{r(j-k)} = \begin{cases} n & \text{if } j = k, \\ \dfrac{1 - w^{n(j-k)}}{1 - w^{j-k}} = 0 & \text{if } j \ne k. \end{cases}$
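A second application of the geometrical identity
yields
Theorem 2.5.2
$F^{*2} = \Gamma = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & & & & \vdots \\ 0 & 1 & \cdots & 0 & 0 \end{pmatrix} = F^2$.
Corollary. $F^{*4} = \Gamma^2 = I$. $F^{*3} = F^{*4}(F^*)^{-1} = IF = F$.
We may write the Fourier matrix picturesquely in
the form
$F = \sqrt[4]{I}$.
(It may be shown that all the qth roots of I are of
the form $M^{-1}DM$ where D = diag($\mu_1, \mu_2, \ldots, \mu_n$), $\mu_i^q = 1$,
and where M is any nonsingular matrix.)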
Corollary. The eigenvalues of F are $\pm 1$, $\pm i$, with
appropriate multiplicities.
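A numerical check of Theorems 2.5.1 and 2.5.2 can be sketched as follows. It assumes $w = \exp(2\pi i/n)$, so that F itself carries the conjugate powers. The helper names (`fourier_matrix`, `matmul`, `close`) are hypothetical.

```python
import cmath

def fourier_matrix(n):
    """F, where F* = n**-0.5 * (w**(j*k)) and w = exp(2*pi*i/n) (assumed)."""
    w_bar = cmath.exp(-2j * cmath.pi / n)
    s = n ** -0.5
    return [[s * w_bar ** (j * k) for k in range(n)] for j in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def conj_transpose(A):
    return [[x.conjugate() for x in col] for col in zip(*A)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

n = 6
F = fourier_matrix(n)
I = [[1.0 if j == k else 0.0 for k in range(n)] for j in range(n)]
# Gamma has a 1 where the (0-based) row and column indices sum to 0 mod n
Gamma = [[1.0 if (j + k) % n == 0 else 0.0 for k in range(n)] for j in range(n)]

F2 = matmul(F, F)
F4 = matmul(F2, F2)
is_unitary = close(matmul(F, conj_transpose(F)), I)
square_is_gamma = close(F2, Gamma)
fourth_power_is_identity = close(F4, I)
```

Since $F^4 = I$, every eigenvalue of F is a fourth root of unity, which is the content of the corollary.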
Carlitz has obtained the characteristic polynomials
$f(\lambda)$ of $F^*$ (= $F_n^*$). They are as follows.
n ≡ 0 (mod 4), $f(\lambda) = (\lambda - 1)^2(\lambda - i)(\lambda + 1)(\lambda^4 - 1)^{(n/4)-1}$,
n ≡ 1 (mod 4), $f(\lambda) = (\lambda - 1)(\lambda^4 - 1)^{(1/4)(n-1)}$,
n ≡ 2 (mod 4), $f(\lambda) = (\lambda^2 - 1)(\lambda^4 - 1)^{(1/4)(n-2)}$,
n ≡ 3 (mod 4), $f(\lambda) = (\lambda - i)(\lambda^2 - 1)(\lambda^4 - 1)^{(1/4)(n-3)}$.
The discrete Fourier transform. Working with
complex n-tuples, write
$Z = (z_1, z_2, \ldots, z_n)^T$ and $\hat{Z} = (\hat{z}_1, \hat{z}_2, \ldots, \hat{z}_n)^T$.
The linear transformation
(2.5.8) $\hat{Z} = FZ$
where F is the Fourier matrix is known as the discrete
Fourier transform (DFT). Its inverse is given simply
by
(2.5.9) $Z = F^{-1}\hat{Z} = F^*\hat{Z}$.
The transform (2.5.8) often goes by the name of
harmonic analysis or periodogram analysis, while the
inverse transform (2.5.9) is called harmonic synthesis.
The reasons behind these terms are as follows: suppose
that $p(z) = a_0 + a_1z + \cdots + a_{n-1}z^{n-1}$ is a polynomial of
degree $\le n - 1$. It will be determined uniquely by
specifying its values $p(z_k)$ at n distinct points $z_k$,
k = 1, 2, ..., n in the complex plane. Select these
points $z_k$ as the n roots of unity $1, w, w^2, \ldots, w^{n-1}$.
Then clearly
(2.5.10) $(p(1), p(w), \ldots, p(w^{n-1}))^T = n^{1/2}F^*(a_0, a_1, \ldots, a_{n-1})^T$
so that
(2.5.11) $(a_0, a_1, \ldots, a_{n-1})^T = n^{-1/2}F\,(p(1), p(w), \ldots, p(w^{n-1}))^T$.
The passage from functional values to coefficients
through (2.5.11) or (2.5.8) is an analysis of the
function, while in the passage from coefficient values
to functional values through (2.5.10) or (2.5.9) the
functional values are built up or "synthesized."
These formulas for interpolation at the roots of
unity can be given another form.
By a Vandermonde matrix $V(z_0, z_1, \ldots, z_{n-1})$ is
meant a matrix of the form
(2.5.12) $V = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ z_0 & z_1 & \cdots & z_{n-1} \\ z_0^2 & z_1^2 & \cdots & z_{n-1}^2 \\ \vdots & & & \vdots \\ z_0^{n-1} & z_1^{n-1} & \cdots & z_{n-1}^{n-1} \end{pmatrix}$.
From (2.5.4) one has, clearly,
(2.5.13) $V(1, w, w^2, \ldots, w^{n-1}) = n^{1/2}F^*$,
$V(1, \bar{w}, \bar{w}^2, \ldots, \bar{w}^{n-1}) = n^{1/2}\bar{F}^* = n^{1/2}F$.
One now has from (2.5.11)
(2.5.14) $p(z) = (1, z, \ldots, z^{n-1})(a_0, a_1, \ldots, a_{n-1})^T$
$= (1, z, \ldots, z^{n-1})\,n^{-1/2}F\,(p(1), p(w), \ldots, p(w^{n-1}))^T$
$= n^{-1}(1, z, \ldots, z^{n-1})\,V(1, \bar{w}, \bar{w}^2, \ldots, \bar{w}^{n-1})\,(p(1), p(w), \ldots, p(w^{n-1}))^T$.
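The passage between coefficients and values at the roots of unity can be sketched directly from the sums behind (2.5.10) and (2.5.11). The function names are hypothetical, and $w = \exp(2\pi i/n)$ is an assumption carried over from the definition of F.

```python
import cmath

def values_at_roots(coeffs):
    """Synthesis: p(w**k), k = 0..n-1, which is sqrt(n) * F* applied to coeffs."""
    n = len(coeffs)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(a * w ** (j * k) for j, a in enumerate(coeffs)) for k in range(n)]

def coeffs_from_values(values):
    """Analysis: a_j = (1/n) * sum_k p(w**k) * conj(w)**(j*k), i.e. n**-0.5 * F."""
    n = len(values)
    w_bar = cmath.exp(-2j * cmath.pi / n)
    return [sum(v * w_bar ** (j * k) for k, v in enumerate(values)) / n
            for j in range(n)]

coeffs = [2, -1, 0, 3]                  # p(z) = 2 - z + 3 z**3
values = values_at_roots(coeffs)        # values[0] is p(1)
recovered = coeffs_from_values(values)  # should reproduce coeffs up to rounding
```

The factor 1/n in the analysis step is the product of the two factors $n^{-1/2}$, one from (2.5.11) and one from the normalization of F.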
Note. In the literature of signal processing, a
sequence-to-sequence transform is known as a discrete
or digital filter. Very often the transform (such as
(2.5.8)) is linear and is called a linear filter.
Fourier Matrices as Kronecker Products. The Fourier
matrices of orders $2^n$ may be expressed as Kronecker
products. This factorization is a manifestation,
essentially, of the idea known as the Fast Fourier
Transform (FFT) and is of vital importance in real
time calculations.
Let $F'_{2^n}$ designate the Fourier matrices of order
$2^n$ whose rows have been permuted according to the bit
reversing permutation (see Problem 6, p. 30).
Examples
$F'_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$,
One has
where $D_4$ = diag(1, 1, 1, i). This may be easily
checked out.
As is known, $A \otimes B = P(B \otimes A)P^*$ for some permutation
matrix P that depends merely on the dimensions
of A and B. We may therefore write, for some permutation
matrix $S_4$ (one has, in fact, $S_4^{-1} = S_4$):
(2.5.16) $F'_4 = (I_2 \otimes F'_2)\,D_4\,S_4\,(I_2 \otimes F'_2)\,S_4$.
Similarly,
where
with
(2.5.19) $D = \mathrm{diag}(1, w, w^2, w^3)$, $w = \exp(2\pi i/16)$.
Again, for an appropriate permutation matrix $S_{16}$
(with $S_{16}^{-1} = S_{16}^T = S_{16}$),
For 256 use
where the sequence 0, 8, 4, ..., 15 is the bit
reversed order of 0, 1, ..., 15 and where
(2.5.22) $D = \mathrm{diag}(1, w, \ldots, w^{15})$, $w = e^{2\pi i/256}$.
PROBLEMS
1. Evaluate det $F_n$.
2. Find the polynomial $p_{n-1}(z)$ of degree $\le n - 1$ that
takes on the values 1/z at the nth roots of unity,
$w^j$, j = 1, 2, ..., n. What is the limiting
behavior of $p_n(z)$ as $n \to \infty$?
(de Mbre)
3. Write F = R + iS where R and S are real and i = $\sqrt{-1}$.
Show that R and S are symmetric and that
$R^2 + S^2 = I$, RS = SR.
4. Exhibit R and S explicitly.
2.6 HADAMARD MATRICES
By a Hadamard matrix of order n, H (= $H_n$), is meant a
matrix whose elements are either +1 or -1 and for
which
$HH^T = H^TH = nI$.
Thus, $n^{-1/2}H$ is an orthogonal matrix.
Examples
$H_1 = (1)$,
$\sqrt{2}\,F_2 = H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$,
$H_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$.
It is known that if $n \ge 3$, then the order of an
Hadamard matrix must be a multiple of 4. With one
possible exception, all multiples of 4 $\le$ 200 yield at
least one Hadamard matrix.
Theorem 2.6.1. If A and B are Hadamard matrices of
orders m and n respectively, then $A \otimes B$ is an Hadamard
matrix of order mn.
Proof
$(A \otimes B)(A \otimes B)^T = (A \otimes B)(A^T \otimes B^T) = (AA^T) \otimes (BB^T) = (mI_m) \otimes (nI_n) = mnI_{mn}$.
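Theorem 2.6.1 can be tried out on small cases. The sketch below uses plain lists of rows. The helper names `kron` and `is_hadamard` are hypothetical.

```python
def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def is_hadamard(H):
    """True if all entries are +1 or -1 and H H^T = n I."""
    n = len(H)
    entries_ok = all(x in (1, -1) for row in H for x in row)
    gram_ok = all(
        sum(H[i][k] * H[j][k] for k in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )
    return entries_ok and gram_ok

H2 = [[1, 1], [1, -1]]
H4 = kron(H2, H2)   # order 4
H8 = kron(H2, H4)   # order 8
```

Repeating the product with `H2` gives Hadamard matrices of every order $2^n$.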
In some areas, particularly digital signal processing,
the term Hadamard matrix is limited to the
matrices of order $2^n$ given specifically by the recursion
These matrices have the additional property of being
symmetric,
$H_{2^n}^T = H_{2^n}$,
so that
$H_{2^n}^2 = 2^n I$.
The Walsh-Hadamard Transform. By this is meant the
transform
$\hat{Z} = HZ$
where H is an Hadamard matrix.
PROBLEMS
1. Hadamard parlor game: Write down in a row any
four numbers. Then write the sum of the first
two, the sum of the last two; the difference of
the first two, the difference of the last two to
form a second row. Iterate this procedure four
times. The final row will be four times the
original row. Explain, making reference to $H_4$.
Generalize.
2. Define a generalized permutation matrix P as
follows. P is square and every row and every col-
umn of P has exactly one nonzero element in it.
That element is either a +1 or a -1. Show that
if H is an Hadamard matrix, and if P and Q are
generalized permutation matrices, then PHQ is an
Hadamard matrix.
3. With the notation of (2.6.2) prove that
4. Using Problem 3, show that the Hadamard transform
of a vector by $H_{2^n}$ can be carried out in
$\le n2^n$ additions or subtractions.
5. If H is an Hadamard matrix of order n, prove that
$|\det H| = n^{n/2}$.
2.7 TRACE
The trace of a square matrix A = $(a_{ij})$ of order n is
defined as the sum of its diagonal elements:
$\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}$.
The principal general properties of the trace are
(1) tr(aA + bB) = a tr(A) + b tr(B).
(2) tr(AB) = tr(BA).
(3) tr A = tr($S^{-1}AS$), S nonsingular.
(4) If $\lambda_i$ are the eigenvalues of A, then
tr A = $\sum_{i=1}^{n} \lambda_i$.
(5) More generally, if p designates a polynomial
then tr(p(A)) = $\sum_{k=1}^{n} p(\lambda_k)$.
(6) tr(AA*) = tr(A*A) = $\sum_{i,j=1}^{n} |a_{ij}|^2$ = square
of Frobenius norm of A.
(7) tr($A \oplus B$) = tr A + tr B.
(8) tr($A \otimes B$) = (tr A)(tr B).
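Properties (2), (6), and (8) can be checked on a small integer example. This is a sketch with hypothetical helper names, and the two matrices are arbitrary choices.

```python
def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

tr_AB = trace(matmul(A, B))             # property (2): equals tr(BA)
tr_BA = trace(matmul(B, A))
frob_sq = trace(matmul(A, transpose(A)))  # property (6) for a real matrix
tr_kron = trace(kron(A, B))             # property (8): equals tr(A) * tr(B)
```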
2.8 GENERALIZED INVERSE
For large classes of matrices, such as the square
"singular" matrices and the rectangular matrices, no
inverse exists. That is, there are many matrices A
for which there exists no matrix B such that AB = BA
= I.
In discussing the solution of systems of linear
equations, we know that if A is n x n and nonsingular
then the solution of the equation
AX = B
where X and B are n x m matrices, can be written very
neatly in matrix form as
$X = A^{-1}B$.
Although the "solution" given above is symbolic,
and in general is not the most economical way of solv-
ing systems of linear equations, it has important
applications. However, we have so far only been able
to use this idea for square nonsingular matrices. In
this section we show that for every matrix A, whether
square or rectangular, singular or nonsingular, there
exists a unique "generalized inverse" often called
the "Moore-Penrose" inverse of A, and employing it,
the formal solution $X = A^{-1}B$ can be given a useful
interpretation. This generalized inverse has several
of the important properties of the inverse of a
square nonsingular matrix, and the resulting theory
is able in a remarkable way to unify a variety of
diverse topics. This theory originated in the 1920s,
but was rediscovered in the 1950s and has been
developed extensively since then.
2.8.1 Right and Left Inverses
Definition. If A is an m x n matrix, a right inverse
of A is an n x m matrix B such that $AB = I_m$. Similarly
a left inverse is a matrix C such that $CA = I_n$.
Example. If
/1 1 1 \
a right inverse of A is the matrix
since $AB = I_2$.
However, note that A does not have a left
inverse, since for any matrix C, by the theorem on
the rank of a product,
$r(CA) \le r(A) = 2$, so that $CA \ne I_3$.
Similarly, although A is, by definition, a left
inverse of B, there exists no right inverse of B.
The following theorem gives necessary and suffic-
ient conditions for the existence of a right or left
inverse.
Theorem 2.8.1.1.
An m x n matrix A has a right (left)
inverse if and only if A has rank m (n).
Proof.
We work first with right inverses.
Assume that $AB = I_m$.
Then $m = r(I_m) \le r(A) \le m$.
Hence r(A) = m.
Conversely, suppose that r(A) = m. Then A has m
linearly independent columns, and we can find a permutation
matrix P so that the matrix $\tilde{A} = AP$ has its
first m columns linearly independent.
Now, if we can
find a matrix $\tilde{B}$ such that $\tilde{A}\tilde{B} = AP\tilde{B} = I$, then $B = P\tilde{B}$
is clearly a right inverse for A.
Therefore, we may assume, without loss of generality,
that A has its first m columns linearly
independent.
Hence A can be written in the block form
$A = (A_1, A_2)$
where $A_1$ is an m x m nonsingular matrix and $A_2$ is some
m x (n - m) matrix. This can be factored to yield
$A = A_1(I_m, Q)$, $(Q = A_1^{-1}A_2)$.
Now let
$B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}$
where $B_1$ is m x m and $B_2$ is (n - m) x m.
Then AB = I
if and only if
$A_1B_1 + A_1QB_2 = I$,
or if and only if
$B_1 + QB_2 = A_1^{-1}$,
or if and only if
$B_1 = A_1^{-1} - QB_2$.
Therefore, we have
$B = \begin{pmatrix} A_1^{-1} - QB_2 \\ B_2 \end{pmatrix}$
for an arbitrary (n - m) x m matrix $B_2$.
Thus there is
a right inverse, and if n > m, it is not unique.
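The family of right inverses constructed in the proof can be sketched with exact rational arithmetic. The example assumes a 2 x 3 matrix whose first two columns are already independent. The helper names (`inv2`, `right_inverse`) are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def right_inverse(A1, A2, B2):
    """Stack B1 = A1^-1 - Q B2 on top of B2, with Q = A1^-1 A2; B2 is free."""
    A1_inv = inv2(A1)
    Q = matmul(A1_inv, A2)
    QB2 = matmul(Q, B2)
    B1 = [[x - y for x, y in zip(r, s)] for r, s in zip(A1_inv, QB2)]
    return B1 + B2

A1 = [[1, 2], [3, 5]]
A2 = [[1], [1]]
A = [r1 + r2 for r1, r2 in zip(A1, A2)]        # A = (A1, A2), 2 x 3

B_a = right_inverse(A1, A2, [[0, 0]])          # one choice of B2
B_b = right_inverse(A1, A2, [[1, -2]])         # another choice of B2
```

Both `B_a` and `B_b` satisfy AB = I although they differ, which illustrates the non-uniqueness when n > m.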
We now prove the theorem for a left inverse.
Suppose, again, that A is m x n and r(A) = n.
Then $A^T$ is n x m and $r(A^T) = n$. By the first part, $A^T$ has
a right inverse: $A^TB = I$.
Hence $B^TA = I$ and A has a
left inverse.
Corollary.
If A is n x n of rank n, then A has both
a right and a left inverse and they are the same.
Proof.
The existence of a right and a left
inverse for A follows immediately from the theorem.
To prove that they are the same we assume
AB = I, CA = I.
Then C(AB) = CI = C.
But also,
C(AB) = (CA)B = IB = B,
so that B = C.
This is the matrix that is defined to
be the inverse of A, denoted by $A^{-1}$.
PROBLEMS
1. Find a left inverse for (i i) . Find all the
left inverses.
have a left inverse?
3. Let A be m x n and have a left inverse B. Suppose
that the system of linear equations AX = C has a
solution. Prove that the solution is unique and
is given by X = BC.
4. Let B be a left inverse for A. Prove that ABA = A
and BAB = B.
5. Let A be m x n and have rank n. Prove that $A^TA$ is
nonsingular and that $(A^TA)^{-1}A^T$ is a left inverse
for A.
6. Let A be m x n and have rank n. Let W be m x m
positive definite symmetric. Prove that $A^TWA$ is
nonsingular and that $(A^TWA)^{-1}A^TW$ is a left inverse
for A.
2.8.2 Generalized Inverses
Definition. Let A be an m x n matrix. Then an n x m
matrix X that satisfies any or all of the following
properties is called a generalized inverse:
(1) AXA = A,
(2) XAX = X,
(3) (AX)* = AX,
(4) (XA)* = XA.
Here the star * represents the conjugate transpose. A
matrix satisfying all four of the properties above is
called a Moore-Penrose inverse of A (for short: an
M-P inverse). We show now that every matrix A has a
unique M-P inverse. It is denoted by $A^{\div}$. It should
be remarked that the M-P inverse is often designated
by other symbols, such as $A^{+}$. The notation $A^{\div}$ is used
here because (a) it is highly suggestive and (b) it
comes close to one used in the APL computer language.
We first prove the following lemma on "rank
factorization" of a matrix.
Lemma. If A is an m x n matrix of rank r, then A = BC,
where B is m x r, C is r x n and r(B) = r(C) = r.
Proof. Since the rank of A is r, A has r linearly
independent columns. We may assume, without loss of
generality, that these are the first r columns of A,
for, if not, there exists a permutation matrix P such
that the first r columns of the matrix AP are the r
linearly independent columns of A. But if AP can be
factored as
$AP = B\tilde{C}$, $r(B) = r(\tilde{C}) = r$,
then
A = BC
where $C = \tilde{C}P^{-1}$ and $r(C) = r(\tilde{C}) = r$, since P is nonsingular.
Thus if we let B be the m x r matrix consisting
of the first r columns of A, the remaining n - r
columns are linear combinations of the columns of B,
of the form $BQ^{(j)}$ for some r x 1 vector $Q^{(j)}$.
Then
if we let Q be the r x (n - r) matrix,
$Q = (Q^{(1)} \cdots Q^{(n-r)})$,
we have
A = (B, BQ), where the first block has r columns
and the second block has n - r columns.
If we let
$C = (I_r, Q)$,
we have
$A = B(I_r, Q) = BC$
and r(B) = r(C) = r.
We next show the existence of an M-P inverse in
the case where A has full row or full column rank.
Theorem 2.8.2.1
(a) If A is square and nonsingular, set $A^{\div} = A^{-1}$.
(b) If A is n x 1 (or 1 x n) and $A \ne 0$, set
$A^{\div} = \frac{1}{(A^*A)}A^*$ (or $A^{\div} = \frac{1}{(AA^*)}A^*$).
(c) If A is m x n and r(A) = m, set $A^{\div} = A^*(AA^*)^{-1}$.
If A is m x n and r(A) = n, set
$A^{\div} = (A^*A)^{-1}A^*$.
Then $A^{\div}$ is an M-P inverse for A. Moreover, in the
case of full row rank, it is a right inverse; in the
case of full column rank, it is a left inverse.
Note that (a) and (b) are really special cases
of (c).
Proof. Direct calculation. Observe that if A is
m x n and r(A) = m, then AA* is m x m. It is well
known that r(AA*) = m, so that $(AA^*)^{-1}$ can be formed.
Similarly for A*A.
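The "direct calculation" can be carried out mechanically for a small case. The sketch below assumes a real 3 x 2 matrix of rank 2, so that A* is the transpose and A*A is 2 x 2. All helper names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def pinv_full_column_rank(A):
    """(A^T A)^-1 A^T for a real matrix with two independent columns."""
    At = transpose(A)
    return matmul(inv2(matmul(At, A)), At)

def penrose_conditions(A, X):
    """The four defining properties (1)-(4), for real matrices."""
    AX, XA = matmul(A, X), matmul(X, A)
    return (matmul(AX, A) == A, matmul(XA, X) == X,
            transpose(AX) == AX, transpose(XA) == XA)

A = [[1, 0], [1, 1], [1, 2]]
X = pinv_full_column_rank(A)
checks = penrose_conditions(A, X)
```

For this A the product XA is the 2 x 2 identity, in line with the statement that the full column rank case yields a left inverse.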
We can now show the existence of an M-P inverse
for any m x n matrix A.
If A = 0, set $A^{\div} = 0^* = 0_{n,m}$. This is readily
verified to satisfy requirements (1), (2), (3), and (4)
for a generalized inverse.
If $A \ne 0$, factor A as in the lemma into the
product
A = BC
where B is m x r, C is r x n and r(B) = r(C) = r. Now
B has full column rank while C has full row rank, so
that $B^{\div}$ and $C^{\div}$ may be found as in the previous theorem.
Now set
$A^{\div} = C^{\div}B^{\div}$.
Theorem 2.8.2.2. Let $A^{\div}$ be defined as above.
Then it is an M-P inverse for A.
Proof. It is easier to verify properties (3)
and (4) first. They will then be used in proving
properties (1) and (2).
(3) $AA^{\div} = B(CC^{\div})B^{\div} = BIB^{\div} = BB^{\div}$, and since
$BB^{\div} = (BB^{\div})^*$, we have $AA^{\div} = (AA^{\div})^*$.
(4) Similarly, $A^{\div}A = C^{\div}C = (C^{\div}C)^* = (A^{\div}A)^*$.
(1) $(AA^{\div})A = (BB^{\div})BC = BC = A$.
(2) $(A^{\div}A)A^{\div} = (C^{\div}C)C^{\div}B^{\div} = C^{\div}B^{\div} = A^{\div}$.
Now we prove that for any matrix A the M-P inverse is
unique.
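Theorem 2.8.2.3.
Given an m x n matrix A, there is
only one matrix $A^{\div}$ that satisfies all four properties
for the Moore-Penrose inverse.
Proof. Suppose that there exist matrices B and
C satisfying
(1) ABA = A, ACA = A,
(2) BAB = B, CAC = C,
(3) (AB)* = AB, (AC)* = AC,
(4) (BA)* = BA, (CA)* = CA.
Then
$B \overset{(2)}{=} BAB \overset{(4)}{=} (BA)^*B = A^*B^*B \overset{(1)}{=} (ACA)^*B^*B = (CA)^*(BA)^*B \overset{(4)}{=} (CA)(BA)B \overset{(1)}{=} CAB$
and
$C \overset{(2)}{=} CAC \overset{(3)}{=} C(AC)^* = CC^*A^* \overset{(1)}{=} CC^*(ABA)^* = CC^*A^*(AB)^* \overset{(3)}{=} (CC^*A^*)(AB) = C(AC)^*(AB) \overset{(3),(2)}{=} CAB$.
Therefore B = C.
The integers over the equality
signs show the properties used to derive the equality.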
Penrose has given the following recursive method
for computing $A^{\div}$, which is included in case the
reader would like to write a computer program.
Theorem 2.8.2.4 (the Penrose algorithm). Let A be
m x n and have rank r > 0.
(a) Set B = A*A (B is n x n).
(b) Set $C_1 = I$ ($C_1$ is n x n).
(c) Set recursively for i = 1, 2, ..., r - 1:
$C_{i+1} = (1/i)\operatorname{tr}(C_iB)\,I - C_iB$ ($C_i$ is n x n).
Then $\operatorname{tr}(C_rB) \ne 0$ and $A^{\div} = rC_rA^*/\operatorname{tr}(C_rB)$.
Moreover,
$C_{r+1}B = 0$. We therefore do not need to know r
beforehand, but merely stop the recurrence when we
have arrived at this stage.
The proof is omitted.
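A minimal sketch of the recursion, restricted to real matrices so that A* is the transpose, and using exact rational arithmetic so the stopping test $C_{i+1}B = 0$ is exact. The function name `penrose_pinv` and the helpers are hypothetical. The zero matrix is handled separately because the theorem assumes r > 0.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def is_zero(M):
    return all(x == 0 for row in M for x in row)

def penrose_pinv(A):
    """M-P inverse of a real matrix via the recursion of Theorem 2.8.2.4."""
    A = [[Fraction(x) for x in row] for row in A]
    m, n = len(A), len(A[0])
    At = transpose(A)                      # A* for a real matrix
    B = matmul(At, A)                      # step (a): B = A*A, n x n
    if is_zero(B):                         # A = 0: the inverse is the n x m zero matrix
        return [[Fraction(0)] * m for _ in range(n)]
    C = [[Fraction(int(j == k)) for k in range(n)] for j in range(n)]  # step (b)
    i = 1
    while True:
        CB = matmul(C, B)
        t = sum(CB[k][k] for k in range(n))            # tr(C_i B)
        C_next = [[(t / i if j == k else 0) - CB[j][k] for k in range(n)]
                  for j in range(n)]                   # step (c)
        if is_zero(matmul(C_next, B)):                 # C_{i+1} B = 0, so i = r
            return [[i * x / t for x in row] for row in matmul(C, At)]
        C, i = C_next, i + 1

P_sing = penrose_pinv([[1, 1], [1, 1]])    # rank 1
P_diag = penrose_pinv([[1, 0], [0, 2]])    # nonsingular
P_col = penrose_pinv([[1], [1]])           # a single column
```

Floating-point input would need a tolerance in place of the exact zero test; that variant is not shown here.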
Also very useful is the Greville algorithm.
Theorem 2.8.2.5. Define $A_k = (A_{k-1}\ a_k)$ where $a_k$ is the
kth column of A and $A_{k-1}$ is the submatrix of A consisting
of its first k - 1 columns. Set $d_k = A_{k-1}^{\div}a_k$ and
$c_k = a_k - A_{k-1}d_k$.
Set $b_k = c_k^{\div}$ if $c_k \ne 0$. If $c_k = 0$,
set $b_k = (1 + d_k^*d_k)^{-1}d_k^*A_{k-1}^{\div}$. Then
$A_k^{\div} = \begin{pmatrix} A_{k-1}^{\div} - d_kb_k \\ b_k \end{pmatrix}$.
To start: set $A_1^{\div} = 0$ if $a_1 = 0$; if not, set $A_1^{\div} =
(a_1^*a_1)^{-1}a_1^*$.
PROBLEMS
2 2 0
1. If A = (1 2 1) , verify that
1 2 1
1 1
I f A = (1 l), find A'.
1 2
find A'
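4. Use Penrose's formulas to compute the inverse of
the nonsingular matrix
Use Greville's algorithm.
5. If c is a nonzero scalar, prove that $(cA)^{\div} =
(1/c)A^{\div}$.
6. Prove that $(A^{\div})^{\div} = A$.
7. Prove that $(A^{\div})^* = (A^*)^{\div}$.
8. If d is a scalar, define $d^{\div}$ by $d^{\div} = d^{-1}$ if $d \ne 0$,
$d^{\div} = 0$ if d = 0. Let A = diag($d_1, \ldots, d_n$).
Prove that $A^{\div}$ = diag($d_1^{\div}, \ldots, d_n^{\div}$).
9. Prove that $\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}^{\div} = \begin{pmatrix} A^{\div} & 0 \\ 0 & B^{\div} \end{pmatrix}$ and $\begin{pmatrix} 0 & A \\ B & 0 \end{pmatrix}^{\div} = \begin{pmatrix} 0 & B^{\div} \\ A^{\div} & 0 \end{pmatrix}$.
10. Prove that if $A^{\div} = 0$, then A = 0*.
11. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and have rank 1. Prove that
$A^{\div} = \frac{1}{|a|^2 + |b|^2 + |c|^2 + |d|^2}A^*$.
12. Let J be the J matrix of order n. Prove that
$J^{\div} = (1/n^2)J$.
13. Let S be an n x n matrix with 1's on the superdiagonal
and 0's elsewhere. Find $S^{\div}$.
14. Let P be any projection matrix (i.e., $P^2 = P$, P*
= P). Prove that $P^{\div} = P$.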
15. Prove that both $AA^{\div}$ and $A^{\div}A$ are projections.
16. Prove that $A^{\div} = (A^*A)^{\div}A^* = A^*(AA^*)^{\div}$.
17. Prove that $r(A) = r(A^{\div}) = r(A^{\div}A) = \operatorname{tr}(A^{\div}A)$.
18. Taking A = (1, 0), B = ..., show that, in general,
$(AB)^{\div} \ne B^{\div}A^{\div}$.
19. If a and b are column vectors, then $a^{\div} = (a^*a)^{\div}a^*$,
and $(ab^*)^{\div} = (a^*a)^{\div}(b^*b)^{\div}ba^*$.
20. Prove that $(A \otimes B)^{\div} = A^{\div} \otimes B^{\div}$.
2.8.3 The UDV Theorem and the M-P Inverse
We begin by establishing a theorem that is of great
utility in visualizing the action and facilitating the
manipulation of rectangular (or square) matrices. This
is the UDV theorem, also called the diagonal decomposi-
tion theorem or the singular value decomposition
theorem.
Theorem 2.8.3.1. Let A be an m x n matrix with com-
plex elements and of rank r.
Then there exist unitary
matrices U, V of orders m and n respectively such that
(2.8.3.1) A = UDV*
where
(2.8.3.2) $D = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix}$
is m x n and where $D_1$ = diag($d_1, d_2, \ldots, d_r$) is a
nonsingular diagonal matrix of order r.
Note that the representation (2.8.3.1) can be
written as U*AV = D or, changing the notation, UAV =
D, and so on (since U and V are unitary).
Let A be m x n; then, as is well known, AA* is
positive semidefinite Hermitian symmetric and r(AA*)
= r(A) = r(A*). Hence the eigenvalues of AA* are real
and nonnegative.
Write them as $d_1^2, d_2^2, \ldots, d_r^2$, 0,
0, ..., 0 where the $d_i$'s are positive and where there
are m - r 0's in the list.
The numbers $d_1, d_2, \ldots, d_r$
are known as the singular values of A.
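Proof. Define $D_1$ = diag($d_1, d_2, \ldots, d_r$). Let
$U_1$ be m x r and consist of the (orthonormal) eigenvectors
of AA* corresponding to the eigenvalues $d_1^2, d_2^2,
\ldots, d_r^2$ (cf. Theorems 2.9.3 and 2.9.9). We have $AA^*U_1
= U_1D_1^2$ and $U_1^*U_1 = I_r$.
Let $U_2$ be the m x (m - r) matrix
whose columns consist of an orthonormal basis for the
null space of A*.
Then $A^*U_2 = 0$ and $U_2^*U_2 = I_{m-r}$.
Write $U = (U_1, U_2)$ (block notation). Then
$U^*U = \begin{pmatrix} U_1^*U_1 & U_1^*U_2 \\ U_2^*U_1 & U_2^*U_2 \end{pmatrix}$.
Now, since $AA^*U_1 = U_1D_1^2$, $U_2^*AA^*U_1 = U_2^*U_1D_1^2$. But $A^*U_2 =
0$, so that $U_2^*A = 0$, hence $U_2^*U_1D_1^2 = 0$.
Since $D_1^2$ is
nonsingular, it follows that $U_2^*U_1 = (U_1^*U_2)^* = 0$. This
means that
$U^*U = \begin{pmatrix} I_r & 0 \\ 0 & I_{m-r} \end{pmatrix}$
and hence that U is unitary.
Let $V_1$ be the n x r matrix defined by $V_1 =
A^*U_1D_1^{-1}$. Let $V_2$ be the n x
(n - r) matrix whose n - r
columns are a set of n - r orthonormal vectors for the
null space of A.
Thus $AV_2 = 0$ and $V_2^*V_2 = I_{n-r}$.
Define
V as the n x n matrix $V = (V_1, V_2)$.
Now
$V_1^*V_1 = D_1^{-1}U_1^*AA^*U_1D_1^{-1} = D_1^{-1}U_1^*U_1D_1^2D_1^{-1} = I_r$
and $V_2^*V_1 = V_2^*A^*U_1D_1^{-1} = (AV_2)^*U_1D_1^{-1} = 0$.
It follows
that V is unitary. Finally,
$U^*AV = \begin{pmatrix} U_1^*AV_1 & U_1^*AV_2 \\ U_2^*AV_1 & U_2^*AV_2 \end{pmatrix} = \begin{pmatrix} U_1^*AA^*U_1D_1^{-1} & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix} = D$.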
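Using the UDV theorem, we can produce a very convenient
formula for $A^{\div}$.
Theorem 2.8.3.2.
If A = U*DV*, where U, V, D are as
above, then
$A^{\div} = VD^{\div}U$
where
$D^{\div} = \begin{pmatrix} D_1^{-1} & 0 \\ 0 & 0 \end{pmatrix}$ is n x m (column blocks of widths r and m - r).
Proof. By a direct computation, it is easy to
show that for the n x m matrix $VD^{\div}U$ one has
$A(VD^{\div}U) = U^*\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}U$ and
$(VD^{\div}U)A = V\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}V^*$, so that the third and fourth properties
for the generalized inverse are satisfied.
Also,
$AA^{\div}A = (U^*DV^*)(VD^{\div}U)(U^*DV^*) = U^*DD^{\div}DV^* = U^*DV^* = A$.
Similarly $A^{\div}AA^{\div} = A^{\div}$, proving the first two properties.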
Theorem 2.8.3.3.
For each A there exist polynomials p
and q such that
$A^{\div} = A^*p(AA^*) = q(A^*A)A^*$.
Proof. Let A be m x n and have rank r.
Then by
the diagonal decomposition theorem there exist unitary
matrices U, V of order m and n and an m x n matrix
$D = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix}$ (row blocks of heights r and m - r, column blocks of widths r and n - r)
where $D_1$ = diag($d_1, d_2, \ldots, d_r$), $d_1d_2 \cdots d_r \ne 0$, such
that A = U*DV*. Then A* = VD*U, AA* = U*DD*U, and
$A^{\div} = VD^{\div}U$.
For an arbitrary polynomial p(z), p(AA*) =
p(U*(DD*)U) = U*p(DD*)U. Hence A*p(AA*) =
VD*p(DD*)U. Therefore for $A^{\div}$ to equal A*p(AA*) it is
necessary and sufficient that $D^{\div} = D^*p(DD^*)$.
Equivalently,
$\begin{pmatrix} D_1^{-1} & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} \bar{D}_1 & 0 \\ 0 & 0 \end{pmatrix} p\!\begin{pmatrix} D_1\bar{D}_1 & 0 \\ 0 & 0 \end{pmatrix}$
or $d_k^{-1} = \bar{d}_k\,p(|d_k|^2)$, k = 1, 2, ..., r.
Thus $p(|d_k|^2) = 1/|d_k|^2$, k = 1, 2, ..., r is necessary and sufficient.
Let s designate the number of distinct values
among $|d_1|, |d_2|, \ldots, |d_r|$. Then by the fundamental
theorem of polynomial interpolation (see any book on
interpolation, approximation, or numerical analysis)
there is a unique polynomial of degree $\le s - 1$ that
takes on the values $|d_k|^{-2}$ at the s points $|d_k|^2$.
The second identity for $A^{\div}$ is proved similarly.
PROBLEMS
1. Let U and V be unitary. Prove that $(UAV)^{\div} =
V^*A^{\div}U^*$.
2. Let A be normal. Give a representation for $A^{\div}$ in
terms of the characteristic values of A. See
Section 2.9.
3. Prove that if A is normal, $AA^{\div} = A^{\div}A$.
4. Prove that $A^{\div} = A^*$ if and only if the singular
values of A are 0 or 1.
5. Prove that $A^{\div} = \lim_{t \to 0} A^*(tI + AA^*)^{-1}$.
2.8.4 Generalized Inverses and Systems of
Linear Equations
Using the properties of the generalized inverse we are
able to determine, for any system of equations
AX = B,
whether or not the system has a solution.
If it does,
we can obtain a matrix equation, involving the generalized
inverse, which exhibits this solution. Oddly
enough, we need only the first property of a generalized
inverse. That is, we may use any matrix $A^{(1)}$,
such that $AA^{(1)}A = A$.
Definition. If A is m x n, any n x m matrix that
satisfies $AA^{(1)}A = A$ is called a (1)-inverse of A.
More generally,
any matrix that satisfies any combination
of the four requirements for the generalized
inverse on page 44 is designated accordingly.
Example. A (1, 2, 4)-inverse for A is one that satisfies
conditions (1), (2), and (4).
Theorem 2.8.4.1. Let A be m x n. The system of
equations
AX = B
has a solution if and only if $B = AA^{(1)}B$, for any (1)-inverse
$A^{(1)}$ of A. In this case, the general solution
is given by
$X = A^{(1)}B + (I - A^{(1)}A)Y$
for an arbitrary n x 1 vector Y.
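The solvability test and the general solution can be sketched on a small singular system. The (1)-inverse used below is the M-P inverse of the all-ones 2 x 2 matrix, namely one quarter of that matrix. The function names are hypothetical and vectors are stored as columns.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solvable(A, A1, B):
    """AX = B has a solution exactly when A A1 B = B, for a (1)-inverse A1 of A."""
    return matmul(A, matmul(A1, B)) == B

def general_solution(A, A1, B, Y):
    """X = A1 B + (I - A1 A) Y for an arbitrary column Y."""
    n = len(A[0])
    A1A = matmul(A1, A)
    N = [[(1 if j == k else 0) - A1A[j][k] for k in range(n)] for j in range(n)]
    particular, homogeneous = matmul(A1, B), matmul(N, Y)
    return [[p[0] + h[0]] for p, h in zip(particular, homogeneous)]

A = [[1, 1], [1, 1]]
A1 = [[Fraction(1, 4)] * 2 for _ in range(2)]   # satisfies A A1 A = A
B_good = [[2], [2]]
B_bad = [[1], [0]]

X = general_solution(A, A1, B_good, [[3], [-5]])
```

Different choices of Y sweep out all solutions of the consistent system, while `B_bad` fails the test because it is not in the range of A.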
Proof. Let $B = AA^{(1)}B$. Then $AX = AA^{(1)}B$ is
solved by $X = A^{(1)}B$.
Suppose, conversely, that the
system has a solution $X_0$: $AX_0 = B$. Then, for any
$A^{(1)}$, $AA^{(1)}B = AA^{(1)}AX_0 = AX_0 = B$.
Moreover, if $X = A^{(1)}B + (I - A^{(1)}A)Y$, then with B =
$AA^{(1)}B$,
$AX = AA^{(1)}B + (A - AA^{(1)}A)Y = B$.
Therefore any such X is a solution.
To show that it is the general solution, we must
show that every solution $X_0$ has this form. Let
$R = X_0 - A^{(1)}B$. Then $AR = AX_0 -
AA^{(1)}B = B - B = 0$.
Now therefore $R = R - A^{(1)}AR$.
Hence, $X_0
= A^{(1)}B + (I - A^{(1)}A)R$ which is of the
required form with Y = R.
In the numerical utilization of this theorem one
should, of course, use some standard (1)-inverse of A
such as $A^{\div}$.
PROBLEMS
1.
Show that if A is an m x n matrix and B is any
(1)-inverse of A , then AB and BA are idempotent
of orders m and n respectively and BAB is a
(1,2)-
inverse of A.
2. Show that if A is m x n (n x m), of rank m, then
any (1)-inverse of A is a right (left) inverse of
A, and any right (left) inverse of A is a (1,2,3)-
[(1,2,4)-] inverse of A.
3. Consider two systems of equations: (1) AX = B,
( 2) CX = D.
Find conditions such that every solu-
tion of (1) is a solution of (2).
4 . What happens in Problem 3 if B = D = O?
5. Prove that the matrix equation AXB = C has a
solution if and only if $AA^{\div}CB^{\div}B = C$.
In this
case, the general solution is given by
for an arbitrary Y.
2.8.5 The M-P Inverse and Least Square Problems
Let A be m x n, X be n x 1, and B be m x 1, and consider the system
of equations
AX = B.
If the vector B lies in the range of A, then there
exists one or more solutions to this system. If the
solution is not unique we might want to know which
solution has minimum norm. If the vector B is not in
the range of A, then there is no solution to the system,
but it is often desirable to find a vector X in
some way closest to a solution. To this end, for any
X, define the residual vector R = AX - B and consider
its Euclidean norm $\|R\| = \sqrt{R^*R}$. A least squares
solution to the system is a vector $X_0$ such that its
residual has minimum norm. That is,
$\|R_0\| = \|AX_0 - B\| \le \|AX - B\|$
for all n x 1
vectors X.
Theorem 2.8.5.1. The system of equations AX = B
always has a least squares solution. This solution is
unique if and only if the columns of A are linearly
independent. In this case, the unique least squares
solution is given by $X = A^{\div}B$.
Proof. Let R(A) designate the range space of A
and by $[R(A)]^{\perp}$ designate its orthogonal complement.
Then we can write $B = B_1 + B_2$ where $B_1$ is in R(A) and
$B_2$ is in the orthogonal complement $[R(A)]^{\perp}$. For any X,
AX is in R(A) as is $AX - B_1$, which hence is orthogonal to
$B_2$. Now $AX - B = AX - B_1 - B_2$. Hence, for any X,
$\|AX - B\|^2 = \|AX - B_1\|^2 + \|B_2\|^2 \ge \|B_2\|^2$.
Therefore $\|B_2\|^2$ is a lower bound for the values
$\|AX - B\|^2$ and is achieved if and only if $AX = B_1$.
Since $B_1$ is in R(A), there is a solution $X_0$ to $AX = B_1$.
For this vector $X_0$,
$\|R_0\|^2 = \|AX_0 - B\|^2 = \|B_2\|^2 \le \|AX - B\|^2$
so that the lower bound is achieved.
Since a unique solution to $AX = B_1$ exists if and
only if the columns of A are linearly independent, the
theorem is proved.
For any solution $X_0$ to $AX = B_1$,
$R_0 = AX_0 - B = B_1 - (B_1 + B_2) = -B_2$
is in $[R(A)]^{\perp}$.
Therefore $A^*R_0 = 0$, or
$A^*AX_0 = A^*B$.
These are the normal equations determining the least
squares solution.
If the columns of A are independent, then
r(A*A) = r(A) = n, so that the n x n matrix A*A is
nonsingular. The least squares solution $X_0$ is determined
by $A^*AX_0 = A^*B$, so that $X_0 = (A^*A)^{-1}A^*B$. But,
from our previous work, $A^{\div} = (A^*A)^{-1}A^*$.
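The normal equations can be tried on a small line-fitting problem. The sketch assumes a real matrix with two independent columns, so A* is the transpose and A*A is 2 x 2. The data points and helper names are illustrative choices.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def least_squares(A, B):
    """Solve the normal equations A^T A X = A^T B for a real A of column rank 2."""
    At = transpose(A)
    return matmul(inv2(matmul(At, A)), matmul(At, B))

# Fit y = c0 + c1 * t through the points (0, 1), (1, 2), (2, 2).
A = [[1, 0], [1, 1], [1, 2]]
B = [[1], [2], [2]]

X0 = least_squares(A, B)
residual = [[ax[0] - b[0]] for ax, b in zip(matmul(A, X0), B)]
orthogonality = matmul(transpose(A), residual)   # A^T R0, expected to vanish
```

The vanishing of `orthogonality` is the statement that the residual lies in the orthogonal complement of the range of A.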
Finally, we take up the general case.
Lemma. Let $P = AA^{\div}$, $Q = A^{\div}A$. Then, if X and Y are
arbitrary vectors (conformable),
$\|AX + (I - P)Y\|^2 = \|AX\|^2 + \|(I - P)Y\|^2$
and
$\|A^{\div}X + (I - Q)Y\|^2 = \|A^{\div}X\|^2 + \|(I - Q)Y\|^2$.
Proof. Since $A = AA^{\div}A$, $AX = AA^{\div}AX = PZ$ with Z =
AX. We now prove that $PZ \perp (I - P)Y$. This is equivalent
to (PZ)*(I - P)Y = 0 or Z*P*(I - P)Y = 0. But
P* = P and $P^2 = (AA^{\div}A)A^{\div} = AA^{\div} = P$. Therefore,
P*(I - P) = 0. The first equality above now follows
from Pythagoras' theorem. The second equality can be
derived from the first using $(A^{\div})^{\div} = A$.
Another way of phrasing this work is that P is
the projection onto the range space R(A) of A while
I - P is the projection onto the orthogonal complement
of R(A).
Theorem 2.8.5.2. Let A be m x n and B be m x 1. Let
$X_0 = A^{\div}B$. Then for any n x 1 $X \ne X_0$, we have either
(1) $\|AX - B\| > \|AX_0 - B\|$
or
(2) $\|AX - B\| = \|AX_0 - B\|$
and
$\|X\| > \|X_0\|$.
Proof. For any X we have
$AX - B = AX - AA^{\div}B + AA^{\div}B - B = A(X - A^{\div}B) + (I - P)(-B)$.
By the previous lemma,
$\|AX - B\|^2 = \|AX - AX_0\|^2 + \|AX_0 - B\|^2 \ge \|AX_0 - B\|^2$.
The equality holds here if and only if $A(X - X_0) = 0$.
Hence if $AX \ne AX_0$, inequality (1) holds.
Suppose, then, that $AX = AX_0$. Then $A^{\div}AX = A^{\div}AX_0
= A^{\div}AA^{\div}B = A^{\div}B = X_0$.
Therefore, $X = X_0 + (X - X_0) =
A^{\div}B + (I - A^{\div}A)X$. Hence by the second equality of the
lemma,
$\|X\|^2 = \|A^{\div}B\|^2 + \|(I - A^{\div}A)X\|^2 = \|X_0\|^2 + \|X - X_0\|^2$
so that, since $X \ne X_0$,
$\|X\| > \|X_0\|$.
This theorem may be rephrased as follows. Given
the system AX = B. Then the vector $A^{\div}B$ is either the
unique least squares solution or it is the least
squares solution of minimum norm.
PROBLEM
1. A is square and singular. Characterize the solution
$A^{\div}B$.
2.9 NORMAL MATRICES, QUADRATIC FORMS, AND FIELD OF
VALUES
We record here a number of important facts. By a
normal matrix is meant a square matrix A for which
(2.9.1) AA* = A*A.
Examples. Hermitian, skew-Hermitian, and unitary
matrices are normal. Hence real symmetric, skew-
symmetric, and orthogonal matrices are also normal.
All circulants are normal, as we shall see.
Theorem 2.9.1. A is normal if and only if there is a
unitary U and diagonal D such that A = U*DU.
Theorem 2.9.2. A is normal if and only if there is a
polynomial p(x) such that A* = p(A).
Theorem 2.9.3. A is Hermitian if and only if there is
a unitary matrix U and a real diagonal D such that
A = U*DU.
Theorem 2.9.4. A is (real) symmetric if and only if
there is a (real) orthogonal matrix U and a real
diagonal D such that A = U*DU.
PROBLEMS
1. Prove that A is normal if and only if A = R + iS
where R and S are real symmetric and commute.
2. Prove that A is normal if and only if in the polar
decomposition of A (A = HU with H positive semidefinite
Hermitian, U unitary) one has HU = UH.
3. Let A have eigenvalues $\lambda_1, \ldots, \lambda_n$. Prove that A
is normal if and only if the eigenvalues of AA*
are $|\lambda_1|^2, |\lambda_2|^2, \ldots, |\lambda_n|^2$.
4. Prove that A is normal if and only if the eigenvalues
of A + A* are $\lambda_1 + \bar{\lambda}_1, \lambda_2 + \bar{\lambda}_2, \ldots,
\lambda_n + \bar{\lambda}_n$.
5. If A is normal and p(z) is a polynomial, then
p(A) is normal.
6. If A is normal, prove that $A^{\div}$ is normal.
7. If A and B are normal, prove that $A \otimes B$ is normal.
8. Use Theorem 2.9.1 to prove Theorem 2.9.2.
Quadratic Forms. Let M = $(m_{ij})$ be n x n and let Z = $(z_1, z_2,
\ldots, z_n)^T$.
By a quadratic form is meant the function
of $z_1, \ldots, z_n$ given by
$Z^*MZ = \sum_{i,j=1}^{n} m_{ij}\bar{z}_iz_j$.
It is often of importance to distinguish the
quadratic form from a matrix that gives rise to it.
The real and the complex cases are essentially different.
Lemma 2.9.5. Let Q be real and square and U a real
column. Then $U^TQU = 0$ for all U if and only if Q =
$-Q^T$, that is, if and only if Q is skew-symmetric.
Proof.
(a) Let $Q = -Q^T$. If $a = U^TQU$, $a^T = a = U^TQ^TU =
U^T(-Q)U = -a$. Therefore a = 0.
(b) Let $U^TQU = 0$ for all (real) U. Write Q =
$Q_1 + Q_2$ where $Q_1$ is symmetric and $Q_2$ is skew-symmetric.
Then, for all U,
$0 = U^TQU = U^TQ_1U + U^TQ_2U = U^TQ_1U$.
Since $Q_1$ is symmetric, we have for some orthogonal P
and real diagonal matrix $\Lambda$: $Q_1 = P^T\Lambda P$. Therefore for
all real U, $U^TP^T\Lambda PU = (PU)^T\Lambda(PU) = 0$. Write $PU = \hat{U} =
(\hat{u}_1, \ldots, \hat{u}_n)^T$, $\Lambda$ = diag($\lambda_1, \ldots, \lambda_n$). Then we have
$\sum_{k=1}^{n} \lambda_k(\hat{u}_k)^2 = 0$ for all $(u_1, \ldots, u_n)$, hence for all
$(\hat{u}_1, \ldots, \hat{u}_n)$.
This clearly implies $\lambda_k = 0$, for k =
1, 2, ..., n.
Hence $Q_1 = 0$ and $Q = Q_2$ = skew-symmetric.
Theorem 2.9.6.
Let Q and R be real square and U be a
real column.
Then $U^TQU = U^TRU$ for all U if and only if
Q - R is skew-symmetric.
Proof.
$U^TQU = U^TRU$ if and only if $U^T(Q - R)U = 0$.
Corollary. Let Q be real and U be a real column.
Then
$U^TQU = U^T\left(\tfrac{1}{2}(Q + Q^T)\right)U$.
The matrix $\tfrac{1}{2}(Q + Q^T)$ is known as the symmetrization
of Q.
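The corollary can be checked on a small non-symmetric matrix. In this sketch, with hypothetical helper names, the quadratic form of Q agrees with that of its symmetrization and the skew-symmetric remainder contributes nothing.

```python
from fractions import Fraction

def quad_form(Q, U):
    """U^T Q U for a real square Q and a real vector U (given as a flat list)."""
    n = len(U)
    return sum(U[i] * Q[i][j] * U[j] for i in range(n) for j in range(n))

def symmetrize(Q):
    """The symmetrization (Q + Q^T) / 2."""
    n = len(Q)
    return [[Fraction(Q[i][j] + Q[j][i], 2) for j in range(n)] for i in range(n)]

Q = [[1, 4], [-2, 3]]
S = symmetrize(Q)                                              # symmetric part
K = [[Q[i][j] - S[i][j] for j in range(2)] for i in range(2)]  # skew-symmetric part
U = [2, -1]

form_Q = quad_form(Q, U)
form_S = quad_form(S, U)
form_K = quad_form(K, U)
```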
We pass now to the complex case.
Lemma 2.9.7.
Let M be a square matrix with complex
elements and let Z be a column with complex elements.
Then
Z*MZ = 0
for all complex Z if and only if M = 0.
Proof
(a) The "if" is trivial.
(b) "Only if." Write Z = X + iY, M = R + iS
where X, Y, R, S are all real. Then we are given
(2.9.4) (X* - iY*)(R + iS)(X + iY) = 0 for all
real X, Y.
Select Y = 0. Then X*(R + iS)X = 0 for all real X
or X*RX = 0 and X*SX = 0. Therefore, by the first
lemma, R and S must be skew-symmetric: $R + R^T = 0$,
$S + S^T = 0$. Expanding the product on the left side
of (2.9.4), we obtain
X*RX + Y*RY + Y*SX - X*SY + i(X*SX + Y*SY + X*RY - Y*RX) = 0.
In view of the skew symmetry of R and S and the first
lemma, we have X*RX = X*SX = Y*RY = Y*SY = 0. Therefore,
we have for all real X, Y:
Y*SX - X*SY = 0, X*RY - Y*RX = 0.
Thus, for all real X, Y, X*(R - R*)Y = 0 and
Y*(S - S*)X = 0. Selecting X and Y as appropriate
unit vectors (0, ..., 0, 1, 0, ..., 0), this tells us that
R - R* = 0 and S - S* = 0. But $R^* = R^T = -R$ and $S^* =
S^T = -S$, therefore R = S = 0 and M = 0.
Theorem 2.9.8. Let M and N be square matrices of order n with complex elements and suppose that

(2.9.5) $Z^*MZ = Z^*NZ$

for all complex vectors Z. Then M = N.
Proof. As before, $Z^*MZ = Z^*NZ$ if and only if $Z^*(M - N)Z = 0$.
Note that this theorem is false if (2.9.5) holds only for real Z.
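The remark that the theorem fails for real Z can be seen on a small case. The sketch below takes a nonzero real skew-symmetric M (an example chosen here, not one from the text) and compares the form on real vectors with the form on one complex vector; `herm_form` is a hypothetical helper name.

```python
# A 2 x 2 skew-symmetric matrix: Z*MZ vanishes for every real Z, yet M != 0.

def herm_form(M, z):
    """Return Z* M Z for a square matrix M and a complex vector z."""
    n = len(z)
    return sum(z[i].conjugate() * M[i][j] * z[j]
               for i in range(n) for j in range(n))

M = [[0, 1],
     [-1, 0]]

# Real test vectors on a small integer grid (assumed sample points).
real_values = [herm_form(M, [complex(x), complex(y)])
               for x in range(-2, 3) for y in range(-2, 3)]

# One genuinely complex vector, Z = (1, i)^T.
complex_value = herm_form(M, [1 + 0j, 1j])
```

Every entry of `real_values` is zero, while `complex_value` is not, so agreement of the two forms on real vectors alone does not force M = N.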
Corollary. $Z^*MZ$ is real for all complex Z if and only if M is Hermitian.
Proof. $Z^*MZ$ is real if and only if $Z^*MZ = (Z^*MZ)^* = Z^*M^*Z$. Hence $M = M^*$.
Let M be a Hermitian matrix. It is called positive definite if $Z^*MZ > 0$ for all $Z \neq 0$. It is called positive semidefinite if $Z^*MZ \ge 0$ for all Z. It is called indefinite if there exist $Z_1 \neq 0$ and $Z_2 \neq 0$ such that $Z_1^*MZ_1 > 0 > Z_2^*MZ_2$.
Theorem 2.9.9. Let M be a Hermitian matrix of order n with eigenvalues $\lambda_1, \ldots, \lambda_n$. Then
(a) M is positive definite if and only if $\lambda_k > 0$, k = 1, 2, ..., n.
(b) M is positive semidefinite if and only if $\lambda_k \ge 0$, k = 1, 2, ..., n.
(c) M is indefinite if and only if there are integers j, k, $j \neq k$, with $\lambda_j > 0$, $\lambda_k < 0$.
Field of Values. Let M designate a matrix of order n. The set of all complex numbers $Z^*MZ$ with $\|Z\| = 1$ is known as the field of values of M and is designated by $\mathcal{F}(M)$. $\|Z\|$ designates the Euclidean norm of Z.
The following facts, due to Hausdorff and Toeplitz, are known.
(1) $\mathcal{F}(M)$ is a closed, bounded, connected, convex subset of the complex plane.
(2) The field of values is invariant under unitary transformations:
(2.9.6) $\mathcal{F}(M) = \mathcal{F}(U^*MU)$, U = unitary.
(3) If ch M designates the convex hull of the eigenvalues of M, then
(2.9.7) $\mathrm{ch}\,M \subseteq \mathcal{F}(M)$.
(4) If M is normal, then $\mathcal{F}(M) = \mathrm{ch}\,M$.
PROBLEMS
1. Show that the field of values of a 2 x 2 matrix M is either an ellipse (circle), a straight line segment, or a single point. More specifically, by Schur's theorem**, if one reduces M unitarily to upper triangular form,

$U^*MU = \begin{pmatrix} \lambda_1 & m \\ 0 & \lambda_2 \end{pmatrix}$, U unitary,

then
(a) M is not normal if and only if $m \neq 0$.
(a') $\lambda_1 \neq \lambda_2$. $\mathcal{F}(M)$ is the interior and boundary of an ellipse with foci at $\lambda_1$, $\lambda_2$; length of minor axis is $|m|$. Length of major axis $(|m|^2 + |\lambda_1 - \lambda_2|^2)^{1/2}$.
(a'') $\lambda_1 = \lambda_2$. $\mathcal{F}(M)$ is the disk with center at $\lambda_1$ and radius $|m|/2$.
(b) M is normal (m = 0).
(b') $\lambda_1 \neq \lambda_2$. $\mathcal{F}(M)$ is the line segment joining $\lambda_1$ and $\lambda_2$.
(b'') $\lambda_1 = \lambda_2$. $\mathcal{F}(M)$ is the single point $\lambda_1$.
REFERENCES
General: Aitken, [1]; Barnett and Story; Bellman, [2];
Browne; Eisele and Mason; Forsythe and Moler; Gantmacher;
Lancaster, [1]; MacDuffee; Marcus; Marcus and
Minc; Muir and Metzler; Newman; M. Pearl; Pullman;
Suprunenko and Tyshkevich; Todd; Turnbull and Aitken.
Vandermonde matrices: Gautschi.
Discrete Fourier transforms: Aho, Hopcroft and Ullman;
Carlitz; Davis and Rabinowitz; Fiduccia; Flinn and
McCowan; Harmuth; Nussbaumer; Winograd; J. Pearl.
**Any square matrix is unitarily similar to an upper
triangular matrix.
Hadamard matrices: Ahmed and Rao; Hall; Harmuth;
Wallis, Street, and Wallis.
Generalized inverses:
Ben-Israel and Greville; Meyer.
UDV theorem: Ben-Israel and Greville; Forsythe and
Moler; Golub and Reinsch (numerical methods).
CIRCULANT MATRICES
3.1 INTRODUCTORY PROPERTIES
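By a circulant matrix of order n, or circulant for short, is meant a square matrix of the form

$C = \mathrm{circ}(c_1, c_2, \ldots, c_n) = \begin{pmatrix} c_1 & c_2 & \cdots & c_n \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & \vdots & & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{pmatrix}.$

The elements of each row of C are identical to those of the previous row, but are moved one position to the right and wrapped around. The whole circulant is evidently determined by the first row (or column). We may also write a circulant in the form

$C = (c_{jk}) = (c_{k-j+1})$, subscripts mod n.

Notice that

$\alpha\,\mathrm{circ}(a_1, \ldots, a_n) + \beta\,\mathrm{circ}(b_1, \ldots, b_n) = \mathrm{circ}(\alpha a_1 + \beta b_1, \ldots, \alpha a_n + \beta b_n),$

so that the circulants form a linear subspace of the set of all matrices of order n. However, as we shall see subsequently, they possess a structure far richer.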
Theorem 3.1.1. Let A be n x n. Then A is a circulant if and only if

(3.1.3) $A\pi = \pi A.$

The matrix $\pi = \mathrm{circ}(0, 1, 0, \ldots, 0)$. See (2.4.14).
Proof. Write $A = (a_{ij})$ and let the permutation $\sigma$ be the cycle $\sigma = (1, 2, \ldots, n)$. Then from (2.4.13)

$P_\sigma AP_\sigma^* = (a_{\sigma(i),\sigma(j)})$

where, in the present instance, $P_\sigma = \pi$. But A is evidently a circulant if and only if $a_{ij} = a_{\sigma(i),\sigma(j)}$, that is, if and only if $\pi A\pi^* = A$. This is equivalent to (3.1.3) by (2.4.17).
We may express this as follows: the circulants comprise all the (square) matrices that commute with $\pi$, or are invariant under the similarity $A \to \pi A\pi^{-1}$.
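The commutation criterion is easy to try on small matrices. The following sketch is only an illustration: the helpers `circ` and `matmul` and the sample matrices are assumptions made for the example. It builds the basic circulant, a circulant C, and a symmetric Toeplitz matrix T that is not a circulant, and tests each against the criterion.

```python
# Test "A is a circulant iff A commutes with the basic circulant" on two 4 x 4 matrices.

def circ(row):
    """Circulant with the given first row: entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    """Plain product of two square matrices of the same order."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

pi = circ([0, 1, 0, 0])        # the basic circulant of order 4
C = circ([5, -1, 2, 7])        # a circulant (sample entries)
T = [[1, 2, 3, 4],             # Toeplitz but not a circulant
     [2, 1, 2, 3],
     [3, 2, 1, 2],
     [4, 3, 2, 1]]

commutes_C = matmul(C, pi) == matmul(pi, C)
commutes_T = matmul(T, pi) == matmul(pi, T)
```

Here `commutes_C` holds and `commutes_T` fails, in line with the theorem.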
Corollary. A is a circulant if and only if A* is a
circulant.
Proof. Star (3.1.3).
PROBLEMS
1. What are the conditions on $c_j$ in order that $\mathrm{circ}(c_1, c_2, \ldots, c_n)$ be symmetric? Be Hermitian symmetric? Be skew-symmetric? Be diagonal?
2. Call a square matrix A a magic square if its row sums, column sums, and principal diagonal sums are all equal. What are the conditions on $c_j$ in order that $\mathrm{circ}(c_1, c_2, \ldots, c_n)$ be a magic square?
3. Prove that circ(1, 1, 1, -1) is an Hadamard matrix. It has been conjectured that there are no other circulants that are Hadamard matrices. This has been proved for orders $\le$ 12,100. (Best result as of 1978.)
A Second Representation of Circulants. In view of the structure of the permutation matrices $\pi^k$, k = 0, 1, ..., n-1, it is clear that

(3.1.4) $\mathrm{circ}(c_1, c_2, \ldots, c_n) = c_1I + c_2\pi + c_3\pi^2 + \cdots + c_n\pi^{n-1}.$

Thus, from (3.1.2), C is a circulant if and only if $C = p(\pi)$ for some polynomial p(z). Associate with the n-tuple $\gamma = (c_1, c_2, \ldots, c_n)$ the polynomial

(3.1.5) $p_\gamma(z) = c_1 + c_2z + \cdots + c_nz^{n-1}.$

The polynomial $p_\gamma(z)$ will be called the representer of the circulant. The association $\gamma \leftrightarrow p_\gamma(z)$ is obviously linear. (Note: In the literature of signal processing the association $\gamma \leftrightarrow p_\gamma(1/z)$ is known as the z-transform.) The function

(3.1.5') $\phi(\theta) = \phi_\gamma(\theta) = c_1 + c_2e^{i\theta} + \cdots + c_ne^{i(n-1)\theta}$

is also useful as a representer. Thus,

(3.1.6) $C = \mathrm{circ}\,\gamma = p_\gamma(\pi).$

Inasmuch as polynomials in the same matrix commute, it follows that all circulants of the same order commute. If C is a circulant so is $C^*$. Hence C and $C^*$ commute and therefore all circulants are normal matrices.
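The representation of a circulant as a polynomial in the basic circulant, and the commutativity that follows from it, can be sketched as below. The function names (`circ`, `matmul`, `poly_in_pi`) and the sample first rows are hypothetical, chosen only for illustration.

```python
# Build c1*I + c2*pi + ... + cn*pi^(n-1) and compare it with circ(c1, ..., cn).

def circ(row):
    """Circulant with the given first row."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_in_pi(row):
    """Evaluate the representer of `row` at the basic circulant (order n >= 2)."""
    n = len(row)
    pi = circ([0, 1] + [0] * (n - 2))
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # pi^0 = I
    total = [[0] * n for _ in range(n)]
    for c in row:
        total = [[total[i][j] + c * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, pi)                                # next power of pi
    return total

a = [2, 0, -1, 3]     # sample first rows (assumed data)
b = [1, 4, 0, -2]
A, B = circ(a), circ(b)

same_as_circ = poly_in_pi(a) == A          # polynomial form reproduces circ(a)
AB = matmul(A, B)
commute = AB == matmul(B, A)               # circulants of the same order commute
product_is_circulant = AB == circ(AB[0])   # the product is determined by its first row
```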
PROBLEMS
1. Using the criterion (3.1.3), prove that if A and
B are circulants, then AB is a circulant.
2. Prove that if A is a circulant and k is a nonnegative integer, then $A^k$ is a circulant. If A is nonsingular, then this holds when k is a negative integer.
3. A square matrix A is called a "left circulant" or a (-1)-circulant if its rows are obtained from the first row by successive shifts to the left of one position. Prove that A is a left circulant if and only if $A = \pi A\pi$ (see Section 5.1).
4. A generalized permutation matrix is a square matrix with precisely one nonzero element in each row and column. That nonzero element must be +1 or -1. How many generalized permutation matrices of order n are there?
5. Let C be a circulant with integer elements. Suppose that $CC^T = I$. Prove that C is a generalized permutation matrix.
6. Prove that a circulant is symmetric about its main counterdiagonal.
7. Let $C = \mathrm{circ}(a_1, a_2, \ldots, a_n)$. Then, for integer m, $\pi^mC = \mathrm{circ}(a_{1-m}, a_{2-m}, \ldots, a_{n-m})$. Subscripts mod n.
8. By a semicirculant of order n is meant a matrix
of the form
Introduce the matrix
Show that E is nilpotent. Show that C is a
semicirculant if and only if it is of the form
C = p(E) for some polynomial p(z).
9. Prove that if (d, n) = (greatest common divisor of d and n) = 1, then C is a circulant if and only if it commutes with $\pi^d$. Hence, in particular, if and only if it commutes with $\pi^*$.
10. Let K[w] designate the ring of polynomials in w of degree < n and with complex coefficients. In K[w] the usual rules of polynomial addition and multiplication are to hold, but higher powers are to be replaced by lower powers using $w^n = 1$. Prove that the mapping $\mathrm{circ}(c_1, c_2, \ldots, c_n) \leftrightarrow c_1 + c_2w + \cdots + c_nw^{n-1}$ [or $\mathrm{circ}\,\gamma \leftrightarrow p_\gamma(w)$] is a ring isomorphism:
(a) If $\alpha$ is a scalar, $\alpha\,\mathrm{circ}\,\gamma \leftrightarrow \alpha p_\gamma(w)$.
(b) $\mathrm{circ}\,\gamma_1 + \mathrm{circ}\,\gamma_2 \leftrightarrow p_{\gamma_1}(w) + p_{\gamma_2}(w)$.
(c) $(\mathrm{circ}\,\gamma_1)(\mathrm{circ}\,\gamma_2) \leftrightarrow p_{\gamma_1}(w)\,p_{\gamma_2}(w)$.
11. Let $\mathrm{circ}\,\gamma \leftrightarrow p_\gamma(w)$. Then $(\mathrm{circ}\,\gamma)^T \leftrightarrow w^np_\gamma(w^{-1})$.
Block Decomposition of Circulants; Toeplitz Matrices. The square matrix $T = (t_{ij})$ of order n is said to be Toeplitz if

(3.1.7) $t_{ij} = t_{i+1,j+1}$, i, j = 1, 2, ..., n - 1.

Thus Toeplitz matrices are those that are constant along all diagonals parallel to the principal diagonal.
Example.
(4
It is clear that the Toeplitz matrices of order
n form a linear subspace of dimension 2n - 1 of the
space of all matrices of order n. It is clear, fur-
thermore, that a circulant is Toeplitz but not neces-
sarily conversely.
A circulant C of composite order n = pq is auto-
matically a block circulant in which each block is
Toeplitz. The blocks are of order q, and the arrange-
ment of blocks is p x p.
Example. The circulant of order 6 may be broken up
into 3 x 3 blocks of order 2 as follows:
where
It may also be broken up into 2 x 2 blocks each of
order 3.
A block circulant is not necessarily a circulant.
This circulant may also be written in the form
Quite generally, if C is a circulant of order n = pq,
then
where $I_p$, $\pi_p$ are of order p and where the $A_j$ are Toeplitz of order q.
A general Toeplitz matrix T of order n may be embedded in a circulant of order 2n. See also Chapter 5.
3.2 DIAGONALIZATION OF CIRCULANTS
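This will follow readily from the diagonalization of the basic circulant $\pi$.
Definition. Let n be a fixed integer $\ge 1$. Let $\omega = \exp(2\pi i/n) = \cos(2\pi/n) + i\sin(2\pi/n)$, $i = \sqrt{-1}$. Let

(3.2.1) $\Omega = (\Omega_n) = \mathrm{diag}(1, \omega, \omega^2, \ldots, \omega^{n-1}).$

Note that $\Omega^k = \mathrm{diag}(1, \omega^k, \omega^{2k}, \ldots, \omega^{(n-1)k})$.
Theorem 3.2.1.

(3.2.2) $\pi = F^*\Omega F.$

Proof. From (2.5.3), the jth row of $F^*$ is $(1/\sqrt{n})(\omega^{(j-1)0}, \omega^{(j-1)1}, \ldots, \omega^{(j-1)(n-1)})$. Hence the jth row of $F^*\Omega$ is $(1/\sqrt{n})(\omega^{(j-1)r}\omega^r) = (1/\sqrt{n})(\omega^{jr})$, r = 0, 1, ..., n - 1. The kth column of F is $(1/\sqrt{n})(\bar{\omega}^{(k-1)r})$, r = 0, 1, ..., n - 1. Thus the (j, k)th element of $F^*\Omega F$ is

$\frac{1}{n}\sum_{r=0}^{n-1}\omega^{jr}\bar{\omega}^{(k-1)r} = \frac{1}{n}\sum_{r=0}^{n-1}\omega^{r(j-k+1)}$, which equals 1 if $j \equiv k - 1 \pmod{n}$ and 0 otherwise.

Then (3.2.2) follows.
Now

(3.2.3) $C = \mathrm{circ}\,\gamma = p_\gamma(\pi) = p_\gamma(F^*\Omega F) = F^*p_\gamma(\Omega)F.$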
Thus we arrive at the fundamental
Theorem 3.2.2. If C is a circulant, it is diagonalized by F. More precisely,

(3.2.4) $C = F^*\Lambda F$

where

(3.2.5) $\Lambda = \Lambda_\gamma = \mathrm{diag}(p_\gamma(1), p_\gamma(\omega), \ldots, p_\gamma(\omega^{n-1})).$

The eigenvalues of C are therefore

(3.2.6) $\lambda_j = p_\gamma(\omega^{j-1})$, j = 1, 2, ..., n.

(Note: The eigenvalues need not be distinct.)
The columns of $F^*$ are a universal set of (right) eigenvectors for all circulants. They may be written as $F^*(0, \ldots, 0, 1, 0, \ldots, 0)^T$.
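A small numerical sketch of the fundamental theorem follows. It evaluates the representer at the nth roots of unity and checks that each column of the conjugate-transposed Fourier matrix is an eigenvector. The function names and the sample first row are assumptions made for this illustration, and indices in the code run from 0 rather than 1.

```python
import cmath

def circ(row):
    """Circulant with the given first row."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def eigenvalues(row):
    """Values of the representer at 1, w, ..., w^(n-1), with w = exp(2*pi*i/n)."""
    n = len(row)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c * w ** (j * k) for k, c in enumerate(row)) for j in range(n)]

def fourier_column(n, j):
    """Column j (0-based) of F*: entries w^(j*k)/sqrt(n), k = 0, ..., n-1."""
    w = cmath.exp(2j * cmath.pi / n)
    return [w ** (j * k) / n ** 0.5 for k in range(n)]

row = [1, 2, 4]                      # sample first row (assumed data)
n = len(row)
C = circ(row)
lams = eigenvalues(row)

# Largest entry of C v - lambda v for each eigenpair; should be at rounding level.
residuals = []
for j, lam in enumerate(lams):
    v = fourier_column(n, j)
    Cv = [sum(C[i][k] * v[k] for k in range(n)) for i in range(n)]
    residuals.append(max(abs(Cv[i] - lam * v[i]) for i in range(n)))
```

The same eigenvectors serve for every first row, which is what makes them "universal"; only the eigenvalues change.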
We have conversely
Theorem 3.2.3. Let $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$; then $C = F^*\Lambda F$ is a circulant.
Proof. By the fundamental theorem of polynomial interpolation, we can find a unique polynomial r(z) of degree $\le n - 1$, $r(z) = d_1 + d_2z + \cdots + d_nz^{n-1}$, such that $r(\omega^{j-1}) = \lambda_j$, j = 1, 2, ..., n. Now, form $D = \mathrm{circ}(d_1, d_2, \ldots, d_n)$. It follows that $D = F^*\Lambda F = C$, so that C is a circulant.
With regard to the diagonalization (3.2.4), it should be observed that there is really no "natural" order for the eigenvalues of a matrix. Corresponding to every permutation of eigenvalues, there will be a unitary matrix $\hat{F}$ for which a formula analogous to (3.2.4) will be valid.
More precisely, let $C = F^*\Lambda F$ and let $P_\sigma$ be the permutation matrix corresponding to the permutation $\sigma$. Then $C = F^*(P_\sigma^*P_\sigma)\Lambda(P_\sigma^*P_\sigma)F = (F^*P_\sigma^*)(P_\sigma\Lambda P_\sigma^*)(P_\sigma F)$. Now if $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $L = (\lambda_1, \ldots, \lambda_n)^T$, then from (2.4.19), $P_\sigma\Lambda P_\sigma^* = \mathrm{diag}(P_\sigma L)$. If we now let $\hat{F}$ be the unitary matrix $\hat{F} = P_\sigma F$, we have

$C = \hat{F}^*\,\mathrm{diag}(P_\sigma L)\,\hat{F}.$

We have found it to be convenient to standardize
the order of the eigenvalues in the way we have done,
leading to (3.2.4).
Let us exhibit the solution of this interpolation problem more explicitly. Write
$L = (\lambda_1, \ldots, \lambda_n)^T$ and $\gamma^T = (c_1, \ldots, c_n)^T$.
Then, from (25.11 and (2.5.14).
(3.2.7) $\gamma^T = n^{-1/2}FL$
and
(3.2.8) $p_\gamma(z) = n^{-1/2}(1, z, \ldots, z^{n-1})FL.$
Also,
(3.2.9) $\Lambda = n^{1/2}\,\mathrm{diag}(F^*\gamma^T).$
Since $F^2 = \Gamma$ and $FF^* = I$, one also has the identity
(3.2.10) $F\gamma^T = F^2(F^*\gamma^T) = n^{-1/2}\Gamma L.$
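The passage between the first row and the eigenvalues can be sketched as a pair of discrete Fourier sums. In the illustration below the two function names are hypothetical; the normalization is folded into a single division by n, which is the combined effect of the two factors of $n^{1/2}$.

```python
import cmath

def eigenvalues_from_first_row(row):
    """lambda_j = sum_k c_k w^(j*k), w = exp(2*pi*i/n); indices start at 0 here."""
    n = len(row)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c * w ** (j * k) for k, c in enumerate(row)) for j in range(n)]

def first_row_from_eigenvalues(lams):
    """Inverse map: c_j = (1/n) sum_k lambda_k conj(w)^(j*k)."""
    n = len(lams)
    w_bar = cmath.exp(-2j * cmath.pi / n)
    return [sum(lam * w_bar ** (j * k) for k, lam in enumerate(lams)) / n
            for j in range(n)]

row = [3, -1, 0, 2]                       # sample first row (assumed data)
lams = eigenvalues_from_first_row(row)
recovered = first_row_from_eigenvalues(lams)

# Any prescribed eigenvalue list yields a first row, hence a circulant.
row_for_projection = first_row_from_eigenvalues([1, 0, 0, 0])
```

`recovered` reproduces `row` up to rounding, and `row_for_projection` is the constant row 1/4, the first row of the circulant whose only nonzero eigenvalue is the first.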
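On the basis of the fundamental representation (3.2.4), it is now easy to establish that
Theorem 3.2.4. If A and B are circulants of order n and $\alpha_k$ are scalars, then $A^T$, $A^*$, $\alpha_1A + \alpha_2B$, AB, $\sum_k\alpha_kA^k$ are circulants. Moreover, A and B commute. If A is nonsingular, its inverse is a circulant. With $A = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, its inverse is given by

$A^{-1} = F^*\Lambda^{-1}F$

where

$\Lambda^{-1} = \mathrm{diag}(\lambda_1^{-1}, \ldots, \lambda_n^{-1}).$

Since

$(\mathrm{circ}(c_1, c_2, \ldots, c_n))^T = \mathrm{circ}(c_1, c_n, c_{n-1}, \ldots, c_2) = \mathrm{circ}(\Gamma(c_1, c_2, \ldots, c_n)^T),$

if we write $\gamma = (c_1, c_2, \ldots, c_n)$, we have

$(\mathrm{circ}\,\gamma)^T = \mathrm{circ}(\Gamma\gamma^T).$
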
The determinant of a square matrix is the product of its eigenvalues. Therefore from (3.2.6),

(3.2.14) $\det(\mathrm{circ}\,\gamma) = \det\mathrm{circ}(c_1, c_2, \ldots, c_n) = \prod_{j=1}^{n}p_\gamma(\omega^{j-1}).$

If

$f(z) = a_0z^m + a_1z^{m-1} + \cdots + a_m$, $a_0 \neq 0$,
$g(z) = b_0z^n + b_1z^{n-1} + \cdots + b_n$, $b_0 \neq 0$,

and have roots $\alpha_1, \ldots, \alpha_m$; $\beta_1, \ldots, \beta_n$ respectively, the resultant R(f, g) of f and g is defined by

$R(f, g) = a_0^n\,g(\alpha_1)g(\alpha_2)\cdots g(\alpha_m) = (-1)^{mn}b_0^m\,f(\beta_1)f(\beta_2)\cdots f(\beta_n) = (-1)^{mn}R(g, f).$

Thus, with $f(z) = z^n - 1$, $g(z) = p_\gamma(z)$ of degree n - 1 (so that $c_n \neq 0$), we have

$\det\mathrm{circ}\,\gamma = R(z^n - 1, p_\gamma(z)) = c_n^n\prod_{k=1}^{n-1}(\mu_k^n - 1),$

where $\mu_1, \ldots, \mu_{n-1}$ are the roots of $p_\gamma(z)$.
In this way, det circ is expressed as the resultant of the two polynomials $z^n - 1$ and $p_\gamma(z)$.
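In the case of real elements, the representation (3.2.14) may be simplified somewhat. Let $\gamma = (c_1, c_2, \ldots, c_n)$, $p_\gamma(z) = c_1 + c_2z + \cdots + c_nz^{n-1}$, $\omega = \exp(2\pi i/n)$. Then

$\bar{\omega}^j = \omega^{n-j}$

and therefore, with c's real,

$\overline{p_\gamma(\omega^j)} = p_\gamma(\bar{\omega}^j) = p_\gamma(\omega^{n-j}).$

If now n = 2r + 1 = odd, then

$\det\mathrm{circ}\,\gamma = \prod_{j=0}^{n-1}p_\gamma(\omega^j) = p_\gamma(1)\prod_{j=1}^{r}|p_\gamma(\omega^j)|^2.$

If n = 2r + 2 = even,

$\det\mathrm{circ}\,\gamma = p_\gamma(1)\,p_\gamma(-1)\prod_{j=1}^{r}|p_\gamma(\omega^j)|^2.$
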
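Corollary. Let $\gamma = (c_1, c_2, \ldots, c_n)$ have real components. If n is odd, then $\sum_{j=1}^{n}c_j \ge 0$ implies det circ $\gamma \ge 0$. If n is even and n = 2r + 2, then $p_\gamma(1)\,p_\gamma(-1) \ge 0$ implies det circ $\gamma \ge 0$.
Proof. We have $p_\gamma(1) = \sum_{j=1}^{n}c_j$ and $p_\gamma(-1) = -\sum_{j=1}^{n}(-1)^jc_j$. Since $|p_\gamma(\omega^j)|^2 \ge 0$, the odd case is immediate. For the even case, note that det circ $\gamma = p_\gamma(1)\,p_\gamma(-1)\prod_{j=1}^{r}|p_\gamma(\omega^j)|^2$.
Conditions for det circ $\gamma > 0$ or for det circ $\gamma < 0$ are easily formulated.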
A square matrix is called nondefective or simple if the multiplicity of each of its distinct eigenvalues equals its geometric multiplicity. By geometric multiplicity of an eigenvalue is meant the maximal number of linearly independent (right) eigenvectors associated with that eigenvalue. A matrix is simple, therefore, if and only if its right eigenvectors span $\mathbb{C}^n$. Equivalently, a matrix is simple if and only if it is diagonalizable.
It follows from Theorem 3.2.2 that all circulants are
simple.
As we have seen, all circulants are diagonalized
by the Fourier matrix, and the Fourier matrix is a
particular instance of a Vandermonde matrix. It is
therefore of interest to ask: what are the matrices
that are diagonalized by Vandermonde matrices?
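Toward this end, we recall the following definition. Let

(3.2.15) $\phi(x) = x^n - a_{n-1}x^{n-1} - a_{n-2}x^{n-2} - \cdots - a_1x - a_0$

be a monic polynomial of degree n. The companion matrix of $\phi$, $C_\phi$, is defined by

(3.2.16) $C_\phi = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_0 & a_1 & a_2 & \cdots & a_{n-1} \end{pmatrix}.$

It is well known and easily verified that the characteristic polynomial of $C_\phi$ is precisely $\phi(x)$. Hence, if $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are the eigenvalues of $C_\phi$ we have

(3.2.17) $\phi(\alpha_j) = 0$, j = 0, 1, ..., n - 1.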
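Theorem 3.2.5. Let $V = V(\alpha_0, \alpha_1, \ldots, \alpha_{n-1})$ designate the Vandermonde formed with $\alpha_0, \ldots, \alpha_{n-1}$ [see (2.5.12)]. Let $D = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_{n-1})$. Then

(3.2.18) $VD = C_\phi V.$

If the $\alpha_j$ are distinct, V is nonsingular, which gives us the diagonalization

(3.2.19) $C_\phi = VDV^{-1}.$

Hence, for any polynomial p(z),

(3.2.20) $p(C_\phi) = V\,p(D)\,V^{-1} = V\,\mathrm{diag}(p(\alpha_0), \ldots, p(\alpha_{n-1}))\,V^{-1}.$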
Proof. A direct computation shows that the first n - 1 rows of VD and of $C_\phi V$ are identical. Now the element in the (n, j) position of VD computes out to be $\alpha_{j-1}^n$. The element in the (n, j) position of $C_\phi V$ computes out to be

$a_0 + a_1\alpha_{j-1} + a_2\alpha_{j-1}^2 + \cdots + a_{n-1}\alpha_{j-1}^{n-1}.$

By (3.2.15) this is $\alpha_{j-1}^n - \phi(\alpha_{j-1})$, and by (3.2.17) this reduces to $\alpha_{j-1}^n$. Therefore $VD = C_\phi V$.
Since $\det V = \prod_{i<j}(\alpha_j - \alpha_i)$, it follows that V is nonsingular if and only if the $\alpha_j$ are distinct. In this case we can arrive at (3.2.19).
Example. If we select $\phi(x) = x^n - 1$, then $C_\phi = \pi$. The roots of $\phi$ are $\omega^j$, j = 0, 1, ..., n-1 and V is a scaled version of $F^*$. Since all polynomials in $C_\phi = \pi$ are circulants and vice versa, (3.2.20) reduces to (3.2.4).
Let us note another consequence of (3.2.2) which is of interest.
Let P (= $P_\sigma$) be the permutation matrix corresponding to the permutation $\sigma$. From (2.4.11) we know that $PP^* = P^*P = I$, so that P is unitary and normal. It follows from general theory that P is unitarily diagonalizable. It is often useful to be able to exhibit this diagonalization explicitly.
In Section 2.4, we arrived at the following identity. Let $\sigma$ be factored into the product of disjoint cycles of lengths $p_1, p_2, \ldots, p_m$. Then, by (2.4.25), there is a permutation matrix R such that

$RPR^* = \pi_{p_1} \oplus \pi_{p_2} \oplus \cdots \oplus \pi_{p_m}.$

From (3.2.2),

$\pi_{p_j} = F_{p_j}^*\Omega_{p_j}F_{p_j}$, j = 1, 2, ..., m,

where $F_{p_j}$ and $\Omega_{p_j}$ are the Fourier and $\Omega$ matrices of order $p_j$. Thus if we set

(3.2.21) $U = F_{p_1} \oplus F_{p_2} \oplus \cdots \oplus F_{p_m}$, $\Lambda = \Omega_{p_1} \oplus \Omega_{p_2} \oplus \cdots \oplus \Omega_{p_m}$,

we have

$RPR^* = U^*\Lambda U,$

so that

$P = R^*U^*\Lambda UR = (UR)^*\Lambda(UR).$

Observe that $\Lambda$ is diagonal and U, and hence UR are unitary.
PROBLEMS
1. If A and B are square and AB is a circulant, are A and B circulants?
2. If is a circulant, is A a circulant?
3. Diagonalize J = circ(1, 1, ..., 1).
4. Diagonalize circ(a, a + h, a + 2h, ..., a + (n - 1)h). Find its determinant.
5. Diagonalize $\mathrm{circ}(a, ah, ah^2, \ldots, ah^{n-1})$. Find its determinant.
6. Diagonalize circ(1, 3, 6, 10, ..., n(n + 1)/2).
7. Diagonalize A = pI + qJ. J is as in Problem 3. Find det A.
8. In Problem 7, prove that if p > 0 and p + nq > 0,
A is positive definite symmetric.
9. Diagonalize circ(1, s, 0, 0, ..., 0, S).
10. Let C be a circulant with eigenvalues $\lambda_k$. Show that $C^T = F^*\mathrm{diag}(\lambda_1, \lambda_n, \lambda_{n-1}, \ldots, \lambda_2)F$.
11. Diagonalize the checkerboard circulant
circ(0l 01 01 ... 01).
12. Diagonalize circ(001 001 001).
13. Diagonalize circ(0, 1/2, 0, 0, ..., 0, 1/2) = $\tfrac{1}{2}(\pi + \pi^*)$. (Random walk on a circle. One-dimensional lattice.)
14. Analyze circ(0, p, 0, ..., 0, q), p + q = 1.
15.
Prove that a circulant C is real and has eigen-
values A . if and only if A . = Xn+l-j,
j =
I I
1, 2, ..., n.
Let
1
G2 =
circ(5, 1 + fi, -1, 1 - A, 1, 1 - fi,
-1, 1 + a) .
Show that $G_2$ and $G_3$ are symmetric circulants and that $G_2G_3 = G_3G_2 = G_2$.
Let A, B be circulants of order n with eigenvalues $\lambda_{A,j}$, $\lambda_{B,j}$, j = 1, 2, ..., n. Prove that AB = A if and only if $\lambda_{B,j} = 1$ whenever $\lambda_{A,j} \neq 0$.
Prove that a circulant is Hermitian if and only if its eigenvalues are real.
Prove that a circulant is unitary if and only if its eigenvalues lie on the unit circle.
Prove that a circulant is Hermitian positive definite if and only if its eigenvalues are positive.
Prove that $\mathrm{circ}(c_1, c_2, \ldots, c_n)$ has all row and column sums equal to a if and only if $\sum_{k=1}^{n}c_k = a$.
Prove that if A is normal and has all row sums equal to a, then all column sums equal a.
Prove that A is normal if and only if there exists a unitary U and a circulant C such that $A = U^*CU$. In other words, A is normal if and only if it is the unitary transform of a circulant.
A matrix M is said to be periodic if there exists $p \ge 1$ such that $M^p = I$. Find all the circulants of order n that satisfy this equation.
Prove that $\det\mathrm{circ}(x, 1, 1, 1, 1) = (x + 4)(x - 1)^4$.
Prove that $\det\mathrm{circ}(a_1, a_2, a_3, 0, 0, \ldots, 0) = a_1^n + a_3^n - \zeta_1^n - \zeta_2^n$ where $\zeta_1$ and $\zeta_2$ are the roots of $x^2 + a_2x + a_1a_3 = 0$.
Prove that $\det\mathrm{circ}(a, a, \ldots, a; b, b, \ldots, b)$ equals $(ma + nb)(a - b)^{m+n-1}$ if (m, n) = 1, and equals 0 if (m, n) > 1. Here m = number of a's, n = number of b's, and (m, n) = greatest common divisor of m and n.
Prove that
$\det\mathrm{circ}(1, a, a^2, \ldots, a^{r-1}, 0, 0, \ldots, 0)$
(O. Ore.)
Prove that
$\det\mathrm{circ}(a_0, a_1, a_2, 0, 0, \ldots, 0) = a_0^n + a_2^n - \sum_{0 \le s \le n/2}(-1)^{n+s}\frac{n}{n-s}\binom{n-s}{s}(a_0a_2)^sa_1^{n-2s}.$
(O. Ore.)
The matrix circ(1, -2, 1, 0, 0, ..., 0) occurs in
the theory of morphogenesis (diffusion on a
circle). Diagonalize it. Generalize; for example, circ(1, -3, 3, -1, 0, ..., 0), circ(1, -4, 6, -4, 1, 0, 0, ..., 0).
Let $c_2 = c_{n+1} = c_{N-n+1} = c_N = 1$. All other c's = 0. Find the eigenvalues of $\mathrm{circ}(c_1, c_2, \ldots, c_N)$. (Two-dimensional lattice.)
Let p(z) be the representer of the circulant C. Prove that C is idempotent ($C^2 = C$) if and only if $p(\omega^j) = 0$ or 1 for j = 0, 1, ..., n-1.
If A is square, of order n, define per(A) as the
determinantal expansion of A in n! terms where
all the minus signs have been changed to plus.
For example, $\mathrm{per}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad + bc$; per(A) is called the permanent of A. Let $D_n = \mathrm{per}(J - I)$ with J as in Problem 3. Prove that
(For this and applications of circulants to
combinatorial problems, see Minc.)
3.2.1 Skew Circulants
A skew circulant matrix is a circulant followed by a
change in sign to all the elements below the main
diagonal.
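Example

(3.2.1.1) $\mathrm{scirc}(a, b, c, d) = \begin{pmatrix} a & b & c & d \\ -d & a & b & c \\ -c & -d & a & b \\ -b & -c & -d & a \end{pmatrix}.$

In the same way that the theory of circulants is related to the matrix $\pi$, the theory of skew circulants is related to the matrix

$\eta = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -1 & 0 & 0 & \cdots & 0 \end{pmatrix}.$

The main development of the theory is given in the next group of problems, and the solutions can be carried out along the lines already indicated for circulants. Skew circulants have also been called negacyclic matrices.
The notion can be extended somewhat by using the matrix

$\eta_k = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ k & 0 & 0 & \cdots & 0 \end{pmatrix}$

where |k| = 1. A {k}-circulant is one which commutes with $\eta_k$. For k = 1, k = -1 we obtain the circulants and skew circulants respectively. Representations analogous to those given in the Problems are valid.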
PROBLEMS
3. A is a skew circulant if and only if $A\eta = \eta A$.
4. The characteristic polynomial of $\eta$ is $(-1)^n(\lambda^n + 1)$, and its eigenvalues are $\sigma, \sigma\omega, \sigma\omega^2, \ldots, \sigma\omega^{n-1}$ where
$\sigma = \cos\frac{\pi}{n} + i\sin\frac{\pi}{n}$, $\omega = \sigma^2 = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$.
Note that $\bar{\sigma} = \sigma^{-1}$.
5. The eigenvectors of $\eta$ corresponding to these roots are $(1, \sigma, \sigma^2, \ldots, \sigma^{n-1})^T$, $(1, \sigma\omega, (\sigma\omega)^2, \ldots, (\sigma\omega)^{n-1})^T$, $(1, \sigma\omega^2, (\sigma\omega^2)^2, \ldots, (\sigma\omega^2)^{n-1})^T$, ..., $(1, \sigma\omega^{n-1}, (\sigma\omega^{n-1})^2, \ldots, (\sigma\omega^{n-1})^{n-1})^T$.
6. The eigenvalues of $\mathrm{scirc}(a_1, a_2, \ldots, a_n)$ are $p(\sigma\omega^j)$, j = 0, 1, ..., n - 1, where $p(z) = a_1 + a_2z + a_3z^2 + \cdots + a_nz^{n-1}$.
7. Define $\Omega^{1/2} = \mathrm{diag}(1, \sigma, \sigma^2, \ldots, \sigma^{n-1})$, $\Omega = \mathrm{diag}(1, \omega, \omega^2, \ldots, \omega^{n-1})$. $\Omega^{1/2}$ and $\Omega$ are unitary. Moreover, $\eta = (F\Omega^{-1/2})^*(\sigma\Omega)(F\Omega^{-1/2})$.
8. S is a skew circulant if and only if it is of the form $S = (F\Omega^{-1/2})^*\Lambda(F\Omega^{-1/2})$ where $\Lambda$ is diagonal.
9. S is a skew circulant if and only if it is of the form $S = \Omega^{1/2}C\,\Omega^{-1/2}$, where C is a circulant.
11. If S, V are skew circulants and q(z) is a polynomial in z, then $S^T$, $S^*$, SV, q(S), $S^{\div}$ (cf. Theorem 3.3.1), $S^{-1}$ (if it exists) are skew circulants. Moreover, S and V commute.
3.3. MULTIPLICATION AND INVERSION OF CIRCULANTS
Since a circulant is determined by its first row, it
is really a "one-dimensional" rather than a "two-
dimensional'' object. The product of two circulants
is itself a circulant, so that a good fraction of the
arithmetic normally carried out in matrix multiplica-
tion is redundant. For circulants of low order,
multiplication can be performed with pencil and paper
using the abbreviated scheme sketched below.
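Product of two circulants: circ(1, 2, 4) circ(4, 5, 6) = circ(36, 37, 32).
Abridged multiplication: only the first row of the product need be formed. The first row (1, 2, 4) is multiplied into the three columns of circ(4, 5, 6), giving 1·4 + 2·6 + 4·5 = 36, 1·5 + 2·4 + 4·6 = 37, 1·6 + 2·5 + 4·4 = 32.
It is seen from this that the multiplication of two circulants of order n can be carried out in at most $n^2$ multiplications and n(n - 1) additions.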
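The abridged scheme amounts to a cyclic convolution of the two first rows. A minimal sketch, with the hypothetical helper name `circ_product_row`, is:

```python
# First row of circ(a) * circ(b) via cyclic convolution: n*n multiplications.

def circ_product_row(a, b):
    """Return the first row of the product of circ(a) and circ(b)."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]

gamma = circ_product_row([1, 2, 4], [4, 5, 6])
print(gamma)  # [36, 37, 32]
```

Each of the n entries costs n multiplications and n - 1 additions, which matches the operation count stated above.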
However, using fast Fourier transform techniques, the order of magnitude $n^2$ may be improved to O(n log n).
Recall the relationship between the first row $\gamma$ of a circulant $C = \mathrm{circ}\,\gamma = \mathrm{circ}(c_1, c_2, \ldots, c_n)$ and its eigenvalues $\lambda_1, \ldots, \lambda_n$. From (3.2.7) we have

(3.3.1) $n^{1/2}F^*\gamma^T = (\lambda_1, \ldots, \lambda_n)^T.$

Now let A have first row $\alpha$ and eigenvalues $\lambda_{A,1}, \ldots, \lambda_{A,n}$ and B have first row $\beta$ and eigenvalues $\lambda_{B,1}, \ldots, \lambda_{B,n}$. Let the product AB have first row $\gamma$. Then

(3.3.2) $A = \mathrm{circ}\,\alpha = F^*\mathrm{diag}(\lambda_{A,1}, \ldots, \lambda_{A,n})F$,
$B = \mathrm{circ}\,\beta = F^*\mathrm{diag}(\lambda_{B,1}, \ldots, \lambda_{B,n})F$,

so that

(3.3.3) $AB = \mathrm{circ}\,\gamma = F^*\mathrm{diag}(\lambda_{A,1}\lambda_{B,1}, \ldots, \lambda_{A,n}\lambda_{B,n})F.$

Now from (3.3.1)

$n^{1/2}F^*\alpha^T = (\lambda_{A,1}, \ldots, \lambda_{A,n})^T$,
$n^{1/2}F^*\beta^T = (\lambda_{B,1}, \ldots, \lambda_{B,n})^T$.

Therefore, we have

(3.3.4) $\gamma^T = n^{1/2}F[(F^*\alpha^T) \circ (F^*\beta^T)].$

The symbol $\circ$ is used to designate element-by-element product of two vectors.
Thus the multiplication of two circulants can be effected by three Fourier transforms plus O(n) ordinary multiplications. Since it is known that fast techniques permit a Fourier transform to be carried out in O(n log n) multiplications, it follows that circulant-by-circulant multiplication can be done in O(n log n) multiplications.
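The three-transform recipe can be sketched as follows. This is an illustration under stated assumptions, not a tuned implementation: the recursive radix-2 routine `fft` is written here for the example and requires the order n to be a power of 2, and the normalizing factors of $n^{1/2}$ are collected into one division by n at the end.

```python
import cmath

def fft(x, sign):
    """Radix-2 transform X[j] = sum_k x[k] * exp(sign*2*pi*i*j*k/n); n a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * j / n) * odd[j]  # twiddle factor times odd part
        out[j] = even[j] + t
        out[j + n // 2] = even[j] - t
    return out

def circ_multiply_fft(a, b):
    """First row of circ(a)*circ(b): two forward transforms, one inverse transform."""
    n = len(a)
    lam_a = fft([complex(v) for v in a], +1)      # eigenvalues of circ(a)
    lam_b = fft([complex(v) for v in b], +1)      # eigenvalues of circ(b)
    lam_ab = [p * q for p, q in zip(lam_a, lam_b)]  # element-by-element product
    return [v / n for v in fft(lam_ab, -1)]

a = [1, 2, 4, 0]      # sample first rows of order 4 (assumed data)
b = [4, 5, 6, 0]
gamma = circ_multiply_fft(a, b)

# Direct cyclic convolution for comparison.
direct = [sum(a[j] * b[(k - j) % 4] for j in range(4)) for k in range(4)]
```

For such a small order the direct convolution is certainly cheaper; the transform route only pays off for large n, and where the crossover lies depends on the implementation.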
It would be interesting to know, using specific
computer programs, just where the crossover value of n
is between naive abridged multiplication and fast
Fourier techniques.
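Moore-Penrose Inverse. For scalar $\lambda$ set

(3.3.5) $\lambda^{\div} = 1/\lambda$ if $\lambda \neq 0$, $\lambda^{\div} = 0$ if $\lambda = 0$,

and for $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ set

(3.3.6) $\Lambda^{\div} = \mathrm{diag}(\lambda_1^{\div}, \lambda_2^{\div}, \ldots, \lambda_n^{\div}).$

Theorem 3.3.1. If C is the circulant $C = F^*\Lambda F$, then its Moore-Penrose generalized inverse (M-P inverse) is the circulant

(3.3.7) $C^{\div} = F^*\Lambda^{\div}F.$

Proof. The four conditions of Section 2.8.2 are immediately verifiable for $C^{\div}$ (or see Theorem 2.8.3.2).
Corollary.

(3.3.8) $C^{\div} = \sum_{k=1}^{n}\lambda_k^{\div}B_k$

where $B_k$ are the matrices $B_k = F^*\Lambda_kF$, $\Lambda_k = \mathrm{diag}(0, 0, \ldots, 0, 1, 0, \ldots, 0)$. In particular,

(3.3.9) $B_k^{\div} = B_k.$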
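Circulants of Rank n - r, $1 \le r \le n$. Insofar as a circulant is diagonalizable, a circulant of rank n - 1 has precisely one zero eigenvalue. If $C = F^*\Lambda F$, then C has rank n - 1 if and only if for some integer j, $1 \le j \le n$,

(3.3.10) $\Lambda = \mathrm{diag}(\mu_1, \ldots, \mu_{j-1}, 0, \mu_{j+1}, \ldots, \mu_n)$

with $\mu_i \neq 0$, $i \neq j$. Now,

(3.3.11) $\Lambda^{\div} = \mathrm{diag}(\mu_1^{-1}, \ldots, \mu_{j-1}^{-1}, 0, \mu_{j+1}^{-1}, \ldots, \mu_n^{-1})$

and $C^{\div} = F^*\Lambda^{\div}F$, so that

(3.3.12) $CC^{\div} = C^{\div}C = F^*\mathrm{diag}(1, 1, \ldots, 1, 0, 1, \ldots, 1)F,$

where 0 occurs in the jth position. From this it follows that

(3.3.13) $CC^{\div} = C^{\div}C = F^*(I - \Lambda_j)F = I - F^*\Lambda_jF = I - B_j.$

The $B_j$ are the matrices given by (3.3.8). For circulants of rank n - 2, one has

(3.3.14) $CC^{\div} = C^{\div}C = I - B_j - B_k$

for some j, k, $j \neq k$.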
PROBLEMS
1. Let A, X, B be of order n. Let A and B be circulants. Prove that AX = B has a solution if and only if, wherever an eigenvalue of B is not 0, the corresponding eigenvalue of A is not 0. In this case, there is a solution X that is a circulant.
2. Let A, B be circulants of order n with eigenvalues $\lambda_1, \ldots, \lambda_n$; $\mu_1, \ldots, \mu_n$. Let p(x, y) be a polynomial in x, y. Prove that the eigenvalues of p(A, B) are precisely $p(\lambda_j, \mu_j)$, j = 1, 2, ..., n.
Remark: A theorem of Frobenius says that if A and B commute, then the eigenvalues of p(A, B) are precisely $p(\lambda_j, \mu_j)$, j = 1, 2, ..., n for some pairing of the eigenvalues. This has been generalized by numerous authors.
Circulant Inverses, Continued. Let $C = \mathrm{circ}(a_1, a_2, \ldots, a_n)$ and let

(3.3.15) $p(z) = a_1 + a_2z + \cdots + a_nz^{n-1}$

be its representer. From (3.1.4) one has

(3.3.16) $C = p(\pi).$

The last few coefficients in (3.3.15) may be zero. Assuming that $C \neq 0$, let us rewrite (3.3.15) in the form

(3.3.17) $p(z) = a_1 + a_2z + \cdots + a_rz^{r-1}$

with $1 \le r \le n - 1$ and $a_r \neq 0$.
Suppose that $\mu_1, \mu_2, \ldots, \mu_{r-1}$ are the zeros of the representer p(z) (to be distinguished from the eigenvalues of C). Thus $p(z) = a_r(z - \mu_1)(z - \mu_2)\cdots(z - \mu_{r-1})$, hence

(3.3.18) $C = p(\pi) = a_r(\pi - \mu_1I)(\pi - \mu_2I)\cdots(\pi - \mu_{r-1}I).$

This gives us a factorization of any circulant into a product of circulants $\pi - \mu_kI$ that are of a particularly elementary type.
Suppose now that C is nonsingular. This is true if and only if none of the eigenvalues of C is zero. That is, if and only if $\lambda_j = p(\omega^{j-1}) \neq 0$, j = 1, 2, ..., n. This will be true if and only if no $\mu_k$ is an nth root of unity. Thus $\mu_k^n \neq 1$, k = 1, 2, ..., r-1.
From (3.3.18) one has

(3.3.19) $C^{-1} = a_r^{-1}(\pi - \mu_1I)^{-1}(\pi - \mu_2I)^{-1}\cdots(\pi - \mu_{r-1}I)^{-1}.$

Let us examine a typical factor. Let $\mu$ be a complex variable. Then, for a given matrix M, a matrix of the form $(M - \mu I)^{-1}$ is called the resolvent function of M. The resolvent of $\pi$ has a particularly simple form.
Theorem 3.3.1. Let $\mu^n \neq 1$. Then

(3.3.20) $(\pi - \mu I)^{-1} = \frac{1}{1 - \mu^n}\left[\mu^{n-1}I + \mu^{n-2}\pi + \mu^{n-3}\pi^2 + \cdots + \pi^{n-1}\right].$

Proof. Multiply the right side by $\pi - \mu I$ and use the fact that $\pi^n = I$.
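The closed form for the resolvent can be checked in exact rational arithmetic. In the sketch below, `circ`, `matmul` and `resolvent_of_pi` are hypothetical helper names, and the order 4 and the value 2 for the parameter are arbitrary sample choices.

```python
from fractions import Fraction

def circ(row):
    """Circulant with the given first row."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def resolvent_of_pi(n, mu):
    """(pi - mu*I)^(-1) as (1/(1 - mu^n)) * circ(mu^(n-1), mu^(n-2), ..., mu, 1).

    Valid only when mu^n != 1.
    """
    mu = Fraction(mu)
    scale = Fraction(1) / (1 - mu ** n)
    return circ([scale * mu ** (n - 1 - k) for k in range(n)])

n, mu = 4, Fraction(2)
pi_minus_mu = circ([-mu, 1] + [0] * (n - 2))      # pi - mu*I = circ(-mu, 1, 0, ..., 0)
product = matmul(pi_minus_mu, resolvent_of_pi(n, mu))
identity = [[int(i == j) for j in range(n)] for i in range(n)]
```

`product` equals the identity exactly, reflecting the telescoping that occurs when the bracketed sum is multiplied by the elementary factor.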
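We may also relate $C^{-1}$ to the reciprocal of p(z). Let C be a circulant with representer p(z). Suppose that $p(e^{i\theta}) \neq 0$, $0 \le \theta \le 2\pi$. Then, since the zeros of a polynomial are isolated, p(z) is not zero in some open annulus A that contains |z| = 1 in its interior. Thus $[p(z)]^{-1}$ is regular there, hence has a Laurent expansion

(3.3.21) $\frac{1}{p(z)} = \sum_{j=-\infty}^{\infty}b_jz^j,$

which converges absolutely in A, and $p(z)\left(\sum_{j=-\infty}^{\infty}b_jz^j\right) = 1$. It follows that the series

(3.3.22) $\sum_{j=-\infty}^{\infty}b_j\pi^j$

converges, and one has $p(\pi)\left(\sum_{j=-\infty}^{\infty}b_j\pi^j\right) = I$.
Theorem 3.3.2. Let $p(e^{i\theta}) \neq 0$, $0 \le \theta \le 2\pi$. Then

(3.3.23) $C^{-1} = \sum_{k=0}^{n-1}\Bigl(\sum_{j \equiv k \,(\mathrm{mod}\ n)}b_j\Bigr)\pi^k.$

Proof. Make use of $\pi^n = I$ to regroup the terms in $\sum_{j=-\infty}^{\infty}b_j\pi^j$.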
Circulant Inversion by FFT Techniques. Let $C = \mathrm{circ}\,\gamma = \mathrm{circ}(c_1, c_2, \ldots, c_n) = F^*\mathrm{diag}(\lambda_1, \ldots, \lambda_n)F$. Then $C^{\div} = F^*\mathrm{diag}(\lambda_1^{\div}, \lambda_2^{\div}, \ldots, \lambda_n^{\div})F$. Let $C^{\div} = \mathrm{circ}\,\beta$; then from (3.2.7) or (3.3.1)

$n^{1/2}F^*\gamma^T = (\lambda_1, \ldots, \lambda_n)^T$,
$\beta^T = n^{-1/2}F(\lambda_1^{\div}, \lambda_2^{\div}, \ldots, \lambda_n^{\div})^T$.

Thus

(3.3.24) $\beta^T = \frac{1}{n}F(F^*\gamma^T)^{\div}.$

The notation $(\ )^{\div}$ means apply "$\div$" element by element.
A somewhat more aesthetic form of (3.3.24) is as follows. For $C = \mathrm{circ}\,\gamma$, write $C^{\div} = \mathrm{circ}\,\gamma^{\div}$. Then

(3.3.25) $(\gamma^{\div})^T = \frac{1}{n}F(F^*\gamma^T)^{\div}.$

From (3.3.25) it appears that a circulant inverse (or generalized inverse) can be computed in two Fourier transforms plus n ordinary reciprocations. Thus it can be done in O(n log n) multiplications.
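The two-transform procedure can be sketched as follows. For brevity the transforms here are plain O(n^2) sums rather than fast ones; the helper names and the tolerance `tol` used to decide when a computed eigenvalue counts as zero are assumptions of this example, since in floating point a zero eigenvalue is only reproduced up to rounding.

```python
import cmath

def dft(x, sign):
    """Plain transform X[j] = sum_k x[k] * exp(sign*2*pi*i*j*k/n) (not a fast transform)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def circ_pinv_row(row, tol=1e-12):
    """First row of the Moore-Penrose inverse of circ(row).

    Eigenvalues with modulus <= tol are treated as zero (assumed threshold).
    """
    n = len(row)
    lams = dft(row, +1)                                     # eigenvalues of circ(row)
    inv = [1 / lam if abs(lam) > tol else 0 for lam in lams]  # element-by-element reciprocal
    return [v / n for v in dft(inv, -1)]                    # back to a first row

# Singular case: circ(1, 1, 1) has eigenvalues 3, 0, 0.
pinv_row = circ_pinv_row([1, 1, 1])

# Nonsingular case: the result is the first row of the ordinary inverse of circ(2, 1).
inv_row = circ_pinv_row([2, 1])
```

`pinv_row` comes out as the constant row 1/9, and `inv_row` as (2/3, -1/3), which can be confirmed by multiplying back against circ(2, 1).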
The same line of reasoning allows us to compute f(C) where f is any function defined on the eigenvalues $\lambda_k$ of the circulant C. Write $C = \mathrm{circ}\,\gamma$ and $f(C) = \mathrm{circ}\,\beta$. Then $n^{1/2}F^*\gamma^T = (\lambda_1, \lambda_2, \ldots, \lambda_n)^T$. But the eigenvalues of f(C) are $f(\lambda_1), \ldots, f(\lambda_n)$, so that

$\beta^T = n^{-1/2}F(f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n))^T.$

Thus

$\beta^T = n^{-1/2}F\,f(n^{1/2}F^*\gamma^T),$

where we use the notation

$f((x_1, \ldots, x_n)^T) = (f(x_1), \ldots, f(x_n))^T.$

PROBLEM
1. Let C be a circulant of order n with representer p(z) and characteristic polynomial q(z). Prove that $z^n - 1$ divides q(p(z)).
3.4 ADDITIONAL PROPERTIES OF CIRCULANTS
Multiplication of Circulants. Let us look more closely at the product of circulants. Let $C_k$, k = 1, 2, ..., p be circulants with diagonalization $C_k = F^*\Lambda_kF$, $\Lambda_k$ = diagonal. Then

(3.4.1) $C_1C_2\cdots C_p = F^*(\Lambda_1\Lambda_2\cdots\Lambda_p)F.$

From this it follows that the eigenvalues of the product $C_1C_2\cdots C_p$ are the product of the eigenvalues. This is an essential feature of all families of matrices that are simultaneously diagonalizable by a fixed matrix.
A special case of (3.4.1) is

(3.4.2) $C^k = F^*\Lambda^kF.$

Rank. The rank of a diagonalizable matrix is equal to the number of its nonzero eigenvalues. Hence, if $C = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, then r(C) = number of the $\lambda$'s that are not zero. From (3.4.2) it follows that

$r(C^k) = r(C)$, k = 1, 2, ....

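Trace. Let $C = \mathrm{circ}(c_1, c_2, \ldots, c_n) = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Then
where $\gamma = (c_1, c_2, \ldots, c_n)$.
From (2.7.16) we have

(3.4.6) $\mathrm{tr}(CC^*) = \mathrm{tr}(F^*\Lambda\bar{\Lambda}F) = \mathrm{tr}(\Lambda\bar{\Lambda})$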
Determinant. The determinant of $\mathrm{circ}(c_1, c_2, \ldots, c_n)$ is a homogeneous polynomial of degree n in the variables $c_1, \ldots, c_n$. There are no "simple" formulas. We note the first four cases:

(3.4.7)
n = 1, $\det\mathrm{circ}(c_1) = c_1$,
n = 2, $\det\mathrm{circ}(c_1, c_2) = c_1^2 - c_2^2$,
n = 3, $\det\mathrm{circ}(c_1, c_2, c_3) = c_1^3 + c_2^3 + c_3^3 - 3c_1c_2c_3$,
n = 4, $\det\mathrm{circ}(c_1, c_2, c_3, c_4) = c_1^4 - c_2^4 + c_3^4 - c_4^4 - 2c_1^2c_3^2 + 2c_2^2c_4^2 - 4c_1^2c_2c_4 + 4c_1c_2^2c_3 - 4c_2c_3^2c_4 + 4c_1c_3c_4^2$.
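The low-order determinant formulas can be compared with the product of the eigenvalues. The sketch below is an illustration only: the function names are hypothetical, and the order-4 expression is written out in full as obtained by expanding the product of the four eigenvalues.

```python
import cmath

def circ_det(row):
    """det circ(row) as the product of the representer's values at the nth roots of unity."""
    n = len(row)
    det = 1
    for j in range(n):
        w = cmath.exp(2j * cmath.pi * j / n)
        det *= sum(c * w ** k for k, c in enumerate(row))
    return det

def det3(c1, c2, c3):
    """Closed form for order 3."""
    return c1**3 + c2**3 + c3**3 - 3 * c1 * c2 * c3

def det4(c1, c2, c3, c4):
    """Closed form for order 4 (full expansion of the eigenvalue product)."""
    return (c1**4 - c2**4 + c3**4 - c4**4
            - 2 * c1**2 * c3**2 + 2 * c2**2 * c4**2
            - 4 * c1**2 * c2 * c4 + 4 * c1 * c2**2 * c3
            - 4 * c2 * c3**2 * c4 + 4 * c1 * c3 * c4**2)

d3 = circ_det([1, 2, 4])        # compare with det3(1, 2, 4)
d4 = circ_det([1, 2, 3, 4])     # compare with det4(1, 2, 3, 4)
```

For real first rows the imaginary part of the computed product is rounding noise, so only the real part is meaningful.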
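Spectral Decomposition. Let $C = F^*\Lambda F$ where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Introduce the diagonal matrices

(3.4.8) $\Lambda_k = \mathrm{diag}(0, 0, \ldots, 0, 1, 0, \ldots, 0)$, k = 1, 2, ..., n,

where the 1 occurs in the kth position. Now $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n) = \sum_{k=1}^{n}\lambda_k\Lambda_k$, so that $C = \sum_{k=1}^{n}\lambda_kF^*\Lambda_kF$. If we set

(3.4.9) $B_k = F^*\Lambda_kF$, k = 1, 2, ..., n,

then we can write

(3.4.10) $C = \sum_{k=1}^{n}\lambda_kB_k.$

The matrices $B_k$ are the component or principal idempotent matrices of the circulant C. The matrices $B_k$ are, of course, circulants. Note that $B_jB_k = 0$ for $j \neq k$ and $B_k^2 = B_k$.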
2. Let $B_j$ be the matrices of (3.4.9). Let $C$ be a circulant with eigenvalues $\eta_1, \ldots, \eta_n$. Prove that $B_jC = \eta_jB_j$.
3. Let $y = (1, w, w^2, \ldots, w^{n-1})^T$. Prove that $B_jy = \delta_jy$ where $\delta_2 = 1$, $\delta_j = 0$ otherwise. Prove that $B_j\bar{y} = \varepsilon_j\bar{y}$ where $\varepsilon_n = 1$, $\varepsilon_j = 0$ otherwise.
4. Outer product expansion. Let $A$ be of order $n$ and have the singular value decomposition $A = UDV^*$ where $U$ and $V$ are unitary and $D = \mathrm{diag}(d_1, \ldots, d_n)$ (see (2.8.3.1)). Let $\Lambda_k$ be as in (3.4.8) and set $B_k = U\Lambda_kV^*$, $k = 1, \ldots, n$. Let $u_i$ be the ith column of $U$ and $v_j^*$ be the jth row of $V^*$. Show
(a) $B_k = u_kv_k^*$ (the outer product of $u_k$ and $v_k$);
(b) The matrices $B_k$ have rank 1; $B_iB_j^* = 0$, $i \neq j$;
(d) $\sum_{i=1}^{n} B_iB_i^* = I$; (e) $\mathrm{tr}(B_iB_i^*) = 1$ (see
Minimal Polynomial of a Circulant. Let $A$ be a matrix whose characteristic polynomial is

$\varphi(\lambda) = \prod_{k=1}^{s} (\lambda - \lambda_k)^{\alpha_k},$

where $\lambda_1, \ldots, \lambda_s$ are distinct and the integers $\alpha_k \geq 1$. Then the minimal polynomial of $A$ has the form

$m(\lambda) = \prod_{j=1}^{s} (\lambda - \lambda_j)^{\beta_j}$

with $1 \leq \beta_j \leq \alpha_j$, $j = 1, 2, \ldots, s$. Now, it is known that a matrix is simple (diagonalizable) if and only if its minimal polynomial has only simple zeros. Therefore if $A$ is simple, in particular, if $A$ is a circulant, then

$m(\lambda) = \prod_{j=1}^{s} (\lambda - \lambda_j).$

In other words, $m(\lambda)$ is that monic polynomial of minimal degree which has as its zeros all the distinct eigenvalues of $A$. Of course, one has $m(A) = 0$.
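Derivatives of Circulants and of Determinants of Circulants. Let $A$ be an $m \times n$ matrix whose elements $a_{jk} = a_{jk}(t)$ are differentiable functions of $t$ on some common interval. By $dA/dt$ or $\dot{A}$ we mean the $m \times n$ matrix $((d/dt)a_{jk})$. It is easy to verify the identities

(3.4.22) $\dfrac{d}{dt}(\alpha A + \beta B) = \alpha\dfrac{dA}{dt} + \beta\dfrac{dB}{dt}$, $\alpha$, $\beta$ scalar constants,

(3.4.23) $\dfrac{d}{dt}(\alpha A) = \alpha\dfrac{dA}{dt} + \dfrac{d\alpha}{dt}A$, $\alpha$ = scalar function.

If $A$ and $B$ are compatible for multiplication,

(3.4.24) $\dfrac{d}{dt}(AB) = \dfrac{dA}{dt}B + A\dfrac{dB}{dt}.$

If $A$ is square and nonsingular,

$\dfrac{d}{dt}A^{-1} = -A^{-1}\dfrac{dA}{dt}A^{-1}.$

Now let $A = A(t) = \mathrm{circ}(c_1, c_2, \ldots, c_n)$ where $c_j = c_j(t)$ are differentiable functions. Then by (3.2.2)

$A = F^*\Lambda F,$

where $\Lambda = \mathrm{diag}(\lambda_1(t), \ldots, \lambda_n(t))$ and

$\lambda_j(t) = \sum_{k=1}^{n} c_k(t)\,w^{(j-1)(k-1)}.$

Then

$\dfrac{dA}{dt} = F^*\dfrac{d\Lambda}{dt}F$

with

$\dfrac{d\Lambda}{dt} = \mathrm{diag}(\lambda_1'(t), \ldots, \lambda_n'(t)).$

Of course, one also has from (3.1.4)

$\dfrac{dA}{dt} = \sum_{j=0}^{n-1} \dfrac{dc_{j+1}}{dt}\,\pi^j.$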
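Let $c_j = c_j(t)$ be differentiable functions and set $\Delta = \Delta(t) = \det \mathrm{circ}(c_1, c_2, \ldots, c_n)$. The following identity is valid.

(3.4.27) $\dfrac{d\Delta}{dt} = n \det \begin{pmatrix} c_1' & c_2' & \cdots & c_{n-1}' & c_n' \\ c_n & c_1 & \cdots & c_{n-2} & c_{n-1} \\ \vdots & \vdots & & \vdots & \vdots \\ c_2 & c_3 & \cdots & c_n & c_1 \end{pmatrix}.$

From the ordinary law of determinant differentiation, one has

$\dfrac{d\Delta}{dt} = \det \begin{pmatrix} c_1' & c_2' & \cdots & c_n' \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & \vdots & & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{pmatrix} + \det \begin{pmatrix} c_1 & c_2 & \cdots & c_n \\ c_n' & c_1' & \cdots & c_{n-1}' \\ \vdots & \vdots & & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{pmatrix} + \cdots + \det \begin{pmatrix} c_1 & c_2 & \cdots & c_n \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & \vdots & & \vdots \\ c_2' & c_3' & \cdots & c_1' \end{pmatrix}.$

Now it turns out that these $n$ determinants are all equal; hence the theorem.
In order not to get lost in a welter of notation, we show this in the case $n = 3$. It is merely a row-column interchange. The method is perfectly general.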
Note that
and
Since $\pi^* = \pi^{-1}$, we find, upon taking determinants, that all the determinants in the previous expansion are equal.
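As a quick check of (3.4.27), not taken from the original text, consider the case $n = 2$, where the determinant is known in closed form from (3.4.7):

$$\Delta = \det \mathrm{circ}(c_1, c_2) = c_1^2 - c_2^2, \qquad \frac{d\Delta}{dt} = 2c_1c_1' - 2c_2c_2'.$$

On the other hand, the right-hand side of (3.4.27) with $n = 2$ is

$$2\det\begin{pmatrix} c_1' & c_2' \\ c_2 & c_1 \end{pmatrix} = 2(c_1c_1' - c_2c_2'),$$

which agrees with the direct computation.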
3.5 CIRCULANT TRANSFORMS
Let $C = \mathrm{circ}\,\gamma$, $\gamma = (c_1, c_2, \ldots, c_n)$, be a circulant of order $n$. Let $Z = (z_1, z_2, \ldots, z_n)^T$ and $W = (w_1, w_2, \ldots, w_n)^T$. If $W$ is related to $Z$ by means of

(3.5.1) $W = CZ,$

then $W$ is called the circulant transform of $Z$ by $C$. It is also called the circular convolution or the wrapped convolution of $\gamma$ and $Z$.
We mention a number of circulant transforms of particular interest:
(1) $C = I = \mathrm{circ}(1, 0, \ldots, 0)$. This is the identity.
(2) $\pi = \mathrm{circ}(0, 1, 0, \ldots, 0)$. This is the fundamental circulant. $\pi$ causes a circular shifting of the components of $Z$.
(3) For integer $r$, $\pi^r$ causes a circular shifting of the components of $Z$ by $r$ positions.
(4) $D = I - \pi = \mathrm{circ}(1, -1, 0, 0, \ldots, 0)$. Since $DZ = (z_1 - z_2, z_2 - z_3, \ldots, z_n - z_1)^T$, it is clear that $D$ is a circular differencing operator.
(5) For integer $r \geq 0$, $D^r = (I - \pi)^r$ is a circular differencing operator of the rth order.
(6) For $s, t > 0$, $s + t = 1$, the circulant transform $C = sI + t\pi$ is, as we shall show later, a smoothing operator.
Let $C = F^*\Lambda F$; then (3.5.1) becomes

$FW = \Lambda FZ,$

so that if one writes $\hat{Z}$ and $\hat{W}$ for the Fourier transforms of $Z$ and $W$, one has

$\hat{W} = \Lambda\hat{Z}.$

If $C$ is nonsingular, then the inverse transform is given by

$Z = C^{-1}W$

and is itself a circulant transform.
If $C$ is singular, then (3.5.1) may be solved in the sense of least squares, yielding

$Z = C^{+}W,$

where $C^{+}$ is the Moore-Penrose generalized inverse of $C$. This, again, is a circulant transform that is often of interest.
As a concrete instance of (3.5.4), select $C = \pi^r$, $r = 0, \pm 1, \pm 2, \ldots$. Then $\pi^rZ$ is just $Z$ shifted circularly by $r$ indices. Since $\pi^r = F^*\Omega^rF$, $\Omega = \mathrm{diag}(1, w, w^2, \ldots, w^{n-1})$, one has

$\widehat{\pi^rZ} = \Omega^r\hat{Z}.$

This is known as the shift theorem.
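The shift theorem lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the Fourier matrix is taken as $F = n^{-1/2}(\bar{w}^{(j-1)(k-1)})$ with $w = e^{2\pi i/n}$, and $\pi Z = (z_2, z_3, \ldots, z_n, z_1)^T$. The names `fourier` and `shift` are hypothetical helpers.

```python
import cmath

def fourier(z):
    """F z with F = n^(-1/2) * (conj(w)^(j k)), j, k = 0..n-1, w = exp(2 pi i / n)."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(w ** (-j * k) * z[k] for k in range(n)) / n ** 0.5 for j in range(n)]

def shift(z, r):
    """pi^r z: component j of the result is z[(j + r) mod n]."""
    n = len(z)
    return [z[(j + r) % n] for j in range(n)]

z = [1, 2 + 1j, -3, 0.5j, 4]
r = 2
n = len(z)
w = cmath.exp(2j * cmath.pi / n)

# Left side: Fourier transform of the shifted vector.
lhs = fourier(shift(z, r))
# Right side: Omega^r applied to the Fourier transform, Omega = diag(1, w, ..., w^(n-1)).
rhs = [w ** (j * r) * zh for j, zh in enumerate(fourier(z))]
shift_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

The quantity `shift_gap` should be zero up to rounding error.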
PROBLEM
1. Is the circular convolution of two vectors a
commutative operation?
3.6 CONVERGENCE QUESTIONS
Convergence of Sequences of Matrices. Let $M_1, M_2, \ldots$ be a sequence of matrices all of the same order. Iteration problems often lead to questions about whether certain infinite sequences or infinite products of matrices converge. In the case of infinite products, particular importance attaches to whether the limiting matrix is or is not the zero matrix.
Prior to discussing this question, we recall the definition of matrix convergence. Let

$A_r = (a_{jk}^{(r)}), \quad r = 1, 2, \ldots,$

be a sequence of matrices all of size $m \times n$. We shall say that

(3.6.1) $\lim_{r\to\infty} A_r = A = (a_{jk})$ if and only if $\lim_{r\to\infty} a_{jk}^{(r)} = a_{jk}$, for $j = 1, 2, \ldots, m$; $k = 1, 2, \ldots, n$.

The notation $\sum_{r=1}^{\infty} A_r = A$ is an abbreviation for $\lim_{k\to\infty} \sum_{r=1}^{k} A_r = A$ and the notation $\prod_{r=1}^{\infty} A_r = A$ is an abbreviation for $\lim_{k\to\infty} \prod_{r=1}^{k} A_r = A$. One sometimes writes $A_r \to A$ for convergence.
Elementary properties of convergent sequences of matrices are:
(1) If $A_r \to A$, then $\alpha A_r \to \alpha A$; $\alpha$ scalar.
(2) If $A_r$, $B_r$ are of the same size, then $A_r \to A$, $B_r \to B$ implies $A_r + B_r \to A + B$.
(3) If $A_r$ are $m \times n$ and $B_r$ are $n \times p$ and if $A_r \to A$, $B_r \to B$, then $A_rB_r \to AB$.
(4) If $A$ is $m \times n$ and $\|A\|$ designates the
matrix norm
k=l
then $A_r \to A$ if and only if $\lim_{r\to\infty} \|A - A_r\| = 0$.
If $A_r$ is a sequence of square matrices of order $n$, the question of the convergence of $\prod_{r=1}^{\infty} A_r$ may be a difficult one. Somewhat simpler to deal with is the case in which all the $A_r$ are simultaneously diagonalizable by one and the same matrix.
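Theorem 3.6.1. Let $A_r = M\Lambda_rM^{-1}$, $r = 1, 2, \ldots$, where $M$ is a nonsingular matrix and where $\Lambda_r = \mathrm{diag}(\lambda_1^{(r)}, \ldots, \lambda_n^{(r)})$. Then $\prod_{r=1}^{\infty} A_r$ exists if and only if $\prod_{r=1}^{\infty} \lambda_j^{(r)}$ exists for $j = 1, 2, \ldots, n$. In such a case,

$\prod_{r=1}^{\infty} A_r = M\,\mathrm{diag}\Bigl(\prod_{r=1}^{\infty} \lambda_1^{(r)}, \ldots, \prod_{r=1}^{\infty} \lambda_n^{(r)}\Bigr)M^{-1}.$

Proof. $\prod_{r=1}^{k} A_r = \prod_{r=1}^{k} (M\Lambda_rM^{-1}) = M\bigl(\prod_{r=1}^{k} \Lambda_r\bigr)M^{-1}$ and $\prod_{r=1}^{k} \Lambda_r = M^{-1}\bigl(\prod_{r=1}^{k} A_r\bigr)M$. Hence $\prod_{r=1}^{k} A_r$ converges if and only if $\prod_{r=1}^{k} \Lambda_r$ does. But $\prod_{r=1}^{k} \Lambda_r = \mathrm{diag}\bigl(\prod_{r=1}^{k} \lambda_1^{(r)}, \ldots, \prod_{r=1}^{k} \lambda_n^{(r)}\bigr)$. The theorem now follows.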
Corollary. An infinite product of circulants converges if and only if the infinite products of the respective eigenvalues converge.
Proof. All circulants are simultaneously diagonalizable by $F$.
Note. We have said that $\prod_{r=1}^{\infty} \lambda_r$ converges if and only if $\lim_{k\to\infty} \prod_{r=1}^{k} \lambda_r$ exists. This terminology is at variance with some parts of complex variable theory which requires also that $\lim_{k\to\infty} \prod_{r=1}^{k} \lambda_r \neq 0$.
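Corollary. If $C$ is a circulant with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then $\lim_{k\to\infty} C^k$ exists if and only if

(3.6.2) for each $r = 1, 2, \ldots, n$, either $\lambda_r = 1$ or $|\lambda_r| < 1$.

If $\lim_{k\to\infty} C^k$ exists we shall designate its limiting value by $C^\infty$. It is useful to have an explicit form for the limiting value $C^\infty$ of a circulant $C$.
Let $J_C$ designate the subset of integers $r = 1, 2, \ldots, n$ for which $\lambda_r = 1$.
Corollary. Assuming (3.6.2),

(3.6.3) $C^\infty = \sum_{r \in J_C} B_r$ if $J_C \neq \emptyset$ (the null set), and $C^\infty = 0$ if $J_C = \emptyset$.

Proof. If $C = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, then $C^\infty = F^*\Lambda^\infty F$, $\Lambda^\infty = \mathrm{diag}(\lambda_1^\infty, \lambda_2^\infty, \ldots, \lambda_n^\infty)$, where $\lambda_r^\infty = 1$ if $\lambda_r = 1$ and $0$ if $|\lambda_r| < 1$. The statement now follows from (3.4.8) and (3.4.9).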
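Corollary. Let $C$ be a circulant with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then the Cesàro mean

$\lim_{r\to\infty} \dfrac{1}{r}(I + C + \cdots + C^{r-1})$

exists if and only if

(3.6.4) $|\lambda_r| \leq 1$, $r = 1, 2, \ldots, n$.

The representation (3.6.3) persists with the Cesàro limit replacing $C^\infty$.
Proof. Write $C = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Then

$\dfrac{1}{r}(I + C + \cdots + C^{r-1}) = F^*\,\mathrm{diag}\Bigl(\dfrac{1}{r}(1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{r-1})\Bigr)F.$

Now, writing $\sigma_r = \frac{1}{r}(1 + \lambda + \cdots + \lambda^{r-1})$, one has

$\sigma_r = \dfrac{1}{r}\,\dfrac{1 - \lambda^r}{1 - \lambda}$ for $\lambda \neq 1$ and $\sigma_r = 1$ for $\lambda = 1$.

It is clear that $\sigma_r$ converges if and only if $|\lambda| \leq 1$. It converges to 1 if and only if $\lambda = 1$ and to 0 if and only if $\lambda \neq 1$, $|\lambda| \leq 1$.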
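The difference between ordinary and Cesàro convergence shows up already for the fundamental circulant $\pi$, whose eigenvalues are the nth roots of unity: the powers $\pi^r$ cycle forever, while their averages settle down. The following sketch is illustrative only; it assumes $\pi Z = (z_2, \ldots, z_n, z_1)^T$, and `shift` and `cesaro_mean` are hypothetical helper names.

```python
def shift(z, r):
    """pi^r z: component j of the result is z[(j + r) mod n]."""
    n = len(z)
    return [z[(j + r) % n] for j in range(n)]

def cesaro_mean(z, r):
    """(1/r)(I + pi + ... + pi^(r-1)) applied to z."""
    n = len(z)
    total = [0.0] * n
    for m in range(r):
        total = [a + b for a, b in zip(total, shift(z, m))]
    return [a / r for a in total]

z = [4.0, 0.0, -2.0, 6.0]
centroid = sum(z) / len(z)

# The powers themselves do not converge: pi^r z keeps cycling through the shifts of z.
still_cycling = shift(z, 1001) != z

# The Cesaro means approach the constant vector whose entries equal the mean of z.
mean_1001 = cesaro_mean(z, 1001)
cesaro_gap = max(abs(a - centroid) for a in mean_1001)
```

Every eigenvalue of $\pi$ other than 1 is unimodular and different from 1, so the limit is the projection onto the constant vectors; `cesaro_gap` shrinks like $1/r$.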
In discussing convergence problems, it is useful to introduce the spectral radius or norm, $\rho(M)$, of a matrix $M$ by means of

(3.6.5) $\rho(M) = \max_{j=1,2,\ldots,n} |\lambda_j|,$

where $\lambda_j$ are the eigenvalues of $M$.
Inasmuch as circulants are a special case of a diagonalizable matrix, we append a table of the behavior of $M^r$ as $r \to \infty$ for diagonalizable matrices. All results are obtained by using $M^r = S\Lambda^rS^{-1}$ and an examination of the individual behavior of $\lambda_j^r$ as $r \to \infty$.
By a unimodular eigenvalue we mean an eigenvalue $\lambda$ for which $|\lambda| = 1$.
It is of interest to contrast this tabulation with the general theorem on the existence of $M^\infty$, where $M$ is not necessarily diagonalizable.
Theorem 3.6.2
(a) If $\lambda = 1$ is an eigenvalue of $M$, then $M^\infty$ exists if and only if $\lambda = 1$ is a simple root of the minimal polynomial of $M$ and if all other roots are less than 1 in absolute value.
(b) If $\lambda = 1$ is not an eigenvalue of $M$, then $M^\infty$ exists if and only if $\rho(M) < 1$, in which case $M^\infty = 0$.
What is the general form of infinite powers? Omit the trivial case $M^\infty = 0$. Assume $M$ has order $n$. Then, since the Jordan blocks corresponding to the eigenvalue $\lambda = 1$ all must be of dimension 1, it follows that $M$ can be Jordanized as follows:

(3.6.6) $M = SQS^{-1},$

where $S$ is nonsingular and where $Q$ has the form

(3.6.7) $Q = \begin{pmatrix} I_m & 0 \\ 0 & X \end{pmatrix}.$
Behavior of $M^r$, $r \to \infty$; $M$ Diagonalizable

Behavior                          Necessary and Sufficient Conditions
Converges to 0                    $\rho(M) < 1$
Converges to $M^\infty \neq 0$    $\rho(M) = 1$; all unimodular eigenvalues equal 1
Diverges boundedly                $\rho(M) = 1$; not all unimodular eigenvalues equal 1
Cesàro mean converges to 0        $\rho(M) = 1$; no unimodular eigenvalue equals 1
Cesàro mean converges,            $\rho(M) = 1$; at least one, but not all unimodular
  but not to 0                      eigenvalues equal 1
Finite number of limit points     $\rho(M) = 1$; not all unimodular eigenvalues equal 1;
                                    all unimodular eigenvalues are roots of unity
Infinite number of limit points   $\rho(M) = 1$; at least one unimodular eigenvalue is
                                    not a root of unity
Diverges unboundedly              $\rho(M) > 1$
In (3.6.7), $I_m$ is the identity matrix of a certain order $m$, $1 \leq m \leq n$, and $X$ is $(n - m) \times (n - m)$ and $\rho(X) < 1$. Hence $X^\infty = 0$, so that

$Q^\infty = \begin{pmatrix} I_m & 0 \\ 0 & 0 \end{pmatrix}.$

Therefore, $M^\infty = SQ^\infty S^{-1}$.
Now write $S$ in block form as $S = (A \mid B)$ where $A$ is $n \times m$ and $B$ is $n \times (n - m)$. Write $S^{-1} = \begin{pmatrix} C \\ D \end{pmatrix}$ where $C$ is $m \times n$ and $D$ is $(n - m) \times n$. Then from (3.6.6) it follows that $M^\infty = AC$.
PROBLEMS
1. Investigate the convergence of sequences of direct sums.
2. Investigate the convergence of sequences of Kronecker products.
3. Prove that if $A_k$ are square, $\lim_{k\to\infty} A_k = A$, and $A$ is nonsingular, then for $k$ sufficiently large, $A_k$ is nonsingular and $\lim_{k\to\infty} A_k^{-1} = A^{-1}$.
4. Let $A$, $B$ be square of the same order and commute. Let $\lim_{k\to\infty} A^k = A^\infty$, $\lim_{k\to\infty} B^k = B^\infty$ exist. Then $\lim_{k\to\infty} (AB)^k = A^\infty B^\infty$.
5. Show that the identity of Problem 4 may not be valid if $AB \neq BA$. Take $A = \begin{pmatrix} .5 & 2 \\ 0 & 0 \end{pmatrix}$, $B = A^*$.
6. What functions of matrices are continuous under matrix convergence? For example: determinant, rank, etc.
7. Let $\lambda = 1$ be an eigenvalue of $A$ and a simple root of its minimal polynomial $\mu(\lambda)$. Let $A^\infty$ exist. Then, if one writes $\mu(\lambda) = (\lambda - 1)q(\lambda)$, $q(1) \neq 0$, one has $A^\infty = (q(1))^{-1}q(A)$. (Greville.)
8. When is $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ an infinite power?
9. Level spirits. Take three glasses, containing different amounts of vodka. By pouring, adjust the first two glasses so that the level in both is the same. Adjust the level in the second and third glasses. Then in the third and first glasses. Iterate. Predict the result after $n$ iterations. What happens as $n \to \infty$? What if the glasses do not have the same cross-section? What if the glasses do not have constant cross-sectional area? What if after the kth leveling, an amount $v_k$ is drunk from both of the leveled glasses?
10. Prove the statement at the end of Section 1.3. Generalize it.
REFERENCES
Circulant matrices first appear in the mathematical literature in 1846 in a paper by E. Catalan.
Identity (3.2.14) for the determinant of a circulant is essentially due to Spottiswoode, 1853.
For articles on circulants in the older literature see the bibliographies of Muir, [1]-[6].
Circulants: Aitken, [1], [2]; Bellman, [1]; Carlitz; Charmonman and Julius; Davis, [1], [2]; Marcus and Minc, [2]; Muir, [1]; Muir and Metzler, [7]; Ore; Trapp; Varga.
z-Transform: Jury.
Frobenius theorem: Taussky.
Convergence: Greville, [1]; Ortega.
Skew circulants; {k}-circulants: Beckenbach and Bellman; Smith, [1].
Toeplitz matrices: Gray, [1]-[4]; Grenander and Szegö; Widom.
Determinantal inequality: Beckenbach and Bellman.
Outer product: Andrews and Patterson.
SOME GEOMETRICAL
APPLICATIONS OF
CIRCULANTS
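We are interested here in the quadratic form

(4.0.1) $Q(Z) = Z^*QZ,$

where $Q$ is a circulant matrix. The reader will perceive that some of what is presented is valid in a wider context. In (4.0.1) we have written $Z = (z_1, \ldots, z_n)^T$.
Insofar as $Q = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, one has

(4.0.2) $Q(Z) = Z^*F^*\Lambda FZ = (FZ)^*\Lambda(FZ).$

This is the reduction of $Q(Z)$ to a sum of squares. If one writes $\hat{Z}$ for the Fourier transform of $Z$,

(4.0.3) $\hat{Z} = (\hat{z}_1, \hat{z}_2, \ldots, \hat{z}_n)^T = FZ,$

then one has

(4.0.4) $Q(Z) = \sum_{k=1}^{n} \lambda_k|\hat{z}_k|^2.$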
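The reduction of a circulant quadratic form to a weighted sum of the $|\hat{z}_k|^2$ can be verified numerically. The sketch below is an illustration under stated assumptions: $F = n^{-1/2}(\bar{w}^{(j-1)(k-1)})$, eigenvalues $\lambda_j = p(w^{j-1})$, and the particular Hermitian circulant $Q = (I - \pi)^*(I - \pi) = \mathrm{circ}(2, -1, 0, \ldots, 0, -1)$, whose form is the sum of squared side lengths of the n-gon. All function names are hypothetical.

```python
import cmath

def eigenvalues(c):
    """lambda_j = p(w^j), j = 0..n-1, for the circulant with first row c."""
    n = len(c)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c[m] * w ** (j * m) for m in range(n)) for j in range(n)]

def fourier(z):
    """F z with F = n^(-1/2) * (conj(w)^(j k))."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(w ** (-j * k) * z[k] for k in range(n)) / n ** 0.5 for j in range(n)]

def quadratic_form(c, z):
    """Z* C Z computed entry by entry, with C[j][k] = c[(k - j) mod n]."""
    n = len(z)
    return sum(complex(z[j]).conjugate() * c[(k - j) % n] * z[k]
               for j in range(n) for k in range(n))

c = [2, -1, 0, 0, -1]                      # first row of (I - pi)*(I - pi), n = 5
z = [1 + 2j, -1, 3j, 2 - 1j, 0.5]

direct = quadratic_form(c, z)
spectral = sum(lam * abs(zh) ** 2 for lam, zh in zip(eigenvalues(c), fourier(z)))
sides = sum(abs(z[(k + 1) % len(z)] - z[k]) ** 2 for k in range(len(z)))
```

The three quantities `direct`, `spectral`, and `sides` should coincide up to rounding error.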
4.1 CIRCULANT QUADRATIC FORMS ARISING IN GEOMETRY
We list a number of specific quadratic forms $Q(Z)$ in which $Q$ are Hermitian circulants and which are of importance in geometry.

(4.1.1) $Q = I$: $Q_1(Z) = Z^*Z = \sum_{k=1}^{n} |z_k|^2 = \|Z\|^2$ = polar moment of inertia around $z = 0$ of the n-gon $Z$ whose vertices are unit point masses.

From (4.0.4),

(4.1.1') $\|Z\|^2 = \sum_{k=1}^{n} |\hat{z}_k|^2 = \|\hat{Z}\|^2,$

which expresses the isometric nature of the unitary transformation $F$.

$Q = (I - \pi)^*(I - \pi)$: $Q_2(Z) = \sum_{k=1}^{n} |z_{k+1} - z_k|^2$, $z_{n+1} = z_1$, = sum of squares of the sides of the n-gon $Z$.

$Q = ((I - \pi)^k)^*(I - \pi)^k$, where $k$ is a positive integer. $Z^*QZ$ = sum of squares of the kth-order cyclic differences of the vertices of $Z$. For example, for $k = 2$,

$Z^*QZ = \sum_{j=1}^{n} |z_{j+2} - 2z_{j+1} + z_j|^2.$

We wish next to exhibit the area of an n-gon as a quadratic form in $Z$. Since for a general $Z$, the geometrical n-gon may be a multiply covered figure, it is more convenient to deal with the oriented or signed area of $Z$.
Let $z_k = x_k + iy_k$, $k = 1, 2, 3$, be the vertices of a triangle $T$ taken in counterclockwise order. From (1.2.15) we have
2. Let $J = (1, 1, \ldots, 1)^T$. Prove that $Q_3(Z + cJ) = Q_3(Z)$. Interpret geometrically.
3. Prove that $Q_3(\pi Z) = Q_3(Z)$. Interpret.
4. Prove that $Q_3(TZ) = -Q_3(Z)$. (See p. 28 for $T$.) Interpret.
4.2 THE ISOPERIMETRIC INEQUALITY FOR ISOSCELES
POLYGONS
Consider a simply connected, bounded, plane region $\mathcal{R}$ with a rectifiable boundary. If $A$ designates its area and $L$ the length of its boundary, the nondimensional ratio $A/L^2$ is known as its isoperimetric ratio. The famous isoperimetric inequality asserts that for all $\mathcal{R}$

(4.2.1) $\dfrac{A}{L^2} \leq \dfrac{1}{4\pi},$

and that equality holds in (4.2.1) if and only if $\mathcal{R}$ is a circle.
If $\mathcal{R}$ is a regular polygon of $n$ sides each of length $2a$, it is easily shown that $L = 2na$, $A = na^2\cot(\pi/n)$. Hence the isoperimetric ratio for a regular polygon of $n$ sides is

$\dfrac{A}{L^2} = \dfrac{1}{4n}\cot\dfrac{\pi}{n} = \dfrac{1}{4n\tan(\pi/n)} < \dfrac{1}{4\pi}.$

It is a reasonable conjecture that if $\mathcal{R}$ is any equilateral polygon of $n$ sides, with area $A$ and perimeter $L$, then

(4.2.2) $\dfrac{A}{L^2} \leq \dfrac{1}{4n\tan(\pi/n)},$

with equality holding if and only if $\mathcal{R}$ is regular, that is, equiangular as well. We can now establish the truth of this conjecture. Write (4.2.2) in the form

$L^2 - 4n\Bigl(\tan\dfrac{\pi}{n}\Bigr)A \geq 0.$
From (4.1.9) we have, using the double angle formula and observing that the first term of the series vanishes,

$4n\Bigl(\tan\dfrac{\pi}{n}\Bigr)A = 4n\sum_{j=2}^{n} \tan\dfrac{\pi}{n}\,\sin\dfrac{\pi(j-1)}{n}\,\cos\dfrac{\pi(j-1)}{n}\,|\hat{z}_j|^2.$

Now if $\mathcal{R}$ is equilateral, then for some $b > 0$, $|z_{j+1} - z_j| = b$, $j = 1, 2, \ldots, n$, so that $L = nb$, $L^2 = n^2b^2$. Now $Q_2(Z) = \sum_{j=1}^{n} |z_{j+1} - z_j|^2 = nb^2 = L^2/n$. Thus from (4.1.8), since the first term of the series vanishes,

$L^2 = nQ_2(Z) = 4n\sum_{j=2}^{n} \sin^2\dfrac{(j-1)\pi}{n}\,|\hat{z}_j|^2.$

For $j = 2$, we have $(\tan \pi/n)(\sin \pi/n)(\cos \pi/n) = \sin^2 \pi/n$, so that

$L^2 - 4n\Bigl(\tan\dfrac{\pi}{n}\Bigr)A = 4n\sum_{j=3}^{n} \sin\dfrac{(j-1)\pi}{n}\Bigl[\sin\dfrac{(j-1)\pi}{n} - \tan\dfrac{\pi}{n}\cos\dfrac{(j-1)\pi}{n}\Bigr]|\hat{z}_j|^2.$

Notice that $\sin[(j-1)\pi/n] > 0$ for $j = 3, 4, \ldots, n$. The bracketed quantity is

$\sin\dfrac{(j-1)\pi}{n} - \tan\dfrac{\pi}{n}\cos\dfrac{(j-1)\pi}{n} = \cos\dfrac{(j-1)\pi}{n}\Bigl[\tan\dfrac{(j-1)\pi}{n} - \tan\dfrac{\pi}{n}\Bigr].$

When $\cos[(j-1)\pi/n] = 0$, then $\sin[(j-1)\pi/n] > 0$. When the cos > 0, the tan > 0 and $\tan[(j-1)\pi/n] > \tan \pi/n$. When the cos < 0, the tan < 0. Therefore the coefficients of $|\hat{z}_j|^2$ are always positive. It follows that $L^2 - 4n(\tan \pi/n)A \geq 0$, and equality holds if and only if $\hat{z}_3 = \hat{z}_4 = \cdots = \hat{z}_n = 0$. To interpret the equality, one has $Z = F^*\hat{Z}$ so that

$Z = F^*(\hat{z}_1, \hat{z}_2, 0, \ldots, 0)^T = \alpha(1, 1, \ldots, 1)^T + \beta(1, w, w^2, \ldots, w^{n-1})^T$

for some $\alpha$, $\beta$. Thus, in the case of equality,

$z_k = \alpha + \beta w^{k-1}, \quad k = 1, 2, \ldots, n,$

and these are the vertices of a regular polygon of $n$ sides.
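A simple illustration of (4.2.2), not taken from the original text, is the rhombus: an equilateral polygon with $n = 4$, side $b$, and interior angle $\varphi$, $0 < \varphi < \pi$. Here

$$L = 4b, \qquad A = b^2\sin\varphi, \qquad \frac{A}{L^2} = \frac{\sin\varphi}{16} \leq \frac{1}{16} = \frac{1}{4\cdot 4\tan(\pi/4)},$$

with equality if and only if $\varphi = \pi/2$, that is, if and only if the rhombus is a square.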
4.3 QUADRATIC FORMS UNDER SIDE CONDITIONS
Pick an $r$ with $1 \leq r \leq n$. Let $Z^{(r)}$ be an eigenvector of $Q$ corresponding to $\lambda_r$. Then, up to a scalar factor, $Z^{(r)} = F^*(0, \ldots, 0, 1, 0, \ldots, 0)^T$, where the 1 is in the rth position. Suppose now that $Z \perp Z^{(r)}$, that is, $Z^*Z^{(r)} = 0$. Then $Z^*F^*(0, \ldots, 0, 1, 0, \ldots, 0)^T = (FZ)^*(0, \ldots, 0, 1, 0, \ldots, 0)^T = 0$. This is valid if and only if $\hat{z}_r = 0$. Hence

(4.3.1) $Z \perp Z^{(r)}$ implies $Q(Z) = \sum_{k \neq r} \lambda_k|\hat{z}_k|^2.$

For distinct $r_1, r_2, \ldots, r_m$, $0 \leq m \leq n$,

(4.3.2) $Z \perp Z^{(r_k)}$, $k = 1, 2, \ldots, m$, implies $Q(Z) = \sum_{k \neq r_1, r_2, \ldots, r_m} \lambda_k|\hat{z}_k|^2.$

In particular, since $Z^{(1)} = (1/\sqrt{n})(1, 1, \ldots, 1)^T$,

(4.3.3) $z_1 + z_2 + \cdots + z_n = 0$ implies $Q(Z) = \sum_{k=2}^{n} \lambda_k|\hat{z}_k|^2.$

The eigenvalues $\lambda_k$ are, of course, generally neither real nor positive.
For a given matrix $Q$, the set of all values $Q(Z)$ with $\|Z\| = 1$ is the field of values of $Q$ (see page 63). It is easily shown, using the fact that a normal matrix is unitarily diagonalizable, that the field of values of a normal matrix is the convex hull of its eigenvalues. Since circulants are normal, the same may be asserted for the field of values of a circulant.
The $\lambda_k$ are real if and only if a circulant $Q$ is Hermitian. Then from (4.0.4), $Q(Z)$ will be real for all $Z$. In this case, one has the Rayleigh inequalities arrived at as follows. Let $\lambda_{\min}$ and $\lambda_{\max}$ be the smallest and largest of the $\lambda_k$. Then

$\lambda_{\min}\sum_{k=1}^{n} |\hat{z}_k|^2 \leq \sum_{k=1}^{n} \lambda_k|\hat{z}_k|^2 \leq \lambda_{\max}\sum_{k=1}^{n} |\hat{z}_k|^2.$

Hence, from (4.1.1') and (4.0.4),

(4.3.4) $\lambda_{\min}\|Z\|^2 \leq Q(Z) \leq \lambda_{\max}\|Z\|^2.$

Therefore, for any $Z \neq 0$,

(4.3.5) $\lambda_{\min} \leq \dfrac{Q(Z)}{\|Z\|^2} \leq \lambda_{\max}.$

In all our work so far with circulants, it has been convenient to number the eigenvalues so that $\lambda_j = p(w^{j-1})$, where $p$ is the representer of the circulant [cf. (3.2.6)]. To derive equality conditions and further conclusions along the lines of what is now called the Courant-Fischer theorem, it is convenient briefly to renumber the eigenvalues and vectors so that one has

(4.3.6) $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n.$
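The corresponding eigenvectors of $Q$ will be $Z^{(j)}$. Suppose now that we have a vector $Z \neq 0$ for which

(4.3.7) $Q(Z) = \lambda_{\min}\|Z\|^2 = \lambda_n\|Z\|^2.$

Then

$Q(Z) = \sum_{k=1}^{n} \lambda_k|\hat{z}_k|^2 = \lambda_n\|Z\|^2 = \lambda_n\|\hat{Z}\|^2 = \lambda_n\sum_{k=1}^{n} |\hat{z}_k|^2.$

Thus $\sum_{k=1}^{n} (\lambda_k - \lambda_n)|\hat{z}_k|^2 = 0$. Since $(\lambda_k - \lambda_n) \geq 0$, $k = 1, 2, \ldots, n$, it follows that $(\lambda_k - \lambda_n)|\hat{z}_k|^2 = 0$, $k = 1, 2, \ldots, n$. Now assume that

(4.3.8) $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{n-1} > \lambda_n.$

Then $(\lambda_k - \lambda_n) \neq 0$ for $k = 1, 2, \ldots, n - 1$. Thus (4.3.7) holds if and only if $\hat{z}_1 = \hat{z}_2 = \cdots = \hat{z}_{n-1} = 0$. Therefore, $Z = F^*\hat{Z} = F^*(0, 0, \ldots, \hat{z}_n)^T = \hat{z}_nZ^{(n)}$. In other words, (4.3.7) holds if and only if $Z$ is an eigenvector corresponding to $\lambda_n$ (i.e., to $\lambda_{\min}$).
Let now $Z$ be a vector such that $Z \perp Z^{(n)}$. As observed, $\hat{z}_n = 0$, and from (4.0.4)

$Q(Z) = \sum_{k=1}^{n-1} \lambda_k|\hat{z}_k|^2 \geq \lambda_{n-1}\sum_{k=1}^{n-1} |\hat{z}_k|^2 = \lambda_{n-1}\|Z\|^2,$

or briefly,

(4.3.9) $Q(Z) \geq \lambda_{n-1}\|Z\|^2$

for all vectors $Z \perp Z^{(n)}$.
Make the further hypothesis that

(4.3.10) $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{n-3} > \lambda_{n-2} = \lambda_{n-1} > \lambda_n$

and suppose that equality holds in (4.3.9):

(4.3.11) $Q(Z) = \lambda_{n-1}\|Z\|^2, \quad Z \perp Z^{(n)}.$

Then

$\sum_{k=1}^{n-1} \lambda_k|\hat{z}_k|^2 = \lambda_{n-1}\sum_{k=1}^{n-1} |\hat{z}_k|^2,$

so that

$\sum_{k=1}^{n-1} (\lambda_k - \lambda_{n-1})|\hat{z}_k|^2 = 0.$

Since $(\lambda_k - \lambda_{n-1}) \geq 0$ for $k = 1, 2, \ldots, n - 1$, it follows that $(\lambda_k - \lambda_{n-1})|\hat{z}_k|^2 = 0$ for $k = 1, 2, \ldots, n - 1$. Hence, by (4.3.10), $\hat{z}_k = 0$ for $k = 1, 2, \ldots, n - 3$. The structure of $\hat{Z}$ must therefore be $\hat{Z} = (0, 0, \ldots, 0, \hat{z}_{n-2}, \hat{z}_{n-1}, 0)^T$ for arbitrary $\hat{z}_{n-2}$, $\hat{z}_{n-1}$, so that $Z = F^*\hat{Z} = \hat{z}_{n-2}Z^{(n-2)} + \hat{z}_{n-1}Z^{(n-1)}$.
In summary, if (4.3.10) holds, then (4.3.11) holds if and only if $Z$ is a linear combination of the eigenvectors $Z^{(n-1)}$ and $Z^{(n-2)}$.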
We now present an application of these ideas. Select $Q = (I - \pi)^*(I - \pi)$. From (4.1.8), the eigenvalues of $Q$ are (in the usual ordering)

$\lambda_j = 4\sin^2\dfrac{(j-1)\pi}{n}, \quad j = 1, 2, \ldots, n.$

The eigenvalue of smallest value is 0, corresponding to $j = 1$. The next two in size are paired, corresponding to $j = 2$ and $j = n$. The common value is $4\sin^2(\pi/n)$. Thus we arrive at

Theorem 4.3.1. Let $z_1, z_2, \ldots, z_n$, $z_{n+1} = z_1$ be complex numbers with $\sum_{k=1}^{n} z_k = 0$. Then

(4.3.12) $\sum_{k=1}^{n} |z_{k+1} - z_k|^2 \geq 4\sin^2\dfrac{\pi}{n}\sum_{k=1}^{n} |z_k|^2.$

Equality in (4.3.12) holds if and only if

(4.3.13) $z_k = \alpha w^{k-1} + \beta\bar{w}^{k-1}, \quad k = 1, 2, \ldots, n,$

for constants $\alpha$, $\beta$.

Proof. $Q(Z) = Z^*(I - \pi)^*(I - \pi)Z = \sum_{k=1}^{n} |z_{k+1} - z_k|^2$. The eigenvalue of $Q$ of lowest value is 0; the corresponding eigenvector is $(1, 1, \ldots, 1)$. The eigenvalues next in size are paired; the eigenvectors are $(1, w, w^2, \ldots, w^{n-1})$ and $(1, w^{n-1}, w^{n-2}, \ldots, w)$ (second and last columns of $F^*$).
The inequality (4.3.12) goes by the name of the discrete inequality of Wirtinger.
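The discrete Wirtinger inequality can be tried out on concrete data. The following is a minimal sketch, assuming only the statement (4.3.12): a vector is first centered so that its components sum to zero, and the two sides of the inequality are then compared. The function name `wirtinger_sides` is hypothetical.

```python
import cmath
import math

def wirtinger_sides(z):
    """Return (sum |z_{k+1} - z_k|^2, 4 sin^2(pi/n) sum |z_k|^2) for a centered n-gon z."""
    n = len(z)
    lhs = sum(abs(z[(k + 1) % n] - z[k]) ** 2 for k in range(n))
    rhs = 4 * math.sin(math.pi / n) ** 2 * sum(abs(v) ** 2 for v in z)
    return lhs, rhs

# An arbitrary hexagon, translated so that its vertices sum to zero.
raw = [0, 3, 4 + 2j, 2 + 5j, -1 + 4j, -2 + 1j]
mean = sum(raw) / len(raw)
centered = [v - mean for v in raw]
lhs, rhs = wirtinger_sides(centered)

# A regular hexagon z_k = w^k is of the form (4.3.13) with beta = 0: equality holds.
n = 6
w = cmath.exp(2j * cmath.pi / n)
regular = [w ** k for k in range(n)]
lhs_reg, rhs_reg = wirtinger_sides(regular)
```

For the arbitrary hexagon one should find `lhs >= rhs`, and for the regular hexagon the two sides agree up to rounding error.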
For upper bounds we must obtain

$\lambda_{\max} = \max_j 4\sin^2\dfrac{(j-1)\pi}{n}.$

For $n = 2p$, one has $\lambda_{\max} = 4$, occurring when $j = p + 1$. For $n = 2p + 1$, one has $\lambda_{\max} = 4\sin^2(p\pi/n) = 4\cos^2(\pi/2n)$, occurring doubled when $j = p + 1$, $p + 2$.
This information may now be inserted in (4.3.5).
PROBLEMS
1. Let $z_1, z_2, \ldots, z_n$ be complex numbers with $\sum_{k=1}^{n} z_k = 0$. For other integers $k$, define $z_k$ cyclically. Let $\Delta$ designate the difference operator ($\Delta z_k = z_{k+1} - z_k$, $\Delta^2z_k = \Delta(\Delta z_k)$, etc.). Then for all integers $p \geq 0$, use $(I - \pi)^p$ to prove that
2. For real $x_k$ write the Wirtinger inequality in the form

$\dfrac{2\pi}{n}\sum_{k=1}^{n} x_k^2 \leq \Bigl[\dfrac{2\pi/n}{2\sin(\pi/n)}\Bigr]^2 \sum_{k=1}^{n} \Bigl(\dfrac{x_{k+1} - x_k}{2\pi/n}\Bigr)^2\dfrac{2\pi}{n}.$

Use this, together with $n \to \infty$, to prove that if $f(t)$ has period $2\pi$ and $\int_0^{2\pi} f(t)\,dt = 0$, then

$\int_0^{2\pi} f^2(t)\,dt \leq \int_0^{2\pi} (f'(t))^2\,dt.$

What integrability conditions on $f(t)$ are required here? This is Wirtinger's integral inequality.
3. Let $z_k$, $k = 1, 2, \ldots, n$ be as in Problem 1. Prove that the $z_k$ are the real affine images of the vertices of a regular n-gon (see p. 123 for "affine").
4. Let $C$ be a circulant whose eigenvalues have equal moduli $\sigma$. Then, for all vectors $Z$, $\|CZ\| = \sigma\|Z\|$.
5. Prove that the field of values of any matrix is a convex set in the complex plane.
6. Prove that for any matrix, the convex hull of its eigenvalues is contained in the field of values.
4.4 NESTED n-GONS
(See Section 1.4.) Let $Z = (z_1, z_2, \ldots, z_n)^T$ designate the vertices of an n-gon and let the transformation $C$ ($= C_s$) be applied iteratively where

(4.4.1) $C_s = sI + t\pi, \quad s > 0, \; t > 0, \; s + t = 1.$

The eigenvalues of $C$ are $\lambda_k = s + tw^{k-1}$, $k = 1, 2, \ldots, n$. These numbers are strictly convex combinations of 1 and $w^{k-1}$. Hence, $\lambda_1 = 1$ and for $k = 2, \ldots, n$, one has $|\lambda_k| < 1$. See Figure 4.4.1. In fact, these numbers lie on a circle interior to and tangent to the unit circle at $z = 1$. One has

(4.4.2) $|\lambda_k| = \Bigl(s^2 + t^2 + 2st\cos\dfrac{2\pi(k-1)}{n}\Bigr)^{1/2}.$

Figure 4.4.1

It is clear that the eigenvalues of absolute value next in size to $\lambda_1 = 1$ are $\lambda_2$ and $\lambda_n$ ($= \bar{\lambda}_2$) for which

(4.4.3) $|\lambda_2|^2 = |\lambda_n|^2 = s^2 + t^2 + 2st\cos\dfrac{2\pi}{n}.$

From (3.4.14) one has for $r = 0, 1, \ldots$,

(4.4.4) $C^rZ = B_1Z + \lambda_2^rB_2Z + \cdots + \lambda_n^rB_nZ;$

hence

(4.4.4') $\lim_{r\to\infty} C^rZ = B_1Z.$

Since from (3.4.13), $B_1 = (1/n)\,\mathrm{circ}(1, 1, \ldots, 1)$, $B_1Z = (1/n)(z_1 + z_2 + \cdots + z_n)(1, 1, \ldots, 1)^T$. Hence, as $r \to \infty$, each component of $C^rZ$ approaches the c.g. of $Z$ with geometric rapidity. It is useful, therefore, to assume that this c.g. is at $z = 0$, eliminating the first term in (4.4.4). Thus we assume that

(4.4.5) $z_1 + z_2 + \cdots + z_n = 0.$
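The convergence of the nested n-gons to the center of gravity is easy to observe numerically. The sketch below is illustrative only; it assumes $(\pi Z)_j = z_{j+1}$, so that one application of $C_s = sI + t\pi$ replaces each vertex by a point on the side joining it to the next vertex. The helper name `step` is hypothetical.

```python
import math

def step(z, s, t):
    """One application of C = s I + t pi: vertex j becomes s z_j + t z_{j+1} (cyclically)."""
    n = len(z)
    return [s * z[j] + t * z[(j + 1) % n] for j in range(n)]

z0 = [0, 4, 4 + 3j, 1 + 5j, -2 + 2j]
n = len(z0)
centroid = sum(z0) / n

# Iterate the midpoint map (s = t = 1/2) and watch the n-gon shrink to its centroid.
z = list(z0)
for _ in range(200):
    z = step(z, 0.5, 0.5)
final_gap = max(abs(v - centroid) for v in z)

# The center of gravity is preserved at every step.
centroid_drift = abs(sum(z) / n - centroid)

# For s = t = 1/2 the second-largest eigenvalue modulus is cos(pi/n),
# which governs the geometric rate of the collapse.
rate = math.cos(math.pi / n)
```

After 200 steps the polygon is indistinguishable from a single point at the centroid, consistent with a contraction factor of about `rate` per step.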
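Further asymptotic analysis may be carried out along the line of the power method in numerical analysis for the computation of matrix eigenvalues. Write

(4.4.6) $C^rZ = \lambda_2^rB_2Z + \lambda_n^rB_nZ + (\lambda_3^rB_3 + \cdots + \lambda_{n-1}^rB_{n-1})Z.$

Then, since $|\lambda_n| = |\lambda_2|$,

(4.4.7) $\dfrac{C^rZ}{|\lambda_2|^r} = \Bigl(\dfrac{\lambda_2}{|\lambda_2|}\Bigr)^rB_2Z + \Bigl(\dfrac{\lambda_n}{|\lambda_2|}\Bigr)^rB_nZ + \Bigl(\Bigl(\dfrac{\lambda_3}{|\lambda_2|}\Bigr)^rB_3 + \cdots + \Bigl(\dfrac{\lambda_{n-1}}{|\lambda_2|}\Bigr)^rB_{n-1}\Bigr)Z.$

Now since $|\lambda_3|, |\lambda_4|, \ldots, |\lambda_{n-1}| < |\lambda_2|$, the term in the parentheses approaches 0 as $r \to \infty$. We designate it by $\varepsilon(r)$. (It is a column vector.) Let

(4.4.8) $\lambda_2 = |\lambda_2|e^{i\theta}$, $\theta = \tan^{-1}\dfrac{t\sin(2\pi/n)}{s + t\cos(2\pi/n)}$, $\lambda_n = |\lambda_2|e^{-i\theta}.$

Therefore,

(4.4.9) $\dfrac{C^rZ}{|\lambda_2|^r} = e^{ir\theta}B_2Z + e^{-ir\theta}B_nZ + \varepsilon(r).$

Write

(4.4.10) $Y_r = e^{ir\theta}B_2Z + e^{-ir\theta}B_nZ,$

so that

(4.4.11) $\dfrac{C^rZ}{|\lambda_2|^r} = Y_r + \varepsilon(r).$

Since from (3.4.9) $B_k = F^*\Lambda_kF$, we have

$Y_r = e^{ir\theta}B_2Z + e^{-ir\theta}B_nZ = F^*(0, e^{ir\theta}\hat{z}_2, 0, \ldots, 0, e^{-ir\theta}\hat{z}_n)^T.$

Hence

$\|Y_r\|^2 = |\hat{z}_2|^2 + |\hat{z}_n|^2$ = constant (as far as $r$ is concerned).

From this follows immediately that if the second and nth components of $FZ$, the Fourier transform of $Z$, are not both zero, then the $Y_r$ are a family of nonzero n-gons of constant moment of inertia.
In this case, then, the rate of convergence of $C^rZ$ is precisely $|\lambda_2|^r$, $r \to \infty$. Notice from (4.4.3) or Figure 4.4.1 that as $n \to \infty$, $|\lambda_2| \uparrow 1$, so that the more vertices in the n-gon, the slower the convergence.
The sequence of n-gons $C^rZ/|\lambda_2|^r$ will be called normalized, and the normalized n-gons "approach" the family $Y_r$. It is of some interest to look at the geometric nature of $Y_r$.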
Lemma. Let $Z = (z_1, z_2, \ldots, z_n)^T$. Let

(4.4.12) $p_Z(u) = z_1 + z_2u + z_3u^2 + \cdots + z_nu^{n-1}.$

For $r = 1, 2, \ldots, n$, let
Then
In particular,

(4.4.15) $B_2Z = \dfrac{1}{n}\,p_Z(\bar{w})\,(1, w, w^2, \ldots, w^{n-1})^T, \qquad B_nZ = \dfrac{1}{n}\,p_Z(w)\,(1, \bar{w}, \bar{w}^2, \ldots, \bar{w}^{n-1})^T.$

Proof. From (3.4.12), $B_r = (1/n)\,\mathrm{circ}(1, w^k, w^{2k}, \ldots, w^{(n-1)k})$. Hence each row of $B_r$ is the previous row multiplied by $\bar{w}^k$. The identities should now be obvious.
Lemma. Let $z = x + iy$, $z' = x' + iy'$, $\tau_1$, $\tau_2$ complex. Then

(4.4.17) $z' = \tau_1\bar{z} + \tau_2z$

is an affine transformation of the $(x, y)$-plane. It is nonsingular if and only if $|\tau_1| \neq |\tau_2|$.

Proof. Write $\tau_1 = \xi_1 + i\eta_1$, $\tau_2 = \xi_2 + i\eta_2$, where the $\xi$'s and $\eta$'s are real. Then the transformation (4.4.17) can be written as

(4.4.18)
$x' = (\xi_1 + \xi_2)x + (\eta_1 - \eta_2)y,$
$y' = (\eta_1 + \eta_2)x + (\xi_2 - \xi_1)y.$

This is an affine transformation of the $x$, $y$ plane. The determinant $\Delta$ of the transformation is

$\Delta = \xi_2^2 - \xi_1^2 - \eta_1^2 + \eta_2^2 = |\tau_2|^2 - |\tau_1|^2,$

so that $\Delta \neq 0$ if and only if $|\tau_1| \neq |\tau_2|$.
Theorem 4.4.1. If $|\hat{z}_2| \neq |\hat{z}_n|$, the n-gons $Y_r$ are nonzero, and of constant moment of inertia. They are the affine images of the regular unit polygon of $n$ sides, hence are convex.

Proof. We have

$Y_r = \dfrac{1}{n}e^{ir\theta}p_Z(\bar{w})\,(1, w, \ldots, w^{n-1})^T + \dfrac{1}{n}e^{-ir\theta}p_Z(w)\,(1, \bar{w}, \ldots, \bar{w}^{n-1})^T.$

Hence if we write $\tau_1 = (1/n)e^{ir\theta}p_Z(\bar{w})$, $\tau_2 = (1/n)e^{-ir\theta}p_Z(w)$, the vertices in $Y_r$ are the images of $(1, w, w^2, \ldots, w^{n-1})$ under $z' = \tau_1z + \tau_2\bar{z}$. Since $p_Z(w) = \sqrt{n}\,\hat{z}_n$ and $p_Z(\bar{w}) = \sqrt{n}\,\hat{z}_2$, it follows that $|\tau_1| \neq |\tau_2|$. This is a nonsingular affine transformation and all such transformations send convex figures into convex figures.
For further analysis, one makes the assumption that $\theta$ is a rational multiple of $2\pi$. In this case, one can identify limits of subsequences of the normalized figures $C^rZ/|\lambda_2|^r$, $r = 0, 1, 2, \ldots$.
Instead of working generally, we shall assume that

(4.4.19) $s = t = \tfrac{1}{2}.$

This leads immediately to

(4.4.20) $|\lambda_2| = \cos\dfrac{\pi}{n}, \quad \theta = \dfrac{\pi}{n},$

so that (4.4.9) becomes

(4.4.21) $\dfrac{C^rZ}{(\cos \pi/n)^r} = e^{\pi ir/n}B_2Z + e^{-\pi ir/n}B_nZ + \varepsilon(r).$

Let now

(4.4.22) $r = 2jn + b, \quad 0 \leq b \leq 2n - 1, \quad j = 0, 1, \ldots.$

Then (4.4.21) becomes

(4.4.23) $\dfrac{C^{2jn+b}Z}{(\cos \pi/n)^{2jn+b}} = e^{\pi ib/n}B_2Z + e^{-\pi ib/n}B_nZ + \varepsilon(2jn + b).$

Writing

(4.4.24) $U_b = e^{\pi ib/n}B_2Z + e^{-\pi ib/n}B_nZ,$

one now has

(4.4.25) $\lim_{j\to\infty} \dfrac{C^{2jn+b}Z}{(\cos \pi/n)^{2jn+b}} = U_b, \quad b = 0, 1, 2, \ldots, 2n - 1,$

so that the normalized n-gons approach $2n$ limiting n-gons, each of which is an affine transform of a regular n-gon. See Figure 4.4.2.
PROBLEMS
1. Prove that if $|\hat{z}_2| \neq |\hat{z}_n|$, the sequence of corresponding normalized vertices of the nested n-gons $r = 0, 1, 2, \ldots$ lie asymptotically on an ellipse.
2. Analyze what happens when $Z$ is taken as the vertices of a regular polygon.
3. Take $Z = (1, w^2, w^4, w, w^3)^T$, $w^5 = 1$ (a regular pentagram). What happens under $C_{1/2}$? Do the successive iterates ever become convex?
4. Analyze what happens when $Z$ is taken as the affine image of a regular polygon.
5. Let $C_{1/r} = \mathrm{circ}(1/r, 1 - (1/r), 0, 0, \ldots, 0)$, $r = 1, 2, 3, \ldots$. Discuss $\prod_{r=1}^{\infty} C_{1/r}$, and apply it to nested n-gons.
Figure 4.4.2
4.5 SMOOTHING AND VARIATION REDUCTION
The smoothing or filtering of data is a common operation and is worthy of discussion within the present framework. We assume that we have a finite sequence of data values $Z = (z_1, \ldots, z_n)$ and we subject the data to a linear transformation with matrix $A$:

(4.5.1) $\hat{Z} = AZ.$

What properties of the matrix $A$ will be required for smoothing? Numerous definitions have been put forward. Greville has proposed the following. A matrix $A$ will be called smoothing if:
(1) $A$ has $\lambda = 1$ as an eigenvalue,
(2) $A^\infty = \lim_{p\to\infty} A^p$ exists.
The rationale behind this definition is as follows. The eigenspace $S$ of vectors corresponding to $\lambda = 1$ has the property that if $Z \in S$, $AZ = Z$. Call $S$ the set of smooth vectors. Then vectors that are already smooth are unaffected by the operation $A$. Now take any vector $Z$ and "smooth" it over and over again by applying $A$. Then this will approach $A^\infty Z$. Now since $A(A^\infty Z) = A^\infty Z$, $A^\infty Z \in S$, hence it is a smooth vector.
Referring to Theorem 3.6.2, we see that the necessary and sufficient condition for $A$ to be smoothing in the sense of Greville is that:
(1) $\lambda = 1$ be an eigenvalue of $A$.
(2) $\lambda = 1$ be a simple root of the minimal polynomial of $A$ and if $\lambda \neq 1$ is an eigenvalue, then $|\lambda| < 1$.
If $A$ is a circulant then the criterion simplifies somewhat.
Theorem 4.5.1. A circulant $C$ is a smoothing operator if and only if
(1) $\lambda = 1$ is an eigenvalue of $C$.
(2) If $\lambda \neq 1$ is an eigenvalue of $C$, then $|\lambda| < 1$.
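The criterion of Theorem 4.5.1 can be applied mechanically once the eigenvalues are known. The sketch below is an illustration, not a definitive test: it assumes the eigenvalues of circ(c) are the values of the representer at the nth roots of unity, and it uses a small tolerance `tol` (an arbitrary choice) to decide whether a computed eigenvalue equals 1. The function names are hypothetical.

```python
import cmath

def eigenvalues(c):
    """lambda_j = p(w^j), j = 0..n-1, for the circulant with first row c."""
    n = len(c)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c[m] * w ** (j * m) for m in range(n)) for j in range(n)]

def is_smoothing(c, tol=1e-9):
    """Theorem 4.5.1: 1 is an eigenvalue and every other eigenvalue has modulus < 1."""
    lams = eigenvalues(c)
    has_one = any(abs(lam - 1) < tol for lam in lams)
    others_small = all(abs(lam - 1) < tol or abs(lam) < 1 - tol for lam in lams)
    return has_one and others_small

averaging = is_smoothing([0.5, 0.5, 0, 0])   # C = (I + pi)/2
pure_shift = is_smoothing([0, 1, 0, 0])      # C = pi, eigenvalues 1, i, -1, -i
damping = is_smoothing([0.5, 0, 0, 0])       # C = I/2, 1 is not an eigenvalue
```

Under these assumptions the averaging circulant qualifies as smoothing, while the pure shift and the uniform damping do not.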
have for real $\mu_k \ge 0$, $D = \mathrm{diag}(\mu_1, \ldots, \mu_n)$ and
unitary U, $A^*A = U^*DU$. Hence $\sigma^2 I - A^*A = U^*(\sigma^2 I - D)U$.
So the eigenvalues of $\sigma^2 I - A^*A$ are $\sigma^2 - \mu_k$.
Thus $0 \le \mu_k \le \sigma^2$ is necessary and sufficient.
Corollary. $\|AZ\| \le \sigma\|Z\|$ for all Z if and only if
$\rho(A^*A) \le \sigma^2$.
If $0 \le \sigma \le 1$, condition (4.5.6) may be described
by saying that A is norm reducing (more strictly:
norm nonincreasing). If $0 \le \sigma < 1$, A is a contraction.
[A contraction generally means that (4.5.6) is valid
with $0 \le \sigma < 1$ where $\|\cdot\|$ can be taken to be any
vector norm.]
Lemma. Let $M_k$, k = 1, 2, ..., be a sequence of
matrices. Then
(a) $\lim_{k \to \infty} M_kZ = 0$, for all Z, if and only if
(b) $\lim_{k \to \infty} M_k = 0$.
Proof. Using a compatible matrix norm $\|M\|$,
one has $\|M_kZ\| \le \|M_k\|\,\|Z\|$. Now $\lim_{k \to \infty} M_k = 0$ if
and only if $\lim_{k \to \infty} \|M_k\| = 0$. Hence (b) $\Rightarrow$ (a).
Conversely, (b) follows from (a) if, in (a), one selects
Z successively as all the unit vectors.
Theorem 4.5.2. Let $M_k$, k = 1, 2, ..., be a sequence
of matrices and set $\sigma_k = \rho(M_k^*M_k)$ = spectral radius of
$M_k^*M_k$. Let
(4.5.8) $\lim_{r \to \infty} \prod_{k=1}^{r} \sigma_k = 0$.
Then
for all Z, hence
Proof. From the previous corollary,
If we wish to obtain a condition such as (4.5.7)
or (4.5.8) directly on the eigenvalues of M (and not
on those of M*M), it is convenient to hypothesize that
M is normal.
For in this case $M = U^*\mathrm{diag}(\lambda_1, \ldots, \lambda_n)U$ so that
$M^*M = U^*\mathrm{diag}(\bar{\lambda}_1\lambda_1, \bar{\lambda}_2\lambda_2, \ldots, \bar{\lambda}_n\lambda_n)U$,
and the eigenvalues of M*M are precisely
$|\lambda_1|^2, |\lambda_2|^2, \ldots, |\lambda_n|^2$.
In this way we are led to our next result.
Theorem 4.5.3. Let $M_k$, k = 1, 2, ..., be a sequence
of normal matrices. Assume that
(4.5.11) $\prod_{k=1}^{\infty} \rho(M_k) = 0$.
Then
(4.5.12) $\prod_{k=1}^{\infty} M_k = 0$.
In the case of a sequence of circulants, see the
corollary to Theorem 3.6.1 for a stronger statement.
We return now to the inequality $\|AZ\|^2 \le \|BZ\|^2$.
We have already seen that a necessary and sufficient
condition for this is that B*B - A*A be positive
semidefinite. We should like to be able to "decouple"
the matrices A and B. To this end, we make the
hypothesis that A and B are normal and commute. (Recall
that this means that A*A = AA*, B*B = BB*, AB = BA.)
Such pairs of matrices are remarkable in that they are
simultaneously unitarily diagonalizable. We shall now
prove this basic fact.
Theorem 4.5.4. Let A and B be square matrices of the
same order. Then A and B are normal and commute if
and only if they are simultaneously diagonalizable by
one and the same unitary matrix.
Proof. "If." Let $A = U^*D_1U$, $B = U^*D_2U$ where U
is unitary and $D_1$, $D_2$ are diagonal. Then
$A^*A = U^*\bar{D}_1UU^*D_1U = U^*\bar{D}_1D_1U = U^*D_1\bar{D}_1U = AA^*$
so that A is normal. Similarly for B. Now
$AB = U^*D_1UU^*D_2U = U^*D_1D_2U = U^*D_2D_1U = BA$.
"Only if." Assume that A, B are normal and
commute. Since A is normal, we have for some unitary
U and diagonal D, A = U*DU. Since AB = BA, we have
U*DUB = BU*DU. Hence D(UBU*) = (UBU*)D. Set C =
UBU*. Hence B = U*CU. Then DC = CD. Write
$D = \mu_1 I_{a_1} \oplus \mu_2 I_{a_2} \oplus \cdots \oplus \mu_s I_{a_s}$
where $\mu_1, \mu_2, \ldots, \mu_s$ are distinct and where $\mu_1$ is
repeated $a_1$ times, ..., $\mu_s$ is repeated $a_s$ times,
$a_1 + a_2 + \cdots + a_s = n$. This displays the possible
multiplicities of the eigenvalues of A. If now $C = (c_{jk})$
and $d_j$ denotes the j-th diagonal entry of D, then
DC = CD implies
$d_jc_{jk} = d_kc_{jk}$, j, k = 1, 2, ..., n.
Therefore
if $d_j \neq d_k$ then $c_{jk} = 0$,
if $d_j = d_k$ then $c_{jk}$ = arbitrary.
Therefore C must be of the form $C = C_1 \oplus C_2 \oplus \cdots \oplus C_s$
where $C_r$ is of order $a_r$ and is arbitrary. Since B is
normal, so is C. Since C is normal, so is each $C_k$,
k = 1, 2, ..., s (as is easily established). Hence
for appropriate unitary $V_k$ and diagonal $\Lambda_k$ of order
$a_k$, we have $C_k = V_k^*\Lambda_kV_k$. Thus,
$C = V^*\Lambda V$, so that $B = U^*V^*\Lambda VU$,
where
$V = V_1 \oplus V_2 \oplus \cdots \oplus V_s$,
$\Lambda = \Lambda_1 \oplus \Lambda_2 \oplus \cdots \oplus \Lambda_s$.
Now
$A = U^*DU = U^*(\mu_1V_1^*V_1 \oplus \mu_2V_2^*V_2 \oplus \cdots \oplus \mu_sV_s^*V_s)U$
$= U^*(V_1^* \oplus V_2^* \oplus \cdots \oplus V_s^*)(\mu_1I_{a_1} \oplus \cdots \oplus \mu_sI_{a_s})(V_1 \oplus V_2 \oplus \cdots \oplus V_s)U$
$= U^*V^*DVU$.
Therefore VU diagonalizes A and B. It is easily
verified that VU is unitary.
Theorem 4.5.5. Let A and B be normal and commute.
Then $\|AZ\| \le \|BZ\|$ for all Z if and only if there is
an ordering of the eigenvalues of A and B
$\lambda_1, \lambda_2, \ldots, \lambda_n$;
$\mu_1, \mu_2, \ldots, \mu_n$
(under a simultaneous diagonalization) such that
(4.5.13) $|\lambda_k| \le |\mu_k|$, k = 1, 2, ..., n.
Proof. Let A and B be normal and commute. Then
we can find a unitary U such that
$A = U^*\mathrm{diag}(\lambda_1, \ldots, \lambda_n)U$, $B = U^*\mathrm{diag}(\mu_1, \ldots, \mu_n)U$.
Hence
$B^*B - A^*A = U^*\mathrm{diag}(|\mu_1|^2 - |\lambda_1|^2, |\mu_2|^2 - |\lambda_2|^2, \ldots, |\mu_n|^2 - |\lambda_n|^2)U$.
Condition (4.5.13) is now equivalent to the positive
semidefiniteness of B*B - A*A.
Corollary. If A and B are circulants, then (4.5.13)
is necessary and sufficient for $\|AZ\| \le \|BZ\|$ for
all Z.
Proof. Circulants are normal and commute.
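For circulants the comparison in the corollary reduces to comparing moduli of representer-polynomial values, because every circulant of order n is diagonalized by the same Fourier matrix and the index k therefore gives a consistent ordering. The sketch below assumes that convention; the two first rows and the test vector are hypothetical examples, and the helper names are ours.

```python
import cmath

def circulant_eigenvalues(first_row):
    """Eigenvalues p(w^k), k = 0..n-1, in the common Fourier ordering."""
    n = len(first_row)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c * w ** (j * k) for j, c in enumerate(first_row)) for k in range(n)]

def circ_apply(first_row, z):
    """Multiply circ(first_row) by the column vector z."""
    n = len(first_row)
    return [sum(first_row[(j - i) % n] * z[j] for j in range(n)) for i in range(n)]

def norm(z):
    return sum(abs(x) ** 2 for x in z) ** 0.5

def dominated(a_row, b_row, tol=1e-12):
    """True when |lambda_k| <= |mu_k| for every k (eigenvalues of A and B)."""
    lam = circulant_eigenvalues(a_row)
    mu = circulant_eigenvalues(b_row)
    return all(abs(l) <= abs(m) + tol for l, m in zip(lam, mu))

a = [0.25, 0.5, 0.25, 0.0]   # hypothetical circulant A
b = [0.5, 0.5, 0.0, 0.0]     # hypothetical circulant B
z = [1, 2, -1, 3]            # arbitrary test vector

print(dominated(a, b), dominated(b, a))
print(norm(circ_apply(a, z)) <= norm(circ_apply(b, z)))
```

Here the eigenvalue test holds for the pair (A, B) but not for (B, A), and the sampled vector obeys the norm inequality as the corollary predicts.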
In dealing with pairs of matrices that are normal
and commute, it is useful to assume that their
eigenvalues have been ordered so as to be consistent with
the simultaneous diagonalization by unitary U.
Let M be a square matrix. We shall call a matrix
A M-reducing if
(4.5.14) $\|MAZ\| \le \|MZ\|$ for all Z.
Theorem 4.5.6
(a) A is M-reducing if and only if M*M - (MA)*(MA)
is positive semidefinite.
(b) Let A and M be normal and commute. Let
$\lambda_1, \ldots, \lambda_n$; $\mu_1, \ldots, \mu_n$ be the eigenvalues of A and M.
Let $J_M$ be the set of integers r = 1, 2, ..., n for
which $\mu_r \neq 0$. Then a necessary and sufficient
condition that A be M-reducing is that
(4.5.15) $|\lambda_k| \le 1$ for $k \in J_M$.
Proof. Under the hypothesis, there is a unitary
U such that $A = U^*\mathrm{diag}(\lambda_1, \ldots, \lambda_n)U$,
$M = U^*\mathrm{diag}(\mu_1, \ldots, \mu_n)U$. Therefore
$T = M^*M - (MA)^*(MA) = U^*\mathrm{diag}(\bar{\mu}_k\mu_k - \bar{\lambda}_k\lambda_k\bar{\mu}_k\mu_k)U$.
Hence the condition for positive semidefiniteness of T is
$|\mu_k|^2(1 - |\lambda_k|^2) \ge 0$, k = 1, 2, ..., n.
This is equivalent to (4.5.15).
Corollary. A is variation reducing [see (4.5.5)] if
and only if $(I - \pi)^*(I - \pi) - ((I - \pi)A)^*((I - \pi)A)$ is
positive semidefinite.
Proof. Set $M = I - \pi$.
Corollary. Let A be a circulant with eigenvalues
$\lambda_1, \ldots, \lambda_n$. Then a necessary and sufficient condition
that A be variation reducing is that
$|\lambda_k| \le 1$, k = 2, 3, ..., n.
Proof. The eigenvalues of $M = I - \pi$ are $1 - w^{j-1}$,
j = 1, ..., n. Hence $J_{I-\pi} = \{2, 3, \ldots, n\}$.
PROBLEM
1. Consider the nonautonomous system of difference
equations $Z_{n+1} = G_nZ_n$ where
Show that $\rho(G_n) < 1$, but the sequence $Z_n$ may
diverge. (Markus-Yamabe, discretized.)
4.6 APPLICATIONS TO ELEMENTARY PLANE GEOMETRY:
n-GONS AND Kr-GRAMS
We begin with two theorems from elementary plane
geometry.
Theorem A. Let $z_1, z_2, z_3, z_4$ be the vertices of a
quadrilateral. Connect the midpoints of the sides
cyclically. Then the figure that results is always a
parallelogram (Figure 4.6.1). Write $P = (z_1, z_2, z_3, z_4)^T$,
$C_{1/2} = \mathrm{circ}(1/2, 1/2, 0, 0)$. This means
that $C_{1/2}P$ is always a parallelogram. Hence the
transformation $C_{1/2}$ is not invertible. (For if it
were, there would be quadrilaterals whose midpoint
quadrilaterals would be arbitrary.)
Figure 4.6.1
Theorem B. Given any triangle, erect upon its sides
outwardly (or inwardly) equilateral triangles. Then
the centers of the three equilateral triangles form
an equilateral triangle (see Figure 4.6.2). This is
known as Napoleon's theorem.
Figure 4.6.2
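Theorem B can be checked numerically with complex arithmetic. The sketch below assumes the triangle is given counterclockwise, so that the outer side of each directed edge lies to its right and the third vertex of the erected triangle is obtained by rotating the edge through -60 degrees. The function name and the sample triangle are illustrative only.

```python
import cmath

def napoleon_centers(z1, z2, z3):
    """Centers of the equilateral triangles erected outwardly on the sides
    of a counterclockwise triangle (z1, z2, z3)."""
    rot = cmath.exp(-1j * cmath.pi / 3)      # rotation through -60 degrees
    centers = []
    for p, q in ((z1, z2), (z2, z3), (z3, z1)):
        apex = p + (q - p) * rot             # third vertex, on the outer side of pq
        centers.append((p + q + apex) / 3)   # centroid of the erected triangle
    return centers

c1, c2, c3 = napoleon_centers(0, 4, 1 + 3j)
sides = [abs(c1 - c2), abs(c2 - c3), abs(c3 - c1)]
print(sides)
```

The three printed side lengths agree up to rounding, which is the content of Napoleon's theorem for this sample triangle.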
Applications to Elementary Plane Geometry 141
Our object is now to unify and generalize these
two theorems by means of circulant transforms and to
derive extremal properties of certain familiar geo-
metrical configurations by means of the M-P inverses
of relevant circulants.
Let us first find simple characterizations for
equilateral triangles and parallelograms. Let $z_1, z_2, z_3$
be the vertices of a triangle T in counterclockwise
order. Then T is equilateral if and only if
(4.6.1a) $z_1 + wz_2 + w^2z_3 = 0$, $w = \exp(2\pi i/3)$,
while
(4.6.1b) $z_1 + w^2z_2 + wz_3 = 0$
is necessary and sufficient for clockwise equilaterality.
The proof is easily derived from the fact that
if $z_1, z_2, z_3$ are counterclockwise equilateral they are the
images under $z \to a + bz$ of $1, w, w^2$; that is, if and
only if for some a, b, $z_1 = a + b$, $z_2 = a + bw$,
$z_3 = a + bw^2$. Of course, if b = 0, the three points
degenerate to a single point. The center of the
triangle is defined to be $z = a = \mathrm{c.g.}(z_1, z_2, z_3)$.
Let $z_1, z_2, z_3, z_4$ be a non-self-intersecting
quadrilateral Q given counterclockwise. Then Q is a
parallelogram if and only if
(4.6.2) $z_1 - z_2 + z_3 - z_4 = 0$.
This is readily established.
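Theorem A and the characterization (4.6.2) can be combined in a short numerical check: apply the midpoint map circ(1/2, 1/2, 0, 0) to a quadrilateral and evaluate the left-hand side of (4.6.2). This is a minimal sketch; the function names and the sample quadrilateral are our own illustrative choices.

```python
def midpoint_polygon(vertices):
    """Apply circ(1/2, 1/2, 0, ..., 0): join the midpoints of consecutive sides."""
    n = len(vertices)
    return [(vertices[i] + vertices[(i + 1) % n]) / 2 for i in range(n)]

def parallelogram_defect(q):
    """Left-hand side of (4.6.2); it vanishes exactly for parallelograms."""
    z1, z2, z3, z4 = q
    return z1 - z2 + z3 - z4

quad = [0j, 5 + 1j, 3 + 4j, -2 + 2j]   # an arbitrary quadrilateral
mid = midpoint_polygon(quad)

print(parallelogram_defect(quad))      # nonzero: quad is not a parallelogram
print(parallelogram_defect(mid))       # zero: the midpoint figure is one
```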
For integer $n \ge 3$ and integer r set $w = \exp(2\pi i/n)$
and set
(4.6.3) $K_r = \frac{1}{n}\,\mathrm{circ}(1, w^r, w^{2r}, \ldots, w^{(n-1)r})$.
Notice that the rows of $K_r$ are identical to the
first row, each multiplied by some power of w.
In particular, one has
(4.6.4) n = 3, r = 1: $K_1 = \frac{1}{3}\,\mathrm{circ}(1, w, w^2)$, $w = \exp(2\pi i/3)$,
(4.6.5) n = 4, r = 2: $K_2 = \frac{1}{4}\,\mathrm{circ}(1, -1, 1, -1)$, $w = \exp(2\pi i/4) = i$.
We see from (4.6.1) and (4.6.2) that P is equilateral
or a parallelogram (interpreted properly) if and only
if KP = 0, that is, if and only if P lies in the null
space of K. This leads to the definition.
Definition. An n-gon $P = (z_1, z_2, \ldots, z_n)^T$ will
be called a $K_r$-gram if and only if
$K_rP = 0$
or equivalently if and only if
$z_1 + w^rz_2 + w^{2r}z_3 + \cdots + w^{(n-1)r}z_n = 0$.
The representer polynomial for $K_r$ is
$p(z) = \frac{1}{n}(1 + w^rz + w^{2r}z^2 + \cdots + w^{(n-1)r}z^{n-1}) = \frac{(w^rz)^n - 1}{n(w^rz - 1)}$.
The eigenvalues of $K_r$ are $p(w^{j-1})$, j = 1, 2, ..., n.
Now for $j - 1 \neq n - r$, $p(w^{j-1}) = 0$,
while $p(w^{n-r}) = 1$. Thus if
$j = n - r + 1$
then $K_r = F^*\mathrm{diag}(0, 0, \ldots, 0, 1, 0, \ldots, 0)F$, the 1
occurring in the j-th position. This means that
(4.6.8) $K_r = F^*\Lambda_jF = B_j$ [see (3.4.9)].
The $B_j$ are the principal idempotents of all circulants
of order n. We have [see after (3.4.10)]
If C is a circulant of rank n - 1, then by
(3.3.13), for some integer j, $1 \le j \le n$,
(4.6.9) $CC^{\dagger} = I - K_r$.
From (4.6.8), (4.6.9), and Section 2.8.2, properties
(1) and (2),
(4.6.10) $K_rC = 0$.
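The matrices $K_r$ of (4.6.3) are small enough to build explicitly. The following sketch constructs them with the convention that row i of circ(c) is the first row shifted i places to the right, and checks numerically that $K_1$ (n = 3) annihilates a counterclockwise equilateral triangle, that $K_2$ (n = 4) annihilates a parallelogram, and that $K_r$ is idempotent. The helper names and sample polygons are illustrative assumptions.

```python
import cmath

def circ(first_row):
    """Ordinary circulant: row i is the first row shifted i places to the right."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def K(n, r):
    """The matrix K_r of (4.6.3)."""
    w = cmath.exp(2j * cmath.pi / n)
    return circ([w ** (k * r) / n for k in range(n)])

def matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

w3 = cmath.exp(2j * cmath.pi / 3)
triangle = [1, w3, w3 ** 2]               # counterclockwise equilateral
parallelogram = [0, 2, 3 + 1j, 1 + 1j]

residual_triangle = max(abs(x) for x in matvec(K(3, 1), triangle))
residual_parallelogram = max(abs(x) for x in matvec(K(4, 2), parallelogram))
idempotent_error = max_abs_diff(matmul(K(4, 2), K(4, 2)), K(4, 2))
print(residual_triangle, residual_parallelogram, idempotent_error)
```

All three printed quantities are zero up to rounding error.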
Several more identities will be of use. Again,
let $K_r = \frac{1}{n}\,\mathrm{circ}(1, w^r, w^{2r}, \ldots, w^{(n-1)r})$. Let Y
be an arbitrary circulant so that one can write
$Y = F^*\mathrm{diag}(\eta_1, \eta_2, \ldots, \eta_n)F$ for appropriate $\eta_j$.
Now
$K_rY = (F^*\Lambda_jF)(F^*\mathrm{diag}(\eta_1, \ldots, \eta_n)F) = F^*\mathrm{diag}(0, \ldots, 0, \eta_j, 0, \ldots, 0)F = \eta_jF^*\Lambda_jF = \eta_jK_r$.
Thus
(4.6.11) $K_rY = \eta_jK_r$.
In particular, if Y is merely a column vector
$Y = (y_0, y_1, \ldots, y_{n-1})^T$, then
(4.6.12) $K_rY = \eta_j\,\mathrm{fc}(K_r)$
where the notation $\mathrm{fc}(K_r)$ designates the first column
of $K_r$. One also has
(4.6.13) $K_rY = \frac{\sigma}{n}(1, w^{(n-1)r}, w^{(n-2)r}, \ldots, w^r)^T$
where
(4.6.14) $\sigma = y_0 + y_1w^r + \cdots + y_{n-1}w^{(n-1)r}$.
Let Y be further specialized to $Y = \mathrm{fc}(K_r)$.
Then $Y = \frac{1}{n}(1, w^{(n-1)r}, w^{(n-2)r}, \ldots, w^r)^T$. Therefore from
(4.6.14), $\sigma = 1$, and from (4.6.13)
(4.6.15) $K_r\,\mathrm{fc}(K_r) = \mathrm{fc}(K_r)$.
Each circulant C of rank n - 1 determines an
integer j uniquely, and through (3.3.13) and (4.6.9)
a matrix $K_r$, hence a class of $K_r$-grams. In the
following theorems this determination will be assumed.
Theorem 4.6.1. Let P be an n-gon. Then there exists
an n-gon $\hat{P}$ such that $C\hat{P} = P$ if and only if P is a
$K_r$-gram.
Proof. The system of equations $C\hat{P} = P$ has a
solution if and only if $P = CC^{\dagger}P$. This is equivalent
to $P = (I - K_r)P = P - K_rP$ or $K_rP = 0$ [by (4.6.9)].
Corollary. Let P be a $K_r$-gram. Then the general
solution to $C\hat{P} = P$ is given by
(4.6.16) $\hat{P} = C^{\dagger}P + \tau\,\mathrm{fc}(K_r)$
for an arbitrary constant $\tau$.
Proof. If P is a $K_r$-gram, then the general
solution to $C\hat{P} = P$ is given by
$\hat{P} = C^{\dagger}P + (I - C^{\dagger}C)Y = C^{\dagger}P + K_rY$
for an arbitrary column vector Y. From
(4.6.12), $K_rY = \eta_j\,\mathrm{fc}(K_r)$ and the statement follows.
Corollary. P is a $K_r$-gram if and only if there is an
n-gon Q such that P = CQ.
Proof. Let P = CQ. Then $K_rP = K_rCQ$. Since $K_rC = 0$,
it follows that $K_rP = 0$ so that P is a $K_r$-gram.
Conversely, let P be a $K_r$-gram. Now take for Q any $\hat{P}$
whose existence is guaranteed by the previous corollary.
Corollary. Given an n-gon P which is a $K_r$-gram. Then
given an arbitrary complex number $\hat{z}_1$, we can find a
unique n-gon $\hat{P} = (\hat{z}_1, \hat{z}_2, \ldots, \hat{z}_n)^T$, with $\hat{z}_1$ as its
first vertex and such that $C\hat{P} = P$.
Proof. Since the general solution of $C\hat{P} = P$ is
$\hat{P} = C^{\dagger}P + \tau\,\mathrm{fc}(K_r)$, given $\hat{z}_1$, we may solve uniquely
for an appropriate $\tau$ since the first component of
$\mathrm{fc}(K_r)$ is $1/n$ ($\neq 0$).
Theorem 4.6.2. Let P be an n-gon which is a $K_r$-gram.
Then there is a unique n-gon Q which is a $K_r$-gram and
such that CQ = P. It is given by $Q = C^{\dagger}P$.
Proof
(a) Since P is a $K_r$-gram, it has the form P = CR
for some R. Hence $Q = C^{\dagger}P = C^{\dagger}CR = C(C^{\dagger}R)$. Hence Q
is a $K_r$-gram.
(b) Q is a solution of CQ = P, as we can see by
selecting $\tau = 0$ in the above.
(c) All solutions are of the form
$\hat{P} = C^{\dagger}P + \tau\,\mathrm{fc}(K_r)$. Now $\hat{P}$ is a $K_r$-gram if and only if
$K_r\hat{P} = 0$. That is, if and only if
$K_rC^{\dagger}P + \tau K_r\,\mathrm{fc}(K_r) = 0$. Now $K_rC^{\dagger} = 0$. But
$K_r\,\mathrm{fc}(K_r) = \mathrm{fc}(K_r) \neq 0$. Therefore $\tau = 0$.
Theorem 4.6.3. Let P be a $K_r$-gram. Among the
infinitely many n-gons R for which CR = P, there is a unique
one of minimum norm $\|R\|$. It is given by $R = C^{\dagger}P$.
Hence it coincides with the unique $K_r$-gram Q
such that CQ = P.
Proof. Use the last theorem and the least
squares characterization of the M-P inverse.
Suppose now that P is a general n-gon and we wish
to approximate it by a $K_r$-gram R such that $\|P - R\|$
= minimum. Every $K_r$-gram can be written as R = CQ for
some n-gon Q so that our problem is: given P, find a
Q such that $\|P - CQ\|$ = minimum. This problem has a
solution, and the solution is unique if and only if
the columns of C are linearly independent. This is
not the case (the rank of C being n - 1), hence
$Q = C^{\dagger}P$ is the solution with minimum $\|Q\|$. Thus,
$R = CQ = CC^{\dagger}P$ is the best approximation of the n-gon P by a
$K_r$-gram with minimum $\|Q\|$. We phrase this as follows.
Theorem 4.6.4. Given a general n-gon $P = (z_1, \ldots, z_n)^T$.
The unique $K_r$-gram R = CQ for which $\|P - R\|$ =
minimum and $\|Q\|$ = minimum is given by
(4.6.17) $R = CC^{\dagger}P = (I - K_r)P = P - K_rP = P - \frac{\sigma}{n}(1, w^{(n-1)r}, w^{(n-2)r}, \ldots, w^r)^T$
where $\sigma = z_1 + z_2w^r + \cdots + z_nw^{(n-1)r}$. Alternatively,
this can be written as
$R = P - \eta_j\,\mathrm{fc}(K_r)$
where $\eta_j$ is determined from
$\mathrm{circ}(z_1, z_2, \ldots, z_n) = F^*\mathrm{diag}(\eta_1, \eta_2, \ldots, \eta_n)F$.
Proof. As before, $R = CC^{\dagger}P = (I - K_r)P = P - K_rP$.
By (4.6.12), $K_rP = \eta_j\,\mathrm{fc}(K_r)$. Notice that R is
a $K_r$-gram because
$K_rR = K_r(P - \eta_j\,\mathrm{fc}(K_r)) = K_rP - \eta_jK_r\,\mathrm{fc}(K_r)$.
Since by (4.6.15) $K_r\,\mathrm{fc}(K_r) = \mathrm{fc}(K_r)$, $K_rR = 0$.
Notice also that if P is already a $K_r$-gram,
$\sigma = z_1 + z_2w^r + \cdots + z_nw^{(n-1)r} = 0$. In this case, from
(4.6.17), R = P; so, as expected, P is its own best
approximation.
Generally, of course, the operation $R(P) = CC^{\dagger}P$
is a projection onto the row or column space of C.
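The projection $R = (I - K_r)P$ needs no matrix at all: with $K_r$ normalized as in (4.6.3), subtracting $K_rP$ amounts to subtracting $\sigma/n$ times the vector $(1, w^{(n-1)r}, \ldots, w^r)^T$. The sketch below assumes that normalization; the function name and the sample quadrilateral are hypothetical.

```python
import cmath

def nearest_gram(p, r):
    """R = (I - K_r) P, computed as P - (sigma/n)(1, w^{(n-1)r}, ..., w^r)^T."""
    n = len(p)
    w = cmath.exp(2j * cmath.pi / n)
    sigma = sum(z * w ** (k * r) for k, z in enumerate(p))
    # component k of (1, w^{(n-1)r}, ..., w^r) is w^{-k r}
    return [z - (sigma / n) * w ** (-k * r) for k, z in enumerate(p)]

quad = [0j, 5 + 1j, 3 + 4j, -2 + 2j]     # a general quadrilateral
r1 = nearest_gram(quad, 2)               # n = 4, r = 2: nearest parallelogram
defect = r1[0] - r1[1] + r1[2] - r1[3]   # left-hand side of (4.6.2)
r2 = nearest_gram(r1, 2)                 # projecting again changes nothing

print(r1)
print(abs(defect), max(abs(a - b) for a, b in zip(r1, r2)))
```

The second line of output is zero up to rounding: the result satisfies (4.6.2), and the map is idempotent, as a projection must be.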
4.7 THE SPECIAL CASE: circ(s, t, 0, ..., 0)
An interesting class of cyclic transformations comes
about from circ(s, t, 0, 0, ..., 0), of order n, where
one assumes that s + t = 1, $st \neq 0$, and that the rank
is n - 1. Write
$C_s = \mathrm{circ}(s, 1 - s, 0, \ldots, 0)$.
The representer polynomial is p(z) = s + (1 - s)z, so
that the eigenvalues of $C_s$ are $p(w^k) = s + (1 - s)w^k$,
k = 0, 1, ..., n - 1. Suppose that for a fixed j,
$0 \le j \le n - 1$, $s + (1 - s)w^j = 0$. Thus, there will be a
zero eigenvalue if and only if $s = w^j/(w^j - 1)$,
$t = 1/(1 - w^j)$. For such s, $C_s$ can have no more than one
zero eigenvalue since $s + (1 - s)w^k = s + (1 - s)w^j = 0$
implies that $w^k = w^j$, or k = j. Thus we have
Theorem 4.7.1. The circulant $C_s$ has rank n - 1 if and
only if for some integer j, $0 \le j \le n - 1$,
(4.7.2) $s = \frac{w^j}{w^j - 1}$, $t = \frac{1}{1 - w^j}$.
In this case,
(4.7.3) $C_sC_s^{\dagger} = I - K_{n-j}$.
If s is real, then $C_s$ has rank n - 1 if and only if n
is even and s = t = 1/2.
Proof. The (j + 1)st eigenvalue of $C_s$ is zero.
Hence (4.7.2) follows by (4.6.7), (4.6.9). If s is
real, so is 1 - s and hence $1 - w^j$. Therefore $w^j$ is
real. Since j = 0 is impossible ($s = \infty$), $w^j = -1$.
This can happen if and only if n is even. From
(4.7.2), s = t = 1/2.
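The rank statement can be explored by listing the indices k at which the eigenvalue $s + (1 - s)w^k$ vanishes; the rank of $C_s$ is n minus the number of such indices. This is a minimal numerical sketch; the function name and tolerance are our own.

```python
import cmath

def zero_eigenvalue_indices(s, n, tol=1e-12):
    """Indices k with s + (1 - s) w^k = 0 for C_s = circ(s, 1 - s, 0, ..., 0)."""
    w = cmath.exp(2j * cmath.pi / n)
    return [k for k in range(n) if abs(s + (1 - s) * w ** k) <= tol]

real_even = zero_eigenvalue_indices(0.5, 4)   # real s, n even
real_odd = zero_eigenvalue_indices(0.5, 5)    # real s, n odd

n, j = 3, 2
w = cmath.exp(2j * cmath.pi / n)
s = w ** j / (w ** j - 1)                     # complex s chosen as in (4.7.2)
complex_case = zero_eigenvalue_indices(s, n)

print(real_even, real_odd, complex_case)
```

For s = 1/2 a single zero eigenvalue appears when n = 4 (at k = n/2) and none when n = 5, in line with the real case of the theorem; the complex choice of s produces exactly one zero eigenvalue at k = j.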
If s is real, the transformation induced by $C_s$
is interesting visually because the vertices of
$\hat{P} = C_sP$ lie on the sides (possibly extended) of P.
Moreover, if s and t are limited by
s > 0, t > 0, s + t = 1,
that is, a convex combination, then $\hat{P}$ is obtained from
P in a simple manner: the vertices of $\hat{P}$ divide the
sides of P internally into the ratio s : 1 - s. (Cf.
Section 1.2.)
If s and t are complex, we shall point out a
geometric interpretation subsequently.
As seen, if n = even and s is real, then $C_s$ is
singular if and only if s = t = 1/2. In all other real
cases, the circulant $C_s$ is nonsingular and hence, given
an arbitrary n-gon P, it will have a unique pre-image
$\hat{P}$ under $C_s$: $C_s\hat{P} = P$.
Example. Let n = 4, s = t = 1/2. If Q is any
quadrilateral, then $C_{1/2}Q$ is obtained from Q by joining
successively the midpoints of the sides of Q. It is
therefore a parallelogram. Hence, if one starts with a
quadrilateral Q which is not a parallelogram, it can
have no pre-image under $C_{1/2}$.
Since in such a case the system of equations can
be "solved" by the application of a generalized
inverse, we seek a geometric interpretation of this
process.
4.8 ELEMENTARY GEOMETRY AND THE MOORE-PENROSE INVERSE
Select n = even, s = t = 1/2. Then $C_s = \mathrm{circ}(1/2, 1/2, 0, \ldots, 0)$.
For simplicity designate $C_{1/2}$ by D:
(4.8.1) $D = \mathrm{circ}(1/2, 1/2, 0, \ldots, 0)$.
This corresponds to j = n/2 in (4.7.2). Hence by
(4.7.3)
(4.8.2) $DD^{\dagger} = I - K_{n/2}$
where by (4.6.3)
(4.8.3) $K_{n/2} = \frac{1}{n}\,\mathrm{circ}(1, -1, 1, -1, \ldots, 1, -1)$.
For simplicity we write $K = K_{n/2}$.
It is of some interest to have the explicit
expression for $D^{\dagger}$.
Theorem 4.8.1. Let D = circ(1/2, 1/2, 0, 0, ..., 0) be of
order n, where n is even. Let
(4.8.4) $E = \frac{(-1)^{(n/2)-1}}{n}\,\mathrm{circ}\bigl((-1)^{(n/2)-1}(n-1), \ldots, 5, -3, 1, 1, -3, 5, \ldots, (-1)^{(n/2)-1}(n-1)\bigr)$.
Then $E = D^{\dagger}$.
As particular instances note:
n = 4: $D^{\dagger} = \frac{1}{4}\,\mathrm{circ}(3, -1, -1, 3)$,
n = 6: $D^{\dagger} = \frac{1}{6}\,\mathrm{circ}(5, -3, 1, 1, -3, 5)$.
Proof
(a) A simple computation shows that
$DE = \frac{1}{n}\,\mathrm{circ}(n - 1, 1, -1, 1, -1, \ldots, -1, 1) = I - K$.
Hence DED = (I - K)D = D - KD = D, since by (4.6.10)
(or by a direct computation) KD = 0.
(b) On the other hand, EDE = DEE = (I - K)E =
E - KE. An equally simple computation shows that KE =
0. Hence EDE = E. Thus by (2.8.2) (1)-(4), $E = D^{\dagger}$.
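The two instances n = 4 and n = 6 can be verified in exact rational arithmetic. Since the product of two circulants is the circulant whose first row is the cyclic convolution of their first rows, it suffices to work with first rows. The sketch below checks DE = I - K (a real symmetric circulant, so DE and ED are Hermitian), DED = D, and EDE = E; the helper names are illustrative.

```python
from fractions import Fraction

def circ_mul(a, b):
    """First row of circ(a) circ(b): the cyclic convolution of the first rows."""
    n = len(a)
    return [sum(a[k] * b[(m - k) % n] for k in range(n)) for m in range(n)]

def is_mp_inverse_of_d(n, e_numerators):
    """Check that E = (1/n) circ(e_numerators) satisfies DE = I - K,
    DED = D and EDE = E for D = circ(1/2, 1/2, 0, ..., 0), n even."""
    d = [Fraction(1, 2), Fraction(1, 2)] + [Fraction(0)] * (n - 2)
    e = [Fraction(x, n) for x in e_numerators]
    i_minus_k = [(1 if m == 0 else 0) - Fraction((-1) ** m, n) for m in range(n)]
    de = circ_mul(d, e)
    return de == i_minus_k and circ_mul(de, d) == d and circ_mul(de, e) == e

print(is_mp_inverse_of_d(4, [3, -1, -1, 3]))
print(is_mp_inverse_of_d(6, [5, -3, 1, 1, -3, 5]))
```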
From (4.6.6b) or (4.6.6a), in the case under
study, a K-gram is an n-gon whose vertices $z_1, \ldots, z_n$
satisfy
(4.8.5) $z_1 - z_2 + z_3 - z_4 + \cdots + z_{n-1} - z_n = 0$.
It is easily verified that for n = 4 the condition
holds if and only if $z_1, z_2, z_3, z_4$ (in that order)
form a conventional parallelogram. Thus, an n-gon
which satisfies (4.8.5) is a "generalized"
parallelogram. The sequence of theorems of Section 4.6
can now be given specific content in terms of
parallelograms or generalized parallelograms. We shall write
it up in terms of parallelograms.
Theorem 4.8.2. Let P be a quadrilateral. Then there
exists a quadrilateral $\hat{P}$ such that $D\hat{P} = P$ (the midpoint
property) if and only if P is a parallelogram.
Corollary. Let P be a parallelogram. Then the
general solution to $D\hat{P} = P$ is given by
$\hat{P} = D^{\dagger}P + \tau(1, -1, 1, -1)^T$
for an arbitrary constant $\tau$.
Corollary. P is a parallelogram if and only if there
is a quadrilateral Q such that P = DQ.
Corollary. Let P be a parallelogram. Then, given an
arbitrary number $\hat{z}_1$, we can find a unique
quadrilateral $\hat{P}$ with $\hat{z}_1$ as its first vertex such that $D\hat{P} = P$.
Theorem 4.8.3. Let P be a parallelogram. Then there
is a unique parallelogram Q such that DQ = P. It is
given by $Q = D^{\dagger}P$.
Notice what this is saying. DQ is the
parallelogram formed from the midpoints of the sides of Q.
Given a parallelogram P, we can find infinitely many
quadrilaterals Q such that DQ = P. The first vertex
may be chosen arbitrarily and this fixes all other
vertices uniquely. But there is a unique parallelogram
Q such that DQ = P. It can be found from $Q = D^{\dagger}P$
(see Figure 4.8.1).
Figure 4.8.1
Theorem 4.8.4. Let P be a parallelogram. Among the
infinitely many quadrilaterals R for which DR = P,
there is a unique one of minimum norm $\|R\|$. It is
given by $R = D^{\dagger}P$. Hence it coincides with the unique
parallelogram Q such that DQ = P.
Theorem 4.8.5. Let P be a general quadrilateral. The
unique parallelogram R = DQ for which $\|P - R\|$ =
minimum and $\|Q\|$ = minimum is given by R = (I - K)P.
In the theorem of Section 4.7, select n = 3 and
$w = \exp(2\pi i/3)$, so that $w^3 = 1$. Select j = 1, so that
$s = w/(w - 1)$, $1 - s = 1/(1 - w)$. In view of
$1 + w + w^2 = 0$, this simplifies to $s = \frac{1}{3}(1 - w)$,
$1 - s = \frac{1}{3}(1 - w^2)$. On the other hand, the selection j = 2
leads to $s = w^2/(w^2 - 1) = \frac{1}{3}(1 - w^2)$,
$1 - s = 1/(1 - w^2) = \frac{1}{3}(1 - w)$. The corresponding circulants
$C_s$ we shall designate by N (in honor of Napoleon):
(4.8.7) $N_I = \frac{1}{3}\,\mathrm{circ}(1 - w, 1 - w^2, 0)$, j = 1,
$N_O = \frac{1}{3}\,\mathrm{circ}(1 - w^2, 1 - w, 0)$, j = 2,
the subscripts I, O standing for "inner" and "outer."
For brevity we exhibit only the outer case, writing
(4.8.7') $N = \frac{1}{3}\,\mathrm{circ}(1 - w^2, 1 - w, 0)$.
We have
(4.8.8) $K_0 = \frac{1}{3}\,\mathrm{circ}(1, 1, 1)$,
$K_1 = \frac{1}{3}\,\mathrm{circ}(1, w, w^2)$,
$K_2 = \frac{1}{3}\,\mathrm{circ}(1, w^2, w)$,
$K_0 + K_1 + K_2 = I$.
From (4.7.3) with n = 3, j = 2,
$NN^{\dagger} = I - K_1$.
Theorem 4.8.6. $N^{\dagger} = K_0 - wK_2$.
Proof. Let $E = K_0 - wK_2$. Then from (4.8.7'),
$N = K_0 - w^2K_2$. Hence,
$NE = (K_0 - w^2K_2)(K_0 - wK_2) = K_0 + w^3K_2 = K_0 + K_2 = I - K_1$
[cf. after (4.6.8)].
Therefore $NEN = (I - K_1)(K_0 - w^2K_2) = K_0 - w^2K_2 = N$.
Similarly, $ENE = (I - K_1)(K_0 - wK_2) = K_0 - wK_2 = E$.
Thus, by Section 2.8.2, properties (1) to (4), $E = N^{\dagger}$.
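The identities in this proof can be confirmed numerically by working with first rows of the 3 x 3 circulants, as a hedged sketch: products of circulants are computed by cyclic convolution, and the outer Napoleon matrix is applied to a sample counterclockwise triangle to confirm that the image satisfies the counterclockwise equilateral condition $z_1 + wz_2 + w^2z_3 = 0$. Helper names and the sample triangle are illustrative assumptions.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)

def circ_mul(a, b):
    """First row of circ(a) circ(b): the cyclic convolution of the first rows."""
    n = len(a)
    return [sum(a[k] * b[(m - k) % n] for k in range(n)) for m in range(n)]

def circ_apply(row, z):
    """Multiply circ(row) by the column vector z."""
    n = len(row)
    return [sum(row[(j - i) % n] * z[j] for j in range(n)) for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for x, y in zip(a, b))

N = [(1 - w ** 2) / 3, (1 - w) / 3, 0]      # first row of N = K_0 - w^2 K_2
E = [(1 - w) / 3, 0, (1 - w ** 2) / 3]      # first row of E = K_0 - w K_2
I_minus_K1 = [2 / 3, -w / 3, -w ** 2 / 3]   # first row of I - K_1

NE = circ_mul(N, E)
penrose_ok = (close(NE, I_minus_K1)
              and close(circ_mul(NE, N), N)
              and close(circ_mul(NE, E), E))

T = [0j, 4 + 0j, 1 + 3j]                    # a counterclockwise triangle
c = circ_apply(N, T)
equilateral_defect = abs(c[0] + w * c[1] + w ** 2 * c[2])
print(penrose_ok, equilateral_defect)
```

The first row of I - K_1 is conjugate-symmetric, so NE = EN is Hermitian and the remaining two Penrose conditions hold as well.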
It follows from (4.6.1a) and (4.6.1b) that a
counterclockwise equilateral triangle is a $K_1$-gram,
while a clockwise equilateral triangle is a $K_2$-gram.
Let now $(z_1, z_2, z_3)$ be the vertices of an
arbitrary triangle. On the sides of this triangle
erect equilateral triangles outwardly. Let their
vertices be $z_1', z_2', z_3'$. From (4.6.1a),
The centers of the equilateral triangles are therefore
This may be written as
providing us with a geometric interpretation of the
transformation induced by Napoleon's matrix.
The sequence of theorems of Section 4.6 can now
be given specific content in terms of the Napoleon
operator. In what follows all figures are taken
counterclockwise.
Theorem 4.8.7. Let T be a triangle. Then there
exists a triangle $\hat{T}$ such that $N\hat{T} = T$ if and only if T
is equilateral. (The "only if" part is Napoleon's
theorem.)
Corollary. Let T be equilateral. Then the general
solution to $N\hat{T} = T$ is given by
$\hat{T} = N^{\dagger}T + \tau(1, w^2, w)^T$
for an arbitrary constant $\tau$.
Corollary. T is equilateral if and only if T = NQ for
some triangle Q.
Corollary. Given an equilateral triangle T. Given
also an arbitrary complex number $\hat{z}_1$. There is a
unique triangle $\hat{T}$ with $\hat{z}_1$ as its first vertex such
that $N\hat{T} = T$.
Theorem 4.8.8. Let T be an equilateral triangle.
Then there is a unique equilateral triangle Q such
that NQ = T. It is given by $Q = N^{\dagger}T$.
Theorem 4.8.9. Let T be equilateral. Let R be any
triangle with NR = T. The unique such R of minimum
norm $\|R\|$ is the equilateral triangle $R = N^{\dagger}T$. It is
identical to the unique equilateral triangle Q for
which NQ = T. (See Figure 4.8.2.)
Figure 4.8.2
Finally, suppose we are given an arbitrary
triangle T and we wish to approximate it optimally by
an equilateral triangle. Here is the story.
Theorem 4.8.10. Let T be arbitrary; then the
equilateral triangle NR for which $\|T - NR\|$ = minimum and
such that $\|R\|$ = minimum is given by $R = N^{\dagger}T$ and
$NR = NN^{\dagger}T = (I - K_1)T$.
PROBLEMS
1. Discuss the matrix circ(1/3, 1/3, 1/3, 0, 0, 0)
from the present points of view and derive
geometrical theorems. To start: this matrix maps every
6-gon into a parahexagon, that is, a 6-gon whose
156 Generalizations of Circulants
places is the same as a shift of g mod n places. By
convention, if g is negative, shifting to the right g
places will be equivalent to shifting to the left (-g)
places. Thus, for any integers g, g' with
$g' \equiv g \pmod{n}$, a g'-circulant and a g-circulant are
synonymous.
Example 1. A 4-circulant of order 6 is
$\begin{pmatrix}
a_1 & a_2 & a_3 & a_4 & a_5 & a_6 \\
a_3 & a_4 & a_5 & a_6 & a_1 & a_2 \\
a_5 & a_6 & a_1 & a_2 & a_3 & a_4 \\
a_1 & a_2 & a_3 & a_4 & a_5 & a_6 \\
a_3 & a_4 & a_5 & a_6 & a_1 & a_2 \\
a_5 & a_6 & a_1 & a_2 & a_3 & a_4
\end{pmatrix}$
Example 2. A 1-circulant is an (ordinary) circulant.
Example 3. A 0-circulant is one in which all rows
are identical.
Example 4. J = circ(1, 1, ..., 1) is a g-circulant
for all g.
Example 5. A (-1)-circulant (or an (n - 1)-circulant)
has each successive row moved one place to the left.
It is sometimes called a left circulant or an
anticirculant or a retrocirculant. Thus
(-1)-circ(0, 0, ..., 0, 1)
is the anti-identity or the counter-identity.
Let $A = (a_{ij})$. Then, evidently, A is a
g-circulant if and only if
(5.1.2) $a_{ij} = a_{i+1,\,j+g}$, i, j = 1, 2, ..., n
(subscripts taken mod n).
Equivalently, if $A = (a_{ij}) = g\text{-circ}(a_1, a_2, \ldots, a_n)$,
then
$a_{ij} = a_{j-(i-1)g}$ (subscript taken mod n).
Take g > 0 and let (n, g) designate the greatest
common divisor of n and g. The g-circulants split
into two types depending on whether (n, g) = 1 or
(n, g) > 1. The multiples kg, k = 1, 2, ..., n run
through a complete residue system mod n if and only if
(n, g) = 1. Hence the rows of the general g-circulant
are distinct if and only if (n, g) = 1. In this case,
the rows of a g-circulant may be permuted so as to
yield an ordinary circulant. Similarly for columns.
Hence if A is a g-circulant, (n, g) = 1, then for
appropriate permutation matrices $P_1$, $P_2$
(5.1.4a) $A = P_1C$,
(5.1.4b) $A = CP_2$,
where in (5.1.4a) C is an ordinary circulant whose
first row is identical to that of A. In a certain
sense, then, if (n, g) = 1, a g-circulant is an
ordinary circulant followed by a renumbering.
However, the details of the diagonalization, and
so on, are considerable. If (n, g) > 1, this is a
degenerate case, and naturally there are further
complications.
Example. Making use of the geometric construction of
Section 1.4, we shall illustrate this distinction by
the two matrices of order 8:
In the first case, transformation of the vertices of
a regular octagon by A yields a regular octagon in
permuted order (Figure 5.1.1). In the second case,
a square covered twice (Figure 5.1.2).
Theorem 5.1.1. A is a g-circulant if and only if
(5.1.5) $\pi A = A\pi^g$.
Proof. In (2.4.6) take $\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ 2 & 3 & \cdots & 1 \end{pmatrix}$. Then
$P_\sigma = \pi$ so that if $A = (a_{ij})$, $\pi A = (a_{i+1,\,j})$. In
(2.4.8), take
$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1+g & 2+g & \cdots & n+g \end{pmatrix}$;
then $P_\sigma^{-1} = (\pi^g)^{-1} = \pi^{-g}$. Hence $\pi A\pi^{-g} = (a_{i+1,\,j+g})$.
The result now follows from (5.1.2).
Figure 5.1.1 (g = 3, n = 8)
Figure 5.1.2 (g = 2, n = 8)
Corollary. Let A and B be g-circulants. Then AB* is
a 1-circulant. In particular, if A is a g-circulant,
AA* is a 1-circulant.
Proof. $A = \pi^*A\pi^g$, $B = \pi^*B\pi^g$. Hence
$AB^* = \pi^*A\pi^g(\pi^g)^*B^*\pi = \pi^*AB^*\pi$.
Theorem 5.1.2. If A is a g-circulant and B is an
h-circulant then AB is a gh-circulant.
Proof. $\pi A = A\pi^g$ and $\pi B = B\pi^h$. Now
$\pi(AB) = A\pi^gB = (A\pi^{g-1})(\pi B) = (A\pi^{g-1})(B\pi^h)$
$= (A\pi^{g-2})(\pi B\pi^h) = (A\pi^{g-2})(B\pi^h)\pi^h = (A\pi^{g-2})(B\pi^{2h})$.
Keep this up for g times, leading to
$\pi(AB) = (A\pi^{g-g})(B\pi^{gh}) = (AB)\pi^{gh}$.
Now apply Theorem 5.1.1.
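Theorem 5.1.2 can be tried out on small integer matrices. The sketch below builds a g-circulant from its first row, tests the defining relation (5.1.2) with subscripts taken mod n, and confirms that the product of a 4-circulant and a 5-circulant of order 6 is a 20-circulant, that is, a 2-circulant. The function names and the sample first rows are hypothetical.

```python
def g_circ(first_row, g):
    """g-circulant: each row is the previous row shifted g places to the right."""
    n = len(first_row)
    return [[first_row[(j - g * i) % n] for j in range(n)] for i in range(n)]

def is_g_circulant(a, g):
    """Check (5.1.2): a[i][j] == a[i+1][j+g], with indices taken mod n."""
    n = len(a)
    return all(a[i][j] == a[(i + 1) % n][(j + g) % n]
               for i in range(n) for j in range(n))

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = g_circ([1, 2, 3, 4, 5, 6], 4)     # the pattern of Example 1 with a_k = k
B = g_circ([2, 0, 1, 0, 0, 3], 5)     # an arbitrary 5-circulant
AB = matmul(A, B)

print(A[1])                           # second row of the 4-circulant
print(is_g_circulant(AB, (4 * 5) % 6))
```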
We require several facts from the elementary
theory of numbers.
Lemma 5.1.3. Let g, n be integers not both 0. Then
the equation
(5.1.6) $gx \equiv 1 \pmod{n}$
has a solution if and only if (n, g) = 1.
Proof. It is well known that given integers g, n,
not both 0, then there exist integers x, y such that
gx - ny = (n, g). Hence if (n, g) = 1, (5.1.6) has a
solution. Conversely, if (5.1.6) holds, then for some
integer k, gx - 1 = kn. If g and n have a common
factor > 1, it would divide 1, which is impossible.
Corollary. For (n, g) = 1, the solution to $gx \equiv 1 \pmod{n}$
is unique mod n.
Proof. Let $gx_1 \equiv 1 \pmod{n}$ and $gx_2 \equiv 1 \pmod{n}$;
then $g(x_1 - x_2) \equiv 0 \pmod{n}$. Since (n, g) = 1,
$x_1 - x_2 \equiv 0 \pmod{n}$.
For (n, g) = 1 we shall designate the unique
solution of (5.1.6) by $g^{-1}$.
Theorem 5.1.4. Let A be a nonsingular g-circulant.
Then $A^{-1}$ is a $g^{-1}$-circulant.
Proof. Since A is nonsingular, it follows that
(n, g) = 1, hence that $g^{-1}$ exists with $gg^{-1} \equiv 1 \pmod{n}$.
Now, from (5.1.5) $\pi A = A\pi^g$ so that
$A^{-1}\pi^{-1} = \pi^{-g}A^{-1}$, that is, $A^{-1} = \pi^{-g}A^{-1}\pi$. Hence
$\pi A^{-1} = \pi^{-g+1}A^{-1}\pi = \pi^{-g+1}(A^{-1})\pi$
$= \pi^{-g+1}(\pi^{-g}A^{-1}\pi)\pi = \pi^{-2g+1}A^{-1}\pi^2$.
Do this s times, and we obtain
$\pi A^{-1} = \pi^{-sg+1}A^{-1}\pi^s$.
Now select $s = g^{-1}$ and there is obtained
$\pi A^{-1} = A^{-1}\pi^{g^{-1}}$, which tells us that $A^{-1}$ is a
$g^{-1}$-circulant.
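A simple instance of Theorem 5.1.4 uses a permutation matrix, whose inverse is its transpose, so no general matrix inversion is needed. The sketch below assumes Python 3.8 or later for the three-argument `pow` with exponent -1, which returns the modular inverse and raises `ValueError` when (n, g) > 1. The matrix Q and the values n = 7, g = 3 are hypothetical examples.

```python
from math import gcd

def g_circ(first_row, g):
    """g-circulant: each row is the previous row shifted g places to the right."""
    n = len(first_row)
    return [[first_row[(j - g * i) % n] for j in range(n)] for i in range(n)]

def is_g_circulant(a, g):
    """Check (5.1.2): a[i][j] == a[i+1][j+g], with indices taken mod n."""
    n = len(a)
    return all(a[i][j] == a[(i + 1) % n][(j + g) % n]
               for i in range(n) for j in range(n))

n, g = 7, 3
g_inv = pow(g, -1, n)                       # solution of g x = 1 (mod n)

Q = g_circ([1] + [0] * (n - 1), g)          # a permutation matrix, since (n, g) = 1
Q_inv = [list(col) for col in zip(*Q)]      # transpose = inverse for a permutation

# confirm that Q_inv really is the inverse of Q
product_is_identity = all(
    sum(Q[i][k] * Q_inv[k][j] for k in range(n)) == (1 if i == j else 0)
    for i in range(n) for j in range(n))

print(gcd(n, g), g_inv, product_is_identity, is_g_circulant(Q_inv, g_inv))
```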
Theorem 5.1.5. $A$ is a $g$-circulant if and only if $(A^{\div})^*$ is a $g$-circulant.

Proof. Let $A$ be a $g$-circulant. Then $A = \pi^{-1}A\pi^{g}$. Hence (since $\pi^{-1}$, $\pi$, $\pi^{g}$ are unitary) $A^{\div} = \pi^{-g}A^{\div}\pi$. Thus $(A^{\div})^* = \pi^*(A^{\div})^*(\pi^{-g})^* = \pi^{-1}(A^{\div})^*\pi^{g}$. Therefore $(A^{\div})^*$ is a $g$-circulant.

Conversely, let $(A^{\div})^*$ be a $g$-circulant. Then by what we have just shown, $\bigl(((A^{\div})^*)^{\div}\bigr)^*$ is also a $g$-circulant. But this is precisely $A$.
Corollary. If $A$ is a $g$-circulant, then $AA^{\div}$ is a 1-circulant.

Proof. In the corollary to Theorem 5.1.1, take $B = (A^{\div})^*$. This is a $g$-circulant by what we have just shown. Hence $AB^* = AA^{\div}$ is a 1-circulant.
If $A$ is a $g$-circulant, then $AA^*$ is a 1-circulant. Hence it may be written as $AA^* = F^*\Lambda_{AA^*}F$, where $\Lambda_{AA^*}$ is the diagonal matrix of eigenvalues of $AA^*$. Now by Problem 16 of Section 2.8.2, for any matrix $M$, $M^{\div} = M^*(MM^*)^{\div}$. Hence

Theorem 5.1.6. If $A$ is a $g$-circulant, then

$$A^{\div} = A^*(AA^*)^{\div} = A^*F^*\Lambda^{\div}_{AA^*}F. \tag{5.1.7}$$
We now produce a generalization of the representation (3.1.4). Let

$$Q_g = g\text{-circ}(1, 0, 0, \ldots, 0).$$

Notice that $Q_g$ is a permutation matrix, and hence unitary, if and only if $(n, g) = 1$. (For in this case and only in this case will $Q_g$ have precisely one 1 in each row and column.)
Theorem 5.1.7. If $A = g\text{-circ}(a_1, a_2, \ldots, a_n)$, then

$$A = \sum_{k=0}^{n-1} a_{k+1}\,Q_g\pi^{k}.$$

Proof. The positions in $A$ occupied by the symbol $a_1$ are precisely those occupied by a 1 in $Q_g$. The positions occupied by the symbol $a_2$ in $A$ are one place to the right (with wraparound) of those occupied by $a_1$. Since right multiplication by $\pi$ pushes all the elements of $A$ one space to the right, it follows that the positions occupied by $a_2$ in $A$ are precisely those occupied by 1 in $Q_g\pi$. Similarly for $a_3, \ldots, a_n$.
Corollary. $A$ is a $g$-circulant if and only if it is of the form $Q_gC$, where $C$ is a circulant.

Proof. Use (3.1.4).
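The factorization of a $g$-circulant as $Q_gC$, with $Q_g$ marking the positions of $a_1$ and successive right shifts by $\pi$ marking those of $a_2, \ldots, a_n$, can be checked on a small example. The sketch below assumes the right-shift convention with zero-based indices; `g_circ` and `matmul` are hypothetical helper names, and the case $n = 6$, $g = 4$ is chosen to show that $(n, g) = 1$ is not needed here.

```python
def g_circ(first_row, g):
    """Build g-circ(first_row): entry (i, j) is first_row[(j - g*i) mod n]."""
    n = len(first_row)
    return [[first_row[(j - g * i) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n, g = 6, 4
a = [3, 1, 4, 1, 5, 9]

A = g_circ(a, g)                          # the g-circulant itself
Q = g_circ([1] + [0] * (n - 1), g)        # 1 wherever a_1 sits in A
C = g_circ(a, 1)                          # ordinary circulant with the same first row
pi = g_circ([0, 1] + [0] * (n - 2), 1)    # basic circulant circ(0, 1, 0, ..., 0)

# Accumulate a_1*Q + a_2*Q*pi + ... + a_n*Q*pi^(n-1).
total = [[0] * n for _ in range(n)]
Q_pi_k = Q
for k in range(n):
    total = [[total[i][j] + a[k] * Q_pi_k[i][j] for j in range(n)]
             for i in range(n)]
    Q_pi_k = matmul(Q_pi_k, pi)

print(total == A, matmul(Q, C) == A)
```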
Since $Q_{-1} = \Gamma$, one has
Corollary. $A$ is a $(-1)$-circulant if and only if it has the form $A = \Gamma C$, where $C$ is a circulant and where the first rows of $A$ and $C$ are identical.
Corollary. $A$ is a $(-1)$-circulant if and only if it has the form

$$A = F^*(\Gamma\Lambda)F \tag{5.1.10}$$

where $\Lambda$ is diagonal. In this case,

$$A^{n} = F^*(\Gamma\Lambda)^{n}F$$

for integer values of $n$.
Proof. $A = \Gamma C$ with circulant $C$. But such $C = F^*\Lambda F$, so that $A = (\Gamma F^*)\Lambda F$. From the corollary to Theorem 2.5.2, $F^{*2} = \Gamma^* = \Gamma$, so that $\Gamma F^* = F^{*3} = F^*\Gamma$ and (5.1.10) follows.
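If $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then

$$\Gamma\Lambda = \begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & \lambda_n \\
0 & 0 & \cdots & \lambda_{n-1} & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & \lambda_2 & \cdots & 0 & 0
\end{pmatrix}.$$

The eigenvalues of the $(-1)$-circulant $A$ are identical to those of $\Gamma\Lambda$, and the latter are easily computed. (See Section 5.3.)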
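Note also that

$$(\Gamma\Lambda)^{2} = \operatorname{diag}(\lambda_1\lambda_1,\ \lambda_n\lambda_2,\ \lambda_{n-1}\lambda_3,\ \ldots,\ \lambda_2\lambda_n) \tag{5.1.12}$$

so that the even powers of $\Gamma\Lambda$ are readily available.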
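The diagonal form of $(\Gamma\Lambda)^2$ is easy to confirm on a small case. In the sketch below, $\Gamma$ is taken to be $(-1)\text{-circ}(1, 0, \ldots, 0)$, that is, the matrix with a 1 in position $(i, j)$ exactly when $i + j \equiv 0 \pmod{n}$ with zero-based indices. The sample values in `lam` and the helper `matmul` are illustrative assumptions only.

```python
def matmul(A, B):
    """Plain product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 5
lam = [2, 3, 5, 7, 11]      # lambda_1, ..., lambda_n (arbitrary sample values)

# Gamma: 1 at (0, 0), then ones running up the antidiagonal of the remaining block.
Gamma = [[1 if (i + j) % n == 0 else 0 for j in range(n)] for i in range(n)]

# Gamma * Lambda scales column j of Gamma by lambda_{j+1}.
GL = [[Gamma[i][j] * lam[j] for j in range(n)] for i in range(n)]
GL2 = matmul(GL, GL)

# Diagonal entry i of the square pairs lambda_{i+1} with its mirror index.
expected = [[lam[i] * lam[(-i) % n] if i == j else 0 for j in range(n)]
            for i in range(n)]
print(GL2 == expected)
print([GL2[i][i] for i in range(n)])
```

For these sample values the diagonal pairs up as $\lambda_1\lambda_1, \lambda_n\lambda_2, \lambda_{n-1}\lambda_3, \ldots, \lambda_2\lambda_n$, matching the displayed formula.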
PROBLEMS
1. Prove that $g$-circulants form a linear space under matrix addition and scalar multiplication.
2. Let $S$ denote the set of all matrices of order $n$ that are of the form $\alpha A + \beta B$, where $A$ is a circulant and $B$ is a $(-1)$-circulant. Show that they form a ring under matrix addition and multiplication.
3. What conditions on $n$ and $g$ are sufficient to guarantee that the $g$-circulants form a ring?
4. Let $A$ be a $g$-circulant. Then for integer $k$, $\pi^{k}A = A\pi^{kg}$. Hence if $g \mid n$, $\pi^{n/g}A = A$.
5. Let $(n, g) = 1$ and suppose that $A$ is a $g$-circulant. Prove that there exists a minimum integer $r \ge 1$ such that $A^{r}$ is a circulant. Hint: use the Euler-Fermat theorem. See Section 5.4.2.
6. Let $(n, g) = 1$. Prove that if $A$ is a $g$-circulant, each column can be obtained from the previous column by a downshift of $g^{-1}$ places.
If $g = 0$, each row of $A$ is the previous row "shifted" zero places. Hence all the rows are identical. Since the rows are identical, $r(A) \le 1$. If $r(A) = 0$, $A = 0$, and the work is trivial. Suppose, then, that $r(A) = 1$. Then, by a familiar theorem (see Lancaster [1], p. 56), $A$ must have a zero eigenvalue of multiplicity $\ge n - 1$. Its characteristic polynomial is therefore of the form $\lambda^{n} - \sigma\lambda^{n-1}$.
If we write $A = 0\text{-circ}(a_1, a_2, \ldots, a_n) =$