Abstract. A criterion for the existence of H-polar decompositions based on comparing canonical
forms is presented, and a numerical procedure is explained for computing H-polar decompositions
of a matrix whose product with its H-adjoint is diagonalisable. Furthermore, the
H-orthogonal or H-unitary procrustes problem is stated, and solved by application of H-polar
decompositions.
Key words. Indefinite scalar products, polar decompositions, factor analysis, procrustes prob-
lems.
This is equivalent to the fact that between the H-adjoint A^{[*]} and the ordinary adjoint
A^* = Ā^T the relationship

A^{[*]} = H^{-1} A^* H

holds. An H-polar decomposition of a matrix A is a factorisation

A = UM with U^* H U = H and M^* H = HM,

that is, into an H-isometry U and an H-selfadjoint factor M.
† Institut für Mathematik, MA 4-5, TU Berlin, Straße des 17. Juni 136, 10623 Berlin, Germany;
email: UKintzel@aol.com
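A minimal numerical sketch of these two definitions, in Python with numpy (the function names are illustrative, not from the paper):

```python
import numpy as np

def h_adjoint(A, H):
    # A^{[*]} = H^{-1} A^* H, the adjoint with respect to [x, y] = (Hx, y)
    return np.linalg.solve(H, A.conj().T @ H)

# quick check of the defining property [Ax, y] = [x, A^{[*]} y]
rng = np.random.default_rng(1)
H = np.diag([1.0, 1.0, -1.0])                # a regular hermitian H
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
ip = lambda u, v: v.conj() @ (H @ u)         # [u, v] = (Hu, v)
assert np.isclose(ip(A @ x, y), ip(x, h_adjoint(A, H) @ y))
```

An H-polar decomposition A = UM can be tested in the same spirit by checking the defects ‖U^* H U − H‖ and ‖M^* H − HM‖.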
Decompositions of this kind have been investigated in detail in the publications
[BMRRR1-3] and [MRR] as well as in the further references specified there. More spe-
cialised results concerning polar decompositions of H-normal matrices, i.e. matrices
which commute with their H-adjoint, are discussed in [LMMR].
H-polar decompositions are also the central subject of this paper, in which theo-
retical as well as practical questions are discussed. For example, the more theoretical
Chapter 3 is primarily concerned with finding a further criterion for the existence of
H-polar decompositions, whereas the more practical Chapter 4 presents a numerical
procedure for computing H-polar decompositions of a matrix A for the case in which
the matrix A^{[*]}A is diagonalisable. Both chapters require some statements
concerning subspaces of F^n, which are first derived in Chapter 2, where
some numerical questions are already examined with a view to Chapter 4. In the final
Chapter 5, two applications from a branch of mathematics known in psychology as
factor analysis or multidimensional scaling are ported into the environment of indefinite
scalar products. This involves on the one hand the task of constructing sets of
points which take up given distances, and on the other hand the task of matching
two sets of such points in the sense of the method of least squares or the procrustes
problem^1, which is achieved with the help of an H-polar decomposition.
In a typical application of multidimensional scaling (MDS, for example see [BG])
test persons are first of all requested to estimate the dissimilarity (or similarity)
of specified objects which are selected terms describing the subject of the analysis.
For example, if professions are to be analysed, terms such as politician, journalist,
physician, etc. can be used as the objects. In this way the comparison of N objects
in pairs produces the similarity measures, called proximities, pkl , 1 ≤ k, l ≤ N ,
from which the distances dkl = f (pkl ) are then determined using a function f , for
example f (x) = ax + b, which is called the MDS model. Using these distances, the
coordinates of points x_k in an n-dimensional Euclidean space are constructed such
that ‖x_k − x_l‖ = d_kl, where ‖.‖ stands for the Euclidean norm. Thus each object
is now represented by a point in a coordinate system and the data can be analysed
with regard to their geometric properties. For example, it can thereby be attempted
to interpret the basis vectors of the space in the sense of psychological factors, such
as social status in the given example of professions.
The results of interrogating the test persons are often categorised in M groups,
e.g. according to gender and/or age, producing several descriptive constellations of
points x_k^{(r)}, 1 ≤ r ≤ M, in a Euclidean space of dimension n = max{n^{(r)}} which
must be mutually compared in the analysis. To make such a comparison of two constellations
x_k and y_k possible, it is first of all necessary to compensate for irrelevant
differences resulting from the different locations in space. This is done with an orthogonal
transformation U devised such that Σ_k ‖Ux_k − y_k‖² is minimised. Thereafter
the constellations x̃_k = Ux_k and y_k can be analysed.
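In this Euclidean case the minimiser has a classical closed form via the singular value decomposition (cf. the orthogonal procrustes solution of Schönemann [S]); the following Python/numpy sketch is illustrative only:

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    # orthogonal U minimising sum_k ||U x_k - y_k||^2, where the points
    # x_k and y_k are the columns of X and Y (Euclidean case H = I only)
    W, _, Vt = np.linalg.svd(Y @ X.T)   # SVD of the cross-product matrix Y X^T
    return W @ Vt                       # optimal orthogonal transformation
```

Chapter 5 generalises exactly this minimisation to indefinite scalar products, where the SVD argument is no longer available.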
Thus the MDS model f is chosen in particular by adding a constant b (and
by making further assumptions such as d_kk = 0), so that the triangle inequality is
fulfilled and therefore the points can be embedded in a Euclidean space [BG, Chapter
18]. This restriction is not required mathematically if a pseudo-Euclidean geometry is
^1 Procrustes, a robber in Greek mythology, who lived near Eleusis in Attica. Originally he was
called Damastes or Polypemon. He was given the name Procrustes ("the stretcher") because he
tortured his victims to fit them into a bed. If they were too tall, he chopped off their limbs or beat
them into shape with a hammer. If they were too short, he stretched them. He was overcome by Theseus, who
dealt him the same fate by chopping off his head to fit him into the bed.
admitted. This is the subject of the investigations in Chapter 5, where the stated tasks
of constructing points from given distances and of rotating data in the sense of an
optimal matching are considered in the environment of indefinite scalar products.
The following notation is used in the course of this work: The kernel, the image
and the rank of a matrix A are designated ker A, im A and rank A respectively. If
the matrix A is square, then tr A, det A and σ(A) are its trace, determinant and
spectrum respectively. Furthermore, the abbreviation A^{-*} = (A^*)^{-1} = (A^{-1})^* is
used. The symbol 0 is used for zero vectors as well as for zero matrices. In some
places it is additionally provided with size attributes 0_{p,q} ∈ F^{p×q} or 0_p ∈ F^{p×p},
whereby lower indices may also be intended as enumeration indices. This is evident
from the respective context. I_p, N_p and Z_p respectively designate the p × p identity
matrix, the p × p matrix with ones on the superdiagonal and otherwise zeros, and
the p × p matrix with ones on the antidiagonal and otherwise zeros. In particular
J_p(λ) = λI_p + N_p specifies an upper Jordan block for the eigenvalue λ. Moreover,
A_1 ⊕ ... ⊕ A_k stands for the block diagonal matrix consisting of the specified blocks,
and diag(α_1, ..., α_k) stands for a diagonal matrix with the specified diagonal elements.
Even when no further specifications are made, a regular (real) symmetric or (complex)
hermitian matrix is always meant by H, and instead of A^{[*]} sometimes A^H is written
to specify the matrix on which the scalar product is based. Lastly, the direct sum of
two subspaces X, Y ⊂ F^n is denoted by X ⊕ Y.
2. Subspaces. The properties of subspaces of an indefinite scalar product space
over the field of the real or complex numbers are discussed in detail in [GLR, Chapter
I.1]. In this chapter some additional properties are described which are required in
the course of the further considerations.
Let F = R or F = C and let [., .] be an indefinite scalar product of F^n with the
underlying regular symmetric or hermitian matrix H ∈ F^{n×n}. A subspace M ⊂ F^n is
said to be positive (non-negative, neutral, non-positive, negative) if [x, x] > 0
([x, x] ≥ 0, [x, x] = 0, [x, x] ≤ 0, [x, x] < 0) holds for all x ∈ M with x ≠ 0.
exists with two H-orthogonal summands, and it remains to show that M_− is negative.
Suppose that a vector x ∈ M_− exists with [x, x] > 0. Then it would follow that
[x + y, x + y] = [x, x] + [y, y] > 0 for all y ∈ M_+. But this would mean that the
subspace M_+ ⊕ span{x} is also positive, in contradiction to the maximality of M_+.
Thus M_− is non-positive and the Schwarz inequality [GLR, Chapter I.1.3]^2
can be applied. Now assume that x_0 ∈ M_− with [x_0, x_0] = 0. Then the Schwarz
inequality shows that [x_0, x] = 0 must be fulfilled for all x ∈ M_−. Since it is also true
that [x_0, y] = 0 for all y ∈ M_+, it follows that [x_0, z] = 0 for all z ∈ M. Thus x_0 = 0,
because M is non-degenerate.
2. Let M_0 = M ∩ M^{[⊥]}. Then M_0 is neutral, because if a vector x ∈ M_0 ⊂ M
were to exist with [x, x] ≠ 0, it would follow that x ∉ M^{[⊥]} ⊃ M_0. Now let M_1 be a
complementary subspace, so that

(M ∩ M^{[⊥]}) ⊕ M_1 = M, M_0 = M ∩ M^{[⊥]},

M = M_+ ⊕ M_− ⊕ M_0
(Hy, z)_n = (HXỹ, Xz̃)_n = (X^*HXỹ, z̃)_m = (H̃ỹ, z̃)_m, where

(x, y)_k = Σ_{α=1}^{k} x^α ȳ^α.
and, moreover, M_0'' ⊂ (M_1 ⊕ M_2)^{[⊥]}. If it is now chosen that M_0 = (M_1 ⊕ M_2)^{[⊥]} =
M_0' ⊕ M_0'', then the assertions 1. and 3. are fulfilled too. From

with

[x_k^+, x_l^+] > 0, [x_k^−, x_l^−] < 0, [x_k^+, x_l^−] = 0 and [x_k^0, x_l^0] = 0 for 1 ≤ k, l ≤ r

can now be chosen. Since M_0^+ is positive, M_0^− is negative and M_0^0 is neutral, it must
also be true that M_0^+ ∩ M_0^0 = M_0^− ∩ M_0^0 = {0}, so that each basis vector of M_0^0 can
be expressed in the form

x_k^0 = Σ_{i=1}^{r} α_{ki} x_i^+ + Σ_{i=1}^{r} β_{ki} x_i^− with (α_{k1}, ..., α_{kr})^T ≠ 0, (β_{k1}, ..., β_{kr})^T ≠ 0.
can be used as new basis vectors for M_0^+, M_0^−, because if it is assumed that
constants (λ_1, ..., λ_r) ≠ 0 with λ_1 x̃_1^+ + ... + λ_r x̃_r^+ = 0 exist, then 0 ≠ λ_1 x_1^0 + ... +
λ_r x_r^0 = λ_1 (x̃_1^+ + x̃_1^−) + ... + λ_r (x̃_r^+ + x̃_r^−) = λ_1 x̃_1^− + ... + λ_r x̃_r^− ∈ M_0^− and thus
M_0^− ∩ M_0^0 ≠ {0}. The linear independence of the vectors x̃_1^−, ..., x̃_r^− can be shown
analogously. Finally, defining

x_k'' = x̃_k^+ − x̃_k^− for 1 ≤ k ≤ r and M_0'' = span{x_1'', ..., x_r''},
Then span{x̃_k, x̃_l} = span{x_k, x_l} and [x̃_k, x̃_k] ≠ 0. Let the particular basis obtained
by replacing x_k, x_l with x̃_k, x̃_l and then exchanging x_1 and x̃_k still be designated as
{x_1, ..., x_m}. If now we make

u_1 = x_1/√|[x_1, x_1]| and ε_1 = sig[x_1, x_1] ∈ {+1, −1},
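Together with the H-orthogonal projections used in the surrounding proof, this normalisation amounts to a Gram-Schmidt process for the indefinite scalar product. A Python/numpy sketch, under the simplifying assumption that [x, x] ≠ 0 for every vector encountered (the pivoting with the vectors x̃_k described above is omitted):

```python
import numpy as np

def h_orthonormalise(X, H):
    # returns U, eps with U^* H U = diag(eps), eps_k in {+1, -1};
    # assumes [x, x] != 0 at every step (no neutral pivoting)
    U, eps = [], []
    for x in np.asarray(X, dtype=complex).T:
        for u, e in zip(U, eps):
            x = x - e * (u.conj() @ H @ x) * u   # subtract the H-projections
        s = (x.conj() @ H @ x).real              # s = [x, x]
        U.append(x / np.sqrt(abs(s)))
        eps.append(1.0 if s > 0 else -1.0)
    return np.column_stack(U), np.array(eps)
```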
cond X_2 = (α^4|x|^2 + |y|^2) / (α^2|x||y|) ≥ 2, because

‖X_2‖^2 = 2(α^4|x|^2 + |y|^2) / (α^2|ω|^2), ‖X_1^{-1}‖^2 = |ω|^2(α^4|x|^2 + |y|^2) / (2α^2|x|^2|y|^2).

cond X_1 ≥ cond X_2
with p + q + r = m and p + q + 2r + s + t = n.
2. If the subspaces X_1, X_0', X_0'', X_2 are defined by

X_1 = span{u_1, ..., u_{p+q}},
X_0' = span{u_{p+q+1}, ..., u_{p+q+r}},
X_0'' = span{u_{p+q+r+1}, ..., u_{p+q+2r}},
X_2 = span{u_{p+q+2r+1}, ..., u_{p+q+2r+s+t}},

then X_0', X_0'' are neutral subspaces with the same dimension, X_1, X_2 and
X_0 = X_0' ⊕ X_0'' are non-degenerate and mutually H-orthogonal, and F^n =
X_0 ⊕ X_1 ⊕ X_2 as well as
If now we make

e_k' = (1/√2)(e_k + ẽ_k) and e_k'' = (1/√2)(e_k − ẽ_k) for 1 ≤ k ≤ r,

then it is true that

[e_k', e_l'] = δ_{kl}, [e_k'', e_l''] = −δ_{kl}, [e_k', e_l''] = 0 for 1 ≤ k, l ≤ r and
[e_i, e_k'] = 0, [e_i, e_k''] = 0 for r + 1 ≤ i ≤ m, 1 ≤ k ≤ r.

^3 The same construction is also specified in [BMRRR2] and in [BR] within the scope of the proof
of Witt's theorem. However, the necessary orthonormalisation of the vectors ẽ_k is there not carried
out completely, so that the basis {e_i, e_k', e_k''} also constructed there is not orthonormalised.
Thus the set of the vectors
whereby s specifies the number of positive and t the number of negative complementing
vectors, and a suitable sorting method is assumed. Instead of the basis
{u_1, ..., u_n} it is also possible to use the basis

and evidently the second part of the assertion is fulfilled by this basis too.
An important application of this result is the following Theorem of Witt con-
cerning the extension of isometries, whose proof has been taken over from the papers
[BR, Theorem 4.1] and [BMRRR2, Theorem 2.1]. Thereby π(H) gives the number of
positive eigenvalues of an hermitian matrix H.
Theorem 2.8 (Witt, extension of isometries). Let F = R or F = C and let [., .]_1,
[., .]_2 be two indefinite scalar products of F^n with the underlying regular symmetric or
hermitian matrices H_1, H_2 ∈ F^{n×n} for which π(H_1) = π(H_2) is fulfilled. If X_1 and
X_2 are subspaces of F^n and U_0 : X_1 → X_2 is a regular transformation such that
with r + s + t = n − m, since the number of positive and the number of negative vectors,
s and t respectively, must be identical for both bases, because of the assumption that
the signatures of the matrices H_1 and H_2 are the same. Thus the transformation
defined by

U R_1 = R_2 or U = R_2 R_1^{-1}
be a singular value decomposition of the matrix HX. Then the columns y_1, ..., y_m
of

Y = [Ũ_1 Ũ_2] [Σ̃^{-1}; 0] Ṽ^* = Ũ_1 Σ̃^{-1} Ṽ^*
X^*HX = I_p ⊕ −I_q ⊕ 0_r with p + q + r = m,

that

(X')^* H (X') = I_p ⊕ −I_q ⊕ [0_r, I_r; I_r, 0_r]

and for the matrix defined by X'' = [x_1 ... x_{p+q} x_{p+q+1}' ... x_m' x_{p+q+1}'' ... x_m''] with

x_k' = (1/√2)(x_k + y_k'), x_k'' = (1/√2)(x_k − y_k') for p + q + 1 ≤ k ≤ m

it is found that

u_k' = u_k − Σ_{µ=1}^{m+r} ε_µ [u_k, x_µ''] x_µ'',
it follows that

Finally, the columns of U_2' can be orthonormalised, giving a matrix U_2'' for which

as is also possible by using two scalar products [., .]_1 and [., .]_2 with π(H_1) = π(H_2).
Thus the matrix
whereby the blocks A_j and H_j are of equal size and each pair (A_j, H_j) has one and
only one of the following forms:
1. Pairs belonging to real eigenvalues
3. If a basis in which the blocks from 2. exist is designated with E_1 ∪ E_2 ∪ E_3,

E_1 = {e_{i,k}^{(1)} : 1 ≤ i ≤ r, 1 ≤ k ≤ 2p_i}, E_2 = {e_{i,k}^{(2)} : 1 ≤ i ≤ s, 1 ≤ k ≤ 2p_i − 1}, E_3 = {e_{i,1}^{(3)} : 1 ≤ i ≤ t},
(J_p(λ) ⊕ J_p(λ̄), Z_{2p}) or (J_p(−λ) ⊕ J_p(−λ̄), Z_{2p}) with λ² = α + iβ.

(N_{2p−1}, εZ_{2p−1}).
f. Third case with eigenvalue 0. If the canonical form of (M², H) contains a
block of the form

(0, ε) ∈ (J_3, Z_3),

(0, ε).
Proof. See [BMRRR1, Theorem 4.4, Lemma 7.8], [BMRRR3, Errata] and Theorem 4.4.
Whereas Theorem 3.2 makes it possible to transfer results concerning complex
H-polar decompositions to real decompositions, Theorem 3.3 - whose proof is based
on Witt’s theorem - and Theorem 3.4 constitute the essential result for the existence
of H-polar decompositions. Note that there is an error in [BMRRR1, Theorem 4.4]
which is pointed out by the following example.
Example 3.5. With the designations used in [BMRRR1], let

H = [1, 0; 0, −1], X = (1/√(1−ξ²)) [1+ξ, −1−ξ; 1+ξ, −1−ξ], X^{[*]}X = [0, 0; 0, 0]

with −1 < ξ < 1. Then according to the statement (ii) of Theorem 4.4 the equation
(X^{[*]}X, H) = (B_0, H_0) is satisfied and ker B_0 = span{e_1, e_2}. However, ker X =
span{e_1 + e_2} ≠ ker B_0, so that according to the statement (iii) of the theorem the
H-polar decomposition

X = UA, U = (1/√(1−ξ²)) [1, ξ; ξ, 1], A = [1, −1; 1, −1]
so that A^H A and AA^H are H-unitarily similar. If now (R^{-1}A^H AR, R^*HR) = (J, Z)
is the canonical form of the pair (A^H A, H) and if S = UR, then (S^{-1}AA^H S, S^*HS)
= (R^{-1}U^{-1}AA^H UR, R^*U^*HUR) = (J, Z) also gives the canonical form of the pair
(AA^H, H).
The question now arises whether the converse of this statement is also true. It
can be answered immediately as follows for regular matrices.
Theorem 3.7. Let A ∈ C^{n×n} be regular and let the canonical forms of the pairs
(A^H A, H) and (AA^H, H) be identical. Then A admits an H-polar decomposition.
gives the canonical form of the pairs (A^H A, H) and (AA^H, H). Then the regular
matrix defined by

B = S^{-1} A R

respectively. The last one of these equations is obtained from the last but one equation
and the fact that the commutativity of two matrices P and Q also implies the
commutativity of P^{-1} and Q, provided that the inverse exists.
If now X^{-1}BX = J_{p_1}(λ_1) ⊕ ... ⊕ J_{p_k}(λ_k) is the Jordan normal form of B, a matrix
√B with (√B)² = B can be chosen such that it can be expressed as a polynomial
f(B) = √B, namely

√B = X (√J_{p_1}(λ_1) ⊕ ... ⊕ √J_{p_k}(λ_k)) X^{-1} with

√J_p(λ) = the upper triangular Toeplitz matrix with first row
(f(λ), f'(λ)/1!, f''(λ)/2!, ..., f^{(p−1)}(λ)/(p−1)!), f(λ) = √λ,   (3.2)

i.e. with f(λ) on the diagonal and f^{(k)}(λ)/k! on the k-th superdiagonal.
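Formula (3.2) is the usual primary matrix function evaluated on a Jordan block. The following Python/numpy sketch builds √J_p(λ) from the Taylor coefficients of f(λ) = √λ; it assumes λ ≠ 0 and is illustrative only:

```python
import numpy as np

def sqrt_jordan_block(lam, p):
    # Taylor coefficients a_k = f^(k)(lam)/k! of f = sqrt at lam:
    # a_k = a_{k-1} * (1/2 - (k-1)) / (k * lam), a_0 = sqrt(lam)
    a = np.zeros(p, dtype=complex)
    a[0] = np.sqrt(complex(lam))
    for k in range(1, p):
        a[k] = a[k - 1] * (0.5 - (k - 1)) / (k * lam)
    # upper triangular Toeplitz matrix with a_k on the k-th superdiagonal
    J = np.zeros((p, p), dtype=complex)
    for k in range(p):
        J += np.diag(np.full(p - k, a[k]), k)
    return J
```

For example, sqrt_jordan_block(4.0, 2) gives [[2, 0.25], [0, 2]], whose square is J_2(4).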
then

(Square roots of matrices are treated in [G, Chapter V and Chapter VIII] as well as in [WED, Chapter VII and Chapter VIII].)

[⇐]: First of all, for all matrices A ∈ C^{n×n} the easily proved equations [GLR,
Proposition 2.1]

are true, so that AA^H = 0 on the one hand implies
Thus if

C^n = X_1 ⊕ X_0' ⊕ X_0'' with ker A = X_1 ⊕ X_0' and X_0' = ker A ∩ (ker A)^{[⊥]}

as well as

C^n = Y_2 ⊕ Y_0' ⊕ Y_0'' with (im A)^{[⊥]} = Y_2 ⊕ Y_0' and Y_0' = (im A)^{[⊥]} ∩ im A

whereby X_0', X_0'', Y_0', Y_0'' are neutral subspaces of dimension r and X_1, Y_2 are non-degenerate
subspaces of dimension p + q = n − 2r. If now {x_1, ..., x_{p+q}, x_1', ..., x_r'} is
taken to be an orthonormal basis of ker A, then this basis can be extended according
to Theorem 2.7 with r further vectors x_1'', ..., x_r'' to give a complete basis of the C^n,
and for the matrix X = [x_1 ... x_{p+q} x_1' ... x_r' x_1'' ... x_r''] consisting of these basis vectors
the equations

X^*HX = Z with Z = [I_p, 0; 0, −I_q] ⊕ [0_r, I_r; I_r, 0_r] and AX = [0_1 ... 0_{p+q+r} y_1' ... y_r']

apply. Therein {y_1', ..., y_r'} is a basis of the neutral space im A, which, too, can be extended
according to Theorem 2.7 with p + q + r further vectors y_1, ..., y_{p+q}, y_1'', ..., y_r''
to give a complete basis of the C^n, so that for the matrix Y = [y_1 ... y_{p+q} y_1' ... y_r' y_1''
... y_r''] consisting of these basis vectors, the equations

Y^*HY = Z and YK = [0_1 ... 0_{p+q+r} y_1' ... y_r'] with K = [0_p, 0; 0, 0_q] ⊕ [0_r, I_r; 0_r, 0_r]

are satisfied. Moreover, the matrices Z and K thus introduced fulfil the equations
AX = YK, Z^{-1} = Z^* = Z and K^*Z = 0_{p+q} ⊕ 0_r ⊕ I_r = ZK. Finally, if
then
which were not used in the proof but are valid too. ♦
Example 3.11.
1. If λ ∈ C\{0} and

H = [0, I_p; I_p, 0], A = [0_p, 0_p; λI_p, λI_p], A^H = [λ̄I_p, 0_p; λ̄I_p, 0_p],

3. If λ ∈ R and

H = [0, Z_p; Z_p, 0], A = 0_p ⊕ J_p(λ), A^H = J_p(λ) ⊕ 0_p,
These examples are here listed in detail because the cases 3. and 4. are important
in connection with normal forms of H-normal matrices, but this will not be considered
further here [LMMR, Theorem 10, Example 11]. The following sufficient condition for
the existence of H-polar decompositions can now be proved with the help of Lemma
3.9.
Theorem 3.12. Let A ∈ C^{n×n} and let the canonical forms of the pairs (A^H A,
H) and (AA^H, H) be identical. Furthermore, let all blocks of the canonical form
belonging to the eigenvalue 0 be of size 1. Then A admits an H-polar decomposition.
Proof. Let R, S, J, Z and B be as in the proof of Theorem 3.7, so that B^Z B =
BB^Z = J can be assumed. Furthermore, let

whereby J_0, Z_0 designates the part of the canonical form belonging to the eigenvalue
0. Then the spectra of the blocks J_1 and J_0 are disjoint and also the matrices J and
B commute, so that also B must take the form

B = B_1 ⊕ B_0, B_0 ∈ C^{m×m}.
an H-polar decomposition, then the present restrictions regarding the block sizes for the eigenvalue
0 would no longer be needed.
negative eigenvalues and whose existence is not guaranteed by Theorem 3.4. Thereby,
in the listed cases, the index of nilpotency k of the matrices X^H X = XX^H is

k = 1 in the cases (I), (VI)-(VII),
k = 2 in the cases (II)-(III), (VIII)-(XII),
k = 3 in the cases (IV)-(V),

as can easily be verified. Thus, the existence of the given H-polar decompositions in
the cases (I), (VI)-(VII) is ensured by Corollary 3.13 and, moreover, the hypothesis
expressed in the footnote on the last page is supported, too. On the other hand for
α ∈ R and

H = [0, 0, 1; 0, 1, 0; 1, 0, 0], X = [0, 1, iα; 0, 0, 1; 0, 0, 0] with X^H X = XX^H = [0, 0, 1; 0, 0, 0; 0, 0, 0]

the existence of an H-polar decomposition is guaranteed by Theorem 3.4(ii) but not
by Corollary 3.13, so that the two criteria are mutually supplementary.
4. Numerical computation of H-polar decompositions of a matrix A
for which A^{[*]}A is diagonalisable. The practical computation of H-polar decompositions
of a matrix A ∈ C^{n×n} is a tedious task consisting of the following steps^6:
1. Computation of the canonical form of the pair (A^{[*]}A, H),
2. Computation of an H-selfadjoint matrix M such that M² = A^{[*]}A and
ker M = ker A,
3. Computation of an H-isometry U such that A = UM.
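Before these steps are described in detail, the following Python/numpy sketch shows the skeleton of the procedure for the simplest special case, in which A^{[*]}A is diagonalisable with only positive real or non-real eigenvalues, so that the canonical form need not be computed explicitly (cf. the footnote to the 1st step below); all names are illustrative:

```python
import numpy as np

def h_adjoint(A, H):
    return np.linalg.solve(H, A.conj().T @ H)   # A^{[*]} = H^{-1} A^* H

def h_polar_simple(A, H):
    B = h_adjoint(A, H) @ A                     # step 1: B = A^{[*]} A
    lam, R = np.linalg.eig(B)                   # B = R diag(lam) R^{-1}
    K = np.diag(np.sqrt(lam.astype(complex)))   # principal square roots
    M = R @ K @ np.linalg.inv(R)                # step 2: M^2 = B, H-selfadjoint
    U = A @ np.linalg.inv(M)                    # step 3: U = A M^{-1}, H-isometry
    return U, M
```

The general case, treated below, additionally requires the canonical form in order to handle negative eigenvalues and the eigenvalue 0.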
This chapter specifies a numerical method for the case that the matrix A^{[*]}A is
diagonalisable. This requires a simplified canonical form for a pair (A, H) where A
is a diagonalisable H-selfadjoint matrix and H is a regular hermitian matrix, which
is derived based on the following facts taken from [GLR, Chapter I.2.2]:
If A ∈ C^{n×n} is H-selfadjoint then for every non-real eigenvalue λ ∈ σ(A) it follows
that also λ̄ ∈ σ(A) and the Jordan normal form of A contains blocks of the same size
for both eigenvalues. Let

be the generalised eigenspace for the eigenvalue λ. If λ_1, ..., λ_r are the real and
λ_{r+1}, ..., λ_s are the non-real eigenvalues with positive real parts, and if X_i = E_A(λ_i)
^6 In the case H = I the polar decomposition of a square matrix A is obtained from its singular
value decomposition. Namely if A = UΣV^*, then UV^* is the isometric and VΣV^* is the selfadjoint
factor of a polar decomposition of A. This suggests that in the case of an indefinite scalar product,
too, a similar approach can be adopted to obtain an H-polar decomposition. An analogy to the
singular value decomposition is examined in the paper [BR]. In Chapter 8 thereof the statements
regarding H-polar decomposition are derived, on which the present chapter is based: the existence of
an H-polar decomposition of A is inferred there, too, from the canonical form of the pair (A^{[*]}A, H).
for 1 ≤ i ≤ r and X_{i,1} = E_A(λ_i), X_{i,2} = E_A(λ̄_i) for r + 1 ≤ i ≤ s, then the C^n can
be expressed as the direct sum of the non-degenerate generalised eigenspaces
Therefore, if R ∈ C^{n×n} is a matrix whose column vectors are bases of the subspaces
X_i, then R is regular and

R^{-1}AR = A_1 ⊕ ... ⊕ A_r ⊕ [A_{r+1,1}, 0; 0, A_{r+1,2}] ⊕ ... ⊕ [A_{s,1}, 0; 0, A_{s,2}],

R^*HR = H_1 ⊕ ... ⊕ H_r ⊕ [0, H_{r+1}; H_{r+1}^*, 0] ⊕ ... ⊕ [0, H_s; H_s^*, 0],
The following result is obtained directly from this representation whose proof will be
given in two ways in order to, on the one hand, show the connection to Theorem 3.1
and, on the other hand, to provide the foundation for the corresponding numerical
method.
Theorem 4.1 (Simplified canonical form). Let A ∈ C^{n×n} be H-selfadjoint and
diagonalisable. Then there exists a regular matrix S ∈ C^{n×n} such that

whereby the blocks A_j and H_j are of equal size and the pairs (A_j, H_j) have one and
only one of the following forms:
1. Pairs belonging to real eigenvalues:

A_j = λI_p and H_j = [I_{p−q}, 0; 0, −I_q]   (4.1b)

with λ ∈ R and p, q ∈ N, q ≤ p.
2. Pairs belonging to non-real eigenvalues:

A_j = [λI_p, 0; 0, λ̄I_p] and H_j = [0, I_p; I_p, 0]   (4.1c)
S^*HS) of the pair (H^{-1}G, H) can therefore be expressed according to Theorem 4.1
in the simplified form

S^{-1}H^{-1}GS = (⊕_{j=1}^{r} λ_j I_{p_j}) ⊕ (⊕_{j=r+1}^{s} [λ_j I_{p_j}, 0; 0, λ̄_j I_{p_j}]),

S^*HS = (⊕_{j=1}^{r} [I_{p_j−q_j}, 0; 0, −I_{q_j}]) ⊕ (⊕_{j=r+1}^{s} [0, I_{p_j}; I_{p_j}, 0]),

S^*GS = (⊕_{j=1}^{r} λ_j [I_{p_j−q_j}, 0; 0, −I_{q_j}]) ⊕ (⊕_{j=r+1}^{s} [0, λ̄_j I_{p_j}; λ_j I_{p_j}, 0])
r_{m_{i−1}+1}, ..., r_{m_{i−1}+p_i} or r_{m_{i−1}+1}, ..., r_{m_{i−1}+p_i} and r_{m_i+1}, ..., r_{m_i+p_{i+1}}
Now if such a matrix M is given, and if the basis in which the (simplified) canonical
form exists is designated as {g_1, ..., g_n}, then the (simplified) canonical form of the
pair (M², H) has the following properties:
1. If λ ∈ C\(R ∪ iR) and if the canonical form of (M, H) contains the blocks

M_{j1} ⊕ M_{j2} = [λI_{p_1}, 0; 0, λ̄I_{p_1}] ⊕ [−λI_{p_2}, 0; 0, −λ̄I_{p_2}],
H_{j1} ⊕ H_{j2} = [0, I_{p_1}; I_{p_1}, 0] ⊕ [0, I_{p_2}; I_{p_2}, 0],

then the canonical form of (M², H) contains a pair of blocks of the form

M_j² = [λ²I_{p_1+p_2}, 0; 0, λ̄²I_{p_1+p_2}], H_j = [0, I_{p_1+p_2}; I_{p_1+p_2}, 0].

then the canonical form of (M², H) contains a pair of blocks of the form

M_j² = λ²I_{(p_1+p_2)+(q_1+q_2)}, H_j = I_{p_1+p_2} ⊕ −I_{q_1+q_2}.
e_k = (1/√2)(g_k + g_{k+p}), f_k = (1/√2)(g_k − g_{k+p}) for 1 ≤ k ≤ p,

then

e_k = (1/√2)(g_k + g_{k+p+q}), f_k = (1/√2)(g_k − g_{k+p+q}) for 1 ≤ k ≤ p,
e_k = (1/√2)(g_k − g_{k+p+q}), f_k = (1/√2)(g_k + g_{k+p+q}) for p + 1 ≤ k ≤ p + q,
e_{k+p+q} = g_{k+2p+2q} for 1 ≤ k ≤ s,
f_{k+p+q} = g_{k+2p+2q+s} for 1 ≤ k ≤ t,

then
then for

M_j = [ωΣ_p, 0; 0, ω̄Σ_p]

the equations M_j² = B_j and M_j^* H_j = H_j M_j are satisfied.
2. If λ ∈ R ∩ (0, ∞) and if the canonical form contains a pair of blocks of the form

B_j = λI_p and H_j = [I_{p−q}, 0; 0, −I_q],

then for

M_j = √λ Σ_p

the equations M_j² = B_j and M_j^* H_j = H_j M_j are satisfied.
3. If λ = −α² ∈ R ∩ (−∞, 0) and if the canonical form contains a pair of blocks
of the form

B_j = [−α²I_p, 0; 0, −α²I_p] and H_j = [I_p, 0; 0, −I_p],

then for

M̃_j = [iαΣ_p, 0; 0, −iαΣ_p], H̃_j = [0, I_p; I_p, 0] and S_j = (1/√2) [I_p, I_p; I_p, −I_p]
This confirms again the correction of Theorem 3.4 explained with Example 3.5. Furthermore,
the diagonal matrices Σ_p used in the proof given above comply with the
relationships between the canonical forms of (M², H) and (M, H) listed in Theorem
3.4(a-f).
Summing up, the following procedure can now be described using the designation
X ¦ Y = [x_1 ... x_p y_1 ... y_q] ∈ C^{n×(p+q)} for given matrices X = [x_1 ... x_p] ∈ C^{n×p}
and Y = [y_1 ... y_q] ∈ C^{n×q} having the specified columns.
Numerical Procedure 4.5 (H-polar decomposition). Let A ∈ C^{n×n} and let
A^{[*]}A be diagonalisable. Then an H-polar decomposition of A can be constructed as
follows.
1st step: First of all the (simplified) canonical form of the pair (A^{[*]}A, H) must
be determined^7 using the procedure 4.3, which is given as example by

R^{-1}A^{[*]}AR = J = J_3 ⊕ J_2 ⊕ J_1 ⊕ J_0,
J = [ω²I_{p_3}, 0; 0, ω̄²I_{p_3}] ⊕ [α²I_{p_2}, 0; 0, α²I_{q_2}] ⊕ [−β²I_{p_1}, 0; 0, −β²I_{p_1}] ⊕ [0_{r+s}, 0; 0, 0_{r+t}],
R^*HR = Z_J = Z_{J,3} ⊕ Z_{J,2} ⊕ Z_{J,1} ⊕ Z_{J,0},
Z_J = [0, I_{p_3}; I_{p_3}, 0] ⊕ [I_{p_2}, 0; 0, −I_{q_2}] ⊕ [I_{p_1}, 0; 0, −I_{p_1}] ⊕ [I_{r+s}, 0; 0, −I_{r+t}],
R = R_3 ¦ R_2 ¦ R_1 ¦ R_0
2nd step: According to step 2, an H-selfadjoint matrix M with M² = A^{[*]}A and
ker M = ker A is constructed in the form

S^{-1}MS = K = K_3 ⊕ K_2 ⊕ K_1 ⊕ K_0,
K = [ωI_{p_3}, 0; 0, ω̄I_{p_3}] ⊕ [αI_{p_2}, 0; 0, αI_{q_2}] ⊕ [0, iβI_{p_1}; iβI_{p_1}, 0] ⊕ ([0_r, I_r; 0_r, 0_r] ⊕ [0_s, 0; 0, 0_t]),
S^*HS = Z_K = Z_{K,3} ⊕ Z_{K,2} ⊕ Z_{K,1} ⊕ Z_{K,0},
^7 If the matrix A^{[*]}A has only non-real and positive eigenvalues, then the canonical form does
not necessarily have to be determined. If in this case K is a diagonal matrix with square roots of
the eigenvalues, then M = RKR^{-1} and U = AM^{-1} are given directly as H-polar decomposition
of A. The canonical form is required in the general case considered here, in order to decide whether
an H-hermitian square root exists in the case of negative eigenvalues, and in order to find a suitable
kernel transformation in the case of the eigenvalue 0.
Z_K = [0, I_{p_3}; I_{p_3}, 0] ⊕ [I_{p_2}, 0; 0, −I_{q_2}] ⊕ [I_{p_1}, 0; 0, −I_{p_1}] ⊕ ([0, I_r; I_r, 0] ⊕ [I_s, 0; 0, −I_t]),
S = R_3 ¦ R_2 ¦ R_1 ¦ R_0'' (R_0'' according to kernel transformation).
In the case of the negative eigenvalue −β² this construction is possible only if condition
1. of Theorem 4.4 is fulfilled, i.e. only if the number of negative is the same as the
number of positive eigenvectors from R_1. It would then also be possible to use the
blocks

K_1' = [iβI_{p_1}, 0; 0, −iβI_{p_1}], Z_{K,1}' = [0, I_{p_1}; I_{p_1}, 0] and R_1' = (1/√2) R_1 [I_{p_1}, I_{p_1}; I_{p_1}, −I_{p_1}],
but this is found to be less convenient when constructing the isometry in the third
step. Furthermore, in the blocks of K diagonal matrices Σ_{p_3}, Σ_{p_2}, Σ_{q_2}, Σ_{p_1}, Σ_r with
diagonal elements from {+1, −1} could also be used instead of the identity matrices,
which would then produce another H-polar decomposition of A. Finally, the required
treatment of the eigenvalue 0 in this step is based on the following procedure.
Kernel transformation: Let R_0 = R_+ ¦ R_−, where R_+ contains the r + s positive
and R_− contains the r + t negative vectors from ker A^{[*]}A, so that on the one hand

Furthermore, let U_+^*(AR_+)V_+ = Σ_+ and U_−^*(AR_−)V_− = Σ_− be singular value decompositions
of the matrices AR_+ and AR_−. Then if the number r of non-vanishing
singular values σ_i ≥ τ (see procedure 4.3) in Σ_+ and Σ_− is the same, i.e.

AR_0' = [−a_1^− ... −a_r^− a_1^− ... a_r^− 0_1 ... 0_{s+t}], (AR_0')^* H (AR_0') = 0_{2r+s+t}
and (R_0')^* H (R_0') = (I_r ⊕ −I_r) ⊕ (I_s ⊕ −I_t),
and for

R_0'' = (1/√2) (R̂_+ ¦ R̂_−) [I_r, I_r; I_r, −I_r] ¦ (Ř_+ ¦ Ř_−)

we get

AR_0'' = −√2 [0_1 ... 0_r a_1^− ... a_r^− 0_1 ... 0_{s+t}], (AR_0'')^* H (AR_0'') = 0_{2r+s+t}
and (R_0'')^* H (R_0'') = [0, I_r; I_r, 0] ⊕ [I_s, 0; 0, −I_t].
If the number of non-vanishing singular values in Σ_+ and Σ_− differ or the transformation
W is not unitary, then ker A cannot be expressed in the just constructed form

ker A = span{e_1 + f_1, ..., e_r + f_r, e_{r+1}, ..., e_{r+s}, f_{r+1}, ..., f_{r+t}} with
R̂_+ = [e_1 ... e_r], Ř_+ = [e_{r+1} ... e_{r+s}], R̂_− = [f_1 ... f_r], Ř_− = [f_{r+1} ... f_{r+t}].

In that case the condition 2. of Theorem 4.4 is violated and an H-polar decomposition
of A then does not exist.
3rd step: After the second step M = SKS^{-1} is the H-hermitian factor, and
in the regular case, in which no blocks of the form J_0, Z_{J,0} and K_0, Z_{K,0} exist,
U = AM^{-1} = ASK^{-1}S^{-1} is the H-unitary factor of an H-polar decomposition of A.
The inverse of the matrix S which thereby appears can be obtained in the numerical
evaluations of the equations using S^{-1} = Z_K S^* H, which follows from S^*HS = Z_K.
In the singular case let

K̃ = K_3 ⊕ K_2 ⊕ K_1 ⊕ K̃_0 with K̃_0 = K̃_0^{-1} = [0, I_r; I_r, 0] ⊕ I_{s+t}
and also let

S_0 = S_0' ¦ S_0'' ¦ S̃_0 with S_0', S_0'' ∈ C^{n×r} and S̃_0 ∈ C^{n×(s+t)}

be a partitioning of the part of the matrix S belonging to the eigenvalue 0. Then
on the one hand AS_0 = 0_{n,r} ¦ A_0' ¦ 0_{n,s+t} with A_0' = −√2 [a_1^− ... a_r^−], because the
columns of S_0' and S̃_0 constitute a basis of ker A after the kernel transformation, and
on the other hand S_0 K_0 = 0_{n,r} ¦ S_0' ¦ 0_{n,s+t}. Therefore the matrices ASK̃^{-1} and
MSK̃^{-1} take on the form

ASK̃^{-1} = AS_3K_3^{-1} ¦ AS_2K_2^{-1} ¦ AS_1K_1^{-1} ¦ AS_0K̃_0^{-1},
MSK̃^{-1} = S_3K_3K_3^{-1} ¦ S_2K_2K_2^{-1} ¦ S_1K_1K_1^{-1} ¦ S_0K_0K̃_0^{-1} with S_0K_0K̃_0^{-1} = S_0' ¦ 0_{n,r+s+t}
(MSK̃^{-1})_m = S_3 ¦ S_2 ¦ S_1 ¦ S_0',

are bases of im A or im M for which

(ASK̃^{-1})_m^* H (ASK̃^{-1})_m = (MSK̃^{-1})_m^* H (MSK̃^{-1})_m =
(Z_K)_m = [0, I_{p_3}; I_{p_3}, 0] ⊕ [I_{p_2}, 0; 0, −I_{q_2}] ⊕ [I_{p_1}, 0; 0, −I_{p_1}] ⊕ 0_r.
Now both bases can be extended according to Theorem 2.7 to bases of the C^n such
that

p_3 = p_2 = q_2 = p_1 = r = s = t = 2, i.e. n = 20,

and random values for ω, α, β. Using randomly chosen transformations S̃ and randomly
chosen H-isometries Ũ, test examples of the kind

were constructed, always based on normally distributed random numbers from the
interval [−2, 2]. Finally H-polar decompositions UM of the test matrices A were
computed, whose numerical accuracy can be estimated via the residuals or the condition
number, respectively,
For the first variant the Gauss-Jordan method of matrix inversion [ST, Kapitel
4.2] was used to compute the matrix S^{-1}; for the second variant the inversion was
accomplished with the equation S^{-1} = Z_K S^* H. The eigenvalues were computed in
the depicted cases using a generalisation of the Jacobi method according to Eberlein
[E]. In other experiments the LR or the QR method was used. It was found that all
methods produce nearly the same numerical results.
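Suitable accuracy measures for such a computed decomposition are, for example, the following (a sketch with illustrative names, not necessarily the paper's exact criteria):

```python
import numpy as np

def residuals(A, U, M, H):
    # factorisation residual, H-isometry defect of U and
    # H-selfadjointness defect of M for a computed A = U M
    return (np.linalg.norm(A - U @ M),
            np.linalg.norm(U.conj().T @ H @ U - H),
            np.linalg.norm(M.conj().T @ H - H @ M))
```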
Table 4.1. Results of two statistical experiments.
5. Factor analysis and procrustes problems. This chapter shows how pro-
crustes problems in indefinite scalar product spaces are solved by application of H-
polar decompositions. As motivation for this task, it will first of all be shown how to
construct points from given distances. This is a generalisation of the work [YH] for
complex vector spaces and indefinite scalar products.
Let F = R or F = C and let [., .] be an indefinite scalar product of the F^n with
the underlying regular symmetric or hermitian matrix H ∈ F^{n×n}. Then for arbitrary
vectors x, y ∈ F^n in the case F = R

[x, y] = (1/2)([x, x] + [y, y] − [x − y, x − y])   (5.1a)

and in the case F = C

Re[x, y] = (1/2)([x, x] + [y, y] − [x − y, x − y])
         = (1/2)([x, x] + [y, y] − [iy − ix, iy − ix]),
Im[x, y] = (1/2)([x, x] + [y, y] − [x − iy, x − iy])   (5.1b)
         = −(1/2)([x, x] + [y, y] − [y − ix, y − ix]),
so that the scalar products of the vectors can be expressed in terms of their norm
and distance squares. Now let the vectors x_1, ..., x_N ∈ F^n be given and let X =
[x_1 ... x_N]^* ∈ F^{N×n} be a matrix whose rows are the conjugate transposed vectors.
Then

W = XHX^*

is the Gramian matrix of the x_k, and the dimension of the set of points which they
span is m = rank X = rank W. In particular, if N ≥ n and span{x_1, ..., x_N} = F^n,
then the number of positive and the number of negative eigenvalues of H and W are
equal, and furthermore the eigenvalue 0 appears in σ(W) with the multiplicity N − n
(Sylvester's law of inertia). Moreover, the elements w_kl = [x_l, x_k] of the matrix W
according to (5.1) can be expressed in the form
w_kl = (1/2)(ρ_k² + ρ_l² − σ_kl²) if F = R or   (5.2a)

w_kl = (1/2)(ρ_k² + ρ_l² − σ_kl²) + (i/2)(ρ_k² + ρ_l² − τ_kl²) if F = C.   (5.2b)
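Equations (5.2) translate directly into a small routine; a Python/numpy sketch with illustrative names:

```python
import numpy as np

def gram_from_distances(rho2, sigma2, tau2=None):
    # w_kl = (rho_k^2 + rho_l^2 - sigma_kl^2)/2, plus the imaginary
    # part (rho_k^2 + rho_l^2 - tau_kl^2)/2 in the complex case (5.2b)
    r = np.asarray(rho2, dtype=float)
    W = 0.5 * (r[:, None] + r[None, :] - np.asarray(sigma2))
    if tau2 is not None:
        W = W + 0.5j * (r[:, None] + r[None, :] - np.asarray(tau2))
    return W
```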
where

W = RΛR^*.

X = R_1Σ ∈ F^{N×n}

fulfils on the one hand rank X = n and on the other hand XHX^* = R_1ΣHΣ^*R_1^*
= R_1Λ_1R_1^* = W. Therefore the conjugate transposed rows x_1, ..., x_N ∈ F^n of
the matrix X constitute a generating system of the F^n, and for the indefinite scalar
product defined by [x, y] = (Hx, y) we have w_kl = [x_l, x_k]. This means that also

so that the constructed points correspond to the given norm and distance squares.
We thus have proved the following theorem.
Theorem 5.1 (Construction of vectors from norm and distance squares). Let
F = R or F = C and let ρ_k², σ_kl² be real numbers such that σ_kl² = σ_lk² and σ_kk² = 0 for
all k, l from {1, ..., N}. Furthermore, for the case F = C let τ_kl² be real numbers such
that τ_kl² + τ_lk² = 2(ρ_k² + ρ_l²) for all k, l from {1, ..., N}. Then the following statements
are equivalent:
1. There exist vectors x_1, ..., x_N ∈ F^n constituting a generating system for the
F^n, for which [x_k, x_k] = ρ_k² as well as [x_l − x_k, x_l − x_k] = σ_kl², and in the case
F = C also [x_l − ix_k, x_l − ix_k] = τ_kl² is satisfied. Thereby [., .] is an indefinite
scalar product of the F^n with underlying regular symmetric or hermitian
matrix H ∈ F^{n×n} which has p positive eigenvalues.
2. The symmetric or hermitian matrix W ∈ F^{N×N} whose elements w_kl are defined
by (5.2) has p positive and n − p negative eigenvalues, and the eigenvalue
0 appears with multiplicity N − n.
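The construction used in the proof (eigendecomposition of W, followed by scaling the eigenvector columns) is directly implementable. A Python/numpy sketch with illustrative names, assuming the eigenvalue pattern of statement 2:

```python
import numpy as np

def points_from_gram(W, tol=1e-10):
    # recover X (rows = conjugate transposed points) and H = diag(+-1)
    # from a hermitian Gramian W = X H X^*
    lam, R = np.linalg.eigh(W)                 # W = R diag(lam) R^*
    keep = np.abs(lam) > tol                   # drop the N - n zero eigenvalues
    lam, R1 = lam[keep], R[:, keep]
    order = np.argsort(-lam)                   # positive eigenvalues first
    lam, R1 = lam[order], R1[:, order]
    H = np.diag(np.sign(lam))                  # signature of the scalar product
    X = R1 @ np.diag(np.sqrt(np.abs(lam)))     # X = R_1 Sigma as in the proof
    return X, H
```

By construction X @ H @ X.conj().T reproduces W, so the rows of X realise the prescribed norm and distance squares.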
A real vector space provided with an indefinite scalar product defined by a matrix
and the relations obtained by cyclic exchange of the variables are fulfilled. But this
is just the triangle inequality, so that Corollary 5.2 contains a generalisation of this
essential property of Euclidean geometry.
Remark 5.3 (Factor analysis and multidimensional scaling). If the coordinate
origin is designated x_0 and σ_{k0}² = σ_{0k}² is written instead of ρ_k², then Theorem 5.1 shows
how N + 1 points (objects) must be arranged in a coordinate system so that they take
up given distances. This constitutes the basis for an entire discipline in psychology
which is there called factor analysis [H] or multidimensional scaling [D]. Essentially
this is a procedure for geometric modelling of cognitive processes and analysing the
resulting constellations of objects with regard to the geometric invariants (dimension,
signature, lengths, volumes, angles). It would thereby also be possible in principle to
interpret physical invariants, because the matrix

T = X^*X, T = [T^{αβ}] with T^{αβ} = Σ_{k=1}^{N} x_k^α x̄_k^β for 1 ≤ α, β ≤ n
whereby for brevity ‖x‖_H² = [x, x] has been set [T]. From the given distance squares
σ_kl² = ‖x_l − x_k‖_H² and τ_kl² = ‖x_l − ix_k‖_H² it is therefore possible to calculate the
elements w_kl = [x_l, x_k] of a matrix W whose row and column sums vanish. Using the
construction of Theorem 5.1 this produces points whose centroid lies at the origin so
that the resulting set of points is oriented along its central main inertial axes. However,
in the complex case this is possible only when the τ_kl² satisfy a rather complicated
condition arising from the centroid situation. ♦
After this brief excursion, the main result of this chapter will now be derived. For
this purpose let x_1, ..., x_N still be the vectors, constructed according to Theorem 5.1,
of an F^n provided with an indefinite scalar product [., .] = (H., .). For every H-isometry
U ∈ F^{n×n} it then follows that

by making V = U^*. Thus the conjugate transposed rows x_k' = Ux_k contained in the
matrix X' = XV take on the specified distances, too.
Now let two N-tuples of vectors (x_1, ..., x_N) and (y_1, ..., y_N) of the F^n be given,
which one can consider as having arisen, for example, by measuring the distances of
a dynamical system of objects at two different times. Then, on comparing the two
constellations, the question arises, what part of the observed differences is due to the
different position in space, and what part is due to actual differences in the inner
structure of the constellations. Expressed mathematically, the task is to determine
an H-isometry U ∈ F^{n×n} which solves the optimising problem

f(U) = Σ_{k=1}^{N} [Ux_k − y_k, Ux_k − y_k] → min, h(U) = U^*HU − H = 0.   (5.4a)
The sum of distance squares arising therein can be expressed in the form of a trace,
so that an alternative expression with

and the necessary first order condition for solving the problem is

∂(f + h)/∂V = 0.

Differentiation of the trace [DP] gives

∂f/∂V = 2X^*XVH − 2X^*YH and ∂h/∂V = (L + L^*)VH,

so that V must satisfy the equation

X^*XVH + ΛVH = X^*YH for Λ = (L + L^*)/2 = Λ^*

and U must satisfy the equation

Now defining

it follows that M^*H = H(X^*X + Λ)H = HM and the necessary condition takes the
form
whereby ε ∈ {+1, −1} is called the characteristic of the depiction. For the operators
∧ and ∨ introduced this way, the following calculation rules apply.
Lemma 5.4 (Real depiction of complex matrix equations). Let X, Y ∈ C^{m×n},
Z ∈ C^{n×k}, A, B ∈ C^{n×n} and λ ∈ C. Then
1. X^∧ = (iX)^∨, X^∨ = (−iX)^∧,
2. (λX)^∧ = λ_1 X^∧ − λ_2 X^∨, (λX)^∨ = λ_1 X^∨ + λ_2 X^∧,
3. (X + Y)^∧ = X^∧ + Y^∧, (X + Y)^∨ = X^∨ + Y^∨,
4. (XZ)^∧ = X^∧ Z^∧ = −X^∨ Z^∨, (XZ)^∨ = X^∧ Z^∨ = X^∨ Z^∧,
5. (X^*)^∧ = (X̄^T)^∧ = (X^∧)^T, (X^*)^∨ = (X̄^T)^∨ = −(X^∨)^T,
6. A^∧ = (A^∧)^T, A^∨ = −(A^∨)^T, if A^* = A,
7. B^∧ = −(B^∧)^T, B^∨ = (B^∨)^T, if B^* = −B,
8. (A^{-1})^∧ = (A^∧)^{-1}, (A^{-1})^∨ = −(A^∨)^{-1}, if det(A) ≠ 0,
9. 2 tr(A) = tr(A^∧) + i tr(A^∨),
10. |det(A)|² = det(A^∧) = det(A^∨).
Proof. The proofs are obtained by simple verification demonstrated by some
examples. Proof of 2 and 3:
Proof of 8: Let B = A^{-1}. Then I + i0 = AA^{-1} = AB = (A_1 + iA_2)(B_1 + iB_2) =
(A_1B_1 − A_2B_2) + i(A_1B_2 + A_2B_1) and therefore

A^∧(A^{-1})^∧ = A^∧B^∧ = [A_1, −εA_2; εA_2, A_1] [B_1, −εB_2; εB_2, B_1]
= [A_1B_1 − ε²A_2B_2, −ε(A_1B_2 + A_2B_1); ε(A_1B_2 + A_2B_1), A_1B_1 − ε²A_2B_2] = [I, 0; 0, I],

A^∨(A^{-1})^∨ = A^∨B^∨ = [A_2, εA_1; −εA_1, A_2] [B_2, εB_1; −εB_1, B_2]
= [A_2B_2 − ε²A_1B_1, ε(A_1B_2 + A_2B_1); −ε(A_1B_2 + A_2B_1), A_2B_2 − ε²A_1B_1] = [−I, 0; 0, −I].
Proof of 10: On the one hand |det(A)|² = det(A) det(Ā) = det(AĀ) =
det[(A_1 + iA_2)(A_1 − iA_2)] = det[A_1² + A_2²] and on the other hand

det(A^∧) = det [A_1, −εA_2; εA_2, A_1] = det [A_1 − iεA_2, −εA_2; εA_2 + iA_1, A_1]
= det [A_1 − iεA_2, −εA_2; 0, A_1 + iεA_2]
= det[(A_1 − iεA_2)(A_1 + iεA_2)] = det[A_1² + A_2²],

det(A^∨) = det [A_2, εA_1; −εA_1, A_2] = det [A_2 + iεA_1, εA_1; −εA_1 + iA_2, A_2]
= det [A_2 + iεA_1, εA_1; 0, A_2 − iεA_1]
= det[(A_2 + iεA_1)(A_2 − iεA_1)] = det[A_1² + A_2²].
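The operators ∧ and ∨ are easy to implement, and the rules of Lemma 5.4 can be verified numerically. A Python/numpy sketch for a fixed characteristic ε = +1 (names illustrative):

```python
import numpy as np

eps = 1.0  # characteristic of the depiction, from {+1, -1}

def wedge(X):
    # X^ = [A1, -eps A2; eps A2, A1] for X = A1 + i A2
    X1, X2 = X.real, X.imag
    return np.block([[X1, -eps * X2], [eps * X2, X1]])

def vee(X):
    # X_v = [A2, eps A1; -eps A1, A2]
    X1, X2 = X.real, X.imag
    return np.block([[X2, eps * X1], [-eps * X1, X2]])

# numerical check of rules 4 and 10
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
assert np.allclose(wedge(A @ B), wedge(A) @ wedge(B))
assert np.allclose(wedge(A @ B), -vee(A) @ vee(B))
assert np.isclose(abs(np.linalg.det(A))**2, np.linalg.det(wedge(A)))
```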
If now the abbreviation Z = XV − Y is used, the equations (5.4b) and (5.5) in the
case F = C can be brought with the Lemma 5.4 into the equivalent real representation

and

Thereby the antisymmetry of H^∨ was taken into account in the first transformation,
from which the vanishing of the imaginary part follows. The necessary first order
condition for an optimum is

∂f̃/∂V^∧ + ∂h̃_1/∂V^∧ = 0 and ∂h̃_2/∂V^∧ = 0

and the differentiation of the trace gives

∂f̃/∂V^∧ = 2(X^∧)^T X^∧ V^∧ H^∧ − 2(X^∧)^T Y^∧ H^∧ = 2(X^*XVH − X^*YH)^∧,
∂h̃_1/∂V^∧ = [L^∧ + (L^∧)^T] V^∧ H^∧ = [(L + L^*)VH]^∧,
∂h̃_2/∂V^∧ = [L^∨ − (L^∨)^T] V^∧ H^∧ = [(L − L^*)VH]^∨,

so that the equations now written again in complex form

X^*XVH − X^*YH + ((L + L^*)/2)VH = 0 and ((L − L^*)/2)VH = 0
must be satisfied. The second equation demands that the antihermitian part of the
Lagrange multipliers vanishes, and the first can be expressed with the abbreviation
Λ for the hermitian part and with V^* = U in the form

This is just (5.6), so that in the complex case too the necessary condition (5.7) must
be fulfilled.
Lastly, to also get a sufficient criterion for the minimum of the function f, let
A = UM be an H-polar decomposition of the matrix A = Y^*XH and let

(R^{-1}A^{[*]}AR, R^*HR) = (J, Z_J) with J = ⊕_{i=1}^{k} J_{p_i}(λ_i),

(S^{-1}MS, S^*HS) = (K, Z_K) with K = ⊕_{j=1}^{m} J_{p_j}(κ_j)

be the canonical forms of the pairs (A^{[*]}A, H) and (M, H), respectively. Returning
therewith to the initial equation (5.4b), we find that
Thus the value of f(V) is minimised when in the calculation of the H-polar decomposition
of A the square roots κ_ρ of the positive real eigenvalues λ_ρ of A^{[*]}A are chosen
positive (Theorem 3.4, case b)

κ_ρ = +√λ_ρ for λ_ρ > 0   (5.8a)

and when the square roots κ_σ, κ̄_σ of the non-real eigenvalues λ_σ, λ̄_σ of A^{[*]}A are
chosen such that their real parts are positive (Theorem 3.4, case a)

κ_σ = +√α_σ (cos(φ_σ/2) + i sin(φ_σ/2)) with
α_σ = |λ_σ|, φ_σ = arg(λ_σ) for λ_σ ∈ C\R.   (5.8b)

The roots of the negative real eigenvalues and of the eigenvalue 0, which are fixed by
specification anyway, make no contribution to the optimised value. Thus altogether
the following theorem applies.
Theorem 5.5 (Existence of solutions of the H-isometric procrustes problem).
A solution of the H-orthogonal or H-unitary procrustes problem (5.4) exists if the
matrix A = Y^*XH admits an H-polar decomposition. In this case the H-isometry U
or V = U^* contained in the decomposition A = UM minimises the function f when
the eigenvalues of M are chosen according to (5.8).
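For the simple case in which A^{[*]}A is diagonalisable with only positive real or non-real eigenvalues, Theorem 5.5 can be turned directly into code, reusing the illustrative h_polar_simple sketched in Chapter 4 (its principal square roots are exactly the choice (5.8)):

```python
import numpy as np

def h_procrustes(X, Y, H):
    # minimise tr[(XV - Y)^*(XV - Y)H] over H-isometries V = U^*
    A = Y.conj().T @ X @ H        # A = Y^* X H
    U, M = h_polar_simple(A, H)   # H-polar decomposition A = U M
    return U                      # the optimising H-isometry
```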
Therefore, in the indefinite case a situation can occur in which no solution for
(5.4) can be found. Moreover, there is also another problem which will be pointed
out as follows:
In a pseudo-Euclidean or pseudo-unitary space with index of inertia q the scalar
product of two vectors x = (x^1, ..., x^n)^T and y = (y^1, ..., y^n)^T whose coordinates
are referred to the canonical basis {e_1, ..., e_n} is defined by

(x, y)_q = Σ_{α=1}^{n−q} x^α ȳ^α − Σ_{α=n−q+1}^{n} x^α ȳ^α (ȳ^α = y^α if F = R),
whereby the components of the matrix H, which is called the metric tensor in tensor
algebra, are given by h_{αβ} = (h_β, h_α)_q for 1 ≤ α, β ≤ n. From this viewpoint the
H-orthogonal or H-unitary procrustes problem is a generalisation of the orthogonal
procrustes problem for arbitrary non-orthogonal coordinate systems and arbitrary
index of the scalar product of a real or complex vector space. Whereas the solution
of the problem according to Theorem 5.5 is satisfactory in the case of a positive (or
negative) definite matrix H, in the case of an indefinite matrix H it must be taken into
consideration that the minimum of the function f can have a considerable negative
contribution, so that the desired goal - matching two tuples of points in the sense
of an optimum congruence - cannot be achieved this way in most cases. However, if
G and H are regular symmetric or hermitian matrices from F^{n×n} and if the geometry
within the tuples (x_1, ..., x_N) and (y_1, ..., y_N) is measured with the scalar product
[., .]_G = (G., .), but the geometry between the tuples is measured with the scalar
product [., .]_H = (H., .), then the problem can be expressed, instead of (5.4), as
f(U) = Σ_{k=1}^{N} [Ux_k − y_k, Ux_k − y_k]_H → min with
g(U) = U^*GU − G = 0 and h(U) = U^*HU − H = 0   (5.9a)

or in matrix notation

f(V) = tr[(XV − Y)^*(XV − Y)H] → min with
g(V) = VGV^* − G = 0 and h(V) = VHV^* − H = 0,   (5.9b)

∂(f + g + h)/∂V = 0

leads in the same way as above to the equation

The necessary condition for solving (5.9), with the additional prerequisite (5.11), thus
finally takes on the form

P^{-1}AP = J = diag(±µ).
Thus the selfadjoint matrices P^*HP and P^*GP commute and can therefore be diagonalised
simultaneously, so that an orthogonal or unitary matrix Q consisting of
eigenvectors of P^*HP (or P^*GP) can now be chosen for which

where Λ_H, Λ_G are diagonal matrices containing the real eigenvalues. This means that

Λ_H^{-1}Λ_G = µΣ with Σ = diag(±1).

Thus, setting Λ_H = |Λ_H|Σ_H and Λ_G = |Λ_G|Σ_G, whereby Σ_H and Σ_G contain the
signs of the eigenvalues, we obtain Σ_HΣ_G = Σ as well as |Λ_H|^{-1}|Λ_G| = µI, and for
S = PQ|Λ_H|^{-1/2} we finally have

S^*HS = |Λ_H|^{-1/2} Q^*P^*HPQ |Λ_H|^{-1/2} = |Λ_H|^{-1/2} Λ_H |Λ_H|^{-1/2} = Σ_H,
S^*GS = |Λ_H|^{-1/2} Q^*P^*GPQ |Λ_H|^{-1/2} = |Λ_H|^{-1/2} Λ_G |Λ_H|^{-1/2} = µΣ_G.

^8 The matrices µΣ and J have the same diagonal elements, but their arrangement may be different.

The asserted form can always be obtained by suitable permutation. (The magnitudes
|Λ| and the square roots |Λ|^{1/2} must always be taken element by element.)
[⇐]: It is true that H^{-1}G = µS(I_{p+q} ⊕ −I_{r+s})S^{-1} and G^{-1}H = µ^{-1}S(I_{p+q} ⊕
−I_{r+s})S^{-1}, from which the assertion follows directly.
Evidently a (G,H)-polar decomposition A = UM with U^H = U^G = U^{-1}, M^H =
M^G = M can exist only if H^{-1}A^*H = H^{-1}M^*HH^{-1}U^*H = G^{-1}M^*GG^{-1}U^*G =
G^{-1}A^*G or A^H = A^G, which has already been shown for the case (5.12). Matrices
with this property can be characterised as follows.
Lemma 5.8. Let F = R or F = C and let G, H be regular symmetric or hermitian
matrices from F^{n×n} such that H^{-1}G = µ²G^{-1}H for a µ ∈ R\{0}. Furthermore let
A ∈ F^{n×n} with GAG^{-1} = HAH^{-1} (or G^{-1}A^*G = H^{-1}A^*H). Then there exists a
regular matrix S ∈ F^{n×n} such that

S^{-1}AS = A_1 ⊕ ... ⊕ A_k, S^*HS = H_1 ⊕ ... ⊕ H_k, S^*GS = G_1 ⊕ ... ⊕ G_k

with

for µ ∈ R\{0} or

A_j = [A_{j,1}, 0; 0, A_{j,2}] ∈ C^{2p×2p}, H_j = [0, I_p; I_p, 0], G_j = [0, µ̄I_p; µI_p, 0]
for µ ∈ C\R. The corresponding generalisation of Lemma 5.9 states that the matrix A
admits a (G,H)-polar decomposition if and only if each block A_j admits an H_j-polar
decomposition. In conclusion, the statements of the lemmas will now be explained
with the help of two examples.
Example 5.10. 1. Let H = I_p ⊕ I_r and G = I_p ⊕ −I_r. Then a matrix
C ∈ F^{(p+r)×(p+r)} with C^H = C^G takes on the form C = C_1 ⊕ C_2 where C_1 ∈ F^{p×p}
and C_2 ∈ F^{r×r}. If now
The matrix

A_1 = [0, β; α, 0] ⊕ [0, α; β, 0], B_1 = A_1^H A_1 = A_1^G A_1 = diag(−α², −β², −β², −α²),

but U_1^*GU_1 = −G and M_1^*G = −GM_1. A G-polar decomposition cannot exist,
because the pair (B_1, G) which is present in the canonical form does not fulfil the
condition 1 of Theorem 3.4 (or Theorem 4.4). The matrix

A_2 = [0, β; α, 0] ⊕ [0, β; α, 0], B_2 = A_2^H A_2 = A_2^G A_2 = diag(−α², −β², −α², −β²)
admits the G-polar decomposition

A_2 = U_2M_2 with U_2 = [0, −i; −i, 0] ⊕ [0, −i; −i, 0], M_2 = [iα, 0; 0, iβ] ⊕ [iα, 0; 0, iβ],

but U_2^*HU_2 = −H and M_2^*H = −HM_2. An H-polar decomposition cannot exist,
because the pair (B_2, H) which is present in the canonical form does not fulfil the
condition 1 of Theorem 3.4 (or Theorem 4.4). But if we now make α = β, i.e.

A = [0, α; α, 0] ⊕ [0, α; α, 0], B = A^H A = A^G A = diag(−α², −α², −α², −α²),

then

A = UM with U = [0, −i; −i, 0] ⊕ [0, −i; −i, 0], M = [iα, 0; 0, iα] ⊕ [iα, 0; 0, iα]
REFERENCES
[BG] I. Borg and P. Groenen, Modern Multidimensional Scaling: Theory and Applications,
Springer, New York, 1997.
[BMRRR1] Y. Bolshakov, C.V.M. van der Mee, A.C.M. Ran, B. Reichstein, and L. Rodman, Polar
decompositions in finite dimensional indefinite scalar product spaces: General
Theory, Linear Algebra Appl. 261, 91-141, 1997.
[BMRRR2] Y. Bolshakov, C.V.M. van der Mee, A.C.M. Ran, B. Reichstein, and L. Rodman, Extension
of isometries in finite-dimensional indefinite scalar product spaces and polar
decompositions, SIAM J. Matrix Anal. Appl. 18, 752-774, 1997.
[BMRRR3] Y. Bolshakov, C.V.M. van der Mee, A.C.M. Ran, B. Reichstein, and L. Rodman, Polar
decompositions in finite-dimensional indefinite scalar product spaces: special cases
and applications, in: Recent Developments in Operator Theory and its Applications,
OT 87 (I. Gohberg, P. Lancaster, P.N. Shivakumar, Eds.), Birkhäuser, Basel,
1996, 61-94. Errata, Integral Equations and Operator Theory 17, 497-501, 1997.
[BR] Y. Bolshakov and B. Reichstein, Unitary equivalence in an indefinite scalar product:
an analogue of singular-value decomposition, Linear Algebra Appl. 222, 155-226,
1995.
[D] M. Davidson, Multidimensional Scaling, Wiley, New York, 1983.
[DP] P.S. Dwyer and M.S. McPhail, Symbolic Matrix Derivatives, Ann. Math. Statist. 19,
517-534, 1948.
[E] P.J. Eberlein, Solution to the Complex Eigenproblem by a Norm Reducing Jacobi Type
Method, Numer. Math. 14, 232-245, 1970.
[F] J.F.G. Francis, The QR transformation. A unitary analogue to the LR transformation,
Computer J. 4, 265-271, 332-345, 1961/62.
[G] F.R. Gantmacher, The Theory of Matrices (Vol. I), Chelsea, New York, 1959.
[GLR] I. Gohberg, P. Lancaster, and L. Rodman, Matrices and Indefinite Scalar Products,
Birkhäuser, Basel, 1983.
[GR] W. Greub, Linear Algebra (3rd Ed.), Springer, Berlin, 1967.
[H] H. Harman, Modern Factor Analysis (3rd Ed.), Univ. of Chicago Press, Chicago, 1976.
[K] U. Kintzel, CNF, An algorithm for numerical computation of the canonical form of a
pair (A, H) consisting of an H-hermitian matrix A and a regular hermitian matrix
H, submitted to ACM Transactions on Mathematical Software (TOMS), 2003.
[KR] B. Kågström and A. Ruhe, An Algorithm for Numerical Computation of the Jordan
Normal Form of a Complex Matrix, ACM Transactions on Mathematical Software
(TOMS) Vol. 6, No. 3, 398-419, 1980.
[LMMR] P. Lins, P. Meade, C. Mehl, and L. Rodman, Normal Matrices and Polar Decompositions
in Indefinite Inner Products, Linear and Multilinear Algebra 49, 45-89,
2001.
[MMX] C. Mehl, V. Mehrmann, and H. Xu, Canonical forms for doubly structured matrices
and pencils, Electron. J. Linear Algebra 7, 112-151, 2000.
[MRR] C.V.M. van der Mee, A.C.M. Ran, and L. Rodman, Stability of self-adjoint square
roots and polar decompositions in indefinite scalar product spaces, Linear Algebra
Appl. 302-303, 77-104, 1999.
[PW] G. Peters and J.H. Wilkinson, Eigenvectors of Real and Complex Matrices by QR and
LR triangularizations, Numer. Math. 16, 181-204, 1970.
[S] P. Schönemann, A generalized solution of the Orthogonal Procrustes Problem, Psy-
chometrika, Vol. 31, No. 1, 1-10, 1966.
[ST] J. Stoer, Numerische Mathematik 1 (5. Aufl.), Springer, Berlin, 1989.
[T] W.S. Torgerson, Theory and methods of scaling, Wiley, New York, 1958.
[WEY] H. Weyl, Raum, Zeit, Materie: Vorlesungen über allg. Relativitätstheorie (7. Aufl.),
Springer, Berlin, 1988.
[WED] J.H.M. Wedderburn, Lectures on Matrices, AMS, Vol. 17, New York, 1934.
[YH] G. Young and A.S. Householder, Discussion of a set of points in terms of their mutual
distances, Psychometrika, Vol. 3, No. 1, 19-22, 1938.