February 11, 2010 10:0 WSPC/148-RMP J070-S0129055X10003874

Reviews in Mathematical Physics


Vol. 22, No. 1 (2010) 1–53

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003874

SPECTRAL THEORY OF NO-PAIR HAMILTONIANS

OLIVER MATTE
Mathematisches Institut, Ludwig-Maximilians-Universität,
Theresienstraße 39, D-80333 München, Germany
matte@math.lmu.de

EDGARDO STOCKMEYER
Institut für Mathematik, Johannes Gutenberg-Universität,
Staudingerweg 9, D-55099 Mainz, Germany
stockm@mathematik.uni-mainz.de

Received 27 October 2008


Revised 18 August 2009

We prove an HVZ theorem for a general class of no-pair Hamiltonians describing an atom
or positively charged ion with several electrons in the presence of a classical external
magnetic field. Moreover, we show that there exist infinitely many eigenvalues below the
essential spectrum and that the corresponding eigenfunctions are exponentially localized.
The novelty is that the electrostatic and magnetic vector potentials as well as a non-
local exchange potential are included in the projection determining the model. As a main
technical tool, we derive various commutator estimates involving spectral projections of
Dirac operators with external fields. Our results apply to all coupling constants $e^2 Z < 1$.

Keywords: Dirac operator; Brown and Ravenhall; no-pair operator; pseudo-relativistic;
Furry picture; intermediate pictures; HVZ theorem; exponential localization.

Mathematics Subject Classification 2000: 81Q10, 47B25

1. Introduction
The relativistic dynamics of a single electron moving in the potential of a static
nucleus, $V_C \le 0$, in the presence of an external classical magnetic field $B = \mathrm{curl}\,A$
is generated by the Dirac operator$^a$
\[ D_{A,V_C} := \alpha\cdot(-i\nabla + A) + \beta + V_C. \tag{1.1} \]

On leave from Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstraße 39,
D-80333 München, Germany.
$^a$ Energies are measured in units of $mc^2$, $m$ denoting the rest mass of an electron and $c$ the speed
of light. Length is measured in units of $\hbar/(mc)$, which is the Compton wave length divided by $2\pi$.
$\hbar$ is Planck's constant divided by $2\pi$. In these units, the square of the elementary charge equals
the fine structure constant, $e^2 \approx 1/137.037$.

Here an electron is a state lying in the positive spectral subspace of $D_{A,V_C}$. A
ground state of the one-electron atom modeled by $D_{A,V_C}$ can be characterized as
an energy minimizing bound state of the restriction of $D_{A,V_C}$ to its positive spectral
subspace, $\Lambda^+_{A,V_C} L^2(\mathbb{R}^3,\mathbb{C}^4)$, where
\[ \Lambda^+_{A,V_C} = \frac{1}{2} + \frac{1}{2}\,\mathrm{sgn}(D_{A,V_C}). \tag{1.2} \]
This is confirmed by Dirac's interpretation of the negative spectral subspace as
a completely filled sea of virtual electrons which, on account of Pauli's exclusion
principle, forces an additional electron to attain a state of positive energy. On the
other hand, it is well known that there is no canonical a priori given atomic or
molecular Hamiltonian generating the relativistic time evolution of $N > 1$ interacting
electrons. Guided by non-relativistic quantum mechanics one might naively
propose to start with the formal expression
\[ \sum_{j=1}^{N} D^{(j)}_{A,V_C} + \sum_{1\le j<k\le N} W_{jk}, \tag{1.3} \]

where the superscript $(j)$ means that the operator below acts on the $j$th electron
and $W_{jk} \ge 0$ is the interaction potential between the $j$th and $k$th electron. It then
turns out, however, that (1.3) suffers from the phenomenon of continuum dissolution
which is also known as the Brown–Ravenhall disease [9]. That is, the eigenvalue
problem associated to (1.3) has no normalizable solutions; see, e.g., [43, 48]. A
frequently used ansatz to find a reasonable and semi-bounded Hamiltonian for an
$N$-electron atomic or molecular system again incorporates the concept of a Dirac
sea. Namely, one projects (1.3) onto the $N$-fold antisymmetric tensor product of a
suitable one-electron subspace, i.e. one considers operators of the form
\[ H_N := \Lambda^{+,N}_{A,V}\Big(\sum_{j=1}^{N} D^{(j)}_{A,V_C} + \sum_{1\le j<k\le N} W_{jk}\Big)\Lambda^{+,N}_{A,V}, \tag{1.4} \]
where
\[ \Lambda^{+,N}_{A,V} := \prod_{j=1}^{N} \Lambda^{+(j)}_{A,V}. \tag{1.5} \]

Here $\Lambda^+_{A,V}$ is defined as in (1.1) and (1.2) but with $V_C$ replaced by a new potential
$V$. A Hamiltonian of this kind was first introduced by Brown and Ravenhall
in [9]. We emphasize that $H_N$ can formally be derived from quantum electrodynamics
(QED) by a procedure that neglects the creation and annihilation
of electron-positron pairs [47], the latter being defined with respect to $\Lambda^+_{A,V}$. For
this reason operators of the form (1.4) are often called no-pair Hamiltonians. Models
of this type are widely used as a starting point for numerical computations in
quantum chemistry. We refer the reader to the recent textbook [43] for a detailed
exposition of the application of no-pair models in quantum chemistry, for examples
of molecular systems which can be studied effectively by these methods, and an
exhaustive reference list. Roughly speaking, no-pair models are supposed to give
a good description of heavy atoms and molecules where relativistic effects play an
important role in the understanding of their chemical properties but where QED
effects can be neglected since their contributions live on a negligible energy scale;
compare [43, Chap. 8]. In particular, the binding energies of a molecular system are
low enough so that processes involving pair creation and annihilation do not need
to be taken into account in the investigation of its chemical properties. Another
broad field of application for no-pair models is the theoretical and numerical study
of highly ionized heavy atoms; see, e.g. [11,29,30,45] for a review. In fact, since very
accurate spectroscopic data is available for highly charged heavy ions they provide
important tests of relativistic atomic structure theory and QED. We quote from
the introduction to [11]: "Although the proper point of departure for relativistic
atomic structure calculations is quantum electrodynamics, very few atomic structure
calculations have been carried out entirely within the QED framework. Indeed,
almost all relativistic calculations of the structure of many-electron atoms are based
on some variant of the Hamiltonian introduced a half century ago by Brown and
Ravenhall to understand the helium fine structure." As already indicated above,
various QED effects like retardation of the electron-electron interaction, electron
self-energy, and vacuum polarization are not accounted for by the no-pair Hamiltonian.
However, by comparing the splitting between eigenvalues of the no-pair
operator with experimental values and taking corrections due to the finite mass of
the nucleus into account one can infer the size of the omitted QED corrections.
The good agreement of the discrepancy found in this way with theoretical computations
of QED effects provides a test of QED in the strong fields of highly
charged nuclei; see, e.g., [29, 3]. In particular, the no-pair energy levels may serve
as a first approximation for more accurate and complicated QED calculations; see,
for instance, [6]. In practice, the eigenvalues of the no-pair operator and the finite
nuclear mass corrections can be determined by means of the (formal) relativistic
many-body perturbation theory (MBPT) as described in [11,29,30,45]. There exist
variants of MBPT where the negative-energy states discarded in the no-pair approximation
are re-introduced in perturbative expansions, which becomes important in
certain physical situations [12, 14]. We remark that in the articles cited above the
electron–electron interaction is often given by the Coulomb–Breit potential and $A$
is equal to zero.
In the discussion of no-pair models in quantum chemistry and in atomic physics
(see, e.g., [43, Chap. 8] and [11]) it is assumed that the spectrum of the Hamiltonian
in (1.4) shows the usual qualitative features well known, for instance, for
multi-particle Schrödinger operators. That is, the essential spectrum is supposed
to cover some positive half-axis and there should exist infinitely many eigenvalues
below the ionization threshold (provided $N$ is not too large). In view of the variety
and number of applications, it therefore seems worthwhile to complement the
treatment of no-pair models in the quantum chemical and physical literature by
mathematically rigorous results and, in particular, to confirm the assumptions on
the spectral properties of $H_N$ by providing mathematical proofs. We would like to
give some further comments on the connection between no-pair Hamiltonians and
certain computational techniques in quantum chemistry. Namely, there exists a
block-diagonalization scheme which is used to represent the formal Coulomb–Dirac
operator in (1.3) as a two-fold direct sum of operators acting on two-spinors [23,26].
For the one-particle Coulomb–Dirac operator $D_{A,V_C}$, both blocks are unitarily
equivalent to the restriction of $D_{A,V_C}$ to its positive and negative spectral subspaces,
respectively. In the general case, the upper left block turns out to be unitarily
equivalent to the no-pair Hamiltonian. One may then expand the upper left
block with respect to the Coulomb coupling constant. The partial sums in this
expansion then give reasonable approximations for Hamiltonians describing a relativistic
molecular system and are implemented in numerical algorithms in quantum
chemistry [43, Chaps. 11 and 12]. It has been rigorously shown in [25, 46] that,
for sufficiently small Coulomb coupling constants and sufficiently large orders in
the expansion, each partial sum in the expansion has a distinguished self-adjoint
realization and that the sequence of partial sums converges in the norm resolvent
sense. In particular, the spectra of the partial sums, which are directly studied
numerically, converge to the spectrum of the no-pair Hamiltonian $H_N$.
Obviously, the question arises how to choose the new potential $V$ determining
the projection (1.5) or, in other words, how to fix the Dirac sea for one electron
in the presence of the others in a physically efficient way. Various possibilities are
discussed from a physical and numerical point of view in [29,45,47,48]; see also [12]
where the potential dependence of MBPT results is discussed and a possibility to
eliminate this dependence is proposed. The choice $V = 0$ is referred to as the free
picture, or Brown–Ravenhall model [9]. It has by now been studied in many mathematical
works [2–5, 10, 17, 21, 20, 24, 27, 28, 35, 37–40, 50–52]. This is due to the
fact that the free projection, $\Lambda^+_{0,0}$, can be calculated explicitly both in momentum
and position space [2, 40]. The free picture is considered as one extreme case in a
family of intermediate pictures [48]. The opposite extreme case, sometimes called
the Furry picture (see, e.g., [45, III.F] and [48]), is given by substituting the (negative)
Coulomb potential $V_C$ for $V$. Other members of that family are obtained
by choosing $V$ to be equal to $V_C$ plus some additional positive and in general non-local
operator. The Furry and intermediate pictures give better numerical results in
comparison to the free picture [47, 48] and are the preferred choices in MBPT. The
additional non-local term may, for instance, incorporate the interaction with the
remaining electrons. An example would be the Hartree–Fock potential generated
by a set of appropriately chosen orbitals which is in fact a choice often employed
in relativistic MBPT [11, 29, 30, 45].
In this paper, we do not aim to contribute to the subtle question of optimizing
the choice of $V$. Rather we try to keep the assumptions on $V$ as general as our
techniques permit. Namely, we consider a class of potentials which can be written
as $V = V_C + V_H + V_E$, where $V_C$ may have several Coulomb-type singularities,
$V_H$ is a bounded potential function vanishing at infinity, and $V_E$ is a compact non-local
operator that behaves nicely under conjugation with exponential weights. As
already mentioned above, our goal then is to establish some basic qualitative spectral
properties of $H_N$. First, we show that $H_N$ is well-defined on a natural dense
subspace (which is not obvious) and, thus, has a self-adjoint Friedrichs extension.
We further locate its essential spectrum and, assuming that the number of electrons
does not exceed the sum of all nuclear charges, we prove that there exist
infinitely many eigenvalues below the ionization threshold. Moreover, we show that
the corresponding eigenfunctions are exponentially localized. Our results apply to
all nuclear charges $Z < e^{-2} \approx 137.037$. The easy part of the HVZ theorem, i.e.
the upper bound on the ionization threshold, holds for a certain class of possibly
unbounded magnetic fields. The hard part, i.e. the lower bound on the ionization
threshold, as well as our results on existence of eigenvectors and their exponential
localization are derived for bounded magnetic fields.
Although the general strategy of our proofs is fairly standard the discussion of
general no-pair models poses a variety of new mathematical problems. As an essential
and novel technical input, necessary to obtain any of the results mentioned
above, we first derive various commutator estimates involving the spectral projection
$\Lambda^+_{A,V}$, exponential weights, and cut-off functions. They describe the non-local
properties of $\Lambda^+_{A,V}$ in an $L^2$-sense and might be of independent interest. Similar
estimates are obtained in [36] in the case $V = 0$. The case where $V$ does not vanish
requires, however, more care due to the complex extension theory for singular
Dirac operators. In order to derive the exponential localization estimate, we employ
a strategy from [1] which has been complemented by a number of useful observations
in [19]. The argument presented in [1] is advantageous for us since no a priori
knowledge on the spectrum is required to prove the localization. (In particular, no
eigenvalue equations are exploited in the argument.) Inspired by a remark in [19] we
rather argue in the opposite direction and infer the lower bound on the ionization
threshold from the localization estimate. This possibility is very convenient in the
analysis of the no-pair Hamiltonian $H_N$ since the corresponding argument is very
simple and requires only form bounds on the interaction potentials $W_{ij}$. The comparably
bad behavior of singular Dirac operators and the resulting weak control on
the interaction potentials also complicates the derivation of the upper bound on the
ionization threshold, which is obtained by a modified version of the standard Weyl
criterion [13] where a suitable strictly monotonic function of the operator is considered.
In order to prove the existence of infinitely many bound states we employ
minimax principles proceeding along the lines of [40] where the Brown–Ravenhall
model (with $A = 0$) is considered. The main problem here is to replace those arguments
in [40] where explicit position or momentum space representations of $\Lambda^+_{0,0}$
are used by new and somewhat more abstract ones.
We remark that our results on spectral projectors also allow one to analyze
the Hamiltonian $H_N$ in the free picture proceeding along the discussion of the
Furry and intermediate pictures presented here. There is, however, a subtlety to
consider: Namely, for vanishing magnetic fields, it is known that the one-particle
Brown–Ravenhall operator is stable if and only if
$Z \le Z_c := e^{-2}\,\frac{2}{(2/\pi)+(\pi/2)} \approx 124.2$ [17]. In the presence of an exterior $B$-field one can show that $H_1$ is still
bounded below in the free picture, for all $Z \le Z_c$, provided the vector potential
is locally bounded and Lipschitz continuous in a neighborhood of the nuclei [36].
(For smaller values of the coupling constant, one can actually prove the stability of
matter of the second kind in the free picture treating a gauge fixed vector potential
as a variable in the minimization and adding the field energy to the multi-particle
Hamiltonian; see [35, 34], where the quantized electro-magnetic field is considered.
In this situation it is essential that the vector potential is included in the projection
for otherwise the model is always unstable if $N > 1$ [21].)
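
As a quick numerical sanity check of the critical charge $Z_c$ quoted above, the following minimal sketch evaluates $e^{-2}\cdot 2/((2/\pi)+(\pi/2))$. It assumes the value $e^{-2} = 137.037$ from the footnote on units; the function names are our own and purely illustrative.

```python
import math

# Assumption: 1/e^2 = 137.037, the inverse fine structure constant used in the text.
ALPHA_INV = 137.037


def critical_coupling() -> float:
    """Critical coupling gamma_c = 2 / (2/pi + pi/2) of the Brown-Ravenhall operator."""
    return 2.0 / (2.0 / math.pi + math.pi / 2.0)


def critical_charge(alpha_inv: float = ALPHA_INV) -> float:
    """Critical nuclear charge Z_c = gamma_c / e^2."""
    return alpha_inv * critical_coupling()


if __name__ == "__main__":
    print(f"gamma_c = {critical_coupling():.4f}")
    print(f"Z_c     = {critical_charge():.1f}")
```

The critical coupling is strictly smaller than 1, so the free picture covers a smaller range of nuclear charges than the range $e^2 Z < 1$ treated in this paper.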
Finally, we comment on some closely related recent work. In the free picture
and for vanishing exterior magnetic field, an HVZ theorem and the existence of
infinitely many eigenvalues below the essential spectrum have been proved in [40],
for nuclear charges $Z \le Z_c$. The case $N = 2$ is also treated in [27]. A more general
HVZ theorem that applies to different particle species and a wider class of interaction
potentials and exterior fields in the free picture has been established in [37].
Moreover, the reduction to irreducible representations of the groups of rotation
and reflection and permutations of identical particles is considered in [37]. The $L^2$-exponential
localization of eigenvectors in the Brown–Ravenhall model is studied
in [38,39] improving and generalizing earlier results from [2]. In all these works, the
authors employ explicit position space representations of $\Lambda^+_{0,0}$. An HVZ theorem
in the free picture with constant magnetic field is established in [28] again using
explicit representations for the projection based on Mehler's formula. By employing
somewhat more abstract arguments we are able to study a wider class of projections
in this paper. Similar results on the spectral projectors are used in [36] to study
the regularity of the eigenvectors of $H_1$ in the free picture and to derive pointwise
exponential decay estimates for their partial derivatives of any order (for $Z \le Z_c$
and assuming that all partial derivatives of $A$ increase more slowly than any exponential
function). The rate of exponential decay found in [36] is actually the same
as the one known for the Chandrasekhar operator and, hence, seems to be the optimal
one. For many-electron atoms it is, however, more difficult to prove the exponential
localization and as in [2,38] we shall only get suboptimal bounds on the decay
rate in the present article.
For recent developments and numerous references to the literature on HVZ
theorems we refer to [33].

The article is organized as follows. In Sec. 2, we introduce our mathematical model
precisely and state our main theorems. Section 3 is devoted to the study of some
non-local properties of $\Lambda^+_{A,V}$ expressed in terms of various commutator estimates
which form the basis of the spectral analysis of $H_N$. Moreover, it contains results
that allow one to compare the projections $\Lambda^+_{A,V}$ and $\Lambda^+_{A,0}$. In Sec. 4, we derive the
exponential localization estimate for $H_N$ and in Sec. 5 infer a lower bound on the
threshold energy. Section 6 deals with Weyl sequences and, finally, in Sec. 7 we
show that $H_N$ possesses infinitely many eigenvalues below the threshold energy.

Some frequently used notation. Open balls in $\mathbb{R}^3$ with radius $R > 0$ and center
$z \in \mathbb{R}^3$ are denoted by $B_R(z)$. Spectral projections of a self-adjoint operator, $T$,
on some Hilbert space are denoted by $E_\lambda(T)$ and $E_I(T)$, if $\lambda \in \mathbb{R}$ and $I$ is an
interval. $\mathcal{D}(T)$ denotes the domain of the operator $T$ and $\mathcal{Q}(T)$ its form domain.
The characteristic function of a subset $M \subset \mathbb{R}^n$ is denoted by $1_M$. $C, C', C'', \ldots$
denote constants whose values might change from one estimate to another.

2. The Model and Main Results


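In our choice of units the free Dirac operator reads
\[ D_0 := -i\alpha\cdot\nabla + \beta := -i\sum_{j=1}^{3}\alpha_j\,\partial_{x_j} + \beta. \]
Here $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ and $\beta =: \alpha_0$ are $4\times 4$ hermitian matrices which satisfy the
Clifford algebra relations
\[ \{\alpha_i, \alpha_j\} = 2\delta_{ij}\,\mathbb{1}, \qquad 0 \le i, j \le 3. \tag{2.1} \]
In Dirac's representation, which we fix throughout the paper, they are given as
\[ \alpha_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix}, \quad j = 1, 2, 3, \qquad \beta = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}, \]
where $\sigma_1, \sigma_2, \sigma_3$ are the standard Pauli matrices. $D_0$ is a self-adjoint operator in
the Hilbert space
\[ \mathscr{H} := L^2(\mathbb{R}^3, \mathbb{C}^4) \]
with domain $H^1(\mathbb{R}^3, \mathbb{C}^4)$. Its spectrum is purely absolutely continuous and given
by the union of two half-lines,
\[ \sigma(D_0) = \sigma_{\mathrm{ac}}(D_0) = (-\infty, -1] \cup [1, \infty). \tag{2.2} \]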
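
The algebra behind (2.1) and (2.2) can be checked numerically. The following sketch, in plain Python without external libraries, builds the Dirac matrices in Dirac's representation, verifies the Clifford relations, and confirms that the Fourier symbol $\alpha\cdot\xi + \beta$ of $D_0$ squares to $(1+|\xi|^2)\,\mathbb{1}$, so that its eigenvalues are $\pm\sqrt{1+|\xi|^2}$, which is consistent with the two half-lines in (2.2). The helper names and the convention $-i\nabla \mapsto \xi$ are our own assumptions for illustration.

```python
from itertools import product

# Pauli matrices and 2x2 helpers, stored as nested lists of (complex) numbers.
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I4 = block(I2, Z2, Z2, I2)
BETA = block(I2, Z2, Z2, scale(-1, I2))
# ALPHAS[0] is beta =: alpha_0; ALPHAS[1..3] are alpha_1, alpha_2, alpha_3.
ALPHAS = [BETA] + [block(Z2, s, s, Z2) for s in (S1, S2, S3)]


def clifford_relations_hold():
    """Check {alpha_i, alpha_j} = 2 delta_ij 1 for 0 <= i, j <= 3."""
    for i, j in product(range(4), repeat=2):
        anti = add(matmul(ALPHAS[i], ALPHAS[j]), matmul(ALPHAS[j], ALPHAS[i]))
        if not close(anti, scale(2 if i == j else 0, I4)):
            return False
    return True


def free_symbol(xi):
    """Fourier symbol alpha.xi + beta of D_0 at momentum xi = (xi1, xi2, xi3)."""
    m = BETA
    for a, x in zip(ALPHAS[1:], xi):
        m = add(m, scale(x, a))
    return m


def symbol_squares_to_energy(xi):
    """Check (alpha.xi + beta)^2 = (1 + |xi|^2) * identity."""
    d = free_symbol(xi)
    return close(matmul(d, d), scale(1 + sum(x * x for x in xi), I4))
```

For instance, `clifford_relations_hold()` and `symbol_squares_to_energy((0.3, -1.2, 2.0))` should both evaluate to `True`.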

Next, we formulate our precise hypotheses on the exterior electrostatic potential
$V_C$ and on the potential $V$ determining the Dirac sea. We think that, at least
with regard to the commutator estimates in Sec. 3, it is interesting to keep the
conditions on $V_C$ and $V$ fairly general.

Hypothesis 2.1. There is a finite set $Y \subset \mathbb{R}^3$, $\#Y < \infty$, such that
$V_C \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3\setminus Y, \mathscr{L}(\mathbb{C}^4))$ is almost everywhere hermitian and
\[ \|V_C(x)\| \to 0, \qquad |x| \to \infty. \tag{2.3} \]
Moreover, there exist $\gamma \in (0, 1)$ and $\varepsilon > 0$ such that the balls $B_\varepsilon(y)$, $y \in Y$, are
mutually disjoint and, for $0 < |x| < \varepsilon$ and $y \in Y$,
\[ \|V_C(y + x)\| \le \frac{\gamma}{|x|}. \tag{2.4} \]

Example 2.1. The main example for a potential satisfying Hypothesis 2.1 is certainly
the Coulomb potential generated by a finite number of static nuclei,
\[ V_C(x) = -\sum_{y\in Y}\frac{e^2 Z_y}{|x - y|}\,\mathbb{1}, \qquad x \in \mathbb{R}^3\setminus Y. \]
In this case the restriction on the strength of the singularities of $V_C$ imposed in
Hypothesis 2.1 allows for all nuclear charges, $0 \le Z_y < e^{-2} \approx 137.037$, $y \in Y$.

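Hypothesis 2.2. $V = V_C + V_H + V_E$, where $V_C$ fulfills Hypothesis 2.1 and
$V_H \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathscr{L}(\mathbb{C}^4))$ is an almost everywhere hermitian matrix-valued function
dropping off to zero at infinity,
\[ \|V_H(x)\| \to 0, \qquad |x| \to \infty. \tag{2.5} \]
$V_E$ is compact and has the following property: There exist $m > 0$ and some
increasing function $c : [0, m) \to (0, \infty)$ such that, for every $F \in C^1(\mathbb{R}^3, \mathbb{R})$ with
$|\nabla F| \le a < m$,
\[ \forall\,\chi \in C^1(\mathbb{R}^3, [0, 1]) : \quad \|[e^{F} V_E\, e^{-F}, \chi]\| \le c(a)\,\|\nabla\chi\|_\infty, \tag{2.6} \]
\[ \|[V_E, e^{F}]\,e^{-F}\| \le c(a)\,\|\nabla F\|_\infty, \tag{2.7} \]
\[ \lim_{R\to\infty}\big\|1_{\mathbb{R}^3\setminus B_R(0)}\, e^{F} V_E\, e^{-F}\, 1_{\mathbb{R}^3\setminus B_R(0)}\big\| = 0. \tag{2.8} \]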

Example 2.2. (i) Possible choices for $V_H$ and $V_E$ satisfying the conditions of
Hypothesis 2.2 are the Hartree and non-local exchange potentials corresponding
to a set of exponentially localized orbitals $\phi_1, \ldots, \phi_M \in \mathscr{H}$, $|\phi_i(x)| \le C e^{-m|x|}$,
$1 \le i \le M$, for some $C \in (0, \infty)$. Their Hartree potential is given as
\[ V_H(x) := e^2 \sum_{i=1}^{M}\Big(|\phi_i|^2 * \frac{1}{|\cdot|}\Big)(x), \qquad x \in \mathbb{R}^3. \]
It incorporates the presence of $M$ electrons in a fixed state into the Dirac sea
by a smeared out background density. The exchange potential corresponding
to $\phi_1, \ldots, \phi_M$ is the integral operator with matrix-valued kernel
\[ V_E(x, y) := -e^2 \sum_{i=1}^{M}\frac{\phi_i(x)\,\phi_i^*(y)}{|x - y|}. \]
It is a correction to the Hartree potential accounting for the Pauli principle.
In the sense of quadratic forms it then holds $V_C \le V = V_C + V_H + V_E \le 0$,
which justifies the notion "intermediate picture". These choices of $V_H$ and $V_E$
are discussed, e.g., in [11, 29, 30, 45].
(ii) More generally, we may set $V_H := \rho * |\cdot|^{-1}$, for some $0 \le \rho \in L^1 \cap L^{5/3}(\mathbb{R}^3)$.
In this case we find some $C \in (0, \infty)$ such that $0 \le V_H \le C/|\cdot|$. Moreover,
standard theorems on integral operators show that every kernel with values in
the set of hermitian $(4\times 4)$-matrices satisfying
\[ \|V_E(x, y)\| \le C'\,\frac{e^{-m|x-y|}}{\langle x\rangle^{\kappa}\,|x - y|\,\langle y\rangle^{\kappa}}, \]
for some $m, \kappa, C' > 0$, yields a compact operator satisfying the conditions of
Hypothesis 2.2.

As a first consequence of Hypothesis 2.2 we find, for every locally bounded
vector potential $A : \mathbb{R}^3 \to \mathbb{R}^3$, a distinguished self-adjoint realization of the Dirac
operator
\[ D_{A,V} = \alpha\cdot(-i\nabla + A) + \beta + V, \]
whose essential spectrum is again contained in $(-\infty, -1] \cup [1, \infty)$; see Lemma 3.2
below, where we recall some important well-known facts on Dirac operators with
singular potentials. Therefore, it makes sense to define the spectral projections,
\[ \Lambda^+_{A,V} := E_{[e_0,\infty)}(D_{A,V}), \qquad \Lambda^-_{A,V} := \mathbb{1} - \Lambda^+_{A,V}, \tag{2.9} \]
where
\[ e_0 \in \varrho(D_{A,V}) \cap (-1, 1). \tag{2.10} \]
For later reference we introduce the parameter
\[ \delta_0 := \sqrt{1 - e_0^2}. \tag{2.11} \]

Many of our technical results on $D_{A,V}$, for instance, the various commutator estimates
of Sec. 3, actually hold true under the mere assumption that the components of $A$
are locally bounded. Of course, if not all eigenvalues of $D_{A,V}$ are larger than $-1$
and $e_0$ is chosen between $-1$ and the lowest eigenvalue, the physical relevance of
the $N$-particle Hamiltonian $H_N$ becomes rather questionable. We remark that such
situations are not excluded by our hypotheses. For instance, if $V_C$ is the Coulomb
potential and the intensity of a constant exterior magnetic field is increased, then
the lowest eigenvalue of $D_{A,V_C}$ eventually reaches the lower continuum [16]. Nevertheless,
our theorems hold for any choice of $e_0$ as in (2.10).
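In order to define the atomic no-pair Hamiltonian precisely we first set
\[ \mathscr{H}_N := \bigotimes_{i=1}^{N}\mathscr{H}, \qquad \mathscr{H}_N^+ := \Lambda^{+,N}_{A,V}\,\mathscr{H}_N, \quad N \in \mathbb{N}, \qquad \mathscr{H}^+ := \mathscr{H}_1^+, \]
where $\Lambda^{+,N}_{A,V}$ is given by (1.5) and (2.9). We let $W : \mathbb{R}^3 \times \mathbb{R}^3 \to [0, \infty]$ denote the
interaction potential between two electrons.

Hypothesis 2.3. There is some $\widetilde{\gamma} > 0$ such that, for all $x, y \in \mathbb{R}^3$, $x \ne y$,
\[ 0 \le W(x, y) = W(y, x) \le \widetilde{\gamma}\,|x - y|^{-1}. \tag{2.12} \]

When we consider $N$ electrons located at $x_1, \ldots, x_N \in \mathbb{R}^3$ we denote their
common position variable by $X = (x_1, \ldots, x_N)$. Furthermore, we often write $W_{jk}$
for the maximal multiplication operator in $\mathscr{H}_N$ induced by the function
$(\mathbb{R}^3)^N \ni X \mapsto W(x_j, x_k)$. For $N \in \mathbb{N}$, we introduce a symmetric, semi-bounded operator
acting in $\mathscr{H}_N^+$ by
\[ \mathcal{D}(\mathring{H}_N) := \Lambda^{+,N}_{A,V}\,\mathscr{D}_N, \qquad \mathscr{D}_N := \bigotimes_{i=1}^{N}\mathscr{D}, \qquad \mathscr{D} := C_0^\infty(\mathbb{R}^3, \mathbb{C}^4), \tag{2.13} \]
\[ \mathring{H}_N\,\Phi := \Lambda^{+,N}_{A,V}\Big(\sum_{i=1}^{N} D^{(i)}_{A,V_C} + \sum_{1\le i<j\le N} W_{ij}\Big)\Lambda^{+,N}_{A,V}\,\Phi, \qquad \Phi \in \mathcal{D}(\mathring{H}_N). \tag{2.14} \]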

Proposition 2.1. Assume that $V$ fulfills Hypothesis 2.2, $W$ fulfills Hypothesis 2.3,
and that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ satisfies $\|e^{-\tau_0|x|}A\|_\infty < \infty$, for some
$0 \le \tau_0 < \min\{\delta_0, m\}$. Then the operator $\mathring{H}_N$ given by (2.13) and (2.14) is well-defined,
symmetric, and semi-bounded from below.

Proof. The only claim that is not obvious is that $W_{ij}\,\Lambda^{+,N}_{A,V}\,\Phi$ is again square-integrable,
for every $\Phi \in \mathscr{D}_N$. This follows, however, from Corollary 3.3.

We denote the Friedrichs extension of $\mathring{H}_N$ by $H_N$. Note that we do not require
the elements in the domain of $H_N$ to be anti-symmetric since in our proofs it is convenient
to consider $H_N$ as an operator acting in the full tensor product. Of course,
in the end we shall be interested in the restriction of $H_N$ to the anti-symmetric
(fermionic) subspace of $\mathscr{H}_N^+$. We denote the anti-symmetrization operator on $\mathscr{H}_N$
by $\mathcal{A}_N$,
\[ (\mathcal{A}_N\Phi)(X) = \frac{1}{N!}\sum_{\pi\in\mathfrak{S}_N}\mathrm{sgn}(\pi)\,\Phi(x_{\pi(1)}, \ldots, x_{\pi(N)}), \qquad \Phi \in \mathscr{H}_N, \tag{2.15} \]
where $\mathfrak{S}_N$ is the group of permutations of $\{1, \ldots, N\}$, and define the no-pair Hamiltonian
by
\[ H_N^{\mathcal{A}} := H_N\restriction_{\mathcal{A}_N\mathscr{H}_N^+}. \]
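
To illustrate the anti-symmetrization operator of (2.15), the following sketch implements a finite, discrete analogue: functions of $N$ position variables are replaced by tables indexed by $N$-tuples of site labels. This discretization, the function names, and the sample function are our own assumptions made for illustration; the sketch only checks the two algebraic properties that make the operator an anti-symmetrizing projection.

```python
from itertools import permutations, product
from math import factorial


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def antisymmetrize(phi, n_particles, n_sites):
    """Discrete analogue of (2.15): phi maps N-tuples of site labels to numbers."""
    out = {}
    for X in product(range(n_sites), repeat=n_particles):
        total = 0.0
        for perm in permutations(range(n_particles)):
            permuted = tuple(X[perm[k]] for k in range(n_particles))
            total += sign(perm) * phi[permuted]
        out[X] = total / factorial(n_particles)
    return out


def sample_function(n_particles, n_sites):
    """Illustrative non-symmetric product function X -> prod_k x_k ** k."""
    values = {}
    for X in product(range(n_sites), repeat=n_particles):
        value = 1.0
        for k, x in enumerate(X):
            value *= x ** k
        values[X] = value
    return values
```

Applying `antisymmetrize` twice gives the same table as applying it once, and exchanging two arguments of the result flips its sign, which mirrors the fact that the operator in (2.15) is the projection onto the anti-symmetric subspace.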
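Our first main result is the following theorem, where
\[ E_N^{\mathcal{A}} := \inf\sigma(H_N^{\mathcal{A}}), \quad N \in \mathbb{N}, \qquad E_0^{\mathcal{A}} := 0. \tag{2.16} \]

Theorem 2.1 (Exponential Localization). Assume that $V$ and $W$ fulfill
Hypotheses 2.2 and 2.3, respectively. If $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ is bounded
and if $I \subset \mathbb{R}$ is an interval with $\sup I < E_{N-1}^{\mathcal{A}} + 1$, then there exists $b \in (0, \infty)$ such
that $\mathrm{Ran}(E_I(H_N^{\mathcal{A}})) \subset \mathcal{D}(e^{b|X|})$ and
\[ \big\|e^{b|X|}\,E_I(H_N^{\mathcal{A}})\big\| < \infty. \]

Proof. This theorem is proved in Sec. 4.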

Remark 2.1. In the case $N = 1$ the assertion of Theorem 2.1 still holds true under
the assumptions of Proposition 2.1. This follows from the proof of Theorem 2.1.
In fact, for $N = 1$, we do not have to control error terms involving the interaction
$W$, which is the only reason why $B$ is assumed to be bounded in Theorem 2.1. If also
$V = V_C$, then we obtain an exponential localization estimate for an eigenvector,
$\phi_E$, with eigenvalue $E \in (-1, 1)$ of the Dirac operator $D_{A,V_C}$. The estimate on the
decay rate which could be extracted from our proof is, however, suboptimal due to
error terms coming from the projections; see also [7] for decay estimates for Dirac
operators.

Next, we introduce a hypothesis which is used to prove the easy part of the
HVZ theorem below.

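Hypothesis 2.4. (i) For every $\lambda \ge 1$, there exist radii, $1 \le R_1 < R_2 < \ldots$,
$R_n \nearrow \infty$, and $\psi_1, \psi_2, \ldots \in \mathscr{D}$ such that
\[ \|\psi_n\| = 1, \qquad \mathrm{supp}(\psi_n) \subset \mathbb{R}^3\setminus B_{R_n}(0), \qquad \lim_{n\to\infty}\|(D_A - \lambda)\,\psi_n\| = 0. \tag{2.17} \]
(ii) $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ has the following property: There are $b_1 \in (0, \infty)$
and $0 \le \tau < \min\{\delta_0, m\}$ ($m$ and $\delta_0$ are the parameters appearing in
Hypothesis 2.2 and (2.11)) such that, for all $x, y \in \mathbb{R}^3$,
\[ |B(x) - B(y)| \le b_1\,e^{\tau|x-y|}. \tag{2.18} \]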

Example 2.3 ([22]). We recall a result from [22] which provides a large class
of examples where Hypothesis 2.4(i) is fulfilled: Suppose that $A \in C^\infty(\mathbb{R}^3, \mathbb{R}^3)$,
$B = \mathrm{curl}\,A$, and set, for $x \in \mathbb{R}^3$ and $\nu \in \mathbb{N}$,
\[ \epsilon_0(x) := |B(x)|, \qquad \epsilon_\nu(x) := \frac{\sum_{|\alpha|=\nu}|\partial^\alpha B(x)|}{1 + \sum_{|\alpha|<\nu}|\partial^\alpha B(x)|}. \]
Suppose further that there exist $\nu \in \mathbb{N}_0$, $z_1, z_2, \ldots \in \mathbb{R}^3$, and $\rho_1, \rho_2, \ldots > 0$ such
that $\rho_n \nearrow \infty$, the balls $B_{\rho_n}(z_n)$, $n \in \mathbb{N}$, are mutually disjoint and
\[ \sup\{\epsilon_\nu(x) \mid x \in B_{\rho_n}(z_n)\} \to 0, \qquad n \to \infty. \]
Then there is a Weyl sequence, $\psi_1, \psi_2, \ldots$, that satisfies the conditions of Hypothesis 2.4(i).
The fact that the additional assumption of Part (ii) of the next theorem yields
a lower bound on the ionization threshold is an observation made in [19] for
Schrödinger operators.
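Theorem 2.2 (HVZ). Assume that $V$ and $W$ fulfill Hypotheses 2.2 and 2.3,
respectively. Then the following assertions hold true:
(i) If Hypothesis 2.4 is fulfilled also, then $\sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) \supset [E_{N-1}^{\mathcal{A}} + 1, \infty)$.
(ii) Assume additionally that, for every interval $I \subset \mathbb{R}$ with $\sup I < E_{N-1}^{\mathcal{A}} + 1$, there
is some $g \in C(\mathbb{R}, (0, \infty))$ such that $g(r) \to \infty$, as $r \to \infty$, and
$g(|X|)\,E_I(H_N^{\mathcal{A}}) \in \mathscr{L}(\mathcal{A}_N\mathscr{H}_N)$. Then $\sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) \subset [E_{N-1}^{\mathcal{A}} + 1, \infty)$. In particular, this inclusion is
valid if $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ is bounded.

Proof. (i) follows directly from Lemma 6.3 and (ii) follows from Theorems 5.1
and 2.1.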

To show the existence of infinitely many eigenvalues below the bottom of the
essential spectrum of $H_N^{\mathcal{A}}$ we certainly need a condition on the relationship between
$V_C$, $W$, and the magnetic field. To formulate it we set, for $\delta, R > 0$,
\[ S_\delta(R) := \{x \in \mathbb{R}^3 : (1 - \delta)R \le |x| \le (1 + \delta)R\}, \tag{2.19} \]
\[ v(\delta, R) := \sup_{x\in S_\delta(R)}\,\sup_{|v|=1}\,\langle v \,|\, V_C(x)\,v\rangle_{\mathbb{C}^4}. \tag{2.20} \]

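Hypothesis 2.5. (i) $V$ fulfills Hypothesis 2.2.
(ii) $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ is bounded.
(iii) There exist radii $1 \le R_1 < R_2 < \ldots$, $R_n \nearrow \infty$, some constant $\delta \in (0, 1/N)$, and
a sequence of spinors, $\psi_1, \psi_2, \ldots \in \mathscr{D}$, with vanishing lower spinor components,
$\psi_n = (\psi_{n,1}, \psi_{n,2}, 0, 0)^\top$, $n \in \mathbb{N}$, such that
\[ \|\psi_n\| = 1, \qquad \mathrm{supp}(\psi_n) \subset \{R_n < |x| < (1 + \delta/2)R_n\}, \qquad 2R_n \le R_{n+1}, \tag{2.21} \]
for all $n \in \mathbb{N}$, and
\[ \|(D_A - 1)\,\psi_n\| = O(1/R_n), \qquad n \to \infty. \tag{2.22} \]
(iv) $W$ fulfills Hypothesis 2.3 and, for every $\delta \in (0, \frac{1}{N})$, we find some $\varepsilon \in (0, 1)$
such that
\[ \limsup_{n\to\infty}\, R_n\Big\{ v(\delta, R_n) + (N - 1)\sup_{|x-y|\ge(1-\varepsilon)R_n} W(x, y)\Big\} < 0. \]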
Example 2.4. $V = V_C + V_H + V_E$ and $W$ fulfill Hypothesis 2.5(i) and (iv), if
$V_C$ is given as in Example 2.1 with $\sum_{y\in Y} Z_y \ge N$, $V_H$ and $V_E$ are given as in
Example 2.2(i) or (ii), and $W$ is the Coulomb repulsion, $W(x, y) = e^2/|x - y|$.
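
The balance between nuclear attraction and electronic repulsion required in Hypothesis 2.5(iv) can be made concrete in the Coulomb case. The sketch below assumes, for simplicity, a single nucleus of charge $Z$ at the origin (the example above allows several nuclei), so that the supremum of $V_C$ over the shell $(1-\delta)R \le |x| \le (1+\delta)R$ is attained on the outer sphere and the supremum of $W$ over separations of at least $(1-\varepsilon)R$ is attained at that separation. The parameter names `delta` and `eps` are our labels for the two small parameters; they and the function name are assumptions for illustration.

```python
# Assumption: e^2 = 1/137.037 in the units of the text.
E2 = 1.0 / 137.037


def coulomb_margin(total_charge, n_electrons, delta, eps, e2=E2):
    """Value of R * [ sup_shell V_C + (N - 1) * sup W ] for Coulomb potentials.

    With V_C(x) = -e2 * Z / |x| and W(x, y) = e2 / |x - y| the factor R cancels,
    so the result does not depend on R. A negative value means the nuclear
    attraction on the shell outweighs the repulsion of the other N - 1 electrons.
    """
    attraction = -e2 * total_charge / (1.0 + delta)   # R * sup of V_C over the shell
    repulsion = e2 * (n_electrons - 1) / (1.0 - eps)  # R * (N - 1) * sup of W
    return attraction + repulsion


if __name__ == "__main__":
    # Neutral atom with Z = N = 3, delta < 1/N and a small eps: the margin is negative.
    print(coulomb_margin(3, 3, delta=0.3, eps=0.01))
    # Strongly negative ion, Z = 1 and N = 3: the margin is positive.
    print(coulomb_margin(1, 3, delta=0.3, eps=0.01))
```

For $Z \ge N$ and $\delta < 1/N$ one has $Z/(1+\delta) > N - 1$, so a sufficiently small `eps` always makes the margin negative, in line with the condition $\sum_y Z_y \ge N$ of the example.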
Hypothesis 2.5(iii) is fulfilled under the following strengthened version of the
condition given in [22]: Suppose again that $A \in C^\infty(\mathbb{R}^3, \mathbb{R}^3)$, $B = \mathrm{curl}\,A$, and let
$B_{\rho_n}(z_n)$ denote the balls appearing in Example 2.3. Suppose additionally that there
is some $C \in (0, \infty)$ such that $\rho_n < |z_n| \le C\rho_n$, for all $n \in \mathbb{N}$, and that either
\[ \sup\{|B(x)| : x \in B_{\rho_n}(z_n)\} \le C/|z_n|^2, \qquad n \in \mathbb{N}, \]
or
\[ \forall\, n \in \mathbb{N} : \ |B(z_n)| \ge 1/C \quad\text{and}\quad \sup\{\epsilon_\nu(x) \mid x \in B_{\rho_n}(z_n)\} = o(\rho_n^{-1}). \]
Then we find a Weyl sequence $\psi_1, \psi_2, \ldots$ satisfying the conditions in Hypothesis 2.5(iii).
This follows by inspecting and adapting all relevant proofs in [22]. We
leave this procedure to the reader since it is straightforward but a little bit lengthy.

Theorem 2.3 (Existence of Bound States). Assume that $V$, $W$, and $A$ fulfill
Hypothesis 2.5. Then $H_N^{\mathcal{A}}$ has infinitely many eigenvalues below the infimum of its
essential spectrum, $\inf\sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) = E_{N-1}^{\mathcal{A}} + 1$.

Proof. This theorem is proved in Sec. 7.

3. Spectral Projections of the Dirac Operator


In this section, we study spectral projections of Dirac operators with singular potentials
in magnetic fields. We start by recalling some basic well-known facts about
Dirac operators in Sec. 3.1. A crucial role is played by the resolvent identity stated
in that subsection which applies to Coulomb singularities with coupling constants
up to $e^2 Z < 1$. We remark that the domains of the Dirac operators studied here
are not known in general and actually change when the strength of a Coulomb-type
potential is increased. Consequently, the usual resolvent identities are not applicable
and all formal manipulations involving Dirac operators and their spectral projections
have to be treated carefully in the whole paper. In Sec. 3.2 we derive some
norm estimates on resolvents of Dirac operators which are conjugated with exponential
weight functions. We verify that the conjugated resolvent stays bounded
provided the weights increase with an exponential rate smaller than $\sqrt{1 - (\Re z)^2}$,
where $z \in (-1, 1) + i\mathbb{R}$ is the spectral parameter. The simple Neumann-type argument
we employ to prove this for non-vanishing electrostatic potentials might be
new. In Sec. 3.3, we derive the main technical tools of this paper, namely, various
commutator estimates involving spectral projections of singular Dirac operators.
Some long and technical proofs are postponed to Sec. 3.5. Finally, in Sec. 3.4, we
study the difference of projections with and without electrostatic potentials.

3.1. Basic properties of Dirac operators with singular potentials in magnetic fields
In the next lemma, we collect various well-known results on Dirac operators which play an important role in the whole paper. To this end we let $H_c^s := H_c^s(\mathbb{R}^3,\mathbb{C}^4)$ denote all elements of $H^s := H^s(\mathbb{R}^3,\mathbb{C}^4)$, $s\in\mathbb{R}$, having compact support. Moreover, we denote the canonical extension of $D_0$ to an element of $\mathscr{L}(H^{1/2},H^{-1/2})$ by $\check{D}_0$. It shall sometimes be convenient to consider the singular part of $V_C$,

\[ V_C^s(x) := \sum_{y\in\mathcal{Y}} \vartheta(x-y)\,V_C(x), \qquad x\in\mathbb{R}^3, \tag{3.1} \]

where $\vartheta\in C_0^\infty(\mathbb{R}^3,[0,1])$ equals 1 on $B_{\epsilon/2}(0)$ and 0 outside $B_{\epsilon}(0)$. Here $\epsilon$ is the parameter appearing in Hypothesis 2.1. We let $V_C^s(x) = S(x)\,|V_C^s|(x)$ denote the polar decomposition of $V_C^s(x)$. By Hardy's inequality we know that $V_C^s$ is a bounded operator from $H^1(\mathbb{R}^3,\mathbb{C}^4)$ to $L^2(\mathbb{R}^3,\mathbb{C}^4)$. By duality and interpolation it possesses a unique extension $V_C^s\in\mathscr{L}(H^{1/2},H^{-1/2})$. Given some $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ we set $A^s := (1-\theta)A$, where $\theta\in C_0^\infty(\mathbb{R}^3,[0,1])$ is equal to 1 on some ball containing $\operatorname{supp}(V_C^s)$. We let $\alpha\cdot A^s(x) = U(x)\,|\alpha\cdot A^s(x)|$ denote the polar decomposition of $\alpha\cdot A^s(x)$ and note that the operator sum $\check{D}_0 + \alpha\cdot A^s + V_C^s$ is well-defined as an element of $\mathscr{L}(H_c^{1/2},H_c^{-1/2})$, for every $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$. So $V_C^s$ and $A^s$ have disjoint support by definition. As a consequence the application of the following lemma eventually becomes more convenient.

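Lemma 3.1 ([8,41,42,44]). Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V_C$ fulfills Hypothesis 2.1. Then there is a unique self-adjoint operator, $D_{A^s,V_C^s}$, such that:

(i) $\mathcal{D}(D_{A^s,V_C^s}) \subset H^{1/2}_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{C}^4)$.

(ii) For all $\psi\in H_c^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$ and $\phi\in\mathcal{D}(D_{A^s,V_C^s})$,

\[ \langle\psi\,|\,D_{A^s,V_C^s}\phi\rangle = \langle |D_0|^{1/2}\psi\,|\,\operatorname{sgn}(D_0)|D_0|^{1/2}\phi\rangle + \langle |\alpha\cdot A^s|^{1/2}\psi\,|\,U|\alpha\cdot A^s|^{1/2}\phi\rangle + \langle |V_C^s|^{1/2}\psi\,|\,S|V_C^s|^{1/2}\phi\rangle. \]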

Proof. In [44, Proposition 4.3] it is observed that the claim follows from [8, The-
orem 1.3] and [41, 42].

Consequently, we may define a self-adjoint operator,

\[ D_{A,V} := D_{A^s,V_C^s} + \alpha\cdot(A - A^s) + (V_C - V_C^s) + V_H + V_E \tag{3.2} \]

on the domain $\mathcal{D}(D_{A,V}) = \mathcal{D}(D_{A^s,V_C^s})$. Notice that in (3.2) we only add bounded operators to $D_{A^s,V_C^s}$. We state some of its properties in the following lemma where

\[ R_{A,V}(z) := (D_{A,V} - z)^{-1}, \qquad z\in\varrho(D_{A,V}). \tag{3.3} \]



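Lemma 3.2 ([8, 41, 42, 44]). Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2. Then the following assertions hold true:

(a) $1_{B_R(0)}(D_{A,V} - i)^{-1}$ is compact, for every $R > 0$.

(b) $\sigma_{\mathrm{ess}}(D_{A,V}) = \sigma_{\mathrm{ess}}(D_A)$, $\sigma(D_A)\subset(-\infty,-1]\cup[1,\infty)$.

(c) $D_{A,V}$ is essentially self-adjoint on

\[ \mathscr{D}_e := \{\psi\in H_c^{1/2}(\mathbb{R}^3,\mathbb{C}^4) : \check{D}_0\psi + \alpha\cdot A\psi + V_C^s\psi \in L^2(\mathbb{R}^3,\mathbb{C}^4)\} \tag{3.4} \]

and, for $\psi\in\mathscr{D}_e$, $D_{A,V}\psi$ is given as a sum of four vectors in $H^{-1/2}$,

\[ D_{A,V}\psi = \check{D}_0\psi + \alpha\cdot A\psi + V_C^s\psi + (V - V_C^s)\psi. \]

Moreover, $\mathscr{D}_e = \mathcal{D}(D_{A,V})\cap\mathscr{E}'$, where $\mathscr{E}'$ denotes the dual space of $C^\infty(\mathbb{R}^3,\mathbb{C}^4)$.

(d) For $\chi\in C_0^\infty(\mathbb{R}^3)$ and $\psi\in\mathcal{D}(D_{A,V})$, we have $\chi\psi\in\mathscr{D}_e\subset\mathcal{D}(D_{A,V})$ and

\[ [D_{A,V},\chi]\psi = -i\alpha\cdot(\nabla\chi)\psi + [V_E,\chi]\psi. \]

In particular, for $z\in\varrho(D_{A,V})$,

\[ [\chi,R_{A,V}(z)] = R_{A,V}(z)[D_{A,V},\chi]R_{A,V}(z) = R_{A,V}(z)\big(-i\alpha\cdot(\nabla\chi) + [V_E,\chi]\big)R_{A,V}(z). \tag{3.5} \]

(e) If A is bounded, then $\mathcal{D}(D_{A,V})\subset H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$.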

Proof. Since $V_E$ is compact it is clear that all assertions hold true as soon as they hold for $V_E = 0$, which we assume in the following. To prove (a) we write

\[ 1_{B_R(0)}(D_{A,V} - i)^{-1} = \big(1_{B_R(0)}|D_0|^{-1/2}\big)\big(|D_0|^{1/2}\zeta\,(D_{A,V} - i)^{-1}\big), \]

where $\zeta\in C_0^\infty(\mathbb{R}^3,[0,1])$ equals 1 in a neighborhood of $B_R(0)$. Then we use that $1_{B_R(0)}|D_0|^{-1/2}$ is compact and that $|D_0|^{1/2}\zeta\,(D_{A,V} - i)^{-1}$ is bounded by Lemma 3.1 and the closed graph theorem. By standard arguments, we obtain the identity $\sigma_{\mathrm{ess}}(D_{A,V}) = \sigma_{\mathrm{ess}}(D_A)$ from (a) since V drops off to zero at infinity; see, e.g., [49, 4.3.4]. The inclusion $\sigma(D_A)\subset(-\infty,-1]\cup[1,\infty)$ follows from supersymmetry arguments; see, e.g., [49, 5.6]. The assertions in (c) follow from [8, 2], (d) follows from [8, Lemma G], and (e) from [41].
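The resolvent commutator formula (3.5) is, at least formally, a one-line computation. The following sketch ignores all domain questions (these are exactly what Lemma 3.2(d) settles, by guaranteeing that $\chi\psi$ stays in the domain of $D_{A,V}$); it only uses $R_{A,V}(z)(D_{A,V}-z) = 1$ on the domain and $(D_{A,V}-z)R_{A,V}(z) = 1$.

```latex
\begin{align*}
[\chi, R_{A,V}(z)]
  &= R_{A,V}(z)\,(D_{A,V}-z)\,\chi\,R_{A,V}(z)
     \;-\; R_{A,V}(z)\,\chi\,(D_{A,V}-z)\,R_{A,V}(z) \\
  &= R_{A,V}(z)\,[D_{A,V},\chi]\,R_{A,V}(z).
\end{align*}
% Multiplication operators (A, V_C, V_H) commute with \chi, so only the
% derivative and the non-local part V_E contribute to the commutator:
\begin{align*}
[D_{A,V},\chi] \;=\; [\alpha\cdot(-i\nabla),\chi] + [V_E,\chi]
               \;=\; -i\,\alpha\cdot(\nabla\chi) + [V_E,\chi].
\end{align*}
```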

Next, we recall the useful resolvent identity (3.6) (see, e.g., [18, 53]) which is used very often in the sequel. It should be regarded as a substitute for the second resolvent identity which is typically not applicable in order to compare two different Dirac operators in this paper. For, in general, the domain of one of these Dirac operators is not included in the domain of the other. The vector potential $\tilde{A}$ in Eq. (3.6) below could for instance be the gradient of some gauge potential or just be equal to zero. We recall another well-known resolvent identity [41] in the beginning of Sec. 3.5.

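Lemma 3.3. Assume that V fulfills Hypothesis 2.2, and that $A,\tilde{A}\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$. Let $\tilde{V}$ be either $V_C^s$ (given by (3.1)) or 0, let $z\in\varrho(D_{\tilde{A},\tilde{V}})\cap\varrho(D_{A,V})$ and $\chi\in C^\infty(\mathbb{R}^3,\mathbb{R})$ be constant outside some ball in $\mathbb{R}^3$, and assume that $\chi(V_C-\tilde{V})$ and $\chi(A-\tilde{A})$ are bounded. Then

\[ \chi R_{A,V}(z) = \chi R_{\tilde{A},\tilde{V}}(z) + R_{\tilde{A},\tilde{V}}(z)\,i\alpha\cdot(\nabla\chi)\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big) - R_{\tilde{A},\tilde{V}}(z)\,\chi\big(V - \tilde{V} + \alpha\cdot(A-\tilde{A})\big)R_{A,V}(z). \tag{3.6} \]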

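Proof. Let $\tilde{\mathscr{D}}_e := \{\psi\in H_c^{1/2} \,|\, \check{D}_0\psi + \alpha\cdot\tilde{A}\psi + \tilde{V}\psi\in L^2\}$. Since $\chi$ can be written as $\chi = c + \eta$, for some $c\in\mathbb{R}$ and $\eta\in C_0^\infty(\mathbb{R}^3,\mathbb{R})$, Lemmas 3.2(c) and (d) imply that $\chi\psi\in\tilde{\mathscr{D}}_e$, for $\psi\in\tilde{\mathscr{D}}_e$. By the definition of $\mathscr{D}_e$ in (3.4) and the assumptions on $\chi$ it further follows that $\chi\psi\in\mathscr{D}_e\subset\mathcal{D}(D_{A,V})$ and

\[ D_{\tilde{A},\tilde{V}}\,\chi\psi = D_{A,V}\,\chi\psi + \{-V + \tilde{V} - \alpha\cdot(A-\tilde{A})\}\chi\psi. \]

Therefore, we obtain

\begin{align*}
\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\chi\,(D_{\tilde{A},\tilde{V}} - z)\psi
&= \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\big((D_{\tilde{A},\tilde{V}} - z)\chi + i\alpha\cdot(\nabla\chi)\big)\psi \\
&= \chi\psi - R_{A,V}(z)\big(D_{A,V} - z - V + \tilde{V} - \alpha\cdot(A-\tilde{A})\big)\chi\psi \\
&\quad + \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\,i\alpha\cdot(\nabla\chi)\psi \\
&= R_{A,V}(z)\big(V - \tilde{V} + \alpha\cdot(A-\tilde{A})\big)\chi\,R_{\tilde{A},\tilde{V}}(z)(D_{\tilde{A},\tilde{V}} - z)\psi \\
&\quad + \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\,i\alpha\cdot(\nabla\chi)\,R_{\tilde{A},\tilde{V}}(z)(D_{\tilde{A},\tilde{V}} - z)\psi.
\end{align*}

As $D_{\tilde{A},\tilde{V}}$ is essentially self-adjoint on $\tilde{\mathscr{D}}_e$, we know that $(D_{\tilde{A},\tilde{V}} - z)\tilde{\mathscr{D}}_e$ is dense, which together with the calculation above implies

\[ \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\chi = \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\,i\alpha\cdot(\nabla\chi)\,R_{\tilde{A},\tilde{V}}(z) + R_{A,V}(z)\big(V - \tilde{V} + \alpha\cdot(A-\tilde{A})\big)\chi\,R_{\tilde{A},\tilde{V}}(z). \tag{3.7} \]

Taking the adjoint of (3.7) (with z replaced by $\bar{z}$) we obtain (3.6).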
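As a plausibility check, the identity (3.6) can also be recovered by a purely formal manipulation that treats both Dirac operators as if they had a common domain (which, as stressed above, they need not have; the rigorous argument is the proof just given). In this sketch we abbreviate $R := R_{A,V}(z)$, $\tilde{R} := R_{\tilde{A},\tilde{V}}(z)$, $D := D_{A,V}$, $\tilde{D} := D_{\tilde{A},\tilde{V}}$; these abbreviations are ours and are not used elsewhere in the text.

```latex
\begin{align*}
\chi R - \tilde{R}\chi
  &= \tilde{R}\,\big\{(\tilde{D}-z)\chi - \chi(D-z)\big\}\,R
   = \tilde{R}\,\big\{[\tilde{D},\chi] - \chi\,(D-\tilde{D})\big\}\,R \\
  &= -\,\tilde{R}\,i\alpha\cdot(\nabla\chi)\,R
     \;-\; \tilde{R}\,\chi\,\big(V-\tilde{V}+\alpha\cdot(A-\tilde{A})\big)\,R , \\[1ex]
\chi\tilde{R} - \tilde{R}\chi
  &= \tilde{R}\,[\tilde{D},\chi]\,\tilde{R}
   = -\,\tilde{R}\,i\alpha\cdot(\nabla\chi)\,\tilde{R} .
\end{align*}
% Subtracting the second line from the first:
\begin{align*}
\chi R - \chi\tilde{R}
  = \tilde{R}\,i\alpha\cdot(\nabla\chi)\,(\tilde{R}-R)
    \;-\; \tilde{R}\,\chi\,\big(V-\tilde{V}+\alpha\cdot(A-\tilde{A})\big)\,R ,
\end{align*}
% which is (3.6). Here [\tilde{D},\chi] = -i\alpha\cdot(\nabla\chi) because
% \tilde{V} is a multiplication operator (either V_C^s or 0).
```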

3.2. Conjugation of $R_{A,V}(z)$ with exponential weights

As a preparation for the localization estimates for the spectral projections, we shall now study the conjugation of $R_{A,V}(z)$ with exponential weight functions $e^F$ acting as multiplication operators on $\mathscr{H}$. To this end we recall that $e_0\in(-1,1)$ is an element of the resolvent set of $D_{A,V}$ and set

\[ \delta_0 := \inf\{|e_0-\lambda| : \lambda\in\sigma(D_{A,V})\} > 0, \tag{3.8} \]


0 := min{1 e0 , e0 + 1},
:= e0 + iR. (3.9)

Notice that the decay rate in the following lemma is determined only by the decay rate m appearing in Hypothesis 2.2 and the number $\Delta_0$ defined in (2.11). In the next proof and henceforth we shall often use the abbreviations

\[ D_A := D_{A,0}, \qquad R_A(z) := (D_A - z)^{-1}, \qquad z\in\varrho(D_A). \tag{3.10} \]

We remark that, for V = 0, the bound (3.11) below follows from a well-known computation (see, e.g., [7]) which is recalled in the next proof. For non-vanishing, singular potentials V a bound on the operator norm of the conjugated resolvent seems to be less well known and the Neumann-type argument we use to prove it might be a new observation.

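Lemma 3.4. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2. Let $0 < a < \min\{\Delta_0, m\}$. Then there is some $C_a\in(0,\infty)$ such that, for all $F\in C^\infty(\mathbb{R}^3,\mathbb{R})$ with $F(0) = 0$, $F\ge0$ or $F\le0$, $\|\nabla F\|_\infty\le a$, and all $z = e_0 + i\eta$, $\eta\in\mathbb{R}$,

\[ \|e^F R_{A,V}(z)\,e^{-F}\| \le \frac{C_a}{\sqrt{1+\eta^2}}. \tag{3.11} \]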

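Proof. First, we assume that F is constant outside some ball in $\mathbb{R}^3$. Then it suffices to treat the case $F\ge0$, since otherwise we could consider the adjoint of $e^F R_{A,V}(z)\,e^{-F}$.
Since F is smooth and constant outside some compact set a straightforward calculation (see [7]) using Lemma 3.2(d) yields, for $z\in\mathbb{C}$ and $\psi\in\mathcal{D}(D_A)\cap\mathscr{E}'$,

\begin{align*}
&\frac{1}{4\varepsilon}\,\|e^F(D_A - z)e^{-F}\psi\|^2 + 3\varepsilon\,\|\alpha\cdot(-i\nabla + A)\psi\|^2 + 3\varepsilon(1+|z|^2)\,\|\psi\|^2 + 3\varepsilon\,\langle\psi\,|\,|\nabla F|^2\psi\rangle \\
&\qquad\ge \Re\,\langle e^{-F}(D_A + \bar{z})e^{F}\psi\,|\,e^F(D_A - z)e^{-F}\psi\rangle \\
&\qquad= \|\alpha\cdot(-i\nabla + A)\psi\|^2 + \langle\psi\,|\,(1 - \Re z^2 - |\nabla F|^2)\psi\rangle.
\end{align*}

This and the assumption $|\nabla F|\le a$ permit to get, for $z = e_0 + i\eta$, $\eta\in\mathbb{R}$, that is, $\Re z^2 = e_0^2 - \eta^2$, and for every $0 < \varepsilon < (1 - e_0^2 - a^2)/9 = (\Delta_0^2 - a^2)/9$,

\[ \|e^F R_A(z)\,e^{-F}\|^2 \le \frac{1}{4\varepsilon}\,\frac{1}{1 - e_0^2 - a^2 - 9\varepsilon + \eta^2/2} \le \frac{C_{a,\varepsilon}}{1+\eta^2}. \tag{3.12} \]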
4 1 e0 a 9 + /2 1 + 2

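We choose $\varepsilon := (1 - e_0^2 - a^2)/10$ in what follows. Next, we pick some $R > \max\{|y| : y\in\mathcal{Y}\}$ and $\chi\in C^\infty(\mathbb{R}^3,[0,1])$ such that $\chi\equiv0$ on $B_R(0)$, $\chi\equiv1$ on $\mathbb{R}^3\setminus B_{R+2}(0)$, and $\|\nabla\chi\|_\infty\le1$. We set $\bar{\chi} := 1 - \chi$. Furthermore, we let $\zeta$ denote the characteristic function of $\mathbb{R}^3\setminus B_R(0)$. We choose R so large (depending on a, but not on F; recall (2.8)) that

\[ \sup_{|x|\ge R}\|V_C(x)\| + \sup_{|x|\ge R}\|V_H(x)\| + \|\zeta\,e^F V_E\,e^{-F}\| \le \frac{1}{2C_{a,\varepsilon}}. \tag{3.13} \]

Conjugating (3.6) with exponential weights and rearranging the terms we find, for $z\in\Gamma$,

\begin{align*}
&\big\{1 + e^F R_A(z)e^{-F}\big(\zeta V_C + \zeta V_H + \chi\,e^F V_E\,e^{-F}\big)\big\}\,\chi\,e^F R_{A,V}(z)e^{-F} \\
&\quad= \chi\,e^F R_A(z)e^{-F} - \big(e^F R_A(z)e^{-F}\big)\big(e^F i\alpha\cdot\nabla\chi\big)\big(R_{A,V}(z) - R_A(z)\big)e^{-F} \\
&\qquad - \big(e^F R_A(z)e^{-F}\big)\big(\chi\,e^F V_E\,e^{-F}\big)\big(1_{B_{R+2}(0)}\,e^F\big)\,\bar{\chi}\,R_{A,V}(z)e^{-F}.
\end{align*}

Here the operator $\{\cdots\}$ on the left side can be inverted by means of a Neumann series and $\|\{\cdots\}^{-1}\|\le2$ by (3.12) and (3.13). Furthermore, we recall the identity

\[ \|\alpha\cdot v\|_{\mathscr{L}(\mathbb{C}^4)} = |v|, \qquad v\in\mathbb{R}^3, \tag{3.14} \]

which follows from the Clifford algebra relations (2.1), and observe that, by the choice of $\chi$, the assumption on F, and (3.14),

\[ \|e^F i\alpha\cdot\nabla\chi\| \le e^{a(R+2)}, \qquad \|e^{-F}\|\le1. \]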
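Two elementary facts enter at this point, and both can be checked in a few lines. The sketch below assumes that the Clifford relations referred to as (2.1) include $\{\alpha_i,\alpha_j\} = 2\delta_{ij}\mathbb{1}$ with Hermitian $\alpha_j$ (the standard form; (2.1) itself is not reproduced in this section), and it writes K for the operator that is added to 1 inside the braces, a name we introduce only here.

```latex
% (3.14): norm of \alpha\cdot v for v \in \mathbb{R}^3
\begin{align*}
(\alpha\cdot v)^2
  &= \sum_{i,j=1}^{3} v_i v_j\,\alpha_i\alpha_j
   = \tfrac12\sum_{i,j=1}^{3} v_i v_j\,\{\alpha_i,\alpha_j\}
   = |v|^2\,\mathbb{1}, \\
\|\alpha\cdot v\|^2
  &= \|(\alpha\cdot v)^*(\alpha\cdot v)\|
   = \|(\alpha\cdot v)^2\|
   = |v|^2 .
\end{align*}
% Neumann series: if \|K\| \le 1/2, then 1+K is invertible and
\begin{align*}
(1+K)^{-1} = \sum_{n=0}^{\infty}(-K)^n,
\qquad
\|(1+K)^{-1}\| \le \sum_{n=0}^{\infty} 2^{-n} = 2 .
\end{align*}
```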

Moreover, we have, for z = e0 + i ,


1 1
RA (z)  2 , RA,V (z)  2 . (3.15)
0 + 2 0 + 2

Using these remarks together with (2.7) and (3.12), we obtain

C  eR+2
eF RA,V (z)eF a , z = e0 + i .
1 + 2
This estimate implies the assertion if F is constant outside some ball since, certainly,
eF RA,V (z)eF ea(R+2) (02 + 2 )1/2 .
Let us now assume that $F\ge0$ is not necessarily bounded. Let $F_1, F_2, \ldots\in C^\infty(\mathbb{R}^3,[0,\infty))$ be constant near infinity and such that $F_n = F$ on $B_n(0)$ and $\|\nabla F_n\|_\infty\le\|\nabla F\|_\infty$. Then $e^{-F_n}R_{A,V}(z)\,e^{F_n}\psi\to e^{-F}R_{A,V}(z)\,e^{F}\psi$ by the dominated convergence theorem, for every $\psi\in\mathscr{D}$. Since $e^{-F_n}R_{A,V}(z)\,e^{F_n}$ obeys the estimate (3.11) uniformly in n, we see that the densely defined operator $e^{-F}R_{A,V}(z)\,e^{F}|_{\mathscr{D}}$ is bounded and satisfies (3.11), too. But this is the case if and only if its adjoint, $e^{F}R_{A,V}(\bar{z})\,e^{-F} = (e^{-F}R_{A,V}(z)\,e^{F}|_{\mathscr{D}})^*$, is an element of $\mathscr{L}(\mathscr{H})$ and satisfies (3.11) as well.

In the applications of the previous lemma, the following observation is very useful.

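Lemma 3.5. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2. Let $0 < a < \min\{\Delta_0, m\}$. Then there is some $C_a\in(0,\infty)$ such that, for all $F\in C^\infty(\mathbb{R}^3,\mathbb{R})$ with $F(0) = 0$, $F\ge0$ or $F\le0$, $\|\nabla F\|_\infty\le a$, which are constant outside some ball in $\mathbb{R}^3$, and for all $\psi\in\mathscr{H}$,

\[ \int_\Gamma \big\||D_{A,V}|^{1/2}e^F R_{A,V}(z)\,e^{-F}\psi\big\|^2\,|dz| \le C_a\,\|\psi\|^2, \tag{3.16} \]

and, for $\phi\in\mathcal{D}(|D_{A,V}|^{1/2})$,

\[ \int_\Gamma \big\|e^F R_{A,V}(z)\,e^{-F}|D_{A,V}|^{1/2}\phi\big\|^2\,|dz| \le C_a\,\|\phi\|^2. \tag{3.17} \]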

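Proof. For later reference we additionally pick some $\chi\in C^\infty(\mathbb{R}^3,\mathbb{R})$ which is constant outside some large ball and infer from Lemma 3.2(e) that, for $z\in\Gamma$,

\[ [R_{A,V}(z),\chi e^F] = R_{A,V}(z)\big\{i\alpha\cdot(\nabla\chi + \chi\nabla F) + [\chi e^F, V_E]\,e^{-F}\big\}e^F R_{A,V}(z). \tag{3.18} \]

The special case $\chi\equiv1$ implies

\[ e^F R_{A,V}(z)\,e^{-F} = R_{A,V}(z) - R_{A,V}(z)\big\{i\alpha\cdot\nabla F + [e^F, V_E]\,e^{-F}\big\}e^F R_{A,V}(z)\,e^{-F}. \tag{3.19} \]

Taking the adjoint and replacing F by $-F$ and z by $\bar{z}$ we also get

\[ e^F R_{A,V}(z)\,e^{-F} = R_{A,V}(z) - e^F R_{A,V}(z)\,e^{-F}\big\{i\alpha\cdot\nabla F + e^F[V_E, e^{-F}]\big\}R_{A,V}(z). \tag{3.20} \]

Now, let T be a self-adjoint operator on some Hilbert space, $\mathscr{K}$, such that $(-\delta_0,\delta_0)\subset\varrho(T)$. Then, for $\psi\in\mathscr{K}$,

\[ \int_{\mathbb{R}}\big\||T|^{1/2}(T - i\eta)^{-1}\psi\big\|^2\,d\eta = \int_{\mathbb{R}}\int_{\mathbb{R}}\frac{|\lambda|}{\lambda^2+\eta^2}\,d\eta\;d\|E_\lambda(T)\psi\|^2 = \pi\,\|\psi\|^2, \tag{3.21} \]

and it is elementary to check that, for $\eta\in\mathbb{R}$,

\[ \big\||T|^{1/2}(T - i\eta)^{-1}\big\| \le \frac{\delta_0^{1/2}\,1_{(-\delta_0,\delta_0)}(\eta)}{\sqrt{\delta_0^2+\eta^2}} + \frac{1_{(-\delta_0,\delta_0)^c}(\eta)}{\sqrt{2|\eta|}}. \tag{3.22} \]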
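Both (3.21) and (3.22) reduce, via the spectral theorem, to statements about the scalar function $|\lambda|/(\lambda^2+\eta^2)$ on the spectrum of T. The following worked computation is a sketch in our own notation: $\delta_0 > 0$ stands for the half-width of the gap $(-\delta_0,\delta_0)$ in the resolvent set of T, so that only $|\lambda|\ge\delta_0$ occurs.

```latex
% Inner integral in (3.21), for fixed \lambda \neq 0:
\begin{align*}
\int_{\mathbb{R}}\frac{|\lambda|}{\lambda^2+\eta^2}\,d\eta
  = \Big[\arctan\frac{\eta}{|\lambda|}\Big]_{\eta=-\infty}^{\eta=\infty}
  = \pi .
\end{align*}
% Supremum behind (3.22): t \mapsto t/(t^2+\eta^2) increases on (0,|\eta|)
% and decreases on (|\eta|,\infty), hence
\begin{align*}
\sup_{|\lambda|\ge\delta_0}\frac{|\lambda|}{\lambda^2+\eta^2}
  = \begin{cases}
      \dfrac{\delta_0}{\delta_0^2+\eta^2}, & |\eta| < \delta_0, \\[2ex]
      \dfrac{1}{2|\eta|},                  & |\eta| \ge \delta_0 .
    \end{cases}
\end{align*}
% Taking square roots gives the two terms on the right side of (3.22).
```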

Using (3.21) and (3.22) with $T = D_{A,V} - e_0$ and taking (2.8), (3.11), (3.14), and (3.15) into account, we readily derive the asserted estimate (3.16) from (3.19). The second estimate (3.17) is obtained analogously by means of (3.20).

3.3. Commutators
In this subsection, we derive the crucial technical prerequisites for the spectral analysis of $H_N$, namely various commutator estimates involving the projection $\Lambda^+_{A,V}$, cut-off functions, and exponential weights $e^F$. Roughly speaking, these estimates allow to adapt many arguments known from the spectral analysis of partial differential operators that involve partitions of unity and conjugations with exponential weights to our non-local model.
Our standard assumptions on the cut-off and weight functions are

\[ \chi\in C^\infty(\mathbb{R}^3,[0,1]) \text{ is constant outside some ball} \tag{3.23} \]

and

\[ F\in C^\infty(\mathbb{R}^3,\mathbb{R}),\quad F\ge0 \text{ or } F\le0,\quad F(0) = 0,\quad |\nabla F|\le a,\quad F \text{ is constant outside some ball.} \tag{3.24} \]

To shorten the presentation, we generalize our estimates to unbounded F only if this is explicitly used in this article.

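Proposition 3.1. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2 and let $0 < a_0 < \min\{\Delta_0, m\}$. Then there is some constant $C_{a_0}\in(0,\infty)$ such that, for all $a\in[0,a_0]$ and $\chi$, F satisfying (3.23) and (3.24),

\[ \big\||D_{A,V}|^{1/2}[\Lambda^+_{A,V},\chi e^F]\,e^{-F}|D_{A,V}|^{1/2}\big\| \le C_{a_0}\,(\|\nabla\chi\|_\infty + a). \tag{3.25} \]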

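Proof. We shall employ the identity

\[ [\Lambda^+_{A,V},\chi e^F] = \frac12\,[\operatorname{sgn}(D_{A,V} - e_0),\chi e^F] \tag{3.26} \]

and the representation of the sign function as a Cauchy principal value (see, e.g., [31, p. 359]),

\[ \operatorname{sgn}(D_{A,V} - e_0)\psi = \int_\Gamma R_{A,V}(z)\psi\,\frac{dz}{i\pi} := \lim_{R\to\infty}\int_{-R}^{R}R_{A,V}(e_0 + i\eta)\psi\,\frac{d\eta}{\pi}, \tag{3.27} \]

for $\psi\in\mathscr{H}$, where $\Gamma$ is defined in (3.9). Taking also (2.6), (2.7), and (3.18) into account we obtain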

||DA,V |1/2 | [+ F F
A,V , e ]e |DA,V |1/2 |

|DA,V |1/2 RA,V ( z ) i ( + F ) + [eF , VE ]eF

|dz|
eF RA,V (z)eF |DA,V |1/2
2


1/2
2 |dz|
Ca 0 ( + F + a) |DA,V | 1/2
RA,V (z)
2

1/2
|dz|
eF RA,V (z)eF |DA,V |1/2 2 , (3.28)
2

for , D(|DA |1/2 ) Ran(RA (z)). By virtue of (3.16) and (3.17), we rst infer
that

[+ F F
A,V , e ]e |DA,V |1/2 D((|DA,V |1/2 ) ) = D(|DA,V |1/2 ).

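We conclude by recalling that an operator $T : \mathcal{D}(T)\to\mathscr{K}$ on some Hilbert space $\mathscr{K}$ is bounded if and only if

\[ \sup\{|\langle\phi\,|\,T\psi\rangle| : \phi\in X,\ \psi\in\mathcal{D}(T),\ \|\phi\| = \|\psi\| = 1\} \tag{3.29} \]

is finite, in which case it is equal to the norm of T. Here $X\subset\mathscr{K}$ is a subspace with $\overline{X}\supset\operatorname{Ran}(T)$.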
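The principal-value representation (3.27) rests on a scalar identity that is then lifted to $D_{A,V} - e_0$ by the spectral theorem. The sketch below assumes the normalization $d\eta/\pi$ as in (3.27); $\lambda$ plays the role of a point in the spectrum of $D_{A,V} - e_0$, so that $(\lambda - i\eta)^{-1}$ corresponds to $R_{A,V}(e_0 + i\eta)$.

```latex
\begin{align*}
\frac{1}{\pi}\int_{-R}^{R}\frac{d\eta}{\lambda - i\eta}
  &= \frac{1}{\pi}\int_{-R}^{R}\frac{\lambda + i\eta}{\lambda^2+\eta^2}\,d\eta
   = \frac{1}{\pi}\int_{-R}^{R}\frac{\lambda}{\lambda^2+\eta^2}\,d\eta
   % the odd part i\eta/(\lambda^2+\eta^2) integrates to zero
   \\
  &= \frac{2}{\pi}\arctan\frac{R}{\lambda}
   \;\xrightarrow[R\to\infty]{}\; \operatorname{sgn}(\lambda),
   \qquad \lambda\in\mathbb{R}\setminus\{0\}.
\end{align*}
% Since \Lambda^+ = (1 + \operatorname{sgn})/2 and the identity commutes with
% \chi e^F, this also explains the factor 1/2 in (3.26).
```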

Given some suitable weight function, F, we abbreviate

\[ \Lambda^F_{A,V} := e^F\Lambda^+_{A,V}\,e^{-F}. \tag{3.30} \]

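Corollary 3.1. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2. Let $0 < a < \min\{\Delta_0, m\}$. Then there is some $C(a)\in(0,\infty)$ such that, for all $F\in C^\infty(\mathbb{R}^3,\mathbb{R})$ satisfying $F(0) = 0$, $F\ge0$ or $F\le0$, and $\|\nabla F\|_\infty\le a$, we have $\Lambda^F_{A,V}\in\mathscr{L}(\mathscr{H})$ and $\|\Lambda^F_{A,V}\|\le C(a)$.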

Proof. First, we assume that F satisfies (3.24). In this case the claim follows from Proposition 3.1 because $[e^F,\Lambda^+_{A,V}]\,e^{-F} = \Lambda^F_{A,V} - \Lambda^+_{A,V}$. If F is unbounded, then we apply an approximation argument similar to the one at the end of the proof of Lemma 3.4.

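Corollary 3.2. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2 and let $0 < a_0 < \min\{\Delta_0, m\}$. Then there is some constant $C\in(0,\infty)$ such that, for all $a\in[0,a_0]$, $\chi$, F satisfying (3.23), (3.24), and $\|\nabla\chi\|_\infty\le1$, $L\in\mathscr{L}(\mathscr{H})$, and $\phi\in\mathscr{H}$,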
+ + 2
A,V LA,V   | A,V LA,V | (a + )C L .
| | F F
(3.31)

Moreover, for all D(DA,V ),


+ +
| | F
A,V DA,VC A,V   | A,V DA,VC A,V |
F

(a + ) inf { | + +
A,V DA,VC A,V  + C
1
2 }. (3.32)
0<1

If $V_C = V = 0$, then (3.32) still holds true, if $D_A$ is replaced by $|D_A|$ on the left side.

Proof. The proof of (3.31) is a rather obvious application of Proposition 3.1 and in fact a simpler analogue of the derivation of (3.32) below. Once (3.31) is verified, it suffices to prove (3.32) with $D_{A,V_C}$ replaced by $D_{A,V}$ since $V_H$ and $V_E$ are bounded. Without loss of generality we may further assume that $D_{A,V}$ is positive on the range of $\Lambda^+_{A,V}$. For otherwise we could add a suitable constant to $D_{A,V}$. To prove (3.32) we first recall that $\Lambda^+_{A,V}$ maps the domain of $D_{A,V}$ into itself and by Lemma 3.2(d) we know that multiplication with $\chi$ or $e^F$ leaves $\mathcal{D}(D_{A,V})$ invariant, too. We thus have the following identity on $\mathcal{D}(D_{A,V})$,
+
A,V DA,V A,V A,V DA,V
F F

= eF [+
A,V , e
F
]DA,V + + F +
A,V + A,V DA,V [e , A,V ]e
F

+ eF [+
A,V , e
F
]DA,V [eF , +
A,V ]e
F
.

It follows that the absolute value on the left-hand side of (3.32) is less than or
equal to


|DA,V |1/2 +
A,V |DA,V |1/2 [+ F F
A,V , e ]e

=


+ |DA,V |1/2 [+ F F
A,V , e ]e 2 ,
=

which together with Proposition 3.1 implies (3.32). The last statement of the lemma is valid since the argument above works equally well with $|D_{A,V}|$ in place of $D_{A,V}$ because $\Lambda^+_{A,V}|D_{A,V}| = \Lambda^+_{A,V}D_{A,V}$.

In order to carry out explicit computations it is important to know that functions in the image set $\Lambda^+_{A,V}\mathscr{D}$ still have a certain regularity. This is ensured by the following lemma. As a first consequence we shall see in the corollary below that $H_N$ is actually well-defined on $\mathscr{D}_N$.

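Lemma 3.6. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ satisfies $\|A\,e^{-\tau_0|x|}\|_\infty < \infty$, for some $0\le\tau_0 < \min\{m,\Delta_0\}$, and that V fulfills Hypothesis 2.2. Then $\Lambda^+_{A,V}\phi\in H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$, for every $\phi\in\mathscr{D}$.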

Proof. Let D. We pick some C0 (R3 , [0, 1]) with 1 on supp().


Furthermore, we pick C0 (R3 , [0, 1]) such that 1 on supp() BR (0), where
1/2
R > max{|y| : y Y}. We set := 1 . Since D(DA,V ) Hloc (R3 , C4 ) and
+
the spectral projection A,V maps the domain of DA,V into itself it follows that

+
A,V H
1/2
(R3 , C4 ). Furthermore, we pick a (smooth, locally nite) partition of
3

unity on R , {J }N , =1 J = 1, such that =1 |J | C, for some constant
C (0, ). Setting := J , N, := + A,V (DA,V i), and using (3.6), we
obtain


+
A,V = RA,V (i)+
A,V (DA,V i)
=1


= R0 (i) R0 (i)i ( )(RA,V (i) R0 (i)) (3.33)
=1



R0 (i) (V + A)RA,V (i). (3.34)
=1

Here the sum in (3.33) commutes with the rst resolvent and the strong limit

( ) denes a bounded operator on L2 (R3 , C4 ). To treat (3.34) we rst
=1 i
3
use that =1 V = V is bounded. Next, we pick some F C (R , [0, ))
vanishing on some ball containing 0 and supp() and satisfying F (x) = a|x| a ,
for x outside some suciently large ball with 0 < a < min{m, 0 }, a > 0. Then
we write

( A)RA,V (i) = (eF A)(eF RA,V (i)eF )


(eF +
A,V e
F
)(DA,VC +VH + i F + eF VE eF ).

Using (2.7), Lemma 3.4 and Corollary 3.1 we see that ( A)RA,V (i) is an ele-
ment of L2 (R3 , C4 ). These remarks imply that +
A,V belongs to Ran(R0 (i)) +
1 3 4
Ran(R0 (i)) = H (R , C ).

We may now conclude that $H_N$ is well-defined on the dense domain $\mathscr{D}_N$ defined in (2.13).

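Corollary 3.3. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ satisfies $\|A\,e^{-\tau_0|x|}\|_\infty < \infty$, for some $0\le\tau_0 < \min\{m,\Delta_0\}$, and that V fulfills Hypothesis 2.2. Then, for $\Psi\in\mathscr{D}_N$ and $1\le i < j\le N$,

\[ \int_{\mathbb{R}^{3N}}\frac{1}{|x_i - x_j|}\,\big|\Lambda^{+,N}_{A,V}\Psi(X)\big|^2\,dX < \infty. \]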

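Proof. Let $\phi,\psi\in\mathscr{D}$. Thanks to Lemma 3.6 we know that both $\Lambda^+_{A,V}\phi$ and $\Lambda^+_{A,V}\psi$ belong to $H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$ and, hence, to $L^3(\mathbb{R}^3,\mathbb{C}^4)$ by the Sobolev inequality for $|\tfrac{1}{i}\nabla|$. An application of the Hardy-Littlewood-Sobolev inequality thus yields

\[ \int_{\mathbb{R}^6}\frac{1}{|x - y|}\,|\Lambda^+_{A,V}\phi(x)|^2\,|\Lambda^+_{A,V}\psi(y)|^2\,dx\,dy < \infty. \]

This estimate clearly implies the full assertion.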



In our applications it is important to control commutators that are multiplied with square-roots of the electron-electron interactions $W(x_i,x_j)$. In order to formulate an appropriate estimate we set

\[ W_y(x) := W(x,y) = W(y,x), \qquad x,y\in\mathbb{R}^3, \tag{3.35} \]

in what follows. The proof of the next proposition looks somewhat lengthy and is hence postponed to Sec. 3.5. This is due to the fact that the singularity of $W_y$ may be located anywhere and that we allow for unbounded magnetic fields. We remark that, even in the case V = 0, a diamagnetic inequality is not very useful in this context since, for unbounded magnetic fields, one cannot compare $|-i\nabla + A|$ with $|D_A|$. We tackle this problem by a procedure that involves a partition of unity, local gauge transformations, and exponential decay estimates which control the correlation between different regions in position space. As a result we obtain a commutator estimate which can be chosen to depend only on the local magnitude of $|B|$ either at the singularity y or on the support of the involved cut-off function. For any function $\chi$ on $\mathbb{R}^3$ we use the notation

\[ \|B\|_{\infty,\chi} := \sup\{|B(x)| : x\in\operatorname{supp}(\chi)\}. \tag{3.36} \]

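Proposition 3.2. Assume that $A\in C^1(\mathbb{R}^3,\mathbb{R}^3)$ and $B = \operatorname{curl}A$ satisfies (2.18) and that V fulfills Hypothesis 2.2. Let $0\le a_0 < \min\{m,\Delta_0\}$ and $\mathcal{N}\subset\mathbb{R}^3$ be a neighborhood of the set of singularities, $\mathcal{Y}$, of $V_C$. Then there is some constant, $C_{a_0,\mathcal{N}}\in(0,\infty)$, such that, for all $a\in[0,a_0]$, all $\chi$, F satisfying (3.23), (3.24) which are constant on $\mathcal{N}$, and all $y\in\mathbb{R}^3$,

\[ \big\|W_y^{1/2}[\Lambda^+_{A,V},\chi e^F]\,e^{-F}\big\| \le C_{a_0,\mathcal{N}}\,\big(1 + \min\{|B(y)|,\|B\|_{\infty,\chi}\}\big)\,(a + \|\nabla\chi\|_\infty). \tag{3.37} \]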

If VE = 0, then B , can be replaced by B ,F + in (3.37).

Corollary 3.4. Assume that $A\in C^1(\mathbb{R}^3,\mathbb{R}^3)$ and $B = \operatorname{curl}A$ is bounded and that V fulfills Hypothesis 2.2. Then we find, for every $\varepsilon > 0$, some constant $C_{a_0,\varepsilon}\in(0,\infty)$ such that, for all F satisfying (3.24), $\Psi\in\mathscr{D}_N$, and $1\le i < N$,

| | 1 eF +,N +,N
A,V WiN A,V 1 e
F
  | +,N +,N
A,V WiN A,V |

a{ | +,N +,N 2


A,V WiN A,V  + Ca0 , },

where eF acts only on the last variable.

Proof. This corollary is proved by means of Proposition 3.2 in the same way as
Corollary 3.2. We also recall that +,N
A,V DN D(Wij ).

The technique used in the proof of Proposition 3.2 also yields the following
result whose proof can be found in Sec. 3.5, too:

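Lemma 3.7. Assume that $A\in C^1(\mathbb{R}^3,\mathbb{R}^3)$ and $B = \operatorname{curl}A$ satisfies (2.18) and that V fulfills Hypothesis 2.2. Then there is some constant $C\in(0,\infty)$ such that, for all $\psi\in\mathcal{D}(D_{A,V})$,

\[ \big\|W_y^{1/2}\Lambda^+_{A,V}\psi\big\| \le C\,\big(1 + \min\{|B(y)|,\|B\|_{\infty,\psi}\}\big)\,\|(D_{A,V} - i)\psi\|. \tag{3.38} \]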

3.4. Differences of projections

In our applications it is eventually necessary to have some control on the difference between $\Lambda^+_{A,V}$ and

\[ \Lambda^+_A := \Lambda^+_{A,0}. \]

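Lemma 3.8. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2. Then there is some $C\in(0,\infty)$ such that, for all $\chi\in C^\infty(\mathbb{R}^3,[0,1])$ which are constant outside some ball such that $\chi V_C$ is bounded,

\[ \big\||D_A|^{1/2}\chi\,(\Lambda^+_A - \Lambda^+_{A,V})\big\| \le C\,(\|\chi V\| + \|\nabla\chi\|_\infty). \]

In particular, $\chi\Lambda^+_{A,V}\psi\in\mathcal{D}(|D_A|^{1/2})$, for every $\psi\in\mathcal{D}(D_A)$.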

Proof. Due to (3.27) the norm in the statement (if it exists) is bounded from
above by

|dz|
sup ||DA |1/2 | (RA (z) RA,V (z))| .
D(|DA |1/2
), H

=1

We next use (3.6), (3.15), and (3.17) to conclude that the asserted bound holds
true.

We note the following trivial consequence of the previous lemma: Namely, we pick some $\theta\in C_0^\infty(\mathbb{R}^3,[0,1])$ with $\theta\equiv1$ on $B_1(0)$ and $\theta\equiv0$ outside $B_2(0)$, and set $\theta_R(x) := \theta(x/R)$, for $R\ge1$, $x\in\mathbb{R}^3$. By virtue of Hypothesis 2.2 and Lemma 3.8 we then have, for every $\chi$ as in the statement of Lemma 3.8,


|DA |1/2 (1 R )(+
A +
A,V ) C (1 R )V + 0, (3.39)
R

as R tends to infinity.

Corollary 3.5. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that V fulfills Hypothesis 2.2. Then there is some $C\in(0,\infty)$ such that, for every $\chi\in C^\infty(\mathbb{R}^3,\mathbb{R})$, which is constant outside some ball and such that $\chi V_C^s = 0$ and $\|\chi V\| + \|\nabla\chi\|_\infty\le1$, and every $\phi\in\mathscr{D}$,

| | + + + +
A,V DA,VC A,V   | A DA A |
 
C
( V + ) inf  | +
A |D A | +
A  + 2
, (3.40)
0<1

and

| | + + + +
A,V DA,VC A,V   | A DA A |
 
+ + C 2
( V + ) inf  | A DA A  + . (3.41)
0<1

The last estimate still holds true (with a new constant C), if $D_A$ and $D_{A,V_C}$ are replaced by $D_A - 1$ and $D_{A,V_C} - 1$, respectively.

Proof. Let D and let R be the cut-o function constructed in the paragraph
preceding (3.39). On account of Lemma 3.6 we know that + A,V H
1/2
and,
1/2
hence, we infer from Lemma 3.2(c) that R + A,V Hc belongs to the domain
of DA . Applying also the formula appearing in (ii) of Lemma 3.1 and using VCs = 0
we obtain

 | + s
+ + s
+
A,V DA,VC A,V  = lim R A,V | DA,VC A,V 
R

= lim DA R + +
A,V | A,V .
R

Writing + := + +
A,V A we further get

DA R + +
A,V | A,V 

= DA R + + + +
A | A  + DA R A | 

+ R + | DA + + +
A  + DA R | .

By virtue of Lemma 3.8, we know that + D(|DA |1/2 ) and it is easy to see
that DA R + +
A DA A , as R . Using also (3.39) we arrive at

| | + + + +
A,V DA,VC A,V  DA A | A |
s

2 |DA |1/2 +
A |DA |
1/2
+ + |DA |1/2 + 2 .

Therefore, we obtain (3.40) by applying Lemma 3.8 once again. (3.41) follows from
a straightforward combination of (3.40) and Corollary 3.2. The last statement of
Corollary 3.5 follows from (3.41) and Lemma 3.8.

3.5. Proofs of Proposition 3.2 and Lemma 3.7

First, we recall a useful resolvent identity. We remind the reader that $V_C^s = S|V_C^s|$ denotes the polar decomposition of the potential defined in (3.1) and set

\[ M(z) := |V_C^s|^{1/2}R_0(z)\,|V_C^s|^{1/2}, \qquad z\in\varrho(D_0). \tag{3.42} \]

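Lemma 3.9 ([32, 41, 42]). Assume that $V_C$ fulfills Hypothesis 2.1 and let $V_C^s$ be given by (3.1). Then there exist $\eta_0 > 0$ and $\gamma_0\in(\gamma,1)$ such that, for every $\eta\in\mathbb{R}\setminus(-\eta_0,\eta_0)$, we have $\|M(i\eta)\|\le\gamma_0$ and

\[ R_{0,V_C^s}(i\eta) = R_0(i\eta) - R_0(i\eta)\,|V_C^s|^{1/2}\big(1 + SM(i\eta)\big)^{-1}S\,|V_C^s|^{1/2}R_0(i\eta). \tag{3.43} \]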

Proof. The inequality $\big\||\cdot|^{-1/2}R_0(i\eta)\,|\cdot|^{-1/2}\big\|\le1$ has been conjectured in [41] and proved in [32]. By means of this inequality and the arguments of [42, pp. 2 and 3 (with ks 1)] we find some $\gamma_0\in(\gamma,1)$ such that $\|M(i\eta)\|\le\gamma_0$ provided $|\eta|$ is large enough. The resolvent formula (3.43) then follows from [41, Lemma 2.2 and Theorem 2.2].

Proof of Proposition 3.2. We pick some C0 (R3 , [0, 1]) such that = 1
in a neighborhood of Y and supp()  N . We set := 1 . Moreover, we pick
3
some C0 (R , [0, 1]) with 1 on B1/2 (0) and supp( ) B1 (0) and set
y (x) := (x y), x R3 . On account of Proposition 3.1 it suces to consider

 y Wy1/2 | [+ F
A,V , e ]e
F

 
1/2 dz dz
=  y Wy | T (z) +  y Wy1/2 | T (z) , (3.44)
2 2

for H 1/2 (R3 , C4 ) and H , where, by (3.18) and (3.27),

T (z) := RA,V (z)T eF RA,V (z)eF , z , (3.45)


T := i (F + ) + [eF , VE ]eF = O( + a). (3.46)

To study the rst integral in (3.44) we write, using (3.6),

RA,V (z) = R0,VCs (z) + R0,VCs (z)i (R0,VCs (z) RA,V (z))
R0,VCs (z){V VCs + A}RA,V (z),

where VCs = S|VCs | is dened in (3.1). Since D(D0,VCs ) H 1/2 (R3 , C4 ) due to
Lemma 3.2(e) we nd C, C  (0, ) such that, for all y R3 and all z ,

Wy1/2 R0,VCs (z) C |D0 |1/2 R0,VCs (z) C  . (3.47)



By denition of T (z) and T and by (3.11) we thus have



|dz|
| y Wy1/2 | {T (z) R0,VCs (z)T eF RA,V (z)eF }|
2

d
Ca, (a + ) 2
. (3.48)
R 1+

To treat the remaining part of the rst integral in (3.44) we employ the rst resol-
vent formula and (3.43) to obtain, for R with || 0 and z = e0 + i (0 is
the parameter appearing in Lemma 3.9),

R0,VCs (z) = R0 (i) + e0 R0,VCs (z)R0,VCs (i)

R0 (i)|VCs |1/2 (1 + SM (i))1 S|VCs |1/2 |D0 |1/2 |D0 |1/2 R0 (i). (3.49)

Here the operator (1 + SM (i))1 is uniformly bounded for || 0 . Moreover,

Wy1/2 R0 (i)|VCs |1/2 C, (3.50)

uniformly in $y\in\mathbb{R}^3$ and $\eta\in\mathbb{R}$, which is a simple consequence of Kato's inequality. In view of (3.48), (3.49) and (3.22) it is therefore clear that it suffices to discuss the contribution coming from the bare resolvent $R_0(i\eta)$ in (3.49). On account of (3.11), (3.21) and (3.46) we find by means of the Cauchy-Schwarz inequality,

d
|R0 (i)Wy1/2 y | T eF RA,V (e0 + i)eF |
||0 2

T (|D0 |1/2 Wy1/2 ) y = O(a + ) .

Since the remaining part of the integral over $\{|\eta| < \eta_0\}$ does not pose any further problem, we see altogether that the first integral in (3.44) is absolutely convergent and of order $O(\|\nabla\chi\|_\infty + a)$.
Next, we treat the second integral on the right-hand side of (3.44). Since we eventually have to change the gauge locally, we pick a (smooth) partition of unity, $\{J_\mu\}_{\mu\in\mathbb{Z}^3}$, on $\mathbb{R}^3$ such that $\operatorname{supp}(J_\mu)\subset B_1(\mu)$, for every $\mu\in\mathbb{Z}^3$. We can certainly assume that $\sum_{\mu\in\mathbb{Z}^3}|\nabla J_\mu|\le C$, for some $C\in(0,\infty)$. We set

G () := { Z3 : J = 0}, := J , := 1 ,
G ()

so that = , = 0, and re-write the operator dened in (3.46) as

T = i (F + ) + [eF , VE ]eF + [, VE ]
= {i (F + ) + [eF , VE ]eF + [, VE ]} {VE }

=: U1 U2 = (J U1 U2 J ).
G ()

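For every $\mu\in\mathbb{Z}^3$, there is some gauge potential, $g_\mu\in C^2(\mathbb{R}^3,\mathbb{R})$, such that $A - \nabla g_\mu = A_\mu$, where $A_\mu$ is defined by

\[ A_\mu(x) := \int_0^1 B(\mu + t(x-\mu))\wedge t(x-\mu)\,dt, \qquad x\in\mathbb{R}^3. \tag{3.51} \]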
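The local gauge (3.51) is what makes the vector potential controllable by the local size of the magnetic field. The following sketch records the pointwise bound that follows directly from the definition, together with the constant-field special case as a sanity check. We write $\mu$ for the lattice point and $\wedge$ for the cross product in $\mathbb{R}^3$; both symbols are our reading of the formula and should be treated as notational assumptions.

```latex
% Pointwise bound: |u \wedge v| \le |u|\,|v| and \int_0^1 t\,dt = 1/2
\begin{align*}
|A_\mu(x)|
  \le \int_0^1 t\,|x-\mu|\;\big|B(\mu+t(x-\mu))\big|\,dt
  \le \frac{|x-\mu|}{2}\,\sup_{0\le t\le1}\big|B(\mu+t(x-\mu))\big| .
\end{align*}
% Constant field B \equiv B_0: the integral can be evaluated explicitly,
\begin{align*}
A_\mu(x) = \tfrac12\,B_0\wedge(x-\mu),
\qquad
\operatorname{curl}A_\mu = B_0 ,
\end{align*}
% i.e. A_\mu is the symmetric gauge centered at \mu.
```

The factor $|x-\mu|/2$ in the first line is the same prefactor that appears in the bound (3.54) used later for the term $I_3$.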

By virtue of (3.6) we obtain with $\tilde{A} := \nabla g_\mu$,

y RA,V (z)T = y RA,V (z)(J U1 U2 J )
G ()

= y RAe (z)(J U1 U2 J )
G ()

+ RAe (z)i ( y )(RAe (z) RA,V (z))(J U1 U2 J )
G ()

RAe (z) y (V + A )RA,V (z)(J U1 U2 J )
G ()

=: S1 + S2 S3 . (3.52)

The proof of Proposition 3.2 is finished as soon as we have shown Lemma 3.10 below.

Lemma 3.10. In the situation above there exists a $\chi$- and F-independent constant, $C_{a_0}\in(0,\infty)$, such that, for j = 1, 2, 3,

Ij := |Wy1/2 | Sj eF RA,V (z)eF ||dz|

Ca0 (1 + min{|B(y)|, B , })( + a) .

Proof of Lemma 3.10. In our estimations below, that involve non-local operators,
we exploit the fact that the interference between spatially separated regions decays
exponentially. Therefore, we start by introducing appropriate exponential weight
functions: We pick some a (, min{0 , m}) and a convex, even function f

C (R, [0, )) such that f 0 on [2, 2], f(t) = a a, for |t| 4, and 0 f 1
|t|3
on (0, ). We dene f (x) := f(|x |), x R3 , so that f 0 on B2 () and
f a dist(, B2 ()) a
with equality outside B4 (). Moreover, |f | a . We
further pick some non-decreasing C (R, R) such that (t) = t, for t 1, 2
on [3, ) and  1. We set ,y (t) := (| y| + 1)(t/(| y| + 1)), t R,
and f,y := ,y f , so that f,y is bounded, f,y = f a dist(, B2 ()) a
on
B1 (y) supp( y ), and |f,y | a f,y
. By construction e J = J . Setting

,y (z) := (J U1 ef,y U2 ef,y J )eF RA,V (z)eF



we thus have

(J U1 U2 J )eF RA,V (z)eF = ef,y ,y (z).

Observing that = 0 implies

ef,y U2 ef,y = [ef,y VE ef,y , ]

and employing (2.6) and (3.11) we further nd some constant C (0, ) such that,
for all z = e0 + i , Z3 , and y R3 ,

a +
,y (z) C  . (3.53)
1 + 2

Taking these remarks into account we obtain


 
I1 z )ef,y Wy1/2 ( y ef,y ) | ,y (z)||dz|.
|ef,y RAe (
G ()

Now, since A = 0, whence (2.1), (2.12) and


= g is a gradient we have curl A
Hardys inequality imply

) C |D e | ,
Wy C (eig ) = C (i + A A

for D, with some - and y-independent constant C (0, ). Standard argu-


1/2
ments now imply that |DAe |1/2 Wy is a bounded operator whose norm is uni-
formly bounded in and y. Setting

:= |DAe |1/2 Wy1/2 ( y ef,y )

and applying (3.17) we thus nd

 
1/2
f,y 1/2 2
I1 e RAe (
z )e f,y
|DAe | |dz|
G ()


1/2
2
,y (z) |dz|

 
1/2
d
C (a + )
R 1 + 2
G ()

C sup{ef,y (x) : x B1 (y)} (a + )
Z3

C (a + ) .

For I2 , we obtain the estimate by means of (3.22), (3.11) and (3.53),


 
I2 |dz| |DAe |1/2 Wy1/2 |DAe |1/2 RAe (z)
G ()

|( y )|ef,y ef,y (RA,V (z) RAe (z))ef,y ,y (z)



C( + a) sup{ef,y (x) : x B1 (y)}
Z3

C ( + a) .

To derive a bound on I3 we employ the special properties of the gauge transformed


vector potentials A . Namely, we make use of the bound
|x |
y (x)ef,y (x) |A (x)| y (x) (b1 e(a )|x|+3a + |B()|ea|x|+3a ),
2
(3.54)

for all Z3 and x R3 , which follows from (2.18). Since also |B()|
|B(y)| + |B() B(y)| and |x y| 1, if y (x) = 0, we infer again from (2.18)
that

y ef,y |A |
G ()
  


a|y|
C e 1 + min |B(y)|, sup |B()| , (3.55)
G ()
G ()

for some suciently small a > 0. Using these observations and the uniform bound-
f,y f,y
edness of e V e , which is implied by Hypothesis 2.2 and the choice of , we
nd some -, F -, and y-independent constant C  (0, ) such that
 
I3 z )Wy1/2 { y ef,y ef,y V ef,y
RAe(
G ()

+ y ef,y |A | } ef,y RA,V (z)ef,y ,y (z) |dz|


C  (1 + min{|B(y)|, B , })( + a) .

This completes the proof of Lemma 3.10 and, at the same time, the proof of Propo-
sition 3.2. (The last assertion of Proposition 3.2 follows by inspecting the arguments
above.)

Proof of Lemma 3.7. We use the notation introduced in the proofs of Propo-
sition 3.2 and Lemma 3.10 in the following. We already know from Corollary 3.6
1/2
that the vector +A,V belongs to D(Wy ), but we do not have any control on the
norm on the left in (3.38) yet. It is certainly sucient to derive the asserted bound
1/2 1/2
with Wy replaced by y Wy . As in the proof of Corollary 3.6, we rst pick some
C0 (R3 , [0, 1]) (independent of ) such that 1 on some large open ball
containing Y and set = 1 . By the closed graph theorem |D0 |1/2 RA,V (i) is
bounded whence

y Wy1/2 +
A,V C |D0 |
1/2
RA,V (i) (DA,V i) .

We denote the characteristic function of the support of by . To treat the remain-


ing piece of the norm we set

:= +
A,V (DA,V i)J ,

and write analogously to (3.52),

|Wy1/2 | y +A,V |

|Wy1/2 | y RA,V (i) |
G ()

|Wy1/2 y | RAe (i) |
G ()

+ |Wy1/2 | RAe (i) ( y )(RA,V (i) RAe (i)) |
G ()

+ |Wy1/2 | RAe (i) y (V + A )RA,V (i) |
G ()

=: Q1 + Q2 + Q3 ,

where H 1/2 (R3 , C4 ). Again we use exponential weights constructed in the


beginning of the proof of Lemma 3.10 and abbreviate

,y := (ef,y +
A,V e
f,y f,y
)e (DA,V i)ef,y J ,

so that by Corollary 3.1,

,y C (DA,V i) + O( J + a
) C  (DA,V i) ,

where C, C  (0, ) neither depend on nor y. Writing also ef,y RAe (i)ef,y =
RAe (i)(1 i f,y ef,y RAe (i)ef,y ) and using (3.11) we thus obtain

Q1 |DAe |1/2 Wy1/2 y ef,y
G ()

|DAe |1/2 ef,y RAe (i)ef,y ,y


C  (DA,V i) .

Using (3.55) we further find



Q3 |DAe |1/2 Wy1/2 |DAe |1/2 RAe (i)
G ()

{ y ef,y ef,y V ef,y + y ef,y |A | } ,y


C  (1 + min{|B(y)|, B , }) (DA,V i) .

The remaining term, Q2 , can be dealt with similarly.

4. Exponential Localization
In this section, we prove Theorem 2.1. To this end, we adapt an argument from [1]
and some useful improvements of the latter from [19] to our non-local situation.
In the proof below, we present the general strategy of the argument. In doing so,
we refer to three technical lemmata whose proofs are postponed to the end of this
section. The argument from [1] is advantageous here since it does not require any
a priori knowledge on the spectrum of HN . It rather gives the possibility to prove
the exponential localization of spectral projections directly and to infer results
on the nature of the spectrum from the localization estimate. In particular, the
argument avoids the use of eigenvalue equations which are, for instance, exploited
in Agmon type estimates.
Throughout this section we always assume that the assumptions of Theorem 2.1
are fulfilled.

Proof of Theorem 2.1. Since $H_N$ is bounded from below we may suppose that
$\inf I > -\infty$. By assumption we have $\sup I < E^A_{N-1} + 1$. Moreover, we consider $H_N$
as an operator on the unprojected $N$-particle space $\mathscr{H}_N$. In this case we have to
keep in mind that $0$ becomes an infinitely degenerate eigenvalue of $H_N$. Our goal
is to show that there are $b, C \in (0,\infty)$ such that $\int_{\mathbb{R}^{3N}} e^{2b|X|}\,|\Psi(X)|^2\,dX \le C$, for all
normalized $\Psi \in \mathrm{Ran}(E_I(H_N))$ such that $\Psi = \mathcal{A}_N\Psi$ and $\Psi = \Lambda^{+,N}_{A,V}\Psi$. Borrowing an
idea from [38] we simplify the problem by using the bounds

\[
e^{2b|X|} \le \max_{j=1,\dots,N} e^{2b\sqrt{N}\,|x_j|} \le \sum_{j=1}^{N} e^{2b\sqrt{N}\,|x_j|},
\qquad X = (x_1,\dots,x_N) \in (\mathbb{R}^3)^N,
\]

and the anti-symmetry of $\Psi = \mathcal{A}_N\Psi$. (We are not aiming to derive good estimates
on the decay rate here.) Indeed, it suffices to show that there exist $a, C' \in (0,\infty)$
such that

\[
\int_{\mathbb{R}^{3N}} e^{2a|x_N|}\,|\Psi(X)|^2\,dX \le C', \tag{4.1}
\]

for all $\Psi = \mathcal{A}_N\Psi = \Lambda^{+,N}_{A,V}E_I(H_N)\Psi$, $\|\Psi\| = 1$. Then Theorem 2.1 holds true with
$b = a/\sqrt{N}$. Furthermore, it suffices to show that (4.1) holds true with $a|x_N|$ replaced
by $F(x_N)$, for every (bounded) $F : \mathbb{R}^3 \to \mathbb{R}$ satisfying (3.24). This is in fact an
obvious consequence of the monotone convergence theorem applied to the integrals
$\int_{\mathbb{R}^{3N}} e^{2F_n(x_N)}\,|\Psi(X)|^2\,dX$ with $\Psi$ as above, where $F_1, F_2, \dots$ is a suitable increasing
sequence of functions satisfying (3.24) and converging to $a|x_N|$. Therefore, it suffices
to find some $a > 0$ such that

\[
\big\| (\mathcal{A}_{N-1} \otimes e^{F})\, E_I(H_N)\, (\mathcal{A}_{N-1} \otimes 1)\, \Lambda^{+,N}_{A,V} \big\| < \infty, \tag{4.2}
\]

for every $F$ satisfying (3.24), where $\mathcal{A}_{N-1}$ denotes anti-symmetrization of the first
$N-1$ variables and $e^{F}$ acts only on the $N$th electron variable.
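
The reduction from the full weight $e^{2b|X|}$ to a weight in the last variable only can be spelled out in a few lines. The following is a sketch of that elementary step; the symbols $\Psi$ and $C'$ are the wave function and the constant of (4.1), and the only inputs are $|X|^2 = \sum_j |x_j|^2$ and the permutation symmetry of $|\Psi|^2$.

```latex
% Step 1: compare the Euclidean norm on R^{3N} with the largest one-particle norm.
\[
  |X| = \Big( \sum_{j=1}^{N} |x_j|^2 \Big)^{1/2}
      \le \sqrt{N}\, \max_{j=1,\dots,N} |x_j| .
\]
% Step 2: bound the maximum by the sum and use that |\Psi|^2 is invariant
% under permutations of the electron variables (anti-symmetry of \Psi).
\[
  \int_{\mathbb{R}^{3N}} e^{2b|X|}\, |\Psi(X)|^2 \, dX
  \le \sum_{j=1}^{N} \int_{\mathbb{R}^{3N}} e^{2b\sqrt{N}|x_j|}\, |\Psi(X)|^2 \, dX
  = N \int_{\mathbb{R}^{3N}} e^{2b\sqrt{N}|x_N|}\, |\Psi(X)|^2 \, dX .
\]
% Step 3: choose b = a/\sqrt{N}, so that 2b\sqrt{N} = 2a, and apply (4.1).
\[
  \int_{\mathbb{R}^{3N}} e^{2b|X|}\, |\Psi(X)|^2 \, dX \le N\, C' =: C .
\]
```

The factor $N$ is harmless because no attempt is made to optimize the decay rate.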
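We start by introducing a comparison operator. To this end we pick some
$\chi \in C^\infty(\mathbb{R}^3,[0,1])$ such that $\chi \equiv 1$ outside $B_2(0)$ and $\chi \equiv 0$ on $B_1(0)$ and set $\chi_R :=
\chi(\cdot/R)$ and $\bar\chi_R := 1 - \chi_R$, for $R \ge 1$. Furthermore, we define orthogonal projections

\[
P_{N-1} := \mathcal{A}_{N-1}\,\Lambda^{+,N-1}_{A,V}, \qquad P^\perp_{N-1} = 1_{\mathscr{H}_{N-1}} - P_{N-1},
\]
\[
Q_N := (\mathcal{A}_{N-1} \otimes 1)\,\Lambda^{+,N}_{A,V}, \qquad Q^\perp_N = 1_{\mathscr{H}_N} - Q_N .
\]

Then the comparison operator is defined, a priori on the domain $\mathscr{D}_N \subset \mathscr{H}_N$, by

\begin{align*}
\widetilde H_N :={}& Q_N H_N Q_N + H^A_{N-1} \otimes \Lambda^-_{A,V} + E^A_{N-1}\, P^\perp_{N-1} \otimes 1 \\
& + P_{N-1} \otimes \Lambda^+_{A,V} (1 - E^A_1)\,\bar\chi_R\, \Lambda^+_{A,V} + Q^\perp_N \\
={}& H^A_{N-1} \otimes 1 + E^A_{N-1}\, P^\perp_{N-1} \otimes 1 \tag{4.3} \\
& + P_{N-1} \otimes \Lambda^+_{A,V} \{ D_{A,V_C} + (1 - E^A_1)\,\bar\chi_R \} \Lambda^+_{A,V} + Q^\perp_N \tag{4.4} \\
& + \sum_{i=1}^{N-1} Q_N W_{iN} Q_N . \tag{4.5}
\end{align*}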

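We denote the Friedrichs extension of $\widetilde H_N$ again by the same symbol. (The idea to
introduce an additional cut-off function $\bar\chi_R$ in (4.4) to compensate for the Coulomb
singularity in the last variable $x_N$ is borrowed from [19]; together with the other
additional terms in (4.3) and (4.4) it ensures that the spectrum of $\widetilde H_N$ stays away
from the interval $I$.) Notice that on $\mathscr{D}_N$ we have $H^A_{N-1} \otimes 1 + E^A_{N-1} P^\perp_{N-1} \otimes 1 \ge
E^A_{N-1}\, 1_{\mathscr{H}_N}$. Furthermore, Lemma 4.3 below implies that

\[
P_{N-1} \otimes \Lambda^+_{A,V} \{ D_{A,V_C} + (1 - E^A_1)\,\bar\chi_R \} \Lambda^+_{A,V} + Q^\perp_N \ge 1 - o(1)\, P_{N-1} \otimes 1,
\]

as $R$ tends to infinity. We now pick some $\varepsilon > 0$ with $\sup I < E^A_{N-1} + 1 - \varepsilon$. Then
the above remarks imply

\[
\langle \Psi \,|\, \widetilde H_N \Psi \rangle \ge (E^A_{N-1} + 1 - \varepsilon/2)\, \|\Psi\|^2, \qquad \Psi \in \mathscr{D}_N, \tag{4.6}
\]

for all sufficiently large $R \ge 1$. Next, we define

\[
H'_N := Q_N H_N Q_N + H^A_{N-1} \otimes \Lambda^-_{A,V} .
\]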

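Then $\widetilde H_N$ and $H'_N$ have the same domain since they differ by a bounded operator
on their common form core $\mathscr{D}_N$. We further pick some $\chi_I \in C_0^\infty(\mathbb{R},[0,1])$ such that
$\chi_I \equiv 1$ on $I$ and $\mathrm{supp}(\chi_I) \subset (-\infty, E^A_{N-1} + 1 - \varepsilon)$. Then $\chi_I(\widetilde H_N) = 0$ by (4.6). As
in [1] we now observe that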

QN EI (HN )QN = QN EI (HN )QN = ( 
I (HN ) N ))QN .
I (H (4.7)

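We preserve the symbol $\chi_I$ to denote an almost analytic extension of $\chi_I$ (see, e.g.,
[15]) to a smooth, compactly supported function on the complex plane satisfying

\[
\mathrm{supp}(\chi_I) \subset (-\infty, E^A_{N-1} + 1 - \varepsilon) + i(-\delta,\delta), \qquad
\partial_{\bar z}\chi_I(z) = O_N(|\Im z|^N), \quad N \in \mathbb{N}, \tag{4.8}
\]

where $\partial_{\bar z} = \frac12(\partial_{\Re z} + i\,\partial_{\Im z})$. Here we may choose $\delta > 0$ as small as we please. We
shall apply the Helffer-Sjöstrand formula (see, e.g., [15]),

\[
\chi_I(T) = \int_{\mathbb{C}} (z - T)^{-1}\, d\chi_I(z), \qquad
d\chi_I(z) := -\frac{i}{2\pi}\, \partial_{\bar z}\chi_I(z)\, dz \wedge d\bar z,
\]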

which holds for every self-adjoint operator $T$ on some Hilbert space. By means
of (4.7), we then find the representation

QN EI (HN )QN = 
[(HN N z)1 ]d
z)1 (H I (z)QN . (4.9)
C

For some $F$ as in (3.24) (which acts only on the last variable in what follows), we
abbreviate

\[
\Lambda^{F,N}_{A,V} := e^{F}\, \Lambda^{+,N}_{A,V}\, e^{-F} .
\]

Then (4.9) and the second resolvent identity together with the trivial identities
$Q^\perp_N Q_N = 0 = (P^\perp_{N-1} \otimes 1) Q_N$ yield

(AN 1 eF )EI (HN )(AN 1 1)+,N


A,V

eF (H N z)1 PN 1 {+ (1 E A ) + }
A,V 1 R A,V
C

QN (HN z)1 QN |d I (z)|

(1 E1A ) eF (H N z)1 eF F,N eF R |d
I (z)|
C
A,V
|z|

Ca,R eF (H N z)1 eF |d
I (z)|
. (4.10)
C |z|

In the last step, we apply Proposition 3.1 and $\|e^{F}\bar\chi_R\| \le e^{2aR}$. By (4.8)
$|d\chi_I(z)|/|\Im z|$ is a finite measure. To conclude the proof of Theorem 2.1 it thus
remains to show that the norm of $e^{F}(\widetilde H_N - z)^{-1}e^{-F}$ is uniformly bounded in all
$z \in \mathrm{supp}(\chi_I)\setminus\mathbb{R}$ and $F$ satisfying (3.24). This is done in the rest of this proof.
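Since $F$ satisfies (3.24) we know that $1_{N-1} \otimes e^{F}$ is an isomorphism on $\mathscr{H}_N$.
Therefore, the densely defined operators $e^{F}\widetilde H_N e^{-F}$ and $\widetilde H_N$ have the same resolvent
set and

\[
R_F(z) := e^{F}(\widetilde H_N - z)^{-1}e^{-F} = (e^{F}\widetilde H_N e^{-F} - z)^{-1}, \qquad z \in \varrho(\widetilde H_N). \tag{4.11}
\]

In particular, $e^{F}\widetilde H_N e^{-F}$ is closed because its resolvent set is not empty. Using the
identity $(R_F(z)^{-1})^* = (R_F(z)^*)^{-1}$ we readily verify that $(e^{F}\widetilde H_N e^{-F})^* = e^{-F}\widetilde H_N e^{F}$.
Since $e^{\pm F}$ maps $\mathscr{D}_N$ into itself we further have

\[
\mathscr{D}_N \subset D(e^{F}\widetilde H_N e^{-F}) = e^{F} D(\widetilde H_N) \subset e^{F} Q(\widetilde H_N). \tag{4.12}
\]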

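The following two lemmata, whose proofs are postponed to the end of this section,
show that $e^{F}\widetilde H_N e^{-F}$ is a small form perturbation of $\widetilde H_N$. We define $T : \mathscr{D}_N \to
\mathscr{H}_N$ by

\[
T\Psi := e^{F}\widetilde H_N e^{-F}\Psi - \widetilde H_N\Psi, \qquad \Psi \in \mathscr{D}_N. \tag{4.13}
\]

Lemma 4.1. Assume that $F : \mathbb{R}^3 \to \mathbb{R}$ satisfies (3.24). Then we have, as $a > 0$
tends to zero,

\[
|\Re\langle \Psi \,|\, T\Psi \rangle| \le a\,\langle \Psi \,|\, \widetilde H_N \Psi \rangle + O(a)\,\|\Psi\|^2, \qquad \Psi \in \mathscr{D}_N. \tag{4.14}
\]

Lemma 4.2. There exist constants $c_1, c_2 \in (0,\infty)$ such that, for all $F : \mathbb{R}^3 \to \mathbb{R}$
satisfying (3.24) and all $\Psi \in \mathscr{D}_N$,

\[
|\langle e^{\pm F}\Psi \,|\, \widetilde H_N e^{\pm F}\Psi \rangle| \le c_1 \|e^{F}\|^2 \langle \Psi \,|\, \widetilde H_N \Psi \rangle + c_2 \|e^{F}\|^2 \|\Psi\|^2 . \tag{4.15}
\]

In particular, $e^{\pm F} Q(\widetilde H_N) \subset Q(\widetilde H_N)$.

If $a < 1/2$ then Lemma 4.1 implies that $(e^{F}\widetilde H_N e^{-F})\restriction_{\mathscr{D}_N}$ has a distinguished,
sectorial, closed extension, $\widetilde H^F_N$, that is the only closed extension having the
properties $D(\widetilde H^F_N) \subset Q(\widetilde H_N)$, $D((\widetilde H^F_N)^*) \subset Q(\widetilde H_N)$, and $i\eta \in \varrho(\widetilde H^F_N)$, for all $\eta \in \mathbb{R}$ with
sufficiently large absolute value; see [31]. Thanks to (4.11), (4.12) and Lemma 4.2,
we know that $e^{F}\widetilde H_N e^{-F}$ is a closed extension enjoying these properties, whence

\[
\widetilde H^F_N = e^{F}\widetilde H_N e^{-F} .
\]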

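We are now prepared to derive a uniform bound on the norm under the integral
sign in (4.10). For $z \in \mathrm{supp}(\chi_I)$ and $\Psi \in \mathscr{D}_N$, we obtain

\begin{align*}
\Re\langle \Psi \,|\, (\widetilde H^F_N - z)\Psi \rangle
&= \Re\langle \Psi \,|\, (\widetilde H_N - z)\Psi \rangle + \Re\langle \Psi \,|\, T\Psi \rangle \\
&\ge (1-a)\,\Big\langle \Psi \,\Big|\, \Big( \widetilde H_N - \frac{\Re z}{1-a} \Big) \Psi \Big\rangle - O(a)\,\|\Psi\|^2 . \tag{4.16}
\end{align*}

By (4.6) and (4.8), we thus find $a \in (0,1/2)$ and $R \in [1,\infty)$ such that, for all
$z \in \mathrm{supp}(\chi_I)$ and $\Psi \in \mathscr{D}_N$,

\[
\Re\langle \Psi \,|\, (\widetilde H^F_N - z)\Psi \rangle \ge \frac{\varepsilon}{4}\,\|\Psi\|^2 .
\]

This inequality implies that, for $z \in \mathrm{supp}(\chi_I)$, the numerical range of $\widetilde H^F_N - z$
is contained in the half space $\{\zeta \in \mathbb{C} : \Re\zeta \ge \varepsilon/4\}$ [31, Theorem VI.1.18 and
Corollary VI.2.3]. Moreover, by (4.11) the deficiency of $\widetilde H^F_N - z$ is zero, for all
$z \in \mathbb{C}\setminus\mathbb{R}$, and we may hence estimate the norm of $(\widetilde H^F_N - z)^{-1}$ by the inverse
distance of $z$ to the numerical range of $\widetilde H^F_N$ [31, Theorem V.3.2]. We thus arrive at

\[
\|(\widetilde H^F_N - z)^{-1}\| \le \frac{4}{\varepsilon}, \qquad z \in \mathrm{supp}(\chi_I),
\]

which together with (4.10) proves Theorem 2.1.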
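
The passage from the lower bound on the real part to the resolvent bound is a one-line argument once surjectivity is known. The sketch below is only an illustration: $S$ stands for $\widetilde H^F_N - z$ (the conjugated comparison operator shifted by $z$), $c$ for the constant $\varepsilon/4$, and $\varepsilon$ for the gap parameter chosen before (4.6); these names are assumptions about notation, not quotations.

```latex
% Assume Re <Psi | S Psi> >= c ||Psi||^2 for all Psi in the domain of S, with c > 0.
% Cauchy-Schwarz turns the form bound into a lower bound on S itself:
\[
  \|S\Psi\|\,\|\Psi\| \;\ge\; |\langle \Psi \,|\, S\Psi \rangle|
  \;\ge\; \Re\langle \Psi \,|\, S\Psi \rangle \;\ge\; c\,\|\Psi\|^2
  \quad\Longrightarrow\quad \|S\Psi\| \ge c\,\|\Psi\| .
\]
% Hence S is injective with closed range. If in addition the deficiency of S
% is zero (S is onto), then S^{-1} is everywhere defined and
\[
  \|S^{-1}\| \le \frac{1}{c} , \qquad\text{i.e. } \|S^{-1}\| \le \frac{4}{\varepsilon}
  \ \text{ for } c = \varepsilon/4 .
\]
% Where c = epsilon/4 comes from: on supp(chi_I) one has Re z < E + 1 - epsilon,
% with E := E^A_{N-1}, while (4.6) gives <Psi | H Psi> >= (E + 1 - epsilon/2) ||Psi||^2. Thus
\[
  (1-a)\,(E + 1 - \varepsilon/2) - \Re z - O(a)
  \;\ge\; \frac{\varepsilon}{2} - a\,|E + 1 - \varepsilon/2| - O(a)
  \;\ge\; \frac{\varepsilon}{4}
  \quad\text{for all sufficiently small } a > 0 .
\]
```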

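Lemma 4.3. For every sufficiently large $R \ge 1$, there is some $c_R \in (0,\infty)$ such
that $c_R \to 0$, as $R \to \infty$, and, for all $\varphi \in \mathscr{D}$,

\[
\langle \varphi \,|\, \Lambda^+_{A,V} [ D_{A,V_C} + (1 - E^A_1)\,\bar\chi_R ] \Lambda^+_{A,V}\, \varphi \rangle \ge \|\Lambda^+_{A,V}\varphi\|^2 - c_R\,\|\varphi\|^2 . \tag{4.17}
\]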

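Proof. To begin with we introduce a scaled partition of unity. Namely, we pick
some $\vartheta \in C_0^\infty(\mathbb{R}^3,[0,1])$ such that $\vartheta \equiv 1$ on $B_2(0)$ and observe that $\theta := \vartheta^2 +
(1-\vartheta)^2$ is strictly positive. We further set, for $R \ge 1$ and $x \in \mathbb{R}^3$, $\mu_1(x) \equiv
\mu_{R,1}(x) := \vartheta(x/R)/\theta^{1/2}(x/R)$, and $\mu_2(x) \equiv \mu_{R,2}(x) := (1 - \vartheta(x/R))/\theta^{1/2}(x/R)$, so
that $\mu_1^2 + \mu_2^2 = 1$. Since $\mu_1\nabla\mu_1 + \mu_2\nabla\mu_2 = \nabla(\mu_1^2 + \mu_2^2)/2 = 0$ it follows that, for
$\varphi \in \mathscr{D}$,

\begin{align*}
\langle \varphi \,|\, \Lambda^+_{A,V} [ D_{A,V_C} + (1 - E^A_1)\,\bar\chi_R ] \Lambda^+_{A,V}\, \varphi \rangle
&= \sum_{j=1,2} \langle \varphi \,|\, \Lambda^+_{A,V} [ \mu_j D_{A,V_C} \mu_j + (1 - E^A_1)\,\mu_j^2\,\bar\chi_R ] \Lambda^+_{A,V}\, \varphi \rangle \\
&=: \sum_{j=1,2} Y_j . \tag{4.18}
\end{align*}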
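
The localization identity behind (4.18) uses only that the Dirac operator is of first order. A short worked version follows, as a sketch: $\mu_1,\mu_2$ denote the two smooth functions of the partition of unity with $\mu_1^2+\mu_2^2=1$, $g$ is any bounded multiplication operator (here the cut-off term), and $D_{A,V_C} = \alpha\cdot(-i\nabla + A) + \beta + V_C$ as in (1.1); the symbol names are chosen for illustration.

```latex
% Commuting a smooth multiplier mu_j through the first-order operator D:
\[
  D_{A,V_C}\,\mu_j\,\varphi
  = \mu_j\, D_{A,V_C}\,\varphi \;-\; i\,(\alpha\cdot\nabla\mu_j)\,\varphi .
\]
% Multiply by mu_j from the left and sum over j = 1, 2:
\[
  \sum_{j=1,2} \mu_j\, D_{A,V_C}\, \mu_j\,\varphi
  = \Big(\sum_{j=1,2}\mu_j^2\Big) D_{A,V_C}\,\varphi
    \;-\; i\,\alpha\cdot\Big(\sum_{j=1,2}\mu_j\nabla\mu_j\Big)\varphi
  = D_{A,V_C}\,\varphi ,
\]
% because sum_j mu_j^2 = 1 and sum_j mu_j grad mu_j = grad(mu_1^2 + mu_2^2)/2 = 0.
% Multiplication operators commute with mu_j, so for a bounded multiplier g
\[
  \sum_{j=1,2} \mu_j^2\, g = g .
\]
```

In contrast to the Schrödinger case, no localization error of the form $\sum_j|\nabla\mu_j|^2$ appears, since the operator is linear in the gradient.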

To treat the summand with $j = 1$ we use that, by construction, $\mu_1\bar\chi_R = \mu_1$, for
every $R \ge 1$. Taking also Corollary 3.2 and (2.16) into account we find, for all $R \ge 1$
and D,

Y1 (1 1/R)1 | + A + + 2
A,V [DA,VC E1 ]A,V 1  + 1 A,V O(1/R)
2

1 + 2 2
A,V O(1/R) . (4.19)

We next turn to the summand with j = 2 in (4.18) where 22 R 0. Applying


successively Corollaries 3.2 and 3.5, Proposition 3.1, and Lemma 3.8 we deduce
that, for all D and every > 0,

 | + + + +
A,V 2 DA,VC 2 A,V  (1 ) | 2 A DA A 2  o (1)
2

(1 ) + 2
A 2 o (1)
2

(1 )2 2 + 2
A o (1)
2

(1 )3 2 + 2 2
A,V o (1) , (4.20)

as R . We conclude by combining (4.18)(4.20) and using 21 + 22 = 1.

Proof of Lemma 4.1. We have to study the contribution to T = eF H N eF H


N
F
coming from each term in (4.3)(4.5). The terms in (4.3) commute with e and
hence give no contribution. In order to estimate the contribution coming from the
left term in (4.4) we rst observe that Corollary 3.1 and (3.32) imply the following
identities on D,

eF + +
A,V DA,VC A,V e
F
A,V (DA,VC + i F )A,V
= F F

A,V DA,VC A,V + O(a)


= F F

= (1 + a)+ +
A,V DA,VC A,V + O(a).

The term in (4.4) involving the cut-o function R yields a contribution of order
O(a), too, due to Corollary 3.2 (with L = (1 E1A )R and = 1). To account for
+
the projection on the right in (4.4) we write QN = 1HN PN 1 A,V and use
F F
Proposition 3.1 to obtain e QN e QN = O(a). Finally, we apply Corollary 3.4
to all terms in (4.5) this is the only place in this section where we use the
assumption that B is bounded and arrive at
   !
 
N 1
 (N )
| | T | a  QN DA,VC + WiN QN + O(a) 2 .

i=1

A A
Since HN 1 EN 1 this completes the proof of Lemma 4.1.

Proof of Lemma 4.2. We drop the $\pm$-signs in (4.15) since they do not play
any role in this proof. It is clear that we only have to comment on those terms in
(4.3)-(4.5) that involve unbounded operators. Since $H_{N-1} \otimes 1$ commutes with $e^{F}$
and since HN 1 ENA1 we rst nd, for DN ,

eF | (HN 1 1 ENA1 PN1 1)eF 


e2F  | (HN 1 1 ENA1 PN1 1). (4.21)

By virtue of Proposition 3.2 we can estimate | | eF QN WiN QN eF |, for


DN , as
1/2 1/2 1/2
WiN QN eF 2 2 eF WiN QN 2 + 2 WiN [eF , QN ]eF eF 2
1/2
2 eF 2 WiN QN 2 + O(a2 ) eF 2 2 . (4.22)

(If B is unbounded, then the O-symbol in (4.22) depends on the supremum of |B|
on supp(F ).) It remains to prove that there are constants c3 , c4 (0, ) such
that

 | eF + + F
A,V DA,VC A,V e 

c3 eF 2  | + + F 2 2
A,V DA,VC A,V  + c4 e , (4.23)

for D. Moreover, since VH and VE are bounded it suces to prove this estimate
with DA,VC replaced by D := DA,V e0 , which is positive on the range of + .
A,V
We abbreviate := A,V in the rest of this proof and seek for bounds on both
terms on the right side of

+ )1/2 eF 2 2
(D + )1/2 eF 2 .
(D (4.24)
=

Here the norm with  = equals (D + )1/2 [eF , ] and is not greater than
some O(a) eF due to Proposition 3.1. We next dene

+ + 1 1.
:= + (DA,V e0 )+ + 1 = + D
D

+ )1/2 D
In fact, because of (D 1/2 1 and

1/2 eF + 2 eF D
D 1/2 + 2 + [D
1/2 , eF ]+ 2

eF 2 + D
1/2 2 + D
1/2 [D
1/2 , eF ]+ D
1/2 2

we shall see that (4.23) holds true as soon as we have shown that

1/2 [D
D 1/2 , eF ]+ = O(a) eF . (4.25)

To check whether (4.25) is correct we rst note that, on D,

eF ] + [+ , eF ]D
eF ] = + [D,
[D,

= + i F eF + ([VE , eF ]eF )eF + [+ , eF ]D. (4.26)

We apply the norm-convergent integral representation

\[
T^{-1/2} = \frac{1}{\pi} \int_0^\infty \frac{1}{T + t}\, \frac{dt}{\sqrt{t}}, \tag{4.27}
\]

which holds for any strictly positive operator, $T$, on some Hilbert space. For $\varphi,
\psi \in \mathscr{D}$, it implies
   
1/2 1/2 F + 1 1/2  1 F + dt
D | [D
, e ]  = D  [D, e ] . (4.28)
0
D+t
D+t t
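
As a sanity check of the normalization in (4.27), and to indicate how it produces commutator formulas of the type used here, one can work out the scalar case and a formal commutator identity. This is a sketch under simplifying assumptions: $\lambda > 0$ is a number standing in for a spectral value of $T$, $B$ is a bounded operator preserving the domain of $T$, and domain questions are ignored.

```latex
% Scalar case: substitute t = lambda s^2, so dt = 2 lambda s ds and sqrt(t) = sqrt(lambda) s.
\[
  \frac{1}{\pi}\int_0^\infty \frac{1}{\lambda + t}\,\frac{dt}{\sqrt{t}}
  = \frac{1}{\pi}\int_0^\infty \frac{2\lambda s\,ds}{\lambda(1+s^2)\,\sqrt{\lambda}\,s}
  = \frac{2}{\pi\sqrt{\lambda}}\int_0^\infty \frac{ds}{1+s^2}
  = \lambda^{-1/2} .
\]
% Formal commutator identity: write T^{1/2} = T T^{-1/2} and T (T+t)^{-1} = 1 - t (T+t)^{-1},
% and use [ (T+t)^{-1}, B ] = - (T+t)^{-1} [T, B] (T+t)^{-1}:
\[
  [T^{1/2}, B]
  = \frac{1}{\pi}\int_0^\infty \big[\, 1 - t\,(T+t)^{-1},\, B \,\big]\,\frac{dt}{\sqrt{t}}
  = \frac{1}{\pi}\int_0^\infty \frac{1}{T+t}\,[T, B]\,\frac{1}{T+t}\,\sqrt{t}\;dt .
\]
```

The second integral converges in norm as soon as $[T,B]$ is controlled well enough for the integrand to decay faster than $t^{-3/2}$ at infinity.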
We estimate the contribution of the rst term on the right side of (4.26) to (4.28) as

  
  + F
+  O(a) eF
 D 1/2  i F e  , t 0. (4.29)
 D +t D +t  (1 + t)3/2

In view of (2.7) the second term in (4.26) can be dealt with similarly. To account
for the third term in (4.26) we apply Proposition 3.1 and obtain, for t 0,
  !
  1  O(a) eF
  1/2 + F D + 
  {D
[ , e ]e }e
F F
 . (4.30)
 D +t +t
D  1+t

Equations (4.28)-(4.30) show that (4.25) holds true, which completes the proof of
Lemma 4.2.

5. The Lower Bound on $\inf \sigma_{\mathrm{ess}}(H^A_N)$
In order to prove the hard part of the HVZ theorem, Theorem 2.2(ii), we employ
an idea we learned from [19]: One may use a localization estimate for spectral
projections to prove their compactness. Of course one might try to derive a lower
bound on the ionization threshold by a more direct argument, for instance, by
following the general strategy presented in [33]. Since we have already derived an
exponential localization estimate we find it, however, more convenient here to adapt
the observation from [19] to our non-local model. Another advantage of the proof
below is that we can work with the square root of $H_N$. This is important since only
form bounds on perturbations of $H_N$ are available.

Theorem 5.1. Let the assumptions of Theorem 2.2(ii) be fulfilled and let $I \subset \mathbb{R}$
be an interval with $\sup I < 1 + E^A_{N-1}$. Then the spectral projection $E_I(H^A_N)$ is a compact
operator on $\mathcal{A}_N\mathscr{H}^+_N$. In particular,
\[
\sigma_{\mathrm{ess}}(H^A_N) \subset [1 + E^A_{N-1}, \infty).
\]

A
Proof. Let g C(R, (0, )) satisfy g(r) , r , and g(|X|)EI (HN )
L (AN HN ) and set h := 1/g. We let R denote a smoothed characteristic function
of the closed ball in R3 with radius R > 0 and center 0 and set R (X) :=
R (x1 ) R (xN ), for X = (x1 , . . . , xN ) (R3 )N . First, we argue that it suces to
A
show that EI (HN )R (+1)1/8 is a (densely dened) bounded operator from HN
+
to AN HN . In fact, let us assume that this is the case. Since ( + 1)1/8 h(|X|)
A
is compact and g(|X|)EI (HN ) is bounded, it then follows that
A A
EI (HN )[R h(|X|)]g(|X|)EI (HN )
A
= EI (HN )R ( + 1)1/8 [( + 1)1/8 h(|X|)]g(|X|)EI (HN
A
)

is compact. Since R h(|X|) converges to h(|X|) in the operator norm, as R


A A A
tends to innity, it further follows that EI (HN ) = EI (HN )h(|X|)g(|X|)EI (HN )
is compact, too.
A
(j)
To verify that EI (HN )R ( + 1)1/8 is bounded we set S := 1 + N j=1 |DA,V |
and write, for some suciently large c > 0,
A
EI (HN )R ( + 1)1/8
A
= EI (HN )(HN + c)1/2 {(HN + c)1/2 +,N
A,V S
1/2
}{S 1/2 R ( + 1)1/8 }.
(5.1)

Here the left curly bracket in (5.1) is a bounded operator from HN to HN+ since
+,N +,N
A,V SA,V HN + c, provided c is large enough, due to the positivity of the
interaction potentials and the boundedness of VH and VE . To see that the right
curly bracket in (5.1) is a bounded operator in HN we rst notice that it is a
restriction of S 1/2 T , where T := ( + 1)1/8 R is closed. It thus remains to
show that T S 1/2 = T S 1/2 = (S 1/2 T ) belongs to L (HN ). To this end
(i) (i)
we recall that ((i) + 1)1/4 R (|DA,V | + 1)1 is bounded on L2 (R3i , C4 ) since
1/2
D(DA,V ) Hloc (R3 , C4 ). It follows that
(i) (i)
((i) + 1)1/4 R S 1 = ((i) + 1)1/4 R (|DA,V | + 1)1 (|DA,V | + 1)S 1

is bounded, for i = 1, . . . , N , and, hence, R ( + 1)1/4 R S 1 L (HN ). Since


R ( + 1)1/4 R is a restriction of T T we see that T T S 1 L (HN ), which
implies |T |S 1/2 L (HN ) and, hence, T S 1/2 L (HN ).

6. Weyl Sequences
In this section, we prove the easy part of our HVZ theorem, namely Part (i) of
Theorem 2.2 asserting that
\[
\sigma_{\mathrm{ess}}(H^A_N) \supset [E^A_{N-1} + 1, \infty).
\]
This is done by constructing suitable Weyl sequences for $H^A_N$. The difficulties we
encounter are similar to those in [40] where the Brown-Ravenhall model (free picture
without magnetic field) is considered. We have, however, to replace those
arguments in [40] that require explicit momentum or position space representations
of the free projection $\Lambda^+_0$ by more abstract ones; see, e.g., Lemma 6.2. Another
new technical complication is caused by the related facts that $\Lambda^+_{A,V}$ maps the dense
subspace $\mathscr{D}$ merely into $H^{1/2}$ when $V$ has a strong Coulombic singularity and that,
compared to the free picture, it is more difficult to control the singularities of the
interaction potentials. For this reason we shall eventually study the square root of
$H_N$ rather than $H_N$ itself.
We fix some spectral parameter $\lambda \ge 1$ throughout the whole section and
$\{\psi_n\}_{n\in\mathbb{N}}$ will always denote a corresponding Weyl sequence as in Hypothesis 2.4(i).
In this and the following section we shall repeatedly employ the following
sequence of cut-off functions: We pick some $\zeta \in C^\infty(\mathbb{R},[0,1])$ such that $\zeta \equiv 0$
on $(-\infty, 1 - \delta/4]$ and $\zeta \equiv 1$ on $[1,\infty)$. Here $\delta \in (0,1)$ is a fixed parameter whose
value becomes important only in Sec. 7. We set $\zeta_n := \zeta(|x|/R_n)$, for $x \in \mathbb{R}^3$ and
$n \in \mathbb{N}$, where $R_n$ is given by Hypothesis 2.4(i). Then it holds $\zeta_n\psi_n = \psi_n$ and
$\|\nabla\zeta_n\|_\infty = \|\nabla\zeta\|_\infty R_n^{-1} \to 0$, as $n \to \infty$.
To begin with we draw two simple conclusions from our hypotheses:

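Lemma 6.1. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V$ fulfill Hypotheses 2.4(i) and
2.2, respectively. Then

\[
\lim_{n\to\infty} \|(D_{A,V} - \lambda)\psi_n\| = 0, \qquad \lim_{n\to\infty} \|(D_{A,V_C} - \lambda)\Lambda^+_{A,V}\psi_n\| = 0. \tag{6.1}
\]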

Proof. The first identity is clear from the hypotheses. To treat the second we
employ the cut-off functions defined in the paragraph preceding the statement of
this lemma and abbreviate $V_{HE} := V_H + V_E$. By means of Proposition 3.1 and
$\|V_{HE}\,\zeta_n\| \to 0$ we then obtain

\[
\|V_{HE}\,\Lambda^+_{A,V}\psi_n\| \le \|V_{HE}\,\zeta_n\|\,\|\Lambda^+_{A,V}\psi_n\| + \|V_{HE}\|\,\|[\Lambda^+_{A,V}, \zeta_n]\psi_n\| \to 0,
\]

as $n$ tends to infinity. Therefore, the second identity follows from the first.

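Lemma 6.2. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V$ fulfill Hypotheses 2.4(i) and 2.2,
respectively. Let $\varepsilon > 0$ and set $I := (\lambda - \varepsilon, \lambda + \varepsilon)$. Then we have, as $n$ tends to
infinity,

\[
\|E_I(D_{A,V})\psi_n\| \to 1, \qquad\text{in particular,}\qquad \|\Lambda^+_{A,V}\psi_n\| \to 1. \tag{6.2}
\]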

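Proof. Clearly, $\|E_I(D_{A,V})\psi_n\| \le 1$ since $\psi_n$ is normalized. Suppose that there
is some $\eta > 0$ such that $\liminf \|E_I(D_{A,V})\psi_n\|^2 \le 1 - \eta$. Then we have
$\lim \|E_I(D_{A,V})\psi_n\|^2 \le 1 - \eta$, for an appropriate subsequence, and

\begin{align*}
\lim \|(D_{A,V} - \lambda)\psi_n\|^2
&\ge \varepsilon^2 \liminf \int_{\mathbb{R}} (1 - 1_I(s))\, d\|E_s(D_{A,V})\psi_n\|^2 \\
&= \varepsilon^2 \lim \big( 1 - \|E_I(D_{A,V})\psi_n\|^2 \big) \ge \varepsilon^2\,\eta > 0.
\end{align*}

This is a contradiction to (6.1).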
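
The lower bound used in this proof is a direct consequence of the spectral theorem. A sketch of the computation, writing $D$ for the self-adjoint operator $D_{A,V}$, $E_s$ for its spectral family, $\psi$ for a normalized vector, and $I = (\lambda-\varepsilon, \lambda+\varepsilon)$ (the names $\lambda$, $\varepsilon$ are notational assumptions for the spectral parameter and the half-width of $I$):

```latex
% Spectral representation of the norm:
\[
  \|(D - \lambda)\psi\|^2 = \int_{\mathbb{R}} (s - \lambda)^2 \, d\|E_s\psi\|^2 .
\]
% Outside I one has |s - lambda| >= epsilon; inside I the integrand is >= 0:
\[
  \int_{\mathbb{R}} (s - \lambda)^2 \, d\|E_s\psi\|^2
  \;\ge\; \varepsilon^2 \int_{\mathbb{R}} \big(1 - 1_I(s)\big)\, d\|E_s\psi\|^2
  \;=\; \varepsilon^2 \big( \|\psi\|^2 - \|E_I(D)\psi\|^2 \big) .
\]
% Consequently, for normalized psi,
\[
  \|E_I(D)\psi\|^2 \;\ge\; 1 - \varepsilon^{-2}\, \|(D - \lambda)\psi\|^2 ,
\]
% so ||(D - lambda) psi_n|| -> 0 forces ||E_I(D) psi_n|| -> 1.
```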



In the following we show that $E^A_{N-1} + \lambda$ belongs to $\sigma_{\mathrm{ess}}(H_N)$ by means of a
suitable Weyl sequence. Instead of applying Weyl's criterion directly to $H_N$ we shall,
however, use a slightly strengthened version of it in Lemma 6.3 (see, e.g., [13]) which
allows us to work with quadratic forms. As already mentioned above this is important
since, for instance, it seems that one cannot expect Proposition 3.2 to hold with
$W^{1/2}$ replaced by $W$. (At least not for large nuclear charges $e^2 Z \ge \sqrt{3}/2$.) To
construct the Weyl sequence we pick, for every $n \in \mathbb{N}$, some

n | H A n  < E A + 1 ,
+,N 1 N 1 N 1
n = AN 1 n A,V DN 1 such that n (6.3)

n = 1.
This is possible since HN 1 is dened as a Friedrichs extension starting from
+,N 1
A,V DN 1 . We further set

n (x) := |n (x, X  )|2 dX  . (6.4)
R3(N 2)

Next, we pick 0 < a < min{m, 0 }, r (0, 1 /4), and r (0, 1) such that
(1 r)a > (1 + r), s := r + r 1 > 0. (6.5)
Here appears in (2.18). We further pick some cut-o function, C (R, [0, 1]),
such that 0 on (, s/2] and 1 on [s, ). By Lemma 3.6 we know that
(1)
|D0 |1/2 n HN 1 , where the superscript (1) again indicates that the operator
acts on the rst variable. Therefore, we nd a subsequence, {Rkn }nN , of {Rk }kN
such that, for every n N,

(1) 1
||D0 |1/2 (x1 /Rkn )n (X)|2 dX < , (6.6)
R3(N 1) n
As a candidate for a Weyl sequence we then try {AN n }nN , where
+,N
n := n +
A,V kn A,V DN , n N. (6.7)
To simplify the notation we again write n instead of kn in the following. Finally,
we pick some c > 1 and set
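\[
f(t) := (t + c)^{-1/2}\,(t - E^A_{N-1} - \lambda), \qquad t > -c.
\]

Lemma 6.3. Let the assumptions of Theorem 2.2(i) be fulfilled. If, in the situation
described above, $c > 1$ is sufficiently large, then $\mathcal{A}_N\Psi_n \in D(f(H_N))$, for every
$n \in \mathbb{N}$, and
\[
\operatorname*{w-lim}_{n\to\infty} \mathcal{A}_N\Psi_n = 0, \qquad \liminf_{n\to\infty} \|\mathcal{A}_N\Psi_n\| > 0, \qquad \lim_{n\to\infty} \|f(H_N)\mathcal{A}_N\Psi_n\| = 0. \tag{6.8}
\]

In particular, $E^A_{N-1} + \lambda \in \sigma_{\mathrm{ess}}(H_N)$.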

Proof. First, suppose that (6.8) holds true. If $c > 1$ is chosen sufficiently large,
then $f$ is strictly monotonically increasing on $\sigma(H_N)$. If $I$ is some small open
interval around $E^A_{N-1} + \lambda$ we thus get $E_I(H_N) = E_{f(I)}(f(H_N))$. By (6.8) and the
Weyl criterion applied to $f(H_N)$ it follows that $\infty = \dim \mathrm{Ran}(E_{f(I)}(f(H_N))) =
\dim \mathrm{Ran}(E_I(H_N))$.
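
The monotonicity of $f$ claimed at the start of this proof follows from an explicit derivative. The computation below is a sketch that assumes $f(t) = (t+c)^{-1/2}(t-\mu)$ on $t > -c$, with the abbreviation $\mu := E^A_{N-1} + \lambda$ introduced here only for brevity.

```latex
% Differentiate f(t) = (t - mu) (t + c)^{-1/2} by the product rule:
\[
  f'(t) = (t+c)^{-1/2} - \tfrac12\,(t-\mu)\,(t+c)^{-3/2}
        = \frac{2(t+c) - (t-\mu)}{2\,(t+c)^{3/2}}
        = \frac{t + 2c + \mu}{2\,(t+c)^{3/2}} .
\]
% For t > -c the numerator satisfies t + 2c + mu > c + mu, hence
\[
  f'(t) > 0 \quad\text{for all } t > -c, \qquad\text{provided } c > -\mu .
\]
% If moreover c > -inf sigma(H_N), then sigma(H_N) lies in (-c, infinity), f is a strictly
% increasing continuous function there, and E_I(H_N) = E_{f(I)}(f(H_N)) for intervals I.
```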
To verify (6.8) we rst notice that n  0, as n , because of (2.17).
Exactly as in [40, 4] we can also check that lim inf AN n > 0. So it suces
to show that f (HN )n 0, as HN commutes with AN . Since n and n are
+,N
normalized and n = n + A,V n A,V DN we obtain

f (HN )n
1 1 1
(HN + c) 2 (HN 1 ENA1 ) 2 1H + (HN 1 ENA1 ) 2 n (6.9)
+ (DA,VC )+
A,V n (6.10)


N 1
1 1 1
+ (HN + c) 2 +,N 2 2 +
A,V WiN WiN (n A,V n ) . (6.11)
i=1

We rst show that the operator norm in (6.11) is actually nite. In fact,
1/2 1/2
(HN + c)1/2 +,N +,N
A,V WiN = WiN A,V (HN + c)
1/2
1,
+,N 1
since W 0, + +  + A
A,V DA,VC A,V C A,V , HN 1 EN 1 A,V , and, hence,
1/2
WiN +,N 2 +,N +,N
A,V =  | A,V WiN A,V   | (HN + c),

for +,N
A,V DN . Using similar estimates and (6.3) it is straightforward to check
that the term in (6.9) converges to zero provided c > 1 is suciently large. The
norm in (6.10) tends to zero by Lemma 6.1. The claim now follows from Lemma 6.4
below which implies that the remaining norm in (6.11) tends to zero, too.

The rst inequality of the following lemma is used in the proof of Lemma 6.3
and the second one in Sec. 7.

Lemma 6.4. There are , C (0, ) such that, for all n N,



1
W (x, y)n (y)|+ 2
A,V n (x)| d(x, y) sup W (x, y) + CeRn + .
R6 |xy| n
(1r  )Rn

If B is bounded, then there is some C  (0, ) such that, for all n N,



W (x, y)n (y)|+ 2
A,V n (x)| d(x, y)
R6

sup W (x, y) + 2 
A,V n + C (1 + B )e
aRn /2
|xy|
(1)Rn


+C n (y)dy(1 + B ) (DA,V + i)n 2 .
{|y|Rn /2}

Proof. For n N, we pick a weight function, Fn C (R3 , [0, )), with Fn 0


on R3 \BRn (0), Fn (1 r)aRn a on BrRn (0) and Fn a. Here a and r
are the parameters from (6.5) and a > 0 is some xed, n-independent constant.
Since n = n n and 1BrRn (0) n = 0 we obtain

1BrRn (0) (x)W (x, y)n (y)|+ 2
A,V n (x)| d(x, y)
{|xy|<(1r  )Rn }

1BrRn (0) eFn sup Wy1/2 eFn [+


A,V , n e
Fn
]n 2 n 1
|y|(r+1r  )Rn

C  e(1r)aRn sup (1 + |B(y)|) C  e([1r]a[1+r] )Rn . (6.12)


|y|(1+r)Rn

In the last two steps we make use of Proposition 3.2 and (2.18). Next, if |x y|
(1 r )Rn and 1BrRn (0) (x) = 0, then |y| (r + r 1)Rn = sRn , and by the choice
of (see the paragraph below (6.5)) it follows that

(1 1BrRn (0) (x))W (x, y)n (y)|+ 2
A,V n (x)| d(x, y)
{|xy|(1r  )Rn }

sup W (x, y)(y/Rn )n (y)dy + 2
A,V n . (6.13)
|x|rRn R3

On account of (6.5), Katos inequality, and (6.6) the rst asserted estimate follows
from (6.12) and (6.13). The second one is derived similarly by means of Lemma 3.7
and the replacements r  1 /2, r  . Note that 1B(1+/2)Rn (0) n = 0, which is
used to derive the analogue of (6.12).

7. Existence of Eigenvalues
In this section, we prove Theorem 2.3 which asserts that $H^A_N$ possesses infinitely
many eigenvalues below $\inf \sigma_{\mathrm{ess}}(H^A_N) = 1 + E^A_{N-1}$. We proceed along the lines of [40,
§6] with a few changes. In particular, as in the previous section we replace the
arguments of [40] that employ explicit position or momentum space representations
of $\Lambda^+_0$ by more abstract ones. A crucial observation is the new argument used to
prove Lemma 7.1.
Throughout this section we always assume without further notice that the
assumptions of Theorem 2.3, i.e. Hypothesis 2.5, are fulfilled.

Proof of Theorem 2.3. We proceed by induction on $N$ and start with the induction
step. So, we pick $N \in \mathbb{N}$, $N \ge 2$, and assume that $H^A_{N-1}$ possesses infinitely
many eigenvalues below $E^A_{N-2} + 1$. In particular, we can pick a normalized ground
state of $H^A_{N-1}$, which we denote by $\Phi$. Moreover, we denote the transposition operator
which flips the $i$th and $N$th electron variable by $\pi_{iN}$, $1 \le i < N$, and set
$\pi_{NN} := 1$. The vectors $\psi_1, \psi_2, \dots$ are the elements of the sequence appearing in
Hypothesis 2.5.

Now, let d N. By Lemma 7.7 below we know that, for all suciently large
m0 N, the set {AN ( + m0 +d
A,V n )}n=m0 , where

1 
N
AN ( +
A,V n ) = (1)N i iN ( +
A,V n ), n N,
N i=1

is linearly independent. Our goal then is to show that the expectation of


 N := HN ENA1 1
H

with respect to any linear combination of the vectors {AN ( + m0 +d


A,V n )}n=m0 is
strictly negative provided m0 N is large enough. Since d is arbitrary the assertion
of Theorem 2.3 then follows from the minimax principle. For cm0 , . . . , cm0 +d C,
and

m 0 +d 
N
(1)N i
:= cn iN ( +
A,V n ), (7.1)
n=m0 i=1
N 1/2

we obtain as in [40] by means of the anti-symmetry of ,


m 0 +d
 N 
 | H |cn |2  +  +
A,V n | HN ( A,V n ) (7.2)
n=m0


m 0 +d

+ (N 1) |cn ||cm ||1N ( +  +


A,V n ) | HN ( A,V m )|
n,m=m0
(7.3)

m 0 +d

|cn ||cm || +  +


+ A,V n | HN ( A,V m )|. (7.4)
n, m=m0
m=n

Combining the eigenvalue equation (HN 1 ENA1 ) = 0 with Lemmata 7.17.4,


Hypothesis 2.5, and (6.2), we nd some 0 > 0 such that the scalar product in (7.2)
1 1
is bounded from above by 0 Rm 0
+ o(Rm 0
), as m0 gets large. Here the numbers
R1 , R2 , . . . are those appearing in Hypothesis 2.5. Lemmata 7.5 and 7.6 imply that
K
the scalar products in (7.3) and (7.4) are of order O(Rm 0
), as m0 , for every
K N. By the CauchySchwarz inequality we nd some 0 > 0 such that


m 0 +d
 N  
 | H |cn |2 ,
0
n=m0

for all cm0 , . . . , cm0 +d C, if m0 is suciently large (depending on d). This con-
cludes the induction step.
Finally, the case $N = 1$ is treated in the same way as the induction step $N \to
N + 1$ (setting $E^A_0 := 0$ and ignoring $\Phi$, $W$, and the term (7.3)).

To show that the contribution coming from the (one-particle) kinetic energy of
$\psi_n$ decreases faster than its negative potential energy we make use of the requirement
that the $\psi_n$ have vanishing lower spinor components, $\psi_n = (\psi_{n,1}, \psi_{n,2}, 0, 0)^\top$,
$n \in \mathbb{N}$. This has also been used in [40] together with explicit formulas for $\Lambda^+_0$. We
replace these arguments by the following observation:

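Lemma 7.1. There is some $C \in (0,\infty)$ such that

\[
0 \le \langle \Lambda^+_A\psi_n \,|\, (D_A - 1)\Lambda^+_A\psi_n \rangle \le C R_n^{-2}, \qquad n \in \mathbb{N}.
\]

Proof. Since the last two components of $\psi_n$ are zero we have $(\beta - 1)\psi_n = 0$.
If we denote the projection onto the first two spinor components, $L^2(\mathbb{R}^3,\mathbb{C}^4) \ni
(\psi_1,\psi_2,\psi_3,\psi_4)^\top \mapsto (\psi_1,\psi_2,0,0)^\top$, by $p$ then we also have $p\,\alpha_i\psi_n = 0 = p\,\alpha_i\partial_i\psi_n$,
$i = 1,2,3$, and, therefore, $p(D_A - 1)\psi_n = 0$. Moreover, $[p, D_A^2] = 0$ and, hence,
$[p, |D_A|^{-1}] = 0$. This implies

\begin{align*}
|\langle \Lambda^+_A\psi_n \,|\, (D_A - 1)\Lambda^+_A\psi_n \rangle|
&= \Big| \tfrac12 \langle \psi_n \,|\, p(D_A - 1)\psi_n \rangle + \tfrac12 \langle \mathrm{sgn}(D_A)\psi_n \,|\, (D_A - 1)\psi_n \rangle \Big| \\
&= \Big| \tfrac12 \langle (D_A - 1)\psi_n \,|\, |D_A|^{-1}(D_A - 1)\psi_n \rangle + \tfrac12 \langle \psi_n \,|\, |D_A|^{-1} p(D_A - 1)\psi_n \rangle \Big| \\
&\le \tfrac12 \|(D_A - 1)\psi_n\|^2 = O(1/R_n^2).
\end{align*}
In the last step we apply Hypothesis 2.5.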

In the following we split $V_C$ into a singular and regular part, $V_C = V_C^s + V_C^r$,
where $V_C^s$ is defined in (3.1). By Hypothesis 2.1 $V_C^r$ is bounded.

Lemma 7.2. As n tends to innity,

+ +
A,V n | (DA,VC 1)A,V n 

= + r + + + 1
A,V n | VC A,V n  + A n | (DA 1)A n  + o(Rn ).

Proof. We let n , n N, denote the cut-o functions introduced in the paragraph


preceding Lemma 6.1. Then the assertion follows from Corollary 3.5 applied to
DA,VCs 1 with = n , since by Lemma 7.1 and Hypothesis 2.2,

( n V + n ) + + 2 1
A n | (DA 1)A n  n = o(Rn ).

In the next lemma, we single out the leading order negative contribution to (7.2).

Lemma 7.3. There is some constant C (0, ) such that, for all suciently large
n N,

+ r + + 2
A,V n | VC A,V n  v
(, Rn ) A,V n + Ce
Rn /C
,

where v
(, Rn ) is given by (2.20).

Proof. We pick some even function f C (R, [0, )) such that f 1 on [, ),


f 0 on [0, /2], and |f  | 4/. (Recall (2.21).) For some a (0, min{0 , m}/4),
we dene exponential weights, Fn (x) := aRn f (|x|/Rn ), n N. Using the notation
introduced in (2.19) and (2.20) we then obtain, for all suciently large n N,

+ r + + r +
A,V n | VC A,V n  A,V n | 1S (Rn ) VC A,V n 

+ VCr 1R3 \S (Rn ) eFn eFn +


A,V e
Fn
.

where, by (2.20) and Pythagoras theorem,

+ r +
A,V n | 1S (Rn ) VC A,V n 

v
(, Rn )( + 2 + 2
A,V n 1R3 \S (Rn ) A,V n )

v
(, Rn ) + 2
A,V n + |v
(, Rn )| 1R3 \S (Rn ) e
Fn 2 Fn +
e A,V eFn 2 .

By (2.19), (2.21), and the choice of Fn we know that 1R3 \S (Rn ) eFn
CeaRn /2 , which implies the assertion of the lemma.

From now on, we always assume that the induction hypothesis made in the proof
of Theorem 2.3 is fullled and that is a normalized ground state eigenvector of
A A A A A 1
HN 1 . So, HN 1 = EN 1 , EN 1 < EN 2 + 1. Given (0, N ) we pick some
(0, 1) as in Hypothesis 2.5(i). Then the following assertion is valid:

Lemma 7.4. As n tends to innity, we have, for 1 i N 1,

 + +
A,V n | WiN A,V n  sup W (x, y) + 2
A,V n + O(Rn ).
|xy|
(1)Rn


Proof. This follows from Lemma 6.4 with n (y) = R3(N 2) |(y, X  )|2 dX  and
the exponential decay of , which is ensured by Theorem 2.1 and the induction
hypothesis.

Now, we turn to the discussion of the terms in (7.3).

Lemma 7.5. As n and m tend to innity,

nm := |1N ( +  +
A,V n ) | HN ( A,V m )| = O(Rmin{n,m} ).

Proof. We pick C0 (R3 , [0, 1]) such that 1 on B1/4 (0) and 0 outside
B1/2 (0) and set n := (/Rn ) and n := 1 n , for n N. As in [40] we nd
nm |{n + +
A,V n } | {(DA,VC 1)A,V m }|

+ |{+ (1) +
A,V n } | {n } {(DA,VC 1)A,V m }|


N 1
+ |{n + +
A,V n } | WiN (A,V m )|
i=1


N 1
+ |{+ (1) +
A,V n } | WiN (n ) (A,V m )|
i=1


N 1 
N 1
=: Y1 + Y2 + Y3i + Y4i .
i=1 i=1

For the first two summands we find


Y1 + Y2 (DA,VC 1)+ + (1)
A,V m ( n A,V n + n ),

where the right-hand side is of order O(Rmin{n,m} ) due to the exponential localiza-
tion of and the support properties of n and n . Moreover, we observe that, for
i = 2, . . . , N 1,
1/2
Y3i n + 1/2 +
A,V n WiN sup Wy A,V m ,
yR3

1/2
Y4i + (1) 1/2 +
A,V n WiN n sup Wy A,V m .
yR3

1/2
Here the norms WiN , i = 2, . . . , N 1, are actually nite since Ker(HN 1
ENA1 ) implies
1/2 +,N 1 +,N 1
WiN 2 =  | A,V WiN A,V  (ENA1 + C) 2 ,
for some constant C (0, ). Finally,
Y31 sup Wy1/2 n + 2 1/2 +
A,V n sup Wy A,V m ,
yR3 yR3

Y41 sup Wy1/2 + (1) 1/2 +


A,V n n sup Wy A,V m .
yR3 yR3

We pick f C (R, [0, )) such that f 0 on [1, ), f 1 on (, 1/2],


and 3 f  0, and set Fn (x) = aRn f (|x|/Rn ), x R3 , n N, where a
(0, min{0 , m}/3). Since n n = 0, we nd
sup Wy1/2 n +
A,V n
yR3

sup eFn sup Wy1/2 [+ Fn Fn


A,V , n e ]e n .
|x|Rn /2 yR3
This estimate, the exponential decay of , and Lemma 3.7 imply that the terms

Y3i and Y4i , 1 i N 1, vanish of order O(Rmin{n,m} ) also.

Finally, we discuss the terms in (7.4).

Lemma 7.6. As $n$ tends to infinity, it holds, for all $m > n$,

| +  +
A,V n | HN ( A,V m )| = O(Rn ).

Proof. We pick a family of smooth weight functions, {Fk }k,N , such that Fk 0
on supp(k ), Fk is constant outside some ball containing supp(k ) and supp( ),
Fk a < min{0 , m}, and

gk := eFk Fk Cea min{Rk ,R }
, k,  N,

where a, a (0, min{0 , m}) and C (0, ) do not depend on k,  N. In view


of (2.21) it is easy to see that such a family exists. Then we observe that

| +  +
A,V n | HN ( A,V m )|

|+ + +
A,V n | (DA,V 1)m | + |A,V n | (VH + VE )A,V m |
 1/2 1/2
+ |WiN + +
A,V n | WiN A,V m |
1i<N

gnm eFnm +
A,V e
Fnm
n { (DA,V 1)m

+ (i Fmn + VH + [eFmn , VE ]eFmn )m }


+ gnm eFmn (VH + VE )eFmn sup eFk +
A,V e
Fk
k 2
k=

+ gnm (N 1) sup sup Wy1/2 eFk +


A,V e
Fk
k 2 .
k= yR3

By virtue of Proposition 3.2 and Lemma 3.7, we know that all terms behind the
factors gnm appearing here are uniformly bounded which shows that the assertion
holds true.

Applying the above arguments in an easier situation, we obtain the following lemma.

Lemma 7.7. For every $d \in \mathbb{N}$, there is some $n_0 \in \mathbb{N}$ such that the set of vectors
{AN ( + m0 +d
A,V n )}n=m0 is linearly independent, for all m0 N, m0 n0 .

Proof. We pick as in (7.1) and estimate 2 from below by an obvious analogue


 N replaced by the identity. Now, by virtue of Lemma 6.2 there
of (7.2)(7.4) with H
is some m1 N such that + +


A,V n = A,V n 1/2, for all n m1 . The
proof of Lemma 7.5 shows that

|1N ( + +
A,V n ) | A,V m | = O(Rmin{n,m} ).

Furthermore, by employing the exponential weights from the proof of Lemma 7.6
we see that

| + + +
A,V n | A,V m | = |n | A,V m | Ce
min{Rn ,Rm }/C
.

Altogether we find some $C' \in (0,\infty)$ such that, for $d \in \mathbb{N}$ and all sufficiently large
$m_0 \in \mathbb{N}$,
m0 +d m0 +d 0 +d
1  C  (N 1)  
m
C
2 |cn |2 |cn ||cm | |cn ||cm |.
2 n=m Rm0 n,m=m
Rm0
0 0 n, m=m0
n=m

Hence, the Cauchy–Schwarz inequality implies that, for sufficiently large $m_0$,
the vector in (7.1) is zero if and only if $c_{m_0} = \dots = c_{m_0+d} = 0$.

Acknowledgments
It is a pleasure to thank Hubert Kalf, Sergey Morozov, and Heinz Siedentop for
useful remarks and helpful discussions. Moreover, we thank Sergey Morozov for
making parts of his manuscripts [38] available to us prior to publication.

References
[1] V. Bach, J. Fröhlich and I. Sigal, Quantum electrodynamics of confined non-
relativistic particles, Adv. Math. 137 (1998) 299–395.
[2] V. Bach and O. Matte, Exponential decay of eigenfunctions of the Bethe–Salpeter
operator, Lett. Math. Phys. 55 (2001) 53–62.
[3] A. A. Balinsky and W. D. Evans, On the virial theorem for the relativistic operator
of Brown and Ravenhall, and the absence of embedded eigenvalues, Lett. Math. Phys.
44 (1998) 233–248.
[4] A. A. Balinsky and W. D. Evans, Stability of one-electron molecules in the Brown–
Ravenhall model, Comm. Math. Phys. 202 (1999) 481–500.
[5] A. A. Balinsky and W. D. Evans, On the spectral properties of the Brown–Ravenhall
operator, J. Comput. Appl. Math. 148 (2002) 239–255.
[6] P. Beiersdorfer, M. H. Chen, K. T. Cheng and J. Sapirstein, Transition energies of
the 3s–3p_{3/2} resonance lines in sodiumlike to phosphoruslike uranium, Phys. Rev. A
68 (2003) 022507, 7 pp.
[7] A. Berthier and V. Georgescu, On the point spectrum of Dirac operators, J. Funct.
Anal. 71 (1987) 309–338.
[8] A. M. Boutet de Monvel and R. Purice, A distinguished self-adjoint extension for the
Dirac operator with strong local singularities and arbitrary behavior at infinity, Rep.
Math. Phys. 34 (1994) 351–360.
[9] G. E. Brown and D. G. Ravenhall, On the interaction of two electrons, Proc. Roy.
Soc. London A 208 (1951) 552–559.
[10] R. Cassanas and H. Siedentop, The ground-state energy of heavy atoms according
to Brown and Ravenhall: Absence of relativistic effects in leading order, J. Phys. A
39 (2006) 10405–10414.
[11] K. T. Cheng, M. H. Chen and W. R. Johnson, Accurate relativistic calculations
including QED contributions for few-electron systems, in Relativistic Electronic
Structure Theory. Part 2: Applications, ed. P. Schwerdtfeger, Theoretical and
Computational Chemistry, Vol. 14 (Elsevier, 2002), pp. 120–187.
[12] K. T. Cheng, M. H. Chen and J. Sapirstein, Potential independence of the
solution to the relativistic many-body problem and the role of negative energy states in
heliumlike ions, Phys. Rev. A 59 (1999) 259–266.
[13] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators, Texts
and Monographs in Physics (Springer, Berlin-Heidelberg, 1987).
[14] A. Derevianko, W. R. Johnson, D. P. Plante and I. M. Savukov, Negative-energy
contributions to transition amplitudes in heliumlike ions, Phys. Rev. A 58 (1998)
4453–4461.
[15] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the Semi-Classical Limit,
London Math. Soc. Lecture Note Series, Vol. 268 (Cambridge University Press,
Cambridge, 1999).
[16] J. Dolbeault, M. J. Esteban and M. Loss, Relativistic hydrogenic atoms in strong
magnetic fields, Ann. Henri Poincaré 8 (2007) 749–779.
[17] W. D. Evans, P. Perry and H. Siedentop, The spectrum of relativistic one-electron
atoms according to Bethe and Salpeter, Comm. Math. Phys. 178 (1996) 733–746.
[18] V. Georgescu and M. Măntoiu, On the spectral theory of singular Dirac type
Hamiltonians, J. Operator Theory 46 (2001) 289–321.
[19] M. Griesemer, Exponential decay and ionization thresholds in non-relativistic
quantum electrodynamics, J. Funct. Anal. 210 (2004) 321–340.
[20] M. Griesemer and H. Siedentop, A minimax principle for the eigenvalues in spectral
gaps, J. London Math. Soc. (2) 60 (1999) 490–500.
[21] M. Griesemer and C. Tix, Instability of a pseudo-relativistic model of matter with
self-generated magnetic field, J. Math. Phys. 40 (1999) 1780–1791.
[22] B. Helffer, J. Nourrigat and X. P. Wang, Sur le spectre de l'équation de Dirac (dans
R^3 ou R^2) avec champ magnétique, Ann. Sci. École Norm. Sup. 22(4) (1989) 515–533.
[23] B. A. Heß, M. Reiher and A. Wolf, The generalized Douglas–Kroll transformation,
J. Chem. Phys. 117 (2002) 9215–9226.
[24] G. Hoever and H. Siedentop, Stability of the Brown–Ravenhall operator, Math. Phys.
Electr. J. 5 (1999) Paper 6, 11 pp.
[25] M. Huber and E. Stockmeyer, Perturbative implementation of the Furry picture,
Lett. Math. Phys. 79 (2007) 99–108.
[26] G. Jansen and B. A. Heß, Revision of the Douglas–Kroll transformation, Phys. Rev. A
39 (1989) 6016–6017.
[27] D. H. Jakubaßa-Amundsen, The HVZ theorem for a pseudo-relativistic operator,
Ann. Henri Poincaré 8 (2007) 337–360.
[28] D. H. Jakubaßa-Amundsen, Heat kernel estimates and spectral properties of a
pseudorelativistic operator with magnetic field, J. Math. Phys. 49 (2008) 032305, 22 pp.
[29] W. R. Johnson, Relativistic many-body theory applied to highly-charged ions, in
Many-Body Theory of Atomic Structure and Photoionization, ed. T. N. Chang (World
Scientific, 1993), pp. 19–46.
[30] W. R. Johnson, Relativistic many-body perturbation theory for highly charged ions,
in Many-Body Atomic Physics, eds. J. J. Boyle and M. S. Pindzola (University Press,
1998), pp. 39–64.
[31] T. Kato, Perturbation Theory for Linear Operators, Classics in Mathematics
(Springer, Berlin-Heidelberg, 1995).
[32] T. Kato, Holomorphic families of Dirac operators, Math. Z. 183 (1983) 399–406.
[33] Y. Last and B. Simon, The essential spectrum of Schrödinger, Jacobi, and CMV
operators, J. d'Analyse Math. 98 (2006) 183–220.
[34] E. H. Lieb and M. Loss, Stability of a model of relativistic quantum electrodynamics,
Comm. Math. Phys. 228 (2002) 561–588.
[35] E. H. Lieb, H. Siedentop and J. P. Solovej, Stability and instability of relativistic
electrons in classical electromagnetic fields, J. Statist. Phys. 89 (1997) 37–59.
[36] O. Matte and E. Stockmeyer, On the eigenfunctions of no-pair operators in classical
magnetic fields, Integr. Equ. Oper. Theory 65 (2009) 255–283.
[37] S. Morozov, Essential spectrum of multiparticle Brown–Ravenhall operators in
external field, Documenta Math. 13 (2008) 51–79.
[38] S. Morozov, Multi-particle Brown–Ravenhall operators in external fields, PhD thesis,
Universität München (2008).
[39] S. Morozov, Exponential decay of eigenfunctions of Brown–Ravenhall operators,
J. Phys. A 42 (2009) 475206, 16 pp.
[40] S. Morozov and S. Vugalter, Stability of atoms in the Brown–Ravenhall model, Ann.
Henri Poincaré 7 (2006) 661–687.
[41] G. Nenciu, Self-adjointness and invariance of the essential spectrum for Dirac
operators defined as quadratic forms, Comm. Math. Phys. 48 (1976) 235–247.
[42] G. Nenciu, Distinguished self-adjoint extension for the Dirac operator with potential
dominated by multicenter Coulomb potentials, Helvetica Phys. Acta 50 (1977) 1–3.
[43] M. Reiher and A. Wolf, Relativistic Quantum Chemistry (Wiley-VCH, Weinheim,
2009).
[44] R. Richard and R. Tiedra de Aldecoa, On the spectrum of magnetic Dirac operators
with Coulomb-type perturbations, J. Funct. Anal. 250 (2007) 625–641.
[45] J. Sapirstein, Theoretical methods for the relativistic atomic many-body problem,
Rev. Modern Phys. 70 (1998) 55–76.
[46] H. Siedentop and E. Stockmeyer, The Douglas–Kroll–Heß method: Convergence and
block-diagonalization of Dirac operators, Ann. Henri Poincaré 7 (2006) 45–58.
[47] J. Sucher, Foundations of the relativistic theory of many-electron atoms, Phys. Rev.
A 22 (1980) 348–362.
[48] J. Sucher, Relativistic many-electron Hamiltonians, Phys. Scripta 36 (1987) 271–281.
[49] B. Thaller, The Dirac Equation, Texts and Monographs in Physics (Springer, Berlin-
Heidelberg, 1992).
[50] C. Tix, Strict positivity of a relativistic Hamiltonian due to Brown and Ravenhall,
Bull. London Math. Soc. 30 (1998) 283–290.
[51] C. Tix, Self-adjointness and spectral properties of a pseudo-relativistic Hamiltonian
due to Brown and Ravenhall, preprint (1997) 20 pp.; mp_arc 97-441.
[52] C. Tix, Lower bound for the ground state energy of the no-pair Hamiltonian, Phys.
Lett. B 405 (1997) 293–296.
[53] J. Xia, On the contribution of the Coulomb singularity of arbitrary charge to the
Dirac Hamiltonian, Trans. Amer. Math. Soc. 351 (1999) 1989–2023.
February 11, 2010 10:1 WSPC/148-RMP J070-S0129055X10003886

Reviews in Mathematical Physics


Vol. 22, No. 1 (2010) 55–89

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003886

ON THE LONG TIME BEHAVIOR OF FREE STOCHASTIC
SCHRÖDINGER EVOLUTIONS

ANGELO BASSI
Department of Physics, University of Trieste,
Strada Costiera 11, 34151 Trieste, Italy
and
Istituto Nazionale di Fisica Nucleare, Trieste Section,
Via Valerio 2, 34127 Trieste, Italy
bassi@ts.infn.it


DETLEF DÜRR and MARTIN KOLB
Mathematisches Institut der L.M.U.,
Theresienstr. 39, 80333 München, Germany
duerr@math.lmu.de
kolb@math.lmu.de

Received 10 March 2009


Revised 15 August 2009

We discuss the time evolution of the wave function which is the solution of a stochastic
Schrödinger equation describing the dynamics of a free quantum particle subject to
spontaneous localizations in space. We prove global existence and uniqueness of solutions.
We observe that there exist three time regimes: the collapse regime, the classical regime
and the diffusive regime. Concerning the latter, we assert that the general solution
converges almost surely to a diffusing Gaussian wave function having a finite spread both
in position as well as in momentum. This paper corrects and completes earlier works
on this issue.

Keywords: Collapse models; GRW-model; Hilbert space valued diffusions; large time
behavior.

Mathematics Subject Classification 2000: 60H30, 60J60, 82C31, 81S99, 35R60

1. Introduction
Stochastic differential equations (SDEs) in infinite dimensional spaces are a subject
of growing interest within the mathematical physics and physics communities
working in quantum mechanics; they are currently used in models of spontaneous
wave function collapse [1–14], in the theory of continuous quantum measurement
[15, 17–25], and in the theory of open quantum systems [26–28]. In the
first case, the Schrödinger equation is modified by adding appropriate nonlinear
and stochastic terms which induce the (random) collapse of the wave function in
space; in this way, one achieves the goal of a unified description of microscopic
quantum phenomena and macroscopic classical ones, avoiding the occurrence of
macroscopic quantum superpositions. Current research focuses on designing
experiments which discriminate between collapse and non-collapse theories, see references
in [16]. In the second case, using the projection postulate, stochastic terms in the
Schrödinger equation are used to describe the effect of a continuous measurement.
In the third case, slightly generalizing the notion of continuous measurement to
generic interactions with environments, SDEs are used as phenomenological
equations describing the interaction of a quantum system with an environment, the
stochastic terms encoding the effect of the environment on the system. Looking
directly at the stochastic differential equation for the wave function, rather than
the deterministic equation of the Lindblad type for the statistical operator, has some
advantages with respect to the standard master equation approach, e.g. for faster
numerical simulations [29].
Among the different SDEs which have been considered so far, the following
equation, defined in the Hilbert space $\mathcal{H} \equiv L^2(\mathbb{R})$, is of particular interest
[17–19, 30–37, 39]:
$$d\psi_t = \left[ -\frac{i}{\hbar}\frac{p^2}{2m}\,dt + \sqrt{\lambda}\,(q - \langle q\rangle_t)\,dW_t - \frac{\lambda}{2}(q - \langle q\rangle_t)^2\,dt \right]\psi_t, \qquad \psi_0 = \phi. \tag{1.1}$$
The first term on the right-hand side represents the usual quantum Hamiltonian
of a free particle in one dimension, $p$ being the momentum operator. The second
and third terms of the equation, as we shall see, induce the localization of the
wave function in space; $q$ is the position operator and $\langle q\rangle_t$ denotes the quantum
expectation $\langle\psi_t|q\psi_t\rangle$ of $q$ with respect to $\psi_t$. The parameter $\lambda$ is a fixed positive
constant which sets the strength of the collapse mechanism, while $W_t$ is a standard
Wiener process defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with filtration $\{\mathcal{F}_t, t \ge 0\}$.
Equation (1.1) plays a special role among the SDEs in Hilbert spaces because
it is the simplest exactly solvable equation describing the time evolution of a
non-trivial physical system. Within the theory of continuous quantum measurement,
it describes a measurement-like process designed to measure the position of a free
quantum particle; within decoherence theory it represents one of the possible
unravellings of the master equation first derived by Joos and Zeh [40]. Within collapse
models (like GRW-models), it may describe the evolution of a free quantum particle
(or the center of mass of an isolated system) subject to spontaneous localizations
in space [1, 2] in the following sense. Realistic models of spontaneous wave function
collapse are based on a more complicated stochastic differential equation: the
difference between Eq. (1.1) and the equations of the standard localization models
such as GRW [1] and CSL [2] is most easily described on the level of the Lindblad
equations for the respective statistical operators $\rho_t := \mathbb{E}_{\mathbb{P}}[|\psi_t\rangle\langle\psi_t|]$, induced by the
stochastic dynamics of the wave function. By virtue of Eq. (1.1) (see, e.g., [9]):
$$\frac{d}{dt}\rho_t = -\frac{i}{2m\hbar}\,[p^2, \rho_t] - \frac{\lambda}{2}\,[q, [q, \rho_t]], \tag{1.2}$$
with the Lindblad term in position representation
$$-\frac{\lambda_{\rm GRW}\,\alpha}{4}\,(x - y)^2\,\rho_t(x, y). \tag{1.3}$$
For the GRW dynamics as described in [1] the corresponding Lindblad term of the
GRW master equation in the position representation reads:
$$-\lambda_{\rm GRW}\left[1 - e^{-\alpha (x-y)^2/4}\right]\rho_t(x, y). \tag{1.4}$$
When the distances involved are smaller than the length $1/\sqrt{\alpha} \simeq 10^{-5}$ cm
characterizing the model we have that
$$-\lambda_{\rm GRW}\left[1 - e^{-\alpha (x-y)^2/4}\right] \simeq -\frac{\lambda_{\rm GRW}\,\alpha}{4}\,(x - y)^2, \qquad \text{for } |x - y| \ll \frac{1}{\sqrt{\alpha}}. \tag{1.5}$$
Accordingly, the stochastic dynamics of Eq. (1.1) approximates, at least on the
statistical level, the GRW dynamics for all atomic and subatomic distances. Since
this is a regime of growing interest [41, 13, 42, 43] it is reasonable to first study
the simpler Eq. (1.1).
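The quality of the small-distance approximation (1.5) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes $\alpha = 10^{14}\,{\rm m}^{-2}$ (that is, $1/\sqrt{\alpha} = 10^{-5}$ cm as quoted above), works in units of $\lambda_{\rm GRW}$, and uses hypothetical function names.

```python
import math

# Assumption: alpha = 1e14 m^-2, i.e. 1/sqrt(alpha) = 1e-7 m = 1e-5 cm.
ALPHA = 1.0e14


def grw_factor(distance_m, alpha=ALPHA):
    """Exact factor 1 - exp(-alpha d^2 / 4) from (1.4), in units of lambda_GRW."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-alpha * distance_m ** 2 / 4.0)


def quadratic_factor(distance_m, alpha=ALPHA):
    """Quadratic approximation alpha d^2 / 4 from (1.5)."""
    return alpha * distance_m ** 2 / 4.0


def relative_error(distance_m, alpha=ALPHA):
    """Relative deviation of the quadratic approximation from the exact factor."""
    exact = grw_factor(distance_m, alpha)
    return abs(quadratic_factor(distance_m, alpha) - exact) / exact


if __name__ == "__main__":
    for d in (1e-15, 1e-10, 1e-8, 1e-7):
        print(f"d = {d:.0e} m, relative error = {relative_error(d):.3e}")
```

At atomic distances the two expressions agree to many digits, while at $|x-y| \approx 1/\sqrt{\alpha}$ the quadratic form visibly overestimates the exact factor, which is the regime where (1.5) is no longer meant to apply.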
Equation (1.1) is nonlinear. Nonlinearity is a fundamental ingredient because
only in this way is it possible to reproduce the collapse of the wave function. It is
well known how to linearize the equation, i.e. how to express its solutions as a
function of the solutions of a suitable linear SDE [31, 44]. We briefly review this
procedure.
Let us consider the following linear SDE:
$$d\phi_t = \left[ -\frac{i}{\hbar}\frac{p^2}{2m}\,dt + \sqrt{\lambda}\,q\,d\xi_t - \frac{\lambda}{2}\,q^2\,dt \right]\phi_t, \qquad \phi_0 = \phi, \tag{1.6}$$
defined in the same Hilbert space $\mathcal{H} \equiv L^2(\mathbb{R})$; the stochastic process $\xi_t$ is a standard
Wiener process with respect to the probability space $(\Omega, \mathcal{F}, \mathbb{Q})$ and filtration
$\{\mathcal{F}_t, t \ge 0\}$, where $\mathbb{Q}$ is a new probability measure whose relation with $\mathbb{P}$ will soon
be established. This equation does not conserve the norm of the state vector, as the
evolution is not unitary; we therefore introduce the normalized state vectors:
$$\psi_t = \begin{cases} \phi_t/\|\phi_t\| & \text{if } \|\phi_t\| \neq 0, \\ 0 & \text{otherwise.} \end{cases} \tag{1.7}$$
A standard application of Itô calculus shows that, if $\phi_t$ solves Eq. (1.6), then $\psi_t$
defined in (1.7) solves the following nonlinear SDE:
$$d\psi_t = \left[ -\frac{i}{\hbar}\frac{p^2}{2m}\,dt + \sqrt{\lambda}\,(q - \langle q\rangle_t)\,(d\xi_t - 2\sqrt{\lambda}\,\langle q\rangle_t\,dt) - \frac{\lambda}{2}(q - \langle q\rangle_t)^2\,dt \right]\psi_t, \tag{1.8}$$
for the same initial condition $\psi_0 = \phi$.
Equation (1.8) is a well defined collapse equation; however, it is not suitable
for physical applications, as the collapse does not occur with the correct quantum
probabilities. This can be seen by analyzing the time evolution of particular
solutions, such as Gaussian wave functions; it can also be easily understood by noting
that there is no fundamental difference between Eqs. (1.8) and (1.6), since any
solution of Eq. (1.8) can be obtained from a solution of Eq. (1.6) simply by
normalizing the wave function. In turn, Eq. (1.6) does not contain any information as
to why the wave function should collapse according to the Born probability rule,
i.e. the Wiener process $\xi_t$ is not forced to pick most likely those values necessary to
reproduce quantum probabilities during the collapse process.
The way to include such a feature into the dynamical evolution of the wave
function is to replace the measure $\mathbb{Q}$ with a new measure (which will turn out to
be the measure $\mathbb{P}$ previously introduced) so that the process $\xi_t$, according to the
new measure, is forced to take with higher probability the values which account
for quantum probabilities. This is precisely the key idea behind the original GRW
model of spontaneous wave function collapse [1]: the wave function is more likely
to collapse where it is more appreciably different from zero. The mathematical
structure of the GRW model suggests that the square modulus $\|\phi_t\|^2$ should be
used as density for the change of measure. We now formalize these steps.
In [31], Holevo has proven that for initial condition $\|\phi_0\|^2 = 1$ the process
$(\|\phi_t\|^2)_{t \ge 0}$ is a martingale satisfying the equation
$$\|\phi_t\|^2 = \|\phi_0\|^2 + 2\sqrt{\lambda}\int_0^t \langle q\rangle_s\,\|\phi_s\|^2\,d\xi_s. \tag{1.9}$$

We shall always work with normalized initial states. The martingale $\|\phi_t\|^2$ can be
used as a Radon–Nikodym derivative to generate a new probability measure $\mathbb{P}$ from
$\mathbb{Q}$, according to the usual formula:
$$\mathbb{P}[E] := \mathbb{E}_{\mathbb{Q}}[1_E\,\|\phi_t\|^2], \qquad \forall E \in \mathcal{F}_t, \ \forall t < +\infty, \tag{1.10}$$
where $1_E$ is the indicator function relative to the measurable subset $E$. We recall
that the martingale property, together with the property $\mathbb{E}_{\mathbb{Q}}[\|\phi_t\|^2] = 1$, guarantees
consistency among different times, so that (1.10) indeed defines a unique probability
measure $\mathbb{P}$ on $\mathcal{F}$. In the following, for simplicity we will write $d\mathbb{P}/d\mathbb{Q} \equiv \|\phi_t\|^2$.
One can then show that Eq. (1.8), with the stochastic dynamics defined on the
probability space $(\Omega, \mathcal{F}, \mathbb{P})$ in place of $(\Omega, \mathcal{F}, \mathbb{Q})$, correctly describes the desired
physical situations.
A drawback of the change of measure is that the equation is defined in terms
of the stochastic process $\xi_t$, which is no longer a Wiener process with respect
to the measure $\mathbb{P}$, as it was with respect to the measure $\mathbb{Q}$. This can be a source of
many difficulties, e.g. when analyzing the properties of the solutions of the
equation. The disadvantage can be removed by resorting to Girsanov's theorem, which
connects Wiener processes defined on the same measurable space, but with respect
to different probability measures. According to this theorem, the process
$$W_t := \xi_t - 2\sqrt{\lambda}\int_0^t \langle q\rangle_s\,ds, \tag{1.11}$$
is a Wiener process with respect to $(\Omega, \mathcal{F}, \mathbb{P})$ and filtration $\{\mathcal{F}_t, t \ge 0\}$, and thus
is the natural process for describing the stochastic dynamics with respect to the
measure $\mathbb{P}$. It is immediate to see that, once written in terms of $W_t$, Eq. (1.8)
reduces to Eq. (1.1), thus the link between Eqs. (1.6) and (1.1) is established. The
above discussion should also have given a first idea of why SDEs like Eq. (1.1) are
those which are used in Quantum Mechanics to describe the collapse of the wave
function; we will come back to this point later in the paper.
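The mechanism behind the change of measure and Girsanov's theorem can be illustrated with a toy Monte Carlo computation. The sketch below is only an illustration under strong simplifying assumptions and is not the construction used in the paper: the random, state-dependent drift $2\sqrt{\lambda}\langle q\rangle_s$ is replaced by a hypothetical constant `theta`, for which the martingale density is $\exp(\theta\,\xi_T - \theta^2 T/2)$. Reweighting samples of $\xi_T$ drawn under $\mathbb{Q}$ with this density should reproduce the mean $\theta T$ that $\xi_T$ has under $\mathbb{P}$, so that $\xi_t - \theta t$ is driftless under the new measure.

```python
import math
import random


def reweighted_endpoint_mean(theta, horizon, n_samples, seed=0):
    """Estimate E_P[xi_T] and E_Q[dP/dQ] from samples of xi_T drawn under Q.

    Assumption (toy model): dP/dQ = exp(theta * xi_T - theta^2 * T / 2),
    i.e. a constant drift theta instead of the state-dependent drift of the paper.
    """
    rng = random.Random(seed)
    weighted_sum = 0.0
    density_sum = 0.0
    for _ in range(n_samples):
        xi_T = rng.gauss(0.0, math.sqrt(horizon))  # Wiener endpoint under Q
        density = math.exp(theta * xi_T - 0.5 * theta ** 2 * horizon)
        weighted_sum += density * xi_T
        density_sum += density
    return weighted_sum / n_samples, density_sum / n_samples


if __name__ == "__main__":
    mean_p, total_mass = reweighted_endpoint_mean(theta=0.5, horizon=1.0, n_samples=200_000)
    print(f"E_P[xi_T] ~ {mean_p:.3f} (theta*T = 0.5), E_Q[density] ~ {total_mass:.3f}")
```

The second returned value estimates the total mass of the new measure; its closeness to 1 mirrors the role of the condition $\mathbb{E}_{\mathbb{Q}}[\|\phi_t\|^2] = 1$ in the text.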
The first important problem to address concerns the status of the solutions of
Eq. (1.6). In [31], Holevo has proven the existence and uniqueness of topological
weak solutions of a rather general class of SDEs with unbounded operators, to
which Eq. (1.6) belongs. (See the end of the section for the notation.) The problem
of the existence and uniqueness of topological strong solutions of Eq. (1.6) has been
addressed in [30]; there, however, the proof relies on the expansion of wave functions
in terms of Gaussian states, which in general is problematic and requires special
care, as shown in [45]. An explicit representation of the strong solution of Eq. (1.6)
has been given in [37]; the representation is written in terms of path integrals and
is not particularly suitable for analyzing the time evolution of the general solution.
A much more convenient representation, given in terms of the Green's function of
Eq. (1.6), was first derived in [32, 35]; the Green's function reads:
$$G_t(x, y) = K_t \exp\left[ -\frac{\alpha_t}{2}(x^2 + y^2) + \beta_t\,xy + \bar{a}_t\,x + \bar{b}_t\,y + \bar{c}_t \right]; \tag{1.12}$$
the coefficients $K_t$, $\alpha_t$ and $\beta_t$ are deterministic and equal to
$$K_t = \sqrt{\frac{\lambda}{\pi\upsilon\sinh\upsilon t}}, \tag{1.13}$$
$$\alpha_t = \frac{2\lambda}{\upsilon}\coth\upsilon t, \tag{1.14}$$
$$\beta_t = \frac{2\lambda}{\upsilon\sinh\upsilon t}, \tag{1.15}$$

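while the remaining coefficients are functions of the Wiener process $\xi_t$:
$$\bar{a}_t = \frac{\sqrt{\lambda}}{\sinh\upsilon t}\int_0^t \sinh\upsilon s\;d\xi_s, \tag{1.16}$$
$$\bar{b}_t = 2i\,\frac{\lambda\hbar}{m\upsilon}\int_0^t \frac{\bar{a}_s}{\sinh\upsilon s}\,ds, \tag{1.17}$$
$$\bar{c}_t = \frac{i\hbar}{m}\int_0^t \bar{a}_s^2\,ds. \tag{1.18}$$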

In the above expressions, we have introduced the following two constants:
$$\upsilon \equiv \frac{1+i}{2}\,\omega, \qquad \omega \equiv 2\sqrt{\frac{\hbar\lambda}{m}}. \tag{1.19}$$
As we shall see, the parameter $\omega$, which has the dimensions of a frequency, will set
the time scales for the collapse of the wave function. The representation in terms of
the Green's function (1.12), as we have said, is particularly suitable for analyzing the
time evolution of the general solution of Eq. (1.6), and thus of Eq. (1.1), even though
we will see that, when studying the long time behavior, another representation is
more convenient.
Our first result concerns the meaning of the solution of Eq. (1.6) in terms of
$$\phi_t(x) := \int dy\;G_t(x, y)\,\phi(y) \tag{1.20}$$
for given initial condition $\phi$.

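Theorem 1.1. Let $\phi_t$ be defined as in (1.20); then the following three statements
hold true with $\mathbb{Q}$-probability 1:
(1) $\phi \in L^2(\mathbb{R}) \Rightarrow \phi_t \in L^2(\mathbb{R})$, (1.21)
(2) $\phi \in L^2_B(\mathbb{R}) \Rightarrow \phi_t$ is a topological strong solution of Eq. (1.6), (1.22)
(3) $\phi \in L^2(\mathbb{R}) \Rightarrow \lim_{t \to 0}\|\phi_t - \phi\| = 0$, (1.23)
where $L^2_B(\mathbb{R})$ is the subspace of all bounded functions of $L^2(\mathbb{R})$.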
Having the explicit solution of Eq. (1.6), and thus of Eq. (1.1), the next
relevant problem is to unfold its physical content. Previous analyses of similar
equations [2, 8, 10, 14, 39] have shown that one can identify three regimes, which are more
or less well separated depending on the value of the parameters $\lambda$ and $m$.
(1) Collapse regime. A wave function having an initial large spread localizes in
space, the localization occurring in agreement with the Born probability rule.
(2) Classical regime. The localized wave function moves in space like a classical free
particle, since the fluctuations due to the Wiener process can be safely ignored.
That a well localized Schrödinger wave function should move along a classical
path is connected to the validity of Ehrenfest- or Egorov-type theorems [38].
(3) Diffusive regime. Eventually, the random fluctuations become dominant and
the wave function starts to diffuse appreciably.
It is not an easy task to spell out rigorously these regimes and their properties.
We shall, however, be a bit more specific on this in the following section. We shall
afterwards focus on the simplest regime, namely the diffusive one, which in fact has
been intensively looked at in the previous years [7, 19, 26, 34, 35, 39] and we shall
prove a remarkable property of the solutions of Eq. (1.1): Any solution converges
almost surely to a Gaussian wave function having a fixed spread.

Theorem 1.2. Let $\psi_t$ be a solution of Eq. (1.1); then, under conditions which we
will specify, the following property holds true with $\mathbb{P}$-probability 1:
lim t t  = 0, (1.24)
t
where the limiting state in (1.24), defined in (5.2), is a Gaussian wave function with
a fixed spread both in position and momentum.
Theorems 1.1 and 1.2 have been extensively discussed before in the literature
[7, 19, 26, 30, 34–36, 39], proving that the community has devoted much attention
to the problem. However, these proofs are incomplete or flawed. Concerning
Theorem 1.1, in particular Statement (3) was not proven [30, 34–36]. While
Statements (1) and (2) are rather straightforward conclusions from the Gaussian kernel
of the propagator, the third statement is much more subtle and does not follow
from purely analytical arguments. Concerning Theorem 1.2, none of the previous
proofs is decisive. In [35, 36], the major flaw was that it was overlooked that the
eigenfunction expansion of the relevant dissipative operator (not self-adjoint) does
not give rise to an orthonormal basis. In [19], the long time behavior was analyzed
by expanding the general solution in terms of coherent states, while in [26, 39] it was
analyzed by scrutinizing the time evolution of the spread in position of the
solution; in [45] it has been shown that both approaches are not conclusive. Finally, [7]
proposed Theorem 1.2 as a conjecture, but showed stability of the limiting Gaussian
state only against small perturbations. Building on previous work of Holevo, Mora
and Rebolledo recently enhanced in [46, 47] the general theory of stochastic
Schrödinger equations. In particular, they developed criteria for the existence of
regular invariant measures for a large class of stochastic Schrödinger equations as an
important step towards an understanding of the large time behavior. Until now,
however, the only complete and detailed results on the large time behavior seem to be
Theorems 1.1 and 1.2.
We conclude this introductory section by summarizing the content of the paper.
In Sec. 2, we will present a qualitative analysis of the time evolution of the general
solution of Eq. (1.1); we will discuss the three regimes previously introduced, giving
also numerical estimates, and we will set the main problems which we aim at solving.
In Sec. 3, we will analyze the structure of the Green's function (1.12) and prove
Theorem 1.1. In Sec. 4, we will introduce another representation of the general
solution of Eq. (1.1), which is more suitable for analyzing its long time behavior.
Sec. 5 will be devoted to the proof of Theorem 1.2. Finally, Sec. 6 will contain some
concluding remarks and an outlook.
Notation. We will work in the complex and separable Hilbert space $L^2(\mathbb{R})$, with
the norm and the scalar product given, respectively, by $\|\cdot\|$ and $\langle\cdot|\cdot\rangle$. We will also
consider the subspace $L^2_B(\mathbb{R})$ of all bounded functions of $L^2(\mathbb{R})$. Given an operator
$O$, we denote with $D(O)$ its domain and with $R(O)$ its range.
Since in some expressions the real and imaginary parts of some coefficients
appear, we introduce for ease of readability the following symbols: $z^R$ or $z_R$ will denote the
real part of the complex number $z$, while $z^I$ or $z_I$ will denote its imaginary part.
Given the linear SDE (1.6), a topological strong solution is an $L^2$-valued process
such that for any $t > 0$,
$$\phi_t = \phi - \frac{i}{\hbar}\int_0^t \frac{p^2}{2m}\,\phi_s\,ds + \sqrt{\lambda}\int_0^t q\,\phi_s\,d\xi_s - \frac{\lambda}{2}\int_0^t q^2\phi_s\,ds \tag{1.25}$$
holds with $\mathbb{Q}$-probability 1. A topological weak solution instead is an $L^2$-valued
process such that for any $t > 0$ and for any $\chi \in D(p^2) \cap D(q^2)$,
$$\langle\chi|\phi_t\rangle = \langle\chi|\phi\rangle - \frac{i}{\hbar}\int_0^t \frac{1}{2m}\langle p^2\chi|\phi_s\rangle\,ds + \sqrt{\lambda}\int_0^t \langle q\chi|\phi_s\rangle\,d\xi_s - \frac{\lambda}{2}\int_0^t \langle q^2\chi|\phi_s\rangle\,ds \tag{1.26}$$
holds with $\mathbb{Q}$-probability 1. Topological strong and weak solutions for the nonlinear
SDE (1.1) are defined in a similar way.
There is also a distinction between strong and weak solutions in a stochastic
sense [48], depending on whether the probability space, the filtration and the Wiener
process are given a priori (strong solution) or whether they can be constructed in
such a way as to solve the required SDE (weak solution). Throughout the paper, we
will deal only with strong solutions in the stochastic sense.

2. Time Evolution of the General Solution
We begin our discussion with a qualitative analysis of the time evolution of the
general solution of Eq. (1.1); we will identify the regimes we introduced in the
previous section, corresponding to three different behaviors of the wave function. These
regimes of course depend on the value of the mass $m$ of the particle and also on the
value of the coupling constant $\lambda$ which sets the strength of the collapse mechanism.
As discussed, e.g., in [39], it is physically appropriate to take $\lambda$ proportional to the
mass $m$ according to the formula:
$$\lambda := \frac{m}{m_0}\,\lambda_0, \tag{2.1}$$
where $\lambda_0$ is now assumed to be a universal coupling constant, while $m_0$ is taken
equal to the mass of a nucleon ($\simeq 1.67 \times 10^{-27}$ kg). To be definite, in the following we
take $\lambda_0 \simeq 1.00 \times 10^{-2}\ {\rm m^{-2}\,sec^{-1}}$, so that the localization mechanism has the same
strength as that of the GRW model [1]. Though, as we discussed in the introduction,
Eq. (1.1) is used also in the context of the theory of continuous measurement as
well as in the theory of decoherence, for brevity and clarity in the following we will
only make reference to its application within models of spontaneous wave function
collapse.

2.1. The collapse regime
The first important effect of the dynamics embodied in Eq. (1.1) is that a wave
function, which initially is well spread out in space, becomes rapidly localized. This
is most easily seen through the Green's function representation of the solution. The
Green's function $G_t(x, y)$ in (1.12) can be rewritten as follows:
$$G_t(x, y) = K_t \exp\left[ -\frac{\tilde{\alpha}_t}{2}\,x^2 + \tilde{a}_t\,x + \tilde{c}_t \right] \exp\left[ -\frac{\alpha_t}{2}\,(y - Y_t^x)^2 \right], \tag{2.2}$$
February 11, 2010 10:1 WSPC/148-RMP J070-S0129055X10003886

On Long Time Behavior of Free Stochastic Schr


odinger Evolutions 63

where we have introduced the new parameters:

t2 2
t = t
= tanh t, (2.3)
t
tbt
a
t = a t + , (2.4)
t
b2
ct = ct + t , (2.5)
2t
t x + bt
Ytx = . (2.6)
t

 y-part of Gt (x, y) is a Gaussian function whose spread in position (equal to


The
1/ Rt ) rapidly decreases in time, and afterwards remains very small. In particular,
we have:
2 sinh t sin t
R
t =
cosh t cos t



2 24 2 1 1
t  1 ,
t  (3.99 10 m kg sec )mt
3
= (2.7)
2

 (2.39 1029 m2 kg1 )m t +,

with  5.01 105 sec1 independent of the mass of the particle.
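As a rough numerical illustration of (2.1) and (2.7), the following Python sketch evaluates the coupling constant, the frequency and the real part of the Gaussian width for a given mass. It assumes the relation omega = 2 sqrt(hbar lambda_0 / m_0), which is not restated in this section but reproduces the quoted value of the frequency to within rounding; the value of hbar and all function names are illustrative choices, not part of the original text.

```python
import math

HBAR = 1.054571817e-34   # J s (standard value, not quoted in the text)
LAMBDA0 = 1.00e-2        # m^-2 s^-1, universal coupling quoted in the text
M0 = 1.67e-27            # kg, nucleon mass quoted in the text


def coupling(m):
    """Collapse strength for a particle of mass m (kg), as in Eq. (2.1)."""
    return LAMBDA0 * m / M0


def omega():
    """Assumed relation: omega = 2 * sqrt(hbar * lambda0 / m0), mass independent."""
    return 2.0 * math.sqrt(HBAR * LAMBDA0 / M0)


def alpha_real(t, m):
    """Real part of the Gaussian width, read from Eq. (2.7)."""
    s = omega() * t
    ratio = (math.sinh(s) - math.sin(s)) / (math.cosh(s) - math.cos(s))
    return 2.0 * coupling(m) / omega() * ratio


if __name__ == "__main__":
    m = 1.0e-3  # one gram, an arbitrary example mass
    for t in (1.0e-6, 1.0, 1.0e6):
        print(f"t = {t:9.1e} s   alpha_R = {alpha_real(t, m):.3e} m^-2")
```

The two limiting forms in (2.7) can be recovered from `alpha_real` by taking `omega() * t` much smaller than one or much larger than one.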
Let us introduce a length $\ell$, and let us say that a wave function is localized when its spread is smaller than $\ell$. For the sake of definiteness, we take $\ell \simeq 1.00 \times 10^{-7}$ m, corresponding to the width of the collapsing Gaussian of the GRW model. By means of this length, we can define the collapse time $t_1$ as the time when the spread of the $y$-part of the Green's function $G_t(x, y)$ becomes smaller than $\ell$. By using the small time approximation of $\alpha_t^R$ given in (2.7), we can set:
\[ t_1 := \frac{3}{2\lambda \ell^2} \simeq \frac{2.51 \times 10^{-11}\ \mathrm{kg\,sec}}{m}. \tag{2.8} \]
As we see, and as we expect, this time decreases for increasing masses, i.e. for increasing values of $\lambda$, and is very small for macroscopic particles.
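To make the mass dependence of the collapse time concrete, here is a minimal Python sketch of Eq. (2.8), using only the constants quoted in the text. The function name and the sample masses are illustrative assumptions.

```python
LAMBDA0 = 1.00e-2   # m^-2 s^-1, coupling constant quoted in the text
M0 = 1.67e-27       # kg, nucleon mass quoted in the text
ELL = 1.00e-7       # m, localization length quoted in the text


def collapse_time(m):
    """Collapse time t_1 = 3 / (2 * lambda * ell^2), with lambda = lambda0 * m / m0."""
    lam = LAMBDA0 * m / M0
    return 3.0 / (2.0 * lam * ELL ** 2)


if __name__ == "__main__":
    # Sample masses chosen only for illustration: a nucleon, a microgram, a gram.
    for label, m in (("nucleon", M0), ("1 microgram", 1.0e-9), ("1 gram", 1.0e-3)):
        print(f"{label:12s} t_1 = {collapse_time(m):.3e} s")
```

Since $t_1$ is inversely proportional to $m$, doubling the mass halves the collapse time.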
Let us assume that the initial state $\phi(x)$ is not already localized, and in particular that it does not change appreciably on the scale set by $\ell$; this is a physically reasonable assumption when $\phi$ represents the state of the center of mass of a macroscopic object. In this case, from the time $t_1$ on, the $y$-part of the Green's function $G_t(x, y)$ acts like a Dirac delta on $\phi(x)$, and the solution at time $t$ of the linear equation can be written as follows:
\[ \phi_t(x) \simeq \sqrt{\frac{2\pi}{\alpha_t}}\, K_t \exp\left[-\frac{\tilde\alpha_t}{2}\,x^2 + \tilde a_t x + \tilde c_t\right] \phi(Y_t^x). \tag{2.9} \]
This is a Gaussian state whose spread is controlled by $\tilde\alpha_t$, which evolves in time in a way similar to $\alpha_t$; in particular:
\[ \tilde\alpha_t^R = \frac{2\lambda}{\omega}\,\frac{\sinh \omega t + \sin \omega t}{\cosh \omega t + \cos \omega t} \simeq \begin{cases} 2\lambda t \simeq (1.20 \times 10^{25}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{-1}})\, m\, t & t \ll \omega^{-1}, \\[2mm] \dfrac{2\lambda}{\omega} \simeq (2.39 \times 10^{29}\ \mathrm{m^{-2}\,kg^{-1}})\, m & t \to +\infty. \end{cases} \tag{2.10} \]
As we see, the spread $1/\sqrt{\tilde\alpha_t^R}$ is well below $\ell$, for any $t \geq t_1$. We can then conclude that, for times greater than the collapse time, any state initially well spread out in space is mapped into a very well localized wave function.
An important issue is where the wave function collapses to, given that the initial state is spread out in space. We now show that the position of the wave function after the collapse is distributed in very good agreement with the Born probability rule.
A reasonable measure of where the wave function is, after it has collapsed, is given by the quantum average of the position operator $\langle q \rangle_t$. Accordingly, the probability for the collapsed wave function to lie within a Borel measurable set $A$ of $\mathbb{R}$ can be simply defined to be $P_t^{\mathrm{coll}}[A] := P[\omega : \langle q \rangle_t \in A]$. Though this probability is mathematically well defined for any Borel measurable subset $A$, it is physically meaningful only when $A$ represents an interval $\Delta$ much larger than the spread of the wave function itself, or a sum of such intervals. In such a case, as discussed in [49], one can show that:
\[ P_t^{\mathrm{coll}}[\Delta] \simeq E_P[\|P_\Delta \psi_t\|^2] \equiv \int_\Delta p_t(x)\,dx, \tag{2.11} \]
where $P_\Delta(x)$ is the characteristic function of the interval $\Delta$ of the real axis and $p_t = E_P[|\psi_t(x)|^2]$. The idea behind the approximate equality (2.11) is that when $\psi_t$ lies within $\Delta$, then $P_\Delta \psi_t \simeq \psi_t$, so that $\|P_\Delta \psi_t\|^2$ is almost equal to 1, while when it lies outside $\Delta$, it is practically 0. The critical situations, which require special care, are those when the wave function lies at the edges of $\Delta$.
In [39] it has been proven that:
\[ p_t(x) = \sqrt{\frac{\Lambda_t}{\pi}} \int dy\, e^{-\Lambda_t y^2}\, p_t^{\mathrm{Sch}}(x + y), \tag{2.12} \]
\[ \Lambda_t = \frac{3\, m\, m_0}{2\hbar^2 \lambda_0 t^3} \simeq (2.27 \times 10^{43}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{3}})\, \frac{m}{t^3}, \tag{2.13} \]
where $p_t^{\mathrm{Sch}}(x) = |\phi_t^{\mathrm{Sch}}(x)|^2$ and $\phi_t^{\mathrm{Sch}}(x)$ is the solution of the standard free-particle Schrödinger equation, for the given initial condition $\phi(x)$. For the times we are considering ($t = t_1$), the Gaussian term in (2.12) is much more peaked than any typical quantum probability distribution $p_t^{\mathrm{Sch}}(x)$, and consequently acts like a Dirac delta on it; accordingly, $p_t(x) \simeq p_t^{\mathrm{Sch}}(x)$. Finally, for macroscopic systems and for the times we are considering, the wave function solution of the free-particle Schrödinger equation does not change appreciably, implying that $p_t^{\mathrm{Sch}}(x) \simeq p_0^{\mathrm{Sch}}(x) = |\phi(x)|^2$, which means precisely that the collapse probability is distributed in agreement with the Born probability rule.

2.2. The classical regime

After time $t_1$, we are left with a wave function which, when $m$ is the mass of a macroscopic particle, is very well localized in space, almost point-like. This is the way in which collapse models reproduce the particle-like behavior of classical systems, within the framework of a wave-like dynamics. The relevant question now is to unfold the time evolution of the position and momentum of the wave function, to see whether it matches Newton's laws.
When the wave function is well localized in space ($t > t_1$), one can reasonably assume that it can be approximated with the Gaussian state to which, as we shall see, it asymptotically converges. We will analyze the time evolution of such a Gaussian state in the following, and we will see that its mean position $\bar x_t$ and momentum $\bar k_t$ evolve in time as follows (see Eqs. (5.28) and (5.29)):
\[ \bar x_t = \bar x_{t_1} + \frac{\hbar}{m}\,\bar k_{t_1}(t - t_1) + \sqrt{\lambda}\,\frac{\hbar}{m} \int_{t_1}^{t} W_s\, ds + \sqrt{\frac{\hbar}{m}}\,(W_t - W_{t_1}), \tag{2.14} \]
\[ \bar k_t = \bar k_{t_1} + \sqrt{\lambda}\,(W_t - W_{t_1}). \tag{2.15} \]
We can easily recognize in the deterministic parts of the above equations the free-particle equations of motion of classical mechanics describing a particle moving along a straight line with constant velocity; the remaining terms are the fluctuations around the classical motion, driven by the Brownian motion $W_t$. The important feature of the above equations is that these fluctuations, for macroscopic masses, are very small, for very long times. As a matter of fact, if we estimate the Brownian motion fluctuations by setting $W_t \sim \sqrt{t}$, we have for the stochastic terms in Eq. (2.14):
\[ \sqrt{\lambda}\,\frac{\hbar}{m} \int_{t_1}^{t} W_s\, ds \sim \frac{2}{3}\,\sqrt{\lambda}\,\frac{\hbar}{m}\, t^{3/2} \simeq (1.63 \times 10^{-22}\ \mathrm{m\,kg^{1/2}\,sec^{-3/2}})\, \frac{t^{3/2}}{\sqrt{m}}, \tag{2.16} \]
\[ \sqrt{\frac{\hbar}{m}}\,(W_t - W_{t_1}) \sim \sqrt{\frac{\hbar t}{m}} \simeq (1.02 \times 10^{-17}\ \mathrm{m\,kg^{1/2}\,sec^{-1/2}})\, \sqrt{\frac{t}{m}}. \tag{2.17} \]
We see that the random fluctuations scale as $1/\sqrt{m}$, the inverse square root of the mass of the particle, which means that the bigger the system, the more deterministic its motion. This is how collapse models recover classical determinism at the macroscopic level, from a fundamentally stochastic theory.
We can introduce a time $t_2$, defined as the time after which the fluctuations become larger than $L$; we can set, e.g., $L \simeq 1.00 \times 10^{-3}$ m. Since the fluctuations in (2.16) grow faster than those in (2.17), we can set:
\[ t_2 \simeq \left(\frac{3}{2}\,\frac{L\, m}{\hbar \sqrt{\lambda}}\right)^{2/3} \simeq (3.55 \times 10^{12}\ \mathrm{sec\,kg^{-1/3}})\, \sqrt[3]{m} \simeq (1.13 \times 10^{5}\ \mathrm{year\,kg^{-1/3}})\, \sqrt[3]{m}. \tag{2.18} \]
The time $t_2$ defines the time interval $[t_1, t_2]$ during which the classical regime holds. As we can see, for macroscopic systems this is a very long time, much longer than the time during which a macro-object can be kept isolated from the rest of the universe, so that its dynamics is described by Eq. (1.1).
To summarize, during the classical regime, which for macroscopic systems lasts very long, the wave function behaves, for all practical purposes, like a point moving deterministically in space according to Newton's laws. In other words, the wave function reproduces the motion of a classical particle.
2.3. The diffusive regime
After time $t_2$, two new effects become dominant: first, the wave function converges towards a Gaussian state, as we shall prove. Second, the motion becomes more and more erratic: the dynamics begins to depart from the classical one, showing its intrinsic stochastic nature.
A thorough mathematical analysis of these time regimes and their main properties is still lacking. In this paper, as we have anticipated, we focus only on the long time behavior of the solutions of Eq. (1.1), leaving the study of the remaining properties as open problems for future research.

3. Solution of the Equation


In the first part of this section, we derive the Green's function (1.12) in a way which will make clear the connection between Eq. (1.6) and the equation of the so-called non-self-adjoint (NSA) harmonic oscillator [52–54]. This connection is important for two reasons: from a physical point of view, it will bring a deep insight into how the collapse of the wave function actually works; from a mathematical point of view, it will allow us to prove rigorously both Theorems 1.1 and 1.2 presented in the introductory section.
A way to connect Eq. (1.6) with that of the NSA harmonic oscillator is to apply suitable transformations to the wave function so as to transform the SDE into a Schrödinger-like equation. We will do this in two steps. We present this section in detail for convenience, although the approach goes back to Kolokoltsov [35].
3.1. Reduction of Eq. (1.6) to a linear differential equation with random coefficients
The idea is to remove the stochastic differential term $\sqrt{\lambda}\, q\, d\xi_t$ from Eq. (1.6): borrowing the language of quantum mechanics, we shift to a sort of interaction picture by defining a suitable operator which maps the solution of Eq. (1.6) to the solution of a new equation which does not have that stochastic term. To this end, let us consider the operator $Q_a : D(Q_a) \subseteq L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined as follows:
\[ Q_a \phi(x) = e^{ax} \phi(x), \qquad a \in \mathbb{C}; \tag{3.1} \]
where $D(Q_a)$ is defined as the set of all $\phi(x) \in L^2(\mathbb{R})$ such that $e^{ax}\phi(x) \in L^2(\mathbb{R})$. It should be noted that, in general, the operator $Q_a$ is unbounded and its domain $D(Q_a)$ is dense in $L^2(\mathbb{R})$ but does not coincide with it. We will settle all technical issues in the second part of the section. We now define the vector:
\[ \phi_t^{(1)} = Q_{-\sqrt{\lambda}\xi_t}\, \phi_t; \tag{3.2} \]
an easy application of Itô calculus shows that $\phi_t^{(1)}$ satisfies the differential equation:
\[ d\phi_t^{(1)} = \left[-\frac{i}{\hbar}\, Q_{-\sqrt{\lambda}\xi_t}\, \frac{p^2}{2m}\, Q_{-\sqrt{\lambda}\xi_t}^{-1} - \lambda q^2\right] \phi_t^{(1)}\, dt, \qquad \phi_0^{(1)} = \phi. \tag{3.3} \]
The stochastic differential $\sqrt{\lambda}\, q\, d\xi_t$ has disappeared; in turn, the free Hamiltonian $p^2/2m$ has been replaced by the operator $Q_{-\sqrt{\lambda}\xi_t}(p^2/2m)Q_{-\sqrt{\lambda}\xi_t}^{-1}$ which, due to the specific commutation relations between $q$ and $p$, takes the simple form:
\[ Q_{-\sqrt{\lambda}\xi_t}\, p^2\, Q_{-\sqrt{\lambda}\xi_t}^{-1} = p^2 - 2i\hbar\sqrt{\lambda}\,\xi_t\, p - \hbar^2 \lambda\, \xi_t^2. \tag{3.4} \]
Equation (3.3) can then be rewritten as follows:
\[ i\hbar\, \frac{d}{dt}\, \phi_t^{(1)} = \left[\frac{p^2}{2m} - i\hbar\lambda q^2 - i\,\frac{\hbar\sqrt{\lambda}}{m}\,\xi_t\, p - \frac{\hbar^2\lambda}{2m}\,\xi_t^2\right] \phi_t^{(1)}. \tag{3.5} \]
This is a standard differential equation with random coefficients; note that the operator on the right-hand side is not self-adjoint, due to the presence of the second and third terms. The last term of Eq. (3.5) is a multiple of the identity operator and can be removed by defining:
\[ \phi_t^{(2)} = \exp\left[-\frac{i\hbar\lambda}{2m} \int_0^t \xi_s^2\, ds\right] \phi_t^{(1)}; \tag{3.6} \]
we then obtain:
\[ i\hbar\, \frac{d}{dt}\, \phi_t^{(2)} = \left[\frac{p^2}{2m} - i\hbar\lambda q^2 - i\,\frac{\hbar\sqrt{\lambda}}{m}\,\xi_t\, p\right] \phi_t^{(2)}. \tag{3.7} \]
The third term on the right-hand side contains a time dependent coefficient, and the next step aims at removing it.
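The conjugation identity (3.4) can be checked in two lines. The sketch below writes $a$ for the number multiplying $x$ in the exponent and $f$ for a generic smooth test function; both symbols are introduced here only for illustration, and $p = -i\hbar\, d/dx$ is assumed throughout.

```latex
\begin{align*}
e^{ax}\, p\, e^{-ax} f(x)
  &= -i\hbar\, e^{ax} \frac{d}{dx}\bigl(e^{-ax} f(x)\bigr)
   = \bigl(p + i\hbar a\bigr) f(x), \\
e^{ax}\, p^{2}\, e^{-ax}
  &= \bigl(p + i\hbar a\bigr)^{2}
   = p^{2} + 2 i\hbar a\, p - \hbar^{2} a^{2}.
\end{align*}
```

Choosing $a = -\sqrt{\lambda}\,\xi_t$, which is the choice that cancels the noise term of Eq. (1.6), gives $p^2 - 2i\hbar\sqrt{\lambda}\,\xi_t\,p - \hbar^2\lambda\,\xi_t^2$.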
3.2. Reduction of Eq. (3.7) to a differential equation with constant coefficients
The idea we now follow is to perform a transformation similar to a boost. We introduce the operator $P_a : D(P_a) \subseteq L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined as:
\[ P_{ia/\hbar}\, \phi(x) = \phi(x + a), \qquad a \in \mathbb{C}, \tag{3.8} \]
where $D(P_a)$ is the set of all $\phi(x) \in L^2(\mathbb{R})$ which can be analytically continued to the line $x + a$ in the complex plane $\mathbb{C}$, and such that $\phi(x + a) \in L^2(\mathbb{R})$. Similarly to $Q_a$, $P_a$ is also in general an unbounded operator and its domain $D(P_a)$, though dense, does not coincide with $L^2(\mathbb{R})$; we will come back to this point later in this section. We define the operator:
\[ V_t = \exp(-i a_t/\hbar)\, P_{i b_t/\hbar}\, Q_{-i c_t/\hbar}, \tag{3.9} \]
where the coefficients $a_t$, $b_t$ and $c_t$, yet to be determined, will turn out to be complex random functions of time. One can easily verify that:
\[ V_t\, q\, V_t^{-1} = q + b_t, \tag{3.10} \]
\[ V_t\, p\, V_t^{-1} = p + c_t, \tag{3.11} \]
and similarly for higher powers of $q$ and $p$. Let us define the vector:
\[ \psi_t = V_t\, \phi_t^{(2)}, \tag{3.12} \]
which solves the equation:
\[ \begin{aligned} i\hbar\, \frac{d}{dt}\, \psi_t = \Bigl[ &\frac{p^2}{2m} - i\hbar\lambda q^2 - \Bigl(\dot b_t - \frac{1}{m}\, c_t + \frac{i\hbar\sqrt{\lambda}}{m}\,\xi_t\Bigr) p + (\dot c_t - 2i\hbar\lambda\, b_t)\, q \\ &+ \dot a_t + \dot c_t\, b_t + \frac{1}{2m}\, c_t^2 - \frac{i\hbar\sqrt{\lambda}}{m}\,\xi_t\, c_t - i\hbar\lambda\, b_t^2 \Bigr] \psi_t. \end{aligned} \tag{3.13} \]
The time-dependent part of the equation can be removed by requiring that $a_t$, $b_t$ and $c_t$ satisfy the first-order differential equations:
\[ \begin{aligned} m\,\dot b_t - c_t &= -i\hbar\sqrt{\lambda}\,\xi_t, & b_0 &= 0, \\ \dot c_t - 2i\hbar\lambda\, b_t &= 0, & c_0 &= 0 \end{aligned} \tag{3.14} \]
and
\[ \dot a_t + i\hbar\lambda\, b_t^2 + \frac{1}{2m}\, c_t^2 - \frac{i\hbar\sqrt{\lambda}}{m}\,\xi_t\, c_t = 0, \qquad a_0 = 0. \tag{3.15} \]
The first two equations form a non-homogeneous linear system of first-order differential equations, which has a unique Q-a.s. continuous random solution; the third equation instead determines the global factor $a_t$, which is also random. With such a choice for the three parameters, Eq. (3.13) becomes:
\[ i\hbar\, \frac{d}{dt}\, \psi_t = \left[\frac{p^2}{2m} - i\hbar\lambda q^2\right] \psi_t, \qquad \psi_0 = \phi, \tag{3.16} \]
which is the equation of the so-called non-self-adjoint (NSA) harmonic oscillator, whose solution and most important properties are well known. Before continuing, we note that in the case of a more general Hamiltonian $H = p^2/2m + V(q)$ appearing in Eq. (1.1) in place of just the free evolution $p^2/2m$, the potential $V(q)$ would have been transformed, when going from Eq. (3.7) to Eq. (3.16), according to the rule: $V_t V(q) V_t^{-1} = V(q + b_t)$; in this case, we would not be able to remove completely the time-dependent terms from the equation and we would not be able to reduce the original equation to one whose solution is known. However, besides the free particle case, all equations containing terms at most quadratic in $q$ and $p$ (among them, the important case of the harmonic oscillator) can be solved in a similar way.
The solution of Eq. (3.16) admits a representation in terms of the Green's function, also known as Mehler's formula:
\[ G_t^{\mathrm{NSA}}(x, y) = \sqrt{\frac{\lambda}{\pi \upsilon \sinh \upsilon t}}\, \exp\left[-\frac{\lambda}{\upsilon}\,(x^2 + y^2) \coth \upsilon t + \frac{2\lambda}{\upsilon}\,\frac{x y}{\sinh \upsilon t}\right], \tag{3.17} \]
with $\upsilon$ and $\omega$ defined as in (1.19). In this way, we have established the link between the solutions of the SDE (1.6) and those of the equation for the NSA harmonic oscillator (3.16), which we summarize in the following lemma, whose proof is straightforward.

Lemma 3.1. Let $T_t^{\mathrm{NSA}}$ be the evolution operator represented by the Green's function $G_t^{\mathrm{NSA}}(x, y)$ and $T_t$ the one represented by $G_t(x, y)$; then:
\[ T_t \equiv \exp(i\gamma_t/\hbar)\, Q_{\sqrt{\lambda}\xi_t + (i c_t/\hbar)}\, P_{-i b_t/\hbar}\, T_t^{\mathrm{NSA}}, \tag{3.18} \]
where the two random functions $b_t$ and $c_t$ solve the linear system (3.14), and $\gamma_t$, which includes all global, i.e. independent of $x$, phase factors, solves the equation:
\[ \dot\gamma_t = -i\hbar\lambda\, b_t^2 - \frac{1}{2m}\, c_t^2 + \frac{i\hbar\sqrt{\lambda}}{m}\,\xi_t\, c_t + \frac{\hbar^2\lambda}{2m}\,\xi_t^2, \qquad \gamma_0 = 0. \tag{3.19} \]
We now proceed to prove in which sense $\phi_t := T_t \phi$ is the topological strong solution of Eq. (1.6) for the given initial condition $\phi$. We first need to establish some properties of the Green's function $G_t^{\mathrm{NSA}}(x, y)$ which will be necessary for the subsequent theorem.

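Lemma 3.2. The absolute value of $G_t^{\mathrm{NSA}}(x, y)$ is equal to:
\[ |G_t^{\mathrm{NSA}}(x, y)| = \sqrt{\frac{2\lambda}{\pi\omega \sqrt{\cosh \omega t - \cos \omega t}}}\, \exp\left[-\frac{\lambda}{\omega}\,(x^2 + y^2)\, p_t + 4\,\frac{\lambda}{\omega}\, x y\, q_t\right], \tag{3.20} \]
where we have introduced the following quantities:
\[ p_t = \frac{\sinh \omega t - \sin \omega t}{\cosh \omega t - \cos \omega t}, \tag{3.21} \]
\[ q_t = \frac{\sinh(\omega t/2) \cos(\omega t/2) - \cosh(\omega t/2) \sin(\omega t/2)}{\cosh \omega t - \cos \omega t}; \tag{3.22} \]
note that the function $p_t$ is positive for any $t > 0$. The integral of $|G_t^{\mathrm{NSA}}(x, y)|^2$ with respect to $y$ is equal to:
\[ \int dy\, |G_t^{\mathrm{NSA}}(x, y)|^2 = \sqrt{\frac{2\lambda}{\pi\omega\,(\sinh \omega t - \sin \omega t)}}\, \exp\left[-2\,\frac{\lambda}{\omega}\,\frac{p_t^2 - 4 q_t^2}{p_t}\, x^2\right]. \tag{3.23} \]
A simple calculation shows that $p_t^2 - 4 q_t^2 > 0$ for any $t > 0$; this means that $G_t^{\mathrm{NSA}}(x, \cdot)$, taken as a function of $y$, belongs to $L^2(\mathbb{R})$ for any $x \in \mathbb{R}$ and $t > 0$; moreover:
\[ \int dx\, \|G_t^{\mathrm{NSA}}(x, \cdot)\|^2 < +\infty \quad \text{for any } t > 0. \tag{3.24} \]
Finally, the following expression holds true:
\[ \begin{aligned} \int dy\, |e^{bx} G_t^{\mathrm{NSA}}(x + a, y)|^2 = {} & \sqrt{\frac{2\lambda}{\pi\omega\,(\sinh \omega t - \sin \omega t)}}\, \exp\Bigl\{ -2\,\frac{\lambda}{\omega} \Bigl[ \frac{p_t^2 - 4 q_t^2}{p_t}\, x^2 \\ & + 2\Bigl( p_t a_R + \tilde p_t a_I - 4\,\frac{q_t (q_t a_R + \tilde q_t a_I)}{p_t} \Bigr) x \\ & + p_t (a_R^2 - a_I^2) + 2 \tilde p_t a_R a_I - 4\,\frac{(q_t a_R + \tilde q_t a_I)^2}{p_t} \Bigr] + 2 b_R x \Bigr\}, \end{aligned} \tag{3.25} \]
with
\[ \tilde p_t = \frac{\sinh \omega t + \sin \omega t}{\cosh \omega t - \cos \omega t}, \tag{3.26} \]
\[ \tilde q_t = \frac{\sinh(\omega t/2) \cos(\omega t/2) + \cosh(\omega t/2) \sin(\omega t/2)}{\cosh \omega t - \cos \omega t}. \tag{3.27} \]
The above formulas imply that, for any $a, b \in \mathbb{C}$, for any $x \in \mathbb{R}$ and for any $t > 0$, the function $e^{bx} G_t^{\mathrm{NSA}}(x + a, \cdot)$ belongs to $L^2(\mathbb{R})$ and:
\[ \int dx\, \|e^{bx} G_t^{\mathrm{NSA}}(x + a, \cdot)\|^2 < +\infty. \tag{3.28} \]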
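The modulus formula of Lemma 3.2 and the positivity of $p_t^2 - 4q_t^2$ can be spot-checked numerically. The Python sketch below assumes that the NSA kernel has the standard Mehler form with complex frequency $\upsilon = (1+i)\omega/2$, and works in arbitrary units with $\lambda = \omega = 1$; these unit choices and all function names are assumptions made for illustration only.

```python
import cmath
import math


def g_nsa(x, y, t, lam=1.0, omega=1.0):
    """Mehler-type kernel with complex frequency upsilon = (1 + i) * omega / 2."""
    ups = (1 + 1j) * omega / 2
    prefactor = cmath.sqrt(lam / (math.pi * ups * cmath.sinh(ups * t)))
    exponent = -(lam / ups) * (
        (x * x + y * y) / cmath.tanh(ups * t) - 2 * x * y / cmath.sinh(ups * t)
    )
    return prefactor * cmath.exp(exponent)


def p_and_q(t, omega=1.0):
    """The real functions p_t and q_t of Eqs. (3.21) and (3.22)."""
    s = omega * t
    den = math.cosh(s) - math.cos(s)
    p = (math.sinh(s) - math.sin(s)) / den
    q = (math.sinh(s / 2) * math.cos(s / 2) - math.cosh(s / 2) * math.sin(s / 2)) / den
    return p, q


def abs_g_closed_form(x, y, t, lam=1.0, omega=1.0):
    """Closed form for |G| in terms of p_t and q_t, as in Eq. (3.20)."""
    p, q = p_and_q(t, omega)
    den = math.cosh(omega * t) - math.cos(omega * t)
    prefactor = math.sqrt(2 * lam / (math.pi * omega * math.sqrt(den)))
    return prefactor * math.exp(
        -(lam / omega) * (x * x + y * y) * p + 4 * (lam / omega) * x * y * q
    )


if __name__ == "__main__":
    for t in (0.3, 1.0, 2.5):
        p, q = p_and_q(t)
        gap = abs(abs(g_nsa(0.4, -0.7, t)) - abs_g_closed_form(0.4, -0.7, t))
        print(f"t = {t}: p^2 - 4 q^2 = {p * p - 4 * q * q:.6f}, |G| mismatch = {gap:.2e}")
```

The positivity also follows analytically: with $s = \omega t/2$, the numerator of $p_t^2 - 4q_t^2$ factorizes as $4(\sinh^2 s - \sin^2 s)(\cosh^2 s - \cos^2 s)$, which is positive for $s > 0$.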

We are now in a position to state and prove the main theorem of this section.

Theorem 3.1. Let $P_a$ and $Q_a$ be defined, respectively, as in (3.8) and (3.1); let $b_t$ and $c_t$ solve the linear system (3.14) and $\gamma_t$ be the solution of Eq. (3.19). Finally, let $\phi_t = T_t \phi$, with $\phi \in L^2(\mathbb{R})$ and $T_t$ defined as in (3.18). Then the following three statements hold true with probability 1:

(1) $T_t : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defines a bounded operator for every $t > 0$. (3.29)
(2) For every $\phi \in L^2_B(\mathbb{R})$, $\phi_t$ is a topological strong solution of Eq. (1.6). (3.30)
(3) For every $\phi \in L^2(\mathbb{R})$, $\lim_{t \to 0} \|\phi_t - \phi\| = 0$. (3.31)

Proof. Statement 1. Let $\phi$ belong to $L^2(\mathbb{R})$; since also $G_t^{\mathrm{NSA}}(x, \cdot)$ belongs to $L^2(\mathbb{R})$ for any $x \in \mathbb{R}$ and $t > 0$, Hölder's inequality implies that $G_t^{\mathrm{NSA}}(x, \cdot)\phi$ belongs to $L^1(\mathbb{R})$; accordingly, the operator $T_t^{\mathrm{NSA}}$ is well defined for any $t > 0$, and maps any $L^2(\mathbb{R})$-function into a measurable function. By using the Schwarz inequality together with relation (3.24), we have:
\[ \int dx\, \left| \int dy\, G_t^{\mathrm{NSA}}(x, y)\, \phi(y) \right|^2 \leq \|\phi\|^2 \int dx\, \|G_t^{\mathrm{NSA}}(x, \cdot)\|^2 < +\infty; \tag{3.32} \]
thus $T_t^{\mathrm{NSA}}\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi$ in $L^2(\mathbb{R})$ and for any $t > 0$.
In a similar way, since also $G_t^{\mathrm{NSA}}(x + a, \cdot)$ belongs to $L^2(\mathbb{R})$ for any $a \in \mathbb{C}$ and because of (3.28), one proves that $P_a T_t^{\mathrm{NSA}}\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi \in L^2(\mathbb{R})$, for any complex $a$ and for any $t > 0$, i.e. that $D(P_a)$ contains $R(T_t^{\mathrm{NSA}})$. Using once more the same inequalities and (3.28), one shows also that $Q_b P_a T_t^{\mathrm{NSA}}\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi$ in $L^2(\mathbb{R})$, for any $a, b \in \mathbb{C}$ and $t > 0$.
Remark. Actually a stronger statement is true, as can be readily seen from the Gaussian form of the Green's function $G_t$ of the operator $T_t$: for positive $t$, it maps $L^2(\mathbb{R})$ to the Schwartz space $S(\mathbb{R})$. We shall need this information in the proof of Statement 3.
Statement 2. Let us consider the vector $\psi_t := T_t^{\mathrm{NSA}}\phi$, with $\phi \in L^2_B(\mathbb{R})$. By construction, $\psi_t$ solves Eq. (3.16), once one proves that the integration
\[ \int dy\, G_t^{\mathrm{NSA}}(x, y)\, \phi(y) \tag{3.33} \]
can be exchanged with the first and second partial derivatives with respect to $x$ and with the first partial derivative with respect to $t$. We note that the function $G_t^{\mathrm{NSA}}(x, y)\phi(y)$ satisfies the following two properties: (i) the function $y \mapsto G_t^{\mathrm{NSA}}(x, y)\phi(y)$ is measurable and integrable on $\mathbb{R}$ for any $t > 0$ and for any $x \in \mathbb{R}$; (ii) the first and second partial derivatives with respect to $x$ and the first partial derivative with respect to $t$ exist for any $t > 0$, $x \in \mathbb{R}$ and $y \in \mathbb{R}$ and can be bounded uniformly with respect to $t$ and $x$. Accordingly, one can apply, e.g., [50, Theorem 12.13, p. 199] to conclude that the operations of integration and differentiation can be exchanged.
Having proved that $\psi_t$ solves Eq. (3.16), a direct application of Itô calculus proves that $\phi_t$, defined as in (3.18), is a topological strong solution of Eq. (1.6).
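Statement 3. Let $\phi = \phi_0 \in C_c^\infty(\mathbb{R})$ be given. Since $\phi_t$ solves Eq. (1.6) in a strong sense, it also solves the SDE in a weak sense; hence, using, e.g., [31, Eq. (1.1)], one has:
\[ \lim_{t \to 0} \langle \chi | \phi_t \rangle = \langle \chi | \phi_0 \rangle \qquad \forall \chi \in C_c^\infty(\mathbb{R}). \tag{3.34} \]
We extend (3.34) to the general case of $\chi \in L^2(\mathbb{R})$. Since $C_c^\infty(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, there exists a sequence $\{\chi_n \in C_c^\infty(\mathbb{R}),\ n \in \mathbb{N}\}$ which approximates any $\chi \in L^2(\mathbb{R})$. By the triangle and Schwarz inequalities we get
\[ |\langle \chi | \phi_t \rangle - \langle \chi | \phi_0 \rangle| \leq |\langle \chi_n | \phi_t \rangle - \langle \chi_n | \phi_0 \rangle| + \|\chi - \chi_n\|\, \|\phi_t\| + \|\chi - \chi_n\|\, \|\phi_0\|. \tag{3.35} \]
The first term on the right-hand side can be made arbitrarily small because of (3.34); the second and third terms can also be made arbitrarily small by choosing $n$ sufficiently large, while $\|\phi_t\|$ can be bounded as it converges to $\|\phi_0\|$ for $t \to 0$, due to Eq. (1.9). This proves that:
\[ \lim_{t \to 0} \langle \chi | \phi_t \rangle = \langle \chi | \phi_0 \rangle \qquad \forall \chi \in L^2(\mathbb{R}). \tag{3.36} \]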

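Statement 3 for test functions $\phi \in C_c^\infty(\mathbb{R})$ now follows directly from Eq. (1.9), Eq. (3.36) and observing $\|\phi_t - \phi_0\|^2 = \|\phi_t\|^2 + \|\phi_0\|^2 - 2\langle \phi_0 | \phi_t \rangle^R$. It remains to extend the strong continuity of $T_t$ from the subspace $C_c^\infty(\mathbb{R})$ to $L^2(\mathbb{R})$. For this, observe that for $\phi \in C_c^\infty(\mathbb{R})$, $(\|\phi_t\|^2)_{t \geq 0}$ defines a stochastic process with continuous paths and by Holevo's result (cf. Eq. (1.9)) it is a martingale. For given $f \in L^2(\mathbb{R})$ choose a sequence $(\phi^n)_{n \in \mathbb{N}} \subset C_c^\infty(\mathbb{R})$, which converges to $f$ in $L^2(\mathbb{R})$. Doob's inequality for submartingales implies that for all $n, m \in \mathbb{N}$, $T > 0$ and $\varepsilon > 0$
\[ Q\Bigl[\sup_{0 \leq t \leq T} \bigl| \|\phi_t^n\|^2 - \|\phi_t^m\|^2 \bigr| > \varepsilon\Bigr] \leq \frac{1}{\varepsilon}\, E_Q\bigl[\bigl| \|\phi_T^n\|^2 - \|\phi_T^m\|^2 \bigr|\bigr]. \tag{3.37} \]
We now show that
\[ \lim_{n, m \to \infty} E_Q\bigl[\bigl| \|\phi_T^n\|^2 - \|\phi_T^m\|^2 \bigr|\bigr] = 0. \tag{3.38} \]
The elementary inequality
\[ \bigl| \|\phi_t^n\|^2 - \|\phi_t^m\|^2 \bigr| \leq (\|\phi_t^n\| + \|\phi_t^m\|)\, \|\phi_t^n - \phi_t^m\| \]
implies that
\[ \begin{aligned} E_Q\bigl[\bigl| \|\phi_t^n\|^2 - \|\phi_t^m\|^2 \bigr|\bigr] &\leq E_Q\bigl[(\|\phi_t^n\| + \|\phi_t^m\|)\, \|\phi_t^n - \phi_t^m\|\bigr] \\ &\leq \bigl(E_Q[(\|\phi_t^n\| + \|\phi_t^m\|)^2]\bigr)^{1/2} \bigl(E_Q[\|\phi_t^n - \phi_t^m\|^2]\bigr)^{1/2} \\ &\leq \sqrt{2}\, \bigl(E_Q[\|\phi_t^n\|^2] + E_Q[\|\phi_t^m\|^2]\bigr)^{1/2}\, \|\phi^n - \phi^m\| \\ &= \sqrt{2}\, \bigl(\|\phi^n\|^2 + \|\phi^m\|^2\bigr)^{1/2}\, \|\phi^n - \phi^m\|. \end{aligned} \]
The right-hand side converges to 0 as $n, m \to \infty$. Therefore, the sequence of stochastic processes $(\|\phi_t^n\|^2)_{t \geq 0}$ is a Cauchy sequence in the complete metric space $(D, d)$ of adapted processes with right continuous paths having left limits, where the metric $d$ is defined as (see [51, pp. 56–57] for background concerning this topology)
\[ d(X, Y) = \sum_{n=1}^{\infty} \frac{1}{2^n}\, E_Q\Bigl[\min\Bigl(1, \sup_{0 \leq s \leq n} |(X - Y)_s|\Bigr)\Bigr] \qquad (X, Y \in D). \]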
Therefore $(\|\phi_t^n\|^2)_{t \geq 0}$ converges locally uniformly in probability to a stochastic process. This stochastic process again has to be continuous almost surely, since a subsequence of $(\|\phi_t^n\|^2)_{t \geq 0}$ converges locally uniformly with probability one. Since $\lim_{n \to \infty} \|\phi_t^n\|^2 = \|f_t\|^2$ almost surely, we know that $[0, \infty) \ni t \mapsto \|f_t\|^2$ is continuous, in particular $\lim_{t \to 0} \|f_t\| = \|f\|$ almost surely, and by the lemma of Fatou it defines a positive continuous supermartingale. Therefore, it has a unique decomposition $\|f_t\|^2 = M_t - A_t$, where $(M_t)_{t \geq 0}$ is a continuous martingale and $(A_t)_{t \geq 0}$ is an increasing process. In fact, as we shall show now, the increasing process is identically 0, i.e. $(\|f_t\|^2)_{t \geq 0}$ is a positive martingale for every $f \in L^2(\mathbb{R})$. To see this, recall from the remark above that for positive $\varepsilon$ the function $f_\varepsilon$ almost surely belongs to the Schwartz space and in particular to the domain of the generator. By Holevo's result cited above, $(\|T_t f\|^2)_{t \geq \varepsilon}$ is a continuous martingale. Therefore, $A_t - A_\varepsilon = 0$ for $t > \varepsilon > 0$ and hence the increasing process equals 0 almost surely. In order to ensure strong convergence $\lim_{t \to 0} \|f_t - f\| = 0$ we need only show that weak convergence holds, i.e. $\lim_{t \to 0} \langle \chi | f_t \rangle = \langle \chi | f \rangle$. Observing
\[ |\langle \chi | f_t \rangle - \langle \chi | f \rangle| \leq |\langle \chi | f_t \rangle - \langle \chi | \phi_t^n \rangle| + |\langle \chi | \phi_t^n \rangle - \langle \chi | \phi^n \rangle| + |\langle \chi | \phi^n \rangle - \langle \chi | f \rangle| \]
it suffices to show that for some $T > 0$, $\lim_{n \to \infty} \sup_{t \leq T} |\langle \chi | f_t \rangle - \langle \chi | \phi_t^n \rangle| = 0$. But $\sup_{t \leq T} |\langle \chi | f_t \rangle - \langle \chi | \phi_t^n \rangle| \leq \|\chi\| \sup_{t \leq T} \|f_t - \phi_t^n\|$. Therefore, we need only establish that $\lim_{n \to \infty} \sup_{t \leq T} \|f_t - \phi_t^n\| = 0$. This is done by an argument similar to the one above, namely we show that for every $\varepsilon > 0$
\[ \lim_{n \to \infty} Q\Bigl[\sup_{t \leq T} \|f_t - \phi_t^n\|^2 > \varepsilon\Bigr] = 0, \]
because then there exists a subsequence which is almost surely convergent to 0. But as we showed above, $(\|g_t\|^2)_{t \geq 0}$ is a martingale for every $g \in L^2(\mathbb{R})$. Hence $(\|f_t - \phi_t^n\|^2)_{t \geq 0}$ is a martingale and we can again apply Doob's inequality as we did before.

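Remark 1. The Gaussian form of the Green's function (1.12) is a consequence of the fact that Eq. (1.6) contains terms which are at most quadratic in $q$ and $p$. This in particular implies that the dynamics preserves the shape of initially Gaussian wave functions; in fact, as shown e.g. in [30, 34, 35, 39], a state
\[ \phi_t(x) = \exp[-\alpha_t (x - x_t^m)^2 + i k_t^m x + \gamma_t], \tag{3.39} \]
is a solution of Eq. (1.6) provided that the two real parameters $x_t^m$, $k_t^m$ and the two complex parameters $\alpha_t$, $\gamma_t$ satisfy the following stochastic differential equations:
\[ d\alpha_t = \left[\lambda - \frac{2i\hbar}{m}\,(\alpha_t)^2\right] dt, \tag{3.40} \]
\[ dx_t^m = \frac{\hbar}{m}\, k_t^m\, dt + \frac{\sqrt{\lambda}}{2\alpha_t^R}\, [d\xi_t - 2\sqrt{\lambda}\, x_t^m\, dt], \tag{3.41} \]
\[ dk_t^m = -\sqrt{\lambda}\, \frac{\alpha_t^I}{\alpha_t^R}\, [d\xi_t - 2\sqrt{\lambda}\, x_t^m\, dt], \tag{3.42} \]
\[ d\gamma_t^R = \left[\lambda (x_t^m)^2 + \frac{\hbar}{m}\, \alpha_t^I + \frac{\lambda}{4\alpha_t^R}\right] dt + \sqrt{\lambda}\, x_t^m\, [d\xi_t - 2\sqrt{\lambda}\, x_t^m\, dt], \tag{3.43} \]
\[ d\gamma_t^I = \left[-\frac{\hbar}{2m}\, (k_t^m)^2 - \frac{\hbar}{m}\, \alpha_t^R + \frac{\lambda\, \alpha_t^I}{4(\alpha_t^R)^2}\right] dt + \sqrt{\lambda}\, \frac{\alpha_t^I}{\alpha_t^R}\, x_t^m\, [d\xi_t - 2\sqrt{\lambda}\, x_t^m\, dt]. \tag{3.44} \]
In particular, the solution of Eq. (3.40) is $\alpha_t = (\lambda/\upsilon) \coth(\upsilon t + \kappa)$, where $\kappa$ sets the initial condition. These results will be useful in the subsequent analysis.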
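The closed-form solution of Eq. (3.40) can be checked numerically. The sketch below reads (3.40) as $d\alpha_t/dt = \lambda - (2i\hbar/m)\alpha_t^2$ and assumes $\upsilon = (1+i)\omega/2$ with $\omega = 2\sqrt{\hbar\lambda/m}$; it works in arbitrary units $\hbar = m = \lambda = 1$, and the function names and the sample initial-condition parameter are illustrative only.

```python
import cmath

HBAR = 1.0   # arbitrary units, chosen only for the check
MASS = 1.0
LAM = 1.0
OMEGA = 2.0 * (HBAR * LAM / MASS) ** 0.5   # assumed relation omega = 2 sqrt(hbar lambda / m)
UPS = (1 + 1j) * OMEGA / 2                 # assumed relation upsilon = (1 + i) omega / 2


def alpha(t, kappa):
    """Candidate solution alpha_t = (lambda / upsilon) * coth(upsilon * t + kappa)."""
    return (LAM / UPS) / cmath.tanh(UPS * t + kappa)


def residual(t, kappa, h=1.0e-5):
    """Central-difference derivative of alpha minus the right-hand side of (3.40)."""
    derivative = (alpha(t + h, kappa) - alpha(t - h, kappa)) / (2 * h)
    rhs = LAM - (2j * HBAR / MASS) * alpha(t, kappa) ** 2
    return derivative - rhs


if __name__ == "__main__":
    kappa = 0.3 + 0.1j  # arbitrary complex constant fixing the initial width
    for t in (0.2, 0.7, 3.0):
        print(f"t = {t}: |residual| = {abs(residual(t, kappa)):.2e}")
    print("long-time limit:", alpha(20.0, kappa), "vs lambda/upsilon:", LAM / UPS)
```

For large times $\coth(\upsilon t + \kappa) \to 1$, so every Gaussian width relaxes to the same value $\lambda/\upsilon$, independently of the initial condition.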

4. Representation of the Solution in Terms of Eigenstates of the NSA Harmonic Oscillator

We now turn to the problem of analyzing the long time behavior of the solution of the (norm-preserving) nonlinear Eq. (1.1). The representation of the solution $\phi_t$ of Eq. (1.6) in terms of the Green's function (1.12) is not suitable for controlling the long time behavior; it turns out to be more convenient to express $\phi_t$ in terms of the eigenstates of the NSA harmonic oscillator, resorting to the connection which we previously established between Eqs. (1.6) and (3.16). In this way, as we shall see, the collapse process will be manifest: the coefficients of the superposition will decrease exponentially in time, the damping being faster the higher the associated eigenstate. Accordingly, when normalization is also taken into account, in the large time limit only the ground state survives, which has a Gaussian shape.
We first recall a few basic features of the Hamiltonian of the NSA harmonic oscillator,
\[ H \equiv \frac{p^2}{2m} - i\hbar\lambda q^2, \tag{4.1} \]
which has been studied in particular by Davies in a series of papers [52, 53] and reviewed in his recent book [54]. The eigenvalues of $H$ are complex and equal to:
\[ \lambda_n \equiv \frac{1 - i}{2}\, \hbar\omega\, \tilde n, \qquad \tilde n \equiv n + \frac{1}{2}, \tag{4.2} \]
and the corresponding eigenvectors are:
\[ \phi^{(n)}(x) \equiv \sqrt{z}\, e^{-z^2 x^2/2}\, \bar H_n(z x), \qquad z^2 \equiv (1 - i) \sqrt{\frac{\lambda m}{\hbar}}, \tag{4.3} \]
where $\bar H_n(x)$ is the normalized Hermite polynomial of degree $n$. Since the argument of $\bar H_n$ in (4.3) is complex, these eigenstates are not orthogonal; it can be shown that they are linearly independent and form a complete set, however they do not form a basis. As such, they cannot be directly used to expand an initial state into a superposition of the eigenstates of $H$. This problem can be circumvented in the following way, also discussed by Davies.

It is easy to see that the sequences $\{\phi^{(n)}\}$ and $\{\phi^{(n)*}\}$ form a bi-orthonormal system; one then defines the (non-orthogonal) projection operators:
\[ P_n \phi \equiv \langle \phi^{(n)*} | \phi \rangle\, \phi^{(n)} = \alpha_n\, \phi^{(n)}, \tag{4.4} \]
which satisfy the relations:
\[ P_n P_m = \delta_{n,m} P_n, \qquad \|P_n\| = \|\phi^{(n)}\|^2 \qquad \text{and} \qquad \lim_{n \to +\infty} \frac{\ln \|P_n\|}{n} = 2c, \tag{4.5} \]
where $c$ is an appropriate constant [54]. As we see, although the states $\phi^{(n)}$ are normalized, in the sense that
\[ \int_{-\infty}^{+\infty} \phi^{(n)}(x)\, \phi^{(m)}(x)\, dx = \delta_{n,m}, \tag{4.6} \]
the norm of the projection operators $P_n$ grows exponentially as $n \to +\infty$. Finally, the following equality holds true [54]:
\[ T_t^{\mathrm{NSA}} = \sum_{n=0}^{\infty} e^{-(1+i)\omega \tilde n t/2}\, P_n \qquad \text{for } t > 4c/\omega. \tag{4.7} \]
A remarkable property of the above representation of the solution of Eq. (3.16) in terms of the eigenstates of the operator (4.1) is that it holds not for any $t \geq 0$, as one would naively expect, but only for $t > 4c/\omega$. The reason is that the norm of the projection operators $P_n$ grows exponentially with $n$, so one has to wait for $t$ to be large enough in order for the term $e^{-\omega n t/2}$ to suppress the exponential growth of the projectors. From a physical point of view, recalling the discussion of Sec. 2, since the constant $c$ is of order 1 [54] and $\omega \simeq 5.01 \times 10^{-5}\ \mathrm{sec^{-1}}$, we see that the representation (4.7) holds true only in part of the classical regime and in the diffusive regime, which is the one we are interested in studying now, but not in the physically more crucial collapse regime.
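We now apply the above results to our problem; we will first proceed in an informal way, and at the end we will prove the relevant theorems. Let $\phi \in L^2(\mathbb{R})$; then, according to (3.18) and (4.7):
\[ \phi_t(x) = T_t \phi = e^{[\sqrt{\lambda}\xi_t + i c_t/\hbar]\, x + i\gamma_t/\hbar} \sum_{n=0}^{+\infty} \alpha_n\, e^{-(1+i)\omega \tilde n t/2}\, \phi^{(n)}(x - b_t) \tag{4.8} \]
\[ \phantom{\phi_t(x)} = e^{-z^2 (x - \bar x_t)^2/2 + i \bar k_t x + \bar\gamma_t}\, \sqrt{z} \sum_{n=0}^{+\infty} \alpha_n\, e^{-(1+i)\omega \tilde n t/2}\, \bar H_n[z(x - b_t)], \tag{4.9} \]
where $\alpha_n = \langle \phi^{(n)*} | \phi \rangle$ (see Eq. (4.4)), while the two real parameters $\bar x_t$, $\bar k_t$ and the complex parameter $\bar\gamma_t$ are defined as follows:
\[ \bar x_t = b_t^R + b_t^I - (2/m\omega)\, c_t^I + (\omega/2\sqrt{\lambda})\, \xi_t, \tag{4.10} \]
\[ \bar k_t = (m\omega/\hbar)\, b_t^I + (1/\hbar)(c_t^R - c_t^I) + \sqrt{\lambda}\, \xi_t, \tag{4.11} \]
\[ \bar\gamma_t = -(1 - i)(m\omega/4\hbar)(b_t^2 - \bar x_t^2) + (i/\hbar)\, \gamma_t. \tag{4.12} \]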

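By resorting to Eqs. (3.14) and (3.19), and after a rather long calculation, we obtain the following set of SDEs for these parameters:
\[ d\bar x_t = \frac{\hbar}{m}\, \bar k_t\, dt + \sqrt{\frac{\hbar}{m}}\, [d\xi_t - 2\sqrt{\lambda}\, \bar x_t\, dt], \tag{4.13} \]
\[ d\bar k_t = \sqrt{\lambda}\, [d\xi_t - 2\sqrt{\lambda}\, \bar x_t\, dt], \tag{4.14} \]
\[ d\bar\gamma_t^R = \left[\lambda \bar x_t^2 + \frac{\omega}{4}\right] dt + \sqrt{\lambda}\, \bar x_t\, [d\xi_t - 2\sqrt{\lambda}\, \bar x_t\, dt], \tag{4.15} \]
\[ d\bar\gamma_t^I = -\left[\frac{\hbar}{2m}\, \bar k_t^2 + \frac{\omega}{4}\right] dt - \sqrt{\lambda}\, \bar x_t\, [d\xi_t - 2\sqrt{\lambda}\, \bar x_t\, dt]; \tag{4.16} \]
the initial conditions are: $\bar x_0 = \bar k_0 = \bar\gamma_0 = 0$. Note that these equations are equivalent to (3.41)–(3.44), with $\alpha_t = \alpha_\infty = \lambda/\upsilon = z^2/2$, $\bar x_t = x_t^m$, $\bar k_t = k_t^m$ and $\bar\gamma_t = \gamma_t + (1 + i)\omega t/4$; as a matter of fact, the above equations describe the time evolution (according to Eq. (1.6)) of the ground state of the NSA harmonic oscillator, which is:
\[ \phi_t^{\infty}(x) = \exp\left[-\frac{z^2}{2}\,(x - \bar x_t)^2 + i \bar k_t x + \bar\gamma_t - \frac{1 + i}{4}\,\omega t\right], \qquad \phi_0^{\infty}(x) = \phi^{(0)}(x). \tag{4.17} \]
As we shall prove in the next section, this is the state to which, apart from normalization, any initial state converges in the long time limit, hence the name $\phi_t^{\infty}$.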
As we see, due to the stochastic part of the dynamics, the argument of the Gaussian weighting factor and that of the Hermite polynomials of Eq. (4.9) are different functions of time, while for analyzing the long time behavior of the wave function, it is more convenient that both arguments display the same time dependence. We thus modify the argument of the Hermite polynomials, to make it equal to that of the weighting factor. To this end, let us define $\zeta_t = \bar x_t - b_t$; we can then write:
\[ \begin{aligned} \bar H_n[z(x - b_t)] &= \frac{1}{\sqrt{2^n n! \sqrt{\pi}}}\, H_n[z(x - \bar x_t) + z\zeta_t] \\ &= \frac{1}{\sqrt{2^n n! \sqrt{\pi}}} \sum_{m=0}^{n} \binom{n}{m} (2 z \zeta_t)^{n-m}\, H_m[z(x - \bar x_t)] \\ &= \sum_{m=0}^{n} \frac{\sqrt{n!}}{\sqrt{m!}\,(n - m)!}\, (\sqrt{2}\, z \zeta_t)^{n-m}\, \bar H_m[z(x - \bar x_t)], \end{aligned} \tag{4.18} \]
where $H_m$ is the standard (not normalized) Hermite polynomial of degree $m$; in going from the first to the second line, we have used property (A.2). Resorting to the above relation, we can rewrite Eq. (4.9) as follows:
\[ \phi_t(x) = e^{i \bar k_t x + \bar\gamma_t - (1+i)\omega t/4} \sum_{m=0}^{+\infty} \bar\alpha_t^{(m)}\, e^{-(1+i)\omega m t/2}\, \phi^{(m)}(x - \bar x_t); \tag{4.19} \]
the functions $\phi^{(m)}$ are the eigenstates defined in (4.3), while the time dependent coefficients $\bar\alpha_t^{(m)}$ are defined as follows:
\[ \bar\alpha_t^{(m)} = \sum_{k=0}^{+\infty} \alpha_{k+m}\, \frac{\sqrt{(k + m)!}}{\sqrt{m!}\, k!}\, (\sqrt{2}\, z\, \bar\zeta_t)^k, \tag{4.20} \]
where we have introduced the new quantity $\bar\zeta_t \equiv e^{-(1+i)\omega t/2}\, \zeta_t$.
Equations (4.19) and (4.20) represent the two main formulas, which we will use in the next section to analyze the large time behavior. Before doing this, we need to set these formulas on a rigorous ground; we will do this with the following two lemmata.

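Lemma 4.1. Let $\phi \in L^2(\mathbb{R})$ and $\alpha_n = \langle \phi^{(n)*} | \phi \rangle$, with $\phi^{(n)}$ defined as in (4.3). Then the series (4.20) defining $\bar\alpha_t^{(m)}$ is a.s. convergent for any $m$ and any $t > 0$. Moreover, one has the following bound on the coefficients:
\[ |\bar\alpha_t^{(m)}| \leq N_t\, e^{(c + 1/2) m}, \qquad N_t \equiv A \sum_{k=0}^{+\infty} \frac{e^{k(c+1)}\, |\sqrt{2}\, z\, \bar\zeta_t|^k}{\sqrt{k^k}} \quad \text{a.s.}, \tag{4.21} \]
where $A$ is a constant independent of the Brownian motion $\xi_t$.

Proof. Because of (4.5), there exists a constant $C_1$ such that:
\[ |\alpha_n| \leq \|\phi\|\, \|\phi^{(n)*}\| = \|\phi\|\, \|\phi^{(n)}\| \leq C_1 e^{nc}. \tag{4.22} \]
Secondly, using Stirling's formula, there exists a constant $C_2$ such that:
\[ C_2^{-1} \sqrt{2\pi n}\, n^n e^{-n} < n! < C_2 \sqrt{2\pi n}\, n^n e^{-n}, \tag{4.23} \]
for $n > 1$; we can then write the following estimate:
\[ \begin{aligned} \frac{\sqrt{(k + m)!}}{\sqrt{m!}\, k!} &\leq \frac{C_2^2}{\sqrt{2\pi}}\, \sqrt[4]{\frac{k + m}{m k^2}}\; \frac{(k + m)^{(k+m)/2}\, e^{-(k+m)/2}}{m^{m/2}\, e^{-m/2}\, k^k\, e^{-k}} \\ &\leq \frac{C_2^2}{\sqrt[4]{2\pi^2}}\; e^{-k(\ln k - 2)/2 + m/2}; \end{aligned} \tag{4.24} \]
in the second line, we have used the inequality $(k + m) \ln(k + m) \leq k \ln k + m \ln m + k + m$. Using Eqs. (4.22) and (4.24), we have the following bound:
\[ \left| \alpha_{k+m}\, \frac{\sqrt{(k + m)!}}{\sqrt{m!}\, k!}\, (\sqrt{2}\, z\, \bar\zeta_t)^k \right| \leq \frac{C_1 C_2^2}{\sqrt[4]{2\pi^2}}\; e^{(c + 1/2) m}\, \frac{e^{k(c+1)}\, |\sqrt{2}\, z\, \bar\zeta_t|^k}{\sqrt{k^k}}, \qquad k, m \geq 1. \tag{4.25} \]
The cases $k = 0$ and $m = 0$ can be treated separately, giving the same bound, with the only possible difference of an overall constant factor. This proves convergence of the series defined in (4.20) and the bound (4.21).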
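The two elementary ingredients of this proof can be spot-checked numerically. The Python sketch below tests the logarithmic inequality quoted after (4.24), and the resulting factorial bound with the constant prefactor set to 1. Setting the prefactor to 1 is an assumption that happens to hold on the tested range; the range itself and the function names are illustrative choices.

```python
import math


def log_inequality_gap(k, m):
    """Right-hand side minus left-hand side of (k+m) ln(k+m) <= k ln k + m ln m + k + m."""
    lhs = (k + m) * math.log(k + m)
    rhs = k * math.log(k) + m * math.log(m) + k + m
    return rhs - lhs


def log_factorial_ratio(k, m):
    """Logarithm of sqrt((k+m)!) / (sqrt(m!) * k!), computed with lgamma to avoid overflow."""
    return 0.5 * math.lgamma(k + m + 1) - 0.5 * math.lgamma(m + 1) - math.lgamma(k + 1)


def log_bound(k, m):
    """Logarithm of exp(k + m/2) / k^(k/2), the k- and m-dependence of the bound in (4.24)."""
    return k + 0.5 * m - 0.5 * k * math.log(k)


if __name__ == "__main__":
    worst = max(
        log_factorial_ratio(k, m) - log_bound(k, m)
        for k in range(1, 81)
        for m in range(1, 81)
    )
    print("largest log(ratio / bound) on the grid:", worst)
```

The logarithmic inequality is equivalent to $k \ln(1 + m/k) + m \ln(1 + k/m) \leq k + m$, which follows from $\ln(1 + u) \leq u$.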

Theorem 4.1. Let the conditions of Lemma 4.1 be satisfied; let moreover $\tilde\eta_t \equiv e^{-(1+i)\omega t/2}\,\eta_t$, where $\eta_t = \bar x_t - b_t$ with $\bar x_t$ and $b_t$ solutions of Eqs. (4.13) and (3.14), respectively. Then the series defined in (4.19) is a.s. norm convergent for $t > \bar t \equiv (4c+1)/\omega$. In addition, the following equality holds true:

$$T_t\phi = e^{i\bar k_t x + \gamma_t - (1+i)\omega t/4} \sum_{m=0}^{+\infty} \tilde\alpha_t^{(m)}\, e^{-(1+i)\omega m t/2}\, \phi^{(m)}(x - \bar x_t), \qquad t > \bar t, \tag{4.26}$$

where $T_t$ is the evolution operator associated to the Green's function (1.12).

Proof. According to (4.5) and (4.21), one has:

$$\big\|\tilde\alpha_t^{(m)}\, e^{-(1+i)\omega m t/2}\, \phi^{(m)}[z(x - \bar x_t)]\big\| \le C_1 N_t\, e^{(2c + 1/2 - \omega t/2)m}, \tag{4.27}$$

from which the conclusion follows. Comparing the two expressions of Eqs. (3.18) and (4.19) when the initial state is an eigenstate $\phi^{(n)}$, we see that they coincide on the dense subspace of all finite linear combinations of the $\phi^{(n)}$, and hence on the whole of $L^2(\mathbb{R})$.

5. The Long Time Behavior


We are now in a position to study the long time behavior of the solution of Eq. (1.1). Looking at expression (4.19) for the solution $\phi_t$ and (4.20) for the coefficients $\tilde\alpha_t^{(m)}$, it should be clear what the long time behavior of the normalized solution $\psi_t = \phi_t/\|\phi_t\|$ is: whatever the initial condition, at any time $t > 0$ the wave function $\psi_t$ picks up a component on the ground state $\phi^{(0)}(x - \bar x_t)$, since $\tilde\alpha_t^{(0)} \neq 0$ as long as at least one of the coefficients $\alpha_k$ is not null, which is always the case. Equation (4.19), on the other hand, shows that each term of the superposition has an exponential damping factor, which is stronger the higher the eigenvalue. Accordingly, after normalization, only the eigenstate with the weakest damping factor survives, which is the ground state. Hence we expect that the general solution of Eq. (1.1) converges a.s., in the large time limit, to the ground state $\phi^{(0)}(x - \bar x_t)$, which is a Gaussian state. That this is true is proven in the following theorem.

Theorem 5.1. Let $\phi_t$ be a strong solution of Eq. (1.6) that admits, for $t > \bar t$, a representation as in (4.26). Let $\psi_t \equiv \phi_t/\|\phi_t\|$ (when $\|\phi_t\| \neq 0$), which can be written as follows:

$$\psi_t = \psi_t^{\infty} + e^{i(\bar k_t x + \gamma_t^I - \omega t/4)} \sum_{m=1}^{+\infty} \frac{\tilde\alpha_t^{(m)}}{r_t}\, e^{-(1+i)\omega m t/2}\, \phi_m(x - \bar x_t), \tag{5.1}$$

with:

$$\psi_t^{\infty} := \frac{\tilde\alpha_t^{(0)}}{r_t}\, e^{i(\bar k_t x + \gamma_t^I - \omega t/4)}\, \phi_0(x - \bar x_t), \tag{5.2}$$

$$r_t := \Big\| \sum_{m=0}^{+\infty} \tilde\alpha_t^{(m)}\, e^{-(1+i)\omega m t/2}\, \phi_m(x - \bar x_t) \Big\|. \tag{5.3}$$

Then, with P-probability 1:

$$\lim_{t\to\infty} \|\psi_t - \psi_t^{\infty}\| = 0. \tag{5.4}$$

Note that, apart from global factors, $\psi_t^{\infty}$ is the ground state of the NSA harmonic oscillator, randomly displaced both in position space and in momentum space.

Proof. According to Eq. (5.1), all we need to prove is that, with P-probability 1:

$$\lim_{t\to\infty} \Big\| \sum_{m=1}^{+\infty} \frac{\tilde\alpha_t^{(m)}}{r_t}\, e^{-(1+i)\omega m t/2}\, \phi_m(x - \bar x_t) \Big\| = 0. \tag{5.5}$$

Resorting to (4.27), one can write the following bound:

$$\Big\| \sum_{m=1}^{+\infty} \frac{\tilde\alpha_t^{(m)}}{r_t}\, e^{-(1+i)\omega m t/2}\, \phi_m(x - \bar x_t) \Big\| \le C_1\, \frac{N_t}{r_t}\, \frac{e^{-\omega(t-\bar t)/2}}{1 - e^{-\omega(t-\bar t)/2}}, \tag{5.6}$$

thus all we need to establish is the long time behavior of $r_t$ and $N_t$. Lemmas 5.1 and 5.2 (see Eqs. (5.7) and (5.12)) state that, with P-probability 1, $r_t$ converges asymptotically to a finite and non-null random variable, while $N_t$ converges to a finite random variable. From these properties, the conclusion of the theorem follows immediately.

In the remainder of the section, we prove the required lemmas.

Lemma 5.1. Let $r_t$ be defined as in (5.3). Then, with P-probability 1,

$$\lim_{t\to\infty} r_t = r_\infty \quad \text{finite and not null}. \tag{5.7}$$

Proof. According to Eqs. (4.19) and (5.3), the following equality holds:

$$\|\phi_t\| = e^{\gamma_t^R - \omega t/4}\, r_t; \tag{5.8}$$

resorting to the stochastic differentials (1.9) and (4.15) for $\|\phi_t\|^2$ and $\gamma_t^R$, respectively, one can write the following stochastic differential equation for $r_t^2$:

$$dr_t^2 = \big[2\sqrt{\lambda}\,(q_t - \bar x_t)\,d\xi_t + 4\lambda(\bar x_t^2 - q_t\bar x_t)\,dt\big]\,r_t^2, \qquad r_0^2 = 1. \tag{5.9}$$

By using relation (1.11), the above equation can be rewritten in terms of the Wiener process $W_t$ as follows:

$$dr_t^2 = \big[2\sqrt{\lambda}\,(q_t - \bar x_t)\,dW_t + 4\lambda(q_t - \bar x_t)^2\,dt\big]\,r_t^2, \qquad r_0^2 = 1, \tag{5.10}$$

whose solution is:

$$r_t^2 = \exp\Big[2\sqrt{\lambda}\int_0^t (q_s - \bar x_s)\,dW_s + 2\lambda\int_0^t (q_s - \bar x_s)^2\,ds\Big]. \tag{5.11}$$

The crucial point is to establish the behavior of the difference $q_t - \bar x_t$ between the mean position of the general solution $\psi_t$ and the mean position of the asymptotic state $\psi_t^{\infty}$. Since $\psi_t$ converges to $\psi_t^{\infty}$, we expect $q_t - \bar x_t$ to vanish asymptotically. That this is actually true with P-probability 1 is proven in Lemma 5.3 (see Eq. (5.15)), where indeed it is shown that the convergence is exponentially fast. This fact, together with (5.11), concludes the proof of the lemma.
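The passage from the equation written with respect to the original noise to the one in terms of $W_t$, and then to the closed-form solution (5.11), can be checked by a short Itô computation. The sketch below assumes that the change of measure (1.11) reads $d\xi_t = dW_t + 2\sqrt{\lambda}\,q_t\,dt$ (that relation is not reproduced in this section, so this is an assumption) and writes $h_t = q_t - \bar x_t$ and $\lambda$ for the coupling constant:

```latex
% Step 1 (assumption: d\xi_t = dW_t + 2\sqrt{\lambda}\, q_t\, dt):
\begin{align*}
2\sqrt{\lambda}\,(q_t-\bar x_t)\,d\xi_t + 4\lambda(\bar x_t^2 - q_t\bar x_t)\,dt
 &= 2\sqrt{\lambda}\,h_t\,dW_t + 4\lambda\big(q_t^2 - 2q_t\bar x_t + \bar x_t^2\big)\,dt \\
 &= 2\sqrt{\lambda}\,h_t\,dW_t + 4\lambda h_t^2\,dt .
\end{align*}
% Step 2: with X_t := 2\sqrt{\lambda}\int_0^t h_s\,dW_s + 2\lambda\int_0^t h_s^2\,ds,
% Ito's formula gives
\begin{align*}
(dX_t)^2 &= 4\lambda h_t^2\,dt, \\
d\,e^{X_t} &= e^{X_t}\Big(dX_t + \tfrac12 (dX_t)^2\Big)
            = e^{X_t}\Big(2\sqrt{\lambda}\,h_t\,dW_t + 4\lambda h_t^2\,dt\Big),
\end{align*}
% so r_t^2 = e^{X_t} solves the equation with r_0^2 = 1.
```

Since $h_t$ decays exponentially by Lemma 5.3, both integrals in $X_t$ converge as $t \to \infty$, which is what makes the limit of $r_t$ finite and non-null.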

Lemma 5.2. Let $N_t$ be defined as in (4.21). Then, with P-probability 1,

$$\lim_{t\to\infty} N_t = N_\infty \quad \text{finite}. \tag{5.12}$$

Proof. Looking back at Eq. (4.21), we see that in order to prove this lemma it is sufficient to show that $\tilde\eta_t$ tends to a finite limit as $t \to \infty$, with P-probability 1. According to our previous definition, $\tilde\eta_t$ is equal to:

$$\tilde\eta_t = e^{-(1+i)\omega t/2}\,(\bar x_t - b_t). \tag{5.13}$$

Equations (3.14) and (4.10), together with the change of measure (1.11), lead to the following stochastic differential equation for $\tilde\eta_t$ in terms of the Wiener process $W_t$:

dt = e(1+i)t/2 [dWt + 2 (qt x t )dt], 0 = 0. (5.14)
2
Once again, the large time behavior of $q_t - \bar x_t$ (see Eq. (5.15)) yields the conclusion of the lemma.

Lemma 5.3. Let $q_t \equiv \langle\psi_t|q|\psi_t\rangle$ and let $\bar x_t$ be defined as in (4.10). Then, with P-probability 1:

$$h_t \equiv q_t - \bar x_t = O(e^{-\omega t/2}). \tag{5.15}$$

Proof. Let us consider the Gaussian solution of Eq. (1.6):

$$\phi_t^G(x) \equiv G_t(x, 0) = K_t \exp\Big[-\frac{\alpha_t}{2}\,x^2 + \bar a_t x + \bar c_t\Big] \tag{5.16}$$

$$= K_t \exp\Big[-\frac{\alpha_t}{2}\,(x - \bar x_t^G)^2 + i\bar k_t^G x + \bar c_t^G\Big], \tag{5.17}$$

where $G_t(x, y)$ is the Green's function defined in (1.12) and

$$\bar x_t^G = \frac{\bar a_t^R}{\alpha_t^R}, \qquad \bar k_t^G = \bar a_t^I - \frac{\alpha_t^I}{\alpha_t^R}\,\bar a_t^R, \qquad \bar c_t^G = \bar c_t + \frac{\alpha_t}{2}\,(\bar x_t^G)^2. \tag{5.18}$$

Note that $\bar x_t^G$ is the mean position of the Gaussian state $\phi_t^G$, while $\bar k_t^G$ is its average momentum. Obviously we can write:

$$h_t = (q_t - \bar x_t^G) + (\bar x_t^G - \bar x_t); \tag{5.19}$$

Lemma B.1 proves that $q_t - \bar x_t^G$ has the required asymptotic behavior (see Eq. (B.1)), so all we need to show is that $\bar x_t^G - \bar x_t$ also behaves as required. Lemma B.1 was first proven in [35]; for completeness, we reproduce it in Appendix B, adapting it to our notation. The proof of the lemma is instructive because it makes clear why it is convenient to analyze $q_t - \bar x_t^G$ separately from $\bar x_t^G - \bar x_t$.

By letting the ground state of the NSA harmonic oscillator evolve according to the Green's function $G_t(x, y)$, one can express $\bar x_t$ in terms of the functions (1.13)–(1.18); a straightforward calculation leads to the following result:
 R 
G 1 R tbt
t x
x t = (pt 1)at , (5.20)
2 t +

where limt t = 2/. By inspecting expressions (3.21) and (1.15), we


recognize that $p_t^{-1} - 1 = O(e^{-\omega t})$ and $|\beta_t| = O(e^{-\omega t/2})$, thus in order to prove the lemma all we have to do is to control the long time behavior of $\bar a_t$, which in turn sets the asymptotic behavior of $\bar b_t$ through (1.17). Inverting Eq. (5.18) we get:

$$\bar a_t = \alpha_t\,\bar x_t^G + i\,\bar k_t^G, \tag{5.21}$$

thus we can control $\bar a_t$ by controlling $\bar x_t^G$ and $\bar k_t^G$. These two quantities, being the average position and (modulo $\hbar$) average momentum of the Gaussian solution (5.17), satisfy the stochastic differential equations (3.41) and (3.42), with $\alpha_t/2$ in place of $\alpha_t$. By using the change of measure (1.11), we can re-express these equations in terms of the Wiener process $W_t$ as follows:
 
G  G 2
d
xt = k + R ft dt + R dWt , (5.22)
m t t t
I I
dktG = 2 Rt ft dt Rt dWt , (5.23)
t t

with $f_t \equiv q_t - \bar x_t^G$. By integrating the second equation, and by using the strong law of large numbers applied to $W_t$, Eq. (B.1) for $f_t$ and the fact that $\alpha_t$ has a finite asymptotic limit, one can show that, with P-probability 1, the process $\bar k_t^G$ grows slower than $t^2$, for $t \to \infty$. By integrating now the first equation, and by using the same properties as before, one can show that $\bar x_t^G$ grows slower than $t^3$, for $t \to \infty$ and again with P-probability 1. According to Eqs. (5.21) and (1.17), we then have, with P-probability 1:

$$\bar a_t = o(t^3) \ \text{as } t \to \infty, \qquad \lim_{t\to\infty} \bar b_t = \bar b_\infty \ \text{finite}. \tag{5.24}$$

This proves that $\bar x_t^G - \bar x_t$ has the required asymptotic behavior, hence the conclusion of the lemma.
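The proof relies on rewriting the Gaussian (5.16) in the centered form (5.17) and on inverting the resulting relations as in (5.21). The following sketch checks that algebra numerically for arbitrary complex test values. The function names and the sample numbers are hypothetical and chosen only for illustration; the prefactor $K_t$ is left out because it is common to both forms.

```python
def gaussian_parameters(alpha, a_bar, c_bar):
    """Mean position, wave number and constant of the centered Gaussian form.

    alpha, a_bar, c_bar are complex numbers playing the role of the
    quadratic, linear and constant coefficients in the exponent; the
    formulas are the ones obtained by completing the square.
    """
    x_g = a_bar.real / alpha.real
    k_g = a_bar.imag - (alpha.imag / alpha.real) * a_bar.real
    c_g = c_bar + 0.5 * alpha * x_g ** 2
    return x_g, k_g, c_g


def exponent_expanded(x, alpha, a_bar, c_bar):
    """Exponent in the expanded form: -alpha/2 x^2 + a_bar x + c_bar."""
    return -0.5 * alpha * x ** 2 + a_bar * x + c_bar


def exponent_centered(x, alpha, x_g, k_g, c_g):
    """Exponent in the centered form: -alpha/2 (x - x_g)^2 + i k_g x + c_g."""
    return -0.5 * alpha * (x - x_g) ** 2 + 1j * k_g * x + c_g


# Arbitrary test values (hypothetical, Re(alpha) > 0 so the Gaussian is normalizable).
alpha = complex(1.3, -0.7)
a_bar = complex(0.4, 2.1)
c_bar = complex(-0.2, 0.5)

x_g, k_g, c_g = gaussian_parameters(alpha, a_bar, c_bar)

# Largest discrepancy between the two exponents over a few sample points.
max_gap = max(
    abs(exponent_expanded(x, alpha, a_bar, c_bar)
        - exponent_centered(x, alpha, x_g, k_g, c_g))
    for x in (-2.0, -0.5, 0.0, 1.0, 3.5)
)

# Inversion: the linear coefficient is recovered from x_g and k_g.
a_bar_reconstructed = alpha * x_g + 1j * k_g
```

Because the linear coefficient is recovered from the mean position and the wave number alone, polynomial bounds on these two quantities translate directly into a polynomial bound on the linear coefficient, which is the step used in the proof.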

In this way, we have proven that any initial state is P-a.s. norm convergent to
the Gaussian state (5.2), which can be written as follows:
  2 
z
t 4 2 exp (x x t )2 + ikt x + i tI t , (5.25)
zR 2 4

which has a fixed finite spread both in position and in momentum, given by [39]:

$$\Delta q \equiv \langle\psi_t^{\infty}|(q - \bar x_t)^2|\psi_t^{\infty}\rangle^{1/2} = \sqrt{\frac{\hbar}{m\omega}}, \tag{5.26}$$

$$\Delta p \equiv \langle\psi_t^{\infty}|(p - \hbar\bar k_t)^2|\psi_t^{\infty}\rangle^{1/2} = \sqrt{\frac{\hbar m\omega}{2}}. \tag{5.27}$$

This corresponds almost to the minimum allowed by Heisenberg's uncertainty relations, as $\Delta q\,\Delta p = \hbar/\sqrt{2}$. Note also that, the more massive the particle, the smaller the spread in position of the asymptotic Gaussian state: this is a well known effect of the localizing property of Eq. (1.1). Finally, Eqs. (4.13) and (4.14), together with the change of measure (1.11), tell how the average position $\bar x_t$ and momentum $\bar k_t$ evolve in time, as a function of the Wiener process $W_t$:

d
xt = kt dt + ht dt + dWt , (5.28)
m 2

dkt = 2ht dt + dWt , (5.29)

which imply that there exist two random variables X and K such that [35]:

   t 
t = X + Kt +
x Ws ds + Wt + O(et/2 ), (5.30)
m m 0 m

kt = K + Wt + O(et/2 ). (5.31)

These parameters fully describe the time evolution of the Gaussian state (5.25).
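As a quick numerical illustration of the asymptotic spreads (5.26) and (5.27), the sketch below evaluates them and compares their product with the Heisenberg minimum $\hbar/2$. The frequency is treated as a free input; the numerical values of the mass and of the frequency are hypothetical and serve only as an example.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, in J s


def asymptotic_spreads(mass, omega, hbar=HBAR):
    """Position and momentum spreads of the asymptotic Gaussian state.

    Uses delta_q = sqrt(hbar / (m * omega)) and
    delta_p = sqrt(hbar * m * omega / 2).
    """
    delta_q = math.sqrt(hbar / (mass * omega))
    delta_p = math.sqrt(hbar * mass * omega / 2.0)
    return delta_q, delta_p


# Hypothetical sample values, for illustration only.
mass = 9.1093837015e-31   # kg (electron mass)
omega = 1.0e-5            # 1/s, an assumed value of the model frequency

delta_q, delta_p = asymptotic_spreads(mass, omega)

# Ratio of the uncertainty product to the Heisenberg bound hbar / 2.
ratio_to_heisenberg = (delta_q * delta_p) / (HBAR / 2.0)

# At fixed omega, doubling the mass shrinks the position spread.
delta_q_heavier, _ = asymptotic_spreads(2.0 * mass, omega)
```

The ratio equals $\sqrt{2}$ whatever the mass and frequency, which is the sense in which the asymptotic state is almost, but not exactly, a minimum uncertainty state.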

6. Conclusions and Outlook


In Sec. 2, we have identified three interesting time regimes during which the wave function, depending on the values of the parameters $\lambda$ and $m$, evolves in a different way. In the central sections of this paper, we have analyzed the long time behavior, which pertains to the third regime, the diffusive one. There are many other properties of the solutions of Eq. (1.1) which deserve to be analyzed, and in this concluding section, we would like to point out a number of interesting open problems.

I. Collapse regime
Let $\ell$ be the length which discriminates between a localized and a non-localized wave function, i.e. such that, denoting by $\Delta_\psi q$ the spread in position of a wave function $\psi$, we say that $\psi$ is localized in space whenever $\Delta_\psi q \le \ell$. In our case, we must take $\ell > \sqrt{\hbar/m\omega}$, where $\sqrt{\hbar/m\omega}$ is the asymptotic spread (see Eq. (5.26)).

Problem I.1: Find bounds on the collapse time

Let $\psi_t$ be the solution of Eq. (1.1), for a given initial condition $\psi \in L^2(\mathbb{R})$ such that $\Delta_\psi q > \ell$. Let us define the collapse time $T_{COL}$ as the first time at which the wave function is localized in space:

$$T_{COL} := \min\{t : \Delta_{\psi_t} q \le \ell\}. \tag{6.1}$$

How is $T_{COL}$ distributed? Find the best possible bounds (depending on the parameters defining the model) for the distribution function.

The dependence of the collapse time on the parameters of the model is physically relevant, as for macroscopic bodies the collapse is supposed to happen on a much shorter time scale, producing a classical macroscopic body. This must occur well before diffusion becomes effective. Bounds on the collapse time will hopefully lead to experimentally testable deviations from linear quantum mechanics (i.e. where the superposition principle holds on all scales).

Problem I.2: collapse probability

Let $\bar\psi := \psi_t$, for $t = E_P[T_{COL}]$. Let $\bar x := \langle\bar\psi|q|\bar\psi\rangle$ be the position of the wave function at the average time at which it is localized in space. Show that the distribution of $\bar x$ is close to the Born probability given by $|\psi(x)|^2$.

II. Classical regime


In the classical regime, the wave function is expected to move, on the average, like
a classical free particle.

Problem II.1: classical motion


Let $q_t$ and $p_t$ be the (quantum) average position and momentum of $\psi_t$. Let $t > T_{COL}$. Show that the random trajectories $q$ and $p$ are, with high probability and for a reasonably large amount of time, close to the classical trajectories. The closeness will of course depend on the parameters defining the model.

III. Diffusive regime

This regime, which follows the classical one, has been analyzed in this paper: as we have seen, the wave function keeps diffusing in the Hilbert space, eventually taking a Gaussian shape, as described in Sec. 5.

Acknowledgments
The work was supported by the EU grant No. MEIF CT 2003-500543 and by DFG (Germany). We thank GianCarlo Ghirardi, Lajos Diósi and the referees for helpful comments and an anonymous referee for pointing out a flaw in a previous version.

Appendix A. Properties of Hermite Polynomials


We list here the main properties of Hermite polynomials, which are used in the paper. The primary definition of the Hermite polynomials is

$$H_n(z) = n! \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{(-1)^m (2z)^{n-2m}}{m!\,(n-2m)!}, \tag{A.1}$$

where $z$ is any complex number. These polynomials satisfy the following addition rule

$$H_n(z_1 + z_2) = \sum_{m=0}^{n} \binom{n}{m} (2z_2)^{n-m} H_m(z_1). \tag{A.2}$$

When the argument is real ($z = x \in \mathbb{R}$), they form an orthogonal set with respect to the weight $\exp[-x^2]$; the normalized Hermite polynomials are:

$$\bar H_n(x) = \frac{1}{N_n}\, H_n(x), \qquad N_n = \sqrt{2^n\, n!\, \sqrt{\pi}}. \tag{A.3}$$
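The explicit sum (A.1) and the addition rule (A.2) can be checked numerically with a few lines of code. The sketch below uses only the standard library and evaluates both sides of (A.2) for complex arguments; the function names and the test points are illustrative choices, not part of the original text.

```python
import math


def hermite(n, z):
    """Physicists' Hermite polynomial H_n(z), from the explicit sum (A.1)."""
    total = 0
    for m in range(n // 2 + 1):
        total += ((-1) ** m * (2 * z) ** (n - 2 * m)
                  / (math.factorial(m) * math.factorial(n - 2 * m)))
    return math.factorial(n) * total


def hermite_addition_rhs(n, z1, z2):
    """Right-hand side of the addition rule (A.2)."""
    return sum(
        math.comb(n, m) * (2 * z2) ** (n - m) * hermite(m, z1)
        for m in range(n + 1)
    )


# Arbitrary complex test points (illustrative values).
z1 = complex(0.3, -0.8)
z2 = complex(-1.1, 0.4)

# Largest violation of the addition rule for n = 0, ..., 7.
max_error = max(
    abs(hermite(n, z1 + z2) - hermite_addition_rhs(n, z1, z2))
    for n in range(8)
)
```

The addition rule is what allows an eigenstate centered at the origin to be re-expanded in eigenstates centered at a displaced point, which is how the shifted coefficients of Sec. 4 arise.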

Appendix B. Lemma 3.1 in [34]

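Lemma B.1. Let $\phi \in L^2(\mathbb{R})$, $\|\phi\| = 1$ and let $\phi_t = T_t\phi$. Then, with P-probability 1:

$$f_t \equiv q_t - \bar x_t^G = O(e^{-\omega t/2}), \tag{B.1}$$

where $q_t = \langle\psi_t|q|\psi_t\rangle$, and $\bar x_t^G$ has been defined in (5.18).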

Proof. Using the expression (1.12) for $G_t(x, y)$ together with the Schwarz inequality, we can derive the following bound on $\phi_t$:
 
2 2 p2t 4qt2 2
|t (x)| |Kt | exp 2 x
Rt pt
bR qt 
R t R (bR
t )
2
+2 a
t + 8 x + 2
ct + , (B.2)
pt 2 pt

which holds for any $t > 0$. The above inequality implies that it is sufficient to consider $\phi \in L^2(\mathbb{R})$ such that:

$$|\phi(x)| \le C e^{-Ax^2}, \tag{B.3}$$

where $C$ and $A$ are random variables. A direct calculation leads to the following expression for the quantum average $\langle\phi_t|q|\phi_t\rangle$:
  
2 
aR
( t )
t |q|t  = |Kt |2 exp 2c R
t + dy1 dy2 (y1 )(y2 )
Rt R
t
  R
  
t y 1 + t y 2 a
t 1 t2 2 1  t2
+ exp t y 1 t y22
2R
t R
t 2 2 R
t 2 2 R
t
 R
 R
2

t a
a |t |
exp bt + Rt y1 + bt + t Rt y2 + y1 y2 . (B.4)
t t 2R t

As we shall soon see, all exponential terms in the above expression can be controlled. The crucial factors are the two within brackets: the first term decays exponentially in time, since $\beta_t = O(e^{-\omega t/2})$, while $\alpha_t$ has a finite asymptotic limit; the term $\bar a_t^R/\alpha_t^R$, instead, does not decay in time (see the discussion in connection with the proof of Lemma 5.3). Since $\|\phi_t\|^2$ is equal to the expression (B.4) without the terms in square brackets, and because of (5.18), we have that

R
a t
ft t 2 = t |q|t  t 2 (B.5)
R t
  
2 
 
aR
( t )

 t y 1 + t y 2
= |Kt |2 exp 2
c R
t + dy 1 dy 2 (y 1 )(y 2 )
Rt R t 2R
t
 2
2

1 1
exp t tR y12 t t R y22
2 2t 2 2t
 
R
t a aR |t |2
exp bt + Rt y1 + bt + t Rt y2 + y y
1 2 . (B.6)
t t 2Rt

According to the discussion above, we expect the quantity $f_t\|\phi_t\|^2$ to decay exponentially in time, as we shall now prove; this is the reason why, in proving Lemma 5.3, it was convenient to split the difference $h_t$ as done in Eq. (5.19). Using the inequality $y_1 y_2 \le (y_1^2 + y_2^2)/2$, we can write:
 
 R
a 
|ft |t 2 = t |q|t  tR t 2  (B.7)
t
  
2 
|t | 2 R aR
( t )
|K t | exp 2
c t + dy1 dy2 |(y1 )||(y2 )|
2Rt Rt R t

(|y1 | + |y2 |)g(y1 )g(y2 ), (B.8)

with:
 
1 R (tR )2 2 R tR a
Rt
g(y) exp t y + bt + y . (B.9)
2 Rt Rt

Next, by using the inequality $g(y_1)g(y_2) \le (g(y_1)^2 + g(y_2)^2)/2$ and the symmetry between $y_1$ and $y_2$, we have:
  
2 
2 |t | 2 R aR
( t )
|ft |t  |Kt | exp 2
ct + dy1 dy2 |(y1 )||(y2 )|
2Rt Rt R t

(|y1 | + |y2 |)g(y1 )2 . (B.10)

Now, a direct computation shows that


   
aR
( t )
2
Gt (, y)2 dx|Gt (x, y)|2 = |Kt |2 exp 2
c R
t + g(y)2 ; (B.11)
Rt R
t

the key point is that, since $G_t(x, y)$ solves Eq. (1.6), then $\|G_t(\cdot, y)\|^2$ is a positive martingale with respect to the measure Q, for any value of $y$; we call $\mathrm{Mar}_Q(t, y)$ this martingale. We can then write:

|t |
|ft |t 2 dy1 dy2 |(y1 )||(y2 )|(|y1 | + |y2 |) MarQ (t, y)
2Rt

|t | 2
dy eAy (A1 |y| + A2 ) MarQ (t, y), (B.12)
2Rt

where $A_1$ and $A_2$ are suitable constants. In going from the first to the second line, we have used (B.3). The quantity

1 2

R
dyeAy (A1 |y| + A2 ) MarQ (t, y) (B.13)
2t

is another positive martingale with respect to Q, which we call $\overline{\mathrm{Mar}}_Q(t)$. We arrive in this way at the inequality:

$$|f_t| \le |\beta_t|\, \frac{\overline{\mathrm{Mar}}_Q(t)}{\|\phi_t\|^2}. \tag{B.14}$$

Since $\overline{\mathrm{Mar}}_Q(t)$ is a positive martingale with respect to Q, then $\mathrm{Mar}_P(t) = \overline{\mathrm{Mar}}_Q(t)/\|\phi_t\|^2$ is a positive martingale with respect to P which, by Doob's convergence theorem, has a P-a.s. finite limit for $t \to +\infty$. The conclusion of the lemma then follows from Eq. (1.15), according to which $\beta_t = O(e^{-\omega t/2})$.

References
[1] G. C. Ghirardi, A. Rimini and T. Weber, Unified dynamics for microscopic and macroscopic systems, Phys. Rev. D 34 (1986) 470–491.
[2] G. C. Ghirardi, P. Pearle and A. Rimini, Markov processes in Hilbert space and continuous spontaneous localization of systems of identical particles, Phys. Rev. A 42 (1990) 78–89.
[3] G. C. Ghirardi, R. Grassi and P. Pearle, Relativistic dynamical reduction models: General framework and examples, Found. Phys. 20 (1990) 1271–1316.

[4] P. Pearle, Reduction of the state vector by a nonlinear Schrödinger equation, Phys. Rev. D 13 (1976) 857–868.
[5] P. Pearle, Combining stochastic dynamical state-vector reduction with spontaneous localization, Phys. Rev. A 39 (1989) 2277–2289.
[6] P. Pearle, Collapse Models, in Open Systems and Measurement in Relativistic Quantum Theory, eds. F. Petruccione and H.-P. Breuer (Springer-Verlag, Berlin, 1999).
[7] L. Diósi, Localized solution of simple nonlinear quantum Langevin-equation, Phys. Lett. A 132 (1988) 233–236.
[8] L. Diósi, Models for universal reduction of macroscopic quantum fluctuations, Phys. Rev. A 40 (1989) 1165–1174.
[9] L. Diósi, Relativistic theory for continuous measurement of quantum fields, Phys. Rev. A 42 (1990) 5086–5092.
[10] S. L. Adler, D. C. Brody, T. A. Brun and L. P. Hughston, Martingale models for quantum state reduction, J. Phys. A 34 (2001) 8795–8820.
[11] S. L. Adler and T. A. Brun, Generalized stochastic Schrödinger equations for state vector collapse, J. Phys. A 34 (2001) 4797–4809.
[12] S. L. Adler, Quantum Theory as an Emergent Phenomenon. The Statistical Mechanics of Matrix Models as the Precursor of Quantum Field Theory (Cambridge University Press, Cambridge, 2004).
[13] A. Bassi, E. Ippoliti and S. L. Adler, Towards quantum superpositions of a mirror: An exact open systems analysis, Phys. Rev. Lett. 94 (2005) 030401, 4 pp.
[14] A. Bassi, E. Ippoliti and B. Vacchini, On the energy increase in space-collapse models, J. Phys. A 38 (2005) 8017–8038.
[15] V. P. Belavkin, Non-demolition measurements, non-linear filtering and dynamic programming in quantum stochastic processes, in Lecture Notes in Control and Information Science, ed. A. Blaquière, Vol. 121 (Springer-Verlag, Berlin, 1988).
[16] S. L. Adler and A. Bassi, Is quantum theory exact? Science 325 (2009) 275–276.
[17] V. P. Belavkin and P. Staszewski, A quantum particle undergoing continuous observation, Phys. Lett. A 140 (1989) 359–362.
[18] V. P. Belavkin and P. Staszewski, Nondemolition observation of a free quantum particle, Phys. Rev. A 45 (1992) 1347–1357.
[19] D. Chruściński and P. Staszewski, On the asymptotic solutions of Belavkin's stochastic wave equation, Phys. Scripta 45 (1992) 193–199.
[20] A. Barchielli, Direct and heterodyne detection and other applications of quantum stochastic calculus to quantum optics, Quantum Opt. 2 (1990) 423–441.
[21] A. Barchielli, On the quantum theory of measurements continuous in time, Proceedings of the XXV Symposium on Mathematical Physics (Toruń, 1992), Rep. Math. Phys. 33 (1993) 21–34.
[22] A. Barchielli and A. S. Holevo, Constructing quantum measurement processes via classical stochastic calculus, Stochastic Process. Appl. 58 (1995) 293–317.
[23] Ph. Blanchard and A. Jadczyk, On the interaction between classical and quantum systems, Phys. Lett. A 175 (1993) 157–164.
[24] Ph. Blanchard and A. Jadczyk, Event-enhanced quantum theory and piecewise deterministic dynamics, Ann. Physik 4(8) (1995) 583–599.
[25] Ph. Blanchard and A. Jadczyk, Events and piecewise deterministic dynamics in event-enhanced quantum theory, Phys. Lett. A 203 (1995) 260–266.
[26] J. Halliwell and A. Zoupas, Quantum state diffusion, density matrix diagonalization, and decoherent histories: A model, Phys. Rev. D 52 (1995) 7294–7307.
[27] J. Halliwell and A. Zoupas, Post-decoherence density matrix propagator for quantum Brownian motion, Phys. Rev. D 55 (1997) 4697–4704.

[28] H.-P. Breuer and F. Petruccione, The Theory of Open Quantum Systems (Oxford University Press, New York, 2002).
[29] H.-P. Breuer, U. Dorner and F. Petruccione, Numerical integration methods for stochastic wave function equations, Comp. Phys. Comm. 132 (2000) 30–43.
[30] D. Gatarek and N. Gisin, Continuous quantum jumps and infinite-dimensional stochastic equations, J. Math. Phys. 32 (1991) 2152–2157.
[31] A. S. Holevo, On dissipative stochastic equations in a Hilbert space, Probab. Theory Related Fields 104 (1996) 483–500.
[32] V. P. Belavkin and V. N. Kolokoltsov, Quasiclassical asymptotics of quantum stochastic equations, Teoret. Mat. Fiz. 89 (1991) 163–177 (Russian); translation in Theoret. and Math. Phys. 89(2) (1991) 1127–1138.
[33] V. N. Kolokoltsov, Application of quasiclassical methods to the study of Belavkin's quantum filtering equation, Mat. Zametki 50 (1991) 153–156 (Russian); translation in Math. Notes 50 (1991) 1204–1206.
[34] V. N. Kolokoltsov, Scattering theory for the Belavkin equation describing a quantum particle with continuously observed coordinate, J. Math. Phys. 36 (1995) 2741–2760.
[35] V. N. Kolokoltsov, Localization and analytic properties of the solutions of the simplest quantum filtering equation, Rev. Math. Phys. 10 (1998) 801–828.
[36] V. N. Kolokoltsov, Semiclassical Analysis for Diffusion and Stochastic Processes, Lecture Notes in Mathematics, Vol. 1724 (Springer-Verlag, Berlin, 2000).
[37] S. Albeverio, V. N. Kolokoltsov and O. G. Smolyanov, Continuous quantum measurement: Local and global approaches, Rev. Math. Phys. 9 (1997) 907–920.
[38] S. Teufel, Adiabatic Perturbation Theory in Quantum Dynamics, Lecture Notes in Mathematics, Vol. 1821 (Springer-Verlag, Berlin, 2003).
[39] A. Bassi, Collapse models: Analysis of the free particle dynamics, J. Phys. A 38 (2005) 3173–3192.
[40] E. Joos and H. D. Zeh, The emergence of classical properties through interaction with the environment, Z. Phys. B 59 (1985) 223–243.
[41] W. Marshall, C. Simon, R. Penrose and D. Bouwmeester, Towards quantum superpositions of a mirror, Phys. Rev. Lett. 91 (2003) 130401, 4 pp.
[42] J. Z. Bernad, L. Diósi and T. Geszti, Quest for quantum superpositions of a mirror: High and moderately low temperatures, Phys. Rev. Lett. 97 (2006) 250404, 4 pp.
[43] S. L. Adler, A density tensor hierarchy for open system dynamics: Retrieving the noise, J. Phys. A 40 (2007) 8959–8990.
[44] A. Barchielli, Some stochastic differential equations in quantum optics and measurement theory: The case of diffusive processes, in Contributions in Probability in Memory of Alberto Frigerio, ed. C. Cecchini (Forum, Udine, 1996), pp. 43–55.
[45] A. Bassi and D. Dürr, On the long-time behavior of Hilbert space diffusion, Europhys. Lett. 84 (2008) 10005.
[46] C. M. Mora and R. Rebolledo, Regularity of solutions to linear stochastic Schrödinger equations, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 10 (2007) 237–259.
[47] C. M. Mora and R. Rebolledo, Basic properties of nonlinear stochastic Schrödinger equations driven by Brownian motions, Ann. Appl. Probab. 18 (2008) 591–619.
[48] R. S. Liptser and A. N. Shiryaev, Statistics of Random Processes (Springer-Verlag, Berlin, 2001).
[49] A. Bassi, G. C. Ghirardi and D. G. M. Salvetti, The Hilbert-space operator formalism within dynamical reduction models, J. Phys. A 40 (2007) 13755–13772.
[50] R. G. Bartle, A Modern Theory of Integration, Graduate Studies in Mathematics, Vol. 32 (American Mathematical Society, Providence, RI, 2001).

[51] P. E. Protter, Stochastic Integration and Differential Equations (Springer-Verlag, Berlin, 2004).
[52] E. B. Davies, Pseudo-spectra, the harmonic oscillator and complex resonances, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 455 (1999) 585–599.
[53] E. B. Davies and A. B. J. Kuijlaars, Spectral asymptotics of the non-self-adjoint harmonic oscillator, J. London Math. Soc. (2) 70(2) (2004) 420–426.
[54] E. B. Davies, Linear Operators and Their Spectra, Cambridge Studies in Advanced Mathematics, Vol. 106 (Cambridge University Press, Cambridge, 2007).
February 11, 2010 11:24 WSPC/148-RMP J070-S0129055X10003904

Reviews in Mathematical Physics


Vol. 22, No. 1 (2010) 91–115
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003904

FROM GLOBAL SYMMETRIES TO LOCAL CURRENTS:


THE FREE (SCALAR) CASE IN FOUR DIMENSIONS

GERARDO MORSELLA and LUCA TOMASSINI


Department of Mathematics, Tor Vergata University,
via della Ricerca Scientifica I-00133 Roma, Italy
morsella@mat.uniroma2.it
tomassin@mat.uniroma2.it

Received 4 May 2009


Revised 15 October 2009

Within the framework of algebraic quantum field theory, we propose a new method
of constructing local generators of (global) gauge symmetries in field theoretic models,
starting from the existence of unitary operators implementing locally the flip automor-
phism on the doubled theory. We show, in the simple example of the internal symmetries
of a multiplet of free scalar fields, that through the pointlike limit of such local generators
the conserved Wightman currents associated with the symmetries are recovered.

Keywords: Quantum Noether theorem; split property; flip automorphism.

Mathematics Subject Classification 2010: 81T05, 46L45

1. Introduction
One of the most important features of field theoretic models is the existence of local conserved currents corresponding to space-time and internal (gauge) symmetries. While in the framework of classical Lagrangian field theory a clarification of this issue comes from Noether's theorem (which provides an explicit formula for the conserved current associated to any continuous symmetry of the Lagrangian itself), it is well known that in the quantum case several drawbacks contribute to make the situation more confusing. For example, symmetries which are present at the classical level can disappear upon quantization due to renormalization effects.
In [1, 2], a different approach to the problem was outlined in the context of algebraic quantum field theory. It consisted of two main steps: (1) given double cones $O$, $\hat O$ with bases $B$, $\hat B$ in the time-zero plane centered at the origin and such that $\bar O \subset \hat O$, start from generators $Q$ of global space-time or gauge transformations and construct local ones, i.e. operators $J^Q_{O,\hat O}$ generating the correct symmetry on the field algebra $F(O)$ and localized in $\hat O$ (i.e. affiliated to $F(\hat O)$); and (2) these local generators should play the role of integrals of (time components of) Wightman currents over $B$ with a smooth cut-off in $\hat B$ and possibly some smearing in time, so that one is led to conjecture that

$$\frac{1}{\lambda^3} \int_{\mathbb{R}^4} f(x)\,\alpha_x\big(J^Q_{\lambda O,\lambda\hat O}\big)\,dx \to c\,j_0^Q(f) \tag{1.1}$$

holds, in a suitable sense, as $\lambda \to 0$. Here $\alpha$ denotes space-time translations, $j_0^Q(x)$ the sought-for Wightman current, $f \in S(\mathbb{R}^4)$ any test function and $c$ a constant which (in view of the above interpretation of $J^Q_{O,\hat O}$) would be expected to satisfy

$$\mathrm{vol}(B) \le c \le \mathrm{vol}(\hat B). \tag{1.2}$$

It is important to note that there is a large ambiguity in the choice of the local generators: since their action in $\hat O \cap O'$ is not fixed by the above requirements we are free to add perturbations in $F(\hat O \cap O')$. Thus, the limit (1.1) is not to be expected to converge in full generality, but we can still hope that a canonical choice or construction of the local generators might solve the problem (see below).
The first problem above was completely solved in [1] for the case of Abelian gauge transformation groups, while in [2, 3] the general case (including discrete and space-time symmetries and supersymmetries) was treated. The final result was that in physically reasonable theories what was called by the authors a canonical local unitary implementation of global symmetries exists, and if a part of them actually constitutes a Lie group the corresponding canonical local generators provide a local representation of the associated Lie (current) algebras. A key assumption was identified in the so-called split property (for double cones), which holds in theories with a realistic thermodynamic behavior [4]. It expresses a strong form of statistical independence between the regions $O$ and $\hat O'$ and is equivalent to the existence of normal product states $\phi$ on $F(O) \vee F(\hat O)'$ such that $\phi(AB) = \omega(A)\omega(B)$ ($\omega$ being the vacuum state) for $A \in F(O)$ and $B \in F(\hat O)'$ [5].
However, the above-mentioned construction crucially depends on such a highly elusive object as the unique vector representative of the state $\phi$ in the (natural) cone

$$P^\natural_\Omega\big(F(O) \vee F(\hat O)'\big) = \overline{\Delta^{1/4}\big(F(O) \vee F(\hat O)'\big)_+\,\Omega} \tag{1.3}$$

(see [6]), where $\Omega$ indicates the vacuum vector and $\Delta$ the modular operator of the pair $(F(O) \vee F(\hat O)', \Omega)$, so that finding an explicit expression of the local generators appears as an almost hopeless task. This makes it extremely hard to proceed to the above-mentioned second step, i.e. the determination of the current fields themselves. Notwithstanding this, the reconstruction of the energy momentum tensor of a certain (optimal) class of 2-dimensional conformal models was carried out in [7], while partial results for the U(1)-current in the free massless 4-dimensional case were obtained in [8], showing that for the local generators of [3] the drawbacks briefly discussed after Eq. (1.1) might be less severe. However, in both cases the existence of a unitary implementation of dilations was crucial for handling the limit $\lambda \to 0$.

In what follows, we restrict our attention to the case of continuous symmetries and propose a new method for obtaining local generators based on the existence of local unitary implementations of the flip automorphism, a requirement actually equivalent, under standard assumptions, to the split property [9]. This method turns out to be particularly suited for carrying out step (2) above, at least in the free field case.

To be more specific, we consider a quantum field theory defined by a net $O \mapsto F(O)$ of von Neumann algebras on open double cones in Minkowski 4-dimensional spacetime acting irreducibly on a Hilbert space $H$ with scalar product $\langle\cdot,\cdot\rangle$, satisfying the following standard assumptions:

(1) there is a unitary strongly continuous representation $V$ on $H$ of a compact Lie group $G$, which acts locally on $F$,

$$V(g)F(O)V(g)^* = F(O), \qquad g \in G,$$

and we set $\beta_g := \mathrm{Ad}\,V(g)$;

(2) (split property) for each pair of double cones $O_1 \Subset O_2$ (i.e. $\bar O_1 \subset O_2$) there exists a type I factor $N$ such that

$$F(O_1) \subset N \subset F(O_2).$$

To such a theory, we associate the doubled theory O F (O) := F (O) F (O),


with the corresponding unitary representation of G given by V (g) := V (g) V (g).
In this situation, it is well known that for each pair of double cones O1  O2 there
exists a local implementation of the ip automorphism of F (O1 ), i.e. a unitary
operator WO1 ,O2 F (O2 ) such that
WO1 ,O2 F1 F2 WO 1 ,O2 = F2 F1 , F1 , F2 F (O1 ). (1.4)
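Assume now, for the sake of argument, that there is a 1-parameter subgroup
$\mathbb{R} \ni \theta \mapsto g_\theta \in G$ of $G$, such that the generator $Q$ of the corresponding unitary group
$V(g_\theta)$ is a bounded operator on $\mathcal{H}$. Considering the conditional expectation
(Fubini mapping) $E_\Phi : B(\mathcal{H}) \bar\otimes B(\mathcal{H}) \to B(\mathcal{H})$ defined by
$$E_\Phi(A_1 \otimes A_2) = \langle \Phi, A_2 \Phi\rangle\, A_1, \qquad A_1, A_2 \in B(\mathcal{H}),$$
where $\Phi \in \mathcal{H}$ is such that $\|\Phi\| = 1$, we can define the operator
$$J^{Q}_{\mathcal{O}_1,\mathcal{O}_2} := \Xi_{\mathcal{O}_1,\mathcal{O}_2}(Q) := E_\Phi\big(W_{\mathcal{O}_1,\mathcal{O}_2}(1 \otimes Q)W_{\mathcal{O}_1,\mathcal{O}_2}^*\big), \tag{1.5}$$
and it is then easy to see that this operator gives a local implementation of the
infinitesimal symmetry generated by $Q$ in the following natural sense:
$$J^{Q}_{\mathcal{O}_1,\mathcal{O}_2} \in \mathcal{F}(\mathcal{O}_2), \qquad [J^{Q}_{\mathcal{O}_1,\mathcal{O}_2}, F] = [Q, F], \quad F \in \mathcal{F}(\mathcal{O}_1). \tag{1.6}$$
We also note that for this last equation to hold, it is sufficient that $W_{\mathcal{O}_1,\mathcal{O}_2}$ is
only a semi-local implementation of the flip, i.e. a unitary in $\mathcal{F}(\mathcal{O}_2) \bar\otimes B(\mathcal{H})$ for
which (1.4) holds.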
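
As a sanity check on (1.5), the bounded case can be played through in finite dimensions. The sketch below is only a toy model and not the construction of the paper: it assumes $\mathcal{H} = \mathbb{C}^2$, takes for $W$ the global flip on $\mathcal{H} \otimes \mathcal{H}$, and uses hypothetical helper names (`flip`, `slice_map`, `local_generator`). In this setting locality plays no role, so the Fubini map simply returns $Q$ itself.

```python
# Toy finite-dimensional illustration of (1.5); all names are hypothetical.

def kron(A, B):
    """Kronecker product; the basis index of e_i (x) e_k is i*len(B) + k."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def identity(d):
    return [[1 if i == j else 0 for j in range(d)] for i in range(d)]

def flip(d):
    """Unitary W with W(a (x) b) = b (x) a on C^d (x) C^d."""
    W = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for k in range(d):
            W[k * d + i][i * d + k] = 1
    return W

def slice_map(A, phi):
    """Fubini map: E_phi(A1 (x) A2) = <phi, A2 phi> A1, extended linearly."""
    d = len(phi)
    return [[sum(phi[k].conjugate() * A[i * d + k][j * d + l] * phi[l]
                 for k in range(d) for l in range(d))
             for j in range(d)] for i in range(d)]

def local_generator(Q, phi):
    """E_phi(W (1 (x) Q) W^*), the finite-dimensional analogue of (1.5)."""
    d = len(Q)
    W = flip(d)
    conjugated = matmul(matmul(W, kron(identity(d), Q)), dagger(W))
    return slice_map(conjugated, phi)

Q = [[1, 2 - 1j], [2 + 1j, -3]]   # a Hermitian "generator"
phi = [0.6, 0.8j]                 # a unit vector
J = local_generator(Q, phi)
```

Because $W(1 \otimes Q)W^* = Q \otimes 1$ for the global flip, `J` agrees with `Q` up to rounding, so the commutator relation in (1.6) holds trivially here. The substance of the construction lies in replacing the global flip by a local one.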
The assumption of boundedness for $Q$ is of course very strong, and it is not
expected to be satisfied in physically interesting models. In the unbounded case it
is however possible, in the slightly more restrictive setting of [2, 3], to make sense
of Eqs. (1.5) and (1.6) producing a self-adjoint operator $J^{Q}_{\mathcal{O}_1,\mathcal{O}_2}$ affiliated to $\mathcal{F}(\mathcal{O}_2)$
and implementing the commutator with $Q$ on a suitable dense subalgebra of $\mathcal{F}(\mathcal{O}_1)$.
More explicitly, assume that the triple $\Lambda = (\mathcal{F}(\mathcal{O}_1), \mathcal{F}(\mathcal{O}_2), \Omega)$ is a standard split
$W^*$-inclusion in the sense of [10] and consider the unitary standard implementation
$U_\Lambda : \mathcal{H} \to \mathcal{H} \otimes \mathcal{H}$ of the isomorphism
$$\eta : F_1 F_2' \in \mathcal{F}(\mathcal{O}_1) \vee \mathcal{F}(\mathcal{O}_2)' \mapsto F_1 \otimes F_2' \in \mathcal{F}(\mathcal{O}_1) \bar\otimes \mathcal{F}(\mathcal{O}_2)'.$$

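This was used in [3] to define the universal localizing map $\psi_\Lambda : B(\mathcal{H}) \to B(\mathcal{H})$,
$$\psi_\Lambda(T) = U_\Lambda^*(T \otimes 1)U_\Lambda, \qquad T \in B(\mathcal{H}),$$
where the standard type I factor $\mathcal{N}_\Lambda = \psi_\Lambda(B(\mathcal{H}))$ satisfies $\mathcal{F}(\mathcal{O}_1) \subset \mathcal{N}_\Lambda \subset
\mathcal{F}(\mathcal{O}_2)$. For the commutant standard inclusion $\Lambda' = (\mathcal{F}(\mathcal{O}_2)', \mathcal{F}(\mathcal{O}_1)', \Omega)$ [10], one
has $\psi_{\Lambda'}(T) = U_\Lambda^*(1 \otimes T)U_\Lambda$.
For any unitarily equivalent triple $\Lambda_0 = (V_0\mathcal{F}(\mathcal{O}_1)V_0^*, V_0\mathcal{F}(\mathcal{O}_2)V_0^*, V_0\Omega)$, one
finds $U_{\Lambda_0}V_0 = (V_0 \otimes V_0)U_\Lambda$. Notice that in the case of gauge transformations
$\Lambda = \Lambda_0$ and so
$$U_\Lambda V(g) = (V(g) \otimes V(g))\,U_\Lambda. \tag{1.7}$$
It is then straightforward to verify that, with $Z_{1,3}$ the unitary interchanging the
first and third factors in $\mathcal{H} \otimes \mathcal{H} \otimes \mathcal{H} \otimes \mathcal{H}$, the operator
$$W_\Lambda = (U_\Lambda^* \otimes U_\Lambda^*)\,Z_{1,3}\,(U_\Lambda \otimes U_\Lambda)$$
is a local implementation of the flip. Setting $g = g_\theta$ in (1.7) and differentiating with
respect to $\theta$, a simple computation shows that
$$W_\Lambda(1 \otimes Q)W_\Lambda^* = \psi_\Lambda(Q) \otimes 1 + 1 \otimes \psi_{\Lambda'}(Q) = J^Q_\Lambda \otimes 1 + 1 \otimes J^Q_{\Lambda'},$$
where $J^Q_\Lambda$, $J^Q_{\Lambda'}$ are the canonical local implementations of [2, 3], which of course
satisfy (1.6). Choosing now $\Phi = U_\Lambda^*(\Omega \otimes \Omega)$, we see that $\Xi_{\mathcal{O}_1,\mathcal{O}_2}(Q) = J^Q_\Lambda$. The
above construction (1.5) therefore includes the canonical one as a particular case.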
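
The "simple computation" mentioned above can be spelled out step by step. The following is a sketch for bounded $Q$ only. The symbols $\theta$ (group parameter), $U_\Lambda$, $\psi_\Lambda$, $\psi_{\Lambda'}$ and $\Phi$ follow the usual conventions for standard split inclusions and are assumptions wherever the extracted text lost the glyphs. The last line also assumes a gauge-invariant vacuum, $Q\Omega = 0$.

```latex
\begin{align*}
U_\Lambda Q &= (Q \otimes 1 + 1 \otimes Q)\,U_\Lambda
  && \text{(derivative of (1.7) at } \theta = 0\text{)} \\
(U_\Lambda \otimes U_\Lambda)(1 \otimes Q)(U_\Lambda \otimes U_\Lambda)^*
  &= 1 \otimes 1 \otimes Q \otimes 1 + 1 \otimes 1 \otimes 1 \otimes Q \\
Z_{1,3}\big(1 \otimes 1 \otimes Q \otimes 1 + 1 \otimes 1 \otimes 1 \otimes Q\big)Z_{1,3}^*
  &= Q \otimes 1 \otimes 1 \otimes 1 + 1 \otimes 1 \otimes 1 \otimes Q \\
W_\Lambda (1 \otimes Q) W_\Lambda^*
  &= U_\Lambda^*(Q \otimes 1)U_\Lambda \otimes 1 + 1 \otimes U_\Lambda^*(1 \otimes Q)U_\Lambda
   = \psi_\Lambda(Q) \otimes 1 + 1 \otimes \psi_{\Lambda'}(Q) \\
E_\Phi\big(W_\Lambda (1 \otimes Q) W_\Lambda^*\big)
  &= \psi_\Lambda(Q) + \langle \Omega \otimes \Omega, (1 \otimes Q)\,\Omega \otimes \Omega\rangle\,1
   = \psi_\Lambda(Q)
  && \text{for } \Phi = U_\Lambda^*(\Omega \otimes \Omega).
\end{align*}
```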
As remarked above, the control of the limit (1.1) for such operators does not
seem within reach of the presently known techniques. However, we shall see in
Sec. 3 below that if $Q$ is the (unbounded) generator of a 1-parameter subgroup of
a compact Lie gauge group acting on a finite multiplet of free scalar fields of mass
$m \geq 0$, it is possible to provide a different explicit (semi-)local implementation
of the flip $W_{\mathcal{O}_1,\mathcal{O}_2}$ such that the limit (1.1) can actually be performed for the
corresponding generator $J^{Q}_{\mathcal{O}_1,\mathcal{O}_2}$ (which is self-adjoint and satisfies (1.6) in the same
sense as $J^Q_\Lambda$).
The rest of the paper is organized as follows. In Sec. 2, we introduce a new
class of test function spaces and use it to obtain estimates concerning certain
free field bilinears; as shown in the Appendix, these estimates also allow us to
establish the existence of the above-mentioned unitaries. This is used in Sec. 3,
where we go into the study of our models of 4-dimensional free fields. We focus
on the case of a single charged free field with U(1) symmetry, the multiplet case
being an easy generalization discussed at the end. We elaborate on the explicit
realization of local unitaries implementing the flip automorphisms introduced for
the neutral field case in [9], make use of the multiple commutator theorem in [11] to
get an expression for the corresponding local generators of the U(1) symmetry and
prove their (essential) self-adjointness on a suitable domain. Finally, convergence
of the limit (1.1) is proved and the constant $c$ there is shown to satisfy (1.2) (and in
particular to be different from zero).

2. Test Functions Spaces and N -Bounds for Free Field Bilinears


We collect here some technical results, needed in the following section, on the
extension of bilinear expressions in two commuting complex free scalar fields $\phi_i$,
$i = 1, 2$, and their derivatives, to suitable spaces of tempered distributions. Using
this, we will also obtain useful $N$-bounds for such operators.
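The Hilbert space $\tilde{\mathcal{H}}$ on which the fields $\phi_i$ act is the bosonic second quantization
of $K = L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$. For $\Psi \in \tilde{\mathcal{H}}$, we denote by $\Psi^{(n)}$ its component in $K^{\otimes_S n}$
(the symmetrized $n$-fold tensor power of $K$) and by $\tilde D_0$ we indicate the dense space
of $\Psi \in \tilde{\mathcal{H}}$ such that $\Psi^{(n)} = 0$ for all but finitely many $n \in \mathbb{N}_0$. Let $\tilde N$ be the
number operator, defined by $(\tilde N\Psi)^{(n)} = n\Psi^{(n)}$ on the domain $D(\tilde N)$ of vectors
$\Psi \in \tilde{\mathcal{H}}$ such that $\sum_n n^2 \|\Psi^{(n)}\|^2 < \infty$. Fixing an orthonormal basis $(e_i^\sigma)_{i=1,2}^{\sigma=+,-}$ of
$\mathbb{C}^4$, we can identify elements $\Psi \in K^{\otimes_S n}$ with collections $\Psi = (\Psi^{\sigma_1\ldots\sigma_n}_{i_1\ldots i_n})^{\sigma_1\ldots\sigma_n=+,-}_{i_1\ldots i_n=1,2}$ of
functions on $\mathbb{R}^{3n}$, such that $\Psi^{\sigma_1\ldots\sigma_n}_{i_1\ldots i_n}(p_1, \ldots, p_n)$ is symmetric for the simultaneous
interchange of $(\sigma_k, i_k, p_k)$ and $(\sigma_h, i_h, p_h)$, and
$$\sum_{\substack{\sigma_1,\ldots,\sigma_n=+,-\\ i_1,\ldots,i_n=1,2}} \int_{\mathbb{R}^{3n}} dp_1 \cdots dp_n\, |\Psi^{\sigma_1\ldots\sigma_n}_{i_1\ldots i_n}(p_1, \ldots, p_n)|^2 < \infty.$$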

We introduce then the operators on H


c,
i () = a( ei ),

c,+
i () = a( ei ) ,

where L2 (R3 ) and a(), K, is the usual Fock space annihilation operator.
Their commutation relations are

, ,
[ci (), cj ()] = ij , , dp (p)(p).
R3

Introducing also the maps j : S (R4 ) L2 (R3 ), = +, , dened by



j f (p) := 2/m (p)f(m (p), p),(where f(p) = R4 (2)dx
2 f (x)e
ipx
is the

Fourier transform of f and m (p) = |p|2 + m2 ) and the notation i (f ) := i (f) ,
we have
1  , 1  +,
i (f ) = ci (j f ), i (f ) = ci (j f ).
2 =+, 2 =+,

With the notation := 0 , we have, for f S (R8 ) and D 0,



(: l i k j : (f ))(n) = : l c,
i k c+,
j : (f )(n) (n) , (2.1)
,

where : l c,
i k c+,
j : (f )(n) : K S (n) K S n is a bounded operator whose
expression can be obtained from the formal expression of i in terms of creation
and annihilation operators. For instance, if K S n ,
(: l ci,+ k cj+, : (f )(n) )i11...i
...n
n
(p1 , . . . , pn )
X
n
= r ,+ i,ir il+r (1)k
r=1
Z
dpm (p)k1/2 m (pr )l1/2 f(pr,+ , p+ )+1 ...

r ...n
(p, p1 , . . . , p
r , . . . , pn ),
ji1 ...ir ...in
R3

where the hat over an index means that the index itself must be omitted and
where we have introduced the convention (which we will use systematically in the
following) of denoting simply by $q_\sigma \in \mathbb{R}^4$ the 4-vector $(\sigma\omega_m(\mathbf{q}), \mathbf{q})$, $\sigma = +, -$.
We now want to show that such operators can be extended to suitable spaces
of tempered distributions on $\mathbb{R}^8$, which in turn are left invariant by the operation
induced by the commutator of field bilinears.

Definition 2.1. We denote by C the space of functions f C (R8 ) such that for
all r N, , N40 ,


f
r,, = sup |(1 + |p + q|)r p q f (p, q)| < .
(p,q)R8

Introducing the notation f(p, q) := f (q, p) and the expressions



(T k,l (f ))(p) := dq m (p)k1/2 m (q)l1/2 f (p+ , q+ )(q),
R3

k,l,
f (p, q) := f (p+ , q+ )m (p)k1/2 m (q)l1/2 ,

where k, l = 0, 1 and = +, , we denote by Ck,l the space of functions f C


such that T k,l (|f |), T l,k (|f|) : L2 (R3 ) L2 (R3 ) are bounded operators and k,l,
f
2 6
L (R ). Furthermore, we introduce on C the seminorm
k,l


f
k,l := max{
T k,l (|f |)
,
T l,k (|f|)
,
k,l,
f
L2 (R6 ) }.

The spaces $\hat{\mathcal{C}}^{k,l}$ depend also on the mass $m$ appearing in $\omega_m$, but we have
avoided indicating this explicitly in order not to burden the notation. It is
clear that functions in $\hat{\mathcal{C}}$ are bounded with all their derivatives and therefore
$\hat{\mathcal{C}}^{k,l} \subset \mathcal{S}'(\mathbb{R}^8)$. We denote then by $\mathcal{C}^{k,l}$ the space of distributions $f \in \mathcal{S}'(\mathbb{R}^8)$
such that $\hat f \in \hat{\mathcal{C}}^{k,l}$. It is also easy to verify that $\mathcal{S}(\mathbb{R}^8) \subset \mathcal{C}^{k,l}$.

Lemma 2.1. The expression


 
C l,k (f, g)(p, q) := (1)l (i)k+l
dk m (k)l+k1 f (p, k+ )g(k+ , q),
= R3

(2.2)
   
defines a bilinear map C l,k : Cl ,l Ck,k Cl ,k , such that
C l,k (f, g)
l ,k
2
f
l ,l
g
k,k .

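Proof. We start by showing that if $f, g \in \hat{\mathcal{C}}$ then $\hat C^{l,k}(f, g) \in \hat{\mathcal{C}}$. Setting
$\varepsilon = 2/|\mathbf{p} + \mathbf{q}|$ and $\mathbf{e} = \varepsilon(\mathbf{p} + \mathbf{q})/2$, it is clearly sufficient to show that, as $\varepsilon \to 0$,
$$I_{h,r}(\varepsilon) := \int_{\mathbb{R}^3} \frac{|x|^{h/2}\,dx}{(1 + |x + \varepsilon^{-1}\mathbf{e}|)^r\,(1 + |x - \varepsilon^{-1}\mathbf{e}|)^r} = O(\varepsilon^{s(r,h)}), \tag{2.3}$$
where $h = k + l - 1 = -1, 0, 1$, and $s(r, h) \to +\infty$ as $r \to +\infty$. Consider first the case
$h = 0$. Choosing the $x_3$ axis along $\mathbf{e}$ and evaluating the integral in prolate spheroidal
coordinates $x_1 = \varepsilon^{-1}\sqrt{(u^2 - 1)(1 - v^2)}\cos\vartheta$, $x_2 = \varepsilon^{-1}\sqrt{(u^2 - 1)(1 - v^2)}\sin\vartheta$,
$x_3 = \varepsilon^{-1}uv$, one gets
$$I_{0,r}(\varepsilon) = 2\pi\varepsilon^{2r-3}\left[\int_{1+\varepsilon}^{+\infty} du\, J_{r-1}(u) + \varepsilon^2\int_{1+\varepsilon}^{+\infty} du\, J_r(u) - 2\varepsilon\int_{1+\varepsilon}^{+\infty} du\, u\,J_r(u)\right],$$
where, by recursion,
$$J_r(u) := \int_{-1}^{1} \frac{dv}{(u^2 - v^2)^r} = \sum_{k=1}^{r-1} \frac{2(2r-3)\cdots(2r-2k+1)}{(2r-2)\cdots(2r-2k)}\,\frac{1}{u^{2k}(u^2-1)^{r-k}} + \frac{(2r-3)!!}{(2r-2)!!}\,\frac{1}{u^{2r-1}}\log\frac{u+1}{u-1},$$
which easily gives estimate (2.3) with $s(r, 0) = r - 3$. Take now $h = -1$. Dividing
the integration region into the subregions $\{|x| \leq 1\}$, $\{|x| > 1\}$ and using the
Cauchy-Schwarz inequality in the first integral, one gets
$$I_{-1,r}(\varepsilon) \leq \left(\int_{|x|\leq 1} |x|^{-1}\,dx\right)^{1/2} I_{0,2r}(\varepsilon)^{1/2} + I_{0,r}(\varepsilon) = O(\varepsilon^{r-3}).$$
Finally, for $h = 1$, taking into account the bound $|x|^{1/2}/[(1 + |x + \varepsilon^{-1}\mathbf{e}|)(1 + |x - \varepsilon^{-1}\mathbf{e}|)] \leq 1/2$,
one gets $I_{1,r}(\varepsilon) = O(\varepsilon^{r-4})$.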
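
The closed form for $J_r(u) = \int_{-1}^{1} dv\,(u^2 - v^2)^{-r}$ obtained "by recursion" is easy to mistype, so a numerical cross-check is useful. The sketch below compares it with a composite Simpson rule. The function names and the sample points are illustrative choices, not taken from the paper, and the closed form is the one read off from the proof: a finite sum over $k = 1, \ldots, r-1$ plus a logarithmic term.

```python
import math

def j_closed(r, u):
    """Closed form for J_r(u), valid for integer r >= 1 and u > 1."""
    total = 0.0
    for k in range(1, r):
        num = 2.0
        for j in range(1, k):            # (2r-3)(2r-5)...(2r-2k+1), k-1 factors
            num *= 2 * r - 2 * j - 1
        den = 1.0
        for j in range(1, k + 1):        # (2r-2)(2r-4)...(2r-2k), k factors
            den *= 2 * r - 2 * j
        total += num / den / (u ** (2 * k) * (u * u - 1.0) ** (r - k))
    log_coeff = 1.0                      # (2r-3)!!/(2r-2)!!
    for j in range(1, r):
        log_coeff *= (2 * j - 1) / (2 * j)
    return total + log_coeff / u ** (2 * r - 1) * math.log((u + 1.0) / (u - 1.0))

def j_simpson(r, u, n=2000):
    """Composite Simpson approximation of J_r(u) on [-1, 1]; n must be even."""
    h = 2.0 / n
    f = lambda v: (u * u - v * v) ** (-r)
    s = f(-1.0) + f(1.0)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(-1.0 + i * h)
    return s * h / 3.0

# Relative deviation between the two evaluations at a few sample points.
deviations = {(r, u): abs(j_closed(r, u) - j_simpson(r, u)) / j_simpson(r, u)
              for r in (1, 2, 3, 4) for u in (1.5, 3.0)}
```

For $r = 2$ the closed form reduces to $1/(u^2(u^2-1)) + (2u^3)^{-1}\log\frac{u+1}{u-1}$, which can also be confirmed by hand from the standard antiderivative.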
   
We now show that if f Cl ,l , g Ck,k , then C l,k (f, g) Cl ,k . We introduce
the notation K to denote the HilbertSchmidt operator on L2 (R3 ) with kernel
L2 (R6 ). It is then easy to verify that, if L2 (R3 ),


   

T l ,k (|C l,k (f, g)|)
2 (
T l ,l (|f |)T k,k (|g|)||
2 +
Kl ,l,+ Kk,k , ||
2 ),
|f | |g|

  

T k ,l (|C l,k

(f, g)|)
2 (
T k ,k (|

g |)T l,l (|f|)||
2 +
Kl ,l, Kk,k ,+ ||
2 ),
|f | |g|

    
so that T l ,k (|C l,k (f, g)|) and T k ,l (|C l,k (f, g)|) are bounded. Furthermore one has,
2 6
for L (R ),
   
l ,k ,+
|C k,k ,+
l,k (f,g) , L2 (R6 ) | (|g| , (T l ,l (|f |) 1)||L2 (R6 )
 
+ l|f,l,+
| , (1 T
k ,k
g|) )||L2 (R6 ) )
(|
   
l ,k , l ,l,
|C l,k (f,g) , L2 (R6 ) | (|f | , (1 T
k,k
(|g|))||L2 (R6 )
 
k,k ,
+ |g| , (T l,l (|f|) 1)||L2 (R6 ) )
 
l ,k , 2 6
l,k (f,g) L (R ). The bound on
C
so that by Riesz theorem C l,k (f, g)
l,k now
follows at once from the above estimates.
 
For (f, g) C l ,l C k,k , we write C l,k (f, g) := C l,k (f, g) .

Proposition 2.1. The following statements hold for any i, j {1, 2}, k, l
{0, 1}, n N, , {+, }, with n 0.

(1) The map f S (R8 ) : l c, i k c+,


j : (f )(n) B(K S (n) , K S n )
can be extended to a map (denoted by the same symbol) from C l,k to
B(K S (n) , K S n ), such that

: l c,
i k c+,
j : (f )(n)

f
l,k (n + 2). (2.4)

(2) For each f C l,k the operator : l i k j : (f ), defined on D


0 by formula (2.1),
satisfies
+ 1)1/2 : l i k : (f )(N

(N + 1)1/2

f
l,k , (2.5)
j


(N , : l i k : (f )](N
+ 1)1/2 [N + 1)1/2

f
l,k , (2.6)
j
 
for some > 0. If furthermore (f, g) C l ,l C k,k , there holds, on D
0,
 
[: l i l i : (f ), : k j k j  : (g)]
   
= ij : l i k j  : (C l,k (f, g)) i ,j  : k j l i : (C k ,l (g, f ))
      ,+
+ il+l +k+k 2 i ,j  i,j ((1)l+l lf,l, , k,k

g L2 (R6 )

(1)k+k lf,l,+ , 
  k,k ,

g L2 (R6 ) )1. (2.7)
Proof. (1) Dene the contraction operator () : K (n+2) K n , K 2 , by


()1 n+2 = , 1 2 3 n+2 . It is easily seen from the usual
expressions of creation and annihilation operators (see, e.g., [12, Sec. X.7]) that
for f S (R8 )


n
: l ci,+ k cj+, : (f )(n) = il (i)k Vr ((T l,k (f) |e+ +
i ej |) 1 1),
r=1
1,n

il+k
: l ci,+ k cj+,+ (n)
: (f ) =  Wr,s (l,k,+
f
(e+
i ej )) ,
n(n 1) r=s

: l ci, k cj+, : (f )(n) = (i)l+k (n + 1)(n + 2)(l,k,
f
(e +
i ej )),

where for i K, i = 1, . . . , n,

Vr 1 n = 2 1 n ,
r th place

Wr,s 1 n = 3 1 2 n .
rth place sth place

Thus the above formulas provide an extension of : l c,


i k c+,
j : ()(n) to C l,k
and the bound (2.4) holds.
(2) The bounds (2.5) and (2.6), with = 4( 3 + 1), follow easily from (2.4).
Equation (2.7) is obtained by a straightforward (if lengthy) calculation, using
the above expressions for : l c,
i k c+,
j : ()(n) .

Remark 2.1. It is not dicult to see that the above dened exten-
sion of : l c,
i k c+,
j : ()(n) to C l,k is unique in the family of linear maps
S (n)
S : C l,k
B(K , K S n ) which are sequentially continuous when
S (n) S n
B(K ,K ) is equipped with the strong operator topology and C l,k
is equipped with the topology induced by the family of seminorms



f
k,l, = max{
T k,l (|f|)
,
T l,k (|f|)
,
k,l,
f

L2 (R6 ) }, L2 (R3 ),

with respect to which S (R8 ) is sequentially dense in C l,k . On the other hand,
we point out the fact that, according to Eq. (2.7), the linear span of extended
eld bilinears is stable under the operation of taking commutators. Together with
Proposition A.1 in the Appendix, this implies that in the construction of the local
symmetry generator carried out in the following section, Eq. (3.8), only the above
dened extensions are relevant.

According to the results in [12, Sec. X.5], the bounds (2.5) and (2.6) imply that
: l i k j : l i k j : (f ) can be extended to an operator, denoted by the same
symbol, whose domain contains D(N ).
3. Reconstruction of the Free Field Noether Currents


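We start by considering the theory of a complex free scalar field $\phi$ of mass $m \geq 0$.
The Hilbert space of the theory is the symmetric Fock space $\mathcal{H} = \Gamma(L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)$.
As customary, we denote by $D_0 \subset \mathcal{H}$ the space of finite particle vectors, and by $N$
the number operator $N = d\Gamma(1)$, with domain $D(N)$. The local field algebras are
defined as usual by
$$\mathcal{F}(\mathcal{O}) := \{e^{i[\phi(f)+\phi(f)^*]^-} : f \in \mathcal{D}(\mathcal{O})\}'',$$
and if we consider
$$V(\theta) := \Gamma\left(1 \otimes \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}\right),$$
we obtain a continuous unitary representation of U(1) (i.e. a $2\pi$-periodic representation
of $\mathbb{R}$) on $\mathcal{H}$, $\theta \in \mathbb{R} \mapsto V(\theta)$, which induces a group of gauge automorphisms
$\beta_\theta := \mathrm{Ad}\,V(\theta)$ of $\mathcal{F}$ such that $\beta_\theta(\phi(f)) = e^{i\theta}\phi(f)$. We denote by $Q$ the self-adjoint
generator of this group. It is easy to see that $\|(N + 1)^{-1/2}Q(N + 1)^{-1/2}\| \leq 1$ and
$[N, Q] = 0$, so that thanks to Nelson's commutator theorem (cf. [12, Sec. X.5])
$D(N) \subset D(Q)$. Furthermore we introduce the unitary operator $Z$ on $\mathcal{H}$ such that
$Z\phi(f)Z^* = -\phi(f)$, $Z\Omega = \Omega$.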
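In order to find an explicit representation of the (semi-)local implementation of
the flip automorphism we consider, following [9], the doubled theory $\mathcal{O} \mapsto \tilde{\mathcal{F}}(\mathcal{O}) :=
\mathcal{F}(\mathcal{O}) \bar\otimes \mathcal{F}(\mathcal{O})$, generated by the two commuting complex scalar fields $\phi_1(f) :=
\phi(f) \otimes 1$, $\phi_2(f) := 1 \otimes \phi(f)$. There is a continuous unitary representation of U(1)
on $\tilde{\mathcal{H}} = \mathcal{H} \otimes \mathcal{H}$, $\theta \in \mathbb{R} \mapsto Y(\theta)$, which induces a group of gauge automorphisms
$\gamma_\theta := \mathrm{Ad}\,Y(\theta)$ of $\tilde{\mathcal{F}}$ such that
$$\begin{aligned}
\gamma_\theta(\phi_1(f)) &= \cos\theta\,\phi_1(f) - \sin\theta\,\phi_2(f),\\
\gamma_\theta(\phi_2(f)) &= \sin\theta\,\phi_1(f) + \cos\theta\,\phi_2(f).
\end{aligned} \tag{3.1}$$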

In Proposition A.1 in the Appendix it is shown that the Noether current of this
U(1) symmetry

J (x) = 1 (x) 2 (x) + 1 (x) 2 (x) 1 (x)2 (x) 1 (x) 2 (x) (3.2)

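is a well-defined Wightman field that when smeared with an $h \in \mathcal{S}_{\mathbb{R}}(\mathbb{R}^4)$ gives
an operator which is essentially self-adjoint on $D(\tilde N)$, and generates a group of
unitaries which locally implements the symmetry: given 3-dimensional open balls
$B_r$, $B_{r+\delta}$ centered at the origin of radii $r + \delta > r > 0$ together with functions
$\varphi \in \mathcal{D}_{\mathbb{R}}(B_{r+\delta})$, $\psi \in \mathcal{D}_{\mathbb{R}}((-\tau, \tau))$ such that $\tau < \delta/2$, $\varphi(x) = 1$ for each $x \in B_{r+\tau}$
and $\int_{\mathbb{R}} \psi = 1$, it holds that
$$e^{i\theta J_0(\psi\otimes\varphi)} \in \tilde{\mathcal{F}}(\mathcal{O}_{r+\delta}), \qquad e^{i\theta J_0(\psi\otimes\varphi)}\,F\,e^{-i\theta J_0(\psi\otimes\varphi)} = \gamma_\theta(F), \quad F \in \tilde{\mathcal{F}}(\mathcal{O}_r), \tag{3.3}$$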


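where $\mathcal{O}_r$, $\mathcal{O}_{r+\delta}$ are the double cones with bases $B_r$, $B_{r+\delta}$, respectively. It
then follows easily that setting $h_\lambda := \psi_\lambda \otimes \varphi_\lambda$ with $\varphi_\lambda(x) = \varphi(\lambda^{-1}x)$ and
$\psi_\lambda(t) = \lambda^{-1}\psi(\lambda^{-1}t)$, the unitary operator
$$W_{\mathcal{O}_{\lambda r},\mathcal{O}_{\lambda r+\lambda\delta}} := (1 \otimes Z)\,e^{i\frac{\pi}{2}J_0(h_\lambda)} \in \mathcal{F}(\mathcal{O}_{\lambda r+\lambda\delta}) \bar\otimes B(\mathcal{H}), \tag{3.4}$$
is a semi-local implementation of the flip automorphism on $\tilde{\mathcal{F}}(\mathcal{O}_{\lambda r})$ for each $\lambda > 0$.
In what follows, we will keep the functions $\varphi$, $\psi$ fixed and we will assume that
$\varphi(Rx) = \varphi(x)$ for each $R \in O(3)$.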
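
Why (3.4) implements the flip can be seen on the level of the two fields alone. The sketch below is an illustration under stated assumptions, not part of the paper: it reads (3.1) as a rotation of the pair $(\phi_1, \phi_2)$ with a minus sign in the first line, assumes $Z\phi(f)Z^* = -\phi(f)$, and encodes both actions as $2 \times 2$ matrices. The names `rotation`, `Z_SIGN` and `compose` are hypothetical.

```python
import math

def rotation(theta):
    """Matrix of the gauge rotation by theta on (phi_1, phi_2), as in (3.1)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

# Ad(1 (x) Z) leaves phi_1 fixed and sends phi_2 to -phi_2 (assumed sign of Z).
Z_SIGN = [[1, 0], [0, -1]]

def compose(R, D):
    """Matrix product R.D: rotate first, then apply Ad(1 (x) Z) to each field."""
    return [[sum(R[i][j] * D[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

# Action of Ad(W) with W = (1 (x) Z) exp(i (pi/2) J_0) on the pair (phi_1, phi_2).
FLIP = compose(rotation(math.pi / 2), Z_SIGN)
```

Up to rounding, `FLIP` is the swap matrix `[[0, 1], [1, 0]]`: the rotation by $\pi/2$ sends $\phi_1 \mapsto -\phi_2$ and $\phi_2 \mapsto \phi_1$, and $Z$ on the second factor removes the unwanted sign.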
For a function h S (R4 ), we introduce the distribution h S  (R8 ) dened

by h (x, y) = h(x)(x y) (i.e. h , f  = R4 dx h(x)f (x, x) for f S (R8 )).

Proposition 3.1. Let the operator WOr ,Or+ be defined as above. The operator
Or ,Or+ (Q) defined on D(N ) by


Or ,Or+ (Q) = P1 WOr ,Or+ (1 Q)WOr ,Or+
, D(N ), (3.5)

where P1 (1 2 ) = , 2 1 , is essentially self-adjoint. Furthermore, there are


l,k
distributions Kn,m () C l,k , n N, l, k = 0, 1, m 0, defined recursively by

1,0 0,1 0,0 1,1


K1,m () = K1,m () := (h ) , K1,m () = K1,m () = 0, (3.6)
1

l,k
Kn+1,m () = i(1)n [(1)l+1 C 1l,r ((h ) , Kn,m
r,k
())
r=0

+ (1) C r,1k (Kn,m


k l,r
(), (h ) )], (3.7)

such that, for all D(N ),



+
 2n 0,1


Or ,Or+ (Q) = n (2n)!
: l k : (K2n,m
l,k
()) , (3.8)
n=1
4
l,k

the series being absolutely convergent for all (0, 1].

Proof. We start by observing that, for all H for which the right-hand side
of (3.5) is dened, one has

Or ,Or+ (Q) = P1 ei 2 J0 (h ) (1 Q)ei 2 J0 (h ) . (3.9)

It follows from this formula that Or ,Or+ (Q) is well-dened (and symmetric) on
D(N ): according to formula (A.1) in the Appendix for J0 (h ), Proposition 2.1(2)

and [11, Lemma 2], we have ei 2 J0 (h ) D(N
) D(N
) and D(N ) D(Q) as remarked
above.
)
Recalling now the denition of Q one has on D(N
2

Q1 () := i[J0 (h ), 1 Q] = [: j j  : ((h ) ) : j j  : ((h ) )]
j=1

0,1
2 

= : l j k j  : (K1,m
l,k
()),
j=1 l,k

where j  = 3 j. Proceeding now inductively using formula (2.7), one veries that
0,
there are operators Qn () such that, on D
Qn+1 () = i[J0 (h ), Qn ()], (3.10)
0,1
2 

Q2n () = (1)j+1 : l j k j : (K2n,m
l,k
()),
j=1 l,k
(3.11)
0,1
2 

Q2n+1 () = : l j k j  l,k
: (K2n+1,m ()),
j=1 l,k

l,k
where the distributions Kn,m () C l,k satisfy (3.7). It is also easy to verify induc-
tively that the distributions K l,k () are real (g S  being real if g, f  = g, f),
n,m
so that Qn () is symmetric. Arguing again by induction, it follows from (3.7) and
Lemma 2.1, that

K
n,m
l,k 
()
l,k (8)n1 (max{
(h 
)
0,1 ,
(h )
1,0 }) (8)
n n1

h
nS ,
where
h
S is some xed Schwartz norm of h. The last inequality above follows
from Lemma A.1 and from the observation that, switching for a moment to the
(m)
notation

l,k in order to make explicit the dependence on the mass m of the
seminorms

l,k , one has


(h
(m) (m)
)
l,1l =
h
l,1l , l = 0, 1.
Using now the bounds in Proposition 2.1(2) and the results in [12, Sec. X.5], we see
that Qn () can be extended to an operator (denoted by the same symbol) which
. The domain D
is essentially self-adjoint on any core for N 0 being such a core,
Eq. (3.10) can be assumed to hold weakly on D(N ) D(N
) and we are therefore
in the position of applying [11, Theorem 1 ] to obtain

1  n
+


ei 2 J0 (h ) (1 Q)ei 2 J0 (h ) = 1 Q + Qn ()
n=1
n! 2
). Combining this with (3.9),
and the series converges strongly absolutely on D(N
and the fact that
P1 : l j k j  : (K2n+1,m
l,k
()) = 0 = P1 : l 2 k 2 : (K2n,m
l,k
()) ,
Eq. (3.8) readily follows, upon identication of 1 (f ) = (f ) 1 with (f ).
It remains to prove that Or ,Or+ (Q) is essentially self-adjoint on D(N ), but


this again follows from the easily obtained N -bounds

(N + 1)1/2 Or ,Or+ (Q)(N + 1)1/2
cosh(4 2
h
S ),
(3.12)

(N + 1)1/2 [N, Or ,Or+ (Q)](N + 1)1/2
cosh(4 2
h
S ),
where > 0 is a suitable numerical constant.

We now show that the unitary group generated by the operator Or ,Or+ (Q)
dened in the above proposition provides a local implementation of the U(1) sym-
metry.

Proposition 3.2. For each R and F F (Or ) there holds:


eiOr ,Or+ (Q) F (Or+ ), eiOr ,Or+ (Q) F eiOr ,Or+ (Q) = (F ).

Proof. Since the free eld enjoys Haag duality property, it is sucient to show
that

eiOr ,Or+ (Q) ei[(f )+(f ) ]
eiOr ,Or+ (Q) = ei[(f )+(f ) ]


if supp f Or+ and that
i
(f )+ei (f ) ]
eiOr ,Or+ (Q) ei[(f )+(f ) ]
eiOr ,Or+ (Q) = ei[e
if supp f Or . Applying once again [11, Theorem 1 ] and keeping in mind the
previously obtained N -bounds for Or ,Or+ (Q), Eq. (3.12), one sees that in order
to achieve this, it is enough to show that for all 1 , 2 D(N )
Or ,Or+ (Q)1 , (f )2  (f ) 1 , Or ,Or+ (Q)2  = 0 (3.13)

for supp f Or+ and
Or ,Or+ (Q)1 , (f )2  (f ) 1 , Or ,Or+ (Q)2  = 1 , (f )2  (3.14)
for supp f Or . In order to prove the latter equation we compute
Or ,Or+ (Q)1 , (f )2 

= (1 Q)ei 2 J0 (h) 1 , ei 2 J0 (h) ((f ) 1)2 

= (1 Q)ei 2 J0 (h) 1 , (1 (f ))ei 2 J0 (h) 2 

= (1 (f ) )ei 2 J0 (h) 1 , (1 Q)ei 2 J0 (h) 2 

+ ei 2 J0 (h) 1 , (1 (f ))ei 2 J0 (h) 2 
= (f ) 1 , Or ,Or+ (Q)2  + 1 , (f )2 ,
where in the second and fourth equalities we used (3.1) and (3.3), and in the third

equality the fact that, as noted in the proof of Proposition 3.1, ei 2 J0 (h) i
D(N ) and that for 2 D(N
1, ), there holds

(1 Q) 2  (1 (f ) )
1 , (1 (f )) 1 , (1 Q)
2  = 
1 , (1 (f ))
2
which in turn is an easy consequence of the commutation relation

[Q, (f )] = (f ), D(N ),
is the closure of N 1 + 1 N and of the N
of the fact that N -bounds hold-
ing for 1 Q and 1 (f ). The proof of (3.13) being analogous, we get the
statement.

l,k
In the following lemma, we collect some properties of the distributions Kn,m :=
l,k
Kn,m (1) which will be needed further on. We will use systematically the notations


f
:= sup (1 + |p0 | + |p|) |f (p)|,
pR4

:= sup (1 + |p|) |(p)|,


pR3

1, := max{

,

},

1, := max{

,
1
, . . . ,
3
},

for f S (R4 ), S (R3 ), S (R) and > 0.

Lemma 3.1. The following statements hold.


l,k enjoy the following symmetry properties:
(1) The functions K n,m

n,m
K l,k
(p, q) = K
n,m
k,l
(q, p), n,m
K l,k n,m
(p0 , Rp, q0 , Rq) = K l,k
(p, q) (3.15)

for all p = (p0 , p), q = (q0 , q) R4 , and all R O(3).


(2) Given > 5 there exists a constant C1 > 0 such that, uniformly for all m
[0, 1] and all smearing functions DR (Br+ ), DR ((, )),
C1n1 n
|K
n,m
l,k
(p, q)| n (1 + |p|)2l (1 + |q|)2k ,

n N, (3.16)
4 2
for all p = (p0 , p), q = (q0 , q) R4 .
(3) For each n N, the function (p, q, m) R8 [0, 1] K l,k (p, q) is continuous.
n,m
(4) For each n N, the function (p, q, m) R8 [0, 1/e] K n,m
l,k
(p, q) is of
1
class C . Moreover, given > 5, there exists a constant C2 C1 such that
uniformly for all m [0, 1/e] and all smearing functions DR (Br+ ),
DR ((, )),
 
  C1n1 n

K l,k
(p, q) n1, (1 + |p|)2l (1 + |q|)2k ,

1,

(3.17)
 u n,m  4 2
 
  C2n1 n

K l,k
(p, q) m|log m|

1,

n1, (1 + |p|)2l (1 + |q|)2k ,


 m n,m  4 2
(3.18)

for all p = (p0 , p), q = (q0 , q) R4 , and where u in (3.17) is p or q.


Proof. (1) Both properties in (3.15) follow easily by induction from the recursive
n,m
denition of K l,k
, taking into account rotational invariance of the function .

(2) We start by observing that, by interchanging k with k in the = 1 sum-


mand, formula (2.2) can be rewritten as
 
C l,k (f, g)(p, q) := (1)l (i)k+l dk m (k)l+k1 f (p, k )g(k , q),
= R3

(3.19)

where we recall that k = (m (k), k). Since > 5, there exists a xed constant
 
dk |k|s dk
B1 > , , s = 0, 1, 2, p R3 .
R3 |k|(1 + |p k|) R3 (1 + |k|)

It is then easily computed that for h = 1, 0, 1, j = 1, 2 and m [0, 1],



m (k)h (1 + |k|)j
dk 7B1 (1 + |p|)h+j ,
R3 (1 + |p k|)

so that estimate (3.16) follows by induction from (3.7) and the above expression
for C l,k , provided one denes C1 := 14B1 / and keeps in mind that h (p, q) =
1
4 2 (p 0 + q0 )(p
+ q).
(3) Using (3.16) and the fact that h S (R4 ), we obtain a bound to the inte-
1l,r , K ) with an integrable func-
grands in C (h r,k
n,m ) and C
r,1k (K
l,r , h
n,m
tion of k, uniformly for (p, q, m) in a prescribed neighborhood of any given
( R8 [0, 1]. By a straightforward application of Lebesgues domi-
p, q, m)
nated convergence theorem, the continuity of (p, q, m) K l,k (p, q) follows
n,m
then by induction from the recursive relation (3.7).
(4) Since K l,k Cl,k , we already know that it is dierentiable with respect to the
n,m
components of p and q. The estimate (3.17) and the continuity of (p, q, m)
l,k
u Kn,m (p, q) then follow by an easy adaptation of the inductive arguments
of points (2) and (3) above, using also (3.16). In order to show that K n,m
l,k

is continuously dierentiable in m and satises (3.18), we proceed again by


induction using (3.7). The m-derivative of the integrands in C 1l,r (h , K n,m
r,k
)
is given, apart from numerical constants, by

m(r l) m k )K
h(p k )Kn,m (k , q)
r,k
0 h(p r,k (k , q)
m (k)2+lr m (k)1+lr n,m


K r,k
k )
h(p
n,m k ) K
(k , q) + m (k)rl h(p r,k (k , q).
p0 m n,m

It is now straightforward to verify, using (3.16), (3.17) and the inductive hypoth-
esis (3.18), that it is possible to bound the last three terms in the above
expression with an integrable function of k, uniformly for (p, q, m) in a given


neighborhood of a xed ( R8 [0, 1/e]. The same reasoning also applies
p, q, m)
to the rst term when 2 + l r < 3 and also when 2 + l r = 3 for |k| 1/2.
For |k| 1/2 and 2 + l r = 3 the rst term can be bounded uniformly
in a neighborhood of (p, q) by the function m(m + |k|)3 , apart from a con-
stant (depending on the chosen neighborhood). By maximizing the function
x x3 | log x| /(m + x)3 in the interval [0, 1/2], with > 1, one nds the
bound
3
/3
3 mW0 e
m 3 3m
,
(m + |k|)3 |k|3 |log|k||

where W0 is the principal branch of Lamberts W function [13]. From the


asymptotic expansion of W0 given in [13, Eq. (4.20)] it is then easily seen that
the numerator on the right-hand side converges to 0 as m 0; since the func-
tion k |k|3 |log|k|| is integrable for |k| 1/2, interchangeability of deriva-
tion with respect to m and integration with respect to k in C 1l,r (h , K
n,m
r,k
)
for all values of l, r, k = 0, 1 follows. A completely analogous argument applies
of course to C r,1k (K l,r , h l,k
), so that we conclude that K
n,m n+1,m is continuously
dierentiable in m. To complete the inductive step, it remains to be shown that
l,k
estimate (3.18) holds for m Kn+1,m . In order to do that, we argue again in a
similar way as in point (2) by choosing constants B2 , B3 > 0 such that
 
dk |k|s dk
B2 , , s = 0, 1, t = 0, 1, 2, p R3 ,
R3 |k| (1 + |p k|) R3 (1 + |k|)
t

 1
B3 log(1 + 1 + m2 ) , m [0, 1/e].
1 + m2

Taking now into account the identity


 1 
x2 dx 1
= log(1 + 1 + m2 ) log m,
0 (m2
+x )2 3/2
1 + m2

it is easy to verify that the estimate


 
 1l,r  m|log m|
 (h , Kn,m )(p, q)
r,k
 m C 8 3
[16(1 + B3 ) + 16B2 + 7B1 ]

2l
n+1

C2n1

1, 1, (1 + |p|)
n+1
(1 + |q|)2k ,

r,1k l,r
holds for all m [0, 1/e] together with a similar one for m C (Kn,m , h ).
2
Choosing C2 := [16(1 + B3 ) + 16B2 + 7B1 ] C1 , one nally gets (3.18) for
l,k
K n+1,m .
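
The elementary identity used in the last step of the proof,
$\int_0^1 x^2 (m^2 + x^2)^{-3/2}\,dx = \log(1 + \sqrt{1 + m^2}) - \log m - 1/\sqrt{1 + m^2}$,
is what produces the logarithmic factor in the mass derivative. A quick numerical confirmation is sketched below. The function names and sample masses are illustrative choices only, and the left-hand side is approximated by a composite Simpson rule.

```python
import math

def lhs(m, n=20000):
    """Simpson approximation of the integral of x^2/(m^2+x^2)^(3/2) over [0, 1]."""
    h = 1.0 / n
    f = lambda x: x * x / (m * m + x * x) ** 1.5
    s = f(0.0) + f(1.0)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3.0

def rhs(m):
    """Closed form: log(1 + sqrt(1+m^2)) - log(m) - 1/sqrt(1+m^2)."""
    root = math.sqrt(1.0 + m * m)
    return math.log(1.0 + root) - math.log(m) - 1.0 / root

# Absolute deviation for a few masses in the interval (0, 1/e] used in the lemma.
gaps = {m: abs(lhs(m) - rhs(m)) for m in (1.0 / math.e, 0.1, 0.05)}
```

Since the right-hand side behaves like $|\log m|$ for small $m$, multiplying by the prefactor $m$ of the corresponding integrand gives the $m|\log m|$ dependence appearing in (3.18).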
In the next theorem, which is our main result, we denote by $D_{0,S}$ the dense
subspace of $\mathcal{H}$ of finite particle vectors such that the $n$-particle wave functions are
in $\mathcal{S}(\mathbb{R}^{3n})$ for each $n \in \mathbb{N}$.

Theorem 3.1. There holds, for each f S (R4 ) and each D0,S ,

1
lim dx f (x)x (Or ,Or+ (Q)) = cj0 (f ), (3.20)
0 3 R4

where j0 (f ) = : : (f ) is the Noether current associated to the U(1)


symmetry of the charged KleinGordon field of mass m 0 smeared with the test
function f and
 
+
 2n 0,0
K
4 0,1 2n,0
c = (2)
K2n,0 (0, 0) + i (0, 0) . (3.21)
n=1
4n (2n)! p0

Proof. Since D0,S is translation invariant and contained in D(N ), according to


Proposition 2.1 and the estimates given in the proof of Proposition 3.1 there exists
a > 0 such that, for each x R4 ,

x (: l k : (K2n,m
l,k
()))
(8)2n1
h
2n
S
(N + 1)
,

and

x (: l k : (K2n,m
l,k
())) y (: l k : (K2n,m
l,k
()))

(8)2n1
h
2n
S
(U (x) U (y) )(N + 1)

+
(U (x) U (y)): l k : (K2n,m
l,k
())U (y)
,
so that the function x x (Or ,Or+ (Q)) is continuous and bounded in norm
for each D0,S , the integral in (3.20) exists in the Bochner sense and furthermore
it is possible to interchange the integral and the series.
Given now K C l,k , it is easy to see that the pointwise product K f still
1
belongs to C l,k
and
K f
l,k (2)2
f

K
l,k so that we can dene K f :=

4
(2) (K f ) C l,k . It is then straightforward to check that

dx f (x)x (: l k : (K2n,m
l,k
())) = : l k : (K2n,m
l,k
() f ).
R4

Furthermore one has K l,k ()(p, q) = 2+l+k K l,k (p, q) and, with the nota-
2n,m 2n,m
tion ( K)b(p, q) = K(p,
q), we see that we are left with the calculation of
0,1
 +
 2n
lim l+k1
n (2n)!
: l k : ( K2n,m
l,k
f ). (3.22)
0
n=1
4
l,k

As a first step in this calculation, we show that it is possible to interchange the
limit and the series. Of course, it is sufficient to consider vectors $\Phi$ with vanishing
$n$-particle components except for $n = N$, with any fixed $N \in \mathbb{N}$. For simplicity, we
will give here only the relevant estimates in the case $m > 0$, the case $m = 0$ being
treated in a similar way. Using then the notations for creation and annihilation
operators and for wave functions introduced in Sec. 2 and the formulas in the proof
of Proposition 2.1, we have


: l c,+ k c+, : ( K2n,m
l,k
f )(N )

16 5 N
((T l,k (( K2n,m
l,k
)bf ) |e+ e+ |) 1 1)
,

together with the estimate, for [0, 1/m],


l,k
|[((T l,k (( K2n,m )bf ) |e+ e+ |) 1 1)]1 ...N (p1 , . . . , pN )|

C1n1 B1 n n (1 + |p1 |)2l m (p1 )l1/2






f

4 2
(1 + |p2 |) (1 + |pN |)

m (q)k1/2 (1 + |q|)2k
dq ,
R3 (1 + |q|) (1 + |p1 q|)

where we have used (3.16) and the fact that D0,S (which gives the constant
B1 > 0). It is now easy to see that the right hand side is a square integrable function
of (p1 , . . . , pN ) if > 3/2, > 3, > 15/2 and therefore we get


: l c,+ k c+, : ( K2n,m
l,k
f )(N )
B2 C1n1

n
,

where B2 > 0 is a constant depending on m, f , but not on n and . A similar


estimate holds then for
: l c, k c+,+ : ( K2n,m
l,k
f )(N )
. Furthermore we have


: l c, k c+, : ( K2n,m
l,k
f )(N 2)


16 5 N (N 1)
l,k, ( K l,k

2 6

,
)bf L (R )
2n,m

with
2(n1)
C1 B3 2n

l,k, l,k b

2L2 (R6 ) 2n

2

f

( K2n,m ) f 16 4

(1 + |p|)3 (1 + |q|)3
dp dq ,
R6 (1 + |p| + |q|)2

for some > 6. A similar estimate holds for


: l c,+ k c+,+ : ( K2n,m
l,k
f )(N +2)
.
In summary, we get, uniformly for [0, 1/m],


: l k : ( K2n,m
l,k n

f )
B4 C1n1

n
,

with B4 independent of and n, so that, if l + k 1, it is possible to interchange


the limit and the sum in (3.22). The term in (3.22) with l = k = 0 needs however a
separate treatment, due to the divergent prefactor 1 . We rst observe that, due
to the rst relation in (3.15), we have K 0,0 (0, 0) = 0. Using bounds (3.17) and
n,m
(3.18), we thus obtain the estimate


     

 1 0,0  1 d  
 K (p , q ) =  d
K 0,0
(p , q ) 
 2n,m 

 0 d 2n,m



= 

3C2n1
1, (m + |p| + |q|)(1 + |p|)2 (1 + |q|)2 ,

1,

4 2
valid for ,  = and for [0, 0 ], with 0 := min{1/em, 1}. Then a straight-
forward adaptation of the above arguments easily gives, uniformly for [0, 0 ],

1

: : ( K2n,m
l,k
f )
B5 C2n1

n
1, 1, , (3.23)

with B5 > 0 a constant independent of and n.
The same estimates above, being uniform in [0, 1/m], together with use of
Lemma 3.1(3), allow us also to conclude that

lim : l k : ( K2n,m
l,k
f ) = (2)4 K
l,k (0, 0): l k : (f ).
2n,0 (3.24)
0

Furthermore there holds


b 0,0
1 0,0 K 2n,0
lim K2n,m f (p, q) = (2)4 (p0 q0 ) (0, 0)f(p + q),
0 p0
0,0
K 0,0
K
2n,0 2n,0
since, as a consequence of (3.15), we have pi (0, 0) =0= qi (0, 0), i = 1, 2, 3,
0,0
K 0,0
K
and 2n,0
p0 (0, 0) = q2n,0
0
Exploiting again the uniformity in [0, 0 ] of
(0, 0).
the estimates leading to (3.23), we nally get

0,0
1 0,0 K 2n,0
lim : : ( K2n,m f ) = (2)4 i (0, 0): : (f ).
0 p0

Together with (3.24), this gives the statement.

We stress that vanishing of the constant $c$ in the previous theorem is still by no
means ruled out. That in general this is not the case can be seen by choosing the
time-smearing function $\psi \in \mathcal{D}_{\mathbb{R}}((-\tau, \tau))$ sufficiently close to a $\delta$ function and the
space-smearing function $\varphi \in \mathcal{D}_{\mathbb{R}}(B_{r+\delta})$ to a characteristic function.

Proposition 3.3. Assume that the time-smearing function used in the con-
struction of Or ,Or+ (Q) satisfies (t) = 1 1 ( 1 t), where 1 DR ((1, 1))
is such that R 1 = 1, and that the space-smearing function is such that
DR (Br+/2+ ), 0 1 and (x) = 1 for all x Br+/2 , with
< /2 . Then, denoting with c(, ) the corresponding constant given by
Eq. (3.21), there holds


3
4
lim lim c(, ) = r+ . (3.25)
0 0 3 2

l,k :
Proof. By induction, it is straightforward to prove the following formula for K n,0

l,k (p, q)
K n,0
0,1
  
n1
(1)k+n1 inlk n r rj1
= j j
(2)n+1 r1 ,...,rn2 1 ,...,n1 j=1

 
n1
dkj |kj |rj rj1 h(p 1
1, k2, ) h(k
k1, )h(k
1 2
n1,
n1 + q),
R3(n1) j=1

where n = i for n even and n = 1 for n odd and r0 := l, rn1 := 1 k. Since


0 ) = 1 ( p0 ) (2)1/2 as 0 and kj, = (j |kj |, kj ), it is easy to see that
(p j

in the limit 0 the dependence on the j s drops o the integral in the second
line of the above equation and therefore

0,1 (1)n 22n1 (1)n 4n

lim K2n,0 (0, 0) = (0)
= dx (x)2n .
0 (2)3n+1 2(2)4 R3

Analogously, since  (p0 ) = 1 ( p0 ) 0 as 0, one has from the above


formula
0,0
K 2n,0
lim (0, 0) = 0.
0 p0

But, thanks to the estimates (3.16), (3.17), the convergence of the series (3.21) is
uniform in , so that one has
+ 
1  (1)n 2n
lim c(, ) = dx (x)2n .
0 2 n=1 (2n)! R 3

Since is bounded above by the characteristic function of the ball Br+ for < /2,
the convergence of the above series is also uniform in so that, taking into account
that converges to the characteristic function of the ball Br+/2 when 0, we
nally get (3.25).

It is straightforward to extend the above analysis to treat the case of the net
$O \mapsto \mathcal{F}(O)$ generated by a multiplet of free scalar fields $\phi_a$, $a = 1, \dots, d$, with the
action of a compact Lie group $G$ defined by

$$V(g)\phi_a(f)V(g)^* = \sum_{b=1}^{d} v(g)_{ab}\,\phi_b(f), \qquad g \in G,$$

where $v$ is a $d$-dimensional unitary representation.



More precisely, consider the 1-parameter subgroup R g G associated


to a Lie algebra element g and correspondingly the global generator Q of
V (g ), which satises on D(N )


d
[Q , a (f )] = i t()ab b (f ),
b=1

t() being the representation of g (through antihermitian matrices) associated


to v. Then considering again the U(1) symmetry of the doubled theory and the
associated Noether current J0 it is possible to dene a semi-local implementation
of the ip as in Eq. (3.4) and to construct a local implementation Or ,Or+ (Q )
of Q as in Eq. (3.5), which is essentially self-adjoint on D(N ) and for which an
expansion analogous to (3.8) holds:

+
 2n 0,1 
 d

Or ,Or+ (Q ) = n (2n)!
t()ab : l a k b : (K2n,m
l,k
()) ,
n=1
4
l,k a,b=1

l,k
where K2n,m () are the distributions dened in (3.6) and (3.7). Finally, the ana-
logue of formula (3.20) holds, where on the right-hand side the appropriate Noether
current

d
j0 (f ) = t()ab : a b a b : (f )
a,b=1

appears and the normalization constant c is again given by (3.21).

4. Summary and Outlook


In the present work we have shown that it is in principle possible to construct
operators implementing locally a given infinitesimal symmetry of a local net of
von Neumann algebras (local generators), starting from the existence of unitary
operators implementing (semi-)locally the flip automorphism on the tensor product
of the net with itself.
In particular, in a large class of free scalar field models our construction provides
an efficient tool to obtain manageable local generators of this kind through the explicit
expression of the local flip given in Eq. (3.4). Moreover, we showed that it is possible
to recover, up to a well-determined strictly positive normalization constant, the
associated Noether currents through a natural scaling limit of these generators in
which the localization region shrinks to a point. As expected, the above-mentioned
constant is found to depend only on the volume of the initial localization region
of the generator and not on the mass and isospin of the model. The existence of
this limit depends in this case on control of the energy behavior of the generators
(namely the existence of H-bounds) rather than on dilation invariance of the (thus
massless) theory, which was a key ingredient of previous similar results [7, 8].
These results have been obtained in the spirit of giving a consistency check
towards a full quantum Noether theorem according to the program set down in [1]
and recalled in the introduction. In order to proceed further in this direction it is
apparent that two main problems have to be tackled. First, it is necessary to extend
the construction of local generators proposed in the Introduction to a suitably
general class of theories. Second, it would be desirable to gain a deeper understanding
of the general properties granting the existence and non-triviality of the pointlike
limit of the free generators, which are presently under investigation. Among other
things, this is likely connected with the problem of clarifying if it is generally
possible, through a suitable choice of the local flip implementation, to gain control over
the boundary part of the local symmetry implementation, whose arbitrariness is
considered to be an important obstruction for the reconstruction of Noether
currents. The methods of [14] can be expected to be useful to put this analysis in a
more general framework.
Finally, we believe that our method could help to shed some light on the difficult
problem of obtaining sharply localized charges from global ones.

Acknowledgments
We would like to thank Sergio Doplicher for originally suggesting the problem to
one of us and for his constant support and encouragement, and Sebastiano Carpi
for several interesting and useful discussions. We also thank the referees for suggest-
ing several improvements in the exposition. This work was supported by MIUR,
GNAMPA-INDAM, the SNS, the Marie Curie Research Training Network MRTN-
CT-2006-031962 EU-NCG and the ERC Advanced Grant 227458 Operator Alge-
bras and Conformal Field Theory.

Appendix. Local Implementation of the Doubled Theory U(1) Symmetry
In this Appendix, we show that the smeared Noether current associated to the U(1)
symmetry of the theory of two complex free scalar fields of mass $m \geq 0$, Eq. (3.1), is
represented by a self-adjoint operator which generates a group locally implementing
the symmetry. Although this material is more or less standard, we include it here
both for the convenience of the reader and because the proof of self-adjointness
of (Wick-ordered) bilinear expressions in the free field (and its derivatives) can be
found in the literature only for mass $m > 0$ (see [15, 16]). For this reason, we will
only emphasize the main differences in the (possibly) massless case.
To begin with, the main estimates in the appendix of [15], which are valid only
for $m > 0$, have to be sharpened as in the following lemma.

Lemma A.1. Let h S (R4 ), and consider the tempered distribution h (x, y) =
h(x)(x y). Then h C 0,1 C 1,0 for all m 0, and
h

0,1 ,
h

1,0
h
S
where

S is some Schwartz norm independent of m varying in bounded intervals.

(p, q) = 1 C. We denote by
Proof. One has h (2)2 h(p
+ q), which implies h
w(p, q) the integral kernel dening T 1,0 (|h
|). It is easy to see that, for |q| 1,

m (p)
(1 + |p q|)1/2 ,
m (q)

S (R4 ), there exists a C1 > 0 and an r > 3 such that


and therefore, being h
  2  
2
  C1

dp  
dq w(p, q)(q) dp dq |(q)|
R3 |q|>1 R3 |q|>1 (1 + |p q|)r
 2
dp
C12

2L2 (R3 ) ,
R3 (1 + |p|)r
where use was made of the Young inequality $\|f * g\|_{L^2} \leq \|f\|_{L^1}\|g\|_{L^2}$. On the other
hand, there exist $C_2 > 0$ and $s > 2$ such that, for $|p| > 1$,
   
  m (p) C2
 dq w(p, q)(q) dq |(q)|
 |q| (1 + |p q|)s
|q|1 |q|1
 
m (p) dq
C2  |(q)|
|p|s
|q|1 |q|

2m (p)
C2

L2 (R3 ) ,
|p|s
and a C3 > 0 such that, for |p| 1,
   
  m (p)
 
dq w(p, q)(q) C3 dq |(q)|
 |q|
|q|1 |q|1

2C3 (1 + m2 )1/4

L2 (R3 ) .

Putting these inequalities together, we obtain


  
1/2
1,0
dp m (p)

T (|h |)
L2 (R3 ) 2 C1 + 2C2 2s
R3 (1 + |p|) |p|>1 |p|
r


 2 1/4
+ 2 2/3C3 (1 + m )

L2 (R3 ) ,

so that, since the constants Ci can be expressed by Schwartz norms of h, we conclude


that
T 1,0(|h
|)

h
S for a suitable Schwartz norm

S .
|)
,
T 0,1 (|h
The proofs that
T 1,0 (|h |)

h
S are completely
|)
,
T 0,1(|h
analogous and it is immediate to see that 0,1, 1,0,
, h
h L2 (R6 ), = , and that
their norms can be bounded by
h
S .

This lemma, together with Proposition 2.1, shows that the timelike component
$J_0(h)$ of the current (3.2) is well-defined for $h \in \mathcal{S}(\mathbb{R}^4)$. Using the fact that
$|p_i| \leq \omega_m(p)$, the proof above shows that the spacelike components $J_i(h)$, $i = 1, 2, 3$, are
well-defined too.
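
As a quick numerical sanity check of the Young inequality $\|f * g\|_{L^2} \leq \|f\|_{L^1}\|g\|_{L^2}$ used in the proof of Lemma A.1, the following Python sketch verifies its discrete analogue for finite sequences. The helper names (`convolve`, `l1_norm`, `l2_norm`) and the random test data are illustrative assumptions and are not part of the original argument, which concerns functions on $\mathbb{R}^3$.

```python
import math
import random


def convolve(f, g):
    """Full discrete convolution (f * g)[n] = sum_i f[i] * g[n - i]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def l1_norm(f):
    """Discrete L^1 norm: sum of absolute values."""
    return sum(abs(x) for x in f)


def l2_norm(f):
    """Discrete L^2 norm: square root of the sum of squares."""
    return math.sqrt(sum(x * x for x in f))


# Hypothetical test data: two random finite sequences (fixed seed).
rng = random.Random(0)
f = [rng.uniform(-1.0, 1.0) for _ in range(50)]
g = [rng.uniform(-1.0, 1.0) for _ in range(80)]

lhs = l2_norm(convolve(f, g))   # ||f * g||_2
rhs = l1_norm(f) * l2_norm(g)   # ||f||_1 ||g||_2
print(lhs <= rhs)  # True
```

The check only illustrates the inequality on one sample; it does not replace the estimate in the proof.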

Proposition A.1. The following statements hold.

(1) For each h S (R4 ), the operator J (h) defined on D(N


) by

2

J (h) := (1)j [: j j  : (h ) : j j  : (h )], (A.1)
j=1

where j  = 3 j, defines a Wightman field such that J (h) is essentially self-


adjoint for real h.
(2) If h DR (O), O a double cone, then eiJ (h) F (O), R.
(3) Given a 3-dimensional open ball Br of radius r centered at the origin together
with functions  DR (R3 ), DR ((, )) such that (x) = 1 for each
x Br+ and R = 1, it holds that

eiJ0 () F eiJ0 () = (F ), F F (Or ), (A.2)

where Or is the double cone with base Br .

Proof. (1) According to Lemma A.1 one has


h
0,1 ,
h

1,0
h
S , so that J
is a Wightman eld and J (h) is symmetric for real h. Given now a K S n ,
J (h)p is the sum of 16p vectors of the form
,p kp +,p
: lp cjp cjp : (h)(np ) : l1 c,
j1
1 k1 +,1
cj  : (h)(n1 )
1

with nj = nj1 + j + j , j = 1, . . . , p (n0 := n). Therefore, by (2.4),


p
4
h
S

J (h)

p
(n + 2(p + 1)) (n + 4)

,

0 is a
and we see that is an analytic vector for J (h). Since any element in D
nite sum of such vectors, essential self-adjointness of J (h) follows.
(2) A straightforward but lengthy calculation shows that, on D 0,

[J (h), j (f ) + j (f ) ] = (1)j+1 i(j  (g) + j  (g) ),


(A.3)
g = h( f ) + (h( f )),
1
where, as customary, is the Fourier transform of 2i (p0 )(p2 m2 ). Since
supp is contained in the closed light cone and D 0 is an invariant dense set
of analytic vectors for both J (h) and j (f ) + j (f ) , we see by standard

arguments that eiJ (h) commutes with ei[j (f )+j (f ) ] if supp h is spacelike
from supp f , i.e. eiJ (h) F (O) = F (O) if supp h O.

(3) Take f D(Or ). Since supp f does not intersect [, ] {x : (x) = 1}


we have that
(0 f ) + 0 ( ( f )) = 1(0 f ) + 0 ( 1( f )).

On the other hand, a calculation shows that, thanks to R = 1,
( 1(0 f ) + 0 ( 1( f ))) = f,
and, since f1 = 0 implies f1 = ( + m2 )f2 with fi S (R4 ), the commu-
tation relations (A.3) become
[J0 ( ), j (f ) + j (f ) ] = (1)j+1 i(j  (f ) + j  (f ) ).
Furthermore, thanks to the estimates (2.5) and (2.6) we can apply the multiple
commutator theorems in [11] to conclude, as in the proof of [9, Theorem 2],
that (A.2) holds.

References
[1] S. Doplicher, Local aspects of superselection rules, Comm. Math. Phys. 85 (1982)
73–86.
[2] S. Doplicher and R. Longo, Local aspects of superselection rules. II, Comm. Math.
Phys. 88 (1983) 399–409.
[3] D. Buchholz, S. Doplicher and R. Longo, On Noether's theorem in quantum field
theory, Ann. Phys. 170 (1986) 1–17.
[4] D. Buchholz and E. H. Wichmann, Causal independence and the energy level density
of states in local quantum field theory, Comm. Math. Phys. 106 (1986) 321–344.
[5] D. Buchholz, Product states for local algebras, Comm. Math. Phys. 36 (1974)
287–304.
[6] S. Stratila, Modular Theory in Operator Algebras (Abacus Press, Bucharest, 1981).
[7] S. Carpi, Quantum Noether's theorem and conformal field theory: A study of some
models, Rev. Math. Phys. 11 (1999) 519–532.
[8] L. Tomassini, Sul teorema di Noether quantistico: Studio del campo libero di massa
zero in quattro dimensioni, Master's thesis, Università di Roma "La Sapienza" (1999).
[9] C. D'Antoni and R. Longo, Interpolation by type I factors and the flip automorphism,
J. Funct. Anal. 51 (1983) 361–371.
[10] S. Doplicher and R. Longo, Standard and split inclusions of von Neumann algebras,
Invent. Math. 75 (1984) 493–536.
[11] J. Fröhlich, Application of commutator theorems to the integration of representations
of Lie algebras and commutation relations, Comm. Math. Phys. 54 (1977) 135–150.
[12] M. Reed and B. Simon, Methods of Modern Mathematical Physics. Vol. II: Fourier
Analysis, Self-Adjointness (Academic Press, New York, 1975).
[13] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey and D. E. Knuth, On the
Lambert W function, Adv. Comput. Math. 5 (1996) 329–359.
[14] H. Bostelmann, Phase space properties and the short distance structure in quantum
field theory, J. Math. Phys. 46 (2005) 052301, 17 pp.
[15] J. Langerholc and B. Schroer, On the structure of the von Neumann algebras
generated by local functions of the free Bose field, Comm. Math. Phys. 1 (1965) 215–239.
[16] S. Albeverio, B. Ferrario and M. W. Yoshida, On the essential self-adjointness of
Wick powers of relativistic fields and of fields unitary equivalent to random fields,
Acta Appl. Math. 80 (2004) 309–334.
March 10, 2010 10:13 WSPC/S0129-055X 148-RMP
J070-S0129055X10003916

Reviews in Mathematical Physics


Vol. 22, No. 2 (2010) 117–192

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003916

PERTURBATIVE DEFORMATIONS OF CONFORMAL
FIELD THEORIES REVISITED

IGOR KRIZ
Mathematics Department, University of Michigan,
Ann Arbor, MI 48109-1109 USA
ikriz@umich.edu

Received 30 March 2009


Revised 12 October 2009

The purpose of this paper is to revisit the theory of perturbative deformations of con-
formal field theory from a mathematically rigorous, purely worldsheet point of view. We
specifically include the case of N = (2, 2) conformal field theories. From this point of
view, we find certain surprising obstructions, which appear to indicate that contrary to
previous findings, not all deformations along marginal fields exist perturbatively. This
includes the case of deformation of the Gepner model of the Fermat quintic along
certain cc fields. In other cases, including Gepner models of K3-surfaces and the free field
theory, our results coincide with known predictions. We give a partial interpretation of
our results via renormalization and mirror symmetry.

Keywords: N = (2, 2) conformal field theories; perturbative deformation; Gepner model.

Mathematics Subject Classification 2010: 83E30, 53D37, 81T15

1. Introduction
Recently, there has been renewed interest in the mathematics of the moduli space of
conformal field theories, in particular, in connection with speculations about elliptic
cohomology. The purpose of this paper is to investigate this space by perturbative
methods from first principles and from a purely worldsheet point of view. It is
conjectured that at least at generic points, the moduli space of CFTs is a manifold,
and in fact, its tangent space consists of marginal fields, i.e. primary fields of weight
(1, 1) of the conformal field theory (that is in the bosonic case; in the
supersymmetric case there are modifications which we will discuss later). This then means
that there should exist an exponential map from the tangent space at a point to
the moduli space, i.e. it should be possible to construct a continuous 1-parameter
set of conformal field theories by turning on a given marginal field.
There is a more or less canonical mathematical procedure for applying a P exp
type construction to the field which has been turned on, and obtaining a
perturbative expansion in the deformation parameter. This process, however, returns certain
cohomological obstructions, similar to Gerstenhaber's obstructions to the existence
of deformations of associative algebras [26–29]. Physically, these obstructions can be
interpreted as changes of dimension of the deforming field, and can occur, in
principle, at any order of the perturbative path. The primary obstruction is well known,
and was used, e.g., by Ginsparg in his work on c = 1 conformal field theories [30].
The obstruction also occurred in earlier work, see [45–47, 63–65, 61], from the point
of view of continuous lines in the space of critical models. In the models
considered, notably the Baxter model [11], the Ashkin–Teller model [8] and the Gaussian
model [48], vanishing of the primary obstruction did correspond to a continuous
line of deformations, and it was therefore believed that the primary obstruction
tells the whole story. (A similar story also occurs in the case of deformations of
boundary sectors, see [1, 2, 12, 22, 51, 52, 58, 38].)
In a certain sense, the main point of the present paper is analyzing, or giving
examples of, the role of the higher obstructions. We shall see that these obstructions
can be non-zero in cases where the deformation is believed to exist, most notably
in the case of deforming the Gepner model of the Fermat quintic along a cc field,
cf. [3, 55, 23, 60, 44, 66, 67, 14, 15, 17]. Some discussion of marginality of primary
fields in N = 2-supersymmetric theories to higher order exists in the literature.
Notably, Dixon [19] verified the vanishing for any N = (2, 2)-theory, and any linear
combination of cc, ac, ca and aa fields, of an amplitude integral which physically
expresses the change of central charge (a similar calculation is also given in
Distler–Greene [18]). Earlier work of Zamolodchikov [70, 71] showed that the renormalization
$\beta$-function vanishes for theories where c does not change during the renormalization
process. However, we find that the calculation [19] does not guarantee that the
primary field would remain marginal along the perturbative deformation path, due
to subtleties involving singularities of the integral. The obstruction we discuss in
this paper is an amplitude integral which physically expresses directly the change
of dimension of the deforming field, and it turns out this may not vanish. We will
return to this discussion in Sec. 3 below.
This puzzle of having obstructions where none should appear will not be
fully explained in this paper, although a likely interpretation of the result will
be discussed. It is possible that our effect does not impact the general
question of the existence of the nonlinear $\sigma$-model, which is widely believed to exist
(e.g., [3, 55, 23, 60, 44, 66, 67, 14, 15, 17]), but simply concerns questions of its
perturbative construction. One caveat is that the case we investigate here is still not
truly physical, since we specialize to the case of cc fields, which are not real. The
actual physical deformations of CFTs should occur along real fields, e.g., a
combination of a cc field and its complex-conjugate aa field (we give a discussion of this
in case of the free field theory at the end of Sec. 4). The obstructions discussed here
however are not linear, and hence a priori the case of the corresponding real field
in the Gepner model is much more difficult to analyze, in particular, it requires
regularization of the deforming parameter, and is not discussed here. Nevertheless,
it is still surprising that an obstruction occurs for a single cc field; for example, this
does not happen in the case of the (compactified or uncompactified) free field
theory. Also, there is strong evidence that obstructions to deformations along cc fields
and the corresponding real fields are equivalent (see the remark after Example 2
in Sec. 6).
Since an nth order obstruction indeed means that the marginal field gets
deformed into a field of non-zero weight, which changes to the order of the nth
power of the deformation parameter, usually [30, 45–47, 63–65, 61], when
obstructions occur, one therefore concludes that the CFT does not possess continuous
deformations in the given direction. Other interpretations are possible. One thing
to observe is that our conclusion is only valid for purely perturbative theories where
we assume that all fields have power series expansion in the deformation
parameters with coefficients which are fields in the original theory. This is not the only
possible scenario. Therefore, as remarked above, our results merely indicate that in
the case when our algebraic obstruction is non-zero, non-perturbative corrections
must be made to the theory to maintain the presence of marginal fields along the
deformation path.
In fact, evidence in favor of this interpretation exists in the form of the analysis
of Nemeschansky and Sen [55, 35] of higher order corrections to the $\beta$-function of the
nonlinear $\sigma$-model. Grisaru, Van de Ven and Zanon [35] found that the four-loop
contribution to the $\beta$-function of the nonlinear $\sigma$-model for Calabi–Yau manifolds
is non-zero, and [55] found a recipe for cancelling this singularity by deforming
the manifold to a metric which is non-Ricci flat at higher orders of the deformation
parameter. The expansion [4] used in this analysis is around the 0 curvature tensor,
but assuming for the moment that a similar phenomenon occurs if we expanded
around the Fermat quintic vacuum, then there are no fields present in the Gepner
model which would correspond perturbatively to these higher order corrections in
the direction of non-Ricci flat metric: bosonically, such fields would have to have
critical conformal dimension classically, since the $\sigma$-model Lagrangian is classically
conformally invariant for non-Ricci flat target Kähler manifolds. However,
quantum mechanically, there is a one-loop correction proportional to the Ricci tensor,
thus indicating that fields expressing such perturbative deformations would have
to be of generalized weight (cf. [39–42]). Fields of generalized weight, however, are
not present in the Gepner model, which is a rational CFT, and more generally
are excluded by unitarity (see discussions in Remarks after Theorems 2 and 3 in
Sec. 3 below). Thus, although this argument is not completely mathematical,
renormalization analysis seems to confirm our finding that deformations of the Fermat
quintic model must in general be non-perturbative. It is also noteworthy that the
$\beta$-function is known to vanish to all orders for K3-surfaces because of N = (4, 4)
supersymmetry. Accordingly, we also find that the phenomenon we see for the
Fermat quintic is not present in the case of the Fermat quartic (see Sec. 7 below). It is
also worth noting that other non-perturbative phenomena such as instanton
corrections also arise when passing from K3-surfaces to Calabi–Yau 3-folds ([14, 15, 17]).
Finally, one must also remark that the proof of [55] of the $\beta$-function cancellation
is not mathematically complete because of convergence questions, and thus one
still cannot exclude even the scenario that not all nonlinear $\sigma$-models would exist
as exact CFTs, thus creating some type of string landscape picture also in this
context (cf. [20]). We should remark that this scenario also has a compelling
interpretation from the point of view of the relationship between classical and quantum
geometry (see the end of the Concluding Remarks).
In this paper, we shall be mostly interested in the strictly perturbative picture.
The main point of this paper is an analysis of the algebraic obstructions in certain
canonical cases. We discuss two main kinds of examples, namely the free field theory
(both bosonic and N = 1-supersymmetric), and the Gepner models of the Fermat
quintic and quartic, which are exactly solvable N = 2-supersymmetric conformal
field theories which should be the nonlinear $\sigma$-models of the Fermat quintic
Calabi–Yau 3-fold and the Fermat quartic K3-surface (in the case of the Fermat quartic,
this was actually proved in [54]). In the case of the free field theory, what happens
is essentially that all non-trivial gravitational deformations of the free field theory
are algebraically obstructed. In the case of a free theory compactified on a torus,
the only gravitational deformations which are algebraically unobstructed come from
linear change of metric on the torus. (We will focus on gravitational deformations;
there are other examples, for example the sine-Gordon interaction [69, 13], which
are not discussed in detail here.)
The Gepner case deserves special attention. From the moduli space of
Calabi–Yau 3-folds, there is supposed to be a $\sigma$-model map into the moduli space of CFTs.
In fact, when we have an exactly solvable Calabi–Yau $\sigma$-model, one gets operators
in CFT corresponding to the cohomology groups $H^{2,1}$ and $H^{1,1}$, which measure
deformations of complex structure and Kähler metric, respectively, and these in
turn give rise to infinitesimal deformations. Now the Fermat quintic
$$x^5 + y^5 + z^5 + t^5 + u^5 = 0 \qquad (1)$$
in $\mathbb{CP}^4$ has a model conjectured by Gepner [24, 25] which is embedded in the
tensor product of 5 copies of the N = 2-supersymmetric minimal model of central
charge 9/5. The weight (1/2, 1/2) cc and ac fields correspond to the 101
infinitesimal deformations of complex structure and 1 infinitesimal deformation of Kähler
metric of the quintic (1). Despite the numerical matches in dimension, however, it
is not quite correct to say that the gravitational deformations, corresponding to
the moduli space of Calabi–Yau manifolds, occur by turning on cc and ac fields.
This is because, to preserve unitarity, a physical deformation can only occur when
we turn on a real field, and the fields in question are not real. In fact, the complex
conjugate of a cc field is an aa field, and the complex conjugate of an ac field is a
ca field. The complex conjugate must be added to get a real field, and a physical
deformation (we discuss this calculationally in the case of the free field theory in
Sec. 4).
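
The numerical matches mentioned above can be reproduced with a short Python sketch. It assumes the standard formula c = 3k/(k + 2) for the central charge of the level-k N = 2 minimal model (so that k = 3 gives 9/5; the level is not stated explicitly in the text), and it counts complex structure deformations as degree-5 monomials in five variables in which no exponent exceeds 3, i.e. monomials modulo the partial derivatives of the Fermat polynomial. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def minimal_model_central_charge(k):
    """Central charge c = 3k/(k+2) of the level-k N=2 minimal model
    (standard formula, assumed here)."""
    return Fraction(3 * k, k + 2)


def count_fermat_deformation_monomials(degree, num_vars):
    """Count monomials of total degree `degree` in `num_vars` variables
    with every exponent <= degree - 2, i.e. a basis of the degree-`degree`
    part of the Jacobian ring of the Fermat polynomial of that degree."""
    max_exp = degree - 2
    return sum(
        1
        for exps in product(range(max_exp + 1), repeat=num_vars)
        if sum(exps) == degree
    )


# Five copies of the k = 3 minimal model (assumed level).
total_c = 5 * minimal_model_central_charge(3)
# Degree-5 deformations of x^5 + y^5 + z^5 + t^5 + u^5.
n_monomials = count_fermat_deformation_monomials(5, 5)
print(total_c, n_monomials)  # 9 101
```

The total central charge 9 equals three times the complex dimension of the 3-fold, and the monomial count includes both xyztu and the monomial $x^3 y^2$.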
In this paper, we do not discuss deformations of the Gepner model by turning on
real fields. As shown in the case of the free field theory in Sec. 4, such deformations
require for example regularization of the deformation parameter, and are much
more difficult to calculate. Because of this, we work only with the case of one cc
and one ac field. We will show that at least one cc deformation, whose real version
corresponds to the quintics
$$x^5 + y^5 + z^5 + t^5 + u^5 + \lambda x^3 y^2 = 0 \qquad (2)$$
for small (but not infinitesimal) $\lambda$, is algebraically obstructed. (One suspects that
similar algebraic obstructions also occur for other fields, but the computation is
too difficult at the moment; for the cc field corresponding to xyztu, there is some
evidence suggesting that the deformation may exponentiate.)
It is an interesting question if nonlinear $\sigma$-models of Calabi–Yau 3-folds must
also contain non-perturbative terms. If so, likely, this phenomenon is generic, which
could be a reason why mathematicians so far discovered so few of these conformal
field theories, despite ample physical evidence of their existence [3, 55, 23, 60, 44, 66,
67].
Originally prompted by a question of Igor Frenkel, we also consider the case of
the Fermat quartic K3 surface
$$x^4 + y^4 + z^4 + t^4 = 0$$
in $\mathbb{CP}^3$. This is done in Sec. 7. It is interesting that the problems of the Fermat
quintic do not arise in this case, and all the infinitesimally critical fields exponentiate
in the purely perturbative sense. This dovetails with the result of Alvarez-Gaumé
and Ginsparg [5] that the $\beta$-function vanishes to all orders for critical perturbative
models with N = (4, 4) supersymmetry, and hence from the renormalization point
of view, the nonlinear $\sigma$-model is conformal for the Ricci flat metric on K3-surfaces.
There are also certain differences between the ways mathematical considerations of
moduli space and mirror symmetry vary in the K3 and Calabi–Yau 3-fold cases,
which could be related to the behavior of the non-perturbative effects. This will be
discussed in Sec. 8.
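
For comparison with the quintic, the polynomial deformations of the Fermat quartic can be enumerated explicitly. The sketch below lists the degree-4 monomials in x, y, z, t with no exponent larger than 2, which is the analogous count modulo the partial derivatives of the Fermat quartic. The variable names and the string format are illustrative assumptions.

```python
from itertools import product

VARIABLES = ("x", "y", "z", "t")


def quartic_deformation_monomials():
    """List degree-4 monomials in x, y, z, t with every exponent <= 2,
    written as strings such as 'x^2*y^2'."""
    monomials = []
    for exps in product(range(3), repeat=len(VARIABLES)):
        if sum(exps) == 4:
            factors = [f"{v}^{e}" for v, e in zip(VARIABLES, exps) if e > 0]
            monomials.append("*".join(factors))
    return monomials


monomials = quartic_deformation_monomials()
print(len(monomials))  # 19
```

The resulting 19 monomials match the dimension of the family of quartic K3 surfaces up to linear changes of coordinates.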
To relate more precisely in what setup these results occur, we need to describe
what kind of deformations we are considering. It is well known that one can obtain
infinitesimal deformations from primary fields. In the bosonic case, the weight of
these fields must be (1, 1), in the N = 1-supersymmetric case in the NS-NS
sector the critical weight is (1/2, 1/2) and in the N = 2-supersymmetric case the
infinitesimal deformations we consider are along so called ac or cc fields of weight
(1/2, 1/2). For more specific discussion, see Sec. 2 below. There may exist
infinitesimal deformations which are not related to primary fields (see the remarks at the
end of Sec. 3). However, they are excluded under a certain continuity assumption
which we also state in Sec. 2.
Therefore, the approach we follow is exponentiating infinitesimal deformations
along primary fields of appropriate weights. In the algebraic approach, we assume
that both the primary field and amplitudes can be updated at all points of the
deformation parameter. Additionally, we assume one can obtain a perturbative power
series expansion in the deformation parameter, and we do not allow counterterms
of generalized weight or non-perturbative corrections. We describe a cohomological
obstruction theory similar to Gerstenhaber's theory [26–29] for associative algebras,
which in principle controls the coefficients at individual powers of the deformation
parameter. Obstructions can be written down explicitly under certain conditions.
This is done in Sec. 3. The primary obstruction in fact is the one which occurs for the
deformations of the free field theory at gravitational fields of non-zero momentum
(gravitational waves). In the case of the Gepner model of the Fermat quintic, the
primary obstruction vanishes but in the case (2), one can show there is an algebraic
obstruction of order 5 (i.e. given by a 7 point function in the Gepner model).
It should be pointed out that even in the algebraic case, there are
substantial complications we must deal with. The moduli space of CFTs is not yet well
defined. There are different definitions of conformal field theory, for example the
Segal approach [59, 36, 37] is quite substantially different from the vertex operator
approach (see [41] and references therein). Since these definitions are not known to
be equivalent, and their realizations are supposed to be points of the moduli space,
the space itself therefore cannot be defined until a particular definition is selected.
Next, it remains to be specified what structure there should be on the moduli space.
Presumably, there should at least be a topology, so that we need to ask what is a
nearby conformal field theory. That, too, has not been answered.
These foundational questions are enormously difficult, mostly from the
philosophical point of view: it is very easy to define ad hoc notions which immediately
turn out insufficiently general to be desirable. Because of that, we only make
minimal definitions needed to examine the existing paradigm in the context outlined.
Let us, then, confine ourselves to observing that even in the perturbative case, the
situation is not purely algebraic, and rather involves infinite sums which need to
be discussed in terms of analysis. For example, the obstructions may in fact be
undefined, because they may involve infinite sums which do not converge. Such
phenomena must be treated carefully, since they do not automatically mean that
perturbative exponentiation fails. In fact, because the deformed primary fields are
only determined up to a scalar factor, there is a possibility of regularization along
the deformation parameter. We briefly discuss this theoretically in Sec. 3, and then
give an example in the case of the free field theory in Sec. 4.
We also briefly discuss sufficient conditions for exponentiation. The main
method we use is the case when Theorem 1 gives a truly local formula for the
infinitesimal amplitude changes, which could be interpreted as an infinitesimal
isomorphism in a special case. We then give in Sec. 3 conditions under which
such infinitesimal isomorphisms can be exponentiated. This includes the case of a
coset theory, which does not require regularization, and a more general case when
regularization may occur.
In the nal Secs. 5 and 6, namely the case of the Gepner model, the main
problem is nding a setup for the vertex operators which would be explicit enough
to allow evaluating the obstructions in question; the positive result is obtained using
a generalization of the coset construction. The formulas required are obtained from
the Coulomb gas approach (= Feigin-Fuchs realization), which is taken from [34].
The present paper is organized as follows: In Sec. 2, we give the general setup
in which we work, show under which condition we can restrict ourselves to deformations
along a primary field, and derive the formula for infinitesimally deformed
amplitudes, given in Theorem 1. In Sec. 3, we discuss exponentiation theoretically,
in terms of obstruction theory, explicit formulas for the primary and higher obstructions,
and regularization. We also discuss supersymmetry, and in the end show a
mechanism by which non-perturbative deformations may still be possible when algebraic
obstructions occur. In Sec. 4, we give the example of the free field theories, the
trivial deformations which come from 0 momentum gravitational deforming fields,
and the primary obstruction to deforming along primary fields of non-zero momentum.
In Sec. 5, we will discuss the Gepner model of the Fermat quintic, and in
Sec. 6, we will discuss examples of non-zero algebraic obstructions to perturbative
deformations in this case, as well as speculations about unobstructed deformations.
In Sec. 7, we will discuss the (unobstructed) deformations for the Fermat quartic
K3 surface, and in Sec. 8, we attempt to summarize and discuss our possible
conclusions.

2. Infinitesimal Deformations of Conformal Field Theories


We shall work in the framework of [59] (see also [36-38]). In the bosonic case
(without considering supersymmetry), a conformal field theory in this framework
is characterized by a Hilbert space of states $H$, and for a worldsheet $\Sigma$, by which one
means a Riemann surface (a 1-dimensional complex manifold) with analytically
parametrized boundary components, a trace class element

\[ U_\Sigma \in \widehat{\bigotimes} H^* \,\hat{\otimes}\, \widehat{\bigotimes} H \tag{3} \]

defined up to scalar multiple. One assumes that these elements depend on $\Sigma$ analytically
(i.e. are real-analytic functions on the moduli space of worldsheets). Here $H^*$
denotes the Hilbert space dual of $H$, and $\hat{\otimes}$ denotes the Hilbert tensor product. In
(3), the tensor product of copies of $H^*$ (respectively, $H$) is over the inbound (respectively,
outbound) boundary components of $\Sigma$. Inbound and outbound boundary
components are distinguished by orientation. For an annulus in $\mathbb{C}$ enclosed by two
concentric circles oriented counterclockwise, the inside circle is inbound. The elements
(3) are subject to gluing identities (gluing of Riemann surfaces corresponds
to trace). These elements can also be viewed (perhaps even more conventionally,
but less symmetrically) as operators

\[ U_\Sigma : \widehat{\bigotimes} H \to \widehat{\bigotimes} H \tag{4} \]

where the tensor product in the source (respectively, target) is over inbound (respectively,
outbound) boundary components. In this paper, we shall almost exclusively
consider the case when $\Sigma$ is a Riemann surface of genus 0, since this is the key
case for deformation theory. It should be noted that a physical CFT has still more
structure. Namely, we want to consider the operators (4) where $\Sigma$ is an annulus
approaching the degenerate annulus which is the unit circle with both inbound and
outbound parametrizations equal to the identity. In such a limit, the operator (4)
should approach the identity

\[ H \to H. \]

Also, one requires reflection-positivity (which is the Wick rotation of unitarity).
This means that if we denote by $\bar{\Sigma}$ the Riemann surface complex conjugate to $\Sigma$,
then $U_{\bar{\Sigma}}$ is adjoint to $U_\Sigma$. One also requires for a physical theory that $H$ actually
be the complexification of a real vector space, and the quadratic form one obtains
by taking limits to the degenerate annulus $S^1$ with boundary parametrizations
by $z$ (the identity) and $1/z$ be related to the Hermitian form on $H$ by complex
conjugation.
Treating the supersymmetric case mathematically is more technical, but analogous.
Essentially, one must work on the super-moduli space of superconformal surfaces
(for a very quick review mostly sufficient for the purposes of this paper, see
[49]). The structure just described originates in conformally invariant 2-dimensional
quantum field theory. From the point of view of 2-dimensional quantum field theory,
the element (3) can be viewed as a generalization of the vacuum expectation
value in the sense that no field is inserted inside the worldsheet. From the point of
view of conformal field theory, this element is a CFT amplitude.
Now in a bosonic (= non-supersymmetric) CFT $H$, if we have a primary field $u$
of weight $(1, 1)$, then, as observed in [59], we can make an infinitesimal deformation
of $H$ as follows: For a worldsheet $\Sigma$ with associated element $U_\Sigma$ (see (3)), the
infinitesimal deformation of the vacuum is

\[ V_\Sigma = \int_{x \in \Sigma} U^u_{\Sigma,x}. \tag{5} \]

Here $U^u_{\Sigma,x}$ is obtained by choosing a holomorphic embedding $f : D \to \Sigma$, $f(0) = x$,
where $D$ is the standard disk. Let $\Sigma'$ be the worldsheet obtained by cutting $f(D)$
out of $\Sigma$, and let $U^u_{\Sigma,x}$ be obtained by gluing the vacuum $U_{\Sigma'}$ with the field $u$ inserted
at $f(D)$. The element $U^u_{\Sigma,x}$ is proportional to $\|f'(0)\|^2$, since $u$ is $(1, 1)$-primary, so
it transforms the same way as a measure and we can define the integral (5) without
coupling with a measure. The integral (5) is an infinitesimal deformation of the
original CFT structure in the sense that

\[ U_\Sigma + V_\Sigma \epsilon \]

satisfies CFT gluing identities in the ring $\mathbb{C}[\epsilon]/\epsilon^2$.
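
To make the role of the ring $\mathbb{C}[\epsilon]/\epsilon^2$ concrete, the following is a minimal sketch of a gluing identity expanded to first order. It assumes a worldsheet $\Sigma$ obtained by gluing two worldsheets $\Sigma_1$ and $\Sigma_2$, writes $\circ$ for the gluing (trace) operation, and ignores the scalar ambiguity in (3); the names $\Sigma_1$, $\Sigma_2$ are introduced here only for illustration.

```latex
% Assumption (illustrative): \Sigma is obtained by gluing \Sigma_1 and \Sigma_2,
% and \circ denotes the gluing (trace) operation on the elements (3).
\begin{align*}
(U_{\Sigma_1} + \epsilon V_{\Sigma_1}) \circ (U_{\Sigma_2} + \epsilon V_{\Sigma_2})
 &= U_{\Sigma_1}\circ U_{\Sigma_2}
  + \epsilon\,\bigl(V_{\Sigma_1}\circ U_{\Sigma_2} + U_{\Sigma_1}\circ V_{\Sigma_2}\bigr)
  + \epsilon^2\, V_{\Sigma_1}\circ V_{\Sigma_2} \\
 &\equiv U_{\Sigma}
  + \epsilon\,\bigl(V_{\Sigma_1}\circ U_{\Sigma_2} + U_{\Sigma_1}\circ V_{\Sigma_2}\bigr)
  \pmod{\epsilon^2}.
\end{align*}
% Hence U + \epsilon V satisfies gluing modulo \epsilon^2 exactly when
%   V_\Sigma = V_{\Sigma_1}\circ U_{\Sigma_2} + U_{\Sigma_1}\circ V_{\Sigma_2},
% which is what one obtains by splitting the integral (5) into the contributions
% of x \in \Sigma_1 and x \in \Sigma_2.
```

In other words, to first order the gluing identity is a Leibniz rule, and the integral (5) satisfies it because the integration domain is additive under gluing.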


The main topic of this paper is studying (in this and analogous supersymmetric
cases) the question as to when the infinitesimal deformation (5) can be
exponentiated at least to perturbative level, i.e. when there exist for each $n \in \mathbb{N}$
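elements

\[ u^0, \ldots, u^{n-1} \in H, \quad u^0 = u \]

and for every worldsheet $\Sigma$

\[ U^0_\Sigma, \ldots, U^n_\Sigma \in \widehat{\bigotimes} H^* \,\hat{\otimes}\, \widehat{\bigotimes} H \]

such that

\[ U_\Sigma(m) = \sum_{i=0}^{m} U^i_\Sigma \epsilon^i, \quad U^0_\Sigma = U_\Sigma \tag{6} \]

satisfy gluing axioms in $\mathbb{C}[\epsilon]/\epsilon^{m+1}$, $0 \le m \le n$,

\[ u(m) = \sum_{i=0}^{m} u^i \epsilon^i \tag{7} \]

is primary of weight $(1, 1)$ with respect to (6), $0 < m \le n$, and

\[ \frac{dU_\Sigma(m)}{d\epsilon} = \int_{x \in \Sigma} U^{u(m-1)}_{\Sigma,x}(m-1) \tag{8} \]

in the same sense as in (5).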


We should remark that a priori, it is not known that all deformations of CFT
come from primary fields: One could, in principle, simply ask for the existence of
vacua (6) such that (6) satisfy gluing axioms over $\mathbb{C}[\epsilon]/\epsilon^{m+1}$. As remarked in [59],
it is not known whether all perturbative deformations of CFTs are obtained from
primary fields $u$ as described above. However, one can indeed prove that the primary
fields $u$ exist given suitable continuity assumptions. Suppose the vacua $U_\Sigma(m)$ exist
for $0 \le m \le n$. We notice that the integral on the right-hand side of (8) is, by
definition, the limit of integrals over regions $R$ which are proper subsets of $\Sigma$ such
that the measure of $\Sigma \setminus R$ goes to 0 (fix an analytic metric on $\Sigma$ compatible with
the complex structure). Let, thus, $\Sigma_{D_1,\ldots,D_k}$ be a worldsheet obtained from $\Sigma$ by
cutting out disjoint holomorphically embedded copies $D_1, \ldots, D_k$ of the unit disk
$D$. Then we calculate

\begin{align*}
\frac{dU_\Sigma(m)}{d\epsilon}
 &= \int_{x \in \Sigma} U^{u(m-1)}_{\Sigma,x}(m-1) \\
 &= \lim_{\mu(\Sigma_{D_1,\ldots,D_k}) \to 0} U_{\Sigma_{D_1,\ldots,D_k}}(m-1) \circ
    \int_{x \in \bigcup D_i} U^{u(m-1)}_{(\bigcup D_i),x}(m-1) \\
 &= \lim_{\mu(\Sigma_{D_1,\ldots,D_k}) \to 0} \sum_i U_{\Sigma_{D_i}}(m-1) \circ
    \int_{x \in D_i} U^{u(m-1)}_{(D_i),x}(m-1) \\
 &= \lim_{\mu(\Sigma_{D_1,\ldots,D_k}) \to 0} \sum_i U_{\Sigma_{D_i}}(m-1) \circ
    \frac{dU_{D_i}(m)}{d\epsilon}
\end{align*}
assuming (8) for $\Sigma = D$, so the assumption we need is

\[ \frac{dU_\Sigma(m)}{d\epsilon} = \lim_{\mu(\Sigma_{D_1,\ldots,D_k}) \to 0} \sum_i U_{\Sigma_{D_i}}(m-1) \circ \frac{dU_{D_i}(m)}{d\epsilon}. \tag{9} \]

The composition notation on the right-hand side means gluing. Granted (9), we can
recover $dU_\Sigma(m)/d\epsilon$ from $dU_D(m)/d\epsilon$ for the unit disk $D$. Here $\mu$ denotes the Lebesgue measure
(this is well defined on a worldsheet at least up to absolute continuity, which is
sufficient for taking the limit in the above computation).
Now in the case of the unit disk, we get a candidate for $u(m-1)$ in the following
way:
Assume that $H$ is topologically spanned by subspaces $H_{(m_1,m_2)}$ of $\epsilon$-weight
$(m_1, m_2)$ where $m_1, m_2 \ge 0$, $H_{(0,0)} = \langle U_D \rangle$. Then $U_D(m)$ is invariant under rigid
rotation, so

\[ U_D(m) \in \widehat{\bigoplus_{k \ge 0}} H_{(k,k)}[\epsilon]/\epsilon^{m+1}. \tag{10} \]

We see that if $A_q$ is the standard annulus with boundary components $S^1$, $qS^1$
with standard parametrizations, then

\[ u(m-1) = \lim_{q \to 0} \frac{1}{\|q\|^2} U_{A_q} \frac{dU_D(m)}{d\epsilon} \tag{11} \]

exists and is equal to the weight $(1, 1)$ summand of (10). In fact, by (9) and the
definition of integral, we already see that (8) holds. We do not know however yet
that $u(m-1)$ is primary. To see that, however, we note that for any annulus
$A = D \setminus D'$ where $f : D \to D'$ is a holomorphic embedding with derivative $r$, (9)
also implies (for the same reason, the exhaustion principle) that (8) is valid with
$u(m-1)$ replaced by

\[ \frac{U_A u(m-1)}{\|r\|^2}. \tag{12} \]

Since this is true for any $\Sigma$, in particular where $\Sigma$ is any disk, the integrands must
be equal, so (12) and $u(m-1)$ have the same vertex operators, so at least in the
absence of null elements,

\[ \frac{U_A u(m-1)}{\|r\|^2} = u(m-1) \tag{13} \]

which means that $u(m-1)$ is primary of weight $(1, 1)$, which says precisely that
the expression on the left-hand side of (13) is independent of $A$.
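
The way the limit (11) isolates the weight $(1, 1)$ summand can be sketched as follows. The sketch assumes that $U_{A_q}$ acts on $H_{(k,k)}$ as multiplication by $\|q\|^{2k}$, and that the weight $(0, 0)$ component of $dU_D(m)/d\epsilon$ vanishes (which one may hope to arrange using the scalar ambiguity of the vacuum); both are assumptions of this illustration, and the symbol $X$ and its components $X_k$ are introduced here only for convenience. Convergence of the tail is taken for granted.

```latex
% Assumptions (illustrative): X := dU_D(m)/d\epsilon = \sum_{k \ge 0} X_k with X_k \in H_{(k,k)},
% U_{A_q} acts on H_{(k,k)} by \|q\|^{2k}, and X_0 = 0.
\begin{align*}
\frac{1}{\|q\|^2}\, U_{A_q} X
 &= \sum_{k \ge 0} \|q\|^{2k-2} X_k \\
 &= \|q\|^{-2} X_0 + X_1 + \|q\|^{2} X_2 + \|q\|^{4} X_3 + \cdots \\
 &\longrightarrow X_1 \qquad (q \to 0,\ \text{given } X_0 = 0).
\end{align*}
```

Without the assumption on the weight $(0, 0)$ component, the first term would diverge, so some normalization of this kind is needed for the limit to exist.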
We have presented an argument by which, making certain assumptions, deformations
of CFTs occur along primary fields of critical weights. This is a question
raised in [59]. We shall see however that there are problems with this formulation
even in the simplest possible case: Consider the free (bosonic) CFT of dimension
1, and the primary field $x_{-1}\tilde{x}_{-1}$. (We disregard here the issue that $H$ itself lacks
a satisfactory Hilbert space structure, see [37]; we could eliminate this problem by
compactifying the theory on a torus or by considering the state spaces of given
momentum.) Let us calculate

\[ U^1_D = \int_D \exp(zL_{-1}) \exp(\bar{z}\tilde{L}_{-1})\, x_{-1}\tilde{x}_{-1}
 = \frac{1}{2} \sum_{k \ge 1} \frac{x_{-k}\tilde{x}_{-k}}{k}. \tag{14} \]

We see that the element (14) is not an element of $H$, since its norm is $\sum_{k \ge 1} 1 = \infty$.
This occurs despite the fact that the norm on $H$ is preserved by the deformation,
i.e. the deformation is unitary. (This is because the inner product is conjugate by
reality to the quadratic form which is the operator associated with the degenerate
worldsheet with two outbound boundary components $S^1 = \{z \in \mathbb{C} \mid \|z\| = 1\}$
parametrized by $z$ and $1/z$; in the class of measures equivalent to the Lebesgue
measure by absolute continuity, this worldsheet has measure 0 and hence the deformation
acts trivially on it.)
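
The divergence of the norm of (14) can be checked term by term. The sketch below assumes the Heisenberg normalization $[x_m, x_n] = m\,\delta_{m+n,0}$ (and the same for the $\tilde{x}_n$, which commute with the $x_n$), adjoints $x_n^\dagger = x_{-n}$, and a unit vacuum vector written $|0\rangle$; this normalization is an assumption made here for illustration and is not fixed by the text above.

```latex
% Assumptions (illustrative): [x_m, x_n] = m \delta_{m+n,0}, likewise for \tilde{x},
% x_n^\dagger = x_{-n}, x_k |0> = 0 for k > 0, and <0|0> = 1.
\begin{align*}
\bigl\| x_{-k}\tilde{x}_{-k} |0\rangle \bigr\|^2
 &= \langle 0 | x_k x_{-k} | 0 \rangle \, \langle 0 | \tilde{x}_k \tilde{x}_{-k} | 0 \rangle
  = \langle 0 | [x_k, x_{-k}] | 0 \rangle \, \langle 0 | [\tilde{x}_k, \tilde{x}_{-k}] | 0 \rangle
  = k \cdot k = k^2, \\
\Bigl\| \tfrac{1}{k}\, x_{-k}\tilde{x}_{-k} |0\rangle \Bigr\|^2 &= 1 .
\end{align*}
% The vectors for different k have different weights, hence are orthogonal, so
\begin{align*}
\Bigl\| \sum_{k=1}^{N} \tfrac{1}{k}\, x_{-k}\tilde{x}_{-k} |0\rangle \Bigr\|^2 = \sum_{k=1}^{N} 1 = N
 \longrightarrow \infty \qquad (N \to \infty).
\end{align*}
```

The overall factor $1/2$ in (14) only rescales the partial sums by $1/4$ and does not affect the divergence.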
The explanation is simply that the infinitesimally deformed vacuum is

\[ 1 + \frac{\epsilon}{2} \sum_{n > 0} \frac{x_{-n}\tilde{x}_{-n}}{n}. \tag{15} \]

When computing the square norm of (15), the second summand is orthogonal to
the first, hence its square norm occurs with coefficient $\epsilon^2$, which disappears when
calculating up to linear order in $\epsilon$ (which is what we are doing in an infinitesimal
deformation); such phenomena routinely occur when one attempts to differentiate
unitary processes on Hilbert spaces. In our case, as we shall see, the situation is
further complicated by the fact that the process actually has to be regularized.
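
The statement about the coefficient $\epsilon^2$ can be spelled out in one line. In the sketch below, $\Omega$ stands for the undeformed vacuum (written as $1$ in (15)) and $W$ for the second summand of (15) without the factor $\epsilon$; both symbols are introduced here only for illustration, and $\epsilon$ is treated as a real parameter.

```latex
% Notation (illustrative): \Omega = undeformed vacuum, W = (1/2) \sum_{n>0} x_{-n}\tilde{x}_{-n}/n,
% with <\Omega, W> = 0 because W has no weight (0,0) component.
\begin{align*}
\| \Omega + \epsilon W \|^2
 &= \|\Omega\|^2 + 2\epsilon\, \mathrm{Re}\,\langle \Omega, W \rangle + \epsilon^2 \|W\|^2 \\
 &= \|\Omega\|^2 + \epsilon^2 \|W\|^2
 \;\equiv\; \|\Omega\|^2 \pmod{\epsilon^2}.
\end{align*}
```

So the divergent quantity $\|W\|^2$ only enters at second order, which is invisible in $\mathbb{C}[\epsilon]/\epsilon^2$.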
There are other problems as well. For one thing, we wish to consider theories
which really do not have Hilbert axiomatizations in the proper sense, including
Minkowski signature theories, where the Hilbert approach is impossible for physical
reasons. Therefore, we prefer a vertex operator algebra approach where we discard
the Hilbert completion and restrict ourselves to examining tree level amplitudes.
One such axiomatization of such theories was given in [41] under the term "full field
algebra". In the present paper, however, we prefer to work from scratch, listing the
properties we will use explicitly, and referring to our objects as conformal field
theories in the vertex operator formulation.
As mentioned in the introduction, our approach in this paper is essentially to
build the minimal possible machinery in which we can phrase the concept of perturbative
deformation of a CFT along a primary field of critical weight to arbitrary
degree, and identifying obstructions to obtaining such deformation. Actually identifying
the deformed conformal field theory upon plugging in a value of the deformation
parameter (provided the obstructions vanish) by means of a general abstract
machinery (i.e. not assuming we can recognize the theory by other means) is a
difficult problem which remains untreated in the present paper. Therefore, speaking
purely mathematically, we are actually defining the concept of perturbative
deformation along with finding our obstructions. It would be far superior to define
rigorously the moduli space of conformal field theories upfront, with enough geometry
to allow us to define paths. Such technology, however, is not mathematically
available at the present time.
Regarding the approach to constructing and treating fields, the vertex operator
approach is largely superior from the computational point of view. One can
convert to the more symmetric and foundationally more powerful Hilbert space
approach when we have appropriate convergence of the operators constructed. We
shall proceed by using either language according to what is more convenient at each
particular time.
For now, let us consider untopologized vector spaces

\[ V = \bigoplus V_{(w_L,w_R)}. \tag{16} \]

Here $(w_L, w_R)$ are weights (we refer to $w_L$, respectively $w_R$, as the left, respectively
right, component of the weight), so we assume $w_L - w_R \in \mathbb{Z}$ and usually

\[ w_L, w_R \ge 0, \tag{17} \]
\[ V_{(0,0)} = \langle U_D \rangle. \tag{18} \]

The "no ghost" assumptions (17), (18) will sometimes be dropped. If there is a
Hilbert space $H$, then $V$ is interpreted as the subspace of states of finite weights.
We assume that for $u \in V_{w_L,w_R}$, we have vertex operators of the form

\[ Y(u, z, \bar{z}) = \sum_{(v_L,v_R)} u_{-v_L-w_L,\,-v_R-w_R}\, z^{v_L} \bar{z}^{v_R}. \tag{19} \]

Here $u_{a,b}$ are operators which raise the left (respectively, right) component of weight
by $a$ (respectively, $b$). We additionally assume $v_L - v_R \in \mathbb{Z}$ and that for a given
$w$, the weights of operators which act on $w$ are discrete. Even more strongly, we
assume that

\[ Y(u, z, \bar{z}) = \sum_i Y_i(u, z) \tilde{Y}_i(u, \bar{z}) \tag{20} \]

where

\begin{align*}
Y_i(u, z) &= \sum u_{i;-v_L-w_L}\, z^{v_L}, \\
\tilde{Y}_i(u, \bar{z}) &= \sum \tilde{u}_{i;-v_R-w_R}\, \bar{z}^{v_R} \tag{21}
\end{align*}

where all the operators $Y_i(u, z)$ commute with all $\tilde{Y}_j(v, \bar{z})$. The main axiom which
fields (19) must satisfy is "commutativity" and "associativity" analogous to the
case of vertex operator algebras, i.e. there must exist for fields $u, v, w \in V$ and
$w' \in V^\vee$ of finite weight, a "4-point function"

\[ w' Z(u, v, z, \bar{z}, t, \bar{t}) w \tag{22} \]

which is real-analytic and unbranched outside the loci of $z = 0$, $t = 0$, $z = \infty$,
$t = \infty$ and $z = t$, and whose expansion in $t$ first and $z$ second (respectively, $z$ first
and $t$ second, respectively, $z - t$ first and $t$ second) is

\begin{align*}
& w' Y(u, z, \bar{z}) Y(v, t, \bar{t}) w, \\
& w' Y(v, t, \bar{t}) Y(u, z, \bar{z}) w, \\
& w' Y(Y(u, z - t, \overline{z - t}) v, t, \bar{t}) w,
\end{align*}

respectively. Here, for example, by an expansion in $t$ first and $z$ second we mean a
series in the variable $z$ whose coefficients are series in the variable $t$, and the other
cases are analogous. Comment: the existence of a 4-point function is the appropriate
generalization of locality.
We also assume that Virasoro algebras $\langle L_n \rangle$, $\langle \tilde{L}_n \rangle$ with equal central charges
$c_L = c_R$ act and that

\begin{align*}
Y(L_{-1} u, z, \bar{z}) &= \frac{\partial}{\partial z} Y(u, z, \bar{z}), \\
Y(\tilde{L}_{-1} u, z, \bar{z}) &= \frac{\partial}{\partial \bar{z}} Y(u, z, \bar{z}) \tag{23}
\end{align*}

and

\[ V_{w_L,w_R} \text{ is the weight } (w_L, w_R) \text{ subspace of } V \text{ with respect to } (L_0, \tilde{L}_0). \tag{24} \]

Remark. Although the axioms outlined here are meant for theories which are initial
points of the proposed perturbative deformations, they are too restrictive for the
theories obtained as a result of the deformations themselves. To capture those
deformations, it is best to revert to Segal's approach, restricting attention to genus
0 worldsheets with a unique outbound boundary component (tree level amplitudes).
Operators will then be expanded both in the weight grading and in the perturbative
parameter (i.e. the coefficient at each power of the deformation parameter will
be an element of the product-completed state space of the original theory). To
avoid discussion of topology, we simply require that perturbative coefficients of all
compositions of such operators converge in the product topology with respect to
the weight grading, and the analytic topology in each graded summand.
In this section, we discuss infinitesimal perturbations, i.e. the deformed theory
is defined over $\mathbb{C}[\epsilon]/(\epsilon^2)$ where $\epsilon$ is the deformation parameter. One case where such
infinitesimal deformations can be described explicitly is the following

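Theorem 1. Consider fields $u, v, w \in V$ where $u$ is primary of weight $(1, 1)$. Next,
assume that

\[ Z(u, v, z, \bar{z}, t, \bar{t}) = \sum_{\alpha,\beta} Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t}) \]

where

\[ Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t}) = \sum_i Z_{\alpha,\beta,i}(u, v, z, t)\, \tilde{Z}_{\alpha,\beta,i}(u, v, \bar{z}, \bar{t}) \]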
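and for $w' \in W^\vee$ of finite weight, $w' Z_{\alpha,\beta,i}(u, v, z, t)(z - t)^{\alpha} z^{\beta}$ (respectively,
$w' \tilde{Z}_{\alpha,\beta,i}(u, v, \bar{z}, \bar{t})\,\overline{(z - t)}^{\alpha} \bar{z}^{\beta}$) is a meromorphic (respectively, antimeromorphic)
function of $z$ on $\mathbb{CP}^1$, with poles (if any) only at $0$, $t$, $\infty$. Now write

\[ Y_{u,\alpha,\beta}(v, t, \bar{t}) = (i/2) \int_\Sigma Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t})\, dz\, d\bar{z}, \tag{25} \]

so

\[ Y_u(v, t, \bar{t}) = Y(v, t, \bar{t}) + \epsilon \sum_{\alpha,\beta} Y_{u,\alpha,\beta}(v, t, \bar{t}) \]

is the infinitesimally deformed vertex operator where $\Sigma$ is the degenerate worldsheet
with unit disks cut out around $0$, $t$, $\infty$. Assume now further that we can expand

\[ Z_{\alpha,\beta,i}(u, v, z, t) = Y_{\alpha,\beta,i}(v, t) Y_{\alpha,\beta,i}(u, z) \quad \text{when $z$ is near } 0, \tag{26} \]

\[ Z_{\alpha,\beta,i}(u, v, z, t) = Y_{\alpha,\beta,i}(u, z) Y_{\alpha,\beta,i}(v, t) \quad \text{when $z$ is near } \infty, \tag{27} \]

\[ Z_{\alpha,\beta,i}(u, v, z, t) = Y_{\alpha,\beta,i}(Y_{\alpha,\beta,i}(u, z - t) v, t) \quad \text{when $z$ is near } t. \tag{28} \]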
Write

Y,,i (u, z) = u,,i,n z n+1 ,


Y,,i (u, z) = u,,i,n z n++1 ,


Y,,i (u, z) = u,,i,n z n+1 .

(Analogously with the s.) Assume now


u,,i,0 w = 0, u,,i,0 v = 0, u,,i,0 Y,,i (v, t)w = 0 (29)
and analogously for the s (note that these conditions are only nontrivial when
= 0, respectively, = 0, respectively, = ). Denote now by ,,i,0 , ,,i, ,
,,i,t the indenite integrals of (26)(28) in the variable z, obtained using the
formula

z k+1
z k dz = for k = 1
k+1
(thus xing the integration constant), and analogously with the s. Let then
C,,i = ,,i, ,,i,t ,
D,,i = ,,i, ,,i,0 ,
(30)
,,i,
C,,i = ,,i,t ,

D,,i = ,,i,
,,i,0
(see the comment in the proof on branching). Let
 u,,i,n u,,i,n
,,i =
n
n
and similarly for thes, the s and the s. (The denition makes sense when applied
to elds on which the term with denominator 0 vanishes.) Then

Y,,u (v, t, t)w = ,,i Y (v, t, t)w Y (,,i v, t, t)w Y (v, t, t),,i w
i

+ C,,i C,,i (1 + e2i ) + D,,i D


,,i (1 e2i ). (31)

Additionally, when = 0, then D,,i = D ,,i = 0, and when = 0 then C,,i =



C,,i = 0, and

Y,,u (v, t, t)w = ,,i Y (v, t, t)w Y (,,i v, t, t)w Y (v, t, t),,i w. (32)
i

Equation (32) is also valid when = .

Remark 1. Note that technically, the integral (25) is not defined on the nondegenerate
worldsheet described. This can be treated in the standard way, namely by
considering an actual worldsheet $\Sigma'$ obtained by gluing on standard annuli on the
boundary components. It is easily checked that if we denote by $A^u_q$ the infinitesimal
deformation of $A_q$ by $u$, then

Auq (w) = Aq (w) Aq (w).

Therefore, the theorem can be stated equivalently for the worldsheet  . The only
change needs to be made in formula (31), where  needs to be multiplied by s2n
and needs to be multiplied by r2n where r and s are radii of the corresponding
boundary components. Because however this is equivalent, we can pretend to work
on the degenerate worldsheet directly, in particular avoiding inconvenient scaling
factors in the statement.

Remark 2. The validity of this theorem is rather restricted by its assumptions.
Most significantly, its assumption states that the chiral 4-point function can be
rendered meromorphic in one of the variables by multiplying by a factor of the
form $z^{\beta}(z - t)^{\alpha}$. This is essentially equivalent to the fusion rules being "abelian",
i.e. 1-dimensional for each pair of labels, and each pair of labels has exactly one
product. As we will see (and as is well known), the $N = 2$ minimal model is an
example of a "non-abelian" theory.
Speaking more generally in terms of function theory, branched analytic functions
on $\mathbb{CP}^1$ (at a risk of great confusion, we recall that those were called "Abelsche
Funktionen" by Riemann) are finite-dimensional vector spaces which are locally
spaces of holomorphic functions, outside of finitely many points $z_1, \ldots, z_n$ on $\mathbb{CP}^1$.
One also assumes that the singularities at $z_i$ are of bounded polynomial growth.
Such a function then defines a finite-dimensional representation of the fundamental
group $\pi_1(\mathbb{CP}^1 \setminus \{z_1, \ldots, z_n\})$, called the holonomy representation. In particular, chiral
correlation functions of a full field algebra are branched functions in this sense.
The key issue is whether the holonomy representation is a sum of one-dimensional
representations (in which case it factors through the abelianization of the fundamental
group, i.e. the first homology group). Then and only then is the function a sum
of contributions which can be rendered holomorphic by multiplication with appropriate
products of $(z_i - z_j)^{\alpha_{ij}}$. A most basic example of a branched function with
non-abelian holonomy is the hypergeometric function, which occurs as the 4-point
function of parafermions and $N = 2$-minimal models.
Even for an abelian theory, the theorem only calculates the deformation in the
"0 charge sector" because of the assumption (29). Because of this, even for a free
field theory, we will need to discuss an extension of the argument. Since in that
case, however, stating precise assumptions is even more complicated, we prefer to
treat the special case only, and to postpone the discussion to Sec. 4 below.

Proof. Let us work on the scaled real worldsheet  . Let

,,i = Z,,i (u, v, z, t)dz,


,,i = Z,,i (u, v, z, t)d
z.

Denote by 0 , , t the boundary components of  near 0, , t. Then the form


,,i, ,,i is unbranched on a domain obtained by making a cut c connecting
0 and t . We have

,,i,t = Y (,,i v, t, t), (33)
t

,,i,0 = Y (v, t, t),,i . (34)
0

But we want to integrate ,,i, ,,i over the boundary  :


   
,,i ,,i = ,,i ,,i + ,,i ,,i + ,,i ,,i
 t 0
 
+ ,,i ,,i + ,,i ,,i (35)
c+ c

where c+ , c are the two parts of  along the cut c, oriented from t to 0 and
back respectively. Before going further, let us look at two points x+ c+ , x c
which project to the same point on c. We have

C(e2i 1)
(x ) = C (x+ ) C (x )
(x )
(x+ ) (t + C)
= (t + C)
= (x+ ) (x )
(x )
(x+ ) (0 + D)
= (0 + D)
(x )
(x+ ) D
= D
(x )
= D(e2i 1)
(the subscripts , , i were omitted throughout to simplify the notation). This


implies the relation
C,,i (e2i 1) = D,,i (e2i 1). (36)

Comment. This is valid when the constants C,,i , D,,i are both taken at the
point x ; note that since the chiral forms are branched, we would have to adjust
the statement if we measured the constants elsewhere. This however will not be of
much interest to us as in the present paper we are most interested in the case when
the constants vanish.
In any case, note that (36) implies C,,i = 0 when = 0 mod Z and = 0
mod Z, and D,,i = 0 when = 0 mod Z and = 0 mod Z. There is an analogous
,,i . Note that when = 0 = , all the forms in
relation to (36) between C,,i , D
sight are unbranched, and (32) follows directly. To treat the case = , proceed
analogously, but replacing ,,i, by ,,i,0 or ,,i,t . Thus, we have nished
proving (32) under its hypotheses.
Returning to the general case, let us study the right-hand side of (35). Subtract-
ing the rst two terms from (33), (34), we get
 
C,,i ,,i , D,,i ,,i , (37)
t 0

respectively. On the other hand, the sum of the last two terms, looking at points
x+ , x for each x c, can be rewritten as
 
C,,i (e2i + 1)
,,i = D,,i (e2i + 1)
,,i . (38)
c+ c

Now recall (30). Choosing ,,i, as the primitive function of ,,i , we see that
for the end point x of c ,
,,i, (x ) =
,,i, (x+ )
,,i,t (x )
,,i,t (x+ )
= (e2i 1)
,,i,t (x )
= (e2i 1)
,,i, (x ) + (e2i 1)C,,i .
(39)

Similarly, for the beginning point y of c ,
,,i, (y ) =
,,i, (y + ) + ,,i,0 (y )
,,i,0 (y + ) +
,,i,0 (y )
= (e2i 1)
,,i, (y ) (e2i 1)D
= (e2i 1) ,,i .
(40)
Then (39), (40) multiplied by C,,i are the integrals (37), while the integral (38) is
,,i,0 (y ) + C,,i (1 e2i )
D,,i (1 e2i ) ,,i,0 (x ). (41)
Adding this, we get


C,,i C,,i (1 + e2i ) + D,,i D
,,i (1 e2i ),

as claimed.

3. Exponentiation of Infinitesimal Deformations


Let us now look at primary weight $(1, 1)$ fields $u$. We would like to investigate
whether the infinitesimal deformation of vertex operators (more precisely worldsheet
vacua or string amplitudes) along $u$ indeed continues to a finite deformation,
or at least to perturbative level, as discussed in the previous section. Looking again
at Eq. (8), we see that we have in principle a series of obstructions similar to those
of Gerstenhaber [26-29], namely if we denote by

\[ L_n(m) = \sum_{i=0}^{m} L_n^i \epsilon^i, \quad L_n^0 = L_n \tag{42} \]

a deformation of the operator $L_n$ in $\mathrm{Hom}(V, V)[\epsilon]/\epsilon^{m}$, we must have

\[ L_n(m) u(m) = 0 \in V[\epsilon]/\epsilon^{m+1} \quad \text{for } n > 0 \tag{43} \]
\[ L_0(m) u(m) = u(m) \in V[\epsilon]/\epsilon^{m+1}. \tag{44} \]

This can be rewritten as

\begin{align*}
L_n u^m &= -\sum_{i \ge 1} L_n^i u^{m-i}, \\
(L_0 - 1) u^m &= -\sum_{i \ge 1} L_0^i u^{m-i}. \tag{45}
\end{align*}

(Analogously for the tilded operators. In the following, we will work on the obstruction for the
chiral part, the antichiral part is analogous.) At first, these equations seem very
overdetermined. Similarly as in the case of Gerstenhaber's obstruction theory, however,
of course the obstructions are of cohomological nature. If we denote by $\mathcal{A}$ the
Lie algebra $\langle L_0 - 1, L_1, L_2, \ldots \rangle$, then the system

\begin{align*}
& L_n(m) u(m-1) \\
& (L_0(m) - 1) u(m-1) \tag{46}
\end{align*}

is divisible by $\epsilon^m$ in $V[\epsilon]/\epsilon^{m+1}$, and is obviously a coboundary, hence a cocycle
with respect to $\langle L_0(m) - 1, L_1(m), \ldots \rangle$. Hence, dividing by $\epsilon^m$, we get a 1-cocycle
in $H^1(\mathcal{A}, C)$. Solving (45) means expressing this $\mathcal{A}$-cocycle as a coboundary.
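
For the reader's convenience, here is a short sketch of how the rewriting (45) arises from (43), (44). It only uses the expansions (7) and (42), and assumes that the conditions of lower order in $\epsilon$ have already been satisfied.

```latex
% Expand the product of (42) and (7) modulo \epsilon^{m+1}:
\begin{align*}
L_n(m)\, u(m)
 &= \Bigl( \sum_{i=0}^{m} L_n^i \epsilon^i \Bigr) \Bigl( \sum_{j=0}^{m} u^j \epsilon^j \Bigr)
  \equiv \sum_{h=0}^{m} \epsilon^h \sum_{i=0}^{h} L_n^i u^{h-i} \pmod{\epsilon^{m+1}} .
\end{align*}
% Coefficient of \epsilon^m in (43), n > 0 (recall L_n^0 = L_n):
\begin{align*}
L_n u^m + \sum_{i=1}^{m} L_n^i u^{m-i} = 0
 \quad\Longleftrightarrow\quad
L_n u^m = -\sum_{i \ge 1} L_n^i u^{m-i} .
\end{align*}
% Coefficient of \epsilon^m in (44):
\begin{align*}
L_0 u^m + \sum_{i=1}^{m} L_0^i u^{m-i} = u^m
 \quad\Longleftrightarrow\quad
(L_0 - 1) u^m = -\sum_{i \ge 1} L_0^i u^{m-i} .
\end{align*}
```

The unknown at order $m$ is $u^m$; the right-hand sides only involve data from lower orders, which is what makes an inductive, obstruction-theoretic treatment possible.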
In the absence of ghosts (= elements of negative weights), there is another simplification
we may take advantage of. Suppose we have a 1-cocycle $c = (x_0, x_1, \ldots)$
of $\mathcal{A}$, representing an element of $H^1(\mathcal{A}, C)$. (In our applications, we will be interested
in the case when the $x_i$'s are given by (46).) Writing out the cocycle condition
explicitly, we obtain the equations

\[ L'_k x_j - L'_j x_k = (k - j) x_{j+k}, \]
where $L'_k = L_k$ for $k > 0$, $L'_0 = L_0 - 1$. In particular,

\[ L'_k x_0 - L'_0 x_k = k x_k, \]

or

\[ L_k x_0 = (L_0 + k - 1) x_k \quad \text{for } k > 0. \tag{47} \]

In the absence of ghosts, (47) means that for $k \ge 1$, $x_k$ is determined by $x_0$ with
the exception of the weight 0 summand $(x_1)_0$ of $x_1$. Additionally, if we denote the
weight $k$ summand of $y$ in general by $y_k$, then

\[ c = dy \tag{48} \]

means

\[ (x_0)_k = (k - 1) y_k, \tag{49} \]
\[ (x_0)_1 = 0. \tag{50} \]

The rest of Eq. (48) then follows from (47), with the exception of the weight 0
summand of $x_1$. We must, then, have

\[ (x_1)_0 \in \mathrm{Im}\, L_1. \tag{51} \]

Conditions (50), (51), for

\[ x_k = \sum_{i \ge 1} L_k^i u^{m-i}, \]

are the conditions for solving (45), i.e. the actual obstruction.
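
The reasoning leading to (49)-(51) can be made explicit by looking at one weight component at a time. The sketch below assumes that all weights occurring are non-negative (no ghosts), writes $(\,\cdot\,)_w$ for the weight $w$ summand, and uses that $L_0$ acts on weight $w$ vectors as multiplication by $w$; the coboundary of $y$ is taken to be $x_0 = (L_0 - 1) y$, $x_k = L_k y$ for $k \ge 1$.

```latex
% Weight-w component of (47), for k >= 1 and w >= 0:
\begin{align*}
(L_k x_0)_w = (w + k - 1)\, (x_k)_w .
\end{align*}
% The factor w + k - 1 vanishes only for (k, w) = (1, 0), so x_0 determines every
% (x_k)_w except (x_1)_0.
%
% Coboundary condition c = dy:
\begin{align*}
x_0 = (L_0 - 1) y
 &\;\Longrightarrow\; (x_0)_w = (w - 1)\, y_w
 \;\Longrightarrow\; (x_0)_1 = 0, \quad y_w = \frac{(x_0)_w}{w - 1} \ \ (w \ne 1), \\
x_1 = L_1 y
 &\;\Longrightarrow\; (x_1)_0 = L_1 y_1 \in \mathrm{Im}\, L_1 ,
\end{align*}
% where y_1 is the only component of y not fixed by x_0.
```

This is why exactly two conditions, (50) and (51), survive as the actual obstruction.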
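For $m = 1$, we get what we call the primary obstruction. Calculating the integral
(5) over an annulus and passing to the appropriate limits (the infinitesimal annuli
expressing the operators $L_n$), we obtain

\[ L^1_k = \tilde{L}^1_{-k} = \sum_{m,i} u_{i,m+k}\, \tilde{u}_{i,m}, \tag{52} \]

so (50) becomes

\[ \sum_i u_{i,0}\, \tilde{u}_{i,0}\, u = 0. \tag{53} \]

The condition (51) becomes

\[ \sum_i u_{i,1}\, \tilde{u}_{i,0}\, u \in \mathrm{Im}\, L_1, \qquad \sum_i u_{i,0}\, \tilde{u}_{i,1}\, u \in \mathrm{Im}\, \tilde{L}_1. \tag{54} \]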

This investigation is also interesting in the supersymmetric context. In the case
of $N = 1$ worldsheet supersymmetry, we have additional operators $G^i_r$, and in the
$N = 2$ SUSY case, we have operators $G^{+i}_r$, $G^{-i}_r$, $J^i_n$ (cf. [31, 49]), defined as the
$\epsilon^i$-coefficient of the deformation of $G_r$, resp. $G^+_r$, $G^-_r$, $J_n$ analogously to Eq. (42).
In the $N = 1$-supersymmetric case, the critical deforming fields have weight
$(1/2, 1/2)$ (as do $a$ and $c$ fields in the $N = 2$ case), so in both cases the first
Eq. (42) remains the same as in the $N = 0$ case, the second becomes

\[ (L_0 - 1/2) u^m = -\sum_{i \ge 1} L_0^i u^{m-i}. \tag{55} \]

Additionally, for $N = 1$, we get

\[ G_r u^m = -\sum_{i \ge 1} G_r^i u^{m-i}, \quad r \ge 1/2 \tag{56} \]

(similarly when tildes are present).
In the $N = 1$-supersymmetric case, we therefore deal with the Lie algebra $\mathcal{A}$,
which is the free $\mathbb{C}$-vector space on $L_n$, $G_r$, $n \ge 0$, $r \ge 1/2$. For a cocycle which
has value $x_k$ on $L_k$ and $z_r$ on $G_r$, Eq. (47) becomes

\[ L_k x_0 = (L_0 + k - 1/2) x_k \quad \text{for } k > 0, \tag{57} \]

so in the absence of ghosts, $x_k$ is always determined by $x_0$. If

\[ \text{the 1-cocycle } (x_k, z_r) \text{ is the coboundary of } y \tag{58} \]

we additionally get

\[ (x_0)_k = (k - 1/2) y_k, \]

so

\[ (x_0)_{1/2} = 0. \]

On the other hand, on the $z$'s, we get

\[ G_r x_0 = (L_0 + r - 1/2) z_r, \quad r \ge 1/2, \tag{59} \]

so we see that in the absence of ghosts, all $z_r$'s are determined, with the exception of

\[ (z_{1/2})_0. \]

Therefore our obstruction is

\[ (x_0)_{1/2} = 0, \quad (z_{1/2})_0 \in \mathrm{Im}(G_{1/2}). \tag{60} \]
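
The counting behind (57)-(60) is the same as in the non-supersymmetric case, with the critical weight $1$ replaced by $1/2$. The sketch below assumes that weights are non-negative elements of $\tfrac{1}{2}\mathbb{Z}$ (no ghosts), writes $(\,\cdot\,)_w$ for the weight $w$ summand, and takes the coboundary of $y$ to be $x_0 = (L_0 - 1/2) y$, $x_k = L_k y$, $z_r = G_r y$.

```latex
% Weight-w components of (57) and (59):
\begin{align*}
(L_k x_0)_w &= (w + k - \tfrac12)\, (x_k)_w , \qquad k \ge 1, \\
(G_r x_0)_w &= (w + r - \tfrac12)\, (z_r)_w , \qquad r \ge \tfrac12 .
\end{align*}
% For w >= 0: w + k - 1/2 > 0 always, while w + r - 1/2 = 0 only for (r, w) = (1/2, 0).
% Hence x_0 determines all x_k and all z_r except (z_{1/2})_0.
%
% Coboundary condition:
\begin{align*}
x_0 = (L_0 - \tfrac12) y
 &\;\Longrightarrow\; (x_0)_w = (w - \tfrac12)\, y_w
 \;\Longrightarrow\; (x_0)_{1/2} = 0, \\
z_{1/2} = G_{1/2}\, y
 &\;\Longrightarrow\; (z_{1/2})_0 = G_{1/2}\, y_{1/2} \in \mathrm{Im}(G_{1/2}) .
\end{align*}
```

Here $y_{1/2}$ is the one component of $y$ that $x_0$ leaves undetermined, which is why the second condition in (60) is a membership condition rather than an equation.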

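For the primary obstruction, we have

\[ L^1_k = \tilde{L}^1_{-k} = \sum_m (G_{-1/2} \tilde{G}_{-1/2} u)_{m+k,m}, \tag{61} \]

\begin{align*}
G^1_r &= 2 \sum_m (\tilde{G}_{-1/2} u)_{m+r,m}, \\
\tilde{G}^1_r &= 2 \sum_m (G_{-1/2} u)_{m,m+r}, \tag{62}
\end{align*}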
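so the obstruction becomes

\begin{align*}
& \sum_m (G_{-1/2} \tilde{G}_{-1/2} u)_{m,m} = 0, \\
& \sum_m (\tilde{G}_{-1/2} u)_{m+1/2,m} \in \mathrm{Im}(G_{1/2}), \tag{63} \\
& \sum_m (G_{-1/2} u)_{m,m+1/2} \in \mathrm{Im}(\tilde{G}_{1/2}).
\end{align*}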

In the case of N = 2 supersymmetry, there is an additional complication, namely


chirality. This means that in addition to the conditions
(L0 1/2)u = 0,
(64)
Ln u = G
r u = Jn1 u = 0 for n 1, r 1/2,
we require that u be chiral primary, which means
G+
1/2 u = 0. (65)
(There is also the possibility of antichiral primary, which has
G
1/2 u = 0 (66)
instead, and similarly for the s.) Let us now write down the obstruction equations
for the chiral primary case. We get the rst Eqs. (45), (55), and an analogue of (56)
i
with Gir replaced by G+ir and Gr . Additionally, we have the equation

G+1/2 u = G+i
m mi
1/2 u
i1

and analogously for the s.


In this situation, we consider the super-Lie algebra A2 which is the free C-vector
space on Ln , Jn , n 0, G r , r 1/2 and Gs , s 1/2. One easily veries that
+

this is a super-Lie algebra on which the central extension vanishes canonically ([31],
Sec. 3.1). Looking at a 1-cocycle whose value is xk ,zr , tk on Lk , G
r , Jk respectively,
we get Eq. (57), and additionally
G
r x0 = (L0 + r 1/2)zr , r 1/2 for , r 1/2 for + (67)
and
Jn x0 = (n 1/2)tn , n 0. (68)
+
We see that the cocycle is determined by x0 , with the exception of (z1/2 )0 , (z1/2 )1 .
Therefore, we get the condition
(x0 )1/2 = 0

(z1/2 )0 Im(G
1/2 ) (69)
+
(z1/2 )1 = G+
1/2 u where G+
1/2 u =0
and similarly for the s.
In the case of deformation along a cc eld u, we have



1 =
L1k = L k (G
1/2 G1/2 u)m+k,m , (70)
m

Gr+,1 = u)m+r+1/2,m ,
2(G 1/2
m

+,1 =
G r 2(G
1/2 u)m,m+r+1/2 , (71)
m
Gr,1 = r,1
G = 0,
Jn1
= 0 = Jn1 ,
so the obstructions are, in a sense, analogous to (63) with Gr replaced by G
r .

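Remark. The relevant computation in verifying that (70), (71) (and the analogous
cases before) form a cocycle uses formulas of the following type ([72]):

\[ \mathrm{Res}_z (a(z) v(w) z^n) - \mathrm{Res}_z (v(w) a(z) z^n) = \mathrm{Res}_{z-w} ((a(z - w) v)(w) z^n). \tag{72} \]

For example, when $v$ is primary of weight 1, $a = L_{-2}$, the right-hand side of (72) is

\begin{align*}
\mathrm{Res}_{z-w} & \bigl( (L_0 v)(w) (z - w)^{-2}\, n (z - w) w^{n-1} + (L_{-1} v)(w) (z - w)^{-1} w^n \bigr) \\
 &= n\, v(w) w^{n-1} + (L_{-1} v)(w) w^n \\
 &= \sum_k n\, v_k w^{n-k-2} + \sum_k (-k - 1)\, v_k w^{n-k-2} \\
 &= \sum_k (n - k - 1)\, v_k w^{n-k-2}.
\end{align*}

The left-hand side is $\sum_k [L_{n-1}, v_{k-n+1}] w^{n-k-2}$, so we get

\[ [L_{n-1}, v_{k-n+1}] = (n - k - 1) v_k, \]

as needed.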
Other required identities follow in a similar way. Let us verify one interesting
case when a = G 3/2 , u chiral primary. Then the right-hand side of (72) is

Reszw (G 1/2 v(w)(z w) 1 n
w ) = (G
1/2 v)(w) = (G1/2 v)w
n1
.
This implies

[G
r , us ] = (G1/2 u)r+s , (73)
as needed.
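The first bracket derived in the Remark above, $[L_{n-1}, v_{k-n+1}] = (n-k-1)v_k$, can be cross-checked against the standard mode commutator for a primary field of weight $\Delta$, namely $[L_m, v_j] = ((\Delta - 1)m - j)v_{m+j}$, specialised to $\Delta = 1$. The following is only a sketch: the function names are illustrative and do not come from the text, and only the numerical coefficients are compared (no operators are constructed).

```python
def coeff_from_remark(n: int, k: int) -> int:
    """Coefficient of v_k in [L_{n-1}, v_{k-n+1}] as derived in the Remark."""
    return n - k - 1


def coeff_standard(m: int, j: int, weight: int = 1) -> int:
    """Coefficient of v_{m+j} in [L_m, v_j] for a primary field of the given weight."""
    return (weight - 1) * m - j


def formulas_agree(limit: int = 6) -> bool:
    """Compare both coefficients under the substitution m = n - 1, j = k - n + 1."""
    return all(
        coeff_from_remark(n, k) == coeff_standard(n - 1, k - n + 1)
        for n in range(-limit, limit + 1)
        for k in range(-limit, limit + 1)
    )


print(formulas_agree())
```

The substitution $m = n-1$, $j = k-n+1$ is the only input; the weight is kept as a parameter so that the same check can be repeated for other weights if the corresponding residue computation is redone.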
We have now analyzed the primary obstructions for exponentiation of infinitesimal
CFT deformations. However, in order for a perturbative exponentiation to
exist, there are also higher obstructions which must vanish. The basic principle for
obtaining these obstructions was formulated above. However, in practice, it may
often happen that those obstructions will not converge. This may happen for two
different basic reasons. One possibility is that the deformation of the deforming field
Perturbative Deformations of Conformal Field Theories Revisited 139

itself does not converge. This is essentially a violation of perturbativity, but may
in some cases be resolved by regularizing the CFT anomaly along the deformation
parameter. We will discuss this at the end of this section, and will give an example
in Sec. 4 below.
Even if all goes well with the parameter, however, there may be another problem,
namely the expressions for $L^i_n$ etc. may not converge due to the fact that our
deformation formulas concern vacua of actual worldsheets, while $L^i_n$ etc. correspond
to degenerate worldsheets. Similarly, vertex operators may not converge in the
deformed theories. We will show here how to deal with this problem.
The main strategy is to rephrase the conditions from the above part of this
section in terms of finite annuli. We start with the N = 0 (non-supersymmetric)
case. Similarly as in (42), we can expand
\[ U_{A_r}(m) = \sum_{h=0}^{m} U^h_{A_r}\epsilon^h. \tag{74} \]

In the non-supersymmetric case, the basic fact we have is the following:

Theorem 2. Assuming $u^k$ (considered as fields in the original undeformed CFT)
have weight > (1, 1) for k < h, $r \in (0, 1)$, we have
\[ U^h_{A_r} = \sum_{m_k,\ k \leq h} u_{m_h,m_h} \cdots u_{m_1,m_1} U_{A_r} \int_{s_h=r}^{1} \int_{s_{h-1}=r}^{s_h} \cdots \int_{s_1=r}^{s_2} s_h^{2m_h-1} s_{h-1}^{2m_{h-1}-1} \cdots s_1^{2m_1-1}\, ds_1 \cdots ds_h. \tag{75} \]
(Note that the integral on the right-hand side is over a simplex.)
\[ u^h = \sum_{m_k,\ k \leq h} \frac{1}{2^h (m_h + \cdots + m_1)(m_{h-1} + \cdots + m_1) \cdots m_1}\, u_{m_h,m_h} \cdots u_{m_1,m_1} u. \tag{76} \]
In particular, the obstruction is the vanishing of the sum (with the term $m_h + \cdots + m_1$
omitted from the denominator) of the terms in (76) with $m_h + \cdots + m_1 = 0$.

Proof. The identity (75) is essentially by definition. The key point is that in the
higher deformed vacua, there are terms in the integrand obtained by inserting $u^k$,
k > 1 to boundaries of disjoint disks $D_i$ cut out of $A_r$. Then there are corrective
terms to be integrated on the worldsheets obtained by cutting out those disks. But
the point is that under our weight assumption, all the disks $D_i$ can be shrunk to
a single point, at which point the term disappears, and we are left with integrals
of several copies of u inserted at different points. If we are using vertex operators
to express the integral, the operators must additionally be applied in time order
(i.e. fields at points of lower modulus are inserted first). There is an h! permutation
factor which cancels with the Taylor denominator. This gives (75).
Now (76) is proved by induction. For h = 1, the calculation is done above ((52)).
Assume now the induction hypothesis, and evaluate the integral in the standard
way of taking primitive functions successively from the inside out. The primitive
function of $m^s$ is taken to be $m^{s+1}/(s+1)$ (by the induction hypothesis, and the
assumption that lower order obstructions vanish, the case s = -1 never occurs).
Then the contributing term of the integral where the k - 1 innermost integrals have
the upper bound and the kth innermost integral has the lower bound is equal to
\[ U^{h-k}_{A_r} u^k, \qquad h > k \geq 1 \]
(and this term occurs with a minus sign because of the lower bound involved). The
summand which has all upper bounds except in the last integral is equal to
\[ \frac{1 - r^{2(m_1 + \cdots + m_h)}}{2^h (m_h + \cdots + m_1)(m_{h-1} + \cdots + m_1) \cdots m_1}\, u_{m_h,m_h} \cdots u_{m_1,m_1} u\, r^2, \tag{77} \]
which is supposed to be equal to
\[ -U_{A_r} u^h + r^2 u^h. \]
This gives the desired solution.
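The pattern of denominators in (76) and (77) can be seen already for h = 2. The sketch below evaluates the two-fold iterated integral of (75) with exponents $a_i = 2m_i$, both in closed form and by a simple midpoint rule; the term in which every integral takes its upper bound carries the factor $1/(a_1(a_1+a_2)) = 1/(2^2 m_1(m_1+m_2))$. The function names and the choice of quadrature are illustrative assumptions, and the closed form assumes that $a_1$, $a_2$ and $a_1 + a_2$ are all non-zero.

```python
def nested_integral_closed_form(a1: float, a2: float, r: float) -> float:
    """Closed form of int_{s2=r}^{1} s2^(a2-1) int_{s1=r}^{s2} s1^(a1-1) ds1 ds2.

    Assumes a1, a2 and a1 + a2 are all non-zero.
    """
    # Contribution where the inner integral is evaluated at its upper bound s2.
    upper = (1.0 - r ** (a1 + a2)) / (a1 * (a1 + a2))
    # Contribution where the inner integral is evaluated at its lower bound r.
    lower = r ** a1 * (1.0 - r ** a2) / (a1 * a2)
    return upper - lower


def nested_integral_numeric(a1: float, a2: float, r: float, steps: int = 2000) -> float:
    """Midpoint-rule evaluation of the same iterated integral over the simplex."""
    width = (1.0 - r) / steps
    total = 0.0
    for i in range(steps):
        s2 = r + (i + 0.5) * width
        inner = (s2 ** a1 - r ** a1) / a1  # exact inner integral
        total += s2 ** (a2 - 1.0) * inner * width
    return total


def leading_coefficient(m1: float, m2: float) -> float:
    """The h = 2 denominator pattern of (76): 1 / (2^2 (m2 + m1) m1)."""
    return 1.0 / (2 ** 2 * (m2 + m1) * m1)
```

For example, `nested_integral_closed_form(2 * m1, 2 * m2, r)` can be compared with `nested_integral_numeric` for a few values of r in (0, 1); the case $m_1 + m_2 = 0$ is excluded here and is the subject of the Remark that follows.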

Remark. The formula (77) of course does not apply to the case $m_1 + \cdots + m_h = 0$.
In that case, the correct formula is
\[ \frac{\ln(r)}{(m_{h-1} + \cdots + m_1) \cdots m_1}\, u_{m_h,m_h} \cdots u_{m_1,m_1} u\, r^2. \tag{78} \]
So the question becomes whether there could exist a field $u^h$ such that $U_{A_r} u^h - r^2 u^h$
is equal to the quantity (78). One sees immediately that such a field does not exist
in the product-completed space of the original theory. What this approach does
not settle, however, is whether it may be possible to add such non-perturbative
fields to the theory and preserve CFT axioms, which could facilitate existence of
deformations in some generalized sense, despite the algebraic obstruction. It would
have to be, however, a field of generalized weight in the sense of [39–42].
In effect, written in infinitesimal terms, the expression (78) becomes
\[ L_0 u^h - u^h = \frac{1}{(m_{h-1} + \cdots + m_1) \cdots m_1}\, u_{m_h,m_h} \cdots u_{m_1,m_1} u. \]
The right-hand side wu is a field of holomorphic weight 1, so we see that we have
a matrix relation
\[ L_0 \begin{pmatrix} u^h \\ u \end{pmatrix} = \begin{pmatrix} 1 & w \\ 0 & 1 \end{pmatrix} \begin{pmatrix} u^h \\ u \end{pmatrix}. \]
This is an example of what one means by a field of generalized weight. One should
note, however, that fields of generalized weight are excluded in unitary conformal
field theories. By Wick rotation, the unitarity axiom of a conformal field theory
becomes the axiom of reflection positivity [59]: the operator $U_\Sigma$ associated with a
worldsheet $\Sigma$ is defined up to a 1-dimensional complex line $L_\Sigma$ (which is often more
strongly assumed to have a positive real structure). If we denote by $\bar\Sigma$ the complex-conjugate
worldsheet (note that this reverses orientation of boundary components),
then reflection positivity requires that we have an isomorphism $L_{\bar\Sigma} \cong L_\Sigma^*$ (the dual
line), and using this isomorphism, an identification $U_\Sigma^* = U_{\bar\Sigma}$ (here the asterisk
denotes the adjoint operator). Specializing to annuli $A_r$, $|r| \leq 1$, we see that
the annulus for r real is self-conjugate, so the corresponding operators are self-adjoint,
and hence diagonalizable. On the other hand, for $|r| = 1$, we obtain
unitary operators, and unitary representations of $S^1$ on Hilbert space split into
eigenspaces of integral weights. The central extension given by L is then trivial and
hence the operators corresponding to all $A_r$ commute, and hence are simultaneously
diagonalizable, thus excluding the possibility of generalized weight.
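To make the notion of generalized weight in the discussion above concrete, the following sketch exponentiates the 2 x 2 Jordan block with diagonal entries 1 and off-diagonal entry w. It illustrates that $r^{L_0} = \exp(\ln(r) L_0)$ then equals $r$ times the matrix with rows $(1, w \ln r)$ and $(0, 1)$, so a logarithm of r appears exactly when the block is not diagonalizable. Treating $L_0$ as a plain 2 x 2 real matrix and w as a number is an assumption made for illustration only; the helper names are not from the text.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_exp(m, terms=40):
    """Truncated power series for the exponential of a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = mat_mul(term, m)
        term = [[x / n for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def annulus_operator(r, w):
    """r**L0 = exp(ln(r) * L0) for the Jordan block L0 = [[1, w], [0, 1]]."""
    log_r = math.log(r)
    return mat_exp([[log_r, log_r * w], [0.0, log_r]])


def annulus_operator_closed_form(r, w):
    """Closed form r * [[1, w ln r], [0, 1]] of the same operator."""
    return [[r, r * w * math.log(r)], [0.0, r]]
```

With w = 0 the operator is simply multiplication by r, which is the diagonalizable situation forced by reflection positivity in the argument above.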
The possibility, of course, remains that the correlation function of the deformed
theory can be modified by a non-perturbative correction. Let us note that if left
uncorrected, the term (78) can be interpreted infinitesimally as
\[ L_0 u(\epsilon) - u(\epsilon) = C\epsilon^m v \mod \epsilon^{m+1}, \tag{79} \]
where v is another field of weight 1. Note that in case that u = v, (79) can be
interpreted as saying that u changes weight at order m of the perturbation parameter.
In the general case, we obtain a matrix involving all the (holomorphic) weight
1 fields in the unperturbed theory. Excluding fields of generalized weight in the
unperturbed theory (which would translate to fields of generalized weight in the
perturbed theory), the matrix must have other eigenvalues than 1, thus showing
that some critical fields will change weight.

In the N = 1-supersymmetric case, an analogous statement holds, except the
assumption is that the weight of $u^k$ is greater than (1/2, 1/2) for k < h, and the
integral (75) must be replaced by

UAhr = (G1/2 G 1/2 u)m ,m (G1/2 G
1/2 u)m1 ,m1 UAr
h h
mk
 1  sh  s2
h 1 1
s1m1 1 ds1 dsh ,
m
sm
h
h1
sh1 (80)
sh =r sh1 =r s1 =r

and accordingly
 1
uh =
mk
2h (mh + + m1 )(mh1 + + m1 ) m1

(G1/2 G
1/2 u)m ,m (G1/2 G
h h
1/2 u)m1 ,m1 u, (81)

so the obstruction again states that the term with $m_h + \cdots + m_1 = 0$ must vanish.
In the N = 2 case, when u is a cc field, we simply replace G by $G^-$ in (80) and (81).
But in the supersymmetric case, to preserve supersymmetry along the deformation,
we must also investigate the finite analogs of the obstructions associated
with $G_{1/2}$ in the N = 1 case, and $G^+_{-1/2}$, $G^-_{1/2}$ in the N = 2 c case (and similarly
for the a case, and the tilded operators). In fact, to tell the whole story, we should
seriously investigate integration of the deforming fields over super-Riemann surfaces
(= super-worldsheets). This can be done; one approach is to treat the case of the
superdisk first, using Stokes' theorem twice with the differentials $\partial$, $\bar\partial$ replaced by
$D$, $\bar D$ respectively in the N = 1 case (and the same at one chirality for the N = 2
case). A general super-Riemann surface is then partitioned into superdisks.
For the purpose of obstruction theory, the following special case is sufficient.
We treat the N = 2 case, since it is of main interest for us. Let us consider the
case of cc fields (the other cases are analogous). First we note (see (71)) that $G^-$
is unaffected by deformation via a cc field, so the obstructions derived from G 1/2
and G 1/2 are trivial (and similarly at the s).
To understand the obstruction associated with $G^+_{-1/2}$, we will study finite (as
opposed to infinitesimal) annuli obtained by exponentiating $G^+_{-1/2}$. Now the element
$G^+_{-1/2}$ is odd. Thinking of the super-semigroup of superannuli as a supermanifold,
it makes no sense to speak of odd points of the supermanifold. It makes sense,
however, to speak of a family of odd elements parametrized by an odd parameter s:
this is simply the same thing as a map from the (0|1)-dimensional superaffine line
into the supermanifold. In this sense, we can speak of the finite odd annulus
\[ \exp(sG^+_{-1/2}). \tag{82} \]
Now we wish to study the deformations of the operator associated with (82) along
a cc field u as a perturbative expansion in $\epsilon$.
Thinking of $G^+_{-1/2}$ as an N = 2-supervector field, we have

+
G+
1/2 = (z + ) +
z . (83)
z
We see that (83) deforms infinitesimally only the variables $\theta^+$ and z, not $\theta^-$. Thus,
more specically, (82) results in the transformation

z exp(s )z,
(84)
.

This gives rise to the formula, valid when uk have weight > (1/2, 1/2) for 1 k < h,
 1  th
mh1 1
mh 1
h
Uexp(sG +
)
= t h th1
1/2
mk th =exp(s ) th1 =exp(s )
 t2
t1m1 1 dt1 dth vmh ,mh vm1 ,m1 Uexp(sG+ ) , (85)
1/2
t1 =exp(s )

where vmk ,mk is equal to


u)m
(G (86)
1/2 k+1/2 ,mk
in summands of (85) where the factor resulting from integrating the tk variable has
a factor, and
(G
1/2 G1/2 u)mk ,mk (87)
in other summands. (We see that each summand can be considered as a product of
factors resulting from integrating the individual variables tk ; in at most one factor,
(86) can occur, otherwise the product vanishes.)
Realizing that exp(ms ) = 1 + ms , this gives that the obstruction (under
the weight assumption for uk ) is that the summand for m1 + + mh = 0 (with
the denominator m1 + + mh omitted) in the following expression vanish:

h
1 1
(G G u)m ,m
mk k=1
m1 + + mh m1 1/2 1/2 h h

u)m
mk (G (G
1/2 k+1/2 ,mk 1/2 G1/2 u)m1 ,m1 u. (88)
To investigate the higher obstructions further, we need the language of corre-
lation functions. Specically, the CFTs whose deformations we will consider are
RCFTs. The simplest way of building an RCFT is from chiral sectors H
where runs through a set of labels, by the recipe

H= H H


where denotes the contragredient label (cf. [38]). (In the case of the Gepner
model, we will need a slightly more general scenario, but our methods still apply
to that case analogously.) Further, we will have a symmetric bilinear form
B : H H C
with respect to which the adjoint to Y (v, z) is
(z 2 )n Y (ezL1 v, 1/z)
where v is of weight n. There is also a real structure
H
=H ,

thus specifying a real structure on H, u v = u


v, and inner product
u1 v1 , u2 v2 = B(u1 , u
2 )B(v1 , v2 ).
We also have an inner product
H R H C
given by
u, v = B(u, v).
Then we have the P1 -chiral correlation function
u(z ) |vm (zm )vm1 (zm1 ) v1 (z1 )v0 (z0 ) (89)
which can be defined by taking the vacuum operator associated with the degenerate
worldsheet $\Sigma$ obtained by cutting out unit disks with centers $z_0, \ldots, z_m$ from the
unit disk with center $z_\infty$, applying this operator to $v_0 \otimes \cdots \otimes v_m$, and taking
inner product with u. Thus, the correlation function (89) is in fact the same thing
as applying the field on either side of (89) to the identity, and taking the inner
product.
This object (89) is however not simply a function of $z_0, \ldots, z_\infty$. Instead, there is
a finite-dimensional vector space $M_\Sigma$ depending holomorphically on $\Sigma$ (called the
modular functor) such that (89) is a linear function
\[ M_\Sigma \to \mathbb{C}. \]
However, now one assumes that $M_\Sigma$ is a unitary modular functor in the sense
of Segal [59]. This means that $M_\Sigma$ has the structure of a positive-definite inner
product space for not just the $\Sigma$ as above, but an arbitrary worldsheet. The inner
product is not valued in C, but in
\[ \det(\Sigma)^{2c} \]
where c is the central charge. Since the determinant of $\Sigma$ as above is the same as
$\det(P^1)$ (hence in particular constant), we can make the inner product C-valued in
our case.
If the deforming field is of the form

uu
, (90)

the higher L0 obstruction (under the weight assumptions given above) can be
further written as

v(0) |u(zm ) u(z1 )u(0)
0 z1 zm 1

v |
 u(zm ) u
(z1 ) z1 dzm d
u(0) dz1 d zm for w(v) 1 (91)
(w is weight) in the N = 0 case and

v(0) |(G
1/2 u)(zm ) (G1/2 u)(z1 )u(0)
0 z1 zm 1

 u)(zm ) (G
v |(G u z1 dzm d for w(v) 1/2
1/2 1/2 )(
z1 )
u(0) dz1 d zm
(92)

in the N = 2 cc case. The G+


1/2 -obstruction in the N = 2 case can be written as
 
m
v(0) |(G
1/2 u)(zm ) u(zk ) (G1/2 u)(z1 )u(0)
0 z1 zm 1 k=1

u
v |(G
 )
1/2 )(zm ) (G1/2 u
(z1 ) z1 dzm d
u(0) dz1 d zm for w(v) 0, w(
v ) 1/2 (93)
and similarly for the . We see that these obstructions vanish when we have
v(z ) |u(zm ) u(z0 ) = 0, for w(v) 1 (94)
in the N = 0 case (and similarly for the s), and
v(z ) |G
1/2 u(zm ) G1/2 u(z1 )u(z0 ) = 0, for w(v) 1/2, (95)
and similarly for the s. Observe further that when
u
=u
,
the condition for the s is equivalent to the condition for u, and further (94), (95)
are also necessary in this case, as in (91), (92) we may also choose v = v, which
makes the integrand non-negative (and only 0 if it is 0 at each chirality). In the
N = 2 case, it turns out that the condition (95) simplifies further:

Theorem 3. Let u be a chiral primary field of weight 1/2. Then the necessary
and sufficient condition (95) for existence of perturbative CFT deformations along
the field $u \otimes \bar{u}$ is equivalent to the same vanishing condition applied to only chiral
primary fields v of weight 1/2.

Proof. In order for the fields (95) to correlate, they would have to have the same
J-charge $Q_J$. We have
\[ Q_J u = 1, \qquad Q_J(G^-_{-1/2} u) = 0. \]
Thus $Q_J$ of the right-hand side of (95) is 1, and for the function (95) to be possibly
non-zero, we must have
\[ Q_J v = 1. \tag{96} \]
But then we have
\[ w(v) \geq \frac{1}{2} Q_J v = \frac{1}{2} \]
with equality arising if and only if
v is chiral primary of weight 1/2 (97)
(see [31, Sec. 3.3]).

Remark 1. We see therefore that in the N = 2 SUSY case, there is in fact no
need to assume that the weight of $u^k$ is > (1/2, 1/2) for k < h. If the obstruction
vanishes for k < h, then we have
\[ u^k = \frac{1}{k!} \int_D (G^-_{-1/2}\tilde{G}^-_{-1/2} u)(z_k) \cdots (G^-_{-1/2}\tilde{G}^-_{-1/2} u)(z_1)\, u\; dz_1 \cdots dz_k\, d\bar{z}_1 \cdots d\bar{z}_k \tag{98} \]
where the integrand is understood as a (k + 1)-point function (and not its power
series expansion in any particular range), over the unit disk.
Additionally, for any worldsheet $\Sigma$,
\[ U^h_\Sigma = \frac{1}{h!} \int_\Sigma (G^-_{-1/2}\tilde{G}^-_{-1/2} u)(z_h) \cdots (G^-_{-1/2}\tilde{G}^-_{-1/2} u)(z_1)\; dz_1 \cdots dz_h\, d\bar{z}_1 \cdots d\bar{z}_h \tag{99} \]
(it is to be understood that in both (98), (99), the fields are inserted into holomorphic
images of disks where the origin maps to the point of insertion with derivative
of modulus 1 with respect to the measure of integration).
When the obstruction occurs at step k, the integral (98) has a divergence of
logarithmic type. In the N = 0 case, there is a third possibility, namely that the
obstruction vanishes, but the field $u^h$ in Theorem 2 has summands of weight < (1, 1)
(< (1/2, 1/2) for N = 1). In this case, the integral (98) will have a divergence of
power type, and the integral of terms of weight < (1, 1) (respectively, < (1/2, 1/2))
has to be taken in range from $\infty$ to 1 rather than from 0 to 1 to get a convergent
integral. The formula (99) is not correct in that case.

Remark 2. In [19], a different correlation function is considered as a measure
of marginality of u to higher perturbative order. The situation there is actually
more general, allowing combinations of both chiral and antichiral primaries. In the
present setting of chiral primaries only, the correlation function considered in [19]
amounts to
\[ \langle 1 | (G^-_{-1/2} u)(z_n) \cdots (G^-_{-1/2} u)(z_1) \rangle. \tag{100} \]
It is easy to see using the standard contour deformation argument that (100) indeed
vanishes, which is also observed in [18]. In [19], this type of vanishing is taken
as evidence that the N = 2 CFT deformations exist. It appears, however, that
even though the vanishing of (100) follows from the vanishing of (95), the opposite
implication does not hold. (In fact, we will see examples in Sec. 6 below.) The
explanation seems to be that [19] writes down an integral expressing the change
of central charge when deforming by a combination of cc fields and ac fields, and
proves its vanishing. While this is correct formally, we see from Remark 1 above
that in fact a singularity can occur in the integral when our obstruction is non-zero:
the integral can marginally diverge for k points while it is convergent for < k points.
It would be nice if the obstruction theory à la Gerstenhaber we described here
settled in general the question of deformations of conformal field theory, at least in
the vertex operator formulation. It is, however, not that simple. The trouble is that
we are not in a purely algebraic situation. Rather, compositions of operators which
are infinite series may not converge, and even if they do, the convergence cannot be
understood in the sense of being eventually constant, but in the sense of analysis,
i.e. convergence of sequences of real numbers.
Specifically, in our situation, there is the possibility of divergence of the terms
on the right-hand side of (45). Above we dealt with one problem, that in general,
we do not expect infinitesimal deformations to converge on the degenerate worldsheets
of vertex operators, so we may have to replace (45) by equations involving
finite annuli instead. However, that is not the only problem. We may encounter
regularization along the flow parameter. This stems from the fact that Eqs. (43),
(44) only determine $u(\epsilon)$ up to scalar multiple, where the scalar may be of the form
\[ 1 + \sum_{i \geq 1} K_i \epsilon^i = f(\epsilon). \tag{101} \]
But the point is (as we shall see in an example in the next section) that we may
only be able to get a well defined value of
\[ f^{-1}(\epsilon) u(\epsilon) = v(\epsilon) \tag{102} \]
when the constants $K_i$ are infinite. The obstruction then is
\[ L_n(m) f(m) v(m) = 0 \in V[\epsilon]/\epsilon^{m+1} \quad \text{for } n > 0, \]
\[ (1 - L_0(m)) f(m) v(m) = 0 \in V[\epsilon]/\epsilon^{m+1}. \tag{103} \]
At first, it may seem that it is difficult to make this rigorous mathematically with
the infinite constants present. However, we may use the following trick. Suppose we
want to solve
\[ c_1 a_{11} + \cdots + c_n a_{1n} = b_1 \]
\[ \vdots \tag{104} \]
\[ c_1 a_{m1} + \cdots + c_n a_{mn} = b_m \]
in a, say, finite-dimensional vector space V. Then we may rewrite (104) as
\[ (b_1, \ldots, b_m) = 0 \in \bigoplus_m V \Big/ \big\langle (a_{11}, \ldots, a_{m1}), \ldots, (a_{1n}, \ldots, a_{mn}) \big\rangle. \tag{105} \]
This of course does not give anything new in the algebraic situation, i.e. when the
$a_{ij}$'s are simply elements of the vector space V. When, however, the vectors
\[ (a_{11}, \ldots, a_{m1}), \ldots, (a_{1n}, \ldots, a_{mn}) \]
are (possibly divergent) infinite sums
\[ (a_{1j}, \ldots, a_{mj}) = \sum_k (a_{1jk}, \ldots, a_{mjk}), \]
then the right-hand side of (105) can be interpreted as
\[ \bigoplus_m V \Big/ \big\langle (a_{11k}, \ldots, a_{m1k}), \ldots, (a_{1nk}, \ldots, a_{mnk}) \big\rangle. \]
In that sense, (105) always makes sense, while (104) may not when interpreted
directly. We interpret (103) in this way.
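In the purely algebraic situation, the reformulation of (104) as (105) says that the system is solvable exactly when the vector of right-hand sides lies in the span of the coefficient vectors, i.e. vanishes in the quotient. The sketch below illustrates this in the simplest setting, where V is taken to be the rationals so that each coefficient vector lies in $\mathbb{Q}^m$; this choice of V, the helper names and the use of exact rational arithmetic are all assumptions made for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def vanishes_in_quotient(vectors, b):
    """True when b lies in the span of the given vectors, i.e. b = 0 in the quotient."""
    return rank(vectors) == rank(vectors + [b])
```

Here `vectors` plays the role of the list $(a_{11}, \ldots, a_{m1}), \ldots, (a_{1n}, \ldots, a_{mn})$ and `b` the role of $(b_1, \ldots, b_m)$. For instance, `vanishes_in_quotient([[1, 0, 1], [0, 1, 1]], [2, 3, 5])` holds because the third entry is the sum of the first two. The point of the text is that the quotient formulation still makes sense when the generating vectors are replaced by the individual terms of divergent sums, which this finite sketch does not attempt to model.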
Let us now turn to the question of sufficient conditions for exponentiation of
infinitesimal deformations. Suppose there exists a subspace $W \subset V$ closed under
vertex operators which contains u and such that for all elements $v \in W$, we have
that
\[ \sum_i Y_i(u, z)\tilde{Y}_i(u, \bar{z}) v \]
involve only $z^n \bar{z}^m$ with $m, n \in \mathbb{Z}$, $m, n \neq -1$. Then, by Theorem 1,


1  : W W
[]/2

is an infinitesimal isomorphism between W and the infinitesimally u-deformed W .


It follows, in the non-regularized case, that then
exp()u (106)
is a globally deformed primary field of weight (1, 1), and
exp() : W W
[[]] (107)
is an isomorphism between W and the exponentiated deformation of W. However,
since we now know the primary fields along the deformation, vacua can be recovered
from Eq. (8) of the last section.
Such non-regularized exponentiation occurs in the case of the coset construction.
Set
\[ W = \langle v \mid Y_i(u, z)\tilde{Y}_i(u, \bar{z}) v \text{ involve only } z^n \bar{z}^m \text{ with } m, n \geq 0,\ m, n \in \mathbb{Z} \rangle. \]
Then W is called the coset of V by u. Moreover, W is closed under vertex operators,
and if $u \in W$, the formulas (106), (107) apply without regularization.
The case with regularization occurs when there exists some constant
\[ K(\epsilon) = 1 + \sum_{n \geq 1} K_n \epsilon^n \]
where $K_n$ are possibly infinite constants such that


K() exp()u (108)
is finite in the sense described above (see (105)). We will see an example of this in
the next section.
All these constructions are easily adapted to supersymmetry. The formu-
las (106), (107) hold without change, but the deformation is with respect to
G1/2 G 1/2 u respectively, G G +
1/2 1/2 u, G1/2 G1/2 u depending on the situation
applicable.

4. The Deformations of Free Field Theories


As our first application, let us consider the 1-dimensional bosonic free field conformal
field theory, where the deformation field is
\[ u = x_{-1}\tilde{x}_{-1}. \tag{109} \]
In this case, the infinitesimal isomorphism of Theorem 1 satisfies
 xn x n
= (110)
n
nZ

and the sufficient condition of exponentiability from the last section is met when
we take W to be the subspace consisting of states of momentum 0. Then W is closed
under vertex operators, $u \in W$ and the n = 0 term of (110) drops out in this case.
However, this is an example where regularization is needed. It can be realized as
follows: Write

= n
n>0

where


xn x
n xn x
n
n = .
n n
We have

exp = exp n . (111)
n>0

To calculate exp n explicitly, we observe that



n xn x
xn x n xn xn x
n xn
, = 1,
n n n n
and setting
xn x
n
e= ,
n
xn x
n
f = , (112)
n
xn xn xn x
n
h= 1,
n n
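we obtain the $sl_2$ Lie algebra
\[ [e, f] = h, \qquad [e, h] = -2e, \qquad [f, h] = 2f. \tag{113} \]
Note that conventions regarding the normalization of e, f, h vary, but the relations
(113) are satisfied for example for
\[ e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{114} \]
In $SL_2$, we compute
\[ \exp(\epsilon(f + e)) = \exp \epsilon \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} \cosh\epsilon & \sinh\epsilon \\ \sinh\epsilon & \cosh\epsilon \end{pmatrix} = \begin{pmatrix} 1 & \tanh\epsilon \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{\cosh\epsilon} & 0 \\ 0 & \cosh\epsilon \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \tanh\epsilon & 1 \end{pmatrix}. \tag{115} \]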
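The matrix statements (113) to (115) are easy to confirm numerically. The sketch below checks the three commutators for the 2 x 2 matrices of (114) and compares the product of the upper triangular, diagonal and lower triangular factors in (115) with the matrix of hyperbolic functions. The helper names are illustrative and not part of the text; the sign conventions are those fixed by the matrices in (114).

```python
import math

E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def hyperbolic_matrix(eps):
    """exp(eps * (F + E)) written out as [[cosh, sinh], [sinh, cosh]]."""
    c, s = math.cosh(eps), math.sinh(eps)
    return [[c, s], [s, c]]


def gauss_product(eps):
    """Product upper * diagonal * lower of the three factors in (115)."""
    t, c = math.tanh(eps), math.cosh(eps)
    upper = [[1.0, t], [0.0, 1.0]]
    diagonal = [[1.0 / c, 0.0], [0.0, c]]
    lower = [[1.0, 0.0], [t, 1.0]]
    return mat_mul(mat_mul(upper, diagonal), lower)
```

The top-left entry of the product is $1/\cosh\epsilon + \tanh\epsilon\,\sinh\epsilon = \cosh\epsilon$, which is where the identity $\cosh^2 - \sinh^2 = 1$ enters.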
In the translation (112), this is






xn xn xn xn xn x
n
exp tanh() exp (ln cosh ) + +1
n n n


xn xn
exp tanh() . (116)
n
To exponentiate the middle term, we claim
\[ \exp\left(z\,\frac{x_{-n}x_n}{n}\right) = {:}\exp\left(\frac{x_{-n}x_n}{n}(e^z - 1)\right){:} \tag{117} \]
To prove (117), differentiate both sides by z. On the left-hand side, we get
\[ \frac{x_{-n}x_n}{n}\exp\left(z\,\frac{x_{-n}x_n}{n}\right). \]
Thus, if the derivative by z of the right-hand side y of (117) is
\[ \frac{x_{-n}x_n}{n}\,{:}\exp\left(\frac{x_{-n}x_n}{n}(e^z - 1)\right){:}, \tag{118} \]
then we have the differential equation $y' = \frac{x_{-n}x_n}{n}\,y$, which proves (117) (looking
also at the initial condition at z = 0).
Now we can calculate (118) by moving the $x_n$ occurring before the normal order
symbol to the right. If we do this simply by changing (118) to normal order, we get
\[ {:}\frac{x_{-n}x_n}{n}\exp\left(\frac{x_{-n}x_n}{n}(e^z - 1)\right){:}, \tag{119} \]
but if we want equality with (118), we must add the terms coming from the commutator
relations $[x_n, x_{-n}] = n$, which gives the additional term
\[ (e^z - 1)\,{:}\frac{x_{-n}x_n}{n}\exp\left(\frac{x_{-n}x_n}{n}(e^z - 1)\right){:}. \tag{120} \]
Adding together (119) and (120) gives
\[ e^z\,{:}\frac{x_{-n}x_n}{n}\exp\left(\frac{x_{-n}x_n}{n}(e^z - 1)\right){:}, \tag{121} \]
which is the derivative by z of the right-hand side of (117), as claimed.
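The identity (117) can also be checked level by level. On a state obtained by applying $x_{-n}$ to the vacuum k times, the operator $x_{-n}x_n/n$ has eigenvalue k, while the normal ordered j-th power $(x_{-n})^j(x_n)^j/n^j$ has eigenvalue $k!/(k-j)!$ (using $[x_n, x_{-n}] = n$). The normal ordered exponential therefore acts by $\sum_j \binom{k}{j}(e^z-1)^j$, which should equal $e^{zk}$. The sketch below compares the two sides numerically; restricting to a single oscillator and to eigenvalues rather than operators is an assumption made for illustration, and the function names are not from the text.

```python
import math


def lhs_eigenvalue(z: float, k: int) -> float:
    """Eigenvalue of exp(z * x_{-n} x_n / n) on the level-k state."""
    return math.exp(z * k)


def rhs_eigenvalue(z: float, k: int) -> float:
    """Eigenvalue of the normal ordered exponential in (117) on the level-k state."""
    c = math.exp(z) - 1.0
    # The j-th term of the normal ordered series contributes c^j / j! * k! / (k - j)!.
    return sum(math.comb(k, j) * c ** j for j in range(k + 1))
```

The agreement is just the binomial theorem, $(1 + (e^z - 1))^k = e^{zk}$. Evaluating at $z = -\ln\cosh\epsilon$ gives $e^z - 1 = 1/\cosh\epsilon - 1$.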
Using (117), (116) becomes


1 xn xn
n = exp tanh()
cosh  n




1 xn xn xn x
n xn x
n
: exp 1 + : exp tanh()
cosh  n n n
(122)
which is in normal order. Let us write


1
n =  . (123)
cosh  n
Then the product

 = n
n1

is in normal order, and is the regularized isomorphism from the exponentiated
$\epsilon$-deformation W of the conformal field theory in vertex operator formulation on
to the original W . The inverse, which goes from W to W , is best calculated by
regularizing the exponential of . We get



0 1
exp((f e)) = exp 
1 0


cosh  sinh 
=
sinh  cosh 


1
0

1 tanh  cosh  1 0
=
0 1 tanh  1
0 cosh 


1 xn xn
= exp tanh()
cosh  n



1 xn xn x
n x
n
: exp 1 + :
cosh  n n


xn x
n
exp tanh() .
n
So expressing this as
1
n =  , (124)
cosh  n
the product

 = n
n1

is the regularized isomorphism from W to W .


Even though  and  are only elements of W , the element u() =  u is the
regularized chiral primary field in W , and can be used in a regularized version of
Eq. (8) to calculate the vacua on V , which will converge on non-degenerate Segal
worldsheets.
In this approach, however, the resulting CFT structure on V remains opaque,
while as it turns out, in the present case it can be identified by another method.
In fact, to answer the question, we must treat precisely the case missing in
Theorem 1, namely when the weight 0 part of the vertex operator of the deforming
field, which in this case is determined by the momentum, does not vanish. The
answer is actually known in string theory to correspond to constant deformation of
the metric on spacetime, which ends up isomorphic to the original free field theory.
From the point of view of string theory, what we shall give is a purely worldsheet
argument establishing this fact.
Let us look first at the infinitesimal deformation of the operator $Y(v, t, \bar{t})$ for
some field $v \in V$ which is an eigenstate of momentum. We have three forms which
coincide where defined:
\[ Y(x_{-1}\tilde{x}_{-1}, z, \bar{z})\,Y(v, t, \bar{t})\,dz\,d\bar{z} \tag{125} \]
\[ Y(v, t, \bar{t})\,Y(x_{-1}\tilde{x}_{-1}, z, \bar{z})\,dz\,d\bar{z} \tag{126} \]
\[ Y(Y(x_{-1}\tilde{x}_{-1}, z - t, \bar{z} - \bar{t})v, t, \bar{t})\,dz\,d\bar{z}. \tag{127} \]
By chiral splitting, if we assume v is a monomial in the modes, we can denote
(125)(127) by (without forming a sum of terms). Again, integrating (125)
(127) term by term dz, we get forms , 0 , t , respectively. Here we set

1
dz = ln z.
z
Again, these are branched forms. Selecting points $p_0$, $p_\infty$, $p_t$ on the corresponding
boundary components, we can, say, make cuts $c_{0,t}$ and $c_{0,\infty}$ connecting the
points $p_0$, $p_t$ and $p_0$, $p_\infty$. Cutting the worldsheet in this way, we obtain well defined
branches of these forms. To complicate things further, we have constant discrepancies
C0t = 0 t
(128)
C0 = 0 .
These can be calculated for example by comparing with the 4 point function
\[ Y_+(x_{-1}, z)Y(v, t) + Y(v, t)Y_-(x_{-1}, z) + Y(Y_-(x_{-1}, z - t)v, t) \tag{129} \]
where $Y_-(v, z)$ denotes the sum of the terms in Y(v, z) involving negative powers
of z, and $Y_+(v, z)$ is the sum of the other terms. Another way to approach this is
as follows: one notices that

Y (x1 , z)dz =  Y (1 , z)S |=0 (130)

where Sm denotes the operator which adds m to momentum. It follows that


C0t =  (Z(x1 , v, z, t)S Z(x1 , S v, z, t))|=0
(131)
C0 =  (Z(x1 , v, z, t)S S Z(x1 , v, z, t))|=0 .
Now the deformation is obtained by integrating the forms
0 , (132)
(t + C0t )
, (133)
( + C0 )
, (134)
on the boundary components around 0, t and $\infty$, and along both sides of the cuts
$c_{0,t}$, $c_{0,\infty}$. To get the integrals of the terms in (132)–(134) which do not involve the
discrepancy constants, we need to integrate
 
 xn 
n
z + x0 ln z x
m zm1
. (135)
n m
n
=0

To do this, observe that (pretending we work on the degenerate worldsheet, and
hence omitting scaling factors, taking curved integrals over $|z| = 1$),
 
ln z ln z 1
z=
d z = 2i ln z (2i)2
d (136)
z z 2

1
ln z zm1 d z = 2i zm . (137)
m
(Actually, the first term on the right-hand side of (136) depends on the branch of
the logarithm taken and hence cannot figure in the final result; the reader may note
that this is indeed the case.) Integrating (135), we obtain terms

 zm
2ix0 x
m + x0 ln z (138)
m
m
=0

which will cancel with the integral along the cuts (to calculate the integral over the
cuts, pair points on both sides of the cut which project to the same point in the
original worldsheet), and local terms
2i  xn x n 1
(2i)2 x0 x0 . (139)
2 n 2
n
=0

The discrepancies play no role on the cuts (as the forms C0t , C0 are
unbranched), but using the formula (131), we can compensate for the discrepancies
to linear order in  by applying on each boundary component
S2ix0 . (140)
In (138), however, when integrating , we obtain also discrepancy terms conjugate
to (140), so the correct expression is
S2ix0 S2ix0 . (141)
The term (141) is also local on the boundary components, so the sum of (139) and
(141) is the formula for the infinitesimal isomorphism between the free CFT and
the infinitesimally deformed theory. To exponentiate, suppose now we are working
in a D-dimensional free CFT, and the deformation field is
M x1 x
1 . (142)
Then the formula for the exponentiated isomorphism multiplies left momentum by
exp M (143)
and right momentum by
\[ \exp M^T. \tag{144} \]
But of course, in the free theory, the left momentum must be equal to the right
momentum, so this formula works only when M is a symmetric matrix. Thus, to cover the
general case, we must discuss the case when M is antisymmetric. In this case, it may
seem that we obtain indeed a different CFT which is defined in the same way as
the free CFT with the exception that the left momentum $m_L$ and right momentum
$m_R$ are related by the formula
\[ m_L = A m_R \]
for some fixed orthogonal matrix A. As it turns out, however, this theory is still
isomorphic to the free CFT. The isomorphism replaces the left moving oscillators $x_{i,n}$
by their transform via the matrix A (which acts on this Heisenberg representation
by transport of structure).
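The role of symmetric versus antisymmetric M in the passage above rests on two elementary facts about the matrix exponential: the transpose of $\exp M$ is $\exp M^T$, so the two factors agree when M is symmetric, and $\exp M$ is orthogonal when M is antisymmetric. The sketch below checks both numerically with a truncated power series; the helper names, the truncation order and the sample matrices are assumptions chosen for illustration.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(a):
    """Transpose of a square matrix."""
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]


def mat_exp(m, terms=40):
    """Truncated power series for the matrix exponential."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = [[x / order for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def is_close(a, b, tol=1e-9):
    """Entrywise comparison of two square matrices."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))
```

For the antisymmetric 2 x 2 matrix with off-diagonal entries $\theta$ and $-\theta$, `mat_exp` returns the rotation matrix with entries $\cos\theta$ and $\pm\sin\theta$, which is one instance of an orthogonal matrix A relating left and right momenta.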
Next, let us discuss the case of a deforming gravitational field of non-zero momentum,
i.e. when
u = M x1 x1 1 (145)
with = 0. Of course, in order for (145) to be of weight (1, 1), we must have
 = 0. (146)

Clearly, then, the metric cannot be Euclidean, hence there will be ghosts and a part
of our theory does not apply. Note that in order for (145) to be primary, we also
must have

M= i
i (147)
i

where

i , = 
i , = 0. (148)
Despite the indefinite signature, we still have the primary obstruction, which is

 
xk
xk k
coe z1 z1 : n z m1 zn1 exp
M xm x zk + z :
m,n
k k
k
=0

M x1 x
1 1 (149)
(we omit the z ,x0 term, since the power is 0 by (146)). In the notation (147),
this is

(i x0 i x0 x1 x1 + i x1 x1 + x1 i x1 )
i,j

(
j x0
j x
0
x1
x1 +
j x
1
x1 +
x1
j x
1 )M x1 x
1 1
which in the presence of (148) reduces to the condition


M 2 x1 x
1 = 0. (150)
This is false unless
M 2 = 0 (151)
which means that (145) is a null state, along which the deformation is not interesting
in the sense of string theory. More generally, the distributional form of (150) is

M ()2 = 0. (152)
2 =0

If we set
f () = 2 =0 M ()2
then the Fourier transform of f will be a function g satisfying
 2g
2 =0
i
where the signs correspond to the metric, which we assume is diagonal with entries
±1. The Fourier transform of the condition (152) is then
2
g = 0. (153)
i j
Assuming a decay condition under which the Fourier transform makes sense, (153)
implies g = 0, hence (151), so in this case also the obstruction is nonzero unless
(145) is a null state.
In this discussion, we restricted our attention to deforming fields of gravitational
origin. It is important to note that other choices are possible. As a very basic
example, let us look at the 1-dimensional Euclidean model. Then there is a possibility
of critical fields of the form
a12 + b12 . (154)
This includes the sine-Gordon interaction [69] when a = b. (We see hyperbolic
rather than trigonometric functions because we are working in Euclidean spacetime
rather than in the time coordinate, which is the case usually discussed.) The primary
obstruction in this case states that the weight (0, 0) descendant of (154) applied to
(154) is 0. Since the descendant is
(4ab)x1 ,
we obtain the condition a = 0 or b = 0. It is interesting to note that in the case of
the compactification on a circle, these cases were investigated very successfully by
Ginsparg [30], who used the obstruction to completely characterize the component
of the moduli space of c = 1 CFTs originating from the free Euclidean compactified
free theory. The result is that only free theories compactified at different radii, and
their Z/2-orbifolds, occur.
There are many other possible choices of non-gravitational deformation fields,
one for each field in the physical spectrum of the theory. We do not discuss these
cases in the present paper.
Let us now look at the N = 1-supersymmetric free field theory. In this case,
as pointed out above, in the NS-NS sector, critical gravitational fields for
deformations have weight (1/2, 1/2). We could also consider the NS-R and R-R sectors,
where the critical weights are (1/2, 0) and (0, 0), respectively. These deforming
fields parametrize soul directions in the space of infinitesimal deformations. The
soul parameters have weights (1/2, 0), (0, 1/2), which explains the difference of
critical weights in these sectors.
Let us, however, focus on the body of the space of gravitational deformations,
i.e. the NS-NS sector. Let us first look at the weight (1/2, 1/2) primary field

M 1/2 1/2 . (155)

The point is that the infinitesimal deformation is obtained by integrating the
insertion operators of
1/2 M 1/2 1/2 = M x1 x
G1/2 G 1 .

Therefore, (155) behaves exactly the same as a deformation along the field (142) in
the bosonic case. Again, if M is a symmetric matrix, exponentiating the deformation
leads to a theory isomorphic via scaling the momenta, while if M is antisymmetric,
the isomorphism involves transforming the left moving modes by the orthogonal
matrix exp(M).
In the case of non-zero momentum, we again have indefinite signature, and the
field

u = M 1/2 1/2 1 . (156)

Once again, for (156) to be primary, we must have (147), (148). Moreover, again
the actual infinitesimal deformation is obtained by applying the insertion operators
of $G_{-1/2}\tilde{G}_{-1/2}u$, so the treatment is exactly the same as for the deformation along
the field (145) in the bosonic case. Again, we discover that under a suitable decay
condition, the obstruction is always nonzero for gravitational deformations of
non-zero momentum.
It is worth noting that in both the bosonic and supersymmetric cases, one can
apply the same analysis to free field theories compactified on a torus. In this case,
however, scaling momenta changes the geometry of the torus, so using deformation
fields of 0 momentum, we find exponential deformations which change (constantly)
the metric on the torus. This seems to confirm, in the restricted sense investigated
here, a conjecture stated in [59].

Remark. Since one can consider Calabi-Yau manifolds which are tori, one sees
that there should also exist an N = 2-supersymmetric version of the free field theory
compactified on a torus. (It is in fact not difficult to construct such a model directly,
it is a standard construction.) Now since we are in the Calabi-Yau case, marginal
cc fields should correspond to deformations of complex structure, and marginal ac
fields should correspond to deformations of Kähler metric in this case.
But on the other hand, we already identified gravitational fields which should
be the sources of such deformations. Additionally, deformations in those directions
require regularization of the deformation parameter, and hence cannot satisfy the
conclusion of Theorem 3.
This is explained by observing that we must be careful with reality. The
gravitational fields we considered are in fact real, but neither chiral nor antichiral primary
in either the left or the right moving sector. By contrast, chiral primary (or
antichiral primary) fields are not real. This is due to the fact that $G^{+}_{-3/2}$ and $G^{-}_{-3/2}$ are
not real in the N = 2 superconformal algebra, but are in fact complex conjugate to
each other. Therefore, to get to the real gravitational fields, we must take real parts,
or in other words linear combinations of chiral and antichiral primaries, resulting
in the need for regularization.

It is in fact a fun exercise to calculate explicitly how our higher N = 2
obstruction theory operates in this case. Let us consider the N = 2-supersymmetric free
field theory, since the compactification behaves analogously. The minimum number
of dimensions for N = 2 supersymmetry is 2. Let us denote the bosonic fields by
x, y and their fermionic superpartners by , . Then the 0-momentum summand of
the state space (NS sector) is (a Hilbert completion of)


1
Sym(xn , yn |n < 0) r , r |r < 0, r Z + .
2

The body parts of the bosonic and fermionic vertex operators are given by the
usual formulas
 
Y (x1 , z) = xn z n1 , Y (y1 , z) = yn z n1 ,
 
Y (1/2 , z) = s z ns1/2 , Y (1/2 , z) = s z ns1/2 ,

[r , r ] = [r , r ] = 1
[xn , xn ] = [yn , yn ] = n.

We have, say,

G13/2 = 1/2 x1 + 1/2 y1


G23/2 = 1/2 y1 1/2 x1 .

As usual,
$$G^{\pm}_{-3/2} = \frac{1}{\sqrt{2}}\left(G^{1}_{-3/2} \pm i G^{2}_{-3/2}\right).$$
With these conventions, we have a critical chiral primary


u = 1/2 i1/2 (157)
(and its complex conjugate critical antichiral primary). We then see that for a
non-zero coefficient C,
$$C G^{-}_{-1/2} u = x_{-1} - i y_{-1}. \tag{158}$$
We now notice that formulas analogous to (112) etc. apply to (158), but the 1
summands of h will appear with opposite signs for the real and imaginary summands
and so cancel out; the regularizations (123), (124) are therefore not needed, as expected.
Next, let us study the formula (81). The key observation here is that we have
the combinatorial identity
$$\frac{1}{n_1 \cdots n_k} = \sum_{\sigma} \frac{1}{(n_{\sigma(1)} + \cdots + n_{\sigma(k)})(n_{\sigma(1)} + \cdots + n_{\sigma(k-1)}) \cdots n_{\sigma(1)}} \tag{159}$$
where the sum on the right is over all permutations $\sigma$ on the set {1, . . . , k}. Now in
the present case, we have the infinitesimal isomorphism on the 0-momentum part,
up to non-zero coefficient,
 (xn iyn )( xn + i
yn )
= (160)
n
and in the absence of regularization, the expansion of the exponentiated
isomorphism on the 0-momentum parts is simply
exp().
(The + sign in the s is caused by the fact that we are in the complex conjugate
Hilbert space.) Applying this to (157), we see that we have formulas analogous to
(116)-(122), and applying the exponentiated isomorphism to (157), all the terms
in normal order involving $x_{>0}$, $y_{>0}$ will vanish, so we end up with


(xn iyn )(
xn + i
yn )
exp D u
n<0
n

for some non-zero coefficient D. Applying (159), we get (81).
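The combinatorial identity (159) is easy to check numerically. The following is a minimal sketch in exact rational arithmetic, not part of the original argument; the function names `lhs_159` and `rhs_159` are hypothetical and chosen here only for illustration.

```python
from fractions import Fraction
from itertools import permutations


def lhs_159(ns):
    """Left-hand side of (159): 1 / (n_1 * ... * n_k)."""
    prod = 1
    for n in ns:
        prod *= n
    return Fraction(1, prod)


def rhs_159(ns):
    """Right-hand side of (159): sum over all orderings of the n_i of the
    reciprocal of the product of the partial sums
    n_s(1), n_s(1)+n_s(2), ..., n_s(1)+...+n_s(k)."""
    total = Fraction(0)
    for perm in permutations(ns):
        partial_sum = 0
        denominator = 1
        for n in perm:
            partial_sum += n
            denominator *= partial_sum
        total += Fraction(1, denominator)
    return total


if __name__ == "__main__":
    for sample in ([1, 2, 3], [2, 2, 5], [1, 3, 4, 7]):
        print(sample, lhs_159(sample), rhs_159(sample))
```

For k = 2 the identity reduces to 1/((a + b)a) + 1/((a + b)b) = 1/(ab), which is the step used in an induction on k.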


Finally, the obstruction in chiral form
u (0), (G
1/2 u)(zk ), . . . , (G1/2 u)(z1 ), u(0)

must vanish identically. To see this, we simply observe from (157), (158) that in the
present case, u is in the coset model with respect to $G^{-}_{-1/2}u$ (see the discussion
following formula (250) below). Thus, in the N = 2-free field theory, the obstruction
theory works as expected, and in the case discussed, the obstructions vanish. It is
worth noting that in 2n-dimensional N = 2-free field theory, we thus have an
$n^2$-dimensional space of cc + aa real fields, and an $n^2$-dimensional space of real ca + ac
fields, and although regularization occurs, there is no obstruction to exponentiating
the deformation by turning on any linear combination of those fields. For a free N =
2-theory compactified on an n-dimensional abelian variety, this precisely recovers
the deformations in the corresponding component of the moduli space of
Calabi-Yau varieties.
However, other deformations exist. For an interesting calculation of deformations
of the N = 2-free field theory in sine-Gordon directions, see [13].

5. The Gepner Model of the Fermat Quintic


The finite weight states of one chirality (say, left moving) of the Gepner model
of the Fermat quintic are embedded in the 5-fold tensor product of the N = 2-
supersymmetric minimal model of central charge 9/5 [24, 25, 32]. More precisely,
the Gepner model is an orbifold construction. This construction has two versions.
In [24, 25, 32], one is interested in actual string theories, so the 5-fold tensor product
of central charge 9 of N = 2 minimal models is tensored with a free supersymmetric
CFT on 4 Minkowski coordinates. This is then viewed in lightcone gauge, so
in effect, one tensors with a 2-dimensional supersymmetric Euclidean free CFT,
resulting in an N = 2-supersymmetric CFT of central charge 12. Finally, one performs
an orbifolding/GSO projection to give a candidate for a theory for which both
modularity and spacetime SUSY can be verified. (Actually, this is still not quite
precise, as in Gepner's original work, the true point of interest is the construction
of heterotic string theories; from our point of view, however, the difference does not
matter.)
What we care about is that it is also possible to create an orbifold theory of
central charge 9 which is the candidate for the nonlinear σ-model itself, without the
spacetime coordinates. (The spacetime coordinates can be added to this construction
and the usual GSO projection performed if one is interested in the corresponding
string theory.)
The essence of this construction not involving the spacetime coordinates is
formula (2.10) of [33]. In the case of the level 3 N = 2-minimal model (more precisely,
the unitary N = 2 Virasoro minimal model of A-type), the orbifold construction
is with respect to the diagonal Z/5-action which acts on the eigenstates of
$J_0$-eigenvalue (= U(1)-charge) j/5 by $e^{2\pi i j/5}$. As we shall review, the NS part of the
level 3 N = 2 minimal model has two sectors of U(1)-charge j/5, which we will

for the moment ad hoc denote Hj/5 and Hj/5 for j Z/5Z. In the FF realization
(see below), these sectors correspond to  = 0,  = 1, respectively. Then the NS-NS
sector of the 5-fold tensor product of minimal model has the form
5
 
i /5 Hi /5 H
(Hik /5 H i/5 ). (161)
k k k
(ik ) k=1

The corresponding sector of the orbifold construction (formula (2.10) of [33]) has
the form
5
  

(Hik /5 H
(j+i
k )/5
Hik /5 H 
(j+i
k )/5
). (162)
P
(ik ): ik 5Z jZ/5Z k=1
Mathematically speaking, this orbifold can be constructed by noting that, ignoring
for the moment supersymmetry, the N = 2-minimal model is a tensor product
of the parafermion theory of the same level and a lattice theory (see [31] and
also below). The orbifold construction does not affect the parafermionic factor,
and on the lattice coordinate, which in this case does not possess a non-zero
Z/2-valued form, and hence physically models a free theory compactified on a torus,
the orbifold simply means replacing the torus by its factor by the free action of
the diagonal Z/5 translation group, which is represented by another lattice theory.
On this construction, N = 2 supersymmetry is then easily restored using the same
formulas as in (161), since the U(1)-charge of the G's is integral.
The calculations in this and the next section proceed entirely in the orbifold
(162), and hence can be derived from the structure of the level 3 N = 2-minimal
model. It should be pointed out that a mathematical approach to the fusion rules
of the N = 2 minimal models was given in [42]. We shall use the Coulomb gas
realization of the N = 2-minimal model, cf. [34, 53]. Let us restrict attention to
the NS sector. Then, essentially, the left moving sector of the minimal model is a
subquotient of the lattice theory where the lattice is 3-dimensional, and spanned by

      
3 1 i 2 i 2 2 2
, 0, 0 , , , , , 0, i .
15 15 2 5 2 3 15 3

We will adopt the convention that we shall abbreviate

$$(k, \ell, m)_{MM} = (k, \ell, m)$$

for the lattice label


   
k i 2 mi 2
, , .
15 2 5 2 3

We shall also write

$$(\ell, m)_{MM} = (m, \ell, m)_{MM}.$$

Call the oscillator corresponding to the jth coordinate xj,m , j = 0, 1, 2. Then the
conformal vector is

1 2 1 2 i 2 1
x x + x1,2 + x22,1 . (163)
2 0,1 2 1,1 2 5 2
The superconformal algebra is generated by
   
1 5
+
G3/2 = i x2,1 x1,1 1( 5 ,0,i 2 ) ,
2 3 15 3

    (164)
1 5
G
3/2 = i x2,1 + x1,1 1( 5 ,0,i 2 ) .
2 3 15 3
For future reference, we will sometimes use the notation

$$(a, b, c)x_n = a x_{0,n} + b x_{1,n} + c x_{2,n}$$

and also sometimes abbreviate

$$(b, c)x_n = (0, b, c)x_n.$$

The module labels are realized by labels

(, m) = 1( m 2 2 , (165)
15
, i
2
im
3, 2 3)

$$0 \le \ell \le 3, \qquad m = -\ell, -\ell + 2, \ldots, \ell - 2, \ell. \tag{166}$$

It is obvious that to stay within the range (166), we must understand the fusion rules
and how they are applied. The basic principle is that labels are identified as follows:
No identifications are imposed on the 0th lattice coordinate. This means that upon
any identification, the 0th coordinate must be the same for the labels identified.
Therefore, the identification is governed by the 1st and 2nd coordinates, which give
the Coulomb gas (= Feigin-Fuchs) realization of the corresponding parafermionic
theory (the Z/3 parafermion model). The key objects here are the parafermionic
currents
   
1 5
1,2/3 = i x2,1 x1,1 1(0,i 2 ) ,
2 3 3 PF

    (167)
1 5
+
1,2/3 = i x2,1 + x1,1 1(0,i 2 )
2 3 3 PF

(the 0th coordinate is omitted). Clearly, the parafermionic currents act on the
labels by

1,2/3 : (, m)P F (, m + 2)P F ,


(168)
+
1,2/3 : (, m)P F (, m 2)P F .

The lattice labels $(\ell, m)_{PF}$ allowed are those which have non-negative weight. This
condition coincides with (166). Now we impose the identification for parafermionic
labels:

$$(\ell, m)_{PF} = (3 - \ell, m \pm 3)_{PF}.$$

This implies

$$(1, 1)_{PF} \sim (2, -2)_{PF}, \qquad (1, -1)_{PF} \sim (2, 2)_{PF}, \qquad (0, 0)_{PF} \sim (3, 3)_{PF} \sim (3, -3)_{PF}. \tag{169}$$
Now in the Gepner model corresponding to the quintic, the cc fields allowed are

((3, 2, 0, 0, 0)L, (3, 2, 0, 0, 0)R), (170)


((3, 1, 1, 0, 0)L, (3, 1, 1, 0, 0)R), (171)
((2, 2, 1, 0, 0)L, (2, 2, 1, 0, 0)R), (172)
((2, 1, 1, 1, 0)L, (2, 1, 1, 1, 0)R), (173)
((1, 1, 1, 1, 1)L, (1, 1, 1, 1, 1)R), (174)

and the ac field allowed is

((1, 1, 1, 1, 1)L, (1, 1, 1, 1, 1)R). (175)

Here we wrote  for 1( , )M M , ( = 0, . . . , 3), which is a chiral primary in the N = 2


minimal model of weight /10, and  for 1( , )M M , which is antichiral primary
of weight $\ell/10$. The tuple notation in (170)-(175) really means tensor product. We
omit permutations of the fields (170)-(173), so counting all permutations, there are
101 fields (170)-(174).
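The count of 101 can be reproduced by a short enumeration. The sketch below is illustrative only; it assumes that the patterns (170)-(174) are exactly the 5-tuples with entries in {0, 1, 2, 3} whose entries sum to 5 (each entry contributing weight equal to one tenth of its value, for total weight 1/2), and the names `CC_PATTERNS`, `distinct_orderings` and `brute_force_count` are hypothetical.

```python
from itertools import permutations, product

# The five patterns listed in (170)-(174), up to permutation of the tensor factors.
CC_PATTERNS = [
    (3, 2, 0, 0, 0),
    (3, 1, 1, 0, 0),
    (2, 2, 1, 0, 0),
    (2, 1, 1, 1, 0),
    (1, 1, 1, 1, 1),
]


def distinct_orderings(pattern):
    """Number of distinct tuples obtained by permuting the entries of pattern."""
    return len(set(permutations(pattern)))


def brute_force_count():
    """Count all 5-tuples with entries in 0..3 whose entries sum to 5
    (assumption: these are exactly the cc fields of total weight 1/2)."""
    return sum(1 for t in product(range(4), repeat=5) if sum(t) == 5)


if __name__ == "__main__":
    per_pattern = [distinct_orderings(p) for p in CC_PATTERNS]
    print(per_pattern, sum(per_pattern), brute_force_count())
```

Both counts agree, which also confirms that no pattern with entries summing to 5 is missing from the list.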
We will need an understanding of the fusion rules in the Z/3 parafermion
model and N = 2-supersymmetric minimal model of central charge 9/5. In the
Z/3 parafermion model, we have 6 labels

$$(0, 0)_{PF}, \quad (3, 1)_{PF}, \quad (3, -1)_{PF}, \tag{176}$$

$$(1, 1)_{PF}, \quad (1, -1)_{PF}, \quad (2, 0)_{PF}. \tag{177}$$
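The number of labels can be checked by enumerating the range (166) and applying the parafermionic identification. The sketch below assumes the level 3 parafermion weight in the form $\ell(\ell+2)/20 - m^2/12$ and the identification of $(\ell, m)$ with $(3 - \ell, m \pm 3)$; the function names are hypothetical.

```python
from fractions import Fraction


def pf_weight(l, m):
    """Assumed weight of the level 3 PF ground state with label (l, m)."""
    return Fraction(l * (l + 2), 20) - Fraction(m * m, 12)


def labels_in_range():
    """All labels with 0 <= l <= 3 and m = -l, -l+2, ..., l, as in (166)."""
    return [(l, m) for l in range(4) for m in range(-l, l + 1, 2)]


def identification_classes():
    """Group the labels of (166) into classes under (l, m) ~ (3 - l, m +/- 3)."""
    allowed = set(labels_in_range())
    classes, seen = [], set()
    for start in sorted(allowed):
        if start in seen:
            continue
        cls, stack = set(), [start]
        while stack:
            l, m = stack.pop()
            if (l, m) in cls:
                continue
            cls.add((l, m))
            for shift in (3, -3):
                partner = (3 - l, m + shift)
                if partner in allowed and partner not in cls:
                    stack.append(partner)
        seen |= cls
        classes.append(sorted(cls))
    return classes


if __name__ == "__main__":
    for cls in identification_classes():
        print(cls, [str(pf_weight(l, m)) for l, m in cls])
```

Under these assumptions the ten labels of (166) fall into six classes, and a label of the right parity has non-negative weight exactly when it lies in the range (166).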

This can be described


as follows: the labels (176) have the same fusion rules as the
lattice L = i 6 C, i.e.

L /L (178)

where L is the dual lattice (into which


 L is embedded using the standard quadratic
form on C). This dual lattice is  2i 23 , and the fusion rule is abelian, which means
that the product of labels has only one possible label as outcome, and is described
by the product in L /L. The label 2i 23 corresponds to (0, 2)P F (3, 1)P F .
Next, the product of $(2, 0)_{PF}$ with $(3, 1)_{PF}$ has only one possible outcome,
$(2, -2)_{PF} = (1, 1)_{PF}$. The product of $(2, 0)_{PF}$ with itself has two possible
outcomes, $(2, 0)_{PF}$ and $(0, 0)_{PF}$. All other products are determined by commutativity,
associativity and unitality of fusion rules.
The result can be summarized as follows: We call (176) level 0, 3 labels and
(177) level 1, 2 labels. Every level 1, 2 label has a corresponding label of level 0, 3.
The correspondence is

$$(0, 0)_{PF} \leftrightarrow (2, 0)_{PF}, \qquad (3, 1)_{PF} \leftrightarrow (1, 1)_{PF}, \qquad (3, -1)_{PF} \leftrightarrow (1, -1)_{PF}. \tag{179}$$

As described above, the fusion rules on level 0, 3 are determined by the lattice
theory of L. Additionally, multiplication preserves the correspondence (179), while
the level of the product is restricted only by requiring that any level added to level
0, 3 is the original level.
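To put it in another way still, the Verlinde algebra is

$$\mathbb{Z}[\alpha]/(\alpha^3 - 1) \otimes \mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1) \tag{180}$$

where $\alpha = (3, 1)_{PF}$ and $\epsilon = (2, 0)_{PF}$.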
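The structure just described can be made concrete in a few lines. In the sketch below, which is an illustration under stated assumptions rather than the construction used in the paper, a label is encoded as a pair (a, lvl): a in Z/3 indexes the level 0, 3 labels of (176), with a = 1 standing for $(3, 1)_{PF}$, and lvl = 1 marks the corresponding level 1, 2 label under (179), so that (0, 1) stands for $(2, 0)_{PF}$. The encoding and the function names are hypothetical.

```python
from itertools import product

# Six labels: (a, 0) are the level 0,3 labels, (a, 1) the matching level 1,2 labels.
LABELS = [(a, lvl) for a in range(3) for lvl in range(2)]


def fuse(x, y):
    """Fusion product of two labels as a dict {label: multiplicity}."""
    (a, i), (b, j) = x, y
    c = (a + b) % 3  # abelian fusion on the Z/3 part
    if i == 1 and j == 1:
        # (2,0) x (2,0) has the two outcomes (2,0) and (0,0)
        return {(c, 0): 1, (c, 1): 1}
    return {(c, i + j): 1}  # adding a level to level 0,3 gives the original level


def fuse_combos(c1, c2):
    """Bilinear extension of fuse to formal non-negative integer combinations."""
    out = {}
    for (x, mx), (y, my) in product(c1.items(), c2.items()):
        for res, n in fuse(x, y).items():
            out[res] = out.get(res, 0) + mx * my * n
    return out


def is_associative():
    """Check (x*y)*z == x*(y*z) on all triples of labels."""
    for x, y, z in product(LABELS, repeat=3):
        if fuse_combos(fuse(x, y), {z: 1}) != fuse_combos({x: 1}, fuse(y, z)):
            return False
    return True


if __name__ == "__main__":
    print(fuse((0, 1), (0, 1)), is_associative())
```

In this encoding the generator (1, 0) has order three and the generator (0, 1) squares to itself plus the unit, which are the two relations appearing in (180).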


In the N = 2 supersymmetric minimal model (MM) case, we allow labels

$$(3k + m, \ell, m)_{MM} \tag{181}$$

where $(\ell, m)$ is a PF label, $k \in \mathbb{Z}$. Two labels (181) are identified subject to
identifications of PF labels, and also

$$(j, \ell, m)_{MM} \sim (j + 15, \ell, m)_{MM}, \tag{182}$$

and, as a result of SUSY,

$$(j, \ell, m)_{MM} \sim (j - 5, \ell, m - 2)_{MM}. \tag{183}$$

(By $\sim$ we mean that the labels (i.e. VA modules) are identified, but we do not imply
that the states involved actually coincide; in the case (183), they have different
weights.) Recalling again that we abbreviate $(m, \ell, m)_{MM}$ as $(\ell, m)_{MM}$, we get the
following labels for the c = 9/5 N = 2 SUSY MM:

(0, 0)MM (2, 0)MM


(3, 3)MM (2, 2)MM
(3, 1)MM (1, 1)MM (184)
(3, 1)MM (1, 1)MM
(3, 3)MM (2, 2)MM .

Again, the left column (184) represents 0, 3 level labels, the right column represents
level 1, 2 labels. The left column labels multiply as the labels of the lattice super-
CFT corresponding to the lattice in C C spanned by
  
5 2
( 15, 0), , i (185)
15 3
(recall that a super-CFT can be assigned to a lattice with integral quadratic form;
the quadratic form on $\mathbb{C} \oplus \mathbb{C}$ is the standard one, the complexification of the
Euclidean inner product). The dual lattice of (185) is spanned by

  
3 5 2
,0 , ,i , (186)
15 15 3

which correspond to the labels (3, 0, 0)MM , (5, 3, 1)MM , respectively. We see that

 /
= Z/5. (187)

In (184), the rows (counted from top to bottom as 0, . . . , 4) match the corresponding
residue class (187). The fusion rules for $(2, 0)_{MM}$, $(0, 0)_{MM}$ are the same as in the
PF case. Hence, again, multiplication of labels preserves the rows (184), and the
Verlinde algebra is isomorphic to

$$\mathbb{Z}[\beta]/(\beta^5 - 1) \otimes \mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1) \tag{188}$$

where $\beta$ is $(3, 3)_{MM}$ and $\epsilon$ is $(2, 0)_{MM}$.

Remark. As remarked in Sec. 3, the positive definiteness of the modular functor,
which is crucial for our theory to work, is a requirement for a physical CFT. It
is interesting to note, however, that if we do not include this requirement, other
choices of real structure are possible on the modular functor: The Verlinde
algebra of a lattice modular functor with another modular functor M with two
labels 1 and , and Verlinde algebras (180), (188) are tensor products of lattice
Verlinde algebras and the algebra

Z[]/(2  1).

The real structure of this last modular functor can be changed by multiplying by
−1 the complex conjugation in $M_\Sigma$ for a worldsheet $\Sigma$ precisely when $\Sigma$ has an
odd number of boundary components labelled on level 1, 2. The resulting modular
functor of this operation is not positive-definite.

Let us now discuss the question of vertex operators in the PF realization of the
minimal model. Clearly, since the 0th coordinate acts as a lattice coordinate and is
not involved in renaming, it suffices to address the question for the parafermions. Now in the
Feigin-Fuchs realization of the level 3 PF model, any state can be written as

u1 (189)

where is one of the labels (166) and u is a state of the Heisenberg representation of
the Heisenberg algebra generated by xi,m , i = 1, 2, m = 0. The situation is however
further complicated by the fact that not all Heisenberg states u are allowed for a
given label . We shall call the states which are in the image of the embedding
admissible. For example, since the $\ell = 0$ part of the PF model is isomorphic to the
coset model $SU(2)/S^1$ of the same level, states

$$(a, b)x_{-1}(0, 0)_{PF} \tag{190}$$

are not admissible for $(a, b) \neq (0, 0)$. One can show that admissible states are exactly
those which are generated from the ground states (166) by vertex operators and
PF currents. Because not all states are admissible, however, there are also states
whose vertex operators are 0 on admissible states. Let us call them null states. For
example, since (190) is not admissible, it follows that

(a, b)x1 (3, 3)P F , (191)

which is easily seen to be admissible for any choice of (a, b), is null.
Determining explicitly which states are admissible and which are null is
extremely tricky (cf. [34]). Fortunately, we do not need to address the question
for our purposes. This is because we will only deal with states which are explicitly
generated by the primary fields, and hence automatically admissible; because of
this, we can ignore null states, which do not affect correlation functions of
admissible states.
On the other hand, we do need an explicit formula for vertex operators. One
method for obtaining vertex operators is as follows. We may rename fields using
the identifications (169) and also PF currents: a PF current applied to a renamed
field must be equal to the same current applied to the original field. Note that this
way we may get Heisenberg states above labels which fail to satisfy (166). Such
states are also admissible, even though the corresponding ground states (which
have the same name as the label) are not. Now if we have two admissible states

$$u_i 1_{(\ell_i, m_i)}, \quad i = 1, 2$$

where $0 \le \ell_i \le 3$ and $\ell_1 + \ell_2 \le 3$, then the lattice vertex operator

$$(u_1 1_{(\ell_1, m_1)})(z)\, u_2 1_{(\ell_2, m_2)} \tag{192}$$

always satisfies our fusion rules, and (up to scalar multiple constant on each module)
is a correct vertex operator of the PF theory. This is easily seen simply by the fact
that (192) intertwines correctly with module vertex operators (which are also lattice
operators).
While in our examples, it will suffice to always consider operators obtained
in the form (192), it is important to realize that they do not describe the PF
vertex operators completely. The problem is that when we want to iterate vertex
operators, we would have to keep renaming states. But when two ground states 1 ,
1 are identied via the formula (169), it does not follow that we would have

u1 = u1 (193)
for every Heisenberg state u. On the contrary, we saw for example that (190) is
inadmissible, while (191) is null. One also notes that one has for example the
identification
x1 1 = L1 1 = L1 1 = x1 1 , (194)
which is not of the form (193).
Because of this, to describe completely the full force of the PF theory, one needs
another device for obtaining vertex operators (although we will not need this in the
present paper). Briefly, it is shown in [34] that up to scalar multiple, any vertex
operator
u(z)v = Y (u, z)(v)
where u, v are admissible states can be written as
 
(ak x1 1(0,2) )(tk ) (a1 x1 1(0,2) (t1 )u(z)vdt1 dtk (195)

where the operators in the argument (195) are lattice vertex operators and the
number k is selected to conform with the given fusion rule. While it is easy to show
that operators of the form (195) are correct vertex operators on admissible states
(again up to scalar multiple constant on each irreducible module), as the screening
operators
ax1 1(0,2)
commute with PF currents, selecting the bounds of integration (contours) is much
more tricky. Despite the notation, it is not correct to imagine these as integrals over
closed curves, at least not in general. One approach which works is to bring the
argument of (195) to normal order, which expands it as an infinite sum of terms of
the form

(ti tj )ij tk k (196)
(where we put $t_0 = z$) with coefficients which are lattice vertex operators. Then
to integrate (196) when all the exponents are positive, we may simply integrate $t_i$ from 0 to $t_{i-1}$, and
define the integral by analytic continuation in the exponents otherwise.
The functions obtained in this way are generalized hypergeometric functions,
and fail for example the assumptions of Theorem 1 (see Remark 2 after the theo-
rem). The explanation is in the fact that, as we already saw, the fusion rules are
not abelian in this case.

6. The Gepner Model: The Obstruction


We will now show that for the Gepner model of the Fermat quintic, the function
(95) may not vanish for the deforming field (170). This means that not all perturbative
deformations corresponding to marginal fields exist in this case. We emphasize that
our result applies to deformations of the CFT itself (of central charge 9). A different
approach is possible by embedding the model to string theory, and investigating
the deformations in that setting (cf. [16]). Our results do not automatically apply
to deformations in that setting.
We will consider

v = u = (3, 3, 3) (2, 2, 2).

(In the remaining three coordinates, we will always put the vacuum, so we will
omit them from our notation.) First note that by Theorem 3, this is actually the
only relevant case (95), since the only other chiral primary field of weight 1/2
with only two non-vacuum coordinates is (2, 2, 2) (3, 3, 3), which cannot correlate
with the right-hand side of (95), whose first coordinate is on level 0, 3. In any
case, we will show therefore that the Gepner model has an obstruction against
continuous perturbative deformation along the field (170) in the moduli space of
exact conformal field theories.
Now the chiral correlation function (95) is a complicated multivalued function
because of the integrals (196), which are generalized hypergeometric functions. As
remarked above, the modular functor has a canonical flat connection on the space
of degenerate worldsheets whose boundary components are shifts of the unit circle
with the identity parametrization. The flat connection comes from the fact that
these degenerate worldsheets are related to each other by applying $\exp(zL_{-1})$ to
their boundary components. This is why we can speak of analytic continuation
of a branch of the correlation function corresponding to a particular fusion rule.
It can further be shown (although we do not need to use that result here) that
the continuations of the correlation function corresponding to any one particular
fusion rule generate the whole correlation function (i.e. the whole modular functor
is generated by any one non-zero section).
Let us now investigate which number m we need in (95). In our case, we have

G
1/2 (u) = G1/2 (3, 3, 3) (2, 2, 2) (3, 3, 3) G1/2 (2, 2, 2). (197)

(The sign will be justied later;it is not needed at this point.) The rst sum-
mand (197) has x0,0 -charge (2/ 15, 2/ 15), the second has x0,0 -charge (3/ 15,
3/ 15). Thus, the charges can add up to 0 only if m is a multiple of 5. The smallest
possible obstruction is therefore for m = 5, in which case (95) is a 7 point function.
Let us focus on this case. This function however is too big to calculate completely.
Because of this, we use the following trick.
First, it is equivalent to consider the question of vanishing of the function

1|(G
1/2 u)(z5 ) (G1/2 u)(z1 )u(z0 )
u(t) . (198)

Now by the OPE, it is possible to transform any correlation function of the form

 | v(z)w(t) (199)
to the correlation function


 | (vn w)(t) (200)
(all other entries are the same). More precisely, (199) is expanded, in a certain
range and choice of branch, into a series in $z - t$ with coefficients (200) for values
of n belonging to a coset Q/Z. By the above argument, therefore, the function
(199) vanishes if and only if the function (200) vanishes for all possible choices of
n associated with one fixed choice of fusion rule.
In the case of (198), we shall divide the fields on the right-hand side into two
sets $G_x$, $G_y$ containing two copies of $G^{-}_{-1/2}u$ each, and a set $G_z$ containing the
remaining three fields $u$, $\bar{u}$ and $G^{-}_{-1/2}u$. Each set $G_x$, $G_y$, $G_z$ will be reduced to a
single field using the transition from (199) to (200) (twice in the case of $G_z$). To
simplify notation (eliminating the subscripts), we will denote the fields resulting
from $G_x$, $G_y$, $G_z$ by a(x), b(y), c(z), respectively. Thus, x, y, z are appropriate
choices among the variables $z_i$, t, depending on how the transition from (199) to (200)
is applied.
This reduces the correlation function (198) to
$$\langle 1 | a(x) b(y) c(z) \rangle. \tag{201}$$
Most crucially, however, we make the following simplification: We shall choose the
fusion rules in such a way that the fields a, b, c are level 0, 3 in the Feigin-Fuchs
realization, and at most one of the charges will be 3 (in each coordinate). Then,
(201) is just a lattice correlation function, for the computation of which we have
an algorithm.
To make the calculation correctly, we must keep careful track of signs. When
taking a tensor product of super-CFTs, one must add appropriate signs analogous
to the Koszul-Milnor signs in algebraic topology. Now a modular functor of a
super-CFT decomposes into an even part and an odd part. Additionally, more than one
choice of this decomposition may be possible for the same theory, depending on
which bottom states of irreducible modules are chosen as even or odd. The sign of
a fusion rule is then determined by whether composition along the pair of pants
with given labels preserves parity of states or not. Mathematically, this phenomenon
was noticed by Deligne in the case of the determinant line (cf. [50]). (Deligne also
noticed that in some cases no consistent choice of signs is possible and a more
refined formalism is needed; a single fermion of central charge 1/2 is an example;
this is also discussed in [50]. However, this will not be needed here.)
In the case of the N = 2-minimal model, there is a choice of parities of ground
states of irreducible modules which make the whole modular functor (all the fusion
rules) even: simply choose the parity of $(k, \ell, m)$ to be k mod 2. We easily see that
this is compatible with supersymmetry.
Now in this case of completely even modular functor, the signs simplify, and we
put
Y (u v, z)(r s) = (1)(r)(v) Y (u, z)r Y (v, z)s (202)
(where (u) means the parity of u). Regarding supersymmetry (if present), an
element H of the superconformal algebra also acts on a tensor product by
H(u v) = Hu v + (1)(H)(u) u Hv, (203)
in particular
G
1/2 (u v) = (G1/2 u) v + (1)
(u)
u (G
1/2 v). (204)
We see that because of (204), the fields a, b, c may have the form of sums of several
terms.

Example 1. Recall that the inner product (more precisely symmetric bilinear
form) of labels considered as lattice points is
$$\langle (r_1, s_1, t_1), (r_2, s_2, t_2) \rangle = \frac{r_1 r_2}{15} + \frac{s_1 s_2}{10} - \frac{t_1 t_2}{6}. \tag{205}$$
Recall also (from the definition of the energy-momentum tensor) that the weight of the
label ground states is calculated by
$$w(r, s, t)_{MM} = \frac{r^2}{30} + w(s, t)_{PF} = \frac{r^2}{30} + \frac{s(s + 2)}{20} - \frac{t^2}{12}. \tag{206}$$
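The bilinear form (205) and the weight formula (206) are easy to evaluate mechanically. The following is a minimal sketch using exact rational arithmetic; the function names `inner`, `weight` and `tensor_weight` are illustrative only, and the minus sign in front of the $t$-terms is an assumption (it is the sign for which $(1, 1, 1)$ pairs to 0 with itself and has weight 1/10).

```python
from fractions import Fraction


def inner(a, b):
    """Bilinear form (205) on level-3 labels (r, s, t), assuming a minus sign on the t-term."""
    (r1, s1, t1), (r2, s2, t2) = a, b
    return Fraction(r1 * r2, 15) + Fraction(s1 * s2, 10) - Fraction(t1 * t2, 6)


def weight(label):
    """Weight (206) of the ground state of the label (r, s, t)."""
    r, s, t = label
    return Fraction(r * r, 30) + Fraction(s * (s + 2), 20) - Fraction(t * t, 12)


def tensor_weight(labels):
    """Total weight of a tensor product of label ground states."""
    return sum((weight(label) for label in labels), Fraction(0))


if __name__ == "__main__":
    print(weight((1, 1, 1)))                        # ground-state weight of (1, 1, 1)
    print(tensor_weight([(3, 3, 3), (2, 2, 2)]))    # weight of (3, 3, 3) tensor (2, 2, 2)
    print(tensor_weight([(3, 0, 0), (3, 0, 0)]))    # a weight that depends only on squares
```

Weights of labels with vanishing second coordinate depend only on squares, so they are insensitive to the signs of the first and third coordinates.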
Now we have
$$u = (3, 3, 3) \otimes (2, 2, 2) = (3, 0, 0) \otimes (2, 1, -1). \tag{207}$$
We begin by choosing the field c. Compose first u and
$$\bar{u} = (-3, 3, -3) \otimes (-2, 2, -2) = (-3, 0, 0) \otimes (-2, 1, 1). \tag{208}$$
We choose the non-zero $u_n \bar{u}$ of the bottom weight for the fusion rule which adds
the lattice charges on the right-hand side of (207), (208). The result is
$$u_{1/10}\, \bar{u} = (0, 0, 0) \otimes (0, 2, 0). \tag{209}$$
Next, apply G1/2 u to (209). Again, we will choose the bottom descendant. Now

G1/2 u has two summands,
(2, 3, 1) (2, 1, 1) (210)
and
(3, 0, 0) (0, 5, 3)x1 (3, 1, 3) (211)
(the term (211) involves renaming to stay within no-ghost PF labels after compo-
sition). Applying (210) to (209) gives bottom descendant
(2, 3, 1) (2, 3, 1) of weight 8/5, (212)
applying (211) to (209) gives bottom descendant
(3, 0, 0) (3, 0, 0) of weight 3/5. (213)
Since (213) has lower weight than (212), (212) may be ignored, and we can choose
c = (3, 0, 0) (3, 0, 0). (214)
Now again, using the formula (204), we see that in the sets of fields Gx , Gy we need
one summand (211) and three summands (210) to get to x00 -charge 0. Thus, one of
the groups Gx , Gy will contain two summands of (210) and the other will contain
one. We employ the following convention:
We choose Gy to contain two summands (210) and Gx to contain one
summand (210) and one summand (211). (215)
This leads to the following:
We must choose the fields a and b of the same weights and symmetrize the
resulting correlation function with respect to x and y. (216)
We will choose b first. Again, we will choose the bottom weight (nonzero) descendant
of (210) applied to itself renamed as
(0, 5, 3)x1 (2, 0, 2) (2, 2, 2), (217)
which is
(4, 3, 1) (4, 3, 1). (218)
We rename to level 0, which gives
b = (0, 5, 3)x1 (4, 0, 2) (0, 5, 3)x1 (4, 0, 2),
(219)
w(b) = 12/5.
Then a must have weight 12/5 to satisfy (216). When calculating a, however, there
is an additional subtlety. This time, we actually have to take into account two
summands, from applying (210) to (211) and vice versa, i.e. (211) to (210). In both
cases, we must rename to get the desired fusion rule. To this end, we may replace
(211) by
(3, 0, 0) (3, 2, 0). (220)
However, when applying (210) and (220) to each other in opposite order, the renam-
ings then do not correspond, resulting in the possibility of a wrong coefficient/sign
(since renamings are correct only up to constants which we have not calculated). To
reconcile this, we must use exactly the same renamings step by step, related only
by applying PF currents. To this end, we may compare the renaming of applying
(0, 5, 3)x1 (2, 0, 2) (2, 2, 2) (221)
to
1
(3, 0, 0) (0, 5, 3)x1 (3, 1, 3) (222)
2
(the 1
2 comes from the PF current (5, 3)x1 (0, 2) which takes (2, 2) to 2(2, 0))
and

(3, 0, 0) (3, 2, 0) (223)

to

(0, 5, 3)x1 (2, 0, 2) (2, 1, 1). (224)

We see that the bottom descendant of applying (221) to (222) is

(0, 5, 3)x1 (1, 0, 2) (1)(1, 3, 1) (225)

while the bottom descendant of applying (223) to (224) is

(0, 5, 3)x1 (1, 0, 2) (1, 3, 1). (226)

The expression (225) is the negative of (226). On the other hand, we see that the
bottom descendants of applying (210) to (220) and vice versa are the same. This
means that we are allowed to use the names (210) and (220) to each other in either
order, but we must take the results with opposite signs.
Now (226) has weight 7/5, so to get weight 12/5, we must take the descendant of
applying (210) to (220) and vice versa which is of weight 1 higher than the bottom.
This gives
((2, 3, 1) (3, 0, 0))x1 (1, 3, 1) (1, 3, 1)
+ (1, 3, 1) ((2, 1, 1) (3, 2, 0))x1 (1, 3, 1),
which is

a = (5, 3, 1)x1 (1, 3, 1) (1, 3, 1) + (1, 3, 1) (5, 1, 1)x1 (1, 3, 1).


(227)

Now the correlation function of a(x), b(y), c(z) given in (227), (219), (214) is an
ordinary lattice correlation function. The algorithm for calculating the lattice
correlation function of fields $u_i(x_i)$ which are of the form
$$1_{\lambda_i}(x_i)$$
or
$$\mu_i x_{-1} 1_{\lambda_i}(x_i)$$
with the label
$$1_{\sum \lambda_i}$$
is as follows: The correlation function is a multiple of
$$\prod_{i<j} (x_i - x_j)^{\langle \lambda_i, \lambda_j \rangle}$$
by a certain factor, which is a sum over all the ways we may absorb any $\mu_i x_{-1}$
factors. Each such factor may either be absorbed by another $\mu_j x_{-1}$, which results
in a factor
$$\langle \mu_i, \mu_j \rangle (x_i - x_j)^{-2}, \quad i \neq j \tag{228}$$
or by another lattice label $1_{\lambda_j}$, which results in a factor
$$\langle \mu_i, \lambda_j \rangle (x_i - x_j)^{-1}, \quad i \neq j. \tag{229}$$
Each $\mu_i x_{-1}$ must be absorbed exactly once (and the mechanism (228) is considered
as absorbing both $\mu_i$ and $\mu_j$), but one lattice label $1_{\lambda_j}$ may absorb several different
$\mu_i x_{-1}$'s via (229).
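The absorption rule just described can be sketched as a short recursion. This is only an illustration under stated assumptions: each field is modelled as a triple `(lam, mu, x)`, where `lam` is the lattice label, `mu` is the vector multiplying $x_{-1}$ (or `None` if there is none) and `x` is the insertion point; these names, and the function `absorption_factor` itself, are hypothetical and not taken from the paper. The code computes only the "certain factor" (the sum over absorptions), not the product prefactor, and it assumes the exponents $-2$ and $-1$ in (228) and (229).

```python
from fractions import Fraction


def absorption_factor(fields, inner):
    """Sum over all ways of absorbing the mu_i x_{-1} factors.

    fields: list of (lam, mu, x); mu is None for a pure lattice-label field.
    inner:  bilinear form on lattice vectors, supplied by the caller.
    """
    def rec(pending):
        if not pending:
            return Fraction(1)
        i, rest = pending[0], pending[1:]
        _, mu_i, x_i = fields[i]
        # Mechanism (229): mu_i is absorbed by the lattice label of another field j.
        by_label = sum(
            (Fraction(inner(mu_i, lam_j)) / (x_i - x_j)
             for j, (lam_j, _, x_j) in enumerate(fields) if j != i),
            Fraction(0),
        )
        total = by_label * rec(rest)
        # Mechanism (228): mu_i pairs with another mu_k that is still unabsorbed.
        for pos, k in enumerate(rest):
            _, mu_k, x_k = fields[k]
            pair = Fraction(inner(mu_i, mu_k)) / (x_i - x_k) ** 2
            total += pair * rec(rest[:pos] + rest[pos + 1:])
        return total

    pending = [i for i, (_, mu, _) in enumerate(fields) if mu is not None]
    return rec(pending)


if __name__ == "__main__":
    # Toy rank-1 lattice with <a, b> = a * b, purely for illustration.
    toy_inner = lambda a, b: a * b
    print(absorption_factor([(1, 2, 3), (5, None, 1)], toy_inner))
```

Processing the pending factors in a fixed order and pairing only with later ones ensures that each pairing of type (228) is counted once.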
Evaluating the correlation function of a(x), b(y), c(z) with the vacuum using
this algorithm, we get
$$-\frac{2(y - z)}{(x - z)(x - y)^3}.$$
Symmetrizing with respect to x, y, we get
$$\frac{2(x - 2z + y)}{(y - z)(x - z)(x - y)^2}$$
(our total correlation function factor), which is non-zero.


In more detail, we can calculate separately the contributions to the correlation
function of the two summands (227). For the first summand, the factor before the
$\otimes$ sign contributes
$$\frac{1}{(x - z)(y - x)}, \tag{230}$$
the factor after the $\otimes$ sign contributes
$$\frac{1}{y - x}. \tag{231}$$
Multiplying (230) and (231), we get
$$\frac{1}{(x - z)(x - y)^2},$$
and symmetrizing with respect to x and y,
$$\frac{x - 2z + y}{(x - z)(x - y)^2 (y - z)}, \tag{232}$$
which is the total contribution of the first summand (227).


For the second summand (227), the factor after $\otimes$ contributes
$$\frac{2}{(x - y)^2} + \frac{1}{(x - z)(y - x)}, \tag{233}$$
and the factor before the $\otimes$ sign contributes
$$\frac{1}{y - x}. \tag{234}$$
Multiplying, we get
$$-\frac{x - 2z + y}{(x - z)(x - y)^3}.$$
After symmetrizing with respect to x, y, we get also (232), so both summands of
(227) contribute equally to the correlation function.
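The symmetrization claims of Example 1 can be spot-checked with exact rational arithmetic at a few sample points. The sketch below is hedged: the signs of the individual contributions are not fully legible in the source, so the expressions coded here are an assumption, namely $\frac{1}{(x-z)(y-x)} \cdot \frac{1}{y-x}$ for the first summand and $\bigl(\frac{2}{(x-y)^2} + \frac{1}{(x-z)(y-x)}\bigr) \cdot \frac{1}{y-x}$ for the second. With this choice both summands symmetrize to $\frac{x - 2z + y}{(x-z)(x-y)^2(y-z)}$. All function names are illustrative.

```python
from fractions import Fraction


def first_summand(x, y, z):
    """Product of the two tensor-factor contributions of the first summand (assumed signs)."""
    return Fraction(1, (x - z) * (y - x)) * Fraction(1, y - x)


def second_summand(x, y, z):
    """Product of the two tensor-factor contributions of the second summand (assumed signs)."""
    after = Fraction(2, (x - y) ** 2) + Fraction(1, (x - z) * (y - x))
    before = Fraction(1, y - x)
    return after * before


def symmetrize(f, x, y, z):
    """Symmetrize f in its first two arguments."""
    return f(x, y, z) + f(y, x, z)


def expected_each(x, y, z):
    """The common symmetrized contribution (x - 2z + y) / ((x - z)(x - y)^2 (y - z))."""
    return Fraction(x - 2 * z + y, (x - z) * (x - y) ** 2 * (y - z))


def check(points):
    """Return True if both summands symmetrize to expected_each at every sample point."""
    return all(
        symmetrize(first_summand, x, y, z) == expected_each(x, y, z)
        and symmetrize(second_summand, x, y, z) == expected_each(x, y, z)
        for x, y, z in points
    )


if __name__ == "__main__":
    print(check([(2, 5, 11), (-3, 7, 1), (10, 4, -6)]))
```

Agreement at finitely many points is evidence rather than proof; since both sides are rational functions of low degree, a handful of generic points is nevertheless a strong consistency check.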

Example 2. In this example, we keep the same a(x) and b(y) as in the previous
example, but change c(z). To select c(z), this time we start with $G^-_{-1/2} u$
represented as

(3, 0, 0) (0, 5, 3)x1 (3, 1 3) + C(2, 3, 1) (2, 1, 1) (235)

(C is a non-zero normalization constant which we do not need to evaluate explicitly),


which we apply to u represented as

(3, 0, 0) (2, 1, 1). (236)

From the two summands (235), we get bottom descendants

(0, 0, 0) (5, 2, 2) of weight 9/10 (237)

and

(5, 3, 1) (0, 2, 0) of weight 19/10. (238)

Therefore, we may ignore (238) and select (237) only. Now applying (237) to u
written as

(3, 0, 0) (2, 1, 1), (239)

we select a descendant of weight 1 above the label

(3, 0, 0) (3, 3, 3).

Recalling from the conjugate of (191) that weight 1 states above the label
(3, 3)P F = (0, 0)P F must vanish, we get

c = (3, 0, 0) (1, 0, 0)x1 (3, 0, 0) (240)


(up to a non-zero multiplicative constant). This gives the correlation function

(x 2z + y)2
. (241)
5(y z)2 (x z)2 (x y)2
Let us write again in more detail the contributions of the two summands (227). For
the first summand, the contribution of the factor before $\otimes$ is again (230) (it remains
unchanged), and the contribution of the factor after $\otimes$ is
y 3z + 4x
. (242)
15(y z)(x z)(x y)
Multiplying, we get
y 3z + 4x
,
15(y z)2 (x z)2 (x y)
and symmetrizing with respect to x, y,
y 2 + 6yz 6z 2 8xy + 6zx + x2
. (243)
15(y z)2 (x z)2 (x y)2
This is the total contribution of the first summand (227).
For the second summand (227), the coordinate before $\otimes$ contributes again (234),
and the coordinate after $\otimes$ contributes
2(yx 3yz 3xz + 3z 2 + 2x2 + 2y 2 )
. (244)
15(y z)(x z)2 (x y)2
Multiplying, we get
yx 3yz 3xz + 3z 2 + 2x2 + 2y 2
.
15(y z)(x z)2 (x y)3
Symmetrizing with respect to x, y, we get
2(yx 3yz 3xz + 3z 2 + 2x2 + 2y 2 )
(245)
15(y z)2 (x z)2 (x y)2
which is the total contribution of the second summand (227). Adding the contribu-
tions (243) and (245) (which are not equal in this case) gives (241).

Remark. When u is, say, a cc field of weight (1/2, 1/2) in an N = (2, 2) CFT,
then we have a CPT-conjugate aa field v. Physical CFTs require a real structure,
and the fields u, v are not real. As already noted in the Remark at the end of Sec. 4
for the case of the free field theory, deforming along the field u (or v), which is the
case considered in this section, breaks the real structure of the CFT. Truly physical
infinitesimal deformations therefore occur not along the fields u, v but the field

u + v. (246)
(This, of course, explains why the dimension of the space of, say, infinitesimal cc-
deformations is the dimension of the space of deformations of the complex structure,
and not the double of that number.) In the literature, the contribution of the CPT-
conjugate is often ignored (cf. [31, formulas (4.5) and (4.7)]). Nevertheless, from
the point of view of obstruction theory, considering (246) and the original cc field u
should be equivalent. An argument can be sketched as follows: Let a be a non-zero
complex number. Then replacing u by au, (246) becomes
$$au + \bar{a} v. \tag{247}$$
Since the obstructions are homogeneous, instead of (247), we may consider
$$u + bv, \quad b = \frac{\bar{a}}{a}. \tag{248}$$
Then b is an arbitrary element of the unit circle $S^1$. Thus, even when we restrict to
real deformations, the obstruction should vanish for every field (248) with $b \in S^1$.
But the chiral part of the obstruction is holomorphic in b, so vanishing for all $b \in S^1$
implies vanishing for b = 0 and hence if all of the real deformations along (247) are
unobstructed, so is the deformation along u.
While this argument is compelling, we learned in Sec. 4 that when deform-
ing along fields of the form (246), regularization along the deformation parameter
is required. Therefore, to make the argument precise in the present setting, we
would either need to develop a general regularization scheme to the same order
to which obstructions vanish, or compute the regularization parameters explicitly
in the present case. Working this out would be a substantial improvement of the
present result.
The remainder of this section is dedicated to comments on possible perturbative
deformations along the fields $(1^5, 1^5)$, $(-1^5, 1^5)$ (the exponent here denotes repeti-
tion of the field in a tensor product, and 1 again stands for $(1, 1, 1)_{MM}$, etc.). We
will present some evidence (although not proof) that the obstruction might vanish
in this case. The results we do obtain will prove useful in the next section. Such a
conjecture would have a geometric interpretation. In Gepner's conjectured interpre-
tation of the model we are investigating as the $\sigma$-model of the Fermat quintic, the
field (175) corresponds to the dilaton. It seems reasonable to conjecture that the
dilaton deformation should exist, since the theory should not choose a particular
global size of the quintic. Similarly, the field (174) can be explained as the dilaton
on the mirror manifold of the quintic, which should correspond to deformations of
complex structure of the form
$$x^5 + y^5 + z^5 + t^5 + u^5 + \lambda xyztu = 0. \tag{249}$$
Therefore, our analysis predicts that the (body of) the moduli space of N = 2-
supersymmetric CFTs containing the Gepner model is 2-dimensional, and contains
$\sigma$-models of the quintics (249), where the metric is any multiple of the metric for
which the $\sigma$-model exists (which is unique up to a scalar multiple).
To discuss possible deformations along the fields $(1^5, 1^5)$, $(-1^5, 1^5)$, let us first
review a simpler case, namely the coset construction: In a VOA V, we set, for $u \in V$
homogeneous,
$$Y(u, z) = \sum_{n \in \mathbb{Z}} u_{-n-w(u)} z^n, \qquad Y_-(u, z) = \sum_{n<0} u_{-n-w(u)} z^n, \qquad Y_+(u, z) = Y(u, z) - Y_-(u, z). \tag{250}$$
The coset model of u is
$$V_u = \langle v \in V \mid Y_-(u, z)v = 0 \text{ and } Y_+(u, z)v \text{ involves only integral powers of } z \rangle. \tag{251}$$
Then $V_u$ is a sub-VOA of V. To see this, recall that
$$Y(u, z)Y(v, t)w = Y_+(u, z)Y(v, t)w + Y_+(v, t)Y_-(u, z)w + Y(Y_-(u, z - t)v)w. \tag{252}$$
When $v, w \in V_u$, the last two terms of the right-hand side of (252) vanish, which
proves that
$$Y(v, t)w \in V_u[[t]][t^{-1}].$$

Now in the case of N = 2-super-VOAs, let us stick to the NS sector. Then (250) still
correctly describes the body of a vertex operator. The complete vertex operator
takes the form

n
Y (u, z, + , ) = un+w(u) z n + u+ n +
n1/2+w(u) z + un1/2+w(u) z
nZ

+ u n +
n1+w(u) z . (253)

We still define $Y_-(u, z, \theta^+, \theta^-)$ to be the sum of terms involving n < 0, and
$Y_+(u, z, \theta^+, \theta^-)$ the sum of the remaining terms. The compatibility relations for
an N = 2-super-VOA are
$$D^+ Y(u, z, \theta^+, \theta^-) = Y(G^+_{-1/2} u, z, \theta^+, \theta^-), \qquad D^- Y(u, z, \theta^+, \theta^-) = Y(G^-_{-1/2} u, z, \theta^+, \theta^-), \tag{254}$$

where

D+ = +
+ + , D =
+ . (255)
z z
Now using (252) again, for $u \in V$ homogeneous, we will have a sub-N = 2-VOA $V_u$
defined by (251), which is further endowed with the operators $G^+_{-1/2}$, $G^-_{-1/2}$.
In the case of lack of locality, only a weaker conclusion holds. Assume first we
have abelian fusion rules in the same sense as in Remark 2 after Theorem 1.

Lemma 4. Suppose we have fields $u_i$, i = 0, . . . , n such that for i > j,
$$Y(u_i, z)u_j = \sum_{n \geq 0} (u_i)_{-n-\alpha_{ij}-w(u_i)} z^{n+\alpha_{ij}} \tag{256}$$
with $0 \leq \alpha_{ij} < 1$. Consider further points $z_0 = 0, z_1, \ldots, z_n$. Then
$$\prod_{n \geq i > j \geq 0} (z_i - z_j)^{-\alpha_{ij}}\, Y(u_n, z_n) \cdots Y(u_1, z_1)u_0 \tag{257}$$
where each $(z_i - z_j)^{-\alpha_{ij}}$ is expanded in $z_j$ is a power series whose coefficients
involve nonnegative integral powers of $z_0, \ldots, z_n$ only.

Proof. Induction on n. Assuming the statement is true for $n - 1$, note that by
assumption, (257), when coupled to $w \in V$ of finite weight, is a meromorphic
function in $z_n$ with possible singularities at $z_0 = 0, z_1, \ldots, z_{n-1}$. Thus, (257) can
be expanded at its singularities, and is equal to

(zi zj )ij
n1i>j0


znn0 expandzn (zn zj )nj Y (un1 , zn1 )
j
=0

Y (u1 , z1 )Y (un , zn )u0
<0
zn


n1
+ (zn zi )ni expand(zn zi ) (zn zj )nj
i=1 n1j
=i

Y (un1 , zn1 ) Y (ui+1 , zi+1 )Y (Y (un , zn zi )ui , zi )


Y (ui1 , zi1 ) Y (u1 , z1 )u0


(zn zi )<0

n,n1

n1
zj
nj
+ zn n0 expand1/zn 1
j=1
zn

Y (un , zn )Y (un1 , zn1 ) Y (u1 , z1 )u0 . (258)


0
zn
In (258), $\mathrm{expand}_?(?)$ means that the argument is expanded in the variable given
as the subscript. The symbol $(?)_{?<0}$ (respectively, $(?)_{?\geq 0}$) means that we take only
terms in the argument (which is a power series in the subscript) which involve
negative (respectively, non-negative) powers of the subscript.
In any case, by the assumption of the lemma, all summands (258) vanish with
the exception of the last, which is the induction step.

In the case of non-abelian fusion rules, an analogous result unfortunately fails.
Assume for simplicity that
$$u_0 = \cdots = u_n \text{ holds in (258) with } 0 \leq \alpha^F_{ij} < 1 \text{ true for any fusion rule } F. \tag{259}$$
We would like to conclude that the correlation function
$$\langle v, u(z_n) \cdots u(z_1)u \rangle \tag{260}$$
involves only non-negative powers of $z_i$ when expanded in $z_1, \ldots, z_n$ (in this order).
Unfortunately, this is not necessarily the case. Note that we know that (260) con-
verges to 0 when two of the arguments $z_i$ approach each other while the others remain separate.
However, this does not imply that the function (260) converges to 0 when three or
more of the arguments approach each other simultaneously.
To give an example, let us consider the solution of the Fuchsian differential
equation on $\mathbb{P}^1 \setminus \{0, t, \infty\}$
$$y' = \left( \frac{A}{z} + \frac{B}{z - t} \right) y \tag{261}$$
for square matrices A, B (with $t \neq 0$ constant). Since the solution y has bounded
singularities, multiplying y by $z^m (z - t)^n$ for large enough integers m, n makes
the resulting function Y converge to 0 when z approaches 0 or t. If, however, the
expansion of Y at $\infty$ involved only non-negative powers of z, it would have only
finitely many terms, and hence abelian monodromy. It is well known, however, that
this is not necessarily the case. In fact, any irreducible monodromy occurs for a
solution of Eq. (261) for suitable matrices A, B (cf. [7]).
Therefore, the following result may be used as evidence, but not proof, of the
exponentiability of deformations along $(1^5, 1^5)$ and $(-1^5, 1^5)$.

Lemma 5. The assumption (259) is satisfied for the field
$$u = G^-_{-1/2}((1, 1, 1), \ldots, (1, 1, 1))$$
in the 5-fold tensor product of the N = 2 minimal model of central charge 9/5.
Before proving this, let us state the following consequence:
Indeed, assuming Lemma 5 and setting w = ((1, 1, 1), . . . , (1, 1, 1)), the obstruc-
tion is
$$\langle w \mid (G^-_{-1/2} w)(z_n) \cdots (G^-_{-1/2} w)(z_1) w \rangle. \tag{262}$$
(The antichiral primary case is analogous.) But using the fact that
$$G^-_{-1/2}\bigl((G^-_{-1/2} w)(z_n) \cdots (G^-_{-1/2} w)(z_1) w\bigr) = (G^-_{-1/2} w)(z_n) \cdots (G^-_{-1/2} w)(z_1)\, G^-_{-1/2} w$$
along with injectivity of $G^-_{-1/2}$ on chiral primaries of weight 1/2, we see that the
non-vanishing of (262) implies the non-vanishing of (260) with $u = G^-_{-1/2} w$ for
some v of weight 1, which would contradict Lemma 5.

Proof of Lemma 5. We have
$$G^-_{-1/2}(1, 1, 1) = (-4, 1, -1). \tag{263}$$
We have in our lattice
$$(1, 1, 1) \cdot (1, 1, 1) = 1/15 + 1/10 - 1/6 = 0,$$
$$(-4, 1, -1) \cdot (-4, 1, -1) = 16/15 + 1/10 - 1/6 = 1, \tag{264}$$
$$(1, 1, 1) \cdot (-4, 1, -1) = -4/15 + 1/10 + 1/6 = 0,$$
so we see that with the fusion rules which stay on levels 1, 2, the vertex operators
u(z)u have only non-singular terms.
However, this is not sufficient to verify (259). In effect, when we use the fusion
rule which goes to levels 0, 3,
$$(1, 1, 1)(z)(-4, 1, -1)$$
and
$$(-4, 1, -1)(z)(1, 1, 1)$$
will have most singular term $z^{-2/5}$, so when we write again 1 instead of (1, 1, 1)
and G instead of $G^-_{-1/2}(1, 1, 1)$, with the least favorable choice of fusion rules, it
seems u(z)u can have singular term $z^{-4/5}$, coming from the expressions

(G1111)(z)(1G111) (265)

and

(1G111)(z)(G1111) (266)

(and expressions obtained by permuting coordinates). Note that with other combi-
nations of fusion rules, various other singular terms can arise with $z^{>-4/5}$.
Now the point is, however, that we will show that with any choice of fusion rule,
the most singular terms of (265) and (266) come with opposite signs and hence
cancel out. Since the z exponents of other terms are higher by an integer, this is all
we need.
Recalling the Koszul-Milnor sign rules for the minimal model, recall that 1 is
odd and G is even, so
$$(G \otimes 1)(z)(1 \otimes G) = -G(z)1 \otimes 1(z)G, \tag{267}$$
$$(1 \otimes G)(z)(G \otimes 1) = 1(z)G \otimes G(z)1. \tag{268}$$
We have
$$1(z)G = (1, 1, 1)(z)(-4, 2, 2) = M(-3, 0, 0)z^{-2/5} + \mathrm{HOT},$$
$$G(z)1 = (-4, 2, 2)(z)(1, 1, 1) = N(-3, 0, 0)z^{-2/5} + \mathrm{HOT},$$
with some non-zero coefficients M, N, so the bottom descendants of (267) and (268)
are
$$-MN\,(-3, 0, 0) \otimes (-3, 0, 0)$$
respectively,
$$MN\,(-3, 0, 0) \otimes (-3, 0, 0),$$
so they cancel out, as required.

7. The Case of the Fermat Quartic K3-Surface


The Gepner model of the K3 Fermat quartic is an orbifold analogous to (162) with
5 replaced by 4 of the 4-fold tensor product of the level 2 N = 2-minimal model,
although one must be careful about certain subtleties arising from the fact that
the level is even. The model has central charge 6. The level 2 PF model is the
1-dimensional fermion (of central charge 1/2), viewed as a bosonic CFT. As such,
that model has 3 labels, the NS label with integral weights (denote by $NS$), the
NS label with weights $\mathbb{Z} + \frac{1}{2}$ (denote by $NS'$), and the R label (denote by $R$). The
fusion rules are given by the fact that $NS$ is the unit label,
$$NS' * NS' = NS, \qquad NS' * R = R, \qquad R * R = NS + NS'. \tag{269}$$

We shall again find it useful to use the free field realization of the N = 2 minimal
model, which we used in the last two sections. In the present case, the theory is a
subquotient of a lattice theory spanned by




1 1 i i 1
, 0, 0 , , , , , 0, i .
2 8 8 2 2
Analogously as before, we write (k, , m) for




k i mi
, , .
8 8 2
The conformal vector is
1 2 1 i 1
x0,1 x21,1 + x1,2 + x22,1 .
2 2 2 2 2
The superconformal vectors are
G
3/2 = (0, 4, 2)x1 (4, 0, 2),

3/2 = (0, 4, 2)x1 (4, 0, 2).


G+
The fermionic labels will again be denoted by omitting the first coordinate: $(\ell, m)_F$.
The fermionic identifications are:
$$(2, 2)_F \sim (2, -2)_F \sim (0, 0)_F, \qquad (1, 1)_F \sim (1, -1)_F. \tag{270}$$

A priori the lattice  8 has 8 labels k8 , 0 k 7, but the G denition together
with (270) forces the MM identication of labels
(1, 1, 1) (3, 1, 1) (3, 1, 1).
The labels of the level 2 MM are therefore
$$(2k, 0, 0),\ 0 \leq k \leq 3, \qquad (2k + 1, 1, 1),\ 0 \leq k \leq 1.$$
The fusion rules are
$$(k, 0, 0) * (\ell, 0, 0) = (k + \ell, 0, 0),$$
$$(k, 0, 0) * (\ell, 1, 1) = (k + \ell, 1, 1),$$
$$(k, 1, 1) * (\ell, 1, 1) = (k + \ell, 0, 0) + (k + \ell + 4, 0, 0),$$
so the Verlinde algebra is simply
$$\mathbb{Z}[a, b]/(a^4 = 1,\ b^2 = a + a^3,\ a^2 b = b)$$
where a = (2, 0, 0), b = (1, 1, 1).
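The relations of this Verlinde algebra can be checked directly from the three fusion rules. The sketch below encodes a label $(k, \ell, \ell)$ as the pair `(k, l)` and makes one assumption explicit: labels with $\ell = 0$ are taken modulo 8 in $k$, and labels with $\ell = 1$ modulo 4 in $k$ (this is the identification implied by $a^2 b = b$). The helper names `norm`, `fuse` and `mul` are illustrative.

```python
from collections import Counter
from itertools import product


def norm(k, l):
    """Normalize a label: k mod 8 when l == 0, k mod 4 when l == 1 (assumed identification)."""
    return (k % 8, 0) if l == 0 else (k % 4, 1)


LABELS = [(0, 0), (2, 0), (4, 0), (6, 0), (1, 1), (3, 1)]


def fuse(x, y):
    """Fusion of two labels, returned as a multiset of labels."""
    (k, lx), (m, ly) = x, y
    if lx == 1 and ly == 1:
        return Counter([norm(k + m, 0), norm(k + m + 4, 0)])
    return Counter([norm(k + m, max(lx, ly))])


def mul(p, q):
    """Bilinear extension of the fusion product to formal sums of labels."""
    out = Counter()
    for (x, cx), (y, cy) in product(p.items(), q.items()):
        for z, cz in fuse(x, y).items():
            out[z] += cx * cy * cz
    return out


if __name__ == "__main__":
    a = Counter({(2, 0): 1})
    b = Counter({(1, 1): 1})
    print(mul(b, b))          # b^2 as a formal sum of labels
    print(mul(mul(a, a), b))  # a^2 b
```

Besides the three defining relations, associativity over all triples of labels is a useful sanity check that the chosen identification of labels is consistent.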
One subtlety of the even level MM in comparison with odd level concerns signs.
Since the k-coordinates of $G^-$ and $G^+$ are even, we can no longer use the k-
coordinate of an element as an indication of parity (u and $G^- u$ cannot have the
same parity). Because of this, we must introduce odd fusion rules. There are vari-
ous ways of doing this. For example, let the bottom states of (2k, 0, 0), (1, 1, 1) and
(1, 1, 1) be even. Then the fusion rules on level  = 0 are even, as are the fusion
rules combining levels 0 and 1. The fusion rules
(1, 1, 1) (1, 1, 1) (2, 0, 0), (1, 1, 1) (1, 1, 1) (2, 0, 0)
are even, the remaining fusion rules (adding 4 to the k-coordinate on the right-hand
side) are odd.
Now the c fields of the MM are
(0, 0, 0), (1, 1, 1), (2, 2, 2)
and the a fields are
(0, 0, 0), (-1, 1, -1), (-2, 2, -2).
If we denote by $H_{1,2k+1}$ the state space of label (2k + 1, 1, 1), $0 \leq k \leq 1$, and by
$H_{0,2k}$ the state space of label (2k, 0, 0), $0 \leq k \leq 3$, then the state space of the 4-fold
tensor product of the level 2 minimal model is
3  3   1 
  

0,2ki
H0,2ki H H1,2ki +1 H
1,2ki +1 . (271)
i=0 ki =0 ki =0

The Gepner model is an orbifold with respect to the Z/4-group which acts by i on
products in (271) where the sum of the subscripts 2ki or 2ki + 1 is congruent to 
modulo 4. Therefore, the state space of the Gepner model is the sum over Z/4
and i Z/4,

3
i = 0 Z/4,
i=0

of
3    

 

H0,2ki H
0,2ki +2
H1,2ki +1 H
1,2ki +1+2
.
i=0 2ki i mod 4 2ki +1i
(272)
It is important to note that each summand (272) in which all the factors have the
odd subscripts 1, 2ki + 1 occurs twice in the orbifold state space.
If we write again $\ell$ for $(\ell, \ell, \ell)$ and $-\ell$ for $(-\ell, \ell, -\ell)$, then the critical cc fields
are chirally symmetric permutations of
(2, 2, 0, 0), (2, 2, 0, 0)
(2, 1, 1, 0), (2, 1, 1, 0) (273)
(1, 1, 1, 1), (1, 1, 1, 1).
Note that applying all the possible permutations to the fields (273), we obtain only
19 fields, while there should be 20, which is the rank of $H^{1,1}(X)$ for a K3-surface
X. However, this is where the preceding comment comes into play: the last field
(273) corresponds to a term (272) where all the factors have odd subscripts, and
hence there are two copies of that summand in the model, so the last field (273)
occurs twice.
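The count 19 + 1 = 20 can be reproduced by enumerating distinct permutations of the three patterns. A minimal sketch, with illustrative names; the only input taken from the text is the list of patterns and the statement that the all-odd pattern (1, 1, 1, 1) occurs twice.

```python
from itertools import permutations

PATTERNS = [(2, 2, 0, 0), (2, 1, 1, 0), (1, 1, 1, 1)]


def distinct_permutations(pattern):
    """Set of distinct rearrangements of a 4-tuple of labels."""
    return set(permutations(pattern))


def count_critical_fields():
    """Return per-pattern counts, the naive total, and the total with the doubled field."""
    counts = {p: len(distinct_permutations(p)) for p in PATTERNS}
    naive = sum(counts.values())
    # The summand with all odd subscripts appears twice in the orbifold state space.
    total = naive + 1
    return counts, naive, total


if __name__ == "__main__":
    print(count_critical_fields())
```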
By the fact that the Fermat quartic Gepner model has N = (4, 4) worldsheet
supersymmetry (see e.g. [9, 54] and references therein), the spectral flow guarantees
that the number of critical ac fields is the same as the number of critical cc fields.
Concretely, the critical ac fields are the permutations of
(0, 0, 2, 2), (2, 2, 0, 0)
(0, 1, 1, 2), (2, 1, 1, 0) (274)
(1, 1, 1, 1), (1, 1, 1, 1).
As above, the last field (274) occurs in 2 copies, thus the rank of the space of critical
ac fields is also 20.
We wish to investigate whether infinitesimal deformations along the fields (273),
(274) exponentiate perturbatively. To this end, let us first see when the coset-
type scenario occurs. This is sufficient to prove convergence in the present case.
This is due to the fact that in the present theory, there is an even number of
fermions, in which case it is well known by the boson-fermion correspondence that
the correlation functions follow abelian fusion rules, and therefore Lemma 4 applies.
To prove that the coset scenario occurs, let us look at the chiral c fields
u = (2, 2, 0, 0), (2, 1, 1, 0), (1, 1, 1, 1)
and study the singularities of
$$(G^-_{-1/2} u)(z)(G^-_{-1/2} u). \tag{275}$$
By Lemma 4, if (275) are non-singular, the obstructions vanish. The inner product is
$$\langle (k, \ell, m), (k', \ell', m') \rangle = \frac{kk'}{8} + \frac{\ell\ell'}{8} - \frac{mm'}{4},$$
$$w(k, \ell, m) = \frac{k^2}{16} + \frac{\ell(\ell + 2)}{16} - \frac{m^2}{8}.$$
Next, (2, 2, 2) = (2, 0, 0),
G
1/2 (2, 0, 0) = (0, 4, 2)x1 (2, 0, 2),

G
1/2 (1, 1, 1) = (3, 1, 1),

if we again replace, to simplify notation, the symbol $G^-_{-1/2}$ by G, then we have

The most singular z-power of 2(z)2 is $\frac{2 \cdot 2}{8} = \frac{1}{2}$, (276)
The most singular z-power of G2(z)2 is $\frac{-2 \cdot 2}{8} = -\frac{1}{2}$. (277)
For G2(z)G2, rename the rightmost G2 as (-2, 2, 0). We get
The most singular z-power of G2(z)G2 is $-1 + \frac{(-2)(-2)}{8} = -\frac{1}{2}$, (278)
The most singular z-power of 1(z)1 is 0 for the even fusion rule and 1/2
for the odd fusion rule, (279)
3 1 1
The most singular z-power of G1(z)1 is + + = 0 for the even,
8 8 4
fusion rule and 1/2 for the odd fusion rule, (280)
9 1 1
The most singular z-power of G1(z)G1 is + = 1 for the even
8 8 4
fusion rule and 3/2 for the odd fusion rule. (281)
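The even-fusion-rule values in (276) and (279)-(281) are plain evaluations of the level 2 bilinear form, so they can be recomputed exactly. The sketch below rests on two assumptions that should be checked against the original typesetting: the form is $kk'/8 + \ell\ell'/8 - mm'/4$, and $G^-_{-1/2}$ acts on the label $(1, 1, 1)$ by shifting $k$ by $-4$ and $m$ by $-2$, giving $(-3, 1, -1)$. The names `inner2` and `g_minus` are illustrative, and the odd-fusion-rule values are not addressed.

```python
from fractions import Fraction


def inner2(a, b):
    """Level 2 bilinear form on labels (k, l, m), assuming a minus sign on the m-term."""
    (k1, l1, m1), (k2, l2, m2) = a, b
    return Fraction(k1 * k2, 8) + Fraction(l1 * l2, 8) - Fraction(m1 * m2, 4)


def g_minus(label):
    """Assumed action of G^-_{-1/2} on a label: k -> k - 4, m -> m - 2."""
    k, l, m = label
    return (k - 4, l, m - 2)


ONE = (1, 1, 1)
TWO = (2, 0, 0)
G_ONE = g_minus(ONE)

if __name__ == "__main__":
    print(inner2(TWO, TWO))      # exponent for 2(z)2
    print(inner2(ONE, ONE))      # exponent for 1(z)1, even fusion rule
    print(inner2(G_ONE, ONE))    # exponent for G1(z)1, even fusion rule
    print(inner2(G_ONE, G_ONE))  # exponent for G1(z)G1, even fusion rule
```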
One therefore sees that for the field u = (1, 1, 1, 1), (275) is non-singular: In the
case of the least favorable (odd) fusion rules, the most singular term appears to be
$-1$, coming from
(G1, 1, 1, 1) (1, G1, 1, 1). (282)
However, this term cancels with
(1, G1, 1, 1) (G1, 1, 1, 1). (283)
To see this, note that the last two coordinates do not enter the picture. We have
an odd (respectively, even) pair of pants P respectively, P+ in the MM with input
1, 1. They add up to a pair of pants in MMMM. On (282), we have pairs of pants
Pi {P , P+ },
P (G1 1) (1 G1) = (P1 P2 )(G1 1) (1 G1)
= sP1 (G1 1) P2 (1 G1) (284)
where s is the sign of permuting P2 past G1 1. Here we use the fact that 1 is
even. On the other hand,
P (1 G1) (G1 1) = (P1 P2 )(1 G1) (G1 1)
= sP1 (1 G1) P2 (G1 1) (285)
(as G1 is odd, so there is a by permuting it with itself). From (73), the lowest
term of Pi (1 G1) and Pi (G1 1) have opposite signs, so (284) and (285) cancel
out.
The situation is simpler for u = (2, 2, 0, 0), in which case all the fusion rules are
even, and
$$(G2 \otimes 2)(z)(2 \otimes G2)$$
appears to have most singular term $z^{-1}$. However, again note that 2 is even and
G2 is odd, so
$$(G2 \otimes 2)(z)(2 \otimes G2) = G2(z)2 \otimes 2(z)G2, \tag{286}$$
while
$$(2 \otimes G2)(z)(G2 \otimes 2) = -2(z)G2 \otimes G2(z)2. \tag{287}$$
Renaming G2 as (-2, 2, 0), the bottom descendant of both G2(z)2 and 2(z)G2 is
(0, 2, 0) with some coefficient, so (286) and (287) cancel out. Thus, the deformations
along the first and last fields of (273) and (274) exponentiate.
The field u = (2, 1, 1, 0) is difficult to analyse, since in this case, (275) has
singular channels and the coset-type scenario does not occur. We do not know how
to calculate the obstruction directly in this case. It is however possible to present
an indirect argument why these deformations exist.
In one precise formulation, the boson-fermion correspondence asserts that a
tensor product of two copies of the 1-dimensional chiral fermion theory considered
bosonically (= the level 2 parafermion) is an orbifold of the lattice theory 2 , by
the Z/2-group whose generator acts on the lattice by sign.
This has an N = 2-supersymmetric
version. We tensor with two copies of the
lattice theory associated with  8 , picking out the sector


m n p
, , where m n p mod 2. (288)
8 8 4

The fermionic currents of the individual coordinates are

1/2,1 = (1) + (1), 1/2,2 = i((1) (1)), (289)

so the SUSY generators are




4
G
3/2,1 = , 0 ((1) + (1)),
8


4 (290)
G
3/2,2 = 0, i((1) + (1)),
8
G = G1 + G2 .

The Z/2 group acts trivially on the new lattice coordinate.


A note is due on the signs: To each state, we can assign a pair of parities, which
will correspond to the parities of the 2 coordinates in the orbifold. This then also
determines the sign of fusion rules.
Now consider our field as a tensor product of (2, 0) and (1, 1), each in a tensor
product of two copies of the minimal models. Considering each of these factors as
orbifold of the N = 2-supersymmetric lattice theory, let us lift to the lattice theory:


2
(2, 0) , 0 (0), (291)
8


1 1
(1, 1) , ((1/2) + (1/2)). (292)
8 8

Then the fields (291), (292) are Z/2-invariant. In the case of (291), we can proceed
in the lift instead of the orbifold, because the fusion rules in the orbifold are abelian
anyway. In the case of (292), the choice amounts to choosing a particular fusion
rule. But now the point is that




2
G
1/2 (2, 0) , 0 ((1) + (1)), (293)
8


3 1
G1/2 (1, 1) , ((1/2) + (1/2)), (294)
8 8


1 3
G1/2 (1, 2) , ((1/2) + (1/2)). (295)
8 8
Thus, the lift of $G^-_{-1/2} u$ is a sum of lattice labels!
Now the critical summands of the operator
$$G^-_{-1/2}(u)(z_k) \cdots G^-_{-1/2}(u)(z_1)(u)(0) \tag{296}$$
have k = 4m, and we have 2m summands (293), and m summands (294), (295),
respectively. All
$$\binom{4m}{2m,\, m,\, m}$$
possibilities occur. It is the bottom (= label) term which we must compute in order
to evaluate our obstruction. But by our sign discussion, when we swap a (294) term
with a (295) term, the label summands cancel out. Now adding all such possible
$$\binom{4m}{2m,\, m - 1,\, m - 1,\, 2}$$
pairs, all critical summands of (296) will occur with equal coefficients by symmetry,
and hence also the bottom coefficient of (289) is 0, thus showing that the vanishing
of our obstruction for this field lifts to the lattice theory.
Since the field (292) is invariant under the Z/2-orbifolding (and although (291)
is not, the same conclusion holds when replacing it with its orbifold image),
the entire perturbative deformation can also be orbifolded, yielding the desired
deformation.
We thus conclude that for the Gepner model of the K3 Fermat quartic, all the
critical fields exponentiate to perturbative deformations.

8. Conclusions and Discussion


In this paper, we have investigated perturbative deformations of CFTs by turning
on a marginal cc field, by the method of recursively updating the field along the
deformation path. A certain algebraic obstruction arises. We work out some exam-
ples, including free field theories, and some N = (2, 2) supersymmetric Gepner
models. In the N = (2, 2) case, in the case of a single cc field, the obstruction we
find can be made very explicit, and perhaps surprisingly, does not automatically
vanish. By explicit computation, we found that the obstruction does not vanish
for a particular critical cc field in the Gepner model of the Fermat quintic 3-fold
(we saw some indication, although not proof, that it may vanish for the field
corresponding to adding the symmetric term xyztu to the superpotential, and for the
unique critical ac field). By comparison, the obstruction vanishes for the critical
cc fields and ac fields in the Gepner model of the Fermat quartic K3-surface. Our
calculations are not completely physical in the sense that cc fields are not real: real
fields are obtained by adding in each case the complex-conjugate aa field, in which
case the calculation is more complicated and is not done here.
Assuming (as seems likely) that the real field case exhibits similar behavior as
we found, why are the K3 and 3-fold cases different, and what does the obstruction
in the 3-fold case indicate? In the K3-case, our perturbative analysis conforms
with the Aspinwall-Morrison construction [9] of the big moduli space of K3s, and
corresponding (2, 2)- (in fact, (4, 4)-) CFTs, and also with the findings of Nahm
and Wendland [54, 62].
In the 3-fold case, however, the straightforward perturbative construction
of the deformed nonlinear $\sigma$-model fails. This corresponds to the discussion of
Nemeschansky-Sen [55] of the renormalization of the nonlinear $\sigma$-model. They
expand around the 0 curvature tensor, but it seems natural to assume that similar
phenomena would occur if we could expand around the Fermat quintic vacuum.
Then [55] find that non-Ricci flat deformations must be added to the Lagrangian
at higher orders of the deformation parameter in order to cancel the $\beta$-function.
Therefore, if we want to do this perturbatively, fields must be present in the
original (unperturbed) model which would correspond to non-Ricci flat deformation.
No such fields are present in the Gepner model. (Even if we do not a priori assume
that the marginal fields of the Gepner model correspond to Ricci flat deformations,
we see that different fields are needed at higher order of the perturbation
parameter, so there are not enough fields in the model.) More generally, ignoring
for the moment the worldsheet SUSY, the bosonic superpartners are fields which
are of weight 1 classically (as the classical nonlinear $\sigma$-model Lagrangian is
conformally invariant even in the non-Ricci flat case). A 1-loop correction arises in
the quantum picture [4], indicating that the corresponding deformation fields must
be of generalized weight (cf. [39-42]). However, such fields are excluded in unitary
CFTs, which is the reason why these deformations must be non-perturbative. One
does not see this phenomenon on the level of the corresponding topological models,
since these are invariant under varying the metric within the same cohomological
class, and hence do not see the correction term [68]. Also, it is worth noting that
in the K3-case, the $\beta$-function vanishes directly for the Ricci-flat metric by the
N = (4, 4) supersymmetry ([5]), and hence the correction terms of [55] are not
needed. Accordingly, we have found that the corresponding perturbative
deformations exist.
From the point of view of mirror symmetry, mirror-symmetric families of
hypersurfaces in toric varieties were proposed by Batyrev [10]. In the case of the Fermat
quintic, the exact mirror is a singular orbifold and the nonlinear $\sigma$-model
deformations corresponding to the Batyrev dual family exist perturbatively by our analysis.
To obtain mirror candidates for the additional deformations, one uses crepant
resolutions of the mirror orbifold (see [57] for a survey). In the K3-case, this approach
seems validated by the fact that the mirror orbifolds can indeed be viewed as a
limit of non-singular K3-surfaces [6]. In the 3-fold case, however, this is not so
clear. The moduli spaces of Calabi-Yau 3-folds are not locally symmetric spaces.
The crepant resolution is not unique even in the more restrictive category of
algebraic varieties; different resolutions are merely related by flops. It is therefore not
clear what the exact mirrors are of those deformations of the Fermat quintic where
the deformation does not naturally occur in the Batyrev family, and resolution
of singularities is needed. In other words, the McKay correspondence sees only
topological invariants, and not the finer geometrical information present in the
whole nonlinear $\sigma$-model.
In [21], Fan, Jarvis and Ruan constructed exactly mathematically the A-models
corresponding to Landau-Ginzburg orbifolds via Gromov-Witten theory applied
to the Witten equation. Using mirror symmetry conjectures, this may be used to
construct mathematically candidates of topological gravity-coupled A-models as
well as B-models of Calabi-Yau varieties. Gromov-Witten theory, however, is a
rich source of examples where such gravity-coupled topological models exist, while
a full conformally invariant (2, 2)-$\sigma$-model does not. For example, Gromov-Witten
theory can produce highly non-trivial topological models for 0-dimensional orbifolds
(cf. [56, 43]).
Why does our analysis not contradict the calculation of Dixon [19] that the
central charge does not change for deformation of any N = 2 CFT along any linear
combination of ac and cc fields? Zamolodchikov [70, 71] defined an invariant c which
is a non-decreasing function in a renormalization group flow in a 2-dimensional
QFT, and is equal to the central charge in a conformal field theory. It may therefore
appear that by [19], all infinitesimal deformations along critical ac and cc fields in
an N = 2-CFT exponentiate. However, we saw that when our obstruction occurs,
additional counterterms corresponding to those of Nemeschansky-Sen are needed.
This corresponds to non-perturbative corrections of the correlation function needed
to fix c, and the functions [19] cannot be used directly in our case.
Finally, let us briefly discuss the significance of our result to the relationship
between classical and quantum geometry. One of the well known effects (and also
great puzzles) of string duality (as reviewed, say, in [31]) is that a smooth path in
the moduli space of conformal field theories corresponding to Calabi-Yau varieties
can correspond to a discontinuous path in the classical moduli space of the
Calabi-Yau varieties themselves, and more specifically that the topology of the underlying
Calabi-Yau variety can change along such a path. In view of our result, it is possible
that this picture needs to be refined. Namely, what we perceive as a smooth path
in quantum geometry may actually consist of discrete steps tunneling across the
changes of topology. An explanation of such a phenomenon could be that the moduli
space of quantum geometries should itself be quantized, and can have a discrete
rather than continuous spectrum.

Acknowledgments
The author thanks D. Burns, I. Dolgachev, I. Frenkel, Doron Gepner, Y. Z. Huang,
I. Melnikov, K. Wendland and E. Witten for explanations and discussions.
Special thanks to H. Xing, who contributed many useful ideas to this project before
changing his field of interest.
The author is supported by grants from the NSA and the MCTP.

References
[1] I. Affleck and A. W. Ludwig, Universal noninteger ground state degeneracy in critical
quantum systems, Phys. Rev. Lett. 67 (1991) 161-164.
[2] I. Affleck and A. W. Ludwig, Exact conformal field theory results on the multichannel
Kondo effect: Single Fermion Green's function, selfenergy and resistivity, J. High
Energy Phys. 11 (2000) 21.
[3] L. Alvarez-Gaume, S. Coleman and P. Ginsparg, Finiteness of Ricci flat N = 2
supersymmetric $\sigma$-models, Comm. Math. Phys. 103(3) (1986) 423-430.
[4] L. Alvarez-Gaume and D. Z. Freedman, Kähler geometry and renormalization of
supersymmetric $\sigma$-models, Phys. Rev. D 22 (1980) 846-853.
[5] L. Alvarez-Gaume and P. Ginsparg, Finiteness of Ricci flat supersymmetric models,
Comm. Math. Phys. 102 (1985) 311-326.
[6] M. T. Anderson, The $L^2$ structure of moduli spaces of Einstein metrics on
4-manifolds, Geom. Funct. Anal. 2 (1992) 29-89.
[7] D. V. Anosov and A. A. Bolibruch, The Riemann-Hilbert Problem, Aspects of
Mathematics, E22 (Friedr. Vieweg and Sohn, Braunschweig, 1994).
[8] J. Ashkin and G. Teller, Statistics of two-dimensional lattices with four components,
Phys. Rev. 64 (1943) 178-184.
[9] P. S. Aspinwall and D. R. Morrison, String theory on K3 surfaces, in Mirror
Symmetry, eds. B. R. Greene and S. T. Yau, Vol. II, AMS/IP Stud. Adv. Math. (Amer.
Math. Soc., 1994), pp. 703-716.
[10] V. V. Batyrev, Dual polyhedra and mirror symmetry for Calabi-Yau hypersurfaces
in toric varieties, J. Alg. Geom. 3 (1994) 493-535.
[11] R. J. Baxter, Eight-vertex model in lattice statistics, Phys. Rev. Lett. 26 (1971)
832-833.
[12] P. Bouwknecht and D. Ridout, A note on the equality of algebraic and geometric
D-brane charges in WZW, J. High Energy Phys. 0405 (2004) 029.
[13] R. Cohen and D. Gepner, Interacting bosonic models and their solution, Mod. Phys.
Lett. A 6 (1991) 2249.
[14] M. Dine, N. Seiberg, X. G. Wen and E. Witten, Non-perturbative effects on the string
world sheet, Nucl. Phys. B 278 (1986) 769-969.
[15] M. Dine, N. Seiberg, X. G. Wen and E. Witten, Non-perturbative effects on the string
world sheet, Nucl. Phys. B 289 (1987) 319-363.
[16] L. J. Dixon, V. S. Kaplunovsky and J. Louis, On effective field theories describing
(2, 2) vacua of the heterotic string, Nucl. Phys. B 329 (1990) 27-82.
[17] J. Ellis, C. Gomez, D. V. Nanopoulos and M. Quiros, World sheet instanton effects
on no-scale structure, Phys. Lett. B 173 (1986) 59-64.
[18] J. Distler and B. Greene, Some exact results on the superpotential from Calabi-Yau
compactifications, Nucl. Phys. B 309 (1988) 295-316.
[19] L. Dixon, Some worldsheet properties of superstring compactifications, on orbifolds
and otherwise, in Superstrings, Unified Theories and Cosmology, Proc. ICTP Summer
school, 1987, ed. G. Furlan (World Scientific, 1988), pp. 67-126.
[20] M. R. Douglas and W. Taylor, The landscape of intersecting brane models, J. High
Energy Phys. 0701 (2007) 031.
[21] H. Fan, T. J. Jarvis and Y. Ruan, The Witten equation, mirror symmetry and
quantum singularity theory, arXiv:0712.4021.
[22] S. Fredenhagen and V. Schomerus, Branes on group manifolds, Gluon condensates,
and twisted K-theory, J. High Energy Phys. 104 (2001) 7.
[23] D. Friedan, Nonlinear models in 2 + $\varepsilon$ dimensions, Ann. Phys. 163(2) (1985) 318-419.
[24] D. Gepner, Space-time supersymmetry in compactified string theory and
superconformal models, Nucl. Phys. B 296 (1988) 757-778.
[25] D. Gepner, Exactly solvable string compactifications on manifolds of SU(N)
holonomy, Phys. Lett. B 199 (1987) 380.
[26] M. Gerstenhaber, On the deformation of rings and algebras I, Ann. of Math. 79
(1964) 59-103.
[27] M. Gerstenhaber, On the deformation of rings and algebras II, Ann. of Math. 84
(1966) 1-19.
[28] M. Gerstenhaber, On the deformation of rings and algebras III, Ann. of Math. 88
(1968) 1-34.
[29] M. Gerstenhaber, On the deformation of rings and algebras IV, Ann. of Math. 99
(1974) 257-276.
[30] P. Ginsparg, Curiosities at c = 1, Nucl. Phys. B 295 (1988) 153-170.
[31] B. R. Greene, String theory on Calabi-Yau manifolds, hep-th/9702155.
[32] B. R. Greene and M. R. Plesser, Duality in Calabi-Yau moduli space, Nucl. Phys. B
338 (1990) 15-37.
[33] B. R. Greene, C. Vafa and N. P. Warner, Calabi-Yau manifolds and renormalization
group flows, Nucl. Phys. B 324 (1989) 371-390.
[34] P. A. Griffin and O. F. Hernandez, Structure of irreducible SU(2) parafermion
modules derived via the Feigin-Fuchs construction, Int. J. Modern Phys. A 7 (1992)
1233-1265.
[35] M. T. Grisaru, A. E. M. van de Ven and D. Zanon, Four-loop $\beta$-function for the N = 1
and N = 2 supersymmetric non-linear sigma model in two dimensions, Phys. Lett. B
173 (1986) 423.
[36] P. Hu and I. Kriz, Conformal field theory and elliptic cohomology, Adv. Math. 189
(2004) 325-412.
[37] P. Hu and I. Kriz, Closed and open conformal field theories and their anomalies,
Comm. Math. Phys. 254 (2005) 221-253.
[38] P. Hu and I. Kriz, A mathematical formalism for the Kondo effect in WZW branes,
J. Math. Phys. 48 (2007) 072301, 31 pp.
[39] Y. Z. Huang, J. Lepowsky and L. Zhang, Logarithmic tensor product theory for
generalized modules for a conformal vertex algebra, Part I, math/0609833.
[40] Y. Z. Huang, J. Lepowsky and L. Zhang, A logarithmic generalization of tensor
product theory for modules for a vertex operator algebra, Internat. J. Math. 17
(2006) 975-1012; math/0311235.
[41] Y. Z. Huang and L. Kong, Full field algebras, QA/0511328.
[42] Y. Z. Huang and A. Milas, Intertwining operator superalgebras and vertex tensor
categories for superconformal algebras, II, Trans. Amer. Math. Soc. 354 (2002)
363-385.
[43] P. Johnson, Equivariant Gromov-Witten theory of one dimensional stacks, Ph.D.
thesis, Univ. of Michigan (2009).
[44] S. Kachru and E. Witten, Computing the complete massless spectrum of a
Landau-Ginzburg orbifold, Nucl. Phys. B 407 (1993) 637-666.
[45] L. P. Kadanoff, Multicritical behavior of the Kosterlitz-Thouless critical point, Ann.
Phys. 120 (1979) 39-71.
[46] L. P. Kadanoff and A. C. Brown, Correlation functions on the critical lines of the
Baxter and Ashkin-Teller models, Ann. Phys. 121 (1979) 318-342.
[47] L. P. Kadanoff and F. J. Wegner, Some critical properties of the eight-vertex model,
Phys. Rev. D 4 (1971) 3989-3993.
[48] M. Kohmoto and L. P. Kadanoff, Lower bound RSRG approximation for a large
system, J. Phys. A 13 (1980) 3339-3343.
[49] I. Kriz, Some notes on the N-superconformal algebra,
http://www.math.lsa.umich.edu/ikriz.
[50] I. Kriz, On spin and modularity in conformal field theory, Ann. Sci. ENS 36 (2003)
57-112.
[51] J. Maldacena, G. Moore and N. Seiberg, D-brane instantons and K-theory charges,
J. High Energy Phys. 111 (2004) 62.
[52] G. Moore, K-theory from a physical perspective, Topology, Geometry and Quantum
Field Theory, London Math. Soc. Lecture Ser., Vol. 308 (Cambridge Univ. Press,
2004), pp. 194-234.
[53] G. Mussardo, G. Sotkov and M. Stanishkov, N = 2 superconformal minimal models,
Int. J. Mod. Phys. A 4(5) (1989) 1135-1206.
[54] W. Nahm and K. Wendland, A hiker's guide to K3, Comm. Math. Phys. 216 (2001)
85-103.
[55] D. Nemeschansky and A. Sen, Conformal invariance of supersymmetric $\sigma$-models on
Calabi-Yau manifolds, Phys. Lett. B 178(4) (1986) 365-369.
[56] A. Okounkov and R. Pandharipande, The equivariant Gromov-Witten theory of $\mathbb{P}^1$,
Ann. of Math. (2) 163 (2006) 561-605.
[57] M. Reid, La Correspondence de McKay, 52eme annee, session de Novembre 1999,
no. 897, Asterisque 276 (2002) 53-72.
[58] V. Schomerus, Lectures on branes in curved backgrounds, Class. Quant. Grav. 19
(2002) 5781-5847.
[59] G. Segal, The definition of conformal field theory, in Topology, Geometry and
Quantum Field Theory, London Math. Soc. Lecture Note Ser., Vol. 308 (Cambridge
University Press, 2004), pp. 421-577.
[60] C. Vafa and N. Warner, Catastrophes and the classification of conformal theories,
Phys. Lett. B 218 (1989) 51-58.
[61] F. J. Wegner, Corrections to scaling laws, Phys. Rev. B 5 (1972) 4529-4536.
[62] K. Wendland, A family of SCFTs hosting all very attractive relatives to the $(2)^4$
Gepner model, J. High Energy Phys. 0603 (2006) 102.
[63] K. G. Wilson, The renormalization group: Critical phenomena and the Kondo
problem, Rev. Mod. Phys. 47 (1975) 773-840.
[64] K. G. Wilson, Non-Lagrangian models of current algebra, Phys. Rev. 179 (1969)
1499-1512.
[65] K. G. Wilson, Operator-product expansions and anomalous dimensions in the
Thirring model, Phys. Rev. D 2 (1970) 1473-1493.
[66] E. Witten, Phases of N = 2 theories in two dimensions, Nucl. Phys. B 403 (1993)
159-222.
[67] E. Witten, On the Landau-Ginzburg description of N = 2 minimal models, Int. J.
Mod. Phys. A 9 (1994) 4783-4800.
[68] E. Witten, Topological sigma models, Comm. Math. Phys. 118 (1988) 411-449.
[69] A. B. Zamolodchikov, Integrable field theory from conformal field theory, Adv. Stud.
Pure Math. 19 (1989) 641-674.
[70] A. B. Zamolodchikov, Irreversibility of the flux of the renormalization group in a
2D field theory, JETP Lett. 43 (1986) 730-732.
[71] A. B. Zamolodchikov, Renormalization group and perturbation theory about fixed
points in two-dimensional field theory, Sov. J. Nucl. Phys. 46 (1987) 1090-1096.
[72] Y. Zhu, Modular invariance of characters of vertex operator algebras, J. Amer. Math.
Soc. 9 (1996) 237-302.
March 10, 2010 10:14 WSPC/S0129-055X 148-RMP
J070-S0129055X10003928

Reviews in Mathematical Physics


Vol. 22, No. 2 (2010) 193206

c World Scientific Publishing Company
DOI: 10.1142/S0129055X10003928

SPATIAL GROWTH OF FUNDAMENTAL SOLUTIONS
FOR CERTAIN PERTURBATIONS OF THE HARMONIC OSCILLATOR

ARNE JENSEN and KENJI YAJIMA


Department of Mathematical Sciences, Aalborg University,
Fr. Bajers Vej 7G, DK-9220 Aalborg Ø, Denmark
matarne@math.aau.dk
Department of Mathematics, Gakushuin University,
1-5-1 Mejiro, Toshima-ku, Tokyo 171-8588, Japan
kenji.yajima@gakushuin.ac.jp

Received 5 June 2009


Revised 24 November 2009

We consider the fundamental solution for the Cauchy problem for perturbations of the
harmonic oscillator by time dependent potentials which grow at spatial infinity slower
than quadratic but faster than linear functions and whose Hessian matrices have a fixed
sign. We prove that the fundamental solution at resonant times grows indefinitely at
spatial infinity with an algebraic growth rate, which increases indefinitely when the
growth rate of perturbations at infinity decreases from the near quadratic to the near
linear ones.

Keywords: Fundamental solution; Schrödinger equation; harmonic oscillator.

Mathematics Subject Classification 2010: 35A08, 35B10, 35J10, 81Q20

1. Introduction
We consider d-dimensional time dependent Schrödinger equations

$$ i\frac{\partial u}{\partial t} = \left(-\frac{1}{2}\Delta + V(t,x)\right)u(t,x), \quad (t,x)\in\mathbb{R}^1\times\mathbb{R}^d. \tag{1} $$

We assume throughout this paper that $V(t,x)$ is smooth with respect to the $x$
variables, and $V(t,x)$ and its derivatives $\partial_x^\alpha V(t,x)$ are jointly continuous with respect
to $(t,x)$. Under the conditions to be imposed on $V(t,x)$ in what follows, Eq. (1)
generates a unique unitary propagator $\{U(t,s) : t,s\in\mathbb{R}\}$ in the Hilbert space
$H = L^2(\mathbb{R}^d)$, so that the solution in $H$ of (1) with the initial condition

$$ u(s,x) = \varphi(x)\in H $$

is uniquely given by $u(t) = U(t,s)\varphi$. The distribution kernel $E(t,s,x,y)$ of $U(t,s)$
is called the fundamental solution (FDS for short) of Eq. (1):

$$ U(t,s)\varphi(x) = \int E(t,s,x,y)\,\varphi(y)\,dy. $$

We write $E(t,x,y) = E(t,0,x,y)$. It is well known that the FDS of the free
Schrödinger equation, viz. Eq. (1) with $V = 0$, is given by

$$ E_0(t,x,y) = \frac{e^{\mp i d\pi/4}}{|2\pi t|^{d/2}}\, e^{i(x-y)^2/2t}, \quad \pm t > 0, \tag{2} $$

and that of the harmonic oscillator, viz. Eq. (1) with $V(t,x) = x^2/2$, is given for
non-resonant times $m\pi < t < (m+1)\pi$, $m\in\mathbb{Z}$, via Mehler's formula:

$$ E_h(t,x,y) = \frac{e^{-i d\pi(1+2m)/4}}{|2\pi\sin t|^{d/2}}\, e^{(i/\sin t)\left((x^2+y^2)\cos t/2 - x\cdot y\right)}, \tag{3} $$

and, for resonant times $t - s = m\pi$, by

$$ E_h(m\pi,x,y) = e^{-i m d\pi/2}\,\delta(x - (-1)^m y). \tag{4} $$

Note that the FDS for the free Schrödinger equation is smooth and spatially
bounded for any $t \neq 0$; for the harmonic oscillator the FDS has this property
only at non-resonant times; at resonant times $t = m\pi$ singularities of the initial
function $E_h(0,x,y) = \delta(x-y)$ recur at $x = (-1)^m y$; however, it is smooth and
decays rapidly at spatial infinity. Actually, it vanishes outside the singular point
$x = (-1)^m y$ when $t = m\pi$.
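As a numerical sanity check of Mehler's formula, the following minimal sketch evaluates the one-dimensional kernel at a non-resonant time and tests, by central finite differences, that it satisfies the harmonic oscillator equation $i\partial_t E = (-\tfrac12\partial_x^2 + \tfrac12 x^2)E$. The function names `mehler_kernel` and `schrodinger_residual`, the restriction to $d = 1$, and the step sizes are assumptions made for illustration only; they are not part of the paper.

```python
import cmath
import math


def mehler_kernel(t, x, y):
    """One-dimensional Mehler kernel E_h(t, x, y) at a non-resonant time t."""
    s = math.sin(t)
    if abs(s) < 1e-12:
        raise ValueError("t is (numerically) a resonant time m*pi")
    m = math.floor(t / math.pi)  # integer m with m*pi < t < (m+1)*pi
    prefactor = cmath.exp(-1j * math.pi * (1 + 2 * m) / 4) / math.sqrt(abs(2 * math.pi * s))
    phase = ((x * x + y * y) * math.cos(t) / 2 - x * y) / s
    return prefactor * cmath.exp(1j * phase)


def schrodinger_residual(t, x, y, ht=1e-4, hx=1e-3):
    """|i dE/dt - (-1/2 d^2E/dx^2 + x^2 E/2)| using central differences."""
    e = mehler_kernel(t, x, y)
    dt = (mehler_kernel(t + ht, x, y) - mehler_kernel(t - ht, x, y)) / (2 * ht)
    dxx = (mehler_kernel(t, x + hx, y) - 2 * e + mehler_kernel(t, x - hx, y)) / hx**2
    return abs(1j * dt - (-0.5 * dxx + 0.5 * x * x * e))


if __name__ == "__main__":
    for t in (1.0, 4.0):  # one time in (0, pi), one in (pi, 2*pi)
        print(t, abs(mehler_kernel(t, 0.7, 0.3)), schrodinger_residual(t, 0.7, 0.3))
```

The modulus of the kernel is $|2\pi\sin t|^{-1/2}$ independently of $(x, y)$, which illustrates the spatial boundedness at non-resonant times and the blow-up of the prefactor as $t$ approaches $m\pi$.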
We begin with a brief review of properties of the FDS for (1) with general
potentials $V(t,x)$, laying emphasis on its smoothness and boundedness with respect
to the spatial variables $(x,y)$. We denote the classical Hamiltonian and Lagrangian
corresponding to (1), respectively, by

$$ H(t,x,p) = p^2/2 + V(t,x) \quad\text{and}\quad L(t,q,v) = v^2/2 - V(t,q), $$

and $(x(t,s,y,k), p(t,s,y,k))$ is the solution of the initial value problem for
Hamilton's equations

$$ \dot{x}(t) = \partial_p H(t,x,p), \quad \dot{p}(t) = -\partial_x H(t,x,p); \qquad x(s) = y, \quad p(s) = k. \tag{5} $$

We write $(x(t,0,y,k), p(t,0,y,k)) = (x(t,y,k), p(t,y,k))$.


Suppose first that $V(t,x)$ increases at most quadratically at spatial infinity in
the sense that

$$ \sup_t |\partial_x^\alpha V(t,x)| \le C_\alpha, \quad \text{for all } |\alpha| \ge 2. \tag{6} $$

Then, in the seminal work [4], Fujiwara has shown that there exists a $T$ depending
only on $V$ such that the following results hold for the time interval $0 < \pm(t-s) < T$:
The map $\mathbb{R}^d \ni k \mapsto x(t,s,y,k)\in\mathbb{R}^d$ is a diffeomorphism for every fixed $y\in\mathbb{R}^d$
and, therefore, there exists a unique path of (5) such that $x(s) = y$ and $x(t) = x$;
if we write

$$ S(t,s,x,y) = \int_s^t \left(\dot{x}(r)^2/2 - V(r,x(r))\right)dr $$

for the action integral of the path, the FDS $E(t,s,x,y)$ has the form

$$ E(t,s,x,y) = \frac{e^{\mp i d\pi/4}}{(2\pi|t-s|)^{d/2}}\, e^{iS(t,s,x,y)}\, a(t,s,x,y), \quad \pm(t-s) > 0, \tag{7} $$

where $a(t,s,x,y)$ is a smooth function of $(x,y)$ such that, for any $\alpha$ and $\beta$,
$\partial_x^\alpha\partial_y^\beta a(t,s,x,y)$ are $C^1$ with respect to $(t,s,x,y)$ and

$$ |\partial_x^\alpha\partial_y^\beta (a(t,s,x,y) - 1)| \le C_{\alpha\beta}(t-s)^2. \tag{8} $$

Moreover, the semi-classical approximation for the amplitude function is valid in
the sense that as $|t-s|\to 0$

$$ \frac{a(t,s,x,y)}{(2\pi|t-s|)^{d/2}} = (2\pi)^{-d/2}\left|\det\frac{\partial x}{\partial k}(t,s,y,k)\right|^{-1/2} + O(|t-s|^{-(d-2)/2}), \tag{9} $$

where $k$ is the (unique) point such that $x = x(t,s,y,k)$. In particular, $E(t,s,x,y)$
is smooth and bounded with respect to the spatial variables $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$ for
every $0 < |t-s| < T$ (see [9] for a generalization to the case when magnetic fields
are present). For the free Schrödinger equation or for the harmonic oscillator the
relation (9) holds without the error term $O(|t-s|^{-(d-2)/2})$.
Under the condition (6) the structure (7) of the FDS in general breaks down
at later times because singularities of the initial data $\delta(x)$ may recur in finite time,
as the FDS of the harmonic oscillator (4) explicitly demonstrates. If $V(t,x)$ is
subquadratic at spatial infinity in the sense that

$$ \lim_{|x|\to\infty}\sup_t |\partial_x^\alpha V(t,x)| = 0, \quad |\alpha| = 2, \qquad |\partial_x^\alpha V(t,x)| \le C_\alpha, \quad \text{for all } |\alpha|\ge 3, \tag{10} $$

then this recurrence of singularities does not take place, however, and the FDS is
of the form (7) for any finite time ([12]). More precisely, if $V$ satisfies (10), then for
any $T > 0$, there exists $R > 0$ such that, for any $t$ and $s$ with $0 < \pm(t-s) \le T$
and for any pair $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$ with $x^2+y^2\ge R^2$, there exists a unique path
of (5) such that $x(s) = y$ and $x(t) = x$, and the FDS for $0 < \pm(t-s)\le T$ may
be written in the form (7), where, for $(x,y)$ with $x^2+y^2\ge R^2$, $S(t,s,x,y)$ is the
action integral of this path. Moreover, we have $a(t,s,x,y)\to 1$ as $x^2+y^2\to\infty$. In
particular, $E(t,s,x,y)$ is smooth and bounded with respect to $(x,y)$ for any $t\neq s$.
On the other hand, if $d = 1$ and $V(t,x)$ does not depend on $t$, and if $V$ is convex
and $V(x)\ge C|x|^{2+\varepsilon}$ for large $|x|$ for some $\varepsilon > 0$ and $C > 0$, then, under a certain
additional technical assumption on the derivatives, $E(t,x,y)$ is nowhere $C^1$ with
respect to $(t,x,y)$ ([10]). It is also known that, if $V$ satisfies $C_1|x|^{\gamma}\le V(x)\le C_2|x|^{\gamma}$
near infinity with constants $\gamma > 10$ and $0 < C_1\le C_2 < \infty$, then $E(t,0,x,y)$ is
unbounded with respect to $(x,y)$ for any $t\in\mathbb{R}$ ([13]). These results have been
proven only in one dimension so far; however, it is believed that similar results hold
in all dimensions.
In this way, properties of the FDS experience a sharp transition when the
growth rate at spatial infinity of the potential $V(t,x)$ changes from subquadratic
to superquadratic. Thus, the FDS for the borderline case, viz. perturbations of the
harmonic oscillator

$$ i\frac{\partial u}{\partial t} = \left(-\frac{1}{2}\Delta + \frac{1}{2}x^2 + W(t,x)\right)u(t,x), \quad (t,x)\in\mathbb{R}^1\times\mathbb{R}^d, \tag{11} $$

where $W(t,x)$ is subquadratic in the sense that it satisfies (10) with $W$ in place of $V$,
has attracted particular interest of many authors, and the following properties of
$E(t,s,x,y)$ have been established (see, e.g., [14, 5, 12, 2, 3]). We may set $s = 0$,
which we will do, and we will write $E(t,x,y)$ for $E(t,0,x,y)$; $\langle x\rangle = (1+|x|^2)^{1/2}$.

(a) The structure of the FDS $E_h(t,x,y)$ at non-resonant times as stated in (3) is
stable under perturbations and $E(t,x,y)$ is smooth and spatially bounded for
$m\pi < t < (m+1)\pi$.

However, $E(t,x,y)$ at resonant times is more sensitive to perturbations:

(b) If $W$ is sublinear, viz. $|\partial_x^\alpha W(t,x)| = o(1)$, $|\alpha| = 1$, as $|x|\to\infty$ uniformly with
respect to $t$, then the recurrence of singularities at resonant times $m\pi$, $m\in\mathbb{Z}$,
persists ($\mathrm{WF}_x$ denotes the wavefront set):

$$ \mathrm{WF}_x E(m\pi,x,y) = \{(-1)^m (y,\xi) : \xi\in\mathbb{R}^d\setminus\{0\}\}, $$

and it decays rapidly at spatial infinity, viz. for any $N$,

$$ |E(m\pi,x,y)| \le C_N\langle x-y\rangle^{-N}, \quad |x-y|\ge 1. \tag{12} $$

(c) If $W$ is of linear type, viz. $|\partial_x^\alpha W(t,x)|\le C$ for $|\alpha| = 1$, singularities of
$E(0,x,y)$ can propagate at resonant times. For example, if $W = a\langle x\rangle$, then
with $\hat{\xi} = \xi/|\xi|$,

$$ \mathrm{WF}_x E(m\pi,x,y) = \{(-1)^m (y + 2am\hat{\xi},\xi) : \xi\in\mathbb{R}^d\setminus\{0\}\}, $$

but it continues to decay rapidly at spatial infinity:

$$ |E(m\pi,x,y)| \le C_N\langle x-y\rangle^{-N}, \quad |x-y|\ge 1. \tag{13} $$

(d) If $W$ is superlinear and satisfies the following sign condition on the Hessian
matrix $\partial_x^2 W = (\partial^2 W/\partial x_j\partial x_k)$:

$$ C_1\langle x\rangle^{-\delta} \le \partial_x^2 W(t,x) \le C_2\langle x\rangle^{-\delta}, \quad (t,x)\in\mathbb{R}^1\times\mathbb{R}^d, \tag{14} $$

for some constants $0 < \delta < 1$ and $0 < C_1 < C_2 < \infty$ or $-\infty < C_1 < C_2 < 0$,
then $E(m\pi,x,y)$, $m\in\mathbb{Z}$, is $C^\infty$ with respect to $(x,y)$, viz. singularities at
resonant times $t = m\pi$ are swept away.

This paper is concerned with the properties of the FDS $E(t,x,y)$ when $t$ is
at resonant times $t\in\pi\mathbb{Z}$. We show that, in the last case (d) above, $E(m\pi,x,y)$
increases indefinitely as $|x|\to\infty$ at the algebraic rate $C|x|^{d\delta/(2-2\delta)}$, exhibiting
a sharp contrast to the decay result (12) or (13) for the case when $W$ is at
most linearly increasing at spatial infinity. More precisely we prove the following
theorem:

Theorem 1.1. Suppose that $W(t,x)$ is subquadratic and satisfies the sign condition
(14) for some $0 < \delta < 1$ and $0 < C_1 < C_2 < \infty$ or $-\infty < C_1 < C_2 < 0$. Let $m\in\mathbb{Z}$
and $y\in\mathbb{R}^d$ be fixed. Let $\chi\in C_0^\infty(\mathbb{R}^d\setminus\{0\})$ be such that $\chi(x) = 1$ for $a\le|x|\le b$,
$0 < a < b < \infty$ being constants. Then there exist constants $0 < M_1 < M_2$,
independent of $R\ge 1$, such that

$$ M_1 R^{d\delta/(2-2\delta)} \le \left(\int_{\mathbb{R}^d} |E(m\pi,x,y)|^2\,\chi\!\left(\frac{x}{R}\right)^2\frac{dx}{R^d}\right)^{1/2} \le M_2 R^{d\delta/(2-2\delta)}. \tag{15} $$

It is interesting to note that, when $\delta$ increases from 0 to 1, the growth rate
as $|x|\to\infty$ of $W(t,x)$ decreases (hence $W(t,x)$ becomes weaker), whereas that of
$E(m\pi,x,y)$ as $|x-y|\to\infty$, $r(\delta) = d\delta/(2-2\delta)$, increases from 0 indefinitely to
infinity. This seemingly contradictory behavior may be understood via the
semi-classical picture as follows. For functions $a(x)$ and $b(x)$ on $\Omega$, $a\sim b$ means that
$A_1 a(x)\le b(x)\le A_2 a(x)$, $x\in\Omega$, for constants $0 < A_1 < A_2$. At time 0 consider
the ensemble of classical particles in the phase space $\mathbb{R}^d\times\mathbb{R}^d$ sitting on the
linear Lagrangian manifold $\{(x,p)\in\mathbb{R}^d\times\mathbb{R}^d : x = y,\ p\in\mathbb{R}^d\}$ with uniform
momentum distribution $(2\pi)^{-d/2}dp$. Semiclassically, this is described by the wave
function $\delta(x-y) = E(0,x,y)$. After time $m\pi$, the ensemble will be transported by the Hamilton
flow (5) to the Lagrangian manifold $\{(x(m\pi,y,k), p(m\pi,y,k)) : k\in\mathbb{R}^d\}$. As we
shall see below, we have $|p(m\pi,y,k)|\sim|k|$ and $|x(m\pi,y,k)|\sim|k|^{1-\delta}$ as $|k|\to\infty$.
It follows at least semiclassically (see (9)) that

$$ |E(m\pi,x,y)| \sim \left|\det\frac{\partial x}{\partial k}\right|^{-1/2} \sim |k|^{d\delta/2} \sim |x|^{d\delta/(2-2\delta)}, \quad |x|\to\infty, $$

which is consistent with (15). Here is another remark, which clarifies that
Theorem 1.1 is more or less consistent with the known results. We should note that
if $\delta = 0$, then $W = c\langle x\rangle^2$, and $m\pi$ is no longer a resonant time for $V = x^2/2 + W$,
and the corresponding $E(m\pi,x,y)$ is bounded as $|x-y|\to\infty$; on the other hand,
if $\delta = 1$, then $W = c\langle x\rangle$ and, as in (c) above, a large portion of $E(m\pi,x,y)$ is
concentrated in a bounded domain $|x-y|\le 2cm$, which may be represented as the
extreme case of $C\langle x\rangle^{d\delta/(2-2\delta)}$ as $\delta\to 1$.
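The semi-classical heuristic above can be probed numerically. The sketch below, written under assumptions not taken from the paper, integrates Hamilton's equations in dimension $d = 1$ for the model potential $V(x) = x^2/2 + c\,(1+x^2)^{(2-\delta)/2}$ (whose second derivative of the perturbation behaves like $\langle x\rangle^{-\delta}$), starts from $y = 0$ with large momentum $k$, and estimates the exponent in $|x(\pi,y,k)|\sim|k|^{1-\delta}$ from two values of $k$. The helper names `flow` and `growth_exponent`, the coupling $c = 0.1$, and the classical fourth-order Runge-Kutta scheme are all illustrative choices.

```python
import math


def flow(y, k, t_end, delta, c, steps=4000):
    """RK4 for x' = p, p' = -x - W'(x), with W(x) = c*(1 + x^2)^((2 - delta)/2)."""
    def force(x):
        # -dV/dx for V = x^2/2 + W(x)
        return -x - c * (2 - delta) * x * (1 + x * x) ** (-delta / 2)

    h = t_end / steps
    x, p = y, k
    for _ in range(steps):
        k1x, k1p = p, force(x)
        k2x, k2p = p + 0.5 * h * k1p, force(x + 0.5 * h * k1x)
        k3x, k3p = p + 0.5 * h * k2p, force(x + 0.5 * h * k2x)
        k4x, k4p = p + h * k3p, force(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p


def growth_exponent(delta, c=0.1, k_small=1e3, k_large=1e4):
    """Slope of log|x(pi, 0, k)| against log k between two large momenta."""
    x1, _ = flow(0.0, k_small, math.pi, delta, c)
    x2, _ = flow(0.0, k_large, math.pi, delta, c)
    return math.log(abs(x2) / abs(x1)) / math.log(k_large / k_small)


if __name__ == "__main__":
    for delta in (0.25, 0.5):
        print(delta, growth_exponent(delta), 1 - delta)
```

In this one-dimensional toy model the estimated slope should be close to $1-\delta$, so that $|\partial x/\partial k|^{-1/2}\sim|k|^{\delta/2}\sim|x|^{\delta/(2-2\delta)}$, in line with the rate $r(\delta)$ for $d = 1$.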
We mention here that the result of the theorem was conjectured by Martinez
and the second author in [7], where a similar problem is studied in the semi-classical
setting. More precisely, they consider the FDS of the semi-classical Schrödinger
equation
 2 
u h 1
ih = + x2 + h W (x) u,
t 2 2
where $W(x)$ is $t$-independent and satisfies the same conditions as in this paper,
(10) and (14); and they prove that the FDS at the resonant times may be written
in the form
E(m, x, y) = hd(1+)/2 a(x, y, h)eiS(x,y)/h , = /(1 ), (16)
where $S(x,y)$ is the action integral of the path of (5) connecting $x(0) = y$ and
$x(m\pi) = x$, and $a(x,y,h)$ satisfies $C^{-1}\le|a(x,y,h)|\le C$ uniformly with respect to $h$
on every compact subset $K$ of $\mathbb{R}^{2d}\setminus\{(x,(-1)^m x) : x\in\mathbb{R}^d\}$. Thus, $E(m\pi,x,y)$ has
the extra growing factor hd/2 as h 0 compared to E(t, x, y) at non-resonant
times $t\neq m\pi$, and they remark that, if their arguments applied to non-smooth
potentials, (16) would imply the estimate (15) of Theorem 1.1 for the homogeneous
potential $W(x) = C|x|^{2-\delta}$.
It is well known that the boundedness of $E(t,s,x,y)$ with respect to $(x,y)$
implies the so-called $L^p$-$L^q$ estimates of the propagator $U(t,s)$ (hence, also finite
time Strichartz estimates). There are examples of Schrödinger equations with
smooth coefficients which exhibit breakdown of the estimates, e.g., the harmonic
oscillator at resonant times. However, to the best knowledge of the authors, in all
known examples they are broken because of local singularities, and Theorem 1.1 is
the first example in which they are broken because of the growth at spatial
infinity of the FDS (see [8] for $L^p$-$L^q$ estimates for potentials which are singular but
decay at infinity). For the micro-local smoothing estimate which may be applied
for proving the smoothness of the FDS, see for example [1] or [6].
The rest of the paper is devoted to the proof of this theorem. We prove it only
in the $m = 1$ case. The proof for the other cases is similar. In Sec. 2, we recall
several known facts, which will be used in Sec. 3, where the theorem is proved. We
often omit some of the variables of functions, if no confusion is to be feared. For
functions $f$ of several variables, we write $f\in C^k(x)$ or $f\in C^k(t,x)$ etc., if $f$ is of
class $C^k$ with respect to $x$ or $(t,x)$, etc.

2. Preliminaries
We first recall some results on the Hamiltonian flow generated by (5) when $V(t,x) =
x^2/2 + W(t,x)$ and $W$ is subquadratic. We set the initial time $s = 0$ and omit the
variable $s$. The solutions $(x(t), p(t)) = (x(t,y,k), p(t,y,k))$ of (5) satisfy the integral
equations

$$ x(t) = y\cos t + k\sin t - \int_0^t \sin(t-s)\,\partial_x W(s,x(s))\,ds, \tag{17} $$

$$ p(t) = -y\sin t + k\cos t - \int_0^t \cos(t-s)\,\partial_x W(s,x(s))\,ds. \tag{18} $$
Since the subquadratic condition implies $|\dot{x}(t)| + |\dot{p}(t)| \le C(1 + |x(t)| + |p(t)|)$ for
a constant $C > 0$ and, hence,

$$ e^{-C|t|}(1+|y|+|k|) \le (1+|x(t)|+|p(t)|) \le e^{C|t|}(1+|y|+|k|), \tag{19} $$

it follows, as $y^2+k^2\to\infty$, uniformly with respect to $t$ in compact intervals, that

$$ |x(t) - (y\cos t + k\sin t)| = o(|y|+|k|), \tag{20} $$

$$ |p(t) - (-y\sin t + k\cos t)| = o(|y|+|k|). \tag{21} $$
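The estimates (20) and (21) say that, relative to the size of the initial data, the perturbed flow stays close to the harmonic oscillator flow. A minimal numerical illustration, under assumptions introduced here only for the example ($d = 1$, the time-independent model perturbation $W(x) = c\,(1+x^2)^{(2-\delta)/2}$ with $\delta = 0.5$, $c = 0.1$, a fixed time $t = 2$, and a classical RK4 integrator), computes the relative deviation for increasing $|k|$. The names `perturbed_flow` and `deviation_ratio` are hypothetical.

```python
import math


def perturbed_flow(y, k, t_end, delta=0.5, c=0.1, steps=4000):
    """RK4 for x' = p, p' = -x - W'(x), with W(x) = c*(1 + x^2)^((2 - delta)/2)."""
    def force(x):
        return -x - c * (2 - delta) * x * (1 + x * x) ** (-delta / 2)

    h = t_end / steps
    x, p = y, k
    for _ in range(steps):
        k1x, k1p = p, force(x)
        k2x, k2p = p + 0.5 * h * k1p, force(x + 0.5 * h * k1x)
        k3x, k3p = p + 0.5 * h * k2p, force(x + 0.5 * h * k2x)
        k4x, k4p = p + h * k3p, force(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p


def deviation_ratio(k, y=1.0, t=2.0):
    """(|x - x_h| + |p - p_h|) / (|y| + |k|), x_h, p_h being the unperturbed flow."""
    x, p = perturbed_flow(y, k, t)
    x_h = y * math.cos(t) + k * math.sin(t)
    p_h = -y * math.sin(t) + k * math.cos(t)
    return (abs(x - x_h) + abs(p - p_h)) / (abs(y) + abs(k))


if __name__ == "__main__":
    for k in (1e2, 1e3, 1e4):
        print(k, deviation_ratio(k))
```

For this model the printed ratios decrease as $|k|$ grows, which is the behavior expressed by the $o(|y|+|k|)$ bounds.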

We fix $m\in\mathbb{Z}$, $m\neq 0$, and $0 < \varepsilon < \pi/2$, and consider $t$ in the interval
$I = [m\pi-\varepsilon, m\pi+\varepsilon]$. Then, the following results have been proved in Lemmas 2.3, 2.5
and 3.5, respectively, of [12] by using the integral equations (17) and (18).

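(i) For any $\alpha$ and $\beta$, as $R^2 = y^2+k^2\to\infty$,

$$ \partial_y^\alpha\partial_k^\beta\left(\partial_y x(t) - (\cos t)\mathbf{1}\right)\to 0, \quad \partial_y^\alpha\partial_k^\beta\left(\partial_k x(t) - (\sin t)\mathbf{1}\right)\to 0, \tag{22} $$

$$ \partial_y^\alpha\partial_k^\beta\left(\partial_y p(t) + (\sin t)\mathbf{1}\right)\to 0, \quad \partial_y^\alpha\partial_k^\beta\left(\partial_k p(t) - (\cos t)\mathbf{1}\right)\to 0, \tag{23} $$

uniformly with respect to $t\in I$. Here $\mathbf{1}$ is the $d\times d$ identity matrix.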


(ii) Let $R > 0$ be sufficiently large. Then, for any $t\in I$ and $(\xi,y)\in\mathbb{R}^{2d}$
with $\xi^2+y^2\ge R^2$, there exists a unique $k\in\mathbb{R}^d$ such that the solution
$(x(s,y,k), p(s,y,k))$ of (5) satisfies

$$ p(t,y,k) = \xi. \tag{24} $$

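(iii) Let $R$ be as in (ii) and define $\phi(t,\xi,y)$ for $t\in I$ and $\xi^2+y^2 > R^2$ by

$$ \phi(t,\xi,y) = x(t,y,k)\cdot\xi - \int_0^t L(s, x(s,y,k), \dot{x}(s,y,k))\,ds, $$

where $k$ is determined by (24). Then $\phi\in C^\infty(\xi,y)$ and $\partial_\xi^\alpha\partial_y^\beta\phi\in C^1(t,\xi,y)$
for any $\alpha$, $\beta$; $\phi$ is a generating function of the canonical map $(p(t,y,k), y)\mapsto
(x(t,y,k), k)$:

$$ (\partial_\xi\phi)(t,p(t,y,k),y) = x(t,y,k), \quad (\partial_y\phi)(t,p(t,y,k),y) = k, \tag{25} $$

and satisfies the Hamilton-Jacobi equation $\partial_t\phi = \xi^2/2 + V(t,\partial_\xi\phi)$.
Moreover, as $\xi^2+y^2\to\infty$, $\partial_\xi^\alpha\partial_y^\beta\phi$ approaches the corresponding function of the
harmonic oscillator whenever $|\alpha+\beta|\ge 2$:

$$ \sup_{t\in I}\left|\partial_\xi^\alpha\partial_y^\beta\left(\phi(t,\xi,y) - \frac{(\xi^2+y^2)\sin t + 2\,\xi\cdot y}{2\cos t}\right)\right|\to 0. $$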
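As a consistency check of (iii), one can work out the unperturbed case $W = 0$ by hand. The following worked computation is a sketch added for illustration; the notation $\phi_h$ for the harmonic-oscillator function appearing inside the supremum is a label introduced here, and $t$ is assumed to lie in $I$ so that $\cos t\neq 0$. It verifies the generating-function relations (25) and the Hamilton-Jacobi equation with $V = x^2/2$.

```latex
% Harmonic oscillator (W = 0): x(t) = y\cos t + k\sin t, \; p(t) = -y\sin t + k\cos t.
\begin{align*}
\xi = p(t,y,k) \;&\Longrightarrow\; k = \frac{\xi + y\sin t}{\cos t},
\qquad x(t,y,k) = \frac{y + \xi\sin t}{\cos t},\\
\phi_h(t,\xi,y) &= \frac{(\xi^2+y^2)\sin t + 2\,\xi\cdot y}{2\cos t},\\
\partial_\xi\phi_h &= \frac{\xi\sin t + y}{\cos t} = x(t,y,k),
\qquad
\partial_y\phi_h = \frac{y\sin t + \xi}{\cos t} = k,\\
\partial_t\phi_h &= \frac{\xi^2 + y^2 + 2\,\xi\cdot y\,\sin t}{2\cos^2 t}
 = \frac{\xi^2}{2} + \frac{1}{2}\bigl(\partial_\xi\phi_h\bigr)^2 .
\end{align*}
```

The last identity follows from $\xi^2\cos^2 t + (y + \xi\sin t)^2 = \xi^2 + y^2 + 2\,\xi\cdot y\,\sin t$.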

Furthermore, we have the following representation formula of the FDS [12,
Theorem 1.3(2)].

Theorem 2.1. Let W be subquadratic. Then, for t I = [m , m + ], the


FDS E(t, x, y) of (11) may be written in the following form
 2
i(m+1)d eixi(t,,y)
/2
a(t, , y)
E(t, x, y) = lim d (26)
0 Rd (2)d |cos t|d/2
where the integral converges in the C topology with respect to (x, y) and the func-
tions and a satisfy the following properties:

(a) C (, y), y C 1 (t, , y) for any , and

(t,
, y) = (t, , y) for t I, 2 + y 2 R2 .

(b) a C (, y), x y a C 1 (t, , y) for any , and

lim sup |x y (a(t, , y) 1)| 0


2 +y 2 tI

for any and .

We call integrals of the form (26) oscillatory integrals and often write them
simply as

i(m+1)d eixi(t,,y)

a(t, , y)
d.
Rd (2) |cos t|
d d/2

When $W$ satisfies the sign condition (14), the phase function $\Phi(\pi, \xi, y)$ satisfies the following properties, which are essential for the proof of the theorem. From now on we let $m = 1$.

Proposition 2.2. Let $W$ be subquadratic and satisfy (14). Let $L > 0$. Then, there exist constants $C > 0$ and $R > 0$ depending only on $L$ such that for every $|\xi| \ge R$ and $|y| \le L$:

C1 ||1 | (, , y)| C2 ||1 , (27)


| (, , y)| C|| , || 2. (28)

Proof. The upper bound in estimate (27) is obvious from (25), (17) and (20); the lower bound is proved in [11, pp. 61–63] for time independent perturbations $W(t, x) = W(x)$, and the proof applies to the time dependent case as well, if we use [12, Lemmas 2.1 and 2.2] instead of [11, Lemmas 4.2 and 4.3]. From [11, pp. 61–63], we also have for $|\xi| \ge R$ and $k$ such that $p(\pi, y, k) = \xi$

k x(, y, k) || . (29)

Differentiating $(\partial_\xi \Phi)(\pi, p(\pi, y, k), y) = x(\pi, y, k)$ with respect to $k$, we have

\[ (\partial_\xi^2 \Phi)(\pi, \xi, y)\,\partial_k p(\pi, y, k) = \partial_k x(\pi, y, k) \tag{30} \]

and, applying the second result of (23) and (29), we obtain (28) for the case $|\alpha| = 2$. For higher derivatives, we further differentiate (30) and apply (22) and (23) in addition to (29). Estimate (28) follows inductively.

Lemma 2.3. Let L > 0 and 0 < a < b < be fixed arbitrarily and let C0 (Rd )
be supported by {x Rd : a |x| b}. Then, there exist R0 > 0 and C0 > 0, such
that for all R > R0 and |y| L

1
|( (, , y)/R)|2 d C0 Rd/(1) . (31)
R d Rd
If (x) > > 0 for a1 < |x| < b1 , a < a1 < b1 < b, then we also have the lower
bound:

1
C1 R d/(1)
d |( (, , y)/R)|2 d. (32)
R Rd

Proof. For suciently large R > 0, we have by virtue of (21) that 1/2
|p(, y, k)|/|k| 2 for |y| L and |k| R, and (27) implies

C1 |k|1 |x(, y, k)| C2 |k|1 .

It follows that, if (x(, y, k)/R) = 0, then aR/C2 |k|1 bR/C1 . Hence,


whenever ( (, , y)/R) = 0, we have

D1 R1/(1) || D2 R1/(1)

and

1
|( (, , y)/R)|2 d CRd/(1) .
Rd
A similar argument yields the lower bound in the second case. We omit the obvious
details.

3. Proof of Theorem 1.2


Before starting the proof we remark the following: If we were able to prove the
faster decay as || for the higher derivatives , say,

| (, , y)| C|||| , (33)

then the standard stationary phase method combined with a change of scale would
yield the pointwise estimate

|E(m, x, y)| C|x|d/(22) as |x| . (34)

However, (33) does not seem to hold in general, which forces a weaker formulation of the theorem and the somewhat complicated proof given below.

We need to estimate
    2
1  
I(R)  x E(, x, y) dx. (35)
Rd  R 
R d

In what follows, we omit the variable , the domain of integration Rd from integral
signs and write
as . Since y is xed in the following computation, we sometimes
omit the variables y as well. This should not cause any confusion. Then, by virtue
of (26), (35) may be written as an oscillatory integral
    2
1  
I(R) = lim  x e i(x()) 2 /2
a()d  dx
0 (2)2d Rd  R 
  2
1 x 2 2
= lim 2d d
eix()+i(()())( + )/2 a()a()dddx
0 (2) R R

1 2 2
= lim 2 (R( ))ei(()())( + )/2 a()a()dd,
(36)
0 (2)d

where we wrote 2 (x) = 2 (x) and we dened the Fourier transform by



1
f() = (F f )() = eix f (x)dx.
(2)d

In what follows we omit the limit sign lim0 and the damping factors which arise
from exp(( 2 + 2 )/2). In the right-hand side of (36), we change variables to
= and expand by Taylors formula as
 1  1
a( + ) = a() () + b (, )
! !
||N ||=N +1

in the resulting formula, where a() = a and where we wrote


 1
b (, ) = (1 )N a() ( + )d.
0

This expresses I(R) as


 
1
2 (R) ei(+)i() a()a() ()dd + BN (R),
(37)
(2)d !
||N

where BN (R) is the sum over with || = N + 1 of constants times



2 (R) ei((+)()) a()b (, )dd

  
i() i(+)
= e a() e
2 (R) b (, )d d.

We take  N such that (1 ) > d and apply integration by parts  times to the
inner integral, which we denote by I(, R), by using the identity

1 i ( + )
ei(+) = ei(+) .
1 + ( ( + ))2

Thus, if we write M for the transpose of the dierential operator on the left, we
have

I(, R) = ei(+) M (
2 (R) b (, ))d. (38)

Since M has the form


 
1 i
M= + i div + ,
1 + ( )2 1 + ( )2 1 + ( )2

are bounded for || 2 and since

C 1 +
2(1) 1 + ( ( + ))2 C +
2(1)

by virtue of (27), M is an th order dierential operator with respect to whose


coecients are bounded by C +
(1) . Hence

 
|I(, R)| C +
(1) R|| |(
2 )(R)|| || b (, )|d.
|++|

2 () is rapidly decreasing and b (, ) are bounded, the integrand is


Since
bounded for any L > 0 by a constant times

(1)


(1) R
L R|| ||N +1|| .

It follows, by changing variables to /R, and by taking L large enough, that for
R>1

(1)
|I(, R)| CRN 1d+
/R
(1)
N +1L d

(1)
C  RN 1d+
.

Thus, for  such that (1 ) > d we may estimate the remainder BN (R) in (37) by

C (1) C
|BN (R)| |a()|
d ,
RN +d+1 RN +d+1

and we may ignore BN (R) by taking N large enough. We have next to deal with
the rst terms in (37), which are sum over || N of
  
1
A =
2 (R) i((+)())
e d a()a() ()d. (39)
(2)d !
By using Taylors formula, we write

ei((+)()) = ei () ei(,) ,
 1 
2
(, ) = (1 ) 2 ( + )d ,
0

and expand ei via Taylors formula:


 N 
 (i)m N +1  1
(i)
ei((+)()) = ei () + (1 )N ei d ,
m=0
m! N ! 0

where we take N large enough so that (N + 1) > d. We then insert this into the
right-hand side of (39). Note that

|(, )| C

||2

by virtue of (28). It follows that the contribution to A of the term containing


(i)N +1 /(N + 1)! is bounded by taking L such that L > (2 + )(N + 1) + || + d by

CLN R
L ||2(N +1)+||
(N +1)
(N +1) dd
 
2(N +1)||d L+(N +1) 2(N +1)+||
CLN R
|| d
(N +1) d

CR(d+||+2N +2) .

Thus, we may again ignore this term and we are left for A with
N   
1 1
e i ()
2 (R) (i(, )) d a()a() ()dd.
m
(2)d ! m=0 m!

Here we repeat the same argument as in the rst step to the inner integral. We
expand (, ) further by Taylors formula:
 ()
(, ) = () + LN (, ),
!
2||N

  1 
LN (, ) = C (1 )N () ( + )d
||=N +1 0

and expand the product (, )m . We estimate the contribution to A of the terms


which contain LN , by performing integration by parts  times, (1 ) > d, by

using the identity



1 i ()
ei () = ei ()
1 + | ()|2
and the estimate (27). This yields the bound CR2(N +1)d+ for the contribution
and we ignore them. The rest is a sum of the terms of the form
C1 m (1 ) () (m ) (), = 1 + + m
and their contributions to A are given by constants times

2 (R) (+) (1 ) () (m ) ()a()a() ()dd
ei ()

1
= (+ 2 )( ()/R)(1 ) () (m ) ()a()a() ()d.
(iR)||+||Rd
m
Here |1 |, . . . , |m | 2 and |(1 ) () (m ) ()| C
by (28) and this
integral is bounded in modulus by

C m
|(+ 2 )( ()/R)|
d
R||+||Rd
C  Rd/(1) R|+| Rm/(1) ,
by virtue of Lemma 2.3. Thus the main contribution to I(R) is given by the term
with m = 0 and = 0:

1 1
( ()/R)2 |a()|2 d.
(2)d Rd
Since a() 1 as || , this is comparable with CRd/(1) for large R by
virtue of Lemma 2.3. The theorem follows.

Acknowledgements
The first author was partially supported by the Danish Natural Science Research Council grant Mathematical Physics. The second author was supported by JSPS grant in aid for scientific research No. 18340041. This work was done while the second author was visiting the Department of Mathematical Sciences of Aalborg University. He acknowledges the hospitality of the department.

References
[1] W. Craig, T. Kappeler and W. Strauss, Microlocal dispersive smoothing for the Schrödinger equation, Comm. Pure Appl. Math. 48 (1995) 769–860.
[2] S. Doi, Dispersion of singularities of solutions for Schrödinger equations, Comm. Math. Phys. 250 (2004) 473–505.
[3] S. Doi, Smoothness of solutions for Schrödinger equations with unbounded potentials, Publ. RIMS Kyoto Univ. 41 (2005) 175–221.
[4] D. Fujiwara, Remarks on the convergence of the Feynman path integrals, Duke Math. J. 47 (1980) 41–96.
[5] L. Kapitanski, I. Rodnianski and K. Yajima, On the fundamental solution of a perturbed harmonic oscillator, Topol. Methods Nonlinear Anal. 9 (1997) 77–106.
[6] A. Martinez, S. Nakamura and V. Sordoni, Analytic smoothing effect for the Schrödinger equation with long-range perturbation, Comm. Pure Appl. Math. 59(9) (2006) 1330–1351.
[7] A. Martinez and K. Yajima, On the fundamental solution of semiclassical Schrödinger equations at resonant times, Comm. Math. Phys. 216 (2001) 357–373.
[8] W. Schlag, Dispersive estimates for Schrödinger operators: A survey, in Mathematical Aspects of Nonlinear Dispersive Equations, Ann. of Math. Stud., Vol. 163 (Princeton Univ. Press, Princeton, NJ, 2007), pp. 255–285.
[9] K. Yajima, Schrödinger evolution equations with magnetic fields, J. d'Analyse Math. 56 (1991) 29–76.
[10] K. Yajima, Smoothness and non-smoothness of the fundamental solution of time dependent Schrödinger equations, Comm. Math. Phys. 181 (1996) 605–629.
[11] K. Yajima, On fundamental solution of time dependent Schrödinger equations, Contemp. Math. 217 (1998) 49–68.
[12] K. Yajima, On the behavior at infinity of the fundamental solution of time dependent Schrödinger equation, Rev. Math. Phys. 13 (2001) 891–920.
[13] G. P. Zhang and K. Yajima, Smoothing property for Schrödinger equations with potential super-quadratic at infinity, Comm. Math. Phys. 221 (2001) 573–590.
[14] S. Zelditch, Reconstruction of singularities for solutions of Schrödinger equation, Comm. Math. Phys. 90 (1983) 1–26.
March 10, 2010 10:13 WSPC/S0129-055X 148-RMP
J070-S0129055X1000393X

Reviews in Mathematical Physics


Vol. 22, No. 2 (2010) 207–231

© 2010 by the authors
DOI: 10.1142/S0129055X1000393X

ON THE EXISTENCE OF THE DYNAMICS FOR ANHARMONIC QUANTUM OSCILLATOR SYSTEMS

BRUNO NACHTERGAELE, BENJAMIN SCHLEIN, ROBERT SIMS, SHANNON STARR and VALENTIN ZAGREBNOV

Department of Mathematics, University of California,
Davis, CA 95616, USA
bxn@math.ucdavis.edu

Centre for Mathematical Sciences,
University of Cambridge, Cambridge, CB3 0WB, UK
b.schlein@dpmms.cam.ac.uk

Department of Mathematics,
University of Arizona, Tucson, AZ 85721, USA
rsims@math.arizona.edu

Department of Mathematics, University of Rochester,
Rochester, NY 14627, USA
sstarr@math.rochester.edu

Université de la Méditerranée (Aix-Marseille II),
Centre de Physique Théorique-UMR 6207 CNRS,
Luminy - Case 907, 13288 Marseille, Cedex 09, France
zagrebnov@cpt.univ-mrs.fr

Received 18 September 2009

We construct a $W^*$-dynamical system describing the dynamics of a class of anharmonic quantum oscillator lattice systems in the thermodynamic limit. Our approach is based on recently proved Lieb–Robinson bounds for such systems on finite lattices [19].

Keywords: Thermodynamic limit; infinite-system dynamics; anharmonic lattice.

Mathematics Subject Classification 2010: 82C10, 82C20, 81Q15, 37K60, 46L55

1. Introduction
The dynamics of a finite quantum system, i.e. one with a finite number of degrees of freedom described by a Hilbert space $\mathcal{H}$, is given by the Schrödinger equation.


© 2010 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.

The Hamiltonian $H$ is a densely defined self-adjoint operator on $\mathcal{H}$, and for a vector $\psi(t)$ in the domain of $H$ the state at time $t$ satisfies

\[ i\partial_t \psi(t) = H\psi(t). \tag{1.1} \]

For all initial conditions $\psi(0) \in \mathcal{H}$, the unique solution is given by

\[ \psi(t) = e^{-itH}\psi(0), \quad \text{for all } t \in \mathbb{R}. \]

Due to Stone's Theorem, $e^{-itH}$ is a strongly continuous one-parameter group of unitary operators on $\mathcal{H}$, and the self-adjointness of $H$ is the necessary and sufficient condition for the existence of a unique continuous solution for all times.
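To make the statement about $e^{-itH}$ concrete, here is a minimal sketch for a toy two-level Hamiltonian $H = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, which is an assumption chosen only because $H^2 = 1$ gives the closed form $e^{-itH} = \cos t \cdot 1 - i \sin t \cdot H$. The sketch checks the group property, unitarity, and that the generator recovered from a difference quotient is $H$; all helper names are illustrative.

```python
import math

H = [[0.0, 1.0], [1.0, 0.0]]  # toy self-adjoint Hamiltonian (an assumption)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def U(t):
    """exp(-i t H) for the toy H above, using H^2 = 1."""
    c, s = math.cos(t), math.sin(t)
    return [[complex(c, 0.0), complex(0.0, -s)],
            [complex(0.0, -s), complex(c, 0.0)]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 complex matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def dist(A, B):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

s, t = 0.4, 1.1
group_error = dist(matmul(U(s), U(t)), U(s + t))             # U(s)U(t) = U(s+t)
unitarity_error = dist(matmul(dagger(U(t)), U(t)), IDENTITY)  # U(t)* U(t) = 1

h = 1e-6  # step for the difference quotient i (U(h) - 1) / h, which approximates H
generator = [[1j * (U(h)[i][j] - IDENTITY[i][j]) / h for j in range(2)]
             for i in range(2)]
generator_error = dist(generator, H)

print(group_error, unitarity_error, generator_error)
```

In infinite dimensions the same three properties hold by Stone's theorem, but the difference quotient converges only on the domain of $H$.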
An alternative description of this dynamics is the so-called Heisenberg picture in which the time evolution is defined on the algebra of observables instead of the Hilbert space of states. The corresponding Heisenberg equation is

\[ \partial_t A(t) = i[H, A(t)], \tag{1.2} \]

where, for each $t \in \mathbb{R}$, $A(t) \in \mathcal{B}(\mathcal{H})$ is a bounded linear operator on $\mathcal{H}$. Its solutions are given by a one-parameter group of $*$-automorphisms, $\tau_t$, of $\mathcal{B}(\mathcal{H})$:

\[ A(t) = \tau_t(A(0)). \]

For the description of physical systems we expect the Hamiltonian, $H$, to have some additional properties. For example, for finite systems such as atoms or molecules, stability of the system requires that $H$ is bounded from below. In this case, the infimum of the spectrum is expected to be an eigenvalue and is called the ground state energy. When the model Hamiltonian, $H$, is describing bulk matter rather than finite systems, we expect some additional properties. For example, the stability of matter requires that the ground state energy has a lower bound proportional to $N$, where $N$ is the number of degrees of freedom. Much progress on this stability property has been made in the last several decades [24, 12]. We also expect that the dynamics of local observables of bulk matter, or large systems in general, depends only on the local environment. Mathematically this is best expressed by the existence of the dynamics in the thermodynamic limit, i.e. in infinite volume. This is the question we address in this paper.
There are two settings that allow one to prove a rich set of important physical properties of quantum dynamical systems, including infinite ones: the $C^*$-dynamical systems and the $W^*$-dynamical systems [3]. In both cases, the algebra of observables can be thought of as a norm-closed $*$-subalgebra $\mathcal{A}$ of some algebra of the form $\mathcal{B}(\mathcal{H})$, but in the case of the $W^*$-dynamical systems, we additionally require that the algebra is closed for the weak operator topology, which makes it a von Neumann algebra. For a $C^*$-dynamical system, the group of automorphisms $\tau_t$ is assumed to be strongly continuous, i.e. for all $A \in \mathcal{A}$, the map $t \mapsto \tau_t(A)$ is continuous in $t$ for the operator norm ($C^*$-norm) on $\mathcal{A}$. In a $W^*$-dynamical system the continuity is with respect to the weak topology.

In the case of lattice systems with a finite-dimensional Hilbert space of states associated with each lattice site, such as quantum spin-lattice systems and lattice fermions, it has been known for a long time that under rather general conditions the dynamics can be described by a $C^*$-dynamical system, including in the thermodynamic limit [4]. When the Hilbert space at each site is infinite-dimensional and the finite-system Hamiltonians are unbounded, this is no longer possible and the weak continuity becomes a natural assumption.
The class of systems we will primarily focus on here are lattices of quantum oscillators but the underlying lattice structure is not essential for our method. Systems defined on suitable graphs, such as the systems considered in [6, 7], can also be analyzed with the same methods. In a recent preprint [1], it was shown that convergence of the dynamics in the thermodynamic limit can be obtained for a modified topology. Here, we follow a somewhat different approach. The main difference is that we study the thermodynamic limit of anharmonic perturbations of an infinite harmonic lattice system described by an explicit $W^*$-dynamical system. The more traditional way is to first define the dynamics of anharmonic systems in finite volume (which can be done by standard means [21]), and then to study the limit in which the volume tends to infinity. This is what is done in [1], but it appears that controlling the continuity of the limiting dynamics is more straightforward in our approach. In fact, we are able to show that the resulting dynamics for the class of anharmonic lattices we study is indeed weakly continuous, and we obtain a $W^*$-dynamical system for the infinite system. The $W^*$-dynamical setting is obtained by considering the GNS representation of a ground state or thermal equilibrium state of the harmonic system. The ground states and thermal states are quasi-free states in the sense of [22], or convex mixtures of quasi-free states. In the ground state case the GNS representations are the well-known Fock representations. For the thermal states the GNS representations have been constructed by Araki and Woods [2].
Common to both approaches, ours and the one of [1], is the crucial role played by an estimate of the speed of propagation of perturbations in the system, commonly referred to as Lieb–Robinson bounds [8, 11, 16–18]. Briefly, if $A$ and $B$ are two observables of a spatially extended system, localized in regions $X$ and $Y$ of our graph, respectively, and $\tau_t$ denotes the time evolution of the system, a Lieb–Robinson bound is an estimate of the form

\[ \|[\tau_t(A), B]\| \le C e^{-a(d(X,Y) - v|t|)}, \]

where $C$, $a$, and $v$ are positive constants and $d(X, Y)$ denotes the distance between $X$ and $Y$. Lieb–Robinson bounds for anharmonic lattice systems were recently proved in [19], and this work builds on the results obtained there. Our results are mainly limited to short-range interactions that are either bounded or unbounded perturbations of the harmonic interaction (linear springs).
To conclude the introduction, let us mention that the same question, the existence of the dynamics for infinite oscillator lattices, can be and has been asked for classical systems. Two classic papers are [10, 15]. Many properties of this classical infinite volume harmonic dynamics have been studied in detail, e.g., [23, 9], and some recent progress on locality estimates for anharmonic systems is reported in [5, 20].
The paper is organized as follows. We begin with a section discussing bounded interactions. In this case, the existence of the dynamics follows by mimicking the proof valid in the context of quantum spin systems. Section 3 describes the infinite volume harmonic dynamics on general graphs. It is motivated by an explicit example on $\mathbb{Z}^d$. Next, in Sec. 4, we discuss finite volume perturbations of the infinite volume harmonic dynamics and prove that such systems satisfy a Lieb–Robinson bound. In Sec. 5, we demonstrate that the existence of the dynamics and its continuity follow from the Lieb–Robinson estimates established in the previous section.

2. Bounded Interactions
The goal of this section is to prove the existence of the dynamics for oscillator systems with bounded interactions. Since oscillator systems with bounded interactions can be treated as a special case of more general models with bounded interactions, we will use a slightly more general setup in this section, which we now introduce.
We will denote by $\Gamma$ the underlying structure on which our models will be defined. Here $\Gamma$ will be an arbitrary set of sites equipped with a metric $d$. For $\Gamma$ with countably infinite cardinality, we will need to assume that there exists a non-increasing function $F : [0, \infty) \to (0, \infty)$ for which:

(i) $F$ is uniformly integrable over $\Gamma$, i.e.

\[ \|F\| := \sup_{x \in \Gamma} \sum_{y \in \Gamma} F(d(x, y)) < \infty, \tag{2.1} \]

and
(ii) $F$ satisfies

\[ C := \sup_{x, y \in \Gamma} \sum_{z \in \Gamma} \frac{F(d(x, z))\,F(d(z, y))}{F(d(x, y))} < \infty. \tag{2.2} \]

Given such a set $\Gamma$ and a function $F$, by the triangle inequality, for any $a \ge 0$ the function

\[ F_a(x) = e^{-ax} F(x) \]

also satisfies (i) and (ii) above with $\|F_a\| \le \|F\|$ and $C_a \le C$.
In typical examples, one has that $\Gamma \subset \mathbb{Z}^d$ for some integer $d \ge 1$, and the metric is just given by $d(x, y) = |x - y| = \sum_{j=1}^d |x_j - y_j|$. In this case, the function $F$ can be chosen as $F(|x|) = (1 + |x|)^{-d-\epsilon}$ for any $\epsilon > 0$.
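The two conditions (2.1) and (2.2) can be probed numerically for the polynomial choice of $F$. The sketch below takes $\Gamma = \mathbb{Z}$ (so $d = 1$) and $\epsilon = 1$, truncates the sums to a finite window, and evaluates the truncated norm and convolution constant; the window sizes and function names are assumptions made for illustration, and truncated sums only bound the true suprema from below.

```python
import math

def F(r, d=1, eps=1.0):
    """Polynomial decay function F(r) = (1 + r)^(-d - eps)."""
    return (1.0 + r) ** (-(d + eps))

def norm_F(N, d=1, eps=1.0):
    """Truncated version of (2.1) on Gamma = Z: sum of F(|y|) over |y| <= N.

    By translation invariance the supremum over x is attained at any x.
    """
    return sum(F(abs(y), d, eps) for y in range(-N, N + 1))

def conv_constant(N, R, d=1, eps=1.0):
    """Truncated version of (2.2) on Gamma = Z.

    Takes x = 0 and y = r with 0 <= r <= R, and sums over |z| <= N.
    """
    best = 0.0
    for r in range(R + 1):
        s = sum(F(abs(z), d, eps) * F(abs(z - r), d, eps)
                for z in range(-N, N + 1))
        best = max(best, s / F(r, d, eps))
    return best

norm_val = norm_F(2000)
conv_val = conv_constant(2000, 50)
print(norm_val, conv_val)
```

For this $F$ the elementary inequality $1 + |x - y| \le 2\max(1 + |x - z|, 1 + |z - y|)$ gives $C \le 2^{d+\epsilon+1}\|F\|$, which the truncated values should respect.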
To each $x \in \Gamma$, we will associate a Hilbert space $\mathcal{H}_x$. In many relevant systems, one considers $\mathcal{H}_x = L^2(\mathbb{R}, dq_x)$, but this is not essential. With any finite subset $\Lambda \subset \Gamma$, the Hilbert space of states over $\Lambda$ is given by

\[ \mathcal{H}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{H}_x, \]

and the local algebra of observables over $\Lambda$ is then defined to be

\[ \mathcal{A}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{B}(\mathcal{H}_x), \]

where $\mathcal{B}(\mathcal{H}_x)$ denotes the algebra of bounded linear operators on $\mathcal{H}_x$.
If $\Lambda_1 \subset \Lambda_2$, then there is a natural way of identifying $\mathcal{A}_{\Lambda_1} \subset \mathcal{A}_{\Lambda_2}$, and we may thereby define the algebra of quasi-local observables by the inductive limit

\[ \mathcal{A}_\Gamma = \bigcup_{\Lambda \subset \Gamma} \mathcal{A}_\Lambda, \]

where the union is over all finite subsets $\Lambda \subset \Gamma$; see [3, 4] for a discussion of these issues in general.
The result discussed in this section corresponds to bounded perturbations of local self-adjoint Hamiltonians. We fix a collection of on-site local operators $H^{\rm loc} = \{H_x\}_{x \in \Gamma}$ where each $H_x$ is a self-adjoint operator over $\mathcal{H}_x$. In addition, we will consider a general class of bounded perturbations. These are defined in terms of an interaction $\Phi$, which is a map from the set of subsets of $\Gamma$ to $\mathcal{A}_\Gamma$ with the property that for each finite set $X \subset \Gamma$, $\Phi(X) \in \mathcal{A}_X$ and $\Phi(X)^* = \Phi(X)$. As with the Lieb–Robinson bound proven in [19], we will need a growth condition on the set of interactions $\Phi$ for which we can prove the existence of the dynamics in the thermodynamic limit. This condition is expressed in terms of the following norm. For any $a \ge 0$, denote by $\mathcal{B}_a(\Gamma)$ the set of interactions for which

\[ \|\Phi\|_a := \sup_{x, y \in \Gamma} \frac{1}{F_a(d(x, y))} \sum_{X \ni x, y} \|\Phi(X)\| < \infty. \tag{2.3} \]

Now, for a fixed sequence of local Hamiltonians $H^{\rm loc} = \{H_x\}_{x \in \Gamma}$, as described above, an interaction $\Phi \in \mathcal{B}_a(\Gamma)$, and a finite subset $\Lambda \subset \Gamma$, we will consider self-adjoint Hamiltonians of the form

\[ H_\Lambda = H_\Lambda^{\rm loc} + H_\Lambda^\Phi = \sum_{x \in \Lambda} H_x + \sum_{X \subset \Lambda} \Phi(X), \tag{2.4} \]

acting on $\mathcal{H}_\Lambda$ (with domain given by $\bigotimes_{x \in \Lambda} D(H_x)$ where $D(H_x) \subset \mathcal{H}_x$ denotes the domain of $H_x$). As these operators are self-adjoint, they generate a dynamics, or time evolution, $\{\tau_t^\Lambda\}$, which is the one-parameter group of automorphisms defined by

\[ \tau_t^\Lambda(A) = e^{itH_\Lambda} A e^{-itH_\Lambda} \quad \text{for any } A \in \mathcal{A}_\Lambda. \]



Theorem 2.1. Under the conditions stated above, for all $t \in \mathbb{R}$, $A \in \mathcal{A}_\Gamma$, the norm limit

\[ \lim_{\Lambda \to \Gamma} \tau_t^\Lambda(A) = \tau_t(A) \tag{2.5} \]

exists in the sense of non-decreasing exhaustive sequences of finite volumes $\Lambda$ and defines a group of $*$-automorphisms $\tau_t$ on the completion of $\mathcal{A}_\Gamma$. The convergence is uniform for $t$ in a compact set.

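Proof. Let $\Lambda \subset \Gamma$ be a finite set. Consider the unitary propagator

\[ U_\Lambda(t, s) = e^{itH_\Lambda^{\rm loc}} e^{-i(t-s)H_\Lambda} e^{-isH_\Lambda^{\rm loc}} \tag{2.6} \]

and its associated interaction-picture evolution defined by

\[ \tau_{t,{\rm int}}^\Lambda(A) = U_\Lambda(0, t)\, A\, U_\Lambda(t, 0) \quad \text{for all } A \in \mathcal{A}_\Gamma. \tag{2.7} \]

Clearly, $U_\Lambda(t, t) = 1$ for all $t \in \mathbb{R}$, and it is also easy to check that

\[ i\frac{d}{dt} U_\Lambda(t, s) = H_\Lambda^{\rm int}(t)\, U_\Lambda(t, s) \quad \text{and} \quad -i\frac{d}{ds} U_\Lambda(t, s) = U_\Lambda(t, s)\, H_\Lambda^{\rm int}(s) \]

with the time-dependent generator

\[ H_\Lambda^{\rm int}(t) = e^{iH_\Lambda^{\rm loc} t} H_\Lambda^\Phi e^{-iH_\Lambda^{\rm loc} t} = \sum_{Z \subset \Lambda} e^{iH_\Lambda^{\rm loc} t}\, \Phi(Z)\, e^{-iH_\Lambda^{\rm loc} t}. \tag{2.8} \]

Fix $T > 0$ and $X \subset \Gamma$ finite. For any $A \in \mathcal{A}_X$, we will show that for any non-decreasing, exhausting sequence $\{\Lambda_n\}$ of $\Gamma$, the sequence $\{\tau_{t,{\rm int}}^{\Lambda_n}(A)\}$ is Cauchy in norm, uniformly for $t \in [-T, T]$. Moreover, the bounds establishing the Cauchy property depend on $A$ only through $X$ and $\|A\|$. Since

\[ \tau_t^\Lambda(A) = \tau_{t,{\rm int}}^\Lambda\big(e^{itH_\Lambda^{\rm loc}} A e^{-itH_\Lambda^{\rm loc}}\big) = \tau_{t,{\rm int}}^\Lambda\big(e^{it\sum_{x \in X} H_x} A e^{-it\sum_{x \in X} H_x}\big), \]

an analogous statement then immediately follows for $\{\tau_t^{\Lambda_n}(A)\}$, since the operators $e^{it\sum_{x \in X} H_x} A e^{-it\sum_{x \in X} H_x}$ are all also localized in $X$ and have the same norm as $A$.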
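Take $n \le m$ with $X \subset \Lambda_n \subset \Lambda_m$ and calculate

\[ \tau_{t,{\rm int}}^{\Lambda_m}(A) - \tau_{t,{\rm int}}^{\Lambda_n}(A) = \int_0^t \frac{d}{ds}\big\{U_{\Lambda_m}(0, s)\, U_{\Lambda_n}(s, t)\, A\, U_{\Lambda_n}(t, s)\, U_{\Lambda_m}(s, 0)\big\}\, ds. \tag{2.9} \]

A short calculation shows that

\[ \begin{aligned} \frac{d}{ds}\, & U_{\Lambda_m}(0, s)\, U_{\Lambda_n}(s, t)\, A\, U_{\Lambda_n}(t, s)\, U_{\Lambda_m}(s, 0) \\ &= i\, U_{\Lambda_m}(0, s)\big[(H_{\Lambda_m}^{\rm int}(s) - H_{\Lambda_n}^{\rm int}(s)),\, U_{\Lambda_n}(s, t)\, A\, U_{\Lambda_n}(t, s)\big] U_{\Lambda_m}(s, 0) \\ &= i\, U_{\Lambda_m}(0, s)\, e^{isH_{\Lambda_n}^{\rm loc}} \big[\tilde{B}(s),\, \tau_{s-t}^{\Lambda_n}(\tilde{A}(t))\big] e^{-isH_{\Lambda_n}^{\rm loc}}\, U_{\Lambda_m}(s, 0), \end{aligned} \tag{2.10} \]

where

\[ \tilde{A}(t) = e^{-itH_{\Lambda_n}^{\rm loc}} A e^{itH_{\Lambda_n}^{\rm loc}} = e^{-itH_X^{\rm loc}} A e^{itH_X^{\rm loc}} \tag{2.11} \]

and

\[ \begin{aligned} \tilde{B}(s) &= e^{-isH_{\Lambda_n}^{\rm loc}} \big(H_{\Lambda_m}^{\rm int}(s) - H_{\Lambda_n}^{\rm int}(s)\big) e^{isH_{\Lambda_n}^{\rm loc}} \\ &= \sum_{Z \subset \Lambda_m} e^{isH_{\Lambda_m \setminus \Lambda_n}^{\rm loc}}\, \Phi(Z)\, e^{-isH_{\Lambda_m \setminus \Lambda_n}^{\rm loc}} - \sum_{Z \subset \Lambda_n} \Phi(Z) \\ &= \sum_{\substack{Z \subset \Lambda_m : \\ Z \cap (\Lambda_m \setminus \Lambda_n) \neq \emptyset}} e^{isH_{\Lambda_m \setminus \Lambda_n}^{\rm loc}}\, \Phi(Z)\, e^{-isH_{\Lambda_m \setminus \Lambda_n}^{\rm loc}}. \end{aligned} \tag{2.12} \]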

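Combining the results of (2.9)–(2.12), and using unitarity, we find that

\[ \big\|\tau_{t,{\rm int}}^{\Lambda_m}(A) - \tau_{t,{\rm int}}^{\Lambda_n}(A)\big\| \le \int_0^t \big\|\big[\tau_{s-t}^{\Lambda_n}(\tilde{A}(t)),\, \tilde{B}(s)\big]\big\|\, ds \tag{2.13} \]

and by the Lieb–Robinson bound proven in [19], it is clear that

\[ \begin{aligned} \big\|\big[\tau_{s-t}^{\Lambda_n}(\tilde{A}(t)),\, \tilde{B}(s)\big]\big\| &\le \sum_{\substack{Z \subset \Lambda_m : \\ Z \cap (\Lambda_m \setminus \Lambda_n) \neq \emptyset}} \big\|\big[\tau_{s-t}^{\Lambda_n}(\tilde{A}(t)),\, e^{isH_{\Lambda_m \setminus \Lambda_n}^{\rm loc}}\, \Phi(Z)\, e^{-isH_{\Lambda_m \setminus \Lambda_n}^{\rm loc}}\big]\big\| \\ &\le \frac{2\|A\|}{C_a}\big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{\substack{Z \subset \Lambda_m : \\ y \in Z}} \|\Phi(Z)\| \sum_{x \in X} \sum_{z \in Z} F_a(d(x, z)) \\ &\le \frac{2\|A\|}{C_a}\big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{z \in \Lambda_m} \sum_{\substack{Z \subset \Lambda_m : \\ y, z \in Z}} \|\Phi(Z)\| \sum_{x \in X} F_a(d(x, z)) \\ &\le \frac{2\|A\|\,\|\Phi\|_a}{C_a}\big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{x \in X} \sum_{z \in \Lambda_m} F_a(d(x, z))\, F_a(d(z, y)) \\ &\le 2\|A\|\,\|\Phi\|_a \big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{x \in X} F_a(d(x, y)). \end{aligned} \tag{2.14} \]

With the estimate above and the properties of the function $F_a$, it is clear that

\[ \sup_{t \in [-T, T]} \big\|\tau_{t,{\rm int}}^{\Lambda_m}(A) - \tau_{t,{\rm int}}^{\Lambda_n}(A)\big\| \to 0 \quad \text{as } n, m \to \infty, \tag{2.15} \]

and the rate of convergence only depends on the norm $\|A\|$ and the set $X$ where $A$ is supported. This proves the claim.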

If all local Hamiltonians $H_x$ are bounded, $\{\tau_t\}$ is strongly continuous. If the $H_x$ are allowed to be densely defined unbounded self-adjoint operators, we only have weak continuity and the dynamics is more naturally defined on a von Neumann algebra. This can be done when we have a sufficiently nice invariant state for the model with only the on-site Hamiltonians. For example, suppose that for each $x \in \Gamma$, we have a normalized eigenvector $\phi_x$ of $H_x$. Then, for all $A \in \mathcal{A}_\Lambda$, for any finite $\Lambda \subset \Gamma$, define

\[ \rho(A) = \Big\langle \bigotimes_{x \in \Lambda} \phi_x,\; A \bigotimes_{x \in \Lambda} \phi_x \Big\rangle. \tag{2.16} \]

$\rho$ can be regarded as a state of the infinite system defined on the norm completion of $\mathcal{A}_\Gamma$. The GNS Hilbert space $\mathcal{H}_\rho$ of $\rho$ can be constructed as the closure of $\mathcal{A}_\Gamma \bigotimes_{x \in \Gamma} \phi_x$. Let $\psi \in \mathcal{A}_\Gamma \bigotimes_{x \in \Gamma} \phi_x$. Then

\[ \big\|(\tau_t(A) - \tau_{t_0}(A))\psi\big\| \le \big\|(\tau_t(A) - \tau_t^{(\Lambda_n)}(A))\psi\big\| + \big\|(\tau_t^{(\Lambda_n)}(A) - \tau_{t_0}^{(\Lambda_n)}(A))\psi\big\| + \big\|(\tau_{t_0}^{(\Lambda_n)}(A) - \tau_{t_0}(A))\psi\big\|. \tag{2.17} \]

For sufficiently large $\Lambda_n$, the $\lim_{t \to t_0}$ of the middle term vanishes by Stone's theorem. The two other terms are handled by (2.5). It is clear how to extend the continuity to $\mathcal{H}_\rho$.
We will discuss this type of situation in more detail in the next three sections where we consider models that include quadratic (unbounded) interactions as well.

3. The Harmonic Lattice


As noted in the introduction, we will consider anharmonic perturbations of infinite harmonic lattices. In this section, we discuss the properties of the harmonic systems that we need to assume in general in order to study the perturbations in the thermodynamic limit. We will also show in detail that a standard harmonic lattice model possesses all the required properties.

3.1. The CCR algebra of observables


We begin by introducing the CCR algebra on which the harmonic dynamics will be defined. Following [14], one can define the CCR algebra over any real linear space $\mathcal{D}$ equipped with a non-degenerate, symplectic bilinear form $\sigma$, i.e. $\sigma : \mathcal{D} \times \mathcal{D} \to \mathbb{R}$ with the property that if $\sigma(f, g) = 0$ for all $f \in \mathcal{D}$, then $g = 0$, and

\[ \sigma(f, g) = -\sigma(g, f) \quad \text{for all } f, g \in \mathcal{D}. \tag{3.1} \]

In typical examples, $\mathcal{D}$ will be a complex inner product space associated with $\Gamma$, e.g., $\mathcal{D} = \ell^2(\Gamma)$ or a subspace thereof such as $\mathcal{D} = \ell^1(\Gamma)$, or $\ell^2(\Gamma_0)$, with $\Gamma_0 \subset \Gamma$, and

\[ \sigma(f, g) = {\rm Im}[\langle f, g \rangle]. \tag{3.2} \]

The Weyl operators over $\mathcal{D}$ are defined by associating non-zero elements $W(f)$ to each $f \in \mathcal{D}$ which satisfy

\[ W(f)^* = W(-f) \quad \text{for each } f \in \mathcal{D}, \tag{3.3} \]

and

\[ W(f)W(g) = e^{-i\sigma(f, g)/2}\, W(f + g) \quad \text{for all } f, g \in \mathcal{D}. \tag{3.4} \]

It is well known that there is a unique, up to $*$-isomorphism, $C^*$-algebra generated by these Weyl operators with the property that $W(0) = 1$, $W(f)$ is unitary for all $f \in \mathcal{D}$, and $\|W(f) - 1\| = 2$ for all $f \in \mathcal{D} \setminus \{0\}$, see, e.g., [4, Theorem 5.2.8]. This algebra, commonly known as the CCR algebra, or Weyl algebra, over $\mathcal{D}$, we will denote by $\mathcal{W} = \mathcal{W}(\mathcal{D})$.

3.2. Quasi-free dynamics


The anharmonic dynamics we study in this paper will be defined as perturbations of harmonic, technically quasi-free, dynamics. A quasi-free dynamics on $\mathcal{W}(\mathcal{D})$ is a one-parameter group of $*$-automorphisms $\tau_t$ of the form

\[ \tau_t(W(f)) = W(T_t f), \quad f \in \mathcal{D}, \tag{3.5} \]

where $T_t : \mathcal{D} \to \mathcal{D}$ is a group of real-linear, symplectic transformations, i.e.

\[ \sigma(T_t f, T_t g) = \sigma(f, g). \tag{3.6} \]

As $\|W(f) - W(g)\| = 2$ for all $f \neq g \in \mathcal{D}$, one should not expect $\tau_t$ to be strongly continuous; only a weaker form of continuity is present. This means that $\tau_t$ does not define a $C^*$-dynamical system on $\mathcal{W}$, and thus we look for a $W^*$-dynamical setting in which the weaker form of continuity is naturally expressed.
In the present context, it suffices to regard a $W^*$-dynamical system as a pair $\{\mathcal{M}, \alpha_t\}$ where $\mathcal{M}$ is a von Neumann algebra and $\alpha_t$ is a weakly continuous, one-parameter group of $*$-automorphisms of $\mathcal{M}$. For the harmonic systems we are considering, a specific $W^*$-dynamical system arises as follows. Let $\rho$ be a state on $\mathcal{W}$ and denote by $(\mathcal{H}_\rho, \pi_\rho, \Omega_\rho)$ the corresponding GNS representation. We will assume that $\rho$ is both regular and $\tau_t$-invariant. Recall that $\rho$ is regular if and only if $t \mapsto \rho(W(tf))$ is continuous for all $f \in \mathcal{D}$, and $\tau_t$-invariance means

\[ \rho(\tau_t(A)) = \rho(A) \quad \text{for all } A \in \mathcal{W}. \tag{3.7} \]

For the von Neumann algebra $\mathcal{M}$, take the weak closure of $\pi_\rho(\mathcal{W})$ in $\mathcal{L}(\mathcal{H}_\rho)$ and let $\bar{\tau}_t$ be the weakly continuous, one-parameter group of $*$-automorphisms of $\mathcal{M}$ obtained by lifting $\tau_t$ to $\mathcal{M}$. The latter step is possible since $\rho$ is $\tau_t$-invariant; see, e.g., [3, Corollary 2.3.17].

3.3. Lieb–Robinson bounds for harmonic lattices

To prove the existence of the dynamics for anharmonic models, we use that the unperturbed harmonic system satisfies a Lieb–Robinson bound. Such an estimate depends directly on properties of $\sigma$ and $T_t$. In fact, it is easy to calculate that

\[ \begin{aligned} [\tau_t(W(f)), W(g)] &= \{W(T_t f) - W(g)\,W(T_t f)\,W(-g)\}\,W(g) \\ &= \{1 - e^{i\sigma(T_t f, g)}\}\,W(T_t f)\,W(g), \end{aligned} \tag{3.8} \]

using the Weyl relations (3.4). For the examples we consider below, one can prove that for every $a > 0$, there exist positive numbers $c_a$ and $v_a$ for which

\[ |\sigma(T_t f, g)| \le c_a e^{v_a |t|} \sum_{x, y \in \mathbb{Z}^d} |f(x)|\,|g(y)|\, \frac{e^{-a|x - y|}}{(1 + |x - y|)^{d+1}} \tag{3.9} \]

holds for all $t \in \mathbb{R}$ and all $f, g \in \ell^2(\mathbb{Z}^d)$. In general, we will assume that the harmonic dynamics satisfies an estimate of this type. Namely, we suppose that there exists a number $a_0 > 0$ for which given $0 < a \le a_0$, there are numbers $c_a$ and $v_a$ for which

\[ |1 - e^{i\sigma(T_t f, g)}| \le c_a e^{v_a |t|} \sum_{x, y \in \Gamma} |f(x)|\,|g(y)|\, F_a(d(x, y)) \tag{3.10} \]

holds for all $t \in \mathbb{R}$ and all $f, g \in \ell^2(\Gamma)$. Here we describe the spatial decay in $\Gamma$ through the functions $F_a$ as introduced in Sec. 2. Since the Weyl operators are unitary, the norm estimate

\[ \|[\tau_t(W(f)), W(g)]\| \le c_a e^{v_a |t|} \sum_{x, y \in \Gamma} |f(x)|\,|g(y)|\, F_a(d(x, y)) \tag{3.11} \]

readily follows.
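The algebra behind (3.8) can be checked mechanically. The sketch below models an element $c\,W(f)$ of the Weyl algebra as a pair (scalar, test function), multiplies such pairs using only the Weyl relation (3.4) with $\sigma(f, g) = {\rm Im}\langle f, g\rangle$, and confirms that $W(g)W(h)W(-g) = e^{i\sigma(h, g)}W(h)$, which is the identity used to pass from the first to the second line of (3.8). The pair representation, the helper names and the sample vectors are assumptions made for illustration; this is bookkeeping for the relations, not a Hilbert-space representation.

```python
import cmath

def sigma(f, g):
    """Symplectic form sigma(f, g) = Im <f, g> with <f, g> = sum conj(f_x) g_x."""
    return sum((a.conjugate() * b).imag for a, b in zip(f, g))

def weyl(f):
    """The element 1 * W(f), stored as (scalar, test function)."""
    return (1.0 + 0.0j, tuple(complex(a) for a in f))

def mul(A, B):
    """Product (c1 W(f)) (c2 W(g)) = c1 c2 exp(-i sigma(f, g) / 2) W(f + g)."""
    (c1, f), (c2, g) = A, B
    phase = cmath.exp(-0.5j * sigma(f, g))
    return (c1 * c2 * phase, tuple(a + b for a, b in zip(f, g)))

def neg(f):
    """The test function -f."""
    return tuple(-a for a in f)

# Sample test functions on three sites (arbitrary illustrative values).
h = (0.5 + 1.0j, -2.0 + 0.25j, 0.0 + 1.5j)
g = (1.0 - 0.5j, 0.75 + 0.0j, -1.0 + 2.0j)

lhs = mul(mul(weyl(g), weyl(h)), weyl(neg(g)))  # W(g) W(h) W(-g)
expected_phase = cmath.exp(1j * sigma(h, g))    # so lhs should be phase * W(h)

phase_error = abs(lhs[0] - expected_phase)
function_error = max(abs(a - b) for a, b in zip(lhs[1], h))
commutator_size = abs(1.0 - expected_phase)     # scalar factor in (3.8)
print(phase_error, function_error, commutator_size, abs(sigma(h, g)))
```

Because $|1 - e^{i\theta}| \le |\theta|$ for real $\theta$, the printed commutator factor never exceeds $|\sigma(h, g)|$, which is how a bound of the form (3.9) leads to one of the form (3.10).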

3.4. An important example


Using the example given below, we illustrate the general discussion above in terms of a standard harmonic model defined over $\Gamma = \mathbb{Z}^d$. We begin with a description of some well-known calculations that are valid for these models when restricted to a finite volume. This analysis motivates the definition of the harmonic dynamics in the infinite volume. We then demonstrate that this infinite volume dynamics satisfies a Lieb–Robinson bound. By representing this dynamics in a suitable state, the relevant weak-continuity is readily verified. Interestingly, our analysis also applies to the massless case of $\omega = 0$, see below, and we discuss this briefly. We end this subsection with some final comments.

3.4.1. Finite volume analysis


We consider a system of coupled harmonic oscillators restricted to a nite volume.
Specically on cubic subsets L = (L, L]d Zd , we analyze Hamiltonians of the
form
 
d
HLh = p2x + 2 qx2 + j (qx qx+ej )2 (3.12)
xL j=1
March 10, 2010 10:13 WSPC/S0129-055X 148-RMP
J070-S0129055X1000393X

On the Existence of the Dynamics 217

acting in the Hilbert space

\[
\mathcal{H}_{\Lambda_L} = \bigotimes_{x\in\Lambda_L} L^2(\mathbb{R}, dq_x). \tag{3.13}
\]

Here the quantities $p_x$ and $q_x$, which appear in (3.12) above, are the single site momentum and position operators regarded as operators on the full Hilbert space $\mathcal{H}_{\Lambda_L}$ by setting

\[
p_x = \mathbb{1}\otimes\cdots\otimes\mathbb{1}\otimes\Big(-i\frac{d}{dq}\Big)\otimes\mathbb{1}\otimes\cdots\otimes\mathbb{1}
\quad\text{and}\quad
q_x = \mathbb{1}\otimes\cdots\otimes\mathbb{1}\otimes q\otimes\mathbb{1}\otimes\cdots\otimes\mathbb{1}, \tag{3.14}
\]

i.e. these operators act non-trivially only in the $x$th factor of $\mathcal{H}_{\Lambda_L}$. These operators satisfy the canonical commutation relations

\[
[p_x, p_y] = [q_x, q_y] = 0 \quad\text{and}\quad [q_x, p_y] = i\delta_{x,y}, \tag{3.15}
\]

valid for all $x, y\in\Lambda_L$. In addition, $\{e_j\}_{j=1}^{d}$ are the canonical basis vectors in $\mathbb{Z}^d$, the numbers $\lambda_j \ge 0$ and $\omega \ge 0$ are the parameters of the system, and the Hamiltonian is assumed to have periodic boundary conditions, in the sense that $q_{x+e_j} = q_{x-(2L-1)e_j}$ if $x\in\Lambda_L$ but $x+e_j\notin\Lambda_L$. It is well known that Hamiltonians of this form can be diagonalized in Fourier space. We review this quickly to establish some notation and refer the interested reader to [19] for more details.
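Introducing the operators

\[
Q_k = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{x\in\Lambda_L} e^{-ik\cdot x} q_x
\quad\text{and}\quad
P_k = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{x\in\Lambda_L} e^{-ik\cdot x} p_x, \tag{3.16}
\]

defined for each $k\in\Lambda_L^* = \{ x\pi/L : x\in\Lambda_L \}$, and setting

\[
\gamma(k) = \sqrt{\omega^2 + 4\sum_{j=1}^{d} \lambda_j \sin^2(k_j/2)}, \tag{3.17}
\]

one finds that

\[
H_L^h = \sum_{k\in\Lambda_L^*} \gamma(k)\,(2 b_k^* b_k + 1), \tag{3.18}
\]

where the operators $b_k$ and $b_k^*$ satisfy

\[
b_k = \frac{1}{\sqrt{2\gamma(k)}}\, P_k - i\sqrt{\frac{\gamma(k)}{2}}\, Q_k
\quad\text{and}\quad
b_k^* = \frac{1}{\sqrt{2\gamma(k)}}\, P_{-k} + i\sqrt{\frac{\gamma(k)}{2}}\, Q_{-k}. \tag{3.19}
\]

In this sense, we regard the Hamiltonian $H_L^h$ as diagonalizable.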
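The role of the dispersion relation (3.17) can be checked numerically. The following is a minimal sketch, assuming $d = 1$ and a single coupling $\lambda$; the helper names `gamma`, `apply_potential` and `max_residual` are illustrative and not from the paper. It writes the potential energy in (3.12) as a quadratic form $\langle q, Aq\rangle$ and verifies that every plane wave $e^{ikx}$ with $k\in\Lambda_L^*$ is an eigenvector of $A$ with eigenvalue $\gamma(k)^2$.

```python
import cmath
import math

def gamma(k, omega, lam):
    """Dispersion relation (3.17) for d = 1."""
    return math.sqrt(omega ** 2 + 4.0 * lam * math.sin(k / 2.0) ** 2)

def apply_potential(q, omega, lam):
    """Apply the matrix A of the quadratic form
    sum_x [omega^2 q_x^2 + lam (q_x - q_{x+1})^2] = <q, A q>,
    with periodic boundary conditions on a ring of len(q) sites."""
    n = len(q)
    return [omega ** 2 * q[i] + lam * (2 * q[i] - q[(i + 1) % n] - q[(i - 1) % n])
            for i in range(n)]

def max_residual(L, omega, lam):
    """Largest entry of A v - gamma(k)^2 v over all plane waves v_x = exp(i k x)."""
    sites = range(-L + 1, L + 1)              # Lambda_L = (-L, L] for d = 1
    worst = 0.0
    for m in sites:                            # k = m*pi/L runs over Lambda_L^*
        k = m * math.pi / L
        wave = [cmath.exp(1j * k * x) for x in sites]
        image = apply_potential(wave, omega, lam)
        g2 = gamma(k, omega, lam) ** 2
        worst = max(worst, max(abs(a - g2 * w) for a, w in zip(image, wave)))
    return worst

# The residual should be at rounding level, also in the massless case omega = 0.
print(max_residual(L=4, omega=1.0, lam=0.5))
print(max_residual(L=5, omega=0.0, lam=2.0))
```

The check only concerns the classical quadratic form; the operator identity (3.18) additionally uses the commutation relations (3.15).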



218 B. Nachtergaele et al.

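Using the above diagonalization, one can determine the action of the dynamics corresponding to $H_L^h$ on the Weyl algebra $\mathcal{W}(\ell^2(\Lambda_L))$. In fact, by setting

\[
W(f) = \exp\Big[ i \sum_{x\in\Lambda_L} \big( \mathrm{Re}[f(x)]\, q_x + \mathrm{Im}[f(x)]\, p_x \big) \Big], \tag{3.20}
\]

for each $f\in\ell^2(\Lambda_L)$, it is easy to verify that (3.3) and (3.4) hold with $\sigma(f,g) = \mathrm{Im}[\langle f, g\rangle]$. It is convenient to express these Weyl operators in terms of annihilation and creation operators, i.e.

\[
a_x = \frac{1}{\sqrt{2}}(q_x + i p_x) \quad\text{and}\quad a_x^* = \frac{1}{\sqrt{2}}(q_x - i p_x), \tag{3.21}
\]

which satisfy

\[
[a_x, a_y] = [a_x^*, a_y^*] = 0 \quad\text{and}\quad [a_x, a_y^*] = \delta_{x,y} \quad\text{for all } x, y\in\Lambda_L. \tag{3.22}
\]

One finds that

\[
W(f) = \exp\Big[ \frac{i}{\sqrt{2}} \big( a(f) + a^*(f) \big) \Big], \tag{3.23}
\]

where, for each $f\in\ell^2(\Lambda_L)$, we have set

\[
a(f) = \sum_{x\in\Lambda_L} \overline{f(x)}\, a_x, \qquad a^*(f) = \sum_{x\in\Lambda_L} f(x)\, a_x^*. \tag{3.24}
\]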
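Now, the dynamics corresponding to $H_L^h$, which we denote by $\tau_t^L$, is trivial with respect to the diagonalizing variables, i.e.

\[
\tau_t^L(b_k) = e^{-2i\gamma(k)t}\, b_k \quad\text{and}\quad \tau_t^L(b_k^*) = e^{2i\gamma(k)t}\, b_k^*, \tag{3.25}
\]

where $b_k$ and $b_k^*$ are as defined in (3.19). Hence, if we further introduce

\[
b_x = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{k\in\Lambda_L^*} e^{ik\cdot x}\, b_k
\quad\text{and}\quad
b_x^* = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{k\in\Lambda_L^*} e^{-ik\cdot x}\, b_k^*, \tag{3.26}
\]

for each $x\in\Lambda_L$ and, analogously to (3.24), define

\[
b(f) = \sum_{x\in\Lambda_L} \overline{f(x)}\, b_x, \qquad b^*(f) = \sum_{x\in\Lambda_L} f(x)\, b_x^*, \tag{3.27}
\]

for each $f\in\ell^2(\Lambda_L)$, then one has that

\[
\tau_t^L(b(f)) = b([F^{-1} M_t F] f), \tag{3.28}
\]

where $F$ is the unitary Fourier transform on $\ell^2(\Lambda_L)$ and $M_t$ is the operator of multiplication by $e^{2i\gamma(k)t}$ in Fourier space with $\gamma(k)$ as in (3.17). We need only determine the relation between the $a$'s and the $b$'s.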

A short calculation shows that there exists a linear mapping $U : \ell^2(\Lambda_L)\to\ell^2(\Lambda_L)$ and an anti-linear mapping $V : \ell^2(\Lambda_L)\to\ell^2(\Lambda_L)$ for which

\[
b(f) = a(Uf) + a^*(Vf), \tag{3.29}
\]

a relation known in the literature as a Bogoliubov transformation [13]. In fact, one has that

\[
U = \frac{i}{2} F^{-1} M_{\Gamma_+} F \quad\text{and}\quad V = \frac{i}{2} F^{-1} M_{\Gamma_-} F J, \tag{3.30}
\]

where $J$ is complex conjugation and $M_{\Gamma_\pm}$ is the operator of multiplication by

\[
\Gamma_\pm(k) = \frac{1}{\sqrt{\gamma(k)}} \pm \sqrt{\gamma(k)}, \tag{3.31}
\]

with $\gamma(k)$ as in (3.17). Using the fact that $\Gamma_\pm$ is real valued and even, it is easy to check that

\[
U^*U - V^*V = \mathbb{1} = UU^* - VV^* \tag{3.32}
\]

and

\[
V^*U - U^*V = 0 = VU^* - UV^*, \tag{3.33}
\]

where we stress that $V^*$ is the adjoint of the anti-linear mapping $V$. The relation (3.29) is invertible, in fact,

\[
a(f) = b(U^*f) - b^*(V^*f), \tag{3.34}
\]

and therefore

\[
W(f) = \exp\Big[ \frac{i}{\sqrt{2}} \big( b((U^*-V^*)f) + b^*((U^*-V^*)f) \big) \Big]. \tag{3.35}
\]

Clearly then,

\[
\tau_t^L(W(f)) = W(T_t f), \tag{3.36}
\]

where the mapping $T_t$ is given by

\[
T_t = (U+V)\, F^{-1} M_t F\, (U^*-V^*), \tag{3.37}
\]

and we have used (3.28).

3.4.2. Infinite volume dynamics


It is now clear how to define the infinite volume harmonic dynamics. Consider a subspace $D\subset\ell^2(\mathbb{Z}^d)$ and define $\mathcal{W}(D)$ as above with $\sigma(f,g) = \mathrm{Im}[\langle f, g\rangle]$. First, assume $\omega > 0$, take $\gamma : [-\pi,\pi)^d\to\mathbb{R}$ as in (3.17), and set $U$ and $V$ as in (3.30) with

(3.31). If $\omega > 0$, both $U$ and $V$ are bounded transformations on $\ell^2(\mathbb{Z}^d)$. We will treat the case $\omega = 0$ by a limiting argument. The mapping $T_t$ defined by setting

\[
T_t = (U+V)\, F^{-1} M_t F\, (U^*-V^*) \tag{3.38}
\]

is well-defined on $\ell^2(\mathbb{Z}^d)$. To define the dynamics on $\mathcal{W}(D)$ we will need to choose subspaces $D$ that are $T_t$-invariant. On such $D$, $T_t$ is clearly real-linear. With (3.32) and (3.33), one can easily verify the group properties $T_0 = \mathbb{1}$, $T_{s+t} = T_s\circ T_t$, and

\[
\mathrm{Im}[\langle T_t f, T_t g\rangle] = \mathrm{Im}[\langle f, g\rangle], \tag{3.39}
\]

i.e. $T_t$ is symplectic in the sense of (3.6). Using [4, Theorem 5.2.8], there is a unique one-parameter group of $*$-automorphisms on $\mathcal{W}(D)$, which we will denote by $\tau_t$, that satisfies

\[
\tau_t(W(f)) = W(T_t f) \quad\text{for all } f\in D. \tag{3.40}
\]

This defines the harmonic dynamics on $\mathcal{W}(D)$.
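The group and symplectic properties of $T_t$ can be illustrated on a small periodic chain. The sketch below is a hedged, finite-volume stand-in for the infinite-volume statement: it assumes $d = 1$, a ring of $N$ sites with momenta $k = 2\pi m/N$, $\omega > 0$, and the convention $\langle f, g\rangle = \sum_x \overline{f(x)}\,g(x)$. The adjoint of the anti-linear map $V = AJ$ is taken to be $V^* = JA^*$. All function names are illustrative.

```python
import cmath
import math

N, OMEGA, LAM = 8, 1.0, 0.7
KS = [2 * math.pi * m / N for m in range(N)]
GAMMA = [math.sqrt(OMEGA ** 2 + 4 * LAM * math.sin(k / 2) ** 2) for k in KS]
GP = [1 / math.sqrt(g) + math.sqrt(g) for g in GAMMA]   # Gamma_+ of (3.31)
GM = [1 / math.sqrt(g) - math.sqrt(g) for g in GAMMA]   # Gamma_- of (3.31)

def dft(f, sign):
    """Unitary discrete Fourier transform (sign=-1) or its inverse (sign=+1)."""
    return [sum(f[x] * cmath.exp(sign * 1j * KS[m] * x) for x in range(N)) / math.sqrt(N)
            for m in range(N)]

def multiply(f, symbol):
    """F^{-1} M_symbol F applied to f."""
    fh = dft(f, -1)
    return dft([s * c for s, c in zip(symbol, fh)], +1)

def conj(f):
    return [z.conjugate() for z in f]

def add(f, g, c=1):
    return [a + c * b for a, b in zip(f, g)]

def scale(c, f):
    return [c * z for z in f]

def U(f):
    return scale(0.5j, multiply(f, GP))

def V(f):                       # anti-linear: conjugate first, as in (3.30)
    return scale(0.5j, multiply(conj(f), GM))

def U_star(f):
    return scale(-0.5j, multiply(f, GP))

def V_star(f):                  # adjoint of the anti-linear map V (assumed V* = J A*)
    return conj(scale(-0.5j, multiply(f, GM)))

def T(t, f):
    """T_t = (U + V) F^{-1} M_t F (U* - V*), cf. (3.38)."""
    h = add(U_star(f), V_star(f), -1)
    h = multiply(h, [cmath.exp(2j * g * t) for g in GAMMA])
    return add(U(h), V(h))

def sigma(f, g):
    """Symplectic form sigma(f, g) = Im <f, g>."""
    return sum(a.conjugate() * b for a, b in zip(f, g)).imag

f = [complex(math.cos(x), 0.3 * x) for x in range(N)]
g = [complex(x % 3, math.sin(2 * x)) for x in range(N)]
print(abs(sigma(T(0.7, f), T(0.7, g)) - sigma(f, g)))                   # symplectic
print(max(abs(a - b) for a, b in zip(T(0.3, T(0.4, f)), T(0.7, f))))   # group law
```

Both printed numbers should be at rounding level. The sketch does not address invariance of a subspace $D$, which only becomes a genuine restriction in infinite volume.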
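Here it is important that $T_t : D\to D$. As was demonstrated in [19], the mapping $T_t$ can be expressed as a convolution. In fact,

\[
T_t f = f * \Big( H_t^{(0)} + \frac{i}{2}\big(H_t^{(-1)} + H_t^{(1)}\big) \Big) + \overline{f} * \Big( \frac{i}{2}\big(H_t^{(1)} - H_t^{(-1)}\big) \Big), \tag{3.41}
\]

where

\[
\begin{aligned}
H_t^{(-1)}(x) &= \frac{1}{(2\pi)^d}\, \mathrm{Im}\Big[ \int \frac{1}{\gamma(k)}\, e^{i(k\cdot x - 2\gamma(k)t)}\, dk \Big], \\
H_t^{(0)}(x) &= \frac{1}{(2\pi)^d}\, \mathrm{Re}\Big[ \int e^{i(k\cdot x - 2\gamma(k)t)}\, dk \Big], \\
H_t^{(1)}(x) &= \frac{1}{(2\pi)^d}\, \mathrm{Im}\Big[ \int \gamma(k)\, e^{i(k\cdot x - 2\gamma(k)t)}\, dk \Big].
\end{aligned} \tag{3.42}
\]

Using analysis similar to what is proven in [19], the following result holds.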

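Lemma 3.1. Consider the functions defined in (3.42). For $\omega \ge 0$, $\lambda_1, \ldots, \lambda_d \ge 0$, but such that $c_{\omega,\lambda} = (\omega^2 + 4\sum_{j=1}^{d}\lambda_j)^{1/2} > 0$, and any $\mu > 0$, the bounds

\[
\begin{aligned}
|H_t^{(0)}(x)| &\le e^{-\mu\left(|x| - c_{\omega,\lambda}\max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)}, \\
|H_t^{(-1)}(x)| &\le c_{\omega,\lambda}^{-1}\, e^{-\mu\left(|x| - c_{\omega,\lambda}\max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)}, \\
|H_t^{(1)}(x)| &\le c_{\omega,\lambda}\, e^{\mu/2}\, e^{-\mu\left(|x| - c_{\omega,\lambda}\max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)}
\end{aligned} \tag{3.43}
\]

hold for all $t\in\mathbb{R}$ and $x\in\mathbb{Z}^d$. Here $|x| = \sum_{j=1}^{d}|x_j|$.

Given the estimates in Lemma 3.1, Eq. (3.41) and Young's inequality imply that $T_t$ can be defined as a transformation of $\ell^p(\mathbb{Z}^d)$, for $p \ge 1$. However, the symplectic form limits us to consider $D = \ell^p(\mathbb{Z}^d)$ with $1 \le p \le 2$.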
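The first bound in (3.43) can be probed numerically. The following is a rough sketch, assuming $d = 1$, a single coupling $\lambda$, and integration over $[-\pi, \pi)$ by the trapezoidal rule; the function names and the parameter values are illustrative only. It compares $|H_t^{(0)}(x)|$ with the right-hand side of the lemma on a range of sites.

```python
import math

def gamma(k, omega, lam):
    """Dispersion relation (3.17) for d = 1."""
    return math.sqrt(omega ** 2 + 4.0 * lam * math.sin(k / 2.0) ** 2)

def h0(x, t, omega, lam, n=2048):
    """H_t^{(0)}(x) from (3.42) for d = 1, by the trapezoidal rule on [-pi, pi)."""
    total = 0.0
    for j in range(n):
        k = -math.pi + 2.0 * math.pi * j / n
        total += math.cos(k * x - 2.0 * gamma(k, omega, lam) * t)
    return total / n

def lemma_bound(x, t, omega, lam, mu):
    """Right-hand side of the first bound in (3.43) for d = 1."""
    c = math.sqrt(omega ** 2 + 4.0 * lam)
    speed = c * max(2.0 / mu, math.exp(mu / 2.0 + 1.0))
    return math.exp(-mu * (abs(x) - speed * abs(t)))

# Compare the kernel with the bound outside the "light cone" |x| ~ speed * |t|.
for x in (0, 5, 10, 15):
    print(x, abs(h0(x, 0.5, 1.0, 1.0)), lemma_bound(x, 0.5, 1.0, 1.0, mu=1.0))
```

For these parameters the kernel is far smaller than the bound once $|x|$ exceeds the cone, which reflects that (3.43) is an exponential envelope rather than a sharp estimate.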

The following bound now readily follows:

\[
|\mathrm{Im}\langle T_t f, g\rangle| \le \big(1 + 2e^{\mu/2} c_{\omega,\lambda} + 2c_{\omega,\lambda}^{-1}\big) \sum_{x,y} |f(x)|\,|g(y)|\, e^{-\mu\left(|x-y| - c_{\omega,\lambda}\max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)}. \tag{3.44}
\]

This implies an estimate of the form (3.9), and hence a Lieb–Robinson bound as in (3.11).
A simple corollary of Lemma 3.1 follows.

Corollary 3.2. Consider the functions defined in (3.42). For $\omega \ge 0$, $\lambda_1, \ldots, \lambda_d \ge 0$, but with $c_{\omega,\lambda} = (\omega^2 + 4\sum_{j=1}^{d}\lambda_j)^{1/2} > 0$, take $\|\cdot\|_1$ to be the $\ell^1$-norm. One has that

\[
\|H_t^{(0)} - \delta_0\|_1 \to 0 \quad\text{as } t\to 0, \tag{3.45}
\]

and

\[
\|H_t^{(m)}\|_1 \to 0 \quad\text{as } t\to 0, \quad\text{for } m\in\{-1, 1\}. \tag{3.46}
\]

Proof. The estimates in Lemma 3.1 imply that the functions $H_t^{(m)}$ are bounded by exponentially decaying functions (in $|x|$). These estimates are uniform for $t$ in compact sets, e.g., $t\in[-1,1]$, and therefore dominated convergence applies. It is clear that $H_0^{(0)}(x) = \delta_0(x)$ while $H_0^{(m)}(x) = 0$ for $m\in\{-1,1\}$. This proves the corollary.

3.4.3. Representing the dynamics


The infinite-volume ground state of the model (3.12) is the vacuum state for the $b$-operators, as can be seen from (3.18). This state is defined on $\mathcal{W}(D)$ by

\[
\rho(W(f)) = e^{-\frac{1}{4}\|(U^*-V^*)f\|^2}. \tag{3.47}
\]

By standard arguments this defines a state on $\mathcal{W}(D)$ [4]. Using (3.38), (3.32) and (3.33), one readily verifies that $\rho$ is $\tau_t$-invariant. $\rho$ is regular by observation. The weak continuity of the dynamics in the GNS representation of $\rho$ will follow from the continuity of the functions of the form

\[
t\mapsto \rho(W(g_1)W(T_t f)W(g_2)), \quad\text{for } g_1, g_2, f\in D. \tag{3.48}
\]

When $\omega > 0$, this continuity can be easily observed from the following expression:

\[
\rho(W(g_1)W(T_t f)W(g_2)) = e^{-i\sigma(g_1,g_2)/2}\, e^{-i\sigma(T_t f,\, g_2-g_1)/2}\, e^{-\|(U^*-V^*)(g_1+g_2+T_t f)\|^2/4}. \tag{3.49}
\]

Note that $T_t$ is differentiable with bounded derivative and that both $U$ and $V$ are bounded. This establishes the continuity in the case that $\omega > 0$.
As discussed in the introduction of the section, the $W^*$-dynamical system is now defined by considering the GNS representation $\pi_\rho$ of $\rho$. This yields a von

Neumann algebra $\mathcal{M} = \overline{\pi_\rho(\mathcal{W}(D))}$. The invariance of $\rho$ implies that the dynamics is implementable by unitaries $U_t$, i.e.

\[
\pi_\rho(\tau_t(W(f))) = U_t\, \pi_\rho(W(f))\, U_t^*. \tag{3.50}
\]

Using $U_t$, the dynamics can be extended to $\mathcal{M}$. As a consequence of (3.48), this extended dynamics is weakly continuous.

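3.4.4. The case of $\omega = 0$


We now discuss the case $\omega = 0$. Here, the maps $T_t$ are defined using the convolution formula (3.41). By Lemma 3.1, $T_t$ is well-defined as a transformation of $\ell^p(\mathbb{Z}^d)$, for $1 \le p \le 2$. Both the group property of $T_t$ and the invariance of the symplectic form follow in the limit $\omega\to 0$ by dominated convergence, which is justified by Lemma 3.1. This demonstrates that the dynamics is well defined.
We represent the dynamics in a state $\rho$ defined by (3.47), but with the understanding that $\|(U^*-V^*)f\|$ may take on the value $+\infty$, in which case $\rho(W(f)) = 0$. $\rho$ is still clearly regular. It remains to show that the dynamics is weakly continuous. Observe that

\[
T_t f - f = f * \big(H_t^{(0)} - \delta_0\big) + f * \Big( \frac{i}{2}\big(H_t^{(-1)} + H_t^{(1)}\big) \Big) + \overline{f} * \Big( \frac{i}{2}\big(H_t^{(1)} - H_t^{(-1)}\big) \Big) \tag{3.51}
\]

follows from (3.41). Using Young's inequality and Corollary 3.2, it is clear that $\|T_t f - f\|\to 0$ as $t\to 0$ for any $f\in\ell^p(\mathbb{Z}^d)$ with $1 \le p \le 2$. A calculation shows that

\[
(U^*-V^*)(T_t f - f) = F_1 * \big(H_t^{(0)} - \delta_0\big) - F_2 * H_t^{(-1)} - iF_3 * H_t^{(1)}, \tag{3.52}
\]

where, writing $M_{\gamma^{\pm 1/2}}$ for the operator of multiplication by $\gamma(k)^{\pm 1/2}$ in Fourier space,

\[
\begin{aligned}
F_1 &= F^{-1} M_{\gamma^{1/2}} F\, \mathrm{Im}[f] - iF^{-1} M_{\gamma^{-1/2}} F\, \mathrm{Re}[f], \\
F_2 &= F^{-1} M_{\gamma^{1/2}} F\, \mathrm{Re}[f] \quad\text{and}\quad F_3 = F^{-1} M_{\gamma^{-1/2}} F\, \mathrm{Im}[f].
\end{aligned} \tag{3.53}
\]

An argument similar to the one given above now implies that $\|(U^*-V^*)(T_t f - f)\|\to 0$ as $t\to 0$, for any $f\in D_0$, where

\[
D_0 = \{ f\in\ell^2(\mathbb{Z}^d) : F^{-1} M_{\gamma^{-1/2}} F\, \mathrm{Re}[f]\in\ell^2(\mathbb{Z}^d) \}. \tag{3.54}
\]

No additional assumption on $\mathrm{Im}[f]$ is necessary since $F_3$ is convolved with $H_t^{(1)}$. Given the form of (3.49), this suffices to prove weak continuity. In fact, one can check that $T_t$ leaves $D_0$ invariant and that if $f\in D_0$, then $(U^*-V^*)T_t f\in\ell^2(\mathbb{Z}^d)$ for all $t\in\mathbb{R}$. This establishes weak continuity of the dynamics, defined on $\mathcal{W}(D_0)$.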

Remark 3.3. We observe that, when $\omega = 0$, the finite volume Hamiltonian $H_L^h$ (3.12) is translation invariant and commutes with the total momentum operator $P_0$

(see (3.16)). In fact, $H_L^h$ can be written as

\[
\begin{aligned}
H_L^h &= P_0^2 + \sum_{k\in\Lambda_L^*\setminus\{0\}} \big( P_k^* P_k + \gamma^2(k)\, Q_k^* Q_k \big) \\
&= P_0^2 + \sum_{k\in\Lambda_L^*\setminus\{0\}} \gamma(k)\,(2b_k^* b_k + 1),
\end{aligned}
\]

where we used the notation (3.16) and, for $k\neq 0$, we introduced the operators $b_k, b_k^*$ as in (3.19). In this case, the operator $H_L^h$ does not have eigenvectors: its spectrum is purely continuous. By a unitary transformation, the Hilbert space $\mathcal{H}_{\Lambda_L}$ (see (3.13)) can be mapped into the space $L^2(\mathbb{R}, dP_0; \mathcal{H}_b)$ of square integrable functions of $P_0\in\mathbb{R}$, with values in $\mathcal{H}_b$. Here, $\mathcal{H}_b$ denotes the Fock space generated by all creation and annihilation operators $b_k^*, b_k$ with $k\neq 0$. It is then easy to construct vectors which minimize the energy for a given distribution of the total momentum: for an arbitrary (complex valued) $f\in L^2(\mathbb{R})$ with $\|f\| = 1$, we define $\psi_f\in L^2(\mathbb{R}, dP_0; \mathcal{H}_b)$ by setting $\psi_f(P_0) = f(P_0)\,\Omega$ (where $\Omega$ is the Fock vacuum in $\mathcal{H}_b$). These vectors are not invariant with respect to the time evolution. It is simple to check that the Schrödinger evolution of $\psi_f$ is given by $e^{-iH_L^h t}\psi_f = \psi_{f_t}$, where $f_t(P_0) = e^{-itP_0^2} f(P_0)$ is the free evolution of $f$. In particular, for $\omega = 0$, $H_L^h$ does not have a ground state in the traditional sense of an eigenvector. For this reason, when $\omega = 0$, it is not a priori clear what the natural choice of state should be. As is discussed above, one possibility is to consider first $\omega\neq 0$ and then take the limit $\omega\to 0$. This yields a ground state for the infinite system with vanishing center of mass momentum of the oscillators. By considering non-zero values for the center of mass momentum, one can also define other states with similar properties.

3.4.5. Some final comments


The analysis in the following sections and our main result are not limited to the class of examples we discussed above. For example, harmonic systems defined on more general graphs, such as the ones considered in [6, 7], can also be treated. Also note that our choice of time-invariant state, while natural, is by no means the only possible state. Instead of the vacuum state defined in (3.47), equilibrium states at positive temperatures could be used in exactly the same way. It would also make sense to study the convergence of the equilibrium or ground states for the perturbed dynamics and to consider the dynamics in the representation of the limiting infinite-system state, but we have not studied this situation and will not discuss it in this paper.

4. Perturbing the Harmonic Dynamics


In this section, we will discuss finite volume perturbations of the infinite volume harmonic dynamics which we defined in Sec. 3. To begin, we recall a fundamental result about perturbations of quantum dynamics defined by adding a bounded term

to the generator. This is a version of what is usually known as the Dyson or Duhamel expansion. The following statement summarizes [4, Proposition 5.4.1].

Proposition 4.1. Let $\{\mathcal{M}, \alpha_t\}$ be a $W^*$-dynamical system and let $\delta$ denote the infinitesimal generator of $\alpha_t$. Given any $P = P^*\in\mathcal{M}$, set $\delta_P$ to be the bounded derivation with domain $D(\delta_P) = \mathcal{M}$ satisfying $\delta_P(A) = i[P, A]$ for all $A\in\mathcal{M}$. It follows that $\delta + \delta_P$ generates a one-parameter group of $*$-automorphisms $\alpha^P$ of $\mathcal{M}$ which is the unique solution of the integral equation

\[
\alpha_t^P(A) = \alpha_t(A) + i\int_0^t \alpha_s^P\big([P, \alpha_{t-s}(A)]\big)\, ds. \tag{4.1}
\]

In addition, the estimate

\[
\|\alpha_t^P(A) - \alpha_t(A)\| \le \big(e^{|t|\,\|P\|} - 1\big)\|A\| \tag{4.2}
\]

holds for all $t\in\mathbb{R}$ and $A\in\mathcal{M}$.

Since the initial dynamics $\alpha_t$ is assumed weakly continuous, the norm estimate (4.2) can be used to show that the perturbed dynamics is also weakly continuous. Hence, for each $P = P^*\in\mathcal{M}$ the pair $\{\mathcal{M}, \alpha_t^P\}$ is also a $W^*$-dynamical system. Thus, if $P_i = P_i^*\in\mathcal{M}$ for $i = 1, 2$, then one can define $\alpha_t^{P_1+P_2}$ iteratively.
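The integral equation (4.1) can be illustrated in the simplest finite-dimensional setting. The sketch below makes strong simplifying assumptions that are not part of the proposition: $\mathcal{M}$ is the algebra of $2\times 2$ matrices, $\alpha_t(A) = e^{itH_0}Ae^{-itH_0}$ for a Hermitian matrix $H_0$, and $\alpha_t^P$ is generated by $H_0 + P$. It checks (4.1) by Simpson quadrature, together with the elementary bound $\|\alpha_t^P(A) - \alpha_t(A)\| \le 2|t|\,\|P\|\,\|A\|$ obtained by estimating the integrand of (4.1) in norm. All helper names are illustrative.

```python
import cmath
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(a, A, b, B):
    """Linear combination a*A + b*B of 2x2 matrices."""
    return [[a * A[i][j] + b * B[i][j] for j in range(2)] for i in range(2)]

def comm(A, B):
    return lin(1.0, mul(A, B), -1.0, mul(B, A))

def op_norm(A):
    """Operator norm (largest singular value) of a 2x2 matrix."""
    fro2 = sum(abs(A[i][j]) ** 2 for i in range(2) for j in range(2))
    det2 = abs(A[0][0] * A[1][1] - A[0][1] * A[1][0]) ** 2
    return math.sqrt((fro2 + math.sqrt(max(fro2 ** 2 - 4.0 * det2, 0.0))) / 2.0)

def exp_i(H, t):
    """exp(i t H) for a Hermitian 2x2 matrix H, in closed form."""
    m = (H[0][0] + H[1][1]).real / 2.0
    K = lin(1.0, H, -m, I2)                          # traceless part, K^2 = r^2 * 1
    r = math.sqrt(abs(K[0][0]) ** 2 + abs(K[0][1]) ** 2)
    s = t if r == 0.0 else math.sin(r * t) / r
    phase = cmath.exp(1j * t * m)
    return lin(phase * math.cos(r * t), I2, phase * 1j * s, K)

def evolve(H, A, t):
    """Heisenberg evolution exp(itH) A exp(-itH)."""
    return mul(mul(exp_i(H, t), A), exp_i(H, -t))

def duhamel_rhs(H0, P, A, t, n=400):
    """Right-hand side of (4.1), with the s-integral done by Simpson's rule."""
    HP = lin(1.0, H0, 1.0, P)
    h = t / n
    integral = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(n + 1):
        s = j * h
        w = 1.0 if j in (0, n) else (4.0 if j % 2 else 2.0)
        integrand = evolve(HP, comm(P, evolve(H0, A, t - s)), s)
        integral = lin(1.0, integral, w * h / 3.0, integrand)
    return lin(1.0, evolve(H0, A, t), 1j, integral)

H0 = [[0.8 + 0j, 0j], [0j, -0.8 + 0j]]
P = [[0.1 + 0j, 0.5 - 0.2j], [0.5 + 0.2j, -0.3 + 0j]]
A = [[0.3 + 0j, 1.0 - 0.5j], [0.2j, -0.7 + 0j]]
lhs = evolve(lin(1.0, H0, 1.0, P), A, 1.3)
print(op_norm(lin(1.0, lhs, -1.0, duhamel_rhs(H0, P, A, 1.3))))   # quadrature-level error
```

In this matrix setting the derivation $\delta_P$ is the commutator with $P$, so the check exercises exactly the structure used later when the perturbation $P^\Lambda$ is added to the harmonic dynamics.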

4.1. A Lieb–Robinson bound for on-site perturbations


In this section, we will consider perturbations of the harmonic dynamics defined in Sec. 3. Recall that our general assumptions for the harmonic dynamics on $\Gamma$ are as follows.
We assume that the harmonic dynamics, $\tau_t^0$, is defined on a Weyl algebra $\mathcal{W}(D)$ where $D$ is a subspace of $\ell^2(\Gamma)$. In fact, we assume there exists a group $T_t$ of real-linear transformations which leave $D$ invariant and satisfy

\[
\tau_t^0(W(f)) = W(T_t f) \quad\text{for all } f\in D. \tag{4.3}
\]

In addition, we assume that this harmonic dynamics satisfies a Lieb–Robinson bound. Specifically, we suppose that there exists a number $a_0 > 0$ for which, given any $0 < a \le a_0$, there are positive numbers $c_a$ and $v_a$ for which

\[
|1 - e^{i\sigma(T_t f, g)}| \le c_a e^{v_a|t|} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)). \tag{4.4}
\]

Here the spatial decay in $\Gamma$ is described by the function $F_a$ as introduced in Sec. 2. As we discussed in Sec. 3, the estimate (4.4) immediately implies the Lieb–Robinson bound

\[
\|[\tau_t^0(W(f)), W(g)]\| \le c_a e^{v_a|t|} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)). \tag{4.5}
\]

Finally, we assume that we have represented this harmonic dynamics in a regular and $\tau_t^0$-invariant state $\rho$ for which the pair $\{\mathcal{M}, \tau_t^0\}$, with $\mathcal{M} = \overline{\pi_\rho(\mathcal{W}(D))}$, is a $W^*$-dynamical system.

Our first estimate involves perturbations defined as finite sums of on-site terms. More specifically, the perturbations we consider are defined as follows.
To each site $x\in\Gamma$, we will associate a finite measure $\mu_x$ on $\mathbb{C}$, and an element $P_x\in\mathcal{W}(D)$ which has the form

\[
P_x = \int_{\mathbb{C}} W(z\delta_x)\, \mu_x(dz). \tag{4.6}
\]

We require that each $\mu_x$ is even, i.e. invariant under $z\mapsto -z$, to ensure self-adjointness, i.e. $P_x^* = P_x$. Our Lieb–Robinson bounds hold under the additional assumption that the second moment is uniformly bounded, i.e.

\[
\sup_{x\in\Gamma} \int_{\mathbb{C}} |z|^2\, |\mu_x|(dz) < \infty. \tag{4.7}
\]

We use Proposition 4.1 to define the perturbed dynamics. Fix a finite set $\Lambda\subset\Gamma$. Set

\[
P^\Lambda = \sum_{x\in\Lambda} P_x, \tag{4.8}
\]

and note that $(P^\Lambda)^* = P^\Lambda\in\mathcal{W}(D)$. We will denote by $\tau_t^{(\Lambda)}$ the dynamics that results from applying Proposition 4.1 to the $W^*$-dynamical system $\{\mathcal{M}, \tau_t^0\}$ and $P^\Lambda$.
Before we begin the proof of our estimate, we discuss two examples.

Example 1. Let $\mu_x$ be supported on $[-\pi,\pi)$ and absolutely continuous with respect to Lebesgue measure, i.e. $\mu_x(dz) = v_x(z)\,dz$. If $v_x$ is in $L^2([-\pi,\pi))$, then $P_x$ is proportional to an operator of multiplication by the inverse Fourier transform of $v_x$. Moreover, since the support of $\mu_x$ is real, $P_x$ corresponds to multiplication by a function depending only on $q_x$.

Example 2. Let $\mu_x$ have finite support, e.g., take $\mathrm{supp}(\mu_x) = \{z, -z\}$ for some number $z = \alpha + i\beta\in\mathbb{C}$. Then

\[
P_x = W(z\delta_x) + W(-z\delta_x) = 2\cos(\alpha q_x + \beta p_x). \tag{4.9}
\]

We now state our first result.

Theorem 4.2. Let $\tau_t^0$ be a harmonic dynamics defined on $\Gamma$ as described above. Suppose that

\[
\kappa = \sup_{x\in\Gamma} \int_{\mathbb{C}} |z|^2\, |\mu_x|(dz) < \infty, \tag{4.10}
\]

and define the perturbed dynamics $\tau_t^{(\Lambda)}$ as indicated above. For every $0 < a \le a_0$, there exist positive numbers $c_a$ and $v_a$ for which the estimate

\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le c_a e^{(v_a + c_a\kappa C_a)|t|} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)) \tag{4.11}
\]

holds for all $t\in\mathbb{R}$ and for any functions $f, g\in D$.



Here the numbers $c_a$ and $v_a$ are as in (4.4), whereas $C_a$ is the convolution constant as defined in (2.2) with respect to the function $F_a$.

Proof. Fix $t > 0$ and define the function $\Psi_t : [0,t]\to\mathcal{W}(D)$ by setting

\[
\Psi_t(s) = [\tau_s^{(\Lambda)}(\tau_{t-s}^0(W(f))), W(g)]. \tag{4.12}
\]

It is clear that $\Psi_t$ interpolates between the commutator associated with the original harmonic dynamics, $\tau_t^0$ at $s = 0$, and that of the perturbed dynamics, $\tau_t^{(\Lambda)}$ at $s = t$. A calculation shows that

\[
\frac{d}{ds}\Psi_t(s) = i\sum_{x\in\Lambda} [\tau_s^{(\Lambda)}([P_x, W(T_{t-s}f)]), W(g)], \tag{4.13}
\]

where differentiability is guaranteed by the results of Proposition 4.1. The inner commutator can be expressed as

\[
\begin{aligned}
[P_x, W(T_{t-s}f)] &= \int_{\mathbb{C}} [W(z\delta_x), W(T_{t-s}f)]\, \mu_x(dz) \\
&= W(T_{t-s}f)\, L_{t-s;x}(f),
\end{aligned} \tag{4.14}
\]

where

\[
L_{t-s;x}(f)^* = L_{t-s;x}(f) = \int_{\mathbb{C}} W(z\delta_x)\,\{e^{i\sigma(T_{t-s}f,\, z\delta_x)} - 1\}\, \mu_x(dz)\in\mathcal{W}(D). \tag{4.15}
\]

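Thus $\Psi_t$ satisfies

\[
\begin{aligned}
\frac{d}{ds}\Psi_t(s) ={}& i\sum_{x\in\Lambda} \Psi_t(s)\, \tau_s^{(\Lambda)}(L_{t-s;x}(f)) \\
&+ i\sum_{x\in\Lambda} \tau_s^{(\Lambda)}(W(T_{t-s}f))\, [\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)].
\end{aligned} \tag{4.16}
\]

The first term above is norm preserving. In fact, define a unitary evolution $U_t(\cdot)$ by setting

\[
\frac{d}{ds}U_t(s) = -i\sum_{x\in\Lambda} \tau_s^{(\Lambda)}(L_{t-s;x}(f))\, U_t(s) \quad\text{with } U_t(0) = \mathbb{1}. \tag{4.17}
\]

It is easy to see that

\[
\frac{d}{ds}\big(\Psi_t(s)U_t(s)\big) = i\sum_{x\in\Lambda} \tau_s^{(\Lambda)}(W(T_{t-s}f))\, [\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\, U_t(s), \tag{4.18}
\]

and therefore,

\[
\Psi_t(t)U_t(t) = \Psi_t(0) + i\sum_{x\in\Lambda} \int_0^t \tau_s^{(\Lambda)}(W(T_{t-s}f))\, [\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\, U_t(s)\, ds. \tag{4.19}
\]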

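Estimating in norm, we find that

\[
\begin{aligned}
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le{}& \|[\tau_t^0(W(f)), W(g)]\| \\
&+ \sum_{x\in\Lambda} \int_0^t \|[\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\|\, ds.
\end{aligned} \tag{4.20}
\]

Moreover, using (4.15) and the bound (4.4), it is clear that

\[
\|[\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\| \le c_a e^{v_a(t-s)} \sum_{x'\in\Gamma} |f(x')|\, F_a(d(x,x')) \int_{\mathbb{C}} |z|\, \|[\tau_s^{(\Lambda)}(W(z\delta_x)), W(g)]\|\, |\mu_x|(dz) \tag{4.21}
\]

holds. Combining (4.21), (4.20), and (4.5), we have proven that

\[
\begin{aligned}
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le{}& c_a e^{v_a t} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)) \\
&+ c_a \sum_{x'\in\Gamma} |f(x')| \sum_{x\in\Lambda} F_a(d(x,x')) \int_0^t e^{v_a(t-s)} \int_{\mathbb{C}} |z|\, \|[\tau_s^{(\Lambda)}(W(z\delta_x)), W(g)]\|\, |\mu_x|(dz)\, ds.
\end{aligned} \tag{4.22}
\]

Following the iteration scheme applied in [19], one arrives at (4.11) as claimed.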

4.2. Multiple site anharmonicities


In this section, we will prove that Lieb–Robinson bounds, similar to those in Theorem 4.2, also hold for perturbations involving short range interactions. We introduce these as follows.
For each finite subset $X\subset\Gamma$, we associate a finite measure $\mu_X$ on $\mathbb{C}^X$ and an element $P_X\in\mathcal{W}(D)$ with the form

\[
P_X = \int_{\mathbb{C}^X} W(z\cdot\delta_X)\, \mu_X(dz), \tag{4.23}
\]

where, for each $z\in\mathbb{C}^X$, the function $z\cdot\delta_X : \Gamma\to\mathbb{C}$ is given by

\[
(z\cdot\delta_X)(x) = \sum_{x'\in X} z_{x'}\, \delta_{x'}(x) =
\begin{cases}
z_x & \text{if } x\in X, \\
0 & \text{otherwise.}
\end{cases} \tag{4.24}
\]

We will again require that $\mu_X$ is invariant with respect to $z\mapsto -z$, and hence, $P_X$ is self-adjoint. In analogy to (4.8), for any finite subset $\Lambda\subset\Gamma$, we will set

\[
P^\Lambda = \sum_{X\subset\Lambda} P_X, \tag{4.25}
\]

where the sum is over all subsets of $\Lambda$. Here we will again let $\tau_t^{(\Lambda)}$ denote the dynamics resulting from Proposition 4.1 applied to the $W^*$-dynamical system $\{\mathcal{M}, \tau_t^0\}$ and the perturbation $P^\Lambda$ defined by (4.25).

The main assumption on these multi-site perturbations follows. There exists a number $a_1 > 0$ such that for all $0 < a \le a_1$, there is a number $\kappa_a > 0$ for which, given any pair $x_1, x_2\in\Gamma$,

\[
\sum_{\substack{X\subset\Gamma:\\ x_1, x_2\in X}} \int_{\mathbb{C}^X} |z_{x_1}|\,|z_{x_2}|\, |\mu_X|(dz) \le \kappa_a F_a(d(x_1, x_2)). \tag{4.26}
\]

Theorem 4.3. Let $\tau_t^0$ be a harmonic dynamics defined on $\Gamma$. Assume that (4.26) holds, and that $\tau_t^{(\Lambda)}$ denotes the corresponding perturbed dynamics. For every $0 < a \le \min(a_0, a_1)$, there exist positive numbers $c_a$ and $v_a$ for which the estimate

\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le c_a e^{(v_a + c_a\kappa_a C_a^2)|t|} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)) \tag{4.27}
\]

holds for all $t\in\mathbb{R}$ and for any functions $f, g\in D$.

The proof of this result closely follows that of Theorem 4.2, and so we only comment on the differences.

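Proof. For $f, g\in D$ and $t > 0$, define $\Psi_t : [0,t]\to\mathcal{W}(D)$ as in (4.12). The derivative calculation beginning with (4.13) proceeds as before. Here

\[
L_{t-s;X}(f) = \int_{\mathbb{C}^X} W(z\cdot\delta_X)\,\{e^{i\sigma(T_{t-s}f,\, z\cdot\delta_X)} - 1\}\, \mu_X(dz) \tag{4.28}
\]

is also self-adjoint. The norm estimate

\[
\begin{aligned}
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le{}& \|[\tau_t^0(W(f)), W(g)]\| \\
&+ \sum_{X\subset\Lambda} \int_0^t \|[\tau_s^{(\Lambda)}(L_{t-s;X}(f)), W(g)]\|\, ds
\end{aligned} \tag{4.29}
\]

holds similarly. With (4.28), it is easy to see that the integrand in (4.29) is bounded by

\[
c_a e^{v_a(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{x'\in X} F_a(d(x,x')) \int_{\mathbb{C}^X} |z_{x'}|\, \|[\tau_s^{(\Lambda)}(W(z\cdot\delta_X)), W(g)]\|\, |\mu_X|(dz), \tag{4.30}
\]

the analogue of (4.21), for $0 < a \le a_0$. Moreover, if $0 < a \le \min(a_0, a_1)$, then

\[
\begin{aligned}
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le{}& c_a e^{v_a t} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)) + c_a \sum_{x\in\Gamma} |f(x)| \sum_{X\subset\Lambda} \sum_{x'\in X} F_a(d(x,x')) \\
&\times \int_0^t e^{v_a(t-s)} \int_{\mathbb{C}^X} |z_{x'}|\, \|[\tau_s^{(\Lambda)}(W(z\cdot\delta_X)), W(g)]\|\, |\mu_X|(dz)\, ds.
\end{aligned} \tag{4.31}
\]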

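The estimate claimed in (4.27) follows by iteration. In fact, the first term in the iteration is bounded by

\[
\begin{aligned}
& c_a \sum_{x\in\Gamma} |f(x)| \sum_{X\subset\Lambda} \sum_{x_1\in X} F_a(d(x,x_1)) \int_0^t e^{v_a(t-s)} \int_{\mathbb{C}^X} |z_{x_1}| \Big( c_a e^{v_a s} \sum_{x_2\in X} \sum_{y\in\Gamma} |z_{x_2}|\,|g(y)|\, F_a(d(x_2,y)) \Big) |\mu_X|(dz)\, ds \\
&\quad\le c_a t\cdot c_a e^{v_a t} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)| \sum_{x_1,x_2\in\Lambda} F_a(d(x,x_1))\, F_a(d(x_2,y)) \sum_{\substack{X\subset\Lambda:\\ x_1,x_2\in X}} \int_{\mathbb{C}^X} |z_{x_1}|\,|z_{x_2}|\, |\mu_X|(dz) \\
&\quad\le \kappa_a c_a t\cdot c_a e^{v_a t} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)| \sum_{x_1,x_2\in\Lambda} F_a(d(x,x_1))\, F_a(d(x_1,x_2))\, F_a(d(x_2,y)) \\
&\quad\le \kappa_a C_a^2 c_a t\cdot c_a e^{v_a t} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)).
\end{aligned} \tag{4.32}
\]

The higher order iterates are treated similarly.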
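The last step of (4.32) uses the convolution constant $C_a$ twice. As a rough numerical illustration, the sketch below estimates such a constant on $\Gamma = \mathbb{Z}$, under the assumption (suggested by (3.9), but not taken from Sec. 2 itself) that $F_a(r) = e^{-ar}/(1+r)^{d+1}$ with $d = 1$ and $d(x,y) = |x-y|$. The function names, the truncation of the lattice sum, and the range of separations are all illustrative choices.

```python
import math

def F(r, a, d=1):
    """Assumed decay function F_a(r) = exp(-a r) / (1 + r)^(d + 1)."""
    return math.exp(-a * r) / (1.0 + r) ** (d + 1)

def convolution_constant(a, max_sep=30, cutoff=400):
    """Largest value, over separations n = |x - y| <= max_sep, of
    sum_z F_a(|x - z|) F_a(|z - y|) / F_a(|x - y|) on Z,
    with the z-sum truncated to |z| <= cutoff."""
    worst = 0.0
    for n in range(max_sep + 1):
        total = sum(F(abs(z), a) * F(abs(n - z), a) for z in range(-cutoff, cutoff + 1))
        worst = max(worst, total / F(n, a))
    return worst

for a in (0.0, 0.5, 1.0):
    print(a, convolution_constant(a))
```

For this choice of $F_a$ the ratio stays below $2^{d+2}\sum_{z\in\mathbb{Z}}(1+|z|)^{-(d+1)}$ uniformly in $a \ge 0$, which is why the exponent in (4.27) depends on the perturbation only through $\kappa_a C_a^2$.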

5. Existence of the Dynamics


In this section, we demonstrate that the finite volume dynamics analyzed in the previous section converge to a limiting dynamics as the volume $\Lambda$ on which the perturbation is defined tends to $\Gamma$. We state this as Theorem 5.1 below.

Theorem 5.1. Let $\tau_t^0$ be a harmonic dynamics defined on $\mathcal{W}(\ell^1(\Gamma))$ as described in Sec. 4.1. Let $\{\Lambda_n\}$ denote a non-decreasing, exhaustive sequence of finite subsets of $\Gamma$. Consider a family of perturbations $P^{\Lambda_n}$ as defined in (4.25) and (4.23) which satisfy (4.26). Suppose in addition that

\[
M = \sup_{x\in\Gamma} \sum_{\substack{X\subset\Gamma:\\ x\in X}} \int_{\mathbb{C}^X} |z_x|\, |\mu_X|(dz) < \infty. \tag{5.1}
\]

Then, for each $f\in\ell^1(\Gamma)$ and $t\in\mathbb{R}$ fixed, the limit

\[
\lim_{n\to\infty} \tau_t^{(\Lambda_n)}(W(f)) \tag{5.2}
\]

exists in norm. The limiting dynamics, which we denote by $\tau_t$, is weakly continuous.

It is important to note that since the estimates in Theorem 4.3 are independent of $\Lambda$, the limiting dynamics also satisfies a Lieb–Robinson bound as in (4.27). We now prove Theorem 5.1.

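Proof. Fix a Weyl operator $W(f)$ with $f\in\ell^1(\Gamma)$. Let $T > 0$ and take $m \le n$. Iteratively applying Proposition 4.1, we have that

\[
\tau_t^{(\Lambda_n)}(W(f)) = \tau_t^{(\Lambda_m)}(W(f)) + i\int_0^t \tau_s^{(\Lambda_n)}\big([P^{\Lambda_n\setminus\Lambda_m}, \tau_{t-s}^{(\Lambda_m)}(W(f))]\big)\, ds, \tag{5.3}
\]

for all $-T \le t \le T$. The bound

\[
\begin{aligned}
\|[P^{\Lambda_n\setminus\Lambda_m}, \tau_{t-s}^{(\Lambda_m)}(W(f))]\|
&\le \sum_{\substack{X\subset\Lambda_n:\\ X\cap\Lambda_n\setminus\Lambda_m\neq\emptyset}} \int_{\mathbb{C}^X} \|[W(z\cdot\delta_X), \tau_{t-s}^{(\Lambda_m)}(W(f))]\|\, |\mu_X|(dz) \\
&\le c_a e^{(v_a + c_a\kappa_a C_a^2)(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{\substack{X\subset\Lambda_n:\\ X\cap\Lambda_n\setminus\Lambda_m\neq\emptyset}} \sum_{y\in X} F_a(d(x,y)) \int_{\mathbb{C}^X} |z_y|\, |\mu_X|(dz) \\
&\le c_a e^{(v_a + c_a\kappa_a C_a^2)(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{y\in\Lambda_n\setminus\Lambda_m} F_a(d(x,y)) \sum_{\substack{X\subset\Gamma:\\ y\in X}} \int_{\mathbb{C}^X} |z_y|\, |\mu_X|(dz) \\
&\le M c_a e^{(v_a + c_a\kappa_a C_a^2)(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{y\in\Lambda_n\setminus\Lambda_m} F_a(d(x,y))
\end{aligned} \tag{5.4}
\]

follows readily from Theorem 4.3 and assumption (5.1). For $f\in\ell^1(\Gamma)$ and fixed $t$, the upper estimate above goes to zero as $n, m\to\infty$. In fact, the convergence is uniform for $t\in[-T, T]$. This proves (5.2).
By an $\epsilon/3$ argument, similar to what is done at the end of Sec. 2, weak continuity follows since we know it holds for the finite volume dynamics. This completes the proof of Theorem 5.1.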

Acknowledgments
The work reported in this paper was supported by the National Science Foundation:
B.N. under Grants #DMS-0605342 and #DMS-0757581, R.S. under Grant #DMS-
0757424, and S.S. under Grant #DMS-0757327 and #DMS-0706927. The authors
would also like to acknowledge the hospitality of the Department of Mathematics
at U.C. Davis where a part of this work was completed.

References
[1] L. Amour, P. Levy-Bruhl and J. Nourrigat, Dynamics and Lieb–Robinson estimates for lattices of interacting anharmonic oscillators, to appear in Colloq. Math., Special volume dedicated to A. Hulanicki; arXiv:0904.2717.
[2] H. Araki and E. J. Woods, Representations of the canonical commutation relations describing a non-relativistic infinite free Bose gas, J. Math. Phys. 4 (1963) 637–662.
[3] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. Volume 1, 2nd edn. (Springer-Verlag, 1987).

[4] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. Volume 2, 2nd edn. (Springer-Verlag, 1997).
[5] P. Buttà, E. Caglioti, S. Di Ruzza and C. Marchioro, On the propagation of a perturbation in an anharmonic system, J. Stat. Phys. 127 (2007) 313–325.
[6] M. Cramer and J. Eisert, Correlations, spectral gap, and entanglement in harmonic quantum systems on generic lattices, New J. Phys. 8 (2006) 71.
[7] M. Cramer, A. Serafini and J. Eisert, Locality of dynamics in general harmonic quantum systems, in Quantum Information and Many Body Quantum Systems, eds. M. Ericsson and S. Montangero (Edizioni della Normale, 2008).
[8] M. Hastings and T. Koma, Spectral gap and exponential decay of correlations, Comm. Math. Phys. 265(3) (2006) 781–804.
[9] J. L. van Hemmen, Dynamics and ergodicity of the infinite harmonic crystal, Phys. Rept. 65 (1980) 45–149.
[10] O. E. Lanford, J. Lebowitz and E. H. Lieb, Time evolution of infinite anharmonic systems, J. Statist. Phys. 16(6) (1977) 453–461.
[11] E. H. Lieb and D. W. Robinson, The finite group velocity of quantum spin systems, Comm. Math. Phys. 28 (1972) 251–257.
[12] E. H. Lieb and R. Seiringer, The Stability of Matter in Quantum Mechanics (Cambridge University Press, 2009).
[13] J. Manuceau and A. Verbeure, Quasi-free states of the CCR algebra and Bogoliubov transformations, Comm. Math. Phys. 9 (1968) 293–302.
[14] J. Manuceau, M. Sirugue, D. Testard and A. Verbeure, The smallest C*-algebra for canonical commutation relations, Comm. Math. Phys. 32 (1973) 231–243.
[15] C. Marchioro, A. Pellegrinotti, M. Pulvirenti and L. Triolo, Velocity of a perturbation in infinite lattice systems, J. Statist. Phys. 19(5) (1978) 499–510.
[16] B. Nachtergaele and R. Sims, Lieb–Robinson bounds and the exponential clustering theorem, Comm. Math. Phys. 265(1) (2006) 119–130.
[17] B. Nachtergaele, Y. Ogata and R. Sims, Propagation of correlations in quantum lattice systems, J. Statist. Phys. 124(1) (2006) 1–13.
[18] B. Nachtergaele and R. Sims, Locality estimates for quantum spin systems, in New Trends in Mathematical Physics, Selected Contributions of the XVth International Congress on Mathematical Physics (Springer-Verlag, 2009), pp. 591–614.
[19] B. Nachtergaele, H. Raz, B. Schlein and R. Sims, Lieb–Robinson bounds for harmonic and anharmonic lattice systems, Comm. Math. Phys. 286 (2009) 1073–1098.
[20] H. Raz and R. Sims, Estimating the Lieb–Robinson velocity for classical anharmonic lattice systems, J. Statist. Phys. 137 (2009) 79–108.
[21] M. Reed and B. Simon, Methods of Modern Mathematical Physics, II, Fourier Analysis, Self-Adjointness (Academic Press, 1975).
[22] D. W. Robinson, The ground state of the Bose gas, Comm. Math. Phys. 1 (1965) 159–174.
[23] H. Spohn and J. L. Lebowitz, Stationary non-equilibrium states of infinite harmonic systems, Comm. Math. Phys. 54 (1977) 97–120.
[24] W. Thirring and F. Dyson (eds), The Stability of Matter: From Atoms to Stars: Selecta of Elliott H. Lieb, 4th edn. (Springer-Verlag, 2005).
April 20, 2010 14:17 WSPC/S0129-055X 148-RMP J070-00395

Reviews in Mathematical Physics
Vol. 22, No. 3 (2010) 233–303
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003953

EFFECT OF A LOCALLY REPULSIVE INTERACTION ON s-WAVE SUPERCONDUCTORS

J.-B. BRU and W. DE SIQUEIRA PEDRA

Departamento de Matemáticas,
Facultad de Ciencia y Tecnología, Universidad del País Vasco,
Apartado 644, 48080 Bilbao, Spain
and
IKERBASQUE, Basque Foundation for Science,
48011, Bilbao, Spain
jeanbernard bru@ehu.es
jb.bru@ikerbasque.org

Institut für Mathematik, Universität Mainz,
Staudingerweg 9, 55099 Mainz, Germany
pedra@mathematik.uni-mainz.de

Received 23 September 2009
Revised 22 February 2010

The thermodynamic impact of the Coulomb repulsion on s-wave superconductors is analyzed via a rigorous study of equilibrium and ground states of the strong coupling BCS-Hubbard Hamiltonian. We show that the one-site electron repulsion can favor superconductivity at fixed chemical potential by increasing the critical temperature and/or the Cooper pair condensate density. If the one-site repulsion is not too large, a first or a second order superconducting phase transition can appear at low temperatures. The Meißner effect is shown to be rather generic but coexistence of superconducting and ferromagnetic phases is also shown to be feasible, for instance, near half-filling and at strong repulsion. Our proof of a superconductor-Mott insulator phase transition implies a rigorous explanation of the necessity of doping insulators to create superconductors. These mathematical results are consequences of quantum large deviation arguments combined with an adaptation of the proof of Størmer's theorem [1] to even states on the CAR algebra.

Keywords: Superconductivity; s-wave; Coulomb interaction; Hubbard model; Meißner effect; Mott insulators; equilibrium states; Størmer's theorem.

Mathematics Subject Classification 2010: 82B20, 82D55

Contents

1. Introduction
2. Grand-Canonical Pressure and Gap Equation
3. Phase Diagram at Fixed Chemical Potential
   3.1. Existence of a s-wave superconducting phase transition
   3.2. Electron density per site and electron-hole symmetry
   3.3. Superconductivity versus magnetization: Meißner effect
   3.4. Coulomb correlation density
   3.5. Superconductor-Mott insulator phase transition
   3.6. Mean-energy per site and the specific heat
4. Phase Diagram at Fixed Electron Density per Site
   4.1. Thermodynamics away from any critical point
   4.2. Coexistence of ferromagnetic and superconducting phases
5. Concluding Remarks
6. Mathematical Foundations of the Thermodynamic Results
   6.1. Thermodynamic limit of the pressure: Proof of Theorem 2.1
   6.2. Equilibrium and ground states of the strong coupling BCS-Hubbard model
7. Analysis of the Variational Problem
Appendix. Griffiths Arguments

1. Introduction
Since the discovery of mercury superconductivity in 1911 by the Dutch physicist
Onnes, the study of superconductors has continued to intensify, see, e.g., [2]. Since
that discovery, a significant number of superconducting materials have been found.
These include usual metals, like lead, aluminum, zinc or platinum, magnetic materials,
heavy-fermion systems, organic compounds and ceramics. A complete description
of their thermodynamic properties is an entire subject by itself, see [2–4] and
references therein. In addition to zero-resistivity and many other complex phenomena,
superconductors manifest the celebrated Meißner or Meißner–Ochsenfeld
effect, i.e. they can become perfectly diamagnetic. The highest^a critical temperature
for superconductivity obtained nowadays is between 100 and 200 Kelvin via
doped copper oxides, which are originally insulators. In contrast to most superconductors,
superconductivity in magnetic superconductors only exists on a
finite range of non-zero temperatures.
a In January 2008, a critical temperature over 180 Kelvin was reported in a Pb-doped copper oxide.

Effect of a Locally Repulsive Interaction on s-Wave Superconductors 235

Theoretical foundations of superconductivity go back to the celebrated BCS
theory, which appeared in the late fifties (1957) and explains conventional type I
superconductors. This theory is based on the so-called (reduced) BCS Hamiltonian



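\[
H_\Lambda^{\mathrm{BCS}} := \sum_{k \in \Lambda^*} (\varepsilon_k - \mu)\big(\tilde a^*_{k,\uparrow}\tilde a_{k,\uparrow} + \tilde a^*_{k,\downarrow}\tilde a_{k,\downarrow}\big)
+ \frac{1}{|\Lambda|} \sum_{k,k' \in \Lambda^*} \gamma_{k,k'}\, \tilde a^*_{k,\uparrow}\tilde a^*_{-k,\downarrow}\tilde a_{k',\downarrow}\tilde a_{-k',\uparrow} \tag{1.1}
\]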

defined in a cubic box $\Lambda \subset \mathbb{R}^3$ of volume $|\Lambda|$. Here $\Lambda^*$ is the dual group of $\Lambda$ seen as a
torus (periodic boundary condition) and the operator $\tilde a^*_{k,s}$ respectively $\tilde a_{k,s}$ creates
respectively annihilates a fermion with spin $s \in \{\uparrow, \downarrow\}$ and momentum $k \in \Lambda^*$.
The function $\varepsilon_k$ represents the kinetic energy, the real number $\mu$ is the chemical
potential and $\gamma_{k,k'}$ is the BCS coupling function. The choice $\gamma_{k,k'} = -\gamma < 0$ is often
used in the Physics literature and the case $\varepsilon_k = 0$ is known as the strong coupling
limit of the BCS model.
The lattice approximation of the BCS Hamiltonian amounts to replacing the box
$\Lambda \subset \mathbb{R}^3$ by $\Lambda \subset \mathbb{Z}^3$ (or, more generally, by $\Lambda \subset \mathbb{Z}^{d \geq 1}$) and the strong coupling
limit of the reduced BCS model is in this case known as the strong coupling (with
$\gamma_{k,k'} = -\gamma$) BCS model.^b The assumptions $\varepsilon_k = 0$ and $\gamma_{k,k'} = -\gamma$ are of interest,
because in this case the BCS Hamiltonian can be explicitly diagonalized. The exact
solution of the strong coupling BCS model has been well known since the sixties [6, 7].
This model is in a sense unrealistic: among other things, its representation of the
kinetic energy of electrons is rather poor. Nevertheless, it became popular because
it displays most of the basic properties of real conventional type I superconductors.
See, e.g., [8, Chap. VII, Sec. 4]. Even though the analysis of the thermodynamics of
the BCS Hamiltonian was rigorously performed in the eighties [9, 10] (see also the
innovative work of Bernadskii and Minlos in 1972 [11]), generalizations of the strong
coupling approximation of the BCS model are still a subject of research. For instance,
strong coupling-BCS-type models with superconducting phases at arbitrarily high
temperatures are treated in [12].
In fact, a general theory of superconductivity is still a subject of debate, especially
for high-$T_c$ superconductors. An important phenomenon ignored in the BCS
theory is the Coulomb interaction between electrons or holes, which can imply
strong correlations, for instance in high-$T_c$ superconductors. To study these correlations,
most theoretical methods, inspired by Beliaev [5], use perturbation
theory or renormalization group derived from the diagram approach of Quantum
Field Theory. However, even if these approaches have been successful in explaining
many physical properties of superconductors [3, 4], only few rigorous results exist
on superconductivity.
For instance, the effect of the Coulomb interaction on superconductivity is not
rigorously known. This problem was of course addressed in theoretical Physics right
after the emergence of the Fröhlich model and the BCS theory, see, e.g., [13].

b See also (1.2) with $\lambda = 0$ and $h = 0$.


In particular, the authors explain in [13, Chap. VI], by means of diagrammatic
perturbation theory, that the effect of the Coulomb interaction on the Fröhlich
model should be to lower the critical temperature of the superconducting phase
by lowering the electron density. We rigorously show that this phenomenology is
only true for our model in a specific region of parameters.
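Indeed, the aim of the present paper is to understand the possible thermodynamic
impact of the Coulomb repulsion in the strong coupling approximation.
More precisely, we study the thermodynamic properties of the strong coupling BCS-Hubbard
model defined in the box^c $\Lambda_N := \{\mathbb{Z} \cap [-L, L]\}^{d \geq 1}$ of volume $|\Lambda_N| = N \geq 2$
by the Hamiltonian
\[
H_N := -\mu \sum_{x \in \Lambda_N} (n_{x,\uparrow} + n_{x,\downarrow}) - h \sum_{x \in \Lambda_N} (n_{x,\uparrow} - n_{x,\downarrow})
+ 2\lambda \sum_{x \in \Lambda_N} n_{x,\uparrow} n_{x,\downarrow} - \frac{\gamma}{N} \sum_{x,y \in \Lambda_N} a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow} \tag{1.2}
\]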

for real parameters $\mu$, $h$, $\lambda$, and $\gamma \geq 0$. The operator $a^*_{x,s}$ respectively $a_{x,s}$ creates
respectively annihilates a fermion with spin $s \in \{\uparrow, \downarrow\}$ at lattice position $x \in \mathbb{Z}^d$
whereas $n_{x,s} := a^*_{x,s} a_{x,s}$ is the particle number operator at position $x$ and spin $s$.
The first term of the right-hand side of (1.2) represents the strong coupling limit
of the kinetic energy, with $\mu$ being the chemical potential of the system. Note that
this "strong coupling limit" explained above for the BCS Hamiltonian is also
called "atomic limit" in the context of the Hubbard model, see, e.g., [14, 15]. The
second term in the right-hand side of (1.2) corresponds to the interaction between
spins and the magnetic field $h$. The one-site interaction with coupling constant $\lambda$
represents the (screened) Coulomb repulsion as in the celebrated Hubbard model.
So, the parameter $\lambda$ should be taken as a positive number but our results are also
valid for any real $\lambda$. The last term is the BCS interaction written in the x-space
since
\[
\frac{\gamma}{N} \sum_{x,y \in \Lambda_N} a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow}
= \frac{\gamma}{N} \sum_{k,q \in \Lambda_N^*} \tilde a^*_{k,\uparrow} \tilde a^*_{-k,\downarrow} \tilde a_{q,\downarrow} \tilde a_{-q,\uparrow}, \tag{1.3}
\]

with $\Lambda_N^*$ being the reciprocal lattice of quasi-momenta and where $\tilde a_{q,s}$ is the corresponding
annihilation operator for $s \in \{\uparrow, \downarrow\}$. Observe that the thermodynamics
of the model for $\gamma = 0$ can easily be computed. Therefore, we restrict the analysis
to the case $\gamma > 0$. Note also that the homogeneous BCS interaction (1.3) can
imply a superconducting phase and the mediator implying this effective interaction
does not matter here, i.e. it could be due to phonons, as in conventional type I
superconductors, or anything else.
c Without loss of generality, we choose $N$ such that $L := (N^{1/d} - 1)/2 \in \mathbb{N}$.

We show that the one-site repulsion suppresses superconductivity for large
$\lambda \geq 0$. In particular, the repulsive term in (1.2) cannot imply any superconducting
state if $\gamma = 0$. However, the first elementary but nonetheless important property
of this model is that the presence of an electron repulsion is not incompatible with
superconductivity if $|\lambda - \mu|$ and $(\lambda + |h|)$ are not too big as compared to the coupling
constant $\gamma$ of the BCS interaction. In this case, the superconducting phase appears
at low temperatures as either a first order or a second order phase transition.
More surprisingly, the one-site repulsion can even favor superconductivity at fixed
chemical potential $\mu$ by increasing the critical temperature and/or the Cooper pair
condensate density. This contradicts the naive guess that any one-site repulsion
between electron pairs should at least reduce the formation of Cooper pairs. It is
however important to mention that the physical behavior described by the model
depends on which parameter, $\mu$ or $\rho$, is fixed. (It does not mean that the canonical
and grand-canonical ensembles are not equivalent for this model.) Indeed, we also
analyze the thermodynamic properties at fixed electron density $\rho$ per site in the
grand-canonical ensemble, as it is done for the perfect Bose gas in the proof of
Bose–Einstein condensation. The analysis of the thermodynamics of the strong
coupling BCS-Hubbard model is performed in detail. In particular, we prove that
the Meißner effect is rather generic but also that the coexistence of superconducting
and ferromagnetic phases is possible (as in the Vonsovkii–Zener model [16, 17]), for
instance at large $\lambda > 0$ and densities near half-filling. The latter situation is related to
a superconductor-Mott insulator phase transition. This transition furthermore gives
a rigorous explanation of the need to dope insulators to obtain superconductors.
Indeed, at large enough coupling constant $\lambda$, the superconductor-Mott insulator
phase transition corresponds to the breakdown of superconductivity together with
the appearance of a gap in the chemical potential as soon as the electron density
per site becomes an integer, i.e. 0, 1 or 2. If the system has an electron density
per site equal to 1 without being superconducting, then any non-zero magnetic field
$h \neq 0$ implies a ferromagnetic phase.
Note that the present setting is still too simplified with respect to real superconductors.
For instance, the anti-ferromagnetic phase or the presence of vortices,
which can appear in (type II) high-$T_c$ superconductors [3, 4], are not modeled. However,
the BCS-Hubbard Hamiltonian (1.2) may be a good model for certain kinds
of superconductors or ultra-cold Fermi gases in optical lattices, where the strong
coupling approximation is experimentally justified. Actually, even if the strong coupling
assumption is a severe simplification, it may be used in order to analyze the
thermodynamic impact of the Coulomb repulsion, as all parameters of the model
have a phenomenological interpretation and can be directly related to experiments.
See discussions in Sec. 5. Moreover, the range of parameters in which we are interested
turns out to be related to a first order phase transition. Phase transitions
of this kind are known to be stable under small perturbations of the Hamiltonian.
In particular, by including a small kinetic part it can be shown by high-low temperature
expansions that the model

\[
H_{N,\varepsilon} := H_N + \sum_{x,y \in \Lambda_N} \varepsilon(x - y)\big(a^*_{y,\uparrow} a_{x,\uparrow} + a^*_{y,\downarrow} a_{x,\downarrow}\big)
\]
has essentially the same correlation functions as $H_N$, up to corrections of order $\|\varepsilon\|_1$
($\ell^1$-norm of $\varepsilon$). This analysis will be the subject of a separate paper. For any $\varepsilon \neq 0$
notice that the model $H_{N,\varepsilon}$ is no longer permutation invariant but only translation
invariant. Such translation invariant models are studied in a systematic way
in [18]. Their detailed analysis is, however, generally much more difficult to perform.
Considering first models having more symmetries, such as permutation
invariance, is in this case technically easier.
Coming back to the strong coupling BCS-Hubbard model $H_N$, it turns out that
the thermodynamic limit of its (grand-canonical) pressure^d
\[
p_N(\beta, \mu, \lambda, \gamma, h) := \frac{1}{\beta N} \ln \mathrm{Trace}\big(e^{-\beta H_N}\big) \tag{1.4}
\]
exists at any fixed inverse temperature $\beta > 0$. It corresponds to a variational
problem which has minimizers^e in the set $E_{\mathcal{U}}^{S,+}$ of (even^f) permutation invariant
states on the CAR $C^*$-algebra $\mathcal{U}$ generated by annihilation and creation operators:
\[
p(\beta, \mu, \lambda, \gamma, h) := \lim_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} = -\inf_{\omega \in E_{\mathcal{U}}^{S,+}} \mathcal{F}(\omega). \tag{1.5}
\]

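Here the map
\[
\omega \mapsto \mathcal{F}(\omega) := e(\omega) - \beta^{-1} S(\omega)
\]
is the affine (lower weak$^*$-semicontinuous) free-energy density functional defined on
$E_{\mathcal{U}}^{S,+}$ from the mean energy per volume
\[
e(\omega) := \lim_{N \to \infty} \{N^{-1} \omega(H_N)\} < \infty
\]
and the entropy density
\[
S(\omega) := -\lim_{N \to \infty} \Big\{ \frac{1}{N} \mathrm{Trace}\big(D_{\omega|\mathcal{U}_{\Lambda_N}} \log D_{\omega|\mathcal{U}_{\Lambda_N}}\big) \Big\} < \infty.
\]
Note that $D_{\omega|\mathcal{U}_{\Lambda_N}}$ is the density matrix associated to the state $\omega$ restricted on the
local CAR $C^*$-algebra $\mathcal{U}_{\Lambda_N} \simeq \mathcal{B}(\bigwedge \mathbb{C}^{\Lambda_N \times \{\uparrow,\downarrow\}})$ (isomorphism). Such a derivation
of the pressure as a minimization problem over states on a $C^*$-algebra is also
performed for various quantum spin systems, see, e.g., [19–23].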
The minimum of the variational problem (1.5) is attained for any weak$^*$-limit
point of local Gibbs states
\[
\omega_N(\cdot) := \frac{\mathrm{Trace}(\,\cdot\; e^{-\beta H_N})}{\mathrm{Trace}(e^{-\beta H_N})} \tag{1.6}
\]
associated with $H_N$. Similarly to what is done for general translation invariant
models (see [24, 25]), the set of equilibrium states of the strong coupling BCS-Hubbard
model is naturally defined to be the set $\Omega_\beta = \Omega_\beta(\mu, \lambda, \gamma, h)$ of minimizers
of (1.5). Note that $\Omega_\beta$ is a non-empty convex subset^g of $E_{\mathcal{U}}^{S,+}$ and the extreme
decomposition in $\Omega_\beta$ coincides with the one in $E_{\mathcal{U}}^{S,+}$, i.e. $\Omega_\beta$ is a face^h in $E_{\mathcal{U}}^{S,+}$.
So, pure equilibrium states are extreme states of $\Omega_\beta$. Meanwhile, any weak$^*$-limit
point as $n \to \infty$ of an equilibrium state sequence $\{\omega^{(n)}\}_{n \in \mathbb{N}}$ with diverging inverse
temperature $\beta_n \to \infty$ is by definition a ground state $\omega \in E_{\mathcal{U}}^{S,+}$.

d Our notation for the "Trace" does not include the Hilbert space where it is evaluated but it
should be deduced from operators involved in each statement.
e Because $\omega \mapsto \mathcal{F}(\omega)$ is lower semicontinuous and $E_{\mathcal{U}}^{S,+}$ is compact with respect to the weak$^*$-topology.
f See Remark 6.1 in Sec. 6.1.
Here we have left the Fock space representation of the model to go to a
representation-free formulation of thermodynamic phases. This means that $H_N$ is
no longer seen as a Hamiltonian acting on the Fock space but as a (self-adjoint)
element of the CAR $C^*$-algebra $\mathcal{U}$ with thermodynamic phases described by states
on $\mathcal{U}$. Doing so we take advantage of the non-uniqueness of the representation of the
CAR $C^*$-algebra $\mathcal{U}$. This property is indeed necessary to get non-unique equilibrium
and ground states which imply phase transitions. This fact was first observed by
Haag in 1962 [26], who established that the non-uniqueness of the ground state of
the BCS model in infinite volume is related to the existence of several inequivalent^i
irreducible representations^j of the Hamiltonian, see also [6, 27].
Equilibrium states define tangents to the convex map
\[
(\beta, \mu, \lambda, \gamma, h) \mapsto p(\beta, \mu, \lambda, \gamma, h).
\]
The analysis of the set of tangents of this map hence gives information about the
expectations of many important observables with respect to equilibrium states. The
main technical point in the present work is therefore to find an explicit representation
of the pressure by using the permutation invariance of the model in a crucial
way. Indeed, we adapt to our case of fermions on a lattice the methods of [19] used
to find the pressure of spin systems of mean-field type. Then, it is proven that it
suffices to minimize the variational problem (1.5) with respect to the set $\mathcal{E}_{\mathcal{U}}^{S,+}$ of
extreme states in $E_{\mathcal{U}}^{S,+}$. By adapting the proof of Størmer's theorem [1] to even
states on the CAR algebra, we show next that extreme, permutation invariant and
even states are product states
\[
\omega_\zeta := \bigotimes_{x \in \mathbb{Z}^d} \zeta_x
\]

obtained by "copying" some one-site even state $\zeta$ to all other sites. This result is
a non-commutative version of the celebrated de Finetti Theorem from (classical)
probability theory [28]. Using this, the variational problem (1.5) can be drastically
simplified to a minimization problem on a finite dimensional manifold. At the end,
it yields another explicit, rather simple, variational problem on $\mathbb{R}_0^+$, which can
be rigorously analyzed by analytic or numerical methods to obtain the complete
thermodynamic behavior of the model.

g The map $\omega \mapsto \mathcal{F}(\omega)$ on the convex set $E_{\mathcal{U}}^{S,+}$ is affine and lower semicontinuous, thus $\Omega_\beta$ is a
non-empty face of $E_{\mathcal{U}}^{S,+}$.
h A face $F$ of a compact convex set $K$ is a subset of $K$ with the property that if $\omega = \sum_{n=1}^{m} \lambda_n \omega_n \in F$
with $\sum_{n=1}^{m} \lambda_n = 1$ and $\{\omega_n\}_{n=1}^{m} \subset K$, then $\{\omega_n\}_{n=1}^{m} \subset F$.
i This means that there is no isomorphism between $\mathfrak{h}_{j_1}$ and $\mathfrak{h}_{j_2}$ whenever $\mathfrak{h}_{j_1}$ and $\mathfrak{h}_{j_2}$ are the
Hilbert spaces corresponding to two different irreducible representations.
j This means that the Hamiltonian can be seen as an operator acting on several Hilbert spaces
$\{\mathfrak{h}_j\}_{j \in J}$ with no (non-trivial) invariant subspace.
Observe, however, that not all correlation functions can be drawn from an
explicit formula for the pressure by taking derivatives combined with Griffiths arguments
[29–31] on the convergence of derivatives of convex functions, unless the
(infinite volume) pressure is shown to be differentiable with respect to any perturbation.
Showing differentiability of the pressure as well as the explicit computation
of its corresponding derivative can be a very hard task, for instance for correlation
functions involving many lattice points. By contrast, the method presented in this
paper gives access to all correlation functions at once. This is one basic (mathematical)
message of this method, which is generalized in [18] to all translation invariant
Fermi systems without requiring any quantum spin representation.
In fact, we precisely characterize the sets $\Omega_\beta$ for all $\beta \in (0, \infty]$, where $\Omega_\infty$ is the
set of ground states with parameters $\mu$, $\lambda$, $\gamma$, and $h$. This detailed study yields our
main rigorous results on the strong coupling BCS-Hubbard model $H_N$, which can
be summarized as follows:

• There is a set of parameters $\mathcal{S}$, defining the superconducting phase, with equilibrium
  and ground states breaking the $U(1)$-gauge symmetry and showing off-diagonal
  long range order (ODLRO).
• Depending on the parameters, the superconducting phase transition is either a
  first order or a second order phase transition.
• The superconducting phase $\mathcal{S}$ is characterized by the formation of Cooper pairs
  (shown by proving bounds for the density-density correlations) and a depleted
  Cooper pair condensate, the density $r_\beta \in [0, 1/4]$ of which is defined by the gap
  equation.
• From our proof of Størmer's theorem [1] for even states on the CAR algebra,
  we observe that the superconducting phase $\mathcal{S}$ corresponds to a s-wave superconductor,
  i.e. a superconductor with two-point correlation function, for $x, y \in \mathbb{Z}^d$,
  $s_1, s_2 \in \{\uparrow, \downarrow\}$ and within $\mathcal{S}$, equal to $\omega(a_{x,s_1} a_{y,s_2}) = r_\beta^{1/2} e^{i\phi} \neq 0$ if $x = y$ and
  $s_1 \neq s_2$, and $\omega(a_{x,s_1} a_{y,s_2}) = 0$ else. (Here $\omega$ is any pure state of $\Omega_\beta$; $\phi \in [0, 2\pi)$
  is determined by $\omega$.)
• We observe the Meißner effect^k by analyzing the relation between superconductivity
  and magnetization.
• We establish the existence of a superconductor-Mott insulator phase transition
  for integer electron density per site.
• The coexistence of ferromagnetic and superconducting phases is shown to be feasible
  at (critical) points of the boundary $\partial\mathcal{S}$ of $\mathcal{S}$, by applying the decomposition
  theory for states [32] on the weak$^*$-compact and convex set $\Omega_\beta$.
• The critical temperature $\theta_c$ for the superconducting phase transition with respect
  to $\mu$, $\lambda$ or $h$ is analyzed in the case of fixed chemical potential $\mu$ and also in the
  case of constant electron density $\rho$. It shows that $\theta_c$ can be an increasing function
  of the positive coupling constant $\lambda > 0$ at fixed $\mu \in \mathbb{R}$ but not at fixed $\rho > 0$.
• For the critical temperature $\theta_c$ shows as a function of the electron
  density $\rho$ the typical behavior observed (only) in high-$T_c$ superconductors: $\theta_c$
  is zero or very small for $\rho \approx 1$ and is much larger for $\rho$ away from 1. Thus, our
  model provides a simple rigorous microscopic explanation for such experimentally
  well-known behavior of high-$T_c$ superconductors.
• Together with our study of the heat capacity, all these results can be used to fix
  experimentally all parameters of $H_N$.

k It is mathematically defined here by the absence of magnetization in presence of superconductivity.
Steady surface currents around the bulk of the superconductor are not analyzed as it is a
finite volume effect.

Note that our study of equilibrium states is reminiscent of the work of Fannes,
Spohn and Verbeure [33], performed however within a different framework. In
contrast to our setting, their analysis [33] concerns symmetric states on an
infinite tensor product of one $C^*$-algebra and their definition of equilibrium states
uses the so-called correlation inequalities for KMS-states, see [29, Appendix E].
To conclude, this paper is organized as follows. In Sec. 2, we give the thermodynamic
limit of the pressure $p_N$ (1.4) as well as the gap equation. Then, our main
results concerning the thermodynamic properties of the model are formulated in
Sec. 3 at fixed chemical potential $\mu$ and in Sec. 4 at fixed electron density $\rho$ per
site. Section 5 briefly explains our result on the level of equilibrium states and gives
additional remarks. In order to keep the main issues and the physical implications
as transparent as possible, we reduce the technical and formal aspects to a minimum
in Secs. 2–5. In particular, in Secs. 2–4 we only stay on the level of pressure and
thermodynamic limit of local Gibbs states. The generalization of the results on the
level of equilibrium and ground states is postponed to Sec. 6.2. Indeed, the rather
long Sec. 6 gives the detailed mathematical foundations of our phase diagrams. In
particular, in Sec. 6.1 we introduce the $C^*$-algebraic machinery needed in our analysis
and prove various technical facts to conclude in Sec. 6.2 with the rigorous study
of equilibrium and ground states. In Sec. 7, we collect some useful properties on the
qualitative behavior of the Cooper pair condensate density, whereas the Appendix
is devoted to Griffiths arguments [29–31].

2. Grand-Canonical Pressure and Gap Equation


In order to obtain the thermodynamic behavior of the strong coupling BCS-Hubbard
model $H_N$, it is essential to get first the thermodynamic limit $N \to \infty$
of its grand-canonical pressure $p_N$ (1.4). The rigorous derivation of this limit is
performed in Sec. 6.1. We explain here the final result with the heuristic behind it.
The first important remark is that one can guess the correct variational problem
by the so-called approximating Hamiltonian method [34–36] originally proposed by
Bogoliubov Jr. [37]. In our case, the correct approximation of the Hamiltonian $H_N$
is the $c$-dependent Hamiltonian
\[
H_N(c) := -\mu \sum_{x \in \Lambda_N} (n_{x,\uparrow} + n_{x,\downarrow}) - h \sum_{x \in \Lambda_N} (n_{x,\uparrow} - n_{x,\downarrow})
+ 2\lambda \sum_{x \in \Lambda_N} n_{x,\uparrow} n_{x,\downarrow} - \frac{\gamma}{N} \sum_{x \in \Lambda_N} \big((Nc)\, a^*_{x,\uparrow} a^*_{x,\downarrow} + (N\bar{c})\, a_{x,\downarrow} a_{x,\uparrow}\big), \tag{2.1}
\]

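with $c \in \mathbb{C}$, see also [6, 7]. The main advantage of this Hamiltonian in comparison
with $H_N$ is the fact that it is a sum of shifts of the same local operator. For an
appropriate order parameter $c \in \mathbb{C}$, it leads to a good approximation of the pressure
$p_N$ as $N \to \infty$. This can be partially seen from the inequality
\[
\gamma N |c|^2 + H_N(c) - H_N = \frac{\gamma}{N} \Big(\sum_{x \in \Lambda_N} a^*_{x,\uparrow} a^*_{x,\downarrow} - N\bar{c}\Big) \Big(\sum_{x \in \Lambda_N} a_{x,\downarrow} a_{x,\uparrow} - Nc\Big) \geq 0,
\]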

which is valid as soon as $\gamma \geq 0$. Observe that the constant term $\gamma N |c|^2$ is not
included in the definition of $H_N(c)$. Hence, by using the Golden–Thompson inequality
$\mathrm{Trace}(e^{A+B}) \leq \mathrm{Trace}(e^{A} e^{B})$, the thermodynamic limit $p(\beta, \mu, \lambda, \gamma, h)$ of the
pressure $p_N$ (1.4) is bounded from below by
\[
p(\beta, \mu, \lambda, \gamma, h) \geq \sup_{c \in \mathbb{C}} \{-\gamma |c|^2 + p(c)\}. \tag{2.2}
\]

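The function $p(c) = p(\beta, \mu, \lambda, \gamma, h; c)$ is the pressure associated with $H_N(c)$ for any
$N \geq 1$. It can easily be computed since $H_N(c)$ is a sum of local operators which
commute with each other. Indeed, for any $N \geq 1$, this pressure equals^l
\[
p(c) := \frac{1}{\beta N} \ln \mathrm{Trace}\big(e^{-\beta H_N(c)}\big) = \frac{1}{\beta} \ln \mathrm{Trace}\big(e^{-\beta H_1(c)}\big)
= \frac{1}{\beta} \ln \mathrm{Trace}\big(e^{\beta\{(\mu + h) n_\uparrow + (\mu - h) n_\downarrow + \gamma (c\, a^*_\uparrow a^*_\downarrow + \bar{c}\, a_\downarrow a_\uparrow) - 2\lambda n_\uparrow n_\downarrow\}}\big). \tag{2.3}
\]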

To be useful, the variational problem in (2.2) should also be an upper bound of
$p(\beta, \mu, \lambda, \gamma, h)$. By adapting the proof of Størmer's theorem [1] to even states on the
CAR algebra and by using the Petz–Raggio–Verbeure proof for spin systems [19] as
a guideline, we prove this in Sec. 6.1. Thus, the thermodynamic limit of the pressure
of the model $H_N$ exists and can explicitly be computed by using the approximating
Hamiltonian $H_N(c)$:

l Here $a_{0,\uparrow}$, $a_{0,\downarrow}$ and $n_{0,\uparrow}$, $n_{0,\downarrow}$ are replaced, respectively, by $a_\uparrow$, $a_\downarrow$ and $n_\uparrow$, $n_\downarrow$.

Theorem 2.1 (Grand-Canonical Pressure). For any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$,
the thermodynamic limit $p(\beta, \mu, \lambda, \gamma, h)$ of the grand-canonical pressure $p_N$ (1.4)
equals
\[
p(\beta, \mu, \lambda, \gamma, h) = \sup_{c \in \mathbb{C}} \{-\gamma |c|^2 + p(c)\} = \beta^{-1} \ln 2 + \mu + \sup_{r \geq 0} f(r) < \infty,
\]
where the real function $f(r) = f(\beta, \mu, \lambda, \gamma, h; r)$ is defined by
\[
f(r) := -\gamma r + \frac{1}{\beta} \ln\{\cosh(\beta h) + e^{-\lambda\beta} \cosh(\beta g_r)\},
\]
with $g_r := \{(\mu - \lambda)^2 + \gamma^2 r\}^{1/2}$.
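
The second equality in Theorem 2.1 can be checked numerically on a single site. The following is a minimal sketch, not taken from the paper: it assumes a Jordan–Wigner matrix representation of the two one-site fermion modes (the names `a_up`, `a_dn` and all helper functions are illustrative), builds $-\beta H_1(c)$ as in (2.3), evaluates the trace of its exponential by a truncated power series, and compares $-\gamma|c|^2 + p(c)$ with $\beta^{-1}\ln 2 + \mu + f(|c|^2)$. The parameter values are arbitrary test values.

```python
import cmath
import math

def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def lincomb(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def trace_exp(M, order=80):
    """Trace of exp(M) from the truncated series sum_k M^k / k!."""
    n = len(M)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = float(n)
    for k in range(1, order + 1):
        term = [[x / k for x in row] for row in matmul(term, M)]
        total += sum(term[i][i] for i in range(n))
    return complex(total).real

# Assumed representation: one-mode annihilation operator in the basis (|0>, |1>),
# extended to two modes with a Jordan-Wigner sign factor.
A2 = [[0, 1], [0, 0]]
I2 = [[1, 0], [0, 1]]
Z2 = [[1, 0], [0, -1]]
a_up = kron(A2, I2)
a_dn = kron(Z2, A2)
n_up = matmul(dagger(a_up), a_up)
n_dn = matmul(dagger(a_dn), a_dn)
pair_create = matmul(dagger(a_up), dagger(a_dn))   # a_up^* a_dn^*
pair_annihilate = matmul(a_dn, a_up)               # a_dn a_up

def pressure_one_site(beta, mu, lam, gamma, h, c):
    """p(c) = (1/beta) ln Trace exp(-beta H_1(c)), computed from the 4x4 matrix."""
    minus_beta_h1 = lincomb([
        (beta * (mu + h), n_up),
        (beta * (mu - h), n_dn),
        (-2.0 * beta * lam, matmul(n_up, n_dn)),
        (beta * gamma * c, pair_create),
        (beta * gamma * c.conjugate(), pair_annihilate),
    ])
    return math.log(trace_exp(minus_beta_h1)) / beta

def f(r, beta, mu, lam, gamma, h):
    """The function f(r) of Theorem 2.1."""
    g_r = math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)
    return -gamma * r + math.log(
        math.cosh(beta * h) + math.exp(-lam * beta) * math.cosh(beta * g_r)) / beta

beta, mu, lam, gamma, h = 1.2, 1.0, 0.575, 2.6, 0.3
c = 0.3 * cmath.exp(0.7j)                          # arbitrary complex order parameter
lhs = -gamma * abs(c) ** 2 + pressure_one_site(beta, mu, lam, gamma, h, c)
rhs = math.log(2.0) / beta + mu + f(abs(c) ** 2, beta, mu, lam, gamma, h)
print(abs(lhs - rhs))                              # difference of the two expressions
```

The agreement does not depend on the phase of `c`, which reflects the gauge invariance of the map $c \mapsto p(c)$.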

Remark 2.1. The fact that the pressure $p_N$ coincides as $N \to \infty$ with the variational
problem given by the so-called approximating Hamiltonian (here $H_N(c)$)
was previously proven via completely different methods in [34] for a large class of
Hamiltonians (including $H_N$) with BCS-type interaction. However, as explained in
the introduction, our proof gives deeper results, not expressed in Theorem 2.1, on
the level of states, cf. (1.5) and (6.33). In contrast to the approximating Hamiltonian
method [34–37], it leads to a natural notion of equilibrium and ground states and
allows the direct analysis of correlation functions. For more details, we recommend
Sec. 6, particularly Sec. 6.2.
From the gauge invariance of the map $c \mapsto p(c)$ observe that any maximizer
$c_\beta \in \mathbb{C}$ of the first variational problem given in Theorem 2.1 has the form $r_\beta^{1/2} e^{i\phi}$
with $r_\beta \geq 0$ being solution of
\[
\sup_{r \geq 0} f(r) = f(r_\beta) \tag{2.4}
\]
and $\phi \in [0, 2\pi)$. For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$, it is also clear that the
order parameter $r_\beta$ is always bounded since $f(r)$ diverges to $-\infty$ when $r \to \infty$. Up
to (special) points $(\beta, \mu, \lambda, \gamma, h)$ corresponding to a phase transition of first order,
it is always unique and continuous with respect to each parameter (see Sec. 7).
For low inverse temperatures $\beta$ (high temperature regime) $r_\beta = 0$. Indeed,
straightforward computations at low enough $\beta$ show that the function $f(r)$ is concave
as a function of $r \geq 0$ whereas $\partial_r f(0) < 0$, see Sec. 7. On the other hand, any
non-zero solution $r_\beta$ of the variational problem (2.4) has to be solution of the gap
equation (or Euler–Lagrange equation)
\[
\tanh(\beta g_{r_\beta}) = \frac{2 g_{r_\beta}}{\gamma} \Big(1 + \frac{e^{\lambda\beta} \cosh(\beta h)}{\cosh(\beta g_{r_\beta})}\Big). \tag{2.5}
\]
If $g_{r_\beta} = 0$, observe that one uses in (2.5) the asymptotics $x^{-1} \tanh x \sim 1$ as $x \to 0$,
see also (7.2). Because $\tanh(x) \leq 1$ for $x \geq 0$, we then conclude that
\[
0 \leq r_\beta \leq \max\{0, r_{\max}\}, \quad \text{with } r_{\max} := \frac{1}{4} - \gamma^{-2} (\mu - \lambda)^2. \tag{2.6}
\]
In particular, if $\gamma \leq 2|\lambda - \mu|$, then $r_\beta = 0$ for any $\beta > 0$. However, at large
enough $\beta > 0$ (low temperature regime) and at fixed $\lambda, h, \mu \in \mathbb{R}$, there is a unique
$\gamma_c > 2|\lambda - \mu|$ such that $r_\beta > 0$ for any $\gamma \geq \gamma_c$. In other words, the domain
of parameters $(\beta, \mu, \lambda, \gamma, h)$ where $r_\beta$ is strictly positive is non-empty, see Figs. 1
and 2 and Sec. 7. Observe in Fig. 2 that a positive $\lambda$, i.e. a one-site repulsion,
can significantly increase (right figure) the critical temperature $\theta_c = \theta_c(\mu, \lambda, \gamma, h)$,
which is defined such that $r_\beta > 0$ if and only if $\beta > \theta_c^{-1}$.
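
The variational problem (2.4), the gap equation (2.5) and the bound (2.6) lend themselves to a quick numerical illustration. The sketch below is an assumption-laden example rather than the method of Sec. 7: it searches for the maximizer of $f$ on $[0, 1/4]$ with a coarse grid followed by a golden-section refinement (function names and grid sizes are illustrative), and then evaluates the residual of (2.5). The first parameter set uses the values $\mu = 1$, $\lambda = 0.575$, $\gamma = 2.6$, $h = 0$ quoted for Fig. 3; $\beta = 10$ is an arbitrary choice.

```python
import math

def g(r, mu, lam, gamma):
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)

def f(r, beta, mu, lam, gamma, h):
    """f(r) of Theorem 2.1, with the logarithm evaluated in log-sum-exp form."""
    g_r = g(r, mu, lam, gamma)
    exps = (beta * h, -beta * h, beta * (g_r - lam), -beta * (g_r + lam))
    m = max(exps)
    log_term = m + math.log(sum(math.exp(e - m) for e in exps)) - math.log(2.0)
    return -gamma * r + log_term / beta

def maximizer(beta, mu, lam, gamma, h, n_grid=2000, iters=100):
    """Approximate maximizer of f on [0, 1/4]: coarse grid, then golden-section search."""
    def F(r):
        return f(r, beta, mu, lam, gamma, h)
    step = 0.25 / n_grid
    best = max(range(n_grid + 1), key=lambda i: F(i * step))
    a, b = max(0.0, (best - 1) * step), min(0.25, (best + 1) * step)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = b - phi * (b - a), a + phi * (b - a)
    f1, f2 = F(x1), F(x2)
    for _ in range(iters):
        if f1 < f2:                      # maximum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + phi * (b - a)
            f2 = F(x2)
        else:                            # maximum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - phi * (b - a)
            f1 = F(x1)
    return 0.5 * (a + b)

def gap_residual(r, beta, mu, lam, gamma, h):
    """Left-hand side minus right-hand side of the gap equation (2.5)."""
    g_r = g(r, mu, lam, gamma)
    rhs = (2.0 * g_r / gamma) * (
        1.0 + math.exp(lam * beta) * math.cosh(beta * h) / math.cosh(beta * g_r))
    return math.tanh(beta * g_r) - rhs

beta, mu, lam, gamma, h = 10.0, 1.0, 0.575, 2.6, 0.0
r_beta = maximizer(beta, mu, lam, gamma, h)
r_max = 0.25 - ((mu - lam) / gamma) ** 2
print(r_beta, r_max, gap_residual(r_beta, beta, mu, lam, gamma, h))

# With gamma <= 2|lambda - mu| the maximizer should sit at r = 0.
print(maximizer(10.0, 1.0, 0.2, 1.5, 0.0))
```

The grid step limits how reliably two competing local maxima are told apart near a first order transition, so the sketch is only meant for parameters away from critical points.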
Fig. 1. Illustration, as a function of $\mu$, of the critical temperature $\theta_c = \theta_c(\mu, \lambda, \gamma, h)$ such that
$r_\beta > 0$ if and only if $\beta > \theta_c^{-1}$ (blue area) for $\gamma = 2.6$, $h = 0$ and with $\lambda = -0.575$ (left figure),
$0$ (figure at the center) and $0.575$ (right figure). The blue line corresponds to a second order
phase transition, whereas the red dashed line represents the domain of $\mu$ with a first order phase
transition. The black dashed line is the chemical potential $\mu = \lambda$ corresponding to an electron
density per site equal to 1, see Sec. 3. (Color online.)

Fig. 2. Illustration, as a function of $\lambda$, of the critical temperature $\theta_c = \theta_c(\mu, \lambda, \gamma, h)$ for $\gamma = 2.6$,
$h = 0$ and with $\mu = 0.5$ (left figure), $\mu = 1$ (figure at the center) and $\mu = 1.25$ (right figure). The
blue line corresponds to a second order phase transition, whereas the red dashed line represents
the domain of $\lambda$ with first order phase transition. The black dashed line is the coupling constant
$\lambda = \mu$ corresponding to an electron density per site equal to 1, see Sec. 3. (Color online.)

From Lemma 7.1, the set of maximizers of the variational problem (2.4) has
at most two elements in $[0, 1/4]$. It follows by continuity of $(\beta, \mu, \lambda, \gamma, h, r) \mapsto
f(\beta, \mu, \lambda, \gamma, h; r)$, and from the fact that the interval $[0, 1/4]$ is compact, that the set
\[
\mathcal{S} := \{(\beta, \mu, \lambda, \gamma, h) : \beta, \gamma > 0 \text{ and } r_\beta > 0 \text{ is the unique maximizer of (2.4)}\} \tag{2.7}
\]
is open. In Sec. 3.1, we prove that the set $\mathcal{S}$ corresponds to the superconducting
phase since the order parameter solution of (2.4) can be interpreted as the Cooper
pair condensate density. The boundary $\partial\mathcal{S}$ of the set $\mathcal{S}$ is called the set of critical
points of our model. By definition, if (2.4) has more than one maximizer, then
$(\beta, \mu, \lambda, \gamma, h) \in \partial\mathcal{S}$, whereas if $(\beta, \mu, \lambda, \gamma, h) \notin \overline{\mathcal{S}}$, then $r_\beta = 0$ is the unique maximizer
of (2.4).
For more details on the study of the variational problem (2.4), we recommend
Sec. 7.
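
The critical temperature $\theta_c$, defined by $r_\beta > 0$ if and only if $\beta > \theta_c^{-1}$, can be estimated by bisection on $\beta$. The sketch below rests on assumptions that are not results of the paper: it decides whether $r_\beta > 0$ from a grid maximizer of $f$ on $[0, 1/4]$, it takes for granted that the set of $\beta$ with $r_\beta > 0$ is a half-line (supported in the text by numerical evidence only), and the bracket $[0.01, 200]$ for $\beta$ is an arbitrary choice. The names `condensed` and `critical_temperature` are illustrative.

```python
import math

def f(r, beta, mu, lam, gamma, h):
    """f(r) of Theorem 2.1, with the logarithm evaluated in log-sum-exp form."""
    g_r = math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)
    exps = (beta * h, -beta * h, beta * (g_r - lam), -beta * (g_r + lam))
    m = max(exps)
    log_term = m + math.log(sum(math.exp(e - m) for e in exps)) - math.log(2.0)
    return -gamma * r + log_term / beta

def condensed(beta, mu, lam, gamma, h, n_grid=2000):
    """True if the grid maximizer of f on [0, 1/4] is strictly positive."""
    step = 0.25 / n_grid
    best = max(range(n_grid + 1),
               key=lambda i: f(i * step, beta, mu, lam, gamma, h))
    return best > 0

def critical_temperature(mu, lam, gamma, h, beta_lo=0.01, beta_hi=200.0, iters=40):
    """Bisection estimate of theta_c; returns None if no transition is bracketed."""
    if condensed(beta_lo, mu, lam, gamma, h) or not condensed(beta_hi, mu, lam, gamma, h):
        return None
    for _ in range(iters):
        mid = 0.5 * (beta_lo + beta_hi)
        if condensed(mid, mu, lam, gamma, h):
            beta_hi = mid        # condensate present: threshold is at or below mid
        else:
            beta_lo = mid        # no condensate: threshold is above mid
    return 1.0 / beta_hi

# Parameters of the right panel of Fig. 1 at mu = 1 (gamma = 2.6, h = 0, lambda = 0.575).
theta_c = critical_temperature(1.0, 0.575, 2.6, 0.0)
print(theta_c)
```

Near a second order transition the grid resolution slightly biases the estimate, since a condensate density smaller than the grid step is not detected.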

3. Phase Diagram at Fixed Chemical Potential


By using our main theorem, i.e. Theorem 2.1, we can now explain the thermodynamic
behavior of the strong coupling BCS-Hubbard model $H_N$. The rigorous
proofs are however given in Sec. 6.2. Actually, we concentrate here on the physics
of the model extracted from the (finite volume) grand-canonical Gibbs state $\omega_N$
(1.6) associated with $H_N$. We start by showing the existence of a superconducting
phase transition in the thermodynamic limit.

3.1. Existence of a s-wave superconducting phase transition


The solution $r_\beta$ of (2.4) can be interpreted as an order parameter related to the
Cooper pair condensate density $\omega_N(c_0^* c_0)/N$, where
\[
c_0 := \frac{1}{\sqrt{N}} \sum_{x \in \Lambda_N} a_{x,\downarrow} a_{x,\uparrow} = \frac{1}{\sqrt{N}} \sum_{k \in \Lambda_N^*} \tilde a_{k,\downarrow} \tilde a_{-k,\uparrow}
\]
(respectively $c_0^*$) annihilates (respectively creates) one Cooper pair within the condensate,
i.e. in the zero-mode for electron pairs. Indeed, in Sec. 6.2 (see Theorem 6.3)
we prove, by using a notion of equilibrium states, the following.
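Theorem 3.1 (Cooper Pair Condensate Density). For any $\beta, \gamma > 0$ and
real numbers $\mu, \lambda, h$ away from any critical point, the (infinite volume) Cooper pair
condensate density equals
\[
\lim_{N \to \infty} \Big\{ \frac{1}{N} \omega_N(c_0^* c_0) \Big\}
= \lim_{N \to \infty} \Big\{ \frac{1}{N^2} \sum_{x,y \in \Lambda_N} \omega_N\big(a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow}\big) \Big\}
= r_\beta \leq \max\{0, r_{\max}\},
\]
with $r_{\max} \leq 1/4$ defined in (2.6). The (uniquely defined) order parameter $r_\beta =
r_\beta(\mu, \lambda, \gamma, h)$ is an increasing function of $\gamma > 0$.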

Remark 3.1. In fact, Theorem 3.1 fails only if the order parameter
$r_\beta$ is discontinuous with respect to $\gamma > 0$ at fixed $(\beta, \mu, \lambda, h)$. In this case, the
thermodynamic limit of the Cooper pair condensate density is bounded by the left
and right limits of the corresponding (infinite volume) density, see the Appendix,
in particular (A.1). Similar remarks can be made for Theorems 3.4–3.7.
At least for large enough $\beta$ and $\gamma$, we have explained that $r_\beta > 0$, see Figs. 1
and 2. Illustrations of the Cooper pair condensate density $r_\beta$ as a function of $\beta$
and $\lambda$ are given in Fig. 3. In other words, a superconducting phase transition can
appear in our model. Its order depends on parameters: it can be a first order or a
second order superconducting phase transition, cf. Fig. 3 and Sec. 7 for more details.
From numerical investigations, note that $r_\beta$ was always found to be an increasing
function of $\beta > 0$. Unfortunately we are able to prove only a part of this fact in
Sec. 7. Therefore, a superconducting phase appearing only in a range of non-zero
temperatures as for magnetic superconductors cannot rigorously be excluded. But
we conjecture that our model can never show this phenomenon, i.e. $r_\beta$ should always
be an increasing function of $\beta > 0$.
Fig. 3. In the figure on the left, we have three illustrations of the Cooper pair condensate density
$r_\beta$ as a function of the inverse temperature $\beta$ for $\lambda = 0$ (blue line), $\lambda = 0.45$ (red line) and
$\lambda = 0.575$ (green line). The figure on the right represents a 3D illustration of $r_\beta$ as a function of
$\lambda$ and $\beta$. The color from red to blue reflects the decrease of the temperature. In all figures, $\mu = 1$,
$\gamma = 2.6$ and $h = 0$. (Color online.)

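Observe that a non-trivial solution $r_\beta \neq 0$ is a manifestation of the breakdown
of the $U(1)$-gauge symmetry. To see this phenomenon, we need to perturb the
Hamiltonian $H_N$ with the external field
\[
\alpha \sqrt{N} \big(e^{-i\phi} c_0 + e^{i\phi} c_0^*\big) \quad \text{for any } \alpha \geq 0 \text{ and } \phi \in [0, 2\pi).
\]
This leads to the perturbed Gibbs state $\omega_{N,\alpha,\phi}(\cdot)$ defined by (1.6) with $H_N$
replaced by
\[
H_{N,\alpha,\phi} := H_N - \alpha \sum_{x \in \Lambda_N} \big(e^{-i\phi} a_{x,\downarrow} a_{x,\uparrow} + e^{i\phi} a^*_{x,\uparrow} a^*_{x,\downarrow}\big), \tag{3.1}
\]
see (6.42). We then obtain the following result for the so-called Bogoliubov quasi-averages
(cf. Theorem 6.2).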

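Theorem 3.2 (Breakdown of the $U(1)$-Gauge Symmetry). For any $\beta, \gamma > 0$
and real numbers $\mu, \lambda, h$ away from any critical point, and for any $\phi \in [0, 2\pi)$, one
gets for the Bogoliubov quasi-average below:
\[
\lim_{\alpha \downarrow 0} \lim_{N \to \infty} \omega_{N,\alpha,\phi}\big(c_0 / \sqrt{N}\big)
= \lim_{\alpha \downarrow 0} \lim_{N \to \infty} \frac{1}{N} \sum_{x \in \Lambda_N} \omega_{N,\alpha,\phi}(a_{x,\downarrow} a_{x,\uparrow}) = r_\beta^{1/2} e^{i\phi},
\]
with $r_\beta \geq 0$ being the unique solution of (2.4), see Theorem 2.1.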

Note that the breakdown of the $U(1)$-gauge symmetry should be seen in experiments via the so-called off diagonal long range order (ODLRO) property of the correlation functions [38], see Sec. 6.2. In fact, because of the permutation invariance, Theorem 3.1 still holds if we remove the space average, i.e. for any lattice sites $x$ and $y \neq x$,

$$\lim_{N \to \infty} \omega_N(a^*_{y,\uparrow} a^*_{y,\downarrow} a_{x,\downarrow} a_{x,\uparrow}) = r_\beta,$$

see Theorem 6.3. Similar remarks can be made for Theorems 3.4–3.7.
Observe also that the type of superconductivity described here is the s-wave superconductivity, which is defined via the two-point correlation function.

Theorem 3.3 (s-Wave Superconductivity). For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point, and for any $\phi \in [0, 2\pi)$, $x, y \in \mathbb{Z}^d$ and $s_1, s_2 \in \{\uparrow, \downarrow\}$, the two-point correlation function defined from the Bogoliubov quasi-averages equals

$$\lim_{\alpha \downarrow 0} \lim_{N \to \infty} \omega_{N,\alpha,\phi}(a_{x,s_1} a_{y,s_2}) = r_\beta^{1/2} e^{i\phi}\, \delta_{x,y} (1 - \delta_{s_1,s_2}),$$

with $r_\beta \geq 0$ being the unique solution of (2.4), see Theorem 2.1. Here $\delta_{x,y} = 1$ if and only if $x = y$.

In other words, for $x, y \in \mathbb{Z}^d$ and $s_1, s_2 \in \{\uparrow, \downarrow\}$ the two-point correlation function inside the superconducting phase is non-zero if and only if $x = y$ and $s_1 \neq s_2$. More generally, for any infinite volume equilibrium state $\omega$, we have $\omega(a_{x,s_1} a_{y,s_2}) = \omega(a_{0,s_1} a_{0,s_2}) \delta_{x,y}$, see Sec. 6.

We now conclude this analysis by giving the zero-temperature limit $\beta \to \infty$ of the Cooper pair condensate density $r_\beta$ proven in Sec. 7.
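Corollary 3.1 (Cooper Pair Condensate Density at Zero-Temperature). The Cooper pair condensate density $r_\beta = r_\beta(\mu, \lambda, \gamma, h)$ is equal at zero-temperature to

$$r_\infty := \lim_{\beta \to \infty} r_\beta = \begin{cases} r_{\max} & \text{for any } \gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}, \\ 0 & \text{for any } \gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}, \end{cases}$$

with $r_{\max} \leq 1/4$ (cf. (2.6) and Fig. 4) and

$$\Gamma_{x,y} := 2(y + \{y^2 - x^2\}^{1/2})\, \chi_{[0,y)}(x)\, \chi_{(0,\infty)}(y) + 2x\, \chi_{[y,\infty)}(x) \geq 0$$

being defined for any $x \in \mathbb{R}_+$ and $y \in \mathbb{R}$. Here $\chi_K$ is the characteristic function of the set $K \subset \mathbb{R}$.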
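As an illustration only, the threshold function $\Gamma_{x,y}$ and the zero-temperature dichotomy of Corollary 3.1 can be sketched in a few lines of Python. The function names are hypothetical, and the expression used for $r_{\max}$, namely $1/4 - \gamma^{-2}(\mu-\lambda)^2$, is an assumption here: its definition (2.6) is not restated in this section, although this form agrees with the value $1/4$ reached at $\mu = \lambda$.

```python
import math


def gamma_threshold(x: float, y: float) -> float:
    """Gamma_{x,y} as stated in Corollary 3.1, for x >= 0 and real y."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if y > 0 and x < y:
        # x lies in [0, y) with y > 0: only the first term is non-zero.
        return 2.0 * (y + math.sqrt(y * y - x * x))
    # Otherwise x >= y and only the term 2x survives.
    return 2.0 * x


def r_zero_temperature(mu: float, lam: float, gamma: float, h: float):
    """Zero-temperature condensate density, or None at the critical coupling."""
    x = abs(mu - lam)
    threshold = gamma_threshold(x, lam + abs(h))
    if gamma > threshold:
        # Assumed form of r_max; (2.6) is not restated in this section.
        return 0.25 - (x / gamma) ** 2
    if gamma < threshold:
        return 0.0
    return None  # gamma equals the threshold: critical point, excluded
```

Returning `None` at $\gamma = \Gamma_{|\mu-\lambda|,\lambda+|h|}$ mirrors the fact that the corollary makes no statement at the critical coupling.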

Remark 3.2. If $\gamma = \Gamma_{|\mu-\lambda|,\lambda+|h|}$, straightforward estimations show that the order parameter $r_\beta$ converges to $r_\infty = 0$, see Sec. 7. This special case is a critical point at sufficiently large $\beta$. We exclude it in our discussion since all thermodynamic limits of densities in Sec. 3 are performed away from any critical point, see, for instance, Theorem 3.1.
The result of Corollary 3.1 is in accordance with Theorem 3.1 in the sense that the order parameter $r_\beta$ is an increasing function of $\gamma \geq 0$. Observe also that

$$\sup_{\mu \in \mathbb{R}} \{ r_\infty(\mu, \lambda, \gamma, h) \} = r_\infty(\lambda, \lambda, \gamma, h) = \frac{1}{4}$$

for any fixed $\gamma > \Gamma_{0,\lambda+|h|}$, whereas for any real numbers $\mu, \lambda, h$,

$$\lim_{\gamma \to \infty} r_\infty(\mu, \lambda, \gamma, h) = \frac{1}{4}.$$

In other words, the superconducting phase for $\mu = \lambda$ is as perfect as for $\gamma = \infty$. In particular, in order to optimize the Cooper pair condensate density, if $\mu > 0$, then it is necessary to increase the one-site repulsion by tuning $\lambda$ to $\mu$. Consequently, the direct repulsion between electrons can favor the superconductivity at fixed $\mu$. This phenomenon is confirmed by the following analysis.

Fig. 4. In the figure on the left, the blue area represents the domain of $(\lambda, \gamma)$ with $1 \leq \gamma \leq 6$, where the (zero-temperature) Cooper pair condensate density $r_\infty$ is non-zero at $\mu = 1$ and $h = 0$. The figure on the right represents a 3D illustration of $r_\infty$ when $1 \leq \gamma \leq 6$ and $-2.5 \leq \lambda \leq 2.5$ with again $\mu = 1$, $h = 0$. (Color online.)
First observe that Eq. (2.5) has no solution if $\gamma \leq 2|\mu|$ and $\lambda = 0$. In other words, the strong coupling BCS theory has no phase transition as soon as $\gamma \leq 2|\mu|$ and $\lambda = 0$. However, even if $\gamma \leq 2|\mu|$, there is a range of $\lambda$ where a superconducting phase takes place. For instance, take $\mu > 0$ and note that $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ when

0 < < + ( + |h|). (3.2)
2 2
This last inequality can always be satisfied for some $\lambda > 0$, if $\mu + |h| < \gamma \leq 2\mu$. Therefore, although there is no superconductivity for $\gamma \leq 2|\mu|$ and $\lambda = 0$, there is a range of positive $\lambda \geq 0$ defined by (3.2) for $\mu + |h| < \gamma \leq 2\mu$, where the superconductivity appears at low enough temperature, see Corollary 3.1 and Fig. 4. In the region $\gamma > 2\mu > 0$ where the superconducting phase can occur for $\lambda = 0$, observe also that the critical temperature $\theta_c$ for $\lambda > 0$ can sometimes be larger than the one for $\lambda = 0$, cf. Fig. 2.

Remark 3.3. The effect of a one-site repulsion on the superconducting phase transition may be surprising since one would naively guess that any repulsion between pairs of electrons should destroy the formation of Cooper pairs. In fact, the one-site and BCS interactions in (1.2) are not diagonal in the same basis, i.e. they do not commute. In particular, the Hubbard interaction cannot be directly interpreted as a repulsion between Cooper pairs. This interpretation is only valid for large $\lambda \geq 0$. Indeed, at fixed $\mu$ and $\gamma > 0$, if $\lambda$ is large enough, there is no superconducting phase.

3.2. Electron density per site and electron-hole symmetry


We give next the grand-canonical density of electrons per site in the system (cf.
Theorem 6.4).
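Theorem 3.4 (Electron Density per Site). For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point, the (infinite volume) electron density equals

$$\lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} + n_{x,\downarrow}) \Big\} = d_\beta := 1 + \frac{(\mu - \lambda) \sinh(\beta g_{r_\beta})}{g_{r_\beta} (e^{\beta\lambda} \cosh(\beta h) + \cosh(\beta g_{r_\beta}))},$$

with $d_\beta = d_\beta(\mu, \lambda, \gamma, h) \in [0, 2]$, $r_\beta \geq 0$ being the unique solution of (2.4) and $g_r := \{(\mu - \lambda)^2 + \gamma^2 r\}^{1/2}$, see Theorem 2.1 and Fig. 5.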
At low enough temperature and for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$, Corollary 3.1 tells us that a superconducting phase appears, i.e. $r_\beta > 0$. In this case, it is important to note that the electron density becomes independent of the temperature. Indeed, by combining Theorem 3.4 with (2.5) one gets that

$$d_\beta = 1 + 2\gamma^{-1}(\mu - \lambda) \tag{3.3}$$

is linear as a function of $\mu$ in the domain of $(\beta, \mu, \lambda, \gamma, h)$ where $r_\beta > 0$, i.e. in the presence of superconductivity, see Fig. 5.

We give next the electron density per site in the zero-temperature limit $\beta \to \infty$, which straightforwardly follows from Theorem 3.4 combined with Corollary 3.1.
Corollary 3.2 (Electron Density per Site at Zero-Temperature). The (infinite volume) electron density $d_\beta = d_\beta(\mu, \lambda, \gamma, h) \in [0, 2]$ at zero-temperature is equal to

$$d_\infty := \lim_{\beta \to \infty} d_\beta = 1 + \frac{\mathrm{sgn}(\mu - \lambda)}{1 + \delta_{|\mu-\lambda|,\lambda+|h|}(1 + \delta_{h,0})}\, \chi_{[\lambda+|h|,\infty)}(|\mu - \lambda|)$$

for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$, whereas within the superconducting phase, i.e. for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ (Corollary 3.1), $d_\infty = 1 + 2\gamma^{-1}(\mu - \lambda)$. Recall that $\mathrm{sgn}(0) := 0$.

Fig. 5. In the figure on the left, we give illustrations of the electron density $d_\beta$ as a function of the chemical potential $\mu$ for $\beta < \beta_c$ (red line) and $\beta > \beta_c$ (blue line) at coupling constant $\lambda = 0$ (figure on the left, $\beta = 1.4, 2.45$) and $\lambda = 0.575$ (figure on the center, $\beta = 4, 6.45$). In the figure on the right, $d_\beta$ is given as a function of $\beta$ at $\mu = 0.3$ with $\lambda > \mu$ equal to 0.35 (orange line, second order phase transition), 0.575 (blue line, first order phase transition) and 1.575 (green line, no phase transition). In all figures, $\gamma = 2.6$, $h = 0$ and $\beta_c = \theta_c^{-1}$ is the critical inverse temperature. (Color online.)

To conclude, observe that $(2 - d_\beta)$ is the density of holes in the system. So, if $\mu > \lambda$, then $d_\beta \in (1, 2]$, i.e. there are more electrons than holes in the system, whereas $d_\beta \in [0, 1)$ for $\mu < \lambda$, i.e. there are more holes than electrons. This phenomenon can directly be seen in the Hamiltonian $H_N$, where there is a symmetry between electrons and holes as in the Hubbard model. Indeed, by replacing the creation
operators ax, and ax, of electrons by the annihilation operators bx, and bx,
of holes, we can map the Hamiltonian HN (1.2) for electrons to another strong
coupling BCS-Hubbard model for holes dened via the Hamiltonian
 
H N := hole ( x, ) hhole
nx, + n nx, n
( x, )
xN xN
 
+ 2 n x,
x, n by, by, bx, bx, + 2( )N ,
N
xN x,yN

with

x, := bx, bx, ,
n x, := bx, bx, ,
n hhole := h and hole := 2 N 1 .

Therefore, if one knows the thermodynamic behavior of $H_N$ for any $h \in \mathbb{R}$ and $\mu \geq \lambda$ (regime with more electrons than holes), we directly get the thermodynamic properties for $\mu < \lambda$ (regime with more holes than electrons), which correspond to the one given by the hole Hamiltonian above with $h_{\mathrm{hole}} = h$ and a chemical potential for holes $\mu_{\mathrm{hole}} > \lambda$ at large enough $N$. Note that the last constant term in the hole Hamiltonian shifts the grand-canonical pressure by a constant, but also the (infinite volume) mean-energy per site $\epsilon_\beta$ (Sec. 3.6).

3.3. Superconductivity versus magnetization: Meißner effect


It is well known that for magnetic fields $h$ with $|h|$ below some critical value $h^{(c)}$, type I superconductors become perfectly diamagnetic in the sense that the magnetic induction in the bulk is zero. Magnetic fields with strength above $h^{(c)}$ destroy the superconducting phase completely. This property is the celebrated Meißner or Meißner–Ochsenfeld effect. For small fields $h$ (i.e. $|h| < h^{(c)}$) the magnetic field in the bulk of the superconductor is (almost) cancelled by the presence of steady surface currents. As we do not analyze transport here, we only give the magnetization density explicitly as a function of the external magnetic field $h$ for the strong coupling BCS-Hubbard model. Note that type II superconductors cannot be covered in the strong coupling regime since the vortices appearing in presence of magnetic fields come from the magnetic kinetic energy.

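Theorem 3.5 (Magnetization Density). For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point, the (infinite volume) magnetization density equals

$$\lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} - n_{x,\downarrow}) \Big\} = m_\beta := \frac{\sinh(\beta h)\, e^{\beta\lambda}}{e^{\beta\lambda} \cosh(\beta h) + \cosh(\beta g_{r_\beta})},$$

with $m_\beta = m_\beta(\mu, \lambda, \gamma, h) \in [-1, 1]$, $r_\beta \geq 0$ being the unique solution of (2.4) and $g_r := \{(\mu - \lambda)^2 + \gamma^2 r\}^{1/2}$, see Theorem 2.1 and Fig. 6.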

This theorem, deduced from Theorem 6.4, does not seem to show any Meißner effect since $m_\beta \neq 0$ as soon as $h \neq 0$. However, when the Cooper pair condensate density $r_\beta$ is strictly positive, from Theorem 3.5 combined with (2.5) note that

$$m_\beta = \frac{2 g_{r_\beta}\, e^{\beta\lambda} \sinh(\beta h)}{\gamma \sinh(\beta g_{r_\beta})}. \tag{3.4}$$

In particular, it decays exponentially as $\beta \to \infty$ when $r_\beta \to r_\infty > 0$, see Fig. 6. We give therefore the zero-temperature limit $\beta \to \infty$ of $m_\beta$ in the next corollary.
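The passage from Theorems 3.4 and 3.5 to the closed forms (3.3) and (3.4) can be checked numerically. The sketch below is only an illustration under a stated assumption: the gap equation (2.5), which is not restated in this section, is taken in the form $\tanh(\beta g) = 2 g \gamma^{-1} (1 + e^{\beta\lambda} \cosh(\beta h)/\cosh(\beta g))$. Instead of solving it for $g$, the code picks a value of $g$ and computes the coupling $\gamma$ for which that $g$ is a solution; all function names and numerical values are hypothetical.

```python
import math


def gamma_from_gap(beta, lam, h, g):
    """Coupling gamma for which g solves the assumed form of the gap equation."""
    ratio = math.exp(beta * lam) * math.cosh(beta * h) / math.cosh(beta * g)
    return 2.0 * g * (1.0 + ratio) / math.tanh(beta * g)


def electron_density(beta, mu, lam, h, g):
    """d_beta as in Theorem 3.4, written in terms of g."""
    denom = g * (math.exp(beta * lam) * math.cosh(beta * h) + math.cosh(beta * g))
    return 1.0 + (mu - lam) * math.sinh(beta * g) / denom


def magnetization(beta, lam, h, g):
    """m_beta as in Theorem 3.5, written in terms of g."""
    weight = math.exp(beta * lam)
    return math.sinh(beta * h) * weight / (weight * math.cosh(beta * h) + math.cosh(beta * g))


# Illustrative values only; g > |mu - lam| corresponds to r > 0.
beta, mu, lam, h, g = 3.0, 1.0, 0.3, 0.1, 1.2
gamma = gamma_from_gap(beta, lam, h, g)

d_direct = electron_density(beta, mu, lam, h, g)
d_linear = 1.0 + 2.0 * (mu - lam) / gamma                    # right side of (3.3)

m_direct = magnetization(beta, lam, h, g)
m_closed = (2.0 * g * math.exp(beta * lam) * math.sinh(beta * h)
            / (gamma * math.sinh(beta * g)))                 # right side of (3.4)
```

Under the assumed gap equation, `d_direct` and `d_linear` agree up to rounding, and likewise `m_direct` and `m_closed`.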

Corollary 3.3 (Magnetization Density at Zero-Temperature). The (infinite volume) magnetization density $m_\beta = m_\beta(\mu, \lambda, \gamma, h) \in [-1, 1]$ at zero-temperature is equal to

$$m_\infty := \lim_{\beta \to \infty} m_\beta = \frac{\mathrm{sgn}(h)}{1 + \delta_{|\mu-\lambda|,\lambda+|h|}}\, \chi_{[0,\lambda+|h|]}(|\mu - \lambda|),$$

for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ (see Corollary 3.1), whereas for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ there is no magnetization at zero-temperature since $m_\beta$ decays exponentially$^{m}$ as $\beta \to \infty$ to $m_\infty = 0$.

Fig. 6. In the figure on the left, we have an illustration of the electron density $d_\beta$ (blue line), the Cooper pair condensate density $r_\beta$ (red line) and the magnetization density $m_\beta$ (green line) as functions of the magnetic field $h$ at $\beta = 7$, $\mu = 1$, $\lambda = 0.575$ and $\gamma = 2.6$. The figure on the right represents a 3D illustration of $m_\beta = m_\beta(1, 0.575, 2.6, h)$ as a function of $h$ and $\beta$. The color from red to blue reflects the decrease of the temperature. In both figures, we can see the Meißner effect (in the 3D illustration, the area with no magnetization corresponds to $r_\beta > 0$). (Color online.)
Consequently, there is no superconductivity, i.e. $r_\infty = 0$, when $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ and, as soon as $h \neq 0$ with $|\mu - \lambda| < \lambda + |h|$, there is a perfect magnetization at zero-temperature, i.e. $m_\infty = \mathrm{sgn}(h)$. Observe that the condition $|\mu - \lambda| > \lambda + |h|$ implies from Corollary 3.2 that either $d_\infty = 0$ or $d_\infty = 2$, which implies that $m_\infty$ must be zero.

On the other hand, if $\gamma > \Gamma_{|\mu-\lambda|,\lambda}$, we can define the critical magnetic field at zero-temperature by the unique positive solution

$$h^{(c)} := \frac{\gamma}{4} - \lambda + \gamma^{-1}(\mu - \lambda)^2 > 0 \tag{3.5}$$

of the equation $\Gamma_{|\mu-\lambda|,\lambda+y} = \gamma$ for $y \geq 0$. Then, by increasing $|h|$ up to $h^{(c)}$, the (zero-temperature) Cooper pair condensate density $r_\infty$ stays constant, whereas the (zero-temperature) magnetization density $m_\infty$ is zero, i.e. $r_\infty = r_{\max}$ and $m_\infty = 0$ for $|h| < h^{(c)}$, see Corollary 3.1. However, as soon as $|h| > h^{(c)}$, $r_\infty = 0$ and $m_\infty = \mathrm{sgn}(h)$, i.e. there is no Cooper pair and a pure magnetization takes place. In other words, the model manifests a pure Meißner effect at zero-temperature corresponding to a superconductor of type I, cf. Fig. 6.
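As a quick sanity check, one can verify numerically that the critical field (3.5) solves the equation $\Gamma_{|\mu-\lambda|,\lambda+y} = \gamma$. The following sketch restates $\Gamma_{x,y}$ from Corollary 3.1 and uses the parameter values of Fig. 6; the function names are hypothetical and the form of (3.5) used below is the one obtained by solving that equation for $y$.

```python
import math


def gamma_threshold(x: float, y: float) -> float:
    """Gamma_{x,y} as stated in Corollary 3.1, for x >= 0 and real y."""
    if y > 0 and x < y:
        return 2.0 * (y + math.sqrt(y * y - x * x))
    return 2.0 * x


def critical_field(mu: float, lam: float, gamma: float) -> float:
    """Zero-temperature critical field, formula (3.5)."""
    return gamma / 4.0 - lam + (mu - lam) ** 2 / gamma


# Parameters of Fig. 6: mu = 1, lambda = 0.575, gamma = 2.6.
mu, lam, gamma = 1.0, 0.575, 2.6
x = abs(mu - lam)
h_c = critical_field(mu, lam, gamma)

# Below h_c the coupling exceeds the threshold (superconducting side),
# above h_c it does not (magnetized side).
below = gamma > gamma_threshold(x, lam + 0.9 * h_c)
above = gamma < gamma_threshold(x, lam + 1.1 * h_c)
```

For these values `h_c` is roughly 0.144, which is the scale on which the transition in $h$ is expected to occur at zero temperature in this sketch.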
Finally, note that we give an energetic interpretation of the critical magnetic field $h^{(c)}$ after Corollary 3.5. Observe also that a measurement of $h^{(c)}$ (3.5) implies, for instance, a measurement of the chemical potential $\mu$ if one would know $\gamma$ and $\lambda$, which could be found via the asymptotic (3.15) of the specific heat, see discussions in Sec. 5.

3.4. Coulomb correlation density


The space distribution of electrons is still unknown and for such a consideration, we need the (infinite volume) Coulomb correlation density

$$\lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} n_{x,\downarrow}) \Big\}. \tag{3.6}$$

Together with the electron and magnetization densities $d_\beta$ and $m_\beta$, the knowledge of (3.6) allows us in particular to explain in detail the difference between superconducting and non-superconducting phases in terms of space distributions of electrons.
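Actually, by the Cauchy–Schwarz inequality for the states one gets that

$$\frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} n_{x,\downarrow}) \leq \sqrt{\frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow})}\, \sqrt{\frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\downarrow})}. \tag{3.7}$$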

m Actually, $m_\beta = O(e^{-(\gamma - 2(\lambda+|h|))\beta/2})$ for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|} \geq 2(\lambda + |h|)$.



From Theorems 3.4 and 3.5, the densities of electrons with spin up and down equal, respectively,

$$\lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow}) \Big\} = \frac{d_\beta + m_\beta}{2} \in [0, 1]$$

and

$$\lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\downarrow}) \Big\} = \frac{d_\beta - m_\beta}{2} \in [0, 1]$$

for any $\beta, \gamma > 0$ and $\mu, \lambda, h$ away from any critical point. Consequently, by using (3.7) in the thermodynamic limit, the (infinite volume) Coulomb correlation density is always bounded by

$$0 \leq \lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} n_{x,\downarrow}) \Big\} \leq w_{\max} := \frac{1}{2} \sqrt{d_\beta^2 - m_\beta^2}. \tag{3.8}$$

If, for instance, (3.6) equals zero, then as soon as an electron is on a definite site, the probability to have a second electron with opposite spin at the same place goes to zero as $N \to \infty$. In this case, there would be no formation of pairs of electrons on a single site. This phenomenon does not appear exactly at finite temperature due to thermal fluctuations. Indeed, we can explicitly compute the Coulomb correlation in the thermodynamic limit (cf. Theorem 6.4):

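Theorem 3.6 (Coulomb Correlation Density). For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point, the (infinite volume) Coulomb correlation density equals$^{n}$

$$\lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} n_{x,\downarrow}) \Big\} = w_\beta := \frac{1}{2} (d_\beta - m_\beta \coth(\beta h)),$$

with $w_\beta = w_\beta(\mu, \lambda, \gamma, h) \in (0, w_{\max})$, see Fig. 7. Here $d_\beta$ and $m_\beta$ are, respectively, defined in Theorems 3.4 and 3.5.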
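To make the bounds in (3.8) concrete, the formulas of Theorems 3.4 to 3.6 can be evaluated directly. The sketch below is illustrative only: the condensate density `r` is treated as a given input rather than obtained from the variational problem (2.4), and the function name and parameter values are hypothetical. The product $m_\beta \coth(\beta h)$ is rewritten as $e^{\beta\lambda} \cosh(\beta h)$ divided by the common denominator, which is algebraically the same and also covers the limit $h \to 0$.

```python
import math


def site_densities(beta, mu, lam, gamma, h, r):
    """Return (d, m, w, w_max) for a given trial value r >= 0."""
    g = math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)
    weight = math.exp(beta * lam)
    denom = weight * math.cosh(beta * h) + math.cosh(beta * g)
    # sinh(beta g) / g tends to beta as g -> 0.
    shape = beta if g == 0.0 else math.sinh(beta * g) / g
    d = 1.0 + (mu - lam) * shape / denom
    m = math.sinh(beta * h) * weight / denom
    # m * coth(beta h) equals weight * cosh(beta h) / denom, also at h = 0.
    w = 0.5 * (d - weight * math.cosh(beta * h) / denom)
    w_max = 0.5 * math.sqrt(max(d * d - m * m, 0.0))
    return d, m, w, w_max


# Illustrative parameter values, with a trial r = 0.1.
d, m, w, w_max = site_densities(beta=2.0, mu=1.0, lam=0.3, gamma=2.6, h=0.2, r=0.1)
```

For these values the Coulomb correlation density lies strictly between the two bounds of (3.8), in line with the statement $w_\beta \in (0, w_{\max})$.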

Consequently, because $g_{r_\beta} \geq |\mu - \lambda|$, for any inverse temperature $\beta > 0$ the Coulomb correlation density is never zero, i.e. $w_\beta > 0$, even if the electron density $d_\beta$ is exactly 1, i.e. if $\mu = \lambda$. Moreover, the upper bound in (3.8) is also never attained. However, for low temperatures, $w_\beta$ goes exponentially fast with respect to $\beta$ to one of the bounds in (3.8), cf. Fig. 7. Indeed, one has the following zero temperature limit:

Corollary 3.4 (Coulomb Correlation Density at Zero-Temperature). The (infinite volume) Coulomb correlation density $w_\beta = w_\beta(\mu, \lambda, \gamma, h) \in [0, 1]$ at zero-temperature is equal to

$$w_\infty := \lim_{\beta \to \infty} w_\beta = \frac{1 + \mathrm{sgn}(\mu - \lambda)}{2(1 + \delta_{|\mu-\lambda|,\lambda+|h|}(1 + \delta_{h,0}))}\, \chi_{[\lambda+|h|,\infty)}(|\mu - \lambda|)$$

for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ whereas $w_\infty = d_\infty/2$ for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$, see Corollaries 3.1 and 3.2.

n If $h = 0$, then $w_\beta(\mu, \lambda, \gamma, 0) := \lim_{h \to 0} w_\beta(\mu, \lambda, \gamma, h)$.

Fig. 7. Illustration of the Coulomb correlation density $w_\beta$ (red lines) and its corresponding upper bound $w_{\max}$ (blue lines) as a function of $\beta > 0$ at $\lambda = 0.2$, $\gamma = 2.6$, for $\mu = -1.305 < \lambda$ (left figure, $d_\beta < 1$), $\mu = 0.2 = \lambda$ (two right figures, $d_\beta = 1$), and from the left to the right, with $h = 0$ ($m_\beta = 0$), and $h = 0.3, 0.35$ (where $m_\beta > 0$). The dashed green lines indicate that $d_\beta/2 = 0.5$ in the three cases. In the figure on the left there is no superconducting phase in opposition to the right figures where we see a phase transition for $\beta > 2.3$ (second order) or 2.6 (first order). (Color online.)
If $|\mu - \lambda| > \lambda + |h|$, the interpretation of this asymptotics is clear since either $d_\infty = 0$ for $\mu < \lambda$ or $d_\infty = 2$ for $\mu > \lambda$. The interesting phenomena are when $|\mu - \lambda| < \lambda + |h|$. In this case, if there is no superconducting phase, i.e. $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$, then $w_\beta$ converges towards $w_\infty = 0$ as $\beta \to \infty$. In particular, as explained above, if an electron is on a definite site, the probability to have a second electron with opposite spin at the same place goes to zero as $N \to \infty$ and $\beta \to \infty$.

However, in the superconducting phase, i.e. for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$, the upper bound $w_{\max}$ (3.8) is asymptotically attained. Since $w_{\max} \to d_\infty/2$ as $\beta \to \infty$, it means that 100% of electrons form Cooper pairs in the limit of zero-temperature, which is in accordance with the fact that the magnetization density must disappear, i.e. $m_\infty = 0$, cf. Corollary 3.3. As explained in Sec. 3.1, the highest Cooper pair condensate density is 1/4, which corresponds to an electron density $d_\infty = 1$. Actually, although all electrons form Cooper pairs at small temperatures, there are never 100% of electron pairs in the condensate, see Fig. 8. In the special case where $d_\infty = 1$, only 50% of Cooper pairs are in the condensate.

The same analysis can be done for hole pairs by changing $a_x$ by $b_x$ in the definition of extensive quantities. Define the electron and hole pair condensate fractions respectively by $v_\beta := 2 r_\beta / d_\beta$ and $\tilde{v}_\beta := 2 \tilde{r}_\beta / \tilde{d}_\beta$, where $\tilde{r}_\beta$ and $\tilde{d}_\beta$ are the hole condensate density and the hole density respectively. Because of the electron-hole symmetry, $\tilde{r}_\beta = r_\beta$ and $\tilde{d}_\beta = 2 - d_\beta$. In particular, when $r_\beta > 0$, we asymptotically get that $v_\beta + \tilde{v}_\beta \to 1$ as $\beta \to \infty$. Hence, in the superconducting phase, an electron pair condensate fraction below 50% means in fact that there are more than 50% of hole pair condensate and conversely at low temperatures. For more details concerning ground states in relation with this phenomenon, see discussions around (6.60) in Sec. 6.2.
Fig. 8. The fraction of electron pairs in the condensate is given in right and left figures as a function of $\mu$. In the figure on the left, $\lambda = h = 0$, with inverse temperatures $\beta = 2.45$ (orange line), 3.45 (red line) and 30 (blue line). In the figure on the right, $\lambda = 0.575$ and $h = 0.1$ with $\beta = 5$ (orange line), 7 (red line) and 30 (blue line). The figure on the center illustrates the electron density $d_\beta$ also as a function of $\mu$ at $\beta = 30$ (low temperature regime) for $\lambda = h = 0$ (red line) and for $\lambda = 0.575$ and $h = 0.1$ (green line). In all figures, $\gamma = 2.6$. (Color online.)

3.5. Superconductor-Mott insulator phase transition


By Corollary 3.2, if $\lambda > 0$ and the system is not in the superconducting phase (i.e. if $r_\beta = 0$), then the electron density converges to either 0, 1 or 2 as $\beta \to \infty$ since

$$d_\infty = 1 + \mathrm{sgn}(\mu - \lambda). \tag{3.9}$$

We define the phase where the system does not form a pair condensate and the electron density is around 1, as a Mott insulator phase. More precisely, we say that the system forms a Mott insulator, if for some $\epsilon < 1$, some $0 < \beta_0 < \infty$, some $\mu_0 \in \mathbb{R}$ and some $\delta > 0$, the electron density

$$d_\beta \in (1 - \epsilon, 1 + \epsilon) \text{ and } r_\beta = 0 \text{ for all } (\beta, \mu) \in (\beta_0, \infty) \times (\mu_0 - \delta, \mu_0 + \delta).$$

As discussed in Sec. 3.4, observe that we have, in this phase, exactly one electron (or hole) localized in each site at the low temperature limit since $d_\beta \to 1$ and $w_\beta \to 0$ as $\beta \to \infty$.

To extract the whole region of parameters where such a thermodynamic phase takes place, a preliminary analysis of the function $\Gamma_{x,y}$ defined in Corollary 3.1 is first required. Observe that $\Gamma_{0,y} > 0$ if and only if $y > 0$. Consequently, for any real numbers $\lambda$ and $h$ such that $\lambda + |h| \leq 0$ we have $\Gamma_{0,\lambda+|h|} = 0$. However, if $\lambda + |h| > 0$ then $\Gamma_{0,\lambda+|h|} > 0$. Meanwhile, at fixed $y > 0$, the continuous function $\Gamma_{x,y}$ of $x \geq 0$ is convex with minimum for $x = y$, i.e.

$$\inf_{x \geq 0} \{ \Gamma_{x,y} \} = \Gamma_{y,y} = 2y > 0. \tag{3.10}$$

In particular, $\Gamma_{x,y}$ is strictly decreasing as a function of $x \in [0, y]$ and strictly increasing for $x \geq y$.
Now, by combining Corollaries 3.1–3.4, we are in position to extract the set of parameters corresponding to insulating or superconducting phases:

(1) For any $\gamma > 0$ and $\mu, \lambda \in \mathbb{R}$ such that

$$|\mu - \lambda| > \max\{\gamma/2, \lambda + |h|\},$$

observe first that there is no superconductivity ($r_\infty = 0$), either no electrons or no holes (see (3.9)) and, in any case, no magnetization since $m_\infty = 0$. It is a standard (non ferromagnetic) insulator.

The next step is now to analyze the thermodynamic behavior for

$$|\mu - \lambda| < \max\{\gamma/2, \lambda + |h|\}, \tag{3.11}$$

which depends on the strength of $\gamma > 0$. From (2) to (4), we assume that (3.11) is satisfied.
(2) If the BCS coupling constant $\gamma$ satisfies

$$0 < \gamma \leq \Gamma_{\lambda+|h|,\lambda+|h|} = 2(\lambda + |h|),$$

then from (3.10) combined with Corollary 3.1 there is no Cooper pair for any $\beta$ and any $\mu$. In particular, under the condition (3.11) there is a perfect magnetization, i.e. $m_\infty = \mathrm{sgn}(h)$, and exactly one electron or one hole per site since $d_\infty = 1$ and $w_\infty = 0$. In other words, we obtain a ferromagnetic Mott insulator phase.

(3) Now, if $\gamma > 0$ becomes too strong, i.e.

$$\gamma > \Gamma_{0,\lambda+|h|} = 4(\lambda + |h|),$$

then for any $\mu \in \mathbb{R}$ such that $|\mu - \lambda| < \gamma/2$ there are Cooper pairs because $r_\infty = r_{\max} > 0$, an electron density $d_\infty$ equal to (3.3) and no magnetization ($m_\infty = 0$). In this case, observe that all quantities are continuous at $|\mu - \lambda| = \gamma/2$. This is a superconducting phase.

(4) The superconducting-Mott insulator phase transition only appears in the intermediary regime where

$$\Gamma_{\lambda+|h|,\lambda+|h|} = 2(\lambda + |h|) < \gamma < \Gamma_{0,\lambda+|h|} = 4(\lambda + |h|), \tag{3.12}$$

cf. Fig. 9. Indeed, the equation $\Gamma_{x,\lambda+|h|} = \gamma$ has two solutions

$$x_1 := \frac{\gamma^{1/2}}{2} \{4(\lambda + |h|) - \gamma\}^{1/2} \quad \text{and} \quad x_2 := \frac{\gamma}{2} > x_1.$$

In particular, for any $\mu \in \mathbb{R}$ such that $|\mu - \lambda| \in (x_1, \gamma/2)$, the BCS coupling constant $\gamma$ is strong enough to imply the superconductivity ($r_\infty = r_{\max} > 0$), with an electron density $d_\infty$ equal to (3.3) and no magnetization ($m_\infty = 0$). We are in the superconducting phase. However, for any $\mu \in \mathbb{R}$ such that $|\mu - \lambda| < x_1$, the BCS coupling constant $\gamma$ becomes too weak and there is no superconductivity ($r_\infty = 0$), exactly one electron per site, i.e. $d_\infty = 1$ and $w_\infty = 0$, and a pure magnetization if $h \neq 0$, i.e. $m_\infty = \mathrm{sgn}(h)$. In this regime, one gets a ferromagnetic Mott insulator phase. All quantities are continuous at $|\mu - \lambda| = \gamma/2$ but not for $|\mu - \lambda| = x_1$. In other words, we get a superconductor-Mott insulator phase transition by tuning the chemical potential $\mu$. An illustration of this phase transition is given in Fig. 10, see also Fig. 8.
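The case distinction (1) to (4) can be summarized by a small classifier of the zero-temperature phase. This is a sketch under the assumptions of this section only: it restates $\Gamma_{x,y}$ from Corollary 3.1, the function names and phase labels are hypothetical, and critical points (where the corollaries make no statement) are reported separately. The Mott insulator found here is ferromagnetic whenever $h \neq 0$.

```python
import math


def gamma_threshold(x: float, y: float) -> float:
    """Gamma_{x,y} as stated in Corollary 3.1, for x >= 0 and real y."""
    if y > 0 and x < y:
        return 2.0 * (y + math.sqrt(y * y - x * x))
    return 2.0 * x


def zero_temperature_phase(mu: float, lam: float, gamma: float, h: float) -> str:
    """Classify the zero-temperature phase away from critical points."""
    x = abs(mu - lam)
    y = lam + abs(h)
    threshold = gamma_threshold(x, y)
    if gamma == threshold or (gamma < threshold and x == y):
        return "critical"
    if gamma > threshold:
        return "superconductor"       # r = r_max > 0, m = 0
    if x > y:
        return "standard insulator"   # d = 0 or d = 2, m = 0
    return "Mott insulator"           # d = 1, w = 0, m = sgn(h)


# Parameters of Fig. 10: lambda = 0.575, gamma = 2.6, h = 0.1,
# which lie in the intermediary regime (3.12).
lam, gamma, h = 0.575, 2.6, 0.1
x1 = 0.5 * math.sqrt(gamma * (4.0 * (lam + abs(h)) - gamma))
phases = [zero_temperature_phase(mu, lam, gamma, h) for mu in (lam, 1.2, 3.0)]
```

With these values the three chemical potentials fall, in order, in the Mott insulator, superconducting and standard insulator regions, and the change of phase near $|\mu - \lambda| = x_1$ is the transition described in item (4).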
Fig. 9. In both figures, the blue area represents the domain of $(\lambda, \gamma)$, where there is a superconducting phase at zero temperature for $\mu = 1$ and $h = 0$. The two increasing straight lines (green and brown) are $\gamma = 4\lambda$ and $\gamma = 2\lambda$ for $\gamma \geq 1$. In particular, between these two lines ($2\lambda < \gamma < 4\lambda$), there is a superconducting-Mott insulator phase transition by tuning $\mu$. (Color online.)

Fig. 10. Here $\lambda = 0.575$, $\gamma = 2.6$, and $h = 0.1$. In the two figures on the left, we plot the electron density $d_\beta$ (blue line), the Cooper pair condensate density $r_\beta$ (red line) and the magnetization density $m_\beta$ (green line) as functions of $\mu$ for $\beta = 7$ (left figure) or 30 (low temperature regime, figure on the center). Observe the superconducting-Mott insulator phase transition which appears in both cases. In the right figure, we illustrate as a function of $\mu$ the corresponding critical temperature $\theta_c$. The blue line corresponds to a second order phase transition, whereas the red dashed line represents the domain of $\mu$ with first order phase transition. The black dashed line is the chemical potential $\mu = \lambda$ corresponding to an electron density per site equal to 1. (Color online.)

3.6. Mean-energy per site and the specific heat


To conclude, low-$T_c$ superconductors and high-$T_c$ superconductors differ by the behavior of their specific heat. The first one shows a discontinuity of the specific heat at the critical point whereas the specific heat for high-$T_c$ superconductors is continuous. It is therefore interesting to give now the mean-energy per site in the thermodynamic limit in order to compute next the specific heat.

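Theorem 3.7 (Mean-Energy per Site). For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point, the (infinite volume) mean energy per site is equal to

$$\lim_{N \to \infty} \{ N^{-1} \omega_N(H_N) \} = \epsilon_\beta := -\mu d_\beta - h m_\beta + 2\lambda w_\beta - \gamma r_\beta,$$

see Theorems 3.1–3.6 and Fig. 11.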


Fig. 11. In the two figures on the left, we give the mean energy per site $\epsilon_\beta$ as a function of $\beta$ at $h = 0$ for $\lambda = 0$ (figure on the left, second order BCS phase transition) or $\lambda = 0.575$ (figure on the center, first order phase transition). The dashed line in both figures is the mean energy per site with zero Cooper pair condensate density. On the right figure, $\epsilon_\beta$ is given as a function of $\beta$ and $h$ at $\lambda = 0.575$. The color from red to blue reflects the decrease of the temperature and the plateau corresponds to the superconducting phase. In all figures, $\mu = 1$ and $\gamma = 2.6$. (Color online.)

At zero-temperature, Corollaries 3.1–3.4 imply an explicit computation of the mean energy per site:

Corollary 3.5 (Mean-Energy per Site at Zero-Temperature). The (infinite volume) mean energy per site $\epsilon_\beta = \epsilon_\beta(\mu, \lambda, \gamma, h)$ at zero-temperature is equal to
+ | |
 := lim  = + [+|h|,) (| |)
1 + ||,+|h| (1 + h,0 )
|h|
[0,+|h|] (| |),
1 + ||,+|h|
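for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ whereas for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$

$$\epsilon_\infty := \lim_{\beta \to \infty} \epsilon_\beta = -\frac{\gamma}{4} + (\lambda - \mu)(1 + \gamma^{-1}(\mu - \lambda)),$$

cf. Corollary 3.1.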
Note that the critical magnetic field $h^{(c)}$ (3.5) has a direct interpretation in terms of the zero-temperature mean energy per site $\epsilon_\infty$. Indeed, if $|\mu - \lambda| < \lambda + |h|$, i.e. $d_\infty \notin \{0, 2\}$, by equating $\epsilon_\infty$ in the superconducting phase with the mean energy $\epsilon_\infty = -\mu - |h|$ in the non-superconducting (ferromagnetic) state, we directly get that the magnetic field should be equal to $|h| = h^{(c)}$ (3.5). In other words, the critical magnetic field $h^{(c)}$ corresponds to the point where the mean energies at zero-temperature in both cases are equal to each other, as it should be. Note that this phenomenon is not true at non-zero temperature since the mean energy per site can be discontinuous as a function of $h$ (even if $\lambda = 0$), see Fig. 11.
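The energetic interpretation of the critical field can be checked in a few lines. In this sketch the two zero-temperature energies are taken from the text, namely $-\gamma/4 + (\lambda - \mu)(1 + \gamma^{-1}(\mu - \lambda))$ in the superconducting phase and $-\mu - |h|$ in the ferromagnetic state, and the critical field from (3.5); the function names and the numerical values (those of Fig. 6) are illustrative only.

```python
def energy_superconducting(mu: float, lam: float, gamma: float) -> float:
    """Zero-temperature mean energy per site inside the superconducting phase."""
    return -gamma / 4.0 + (lam - mu) * (1.0 + (mu - lam) / gamma)


def energy_ferromagnetic(mu: float, h: float) -> float:
    """Zero-temperature mean energy per site of the ferromagnetic state."""
    return -mu - abs(h)


def critical_field(mu: float, lam: float, gamma: float) -> float:
    """Zero-temperature critical field, formula (3.5)."""
    return gamma / 4.0 - lam + (mu - lam) ** 2 / gamma


mu, lam, gamma = 1.0, 0.575, 2.6
h_c = critical_field(mu, lam, gamma)
gap_at_h_c = energy_superconducting(mu, lam, gamma) - energy_ferromagnetic(mu, h_c)
```

The difference `gap_at_h_c` vanishes up to rounding, while for $|h| < h^{(c)}$ the superconducting energy is the lower of the two.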
Now, the specific heat at finite volume equals

$$c_{N,\beta} := -\beta^2 \partial_\beta \{ N^{-1} \omega_N(H_N) \} = N^{-1} \beta^2 \omega_N([H_N - \omega_N(H_N)]^2). \tag{3.13}$$

However, its thermodynamic limit

$$c_\beta := \lim_{N \to \infty} c_{N,\beta} = -\beta^2 \partial_\beta \epsilon_\beta + C_\beta \tag{3.14}$$

cannot be easily computed because one cannot exchange the limit $N \to \infty$ and the derivative $\partial_\beta$, i.e. $C_\beta = C_\beta(\mu, \lambda, \gamma, h)$ may be non-zero. For instance, Griffiths arguments [29–31] (Appendix) would allow one to exchange any derivative of the pressure $p_N$ and the limit $N \to \infty$ by using the convexity of $p_N$. To compute (3.14) in this way, we would need to prove the (piece-wise) convexity of $\epsilon_{N,\beta} := N^{-1} \omega_N(H_N)$ as a function of $\beta > 0$. As suggested by Fig. 11, this property of convexity might be right but it is not proven here.
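The finite-volume identity (3.13), which equates the temperature derivative of the mean energy with the energy fluctuations, holds for any Gibbs state on a finite spectrum. The sketch below illustrates it on a toy example: the energy levels are hypothetical and are not the spectrum of $H_N$, the volume factor $N^{-1}$ is set to one, and the derivative is approximated by a central finite difference.

```python
import math


def gibbs_mean_and_variance(levels, beta):
    """Mean and variance of the energy in the Gibbs state at inverse temperature beta."""
    shift = min(levels)  # subtract the ground energy for numerical stability
    weights = [math.exp(-beta * (energy - shift)) for energy in levels]
    partition = sum(weights)
    mean = sum(w * e for w, e in zip(weights, levels)) / partition
    variance = sum(w * (e - mean) ** 2 for w, e in zip(weights, levels)) / partition
    return mean, variance


levels = [-1.3, -0.4, 0.0, 0.7]  # hypothetical energy levels
beta, step = 1.5, 1e-5

mean_plus, _ = gibbs_mean_and_variance(levels, beta + step)
mean_minus, _ = gibbs_mean_and_variance(levels, beta - step)
_, variance = gibbs_mean_and_variance(levels, beta)

heat_from_derivative = -beta ** 2 * (mean_plus - mean_minus) / (2.0 * step)
heat_from_fluctuations = beta ** 2 * variance
```

Both expressions agree up to the discretization error of the finite difference. The difficulty discussed in the text is not this identity but the exchange of the derivative with the limit $N \to \infty$.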
Notice however that if experimental measurements of the specific heat come from a discrete derivative of the mean energy per site $\epsilon_\beta$, it is then clear that this corresponds to neglecting the term $C_\beta$. In this case, i.e. assuming $C_\beta = 0$, we find again the well-known BCS-type behavior of the specific heat in presence of a second order phase transition, see Fig. 12. In addition, if $C_\beta = 0$, then for any $\mu, \lambda, h$ and $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ (Corollary 3.1), we explicitly obtain via direct computations the well-known exponential decay of the specific heat at zero-temperature for s-wave superconductors:
1
c = (2 + 2 42 ) 2 e + o( 2 e ) as . (3.15)
4
(Note that this asymptotic could give access to $\gamma$ and also $\lambda$, see discussions in Sec. 5.) However, if a first order phase transition appears, then the (infinite volume) mean energy per site $\epsilon_\beta$ is discontinuous at the critical temperature $\theta_c$ (cf. Fig. 11) and the specific heat $c_{\theta_c^{-1}}$ is infinite. In Fig. 12, we give an illustration of the ratio $\Delta c / c_{\max}$ between the jump $\Delta c$ at $\theta = \theta_c$ and the maximum value $c_{\max}$ of $c_{\theta_c^{-1}}$. For most standard superconductors$^{o}$ note that the measured values are between 0.6 and 0.7. Numerical computations suggest that this ratio $\Delta c / c_{\max}$ may always be bounded in our model by one as soon as a second order phase transition appears.

Fig. 12. Here $\mu = 1$, $\gamma = 2.6$ and $h = 0$. Assuming $C_\beta = 0$, we give 3 plots of the specific heat $c_\beta$ as a function of the ratio $\theta/\theta_c$ between $\theta := \beta^{-1}$ and the critical temperature $\theta_c$ for $\lambda = 0, 0.5$ (both left figure, respectively blue and red lines, second order phase transition), and $\lambda = 0.575$ (figure on the center, blue line, first order phase transition). The dashed red line in the figure on the center indicates what the specific heat at finite volume might be since $c_{\theta_c^{-1}} = +\infty$. The right figure is a plot as a function of $\lambda$ of the relative specific heat jump, i.e. the ratio $\Delta c / c_{\max}$ between the jump $\Delta c$ at $\theta = \theta_c$ and the maximum value $c_{\max}$ of $c_{\theta_c^{-1}}$ at the same point. The yellow colored area indicates that this ratio numerically computed is formally infinite due to a first order phase transition. (Color online.)

o At least for the following elements: Hg, In, Nb, Pb, Sn, Ta, Tl, V.

4. Phase Diagram at Fixed Electron Density per Site


In any finite volume, the electron density per site is strictly increasing as a function of the chemical potential $\mu$ by strict convexity of the pressure. Therefore, for any fixed electron density $\rho \in (0, 2)$ there exists a unique $\mu_{N,\beta} = \mu_{N,\beta}(\rho, \lambda, \gamma, h)$ such that

$$\rho = \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} + n_{x,\downarrow}), \tag{4.1}$$

where $\omega_N$ represents the (finite volume) grand-canonical Gibbs state (1.6) associated with $H_N$ and taken at inverse temperature $\beta$ and chemical potential $\mu = \mu_{N,\beta}$. The aim of this section is now to analyze the thermodynamic properties of the model for a fixed $\rho$ instead of a fixed chemical potential $\mu$. We start by investigating it away from any critical point.

4.1. Thermodynamics away from any critical point


In the thermodynamic limit and away from any critical point, the chemical potential $\mu_{N,\beta}$ converges to a solution $\mu_\beta = \mu_\beta(\rho, \lambda, \gamma, h)$ of the equation

$$\rho = d_\beta(\mu_\beta, \lambda, \gamma, h), \tag{4.2}$$

see Theorem 3.4. For instance, if $\rho = 1$, the chemical potential $\mu_\beta$ is simply given by $\lambda$, i.e. $\mu_\beta(1, \lambda, \gamma, h) = \lambda$. At least away from any critical point, this chemical potential is always uniquely defined.

Indeed, outside the superconducting phase (see Sec. 3.1), the electron density $d_\beta$ given by Theorem 3.4 is a strictly increasing continuous function of the chemical potential $\mu$ at fixed $\beta > 0$. In other words, for any fixed electron density $\rho \in (0, 2)$, Eq. (4.2) has a unique solution $\mu_\beta$, i.e. the chemical potential $\mu_\beta$ is the inverse of the electron density $d_\beta$ taken as a function of $\mu \in \mathbb{R}$.

On the other hand, inside the superconducting phase, from (3.3) the chemical potential $\mu_\beta$ is also unique and equals

$$\mu_\beta = \frac{\gamma}{2}(\rho - 1) + \lambda, \tag{4.3}$$

see Figs. 5 and 10. In particular, $\mu_\beta$ does not depend on $h$ or $\beta$ as soon as $r_\beta > 0$. The gap equation (2.5) then equals

$$\tanh(\beta\gamma g_r) = 2 g_r \Big( 1 + \frac{e^{\beta\lambda} \cosh(\beta h)}{\cosh(\beta\gamma g_r)} \Big), \quad \text{with } g_r := \frac{1}{2} \{(\rho - 1)^2 + 4r\}^{1/2},$$

and

$$0 \leq r_\beta \leq \max\{0, \rho(2 - \rho)/4\},$$

for any fixed electron density $\rho \in (0, 2)$.
Hence, the thermodynamic behavior of the strong coupling BCS-Hubbard model $H_N$ is simply given for any $\rho \in (0, 2)$, away from any critical point, by setting $\mu = \mu_\beta$ in Sec. 3. In particular, the superconducting phase can appear by tuning each parameter: the BCS coupling constant $\gamma$ (see (2.6)), the inverse temperature $\beta > 0$ (see Corollary 3.1), the coupling constant $\lambda$, the magnetic field $h$ (see Sec. 3.3), the chemical potential $\mu$ or the electron density $\rho$ (see Sec. 3.5). Therefore, to explain the phase diagram at fixed electron density, it is sufficient to give the behavior of the Cooper pair condensate density $r_\beta$ as a function of $\rho \in (0, 2)$. Everything can be easily performed via numerical methods, see Fig. 13. We restrict our rigorous analysis to the zero-temperature limit of $r_\beta$, which is a straightforward consequence of Corollary 3.1 and (4.3).

Fig. 13. Illustrations of the Cooper pair condensate density $r_\beta$ as a function of the inverse temperature $\beta$ for $\gamma = 2.6$, $h = 0$, and densities $\rho = 1, 1.7$ (respectively left and right figures), with $\lambda = 0$ (blue line), 0.5 (red line), 0.75 (green line), and 1 (orange line). The dashed line indicates the value of $r_\infty$. (Color online.)
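Corollary 4.1 (Zero-Temperature Cooper Pair Condensate Density). At
zero-temperature, fixed electron density $\rho \in (0, 2)$ and $\lambda, h \in \mathbb{R}$, the Cooper pair
condensate density $r_\beta$ converges as $\beta \to \infty$ towards $r_\infty = \rho(2 - \rho)/4$ when
$\gamma > \max\{\tilde{\Gamma}_{\rho,\lambda+|h|}, 0\}$. Here
\[ \tilde{\Gamma}_{x,y} := \frac{4y}{x(x - 2) + 2}\, \chi_{[0,\infty)}(y) \]
is a function defined for any $x, y \in \mathbb{R}$.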
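The threshold in Corollary 4.1 is easy to tabulate. The following is a minimal sketch, assuming the threshold function reads $4y/(x(x-2)+2)$ for $y \ge 0$ and vanishes for $y < 0$; the names `gamma_tilde` and `zero_temperature_condensate` are hypothetical helpers introduced only for illustration.

```python
def gamma_tilde(x, y):
    """Assumed threshold function: 4y / (x(x - 2) + 2) for y >= 0, and 0 for y < 0."""
    if y < 0.0:
        return 0.0
    # The denominator equals (x - 1)**2 + 1, so it never vanishes.
    return 4.0 * y / (x * (x - 2.0) + 2.0)


def zero_temperature_condensate(rho, gamma, lam, h):
    """Return rho(2 - rho)/4 when gamma exceeds the threshold of Corollary 4.1.

    Returns None when the corollary does not apply (the subtle case of Remark 4.1).
    """
    if not 0.0 < rho < 2.0:
        raise ValueError("rho must lie in (0, 2)")
    threshold = max(gamma_tilde(rho, lam + abs(h)), 0.0)
    if gamma > threshold:
        return rho * (2.0 - rho) / 4.0
    return None
```

Since the denominator is $(x-1)^2 + 1$, the threshold is largest at half-filling ($x = 1$, value $4y$) and smallest at the endpoints $x = 0$ and $x = 2$ (value $2y$).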
Remark 4.1. The case $0 < \gamma < \tilde{\Gamma}_{\rho,\lambda+|h|}$ is more subtle than its analogue with a
fixed chemical potential $\mu$, because phase mixtures can take place. See Sec. 4.2.
As explained above, as soon as $\gamma > \tilde{\Gamma}_{\rho,\lambda+|h|}$ we can extract from this corollary all
the zero-temperature thermodynamics of the strong coupling BCS-Hubbard model
by using Corollaries 3.1-3.4.
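If $\lambda + |h| > 0$ and $\gamma$ satisfy the inequalities
\[ \gamma > \min_{\rho \in (0,2)} \{\tilde{\Gamma}_{\rho,\lambda+|h|}\} = \tilde{\Gamma}_{0,\lambda+|h|} = \tilde{\Gamma}_{2,\lambda+|h|} = 2(\lambda + |h|) \]
and
\[ \gamma < \max_{\rho \in (0,2)} \{\tilde{\Gamma}_{\rho,\lambda+|h|}\} = \tilde{\Gamma}_{1,\lambda+|h|} = 4(\lambda + |h|), \]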

262 J.-B. Bru & W. de Siqueira Pedra

it is also clear that the superconductor-Mott insulator phase transition appears by
tuning the electron density $\rho$ in the same way as described in Sec. 3.5 for $\mu$. See
Fig. 10. In this case however, we recommend Sec. 4.2 for more details because of
the subtlety mentioned in Remark 4.1. See Figs. 15 and 16 below.
From (4.3) combined with Corollary 4.1, note that the asymptotics (3.15) of
the specific heat at zero-temperature is still valid at fixed electron density $\rho$ as
soon as $\gamma > \max\{\tilde{\Gamma}_{\rho,\lambda+|h|}, 0\}$. Meanwhile, from Corollary 4.1 the zero-temperature
Cooper pair condensate density $r_\infty$ does not depend on $\lambda$, $\gamma$, or $h$, as soon as
$\gamma > \tilde{\Gamma}_{\rho,\lambda+|h|}$ is satisfied. Indeed, the chemical potential $\mu_\beta$ in the case where $r_\beta > 0$
is renormalized, cf. (4.3). In other words, at zero-temperature, the thermodynamic
behavior of the strong coupling BCS-Hubbard model for $\gamma > \tilde{\Gamma}_{\rho,\lambda+|h|}$ is equal to
the well-known behavior of the BCS theory in the strong coupling approximation
($\lambda = h = 0$). This phenomenon is also seen by using renormalization methods where
it is believed that the Coulomb interaction simply modifies the mass of electrons
by creating quasi-particles (which however do not exist in our model).

4.2. Coexistence of ferromagnetic and superconducting phases


Observe that the electron density $d_\beta$ given by Theorem 3.4 can have discontinuities
as a function of the chemical potential $\mu$. This phenomenon appears at the
superconductor-Mott insulator phase transition, see Sec. 3.5 and Fig. 10. Because
of electron-hole symmetry (Sec. 3.2), without loss of generality we can restrict our
study to the case where $d_\beta \in [0, 1]$, i.e. $\rho \in [0, 1]$ and $\mu \le \lambda$.
In this regime, the electron density $d_\beta$ has, at most, one discontinuity point at
the so-called critical chemical potential $\mu_\beta^{(c)}$. In particular, there are two critical
electron densities
\[ d_\beta^{\pm} := d_\beta(\mu_\beta^{(c)} \pm 0, \lambda, \gamma, h) \quad \text{with } d_\beta^{+} > d_\beta^{-}. \]
Similarly, we can also define two critical Cooper pair condensate densities $r_\beta^{\pm}$, two
critical magnetization densities$^{p}$ $m_\beta^{\pm}$ and two critical Coulomb correlation densities
$w_\beta^{\pm}$. Of course, since $r_\beta^{+} > r_\beta^{-} = 0$, we are here on a critical point, i.e.
\[ (\beta, \mu_\beta^{(c)}, \lambda, \gamma, h) \in \partial S \]
(see (2.7)), with $\beta, \gamma > 0$ and $\lambda, h \in \mathbb{R}$ such that this critical chemical potential
$\mu_\beta^{(c)} = \mu_\beta^{(c)}(\lambda, \gamma, h)$ exists.
The thermodynamics of the model for $\rho \notin [d_\beta^{-}, d_\beta^{+}]$ is already explained in
Sec. 4.1 because the solution $r_\beta$ of (2.4) is unique at $\mu = \mu_\beta$. The chemical potential
$\mu_{N,\beta}$ converges to $\mu_\beta = \mu_\beta^{(c)}$, if $\rho \in [d_\beta^{-}, d_\beta^{+}]$. In this case the variational problem
(2.4) has exactly two maximizers $r_\beta^{\pm}$. The thermodynamic behavior of the system
$^{p}$ If $h = 0$, then $m_\beta^{\pm} = 0$.
in this regime is not, a priori, clear except from the obvious fact that
\[ \lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} + n_{x,\downarrow}) \Big\} = \rho \]
by definition. In particular, it cannot be deduced from the above results. We handle
this situation within a much more general framework in Theorem 6.5. As a consequence
of this study (see discussions after Theorem 6.5), all the extensive quantities
can be obtained in the thermodynamic limit:
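Theorem 4.1 (Densities in Coexistent Phases). Take $\beta, \gamma > 0$ and real numbers
$\lambda, h$ in the domain of definition of the critical chemical potential $\mu_\beta^{(c)}$. For any
$\rho \in [d_\beta^{-}, d_\beta^{+}]$, all densities are uniquely defined:
(i) The Cooper pair condensate density equals
\[ \lim_{N \to \infty} \Big\{ \frac{1}{N^2} \sum_{x,y \in \Lambda_N} \omega_N(a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow}) \Big\} = \tau_\rho\, r_\beta^{+}, \quad \text{with} \quad \tau_\rho := \frac{\rho - d_\beta^{-}}{d_\beta^{+} - d_\beta^{-}} \in [0, 1]. \]
(ii) The magnetization density equals
\[ \lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} - n_{x,\downarrow}) \Big\} = (1 - \tau_\rho)\, m_\beta^{-} + \tau_\rho\, m_\beta^{+}. \]
(iii) The Coulomb correlation density equals
\[ \lim_{N \to \infty} \Big\{ \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} n_{x,\downarrow}) \Big\} = (1 - \tau_\rho)\, w_\beta^{-} + \tau_\rho\, w_\beta^{+}. \]
(iv) The mean energy per site equals
\[ \lim_{N \to \infty} \{ N^{-1} \omega_N(H_N) \} = (1 - \tau_\rho)\, \epsilon_\beta^{-} + \tau_\rho\, \epsilon_\beta^{+}, \]
with
\[ \epsilon_\beta^{\pm} := -\mu_\beta^{(c)} \rho - h\, m_\beta^{\pm} + 2\lambda\, w_\beta^{\pm} - \gamma\, r_\beta^{\pm}. \]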
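The convex interpolation in Theorem 4.1 can be sketched in a few lines. The helper names `mixing_weight` and `coexistence_density` are hypothetical, and the numerical values in the usage note are made up for illustration; they are not taken from the figures.

```python
def mixing_weight(rho, d_minus, d_plus):
    """Weight (rho - d_minus) / (d_plus - d_minus) of the '+' phase at fixed density rho."""
    if not d_minus < d_plus:
        raise ValueError("expected d_minus < d_plus")
    if not d_minus <= rho <= d_plus:
        raise ValueError("rho must lie between the two critical densities")
    return (rho - d_minus) / (d_plus - d_minus)


def coexistence_density(rho, d_minus, d_plus, q_minus, q_plus):
    """Convex interpolation (1 - tau) * q_minus + tau * q_plus of any extensive density q."""
    tau = mixing_weight(rho, d_minus, d_plus)
    return (1.0 - tau) * q_minus + tau * q_plus
```

For example, with hypothetical critical densities 0.5 and 1.0 and a density 0.75 halfway between them, a condensate density that is 0 in one phase and 0.2 in the other is interpolated to 0.1. Applying the same interpolation to the electron density itself returns the prescribed density, which is a quick consistency check on the weight.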

As a consequence of this theorem, as soon as the magnetic field $h \neq 0$, there
is a coexistence of ferromagnetic and superconducting phases at low temperatures
for $\rho \in (d_\beta^{-}, d_\beta^{+})$. In other words, the Meißner effect is not valid in this interval
of electron densities. An illustration of this is given in Fig. 14. Such a phenomenon
was also observed in experiments and from our results, it should occur rather near
half-filling (but not exactly at half-filling) and at strong repulsion $\lambda > 0$. Additionally,
observe that this coexistence of thermodynamic phases can also appear at the
critical magnetic field $h_\beta^{(c)}$ (see Sec. 3.3).

Remark 4.2. Coexistence of ferromagnetic and superconducting phases has


already been rigorously investigated, see, e.g., [16, 17]. For instance, in [16] such
Fig. 14. In the two figures on the left, we give illustrations of the Cooper pair condensate density
$r_\beta$ and the magnetization density $m_\beta$ as functions of the inverse temperature $\beta$ for densities
$\rho = 0.6$ (orange line), $0.7$ (magenta line), $0.8$ (red line), $0.9$ (cyan line). In the figure on the right,
we illustrate the coexistence of ferromagnetic and superconducting phases via graphs of $r_\beta$, $m_\beta$
and the chemical potential $\mu_\beta$ as functions of $\rho$ for $\beta = 30$ (low temperature regime). In all figures,
$\lambda = 0.575$, $\gamma = 2.6$, and $h = 0.1$. (The small discontinuities around $\rho = 1$ in the right figure are
numerical anomalies.) (Color online.)

phenomenon is shown to be impossible in the ground state of the Vonsovkii-Zener
model applied to s-wave superconductors,$^{q}$ whereas at finite temperature, numerical
computations [17] suggest the contrary. This last analysis [17] is however not
performed in detail.

The second interesting physical aspect related to densities $\rho$ between the critical
densities $d_\beta^{-}$ and $d_\beta^{+}$ is a smoothing effect of the extensive quantities (magnetization
density, Cooper pair condensate density, etc.) as functions of the inverse temperature
$\beta$. Indeed, since the critical chemical potential $\mu_\beta^{(c)}$ only exists when a first
order phase transition occurs, one could expect that the extensive quantities are
not continuous as functions of $\beta > 0$. In fact, for $\rho \in (d_\beta^{-}, d_\beta^{+})$, there is a convex
interpolation between quantities related to the solutions $r_\beta^{-} = 0$ and $r_\beta^{+} > 0$ of (2.4),
see Theorem 4.1. The continuity of the extensive quantities then follows, see Fig. 14.
It does not imply however, that all densities always become continuous at fixed $\rho$
as a function of the inverse temperature $\beta$. For instance, in Fig. 13, the green and
orange graphs give two illustrations of a discontinuity of the order parameter $r_\beta$ at
fixed electron density $\rho = 1$ where $\mu_\beta = \lambda$. To understand this first order phase
transition, another extensive quantity should be additionally fixed, see discussions in
Sec. 5 and Fig. 17.
Following these last results, we give now in Fig. 15 other plots of the critical
temperature $\theta_c = \theta_c(\rho, \lambda, \gamma, h)$, which is defined as usual such that $r_\beta > 0$ if and
only if $\beta > \theta_c^{-1}$. In this figure, observe that a positive $\lambda$, i.e. a one-site repulsion,
can never increase the critical temperature if the electron density $\rho$ is fixed instead
of the chemical potential $\mu$, compare with Fig. 2. We also show in Fig. 15 (right
figure) that if the density of holes equals the density of electrons, i.e. $\rho = 1$, then
we have a Mott insulator, whereas a small doping of electrons or holes implies
either a superconducting phase (blue area) or a superconductor-Mott insulator

$^{q}$ It is a combination of the BCS interaction (1.3) with the Zener s-d exchange interaction.
Fig. 15. Illustration, as a function of $\lambda$ (the two figures on the left) or $\rho$ (figure on the right),
of the critical temperature $\theta_c = \theta_c(\rho, \lambda, \gamma, h)$ for $\gamma = 2.6$, $h = 0.1$ and with $\rho = 1$ (left figure),
$\rho = 0.7$ (figure in the center) and $\lambda = 0.575$ (right figure). The blue and yellow areas correspond
respectively to the superconducting and ferromagnetic-superconducting phases, whereas the red
dashed line indicates the domain of $\lambda$ with a first order phase transition as a function of $\beta$ or the
temperature $\theta := \beta^{-1}$ (it only exists in the left figure). The dashed green line (left figure) is the
asymptote when $\lambda \to -\infty$. In the right figure, observe that there is no phase transition for $\rho = 1$.
(Color online.)


Fig. 16. In the two figures on the left, we give illustrations of the mean energy per site $\epsilon_\beta$ as a
function of the inverse temperature $\beta$ for densities $\rho = 0.7$ (magenta line), $0.9$ (cyan line), $1$ (green
line), $1.1$ (blue line) and $1.3$ (red line). For $\rho = 1$, there is no phase transition and for $\rho = 0.9$ or
$1.1$ only a ferromagnetic-superconducting phase appears, whereas for $\rho = 0.7$ or $1.3$ this last phase
is followed for larger $\beta$ by a superconducting phase. In the figure on the right, assuming C = 0,
we give two plots of the specific heat $c_\beta$ as a function of the ratio $\theta/\theta_c$ between $\theta := \beta^{-1}$ and
the critical temperature $\theta_c$ for densities $\rho = 0.7$ (magenta line) and $0.9$ (cyan line). In all figures,
$\lambda = 0.575$, $\gamma = 2.6$, and $h = 0.1$. (Color online.)

(ferromagnetic) phase (yellow area) related to the superconductor-Mott insulator


phase transition described in Sec. 3.5 and Fig. 10.
To conclude, Fig. 16 illustrates various thermodynamic features of the system
at fixed $\rho$. First, as a function of $\beta > 0$, $\epsilon_\beta$ is continuously differentiable only for
$\rho = 1$. In other words, there is no phase transition in contrast to the cases with
$\rho = 0.7, 0.9$ or $\rho = 1.1, 1.3$. This is the Mott insulator phase transition illustrated
in Fig. 10. As in Fig. 10, we also observe the electron-hole symmetry implying
that $\rho = 0.7$ and $\rho = 1.3$, or $\rho = 0.9$ and $\rho = 1.1$, have the same phase transitions at
exactly the same critical points. As explained in Sec. 3.1, the mean energy per site
$\epsilon_\beta$ for $\rho = 0.7, 1.3$, or $\rho = 0.9, 1.1$, differs by a constant, i.e. in absolute value by
|2 |. At high temperatures, i.e. when 0, the function  diverges to
if = 1 with (0, 1) whereas it stays nite at = 1. Indeed, when 0 the
electron density d converges to 1 at xed , , , h, see Theorem 3.4 and Fig. 5.
If = 1 , it follows that the chemical potential diverges to as 0,
implying that $\epsilon_\beta$ diverges as well. In other words, it is energetically unfavorable to fix an
electron density $\rho \neq 1$ at high temperatures. Finally, the specific heat $c_\beta$ has only
one jump in the case of one phase transition and two jumps when there are two
phase transitions, namely when the superconductor-Mott insulator (ferromagnetic)
phase and the purely superconducting phase appear.

5. Concluding Remarks
(1) First, it is important to note that two different physical behaviors can be
extracted from the strong coupling BCS-Hubbard model $H_N$: a first one at fixed
chemical potential $\mu$ and a second one at fixed electron density $\rho \in (0, 2)$. This does
not mean that the canonical and grand-canonical ensembles are not equivalent for
this model. But, the influence of the direct interaction with coupling constant $\lambda$
drastically changes from the case at fixed $\mu$ to the other one at fixed $\rho$. For instance,
via Corollary 4.1 (see also Fig. 15), any one-site repulsion between pairs of electrons
is in any case unfavorable to the formation of Cooper pairs, as soon as the electron
density $\rho$ is fixed. This property is however wrong at fixed chemical potential $\mu$,
see Fig. 2. In other words, fixing the electron density $\rho$ is not equivalent$^{r}$ to fixing
the chemical potential $\mu$ in the model. Physically, a fixed electron density $\rho$ can be
modified by doping the superconductor. Changing the chemical potential $\mu$ may be
more difficult. One naive proposition would be to impose an electric potential on a
superconductor which is coupled to an additional conductor serving as a reservoir
of electrons or holes at fixed chemical potential.

(2) A measurement of the asymptotics as $\beta \to \infty$ of the specific heat $c_\beta$ (see (3.14)
with C = 0) in a superconducting phase would determine, by using (3.15), first
the parameter $\gamma > 0$ via the exponential decay and then the coupling constant $\lambda$.
Next, the measurement of the critical magnetic field at very low temperature would
allow one to obtain by (3.5) the chemical potential $\mu$ and hence the electron density
at zero-temperature. Since the inverse temperature $\beta$ as well as the magnetic field
$h$ can directly be measured, all parameters of the strong coupling BCS-Hubbard
model $H_N$ (1.2) would be experimentally found. In particular, its thermodynamic
behavior, explained in Secs. 2-4, could finally be compared with the real system. One
could for instance check if the critical temperature $\theta_c$ given by $H_N$ in appropriate
dimension corresponds to the one measured in the real superconductor. Such studies
would highlight the thermodynamic impact of the kinetic energy.

(3) In Sec. 4, the electron density $\rho$ is fixed but one could have fixed each extensive
quantity: the Cooper pair condensate density, the magnetization density, the
Coulomb correlation density or the mean-energy per site. For instance, if the magnetization
density $m \in \mathbb{R}$ is fixed, by strict convexity of the pressure there is a

r Equivalent is not taken here in the sense of the equivalence of ensembles.


unique magnetic field $h_{N,\beta} = h_{N,\beta}(\mu, \lambda, \gamma, m)$ such that
\[ m = \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} - n_{x,\downarrow}). \]
In the thermodynamic limit, we then have $h_{N,\beta}$ converging to $h_\beta$, solution of the
equation $m_\beta = m$ at fixed $\beta, \gamma > 0$ and $\mu, \lambda \in \mathbb{R}$. By using Theorem 6.5, we would
obtain the thermodynamics of the system for any $\beta, \gamma > 0$ and $\mu, \lambda, m \in \mathbb{R}$. More
generally, when one of the extensive quantities $r_\beta$, $d_\beta$, $m_\beta$, $w_\beta$, or $\epsilon_\beta$ is discontinuous
at a critical point, then the thermodynamic limit of the local Gibbs states $\omega_N$ can be
uniquely determined by fixing one of the corresponding extensive quantities between
its critical values. The other extensive quantities are determined in this case by an
obvious transcription of Theorem 4.1 for the considered discontinuous quantity at
the critical point. Observe, however, that $r_\beta$, $d_\beta$, $m_\beta$, $w_\beta$, and $\epsilon_\beta$ should be related,
respectively, to the parameters $\gamma$, $\mu$, $h$, $\lambda$, and $\beta$. For instance, the existence of a
magnetic field $h_{N,\beta}$ solution of (4.1) at fixed $\rho \in (0, 2)$ is not clear at finite volume.
Figure 17 gives an example of an electron density always equal to 1 for $\mu = \lambda$
together with discontinuity of all other extensive quantities. In order to get well-defined
quantities at the thermodynamic limit in this example for parameters allowing
a first order phase transition, it is not sufficient to have the electron density
fixed. At the critical point we could for instance fix the magnetization density $m \in \mathbb{R}$
in the ferromagnetic case ($h = 0.1$) or in any case, the Coulomb correlation density
$w \ge 0$ which determines a coupling constant $\lambda_{N,\beta}$ converging to $\lambda_\beta$, see the right
illustrations of Fig. 17 with the existence of a critical magnetic field and a critical
coupling constant.
(4) To conclude, as explained in the introduction, for a suitable space of states
it is possible to define a free energy density functional $F$ (1.5) associated with
the Hamiltonians $H_N$. The states minimizing this functional are equilibrium states
and imply all the thermodynamics of the strong coupling BCS-Hubbard model
discussed in Secs. 3 and 4. Indeed, the weak$^*$-limit of the local Gibbs state $\omega_N$
as $N \to \infty$ exists and belongs to our set of equilibrium states for any $\beta, \gamma > 0$

Fig. 17. In the two figures on the left, we give illustrations of the Cooper pair condensate density
$r_\beta$ (blue line), the magnetization density $m_\beta$ (green line), the Coulomb correlation density $w_\beta$
(red line), and the mean-energy per site $\epsilon_\beta$ (orange line) as functions of the inverse temperature
$\beta$ for $h = 0$ (figure on the left) and $h = 0.1$ (figure in the center) whereas $\lambda = \mu = 0.375$, i.e.
$\rho = 1$. In the figure on the right, we illustrate $m_{\beta_c}$ (green line) and $w_{\beta_c}$ (red line) respectively
as functions of $h$ with $\lambda = \mu = 0.375$ and of $\lambda$ with $(\mu, h) = (0.375, 0.1)$ at the critical inverse
temperature $\beta_c := \theta_c^{-1} \simeq 3.04$. (Color online.)
and $\mu, \lambda, h \in \mathbb{R}$, cf. Theorem 6.5. In Sec. 6.2, we prove in particular the following
properties of equilibrium states:
(i) Any pure equilibrium state $\omega$ satisfies $\omega(a_{x,\downarrow} a_{x,\uparrow}) = r_\beta^{1/2} e^{i\phi}$ for some
$\phi \in [0, 2\pi)$. In particular, if $r_\beta \neq 0$ they are not U(1)-gauge invariant and show off
diagonal long range order [38] (ODLRO), cf. Theorem 6.1, Theorem 6.3 and
Corollary 6.1.
(ii) All densities are uniquely defined: the electron density of any equilibrium state
$\omega$ is given by $\omega(n_{x,\uparrow} + n_{x,\downarrow}) = d_\beta$, its magnetization density by
$\omega(n_{x,\uparrow} - n_{x,\downarrow}) = m_\beta$, and its Coulomb correlation density equals
$\omega(n_{x,\uparrow} n_{x,\downarrow}) = w_\beta$, cf. Theorem 6.4.
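(iii) The Cooper fields $\Phi_x := a^*_{x,\downarrow} a^*_{x,\uparrow} + a_{x,\uparrow} a_{x,\downarrow}$ and $\Psi_x := i(a^*_{x,\downarrow} a^*_{x,\uparrow} - a_{x,\uparrow} a_{x,\downarrow})$
for pure states become classical in the limit $\beta \to \infty$, i.e. their fluctuations go
to zero in this limit, cf. Theorem 6.6.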

Any weak$^*$ limit point of equilibrium states with diverging inverse temperature
$\beta \to \infty$ is (by definition) a ground state. For $\gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, most of the ground
states inherit the properties (i)-(iii) of equilibrium states. In particular, within the
GNS-representation [32] of pure ground states, Cooper fields are exactly c-numbers, see
Corollary 6.2. In this case, correlation functions can explicitly be computed at any
order in Cooper fields. Furthermore, notice that even in the case $h = 0$ where
the Hamiltonian $H_N$ is spin invariant, there exist ground states breaking the spin
SU(2)-symmetry. For more details including a precise formulation of these results,
we recommend Sec. 6, in particular, Sec. 6.2.

6. Mathematical Foundations of the Thermodynamic Results


The aim of this section is to give all the detailed proofs of the thermodynamics
of the strong coupling BCS-Hubbard model $H_N$ (1.2). The central result of this
section is the thermodynamic limit of the pressure, i.e. the proof of Theorem 2.1.
The main ingredient in this analysis is the celebrated Størmer Theorem [1], which
we adapt here for the CAR algebra (see Lemma 6.8). We orient our approach on
the Petz-Raggio-Verbeure results in [19], but we would like to mention that the
analysis of permutation invariant quantum systems in the thermodynamic limit
(with Størmer's theorem as the background) is carried out for different classes of
systems also by other authors. See, e.g., [33, 39]. Finally, we introduce in Sec. 6.2 a
notion of equilibrium and ground states by a usual variational principle for the free
energy density. The thermodynamics of the strong coupling BCS-Hubbard model
described in Secs. 3 and 4 is encoded in this notion and the thermodynamic limits
of local Gibbs states used above for simplicity are special cases of equilibrium and
ground states defined in Sec. 6.2. Before we proceed, we first define some basic
mathematical objects needed in our analysis.
Let $I$ be the set of finite subsets of $\mathbb{Z}^{d \ge 1}$. For any $\Lambda \in I$ we then define $U_\Lambda$
as the $C^*$-algebra generated by $\{a_{x,\uparrow}, a_{x,\downarrow}\}_{x \in \Lambda}$ and the identity. Choosing some
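fixed bijective map $\kappa : \mathbb{N} \to \mathbb{Z}^d$, $\mathbb{N} := \{1, 2, \ldots\}$, $U_N$ denotes the local $C^*$-algebra
$U_{\{\kappa(1),\ldots,\kappa(N)\}}$ at fixed $N \in \mathbb{N}$, whereas $U$ is the full $C^*$-algebra, i.e. the closure of
the union of all $U_N$ for any integer $N \ge 1$. Note that
\[ n_{\kappa(l),\uparrow} := a^*_{\kappa(l),\uparrow} a_{\kappa(l),\uparrow} \quad \text{and} \quad n_{\kappa(l),\downarrow} := a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} \]
are the electron number operators on the site $\kappa(l)$, respectively, with spin up $\uparrow$ and
down $\downarrow$. To simplify the notation, as soon as a statement clearly concerns the one-site
algebra $U_1 = U_{\{\kappa(1)\}}$, we replace $a_{\kappa(1),\uparrow}$, $a_{\kappa(1),\downarrow}$ and $n_{\kappa(1),\uparrow}$, $n_{\kappa(1),\downarrow}$, respectively,
by $a_\uparrow$, $a_\downarrow$ and $n_\uparrow$, $n_\downarrow$, whereas any state on $U_1$ is denoted by $\zeta$ and not by $\omega$, which
is by definition a state on more than one site (on $U_\Lambda$, $U_N$ or $U$). Important one-site
Gibbs states in our analysis are the states $\zeta_c$ associated for any $c \in \mathbb{C}$ with the
Hamiltonian $H_1(c)$ (2.1) and defined by
\[ \zeta_c(A) := \frac{\mathrm{Trace}\big(A\, e^{\beta\{(\mu-h) n_\downarrow + (\mu+h) n_\uparrow + \gamma(c\, a^*_\uparrow a^*_\downarrow + \bar{c}\, a_\downarrow a_\uparrow) - 2\lambda n_\uparrow n_\downarrow\}}\big)}{\mathrm{Trace}\big(e^{\beta\{(\mu-h) n_\downarrow + (\mu+h) n_\uparrow + \gamma(c\, a^*_\uparrow a^*_\downarrow + \bar{c}\, a_\downarrow a_\uparrow) - 2\lambda n_\uparrow n_\downarrow\}}\big)}, \tag{6.1} \]
for any $A \in U_1$. Finally, note that our notation for the Trace does not include the
Hilbert space where it is evaluated. Using the isomorphisms $U_\Lambda \simeq B(\bigwedge \mathbb{C}^{\Lambda \times \{\uparrow,\downarrow\}})$
of $C^*$-algebras, the corresponding Hilbert space is deduced from the local algebra
where the operators involved in each statement are living.
Now, we are in position to start the proof of Theorem 2.1. It is followed by a
rigorous analysis of the corresponding equilibrium and ground states.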

6.1. Thermodynamic limit of the pressure: Proof of Theorem 2.1


Since we have already shown the lower bound (2.2) in Sec. 2, to finish the proof of
Theorem 2.1 it remains to obtain
\[ \limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \le \sup_{c \in \mathbb{C}} \{-\gamma |c|^2 + p(c)\}. \tag{6.2} \]
We split this proof into several lemmata. But first, we need some additional definitions.
We define the set of all $S$-invariant even states. Let $S$ be the set of bijective
maps from $\mathbb{N}$ to $\mathbb{N}$ which leave invariant all but finitely many elements. It is a
group with respect to the composition. The condition
\[ \eta_s : a_{\kappa(l),\#} \mapsto a_{\kappa(s(l)),\#}, \quad s \in S,\ l \in \mathbb{N}, \tag{6.3} \]
defines a group homomorphism $\eta : S \to \mathrm{Aut}(U)$, $s \mapsto \eta_s$ uniquely. Here, # stands
for a spin up $\uparrow$ or down $\downarrow$. Then, let
\[ E_U^{S,+} := \{ \omega \in E_U : \omega \circ \eta_s = \omega \text{ for any } s \in S, \text{ and } \omega(a^*_{\kappa(l_1),\#} \cdots a^*_{\kappa(l_t),\#}\, a_{\kappa(m_1),\#} \cdots a_{\kappa(m_\tau),\#}) = 0 \text{ if } t + \tau \text{ is odd} \} \]
be the set of all $S$-invariant even states, where $E_U$ is the set of all states on $U$. The
set $E_U^{S,+}$ is weak$^*$-compact and convex. In particular, the set of extreme points of
$E_U^{S,+}$, denoted by $\mathcal{E}_U^{S,+}$, is not empty.
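Remark 6.1. Any permutation invariant (p.i.) state on $U$ is in fact automatically
even, see, e.g., [25, Example 5.2.21]. We explicitly write the evenness of states in
the definition of $E_U^{S,+}$ because this property is essential in our arguments below.
Now, to fix the notation and for the reader's convenience, we collect well-known
results about the so-called relative entropy, cf. [25, 40]. Let $\omega^{(1)}$ and $\omega^{(2)}$ be two
states on the local algebra $U_\Lambda$, with $\omega^{(1)}$ being faithful. Define the relative entropy$^{s}$
\[ S(\omega^{(1)} | \omega^{(2)}) := \mathrm{Trace}(D_{\omega^{(2)}} \ln D_{\omega^{(2)}}) - \mathrm{Trace}(D_{\omega^{(2)}} \ln D_{\omega^{(1)}}), \]
where $D_{\omega^{(j)}}$ is the density matrix associated to the state $\omega^{(j)}$ with $j = 1, 2$. The
relative entropy is super-additive: for any $\Lambda_1, \Lambda_2 \in I$, $\Lambda_1 \cap \Lambda_2 = \emptyset$, and for any even
states $\omega^{(1)}$, $\omega^{(2)}$, $\omega^{(1,2)}$, respectively, on $U_{\Lambda_1}$, $U_{\Lambda_2}$ and $U_{\Lambda_1 \cup \Lambda_2}$, $\omega^{(1)}$ and $\omega^{(2)}$ faithful,
we have
\[ S(\omega^{(1)} \otimes \omega^{(2)} | \omega^{(1,2)}) \ge S(\omega^{(1)} | \omega^{(1,2)}|_{U_{\Lambda_1}}) + S(\omega^{(2)} | \omega^{(1,2)}|_{U_{\Lambda_2}}). \tag{6.4} \]
For even states $\omega^{(1)}$ and $\omega^{(2)}$, respectively on $U_{\Lambda_1}$ and $U_{\Lambda_2}$ with $\Lambda_1 \cap \Lambda_2 = \emptyset$, the
even state $\omega^{(1)} \otimes \omega^{(2)}$ is the unique extension of $\omega^{(1)}$ and $\omega^{(2)}$ on $U_{\Lambda_1 \cup \Lambda_2}$ satisfying
for all $A \in U_{\Lambda_1}$ and all $B \in U_{\Lambda_2}$,
\[ \omega^{(1)} \otimes \omega^{(2)}(AB) = \omega^{(1)}(A)\, \omega^{(2)}(B). \]
The state $\omega^{(1)} \otimes \omega^{(2)}$ is called the product of $\omega^{(1)}$ and $\omega^{(2)}$. The product of even states
is an associative operation. In particular, products of even states can be defined with
respect to any countable set $\{U_{\Lambda_n}\}_{n \in \mathbb{N}}$ of subalgebras of $U$ with $\Lambda_m \cap \Lambda_n = \emptyset$ for
$m \neq n$.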
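Observe that the relative entropy becomes additive with respect to product
states: if $\omega^{(1,2)} = \hat{\omega}^{(1)} \otimes \hat{\omega}^{(2)}$, where $\hat{\omega}^{(1)}$ and $\hat{\omega}^{(2)}$ are two even states respectively
on $U_{\Lambda_1}$ and $U_{\Lambda_2}$, then (6.4) is satisfied with equality. The relative entropy is also
convex: for any states $\omega^{(1)}$, $\omega^{(2)}$, and $\omega^{(3)}$ on $U_\Lambda$, $\omega^{(1)}$ faithful, and for any $\tau \in (0, 1)$
\[ S(\omega^{(1)} | \tau\omega^{(2)} + (1 - \tau)\omega^{(3)}) \le \tau S(\omega^{(1)} | \omega^{(2)}) + (1 - \tau) S(\omega^{(1)} | \omega^{(3)}). \tag{6.5} \]
Meanwhile
\[ S(\omega^{(1)} | \tau\omega^{(2)} + (1 - \tau)\omega^{(3)}) \ge \tau \log \tau + (1 - \tau) \log(1 - \tau) + \tau S(\omega^{(1)} | \omega^{(2)}) + (1 - \tau) S(\omega^{(1)} | \omega^{(3)}), \tag{6.6} \]
for any $\tau \in (0, 1)$. Note that the relative entropy makes sense in a class of states on
$U$ much larger than that of even states on $U$ (cf. [40]), but this is not needed here.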
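The convexity bound (6.5), the lower bound (6.6) and the additivity for product states can be checked numerically in the simplest commuting case, where all density matrices are diagonal and the relative entropy reduces to a sum over probabilities. The sketch below is only an illustration under that assumption, with the sign convention in which the second argument carries the weights; the helper names and the probability vectors in the usage note are made up for the example.

```python
import math


def relative_entropy(p1, p2):
    """S(w1|w2) = sum_i p2_i (ln p2_i - ln p1_i) for diagonal densities; p1 must be faithful."""
    return sum(q * (math.log(q) - math.log(p)) for p, q in zip(p1, p2) if q > 0.0)


def mix(tau, p2, p3):
    """Convex combination tau * p2 + (1 - tau) * p3 of two probability vectors."""
    return [tau * a + (1.0 - tau) * b for a, b in zip(p2, p3)]


def product(p, q):
    """Probability vector of the product state on the joint system."""
    return [a * b for a in p for b in q]


def convexity_bounds(tau, p1, p2, p3):
    """Return (lower, value, upper) so that (6.6) and (6.5) read lower <= value <= upper."""
    value = relative_entropy(p1, mix(tau, p2, p3))
    upper = tau * relative_entropy(p1, p2) + (1.0 - tau) * relative_entropy(p1, p3)
    lower = upper + tau * math.log(tau) + (1.0 - tau) * math.log(1.0 - tau)
    return lower, value, upper
```

For instance, `convexity_bounds(0.3, [0.25] * 4, [0.7, 0.1, 0.1, 0.1], [0.1, 0.2, 0.3, 0.4])` returns three increasing numbers, and the relative entropy of two product vectors equals the sum of the relative entropies of their factors up to rounding.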
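The condition
\[ \sigma : a_{\kappa(l),\#} \mapsto a_{\kappa(l+1),\#} \]
uniquely defines a homomorphism $\sigma$ on $U$ called right-shift homomorphism. Any
state $\omega$ on $U$ such that $\omega = \omega \circ \sigma$ is called shift-invariant and we denote by $E_U^{\sigma}$ the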

$^{s}$ As in [40] we use the Araki-Kosaki definition, which has opposite sign than the one given in [25].
set of shift-invariant states on $U$. An important class of shift-invariant states are
product states $\omega_\zeta$ obtained by copying some even state $\zeta$ of the one-site algebra
$U_1$ on all other sites, i.e.


:= k . (6.7)
k=0

Such product states are important and used below as reference states. More generally,
a state $\omega$ is $L$-periodic with $L \in \mathbb{N}$ if $\omega = \omega \circ \sigma^L$. For each $L \in \mathbb{N}$, the set of
all $L$-periodic states from $E_U$ is denoted by $E_U^{\sigma^L}$.
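Let $\zeta$ be any faithful even state on $U_1$ and let $\omega$ be any $L$-periodic state on $U$.
It immediately follows from super-additivity (6.4) that for any $N, M \in \mathbb{N}$
\[ S(\omega_\zeta|_{U_{(M+N)L}} \,|\, \omega|_{U_{(M+N)L}}) \ge S(\omega_\zeta|_{U_{ML}} \,|\, \omega|_{U_{ML}}) + S(\omega_\zeta|_{U_{NL}} \,|\, \omega|_{U_{NL}}). \]
In particular, the following limit exists
\[ \tilde{S}(\zeta, \omega) := \lim_{N \to \infty} \frac{1}{NL} S(\omega_\zeta|_{U_{NL}} \,|\, \omega|_{U_{NL}}) = \sup_{N \in \mathbb{N}} \frac{1}{NL} S(\omega_\zeta|_{U_{NL}} \,|\, \omega|_{U_{NL}}) \tag{6.8} \]
and is the relative entropy density of $\omega$ with respect to the reference state $\omega_\zeta$. This
functional has the following important properties: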

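Lemma 6.1 (Properties of the Relative Entropy Density). At any fixed
$L \in \mathbb{N}$, the relative entropy density functional $\omega \mapsto \tilde{S}(\zeta, \omega)$ is lower
weak$^*$-semicontinuous, i.e. for any faithful even state $\zeta \in E_{U_1}$ and any $r \in \mathbb{R}$, the set
\[ M_r := \{ \omega \in E_U^{\sigma^L} : \tilde{S}(\zeta, \omega) > r \} \]
is open with respect to the weak$^*$-topology. It is also affine, i.e. for any faithful state
$\zeta \in E_{U_1}$ and states $\omega, \omega' \in E_U^{\sigma^L}$
\[ \tilde{S}(\zeta, \tau\omega + (1 - \tau)\omega') = \tau \tilde{S}(\zeta, \omega) + (1 - \tau) \tilde{S}(\zeta, \omega'), \]
with $\tau \in (0, 1)$.

Proof. Without loss of generality, let $L = 1$. From the second equality of (6.8),
\[ M_r = \bigcup_{N \in \mathbb{N}} \{ \omega \in E_U^{\sigma} : S(\omega_\zeta|_{U_N} \,|\, \omega|_{U_N}) > rN \}. \]
As the maps $\omega \mapsto S(\omega_\zeta|_{U_N} \,|\, \omega|_{U_N})$ are weak$^*$-continuous for each $N$, it follows that
$M_r$ is the union of open sets, which implies the lower weak$^*$-semicontinuity of the
relative entropy density functional. Moreover from (6.5) and (6.6) we directly obtain
that $\omega \mapsto \tilde{S}(\zeta, \omega)$ is affine.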

Notice that any p.i. state is automatically shift-invariant. Thus, the mean
relative entropy density is a well-defined functional on $E_U^{S,+}$. Now, we need to
define on $E_U^{S,+}$ the functional $\Delta(\omega)$ relating to the mean BCS interaction energy
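per site:
Lemma 6.2 (BCS Energy per Site for p.i. States). For any $\omega \in E_U^{S,+}$, the
mean BCS interaction energy per site in the thermodynamic limit
\[ \Delta(\omega) := \lim_{N \to \infty} \frac{1}{N^2} \sum_{l,m=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow}) \]
is well-defined and the affine map $\Delta : E_U^{S,+} \to \mathbb{C}$, $\omega \mapsto \Delta(\omega)$ is weak$^*$-continuous.

Proof. First,
\[ \sum_{l,m=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \sum_{l=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) + \sum_{\substack{l,m=1 \\ l \neq m}}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}). \tag{6.9} \]
Since $\omega \in E_U^{S,+}$, for any $l \neq m$ observe that
\[ \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow}), \tag{6.10} \]
whereas
\[ \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(1),\downarrow} a_{\kappa(1),\uparrow}). \tag{6.11} \]
Therefore, by combining (6.9) with (6.10) and (6.11), the lemma follows.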
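The counting behind this last step can be written out explicitly. In the sketch below, $A$ and $B$ are shorthand introduced here (not notation from the paper) for the common value of the $N$ diagonal terms in (6.11) and the common value of the $N(N-1)$ off-diagonal terms in (6.10), respectively.

```latex
\[
  \frac{1}{N^{2}} \sum_{l,m=1}^{N} (\text{term}_{l,m})
  = \frac{1}{N^{2}} \bigl( N A + N(N-1) B \bigr)
  = B + \frac{A - B}{N}
  \;\xrightarrow[N \to \infty]{}\; B .
\]
```

Since $A$ and $B$ are expectation values of bounded operators in a state, the correction $(A - B)/N$ vanishes in the limit, and the limit is the expectation of one fixed element of the algebra, which also makes the weak$^*$-continuity immediate.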

Now, we define by
\[ \omega^{H}(A) := \frac{\mathrm{Trace}(A\, e^{-\beta H})}{\mathrm{Trace}(e^{-\beta H})}, \quad A \in U_\Lambda, \tag{6.12} \]
the Gibbs state associated with any self-adjoint element $H$ of $U_\Lambda$ at inverse temperature
$\beta > 0$. This definition is of course in accordance with the Gibbs state $\omega_N$
(1.6) associated with the Hamiltonian$^{t}$ $H_N$ (1.2) since $\omega_N = \omega^{H_N}$ for any $N \in \mathbb{N}$.
Note however, that the state $\omega_N$ is seen either as defined on the local algebra $U_N$
or as defined on the whole algebra $U$ by periodically extending it (with period $N$).
Next we give an important property of Gibbs states (6.12):
Lemma 6.3 (Passivity of Gibbs States). Let $H_0$, $H_1$ be self-adjoint elements
from $U_\Lambda$ and define for any state $\omega$ on $U_\Lambda$
\[ F(\omega) := -\omega(H_1) - \beta^{-1} S(\omega^{H_0} | \omega) + P^{H_0}, \]

$^{t}$ With the appropriate numbering of sites defined by the bijective map $\kappa$.


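where $P^{H} := \beta^{-1} \ln \mathrm{Trace}(e^{-\beta H})$ for any self-adjoint $H \in U_\Lambda$. Then
$P^{H_1+H_0} \ge F(\omega)$ for any state $\omega$ on $U_\Lambda$ with equality if $\omega = \omega^{H_0+H_1}$. Note that
$-F(\omega)$ is the free energy associated with the state $\omega$.

Proof. For any self-adjoint $H \in U_\Lambda$ and any state $\omega$ on $U_\Lambda$ observe that
\[ \mathrm{Trace}(D_\omega \ln D_{\omega^{H}}) = \mathrm{Trace}(D_\omega \ln(\exp(-\beta P^{H} - \beta H))) = -\beta\omega(H) - \beta P^{H}, \tag{6.13} \]
which implies that
\[ P^{H_1+H_0} = -\beta^{-1} \big( \mathrm{Trace}(D_{\omega^{H_0+H_1}} \ln D_{\omega^{H_0+H_1}}) - \mathrm{Trace}(D_{\omega^{H_0+H_1}} \ln D_{\omega^{H_0}}) \big) - \omega^{H_0+H_1}(H_1) + P^{H_0}, \tag{6.14} \]
i.e. $P^{H_1+H_0} = F(\omega^{H_0+H_1})$. Without loss of generality take any faithful state $\omega$ on
$U_\Lambda$. In this case, there are positive numbers $\lambda_j$ with $\sum_j \lambda_j = 1$ and vectors $|j\rangle$ from
the Hilbert space $\mathcal{H}_\Lambda$ such that $\omega(\cdot) = \sum_j \lambda_j \langle j| \cdot |j\rangle$. In particular, from (6.13)
we have
\[ -\beta\omega(H_1) - S(\omega^{H_0} | \omega) + \beta P^{H_0} = \sum_j \lambda_j \big( -\ln \lambda_j - \beta \langle j| H_0 + H_1 |j\rangle \big). \]
Consequently, by convexity of the exponential function combined with Jensen's
inequality we obtain that
\[ \exp\big( -\beta\omega(H_1) - S(\omega^{H_0} | \omega) + \beta P^{H_0} \big) \le \sum_j \lambda_j \exp\big( -\ln \lambda_j - \beta \langle j| H_0 + H_1 |j\rangle \big) \le \mathrm{Trace}(\exp(-\beta(H_0 + H_1))) = \exp(\beta P^{H_1+H_0}). \]
Note that the last inequality uses the so-called Peierls-Bogoliubov inequality which
is again a consequence of Jensen's inequality.

This proof is standard (see, e.g., [25]). It is only given in detail here, because we
also need later Eqs. (6.13) and (6.14).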
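A quick numerical sanity check of this variational bound is possible in the commuting case. The sketch below assumes that the total Hamiltonian and the state are diagonal in the same basis, so that, after unwinding (6.13), the functional reduces to minus the mean energy plus the entropy divided by the inverse temperature. The function names and the energy levels in the usage note are hypothetical.

```python
import math


def pressure(energies, beta):
    """P = (1/beta) * ln(sum_i exp(-beta * E_i)) for a diagonal Hamiltonian."""
    return math.log(sum(math.exp(-beta * e) for e in energies)) / beta


def gibbs_weights(energies, beta):
    """Boltzmann weights exp(-beta * E_i) / Z of the Gibbs state."""
    z = sum(math.exp(-beta * e) for e in energies)
    return [math.exp(-beta * e) / z for e in energies]


def functional(weights, energies, beta):
    """Minus the mean energy plus entropy / beta for a diagonal state with the given weights."""
    mean_energy = sum(p * e for p, e in zip(weights, energies))
    entropy = -sum(p * math.log(p) for p in weights if p > 0.0)
    return -mean_energy + entropy / beta
```

With energy levels such as `[0.0, -0.4, 0.3, 1.1]` and an inverse temperature of 2, the functional evaluated at the Gibbs weights reproduces the pressure up to rounding, while any other probability vector gives a smaller value.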
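Observe that Lemma 6.3 applied to $\omega = \omega^{H_0}$ gives the Bogoliubov (convexity)
inequality [29]. We can also deduce from this lemma that the pressure
$p_N(\beta, \mu, \lambda, \gamma, h)$ (1.4) associated with $H_N$ equals
\[ p_N(\beta, \mu, \lambda, \gamma, h) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega_N(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) - \frac{1}{\beta N} S(\omega_{\zeta_0}|_{U_N} \,|\, \omega_N) + p_N(\beta, \mu, \lambda, 0, h), \tag{6.15} \]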
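for any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$. Recall that $\omega_{\zeta_0}$ is the shift-invariant state
obtained by copying the state $\zeta_0$ (6.1) of the one-site algebra $U_1$, see (6.7).
Lemma 6.4 (From $S$ to the Relative Entropy Density $\tilde{S}$ at Finite $N$). Let
$\tilde{\omega}_N$ be the shift-invariant state defined by
\[ \tilde{\omega}_N := \frac{1}{N} (\omega_N + \omega_N \circ \sigma + \cdots + \omega_N \circ \sigma^{N-1}), \]
where $\sigma$ is the right-shift homomorphism. Then $S(\omega_{\zeta_0}|_{U_N} \,|\, \omega_N) = N \tilde{S}(\zeta_0, \tilde{\omega}_N)$,
cf. (6.8).

Proof. By Lemma 6.1 combined with (6.8), the relative entropy density $\tilde{S}(\zeta_0, \tilde{\omega}_N)$
equals
\[ \tilde{S}(\zeta_0, \tilde{\omega}_N) = \lim_{M \to \infty} \Big\{ \frac{1}{MN} \frac{1}{N} \sum_{k=0}^{N-1} S(\omega_{\zeta_0}|_{U_{MN}} \,|\, \omega_N \circ \sigma^{k}|_{U_{MN}}) \Big\}, \tag{6.16} \]
for any fixed $N \in \mathbb{N}$. By using now the additivity of the relative entropy for product
states observe that
\[ S(\omega_{\zeta_0}|_{U_{MN}} \,|\, \omega_N \circ \sigma^{k}|_{U_{MN}}) = (M - 1) S(\omega_{\zeta_0}|_{U_N} \,|\, \omega_N|_{U_N}) + S(\omega_{\zeta_0}|_{U_k} \,|\, \omega_N|_{U_k}) + S(\omega_{\zeta_0}|_{U_{N-k}} \,|\, \omega_N|_{U_{N-k}}), \tag{6.17} \]
for any $k \in \{0, \ldots, N - 1\}$, with $S(\omega_{\zeta_0}|_{U_0} \,|\, \omega_N|_{U_0}) := 0$ by definition. Therefore the
equality $S(\omega_{\zeta_0}|_{U_N} \,|\, \omega_N) = N \tilde{S}(\zeta_0, \tilde{\omega}_N)$ directly follows from (6.16) combined with
(6.17).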

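We are now in position to give a first general upper bound for the pressure
$p_N(\beta, \mu, \lambda, \gamma, h)$ by using the equality (6.15) together with Lemmas 6.2 and 6.4.
Lemma 6.5 (General Upper Bound of the Pressure $p_N$). For any $\beta, \gamma > 0$
and $\mu, \lambda, h \in \mathbb{R}$, one gets that
\[ \limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \le p(\beta, \mu, \lambda, 0, h) + \sup_{\omega \in \mathcal{E}_U^{S,+}} \{ \gamma\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega) \}, \]
where we recall that $\mathcal{E}_U^{S,+}$ is the non empty set of extreme points of $E_U^{S,+}$.

Proof. By (6.15) combined with Lemma 6.4 one gets
\[ p_N(\beta, \mu, \lambda, \gamma, h) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega_N(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) - \beta^{-1} \tilde{S}(\zeta_0, \tilde{\omega}_N) + p_N(\beta, \mu, \lambda, 0, h). \tag{6.18} \]
The last term of this equality is independent of $N \in \mathbb{N}$ since
\[ p_N(\beta, \mu, \lambda, 0, h) = \frac{1}{\beta} \ln \mathrm{Trace}\big( e^{\beta[(\mu-h) n_\downarrow + (\mu+h) n_\uparrow - 2\lambda n_\uparrow n_\downarrow]} \big) =: p(\beta, \mu, \lambda, 0, h), \tag{6.19} \]
cf. (2.3).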

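However, the other terms require the knowledge of the states $\omega_N$ and $\tilde\omega_N$ in the
limit $N\to\infty$. Actually, because the unit ball in $\mathcal{U}^*$ is a metric space with respect
to the weak$^*$-topology, the sequence $\{\omega_N\}$ converges in the weak$^*$-topology along
a subsequence towards $\omega_\infty$. Meanwhile, it is easy to see that for all $A\in\mathcal{U}_\Lambda$, $\Lambda\in I$,
$$\lim_{N\to\infty}\{\omega_N(A)-\tilde\omega_N(A)\} = 0.$$
Thus, the sequences of states $\omega_N$ and $\tilde\omega_N$ have the same limit points. Since $\omega_N$
is even and permutation invariant with respect to the $N$ first sites, the state $\omega_\infty$
belongs to $E_{\mathcal{U}}^{S,+}$. We now estimate the first term of (6.18) as in Lemma 6.2 to get
$$\limsup_{N\to\infty}\{p_N(\beta,\mu,\lambda,\gamma,h)\} \le p(\beta,\mu,\lambda,0,h) + \gamma\,\omega_\infty\big(a^*_{\kappa(1),\uparrow}a^*_{\kappa(1),\downarrow}a_{\kappa(2),\downarrow}a_{\kappa(2),\uparrow}\big) + \beta^{-1}\limsup_{N\to\infty}\{-\tilde S(\omega_{\zeta_0},\tilde\omega_N)\}. \tag{6.20}$$
From Lemma 6.1 the relative entropy density is lower semicontinuous in the weak$^*$-topology, which implies that
$$\limsup_{N\to\infty}\{-\tilde S(\omega_{\zeta_0},\tilde\omega_N)\} \le -\tilde S(\omega_{\zeta_0},\omega_\infty).$$
By combining this last inequality with (6.20) we then find that
$$\limsup_{N\to\infty}\{p_N(\beta,\mu,\lambda,\gamma,h)\} \le p(\beta,\mu,\lambda,0,h) + \gamma\Delta(\omega_\infty) - \beta^{-1}\tilde S(\omega_{\zeta_0},\omega_\infty), \tag{6.21}$$
with $\omega_\infty\in E_{\mathcal{U}}^{S,+}$.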
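Now, from Lemma 6.2 the functional $\rho\mapsto\Delta(\rho)$ is affine and weak$^*$-continuous,
whereas by Lemma 6.1 the map $\rho\mapsto\tilde S(\omega_{\zeta_0},\rho)$ is affine and lower weak$^*$-semicontinuous. The free energy functional $\rho\mapsto\gamma\Delta(\rho)-\beta^{-1}\tilde S(\omega_{\zeta_0},\rho)$ is, in particular, convex and upper weak$^*$-semicontinuous. Meanwhile recall that $E_{\mathcal{U}}^{S,+}$ is a
weak$^*$-compact and convex set. Therefore, from the Bauer maximum principle [32,
Lemma 4.1.12] it follows that
$$\sup_{\rho\in E_{\mathcal{U}}^{S,+}}\{\gamma\Delta(\rho)-\beta^{-1}\tilde S(\omega_{\zeta_0},\rho)\} = \sup_{\rho\in\mathcal{E}_{\mathcal{U}}^{S,+}}\{\gamma\Delta(\rho)-\beta^{-1}\tilde S(\omega_{\zeta_0},\rho)\}. \tag{6.22}$$
Together with (6.21), this last equality implies the upper bound stated in the
lemma.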
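The role played by the Bauer maximum principle in (6.22) can be illustrated in finite dimensions. The following is only an analogy under stated assumptions, not the infinite-dimensional statement used in the proof: the convex function `f`, the helper `random_simplex_point` and the list `VERTICES` are hypothetical names introduced here. The sketch checks numerically that a convex function on the probability simplex of $\mathbb{R}^3$ never exceeds, at sampled points, its maximum over the three vertices, i.e. over the extreme points.

```python
import random

# Extreme points (vertices) of the probability simplex in R^3.
VERTICES = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]


def f(x):
    """A hypothetical convex function: an affine part plus a squared linear form."""
    affine = 0.3 * x[0] - 0.2 * x[1] + 0.1 * x[2]
    linear = x[0] - 2.0 * x[1] + 0.5 * x[2]
    return affine + linear ** 2


def random_simplex_point(rng):
    """Draw a point of the simplex by normalising strictly positive weights."""
    weights = [rng.random() + 1e-12 for _ in range(3)]
    total = sum(weights)
    return tuple(w / total for w in weights)


def max_over_vertices():
    """Maximum of f over the extreme points only."""
    return max(f(v) for v in VERTICES)


def max_over_samples(n_samples=10000, seed=0):
    """Maximum of f over randomly sampled points of the whole simplex."""
    rng = random.Random(seed)
    return max(f(random_simplex_point(rng)) for _ in range(n_samples))
```

Comparing `max_over_samples()` with `max_over_vertices()` shows the sampled supremum staying below the value attained at a vertex, which is the finite-dimensional shadow of replacing the supremum over all states by the supremum over extreme states.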

Since even states on $\mathcal{U}$ are entirely determined by their action on even elements
from $\mathcal{U}$, observe that we can identify the set of even p.i. states of $\mathcal{U}$ with the set
of p.i. states on the even sub-algebra $\mathcal{U}^+$. We want to show next that the set of
extreme points $\mathcal{E}_{\mathcal{U}}^{S,+}$ belongs to the set of strongly clustering states on the even
sub-algebra $\mathcal{U}^+$ of $\mathcal{U}$. By strongly clustering states $\rho$ with respect to $\mathcal{U}^+$, we mean
that for any $B$ in $\mathcal{U}^+$, there exists a net $\{B_j\}\subseteq\mathrm{Co}\{\eta_s(B) : s\in S\}$ such that for
any $A\in\mathcal{U}^+$,
$$\lim_{j}|\rho(A\,\eta_s(B_j))-\rho(A)\rho(B)| = 0$$
uniformly in $s\in S$. Here, $\mathrm{Co}\,M$ denotes the convex hull of the set $M$.



Lemma 6.6 (Characterization of the Set of Extreme States of $E_{\mathcal{U}}^{S,+}$). Any
extreme state $\rho\in\mathcal{E}_{\mathcal{U}}^{S,+}$ is strongly clustering with respect to the even sub-algebra
$\mathcal{U}^+$ and conversely.

Proof. We use some standard facts about extreme decompositions of states which
can be found in [32, Theorems 4.3.17 and 4.3.22]. To satisfy the requirements of
these theorems, we need to prove that the $C^*$-algebra $\mathcal{U}^+$ of even elements of $\mathcal{U}$ is
asymptotically abelian with respect to the action of the group $S$. This is proven as
follows. For each $l\in\mathbb{N}$ define the map $\pi^{(l)}:\mathbb{N}\to\mathbb{N}$ by
$$\pi^{(l)}(k) := \begin{cases} k+2^{l-1}, & \text{if } 1\le k\le 2^{l-1},\\ k-2^{l-1}, & \text{if } 2^{l-1}+1\le k\le 2^{l},\\ k, & \text{if } k>2^{l}. \end{cases} \tag{6.23}$$
In other words, the map $\pi^{(l)}$ exchanges the block $\{1,\dots,2^{l-1}\}$ with $\{2^{l-1}+1,\dots,2^{l}\}$, and leaves the rest invariant. For any $A,B\in\mathcal{U}_\Lambda\cap\mathcal{U}^+$ with $\Lambda\in I$,
it is then not difficult to see that
$$\lim_{l\to\infty}[A,\eta_{\pi^{(l)}}(B)] = 0$$
in the norm sense. Recall that the map $\eta_{\pi^{(l)}}$ is defined via (6.3). By density of
local elements of $\mathcal{U}^+$ the limit above equals zero for all $A,B\in\mathcal{U}^+$. Therefore, by
using now [32, Theorems 4.3.17 and 4.3.22] all states $\rho\in\mathcal{E}_{\mathcal{U}}^{S,+}$ are then strongly
clustering with respect to $\mathcal{U}^+$ and conversely.
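The block exchange described by (6.23) is easy to experiment with. The sketch below is a minimal illustration, with the function names `block_swap` and `moved_support` chosen here for convenience (they do not appear in the paper). It implements the permutation of $\mathbb{N}$ that swaps $\{1,\dots,2^{l-1}\}$ with $\{2^{l-1}+1,\dots,2^{l}\}$ and shows that, once $2^{l-1}\ge N$, a set of sites contained in $\{1,\dots,N\}$ is moved entirely outside $\{1,\dots,N\}$, which is the mechanism behind the vanishing commutator.

```python
def block_swap(l, k):
    """Permutation of (6.23): swap {1..2^(l-1)} with {2^(l-1)+1..2^l}, fix k > 2^l."""
    if l < 1 or k < 1:
        raise ValueError("l and k must be positive integers")
    half = 2 ** (l - 1)
    if k <= half:
        return k + half
    if k <= 2 * half:
        return k - half
    return k


def moved_support(l, support):
    """Image of a finite set of sites under the block exchange."""
    return sorted(block_swap(l, k) for k in support)
```

For instance, with $N = 3$ and $l = 3$ the sites $\{1,2,3\}$ are sent to $\{5,6,7\}$, so an even element supported on the first three sites commutes with the image of any other such element.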

We show next that p.i. states, which are strongly clustering with respect to
the even sub-algebra $\mathcal{U}^+$, have clustering properties with respect to the whole
algebra $\mathcal{U}$.
Lemma 6.7 (Extension of the Strongly Clustering Property). Let $\rho\in E_{\mathcal{U}}^{S,+}$
be any strongly clustering state with respect to $\mathcal{U}^+$. Then, for any $A,B\in\mathcal{U}$ and
$\varepsilon>0$, there are $B_\varepsilon\in\mathrm{Co}\{\eta_s(B): s\in S\}$ and $l_\varepsilon$ such that for any $l\ge l_\varepsilon$,
$$|\rho(A\,\eta_{\pi^{(l)}}(B_\varepsilon))-\rho(A)\rho(B)| < \varepsilon.$$

Proof. By density of local elements it suffices to prove the lemma for any $A,B\in\mathcal{U}_N$ and $N\in\mathbb{N}$. The operators $A$ and $B$ can always be written as sums $A = A^+ + A^-$
and $B = B^+ + B^-$, where $A^+$ and $B^+$ are in the even sub-algebra $\mathcal{U}^+$ whereas
$A^-$ and $B^-$ are odd elements, i.e. they are sums of monomials of odd degree in
annihilation and creation operators. Since $\rho$ is assumed to be strongly clustering
with respect to $\mathcal{U}^+$, for any $\varepsilon>0$ there are positive numbers $\theta_1,\dots,\theta_k$ with
$\theta_1+\cdots+\theta_k = 1$, and maps $s_1,\dots,s_k\in S$ such that for any $l\in\mathbb{N}$,
$$\Big|\rho\Big(A^+\,\eta_{\pi^{(l)}}\Big(\sum_{j=1}^{k}\theta_j\,\eta_{s_j}(B^+)\Big)\Big)-\rho(A^+)\rho(B^+)\Big| < \varepsilon. \tag{6.24}$$

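By parity and linearity of $\rho$ observe that $\rho(A^+)\rho(B^+) = \rho(A)\rho(B)$, whereas
$$\rho(A\,\eta_{\pi^{(l)}}(B_\varepsilon)) = \rho\Big(A^+\,\eta_{\pi^{(l)}}\Big(\sum_{j=1}^{k}\theta_j\,\eta_{s_j}(B^+)\Big)\Big) \tag{6.25}$$
for large enough $l$ with the operator $B_\varepsilon\in\mathrm{Co}\{\eta_s(B): s\in S\}$ defined by
$$B_\varepsilon := \sum_{j=1}^{k}\theta_j\,\eta_{s_j}(B). \tag{6.26}$$
The equality (6.25) follows from parity and the statement
$$\rho(A\,\eta_{\pi^{(l)}}(B^-)) = 0$$
for any $\rho\in E_{\mathcal{U}}^{S,+}$, $A,B^-\in\mathcal{U}_N$, $B^-$ odd, and sufficiently large $l$. This can be seen
as follows. Since any element of $\mathcal{U}_N$ with defined parity can be written as a linear
combination of two self-adjoint elements with same parity, we assume without loss
of generality that $(B^-)^* = B^-$. Choose $l_0\in\mathbb{N}$ large enough such that the support
of $B_l^- := \eta_{\pi^{(l)}}(B^-)$ does not intersect $\{\kappa(1),\dots,\kappa(N)\}$ for all $l\ge l_0$. The map $\pi^{(l)}:\mathbb{N}\to\mathbb{N}$ is defined by (6.23). Define $B^-_{l,m} := \sigma^{m2^{l+1}}(B^-_l)$, $m\in\mathbb{N}_0 := \{0,1,2,\dots\}$,
where $\sigma$ is the right-shift homomorphism. For any $J\in\mathbb{N}$,
$$\rho\Big(A\sum_{m=0}^{J}B^-_{l,m}\Big) = (J+1)\,\rho(AB^-_{l,0})$$
by symmetry of $\rho$. Use now the Cauchy-Schwarz inequality for states to get
$$(J+1)\,|\rho(AB^-_{l,0})| \le \sqrt{\rho(A^*A)}\,\sqrt{\sum_{m,m'=0}^{J}\rho(B^-_{l,m}B^-_{l,m'})}.$$
Since per construction, $B^-_{l,m}$ and $B^-_{l,m'}$ anti-commute if $m\neq m'$,
$$\sum_{m,m'=0}^{J}\rho(B^-_{l,m}B^-_{l,m'}) = \sum_{m=0}^{J}\rho(B^-_{l,m}B^-_{l,m}).$$
By symmetry of $\rho$, the right-hand side of the equation above equals $(J+1)\rho((B^-_{l,0})^2)$. Hence, we conclude that
$$|\rho(AB^-_{l,0})| \le (J+1)^{-1/2}\sqrt{\rho(|A|^2)\,\rho((B^-_{l,0})^2)},$$
for any $J\in\mathbb{N}$, i.e. $\rho(AB^-_{l,0}) = 0$ for all $l\ge l_0$.
Therefore, the lemma follows from (6.24) and (6.25) with $B_\varepsilon\in\mathrm{Co}\{\eta_s(B) : s\in S\}$ defined by (6.26) for any $\varepsilon>0$.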
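The step from anti-commutation to the vanishing of the odd contribution can be spelled out in a few lines. The following is a sketch in shorthand notation introduced here only for readability: $\rho$ stands for the permutation invariant state and $B_m$, $m = 0,\dots,J$, for the shifted self-adjoint odd elements, which pairwise anti-commute for $m\neq m'$.

$$\begin{aligned}
& B_mB_{m'} = -B_{m'}B_m \ (m\neq m') \ \Longrightarrow\ \rho(B_mB_{m'})+\rho(B_{m'}B_m) = 0,\\
& \sum_{m,m'=0}^{J}\rho(B_mB_{m'}) = \sum_{m=0}^{J}\rho(B_m^2) = (J+1)\,\rho(B_0^2),\\
& (J+1)\,|\rho(AB_0)| \le \sqrt{\rho(|A|^2)}\,\sqrt{(J+1)\,\rho(B_0^2)}
\ \Longrightarrow\ |\rho(AB_0)| \le (J+1)^{-1/2}\sqrt{\rho(|A|^2)\,\rho(B_0^2)} \xrightarrow[J\to\infty]{} 0 .
\end{aligned}$$

The off-diagonal terms cancel in pairs, so the double sum grows only linearly in $J$ while the left-hand side grows like $J+1$; since $J$ is arbitrary, the expectation $\rho(AB_0)$ must vanish.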

We now identify the set of clustering states on U with the set of product states
by the following lemma, which is a non-commutative version of de Finetti Theorem

of probability theory [28]. Størmer [1] was the first to show the corresponding result
for infinite tensor products of $C^*$-algebras.
Lemma 6.8 (Strongly Clustering p.i. States are Product States). Any
p.i. and strongly clustering (in the sense of Lemma 6.7) state $\rho$ is a product state
(6.7) with the one-site state $\zeta = \zeta_\rho := \rho|_{\mathcal{U}_1}$ being the restriction of $\rho$ on the local
(one-site) algebra $\mathcal{U}_1$.
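Proof. Let $l_1,\dots,l_k\in\mathbb{N}$ with $l_i\neq l_j$ whenever $i\neq j$, and for any $j\in\{1,\dots,k\}$
take $A_j\in\mathcal{U}_1$. To prove the lemma we need to show that
$$\rho(\iota_{l_1}(A_1)\cdots\iota_{l_k}(A_k)) = \zeta_\rho(A_1)\cdots\zeta_\rho(A_k). \tag{6.27}$$
The proof of this last equality for any $k\ge 1$ is performed by induction. First, for
$k = 1$ the equality (6.27) immediately follows by symmetry of the state $\rho$. Now,
assume the equality (6.27) verified at fixed $k\ge 1$. The state $\rho$ is strongly clustering
in the sense of Lemma 6.7. Therefore for each $\varepsilon>0$ there are $q\in\mathbb{N}$, positive
numbers $\theta_1,\dots,\theta_q$ with $\theta_1+\cdots+\theta_q = 1$, and maps $s_1,\dots,s_q\in S$ such that
$$\Big|\sum_{j=1}^{q}\theta_j\,\rho\big(\iota_{l_1}(A_1)\cdots\iota_{l_k}(A_k)\,\eta_{\pi^{(l)}}\circ\eta_{s_j}(\iota_{l_{k+1}}(A_{k+1}))\big) - \rho(\iota_{l_1}(A_1)\cdots\iota_{l_k}(A_k))\,\rho(\iota_{l_{k+1}}(A_{k+1}))\Big| < \varepsilon, \tag{6.28}$$
for any $l\in\mathbb{N}$. Fix $N$ sufficiently large such that the operators $\iota_{l_m}(A_m)$ and
$\eta_{s_j}(\iota_{l_{k+1}}(A_{k+1}))$ belong to $\mathcal{U}_N$ for any $m\in\{1,\dots,k+1\}$ and $j\in\{1,\dots,q\}$.
Choose $l_\varepsilon\in\mathbb{N}$ large enough such that the support of $\eta_{\pi^{(l)}}\circ\eta_{s_j}(\iota_{l_{k+1}}(A_{k+1}))$ does not
intersect $\{\kappa(1),\dots,\kappa(N)\}$ for all $l\ge l_\varepsilon$ and $j\in\{1,\dots,q\}$, which by symmetry of
$\rho$ implies that
$$\rho\big(\iota_{l_1}(A_1)\cdots\iota_{l_k}(A_k)\,\eta_{\pi^{(l)}}\circ\eta_{s_j}(\iota_{l_{k+1}}(A_{k+1}))\big) = \rho\big(\iota_{l_1}(A_1)\cdots\iota_{l_k}(A_k)\,\iota_{l_{k+1}}(A_{k+1})\big).$$
Combined with (6.28) and $\theta_1+\cdots+\theta_q = 1$, it yields
$$|\rho(\iota_{l_1}(A_1)\cdots\iota_{l_k}(A_k)\,\iota_{l_{k+1}}(A_{k+1})) - \rho(\iota_{l_1}(A_1)\cdots\iota_{l_k}(A_k))\,\zeta_\rho(A_{k+1})| < \varepsilon.$$
Since the equality (6.27) is assumed to be verified at fixed $k\ge 1$, it follows that
$$|\rho(\iota_{l_1}(A_1)\cdots\iota_{l_{k+1}}(A_{k+1})) - \zeta_\rho(A_1)\cdots\zeta_\rho(A_{k+1})| < \varepsilon,$$
for any $\varepsilon>0$. In other words, by induction the equality (6.27) is proven for any
$k\ge 1$.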
As far as the upper bound is concerned, we combine Lemma 6.5 with Lemmas 6.6-6.8 to obtain that
$$\limsup_{N\to\infty}\{p_N(\beta,\mu,\lambda,\gamma,h)\} \le p(\beta,\mu,\lambda,0,h) + \sup_{\zeta\in E^+_{\mathcal{U}_1}}\{\gamma|\zeta(a_\downarrow a_\uparrow)|^2-\beta^{-1}S(\zeta_0|\zeta)\}. \tag{6.29}$$
Here $E^+_{\mathcal{U}_1}$ denotes the set of even states on the (one-site) algebra $\mathcal{U}_1$. Now the
proof of the upper bound (6.2) easily follows from the passivity of Gibbs states on

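$\mathcal{U}_1$. Indeed, we apply Lemma 6.3 to the one-site Hamiltonians $H_0 = H_1(0)$ (see
(2.1)) and
$$H_1 = -\frac{c}{2}\,a^*_\uparrow a^*_\downarrow - \frac{\bar c}{2}\,a_\downarrow a_\uparrow$$
in order to bound the relative entropy $S(\zeta_0|\zeta)$. More precisely, it follows that
$$p(\beta,\mu,\lambda,0,h) - \beta^{-1}S(\zeta_0|\zeta) \le p(c/(2\gamma)) - x\,\mathrm{Re}\{\zeta(a_\downarrow a_\uparrow)\} - y\,\mathrm{Im}\{\zeta(a_\downarrow a_\uparrow)\}, \tag{6.30}$$
for any state $\zeta\in E^+_{\mathcal{U}_1}$ and any $c\in\mathbb{C}$ with $x := \mathrm{Re}\{c\}$ and $y := \mathrm{Im}\{c\}$. Consequently,
from (6.29) we deduce that
$$\begin{aligned}
\limsup_{N\to\infty}\{p_N(\beta,\mu,\lambda,\gamma,h)\}
&\le \sup_{\zeta\in E^+_{\mathcal{U}_1}}\Big\{\inf_{x,y\in\mathbb{R}}\big\{\gamma\big(\mathrm{Re}\{\zeta(a_\downarrow a_\uparrow)\}^2+\mathrm{Im}\{\zeta(a_\downarrow a_\uparrow)\}^2\big) - x\,\mathrm{Re}\{\zeta(a_\downarrow a_\uparrow)\} - y\,\mathrm{Im}\{\zeta(a_\downarrow a_\uparrow)\} + p((x+iy)/(2\gamma))\big\}\Big\}\\
&\le \sup_{t,s\in\mathbb{R}}\Big\{\inf_{x,y\in\mathbb{R}}\{\gamma(t^2+s^2) - tx - sy + p((x+iy)/(2\gamma))\}\Big\}.
\end{aligned}$$
In particular, by fixing $x = 2\gamma t$ and $y = 2\gamma s$ in the infimum we finally obtain
$$\limsup_{N\to\infty}\{p_N(\beta,\mu,\lambda,\gamma,h)\} \le \sup_{t,s\in\mathbb{R}}\{-\gamma(t^2+s^2) + p(t+is)\},$$
i.e. the upper bound (6.2) for any $\beta,\gamma>0$ and $\mu,\lambda,h\in\mathbb{R}$.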

6.2. Equilibrium and ground states of the strong coupling BCS-Hubbard model
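It follows immediately from the passivity of Gibbs states that
$$p(\beta,\mu,\lambda,\gamma,h) \ge \gamma\Delta(\rho) - \beta^{-1}\tilde S(\omega_{\zeta_0},\rho) + p(\beta,\mu,\lambda,0,h), \tag{6.31}$$
for any $\rho\in E_{\mathcal{U}}^{S,+}$, cf. (6.1) and Lemmas 6.2 and 6.3. Therefore, by using Lemma 6.5
with (6.22) the (infinite volume) pressure can be written as
$$p(\beta,\mu,\lambda,\gamma,h) = \sup_{\rho\in E_{\mathcal{U}}^{S,+}}\{\gamma\Delta(\rho) - \beta^{-1}\tilde S(\omega_{\zeta_0},\rho)\} + p(\beta,\mu,\lambda,0,h).$$
Moreover, as shown above (see the upper bound in the proof of Lemma 6.5), any
weak$^*$ limit point $\omega$ of local Gibbs states $\omega_N$ (1.6) when $N\to\infty$ satisfies (6.31)
with equality.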
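Indeed, by using (6.13) one obtains for any state $\rho$ that
$$-\frac{1}{N}\big(\rho(H_N) + \beta^{-1}S(\mathrm{tr}_N\,|\,\rho|_{\mathcal{U}_N})\big) = \frac{\gamma}{N^2}\sum_{l,m=1}^{N}\rho\big(a^*_{\kappa(l),\uparrow}a^*_{\kappa(l),\downarrow}a_{\kappa(m),\downarrow}a_{\kappa(m),\uparrow}\big) - \frac{1}{\beta N}S(\omega_{\zeta_0}|_{\mathcal{U}_N}\,|\,\rho|_{\mathcal{U}_N}) + p_N(\beta,\mu,\lambda,0,h), \tag{6.32}$$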

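with $p_N$ being the (finite volume) pressure (1.4) associated with the Hamiltonian
$H_N$ (1.2), $\omega_{\zeta_0}$ being the product state obtained by copying the state $\zeta_0$ (6.1) on
the one-site algebra $\mathcal{U}_1$ (see (6.7)), and with the trace state $\mathrm{tr}_N$ defined on the local
algebra $\mathcal{U}_N$ for $N\in\mathbb{N}$ by
$$\mathrm{tr}_N(\cdot) := \frac{\mathrm{Trace}(\cdot)}{\mathrm{Trace}(I_{\mathcal{U}_N})}.$$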
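For any permutation invariant state $\rho$ it is straightforward to check that the limits
$$\lim_{N\to\infty}\{N^{-1}S(\omega_{\zeta_0}|_{\mathcal{U}_N}\,|\,\rho|_{\mathcal{U}_N})\}$$
and
$$e(\rho) := \lim_{N\to\infty}\{N^{-1}\rho(H_N)\} = \rho(H_1(0)) - \gamma\Delta(\rho)$$
exist for any fixed parameters $\beta,\gamma>0$ and $\mu,\lambda,h\in\mathbb{R}$, see respectively (2.1) and
Lemma 6.2 for the definitions of $H_1(0)$ and $\Delta(\rho)$. Combined with (6.19) and (6.32)
it then follows that the usual entropy density
$$S(\rho) := -\lim_{N\to\infty}\{N^{-1}S(\mathrm{tr}_N\,|\,\rho|_{\mathcal{U}_N})\} = \lim_{N\to\infty}\Big\{-\frac{1}{N}\mathrm{Trace}\big(D_{\rho|_{\mathcal{U}_N}}\log D_{\rho|_{\mathcal{U}_N}}\big)\Big\} < \infty$$
of the permutation invariant state $\rho$ also exists and
$$\lim_{N\to\infty}\frac{1}{\beta N}S(\omega_{\zeta_0}|_{\mathcal{U}_N}\,|\,\rho|_{\mathcal{U}_N}) = e(\rho) + \gamma\Delta(\rho) - \beta^{-1}S(\rho) + p(\beta,\mu,\lambda,0,h).$$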

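The set $\Omega_\beta = \Omega_\beta(\mu,\lambda,\gamma,h)$ of equilibrium states of the strong coupling BCS-Hubbard model is defined by
$$\Omega_\beta := \{\omega\in E_{\mathcal{U}}^{S,+} : -e(\omega) + \beta^{-1}S(\omega) = p(\beta,\mu,\lambda,\gamma,h) = \gamma\Delta(\omega) - \beta^{-1}\tilde S(\omega_{\zeta_0},\omega) + p(\beta,\mu,\lambda,0,h)\}.$$
Note that $\Omega_\beta$ contains per construction all weak$^*$ limit points of local Gibbs states
$\omega_N$ as $N\to\infty$.
Consequently, the equilibrium states are, as usual, the minimizers of the free
energy functional
$$F(\omega) := e(\omega) - \beta^{-1}S(\omega) \tag{6.33}$$
on the convex and weak$^*$-compact set $E_{\mathcal{U}}^{S,+}$, cf. (1.5). They also maximize the
upper semicontinuous affine functional $\omega\mapsto\gamma\Delta(\omega) - \beta^{-1}\tilde S(\omega_{\zeta_0},\omega)$. It follows that
$\Omega_\beta$ is a closed face of $E_{\mathcal{U}}^{S,+}$ and we have in this set a notion of pure and mixed
thermodynamic phases (equilibrium states) by identifying purity with extremity.
In particular, it is convex and weak$^*$-compact. Each weak$^*$-limit $\omega$ of equilibrium states $\omega^{(n)}\in\Omega_{\beta_n}(\mu_n,\lambda_n,\gamma_n,h_n)$ such that $(\mu_n,\lambda_n,\gamma_n,h_n)\to(\mu,\lambda,\gamma,h)$ and
$\beta_n\to\infty$ is called a ground state of the strong coupling BCS-Hubbard model.
The set of all ground states with parameters $\gamma>0$ and $\mu,\lambda,h\in\mathbb{R}$ is denoted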

by $\Omega_\infty = \Omega_\infty(\mu,\lambda,\gamma,h)$. Extreme states of the weak$^*$-compact convex set $\Omega_\infty$ are
called pure ground states.
We analyze now the set of pure equilibrium states, i.e. the equilibrium states
$\omega\in\Omega_\beta$ belonging to the set $\mathcal{E}_{\mathcal{U}}^{S,+}$ of extreme points of $E_{\mathcal{U}}^{S,+}$, cf. (6.22). First, from
Lemmas 6.6-6.8 recall that any extreme state $\omega$ is a product state $\omega_\zeta$ (6.7), i.e. it is
obtained by copying a state $\zeta$ on the one-site algebra $\mathcal{U}_1$ to the other sites. In
particular, by combining (6.22) with (6.31) observe that
$$p(\beta,\mu,\lambda,\gamma,h) = \sup_{\zeta\in E^+_{\mathcal{U}_1}}\{\gamma|\zeta(a_\downarrow a_\uparrow)|^2 - \beta^{-1}S(\zeta_0|\zeta)\} + p(\beta,\mu,\lambda,0,h). \tag{6.34}$$
Therefore, a product state $\omega_\zeta$ is a pure equilibrium state if and only if $\zeta$ belongs to
the set $G_\beta = G_\beta(\mu,\lambda,\gamma,h)$ of one-site equilibrium states defined by
$$G_\beta := \{\zeta\in E^+_{\mathcal{U}_1} : \gamma|\zeta(a_\downarrow a_\uparrow)|^2 - \beta^{-1}S(\zeta_0|\zeta) = p(\beta,\mu,\lambda,\gamma,h) - p(\beta,\mu,\lambda,0,h)\}. \tag{6.35}$$
In other words, the study of pure states of $\Omega_\beta$ can be reduced, without loss of
generality, to the analysis of $G_\beta$. The first important statement concerns the characterization of the set $G_\beta$ in relation with the variational problems (2.4) and (6.34).
Theorem 6.1 (Explicit Description of One-Site Equilibrium States). For
any $\beta,\gamma>0$ and $\mu,\lambda,h\in\mathbb{R}$, the set $G_\beta$ of one-site equilibrium states is given by
the states $\zeta_{c_\beta}$ (6.1) with $c_\beta := r_\beta^{1/2}e^{i\phi}$ for any order parameter $r_\beta$ solution of (2.4)
and any phase $\phi\in[0,2\pi)$.

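Proof. Take any solution $r_\beta$ of (2.4) and any $\phi\in[0,2\pi)$. Then, from (6.14) observe
that
$$-\beta^{-1}S(\zeta_0|\zeta_{c_\beta}) + p(\beta,\mu,\lambda,0,h) = -\gamma\,\zeta_{c_\beta}(c_\beta\,a^*_\uparrow a^*_\downarrow + \bar c_\beta\,a_\downarrow a_\uparrow) + p(c_\beta). \tag{6.36}$$
Since $\zeta_{c_\beta}(a_\downarrow a_\uparrow) = c_\beta$ and $\zeta_{c_\beta}(a^*_\uparrow a^*_\downarrow) = \bar c_\beta$, the last equality combined with Theorem 2.1 implies that
$$\gamma|\zeta_{c_\beta}(a_\downarrow a_\uparrow)|^2 - \beta^{-1}S(\zeta_0|\zeta_{c_\beta}) = p(\beta,\mu,\lambda,\gamma,h) - p(\beta,\mu,\lambda,0,h). \tag{6.37}$$
In other words, $\zeta_{c_\beta}$ is a maximizer of the variational problem defined in (6.34) and
hence, $\zeta_{c_\beta}\in G_\beta$.
On the other hand, any state $\zeta\in G_\beta$ satisfies (6.37) and by combining Theorem 2.1 with the inequality (6.30) for $c = 2\gamma\zeta(a_\downarrow a_\uparrow)$ it follows that
$$-\gamma|\zeta(a_\downarrow a_\uparrow)|^2 + p(\zeta(a_\downarrow a_\uparrow)) \ge \sup_{c\in\mathbb{C}}\{-\gamma|c|^2 + p(c)\}.$$
Hence, $\zeta(a_\downarrow a_\uparrow) = r_\beta^{1/2}e^{i\phi} = c_\beta$ for some $\phi\in[0,2\pi)$. It remains to prove that the
equality $\zeta(a_\downarrow a_\uparrow) = c_\beta$ uniquely defines the one-site equilibrium state $\zeta\in G_\beta$. It
follows from $\zeta(a_\downarrow a_\uparrow) = \zeta_{c_\beta}(a_\downarrow a_\uparrow) = c_\beta$ with $\zeta,\zeta_{c_\beta}\in G_\beta$ that $S(\zeta_0|\zeta_{c_\beta}) = S(\zeta_0|\zeta)$
and
$$\gamma\,\zeta(c_\beta\,a^*_\uparrow a^*_\downarrow + \bar c_\beta\,a_\downarrow a_\uparrow) - \beta^{-1}S(\zeta_0|\zeta) = P_{H_1(c_\beta)} - P_{H_1(0)} \tag{6.38}$$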

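because of (6.36), see (2.1) for the definition of $H_1(c)$. By Lemma 6.3, one obtains
for any self-adjoint $A\in\mathcal{U}_1$ that
$$-\zeta(A) + \gamma\,\zeta(c_\beta\,a^*_\uparrow a^*_\downarrow + \bar c_\beta\,a_\downarrow a_\uparrow) - \beta^{-1}S(\zeta_0|\zeta) \le P_{H_1(c_\beta)+A} - P_{H_1(0)}. \tag{6.39}$$
Consequently, we obtain by combining (6.38) and (6.39) that
$$P_{H_1(c_\beta)+A} - P_{H_1(c_\beta)} \ge -\zeta(A),$$
for any self-adjoint $A\in\mathcal{U}_1$ and $\zeta\in G_\beta$ such that $\zeta(a_\downarrow a_\uparrow) = c_\beta$. In other words,
the functional $\{-\zeta\}$ is tangent to the pressure at $H_1(c_\beta)$. Since the convex map
$A\mapsto P_{H_1(c_\beta)+A}$ is continuously differentiable and self-adjoint elements separate
states, the tangent functional is unique and $\zeta = \zeta_{c_\beta}$.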

It follows immediately from the theorem above that pure states of $\Omega_\beta$ solve the
gap equation:

Corollary 6.1 (Gap Equation for Pure Equilibrium States). For any $\beta,\gamma>0$ and $\mu,\lambda,h\in\mathbb{R}$, pure states from $\Omega_\beta$ are precisely the product states $\omega_{\zeta_{c_\beta}}$ satisfying
the gap equation $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = c_\beta$ for any $l\in\mathbb{N}$ and with $c_\beta := r_\beta^{1/2}e^{i\phi}$ being
any maximizer of the first variational problem given in Theorem 2.1.

If $c_\beta\neq 0$, observe that the gap equation $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = c_\beta$ with $\zeta_{c_\beta}$ defined
in (6.1) corresponds to the Euler-Lagrange equation satisfied by the solutions $c_\beta := r_\beta^{1/2}e^{i\phi}$ of the first variational problem given in Theorem 2.1. The phase $\phi\in[0,2\pi)$
is arbitrarily taken because of the gauge invariance of the map $c\mapsto p(c)$, and the gap
equation $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = c_\beta$ can be reduced to (2.5). In other words, if $c_\beta\neq 0$,
the gap equation can be written in two different ways: either $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = c_\beta$ in the view point of extreme equilibrium states or (2.5) in the view point of the
order parameter $r_\beta$.
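The link between the variational problem and the gap equation can be checked numerically on a single site. The sketch below rests on an assumption that is not reproduced in this section: it takes the one-site Hamiltonian of (2.1) to be $H_1(c) = -\mu(n_\uparrow+n_\downarrow) - h(n_\uparrow-n_\downarrow) + 2\lambda n_\uparrow n_\downarrow - \gamma(c\,a^*_\uparrow a^*_\downarrow + \bar c\,a_\downarrow a_\uparrow)$, for which the trace can be computed by hand on the four-dimensional one-site Fock space. All function names (`partition_function`, `pressure`, `pair_expectation`, `maximizing_r`) are hypothetical and chosen for this illustration only.

```python
import math


def partition_function(c, beta, mu, lam, gamma, h):
    """Trace of exp(-beta * H_1(c)) for the ASSUMED one-site Hamiltonian."""
    g = math.sqrt((mu - lam) ** 2 + (gamma * abs(c)) ** 2)
    # Singly occupied states have energies -mu - h and -mu + h.
    singly = math.exp(beta * (mu + h)) + math.exp(beta * (mu - h))
    # The empty/doubly occupied 2x2 block has energies (lam - mu) -/+ g.
    paired = 2.0 * math.exp(beta * (mu - lam)) * math.cosh(beta * g)
    return singly + paired


def pressure(c, beta, mu, lam, gamma, h):
    """One-site pressure p(c) = (1/beta) * ln Trace exp(-beta * H_1(c))."""
    return math.log(partition_function(c, beta, mu, lam, gamma, h)) / beta


def pair_expectation(c, beta, mu, lam, gamma, h):
    """Expectation of a_down a_up in the Gibbs state of the assumed H_1(c)."""
    g = math.sqrt((mu - lam) ** 2 + (gamma * abs(c)) ** 2)
    # sinh(beta * g) / g tends to beta as g -> 0.
    ratio = beta if g == 0.0 else math.sinh(beta * g) / g
    z = partition_function(c, beta, mu, lam, gamma, h)
    return c * gamma * math.exp(beta * (mu - lam)) * ratio / z


def maximizing_r(beta, mu, lam, gamma, h, n_grid=200001):
    """Grid search for r in [0, 1/4] maximizing -gamma * r + p(sqrt(r))."""
    best_r, best_val = 0.0, -math.inf
    for i in range(n_grid):
        r = 0.25 * i / (n_grid - 1)
        val = -gamma * r + pressure(math.sqrt(r), beta, mu, lam, gamma, h)
        if val > best_val:
            best_r, best_val = r, val
    return best_r
```

Under these assumptions, at a non-zero maximizer $r$ the value `pair_expectation(sqrt(r), ...)` agrees with `sqrt(r)` up to the grid resolution, which is the gap equation in its Euler-Lagrange form; the search interval $[0, 1/4]$ reflects the bound $|\zeta(a_\downarrow a_\uparrow)|\le 1/2$. Because `pressure` depends on `c` only through `abs(c)`, the gauge invariance of $c\mapsto p(c)$ is visible directly in the code.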
From this last corollary observe also that the existence of non-zero maximizers $c_\beta\neq 0$ implies the existence of equilibrium states breaking the $U(1)$-gauge
symmetry satisfied by $H_N$ (1.2). This breakdown of the $U(1)$-gauge symmetry for
$c_\beta\neq 0$ is already explained by Theorem 3.2, which can be proven by our notion of
equilibrium states as follows.
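Consider the upper semicontinuous convex map on $E_{\mathcal{U}}^{S,+}$ defined for any $\alpha\ge 0$
and $\phi\in[0,2\pi)$ by
$$\omega\mapsto -e(\omega) + \beta^{-1}S(\omega) + 2\alpha\,\mathrm{Re}\{e^{-i\phi}\omega(a_\downarrow a_\uparrow)\}. \tag{6.40}$$
From Sec. 6.1 it is straightforward to check that
$$p_{\alpha,\phi}(\beta,\mu,\lambda,\gamma,h) := \lim_{N\to\infty}\Big\{\frac{1}{\beta N}\ln\mathrm{Trace}\big(e^{-\beta H_{N,\alpha,\phi}}\big)\Big\} = \sup_{\omega\in E_{\mathcal{U}}^{S,+}}\{-e(\omega) + \beta^{-1}S(\omega) + 2\alpha\,\mathrm{Re}\{e^{-i\phi}\omega(a_\downarrow a_\uparrow)\}\}, \tag{6.41}$$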

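with the Hamiltonian $H_{N,\alpha,\phi}$ defined in (3.1). Moreover, any weak$^*$-limits $\omega_{\infty,\alpha,\phi}$
of local Gibbs states
$$\omega_{N,\alpha,\phi}(\cdot) := \frac{\mathrm{Trace}(\,\cdot\;e^{-\beta H_{N,\alpha,\phi}})}{\mathrm{Trace}(e^{-\beta H_{N,\alpha,\phi}})} \tag{6.42}$$
are equilibrium states (see the proof of Lemma 6.5 applied to $H_{N,\alpha,\phi}$), i.e. the
state $\omega_{\infty,\alpha,\phi}$ belongs to the (non-empty) convex set $\Omega_{\beta,\alpha,\phi} = \Omega_{\beta,\alpha,\phi}(\mu,\lambda,\gamma,h)$ of
maximizers of (6.40) at fixed $\alpha\ge 0$ and $\phi\in[0,2\pi)$. In fact, one gets the following
statement, which implies Theorem 3.2.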

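Theorem 6.2 (Breakdown of the $U(1)$-Gauge Symmetry). Take $\beta,\gamma>0$
and real numbers $\mu,\lambda,h$ away from any critical point. Then at fixed phase
$\phi\in[0,2\pi)$,
$$\lim_{\alpha\downarrow 0}\lim_{N\to\infty}\frac{1}{N}\sum_{l=1}^{N}\omega_{N,\alpha,\phi}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = \lim_{\alpha\downarrow 0}\omega_{\infty,\alpha,\phi}(a_{\kappa(1),\downarrow}a_{\kappa(1),\uparrow}) = r_\beta^{1/2}e^{i\phi},$$
with $\omega_{\infty,\alpha,\phi}\in\Omega_{\beta,\alpha,\phi}$ being the unique maximizer of (6.40) for sufficiently small
$\alpha\ge 0$.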

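Proof. First we need to characterize pure states of $\Omega_{\beta,\alpha,\phi}$ as it is done in Corollary 6.1 for $\alpha = 0$. By convexity and upper semicontinuity, note that maximizers of
(6.40) are taken on the set of extreme states whereas the set of extreme maximizers
is a face. Since extreme states are product states (cf. Lemmas 6.6-6.8), we get that
$$\sup_{\omega\in E_{\mathcal{U}}^{S,+}}\{-e(\omega) + \beta^{-1}S(\omega) + 2\alpha\,\mathrm{Re}\{e^{-i\phi}\omega(a_\downarrow a_\uparrow)\}\} = \sup_{c\in\mathbb{C}}\{-\gamma|c|^2 + p(c+\gamma^{-1}\alpha e^{i\phi})\}, \tag{6.43}$$
as in the case $\alpha = 0$ (see (2.3) for the definition of $p(c)$). If $c_{\beta,\alpha,\phi} = c_{\beta,\alpha,\phi}(\mu,\lambda,\gamma,h)\in\mathbb{C}$ is a maximizer of
$$-\gamma|c|^2 + p(c+\gamma^{-1}\alpha e^{i\phi}), \tag{6.44}$$
then observe that $z_{\beta,\alpha,\phi} := c_{\beta,\alpha,\phi} + \gamma^{-1}\alpha e^{i\phi}$ maximizes the function
$$-\gamma|z-\gamma^{-1}\alpha e^{i\phi}|^2 + p(z)$$
of the complex variable $z\in\mathbb{C}$. By gauge invariance of the map $z\mapsto p(\beta,\mu,\lambda,\gamma,h;z)$,
it follows that $z_{\beta,\alpha,\phi}\in e^{i\phi}\mathbb{R}$ and thus $c_{\beta,\alpha,\phi}\in e^{i\phi}\mathbb{R}$. Using this, we extend Corollary 6.1 to $\alpha\ge 0$ and $\phi\in[0,2\pi)$. In other words, for any $\beta,\gamma>0$, $\alpha\ge 0$, $\phi\in[0,2\pi)$
and $\mu,\lambda,h\in\mathbb{R}$, pure states of $\Omega_{\beta,\alpha,\phi}$ are product states $\omega_{\zeta_{c_{\beta,\alpha,\phi}}}$ satisfying the gap
equation
$$\omega_{\zeta_{c_{\beta,\alpha,\phi}}}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = c_{\beta,\alpha,\phi}, \tag{6.45}$$
for any $l\in\mathbb{N}$ and with $c_{\beta,\alpha,\phi}\in e^{i\phi}\mathbb{R}$ being any maximizer of (6.44).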



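As $|c|\to\infty$, notice that $p(c) = O(|c|)$. So, by gauge invariance we obtain
$$\sup_{c\in\mathbb{C}}\{-\gamma|c|^2 + p(c+\gamma^{-1}\alpha e^{i\phi})\} = \max_{s\in[-M,M]}\{-\gamma|se^{i\phi}|^2 + p([s+\gamma^{-1}\alpha]e^{i\phi})\} = \max_{s\in[-M,M]}\{-\gamma s^2 + p(s+\gamma^{-1}\alpha)\},$$
for any $\alpha\in(0,1)$ and $M<\infty$ sufficiently large. Consequently, if the parameters $\beta$,
$\mu$, $\lambda$, $\gamma$, and $h$ are such that the maximizer $r_\beta$ (2.4) is unique, then the maximizer
$c_{\beta,\alpha,\phi}\in e^{i\phi}\mathbb{R}$ of (6.44) is also unique as soon as $\alpha>0$ is sufficiently small. Indeed
the map $s\mapsto p(s)$ is continuous on the compact interval $[-M,M]$. In particular,
from (6.45) there is a unique maximizer of (6.40), i.e.
$$\Omega_{\beta,\alpha,\phi} = \{\omega_{\zeta_{c_{\beta,\alpha,\phi}}}\}. \tag{6.46}$$
Moreover, $c_{\beta,\alpha,\phi}$ converges to $r_\beta^{1/2}e^{i\phi}$ as $\alpha\downarrow 0$. Therefore, it follows from (6.45)
that
$$\lim_{\alpha\downarrow 0}\omega_{\zeta_{c_{\beta,\alpha,\phi}}}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = r_\beta^{1/2}e^{i\phi} \tag{6.47}$$
for any $l\in\mathbb{N}$.
By permutation invariance
$$\frac{1}{N}\sum_{l=1}^{N}\omega_{N,\alpha,\phi}(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = \omega_{N,\alpha,\phi}(a_{\kappa(1),\downarrow}a_{\kappa(1),\uparrow}).$$
Now, let $\{N_j^{(1)}\}$ and $\{N_j^{(2)}\}$ be two subsequences in $\mathbb{N}$ such that
$$\lim_{j\to\infty}\omega_{N_j^{(1)},\alpha,\phi}(a_{\kappa(1),\downarrow}a_{\kappa(1),\uparrow}) = \limsup_{N\to\infty}\omega_{N,\alpha,\phi}(a_{\kappa(1),\downarrow}a_{\kappa(1),\uparrow}),$$
$$\lim_{j\to\infty}\omega_{N_j^{(2)},\alpha,\phi}(a_{\kappa(1),\downarrow}a_{\kappa(1),\uparrow}) = \liminf_{N\to\infty}\omega_{N,\alpha,\phi}(a_{\kappa(1),\downarrow}a_{\kappa(1),\uparrow}).$$
We can assume without loss of generality that $\omega_{N_j^{(2)},\alpha,\phi}$ and $\omega_{N_j^{(1)},\alpha,\phi}$ both converge with
respect to the weak$^*$-topology as $j\to\infty$. Since any weak$^*$-limits $\omega_{\infty,\alpha,\phi}$ of local
Gibbs states $\omega_{N,\alpha,\phi}$ (6.42) are equilibrium states (see again the proof of Lemma 6.5),
i.e. $\omega_{\infty,\alpha,\phi}\in\Omega_{\beta,\alpha,\phi}$, the theorem then follows from (6.46) and (6.47). Indeed, for
any $\beta,\gamma>0$ and $\mu,\lambda,h\in\mathbb{R}$ away from any critical point, the sequence $\omega_{N,\alpha,\phi}$
of local Gibbs states converges towards $\omega_{\infty,\alpha,\phi} = \omega_{\zeta_{c_{\beta,\alpha,\phi}}}$ in the weak$^*$-topology as
soon as $\alpha\ge 0$ is sufficiently small.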

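From Corollary 6.1 note that the expectation values of Cooper fields
$$\Phi_{\kappa(l)} := a^*_{\kappa(l),\uparrow}a^*_{\kappa(l),\downarrow} + a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}, \qquad \Psi_{\kappa(l)} := i\,(a^*_{\kappa(l),\uparrow}a^*_{\kappa(l),\downarrow} - a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) \tag{6.48}$$
are
$$\omega_{\zeta_{c_\beta}}(\Phi_{\kappa(l)}) = 2\,\mathrm{Re}\{c_\beta\} \quad\text{and}\quad \omega_{\zeta_{c_\beta}}(\Psi_{\kappa(l)}) = 2\,\mathrm{Im}\{c_\beta\} \tag{6.49}$$
for any pure state $\omega_{\zeta_{c_\beta}}$ of $\Omega_\beta$ and $l\in\mathbb{N}$, where we recall that $c_\beta := r_\beta^{1/2}e^{i\phi}$ is some
maximizer of the first variational problem given in Theorem 2.1. In particular,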

$\omega(\Phi_{\kappa(l)})\neq 0$ or $\omega(\Psi_{\kappa(l)})\neq 0$ for any pure state $\omega\in\Omega_\beta$ is a manifestation of the
breakdown of the $U(1)$-gauge symmetry.
Unfortunately, the operators $\Phi_{\kappa(l)}$ and $\Psi_{\kappa(l)}$ do not correspond to any experiment, as they are not gauge invariant. More generally, experiments only "see"
the restriction of states $\omega_{\zeta_{c_\beta}}$ to the subalgebra of gauge invariant elements. Consequently, the next step is to prove the so-called off diagonal long range order
(ODLRO) property proposed by Yang [38] to define the superconducting phase.
Indeed, one detects the presence of $U(1)$-gauge symmetry breaking by considering the asymptotics, as $|l-m|\to\infty$, of the ($U(1)$-gauge symmetric) Cooper pair
correlation function
$$G_\omega(l,m) := \omega\big(a^*_{\kappa(l),\uparrow}a^*_{\kappa(l),\downarrow}a_{\kappa(m),\downarrow}a_{\kappa(m),\uparrow}\big) \tag{6.50}$$
associated with some state $\omega$. In particular, if $G_\omega(l,m)$ converges to some fixed
non-zero value whenever $|l-m|\to\infty$, the state $\omega$ shows off diagonal long range
order (ODLRO). This property can directly be analyzed for equilibrium states from
our next statement.
Theorem 6.3 (Cooper Pair Correlation Function). For any $\beta,\gamma>0$ and
$\mu,\lambda,h\in\mathbb{R}$ away from any critical point, the Cooper pair correlation function
$G_{\omega_N}(l,m)$ associated with the local Gibbs state $\omega_N$ converges for fixed $l\neq m$ towards
$$\lim_{N\to\infty}G_{\omega_N}(l,m) = G_\omega(l,m) = r_\beta,$$
for any equilibrium state $\omega\in\Omega_\beta$, and with $r_\beta$ being the solution of (2.4).

Proof. By similar arguments as in the proof of Theorem 6.2, if $G_\omega(l,m) = r_\beta$ for
all equilibrium states $\omega\in\Omega_\beta$, then
$$\lim_{N\to\infty}G_{\omega_N}(l,m) = r_\beta.$$
By permutation invariance of $\omega\in\Omega_\beta$, note that
$$G_\omega(l,m) = G_\omega(1,2) \tag{6.51}$$
for any $l\neq m$. If $\omega = \omega_{\zeta_{c_\beta}}$ is an extreme equilibrium state, then one clearly has
$$G_{\omega_{\zeta_{c_\beta}}}(1,2) = \zeta_{c_\beta}(a^*_\uparrow a^*_\downarrow)\,\zeta_{c_\beta}(a_\downarrow a_\uparrow) = |c_\beta|^2 = r_\beta.$$
On the other hand, the set $\Omega_\beta$ of equilibrium states for fixed parameters $\beta,\gamma>0$,
and $\mu,\lambda,h\in\mathbb{R}$ is weak$^*$-compact. In particular, if $\omega\in\Omega_\beta$ is not extreme, the
function $G_\omega(1,2)$ is given, up to arbitrarily small errors, by convex sums of the
form
$$\sum_{j=1}^{k}\theta_j\,G_{\omega^{(j)}}(1,2), \qquad \theta_1,\dots,\theta_k\ge 0, \quad \theta_1+\cdots+\theta_k = 1, \tag{6.52}$$
where $\{\omega^{(j)}\}_{j=1,\dots,k}$ are extreme equilibrium states. Since any weak$^*$-limit $\omega$ of
local Gibbs states $\omega_N$ (1.6) is an equilibrium state (see proof of Lemma 6.5), the
theorem is then a consequence of (6.51) and (6.52).

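Since
$$\frac{1}{N^2}\sum_{l,m=1}^{N}\omega_N\big(a^*_{\kappa(l),\uparrow}a^*_{\kappa(l),\downarrow}a_{\kappa(m),\downarrow}a_{\kappa(m),\uparrow}\big) = \frac{N(N-1)}{N^2}\,\omega_N\big(a^*_{\kappa(1),\uparrow}a^*_{\kappa(1),\downarrow}a_{\kappa(2),\downarrow}a_{\kappa(2),\uparrow}\big) + O(N^{-1}),$$
note that this theorem implies Theorem 3.1.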
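The counting behind the $O(N^{-1})$ remainder can be made concrete. In the sketch below, which is only an illustration, the permutation invariant correlation function is modelled by two hypothetical numbers: `off_diagonal` for all pairs $l\neq m$ and `diagonal` for $l = m$. The function name `double_sum_average` is likewise an assumption of this example.

```python
def double_sum_average(n_sites, off_diagonal, diagonal):
    """Average (1/N^2) * sum_{l,m} G(l, m) for a permutation invariant G.

    G is modelled by one value for l != m and another value for l == m.
    """
    total = 0.0
    for l in range(1, n_sites + 1):
        for m in range(1, n_sites + 1):
            total += diagonal if l == m else off_diagonal
    return total / n_sites ** 2
```

There are $N(N-1)$ off-diagonal pairs and only $N$ diagonal ones, so the average equals `off_diagonal + (diagonal - off_diagonal) / N`: the diagonal terms contribute a correction of order $1/N$ and the double sum inherits the limit of the two-site correlation.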


Therefore, away from any critical point, if an equilibrium state shows ODLRO
then all pure equilibrium states break the $U(1)$-gauge symmetry. Conversely, if
all pure equilibrium states break the $U(1)$-gauge symmetry, then all equilibrium
states show ODLRO. This is due to the fact that the order parameter $r_\beta$ is unique
away from any critical point. In particular, from Sec. 7, at sufficiently small inverse
temperature $\beta$ there is no ODLRO and $\Omega_\beta = \{\omega_{\zeta_0}\}$, whereas for sufficiently large
$\beta$ and $\gamma$ all equilibrium states show ODLRO.
For any $\beta,\gamma>0$ and real numbers $\mu,\lambda,h$ at some critical point, this property
is not satisfied in general. There are indeed cases where the phase transition is of
first order, cf. Fig. 3. In this situation, $0$ and some $r_\beta>0$ are maximizers at the
same time, and hence, there are some equilibrium states breaking the $U(1)$-gauge
symmetry and other equilibrium states which do not show ODLRO.
Observe now that the superconducting phase is not only characterized by
ODLRO and the breakdown of the $U(1)$-gauge symmetry. Indeed, the two-point
correlation function determines its type: s-wave, d-wave, p-wave, etc. In fact, for
any extreme equilibrium state $\omega = \omega_{\zeta_{c_\beta}}$, $x,y\in\mathbb{Z}^d$ and $s_1,s_2\in\{\uparrow,\downarrow\}$, one clearly
has
$$\omega_{\zeta_{c_\beta}}(a_{x,s_1}a_{y,s_2}) = \begin{cases}\omega_{\zeta_{c_\beta}}(a_{x,s_1})\,\omega_{\zeta_{c_\beta}}(a_{y,s_2}) & \text{if } x\neq y,\\ \omega_{\zeta_{c_\beta}}(a_{x,s_1}a_{x,s_2}) & \text{if } x = y\end{cases} \;=\; \begin{cases}0 & \text{if } x\neq y,\\ 0 & \text{if } x = y,\ s_1 = s_2,\\ \pm c_\beta & \text{if } x = y,\ s_1\neq s_2.\end{cases}$$
As a consequence, for any equilibrium state $\omega\in\Omega_\beta$, we have $\omega(a_{x,s_1}a_{y,s_2}) = \omega(a_{0,s_1}a_{0,s_2})\,\delta_{x,y}$ and we obtain an s-wave superconducting phase. In particular, Theorem 3.3 is a simple consequence of these last equalities combined with (6.46), (6.47)
and the fact that any weak$^*$-limits $\omega_{\infty,\alpha,\phi}\in\Omega_{\beta,\alpha,\phi}$ of local Gibbs states $\omega_{N,\alpha,\phi}$
(6.42) are equilibrium states (see again the proof of Lemma 6.5).
Now we would like to pursue this analysis of equilibrium states by showing that
their definition is in accordance with results of Theorems 3.4-3.6. This statement
is given in the next theorem.

Theorem 6.4 (Uniqueness of Densities for Equilibrium States). Take
$\beta,\gamma>0$ and real numbers $\mu,\lambda,h$ away from any critical point. Then, for any

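equilibrium state $\omega\in\Omega_\beta$ and $l\in\mathbb{N}$, all densities are uniquely defined:

(i) The electron density is equal to
$$\lim_{N\to\infty}\Big\{\frac{1}{N}\sum_{l'=1}^{N}\omega_N(n_{\kappa(l'),\uparrow}+n_{\kappa(l'),\downarrow})\Big\} = \omega(n_{\kappa(l),\uparrow}+n_{\kappa(l),\downarrow}) = d_\beta,$$
cf. Theorem 3.4.

(ii) The magnetization density is equal to
$$\lim_{N\to\infty}\Big\{\frac{1}{N}\sum_{l'=1}^{N}\omega_N(n_{\kappa(l'),\uparrow}-n_{\kappa(l'),\downarrow})\Big\} = \omega(n_{\kappa(l),\uparrow}-n_{\kappa(l),\downarrow}) = m_\beta,$$
cf. Theorem 3.5.

(iii) The Coulomb correlation density is equal to
$$\lim_{N\to\infty}\Big\{\frac{1}{N}\sum_{l'=1}^{N}\omega_N(n_{\kappa(l'),\uparrow}n_{\kappa(l'),\downarrow})\Big\} = \omega(n_{\kappa(l),\uparrow}n_{\kappa(l),\downarrow}) = w_\beta,$$
cf. Theorem 3.6.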

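Proof. Suppose first that $\omega\in\Omega_\beta$ is pure. Then, from Corollary 6.1 it follows that
$$\omega(n_{\kappa(l),\uparrow}+n_{\kappa(l),\downarrow}) = \zeta_{c_\beta}(n_\uparrow+n_\downarrow),$$
with $c_\beta = r_\beta^{1/2}e^{i\phi}$ for some $\phi\in[0,2\pi)$. Thus, by using the gauge invariance of the
map $c\mapsto p(c)$ we directly get
$$\omega(n_{\kappa(l),\uparrow}+n_{\kappa(l),\downarrow}) = \partial_\mu p(\beta,\mu,\lambda,\gamma,h;c_\beta) = \partial_\mu p(\beta,\mu,\lambda,\gamma,h;r_\beta^{1/2}) = d_\beta. \tag{6.53}$$
At fixed parameters $\beta,\gamma>0$, $\mu,\lambda,h\in\mathbb{R}$, recall that the set $\Omega_\beta$ of equilibrium
states is weak$^*$-compact. In particular, if $\omega\in\Omega_\beta$ is not pure, it is the weak$^*$-limit
of convex combinations of pure states. Therefore, we obtain (6.53) for any $\omega\in\Omega_\beta$.
Similarly one gets
$$\omega(n_{\kappa(l),\uparrow}-n_{\kappa(l),\downarrow}) = m_\beta \quad\text{and}\quad \omega(n_{\kappa(l),\uparrow}n_{\kappa(l),\downarrow}) = w_\beta, \tag{6.54}$$
for any equilibrium state $\omega\in\Omega_\beta$ and $l\in\mathbb{N}$. Moreover, since any weak$^*$-limit $\omega$
of local Gibbs states $\omega_N$ (1.6) is an equilibrium state, i.e. $\omega\in\Omega_\beta$, we therefore
deduce from (6.53) and (6.54), exactly as in the proof of Theorem 6.2, the existence
of the limits in the statements (i)-(iii).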

Observe that the weak$^*$-limit $\omega\in\Omega_\beta$ of local Gibbs states $\omega_N$ (1.6) can
easily be performed, even at critical points, by using the decomposition theory for
states [32]:
Theorem 6.5 (Asymptotics of the Local Gibbs State $\omega_N$ as $N\to\infty$).
Recall that for any $\phi\in[0,2\pi)$, $c_\beta := r_\beta^{1/2}e^{i\phi}$ is a maximizer of the first variational

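problem given in Theorem 2.1, whereas the states $\zeta_c$ and $\omega_\zeta$ are respectively defined
by (6.1) and (6.7). Take any $\beta,\gamma>0$, $\mu,\lambda,h\in\mathbb{R}$, and let $N\to\infty$.

(i) Away from any critical point, the local Gibbs state $\omega_N$ converges in the weak$^*$-topology towards the equilibrium state
$$\omega(\cdot) = \frac{1}{2\pi}\int_0^{2\pi}\omega_{\zeta_{c_\beta}}(\cdot)\,d\phi. \tag{6.55}$$
(ii) For each weak$^*$ limit point $\omega$ of local Gibbs states $\omega_N$ with parameters
$(\beta_N,\mu_N,\lambda_N,\gamma_N,h_N)$ converging to any critical point $(\beta,\mu,\lambda,\gamma,h)\in\partial S$ (2.7),
there is $\tau\in[0,1]$ such that
$$\omega(\cdot) = (1-\tau)\,\omega_{\zeta_0}(\cdot) + \frac{\tau}{2\pi}\int_0^{2\pi}\omega_{\zeta_{c_\beta}}(\cdot)\,d\phi.$$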

Proof. By $U(1)$-gauge symmetry of the Hamiltonians $H_N$ (1.2) recall that any
weak$^*$-limit $\omega$ of local Gibbs states $\omega_N$ (1.6) is a $U(1)$-invariant equilibrium state.
So, in order to prove the first part of the Theorem it suffices to show that the
equilibrium state given in (i) is the unique $U(1)$-invariant state in $\Omega_\beta$. If the solution
$r_\beta$ of (2.4) is zero, then this follows immediately from Corollary 6.1.
Let $r_\beta>0$ be the unique maximizer of (2.4), i.e. $c_\beta := r_\beta^{1/2}e^{i\phi}\neq 0$ for any
$\phi\in[0,2\pi)$. Let
$$\mathcal{E}(\Omega_\beta) = \{\omega_\zeta : \zeta\in G_\beta\}$$
be the set of all extreme states of $\Omega_\beta$, see (6.35) for the definition of the set $G_\beta$ of one-site equilibrium states. Observe that the closed convex hull of $\mathcal{E}(\Omega_\beta)$ is precisely $\Omega_\beta$
and that $\mathcal{E}(\Omega_\beta)$ is the image of the torus $[0,2\pi)$ under the continuous map $\phi\mapsto\omega_{\zeta_{c_\beta}}$,
with $c_\beta := r_\beta^{1/2}e^{i\phi}$. This last map defines a homeomorphism between the torus and
$\mathcal{E}(\Omega_\beta)$. In particular, the set $\mathcal{E}(\Omega_\beta)$ is compact and for each equilibrium state $\omega\in\Omega_\beta$
there is a uniquely defined probability measure $dm_\omega$ on the torus such that
$$\omega(A) = \int_0^{2\pi}\omega_{\zeta_{c_\beta}}(A)\,dm_\omega(\phi), \quad\text{for all } A\in\mathcal{U}. \tag{6.56}$$
See, e.g., [41, Proposition 1.2]. By $U(1)$-invariance of $\omega$, for any $n\in\mathbb{N}$ one has
from (6.56) that
$$\omega\Big(\prod_{l=1}^{n}a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}\Big) = r_\beta^{n/2}\int_0^{2\pi}e^{in\phi}\,dm_\omega(\phi) = 0.$$
Therefore, if $r_\beta>0$, there is a unique probability measure allowing the $U(1)$-gauge
symmetry of $\omega$: $dm_\omega(\phi)$ must be the uniform probability measure on $[0,2\pi)$.
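The statement that vanishing Fourier coefficients single out the uniform measure can be illustrated numerically. The sketch below is a minimal example with hypothetical names (`fourier_coefficient`, `uniform_density`, `tilted_density`) chosen for this purpose. It approximates $\int_0^{2\pi}e^{in\phi}\,dm(\phi)$ by a Riemann sum for the uniform probability density and for a non-uniform density, assumed here to be $(1+\tfrac12\cos\phi)/(2\pi)$.

```python
import cmath
import math


def fourier_coefficient(n, density, n_points=4096):
    """Riemann-sum approximation of the integral of e^{i n phi} density(phi) over [0, 2 pi)."""
    step = 2.0 * math.pi / n_points
    total = sum(cmath.exp(1j * n * k * step) * density(k * step) for k in range(n_points))
    return total * step


def uniform_density(phi):
    """Density of the uniform probability measure on [0, 2 pi)."""
    return 1.0 / (2.0 * math.pi)


def tilted_density(phi):
    """A non-uniform probability density, used only as a counterexample."""
    return (1.0 + 0.5 * math.cos(phi)) / (2.0 * math.pi)
```

For the uniform density every coefficient with $n\ge 1$ vanishes, whereas the tilted density has a non-zero first coefficient: a state built from it would give a non-zero expectation to $a_{\kappa(1),\downarrow}a_{\kappa(1),\uparrow}$ and could not be $U(1)$-invariant.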
From Lemma 7.1 the cardinality of the set of maximizers of (2.4) is at most 2.
Indeed, away from any critical point, it is 1 whereas at a critical point it can be
either 1 (second order phase transition) or 2 (first order phase transition). For

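more details, see Sec. 7. In both cases, we can use the same arguments as above. By
similar estimates as in the proof of Lemma 6.5 it immediately follows that all limit
points of the Gibbs states $\omega_N$ with parameters $(\beta_N,\mu_N,\lambda_N,\gamma_N,h_N)$ converging to
$(\beta,\mu,\lambda,\gamma,h)\in\partial S$ as $N\to\infty$ belong to $\Omega_\beta = \Omega_\beta(\mu,\lambda,\gamma,h)$. Since the set of all
$U(1)$-invariant equilibrium states from $\Omega_\beta$ is $\{\omega^{(\tau)}(\cdot)$ for any $\tau\in[0,1]\}$ with
$$\omega^{(\tau)}(\cdot) := (1-\tau)\,\omega_{\zeta_0}(\cdot) + \frac{\tau}{2\pi}\int_0^{2\pi}\omega_{\zeta_{c_\beta}}(\cdot)\,d\phi, \tag{6.57}$$
we obtain the second statement (ii).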

This theorem is a generalization of results obtained for the strong coupling$^u$
BCS model [7]. Note however, that Thirring's analysis [7] of the asymptotics of
local Gibbs states comes from explicit computations, whereas we use the structure
of sets of states, as explained for instance in [33].
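Observe that Theorem 4.1 is a simple consequence of Theorem 6.5. Indeed,
assume for instance that the order parameter $r_\beta = r_\beta(\mu,\lambda,\gamma,h)$ and the electron
density per site $d_\beta = d_\beta(\mu,\lambda,\gamma,h)$ jump respectively from $r_\beta^- = 0$ to $r_\beta^+$ and
from $d_\beta^-$ to $d_\beta^+$ by crossing a critical chemical potential $\mu_\beta^{(c)}$ at fixed parameters
$(\beta,\lambda,\gamma,h)$. An example of such behavior is given in Figure 10 for an electron density
smaller than one. If $\rho\in[d_\beta^-,d_\beta^+]$, then the unique solution $\mu_{N,\beta} = \mu_{N,\beta}(\rho,\lambda,\gamma,h)$
of (4.1) must converge towards $\mu_\beta^{(c)}$ as $N\to\infty$. Meanwhile, at fixed $(\beta,\mu_\beta^{(c)},\lambda,\gamma,h)$
$$\omega_{\zeta_0}(n_\uparrow+n_\downarrow) = d_\beta^- \quad\text{and}\quad \omega_{\zeta_{c_\beta^+}}(n_\uparrow+n_\downarrow) = d_\beta^+,$$
with $c_\beta^+ := (r_\beta^+)^{1/2}e^{i\phi}$ and $\phi\in[0,2\pi)$. Any weak$^*$-limit $\omega$ of local Gibbs states $\omega_N$
satisfies per construction
$$\omega(n_\uparrow+n_\downarrow) = \rho$$
and has the form $\omega^{(\tau)}(\cdot)$ (6.57), by Theorem 6.5. Hence, the Gibbs state $\omega_N$ converges in the weak$^*$-topology towards $\omega^{(\tau_\rho)}(\cdot)$ with $\tau_\rho$ defined in Theorem 4.1.
Indeed, the existence of the limits (i)-(iii) in Theorem 4.1 follows from the uniqueness of the limiting equilibrium state with fixed electron density $\rho\in[d_\beta^-,d_\beta^+]$.
We give now various important properties of densities in ground states, i.e. for
$\beta = \infty$, which immediately follow from Theorem 6.4. Recall that the set $\Omega_\infty$ of
ground states is the set of all weak$^*$ limit points as $n\to\infty$ of all equilibrium state
sequences $\{\omega^{(n)}\}_{n\in\mathbb{N}}$ with diverging inverse temperature $\beta_n\to\infty$.
Take $\gamma>0$ and parameters $\mu,\lambda,h$ such that $|\mu-\lambda|\neq\lambda+|h|$. Then the electron
and Coulomb correlation densities equal, respectively,
$$d := \omega(n_{\kappa(l),\uparrow}+n_{\kappa(l),\downarrow}) = d_\infty \quad\text{and}\quad w := \omega(n_{\kappa(l),\uparrow}n_{\kappa(l),\downarrow}) = w_\infty, \tag{6.58}$$
for any ground state $\omega\in\Omega_\infty$ and $l\in\mathbb{N}$, cf. Corollaries 3.2 and 3.4.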

u See (1.2) with $\lambda = 0$ and $h = 0$.



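If additionally $\gamma>\Gamma_{|\mu-\lambda|,\lambda+|h|}$, we are in the superconducting phase for ground
states, cf. Corollary 3.1. Indeed, for any $\phi\in[0,2\pi)$, there is a ground state $\omega\in\Omega_\infty$
such that for any $l\in\mathbb{N}$,
$$\omega(a_{\kappa(l),\downarrow}a_{\kappa(l),\uparrow}) = r_{\max}^{1/2}\,e^{i\phi}.$$
In the superconducting phase, from Corollary 3.4 we observe that $d_\infty = 2w_\infty$,
whereas the magnetization density equals
$$m := \omega(n_{\kappa(l),\uparrow}-n_{\kappa(l),\downarrow}) = m_\infty = 0, \tag{6.59}$$
for any superconducting state $\omega\in\Omega_\infty$ and $l\in\mathbb{N}$. This is the Meißner effect, see
Corollary 3.3. On the other hand, the Cauchy-Schwarz inequality for the states
implies the inequalities
$$0\le\omega(n_{\kappa(l),\uparrow}n_{\kappa(l),\downarrow})\le\sqrt{\omega(n_{\kappa(l),\uparrow})}\,\sqrt{\omega(n_{\kappa(l),\downarrow})} \tag{6.60}$$
for any $l\in\mathbb{N}$ and $\omega\in E_{\mathcal{U}}^{+}$. In fact, in the superconducting phase the second
inequality of (6.60) is an equality for any $\omega\in\Omega_\infty$. Indeed, (6.59) and Corollary 3.4
yield
$$\omega(n_{\kappa(l),\uparrow}n_{\kappa(l),\downarrow}) = \omega(n_{\kappa(l),\uparrow}) = \omega(n_{\kappa(l),\downarrow}), \tag{6.61}$$
for any $\omega\in\Omega_\infty$ and $l\in\mathbb{N}$. It shows that 100% of electrons form Cooper pairs in
superconducting ground states.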
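The inequalities (6.60) and their saturation can be verified in two lines. The following is a short derivation in shorthand notation ($n_\uparrow$, $n_\downarrow$ for the commuting number operators at one site and $\omega$ for a state), relying only on the standard facts $n_s^* = n_s = n_s^2$:

$$\begin{aligned}
& n_\uparrow n_\downarrow = (n_\uparrow n_\downarrow)^*(n_\uparrow n_\downarrow) \ \Longrightarrow\ \omega(n_\uparrow n_\downarrow)\ge 0,\\
& |\omega(n_\uparrow n_\downarrow)|^2 \le \omega(n_\uparrow n_\uparrow^*)\,\omega(n_\downarrow^* n_\downarrow) = \omega(n_\uparrow)\,\omega(n_\downarrow),\\
& \omega(n_\uparrow n_\downarrow) = \omega(n_\uparrow) = \omega(n_\downarrow) =: q \ \Longrightarrow\ \sqrt{\omega(n_\uparrow)}\sqrt{\omega(n_\downarrow)} = q = \omega(n_\uparrow n_\downarrow).
\end{aligned}$$

The upper bound is therefore attained exactly when every electron present at a site is accompanied by an electron of opposite spin, which is the sense in which all electrons are paired.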
In the case where h = 0 with > ||,+|h| and | | = + |h|, the density
vector (d, m, w) dened by (6.58) and (6.59) is also unique as in the superconducting
phase. It equals (d , m , w ), see Corollaries 3.23.4. However, if h = 0 with
< ||, , or = ||,+|h| , or | | = + |h|, then the density vector
(d, m, w) belongs, in general, to a non-trivial convex set. In other words, there are
phase transitions involving these densities. In particular, even in the case h = 0
where the Hamiltonian $H_N$ (1.2) is spin invariant, there are ground states breaking
the spin SU(2)-symmetry.
For instance, take , > 0 and parameters , such that | | < and <
||, . Then for any and l N, the electron density equals d = d = 1,
whereas the Coulomb correlation density is w = w = 0. In particular, the first
inequality of (6.60) is an equality, showing that 0% of electrons form Cooper pairs.
But, even if the magnetic field vanishes, i.e. $h = 0$, for any $x \in (-1, 1)$ there exists
a ground state $\omega^{(x)}$ with magnetization density $m = x$ (see (6.59) for the
definition of $m$).
Therefore, all the thermodynamics of the strong coupling BCS-Hubbard model
discussed in Secs. 3.1–3.5 is encoded in the notion of equilibrium and ground states
with $\beta \in (0, \infty]$. However, there is still an important open question related
to the thermodynamics of this model. It concerns the problem of fluctuations of
the Cooper pair condensate density (Theorem 3.1) or Cooper fields (l) and (l)
(6.48) as a function of the temperature. Unfortunately, no results in that direction
are known in the thermodynamic limit. We prove, however, a
simple statement about fluctuations of Cooper fields for pure states from in the
limit $\beta \to \infty$.

Theorem 6.6 (Fluctuations of Cooper Fields). Take , > 0 and real num-
bers , , h away from any critical point. Then, for any pure state c and
l N, the uctuations of Cooper elds (l) and (l) (6.48) are bounded by

0 c ({(l) c ((l) )}2 ) 2 1 1 ,

0 c ({(l) c ((l) )}2 ) 2 1 1 ,

i.e. they vanish in the limit .

Proof. Recall that properties of pure states are characterized in Corollary 6.1, i.e.
they are product states c with the one-site state c being dened in (6.1). In
1/2
particular, they satisfy (6.49). Now, to avoid triviality, assume that c := r ei =
0 and let f( ) be the function dened for any R by

f( ) := |c + |2 + p(c + ).

Since c = 0 is a maximizer of the function |c|+p(c) of c C, one has 2 f(0) 0,


i.e. 2 p(c + )| =0 2. From straightforward computations, observe that p(c + )
is a convex function of R with

1 2 {2 p(c + )}| =0 = c ({(l) c ((l) )}2 ) 0.

From this last equality combined with {2 p(c + )}| =0 2, we deduce the the-
orem for (l) . Moreover, from similar arguments using the function f( ) := f(i )
instead of f, the uctuations of the Cooper eld (l) are also bounded by 2 1 1 .

From Theorem 6.6, note that Cooper fields are c-numbers in the corresponding
GNS-representation [32] of pure ground states defined as weak$^*$-limits of pure equilibrium
states:

Corollary 6.2 (Cooper Fields for Pure Ground States). Let be


any weak -limit of pure equilibrium states and let (, , H) be the corresponding
GNS-representation of on bounded operators on the Hilbert space H with cyclic
vacuum . Then is pure and for any l N, ((l) ) = ((l) )IH and ((l) ) =
((l) )IH .

Proof. A pure equilibrium state is a product state (6.7) and any weak -limit of
product states in EUS,+ is also a product state. Thus, by Lemma 6.6, any ground state
dened as the weak -limit of pure equilibrium states is extreme in EUS,+
and hence extreme in . Clearly, for such ground state, (((l) )) = ((l) )IH
for any l N. Let := (l) ((l) ). From Theorem 6.6 combined with the

CauchySchwarz inequality we obtain for any A U that




()(A)
2
H = (A A) A AA
(( 2 ))

A2 
3/2 [(
2 )]1/4 = 0.

From the cyclicity of , it follows that ((l) ) = ((l) )IH . The proof of
((l) ) = ((l) )IH is also performed in the same way. We omit the details.

In particular, for such pure ground states in , correlation functions can


explicitly be computed at any order in Cooper elds. For instance, for all N N,
all kj , lj N, mj , nj N0 , j = 1, . . . , N , and any An U, n = 1, . . . , N + 1, one
has

(k1 ) (l1 ) A2 AN (kN ) (lN ) AN +1 )


(A1 m 1 n1 mN nN

(k1 ) )((l1 ) ) ((kN ) )((lN ) )(A1 AN +1 ).


= (m 1 n1 mN mN

7. Analysis of the Variational Problem


The variational problem (2.4) is quite explicit but, for the reader's convenience, we
collect here some properties of its solution $r_\beta$ with respect to $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$.
We show in particular that $r_\beta > 0$ exists in a non-empty domain of $(\beta, \mu, \lambda, \gamma, h)$
with some monotonicity properties as well as the existence of both first and second
order phase transitions. We conclude this section by giving the asymptotics of $r_\beta$
as $\beta \to \infty$, i.e. by proving Corollary 3.1.
(1) We start by showing that $r_\beta = 0$ for sufficiently small inverse temperatures $\beta$
at fixed $\mu$, $\lambda$, $\gamma$ and $h$. Indeed, for any $r \geq 0$ one computes that
\[ \partial_r f(r) = \gamma \left( \frac{\gamma \sinh(\beta g_r)}{2 g_r (e^{\beta\lambda} \cosh(\beta h) + \cosh(\beta g_r))} - 1 \right), \tag{7.1} \]
cf. Theorem 2.1. Direct estimations show that if $0 < \beta < 2\gamma^{-1}$, then $\partial_r f(r) < 0$ for
any $r \geq 0$, i.e. $r_\beta = 0$.
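As a numerical sanity check of (7.1), the following sketch compares the closed-form derivative with a central finite difference and then checks the high-temperature statement on a grid. It assumes the definitions of Theorem 2.1, which are not restated in this section, namely $f(r) = -\gamma r + \beta^{-1}\ln\{\cosh(\beta h) + e^{-\beta\lambda}\cosh(\beta g_r)\}$ with $g_r = \{(\mu-\lambda)^2 + \gamma^2 r\}^{1/2}$. The function names and the parameter values are illustrative choices, not taken from the paper.

```python
import math


def g(r, mu, lam, gamma):
    # Assumed form of g_r from Theorem 2.1: sqrt((mu - lam)^2 + gamma^2 * r).
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)


def f(r, beta, mu, lam, gamma, h):
    # Assumed form of f(r) from Theorem 2.1.
    gr = g(r, mu, lam, gamma)
    inner = math.cosh(beta * h) + math.exp(-beta * lam) * math.cosh(beta * gr)
    return -gamma * r + math.log(inner) / beta


def d_r_f(r, beta, mu, lam, gamma, h):
    # Closed form (7.1); needs g_r > 0, i.e. mu != lam or r > 0.
    gr = g(r, mu, lam, gamma)
    denom = 2.0 * gr * (math.exp(beta * lam) * math.cosh(beta * h) + math.cosh(beta * gr))
    return gamma * (gamma * math.sinh(beta * gr) / denom - 1.0)


def d_r_f_numeric(r, beta, mu, lam, gamma, h, eps=1e-6):
    # Central finite difference of f in r.
    up = f(r + eps, beta, mu, lam, gamma, h)
    down = f(r - eps, beta, mu, lam, gamma, h)
    return (up - down) / (2.0 * eps)


if __name__ == "__main__":
    params = dict(beta=3.0, mu=1.0, lam=0.45, gamma=2.6, h=0.2)
    print(d_r_f(0.1, **params), d_r_f_numeric(0.1, **params))
    # High temperature: beta = 0.5 < 2 / gamma, so (7.1) should be negative everywhere.
    hot = dict(params, beta=0.5)
    print(all(d_r_f(k / 100.0, **hot) < 0.0 for k in range(101)))
```

The grid check is only an illustration of the bound $\beta\gamma/2 < 1$; it does not replace the analytic estimate.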
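(2) Fix now $\beta > 0$ and $\mu, \lambda, h \in \mathbb{R}$. Then $r_\beta > 0$ for sufficiently large coupling
constants $\gamma$. Indeed, for large enough $\gamma > 0$ there is at least one strictly positive
solution $r_\beta > 0$ of (2.5). Since direct computations using again (2.5) imply that
\[ \frac{d}{d\gamma} \{ f(\beta, \mu, \lambda, \gamma, h; r_\beta(\gamma)) - f(\beta, \mu, \lambda, \gamma, h; 0) \} = r_\beta(\gamma) > 0, \]
and
\[ f(\beta, \mu, \lambda, \gamma, h; r_\beta) - f(\beta, \mu, \lambda, \gamma, h; 0) = O(\gamma) \quad \text{as } \gamma \to \infty, \]
for any fixed $\beta > 0$ and $\mu, \lambda, h \in \mathbb{R}$, there is a unique $\gamma_c > 2|\mu - \lambda|$ such that
$f(r_\beta) > f(0)$, i.e. $r_\beta > 0$ for $\gamma > \gamma_c$. The domain of parameters $(\beta, \mu, \lambda, \gamma, h)$ where
$r_\beta$ is strictly positive is therefore non-empty, cf. Figs. 3 and 4.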

(3) To get an intuitive idea of the behavior of the function $f(r)$ (cf. Theorem 2.1),
we analyze the cardinality of the set $S$ of strictly positive solutions of the gap
equation (2.5):

Lemma 7.1 (Cardinality of the Set S). If $\beta\gamma \leq 6$, the gap equation (2.5) has
at most one strictly positive solution, whereas it has at most two strictly positive
solutions when $\beta\gamma > 6$.

Proof. From (7.1), any strictly positive maximizer $r_\beta > 0$ of (2.4) is a solution of
the equation
\[ h_1(\beta g_r) = 0, \quad \text{with } h_1(x) := \frac{\beta\gamma}{2x} \sinh(x) - e^{\beta\lambda} \cosh(\beta h) - \cosh(x). \tag{7.2} \]
This last equation is equivalent to the gap equation (2.5). For any $x > 0$, observe
that
\[ \partial_x h_1(x) = \frac{\beta\gamma}{2x} \cosh(x) - \left( 1 + \frac{\beta\gamma}{2x^2} \right) \sinh(x) = 0 \tag{7.3} \]
if and only if
\[ (2\beta^{-1}\gamma^{-1})^{1/2}\, y = \sqrt{\frac{y}{\tanh(y)} - 1} =: C(y), \quad y = x > 0. \tag{7.4} \]
The map $y \mapsto C(y)$ is strictly concave for $y > 0$, $C(0) = 0$, and $\partial_y C(0) = (2/6)^{1/2}$.
Therefore, if $\beta\gamma > 6$ there is a unique strictly positive solution $\tilde{y} = \tilde{x} > 0$ of (7.4),
and there is no strictly positive solution of (7.4) when $\beta\gamma < 6$. Since $h_1(0)$ could
be negative in some cases and $h_1(x)$ diverges exponentially to $-\infty$ as $x \to \infty$, the
cardinality of the set of strictly positive solutions of the gap equation (2.5) is at most
two if $\beta\gamma > 6$, or at most one if $\beta\gamma \leq 6$.
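The threshold $\beta\gamma = 6$ in the proof of Lemma 7.1 can be probed numerically. The sketch below reads (7.4) as $(2/(\beta\gamma))^{1/2} y = C(y)$ with $C(y) = \{y/\tanh(y) - 1\}^{1/2}$ and scans a grid for a sign change of $C(y) - (2/(\beta\gamma))^{1/2} y$. The function names, the grid bounds and the step count are arbitrary choices made here; a root closer to zero than the lower grid bound would be missed, so this is an illustration rather than a proof.

```python
import math


def C(y):
    # C(y) = sqrt(y / tanh(y) - 1) from (7.4), defined for y > 0.
    return math.sqrt(y / math.tanh(y) - 1.0)


def has_positive_solution(beta_gamma, y_min=1e-3, y_max=50.0, steps=20000):
    """Scan [y_min, y_max] for a sign change of C(y) - sqrt(2 / beta_gamma) * y."""
    slope = math.sqrt(2.0 / beta_gamma)
    prev = C(y_min) - slope * y_min
    for k in range(1, steps + 1):
        y = y_min + (y_max - y_min) * k / steps
        cur = C(y) - slope * y
        if prev > 0.0 >= cur:
            return True
        prev = cur
    return False


if __name__ == "__main__":
    # Slope of C at the origin, to be compared with (2/6)^(1/2) = 1/sqrt(3).
    print(C(1e-4) / 1e-4, 1.0 / math.sqrt(3.0))
    print(has_positive_solution(8.0), has_positive_solution(4.0))
```

A positive solution of (7.4) is a critical point of $h_1$, which is what allows $h_1$ to have two positive zeros when $\beta\gamma > 6$.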

Consequently, if the gap equation (2.5) has no solution, then $f(r)$ is strictly
decreasing for any $r \geq 0$. If the gap equation (2.5) has a unique solution $r_\beta > 0$,
the function $f(r)$ is increasing up to its (strictly positive) maximizer $r_\beta > 0$ and
decreasing for $r \geq r_\beta$. Finally, when there are two strictly positive solutions of
(2.5), the smaller one must be a local minimum whereas the larger solution must
be a local maximum. In this case the function $f(r)$ decreases for $r \geq 0$ until its
local minimum, then increases until its local maximum, and finally decreases again
to diverge towards $-\infty$. Note that none of these cases can be excluded, i.e. they all
appear depending on $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$. See Figs. 3 and 18.
(4) We study now the dependence of r > 0 with respect to variations of each
parameter. So, let us x the parameters {, , , , h}\{} with = , , , , or
h and consider the function (r, ) := r f (r, ) for r 0 and in the open set of
denition of f (r, ) = f (, , , , h; r), see (7.1). Recall that r > 0 is a solution
at = 0 of the gap equation (2.5), i.e. (r , 0 ) = 0.
Straightforward computations imply that
\[ \partial_r^2 f(r) = \frac{\beta\gamma^4}{4 g_r^2 (e^{\beta\lambda}\cosh(\beta h) + \cosh(\beta g_r))}\, h_2(\beta g_r), \tag{7.5} \]
for any $r > 0$ with
\[ h_2(x) := \frac{e^{\beta\lambda}\cosh(\beta h)\cosh(x) + 1}{e^{\beta\lambda}\cosh(\beta h) + \cosh(x)} - \frac{\sinh(x)}{x}. \tag{7.6} \]

Fig. 18. Illustrations of the function $f(r)$ for $r \in [0, 1/4]$ at $(\mu, \gamma, h) = (1, 2.6, 0)$ with inverse
temperatures $\beta = \beta_c - 0.3$ (orange line), $\beta = \beta_c$ (red line), $\beta = \beta_c + 0.5$ (blue line), and with
coupling constants $\lambda = 0$ (left figure), $\lambda = 0.45$ (center figure) and $\lambda = 0.575$ (right figure).
Here $\beta_c = \theta_c^{-1}$ is the critical inverse temperature which, from left to right, equals 2.04, 3.46 and
6.35, respectively. (Color online.)

It yields that there is at most one strictly positive solution, r 0 of r (r, 0 ) =
0 for each xed set of parameters. For instance, if e cosh(h) 1, then it is
straightforward to check that r (r, 0 ) < 0 for any r > 0. In the situation where
the gap equation (2.5) has two strictly positive solutions, r > 0 cannot solve
r (r, 0 ) = 0, since in this case the equation h2 (x) = 0 would have at least two
strictly positive solutions, as r is a maximizer.
Consequently, to simplify our study we restrict on the very large set of param-
eters where r (r , 0 ) = 0. In this case, the dierential d has maximal rank at
(r , 0 ) and from the implicit function theorem, there are > 0 and a smooth
and strictly positive functionv r () > 0 dened on the ball B (0 ) centered on
the point 0 and with radius such that (, r ()) = 0 for any B (0 ). By
continuity of the function r we can choose > 0 such that r (, r ()) does
not change its sign for B (0 ). Thus r () describes the evolution of the solu-
tion of (2.4) for B (0 ). If r = r (0 ) > 0 is the unique maximizer of (2.4)
with r (r , 0 ) = 0, then the function r () describes the smooth evolution of the
Cooper pair condensate density with respect to small perturbations of 0 . Observe
that

(r (), ) = { r ()}{r (r, )}|r=r () + { (r, )}|r=r () = 0

and {r (r, 0 )}|r=r (0 ) < 0 because r is a maximizer. Consequently, one obtains

sgn{ r (0 )} = sgn{{ r f (r, 0 )}|r=r (0 ) }.

In other words, the function r () of B (0 ) is either increasing if

{ r f (r, 0 )}|r=r (0 ) > 0,

v If = , then of course r () := r .

or decreasing if
{ r f (r, 0 )}|r=r (0 ) < 0,
as soon as r > 0 is the unique maximizer of (2.4) with r (r , 0 ) = 0.
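The sign rule used here is the standard consequence of the implicit function theorem. In the sketch below, $F(r, \xi) := \partial_r f(r, \xi)$ and $\xi$ (with base point $\xi_0$) are stand-in names chosen for this illustration, since the extracted text does not preserve the original symbols; $r_\beta(\xi)$ denotes the smooth branch of solutions of $F(r_\beta(\xi), \xi) = 0$.

```latex
\begin{align*}
0 &= \frac{d}{d\xi} F(r_\beta(\xi), \xi)
   = \partial_\xi r_\beta(\xi)\, \partial_r F(r, \xi)\big|_{r = r_\beta(\xi)}
     + \partial_\xi F(r, \xi)\big|_{r = r_\beta(\xi)}, \\
\partial_\xi r_\beta(\xi_0)
  &= - \frac{\partial_\xi \partial_r f(r, \xi_0)\big|_{r = r_\beta(\xi_0)}}
            {\partial_r^2 f(r, \xi_0)\big|_{r = r_\beta(\xi_0)}} .
\end{align*}
```

At a non-degenerate maximizer the denominator $\partial_r^2 f$ is strictly negative, so $\partial_\xi r_\beta(\xi_0)$ has the same sign as the mixed derivative $\partial_\xi \partial_r f$ evaluated at $r = r_\beta(\xi_0)$.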
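(5) By applying this last result respectively to the parameters $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ (Corollary 3.1)
and $h \in \mathbb{R}$, we obtain that $r_\beta > 0$ is an increasing function of $\gamma > 0$ and a
decreasing function of $|h|$ because via (2.5) one has
\[ \{\partial_\gamma \partial_r f(r, \gamma)\}|_{r=r_\beta} > 4\gamma^{-2}(\mu-\lambda)^2 \geq 0 \]
at fixed parameters $(\beta, \mu, \lambda, h)$ and
\[ \{\partial_h \partial_r f(r, h)\}|_{r=r_\beta} = -\frac{2 g_{r_\beta}\, \beta e^{\beta\lambda} \sinh(\beta h)}{\sinh(\beta g_{r_\beta})} \]
at fixed $(\beta, \mu, \lambda, \gamma)$.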
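The monotonicity in $\gamma$ can be illustrated numerically in a regime where $f$ is strictly concave (here $\lambda = 0$ and $h = 0$), so that the unique zero of $\partial_r f$ is the maximizer. The sketch below uses the closed form (7.1) with the assumed $g_r = \{(\mu-\lambda)^2 + \gamma^2 r\}^{1/2}$ and a plain bisection on $[0, 1/4]$; the function names and the sample parameters are hypothetical choices for this illustration.

```python
import math


def d_r_f(r, beta, mu, lam, gamma, h):
    # Closed form (7.1), with g_r = sqrt((mu - lam)^2 + gamma^2 * r) assumed.
    g = math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)
    denom = 2.0 * g * (math.exp(beta * lam) * math.cosh(beta * h) + math.cosh(beta * g))
    return gamma * (gamma * math.sinh(beta * g) / denom - 1.0)


def positive_critical_point(beta, mu, lam, gamma, h, tol=1e-12):
    """Bisection for a zero of d_r f on (0, 1/4); returns 0.0 if d_r f(0) <= 0.

    At r = 1/4 one has g_r >= gamma / 2, which makes (7.1) negative there,
    so a sign change is bracketed whenever d_r f(0) > 0.
    """
    lo, hi = 0.0, 0.25
    if d_r_f(lo, beta, mu, lam, gamma, h) <= 0.0:
        return 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_r_f(mid, beta, mu, lam, gamma, h) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for gamma in (2.6, 2.8, 3.0):
        r = positive_critical_point(5.0, 1.0, 0.0, gamma, 0.0)
        # Compare with the zero-temperature value 1/4 - (mu - lam)^2 / gamma^2.
        print(gamma, r, 0.25 - 1.0 / gamma ** 2)
```

Outside the concave regime the bisection may land on a critical point that is not the global maximizer, so the routine should not be used there without comparing values of $f$.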
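(6) If $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$, for any fixed $(\beta, \lambda, \gamma, h)$ the order parameter $r_\beta > 0$ is a
decreasing function of $|\mu - \lambda|$ under the condition that $e^{\beta\lambda}\cosh(\beta h) \leq 1$, as
\[ \{\partial_\mu \partial_r f(r, \mu)\}|_{r=r_\beta} = \frac{\beta\gamma^2 (\mu-\lambda)}{2 g_{r_\beta}^2 (e^{\beta\lambda}\cosh(\beta h) + \cosh(\beta g_{r_\beta}))}\, h_2(\beta g_{r_\beta}), \]
cf. (7.6). If $e^{\beta\lambda}\cosh(\beta h) > 1$, the behavior of $r_\beta > 0$ is no longer monotone as a
function of $|\mu - \lambda|$ ($\lambda$ being fixed), cf. Fig. 10.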
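The behavior of $r_\beta$ as a function of $\beta$ or $\lambda$ is also not clear in general. But, at
least as a function of the inverse temperature $\beta > 0$, we can give simple sufficient
conditions to get its monotonicity. Indeed, direct computations show that
\[ \{\partial_\beta \partial_r f(r, \beta)\}|_{r=r_\beta} = (\gamma + 2\lambda)\, g_{r_\beta} \frac{\cosh(\beta g_{r_\beta})}{\sinh(\beta g_{r_\beta})} - (\lambda\gamma + 2 g_{r_\beta}^2) - 2 h g_{r_\beta} \frac{e^{\beta\lambda}\sinh(\beta h)}{\sinh(\beta g_{r_\beta})}. \]
By combining this last equality with (2.5), we then get that
\[ \{\partial_\beta \partial_r f(r, \beta)\}|_{r=r_\beta} \geq 0 \tag{7.7} \]
with $r_\beta > 0$ if and only if
\[ g_{r_\beta}^2 \leq \frac{\gamma (\gamma \cosh(\beta g_{r_\beta}) - 2 e^{\beta\lambda}\cosh(\beta h)(\lambda + h\tanh(\beta h)))}{4 (\cosh(\beta g_{r_\beta}) + e^{\beta\lambda}\cosh(\beta h))}. \tag{7.8} \]
From (2.5) combined with $\tanh(x) < 1$, we also have
\[ g_{r_\beta}^2 < \frac{\gamma^2 \cosh^2(\beta g_{r_\beta})}{4 (\cosh(\beta g_{r_\beta}) + e^{\beta\lambda}\cosh(\beta h))^2}. \tag{7.9} \]
Therefore, a sufficient condition to satisfy the inequality (7.8) is obtained by bounding
the right-hand side of (7.9) with the r.h.s. of (7.8). From (2.5) this implies the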

condition
\[ g_{r_\beta} \geq (\lambda + h \tanh(\beta h)) \tanh(\beta g_{r_\beta}), \]
under which $r_\beta$ is an increasing function of $\beta > 0$. This inequality is also equivalent
to

e cosh(h)
gr tanh(gr ) ( + h tanh(h)) .
2 cosh(gr )
In particular, by using again the gap equation (2.5), if
\[ \gamma > 2 (\lambda + h\tanh(\beta h)) \left( 1 + \frac{e^{\beta\lambda}\cosh(\beta h)}{\cosh(\beta g_{r_\beta})} \right), \]
then $r_\beta > 0$ is an increasing function of $\beta > 0$. Since $\tanh x \leq 1$, another sufficient
condition to get (7.7) is $\lambda + |h| \leq g_{r_\beta}$. In particular, if $\lambda < |\mu - \lambda|$ and $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$
with $h$ sufficiently small, then $r_\beta > 0$ is again an increasing function of $\beta > 0$.
Therefore, the domain of $(\mu, \lambda, \gamma, h)$ where $r_\beta > 0$ is proven to be an increasing
function of $\beta > 0$ is rather large. Actually, from a large number of numerical computations,
we conjecture that $r_\beta > 0$ is always an increasing function of $\beta > 0$. In
other words, this conjecture implies that the condition expressed in Corollary 3.1 on
$(\mu, \lambda, \gamma, h)$ should be necessary to obtain a superconductor at a fixed temperature.
(7) Observe that the order of the phase transition depends on the parameters. For
instance, assume $\lambda \leq 0$, $h = 0$ and $\gamma > \Gamma_{|\mu-\lambda|,\lambda}$. Then, at any inverse temperature
$\beta > 0$ it follows from (7.5) that $f(r)$ is a strictly concave function of $r > 0$.
This property justifies the existence and uniqueness of the inverse temperature $\beta_c$
solution of the equation
\[ \frac{\tanh(\beta |\mu-\lambda|)}{|\mu-\lambda|} = \frac{2}{\gamma} \left( 1 + \frac{e^{\beta\lambda}}{\cosh(\beta|\mu-\lambda|)} \right), \]
i.e. (2.5) for $\lambda \leq 0$, $h = 0$ and $r = 0$. In particular, $\beta_c$ is such that the Cooper pair
condensate density continuously goes from $r_\beta = 0$ for $\beta \leq \beta_c$ to $r_\beta > 0$ for $\beta > \beta_c$.
In this case the superconducting phase transition is of second order, cf. Fig. 3.
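The equation for $\beta_c$ is easy to solve by bisection. The sketch below treats the gap equation at $r = 0$ and $h = 0$ as a residual in $\beta$ and brackets its sign change; the function names and the bracket $[10^{-3}, 50]$ are choices made here. If the caption of Fig. 18 is read as $(\mu, \gamma, h) = (1, 2.6, 0)$ with $\lambda = 0$ for the left panel, the computed value comes out close to the quoted 2.04.

```python
import math


def gap_residual_at_zero(beta, mu, lam, gamma):
    # tanh(beta*m)/m - (2/gamma) * (1 + exp(beta*lam)/cosh(beta*m)), with m = |mu - lam| > 0, h = 0.
    m = abs(mu - lam)
    lhs = math.tanh(beta * m) / m
    rhs = (2.0 / gamma) * (1.0 + math.exp(beta * lam) / math.cosh(beta * m))
    return lhs - rhs


def critical_beta(mu, lam, gamma, lo=1e-3, hi=50.0, tol=1e-10):
    """Bisection for the zero of the residual; raises if the bracket shows no sign change."""
    f_lo = gap_residual_at_zero(lo, mu, lam, gamma)
    f_hi = gap_residual_at_zero(hi, mu, lam, gamma)
    if f_lo * f_hi > 0.0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = gap_residual_at_zero(mid, mu, lam, gamma)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(critical_beta(mu=1.0, lam=0.0, gamma=2.6))
```

The routine only locates the point where $\partial_r f(0)$ changes sign, so it gives the critical inverse temperature only when the transition is of second order, as in the regime described above.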
The appearance of a first order phase transition at some fixed $(\mu, \lambda, \gamma, h)$ is also
not surprising. Indeed, recall that the function $f(r)$ may have a local minimum
and a local maximum, see discussions below Lemma 7.1. For instance, assume now
$\lambda = \mu > 0$, $h = 0$ and $4\lambda = \Gamma_{0,\lambda} < \gamma \leq 6\lambda$. Then, from (7.1) for $r = 0$,
\[ \partial_r f(0) = \frac{\gamma}{e^{\beta\lambda} + 1} \left( \frac{\beta\gamma}{2} - (e^{\beta\lambda} + 1) \right). \]
Since by explicit computations
\[ \min_{x > 0} \left\{ \frac{e^x + 1}{x} \right\} > 3, \]
it follows that $\partial_r f(0) < 0$ for any $\beta > 0$ whenever $\lambda = \mu > 0$, $h = 0$ and $0 < \gamma \leq 6\lambda$.
Therefore, as soon as there is a superconducting phase transition, for instance if
$4\lambda < \gamma \leq 6\lambda$ (cf. Corollary 3.1), the function $r_\beta$ of $\beta > 0$ must be discontinuous at
the critical point. This case is an example of a first order superconducting phase
transition. Numerical illustrations of a similar first order phase transition are also
given in Fig. 3.
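(8) We conclude this section by a computation of the asymptotics of the order
parameter $r_\beta$ as $\beta \to \infty$. We prove in particular Corollary 3.1.
From (2.6), we already know that $r_\beta = 0$ for any $\gamma \leq 2|\mu - \lambda|$.
Therefore, we consider here that $\gamma > 2|\mu - \lambda|$ and we look for the domain where the
parameter $r_\beta$ is strictly positive in the limit $\beta \to \infty$. Recall that $r_\beta$ is a solution of
the variational problem (2.4), i.e.
\[ \beta^{-1} \ln 2 + \sup_{r \geq 0} f(r) = \sup_{r \geq 0} \left\{ -\gamma r + \beta^{-1} \ln\{ e^{\beta h} + e^{-\beta h} + e^{-\beta(\lambda - g_r)} + e^{-\beta(g_r + \lambda)} \} \right\}. \tag{7.10} \]
When $\beta \to \infty$ the last exponential term can always be neglected for our analysis
since $g_r \geq 0$.
Now, assume first that $g_0 = |\mu - \lambda| > \lambda + |h|$. Then $g_r > \lambda + |h|$ for any $r \geq 0$ and
when $\beta \to \infty$ the function $f(r)$ converges to
\[ w(r) := -\gamma r + g_r - \lambda. \]
In particular, the order parameter $r_\beta$ converges towards the unique maximizer $r_{\max}$
(2.6) of the function $w(r)$ for $r \geq 0$, i.e.
\[ r_\infty := \lim_{\beta \to \infty} r_\beta = r_{\max}, \tag{7.11} \]
for any $\gamma > 2|\mu - \lambda|$ and real numbers $\mu, \lambda, h$ satisfying $|\mu - \lambda| > \lambda + |h|$.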
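For orientation, the maximizer of the limiting function can be computed by hand. The sketch below assumes $g_r = \{(\mu-\lambda)^2 + \gamma^2 r\}^{1/2}$ (the form taken from Theorem 2.1, not restated in this section) and $w(r) = -\gamma r + g_r - \lambda$; the resulting value should coincide with $r_{\max}$ of (2.6), which is likewise not reproduced here.

```latex
\begin{align*}
w'(r) &= -\gamma + \frac{\gamma^2}{2 g_r} = 0
  \iff g_r = \frac{\gamma}{2}
  \iff r = \frac{1}{4} - \frac{(\mu - \lambda)^2}{\gamma^2}, \\
w''(r) &= -\frac{\gamma^4}{4 g_r^3} < 0 .
\end{align*}
```

The critical point is strictly positive exactly when $\gamma > 2|\mu - \lambda|$, which matches the restriction made at the start of this step, and strict concavity of $w$ makes it the unique maximizer on $r \geq 0$.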
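Assume now that $|\mu - \lambda| \leq \lambda + |h|$ and let $r_{\min}$ be the solution of $g_r = \lambda + |h|$, i.e.
\[ r_{\min} := \gamma^{-2} ((\lambda + |h|)^2 - (\mu - \lambda)^2) \geq 0. \tag{7.12} \]
Then, for any $r \in [0, r_{\min}]$,
\[ f(r) = -\gamma r + |h| + o(1) \quad \text{as } \beta \to \infty. \]
In particular, since $\gamma > 0$,
\[ \sup_{0 \leq r \leq r_{\min}} f(r) = |h| + o(1) \quad \text{as } \beta \to \infty, \tag{7.13} \]
the supremum being attained at a point of order $o(1)$.
The solution $r_\beta$ of the variational problem (7.10) converges either to 0, or to some
strictly positive value $r_\infty > r_{\min}$. In the case where $r_\infty > r_{\min}$, we would have
\[ f(r_\beta) = w(r_\infty) + o(1) \quad \text{as } \beta \to \infty. \tag{7.14} \]
Now, if $|\mu-\lambda| \leq \lambda + |h|$ and $\gamma \leq 2(\lambda + |h|)$, then $r_{\min} \geq r_{\max}$, cf. (2.6) and (7.12). In
this regime, straightforward computations show that
\[ |h| - \sup_{r \geq r_{\min}} w(r) = |h| - w(r_{\min}) = \gamma^{-1} ((|h| + \lambda)^2 - (\mu-\lambda)^2) \geq 0. \tag{7.15} \]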

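In other words, the order parameter $r_\beta$ converges towards
\[ r_\infty := \lim_{\beta\to\infty} r_\beta = 0, \tag{7.16} \]
for any $\gamma \leq 2(\lambda + |h|)$ and real numbers $\mu, \lambda, h$ satisfying $|\mu-\lambda| \leq \lambda + |h|$.
However, if $|\mu-\lambda| \leq \lambda+|h|$ and $\gamma > 2(\lambda+|h|)$, then $r_{\min} < r_{\max}$. In particular
one gets
\[ |h| - \sup_{r \geq r_{\min}} w(r) = |h| - w(r_{\max}) = \frac{1}{4\gamma} (\Gamma_{|\mu-\lambda|,\lambda+|h|} - \gamma)(\gamma - \tilde{\Gamma}_{|\mu-\lambda|,\lambda+|h|}), \tag{7.17} \]
with $\Gamma_{x,y} \geq 2y$ defined for any $x \in \mathbb{R}_+$ and $y \in \mathbb{R}$ in Corollary 3.1 and
\[ \tilde{\Gamma}_{|\mu-\lambda|,\lambda+|h|} := 2 (\lambda + |h| - \{(\lambda+|h|)^2 - (\mu-\lambda)^2\}^{1/2}) \leq 2|\mu-\lambda|. \]
In particular,
\[ \sup_{r \geq r_{\min}} w(r) = w(r_{\max}) > |h|, \tag{7.18} \]
for any $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|} \geq 2|\mu-\lambda|$. Therefore, by combining (7.13) with (7.14) and
(7.18), we obtain
\[ r_\infty := \lim_{\beta\to\infty} r_\beta = r_{\max}, \tag{7.19} \]
for any $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ and real numbers $\mu, \lambda, h$ satisfying $|\mu-\lambda| \leq \lambda+|h|$.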
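Finally, if $\gamma = \Gamma_{|\mu-\lambda|,\lambda+|h|}$ and $|\mu-\lambda| < \lambda+|h|$, observe that (7.17) is zero. So,
we analyze the next order term to know which number, 0 or $r_{\max}$, maximizes the
function $f(r)$ when $\beta \to \infty$. On the one hand, straightforward estimations imply
that
\[ f(0) - |h| = \beta^{-1} (e^{-\beta(\lambda+|h|-|\mu-\lambda|)} + e^{-2\beta|h|})(1 + o(1)) \quad \text{as } \beta \to \infty. \tag{7.20} \]
On the other hand, if $\gamma = \Gamma_{|\mu-\lambda|,\lambda+|h|}$ with $|\mu-\lambda| < \lambda+|h|$, then by using (2.6) one
obtains
\[ f(r_{\max}) - |h| = \beta^{-1} e^{-\beta \{(\lambda+|h|)^2 - (\mu-\lambda)^2\}^{1/2}} (1 + o(1)) \quad \text{as } \beta \to \infty. \tag{7.21} \]
Therefore, if $\gamma = \Gamma_{|\mu-\lambda|,\lambda+|h|}$ and $|\mu-\lambda| < \lambda+|h|$, it is straightforward to check from (7.20)
and (7.21) that $f(0) > f(r_{\max})$ when $\beta \to \infty$.
Consequently, the limits (7.11), (7.16) and (7.19) together with (2.6) imply
Corollary 3.1 for any $\gamma \neq \Gamma_{|\mu-\lambda|,\lambda+|h|}$, whereas if $\gamma = \Gamma_{|\mu-\lambda|,\lambda+|h|}$, the order parameter
$r_\beta$ converges to $r_\infty = 0$.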

Appendix. Griffiths Arguments

As we have an explicit representation of the pressure, it can be verified in some
cases that $r_\beta$ is a $C^1$-function$^{w}$ of parameters implying that $p(\beta, \mu, \lambda, \gamma, h)$ is differentiable
with respect to parameters. In this particular situation, the proofs of

w For instance, for special choices of parameters one could check that r (r , 0 )
= 0, see Sec. 7.

Theorems 3.1, 3.2, 3.4–3.7 done in Sec. 6.2 could also be performed without our
notion of equilibrium states by using Griffiths arguments [29–31], which are based
on convexity properties of the pressure. We explain this briefly and we conclude with
a discussion of an alternative proof of Theorem 3.2.

Remark A.1. Our method gives access to all correlation functions at once (cf. Theorem 6.5).
It is generalized in [18] to all translation invariant Fermi systems. However,
computing all correlation functions with Griffiths arguments [29–31] requires
the differentiability of the pressure with respect to any perturbation as well as the
computation of its corresponding derivative. This is generally a very hard task, for
instance for correlation functions involving many lattice points.

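(1) Take self-adjoint operators $P_N$ acting on the fermionic Fock space and assume
the existence of the (infinite volume) grand-canonical pressure
\[ p_\alpha(\beta, \mu, \lambda, \gamma, h) := \lim_{N \to \infty} p_{N,\alpha}(\beta, \mu, \lambda, \gamma, h) \]
for any fixed $\alpha$ in a neighborhood $V$ of 0. In this case, observe that the finite volume
pressure
\[ p_{N,\alpha}(\beta, \mu, \lambda, \gamma, h) := \frac{1}{\beta N} \ln \mathrm{Trace}(e^{-\beta(H_N - \alpha P_N)}) \]
is convex as a function of $\alpha \in V$ and
\[ \partial_\alpha p_{N,0} = N^{-1} \omega_N(P_N). \]
Consequently, the point-wise convergence of the function $p_{N,\alpha}$ towards $p_\alpha$ implies
that
\[ \liminf_{N\to\infty} \lim_{\alpha \to 0^-} \partial_\alpha p_{N,\alpha} \geq \lim_{\alpha\to 0^-} \partial_\alpha p_\alpha \quad \text{and} \quad \limsup_{N\to\infty} \lim_{\alpha\to 0^+} \partial_\alpha p_{N,\alpha} \leq \lim_{\alpha\to 0^+} \partial_\alpha p_\alpha, \tag{A.1} \]
see Griffiths lemma [30, 31] or [29, Appendix C]. In particular, one gets
\[ \lim_{N\to\infty} \{\partial_\alpha p_{N,0}\} = \lim_{N\to\infty} \{N^{-1}\omega_N(P_N)\} = \partial_\alpha p_{\alpha=0}, \tag{A.2} \]
under the assumption that $p_\alpha$ is differentiable at $\alpha = 0$.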


(2) Therefore, by taking

PN = ax, ax, ay, ay, ,
x,yN

we obtain from (A.2) that



1 

lim a a a y, a y, = p(, , , , h),
N N 2
x, x,
x,yN

as soon as the (infinite volume) pressure $p(\beta, \mu, \lambda, \gamma, h)$ has continuous derivative
with respect to $\gamma > 0$. Combined with Theorem 2.1 and (2.5) we would obtain
Theorem 3.1. Meanwhile, Theorems 3.4–3.7 could have been deduced in the same
way from (A.2) combined with explicit computations using (2.5).

(3) A direct proof of Theorem 3.2 using Griffiths arguments is more delicate. One
uses similar arguments as in [29, 42]. We give them for the interested reader.
For any $\phi \in [0, 2\pi)$, first recall that the pressure $p_{\alpha,\phi}$ associated with $H_{N,\alpha,\phi}$
(3.1) in the thermodynamic limit is given by (6.41), which equals (6.43). Additionally,
if the parameters $\beta$, $\mu$, $\lambda$, $\gamma$, and $h$ are such that (2.4) has a unique maximizer
$r_\beta$, then the variational problem (6.43) has a unique maximizer $c_{\beta,\alpha,\phi} \in e^{i\phi}\mathbb{R}$ for
$\alpha > 0$ sufficiently small, and $c_{\beta,\alpha,\phi}$ converges to $r_\beta^{1/2} e^{i\phi}$ as $\alpha \to 0$, see proof of
Theorem 6.2.
Now, let us denote by

NN := (nx, + nx, )
xN

the full particle number operator. By straightforward computations, observe that

[ax, , NN ] = ax, and [ax, , NN ] = ax, , (A.3)

for any lattice site labelled by x N , where [A, B] := AB BA. Therefore the
i
unitary operator U := e 2 NN realizes a global gauge transformation because one
deduces from (A.3) that
i i
U ax, U = e 2 ax, and U ax, U = e 2 ax, . (A.4)

In particular, the unitary transformation of the Hamiltonian HN,, (3.1) equals

U HN,, U = HN,,0 .
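The step from (A.3) to (A.4) is the usual one-parameter-group argument. The sketch below writes it for a single annihilation operator $a$ satisfying $[a, N] = a$ and for $U_\theta := e^{i\theta N}$. The ordering and sign conventions (conjugation by $U$ versus $U^*$, and the choice $\theta = \phi/2$) are assumptions made here, because the extracted text does not preserve them.

```latex
\begin{align*}
A(\theta) &:= e^{i\theta N} a\, e^{-i\theta N}, \qquad A(0) = a, \\
\frac{d}{d\theta} A(\theta) &= i\, e^{i\theta N} [N, a]\, e^{-i\theta N} = -i A(\theta),
  \qquad \text{since } [N, a] = -[a, N] = -a, \\
A(\theta) &= e^{-i\theta} a, \qquad
A(\theta)^* = e^{i\theta N} a^* e^{-i\theta N} = e^{i\theta} a^* .
\end{align*}
```

With $\theta = \phi/2$, a product of two annihilation operators acquires the factor $e^{-i\phi}$ and a product of two creation operators the factor $e^{i\phi}$, which is the mechanism behind the phase relation between the Hamiltonians at $\phi$ and at $\phi = 0$.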

It implies on the corresponding Gibbs states (6.42) that

N,,(BN ) = ei N,,0 (BN ), (A.5)

with the operator $B_N$ defined by



BN := ax, ax, .
xN

In other words, it suffices to prove Theorem 3.2 for $\phi = 0$.

Take $\phi = 0$. Observe that

0 = N,,0 ([HN,,0 , NN ]) = N,,0 (BN BN ). (A.6)

Additionally, by using the positive semidefinite Bogoliubov–Duhamel scalar
product
"
(X, Y )HN,,0 := 1 eN pN,,0 (,,,,h) Trace(e( )HN,,0 X e HN,,0 Y )d
0

with respect to the Hamiltonian HN,,0 (see, e.g., [25, 29, 42]), one gets that
0 ([NN , HN,,0 ], [NN , HN,,0 ])HN,,0
= N,,0 ([NN , [HN,,0 , NN ]]) = N,,0 (BN + BN ). (A.7)

So, by combining (A.6) with (A.7) it follows that

N,,0 (BN ) = N,,0(BN ) 0

for any 0. In particular N,,0 (BN ) = N,,0(BN ) is a real number.


The function pN,,0 is a convex function of 0 because
({(BN + BN ) N,,0 (BN + BN )}, {(BN + BN ) N,,0 (BN + BN )})HN,,0
= 2 pN,,0 (, , , , h).

Then, under the assumption that p,0 is dierentiable at = 0 away from any
critical point, the equations (A.2), with
PN = BN + BN

and (6.43), imply that


 
1 1
lim N,,0 (BN + BN ) = lim ln Trace(eHN,,0 )
N N N N
= p,0 (, , , , h)
= c,,0 (a a + a a ),

for any > 0 suciently small and with c () dened for any c C by (6.1).
Returning back to the original Hamiltonian HN,, (3.1) for any [0, 2), we
conclude from (A.5) combined with the last equalities that
 
1  ei
lim N,,(ax, ax, ) = c (a a + a a ).
N N 2 ,,0
xN

Therefore, by taking the limit 0, Theorem 3.2 would follow if one additionally
checks that p,0 is dierentiable at = 0 away from any critical point.

Acknowledgments
We are very grateful to Volker Bach and Jakob Yngvason for their hospitality at
the Erwin Schrödinger International Institute for Mathematical Physics, at the
Physics University of Vienna, and at the Institute of Mathematics of the Johannes
Gutenberg-University that allowed us to work on different aspects of the present
paper. We also thank N. S. Tonchev and V. A. Zagrebnov for giving us relevant
references, as well as the referee for having helped us to improve the paper. Additionally,
J.-B. B. especially thanks the mathematical physics group of the Department
of Physics of the University of Vienna for the very nice working environment.

References
[1] E. Størmer, Symmetric states of infinite tensor product C*-algebras, J. Funct. Anal. 3 (1969) 48–68.
[2] J. R. Schrieffer and M. Tinkham, Superconductivity, Rev. Mod. Phys. 71 (1999) S313–S317.
[3] Y. Yanase, T. Jujo, T. Nomura, H. Ikeda, T. Hotta and K. Yamada, Theory of superconductivity in strongly correlated electron systems, Phys. Rep. 387 (2003) 1–149.
[4] P. A. Lee, N. Nagaosa and X.-G. Wen, Doping a Mott insulator: Physics of high-temperature superconductivity, Rev. Mod. Phys. 78 (2006) 17–85.
[5] S. T. Beliaev, Application of the methods of quantum field theory to a system of bosons, Sov. Phys. JETP 7 (1958) 289–299.
[6] W. Thirring and A. Wehrl, On the mathematical structure of the B.C.S.-model, Comm. Math. Phys. 4 (1967) 303–314.
[7] W. Thirring, On the mathematical structure of the B.C.S.-model. II, Comm. Math. Phys. 7 (1968) 181–189.
[8] D. J. Thouless, The Quantum Mechanics of Many-Body Systems, 2nd edn. (Academic Press, New York, 1972).
[9] N. G. Duffield and J. V. Pulé, A new method for the thermodynamics of the BCS model, Comm. Math. Phys. 118 (1988) 475–494.
[10] G. A. Raggio and R. F. Werner, The Gibbs variational principle for general BCS-type models, Europhys. Lett. 9 (1989) 633–638.
[11] I. A. Bernadskii and R. A. Minlos, Exact solution of the BCS model, Theor. Math. Phys. 12(2) (1972) 779–787.
[12] N. Ilieva and W. Thirring, High-Tc superconductivity by phase cloning, arXiv:hep-th/0701245v3 (2007).
[13] N. N. Bogoliubov, V. V. Tolmachev and D. V. Shirkov, A New Method in the Theory of Superconductivity (Academy of Sciences Press, Moscow, 1958) and (Consult. Bureau, Inc., N.Y., Chapman Hall Ltd., London, 1959).
[14] R. J. Bursill and C. J. Thompson, Variational bounds for lattice fermion models II: Extended Hubbard model in the atomic limit, J. Phys. A Math. Gen. 26 (1993) 4497–4511.
[15] F. P. Mancini, F. Mancini and A. Naddeo, Exact solution of the extended Hubbard model in the atomic limit on the Bethe lattice, arXiv:0711.0318v1 (2007).
[16] I. G. Brankov and N. S. Tonchev, On the s-d model for coexistence of ferromagnetism and superconductivity, Phys. Stat. Sol. (B) 102 (1980) 179–187.
[17] N. N. Bogoliubov Jr., A. N. Ermilov and A. M. Kurbatov, On coexistence of superconductivity and ferromagnetism, Phys. A 101 (1980) 613–628.
[18] J.-B. Bru and W. de Siqueira Pedra, Non-cooperative equilibria of Fermi systems with long range interactions, in preparation.
[19] D. Petz, G. A. Raggio and A. Verbeure, Asymptotics of Varadhan-type and the Gibbs variational principle, Comm. Math. Phys. 121 (1989) 271–282.
[20] G. A. Raggio and R. F. Werner, Quantum statistical mechanics of general mean field systems, Helv. Phys. Acta 62 (1989) 980–1003.
[21] G. A. Raggio and R. F. Werner, The Gibbs variational principle for inhomogeneous mean field systems, Helv. Phys. Acta 64 (1991) 633–667.
[22] F. Hiai, M. Mosonyi, H. Ohno and D. Petz, Free energy density for mean field perturbation of states of a one-dimensional spin chain, Rev. Math. Phys. 20(3) (2008) 335–365.
[23] W. De Roeck, C. Maes, K. Netočný and L. Rey-Bellet, A note on the non-commutative Laplace-Varadhan integral Lemma, arXiv:0808.0293v2 [math-ph] (2009).
[24] G. L. Sewell, Quantum Theory of Collective Phenomena (Clarendon Press, Oxford, 1986).
[25] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. II, 2nd edn. (Springer-Verlag, New York, 1996).
[26] R. Haag, The mathematical structure of the Bardeen–Cooper–Schrieffer model, Il Nuovo Cimento 25(2) (1962) 287–299.
[27] G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory (Wiley-Interscience, New York, 1972).
[28] L. Accardi, De Finetti theorem, in Encyclopaedia of Mathematics, ed. M. Hazewinkel (Kluwer Academic Publishers, 2001).
[29] V. A. Zagrebnov and J.-B. Bru, The Bogoliubov model of weakly imperfect Bose gas, Phys. Rep. 350 (2001) 291–434.
[30] R. Griffiths, A proof that the free energy of a spin system is extensive, J. Math. Phys. 5 (1964) 1215–1222.
[31] K. Hepp and E. H. Lieb, Equilibrium statistical mechanics of matter interacting with the quantized radiation field, Phys. Rev. A 8 (1973) 2517–2525.
[32] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. I, 2nd edn. (Springer-Verlag, New York, 1996).
[33] M. Fannes, H. Spohn and A. Verbeure, Equilibrium states for mean field models, J. Math. Phys. 21(2) (1980) 355–358.
[34] N. N. Bogoliubov Jr., J. G. Brankov, V. A. Zagrebnov, A. M. Kurbatov and N. S. Tonchev, Metod approksimiruyushchego gamiltoniana v statisticheskoi fizike^x (Izdat. Bulgar. Akad. Nauk,^y Sofia, 1981).
[35] N. N. Bogoliubov Jr., J. G. Brankov, V. A. Zagrebnov, A. M. Kurbatov and N. S. Tonchev, Some classes of exactly soluble models of problems in Quantum Statistical Mechanics: The method of the approximating Hamiltonian, Russ. Math. Surv. 39 (1984) 1–50.
[36] J. G. Brankov, D. M. Danchev and N. S. Tonchev, Theory of Critical Phenomena in Finite-Size Systems: Scaling and Quantum Effects (World Scientific, 2000).
[37] N. N. Bogoliubov Jr., On model dynamical systems in statistical mechanics, Physica 32 (1966) 933–944.
[38] C. N. Yang, Concept of off-diagonal long range order and the quantum phases of liquid He and of superconductors, Rev. Mod. Phys. 34 (1962) 694–704.
[39] S. Adams and T. Dorlas, C*-Algebraic approach to the Bose–Hubbard model, J. Math. Phys. 48 (2007) 103304, 14 pp.
[40] H. Araki and H. Moriya, Equilibrium statistical mechanics of fermion lattice systems, Rev. Math. Phys. 15 (2003) 93–198.
[41] R. R. Phelps, Lectures on Choquet's Theorem, Lecture Notes in Mathematics, Vol. 1757, 2nd edn. (Springer-Verlag, 2001).
[42] J. Ginibre, On the asymptotic exactness of the Bogoliubov approximation for many Bosons systems, Comm. Math. Phys. 8 (1968) 26–51.

x The Approximating Hamiltonian Method in Statistical Physics.
y Publ. House Bulg. Acad. Sci.
April 20, 2010 14:17 WSPC/S0129-055X 148-RMP J070-00396

Reviews in Mathematical Physics
Vol. 22, No. 3 (2010) 305–329
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003965

ON SEMICLASSICAL AND UNIVERSAL INEQUALITIES
FOR EIGENVALUES OF QUANTUM GRAPHS

SEMRA DEMIREL and EVANS M. HARRELL, II


Department of Mathematics, University of Stuttgart,
Pfaenwaldring 57, D-70569 Stuttgart, Germany
Semra.Demirel@mathematik.uni-stuttgart.de
School of Mathematics, Georgia Institute of Technology,
Atlanta GA 30332-0160, USA
harrell@math.gatech.edu

Received 12 November 2009

We study the spectra of quantum graphs with the method of trace identities (sum rules),
which are used to derive inequalities of Lieb–Thirring, Payne–Pólya–Weinberger, and
Yang types, among others. We show that the sharp constants of these inequalities and
even their forms depend on the topology of the graph. Conditions are identified under
which the sharp constants are the same as for the classical inequalities. In particular, this
is true in the case of trees. We also provide some counterexamples where the classical
form of the inequalities is false.

Keywords: Quantum graph; semiclassical; Lieb–Thirring inequality; sum rule; universal
spectral bounds.

Mathematics Subject Classification 2010: 81Q35, 34L15, 34L40, 81Q20, 47E05, 47A75

1. Introduction
This article is focused on inequalities for the means, moments, and ratios of eigenvalues
of quantum graphs. A quantum graph is a metric graph with one-dimensional
Schrödinger operators acting on the edges and appropriate boundary conditions
imposed at the vertices and at the finite external ends, if any. Here we shall define
the Hamiltonian $H$ on a quantum graph $\Gamma$ as the minimal (Friedrichs) self-adjoint
extension of the quadratic form
\[ C_c^\infty \ni \varphi \mapsto E(\varphi) := \int_\Gamma |\varphi'|^2 \, ds, \tag{1.1} \]
which leads to vanishing Dirichlet boundary conditions at the ends of exterior edges
and to the conditions at each vertex $v_k$ that $\varphi$ is continuous and moreover
\[ \sum_j \frac{\partial \varphi}{\partial x_{kj}}(0^+) = 0, \tag{1.2} \]


where the sum runs over all edges emanating from $v_k$, and $x_{kj}$ designates the
distance from $v_k$ along the $j$th edge. (Edges connecting $v_k$ to itself are counted
twice.) In the literature, these vertex conditions are usually known as Kirchhoff or
Neumann conditions. Other vertex conditions are possible, and are amenable to our
methods with some complications, but they will not be considered in this article.
For details about the definition of $H$, we refer to [15].
Quantum mechanics on graphs has a long history in physics and physical chemistry
[21, 24], but recent progress in experimental solid state physics has renewed
attention on them as idealized models for thin domains. While the problem of quantum
systems in high dimensions has to be solved numerically, since quantum graphs
are locally one-dimensional their spectra can often be determined explicitly. A large
literature on the subject has arisen, for which we refer to the bibliography given
in [3, 7].
The subject of inequalities for means, moments, and ratios of eigenvalues is
rather well developed for Laplacians on domains and for Schrödinger operators,
and it is our aim to determine the extent to which analogous theorems apply to
quantum graphs. For example, when there is a potential energy $V(x)$ in appropriate
function spaces, Lieb–Thirring inequalities provide an upper bound for the moments
of the negative eigenvalues $E_j(\alpha)$ of the Schrödinger operator
$H(\alpha) = -\alpha\nabla^2 + V(x)$ in $L^2(\mathbb{R}^d)$, $\alpha > 0$, of the form

$$
\alpha^{d/2} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^{\gamma} \le L_{\gamma,d} \int_{\mathbb{R}^d} (V_-(x))^{\gamma+d/2} \, dx \tag{1.3}
$$

for some constant $L_{\gamma,d} \ge L^{cl}_{\gamma,d}$, where $L^{cl}_{\gamma,d}$, known as the classical constant, is
given by

$$
L^{cl}_{\gamma,d} = \frac{1}{(4\pi)^{d/2}} \, \frac{\Gamma(\gamma+1)}{\Gamma(\gamma+d/2+1)}.
$$

It is known that (1.3) holds true for various ranges of $\gamma \ge 0$ depending on the
dimension $d$; see [5, 13, 19, 20, 23, 27]. In particular, in [18] Laptev and Weidl proved
that $L_{\gamma,d} = L^{cl}_{\gamma,d}$ for all $\gamma \ge 3/2$ and $d \ge 1$, and Stubbe [25] has recently given
a new proof of sharp Lieb–Thirring inequalities for $\gamma \ge 2$ and $d \ge 1$ by showing
monotonicity with respect to coupling constants. His proof is based on general trace
identities for operators [11, 12] known as sum rules, which will again be used as the
foundation of the present article.
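As a quick numerical illustration of the classical constant defined above, the following minimal Python sketch evaluates $L^{cl}_{\gamma,d}$ from the Gamma-function formula. The function name `classical_lt_constant` is a hypothetical helper introduced only for this illustration; it is not part of the original article.

```python
import math


def classical_lt_constant(gamma: float, d: int) -> float:
    """Classical constant L^cl_{gamma,d} = Gamma(gamma+1) / ((4 pi)^(d/2) Gamma(gamma+d/2+1)).

    Hypothetical helper for illustration only.
    """
    return math.gamma(gamma + 1) / ((4 * math.pi) ** (d / 2) * math.gamma(gamma + d / 2 + 1))


if __name__ == "__main__":
    # One-dimensional values relevant for quantum graphs.
    print(classical_lt_constant(1.5, 1))  # closed form 3/16
    print(classical_lt_constant(2.0, 1))  # closed form 8/(15*pi)
```

In one dimension the closed forms are $L^{cl}_{3/2,1} = 3/16$ and $L^{cl}_{2,1} = 8/(15\pi)$, which are the values the balloon-graph counterexamples are compared against.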
When there is no potential energy but instead the Laplacian is given Dirichlet
conditions on the boundary of a bounded domain, then the means of the first $n$
eigenvalues are bounded from below by the Berezin–Li–Yau inequality in terms of
the volume of the domain, and in addition there is a large family of universal bounds
on the spectrum, dating from the work of Payne, Pólya, and Weinberger [22], which
constrain the spectrum without any reference to properties of the domain. (For a
review of the subject, see [2].) It turns out that there are far-reaching analogies
between these universal inequalities for Dirichlet Laplacians and Lieb–Thirring
inequalities, which have led to common proofs based on sum rules [8–12, 25]. More
precisely, some sharp Lieb–Thirring inequalities and some universal inequalities of
the PPW family can be viewed as corollaries of a Yang-type inequality like (2.5)
below, which in turn follows from a sum-rule identity.
In one dimension, a domain is merely an interval and the spectrum of the Dirichlet
Laplacian is a familiar elementary calculation, for which the question of universal
bounds is trivial. A quantum graph, however, has a spectrum that responds in complex
ways to its connectedness; if the total length is finite and appropriate boundary
conditions are imposed at exterior vertices, then the spectrum is discrete, and
questions about counting functions, moments, etc. and their relation to the topology of
the graph become interesting, even in the absence of a potential energy. Below we
shall prove several inequalities for the spectra of finite quantum graphs, with the
aid of the same trace identities we use to derive Lieb–Thirring inequalities.
For Lieb–Thirring inequalities on quantum graphs, the essential question is
whether a form of (1.3) holds with the sharp constant for $d = 1$, or whether the
connectedness of the graph can change the state of affairs. In [6], Ekholm, Frank
and Kovařík proved Lieb–Thirring inequalities for Schrödinger operators on regular
metric trees for any $\gamma \ge 1/2$, but without sharp constants. We shall show below
that trees enjoy a Lieb–Thirring inequality with the sharp constant when $\gamma \ge 2$,
but that this circumstance depends on the topology of the graph.
We begin with some simple explicit examples showing that neither the expected
Lieb–Thirring inequality nor the analogous universal inequalities for finite quantum
graphs without potential hold in complete generality. As it will be convenient to
have a uniform way of describing examples, we shall let $x_{ij}$ denote the distance
from vertex $v_i$ along the $j$th edge $\Gamma_j$ emanating from $v_i$. We note that every edge
corresponds to two distinct coordinates $x_{ij} = L - x_{i'j'}$, where $L$ is the length of
the edge, and that a homoclinic loop from a vertex $v_i$ to itself is accounted as
two edges.
For the operator $-\frac{d^2}{dx^2}$ on an interval, with vanishing Dirichlet boundary
conditions, the universal inequality of Payne–Pólya–Weinberger reduces to $E_2/E_1 \le 5$,
and the Ashbaugh–Benguria theorem becomes $E_2/E_1 \le 4$, both of which are trivial
in one dimension. But for which quantum graphs do these classic inequalities continue
to be valid? We shall show below that the classic PPW and related inequalities
can be proved for the case of trees, with Dirichlet boundary conditions imposed at
all external ends of edges, using the method of sum rules. The sum-rule proof does
not work for every graph, however, so the question naturally arises whether the
topology makes a real difference, or whether a better method of proof is required.
The following examples show that the failure of the sum-rule proof in the case
of multiply connected graphs is not an artifact of the method but due to a true
topological effect.
We refer to graphs consisting of a circle attached to a single external edge as
simple balloon graphs. The external edge may either be infinite or of finite length
with a vanishing boundary condition at its exterior end. Consider first the graph
$\Gamma := \Gamma_1 \cup \Gamma_2$, which consists of a loop $\Gamma_1$ to which a finite external interval $\Gamma_2$ is
attached at a vertex $v_1$. Without loss of generality, we may fix the length of the
loop as $2\pi$, while the "string" will be of length $L$.

Fig. 1. The balloon graph (loop $\Gamma_1$, external edge $\Gamma_2$, vertex $v_1$).

Example 1.1 (Violation of the Analogue of PPW). Let us begin with the
case of a balloon graph with $L < \infty$, and assume that there is no potential. We set
$\alpha = 1$. Thus $H$ locally has the form $-\frac{d^2}{dx^2}$ with Dirichlet condition at the end of
the string $\Gamma_2$ and vertex condition (1.2) at $v_1$ connecting it to the loop.
For convenience, we slightly simplify the coordinate system, letting $x_s := x_{12}$ be
the distance on $\Gamma_s := \Gamma_2$ from the node, and $x_\ell := x_{11} - \pi$ on $\Gamma_1$. Thus $x_\ell$ increases
from $-\pi$ at $v_1$ to $x_\ell = +\pi$ when it joins it again. It is possible to analyze the
eigenvalues of the balloon graph quite explicitly: With a Dirichlet condition at $x_s = L$,
any eigenfunction must be of the form $a \sin(k(L - x_s))$ on $\Gamma_s$. On $\Gamma_1$ symmetry
dictates that the eigenfunction must be proportional to either $\sin k x_\ell$ or $\cos k x_\ell$.
There are thus two categories of eigenfunctions and eigenvalues. Eigenfunctions
of the form $\sin k x_\ell$ contribute nothing to the vertex condition (1.2) (because the
outward derivatives at the node are equal in magnitude with opposite signs), and
therefore the derivative of $a \sin(k(L - x_s))$ must vanish at $x_s = 0$. If $k$ is a positive
integer, then $k^2$ is an eigenvalue corresponding to an eigenfunction that vanishes
on $\Gamma_s$. Otherwise, the conditions on $\Gamma_s$ cannot be achieved without violating the
condition of continuity with the eigenfunction on $\Gamma_1$. To summarize: the eigenvalues
of the first category are the squares of positive integers.
The second category of eigenfunctions match $\cos k x_\ell$ on the loop to
$a \sin(k(L - x_s))$ on the interval. The boundary conditions and continuity lead after a
standard calculation to the transcendental equation

$$
\cot kL = 2 \tan k\pi. \tag{1.4}
$$

There are three interesting situations to consider. In the limit $L \to 0$, an asymptotic
analysis of (1.4) shows that the eigenvalues tend to $\{(\frac{n}{2})^2\}$. In the limit $L \to \infty$,
the lower eigenvalues tend to $\{(n + \frac{1}{2})^2 \frac{\pi^2}{L^2}\}$, which are the eigenvalues of an
interval of length $L$ with Dirichlet conditions at $L$ and Neumann conditions at 0.
The ratio of the first two eigenvalues in this limit is approximately 9, which is
already greater than the classically anticipated value of 5 or 4. The highest value
of the ratio is, somewhat surprisingly, attained for an intermediate value of $L$,
viz., $L = \pi$, for which (1.4) can be easily solved, yielding
$k = \pm\frac{1}{\pi}\arctan\frac{1}{\sqrt{2}} + j$
for a positive integer $j$. The corresponding fundamental ratio of the lowest two
eigenvalues becomes

$$
\frac{E_2}{E_1} = \left( \frac{\pi - \arctan\frac{1}{\sqrt{2}}}{\arctan\frac{1}{\sqrt{2}}} \right)^2 \approx 16.8453.
$$

(We spare the reader the direct calculation showing that the critical value of the
ratio occurs precisely at $L = \pi$, establishing this value as the maximum among all
simple balloons.)
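The closed-form solution at $L = \pi$ can be checked numerically. The sketch below is only an illustration under stated assumptions: `secular`, `balloon_wavenumbers` and `balloon_ratio_at_pi` are hypothetical helper names, and the secular equation (1.4) is rewritten in a pole-free form by multiplying through by $\sin(kL)\cos(k\pi)$.

```python
import math


def secular(k: float, L: float) -> float:
    """Pole-free residual of (1.4): cot(kL) = 2 tan(k pi) is equivalent to
    cos(kL) cos(k pi) - 2 sin(kL) sin(k pi) = 0 wherever both sides are finite."""
    return math.cos(k * L) * math.cos(k * math.pi) - 2 * math.sin(k * L) * math.sin(k * math.pi)


def balloon_wavenumbers() -> tuple:
    """Two smallest positive solutions k of (1.4) for L = pi, from tan^2(k pi) = 1/2."""
    theta = math.atan(1 / math.sqrt(2))
    return theta / math.pi, 1 - theta / math.pi


def balloon_ratio_at_pi() -> float:
    """Ratio E_2/E_1 = (k_2/k_1)^2 of the two lowest eigenvalues for L = pi."""
    k1, k2 = balloon_wavenumbers()
    return (k2 / k1) ** 2


if __name__ == "__main__":
    k1, k2 = balloon_wavenumbers()
    print(secular(k1, math.pi), secular(k2, math.pi))  # both residuals vanish up to rounding
    print(balloon_ratio_at_pi())
```

Since $k_2^2 < 1$, the second eigenvalue indeed belongs to the second category and lies below the first integer-square eigenvalue $k^2 = 1$ of the first category.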

Example 1.2 (Showing that $E_2/E_1$ can be Arbitrarily Large). A modification
of Example 1.1 with more complex topology shows that no upper bound
on the ratio of the first two eigenvalues is possible for the graph analogue of the
Dirichlet problem. We again set $\alpha = 1$ and assume $V = 0$, and consider a "fancy
balloon" graph consisting of an external edge, $\Gamma_s$, the "string," of length $\pi$ joined
at $v_1$ to $N$ edges $\Gamma_m$, $m = 1, \ldots, N$ of length $\pi$, all of which meet at a second
vertex $v_2$. We observe that the eigenfunctions may be chosen either even or odd
under pairwise permutation of the edges $\Gamma_m$. This is because if $Pf$ represents the
linear transformation of a function $f$ defined on the graph by permuting two of
the variables $\{x_{21}, \ldots, x_{2N}\}$, and $\phi_j$ is an eigenfunction of the quantum graph with
eigenvalue $E_j$, then so are $\phi_j \pm P\phi_j$. (In particular, continuity and (1.2) are
preserved by these superpositions.) Moreover, the fundamental eigenfunction is even
under any permutation, because it is unique and does not change sign.
By continuity and the conditions (1.2) at the vertices, as in Example 1.1, a
straightforward exercise shows that $E_1 = (\frac{1}{\pi}\arctan(\frac{1}{\sqrt{N}}))^2$, and that there are
other even-parity eigenvalues

$$
\left( j \pm \frac{1}{\pi} \arctan\frac{1}{\sqrt{N}} \right)^2
$$

for all positive integers $j$. Odd parity, when combined with continuity, forces the
eigenfunctions to vanish at the nodes, and thus leads to eigenvalues of the form
$j^2$, for positive integers $j$. The fundamental ratio $E_2/E_1$ for this example can be
seen to be

$$
\left( \frac{\pi - \arctan\frac{1}{\sqrt{N}}}{\arctan\frac{1}{\sqrt{N}}} \right)^2,
$$

which is roughly $\pi^2 N$ for large $N$.
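The growth of the fundamental ratio with $N$ is easy to tabulate. The following minimal sketch evaluates the closed-form ratio stated above; the function name `fancy_balloon_ratio` is a hypothetical helper introduced for this illustration.

```python
import math


def fancy_balloon_ratio(n_edges: int) -> float:
    """E_2/E_1 = ((pi - theta)/theta)^2 with theta = arctan(1/sqrt(N)),
    for the fancy balloon graph with N parallel edges of length pi."""
    theta = math.atan(1 / math.sqrt(n_edges))
    return ((math.pi - theta) / theta) ** 2


if __name__ == "__main__":
    for n in (2, 10, 100, 10_000):
        ratio = fancy_balloon_ratio(n)
        # Second column: ratio relative to the large-N approximation pi^2 * N.
        print(n, ratio, ratio / (math.pi ** 2 * n))
```

For $N = 2$ the graph is the simple balloon of Example 1.1 with $L = \pi$ (a loop of length $2\pi$ viewed as two edges of length $\pi$), so the ratio there reproduces the value found in that example.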

Remarks. (1) With no external edges, the lowest eigenvalue of a quantum graph
is $E_1 = 0$, so one might intuitively argue that for a graph with a large and
complex interior part the effect of an exterior edge with a boundary condition
is small. The theorems and examples given below, however, point towards a
more nuanced intuition.
(2) Another instructive example is the "bunch-of-balloons" graph, with many
non-intersecting loops attached to the string at $v_1$. We leave the details to the
interested reader.

Example 1.3 (Violation of Classical Lieb–Thirring). Next consider a balloon
graph with $L = \infty$ and the Schrödinger operator $H(1) := -\frac{d^2}{dx^2} + V(x)$ on $L^2(\Gamma)$
with vertex conditions (1.2). Let the potential $V$ be given by

$$
V(x) := \begin{cases}
V_1(x) := \dfrac{-2a^2}{\cosh^2(a x_\ell)}, & x_\ell \in \Gamma_1 = [-\pi, \pi], \\[2mm]
V_2(x) := 0, & x_s \in \Gamma_2 = [0, \infty).
\end{cases}
$$

Then the eigenfunction corresponding to the eigenvalue $-a^2$ is given by
$C \cosh^{-1}(a x_\ell)$ on $\Gamma_1$ and by $e^{-a x_s}$ on $\Gamma_2$. The continuity condition gives
$C = \cosh(a\pi)$ and the condition (1.2) at $v_1$ leads to the equation

$$
\tanh(a\pi) = \frac{1}{2}. \tag{1.5}
$$

Denoting the ratio

$$
Q(\gamma, V) := \frac{|E_1|^{\gamma}}{\int_\Gamma |V(x)|^{\gamma+1/2} \, dx},
$$

we compute

$$
Q(3/2, V) = \frac{a^3}{2 \int_0^{\pi} \frac{4a^4}{\cosh^4(a x_\ell)} \, dx_\ell}
= \left( 8 \int_0^{a\pi} \frac{1}{\cosh^4(y)} \, dy \right)^{-1}
= \left( \frac{8}{3} \tanh(a\pi) \left( 2 + \operatorname{sech}^2(a\pi) \right) \right)^{-1}.
$$

Because of (1.5), $\operatorname{sech}^2(a\pi) = 1 - \tanh^2(a\pi) = \frac{3}{4}$, and therefore

$$
Q(3/2, V) = \frac{3}{11} > \frac{3}{16} = L^{cl}_{3/2,1}. \tag{1.6}
$$

Note that the ratio $Q(3/2, V)$ is independent of the length of the loop, as expected
because any length $L$ can be achieved by a change of scale.
The ratio $Q(\gamma, V)$ can also be calculated explicitly for the case $\gamma = 2$. In this
case

$$
Q(2, V) = \left( 2^{7/2} \left[ \frac{3}{4} \arctan(\tanh(a\pi/2)) + \frac{3}{16} \operatorname{sech}(a\pi) + \frac{1}{8} \operatorname{sech}^3(a\pi) \right] \right)^{-1}
\approx 0.2009 > L^{cl}_{2,1} = \frac{8}{15\pi} \approx 0.1697.
$$
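The two values of $Q$ can be cross-checked by direct numerical integration of $|V|^{\gamma+1/2}$ over the loop, without using the antiderivatives. This is a sketch under stated assumptions: `simpson` and `balloon_Q` are hypothetical helper names, and a composite Simpson rule is used purely for convenience.

```python
import math


def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def balloon_Q(gamma: float) -> float:
    """Q(gamma, V) = |E_1|^gamma / int |V|^(gamma + 1/2) for the balloon potential.

    The loop is [-pi, pi]; a is fixed by the vertex condition tanh(a*pi) = 1/2,
    and the potential vanishes on the string, so only the loop contributes.
    """
    a = math.atanh(0.5) / math.pi
    abs_v = lambda x: 2 * a ** 2 / math.cosh(a * x) ** 2
    # The integrand is even in x, so integrate over [0, pi] and double.
    integral = 2 * simpson(lambda x: abs_v(x) ** (gamma + 0.5), 0.0, math.pi)
    return (a ** 2) ** gamma / integral


if __name__ == "__main__":
    print(balloon_Q(1.5), 3 / 11)
    print(balloon_Q(2.0))
```

Both numerical values exceed the corresponding classical constants $3/16$ and $8/(15\pi)$, in agreement with (1.6) and the computation for $\gamma = 2$.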

2. Lieb–Thirring Inequalities for Quantum Graphs

2.1. Classical Lieb–Thirring inequality for metric trees
Our point of departure is the family of sum-rule identities from [11, 12]. Let $H$
and $G$ be abstract self-adjoint operators satisfying certain mapping conditions.
We suppose that $H$ has nonempty discrete spectrum lying below the continuum,
$\{E_j : H\phi_j = E_j\phi_j\}$. In the situations of interest in this article the spectrum will
either be entirely discrete, in which case we focus on spectral subsets of the form
$J := \{E_j, j = 1, \ldots, k\}$, or else, when there is a continuum, it will lie on the positive
real axis and we shall take $J$ as the negative part of the spectrum. Let $P_A$ denote
the spectral projector associated with $H$ and a Borel set $A$.
Then, given a pair of self-adjoint operators $H$ and $G$ with domains $D(H)$ and
$D(G)$, such that $G(\mathcal{J}) \subseteq D(H) \subseteq D(G)$, where $\mathcal{J}$ is the subspace spanned by the
eigenfunctions $\phi_j$ corresponding to the eigenvalues $E_j$, it is shown in [11, 12] that:

$$
\sum_{E_j \in J} (z - E_j)^2 \langle [G, [H, G]]\phi_j, \phi_j \rangle - 2(z - E_j) \langle [H, G]\phi_j, [H, G]\phi_j \rangle
= 2 \sum_{E_j \in J} \int_{\kappa \in J^c} (z - E_j)(z - \kappa)(\kappa - E_j) \, dG^2_{j\kappa}, \tag{2.1}
$$

where $dG^2_{j\kappa} := |\langle G\phi_j, dP_\kappa G\phi_j \rangle|$ corresponds to the matrix elements of the
operator $G$ with respect to the spectral projections onto $J$ and $J^c$. Because of our
choice of $J$,

$$
\sum_{E_j \in J} (z - E_j)^2 \langle [G, [H, G]]\phi_j, \phi_j \rangle - 2(z - E_j) \langle [H, G]\phi_j, [H, G]\phi_j \rangle \le 0. \tag{2.2}
$$

In this section $H$ is the Schrödinger operator on the graph $\Gamma$, namely

$$
H(\alpha) = -\alpha \frac{d^2}{dx^2} + V(x) \quad \text{in } L^2(\Gamma), \quad \alpha > 0,
$$

with the usual conditions (1.2) at each vertex $v_i$. In particular, if any leaves (i.e.
edges with one free end) are of finite length, vanishing Dirichlet boundary conditions
are imposed at their ends. Without loss of generality we may assume that $V \in C_0^\infty$
for the operator $H(\alpha)$. Under this assumption, for any $\alpha > 0$, $H(\alpha)$ has at most a
finite number of negative eigenvalues. We denote negative eigenvalues of $H(\alpha)$ by
$E_j(\alpha)$ corresponding to the normalized eigenfunctions $\phi_j$.
We shall be able to derive inequalities of the standard one-dimensional type
when it is possible to choose G to be multiplication by the arclength along some
distinguished subsets of the graph. This depends on the following:

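Lemma 2.1. Suppose that there exists a continuous, piecewise-linear function $G$
on the graph $\Gamma$, such that at each vertex $v_k$

$$
\sum_j \frac{\partial G}{\partial x_{kj}}(0^+) = 0. \tag{2.3}
$$

Suppose that $\Gamma = \bigcup_m \Gamma_m$ with $(G')^2 = a_m$ on $\Gamma_m$. If the spectrum has nonempty
essential spectrum, assume that $z \le \inf \sigma_{ess}(H)$. Then

$$
\sum_{j,m} (z - E_j)_+^2 \, a_m \|\chi_{\Gamma_m} \phi_j\|^2 - 4\alpha (z - E_j)_+ \, a_m \|\chi_{\Gamma_m} \phi_j'\|^2 \le 0. \tag{2.4}
$$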

We observe that m = 1 am = 0.

Proof. The formula (2.4) is a direct application of (2.2), when we note that, locally,
$[H, G] = -2\alpha G' \frac{d}{dx_{kj}} - \alpha G''$ and $[G, [H, G]] = 2\alpha (G')^2$. (A factor of $2\alpha$ has been divided
out.) The reason for the condition (2.3) is that $G\phi_j$ must be in the domain of
definition of $H$, which requires that at each vertex,

$$
0 = \sum_j \frac{\partial (G\phi_j)}{\partial x_{kj}}(0^+)
= G \sum_j \frac{\partial \phi_j}{\partial x_{kj}}(0^+) + \phi_j \sum_j \frac{\partial G}{\partial x_{kj}}(0^+)
= \phi_j \sum_j \frac{\partial G}{\partial x_{kj}}(0^+).
$$

If we are so fortunate that $(G')^2$ is the same constant on every edge, then (2.4)
reduces to the quadratic inequality

$$
\sum_j (z - E_j)_+^2 - 4\alpha (z - E_j)_+ \|\phi_j'\|^2 \le 0, \tag{2.5}
$$

familiar from [8, 9, 11, 12, 25], where it was shown that it implies universal spectral
bounds for Laplacians and Lieb–Thirring inequalities for Schrödinger operators
in routine ways. Equation (2.5) can be considered as a Yang-type inequality,
after [30].

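Stubbe's monotonicity argument

In [25], Stubbe showed that some of the classical sharp Lieb–Thirring inequalities
follow from the quadratic inequality (2.5). Here we apply the same argument to
quantum graphs: For any $\alpha > 0$, the functions $E_j(\alpha)$ are non-positive, continuous
and increasing. $E_j(\alpha)$ is continuously differentiable except at countably many values
where $E_j(\alpha)$ fails to be isolated or enters the continuum. By the Feynman–Hellmann
theorem,

$$
\frac{d}{d\alpha} E_j(\alpha) = \langle \phi_j, -\phi_j'' \rangle = \|\phi_j'\|^2.
$$

Setting $z = 0$, (2.5) reads

$$
\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 + 2\alpha \frac{d}{d\alpha} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \le 0.
$$

We denote by $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_k > 0$ the values at which
$E_j(\alpha)$ appears. For any $\alpha \in \, ]\alpha_{N+1}, \alpha_N[$ the number of eigenvalues is constant,
and therefore

$$
\frac{d}{d\alpha} \Big( \alpha^{1/2} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \Big) \le 0.
$$

This means that $\alpha^{1/2} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^2$ is monotone decreasing in $\alpha$. Hence, by
Weyl's asymptotics (see [4, 28]),

$$
\alpha^{1/2} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \le \lim_{\alpha \to 0^+} \alpha^{1/2} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 = L^{cl}_{2,1} \int_\Gamma (V_-(x))^{2+1/2} \, dx.
$$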

Remark 2.2. Strictly speaking the Feynman–Hellmann theorem only holds for
nondegenerate eigenvalues. In the case of degenerate eigenvalues, one has to take the
right basis in the corresponding degeneracy space and to change the numbering if
necessary, see, e.g., [26].

The balloon counterexamples given above might lead one to think that the
existence of cycles poses a barrier for a quantum graph to have an inequality of the
form (2.5). Consider, however, the following example.

Example 2.3 (Hash Graphs). Let $\Gamma$ be a planar graph consisting of (or metrically
isomorphic to) the union of a closed family of vertical lines and line segments
$F_v$ and a closed family of horizontal lines and line segments $F_h$. We assume that
for some $\delta > 0$ the distance between any two lines or line segments in $F_v$ is at
least $\delta$, and that the same is true of $F_h$. (The assumption on the spacing of the
lines allows an unproblematic definition of the vertex conditions (1.2).) We impose
Dirichlet boundary conditions at any ends of finite line segments. We also suppose a
"crossing condition," that there are no vertices touching exactly three edges. (That
is, no line segment from $F_v$ has an end point in $F_h$ and vice versa.)
Regarding the graph as a subset of the $xy$-plane, we let $G(x, y) = x + y$. It
is immediate from the crossing condition that $G$ satisfies (2.3). Furthermore, the
derivative of $G$ along every edge is 1, and therefore the quadratic inequality (2.5)
holds.

A quadratic inequality (2.5) can arise in a different way, if there is a family
of piecewise affine functions $G_\ell$ each with a range of values $a_{\ell m}$, but such that
$\sum_\ell a_{\ell m} = 1$ (or any other fixed positive constant). This occurs in our
next example.
Even when this is not possible, if we can arrange that $0 < a_{\min} \le \sum_\ell a_{\ell m} \le a_{\max}$,
then the resulting weaker quadratic inequality

$$
\sum_j (z - E_j)_+^2 - 4\alpha \frac{a_{\max}}{a_{\min}} (z - E_j)_+ \|\phi_j'\|^2 \le 0, \tag{2.6}
$$

will still lead to universal spectral bounds that may be useful. We speculate about
this circumstance below.

Example 2.4 (Y-Graph). As the next example we consider a simple graph,
namely the Y-graph, which is a star-shaped graph with three positive halfaxes $\Gamma_i$,
$i = 1, 2, 3$, joined at a single vertex $v_1$. If we set

$$
G_1(x) := \begin{cases}
g_1 := 0, & x_{11} \in \Gamma_1, \\
g_2 := -x_{12}, & x_{12} \in \Gamma_2, \\
g_3 := x_{13}, & x_{13} \in \Gamma_3,
\end{cases}
$$

then obviously $G(\mathcal{J}) \subseteq D(H(\alpha))$ holds, and with Lemma 2.1 we get

$$
\sum_j (z - E_j)_+^2 \left( \|\chi_{\Gamma_2} \phi_j\|^2 + \|\chi_{\Gamma_3} \phi_j\|^2 \right)
- 4\alpha (z - E_j)_+ \left( \|\chi_{\Gamma_2} \phi_j'\|^2 + \|\chi_{\Gamma_3} \phi_j'\|^2 \right) \le 0. \tag{2.7}
$$

As $\Gamma_1$ does not contribute to this inequality, we cyclically permute the zero part
of $G$, i.e. we next choose $G_2(x)$, such that $g_2 = 0$, $g_1 = -x_{11}$ and $g_3 = x_{13}$, and
finally $G_3(x)$, such that $g_3 = 0$, $g_1 = -x_{11}$ and $g_2 = x_{12}$. These give us two further
inequalities analogous to (2.7). Summing all three inequalities, and noting that on
every edge, $\sum_{\ell=1}^{3} a_{\ell m} = 2$, we finally obtain

$$
\sum_j 2(z - E_j)_+^2 - 8\alpha (z - E_j)_+ \|\phi_j'\|^2 \le 0, \tag{2.8}
$$

which when divided by 2 yields the quadratic inequality (2.5).

We next extend the averaging argument to prove (2.5) for arbitrary metric trees.
A metric tree consists of a set of vertices, a set of leaves and a set of edges, i.e.
segments of the real axis, which connect the vertices, such that there is exactly
one path connecting any two vertices. It is common in graph theory to distinguish
between edges and leaves; a leaf is joined to a vertex at only one of its endpoints,
i.e. there is a free end, at which we shall set Dirichlet boundary conditions. (When
the distinction is not material we shall refer to both edges and leaves as edges. It is
also common to regard one free end as the distinguished root r of the tree, but
for our purposes all free ends of the graph have the same status.) We denote the
vertices by vi , i = 1, . . . , n. The edges including leaves will be denoted by e. We
shall explicitly write lj for leaves when the distinction matters.

Theorem 2.5. For any tree graph with a finite number of vertices and edges, the
mapping

$$
\alpha \mapsto \alpha^{1/2} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^2
$$

is nonincreasing for all $\alpha > 0$. Consequently

$$
\alpha^{1/2} \sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \le L^{cl}_{2,1} \int_\Gamma (V_-(x))^{2+1/2} \, dx
$$

for all $\alpha > 0$.

Remark 2.6. By the monotonicity principle of Aizenman and Lieb (see [1]),
Theorem 2.5 is also true with the sharp constant for higher moments of eigenvalues.
Alternatively, the extension to higher values of $\gamma$ can be obtained directly
from the trace inequality of [10] for power functions with $\gamma > 2$. Furthermore,
Theorem 2.5 can be extended by a density argument to potentials $V \in L^{\gamma+1/2}(\Gamma)$.

To prepare the proof of Theorem 2.5, we first formulate some auxiliary results.

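Lemma 2.7. For all $n \in \mathbb{N}$ with $n \ge 2$,

$$
\sum_{k=0}^{\left[\frac{n-1}{2}\right]} \binom{n-1}{2k} = \sum_{k=0}^{\left[\frac{n}{2}\right]-1} \binom{n-1}{2k+1}. \tag{2.9}
$$

Proof. This is a simple computation: both sides equal $2^{n-2}$, since the even-index and
odd-index binomial coefficients of $n-1 \ge 1$ have equal sums.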
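The identity (2.9) can be confirmed for small $n$ by brute force. The sketch below is illustrative only; `even_sum` and `odd_sum` are hypothetical helper names for the two sides of (2.9). It also shows why the degenerate case $n = 1$ has to be excluded: there the right-hand sum is empty.

```python
from math import comb


def even_sum(n: int) -> int:
    """Left side of (2.9): sum of C(n-1, 2k) for k = 0, ..., floor((n-1)/2)."""
    return sum(comb(n - 1, 2 * k) for k in range((n - 1) // 2 + 1))


def odd_sum(n: int) -> int:
    """Right side of (2.9): sum of C(n-1, 2k+1) for k = 0, ..., floor(n/2) - 1."""
    return sum(comb(n - 1, 2 * k + 1) for k in range(n // 2))


if __name__ == "__main__":
    for n in range(1, 8):
        # For n = 1 the two sides differ; for n >= 2 both equal 2^(n-2).
        print(n, even_sum(n), odd_sum(n))
```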

Definition 2.8. Let $E$ be the set of all edges $e \subset \Gamma$. We call the mapping
$C : E \to \{0, 1\}$ a coloring and say that $C$ is an admissible coloring if at each vertex $v$ the
number

$$
\#\{e : e \text{ emanates from } v : C(e) = 1\}
$$

is even. We let $A(\Gamma)$ denote the set of all admissible colorings on $\Gamma$.

Theorem 2.9. Let $\Gamma_n$ be a metric tree with $n$ vertices. For an edge $e \subset \Gamma_n$, we
denote by

$$
a(e, n) := \#\{C \in A(\Gamma_n) : C(e) = 1\}
$$

the number of all admissible mappings $C \in A(\Gamma_n)$, such that $C(e) = 1$ for $e \subset \Gamma_n$.
Then

$$
a(e, n) \text{ is independent of } e \subset \Gamma_n. \tag{2.10}
$$

Proof. We shall prove (2.10) by induction over the number of vertices of $\Gamma$. The
case with one vertex $v_1$ is trivial because of the symmetry of the graph. Given a
metric tree $\Gamma_n$ with $n$ vertices, we can decompose it as follows. $\Gamma_n$ consists of a
metric tree $\Gamma_{n-1}$ with $n - 1$ vertices to which $m - 1$ leaves $l_j$, $j = 2, \ldots, m$, are
attached to the free end of a leaf $l_1 \subset \Gamma_{n-1}$. We call the vertex at which the leaves
$l_j$, $j = 1, \ldots, m$, are joined $v_n$. Hence,

$$
\Gamma_n := \Gamma_{n-1} \cup v_n \cup \bigcup_{j=2}^{m} l_j.
$$

By the induction hypothesis,

$$
a(e, n-1) := \#\{C \in A(\Gamma_{n-1}) : C(e) = 1\} \text{ is independent of } e \subset \Gamma_{n-1}. \tag{2.11}
$$

Obviously for every edge or leaf $e \ne l_1$ in $\Gamma_{n-1}$, we have

$$
a(e, n-1) = \#\{C \in A(\Gamma_{n-1}) : C(e) = 1 \wedge C(l_1) = 1\}
+ \#\{C \in A(\Gamma_{n-1}) : C(e) = 1 \wedge C(l_1) = 0\}. \tag{2.12}
$$

Now, we have to show that $a(e, n)$ is independent of $e \subset \Gamma_n$. Note first that for
each fixed leaf $l_j$ of the subgraph $\tilde{\Gamma} = v_n \cup \bigcup_{j=1}^{m} l_j$, we have

$$
\nu_1 := \#\{C \in A(\tilde{\Gamma}) : C(l_j) = 1, \ l_j \subset \tilde{\Gamma}\} = \sum_{k=0}^{\left[\frac{m}{2}\right]-1} \binom{m-1}{2k+1} \tag{2.13}
$$

and

$$
\nu_0 := \#\{C \in A(\tilde{\Gamma}) : C(l_j) = 0, \ l_j \subset \tilde{\Gamma}\} = \sum_{k=0}^{\left[\frac{m-1}{2}\right]} \binom{m-1}{2k}. \tag{2.14}
$$

Hence, for arbitrary neighboring edges $e', e'' \subset \Gamma_{n-1}$ the following equality holds,

$$
a(e', n) = \nu_1 \, \#\{C \in A(\Gamma_{n-1}) : C(e') = 1 \wedge C(l_1) = 1\}
+ \nu_0 \, \#\{C \in A(\Gamma_{n-1}) : C(e') = 1 \wedge C(l_1) = 0\}, \tag{2.15}
$$

and, respectively,

$$
a(e'', n) = \nu_1 \, \#\{C \in A(\Gamma_{n-1}) : C(e'') = 1 \wedge C(l_1) = 1\}
+ \nu_0 \, \#\{C \in A(\Gamma_{n-1}) : C(e'') = 1 \wedge C(l_1) = 0\}. \tag{2.16}
$$

By Lemma 2.7, $\nu := \nu_0 = \nu_1$. Therefore, with (2.12) the equalities (2.15) and (2.16)
read

$$
a(e', n) = \nu \, a(e', n-1), \qquad a(e'', n) = \nu \, a(e'', n-1).
$$

Furthermore, by the induction hypothesis,

$$
a(e', n-1) = a(e'', n-1),
$$

from which it immediately follows that

$$
a(e', n) = \nu \, a(e', n-1) = \nu \, a(e'', n-1) = a(e'', n).
$$

This proves Theorem 2.9.
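The statement of Theorem 2.9 can be observed on a small tree by enumerating all colorings. The following sketch is an illustration under stated assumptions, not the authors' construction: each edge is encoded as the tuple of interior vertices it touches (one vertex for a leaf, two for an interior edge), free ends of leaves carry no parity condition, and `admissible_counts` and `EXAMPLE_TREE` are hypothetical names chosen for this example.

```python
from itertools import product


def admissible_counts(edges, n_vertices):
    """Enumerate colorings C: edges -> {0, 1} and keep the admissible ones,
    i.e. those with an even number of edges of color 1 at every vertex.

    Returns (number of admissible colorings, list of a(e) for each edge e).
    """
    counts = [0] * len(edges)
    total = 0
    for coloring in product((0, 1), repeat=len(edges)):
        ones_at_vertex = [0] * n_vertices
        for color, ends in zip(coloring, edges):
            for v in ends:
                ones_at_vertex[v] += color
        if all(c % 2 == 0 for c in ones_at_vertex):
            total += 1
            for i, color in enumerate(coloring):
                counts[i] += color
    return total, counts


# Hypothetical example tree: vertices 0, 1, 2 in a chain; two leaves at
# vertex 0, three leaves at vertex 1 and two leaves at vertex 2.
EXAMPLE_TREE = [(0,), (0,), (0, 1), (1,), (1,), (1,), (1, 2), (2,), (2,)]

if __name__ == "__main__":
    total, counts = admissible_counts(EXAMPLE_TREE, 3)
    print(total, counts)  # every entry of counts is the same number a(e, 3)
```

Brute-force enumeration is exponential in the number of edges, so this is only practical for very small trees; it is meant as a sanity check of (2.10), not as an algorithm.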

Proof of Theorem 2.5. In order to apply Stubbe's monotonicity argument [25], we
need to establish inequality (2.5) for metric trees. To do this, we proceed as for the
example of the Y-graph. Let $\mathcal{J}$ denote the subspace spanned by the eigenfunctions
$\phi_j$ on $L^2(\Gamma)$ corresponding to the eigenvalues $E_j$. Note first that there exist
self-adjoint operators $G$, which are given by piecewise affine functions $g_i$ on the edges
(or leaves) of $\Gamma$, such that $G(\mathcal{J}) \subseteq D(H(\alpha)) \subseteq D(G)$. Edges (or leaves) on which
constant functions $g_i$ are given, do not contribute to the sum rule. Therefore we
average over a family of operators $G$, such that every edge $e$ (or leaf) of the tree
appears equally often in association with an affine function having $G' = \pm 1$ on $e$.
We let $\mathcal{G}$ denote the set of continuous operators $G(x) = \{g_i(x) \text{ affine}, x \in e_i \text{ (or } l_i)\}$,
which satisfy (1.2) at the vertices $v$ of $\Gamma$. Indeed it is not necessary to average over
all the operators $G \in \mathcal{G}$, because it makes no difference in Lemma 2.1, for instance,
whether $g_i' = 1$ or $g_i' = -1$. Therefore we define an equivalence relation $\sim$ on $\mathcal{G}$
as follows: Let $\tilde{G} = \{\tilde{g}_i(x) \text{ affine}, x \in e_i \text{ (or } l_i)\}$ be another operator in $\mathcal{G}$. We say
that $G \sim \tilde{G} \Leftrightarrow \forall i \in \{1, \ldots, n\} : |g_i'(x)| = |\tilde{g}_i'(x)|$. We define $\mathcal{G}' := \mathcal{G}/\!\sim$. Then we
can consider the isomorphism

$$
I : A(\Gamma) \to \mathcal{G}', \tag{2.17}
$$

where for each $C \in A(\Gamma)$ we choose an affine function $G_C \in \mathcal{G}'$ on $\Gamma$, such that
$|G_C'(e)| = C(e)$ for every $e \subset \Gamma$. By Theorem 2.9, we know that $\#\{C \in A(\Gamma) :
C(e) = 1\}$ is independent of $e \subset \Gamma$. This means that summing up all inequalities
corresponding to (2.4), which we get from each $G_C \in \mathcal{G}'$, leads to

$$
\sum_j (z - E_j)_+^2 \, p - 4\alpha (z - E_j)_+ \, p \, \|\phi_j'\|^2 \le 0, \tag{2.18}
$$

where $p := \sum_\ell a_{\ell m} = \#\{C \in A(\Gamma) : C(e) = 1\}$ and we have used the
normalization $\|\phi_j\| = 1$. Having the analogue of inequality (2.5) for metric trees, we can
reformulate the monotonicity argument for our case. This proves Theorem 2.5.

Remark 2.10. The proof applies equally to metric trees with leaves of infinite
lengths.

2.2. Modified Lieb–Thirring inequalities for one-loop graphs

In this section we consider the graph $\Gamma$ consisting of a circle to which two leaves
are attached. It is not hard to see that the construction leading to Lieb–Thirring
inequalities with the sharp classical constant fails for one-loop graphs, because
no family of auxiliary functions $G_\ell$ exists with the side condition that $\sum_\ell a_{\ell m} = 1$
throughout $\Gamma$. Unlike the case of the balloon graph, it is possible to replace the
classical inequality with a weakened version (2.6) as mentioned above. There is, however,
another option, based on commutators with exponential functions, following an idea
of [10]: As usual, we define the one-parameter family of Schrödinger operators

$$
H(\alpha) = -\alpha \frac{d^2}{dx^2} + V(x), \quad \alpha > 0,
$$

in $L^2(\Gamma)$ with the usual conditions (1.2) at each vertex $v_i$ of $\Gamma$. The leaves are
denoted by $\Gamma_1 := [0, \infty)$ and $\Gamma_2 := [0, \infty)$, while we write $\Gamma_3$ and $\Gamma_4$ for the
semicircles with lengths $L$. Let $\phi_j$ be the eigenfunctions of $H(\alpha)$ corresponding to
the eigenvalues $E_j(\alpha)$.

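Theorem 2.11. Let $q := 2\pi/L$. For all $\alpha > 0$ the mapping

$$
\alpha \mapsto \alpha^{1/2} \sum_{E_j(\alpha)<0} \left( z - \frac{3}{16} \alpha q^2 - E_j \right)_+^2 \tag{2.19}
$$

is nonincreasing. Furthermore, for all $z \in \mathbb{R}$ and all $\alpha > 0$ the following sharp
Lieb–Thirring inequality holds:

$$
R_2(z, \alpha) \le \alpha^{-1/2} L^{cl}_{2,1} \int_\Gamma \left( V(x) - \left( z + \frac{3}{16} \alpha q^2 \right) \right)_-^{2+1/2} dx, \tag{2.20}
$$

where

$$
R_2(z, \alpha) := \sum_{E_j(\alpha)<z} (z - E_j(\alpha))_+^2.
$$

Remark 2.12. Once again, Theorem 2.11 can be extended to potentials
$V \in L^{\gamma+1/2}(\Gamma)$ and is true for all $\gamma \ge 2$, either by the monotonicity principle of Aizenman
and Lieb [1] or by the trace formula of [10] for $\gamma \ge 2$.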
For the proof of Theorem 2.11, we make use of a theorem of Harrell and
Stubbe:

Theorem 2.13 ([10, Theorem 2.1]). Let $H$ be a self-adjoint operator on $\mathcal{H}$,
with a nonempty set $J$ of finitely degenerate eigenvalues lying below the rest of the
spectrum $J^c$ and $\{\phi_j\}$ an orthonormal set of eigenfunctions of $H$. Let $G$ be a linear
operator with domain $D_G$ and adjoint $G^*$ defined on $D_{G^*}$ such that
$G(D_H) \subseteq D_H \subseteq D_G$ and $G^*(D_H) \subseteq D_H \subseteq D_{G^*}$, respectively. Then

$$
\frac{1}{2} \sum_{E_j \in J} (z - E_j)^2 \left( \langle [G^*, [H, G]]\phi_j, \phi_j \rangle + \langle [G, [H, G^*]]\phi_j, \phi_j \rangle \right)
\le \sum_{E_j \in J} (z - E_j) \left( \|[H, G]\phi_j\|^2 + \|[H, G^*]\phi_j\|^2 \right). \tag{2.21}
$$

Remark 2.14. Strictly speaking, in [10] it was assumed that the spectrum was
purely discrete. However, the extension to the case where continuous spectrum is
allowed in J c follows exactly as in [11, Theorem 2.1].

Proof of Theorem 2.11. In this case, it is not possible to get a quadratic inequality
from Lemma 2.1 without worsening the constants. This follows from the fact that
the conditions $\Gamma_3(0) = \Gamma_4(0)$ and $\Gamma_3(L) = \Gamma_4(L)$ imply that the piecewise linear
function $G$ has to be defined equally on $\Gamma_3$ and $\Gamma_4$. Consequently, the condition (1.2)
can be satisfied only with different values of $a_m$ as in (2.6), namely $a_1 = a_2 = 4a_3 =
4a_4$. Our proof of Theorem 2.11 consists of three steps. First, we apply Lemma 2.1,
after which we apply Theorem 2.13. Finally we combine both results and apply the
line of argument given in [10].

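First step. Using Lemma 2.1 with the choice,

$$
G(x) := \begin{cases}
g_1 := -2x_{11}, & x_{11} \in \Gamma_1, \\
g_2 := 2x_{22} + L, & x_{22} \in \Gamma_2, \\
g_3 := x_{13}, & x_{13} \in \Gamma_3, \\
g_4 := x_{14}, & x_{14} \in \Gamma_4,
\end{cases}
$$

we obtain

$$
4 \Big( \sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+^2 \, p_{12}(j) - 4\alpha \sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+ \, p'_{12}(j) \Big)
+ \Big( \sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+^2 \, p_{34}(j) - 4\alpha \sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+ \, p'_{34}(j) \Big) \le 0, \tag{2.22}
$$

where $p_{ik}(j) := \|\chi_{\Gamma_i} \phi_j\|^2 + \|\chi_{\Gamma_k} \phi_j\|^2$ and $p'_{ik}(j) := \|\chi_{\Gamma_i} \phi_j'\|^2 + \|\chi_{\Gamma_k} \phi_j'\|^2$.
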

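Second step. Next, in Theorem 2.13 we set

$$
G(x) := \begin{cases}
g_1 := 1, & x_{11} \in \Gamma_1, \\
g_2 := 1, & x_{22} \in \Gamma_2, \\
g_3 := e^{-i 2\pi x_{13}/L}, & x_{13} \in \Gamma_3, \\
g_4 := e^{i 2\pi x_{14}/L}, & x_{14} \in \Gamma_4.
\end{cases}
$$

It is easy to see that $G\phi_j \in D(H)$. With $q := 2\pi/L$, the first commutators work
out to be

$$
[H_j, g_j] = 0, \quad j = 1, 2,
$$

$$
[H_3, g_3] = \alpha e^{-iqx_{13}} (q^2 + 2iq \, d/dx), \qquad [H_4, g_4] = \alpha e^{iqx_{14}} (q^2 - 2iq \, d/dx);
$$

whereas for the second commutators,

$$
\begin{aligned}
[g_j^*, [H_j, g_j]] &= [g_j, [H_j, g_j^*]] = 0, & j &= 1, 2, \\
[g_j^*, [H_j, g_j]] &= [g_j, [H_j, g_j^*]] = 2\alpha q^2, & j &= 3, 4.
\end{aligned} \tag{2.23}
$$

From inequality (2.21), we get

$$
\sum_{E_j(\alpha) \in J} (z - E_j(\alpha))^2 \, p_{34}(j) \le \alpha \sum_{E_j(\alpha) \in J} (z - E_j(\alpha)) \left( q^2 p_{34}(j) + 4 p'_{34}(j) \right). \tag{2.24}
$$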

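Third step. Adding (2.22) and (2.24), we finally obtain

$$
2 R_2(z, \alpha) + 4\alpha \frac{d}{d\alpha} R_2(z, \alpha) \le \frac{3}{2} \alpha q^2 \sum_{E_j \in J} (z - E_j) \, p_{34}(j), \tag{2.25}
$$

or, with $R_1$ defined analogously to $R_2$ and using $p_{34}(j) \le 1$,

$$
2 R_2(z, \alpha) + 4\alpha \frac{d}{d\alpha} R_2(z, \alpha) - \frac{3}{2} \alpha q^2 R_1 \le 0, \tag{2.26}
$$

which is equivalent to

$$
\frac{\partial}{\partial \alpha} \left( \alpha^{1/2} R_2(z, \alpha) \right) \le \frac{3q^2}{8} \alpha^{1/2} R_1(z, \alpha). \tag{2.27}
$$

Letting $U(z, \alpha) := \alpha^{1/2} R_2(z, \alpha)$, the inequality has the form

$$
\frac{\partial U}{\partial \alpha} \le \frac{3}{16} q^2 \frac{\partial U}{\partial z}. \tag{2.28}
$$

Since the expression in (2.19) can be written as $U(z - \frac{3}{16} \alpha q^2, \alpha)$, an application of the
chain rule shows that the monotonicity claimed in (2.19) follows from (2.28). (We
note that (2.28) can be solved by changing to characteristic variables
$\xi := \alpha - \frac{16z}{3q^2}$, $\eta := \alpha + \frac{16z}{3q^2}$, in terms of which

$$
\frac{\partial U}{\partial \xi} \le 0. \tag{2.29}
$$

That is, $U$ decreases as $\xi$ increases while $\eta$ is fixed.) By shifting the variable in
(2.29), we also obtain

$$
U(z, \alpha) \le U\left( z + \frac{3}{16} q^2 (\alpha - \alpha_s), \alpha_s \right) \tag{2.30}
$$

for $\alpha \ge \alpha_s$. By Weyl's asymptotics, for all $\gamma \ge 0$,

$$
\lim_{\alpha \to 0^+} \alpha^{d/2} \sum_{E_j(\alpha)<z} (z - E_j(\alpha))^{\gamma} = L^{cl}_{\gamma,d} \int (V(x) - z)_-^{\gamma+d/2} \, dx, \tag{2.31}
$$

see [4, 28]. Hence, as $\alpha_s \to 0$, the right-hand side of (2.30) tends to

$$
L^{cl}_{2,1} \int_\Gamma \left( V(x) - \left( z + \frac{3}{16} q^2 \alpha \right) \right)_-^{2+1/2} dx,
$$

so the conclusion of Theorem 2.11 follows.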

Remark 2.15. Theorem 2.11 can be generalized to one-loop graphs to which
$2n$, $n \in \mathbb{N}$ equidistant semiaxes are attached.

To summarize, in this section we have seen that for some classes of quantum
graphs a quadratic inequality (2.5) can be proved with the classical constants, and
that for some other classes of graphs similar statements can be proved at the price
of worse constants as in (2.6), or of a shift in the zero-point energy as in (2.20).
It is reasonable to ask whether one can look at the connectedness of a graph and
say whether a weak Yang-type inequality (2.6) can be proved. As we have seen, this
is the case if there exists a family of continuous functions $G_\ell$ on the graph such that

• On each edge, all the derivatives $\{G_\ell'\}$ are constant.
• At each vertex $v_k$, each function $G_\ell$ satisfies
  $\sum_j \frac{dG_\ell}{dx_{kj}}(0^+) = 0$.
• For each edge $e$ there exists at least one function $G_\ell$ with $G_\ell' \ne 0$.

Interestingly, the question of the existence of such a family of functions can be rephrased in terms of the theory of electrical resistive circuits, a subject dating from the mid-19th century [14]. We first note that for a suitable family of functions to exist, there must be at least two leaves, which can be regarded as external leads of an electric circuit, bearing some resistance. (In the finite case let the resistance be equivalent to the length of the leaf, and in the infinite case let it be some fixed finite value, at least as large as the length of any finite leaf.) Each internal edge is regarded as a wire bearing a resistance equal to the length of the edge. If we regard the value of $G_\ell'$ as a current, then Kirchhoff's condition at the vertex of an electric circuit is exactly the condition (1.2) that $\sum_j \frac{dG_\ell}{dx_{kj}}(0^+) = 0$, and the condition that the electric potential $G_\ell$ must be uniquely defined at all vertices is equivalent to global continuity of $G_\ell$. It has been known since Weyl [29] that the currents and potentials in an electric circuit are uniquely determined by the voltages applied at the leads. There are, however, circuits such that no matter what voltages are applied to the external leads, there will be an internal wire where no current flows; the most well known of these is the Wheatstone bridge. (See, for instance, the Wikipedia article on the Wheatstone bridge.)
Let us call a metric graph a generalized Wheatstone bridge when the corresponding circuit has exactly two external leads and a configuration for which no current will flow in at least one of its wires. Then we conjecture that there are only two impediments to the existence of a suitable family of functions $G_\ell$, and therefore to
a weakened quadratic inequality (2.6), namely: Unless a quantum graph contains
either

a subgraph that can be disconnected from all leaves by the removal of one point
(such as a balloon graph or a graph shaped like the letter ); or
a subgraph that when disconnected from the graph by cutting two edges is a
generalized Wheatstone bridge,

then an inequality of the form (2.6) holds. Otherwise the best that can be obtained may be a modified quadratic inequality with a variable shift, as in Theorem 2.11.

Fig. 2. The Wheatstone bridge.



3. Universal Bounds for Finite Quantum Graphs


In this section, we derive differential inequalities for Riesz means of eigenvalues of the Dirichlet Laplacian on bounded metric trees $\Gamma$ with at least one leaf (free edge). From these inequalities we derive Weyl-type bounds on the averages of the eigenvalues of the Dirichlet Laplacian

\[ H_D := -\left.\frac{d^2}{dx^2}\right|_D \quad \text{in } L^2(\Gamma), \]

with the conditions (1.2) at each vertex $v_i$. At the ends of the leaves, vanishing Dirichlet boundary conditions are imposed. We recall that with the methods of [9, 12] these are consequences of the same quadratic inequality (2.5) as was used above to prove Lieb–Thirring inequalities. When the total length of the graph is finite, the operator $H_D$ on $D(H_D)$ has a positive discrete spectrum $\{E_j\}_{j=1}^{\infty}$, allowing us to define the Riesz mean of order $\sigma$,

\[ R_\sigma(z) := \sum_j (z-E_j)_+^{\sigma} \tag{3.1} \]

for $\sigma > 0$ and real $z$.

Theorem 3.1. Let $\Gamma$ be a metric tree of finite length and with finitely many edges and vertices, and let $H_D$ be the Dirichlet Laplacian in $L^2(\Gamma)$ with domain $D(H_D)$. Then for $z > 0$,

\[ R_1(z) \ge \frac{5}{4z}\,R_2(z); \tag{3.2} \]

\[ R_2'(z) \ge \frac{5}{2z}\,R_2(z); \tag{3.3} \]

and consequently $R_2(z)/z^{5/2}$ is a nondecreasing function of $z$.

Proof. The claims are vacuous for $z \le E_1$, so we henceforth assume $z > E_1$. The line of reasoning of the proof of Theorem 2.5 applies just as well to the operator $H_D$ on $D(H_D)$, yielding

\[ \sum_j (z-E_j)_+^2 - 4(z-E_j)_+\,\|\phi_j'\|^2 \le 0. \tag{3.4} \]

Since $V \equiv 0$, $\|\phi_j'\|^2 = E_j$. Observing that

\[ \sum_j (z-E_j)_+\,E_j = zR_1(z) - R_2(z), \]

we get from (3.4)

\[ 5R_2(z) - 4zR_1(z) \le 0. \]

This proves (3.2). Inequality (3.3) follows from (3.2), as $R_2'(z) = 2R_1(z)$.

Since by Theorem 3.1, $R_2(z)\,z^{-5/2}$ is a nondecreasing function, we obtain a lower bound of the form $R_2(z) \ge Cz^{5/2}$ for all $z \ge z_0$ in terms of $R_2(z_0)$. Upper bounds can be obtained from the limiting behavior of $R_2(z)$ as $z \to \infty$, as given by the Weyl law. In the following, we follow [9] to derive Weyl-type bounds on the averages of the eigenvalues of $H_D$ in $L^2(\Gamma)$.

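Corollary 3.2. For $z \ge 5E_1$,

\[ 16E_1^{-1/2}\left(\frac{z}{5}\right)^{5/2} \le R_2(z) \le L^{cl}_{2,1}\,|\Gamma|\,z^{5/2}, \]

where $L^{cl}_{2,1} := \frac{\Gamma(3)}{(4\pi)^{1/2}\,\Gamma(7/2)}$, and $|\Gamma|$ is the total length of the tree.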

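Proof. By Theorem 3.1, for all $z \ge z_0$,

\[ \frac{R_2(z)}{z^{5/2}} \ge \frac{R_2(z_0)}{z_0^{5/2}}. \tag{3.5} \]

As $R_2(z_0) \ge (z_0-E_1)_+^2$ for any $z_0 > E_1$, it follows from (3.5) that

\[ R_2(z) \ge (z_0-E_1)_+^2\left(\frac{z}{z_0}\right)^{5/2}. \]

The coefficient $\frac{(z_0-E_1)_+^2}{z_0^{5/2}}$ is maximized when $z_0 = 5E_1$. Thus we get

\[ 16E_1^{-1/2}\left(\frac{z}{5}\right)^{5/2} \le R_2(z). \]

For metric trees with total length $|\Gamma|$, the Weyl law states that

\[ \lim_{n\to\infty}\frac{\sqrt{E_n}}{n} = \frac{\pi}{|\Gamma|}, \tag{3.6} \]

(see [16]). It follows that

\[ \frac{R_2(z)}{z^{5/2}} \to L^{cl}_{2,1}\,|\Gamma| \]

as $z \to \infty$. Since $\frac{R_2(z)}{z^{5/2}}$ is nondecreasing, we get

\[ \frac{R_2(z)}{z^{5/2}} \le L^{cl}_{2,1}\,|\Gamma|, \qquad z < \infty. \]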
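The bounds of Theorem 3.1 and Corollary 3.2 can be checked numerically on the simplest metric tree. The following Python sketch assumes a single edge of length $|\Gamma|$ with Dirichlet conditions at both ends, whose eigenvalues are $E_j = (\pi j/|\Gamma|)^2$; the function names, the length, the spectral cutoff and the grid of $z$ values are illustrative choices and do not come from the text.

```python
import math

def riesz_mean(eigs, z, sigma):
    """R_sigma(z) = sum_j (z - E_j)_+^sigma for a finite list of eigenvalues."""
    return sum((z - e) ** sigma for e in eigs if e < z)

def interval_eigenvalues(length, count):
    """Dirichlet eigenvalues (pi*j/length)^2 on a single edge (assumed model)."""
    return [(math.pi * j / length) ** 2 for j in range(1, count + 1)]

# Classical constant L^cl_{2,1} = Gamma(3) / ((4*pi)^(1/2) * Gamma(7/2)).
L_CL_21 = math.gamma(3) / (math.sqrt(4 * math.pi) * math.gamma(3.5))

def check_bounds(length=1.0, count=400, samples=200):
    """Return True if (3.2), the monotonicity of R_2/z^(5/2) and both bounds
    of Corollary 3.2 hold on a grid of z values."""
    eigs = interval_eigenvalues(length, count)
    e1 = eigs[0]
    z_max = eigs[count // 2]      # stay well below the truncation of the spectrum
    ok = True
    previous_ratio = 0.0
    for i in range(1, samples + 1):
        z = e1 + (z_max - e1) * i / samples
        r1 = riesz_mean(eigs, z, 1)
        r2 = riesz_mean(eigs, z, 2)
        ok &= r1 >= 5 * r2 / (4 * z) - 1e-9 * r1           # inequality (3.2)
        ratio = r2 / z ** 2.5
        ok &= ratio >= previous_ratio - 1e-12              # R_2/z^(5/2) nondecreasing
        previous_ratio = ratio
        ok &= ratio <= L_CL_21 * length + 1e-12            # upper bound
        if z >= 5 * e1:
            ok &= r2 >= 16 * e1 ** -0.5 * (z / 5) ** 2.5   # lower bound
    return ok
```

Calling `check_bounds()` with other lengths probes the same inequalities; the grid only samples them and is of course no substitute for the proof.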

In summary, we get from Theorem 3.1 and Corollary 3.2 the following two-sided estimate:

\[ 4E_1^{-1/2}\left(\frac{z}{5}\right)^{3/2} \le \frac{5}{4z}\,R_2(z) \le R_1(z). \tag{3.7} \]

In order to obtain similar estimates, related to higher eigenvalues, we introduce the notation

\[ \overline{E_j} := \frac{1}{j}\sum_{\ell\le j} E_\ell \]

for the means of eigenvalues $E_\ell$; similarly, the means of the squared eigenvalues are denoted

\[ \overline{E_j^2} := \frac{1}{j}\sum_{\ell\le j} E_\ell^2. \]

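For a given $z$, we let $\mathrm{ind}(z)$ be the greatest integer $i$ such that $E_i \le z$. Then obviously,

\[ R_2(z) = \mathrm{ind}(z)\bigl(z^2 - 2z\,\overline{E_{\mathrm{ind}(z)}} + \overline{E^2_{\mathrm{ind}(z)}}\bigr). \]

As for any integer $j$ and all $z \ge E_j$, $\mathrm{ind}(z) \ge j$, we get

\[ R_2(z) \ge D(z,j) := j\bigl(z^2 - 2z\,\overline{E_j} + \overline{E_j^2}\bigr). \]

Using Theorem 3.1 for $z \ge z_j \ge E_j$, it follows that

\[ R_2(z) \ge D(z_j,j)\left(\frac{z}{z_j}\right)^{5/2}. \tag{3.8} \]

Furthermore, $\overline{E_j}^{\,2} \le \overline{E_j^2}$ by the Cauchy–Schwarz inequality, and hence

\[ D(z,j) = j\bigl((z-\overline{E_j})^2 + \overline{E_j^2} - \overline{E_j}^{\,2}\bigr) \ge j(z-\overline{E_j})^2. \tag{3.9} \]

This establishes the following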

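Corollary 3.3. Suppose that $z \ge 5\overline{E_j}$. Then

\[ R_2(z) \ge \frac{16\,j\,z^{5/2}}{25\,(5\overline{E_j})^{1/2}} \tag{3.10} \]

and, therefore,

\[ R_1(z) \ge \frac{4\,j\,z^{3/2}}{5\,(5\overline{E_j})^{1/2}}. \tag{3.11} \]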

Proof. Combining Eqs. (3.8) and (3.9), we get

\[ R_2(z) \ge j(z_j-\overline{E_j})^2\left(\frac{z}{z_j}\right)^{5/2}. \]

Inserting $z_j = 5\overline{E_j}$, the first statement follows. (This choice of $z_j$ maximizes the constant appearing in (3.10).) The second statement results from substituting the first statement into (3.7).

The Legendre transform is an effective tool for converting bounds on $R_\sigma(z)$ into bounds on the spectrum, as has been realized previously, e.g., in [17]. Recall that if $f(z)$ is a convex function on $\mathbb{R}_+$ that is superlinear in $z$ as $z \to +\infty$, its Legendre transform

\[ \mathcal{L}[f](w) := \sup_z\,\{wz - f(z)\} \]

is likewise a superlinear convex function. Moreover, for each $w$, the supremum in this formula is attained at some finite value of $z$. We also note that if $f(z) \ge g(z)$ for all $z$, then $\mathcal{L}[g](w) \ge \mathcal{L}[f](w)$ for all $w$. The Legendre transform of the two sides of inequality (3.11) is a straightforward calculation (e.g., see [9]). The result is

\[ (w-[w])\,E_{[w]+1} + [w]\,\overline{E_{[w]}} \le \frac{w^3}{j^2}\,\frac{125}{108}\,\overline{E_j}, \tag{3.12} \]

for certain values of $w$ and $j$. In Corollary 3.3 it is supposed that $z \ge 5\overline{E_j}$. Let $z_{\max}$ be the value for which $\mathcal{L}[f](w) = wz_{\max} - f(z_{\max})$, where $f$ is the right-hand side of (3.11). Then by an elementary calculation,

\[ w = \frac{6j}{5}\left(\frac{z_{\max}}{5\overline{E_j}}\right)^{1/2}. \]

It follows that inequality (3.12) is valid for $w \ge 6j/5$. Meanwhile, for any $w$ we can always find an integer $k$ such that on the left-hand side of (3.12), $k-1 \le w < k$. If $k > 6j/5$ and if we let $w$ approach $k$ from below, we obtain from (3.12)

\[ E_k + (k-1)\,\overline{E_{k-1}} \le \frac{k^3}{j^2}\,\frac{125}{108}\,\overline{E_j}. \]

The left-hand side of this equation is the sum of the eigenvalues $E_1$ through $E_k$, so we get the following:

Corollary 3.4. For $k \ge \frac{6}{5}j$, the means of the eigenvalues of the Dirichlet Laplacian on an arbitrary metric tree with finitely many edges and vertices satisfy a universal Weyl-type bound,

\[ \frac{\overline{E_k}}{\overline{E_j}} \le \frac{125}{108}\left(\frac{k}{j}\right)^2. \tag{3.13} \]
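The Legendre-transform step leading to (3.12) can be illustrated numerically. The sketch below compares the closed form on the left-hand side of (3.12) with a brute-force supremum of $wz - R_1(z)$, and evaluates the right-hand side of (3.12); the function names are hypothetical, and the test spectrum $E_m = m^2$ (a single Dirichlet edge of length $\pi$) is an assumed example rather than one taken from the text.

```python
import math

def legendre_of_r1(eigs, w):
    """Closed form (w - [w]) E_{[w]+1} + [w] * (mean of E_1..E_[w]).

    eigs is the increasing list E_1, E_2, ... (0-indexed), and 0 <= w < len(eigs).
    """
    k = math.floor(w)
    return (w - k) * eigs[k] + sum(eigs[:k])

def legendre_brute_force(eigs, w):
    """sup_z (w z - R_1(z)).

    The objective is piecewise linear and concave in z, so for
    0 <= w < len(eigs) the supremum is attained at one of the eigenvalues.
    """
    def r1(z):
        return sum(z - e for e in eigs if e < z)
    return max(w * z - r1(z) for z in eigs)

def rhs_3_12(eigs, w, j):
    """Right-hand side of (3.12): (w^3 / j^2) * (125/108) * mean(E_1..E_j)."""
    mean_j = sum(eigs[:j]) / j
    return (w ** 3 / j ** 2) * (125 / 108) * mean_j
```

For instance, with `eigs = [1.0, 4.0, 9.0, 16.0]` and `w = 2.5` both transforms give the same value, and for `j = 1`, `w = 1.2` the left-hand side stays below the right-hand side, as (3.12) requires for $w \ge 6j/5$.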

In [10] it was shown that a similar inequality with a different constant can be proved for all $k \ge j$ in the context of the Dirichlet Laplacian on Euclidean domains. The very same argument applies to quantum graphs with $V = 0$. With this assumption $\|\phi_j'\|^2 = E_j$, so with $\alpha = 1$ (2.5) can be rewritten as a quadratic inequality,

\[ P_j(z) := \sum_{\ell=1}^{j} (z-E_\ell)(z-5E_\ell) \le 0 \tag{3.14} \]

for $z \in [E_j, E_{j+1}]$ (cf. [10, Eq. (4.6)]). From (3.2) and (3.5) for $z \ge z_0 \ge E_j$,

\[ R_1(z) \ge \frac{5}{4z}\,R_2(z) \ge \frac{5}{4}\,z^{3/2}\,z_0^{-5/2}\sum_{\ell=1}^{j}(z_0-E_\ell)^2. \tag{3.15} \]

The derivative of the right-hand side of (3.15) with respect to $z_0$, by a calculation, is a negative quantity times $P_j(z_0)$, and therefore an optimal choice for the value of (3.15) is the root

\[ z_0 = 3\overline{E_j} + \sqrt{D_j} \le 5\overline{E_j}, \tag{3.16} \]

where $D_j$ is the discriminant of $P_j$. The inequality in (3.16) results from the Cauchy–Schwarz inequality as in [10, 12]. Because $P_j(z_0) = 0$,

\[ 0 = \sum_{\ell=1}^{j}(z_0-E_\ell)(z_0-5E_\ell) = 5\sum_{\ell=1}^{j}(z_0-E_\ell)^2 - 4z_0\sum_{\ell=1}^{j}(z_0-E_\ell), \]

so (3.15) reads

\[ R_1(z) \ge \left(\frac{z}{z_0}\right)^{3/2}\sum_{\ell=1}^{j}(z_0-E_\ell) = \left(\frac{z}{z_0}\right)^{3/2} j\,(z_0-\overline{E_j}). \]

From the left-hand side of (3.16), $z_0 - \overline{E_j} \ge \frac{2}{3}z_0$, so

\[ R_1(z) \ge \frac{2}{3}\,j\,z_0^{-1/2}\,z^{3/2}. \tag{3.17} \]

The Legendre transform of (3.17) is

\[ k\,\overline{E_k} \le \frac{z_0}{3j^2}\,k^3, \tag{3.18} \]

and a calculation of the maximizing $z$ in the Legendre transform of the right-hand side of (3.17) shows that (3.18) is valid for all $k > j$. In particular, with the inequality on the right-hand side of (3.16), we have established the following:

Corollary 3.5. For $k \ge j$, the means of the eigenvalues of $H_D$ in $L^2(\Gamma)$ satisfy

\[ \frac{\overline{E_k}}{\overline{E_j}} \le \frac{5}{3}\left(\frac{k}{j}\right)^2. \tag{3.19} \]
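The chain (3.14)–(3.19) can be traced numerically. The following Python sketch assumes that $D_j$ in (3.16) denotes $9\overline{E_j}^{\,2} - 5\overline{E_j^2}$, so that $3\overline{E_j} \pm \sqrt{D_j}$ are the roots of $P_j(z)/j = z^2 - 6z\overline{E_j} + 5\overline{E_j^2}$; the function names are hypothetical and the spectrum used in the example is again the assumed single-edge spectrum $E_m = m^2$.

```python
import math

def optimal_z0(eigs, j):
    """Larger root of P_j, written as 3*mean(E) + sqrt(D_j) as in (3.16).

    Assumption: D_j = 9*mean(E)^2 - 5*mean(E^2), taken over E_1..E_j.
    """
    mean1 = sum(eigs[:j]) / j
    mean2 = sum(e * e for e in eigs[:j]) / j
    disc = 9 * mean1 ** 2 - 5 * mean2
    return 3 * mean1 + math.sqrt(disc)

def p_j(eigs, j, z):
    """Quadratic form P_j(z) = sum_{l<=j} (z - E_l)(z - 5 E_l) from (3.14)."""
    return sum((z - e) * (z - 5 * e) for e in eigs[:j])

def mean_ratio_bound_holds(eigs, j, k):
    """Check (3.18) and (3.19) for one pair k >= j."""
    z0 = optimal_z0(eigs, j)
    mean_j = sum(eigs[:j]) / j
    mean_k = sum(eigs[:k]) / k
    scale = (k / j) ** 2
    return mean_k <= z0 / 3 * scale + 1e-9 and mean_k <= 5 / 3 * scale * mean_j + 1e-9
```

For `eigs = [1.0, 4.0, 9.0, ...]` and `j = 1` the root is `z0 = 5.0`, which saturates the inequality in (3.16); for larger `j` the root lies strictly between $3\overline{E_j}$ and $5\overline{E_j}$.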

Remark 3.6. Relaxing the assumption to $k \ge j$ comes at the price of making the constant on the right-hand side larger. It would be possible to interpolate between (3.19) and (3.13) for $k \in [j, 6j/5]$ with a slightly better inequality.

Acknowledgment
The authors are grateful to several people for useful comments, including Rupert L. Frank, Lotfi Hermi, Thomas Morley, Joachim Stubbe, and Timo Weidl, and to Michael Music for calculations and insights generated by them. We also wish to express our appreciation to the Mathematisches Forschungsinstitut Oberwolfach for hosting a workshop in February 2009, where this collaboration began, and to the Erwin Schrödinger Institut for hospitality.

References
[1] M. Aizenman and E. H. Lieb, On semiclassical bounds for eigenvalues of Schrödinger operators, Phys. Lett. A 66(6) (1978) 427–429.
[2] M. S. Ashbaugh, The universal eigenvalue bounds of Payne–Pólya–Weinberger, Hile–Protter, and H. C. Yang, Spectral and Inverse Spectral Theory (Goa, 2000), Proc. Indian Acad. Sci. Math. Sci. 112 (2002) 3–30.
[3] G. Berkolaiko, R. Carlson, S. A. Fulling and P. Kuchment (eds.), Quantum Graphs and Their Applications, Contemporary Mathematics, Vol. 415 (American Mathematical Society, 2006).
[4] M. Sh. Birman, The spectrum of singular boundary problems, Amer. Math. Soc. Trans. (2) 53 (1966) 23–80.
[5] M. Cwikel, Weak type estimates for singular values and the number of bound states of Schrödinger operators, Ann. Math. (2) 106(1) (1977) 93–100.
[6] T. Ekholm, R. L. Frank and H. Kovařík, Eigenvalue estimates for Schrödinger operators on metric trees, arXiv:0710.5500.
[7] P. Exner, J. P. Keating, P. Kuchment, T. Sunada and A. Teplyaev (eds.), Analysis on Graphs and Its Applications, Proceedings of Symposia in Pure Mathematics, Vol. 77 (American Mathematical Society, Providence, RI, 2008); Papers from the program held in Cambridge, January 8–June 29 (2007).
[8] E. M. Harrell, II and L. Hermi, On Riesz means of eigenvalues, arXiv:0712.4088.
[9] E. M. Harrell, II and L. Hermi, Differential inequalities for Riesz means and Weyl-type bounds for eigenvalues, J. Funct. Anal. 254(12) (2008) 3173–3191.
[10] E. M. Harrell, II and J. Stubbe, Trace identities for commutators with applications to the distribution of eigenvalues, arXiv:0903.0563v1.
[11] E. M. Harrell, II and J. Stubbe, Universal bounds and semiclassical estimates for eigenvalues of abstract Schrödinger operators, arXiv:0808.1133.
[12] E. M. Harrell, II and J. Stubbe, On trace identities and universal eigenvalue estimates for some partial differential operators, Trans. Amer. Math. Soc. 349(5) (1997) 1797–1809.

[13] D. Hundertmark, Bound state problems in quantum mechanics, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon's 60th Birthday, Proc. Sympos. Pure Math., Vol. 76, Part 1 (Amer. Math. Soc., Providence, RI, 1980), pp. 463–496.
[14] G. R. Kirchhoff, Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird, Poggendorf's Ann. Phys. Chem. 72 (1847) 497–508.
[15] P. Kuchment, Quantum graphs: An introduction and a brief survey, in Analysis on Graphs and Its Applications, Proc. Symp. Pure. Math. (Amer. Math. Soc., Providence, RI, 2008), pp. 291–314.
[16] P. Kurasov, Schrödinger operators on graphs and geometry. I. Essentially bounded potentials, J. Funct. Anal. 254(4) (2008) 934–953.
[17] A. Laptev and T. Weidl, Recent results on Lieb–Thirring inequalities, in Journées Équations aux Dérivées Partielles (La Chapelle sur Erdre, 2000), Exp. No. XX (Univ. Nantes, Nantes, 2000), 14pp.
[18] A. Laptev and T. Weidl, Sharp Lieb–Thirring inequalities in high dimensions, Acta Math. 184(1) (2000) 87–111.
[19] E. H. Lieb, The number of bound states of one-body Schrödinger operators and the Weyl problem, in Geometry of the Laplace Operator (Proc. Sympos. Pure Math., Univ. Hawaii, Honolulu, Hawaii, 1979), Proc. Sympos. Pure Math., Vol. 36 (Amer. Math. Soc., Providence, RI, 1980), pp. 241–252.
[20] E. H. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, Studies in Mathematical Physics: Essays in Honor of Valentine Bargmann (Princeton Univ. Press, 1976), pp. 269–303.
[21] L. Pauling, The diamagnetic anisotropy of aromatic molecules, J. Chem. Phys. 4 (1936) 673–677.
[22] L. H. Payne, G. Pólya and H. F. Weinberger, On the ratio of consecutive eigenvalues, J. Math. Phys. 35 (1956) 289–298.
[23] G. V. Rozenblum, Distribution of the discrete spectrum of singular differential operators, Izv. Vyss. Ucebn. Zaved. Matematika 1(164) (1976) 75–86.
[24] K. Ruedenberg and C. W. Scherr, Free-electron network model for conjugated systems, I, Theory, J. Chem. Phys. 21 (1953) 1565–1581.
[25] J. Stubbe, Universal monotonicity of eigenvalue moments and sharp Lieb–Thirring inequalities, preprint (2008).
[26] W. Thirring, A Course in Mathematical Physics: Quantum Mechanics of Atoms and Molecules, Vol. 3 (Springer-Verlag, 1991), pp. 149–150.
[27] T. Weidl, On the Lieb–Thirring constants $L_{\gamma,1}$ for $\gamma \ge 1/2$, Comm. Math. Phys. 178(1) (1996) 135–146.
[28] H. Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen, Math. Ann. 71 (1912) 441–479.
[29] H. Weyl, Repartición de corriente en una red conductora, Rev. Mat. Hisp. Amer. 5(1) (1923) 153–164.
[30] H. C. Yang, Estimates of the difference between consecutive eigenvalues, preprint (1995); revision of International Centre for Theoretical Physics, preprint IC/91/60, Trieste (April 1991).
14:17 WSPC/S0129-055X 148-RMP J070-00397

Reviews in Mathematical Physics
Vol. 22, No. 3 (2010) 331–354
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003977

GEOMETRIC MODULAR ACTION FOR DISJOINT INTERVALS AND BOUNDARY CONFORMAL FIELD THEORY

ROBERTO LONGO, PIERRE MARTINETTI and KARL-HENNING REHREN

Dipartimento di Matematica, Università di Roma 2 "Tor Vergata", 00133 Roma, Italy
Institut für Theoretische Physik, Universität Göttingen, Friedrich-Hund-Platz 1, 37077 Göttingen, Germany
Courant Centre "Higher Order Structures in Mathematics", Universität Göttingen, Bunsenstr. 3-5, 37073 Göttingen, Germany
longo@mat.uniroma2.it
martinetti@theorie.physik.uni-goettingen.de
rehren@theorie.physik.uni-goettingen.de

Received 7 December 2009


Revised 14 January 2010

Dedicated to John E. Roberts on the occasion of his 70th birthday

In suitable states, the modular group of local algebras associated with unions of disjoint intervals in chiral conformal quantum field theory acts geometrically. We translate this result into the setting of boundary conformal QFT and interpret it as a relation between temperature and acceleration. We also discuss novel aspects ("mixing" and "charge splitting") of geometric modular action for unions of disjoint intervals in the vacuum state.

Keywords: Quantum field theory; modular theory.

Mathematics Subject Classification 2010: 81T40

1. Introduction
Geometric modular action is a most remarkable feature of quantum field theory [2], emerging from the combination of the basic principles: unitarity, locality, covariance and positive energy [1]. It associates thermal properties with localization [17, 30], and is intimately related to the Unruh effect [34] and Hawking radiation [31]. It allows for a reconstruction of space and time along with their symmetries [7], and for a construction of full-fledged quantum field theories [23, 16] out of purely algebraic data together with a Hilbert space vector (the vacuum).


The modular group [32, Chap. VI, Theorem 1.19] is an intrinsic group of automorphisms of a von Neumann algebra $M$, associated with a cyclic and separating vector $\Phi$, provided by the theory of Tomita and Takesaki [17, 32]. In quantum field theory, $M$ may be the algebra of observables localized in a wedge region $\{x \in \mathbb{R}^4 : x^1 > |x^0|\}$ and $\Phi = \Omega$ the vacuum state. In this situation it follows [1] that the associated modular group is the 1-parameter group of Lorentz boosts in the 1-direction, which preserves the wedge, i.e. it has a geometric action on the subalgebras of observables localized in subregions of the wedge.
Geometric modular action was also established for the algebras of observables localized in lightcones or double cones in the vacuum state in conformally invariant QFT [5, 16], and for interval algebras in chiral conformal QFT [4]. It is known, however, that the modular group of the vacuum state is not geometric ("fuzzy") for double cone algebras in massive QFT (see, e.g., [2, 29]), and the same is true for the modular group of wedge algebras or conformal double cone algebras in thermal states [3]. In this contribution, we shall be interested in modular groups for algebras associated with disconnected regions (such as unions of disjoint intervals in chiral conformal QFT).
Our starting point is the observation [21] that in chiral conformal QFT (the precise assumptions will be specified below), for any finite number $n$ of disjoint intervals $I_i$ on the circle one can find product states (not the vacuum if $n > 1$) on the algebras $A(\bigcup_i I_i) = \bigvee_i A(I_i)$ whose modular groups act geometrically inside the intervals.
For $n = 2$, let $E = I_1 \cup I_2$ and $E' = S^1 \setminus \overline{E}$ the complement of the closure of $E$. By locality, $A(E) \subset A(E')'$, where the inclusion is in general proper. The larger algebra $A(E')'$ admits the physical re-interpretation as a double cone algebra $B_+(O)$ in boundary conformal QFT [25] as will be explained in Sec. 2.2.
The above state on $A(E)$ can be extended to a state on $B_+(O) = A(E')'$ such that the geometric modular action is preserved. We shall compute the geometric flow in the double cone $O$ in Sec. 2. Adopting the interpretation of $\frac{ds}{d\tau}$ as inverse temperature $\beta$ (where $\tau$ is the proper time along an orbit and $s$ the modular group parameter) [11, 28], we compute the relation between temperature and acceleration. There is not a simple proportionality as in the case of the Hawking temperature.
In Sec. 3, we shall connect our results with a recent work by Casini and Huerta [9]. In a first quantization approach as in [14], these authors have succeeded in computing the operator resolvent in the formula of [14] for the modular operator. From this, they obtained the modular flow for disjoint intervals and double cones in 2 dimensions in the theory of free Fermi fields. Unlike [21], they consider the vacuum state. They find a geometric modular action in the massless case (including the chiral case), but this action involves a "mixing" ("modular teleportation" [9]) between the different intervals resp. double cones. We shall discuss how, upon descent to gauge-invariant subtheories, the mixing leads to the new phenomenon of "charge splitting" (Sec. 3.3).
Ignoring the mixing, the geometric part of the vacuum modular flow for two intervals in the chiral free Fermi model is the same as the purely geometric modular flow in the previous non-vacuum product state, provided a "canonical" choice for the latter is made, in the model-independent approach.
We shall make the result of Casini and Huerta (which was obtained by formal manipulations of operator kernels) rigorous by establishing the KMS property of the vacuum state with respect to the modular action they found. We shall also present a preliminary discussion of the question to what extent the result may be expected to hold in other than free Fermi theories.

2. Geometric Modular Flow for n-Intervals


Let $I \mapsto A(I)$ be a diffeomorphism covariant local net on the circle $S^1$: the orientation-preserving diffeomorphisms $\gamma$ of $S^1$ are unitarily implemented by $U(\gamma)$ such that $\mathrm{Ad}\,U(\gamma)$ maps $A(I)$ onto $A(\gamma(I))$ and $\mathrm{Ad}\,U(\gamma)|_{A(I)} = \mathrm{id}|_{A(I)}$ if $\gamma|_I = \mathrm{id}|_I$; in particular, for localized diffeomorphisms $\gamma$, $U(\gamma)$ are local observables, associated with the stress-energy tensor; see, e.g., [27, Sec. 3].
An n-interval is the union $E := \bigcup_{k=1}^{n} I_k$ of $n$ open intervals $I_k \subset S^1$ ($k = 1, \ldots, n$) with mutually disjoint closure. The complement $E' = S^1 \setminus \overline{E}$ is another n-interval.
If there is an interval $I \subset S^1$ such that $E = \{z \in S^1 : z^n \in I\}$, we write $E = \sqrt[n]{I}$, and call $E$ symmetric. In this case, $E' = \sqrt[n]{I'}$. Note that every 2-interval is a Möbius transform of a symmetric 2-interval, while the same is not true for $n > 2$.
We are interested in the algebras

\[ A(E) := \bigvee_{i=1}^{n} A(I_i) \quad\text{and}\quad \hat A(E) := A(E')', \tag{2.1} \]

and their states with geometric modular action. By $\Omega$ we denote the vacuum vector, and by $U$ the projective unitary representation of the diffeomorphism group in the vacuum representation, with generators $L_n$ ($n \in \mathbb{Z}$) and central charge $c$.

2.1. Product states with geometric modular action



For $n = 1$, $E$ is just an interval and $A(I) = \hat A(I)$ (Haag duality).
Proposition 1 (Bisognano–Wichmann Property) ([4, Theorem 2.3]). The modular group of unitaries for the pair $(A(I), \Omega)$ is given by the 1-parameter group of Möbius transformations $\Lambda_I$ that fixes the interval $I$, $\Delta^{it}_{A(I),\Omega} = U(\Lambda_I(-2\pi t))$.
For $I = S^1_+$ the upper half circle, the generator of the subgroup $U(\Lambda_{S^1_+}(t))$ is the dilation operator $D = i(L_1 - L_{-1})$. It follows that $D$ as well as its Möbius conjugates $D_I$ (the generators of the subgroups $U(\Lambda_I(t))$) are of modular origin:

\[ -2\pi D_I = \log \Delta_{A(I),\Omega}. \tag{2.2} \]

Fig. 1. Flow $f_t$ in the 3-intervals $E = \sqrt[3]{S^1_+} = I_1 \cup I_2 \cup I_3$ and $E' = \sqrt[3]{S^1_-}$.

Let now

\[ L_0^{(n)} = \frac{1}{n}L_0 + \frac{c}{24}\,\frac{n^2-1}{n}, \qquad L_{\pm 1}^{(n)} = \frac{1}{n}L_{\pm n}, \tag{2.3} \]

and $U^{(n)}$ the covering representation of the Möbius group with generators $L_k^{(n)}$ ($k = 0, \pm 1$). The unitary one-parameter groups $V(t) = U^{(n)}(\Lambda_I(-2\pi t))$ act on the diffeomorphism covariant net by

\[ V(t)\,A(J)\,V(t)^* = A(f_t(J)) \qquad (J \subset \sqrt[n]{I}) \tag{2.4} \]

where the geometric flow $f_t$ is given by (cf. Fig. 1)

\[ f_t(z) = \sqrt[n]{\Lambda_I(-2\pi t)(z^n)}, \tag{2.5} \]

with the branch of $\sqrt[n]{\cdot}$ chosen in the same connected component of $E$ as $z$, i.e. $f_t$ is a diffeomorphism of $S^1$ which preserves each component of $E$ separately. The same formulae hold also for $J \subset \sqrt[n]{I'}$.
The question arises whether for $n > 1$ the generators $D_I^{(n)}$ of $V(t)$ also have "modular origin" as in (2.2). However, unlike with $n = 1$, we have the following lemma and corollary:
Lemma. In a unitary positive-energy representation of $sl(2,\mathbb{R})$ of weight $h > 0$, there is no vector $\Phi$ such that $D\Phi = 0$, where $D = i(L_1 - L_{-1})$.

Proof. An orthonormal basis of the representation is given by the vectors $|n\rangle = (n!\,(2h)_n)^{-1/2} L_{-1}^n |h\rangle$, where $|h\rangle$ is the lowest weight vector. Solving the eigenvalue equation $L_1\Phi = L_{-1}\Phi$ by the ansatz $\Phi = \sum_n c_n |n\rangle$ produces a recursion for the coefficients $c_n$ whose solution is not square-summable.
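The recursion in this proof can be made explicit under one assumption that is standard but not spelled out above: with $L_1 = L_{-1}^*$ and the stated normalization, $L_{-1}|n\rangle = \sqrt{(n+1)(2h+n)}\,|n+1\rangle$ and $L_1|n\rangle = \sqrt{n(2h+n-1)}\,|n-1\rangle$. Comparing the coefficients of $|m\rangle$ in $L_1\Phi = L_{-1}\Phi$ then gives $c_{m+1}\sqrt{(m+1)(2h+m)} = c_{m-1}\sqrt{m(2h+m-1)}$, so the odd coefficients vanish and the even ones are fixed by $c_0$. The Python sketch below (hypothetical function name, normalization $c_0 = 1$) tabulates $|c_{2k}|^2$; its decay like $1/k$ is what makes the sum diverge logarithmically.

```python
import math

def squared_coefficients(h, terms):
    """Return [|c_0|^2, |c_2|^2, ..., |c_{2(terms-1)}|^2] with c_0 = 1.

    Uses c_{m+1} sqrt((m+1)(2h+m)) = c_{m-1} sqrt(m(2h+m-1)) at odd m,
    an assumed form of the recursion derived from the norms n!(2h)_n.
    """
    values = [1.0]
    for k in range(1, terms):
        m = 2 * k - 1            # level m links c_{m-1} = c_{2k-2} and c_{m+1} = c_{2k}
        values.append(values[-1] * m * (2 * h + m - 1) / ((m + 1) * (2 * h + m)))
    return values

def partial_norm(h, terms):
    """Partial sum of |c_{2k}|^2; it keeps growing (roughly like log(terms))."""
    return sum(squared_coefficients(h, terms))
```

For $h = 1/2$ the product reduces to the Wallis product, so $k\,|c_{2k}|^2$ approaches $1/\pi$; for other $h > 0$ the same $1/k$ decay appears with a different constant.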

Corollary. For $n > 1$, no cyclic and separating vector $\Phi$ exists in a positive-energy representation of the net $A$ such that the modular Hamiltonian $\log\Delta_{A(E),\Phi}$ would equal $-2\pi D_I^{(n)}$.

Proof. By modular theory, $\log\Delta_{A(E),\Phi}\,\Phi = 0$. But because $L_0^{(n)} \ge \frac{c}{24}\,\frac{n^2-1}{n} > 0$, the lemma states that no vector can be annihilated by $D_I^{(n)}$ which is a Möbius conjugate of $D^{(n)}$.

Instead, the appropriate generalization of (2.2) for the modular origin of the generators $D_I^{(n)}$ was given in [21], assuming that the net $A$ is completely rational. This means that the split property holds and the $\mu$-index $\mu_A = [\hat A(E) : A(E)]$ is finite, and implies that $A(E) \subset \hat A(E)$ is irreducible and there is a unique conditional expectation $\varepsilon_E : \hat A(E) \to A(E)$ [22, Proposition 5 and Sec. 3]. In the sequel, $\frac{d\varphi}{d\varphi'}$ is the Connes spatial derivative for a pair of faithful normal states $\varphi$ and $\varphi'$ on a von Neumann algebra $M$ and its commutant $M'$, which is the canonical positive operator such that $(\frac{d\varphi}{d\varphi'})^{it}$ implements $\sigma_t^{\varphi}$ on $M$ and $(\frac{d\varphi}{d\varphi'})^{-it}$ implements $\sigma_t^{\varphi'}$ on $M'$ [10, Theorem 9].
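Proposition 2 ([21, Corollary 16]). There is a faithful normal state $\varphi_E$ on $A(E)$ ($E = \sqrt[n]{I}$) and a second faithful normal state $\varphi'_{E'}$ on $A(E')$, such that the following hold: The modular automorphism group $\sigma_t^{\varphi_E}$ is implemented by $V(t)$, $\sigma_t^{\varphi'_{E'}}$ is implemented by $V(-t)$, and

\[ -2\pi D_I^{(n)} = \log\left(\frac{d\hat\varphi_E}{d\varphi'_{E'}}\right) + \frac{n-1}{2}\,\log\mu_A. \tag{2.6} \]

Here, $\hat\varphi_E = \varphi_E \circ \varepsilon_E$ extends the state $\varphi_E$ on $A(E)$ to a state on $\hat A(E)$. Moreover, $\frac{d\hat\varphi_E}{d\varphi'_{E'}} = \frac{d\varphi_E}{d\hat\varphi'_{E'}}$.
The state $\varphi_E$ on $A(E)$ is given by $\varphi_E := (\bigotimes_{k=1}^{n}\varphi_k)\circ\chi_E$ where $\chi_E : A(E) \equiv \bigvee_{k=1}^{n} A(I_k) \to \bigotimes_{k=1}^{n} A(I_k)$ is the natural isomorphism given by the split property ($I_k$ are the components of $E$), and the states $\varphi_k$ on $A(I_k)$ are given by $\varphi_k = \omega\circ\mathrm{Ad}\,U(\gamma_k)$, where $\omega$ is the vacuum state, and $U(\gamma_k)$ implement diffeomorphisms $\gamma_k$ that equal $z \mapsto z^n$ on $I_k$. (By locality, $\varphi_k$ do not depend on the behavior of $\gamma_k$ outside $I_k$.)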

Corollary. Let $\varphi_E$ and $\hat\varphi_E$ be the states on $A(E)$ and on $\hat A(E)$, respectively, as in Proposition 2. For intervals $J_k \subset I_k$ (= the components of $E$) and $F = \bigcup_{k=1}^{n} J_k$, we have the geometric modular actions

\[ \sigma_t^{\varphi_E}(A(J_k)) = A(f_t(J_k)), \quad\text{hence}\quad \sigma_t^{\varphi_E}(A(F)) = A(f_t(F)), \tag{2.7} \]

\[ \sigma_t^{\hat\varphi_E}(A(J_k)) = A(f_t(J_k)), \quad\text{and}\quad \sigma_t^{\hat\varphi_E}(\hat A(F)) = \hat A(f_t(F)). \tag{2.8} \]

Proof. (2.7) is obvious from (2.4). By the defining implementation properties of the Connes spatial derivative, we conclude from (2.6) that $\sigma^{\hat\varphi_E}$ is implemented by $V(t)$. This implies (2.8), by the $U^{(n)}$-covariance of the algebras under consideration.
(We include the obvious statement (2.7) for later comparison with the geometric modular flow in [9], for which only the second equality in (2.7) holds while the first is violated.)
For $n = 1$, one may just choose $\gamma = \mathrm{id}$, so that both $\varphi_I$ and $\varphi'_{I'}$ are given by the restrictions of the vacuum state, and (2.6) reduces to (2.2).
For $n > 1$, the state $\varphi_E$ is different from the vacuum state, but it is rotation invariant on $A(E)$ in the sense that $\varphi_E\circ\mathrm{Ad}\,U(\mathrm{rot}_t) = \varphi_E$ on $A(J_k)$ for $\overline{J_k} \subset I_k$ and $t$ small enough that $\mathrm{rot}_t(J_k) \subset I_k$. ($\mathrm{rot}_t$ stands for the rotations $z \mapsto e^{it}z$.)

Namely, if $J \subset I$ such that $gJ \subset I$ for $g$ in a neighborhood $N$ of the identity of the Möbius group, then by construction, $\varphi_E\circ\mathrm{Ad}\,U^{(n)}(g) = \varphi_E$ on $A(\sqrt[n]{J})$ for $g \in N$. In particular, the same is true for the rotations $\mathrm{rot}_t$ with $t$ in a neighborhood of 0. Since $U^{(n)}(\mathrm{rot}_t) = U(\mathrm{rot}_{t/n})\cdot$(complex phase), the rotation invariance on $A(E)$ follows.
One could actually have chosen any other family of diffeomorphisms $\gamma_k$ that map $I_k$ onto $I$, resulting in product states $\varphi_E^{(\gamma_k)}$ with a different geometric flow on $E$. In that case, the unitary 1-parameter group $V(t)$ satisfying the properties of Proposition 2 is a diffeomorphism conjugate of $U^{(n)}(\Lambda_I(-2\pi t))$. One might expect that our choice of $\varphi_E$ is the only one in this class which enjoys the rotation invariance on $A(E)$. Surprisingly, this is not the case:
Let $\varphi_E^{(\gamma_k)}$ be a product state on $A(E)$ that is given on $A(I_k)$ by $\omega\circ\mathrm{Ad}\,U(\gamma_k)$, where $\gamma_k$ are diffeomorphisms of $S^1$ that map $I_k$ onto $I$. Then this state is rotation invariant on $A(E)$, by construction, if and only if $\omega\circ\mathrm{Ad}\,U(h_k)$ are rotation invariant on $A(I)$, where $h_k$ are diffeomorphisms of $S^1$, defined on $I$ by $h_k(z^n) = \gamma_k(z)$ for $z \in I_k$. In particular, $h_k$ map $I$ onto $I$. The condition that $\omega\circ\mathrm{Ad}\,U(h)$ is rotation invariant on $A(I)$ can be evaluated for the 2-point function of the stress-energy tensor in that state. Using the inhomogeneous transformation law under diffeomorphisms $h$, involving the Schwartz derivative $D_z h = \frac{h'''}{h'} - \frac{3}{2}\bigl(\frac{h''}{h'}\bigr)^2$, the quantity

\[ 2c\left(\frac{\frac{dh_t(z)}{dz}\,\frac{dh_t(w)}{dw}}{(h_t(z)-h_t(w))^2}\right)^{2} + \frac{c^2}{36}\,D_z h_t(z)\,D_w h_t(w), \tag{2.9} \]

where $h_t = h\circ\mathrm{rot}_t$, must be independent of $t$ for $z, w \in I$ and $t$ in a neighborhood of zero. Working out the singular parts of the expansion in $w$ around $z$, one finds that $D_z h_t(z)$ must be independent of $t$ for $z \in I$. This already implies that the second (regular) term is separately invariant, so that, in particular, the invariance condition does not depend on the central charge $c$. Solving

\[ \partial_t D_z h_t(z) = 0 \quad\Leftrightarrow\quad z^2 D_z h(z) = \text{const.}, \tag{2.10} \]

when the constant is parametrized as $\frac{1}{2}(1-\nu^2)$, yields

\[ h(z) = \mu(z^\nu) = \frac{Az^\nu + B}{Cz^\nu + D} \quad\text{for } z \in I, \tag{2.11} \]

where $\mu$ is a Möbius transformation.^a The state $\omega\circ\mathrm{Ad}\,U(h)$ is indeed rotation invariant on $A(I)$ by $h\circ\mathrm{rot}_t(z) = \mu\circ\mathrm{rot}_{\nu t}(z^\nu)$ and Möbius invariance of $\omega$.

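a The sign of the exponent $\nu$ can be reversed by exchanging $A \leftrightarrow B$ and $C \leftrightarrow D$. In order that $h$ takes values in $S^1$, $\nu$ must be either real or imaginary, with corresponding reality conditions on the matrix $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$. Requiring $h$ also to preserve the orientation, we find: If $\nu > 0$, then $\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in SU(1,1)$. If $i\nu > 0$, then $\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix}\cdot SL(2,\mathbb{R})$, where $\begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix}$ is the Cayley transformation $x \mapsto \frac{1+ix}{1-ix}$.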

For each value of $\nu$, requiring $h$ to preserve the endpoints of the interval $I$ fixes the Möbius transformation $\mu$ up to left composition with the 1-parameter subgroup $\Lambda_I(t)$. Because $\omega$ is invariant under $\Lambda_I(t)$, the state $\omega\circ\mathrm{Ad}\,U(h)$ is uniquely determined by the exponent $\nu$ in (2.11).
One has therefore a 1-parameter family of product states, all rotation-invariant on $A(I)$, but with different modular flows on $I$. Going back to the product states on $A(E)$ by composition with $z \mapsto z^n$, there is one parameter $\nu_k$ for each interval, i.e. for the choice of the states $\omega\circ\mathrm{Ad}\,U(\gamma_k)$ on $A(I_k)$. The state is invariant also under "large" rotations by $2\pi/n$, if and only if these parameters are the same for all $k$.

2.2. Geometric modular action in boundary CFT


The case $n = 2$ is of particular interest in boundary conformal quantum field theory (BCFT) [25]. With every 2-interval $E$ such that $-1 \notin E$, one associates a double cone $O_E$ in the halfspace $M_+ = \{(t,x) \in \mathbb{R}^2 : x > 0\}$ as follows. The boundary $x = 0$, $t \in \mathbb{R}$ is the pre-image of $\dot S^1 := S^1\setminus\{-1\}$ under the Cayley transform $C : \mathbb{R} \ni t \mapsto z = (1+it)/(1-it) \in \dot S^1$. Let $E = I_- \cup I_+ \subset \dot S^1$ with $I_- < I_+$ in the counter-clockwise order, and $I_\pm^R = C^{-1}(I_\pm) \subset \mathbb{R}$. Then

\[ O_E := I_+^R \times I_-^R \equiv \{(t,x) : t \pm x \in I_\pm^R\}. \tag{2.12} \]

(When there can be no confusion, we shall drop the subscript $E$.)
Now, the algebras

\[ B_+(O) := \hat A(E) \tag{2.13} \]

have the re-interpretation as local algebras of BCFT, which extend the subalgebras of chiral observables

\[ A_+(O) := A(E) \equiv A(I_-) \vee A(I_+). \tag{2.14} \]

Under this re-interpretation, the second statement in (2.8) asserts that the modular group $\sigma_t^{\hat\varphi_E}$ acts geometrically inside the associated diamond $O$:

\[ \sigma_s^{\hat\varphi_E}(B_+(Q)) = B_+(f_s^O(Q)), \tag{2.15} \]

where the double cone $Q = O_F \subset O$ corresponds to a sub-2-interval $F \subset E$, and the flow $f_s^O$ on $O$ arises from the pair of flows $f_s$ (2.5) on $I_+$ and $I_-$, by the said transformations, i.e.

\[ f_s^O(t+x,\,t-x) \equiv (u_s, v_s) = \bigl(C^{-1}\circ f_s\circ C(t+x),\ C^{-1}\circ f_s\circ C(t-x)\bigr). \tag{2.16} \]

For $I_+^R = (a,b) \subset \mathbb{R}_+$ and $I_-^R = (-1/a, -1/b)$ (corresponding to a symmetric 2-interval $E$), we have computed the velocity field

\[ \partial_s u_s = 2\pi\,\frac{(u_s-a)(au_s+1)(u_s-b)(bu_s+1)}{(b-a)(1+ab)(1+u_s^2)} =: -2\pi V^O(u_s) \tag{2.17} \]

for $u_s \in I_+^R$, and the same equation for $v_s \in I_-^R$.

For $I_+^R = (a_1, b_1)$ and $I_-^R = (a_2, b_2)$ corresponding to a non-symmetric 2-interval
$E$, there is a Möbius transformation $m$ that maps $E$ onto a symmetric interval $\hat{E}$.
Choosing the state $\varphi_E := \varphi_{\hat{E}} \circ \mathrm{Ad}\,U(m)$ on $A(E)$, the resulting geometric modular
flow is given by $f_s = m^{-1} \circ \hat{f}_s \circ m$. Going through the same steps, we find
\[ \partial_s u_s = -2\pi V^O(u_s) = 2\pi\, \frac{(u - a_1)(u - b_1)(u - a_2)(u - b_2)}{L u^2 - 2 M u + N} \tag{2.18} \]
with
\[ L = b_1 - a_1 + b_2 - a_2, \quad M = b_1 b_2 - a_1 a_2, \quad N = b_2 a_2 (b_1 - a_1) + b_1 a_1 (b_2 - a_2). \tag{2.19} \]
This differential equation is solved by
\[ \log\left(-\frac{(u_s - a_1)(u_s - a_2)}{(u_s - b_1)(u_s - b_2)}\right) = -2\pi s + \mathrm{const}. \tag{2.20} \]
The modular orbits for $u = t + x$, $v = t - x$ are obtained by eliminating $s$:
\[ \frac{(u - a_1)(u - a_2)}{(u - b_1)(u - b_2)} \cdot \frac{(v - b_1)(v - b_2)}{(v - a_1)(v - a_2)} = \mathrm{const}. \tag{2.21} \]
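As a numerical sanity check of (2.18) to (2.20), the following Python sketch integrates the velocity field with a classical Runge-Kutta step and compares the change of the logarithm in (2.20) with $-2\pi s$. The sign conventions (the minus sign inside the logarithm and in front of $2\pi s$) are the ones inferred from the surrounding text; the function names and the sample endpoints are illustrative assumptions and do not appear in the original.

```python
import math

def lmn(a1, b1, a2, b2):
    """Coefficients L, M, N of (2.19)."""
    L = b1 - a1 + b2 - a2
    M = b1 * b2 - a1 * a2
    N = b2 * a2 * (b1 - a1) + b1 * a1 * (b2 - a2)
    return L, M, N

def velocity(u, a1, b1, a2, b2):
    """Right-hand side of (2.18), i.e. du/ds = -2*pi*V(u)."""
    L, M, N = lmn(a1, b1, a2, b2)
    num = (u - a1) * (u - b1) * (u - a2) * (u - b2)
    return 2.0 * math.pi * num / (L * u * u - 2.0 * M * u + N)

def log_ratio(u, a1, b1, a2, b2):
    """Left-hand side of (2.20), valid for u inside (a1, b1) or (a2, b2)."""
    return math.log(-(u - a1) * (u - a2) / ((u - b1) * (u - b2)))

def flow(u0, s, ends, steps=200):
    """Integrate du/ds = velocity(u) from 0 to s with fixed-step RK4."""
    h, u = s / steps, u0
    for _ in range(steps):
        k1 = velocity(u, *ends)
        k2 = velocity(u + 0.5 * h * k1, *ends)
        k3 = velocity(u + 0.5 * h * k2, *ends)
        k4 = velocity(u + h * k3, *ends)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return u

ends = (1.0, 3.0, -2.0, -0.5)   # hypothetical a1, b1, a2, b2
u0, s = 2.0, 0.1
u1 = flow(u0, s, ends)
drift = log_ratio(u1, *ends) - log_ratio(u0, *ends)
print(drift, -2.0 * math.pi * s)   # the two numbers should agree
```

The same `velocity` function reduces to (2.17) when the endpoints are chosen as $(a, b, -1/a, -1/b)$, since then $M = 0$ and $L = N$.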

2.3. General boundary CFT



Up to this point, we have taken the boundary CFT to be given by $B_+(O) := \hat{A}(E)$,
which equals the relative commutant $B_+(O) = A(K)' \cap A(L)$ by virtue of Haag
duality of the local chiral net $A$. Here, $K$ and $L \subset S^1$ are the open intervals
between $I_+$ and $I_-$, and spanned by $I_+$ and $I_-$, respectively, i.e. $L = I_+ \cup \overline{K} \cup I_-$.
The general case of a boundary CFT was studied in [25]. If $A$ is completely
rational, every irreducible local boundary CFT net containing $A(E)$ is intermediate
between $A(E)$ and a maximal (Haag dual) BCFT net:
\[ A(I_+) \vee A(I_-) \equiv A_+(O) \subset B_+(O) \subset B_+^{\mathrm{dual}}(O) \equiv B(K)' \cap B(L), \tag{2.22} \]
where $I \mapsto B(I)$ is a conformally covariant, possibly nonlocal net on $S^1$, which
extends $A$ and is relatively local with respect to $A$ [25, Proposition 2.9(ii)]. (Its
extension to the circle in general requires a covering.) If $A$ is completely rational,
the local subfactors $A(I) \subset B(I)$ automatically have finite index (not depending
on $I \subset S^1$) by the same argument as in [20, p. 39], and there are only finitely many
such extensions [19, Theorem 2.4].
There is then a unique global conditional expectation $\varepsilon$ that maps each $B(I)$
onto $A(I)$. $\varepsilon$ commutes with Möbius transformations and preserves the vacuum
state. By relative locality, $\varepsilon$ maps $B(K)' \cap B(L)$ into (in general, not onto) $A(K)' \cap A(L)$, hence
\[ A(E) \equiv A_+(O) \subset \varepsilon(B_+(O)) \subset \hat{A}(E). \tag{2.23} \]
The product state $\hat\varphi_E$ on $\hat{A}(E)$ induces a faithful normal state $\hat\varphi_E \circ \varepsilon$ on $B_+(O)$.

Proposition 3. In a completely rational, diffeomorphism invariant BCFT, the
modular group of the state $\hat\varphi_E \circ \varepsilon$ acts geometrically on $B_+(Q)$, $Q \subset O$, i.e.
$\sigma_s^{\hat\varphi_E \circ \varepsilon}(B_+(Q)) = B_+(f_s^O(Q))$, where $f_s^O$ is the flow (2.16).

Proof. $B_+(O)$ is generated by $A_+(O)$ and an isometry $v$ [24] such that every
element $b \in B_+(O)$ has a unique representation as $b = av$ with $a \in A_+(O)$, and
$va = \theta(a)v$ where $\theta$ is a dual canonical endomorphism of $B_+(O)$ into $A_+(O)$. For
a double cone $Q \subset O$, the isometry $v$ may be chosen to belong to $B_+(Q)$, in
which case $\theta$ is localized in $Q$. We know that the modular group restricts to the
modular group of $A_+(O)$, which acts geometrically; in particular, it takes $A_+(Q)$
to $A_+(f_s^O(Q))$. It then follows by the properties of the conditional expectation
that $\sigma_s^{\hat\varphi_E \circ \varepsilon}(v) \equiv v_s = u_s v$ where $u_s \in A(E)$ is a unitary cocycle of intertwiners
$u_s : \theta \to \theta_s \equiv \sigma_s^{\hat\varphi_E} \circ \theta \circ (\sigma_s^{\hat\varphi_E})^{-1}$. Since $\sigma_s^{\hat\varphi_E}$ acts geometrically in $A_+(O)$, $\theta_s$ is
localized in $f_s^O(Q)$, and $A_+(f_s^O(Q))\, v_s = B_+(f_s^O(Q))$. This proves the claim.

Thus, in every BCFT, the modular group of the state $\hat\varphi_E \circ \varepsilon$ on $B_+(O_E)$ acts
geometrically inside the double cone $O_E$ by the same flow (2.20), (2.21).

2.4. Local temperature in boundary conformal QFT

We shall show that the states $\hat\varphi_E \circ \varepsilon$, whose geometric modular action we have just
discussed, are manufactured far from thermal equilibrium. We adopt the notion of
local temperature introduced in [8], where one compares the expectation values
of suitable "thermometer observables" $\Phi(x)$ in a given state $\varphi$ with their expectation
values in global KMS reference states $\omega_\beta$ of inverse temperature $\beta$. If one can
represent the expectation values as weighted averages
\[ \varphi(\Phi(x)) = \int d\rho_x(\beta)\, \omega_\beta(\Phi(x)) \tag{2.24} \]
(where the thermal functions $\beta \mapsto \omega_\beta(\Phi(x))$ do not depend on $x$ because KMS
states are translation invariant), then one may regard the state $\varphi$ at each point
$x$ as a statistical average of thermal equilibrium states. In BCFT, this analysis
can be carried out very easily for the product states $\varphi_E$ with the energy density
$2T_{00}(t, x) = T(t + x) + T(t - x)$ as thermometer observable. One has a positive value
$\omega_\beta(T(\cdot)) \propto c\,\beta^{-2}$ in the KMS states, while the inhomogeneous transformation law of $T$ under
diffeomorphisms gives
\[ \varphi_E(T(y)) = -\frac{c}{24\pi}\, D_y\chi(y) = -\frac{c}{4\pi}\,(1 + y^2)^{-2} \quad \text{if } y \in I_\pm^R, \]
where $D_y$ is the Schwarz derivative and
$\chi(y) = C^{-1} \circ (z \mapsto z^2) \circ C(y) = \frac{2y}{1 - y^2}$, i.e. negative energy density inside the
double cone $O = I_+^R \times I_-^R$. The product states $\varphi_E$ can therefore not be interpreted
as local thermal equilibrium states in the sense of [8]. The possibility of locally
negative energy density in quantum field theory is well known, and its relation to the
Schwarz derivative in two-dimensional conformal QFT was first discussed in [15].

2.5. Modular temperature in boundary conformal QFT


The thermal time hypothesis [11] provides a very different thermal interpretation
of states with geometric modular action. According to this hypothesis, one interprets
the norm of the vector $\partial_s$ tangent to the modular orbit $x^\mu(s)$ as the inverse
temperature $\beta_s$ of the state as seen by a "physical observer" with accelerated trajectory
$x^\mu(s)$. In the vacuum state on the Rindler wedge algebra, this gives precisely
the Unruh temperature $\beta_s = \frac{d\tau}{ds} = \frac{2\pi}{\kappa}$ ($\tau$ is the proper time, and $\kappa$ the acceleration).
One may also give a local interpretation, by viewing $\beta_s$ as the inverse temperature
of the state for an observer at each point whose trajectory is tangent to the unique
modular orbit through that point.
For these interpretations to make sense it is important that $\partial_s$ is a timelike
vector. Indeed, it is easily seen that the flow (2.17), (2.18) gives negative sign for
both $\partial_s u_s$ and $\partial_s v_s$, because the velocity field $V^O$ is positive inside the interval.
Hence the tangent vector is past-directed timelike. This conforms with a general
result, proven in more than 2 spacetime dimensions:

Proposition 4 ([32, Satz 6.5]). Let $A(O)$ be a local algebra and $U_t$ a unitary
1-parameter group such that $U_t A(Q) U_t^* = A(f_t Q)$ where $f_t$ is an automorphism of $O$
taking double cones in $O$ to double cones. If there is a vector $\Omega$, cyclic and separating
for $A(O)$, such that $U_t A \Omega$ has an analytic continuation into a strip $-\beta < \operatorname{Im} t < 0$,
then $-\partial_t (f_t x)|_{t=0} \in \overline{V}_+$. In particular, the flow of a geometric modular action is
always past-directed null or timelike.

From (2.18), we get the proper time $(d\tau)^2 = du\, dv$ and hence the inverse
temperature $\beta = \frac{d\tau}{ds}$ as a function of the position $x = (t, x)$:
\[ \beta(t, x)^2 = \frac{du}{ds}\,\frac{dv}{ds} = 4\pi^2\, V^O(t + x)\, V^O(t - x). \tag{2.25} \]
The temperature diverges on the boundaries of the double cone ($V^O(a_i) = V^O(b_i) = 0$),
and is positive everywhere in its interior.
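For comparison with the ordinary Unruh effect, we also compute the acceleration
in the momentarily comoving frame
\[ \kappa = \left(-\frac{\partial^2 x^\mu}{\partial\tau^2}\,\frac{\partial^2 x_\mu}{\partial\tau^2}\right)^{1/2} = \frac{d^2x/dt^2}{\left(1 - (dx/dt)^2\right)^{3/2}} = \frac{u'' v' - u' v''}{2\,(u' v')^{3/2}}, \tag{2.26} \]
where the prime stands for $\partial_s$, and we have used $\frac{dx}{dt} = \frac{x'}{t'} = \frac{u' - v'}{u' + v'}$ and
$\frac{d^2x}{dt^2} = \frac{(dx/dt)'}{t'} = 4\,\frac{u'' v' - u' v''}{(u' + v')^3}$. Thus
\[ \kappa(t, x) = \left.\frac{V^{O\prime}(u) - V^{O\prime}(v)}{2\sqrt{V^O(u)\,V^O(v)}}\right|_{u = t + x,\ v = t - x} = \frac{V^{O\prime}(t + x) - V^{O\prime}(t - x)}{\pi^{-1}\beta(t, x)} \tag{2.27} \]
as a function of the position $(t, x)$. The product
\[ \beta(t, x)\,\kappa(t, x) = \pi\left|\partial_x\left(V^O(t + x) + V^O(t - x)\right)\right| = \pi\left|\partial_t\left(V^O(t + x) - V^O(t - x)\right)\right| \tag{2.28} \]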

Fig. 2. Influence of the boundary. Left: modular orbit of an arbitrary point in the symmetric
double cone $O = \{(t, x) : A \le t + x \le B,\ -A^{-1} \le t - x \le -B^{-1}\}$. Right: a zoom on the modular
orbit $(u_s, v_s)$ going through the center of the double cone. The plot represents the curve $(\tilde{u}_s, v_s)$
where $\tilde{u}_s = f \cdot (u_s - u_s^{\mathrm{diag}}) + u_s^{\mathrm{diag}}$, with $(u_s^{\mathrm{diag}}, v_s)$ the straight line joining the two tips of the
double cone (a special vacuum modular orbit in the absence of the boundary), and $f = 100$ a
zoom factor.

has the maximal value $2\pi$ (Unruh temperature) near the left and right edges of the
double cone, and equals 0 along a timelike curve connecting the past and future
tips. This curve is in general not itself a modular orbit.
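The limiting value $2\pi$ at the edges can be checked numerically. The sketch below assumes the symmetric velocity profile of (2.17) and the form $\beta\kappa = \pi\,|V^{O\prime}(u) - V^{O\prime}(v)|$ of the product (2.28) as read from the text; the helper names and the endpoints $a = 0.5$, $b = 2$ are hypothetical, and the derivative is approximated by a central finite difference.

```python
import math

def V(u, a, b):
    """Velocity profile V^O of (2.17) for the symmetric double cone."""
    num = (u - a) * (a * u + 1) * (u - b) * (b * u + 1)
    return -num / ((b - a) * (1 + a * b) * (1 + u * u))

def dV(u, a, b, h=1e-6):
    """Central finite-difference approximation of V'(u)."""
    return (V(u + h, a, b) - V(u - h, a, b)) / (2 * h)

def beta_kappa(u, v, a, b):
    """Product (2.28): pi * |V'(u) - V'(v)| with u = t + x, v = t - x."""
    return math.pi * abs(dV(u, a, b) - dV(v, a, b))

a, b = 0.5, 2.0                              # hypothetical endpoints of I_+
right_edge = beta_kappa(b, -1 / a, a, b)     # corner u = b, v = -1/a
left_edge = beta_kappa(a, -1 / b, a, b)      # corner u = a, v = -1/b
print(right_edge / math.pi, left_edge / math.pi)   # both close to 2
```

The slopes of $V^O$ at the four endpoints are $\pm 1$, which is what produces the factor 2 in both corners.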
In general, the modular orbits are not boost trajectories. However, the quantitative
departure is very small. As an illustration, we display a true modular orbit,
as well as a plot with one coordinate exaggerated by a zoom factor of 100 (Fig. 2).
There exists however one distinguished modular orbit with a simple dynamics,
namely the boost
\[ u_s v_s = -1 \quad \forall s \in \mathbb{R} \tag{2.29} \]
(in the symmetric case, for simplicity) which is a solution of (2.21) for const. = 1.
It is the Lorentz boost of a wedge in $M_+$, whose edge lies on the boundary $x = 0$.
The same is true also for non-symmetric intervals, although the formula (2.29) is
more involved.
Along this distinguished orbit the inverse temperature (2.25) simply reads
\[ \beta = -\frac{\partial_s u_s}{u_s} = -\frac{d}{ds} \ln u_s. \tag{2.30} \]

One can express the proper time of the observer following the boost as a function
of the modular parameter
\[ \tau(s) = \ln u_s - \ln u_0, \tag{2.31} \]
hence $\beta(\tau) = 2\pi\, \frac{V^O(u_0 e^{\tau})}{u_0 e^{\tau}}$. Choosing $u_0 = 1$, one can write the inverse temperature
as a function of the proper time in the form
\[ \beta(\tau) = 2\pi\, \frac{(\sinh(\tau_{\max}) - \sinh(\tau))\,(\sinh(\tau) - \sinh(\tau_{\min}))}{(\sinh(\tau_{\max}) - \sinh(\tau_{\min}))\,\cosh(\tau)}, \tag{2.32} \]
where $\tau_{\min}$ and $\tau_{\max}$ are functions of the coordinates of the double cone. As for
double cones in Minkowski space [28], the temperature is infinite at the tips of the
double cone ($\tau = \tau_{\min}$ or $\tau = \tau_{\max}$) and reaches its minimum in the middle of the
observer's "lifetime". Unfortunately, for generic orbits we have no closed formula
for the temperature as a function of the proper time, so as to compare with the
"plateau behavior" (constant temperature for most of the lifetime) as in [28], that
occurs in CFT without boundary for vacuum modular orbits close to the edges of
the double cone.
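The closed form (2.32) can be compared with $2\pi V^O(e^\tau)/e^\tau$ directly. The following sketch assumes the symmetric profile (2.17), $u_0 = 1$, and the identification $\tau_{\min} = \ln a$, $\tau_{\max} = \ln b$; this identification is an assumption made here for illustration (the text only says that both are functions of the coordinates of the double cone), as are the function names.

```python
import math

def V(u, a, b):
    """Velocity profile V^O of (2.17) for the symmetric double cone."""
    num = (u - a) * (a * u + 1) * (u - b) * (b * u + 1)
    return -num / ((b - a) * (1 + a * b) * (1 + u * u))

def beta_from_velocity(tau, a, b):
    """beta(tau) = 2*pi*V(u)/u along the boost, with u = exp(tau) and u_0 = 1."""
    u = math.exp(tau)
    return 2 * math.pi * V(u, a, b) / u

def beta_closed_form(tau, tau_min, tau_max):
    """Right-hand side of (2.32)."""
    sh = math.sinh
    num = (sh(tau_max) - sh(tau)) * (sh(tau) - sh(tau_min))
    return 2 * math.pi * num / ((sh(tau_max) - sh(tau_min)) * math.cosh(tau))

a, b = 0.5, 2.0                    # hypothetical endpoints of I_+
for tau in (-0.6, -0.2, 0.0, 0.3, 0.65):
    print(tau, beta_from_velocity(tau, a, b),
          beta_closed_form(tau, math.log(a), math.log(b)))
```

Both columns vanish as $\tau$ approaches $\ln a$ or $\ln b$, which corresponds to the infinite temperature at the tips.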

3. The Vacuum Modular Flow


Casini and Huerta [9] recently found that the vacuum modular group for the algebra
of a free Fermi field in the union of $n$ disjoint intervals $(a_k, b_k) \subset \mathbb{R}$ is given by the
formula
\[ \sigma_t(\psi(x_j)) = \sum_k O_{jk}(t)\, \sqrt{\frac{d\zeta}{dx_j}\,\frac{dx_k(t)}{d\zeta}}\; \psi(x_k(t)). \tag{3.1} \]
Here,
\[ e^{\zeta(x)} = -\prod_k \frac{x - a_k}{x - b_k} \tag{3.2} \]
defines a uniformization function $\zeta$ that maps each interval $(a_k, b_k)$ onto $\mathbb{R}$, and
$e^{\zeta} \in \mathbb{R}_+$ has $n$ pre-images $x_k = x_k(\zeta)$, one in each interval, i.e. $-\prod_l \frac{x_k(\zeta) - a_l}{x_k(\zeta) - b_l} = e^{\zeta}$.
The geometric modular flow is given by^b
\[ \zeta(t) = \zeta_0 - 2\pi t, \tag{3.3} \]
i.e. a separate flow $x_k(t) = x_k(\zeta_0 - 2\pi t)$ in each interval. The orthogonal matrix $O$
yields a "mixing" of the fields on the different trajectories $x_i(t)$, and is determined
by the differential equation
\[ \dot{O}(t) = K(t)\,O(t) \tag{3.4} \]

b In [9], the notation is different: the authors "counter" the flow so that the position of $\sigma_t(\psi(x_j(\zeta + 2\pi t)))$
remains constant, except for the mixing.

where $K_{jj}(t) = 0$ and
\[ K_{jk}(t) = 2\pi\, \frac{\sqrt{\frac{dx_j(t)}{d\zeta}\,\frac{dx_k(t)}{d\zeta}}}{x_j(t) - x_k(t)} \qquad (j \neq k). \tag{3.5} \]
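The geometric part of the flow, (3.2) and (3.3), can be made concrete with a few lines of Python. The sketch below is only an illustration under stated assumptions: the sign convention $e^{\zeta(x)} = -\prod_k (x - a_k)/(x - b_k)$ is the one read from the text, the pre-image in each interval is found by plain bisection (relying on the fact that $\zeta$ maps each interval monotonically onto $\mathbb{R}$), and the function names and sample intervals are hypothetical.

```python
import math

def exp_zeta(x, intervals):
    """e^{zeta(x)} of (3.2): minus the product of (x - a_k)/(x - b_k)."""
    p = -1.0
    for a, b in intervals:
        p *= (x - a) / (x - b)
    return p

def preimage(zeta, k, intervals, iters=100):
    """Solve e^{zeta(x)} = e^{zeta} for x in the k-th interval by bisection."""
    lo, hi = intervals[k]
    target = math.exp(zeta)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if exp_zeta(mid, intervals) < target:
            lo = mid      # e^{zeta} increases from 0 at a_k to infinity at b_k
        else:
            hi = mid
    return 0.5 * (lo + hi)

def orbit(zeta0, t, intervals):
    """Geometric flow (3.3): the n points x_k(zeta0 - 2*pi*t)."""
    zeta = zeta0 - 2 * math.pi * t
    return [preimage(zeta, k, intervals) for k in range(len(intervals))]

intervals = [(-3.0, -2.0), (0.0, 1.0), (2.0, 5.0)]   # hypothetical (a_k, b_k)
start = orbit(0.3, 0.0, intervals)
later = orbit(0.3, 0.1, intervals)
print(start)
print(later)   # every point has moved towards the left endpoint a_k
```

The mixing matrix $O(t)$ of (3.4), (3.5) is not computed here; the sketch only traces the $n$ simultaneous trajectories $x_k(t)$ on which the mixing acts.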

Remark. The mixing is a "minimal" way to evade an absurd conclusion from
Takesaki's Theorem ([32, Chap. IX, Theorem 4.2]): Without mixing the modular
group would globally preserve the component interval subalgebras. Then, the
Reeh–Schlieder property of the vacuum vector would imply that the n-interval algebra
coincides with each of its component interval subalgebras.

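Proposition 5. For $\bigcup_k (a_k, b_k) \subset \mathbb{R}$ the Cayley transform of a symmetric
n-interval $E = \sqrt[n]{I} \subset S^1 \setminus \{-1\}$, the geometric part (3.3) of the flow (without
mixing) is the same as (2.5).

Proof. We use variables $u_k = \frac{1 + i a_k}{1 - i a_k}$, $v_k = \frac{1 + i b_k}{1 - i b_k}$, $z = \frac{1 + ix}{1 - ix}$, and the identity
$2i(x - a) = (1 - ix)(1 - ia)(z - u)$. Then
\[ -e^{\zeta} = \prod_k \frac{x - a_k}{x - b_k} = \mathrm{const.} \cdot \prod_k \frac{z - u_k}{z - v_k} = \mathrm{const.} \cdot \frac{z^n - U}{z^n - V} \tag{3.6} \]
where $U = u_k^n$, $V = v_k^n$ such that $I = (U, V) \subset S^1$. Therefore, the flow (3.3) is
equivalent to
\[ \frac{z(t)^n - U}{z(t)^n - V} = e^{-2\pi t}\, \frac{z^n - U}{z^n - V}, \tag{3.7} \]
which in turn is easily seen to be equivalent to (2.5).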
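The two algebraic facts used in this proof are easy to confirm numerically. The sketch below checks the identity $2i(x - a) = (1 - ix)(1 - ia)(z - u)$ for the Cayley transform and the factorization $\prod_k (z - u_k) = z^n - U$ when the $u_k$ are the $n$ points of a symmetric configuration; the sample values and the helper name `cayley` are illustrative choices, not part of the original.

```python
import cmath
import math

def cayley(x):
    """Cayley transform x -> (1 + ix)/(1 - ix), mapping the real line to the circle."""
    return (1 + 1j * x) / (1 - 1j * x)

# Identity 2i(x - a) = (1 - ix)(1 - ia)(z - u) with z = C(x), u = C(a).
x, a = 0.7, -1.3
lhs = 2j * (x - a)
rhs = (1 - 1j * x) * (1 - 1j * a) * (cayley(x) - cayley(a))

# Symmetric configuration: the u_k differ by rotations of 2*pi/n,
# so they are exactly the n-th roots of U = u_0^n.
n, theta = 3, 0.4                      # hypothetical n and angle of u_0
roots = [cmath.exp(1j * (theta - 2 * math.pi * k / n)) for k in range(n)]
U = roots[0] ** n
z = cayley(x)
product = 1
for u in roots:
    product *= (z - u)

print(abs(lhs - rhs), abs(product - (z ** n - U)))   # both at rounding level
```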

Keep in mind, however, that the modular group of the product state in Sec. 2.1
does not mix the intervals $(a_k, b_k)$.
Since every 2-interval is a Möbius transform of a symmetric 2-interval, the
statement of Proposition 5 is also true for general 2-intervals, with the flow (2.20).

3.1. Verification of the KMS condition


The authors of [9] have obtained the flow (3.1) using formal manipulations. We
shall establish the KMS property of the vacuum state for this flow. Because this
property distinguishes the modular group [32, Chap. VIII, Theorem 1.2], we obtain
an independent proof of the claim.

We take $\bigcup_k (a_k, b_k) \subset \mathbb{R}$ the Cayley transform of a symmetric n-interval
$E = \sqrt[n]{I} \subset S^1$. We first solve the differential equation (3.4) for the mixing.

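With angular variables $x = \tan\frac{\xi}{2}$, and $\pi > \xi_0 > \xi_1 > \dots > \xi_{n-1} > -\pi$, the
non-diagonal elements of the matrix $K$ can be written as
\[ K_{kl}(t) = 2\pi\, \frac{\sqrt{\frac{dx_k(t)}{d\zeta}\,\frac{dx_l(t)}{d\zeta}}}{x_k(t) - x_l(t)} = 2\pi\, \frac{\sqrt{\frac{d\xi_k(t)}{d\zeta}\,\frac{d\xi_l(t)}{d\zeta}}}{2\sin\frac{\xi_k(t) - \xi_l(t)}{2}} \tag{3.8} \]
for $k \neq l$. For symmetric intervals, $\xi_k = \xi_0 - \frac{2\pi k}{n}$ and $\frac{d\xi_k}{d\zeta} = \frac{d\xi_0}{d\zeta} > 0$, hence
\[ K_{kl}(t) = -2\pi\, \frac{\frac{d\xi_0(t)}{d\zeta}}{2\sin\frac{\pi(k - l)}{n}} = \lambda_{kl}\, \dot{\xi}_0(t), \qquad \lambda_{kl} = \frac{1}{2\sin\frac{\pi(k - l)}{n}}. \tag{3.9} \]
With the constant anti-symmetric matrix $\lambda = (\lambda_{kl})_{k,l=0}^{n-1}$, we obtain the orthogonal
mixing matrix:

Corollary. The mixing matrix is given by
\[ O(t) = e^{\lambda\,(\xi_0(t) - \xi_0(0))}. \tag{3.10} \]

Remark. The mixing matrix $O(t)$ always belongs to the same one-parameter
subgroup of $SO(n)$, with generator $\lambda$. For $n = 2$, this is just
\[ O(t) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{with} \quad \theta(t) = \frac{1}{2}\,(\xi_0(t) - \xi_0). \tag{3.11} \]
If $E$ is not symmetric, the general formula is
\[ \theta(t) = \arctan\frac{L x_0(t) - M}{\sqrt{LN - M^2}} - \arctan\frac{L x_0(0) - M}{\sqrt{LN - M^2}} \tag{3.12} \]
with notations as in (2.18).^c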
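A quick consistency check between (3.11) and (3.12): for a symmetric 2-interval one has $M = 0$ and $L = N$, so the general angle should collapse to $\frac{1}{2}(\xi_0(t) - \xi_0)$ with $x_0 = \tan\frac{\xi_0}{2}$. The sketch below evaluates both expressions; the minus signs in (3.12) are the ones inferred from the footnote ("the difference of the arctans"), and the function name and sample values are hypothetical.

```python
import math

def mixing_angle(x_t, x_0, a1, b1, a2, b2):
    """Angle theta(t) of (3.12), with L, M, N as defined in (2.19)."""
    L = b1 - a1 + b2 - a2
    M = b1 * b2 - a1 * a2
    N = b2 * a2 * (b1 - a1) + b1 * a1 * (b2 - a2)
    root = math.sqrt(L * N - M * M)
    return math.atan((L * x_t - M) / root) - math.atan((L * x_0 - M) / root)

a, b = 0.5, 2.0            # symmetric case: second interval is (-1/a, -1/b)
xi_0, xi_t = 1.2, 1.0      # hypothetical angles; x = tan(xi/2) lies in (a, b)
theta = mixing_angle(math.tan(xi_t / 2), math.tan(xi_0 / 2), a, b, -1 / a, -1 / b)
print(theta, 0.5 * (xi_t - xi_0))   # the two values should coincide
```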


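Next, we compute the vacuum expectation values $\langle \sigma_t(\psi(x_i))\, \sigma_s(\psi(y_j)) \rangle$ for $x_i \in I_i$,
$y_j \in I_j$, using (3.1) and $\langle \psi(x)\psi(y) \rangle = \frac{-i}{x - y - i\varepsilon}$. Passing to angular variables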

c The authors of [9] also compute this angle, but misrepresent it as the arctan of the difference,
rather than the difference of the arctans.
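$x \mapsto \xi$, $y \mapsto \eta$ by
\[ \frac{\sqrt{dx\, dy}}{x - y - i\varepsilon} = \frac{\sqrt{d\xi\, d\eta}}{2\sin\left(\frac{\xi - \eta}{2} - i\varepsilon\right)}, \tag{3.13} \]
this gives
\[ \langle \sigma_t(\psi(x_i))\, \sigma_s(\psi(y_j)) \rangle = \sum_{kl} \left(e^{\lambda(\xi_0(t) - \xi_0)}\right)_{ik} \left(e^{\lambda(\eta_0(s) - \eta_0)}\right)_{jl}\, \frac{-i\,\sqrt{\frac{d\xi_k(t)\, d\eta_l(s)}{dx_i\, dy_j}}}{2\sin\left(\frac{\xi_k(t) - \eta_l(s)}{2} - i\varepsilon\right)}. \tag{3.14} \]
Notice that again $d\xi_k$, $d\eta_l$ in the square roots do not depend on $k$ and $l$. To perform
the sums over $k$ and $l$, we need a couple of trigonometric identities: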

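Lemma. For $n \in \mathbb{N}$ and $k = 0, 1, \dots, n - 1$, let $\sin_k(\alpha) := \sin(\alpha - k\frac{\pi}{n})$. Then (sums
and products always extending from 0 to $n - 1$):

(i) $\prod_k \sin_k(\alpha) = (-2)^{1-n} \sin(n\alpha)$.
(ii) For $j = 0, \dots, n - 1$ one has $\sum_{k:\, k \neq j} \cot\left((j - k)\frac{\pi}{n}\right) = 0$.
(iii) For $j = 0, \dots, n - 1$ one has
\[ \sum_k \left(e^{2(\alpha - \beta)\lambda}\right)_{jk}\, \frac{1}{\sin_k(\alpha)} = \frac{\sin(n\beta)}{\sin(n\alpha)}\, \frac{1}{\sin_j(\beta)}. \tag{3.15} \]

Proof. (i) is just another way of writing $\prod_k (z - \omega_k) = z^n - 1$ where $\omega_k = e^{2\pi i k/n}$ are
the nth roots of unity, and $z = e^{2i\alpha}$. Dividing (i) by $\sin_j(\alpha)$, taking the logarithm,
and taking the derivative at $\alpha - j\frac{\pi}{n} = 0$, yields (ii). For (iii), we have to show that the
expression
\[ (-2)^{1-n} \sin(n\alpha) \sum_k \left(e^{2\alpha\lambda}\right)_{jk}\, \frac{1}{\sin_k(\alpha)} = \sum_k \left(e^{2\alpha\lambda}\right)_{jk} \prod_{l:\, l \neq k} \sin_l(\alpha) \tag{3.16} \]
is independent of $\alpha$. Taking the derivative with respect to $\alpha$ and inserting (3.9), we
have to show that
\[ \sum_{k:\, k \neq j} \frac{1}{\sin\left((j - k)\frac{\pi}{n}\right)} \prod_{l:\, l \neq k} \sin_l(\alpha) + \sum_{k:\, k \neq j} \cos\left(\alpha - k\frac{\pi}{n}\right) \prod_{l:\, l \neq j,k} \sin_l(\alpha) = 0. \tag{3.17} \]
Writing $\cos(\alpha - k\frac{\pi}{n}) = \left(\sin_k(\alpha)\cos((j - k)\frac{\pi}{n}) - \sin_j(\alpha)\right)/\sin((j - k)\frac{\pi}{n})$, this sufficient
condition reduces to the identity (ii).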
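The three identities of the Lemma can be tested numerically for small $n$. The sketch below assumes the generator $\lambda_{kl} = 1/(2\sin(\pi(k - l)/n))$ of (3.9) with the sign convention used above, and computes the matrix exponential with a truncated Taylor series, which is adequate for the small matrix norms involved but is not a general-purpose method. All function names are hypothetical.

```python
import math

def generator(n):
    """Anti-symmetric matrix of (3.9): entries 1 / (2 sin(pi (k - l) / n))."""
    return [[0.0 if k == l else 1.0 / (2.0 * math.sin(math.pi * (k - l) / n))
             for l in range(n)] for k in range(n)]

def mat_exp(A, terms=60):
    """Matrix exponential by a truncated Taylor series (fine for small norms)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[sum(term[i][p] * A[p][j] for p in range(n)) / m
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def sin_k(alpha, k, n):
    """sin_k(alpha) := sin(alpha - k*pi/n) as in the Lemma."""
    return math.sin(alpha - k * math.pi / n)

def check_lemma(n, alpha, beta):
    """Return the largest deviation found in (i), (ii) and (iii)."""
    err = abs(math.prod(sin_k(alpha, k, n) for k in range(n))
              - (-2.0) ** (1 - n) * math.sin(n * alpha))
    lam = generator(n)
    E = mat_exp([[2.0 * (alpha - beta) * x for x in row] for row in lam])
    for j in range(n):
        cot_sum = sum(1.0 / math.tan((j - k) * math.pi / n)
                      for k in range(n) if k != j)
        lhs = sum(E[j][k] / sin_k(alpha, k, n) for k in range(n))
        rhs = math.sin(n * beta) / (math.sin(n * alpha) * sin_k(beta, j, n))
        err = max(err, abs(cot_sum), abs(lhs - rhs))
    return err

for n in (2, 3, 4, 5):
    print(n, check_lemma(n, 0.3, 0.5))   # deviations at rounding level
```

The values $\alpha = 0.3$, $\beta = 0.5$ are arbitrary; they only need to avoid the zeros of the sines in the denominators.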

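Using (3.15) with $2\alpha = \xi_0(t) - \eta_l(s)$ and $2\beta = \xi_0 - \eta_l(s)$ in the expression (3.14),
and once again with $2\alpha = \eta_0(s) - \xi_0$ and $2\beta = \eta_0 - \xi_0$, we get
\[ \langle \sigma_t(\psi(x_i))\, \sigma_s(\psi(y_j)) \rangle = \frac{\sin\left(n\,\frac{\xi_0 - \eta_0}{2} - i\varepsilon\right)}{\sin\left(n\,\frac{\xi_0(t) - \eta_0(s)}{2} - i\varepsilon\right)} \cdot \frac{-i\,\sqrt{\frac{d\xi_0(t)\, d\eta_0(s)}{dx_i\, dy_j}}}{2\sin\left(\frac{\xi_i - \eta_j}{2} - i\varepsilon\right)}. \tag{3.18} \]
We exhibit the t- and s-dependent terms:
\[ \frac{\sqrt{d\xi_0(t)\, d\eta_0(s)}}{2\sin\left(\frac{n\xi_0(t) - n\eta_0(s)}{2} - i\varepsilon\right)} = \frac{\sqrt{d\Xi_0(t)\, dH_0(s)}}{2n\,\sin\left(\frac{\Xi_0(t) - H_0(s)}{2} - i\varepsilon\right)} = \frac{1}{n}\, \frac{\sqrt{dX(t)\, dY(s)}}{X(t) - Y(s) - i\varepsilon}. \tag{3.19} \]
The first equality is the invariance of the 2-point function under a Möbius
transformation $\mu$ mapping $I$ to $S^1_+$, such that for $z = e^{i\xi} \in E$ and $w = e^{i\eta} \in E$ we get
$\mu(z^n) = e^{i\Xi} = \frac{1 + iX}{1 - iX} \in S^1_+$ and $\mu(w^n) = e^{iH} = \frac{1 + iY}{1 - iY} \in S^1_+$ with $X, Y \in \mathbb{R}_+$; the
second equality is again (3.13) for the inverse transformation $\Xi \mapsto X$, $H \mapsto Y$. By
Proposition 5, the flow on $\mathbb{R}_+$ is just $X(t) = e^{-2\pi t} X$, giving
\[ \langle \sigma_t(\psi(x_i))\, \sigma_s(\psi(y_j)) \rangle = \frac{e^{-\pi(t + s)}}{e^{-2\pi t} X - e^{-2\pi s} Y - i\varepsilon} \cdot f(x_i, y_j). \tag{3.20} \]
This expression manifestly satisfies the KMS condition in the form
\[ \langle \psi(x)\, \sigma_{-i/2}(\psi(y)) \rangle = \langle \psi(y)\, \sigma_{-i/2}(\psi(x)) \rangle. \tag{3.21} \]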
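The analytic structure behind (3.20) and (3.21) can be illustrated with complex arithmetic. The sketch below looks only at the $t$- and $s$-dependent factor of (3.20), drops the regulator $\varepsilon$ (so it needs $X \neq Y$), and assumes that the remaining prefactor $f$ is symmetric under the exchange of the two points; the imaginary shift $-i/2$ on both sides is the reading of (3.21) adopted here. The function name and sample values are hypothetical.

```python
import cmath
import math

def two_point(t, s, X, Y):
    """t,s-dependent factor of (3.20) with the regulator dropped (needs X != Y)."""
    return cmath.exp(-math.pi * (t + s)) / (
        cmath.exp(-2 * math.pi * t) * X - cmath.exp(-2 * math.pi * s) * Y)

X, Y = 0.8, 2.5                              # hypothetical points in R_+
lhs = two_point(0, -0.5j, X, Y)              # <psi(x) sigma_{-i/2}(psi(y))>
rhs = two_point(0, -0.5j, Y, X)              # same with x and y exchanged
shifted = two_point(0.5, 0.3, X, Y)          # depends on t - s only
print(lhs, rhs, 1j / (X + Y))
print(shifted, two_point(0.3, 0.1, X, Y))
```

At the imaginary shift both sides reduce to $i/(X + Y)$, which is symmetric in $X$ and $Y$; this is the sense in which the KMS condition is manifest.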

We conclude that the KMS condition holds for the Casini–Huerta flow for symmetric
n-intervals:

Corollary. For symmetric n-intervals $E = \sqrt[n]{I}$, (3.1) is the modular automorphism
group of the algebra $A(E)$ with respect to the vacuum state.

Proof. Smearing with test functions of appropriate support, the KMS property
holds for bounded generators of the CAR algebra $A(E)$. Because $\psi$ is a free field,
the KMS property of the 2-point function in the vacuum extends to the KMS
property of the corresponding quasifree (i.e. Fock) state of the CAR algebra.

Remark. It is quite remarkable that by virtue of the mixing, through the identity
(ii) of the lemma, the ratio of the modular vacuum correlation functions
\[ \frac{\langle \sigma_t^{(n)}(\psi(x_i))\, \sigma_s^{(n)}(\psi(y_j)) \rangle}{\langle \sigma_t^{(1)}(\psi(X))\, \sigma_s^{(1)}(\psi(Y)) \rangle} \tag{3.22} \]
is independent of the modular parameters $t$, $s$. Here, in the numerator $\sigma^{(n)}$ is the
modular group for a symmetric n-interval $\sqrt[n]{\mathbb{R}_+}$, and in the denominator $\sigma^{(1)}$ is the
modular group for the 1-interval $\mathbb{R}_+$.

3.2. Product states for general n-intervals


With hindsight from [9], we can generalize to non-symmetric n-intervals the
model-independent construction of a product state, as in Sec. 2.1, by replacing the function
$z \mapsto z^n$ as follows. If $C$ stands for the Cayley transformation $x \mapsto z = \frac{1 + ix}{1 - ix}$,
and $\bigcup_k (a_k, b_k) \subset \mathbb{R}$ the pre-image of a symmetric n-interval $E = \sqrt[n]{I}$, then $U =
C(a_k)^n \in S^1$ and $V = C(b_k)^n \in S^1$ do not depend on $k$. One computes the
uniformization function (3.2) in this case to be given by
\[ e^{\zeta} = C^{-1} \circ \mu \circ (z \mapsto z^n) \circ C(x) \tag{3.23} \]
where $\mu : S^1 \to S^1$ is the Möbius transformation $Z \mapsto C\left(\mathrm{const.} \cdot \frac{Z - U}{V - Z}\right)$, with a constant fixed by (3.6), that
takes $I$ to $S^1_+$. For a general n-interval $E = \bigcup_k I_k \subset S^1$, one may choose $\mu$ an
arbitrary Möbius transformation, and replace $z \mapsto z^n$ by the function
\[ g(z) := \mu^{-1} \circ C \circ e^{\zeta} \circ C^{-1}, \tag{3.24} \]
where $\zeta$ is the uniformization function (3.2). Thus, $g$ maps each component $I_k$ onto
the same interval $I = \mu^{-1}(S^1_+)$, i.e. we have $E = g^{-1}(I)$. Repeating the construction
of Proposition 2 with factor states $\varphi_k = \omega \circ \mathrm{Ad}\,U(\gamma_k)$, where the diffeomorphisms
$\gamma_k$ coincide with $g$ on $I_k$, one obtains a product state with the geometric modular
flow
\[ f_t(z) = g^{-1}\left(\Lambda_I(-2\pi t)\, g(z)\right), \tag{3.25} \]
instead of (2.5). By construction, this flow corresponds to $\zeta(t) = \zeta(0) - 2\pi t$ as
before, which in turn coincides with the geometric part of the vacuum modular
flow (3.1).

3.3. Lessons from the free Fermi model


Charge splitting
It is tempting to ask whether, and in which precise sense, the free Fermi field
result extends also to the free Bose case. (The authors of [9] are positive about
this, but did not present a proof.) In the chiral situation, the free Bose net $A(I)$
(the current algebra with central charge $c = 1$) is given by the neutral subalgebras
of the complex free Fermi net $F(I)$. Because the vacuum state is invariant under
the charge transformation, there is a vacuum-preserving conditional expectation
$\varepsilon : F(I) \to A(I)$, implying that the vacuum modular group of $F(E)$ restricts to the
vacuum modular group of $C(E) := \varepsilon(F(E))$. We have
\[ \begin{array}{ccccc} & & F(E) & & \\ & & \cup & & \\ A(E) & \subset & C(E) & \subset & \hat{A}(E), \end{array} \tag{3.26} \]
where both inclusions are strict: $C(E)$ contains neutral products of integer charged
elements of $F(I_k)$ in different component intervals, which do not belong to $A(E)$,
while $\hat{A}(E)$ contains "charge transporters" [6, 22] for the continuum of superselection
sectors of the current algebra with central charge $c = 1$, which do not belong
to $C(E)$.
Being the restriction of the vacuum modular group of $F(E)$, the action of the
vacuum modular group of $C(E)$ can be directly read off. It acts geometrically,
i.e. takes $C(F)$ to $C(f_t(F))$,^d but it does not take $A(F)$ to $A(f_t(F))$, because the
mixing takes a neutral product of two Fermi fields in one component $J_k$ of $F$ to
a linear combination of neutral products of Fermi fields in different components
$f_t(J_j)$, belonging to $C(f_t(F))$ but not to $A(f_t(F))$. Let us call this feature "charge
splitting" (stronger than mixing).
The inclusion situation (3.26) does not permit one to determine the vacuum modular
flow of $A(E)$ from that of $C(E)$, because there is no vacuum-preserving conditional
expectation $C(E) \to A(E)$ that would imply that the modular group restricts.
(Of course, this would be a contradiction, because we have already seen that the
modular group of $F(E)$, and hence that of $C(E)$, does not preserve $A(E)$.) Similarly,
we cannot conclude that the vacuum modular flow of $\hat{A}(E)$ should extend that of
$C(E)$, or that of $A(E)$. Proposition 6 below actually shows that this scenario must
be excluded.

Application to BCFT
It is instructive to discuss the consequence of the free Fermi field mixing and the
ensuing charge splitting for $C(E)$ under the geometric re-interpretation of boundary
CFT, as in Sec. 2.2. For definiteness and simplicity, we consider the case when $A$
is the even subnet of the real free Fermi net, i.e. $A$ is the Virasoro net with $c = \frac{1}{2}$.
Unlike the $c = 1$ free Bose net, this model is completely rational.
The same considerations as in the previous argument apply also in this case:
Again, the inclusions $A(E) \subset C(E) := \varepsilon(F(E)) \equiv F(E)^{\mathbb{Z}_2} \subset \hat{A}(E)$ are strict, the
latter because charge transporters for the Ramond sector (weight $h = \frac{1}{16}$) do not
belong to $C(E)$. The vacuum modular flow for $C(E)$ is induced by that for $F(E)$,
but it does not pass to $A(E)$ or $\hat{A}(E)$.

d Here and below, $F \subset E$ always stands for an n-interval $F = \bigcup_k J_k$ where $J_k$ are the components
of the pre-image of some interval under the function (3.2), i.e. in the symmetric case, $F = \sqrt[n]{J}$
with $J \subset I$.

Let therefore $E \subset S^1$ be 2-intervals and $O = I_+^R \times I_-^R \subset M_+$ the associated
double cones. The net
\[ O \mapsto C(O) = F(E)^{\mathbb{Z}_2} \tag{3.27} \]
is a BCFT net intermediate between the minimal net $A_+(O) = A(E)$ and the
maximal (Haag dual) net $B_+(O) = \hat{A}(E)$, see [25]. It is generated by fields
$\prod_{i=1}^{n} \psi(u_i) \prod_{j=1}^{m} \psi(v_j)$ with $n + m$ = even, and $u_i$ smeared in $I_+^R$, $v_j$ smeared in
$I_-^R$.
The vacuum modular flow of $C(O)$ mixes $f_t u_i$ with $f_t u_i'$ and $f_t v_j$ with $f_t v_j'$,
where $u \mapsto u'$ and $v \mapsto v'$ are the bijections of the two intervals onto each other
connecting the two pre-images of the uniformization function $\zeta$. Hence, if $\psi(u)^n \psi(v)^m$
(in schematical notation) belongs to $C(Q)$ for a double cone $Q \subset O$, the vacuum
modular flow takes it to linear combinations of
\[ \psi(f_t u)^{n_1}\, \psi(f_t u')^{n_2}\, \psi(f_t v)^{m_1}\, \psi(f_t v')^{m_2} \tag{3.28} \]
with $n_1 + n_2 = n$, $m_1 + m_2 = m$. Grouping the charged factors to "neutral" (even)
"bi-localized" products, these generators belong to the local algebra of 6 double
cones $\bigvee_{\alpha=1}^{6} C(f_t Q_\alpha) \subset C(f_t \hat{Q})$ around 6 points as indicated in Fig. 3.
In spite of the fact that two of the 6 double cones $Q_\alpha$ lie outside $\hat{Q}$, the
corresponding algebras $C(Q_\alpha)$ are contained in $C(\hat{Q})$. But their bi-localized generators,

Fig. 3. The 6 regions mixed by the vacuum modular flow in boundary CFT. $(u, v)$ is a point in
$Q \subset O$. The boost is the distinguished orbit in $O$ as in Sec. 2.5, and defines $u' = -u^{-1}$ and $v' = -v^{-1}$.
If $(u, v)$ lies on the boost, then the points $(v, u')$ and $(v', u)$ lie on the boundary. Consequently, if
a double cone $Q \subset O$ around $(u, v)$ intersects the distinguished orbit, then four of the 6 associated
double cones $Q_\alpha$ merge with each other, while the other two touch the boundary and degenerate
to left wedges. (The flow $f_t$ itself, as in Fig. 2, is suppressed.)
such as $\psi(u)\psi(v')$, cannot be associated with points in $\hat{Q}$, because on the boundary
they are localized in the entire interval $J_+$ spanned by $u$ and $v'$ [26, Sec. 2], hence
belong to $\bigcap_{J_-} C(J_+ \times J_-) \subset C(\hat{Q})$. Therefore, in the geometric re-interpretation
of boundary CFT, the discrete mixing (charge splitting) on top of the geometric
modular action induces a truly "fuzzy" action on BCFT algebras associated with
double cones $Q \subset O$! The fuzziness seems, however, not to be described by a pseudo
differential operator, as suggested in [30, 29], but rather reflects the nonlocality of
an operator product expansion for bi-localized fields.

3.4. Preliminaries for a general theory


Also in the general case of a local chiral net $A$, there is a notion of charge
splitting: Superselection sectors are described by DHR endomorphisms of the local net,
which are localized in some interval [12, 13]. Intertwiners that change the interval
of localization (charge transporters) are observables, i.e. they do not carry a charge
themselves, but they may be regarded as operators that annihilate a charge in one
interval and create the same charge in another interval. These charge transporters
do not belong to $A(E)$ (where the 2-interval $E$ is the union of the two intervals),
but together with $A(E)$ generate $\hat{A}(E)$, see the discussion in [22, Sec. 5]. Therefore,
one may speculate whether the combination of geometric action with charge
splitting could be a general feature for the vacuum modular group of suitable n-interval
algebras intermediate between $A(E)$ and $\hat{A}(E)$, i.e. the modular group does not
preserve the subalgebras $A(F)$, let alone the algebras of the component intervals
$A(J_k)$.
The discussion of the algebras $A(E) \subset C(E) \subset \hat{A}(E)$ in the preceding subsection
shows that there cannot be a simple general answer. Nevertheless, we can derive a
few first general results.

Proposition 6. Let $\Omega \in H$ be a joint cyclic and separating vector for $A(E)$ and
$A(E')$, e.g., the vacuum.

(i) If the modular automorphism group of $(\hat{A}(E), \Omega)$ globally preserves the
subalgebra $A(E)$, then $A(E) = \hat{A}(E)$.
(ii) If the adjoint action of the modular unitaries $\Delta^{it}$ for $(A(E), \Omega)$ globally
preserves $\hat{A}(E)$, or, equivalently, $A(E')$, then $A(E) = \hat{A}(E)$.

Proof. By assumption, $\Omega$ is also cyclic and separating for $\hat{A}(E) = A(E')'$ and
$\hat{A}(E') = A(E)'$. Then (i) follows directly by Takesaki's Theorem [32, Chap. IX,
Theorem 4.2]. For (ii), note that $\Delta^{it}$ preserves $A(E')$ if and only if it preserves
$A(E')' = \hat{A}(E)$; and $\Delta^{-it}$ implements the modular automorphism group for
$(A(E)' = \hat{A}(E'), \Omega)$. Thus, the statement is equivalent to (i), with $E$ replaced
by $E'$.

The obvious relevance of Proposition 6(ii) is that in the generic case when
$\hat{A}(E)$ is strictly larger than $A(E)$, there can be no vector state satisfying the
Reeh–Schlieder property such that $A(E)$ has geometric modular action on $A(E)$
and on $A(E')$. In particular, the modular unitaries will not belong to the
diffeomorphism group, but we may expect that Connes spatial derivatives as in Proposition
2 do.
Recall that we have already seen (in the Remark after (3.4)) that mixing
necessarily occurs. By Proposition 6(i), it is not possible that $\hat{A}(E)$ has geometric
modular action without charge splitting.

4. Loose Ends
We have put into relation and contrasted the two facts that

(i) in diffeomorphism covariant conformal quantum field theory there is a
construction of states on the von Neumann algebras of local observables associated with
disconnected unions of n intervals (n-intervals), such that the modular group
acts by diffeomorphisms of the intervals [21], and
(ii) in the theory of free chiral Fermi fields, the modular action of the vacuum state
on n-interval algebras is given by a combination of a geometric flow with a
"mixing" among the intervals [9].

The absence of the mixing in (i) can be ascribed to the choice of product states
in which quantum correlations across different intervals are suppressed. (In the
re-interpretation of 2-interval algebras as double cone algebras in boundary conformal
field theory [25], the influence of the boundary was shown to weaken, as expected
on physical grounds, in the limit when the double cone is far away from the
boundary [26]. Indeed, it can be seen from the formula (3.12) for the mixing angle
that in this limit the mixing in (ii) also disappears.) On the other hand, there is
some freedom in the choice of product states, which allows one to deform the geometric
modular flow within each of the intervals. It comes therefore as a certain surprise
that the geometric part of the vacuum modular flow in (ii) coincides with the purely
geometric flow in the product states in (i), precisely when the latter are chosen in a
"canonical" way (involving the simple function $z \mapsto z^n$ on the circle, corresponding
to $\alpha = 1$ in (2.11), in the case of symmetric n-intervals, and the function $g$ (3.24)
in the general case). This means that the relative Connes cocycle between the
vacuum state and the "canonical" product state is just the mixing, while for all
other product states, it will also involve a geometric component.
Two circles of questions arise:
First, is the geometric part of the vacuum flow specific for the free Fermi model,
or is it universal? And if it is universal, what takes the place of the mixing in the
general case? Putting aside some technical complications of the proof, the authors
of [9] claim a universal behavior for free fields, while in this paper, we have given
first indications how the geometric behavior should "propagate" to subtheories and
to field extensions, also strongly supporting the idea of a universal behavior. Insight
from the theory of superselection sectors suggests that the mixing in the general case
should be replaced by a "charge splitting". On the other hand, Takesaki's Theorem
poses obstructions against the idea that charge splitting on top of a geometric
modular flow could be the general answer (Proposition 6).
Second, the notion of "canonical" ($\alpha = 1$) in the above should be given a physical
meaning, related to the absence of a geometric component in the Connes cocycle.
In the free Fermi case, the geometric part of the modular Hamiltonian contains
the stress-energy tensor $\psi(x)\partial_x\psi(x)$, while the mixing part can be expressed in
terms of $\psi(x_k)\psi(x_l)$ with $x_k$ and $x_l$ belonging to different intervals. The absence of
derivatives suggests that the Connes cocycle is more regular in the UV in the case
when the geometric parts coincide, than in the general case. The same should be
true for the generalized product state constructed in Sec. 3.2. A precise formulation
of this UV regularity is wanted.

Acknowledgments
We thank Jakob Yngvason for bringing to our attention the article of Casini and
Huerta [9], and Horacio Casini for discussions about their work. We also thank the
Erwin Schrödinger Institute (Vienna) for the hospitality at the "Operator Algebras
and Conformal Field Theory" program, August–December 2008, where this work
was initiated.
This work was supported in part by ERC Advanced Grant 227458 OACFT
"Operator Algebras and Conformal Field Theory", and by the EU network
"Noncommutative Geometry" MRTN-CT-2006-0031962. R.L. is partially supported by
PRIN-MIUR and GNAMPA-INDAM. P.M. and K.H.R. are supported in part by the
German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through
the Institutional Strategy of the University of Göttingen.

References
[1] J. Bisognano and E. H. Wichmann, On the duality condition for quantum fields, J.
Math. Phys. 17 (1976) 303–321.
[2] H.-J. Borchers, On revolutionizing QFT with modular theory, J. Math. Phys. 41
(2000) 3604–3673.
[3] H.-J. Borchers and J. Yngvason, Modular groups of quantum fields in thermal states,
J. Math. Phys. 40 (1999) 601–624.
[4] R. Brunetti, D. Guido and R. Longo, The conformal spin and statistics theorem,
Comm. Math. Phys. 156 (1993) 201–219.
[5] D. Buchholz, On the structure of local quantum fields with non-trivial interaction,
in Proc. Intern. Conf. Operator Algebras, Ideals, and Their Applications in Physics,
ed. H. Baumgärtel (Teubner, 1977), pp. 146–153.
[6] D. Buchholz, G. Mack and I. T. Todorov, The current algebra on the circle as a germ
of local field theories, Nucl. Phys. B 5B (Proc. Suppl.) (1988) 20–56.
[7] D. Buchholz, O. Dreyer, M. Florig and S. J. Summers, Geometric modular action
and spacetime symmetry groups, Rev. Math. Phys. 12 (2000) 475–560.
[8] D. Buchholz, I. Ojima and H. Roos, Thermodynamic properties of non-equilibrium
states in quantum field theory, Ann. Phys. 297 (2002) 219–242.
[9] H. Casini and M. Huerta, Reduced density matrix and internal dynamics for multi-component
regions, Class. Quant. Grav. 26 (2009) 185005, 15 pp.
[10] A. Connes, On the spatial theory of von Neumann algebras, J. Funct. Anal. 35 (1980)
153–164.
[11] A. Connes and C. Rovelli, Von Neumann algebra automorphisms and time thermodynamics
relation in general covariant quantum theories, Class. Quant. Grav. 11
(1994) 2899–2918.
[12] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics, I,
Comm. Math. Phys. 23 (1971) 199–230.
[13] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics, II,
Comm. Math. Phys. 35 (1974) 49–85.
[14] F. Figliolini and D. Guido, The Tomita operator for the free scalar field, Ann. Inst.
Henri Poincaré Phys. Théor. 51 (1989) 419–435.
[15] E. E. Flanagan, Quantum inequalities in two-dimensional Minkowski spacetime,
Phys. Rev. D 56 (1997) 4922–4926.
[16] D. Guido, R. Longo and H.-W. Wiesbrock, Extensions of conformal nets and superselection
structures, Comm. Math. Phys. 192 (1998) 217–244.
[17] R. Haag, N. Hugenholtz and M. Winnink, On the equilibrium states in quantum
statistical mechanics, Comm. Math. Phys. 5 (1967) 215–236.
[18] P. Hislop and R. Longo, Modular structure of the local algebras associated with the
free massless scalar field theory, Comm. Math. Phys. 84 (1982) 71–85.
[19] M. Izumi and H. Kosaki, On a subfactor analogue of the second cohomology, Rev.
Math. Phys. 14 (2002) 733–757.
[20] M. Izumi, R. Longo and S. Popa, A Galois correspondence for compact groups of
automorphisms of von Neumann algebras with a generalization to Kac algebras, J.
Funct. Anal. 155 (1998) 25–63.
[21] Y. Kawahigashi and R. Longo, Noncommutative spectral invariants and black hole
entropy, Comm. Math. Phys. 257 (2005) 193–225.
[22] Y. Kawahigashi, R. Longo and M. Müger, Multi-interval subfactors and modularity
of representations in conformal field theory, Comm. Math. Phys. 219 (2001) 631–669.
[23] R. Kähler and H.-W. Wiesbrock, Modular theory and the reconstruction of four-dimensional
quantum field theories, J. Math. Phys. 42 (2001) 74–86.
[24] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597.
[25] R. Longo and K.-H. Rehren, Local fields in boundary conformal QFT, Rev. Math.
Phys. 16 (2004) 909–960.
[26] R. Longo and K.-H. Rehren, How to remove the boundary: An operator algebraic
procedure, Comm. Math. Phys. 285 (2009) 1165–1182.
[27] R. Longo and F. Xu, Topological sectors and a dichotomy in conformal field theory,
Comm. Math. Phys. 251 (2004) 321–364.
[28] P. Martinetti and C. Rovelli, Diamond's temperature: Unruh effect for bounded trajectories
and thermal time hypothesis, Class. Quant. Grav. 20 (2003) 4919–4932.
[29] T. Saffary, On the generator of massive modular groups, Lett. Math. Phys. 77 (2006)
235–248.
[30] B. Schroer and H.-W. Wiesbrock, Modular theory and geometry, Rev. Math. Phys.
12 (2000) 139–158.
[31] G. Sewell, Relativity of temperature and the Hawking effect, Phys. Lett. A 79 (1980)
23–24.
[32] M. Takesaki, Theory of Operator Algebras, II, Springer Encyclopedia of Mathematical
Sciences, Vol. 125 (Springer-Verlag, 2003).
[33] S. Trebels, Über die geometrische Wirkung modularer Automorphismen, PhD thesis,
Göttingen (1997); (in German, see also [2, Chap. III.4]).
[34] W. G. Unruh, Notes on black-hole evaporation, Phys. Rev. D 14 (1976) 870–892.
May 11, 2010 10:6 WSPC/S0129-055X 148-RMP
J070-S0129055X10003941

Reviews in Mathematical Physics
Vol. 22, No. 4 (2010) 355–380
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003941

SPECTRAL SHIFT FUNCTION FOR OPERATORS WITH
CROSSED MAGNETIC AND ELECTRIC FIELDS

MOUEZ DIMASSI and VESSELIN PETKOV

Département de Mathématiques, Université Paris 13,
99, Avenue J.-B. Clément, 93430 Villetaneuse, France
dimassi@math.univ-paris13.fr

Université Bordeaux I, Institut de Mathématiques de Bordeaux,
351, Cours de la Libération, 33405 Talence, France
petkov@math.u-bordeaux1.fr

Received 19 August 2009
Revised 8 January 2010

We obtain a representation formula for the derivative of the spectral shift function
$\xi(\lambda; B, \epsilon)$ related to the operators $H_0(B, \epsilon) = (D_x - By)^2 + D_y^2 + \epsilon x$ and
$H(B, \epsilon) = H_0(B, \epsilon) + V(x, y)$, $B > 0$, $\epsilon > 0$. We establish a limiting absorption
principle for $H(B, \epsilon)$ and an estimate $O(\epsilon^{n-2})$ for $\xi'(\lambda; B, \epsilon)$, provided
$\lambda \notin \sigma(Q)$, where $Q = (D_x - By)^2 + D_y^2 + V(x, y)$.

Keywords: Magnetic potential; Stark operator; spectral shift function.

Mathematics Subject Classification 2010: 35P25, 35Q40

1. Introduction
Consider the two-dimensional Schrödinger operator with homogeneous magnetic
and electric fields
\[ H = H(B, \epsilon) = H_0(B, \epsilon) + V(x, y), \qquad D_x = -i\partial_x, \quad D_y = -i\partial_y, \]
where
\[ H_0 = H_0(B, \epsilon) = (D_x - By)^2 + D_y^2 + \epsilon x. \]
Here $B > 0$ and $\epsilon > 0$ are proportional to the strength of the homogeneous magnetic
and electric fields. We assume that $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and that $V(x, y)$
satisfies the estimate
\[ |V(x, y)| \le C(1 + |x|)^{-2-\delta}(1 + |y|)^{-1-\delta}, \qquad \delta > 0. \tag{1.1} \]
For $\epsilon \neq 0$ we have $\sigma_{\mathrm{ess}}(H_0(B, \epsilon)) = \sigma_{\mathrm{ess}}(H(B, \epsilon)) = \mathbb{R}$. On the other hand, for
decreasing potentials $V$ we may have embedded eigenvalues $\lambda \in \mathbb{R}$, and this situation
is completely different from that with $\epsilon = 0$, when the spectrum of $H(B, 0)$ is formed
by eigenvalues with finite multiplicities which may accumulate only to Landau levels
$\lambda_n = (2n + 1)B$, $n \in \mathbb{N}$ (see [9, 13, 15] and the references cited there). The spectral
properties of $H$ and the existence of resonances have been studied in [5, 7, 8] under
the assumption that $V(x, y)$ admits a holomorphic extension in the $x$-variable into
a domain
\[ \Gamma_{\delta_0} = \{z \in \mathbb{C} : 0 \le |\operatorname{Im} z| \le \delta_0\}. \]
Moreover, without any assumption on the analyticity of $V(x, y)$ we show in Proposition 2
below that the operator $(H - z)^{-1} - (H_0 - z)^{-1}$ for $z \in \mathbb{C}$, $\operatorname{Im} z \neq 0$, is trace
class and, following the general setup [11, 20], we define the spectral shift function
$\xi(\lambda) = \xi(\lambda; B, \epsilon)$ related to $H_0(B, \epsilon)$ and $H(B, \epsilon)$ by
\[ \langle \xi', f \rangle = \operatorname{tr}(f(H) - f(H_0)), \qquad f \in C_0^\infty(\mathbb{R}). \]
By this formula $\xi(\lambda)$ is defined modulo a constant, but for the analysis of the derivative
$\xi'(\lambda)$ this is not important. Moreover, the above property of the resolvents and
the Birman–Kuroda theorem imply $\sigma_{\mathrm{ac}}(H_0(B, \epsilon)) = \sigma_{\mathrm{ac}}(H(B, \epsilon)) = \mathbb{R}$. A representation
of the derivative $\xi'(\lambda; B, \epsilon)$ has been obtained in [5] for strong magnetic
fields $B \to +\infty$ under the assumption that $V(x, y)$ admits an analytic continuation
in the $x$-direction. Moreover, the distribution of the resonances $z_j$ of the perturbed
operator $H(B, \epsilon)$ has been examined in [5] and a Breit–Wigner representation of
$\xi'(\lambda; B, \epsilon)$ involving the resonances $z_j$ was established.
In the literature there are many works concerning Schrödinger operators with
magnetic fields ($\epsilon = 0$), but only a few dealing with magnetic and
Stark potentials ($\epsilon \neq 0$) (see [5, 7, 8] and the references given there). It should be
mentioned that the tools in [5, 7, 8] are related to the resonances of the perturbed
problem, and to define the resonances one supposes that the potential $V(x, y)$ has
an analytic continuation in the $x$ variable. In this paper we consider the operator $H$
without any assumption on the analytic continuation of $V(x, y)$ and without the
restriction $B \to +\infty$. Our purpose is to study $\xi'(\lambda; B, \epsilon)$ and the existence of
embedded eigenvalues of $H$. To examine the behavior of the spectral shift function
we need a representation of the derivative $\xi'(\lambda; B, \epsilon)$. The key point in this direction
is the following
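Theorem 1. Let $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and let (1.1) hold for $V$ and
$\partial_x V$. Then for every $f \in C_0^\infty(\mathbb{R})$ and $\epsilon \neq 0$ we have
\[ \operatorname{tr}(f(H) - f(H_0)) = -\frac{1}{\epsilon} \operatorname{tr}(\partial_x V f(H)). \tag{1.2} \]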
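
The identity (1.2) can be motivated by a formal commutator computation. The lines below are only a heuristic sketch, not the argument of the paper: they assume that the trace is cyclic for products involving the unbounded operator $\partial_x$ and that the difference of the two commutator terms has a well-defined trace. Neither assumption is justified without a spatial cut-off of the kind used in Sec. 2.

```latex
% Heuristic sketch only: cyclicity of the trace with the unbounded
% operator \partial_x is ASSUMED here, not proved.
\begin{align*}
[\partial_x, H] &= \epsilon + \partial_x V, \qquad [\partial_x, H_0] = \epsilon, \\
0 &\overset{\text{formally}}{=}
   \operatorname{tr}\bigl([\partial_x, H] f(H) - [\partial_x, H_0] f(H_0)\bigr) \\
  &= \epsilon \operatorname{tr}\bigl(f(H) - f(H_0)\bigr)
   + \operatorname{tr}\bigl(\partial_x V f(H)\bigr).
\end{align*}
% Dividing by \epsilon \neq 0 gives
% tr(f(H) - f(H_0)) = -(1/\epsilon) tr(\partial_x V f(H)), i.e. (1.2).
```

The first line uses only that $\partial_x$ commutes with $(D_x - By)^2$ and $D_y^2$, so the commutator sees just the terms $\epsilon x$ and $V$.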

The formula (1.2) has been proved by Robert and Wang [18] for Stark Hamiltonians
in the absence of a magnetic field ($B = 0$). In fact, the result in [18] says that
\[ \xi'(\lambda; 0, \epsilon) = -\frac{1}{\epsilon} \int_{\mathbb{R}^2} \partial_x V(x, y)\, \frac{\partial e}{\partial \lambda}(x, y, x, y; \lambda, 0, \epsilon)\, dx\, dy, \tag{1.3} \]
where $e(\cdot, \cdot; \lambda, 0, \epsilon)$ is the spectral function of $H(0, \epsilon)$. The presence of a magnetic field
$B \neq 0$ and a Stark potential leads to serious difficulties. The operator $H$ is not
elliptic for $|x| + |y| \to \infty$ and we have double characteristics. On the other hand, the
commutator $[H, x]$ involves the term $(D_x - By)$ and it creates additional difficulties.
The proof of Theorem 1 is long and technical. We are going to study the trace class
properties of the operators $\psi(H \pm i)^{-N}$, $\partial_x \psi (H \pm i)^{-N-1}$, $(H \pm i)\partial_x \psi (H \pm i)^{-N-2}$,
etc. for $N \ge 2$ and $\psi \in C_0^\infty(\mathbb{R}^2)$ (see Lemmas 1 and 2). Moreover, by an argument
similar to that in [5, Proposition 2.1], we obtain estimates for the trace norms of
the operators
\[ (z - H)^{-1} V (z' - H)^{-1}, \qquad V (z - H)^{-1} (z' - H)^{-1}, \qquad z \notin \mathbb{R},\ z' \notin \mathbb{R}, \]
and we apply an approximation argument. Notice that in [18] the spectral shift
function is related to the trace of the time delay operator $T(\lambda)$ defined via the
corresponding scattering matrix $S(\lambda)$ (see [17]). In contrast to [18], our proof is
direct and neither $T(\lambda)$ nor $S(\lambda)$ corresponding to the operator $H(B, \epsilon)$ is used.
The second question examined in this work is the existence of embedded real
eigenvalues and the limiting absorption principle for $H$. In the physical literature
one conjectures that for $\epsilon \neq 0$ there are no embedded eigenvalues. We establish
in Sec. 3 a weaker result saying that in any interval $[a, b]$ we may have at most a
finite number of embedded eigenvalues with finite multiplicities. Under the assumption
of analytic continuation of $V$ it was proved in [7] that for some finite interval
$[\alpha(B, \epsilon), \beta(B, \epsilon)]$ there are no resonances $z$ of $H(B, \epsilon)$ with $\operatorname{Re} z \notin [\alpha(B, \epsilon), \beta(B, \epsilon)]$.
Since the real resonances $z$ coincide with the eigenvalues of $H(B, \epsilon)$, we obtain
some information on the embedded eigenvalues. On the other hand, exploiting the
analytic continuation and the resonances we proved in [5] that for $B \to +\infty$ the
real parts $\operatorname{Re} z_j$ of the resonances $z_j$ lie outside some neighborhoods of the Landau
levels. Thus the Landau levels play a role in the distribution of the resonances. It
is known that the spectrum of the operator $Q = (D_x - By)^2 + D_y^2 + V(x, y)$ with
decreasing potential $V$ is formed by eigenvalues (see [9, 13, 15]). In this paper, we
establish a limiting absorption principle for $\lambda \notin \sigma(Q)$. In particular, we show that
there are no embedded eigenvalues outside $\sigma(Q)$. This agrees with the result in [5]
obtained under the restrictions on the behavior of $V$ and $B \to +\infty$. On the other
hand, the result of Proposition 3 and the estimates (4.3) have been established by
Wang [19] for Stark operators with $B = 0$.
Following the results in Sec. 4 and the representation of $\xi'(\lambda; B, \epsilon)$ given in [5],
it is natural to expect that for $\lambda \notin \sigma(Q)$ the derivative of the spectral shift function
$\xi'(\lambda; B, \epsilon)$ must be bounded. In fact, we prove the following stronger result.

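Theorem 2. Let the potential $V \in C^\infty(\mathbb{R}^2; \mathbb{R})$ satisfy, with some $\delta > 0$ and $n \in \mathbb{N}$,
$n \ge 2$, the estimates
\[ |\partial_x^\alpha \partial_y^\beta V(x, y)| \le C_{\alpha,\beta} (1 + |x|)^{-n-\delta-|\alpha|} (1 + |y|)^{-2-\delta-|\beta|}, \qquad \forall \alpha, \forall \beta. \tag{1.4} \]
Then for $\lambda_0 \notin \sigma(Q)$ we have
\[ \xi'(\lambda; B, \epsilon) = O(\epsilon^{n-2}) \tag{1.5} \]
uniformly for $\lambda$ in a small real neighborhood of $\lambda_0$.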

The estimate (1.5) has been obtained in [18] in the absence of a magnetic
field, $B = 0$ (for a Breit–Wigner formula see [10], [4] for Stark Hamiltonians and [5]
for the operator $H(B, \epsilon)$). Our approach is quite different from that in [18]. Our
proof does not use a representation similar to (1.3), which leads
to complications connected with the behavior of the spectral function $e(\cdot, \cdot; \lambda, B, \epsilon)$
corresponding to $H(B, \epsilon)$. The formula (1.2) plays a crucial role and our analysis
is based on a complex analysis argument combined with a representation of $f(H)$
involving the almost analytic continuation of $f \in C_0^\infty(\mathbb{R})$. In this direction, our
argument is similar to that developed in [4, 5].
The plan of this paper is as follows. In Sec. 2, we establish Theorem 1. The
embedded eigenvalues and Mourre estimates are examined in Sec. 3. In Sec. 4,
we prove Proposition 3 concerning the limiting absorption principle for $H(B, \epsilon)$.
Finally, in Sec. 5, we establish Theorem 2.

2. Representation of the Spectral Shift Function


Throughout this work we will use the notations of [3] for symbols and pseudodifferential
operators. In particular, if $m : \mathbb{R}^4 \to [0, +\infty[$ is an order function (see
[3, Definition 7.4]), we say that $a(z, \zeta) \in S^0(m)$ if for every $\alpha \in \mathbb{N}^4$ there exists
$C_\alpha > 0$ such that
\[ |\partial^\alpha_{z,\zeta} a(z, \zeta)| \le C_\alpha m(z, \zeta). \]
In the special case when $m = 1$, we will write $S^0$ instead of $S^0(1)$. We will use the
standard Weyl quantization of symbols. More precisely, if $p(z, \zeta)$, $(z, \zeta) \in \mathbb{R}^4$, is a
symbol in $S^0(m)$, then $P^w(z, D_z)$ is the operator defined by
\[ P^w(z, D_z)u(z) = (2\pi)^{-2} \iint e^{i(z - z')\cdot\zeta}\, p\Bigl(\frac{z + z'}{2}, \zeta\Bigr) u(z')\, dz'\, d\zeta, \qquad u \in \mathcal{S}(\mathbb{R}^2). \]
We denote by $P^w(z, hD_z)$ the semiclassical quantization obtained as above by quantizing
$p(z, h\zeta)$.
Our goal in this section is to prove Theorem 1. For this purpose we need some
lemmas. We set
\[ Q_0 = H_0 - \epsilon x = (D_x - By)^2 + D_y^2, \qquad Q = Q_0 + V, \]
and in Lemma 1 we will use the notation $H_1 = H$. For simplicity we assume
that $\epsilon = B = 1$. The general case can be covered by the same argument.
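Lemma 1. Assume that $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and let $\psi \in C_0^\infty(\mathbb{R}^2)$.
Then for $N \ge 2$, $j = 0, 1$ and for $\operatorname{Im} z \neq 0$, the following operators are trace class:
(i) $\psi(H_j \pm i)^{-N}$, $\psi\partial_x (H_j \pm i)^{-N-1}$, $\psi(H_j \pm i)\partial_x (H_j \pm i)^{-N-2}$.
(ii) $(H_j \pm i)^{-N}\psi$, $(H_j \pm i)^{-N-1}\partial_x \psi$.
(iii) $\partial_x \psi (H_j \pm i)^{-N-1}$, $(H_j \pm i)\partial_x \psi (H_j \pm i)^{-N-2}$.
(iv) $(H_j \pm i)\partial_x (H_j \pm i)^{-N-2}\psi$.
(v) $(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi$.
Moreover,
\[ \|(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi\|_{\mathrm{tr}} = O\Bigl(\frac{|z| + 1}{|\operatorname{Im} z|^2}\Bigr). \tag{2.1} \]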

Proof. We will prove the lemma only for $(H_1 + i)$; the case concerning $(H_1 - i)$
is similar. On the other hand, the statements for $(H_0 \pm i)$ follow from those for
$(H_1 \pm i)$ when $V = 0$.
From the first resolvent equation, we obtain
\begin{align*}
(H_1 + z)^{-1} &= (Q_0 + z)^{-1} - (Q_0 + z)^{-1}(x + V)(H_1 + z)^{-1} \\
&= (Q_0 + z)^{-1} + \sum_{j=1}^{N+2} (-1)^j (Q_0 + z)^{-1}\bigl((x + V)(Q_0 + z)^{-1}\bigr)^j \\
&\quad + (-1)^{N+3}\bigl((Q_0 + z)^{-1}(x + V)\bigr)^{N+3}(H_1 + z)^{-1}. \tag{2.2}
\end{align*}
Taking $(N - 1)$ derivatives with respect to $z$ in the above identity and setting $z = i$,
we see that $\psi(H_1 + i)^{-N}$ is a linear combination of terms
\[ K_N := \psi (Q_0 + i)^{-j_1} W (Q_0 + i)^{-j_2} W \cdots (Q_0 + i)^{-j_r} W (H_1 + i)^{-p}, \]
with $j_1 + \cdots + j_r \ge N$, $j_1 \ge 1$, $p \ge 0$ and $W(x) = x + V(x)$.


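Recall that if $P \in S^0(m)$ with $m \in L^1(\mathbb{R}^4)$ (respectively, $m \in L^2(\mathbb{R}^4)$) then the
corresponding operator is trace class (respectively, Hilbert–Schmidt). By using this
and the fact that the symbol of $(Q_0 + i)^{-1}$ is in $S^0(\langle \xi - y, \eta \rangle^{-2})$, we deduce that
the operator
\[ K^{j}_{l,p,l',p'} := \langle x \rangle^{-l} \langle y \rangle^{-p} (Q_0 + i)^{-j} \langle x \rangle^{l'} \langle y \rangle^{p'} \]
is trace class for $l - l',\ p - p' > 1$, $j \ge 2$ and Hilbert–Schmidt for $l - l',\ p - p' > 1/2$,
$j \ge 1$. Next, we write $K_N$ as follows:
\begin{align*}
K_N = {} & \psi \langle x \rangle^{3r} \langle y \rangle^{2r} K^{j_1}_{3r,2r,3r-2,2r-2} W \langle x \rangle^{-1} K^{j_2}_{3r-3,2r-2,3r-5,2r-4} W \langle x \rangle^{-1} \\
& \cdots W \langle x \rangle^{-1} K^{j_r}_{3,2,1,0} W \langle x \rangle^{-1} (H_1 + i)^{-p}. \tag{2.3}
\end{align*}
Since $j_1 + j_2 + \cdots + j_r \ge N \ge 2$, in the above decomposition there are at least
two Hilbert–Schmidt operators or one of trace class. Combining this with the fact that
$\psi \langle x \rangle^{3r} \langle y \rangle^{2r}$, $W \langle x \rangle^{-1}$ and $(H_1 + i)^{-p}$ are bounded from $L^2(\mathbb{R}^2)$ into $L^2(\mathbb{R}^2)$, we
conclude that $K_N$ is a trace class operator. Thus $\psi(H_1 + i)^{-N}$ is also a trace class
operator. Repeating the same arguments, we obtain the proof for $\psi\partial_x (H_j \pm i)^{-N-1}$.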
As above, to treat $\psi(H_j \pm i)\partial_x (H_j \pm i)^{-N-2}$, it suffices to show that $(H_j \pm i)\partial_x K_N$
is trace class. If we have $j_1 \ge 2$ the proof is completely similar to that of
$\psi(H_1 + i)^{-N}$. In the case where $j_1 = 1$, since $(H_1 + i)\partial_x \psi (Q_0 + i)^{-1}$ is not bounded,
we have to exploit the following representation, in which $\partial_x$ is commuted through $(Q_0 + i)^{-1}$:
\begin{align*}
(H_1 + i)\partial_x K_N = {} & (H_1 + i)(\partial_x \psi)(Q_0 + i)^{-1} W (Q_0 + i)^{-j_2} W \cdots (Q_0 + i)^{-j_r} W (H_1 + i)^{-p} \\
& + (H_1 + i)\psi (Q_0 + i)^{-1} \partial_x W (Q_0 + i)^{-j_2} W \cdots (Q_0 + i)^{-j_r} W (H_1 + i)^{-p}.
\end{align*}
Next use the fact that $\partial_x W \in L^\infty$ and repeat the argument of the proof above.
Recall that $A$ is trace class if and only if the adjoint operator $A^*$ is trace class.
Consequently, (i) implies (ii). Since $\partial_x \psi = \psi \partial_x + (\partial_x \psi)$, the assertion (iii) follows
from (i).
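To deal with (iv), we apply the following identity with $z = -i$, which follows from $[\partial_x, H] = 1 + \partial_x V$:
\[ \partial_x (H - z)^{-1} = (H - z)^{-1} \partial_x - (H - z)^{-1}(1 + \partial_x V)(H - z)^{-1}, \tag{2.4} \]
and obtain
\[ (H_1 + i)\partial_x (H_1 + i)^{-N}\psi = (H_1 + i)^{-N+1} \partial_x \psi - \sum_{j=0}^{N-1} (H_1 + i)^{-j}(1 + \partial_x V)(H_1 + i)^{-N+j}\psi. \tag{2.5} \]
Applying (i) and (ii) to each term on the right-hand side of (2.5), we get (iv).
Now we pass to the proof of (v). Applying (2.4), we obtain
\begin{align*}
(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi = {} & (H_1 + i)(H_1 - z)^{-1} \partial_x (H_1 + i)^{-N-1}\psi \\
& - (H_1 + i)(H_1 - z)^{-1}(1 + \partial_x V)(H_1 - z)^{-1}(H_1 + i)^{-N-1}\psi.
\end{align*}
Combining the above equation with (i), (ii), (iv) and using the estimate
\[ \|(H_1 + i)(H_1 - z)^{-1}\| = O\Bigl(\frac{|z| + 1}{|\operatorname{Im} z|}\Bigr), \]
we get (2.1).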

Lemma 2. Assume that $V(x, y) = \varphi(x, y)W(x, y)$, where $\varphi \in C_0^\infty(\mathbb{R}^2; \mathbb{R})$ and
$W, \partial_x W \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$. Then for $N \ge 4$ the operator
\[ (H + i)\partial_x \bigl[(H + i)^{-N} - (H_0 + i)^{-N}\bigr] \]
is trace class.

Proof. Taking $(N - 1)$ derivatives with respect to $z$ in the resolvent identity
\[ (H + z)^{-1} - (H_0 + z)^{-1} = -(H + z)^{-1} V (H_0 + z)^{-1} \]
and setting $z = i$, we see that $(H + i)^{-N} - (H_0 + i)^{-N}$ is a linear combination of
terms
\[ (H + i)^{-j} V (H_0 + i)^{-(N+1)+j} \]
with $1 \le j \le N$. Composing the above terms with $(H + i)\partial_x$ and applying Lemma 1,
we complete the proof.

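Lemma 3. Assume that $V$ satisfies the assumptions of Lemma 1. Let $f \in C_0^\infty(\mathbb{R})$
and $\psi \in C_0^\infty(\mathbb{R}^2)$. Then the operators
\[ \psi f(H_i), \qquad H_i \psi \partial_x f(H_i), \qquad \psi \partial_x H_i f(H_i) \]
are trace class and we have
\[ \operatorname{tr}(H_i \psi \partial_x f(H_i)) = \operatorname{tr}(\psi \partial_x H_i f(H_i)). \]

Proof. Set $g(x) = (x + i)^4 f(x)$. Since $g(H_i)$ is bounded, it follows from Lemma 1
that the operators
\[ \psi (H_i + i)^{-4} g(H_i), \qquad H_i \psi \partial_x (H_i + i)^{-4} g(H_i), \qquad \psi \partial_x (H_i + i)^{-4} H_i g(H_i) \]
are trace class, and the cyclicity of the trace yields
\begin{align*}
\operatorname{tr}(H_i \psi \partial_x f(H_i)) &= \operatorname{tr}(H_i \psi \partial_x (H_i + i)^{-4} g(H_i)) = \operatorname{tr}(H_i g(H_i) \psi \partial_x (H_i + i)^{-4}) \\
&= \operatorname{tr}(\psi \partial_x (H_i + i)^{-4} g(H_i) H_i) = \operatorname{tr}(\psi \partial_x H_i f(H_i)).
\end{align*}
Notice that in the above equalities we have used the fact that the operators $g(H_i)$,
$H_i$ and $(H_i + i)^{-4}$ commute.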

Lemma 4. Let $V$ be as in Lemma 2. Then for every $f \in C_0^\infty(\mathbb{R})$ the operators
\[ f(H) - f(H_0), \qquad \partial_x (f(H) - f(H_0)) \quad \text{and} \quad (H \pm i)\partial_x (f(H) - f(H_0)) \]
are trace class.

Proof. Let $g(x) = (x + i)^4 f(x)$ be as above. We decompose
\begin{align*}
(H + i)\partial_x (f(H) - f(H_0)) &= (H + i)\partial_x \bigl((H + i)^{-4} - (H_0 + i)^{-4}\bigr) g(H_0) \\
&\quad + (H + i)\partial_x (H + i)^{-4}\bigl(g(H) - g(H_0)\bigr) = I + II.
\end{align*}
According to Lemma 2, the operator $I$ is trace class. To treat $II$, we use the
Helffer–Sjöstrand formula
\begin{align*}
II &= -\frac{1}{\pi} \int \bar\partial \tilde g(z)\, (H + i)\partial_x (H + i)^{-4}\bigl((z - H)^{-1} - (z - H_0)^{-1}\bigr) L(dz) \\
&= -\frac{1}{\pi} \int \bar\partial \tilde g(z)\, (H + i)\partial_x (H + i)^{-4}(z - H)^{-1} V (z - H_0)^{-1} L(dz),
\end{align*}
where $\tilde g(z) \in C_0^\infty(\mathbb{C})$ is an almost analytic continuation of $g$ such that
$\bar\partial \tilde g(z) = O(|\operatorname{Im} z|^\infty)$, while $L(dz)$ is the Lebesgue measure on $\mathbb{C}$. Now applying Lemma 1(v),
we see that the operator
\[ (H + i)\partial_x (H + i)^{-4}(z - H)^{-1} V \]
is trace class. Since $|z|$ is bounded on $\operatorname{supp} \tilde g$, we can apply (2.1) to the right-hand
side of the above equation and, combining this with $\bar\partial \tilde g(z) = O(|\operatorname{Im} z|^\infty)$, we deduce
that $II$ is trace class. Summing up, we conclude that $(H + i)\partial_x (f(H) - f(H_0))$
is trace class. The same argument works for $(H - i)\partial_x (f(H) - f(H_0))$. The proofs
concerning $f(H) - f(H_0)$ and $\partial_x (f(H) - f(H_0))$ are similar and simpler.
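
For orientation, the Helffer–Sjöstrand representation used in the proof above can be written out together with one standard way of building an almost analytic continuation. The block below is a sketch under stated assumptions and is not taken from the paper: the cut-off $\tau$ and the order $N$ are illustrative choices, and the finite-order extension shown only gives $\bar\partial\tilde g = O(|\operatorname{Im} z|^N)$ rather than the $O(|\operatorname{Im} z|^\infty)$ decay quoted in the text (which requires a Borel-type construction).

```latex
% Sketch: Helffer--Sjostrand formula with a finite-order almost
% analytic extension. \tau and N are illustrative assumptions.
\begin{align*}
g(H) &= -\frac{1}{\pi}\int_{\mathbb{C}} \bar\partial\tilde g(z)\,(z-H)^{-1}\,L(dz),
\qquad \bar\partial = \tfrac12(\partial_x + i\partial_y),\quad z = x+iy, \\
\tilde g(x+iy) &= \tau(y)\sum_{k=0}^{N} g^{(k)}(x)\,\frac{(iy)^k}{k!},
\qquad \tau\in C_0^\infty(\mathbb{R}),\ \tau = 1 \text{ near } 0, \\
\bar\partial\tilde g(x+iy) &= \tfrac12\,\tau(y)\,g^{(N+1)}(x)\,\frac{(iy)^N}{N!}
 + \tfrac{i}{2}\,\tau'(y)\sum_{k=0}^{N} g^{(k)}(x)\,\frac{(iy)^k}{k!}.
\end{align*}
% Near the real axis \tau' = 0, so \bar\partial\tilde g = O(|y|^N).
% A trace-norm bound of order |Im z|^{-2}, as in (2.1), is then
% integrable against \bar\partial\tilde g as soon as N >= 2.
```

The telescoping in the last line comes from the fact that $\partial_x$ raises the order of the derivative of $g$ while $i\partial_y$ lowers the power of $(iy)$, so all terms cancel except the one of top order.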

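To establish Theorem 1, we also need the following abstract result. For the
reader's convenience, we present a proof.

Proposition 1. Let $A$ be an operator of trace class on some Hilbert space $\mathcal{H}$
and let $\{K_n\}$ be a sequence of bounded linear operators which converges strongly to
$K \in \mathcal{L}(\mathcal{H})$. Then
\[ \lim_{n \to \infty} \|K_n A - KA\|_{\mathrm{tr}} = 0. \]

Proof. First assume that $A$ is a finite rank operator having the form
$A = \sum_{k=1}^{m} \langle \cdot, \psi_k \rangle \varphi_k$, where $\psi_k, \varphi_k \in \mathcal{H}$. Since
\[ \|A\|_{\mathrm{tr}} \le \sum_{k=1}^{m} \|\varphi_k\| \|\psi_k\|, \]
we have
\[ \|(K_n - K)A\|_{\mathrm{tr}} \le \sum_{k=1}^{m} \|(K_n - K)\varphi_k\| \|\psi_k\| \to 0, \qquad n \to \infty. \tag{2.6} \]
The general case can be covered by an approximation. Since $K_n$ converges strongly,
it follows from the Banach–Steinhaus theorem that $M = \sup_n \|K_n\| < \infty$. Let $\eta$
be an arbitrary positive constant and let $A_\eta$ be a finite rank operator such that
$\|A - A_\eta\|_{\mathrm{tr}} \le \frac{\eta}{2M}$. We have
\begin{align*}
\|(K_n - K)A\|_{\mathrm{tr}} &\le \|(K_n - K)(A - A_\eta)\|_{\mathrm{tr}} + \|(K_n - K)A_\eta\|_{\mathrm{tr}} \\
&\le \eta + \|(K_n - K)A_\eta\|_{\mathrm{tr}}.
\end{align*}
Next we apply (2.6) for the finite rank operator $A_\eta$ and obtain
\[ \limsup_{n \to \infty} \|(K_n - K)A\|_{\mathrm{tr}} \le \eta, \]
which implies Proposition 1, since $\eta$ is arbitrary.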

Proof of Theorem 1. Assume first that $V = \varphi W$ where $\varphi \in C_0^\infty(\mathbb{R}^2; \mathbb{R})$ and
$W, \partial_x W \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$. Choose a function $\chi \in C_0^\infty(\mathbb{R}^2)$ such that $\chi = 1$
for $|(x, y)| \le 1$. For $R > 0$ set
\[ \chi_R(x, y) = \chi\Bigl(\frac{x}{R}, \frac{y}{R}\Bigr), \]
and introduce
\[ B_R := [\chi_R \partial_x, H] f(H) - [\chi_R \partial_x, H_0] f(H_0). \]
Here $[A, B] = AB - BA$ denotes the commutator of $A$ and $B$. According to
Lemma 3, we have
\[ \operatorname{tr}([\chi_R \partial_x, H] f(H)) = \operatorname{tr}([\chi_R \partial_x, H_0] f(H_0)) = 0. \]
Thus
\[ \operatorname{tr}(B_R) = 0. \tag{2.7} \]
On the other hand, a simple calculation shows that
\begin{align*}
B_R &= \chi_R \bigl([\partial_x, H] f(H) - [\partial_x, H_0] f(H_0)\bigr) + [\chi_R, H_0]\partial_x (f(H) - f(H_0)) \\
&=: B_R^1 + B_R^2, \tag{2.8}
\end{align*}
where we have used that $[\chi_R, H] = [\chi_R, H_0]$.
Since $[\partial_x, H] = 1 + \partial_x V$ and $[\partial_x, H_0] = 1$, it follows from Lemma 3, Lemma 4
and Proposition 1 that
\[ \lim_{R \to \infty} \operatorname{tr}(B_R^1) = \operatorname{tr}(f(H) - f(H_0)) + \operatorname{tr}(\partial_x V f(H)). \tag{2.9} \]

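Next, we claim that
\[ \lim_{R \to \infty} \|B_R^2\|_{\mathrm{tr}} = 0. \tag{2.10} \]
Writing $(D_x \chi)_R(x, y) := (D_x \chi)(x/R, y/R)$, and similarly for $(D_y \chi)_R$ and $(\Delta \chi)_R$, and using that
\[ [\chi_R, H_0] = -\frac{2}{R}(D_x \chi)_R (D_x - y) - \frac{2}{R}(D_y \chi)_R D_y + \frac{1}{R^2}(\Delta \chi)_R, \]
we decompose $B_R^2$ as a sum of three terms $B_R^2 = I_R^1 + I_R^2 + I_R^3$, where
\begin{align*}
I_R^1 &= -\frac{2}{R}(D_x \chi)_R (D_x - y)\partial_x (f(H) - f(H_0)), \\
I_R^2 &= -\frac{2}{R}(D_y \chi)_R D_y \partial_x (f(H) - f(H_0)), \\
I_R^3 &= \frac{1}{R^2}(\Delta \chi)_R \partial_x (f(H) - f(H_0)).
\end{align*}
To treat $I_R^1$, we set $Q = H - x$ and write
\begin{align*}
I_R^1 = {} & -\frac{2}{R}(D_x \chi)_R (D_x - y)(Q - i)^{-1}(H - i)\partial_x (f(H) - f(H_0)) \\
& + \frac{2}{R}(D_x \chi)_R \bigl[(D_x - y)(Q - i)^{-1}, x\bigr]\partial_x (f(H) - f(H_0)) \\
& + \frac{2}{R}\, x (D_x \chi)_R (D_x - y)(Q - i)^{-1}\partial_x (f(H) - f(H_0)).
\end{align*}
The operators $[(D_x - y)(Q - i)^{-1}, x]$ and $(D_x - y)(Q - i)^{-1}$ are bounded,
while $\partial_x (f(H) - f(H_0))$ and $(H - i)\partial_x (f(H) - f(H_0))$ are trace class operators
(see Lemma 4). On the other hand, $\frac{2}{R}(D_x \chi)_R$ and $\frac{2}{R}\, x (D_x \chi)_R$ converge strongly to
zero. Indeed, since $\chi(x, y) = 1$ for $|(x, y)| \le 1$, we get
\[ \int \Bigl| \frac{x}{R}(D_x \chi)_R u \Bigr|^2 dx\, dy \le \sup_{(x,y) \in \mathbb{R}^2} |x D_x \chi(x, y)|^2 \int_{\{|(x,y)| \ge R\}} |u|^2 dx\, dy \to 0, \qquad R \to \infty, \]
for all $u \in L^2(\mathbb{R}^2)$. Applying Proposition 1, we conclude that
\[ \lim_{R \to \infty} \|I_R^1\|_{\mathrm{tr}} = 0. \tag{2.11} \]
To deal with $I_R^2$, $I_R^3$, notice that the operators $D_y (Q - i)^{-1}$ and $[D_y (Q - i)^{-1}, x]$
are bounded and we repeat the above argument. Thus we deduce
\[ \lim_{R \to \infty} \|I_R^j\|_{\mathrm{tr}} = 0, \qquad j = 2, 3. \tag{2.12} \]
Consequently, (2.11) and (2.12) imply (2.10) and the claim is proved. Now, combining
(2.7)–(2.10), we obtain Theorem 1 in the case where $V$ satisfies the assumption
of Lemma 2 and $\epsilon = 1$.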

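Proposition 2. Assume that $V \in L^\infty(\mathbb{R}^2; \mathbb{R})$ satisfies (1.1). Then for $z \notin \mathbb{R}$, $z' \notin \mathbb{R}$
the operators $(z - H)^{-1} V (z' - H)^{-1}$, $V (z - H)^{-1}(z' - H)^{-1}$, $(H - z)^{-1} - (H_0 - z)^{-1}$
are trace class and
\begin{align*}
\|(z - H)^{-1} V (z' - H)^{-1}\|_{\mathrm{tr}} &\le C_1 |\operatorname{Im} z|^{-1} |\operatorname{Im} z'|^{-1}, \tag{2.13} \\
\|V (z - H)^{-1}(z' - H)^{-1}\|_{\mathrm{tr}} &\le C_1 |\operatorname{Im} z|^{-1} |\operatorname{Im} z'|^{-1}.
\end{align*}
Moreover, if $g \in C_0^\infty(\mathbb{R})$, then the operator $V g(H)$ is trace class.

Proof. Set $g_\delta(x, y) = \langle x \rangle^{-1-\frac{\delta}{2}} \langle y \rangle^{-\frac{1+\delta}{2}}$ and $f_\delta(x, y) = \langle x \rangle^{-2-\delta} \langle y \rangle^{-1-\delta}$, where
$\delta$ is the constant in (1.1). According to Lemma 8 in the Appendix, $g_\delta (H_0 + i)^{-1}$ and
$(H_0 + i)^{-1} g_\delta$ are Hilbert–Schmidt operators and $f_\delta (H_0 + i)^{-2}$ is trace class. Since
$g_\delta^{-1} V g_\delta^{-1},\ V f_\delta^{-1} \in L^\infty$, it follows that
\[ (H_0 + i)^{-1} V (H_0 + i)^{-1} = (H_0 + i)^{-1} g_\delta \bigl[g_\delta^{-1} V g_\delta^{-1}\bigr] g_\delta (H_0 + i)^{-1} \]
and $V (H_0 + i)^{-2}$ are trace class operators. Next we write
\begin{align*}
(H + i)^{-1} - (H_0 + i)^{-1} = {} & -(H_0 + i)^{-1} V (H_0 + i)^{-1} \\
& + (H + i)^{-1} V (H_0 + i)^{-1} V (H_0 + i)^{-1}
\end{align*}
and conclude that $(H + i)^{-1} - (H_0 + i)^{-1} = -(H + i)^{-1} V (H_0 + i)^{-1}$ is trace class.
Now consider the following equalities, obtained by inserting $(i + H)^{-1} = (i + H_0)^{-1} - (i + H)^{-1} V (i + H_0)^{-1}$ on the left and $(i + H)^{-1} = (i + H_0)^{-1} - (i + H_0)^{-1} V (i + H)^{-1}$ on the right:
\begin{align*}
(i + H)^{-1} V (i + H)^{-1} = {} & (i + H_0)^{-1} V (i + H_0)^{-1} \\
& - (i + H)^{-1} V (i + H_0)^{-1} V (i + H_0)^{-1} \\
& - (i + H_0)^{-1} V (i + H_0)^{-1} V (i + H)^{-1} \\
& + (i + H)^{-1} V (i + H_0)^{-1} V (i + H_0)^{-1} V (i + H)^{-1}
\end{align*}
and
\begin{align*}
V (H + i)^{-2} = {} & V (H_0 + i)^{-2} - V (H_0 + i)^{-1}(H + i)^{-1} V (H_0 + i)^{-1} \\
& - V (H + i)^{-1} V (H_0 + i)^{-1}(H + i)^{-1}.
\end{align*}
By using the trace class properties established above, we get (2.13) for $z = z' = -i$.
By applying the first resolvent equation
\[ (H - z)^{-1} = (H + i)^{-1} + (z + i)(H + i)^{-1}(H - z)^{-1}, \]
we obtain the general case.
To examine $V g(H)$, consider the function $h(x) = (x + i)^2 g(x)$. Then $V g(H) =
V (H + i)^{-2} h(H)$ and since $V (H + i)^{-2}$ is trace class, we obtain the result.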

For $R > 0$ introduce
\[ H_R := H_0 + \chi_R(x, y) V(x, y), \]
where $\chi_R(x, y) = \chi\bigl(\frac{x}{R}, \frac{y}{R}\bigr)$ with $\chi \in C_0^\infty(\mathbb{R}^2)$ such that $\chi = 1$ in a neighborhood of
$|(x, y)| \le 1$.

Remark 1. The result of Proposition 2 concerning the trace class property of
$(H - z)^{-1} - (H_0 - z)^{-1}$, $\operatorname{Im} z \neq 0$, improves considerably [5, Proposition 2],
where much more regular potentials have been examined. On the other hand, if
the potential $V$ satisfies (1.1) and $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$, then the statements
of Proposition 2 hold for the operators $(z - H_R)^{-1} V (z' - H)^{-1}$, $z \notin \mathbb{R}$, $z' \notin \mathbb{R}$.
The proof of Theorem 1 in the general case will be a simple consequence of the
following

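Lemma 5. Let $V(x, y)$ be as in Theorem 1. Then for $f \in C_0^\infty(\mathbb{R})$ we have
\[ \lim_{R \to \infty} \operatorname{tr}(f(H_R) - f(H)) = 0, \tag{2.14} \]
\[ \lim_{R \to \infty} \operatorname{tr}(\partial_x (\chi_R V) f(H_R)) = \operatorname{tr}(\partial_x V f(H)). \tag{2.15} \]

Proof. Let $g(x) = (x + i) f(x)$. We decompose
\begin{align*}
f(H_R) - f(H) &= \bigl((H_R + i)^{-1} - (H + i)^{-1}\bigr) g(H) + (H_R + i)^{-1}\bigl(g(H_R) - g(H)\bigr) \\
&= J_R + K_R.
\end{align*}
From the first resolvent identity, we obtain
\[ J_R = (H_R + i)^{-1}(1 - \chi_R) V (H + i)^{-1} g(H) = (H_R + i)^{-1}(1 - \chi_R) V f(H). \]
According to Proposition 2, the operator $V f(H)$ is trace class and $(H_R + i)^{-1}(1 - \chi_R)$
converges strongly to zero. Then from Proposition 1 it follows that
\[ \lim_{R \to \infty} \operatorname{tr} J_R = 0. \tag{2.16} \]
To treat $\operatorname{tr} K_R$, as in the proof of Lemma 4, we use the Helffer–Sjöstrand formula
and write
\begin{align*}
\operatorname{tr} K_R &= -\frac{1}{\pi} \int \bar\partial \tilde g(z) \operatorname{tr}\bigl((H_R + i)^{-1}\bigl((z - H_R)^{-1} - (z - H)^{-1}\bigr)\bigr) L(dz) \\
&= \frac{1}{\pi} \int \bar\partial \tilde g(z) \operatorname{tr}\bigl((H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}\bigr) L(dz).
\end{align*}
By cyclicity of the trace we obtain
\begin{align*}
& \operatorname{tr}\bigl((H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}\bigr) \\
&\quad = \operatorname{tr}\bigl((z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H_R + i)^{-1}\bigr) \\
&\quad = \operatorname{tr}\bigl((z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H + i)^{-1}\bigr) \\
&\qquad + \operatorname{tr}\bigl((1 - \chi_R) V (H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H + i)^{-1}\bigr).
\end{align*}
Now notice that for $z \notin \mathbb{R}$ the operators $(1 - \chi_R) V (H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R)$ and
$(z - H_R)^{-1}(1 - \chi_R)$ converge strongly to zero. On the other hand, from Proposition 2
we deduce that the operator $V (z - H)^{-1}(i + H)^{-1}$ is trace class. Thus for $z \notin \mathbb{R}$, we
conclude that the integrand converges to 0 as $R \to \infty$. An application of the Lebesgue
dominated convergence theorem combined with the estimates (2.13) yields
\[ \lim_{R \to \infty} \operatorname{tr} K_R = 0. \tag{2.17} \]
Putting together (2.16) and (2.17), we obtain (2.14).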


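Next, we pass to the proof of (2.15). A simple calculation shows that
\begin{align*}
\partial_x (\chi_R V) f(H_R) = {} & \partial_x (\chi_R V)\bigl(f(H_R) - f(H)\bigr) + \frac{1}{R}(\partial_x \chi)_R V f(H) \\
& + \chi_R \partial_x V f(H), \tag{2.18}
\end{align*}
where $(\partial_x \chi)_R(x, y) = (\partial_x \chi)(x/R, y/R)$.
Repeating the same arguments as in the proof of (2.14), we show that
\[ \lim_{R \to \infty} \operatorname{tr}\bigl(\partial_x (\chi_R V)(f(H_R) - f(H))\bigr) = 0. \tag{2.19} \]
On the other hand, since $\frac{1}{R}(\partial_x \chi)_R$ (respectively $\chi_R$) converges strongly to zero
(respectively to the identity), it follows from Proposition 1 that
\[ \lim_{R \to \infty} \operatorname{tr}\Bigl(\frac{1}{R}(\partial_x \chi)_R V f(H)\Bigr) = 0, \qquad \lim_{R \to \infty} \operatorname{tr}(\chi_R \partial_x V f(H)) = \operatorname{tr}(\partial_x V f(H)), \]
which together with (2.18) and (2.19) yields (2.15).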

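End of the proof of Theorem 1. Applying Theorem 1 to $H_R$ (with $\epsilon = 1$), we obtain
\begin{align*}
\operatorname{tr}[f(H_R) - f(H)] + \operatorname{tr}[f(H) - f(H_0)] &= \operatorname{tr}[f(H_R) - f(H_0)] \\
&= -\operatorname{tr}(\partial_x (\chi_R V) f(H_R)),
\end{align*}
and an application of Lemma 5 implies Theorem 1.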



3. Mourre Estimate and Embedded Eigenvalues


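Consider the operator
\[ Q = (D_x - By)^2 + D_y^2 + V(x, y), \]
and set $\langle x \rangle = (1 + |x|^2)^{1/2}$, $\langle D_x \rangle = (1 + D_x^2)^{1/2}$.

Lemma 6. Assume that $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and let
\[ \|\mathbb{1}_{\{|x|+|y|>R\}}(x, y)\, \partial_x V\|_{L^\infty} \to 0 \]
for $R \to +\infty$. Then for all $f \in C_0^\infty(\mathbb{R})$, the operator $f(H)\partial_x V f(H)$ is compact.

Proof. Let $\chi(x, y) \in C_0^\infty(\mathbb{R}^2)$ be equal to one near zero. Set $\chi_n(x, y) = \chi\bigl(\frac{x}{n}, \frac{y}{n}\bigr)$.
According to Lemma 3, the operator $f(H)\chi_n \partial_x V f(H)$ is trace class. The set of
compact operators is closed with respect to the norm $\|\cdot\|_{\mathcal{L}(L^2)}$ and the lemma follows
from the obvious estimate
\[ \|f(H)(1 - \chi_n)\partial_x V f(H)\|_{\mathcal{L}(L^2)} \le \|f(H)\|^2_{\mathcal{L}(L^2)} \|(1 - \chi_n)\partial_x V\|_{L^\infty}, \]
whose right-hand side tends to zero as $n \to \infty$.

Theorem 3. Let $[a, b] \subset \mathbb{R}$. Under the assumptions of Lemma 6, there exists a
compact operator $K$ such that
\[ \mathbb{1}_{[a,b]}(H)[\partial_x, H]\mathbb{1}_{[a,b]}(H) \ge \epsilon\, \mathbb{1}_{[a,b]}(H) + \mathbb{1}_{[a,b]}(H) K \mathbb{1}_{[a,b]}(H). \tag{3.1} \]

Proof. Since the operator $\partial_x$ commutes with $(D_x - By)$ and $D_y^2$, we have $[\partial_x, H] =
\epsilon + \partial_x V$. Consequently,
\begin{align*}
\mathbb{1}_{[a,b]}(H)[\partial_x, H]\mathbb{1}_{[a,b]}(H) &= \epsilon\, \mathbb{1}_{[a,b]}(H) + \mathbb{1}_{[a,b]}(H)\partial_x V \mathbb{1}_{[a,b]}(H) \\
&= \epsilon\, \mathbb{1}_{[a,b]}(H) + \mathbb{1}_{[a,b]}(H) f(H)\partial_x V f(H)\mathbb{1}_{[a,b]}(H), \tag{3.2}
\end{align*}
where $f \in C_0^\infty(\mathbb{R})$ is a cut-off function such that $f = 1$ on $[a, b]$. Thus, Theorem 3
follows from Lemma 6.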
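
The commutator identity behind (3.1) and (3.2) can be checked line by line. The following worked computation is a sketch that acts on a test function $u \in C_0^\infty(\mathbb{R}^2)$ (an assumption made here to avoid domain questions); it also makes explicit that the compact operator can be taken as $K = f(H)\partial_x V f(H)$.

```latex
% Worked check of [\partial_x, H] = \epsilon + \partial_x V on test functions u.
\begin{align*}
[\partial_x, (D_x - By)^2]u &= 0, \qquad [\partial_x, D_y^2]u = 0
  \quad\text{(constant coefficients in } x\text{)}, \\
[\partial_x, \epsilon x + V]u
  &= \partial_x\bigl((\epsilon x + V)u\bigr) - (\epsilon x + V)\partial_x u
   = (\epsilon + \partial_x V)u, \\
P[\partial_x, H]P &= \epsilon P + P\,\partial_x V\,P
   = \epsilon P + P\,f(H)\,\partial_x V\,f(H)\,P,
   \qquad P := \mathbb{1}_{[a,b]}(H).
\end{align*}
% The last equality uses f(H)P = P f(H) = P, valid because f = 1 on [a,b].
```

In this form (3.1) holds with equality, and the only analytic input is the compactness of $f(H)\partial_x V f(H)$ supplied by Lemma 6.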

The use of commutators with the operator $\partial_x$ is well known for the analysis of
the operator without magnetic field ($B = 0$) (see the pioneering work [2] and [1] for
a more complete list of references). On the other hand, to treat crossed magnetic
and electric fields we need Lemma 1 and Lemma 3.

Corollary 1. In addition to the assumptions of Theorem 3 assume that $\partial_x^2 V \in
C^0(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2)$. Then the point spectrum of $H$ in $[a, b]$ is finite and with finite
multiplicity. Moreover, the singular continuous spectrum of $H$ is empty.

Proof. Set $A = D_x$ and let $\tau \in \mathbb{R}$. The explicit formula
\begin{align*}
e^{i\tau A}(H + i)^{-1} &= (e^{i\tau A} H e^{-i\tau A} + i)^{-1} e^{i\tau A} \\
&= (H + \epsilon\tau + V(x + \tau, y) - V(x, y) + i)^{-1} e^{i\tau A}
\end{align*}
shows that $e^{i\tau A}$ leaves $D(H)$ invariant. On the other hand, since
\begin{align*}
H e^{i\tau A}(H + i)^{-1} &= e^{i\tau A}\bigl(e^{-i\tau A} H e^{i\tau A}\bigr)(H + i)^{-1} \\
&= e^{i\tau A}\bigl(H - \epsilon\tau + V(x - \tau, y) - V(x, y)\bigr)(H + i)^{-1},
\end{align*}
we deduce that for each $\psi \in D(H)$
\[ \sup_{|\tau| < 1} \|H e^{i\tau A}\psi\| < \infty. \]
Combining this with the fact that $i[A, H] = \epsilon + \partial_x V$, $[A, [A, H]] = -\partial_x^2 V$ and using
(3.1), we conclude that the self-adjoint operator $A$ is a conjugate operator for $H$ at
every $E \in \mathbb{R}$ in the sense of [14]. Consequently, Corollary 1 follows from the main
result in [14] (see also [1, 6]).
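
The conjugation formulas used in the proof above follow from the fact that $e^{i\tau D_x}$ acts as a translation in $x$. The derivation below is a sketch written for smooth, rapidly decaying $u$ (an assumption made only to keep the computation elementary); it recovers the two commutators by differentiating in $\tau$.

```latex
% e^{i\tau A} with A = D_x = -i\partial_x is translation by \tau in x.
\begin{align*}
(e^{i\tau A}u)(x,y) &= u(x+\tau, y), \qquad
e^{i\tau A}\,W(x,y)\,e^{-i\tau A} = W(x+\tau, y)
  \ \text{ for a multiplication operator } W, \\
e^{i\tau A} H e^{-i\tau A}
  &= (D_x - By)^2 + D_y^2 + \epsilon(x+\tau) + V(x+\tau,y) \\
  &= H + \epsilon\tau + V(x+\tau,y) - V(x,y), \\
\frac{d}{d\tau}\Big|_{\tau=0} e^{i\tau A} H e^{-i\tau A}
  &= i[A,H] = \epsilon + \partial_x V, \\
\frac{d^2}{d\tau^2}\Big|_{\tau=0} e^{i\tau A} H e^{-i\tau A}
  &= i[A, i[A,H]] = -[A,[A,H]] = \partial_x^2 V .
\end{align*}
```

Boundedness of $\partial_x V$ and $\partial_x^2 V$ is therefore exactly what is needed for the first and second commutators of $H$ with $A$ to be bounded perturbations of the constant $\epsilon$.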

Remark 2. For any sign-definite and bounded potential $V(x, y)$ such that
$|V(x, y)| \to 0$ as $|x| + |y| \to \infty$ sufficiently fast, it was established in [13, 15] that for
$\epsilon = 0$ the potential $V$ creates an infinite number of eigenvalues of $Q$ which accumulate
to Landau levels. The above corollary shows that only a finite number of these
eigenvalues may survive in the presence of a non-vanishing constant electric field.
In general, the problem of absence of embedded eigenvalues when $\epsilon \neq 0$ remains
open and this is an interesting conjecture.

For a fixed value of $\epsilon \neq 0$, the following result shows that there are potentials
for which $H$ has absolutely continuous spectrum without embedded eigenvalues.

Corollary 2. Fix $\epsilon > 0$. Assume that $\partial_x^\alpha V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$, $\alpha = 0, 1, 2$,
and
\[ \epsilon + \partial_x V(x, y) > c > 0, \tag{3.3} \]
uniformly on $(x, y) \in \mathbb{R}^2$. Then $H$ has no eigenvalues. Moreover, for $s > 1/2$, the
following estimate holds uniformly for $\lambda$ in a compact interval:
\[ \|\langle D_x \rangle^{-s}(H - \lambda \pm i0)^{-1}\langle D_x \rangle^{-s}\| = O_\epsilon(1). \tag{3.4} \]

Proof. Let $[a, b]$ be a compact interval in $\mathbb{R}$. From (3.2) and (3.3), we have
\[ \mathbb{1}_{[a,b]}(H)[\partial_x, H]\mathbb{1}_{[a,b]}(H) \ge c\, \mathbb{1}_{[a,b]}(H). \tag{3.5} \]
According to the proof of Corollary 1, $A = D_x$ is a conjugate operator in the
sense of [14]. Combining this with (3.5) we deduce from [14] that $H$ has no eigenvalue
in $\mathbb{R}$. Applying once more the Mourre theorem (see [1, 6, 14]), we obtain the
estimate (3.4).

4. Limiting Absorption Principle


In this section, we treat the case when $\epsilon$ is small enough. Notice that when $\epsilon$ tends
to zero, in general the assumption $\epsilon + \partial_x V > c > 0$ is not satisfied and we cannot
apply Corollary 2. Our goal is to study the behavior of the resolvent $(H - \lambda \pm i\delta)^{-1}$
as $\delta \searrow 0$ for $\lambda \notin \sigma(Q)$. For such $\lambda$ we could have eigenvalues of $H$ and a direct
application of the Mourre argument is not possible. We will obtain the result assuming
that $\epsilon$ is small and for this purpose we need the following

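Lemma 7. Assume that $V \in L^\infty(\mathbb{R}^2; \mathbb{R})$ and let $\lambda \notin \sigma(Q)$. Let $\chi \in C_0^\infty(\mathbb{R}; \mathbb{R})$ be
equal to 1 near $\lambda$ and let $\operatorname{supp}\chi \cap \sigma(Q) = \emptyset$. Then

$$\|\chi(H)\langle x\rangle^{-2}\| \leq C\epsilon^2. \tag{4.1}$$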

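Proof. Since $\operatorname{supp}\chi \cap \sigma(Q) = \emptyset$, the operators $(z - Q)^{-1}$ and $(z - Q)^{-1}x(z - Q)^{-1}$
are analytic operator-valued functions for $z$ in a complex neighborhood of $\operatorname{supp}\chi$.
Let $\tilde\chi(z) \in C_0^\infty(\mathbb{C})$ be an almost analytic continuation of $\chi(x)$ such that

$$\bar\partial_z\tilde\chi(z) = O(|\operatorname{Im} z|^\infty)$$

and $\operatorname{supp}\tilde\chi \cap \sigma(Q) = \emptyset$. We have the representation

$$\chi(H) = -\frac{1}{\pi}\int \bar\partial_z\tilde\chi(z)(z - H)^{-1}L(dz),$$

where $L(dz)$ is the Lebesgue measure in $\mathbb{C}$. By using the resolvent identity, we get

$$(z - H)^{-1} = (z - Q)^{-1} + \epsilon(z - Q)^{-1}x(z - Q)^{-1} + \epsilon^2(z - H)^{-1}x(z - Q)^{-1}x(z - Q)^{-1},$$

and we obtain

$$\chi(H) = \chi(Q) - \frac{\epsilon}{\pi}\int \bar\partial_z\tilde\chi(z)(z - Q)^{-1}x(z - Q)^{-1}L(dz) - \frac{\epsilon^2}{\pi}\int \bar\partial_z\tilde\chi(z)(z - H)^{-1}x(z - Q)^{-1}x(z - Q)^{-1}L(dz).$$

Since $\operatorname{supp}\tilde\chi \cap \sigma(Q) = \emptyset$, the first two terms on the right-hand side vanish.
Consequently,

$$\chi(H) = -\frac{\epsilon^2}{\pi}\int \bar\partial_z\tilde\chi(z)(z - H)^{-1}x(z - Q)^{-1}x(z - Q)^{-1}L(dz). \tag{4.2}$$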

Next, we observe that

$$x(z - Q)^{-1} = (z - Q)^{-1}x + (z - Q)^{-1}[x, Q](z - Q)^{-1} = (z - Q)^{-1}x + L_1.$$

We have $[x, Q] = 2(D_x - By)$. Thus it is easy to see that for $z \notin \sigma(Q)$, $L_1 =
(z - Q)^{-1}[x, Q](z - Q)^{-1}$ is a bounded operator since $(D_x - By)(i - Q)^{-1}$ is bounded
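and $(z - Q)^{-1} = (i - Q)^{-1} + (i - Q)^{-1}(i - z)(z - Q)^{-1}$. We write

$$x(z - Q)^{-1}x(z - Q)^{-1} = (z - Q)^{-1}x(z - Q)^{-1}x + (z - Q)^{-1}xL_1 + L_1(z - Q)^{-1}x + L_1^2 = \sum_{j=1}^{4} I_j.$$

The operators $I_4 = L_1^2$ and $I_3\langle x\rangle^{-2} = L_1(z - Q)^{-1}x\langle x\rangle^{-2}$ are bounded. To see that
$I_1\langle x\rangle^{-2}$ is bounded, note that

$$I_1\langle x\rangle^{-2} = (z - Q)^{-2}x^2\langle x\rangle^{-2} + (z - Q)^{-1}L_1x\langle x\rangle^{-2}.$$

Finally,

$$I_2\langle x\rangle^{-2} = (z - Q)^{-2}x[x, Q](z - Q)^{-1}\langle x\rangle^{-2} + (z - Q)^{-1}L_1[x, Q](z - Q)^{-1}\langle x\rangle^{-2}$$

and since the second term on the right-hand side is bounded, it remains to examine
the operator

$$x[x, Q](z - Q)^{-1}\langle x\rangle^{-2} = [x, Q]x(z - Q)^{-1}\langle x\rangle^{-2} + 2(z - Q)^{-1}\langle x\rangle^{-2}.$$

Applying the above argument, we see that the last operator is bounded. Consequently,
the operator under integration in (4.2) is bounded by $O(|\operatorname{Im} z|^{-1})$ and this
proves the statement.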
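
The two algebraic identities driving the proof of Lemma 7, namely the second-order resolvent expansion and the commutation of $x$ through $(z - Q)^{-1}$, hold for arbitrary matrices, so they can be sanity-checked in finite dimensions. The following is only a minimal sketch: the $2 \times 2$ matrices `Q` and `X`, the value `EPS` and the spectral parameter `Z` are arbitrary stand-ins chosen for illustration and are not the operators of the paper.

```python
# Finite-dimensional check of the identities used in the proof of Lemma 7.
# Q, X, EPS, Z are hypothetical stand-ins (assumptions), not the paper's operators.

def inv2(m):
    """Inverse of a 2x2 complex matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul(*ms):
    """Product of 2x2 matrices, taken left to right."""
    out = [[1, 0], [0, 1]]
    for m in ms:
        out = [[sum(out[i][k] * m[k][j] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return out

def lin(*terms):
    """Linear combination of 2x2 matrices given as (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]

def dist(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

ID = [[1, 0], [0, 1]]
Q = [[1.0, 0.5], [0.5, -1.0]]   # stand-in for Q (real symmetric)
X = [[0.3, 1.0], [1.0, -0.7]]   # stand-in for multiplication by x
EPS = 0.1                        # stand-in for the field strength
Z = 0.4 + 1.0j                   # non-real z, so both resolvents exist

def resolvent(op, z):
    """(z - op)^{-1}."""
    return inv2(lin((z, ID), (-1, op)))

H = lin((1, Q), (EPS, X))        # H = Q + eps * x
RQ, RH = resolvent(Q, Z), resolvent(H, Z)

# (z-H)^{-1} = R + eps R x R + eps^2 (z-H)^{-1} x R x R, with R = (z-Q)^{-1}
second_order = lin((1, RQ), (EPS, mul(RQ, X, RQ)),
                   (EPS ** 2, mul(RH, X, RQ, X, RQ)))
resolvent_error = dist(RH, second_order)

# x R = R x + R [x, Q] R
comm = lin((1, mul(X, Q)), (-1, mul(Q, X)))
commutator_error = dist(mul(X, RQ), lin((1, mul(RQ, X)), (1, mul(RQ, comm, RQ))))
```

Both `resolvent_error` and `commutator_error` should vanish up to rounding. The check says nothing about boundedness of the weighted operators, which is the analytic content of the lemma.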

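Proposition 3. Assume that $\partial_x^\alpha V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ for $\alpha = 0, 1, 2$ and let
$\langle x\rangle^2\partial_x V \in L^\infty(\mathbb{R}^2)$. Let $[a, b]$ be a compact interval such that $[a, b] \cap \sigma(Q) = \emptyset$. Then
for $s > 1/2$ and sufficiently small $\epsilon_0 > 0$ we have the following estimate uniformly
with respect to $\lambda \in [a, b]$ and $\epsilon \in\, ]0, \epsilon_0]$:

$$\|\langle D_x\rangle^{-s}(H - \lambda \mp i0)^{-1}\langle D_x\rangle^{-s}\| \leq C\epsilon^{-1}. \tag{4.3}$$

Moreover, $H$ has no embedded eigenvalues and no singular continuous spectrum in
$[a, b]$.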

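Proof. Let $[a - \kappa, b + \kappa] \cap \sigma(Q) = \emptyset$ for $0 < \kappa \ll 1$. Choose a function $\chi(t) \in
C_0^\infty(\mathbb{R}; \mathbb{R})$ such that $\operatorname{supp}\chi \subset [a - \kappa, b + \kappa]$ and $\chi(t) = 1$ for $a_1 = a - \kappa/2 \leq t \leq
b + \kappa/2 = b_1$. Then

$$\begin{aligned}
I_{[a_1,b_1]}(H)[\partial_x, H]I_{[a_1,b_1]}(H)
&= \epsilon I_{[a_1,b_1]}(H) + I_{[a_1,b_1]}(H)\,\partial_x V\, I_{[a_1,b_1]}(H) \\
&= \epsilon I_{[a_1,b_1]}(H) + I_{[a_1,b_1]}(H)(\chi(H)\langle x\rangle^{-2})(\langle x\rangle^{2}\partial_x V)I_{[a_1,b_1]}(H).
\end{aligned}$$

Our assumption implies that the multiplication operator $\langle x\rangle^2\partial_x V \in L^\infty$, while
Lemma 7 says that

$$\|\chi(H)\langle x\rangle^{-2}\| \leq C\epsilon^2.$$

Thus

$$I_{[a_1,b_1]}(H)(\chi(H)\langle x\rangle^{-2})(\langle x\rangle^{2}\partial_x V)I_{[a_1,b_1]}(H) \geq -C_1\epsilon^2 I_{[a_1,b_1]}(H)$$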
and with a constant $c_0 > 0$ we deduce

$$I_{[a_1,b_1]}(H)[\partial_x, H]I_{[a_1,b_1]}(H) \geq c_0\epsilon I_{[a_1,b_1]}(H).$$

Then it is well known (see, for instance, [1, 6, 14]) that for $\lambda \in [a, b]$ we get (4.3) and
$H$ has no eigenvalues and no singular continuous spectrum in $[a, b]$.

Remark 3. As we mentioned in Remark 2, for sign-definite rapidly decreasing
potentials the spectrum of the operator $Q$ is formed by an infinite number of eigenvalues
having as points of accumulation the Landau levels $\lambda_n = (2n+1)B$, $n \in \mathbb{N}$. For such
potentials Proposition 3 shows that the embedded eigenvalues of $H$ could appear
only in small neighborhoods of the eigenvalues of $Q$. Since in every interval we may
have only a finite number of eigenvalues of $H$, it is clear that for some eigenvalues
$\mu$ of $Q$ there are no eigenvalues of $H$ in their neighborhoods. Moreover, it was
proved in [12] that for potentials $V \in C_0^\infty(\mathbb{R}^2)$ we have $\sigma(Q) \cap\, ]\lambda_n - B, \lambda_n + B[\, \subset
(\lambda_n - Cn^{-1/2}, \lambda_n + Cn^{-1/2})$, $n \geq N$, with $C > 0$ and $N$ depending only on $\sup|V|$
and the diameter of the support of $V$. Thus for $M$ large the embedded eigenvalues
$\lambda \geq M$ of $H$ are sufficiently close to the Landau levels $\lambda_n$.

5. Estimates for the Derivative of the Spectral Shift Function


First we notice that the assumption (1.4) makes it possible to define the spectral shift
function $\xi(\lambda, \epsilon)$ related to the operators $H_0(\epsilon) = H_0(B, \epsilon)$ and $H(\epsilon) = H_0(B, \epsilon) + V(x, y)$
by the equality

$$\langle \xi'(\cdot, \epsilon), f\rangle = \operatorname{tr}(f(H(\epsilon)) - f(H_0(\epsilon))), \quad f \in C_0^\infty(\mathbb{R}).$$

Here and below we omit the dependence on $B$ in the notation. Our purpose in this
section is to establish Theorem 2. For the proof we need the following

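Proposition 4. Under the assumptions of Theorem 2, for $\lambda_0 \notin \sigma(Q)$ and $1/2 <
s < \min(1/2 + \delta/4, 1)$ the operator

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^n\langle D_x\rangle^{s}$$

is trace class for $z$ in a small complex neighborhood $\Xi \subset \mathbb{C}$ of $\lambda_0$.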

Proof. Before starting the proof, notice that it is easy to establish the statement for
$z \ll 0$ since in this case the operator $(Q - z)^{-1}$ is a pseudodifferential one and we can
apply the calculus of pseudodifferential operators and the criteria which guarantee
that a pseudodifferential operator is trace class (see, for instance, [3, Theorem 9.4]).
For $z \in \mathbb{R}^+ \setminus \sigma(Q)$ this is not the case and $(Q - z)^{-1}$ is a bounded operator but
not a pseudodifferential one. We may replace $(Q - z)^{-1}$ by the pseudodifferential
operator $(Q - i)^{-1}$ modulo bounded operators, but then it is difficult to examine
the product involving many bounded operators and factors $x^k$. To overcome this
difficulty, we are going to apply a convenient decomposition by product of operators
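having in mind that the operator on the left of such a product must be a trace
class one.
First, we treat the case $n = 2$; the general case will be covered by a recurrence.
We start with the analysis of the operator

$$\langle D_x\rangle^{2s}\partial_x V[(Q - z)^{-1}x]^2. \tag{5.1}$$

Our goal is to show that (5.1) is a trace class operator. Write

$$\begin{aligned}
&\langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}\langle x\rangle^{-2}(Q - z)^{-1}x(Q - z)^{-1}x \\
&\quad= \langle D_x\rangle^{2s}(\partial_x V)\langle x\rangle^{2}(Q - z)^{-1}\langle x\rangle^{-2}x(Q - z)^{-1}x \\
&\qquad+ \langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}(Q - z)^{-1}[Q, \langle x\rangle^{-2}](Q - z)^{-1}x(Q - z)^{-1}x \\
&\quad= \langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}(Q - z)^{-2}\bigl[\langle x\rangle^{-2}x^2 + [Q, \langle x\rangle^{-2}x](Q - z)^{-1}x\bigr] \\
&\qquad+ \langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}(Q - z)^{-1}[Q, \langle x\rangle^{-2}](Q - z)^{-1}x(Q - z)^{-1}x = T_1 + T_2.
\end{aligned}$$

To deal with $T_1$, we use the representation

$$T_1 = \langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}(Q - z)^{-2}W_1$$

and we will show that the operator

$$\begin{aligned}
W_1 &= \langle x\rangle^{-2}x^2 + [Q, \langle x\rangle^{-2}x](Q - z)^{-1}x \\
&= \langle x\rangle^{-2}x^2 - i\Bigl((D_x - By)\frac{1 - x^2}{(1 + x^2)^2} + \frac{1 - x^2}{(1 + x^2)^2}(D_x - By)\Bigr)(Q - z)^{-1}x
\end{aligned}$$

is bounded. Consider the operator

$$\begin{aligned}
(D_x - By)\frac{1 - x^2}{(1 + x^2)^2}(Q - z)^{-1}x
&= (D_x - By)\frac{(1 - x^2)x}{(1 + x^2)^2}(Q - i)^{-1}[1 + (z - i)(Q - z)^{-1}] \\
&\quad- (D_x - By)\frac{1 - x^2}{(1 + x^2)^2}(Q - z)^{-1}[Q, x](Q - z)^{-1}.
\end{aligned}$$

The pseudodifferential operator

$$(D_x - By)\frac{(1 - x^2)x}{(1 + x^2)^2}(Q - i)^{-1}$$

is bounded and the product of this operator with $[1 + (z - i)(Q - z)^{-1}]$ is bounded,
too. As in the proof of Lemma 7, we see that $[Q, x](Q - z)^{-1}$ is bounded and with the
same argument we treat the other terms. Thus we conclude that $W_1$ is a bounded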
operator. Next we write

$$T_2 = \langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}(Q - z)^{-2}W_2,$$

where

$$W_2 = [Q, \langle x\rangle^{-2}]x(Q - z)^{-1}x + [Q, [Q, \langle x\rangle^{-2}]](Q - z)^{-1}x(Q - z)^{-1}x = W_{21} + W_{22}.$$

We have

$$W_{21} = 2i\Bigl((D_x - By)\frac{x^2}{(1 + x^2)^2}(Q - z)^{-1}x + \frac{x}{(1 + x^2)^2}(D_x - By)x(Q - z)^{-1}x\Bigr)$$

and as above we deduce that $W_{21}$ is a bounded operator. For the analysis of $W_{22}$,
we write

$$W_{22} = \Bigl\{4(D_x - By)^2\frac{1 - 3x^2}{(1 + x^2)^3} + R_1(x)(D_x - By) + R_2(x) + (4\partial_x V + 8BD_y)\frac{x}{(1 + x^2)^2}\Bigr\}(Q - z)^{-1}x(Q - z)^{-1}x.$$

A simple calculus gives

$$\begin{aligned}
(Q - z)^{-1}x(Q - z)^{-1}x &= (Q - z)^{-1}x^2(Q - z)^{-1} + (Q - z)^{-1}xM_1 \\
&= x^2(Q - z)^{-2} + 4(Q - z)^{-1}x(D_x - By)(Q - z)^{-2} + x(Q - z)^{-1}M_1 + (Q - z)^{-1}M_2 \\
&= x^2(Q - z)^{-2} + 4x(Q - z)^{-1}M_3 + (Q - z)^{-1}M_4 \\
&= x^2(Q - i)^{-2}M_5 + 4x(Q - i)^{-1}M_6 + (Q - i)^{-1}M_7,
\end{aligned}$$

where $M_k$, $k = 1, 2, \ldots$, denote bounded operators. The pseudodifferential calculus
implies that the product of the term in the brackets $\{\,\}$ with $x^j(Q - i)^{-j}$, $j = 1, 2$,
is a bounded operator. Combining this with the above equality, we conclude that
$W_{22}$ is bounded.
Now it remains to see that the operator

$$T = \langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}(Q - z)^{-2}$$

is trace class. For this purpose we replace $(Q - z)^{-2}$ by

$$(Q - i)^{-2}[I + (z - i)(Q - z)^{-1}]^2$$


and consider the pseudodifferential operator

$$\langle D_x\rangle^{2s}\partial_x V\langle x\rangle^{2}(Q - i)^{-2} \tag{5.2}$$

with principal symbol

$$g_s(x, y, \xi, \eta) = \frac{\langle\xi\rangle^{2s}(\partial_x V)(x, y)(1 + x^2)}{((\xi - By)^2 + \eta^2 + V(x, y) - i)^2}.$$

We use the estimate $\langle\xi\rangle^{2s} \leq C\langle\xi - By\rangle^{2s}\langle y\rangle^{2s}$ and we apply Theorem 9.4 in [3] to
deduce that (5.2) is a trace class operator. In fact we have

$$\sum_{|\alpha| \leq 5}\|\partial^\alpha_{x,y,\xi,\eta}g_s\|_{L^1(\mathbb{R}^4)} < \infty$$

since $2s < 2$ guarantees that the integral with respect to $\xi$ is convergent, while $2s <
1 + \delta/2$ and the estimate (1.4) imply that the integral with respect to $y$ is convergent.
Consequently, $T$ is a trace class operator and this completes the analysis of (5.1).
Notice also that the same argument implies that the operator

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^2$$

is trace class.
To prove that the operator $\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^2\langle D_x\rangle^{s}$ is trace class, we commute
the operator $\langle D_x\rangle^{s}$ with $(Q - z)^{-1}x$ and $\partial_x V$ in order to reduce the proof to
that of (5.1). The commutators $[x, \langle D_x\rangle^{s}]$ and $[V, \langle D_x\rangle^{s}]x$ are bounded since $s < 1$.
Next

$$\begin{aligned}
[(Q - z)^{-1}, \langle D_x\rangle^{s}]x &= (Q - z)^{-1}[V, \langle D_x\rangle^{s}](Q - z)^{-1}x \\
&= (Q - z)^{-1}[V, \langle D_x\rangle^{s}](x(Q - z)^{-1} + (Q - z)^{-1}M_1) \\
&= (Q - z)^{-1}M_2
\end{aligned}$$

and we obtain operators which can be handled by the above argument. Thus the
assertion is proved for $n = 2$.
Passing to the general case $n > 2$, assume that the assertion holds for $n =
2, \ldots, k - 1$, and suppose that $V$ satisfies the estimate (1.4) with $n = k$. The idea is
to replace the operator

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^k\langle D_x\rangle^{s}$$

by the trace class operator $\langle D_x\rangle^{s}(\partial_x V)x^k(Q - z)^{-2}\langle D_x\rangle^{s}$ plus a sum of several
operators which are trace class according to the recurrence assumption. Notice that
if $M_j$ is a bounded operator obtained as a product of $(D_x - By)$ and $(Q - z)^{-j}$, $j \geq 1$,
the operator $\langle D_x\rangle^{-s}M_j\langle D_x\rangle^{s}$ becomes a bounded operator and this makes it possible
to exploit the representation

$$\langle D_x\rangle^{s}\partial_x V(Q - z)^{-1}x M_j\langle D_x\rangle^{s} = [\langle D_x\rangle^{s}\partial_x V(Q - z)^{-1}x\langle D_x\rangle^{s}](\langle D_x\rangle^{-s}M_j\langle D_x\rangle^{s}).$$

Thus we reduce the analysis to the trace class property of $\langle D_x\rangle^{s}\partial_x V(Q -
z)^{-1}x\langle D_x\rangle^{s}$. For simplicity of notation we will write $A \sim_t B$ if the
difference $A - B$ is a trace class operator.
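We start with the observation that

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^k\langle D_x\rangle^{s} \sim_t \langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-2}(Q - z)^{-1}x^2(Q - z)^{-1}\langle D_x\rangle^{s}.$$

We can establish this by a recurrence. For $k - 1$ we apply the equality

$$\begin{aligned}
\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-1}\langle D_x\rangle^{s}
&= \langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-3}(Q - z)^{-1}x^2(Q - z)^{-1}\langle D_x\rangle^{s} \\
&\quad- \langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-2}(Q - z)^{-1}[Q, x](Q - z)^{-1}\langle D_x\rangle^{s} \\
&\sim_t \langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-3}(Q - z)^{-1}x^2(Q - z)^{-1}\langle D_x\rangle^{s}.
\end{aligned}$$

Commuting $(Q - z)^{-1}$ and $x^2$, we obtain the result for $k - 1$ and in the same way
we continue for $p \leq k - 1$.
Next we commute $(Q - z)^{-1}$ and $x^2$ and get

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-2}(Q - z)^{-1}x^2(Q - z)^{-1}\langle D_x\rangle^{s} \sim_t \langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-3}(Q - z)^{-1}x^3(Q - z)^{-2}\langle D_x\rangle^{s}.$$

Indeed, $[Q, x^2] = -4i(D_x - By)x + 2 = -4ix(D_x - By) - 2$ yields

$$(Q - z)^{-1}x^2(Q - z)^{-1} = x^2(Q - z)^{-2} + 4i(Q - z)^{-1}x(D_x - By)(Q - z)^{-2} + 2(Q - z)^{-3}$$

and for the term

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-1}(D_x - By)(Q - z)^{-2}\langle D_x\rangle^{s}$$

we use the recurrence assumption and the fact that $M_2 = (D_x - By)(Q - z)^{-1}$ is
a bounded operator. In the same way for $1 \leq j \leq k - 1$ we show that

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-j}(Q - z)^{-1}x^j(Q - z)^{-2}\langle D_x\rangle^{s} \sim_t \langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^{k-j-1}(Q - z)^{-1}x^{j+1}(Q - z)^{-2}\langle D_x\rangle^{s},$$

taking into account the equality

$$[Q, x^j] = -2ij(D_x - By)x^{j-1} + j(j - 1)x^{j-2} = -2ijx^{j-1}(D_x - By) - j(j - 1)x^{j-2}$$

and the recurrence assumption. Finally, we prove that

$$\langle D_x\rangle^{s}\partial_x V[(Q - z)^{-1}x]^k\langle D_x\rangle^{s} \sim_t \langle D_x\rangle^{s}(\partial_x V)x^k(Q - z)^{-2}\langle D_x\rangle^{s}$$

and, as in the proof in the case $n = 2$, we conclude that the operator on the
right-hand side is a trace class one.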

After this preparation we pass to the proof of Theorem 2.


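Proof of Theorem 2. Let $\Omega \subset \mathbb{R}$ be a small neighborhood of $\lambda_0$ such that
$\Omega \cap \sigma(Q) = \emptyset$. For simplicity of notation we will write $H(\epsilon)$, $\xi(\lambda, \epsilon)$ instead of
$H(B, \epsilon)$, $\xi(\lambda; B, \epsilon)$. Given $f \in C_0^\infty(\Omega)$, introduce an almost analytic continuation
$\tilde f \in C_0^\infty(\mathbb{C})$ of $f$ so that $\bar\partial\tilde f(z) = O(|\operatorname{Im} z|^\infty)$ and $\operatorname{supp}\tilde f \cap \sigma(Q) = \emptyset$. Since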
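$(z - Q)^{-1}$ is analytic over the support of $\tilde f(z)$, applying the resolvent equality,
we get

$$\begin{aligned}
\partial_x V f(H(\epsilon)) &= -\frac{1}{\pi}\int \bar\partial\tilde f(z)\,\partial_x V(z - H(\epsilon))^{-1}L(dz) \\
&= (-1)^{n+1}\frac{\epsilon^n}{\pi}\int \bar\partial\tilde f(z)\,\partial_x V[(z - Q)^{-1}x]^n(z - H(\epsilon))^{-1}L(dz). \tag{5.3}
\end{aligned}$$

Taking into account Proposition 4 and the cyclicity of the trace, we get

$$\begin{aligned}
&\operatorname{tr}\int \bar\partial\tilde f(z)\langle D_x\rangle^{-s}[\langle D_x\rangle^{s}\partial_x V[(z - Q)^{-1}x]^n\langle D_x\rangle^{s}]\langle D_x\rangle^{-s}(z - H(\epsilon))^{-1}L(dz) \\
&\quad= \operatorname{tr}\int \bar\partial\tilde f(z)[\langle D_x\rangle^{s}\partial_x V[(z - Q)^{-1}x]^n\langle D_x\rangle^{s}]\langle D_x\rangle^{-s}(z - H(\epsilon))^{-1}\langle D_x\rangle^{-s}L(dz).
\end{aligned}$$

Set $W(z) = \langle D_x\rangle^{s}\partial_x V[(z - Q)^{-1}x]^n\langle D_x\rangle^{s}$ and note that for $z \in \operatorname{supp}\tilde f$ this
operator is trace class and $W(z)$ is analytic. We write

$$\begin{aligned}
&-\frac{1}{\pi}\int \bar\partial\tilde f(z)\operatorname{tr}(\partial_x V[(z - Q)^{-1}x]^n(z - H(\epsilon))^{-1})L(dz) \\
&\quad= \frac{1}{\pi}\lim_{\eta\searrow 0}\Bigl(\int_{\operatorname{Im} z > 0}\bar\partial\tilde f(z + i\eta)\operatorname{tr}[W(z + i\eta)\langle D_x\rangle^{-s}(H(\epsilon) - (z + i\eta))^{-1}\langle D_x\rangle^{-s}]L(dz) \\
&\qquad+ \int_{\operatorname{Im} z < 0}\bar\partial\tilde f(z - i\eta)\operatorname{tr}(W(z - i\eta)\langle D_x\rangle^{-s}(H(\epsilon) - (z - i\eta))^{-1}\langle D_x\rangle^{-s})L(dz)\Bigr).
\end{aligned}$$

Notice that the functions

$$\operatorname{tr}(W(z \pm i\eta)\langle D_x\rangle^{-s}(H(\epsilon) - (z \pm i\eta))^{-1}\langle D_x\rangle^{-s})$$

are analytic in $\pm\operatorname{Im} z > 0$. Applying the Green formula, as in [4, Lemma 1], we deduce

$$\begin{aligned}
\langle \xi'(\lambda, \epsilon), f\rangle &= \operatorname{tr}(f(H(\epsilon)) - f(H_0(\epsilon))) = -\frac{1}{\epsilon}\operatorname{tr}(\partial_x V f(H(\epsilon))) \\
&= \lim_{\eta\searrow 0}\frac{(-1)^n\epsilon^{n-1}}{2\pi i}\int f(\lambda)\operatorname{tr}\bigl(W(\lambda)[\langle D_x\rangle^{-s}((H(\epsilon) - (\lambda + i\eta))^{-1} - (H(\epsilon) - (\lambda - i\eta))^{-1})\langle D_x\rangle^{-s}]\bigr)d\lambda,
\end{aligned}$$

where the integral is taken in the sense of distributions. On the other hand, Proposition 4
combined with (4.3) shows that the right-hand side of the above representation
is finite and has order $O(\epsilon^{n-2})$. Thus for $f \in C_0^\infty(\Omega)$ we obtain

$$\langle \xi'(\lambda, \epsilon), f\rangle = \int f(\lambda)T(\lambda)d\lambda$$

with $T(\lambda) = O(\epsilon^{n-2})$ and this completes the proof.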


Acknowledgments
The authors are grateful to the referees for their thorough and careful reading of the
paper. Their remarks and suggestions led to an improvement of the first version
of this paper.
The second author was partially supported by the ANR project NONAa.

Appendix
The proof of the following lemma is similar to the proof of [5, Proposition 2.1] and
for the reader's convenience we give it.

Lemma 8. Let $\delta > 0$ and let $k_j(x, y) = \langle x\rangle^{-j(1+\delta)}\langle y\rangle^{-j(\frac{1}{2}+\delta)}$, $j = 1, 2$. The operators
$G_2 := k_2(H_0 + i)^{-2}$, $G_2^*$ (respectively, $G_1 := k_1(H_0 + i)^{-1}$, $G_1^*$) are trace class
(respectively, Hilbert–Schmidt).

Proof. Without loss of generality, we may assume that $B = \epsilon = 1$. Introduce
the unitary operator $U : L^2(\mathbb{R}^2) \to L^2(\mathbb{R}^2)$ by

2  
(U u)(x, y) = ei(x,y,x ,y ) u(x , y  )dx dy  ,
R 2

where (x, y, x , y  ) = xy xy  x y + x y  12 y  . A simple calculus shows that

0 = U 1 H0 U = (Dy2 + y 2 ) + x 1 ,
H
 4 
1 1
kj = U kj U = kj x Dy , y + Dx .
2
Since U is unitary, it suces to prove the lemma for G j := U Gj U 1 =
j
kj (H0 + i) .
Let (t) C0 (R; [0, 1]) be a cut-o function such that (t) = 1 for |t| 1 and
2
(t) = 0 for |t| 2. Fix a number k, max{1, 1+2 } < k < 2, and introduce the
symbol
 
y, k
q(x, y, ) = ,
| 2 + y 2 + (x + i)|
where y,  = (1 + y 2 + 2 )1/2 . It clear that q(x, y, ) S 0 (R4(x,,y,) ) and we set
A = q (x, y, Dy ). We decompose
kj (H
0 + i)j = Ak (H
j
0 + i)j + (I A)k (H
j
0 + i)j = Lj + Mj . (A.1)
To treat Lj , notice that on the support of q(x, y, ) we have
( 2 + y 2 + x + i)1 S 0 (R4 ; y, k ).
In fact, on the support of q we obtain
y, k 2| 2 + y 2 + x + i|,
and it is easy to estimate the derivatives of ( 2 + y 2 + x + i)1 . According to the


calculus of pseudodierential operators, Lj becomes a pseudodierentail operator
with symbol in
1
S 0 (R4 ; y, k x j(1+) y + j( 2 +) ),

and the trace norm (respectively, HilbertSchmidt norm) of L2 (respectively L1 )


can be estimated (see, for instance, [3, Proposition 9.2 and Theorem 9.4]) by

L1 2HS + L2 tr C0 y, 2k x 22 y + 12 dxddyd

C0 y, 2k dyd C0 . (A.2)

To deal with Mj , j = 1, 2, we will show that (I A)k2 is trace class operator and
(I A)k1 is HilbertSchmidt one.
Notice that on the support of the symbol of (I A) we have

y, k | 2 + y 2 + x + i|.
1
Taking into account the estimate xl ym kj (x, y) = Ol,m (xj(1+) yj( 2 +) ),
we get

(I A)k1 2HS + (I A)k2 tr



C1 x 22 y + 12 dxddyd
y,
k | 2 +y 2 +x+i|

C2 x 22 dxdyd
y,
k | 2 +y 2 +x+i|

C2 u22 dudyd
y,
k | 2 +y 2 ++u+i|

C2 y,
k | 2 + y 2 + + u|,
u22 dudyd
|u| 12 y,
k

+ C2 u22 dudyd
y,
k | 2 + y 2 + + u|,
|u| 12 y,
k
  
C2 u22 dudyd
|u|C3 ,|y|C3 ,||C3


22
+ u dudyd
|u| 12 y,
k

  1
(2|u|) k
C4 + C5 u22 rdr du
0


C4 + C6 u22+2/k du C7 , (A.3)

since 2 2 + 2/k < 1.


Using (A.1)–(A.3) and the fact that $A$ is a trace class (respectively Hilbert–Schmidt)
operator if and only if $A^*$ is a trace class (respectively Hilbert–Schmidt)
operator, we complete the proof of the lemma.

References
[1] W. O. Amrein, A. M. Boutet de Monvel and V. Georgescu, $C_0$-Groups, Commutator
Methods and Spectral Theory of N-Body Hamiltonians, Progress in Mathematics,
Vol. 135 (Birkhäuser-Verlag, Basel, 1996).
[2] F. Bentosela, R. Carmona, P. Duclos, B. Simon, B. Souillard and R. Weder,
Schrödinger operators with an electric field and random or deterministic potentials,
Comm. Math. Phys. 88 (1983) 387–397.
[3] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in Semiclassical Limit, London
Mathematical Society, Lecture Notes Series, Vol. 268 (Cambridge University Press,
1999).
[4] M. Dimassi and V. Petkov, Spectral shift function and resonances for
non-semibounded and Stark Hamiltonians, J. Math. Pures Appl. 82 (2003) 1303–1342.
[5] M. Dimassi and V. Petkov, Resonances for magnetic Stark hamiltonians in two
dimensional case, Int. Math. Res. Not. 77 (2004) 4147–4179.
[6] C. Gérard, A proof of the abstract limiting absorption principle by energy estimates,
J. Funct. Anal. 254 (2008) 2707–2724.
[7] C. Ferrari and H. Kovarik, Resonances width in crossed electric and magnetic fields,
J. Phys. A Math. Gen. 37 (2004) 7671–7697.
[8] C. Ferrari and H. Kovarik, On the exponential decay of magnetic Stark resonances,
Rep. Math. Phys. 56 (2005) 197–207.
[9] V. Ivrii, Analysis and Precise Spectral Asymptotics, Springer Monographs in
Mathematics (Springer, Berlin, 1998).
[10] M. Klein, D. Robert and X. P. Wang, Breit–Wigner formula for the scattering phase
in the Stark effect, Comm. Math. Phys. 131(1) (1990) 109–124.
[11] M. G. Krein, On the trace formula in perturbation theory, Mat. Sb. 33 (1953) 597–626
(in Russian).
[12] E. Korotyaev and A. Pushnitski, A trace formula and high energy spectral
asymptotics for the perturbed Landau Hamiltonian, J. Funct. Anal. 217 (2004) 221–248.
[13] M. Melgaard and G. Rosenblum, Eigenvalue asymptotics for weakly perturbed Dirac
and Schrödinger operators with constant magnetic fields of full rank, Comm. Partial
Differential Equations 28 (2003) 697–736.
[14] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators,
Comm. Math. Phys. 78(3) (1981) 391–408.
[15] G. Raikov and S. Warzel, Quasi-classical versus non-classical spectral asymptotics
for magnetic Schrödinger operators with decreasing electric potentials, Rev. Math.
Phys. 14 (2002) 1051–1072.
[16] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV, Analysis of
Operators (Academic Press, New York, 1978).
[17] D. Robert and X. P. Wang, Existence of time-delay operators for Stark Hamiltonians,
Comm. Partial Differential Equations 14 (1989) 63–98.
[18] D. Robert and X. P. Wang, Time-delay and spectral density for Stark Hamiltonians.
II. Asymptotics of trace formulae, Chinese Ann. Math. Ser. B 12(3) (1991) 358–383.
[19] X. P. Wang, Weak coupling asymptotics of Schrödinger operators with Stark effect,
in Harmonic Analysis, Lecture Notes in Math., Vol. 1494 (Springer, Berlin, 1991),
pp. 185–195.
[20] D. Yafaev, Mathematical Scattering Theory (Amer. Math. Society, Providence, RI,
1992).
May 11, 2010 10:6 WSPC/S0129-055X 148-RMP
J070-S0129055X10003990

Reviews in Mathematical Physics


Vol. 22, No. 4 (2010) 381–430
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003990

THE LOCALLY COVARIANT DIRAC FIELD

KO SANDERS
Institute of Theoretical Physics, University of Göttingen,
Friedrich-Hund-Platz 1, D-37077 Göttingen, Germany
and
Courant Research Center,
Higher Order Structures in Mathematics,
University of Göttingen, Germany
jacobus.sanders@theorie.physik.uni-goe.de

Received 25 November 2009


Revised 1 March 2010

We describe the free Dirac field in a four-dimensional spacetime as a locally covariant
quantum field theory in the sense of Brunetti, Fredenhagen and Verch, using a
representation independent construction. The freedom in the geometric constructions involved
can be encoded in terms of the cohomology of the category of spin spacetimes. If we
restrict ourselves to the observable algebra, the cohomological obstructions vanish and
the theory is unique. We establish some basic properties of the theory and discuss the
class of Hadamard states, filling some technical gaps in the literature. Finally, we show
that the relative Cauchy evolution yields commutators with the stress-energy-momentum
tensor, as in the scalar field case.

Keywords: Quantum field theory; curved spacetime; Dirac field.

Mathematics Subject Classification 2010: 81T20

1. Introduction
Quantum field theory in curved spacetime is relevant for several purposes, such as
the construction of cosmological models and to obtain a better understanding of
quantum field theory in Minkowski spacetime. In order to achieve these goals in
a more realistic setting, it is important to go beyond the well-studied free scalar
field. In this paper, we will present a proof, already contained in [1], of the fact
that the free Dirac field in a four-dimensional globally hyperbolic spacetime can be
described as a locally covariant quantum field theory in the sense of [2].
Our presentation of the Dirac field is representation independent and we emphasize
categorical methods throughout in order to point out an interesting problem
concerning the uniqueness of the theory. The obstruction for the definition of a unique
theory can be formulated in terms of the cohomology of the category of spacetimes
with a spin structure, in particular its first Stiefel–Whitney class. It seems difficult
to compute this class for a category, but we will show that a unique theory


can always be obtained by restriction to the observable algebras generated by even
polynomials in the field, in which case the cohomological obstructions vanish.
Hadamard states can be defined in terms of a series expansion of their two-point
distribution, detailing their local singularity structure. Alternatively, they can be
characterized by a microlocal condition. The equivalence of these two definitions
has been investigated by several authors using different techniques of proof, but
in our opinion none of these arguments has been fully convincing. In our discussion,
we hope to close any remaining gaps in the different proofs and establish the
equivalence on firm ground.
We also compute the relative Cauchy evolution of this field and obtain commutators
with the stress-energy-momentum tensor, in complete analogy with the
scalar field case ([2]). For this, we use a point-splitting procedure to renormalize
the stress-energy-momentum tensor. Because we only need commutators with this
tensor we do not need to treat the so-called trace anomaly, a finite multiple of the
identity operator, in detail. We refer the interested reader to [3], who also construct
the extended algebra of Wick powers, relevant for perturbation theory. A
Spin-Statistics Theorem in a generally covariant framework may be found in [4].
The contents of this paper are organized as follows. In Sec. 2, we review some of
the mathematical background material that we need in order to describe the Dirac
field. This includes first of all the Dirac algebra and the Spin group, followed by
a categorical formulation of some of the differential geometry that we will need.
In Sec. 3, we describe the classical free Dirac field, starting with the geometric
and algebraic aspects in Secs. 3.1 and 3.2 and the equations of motion and their
fundamental solutions in Sec. 3.3. We discuss the uniqueness of the functorial
constructions and their cohomological obstructions in Sec. 3.4. We then proceed to the
quantum Dirac field in Sec. 4. In Sec. 4.1, we quantize the classical Dirac field in a
local and covariant way and collect some of its basic properties. Section 4.2 deals
with Hadamard states and includes a discussion of the existing results concerning
the equivalence of the microlocal and the series expansion definitions. For this purpose
we also refer to Appendix A, which contains several relevant and useful (but
expected) results in microlocal analysis. Section 4.3 contains our discussion of the
relative Cauchy evolution of the free Dirac field, obtaining commutators with the
stress-energy-momentum tensor, but the proof of our main result there is deferred
to Appendix B, because it consists of rather involved computations. Finally we end
with some conclusions.
Our presentation of locally covariant quantum field theory is based on the original
[2] and on [5]. For the Dirac field in curved spacetime, we largely follow [6, 7],
as well as our earlier [1]. For results on Clifford algebras, we refer to [8] (see also [9]
for a short review).

2. Mathematical Preliminaries
To prepare for our discussion of the locally covariant Dirac field, we present in the
current section some mathematical preliminaries concerning the Dirac algebra, the

Spin group and a categorical formulation of relevant aspects of differential geometry.
These merely serve to fix our notation and set the scene for the subsequent sections.
We also point out the relations with some other definitions and conventions in the
literature.

2.1. The Dirac algebra and the Spin group


The Spin group can be embedded in the Clifford algebra of Minkowski spacetime,
which we call the Dirac algebra. Therefore, we will first briefly recall some results
on Clifford algebras, for which we refer to [8] (note the difference in sign convention
in the Clifford multiplication).
Let $\mathbb{R}^{r,s}$ be a finite dimensional real vector space with dimension $n = r + s$
and with a non-degenerate bilinear form $g_{ab}$ which has $r$ positive and $s$ negative
eigenvalues. The Clifford algebra $Cl_{r,s}$ is defined as the $\mathbb{R}$-linear associative algebra
generated by a unit element $I$ and an orthonormal basis $e_a$ of $\mathbb{R}^{r,n-r}$ subject to the
relations:

$$e_a e_b + e_b e_a = 2g_{ab}I.$$

This definition is independent of the choice of basis. We may identify $\mathbb{R}^{r,s} \subset Cl_{r,s}$
as the subspace of monomials in the basis $e_a$ of degree one. The even, respectively
odd, subspace of this Clifford algebra is the one spanned by monomials of even,
respectively odd, degree in the basis vectors and is denoted by $Cl^0_{r,s}$, respectively
$Cl^1_{r,s}$. Note that the even subspace is also a subalgebra. In the following we will be
especially interested in Minkowski spacetime, $M_0 := \mathbb{R}^{1,3}$, where the bilinear form is
$\eta = \operatorname{diag}(1, -1, -1, -1)$ and where we choose an orthonormal basis $g_a$, $a = 0, 1, 2, 3$,
with $\|g_0\|^2 = 1$, $\|\cdot\|^2$ denoting the Minkowski pseudo-norm squared. The associated
Clifford algebra is called the Dirac algebra $D := Cl_{1,3}$ and it is characterized by

$$g_a g_b + g_b g_a = 2\eta_{ab}I. \tag{1}$$

As a vector space, the Clifford algebra is naturally isomorphic to the exterior
algebra. This motivates the term volume form for the element $g_5 := g_0 g_1 g_2 g_3$ (or
in general $e := e_1 \cdots e_{r+s}$). Note the following properties:

Lemma 2.1. We have $g_5^2 = -I$ and $g_5 v g_5^{-1} = -v$ for all $v \in M_0$. More generally,
if $u \in M_0$ has $u^2 = \|u\|^2 I \neq 0$, then $u^{-1} = \frac{1}{\|u\|^2}u$ and $v \mapsto -uvu^{-1}$ defines a
reflection of $M_0$ in the hyperplane perpendicular to $u$.

Proof. These equalities follow directly from Eq. (1). For the last claim, e.g., we
compute:

$$-uvu^{-1} = v - (uv + vu)u^{-1} = v - \frac{2\langle u, v\rangle}{\|u\|^2}u \in M_0.$$
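
Lemma 2.1 can be checked numerically in a matrix representation. The sketch below is an illustration only: it assumes the chiral Dirac matrices with $\gamma_i$ having $-\sigma_i$ in the upper-right block (one admissible sign choice satisfying Eq. (1)), and the test vectors `U` and `V` are arbitrary.

```python
# Numerical check of Lemma 2.1 in an assumed 4x4 representation of the Dirac algebra.
ETA = [1, -1, -1, -1]                      # metric diag(1, -1, -1, -1)
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def mm(*ms):
    """Product of 4x4 matrices, taken left to right."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def dist(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# Assumed sign convention for the spatial matrices (an assumption of this sketch).
G = [block(Z2, I2, I2, Z2)] + [block(Z2, scale(-1, s), s, Z2) for s in PAULI]
IDENT = block(I2, Z2, Z2, I2)

def slash(v):
    """Embed a vector with components v^a as the Clifford element v^a g_a."""
    return [[sum(v[a] * G[a][i][j] for a in range(4)) for j in range(4)]
            for i in range(4)]

def inner(u, v):
    """Minkowski inner product <u, v>."""
    return sum(ETA[a] * u[a] * v[a] for a in range(4))

G5 = mm(G[0], G[1], G[2], G[3])
U = [2.0, 1.0, 0.0, -1.0]                  # arbitrary non-null vector, <U, U> = 2
V = [0.5, -1.0, 3.0, 2.0]                  # arbitrary vector

g5_square_error = dist(mm(G5, G5), scale(-1, IDENT))
# g5^{-1} = -g5 because g5^2 = -I
g5_conj_error = dist(mm(G5, slash(V), scale(-1, G5)), scale(-1, slash(V)))

uu = inner(U, U)
u_inv = slash([c / uu for c in U])         # u^{-1} = u / ||u||^2
coef = 2 * inner(U, V) / uu
reflected = [V[a] - coef * U[a] for a in range(4)]
reflection_error = dist(scale(-1, mm(slash(U), slash(V), u_inv)), slash(reflected))
norm_change = abs(inner(reflected, reflected) - inner(V, V))
```

All four quantities should be zero up to rounding; `norm_change` confirms that the reflection preserves the Minkowski pseudo-norm.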
Standard arguments with Clifford algebras [8] give:

$$D = Cl_{1,3} \simeq Cl^0_{1,4} \simeq Cl^0_{4,1}, \quad Cl_{4,1} \simeq M(4, \mathbb{C}),$$

where $M(4, \mathbb{C})$ denotes the algebra of complex $(4 \times 4)$-matrices. In fact, $Cl_{4,1}$ is
generated by the generators $g_a$ of $D$ together with a central element $\omega$, corresponding
to $iI \in M(4, \mathbb{C})$. Hence:

$$M(4, \mathbb{C}) \simeq \mathbb{C} \otimes_{\mathbb{R}} D. \tag{2}$$

This also implies that the center of $D$ is spanned by $I$ (over $\mathbb{R}$). The following
Fundamental Theorem provides all the essential information we need on the Dirac
algebra (for an elementary algebraic proof, we refer to Pauli [10]):

Theorem 2.2 (Fundamental Theorem). The Dirac algebra $D$ is simple and
has a unique irreducible complex representation (i.e. an $\mathbb{R}$-linear representation
$\pi : D \to M(n, \mathbb{C})$), up to equivalence. This is the representation $\pi_0 : D \to M(4, \mathbb{C})$
determined by $\pi_0(g_a) = \gamma_a$ with the Dirac matrices

$$\gamma_0 := \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}, \quad \gamma_i := \begin{pmatrix} 0 & -\sigma_i \\ \sigma_i & 0 \end{pmatrix},$$

where $\sigma_i$ are the Pauli matrices $\sigma_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_2 := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$ and $\sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. The
equivalence with another irreducible complex representation $\pi$ of $D$ is implemented
by $\pi(S) = L\pi_0(S)L^{-1}$ for all $S \in D$, where $L \in GL(4, \mathbb{C})$ is unique up to a non-zero
complex factor.
Consequently, for every set of matrices $\gamma'_a \in M(4, \mathbb{C})$ satisfying Eq. (1) there
is an $L \in GL(4, \mathbb{C})$, unique up to a non-zero complex constant, such that $\gamma'_a =
L\gamma_a L^{-1}$.

Proof. One can show [8] that $D \simeq M(2, \mathbb{H})$, where $\mathbb{H}$ is the skew field of quaternions.
This algebra is simple, because it is a full matrix algebra. The given matrices
$\gamma_a$ satisfy the Clifford relations (1) and therefore extend to a representation of $D$
in $M(4, \mathbb{C})$.
Any complex representation $\pi : D \to M(n, \mathbb{C})$ extends to a complex representation
of $M(4, \mathbb{C})$, using Eq. (2) and the trivial center of $D$, which is irreducible if $\pi$
is irreducible. As $M(4, \mathbb{C})$ has only one irreducible representation up to equivalence
(see [11]), namely the defining one on $\mathbb{C}^4$, this determines $\pi$ up to equivalence, as
stated. If $K, L \in GL(4, \mathbb{C})$ are two matrices which implement the same equivalence,
then $KL^{-1}$ commutes with $D$ and hence $K = cL$, where $c \in \mathbb{C}$ is non-zero because
$K$ is invertible. Note that $\pi'(g_a) := \gamma'_a$ extends to a complex representation of $D$
in $M(4, \mathbb{C})$ which is faithful (as $D$ is simple). The last statement then follows from
the previous one.
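
As a concrete illustration of Theorem 2.2, one can verify that a set of Dirac matrices satisfies the Clifford relations (1) and that conjugation by an invertible matrix $L$ produces another such set. This is a sketch under assumptions: the sign placement in the spatial matrices and the diagonal matrix `D_DIAG` playing the role of $L$ are choices made here for illustration.

```python
# Check the Clifford relations (1) for an assumed set of Dirac matrices
# and for a conjugated set L gamma_a L^{-1}.
ETA = [1, -1, -1, -1]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

def mm(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Assumed convention: gamma_0 = [[0, I], [I, 0]], gamma_i = [[0, -sigma_i], [sigma_i, 0]].
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, neg(s), s, Z2) for s in PAULI]

def clifford_defect(gammas):
    """Largest entry of gamma_a gamma_b + gamma_b gamma_a - 2 eta_ab I over all a, b."""
    worst = 0.0
    for a in range(4):
        for b in range(4):
            ab, ba = mm(gammas[a], gammas[b]), mm(gammas[b], gammas[a])
            for i in range(4):
                for j in range(4):
                    target = 2 * ETA[a] if (a == b and i == j) else 0
                    worst = max(worst, abs(ab[i][j] + ba[i][j] - target))
    return worst

# Hypothetical invertible L = diag(1, 2, 4, 8); (L g L^{-1})_{ij} = d_i g_ij / d_j.
D_DIAG = [1.0, 2.0, 4.0, 8.0]
CONJUGATED = [[[D_DIAG[i] * g[i][j] / D_DIAG[j] for j in range(4)] for i in range(4)]
              for g in GAMMA]

defect_original = clifford_defect(GAMMA)
defect_conjugated = clifford_defect(CONJUGATED)
```

Both defects should vanish. The converse direction of the theorem, that every solution of (1) arises by such a conjugation, is not something a finite check of this kind can establish.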

For notational convenience, we define $\gamma_5 := \pi_0(g_5)$.
We can define a determinant and trace function on $D$ by $\det S = \det \pi(S)$ and
$\operatorname{Tr}(S) = \operatorname{Tr}(\pi(S))$ for all $S \in D$, where $\pi$ is any irreducible complex representation
of $D$. This is well-defined by the Fundamental Theorem. The following lemma is
often useful in computations:

Lemma 2.3. We have $\operatorname{Tr}(g_a g_b) = 4\eta_{ab}$ and $\operatorname{Tr}([g_b, g_c]g_d g_a) = 8(\eta_{cd}\eta_{ba} - \eta_{bd}\eta_{ca})$.

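Proof. Using the cyclicity of the trace and Eq. (1) we find: $\operatorname{Tr}(g_a g_b) = \frac{1}{2}\operatorname{Tr}(g_a g_b +
g_b g_a) = \operatorname{Tr}(\eta_{ab}I) = 4\eta_{ab}$ and

$$\begin{aligned}
\operatorname{Tr}([g_b, g_c]g_d g_a) &= \operatorname{Tr}(g_b[g_c, g_d g_a]) = \operatorname{Tr}(g_b\{g_c, g_d\}g_a - g_b g_d\{g_c, g_a\}) \\
&= 2\operatorname{Tr}(\eta_{cd}g_b g_a - g_b g_d\eta_{ca}) = 8(\eta_{cd}\eta_{ba} - \eta_{bd}\eta_{ca}).
\end{aligned}$$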
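
The trace identities of Lemma 2.3 can be confirmed by brute force over all index combinations. The sketch below assumes the same chiral Dirac matrices as an illustrative representation; by the Fundamental Theorem the traces do not depend on this choice.

```python
# Brute-force check of Lemma 2.3 in an assumed 4x4 representation.
ETA = [1, -1, -1, -1]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

def mm(*ms):
    """Product of 4x4 matrices, taken left to right."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def tr(m):
    return sum(m[i][i] for i in range(4))

def eta(a, b):
    """Components eta_ab of the metric diag(1, -1, -1, -1)."""
    return ETA[a] if a == b else 0

def comm(a, b):
    """Matrix commutator [a, b]."""
    ab, ba = mm(a, b), mm(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

# Assumed sign convention for the spatial matrices.
G = [block(Z2, I2, I2, Z2)] + [block(Z2, neg(s), s, Z2) for s in PAULI]
R4 = range(4)

# Tr(g_a g_b) = 4 eta_ab
trace2_error = max(abs(tr(mm(G[a], G[b])) - 4 * eta(a, b)) for a in R4 for b in R4)

# Tr([g_b, g_c] g_d g_a) = 8 (eta_cd eta_ba - eta_bd eta_ca)
trace4_error = max(
    abs(tr(mm(comm(G[b], G[c]), G[d], G[a]))
        - 8 * (eta(c, d) * eta(b, a) - eta(b, d) * eta(c, a)))
    for a in R4 for b in R4 for c in R4 for d in R4
)
```

Both errors should be exactly zero here, since all matrix entries are integers or Gaussian integers.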

We now turn to the Spin group, which is the universal covering group of the
special Lorentz group, a double covering which can be constructed in an elegant
way inside the Dirac algebra.

Definition 2.4. The Pin and Spin groups of $Cl_{r,s}$ are defined as

$$\begin{aligned}
Pin_{r,s} &:= \{S \in Cl_{r,s} \mid S = u_1 \cdots u_k,\ u_i \in \mathbb{R}^{r,s},\ u_i^2 = \pm I\}, \\
Spin_{r,s} &:= Pin_{r,s} \cap Cl^0_{r,s}.
\end{aligned}$$

We let $Spin^0_{1,3}$ denote the connected component of $Spin_{1,3}$ which contains the
identity.
We also define the Lorentz group $\mathcal{L} := O_{1,3}$, the special Lorentz group $\mathcal{L}_+ :=
SO_{1,3}$ and the special orthochronous Lorentz group $\mathcal{L}_+^\uparrow := SO^0_{1,3}$, which is the
connected component of $\mathcal{L}_+$ containing the identity.
The special orthochronous Lorentz group preserves the orientation and
time-orientation. For $S \in Pin_{1,3}$ the map $v \mapsto SvS^{-1}$ on $M_0$ is a product of reflections
(up to a sign) by Lemma 2.1. Together with the fact that $\det u = \|u\|^4$ for all
$u \in M_0$ this gives rise to another useful characterisation of the group $Pin_{1,3}$, which
we shall not prove^a:

Proposition 2.5. $Pin_{1,3} = \{S \in D \mid \det S = 1,\ \forall v \in M_0:\ SvS^{-1} \in M_0\}$.


It can be seen from Proposition 2.5 that $Pin_{1,3}$ and $Spin_{1,3}$ are indeed Lie
groups. For the universal covering homomorphism $\Lambda$ between $Pin_{1,3}$ and the
Lorentz group, we have the following formulae^{b,c}:

Proposition 2.6. The map $\Lambda : Pin_{1,3} \to \mathcal{L}$ defined by $S \mapsto \Lambda^a{}_b(S) \in M(4, \mathbb{R})$
such that $Sg_bS^{-1} = g_a\Lambda^a{}_b(S)$ is the universal covering homomorphism of Lie
groups, which restricts to the universal covering homomorphism $Spin^0_{1,3} \to \mathcal{L}_+^\uparrow$. We
have $\Lambda^a{}_b(S) = \frac{1}{4}\operatorname{Tr}(g^aSg_bS^{-1})$ and the inverse of the derivative $d\Lambda : \mathfrak{spin}^0_{1,3} \to \mathfrak{l}_+^\uparrow$ at

^a The definition of the Spin group in [12] corresponds to our group $Pin_{1,3}$. In [6, 7] one uses the
term Spin group for the group
$\mathcal{S} := \{S \in M(4, \mathbb{C}) \mid \det S = 1,\ SvS^{-1} \in M_0 \text{ for all } v \in M_0\}$.
Note that this group cannot give a double covering of the Lorentz group, as claimed in [6] (but
not in [7]), because for any $S \in \mathcal{S}$ the matrices $iS$, $-S$, $-iS$ are in $\mathcal{S}$ too. Its usefulness is based
on its simple definition and the fact that $\mathcal{S}^0 = Spin^0_{1,3}$.
^b These results are well known, but we record them for definiteness to correct a sign error in the
spin connection (5) that has occurred in [6, 7, 13].
^c Lower case Latin indices are raised and lowered with $\eta^{ab}$, respectively $\eta_{ab}$, throughout.

$S = I$ is given by:
$$(d\Lambda)^{-1}(\lambda^b{}_a) = \frac14 \lambda^b{}_a\, g_b g^a.$$

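Proof. For the first sentence we refer to [8, Theorem 2.10] and subsequent remarks. Using the Clifford relations (1), we see that
$$\Lambda^a{}_b(S) = \frac14 \eta^{ac}\, \mathrm{Tr}(\eta_{cd} \Lambda^d{}_b(S) I) = \frac18 \eta^{ac}\, \mathrm{Tr}((g_c g_d + g_d g_c) \Lambda^d{}_b(S))$$
$$= \frac14 \eta^{ac}\, \mathrm{Tr}(g_c g_d \Lambda^d{}_b(S)) = \frac14 \mathrm{Tr}(g^a S g_b S^{-1}).$$
Expanding $\Lambda(S + \epsilon s + O(\epsilon^2))$ around $S = I$ up to second order in $\epsilon$ we find $d\Lambda(s)^a{}_b = \frac14 \mathrm{Tr}([g_b, g^a] s)$. We check that $L(\lambda^b{}_a) := \frac14 \lambda^b{}_a g_b g^a$ is an inverse of $d\Lambda$:
$$d\Lambda(L(\lambda^d{}_e))^a{}_b = \frac{1}{16} \eta^{ac}\eta^{ef} \lambda^d{}_e\, \mathrm{Tr}([g_b, g_c] g_d g_f) = \frac12 \eta^{ac}\eta^{ef} \lambda^d{}_e (\eta_{cd}\eta_{bf} - \eta_{bd}\eta_{cf})$$
$$= \frac12 (\lambda^a{}_b - \eta^{ae}\eta_{bd}\lambda^d{}_e) = \lambda^a{}_b,$$
where we used Lemma 2.3 and the symmetry properties of $\lambda^d{}_e \in \mathfrak{l}_+^\uparrow$ in the last line.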
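The formulae of Proposition 2.6 lend themselves to a numerical sanity check. The sketch below assumes a chiral representation of the $g_a$ and uses a boost in the 0-1 plane as a sample element of the Spin group; the helper names (`covering_map`, `lift`, `d_covering_map`, `boost_spinor`) are hypothetical and only mirror $\Lambda$, $L$ and $d\Lambda$ from the text.

```python
import numpy as np

# Assumed representation: chiral gamma matrices with {g_a, g_b} = 2 eta_ab I.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in PAULI
]
ETA = np.diag([1.0, -1.0, -1.0, -1.0])
GAMMA_UP = [ETA[a, a] * GAMMA[a] for a in range(4)]  # g^a = eta^{ab} g_b


def covering_map(S):
    """Lambda^a_b(S) = 1/4 Tr(g^a S g_b S^{-1}); rows are a, columns are b."""
    S_inv = np.linalg.inv(S)
    lam = np.empty((4, 4))
    for a in range(4):
        for b in range(4):
            lam[a, b] = (np.trace(GAMMA_UP[a] @ S @ GAMMA[b] @ S_inv) / 4).real
    return lam


def lift(lam):
    """L(lambda) = 1/4 lambda^b_a g_b g^a, where lam[b, a] = lambda^b_a."""
    return sum(
        lam[b, a] * GAMMA[b] @ GAMMA_UP[a] for a in range(4) for b in range(4)
    ) / 4


def d_covering_map(s):
    """dLambda(s)^a_b = 1/4 Tr([g_b, g^a] s)."""
    out = np.empty((4, 4), dtype=complex)
    for a in range(4):
        for b in range(4):
            comm = GAMMA[b] @ GAMMA_UP[a] - GAMMA_UP[a] @ GAMMA[b]
            out[a, b] = np.trace(comm @ s) / 4
    return out


def boost_spinor(theta):
    """exp(theta/2 * g_0 g_1), using (g_0 g_1)^2 = I."""
    return np.cosh(theta / 2) * np.eye(4) + np.sinh(theta / 2) * GAMMA[0] @ GAMMA[1]


def intertwines(S):
    """Check S g_b S^{-1} = g_a Lambda^a_b(S) for all b."""
    lam = covering_map(S)
    S_inv = np.linalg.inv(S)
    return all(
        np.allclose(S @ GAMMA[b] @ S_inv, sum(lam[a, b] * GAMMA[a] for a in range(4)))
        for b in range(4)
    )
```

A typical use is to confirm that `covering_map(boost_spinor(theta))` preserves `ETA` and that `d_covering_map(lift(lam))` returns `lam` for a generator `lam = ETA @ W` with `W` antisymmetric, which is the statement that $L$ inverts $d\Lambda$.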

2.2. Some category theory and differential geometry


The language of locally covariant quantum field theory uses category theory to express the physical ideas of locality and covariance. Any object or construction that is extended from a single spacetime (usually Minkowski spacetime) to the categorical framework gets the adjective "locally covariant". The essence of local covariance seems to have a geometric origin and, because the Dirac field in curved spacetimes involves a substantial amount of geometric constructions, it will be convenient to present the relevant differential geometry in a categorical setting here. We refrain from the urge to call this "locally covariant differential geometry", which appears to be a pleonasm.
A category $C$ consists of a set of objects $c$ and a set of morphisms or arrows^d $\phi\colon c_1 \to c_2$ between objects of $C$, such that the composition of morphisms, when defined, is associative and each object admits an identity morphism (we refer to [14] for more details). A (covariant) functor $F\colon C \to B$ is a map between categories, which maps objects $c$ to objects $F(c)$ and morphisms $\phi\colon c_1 \to c_2$ to morphisms $F(\phi)\colon F(c_1) \to F(c_2)$ such that an identity morphism maps to an identity morphism and the composition of morphisms is preserved. A contravariant functor $F\colon C \to B$ is defined similarly, but reverses the direction of the morphisms: $F(\phi)\colon F(c_2) \to F(c_1)$. A natural transformation $t\colon F \Rightarrow G$ between covariant functors $F\colon C \to B$ and $G\colon C \to B$ is a map which assigns to each object $c$ a morphism $t(c)$ of $B$, called the component of $t$ at $c$, such that for every morphism $\phi\colon c_1 \to c_2$

d It is very often convenient to depict the morphisms in a diagram as arrows between objects.

of $C$ we have $t(c_2) \circ F(\phi) = G(\phi) \circ t(c_1)$, which can be depicted as a commutative diagram. When a natural transformation $t$ admits another natural transformation $s$ such that $t(c) \circ s(c) = \mathrm{id}_c = s(c) \circ t(c)$ for all objects $c$, then $t$ is called a natural equivalence. In this case, we write $t\colon F \Leftrightarrow G$. A natural transformation between contravariant functors or between a covariant and a contravariant functor is defined similarly, except that some arrows in the commutative diagram are reversed.
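The naturality condition $t(c_2) \circ F(\phi) = G(\phi) \circ t(c_1)$ can be made concrete in a toy setting. The sketch below is an illustration only, unrelated to the bundle categories used later: it takes Python values and functions as objects and morphisms, the list construction as $F$, an "optional value" construction as $G$, and `safe_head` as the natural transformation. All names are hypothetical.

```python
def list_functor(phi):
    """F(phi): apply phi to every element of a list."""
    return lambda xs: [phi(x) for x in xs]


def maybe_functor(phi):
    """G(phi): apply phi to a value, passing None through unchanged."""
    return lambda x: None if x is None else phi(x)


def safe_head(xs):
    """Component t(c): first element of a list, or None if it is empty."""
    return xs[0] if xs else None


def naturality_square_commutes(phi, samples):
    """Check t(c2) . F(phi) == G(phi) . t(c1) on the given sample lists."""
    return all(
        safe_head(list_functor(phi)(xs)) == maybe_functor(phi)(safe_head(xs))
        for xs in samples
    )


print(naturality_square_commutes(lambda n: n * n, [[], [3], [1, 2, 3]]))
```

The check only samples finitely many inputs, so it illustrates the commutative square rather than proving naturality; the sample lists are assumed not to contain `None` as an element.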
A subcategory $B$ of $C$ consists of a subset of the objects of $C$ and a subset of its morphisms in such a way that $B$ still satisfies the axioms of a category. In our case, all categories will be concrete, i.e. the objects will be sets with a certain structure and the morphisms will be maps between sets. The identity morphism will always be the identity map and the composition of maps, when defined, is automatically associative. In short, our categories will be subcategories of the category Set, whose objects are sets^e and whose morphisms are maps.
For our discussion of differential geometry we start with the following

Definition 2.7. The category $\mathrm{Man}^n$ of smooth manifolds is the category whose objects are $C^\infty$ manifolds $\mathcal{M}$ of (finite) dimension $n$ and whose morphisms are $C^\infty$ embeddings $\mu\colon \mathcal{M}_1 \to \mathcal{M}_2$.
The category $\mathrm{Bund}'$ of fiber bundles is the category whose objects are smooth fiber bundles $p\colon B \to \mathcal{M}$ over objects $\mathcal{M}$ of $\mathrm{Man}^n$ with bundle projection map $p$, and whose morphisms are $C^\infty$ maps $\beta\colon B_1 \to B_2$ covering a morphism $\mu\colon \mathcal{M}_1 \to \mathcal{M}_2$ of $\mathrm{Man}^n$, i.e. such that $p_2 \circ \beta = \mu \circ p_1$. We denote by $\mathrm{Bund}$ the subcategory whose morphisms restrict to isomorphisms of the fibers.
The categories $\mathrm{VBund}'_{\mathbb{R}}$, respectively $\mathrm{VBund}'_{\mathbb{C}}$, of real (complex) vector bundles are the subcategories of $\mathrm{Bund}'$ whose objects $V$ are real (complex) vector bundles and whose morphisms $\beta\colon V_1 \to V_2$ are real (complex) linear maps of the fibers. Again we denote by $\mathrm{VBund}_{\mathbb{R}}$ and $\mathrm{VBund}_{\mathbb{C}}$ the subcategories whose morphisms restrict to isomorphisms of the fibers.

We could have taken all smooth maps between manifolds as morphisms of $\mathrm{Man}^n$ or allowed all dimensions. However, local diffeomorphisms allow us to transport more structure, which enables us to describe more of the canonical differential geometric constructions as functors. We describe the most important examples below. For fiber bundles, on the other hand, it will be useful to allow maps which are not isomorphisms on the fibers.^{f,g}

e See [14] for some relevant remarks concerning the foundations of set theory and the use of small sets.
f The unprimed categories, whose morphisms are isomorphisms of the fibers, can be described as fibered categories over $\mathrm{Man}^n$, cf. [15, p. 44].
g The functors $B\colon \mathrm{Man}^n \to \mathrm{Bund}'$ below are all of a special type, namely, they associate to a manifold $\mathcal{M}$ a fiber bundle whose base space is again $\mathcal{M}$. Although we will only use functors of this type when describing the Dirac field, the restriction is not technically necessary in our definitions.

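Two of the most basic functors in differential geometry are

• The tangent bundle functor $T\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{R}}$ assigns to every manifold $\mathcal{M}$ the tangent bundle $T\mathcal{M}$ and to every morphism $\mu\colon \mathcal{M}_1 \to \mathcal{M}_2$ the differential $d\mu\colon T\mathcal{M}_1 \to T\mathcal{M}_2$.
• The cotangent bundle functor^h $T^*\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{R}}$ assigns to every manifold $\mathcal{M}$ the cotangent bundle $T^*\mathcal{M}$ and to every morphism $\mu\colon \mathcal{M}_1 \to \mathcal{M}_2$ the push-forward $\mu_*\colon T^*\mathcal{M}_1 \to T^*\mathcal{M}_2$, which is defined as $\mu_* := (d\mu^*)^{-1}$.

In a similar way, one can define the functor $\Lambda^k\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{R}}$ of exterior $k$-forms and the exterior algebra functor $\Lambda\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{R}}$, both with push-forwards. Another example is

• The density bundle functor $|\Lambda^n|\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{R}}$ assigns to every spacetime $\mathcal{M}$ the one-dimensional trivial vector bundle of densities $|\Lambda^n \mathcal{M}|$, where $n$ is the dimension of $\mathcal{M}$. This is the vector bundle whose fiber at $x \in \mathcal{M}$ consists of functions $d\colon \Lambda^n_x \mathcal{M} \to \mathbb{R}$ such that $d(r\omega) = |r|\, d(\omega)$ for all $r \in \mathbb{R}$ and $\omega \in \Lambda^n_x \mathcal{M}$ (cf. [16, Appendix A.3]). A morphism $\mu$ is mapped to the push-forward defined by $\mu_* d := d \circ \mu^*$, where $\mu^* := d\mu^*$ is the pull-back.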
By standard constructions, one can take (finite) direct sums and tensor products of functors from $\mathrm{Man}^n$ into $\mathrm{VBund}'_{\mathbb{R}}$ which map $\mathcal{M}$ into a vector bundle over $\mathcal{M}$. One obtains another such functor in the obvious way. For functors $V$ into $\mathrm{VBund}_{\mathbb{R}}$ one can also define the dual, denoted by $V^*$, where the morphism between dual vector bundles is the push-forward of the original morphism. This generalizes the example of $T^*$ above. As another standard construction one can define the complexification $V_{\mathbb{C}}$ of any functor $V$ into $\mathrm{VBund}_{\mathbb{R}}$ (respectively, $\mathrm{VBund}'_{\mathbb{R}}$), which is a functor into $\mathrm{VBund}_{\mathbb{C}}$ (respectively, $\mathrm{VBund}'_{\mathbb{C}}$).
Now we turn to some examples of natural transformations:

• The canonical pairing between a functor $V\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{R}}$ which maps $\mathcal{M}$ to a vector bundle $V\mathcal{M}$ over $\mathcal{M}$ and its dual $V^*$ is a natural transformation $\langle\,,\rangle\colon V^* \otimes V \Rightarrow \Lambda^0$ whose components cover the identity morphism.
• Complex conjugation is a natural equivalence $\bar{\ }\colon V_{\mathbb{C}} \Leftrightarrow V_{\mathbb{C}}$ in $\mathrm{VBund}_{\mathbb{R}}$ (or $\mathrm{VBund}'_{\mathbb{R}}$) between complexified vector bundles, which sends each section to its complex conjugate.

A further example of a natural equivalence is the fiber-wise multiplication by a real number $r \neq 0$. (For $r = 0$, this only yields a natural transformation.) Furthermore, the constructions mentioned above (dual, direct sum, tensor product) and the natural transformations (pairing, fiber-wise multiplication) can also be applied directly to complex vector bundles in a canonical (Hermitean) way.

h It is tempting to think of a contravariant functor that maps manifolds to their cotangent bundles and morphisms to the pull-back, $\mu^* := d\mu^*$, which indeed reverses the directions of arrows and changes the order of compositions. However, the pull-back is only defined on the image of $\mu$, so in general this does not define a morphism in $\mathrm{VBund}_{\mathbb{R}}$.

It will be convenient to consider distributions and integration in a categorical setting too:

Definition 2.8. $\mathrm{TVec}$ is the category of topological vector spaces with injective continuous linear maps as morphisms. The functor $\mathbb{C}\colon \mathrm{Man}^n \to \mathrm{TVec}$ is the constant functor $\mathbb{C}$, i.e. it assigns to each object the one dimensional space $\mathbb{C}$ and to each morphism the identity morphism.
The functor of test-sections is the functor $C_0^\infty\colon \mathrm{VBund}'_{\mathbb{C}} \to \mathrm{TVec}$ which maps each complex vector bundle $V$ to the space $C_0^\infty(V)$ of compactly supported smooth sections of $V$ in the test-section topology.^i A morphism $\beta$, covering a morphism $\mu$, is mapped to the push-forward defined by $\beta_*(f) = \beta \circ f \circ \mu^{-1}$ on $\mu(\mathcal{M}_1)$, extended by 0 to all of $\mathcal{M}_2$.
The functor of smooth sections is the contravariant functor $C^\infty\colon \mathrm{VBund}_{\mathbb{C}} \to \mathrm{TVec}$ which maps each complex vector bundle $V$ to the space $C^\infty(V)$ of smooth sections of $V$ in the usual topology. A morphism $\beta$, covering a morphism $\mu$, is mapped to the pull-back defined by $\beta^*(f) = \beta^{-1} \circ f \circ \mu$.
The functor of distributions is the contravariant functor $\mathrm{Distr}\colon \mathrm{VBund}'_{\mathbb{C}} \to \mathrm{TVec}$ which maps each complex vector bundle $V$ to the space $(C_0^\infty(V))'$ of distributions on $V$ with the weak topology induced by $C_0^\infty(V)$. A morphism $\beta$, covering a morphism $\mu$, is mapped to the pull-back defined by $\beta^* u := u \circ \beta_*$.
We will not need compactly supported distributions, but they can be defined as the functor dual to $C^\infty$. Notice that objects which are not compactly supported, such as smooth sections or distributions, behave contravariantly, whereas compactly supported ones behave covariantly. Also note that the pull-back of a smooth section can only be defined for morphisms that restrict to isomorphisms of the fibers. The following constructions will be of importance in Sec. 4:

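• Integration is a natural transformation $\int\colon C_0^\infty \circ |\Lambda^n| \Rightarrow \mathbb{C}$ which assigns to each $\omega \in C_0^\infty(|\Lambda^n \mathcal{M}|)$ the integral $\int_{\mathcal{M}} \omega$.
• Canonical Injections. Let $f\colon \mathrm{VBund}_{\mathbb{C}} \to \mathrm{VBund}'_{\mathbb{C}}$ be the forgetful functor. For any functor $V\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{C}}$ there is a canonical natural transformation $\iota\colon C_0^\infty \circ f \circ V \Rightarrow C^\infty \circ V$, whose components are the canonical injections $C_0^\infty(V\mathcal{M}) \to C^\infty(V\mathcal{M})$. Similarly, there is a canonical natural transformation $\kappa\colon C^\infty \circ (V^* \otimes |\Lambda^n|) \Rightarrow \mathrm{Distr} \circ f \circ V$ given by $\kappa_{\mathcal{M}}(f \otimes \omega) := \int_{\mathcal{M}} \langle \cdot, f \rangle\, \omega$ for any smooth section $f$ of $V^*\mathcal{M}$ and any density $\omega$ on $\mathcal{M}$. Each component of $\kappa$ is injective.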
Where convenient we will identify a functor $V\colon \mathrm{Man}^n \to \mathrm{VBund}_{\mathbb{C}}$ with the functor $f \circ V$, omitting the forgetful functor, as this rarely leads to confusion. Furthermore, any natural transformation $t\colon V_1 \Rightarrow V_2$ between a pair of functors $V_i\colon \mathrm{Man}^n \to \mathrm{VBund}'_{\mathbb{C}}$, $i = 1, 2$, lifts to a corresponding natural transformation

i For a precise definition of the well-known topologies on test-sections and smooth sections we refer to [17, Chap. 17].



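$T\colon C_0^\infty \circ V_1 \Rightarrow C_0^\infty \circ V_2$ defined pointwise by $T_{\mathcal{M}} f := t_{\mathcal{M}} \circ f$. The same statement holds for $T\colon C^\infty \circ V_1 \Rightarrow C^\infty \circ V_2$, if the $V_i$ are functors into the category $\mathrm{VBund}_{\mathbb{C}}$.
Next we add the structure of a semi-Riemannian metric: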

Definition 2.9. The category $\mathrm{SRMan}^n$ of semi-Riemannian manifolds is the subcategory of $\mathrm{Man}^n$ whose objects $M = (\mathcal{M}, g)$ are $C^\infty$ manifolds $\mathcal{M}$ of dimension $n$ with a semi-Riemannian metric $g$ and whose morphisms $m\colon M_1 \to M_2$ are given by the isometric morphisms in $\mathrm{Man}^n$, i.e. morphisms $\mu\colon \mathcal{M}_1 \to \mathcal{M}_2$ such that $\mu_* g_1 = g_2|_{\mu(\mathcal{M}_1)}$.
Again there is a canonical forgetful functor $f\colon \mathrm{SRMan}^n \to \mathrm{Man}^n$, which is often left implicit, so we will write e.g. $T$ for the functor $T \circ f$. The extra structure of a semi-Riemannian metric gives rise to extra functors and natural equivalences that are of interest to us:
• The metric identification is a natural equivalence $G\colon T \Leftrightarrow T^*$ whose component at $M = (\mathcal{M}, g)$ is given by the map $G_M\colon T\mathcal{M} \to T^*\mathcal{M}$ such that $v \mapsto g(v, \cdot)$.
• The frame bundle functor $F\colon \mathrm{SRMan}^n \to \mathrm{VBund}_{\mathbb{R}}$ assigns to each object $M$ the frame bundle $FM$, i.e. the bundle whose fiber at a point $x \in \mathcal{M}$ consists of all orthonormal bases of $T_x\mathcal{M}$ in the metric $g$. This fiber is a subset of $T^n\mathcal{M}$. A morphism $m$ is mapped to the push-forward acting on $FM \subset T^n\mathcal{M}$.
• The volume form functor $\mathrm{vol}\colon \mathrm{SRMan}^n \to \mathrm{VBund}_{\mathbb{R}}$ is defined as $\mathrm{vol} := |\Lambda^n| \circ f$. When $m\colon M_1 \to M_2$ is a morphism and $d\mathrm{vol}_i := \sqrt{|\det g_i|}$ the metric induced volume form on $M_i$, then $\mathrm{vol}$ maps $d\mathrm{vol}_1$ to the restriction of $d\mathrm{vol}_2$ to $m(M_1)$. There is a canonical natural equivalence from $\Lambda^0$ to $\mathrm{vol}$, which consists of multiplication with the metric induced volume form.

Similarly there are natural equivalences between any functor $V\colon \mathrm{SRMan}^n \to \mathrm{VBund}_{\mathbb{C}}$ and $V \otimes |\Lambda^n|$. Therefore we obtain a canonical natural transformation $\kappa\colon C^\infty \circ V^* \Rightarrow \mathrm{Distr} \circ V$ whose components are injective.
Finally we should mention the Clifford bundle functor $Cl\colon \mathrm{SRMan}^n \to \mathrm{VBund}_{\mathbb{R}}$, which assigns to each object $M = (\mathcal{M}, g)$ the Clifford bundle $ClM$, which is the vector bundle whose fiber at $x \in \mathcal{M}$ is the Clifford algebra of $(T_x\mathcal{M}, g)$ viewed as a linear space. Ignoring the algebraic structure, this functor is naturally equivalent to $\Lambda \circ f$. Although we will not do so, it is possible to use this functor as a basic object for the description of fermions (cf. [18]).

3. The Classical Dirac Field


After these mathematical preliminaries we are now ready to start constructing the classical free Dirac field (as a locally covariant classical field). We will first describe the geometric and algebraic constructions, before we discuss the Dirac equation and its fundamental solutions. We close by investigating to what extent the relations between the Dirac operator, charge conjugation and adjoint map fix the structure

of the theory and find that the non-uniqueness can be characterised in terms of the cohomology of the category of spin spacetimes.

3.1. Geometric aspects


In order to describe the Dirac field we need to introduce the notion of a spin structure on a spacetime, combining the geometric and the algebraic results of Sec. 2. This is the purpose of the current subsection.
The systems that we will consider are intended to model Dirac quantum fields living in a (region of) spacetime which is endowed with a fixed Lorentzian metric (a background gravitational field). Mathematically these regions are modelled as follows:

Definition 3.1. By the term globally hyperbolic spacetime we will mean a connected, Hausdorff, $C^\infty$ Lorentzian manifold $M = (\mathcal{M}, g)$ of dimension $d = 4$, which is oriented, time-oriented and admits a Cauchy surface.
A subset $O \subset \mathcal{M}$ of a globally hyperbolic spacetime $M$ is called causally convex iff for all $x, y \in O$ all causal curves in $\mathcal{M}$ from $x$ to $y$ lie entirely in $O$.
The category $\mathrm{Spac}$ is the subcategory of $\mathrm{SRMan}^n$ whose objects are all globally hyperbolic spacetimes $M = (\mathcal{M}, g)$ and whose morphisms $\psi$ are isometric embeddings that preserve the orientation and time-orientation and such that $\psi(\mathcal{M}_1)$ is causally convex.

By a theorem of Geroch any globally hyperbolic spacetime is paracompact ([19, Appendix]).
Most notations we use concerning the causal structure of spacetimes are standard, cf. [20]. The importance of causally convex sets is that for any morphism $\psi$ the causal structure of $M_1$ coincides with that of $\psi(M_1)$ inside $M_2$:
$$\psi(J^\pm_{M_1}(x)) = J^\pm_{M_2}(\psi(x)) \cap \psi(\mathcal{M}_1), \quad x \in \mathcal{M}_1.$$
If $O \subset \mathcal{M}$ is a connected open causally convex set, then $(O, g|_O)$ defines a globally hyperbolic spacetime in its own right. In this case there is a canonical morphism $I_{M,O}\colon O \to M$ given by the canonical embedding $\iota\colon O \to \mathcal{M}$. We will often drop $I_{M,O}$ and $\iota$ from the notation and simply write $O \subset M$.
Notice that there is a forgetful functor $f\colon \mathrm{Spac} \to \mathrm{SRMan}^n$ and that we can define the functor $F_+^\uparrow\colon \mathrm{Spac} \to \mathrm{Bund}$ of oriented, time-oriented orthonormal frames $F_+^\uparrow M$ for the tangent bundle, in analogy to Sec. 2.2. This is a principal $L_+^\uparrow$-bundle over $M$, where the special orthochronous Lorentz group $L_+^\uparrow$ acts from the right, i.e., given $e = (x, e_0, \ldots, e_3) \in F_+^\uparrow M$, where $x \in \mathcal{M}$ and $e_a \in T_x\mathcal{M}$ such that $g_x(e_a, e_b) = \eta_{ab}$ and $e_0$ is future pointing, the action of $\Lambda \in L_+^\uparrow$ is defined by $R_\Lambda e = e' = (x, e'_0, \ldots, e'_3)$ where $e'_a = e_b \Lambda^b{}_a$.

Definition 3.2. A spin structure on $M$ is a pair $(SM, \pi)$, where $SM$ is a principal $\mathrm{Spin}^0_{1,3}$-bundle over $M$, the spin frame bundle, with a right action $R_S$, $S \in \mathrm{Spin}^0_{1,3}$,

and $\pi\colon SM \to FM$, the spin frame projection, is a base-point preserving bundle homomorphism such that
$$\pi \circ R_S = R_{\Lambda(S)} \circ \pi,$$
where $S \mapsto \Lambda(S)$ is the universal covering map (cf. Proposition 2.6).
A globally hyperbolic spin spacetime $SM = (\mathcal{M}, g, SM, \pi)$ is an object $M = (\mathcal{M}, g)$ of $\mathrm{Spac}$ which is endowed with the spin structure $(SM, \pi)$.
The category $\mathrm{SSpac}$ is the subcategory of $\mathrm{Bund}$ whose objects are all globally hyperbolic spin spacetimes $SM = (\mathcal{M}, g, SM, \pi)$ and whose morphisms $\chi\colon SM_1 \to SM_2$ cover a morphism $\psi\colon M_1 \to M_2$ in $\mathrm{Spac}$ and satisfy $\chi \circ (R_1)_S = (R_2)_S \circ \chi$ and $\pi_2 \circ \chi = \psi_* \circ \pi_1$, where $p_i$ are the bundle projections, $\pi_i$ the spin frame projections and $\psi_*$ the push-forward.
Note that a morphism $\chi$ acts as a diffeomorphism of the fibers, because it intertwines the group action.
Every globally hyperbolic spacetime admits a spin structure, which need not be unique [6, 8, 19, 21]. We will regard distinct spin structures on the same underlying spacetime as distinct spin spacetimes.^j Spinor and cospinor fields are sections of vector bundles associated to the spin frame bundle. We will require that the assignment of these vector bundles is functorial:

Definition 3.3. A locally covariant spinor bundle is a functor $V\colon \mathrm{SSpac} \to \mathrm{VBund}_{\mathbb{C}}$, written as $SM \mapsto V_{SM}$, $\chi \mapsto \xi$, such that $\chi$ and $\xi$ cover the same morphism in $\mathrm{Spac}$ and such that each $V_{SM}$ is a vector bundle associated to the spin frame bundle $SM$ through some representation. The dual functor $V^*$ is called a locally covariant cospinor bundle. Smooth sections of $V_{SM}$, respectively $V^*_{SM}$, are called (Dirac) spinors (or spinor fields), respectively cospinors (cospinor fields).
The condition in the definition of a locally covariant spinor bundle ensures that the vector bundle $V_{SM}$ and the spin frame bundle $SM$ are both bundles over the same spacetime $M$.
For definiteness we pick out the following standard choice of locally covariant spinor and cospinor bundles:

Definition 3.4. The standard locally covariant Dirac spinor bundle $D_0\colon \mathrm{SSpac} \to \mathrm{VBund}_{\mathbb{C}}$ is the locally covariant spinor bundle which associates to each object $SM$ of $\mathrm{SSpac}$ the associated vector bundle $D_0M = SM \times_{\mathrm{Spin}^0_{1,3}} \mathbb{C}^4$ of $SM$ with the

j There exists another approach to spinors, which considers on each spacetime the Clifford bundle. This Clifford bundle is functorial in its dependence on the spacetime, but it does not generally define a spin structure. Indeed, at each point one can identify the Spin group inside the fiber of the Clifford bundle, but there may not be any projection from these Spin groups onto the frame bundle that intertwines the actions of the structure groups, the obstruction being a topological twist. (Conversely, every spin structure can be seen as a topologically twisted copy of the Spin groups in the Clifford bundle.) Nevertheless, it appears to provide sufficient structure to describe all the relevant physics in a functorial way. We refer to [18] for more information on this approach.

representation $\rho_0$, and which maps each morphism $\chi\colon SM_1 \to SM_2$ to the morphism $\xi\colon D_0M_1 \to D_0M_2$ given by $\xi([E, z]) := [\chi(E), z]$. The standard locally covariant Dirac cospinor bundle $D_0^*$ is the dual functor of $D_0$.

Recall that a point in $D_0M$ consists of an equivalence class of pairs $(E, z) \in SM \times \mathbb{C}^4$, where the equivalence is given by
$$[R_S E, z] = [E, \rho_0(S) z].$$
The dual functor $D_0^*$ then assigns to each $SM$ the dual vector bundle $D_0^*M$ whose points are equivalence classes of pairs $(E, w^*) \in SM \times (\mathbb{C}^4)^*$, where the equivalence is given by $[R_S E, w^*] = [E, w^* \rho_0(S^{-1})]$. (Here we consider $w^* \in (\mathbb{C}^4)^*$ as a row vector, whereas $z \in \mathbb{C}^4$ is treated as a column vector.)
For any object $SM$ the unique connection $\nabla_{SM}$ on $TM$ which is compatible with the metric, $\nabla_{SM}\, g = 0$, can be described by an $\mathfrak{l}_+^\uparrow$-valued one-form $(\Omega_{SM})^b{}_a$ on the orthonormal frame bundle $F_+^\uparrow M$ (cf. [22, Chap. 2, Proposition 1.1]), where $\mathfrak{l}_+^\uparrow$ is the Lie algebra of $L_+^\uparrow$, which can be identified with the tangent space of the fiber of $F_+^\uparrow M$ at any point. For every local section $e$ of $F_+^\uparrow M$ the pull-back $\omega^b{}_a := e^*(\Omega^b{}_a)$ consists exactly of the connection one-forms of $\nabla_{SM}$ expressed in the orthonormal frame $e_a$. The one-form $(\Omega_{SM})^b{}_a$ can be pulled back by the spin frame projection $\pi$ and lifted to a $\mathfrak{spin}^0_{1,3}$-valued one-form $\Sigma_{SM}$ on $SM$:
$$\Sigma_{SM} := (d\Lambda)^{-1}(\pi^*(\Omega_{SM})^b{}_a) = \frac14\, \pi^*((\Omega_{SM})^b{}_a)\, g_b g^a,$$
where the last equality uses Proposition 2.6. The one-form $\Sigma_{SM}$ determines a connection on the spin frame bundle $SM$. For any associated vector bundle $DM$ we then find a connection, also denoted by $\nabla_{SM}$, determined by the connection one-forms $\sigma := E^*(\Sigma_{SM})$ in a local section $E$ of $SM$, as represented on $DM$ (we will give an explicit expression for $\sigma$ in Eq. (5)). The connection can be viewed as a map $\nabla_{SM}\colon C_0^\infty(D_0M) \to C_0^\infty(T^*M \otimes D_0M)$, which is a component of a natural transformation^k $\nabla\colon C_0^\infty \circ D_0 \Rightarrow C_0^\infty \circ (T^* \otimes D_0)$. The Leibniz rule allows us to extend it to mixed spinor-tensors, using, e.g., $\partial_a \langle v, u \rangle = \langle \nabla_a v, u \rangle + \langle v, \nabla_a u \rangle$.

3.2. Adjoints, charge conjugation and the Dirac operator


We now define the adjoint and charge conjugation maps on spinors and cospinors. These are special cases of the Fundamental Theorem 2.2, using the complex conjugate and adjoint matrices^l (cf. [23]).

k Alternatively we could have written the connection as a natural transformation from the 1-jet bundle extension of $D_0$ to $T^* \otimes D_0$.
l On a general representation space of complex dimension four, one can define many complex conjugations and Hermitean inner products. In order to obtain the desired equalities involving adjoint and charge conjugate spinors later on, we need these two operations to be compatible, i.e. $\langle \bar v, \bar w \rangle = \overline{\langle v, w \rangle}$. Without loss of generality we can then use the standard complex conjugation and Hermitean inner product on $\mathbb{C}^4$.

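Theorem 3.5. For any irreducible complex representation $\rho$ of the Dirac algebra $D$ there are matrices $A, C \in GL(4, \mathbb{C})$ such that
$$A = A^*, \quad \rho(g_a)^* = A \rho(g_a) A^{-1}, \quad A\rho(n) > 0,$$
$$C\bar{C} = I, \quad -\overline{\rho(g_a)} = C \rho(g_a) C^{-1} \qquad (3)$$
for all future pointing time-like vectors $n \in M_0 \subset D$. We have for all $S \in \mathrm{Spin}^0_{1,3}$:
$$A = -C^* A^T C, \quad \rho(S)^* A \rho(S) = A, \quad \rho(S^{-1}) C^{-1} \overline{\rho(S)} = C^{-1}.$$
Moreover, if $A', C' \in M(4, \mathbb{C})$ have the properties stated above for the irreducible complex representation $\rho'$ of $D$, then there is an $L \in GL(4, \mathbb{C})$, unique up to a sign, such that $L^* A' L = A$, $(\bar{L})^{-1} C' L = C$ and $\rho = L^{-1} \rho' L$ on $D$.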

Proof. To prove the existence of $A$ and $C$ in the representation $\rho_0$ we may take $A = A_0 := \gamma_0$, $C = C_0 := \gamma_2$ and check the required properties straightforwardly. Note for example that
$$\gamma_0 n^a \gamma_a = \begin{pmatrix} n^0 I + n^i \sigma_i & 0 \\ 0 & n^0 I - n^i \sigma_i \end{pmatrix} > 0,$$
because $\det(n^0 I \pm n^i \sigma_i) = n^2 > 0$ and $\mathrm{Tr}(n^0 I \pm n^i \sigma_i) = 2n^0 > 0$. To prove the existence of $A$ and $C$ in a general irreducible complex representation $\rho$ one writes $\gamma_a = K \rho(g_a) K^{-1}$ by Theorem 2.2 and verifies that $A = K^* A_0 K$ and $C = \bar{K}^{-1} C_0 K$ will do.
Given $A', C'$ satisfying Eq. (3) for $\rho'$ we can fix $K \in GL(4, \mathbb{C})$ such that $\rho' = K \rho K^{-1}$ on $D$ and the desired matrix $L$ must be $L = zK$ for some $z \neq 0$ by the Fundamental Theorem 2.2. Now set $\tilde{A} := K^* A' K$ and $\tilde{C} := (\bar{K})^{-1} C' K$ and note that $\tilde{A}$ and $\tilde{C}$ satisfy (3) for $\rho$. Because the sets of matrices $\rho(g_a)^*$ and $-\overline{\rho(g_a)}$ both satisfy the relations (1) we must have $a\tilde{A} = A$ and $c\tilde{C} = C$ for some non-zero complex factors $a$ and $c$, again by the Fundamental Theorem. Also, $|c| = 1$ because $C\bar{C} = I$ and $a > 0$ because $A = A^*$ and $A\rho(n) > 0$ for future pointing time-like vectors. Hence, $|z|^2 = a$ and $z = c\bar{z}$, which fixes $z$ (and $L$) up to a sign. This proves the last statement.
The equation $A = -C^* A^T C$ holds for $A_0$, $C_0$ and therefore also in general. For a unit vector $u = u^a g_a$ we have $u^2 = \pm I$ and hence
$$\rho(u)^* A \rho(u) = u^a u^b \rho(g_a)^* A \rho(g_b) = u^a u^b A \rho(g_a g_b) = A \rho(u^2) = \pm A.$$
For $S \in \mathrm{Spin}_{1,3}$, we must therefore have that $\rho(S)^* A \rho(S) = \pm A$, by definition of the Spin group. For $S = I$, the sign is a plus, so by continuity and connectedness we conclude that $\rho(S)^* A \rho(S) = A$ for all $S \in \mathrm{Spin}^0_{1,3}$. For $C$, we use the fact that
$$\rho(u^{-1}) C^{-1} \overline{\rho(u)} = -\rho(u)^{-1} \rho(u) C^{-1} = -C^{-1}$$
and hence $\rho(S^{-1}) C^{-1} \overline{\rho(S)} = C^{-1}$ for all $S \in \mathrm{Spin}_{1,3}$.
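The "straightforward check" for $A_0 = \gamma_0$ and $C_0 = \gamma_2$ can also be carried out numerically. The sketch below assumes a chiral representation (suggested by the block-diagonal matrix in the proof, but not taken from the paper's own definition of the standard representation, which lies outside this section); the function name `theorem_3_5_checks` and the dictionary keys are illustrative.

```python
import numpy as np

# Assumed representation: chiral gamma matrices with {g_a, g_b} = 2 eta_ab I.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in PAULI
]
A0 = GAMMA[0]  # candidate adjoint matrix
C0 = GAMMA[2]  # candidate charge conjugation matrix


def theorem_3_5_checks(n):
    """Evaluate the relations of Eq. (3) for a future pointing time-like n."""
    inv = np.linalg.inv
    n_slash = sum(n[a] * GAMMA[a] for a in range(4))
    return {
        "A = A*": np.allclose(A0, A0.conj().T),
        "g* = A g A^-1": all(
            np.allclose(g.conj().T, A0 @ g @ inv(A0)) for g in GAMMA
        ),
        "A n-slash > 0": bool(np.all(np.linalg.eigvalsh(A0 @ n_slash) > 0)),
        "C Cbar = I": np.allclose(C0 @ C0.conj(), np.eye(4)),
        "-gbar = C g C^-1": all(
            np.allclose(-g.conj(), C0 @ g @ inv(C0)) for g in GAMMA
        ),
        "A = -C* A^T C": np.allclose(A0, -C0.conj().T @ A0.T @ C0),
    }


for name, ok in theorem_3_5_checks([2.0, 0.5, -1.0, 0.3]).items():
    print(name, ok)
```

The positivity test relies on $A_0 n^a \gamma_a$ being Hermitean, which follows from the first two relations, so `eigvalsh` is applicable.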

Note that $g_5 \in \mathrm{Spin}_{1,3} \setminus \mathrm{Spin}^0_{1,3}$. Indeed, using $\rho_0$ and $A_0 = \gamma_0$ in Theorem 3.5 we see that $\gamma_5^* A_0 \gamma_5 = -A_0$, so $g_5 \in \mathrm{Spin}_{1,3}$ by definition, but not in $\mathrm{Spin}^0_{1,3}$.
In the following theorem we use the fact that for any pair of natural transformations $t, t'\colon \mathrm{SSpac} \to \mathrm{VBund}_{\mathbb{C}}$ we can define the sum $t + t'$ and the tensor product $t \otimes t'$ componentwise.

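Theorem 3.6. The standard locally covariant Dirac spinor and cospinor bundles admit natural ($\mathbb{C}$-antilinear) equivalences ${}^+\colon D_0 \Leftrightarrow D_0^*$, ${}^c\colon D_0 \Leftrightarrow D_0$, ${}^c\colon D_0^* \Leftrightarrow D_0^*$ in $\mathrm{VBund}_{\mathbb{R}}$ and a natural transformation $\gamma\colon D_0 \Rightarrow T^* \otimes D_0$ in $\mathrm{VBund}'_{\mathbb{C}}$ such that all components cover the identity morphism and the following equations hold both on spinors and cospinors (i.e. we denote the inverses of ${}^+$ and ${}^c$ by the same symbol):
$${}^+ \circ {}^+ = 1 = {}^c \circ {}^c, \quad {}^+ \circ {}^c = -1 \circ {}^c \circ {}^+,$$
$$\langle\,,\rangle \circ S \circ ({}^+ \otimes {}^+) = \overline{\langle\,,\rangle} = \langle\,,\rangle \circ ({}^c \otimes {}^c),$$
$$(1 \otimes {}^+) \circ \gamma = \gamma^* \circ {}^+, \quad (1 \otimes {}^c) \circ \gamma = -1 \circ \gamma \circ {}^c, \qquad (4)$$
$$(1 + S \otimes 1) \circ (1 \otimes \gamma) \circ \gamma = (2g) \otimes 1, \quad \nabla \circ \gamma = \gamma \circ \nabla,$$
where $S\colon D_0^* \otimes D_0 \Rightarrow D_0 \otimes D_0^*$ and $S\colon T^* \otimes T^* \Rightarrow T^* \otimes T^*$ swap the factors in the tensor product, $g\colon \Lambda^0 \Rightarrow T^* \otimes T^*$ maps the function 1 to the metric $g$ and $\gamma^*\colon D_0^* \Rightarrow T^* \otimes D_0^*$ is the adjoint map of $\gamma$ under the canonical pairing $\langle\,,\rangle$. Furthermore, for every object $SM$, every time-like future pointing tangent vector $n \in TM$ and every $v \in D_0M$ we have $\langle n \otimes v^+, \gamma(v) \rangle \geq 0$.
The natural transformation $\gamma$ can also be seen as a natural transformation $T \Rightarrow \mathrm{End}(D_0)$ or $T \Rightarrow \mathrm{End}(D_0^*)$. Equations (4) simply give the usual computational rules for spinors and cospinors in a functorial setting. Thus, for every $SM$ and every $p \in D_0M$, $q \in D_0^*M$ we have:
$$p^{++} = p = p^{cc}, \quad p^{c+} = -p^{+c},$$
$$\langle p^+, q^+ \rangle = \overline{\langle q, p \rangle} = \langle q^c, p^c \rangle,$$
$$(\gamma p)^+ = \gamma p^+, \quad (\gamma p)^c = -\gamma p^c,$$
$$\gamma \otimes \gamma + S(\gamma \otimes \gamma) = 2g \otimes I, \quad \nabla_a \gamma_b \equiv 0,$$
where we have dropped the subscript $SM$ to lighten the notation.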

Proof. The canonical pairing $\langle\,,\rangle\colon D_0^* \otimes D_0 \Rightarrow \Lambda^0_{\mathbb{C}}$ on $SM$ is given by $\langle [E, w^*], [E, z] \rangle = \langle w, z \rangle$, where the right-hand side is the standard Hermitean inner product on $\mathbb{C}^4$. Note that this is well-defined, because we can always get the same $E \in SM$ on the left-hand side by a suitable action of $\mathrm{Spin}^0_{1,3}$. The components of the natural equivalences ${}^+$ and ${}^c$ on each $SM$ are defined using the matrices $A_0$ and $C_0$ of Theorem 3.5 and their properties:
$$[E, z]^c := [E, C_0^{-1} \bar{z}], \quad [E, w^*]^c := [E, \overline{w^*} C_0],$$
$$[E, z]^+ := [E, z^* A_0], \quad [E, w^*]^+ := [E, A_0^{-1} w].$$

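These are well-defined isomorphisms in $\mathrm{VBund}_{\mathbb{R}}$ and they give rise to natural equivalences satisfying the first two lines of Eq. (4).
Now fix $E \in SM$, let $e_a$ be the orthonormal basis $(e_0, \ldots, e_3) = \pi(E)$ of $T_{p(E)}M$, where $\pi\colon SM \to FM$ is the spin frame projection, and let $e^a$ be the dual basis of $T^*_{p(E)}M$. On $SM$ we define the component of the natural transformation $\gamma$ on $SM$ to be
$$\gamma([E, z]) := e^a \otimes [E, \gamma_a z].$$
This is well-defined, because a different section $E' := R_S E$ gives rise to the frame $e'_a = e_b \Lambda^b{}_a(S)$ and the dual frame $(e')^a = \Lambda^a{}_b(S^{-1}) e^b$ and on the other hand $\rho_0(S^{-1}) \gamma_a \rho_0(S) = \gamma_b \Lambda^b{}_a(S^{-1})$ by definition of $\Lambda$ (Proposition 2.6). $\gamma$ is indeed a morphism in $\mathrm{VBund}'_{\mathbb{C}}$ and gives rise to a natural transformation. The third line of Eq. (4) follows again from the properties of $A$ and $C$ (see Theorem 3.5):
$$\gamma([E, z]^c) = e^a \otimes [E, \gamma_a C_0^{-1} \bar{z}] = -e^a \otimes [E, C_0^{-1} \overline{\gamma_a z}] = -(\gamma([E, z]))^c,$$
$$\gamma([E, z]^+) = e^a \otimes [E, z^* A_0 \gamma_a] = e^a \otimes [E, z^* \gamma_a^* A] = (\gamma([E, z]))^+$$
and similarly on cospinors. Also,
$$\nabla_b \gamma_a = \sigma_b \gamma_a - \gamma_a \sigma_b - \Gamma^c_{ba} \gamma_c = \frac14 \Gamma^c_{bd} (\gamma_c \gamma^d \gamma_a - \gamma_a \gamma_c \gamma^d) - \Gamma^c_{ba} \gamma_c$$
$$= \frac14 \Gamma^c_{bd} (\gamma_c \{\gamma^d, \gamma_a\} - \{\gamma_a, \gamma_c\} \gamma^d - 4\delta^d_a \gamma_c) = -\frac12 \Gamma^c_{bd} (\delta^d_a \gamma_c + \eta_{ac} \gamma^d) = 0.$$
Finally, for every object $SM$, every future pointing tangent vector $n \in TM$ and every $v \in D_0M$ we have $\langle n \otimes v^+, \gamma(v) \rangle = \langle v^+, A n^a \gamma_a v \rangle \geq 0$ again by Theorem 3.5.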
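The fibrewise content of the computational rules in Theorem 3.6 can be tested on vectors in $\mathbb{C}^4$. The sketch below works at a single point and in a fixed spin frame, so it ignores the bundle structure; it assumes a chiral representation with $A_0 = \gamma_0$ and $C_0 = \gamma_2$, stores spinors and cospinors as one-dimensional arrays (columns and rows respectively), and uses hypothetical helper names.

```python
import numpy as np

# Assumed representation: chiral gamma matrices; A0 = gamma_0, C0 = gamma_2.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in PAULI
]
A0, C0 = GAMMA[0], GAMMA[2]
A0_INV, C0_INV = np.linalg.inv(A0), np.linalg.inv(C0)


def conj_spinor(z):
    """z^c = C0^{-1} zbar (column vector)."""
    return C0_INV @ z.conj()


def conj_cospinor(w):
    """(w*)^c = conj(w*) C0 (row vector)."""
    return w.conj() @ C0


def adj_spinor(z):
    """z^+ = z* A0, a row vector."""
    return z.conj() @ A0


def adj_cospinor(w):
    """(w*)^+ = A0^{-1} w, where w is the column adjoint to the row w*."""
    return A0_INV @ w.conj()


def rules_hold(p, q):
    """Check the explicit rules below Eq. (4) for a spinor p and cospinor q."""
    checks = [
        np.allclose(adj_cospinor(adj_spinor(p)), p),            # p++ = p
        np.allclose(conj_spinor(conj_spinor(p)), p),            # pcc = p
        np.allclose(adj_spinor(conj_spinor(p)),
                    -conj_cospinor(adj_spinor(p))),             # pc+ = -p+c
        np.isclose(adj_spinor(p) @ adj_cospinor(q), np.conj(q @ p)),
        np.isclose(conj_cospinor(q) @ conj_spinor(p), np.conj(q @ p)),
    ]
    for g in GAMMA:
        checks.append(np.allclose(adj_spinor(g @ p), adj_spinor(p) @ g))
        checks.append(np.allclose(conj_spinor(g @ p), -g @ conj_spinor(p)))
    return all(checks)
```

Calling `rules_hold` on random complex vectors exercises the first three lines of the explicit list; the Clifford relation and the covariant constancy of $\gamma$ are not covered by this pointwise sketch.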

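In terms of the Christoffel symbols $\Gamma^\rho_{\mu\nu}$, the frame $e_a$ and representing $g_a$ on $D_0M$ using the $\mathrm{End}(D_0M)$-valued one-forms $\gamma$, the connection one-forms of the spin connection can be expressed as^m
$$\sigma_b := \frac14 \Gamma^a_{bc} \gamma_a \gamma^c, \qquad \Gamma^a_{bc} = -e_c^\nu (e_b^\mu \partial_\mu e^a_\nu) + e^a_\rho e_b^\mu e_c^\nu \Gamma^\rho_{\mu\nu}. \qquad (5)$$
The Dirac operator is defined on spinors and cospinors by
$$\slashed{\nabla}_{SM} := \gamma^a \nabla_a.$$
This defines natural transformations $\slashed{\nabla}\colon C_0^\infty \circ D_0 \Rightarrow C_0^\infty \circ D_0$, respectively $\slashed{\nabla}\colon C_0^\infty \circ D_0^* \Rightarrow C_0^\infty \circ D_0^*$. The intertwining relations of the adjoint and charge conjugation with the Dirac operator follow from their intertwining with $\gamma$ in Theorem 3.6: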

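Proposition 3.7. $\slashed{\nabla} \circ {}^+ = {}^+ \circ \slashed{\nabla}$, $\slashed{\nabla} \circ {}^c = -1 \circ {}^c \circ \slashed{\nabla}$.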

m Note the sign error in [6, 7].



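Proof. Recall that ${}^+$ and ${}^c$ can be defined pointwise on test-sections. Hence, on any object $SM$
$$(\slashed{\nabla} v)^c = ((\partial_a v - v \sigma_a) \gamma^a)^c = (\partial_a \bar{v} - \bar{v} \bar{\sigma}_a) \bar{\gamma}^a C = -((\partial_a (\bar{v} C) - \bar{v} C \sigma_a) \gamma^a) = -\slashed{\nabla} (\bar{v} C) = -\slashed{\nabla} v^c,$$
$$(\slashed{\nabla} u)^+ = (\gamma^a (\partial_a u + \sigma_a u))^+ = (\partial_a u^* + u^* \sigma_a^*) (\gamma^a)^* A = (\partial_a (u^* A) - u^* A \sigma_a) \gamma^a = \slashed{\nabla} (u^* A) = \slashed{\nabla} u^+,$$
where the minus sign in the last line appears because the order of the two factors of $\gamma$ in the expression for $\sigma_a$ needs to be changed. It follows that $(\slashed{\nabla} v)^+ = (\slashed{\nabla} v^{++})^+ = (\slashed{\nabla} v^+)^{++} = \slashed{\nabla} v^+$ and $(\slashed{\nabla} u)^c = (\slashed{\nabla} u^{++})^c = (\slashed{\nabla} u^+)^{+c} = -(\slashed{\nabla} u^+)^{c+} = (\slashed{\nabla} u^{+c})^+ = -(\slashed{\nabla} u^{c+})^+ = -\slashed{\nabla} u^c$.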

Remark 3.8. A change in the sign convention, $\eta' := -\eta$, has no physical consequences. In fact, this simply gives rise to $D' = Cl_{3,1}$ as the Dirac algebra, but since $Cl^0_{3,1} = Cl^0_{1,3}$ nothing changes in the representation^n of the group $\mathrm{Spin}^0_{1,3} = \mathrm{Spin}^0_{3,1}$. To accommodate this change one can set $\gamma'_a := i\gamma_a$ in Eq. (1), which yields the same Dirac algebra and other constructions (although we do get signs for all covectors when raising or lowering indices with $\eta'$). This also implies that one should drop the factor $i$ in front of the Dirac operator in the Dirac equation (6) below, which ensures that $P_c P = P P_c$ will still be a wave operator. We can also keep the same matrices $A$, $C$, which now must satisfy the relations:
$$-\gamma_a'^* = A \gamma'_a A^{-1}, \quad \overline{\gamma'_a} = C \gamma'_a C^{-1}.$$
The spinor and cospinor bundle and the adjoint and charge conjugation maps then remain the same and all the relations between these operations and the Dirac operator remain valid.

3.3. The Dirac equation and its fundamental solutions


The Dirac equation on spinor and cospinor fields, respectively, on a spin spacetime $SM$ is
$$(-i\slashed{\nabla} + m) u = 0, \quad (i\slashed{\nabla} + m) v = 0, \qquad (6)$$
where the constant $m \geq 0$ is to be interpreted as the mass of the field. These equations can be derived as the Euler–Lagrange equations from the action $S_D := \int L_D$
n Notice that a complex irreducible representation of $Cl_{1,3}$ extends to an irreducible representation of $M(4, \mathbb{C})$ and therefore also gives a complex irreducible representation of $Cl_{3,1}$ and vice versa. The standard Clifford algebra isomorphism $Cl_{3,1} \cong M(4, \mathbb{R})$ appears if and only if the representation of $Cl_{1,3}$ is a Majorana representation, i.e. if $\bar{\gamma}_a = -\gamma_a$. In that case we also find (see, e.g., [12, p. 332])
$$\mathrm{Pin}_{3,1} \cong \{S \in M(4, \mathbb{R}) \mid \det S = 1,\ \forall v \in M_0\colon SvS^{-1} \in M_0\} \neq \mathrm{Pin}_{1,3}.$$

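with the Lagrangian density^o
$$L_D := \langle u^+, (-i\slashed{\nabla} + m) u \rangle\, d\mathrm{vol}_g \qquad (7)$$
by varying with respect to $u$ and $u^+$, viewed as independent fields. The canonical momentum of the field $u$ on a Cauchy surface $C$ with future pointing normal vector field $n$ is defined as
$$\Pi(x) := \frac{1}{\sqrt{-\det g(x)}} \frac{\delta S_D}{\delta(\nabla_n \psi(x))} = -i\psi^+(x) \slashed{n}(x). \qquad (8)$$
We will write $P := -i\slashed{\nabla} + m$ for the operator on spinors and $P_c := i\slashed{\nabla} + m$ for the operator on cospinors. These are components of natural transformations $P\colon C_0^\infty \circ D_0 \Rightarrow C_0^\infty \circ D_0$, $P\colon C^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$ and $P_c\colon C_0^\infty \circ D_0^* \Rightarrow C_0^\infty \circ D_0^*$, $P_c\colon C^\infty \circ D_0^* \Rightarrow C^\infty \circ D_0^*$, which we denote by the same symbol. We then have by Proposition 3.7:
$$P \circ {}^c = {}^c \circ P, \quad P_c \circ {}^c = {}^c \circ P_c,$$
$${}^+ \circ P_c = P \circ {}^+, \quad {}^+ \circ P = P_c \circ {}^+, \qquad (9)$$
i.e. if a spinor field $u$ is a solution to the Dirac equation, then so are $u^+$ and $u^c$. (The adjoint and charge conjugation of $u$ are defined pointwise.)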
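For a distribution $v$ on $D_0 M$ we define the transpose $P^*$ by $\langle P^* v, u \rangle := \langle v, P u \rangle$
and similarly for $P_c^*$. In this way the transposes give rise to natural transformations
$P^* : \mathrm{Distr} \circ D_0 \Rightarrow \mathrm{Distr} \circ D_0$ and $P_c^* : \mathrm{Distr} \circ D_0^* \Rightarrow \mathrm{Distr} \circ D_0^*$.

Lemma 3.9. Let $\iota : C^\infty \circ D_0^* \Rightarrow \mathrm{Distr} \circ D_0$ and $\iota : C^\infty \circ D_0 \Rightarrow \mathrm{Distr} \circ D_0^*$ be the
canonical natural transformations (see the end of Sec. 2.2). Then $P^* \circ \iota = \iota \circ P_c$
and $P_c^* \circ \iota = \iota \circ P$.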

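Proof. This follows from the fact that for each object SM we have
$\int_M \langle u, \slashed{\nabla} v \rangle \, d\mathrm{vol}_g = -\int_M \langle \slashed{\nabla} u, v \rangle \, d\mathrm{vol}_g$
if at least one of $u \in C^\infty(D_0^* M)$ and $v \in C^\infty(D_0 M)$ is compactly supported.
This in turn follows from $\langle \slashed{\nabla} v, u \rangle + \langle v, \slashed{\nabla} u \rangle = \nabla_a \langle v, \gamma^a u \rangle$ and
Gauss' law.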

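One can find unique advanced and retarded fundamental solutions for the Dirac
equation, both for spinors and cospinors [6, 24]:

Theorem 3.10. There are unique natural transformations $S^\pm : C_0^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$
and $S_c^\pm : C_0^\infty \circ D_0^* \Rightarrow C^\infty \circ D_0^*$ such that $S^\pm \circ P$ and $P \circ S^\pm$ both equal the
canonical inclusion $C_0^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$, such that $S_c^\pm \circ P_c$ and $P_c \circ S_c^\pm$ both equal the
canonical inclusion $C_0^\infty \circ D_0^* \Rightarrow C^\infty \circ D_0^*$,
and such that for each $u \in C_0^\infty(D_0 M)$, $v \in C_0^\infty(D_0^* M)$ we have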

o The Lagrangian is a natural transformation between the functor $J^1 D_0$, which assigns to each
spin spacetime SM the first-order jet bundle $J^1 D_0 M$ of the spinor bundle $D_0 M$, to the functor
$|\Lambda^n|$ of densities. A component of this natural transformation covers the identity morphism of SM
and is only a morphism in Bund, not in $\mathrm{VBund}_{\mathbb{R}}$, because it is not linear.

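$\mathrm{supp}(S^\pm u) \subset J^\pm(\mathrm{supp}(u))$, $\mathrm{supp}(S_c^\pm v) \subset J^\pm(\mathrm{supp}(v))$. Moreover,

$$S^\pm \circ {}^c = {}^c \circ S^\pm, \quad S_c^\pm \circ {}^c = {}^c \circ S_c^\pm, \quad S_c^\pm \circ {}^+ = {}^+ \circ S^\pm, \quad S^\pm \circ {}^+ = {}^+ \circ S_c^\pm,$$

$$\int_M \langle \cdot , \cdot \rangle \circ (1 \times S^\pm) = \int_M \langle \cdot , \cdot \rangle \circ (S_c^\mp \times 1).$$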

Proof. The components of $S^\pm$ and $S_c^\pm$ are the advanced ($-$) and retarded ($+$)
fundamental solutions for $P$ and $P_c$, which are given by $S^\pm := (i\slashed{\nabla} + m)E^\pm$ and
$S_c^\pm := (-i\slashed{\nabla} + m)E^\pm$ respectively, where $E^\pm$ are the unique advanced and retarded
fundamental solutions for the normally hyperbolic operator
$(-i\slashed{\nabla} + m)(i\slashed{\nabla} + m) = (i\slashed{\nabla} + m)(-i\slashed{\nabla} + m) = \slashed{\nabla}^2 + m^2$.
We refer to [6, Theorem 2.1] for a detailed proof of
the existence and uniqueness of these operators (see also [16] for the existence and
uniqueness of $E^\pm$).
The naturality of $S^\pm$ and $S_c^\pm$ follows from their uniqueness and the naturality of
$P$ and $P_c$. In detail: for every morphism $\chi : SM_1 \to SM_2$ and every $f \in C_0^\infty(D_0 M_1)$
the unique smooth solution to $P u = \chi_* f$ on $M_2$ with $\mathrm{supp}(u) \subset J^\pm(\mathrm{supp}(\chi_* f))$
pulls back to a solution $v := \chi^* u$ of $P v = f$ on $M_1$ with $\mathrm{supp}(v) \subset J^\pm(\mathrm{supp}(f))$.
By uniqueness we must then have $u = S^\pm \chi_* f$ and $\chi^* u = S^\pm f$, i.e.
$\chi^* \circ S^\pm \circ \chi_* = S^\pm$. The same holds for cospinors. The commutation of $S^\pm$ and $S_c^\pm$
with charge conjugation and adjoints follows from Eq. (9).
For arbitrary $u \in C_0^\infty(D_0 M)$ and $v \in C_0^\infty(D_0^* M)$ we can find a $\varphi \in C_0^\infty(M)$
which is identically one on the compact set $\mathrm{supp}(S^\pm u) \cap \mathrm{supp}(S_c^\mp v)$. We then
compute:

$$\int_M \langle v, S^\pm u \rangle = \int_M \langle P_c S_c^\mp v, S^\pm u \rangle = \int_M \langle \varphi S_c^\mp v, P S^\pm u \rangle = \int_M \langle S_c^\mp v, P S^\pm u \rangle = \int_M \langle S_c^\mp v, u \rangle,$$

which proves the last claim.
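
The factorisation of the second-order operator into two first-order Dirac-type factors, used in the proof above, can be checked numerically in the flat case. The following Python sketch is only an illustration under stated assumptions: it works in Minkowski spacetime with signature (+, -, -, -), uses the Dirac representation of the gamma matrices, and replaces $i\slashed{\nabla}$ by its action on a plane wave $e^{-ik\cdot x}$, namely multiplication by $\slashed{k}$. The helper names `slash`, `clifford_relations_hold` and `factorisation_holds` are hypothetical and do not appear in the text.

```python
import numpy as np

# Dirac representation of the gamma matrices (an assumed, standard choice)
# for the Minkowski metric with signature (+, -, -, -).
I2 = np.eye(2, dtype=complex)
O2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[I2, O2], [O2, -I2]])] + [
    np.block([[O2, s], [-s, O2]]) for s in PAULI
]
ETA = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)


def clifford_relations_hold():
    """Check gamma^a gamma^b + gamma^b gamma^a = 2 eta^{ab} I."""
    return all(
        np.allclose(GAMMA[a] @ GAMMA[b] + GAMMA[b] @ GAMMA[a], 2 * ETA[a, b] * I4)
        for a in range(4)
        for b in range(4)
    )


def slash(k):
    """Contract a covector k_a with the gamma matrices: k-slash = gamma^a k_a."""
    return sum(GAMMA[a] * k[a] for a in range(4))


def factorisation_holds(k, m):
    """On a plane wave, (-i nabla-slash + m)(i nabla-slash + m) becomes
    (-k-slash + m)(k-slash + m), which should equal (m^2 - k.k) times the identity."""
    k = np.asarray(k, dtype=float)
    k_squared = k @ ETA @ k
    product = (-slash(k) + m * I4) @ (slash(k) + m * I4)
    return np.allclose(product, (m**2 - k_squared) * I4)
```

Calling `factorisation_holds([1.3, 0.2, -0.7, 0.5], 0.9)` checks the identity for one covector and mass. The value $m^2 - k\cdot k$ is the plane-wave symbol of the flat wave operator plus $m^2$, so the check only probes the algebraic part of the argument, not the curved-spacetime analysis.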

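We define the advanced-minus-retarded fundamental solutions $S := S^- - S^+$ and
$S_c := S_c^- - S_c^+$, which are natural transformations $S : C_0^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$ and
$S_c : C_0^\infty \circ D_0^* \Rightarrow C^\infty \circ D_0^*$ respectively.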

3.4. The non-uniqueness of the functorial Dirac structure


We have seen that the (standard) structure of Dirac spinors and cospinors, adjoints,
charge conjugation and the Dirac operator is entirely determined by the functor $D_0$
and the natural equivalences $^+$, $^c$ and $\gamma$. We formalise this with a definition:

Definition 3.11. By a Dirac structure $D := (D, {}^+, {}^c, \gamma)$ we mean a locally covariant
spinor bundle $D$ with a dual bundle $D^*$, natural equivalences ${}^+ : D \Rightarrow D^*$, ${}^c : D \Rightarrow D$,

and c : D D in VBundR and a natural transformation : D T D in


VBundC , all of whose components cover the identity morphism and satisfying the
relations (4) and SM (v + , v), n 0 for every time-like future pointing vector
n T M.
We call $D_0 := (D_0, {}^+, {}^c, \gamma)$ of Theorem 3.6 the standard Dirac structure.
The category DStruc has all Dirac structures as objects and its morphisms
$t : D_1 \to D_2$ are all natural transformations $t : D_1 \Rightarrow D_2$ whose components are
injective morphisms covering the identity morphism and intertwining the adjoints,
charge conjugation and $\gamma$ as follows:
+2
t = t+1 , c2
t = tc1 , 2 (t t) = 1 .
For each Dirac structure, one can perform the constructions of Sec. 3.3. Because
the Dirac algebra D has a unique irreducible complex representation one might
expect that the category DStruc admits a corresponding unique initial object,
perhaps up to isomorphism. This is an object from which there exists a morphism
into any other object. However, as we will explain in this section there is a certain
cohomological obstruction of the category SSpac involved. We will first consider
the standard Dirac structure, which would be a good candidate for an initial object,
and prove the following weaker property:
Proposition 3.12. Any morphism t from a Dirac structure D to the standard
Dirac structure D0 is an isomorphism.

Proof. Let $t : D \to D_0$ be a morphism. By the injectivity of the components of
$t : D \Rightarrow D_0$ we see that the complex dimension of the fiber of $DM$ is at most four.
On the other hand, the vector bundles $DM$ are modules for the Dirac algebra
represented by $\gamma$. Because this algebra is simple, and because Eqs. (4) exclude the
trivial representation, we find that $DM$ must have complex dimension at least four.
Therefore, $t : D \Rightarrow D_0$ must be a natural equivalence and it follows that $t : D \to D_0$
is an isomorphism.

Corollary 3.13. If we construct a Dirac structure $D$ analogous to $D_0$, but using
a different representation and matrices $A$, $C$, then $D$ is isomorphic to $D_0$.

Proof. Because we use the same representation on all spacetimes we can construct
a natural equivalence $t : D \Rightarrow D_0$ whose components are of the form $t_{SM}([E, z]) :=
[E, Lz]$ for some $L \in GL(4, \mathbb{C})$ which is independent of SM (cf. Theorem 3.5).

Corollary 3.14. If $D' := (D_0, {}^{+_1}, {}^{c_1}, \gamma')$ is any Dirac structure with the standard
locally covariant Dirac spinor bundle $D_0$, then $D'$ is isomorphic to the standard
Dirac structure $D_0$.

Proof. At each point $x$ in each object SM we can view $\gamma'_a$ as matrices that
represent the Dirac algebra in a representation $\pi'$. Using the Fundamental
Theorem 2.2, we write $\gamma'_a = L \gamma_a L^{-1}$ for some $L(x) \in GL(4, \mathbb{C})$. As $\gamma'_a$ is well-defined

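on $D_0$ we must have $\pi_0(S) \gamma'_a \pi_0(S^{-1}) = \gamma'_b \Lambda^b{}_a(S)$ for all $S \in \mathrm{Spin}^0_{1,3}$. This also
holds for the matrices $\gamma_a$, so we conclude from the Fundamental Theorem that
$\pi_0(S) L(x) = c(x) L(x) \pi_0(S)$, where $c \equiv 1$ by taking $S = I$. We can now define a
natural equivalence $t : D_0 \Rightarrow D_0$ by $[E, z] \mapsto [E, L(p(E))z]$ such that $\gamma' \circ t = t \circ \gamma$. If
we also define ${}^{+_2} := t \circ {}^{+_1} \circ t^{-1}$ and ${}^{c_2} := t \circ {}^{c_1} \circ t^{-1}$, then $D' \cong (D_0, {}^{+_2}, {}^{c_2}, \gamma) \cong D_0$,
where the last equivalence follows from the previous corollary.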

In fact, the proof of Corollary 3.13 shows that for any SM the quadruple
$(DM, {}^+, {}^c, \gamma)$ is unique up to an isomorphism $t_{SM}$, if $DM$ has four-dimensional
complex fibers. The isomorphism $t_{SM}$ itself, however, is only unique up to a sign.
In other words, on each spin spacetime we find a discrete $\mathbb{Z}_2$-symmetry that
preserves all physical relations.^p
Consider two Dirac structures $D$ and $D'$ whose locally covariant spinor bundles
$D$ and $D'$ have four-dimensional complex fibers. Comparing the action of these
functors on morphisms of SSpac one finds a diagram that commutes up to a sign.
The existence of an initial object in the category DStruc then boils down to the
question whether one can choose signs for all spin spacetimes SM in such a way
that all the diagrams commute. The answer is not at all obvious, but can be neatly
formulated in terms of the first Stiefel-Whitney class of the category SSpac. To
explain this we will briefly recall the definition of cohomology groups for categories
(cf. [26]).
If $C$ is any category, we can first build a simplicial set from it called the nerve
of the category (cf. [27]). A 0-simplex is simply an object of $C$, a 1-simplex is a
morphism between two objects, a 2-simplex is a commutative triangle, etc. We will
write $\Sigma_n$ for the set of all $n$-simplices. For $n \ge 1$ every $n$-simplex has $n + 1$ faces,
which are described by maps $\partial_j : \Sigma_n \to \Sigma_{n-1}$, $0 \le j \le n$, which remove the $j$th
vertex from the diagram.
To find the cohomology of $C$ with values in an Abelian group^q $G$, we define
an $n$-cochain with values in $G$ to be a map $v : \Sigma_n \to G$. We denote the set
of $n$-cochains with values in $G$ by $C^n(G)$ and we define the coboundary map
$d : C^n(G) \to C^{n+1}(G)$ by

$$dv(s) := \sum_{j=0}^{n+1} (-1)^j v(\partial_j s), \qquad s \in \Sigma_{n+1},$$

where we have written the group operation of $G$ additively. One checks that $d^2 =
0$ and defines $v$ to be closed iff $dv = 0$ and exact iff $v = dt$ for some $(n-1)$-cochain
$t$. The sets of closed and exact $n$-cochains are denoted by $B^n(G)$ and
$Z^n(G)$, respectively. They inherit an Abelian group structure from $G$ and because

p This may be compared to [25], who use complex spinor structures and then find a local (gauge)
symmetry instead of our more restricted global symmetries.

q [26] also considers the non-Abelian case, which is much more involved.

$Z^n(G) \subset B^n(G)$ is necessarily normal one can define the $n$th cohomology group as
the quotient $H^n(G) := B^n(G)/Z^n(G)$.
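
The coboundary map can be made concrete on a very small example. The following Python sketch is an illustration only and makes its own assumptions: the category is taken to be the totally ordered set 0 ≤ 1 ≤ 2 ≤ 3 (one morphism from i to j whenever i ≤ j), so that an n-simplex of the nerve is a non-decreasing chain of n + 1 objects and the j-th face map deletes the j-th vertex; the coefficient group is Z_2 = {0, 1} written additively. The names `simplices`, `face`, `coboundary` and `random_cochain` are hypothetical.

```python
from itertools import combinations_with_replacement
import random

N_OBJECTS = 4  # toy category (an assumption): the totally ordered set 0 <= 1 <= 2 <= 3


def simplices(n):
    """All n-simplices of the nerve: chains x0 <= x1 <= ... <= xn of objects."""
    return list(combinations_with_replacement(range(N_OBJECTS), n + 1))


def face(s, j):
    """The j-th face map: remove the j-th vertex from the chain s."""
    return s[:j] + s[j + 1:]


def coboundary(v, n):
    """Send an n-cochain v (dict: n-simplex -> 0 or 1) to the (n+1)-cochain dv."""
    return {
        s: sum((-1) ** j * v[face(s, j)] for j in range(n + 2)) % 2
        for s in simplices(n + 1)
    }


def random_cochain(n, rng):
    """An arbitrary n-cochain with values in Z_2."""
    return {s: rng.randint(0, 1) for s in simplices(n)}


rng = random.Random(0)
t = random_cochain(0, rng)   # a 0-cochain: one element of Z_2 per object
v = coboundary(t, 0)         # the exact 1-cochain v = dt
u = random_cochain(1, rng)   # an arbitrary 1-cochain
```

In this sketch `coboundary(v, 1)` vanishes identically, and so does `coboundary(coboundary(u, 1), 2)`, which is the statement $d^2 = 0$ in degrees 0 and 1. For a 1-cochain, closedness on a 2-simplex (a, b, c) reads v(a, c) = v(a, b) + v(b, c) in Z_2, i.e. the value on a composite morphism is the sum of the values on its factors.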
Now let us return to the study of Dirac structures. Suppose that $D$ and $D'$
both have four-dimensional complex fibers. Without loss of generality, we may
assume that both Dirac structures coincide on each spin spacetime, but the action
of their locally covariant spinor bundles on a morphism $\chi$ agrees only up to a sign
$v(\chi) \in \{\pm 1\}$. We can view $v : \chi \mapsto v(\chi)$ as a 1-cochain on the category SSpac
with values in $\mathbb{Z}_2 = \{0, 1\}$ (where 0 corresponds to $+1$ and 1 to $-1$). Notice that
for a composition of morphisms $\chi = \chi_1 \circ \chi_2$ we find $v(\chi) = v(\chi_1) + v(\chi_2)$ in $\mathbb{Z}_2$,
because the Dirac structures are both functorial. In cohomological terms this means
precisely that $dv = 0$.
If there is a natural equivalence $t : D \Rightarrow D'$, then the components $t_{SM}$ are
automorphisms of the Dirac structure at each SM, i.e. $t_{SM} = \pm 1$, that compensate for
all the minus signs in $v$. If we view $t$ as a 0-cochain with values in $\mathbb{Z}_2$, this means
exactly that $v = dt$. So we have proved:

Theorem 3.15. The number of inequivalent Dirac structures whose locally covariant
spinor bundles have four-dimensional complex fibers equals the number of first
Stiefel-Whitney classes of the category SSpac, i.e. the number of elements in
$H^1(\mathbb{Z}_2)$.

Remark 3.16. For scalar and vector fields the problem above can be avoided in
a natural way. Taking $L_+^\uparrow$ in the defining (four-vector) representation, the vector
bundle associated to $F_+^\uparrow M$ is just the tangent bundle $TM$. A morphism in Spac
determines a unique morphism on the tangent bundle, so no topological
obstructions occur. Similarly for the scalar field, where one uses the trivial one-dimensional
representation of $L_+^\uparrow$, whose associated vector bundle is $\Lambda^0(M) = M \times \mathbb{R}$. Again a
morphism in $\mathrm{Man}^n$ automatically determines a unique morphism on these associated
vector bundles, now by the requirement that the volume element is preserved.
In general one is dealing with representations of $\mathrm{Spin}^0_{1,3}$ and associates to each
morphism in SSpac an intertwining operator between such representations. For
the associated vector bundles of SM, the physical requirements that we imposed on
the bundle morphisms, concerning the adjoint and charge conjugation maps and $\gamma$,
reduce the intertwiners exactly to a choice of lifting $L_+^\uparrow$ to its double cover. In this
way it leads to the same first Stiefel-Whitney class that characterizes the number
of spin structures on a manifold. For the general case it is expected that one needs
a non-Abelian cohomology theory to quantify the obstruction for finding initial
objects.

4. The Locally Covariant Quantum Dirac Field


After our discussion of the classical Dirac field in Sec. 3 we now turn to the quantum
Dirac field, its construction, its Hadamard states and its relative Cauchy evolution.

4.1. Quantization of the free Dirac field


First, we will quantize the free Dirac field in a generally covariant way and establish
some of its properties. For this purpose we also present the main ideas of locally
covariant quantum field theory as introduced in [2] (see also [5]).
In the following, any quantum physical system will be described by a topological
$*$-algebra $\mathcal{A}$ with a unit $I$, whose self-adjoint elements are the observables of the
system. An injective and continuous $*$-homomorphism expresses the notion of a
subsystem, whereas a state is described by a normalized and positive continuous
linear functional $\omega$, i.e. $\omega(A^* A) \ge 0$ for all $A \in \mathcal{A}$ and $\omega(I) = 1$. The state space
of $\mathcal{A}$ is the set of all states and is denoted by $\mathcal{A}^{*+}_1$. Every state $\omega$ gives rise to a
GNS-representation $\pi_\omega$ (see [28, Theorem 8.6.2]), which is characterized uniquely,
up to unitary equivalence, by the GNS-quadruple $(\pi_\omega, \mathcal{H}_\omega, \Omega_\omega, \mathcal{D}_\omega)$. Here $\mathcal{H}_\omega$ is
the Hilbert space on which $\pi_\omega(\mathcal{A})$ acts as (possibly unbounded) operators with
the dense, invariant domain $\mathcal{D}_\omega := \pi_\omega(\mathcal{A}) \Omega_\omega$. The vector $\Omega_\omega$ is cyclic and satisfies
$\omega(A) = \langle \Omega_\omega, \pi_\omega(A) \Omega_\omega \rangle$ for all $A \in \mathcal{A}$.
The collection of all systems forms a category TAlg:

Definition 4.1. The category TAlg has as its objects all unital topological
$*$-algebras $\mathcal{A}$ and as its morphisms all continuous and injective $*$-homomorphisms
$\alpha$ such that $\alpha(I) = I$.
A locally covariant quantum field theory is a (covariant) functor $A :$ SSpac $\to$
TAlg, written as $SM \mapsto A_{SM}$, $\chi \mapsto \alpha_\chi$.
A locally covariant quantum field theory $A$ is called causal if and only if any
pair of morphisms $\chi_i : SM_i \to SM$, $i = 1, 2$, such that $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ in $M$
yields $[\alpha_{\chi_1}(A_{SM_1}), \alpha_{\chi_2}(A_{SM_2})] = \{0\}$ in $A_{SM}$.
A locally covariant quantum field theory $A$ satisfies the time-slice axiom iff for
all morphisms $\chi : SM_1 \to SM_2$ such that $\psi(M_1)$ contains a Cauchy surface for $M_2$
we have $\alpha_\chi(A_{SM_1}) = A_{SM_2}$.

Notice that the condition $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ is symmetric in $i = 1, 2$. The
causality condition formulates how the quantum physical system interplays with
the classical gravitational background field, whereas the time-slice axiom expresses
the existence of a causal dynamical law.
We now fix a choice of Dirac structure $D := (D, {}^+, {}^c, \gamma)$, in order to turn
the free Dirac field into a locally covariant field theory. Because we want to
impose the canonical anti-commutation relations it will also be convenient to
quantize spinor and cospinor fields simultaneously by introducing the following
terminology:

Definition 4.2. The locally covariant double spinor bundle is the covariant functor
$D \oplus D^*$. We define the following natural equivalences and natural transformations

on this bundle, indicated by their components at SM :


$(p \oplus q)^c := p^c \oplus q^c, \qquad (p \oplus q)^+ := q^+ \oplus p^+,$
(p q) := ( p) ( q), p q, p q   := p+ , p  + q  , q + ,
(p q) := p (q).
A double spinor (field) is an element of $C^\infty(DM \oplus D^*M)$. A double test-spinor
(field) is an element of $C_0^\infty(DM \oplus D^*M)$. The adjoint, charge conjugation and
other operations are defined pointwise. We also define the operator $P := P \oplus P_c$, its
advanced ($-$) and retarded ($+$) fundamental solutions $S^\pm(u \oplus v) := (S^\pm u) \oplus (S_c^\pm v)$
and $S := S^- - S^+$.
The exterior tensor product $\mathcal{V}_1 \boxtimes \mathcal{V}_2$ of two vector bundles $\mathcal{V}_i$ with fiber $V_i$ over
manifolds $M_i$, $i = 1, 2$, is the vector bundle over $M_1 \times M_2$ whose fiber is $V_1 \otimes V_2$
and whose local trivializations are determined by $(O_1 \times O_2) \times (V_1 \otimes V_2)$, where
$O_i \times V_i$ are local trivializations of $\mathcal{V}_i$.
The Dirac Borchers-Uhlmann algebra $F^0_{SM}$ on a spin spacetime SM is the
topological $*$-algebra

$$F^0_{SM} := \bigoplus_{n=0}^{\infty} C_0^\infty\big((DM \oplus D^*M)^{\boxtimes n}\big),$$

where the direct sum is algebraic (i.e. only finitely many non-zero summands are
allowed) and
(1) the product is given by continuous linear extension of $f_1 f_2 := f_1 \boxtimes f_2$,
(2) the $*$-operation is given by continuous antilinear extension of
$(f_1 \boxtimes \cdots \boxtimes f_n)^* := f_n^+ \boxtimes \cdots \boxtimes f_1^+$,
(3) as a topological vector space $F^0_{SM}$ is the strict inductive limit
$F^0_{SM} = \bigcup_{N=0}^{\infty} \bigoplus_{n=0}^{N} C_0^\infty\big((DM \oplus D^*M)^{\boxtimes n}|_{K_N^{\times n}}\big)$,
where $K_N$ is an exhausting and increasing sequence of compact subsets of $M$ and the
test-section space of the restricted vector bundle $(DM \oplus D^*M)^{\boxtimes n}|_{K_N^{\times n}}$ is given the
test-section topology.

The topology of $F^0_{SM}$ is such that a state $\omega$ is given by a sequence of $n$-point
distributional sections $\omega_n$ of $(DM \oplus D^*M)^{\boxtimes n}$. A morphism $\chi : SM_1 \to SM_2$ in SSpac
determines a unique morphism $\alpha_\chi : F^0_{SM_1} \to F^0_{SM_2}$ that is given by the algebraic and
continuous extension of the morphism $DM_1 \oplus D^*M_1 \to DM_2 \oplus D^*M_2$ that is
supplied by the functor $D$. Together with this map on morphisms the map $SM \mapsto F^0_{SM}$
becomes a locally covariant quantum field theory $F^0 :$ SSpac $\to$ TAlg. Our next
task will be to divide out the ideals that generate the dynamics and the canonical
anti-commutation relations.
We define the natural transformation $(\,,) : (C_0^\infty \circ (D \oplus D^*)) \otimes_{\mathbb{R}} (C_0^\infty \circ (D \oplus D^*)) \Rightarrow \mathbb{C}$
whose components are the sesquilinear forms:

$$(f_1, f_2) := i \int_M \langle f_1, S f_2 \rangle.$$

Note that this is indeed a natural transformation, because it can be written as a



composition of natural transformations including ,  , , + and .

Lemma 4.3. On each object SM the sesquilinear form $(\,,)$ is Hermitean,
$(f_1, f_2) = \overline{(f_1^c, f_2^c)} = \overline{(f_2, f_1)}$, and there holds $(f_1^+, f_2^+) = (f_2, f_1)$.
For any spacelike Cauchy surface $C \subset M$ with future pointing unit normal vector field $n^a$ we have

$$(u_1 \oplus v_1, u_2 \oplus v_2) = \int_C \langle (S u_1)^+, \slashed{n} (S u_2) \rangle + \langle S_c v_2, \slashed{n} (S_c v_1)^+ \rangle. \qquad (10)$$

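Proof. The symmetry properties follow straightforwardly from the computational
rules of Theorems 3.6 and 3.10. For the last statement we also need a partial
integration (see, e.g., [20, Eq. (B.2.26)] for Gauss' law) and we use the Dirac equation:

$$\begin{aligned}
(u_1 \oplus v_1, u_2 \oplus v_2)
&= i \int_{J^+(C)} \big( \langle P_c S_c^- u_1^+, S u_2 \rangle + \langle P_c S_c^- v_2, S v_1^+ \rangle \big)
 + i \int_{J^-(C)} \big( \langle P_c S_c^+ u_1^+, S u_2 \rangle + \langle P_c S_c^+ v_2, S v_1^+ \rangle \big) \\
&= -\int_{J^+(C)} \big( \nabla_a \langle S_c^- u_1^+, \gamma^a S u_2 \rangle + \nabla_a \langle S_c^- v_2, \gamma^a S v_1^+ \rangle \big)
 - \int_{J^-(C)} \big( \nabla_a \langle S_c^+ u_1^+, \gamma^a S u_2 \rangle + \nabla_a \langle S_c^+ v_2, \gamma^a S v_1^+ \rangle \big) \\
&= \int_C \big( n_a \langle S_c^- u_1^+, \gamma^a S u_2 \rangle + n_a \langle S_c^- v_2, \gamma^a S v_1^+ \rangle \big)
 - \int_C \big( n_a \langle S_c^+ u_1^+, \gamma^a S u_2 \rangle + n_a \langle S_c^+ v_2, \gamma^a S v_1^+ \rangle \big) \\
&= \int_C \langle (S u_1)^+, \slashed{n} (S u_2) \rangle + \langle S_c v_2, \slashed{n} (S_c v_1)^+ \rangle.
\end{aligned}$$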

From Eq. (10) we notice that $(\,,)$ is positive semi-definite and hence defines a
(degenerate) inner product. We proceed by dividing $F^0_{SM}$ by the closed ideal $J_{SM}$
of $F^0_{SM}$ generated by all elements of the form $P f$ or $f_1^+ f_2 + f_2 f_1^+ - (f_1, f_2) I$.

Theorem 4.4. The ideal $J_{SM}$ is a $*$-ideal and for any morphism $\chi : SM_1 \to SM_2$
we have $\alpha_\chi(J_{SM_1}) \subset J_{SM_2}$. We can define the locally covariant quantum field
theory $F :$ SSpac $\to$ TAlg which assigns to every spin spacetime SM the $C^*$-algebra
$F_{SM} := \overline{F^0_{SM} / J_{SM}}$.

Proof. The elements that generate $J_{SM}$ are invariant under adjoints and under
a morphism they are mapped to elements of the same form. This proves the first
statement. It follows that the quotients $F^0_{SM} / J_{SM}$ are topological $*$-algebras and

that a morphism $\alpha_\chi : F^0_{SM_1} \to F^0_{SM_2}$ descends to the quotients as a well-defined
morphism. That each algebra $F^0_{SM} / J_{SM}$ has a $C^*$-norm follows from the fact that
they are the inductive limits of finite-dimensional Clifford algebras ([29]). The
morphisms on the quotients are necessarily continuous in the norm and therefore extend
to morphisms on the $C^*$-algebras $F_{SM}$.

Definition 4.5. A locally covariant quantum field in the locally covariant vector
bundle $V$ for the locally covariant quantum field theory $A$ is a natural
transformation $\Phi : C_0^\infty \circ V \Rightarrow f \circ A$, where we let $f :$ TAlg $\to$ TVec be the forgetful functor.
We define the locally covariant quantum fields $B : D \oplus D^* \Rightarrow F$, $\psi : D^* \Rightarrow F$
and $\psi^+ : D \Rightarrow F$ by $B_{SM}(f) := 0 \oplus f \oplus 0 \oplus \cdots + J_{SM}$, $\psi_{SM}(v) := B_{SM}(0 \oplus v)$ and
$\psi^+_{SM}(u) := B_{SM}(u \oplus 0)$.
That the latter really are locally covariant quantum fields is a consequence of

Proposition 4.6. The operator-valued maps $B_{SM}$, $\psi_{SM}$, $\psi^+_{SM}$ are $C^*$-algebra-valued
distributions and:

(1) $P \psi = 0$ and $P_c \psi^+ = 0$,
(2) $\psi^+_{SM}(u) = \psi_{SM}(u^+)^*$,
(3) $\{\psi^+_{SM}(u), \psi_{SM}(v)\} = (v^+ \oplus 0, u \oplus 0) I = i \int_M \langle v, S u \rangle I$ and the other
anti-commutators vanish.

Proof. The first item is $P B_{SM}(f) = B_{SM}(P^* f) = B_{SM}(P f) = 0$, where $P^*$ is
the formal adjoint of $P$. The last two items follow from the definitions of $\psi_{SM}$ and
$\psi^+_{SM}$ and the properties of $B_{SM}$ after a straightforward computation.
It remains to show that $\psi_{SM}$, $\psi^+_{SM}$ are $C^*$-algebra-valued distributions, because
the result for $B_{SM}$ then follows. The $C^*$-subalgebra of $F_{SM}$ generated by
$I$, $\psi_{SM}(v)$, $\psi_{SM}(v)^*$ is a Clifford algebra which is isomorphic to $M(2, \mathbb{C})$ and an
explicit isomorphism is given by
$\psi_{SM}(v) \mapsto \begin{pmatrix} 0 & \sqrt{c} \\ 0 & 0 \end{pmatrix}$, where $c = (0 \oplus v, 0 \oplus v) = i \int_M \langle v, S v^+ \rangle > 0$.
It follows that $\|\psi_{SM}(v)\| = \sqrt{c}$ is the operator norm of the
corresponding matrix, i.e.^r

$$\|\psi_{SM}(v)\|^2 = i \int_M \langle v, S v^+ \rangle \, d\mathrm{vol}_g.$$

In the test-spinor topology we then have continuous maps $v \mapsto v \oplus v^+ \mapsto
i \int_M \langle v, S v^+ \rangle$, from which it follows that $v \mapsto \psi_{SM}(v)$ is norm continuous, i.e.
it is a $C^*$-algebra-valued distribution. The proof for $\psi^+_{SM}$ is analogous.
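
The norm computation in the proof above reduces to a statement about 2x2 matrices, which can be checked directly. The following Python sketch is an illustration under assumptions of its own: it represents the field operator by a nilpotent 2x2 matrix whose single non-zero entry is the square root of c (the position of that entry is a choice made here), and the value `c = 2.5` is a hypothetical stand-in for the inner product of the test cospinor with itself. The names `psi_matrix` and `anticommutator` are hypothetical.

```python
import numpy as np


def psi_matrix(c):
    """Nilpotent 2x2 matrix with single non-zero entry sqrt(c), for c > 0."""
    return np.array([[0.0, np.sqrt(c)], [0.0, 0.0]])


def anticommutator(a, b):
    """Return the anti-commutator ab + ba of two square matrices."""
    return a @ b + b @ a


c = 2.5  # hypothetical value of the inner product (assumption)
psi = psi_matrix(c)
psi_adjoint = psi.conj().T
operator_norm = np.linalg.norm(psi, 2)  # largest singular value of psi
```

With these definitions the matrix squares to zero, its anti-commutator with its adjoint is c times the identity, and the square of its operator norm equals c, which mirrors the relation between the norm of the field operator and the inner product used in the proof.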

Note that the last two conditions of Proposition 4.6 can also be formulated in
terms of natural transformations, because the algebraic operations in $F_{SM}$ can be
expressed as such. The theory $F$ is the quantized free Dirac field and $\psi$ ($\psi^+$) is

r The factor 2 in [7, Remark 2, p. 340] seems to be erroneous.



the locally covariant Dirac (co)spinor field. Alternatively we could have used the
algebras $F^0_{SM} / J_{SM}$ themselves instead of completing them to $C^*$-algebras.
To see that the anti-commutator is the canonical one (cf. [24]) we apply [6,
Proposition 2.4(c)] which says that $S|_{C \times C} = -i \slashed{n} \delta$ for a Cauchy surface $C$ with
future pointing normal vector field $n$. Comparing with Eq. (8) and using $\slashed{n}^2 = I$ we
then find

$$\{ i \psi^+_{SM}(\slashed{n} \delta_x), \psi_{SM}(\delta_y) \} = -\int_M \langle \delta_y, S \slashed{n} \delta_x \rangle I = i \delta(y, x) I$$

as expected.
So far our construction depends on the choice of a Dirac structure, although
naturally equivalent Dirac structures yield naturally equivalent theories and quantum
fields. The following theorem restricts attention to the observable algebra, dividing
out the freedom of choice completely and yielding a unique theory, but for many
purposes it is not convenient to use it directly because it lacks locally covariant
Dirac (co)spinor fields.

Theorem 4.7. Let $B :$ SSpac $\to$ TAlg be the locally covariant quantum field theory
that assigns to each spin spacetime SM the $C^*$-subalgebra of $F_{SM}$ generated by all
even polynomials in elements $B(f)$, with the induced action on morphisms. For all
Dirac structures with four-dimensional complex fibers the resulting theories $B$ are
isomorphic.

Proof. The algebras $B_{SM}$ generated by the even polynomials are $C^*$-algebras.
Morphisms respect evenness and so restrict to morphisms on $B$, making $B$ a
well-defined locally covariant quantum field theory. Now consider two Dirac structures $D$
and $D_0$ with associated functors $F$, $B$ and $F_0$, $B_0$. If both Dirac structures have
four-dimensional complex fibers, then we infer from the comment below Corollary 3.13
that there are $*$-isomorphisms $\beta_{SM} : F_{SM} \to (F_0)_{SM}$ such that for any morphism
$\chi : SM_1 \to SM_2$ we have $\beta_{SM_2} \circ \alpha_\chi = \epsilon_\chi (\alpha_0)_\chi \circ \beta_{SM_1}$, where $\epsilon_\chi = \pm 1$ depends
only on $\chi$. It follows from the evenness that the $\beta_{SM}$ descend to $*$-isomorphisms
$\beta_{SM} : B_{SM} \to (B_0)_{SM}$ that intertwine with the morphisms. Hence, $B$ and $B_0$ are
naturally equivalent.

Proposition 4.8. The locally covariant quantum field theory $B :$ SSpac $\to$ TAlg
of Theorem 4.7 is causal and satisfies the time-slice axiom.

Proof. Causality follows from the anti-commutation relations,

$$\begin{aligned}
[B_{SM}(f_1) B_{SM}(f_2), B_{SM}(f_3)]
&= B_{SM}(f_1) \{ B_{SM}(f_2), B_{SM}(f_3) \} - \{ B_{SM}(f_1), B_{SM}(f_3) \} B_{SM}(f_2) \\
&= (f_2, f_3) B_{SM}(f_1) - (f_1, f_3) B_{SM}(f_2),
\end{aligned}$$

together with the support properties of $S$. For the time-slice axiom, we let
$\chi : SM \to SM'$ be a morphism in SSpac, covering a morphism $\psi : M \to M'$ in Spac,

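such that $N := \psi(M) \subset M'$ contains a Cauchy surface $C \subset M'$. Then we can choose
Cauchy surfaces $C^\pm \subset N$ such that $C^\pm \subset I^\pm(C)$ and a smooth partition of unity
$\varphi^+$, $\varphi^-$ with $\mathrm{supp}\, \varphi^\pm \subset J^\pm(C^\mp)$. Let $f \in C_0^\infty(DM' \oplus D^*M')$ and write

$$f = P(\varphi^- S^+ f + \varphi^+ S^- f) + \tilde{f}, \qquad (11)$$

where $\tilde{f} := -P(\varphi^+ S f) = P(\varphi^- S f)$ is supported in $J^+(C^-) \cap J^-(C^+) \subset N$
and $\varphi^+ S^- f + \varphi^- S^+ f$ has compact support. Hence, $B_{SM'}(f) = B_{SM'}(\tilde{f}) =
\alpha_\chi(B_{SM}(\chi^*(\tilde{f})))$. Because the algebra $F_{SM'}$ is generated by such elements this
shows that $\alpha_\chi$ is a $*$-isomorphism.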
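
The first step of the causality argument is a purely algebraic identity: a commutator with a product of two elements can be rewritten in terms of anti-commutators. The following Python sketch checks this identity on random complex matrices. It is an illustration only; the matrices are arbitrary stand-ins and not field operators, and the names `commutator` and `anticommutator` are hypothetical helpers.

```python
import numpy as np


def commutator(a, b):
    """Return ab - ba."""
    return a @ b - b @ a


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


rng = np.random.default_rng(0)
# Three arbitrary complex 4x4 matrices standing in for algebra elements.
A, B, C = (
    rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    for _ in range(3)
)

lhs = commutator(A @ B, C)
rhs = A @ anticommutator(B, C) - anticommutator(A, C) @ B
```

Here `lhs` and `rhs` agree up to rounding, since the two terms containing the product ACB cancel on the right-hand side. When the anti-commutators are multiples of the identity, as for the canonical anti-commutation relations, the right-hand side becomes a linear combination of the two original elements.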

Remark 4.9. A Majorana spinor is a spinor u such that u = uc . In this case the
adjoint is anti-Majorana: u+c = uc+ = u+ . We call a double spinor f = u v
Majorana iff u and v + are Majorana, which means that f c = f . Such spinors
are sections of a subbundle of the Dirac spinor bundle, which can be described
by a Majorana representation. Notice that every spinor is a unique complex linear
combination of Majorana spinors.
To quantize Majorana spinors we note that hc , f  = h+ , f c+ . This leads us
to define the charge conjugation on the quantized fields^s by c (v) := + (v c+ )
and +c (u) := (uc+ ), or equivalently B c (f ) := B(f c+ ) = B(f c ) . We impose
the Majorana condition B c (f ) = B( f ) by dividing out the ideal generated by
all elements of the form B(f f c+ ). More precisely, if H is the Hilbert space
obtained from C0 (DM D M ) by dividing out the ideal of double spinors f for
which (f, f ) = 0, then there is an orthogonal decomposition H = H+ H , where
the elements in H satisfy f c+ = f . Indeed, every double spinor can be written
as f = f+ + if , where f := 12 (f f c+ ) are in H and the orthogonality follows
from Lemma 4.3. For the C -algebraic quantization we then have F = F+ F ,
where F is the C -algebra of quantized Majorana spinors and F+ the C -algebra
of quantized anti-Majorana spinors (see [30, Sec. 5.2]). The generators (v) and
+ (u) of F satisfy the additional relation c = and +c = + .
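
The statement that every spinor is a unique complex linear combination of Majorana spinors is an instance of a general fact about anti-linear involutions, which can be illustrated numerically. The following Python sketch is a toy model under explicit assumptions: the charge conjugation is replaced by a generic anti-linear involution on C^4 of the form u ↦ M·conj(u) with M·conj(M) = I, which is not claimed to be the charge conjugation of the text. The names `conjugate` and `majorana_parts` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
# M @ conj(M) = I by construction, so u -> M @ conj(u) is an anti-linear involution.
M = U @ np.linalg.inv(U.conj())


def conjugate(u):
    """Toy charge conjugation (an assumption): anti-linear and squaring to the identity."""
    return M @ u.conj()


def majorana_parts(u):
    """Split u = a + i b, where a and b are both fixed by the conjugation."""
    uc = conjugate(u)
    return (u + uc) / 2, (u - uc) / 2j


u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
a, b = majorana_parts(u)
```

In this toy model `a` and `b` are invariant under `conjugate` and satisfy `a + 1j * b == u` up to rounding. Uniqueness follows because applying the conjugation to u = a + i b gives a - i b, which determines a and b.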

4.2. Hadamard states


After Radzikowski's result [31] that a state for a scalar field is of Hadamard form if
and only if its wave front set has a certain form, several people set out to extend this
result to the Dirac field, or more general quantum fields [32-34]. All three papers
have provided an original contribution in their method of proof, but upon careful
analysis they all have minor gaps. We feel that it is justified to comment on this
here and to provide the necessary results to fill any remaining gaps.
The most general results are the most recent ones, due to Sahlmann and
Verch [34], who set out to prove the equivalence of the Hadamard form of a state,
defined in terms of the Hadamard parametrix, with a wave front set condition
analogous to the scalar field case. One of the techniques used is the scaling limit, but

s Our definition differs slightly from that of [13].



the proof of their Proposition 2.8, which relates the wave front set of a distribution
to that of its scaling limit, is in our opinion insufficient (see footnote w). In the
Appendix, we prove a similar statement as Proposition A.2, thereby filling any gap
in [34] and establishing the desired equivalence on a firm ground. For the Dirac
field, Hollands has proved that this wave front set condition implies a specific form
of the polarization set ([35, Theorem 4.1]).
The scaling limit result can also be used to find the wave front sets of the
advanced and retarded fundamental solutions $E^\pm$ of normally hyperbolic operators
on a globally hyperbolic spacetime, a result that we prove as Theorem A.5. Our
proof is largely analogous to the work of Radzikowski and the outcome is in direct
analogy to the results of Duistermaat and Hörmander [36] for the scalar case. To
find the wave front sets of the fundamental solutions $S^\pm$ for the Dirac equation we
use (and correct) an idea of [35].
Finally, we comment on the results by Kratzert [32], which use a spacetime
deformation argument to compute the wave front set and polarization set of Hadamard
states. This result has a gap, already identified in [34], concerning the case of points
$(x, \xi; y, \xi')$ where either $\xi = 0$ or $\xi' = 0$, which prevents the propagation of the
singularity from the original to the deformed spacetime. This gap can be avoided using
either a propagation of Hadamard form result as in [34], or using the commutation or
anti-commutation relations and the explicit form of $WF(E)$, respectively $WF(S)$.
The latter argument, which appears to be implicit in Radzikowski's paper [31],
works as follows: when $(x, \xi; y, 0) \in WF(\omega_2)$ then also $(y, 0; x, \xi) \in WF(\omega_2)$ by
the (anti-)commutation relations and the fact that $WF(E)$ (or $WF(S)$) has no
points with either entry equal to 0. Using the calculus of Hilbert-space-valued
distributions, Theorem A.4, we then find that both $(x, \xi; x, -\xi) \in WF(\omega_2)$ and
$(x, -\xi; x, \xi) \in WF(\omega_2)$. Because $\xi \neq 0$ (by definition the wave front set does not
contain the zero covector) these points can both be propagated into a deformed
spacetime, where $WF(\omega_2)$ is known to satisfy the required microlocal condition.
This, however, leads to a contradiction, because $WF(\omega_2) \cap -WF(\omega_2) = \emptyset$ and
hence $\xi = 0$. Therefore, $WF(\omega_2)$ cannot contain points with one of the covectors
equal to 0.
After these historical notes we feel free to define the notion of Hadamard states
directly in terms of a wave front set condition, rather than using the Hadamard
parametrix. If $\omega$ is a state on $F_{SM}$ then we may consider the GNS-representation
$(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ associated to $\omega$ and the $\mathcal{H}_\omega$-valued distribution on $DM \oplus D^*M$
defined by:
$$v_\omega(f) := \pi_\omega(B_{SM}(f)) \Omega_\omega.$$
Definition 4.10. A state $\omega$ on $F_{SM}$ is called Hadamard if and only if
$WF(v_\omega) = N^+ := \{(x, \xi) \in T^*M \mid \xi^2 = 0,\ \xi \text{ is future pointing or } 0\}$.
A state on $B_{SM}$ is called Hadamard if and only if it can be extended to a
Hadamard state on $F_{SM}$. The set of all Hadamard states on $B_{SM}$ will be denoted
by $S_{SM}$.

Note that every state on $B_{SM}$ can be extended to $F_{SM}$, by the Hahn-Banach
Theorem and Proposition 4.6. The Hadamard condition is independent of the choice
of extension, because it depends solely on the two-point distribution as the following
proposition shows (cf. [34], we give a short proof using the more advanced microlocal
techniques developed in the Appendix).

Proposition 4.11. For a state $\omega$ on $F_{SM}$ the following conditions are equivalent:

(1) $\omega$ is Hadamard,
(2) $WF(v_\omega) \subset N^+$,
(3) the two-point distribution $\omega_2(f_1, f_2) := \omega(B_{SM}(f_1) B_{SM}(f_2))$ has

$WF(\omega_2) = C := \{(x, -\xi; y, \xi') \in T^*M^2 \setminus Z \mid (x, \xi) \sim (y, \xi'),\ (x, \xi) \in N^+\}$,

where $(x, \xi) \sim (y, \xi')$ if and only if there is an affinely parameterized light-like
geodesic from $x$ to $y$ to which $\xi$, $\xi'$ are cotangent,
(4) there is a two-point distribution $w$ such that $\omega_2(f_1, f_2) = i w(P f_1, f_2)$ and
$WF(w) = C$.

Proof. First, note that $\omega_2$ is a bidistribution on $DM \oplus D^*M$, because $B_{SM}$ is
an $F_{SM}$-valued distribution and multiplication in $F_{SM}$ and $\omega$ are continuous. By
Theorem A.4 the third statement implies the first, which trivially implies the
second. To show that the second statement implies the third, we use the argument
of [37, Proposition 6.1]. By Theorem A.4 we see that $WF(\omega_2) \subset N^- \times N^+$, where
$N^- := -N^+$. Defining $\tilde{\omega}_2(f_1, f_2) := \omega_2(f_2, f_1)$ we find $WF(\omega_2) \cap WF(\tilde{\omega}_2) = \emptyset$. Now,
$(\omega_2 + \tilde{\omega}_2)(f_1, f_2) = i \int_M \langle f_1, S f_2 \rangle$, so $WF(\omega_2) \cup WF(\tilde{\omega}_2) = WF(S) = WF(E)$ by
Proposition A.7 and hence $WF(\omega_2) = WF(E) \cap N^- \times N^+ = C$ by Corollary A.6.
Now, assume that $\omega_2(f_1, f_2) = i w(P f_1, f_2)$, where $WF(w) = C$. Then
$WF(\omega_2) = WF((P^* \otimes I) w) \subset WF(w) = C$. It follows that $WF(v_\omega) \subset N^+$. For
the converse we suppose that $\omega$ is Hadamard and we choose a smooth real-valued
function $\varphi^+$ on $M$ such that $\varphi^+ \equiv 0$ to the past of some Cauchy surface $C^-$ and
such that $\varphi^- := 1 - \varphi^+ \equiv 0$ to the future of another Cauchy surface $C^+$. We then
define $w(f_1, f_2) := -i \omega_2(\varphi^+ S^- f_1 + \varphi^- S^+ f_1, f_2)$. Note that $w$ is a bidistribution
which is well-defined, because $\varphi^+ S^- f_1$ and $\varphi^- S^+ f_1$ are compactly supported. By
construction $i w(P f_1, f_2) = \omega_2(f_1, f_2)$. We now estimate the wave front set of $w$ as
follows. The wave front sets of S are determined in Proposition A.7. Then we may
apply [38, Theorems 8.2.9 and 8.2.13] (in combination with Eq. (17)) to estimate
the wave front sets of the tensor products (x)S (x, y)(x , y  ) and the composi-

tions in iw(x, x ) = 2 (y, y  )( (x)S (x, y)(x , y  )) respectively and, using
WF (2 ) = C, we nd:

WF (iw) WF (S ) WF (2 ) WF (2 ) = WF ((P I)w) WF (w),

i.e. WF (w) = WF (2 ) = C.
May 11, 2010 10:6 WSPC/S0129-055X 148-RMP
J070-S0129055X10003990

The Locally Covariant Dirac Field 411

The second characterization in Proposition 4.11 is especially useful, because it shows we do not need to compute the entire wave front set, as long as we can estimate it. Employing similar techniques as above one can use the anti-commutation relations and the wave front set of $\omega_2$ to estimate the wave front sets of all higher n-point distributions [39], showing that a Hadamard state necessarily satisfies the microlocal spectrum condition ($\mu$SC) of [40] and it follows that the set of such states is closed under operations from the algebra. We formulate this and other properties of Hadamard states in the following

Proposition 4.12. The set $S_{SM}$ of all Hadamard states on $B_{SM}$ satisfies:

(1) (SSM 1 ) SSM 2 for every morphism : SM 1 SM 2 ,


(2) SSM is closed under operations from BSM ,
(3) (SSM 1 ) = SSM 2 for every morphism : SM 1 SM 2 such that (M1 ) con-
tains a Cauchy surface of M2 .

Proof. The first property follows from Proposition 4.11 and the fact that wave front sets are local and geometric objects (cf. [38, Chap. 8]). The second property relies on the anti-commutation relations, which imply that the truncated n-point distributions are totally anti-symmetric (cf. [1, 39]). The final property follows from the second characterisation in Proposition 4.11, Eq. (17) in the Appendix, the equation of motion and the Propagation of Singularities Theorem for the wave front set, which in this case follows from the propagation of the polarization set [41].

One can also prove that the state spaces are locally physically equivalent [5] and
that all quasi-free Hadamard states are locally quasi-equivalent [42]. Whether the
latter remains true for all Hadamard states appears to be unknown.
We conclude this section with the remark that the functor $S$ : SSpac $\to$ TVec defined on objects by $SM \mapsto S_{SM}$ and on morphisms by the dual of the corresponding algebra morphism (restricted to the relevant state space) is a locally covariant state space for the theory $B$ [2].

4.3. The relative Cauchy evolution of the Dirac field and the stress-energy-momentum tensor
Now that we have a locally covariant free Dirac field at our disposal, we will investigate the idea of relative Cauchy evolution for this field and prove that it yields commutators with the stress-energy-momentum tensor. This result is completely analogous to the result for the free scalar field of [2].
Suppose that we have two objects $M_0 = (M, g_0, SM_0, p_0)$ and $M_g = (M, g, SM_g, p_g)$ in SSpac, where $M$ is the same in both cases and such that outside a compact set $K \subset M$ we have $g = g_0$, $SM_g = SM_0$ and $p_g = p_0$. Now let $N^\pm \subset M_0$ be causally convex open regions, each containing a Cauchy surface for $M_0$, such that $K$ lies to the future of $N^-$ (i.e. $K \subset J^+(N^-) \setminus N^-$ in $M_0$ and hence also in $M_g$) and to the past of $N^+$. We view $N^\pm$ as objects in SSpac and consider the canonical morphisms $\iota_0^\pm : N^\pm \to M_0$ and $\iota_g^\pm : N^\pm \to M_g$. By the time-slice axiom, Proposition 4.8, these give rise to $*$-isomorphisms $\beta_0^\pm : B_{N^\pm} \to B_{M_0}$ and $\beta_g^\pm : B_{N^\pm} \to B_{M_g}$. We then define

$\beta_g := \beta_0^+ \circ (\beta_g^+)^{-1} \circ \beta_g^- \circ (\beta_0^-)^{-1}$.

The $*$-isomorphism $\beta_g : B_{M_0} \to B_{M_0}$ measures the change in an operator $A \in B_{N^-}$ as it evolves to $N^+$ in the metric $g$ instead of $g_0$ (see footnote t). $\beta_g$ can be extended to a $*$-isomorphism of the algebra $F_{M_0}$, where we fix the signs for the isomorphisms between the spinor bundles involved by identifying the double spinor bundles over $N^\pm \subset M_0$ and $N^\pm \subset M_g$. It represents the relative Cauchy evolution of the free Dirac field.
We will want to compute the variation of the $*$-isomorphism $\beta_g$ as well as that of the action for the free Dirac field with respect to the metric $g$. For this purpose, we will suppose that the compact set $K \subset M$ has a contractible neighborhood $O$ which does not intersect either $N^\pm$. Let $\epsilon \mapsto g_\epsilon$ be a smooth curve from $[0, 1]$ into the space of Lorentzian metrics on $M$ starting at $g_0$ and such that $g_\epsilon = g_0$ outside $K$ for every $\epsilon$. The spin bundle $SM_\epsilon$ must be trivial over the contractible region $O$. If we assume it to be diffeomorphic to $SM_0$ outside $K$ we can simply take $SM_\epsilon = SM_0$ as a manifold and, choosing a fixed representation and matrices $A$, $C$, we obtain $DM_\epsilon = DM_0$.
The deformation of the spin structure is contained entirely in the spin frame projection $\pi_\epsilon : SM_0 \to FM_\epsilon$. Let $E$ be a section of $SM_0$ over $O$ and set $(e_\epsilon)_a := \pi_\epsilon(E)$. We require that $e_\epsilon$ varies smoothly with $\epsilon$ and that $(e_\epsilon)_a = (e_0)_a$ outside $K$. To show that projections $\pi_\epsilon$ with these properties exist we can apply the Gram-Schmidt orthonormalisation procedure to $(e_0)_a$ for all $\epsilon$ simultaneously. The assignment $E \mapsto e_\epsilon$ determines $\pi_\epsilon$ completely, using the intertwining properties. The family of frames $e_\epsilon$ determines principal fiber bundle isomorphisms $f_\epsilon : FM_\epsilon \to FM_0$ between the frame bundles by

$f_\epsilon : \{(e_\epsilon)_a\} \mapsto \{(e_0)_a\}$

on $K$ and extending it by the identity on the rest of $M$. By definition $f_\epsilon$ intertwines the action of $L_+^\uparrow$ on the orthonormal frame bundles.

Remark 4.13. There may be many deformations of the spin structure, i.e. many
families of projections  which satisfy our requirements. However, the variation
of terms like v, P u will not depend on this choice. Indeed, if  is a dierent
deformation of the spin structure, then e :=  (E) = R e =  (RS E) for some
smooth curve S in Spin01,3 . However, using the invariance of ,  under the action of
the gauge group Spin 01,3 , the variation will be equal in both cases. (Also u = 0 for

t In [2], it seems the authors have the scattering of a state in mind as it passes through the perturbed metric, which leads them to consider the $*$-isomorphisms $\beta_g^{-1}$ rather than $\beta_g$. When we take the variation with respect to $g$ this gives rise to a sign.

every spinor u, because D M = DM.) In this sense, the variation will only depend
on the variation of the metric.

4.3.1. The stress-energy-momentum tensor


The classical stress-energy-momentum tensor for the Dirac field is defined as a variation of the action $S = \int_M L_D$, with the Lagrangian density (7), with respect
to g (x):
2 S
T (x) :=  , (12)
det g(x) g (x)

where $\psi$ is a free classical Dirac spinor, $\psi^+$ its adjoint. An explicit computation yields (see footnote u)
i
T = ( + , ( )  ( + , ) ).
2
Here the brackets around indices denote symmetrization as an idempotent operation and in the following indices between | | are to be excluded from the symmetrization.
Following [7] we quantize the stress-energy-momentum tensor via a point-split procedure, i.e. we want to find a bi-distribution of scalar test-functions which reduces to $T_{\mu\nu}$ on the diagonal and which can be quantized in a straight-forward way. For this purpose we use a local spin frame $E_A$ and recall that the components $\gamma_a{}^A{}_B$ of $\gamma_a$ are constant. We define:
i
s
Tab (x, y) := ( + , EA (x)(aA |B| E B , eb) (y)
2
e(a | + , EA| (x)b)AB E B , (y)),

reduces
 to Tab := ea eb T in the limit y x. Performing a partial integration,
(ea v, u) = 0, we can write Tab
s
as a bidistribution of scalar test-functions
h1 , h2 ,
i
s
Tab (h1 , h2 ) = ( + (EA h1 )(aA |B (| (E B eb) h2 ))
2
+ + ( (e(a E|A| h1 ))b)AB (E B h2 )). (13)

Equation (13) can be promoted to the quantized case by replacing $\psi$ and $\psi^+$ by the components $\psi_{SM}$ and $\psi^+_{SM}$ of the corresponding locally covariant quantum field. The expression (13) can be viewed as a formal expression for the same distribution with quantized field operators.

u For explicit computations, we refer to [43, Sec. 4], which uses a Lagrangian that differs from ours
by a total derivative. Varying with respect to g would yield the opposite sign.

Proposition 4.14. For all f C0 (DM D M ) and h C0 (M ) we have:



s
[Tab (x, x), BSM (f )]h(x)dvolg (x)
M

1
= {((a BSM )(b) (S f )h) BSM ((b a) (S f )h)},
2

where a := ea .

Proof. For f = u v we use Proposition 4.6 to obtain:



{BSM (f ), SM
+
(EA h)} = i v, SEA hI = i Sc v, EA hI,
M M

{BSM (f ), SM ( E B eb h)} = i  E B eb h, SuI = i E B , eb SuhI,
M M

{BSM (f ), SM
+
( ea EA h)} = i v, S ea EA hI = i ea Sc v, EA hI,
M M

{BSM (f ), (E B h)} = iE B , SuhI.

With Eq. (13), the commutation relations and [AB, C] = A{B, C} {A, C}B this
implies

1 +
s
[Tab (x, y), BSM (f )] = { (EA (x))(aA |B| E B , b) Su(y)
2 SM
+ Sc v, EA (x)(aA |B| (b) SM )(E B (y))

((a SM
+
)(E|A| (x))b)AB E B , Su(y)

(a Sc v, E|A| (x)b)AB SM (E B (y))}.

In this expression, we are multiplying distributions with smooth functions, so we


may take the coincidence limit yielding:

1 +
s
[Tab (x, x), BSM (f )] = { ( (Su)(x)) + (b SM (Sc va) (x))
2 SM (a b)
(a SM
+
(b) Su(x)) SM ((a (Sc v)b) (x))}
1
= {(a BSM (b) S f (x)) BSM ((b a) (S f )(x))},
2

from which the result follows.
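
The operator identity $[AB, C] = A\{B, C\} - \{A, C\}B$ invoked in the proof can be checked by expanding both anti-commutators. The following short derivation is a sketch for arbitrary elements $A$, $B$, $C$ of an associative algebra; these symbols are generic placeholders and not specific field operators from the text.

```latex
\begin{align*}
A\{B, C\} - \{A, C\}B
  &= A(BC + CB) - (AC + CA)B \\
  &= ABC + ACB - ACB - CAB \\
  &= ABC - CAB \\
  &= [AB, C].
\end{align*}
```

This is why the commutator of a bilinear expression in the field with a single smeared field can be evaluated from the anti-commutation relations alone.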



This result can be written for spinors and cospinors separately as:

s
[Tab (x, x), SM (v)]h(x)dvolg (x)
M
1
= {(a SM ((Sc v)b) h) SM ((a (Sc v)b) h)},
2

s +
[Tab (x, x), SM (u)]h(x)dvolg (x)
M
1
= {(a SM
+
(b) Suh) SM
+
((a b) (Su)h)}.
2

4.3.2. Relative Cauchy evolution

To compute the relative Cauchy evolution explicitly, we first note that the isomorphism $\beta_g$ can be characterized in terms of its action on the generators $B_{M_0}(f)$ of $F_{M_0}$ as follows:

Proposition 4.15. For $f \in C_0^\infty(DN^+ \oplus D^*N^+)$, we have $\beta_g B_0(f) = B_0(T_g f)$, where

$T_g f = P_g \chi^+ S_g P_0 \chi^- S_0 f$.

Here the subscripts on $B$, $P$ and $S$ indicate whether they are the objects defined on $M_0$ or $M_g$ and the smooth functions $\chi^\pm$ are such that $\chi^\pm \equiv 1$ to the past of some Cauchy surface in $N^\pm$ and $\chi^\pm \equiv 0$ to the future of some other Cauchy surface in $N^\pm$.

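Proof. Note that $\beta_g^- \circ (\beta_0^-)^{-1} B_0(\tilde f) = B_g(\tilde f)$ for any $\tilde f \in C_0^\infty(DN^- \oplus D^*N^-)$. Similarly, for $f' \in C_0^\infty(DN^+ \oplus D^*N^+)$ we have $\beta_0^+ \circ (\beta_g^+)^{-1} B_g(f') = B_0(f')$. The functions $\chi^\pm$, $1 - \chi^\pm$ have been chosen appropriately in order to apply Eq. (11) in Proposition 4.8. We then have $B_0(f) = B_0(\tilde f)$, where $\tilde f := P_0 \chi^- S_0 f$. Notice that $\tilde f$ indeed has compact support in $N^-$. Similarly, $B_g(\tilde f) = B_g(f')$, where $f' := P_g \chi^+ S_g \tilde f$ has support in $N^+$. Hence, for $f' = T_g f$: $\beta_g B_0(f) = \beta_g B_0(\tilde f) = \beta_0^+ \circ (\beta_g^+)^{-1} B_g(\tilde f) = \beta_0^+ \circ (\beta_g^+)^{-1} B_g(f') = B_0(f')$.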

On each spin spacetime $M_\epsilon = (M, g_\epsilon, SM_0, \pi_\epsilon)$ we can now quantize the Dirac field and obtain relative Cauchy evolutions $\beta_\epsilon := \beta_{g_\epsilon}$ on $F_{N^+}$ as before.
Proposition 4.16. Writing :=  |=0 we have for all f C0 (DN + D N + ):
( B0 (f )) = B0 ( (
/  )S0 f ).

Proof. Using the fact that B0 is a C -algebra-valued distribution and Proposi-


tion 4.15 we nd:
( B0 (f )) = (B0 (P + S P0 S0 f ))
= B0 ((P + S )P0 S0 f )
= B0 ((P )+ S0 P0 S0 f ) + B0 (P0 + (S )P0 S0 f ).

Now, because P0 S0 f C0 (DN D N ) we see that (S )P0 S0 f vanishes


on J (N ) and that + (S )P0 S0 f has compact support. Because B0 solves
the Dirac equation we conclude that the second term vanishes. The rst term can
be rewritten using Eq. (11), which yields S0 f = S0 P0 ( S0 f ) and hence:

( B0 (f )) = B0 ((P )+ S0 f ) = B0 ((P )S0 f ).

For the last equality, we used the fact that (P ) is supported in K, where + 1.
Recall that P = (i / + m) (i
/ + m) to get the nal result.

To compute the variation of the Dirac operator we may work in a local frame on $O$, where it is supported. Because the Dirac adjoint map is independent of $\epsilon$ we only need to compute this variation either for spinors or for cospinors:

Lemma 4.17. For v C0 (D M ) we have ( / )v + )+ .


/ )v = ((

Proof. Because the adjoint operation is continuous we have:

/ )v = 
( /  v + )+ |=0 = (
/  v|=0 =  ( /  v + |=0 )+ = ((
/ )v + )+ .

It is interesting to note that only the variation of the Dirac operator is of importance for the variation of the relative Cauchy evolution, just like for the stress-energy-momentum tensor (cf. [43]). It will also turn out that the variation only depends on the variation of the metric and not on the other freedom in the variation of the orthonormal frame, even though we are now acting on it with the $C^*$-algebra-valued field (cf. Remark 4.13). This will follow from the proof of the following theorem, for which we refer to Appendix B.

Theorem 4.18. For a double test-spinor f C0 (DM0 D M0 ) and x K:


 
i a b s
(g B0 (f )) = B0 Pg S0 f = e e [T (x, x), B0 (f )]. (14)
g (x) g (x) 2 ab

This result compares well with the scalar field case, [2, Theorem 4.3] (see footnote v). As particular cases we obtain for $\psi$ and $\psi^+$:

i a b s
(g (v)) = e e [T (x, x), (v)],
g (x) 2 ab
i a b s
(g + (u)) = e e [T (x, x), + (u)].
g (x) 2 ab

v The sign explained in footnote t cancels the sign due to the variation with respect to g instead
of g .

It follows that the same result also holds for products and sums of smeared field operators.

5. Conclusions
A rigorous formulation of quantum field theories in curved spacetime, going beyond the well-known scalar field, is a prerequisite for constructing more realistic cosmological models as well as for improving our understanding of quantum field theory in Minkowski spacetime. The main purpose of this paper was to present the free Dirac field in a four-dimensional globally hyperbolic spacetime as a locally covariant quantum field theory in the sense of [2] and to compute the relative Cauchy evolution of this field, obtaining commutators with the stress-energy-momentum tensor in analogy with the free real scalar field. We achieved this in a representation independent way and in a functorial, and therefore manifestly covariant, framework.
We established some basic properties of the locally covariant free Dirac field and remarked on the quantization of Majorana spinors. We also provided a detailed discussion of Hadamard states, closing any gaps in the existing proofs of the equivalence of the definitions in terms of the series expansion of their two-point distribution and a microlocal condition, respectively.
Furthermore, we argued that the observable part of the theory is uniquely determined by the relations between adjoints, charge conjugation and the Dirac operator, although the geometric constructions themselves may not be unique due to the cohomological properties of the category of spin spacetimes. On a mathematical level we have consistently replaced a single spin spacetime SM by the category SSpac of such spacetimes, and the differential geometry on SM by the corresponding functorial descriptions. On a physical level, however, we should not conclude from this that SSpac is now the physical arena in which our system lives, instead of a collection of systems. (See [1, Chap. 1] for more detailed philosophical remarks on the interpretation of the locally covariant approach.)

Acknowledgments
I would like to thank Chris Fewster for suggesting to use the cohomological language
in Sec. 3.4 and for bringing the problem of computing the relative Cauchy evolution
for the Dirac eld to my attention. I would also like to thank Romeo Brunetti for
correcting some of my misconceptions in the early stages of this computation. An
anonymous referee made several important corrections and helpful suggestions, for
which I am grateful. Much of this work was performed as part of my PhD-thesis at
the University of York and I would also like to thank the University of Trento for
their kind hospitality during my visit in October 2007. Furthermore, this research
was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through the Institutional Strategy of the University of Göttingen and the Graduiertenkolleg 1493 "Mathematische Strukturen in der modernen Quantenphysik".

Appendix A. Results in Microlocal Analysis


In this appendix, we will list some results concerning the microlocal analysis of
distributions. For a detailed treatment of scalar distributions we refer to [38],
whereas Hilbert and Banach-space-valued distributions are treated in [1, 37]. More
details concerning distributional sections of vector bundles can be found in, e.g.,
[1, 16, 34, 41].
Before we discuss distributional sections of vector bundles, we first consider the scaling limit of a distribution in an open set of $\mathbb{R}^n$:

Definition A.1. Let $O$ be a convex open region $O \subset \mathbb{R}^n$ containing 0. For all $\lambda > 0$ we define the scaling map $\delta_\lambda : O \to O$ by $\delta_\lambda(x) := \lambda x$.
Let $u$ be a distribution on a convex open region $O \subset \mathbb{R}^n$ containing 0. The scaling degree $d$ of $u$ at 0 is defined as $d := \inf\{\beta \in [-\infty, \infty) \mid \lim_{\lambda \to 0} \lambda^\beta \delta_\lambda^* u = 0\}$, where $(\delta_\lambda^* u)(f) := \lambda^{-n} u(f \circ \delta_\lambda^{-1})$.
If $u_0 := \lim_{\lambda \to 0} \lambda^d \delta_\lambda^* u$ exists we call it the scaling limit of $u$ at 0.
Note that the scaling limit may fail to exist (e.g., $u(x) = \log|x|$) or it may vanish (e.g., if $0 \notin \mathrm{supp}(u)$). On a manifold, we will only consider scaling limits in a certain choice of local coordinates. How this limit depends on this choice of coordinates will not be relevant for us.
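
As a simple illustration of Definition A.1, consider the Dirac delta distribution at the origin of $\mathbb{R}^n$. The sketch below assumes the convention $(\delta_\lambda^* u)(f) = \lambda^{-n} u(f \circ \delta_\lambda^{-1})$ for the rescaled distribution. It writes $\delta_0$ for the delta distribution (to avoid a clash with the scaling map $\delta_\lambda$) and $\beta$ for the scaling exponent; both labels are chosen here only for the example.

```latex
\begin{align*}
(\delta_\lambda^* \delta_0)(f)
  &= \lambda^{-n}\, \delta_0(f \circ \delta_\lambda^{-1})
   = \lambda^{-n} f(0), \\
\lambda^{\beta}\, (\delta_\lambda^* \delta_0)(f)
  &= \lambda^{\beta - n} f(0) \xrightarrow{\;\lambda \to 0\;} 0
  \quad \text{for all } f \iff \beta > n .
\end{align*}
% Hence the scaling degree is d = n, and the scaling limit exists:
\[
  u_0 = \lim_{\lambda \to 0} \lambda^{n}\, \delta_\lambda^* \delta_0 = \delta_0 ,
\]
% which is homogeneous: \delta_\mu^* \delta_0 = \mu^{-n} \delta_0 for all \mu > 0.
```

In this example the scaling limit coincides with the distribution itself, which is the typical behaviour of a homogeneous distribution.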
We now prove the following result (see footnote w):

Proposition A.2. Let $u$ be a distribution on a convex open region $O \subset \mathbb{R}^n$ containing 0 with scaling limit $u_0$ at 0. Then

$\{0\} \times \pi_2(WF(u_0)) \subset WF(u)$,

where $\pi_2$ denotes the projection on the second coordinate.

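Proof. Suppose that $(0, \xi_0) \notin WF(u)$ with $\xi_0 \neq 0$. We will prove that $(x, \xi_0) \notin WF(u_0)$ for all $x$. By assumption, we can choose $\chi \in C_0^\infty(O)$ and an open conic neighborhood $\Gamma \subset \mathbb{R}^n$ of $\xi_0$ such that $\chi \equiv 1$ on a neighborhood of 0 and $(\mathrm{supp}(\chi) \times \Gamma) \cap WF(u) = \emptyset$. We set $v := \chi u$ and $v_\lambda := \lambda^d \delta_\lambda^* v$, where $d$ is the scaling degree of $u$ at 0. Notice that $WF(v) \cap T_0^*O = WF(u) \cap T_0^*O$ and $u_0 = \lim_{\lambda \to 0} v_\lambda$, so without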

w A similar result was also claimed as [34, Proposition 2.8], but we find their proof unconvincing.
In particular, when localizing the scaling limit $u_0$ with a test-function $\chi_0$ and estimating (cf. [34,
Eq. (2.11)])
.

0
0 u () = lim
dn
u 0 ei .
0
the test-function $\chi_0(\cdot/\lambda)$ becomes singular in the limit $\lambda \to 0$. The quoted reference pays insufficient
attention to this issue in the last sentence of their proof, because their last estimate does not involve
any $\chi_0$.

loss of generality, we may prove the result with $v$ replacing $u$ and we can view the $v_\lambda$ as compactly supported distributions on all of $\mathbb{R}^n$.
Notice that for $\mu > 0$ we have $\delta_\mu^* u_0 = \mu^{-d} u_0$, i.e. $u_0$ is a homogeneous distribution and therefore it is tempered ([38, Theorem 7.1.18]). We now prove that $v_\lambda$ converges to $u_0$ in the sense of tempered distributions on $\mathbb{R}^n$. For this we first write

v = ||r (1)|| v , where r is the order of v and the v are compactly sup-

ported distributions of order 0 (see [38, Sec. 2.1]). Note that ||<dn (1)|| v
converges to 0 in S, because for every || < d n and S(Rn ) we have

|((1)|| v ) ()| = dn |v ( ( 1 ))| dn|| C sup| |


 ||
which converges to 0 as 0. We then set w := dn||r (1) v , so
that lim0 w = u0 as distributions. By the Uniform Boundedness Principle this
implies

|w ()| C sup| |, supp() B1 , (15)


||r

for some C, r > 0, where B1 is the (Euclidean) unit ball and 0 < 1. In fact, for
1 we also have

|w ()| = dn |w( 1 )| C dn|| sup| |


dn||r

C sup| |,
dn||r

so the estimate (15) holds for all > 0.


Now, let S(Rn ) be a function of rapid decrease and choose a partition of
unity on Rn as follows. We let 0 C0 (Rn ) be positive such that 1 on B1 and
(x) = 0 when x 2. We then set m (x) := 0 (2m x) 0 (21m x) and note
that:

supp(m1 ) {x | 2m1 x 2m+1 }, m = 1,


m=0

where the sum is nite near every point. We dene m := m and m := 2m1
and rescale m in order to apply the estimate (15):
   

dn  /m . 
|w (m )| = m w

m 
m 

  
 . 
C dn|| sup ( m )
m  m 
||r

C1 sup|x m |, m 0, (16)
Rn
||r ||r+nd

(4x)||+nd for m 1, which follows from


dn||
where the last line uses m
d n || and the support properties of m . (For m = 0 we simply estimate
dn||
0 by a constant to arrive at the last line of (16).) We now note that
max supx| m | c for some c independent of m, as the derivatives only bring out
extra factors of 2m 1. Moreover, for m 0 we notice that m+1 +m +m1 1
on supp(m ), where we dene 1 := 0. Therefore (16) leads to

|w (m )| C2 sup |x |(m+1 + m + m1 )
Rn
||r ||r+nd

and summing over m 0 then gives:



|w ()| 3C2 sup |x |.


Rn
||r ||r+nd

This shows that w () can be estimated by a seminorm on S(Rn ) uniformly in .


It then follows that w u0 and hence v u0 as tempered distributions. Indeed,
for any S(Rn ) and  > 0 we can choose  C0 (Rn ) and 0 > 0 such that
|w (  )| < 2 for all > 0 and |w ( )| < 2 for all < 0 .
Fourier transformation is a continuous operation on tempered distributions, so
we can compute:
    N
   
0 ()| = lim dn v  CN lim dn  
|u = CN N lim N +dn
0   0  0

for all in , all N N and suitable CN > 0. For N > nd the limit yields u0 () = 0
near 0 . We then apply [38, Theorem 8.1.8], which says that for a homogeneous
distribution we have for all x = 0 that (x, 0 ) WF (u0 ) if and only if (0 , x)
WF (u 0 ).
0 ) and also (0, 0 ) WF (u0 ) if and only if 0 supp(u

For a distribution $u$ with values in a Banach space $B$ one can define the wave front set by using estimates of the norm $\|u(e^{i\xi\cdot})\|$, which replace the corresponding estimates of the absolute value $|u(e^{i\xi\cdot})|$ for scalar distributions [37]. Alternatively, one can use the following equivalent characterization ([1, Theorem A.1.4]):

$WF(u) = \bigcup_{l \in B'} WF(l \circ u) \setminus Z$. (17)

A similar idea works for a distributional section u of a vector bundle V = O Rm


over a contractible region O of Rn . Indeed, using a basis ei for Rm with dual basis
on O with values in B (Rm ) , where the
ei we can identify u with a distribution u
correspondence is given by
m 

m

m
u(h) := u(hei ) e , u
i i
f ei = 
u(f i ), ei ,
i=1 i=1 i=1

where  ,  denotes the canonical pairing of Rm with the second factor of B (Rm ) .
We set by denition WF (u) := WF ( u).
Equation (17) allows a straightforward generalization of many results for scalar distributions on open sets of $\mathbb{R}^n$ to Banach-space-valued distributional sections of a vector bundle over regions of $\mathbb{R}^n$. Moreover, by showing how these results transform under changes of coordinates they can be formulated for vector bundles on a manifold. We list a number of these results in the following theorem (cf. [1, 38]):

Theorem A.3. If $u$, $v$ are distributional sections of a complex vector bundle $V$ over the spacetime $M$ with values in the Banach space $B$, then:
(1) sing supp$(u)$ is the projection of $WF(u)$ on the first variable,
(2) $u \in C^\infty(V, B)$ if and only if $WF(u) = \emptyset$,
(3) $WF(u + v) \subset WF(u) \cup WF(v)$,
(4) if $P$ is a linear partial differential operator on $V$ with smooth coefficients and (matrix-valued) principal symbol (see footnote x) $p(x; \xi)$, then $WF(Pu) \subset WF(u) \subset WF(Pu) \cup \Omega_P$, where $\Omega_P := \{(x; \xi) \in T^*M \mid \xi \neq 0,\ \det p(x; \xi) = 0\}$,
(5) if $x \in M$, $\kappa : U \to \mathbb{R}^n$ is a local trivialization on a convex neighborhood $U$ with $\kappa(x) = 0$ and $(\kappa^{-1})^* u$ has a scaling limit $u_0$ at 0, then $\kappa^*(\{0\} \times \pi_2(WF(u_0))) \subset WF(u) \cap T_x^*M$.
In the last item, the scaling limit depends not just on the choice of coordinates, but also on the choice of a frame $e_i$ of $V$ over $U$ and we let the scaling maps act on sections of $V$ componentwise: $(\sum_i f^i e_i) \circ \delta_\lambda^{-1} = \sum_i (f^i \circ \delta_\lambda^{-1}) e_i$.
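
To illustrate item (4) of Theorem A.3, consider a normally hyperbolic operator $P$ on a vector bundle of rank $r$, whose principal symbol is a scalar multiple of the identity, as used in the proof of Theorem A.5. The following is only a sketch under these assumptions: the rank $r$ and the label $\Omega_P$ for the characteristic set $\{(x; \xi) \mid \xi \neq 0,\ \det p(x; \xi) = 0\}$ are notation introduced here, and an overall sign convention for $p$ does not affect the conclusion.

```latex
\begin{align*}
p(x;\xi) &= g^{\mu\nu}(x)\,\xi_\mu \xi_\nu\, I
  \quad\Longrightarrow\quad
  \det p(x;\xi) = \bigl(g^{\mu\nu}(x)\,\xi_\mu \xi_\nu\bigr)^{r}, \\
\Omega_P &= \{(x;\xi) \in T^*M \mid \xi \neq 0,\ g^{\mu\nu}(x)\,\xi_\mu \xi_\nu = 0\}.
\end{align*}
% If Pu is smooth, then WF(Pu) is empty and item (4) gives
\[
  WF(u) \subset WF(Pu) \cup \Omega_P = \Omega_P ,
\]
% i.e. all singular directions of u are non-zero light-like covectors.
```

This is the reason why only light-like covectors appear in the wave front sets of the fundamental solutions discussed in the remainder of this appendix.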
In the particular case where B is a Hilbert space, we also have (see [1, 37]):

Theorem A.4. Let $H$ be a Hilbert space and $V_i$, $i = 1, 2$, two finite-dimensional (complex) vector bundles over smooth $n_i$-dimensional spacetimes $M_i$ with complex conjugations $J_i$, i.e. the $J_i$ are antilinear, base-point preserving bundle isomorphisms $J_i : V_i \to V_i$ such that $J_i^2 = \mathrm{id}$. Let $u_i$, $i = 1, 2$, be two $H$-valued distributional sections of $V_i$ and let $w_{ij}$ be the distributional sections of the vector bundle $V_i \boxtimes V_j$ over $M_i \times M_j$ determined by $w_{ij}(f_1 \boxtimes f_2) := \langle u_i(J_i f_1), u_j(f_2) \rangle$. Then

$(x, \xi) \in WF(u_1) \iff (x, -\xi; x, \xi) \in WF(w_{11})$

and

$WF(w_{ij}) \subset (-WF(u_i) \cup Z) \times (WF(u_j) \cup Z)$,

where $Z$ denotes the zero-section.
Finally, we establish some results on the wave front sets of advanced and retarded fundamental solutions $E^\pm$ (for their existence and uniqueness we refer to [16]) and $S^\pm$, $S_c^\pm$. These results are analogous to [36, Theorem 6.5.3], but now for operators in a vector bundle. Note that for distributional sections of vector bundles there is a Propagation of Singularities Theorem, which follows from the propagation of the polarization set [41].

x See [16] for the definition of the principal symbol.

Theorem A.5. Let $E^\pm$ be the advanced ($-$) and retarded ($+$) fundamental solutions for a normally hyperbolic operator $P$ acting on the sections of a vector bundle $DM$ over a globally hyperbolic spacetime $M = (M, g)$ of dimension $n \geq 2$. Then

$WF(E^\pm) = \{(x, \xi; y, \xi') \in T^*M^2 \setminus Z \mid x \in J^\pm(y),\ x \neq y,\ (x, \xi) \sim (y, -\xi')\} \cup \{(x, \xi; x, -\xi) \in T^*M^2 \setminus Z \mid (x, \xi) \in T^*M \setminus Z\} =: A^\pm \cup B$ (18)

where $Z$ is the zero-section and $(x, \xi) \sim (y, \xi')$ if and only if there is a light-like geodesic $\gamma$ from $x$ to $y$ to which $\xi$ and $\xi'$ are cotangent such that they are each other's parallel transport along $\gamma$.

Proof. The first part of this proof follows closely the proof of [31].
We start by reducing the problem to a local one as follows. The principal symbol
of P is p(x, ) = g (x) I, where I is the identity operator on DM , so by the
Propagation of Singularities Theorem, the singularities of E propagate along light-
like geodesics by parallel transport. By denition the points in set A are invariant
under the same parallel transport. Now consider a point p := (x, ; y, ) with x = y.
If = = 0 then P is not contained in any set on either side of the equality, so
we may assume = 0 (the case = 0 is analogous). Let S be a spacelike Cauchy
surface through y and propagate (x, ) along the light-like geodesic towards S.
If ends at S in x = y then P is not contained in A or B, nor is it contained
in WF (E ), because E(x , y) = 0 when x and y are spacelike, so it cannot have
any singularities there. If ends at y, on the other hand, we can nd a point
p := (x ,  ; y, ), where x on is in any given causally convex neighborhood of y
and  is the parallel transport of along to x . Then p WF (E ) if and only
if p WF (E ) and p A if and only if p A . Hence, it suces to prove the
claim locally.
On a suciently small causally convex domain O M we can nd for every
k N a C k -section W k of DM  D M on O2 such that ( [16, Proposition 2.5.1]):

k+1

E (x, y) = Vj (x, y)f (1 R (2 + 2j, ))(x, y) + W k (x, y). (19)
j=0

Here, the Hadamard coecients Vj are uniquely dened smooth sections of DM 


D M on O2 , R (, y) are the retarded (+) and advanced () Riesz distributions
(or rather distribution densities) on Minkowski spacetime and they are pulled back
by the smooth dieomorphism f : O2 T O dened by (x, y)  (x, exp1 x (y)).
This means we use Riemannian normal coordinates for y centered on x, which is

well-defined because $O$ is causally convex. The Riesz distributions have many useful properties, of which we will only use for all $j \geq 0$:

WF (R (2j + 2, )) = {(x, ) T M0 \Z | x = 0 or x2 = 0, x J (0),  x}


R (2 + 2j, x) = 2+2jn R (2 + 2j, x), > 0.
(20)

(These can be proved using [16, Proposition 1.2.4 items 4 and 5], j+1 R (2+2j, ) =
and the wave front sets of the distinguished parametrices as determined in [36].)
Hence, for all j N:

WF (f (1 R (2 + 2j, ))) = f (WF (1 R (2 + 2j, )))


= f (Z|O WF (R (2 + 2j, )))
= {(x, ; y, ) | (, ) = df T (0,  ) for some
(exp1 
x (y), ) WF (R (2 + 2j, ))},

= (A B) T O2 , (21)

where $df^T$ is the transpose of the derivative $df$ at $(x, y)$. The last equality uses the wave front set of the Riesz distributions in Eq. (20) and the properties of Riemannian normal coordinates (cf. [31]). It follows that $WF(E^\pm|_{O^2}) \subset (A^\pm \cup B) \cap T^*O^2$, because for each order of differentiation $N$ we can choose a sufficiently high order $k$ in Eq. (19) to make the required estimate in the definition of the wave front set.
We can prove the opposite inclusion, if we can show that the wave front set of the finite sum in (19) also contains $(A^\pm \cup B) \cap T^*O^2$, which we will do using scaling limits (cf. [34]). First, we may employ the Riemannian normal coordinates $f : O^2 \to TO$ as above. Next, we may assume that $O$ is also a contractible coordinate neighbourhood, so we can consider local coordinates $\kappa : O \to \mathbb{R}^n$ on $O$ and the associated coordinate map $d\kappa$ on $TO$. Moreover, we can choose $\kappa$ in such a way that $\kappa(x_0) = 0$ for an arbitrarily given $x_0 \in O$. The composition $d\kappa \circ f$ then defines coordinates on $O^2$ such that $(x_0, x_0) \mapsto 0 \in \mathbb{R}^{2n}$. Using a frame $E_A$ for $DM|_O$
and the dual frame E B we can express the terms in the sum of Eq. (19) in the
local coordinates d f as VjB A
(x, y)R (2 + 2j, y). From Eq. (20), we then nd the
scaling behavior

(VjB
A
(x, y)R (2 + 2j, y)) = 2+2jn (VjB
A
(x, y)R (2 + 2j, y))

for all > 0. In the scaling limit only the lowest order term survives:

lim n2 ( f 1 d1 ) E(x, y) = V0B


A
(0, 0)R(2, y)E B (x)EA (y)
0

= R(2, y)E A (x)EA (y),



where we wrote R(2, y) := R (2, y) R+ (2, y) and we used the explicit expression
V0AB (x, x) = B
A
( [16, Lemmas 2.2.2 and 1.3.17]).
Now, the last item of Theorem A.3 (which follows from Proposition A.2) implies
that WF (E) (d f ) ({(0, 0)} 2 (WF (1 R(2, )))), because E A (x)EA (y) is
smooth and not identically vanishing. From Eq. (20) and the support properties of
R (2, ) we easily compute 2 (WF (1 R(2, ))) = {(0, ) | 2 = 0}. Pulling this
back to O2 and using the properties of Riemannian normal coordinates yields
WF (E) {(x0 , ; x0 , ) | 2 = 0}.
Because E is a bi-solution to the wave equation we can apply the Propagation
of Singularities Theorem to nd that WF (E) A+ A on O2 and from the
support properties of E + and E we then conclude that WF (E ) A . Finally,
WF (E ) WF (P E ) = WF () = B. This completes the proof.

Corollary A.6. In the notation of Theorem A.5, $WF(E) = \overline{A^+ \cup A^-} \setminus Z$.

Proof. By Theorem A.5 and the support properties of $E^\pm$, we have $WF(E) = A^+ \cup A^-$ away from the diagonal. The inclusion $\supset$ then follows from the closedness of the wave front set. For the opposite inclusion we consider a point on the diagonal and use the Propagation of Singularities Theorem to find an approximating sequence of points off the diagonal.

Proposition A.7. For the fundamental solutions of the Dirac equation we have, in the notation of Theorem A.5: $WF(S^\pm) = WF(S_c^\pm) = A^\pm \cup B$ and $WF(S) = WF(S_c) = \overline{A^+ \cup A^-} \setminus Z$.
In other words, $WF(S^\pm) = WF(S_c^\pm) = WF(E^\pm)$ and $WF(S) = WF(S_c) = WF(E)$.

Proof. Because S = (i / + m)E and Sc = (i / + m)E (see [6]) we


immediately nd WF (S ) WF (E ) and WF (Sc ) WF (E ). Similarly

WF (S) WF (E) and WF (Sc ) WF (E). Now suppose that WF (S) =


WF (Sc ) = WF (E) = A+ A , which we will prove below. By the support
properties of the fundamental solutions we then nd that away from the diagonal
WF (S ) = WF (Sc ) = A , whereas on the diagonal WF (E ) = B WF (S )
WF (P S ) = WF () = B and similarly for cospinors.
To complete the proof we need to show that WF (S) WF (E) and WF (Sc )
WF (E), for which we adapt (and correct) an idea of [33]. We prove the case of
S, because the other case follows by taking adjoints (cf. Theorem 3.10). Further
note that it is sucient to prove the claim on the diagonal, because the Propa-
gation of Singularities Theorem applies both to E and to S. Now suppose that
(x, ; x, ) WF (E)\WF (S). We will derive a contradiction as follows. For every
time-like, future pointing normalized vector n0 Tx M we can nd a smooth space-
like Cauchy surface C through x such that n0 is normal to C. We let n denote

the future pointing normal vector eld on C and : C M the canonical injec-
tion. By [6, Proposition 2.4(c)] we can restrict S to C 2 to nd S|C 2 = in /
and in particular (x, dTx (); x, dTx ()) WF (S|C 2 ). By (a component version
of) [38, Theorem 8.2.4], on the other hand:

WF (S|C 2 ) ( ) (WF (S)) = {(x, dTx (); y, dTy (  )) | (x, ; y,  ) WF (S)}.

Therefore, there must be a point (x, ; x, ) WF (S) such that (x, dTx ();
x, dTx ()) = (x, dTx (); x, dTx ()) Notice, however, that the transpose of d is
nothing else than restricting the dual vector to the tangent space of C. Because
WF (S) WF (E), there are only two possibilities: = or = 2(a na0 )n0 . The
first contradicts our assumption, so we have = 2(a na0 )n0 . Now (x, ; x, )
WF (S) must hold for every normalised, time-like, future pointing vector n0 Tx M .
Choosing a sequence of vectors n0 such that and using the closedness of the
wave front set we find again (x, ; x, ) WF (S). Hence, WF (E) = WF (S).

Appendix B. Proof of Theorem 4.18


The computations involved in the proof of Theorem 4.18 are somewhat similar to the
computation of the stress-energy-momentum tensor. We will work in components
and in local coordinates on O, using Greek indices to indicate the coordinate frame
and coordinate derivatives. To ease the notation we will drop the subscript  on the
local frame ea .
As a is independent of  we may use Eqs. (5) to vary
   
1 c 1 c

/ v = a v ab vc = ea v + eb { e e }vc a ,
b a c b
(22)
4 4

which yields:

1 d c 1 c
a e d v eb e ad vc + a e eb vc
/ v = e d a b a b a

4 4
1 1 c
ec e
a eb vc ea eb e vc .
b a b a
(23)
4 4
We can perform an integration by parts as follows:

1
a ec eb vc b a
4
i i
= Pc (ec eb vc b ) + ec eb Pc (vc b )
4 4
1 1 1
ec a eb vc b a ed eb cad vc b a + ec ed dab vc b a
4 4 4
i i 1
= Pc (ec eb vc b ) + ec eb (Pc v)c b ec eb a v[c b , a ]
4 4 4
1 1 1
ec a eb vc b a + eb ed cad vc b a + ec ed dab vc b a .
4 4 4
(24)

Because [c b , a ] = c { b , a } {c , a } b = 2 ab c 2ca b and ec = g cd ed


we can write:
1 1 1
ec eb a v[c b , a ] = (g cd ed )eb ab a vc + ec eb c v b
4 2 2
1
= g cd ed eb ab a vc ed ea a v d
2
1
= g ea eb a vb e a e d v .
d a
(25)
2
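The commutator identity used in this step, $[\gamma_c\gamma_b, \gamma_a] = \gamma_c\{\gamma_b,\gamma_a\} - \{\gamma_c,\gamma_a\}\gamma_b = 2\eta_{ab}\gamma_c - 2\eta_{ca}\gamma_b$, relies only on the Clifford relation $\{\gamma_a,\gamma_b\} = 2\eta_{ab}$, so it can be checked numerically in any representation. The following is a minimal sketch, assuming the Dirac representation and the signature $(+,-,-,-)$; the helper names are hypothetical and index placement is ignored because only the matrix algebra is tested.

```python
from itertools import product


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(c1, a, c2, b):
    """Entrywise linear combination c1*a + c2*b."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def neg(a):
    """Entrywise negation."""
    return [[-x for x in row] for row in a]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])

# Assumption of this sketch: Dirac representation of the gamma matrices.
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]
ETA_DIAG = [1, -1, -1, -1]  # diagonal of eta_ab, signature (+,-,-,-)
ID4 = block(I2, Z2, Z2, I2)


def eta(a, b):
    """Component eta_ab of the Minkowski metric."""
    return ETA_DIAG[a] if a == b else 0


def clifford_relation_holds():
    """Check {gamma_a, gamma_b} = 2 eta_ab * identity for all a, b."""
    return all(
        lincomb(1, matmul(GAMMA[a], GAMMA[b]), 1, matmul(GAMMA[b], GAMMA[a]))
        == lincomb(2 * eta(a, b), ID4, 0, ID4)
        for a, b in product(range(4), repeat=2)
    )


def commutator_identity_holds():
    """Check [gamma_c gamma_b, gamma_a] = 2 eta_ab gamma_c - 2 eta_ca gamma_b."""
    for a, b, c in product(range(4), repeat=3):
        cb = matmul(GAMMA[c], GAMMA[b])
        lhs = lincomb(1, matmul(cb, GAMMA[a]), -1, matmul(GAMMA[a], cb))
        rhs = lincomb(2 * eta(a, b), GAMMA[c], -2 * eta(c, a), GAMMA[b])
        if lhs != rhs:
            return False
    return True
```

All matrix entries are 0, ±1 or ±i, so the comparisons are exact and no tolerance is needed.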
When substituting Eqs. (24) and (25) into (23), we can recombine the terms
1 c 1 1 c d
e a eb vc b a ec e b a
a eb vc = e e vc b a
4 4 4 d ab
to obtain
i i 1

/v = Pc (ec eb vc b ) + ec eb (Pc v)c b + g ea eb a vb
4 4 2
1
e c b a
a eb e vc . (26)
4
Note that the variations of the frame ea cancel out, except in the terms with Pc .
These are harmless when we compute B0 ( / S0 f ), because both B0 and v solve
the Dirac equation. Therefore, the final answer will not depend on variations of the
frame, as desired.
In the last term of Eq. (26), we can use the symmetry of the Christoffel symbol:
1 1 c 1
() e c
a eb e vc = ea eb e vc
b a ab
= g ec vc
4 4 4
1 1
= g g g ec vc g ea g v a
4 4
1
+ g ea g v a . (27)
8
We handle the last term using an integration by parts as before:
1 i i 1
a g g v a = Pc (g g v) + g g Pc v g a g v a
8 8 8 8
i i 1
= Pc (g g v) + g g Pc v g a g v a , (28)
8 8 8
where we used g a g = g g g a g = g a g . The penultimate


term in (27) is:

1 1
g ea g v a = b (g g g )ea eb g v a
4 4
1 1
= b (g ea eb )va g g g b (ea eb g )v a
4 4
1 1
= b (g ea eb )va g (abc ec eb + bbc ea ec )va
4 4
1
g g g b (ea eb g )v a . (29)
4

The first term on the right-hand side of Eq. (29) is

1 1 1
b (g ea eb )va = b (g ea eb va ) g ea eb b va . (30)
4 4 4

The other terms can be simplified with some computation:

1
g (abc ec eb + bbc ea ec + g g ac b (ec eb g ))va
4
1
= g ( ea + ea ea c ec + ea
4

+ ea g g + ea b eb + g ac ec )va

1
= g ( ac ec g + ea + ea ea g g )va
4
1
= g (2ea g g + ea g (2 g g )
8

+ ea g g 2ea g g )va

1 a
= g (e g g + 2ea g g )va . (31)
8

Substituting Eqs. (27)–(31) into (26) yields:

i i i i

/v = Pc (ec eb vc b ) + ec eb (Pc v)c b Pc (g g v) + g g Pc v
4 4 8 8
1 1
+ g ea eb a vb + b (g ea eb va ). (32)
4 4

Using Lemma 4.17, we find for a spinor $u \in C^\infty(DM)$:

i i i i

/u = P (ec eb b c u) ec eb b c (P u) + P (g g u) g g P u
4 4 8 8
1 1
+ g ea eb b a u + b (g ea eb a u). (33)
4 4

Using Proposition 4.16 and Eqs. (32) and (33) we notice that the terms with
Pc and P cancel out in the following equality, because B0 and S0 f both satisfy the
Dirac equation:

( B0 (f )) = B0 (P S0 f )
i i
= B0 (g ea eb b a S0 f ) + B0 (b (g ea eb a S0 f ))
4 4
i a b
= g e e (B0 ((b a) S0 f ) (b B0 (a) S0 f )). (34)
4

We now compare with Proposition 4.14 to get the final result.

References
[1] K. Sanders, Aspects of locally covariant quantum field theory, PhD thesis, University
of York (2008); also available online, arXiv:0809.4828v1 [math-ph].
[2] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle
– a new paradigm for local quantum field theory, Comm. Math. Phys. 237
(2003) 31–68.
[3] C. Dappiaggi, T.-P. Hack and N. Pinamonti, The extended algebra of observables for
Dirac fields and the trace anomaly of their stress-energy tensor, Rev. Math. Phys. 21
(2009) 1241–1312.
[4] R. Verch, A spin-statistics theorem for quantum fields on curved spacetime manifolds
in a generally covariant framework, Comm. Math. Phys. 223 (2001) 261–288.
[5] C. J. Fewster, Quantum energy inequalities and local covariance II: Categorical
formulation, Gen. Relativ. Gravit. 39 (2007) 1855–1890.
[6] J. Dimock, Dirac quantum fields on a manifold, Trans. Amer. Math. Soc. 269 (1982)
133–147.
[7] C. J. Fewster and R. Verch, A quantum weak energy inequality for Dirac fields in
curved spacetime, Comm. Math. Phys. 225 (2002) 331–359.
[8] H. B. Lawson and M.-L. Michelsohn, Spin Geometry (Princeton University Press,
Princeton, 1989).
[9] R. Coquereaux, Clifford algebras, spinors and fundamental interactions: Twenty years
after, arXiv:math-ph/0509040v1.
[10] W. Pauli, Contributions mathématiques à la théorie des matrices de Dirac, Ann. Inst.
H. Poincaré 6 (1936) 109–136.
[11] B. L. van der Waerden, Group Theory and Quantum Mechanics (Springer, Berlin,
1974).
[12] Y. Choquet-Bruhat, C. de Witt-Morette and M. Dillard-Bleick, Analysis, Manifolds
and Physics (North Holland, Amsterdam, 1977).
[13] S. P. Dawson and C. J. Fewster, An explicit quantum weak energy inequality for
Dirac fields in curved spacetimes, Class. Quantum Grav. 23 (2006) 6659–6681.
[14] S. Mac Lane, Categories for the Working Mathematician (Springer, New York, 1971).
[15] S. Mac Lane and I. Moerdijk, Sheaves in Geometry and Logic: A First Introduction
to Topos Theory (Springer, New York, 1992).
[16] C. Bär, N. Ginoux and F. Pfäffle, Wave Equations on Lorentzian Manifolds and
Quantization (EMS, Zürich, 2007).
[17] J. Dieudonné, Treatise on Analysis, Vol. III (Academic Press, New York-London,
1972).
[18] J. Tolksdorf, Clifford modules and generalized Dirac operators, Internat. J. Theoret.
Phys. 40 (2001) 191–209.
[19] R. Geroch, Spinor structures of space-times in general relativity. I, J. Math. Phys. 9
(1968) 1739–1744.
[20] R. M. Wald, General Relativity (University of Chicago Press, Chicago-London, 1984).
[21] R. Geroch, Spinor structures of space-times in general relativity. II, J. Math. Phys.
11 (1970) 343–348.
[22] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Vol. I
(Interscience, New York, 1963).
[23] R. H. Good Jr., Properties of the Dirac matrices, Rev. Mod. Phys. 27 (1955) 187–211.
[24] A. Lichnerowicz, Champs spinoriels et propagateurs en relativité générale, Bull. Soc.
Math. France 92 (1964) 11–100.
[25] D. Canarutto and A. Jadczyk, Fundamental geometric structures for the Dirac
equation in general relativity, Acta Appl. Math. 51 (1998) 59–92.
[26] J. E. Roberts and G. Ruzzi, A cohomological description of connections and curvature
tensors over posets, Theory Appl. Categ. 16 (2006) 855–895.
[27] G. Segal, Classifying spaces and spectral sequences, Inst. Hautes Études Sci. Publ.
Math. 34 (1968) 105–112.
[28] K. Schmüdgen, Unbounded Operator Algebras and Representation Theory
(Birkhäuser, Basel, 1990).
[29] H. Araki, On the diagonalization of a bilinear Hamiltonian by a Bogoliubov
transformation, Publ. Res. Inst. Math. Sci. Ser. A 4 (1968/1969) 387–412.
[30] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical
Mechanics, Vol. 2 (Springer, Berlin, 1996).
[31] M. J. Radzikowski, Micro-local approach to the Hadamard condition in quantum
field theory on curved space-time, Comm. Math. Phys. 179 (1996) 529–553.
[32] K. Kratzert, Singularity structure of the two point function of the free Dirac field on
a globally hyperbolic spacetime, Ann. Phys. (8) 9 (2000) 475–498.
[33] S. Hollands, The Hadamard condition for Dirac fields and adiabatic states on
Robertson–Walker spacetimes, Comm. Math. Phys. 216 (2001) 635–661.
[34] H. Sahlmann and R. Verch, Microlocal spectrum condition and Hadamard form for
vector-valued quantum fields in curved spacetime, Rev. Math. Phys. 13 (2001)
1203–1246.
[35] S. Hollands, The operator product expansion for perturbative quantum field theory
in curved spacetime, Comm. Math. Phys. 273 (2007) 1–36.
[36] J. J. Duistermaat and L. Hörmander, Fourier integral operators. II, Acta Math. 128
(1972) 183–269.
[37] A. Strohmaier, R. Verch and M. Wollenberg, Microlocal analysis of quantum fields on
curved space-times: Analytic wave front sets and Reeh–Schlieder theorems, J. Math.
Phys. 43 (2002) 5514–5530.
[38] L. Hörmander, The Analysis of Linear Partial Differential Operators, Vol. I (Springer,
Berlin, 2003).
[39] K. Sanders, Equivalence of the (generalized) Hadamard and microlocal spectrum
condition for (generalized) free fields in curved spacetime, Comm. Math. Phys. 295
(2010) 485–501.
[40] R. Brunetti, K. Fredenhagen and M. Köhler, The microlocal spectrum condition and
Wick polynomials of free fields on curved spacetimes, Comm. Math. Phys. 180 (1996)
633–652.
[41] N. Dencker, On the propagation of polarization sets for systems of real principal type,
J. Funct. Anal. 46 (1982) 351–372.
[42] C. D'Antoni and S. Hollands, Nuclearity, local quasiequivalence and split property for
Dirac quantum fields in curved spacetime, Comm. Math. Phys. 261 (2006) 133–159.
[43] M. Forger and H. Römer, Currents and the energy-momentum tensor in classical field
theory: A fresh look at an old problem, Ann. Phys. 309 (2004) 306–389.
May 11, 2010 10:7 WSPC/S0129-055X 148-RMP
J070-S0129055X10004004

Reviews in Mathematical Physics


Vol. 22, No. 4 (2010) 431–484

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004004

INVERSE SCATTERING IN
DE SITTER–REISSNER–NORDSTRÖM
BLACK HOLE SPACETIMES


THIERRY DAUDÉ
Department of Mathematics and Statistics,
McGill University, 805 Sherbrooke South West,
Montreal QC, H3A 2K6, Canada
tdaude@math.mcgill.ca

FRANÇOIS NICOLEAU
Département de Mathématiques,
Laboratoire Jean Leray UMR 6629,
Université de Nantes, 2, rue de la Houssinière,
BP 92208, 44322 Nantes Cedex 03, France
nicoleau@math.univ-nantes.fr

Received 4 October 2009


Revised 15 March 2010

In this paper, we study the inverse scattering of massive charged Dirac fields in the
exterior region of (de Sitter)–Reissner–Nordström black holes. Firstly, we obtain a precise
high-energy asymptotic expansion of the diagonal elements of the scattering matrix (i.e.
of the transmission coefficients) and we show that the leading terms of this expansion
allow one to recover uniquely the mass, the charge and the cosmological constant of the
black hole. Secondly, in the case of nonzero cosmological constant, we show that the
knowledge of the reflection coefficients of the scattering matrix on any interval of energy
also allows one to recover uniquely these parameters.

Keywords: Inverse scattering; black holes; Dirac equation.

Mathematics Subject Classification 2010: 81U40, 35P25

1. Introduction
This paper deals with inverse scattering problems in black hole spacetimes and is
a continuation of our previous work [4]. Here we shall study the inverse scattering
of massive charged Dirac fields that propagate in the outer region of (de Sitter)–Reissner–Nordström
black holes, an important family of spherically symmetric,
charged exact solutions of the Einstein equations that will be thoroughly described
in Sec. 2. These spacetimes are completely characterized by three parameters: the
mass $M > 0$ and the electric charge $Q \in \mathbb{R}$ of the black hole as well as the
cosmological constant $\Lambda \geq 0$ of the universe. In what follows, these parameters will be
considered as the unknowns of our inverse problem. In fact, the inverse scattering
problem we have in mind is the following: we assume that we are static observers
living in the exterior region of a (dS)-RN black hole, that is the region between the
exterior event horizon of the black hole and the cosmological horizon when $\Lambda > 0$,
or the region lying beyond the exterior event horizon of the black hole when $\Lambda = 0$.
The geometry of the spacetime in which these observers live is thus fixed in some
sense. What we do not assume, however, is that these observers know a priori the exact
values of the parameters M, Q and $\Lambda$. Hence the natural question we
address is: do such observers have any means to measure or characterize uniquely
these parameters by an inverse scattering experiment?
Let us explain more precisely the exact inverse scattering problem studied in
this paper. First of all, we shall use the direct scattering theory for massive charged
Dirac fields established in [3] for RN black holes and more generally in [18] for dS-RN
black holes. The point of view adopted in these papers to describe the geometry
of the black hole is that of static observers located far from the horizons (think
typically of a telescope on earth aimed at the black hole). We shall keep this
point of view here, which means in practice that all the relevant objects (such as the
wave and scattering operators) used in this work will be expressed by means of the
Regge–Wheeler coordinate system. This choice of coordinates has an important
consequence for the understanding of the boundaries of the outer region of (dS)-RN
black holes, namely, either the exterior event horizon of the black hole and
the cosmological horizon when $\Lambda > 0$, or the event horizon of the black hole and
spacelike infinity when $\Lambda = 0$. These boundaries are indeed perceived by such
observers as asymptotic regions of the spacetime which, moreover, may have very
different geometrical structures. This entails the following nice and peculiar picture
concerning the propagation properties of the Dirac fields ([3, 18]). First, it can be
proved that the energy of the fields contained in any compact set between the two
asymptotic regions vanishes at late times. Therefore, the fields scatter toward these
asymptotic regions. Second, from the point of view of our particular observers,
Dirac fields are shown to obey there simple but different equations that reflect
the different geometries of the asymptotic regions. Therefore, two distinct wave
operators must be introduced according to the asymptotic region we consider. Let
us denote for the moment the wave operators corresponding to the part of Dirac
fields which scatters toward the event horizon of the black hole by $W^\pm_{(-\infty)}$ and the
wave operators corresponding to the part of Dirac fields which scatters toward the
cosmological horizon or spatial infinity by $W^\pm_{(+\infty)}$. These wave operators will be
precisely defined in Sec. 2. The main result obtained in [3, 18] asserts that the
global wave operators defined by

$$W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}, \tag{1.1}$$

exist and are asymptotically complete. This allows us to define a global scattering
operator S by the usual formula

$$S = (W^+)^* W^-.$$

The scattering operator S will be the main object of study of this paper. It
encodes the scattering data as viewed by observers living far from the horizons of a
(dS)-RN black hole. We thus rephrase and make precise our initial problem in the following
way. We assume that our observers have experimental access to the scattering
operator S. More precisely, we assume that they can measure the expectation values
of S, i.e. they can measure any quantity of the form $\langle S\phi, \psi\rangle$, where $\langle\cdot,\cdot\rangle$ denotes
the scalar product of the energy Hilbert space $\mathcal{H}$ on which S acts and $\phi$, $\psi$ are any
elements of $\mathcal{H}$. The question we address is now: is the knowledge of S and any of its
related quantities sufficient information to uniquely characterize the parameters
M, Q and $\Lambda$ of (dS)-RN black holes?
We can in fact be a bit more precise in the statement of the problem if we
remark that the scattering operator S can be decomposed using (1.1) as

$$S = T_L + T_R + L + R,$$

where

$$T_L = (W^+_{(+\infty)})^* W^-_{(-\infty)}, \qquad T_R = (W^+_{(-\infty)})^* W^-_{(+\infty)},$$

and

$$R = (W^+_{(+\infty)})^* W^-_{(+\infty)}, \qquad L = (W^+_{(-\infty)})^* W^-_{(-\infty)}.$$

Each of the terms in S corresponds to a different inverse scattering experiment.
For instance, the first two terms $T_R$ and $T_L$ (in fact the diagonal elements of S)
are understood as transmission operators. These terms measure the part of a signal
which is transmitted from one asymptotic region to the other in a scattering
process. Conversely, the last two terms L and R (the anti-diagonal elements of S)
are understood as reflection operators and correspond to the opposite experiment.
These terms measure the part of the signal which is reflected from an asymptotic
region to itself. The quantities of interest (the inverse scattering data) will thus be
either the expectation values $\langle T_R\phi,\psi\rangle$, $\langle T_L\phi,\psi\rangle$ of the transmission operators,
or the expectation values $\langle L\phi,\psi\rangle$, $\langle R\phi,\psi\rangle$ of the reflection operators.
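The four-term decomposition of S can be checked by a direct expansion. The sketch below assumes only (1.1) and the formula $S = (W^+)^* W^-$, and writes $W^\pm_{(-\infty)}$ and $W^\pm_{(+\infty)}$ for the wave operators attached to the event horizon and to the cosmological horizon (or spatial infinity), respectively; this subscript notation is a reading of the source and should be treated as an assumption.

```latex
\begin{align*}
S &= (W^+)^* W^-
   = \bigl(W^+_{(-\infty)} + W^+_{(+\infty)}\bigr)^*
     \bigl(W^-_{(-\infty)} + W^-_{(+\infty)}\bigr) \\
  &= \underbrace{(W^+_{(+\infty)})^* W^-_{(-\infty)}}_{T_L}
   + \underbrace{(W^+_{(-\infty)})^* W^-_{(+\infty)}}_{T_R}
   + \underbrace{(W^+_{(-\infty)})^* W^-_{(-\infty)}}_{L}
   + \underbrace{(W^+_{(+\infty)})^* W^-_{(+\infty)}}_{R}.
\end{align*}
```

The two mixed products connect different asymptotic regions (transmission), while the other two start and end in the same region (reflection).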
In this paper, we shall study two types of inverse problems. Firstly, in the two
cases of RN black holes ($\Lambda = 0$) and dS-RN black holes ($\Lambda > 0$), we shall prove
that the parameters M, Q, $\Lambda$ are uniquely determined if we assume that the high
energies of the transmission operators $T_R$ or $T_L$ are known. Note here that the same
analysis would not be possible working with the reflection operators R or L. The high
energies of the reflection operators are indeed non-measurable and thus cannot be
used to determine uniquely the parameters. This was mentioned in [4] (see also [12]
where a similar problem was studied). Secondly, in the case of dS-RN black holes
only ($\Lambda > 0$), we shall prove the same uniqueness result under the assumption that
the reflection operators L or R are known on any (possibly small) interval of energy.
The reason why we do not treat this second type of inverse problem in the case
of RN black holes is the following. The structure of the scattering operator (at any
energy) turns out to be more complicated in the case of RN black holes than in dS-RN
black holes. This is again a consequence of the very different geometries of the
asymptotic regions in the RN case (see below for a brief explanation). Obtaining the
same uniqueness result in this last case would thus require a better understanding
of the scattering matrix. We are currently investigating this problem.
Let us now recall the results of [4] where the first kind of inverse problem was
addressed in the case of Reissner–Nordström black holes (i.e. with only the two
parameters M, Q unknown and the cosmological constant $\Lambda$ equal to 0). Using the
direct scattering theory for massless Dirac fields obtained in [3, 20] and a high
energy asymptotic expansion of the expectation values $\langle T_R\phi,\psi\rangle$ or $\langle T_L\phi,\psi\rangle$ (as
defined above), a partial answer was then given: the mass M and the modulus
of the charge |Q| are uniquely determined from the leading terms of this high
energy asymptotic expansion. Note that the indeterminacy of the sign of the charge is
not surprising in that case since the propagation of massless Dirac fields is only
influenced by the geometry of the black hole which in turn only depends on |Q| (see
the expression of the metric (2.2) in Sec. 2).
In this paper, we continue our investigation and improve our results in several
directions. In Sec. 3, we reconsider the case $\Lambda = 0$ corresponding to RN black holes
but study the inverse scattering of massive charged Dirac fields instead of massless
Dirac fields. Using the same approach as in [4], we show that the mass M as well
as the charge Q are uniquely determined by the leading terms of the high energy
asymptotic expansion of the transmission operators $T_R$ or $T_L$. In fact, the advantage
of considering massive charged Dirac fields is that an explicit term associated to
the interaction between the electric charge of the fields and that of the black hole
appears in the equation and allows one to recover Q and not only |Q|. From the mathematical
side, the analysis turns out to be much more involved than in [4]. The reasons are
twofold. First, from the point of view of our observers, massive Dirac fields have
completely distinct behaviors when approaching the different asymptotic regions.
At the event horizon of the black hole for instance, the attraction exerted by the
black hole is so strong that massive Dirac fields seem to behave as massless Dirac
fields. The asymptotic dynamics there turns out to be very simple and is shown to
obey a system of transport equations along the null radial geodesics of the black
hole.$^a$ This is a consequence of the particular geometry (of hyperbolic type) near
the event horizon (and more generally near any horizon). Conversely, RN black
holes are asymptotically flat at spacelike infinity. There, the fields simply behave
like massive Dirac fields in Minkowski spacetime and the mass of the fields, slowing
down the propagation, plays an important role. In consequence, the dynamics near
the two asymptotic regions are quite different and must be treated separately. The
second and related difficulty comes from the appearance of long-range terms in
the equation but only at a single asymptotic region: spacelike infinity. This entails
new technical difficulties such as a modification of the standard wave operators at
infinity and we need to work harder to obtain the high energy asymptotic expansion
of the transmission operators. An interesting feature we would like to emphasize
is that, even though we are considering high energies, the rest mass and the charge
of Dirac fields do contribute to the asymptotic expansion of the scattering matrix.
This can be clearly seen from the reconstruction formulae obtained in Theorem 3.2.
Finally, we also mention that the model studied in this section can be viewed as
a good intermediate model before studying the same inverse problem in the more
complicated geometrical setting of Kerr black holes. As shown in [13] indeed, the
appearance of long-range terms in the equation (even for massless Dirac fields) is
unavoidable in that case as a side effect of the rotation of the spacetime.

$^a$ We emphasize again here that this simple expression for the asymptotic dynamics at the event
horizon (in fact at any horizon) is only true from the point of view of observers living far from the
horizons. Adopting another point of view such as the one of local observers living near a horizon
would lead to very different asymptotic dynamics.
In Sec. 4, we consider the case of nonzero cosmological constant $\Lambda > 0$, that is de
Sitter–Reissner–Nordström black holes, and the three parameters M, Q, $\Lambda$ are supposed
to be a priori unknown. The two asymptotic regions are the event horizon of
the black hole and the cosmological horizon. From the point of view of our observers,
massive Dirac fields seem to behave as massless Dirac fields when approaching the
horizons and as before, their propagation there obeys essentially a system of transport
equations along the null radial geodesics of the black hole. However, different
oscillations appear in the dynamics near these two horizons, once again due to the
interaction between the charge of the field and that of the black hole. In consequence,
Dirac fields evolve asymptotically according to slightly different dynamics
in that case too. In Sec. 4.1, using the results of the previous part, we shall obtain a
high energy asymptotic expansion of the transmission operators $T_R$ and $T_L$ and we
shall prove that the parameters M, Q and $\Lambda$ are uniquely characterized by the leading
terms of this asymptotic expansion. In Sec. 4.2, we consider an inverse scattering
problem based on the knowledge of the reflection operators R or L on a (possibly
small) interval of energy. As already mentioned, a high energy asymptotic expansion
of these reflection operators does not give any information and cannot be used to
solve the inverse problem. To study this case, we follow instead the usual stationary
approach of inverse scattering theory on the line. We refer for instance to the review
by Faddeev [8] and to the important paper by Deift and Trubowitz [6] for a presentation
of the method for Schrödinger operators and to the nice paper [1] for a recent
application to Dirac operators (see also [12, 15]). We shall first obtain a stationary
representation of the scattering operator S in terms of the usual transmission and
reflection coefficients (note that these turn out to be matrices in our case). This
is done after a series of simplifications of our model which happens finally to reduce
to a particular case of the model studied in [1]. Then we use the analysis of [1],
namely, a classical Marchenko method based on a detailed analysis of the stationary
solutions of the corresponding Dirac equation, to prove the following result:
the knowledge of one of the reflection operators L or R at all energies is enough
to uniquely characterize the parameters M, Q and $\Lambda$. Finally, we improve this
result by observing that, in the dS-RN model, the reflection operators R or L are in
fact analytic in the energy variable on a small strip containing the real axis. Hence
it is enough to know R or L on any interval of energy in order to uniquely know
them for all energies. Applying the result of [1], this leads to the uniqueness of the
parameters in that case too. Note finally that a stationary representation of the
scattering operator in the case of RN black holes would differ drastically from the one
obtained in Sec. 4.2 for dS-RN black holes. This is due to the presence of long-range
terms at spacelike infinity that change the asymptotic behaviors of stationary solutions
and thus the structure of the scattering matrix. In particular, the stationary
representation obtained in [1] could not be used in this case.
We finish this introduction by saying a few words on the main technical tool used
in Secs. 3 and 4 to prove our uniqueness results from the high energies of the transmission
operators $T_R$ or $T_L$. These are based on a high-energy expansion of the
scattering operator S following an approach introduced by Enss and Weder in [7]
in the case of multidimensional Schrödinger operators. (Note that the case of multidimensional
Dirac operators in flat spacetime was treated later by Jung in [17].)
Their result can be summarized as follows. Using purely time-dependent methods,
they showed, roughly speaking, that the first term of the high-energy expansion of
S is exactly the Radon transform of the potential they are looking for. Since they
work in dimension greater than two, this Radon transform can be inverted and
the potential thus uniquely recovered. In our problem however, due to the spherical
symmetry of the black hole, we are led to study a family of one-dimensional
Dirac equations and the above Radon transform simply becomes an integral of a
one-dimensional function, hence a number, and cannot be inverted. Fortunately in
our models, it turns out that this integral can be explicitly computed and in
general already gives physically relevant information. Nevertheless, it is not enough to
uniquely characterize all the parameters of the black hole. In fact, we need to calculate
several terms of the asymptotic expansion (and thus obtain several integrals) to prove our
result. To do this, we follow the stationary technique introduced by one of us [21]
which is close in spirit to the Isozaki–Kitada method used in long-range scattering
theory [16]. The basic idea is to replace the wave operators (and thus the scattering
operator) by explicit Fourier Integral Operators, called modifiers, from which we
are able to compute the high-energy expansion readily. The construction of these
modifiers and the precise determination of their phases and amplitudes will be given
in a self-contained manner in Sec. 3. Note also that the similar results proved in
our previous paper [4] could not be applied directly to our new model because of
the presence of long-range terms in the equation. Finally, we mention that, while
this method was well-known for Schrödinger operators and applied successfully to
various situations (see [2, 21–23]), it has required some substantial modifications
when applied to Dirac operators, essentially because of the matrix-valued nature of
the equation. To deal with these difficulties, we made extensive use of the paper
by Gâtel and Yafaev [9] where a direct scattering theory of massive Dirac fields in
flat spacetime was studied and modifiers were constructed.

2. (De Sitter)–Reissner–Nordström Black Holes and
Dirac Equation
In this section, we describe the geometry of the exterior regions of (de Sitter)–Reissner–Nordström
black holes. In particular, we emphasize the point of view
adopted for the observers as well as the different properties of the asymptotic regions
mentioned in the introduction, clearly distinguishing between the cases of zero and
nonzero cosmological constant $\Lambda$. We then express in a synthetic manner the equations
that govern the evolution of massive charged Dirac fields in these spacetimes.
We end this section by recalling the known direct scattering results of [3, 18] and
introducing the scattering operator S.

2.1. (De Sitter)–Reissner–Nordström black holes
In Schwarzschild coordinates a (de Sitter)–Reissner–Nordström black hole is
described by a four-dimensional smooth manifold

$$\mathcal{M} = \mathbb{R}_t \times \mathbb{R}^+_r \times S^2_\omega,$$

equipped with the Lorentzian metric

$$g = F(r)\,dt^2 - F(r)^{-1}\,dr^2 - r^2\,d\omega^2, \tag{2.1}$$

where

$$F(r) = 1 - \frac{2M}{r} + \frac{Q^2}{r^2} - \frac{\Lambda r^2}{3}, \tag{2.2}$$

and $d\omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ is the Euclidean metric on the sphere $S^2$. The constants
$M > 0$, $Q \in \mathbb{R}$ appearing in (2.2) are interpreted as the mass and the electric
charge of the black hole and $\Lambda \geq 0$ is the cosmological constant of the universe.
Observe that the function (2.2) and thus the metric (2.1) do not depend on the
angular variables $(\theta, \varphi) \in S^2$, reflecting the fact that dS-RN black holes are spherically
symmetric spacetimes.
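The family $(\mathcal{M}, g)$ are in fact exact solutions of the Einstein–Maxwell equations

$$G_{\mu\nu} = 8\pi T_{\mu\nu}, \qquad G_{\mu\nu} = R_{\mu\nu} + \frac{1}{2} R g_{\mu\nu} + \Lambda g_{\mu\nu}. \tag{2.3}$$

Here $G_{\mu\nu}$, $R_{\mu\nu}$ and R denote respectively the Einstein tensor, the Ricci tensor and
the scalar curvature of $(\mathcal{M}, g)$ while $T_{\mu\nu}$ is the energy-momentum tensor

$$T_{\mu\nu} = \frac{1}{4\pi}\left(F_{\mu\rho}F_\nu{}^{\rho} - \frac{1}{4} g_{\mu\nu} F_{\rho\sigma}F^{\rho\sigma}\right), \tag{2.4}$$

where $F_{\mu\nu}$ is the electromagnetic two-form solution of the Maxwell equations
$\nabla^\mu F_{\mu\nu} = 0$, $\nabla_{[\mu}F_{\nu\rho]} = 0$ and given here in terms of a global electromagnetic
vector potential

$$F_{\mu\nu} = \nabla_{[\mu}A_{\nu]}, \qquad A_\nu\,dx^\nu = -\frac{Q}{r}\,dt. \tag{2.5}$$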

The metric $g$ has two types of singularities. The first is the point $\{r = 0\}$, at which
the function $F$ is singular. This is a true singularity or curvature singularity.b
The second type consists of the spheres whose radii are the roots of $F$ (note that the coefficient of
the metric $g$ involving $F^{-1}$ blows up in this case). We must distinguish two
cases here. When the cosmological constant is positive, $\Lambda > 0$, and small enough, there
are three positive roots $0 \le r_- < r_0 < r_+ < +\infty$. The spheres of radius $r_-$, $r_0$
and $r_+$ are called, respectively, the Cauchy, event and cosmological horizons of the dS-RN
black hole. When $\Lambda = 0$, the number of these roots depends on the respective
values of the constants $M$ and $Q$. In this paper, we only consider the case $M > |Q|$,
for which the function $F$ has two zeros at the values $r_- = M - \sqrt{M^2 - Q^2}$ and
$r_0 = M + \sqrt{M^2 - Q^2}$. The spheres of radius $r_-$ and $r_0$ are called, respectively, the
Cauchy and event horizons of the RN black hole. In both situations, the horizons
are not true singularities in the sense given for $\{r = 0\}$, but in fact coordinate
singularities. It turns out that, using appropriate coordinate systems, these horizons
can be understood as regular null hypersurfaces that can be crossed one way but
would require speeds greater than that of light to be crossed the other way. We refer
to [14, 28] for an introduction to black hole spacetimes and their general properties.
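As a quick numerical illustration of the case $\Lambda = 0$, $M > |Q|$, the following Python sketch evaluates the two roots of $F$ and lets one check that $F$ vanishes there and is positive just outside the event horizon. The function names `F` and `rn_horizons` and the sample values of $M$ and $Q$ are illustrative assumptions, not taken from the paper.

```python
import math

def F(r, M, Q, Lam=0.0):
    """Metric function (2.2): F(r) = 1 - 2M/r + Q^2/r^2 - Lam*r^2/3."""
    return 1.0 - 2.0 * M / r + Q**2 / r**2 - Lam * r**2 / 3.0

def rn_horizons(M, Q):
    """Cauchy and event horizon radii of an RN black hole (Lam = 0, M > |Q|)."""
    if M <= abs(Q):
        raise ValueError("this sketch only covers the case M > |Q|")
    disc = math.sqrt(M**2 - Q**2)
    return M - disc, M + disc

# Illustrative values only (assumption): any M > |Q| would do.
M, Q = 1.0, 0.6
r_minus, r_0 = rn_horizons(M, Q)
```

With these sample values, `F(r, M, Q)` is negative between the two radii and positive for `r > r_0`, which is the exterior region considered below.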
As mentioned in the introduction, we shall consider in this paper inverse scattering
problems from the point of view of static observers living in the exterior
region of a (dS)-RN black hole, that is the region $\{r_0 < r < r_+\}$ when $\Lambda > 0$
or the region $\{r_0 < r < +\infty\}$ when $\Lambda = 0$, and located far from the horizons.
Such observers are well described by the variable $t$ of the Schwarzschild coordinates,
meaning that $t$ corresponds to their proper time. Since the metric is singular at the horizons,
it is important to understand the role of these singularities as the
natural boundaries of the exterior region. It turns out that they are perceived by
such observers as asymptotic regions of spacetime. Precisely, this means that they
are never reached in a finite time $t$ by incoming and outgoing null radial geodesics,
i.e. the trajectories followed by classical light-rays aimed radially at the black hole
and either at the cosmological horizon if $\Lambda > 0$ or at infinity if $\Lambda = 0$. To see this
point more easily, we introduce a new radial coordinate $x$, called the Regge-Wheeler
coordinate, which has the property of straightening the null radial geodesics and
will, at the same time, greatly simplify the later analysis. Observing that for all
$\Lambda \ge 0$ the function $F(r)$ given by (2.2) remains positive in the exterior
region, it can be defined implicitly by the relation
$$ \frac{dr}{dx} = F(r) > 0, \tag{2.6} $$
or explicitly, by
$$ x = \frac{1}{2\kappa_0}\left[ \log(r - r_0) - \int_{r_0}^{r} \left( \frac{1}{y - r_0} - \frac{2\kappa_0}{F(y)} \right) dy \right] + C, \tag{2.7} $$

b It means that certain scalars obtained by contracting the Riemann tensor blow up when $r \to 0$.

where the quantity
$$ \kappa_0 = \frac{1}{2} F'(r_0) > 0 $$
is called the surface gravity of the event horizon and $C$ is any constant of integration.
Note that, when $\Lambda > 0$, the Regge-Wheeler variable could also be defined
explicitly by
$$ x = \frac{1}{2\kappa_+}\left[ \log(r_+ - r) - \int_{r}^{r_+} \left( \frac{1}{r_+ - y} + \frac{2\kappa_+}{F(y)} \right) dy \right] + C, \tag{2.8} $$
where the quantity
$$ \kappa_+ = \frac{1}{2} F'(r_+) < 0 $$
is called the surface gravity of the cosmological horizon. Moreover, in the case $\Lambda = 0$,
the expression (2.7) simplifies to
$$ x = r + \frac{1}{2\kappa_0} \log(r - r_0) + \frac{r_-^2}{r_- - r_0} \log(r - r_-) + C. \tag{2.9} $$
In the coordinate system $(t, x, \omega)$, it is easy to see from the logarithm in (2.7)
and (2.9) and the positive sign of $\kappa_0$ that the event horizon $\{r = r_0\}$ is pushed away
to $\{x = -\infty\}$ for all $\Lambda \ge 0$. Similarly it follows from (2.8) and the negative sign
of $\kappa_+$ that the cosmological horizon $\{r = r_+\}$ is pushed away to $\{x = +\infty\}$ when
$\Lambda > 0$. Hence in any case the Regge-Wheeler variable $x$ runs over the full real line
$\mathbb{R}$. Moreover, by (2.6), the metric now takes the form
$$ g = F(r)(dt^2 - dx^2) - r^2\, d\omega^2, \tag{2.10} $$
from which it is immediate to see that the incoming and outgoing null radial
geodesics are generated by the vector fields $\partial_t \pm \partial_x$ and take the simple
form
$$ \gamma^\pm(t) = (t, x_0 \pm t, \omega_0), \quad t \in \mathbb{R}, \tag{2.11} $$
where $(x_0, \omega_0) \in \mathbb{R} \times S^2$ are fixed. These are simply straight lines with velocity $\pm 1$
mimicking, at least in the $t$-$x$ plane, the situation of a one-dimensional Minkowski
spacetime. Finally, using (2.11), we can check directly that the event horizon and
the cosmological horizon (when $\Lambda > 0$) are asymptotic regions of spacetime in the
sense given above.
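The defining relation $dr/dx = F(r)$ and the behaviour of $x$ near the event horizon can be checked numerically in the case $\Lambda = 0$. The sketch below is a minimal illustration under stated assumptions: it uses the closed form of the Regge-Wheeler coordinate for $\Lambda = 0$ obtained by partial fractions of $1/F$, takes $C = 0$, and the sample values of $M$ and $Q$ as well as the names `x_of_r` and `dx_dr` are hypothetical.

```python
import math

M, Q = 1.0, 0.6                       # illustrative values with M > |Q|
disc = math.sqrt(M**2 - Q**2)
r_minus, r_0 = M - disc, M + disc     # Cauchy and event horizons
kappa_0 = (r_0 - r_minus) / (2.0 * r_0**2)   # equals (1/2) F'(r_0) when Lambda = 0

def F(r):
    """Metric function for Lambda = 0."""
    return 1.0 - 2.0 * M / r + Q**2 / r**2

def x_of_r(r, C=0.0):
    """Regge-Wheeler coordinate for Lambda = 0, valid for r > r_0."""
    return (r
            + math.log(r - r_0) / (2.0 * kappa_0)
            + r_minus**2 / (r_minus - r_0) * math.log(r - r_minus)
            + C)

def dx_dr(r, h=1e-6):
    """Central finite-difference approximation of dx/dr."""
    return (x_of_r(r + h) - x_of_r(r - h)) / (2.0 * h)
```

For any `r > r_0` the product `dx_dr(r) * F(r)` should be close to 1, and `x_of_r(r)` becomes arbitrarily negative as `r` decreases to `r_0`, in line with the event horizon being pushed to $x = -\infty$.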
From now on, we shall only consider the exterior region of dS-RN black holes
and we shall work on the manifold $\mathcal{B} = \mathbb{R}_t \times \Sigma$ with $\Sigma = \mathbb{R}_x \times S^2_\omega$, equipped with the
metric (2.10). Such a manifold $\mathcal{B}$ is globally hyperbolic, meaning that the foliation
$\Sigma_t = \{t\} \times \Sigma$ by the level hypersurfaces of the function $t$ is a foliation of $\mathcal{B}$ by
Cauchy hypersurfaces (see [28] for a definition of global hyperbolicity and Cauchy
hypersurfaces). Consequently, we can view the propagation of massive charged
Dirac fields as an evolution equation in $t$ on the spacelike hypersurface $\Sigma$, that is a
cylindrical manifold having two distinct ends: $\{x = -\infty\}$ corresponding to the event
horizon of the black hole and $\{x = +\infty\}$ corresponding to the cosmological horizon
when $\Lambda > 0$ and to spacelike infinity when $\Lambda = 0$. Note that the geometries of
these ends are distinct in general. The event and cosmological horizons are indeed
exponentially large ends of $\Sigma$ whereas spacelike infinity is an asymptotically flat
end of $\Sigma$ (in the latter case, observe that the metric (2.1) tends to the Minkowski metric
expressed in spherical coordinates when $r \to +\infty$). The difference between these
geometries will be easily seen from the distinct asymptotic behaviors of Dirac fields
near these regions given in the next subsection.

2.2. Dirac equation and direct scattering results


Scattering theory for massive charged Dirac fields on the spacetime $\mathcal{B}$ has been the
object of the papers [3, 18]. We briefly recall here the main results of these papers.
In particular, we use the form of the Dirac equation obtained therein.
First, the evolution equation satisfied by massive charged Dirac fields in $\mathcal{B}$ can
be written in the Hamiltonian form
$$ i\partial_t \psi = H\psi, \tag{2.12} $$
where $\psi$ is a 4-component spinor belonging to the Hilbert space
$$ \mathcal{H} = L^2(\mathbb{R} \times S^2; \mathbb{C}^4), $$
and the Hamiltonian $H$ is given by
$$ H = \Gamma^1 D_x + a(x) D_{S^2} + b(x)\Gamma^0 + c(x). \tag{2.13} $$
Here we use the following notations. The symbol $D_x$ stands for $-i\partial_x$ whereas $D_{S^2}$
denotes the Dirac operator on $S^2$ which, in spherical coordinates, takes the form
$$ D_{S^2} = -i\Gamma^2 \left( \partial_\theta + \frac{\cot\theta}{2} \right) - \frac{i}{\sin\theta}\, \Gamma^3 \partial_\varphi. \tag{2.14} $$
The potentials $a$, $b$, $c$ are scalar smooth functions given in terms of the metric (2.1)
by
$$ a(x) = \frac{\sqrt{F(r)}}{r}, \qquad b(x) = m\sqrt{F(r)}, \qquad c(x) = \frac{qQ}{r}, \tag{2.15} $$
where $m$ and $q$ denote the mass and the electric charge of the fields respectively.
Finally, the matrices $\Gamma^1, \Gamma^2, \Gamma^3, \Gamma^0$ appearing in (2.13) and (2.14) are usual $4 \times 4$
Dirac matrices that satisfy the anticommutation relations
$$ \Gamma^i \Gamma^j + \Gamma^j \Gamma^i = 2\delta_{ij}\, \mathrm{Id}, \quad i, j = 0, \ldots, 3. \tag{2.16} $$
Second, we use the spherical symmetry of the equation to further simplify the
expression of the Hamiltonian $H$. Since the Dirac operator $D_{S^2}$ has compact
resolvent, it can be diagonalized into an infinite sum of matrix-valued multiplication
operators. The eigenfunctions associated to $D_{S^2}$ are a generalization of the
usual spherical harmonics called spin-weighted spherical harmonics. We refer to
Gel'fand and Šapiro [10] for a detailed presentation of these generalized spherical
harmonics and to [3, 18] for an application to our model. There thus exists
a family of eigenfunctions $F^l_n$ of $D_{S^2}$, with the indices $(l, n)$ running in the set
$\mathcal{I} = \{(l, n) : l - \tfrac{1}{2} \in \mathbb{N},\ l - |n| \in \mathbb{N}\}$, which forms a Hilbert basis of $L^2(S^2; \mathbb{C}^4)$
with the following property. The Hilbert space $\mathcal{H}$ can be decomposed into the
infinite direct sum
$$ \mathcal{H} = \bigoplus_{(l,n) \in \mathcal{I}} \left[ L^2(\mathbb{R}_x; \mathbb{C}^4) \otimes F^l_n \right] := \bigoplus_{(l,n) \in \mathcal{I}} \mathcal{H}_{ln}, $$
where $\mathcal{H}_{ln} = L^2(\mathbb{R}_x; \mathbb{C}^4) \otimes F^l_n$ is identified with $L^2(\mathbb{R}; \mathbb{C}^4)$ and, more importantly, we
obtain the orthogonal decomposition for the Hamiltonian $H$
$$ H = \bigoplus_{(l,n) \in \mathcal{I}} H^{ln}, $$
with
$$ H^{ln} := H|_{\mathcal{H}_{ln}} = \Gamma^1 D_x + a_l(x)\Gamma^2 + b(x)\Gamma^0 + c(x), \tag{2.17} $$
and $a_l(x) = a(x)(l + \tfrac{1}{2})$. Note that the Dirac operator $D_{S^2}$ has been replaced
in the expression of $H^{ln}$ by $(l + \tfrac{1}{2})\Gamma^2$ thanks to the good properties of the spin-weighted
spherical harmonics $F^l_n$. The operator $H^{ln}$ is a selfadjoint operator on $\mathcal{H}_{ln}$
with domain $D(H^{ln}) = H^1(\mathbb{R}; \mathbb{C}^4)$. Finally we use the following representation for
the Dirac matrices $\Gamma^1$, $\Gamma^2$ and $\Gamma^0$ appearing in (2.17)

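$$ \Gamma^1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad \Gamma^2 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad \Gamma^0 = \begin{pmatrix} 0 & 0 & i & 0 \\ 0 & 0 & 0 & i \\ -i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix}. \tag{2.18} $$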
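The only properties of the Dirac matrices used in the sequel are the anticommutation relations (2.16) and the diagonal form of $\Gamma^1$. The following Python sketch checks (2.16) for $\Gamma^0$, $\Gamma^1$, $\Gamma^2$ by direct multiplication. The signs of the entries below are an assumption: they are one choice that is compatible with the diagonal, antidiagonal and off-diagonal-block layout of the three matrices and with (2.16); other sign conventions satisfying (2.16) exist. The helper names are hypothetical.

```python
# One sign convention (assumption) for the 4x4 Dirac matrices.
G1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G2 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
G0 = [[0, 0, 1j, 0], [0, 0, 0, 1j], [-1j, 0, 0, 0], [0, -1j, 0, 0]]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def anticommutator(A, B):
    """Return AB + BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] + BA[i][j] for j in range(4)] for i in range(4)]

def defect(A, B, expected):
    """Largest entry of |{A, B} - expected * Id|."""
    S = anticommutator(A, B)
    return max(abs(S[i][j] - (expected if i == j else 0))
               for i in range(4) for j in range(4))

gammas = {0: G0, 1: G1, 2: G2}
# (2.16): {Gamma^i, Gamma^j} = 2 delta_ij Id.
defects = {(i, j): defect(gammas[i], gammas[j], 2 if i == j else 0)
           for i in gammas for j in gammas}
```

All nine entries of `defects` vanish for this choice, so each matrix squares to the identity and distinct matrices anticommute.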
In this paper, it will often be enough to restrict our analysis to a fixed harmonic.
To simplify notations we shall thus simply write $\mathcal{H}$, $H$ and $a(x)$ instead of $\mathcal{H}_{ln}$, $H^{ln}$
and $a_l(x)$ respectively and we shall indicate in the course of the text whether we
work on the global problem or on a fixed harmonic.
Let us now summarize the direct scattering results obtained in [3, 18]. It is
well known that the main information of interest in scattering theory concerns the
nature of the spectrum of the Hamiltonian $H$. The first result goes in this direction.
Using essentially a Mourre theory (see [19]), it was shown in [3, 18] that, for all
$\Lambda \ge 0$,
$$ \sigma_{pp}(H) = \emptyset, \qquad \sigma_{sing}(H) = \emptyset. $$
In other words, the spectrum of $H$ is purely absolutely continuous. Consequently,
massive charged Dirac fields scatter toward the two asymptotic regions at late
times and they are expected to obey simpler equations there. This is one of the
main pieces of information encoded in the notion of wave operators that we introduce now.

We first treat the case $\Lambda = 0$ corresponding to RN black holes. From (2.2) and
(2.9), the potentials $a$, $b$, $c$ have very different asymptotics as $x \to \pm\infty$ (according
to our discussion above, this reflects the fact that the geometries near the two
asymptotic regions are very different). At the event horizon, there exists $\kappa > 0$
such that
$$ |a(x)|,\ |b(x)|,\ |c(x) - c_0| = O(e^{\kappa x}), \quad x \to -\infty, \tag{2.19} $$
where the constant $c_0$ is given by (see (2.15))
$$ c_0 = \frac{qQ}{r_0}. $$
Hence we can write the Hamiltonian $H$ as
$$ H = H_0 + V_0, \qquad H_0 = \Gamma^1 D_x + c_0, \qquad V_0(x) = a(x)\Gamma^2 + b(x)\Gamma^0 + (c(x) - c_0), $$
where the potential $V_0$ is then short-range when $x \to -\infty$. Consequently, we can
choose the asymptotic dynamic generated by the Hamiltonian $H_0 = \Gamma^1 D_x + c_0$ as
the comparison dynamic in this region. The Hamiltonian $H_0$ is a selfadjoint operator
on $\mathcal{H}$ with its spectrum covering the full real line, i.e. $\sigma(H_0) = \mathbb{R}$. Note finally that,
due to the simple diagonal form of the matrix $\Gamma^1$, the comparison dynamic $e^{-itH_0}$
is essentially a system of transport equations along the curves $x \pm t$, that is the null
radial geodesics of the black hole.
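The exponential decay of the potential $a$ at the event horizon can be observed numerically without inverting the Regge-Wheeler coordinate, by parametrising both $a$ and $x$ by $r = r_0 + \delta$. The sketch below is an illustration under stated assumptions: $\Lambda = 0$, $C = 0$, $l$-independent $a(x) = \sqrt{F(r)}/r$, sample values of $M$ and $Q$, and the decay rate taken equal to the surface gravity $\kappa_0$ (one admissible value of the constant $\kappa$ in this model case). The name `rescaled_a` is hypothetical.

```python
import math

M, Q = 1.0, 0.6                       # illustrative values with M > |Q|
disc = math.sqrt(M**2 - Q**2)
r_minus, r_0 = M - disc, M + disc
kappa_0 = (r_0 - r_minus) / (2.0 * r_0**2)   # surface gravity (1/2) F'(r_0)

def rescaled_a(delta):
    """Return a(x) * exp(-kappa_0 * x) evaluated at r = r_0 + delta."""
    r = r_0 + delta
    # Factorised form of F for Lambda = 0; avoids cancellation near r_0.
    F = (r - r_minus) * delta / r**2
    a = math.sqrt(F) / r
    # Regge-Wheeler coordinate for Lambda = 0 with C = 0.
    x = (r
         + math.log(delta) / (2.0 * kappa_0)
         + r_minus**2 / (r_minus - r_0) * math.log(r - r_minus))
    return a * math.exp(-kappa_0 * x)

# The rescaled potential should stabilise as delta -> 0, i.e. as x -> -infinity.
values = [rescaled_a(10.0 ** (-k)) for k in (4, 6, 8)]
```

The entries of `values` approach a positive constant, which is the numerical counterpart of $|a(x)| = O(e^{\kappa_0 x})$ as $x \to -\infty$.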
In contrast, at infinity, the potentials $a$, $b$, $c$ have the asymptotics
$$ |a(x)|,\ |b(x) - m|,\ |c(x)| = O\!\left( \frac{1}{x} \right), \quad x \to +\infty. \tag{2.20} $$
Hence we can write the Hamiltonian $H$ as
$$ H = H_0^m + V_0^m, \qquad H_0^m = \Gamma^1 D_x + m\Gamma^0, \qquad V_0^m(x) = a(x)\Gamma^2 + (b(x) - m)\Gamma^0 + c(x), $$
where the potential $V_0^m$ is now a long-range potential having Coulomb decay when
$x \to +\infty$. The asymptotic dynamic is generated by the Hamiltonian $H_0^m = \Gamma^1 D_x + m\Gamma^0$,
a classical one-dimensional Dirac Hamiltonian in Minkowski spacetime. The
Hamiltonian $H_0^m$ is a selfadjoint operator on $\mathcal{H}$ and its spectrum has a gap, i.e.
$\sigma(H_0^m) = (-\infty, -m] \cup [+m, +\infty)$. However, contrary to the preceding case, the
asymptotic dynamic $e^{-itH_0^m}$ cannot be used alone as a comparison dynamic because
of the long-range potential $V_0^m$, but must be (Dollard-)modified.
In order to define this modification and for later use, we need to introduce the
classical velocity operators
$$ V_0 = \Gamma^1, \qquad V_m = D_x (H_0^m)^{-1}, $$
associated to the Hamiltonians $H_0$ and $H_0^m$, respectively. The classical velocity
operators are selfadjoint operators on $\mathcal{H}$ and their spectra are simply $\sigma(\Gamma^1) = \{-1, +1\}$
and $\sigma(V_m) = [-1, +1]$. Let us also denote by $P_\pm$ and $P^m_\pm$ the projections
onto the positive and negative spectrum of $\Gamma^1$ and $V_m$, i.e.
$$ P_\pm = 1_{\mathbb{R}^\pm}(\Gamma^1), \qquad P^m_\pm = 1_{\mathbb{R}^\pm}(V_m). $$

As shown in [3], the main interest of these projections is that they make it easy to separate
the part of the fields that propagates toward the event horizon from the part
of the fields that propagates toward infinity. They will be used in the definition of
the wave operators below. Moreover, the classical velocity operator $V_m$ enters in
the expression of the Dollard-modified comparison dynamic at infinity proposed in
[3] and given by
$$ U(t) = e^{-itH_0^m}\, e^{-i\int_0^t \left[ (b(sV_m) - m)\, m (H_0^m)^{-1} + c(sV_m) \right] ds}. \tag{2.21} $$
Let us make two comments here. First, the potential $a(x)\Gamma^2$ turns out to be a false
long-range term. This is clear from (2.21), where the asymptotic dynamic $e^{-itH_0^m}$
has been modified by an extra phase which only involves the long-range potentials
$b$ and $c$. We refer to [3] for an explanation of this particular point. Second, we shall
propose in Sec. 3 a new time-independent modification of the comparison dynamic
$e^{-itH_0^m}$ which will be a direct byproduct of our construction of modifiers in the
spirit of Isozaki-Kitada's work [16]. This new modification will be shown to be
equivalent to the Dollard modification (2.21) in Theorem 3.3.
We are now in a position to introduce the wave operators associated to $H$. At the
event horizon, we define
$$ W^\pm_{(-\infty)} = \text{s-}\lim_{t \to \pm\infty} e^{itH} e^{-itH_0} P_\mp, \tag{2.22} $$
whereas at infinity, we define
$$ W^\pm_{(+\infty)} = \text{s-}\lim_{t \to \pm\infty} e^{itH} U(t) P^m_\pm. \tag{2.23} $$
Finally, the global wave operators are given by
$$ W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}. \tag{2.24} $$
Note here our use of the projections $P_\mp$ and $P^m_\pm$ to separate the part of the field
propagating toward the event horizon from the part of the field propagating toward
infinity. In fact, without these projections, the wave operators (2.22) and (2.23)
would not exist at all. More precisely, the main result of [3] is

Theorem 2.1. The wave operators $W^\pm_{(-\infty)}$, $W^\pm_{(+\infty)}$ and $W^\pm$ exist on $\mathcal{H}$. Moreover,
the global wave operators $W^\pm$ are partial isometries with initial spaces
$\mathcal{H}^\pm_{scat} = P_\mp(\mathcal{H}) + P^m_\pm(\mathcal{H})$ and final space $\mathcal{H}$. In particular, $W^\pm$ are asymptotically complete,
i.e. $\mathrm{Ran}\, W^\pm = \mathcal{H}$.

As a direct consequence of Theorem 2.1, we can define the scattering operator $S$
by the usual formula
$$ S = (W^+)^* W^-. \tag{2.25} $$
It is clear that $S$ is a well-defined operator on $\mathcal{H}$ and a partial isometry from $\mathcal{H}^-_{scat}$
into $\mathcal{H}^+_{scat}$.

We now treat the case $\Lambda > 0$ corresponding to dS-RN black holes, which turns
out to be slightly more symmetric at the two (event and cosmological) horizons.

According to (2.2), (2.7) and (2.8), the potentials $a$, $b$, $c$ have the following asymptotics
as $x \to \pm\infty$. There exists $\kappa > 0$ such that
$$ |a(x)|,\ |b(x)| = O(e^{-\kappa|x|}), \quad |x| \to \infty, \tag{2.26} $$
and
$$ |c(x) - c_0| = O(e^{\kappa x}), \quad x \to -\infty, \tag{2.27} $$
$$ |c(x) - c_+| = O(e^{-\kappa x}), \quad x \to +\infty, \tag{2.28} $$
where the constants $c_0$ and $c_+$ are given by (see (2.15))
$$ c_0 = \frac{qQ}{r_0}, \qquad c_+ = \frac{qQ}{r_+}. \tag{2.29} $$
Hence, the potentials $a$, $b$ are short-range when $x \to \pm\infty$, and $c - c_0$ and $c - c_+$
are short-range when $x \to -\infty$ and $x \to +\infty$, respectively. At the event horizon,
we choose as before the asymptotic dynamic generated by the Hamiltonian
$H_0 = \Gamma^1 D_x + c_0$ as the comparison dynamic while, at the cosmological horizon, we choose
the asymptotic dynamic generated by the Hamiltonian $H_+ = \Gamma^1 D_x + c_+$ as the
comparison dynamic. The Hamiltonians $H_0$ and $H_+$ are clearly selfadjoint operators
on $\mathcal{H}$ and their spectra are exactly the real line, i.e. $\sigma(H_0) = \sigma(H_+) = \mathbb{R}$. We
observe finally that the dynamics $e^{-itH_0}$ and $e^{-itH_+}$ are essentially systems of
transport equations along the null radial geodesics of the black hole, but they differ
by the distinct oscillations $e^{-itc_0}$ and $e^{-itc_+}$.
We need the classical velocity operators associated to $H_0$ and $H_+$ in order to
separate the part of the fields that propagates toward the event horizon from the part
of the fields that propagates toward the cosmological horizon. It turns out that they
are equal to $V_0 = \Gamma^1$ in both cases and the associated projections onto the positive
and negative spectrum are still $P_\pm$. Thus we can introduce the wave operators as
before. At the event horizon, we define
$$ W^\pm_{(-\infty)} = \text{s-}\lim_{t \to \pm\infty} e^{itH} e^{-itH_0} P_\mp, \tag{2.30} $$
and at the cosmological horizon, we define
$$ W^\pm_{(+\infty)} = \text{s-}\lim_{t \to \pm\infty} e^{itH} e^{-itH_+} P_\pm. \tag{2.31} $$
Finally, the global wave operators are given by
$$ W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}. \tag{2.32} $$
The main result of [18] is

Theorem 2.2. The wave operators $W^\pm_{(-\infty)}$, $W^\pm_{(+\infty)}$ and $W^\pm$ exist on $\mathcal{H}$. Moreover,
the global wave operators $W^\pm$ are isometries on $\mathcal{H}$. In particular, $W^\pm$ are asymptotically
complete, i.e. $\mathrm{Ran}\, W^\pm = \mathcal{H}$.

Thanks to Theorem 2.2, we can define the scattering operator $S$ as in (2.25) by
$S = (W^+)^* W^-$, which is a well-defined isometry on $\mathcal{H}$.

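We deduce from the previous discussion that, for all $\Lambda \ge 0$, the scattering
operator $S$ is a well-defined operator on $\mathcal{H}$. For all $\varphi, \psi \in \mathcal{H}$, we shall consider
in the following the expectation values of $S$, given by $\langle S\varphi, \psi \rangle$, as the known data
of our inverse problem. Moreover, using (2.24) and (2.32), we observe that these
expectation values can be decomposed into 4 natural components
$$ \langle S\varphi, \psi \rangle = \langle W^-\varphi, W^+\psi \rangle = \langle T_R\varphi, \psi \rangle + \langle T_L\varphi, \psi \rangle + \langle L\varphi, \psi \rangle + \langle R\varphi, \psi \rangle, $$
where
$$ \langle T_R\varphi, \psi \rangle = \langle W^-_{(+\infty)}\varphi, W^+_{(-\infty)}\psi \rangle, \qquad \langle T_L\varphi, \psi \rangle = \langle W^-_{(-\infty)}\varphi, W^+_{(+\infty)}\psi \rangle, \tag{2.33} $$
$$ \langle L\varphi, \psi \rangle = \langle W^-_{(-\infty)}\varphi, W^+_{(-\infty)}\psi \rangle, \qquad \langle R\varphi, \psi \rangle = \langle W^-_{(+\infty)}\varphi, W^+_{(+\infty)}\psi \rangle. \tag{2.34} $$
It follows from our definitions of the wave operators (2.22), (2.30) and (2.23), (2.31)
that the previous quantities can be interpreted in terms of transmission and reflection
between the different asymptotic regions, i.e. $\{x = -\infty\}$ for the event horizon
of the black hole and $\{x = +\infty\}$ for either spacelike infinity if $\Lambda = 0$, or the cosmological
horizon if $\Lambda > 0$. For instance, $\langle T_R\varphi, \psi \rangle$ corresponds to the part of a signal
transmitted from $\{x = +\infty\}$ to $\{x = -\infty\}$ in a scattering process whereas the
term $\langle T_L\varphi, \psi \rangle$ corresponds to the part of a signal transmitted from $\{x = -\infty\}$ to
$\{x = +\infty\}$. Hence $T_R$ stands for "transmitted from the right" and $T_L$ for "transmitted
from the left". Conversely, $\langle L\varphi, \psi \rangle$ corresponds to the part of a signal reflected
from $\{x = -\infty\}$ to $\{x = -\infty\}$ in a scattering process whereas the term $\langle R\varphi, \psi \rangle$
corresponds to the part of a signal reflected from $\{x = +\infty\}$ to $\{x = +\infty\}$.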

3. The Inverse Problem when $\Lambda = 0$


In this section, we study the inverse problem at high energy in the case $\Lambda = 0$ that
corresponds to RN black holes. Let us recall here that all the results and formulae
given hereafter are always obtained on a fixed spin-weighted spherical harmonic.
Therefore the notations $\mathcal{H}$, $H$, $a(x)$ are a shorthand for $\mathcal{H}_{ln}$, $H^{ln}$, $a_l(x)$ defined in
the preceding section. In order to state our main result, we make two assumptions.

Assumption 1. We assume that our observers may measure the high energies of
the transmitted operators $T_R$ or $T_L$. Precisely, we assume that one of the following
functions of $\lambda \in \mathbb{R}$
$$ F_l(\lambda) = \langle T_R e^{i\lambda x}\varphi, e^{i\lambda x}\psi \rangle, \qquad G_l(\lambda) = \langle T_L e^{i\lambda x}\varphi, e^{i\lambda x}\psi \rangle, $$
is known for all large values of $\lambda$, for all $l \in \mathbb{N}$ where $l$ indexes the spin-weighted
spherical harmonics, and for all $\varphi, \psi \in \mathcal{H}$ with $\hat\varphi, \hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$.

Assumption 2. We also assume that the mass $m$ and the charge $q$ of the Dirac fields
considered in these inverse scattering experiments are known and fixed. Moreover
we assume that $q \neq 0$ since the case $q = 0$ is similar to the one treated in [4].

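The main result of this section is now summarized in the following theorem.

Theorem 3.1. Under Assumptions 1 and 2, the parameters $M$ and $Q$ of the RN
black hole are uniquely determined.

Following our previous paper [4], the proof of Theorem 3.1 will be based on a
high-energy asymptotic expansion of the functions $F_l(\lambda)$ and $G_l(\lambda)$ when $\lambda \to +\infty$.
Precisely we shall prove the following formulae:

Theorem 3.2 (Reconstruction Formulae). Let $\hat\varphi, \hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Then for
$\lambda$ large, we obtain
$$ F_l(\lambda) = \langle \Theta(x) P_-\varphi, P_-\psi \rangle + \frac{i}{2\lambda} \langle A(x) P_-\varphi, P_-\psi \rangle + O(\lambda^{-2}), \tag{3.1} $$
$$ G_l(\lambda) = \langle \Theta(x) P_+\varphi, P_+\psi \rangle - \frac{i}{2\lambda} \langle A(x) P_+\varphi, P_+\psi \rangle + O(\lambda^{-2}), \tag{3.2} $$
where $\Theta(x)$ and $A(x)$ are multiplication operators given by
$$ \Theta(x) = e^{\,i\int_{-\infty}^{0} [c(s) - c_0]\, ds + i c_0 x}, \tag{3.3} $$
$$ A(x) = \Theta(x) \left( \int_{-\infty}^{+\infty} a_l^2(s)\, ds + \int_{-\infty}^{0} b^2(s)\, ds + \int_{0}^{+\infty} (b(s) - m)^2\, ds + m^2 x \right). \tag{3.4} $$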

Remark 3.1. In Theorem 3.2, we have emphasized the dependence of the functions
$F_l(\lambda)$ and $G_l(\lambda)$ on the parameter $l$ since the reconstruction formulae (3.1) and
(3.2) can be derived if we work on a fixed spin-weighted spherical harmonic only.
Nevertheless, as indicated in Assumption 1, we shall need to know these formulae
on all spin-weighted spherical harmonics, hence for all $l \in \mathbb{N}$, in order to prove the
uniqueness result stated in Theorem 3.1.

Remark 3.2. In the reconstruction formulae of Theorem 3.2, the physical contributions
are the phase $e^{ic_0 x}$ appearing in (3.3) and the function $\int_{-\infty}^{+\infty} a_l^2(s)\, ds + m^2 x$
appearing in (3.4). The presence of these terms clearly shows that the charge $q$
(through $c_0$) and the mass $m$ of Dirac fields contribute to the high energy asymptotics
of the transmitted operators. On the other hand, the constant terms $\int_{-\infty}^{0} [c(s) - c_0]\, ds$
in (3.3) and $\int_{-\infty}^{0} b^2(s)\, ds + \int_{0}^{+\infty} (b(s) - m)^2\, ds$ in (3.4) may appear unnatural at first
sight since they depend explicitly on the particular value 0 of the Regge-Wheeler
variable $x$. They are in fact due to our particular choice of Dollard modification in
the definition of the modified wave operators $W^\pm_{(+\infty)}$. Recall indeed that there
is no canonical choice for the (necessary) modifications entailed by the presence of
long-range potentials at infinity. This point can be easily seen, for instance, from the
Isozaki-Kitada modifications constructed in the next subsection, whose phases
are defined only up to a constant of integration (see (3.26) and Remark 3.4 after
it). The above constant terms can thus be understood as constants of integration
depending on our particular choice of modification. We emphasize finally that these
constants of integration do not play any role in our proof of the uniqueness of the
parameters.

Remark 3.3. In this paper, we use the high-energy asymptotics of the quantum
wave operators for Dirac fields in order to reconstruct the mass and the charge of
the black hole. Another interesting question would be to study the same inverse
problem, but from the semiclassical dynamics, or even from the classical ones.
To the authors' knowledge, these problems are still open. However, for semiclassical
Schrödinger operators with energies localized in an arbitrarily small interval, an
inverse scattering problem was studied in [24], for regular potentials at infinity; for
the Newton equations at high energies, this problem was treated by Novikov in [25].
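We now explain our strategy to prove Theorem 3.2. Using (2.22), (2.23), (2.33)
and the fact that $e^{i\lambda x}$ corresponds to a translation by $\lambda$ in momentum space, we
first rewrite $F_l(\lambda)$ and $G_l(\lambda)$ as follows
$$ F_l(\lambda) = \langle W^-_{(+\infty)}(\lambda)\varphi, W^+_{(-\infty)}(\lambda)\psi \rangle, \tag{3.5} $$
$$ G_l(\lambda) = \langle W^-_{(-\infty)}(\lambda)\varphi, W^+_{(+\infty)}(\lambda)\psi \rangle, \tag{3.6} $$
with
$$ W^\pm_{(-\infty)}(\lambda) = e^{-i\lambda x} W^\pm_{(-\infty)} e^{i\lambda x} = \text{s-}\lim_{t \to \pm\infty} e^{itH(\lambda)} e^{-itH_0(\lambda)} P_\mp, $$
$$ W^\pm_{(+\infty)}(\lambda) = e^{-i\lambda x} W^\pm_{(+\infty)} e^{i\lambda x} = \text{s-}\lim_{t \to \pm\infty} e^{itH(\lambda)} e^{-iX(t,\lambda)} e^{-itH_0^m(\lambda)} P^{m,\lambda}_\pm, $$
where we use the notations
$$ H(\lambda) = \Gamma^1 (D_x + \lambda) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x), \qquad H_0(\lambda) = \Gamma^1 (D_x + \lambda) + c_0, $$
$$ H_0^m(\lambda) = \Gamma^1 (D_x + \lambda) + m\Gamma^0, \qquad V_m(\lambda) = (D_x + \lambda) \left( H_0^m(\lambda) \right)^{-1}, $$
$$ P^{m,\lambda}_\pm = 1_{\mathbb{R}^\pm}(V_m(\lambda)), $$
$$ X(t, \lambda) = \int_0^t \left[ (b(sV_m(\lambda)) - m)\, m (H_0^m(\lambda))^{-1} + c(sV_m(\lambda)) \right] ds. $$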

In order to obtain an asymptotic expansion of the functions $F_l(\lambda)$ and $G_l(\lambda)$,
it is thus enough to obtain an asymptotic expansion of the $\lambda$-shifted wave operators
$W^\pm_{(\pm\infty)}(\lambda)$. To do this, we follow the procedure presented in [21, 22], a procedure
inspired by the well-known Isozaki-Kitada method [16] developed in the setting
of long-range stationary scattering theory. It consists simply in replacing the wave
operators $W^\pm_{(\pm\infty)}(\lambda)$ by well-chosen energy modifiers $J^\pm_{(\pm\infty)}(\lambda)$, defined as Fourier
Integral Operators (FIO) with explicit phases and amplitudes. Well-chosen here
means in practice that we look for $J^\pm_{(\pm\infty)}(\lambda)$ satisfying, for $\lambda$ large enough,
$$ W^\pm_{(-\infty)}(\lambda) = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(-\infty)}(\lambda) e^{-itH_0(\lambda)} P_\mp, \tag{3.7} $$
$$ W^\pm_{(+\infty)}(\lambda) = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(+\infty)}(\lambda) e^{-itH_0^m(\lambda)} P^{m,\lambda}_\pm, \tag{3.8} $$
and
$$ \left\| \left( W^\pm_{(\pm\infty)}(\lambda) - J^\pm_{(\pm\infty)}(\lambda) \right) \psi \right\| = O(\lambda^{-2}), \tag{3.9} $$
for any fixed $\psi \in \mathcal{H}$ such that $\hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Note that the decay $O(\lambda^{-2})$ in
(3.9) could be improved to any inverse power decay but turns out to be enough for
our purpose here. In particular, if we manage to construct such $J^\pm_{(\pm\infty)}(\lambda)$ satisfying
(3.9), then we obtain by (3.5) and (3.6)
$$ F_l(\lambda) = \langle J^-_{(+\infty)}(\lambda)\varphi, J^+_{(-\infty)}(\lambda)\psi \rangle + O(\lambda^{-2}), $$
$$ G_l(\lambda) = \langle J^-_{(-\infty)}(\lambda)\varphi, J^+_{(+\infty)}(\lambda)\psi \rangle + O(\lambda^{-2}), \tag{3.10} $$
from which we can easily calculate the first terms of the asymptotics.
Let us give here a simple but useful result which allows us to simplify slightly
the expressions of (3.7) and (3.8).

Lemma 3.1. For all $\xi \in \mathbb{R}^*$, set
$$ \nu^\pm(\xi) = \pm\, \mathrm{sgn}(\xi) \sqrt{\xi^2 + m^2}. \tag{3.11} $$
Then, for all $\psi$ with $\mathrm{supp}\, \hat\psi \subset \mathbb{R}^*$,
$$ e^{-itH_0^m} P^m_\pm \psi = e^{-it\nu^\pm(D_x)} P^m_\pm \psi. \tag{3.12} $$
Moreover,
$$ e^{-itH_0} P_\pm = e^{\mp itD_x - itc_0} P_\pm. \tag{3.13} $$

Proof. The Fourier representation of the operator $H_0^m$ is $\Gamma^1\xi + m\Gamma^0$ and has
precisely one positive eigenvalue $\sqrt{\xi^2 + m^2}$ and one negative eigenvalue $-\sqrt{\xi^2 + m^2}$.
Similarly, the Fourier representation of the classical velocity operator $V_m$ is
$\frac{\xi}{\xi^2 + m^2}(\Gamma^1\xi + m\Gamma^0)$. Hence, for $\xi > 0$, $P^m_+$ is the projection onto the positive spectrum
of $\Gamma^1\xi + m\Gamma^0$ and $P^m_-$ is the projection onto the negative spectrum of $\Gamma^1\xi + m\Gamma^0$.
For $\xi < 0$, it is the opposite. This immediately implies (3.12). Finally, the equality
(3.13) is a direct consequence of the definitions of $H_0$ and $P_\pm$.
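The scalarization in Lemma 3.1 can be checked at the level of Fourier symbols. The sketch below builds $\Gamma^1\xi + m\Gamma^0$, the spectral projections of the symbol of $V_m$ written as $\frac{1}{2}\big(I_4 + \nu^\pm(\xi)^{-1}(\Gamma^1\xi + m\Gamma^0)\big)$, and measures the defect in $(\Gamma^1\xi + m\Gamma^0)P^m_\pm(\xi) = \nu^\pm(\xi)P^m_\pm(\xi)$. The sign convention for the Dirac matrices and all helper names are assumptions made for illustration; only the anticommutation relations (2.16) are used.

```python
import math

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Assumed sign choice for the Dirac matrices; only the relations (2.16) matter.
G1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G0 = [[0, 0, 1j, 0], [0, 0, 0, 1j], [-1j, 0, 0, 0], [0, -1j, 0, 0]]

def comb(a, A, b, B):
    """Entrywise linear combination a*A + b*B of two 4x4 matrices."""
    return [[a * A[i][j] + b * B[i][j] for j in range(4)] for i in range(4)]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def dist(A, B):
    """Largest entrywise difference between A and B."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

def nu(xi, m, sign):
    """nu^{+/-}(xi) from (3.11); sign is +1 or -1, xi must be nonzero."""
    return sign * math.copysign(1.0, xi) * math.sqrt(xi * xi + m * m)

def symbol(xi, m):
    """Fourier representation Gamma^1 xi + m Gamma^0 of H_0^m."""
    return comb(xi, G1, m, G0)

def projection(xi, m, sign):
    """P^m_{+/-}(xi) = (1/2)(I + nu^{+/-}(xi)^{-1} (Gamma^1 xi + m Gamma^0))."""
    return comb(0.5, I4, 0.5 / nu(xi, m, sign), symbol(xi, m))

def lemma_defect(xi, m, sign):
    """Size of (Gamma^1 xi + m Gamma^0) P - nu P, which should vanish."""
    P = projection(xi, m, sign)
    return dist(matmul(symbol(xi, m), P), comb(nu(xi, m, sign), P, 0.0, P))
```

For any nonzero `xi`, `lemma_defect(xi, m, +1)` and `lemma_defect(xi, m, -1)` vanish up to rounding, which is the symbol-level content of (3.12).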

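According to Lemma 3.1, the projections $P_\pm$ and $P^m_\pm$ allow us to scalarize the
Hamiltonians $H_0$ and $H_0^m$ in the expressions (3.7) and (3.8) of $W^\pm_{(\pm\infty)}(\lambda)$. Precisely,
these expressions now read
$$ W^\pm_{(-\infty)}(\lambda) = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(-\infty)}(\lambda) e^{\pm it(D_x + \lambda) - itc_0} P_\mp, \tag{3.14} $$
$$ W^\pm_{(+\infty)}(\lambda) = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(+\infty)}(\lambda) e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm. \tag{3.15} $$
This minor simplification will be important in the forthcoming construction of the
modifiers $J^\pm_{(\pm\infty)}(\lambda)$.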

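Before entering into the details, let us give a hint on how to construct the
modifiers $J^\pm_{(\pm\infty)}(\lambda)$, a priori defined as FIOs with scalar phases $\phi^\pm_{(\pm\infty)}(x, \xi, \lambda)$
and matrix-valued amplitudes $p^\pm_{(\pm\infty)}(x, \xi, \lambda)$, i.e. defined for all $\psi \in \mathcal{H}$ by
$$ J^\pm_{(\pm\infty)}(\lambda)\psi(x) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{i\phi^\pm_{(\pm\infty)}(x,\xi,\lambda)}\, p^\pm_{(\pm\infty)}(x, \xi, \lambda)\, \hat\psi(\xi)\, d\xi. $$
If we assume for instance that (3.15) is true, then we easily get
$$ \left( W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda) \right) \psi = i \int_0^{\pm\infty} e^{itH(\lambda)} C^\pm_{(+\infty)}(\lambda) e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm \psi\, dt, \tag{3.16} $$
where
$$ C^\pm_{(+\infty)}(\lambda) := H(\lambda) J^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda)\, \nu^\pm(D_x + \lambda), \tag{3.17} $$
are also FIOs with phases $\phi^\pm_{(+\infty)}(x, \xi, \lambda)$ and amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$. From (3.16),
we get the simple estimate
$$ \left\| \left( W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda) \right) \psi \right\| \le \left| \int_0^{\pm\infty} \left\| C^\pm_{(+\infty)}(\lambda) e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm \psi \right\| dt \right|. \tag{3.18} $$
In order for (3.9) to be true, it is then clear from (3.18) that the FIOs $C^\pm_{(+\infty)}(\lambda)$
have to be small in some sense. Precisely, we shall need the amplitudes
$c^\pm_{(+\infty)}(x, \xi, \lambda)$ to be short-range in the variable $x$ at infinity (i.e. when $x \to +\infty$) and
of order $O(\lambda^{-2})$ when $\lambda \to +\infty$. Note here the role played by the projections $P^{m,\lambda}_\pm$,
which allow us to consider the part of the Dirac fields that propagates toward infinity.
This explains why the amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$ must be short-range in the variable $x$
only at infinity. Similarly, for the construction of the modifiers $J^\pm_{(-\infty)}(\lambda)$, we shall
require that the amplitudes $c^\pm_{(-\infty)}(x, \xi, \lambda)$ of the corresponding operators $C^\pm_{(-\infty)}(\lambda)$
be short-range in the variable $x$ only at the event horizon (i.e. when $x \to -\infty$) and
of order $O(\lambda^{-2})$ when $\lambda \to +\infty$.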


3.1. Asymptotics of $W^\pm_{(+\infty)}(\lambda)$

In this subsection, we construct the modifiers $J^\pm_{(+\infty)}(\lambda)$ and give the asymptotics
of $W^\pm_{(+\infty)}(\lambda)$ when $\lambda \to +\infty$. For simplicity, we shall omit the lower index $(+\infty)$
in all the objects defined hereafter.
We first look at the problem at fixed energy (i.e. we take $\lambda = 0$ in the previous
formulae). Hence we aim to construct modifiers $J^\pm$ with scalar phases $\phi^\pm(x, \xi)$
and matrix-valued amplitudes $p^\pm(x, \xi)$ such that the amplitudes $c^\pm(x, \xi)$ of the
operators $C^\pm = HJ^\pm - J^\pm \nu^\pm(D_x)$ are short-range in $x$ when $x \to +\infty$. We adapt
here to our case the treatment given by Gâtel and Yafaev in [9], where a similar
problem was considered in Minkowski spacetime (see also our recent paper [4]).

The operators $C^\pm$ are clearly FIOs with phases $\phi^\pm(x, \xi)$ and amplitudes
$$ c^\pm(x, \xi) = B^\pm(x, \xi) p^\pm(x, \xi) - i\Gamma^1 \partial_x p^\pm(x, \xi), \tag{3.19} $$
where
$$ B^\pm(x, \xi) = \Gamma^1 \partial_x \phi^\pm(x, \xi) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x) - \nu^\pm(\xi). \tag{3.20} $$
As usual, we look for phases $\phi^\pm$ close to $x\xi$ and amplitudes $p^\pm$ close to 1. So the term
$\partial_x p^\pm$ in (3.19) should be short-range and can be neglected in a first approximation.
With $p^\pm = 1$, we are thus led to solve $B^\pm = 0$. However, a direct calculation then leads
to matrix-valued phases $\phi^\pm$ whereas we look for scalar ones. We follow [9]
and solve in fact $(B^\pm)^2 = 0$. Using crucially the anticommutation properties of the
Dirac matrices (2.16), we get the new equation
$$ (B^\pm)^2 = (\partial_x \phi^\pm)^2 + a^2 + b^2 + (c - \nu^\pm)^2 + 2(c - \nu^\pm)(B^\pm - c + \nu^\pm) = 0. \tag{3.21} $$
If we put $B^\pm = 0$ in (3.21), we obtain the scalar equation
$$ r^\pm(x, \xi) := (\partial_x \phi^\pm)^2 + a^2 + b^2 - (c - \nu^\pm)^2 = 0. \tag{3.22} $$

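We look for an approximate solution of (3.22) of the form $\phi^\pm(x, \xi) = x\xi + \eta^\pm(x, \xi)$
where $\eta^\pm(x, \xi)$ should be a priori relatively small in the variable $x$. Recalling that
$(\nu^\pm)^2 = \xi^2 + m^2$ by (3.11), we must then solve
$$ 2\xi\, \partial_x \eta^\pm + (\partial_x \eta^\pm)^2 + a^2 + (b^2 - m^2) - c^2 + 2c\nu^\pm = 0. \tag{3.23} $$
If we neglect $(\partial_x \eta^\pm)^2$ in (3.23), we finally get
$$ 2\xi\, \partial_x \eta^\pm = -\left( a^2 + (b^2 - m^2) - c^2 + d^\pm \right), \tag{3.24} $$
where we have introduced the notation $d^\pm(x, \xi) = 2c(x)\nu^\pm(\xi)$. Note that by (2.20)
and (3.11), the following estimate holds
$$ \forall \alpha, \beta \in \mathbb{N}, \quad |\partial_x^\alpha \partial_\xi^\beta d^\pm(x, \xi)| \le C_{\alpha\beta} \langle x \rangle^{-1-\alpha} \langle \xi \rangle^{1-\beta}, \quad \forall x \in \mathbb{R}^+,\ \forall \xi \in \mathbb{R}^*. \tag{3.25} $$
Therefore, using (2.20) again and the previous estimate (3.25), we see that $a^2 - c^2$
is short-range when $x \to +\infty$ whereas $b^2 - m^2$ and $d^\pm$ are long-range (of Coulomb
type) when $x \to +\infty$. Hence we can define two solutions of (3.24) for all $\xi \neq 0$ as
follows
$$ \eta^\pm(x, \xi) = \frac{1}{2\xi} \int_x^{+\infty} [a^2(s) - c^2(s)]\, ds - \frac{1}{2\xi} \int_0^x \left[ (b^2(s) - m^2) + d^\pm(s, \xi) \right] ds + \frac{1}{2\xi} \int_0^{+\infty} (b(s) - m)^2\, ds. \tag{3.26} $$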
Remark 3.4. Let us emphasize that we only add the quantity $\frac{1}{2\xi} \int_0^{+\infty} (b(s) - m)^2\, ds$
in (3.26) in order to prove that the Isozaki-Kitada and the Dollard modifications
coincide (see Theorem 3.3). In the general case however, the phases $\eta^\pm(x, \xi)$, solutions
of (3.24), would clearly take the form, for all $\xi \neq 0$,
$$ \eta^\pm(x, \xi) = \frac{1}{2\xi} \int_x^{+\infty} [a^2(s) - c^2(s)]\, ds - \frac{1}{2\xi} \int_0^x (b^2(s) - m^2)\, ds - \frac{\nu^\pm(\xi)}{\xi} \int_0^x c(s)\, ds + C(\xi), \tag{3.27} $$
where $C(\xi)$ is a constant of integration.


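With this choice, we obtain for $\xi \neq 0$ (see (3.22)),
$$ r^\pm(x, \xi) = (\partial_x \eta^\pm)^2 = \frac{1}{4\xi^2} \left( a^2(x) + (b^2(x) - m^2) - c^2(x) + d^\pm(x, \xi) \right)^2. \tag{3.28} $$
Moreover it is easy to see that the remainders $r^\pm$ satisfy the estimates
$$ \forall \alpha, \beta \in \mathbb{N}, \quad |\partial_x^\alpha \partial_\xi^\beta r^\pm(x, \xi)| \le C_{\alpha\beta} \langle x \rangle^{-2-\alpha} \langle \xi \rangle^{-\beta}, \quad \forall x \in \mathbb{R}^+,\ \forall \xi \in \mathbb{R}^*. \tag{3.29} $$
In our derivation of the phases (3.26), it is important to keep in mind that we
did not find an approximate solution of $B^\pm = 0$ but rather of $(B^\pm)^2 = 0$. Therefore
we cannot expect to take $p^\pm = 1$ as a first approximation and we have to work a
bit more. So we look for $p^\pm$ such that $B^\pm p^\pm$ is as small as possible. According to
(3.21) and (3.22), we first note that
$$ (B^\pm)^2 = r^\pm + 2(c - \nu^\pm) B^\pm. \tag{3.30} $$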
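Since the identity (3.30) is purely algebraic, it can be tested with arbitrary real numbers in place of $\partial_x\phi^\pm$, $a$, $b$, $c$ and $\nu^\pm$. The sketch below does this for one sign convention of the Dirac matrices; that convention, the sample values and the helper names are assumptions made for illustration, and only the anticommutation relations (2.16) enter.

```python
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Assumed sign choice for the Dirac matrices; only the relations (2.16) matter.
G1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G2 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
G0 = [[0, 0, 1j, 0], [0, 0, 0, 1j], [-1j, 0, 0, 0], [0, -1j, 0, 0]]

def lin(*terms):
    """Sum of coeff * matrix over (coeff, matrix) pairs, for 4x4 matrices."""
    return [[sum(c * A[i][j] for c, A in terms) for j in range(4)]
            for i in range(4)]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def dist(A, B):
    """Largest entrywise difference between A and B."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

def identity_defect(dphi, a, b, c, nu):
    """Size of B^2 - (r + 2(c - nu) B) with B as in (3.20) and r as in (3.22)."""
    B = lin((dphi, G1), (a, G2), (b, G0), (c - nu, I4))
    r = dphi**2 + a**2 + b**2 - (c - nu)**2
    lhs = matmul(B, B)
    rhs = lin((r, I4), (2.0 * (c - nu), B))
    return dist(lhs, rhs)
```

For instance, `identity_defect(0.9, 0.3, 1.1, -0.4, 2.5)` is zero up to rounding, whatever real values are supplied.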
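We now find a relation between $B^\pm$ and $(B^\pm)^2$. Using (3.20) and (3.24), we can
re-express $B^\pm$ as
$$ B^\pm = B_0^\pm + 2\nu^\pm K^\pm, \tag{3.31} $$
where
$$ B_0^\pm = \Gamma^1 \xi + m\Gamma^0 - \nu^\pm, \tag{3.32} $$
$$ K^\pm = \frac{1}{2\nu^\pm} \left[ -\frac{1}{2\xi} \left( a^2 + (b^2 - m^2) - c^2 + d^\pm \right) \Gamma^1 + a\Gamma^2 + (b - m)\Gamma^0 + c \right]. \tag{3.33} $$
If we take the square of (3.31), we get
$$ (B^\pm)^2 = (B_0^\pm)^2 + 2\nu^\pm B_0^\pm K^\pm + 2\nu^\pm K^\pm B^\pm. \tag{3.34} $$
However, from (3.32) and (3.11) we see that $(B_0^\pm)^2 = -2\nu^\pm B_0^\pm$. Hence (3.34)
becomes
$$ (B^\pm)^2 = -2\nu^\pm B_0^\pm (1 - K^\pm) + 2\nu^\pm K^\pm B^\pm. \tag{3.35} $$
Now we substitute the expression (3.35) of $(B^\pm)^2$ into (3.30) and we obtain
$$ r^\pm = -2\nu^\pm B_0^\pm (1 - K^\pm) + 2\nu^\pm \left( 1 + K^\pm - \frac{c}{\nu^\pm} \right) B^\pm. \tag{3.36} $$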
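The chain (3.31) to (3.36) can also be verified numerically at a single point $(x, \xi)$, taking the values of the potentials there as free parameters. The sketch below assumes that $\partial_x\eta^\pm$ is given by (3.24) (so that $r^\pm = (\partial_x\eta^\pm)^2$), uses one sign convention for the Dirac matrices, and introduces the hypothetical names `lin`, `matmul`, `dist` and `defect_336`; the numerical inputs are arbitrary sample values.

```python
import math

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Assumed sign choice for the Dirac matrices; only the relations (2.16) matter.
G1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G2 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
G0 = [[0, 0, 1j, 0], [0, 0, 0, 1j], [-1j, 0, 0, 0], [0, -1j, 0, 0]]

def lin(*terms):
    """Sum of coeff * matrix over (coeff, matrix) pairs, for 4x4 matrices."""
    return [[sum(c * A[i][j] for c, A in terms) for j in range(4)]
            for i in range(4)]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def dist(A, B):
    """Largest entrywise difference between A and B."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

def defect_336(xi, m, a, b, c, sign):
    """Size of r - [-2 nu B0 (1 - K) + 2 nu (1 + K - c/nu) B], cf. (3.36)."""
    nu = sign * math.copysign(1.0, xi) * math.sqrt(xi * xi + m * m)   # (3.11)
    d = 2.0 * c * nu
    deta = -(a * a + (b * b - m * m) - c * c + d) / (2.0 * xi)        # (3.24)
    r = deta * deta                                                   # (3.28)
    B = lin((xi + deta, G1), (a, G2), (b, G0), (c - nu, I4))          # (3.20)
    B0 = lin((xi, G1), (m, G0), (-nu, I4))                            # (3.32)
    K = lin((deta / (2 * nu), G1), (a / (2 * nu), G2),
            ((b - m) / (2 * nu), G0), (c / (2 * nu), I4))             # (3.33)
    first = matmul(B0, lin((1.0, I4), (-1.0, K)))
    second = matmul(lin((1.0 - c / nu, I4), (1.0, K)), B)
    rhs = lin((-2.0 * nu, first), (2.0 * nu, second))
    return dist(lin((r, I4)), rhs)
```

The returned defect is of the order of rounding errors for both signs and for positive or negative `xi`, which is consistent with (3.36) being an exact algebraic identity once (3.24) holds.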


We would like to isolate B in (3.36). We thus need to invert the functions (1 +


K c ). Using (2.19), (2.20) and (3.25), we get the following global asymptotics
for K


C x1 1 , x R+ , R ,
, N, |x K (x, )|
C x 1 , x R , R .
(3.37)

Let us consider the set X = { R, || R} where R  1 is a constant. It follows



immediately from the asymptotics (3.37) and those of c(x) () that (1 + K c )
and (1 K ) are invertible for all (x, ) R X if the constant R is assumed to
be large enough. In consequence, we can write (3.36) as
 1
1 c
B (1 K )1 = 1 + K
r (1 K )1
2
 1
c

+ 1+K B0 , (3.38)

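for all $(x, \xi) \in \mathbb{R} \times X$. The first term on the right-hand side of (3.38) is small thanks
to (3.29) but the second one is not. We choose $p^\pm$ in such a way that this second
term cancels. To do this, we observe that the Fourier representations of the projections
$P^m_\pm$, i.e. the operators
$$P^m_\pm(\xi) = 1_{\mathbb{R}^\pm}\Big(\frac{\xi}{\sqrt{\xi^2 + m^2}}(\Gamma^1\xi + m\Gamma^0)\Big) = \frac{1}{2}\Big(I_4 \pm \frac{\operatorname{sgn}(\xi)}{\sqrt{\xi^2 + m^2}}(\Gamma^1\xi + m\Gamma^0)\Big), \quad \xi \neq 0, \tag{3.39}$$
satisfy the following equations
$$B_0^\pm(\xi)P^m_\pm(\xi) = 0, \tag{3.40}$$
by Lemma 3.1 and (3.32). According to (3.38), a natural choice for $p^\pm$ is thus
$$p^\pm = (1 - K^\pm)^{-1}P^m_\pm(\xi), \tag{3.41}$$
for which we have
$$q^\pm := B^\pm p^\pm = \frac{1}{2\nu^\pm}\Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)^{-1} r^\pm (1 - K^\pm)^{-1}P^m_\pm(\xi). \tag{3.42}$$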
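The two algebraic facts used above, $(B_0^\pm)^2 = -2\nu^\pm B_0^\pm$ and $B_0^\pm P^m_\pm = 0$, can be checked numerically. The following is only a sketch: the explicit matrices `G1` and `G0` are an assumed representation (chosen so that $\Gamma^1$ is diagonal and the two matrices anticommute), not necessarily the one fixed in (2.18), and the function name `check_projection` is hypothetical.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(*terms):
    """Linear combination sum(coef * matrix) of equally sized matrices."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def max_abs(A):
    """Largest entry in absolute value."""
    return max(abs(x) for row in A for x in row)

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Assumed representation: Gamma^1 diagonal, Gamma^0 anticommuting with it.
G1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

def check_projection(xi, m, sign):
    """Return the residuals of B0^2 + 2 nu B0, B0 P and P^2 - P."""
    root = math.sqrt(xi * xi + m * m)
    s = sign * math.copysign(1.0, xi)       # +/- sgn(xi)
    nu = s * root                           # nu^{+/-}(xi)
    D = lincomb((xi, G1), (m, G0))          # Gamma^1 xi + m Gamma^0
    B0 = lincomb((1.0, D), (-nu, I4))       # (3.32)
    P = lincomb((0.5, I4), (0.5 * s / root, D))  # second form in (3.39)
    err_square = max_abs(lincomb((1.0, matmul(B0, B0)), (2.0 * nu, B0)))
    err_kernel = max_abs(matmul(B0, P))
    err_proj = max_abs(lincomb((1.0, matmul(P, P)), (-1.0, P)))
    return err_square, err_kernel, err_proj
```

All three residuals should vanish up to rounding for any $\xi \neq 0$, any mass $m$ and both signs.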
Let us summarize the situation at this stage. For $\xi \neq 0$, we have defined the
phases $\varphi^\pm(x, \xi) = x\xi + \phi^\pm(x, \xi)$ by (3.26) and for $\xi \in X$, the amplitudes $p^\pm$ are
given by (3.41). Directly from the definitions and from the asymptotics (2.19) and
(2.20) of the potentials $a$, $b$, $c$, the following estimates hold.

Lemma 3.2 (Estimates on the Phases, the Amplitudes and Related
Quantities). For all $x \in \mathbb{R}^+$ and $\xi \in X$ with $R$ large enough, we have
N, | (x, )| C logx . (3.43)

|| 1, N, |x (x, )| C x  . (3.44)


C
|x,
2
( (x, ) x)| . (3.45)
R2
, N, |x K (x, )| C x1 1 . (3.46)
 
, N, |x p (x, ) Pm () | C x1 1 . (3.47)

, N, |x r (x, ) C x2  . (3.48)

, N, |x q (x, ) C x2 1 . (3.49)

, N, |x c (x, ) C x2 1 . (3.50)


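Thanks to (3.43)–(3.45) and (3.47), for $R$ large enough, we can define precisely
our modifiers $J^\pm$ as bounded operators on $\mathcal{H}$ (see [27], for instance). Let $\chi^+ \in
C^\infty(\mathbb{R})$ be a cutoff function in space variables such that $\chi^+(x) = 0$ if $x \le \frac12$ and
$\chi^+(x) = 1$ if $x \ge 1$. Let also $\theta \in C^\infty(\mathbb{R})$ be a cutoff function in energy variables
such that $\theta(\xi) = 0$ if $|\xi| \le \frac12$ and $\theta(\xi) = 1$ if $|\xi| \ge 1$. For $R$ large enough, $J^\pm$ are
the Fourier Integral Operators with phases $\varphi^\pm(x, \xi)$ and amplitudes
$$P^\pm(x, \xi) = \chi^+(x)\,p^\pm(x, \xi)\,\theta\Big(\frac{\xi}{R}\Big). \tag{3.51}$$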
We finish this part with a first application of the previous construction. In the
next theorem, the modifiers $J^\pm_{(+\infty)}$ are shown to be time-independent modifications
of Isozaki–Kitada type equivalent to the Dollard modification (2.21). Precisely, we
have
Theorem 3.3. For any H such that supp X, we have


W(+) = lim eitH J(+) eit (Dx )
Pm . (3.52)
t

Proof. We only sketch the proof for the case (+). By denition of P+m , we have
     
R
b s x +c s x
|D | |D |
+
i 0t m + m
(D )
ds
U (t)P+m = eit (Dx ) e
2
Dx +m 2 x 2
Dx +m 2
P+m
:= V (t)P+m . (3.53)
Then, we write:
+  + 
+
eitH J(+) eit (Dx )
P+m = eitH V (t) V (t)eit (Dx )
 + + 
eit (Dx ) J(+)
+
eit (Dx ) P+m (3.54)
Rt  + + 
+
= eitH V (t) ei 0 []ds eit (Dx ) J(+) eit (Dx ) P+m .
(3.55)


The classical ow associated with the Hamiltonian + () = sgn() 2 + m2 is
given by
 
||
(x, ) = x + t 
t
, . (3.56)
2 + m2
 + + 
Then, using Egorovs theorem, we see that eit (Dx ) J(+) +
eit (Dx ) is a FIO with
phase + (t, x, ) = x + + (x + t, ), and with principal symbolc P + (x + t, )
where = 2|| 2 .
+m
Rt  + + 
+
Thus, ei 0 []ds eit (Dx ) J(+) eit (Dx ) is a FIO with the same principal
symbol and with phase + +
1 (t, x, ) = x + 1 (t, x, ) where
 +  x+t
1 1
+
1 (t, x, ) = [a2 (s) c2 (s)]ds [(b2 (s) m2 ) + 2c(s) + ())]ds
2 x+t 2 0
 +  t 
1 m
+ (b(s) m)2 ds + (b(s) m) + c(s) ds.
2 0 0 + ()
(3.57)
 + 2
1
Since 2 x+t
[a (s) c2 (s)]ds = o(1) when t +, and by making a change of
variables in the last integral, we obtain
 x+t  +
1 1
1 (t, x, ) =
+ [(b2 (s) m2 ) + 2c(s) + ())]ds + (b(s) m)2 ds
2 0 2 0

 t
1
+ [2(b(s) m)m + 2c(s) + ()]ds + o(1). (3.58)
2 0
 x+t  
Using again that t
(b2 (s) m2 ) + 2c(s) + ()) ds = o(1), we see that
 
1 t  2  1 +
1 (t, x, ) =
+ (b (s) m2 ) + 2c(s) + ()) ds + (b(s) m)2 ds
2 0 2 0

 t
1
+ [2(b(s) m)m + 2c(s) + ()]ds + o(1). (3.59)
2 0

Then,
 t  +
1 1
1 (t, x, ) =
+ (b(s) m)2 ds + (b(s) m)2 ds + o(1) = o(1).
2 0 2 0
(3.60)

c It means that the other terms of the symbol are $o(1)$ when $t \to +\infty$.

Using (3.43), (3.44), (3.47) and the continuity of FIOs, we see that
Rt  + + 
+
ei 0 [...]ds eit (Dx ) J(+) eit (Dx ) P+m = P+m + o(1) (3.61)

and Theorem 3.3 follows from (3.55) and (3.61).


We now construct the modifiers at high energy $J^\pm_{(+\infty)}(\lambda)$ so that they
satisfy (3.9) and (3.15). We still omit the lower index $(+\infty)$ in the next notations.
Comparing (3.15) and (3.52) suggests constructing $J^\pm(\lambda)$ close to $e^{-i\lambda x}J^\pm e^{i\lambda x}$,
which are clearly FIOs with phases $\varphi^\pm(x, \xi, \lambda) = x\xi + \phi^\pm(x, \xi + \lambda)$ and amplitudes
$P^\pm(x, \xi + \lambda)$.
With J () = eix J eix , we see from (3.50) that the amplitudes

c (x, , ) = B (x, + )P (x, + ) i1 x P (x, + ),

of the operators C () = H()J () J () (Dx + ) would satisfy the


estimate

c (x, , ) = O(x2 1 ), (3.62)

for in a compact set. Here and in the following, the notation f (x, ) =
O(x2 1 ) means that f (x, ) decays as x2 when x + and as 1 when
+. We want however the amplitudes c (x, , ) to be of order O(x2 2 )
and the decay in (3.62) is not sucient for our purpose. In consequence, we need to
rene our construction. Following the procedure given in [4], we look for modiers
J () dened as FIOs with phases (x, , ) and with new amplitudes P (x, , )
that take the form
 
1 1
P (x, , ) = p (x, + ) + p (x, + )l (x) + 2 P k (x) , (3.63)

(up to suitable cuto functions dened later), where P denote the projections onto
the positive and negative spectrum of 1 . Here the correctors l , k (that can be
matrix-valued) will be functions of x only and should satisfy some decay in x (see
below). It will be clear in the next calculations why we add such correctors to the
amplitudes p (x, + ).
We now choose l and k in (3.63) so that the amplitudes
 
1 1
c (x, , ) = B (x, + ) p (x, + ) + p (x, + )l (x) + 2 P k (x)


1
i1 x p (x, + ) + x p (x, + )l (x)


1 1
+ p (x, + )x l (x) + 2 P x k (x) , (3.64)

of the operators C () be of order O(x2 2 ).



To prove this, we need the asymptotics of the dierent functions appearing in


(3.64). For x in R+ and for large enough, we obtain (after long and tedious
calculations)
 
m2
( + ) = + + + O(2 ). (3.65)
2
 
m2
d (x, + ) = 2c(x) + + + O(x1 2 ). (3.66)
2
1
K (x, + ) = [2P c(x) + a(x)2 + (b(x) m)0 ]
2
+ O(x1 2 ). (3.67)

Pm ( + ) = P + O(1 ). (3.68)

p (x, + ) = P + O(1 ). (3.69)


1
x p (x, + ) = P (a (x)2 + b (x)0 ) + O(x2 2 ). (3.70)
2
B (x, + ) = 2( + )P + 2c(x)P + a(x)2
+ b(x)0 + O(1 ). (3.71)

q (x, + ) = B (x, + )p (x, + )


1 2
= c (x)P + O(x2 2 ). (3.72)
2

We mention that the following simple equalities have been used several times to
prove the preceding asymptotics
   
I2 0 0 0
1
1+ =2 = 2P+ , 1 =2
1
= 2P . (3.73)
0 0 0 I2

By (3.69)(3.72), the amplitudes c (x, , ) take the form

1 2 1
c (x, , ) = c P 2 c2 P l
2 2
  
1 1
+ 2 2( + )P + 2cP + a + b + O
2 0
P k


1 1
i P (a 2 + b 0 ) 2 P (a 2 + b 0 )l
1
2 2
     
1 1 1 1
+ P + O x l + 2 P x k + O .
x2 2

From the asymptotics (2.20) of the potentials a, b, c, we rewrite this last expres-
sion as
1 2 i 1
c (x, , ) = c2 P P k P (a 2 + b 0 )
2 2
i
1 P x l + R(x, ), (3.74)

where the rest R(x, ) satises
 
1 + |l (x)| |x l (x)| |k (x)| |k (x)| |x k (x)|
R(x, ) = O + + + + . (3.75)
x2 2 2 2 x2 2
Now we choose the correctors l , k in such a way that the terms of orders O(1 )
in (3.74) cancel. Once it is done we shall have to check that the rest (3.75) be of
order O(x2 2 ).
There are clearly two dierent types of terms in the expression (3.74): on one
hand the terms
 
1 i 1 1
c2 P 1 P x l = P c2 ix l ,
2 2
live in H = P (H); on the other hand the terms
 
2 i 1  2  0 1 i  2  0
P k P (a + b ) = P 2k + (a + b ) ,
2 2
live in H = P (H). Since the Hilbert spaces H and H+ form a direct sum of
H, i.e. H = H H+ , we can consider separatly the equations
1
c2 ix l = 0, (3.76)
2
i
2k + (a 2 + b 0 ) = 0,

(3.77)
2
in order to cancel the terms of order O(1 ) in (3.74). We solve rst (3.76) and
obtain

i + 2
l (x) = l(x) = c (s)ds. (3.78)
2 x
Then we solve (3.77) and get
i
k (x) = (a (x)2 + b (x)0 ). (3.79)
4
The functions l and k clearly satisfy when x +
l(x) = O(x1 ), x l(x) = O(x2 ), k (x) = O(x2 ). (3.80)
Finally, with this choice of correcting terms l and k , we conclude from (3.74) and
(3.75) that
c (x, , ) = R(x, ) = O(x2 2 ).

In fact, we can prove that for all x R+ , in a compact set and large enough

, N, |x c (x, , )| C x2 2 . (3.81)

Let us summarize the previous results. The modiers J () are (formally) con-
structed as FIOs with phases (x, , ) = x + (x, + ) where
 +  x
1
(x, + ) = [a (s) c (s)]ds
2 2
[(b2 (s) m2 )
2( + ) x 0
 + 

+ d (s, + )ds] + (b(s) m) ds ,
2
(3.82)
0

and amplitudes
 
1 1
P (x, , ) = p (x, + ) + p (x, + )l(x) + 2 P k (x) , (3.83)

where l and k are given by (3.78) and (3.79) respectively.


Unfortunately, since $\phi^\pm(x, \xi + \lambda) = O(\langle x\rangle)$ when $x \to -\infty$, this phase does
not belong to a good class of oscillating symbols. So, we have to introduce some
technical cutoff functions in the amplitude in order to localize $x$ far away from
$-\infty$. Moreover, these cutoff functions must be negligible in the asymptotics in the
previous calculus. We follow the strategy exposed in [22] which we briefly recall
here.
We consider a fixed test function $\psi \in C_0^\infty(\mathbb{R})$ and we want to calculate the
asymptotics of $W^\pm_{(+\infty)}(\lambda)\psi$. Since $\hat\psi \notin C_0^\infty(\mathbb{R})$, at high energies, translation of
wave packets does not dominate over spreading. So we introduce a cutoff function
(depending on $\lambda$) in order to control the spreading.
Let 0 C0 (R) be a cuto function such that 0 () = 1 if | | 1, 0 () = 0
if | | 2. Using the Fourier representation, we have easily:
    
 Dx 

 > 0, N 1,  0 1  = O(N ). (3.84)
 2
L (R)

Now, let us dene the classical propagation zone:

= {x + t; x supp , t R+ }, (3.85)

and let + C (R) be a cuto function such that + = 1 in a neighborhood of


and + = 0 in a neighborhood of . We consider
 
it (Dx +) m, Dx
K () = ( 1)e
+
P 0 . (3.86)

Lemma 3.3. For  1,  ]0, 1[, t R , and N 1, we have:

K ()L2 (R) = O(tN N ). (3.87)



Proof. We only sketch the proof for the case (+). Using the Fourier transform and
(3.39), we easily see that
    
1 + 1 ( + ) + m0
+
K () = ( (x) 1)
e i()
I4 +  0 ()d
4 ( + )2 + m2

(y)dy, (3.88)

where () = (x y) t ( + )2 + m2 . So,
  
1
1 +
() = x y + t  . (3.89)
(1 + 1 )2 + m2

Since is in a compact set,  < 1, y supp , we easily obtain for x supp( +


1), and  1,

| ()| c (1 + t), (3.90)

for a suitable constant c > 0. We conclude by a standard non stationary phase


argument.

Now, we can dene precisely ours modiers J () in order to calculate the



asymptotics of W(+) (). According to (3.84), it suces to calculate the asymp-

 ). We rst remark that for  1 and  < 1, we have
()0 ( Dx
totics of W(+)
+ X if  supp 0 . So, we can dene the modiers J () as FIOs with
phases (x, , ) = x + (x, + ) where (x, + ) are given by (3.82) and
with amplitudes
   
+ 1 1
P (x, , ) = (x) p (x, + ) + p (x, + )l(x) + 2 P k (x) 0 ,

(3.91)

where l and k are given by (3.78) and (3.79), respectively.


With this denition, we can mimick the proof of Theorem 3.3, to get

Lemma 3.4. For C0 (R) and for large, we have


 
Dx
W(+) ()0 = lim eitH() J(+) ()eit (Dx +) Pm, . (3.92)
t

Moreover, it is easy to see that the estimates (3.81) are still satised, so we can
prove our main estimate (3.9). Precisely we get

Lemma 3.5. For C0 (R) and when tends to innity, the following estimate
holds:

(W(+) () J(+) ()) = O(2 ).

Proof. Everything done in [4], Lemma 3.3 works here in the same way. All the
contributions coming from the cut-o function + are negligible using the same
arguments as in Lemma 3.3 since the support of the derivatives of + are far away
from .


We end up this section giving the asymptotics of W(+) () when is large.

According to Lemma 3.5, we have for any C0 (R; C4 ), W(+) () =
2
J(+) ()+ O( ). Thus we only need to compute the asymptotics of the modier

J(+) () that we shall consider as pseudodierential operators with symbols

j (x, , ) = ei (x,+)
P (x, , ).

Using the explicit expressions (3.82) and (3.91), we rst get the asymptotics
 x   +  x
1
(x, + ) = c(s)ds + (a2 c2 )(s)ds (b2 (s) m2 )ds
0 2 x 0
 +   
logx
+ (b(s) m) ds + O
2
, (3.93)
0 2
   
1 l(x) 1
P (x, , ) = + (x) P P (a2 + b0 ) + P + O . (3.94)
2 2

Moreover using a Taylor expansion of et at t = 0, we get from (3.93)


  
i (x,+) iC + (x) i + logx
e =e 1+ C (x) + O , (3.95)
2 2
with
 x
+
C (x) = c(s)ds,
0
   (3.96)
+ x +
C + (x) = (a2 c2 )(s)ds (b2 (s) m2 )ds + (b(s) m)2 ds.
x 0 0

Combining now (3.94) and (3.95), we obtain


 
+ i + 1 l(x)
j (x, , ) = eiC (x) + (x) P + C (x)P P (a2 + b0 ) + P
2 2
 
1
+O . (3.97)
2

However, notice from (3.78) that


 +  x  + 
i + l(x) i
C (x)+ = a (s)ds
2
(b (s) m )ds +
2 2
(b(s) m) ds ,
2
2 2 x 0 0

and from the anticommutation properties (2.16) of the Dirac matrices that

P (a2 + b0 ) = (a2 + b0 )P .

Hence (3.97) becomes


  +  x
iC + (x) + i
j (x, , ) = e (x) 1 + a2 (s)ds (b2 (s) m2 )ds
2 x 0
 +    
1 1
+ (b(s) m)2 ds (a2 + b0 ) P + O . (3.98)
0 2 2
Eventually, if we introduce the notations
 +  x  + 
i
R (x) = a (s)ds
2
(b (s) m )ds +
2 2
(b(s) m) ds
2
2 x 0 0

1
(a2 + b0 ), (3.99)
2
we deduce from (3.98) and the fact that + (x) = 1 on supp , the following
Proposition

Proposition 3.1. For any C0 (R; C4 ),


   
iC + (x) 1 1
W(+) () = e 1 + R (x) P + O , (3.100)
2
where C + (x) and R (x) are given by (3.96) and (3.99), respectively.


3.2. Asymptotics of $W^\pm_{(-\infty)}(\lambda)$
In this subsection, we focus on what happens at the event horizon and give the
asymptotics of $W^\pm_{(-\infty)}(\lambda)$ when $\lambda \to +\infty$. In fact, we shall derive them from the
results obtained in the preceding Sec. 3.1 after some simplifications of our model. As
usual, we shall omit the lower index $(-\infty)$ in the objects defined or used hereafter.
Recall that the expressions of the wave operators at the event horizon are given
by (see (2.22))
$$W^\pm = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH}e^{-itH_0}P_\mp,$$
where $H_0 = \Gamma^1D_x + c_0$, $H = \Gamma^1D_x + a\Gamma^2 + b\Gamma^0 + c$ and the potentials $a$, $b$, $c - c_0$
satisfy (2.19) when $x \to -\infty$. We first simplify this expression in a convenient way.
Let us introduce the unitary transform $U$ on $\mathcal{H}$
$$U = e^{-i\Gamma^1C^-(x)}, \qquad C^-(x) = \int_{-\infty}^x [c(s) - c_0]\,ds + c_0x, \tag{3.101}$$
and define the selfadjoint operators on $\mathcal{H}$
$$A_0 = \Gamma^1D_x, \qquad A = U^*HU. \tag{3.102}$$

Using (3.101), a short calculation shows that the operator $A$ can be rewritten as
$$A = \Gamma^1D_x + W(x), \tag{3.103}$$
where
$$W(x) = e^{i\Gamma^1C^-(x)}\big(a(x)\Gamma^2 + b(x)\Gamma^0\big)e^{-i\Gamma^1C^-(x)}. \tag{3.104}$$
Note that according to the anticommutation properties (2.16) of the Dirac matrices,
the potential $W$ satisfies $W\Gamma^1 + \Gamma^1W = 0$ and $W^2(x) = a^2(x) + b^2(x)$. Moreover
from (2.19), we get the following estimate for $W$:
$$\exists\,\kappa > 0, \quad W(x) = O(e^{\kappa x}), \quad x \to -\infty. \tag{3.105}$$
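The two algebraic properties of $W$ stated above can be checked numerically at a single point $x$. This is a sketch under assumptions: the matrices `G0`, `G1`, `G2` are an assumed representation of the Dirac matrices (pairwise anticommuting, squaring to the identity, with $\Gamma^1$ diagonal), the numbers `a`, `b`, `C` stand for the real values $a(x)$, $b(x)$, $C^-(x)$, and the helper names are hypothetical.

```python
import cmath

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(A, B):
    """Largest entrywise distance between two matrices."""
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A)))

# Assumed representation of the Dirac matrices (Gamma^1 diagonal).
G1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
G2 = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]

def W_matrix(a, b, C):
    """W = exp(i G1 C) (a G2 + b G0) exp(-i G1 C) for real a, b, C."""
    # exp(i G1 C) is diagonal because G1 is diagonal.
    phase = [cmath.exp(1j * C * G1[k][k]) for k in range(4)]
    V = [[a * G2[i][j] + b * G0[i][j] for j in range(4)] for i in range(4)]
    return [[phase[i] * V[i][j] * phase[j].conjugate() for j in range(4)]
            for i in range(4)]

def check_W(a, b, C):
    """Return the residuals of W G1 + G1 W and of W^2 - (a^2 + b^2) I."""
    W = W_matrix(a, b, C)
    WG, GW = matmul(W, G1), matmul(G1, W)
    anti = max(abs(WG[i][j] + GW[i][j]) for i in range(4) for j in range(4))
    target = [[(a * a + b * b) if i == j else 0 for j in range(4)]
              for i in range(4)]
    return anti, max_abs_diff(matmul(W, W), target)
```

Both residuals should be zero up to rounding, whatever the real values of `a`, `b` and `C`.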

Using the unitarity of $U$ and (3.102) we rewrite $W^\pm$ as
$$W^\pm = U \operatorname*{s-lim}_{t\to\pm\infty} e^{itA}U^*e^{-itH_0}P_\mp = U \operatorname*{s-lim}_{t\to\pm\infty} e^{itA}e^{-itA_0}\,e^{itA_0}U^*e^{-itH_0}P_\mp. \tag{3.106}$$
Now we can simplify the strong limit appearing in (3.106) in two steps. First we
claim that
$$\operatorname*{s-lim}_{t\to\pm\infty} e^{itA_0}U^*e^{-itH_0}P_\mp = e^{i\Gamma^1c_0x}P_\mp. \tag{3.107}$$
Indeed, using the particular diagonal form of $\Gamma^1$ given in (2.18) and since $e^{-itH_0} =
e^{-itA_0}e^{-itc_0}$, we have
$$e^{itA_0}U^*e^{-itH_0}P_\mp = e^{itA_0}e^{i\Gamma^1C^-(x)}e^{-itA_0}e^{-itc_0}P_\mp = e^{i\Gamma^1C^-(x\mp t)}e^{-itc_0}P_\mp. \tag{3.108}$$
When $t \to +\infty$, the right-hand side of (3.108) can be written using (3.101) as
$$e^{-iC^-(x-t)}e^{-itc_0}P_- = e^{-i\left(\int_{-\infty}^{x-t}(c(s)-c_0)\,ds + c_0x\right)}P_-,$$
from which (3.107) follows when $t \to +\infty$. The case $t \to -\infty$ is obtained similarly.
Second, since the potential $W$ decays exponentially when $x \to -\infty$ by (3.105),
it follows from the methods used in [3, 18] that the wave operators
$$W^\pm(A, A_0) = \operatorname*{s-lim}_{t\to\pm\infty} e^{itA}e^{-itA_0}P_\mp, \tag{3.109}$$
exist on $\mathcal{H}$. Hence by (3.106), (3.107), (3.109) and the chain rule, we obtain the
following expressions for $W^\pm$:
$$W^\pm = U\,W^\pm(A, A_0)\,e^{i\Gamma^1c_0x}P_\mp. \tag{3.110}$$
Finally, since $U$ and $e^{i\Gamma^1c_0x}$ commute with $e^{i\lambda x}$, it is clear from (3.110) that it is
enough to know the asymptotics of
$$W^\pm(A, A_0, \lambda) = e^{-i\lambda x}W^\pm(A, A_0)e^{i\lambda x}$$
when $\lambda \to +\infty$ in order to get the asymptotics of $W^\pm(\lambda)$.
Note here that the $\lambda$-shifted wave operator $W^\pm(A, A_0, \lambda)$ is exactly the kind
of wave operator studied in our previous paper [4] in which the asymptotics of

$W^\pm(A, A_0, \lambda)$ were calculated. Nevertheless, we can also easily derive these
asymptotics from the results of the preceding section. For completeness this is what we
choose to do here.
We thus follow our usual strategy and construct modifiers $J_0^\pm(\lambda)$ corresponding
to $W^\pm(A, A_0, \lambda)$. This problem is in fact similar to the one in Sec. 3.1. It suffices to
replace $H_0^m$ by $A_0$ and $H$ by $A$ in our calculations. From the explicit form (3.102)
and (3.103) of the operators $A_0$ and $A$, we deduce that we can use the results
obtained in Sec. 3.1 with the following changes: (1) Since the mass $m$ does not
appear in $A_0$, we take $m = 0$. (2) The long-range matrix-valued potential $b\Gamma^0$
and scalar potential $c$ do not appear in $A$ (see (3.103) and (3.105)), hence we put
$b(x) = c(x) = 0$. (3) The short-range matrix-valued potential $a(x)\Gamma^2$ is replaced
by $W(x)$. (4) The projections $P^m_\pm$ are replaced by $P_\mp$ since we work at the event
horizon. Noting that these changes also entail that () = and d (x, ) = 0,
we obtain the following results.
At xed energy = 0, the modiers J0 are dened as FIOs with phases

1
(x, ) = x + W 2 (s)ds,
2 x
and amplitudesd
 
1 W 2 (x) 1
p (x, ) = (1K (x, ))1 P , K (x, ) = + W (x) . (3.111)
2 2
At high energy, the modiers J0 () are dened as FIOs with phases

1
(x, , ) = x + W 2 (s)ds, (3.112)
2( + ) x
and amplitudes
1
P (x, , ) = p (x, + ) + P k (x), (3.113)
2
where k (x) = 4i W  (x). Using these denitions and (3.105), we can prove that
the symbols c (x, , ) of the operators C () = A()J0 () J0 ()A0 () satisfy
the estimates
ex
, N, |x c (x, , )| C 2 , (3.114)

for all x R and large enough. Finally as in the proof of Lemma 3.5 the
estimates (3.114) are the main ingredients to prove the equivalent properties to
(3.14) and (3.9). Precisely we have

Lemma 3.6. For any C0 (R; C4 ) and for large, the following estimate holds
(W (A, A0 , ) J0 ()) = O(2 ).

d In the same way as in the preceding section, we should add some technical cutoff functions which
are negligible in the asymptotics.

We now use Lemma 3.6 to compute the asymptotics of W (A, A0 , ) up to


the order O(2 ). For any C0 (R; C4 ) and for large, we have
 
1
W (A, A0 , ) = J0 () + O .
2

Hence, it is enough to compute the asymptotics of J0 () for large. Using (3.111)


(3.113) and after some calculations, we obtain
     
1 1
J0 () = 1 + i W (s)ds W (x) P + O
2
2
. (3.115)
2 x
Note that we retrieve naturally the same formulae as in [4]. Eventually combining
(3.110) and (3.115), we obtain the asymptotics of W () for large

Proposition 3.2. For any C0 (R),


   
1 i1 c0 x 1
W() () = U 1 + Q (x) e P + O , (3.116)
2

where U is given by (3.101), Q (x) = 12 (i x W 2 (s)ds W (x)) and W (x) is given
by (3.104).

3.3. Proofs of Theorems 3.1 and 3.2



In this last subsection, we use the asymptotics of $W^\pm_{(\pm\infty)}(\lambda)$ obtained in Propositions
3.1 and 3.2 to prove the reconstruction formulae given in Theorem 3.2 and finally
prove Theorem 3.1.

Proof of Theorem 3.2. We only treat the case of the transmission operator $T_R$
and give the proof of (3.1) since the proof of (3.2) corresponding to the transmission
operator $T_L$ is similar. Recall that we want to compute the asymptotic expansion
when $\lambda \to +\infty$ of
$$F_l(\lambda) = \langle T_R e^{i\lambda x}\psi, e^{i\lambda x}\eta\rangle = \langle W^-_{(+\infty)}(\lambda)\psi, W^+_{(-\infty)}(\lambda)\eta\rangle,$$
for $\psi, \eta \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Using Propositions 3.1 and 3.2 and the notations therein, we
have
     
iC + (x) 1 1 + i1 c0 x 1
Fl () = e 1 + R (x) P , U 1 + Q (x) e P  + O ,
2
+ 1 1 iC + (x) 1
= eiC (x)
P , U ei c0 x
[e
P  + P , U Q+ ei c0 x P 

 
iC + (x) i1 c0 x 1
+ e R P , U e P ] + O . (3.117)
2
We now compute separatly the terms of dierent orders in (3.117).

Order 0. Since 1 P = P , the term of order 0 reads


+
(x)C (x)+c0 x]
ei[C P , P . (3.118)
Moreover from (3.96) and (3.101), the phase C + (x) C (x) + c0 takes the simple
form
 0
C + (x) C (x) + c0 x = [c(s) c0 ]ds + c0 x. (3.119)

Order 1. Using 1 P = P again, the term of order 1 can be written as


+
(x)C (x)+c0 x]
ei[C (R + (Q+ ) ) P , P .

Since W 2 = a2 + b2 and W P = e2iC (a2 + b0 )P by (2.16), the term (Q+ ) P
takes the form
  
i 2 1
(Q+ ) P = (a + b2 )(s)ds e2iC (a2 + b0 ) P . (3.120)
2 x 2
Moreover from (3.99) the term R is
 
i + 2 i x 2
R = a (s)ds (b (s) m2 )ds
2 x 2 0

i + 1
+ (b(s) m)2 ds (a2 + b0 ). (3.121)
2 0 2
Hence adding (3.120) and (3.121), the term of order 1 reads
   + 
i[C + (x)C (x)+c0 x] i i 0 2
e a2 (s)ds + b (s)ds
2 2
 
i + i
+ (b(s) m)2 ds + m2 x P , P
2 0 2
  
i[C + (x)C (x)+c0 x] 1 2iC 1
e e 2 0 2 0
(a + b ) + (a + b ) P , P .
2 2
(3.122)
+
Finally using that ei[C (x)C (x)+c0 x] is scalar, that (a2 +b0 )P = P (a2 +b0 )
by (2.16) and the fact that P+ , P  = 0, we see that the last term in (3.122)
cancel, i.e.
  
+ 1 2iC 1
ei[C (x)C (x)+c0 x] e (a2 + b0 ) + (a2 + b0 ) P , P = 0.
2 2
Hence the term of order 1 is
   +  
i[C + (x)C (x)+c0 x] i i 0 2 i +
e 2
a (s)ds + b (s)ds + (b(s) m)2 ds
2 2 2 0

i 2
+ m x P , P . (3.123)
2

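If we introduce the following functions
$$\Theta(x) = e^{-i\int_{-\infty}^0 [c(s)-c_0]\,ds + ic_0x},$$
$$A(x) = \Theta(x)\Big(\int_{-\infty}^{+\infty} a^2(s)\,ds + \int_{-\infty}^0 b^2(s)\,ds + \int_0^{+\infty}(b(s) - m)^2\,ds + m^2x\Big),$$
we have proved the reconstruction formula (3.1) and thus Theorem 3.2.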

Proof of Theorem 3.1. We show here that the reconstruction formula (3.1) entails
the uniqueness of the parameters $M$ and $Q$ under the additional assumption that
the charge $q$ of Dirac fields is known, fixed and nonzero. The same result can be
shown from the reconstruction formula (3.2) in a similar way.
We first compute one of the integrals that appear in (3.1) which will be useful
in the later analysis. Using the explicit expressions of $F$, $a_l$ given in (2.2) and (2.15)
as well as the definition of the Regge–Wheeler variable $x(r)$ given in (2.6), an easy
calculation shows that
$$\int_{\mathbb{R}} a_l^2(s)\,ds = \Big(l + \frac12\Big)^2\frac{1}{r_0}, \tag{3.124}$$
where $r_0$ is the radius of the event horizon.
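The "easy calculation" behind (3.124) can be sketched as follows. The forms used here are assumptions of this note, since (2.6) and (2.15) are not reproduced in this part of the text: we take $a_l(x)^2 = \frac{F(r)}{r^2}\big(l+\frac12\big)^2$ and $\frac{dx}{dr} = \frac{1}{F(r)}$, with $r$ running over $(r_0, +\infty)$ as $x$ runs over $\mathbb{R}$.

```latex
\begin{align*}
\int_{\mathbb{R}} a_l^2(s)\,ds
  &= \Big(l+\tfrac12\Big)^2 \int_{r_0}^{+\infty} \frac{F(r)}{r^2}\,\frac{dr}{F(r)}
   && \text{(change of variable } s = x(r)\text{)}\\
  &= \Big(l+\tfrac12\Big)^2 \int_{r_0}^{+\infty} \frac{dr}{r^2}
   = \Big(l+\tfrac12\Big)^2 \frac{1}{r_0}.
\end{align*}
```

Under the same assumptions, replacing the upper limit by a cosmological horizon $r_+$ gives $\big(l+\frac12\big)^2\big(\frac{1}{r_0} - \frac{1}{r_+}\big)$, which is the $l$-dependent part of (4.28).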


Now let us consider two transmission operators $T_{l,1}$ and $T_{l,2}$ corresponding,
respectively, to parameters $M_j$, $Q_j$, $m_j$ $(j = 1, 2)$ and $q_1 = q_2 = q$ where $q$ is
supposed to be known and nonzero. In what follows, all the objects corresponding
to $T_{l,j}$ with $j = 1, 2$ will be denoted by the usual notations with a lower index $j$.
We suppose that $T_{l,1} = T_{l,2}$. In consequence we also have $F_{l,1}(\lambda) = F_{l,2}(\lambda)$. Our
goal is to prove that $M_1 = M_2$ and $Q_1 = Q_2$. Using Theorem 3.2 and identifying
the terms of same orders in the reconstruction formula (3.1), we thus get
$$\Theta_1(x) = \Theta_2(x), \tag{3.125}$$
$$A_1(x) = A_2(x). \tag{3.126}$$
By (3.3) and a standard continuity argument, (3.125) leads to the equality
$$-i\int_{-\infty}^0 [c_1(s) - c_{0,1}]\,ds + ic_{0,1}x = -i\int_{-\infty}^0 [c_2(s) - c_{0,2}]\,ds + ic_{0,2}x + 2ik\pi, \tag{3.127}$$
where $k \in \mathbb{Z}$. If we differentiate (3.127) with respect to $x$, we obtain
$$c_{0,1} = c_{0,2} := c_0. \tag{3.128}$$
Now by (3.124) and (3.125), the equality (3.126) leads, up to the common factor $\frac{i}{2}$, to
$$\frac{i}{2}\Big(l + \frac12\Big)^2\frac{1}{r_{0,1}} + \frac{i}{2}\int_{-\infty}^0 b_1^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b_1(s) - m_1)^2\,ds + \frac{i}{2}m_1^2x$$
$$= \frac{i}{2}\Big(l + \frac12\Big)^2\frac{1}{r_{0,2}} + \frac{i}{2}\int_{-\infty}^0 b_2^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b_2(s) - m_2)^2\,ds + \frac{i}{2}m_2^2x. \tag{3.129}$$

If we differentiate (3.129) with respect to $x$, we first get
$$m_1 = m_2 := m. \tag{3.130}$$
Hence the mass $m$ of Dirac fields is uniquely determined. Moreover, using (3.130),
(3.124) and the homogeneity in the parameter $l$, we obtain from (3.129)
$$r_{0,1} = r_{0,2} := r_0. \tag{3.131}$$
Therefore the radius $r_0$ of the event horizon is also uniquely determined. Now if
we combine (3.131) and $c_0 = \frac{qQ}{r_0}$ with (3.128), we get (since $q$ is supposed to be
nonzero)
$$Q_1 = Q_2 := Q.$$
The charge $Q$ of the black hole is thus uniquely determined. Finally, since $r_0$
is a zero of the function $F$, we get from (2.2) that
$$M_1 = M_2 := M = \frac{r_0^2 + Q^2}{2r_0},$$
and the mass $M$ of the black hole is uniquely determined. This finishes the proof
of Theorem 3.1.
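The last two steps of the proof amount to a short computation. The sketch below assumes, as in the proof, that the data recovered from the high-energy asymptotics are the horizon radius $r_0$ and the constant $c_0 = qQ/r_0$, and that $F(r) = 1 - 2M/r + Q^2/r^2$ when $\Lambda = 0$ (this form of (2.2) is an assumption here, consistent with the formula for $M$ above). The function name `recover_rn_parameters` is hypothetical.

```python
import math

def recover_rn_parameters(r0, c0, q):
    """Recover (M, Q) from the event-horizon radius r0, the constant
    c0 = q*Q/r0 and the known nonzero charge q of the Dirac field."""
    if q == 0:
        raise ValueError("the charge q of the Dirac field must be nonzero")
    Q = c0 * r0 / q                      # from c0 = q Q / r0
    M = (r0 ** 2 + Q ** 2) / (2.0 * r0)  # from F(r0) = 0
    return M, Q

# Usage: build the data from a known black hole, then recover it.
M_true, Q_true, q = 1.0, 0.6, 2.0
r0 = M_true + math.sqrt(M_true ** 2 - Q_true ** 2)  # outer root of F
c0 = q * Q_true / r0
M_rec, Q_rec = recover_rn_parameters(r0, c0, q)
```

The assumption $q \neq 0$ is used exactly once, to divide by $q$ when solving for $Q$.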

4. The Inverse Problem for dS-RN Black Holes ($\Lambda > 0$)

In this section, we study the inverse problem in the case $\Lambda > 0$ corresponding to
dS-RN black holes. In a first part, we prove the same kind of results as in Sec. 3,
that is we prove that the parameters $M$, $Q$ and $\Lambda$ are uniquely determined by the
high energies of the transmission operators $T_L$ or $T_R$. In a second part, we prove
by means of a purely stationary method that the parameters $M$, $Q$ and $\Lambda$ can also
be uniquely determined from the knowledge of the reflection operators $L$ or $R$ on
any interval of energy.

4.1. The inverse problem at high energy

As in Sec. 3, we shall assume here that one of the following functions of $\lambda \in \mathbb{R}$
$$F_l(\lambda) = \langle T_R e^{i\lambda x}\psi, e^{i\lambda x}\eta\rangle, \qquad G_l(\lambda) = \langle T_L e^{i\lambda x}\psi, e^{i\lambda x}\eta\rangle,$$
is known for all large values of $\lambda$, for all $l \in \mathbb{N}$ and for all $\psi, \eta \in \mathcal{H}$ with $\hat\psi, \hat\eta \in
C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. We emphasize that in this case the construction of the modifiers is
simpler than in the previous section due to the decay of the potentials at infinity;
the phases of the modifiers constructed later will belong to a good class of oscillating
symbols. In particular, we do not need a technical cutoff function $\chi^+$ and a cutoff
function $\chi_0$ in order to control the spreading of the wave packets as in Sec. 3 and we
can consider test functions $\psi, \eta \in \mathcal{H}$ with $\hat\psi, \hat\eta \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. We also assume that

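the mass $m$ and the charge $q$ of the Dirac fields are known and fixed. Furthermore,
the charge $q$ is supposed to be nonzero. Then our main result is

Theorem 4.1. Under the previous assumptions, the parameters $M$, $Q$ and $\Lambda$ of the
dS-RN black hole are uniquely determined.

This theorem will follow from the following reconstruction formulae obtained
on each spin-weighted spherical harmonic.

Theorem 4.2 (Reconstruction Formulae). Let $\psi, \eta \in \mathcal{H}$ be such that $\hat\psi, \hat\eta \in
C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Then for $\lambda$ large, we have
$$F_l(\lambda) = \langle\Theta(x)P_-\psi, P_-\eta\rangle + \frac{1}{\lambda}\langle A(x)P_-\psi, P_-\eta\rangle + O(\lambda^{-2}), \tag{4.1}$$
$$G_l(\lambda) = \langle\Theta(x)P_+\psi, P_+\eta\rangle - \frac{1}{\lambda}\langle A(x)P_+\psi, P_+\eta\rangle + O(\lambda^{-2}), \tag{4.2}$$
where $\Theta(x)$ and $A(x)$ are multiplication operators given by
$$\Theta(x) = e^{-i\beta - i(c_+ - c_0)x}, \qquad A(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty}\big(a_l^2(s) + b^2(s)\big)\,ds\Big)\Theta(x), \tag{4.3}$$
and $\beta$ is a constant given by
$$\beta = \int_{-\infty}^0 \big(c(s) - c_0\big)\,ds + \int_0^{+\infty}\big(c(s) - c_+\big)\,ds.$$

We shall prove Theorem 4.2 using the same global strategy as in the proof of
Theorem 3.2. From (2.30), (2.31), (2.33) and the fact that $e^{i\lambda x}$ corresponds to a
translation by $\lambda$ in momentum space, we express $F_l(\lambda)$ and $G_l(\lambda)$ as follows
$$F_l(\lambda) = \langle W^-_{(+\infty)}(\lambda)\psi, W^+_{(-\infty)}(\lambda)\eta\rangle, \tag{4.4}$$
$$G_l(\lambda) = \langle W^-_{(-\infty)}(\lambda)\psi, W^+_{(+\infty)}(\lambda)\eta\rangle, \tag{4.5}$$
with
$$W^\pm_{(-\infty)}(\lambda) = e^{-i\lambda x}W^\pm_{(-\infty)}e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH(\lambda)}e^{-itH_0(\lambda)}P_\mp, \tag{4.6}$$
$$W^\pm_{(+\infty)}(\lambda) = e^{-i\lambda x}W^\pm_{(+\infty)}e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH(\lambda)}e^{-itH_+(\lambda)}P_\pm, \tag{4.7}$$
and
$$H(\lambda) = \Gamma^1(D_x + \lambda) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x),$$
$$H_0(\lambda) = \Gamma^1(D_x + \lambda) + c_0, \qquad H_+(\lambda) = \Gamma^1(D_x + \lambda) + c_+.$$
In consequence, it is enough to obtain an asymptotic expansion of the $\lambda$-shifted
wave operators $W^\pm_{(\pm\infty)}(\lambda)$ in order to prove the reconstruction formulae (4.1)
and (4.2).

Note first that the $\lambda$-shifted wave operators $W^\pm_{(-\infty)}(\lambda)$ given by (4.6) are exactly
the same as in the case $\Lambda = 0$ studied in Sec. 3.2. For completeness we recall here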


the asymptotic expansion of $W^\pm_{(-\infty)}(\lambda)$ obtained in Proposition 3.2. For any $\psi \in \mathcal{H}$,
$\hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$, we have
$$W^\pm_{(-\infty)}(\lambda)\psi = U\Big(1 + \frac{1}{\lambda}Q^\pm(x)\Big)e^{i\Gamma^1c_0x}P_\mp\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{4.8}$$
where
$$U = e^{-i\Gamma^1C^-(x)}, \qquad C^-(x) = \int_{-\infty}^x [c(s) - c_0]\,ds + c_0x, \tag{4.9}$$
$$Q^\pm(x) = \frac12\Big(i\int_x^{-\infty} W^2(s)\,ds \mp W(x)\Big), \qquad W(x) = e^{i\Gamma^1C^-(x)}\big(a(x)\Gamma^2 + b(x)\Gamma^0\big)e^{-i\Gamma^1C^-(x)}. \tag{4.10}$$

Note second that the $\lambda$-shifted wave operators $W^\pm_{(+\infty)}(\lambda)$ given by (4.7) are very
similar to (4.6), the constant $c_0$ being replaced by $c_+$ and the projections $P_\mp$ being
replaced by $P_\pm$ since we work now at the cosmological horizon. Hence they can be
studied exactly the same way as in Sec. 3.2. Since there are slight modifications in
some formulae, we recall here the procedure but omit the proofs. Using the unitary
transform (4.9), we simplify the wave operators $W^\pm_{(+\infty)}$ as follows
$$W^\pm_{(+\infty)} = U\operatorname*{s-lim}_{t\to\pm\infty} e^{itA}e^{-itA_0}\,e^{itA_0}U^*e^{-itH_+}P_\pm, \tag{4.11}$$
where we have used again the notations $A_0 = \Gamma^1D_x$ and $A = U^*HU = \Gamma^1D_x + W(x)$
from (3.102) and (3.103) with the potential $W$ given by (4.10). We also recall that
by (2.16) this new potential $W(x)$ satisfies the properties
$$\Gamma^1W + W\Gamma^1 = 0, \qquad W^2 = a^2 + b^2, \tag{4.12}$$
as well as the global estimate
$$\exists\,\kappa > 0, \quad W(x) = O(e^{-\kappa|x|}), \quad \forall x \in \mathbb{R}. \tag{4.13}$$
The potential $W$ is thus very short-range both at the event horizon and at the
cosmological horizon. Now an easy calculation shows that (to be compared with
(3.107) and its proof)
$$\operatorname*{s-lim}_{t\to\pm\infty} e^{itA_0}U^*e^{-itH_+}P_\pm = e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm, \tag{4.14}$$
where the constant $\beta$ is given by
$$\beta = \int_{-\infty}^0 \big(c(s) - c_0\big)\,ds + \int_0^{+\infty}\big(c(s) - c_+\big)\,ds. \tag{4.15}$$
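The "easy calculation" behind (4.14) and the origin of the constant $\beta$ can be sketched as follows, for the case $t \to +\infty$ on the range of $P_+$. This sketch assumes that $c - c_0$ and $c - c_+$ are integrable at $-\infty$ and $+\infty$ respectively (so that $\beta$ is finite), and that $\Gamma^1 P_+ = P_+$ with $e^{itA_0}$ acting as the translation $x \mapsto x + t$ on the range of $P_+$.

```latex
\begin{align*}
C^-(y) &= \int_{-\infty}^{y}[c(s)-c_0]\,ds + c_0\,y \\
       &= \int_{-\infty}^{0}[c(s)-c_0]\,ds + \int_{0}^{y}[c(s)-c_+]\,ds
          + (c_+-c_0)\,y + c_0\,y \\
       &= \beta + c_+\,y - \int_{y}^{+\infty}[c(s)-c_+]\,ds
        \;=\; \beta + c_+\,y + o(1), \qquad y \to +\infty, \\[4pt]
e^{itA_0}U^*e^{-itH_+}P_+ &= e^{iC^-(x+t)}\,e^{-itc_+}P_+
        = e^{i\beta + ic_+x + o(1)}P_+
        \;\longrightarrow\; e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_+ .
\end{align*}
```

The case $t \to -\infty$ on the range of $P_-$ is the same computation with $x + t$ replaced by $x - t$ and all phases conjugated.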

Furthermore, it is immediate from (4.13) that the wave operators $W^\pm(A, A_0) =
\operatorname*{s-lim}_{t\to\pm\infty} e^{itA}e^{-itA_0}$ exist on $\mathcal{H}$. Hence we conclude by the chain rule that $W^\pm_{(+\infty)}$


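take the form (to be compared to the expressions (3.110) obtained for $W^\pm_{(-\infty)}$)
$$W^\pm_{(+\infty)} = U\,W^\pm(A, A_0)\,e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm. \tag{4.16}$$
Since $U$ and $e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}$ commute with $e^{i\lambda x}$, we finally get the following expression
for $W^\pm_{(+\infty)}(\lambda)$
$$W^\pm_{(+\infty)}(\lambda) = U\,W^\pm(A, A_0, \lambda)\,e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm,$$
where
$$W^\pm(A, A_0, \lambda) = e^{-i\lambda x}W^\pm(A, A_0)e^{i\lambda x}.$$
Clearly it is enough to know the asymptotics of $W^\pm(A, A_0, \lambda)P_\pm$ when $\lambda \to +\infty$ in
order to get the asymptotics of $W^\pm_{(+\infty)}(\lambda)$. In fact, the calculations are exactly the
same as what has been done in Sec. 3.2 (it suffices to replace $P_\mp$ by $P_\pm$ in these
calculations) or in [4]. Hence we only give the final result without more details. For
any $\psi \in \mathcal{H}$, $\hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$, we finally obtain
$$W^\pm_{(+\infty)}(\lambda)\psi = U\Big(1 + \frac{1}{\lambda}Q^\pm(x)\Big)e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{4.17}$$
where $U$ is given by (4.9), $Q^\pm(x) = \frac12\big(i\int_x^{+\infty} W^2(s)\,ds \pm W(x)\big)$ and $W$ is given by
(4.10).

Proof of Theorem 4.2. We now use the asymptotic expansions (4.8) and (4.17)
to prove the reconstruction formulae (4.1) and (4.2). Since the proofs are analogous,
we only treat (4.1). With the previous notations ($Q^-$ taken from (4.17) and $Q^+$ from (4.10)), we have
$$F_l(\lambda) = \Big\langle U\Big(1 + \frac{1}{\lambda}Q^-(x)\Big)e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_-\psi,\; U\Big(1 + \frac{1}{\lambda}Q^+(x)\Big)e^{i\Gamma^1c_0x}P_-\eta\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.18}$$
Since $U$ is unitary and since $\Gamma^1P_- = -P_-$, we re-express (4.18) as
$$F_l(\lambda) = \langle e^{-i\beta - i(c_+ - c_0)x}P_-\psi, P_-\eta\rangle + \frac{1}{\lambda}\Big\langle e^{-i\beta - i(c_+ - c_0)x}\big(Q^-(x) + (Q^+)^*(x)\big)P_-\psi, P_-\eta\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.19}$$
From the explicit expressions of $Q^+$ and $Q^-$, (4.19) becomes
$$F_l(\lambda) = \langle e^{-i\beta - i(c_+ - c_0)x}P_-\psi, P_-\eta\rangle + \frac{1}{\lambda}\Big\langle e^{-i\beta - i(c_+ - c_0)x}\Big(\frac{i}{2}\int_{-\infty}^{+\infty} W^2(s)\,ds - W(x)\Big)P_-\psi, P_-\eta\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.20}$$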

Finally, observe that $W(x)P_- = P_+W(x)$ by (2.16) and that $\langle P_+\psi, P_-\eta\rangle = 0$.
Hence we obtain for (4.20)
$$F_l(\lambda) = \langle e^{-i\beta - i(c_+ - c_0)x}P_-\psi, P_-\eta\rangle + \frac{i}{2\lambda}\Big(\int_{-\infty}^{+\infty} W^2(s)\,ds\Big)\langle e^{-i\beta - i(c_+ - c_0)x}P_-\psi, P_-\eta\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.21}$$
Denoting
$$\Theta(x) = e^{-i\beta - i(c_+ - c_0)x},$$
$$A(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty} W^2(s)\,ds\Big)\Theta(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty}\big(a_l^2(s) + b^2(s)\big)\,ds\Big)\Theta(x),$$
we have proved the reconstruction formula (4.1). This finishes the proof of
Theorem 4.2.

Proof of Theorem 4.1. We prove here that the parameters $M$, $Q$ and $\Lambda$ are
uniquely determined from the knowledge of the high energies of the transmission
operator $T_R$. Note that the proof with the high energies of $T_L$ is the same. Consider
two transmission operators $T_{R,1}$ and $T_{R,2}$ corresponding to parameters $M_j$, $Q_j$, $\Lambda_j$
with $j = 1, 2$, where moreover $m$ and $q \neq 0$ are supposed to be known and fixed. In what
follows, we shall denote all the objects associated to $T_{R,j}$ by the usual notations
with a lower index $j$.
We assume that $T_{R,1} = T_{R,2}$. From the definition of $F_l(\lambda)$ it follows then that
$F_{l,1}(\lambda) = F_{l,2}(\lambda)$. We identify now the terms of same orders in the asymptotic
expansion (4.1). Since the admissible test functions $\psi, \eta$ are dense in $\mathcal{H}$, we get
$$\Theta_1(x) = \Theta_2(x), \quad \forall x \in \mathbb{R}, \tag{4.22}$$
$$A_1(x) = A_2(x), \quad \forall x \in \mathbb{R}. \tag{4.23}$$
Let us analyze the term of order 0 first. From (4.22) and (4.3), we have
$$-i\beta_1 - i(c_{+,1} - c_{0,1})x = -i\beta_2 - i(c_{+,2} - c_{0,2})x + 2ik\pi, \quad \forall x \in \mathbb{R}, \tag{4.24}$$
where $k \in \mathbb{Z}$. If we differentiate (4.24) with respect to $x$, we thus obtain
$$c_{0,1} - c_{+,1} = c_{0,2} - c_{+,2}. \tag{4.25}$$
Hence using (4.25) and (2.29), we see that the quantity
$$X = c_0 - c_+ = qQ\,\frac{r_+ - r_0}{r_0r_+}, \tag{4.26}$$
is uniquely determined.

We now analyze the term of order $O(\lambda^{-1})$. From (4.23), (4.3) and (4.22) again,
we have

\[ \int_{-\infty}^{+\infty} W_1^2(s) \, ds = \int_{-\infty}^{+\infty} W_2^2(s) \, ds. \tag{4.27} \]

Using that $W^2(x) = a_l^2(x) + b^2(x)$ and the expressions of the potentials $a_l$ and
$b$ given by (2.15) and the definition of the Regge–Wheeler variable (2.6), we can
compute explicitly the integrals that appear in (4.27). In fact we have

\[ \int_{-\infty}^{+\infty} W^2(s) \, ds = \left( l + \frac{1}{2} \right)^2 \left( \frac{1}{r_0} - \frac{1}{r_+} \right) + m^2 (r_+ - r_0). \tag{4.28} \]

By homogeneity in $l$ and since $m$ is considered as known and fixed, we deduce from
(4.27) and (4.28) that

\[ \frac{r_{+,1} - r_{0,1}}{r_{0,1} r_{+,1}} = \frac{r_{+,2} - r_{0,2}}{r_{0,2} r_{+,2}}, \tag{4.29} \]

\[ r_{+,1} - r_{0,1} = r_{+,2} - r_{0,2}. \tag{4.30} \]

Hence the quantities

\[ Y = \frac{r_+ - r_0}{r_0 r_+}, \qquad Z = r_+ - r_0, \tag{4.31} \]

are uniquely determined.
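The homogeneity argument can be illustrated numerically. The sketch below is only an illustration, not part of the proof: it assumes that the integral in (4.28) is available for two different values of $l$ and that $m \neq 0$, and it solves the resulting 2 × 2 linear system for $Y$ and $Z$. The function names and the numerical values of the horizon radii are hypothetical.

```python
def integral_W2(l, m, r0, rp):
    """Closed form (4.28) for the integral of W^2 over the real line."""
    return (l + 0.5) ** 2 * (1.0 / r0 - 1.0 / rp) + m ** 2 * (rp - r0)


def recover_Y_Z(l_a, I_a, l_b, I_b, m):
    """Solve I(l) = (l + 1/2)^2 * Y + m^2 * Z for (Y, Z) from two values of l.

    Assumes l_a != l_b and m != 0 (illustrative assumptions).
    """
    w_a = (l_a + 0.5) ** 2
    w_b = (l_b + 0.5) ** 2
    Y = (I_a - I_b) / (w_a - w_b)
    Z = (I_a - w_a * Y) / m ** 2
    return Y, Z


# Hypothetical horizon radii and field mass, chosen only for illustration.
r0_true, rp_true, m = 2.0, 10.0, 0.3
I_half = integral_W2(0.5, m, r0_true, rp_true)
I_three_half = integral_W2(1.5, m, r0_true, rp_true)
Y, Z = recover_Y_Z(0.5, I_half, 1.5, I_three_half, m)
print(Y, Z)
```

The recovered pair should agree with $Y = (r_+ - r_0)/(r_0 r_+)$ and $Z = r_+ - r_0$ computed directly from the chosen radii.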
We can now show the uniqueness of the parameters $M$, $Q$ and $\Lambda$ as follows. We
first note the following relation

\[ X = qQY. \tag{4.32} \]

Since $X$, $Y$ are uniquely determined and $q$ is supposed to be known and fixed, we
deduce from (4.32) that $Q$ is uniquely determined, i.e. $Q_1 = Q_2 = Q$.
Moreover, from (4.31) we deduce that $r_+ - r_0$ and $r_0 r_+$ are uniquely determined.
Hence so are $r_0$ and $r_+$: indeed, $r_+$ and $-r_0$ are the two roots of the second-order
polynomial $t^2 - Zt - Z/Y$. Now recall that $r_0$ and $r_+$ are roots of $F(r) = 0$. The equations
$F(r_0) = 0$ and $F(r_+) = 0$ can be written using (2.2) as the linear system

\[ \begin{pmatrix} \dfrac{2}{r_+} & \dfrac{r_+^2}{3} \\[2mm] \dfrac{2}{r_0} & \dfrac{r_0^2}{3} \end{pmatrix} \begin{pmatrix} M \\ \Lambda \end{pmatrix} = \begin{pmatrix} 1 + \dfrac{Q^2}{r_+^2} \\[2mm] 1 + \dfrac{Q^2}{r_0^2} \end{pmatrix}. \tag{4.33} \]

The determinant of (4.33) is $\frac{2}{3} \, \frac{r_0^3 - r_+^3}{r_0 r_+}$ and is clearly nonzero. Hence $(M, \Lambda)$ is the
unique solution of the system (4.33), whose coefficients depend only on $r_0$, $r_+$, $Q$,
which are uniquely determined by the previous discussion. We thus conclude that
$M$ and $\Lambda$ are also uniquely determined, i.e. $M_1 = M_2$ and $\Lambda_1 = \Lambda_2$, and the proof
of Theorem 4.1 is finished.
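The reconstruction steps of this proof can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the standard form $F(r) = 1 - 2M/r + Q^2/r^2 - \Lambda r^2/3$ for the metric function referred to in (2.2), and the input values of $X$, $Y$, $Z$ and $q$ are hypothetical.

```python
import math


def F(r, M, Q, Lam):
    """Metric function, assumed here to be 1 - 2M/r + Q^2/r^2 - Lam*r^2/3."""
    return 1.0 - 2.0 * M / r + Q ** 2 / r ** 2 - Lam * r ** 2 / 3.0


def recover_parameters(X, Y, Z, q):
    """Recover (M, Q, Lam, r0, rp) from the quantities X, Y, Z of (4.26), (4.31)."""
    Q = X / (q * Y)                      # relation (4.32)
    prod = Z / Y                         # equals r0 * rp
    # rp and -r0 are the two roots of t^2 - Z*t - prod = 0.
    disc = math.sqrt(Z * Z + 4.0 * prod)
    rp = (Z + disc) / 2.0
    r0 = (disc - Z) / 2.0
    # Linear system (4.33) for (M, Lam), solved by Cramer's rule.
    a11, a12, b1 = 2.0 / rp, rp ** 2 / 3.0, 1.0 + Q ** 2 / rp ** 2
    a21, a22, b2 = 2.0 / r0, r0 ** 2 / 3.0, 1.0 + Q ** 2 / r0 ** 2
    det = a11 * a22 - a12 * a21
    M = (b1 * a22 - a12 * b2) / det
    Lam = (a11 * b2 - a21 * b1) / det
    return M, Q, Lam, r0, rp


# Hypothetical data corresponding to r0 = 2, rp = 10, Q = 0.5, q = 1.
M, Q, Lam, r0, rp = recover_parameters(X=0.2, Y=0.4, Z=8.0, q=1.0)
print(M, Q, Lam, r0, rp)
print(F(r0, M, Q, Lam), F(rp, M, Q, Lam))
```

As a consistency check, the recovered radii should be zeros of the assumed $F$ with the recovered $M$, $Q$ and $\Lambda$.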

4.2. The inverse problem on an interval of energy


In this last subsection, we solve the inverse problem when the reflection operators
$L$ or $R$ are supposed to be known on a (possibly small) interval of energy. We follow
the usual stationary approach of inverse scattering on the line and refer to [8, 6]
for a presentation of the general method in the case of one-dimensional Schrödinger
operators and to [1] for an application to massless Dirac operators (see also [12, 15]
for massive Dirac operators). We first determine a stationary representation of the
scattering operator $S$ expressed in terms of the usual transmission and reflection
coefficients (here matrices). We do this by a series of simplifications of our model which
finally reduces it to the exact framework studied in [1]. We then use the exponential
decay of the potentials to show that the reflection coefficients $R$ and $L$ can be
extended analytically to a small strip around the real axis. Consequently, by analytic
continuation, the reflection coefficients $R$ or $L$ are uniquely determined on $\mathbb{R}$ if they
are known on any interval of energy. Finally, we use the results of [1], which rely on a
classical Marchenko method, to prove that the parameters $M$, $Q$ and $\Lambda$ are uniquely
determined by the knowledge of $R(\lambda)$ or $L(\lambda)$ for all energies.
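Recall that the scattering operator $S$ is defined by

\[ S = (W^+)^* W^-, \]

where the global wave operators $W^\pm$ are given when $\Lambda > 0$ by

\[ W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}, \tag{4.34} \]

with

\[ W^\pm_{(-\infty)} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_0} P_\mp, \qquad W^\pm_{(+\infty)} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_+} P_\pm. \tag{4.35} \]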

We now use the unitary transform U introduced in (3.101) and the corresponding

simplied expressions of W() obtained in (3.110) and (4.16) to express (4.34) as

1 1 1
W = U W (A, A0 )(ei c0 x
P + ei ei c+ x
P ). (4.36)

Here we have used the notations introduced in Secs. 3.2 and 4.1. Let us denote by
1 1 1
G the operators ei c0 x P + ei ei c+ x P appearing in (4.36) and by S(A, A0 )
the scattering operator associated to the operators A and A0 , i.e.

S(A, A0 ) = (W + (A, A0 )) W (A, A0 ).

Using the unitarity of U we thus immediately get the following expression for the
scattering operator S

S = G+ S(A, A0 )G . (4.37)

The pair of operators $(A, A_0)$ acting on $\mathcal{H}$ turns out to fit the framework studied
in [1]. Recall that they are given by $A_0 = \Gamma^1 D_x$ and $A = A_0 + W(x)$, where the

1 1
potential W (x) = ei C (x) (a(x)2 + b(x)0 )ei C (x) is the 4 matrix-valued
function
   
0 k(x) 2iC (x) ib(x) a(x)
W (x) = , k(x) = e . (4.38)
k (x) 0 a(x) ib(x)
Here $k^*(x)$ denotes the conjugate transpose of the matrix-valued function $k(x)$.
Moreover $W$ satisfies (4.12) and (4.13) and thus its entries belong to $L^1(\mathbb{R})$. This
is precisely the kind of operator studied in [1]. Note however that our potential
$W$ is better than $L^1(\mathbb{R})$ since it is exponentially decreasing at both ends $x \to \pm\infty$.
This will be used hereafter. As a consequence, we can use the following stationary
representation of $S(A, A_0)$ obtained in [1]. Let us introduce the unitary transform
$F$ on $\mathcal{H}$ defined by

1 1
F () = ei x (x)dx, (4.39)
2 R
then we have (see [1, p. 143])

\[ S(A, A_0) = F^* S_0(\lambda) F, \tag{4.40} \]

where the scattering matrix $S_0(\lambda)$ takes the form

\[ S_0(\lambda) = \begin{pmatrix} T_L(\lambda) & R(\lambda) \\ L(\lambda) & T_R(\lambda) \end{pmatrix}. \tag{4.41} \]

Here $T_L(\lambda)$ and $T_R(\lambda)$ are $2 \times 2$ matrix-valued functions which correspond to the
usual transmission coefficients of $S$, whereas $L(\lambda)$ and $R(\lambda)$ are $2 \times 2$ matrix-valued
functions which correspond to the usual reflection coefficients of $S$. We refer to
[1, Secs. 2 and 3] for the definition and the construction of the scattering matrix
$S_0(\lambda)$. Hence (4.37) becomes

\[ S = (F G^+)^* S_0(\lambda) F G^-. \tag{4.42} \]

We now finish our factorization of the scattering operator $S$ as follows. Using $2 \times 2$
block matrix notations, we note that
 i   ic x     ic x 
e 0 e + 0 1 0 e 0 0
G+ = , G = ,
0 1 0 eic0 x 0 ei 0 eic+ x
and we dene two unitary transforms F on H by
 ic x 
e + 0
F+ () = F ()
0 eic0 x
  ix+ic+ x 
1 e 0
= (x)dx, (4.43)
2 R 0 eixic0 x
and
 
eic0 x 0
F () = F ()
0 eic+ x
  
1 eix+ic0 x 0
= (x)dx. (4.44)
2 R 0 eixic+ x

Then we have
   
ei 0 1 0
F G+ = F+ , F G = F . (4.45)
0 1 0 ei
Hence we conclude from (4.45) that the scattering operator (4.42) factorizes as
 i 
e TL () e2i R()
S = F+ F . (4.46)
L() ei TR ()
We summarize this result as a proposition.

Proposition 4.1. The scattering operator $S$ has the following stationary representation.
If $F^\pm$ are the unitary transforms defined in (4.43) and (4.44), then

\[ S = (F^+)^* S(\lambda) F^-, \tag{4.47} \]

where the $4 \times 4$ scattering matrix $S(\lambda)$ is given by
 i 
e TL () e2i R()
S() = , (4.48)
L() ei TR ()
and the quantities $T_L$, $T_R$ and $L$, $R$ are the $2 \times 2$ matrices that correspond to the
transmission and reflection matrices of $S(A, A_0)$ respectively and are obtained in
[1, Secs. 2 and 3].

Remark 4.1. As the notations suggest, the diagonal elements of the scattering
matrix $S(\lambda)$ given in (4.48) are simply the stationary representations of the
transmission operators $T_L$ and $T_R$ introduced in Sec. 2, (2.33). The anti-diagonal
elements of $S(\lambda)$ are in turn the stationary representations of the reflection operators
$L$ and $R$ in (2.34).

Remark 4.2. The unitary operators $F^\pm$ appearing in the stationary representation
(4.47) of $S$ are natural in the following sense. Let us define the two selfadjoint
operators on $\mathcal{H}$

\[ H^+ = (\Gamma^1 D_x + c_+) P_+ + (\Gamma^1 D_x + c_0) P_-, \qquad H^- = (\Gamma^1 D_x + c_0) P_+ + (\Gamma^1 D_x + c_+) P_-. \]

Then it is clear from (4.34) and (4.35) that the global wave operators can be
written in a classical form as

\[ W^\pm = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH^\pm}. \]

Now it is an easy calculation to show that the unitary transforms $F^\pm$ introduced
in (4.43) and (4.44) are precisely the unitary transforms which diagonalize the
operators $H^\pm$, i.e.

\[ H^\pm = (F^\pm)^* M_\lambda F^\pm, \]

where $M_\lambda$ denotes the multiplication operator by $\lambda$. We conclude that (4.47)
together with (4.48) is the expected stationary representation of the scattering
operator $S$.

In the sequel, we shall use the explicit link between our scattering matrix $S(\lambda)$
and the scattering matrix $S_0(\lambda)$ thoroughly studied in [1] in order to solve the
inverse problem. Let us first briefly summarize some of the main results obtained
in [1]. Under the assumption $W \in L^1(\mathbb{R})$, the scattering matrix $S_0(\lambda)$ is continuous
for $\lambda \in \mathbb{R}$ and tends to $I_4$ when $\lambda \to \pm\infty$. It is also unitary for each $\lambda \in \mathbb{R}$ (see
[1, Theorem 3.1] for a proof of these statements and for other properties of $S_0(\lambda)$).
Moreover, the following partial characterization result holds:

Theorem 4.3 ([1, Theorem 6.3]). Assume that the reflection operators $R(\lambda)$
and $L(\lambda)$ are $2 \times 2$ matrix-valued functions satisfying

\[ \sup_{\lambda \in \mathbb{R}} \|R(\lambda)\| < 1, \quad \sup_{\lambda \in \mathbb{R}} \|L(\lambda)\| < 1, \quad \hat{R} \in L^1(\mathbb{R}), \quad \hat{L} \in L^1(\mathbb{R}), \tag{4.49} \]
 +  0

R() 2
d < ,
L() 2
d < , (4.50)
0


where $\hat{R}$ and $\hat{L}$ denote the usual Fourier transforms of $R(\lambda)$ and $L(\lambda)$, and $\|\cdot\|$
is the Euclidean norm of a given matrix. Then the matrix-valued function $k(x) \in
L^1(\mathbb{R})$ in (4.38) (and thus the potential $W(x)$) can be uniquely recovered from the
knowledge of $R(\lambda)$ and $L(\lambda)$ for all $\lambda \in \mathbb{R}$.

We make several comments on this result and how we can apply it to our model:
The proof of the above theorem uses a classical Marchenko method. For instance,
the matrix-valued function k(x) can be obtained after solving the following
Marchenko integral equations for > 0 (see [1, Eqs. (6.9) and (6.11)])
 + +
B1 (x, ) = R(
+ 2x) + + + 2x) R(
B1 (x, )R( + + 2x)dd,
0 0
(4.51)
 + +
2x) +
B2 (x, ) = L( + 2x) dd.
+ 2x)L(
B2 (x, )L(
0 0
(4.52)
Under the assumption (4.49), the integral equations (4.51) and (4.52) are uniquely
solvable in $L^1(\mathbb{R}^+)$ ([1, Theorem 6.2]). Moreover, under the additional assumption
(4.50), the matrix-valued function $k(x)$ defined using the boundary values
of $B_1$ and $B_2$ by the formulae (see [1, Eq. (4.19)])
k(x) = 2iB1 (x, 0+ ), x > 0, k(x) = 2iB2 (x, 0+ ), x < 0,
can be shown to be in $L^1(\mathbb{R})$ and thus corresponds to the potential we are looking
for.

If the potential $W$ belongs to $L^1(\mathbb{R})$, then the condition (4.49) is automatically
satisfied (see [1, Theorem 4.2 and Eq. (6.17)]). Although this condition is the
natural one under which one could expect to reconstruct the potential $k$ in the
class $L^1$, the authors of [1] had to add the extra assumption (4.50) (which must
then be checked) in order to prove their result. We refer to [1, p. 154] for more

details on this point. In our case, we shall prove the condition (4.50) as follows.
Using the exponential decay of $W$, we are first able to show that the reflection
coefficients $R(\lambda)$ and $L(\lambda)$ (in fact the whole scattering matrix $S_0(\lambda)$) are analytic
on a small strip around the real axis. Moreover the functions $R(\lambda + i\delta)$ and
$L(\lambda + i\delta)$ can be shown to belong to $L^2(\mathbb{R})$ uniformly for each $|\delta|$ small enough.
It follows then from standard results on the Fourier transform (see, for instance,
[26, Theorem IX.13]) that $\hat{R}(\xi)$ and $\hat{L}(\xi)$ satisfy

\[ e^{\varepsilon |\xi|} \hat{R}(\xi) \in L^2(\mathbb{R}), \qquad e^{\varepsilon |\xi|} \hat{L}(\xi) \in L^2(\mathbb{R}), \qquad \varepsilon \text{ small enough}, \]

from which (4.50) follows immediately.


From (4.51) and (4.52) and the reconstruction procedure explained above, we see
that the knowledge of $R(\lambda)$ and $L(\lambda)$ for all $\lambda \in \mathbb{R}$ is used to recover the potential
$k(x)$ for all $x \in \mathbb{R}$. In fact it is enough to know either $R(\lambda)$ or $L(\lambda)$ for all
$\lambda \in \mathbb{R}$, since then the whole scattering matrix $S_0(\lambda)$ can be uniquely recovered.
The procedure is explained in [1, p. 147, Eqs. (5.3)–(5.5)] and we reproduce it
for completeness. Assume for instance that $R(\lambda)$ is known for all $\lambda \in \mathbb{R}$. Then
the transmission coefficients $T_L(\lambda)$ and $T_R(\lambda)$ can be obtained by performing the
factorizations

\[ T_L(\lambda) T_L(\lambda)^* = I_2 - R(\lambda) R(\lambda)^*, \qquad T_R(\lambda)^* T_R(\lambda) = I_2 - R(\lambda)^* R(\lambda), \qquad \lambda \in \mathbb{R}. \tag{4.53} \]

Under the assumption $k \in L^1(\mathbb{R})$, it was shown in [1] that the above factorization
problems are in fact left or right canonical Wiener–Hopf factorizations in the
Wiener algebra $\mathcal{W}^4$ and thus lead to unique $T_L(\lambda)$ and $T_R(\lambda)$ (see for instance
[11, Theorem 9.2, p. 831]). Finally, the reflection coefficient $L(\lambda)$ is recovered
from $R(\lambda)$ by the formula

\[ L(\lambda) = -T_R(\lambda) R(\lambda)^* (T_L(\lambda)^*)^{-1}. \tag{4.54} \]

Eventually we explain how we can apply this result to our model. From
Proposition 4.1, we assume for instance that e2i R() is known for all R.
Then it is easy to see from (4.53) and (4.54) that we can uniquely recover TL ()
and TR () by performing WienerHopf factorizations and then e2i L() for all
R. Note that the exponential term e2i disappears in the factorization
(4.53). If we assume that the assumptions (4.49) and (4.50) hold (this will be
checked below), then we can apply Theorem 4.3 as follows. Multiplying the inte-
gral equations (4.51) and (4.52) by e2i and solving them, we conclude that we
can uniquely recover e2i k(x) (and not k(x)) for all x R. We shall show below
that this implies the uniqueness of the parameters M, Q and of the black hole.
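The relations (4.53) and (4.54) are consequences of the unitarity of the scattering matrix $S_0(\lambda)$. As a hedged illustration, the sketch below checks their scalar analogue, in which the $2 \times 2$ blocks are replaced by complex numbers; this simplification is an assumption made for brevity and it ignores the non-commutativity of the matrix case. The parametrization of the unitary matrix and all numerical values are hypothetical.

```python
import cmath
import math


def unitary_2x2(theta, a_phase, b_phase, global_phase):
    """Entries (t_l, r, l, t_r) of a generic 2x2 unitary matrix [[t_l, r], [l, t_r]]."""
    a = math.cos(theta) * cmath.exp(1j * a_phase)
    b = math.sin(theta) * cmath.exp(1j * b_phase)
    g = cmath.exp(1j * global_phase)
    return g * a, g * b, -g * b.conjugate(), g * a.conjugate()


t_l, r, l, t_r = unitary_2x2(0.6, 0.3, -1.1, 0.7)

# Scalar analogue of (4.53): |t_l|^2 = 1 - |r|^2 and |t_r|^2 = 1 - |r|^2.
lhs_left = abs(t_l) ** 2
lhs_right = abs(t_r) ** 2
rhs = 1.0 - abs(r) ** 2

# Scalar analogue of (4.54): l = -t_r * conj(r) / conj(t_l).
l_from_r = -t_r * r.conjugate() / t_l.conjugate()

print(lhs_left, lhs_right, rhs)
print(l, l_from_r)
```

In the scalar case the modulus of the transmission coefficient is fixed by the reflection coefficient, while its phase is what the Wiener–Hopf factorization mentioned above is needed to determine.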

Let us now show the analyticity of $R(\lambda)$ and $L(\lambda)$ on a small strip around the
real axis and prove there the uniform $L^2$ estimates mentioned above. To do this we
need to introduce some objects whose existence has been shown in [1, Secs. 1–3].

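The reflection coefficients $R(\lambda)$ and $L(\lambda)$ can be expressed in terms of solutions of
the stationary problem

\[ [\Gamma^1 D_x + W(x)] X(x, \lambda) = \lambda X(x, \lambda), \quad \lambda \in \mathbb{R}, \tag{4.55} \]

where $X(x, \lambda)$ is understood as a $4 \times 4$ matrix-valued function. Of special interest
are the Jost solutions $F_l(x, \lambda)$ and $F_r(x, \lambda)$ of (4.55), which are singled out by the
specific asymptotics at infinity

\[ F_l(x, \lambda) = e^{i \Gamma^1 \lambda x} (I_4 + o(1)), \quad x \to +\infty, \]
\[ F_r(x, \lambda) = e^{i \Gamma^1 \lambda x} (I_4 + o(1)), \quad x \to -\infty. \]

For each $\lambda \in \mathbb{R}$, these two solutions exist, are fundamental matrices of (4.55) and
are related as follows ([1, Proposition 2.2]). There exist two $4 \times 4$ matrix-valued
functions $a_l(\lambda)$ and $a_r(\lambda)$ such that

\[ F_l(x, \lambda) = F_r(x, \lambda) a_l(\lambda), \qquad F_r(x, \lambda) = F_l(x, \lambda) a_r(\lambda), \]

and satisfying $a_l(\lambda) a_r(\lambda) = a_r(\lambda) a_l(\lambda) = I_4$ for all $\lambda \in \mathbb{R}$. Note that $F_l(x, \lambda)$ and
$F_r(x, \lambda)$ satisfy the asymptotics (at the opposite ends)

\[ F_l(x, \lambda) = e^{i \Gamma^1 \lambda x} (a_l(\lambda) + o(1)), \quad x \to -\infty, \qquad F_r(x, \lambda) = e^{i \Gamma^1 \lambda x} (a_r(\lambda) + o(1)), \quad x \to +\infty. \tag{4.56} \]

Let us now express $a_l(\lambda)$ and $a_r(\lambda)$ using $2 \times 2$ block matrix notations as

\[ a_l(\lambda) = \begin{pmatrix} a_{l1}(\lambda) & a_{l2}(\lambda) \\ a_{l3}(\lambda) & a_{l4}(\lambda) \end{pmatrix}, \qquad a_r(\lambda) = \begin{pmatrix} a_{r1}(\lambda) & a_{r2}(\lambda) \\ a_{r3}(\lambda) & a_{r4}(\lambda) \end{pmatrix}. \]

Then the reflection coefficients are defined by ([1, Eqs. (3.6) and (3.7)])

\[ R(\lambda) = a_{r2}(\lambda) a_{r4}(\lambda)^{-1} = -a_{l1}(\lambda)^{-1} a_{l2}(\lambda), \]

\[ L(\lambda) = a_{l3}(\lambda) a_{l1}(\lambda)^{-1} = -a_{r4}(\lambda)^{-1} a_{r3}(\lambda). \]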

Since the situations are obviously symmetric, we shall only prove the analyticity
and the uniform $L^2$ estimate on a small strip around the real axis for $R(\lambda)$ (the proof
for $L(\lambda)$ being identical). Moreover, we shall only consider the definition $R(\lambda) =
-a_{l1}(\lambda)^{-1} a_{l2}(\lambda)$ for simplicity. To go further, we use some integral representations
of the coefficients $a_{l1}(\lambda)$ and $a_{l2}(\lambda)$ obtained in [1]. These are given in terms of the
Faddeev matrix $M_l(x, \lambda)$ defined by

\[ M_l(x, \lambda) = F_l(x, \lambda) e^{-i \Gamma^1 \lambda x}. \]

It is easy to see from (4.55) that $M_l(x, \lambda)$ must satisfy the integral equation
([1, Eq. (2.12)])
 +
1 1
Ml (x, ) = I4 i1 ei (yx) W (y)Ml (y, )ei (yx) dy, (4.57)
x

and from (4.56) that $M_l(x, \lambda)$ must satisfy the asymptotics $M_l(x, \lambda) = I_4 + o(1)$
when $x \to +\infty$. In fact, using once again $2 \times 2$ block matrix notations for $M_l(x, \lambda)$,

\[ M_l(x, \lambda) = \begin{pmatrix} M_{l1}(x, \lambda) & M_{l2}(x, \lambda) \\ M_{l3}(x, \lambda) & M_{l4}(x, \lambda) \end{pmatrix}, \]

and iterating (4.57) once, we get the uncoupled system of integral equations for
$M_{l3}(x, \lambda)$ and $M_{l4}(x, \lambda)$ ([1, Eqs. (2.15) and (2.16)])
 +
Ml3 (x, ) = i e2i(yx) k(y) dy
x
 +  +
+ e2i(yx) k(y) k(z)Ml3 (z, )dzdy, (4.58)
x y
 +  +
Ml4 (x, ) = I4 + e2i(zy) k(y) k(z)Ml4 (z, )dzdy, (4.59)
x y

and similar equations for $M_{l1}(x, \lambda)$ and $M_{l2}(x, \lambda)$ that we shall not need. Finally,
the following integral representations for the coefficients $a_{l1}(\lambda)$ and $a_{l2}(\lambda)$ hold
([1, Eqs. (2.25) and (2.26)])

al1 () = I2 i k(y)Ml3 (y, )dy, (4.60)
R

al2 () = i e2iy k(y) Ml4 (y, )dy. (4.61)
R

We first study the coefficient $a_{l2}(\lambda)$ expressed in terms of the Faddeev matrix
$M_{l4}(x, \lambda)$. Under the assumption $k \in L^1(\mathbb{R})$, a solution $M_{l4}(x, \lambda)$ of (4.59) with
the right asymptotics is easily shown to exist by iteration. Moreover for each fixed
$x \in \mathbb{R}$, this solution can be extended to a continuous function in the variable $\lambda$
when $\mathrm{Im}\,\lambda \leq 0$ and analytic when $\mathrm{Im}\,\lambda < 0$ ([1, Proposition 2.3]). We now prove
the following result.

Lemma 4.1. Define the function $P(x, \lambda) = \int_x^{+\infty} e^{2 |\mathrm{Im}\,\lambda| |y|} \|k(y)\| \, dy$. Then there
exists $\varepsilon > 0$ small enough such that
(i) For all $\lambda$ satisfying $|\mathrm{Im}\,\lambda| \leq \varepsilon$ and for all $x \in \mathbb{R}$, the function $P(x, \lambda)$ is
uniformly bounded.
(ii) For each fixed $x \in \mathbb{R}$, the Faddeev matrix $M_{l4}(x, \lambda)$ can be extended analytically
to the strip $|\mathrm{Im}\,\lambda| < \varepsilon$. Moreover, for each such $\lambda$, it satisfies the estimate

\[ \|M_{l4}(x, \lambda)\| \leq C \cosh(P(x, \lambda)). \tag{4.62} \]

(iii) For each fixed $x \in \mathbb{R}$, the derivative $M_{l4}'(x, \lambda)$ of the Faddeev matrix with
respect to the variable $x$ can be extended analytically to the strip $|\mathrm{Im}\,\lambda| < \varepsilon$.
Moreover, for each such $\lambda$, it satisfies the estimate

\[ \|M_{l4}'(x, \lambda)\| \leq C \sinh(P(x, \lambda)). \tag{4.63} \]

Proof. The rst assertion is a direct consequence of the denition of P (x, ) and
(4.13) (take for instance = 2 where is the positive number that appears in
#
(4.13)). Solving (4.59) by iteration leads to set Ml4 (x, ) = n=0 un (x, ) with
u0 (x, ) = I2 and
 +  +
un (x, ) = e2i(zy) k(y) k(z)un1 (z, )dzdy, n 1. (4.64)
x y

By induction we get the estimates

\[ \|u_n(x, \lambda)\| \leq \frac{P(x, \lambda)^{2n}}{(2n)!}, \quad \forall n \in \mathbb{N}. \tag{4.65} \]

Together with (i), this entails the second assertion. To prove the third one, we
consider the series of derivatives $\sum_{n=1}^{\infty} u_n'(x, \lambda)$. From (4.64), note that
 +

un (x, ) = e2i(zx) k(x) k(z)un1 (z, )dzdy.
x
By induction and using (4.65), we get the estimates $\|u_n'(x, \lambda)\| \leq C \, \frac{P(x, \lambda)^{2n-1}}{(2n-1)!}$ for
all $n \geq 1$, from which we deduce (iii).
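The bounds (4.62) and (4.63) come from summing the termwise estimates: the even-power series is the Taylor series of cosh and the odd-power series is that of sinh. The short sketch below checks this numerically for one value of $P$; the function name and the chosen value are hypothetical and serve only as an illustration.

```python
import math


def majorant_series(P, n_terms=30):
    """Partial sums of sum_{n>=0} P^(2n)/(2n)! and sum_{n>=1} P^(2n-1)/(2n-1)!."""
    even = sum(P ** (2 * n) / math.factorial(2 * n) for n in range(n_terms))
    odd = sum(P ** (2 * n - 1) / math.factorial(2 * n - 1) for n in range(1, n_terms))
    return even, odd


P_value = 1.7  # hypothetical value of P(x, lambda)
even_sum, odd_sum = majorant_series(P_value)
print(even_sum, math.cosh(P_value))
print(odd_sum, math.sinh(P_value))
```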

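Corollary 4.1. Let $\varepsilon$ be the positive number defined in Lemma 4.1. The coefficient
$a_{l2}(\lambda)$ is analytic on the strip $|\mathrm{Im}\,\lambda| < \varepsilon$. Moreover, it satisfies there the estimate

\[ a_{l2}(\lambda) = O(|\lambda|^{-1}), \quad |\lambda| \to \infty. \tag{4.66} \]

Proof. The analyticity on the strip $|\mathrm{Im}\,\lambda| < \varepsilon$ follows directly from (4.61) and
Lemma 4.1. To prove the second assertion, we integrate by parts in (4.61). For all
$\lambda$ with $|\mathrm{Im}\,\lambda| < \varepsilon$, we obtain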

1
al2 () = e2iy (k  (y)Ml4 (y, ) + k(y)Ml4

(y, ))dy. (4.67)
2 R
Since $k'$ also satisfies the estimate (4.13) and using Lemma 4.1 again, we conclude
that $\|a_{l2}(\lambda)\| \leq \frac{C}{|\lambda|}$.

We now study the coefficient $a_{l1}(\lambda)$ expressed in terms of the Faddeev matrix
$M_{l3}(x, \lambda)$. Once again under the assumption $k \in L^1(\mathbb{R})$, a solution $M_{l3}(x, \lambda)$ of
(4.58) with the right asymptotics is easily shown to exist by iteration. Moreover
for each fixed $x \in \mathbb{R}$, this solution can be extended to a continuous function in the
variable $\lambda$ when $\mathrm{Im}\,\lambda \geq 0$ and analytic when $\mathrm{Im}\,\lambda > 0$ ([1, Proposition 2.3]). Using
the same function $P(x, \lambda)$ and positive number $\varepsilon$ as in Lemma 4.1, let us prove the
following result.

Lemma 4.2. For each fixed $x \in \mathbb{R}$, the Faddeev matrix $M_{l3}(x, \lambda)$ can be extended
analytically to the strip $|\mathrm{Im}\,\lambda| < \varepsilon$. Moreover, for each such $\lambda$, it satisfies the
estimates

\[ \|M_{l3}(x, \lambda)\| \leq C e^{2 |\mathrm{Im}\,\lambda| |x|} \sinh(P(x, \lambda)), \tag{4.68} \]

\[ \|M_{l3}(x, \lambda)\| \leq \frac{C}{|\lambda|} \left( 1 + e^{2 |\mathrm{Im}\,\lambda| |x|} \right), \quad |\lambda| \geq 1. \tag{4.69} \]
#
Proof. We solve (4.58) by iteration. Hence we set Ml3 (x, ) = n=0 vn (x, ) with
 +
v0 (x, ) = i e2i(yx) k(y)dy,
x

and
 +  +
vn (x, ) = e2i(yx) k(y) k(z)vn1 (z, )dzdy. (4.70)
x y

We can prove the following estimate by induction

\[ \|v_n(x, \lambda)\| \leq e^{2 |\mathrm{Im}\,\lambda| |x|} \, \frac{P(x, \lambda)^{2n+1}}{(2n+1)!}, \quad \forall n \in \mathbb{N}, \tag{4.71} \]

which immediately implies (4.68). Moreover, since $P(x, \lambda)$ is uniformly bounded on
$|\mathrm{Im}\,\lambda| < \varepsilon$, we deduce from (4.68) the analyticity of $M_{l3}(x, \lambda)$ on the same strip. To
prove (4.69), we integrate by parts in (4.58) with respect to the variable $y$. For all
$\lambda$ with $|\mathrm{Im}\,\lambda| < \varepsilon$, we obtain

k (x) e2ix + 2iy 
Ml3 (x, ) = e (k ) (y)dy
2 2 x

k (x)K(x) e2ix + 2iy 
e ((k ) (y)K(y)
2i 2i x

k (y)k(y)Ml3 (y, ))dy, (4.72)


 +
where we have introduced the function $K(x) = \int_x^{+\infty} k(y) M_{l3}(y, \lambda) \, dy$. Now using
(4.13) for $k$ and $k'$, (4.68) and the uniform estimate $\|K(x)\| \leq C$ for all $\lambda$ with
$|\mathrm{Im}\,\lambda| < \varepsilon$, we deduce from (4.72) that (4.69) holds when $|\lambda|$ is large.

Corollary 4.2. Let $\varepsilon$ be the positive number defined in Lemma 4.1. Then the
coefficient $a_{l1}(\lambda)$ is analytic on the strip $|\mathrm{Im}\,\lambda| < \varepsilon$ and tends to $I_2$ when $|\lambda| \to \infty$.
Furthermore, possibly considering a smaller $\varepsilon$, the coefficient $a_{l1}(\lambda)$ is invertible on
the strip $|\mathrm{Im}\,\lambda| < \varepsilon$ and $a_{l1}^{-1}(\lambda)$ is analytic and uniformly bounded there.

Proof. The first assertion is a direct consequence of (4.60) and Lemma 4.2. Since
$a_{l1}(\lambda)$ tends to $I_2$ when $|\lambda| \to \infty$, $a_{l1}(\lambda)$ is clearly invertible for $|\lambda|$ large enough.
Since $a_{l1}(\lambda)$ is also invertible on the real axis ([1, Proposition 2.10]), we conclude
that $a_{l1}(\lambda)$ is invertible on a strip $|\mathrm{Im}\,\lambda| < \varepsilon'$ with $0 < \varepsilon' < \varepsilon$ small enough and that
$a_{l1}^{-1}(\lambda)$ is analytic and uniformly bounded on $|\mathrm{Im}\,\lambda| < \varepsilon'$. Denoting this $\varepsilon'$ by $\varepsilon$, we
have proved the corollary.

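Let us put all these results together. Since $R(\lambda) = -a_{l1}^{-1}(\lambda) a_{l2}(\lambda)$, Corollaries 4.1
and 4.2 imply that the reflection coefficient $R(\lambda)$ is analytic on a strip $|\mathrm{Im}\,\lambda| < \varepsilon$,
where $\varepsilon$ is a small enough positive number. Moreover, using the estimates of the
same corollaries, we see that $R(\lambda + i\delta) \in L^2(\mathbb{R})$ for all $|\delta| < \varepsilon$. In fact, we have

\[ \sup_{|\delta| < \varepsilon} \|R(\lambda + i\delta)\|_{L^2} < \infty. \]

Finally it follows from [26, Theorem IX.13] that the Fourier transform $\hat{R}(\xi)$ satisfies
the estimate

\[ e^{\varepsilon |\xi|} \hat{R}(\xi) \in L^2(\mathbb{R}). \tag{4.73} \]

In particular, the assumption (4.50) in Theorem 4.3 is satisfied by $R(\lambda)$.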

We finish this paper by solving the inverse problem.

Theorem 4.4. Assume that one of the reection matrices L() or e2i R()
appearing in (4.48) is known on a (possibly small) interval of $\mathbb{R}$. Assume moreover
that the mass $m$ and the charge $q \neq 0$ of the Dirac fields are known and fixed.
Then the parameters $M$, $Q$ and $\Lambda$ of the dS-RN black hole are uniquely determined.

Proof. We only give the proof when the reection matrix e2i R() is supposed
to be known on an interval I of R since the proof with L() can be treated the
same way. We consider thus e2i1 R1 () and e2i2 R2 () two reection matrices
corresponding to parameters $M_j$, $Q_j$ and $\Lambda_j$ with $j = 1, 2$, where moreover the
parameters $m$, $q \neq 0$ are supposed to be known and fixed. As usual we shall denote
all the objects related to e2ij Rj () by a lower index j in what follows. Assume
that e2i1 R1 () = e2i2 R2 () for all I. By analyticity, we thus have

e2i1 R1 () = e2i2 R2 (), R.

Using the procedure explained after Theorem 4.3, this also entails that

e2i1 L1 () = e2i2 L2 (), R.

Thanks to (4.73) and the corresponding result for L(), we can apply Theorem 4.3
(and the remarks following this theorem). Hence we obtain the equality e2i1 k1 (x) =
e2i2 k2 (x) for all x R or equivalently
1 1
e2i 1
W1 (x) = e2i 2
W2 (x), x R. (4.74)

Now recall that $W^2$ is a positive function since

\[ W^2(x) = a_l^2(x) + b^2(x) = \left( l + \frac{1}{2} \right)^2 \frac{F(r)}{r^2} + m^2 F(r). \]

Hence taking the square of (4.74) and then the modulus, we have

\[ W_1^2(x) = a_{l,1}^2(x) + b_1^2(x) = a_{l,2}^2(x) + b_2^2(x) = W_2^2(x), \quad \forall x \in \mathbb{R}. \tag{4.75} \]

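Note in particular that

\[ \int_{-\infty}^{+\infty} W_1^2(s) \, ds = \int_{-\infty}^{+\infty} W_2^2(s) \, ds. \tag{4.76} \]

Moreover by homogeneity in $l$ and since $a_l$ and $b$ are positive functions, we deduce
from (4.75) that

\[ a_{l,1}(x) = a_{l,2}(x), \qquad b_1(x) = b_2(x), \quad \forall x \in \mathbb{R}. \tag{4.77} \]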
Now since
1
C (x)
W (x) = e2i (al (x)2 + b(x)0 ),
by (2.16) it follows from (4.74) and (4.77) that
1
1 2i1 C1 (x) 1
1 2i1 C2 (x)
e2i e = e2i e , x R,
or equivalently that
1 C1 (x) = 2 C2 (x) + k, x R, (4.78)
where $k \in \mathbb{Z}$. Differentiating (4.78), we obtain

\[ c_1(x) = c_2(x), \quad \forall x \in \mathbb{R}. \tag{4.79} \]

Letting $x$ tend to $\pm\infty$, we obtain from (4.79) and (2.15)

\[ c_{0,1} = c_{0,2}, \qquad c_{+,1} = c_{+,2}. \tag{4.80} \]

Finally, we notice that (4.76) and (4.80) are precisely the conditions under
which the parameters $M$, $Q$ and $\Lambda$ were shown to be uniquely determined in the
proof of Theorem 4.1 (see the conditions (4.25) and (4.27)). We thus apply
the same procedure as before to complete the proof of the theorem.

References
[1] T. Aktosun, M. Klaus and C. van der Mee, Direct and inverse scattering for selfadjoint
Hamiltonian systems on the line, Integr. Equa. Oper. Theory 38 (2000) 129–171.
[2] S. Arians, Geometric approach to inverse scattering for the Schrödinger equation
with magnetic and electric potentials, J. Math. Phys. 38(6) (1997) 2761–2773.
[3] T. Daudé, Time-dependent scattering theory for massive charged Dirac fields by a
Reissner–Nordström black hole, preprint, Université Bordeaux 1 (2004); available
online at http://tel.archives-ouvertes.fr/tel-00011974/en/.
[4] T. Daudé and F. Nicoleau, Recovering the mass and the charge of a Reissner–Nordström
black hole by an inverse scattering experiment, Inverse Problems 24
(2008) 025017, 18 pp; Corrigendum, ibid. 25 (2009) 059801.
[5] J. Dereziński and C. Gérard, Scattering Theory of Classical and Quantum N-Particle
Systems (Springer, 1997).
[6] P. Deift and E. Trubowitz, Inverse scattering on the line, Comm. Pure Appl. Math.
32 (1979) 121–251.
[7] V. Enss and R. Weder, The geometrical approach to multidimensional inverse
scattering, J. Math. Phys. 36(8) (1995) 3902–3921.

[8] L. D. Faddeev, The inverse problem in the quantum theory of scattering II, Itogi
Nauki i Tekhniki. Ser. Sovrem. Probl. Mat. 3 (1974) 93–180.
[9] Y. Gâtel and D. R. Yafaev, Scattering theory for the Dirac operator with a long-range
electromagnetic potential, J. Funct. Anal. 184 (2001) 136–176.
[10] I. M. Gel'fand and Z. Y. Šapiro, Representations of the group of rotations of
3-dimensional space and their applications, Amer. Math. Soc. Trans. 11(2) (1956)
207–316.
[11] I. Gohberg, S. Goldberg and M. A. Kaashoek, Classes of Linear Operators, Vol. 2,
Operator Theory: Advances and Applications, Vol. 63 (Birkhäuser, 1993).
[12] B. Grébert, Inverse scattering for the Dirac operator on the real line, Inverse Problems
8 (1992) 787–807.
[13] D. Häfner and J.-P. Nicolas, Scattering of massless Dirac fields by a Kerr black hole,
Rev. Math. Phys. 16(1) (2004) 29–123.
[14] S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Space-Time,
Cambridge Monographs on Mathematical Physics, No. 1 (Cambridge Univ. Press,
1973).
[15] D. B. Hinton, A. K. Jordan, M. Klaus and J. K. Shaw, Inverse scattering on the line
for a Dirac system, J. Math. Phys. 32(11) (1991) 3015–3030.
[16] H. Isozaki and H. Kitada, Modified wave operators with time-independent modifiers,
Papers of the College of Arts and Sciences Tokyo Univ. 32 (1985) 81–107.
[17] W. Jung, Geometric approach to inverse scattering for Dirac equation, J. Math. Phys.
36(8) (1995) 3902–3921.
[18] F. Melnyk, The Hawking effect for spin 1/2 fields, Comm. Math. Phys. 244(3) (2004)
483–525.
[19] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators,
Comm. Math. Phys. 78 (1981) 391–408.
[20] J.-P. Nicolas, Scattering of linear Dirac fields by a spherically symmetric black hole,
Ann. Inst. Henri Poincaré Physique Théorique 62(2) (1995) 145–179.
[21] F. Nicoleau, A stationary approach to inverse scattering for Schrödinger operators
with first order perturbation, Comm. Partial Differential Equations 22(3–4) (1997)
527–553.
[22] F. Nicoleau, An inverse scattering problem with the Aharonov–Bohm effect, J. Math.
Phys. 8 (2000) 5223–5237.
[23] F. Nicoleau, Inverse scattering for Stark Hamiltonians with short-range potentials,
Asymptot. Anal. 35(3–4) (2003) 349–359.
[24] F. Nicoleau, An inverse scattering problem for the Schrödinger equation in a
semiclassical process, J. Math. Pures Appl. 86 (2006) 463–470.
[25] R. Novikov, Small angle scattering and X-ray transform in classical mechanics, Arkiv
Mat. 37(1) (1999) 141–169.
[26] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. 2 (Academic
Press, 1975).
[27] D. Robert, Autour de l'approximation semi-classique, Progress in Mathematics,
Vol. 68 (Birkhäuser, Basel, 1987).
[28] R. Wald, General Relativity (University of Chicago Press, 1984).
[29] R. Weder, Multidimensional inverse scattering in an electric field, J. Funct. Anal.
139(2) (1996) 441–465.
June 2, 2010 14:55 WSPC/S0129-055X 148-RMP J070-00398

Reviews in Mathematical Physics


Vol. 22, No. 5 (2010) 485–505

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003989

EULER–POINCARÉ FLOWS ON THE LOOP BOTT–VIRASORO
GROUP AND SPACE OF TENSOR DENSITIES AND
(2 + 1)-DIMENSIONAL INTEGRABLE SYSTEMS

PARTHA GUHA
Max Planck Institute for Mathematics in the Sciences,
Inselstrasse 22, D-04103 Leipzig, Germany
and
S. N. Bose National Centre for Basic Sciences,
JD Block, Sector-3, Salt Lake, Calcutta-700098, India
partha@bose.res.in

Received 22 July 2009


Revised 22 January 2010

Dedicated to Professor Tudor Ratiu on his 60th birthday


with great respect and admiration

Following the work of Ovsienko and Roger ([54]), we study loop Virasoro algebra. Using
this algebra, we formulate the Euler–Poincaré flows on the coadjoint orbit of loop
Virasoro algebra. We show that the Calogero–Bogoyavlenskii–Schiff equation and various
other (2 + 1)-dimensional Korteweg–de Vries (KdV) type systems follow from this
construction. Using the right invariant $H^1$ inner product on the Lie algebra of loop
Bott–Virasoro group, we formulate the Euler–Poincaré framework of the (2 + 1)-dimensional
version of the Camassa–Holm equation. This equation appears to be the Camassa–Holm analogue
of the Calogero–Bogoyavlenskii–Schiff type (2 + 1)-dimensional KdV equation. We also
derive the (2 + 1)-dimensional generalization of the Hunter–Saxton equation. Finally,
we give an Euler–Poincaré formulation of a one-parameter family of (1 + 1)-dimensional
partial differential equations, known as the b-field equations. Later, we extend our
construction to the algebra of loop tensor densities to study the Euler–Poincaré framework of
the (2 + 1)-dimensional extension of b-field equations.

Keywords: Diffeomorphism; loop Virasoro algebra; tensor densities;
Calogero–Bogoyavlenskii–Schiff equation; (2 + 1)-dimensional Camassa equation; b-field equation.

Mathematics Subject Classications 2010: 53A07, 53B50

1. Introduction
The study of higher dimensional integrable systems is one of the most challenging
areas in integrable systems. Early in the study of integrable systems, the main
thrusts were restricted to the (1 + 1)-dimensional systems because of the difficulty
of finding the physically significant high-dimensional solutions which are localized
in all directions. Recently, much progress has been achieved in understanding the

properties and solutions for two-dimensional integrable models such as the
Kadomtsev–Petviashvili (KP) and Davey–Stewartson (DS) equations [1]. One of the most striking
features of (2 + 1)-dimensional systems is the exponentially localized structures, called
dromions, which are driven by two perpendicular line ghost solitons in the case of the
DS equation or two non-perpendicular line ghost solitons in the case of the KP equation.
One should recall that the name dromions as well as their spectral meaning were
introduced by Fokas and Santini [27]. Recently, rich dromion structures were
also found in (2 + 1)-dimensional KdV equations [49, 50, 59, 63].
After the discovery of dromions, the question arises whether there exist
exponentially localized structures in (2 + 1)-dimensional breaking soliton equations as
well. In such systems, the spectral parameter becomes a multivalued function; in
other words, the spectral parameter possesses so-called breaking behavior. The
solutions of these equations may become multivalued. There is an equation exhibiting
breaking solitons, formulated by Bogoyavlenskii in [5, 6], as one of the (2 + 1)-dimensional
reductions of the self-dual Yang–Mills equations. In a series of papers
Bogoyavlenskii studied such breaking soliton equations. He extended the well-known
Lax representation to the generalized form

\[ L_t = P(L) + \sum_{k=1}^{n} R_k(L, L_{y_k}) + [L, A]. \]

Here $P(L)$ and $R_k(L, L_{y_k})$ are certain meromorphic functions of the operator $L$.
In [5, 6], Bogoyavlenskii constructed several hydrodynamic-type systems which
are connected to the Toda lattice and the Volterra model. It has been shown that
these systems possess breaking behavior, Hamiltonian forms and conservation
laws. The continuous limits of these systems include the equation
\[ v_t = 4vv_y + 2v_x\partial_x^{-1}v_y - v_{xxy} + \lambda_0(6vv_x - v_{xxx}), \tag{1} \]
which, after the substitution $v = u_x$, is reduced to the potential form
\[ u_{tx} = 4u_xu_{xy} + 2u_yu_{xx} - u_{xxxy}, \tag{2} \]
where we set $\lambda_0 = 0$.
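The passage from (1) to (2) can be written out in one step. The sketch below assumes that the coefficient of the KdV-type bracket in (1) has been set to zero, that the third-order term enters with a minus sign, and that the integration constant in $\partial_x^{-1}$ is chosen so that $\partial_x^{-1}u_{xy} = u_y$.

```latex
\begin{align*}
v &= u_x, \qquad \partial_x^{-1} v_y = \partial_x^{-1} u_{xy} = u_y,\\
v_t &= 4 v v_y + 2 v_x\,\partial_x^{-1} v_y - v_{xxy}\\
\Longrightarrow\quad u_{xt} &= 4 u_x u_{xy} + 2 u_{xx} u_y - u_{xxxy}.
\end{align*}
```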
Schiff [64] obtained the above equation by a different route. He derived Eq. (1) from
the reduction of the self-dual Yang–Mills equations from four to three dimensions.
There has been considerable interest in showing that the self-dual Yang–Mills (SDYM)
equations act as a master integrable equation, from which many integrable systems can be
obtained by suitable reductions; this was the original motivation of Schiff. It
has been shown in [66] that the generalized SDYM equations contain (as dimensional
reductions) various (2 + 1)-dimensional integrable soliton hierarchies which
generalize the nonlinear Schrödinger and KdV hierarchies.
One can also derive (2 + 1)-dimensional KdV type systems by another
method. Using classical differential geometry, Konopelchenko [44] has derived a
(2 + 1)-dimensional KdV equation. In geodesic coordinates, the Gauss equation is reduced
to the Schrödinger equation, where the Gaussian curvature plays the role of a

Euler–Poincaré Flows on the Loop Bott–Virasoro Group

potential. It can be shown that a special case is governed by the KdV equation
for the Gaussian curvature. In this framework, Konopelchenko [44] studied the
integrable dynamics of curvature via the KdV equation, higher KdV equations
and other (2 + 1)-dimensional integrable equations with breaking solitons. The
bihamiltonian operators for (2 + 1)-dimensional integrable systems were introduced
in [28–30].
In an interesting paper Fokas, Olver and Rosenau [26] proposed an algorithmic
construction of the (2 + 1)-dimensional integrable system
\[ q_{xt} - q_{xxxt} + aq_{xy} + bq_{xxxy} + c(q_{xx}q_y + 2q_xq_{xy}) - c(q_{xxxx}q_y + 2q_{xxx}q_{xy}) = 0, \tag{3} \]
which yields peakon/dromion type solutions. This equation can be identified with
the potential form of the Camassa–Holm (integrable) analogue of the
Calogero–Bogoyavlenskii–Schiff equation.
Recently the one-parameter family of shallow water equations of the
form
\[ u_t - u_{xxt} + (b + 1)uu_x = bu_xu_{xx} + uu_{xxx}, \tag{4} \]
where b is a real parameter, has drawn some attention. This equation is known as
the b-field equation. It was introduced by Degasperis, Holm and Hone [18, 19], who
showed the existence of multi-peakon solutions for any value of b, although only the
special cases b = 2, 3 are integrable, having bihamiltonian formulations. The b = 2
case is the well-known Camassa–Holm (CH) equation [8] and b = 3 is the integrable
system discovered by Degasperis and Procesi [20]. One must note that only for
b = 2, 3 is Eq. (4) also hydrodynamically relevant [14, 41]. It is worth remembering
that the b = 2 case was later recognized as being included in a class of integrable
equations derived from hereditary symmetries by Fokas and Fuchssteiner [25].
Using the Helmholtz field $m := u - u_{xx}$, the b-field equation or DHH equation
(4) allows reformulation in the compact form
\[ m_t + um_x + bu_xm = 0, \tag{5} \]
where the three terms correspond respectively to evolution, convection and stretching
of the one-dimensional flow. In this paper we also study an Euler–Poincaré
formulation of the (2 + 1)-dimensional b-field equation. It is worth mentioning
that the well-posedness and blow-up of the b-field equation were proved in [23],
and its invariance properties were used by Henry [38] to investigate the equation
qualitatively.
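The equivalence of (4) and (5) is a purely algebraic identity in the derivatives of u, so it can be spot-checked numerically. The sketch below is illustrative only: it treats the values of u and its derivatives at one point as independent integers (a "jet"), and the function names are hypothetical helpers, not part of the paper.

```python
import random

def residual_u_form(jet, b):
    """Eq. (4) moved to one side: u_t - u_xxt + (b+1) u u_x - b u_x u_xx - u u_xxx."""
    return (jet["u_t"] - jet["u_xxt"] + (b + 1) * jet["u"] * jet["u_x"]
            - b * jet["u_x"] * jet["u_xx"] - jet["u"] * jet["u_xxx"])

def residual_m_form(jet, b):
    """Eq. (5): m_t + u m_x + b u_x m, with the Helmholtz field m = u - u_xx."""
    m = jet["u"] - jet["u_xx"]
    m_x = jet["u_x"] - jet["u_xxx"]
    m_t = jet["u_t"] - jet["u_xxt"]
    return m_t + jet["u"] * m_x + b * jet["u_x"] * m

def random_jet(rng):
    """Independent integer values for u and the derivatives that appear in (4)."""
    names = ["u", "u_x", "u_xx", "u_xxx", "u_t", "u_xxt"]
    return {name: rng.randint(-9, 9) for name in names}

def forms_agree(trials=200, seed=0):
    """True if both residuals coincide on every sampled jet and every sampled b."""
    rng = random.Random(seed)
    for _ in range(trials):
        jet = random_jet(rng)
        b = rng.randint(-5, 5)
        if residual_u_form(jet, b) != residual_m_form(jet, b):
            return False
    return True
```

Integer arithmetic keeps the comparison exact, so agreement on random jets reflects the identity itself rather than floating-point tolerance.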
In a recent paper [35], the author formulated the Euler–Poincaré (EP)
framework of the Degasperis–Procesi (DP) equation. It turns out that the
DP equation is the Euler–Poincaré flow on the combined space of Hill's (second-order)
operators and first-order differential operators on the circle. In that paper, the
author also gave the EP formulation of the two-component generalization of the
DP equation. It has also been shown [35] that the Hamiltonian structure obtained
from the EP framework exactly coincides with the Hamiltonian structures of the
DP equation obtained by Degasperis et al. In this paper, we give a much shorter
derivation of the DP and the b-field equations using a deformation of the vector field
structure on $S^1$. Following the work of Ovsienko and Roger [54], we study the loop
Virasoro algebra. Using this algebra, we are able to derive the (2 + 1)-dimensional
b-field equation.
The aim of this paper is to contribute towards a theory of integrable-type
geodesic flows on infinite-dimensional Lie groups, which has attracted tremendous
attention since Arnold's seminal paper [2] on the Euler equation in hydrodynamics.
Later, Ebin and Marsden [22] established a proper geometric setting for this problem.
They showed that the geodesic spray is smooth. This led to very nice existence
proofs; the limit of zero viscosity for manifolds with no boundary was shown to
exist for the first time. It is worth mentioning that in recent years, for
equations like the Camassa–Holm model for shallow water waves or the
Hunter–Saxton [40] model for nematic liquid crystals, the geometric structures have been
used to study qualitative properties of the solutions. The geometric structure of
the Hunter–Saxton equation and its relevance have been studied by Lenells [45]. The
geometric approach leads to the construction of global weak solutions in the periodic
case [46], the case of global weak solutions without periodicity being investigated
by Bressan and Constantin [3]. One must note that in (1 + 1) dimensions both the
Camassa–Holm and the Degasperis–Procesi equations admit traveling waves that are
peaked and orbitally stable, so that these patterns are physically detectable [17, 48].
These peakons capture the main feature of the exact traveling wave solutions of
greatest height of the governing equations for water waves [16].
The KdV equation is an Euler–Poincaré equation on the Virasoro–Bott group
(see [42, 57, 65]). This group is defined as the unique (up to isomorphism) non-trivial
central extension of the group $\mathrm{Diff}(S^1)$ of all diffeomorphisms of $S^1$. The
inertia operator is given by the standard $L^2$-metric on $S^1$. It is known that the
two-component KdV and Camassa–Holm equations are also geodesic flows on
the extended Virasoro–Bott group [32–34]. It is worth pointing out that for the
Camassa–Holm equation the geometric approach leads to a proof which demonstrates
that the equation satisfies the Least Action Principle [10, 11].
Infinite-dimensional groups also play an important role in the construction of
(2 + 1)-dimensional integrable systems. One form of the (2 + 1)-dimensional KdV and
nonlinear Schrödinger equations can be derived from the toroidal Lie algebra. Here
the variable x is associated with the action of the usual affine part of the toroidal Lie
algebra [4, 61], while evolutions in y and t are induced by the action of the genuine
toroidal part. The weight of v and the relative weight of y and t are
balanced with that of x; this leaves two freedoms in determining the weights
of all the variables.
In this paper we study the (formal) Euler–Poincaré framework [52] of various
(2 + 1)-dimensional KdV type systems. Until now there has been no systematic method
of constructing (2 + 1)-dimensional integrable systems from the viewpoint
of geodesic flows or the Euler–Poincaré framework. In particular, we show that the
Calogero–Bogoyavlenskii–Schiff equation arises as a geodesic flow on the loop
Bott–Virasoro group. This equation is an eminent member of the (2 + 1)-dimensional
family of KdV equations [39]. We also study the Euler–Poincaré framework of the
Bogoyavlenskii–Konopelchenko equation. In fact, not many equations in (2 + 1)
dimensions are known to admit an EP formulation. Recently, Ovsienko
[58] studied the bihamiltonian properties of the Martínez Alonso–Shabat type
system
\[ u_{xt} = u_yu_{xy} - u_{yy}u_x. \]
Another nonlinear differential equation,
\[ u_{tx} = u_{xx}u_y - u_{xy}u_x + cu_{yy}, \]
has been mentioned in [54].
In the second half of the paper, we construct a higher-dimensional Camassa–Holm
equation. We show that the (2 + 1)-dimensional Camassa–Holm equation
arises as a geodesic flow with respect to the right-invariant $H^1$ metric on the
cotangent loop Virasoro group. We also compute the (2 + 1)-dimensional Hunter–Saxton
equation.
The results of this paper were announced [36] at the Oberwolfach meeting on
geometrical mechanics. This is a long version of [36]. The paper is organized as
follows. In Sec. 2, we present the Euler–Poincaré formalism and frozen Poisson
structures. The loop Virasoro algebra is introduced in Sec. 3. In Sec. 4, we give the
Euler–Poincaré formulation of the (2 + 1)-dimensional KdV equation. Section 5 is
devoted to the derivation of the (2 + 1)-dimensional Camassa–Holm equation and
the Hunter–Saxton equation. In Sec. 6, we present the Euler–Poincaré framework
of the b-field equation. The formulation of the (2 + 1)-dimensional b-field equation is
given in Sec. 7.

2. The Euler–Poincaré Formalism
The Euler–Poincaré equations were born in 1901 (see [52] for details) when Poincaré
made an extensive generalization of the classical Euler equations for the rigid body
and ideal fluids. He did this by formulating the equations on a general Lie algebra,
with the rigid body being associated with the rotation Lie algebra and fluids with
the Lie algebra of divergence-free vector fields.
We give a rapid introduction to the Euler–Poincaré framework. Let G be a Lie
group, $\mathfrak{g}$ its Lie algebra and $\mathfrak{g}^*$ the dual of $\mathfrak{g}$. G can
be thought of as the configuration space of some physical system, for example, the
group SO(3) for a rigid body and the group $\mathrm{SDiff}(M)$ of volume-preserving
diffeomorphisms for an ideal fluid filling a domain M. The dual space $\mathfrak{g}^*$ of any
Lie algebra $\mathfrak{g}$ carries a natural Lie–Poisson structure:
\[ \{f, g\}_{LP}(\mu) := \langle [df, dg], \mu \rangle \]
for any $\mu \in \mathfrak{g}^*$ and $f, g \in C^\infty(\mathfrak{g}^*)$.

The Hamiltonian vector field on $\mathfrak{g}^*$ corresponding to a Hamiltonian function f,
computed with respect to the Lie–Poisson structure, is given by
\[ \frac{d\mu}{dt} = \mathrm{ad}^*_{df}\,\mu, \qquad \mu \in \mathfrak{g}^*, \tag{6} \]
which implies that the Hamiltonian vector field is $X_f(\mu) = \mathrm{ad}^*_{df}\,\mu$.
Let us fix some quadratic form, the energy function, on $\mathfrak{g}$. Consider the right
translations of this quadratic form to the tangent space at any point of the group. In
this way, we define a right-invariant Riemannian metric on the group using the
energy function. The geodesic flow on G with respect to this metric
represents the extremals of the principle of least action, which trace out the actual
motion of the physical system.
We identify the Lie algebra and its dual by means of this quadratic form. This
identification is done via the inertia operator $I : \mathfrak{g} \to \mathfrak{g}^*$. This allows us to
rewrite the Euler–Poincaré equation on the dual space $\mathfrak{g}^*$. It has been proved that
the EP equation on $\mathfrak{g}^*$ is Hamiltonian with respect to the natural Lie–Poisson
structure on the dual space.

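Definition 2.1. The Euler–Poincaré equation on $\mathfrak{g}^*$ corresponding to the
Hamiltonian $H(\mu) = \frac{1}{2}\langle I^{-1}\mu, \mu \rangle$ is given by
\[ \frac{d\mu}{dt} = \mathrm{ad}^*_{I^{-1}\mu}\,\mu, \qquad I^{-1}\mu \in \mathfrak{g}, \tag{7} \]
where $\mathrm{ad}^*$ is the coadjoint operator, dual to the operator $\mathrm{ad} = [\cdot, \cdot]$ defining the
structure of the Lie algebra $\mathfrak{g}$. Equation (7) characterizes the evolution of a point
$\mu \in \mathfrak{g}^*$.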

2.1. Frozen Lie–Poisson structure

Consider the dual $\mathfrak{g}^*$ of the Lie algebra $\mathfrak{g}$ with a Poisson structure given by the
"frozen" Lie–Poisson structure. In other words, we fix some point $\mu_0 \in \mathfrak{g}^*$ and
define a Poisson structure by

\[ \{f, g\}_0(\mu) := \langle [df(\mu), dg(\mu)], \mu_0 \rangle, \]

which satisfies the Jacobi identity. It plays an important role in integrable systems,
particularly in the construction of the first Hamiltonian structure of the underlying
integrable system.
We can give another interpretation [12, 13] of the frozen structure from the definition
of a cocycle. Given an inertia operator $I : \mathfrak{g} \to \mathfrak{g}^*$ one can define a constant Poisson
structure

\[ \{f, g\}_0(\mu) = \langle df, I\,dg \rangle, \qquad \text{where } \mu \in \mathfrak{g}^*. \]


A two-cocycle $\gamma$ is called a coboundary if there is a point $\mu_0 \in \mathfrak{g}^*$ such that
\[ \gamma(p, q) = \langle [p, q], \mu_0 \rangle, \]
where $p, q \in \mathfrak{g}$.
Since the Poisson structure is generated by a coboundary, we obtain the
frozen Lie–Poisson structure. This behaves like a Lie–Poisson structure "frozen" at
the point $\mu_0 \in \mathfrak{g}^*$, and it coincides with the previous definition of the frozen structure.
It is easy to check that the above Poisson structures are compatible, i.e. their
linear combination, or pencil of Poisson structures,
\[ \{\cdot, \cdot\}_\lambda = \{\cdot, \cdot\}_0 + \lambda\{\cdot, \cdot\}_{LP} \tag{8} \]
is again a Poisson structure for all $\lambda \in \mathbb{R}$.
It was shown by Khesin and Misiolek [43] that:
Proposition 2.2. The brackets $\{\cdot, \cdot\}_{LP}$ and $\{\cdot, \cdot\}_0$ are compatible for every
freezing point $\mu_0 \in \mathfrak{g}^*$.
At this point, we can introduce the bihamiltonian structure. The notion of
integrability can be understood from this structure. The standard way to understand
bihamiltonian vector fields on the dual of a Lie algebra is through the Lie–Poisson
structures.
Definition 2.3. A vector field X on $\mathfrak{g}^*$ is called bi-Hamiltonian if there are two
functions $H_1$ and $H_2$ such that X is the Hamiltonian vector field of $H_1$ with respect
to the Poisson structure $\{\cdot, \cdot\}_{LP}$ and the Hamiltonian vector field of $H_2$ with respect
to $\{\cdot, \cdot\}_0$.

3. Loop Virasoro Algebra and (2 + 1)-Dimensional KdV Flows

We wish to extend the Virasoro algebra to the case of two space variables. A natural
way to do this is to consider loops on it. One defines the loop group on $\mathrm{Diff}(S^1)$
as follows:
\[ L(\mathrm{Diff}(S^1)) = \{\phi : S^1 \to \mathrm{Diff}(S^1) \mid \phi \text{ is differentiable}\}, \]
the group law being given by
\[ (\phi \circ \psi)(y) = \phi(y) \circ \psi(y), \qquad y \in S^1. \]
In a similar manner, we construct the Lie algebra $L(\mathrm{Vect}(S^1))$ consisting of
vector fields on $S^1$ depending on one more independent variable $y \in S^1$. The loop
variable is thus denoted by y and the variable on the "target" copy of $S^1$ by x. The
elements of $L(\mathrm{Vect}(S^1))$ are of the form $f(x, y)\frac{\partial}{\partial x}$, where $f \in C^\infty(S^1 \times S^1)$, and the
Lie bracket reads as follows [54]:
\[ \left[ f(x, y)\frac{\partial}{\partial x},\ g(x, y)\frac{\partial}{\partial x} \right] = \bigl( f(x, y)\,g_x(x, y) - f_x(x, y)\,g(x, y) \bigr)\frac{\partial}{\partial x}. \]
It is easy to convince oneself that $L(\mathrm{Vect}(S^1))$ is the Lie algebra of $L(\mathrm{Diff}(S^1))$
in the usual weak sense for the infinite-dimensional case; a one-parameter group
argument gives an identification between the tangent space to $L(\mathrm{Diff}(S^1))$
at the identity and $L(\mathrm{Vect}(S^1))$, equipped with its Lie bracket. In what follows, we will
denote $L(\mathrm{Vect}(S^1))$ by $\tilde{\mathfrak{g}}$. The natural pairing between the loop Virasoro algebra
and its dual is given by
\[ \left\langle f(x, y)\frac{\partial}{\partial x},\ v(x, y)\,dx^2 \right\rangle = \int_{S^1 \times S^1} f v\,dx\,dy. \tag{9} \]
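The bracket $f g_x - f_x g$ should satisfy the Jacobi identity, with y entering only as a parameter. The following sketch checks this on polynomials in x and y with exact integer arithmetic. It is only an illustration: polynomials are not periodic, so they do not live on $S^1 \times S^1$, but the identity is algebraic and uses nothing beyond the Leibniz rule. The dictionary representation and all function names are assumptions made for this example.

```python
from collections import defaultdict

# A polynomial in x and y is stored as {(i, j): coeff}, meaning coeff * x**i * y**j.

def clean(p):
    """Drop zero coefficients so that the zero polynomial is the empty dict."""
    return {k: c for k, c in p.items() if c != 0}

def add(p, q, sign=1):
    """Return p + sign * q."""
    out = defaultdict(int)
    for k, c in p.items():
        out[k] += c
    for k, c in q.items():
        out[k] += sign * c
    return clean(out)

def mul(p, q):
    """Return the product p * q."""
    out = defaultdict(int)
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def dx(p):
    """Partial derivative with respect to x."""
    return clean({(i - 1, j): i * c for (i, j), c in p.items() if i > 0})

def bracket(f, g):
    """Coefficient of d/dx in [f d/dx, g d/dx], i.e. f g_x - f_x g."""
    return add(mul(f, dx(g)), mul(dx(f), g), sign=-1)

def jacobiator(f, g, h):
    """Cyclic sum [f,[g,h]] + [g,[h,f]] + [h,[f,g]]; empty dict means zero."""
    total = add(bracket(f, bracket(g, h)), bracket(g, bracket(h, f)))
    return add(total, bracket(h, bracket(f, g)))
```

For example, `jacobiator(f, g, h)` returns the empty dictionary for any three polynomials, and `bracket(f, g)` changes sign when its arguments are swapped.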

3.1. Cocycle and extension of loop Virasoro algebra

Consider the following modified Gelfand–Fuchs cocycle on $\mathrm{Vect}(S^1)$:

\[ \omega_{mGF}\left( f(x)\frac{d}{dx},\ g(x)\frac{d}{dx} \right) = \int_{S^1} (a f' g'' + b f' g)\,dx, \tag{10} \]

where the first term is the original Gelfand–Fuchs cocycle.

This cocycle is cohomologous to the Gelfand–Fuchs cocycle; hence, the corresponding
central extension is isomorphic to the Virasoro algebra. The additional
term is a coboundary term. It is easy to check that the functional

\[ \int_{S^1} f' g\,dx = \frac{1}{2}\int_{S^1} (f' g - f g')\,dx \]

depends on the commutator of $f\frac{d}{dx}$ and $g\frac{d}{dx}$.
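The identity above is a one-line integration by parts on the circle. The sketch below spells it out, assuming only that f and g are smooth and periodic so that the boundary term vanishes.

```latex
\begin{align*}
0 = \int_{S^1} (fg)'\,dx &= \int_{S^1} f'g\,dx + \int_{S^1} f g'\,dx,\\
\int_{S^1} f'g\,dx &= \tfrac12\int_{S^1} f'g\,dx - \tfrac12\int_{S^1} f g'\,dx
 = -\tfrac12\int_{S^1}\bigl(f g' - f' g\bigr)\,dx .
\end{align*}
```

The integrand $f g' - f' g$ is exactly the coefficient of $\frac{d}{dx}$ in the commutator of the two vector fields, which is why the extra term in (10) is a coboundary.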
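Let us give the explicit formulae of non-trivial 2-cocycles [31] on $\tilde{\mathfrak{g}}$. A distribution
$\lambda \in C^\infty(S^1)'$ corresponds to a 2-cocycle of the first class given by [67]

\[ \omega_\lambda(f, g) = \lambda\left( \int_{S^1} f g_{xxx}\,dx \right); \]

these are the Virasoro-type extensions. For the particular case where
$\lambda(a(y)) = \int_{S^1} a(y)\,dy$, such a 2-cocycle will be denoted by $\omega_1$, so that one has

\[ \omega_1(f, g) = \int_{S^1 \times S^1} f g_{xxx}\,dx\,dy. \tag{11} \]

We define the Lie algebra $\hat{\mathfrak{g}}$ as the one-dimensional central extension of $\tilde{\mathfrak{g}}$
given by the cocycle $\omega_1$. As a vector space,

\[ \hat{\mathfrak{g}} = \tilde{\mathfrak{g}} \oplus \mathbb{R}, \]
where the summand $\mathbb{R}$ is the center of $\hat{\mathfrak{g}}$. The commutator in $\hat{\mathfrak{g}}$ is given by the
following explicit expression, which readily follows from the above formulae:

\[ \left[ \left( f\frac{\partial}{\partial x}, a \right),\ \left( g\frac{\partial}{\partial x}, b \right) \right] = \left( (f g_x - f_x g)\frac{\partial}{\partial x},\ \int_{S^1 \times S^1} f g_{xxx}\,dx\,dy \right), \tag{12} \]

where the last term is an element of the center of $\hat{\mathfrak{g}}$.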
4. EP Formulation of Calogero–Bogoyavlenskii–Schiff Type (2 + 1)-Dimensional KdV Equation
In this section, we give an Euler–Poincaré derivation of the Calogero–Bogoyavlenskii–Schiff
equation (1) or (2). It is one of the most important members of the (2 + 1)-KdV
family. We recall the Kirillov–Segal result and generalize it to the case of the Lie
algebra $\hat{\mathfrak{g}}$.

Proposition 4.1. The coadjoint action of the Lie algebra $\hat{\mathfrak{g}}$ is given by

\[ \mathrm{ad}^*_{f(x,y)\frac{\partial}{\partial x}} \bigl( v(x, y)\,dx^2 \bigr) = (f v_x + 2 f_x v + c_1 f_{xxx} + c_2 f_x)\,dx^2, \tag{13} \]
while the center acts trivially.

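Corollary 4.2. The Hamiltonian operator $O_{LV}$ corresponding to the coadjoint
action of the loop Virasoro algebra is given by
\[ O_{LV} = \partial_x v + v\partial_x + c_1\partial_x^3 + c_2\partial_x. \tag{14} \]
Consider a functional H on $\hat{\mathfrak{g}}^*$ which is a (pseudo)differential polynomial:

\[ H(g, v) = \int_{S^1 \times S^1} h\bigl( g, v, g_x, v_x, g_y, v_y, \partial_x^{-1}g, \partial_x^{-1}v, \partial_y^{-1}g, \partial_y^{-1}v, g_{xy}, v_{xy}, \ldots \bigr)\,dx\,dy, \]

where h is a polynomial in an infinite set of variables.

For instance,
\[ \frac{\delta H}{\delta v} = h_v - \frac{\partial}{\partial x}(h_{v_x}) - \frac{\partial}{\partial y}(h_{v_y}) - \partial_x^{-1}(h_{\partial_x^{-1}v}) - \partial_y^{-1}(h_{\partial_y^{-1}v}) + \frac{\partial^2}{\partial x^2}(h_{v_{xx}}) + \frac{\partial^2}{\partial x\,\partial y}(h_{v_{xy}}) + \frac{\partial^2}{\partial y^2}(h_{v_{yy}}), \]
where, as usual, $h_v$ means the partial derivative $\frac{\partial h}{\partial v}$, and similarly $h_{v_x} = \frac{\partial h}{\partial v_x}$.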

Proposition 4.3. The Euler–Poincaré flow restricted to the hyperplane $c_1 = 1$, $c_2 = 0$
at $(0, v\,dx^2)$ yields the Calogero–Bogoyavlenskii–Schiff equation (or (2 + 1)-dimensional
KdV)
\[ v_t = v_{xxy} + 2vv_y + v_x\partial_x^{-1}v_y \tag{15} \]

for the Hamiltonian $H = \frac{1}{2}\int_{S^1 \times S^1} v\,\partial_x^{-1}v_y\,dx\,dy$. We use the expression [58]

\[ (\partial_x^{-1}v)(x, y) = \int_0^x v(\xi, y)\,d\xi - \int_0^{2\pi} v(x, y)\,dx. \]
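The computation behind Proposition 4.3 can be sketched as follows. It assumes that $\partial_x^{-1}$ is skew-adjoint on $S^1 \times S^1$ and commutes with $\partial_y$, that the Euler–Poincaré equation is used in the form $v_t = O_{LV}\,\delta H/\delta v$ with $O_{LV}$ as in (14), and it writes $w$ as shorthand for $\partial_x^{-1}v_y$.

```latex
\begin{align*}
\delta H &= \tfrac12\int (\delta v\,\partial_x^{-1}v_y + v\,\partial_x^{-1}\delta v_y)\,dx\,dy
 = \int \delta v\,\partial_x^{-1}v_y\,dx\,dy
 \;\Longrightarrow\; \frac{\delta H}{\delta v} = \partial_x^{-1}v_y =: w,\\
O_{LV}\,w &= (vw)_x + v w_x + c_1 w_{xxx} + c_2 w_x
 = v_x w + 2 v w_x + c_1 w_{xxx} + c_2 w_x,\\
v_t &= v_x\,\partial_x^{-1}v_y + 2 v v_y + c_1 v_{xxy} + c_2 v_y
 \;\xrightarrow{\;c_1 = 1,\ c_2 = 0\;}\;
 v_t = v_{xxy} + 2 v v_y + v_x\,\partial_x^{-1}v_y .
\end{align*}
```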

4.1. The Bogoyavlenskii–Konopelchenko equation

In this section, we derive several other (2 + 1)-dimensional KdV type equations.
The Euler–Poincaré formalism of the Bogoyavlenskii–Konopelchenko equation
\[ v_t + v_{xxy} + v_{xxx} + 3vv_x + 2vv_y + v_x\partial_x^{-1}v_y = 0, \tag{16} \]
with $\partial_x^{-1}v(x, y) = \int_0^x v(\xi, y)\,d\xi - \int_0^{2\pi} v(x, y)\,dx$, is closely related to the
Calogero–Bogoyavlenskii–Schiff equation. In fact, it is a combination of the KdV and
Calogero–Bogoyavlenskii–Schiff flows. Equation (16) models the (2 + 1)-dimensional
interaction of a Riemann wave propagating along the y-axis with a long wave along
the x-axis. Using (14), we obtain the following result.

Proposition 4.4. The Euler–Poincaré flow associated to the loop Virasoro algebra
$\hat{\mathfrak{g}}$ yields the Bogoyavlenskii–Konopelchenko (for = 0) equation for the Hamiltonian

\[ H = \frac{1}{2}\int_{S^1 \times S^1} (v^2 + v\,\partial_x^{-1}v_y)\,dx\,dy, \]

when restricted to the hyperplane $c_1 = 1$, $c_2 = 0$.

Outline of Proof. We use the EP equation $v_t = O_{LV}\frac{\delta H}{\delta v}$ to obtain our result.
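The outline can be expanded as below. This is a sketch under the same assumptions as for Proposition 4.3: $\partial_x^{-1}$ is skew-adjoint and commutes with $\partial_y$, the operator is the one in (14), and $c_1 = 1$, $c_2 = 0$.

```latex
\begin{align*}
\frac{\delta H}{\delta v} &= v + \partial_x^{-1}v_y,\\
O_{LV}\,v &= v_x v + 2 v v_x + v_{xxx} = 3 v v_x + v_{xxx},\\
O_{LV}\,\partial_x^{-1}v_y &= v_x\,\partial_x^{-1}v_y + 2 v v_y + v_{xxy},\\
v_t &= v_{xxy} + v_{xxx} + 3 v v_x + 2 v v_y + v_x\,\partial_x^{-1}v_y .
\end{align*}
```

The right-hand side contains the same five terms as Eq. (16). With every term collected on one side, as in (16), the two forms differ only by the sign convention for t.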

Another class of (2 + 1)-dimensional KdV equations was proposed by Lou and
his collaborators [47, 49, 50] to study rich dromion structures, defined as
\[ v_t + v_{xxx} = 3(v\,\partial_y^{-1}v_x)_x, \tag{17} \]
where $\partial_y^{-1}$ is defined similarly to $\partial_x^{-1}$ [58]. For y = x this equation reduces to the
usual (1 + 1)-dimensional KdV equation.
We use the frozen Lie–Poisson structure to compute the Hamiltonian operator
at $(v(x)\,dx^2, c_1, c_2) = (0, 0, 1)$. We also assume that the only cocycle term is $\int_{S^1} f' g$,
induced by the coboundary term. The Hamiltonian operator computed by freezing
at the point (0, 0, 1) yields a truncated Hamiltonian operator
\[ \hat{O}_1 = \partial_x. \]

We also compute the Hamiltonian operator at $(v(x)\,dx^2, c_1, c_2) = (0, 1, 0)$, given by

\[ \hat{O}_2 = \partial_x^3. \]

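Proposition 4.5. The second class of (2 + 1)-dimensional KdV equations follows from
the following combination of flows on $\hat{\mathfrak{g}}^*$:

\[ v_t = 3\,\hat{O}_1\frac{\delta H_1}{\delta v} - \hat{O}_2\frac{\delta H_2}{\delta v}, \]
with $\frac{\delta H_1}{\delta v} = (v\,\partial_y^{-1}v_x)$ and $\frac{\delta H_2}{\delta v} = v$, respectively; that is, the two flows enter
with coefficients 3 and $-1$.

Proof. By direct computation: $3\partial_x(v\,\partial_y^{-1}v_x) - \partial_x^3 v$ is the right-hand side of (17) minus $v_{xxx}$.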

5. $H^1$ Metric and (2 + 1)-Dimensional Camassa–Holm and Hunter–Saxton Systems
In this section, we study the Camassa–Holm analogue of the (2 + 1)-dimensional
KdV equations from the integrable point of view. The hydrodynamical analogue
was derived in [41]. We start with the explicit expression for the coadjoint action
of $\hat{\mathfrak{g}}$ with respect to the right-invariant $H^1$-metric.
Let us introduce the $H^1$ norm on the algebra $\hat{\mathfrak{g}}$.

Definition 5.1. The $H^1$-Sobolev norm on the loop Virasoro algebra is defined as

\[ \left\langle f(x, y)\frac{\partial}{\partial x},\ u(x, y)\,dx^2 \right\rangle_{H^1} = \int_{S^1} f u\,dx + \int_{S^1} \partial_x f\,\partial_x u\,dx. \tag{18} \]
Now we compute the coadjoint action.
Proposition 5.2. The coadjoint action with respect to the $H^1$ metric of the loop
Virasoro algebra $\hat{\mathfrak{g}}$ is given by

\[ \widehat{\mathrm{ad}}^*_{f(x,y)\frac{\partial}{\partial x}}\, v\,dx^2 = (f \hat{v}_x + 2 f_x \hat{v} + c_1 f_{xxx} + c_2 f_x)\,dx^2, \]

where $\hat{v} = (1 - \partial_x^2)v$.

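Proof. We know that

\[ \left\langle \widehat{\mathrm{ad}}^*_{f\frac{\partial}{\partial x}}\, v\,dx^2,\ h\frac{\partial}{\partial x} \right\rangle_{H^1}
 = -\left\langle v\,dx^2,\ \left[ f\frac{\partial}{\partial x},\ h\frac{\partial}{\partial x} \right] \right\rangle_{H^1} \]
\[ = -\left\langle v\,dx^2,\ \left( (f h_x - f_x h)\frac{\partial}{\partial x},\ \int_{S^1 \times S^1} f h_{xxx}\,dx\,dy,\ \int_{S^1 \times S^1} f h_x\,dx\,dy \right) \right\rangle_{H^1}. \]
Thus from the right-hand side we obtain the matrix expression.
We compute now the left-hand side of the above equation. Let us denote

\[ \hat{f} = \left( f\frac{\partial}{\partial x}, c \right), \qquad \hat{g} = \left( g\frac{\partial}{\partial x}, d \right), \qquad \hat{h} = \left( h\frac{\partial}{\partial x}, e \right), \]
where $c = (c_1, c_2)$, $d = (d_1, d_2)$ and $e = (e_1, e_2)$.
Now we compute the left-hand side:

\[ \mathrm{LHS} = \int_{S^1 \times S^1} (\widehat{\mathrm{ad}}^*_{\hat{f}}\,\hat{g})\,\hat{h}\,dx\,dy + \int_{S^1 \times S^1} (\widehat{\mathrm{ad}}^*_{\hat{f}}\,\hat{g})'\,\hat{h}'\,dx\,dy
 = \int_{S^1 \times S^1} \bigl[ (1 - \partial_x^2)\,\widehat{\mathrm{ad}}^*_{\hat{f}}\,\hat{g} \bigr]\,\hat{h}\,dx\,dy. \]
Thus by equating the right-hand side and left-hand side, we obtain the above
formula.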

Lemma 5.3. The Hamiltonian operator corresponding to the coadjoint action of
the loop Virasoro algebra with respect to the $H^1$ metric is given by
\[ O_{H^1} = (1 - \partial_x^2)^{-1}\bigl( \partial_x \hat{v} + \hat{v}\partial_x + c_1\partial_x^3 + c_2\partial_x \bigr), \tag{19} \]
where $\hat{v} = (1 - \partial_x^2)v$.
Let us study the Euler–Poincaré flow associated to the $H^1$ metric on the coadjoint
orbit of the cotangent loop Virasoro algebra $\hat{\mathfrak{g}}^*$.
Proposition 5.4. The Euler–Poincaré flow with respect to the $H^1$-norm on the dual space
of the loop Virasoro algebra becomes
\[ v_t = O_{H^1}\frac{\delta H}{\delta v}, \tag{20} \]

where $O_{H^1}$ is defined by (19). Suppose the quadratic Hamiltonian on $\hat{\mathfrak{g}}^*$ is defined as

\[ H = \frac{1}{2}\int_{S^1 \times S^1} v\,\partial_x^{-1}v_y\,dx\,dy; \]
then the Euler–Poincaré flow yields

\[ \hat{v}_t = \hat{v}_x\partial_x^{-1}v_y + 2\hat{v}v_y + c_1 v_{xxy} + c_2 v_y. \tag{21} \]
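Equation (21) can be unpacked in terms of v alone. The sketch below assumes $\hat v = (1 - \partial_x^2)v$, as in Lemma 5.3, and simply substitutes it term by term.

```latex
\begin{align*}
\hat v &= v - v_{xx}, \qquad \hat v_t = v_t - v_{xxt}, \qquad \hat v_x = v_x - v_{xxx},\\
v_t - v_{xxt} &= (v_x - v_{xxx})\,\partial_x^{-1}v_y + 2\,(v - v_{xx})\,v_y + c_1 v_{xxy} + c_2 v_y .
\end{align*}
```

Setting $c_2 = 0$ leaves exactly the terms that appear in Eq. (22). When all terms are collected on one side, the two forms agree up to the sign convention for t.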

Corollary 5.5. The Euler–Poincaré flow restricted to the hyperplane $c_2 = 0$ yields the
Camassa–Holm analogue of the Calogero–Bogoyavlenskii–Schiff equation

\[ v_t - v_{xxt} + c_1 v_{xxy} + (v_x - v_{xxx})\partial_x^{-1}v_y + 2(v - v_{xx})v_y = 0 \tag{22} \]

for the Hamiltonian $H = \frac{1}{2}\int_{S^1 \times S^1} v\,\partial_x^{-1}v_y\,dx\,dy$.

Corollary 5.6. The potential form of the (2 + 1)-dimensional Camassa–Holm equation
takes the form
\[ u_{xt} - u_{xxxt} + c_1 u_{xxxy} + \bigl( u_{xx}u_y + 2u_xu_{xy} \bigr) - \bigl( u_{xxxx}u_y + 2u_{xxx}u_{xy} \bigr) = 0 \tag{23} \]

for $v = u_x$.

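Remark. In the special (1 + 1)-dimensional case (y = x), Eq. (23) reduces to the
potential Camassa–Holm equation. If we further drop the terms $u_{xxxt}$ and
$u_{xxxx}u_y + 2u_{xxx}u_{xy}$ generated by the second-derivative part of the $H^1$ norm, then
Eq. (23) reduces to the potential KdV equation.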

Corollary 5.7. The Euler–Poincaré flow restricted to the hyperplane $c_1 = 1$ and $c_2 = 1$
yields the modified Calogero–Bogoyavlenskii–Schiff equation

\[ v_t - v_{xxt} + v_y + v_{xxy} + (v_x - v_{xxx})\partial_x^{-1}v_y + 2(v - v_{xx})v_y = 0, \]

and its potential form is

\[ u_{xt} - u_{xxxt} + u_{xy} + u_{xxxy} + (u_{xx}u_y + 2u_xu_{xy}) - (u_{xxxx}u_y + 2u_{xxx}u_{xy}) = 0. \]


Corollary 5.8. In the limit in which the terms of Eq. (22) generated by the
second-derivative part $-v_{xx}$ of $\hat{v} = v - v_{xx}$ dominate all others, Eq. (22) takes the form

\[ v_{xxt} + v_{xxx}\partial_x^{-1}v_y + 2v_{xx}v_y = 0, \tag{24} \]

which is known as the (2 + 1)-dimensional Hunter–Saxton equation. In the (1 + 1)-dimensional
case (y = x), this reduces to the Hunter–Saxton equation

\[ v_{xxt} + v_{xxx}v + 2v_{xx}v_x = 0. \tag{25} \]

The potential form of Eq. (24) takes the form

\[ u_{xxxt} + u_{xxxx}u_y + 2u_{xxx}u_{xy} = 0. \tag{26} \]

6. Euler–Poincaré Framework of (1 + 1)-Dimensional b-Field Equation
Denote by $F_\lambda(S^1)$ the space of tensor densities of degree $\lambda$ on $S^1$:

\[ F_\lambda = \{ a(x)(dx)^\lambda \mid a(x) \in C^\infty(S^1) \}, \]

where $\lambda$ is the degree and x is a local coordinate on $S^1$. As a vector space, $F_\lambda(S^1)$ is
isomorphic to $C^\infty(S^1)$ [53].
Geometrically we say

\[ F_\lambda \in \Gamma(\Omega^{\otimes\lambda}), \qquad \text{where } \Omega^{\otimes\lambda} = (T^*S^1)^{\otimes\lambda}, \]

and $\Omega = T^*S^1$ is the cotangent bundle of $S^1$. Here $F_0(M) = C^\infty(M)$, and the spaces
$F_1(M)$ and $F_{-1}(M)$ coincide with the spaces of differential forms and vector fields,
respectively.
Definition 6.1. The b-bracket between $v(x)\frac{d}{dx}$ and $w(x)\frac{d}{dx}$ is defined as

\[ [v, w]_b = vw_x - (b - 1)v_xw. \tag{27} \]

This b-bracket can also be expressed as

\[ [v, w]_b = \frac{b}{2}[v, w] - \frac{b - 2}{2}[v, w]_{sym}, \tag{28} \]
where $[v, w] = vw_x - v_xw$ and $[v, w]_{sym} = vw_x + v_xw$.

Remark. The b-bracket can be interpreted as an action of $\mathrm{Vect}(S^1)$ on
$F_{-(b-1)}(S^1)$, the tensor densities on $S^1$ of degree $-(b - 1)$. For b = 2 this is just
the vector field action corresponding to a Lie algebra. Moreover, because of the $[v, w]_{sym}$
term the b-bracket is not a skew-symmetric bracket; it is a deformation of the bracket
of vector fields.
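The decomposition (28) of the b-bracket into its skew and symmetric parts can be checked with exact arithmetic. The sketch below is illustrative only: it works with polynomials in one variable stored as coefficient lists (lowest degree first), and all helper names are assumptions made for this example.

```python
from fractions import Fraction

def trim(p):
    """Remove trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, d in enumerate(q):
            out[i + j] += a * d
    return trim(out)

def diff(p):
    """Derivative with respect to x."""
    return trim([i * a for i, a in enumerate(p)][1:])

def b_bracket(v, w, b):
    """Eq. (27): v w_x - (b - 1) v_x w."""
    return add(mul(v, diff(w)), scale(-(b - 1), mul(diff(v), w)))

def b_bracket_split(v, w, b):
    """Right-hand side of Eq. (28): (b/2)[v, w] - ((b - 2)/2)[v, w]_sym."""
    skew = add(mul(v, diff(w)), scale(-1, mul(diff(v), w)))
    sym = add(mul(v, diff(w)), mul(diff(v), w))
    return add(scale(Fraction(b) / 2, skew), scale(-(Fraction(b) - 2) / 2, sym))
```

For b = 2 the symmetric part drops out and `b_bracket(v, w, 2)` is the ordinary skew bracket of vector fields, in line with the Remark.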
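There is a pairing

\[ \langle\ ,\ \rangle : F_\lambda \otimes F_{1-\lambda} \to \mathbb{R} \]

given by

\[ \langle a(x)(dx)^\lambda,\ b(x)(dx)^{1-\lambda} \rangle = \int_{S^1} a(x)b(x)\,dx, \tag{29} \]

which is $\mathrm{Diff}(S^1)$-invariant. A vector field $f(x)\frac{d}{dx}$ acts on the space of tensor
densities $F_\lambda$ by the Lie derivative

\[ L_{f(x)\frac{d}{dx}}\bigl( a(x)(dx)^\lambda \bigr) = \bigl( f(x)a'(x) + \lambda f'(x)a(x) \bigr)(dx)^\lambda. \tag{30} \]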
We denote the b-algebra by $F_{-(b-1)}$ and its dual by $F_b$. Thus we can define a pairing
according to (29):

\[ \langle a(x)(dx)^{-(b-1)},\ b(x)(dx)^b \rangle = \int_{S^1} a(x)b(x)\,dx. \]

It is clear that the b-algebra is not a Lie algebra, and under these circumstances we
cannot define a proper coadjoint action. We generalize the concept of the coadjoint
action to the b-algebra, defined with respect to the pairing (29).

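Lemma 6.2.
\[ (\mathrm{ad}^*_{H^1})_f(u) = (1 - \partial_x^2)^{-1}\bigl[ f(1 - \partial_x^2)u_x + bf_x(1 - \partial_x^2)u \bigr]. \tag{31} \]

Proof. We know
\[ \langle \mathrm{ad}^*_f(u), g \rangle_{H^1} = -\langle u, [f, g]_b \rangle_{H^1}
 \equiv -\langle u\,dx^b,\ (fg' - (b - 1)f'g)(dx)^{1-b} \rangle_{H^1}; \]
hence the pairing is well-defined. Let us compute

\[ \mathrm{RHS} = -\int_{S^1} \bigl( ufg' - (b - 1)uf'g \bigr)\,dx - \int_{S^1} u'\bigl( fg' - (b - 1)f'g \bigr)'\,dx \]
\[ = \int_{S^1} \bigl[ f(1 - \partial_x^2)u_x + bf'(1 - \partial_x^2)u \bigr]\,g\,dx, \]

\[ \mathrm{LHS} = \int_{S^1} \bigl( (\mathrm{ad}^*_{H^1})_f u \bigr)\,g\,dx + \int_{S^1} \bigl( (\mathrm{ad}^*_{H^1})_f u \bigr)'\,g'\,dx
 = \int_{S^1} \bigl[ (1 - \partial_x^2)(\mathrm{ad}^*_{H^1})_f u \bigr]\,g\,dx. \]

Thus by equating the right-hand side and left-hand side we obtain the above
formula.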

Using the Helmholtz operator we write $m = (1 - \partial_x^2)u$. Thus, we express
the Hamiltonian operator corresponding to (31) as
\[ O_1 = -(1 - \partial_x^2)^{-1}(m_x + bm\partial_x). \tag{32} \]
The Euler–Poincaré equation

\[ u_t = O_1\frac{\delta H}{\delta u} \qquad \text{for } H = \frac{1}{2}\int_{S^1} u^2\,dx \]

can be rewritten as
\[ m_t = O\frac{\delta H}{\delta u}, \tag{33} \]
where $O = -(m_x + bm\partial_x)$.
Using the EP formula (33), we construct the b-field equation.
Proposition 6.3. The Euler–Poincaré flow on the dual space of the b-algebra yields
the b-field equation
\[ m_t + m_xu + bmu_x = 0. \]
This is a new derivation of the b-field equation.

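6.1. Hamiltonian structure of the Degasperis–Procesi equation and EP framework
Degasperis et al. studied Hamiltonian structures for the b = 3 case of the b-field equation
(the DHH equation); in other words, they exhibited the bihamiltonian features of the
Degasperis–Procesi system. They expressed the Degasperis–Procesi equation as
\[ m_t = B_i\frac{\delta H_i}{\delta m}, \qquad i = 0, 1, \tag{34} \]
where $m = u - u_{xx}$ (the length-scale parameter is normalized to 1). Thus they studied
the flow of the Helmholtz function. They showed that there is only one local Hamiltonian
structure,
\[ B_0 = \partial_x(1 - \partial_x^2)(4 - \partial_x^2), \tag{35} \]
and the second Hamiltonian structure is given by
\[ B_1 = m^{2/3}\partial_x m^{1/3}(\partial_x - \partial_x^3)^{-1} m^{1/3}\partial_x m^{2/3}, \tag{36} \]
which, up to normalization, can be rewritten as

\[ \hat{B} = -\frac{2}{9}(3m\partial_x + m_x)(\partial_x - \partial_x^3)^{-1}(3m\partial_x + 2m_x). \]

Proposition 6.4. The Degasperis–Procesi equation

\[ m_t = \hat{B}\frac{\delta H_1}{\delta m}, \qquad \hat{B} = -\frac{2}{9}(3m\partial_x + m_x)(\partial_x - \partial_x^3)^{-1}(3m\partial_x + 2m_x), \qquad H_1 = \frac{9}{4}\int_{S^1} m\,dx \tag{37} \]
is equivalent to

\[ m_t = O\frac{\delta H}{\delta u} \qquad \text{for } H = \frac{1}{2}\int_{S^1} u^2\,dx, \]

where $O = -(m_x + bm\partial_x)$ with b = 3.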
where O = (mx + bm).

Proof. Our goal is to show

\[ \frac{2}{9}(\partial_x - \partial_x^3)^{-1}(3m\partial_x + 2m_x)\frac{\delta H_1}{\delta m} = \frac{\delta H}{\delta u}, \]
where $H_1 = \frac{9}{4}\int_{S^1} m\,dx$. If we insert $\frac{\delta H_1}{\delta m} = \frac{9}{4}$ into the left-hand side of the
above equation we obtain
\[ (\partial_x - \partial_x^3)^{-1}m_x = u, \]
where we use $u = (1 - \partial_x^2)^{-1}m$. Thus we obtain

\[ m_t = -(3m\partial_x + m_x)\frac{\delta H}{\delta u}, \]

where $H = \frac{1}{2}\int_{S^1} u^2\,dx$.
Therefore the Degasperis–Holm–Hone form of the Hamiltonian structure coincides
with our Hamiltonian structure.

6.1.1. First Hamiltonian structure of b-field equation

Let us compute the Hamiltonian operator at a frozen point $m(x) = m_0$. Since $m_0$
is constant, the Hamiltonian operator at the frozen point becomes

\[ O_0 = -3m_0\partial_x. \tag{38} \]

Actually, freezing at the point $m_0$ yields a Poisson structure induced by a coboundary,
which is always a trivial Poisson structure. For all practical purposes we can
normalize this operator, choosing $m_0$ so that $\hat{O}_0 = \partial_x$. We show that this leads us to the
first Hamiltonian operator of the Degasperis–Procesi equation.

Proposition 6.5. The Degasperis–Procesi equation with respect to the first Hamiltonian
structure of Degasperis–Holm–Hone is exactly the flow generated by $\hat{O}_0 = \partial_x$, where the
corresponding Hamiltonian H satisfies

\[ \frac{\delta H}{\delta u} = -(2u^2 - u_x^2 - uu_{xx}). \tag{39} \]

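Proof. It is easy to check that

\[ \frac{\delta H}{\delta u} = (4 - \partial_x^2)\frac{\delta H_0}{\delta u}, \]

where the first DHH Hamiltonian is given by $H_0 = -\frac{1}{6}\int_{S^1} u^3\,dx$.
Thus we obtain
\[ m_t = \partial_x(4 - \partial_x^2)\frac{\delta H_0}{\delta u}. \]
Using the chain rule formula for variational derivatives,
\[ \frac{\delta H_0}{\delta u} = (1 - \partial_x^2)\frac{\delta H_0}{\delta m}, \]
we obtain
\[ m_t = \partial_x(4 - \partial_x^2)(1 - \partial_x^2)\frac{\delta H_0}{\delta m}. \]
Hence we obtain the first Hamiltonian structure $B_0$ of Degasperis, Holm and
Hone from our method.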
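The first step of the proof, and its agreement with the Degasperis–Procesi equation, can be verified directly. The sketch below assumes the sign convention $H_0 = -\frac16\int_{S^1} u^3\,dx$ (the sign cannot be read off the formula for $H_0$ alone) and $m = u - u_{xx}$.

```latex
\begin{align*}
\frac{\delta H_0}{\delta u} &= -\tfrac12 u^2,\\
(4 - \partial_x^2)\Bigl(-\tfrac12 u^2\Bigr) &= -2u^2 + (u u_x)_x = -\bigl(2u^2 - u_x^2 - u u_{xx}\bigr),\\
m_t = \partial_x\Bigl[-\bigl(2u^2 - u_x^2 - u u_{xx}\bigr)\Bigr]
 &= -\bigl(4 u u_x - 3 u_x u_{xx} - u u_{xxx}\bigr),\\
m_t + u m_x + 3 u_x m &= m_t + u(u_x - u_{xxx}) + 3u_x(u - u_{xx})
 = m_t + 4 u u_x - 3 u_x u_{xx} - u u_{xxx} = 0 .
\end{align*}
```

The last line is the b-field equation with b = 3, so under this sign convention the flow of $\partial_x$ with the variational derivative (39) reproduces the Degasperis–Procesi equation.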
7. EP Formalism for (2 + 1)-Dimensional b-Field Equation

Let $\hat{G}_1 = LG_1$ be the loop group associated with $G_1$, whose algebra
is given by

\[ \hat{\mathfrak{g}}_1 = L(F_{-(b-1)}). \]
Consider the action of $L(\mathrm{Vect}(S^1))$ on $L(F_{-(b-1)})$:

\[ L_{f\frac{\partial}{\partial x}}\bigl( g(dx)^{-(b-1)} \bigr) = (fg_x - (b - 1)f_xg)(dx)^{-(b-1)}; \tag{40} \]

this yields a new bracket.

Let us introduce the $H^1$ norm on the algebra $\hat{\mathfrak{g}}_1$.

Definition 7.1. The $H^1$-Sobolev norm on the loop tensor density algebra is
defined as

\[ \langle f(x, y)(dx)^{-(b-1)},\ u(x, y)(dx)^b \rangle_{H^1} = \int_{S^1} fu\,dx + \int_{S^1} \partial_x f\,\partial_x u\,dx. \tag{41} \]

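Proposition 7.2. The action of $\mathrm{Vect}(S^1)$ with respect to the $H^1$ metric on the space of
tensor densities $F_b$ is given by

\[ \widehat{\mathrm{ad}}^*_{f\frac{d}{dx}}\bigl( v(x, y)(dx)^b \bigr) = (f\hat{v}_x + bf_x\hat{v})\,dx^b, \]

where $\hat{v} = (1 - \partial_x^2)v$.

Corollary 7.3. The Hamiltonian operator corresponding to the action of $\mathrm{Vect}(S^1)$
on $F_b$ with respect to the $H^1$ metric is
\[ O = (1 - \partial_x^2)^{-1}\bigl( \partial_x\hat{v} + (b - 1)\hat{v}\partial_x \bigr). \tag{42} \]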

Proposition 7.4. The Euler–Poincaré flow on the $\hat{\mathfrak{g}}_1^*$ orbit yields the (2 + 1)-dimensional
b-field equation

\[ v_t - v_{xxt} + \partial_x^{-1}v_{yy} + (v_x - v_{xxx})\partial_x^{-1}v_y + b(v - v_{xx})v_y = 0, \tag{43} \]

where the Hamiltonian is given by

\[ H = \frac{1}{2}\int_{S^1 \times S^1} v\,\partial_x^{-1}v_y\,dx\,dy. \]

The potential form of (43) is

\[ u_{xt} - u_{xxxt} + u_{yy} + (u_{xx}u_y + bu_xu_{xy}) - (u_{xxxx}u_y + bu_{xxx}u_{xy}) = 0. \tag{44} \]

Introducing the quantity

\[ m = v - v_{xx}, \]
which is just the Helmholtz operator acting on v, the one-parameter
family of (2 + 1)-dimensional peakon-type PDEs (or (2 + 1)-dimensional b-field
equations) may be written in the following form:
\[ m_t + m_x\partial_x^{-1}v_y + bv_ym + \partial_x^{-1}v_{yy} = 0, \tag{45} \]
which reduces to the (1 + 1)-dimensional b-field equation for y = x when the
$\partial_x^{-1}v_{yy}$ term is switched off. It is clear that Eq. (45) becomes the (2 + 1)-dimensional
Camassa–Holm and the (2 + 1)-dimensional Degasperis–Procesi equation for b = 2 and
b = 3, respectively.

8. Conclusion and Outlook

We have examined various extensions of (2 + 1)-dimensional KdV equations and
(2 + 1)-dimensional generalized Camassa–Holm type systems. In particular, we have
shown that all these equations constitute geodesic flows on the loop Bott–Virasoro
group. In fact, three famous (2 + 1)-dimensional partial differential equations, namely the
Calogero–Bogoyavlenskii–Schiff (CBS), (2 + 1)-dimensional Camassa–Holm (CH2)
and (2 + 1)-dimensional Hunter–Saxton (HS2) equations, can be described as Euler–Poincaré
flows on the dual space of the loop Virasoro algebra.
After that we gave the Euler–Poincaré formalism of the (1 + 1)-dimensional
b-field equation, proposed by Degasperis et al., on the space of tensor
densities. We also extended the EP framework to the (2 + 1)-dimensional b-field equation,
which includes the (2 + 1)-dimensional Degasperis–Procesi equation. Therefore,
this paper has further strengthened the programme of Euler–Poincaré and
integrable geodesic flows on extended groups of diffeomorphisms. We hope to consider
the singular solutions of the (2 + 1)-dimensional equations in forthcoming work.

Acknowledgment
The author is profoundly grateful to Professors Jerry Marsden, Tudor Ratiu,
Valentin Ovsienko and Chand Devchand for stimulating discussions and various
constructive suggestions. He is also grateful to Professor Thanasis Fokas for his
interest and encouragement. In particular, he is immensely grateful to Chand
Devchand for the b-bracket discussion. Finally, the author wants to thank the
anonymous referee for many helpful comments and suggestions. He expresses grateful
thanks to Professor Jürgen Jost for gracious hospitality at the Max Planck
Institute for Mathematics in the Sciences.

References
[1] M. J. Ablowitz and P. A. Clarkson, Solitons, Nonlinear Evolution Equations and
Inverse Scattering, London Mathematical Society Lecture Note Series, Vol. 149 (Cam-
bridge University Press, 1991).
[2] V. I. Arnold, Sur la geometrie dierentielle des groupes de Lie de dimenson innie et
a lhydrodynamique des uids parfaits, Ann. Inst. Fourier Grenoble
ses applications `
16 (1966) 319361.
June 2, 2010 14:55 WSPC/S0129-055X 148-RMP J070-00398

EulerPoincar
e Flows on the Loop BottVirasoro Group 503

[3] A. Bressan and A. Constantin, Global solutions of the HunterSaxton equation,


SIAM J. Math. Anal. 37 (2005) 9961002.
[4] Yu. Billig, An extension of the KdV hierarchy arising from a representation of a
toroidal Lie algebra, J. Algebra 217 (1999) 4064.
[5] O. I. Bogoyavlensky, Breaking solitons in (2 + 1)-dimensional integrable equations,
Russian Math. Surveys 45(4) (1990) 186.
[6] O. I. Bogoyavlensky, Breaking solitons III, Izv. Akad. Nauk SSSR Ser. Matem. 54
(1990) 123131; Math. USSR Izv. 36 (1991) 129137 (English translation).
[7] F. Calogero, A method to generate solvable nonlinear evolution equations, Lett.
Nuovo Cimento 14 (1975) 443447.
[8] R. Camassa and D. Holm, An integrable shallow water equation with peaked solitons,
Phys. Rev. Lett. 71(11) (1993) 16611664.
[9] R. Camassa, D. Holm and J. M. Hyman, A new integrable shallow water equation,
Adv. Appl. Mech. 31 (1994) 133.
[10] A. Constantin and B. Kolev, Geodesic ow on the dieomorphism group of the circle,
Comment. Math. Helv. 78 (2003) 787804.
[11] A. Constantin, T. Kappeler, B. Kolev and P. Topalev, On geodesic exponential maps
of the Virasoro group, Ann. Global Anal. Geom. 31 (2007) 155180.
[12] A. Constantin and B. Kolev, Integrability of invariant metrics on the Virasoro group,
Phys. Lett. A 350(12) (2006) 7580.
[13] A. Constantin and B. Kolev, Integrability of invariant metrics on the dieomorphism
group of the circle, J. Nonlinear Sci. 16(2) (2006) 109122.
[14] A. Constantin and D. Lannes, The hydrodynamical relevance of the CamassaHolm
and DegasperisProcesi equations, Arch. Ration. Mech. Anal. 192 (2009) 165186.
[15] M. Chen, S.-Q. Liu and Y. Zhang, A 2-component generalization of the Camassa
Holm equation and its solutions, nlin.SI/0501028.
[16] A. Constantin, The trajectories of particles in Stokes waves, Invent. Math. 166 (2006)
523535.
[17] A. Constantin and W. Strauss, Stability of peakons, Comm. Pure Appl. Math. 53
(2000) 603610.
[18] A. Degasperis, D. D. Holm and A. N. W. Hone, A new integrable equation with
peakon solutions, NEEDS 2001 Proceedings, Theoret. and Math. Phys. 133 (2002)
170183.
[19] A. Degasperis, D. D. Holm and A. N. W. Hone, Integrable and non-integrable equa-
tions with peakons, in Nonlinear Physics: Theory and Experiment, II (Gallipoli, 2002)
(World Sci. Publishing, River Edge, NJ, 2003), pp. 3743.
[20] A. Degasperis and M. Procesi, Asymptotic integrability, in Symmetry and Perturba-
tion Theory (Rome, 1998) (World Sci. Publishing, River Edge, NJ, 1999), pp. 2337.
[21] C. Devchand and J. Schi, The supersymmetric CamassaHolm equation and
geodesic ow on the superconformal group, J. Math. Phys. 42(1) (2001) 260273.
[22] D. Ebin and J. Marsden, Groups of dieomorphisms and themotion of an incom-
pressible uid, Ann. Math. 92 (1970) 102163.
[23] J. Escher and Z. Yin, Well-posedness, blow-up phenomena, and global solutions for
the b-equation, J. Reine Angew. Math. 624 (2008) 5180.
[24] G. Falqui, On a CamassaHolm type equation with two dependent variables,
nlin.SI/0505059.
[25] A. S. Fokas and B. Fuchssteiner, Backlund transformations for hereditary symmetries,
Nonlinear Anal. 5(4) (1981) 423432.
[26] A. S. Fokas, P. J. Olver, P. Rosenau, A plethora of integrable bi-Hamiltonian equa-
tions, in Algebraic Aspects of Integrable Systems, Progr. Nonlinear Dierential Equa-
tions Appl., Vol. 26 (Birkhauser Boston, Boston, MA, 1997), pp. 93101.
June 2, 2010 14:55 WSPC/S0129-055X 148-RMP J070-00398

504 P. Guha

[27] A. S. Fokas and P. M. Santini, Dromions and a boundary value problem for the
DaveyStewartson I equation, Physica D 44 (1990) 99130.
[28] A. S. Fokas and P. M. Santini, The recursion operator of the KadomtsevPetviashvili
equation and the squared eigenfunction of the Schr odinger operators, Stud. Appl.
Math. 75 (1986) 179186.
[29] A. S. Fokas and P. M. Santini, Recursion operators and bi-Hamiltonian structures in
multidimensions I, Comm. Math. Phys. 115 (1988) 375419.
[30] A. S. Fokas and P. M. Santini, Recursion operators and bi-Hamiltonian structures in
multidimensions II, Comm. Math. Phys. 116 (1988) 449474.
[31] I. M. Gelfand and D. B. Fuks, Cohomologies of the Lie algebra of vector elds on the
circle, Funct. Anal. Appl. 2(4) (1968) 9293.
[32] P. Guha, Integrable geodesic ows on the (super)extension of the BottVirasoro
group, Lett. Math. Phys. 52(4) (2000) 311328.
[33] P. Guha, Geodesic ows, bi-Hamiltonian structure and coupled KdV type systems,
J. Math. Anal. Appl. 310 (2005) 4556.
[34] P. Guha and P. Olver, Geodesic ow and two (super) component analog of the
CamassaHolm equation, SIGMA 2 (2006) 054, 9 pp.
[35] P. Guha, EulerPoincare formalism of (two component) DegasperisProcesi and
HolmStaley type systems, J. Nonlinear Math. Phys. 14(3) (2007) 390421.
[36] P. Guha, EulerPoincare ows on the space of tensor densities and integrable systems,
Oberwolfach Report 5(3) (2008) 18751880.
[37] I. M. Gelfand, I. M. Graev and A. M. Vershik, Models of representations of current
groups, Representations of Lie Groups and Lie Algebras (Budapest, 1971) (Akad.
Kiad, Budapest, 1985), pp. 121179.
[38] D. Henry, Persistence properties for a family of nonlinear partial dierential equa-
tions, Nonlinear Anal. 70 (2009) 20492064.
[39] A. N. W. Hone, Reciprocal link for (2 + 1)-dimensional extensions of shallow water
equations, Appl. Math. Lett. 13(3) (2000) 3742.
[40] J. K. Hunter and R. Saxton, Dynamics of director elds, SIAM J. Appl. Math. 51
(1991) 14981521.
[41] R. S. Johnson, CamassaHolm, Kortewegde Vries and related models for water
waves, J. Fluid Mech. 455 (2002) 6382.
[42] A. Kirillov, Innite-dimensional Lie groups: Their orbits, invariants and represen-
tations. The geometry of moments, in Twistor Geometry and Nonlinear Systems,
Lecture Notes in Math., Vol. 970 (Springer, Berlin, 1982), pp. 101123.
[43] B. Khesin and G. Misiolek, Euler equations on homogeneous spaces and Virasoro
orbits, Adv. Math. 176(1) (2003) 116144.
[44] B. G. Konopelchenko, Solitons in Multidimensions (World Scientic, 1993).
[45] J. Lenells, The HunterSaxton equation describes the geodesic ow on a sphere, J.
Geom. Phys. 57 (2007) 20492064.
[46] J. Lenells, Weak geodesic ow and global solutions of the HunterSaxton equation,
Discrete Contin. Dyn. Syst. 18 (2007) 643656.
[47] S.-Y. Lou, Searching for higher dimensional integrable models from lower ones via
Painleve analysis, Phys. Rev. Lett. 80 (1998) 50275031.
[48] Z. Lin and Y. Liu, Stability of peakons for the DegasperisProcesi equation, Comm.
Pure Appl. Math. 62 (2009) 125146.
[49] J. Lin, S.-Y. Lou and K. Wang, High-dimensional Virasoro integrable models and
exact solutions, Phys. Lett. A 287(34) (2001) 257267.
[50] Y.-S. Li and Y.-J. Zhang, Symmetries of a (2 + 1)-dimensional breaking soliton equa-
tion, J. Phys. A 26(24) (1993) 74877494.
June 2, 2010 14:55 WSPC/S0129-055X 148-RMP J070-00398

EulerPoincar
e Flows on the Loop BottVirasoro Group 505

[51] G. Misiolek, A shallow water equation as a geodesic ow on the BottVirasoro group,


J. Geom. Phys. 24 (1998) 203208.
[52] J. E. Marsden and T. Ratiu, Introduction to Mechanics and Symmetry (Springer-
Verlag, New York, 1994).
[53] V. Ovsienko and C. Roger, Generalizations of Virasoro group and Virasoro algebra
through extensions by modules of tensor-densities on S 1 , Indag. Math. (N.S.) 9(2)
(1998) 277288.
[54] V. Ovsienko and C. Roger, Looped cotangent Virasoro algebra and nonlinear inte-
grable systems in dimension 2 + 1, Comm. Math. Phys., 273 (2007) 357378; math-
ph/0602043.
[55] V. Ovsienko, Coadjoint representation of Virasoro-type Lie algebras and dierential
operators on tensor-densities, in Innite Dimensional K ahler Manifolds (Oberwolfach,
1995), DMV Sem., Vol. 31 (Birkh auser, Basel, 2001), pp. 231255.
[56] P. Olver and P. Rosenau, Tri-Hamiltonian duality between solitons and solitary-wave
solutions having compact support, Phys. Rev. E (3) 53(2) (1996) 19001906.
[57] V. Yu. Ovsienko and B. A. Khesin, KdV super equation as an Euler equation, Funct.
Anal. Appl. 21 (1987) 329331.
[58] V. Yu. Ovsienko, Bi-Hamiltonian nature of the equation utx = uxy uy uyy ux ,
arXiv:0802.1818v1 [math-ph].
[59] R. Radha and M. Lakshmanan, Dromion-like structures in the (2 + 1)-dimensional
breaking soliton equation, Phys. Lett. A 197(1) (1995) 712.
[60] P. Rosenau, Nonlinear dispersion and compact structures, Phys. Rev. Lett. 73(13)
(1994) 17371741.
[61] E. Ramos, C.-H. Sah and R Shrock, Algebras of dieomorphisms of the N -torus, J.
Math. Phys. 31(8) (1990) 18051816.
[62] A. Reiman and M. Semenov-Tyan-Shanskii, Hamiltonian structure of equations of
KadomtsevPetviashvili type, in Dierential Geometry, Lie Groups and Mechanics,
VI. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 133 (1984)
212227.
[63] H.-Y. Ruan and Y.-X. Chen, Dromion interactions of (2 + 1)-dimensional KdV-type
equations, J. Phys. Soc. Japan 72(3) (2003) 491495.
[64] J. Schi, Integrability of ChernSimonsHiggs vortex equations and a reduction of the
self-dual YangMills equations to three dimensions, in Painleve Transcendents, eds.
D. Levi and P. Winternitz, NATO ASI Series B, Vol. 278 (Plenum Press, New York,
1992).
[65] G. Segal, Unitary representations of some innite-dimensional groups, Comm. Math.
Phys. 80(3) (1981) 301342.
[66] I. A. B. Strachan, Some integrable hierarchies in (2 + 1)-dimensions and their twistor
description, J. Math. Phys. 34(1) (1993) 243259.
[67] P. Zusmanovich, The second homology group of current Lie algebras, in K-Theory
(Strasbourg, 1992), Asterisque 226(11) (1994) 435452.
June 2, 2010 14:55 WSPC/S0129-055X 148-RMP J070-00402

Reviews in Mathematical Physics
Vol. 22, No. 5 (2010) 507–531
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004028

PROJECTIVE MODULE DESCRIPTION OF EMBEDDED
NONCOMMUTATIVE SPACES

R. B. ZHANG
School of Mathematics and Statistics,
University of Sydney, Sydney, Australia
ruibin.zhang@sydney.edu.au

XIAO ZHANG
Institute of Mathematics,
Academy of Mathematics and Systems Science,
Chinese Academy of Sciences, Beijing, P. R. China
xzhang@amss.ac.cn

Received 20 May 2009
Revised 5 February 2010

An algebraic formulation is given for the embedded noncommutative spaces over the
Moyal algebra developed in a geometric framework in [8]. We explicitly construct the
projective modules corresponding to the tangent bundles of the embedded
noncommutative spaces, and recover from this algebraic formulation the metric,
Levi-Civita connection and related curvatures, which were introduced geometrically
in [8]. Transformation rules for connections and curvatures under general coordinate
changes are given. A bar involution on the Moyal algebra is discovered, and its
consequences for the noncommutative differential geometry are described.

Keywords: Noncommutative space; projective module; isometric embedding.

Mathematics Subject Classification 2010: 51P05, 81R60, 83C65

1. Introduction
It is a long held belief in physics that the notion of spacetime as a pseudo-Riemannian
manifold requires modification at the Planck scale [34, 38]. Theoretical
investigations in recent times have strongly supported this view. In particular, the seminal
paper [16] by Doplicher, Fredenhagen and Roberts demonstrated mathematically
that coordinates of spacetime become noncommutative at the Planck scale, thus
some form of noncommutative geometry [13] appears to be necessary in order to
describe the structure of spacetime. This prompted intensive activities in
mathematical physics studying various noncommutative generalisations of Einstein's
theory of general relativity [1, 3, 5–11, 29, 30]. For reviews on earlier works, we refer
to [31, 35] and references therein. For more recent developments, particularly on
the study of noncommutative black holes, see [2, 4, 7, 9, 15, 26, 27, 33, 37].
In joint work with Chaichian and Tureanu [8], we investigated the noncommutative
geometry [13, 22] of noncommutative spaces embedded in higher dimensions.
We first quantized a space by deforming [21, 28] the algebra of functions to a
noncommutative associative algebra known as the Moyal algebra. Such an algebra
naturally incorporates the generalized spacetime uncertainty relations of [16],
capturing key features expected of spacetime at the Planck scale. We then
systematically investigated the noncommutative geometry of embedded noncommutative
spaces. This was partially motivated by Nash's isometric embedding theorem [32]
and its generalization to pseudo-Riemannian manifolds [12, 19, 23], which state that
any (pseudo-) Riemannian manifold can be isometrically embedded in Euclidean
or Minkowski spaces. Therefore, in order to study the geometry of spacetime, it
suffices to investigate (pseudo-) Riemannian manifolds embedded in higher
dimensions. Embedded noncommutative spaces also play a role in the study of branes
embedded in $\mathbb{R}^D$ in the context of Yang–Mills matrix models [36].
The theory of [8] was developed within a geometric framework analogous to
the classical theory of embedded surfaces (see, e.g., [14]). The present paper
further develops the differential geometry of embedded noncommutative spaces by
constructing an algebraic formulation in terms of projective modules, a language
commonly adopted in noncommutative geometry [13, 22].
We shall first describe the finitely generated projective modules over a Moyal
algebra, which will be regarded as noncommutative vector bundles on a quantized
spacetime. We then construct a differential geometry of the noncommutative vector
bundles, developing a theory of connections and curvatures on such bundles. In
doing this, we make crucial use of a unique property of the Moyal algebra, namely,
it has a set of mutually commutative derivations related to the usual partial
derivations of functions.
Then we apply the noncommutative differential geometry developed to study
the embedded noncommutative spaces introduced in [8]. We explicitly construct
the projective modules corresponding to the tangent bundles of the noncommutative
spaces, and recover from this algebraic formulation the geometric Levi-Civita
connections and related curvatures introduced in [8]. This way, the embedded
noncommutative spaces of [8] acquire a natural interpretation in the algebraic formalism
presented here.
Morally, one may regard the very definition of a projective module (a direct
summand of a free module) as the geometric equivalent of embedding a low-dimensional
manifold isometrically in a higher dimensional one. In the commutative setting
of classical (pseudo-) Riemannian geometry, we make this connection more
precise and explicit by showing that the projective module description of tangent
bundles studied here is a natural consequence of the isometric embedding
theorems [12, 19, 23, 32]. This is briefly discussed in Theorem 7.1.
As a concrete example of noncommutative differential geometries over the
Moyal algebra, we study in detail a quantum deformation of a time slice
of the Schwarzschild spacetime. The projection operator yielding the tangent
bundle is given explicitly, and the corresponding metric is also worked out.
As is well known, one of the fundamental principles of general relativity is general
covariance. It is important to find a noncommutative version of this principle.
By analyzing the structure of the Moyal algebra, we show that the noncommutative
geometry developed here (initiated in [8]) retains some notion of general
covariance. Properties of the connection and curvature under general coordinate
transformations are described explicitly (see Theorem 5.1).
The Moyal algebra (over the real numbers) admits an involution similar
to the bar involution in the context of quantum groups. We introduce a
particularly nice class of noncommutative vector bundles over the Moyal algebra,
which are associated to bar invariant idempotents and endowed with bar
hermitian connections (see Sec. 6). In this case, the bar involution takes the left
tangent bundles to right tangent bundles. We show that the tangent bundles
of embedded noncommutative spaces under a mild condition belong to this
class.
The organization of the paper is as follows. In Sec. 2, we describe the Moyal
algebras and finitely generated projective modules over them. In Sec. 3, we discuss
the differential geometry of noncommutative vector bundles on quantum spaces
corresponding to Moyal algebras. In Sec. 4, we develop the differential geometry of
embedded noncommutative spaces using the language of projective modules. As an
explicit example, we study in detail the quantum deformation of a time slice of the
Schwarzschild spacetime in Sec. 4.2. In Sec. 5, we study the effect of general
coordinate transformations. In Sec. 6, we investigate properties of noncommutative vector
bundles under the bar involution of the Moyal algebra. Finally, Sec. 7 concludes
the paper with some general comments and a discussion of the natural relationship
between projective modules and isometric embeddings in classical (pseudo-)
Riemannian geometry.
Before closing this section, we mention that the theory of [8] has the advantage
of being explicit and easy to use for computations. Using this theory, we
constructed noncommutative Schwarzschild and Schwarzschild–de Sitter spacetimes in
joint work with Wang [37]. Our long term aim is to develop a theoretical framework
for studying noncommutative general relativity. A variety of physically motivated
methods and techniques were used in the literature to study corrections to general
relativity arising from the noncommutativity of the Moyal algebra. In particular,
references [1, 3] studied deformations of the diffeomorphism algebra as a means
for incorporating noncommutative effects of spacetime, while in [6, 7, 9] a gauge
theoretical approach was taken. These approaches differ considerably from the
theory of [8, 37] at the mathematical level.

2. Moyal Algebra and Projective Modules


We describe the Moyal algebra of smooth functions on an open region of $\mathbb{R}^n$, and
the finitely generated projective modules over the Moyal algebra. This provides the
background material needed in later sections, and also serves to fix notation.
We take an open region $U$ in $\mathbb{R}^n$ for a fixed $n$, and write the coordinate of a
point $t \in U$ as $(t^1, t^2, \dots, t^n)$. Let $h$ be a real indeterminate, and denote by
$\mathbb{R}[[h]]$ the ring of formal power series in $h$. Let $A$ be the set of formal power
series in $h$ with coefficients being real smooth functions on $U$. Namely, every element
of $A$ is of the form $\sum_{i \ge 0} f_i h^i$, where the $f_i$ are smooth functions on $U$.
Then $A$ is an $\mathbb{R}[[h]]$-module in the obvious way.
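Fix a constant skew symmetric $n \times n$ matrix $\theta = (\theta_{ij})$. The Moyal product on
$A$ corresponding to $\theta$ is a map
$$\mu : A \otimes_{\mathbb{R}[[h]]} A \to A, \qquad f \otimes g \mapsto \mu(f, g),$$
defined by
$$\mu(f, g)(t) = \lim_{t' \to t} \exp\Big( h \sum_{i,j} \theta_{ij} \frac{\partial}{\partial t^i} \frac{\partial}{\partial t'^j} \Big) f(t)\, g(t'). \qquad (2.1)$$
On the right-hand side, $f(t) g(t')$ means the usual product of the numerical values
of the functions $f$ and $g$ at $t$ and $t'$, respectively.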
It has been known since the early days of quantum mechanics that the Moyal
product is associative (see, e.g., [28] for reference). Thus the $\mathbb{R}[[h]]$-module $A$
equipped with the Moyal product forms an associative algebra over $\mathbb{R}[[h]]$, which
is a deformation of the algebra of smooth functions on $U$ in the sense of [21]. We
shall usually denote this associative algebra by $A$, but when it is necessary to make
explicit the multiplication, we shall write it as $(A, \mu)$.
The partial derivations $\partial_i := \partial / \partial t^i$ with respect to the coordinates $t^i$ for $U$ are
$\mathbb{R}[[h]]$-linear maps on $A$. Since $\theta$ is a constant matrix, the Leibniz rule is valid.
Namely, for any elements $f$ and $g$ of $A$, we have
$$\partial_i \mu(f, g) = \mu(\partial_i f, g) + \mu(f, \partial_i g). \qquad (2.2)$$
Therefore, the $\partial_i$ ($i = 1, 2, \dots, n$) are mutually commutative derivations of the
Moyal algebra $(A, \mu)$ on $U$.
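
The associativity of the product (2.1) and the Leibniz rule (2.2) can be checked by hand on polynomials, where the exponential series terminates. The following is a minimal sketch, not part of the original text, under several assumptions: n = 2, the skew matrix is fixed by a single assumed value `THETA` for $\theta_{12} = -\theta_{21}$, and elements of the algebra are restricted to polynomials in $t^1$, $t^2$ and $h$. The names `star`, `diff`, `mul`, `add` and the dictionary encoding are illustrative choices only.

```python
from fractions import Fraction
from math import comb, factorial

THETA = Fraction(1)  # assumed value of theta_12 = -theta_21 (case n = 2)

# A polynomial in t1, t2, h is stored as {(a, b, k): coeff},
# standing for coeff * t1^a * t2^b * h^k.

def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {key: c for key, c in p.items() if c != 0}

def add(p, q):
    """Sum of two polynomials."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return clean(out)

def scale_shift(p, factor, h_power):
    """Multiply p by factor * h^h_power."""
    return {(a, b, k + h_power): c * factor for (a, b, k), c in p.items()}

def mul(p, q):
    """Ordinary commutative product."""
    out = {}
    for (a1, b1, k1), c1 in p.items():
        for (a2, b2, k2), c2 in q.items():
            key = (a1 + a2, b1 + b2, k1 + k2)
            out[key] = out.get(key, 0) + c1 * c2
    return clean(out)

def diff(p, var, times=1):
    """Differentiate `times` times with respect to t1 (var=0) or t2 (var=1)."""
    out = {}
    for (a, b, k), c in p.items():
        exps = [a, b]
        if exps[var] < times:
            continue
        for s in range(times):
            c = c * (exps[var] - s)
        exps[var] -= times
        key = (exps[0], exps[1], k)
        out[key] = out.get(key, 0) + c
    return clean(out)

def star(f, g):
    """Moyal product: sum over r of h^r / r! times D^r applied to f (x) g,
    where D = THETA * (d1 (x) d2 - d2 (x) d1); the sum stops at deg f."""
    degree = max((a + b for (a, b, _) in f), default=0)
    result = {}
    for r in range(degree + 1):
        for s in range(r + 1):
            left = diff(diff(f, 0, s), 1, r - s)
            right = diff(diff(g, 1, s), 0, r - s)
            factor = Fraction(comb(r, s) * (-1) ** (r - s), factorial(r)) * THETA ** r
            result = add(result, scale_shift(mul(left, right), factor, r))
    return result

# Usage: the star-commutator of the coordinates is the single term 2 * h * theta_12.
t1 = {(1, 0, 0): Fraction(1)}
t2 = {(0, 1, 0): Fraction(1)}
commutator = add(star(t1, t2), scale_shift(star(t2, t1), -1, 0))
print(commutator)
```

Exact rational arithmetic is used so that associativity, `star(star(f, g), k) == star(f, star(g, k))`, can be tested as an equality of dictionaries rather than up to rounding.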

Remark 2.1. The usual notation in the literature for $\mu(f, g)$ is $f * g$. This is referred
to as the star-product of $f$ and $g$. Hereafter, we shall replace $\mu$ by $*$ and simply
write $\mu(f, g)$ as $f * g$.
Following the general philosophy of noncommutative geometry [13], we regard
the associative algebra $(A, *)$ as defining some quantum deformation of the region
$U$, and finitely generated projective modules over $A$ as (spaces of sections of)
noncommutative vector bundles on the quantum deformation of $U$ defined by the
noncommutative algebra $A$. Let us now briefly describe finitely generated projective
modules over $A$.
Given an integer $m > n$, we let ${}_l A^m$ (respectively, $A^m_r$) be the set of
$m$-tuples with entries in $A$ written as rows (respectively, columns). We shall regard
${}_l A^m$ (respectively, $A^m_r$) as a left (respectively, right) $A$-module with the action
defined by multiplication from the left (respectively, right). More explicitly, for
$v = (a_1 \; a_2 \; \cdots \; a_m) \in {}_l A^m$ and $b \in A$, we have
$b * v = (b * a_1 \; b * a_2 \; \cdots \; b * a_m)$. Similarly, for the column
$w = (a_1 \; a_2 \; \cdots \; a_m)^t \in A^m_r$, we have
$w * b = (a_1 * b \; a_2 * b \; \cdots \; a_m * b)^t$. Let $M_m(A)$ be the set
of $(m \times m)$-matrices with entries in $A$. We define matrix multiplication in the usual
way but by using the Moyal product for products of matrix entries, and still denote
the corresponding matrix multiplication by $*$. Now for $A = (a_{ij})$ and $B = (b_{ij})$,
we have $(A * B) = (c_{ij})$ with $c_{ij} = \sum_k a_{ik} * b_{kj}$. Then $M_m(A)$ is an
$\mathbb{R}[[h]]$-algebra, which has a natural left (respectively, right) action on $A^m_r$
(respectively, ${}_l A^m$).

A finitely generated projective left (respectively, right) $A$-module is isomorphic
to some direct summand of ${}_l A^m$ (respectively, $A^m_r$) for some $m < \infty$. If
$e \in M_m(A)$ satisfies the condition $e * e = e$, that is, it is an idempotent, then
$$M = {}_l A^m * e := \{ v * e \mid v \in {}_l A^m \}, \qquad \tilde{M} = e * A^m_r := \{ e * w \mid w \in A^m_r \}$$
are, respectively, projective left and right $A$-modules. Furthermore, every projective
left (right) $A$-module is isomorphic to an $M$ (respectively, $\tilde{M}$) constructed this way
by using some idempotent $e$.
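
To make the role of the idempotent concrete, here is a small sketch in the commutative limit $h = 0$, evaluated at a single point, so that entries are ordinary rational numbers and $*$ reduces to the usual product. This simplification is an assumption made for illustration only, and the helper names (`rank_one_idempotent`, `matmul`, `row_times_matrix`) are hypothetical. For a nonzero row vector $a$, the matrix $e = a^t (a a^t)^{-1} a$ is idempotent, and every row $v e$ is fixed by right multiplication with $e$.

```python
from fractions import Fraction

def rank_one_idempotent(a):
    """Return e = a^t (a a^t)^{-1} a for a nonzero row vector a (commutative case)."""
    norm_sq = sum(x * x for x in a)
    if norm_sq == 0:
        raise ValueError("a must be nonzero")
    return [[Fraction(x * y) / norm_sq for y in a] for x in a]

def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    size = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def row_times_matrix(v, p):
    """Right action of a matrix on a row vector: v -> v p."""
    return [sum(v[k] * p[k][j] for k in range(len(v))) for j in range(len(p))]

e = rank_one_idempotent([1, 2, 2])
xi = row_times_matrix([1, 0, 0], e)     # an element v e of the module
print(matmul(e, e) == e)                # idempotency e e = e
print(row_times_matrix(xi, e) == xi)    # xi e = xi
```

Here the module of rows $v e$ is a rank-one direct summand of the free module of rows of length 3, the simplest instance of the general statement.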
In Sec. 4, we shall give a systematic method for constructing idempotents (see
(4.1)). The corresponding noncommutative vector bundles include the tangent
bundles of embedded noncommutative spaces introduced in [8], which we shall
investigate in depth. An explicit example of embedded noncommutative spaces will be
analyzed in detail in Sec. 4.2. To do this, we need to develop some generalities of
the differential geometry of noncommutative vector bundles using the language of
projective modules over the Moyal algebra.

3. Differential Geometry of Noncommutative Vector Bundles

In this section, we investigate general aspects of the noncommutative differential
geometry over the Moyal algebra. We shall focus on the abstract theory here. A
large class of examples will be given in Sec. 4, including one which will be worked
out in detail.
As we shall see, the set of mutually commutative derivations $\partial_i$ ($i = 1, 2, \dots, n$)
of the Moyal algebra $A$ will play a crucial role in developing the noncommutative
differential geometry.

3.1. Connections and curvatures


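We start by considering the action of the partial derivations $\partial_i$ on $M$ and $\tilde{M}$. We
only treat the left module in detail, and present the pertinent results for the right
module at the end, since the two cases are similar.
Let us first specify that $\partial_i$ acts on rectangular matrices with entries in $A$ by
componentwise differentiation. More explicitly,
$$\partial_i B = \begin{pmatrix} \partial_i b_{11} & \partial_i b_{12} & \cdots & \partial_i b_{1l} \\ \partial_i b_{21} & \partial_i b_{22} & \cdots & \partial_i b_{2l} \\ \vdots & \vdots & & \vdots \\ \partial_i b_{k1} & \partial_i b_{k2} & \cdots & \partial_i b_{kl} \end{pmatrix} \quad \text{for} \quad B = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1l} \\ b_{21} & b_{22} & \cdots & b_{2l} \\ \vdots & \vdots & & \vdots \\ b_{k1} & b_{k2} & \cdots & b_{kl} \end{pmatrix}.$$
In particular, given any $\xi = v * e \in M$, where $v \in {}_l A^m$ is regarded as a row matrix,
we have $\partial_i \xi = (\partial_i v) * e + v * \partial_i(e)$ by the Leibniz rule. While the first term belongs
to $M$, the second term does not in general. Therefore, the $\partial_i$ ($i = 1, 2, \dots, n$) send $M$
to some subspace of ${}_l A^m$ different from $M$.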
Let $\omega_i \in M_m(A)$ ($i = 1, 2, \dots, n$) be $(m \times m)$-matrices with entries in $A$
satisfying the following condition:
$$e * \omega_i * (1 - e) = -e * \partial_i e, \quad \forall i. \qquad (3.1)$$

Define the $\mathbb{R}[[h]]$-linear maps $\nabla_i$ ($i = 1, 2, \dots, n$) from $M$ to ${}_l A^m$ by
$$\nabla_i \xi = \partial_i \xi + \xi * \omega_i, \quad \forall \xi \in M.$$
Then each $\nabla_i$ is a covariant derivative on the noncommutative bundle $M$ in the
sense of Theorem 3.1 below. They together define a connection on $M$.

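Theorem 3.1. The maps $\nabla_i$ ($i = 1, 2, \dots, n$) have the following properties. For all
$\xi \in M$ and $a \in A$,
$$\nabla_i \xi \in M \quad \text{and} \quad \nabla_i(a * \xi) = \partial_i(a) * \xi + a * \nabla_i \xi.$$

Proof. For any $\xi \in M$, we have
$$\nabla_i(\xi) * e = \partial_i(\xi) * e + \xi * \omega_i * e = \partial_i \xi + \xi * (\omega_i * e - \partial_i e),$$
where we have used the Leibniz rule and also the fact that $\xi * e = \xi$. Using this
latter fact again, we have $\xi * (\omega_i * e - \partial_i e) = \xi * (e * \omega_i * e - e * \partial_i e)$, and by the
defining property (3.1) of $\omega_i$, we obtain $\xi * (e * \omega_i * e - e * \partial_i e) = \xi * \omega_i$. Hence
$$\nabla_i(\xi) * e = \partial_i \xi + \xi * \omega_i = \nabla_i \xi,$$
proving that $\nabla_i \xi \in M$. The second part of the theorem immediately follows from
the Leibniz rule.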

We shall also say that the set of $\omega_i$ ($i = 1, 2, \dots, n$) is a connection on $M$. Since
$e * \partial_i e = \partial_i(e) * (1 - e)$, one obvious choice for $\omega_i$ is $\omega_i = -\partial_i e$, which we shall refer
to as the canonical connection on $M$.
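
The identity quoted for the canonical connection follows from differentiating $e * e = e$. As a short worked check, writing $*$ for the Moyal product and $\omega_i$ for the matrices constrained by (3.1) (symbol names assumed here), one finds that $\omega_i = -\partial_i e$ does satisfy (3.1):

```latex
\begin{align*}
\partial_i e &= \partial_i(e * e) = \partial_i(e) * e + e * \partial_i e
  \;\Longrightarrow\; e * \partial_i e = \partial_i(e) * (1 - e), \\
e * (-\partial_i e) * (1 - e) &= -\partial_i(e) * (1 - e) * (1 - e)
  = -\partial_i(e) * (1 - e) = -e * \partial_i e .
\end{align*}
```

The second line uses $(1 - e) * (1 - e) = 1 - e$, which is again a consequence of $e * e = e$.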
By inspecting the defining property (3.1) for a connection, we easily see the
following result.
Lemma 3.2. If $\omega_i$ ($i = 1, 2, \dots, n$) define a connection on $M$, then so do also
$\omega_i + \phi_i * e$ ($i = 1, 2, \dots, n$) for any $(m \times m)$-matrices $\phi_i$ with entries in $A$.
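
The lemma rests on a one-line computation. In the sketch below, $\phi_i$ is an assumed name for the arbitrary matrices of the lemma and $\omega_i$ for a connection satisfying (3.1); the added term drops out because $e * (1 - e) = 0$:

```latex
\[
e * (\omega_i + \phi_i * e) * (1 - e)
  = e * \omega_i * (1 - e) + e * \phi_i * (e - e * e)
  = e * \omega_i * (1 - e) = -e * \partial_i e .
\]
```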

For a given connection $\omega_i$ ($i = 1, 2, \dots, n$), we consider $[\nabla_i, \nabla_j] = \nabla_i \nabla_j - \nabla_j \nabla_i$
with the right hand side understood as composition of maps on $M$. By simple
calculations we can show that for all $\xi \in M$,
$$[\nabla_i, \nabla_j] \xi = \xi * R_{ij} \quad \text{with} \quad R_{ij} := \partial_i \omega_j - \partial_j \omega_i - [\omega_i, \omega_j]_*,$$
where $[\omega_i, \omega_j]_* = \omega_i * \omega_j - \omega_j * \omega_i$ is the commutator. We call $R_{ij}$ the curvature of
$M$ associated with the connection $\omega_i$.
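
The "simple calculations" can be spelled out as follows. This is a sketch using the assumed notation $\nabla_i \xi = \partial_i \xi + \xi * \omega_i$ for the covariant derivative of a row vector $\xi$ in $M$; the terms symmetric in $i$ and $j$ cancel in the commutator:

```latex
\begin{align*}
\nabla_i \nabla_j \xi
  &= \partial_i(\partial_j \xi + \xi * \omega_j) + (\partial_j \xi + \xi * \omega_j) * \omega_i \\
  &= \partial_i \partial_j \xi + \partial_i \xi * \omega_j + \partial_j \xi * \omega_i
     + \xi * \partial_i \omega_j + \xi * \omega_j * \omega_i , \\
[\nabla_i, \nabla_j] \xi
  &= \xi * (\partial_i \omega_j - \partial_j \omega_i)
     - \xi * (\omega_i * \omega_j - \omega_j * \omega_i)
   = \xi * R_{ij} .
\end{align*}
```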
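For all $\xi \in M$,
$$[\nabla_i, \nabla_j] \nabla_k \xi = \partial_k(\xi) * R_{ij} + \xi * \omega_k * R_{ij},$$
$$\nabla_k [\nabla_i, \nabla_j] \xi = \partial_k(\xi) * R_{ij} + \xi * (\partial_k R_{ij} + R_{ij} * \omega_k).$$
Define the following covariant derivatives of the curvature:
$$\nabla_k R_{ij} := \partial_k R_{ij} + R_{ij} * \omega_k - \omega_k * R_{ij}; \qquad (3.2)$$
we have
$$[\nabla_k, [\nabla_i, \nabla_j]] \xi = \xi * \nabla_k R_{ij}, \quad \forall \xi \in M.$$
The Jacobi identity $[\nabla_k, [\nabla_i, \nabla_j]] + [\nabla_j, [\nabla_k, \nabla_i]] + [\nabla_i, [\nabla_j, \nabla_k]] = 0$ leads to
$$\xi * (\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk}) = 0, \quad \forall \xi \in M.$$
From this, we immediately see that $e * (\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk}) = 0$. In fact, the
following stronger result holds.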

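Theorem 3.3. The curvature satisfies the following Bianchi identity:
$$\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk} = 0.$$

Proof. The proof is entirely combinatorial. Let
$$A_{ijk} = \partial_k \partial_i \omega_j - \partial_k \partial_j \omega_i, \qquad B_{ijk} = [\partial_i \omega_j, \omega_k]_* - [\partial_j \omega_i, \omega_k]_*.$$
Then we can express $\nabla_k R_{ij}$ as
$$\nabla_k R_{ij} = A_{ijk} + B_{ijk} - \partial_k [\omega_i, \omega_j]_* - [[\omega_i, \omega_j]_*, \omega_k]_*.$$
Note that
$$A_{ijk} + A_{jki} + A_{kij} = 0,$$
$$B_{ijk} + B_{jki} + B_{kij} = \partial_k [\omega_i, \omega_j]_* + \partial_i [\omega_j, \omega_k]_* + \partial_j [\omega_k, \omega_i]_*.$$
Using these relations together with the Jacobi identity
$$[[\omega_i, \omega_j]_*, \omega_k]_* + [[\omega_j, \omega_k]_*, \omega_i]_* + [[\omega_k, \omega_i]_*, \omega_j]_* = 0,$$
we easily prove the Bianchi identity.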


3.2. Gauge transformations

Let $GL_m(A)$ be the group of invertible $(m \times m)$-matrices with entries in $A$. Let $G$ be
the subgroup defined by
$$G = \{ g \in GL_m(A) \mid e * g = g * e \}, \qquad (3.3)$$
which will be referred to as the gauge group. There is a right action of $G$ on $M$
defined, for any $\xi \in M$ and $g \in G$, by $\xi \times g \mapsto \xi \cdot g := \xi * g$, where the right side is
defined by matrix multiplication. Clearly, $\xi * g * e = \xi * g$. Hence $\xi * g \in M$, and we
indeed have a $G$ action on $M$.
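For a given $g \in G$, let
$$\omega_i^g = g^{-1} * \omega_i * g - g^{-1} * \partial_i g. \qquad (3.4)$$
Then
$$e * \omega_i^g * (1 - e) = g^{-1} * e * \omega_i * (1 - e) * g - g^{-1} * e * \partial_i(g) * (1 - e).$$
By (3.1),
$$\begin{aligned}
g^{-1} * e * \omega_i * (1 - e) * g &= -g^{-1} * e * \partial_i(e) * g \\
&= -g^{-1} * e * \partial_i(e * g) + g^{-1} * e * \partial_i g \\
&= -g^{-1} * e * \partial_i(g) * e - e * \partial_i e + g^{-1} * e * \partial_i g \\
&= -e * \partial_i e + g^{-1} * e * \partial_i(g) * (1 - e).
\end{aligned}$$
Therefore,
$$e * \omega_i^g * (1 - e) = -e * \partial_i e.$$
This shows that the $\omega_i^g$ satisfy the condition (3.1), and thus form a connection on $M$.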
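Now for any given $g \in G$, define the maps $\nabla_i^g$ on $M$ by
$$\nabla_i^g \xi = \partial_i \xi + \xi * \omega_i^g, \quad \forall \xi \in M.$$
Also, let $R_{ij}^g = \partial_i \omega_j^g - \partial_j \omega_i^g - [\omega_i^g, \omega_j^g]_*$ be the curvature corresponding to the
connection $\omega_i^g$. Then we have the following result.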

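Lemma 3.4. Under a gauge transformation induced by $g \in G$,
$$\nabla_i^g(\xi * g) = \nabla_i(\xi) * g, \quad \forall \xi \in M; \qquad R_{ij}^g = g^{-1} * R_{ij} * g.$$

Proof. Note that
$$\nabla_i^g(\xi * g) = \partial_i(\xi) * g + \xi * \partial_i g + \xi * g * \omega_i^g = (\partial_i \xi + \xi * \omega_i) * g.$$
This proves the first formula.
To prove the second claim, we use the following formulae:
$$\begin{aligned}
\partial_i \omega_j^g - \partial_j \omega_i^g &= g^{-1} * (\partial_i \omega_j - \partial_j \omega_i) * g - \partial_i(g^{-1}) * \partial_j g + \partial_j(g^{-1}) * \partial_i g \\
&\quad + [\partial_i(g^{-1}) * g, \; g^{-1} * \omega_j * g]_* - [\partial_j(g^{-1}) * g, \; g^{-1} * \omega_i * g]_* ; \\
[\omega_i^g, \omega_j^g]_* &= g^{-1} * [\omega_i, \omega_j]_* * g - \partial_i(g^{-1}) * \partial_j g + \partial_j(g^{-1}) * \partial_i g \\
&\quad + [\partial_i(g^{-1}) * g, \; g^{-1} * \omega_j * g]_* - [\partial_j(g^{-1}) * g, \; g^{-1} * \omega_i * g]_* .
\end{aligned}$$
Combining these formulae together we obtain $R_{ij}^g = g^{-1} * R_{ij} * g$. This completes the
proof of the lemma.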

3.3. Vector bundles associated to right projective modules


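Connections and curvatures can be introduced for the right bundle $\tilde{M} = e * A^m_r$ in
much the same way. Let $\tilde{\omega}_i \in M_m(A)$ ($i = 1, 2, \dots, n$) be matrices satisfying the
condition that
$$(1 - e) * \tilde{\omega}_i * e = \partial_i(e) * e. \qquad (3.5)$$
Then we can introduce a connection consisting of the right covariant derivatives
$\tilde{\nabla}_i$ ($i = 1, 2, \dots, n$) on $\tilde{M}$ defined by
$$\tilde{\nabla}_i : \tilde{M} \to \tilde{M}, \qquad \zeta \mapsto \tilde{\nabla}_i \zeta = \partial_i \zeta - \tilde{\omega}_i * \zeta.$$

It is easy to show that $\tilde{\nabla}_i(\zeta * a) = \tilde{\nabla}_i(\zeta) * a + \zeta * \partial_i a$ for all $a \in A$.
Note that if $\tilde{\omega}_i$ is equal to $\partial_i e$ for each $i$, the condition (3.5) is satisfied. We call
them the canonical connection on $\tilde{M}$.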
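
That the right covariant derivative preserves the module can be verified along the same lines as Theorem 3.1. The following worked step is a sketch under assumed notation: $\zeta = e * \zeta$ denotes a column in the right module, $\tilde{\omega}_i$ the matrices of (3.5), and $\tilde{\nabla}_i \zeta = \partial_i \zeta - \tilde{\omega}_i * \zeta$:

```latex
\begin{align*}
e * \tilde{\nabla}_i \zeta
  &= e * \partial_i \zeta - e * \tilde{\omega}_i * \zeta
   = \partial_i \zeta - \partial_i(e) * e * \zeta - e * \tilde{\omega}_i * e * \zeta \\
  &= \partial_i \zeta - \tilde{\omega}_i * \zeta
     + \big( (1 - e) * \tilde{\omega}_i * e - \partial_i(e) * e \big) * \zeta
   = \tilde{\nabla}_i \zeta .
\end{align*}
```

The first line uses $e * \partial_i \zeta = \partial_i(e * \zeta) - \partial_i(e) * \zeta$ together with $\zeta = e * \zeta$, and the bracket in the second line vanishes by (3.5).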
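Returning to a general connection $\tilde{\omega}_i$, we define the associated curvature by
$$\tilde{R}_{ij} = \partial_i \tilde{\omega}_j - \partial_j \tilde{\omega}_i - [\tilde{\omega}_i, \tilde{\omega}_j]_*.$$
Then for all $\zeta \in \tilde{M}$, we have
$$[\tilde{\nabla}_i, \tilde{\nabla}_j] \zeta = -\tilde{R}_{ij} * \zeta.$$
We further define the covariant derivatives of $\tilde{R}_{ij}$ by
$$\tilde{\nabla}_k \tilde{R}_{ij} = \partial_k \tilde{R}_{ij} + \tilde{R}_{ij} * \tilde{\omega}_k - \tilde{\omega}_k * \tilde{R}_{ij}.$$
Then we have the following result.
Lemma 3.5. The curvature on the right bundle $\tilde{M}$ satisfies the Bianchi identity
$$\tilde{\nabla}_i \tilde{R}_{jk} + \tilde{\nabla}_j \tilde{R}_{ki} + \tilde{\nabla}_k \tilde{R}_{ij} = 0.$$

By direct calculations we can also prove the following result:
$$[\tilde{\nabla}_k, [\tilde{\nabla}_i, \tilde{\nabla}_j]] \zeta = -\tilde{\nabla}_k(\tilde{R}_{ij}) * \zeta, \quad \forall \zeta \in \tilde{M}.$$

Consider the gauge group $G$ defined by (3.3), which has a right action on $\tilde{M}$:
$$\tilde{M} \times G \to \tilde{M}, \qquad \zeta \times g \mapsto \zeta \cdot g := g^{-1} * \zeta.$$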
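Under a gauge transformation induced by $g \in G$,
$$\tilde{\omega}_i \mapsto \tilde{\omega}_i^g := g^{-1} * \tilde{\omega}_i * g + \partial_i(g^{-1}) * g.$$
The connection $\tilde{\nabla}_i^g$ on $\tilde{M}$ defined by
$$\tilde{\nabla}_i^g \zeta = \partial_i \zeta - \tilde{\omega}_i^g * \zeta$$
satisfies the following relation for all $\zeta \in \tilde{M}$:
$$\tilde{\nabla}_i^g(g^{-1} * \zeta) = g^{-1} * \tilde{\nabla}_i \zeta.$$
Furthermore, the gauge transformed curvature
$$\tilde{R}_{ij}^g := \partial_i \tilde{\omega}_j^g - \partial_j \tilde{\omega}_i^g - [\tilde{\omega}_i^g, \tilde{\omega}_j^g]_*$$
is related to $\tilde{R}_{ij}$ by
$$\tilde{R}_{ij}^g = g^{-1} * \tilde{R}_{ij} * g.$$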

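Given any $\Lambda \in M_m(A)$, we can define the $A$-bimodule map
$$\langle\ ,\ \rangle : M \otimes_{\mathbb{R}[[h]]} \tilde{M} \to A, \qquad \xi \otimes \zeta \mapsto \langle \xi, \zeta \rangle = \xi * \Lambda * \zeta, \qquad (3.6)$$
where $\xi * \Lambda * \zeta$ is defined by matrix multiplication. We shall say that the bimodule
homomorphism is gauge invariant if for any element $g$ of the gauge group $G$,
$$\langle \xi \cdot g, \zeta \cdot g \rangle = \langle \xi, \zeta \rangle, \quad \forall \xi \in M, \ \zeta \in \tilde{M}.$$
Also, the bimodule homomorphism is said to be compatible with the connections $\omega_i$
on $M$ and $\tilde{\omega}_i$ on $\tilde{M}$ if for all $i = 1, 2, \dots, n$,
$$\partial_i \langle \xi, \zeta \rangle = \langle \nabla_i \xi, \zeta \rangle + \langle \xi, \tilde{\nabla}_i \zeta \rangle, \quad \forall \xi \in M, \ \zeta \in \tilde{M}.$$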

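Lemma 3.6. Let $\langle\ ,\ \rangle : M \otimes_{\mathbb{R}[[h]]} \tilde{M} \to A$ be an $A$-bimodule homomorphism defined
by (3.6) with a given $(m \times m)$-matrix $\Lambda$ with entries in $A$. Then

(1) $\langle\ ,\ \rangle$ is gauge invariant if $g * \Lambda * g^{-1} = \Lambda$ for all $g \in G$;
(2) $\langle\ ,\ \rangle$ is compatible with the connections $\omega_i$ on $M$ and $\tilde{\omega}_i$ on $\tilde{M}$ if for all $i$,
$$e * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde{\omega}_i) * e = 0.$$

Proof. Note that $\langle \xi \cdot g, \zeta \cdot g \rangle = \xi * g * \Lambda * g^{-1} * \zeta$ for any $g \in G$, $\xi \in M$ and $\zeta \in \tilde{M}$.
Therefore $\langle \xi \cdot g, \zeta \cdot g \rangle = \langle \xi, \zeta \rangle$ if $g * \Lambda * g^{-1} = \Lambda$. This proves part (1).
Now $\partial_i \langle \xi, \zeta \rangle = \langle \nabla_i \xi, \zeta \rangle + \langle \xi, \tilde{\nabla}_i \zeta \rangle + \xi * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde{\omega}_i) * \zeta$. Thus if $\Lambda$ satisfies
the condition of part (2), then $\langle\ ,\ \rangle$ is compatible with the connections.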

3.4. Canonical connections and fiber metric

Let us consider in detail the canonical connections on $M$ and $\tilde{M}$ given by
$$\omega_i = -\partial_i e, \qquad \tilde{\omega}_i = \partial_i e.$$
A particularly nice feature in this case is that the corresponding curvatures on the
left and right bundles coincide. We have the following formula:
$$R_{ij} = \tilde{R}_{ij} = -[\partial_i e, \partial_j e]_*.$$
Now we consider a special case of the $\mathcal{A}$-bimodule map defined by Eq. (3.6).

Definition 3.7. Denote by $g : \mathcal{M} \otimes_{\mathbb{R}[[\bar h]]} \tilde{\mathcal{M}} \to \mathcal{A}$ the map defined by (3.6) with
$\Lambda$ being the identity matrix. We shall call $g$ the fiber metric on $\mathcal{M}$.

Lemma 3.8. The fiber metric $g$ is gauge invariant and is compatible with the
standard connections.

Proof. Since $\Lambda$ is the identity matrix in the present case, it immediately follows
from Lemma 3.6(1) that $g$ is gauge invariant. Note that $e * \partial_i(e) * e = 0$ for all $i$.
Using this fact in Lemma 3.6(2), we easily see that $g$ is compatible with the standard
connections.
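
The identity $e * \partial_i(e) * e = 0$ is quoted in the proof without derivation. The following short computation is one way to see it. It assumes only $e * e = e$ and the Leibniz rule for $\partial_i$ with respect to the Moyal product; the sign conventions in the closing remark ($\omega_i = -\partial_i e$, $\tilde\omega_i = \partial_i e$) are the ones used for the canonical connections in this section.

```latex
\begin{align*}
\partial_i e &= \partial_i (e * e) = (\partial_i e) * e + e * (\partial_i e), \\
e * (\partial_i e) * e
  &= e * (\partial_i e) * e * e + e * e * (\partial_i e) * e
   = 2\, e * (\partial_i e) * e, \\
\Longrightarrow \quad e * (\partial_i e) * e &= 0 .
\end{align*}
% With \Lambda = 1, \omega_i = -\partial_i e and \tilde\omega_i = \partial_i e,
% the condition of Lemma 3.6(2) reads
%   e * (\partial_i e + \partial_i e) * e = 2\, e * (\partial_i e) * e = 0 .
```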

4. Embedded Noncommutative Spaces


In this section, we study explicit examples of idempotents and related projective
modules. They correspond to the noncommutative spaces introduced in [8]. The
main result here is a reformulation of the theory of embedded noncommutative
spaces [8] in the framework of Sec. 3 in terms of projective modules.

4.1. Embedded noncommutative spaces


We shall consider only embedded spaces with Euclidean signature. The Minkowski
case is similar, and we shall briefly allude to it in Remark 4.6 at the end of
this section. Given $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$, we define an $(n \times n)$-matrix
$(g_{ij})_{i,j=1,2,\dots,n}$ with entries given by
\[ g_{ij} = \sum_{\alpha=1}^{m} \partial_i X^\alpha * \partial_j X^\alpha. \]

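Following [8], we shall call $X$ a noncommutative space embedded in $\mathcal{A}^m$ if the matrix
$(g_{ij})$ is invertible.
For a given noncommutative space $X$, we denote by $(g^{ij})$ the inverse matrix
of $(g_{ij})$ with $g_{ij} * g^{jk} = g^{kj} * g_{ji} = \delta_i^k$ for all $i$ and $k$. Here Einstein's summation
convention is used, and we shall continue to use this convention throughout the
paper. Let
\[ E_i = \partial_i X, \qquad E^i = g^{ij} * E_j, \qquad \tilde E^i = (E_j)^t * g^{ji}, \]
for $i = 1, 2, \dots, n$, where
\[ (E_i)^t = \begin{pmatrix} \partial_i X^1 \\ \partial_i X^2 \\ \vdots \\ \partial_i X^m \end{pmatrix} \]
denotes the transpose of $E_i$. Define $e \in M_m(\mathcal{A})$ by
\[ e := \tilde E^j * E_j =
\begin{pmatrix}
\partial_i X^1 * g^{ij} * \partial_j X^1 & \partial_i X^1 * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^1 * g^{ij} * \partial_j X^m \\
\partial_i X^2 * g^{ij} * \partial_j X^1 & \partial_i X^2 * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^2 * g^{ij} * \partial_j X^m \\
\vdots & \vdots & \ddots & \vdots \\
\partial_i X^m * g^{ij} * \partial_j X^1 & \partial_i X^m * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^m * g^{ij} * \partial_j X^m
\end{pmatrix}. \tag{4.1} \]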

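We have the following results.

Proposition 4.1. (1) Under matrix multiplication, $E_i * \tilde E^j = \delta_i^j$ for all $i$ and $j$.
(2) The $m \times m$ matrix $e$ satisfies $e * e = e$, that is, it is an idempotent in $M_m(\mathcal{A})$.
(3) The left and right projective $\mathcal{A}$-modules $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ are
respectively spanned by $E_i$ and $\tilde E^i$. More precisely, we have
\[ \mathcal{M} = \{ a^i * E_i \mid a^i \in \mathcal{A} \}, \qquad \tilde{\mathcal{M}} = \{ \tilde E^i * b_i \mid b_i \in \mathcal{A} \}. \]

Proof. Note that $g_{ij} = E_i * (E_j)^t$. Thus $E_i * \tilde E^j = E_i * (E_k)^t * g^{kj} = \delta_i^j$. It then
immediately follows that
\[ e * e = \tilde E^i * (E_i * \tilde E^j) * E_j = \tilde E^i * \delta_i^j * E_j = e. \]
Obviously, $\mathcal{M} \subset \{ a^i * E_i \mid a^i \in \mathcal{A} \}$ and $\tilde{\mathcal{M}} \subset \{ \tilde E^i * b_i \mid b_i \in \mathcal{A} \}$. By the first part of
the proposition, we have
\[ a^i * E_i * e = a^i * (E_i * \tilde E^j) * E_j = a^j * E_j, \qquad
   e * \tilde E^j * b_j = \tilde E^i * (E_i * \tilde E^j) * b_j = \tilde E^i * b_i. \]
This proves the last claim of the proposition.

It is also useful to observe that $\tilde{\mathcal{M}} = \{ (E_i)^t * b^i \mid b^i \in \mathcal{A} \}$ since $(g_{ij})$ is invertible.
We shall denote $\mathcal{M}$ and $\tilde{\mathcal{M}}$ respectively by $TX$ and $\tilde TX$, and refer to them as
the left and right tangent bundles of the noncommutative space $X$. Note that the
definition of the tangent bundles coincides with that in [8].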
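
As a sanity check on Proposition 4.1, the construction can be tried in the commutative limit, where the Moyal product reduces to the pointwise product. The sketch below is illustrative only: the embedding $X = (t_1, t_2, t_1^2 + t_2^2)$, the sample point and all function names are hypothetical choices and do not come from the paper. Exact rational arithmetic is used so that the identities can be tested with equality.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix with exact rational entries."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def tangent_frame(t1, t2):
    """Rows E_1, E_2 for the hypothetical embedding X = (t1, t2, t1^2 + t2^2)."""
    return [[Fraction(1), Fraction(0), 2 * t1],
            [Fraction(0), Fraction(1), 2 * t2]]


# Evaluate everything at one sample point (an arbitrary choice).
E = tangent_frame(Fraction(1, 2), Fraction(-3, 4))
g = matmul(E, transpose(E))                    # g_ij = E_i (E_j)^t
E_tilde = matmul(transpose(E), inverse_2x2(g))  # columns are the vectors \tilde E^j
e = matmul(E_tilde, E)                          # e = \tilde E^j E_j

assert matmul(E, E_tilde) == [[1, 0], [0, 1]]   # part (1): E_i \tilde E^j = delta_i^j
assert matmul(e, e) == e                        # part (2): e is an idempotent
assert matmul(E, e) == E                        # each E_i lies in M = A^m e
```

In the noncommutative case the same algebra goes through with every product replaced by the Moyal product, which this commutative sketch does not attempt to model.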

Definition 4.2. Call the fiber metric $g : TX \otimes_{\mathbb{R}[[\bar h]]} \tilde TX \to \mathcal{A}$ defined in Definition 3.7
the metric of the noncommutative space $X$.

The proposition below in particular shows that $g$ agrees with the metric of the
embedded noncommutative space defined in [8] in a geometric setting.

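Proposition 4.3. For any $\xi = a^i * E_i \in TX$ and $\zeta = (E_j)^t * b^j \in \tilde TX$ with
$a^i, b^j \in \mathcal{A}$,
\[ g : \xi \otimes \zeta \mapsto g(\xi, \zeta) = a^i * g_{ij} * b^j. \]
In particular, $g(E_i, (E_j)^t) = g_{ij}$.

Proof. Recall from Definition 3.7 that $g$ is defined by (3.6) with $\Lambda$ being the
identity matrix. Thus for any $\xi = a^i * E_i \in TX$ and $\zeta = (E_j)^t * b^j \in \tilde TX$ with
$a^i, b^j \in \mathcal{A}$,
\[ g(\xi, \zeta) = a^i * E_i * (E_j)^t * b^j = a^i * g_{ij} * b^j. \]
This completes the proof.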

Let us now equip the left and right tangent bundles with the canonical connections
given by $\omega_i = -\tilde\omega_i = -\partial_i e$, and denote the corresponding covariant derivatives by
\[ \nabla_i : TX \to TX, \qquad \tilde\nabla_i : \tilde TX \to \tilde TX. \]

In principle, one can take arbitrary connections for the tangent bundles, but we
shall not allow this option in this paper.
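The following elements of $\mathcal{A}$ are defined in [8],
\[ {}_c\Gamma_{ijl} = \frac{1}{2} \left( \partial_i g_{jl} + \partial_j g_{li} - \partial_l g_{ji} \right), \qquad
   \Upsilon_{ijl} = \frac{1}{2} \left( \partial_i (E_j) * (E_l)^t - E_l * \partial_i (E_j)^t \right), \]
\[ \Gamma_{ijl} = {}_c\Gamma_{ijl} + \Upsilon_{ijl}, \qquad \tilde\Gamma_{ijl} = {}_c\Gamma_{ijl} - \Upsilon_{ijl}, \]
where $\Upsilon_{ijk}$ was referred to as the noncommutative torsion. Set [8]
\[ \Gamma^k_{ij} = \Gamma_{ijl} * g^{lk}, \qquad \tilde\Gamma^k_{ij} = g^{kl} * \tilde\Gamma_{ijl}. \tag{4.2} \]
Then we have the following result.

Lemma 4.4.
\[ \nabla_i E_j = \Gamma^k_{ij} * E_k, \qquad \tilde\nabla_i \tilde E^j = -\tilde E^k * \Gamma^j_{ki}. \tag{4.3} \]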

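Proof. Consider the first formula. Write $\partial_i e = \partial_i (\tilde E^k) * E_k + \tilde E^k * \partial_i E_k$. We have
\begin{align*}
\nabla_i E_j &= \partial_i E_j - E_j * \partial_i e \\
             &= \partial_i E_j - \left( \partial_i (E_j * e) - \partial_i (E_j) * e \right) \\
             &= \partial_i (E_j) * \tilde E^k * E_k.
\end{align*}
It was shown in [8] that $\Gamma^k_{ij} = \partial_i (E_j) * \tilde E^k$. This immediately leads to the first
formula. The proof for the second formula is essentially the same.

Note that Lemma 4.4 can be re-stated as
\[ \nabla_i E^j = -\tilde\Gamma^j_{ik} * E^k, \qquad \tilde\nabla_i (E_j)^t = (E_k)^t * \tilde\Gamma^k_{ij}. \]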

By using Lemmas 3.8 and 4.4, we can easily prove the following result, which is
equivalent to [8, Proposition 2.7].

Proposition 4.5. The connections are metric compatible in the sense that
\[ \partial_i g(\xi, \zeta) = g(\nabla_i \xi, \zeta) + g(\xi, \tilde\nabla_i \zeta), \qquad \forall \xi \in TX,\ \zeta \in \tilde TX. \tag{4.4} \]
For $\xi = E_j$ and $\zeta = (E_k)^t$, we obtain from (4.4) the following result for all $i, j, k$:
\[ \partial_i g_{jk} - \Gamma_{ijk} - \tilde\Gamma_{ikj} = 0. \tag{4.5} \]
This formula is in fact equivalent to Proposition 4.5.
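
Relation (4.5) can also be checked directly from the definitions. The computation below is a sketch that assumes the closed forms $\Gamma_{ijl} = \partial_i(E_j) * (E_l)^t$ and $\tilde\Gamma_{ijl} = E_l * \partial_i(E_j)^t$; these follow from ${}_c\Gamma_{ijl} \pm \Upsilon_{ijl}$ once one uses $\partial_i E_j = \partial_j E_i$, which holds because $E_i = \partial_i X$.

```latex
\begin{align*}
\partial_i g_{jk}
  &= \partial_i \bigl( E_j * (E_k)^t \bigr) \\
  &= \partial_i (E_j) * (E_k)^t + E_j * \partial_i (E_k)^t \\
  &= \Gamma_{ijk} + \tilde\Gamma_{ikj} .
\end{align*}
```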


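Define
\[ R^l_{kij} = E_k * R_{ij} * \tilde E^l, \qquad \tilde R^l_{kij} = -g^{lq} * E_q * \tilde R_{ij} * \tilde E^p * g_{pk}. \tag{4.6} \]
Using $R_{ij} = \tilde R_{ij} = -[\partial_i e, \partial_j e]_*$, we can show by some lengthy calculations that
\begin{align*}
R^l_{kij} &= -\partial_j \Gamma^l_{ik} - \Gamma^p_{ik} * \Gamma^l_{jp} + \partial_i \Gamma^l_{jk} + \Gamma^p_{jk} * \Gamma^l_{ip}, \\
\tilde R^l_{kij} &= -\partial_j \tilde\Gamma^l_{ik} - \tilde\Gamma^l_{jp} * \tilde\Gamma^p_{ik} + \partial_i \tilde\Gamma^l_{jk} + \tilde\Gamma^l_{ip} * \tilde\Gamma^p_{jk},
\end{align*}
\[ \tag{4.7} \]
which are the Riemannian curvatures of the left and right tangent bundles of the
noncommutative space $X$ given in [8, Lemma 2.12 and §4]. Therefore,
\[ [\nabla_i, \nabla_j] E_k = R^l_{kij} * E_l, \qquad [\tilde\nabla_i, \tilde\nabla_j] (E_k)^t = (E_l)^t * \tilde R^l_{kij}, \tag{4.8} \]
recovering the relations [8, (2.13)] and their generalizations [8, §4] to general
$m$ and $n$.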

Remark 4.6. We comment briefly on noncommutative spaces with Minkowski
signatures embedded in higher dimensions [8]. Let $\eta = \mathrm{diag}(-1, \dots, -1, 1, \dots, 1)$
be a diagonal $(m \times m)$-matrix with $p$ of the diagonal entries being $-1$, and $q = m - p$
of them being $1$. Given $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$, we define an $(n \times n)$-matrix
$(g_{ij})_{i,j=1,2,\dots,n}$ with entries
\[ g_{ij} = \sum_{\alpha=1}^{m} \partial_i X^\alpha * \eta_{\alpha\alpha} * \partial_j X^\alpha. \]
We call $X$ a noncommutative space embedded in $\mathcal{A}^m$ if the matrix $(g_{ij})$ is invertible.
Denote its inverse matrix by $(g^{ij})$. Now the idempotent which gives rise to the left
and right tangent bundles of $X$ is given by
\[ e = \eta (E_i)^t * g^{ij} * E_j, \]
which obviously satisfies $E_i * e = E_i$ for all $i$. The fiber metric of Definition 3.7
yields a metric on the embedded noncommutative surface $X$.

4.2. Example
We analyze an embedded noncommutative surface of Euclidean signature arising
from the quantisation of a time slice of the Schwarzschild spacetime. While the
main purpose here is to illustrate how the general theory developed in previous
sections works, the example is interesting in its own right.
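Let us first specify the notation to be used in this section. Let $t^1 = r$, $t^2 = \theta$
and $t^3 = \phi$, with $r > 2m$, $\theta \in (0, \pi)$, and $\phi \in (0, 2\pi)$. We deform the algebra of
functions in these variables by imposing the Moyal product defined by (2.1) with
the following anti-symmetric matrix
\[ (\theta_{ij})_{i,j=1}^{3} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}. \]
Note that the functions depending only on the variable $r$ are central in the Moyal
algebra $\mathcal{A}$. We shall write the usual pointwise product of two functions $f$ and $g$ as
$fg$, but write their Moyal product as $f * g$.
Consider $X = (X^1\ X^2\ X^3\ X^4)$ given by
\begin{align*}
X^1 &= f(r) \quad \text{with} \quad (f')^2 + 1 = \left( 1 - \frac{2m}{r} \right)^{-1}, \\
X^2 &= r \sin\theta \cos\phi, \qquad X^3 = r \sin\theta \sin\phi, \qquad X^4 = r \cos\theta.
\end{align*}
\[ \tag{4.9} \]
Simple calculations yield
\begin{align*}
E_1 &= \partial_r X = (\, f' \quad \sin\theta\cos\phi \quad \sin\theta\sin\phi \quad \cos\theta \,), \\
E_2 &= \partial_\theta X = (\, 0 \quad r\cos\theta\cos\phi \quad r\cos\theta\sin\phi \quad -r\sin\theta \,), \\
E_3 &= \partial_\phi X = (\, 0 \quad -r\sin\theta\sin\phi \quad r\sin\theta\cos\phi \quad 0 \,).
\end{align*}
Using these formulae, we obtain the following expressions for the components of
the metric of the noncommutative surface $X$:
\begin{align*}
g_{11} &= \left( 1 - \frac{2m}{r} \right)^{-1} \left[ 1 - \left( 1 - \frac{2m}{r} \right) \cos(2\theta) \sinh^2 \bar h \right], \\
g_{12} &= g_{21} = r \sin(2\theta) \sinh^2 \bar h, \\
g_{22} &= r^2 \left[ 1 + \cos(2\theta) \sinh^2 \bar h \right], \\
g_{23} &= -g_{32} = -r^2 \cos(2\theta) \sinh \bar h \cosh \bar h, \\
g_{13} &= -g_{31} = -r \sin(2\theta) \sinh \bar h \cosh \bar h, \\
g_{33} &= r^2 \left[ \sin^2\theta - \cos(2\theta) \sinh^2 \bar h \right].
\end{align*}
\[ \tag{4.10} \]
In the limit $\bar h \to 0$, we recover the spatial components of the Schwarzschild metric.
Observe that the noncommutative surface still reflects the characteristics of the
Schwarzschild spacetime in that there is a time slice of the Schwarzschild black
hole with the event horizon at $r = 2m$.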
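
The classical limit mentioned above can be checked numerically. The sketch below builds the frame $E_1, E_2, E_3$ of the embedding (4.9) with ordinary pointwise products (that is, at $\bar h = 0$) and compares $g_{ij} = E_i (E_j)^t$ with the spatial Schwarzschild metric $\mathrm{diag}\bigl((1 - 2m/r)^{-1},\ r^2,\ r^2 \sin^2\theta\bigr)$. The function names and the sample point are illustrative assumptions; the deformed components in (4.10) are not modelled.

```python
import math


def frame(r, theta, phi, m):
    """Rows E_1, E_2, E_3 of the embedding (4.9) at hbar = 0."""
    # f'(r) follows from (f')^2 + 1 = (1 - 2m/r)^(-1), i.e. (f')^2 = 2m / (r - 2m).
    f_prime = math.sqrt(2 * m / (r - 2 * m))
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [f_prime, st * cp, st * sp, ct],
        [0.0, r * ct * cp, r * ct * sp, -r * st],
        [0.0, -r * st * sp, r * st * cp, 0.0],
    ]


def metric(E):
    """Commutative metric g_ij = E_i (E_j)^t."""
    return [[sum(a * b for a, b in zip(Ei, Ej)) for Ej in E] for Ei in E]


def schwarzschild_spatial(r, theta, m):
    """Spatial part of the Schwarzschild metric in (r, theta, phi) coordinates."""
    return [
        [1.0 / (1.0 - 2.0 * m / r), 0.0, 0.0],
        [0.0, r ** 2, 0.0],
        [0.0, 0.0, (r * math.sin(theta)) ** 2],
    ]


r, theta, phi, m = 5.0, 0.7, 1.3, 1.0   # arbitrary sample point with r > 2m
g = metric(frame(r, theta, phi, m))
expected = schwarzschild_spatial(r, theta, m)
for i in range(3):
    for j in range(3):
        assert math.isclose(g[i][j], expected[i][j], rel_tol=1e-9, abs_tol=1e-12)
```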

Since the metric $(g_{ij})$ depends on $\theta$ and $r$ only, and the two variables commute,
the inverse $(g^{ij})$ of the metric can be calculated in the usual way as in the
commutative case. Now the components of the idempotent $e = (e_{ij}) = (E_i)^t * g^{ij} * E_j$ are
given by the following formula:

2m 2m(2m r)(2 + cos 2) 2 3 ),


e11 = + h + O(h
r r2
m cos sin 2m cos sin
e12 =   h
m m
r 4m+2r r 4m+2r

m(4m + r + 2m cos 2) cos sin 2 3)


+  h + O(h
2 m
r 4m+2r

m sin sin 2m cos cos


e13 =  +  h
m m
r 4m+2r r 4m+2r

m(4m + r + 2m cos 2) sin sin 2 3)


+  h + O(h
m
r2 4m+2r

m cos m cos (4m r + 2m cos 2) 2 3)


e14 =  +  h + O(h
m 2 m
r 4m+2r r 4m+2r

m cos sin 2m cos sin


e21 =  +  h
m m
r 4m+2r r 4 m+2 r

m(4m + r + 2m cos 2) cos sin 2 3)


+  h + O(h
m
r2 4m+2r

2m sin2 cos2 m
e22 = 1 + 2 [2r + 2m cos 4 cos2 6m cos2
r 2r
2 + O(h
+ 2 cos 2(m + 8r + (m r) cos 2)]h 3)

m sin2 sin 2 3m sin 2


e23 = h
r r
m(2(m r) cos 2 + m(3 + cos 4)) sin 2 2 3)
+ h + O(h
2r2
2m cos cos sin m(1 + 3 cos 2) sin
e24 = h
r r
m(8m + 5r + 4m cos 2) cos sin 2 2 3)
h + O(h
2r2

m sin sin 2m cos cos


e31 =   h
m m
r 4m+2r r 4m+2r

m(4m + r + 2m cos 2) sin sin 2 3)


+  h + O(h
2 m
r 4m+2r

m sin2 sin 2 3m sin 2


e32 = + h
r r
m(2(m r) cos 2 + m(3 + cos 4)) sin 2 2 3)
+ h + O(h
2r2
2m sin2 sin2 m
e33 = 1 + 2 [2r + 2m cos 4 sin2 6m sin2
r 2r
2 + O(h
+ 2 cos 2(m + 8r (m r) cos 2)]h 3)
2m cos sin sin m(1 + 3 cos 2) cos
e34 = + h
r r
m(8m + 5r + 4m cos 2) sin 2 sin 2 3)
h + O(h
2r2
m cos m cos (4m r + 2m cos 2) 2 3)
e41 =  +  h + O(h
m 2 m
r 4m+2r r 4m+2r

2m cos cos sin m(1 + 3 cos 2) sin


e42 = + h
r r
m(8m + 5r + 4m cos 2) cos sin 2 2 3)
h + O(h
2r2
2m cos sin sin m(1 + 3 cos 2) cos
e43 = h
r r
m(8m + 5r + 4m cos 2) sin 2 sin 2 3)
h + O(h
2r2
2m cos2 4m cos2 (2m + r m cos 2) 2 3 ).
e44 = 1 + h + O(h
r r2

Let us write $e = e_0 + \bar h e_1 + \bar h^2 e_2 + \cdots$. Then inspecting the formulae we see
that the matrices $e_0$ and $e_2$ are symmetric, while $e_1$ is skew symmetric. This is no
coincidence; rather it is a consequence of properties of $X$ under the bar involution,
which will be discussed in Sec. 6.
Here we refrain from presenting the result of the Mathematica computation for
the curvature $R_{ij} = -[\partial_i e, \partial_j e]_*$, which is very complicated and not terribly
illuminating. However, we mention that in [37] a quantisation of the Schwarzschild
spacetime was carried out (for a particular choice of $\theta$), and the resulting
noncommutative differential geometry was studied in detail. In particular, the metric,
Christoffel symbols, Riemannian and Ricci curvatures were explicitly worked out.
We refer to that paper for details.

5. General Coordinate Transformations


We now return to the general setting of Sec. 3 to investigate general coordinate
transformations. Our treatment follows closely [8, §V] and makes use of general
ideas of [17, 21, 28]. We should point out that the material presented is part of an
attempt of ours to develop a notion of general covariance in the noncommutative
setting. This is an important matter which deserves a thorough investigation. We
hope that the work presented here will prompt further studies.
Let $(\mathcal{A}, \mu)$ be a Moyal algebra of smooth functions on the open region $U$ of $\mathbb{R}^n$
with coordinate $t$. This algebra is defined with respect to a constant skew symmetric
matrix $\theta = (\theta_{ij})$. Let $\Phi : U \to U$ be a diffeomorphism of $U$ in the classical sense.
We denote
\[ u^i = \Phi^i(t), \]
and refer to this as a general coordinate transformation of $U$.
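Denote by $\mathcal{A}_u$ the set of smooth functions of $u = (u^1, u^2, \dots, u^n)$. The map $\Phi$
induces an $\mathbb{R}[[\bar h]]$-module isomorphism $\phi = \Phi^* : \mathcal{A}_u \to \mathcal{A}_t$ defined for any function
$f \in \mathcal{A}_u$ by
\[ \phi(f)(t) = f(\Phi(t)). \]
We define the $\mathbb{R}[[\bar h]]$-bilinear map
\[ \mu_u : \mathcal{A}_u \otimes \mathcal{A}_u \to \mathcal{A}_u, \qquad \mu_u(f, g) = \phi^{-1} \mu_t(\phi(f), \phi(g)). \]
Then it is well known [21] that $\mu_u$ is associative. Therefore, we have the associative
algebra isomorphism
\[ \phi : (\mathcal{A}_u, \mu_u) \xrightarrow{\ \sim\ } (\mathcal{A}_t, \mu_t). \]
We say that the two associative algebras are gauge equivalent by adopting the
terminology of [17].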
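
The associativity of $\mu_u$ is an instance of transport of structure: any linear bijection $\phi$ carries an associative product back to an associative product. The toy sketch below illustrates only this mechanism. It replaces the Moyal algebra by $2 \times 2$ rational matrices and uses a hypothetical bijection $\phi(a) = a + \mathrm{tr}(a) K$, neither of which appears in the paper.

```python
from fractions import Fraction

# Hypothetical fixed matrix; phi below is invertible because 1 + tr(K) = 3 != 0.
K = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(1)]]


def mul(a, b):
    """Stand-in for mu_t: ordinary product of 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def phi(a):
    """A linear bijection that is not multiplicative for `mul`: a -> a + tr(a) K."""
    t = trace(a)
    return [[a[i][j] + t * K[i][j] for j in range(2)] for i in range(2)]


def phi_inv(b):
    """Inverse of phi: b -> b - tr(b) / (1 + tr K) * K."""
    t = trace(b) / (1 + trace(K))
    return [[b[i][j] - t * K[i][j] for j in range(2)] for i in range(2)]


def mu_u(f, g):
    """Transported product mu_u(f, g) = phi^{-1}(mu_t(phi(f), phi(g)))."""
    return phi_inv(mul(phi(f), phi(g)))


f = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
g = [[Fraction(0), Fraction(-1)], [Fraction(5), Fraction(2, 3)]]
h = [[Fraction(7), Fraction(1, 2)], [Fraction(-2), Fraction(1)]]

assert phi_inv(phi(f)) == f                          # phi is a bijection
assert mu_u(mu_u(f, g), h) == mu_u(f, mu_u(g, h))    # mu_u is associative
assert phi(mu_u(f, g)) == mul(phi(f), phi(g))        # phi is an algebra isomorphism
```

The last assertion is the finite-dimensional counterpart of the statement that $\phi$ is an isomorphism from $(\mathcal{A}_u, \mu_u)$ to $(\mathcal{A}_t, \mu_t)$.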

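Following [8], we define $\mathbb{R}[[\bar h]]$-linear operators
\[ \partial_i^\phi := \phi^{-1} \circ \partial_i \circ \phi : \mathcal{A}_u \to \mathcal{A}_u, \tag{5.1} \]
which have the following properties [8, Lemma 5.5]:
\begin{align*}
& \partial_i^\phi \circ \partial_j^\phi - \partial_j^\phi \circ \partial_i^\phi = 0, \\
& \partial_i^\phi \mu_u(f, g) = \mu_u(\partial_i^\phi(f), g) + \mu_u(f, \partial_i^\phi(g)), \qquad \forall f, g \in \mathcal{A}_u,
\end{align*}
where the second relation is the Leibniz rule for $\partial_i^\phi$. Recall that this Leibniz rule
played a crucial role in the construction of noncommutative spaces over $(\mathcal{A}_u, \mu_u)$
in [8].
We shall denote by $M_m(\mathcal{A}_u)$ the set of $(m \times m)$-matrices with entries in $\mathcal{A}_u$. The
product of two such matrices will be defined with respect to the multiplication $\mu_u$
of the algebra $(\mathcal{A}_u, \mu_u)$. Then $\phi^{-1}$ acting component wise gives rise to an algebra
isomorphism from $M_m(\mathcal{A})$ to $M_m(\mathcal{A}_u)$, where matrix multiplication in $M_m(\mathcal{A})$ is
defined with respect to $\mu_t$.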

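Since we need to deal with two different algebras $(\mathcal{A}, \mu_t)$ and $(\mathcal{A}_u, \mu_u)$ simultaneously
in this section, we write $\mu_t$ and the matrix multiplication defined with respect
to it by $*$ as before, and use $*_u$ to denote $\mu_u$ and the matrix multiplication defined
with respect to it.
Let $e \in M_m(\mathcal{A})$ be an idempotent. There exists the corresponding finitely
generated projective left (respectively, right) $\mathcal{A}$-module $\mathcal{M}$ (respectively, $\tilde{\mathcal{M}}$). Now
$e_u := \phi^{-1}(e)$ is an idempotent in $M_m(\mathcal{A}_u)$, that is, $\phi^{-1}(e) *_u \phi^{-1}(e) = \phi^{-1}(e)$. Write
$e_u = (E_{\alpha\beta})_{\alpha,\beta=1,\dots,m}$. This idempotent gives rise to the left projective $\mathcal{A}_u$-module
$\mathcal{M}_u$ and right projective $\mathcal{A}_u$-module $\tilde{\mathcal{M}}_u$, respectively defined by
\[ \mathcal{M}_u = \left\{ \begin{pmatrix} a^\alpha *_u E_{\alpha 1} & a^\alpha *_u E_{\alpha 2} & \cdots & a^\alpha *_u E_{\alpha m} \end{pmatrix} \,\middle|\, a^\alpha \in \mathcal{A}_u \right\}, \]
\[ \tilde{\mathcal{M}}_u = \left\{ \begin{pmatrix} E_{1\beta} *_u b^\beta \\ E_{2\beta} *_u b^\beta \\ \vdots \\ E_{m\beta} *_u b^\beta \end{pmatrix} \,\middle|\, b^\beta \in \mathcal{A}_u \right\}, \]
where $a^\alpha *_u E_{\alpha\beta} = \sum_\alpha \mu_u(a^\alpha, E_{\alpha\beta})$ and $E_{\alpha\beta} *_u b^\beta = \sum_\beta \mu_u(E_{\alpha\beta}, b^\beta)$. Below we consider
the left projective module only, as the right projective module may be treated
similarly.
Assume that we have the left connection
\[ \nabla_i : \mathcal{M} \to \mathcal{M}, \qquad \nabla_i \xi = \partial_i \xi + \xi * \omega_i. \]
Let $\omega_i^u := \phi^{-1}(\omega_i)$. We have the following result.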

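Theorem 5.1. (1) The matrices $\omega_i^u$ satisfy the following relations in $M_m(\mathcal{A}_u)$:
\[ e_u *_u \omega_i^u *_u (1 - e_u) = -e_u *_u \partial_i^\phi e_u. \]
(2) The operators $\nabla_i^\phi$ ($i = 1, 2, \dots, n$) defined for all $\xi \in \mathcal{M}_u$ by
\[ \nabla_i^\phi \xi = \partial_i^\phi \xi + \xi *_u \omega_i^u \]
give rise to a connection on $\mathcal{M}_u$.
(3) The curvature of the connection $\nabla_i^\phi$ is given by
\[ R^u_{ij} = \partial_i^\phi \omega_j^u - \partial_j^\phi \omega_i^u - \omega_i^u *_u \omega_j^u + \omega_j^u *_u \omega_i^u, \]
which is related to the curvature $R_{ij}$ of $\mathcal{M}$ by
\[ R^u_{ij} = \phi^{-1}(R_{ij}). \]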

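Proof. Note that $e_u *_u \omega_i^u *_u (1 - e_u) = \phi^{-1}(e * \omega_i * (1 - e))$. We also have
$\partial_i^\phi e_u = \phi^{-1}\!\left( \frac{\partial e}{\partial t^i} \right)$, which leads to
$e_u *_u \partial_i^\phi e_u = \phi^{-1}(e * \phi(\partial_i^\phi e_u)) = \phi^{-1}(e * \partial_i e)$. This
proves part (1). Part (2) follows from part (1) and the Leibniz rule for $\partial_i^\phi$.
Straightforward calculations show that the curvature of the connection $\nabla_i^\phi$ is given by
$R^u_{ij} = \partial_i^\phi \omega_j^u - \partial_j^\phi \omega_i^u - \omega_i^u *_u \omega_j^u + \omega_j^u *_u \omega_i^u$. Now
$\partial_i^\phi \omega_j^u = \phi^{-1}\!\left( \frac{\partial \omega_j}{\partial t^i} \right)$, and
$\omega_i^u *_u \omega_j^u - \omega_j^u *_u \omega_i^u = \phi^{-1}(\omega_i * \omega_j) - \phi^{-1}(\omega_j * \omega_i)$. Hence $R^u_{ij} = \phi^{-1}(R_{ij})$.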

Remark 5.2. One can recover the usual transformation rules of tensors under the
diffeomorphism group from the commutative limit of Theorem 5.1 in a way similar
to that in [8, §5.C].

6. Bar Involution and Generalized Hermitian Structure


In this section, we study a Moyal algebra analogue of the bar map of quantum
groups, and investigate its implications on noncommutative geometry. Note that the
ring $\mathbb{R}[[\bar h]]$ admits an involution $\bar{\ }$ that maps an arbitrary power series $a = \sum_i a_i \bar h^i$ in
$\mathbb{R}[[\bar h]]$ to $\bar a = \sum_i (-1)^i a_i \bar h^i$. We shall call $\bar a$ the conjugate of $a$. Note that $a \bar a$ contains
only even powers of $\bar h$. We can extend this map to a conjugate linear anti-involution
on the Moyal algebra $\mathcal{A}$.

Lemma 6.1. Let $\bar{\ } : \mathcal{A} \to \mathcal{A}$ be the map defined for any $f = \sum_i f_i \bar h^i \in \mathcal{A}$, where $f_i$
are real functions on $U$, by $\bar f = \sum_i (-1)^i f_i \bar h^i$. Then for all $f, g \in \mathcal{A}$,
\[ \overline{f * g} = \bar g * \bar f. \]

We refer to the map $\bar{\ }$ as the bar involution of the Moyal algebra. It is an analogue
of the well known bar map, sending $q = \exp(\bar h)$ to $q^{-1}$, in the theory of quantum
groups, which plays an important role in the study of canonical (crystal) bases.
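
On the coefficient ring $\mathbb{R}[[\bar h]]$ the bar involution is easy to experiment with. The sketch below represents a truncated power series by its list of coefficients and checks that $a \bar a$ has no odd powers of $\bar h$. It models only the commutative ring of coefficients, not the Moyal product; the helper names and the sample series are illustrative.

```python
from fractions import Fraction


def conjugate(a):
    """Bar involution: sum a_i hbar^i  ->  sum (-1)^i a_i hbar^i."""
    return [(-1) ** i * c for i, c in enumerate(a)]


def multiply(a, b, order):
    """Product of two power series in hbar, truncated after hbar^order."""
    out = [Fraction(0)] * (order + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= order:
                out[i + j] += x * y
    return out


# a = 3 - hbar/2 + 5 hbar^2 + (7/3) hbar^3, an arbitrary sample series.
a = [Fraction(3), Fraction(-1, 2), Fraction(5), Fraction(7, 3)]
norm = multiply(a, conjugate(a), order=6)

assert conjugate(conjugate(a)) == a          # the map is an involution
assert all(c == 0 for c in norm[1::2])       # a * bar(a) has only even powers of hbar
```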
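The lemma can be easily proven by inspecting (2.1). Given any rectangular
matrix $A = (a_{rs})$ with entries in $\mathcal{A}$, we let $A^\dagger$ be the matrix obtained from $A$ by
first taking its transpose and then sending every matrix element to its conjugate. For
example,
\[ \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix}^{\dagger}
   = \begin{pmatrix} \bar a_1 & \bar a_2 \\ \bar b_1 & \bar b_2 \\ \bar c_1 & \bar c_2 \end{pmatrix}. \]
It is clear that if the product $A * B$ of two matrices is defined, then
$(A * B)^\dagger = B^\dagger * A^\dagger$.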
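Let $\mathcal{A}^m = {}_l\mathcal{A}^m$ be the $\mathbb{R}[[\bar h]]$-module consisting of row matrices of length $m$
with entries in $\mathcal{A}$. We define the form
\[ (\ ,\ ) : \mathcal{A}^m \otimes_{\mathbb{R}[[\bar h]]} \mathcal{A}^m \to \mathcal{A}, \qquad \xi \otimes \zeta \mapsto (\xi, \zeta) := \xi * \zeta^\dagger. \tag{6.1} \]

Lemma 6.2. (1) For all $\xi, \zeta \in \mathcal{M}$ and $a, b \in \mathcal{A}$,
\[ \overline{(\xi, \zeta)} = (\zeta, \xi), \qquad (a * \xi, b * \zeta) = a * (\xi, \zeta) * \bar b. \]
Thus in this sense the form (6.1) is sesquilinear.
(2) $(\xi, \xi) = 0$ if and only if $\xi = 0$.
(3) For all $\xi, \zeta \in \mathcal{M}$ and $A \in M_m(\mathcal{A})$, we have
\[ (\xi * A, \zeta) = (\xi, \zeta * A^\dagger). \]
(4) Let the bar-unitary group $U_m(\mathcal{A})$ over $\mathcal{A}$ be the subgroup of $GL_m(\mathcal{A})$ defined
by $U_m(\mathcal{A}) = \{ g \in GL_m(\mathcal{A}) \mid g^\dagger = g^{-1} \}$. Then the form (6.1) is invariant under
$U_m(\mathcal{A})$ in the sense that for all $g \in U_m(\mathcal{A})$ and $\xi, \zeta \in \mathcal{M}$,
\[ (\xi * g, \zeta * g) = (\xi, \zeta). \]

It is straightforward to prove the lemma. Note that part (2) of the lemma makes
the form (6.1) as nice as a positive definite hermitian form in the commutative case.
We shall call an idempotent $e \in M_m(\mathcal{A})$ self-adjoint (with respect to the
sesquilinear form (6.1)) if
\[ e = e^\dagger. \]
In this case, the corresponding left and right projective modules $\mathcal{M} = {}_l\mathcal{A}^m * e$ and
$\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ are related by
\[ \tilde{\mathcal{M}} = \{ \xi^\dagger \mid \xi \in \mathcal{M} \}. \]
Furthermore, the form (6.1) restricts to a sesquilinear form on $\mathcal{M}$, which is invariant
under $\mathcal{G} \cap U_m(\mathcal{A})$.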

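Lemma 6.3. Let $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ be the left and right bundles
associated with a self-adjoint idempotent $e$. Assume that the left connection $\nabla_i$ on
$\mathcal{M}$ and the right connection $\tilde\nabla_i$ on $\tilde{\mathcal{M}}$ satisfy the condition
\[ \tilde\omega_i = -\omega_i^\dagger, \qquad \forall i. \tag{6.2} \]
Then for any $\xi$ in $\mathcal{M}$,
\[ (\nabla_i \xi)^\dagger = \tilde\nabla_i (\xi^\dagger). \]
Furthermore, the curvatures on the left and right bundles are related by
\[ \tilde R_{ij} = -R_{ij}^\dagger. \]

Proof. Let $\zeta = \xi^\dagger$. We have
\[ (\nabla_i \xi)^\dagger = (\partial_i \xi + \xi * \omega_i)^\dagger = \partial_i \zeta + \omega_i^\dagger * \zeta = \tilde\nabla_i \zeta. \]
This proves the first part of the lemma. Now
\begin{align*}
R_{ij}^\dagger &= \left( \partial_i \omega_j - \partial_j \omega_i - [\omega_i, \omega_j]_* \right)^\dagger \\
               &= -\partial_i \tilde\omega_j + \partial_j \tilde\omega_i + [\tilde\omega_i, \tilde\omega_j]_* \\
               &= -\tilde R_{ij}.
\end{align*}
This proves the second part.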



Hereafter, we shall assume that condition (6.2) is satisfied by the left and right
connections. Let $\mathcal{M}$ be the left bundle corresponding to a self-adjoint idempotent
$e$. We shall say that a connection $\nabla_i$ on $\mathcal{M}$ is hermitian with respect to the bar map
(or bar-hermitian) if $\omega_i^\dagger = \omega_i$ for all $i$. In this case, we shall also say that the bundle
$\mathcal{M}$ is bar-hermitian.
Note that the canonical connections $\omega_i = -\partial_i e$ on $\mathcal{M}$ and $\tilde\omega_i = \partial_i e$ on $\tilde{\mathcal{M}}$ satisfy
$\tilde\omega_i = -\omega_i^\dagger$ and $\omega_i^\dagger = \omega_i$ provided that $e$ is self-adjoint. Therefore, in this case the
canonical connection is bar-hermitian. Since the left and right curvatures associated
to the canonical connections are equal, it follows from Lemma 6.3 that $R_{ij}^\dagger = -R_{ij}$.
We have the following result.

Theorem 6.4. Let $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$ be an embedded noncommutative
surface satisfying the condition $\bar X := (\bar X^1\ \bar X^2\ \cdots\ \bar X^m) = X$. Then $X$ has the
following properties:

(1) The metric has the property $\bar g_{ij} = g_{ji}$ for all $i, j$.
(2) The idempotent $e = (E_i)^t * g^{ij} * E_j$ is self-adjoint.
(3) Equipped with the canonical connection $\omega_i = -\partial_i e$, the tangent bundle of $X$ is
bar-hermitian.
(4) The curvature satisfies $R_{ij}^\dagger = -R_{ij}$.

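Proof. The given condition on $X$ implies that all the $E_i$ satisfy $E_i^\dagger = (E_i)^t$. Thus
\[ g_{ij} = E_i * (E_j)^t = E_i * (E_j)^\dagger, \qquad e = (E_i)^t * g^{ij} * E_j = (E_i)^\dagger * g^{ij} * E_j. \]
Hence we have $\bar g_{ij} = (E_i * (E_j)^\dagger)^\dagger = E_j * (E_i)^\dagger = g_{ji}$. It then follows that $\overline{g^{ij}} = g^{ji}$.
Now the idempotent $e$ satisfies
\[ e^\dagger = \left( (E_i)^\dagger * g^{ij} * E_j \right)^\dagger = (E_j)^\dagger * \overline{g^{ij}} * E_i = (E_j)^t * g^{ji} * E_i = e. \]
Parts (3) and (4) follow from part (2) and the discussion preceding the theorem.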
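
Part (4) can also be checked directly rather than through Lemma 6.3. The computation below is a sketch that assumes the canonical curvature $R_{ij} = -[\partial_i e, \partial_j e]_*$, the self-adjointness $e^\dagger = e$ from part (2), the rule $(A * B)^\dagger = B^\dagger * A^\dagger$, and the fact that $\partial_i$ commutes with the bar involution and with transposition.

```latex
\begin{align*}
R_{ij}^\dagger
  &= -\bigl( \partial_i e * \partial_j e - \partial_j e * \partial_i e \bigr)^\dagger \\
  &= -\bigl( (\partial_j e)^\dagger * (\partial_i e)^\dagger
            - (\partial_i e)^\dagger * (\partial_j e)^\dagger \bigr) \\
  &= -\bigl( \partial_j e * \partial_i e - \partial_i e * \partial_j e \bigr)
   = [\partial_i e, \partial_j e]_* = -R_{ij} .
\end{align*}
```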

Note that the quantum spacetimes studied in [37] and the example in Sec. 4.2
all satisfy the conditions of Theorem 6.4.

7. Concluding Remarks
We wish to point out that in the classical commutative setting, we can recover
(pseudo-) Riemannian geometry from the theory developed here by using the
isometric embedding theorems of [12, 19, 23, 32]. The simplification in this case is that
there is no need to distinguish the left and the right tangent bundles. To describe
the situation, we let $(N, g)$ be a smooth $n$-dimensional (pseudo-) Riemannian
manifold with metric $g$. Denote by $C^\infty(N)$ the set of smooth functions on $N$ endowed
with the usual pointwise multiplication. Let $C^\infty(N)^m$ be the space consisting of
row vectors of length $m$ with entries in $C^\infty(N)$. By results of [12, 19, 23, 32],
there exist positive integers $p, q$ (with $p + q = m$) and a set of smooth functions
$X^1, \dots, X^p, X^{p+1}, \dots, X^m$ on $N$ such that $g = \sum_{\alpha,\beta=1}^{m} \eta_{\alpha\beta}\, dX^\alpha \otimes dX^\beta$, where
$\eta = \mathrm{diag}(\underbrace{-1, \dots, -1}_{p}, \underbrace{1, \dots, 1}_{q})$ with $p = 0$ if $N$ is Riemannian. Let $U$ be a coordinate
chart of $N$ with local coordinate $(t^1, \dots, t^n)$. We set
$E_i = \left( \frac{\partial X^1}{\partial t^i}\ \ \frac{\partial X^2}{\partial t^i}\ \cdots\ \frac{\partial X^m}{\partial t^i} \right)$ and
define $e = \eta (E_i)^t g^{ij} E_j$ on each coordinate chart $U$. Then we have the following
result.
Theorem 7.1. (1) The idempotent $e$ is globally defined on $N$.
(2) The space $\Gamma(TN)$ of sections of the tangent bundle of $N$ is given by $C^\infty(N)^m e$.
(3) For all $\xi, \zeta \in \Gamma(TN)$, we have $g(\xi, \zeta) = \xi \eta (\zeta)^t$.
(4) The standard connection (with $\omega_i = -\partial_i e$) on $C^\infty(N)^m e$ is the usual
Levi-Civita connection on $TX$ with the Christoffel symbol $\Gamma^k_{ij}$ defined by (4.2) and
$\Upsilon_{ijk} = 0$.
(5) The Riemannian curvature tensor is given by (4.6).
Returning to the noncommutative case, we recall that one can quantise any
Poisson manifold following the prescription of [28]. Then one obtains a collection
of noncommutative associative algebras (analogous to the Moyal algebra), one on
each coordinate patch. The algebras relative to different local coordinates are gauge
equivalent [28, Theorem 2.3] as discussed in Sec. 5. This way, one obtains a sheaf
of noncommutative algebras over the Poisson manifold. The algebraic geometry of
such a quantized Poisson manifold has been extensively developed by Kashiwara
and Schapira [24, 25]. In principle one may extend the local theory developed in
this paper to a global differential geometry over the quantized Poisson manifold.
Work in this direction is currently under way.
Results in this paper should be directly applicable to the development of a theory
of noncommutative general relativity, which is of considerable current interest in
theoretical physics. We hope that the theory presented here will provide a consistent
mathematical basis for this purpose. We should also mention that one may use this
theory to clarify, conceptually, aspects of the many noncommutative geometries
introduced in physics in recent years based on physical intuitions. For example,
general features of the noncommutative geometries in [3, 10, 11] have considerable
similarity with those of [8]. These works also have the advantage of being explicit and
amenable to calculations, and thus have a chance of being physically tested. Therefore, it
will be useful to further develop the mathematical bases of these theories by casting
them into the framework of this paper.
Finally, we note that a noncommutative analogue of spin geometry over the
Moyal algebra within the $C^*$-algebraic framework in terms of noncompact spectral
triples was studied in [20]. Our treatment is complementary to that of [20].

Acknowledgments
We wish to thank Masud Chaichian and Anca Tureanu for discussions at various
stages of this work. X. Zhang thanks the School of Mathematics and Statistics,
the University of Sydney for the hospitality extended to him during a visit when
this work was completed. Partial financial support from the Australian Research
Council, National Science Foundation of China (grants 10421001, 10725105 and
10731080), NKBRPC (2006CB805905) and the Chinese Academy of Sciences is
gratefully acknowledged.

References

[1] L. Alvarez-Gaumé, F. Meyer and M. A. Vazquez-Mozo, Comments on noncommutative
gravity, Nucl. Phys. B 753 (2006) 92–117.
[2] S. Ansoldi, P. Nicolini, A. Smailagic and E. Spallucci, Non-commutative geometry
inspired charged black holes, Phys. Lett. B 645 (2007) 261–266.
[3] P. Aschieri, M. Dimitrijevic, F. Meyer and J. Wess, Noncommutative geometry and
gravity, Class. Quant. Grav. 23 (2006) 1883–1911.
[4] R. Banerjee, B. R. Majhi and S. K. Modak, Noncommutative Schwarzschild black
hole and area law, Class. Quant. Grav. 26 (2009) 085010, 11 pp.
[5] M. Buric, T. Grammatikopoulos, J. Madore and G. Zoupanos, Gravity and the
structure of noncommutative algebras, JHEP 0604 (2006) 054.
[6] M. Chaichian, M. Oksanen, A. Tureanu and G. Zet, Gauging the twisted Poincaré
symmetry as noncommutative theory of gravitation, Phys. Rev. D 79 (2009) 044016,
8 pp.
[7] M. Chaichian, M. R. Setare, A. Tureanu and G. Zet, On black holes and cosmological
constant in noncommutative gauge theory of gravity, JHEP 0804 (2008) 064.
[8] M. Chaichian, A. Tureanu, R. B. Zhang and Xiao Zhang, Riemannian geometry of
noncommutative surfaces, J. Math. Phys. 49 (2008) 073511, 26 pp.
[9] M. Chaichian, A. Tureanu and G. Zet, Corrections to Schwarzschild solution in
noncommutative gauge theory of gravity, Phys. Lett. B 660 (2008) 573–578.
[10] A. H. Chamseddine, Complexified gravity in noncommutative spaces, Comm. Math.
Phys. 218 (2001) 283–292.
[11] A. H. Chamseddine, SL(2, C) gravity with a complex vierbein and its noncommutative
extension, Phys. Rev. D 69 (2004) 024015, 8 pp.
[12] C. J. S. Clarke, On the global isometric embedding of pseudo-Riemannian manifolds,
Proc. Roy. Soc. Lond. A 314 (1970) 417–428.
[13] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[14] M. P. do Carmo, Differential Geometry of Curves and Surfaces (Prentice-Hall,
Englewood Cliffs, NJ, 1976).
[15] B. P. Dolan, K. S. Gupta and A. Stern, Noncommutative BTZ black hole and discrete
time, Class. Quant. Grav. 24 (2007) 1647–1656.
[16] S. Doplicher, K. Fredenhagen and J. E. Roberts, The quantum structure of spacetime
at the Planck scale and quantum fields, Comm. Math. Phys. 172 (1995) 187–220.
[17] V. Drinfeld, Quasi-Hopf algebras, Leningrad Math. J. 1 (1990) 1419–1457.
[18] S. Estrada-Jimenez, H. Garcia-Compean, O. Obregon and C. Ramirez, Twisted
covariant noncommutative self-dual gravity, Phys. Rev. D 78 (2008) 124008, 11 pp.
[19] A. Friedman, Local isometric embedding of Riemannian manifolds with indefinite
metric, J. Math. Mech. 10 (1961) 625–650.
[20] V. Gayral, J. M. Gracia-Bondía, B. Iochum, T. Schücker and J. C. Várilly, Moyal
planes are spectral triples, Comm. Math. Phys. 246 (2004) 569–623.
[21] M. Gerstenhaber, On the deformation of rings and algebras, Ann. Math. 79 (1964)
59–103.
[22] J. M. Gracia-Bondía, J. C. Várilly and H. Figueroa, Elements of Noncommutative
Geometry, Birkhäuser Advanced Texts: Basler Lehrbücher (Birkhäuser Boston, Inc.,
Boston, MA, 2001).
[23] R. E. Greene, Isometric Embedding of Riemannian and Pseudo Riemannian Manifolds,
Mem. Amer. Math. Soc., No. 97 (Amer. Math. Soc., 1970).
[24] M. Kashiwara and P. Schapira, Deformation quantization modules I: Finiteness and
duality, arXiv:0802.1245 [math.QA].
[25] M. Kashiwara and P. Schapira, Deformation quantization modules II. Hochschild
class, arXiv:0809.4309 [math.AG].
[26] H. C. Kim, M. I. Park, C. Rim and J. H. Yee, Smeared BTZ black hole from space
noncommutativity, JHEP 10 (2008) 060.
[27] A. Kobakhidze, Noncommutative corrections to classical black holes, Phys. Rev. D
79 (2009) 047701, 3 pp.
[28] M. Kontsevich, Deformation quantization of Poisson manifolds, Lett. Math. Phys. 66
(2003) 157–216.
[29] J. Madore and J. Mourad, Quantum space-time and classical gravity, J. Math. Phys.
39 (1998) 423–442.
[30] S. Majid, Noncommutative Riemannian and spin geometry of the standard q-sphere,
Comm. Math. Phys. 256 (2005) 255–285.
[31] F. Müller-Hoissen, Noncommutative geometries and gravity, in Recent Developments
in Gravitation and Cosmology, AIP Conf. Proc., Vol. 977 (Amer. Inst. Phys., Melville,
NY, 2008), pp. 12–29.
[32] J. Nash, The imbedding problem for Riemannian manifolds, Ann. Math. 63 (1956)
20–63.
[33] P. Nicolini, A. Smailagic and E. Spallucci, Noncommutative geometry inspired
Schwarzschild black hole, Phys. Lett. B 632 (2006) 547–551.
[34] H. S. Snyder, Quantized space-time, Phys. Rev. 71 (1947) 38–41.
[35] R. J. Szabo, Symmetry, gravity and noncommutativity, Class. Quant. Grav. 23 (2006)
R199–R242.
[36] H. Steinacker, Emergent gravity and noncommutative branes from Yang–Mills matrix
models, Nucl. Phys. B 810 (2009) 1–39.
[37] D. Wang, R. B. Zhang and X. Zhang, Quantum deformations of Schwarzschild and
Schwarzschild-de Sitter spacetimes, Class. Quant. Grav. 26 (2009) 085014, 14 pp.
[38] C. N. Yang, On quantized space-time, Phys. Rev. 72 (1947) 874.
June 2, 2010 14:55 WSPC/S0129-055X 148-RMP J070-00401

Reviews in Mathematical Physics
Vol. 22, No. 5 (2010) 533–548
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004016

CONSTRUCTION OF CERTAIN FUZZY FLAG MANIFOLDS

MAJDI BEN HALIMA


Faculté des Sciences de Sfax, Département de Mathématiques,
Route de Soukra, 3038 Sfax, Tunisia
majdi.benhalima@yahoo.fr

Received 14 May 2009


Revised 16 February 2010

Approximating the algebra of complex-valued smooth functions on a space-time manifold
by a sequence of matrix algebras $A_N = \mathrm{Mat}(d_N, \mathbb{C})$, with $d_N \to \infty$, is the basic idea of
fuzzy manifolds. In this paper, we explicitly construct fuzzy versions of the homogeneous
spaces $SO(2n+1)/U(n)$ and $Sp(n)/U(1) \times Sp(n-1)$ for $n \ge 2$. This allows us to extend a
result of Zhang giving a construction of fuzzy irreducible compact Hermitian symmetric
spaces to a class of flag manifolds.

Keywords: Fuzzy flag manifolds; Berezin–Toeplitz quantization; representations of
compact Lie groups.

Mathematics Subject Classification: 81T08, 81S10, 22E47

1. Introduction
Let $(M, \omega)$ be a quantizable compact Kähler manifold. Let $(L, h, \nabla)$ be an associated
quantum line bundle. Here $L$ is a holomorphic line bundle, $h$ a Hermitian metric
and $\nabla$ the unique connection in $L$ which is compatible with the complex structure
and the metric such that the curvature form $R$ of the line bundle and the Kähler
form $\omega$ of the manifold are related as
\[ R(X, Y) := \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} = -i\,\omega(X, Y), \]
where $X, Y$ are smooth vector fields on $M$. Let us fix a positive integer $N$ and
set $L^N := L^{\otimes N}$, the $N$th tensor power of $L$. On the space $\Gamma_\infty(M, L^N)$ of smooth
sections of $L^N$, we have the scalar product
\[ \langle \varphi, \psi \rangle = \int_M h^N(\varphi(x), \psi(x))\, d\mu(x), \]
where $h^N := h^{\otimes N}$ is the induced metric on $L^N$ and $d\mu(x)$ is the normalized Liouville
measure on $M$. Let $L^2(M, L^N)$ be the $L^2$-completion of the space $\Gamma_\infty(M, L^N)$
and $\Gamma_{hol}(M, L^N)$ be its closed subspace of holomorphic sections. By compactness
of $M$, the Hilbert space $H_N := \Gamma_{hol}(M, L^N)$ is finite-dimensional. The algebra


$A_N := \mathrm{End}_{\mathbb{C}}(H_N)$ can evidently be identified with the matrix algebra
$\mathrm{Mat}(\dim_{\mathbb{C}} H_N, \mathbb{C})$. Letting $C^\infty(M)$ be the algebra of complex-valued smooth
functions on $M$, the Berezin–Toeplitz quantization map $T_N : C^\infty(M) \to A_N$ is
defined by associating to a function $f$ multiplication of holomorphic sections of
$L^N$ by $f$ followed by projection on the space of holomorphic sections. In this way,
one obtains a sequence of matrix algebras $(A_N)_{N \ge 1}$ and a sequence of linear maps
$(T_N)_{N \ge 1}$. Referring to a work of Bordemann, Meinrenken and Schlichenmaier [7],
we know that the sequence $(A_N)_{N \ge 1}$ should, in some sense, approximate the
commutative algebra $C^\infty(M)$. Such an approximation scheme is reminiscent of
fuzzy manifolds where finite-dimensional matrix algebras are used to approximate
the algebra of complex-valued smooth functions on a space-time manifold.
More precisely, a fuzzy version of a compact manifold $D$ is given by a sequence
of linear subspaces $(E_N)_{N \ge 1}$ in the function algebra $C^\infty(D)$ such that
$E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$ is dense in $C^\infty(D)$, and such that $E_N$ is isomorphic to a
matrix algebra $A_N = \mathrm{Mat}(d_N, \mathbb{C})$ with $d_N \to \infty$. Furthermore, it is required that
this truncation retains all symmetries of the manifold $D$. The prototypical example
of a fuzzy compact manifold is the fuzzy two-sphere $S^2$. Identify $S^2$ with the
homogeneous space $SU(2)/S(U(1) \times U(1))$ and recall that
$$L^2(S^2) \cong \widehat{\bigoplus_{k \in \mathbb{N}_0}} V_{2k},$$
where $\mathbb{N}_0 := \mathbb{Z}_{\ge 0}$ and $V_l$ is the space of homogeneous complex polynomials of degree
$l$ in two variables. Then, since
$$V_N^* \otimes V_N \cong \bigoplus_{k=0}^{N} V_{2k}$$
by self-duality of the $V_l$ and the usual Clebsch-Gordan rule, the algebra $A_N :=
\mathrm{End}_{\mathbb{C}}(V_N) \cong \mathrm{Mat}(N+1, \mathbb{C})$ appears not only as a natural $SU(2)$-equivariant truncation
of $L^2(S^2)$ (or $C^\infty(S^2)$) but carries a non-commutative multiplication as
well (see, e.g., [24, 25] for details). A number of fuzzy compact manifolds have
been constructed by now. For reviews on some of these constructions, we refer to
[5, 6, 11, 16].
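The dimension count behind the fuzzy two-sphere can be checked numerically. The following is a minimal sketch, not part of the original construction: the function names are illustrative, and the only input is the standard fact that the space $V_l$ of homogeneous polynomials of degree $l$ in two variables has dimension $l + 1$. It compares $\sum_{k=0}^{N} \dim V_{2k}$ with $\dim \mathrm{Mat}(N+1, \mathbb{C}) = (N+1)^2$.

```python
def dim_V(l: int) -> int:
    """Dimension of the space of homogeneous polynomials of degree l in two variables."""
    return l + 1


def truncated_dim(N: int) -> int:
    """Dimension of V_0 + V_2 + ... + V_{2N}, the level-N truncation of L^2(S^2)."""
    return sum(dim_V(2 * k) for k in range(N + 1))


def matrix_algebra_dim(N: int) -> int:
    """Dimension of End(V_N) = Mat(N + 1, C) as a complex vector space."""
    return dim_V(N) ** 2


if __name__ == "__main__":
    # Print N, the truncated dimension and the matrix algebra dimension side by side.
    for N in range(6):
        print(N, truncated_dim(N), matrix_algebra_dim(N))
```

Agreement of the two columns is only a necessary consistency check of the decomposition; it does not by itself prove the $SU(2)$-equivariant isomorphism.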
As suggested by Madore in [24], fuzzy compact manifolds have found several
applications in physics. In quantum field theory, they can provide a finite mode
approximation to commutative continuum field theories, giving an alternative to
lattice gauge theories. Compared to a lattice regularization procedure, the fuzzy
approach has the advantage of preserving the space-time symmetries. It also has
further advantages in situations where fermions are included. Due to these and
other potential advantages, the fuzzy approach appears as a promising new tool in
quantum field theory (see, e.g., [4, 12, 15] for more details). There are other reasons
to investigate fuzzy compact manifolds in theoretical physics. They lead to matrix
models which have received a lot of interest in string theory, especially in the theory of
D-branes (see, e.g., [2, 17]).
From a rather mathematical point of view, fuzzy compact manifolds have
an interesting connection with noncommutative geometry. Following an idea of
Fröhlich and Gawędzki [13], a fuzzy version of a compact manifold $D$ can be
specified by a sequence of triples
$$(\mathrm{Mat}(d_N, \mathbb{C}), H_N, \Delta_N),$$
where the Hilbert space $H_N = \mathbb{C}^{d_N^2}$ is equipped with the inner product
$$\langle A, B \rangle = \frac{1}{d_N} \mathrm{Tr}(A B^*),$$
and $\Delta_N$ is a matrix analog of the Laplace-Beltrami operator. The fuzzy Laplacian
$\Delta_N$ comes with a cutoff and encodes mathematical information about the manifold
$D$ (see, e.g., [10] for details). This important fact motivates the study of fuzzy
compact manifolds within the framework of noncommutative geometry.
The main goal of the present work is to construct explicit fuzzy versions of
the homogeneous spaces $SO(2n+1)/U(n)$ and $Sp(n)/(U(1) \times Sp(n-1))$ ($n \ge 2$)
by means of elementary representation-theoretic methods. This allows us to
establish the following result, which describes a class of fuzzy flag manifolds.

Theorem. Let $G$ be a compact, connected, simply connected Lie group with Lie
algebra $\mathfrak{g}$, and let $\mathfrak{p}$ be a standard maximal parabolic subalgebra of the complexified
Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $K \subset G$ be the connected Lie group with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{g}$.
Assume that $(G, K)$ is a Gelfand pair. Then there exists a sequence $(E_N)_{N \ge 1}$
of $G$-invariant subspaces of $L^2(G/K)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$ is dense
in $C^\infty(G/K)$, and such that $E_N$ is $G$-equivariantly isomorphic to a matrix algebra
$A_N = \mathrm{Mat}(d_N, \mathbb{C})$ with $d_N \to \infty$.

This theorem extends a result of Zhang (see [31, Proposition 3.1 and
Theorem 4.2]) which gives a construction of fuzzy irreducible compact Hermitian
symmetric spaces. In the proof of the above theorem, we shall make direct use of
the standard Berezin-Toeplitz quantization procedure for compact Kähler manifolds.
In connection with our work, let us mention that Lazaroiu, McNamee and
Sämann (see [22]) have recently proved that a particular version of generalized
Berezin quantization, which they call Berezin-Bergmann quantization, provides
a general framework for approaching the construction of fuzzy compact Kähler
manifolds. Using this framework, the authors have proposed a general definition of
fuzzy scalar field theory on compact Kähler manifolds.
The present paper is organized as follows. In Sec. 2, we first fix our notations
and terminology. Then we recall some useful facts about a special class of Gelfand
pairs. In Sec. 3, we provide explicit formulas concerning the decomposition into
irreducibles of some tensor product representations of the groups $SO(2n+1)$ and
$Sp(n)$ for $n \ge 2$ (see Corollary 1 and Proposition 2 below). These formulas play an
important role in Sec. 4, which is essentially devoted to the proof of our main result.
2. Preliminaries
2.1. Basic notions
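Let $G$ be a compact connected semisimple Lie group with Lie algebra $\mathfrak{g}$. We denote
by $\mathfrak{g}^{\mathbb{C}}$ the complexification of $\mathfrak{g}$ and by $G^{\mathbb{C}}$ the simply connected Lie group with Lie
algebra $\mathfrak{g}^{\mathbb{C}}$. Let $T$ be a maximal torus in $G$ with Lie algebra $\mathfrak{h}$. The complexification
$\mathfrak{h}^{\mathbb{C}}$ of $\mathfrak{h}$ is a Cartan subalgebra of $\mathfrak{g}^{\mathbb{C}}$. We denote by $\Delta$ the root system of $\mathfrak{g}^{\mathbb{C}}$ with
respect to $\mathfrak{h}^{\mathbb{C}}$. We fix a lexicographic ordering on the dual $\mathfrak{h}_{\mathbb{R}}^* := (i\mathfrak{h})^*$ and we
write $\Delta^+$ for the corresponding system of positive roots. The Killing form $B$ on $\mathfrak{g}$
extends complex bilinearly to $\mathfrak{g}^{\mathbb{C}}$. It is easy to see that $B$ is positive definite on $\mathfrak{h}_{\mathbb{R}}$.
For $\lambda \in \mathfrak{h}_{\mathbb{R}}^*$, let $H_\lambda$ be the element of $\mathfrak{h}_{\mathbb{R}}$ such that $\lambda(H) = B(H, H_\lambda)$ for all $H \in \mathfrak{h}_{\mathbb{R}}$.
Thus we obtain a scalar product on $\mathfrak{h}_{\mathbb{R}}^*$ given by
$$\langle \lambda, \mu \rangle = B(H_\lambda, H_\mu).$$
Let $\Pi = \{\alpha_1, \ldots, \alpha_l\}$ be the system of simple roots corresponding to $\Delta^+$. The
elements $\varpi_{\alpha_j}$, $1 \le j \le l$, defined by
$$\frac{2\langle \varpi_{\alpha_j}, \alpha_k \rangle}{\langle \alpha_k, \alpha_k \rangle} = \delta_{j,k} \quad \text{for } 1 \le k \le l,$$
are called the fundamental weights attached to $\Pi$. To simplify notation, we set
$\varpi_j := \varpi_{\alpha_j}$. The weight lattice is then given by
$$\Lambda = \Big\{ \lambda \in (\mathfrak{h}^{\mathbb{C}})^* ;\ \lambda = \sum_{j=1}^{l} n_j \varpi_j,\ n_j \in \mathbb{Z} \Big\}.$$
The set of dominant weights is the cone
$$\Lambda^+ = \Big\{ \lambda \in (\mathfrak{h}^{\mathbb{C}})^* ;\ \lambda = \sum_{j=1}^{l} n_j \varpi_j,\ n_j \in \mathbb{N}_0 \Big\}.$$
For each $\lambda \in \Lambda^+$, we denote by $\pi_\lambda$ the unique (up to equivalence) irreducible
representation of $G$ with highest weight $\lambda$, acting in $V(\lambda)$. Let $\alpha_j$ be a simple
root. The irreducible representation $\pi_{\varpi_j}$ is called the fundamental representation
attached to $\alpha_j$.
Let now $K$ be a closed connected subgroup of $G$ with Lie algebra $\mathfrak{k}$. A dominant
weight $\lambda \in \Lambda^+$ is called $K$-spherical if the subspace of $K$-fixed vectors in $V(\lambda)$ is
one-dimensional. The corresponding representation $\pi_\lambda$ is then called $K$-spherical.
We write $\Lambda_K^+$ for the subset of $K$-spherical dominant weights. If for every $\lambda \in \Lambda^+$
the subspace of $K$-fixed vectors in $V(\lambda)$ is at most one-dimensional, then the pair
$(G, K)$ is called a Gelfand pair. In this case, the harmonic analysis of the square
integrable functions on the homogeneous space $M = G/K$, endowed with the Haar
measure, is given by
$$L^2(M) \cong \widehat{\bigoplus_{\lambda \in \Lambda_K^+}} V(\lambda).$$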
2.2. A special class of Gelfand pairs


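Let us keep the notations of the previous subsection. Let
$$\mathfrak{g}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta} \mathfrak{g}^{\mathbb{C}}_\alpha = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta} \mathbb{C} E_\alpha$$
be the standard root decomposition of $\mathfrak{g}^{\mathbb{C}}$. For a given subset $S \subset \Pi$, define the
parabolic subalgebra
$$\mathfrak{p}_S := \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Gamma_S} \mathfrak{g}^{\mathbb{C}}_\alpha,$$
where $\Gamma_S := \Delta^+ \cup \{\alpha \in \Delta ;\ \alpha \in \mathrm{span}(S)\}$, and denote by $P_S$ the corresponding
parabolic subgroup of $G^{\mathbb{C}}$. Let $\mathfrak{l}_S$ be the Levi factor of $\mathfrak{p}_S$,
$$\mathfrak{l}_S = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Gamma_S \cap (-\Gamma_S)} \mathfrak{g}^{\mathbb{C}}_\alpha,$$
and set $\mathfrak{k}_S := \mathfrak{p}_S \cap \mathfrak{g} = \mathfrak{l}_S \cap \mathfrak{g}$. Then $\mathfrak{k}_S$ is a compact real form of $\mathfrak{l}_S$. Setting $K_S :=
G \cap P_S$, we see that $K_S$ is a Lie subgroup of $G$ with Lie algebra $\mathfrak{k}_S$.
Assume furthermore that $(G, K)$ is a Gelfand pair and that there exists a subset
$S \subset \Pi$ such that $\# S^c := \#(\Pi \setminus S) = 1$ and $\mathfrak{k} = \mathfrak{k}_S$. Note that the corresponding
$P_S \subset G^{\mathbb{C}}$ is maximal parabolic and that the Dynkin diagram of $K$ can be obtained
from the Dynkin diagram of $G$ by deleting one node. The simple root $\alpha \in \Pi$
with $\Pi \setminus S = \{\alpha\}$ is called the Gelfand node associated to the pair $(G, K)$.
The following important proposition characterizes a special class of compact
Gelfand pairs.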

Proposition 1 ([30, Proposition 4.7]). Let $G$ be a compact, connected, simply
connected Lie group with Lie algebra $\mathfrak{g}$, and let $\mathfrak{p}$ be a standard maximal parabolic
subalgebra of the complexified Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $K \subset G$ be the connected Lie group
with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{g}$. Then $(G, K)$ is a Gelfand pair if and only if one of the
following three conditions is satisfied:

(i) $(G, K)$ is an irreducible compact Hermitian symmetric pair;
(ii) $(G, K) \simeq (SO(2n+1), U(n))$ $(n \ge 2)$;
(iii) $(G, K) \simeq (Sp(n), U(1) \times Sp(n-1))$ $(n \ge 2)$.

Let $(G, K)$ be a pair from the list (i)-(iii) above, and let $(\mathfrak{g}, \mathfrak{k})$ be the associated
pair of Lie algebras. Then $\mathfrak{k} = \mathfrak{k}_S$ for some subset $S \subset \Pi$ with $\# S^c = 1$. Let $\alpha \in \Pi$
be the associated Gelfand node with corresponding fundamental weight $\varpi := \varpi_\alpha$.
One can extend $\varpi$ complex linearly to $\mathfrak{g}^{\mathbb{C}}$ by setting $\varpi(E_\beta) = 0$ for all $\beta \in \Delta$.
The following fact is worth mentioning. Denote by $L$ the isotropy group of $\varpi$ under
the coadjoint action of $G$, i.e.
$$L = \{g \in G ;\ \mathrm{Ad}^*(g)\varpi = \varpi\}.$$
Using the Killing form of $\mathfrak{g}$, we identify $\varpi$ with an element $Z \in \mathfrak{h}_{\mathbb{R}} = i\mathfrak{h}$. Thus
we get
$$L = \{g \in G ;\ \mathrm{Ad}(g)Z = Z\},$$
and then
$$\mathfrak{l} := \mathrm{Lie}(L) = \{X \in \mathfrak{g} ;\ [X, Z] = 0\}.$$
In the standard root decomposition of $\mathfrak{g}^{\mathbb{C}}$, $E_\beta$ commutes with $Z$ if and only if the
root $\beta$ is orthogonal to $\varpi$. Observe now that $\beta$ is orthogonal to $\varpi$ if and only if $\beta$
belongs to the set $\Gamma_S \cap (-\Gamma_S)$. This means that $\mathfrak{l}^{\mathbb{C}}$ is spanned by $\mathfrak{h}^{\mathbb{C}}$ and the $E_\beta$'s
with $\beta \in \Gamma_S \cap (-\Gamma_S)$, and hence we get $\mathfrak{l}^{\mathbb{C}} = \mathfrak{k}^{\mathbb{C}}$. We conclude that $K = L$, which
proves that the flag manifold $M = G/K$ can be identified with the $G$-orbit through
$\varpi$ under the coadjoint representation.

3. Decomposition of Tensor Product Representations of the Groups SO(2n + 1) and Sp(n)
The goal of this section is to describe the decomposition into irreducibles of
some particular tensor product representations of the special orthogonal Lie group
$SO(2n+1)$ and the symplectic Lie group $Sp(n)$ for $n \ge 2$. We provide here explicit
formulas that will be used in the proof of our main result in the next section. For
a detailed exposition of the representation theory of $SO(2n+1)$ and $Sp(n)$, we
refer to [19].

3.1. The case of the group SO(2n + 1)


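The material of this subsection is not new but is worth summarizing in preparation
for our main result. Let $E_{i,j} \in \mathrm{Mat}(2n+1, \mathbb{C})$ be the elementary matrix having 1 at
the $(i, j)$-entry and 0 elsewhere. We take the standard Cartan subalgebra $\mathfrak{h}$ of the
Lie algebra $\mathfrak{so}(2n+1)$ spanned by the matrices $(E_{2j-1,2j} - E_{2j,2j-1})$ for $1 \le j \le n$.
Let $e_k$ be the linear form on the complexified Lie algebra $\mathfrak{h}^{\mathbb{C}}$ given by
$$e_k \begin{pmatrix} 0 & i h_1 & & & & \\ -i h_1 & 0 & & & & \\ & & \ddots & & & \\ & & & 0 & i h_n & \\ & & & -i h_n & 0 & \\ & & & & & 0 \end{pmatrix} = h_k$$
for $1 \le k \le n$. In the usual ordering on $\mathfrak{h}_{\mathbb{R}}^* := (i\mathfrak{h})^*$ and for $n \ge 2$, we have the
following system of positive roots of the pair $(\mathfrak{so}(2n+1, \mathbb{C}), \mathfrak{h}^{\mathbb{C}})$:
$$\Delta^+ = \Delta^+(\mathfrak{so}(2n+1, \mathbb{C}), \mathfrak{h}^{\mathbb{C}}) = \{e_k \pm e_l,\ 1 \le k < l \le n\} \cup \{e_k,\ 1 \le k \le n\}.$$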
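The associated system of simple roots is then
$$\Pi = \{\alpha_k = e_k - e_{k+1},\ 1 \le k < n\} \cup \{\alpha_n = e_n\}.$$
Let us recall that:
(a) the fundamental weights are $\varpi_j = \sum_{k=1}^{j} e_k$ for $1 \le j \le n-1$ and $\varpi_n = \frac{1}{2} \sum_{k=1}^{n} e_k$;
(b) the weight lattice is $\Lambda = \{\sum_{k=1}^{n} \lambda_k e_k ;\ \lambda_k \in \mathbb{Z}\ \forall k \text{ or } \lambda_k \in \mathbb{Z} + \frac{1}{2}\ \forall k\}$;
(c) a weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k$ is dominant if and only if $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$;
(d) the fundamental representation attached to the simple root $\alpha_n = e_n$ is the
so-called spin representation.
Given a dominant weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k$ (or simply $\lambda = (\lambda_1, \ldots, \lambda_n)$), we
denote, as before, by $V(\lambda)$ the associated $SO(2n+1)$-irreducible module with highest
weight $\lambda$. Let now $\lambda = s(e_1 + \cdots + e_n)$ and $\mu = t(e_1 + \cdots + e_n)$ be two constant
dominant weights of $SO(2n+1)$ with $s, t \in \frac{1}{2}\mathbb{N}_0$ and $s \le t$. In [26, Theorem 2.5],
Okada has proven the following multiplicity free decomposition formula
$$V(\lambda) \otimes V(\mu) = \bigoplus_{\nu \in P_{s,t}} V(\nu),$$
where
$$P_{s,t} = \{\nu = (\kappa_1 + t - s, \ldots, \kappa_n + t - s) ;\ (\kappa_1, \ldots, \kappa_n) \in \mathbb{N}_0^n,\ 2s \ge \kappa_1 \ge \cdots \ge \kappa_n \ge 0\}.$$
Since all representations of the group $SO(2n+1)$ are self-dual, we can deduce

Corollary 1. Let $\lambda = s(e_1 + \cdots + e_n)$ with $s \in \frac{1}{2}\mathbb{N}_0$. As $SO(2n+1)$-modules, we
have
$$V(\lambda)^* \otimes V(\lambda) = \bigoplus_{\nu \in P_s} V(\nu),$$
where
$$P_s = \{\nu = (\nu_1, \ldots, \nu_n) \in \mathbb{N}_0^n ;\ 2s \ge \nu_1 \ge \cdots \ge \nu_n \ge 0\}.$$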
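A quick consistency check of Corollary 1 is to compare dimensions on both sides. The sketch below is an illustration only and not part of the paper: it assumes the standard Weyl dimension formula for type $B_n$ with $\rho = (n - \frac{1}{2}, \ldots, \frac{1}{2})$, and the names `weyl_dim_B`, `weakly_decreasing` and `check_corollary_1` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import combinations


def weyl_dim_B(lam):
    """Weyl dimension formula for type B_n; lam is a dominant weight in e_k-coordinates."""
    n = len(lam)
    rho = [Fraction(2 * (n - k) - 1, 2) for k in range(n)]  # rho = (n - 1/2, ..., 1/2)
    l = [Fraction(x) + r for x, r in zip(lam, rho)]
    dim = Fraction(1)
    for i, j in combinations(range(n), 2):  # positive roots e_i - e_j and e_i + e_j
        dim *= (l[i] - l[j]) * (l[i] + l[j])
        dim /= (rho[i] - rho[j]) * (rho[i] + rho[j])
    for i in range(n):  # positive short roots e_i
        dim *= l[i] / rho[i]
    assert dim.denominator == 1
    return int(dim)


def weakly_decreasing(n, top):
    """All integer tuples (nu_1, ..., nu_n) with top >= nu_1 >= ... >= nu_n >= 0."""
    if n == 0:
        yield ()
        return
    for first in range(top, -1, -1):
        for rest in weakly_decreasing(n - 1, first):
            yield (first,) + rest


def check_corollary_1(n, two_s):
    """Return (dim V(lam)^2, sum of dim V(nu) over nu in P_s) for s = two_s / 2."""
    lam = (Fraction(two_s, 2),) * n
    lhs = weyl_dim_B(lam) ** 2
    rhs = sum(weyl_dim_B(nu) for nu in weakly_decreasing(n, two_s))
    return lhs, rhs


if __name__ == "__main__":
    for n in (2, 3):
        for two_s in range(4):
            print(n, two_s, check_corollary_1(n, two_s))
```

For $n = 2$ and $s = \frac{1}{2}$, for instance, the spin representation has dimension 4, and the right-hand side adds up the dimensions 1, 5 and 10 of $V((0,0))$, $V((1,0))$ and $V((1,1))$. Matching dimensions is a necessary condition for the decomposition, not a proof of it.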

3.2. The case of the group Sp(n)


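We begin this subsection by recalling some well-known facts about the representations
of the compact Lie group $Sp(n)$. Let
$$\mathfrak{h} = \left\{ H = \begin{pmatrix} i h_1 & & & & & \\ & \ddots & & & & \\ & & i h_n & & & \\ & & & -i h_1 & & \\ & & & & \ddots & \\ & & & & & -i h_n \end{pmatrix} ;\ h_j \in \mathbb{R}\ \ \forall\, 1 \le j \le n \right\}$$
be the standard Cartan subalgebra of the Lie algebra $\mathfrak{sp}(n)$. Given an element $H \in \mathfrak{h}$
as above, we can simply write $H = \mathrm{diag}(ih_1, \ldots, ih_n, -ih_1, \ldots, -ih_n)$. Let $e_k$ be
the linear form on $\mathfrak{h}^{\mathbb{C}}$ defined by
$$e_k(\mathrm{diag}(h_1, \ldots, h_n, -h_1, \ldots, -h_n)) = h_k,$$
where $1 \le k \le n$. For $n \ge 2$, we fix the following system of positive roots of the
pair $(\mathfrak{sp}(n, \mathbb{C}), \mathfrak{h}^{\mathbb{C}})$:
$$\Delta^+ = \Delta^+(\mathfrak{sp}(n, \mathbb{C}), \mathfrak{h}^{\mathbb{C}}) = \{e_k \pm e_l,\ 1 \le k < l \le n\} \cup \{2e_k,\ 1 \le k \le n\}.$$
The associated system of simple roots is
$$\Pi = \{\alpha_k = e_k - e_{k+1},\ 1 \le k < n\} \cup \{\alpha_n = 2e_n\}.$$
Recall that:
(a) the fundamental weights are $\varpi_j = \sum_{k=1}^{j} e_k$ for $1 \le j \le n$;
(b) the weight lattice is $\Lambda = \{\sum_{k=1}^{n} \lambda_k e_k ;\ \lambda_k \in \mathbb{Z}\ \forall k\}$;
(c) a weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k$ is dominant if and only if $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$.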
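The fundamental weights in item (a) can be checked against the defining relation $2\langle \varpi_j, \alpha_k \rangle / \langle \alpha_k, \alpha_k \rangle = \delta_{j,k}$ of Sec. 2.1. The following sketch is illustrative only; it assumes that the scalar product induced by the Killing form is a positive multiple of the standard dot product in the coordinates $e_1, \ldots, e_n$ (the ratio above does not depend on that multiple), and all function names are hypothetical.

```python
from fractions import Fraction


def unit(n, k):
    """Coordinate vector of e_k (1-based index) in Z^n."""
    return [1 if i == k - 1 else 0 for i in range(n)]


def simple_roots_C(n):
    """alpha_k = e_k - e_{k+1} for k < n and alpha_n = 2 e_n."""
    roots = []
    for k in range(1, n):
        roots.append([a - b for a, b in zip(unit(n, k), unit(n, k + 1))])
    roots.append([2 * a for a in unit(n, n)])
    return roots


def fundamental_weights_C(n):
    """varpi_j = e_1 + ... + e_j for 1 <= j <= n."""
    return [[1 if i < j else 0 for i in range(n)] for j in range(1, n + 1)]


def dot(u, v):
    """Standard dot product, assumed proportional to the Killing-form product."""
    return sum(a * b for a, b in zip(u, v))


def pairing_matrix(n):
    """Matrix with entries 2<varpi_j, alpha_k> / <alpha_k, alpha_k>."""
    roots, weights = simple_roots_C(n), fundamental_weights_C(n)
    return [[Fraction(2 * dot(w, a), dot(a, a)) for a in roots] for w in weights]


if __name__ == "__main__":
    for row in pairing_matrix(3):
        print([int(x) for x in row])
```

The printed matrix should be the identity, which is exactly the statement that the $\varpi_j$ are dual to the simple coroots.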

Next we are going to state Littelmann's rule, which describes the decomposition
into irreducibles of the tensor product of two general $Sp(n)$-irreducible modules.
To this end, we first briefly recall some basic terminology.
As usual, a partition is a non-increasing sequence $\lambda = (\lambda_1, \lambda_2, \ldots)$ of
non-negative integers. The depth $d(\lambda)$ of a partition $\lambda$ is the number of non-zero
terms of $\lambda$. A partition with depth $\le n$ is regarded as an element of $\mathbb{N}_0^n$. Let
$\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_d)$ be a partition of depth $d$. The Young diagram of $\lambda$ is a collection
of left-justified rows of boxes with $\lambda_i$ boxes in the $i$th row for $1 \le i \le d$. A filling of
the Young diagram of $\lambda$ with elements of the set $\{1, 2, \ldots, n\}$ which is nondecreasing
in rows and strictly increasing in the columns is called an $n$-semistandard (Young)
tableau (or tableau for short) of shape $\lambda$. Given a tableau $T$, the filling of the box
$(i, j)$ is denoted by $T_{i,j}$.
Let again $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_d)$ be a partition of depth $d \le n$. A tableau $T$
of shape $\lambda$ is called a $(2n)$-symplectic tableau if its entries are elements of $\{1, \ldots, 2n\}$
and if it obeys the additional constraint $T_{i,j} \ge 2i - 1$. These tableaux were
introduced by King and El-Sharkaway [18]. Consider a $(2n)$-symplectic tableau $T$.
The vector
$$\mathrm{con}(T) := (\#\{1\text{'s in } T\} - \#\{2\text{'s in } T\}, \ldots, \#\{(2n-1)\text{'s in } T\} - \#\{(2n)\text{'s in } T\})$$
is called the content of $T$. We denote by $T(l)$ the tableau that consists of the last
$l$ columns of $T$. Given a weight $\lambda = \sum_{j=1}^{n} \lambda_j e_j$, we shall identify $\lambda$ with the
element $(\lambda_1, \ldots, \lambda_n) \in \mathbb{Z}^n$. Now we arrive at

Theorem 1 (Littelmann [23, Theorem (a), p. 346]). Let $\Lambda^+$ be the set of
dominant weights of $Sp(n)$ with $n \ge 1$. For $\lambda, \mu \in \Lambda^+$, we have
$$V(\lambda) \otimes V(\mu) \cong \bigoplus_{T} V(\lambda + \mathrm{con}(T)),$$
where the sum is over all $(2n)$-symplectic tableaux $T$ of shape $\mu$ such that the weight
$\lambda + \mathrm{con}(T(l))$ is dominant for all $l$.

Remark. In the formulation of Littelmann's rule stated above, we basically
reproduced Krattenthaler's description (see [21, Appendix A6]) with a slight
modification in the description of $(2n)$-symplectic tableaux, where we followed
[14]. This formulation is more elementary and is convenient for clarifying our
calculation.
Applying the above theorem in the case where $\lambda = \mu = (N, 0, \ldots, 0)$, we obtain
Proposition 2. For $N \in \mathbb{N}_0$ and $n \ge 2$, we have
$$V((N, 0, \ldots, 0)) \otimes V((N, 0, \ldots, 0)) \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0)).$$
Proof. If $N = 0$, then the proposition is obvious. Let us consider a $(2n)$-symplectic
tableau $T$ of shape $\mu = (N, 0, \ldots, 0)$ with $N \in \mathbb{N}$. For $1 \le i \le 2n$, we set $k_i :=
\#\{i\text{'s in } T\}$. By definition of the $k_i$'s, we have $k_1 + k_2 + \cdots + k_{2n} = N$. Note that the
content of $T$ is given by
$$\mathrm{con}(T) = (k_1 - k_2, k_3 - k_4, \ldots, k_{2n-1} - k_{2n}).$$
Assume that $T$ satisfies the following property: $\lambda + \mathrm{con}(T(l)) \in \Lambda^+$ for all $l$, where $\lambda = (N, 0, \ldots, 0)$. For
$l = k_{2n}$, the content of the tableau $T(l)$ is
$$\mathrm{con}(T(l)) = (\underbrace{0, \ldots, 0}_{n-1}, -k_{2n})$$
and so
$$\lambda + \mathrm{con}(T(l)) = (N, \underbrace{0, \ldots, 0}_{n-2}, -k_{2n}).$$
Since $\lambda + \mathrm{con}(T(l)) \in \Lambda^+$, it follows that $k_{2n} = 0$.
Next, we are going to prove that $k_i = 0$ for all $4 \le i \le 2n$. The case $n = 2$
is already proven. Assume $n \ge 3$, fix $4 \le i \le 2n$ and suppose that $k_j = 0$ for
all $i + 1 \le j \le 2n$. We will prove that $k_i = 0$. For this we consider the following
cases:
Case 1. If $i$ is even, then we have for $l = k_i$
$$\mathrm{con}(T(l)) = (\underbrace{0, \ldots, 0}_{\frac{i-2}{2}}, -k_i, 0, \ldots, 0).$$
The fact that $\lambda + \mathrm{con}(T(l)) \in \Lambda^+$ clearly forces $k_i = 0$.
Case 2. If $i$ is odd, then we have for $l = k_i$
$$\mathrm{con}(T(l)) = (\underbrace{0, \ldots, 0}_{\frac{i-1}{2}}, k_i, 0, \ldots, 0).$$
Since $\lambda + \mathrm{con}(T(l)) \in \Lambda^+$, we easily get $k_i = 0$.
We conclude that $k_i = 0$ for the fixed integer $i$. An induction on $i$ allows us to
derive the equality $k_i = 0$ for all $4 \le i \le 2n$ with $n \ge 3$. Hence the claim is proven
for $n \ge 2$. Consequently, we can write
$$\mathrm{con}(T) = (k_1 - k_2, k_3, 0, \ldots, 0),$$
where, of course, $k_1 + k_2 + k_3 = N$.
Conversely, if $T$ is a $(2n)$-symplectic tableau of shape $\mu$ such that
$$\mathrm{con}(T) = (k_1 - k_2, k_3, 0, \ldots, 0)$$
with the $k_i$'s being defined as above, then one easily verifies that $\lambda + \mathrm{con}(T(l))$ is
a dominant weight for all $l$. We deduce that
$$V((N, 0, \ldots, 0)) \otimes V((N, 0, \ldots, 0)) \cong \bigoplus_{\substack{k_1, k_2, k_3 \in \mathbb{N}_0 \\ k_1 + k_2 + k_3 = N}} V((N + k_1 - k_2, k_3, 0, \ldots, 0)) = \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0)).$$
Here the last equality uses the substitution $k = k_1$, $l = k_3$, so that $k_2 = N - k - l$ and
$N + k_1 - k_2 = 2k + l$.
This completes the proof of the proposition.
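Proposition 2 can be sanity-checked in two elementary ways: by comparing dimensions, and by verifying that the reindexing $(k_1, k_2, k_3) \mapsto (2k + l, l)$ hits every summand exactly once. The sketch below is an illustration under stated assumptions, not part of the proof: it uses the standard Weyl dimension formula for type $C_n$ with $\rho = (n, n-1, \ldots, 1)$, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def weyl_dim_C(lam):
    """Weyl dimension formula for type C_n; lam is a dominant weight in e_k-coordinates."""
    n = len(lam)
    rho = [n - k for k in range(n)]  # rho = (n, n - 1, ..., 1)
    l = [x + r for x, r in zip(lam, rho)]
    dim = Fraction(1)
    for i, j in combinations(range(n), 2):  # positive roots e_i - e_j and e_i + e_j
        dim *= Fraction((l[i] - l[j]) * (l[i] + l[j]),
                        (rho[i] - rho[j]) * (rho[i] + rho[j]))
    for i in range(n):  # positive long roots 2 e_i
        dim *= Fraction(l[i], rho[i])
    assert dim.denominator == 1
    return int(dim)


def check_proposition_2(n, N):
    """Return (dim V((N,0,...,0))^2, sum of dim V((2k+l, l, 0,...,0)) over k + l <= N)."""
    pad = (0,) * (n - 2)
    lhs = weyl_dim_C((N, 0) + pad) ** 2
    rhs = sum(weyl_dim_C((2 * k + l, l) + pad)
              for k in range(N + 1) for l in range(N + 1 - k))
    return lhs, rhs


def reindexing_is_bijective(N):
    """Check that (N + k1 - k2, k3) with k1 + k2 + k3 = N lists each (2k + l, l) once."""
    from_tableaux = sorted((N + k1 - k2, N - k1 - k2)
                           for k1 in range(N + 1) for k2 in range(N + 1 - k1))
    target = sorted((2 * k + l, l) for k in range(N + 1) for l in range(N + 1 - k))
    return from_tableaux == target and len(set(target)) == len(target)


if __name__ == "__main__":
    for n in (2, 3):
        for N in range(4):
            print(n, N, check_proposition_2(n, N), reindexing_is_bijective(N))
```

For $n = 2$ and $N = 1$ the defining representation has dimension 4, and the three summands $V((0,0))$, $V((2,0))$, $V((1,1))$ have dimensions 1, 10 and 5. As before, equality of dimensions is only a necessary condition.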

4. Fuzzy Versions of Certain Flag Manifolds
We shall freely use the notations introduced earlier. Let $(G, K)$ be a pair from the
list (i)-(iii) in Proposition 1. The aim of this section is to construct a fuzzy version
of the flag manifold $M = G/K$. As we mentioned before, our construction is based
on the Berezin-Toeplitz quantization of such a manifold.

4.1. Quantum line bundle over M


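Fix again a maximal torus $T$ in $G$ and let $\Delta$, $\Delta^+$ and $\Pi$ be as in Sec. 2. Let $\alpha$ be
the Gelfand node associated with $(G, K)$. If $\mathfrak{k}$ is the Lie algebra of $K$, then $\mathfrak{k} = \mathfrak{k}_S$
with $S = \Pi \setminus \{\alpha\}$. We denote by $\Delta_1^+$ the set of positive roots corresponding to $S$.
Then
$$\mathfrak{k}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\beta \in \Delta_1^+} (\mathfrak{g}^{\mathbb{C}}_\beta \oplus \mathfrak{g}^{\mathbb{C}}_{-\beta}).$$
Setting
$$\mathfrak{n}^+ = \bigoplus_{\beta \in \Delta^+ \setminus \Delta_1^+} \mathfrak{g}^{\mathbb{C}}_\beta \quad \text{and} \quad \mathfrak{n}^- = \bigoplus_{\beta \in \Delta^+ \setminus \Delta_1^+} \mathfrak{g}^{\mathbb{C}}_{-\beta},$$
we get
$$\mathfrak{g}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\beta \in \Delta^+} (\mathfrak{g}^{\mathbb{C}}_\beta \oplus \mathfrak{g}^{\mathbb{C}}_{-\beta}) = \mathfrak{k}^{\mathbb{C}} \oplus \mathfrak{n}^+ \oplus \mathfrak{n}^-.$$
Define $N^+$ (respectively $N^-$) to be the connected subgroup of $G^{\mathbb{C}}$ with Lie algebra
$\mathfrak{n}^+$ (respectively $\mathfrak{n}^-$). Note that
$$G/K \simeq G^{\mathbb{C}}/K^{\mathbb{C}} N^+ \simeq G^{\mathbb{C}}/K^{\mathbb{C}} N^-.$$
This shows that $M = G/K$ can be regarded as a complex manifold.
Let $v_\varpi \in V(\varpi)$ be a normalized highest weight vector, with weight $\varpi = \varpi_\alpha$.
Denote by $\chi_\varpi$ the unique holomorphic extension to $K^{\mathbb{C}} N^+$ of the character $e^{-\varpi}$.
With these notations, we have for all $k \in K$
$$\pi_\varpi(k) v_\varpi = \chi_\varpi(k)^{-1} v_\varpi.$$
The line bundle $L = G \times_{e^{-\varpi}} \mathbb{C}$ over $M = G/K = G^{\mathbb{C}}/K^{\mathbb{C}} N^+$ is identified with
$G^{\mathbb{C}} \times_{\chi_\varpi} \mathbb{C}$, and then it is seen as a holomorphic line bundle. Note that every
holomorphic line bundle over $M$ is of the form $L^m$ for some $m \in \mathbb{Z}$. Let $H_N$
be the space of holomorphic sections of the line bundle $L^N := L^{\otimes N}$, $N \in \mathbb{N}$. By
the Borel-Weil theorem (see, e.g., [1]), $H_N$ is an irreducible $G$-module with highest
weight $N\varpi$. It follows that $H_N$ is isomorphic, as $G$-module, to the space $V(N\varpi)$.
The algebra $A_N := \mathrm{End}_{\mathbb{C}}(H_N)$ admits a natural $G$-action and can be identified
with the matrix algebra $\mathrm{Mat}(d_N, \mathbb{C})$, where $d_N := \dim_{\mathbb{C}} V(N\varpi)$.
Let $h$ be the Hermitian structure of the bundle $L \to M$ defined by
$$h([g, z], [g, z']) = z \overline{z'} \quad \text{for all } g \in G.$$
We know that there exists a unique connection $\nabla$ on $L$ leaving $h$ invariant and
satisfying $\nabla_X \sigma = 0$ for each vector field $X$ of type $(0, 1)$ and for each local holomorphic
section $\sigma$. The curvature of $(L, \nabla)$ is the complex 2-form on $M$ given by
$$R(X, Y) := \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} = i\,\omega(X, Y),$$
where $X, Y$ are smooth vector fields and $\omega$ is the $G$-invariant Kähler metric on $M$
(see, e.g., [3]). This shows that $(L, h, \nabla)$ is a quantum line bundle over $M$.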

4.2. Berezin-Toeplitz quantization of M

Fix $N \in \mathbb{N}$. On the space $\Gamma_\infty(M, L^N)$ of smooth sections of $L^N$, we have the scalar
product
$$\langle \varphi, \psi \rangle = \int_M h^N(\varphi(x), \psi(x))\, d\mu(x),$$
where $d\mu(x)$ is the normalized $G$-invariant measure associated to the metric $\omega$ on $M$.
Let $L^2(M, L^N)$ be the $L^2$-completion of the space $\Gamma_\infty(M, L^N)$. We denote by $\Pi_N$
the orthogonal projection onto the subspace $H_N \subset L^2(M, L^N)$. Given a function
$f$ in $C^\infty(M)$, one can define an operator on the space $H_N$ by $T_N(f) := \Pi_N \circ M_f$,
where $M_f$ is the multiplication operator associated to $f$. The corresponding map
$T_N : C^\infty(M) \to \mathrm{End}_{\mathbb{C}}(H_N) = A_N$ is called the Berezin-Toeplitz quantization map.
Let $P_N$ be the orthogonal projector onto the highest weight subspace of $V(N\varpi)$.
One easily verifies that $\pi_{N\varpi}(g) P_N \pi_{N\varpi}(g)^{-1}$ is the projector onto the coherent
state associated to $x = gK \in M$ (see [9]). Thus the coherent state map used in
the Berezin-Toeplitz quantization of Kähler manifolds (see [8]) is here equal to
$$P_N : M = G/K \to \mathrm{End}_{\mathbb{C}}(H_N), \quad gK \mapsto \pi_{N\varpi}(g) P_N \pi_{N\varpi}(g)^{-1},$$
and we get (see [29, Proposition 3.1]) the following expression for the
Berezin-Toeplitz quantization map
$$T_N(f) = (\dim_{\mathbb{C}} H_N) \int_M f(x) P_N(x)\, d\mu(x).$$
From this expression, it is obvious that $T_N$ is $G$-equivariant. Using the fact that
the map $T_N : C^\infty(M) \to A_N$ is surjective (see [7, Proposition 4.2]), one can deduce
that the algebra $A_N$ is $G$-equivariantly isomorphic to a submodule of $L^2(M)$. As
shown by Bordemann, Meinrenken and Schlichenmaier (see [7]), the maps $T_N$ have
the correct semi-classical behavior for $N \to \infty$. In particular, the following results
hold.

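Theorem 2. For $f, h \in C^\infty(M)$, we have

(1) $\|T_N(f)\|_{op} \to \|f\|_\infty$ as $N \to \infty$;
(2) $\|T_N(fh) - T_N(f) T_N(h)\|_{op} \to 0$ as $N \to \infty$.

Here $\|\cdot\|_{op}$ is the operator norm on $A_N$ and $\|\cdot\|_\infty$ is the sup-norm on $C^\infty(M)$.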

Remark. Let $l$ be a continuous length function on $G$ satisfying the condition
$l(xyx^{-1}) = l(y)$ for all $x, y \in G$. Let $\rho$ be the action of $G$ on $A_N$ by conjugation by
$\pi_{N\varpi}$. Then $l$ and $\rho$ determine a Lipschitz seminorm $L_N$ on $A_N$ by
$$L_N(A) = \sup \left\{ \frac{\|\rho_x(A) - A\|_{op}}{l(x)} ;\ x \ne e \right\},$$
where $e$ is the identity element of $G$. Let $C(G/K)$ be the $C^*$-algebra of continuous
complex-valued functions on $G/K$. We denote by $\tau$ the action of $G$ on $G/K$ and on
$C(G/K)$ by left translation. We can define a Lipschitz seminorm on $C(G/K)$ by
$$L_\infty(f) = \sup \left\{ \frac{\|\tau_x(f) - f\|_\infty}{l(x)} ;\ x \ne e \right\}.$$
Let us underline that the pairs $(A_N, L_N)$ and $(C(G/K), L_\infty)$ are compact quantum
metric spaces in the sense defined by Rieffel in [27].
Motivated by the notion of Gromov-Hausdorff convergence of classical compact
metric spaces, Rieffel gave in [27] a definition of a quantum Gromov-Hausdorff
distance between two compact quantum metric spaces. Furthermore, he proved in
[28] that the sequence $\{(A_N, L_N)\}_{N \ge 1}$ converges to $(C(G/K), L_\infty)$ for this distance
as $N \to \infty$.

4.3. Fuzzy version of M


Now we are in a position to prove our main result.

Theorem 3. Let $(G, K)$, $M$ and $A_N$ be as above. Then there exists a sequence
$(E_N)_{N \ge 1}$ of $G$-invariant subspaces of $L^2(M)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$
is dense in $C^\infty(M)$, and such that $E_N$ is $G$-equivariantly isomorphic to the matrix
algebra $A_N$.

Proof. If $(G, K)$ is an irreducible compact Hermitian symmetric pair, then the
result of the theorem follows immediately by comparing Proposition 3.1
and Theorem 4.2 in the paper of Zhang mentioned in the introduction ([31]). Thus
it suffices to prove the theorem in the following two cases:

Case 1. Assume that $(G, K) \simeq (SO(2n+1), U(n))$ with $n \ge 2$. Let the notations
of roots and weights be as in Sec. 3.1. The Gelfand node associated to the pair
$(SO(2n+1), U(n))$ is $\alpha = \alpha_n = e_n$ and the fundamental weight attached to this
simple root is $\varpi = \frac{1}{2}(e_1 + \cdots + e_n)$. Consider the holomorphic line bundle $L =
SO(2n+1) \times_{e^{-\varpi}} \mathbb{C}$ over the homogeneous space $SO(2n+1)/U(n)$. As $SO(2n+1)$-modules,
$H_N = \Gamma_{\mathrm{hol}}(L^N) = V(N\varpi)$ and $A_N = V(N\varpi)^* \otimes V(N\varpi)$ for $N \in \mathbb{N}$.
Using the result of Corollary 1, one immediately has
$$A_N = \bigoplus_{\substack{\nu = (\nu_1, \ldots, \nu_n) \in \mathbb{N}_0^n \\ N \ge \nu_1 \ge \nu_2 \ge \cdots \ge \nu_n \ge 0}} V(\nu).$$
On the other hand, an important result of Krämer (see [20, Table 1]) says that the
$SO(2n+1)$-module $L^2(SO(2n+1)/U(n))$ decomposes into irreducibles as
$$L^2(SO(2n+1)/U(n)) \cong \widehat{\bigoplus_{\substack{\nu = (\nu_1, \ldots, \nu_n) \in \mathbb{N}_0^n \\ \nu_1 \ge \nu_2 \ge \cdots \ge \nu_n \ge 0}}} V(\nu).$$
Denote by $E_N$ the unique submodule of $L^2(SO(2n+1)/U(n))$ such that
$$E_N \cong \bigoplus_{\substack{\nu = (\nu_1, \ldots, \nu_n) \in \mathbb{N}_0^n \\ N \ge \nu_1 \ge \nu_2 \ge \cdots \ge \nu_n \ge 0}} V(\nu)$$
as $SO(2n+1)$-module. The sequence $(E_N)_{N \ge 1}$ satisfies the assertions of the theorem.


Case 2. Assume that $(G, K) \simeq (Sp(n), U(1) \times Sp(n-1))$ with $n \ge 2$. In the
notations of Sec. 3.2, the Gelfand node associated to the pair $(Sp(n), U(1) \times
Sp(n-1))$ is $\alpha = \alpha_1 = e_1 - e_2$ and the fundamental weight attached to this
simple root is $\varpi = e_1$. Consider the holomorphic line bundle $L = Sp(n) \times_{e^{-\varpi}} \mathbb{C}$
over the homogeneous space $Sp(n)/(U(1) \times Sp(n-1))$ and take $H_N = \Gamma_{\mathrm{hol}}(L^N)$ for
$N \in \mathbb{N}$. As $Sp(n)$-modules, $H_N = V(N\varpi)$ and $A_N = V(N\varpi)^* \otimes V(N\varpi)$. Since the
module $V(N\varpi)$ is self-dual, the result of Proposition 2 shows that
$$A_N = \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0)).$$
As in the previous case, the decomposition into irreducibles of the $Sp(n)$-module
$L^2(Sp(n)/(U(1) \times Sp(n-1)))$ is given by Krämer in [20, Table 1]. One has
$$L^2(Sp(n)/(U(1) \times Sp(n-1))) = \widehat{\bigoplus_{k, l \in \mathbb{N}_0}} V((2k + l, l, 0, \ldots, 0)).$$
Denote by $E_N$ the unique submodule of $L^2(Sp(n)/(U(1) \times Sp(n-1)))$ such that
$$E_N \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0))$$
as $Sp(n)$-module. The sequence $(E_N)_{N \ge 1}$ satisfies the assertions of the theorem.

Finally, we observe that the analysis used in the proof of Theorem 3 directly
implies the following result.
Proposition 3 (Compare [30, Proposition 4.8]). Let $(G, K)$ be a pair from
the list (i)-(iii) in Proposition 1, and let $\alpha \in \Pi$ be the associated Gelfand node
with corresponding fundamental weight $\varpi := \varpi_\alpha$. Then we have a multiplicity free
decomposition of $G$-modules of the form
$$V(\varpi)^* \otimes V(\varpi) \cong \bigoplus_{i=0}^{r} V(\mu_i)$$
for certain $r \in \mathbb{N}$, where $\mu_0 := 0 \in \Lambda^+$ and $\{\mu_i\}_{1 \le i \le r}$ is a subset of the $K$-spherical
dominant weights $\Lambda_K^+$. Furthermore, every $\mu \in \Lambda_K^+$ can uniquely be written as an
$\mathbb{N}_0$-linear combination of the $\mu_i$'s $(1 \le i \le r)$.

Acknowledgments
I would like to express my gratitude to Tilmann Wurzbacher for suggesting the
problem and for helpful discussions. I would also like to thank the anonymous
referee for pointing out to me references [22, 28], and for remarks improving the
article.
References
[1] D. N. Akhiezer, Lie Group Actions in Complex Analysis (Vieweg, Braunschweig,
1995).
[2] A. Y. Alekseev, A. Recknagel and V. Schomerus, Non-commutative world-volume
geometries: Branes on SU(2) and fuzzy spheres, JHEP 09 (1999) 023.
[3] D. Arnal, M. Cahen and S. Gutt, Representations of compact Lie groups and
quantization by deformation, Acad. Roy. Belg. Bull. Cl. Sci. (5) 74 (1988) 123-141.
[4] A. P. Balachandran, T. R. Govindarajan and B. Ydri, The Fermion doubling problem
and noncommutative geometry, Mod. Phys. Lett. A 15 (2000) 1279-1286.
[5] A. P. Balachandran, B. P. Dolan, J. Lee, X. Martin and D. O'Connor, Fuzzy complex
projective spaces and their star-products, J. Geom. Phys. 43 (2002) 184-204.
[6] M. Ben Halima and T. Wurzbacher, Fuzzy complex Grassmannians and quantization
of line bundles, to appear in Abh. Math. Semin. Hamb. Univ.
[7] M. Bordemann, E. Meinrenken and M. Schlichenmaier, Toeplitz quantization of
Kähler manifolds and gl(N), N → ∞ limits, Comm. Math. Phys. 165 (1994)
269-281.
[8] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. I. Geometric
interpretation of Berezin's quantization, J. Geom. Phys. 7 (1990) 45-62.
[9] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. II, Trans.
Amer. Math. Soc. 337 (1993) 73-98.
[10] B. P. Dolan and D. O'Connor, A fuzzy three sphere and fuzzy tori, JHEP 10
(2003) 060.
[11] B. P. Dolan and J. Olivier, Fuzzy complex Grassmannian spaces and their star
products, Internat. J. Modern Phys. A 18 (2003) 1935-1958.
[12] M. R. Douglas and N. A. Nekrasov, Noncommutative field theory, Rev. Mod. Phys.
73 (2001) 977-1029.
[13] J. Fröhlich and K. Gawędzki, Conformal field theory and geometry of strings, in
Mathematical Quantum Theory (Vancouver, 1993), Proceedings of the Conference
on Mathematical Quantum Theory, Vancouver, Canada (Amer. Math. Soc. 1993),
pp. 57-97.
[14] M. Fulmek and C. Krattenthaler, Lattice path proofs for determinantal formulas for
symplectic and orthogonal characters, J. Combin. Theory Ser. A 77 (1997) 3-50.
[15] H. Grosse, C. Klimčík and P. Prešnajder, Simple field theoretical models on
noncommutative manifolds, in Lie Theory and Its Applications in Physics (Clausthal,
1995) (World Sci. Publishing, River Edge, NJ, 1996), pp. 117-131.
[16] H. Grosse and A. Strohmaier, Noncommutative geometry and the regularization
problem of 4D quantum field theory, Lett. Math. Phys. 48 (1999) 163-179.
[17] Y. Hikida, M. Nozaki and Y. Sugawara, Formation of spherical 2D-brane from
multiple D0-branes, Nucl. Phys. B 617 (2001) 117-150.
[18] R. C. King and N. G. I. El-Sharkaway, Standard Young tableaux and weight
multiplicities of the classical Lie groups, J. Phys. A 16 (1983) 3153-3178.
[19] A. W. Knapp, Lie Groups Beyond an Introduction, 2nd edn. (Birkhäuser, Boston,
2002).
[20] M. Krämer, Sphärische Untergruppen in kompakten zusammenhängenden Liegruppen,
Compositio Math. 38 (1979) 129-153.
[21] C. Krattenthaler, Identities for classical group characters of nearly rectangular shape,
J. Algebra 209 (1998) 1-64.
[22] C. L. Lazaroiu, D. McNamee and C. Sämann, Generalized Berezin quantization,
Bergmann metrics and fuzzy Laplacians, JHEP 09 (2008) 059.
[23] P. Littelmann, A generalization of the Littlewood-Richardson rule, J. Algebra 130
(1990) 328-368.
[24] J. Madore, The fuzzy sphere, Class. Quantum Grav. 9 (1992) 69-87.
[25] J. Madore, An Introduction to Noncommutative Differential Geometry and Its
Physical Applications, 2nd edn. (Cambridge University Press, Cambridge, 1999).
[26] S. Okada, Applications of minor summation formulas to rectangular-shaped
representations of classical groups, J. Algebra 205 (1998) 337-367.
[27] M. A. Rieffel, Gromov-Hausdorff distance for quantum metric spaces, Mem. Amer.
Math. Soc. 168 (2004) 1-65.
[28] M. A. Rieffel, Matrix algebras converge to the sphere for quantum Gromov-Hausdorff
distance, Mem. Amer. Math. Soc. 168 (2004) 67-91.
[29] M. Schlichenmaier, Berezin-Toeplitz quantization and Berezin symbols for arbitrary
compact Kähler manifolds, in Coherent States, Quantization and Gravity (Białowieża,
1998), Proc. XVII Workshop on Geometric Methods in Physics (Warsaw Univ. Press,
2001), pp. 45-56.
[30] J. V. Stokman and M. S. Dijkhuizen, Quantized flag manifolds and irreducible
*-representations, Comm. Math. Phys. 203 (1999) 297-324.
[31] G. Zhang, Berezin transform on compact Hermitian symmetric spaces, Manuscripta
Math. 97 (1998) 371-388.
June 2, 2010 14:55 WSPC/S0129-055X 148-RMP J070-00403

Reviews in Mathematical Physics
Vol. 22, No. 5 (2010) 549-596
© World Scientific Publishing Company
DOI: 10.1142/S0129055X1000403X

ON THE FEYNMAN PATH INTEGRAL FOR NONRELATIVISTIC QUANTUM ELECTRODYNAMICS

WATARU ICHINOSE
Department of Mathematical Science, Shinshu University,
Matsumoto 390-8621, Japan
ichinose@math.shinshu-u.ac.jp

Received 17 March 2008


Revised 26 March 2010

The Feynman path integral for regularized nonrelativistic quantum electrodynamics is
studied rigorously. We begin with the Lagrangian function of the corresponding
classical mechanics and construct the Feynman path integral. In the present paper, the
electromagnetic potentials are assumed to be periodic with respect to a large box and
quantized through their Fourier coefficients with large wave numbers cut off. Firstly,
the Feynman path integral with respect to paths on the space of particles and vector
potentials is defined rigorously by means of broken line paths under the constraints.
Secondly, the Feynman path integral with respect to paths on the space of particles and
electromagnetic potentials is also defined rigorously by means of broken line paths and
piecewise constant paths without the constraints. This Feynman path integral is stated
heuristically in Feynman and Hibbs' book. Thirdly, the vacuum and the state of photons
of given momenta and polarizations are expressed concretely as functions of variables
consisting of the Fourier coefficients of vector potentials. It is also proved rigorously
in terms of distribution theory that the Coulomb potentials between charged particles
naturally appear in the above Feynman path integral approach. This shows that the
photons give rise to the Coulomb force.

Keywords: Feynman path integral; quantum electrodynamics.

Mathematics Subject Classification 2010: 81S40, 58D30

1. Introduction
A number of mathematical results on the Feynman path integrals for quantum
mechanics have been obtained. On the other hand, the author does not know any
mathematical results on the Feynman path integrals for quantum electrodynamics
(cf. [2, 23]), written as QED from now on.
The Feynman path integral for the free relativistic scalar boson field was defined
rigorously in terms of the infinite dimensional Fresnel integral in [2]. The Chern–
Simons functional integral was also defined rigorously, associated with a principal
fiber bundle over $\mathbb{R}^3$ with structure group a compact connected Lie group, as an
infinite dimensional distribution in terms of white noise analysis, and the applica-
tions of its functional integral to the topological quantum field theory were given
in [1]. In [27], the interaction of nonrelativistic particles with a scalar boson field
was studied. There, the functional integral with respect to paths on the space of
particles and the boson field was defined in terms of Markov processes under the
assumption that the mass divided by the imaginary unit and a coupling constant
divided by the imaginary unit are positive. As will be seen in the present paper,
particles interact with the boson field through the quantized vector potential in
QED. On the other hand, in [27], particles interact with the boson field through
the quantized scalar potential, where the vector potential disappears. This is the
main difference between our result and Nelson's.
The spectra of Hamiltonian operators for nonrelativistic QED models have also
been studied (cf. [12, 14, 32]). The Hamiltonian operators in these QED models are
defined by means of the Coulomb potentials, and creation operators and annihi-
lation operators acting on the bosonic Fock space
$\bigoplus_{n=0}^{\infty} \otimes_s^n (L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3))$,
defined dependently on an infrared and ultraviolet cut-off function in momentum
space $\mathbb{R}^3$, where $L^2(\mathbb{R}^3)$ is the space of all square integrable functions in $\mathbb{R}^3$ and
$L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)$ expresses the space of all amplitudes of momentum of a single
photon with polarizations. These QED models are simplified versions of those which
are primarily intended in physics (cf. [10, 11, 29, 33]). A functional integral repre-
sentation for the above nonrelativistic QED model with imaginary time was also
obtained by Hiroshima [16] by means of the probabilistic method. We can see from
Theorem 3.1 in the present paper that the Hamiltonian operators in [12, 14, 16, 32]
are formally like (3.10) in the present paper. But our presentation (3.10) is exhib-
ited as a partial differential operator. In addition, as will be seen in Sec. 5, creation
operators and annihilation operators with given momenta and polarizations acting
on $\mathcal{S}'(\mathbb{R}^{4N})$ are defined and the Hamiltonian operator (3.10) can be written by
means of these creation operators and annihilation operators, where N is a positive
integer determined from the regularization of QED, $\mathcal{S}(\mathbb{R}^{4N})$ denotes the Schwartz
space of all rapidly decreasing functions in $\mathbb{R}^{4N}$ and $\mathcal{S}'(\mathbb{R}^{4N})$ is the dual space of
$\mathcal{S}(\mathbb{R}^{4N})$. This description of the Hamiltonian operator is the one familiar in the
heuristic presentations in physics (cf. [10, 11, 29, 33]).
It is well known that the only translation invariant $\sigma$-additive regular measure on
a separable infinite dimensional Banach space is the identically zero measure (cf. [13,
Chap. 4, Sec. 5, Theorem 4]). The measure defining heuristically the Feynman path
integral is meant to be translation invariant (cf. [11, (7-29)]), so it cannot be realized
as a $\sigma$-additive regular nontrivial measure. As is known (see, e.g., [2, 15, 23]), the
Feynman path integral itself can be realized as a linear functional satisfying certain
suitable continuity conditions.
Our aim in the present paper is to define rigorously the Feynman path integral
for a regularized nonrelativistic QED (for a physical discussion of QED and its
nonrelativistic version, see, e.g., [7, 8, 10, 11, 29]). We begin with the Lagrangian
function of the corresponding classical mechanics, differently from the models
in [12, 14, 16, 32], and construct rigorously the Feynman path integral. Usually in
physics, the Feynman path integral for nonrelativistic QED is only heuristically
defined.
In the present paper, electromagnetic potentials are assumed to be periodic with
respect to a large box in $\mathbb{R}^3$ and quantized through their Fourier coefficients. We
note that in the present paper, regrettably, the Fourier coefficients with large wave
numbers need to be arbitrarily cut off (ultraviolet cut-off) and we do not take the
limit of a box to $\mathbb{R}^3$. In this double sense, our model is regularized.
First, the mathematical definition of the Feynman path integral with respect to
paths on the space of particles and vector potentials is given by means of broken
line paths under the constraints, i.e. (2.20) in the present paper. These constraints
are necessarily introduced in physics (cf., e.g., [11, (9-17)], [29, (A-7)], [32, (13.10)]
and [33, (7.38)]) when electrodynamics is quantized from the classical mechanics. The
reason for introducing the constraints is that a momentum canonically conjugate
to the scalar potential is absent. See (2.3) in the present paper.
Secondly, without the constraints we give the mathematical definition of the
Feynman path integral with respect to paths on the space of particles and electro-
magnetic potentials by means of broken line paths and piecewise constant paths.
This Feynman path integral has been given heuristically by [11, (9-98)]. Our method
of defining the Feynman path integral without the constraints is like the one we
used before in [20] for defining the phase space Feynman path integral. That is,
paths considered on the space of all scalar potentials are determined so that the
derivatives of the Lagrangian function with respect to the variables of the scalar
potential are piecewise constant (Remark 3.4). The author emphasizes again that
no rigorous definition of [11, (9-98)] has been given so far, so our result appears to
be completely new. We note that our Feynman path integral with respect to paths on
the space of particles and electromagnetic potentials can be proved to be equal to the
Feynman path integral with respect to paths on the space of particles and vector
potentials.
Thirdly, the vacuum and the states of photons with given momenta and polar-
izations are expressed concretely as functions of variables consisting of the Fourier
coefficients of vector potentials. In [11], only the vacuum and the state of a pho-
ton with a momentum and a polarization are expressed concretely as functions.
Generally, in physics the vacuum and the states of photons with given momenta
and polarizations are not considered concretely but rather abstractly (cf. [29, 33]).
To write down the state of photons concretely, we introduce creation operators
and annihilation operators, which can be written concretely as first order par-
tial differential operators, similarly as it is done in white noise analysis in [15].
The results stated above should have many applications, as heuristically suggested
in [11, Chap. 9].
Fourthly, we show in terms of distribution theory that the Coulomb potentials
between charged particles appear when the periods of the Fourier series tend to
infinity and the cut-off of the Fourier coefficients is removed. This result, which
shows that photons yield the Coulomb force, is well known in physics (cf. [8, 11]).
In the present paper, we give a rigorous proof of this fact in the frame of our model
of regularized nonrelativistic QED.
The mathematical definition of the Feynman path integral for nonrelativistic
QED with regularization is obtained by means of a somewhat delicate study of
oscillatory integral operators, the abstract Ascoli–Arzelà theorem on the weighted
Sobolev spaces and the uniqueness for the initial value problem for the
Schrödinger type equations as in [18–21].
The vacuum and the states of photons with given momenta and polarizations
are expressed concretely as follows. We first define annihilation
operators of photons with given momenta and polarizations by first order differ-
ential operators having the Fourier coefficients of vector potentials as variables.
Creation operators of photons are defined as the adjoint operators of the annihila-
tion operators. The vacuum is determined from the annihilation operators and the
states of photons with given momenta and polarizations are determined from the
vacuum by means of the creation operators. For the mathematics related to this
see, e.g., [6]. This relies on formal considerations going back to [7].
The proof of the appearance of the Coulomb potentials between charged par-
ticles is given by proving the convergence theorem for the Riemann sum of an
unbounded function as the discretization parameter in space tends to zero, which
will be stated in Proposition 4.3 in the present paper.
Our plan in the present paper is as follows. Section 2 is devoted to prelim-
inaries. In Sec. 3, the main results on the Feynman path integral for regular-
ized nonrelativistic QED are stated. In Sec. 4, the appearance of the Coulomb
potentials between charged particles is proved rigorously in our model. In Sec. 5,
the vacuum and the states of photons with given momenta and polarizations are
given concretely. Sections 6–9 are devoted to the proofs of the main results stated
in Sec. 3.

2. Preliminaries
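For a multi-index $\alpha = (\alpha_1, \ldots, \alpha_d)$ and $z = (z_1, \ldots, z_d) \in \mathbb{R}^d$, we write
$|\alpha| = \sum_{j=1}^{d} \alpha_j$, $z^\alpha = z_1^{\alpha_1} \cdots z_d^{\alpha_d}$, $\partial_z^\alpha = (\partial/\partial z_1)^{\alpha_1} \cdots (\partial/\partial z_d)^{\alpha_d}$ and
$\langle z \rangle = \sqrt{1 + |z|^2}$. Let $L^2 = L^2(\mathbb{R}^d)$ be the space of all square integrable
functions in $\mathbb{R}^d$ with inner product $(\cdot, \cdot)$ and norm $\|\cdot\|$.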
Let $T > 0$ be an arbitrary constant, $t \in [0, T]$ and $x \in \mathbb{R}^3$. We consider n
charged nonrelativistic particles $x^{(j)}(t) \in \mathbb{R}^3$ ($j = 1, 2, \ldots, n$) with mass $m_j > 0$
and charge $e_j \in \mathbb{R}$. Let $E(t, x) = (E_1(t, x), E_2(t, x), E_3(t, x)) \in \mathbb{R}^3$ be the electric
strength and $B(t, x) = (B_1(t, x), B_2(t, x), B_3(t, x)) \in \mathbb{R}^3$ the magnetic strength.
Then the classical equations of motion of $x^{(j)}(t)$ are given by
$$m_j \frac{d}{dt} \dot{x}^{(j)}(t) = e_j E(t, x^{(j)}(t)) + e_j \dot{x}^{(j)}(t) \times B(t, x^{(j)}(t)), \quad \dot{x}^{(j)}(t) = \frac{d}{dt} x^{(j)}(t).$$
Let $\phi(t, x) \in \mathbb{R}$ be a scalar potential and $A(t, x) \in \mathbb{R}^3$ a vector potential. We set
$$x(t) := (x^{(1)}(t), \ldots, x^{(n)}(t)) \in \mathbb{R}^{3n},$$
$$\dot{x}(t) := (\dot{x}^{(1)}(t), \ldots, \dot{x}^{(n)}(t)) \in \mathbb{R}^{3n}.$$

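Then the Lagrangian function for particles and the electromagnetic field with the
distributional charge density
$$\rho(t, x) = \sum_{j=1}^{n} e_j \delta(x - x^{(j)}(t)) \tag{2.1}$$
and the distributional current density
$$j(t, x) = \sum_{j=1}^{n} e_j \dot{x}^{(j)}(t) \delta(x - x^{(j)}(t)) \in \mathbb{R}^3 \tag{2.2}$$
is given in distributional sense by
$$
\begin{aligned}
& L\Big(t, x, \dot{x}, A, \dot{A}, \frac{\partial A}{\partial x}, \phi, \frac{\partial \phi}{\partial x}\Big) \\
&\quad = \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \int \rho(t, x) \phi(t, x)\,dx + \frac{1}{c} \int j(t, x) \cdot A(t, x)\,dx \\
&\qquad + \frac{1}{8\pi} \int_{\mathbb{R}^3} (|E(t, x)|^2 - |B(t, x)|^2)\,dx + C \\
&\quad = \sum_{j=1}^{n} \Big\{ \frac{m_j}{2} |\dot{x}^{(j)}|^2 - e_j \phi(t, x^{(j)}) + \frac{1}{c} e_j \dot{x}^{(j)} \cdot A(t, x^{(j)}) \Big\} \\
&\qquad + \frac{1}{8\pi} \int_{\mathbb{R}^3} (|E(t, x)|^2 - |B(t, x)|^2)\,dx + C
\end{aligned} \tag{2.3}
$$
(cf. [11, 32]), where
$$E = -\frac{1}{c} \frac{\partial A}{\partial t} - \frac{\partial \phi}{\partial x}, \quad B = \nabla \times A, \tag{2.4}$$
$\partial/\partial x = (\partial/\partial x_1, \partial/\partial x_2, \partial/\partial x_3)$ and C is an indefinite constant. It seems that
a nontrivial indefinite constant in (2.3) has not been explicitly discussed by anyone
before (cf. [11, 29, 32]).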
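As in [8, 10, 29] we consider a sufficiently large box
$$V = \Big[-\frac{L_1}{2}, \frac{L_1}{2}\Big] \times \Big[-\frac{L_2}{2}, \frac{L_2}{2}\Big] \times \Big[-\frac{L_3}{2}, \frac{L_3}{2}\Big] \subset \mathbb{R}^3.$$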
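In the present paper, as variables we consider all periodic potentials $\phi(t, x)$ and
$A(t, x)$ in $x \in \mathbb{R}^3$ with periods $L_1$, $L_2$ and $L_3$ satisfying
$$\nabla \cdot A(t, x) = 0 \quad \text{in } [0, T] \times \mathbb{R}^3 \quad \text{(the Coulomb gauge)} \tag{2.5}$$
and also
$$\int_V \phi(t, x)\,dx = 0, \quad \int_V A(t, x)\,dx = 0. \tag{2.6}$$

Let $|V| = L_1 L_2 L_3$. We set
$$k := \Big( \frac{2\pi}{L_1} s_1, \frac{2\pi}{L_2} s_2, \frac{2\pi}{L_3} s_3 \Big) \quad (s_1, s_2, s_3 = 0, \pm 1, \pm 2, \ldots). \tag{2.7}$$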

Then, using the Gram–Schmidt method, we can easily determine $\vec{e}_j(k) \in \mathbb{R}^3$
($j = 1, 2$) such that $(\vec{e}_1(k), \vec{e}_2(k), k/|k|)$ for all $k \neq 0$ form a set of mutually
orthogonal unit vectors in $\mathbb{R}^3$ and
$$\vec{e}_j(-k) = -\vec{e}_j(k) \quad (j = 1, 2) \tag{2.8}$$
(cf. [3, p. 448]). We fix these $\vec{e}_j(k)$ hereafter. Noting (2.5) and (2.6), we can expand
$\phi(t, x)$ and $A(t, x)$ formally into the Fourier series
$$A(x, \{a_{lk}(t)\}) = \frac{\sqrt{4\pi}}{|V|} c \sum_{k \neq 0} \{ a_{1k}(t) e^{ik \cdot x} \vec{e}_1(k) + a_{2k}(t) e^{ik \cdot x} \vec{e}_2(k) \}, \tag{2.9}$$
$$\phi(x, \{\phi_k(t)\}) = \frac{1}{|V|} \sum_{k \neq 0} \phi_k(t) e^{ik \cdot x}. \tag{2.10}$$

Remark 2.1. Usually in the physical literature (cf. [11, 29]) the condition (2.6) is
not stated clearly.
We write
$$a_{lk} =: \frac{a_{lk}^{(1)} - i a_{lk}^{(2)}}{\sqrt{2}} \quad (l = 1, 2), \tag{2.11}$$
$$\phi_k =: \phi_k^{(1)} - i \phi_k^{(2)}, \tag{2.12}$$
where $a_{lk}^{(i)} \in \mathbb{R}$ and $\phi_k^{(i)} \in \mathbb{R}$, and also the complex conjugate of $a_{lk}$ as $a_{lk}^{*}$. Since A
and $\phi$ are real valued, the relations
$$a_{l,-k}^{(1)} = -a_{lk}^{(1)}, \quad a_{l,-k}^{(2)} = a_{lk}^{(2)}, \quad \phi_{-k}^{(1)} = \phi_k^{(1)}, \quad \phi_{-k}^{(2)} = -\phi_k^{(2)} \tag{2.13}$$
hold from (2.8). So, from (2.9) and (2.10), we have
$$A(x, \{a_{lk}\}) = \frac{\sqrt{4\pi}}{|V|} c \frac{1}{\sqrt{2}} \sum_{k \neq 0} \sum_{l=1}^{2} (a_{lk}^{(1)} \cos k \cdot x + a_{lk}^{(2)} \sin k \cdot x) \vec{e}_l(k), \tag{2.14}$$
$$\phi(x, \{\phi_k\}) = \frac{1}{|V|} \sum_{k \neq 0} (\phi_k^{(1)} \cos k \cdot x + \phi_k^{(2)} \sin k \cdot x). \tag{2.15}$$

We also write
$$\rho_k^{(1)}(x) := \sum_{j=1}^{n} e_j \cos k \cdot x^{(j)}, \tag{2.16}$$
$$\rho_k^{(2)}(x) := \sum_{j=1}^{n} e_j \sin k \cdot x^{(j)}. \tag{2.17}$$

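Determining the constant C in the Lagrangian function (2.3) formally as the
infinite constant
$$\frac{2\pi}{|V|} \sum_{j=1}^{n} e_j^2 \sum_{k \neq 0} \frac{1}{|k|^2} + 2 \sum_{k \neq 0} \frac{\hbar c |k|}{2}, \tag{2.18}$$
we can write L from (2.3) by means of (2.4), (2.9), (2.10) and (2.15) as
$$
\begin{aligned}
& L(x, \dot{x}, \{a_{lk}\}, \{\dot{a}_{lk}\}, \{\phi_k\}) \\
&\quad = \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 + \frac{1}{8\pi|V|} \sum_{k \neq 0} \Big\{ \sum_{i=1}^{2} \big(|k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(x) \phi_k^{(i)}\big) + 16\pi^2 \frac{\sum_{j=1}^{n} e_j^2}{|k|^2} \Big\} \\
&\qquad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot A(x^{(j)}, \{a_{lk}\}) \\
&\qquad + \frac{1}{2} \sum_{k \neq 0, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\}.
\end{aligned} \tag{2.19}
$$
The reason why we have chosen the indefinite constant C in (2.3) in the way given
by (2.18) will be explained in Remark 5.1.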
Remark 2.2. If we do not assume (2.6), we must add $-(1/|V|)(\sum_{j=1}^{n} e_j)\phi_0^{(1)}$ and
$\sum_{i,l=1,2} (\dot{a}_{l0}^{(i)})^2/(4|V|)$ to (2.19).

If we take into account the constraints $\nabla \cdot E = 4\pi\rho$ as in [11, (9-17)] and [33, (7.38)],
we have
$$|k|^2 \phi_k^{(i)} = 4\pi \rho_k^{(i)}(x) \quad (i = 1, 2,\ k \neq 0) \tag{2.20}$$
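and $\sum_{j=1}^{n} e_j = 0$ formally from (2.1), (2.4) and (2.5). But, in the present paper, we
adopt only (2.20) as constraints. Then from (2.16) and (2.17), we have
$$
\begin{aligned}
& \sum_{i=1}^{2} \big(|k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(x) \phi_k^{(i)}\big) + 16\pi^2 \frac{\sum_{j=1}^{n} e_j^2}{|k|^2} \\
&\quad = -\frac{16\pi^2}{|k|^2} \Big\{ (\rho_k^{(1)})^2 + (\rho_k^{(2)})^2 - \sum_{j=1}^{n} e_j^2 \Big\} \\
&\quad = -\frac{16\pi^2}{|k|^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l e^{ik \cdot x^{(j)}} e^{-ik \cdot x^{(l)}} \\
&\quad = -\frac{16\pi^2}{|k|^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l \cos k \cdot (x^{(j)} - x^{(l)}).
\end{aligned} \tag{2.21}
$$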

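So we get
$$
\begin{aligned}
& L_c(x, \dot{x}, \{a_{lk}\}, \{\dot{a}_{lk}\}) \\
&\quad = \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \frac{2\pi}{|V|} \sum_{k \neq 0} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\qquad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot A(x^{(j)}, \{a_{lk}\}) \\
&\qquad + \frac{1}{2} \sum_{k \neq 0, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\}.
\end{aligned} \tag{2.22}
$$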
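
The passage from the second to the last line of (2.21) rests on the elementary identity $(\rho_k^{(1)})^2 + (\rho_k^{(2)})^2 - \sum_j e_j^2 = \sum_{j \neq l} e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})$, with $\rho_k^{(1)}$ and $\rho_k^{(2)}$ as in (2.16) and (2.17). The following Python sketch checks this identity numerically. The function names, the randomly drawn charges and positions, and the sample wave vector are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def dot(u, v):
    """Euclidean inner product of two 3-vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


def rho(k, charges, positions):
    """Return (rho_k^(1), rho_k^(2)) as defined in (2.16) and (2.17)."""
    rho1 = sum(e * math.cos(dot(k, x)) for e, x in zip(charges, positions))
    rho2 = sum(e * math.sin(dot(k, x)) for e, x in zip(charges, positions))
    return rho1, rho2


def pair_sum(k, charges, positions):
    """Sum over all ordered pairs j != l of e_j e_l cos(k . (x_j - x_l))."""
    total = 0.0
    n = len(charges)
    for j in range(n):
        for l in range(n):
            if j != l:
                diff = [a - b for a, b in zip(positions[j], positions[l])]
                total += charges[j] * charges[l] * math.cos(dot(k, diff))
    return total


def check_identity(seed=0, n=4):
    """Evaluate both sides of the identity for random illustrative data."""
    rng = random.Random(seed)
    charges = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    positions = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(n)]
    k = [rng.uniform(-3.0, 3.0) for _ in range(3)]
    rho1, rho2 = rho(k, charges, positions)
    lhs = rho1 ** 2 + rho2 ** 2 - sum(e * e for e in charges)
    rhs = pair_sum(k, charges, positions)
    return lhs, rhs


if __name__ == "__main__":
    left, right = check_identity()
    print(left, right)
```

Both returned values should agree up to rounding, since $(\rho_k^{(1)})^2 + (\rho_k^{(2)})^2 = |\sum_j e_j e^{ik \cdot x^{(j)}}|^2$.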

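We introduce the weighted Sobolev spaces $B^a(\mathbb{R}^d) := \{f \in L^2;\ \|f\|_{B^a} :=
\|f\| + \sum_{|\alpha|=a} (\|z^\alpha f\| + \|\partial_z^\alpha f\|) < \infty\}$ ($a = 1, 2, \ldots$). Let $B^{-a}(\mathbb{R}^d)$ denote
their dual spaces. We set $B^0 := L^2$. Let $\chi \in C^\infty(\mathbb{R}^d)$ with compact sup-
port such that $\chi(0) = 1$. We define the oscillatory integral $\text{Os-}\int g(\cdot, z')\,dz'$ by
$\lim_{\epsilon \to 0} \int \chi(\epsilon z') g(\cdot, z')\,dz'$ independently of the choice of $\chi$ pointwise, in the topology
of $B^a(\mathbb{R}^d)$ or in the topology in $\mathcal{S}(\mathbb{R}^d)$ (cf. [24]) for a function $g(z, z')$ in $\mathbb{R}^d \times \mathbb{R}^d$,
provided the integral involving $\chi$ exists in Lebesgue sense for any $\epsilon > 0$.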

3. Main Results
We arbitrarily cut off the terms of large wave numbers k in (2.22). That is, let
$M_j$ ($j = 1, 2, 3$) be arbitrary positive integers such that $M_2 \leq M_3$. We consider
$$\Lambda_j := \Big\{ k = \Big( \frac{2\pi}{L_1} s_1, \frac{2\pi}{L_2} s_2, \frac{2\pi}{L_3} s_3 \Big);\ s_1^2 + s_2^2 + s_3^2 \neq 0,\ |s_1|, |s_2|, |s_3| \leq M_j \Big\}. \tag{3.1}$$
Then we can determine $\Lambda'_j$ ($j = 1, 2, 3$) such that
$$\Lambda_j =: \Lambda'_j \cup (-\Lambda'_j), \quad \Lambda'_j \cap (-\Lambda'_j) = \text{empty set}, \quad \Lambda'_2 \subset \Lambda'_3 \tag{3.2}$$
and fix $\Lambda'_j$ hereafter. Let $N_j$ denote the number of elements of the set $\Lambda'_j$. It
follows from (2.13) that $a_{\Lambda'_j} := \{a_{lk}^{(i)}\}_{k \in \Lambda'_j, i, l} \in \mathbb{R}^{4N_j}$ are independent variables
(cf. [32, p. 154]).
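We also introduce cut-off functions $g(x) \in C^\infty(\mathbb{R}^3)$ and $\psi(\theta) \in C^\infty(\mathbb{R})$. We
consider
$$
\begin{aligned}
& \tilde{L}_c(x, \dot{x}, \{a_{lk}\}, \{\dot{a}_{lk}\}) \\
&\quad := \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\qquad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot \tilde{A}(x^{(j)}, a_{\Lambda'_2}) \\
&\qquad + \frac{1}{2} \sum_{k \in \Lambda_3, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\}
\end{aligned} \tag{3.3}
$$
in place of $L_c$ given by (2.22), where A given by (2.14) is replaced with
$$\tilde{A}(x, a_{\Lambda'_2}) = \frac{\sqrt{4\pi}}{|V|} c\, g(x) \frac{1}{\sqrt{2}} \sum_{k \in \Lambda_2} \sum_{l=1}^{2} \big(\psi(a_{lk}^{(1)}) \cos k \cdot x + \psi(a_{lk}^{(2)}) \sin k \cdot x\big) \vec{e}_l(k). \tag{3.4}$$
We assume $\psi(-\theta) = -\psi(\theta)$ ($\theta \in \mathbb{R}$).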
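For the sake of simplicity we write $\Lambda' := \Lambda'_3$ and $N := N_3$. We consider a
subdivision
$$\Delta : 0 = \tau_0 < \tau_1 < \cdots < \tau_\nu = T, \quad |\Delta| := \max_{1 \leq l \leq \nu} (\tau_l - \tau_{l-1})$$
of [0, T]. Let $x \in \mathbb{R}^{3n}$ and $a_{\Lambda'} \in \mathbb{R}^{4N}$ be fixed. We take arbitrarily
$$x^{(0)}, \ldots, x^{(\nu-1)} \in \mathbb{R}^{3n}$$
and
$$a_{\Lambda'}^{(0)}, \ldots, a_{\Lambda'}^{(\nu-1)} \in \mathbb{R}^{4N}.$$
Then, we write the oriented broken line path on [0, T] connecting $x^{(l)}$ at $\theta =
\tau_l$ ($l = 0, 1, \ldots, \nu$, $x^{(\nu)} = x$) by $q_\Delta(\theta) \in \mathbb{R}^{3n}$. Of course, $dq_\Delta(\theta)/d\theta =: \dot{q}_\Delta(\theta)$ in
distributional sense is in $L^2([0, T])$. In the same way we define the broken line path
$a_{\Lambda'\Delta}(\theta) \in \mathbb{R}^{4N}$ on [0, T] for $a_{\Lambda'}^{(0)}, \ldots, a_{\Lambda'}^{(\nu-1)}$ and $a_{\Lambda'}$. We define $a_{\Lambda\Delta}(\theta) \in \mathbb{R}^{8N}$ by
means of (2.13). We write the classical action
$$\tilde{S}_c(T, 0; q_\Delta, a_{\Lambda'\Delta}) = \int_0^T \tilde{L}_c(q_\Delta(\theta), \dot{q}_\Delta(\theta), a_{\Lambda\Delta}(\theta), \dot{a}_{\Lambda\Delta}(\theta))\,d\theta. \tag{3.5}$$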
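
To illustrate how a classical action of the type (3.5) is evaluated along a broken line path, the following Python sketch treats a single field mode of (3.3) in isolation, that is, the Lagrangian $\dot{a}^2/(2|V|) - (c|k|)^2 a^2/(2|V|)$ with the particle coupling, the cut-off function and the constant term dropped. This is a harmonic oscillator with mass $1/|V|$ and frequency $c|k|$. The sketch is only an illustration under these simplifying assumptions: the function names, the uniform subdivision and the sample numerical values are hypothetical and do not come from the paper. It compares the action of the broken line path through points of the classical trajectory with the known closed-form oscillator action.

```python
import math


def classical_path(a0, a1, omega, T, tau):
    """Classical oscillator solution with a(0) = a0 and a(T) = a1."""
    coeff = (a1 - a0 * math.cos(omega * T)) / math.sin(omega * T)
    return a0 * math.cos(omega * tau) + coeff * math.sin(omega * tau)


def broken_line_action(points, times, mass, omega):
    """Action of L = mass/2 * adot^2 - mass*omega^2/2 * a^2 along the
    broken line (piecewise linear) path through (times[l], points[l])."""
    action = 0.0
    for l in range(1, len(times)):
        dt = times[l] - times[l - 1]
        p, q = points[l - 1], points[l]
        # The velocity is constant on each segment.
        kinetic = 0.5 * mass * (q - p) ** 2 / dt
        # Exact integral of a(tau)^2 over a linear segment.
        potential = 0.5 * mass * omega ** 2 * dt * (p * p + p * q + q * q) / 3.0
        action += kinetic - potential
    return action


def exact_action(a0, a1, mass, omega, T):
    """Closed-form classical action of the harmonic oscillator."""
    s, c = math.sin(omega * T), math.cos(omega * T)
    return mass * omega / (2.0 * s) * ((a0 * a0 + a1 * a1) * c - 2.0 * a0 * a1)


def action_on_subdivision(nu, a0=0.3, a1=-0.7, mass=1.0, omega=2.0, T=1.0):
    """Broken line action on a uniform subdivision of [0, T] into nu steps."""
    times = [T * l / nu for l in range(nu + 1)]
    points = [classical_path(a0, a1, omega, T, t) for t in times]
    return broken_line_action(points, times, mass, omega)


if __name__ == "__main__":
    reference = exact_action(0.3, -0.7, 1.0, 2.0, 1.0)
    for nu in (10, 100, 1000):
        print(nu, abs(action_on_subdivision(nu) - reference))
```

In this toy setting the difference between the broken line action and the closed-form action shrinks as the subdivision is refined, which is the elementary mechanism behind taking the limit of vanishing mesh in the time-slicing construction.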
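Let $\rho^{*} > 0$ be the constant, which will be defined from $\Lambda_1$, $\Lambda_2$ and $\Lambda_3$ in
Proposition 7.2 of the present paper. See also Remark 7.1. Then we have

Theorem 3.1. We assume for cut-off functions $g(x)$ and $\psi(\theta)$ in (3.4) that for
any $l = 1, 2, \ldots$ and any multi-index $\alpha$ there exist constants $\delta_l > 0$ and $\delta_\alpha > 0$
satisfying
$$|\partial_\theta^l \psi(\theta)| \leq C_l \langle\theta\rangle^{-(1+\delta_l)}, \quad \theta \in \mathbb{R} \tag{3.6}$$
and
$$|\partial_x^\alpha g(x)| \leq C_\alpha \langle x\rangle^{-(1+\delta_\alpha)}, \quad x \in \mathbb{R}^3. \tag{3.7}$$
Let $|\Delta| \leq \rho^{*}$ and $f(x, a_{\Lambda'}) \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \ldots$). Then,
$$
\begin{aligned}
& \prod_{l=1}^{\nu} \Bigg\{ \prod_{j=1}^{n} \sqrt{\frac{m_j}{2\pi i \hbar (\tau_l - \tau_{l-1})}}^{\,3} \Bigg\} \sqrt{\frac{1}{2\pi i \hbar |V| (\tau_l - \tau_{l-1})}}^{\,4N} \\
&\quad \times \text{Os-}\int \cdots \int \big(\exp i\hbar^{-1} \tilde{S}_c(T, 0; q_\Delta, a_{\Lambda'\Delta})\big) f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, dx^{(0)} \cdots dx^{(\nu-1)}\, da_{\Lambda'}^{(0)} \cdots da_{\Lambda'}^{(\nu-1)}
\end{aligned} \tag{3.8}
$$
is well defined in $B^a(\mathbb{R}^{3n+4N})$, which we write as $(C_\Delta(T, 0)f)(x, a_{\Lambda'})$ or
$\int (\exp i\hbar^{-1} \tilde{S}_c(T, 0; q_\Delta, a_{\Lambda'\Delta})) f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, \mathcal{D}q_\Delta \mathcal{D}a_{\Lambda'\Delta}$. In addition, as $|\Delta|$
tends to 0, the function $(C_\Delta(T, 0)f)(x, a_{\Lambda'})$ converges to a limit which we call
the Feynman path integral $\int (\exp i\hbar^{-1} \tilde{S}_c(T, 0; q, a_{\Lambda'})) f(q(0), a_{\Lambda'}(0))\, \mathcal{D}q \mathcal{D}a_{\Lambda'}$ in
$B^a(\mathbb{R}^{3n+4N})$. We can also see that this limit is $B^a$-valued continuous and $B^{a-2}$-
valued continuously differentiable in $T \in (0, \infty)$, and satisfies the Schrödinger type
equation
$$i\hbar \frac{\partial}{\partial t} u(t) = H(t)u(t) \tag{3.9}$$
with $u(0) = f$, where
$$
\begin{aligned}
H(t) &= \sum_{j=1}^{n} \frac{1}{2m_j} \Big| \frac{\hbar}{i} \frac{\partial}{\partial x^{(j)}} - \frac{e_j}{c} \tilde{A}(x^{(j)}, a_{\Lambda'_2}) \Big|^2 \\
&\quad + \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\quad + \sum_{k \in \Lambda', i, l} \Big\{ \frac{|V|}{2} \Big( \frac{\hbar}{i} \frac{\partial}{\partial a_{lk}^{(i)}} \Big)^2 + \frac{(c|k|)^2}{2|V|} (a_{lk}^{(i)})^2 - \frac{\hbar c |k|}{2} \Big\}.
\end{aligned} \tag{3.10}
$$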
Remark 3.1. Let us determine the indefinite constant C in (2.3) by
$$\frac{2\pi}{|V|} \sum_{j=1}^{n} e_j^2 \sum_{k \in \Lambda_1} \frac{1}{|k|^2} + 2 \sum_{k \in \Lambda_3} \frac{\hbar c |k|}{2}$$
and cut off the terms of large wave numbers k in (2.19) by introducing $\Lambda_j$ ($j =
1, 2, 3$). Then we get (3.3) again, taking into account the constraints (2.20).
Remark 3.2. Let $0 < \epsilon \leq 1$ and $g_\epsilon(x) \in C^\infty(\mathbb{R}^3)$ satisfy (3.7) for all $\alpha$. Let
$U_\epsilon(t, 0)f$ ($0 \leq t \leq T$) denote the Feynman path integral defined in Theorem 3.1
for $f \in B^a(\mathbb{R}^{3n+4N})$. Suppose that $\partial_x^\alpha g_\epsilon(x)$ are uniformly bounded with respect to
$0 < \epsilon \leq 1$ in $\mathbb{R}^{3n}$ for all $\alpha$ and that $\partial_x^\alpha g_\epsilon(x)$ converges to $\partial_x^\alpha 1$ pointwise in $\mathbb{R}^{3n}$
for all $\alpha$ as $\epsilon$ tends to zero. Then we can prove that as $\epsilon$ tends to zero, $U_\epsilon(t, 0)f$
converges to the solution of (3.9) with $u(0) = f$, where $g(x)$ in (3.4) is replaced
with 1, in $B^a$ uniformly in $t \in [0, T]$. In this way we can remove the cut-off function
$g(x)$ in (3.4). This result will be published in [22].
Remark 3.3. Let $0 \leq t_0 \leq t \leq T$. For $f \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \ldots$) we define
$C_\Delta(t, t_0)f$ with $C_\Delta(t_0, t_0)f = f$ as in (3.8). See (9.3) in the present paper for the
precise definition. As will be seen from the proof of Theorem 3.1 of the present
paper, under the assumptions of Theorem 3.1 $(C_\Delta(t, t_0)f)(x, a_{\Lambda'})$ is well defined in
$B^a$ and $\lim_{|\Delta| \to 0} C_\Delta(t, t_0)f$ exists in $B^a$ uniformly in $0 \leq t_0 \leq t \leq T$, which satisfies
the Schrödinger type equation (3.9) with $u(t_0) = f$.
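In place of L expressed by (2.19) we consider
$$
\begin{aligned}
& \tilde{L}(x, \dot{x}, \{a_{lk}\}, \{\dot{a}_{lk}\}, \{\phi_k\}) \\
&\quad := \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 + \frac{1}{8\pi|V|} \sum_{k \in \Lambda_1} \Big\{ \sum_{i=1}^{2} \big(|k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(x) \phi_k^{(i)}\big) + 16\pi^2 \frac{\sum_{j=1}^{n} e_j^2}{|k|^2} \Big\} \\
&\qquad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot \tilde{A}(x^{(j)}, a_{\Lambda'_2}) \\
&\qquad + \frac{1}{2} \sum_{k \in \Lambda_3, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\}
\end{aligned} \tag{3.11}
$$
by means of (3.4) as in $\tilde{L}_c$.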
Let q () R3n , a () R4N and a () R8N be the broken line paths
dened before. Let k := (k , k ) R2 for k 1 . Take  k ,  k , . . . and  k
(1) (2) (0) (1) (1)

(1) (2)
in R2 arbitrarily. Set k (x) := (k (x), k (x)) by means of (2.16) and (2.17).
Then, we dene the path
(l) 4k (q ())
k () :=  k + R2 , l1 < l (3.12)
|k|2
(l = 1, 2, . . . , ), where k (0) := lim0+0 k (). We set 1 () :=


{k ()}k1 R2N1 . We dene 1 () R4N1 by means of (2.13). Let
x, x , {alk }, {a lk }, {k }) given
q , a , 1 ) be the classical action for L(
S(T, 0; 
by (3.11).

Theorem 3.2. Let || and f (x, a ) B a (R3n+4N ) (a = 0, 1, 2, . . .). Then,


under the assumptions of Theorem 3.1 the function

  n  3
4N
m j 1
2i(l l1 ) 2i|V |(l l1 )
l=1 j=1

 |k|2 (l l1 )  
Os- (exp i1 S(T, 0; q, a , 1 ))

4i 2 |V |
k1

q (0), a (0))dx (0) dx (1)


f (
(0) (1)
 (0) (1) (1)
da  da 
d d d
k k k (3.13)
k1

is well dened in B a (R3n+4N ) and is equal to



(exp i1 Sc (T, 0; q , a ))f (q (0), a (0))Dq Da

dened by (3.8) in Theorem 3.1. So it follows from Theorem 3.1 that as || 0,


then (3.13) converges to the Feynman path integral

(exp i1 S(T, 0; q, a , 1 ))f (q(0), a (0))Dq Da D1 (3.14)

in B a (R3n+4N ), which satises the Schr


odinger type equation (3.9) with u(0) = f .

The Feynman path integral (3.14) is given heuristically in [11, 9-8].

Remark 3.4. As was noted in the introduction, the constraints (2.20) are not
needed in Theorem 3.2 above. The path k () dened by (3.12) is determined so
(i)
that L( q (), a (), a (), 1 ())/k (i = 1, 2) are piecewise con-
q (), 
stant.

Remark 3.5. We take f S(R3n+4N ) and set M0 = [(3n + 4N )/2] + 1, where


[] denotes Gauss symbol. Let = (x, X), and and multi-indices. Then, the
Sobolev inequality shows

sup | f ()|  f  +  ( f ).
R3n+4N ||=M0

It follows from Lemma 2.4 with a = b = 1 in [17] or as in the proof of (7.14) in the
present paper that the right-hand side of the above is bounded by C, f B |+|+M0
with a constant C, . Hence, for || the functions (3.8), (3.13), the limit of
(3.8) as || 0 and the limit of (3.13) as || 0 are well dened in S, so
pointwise.

Remark 3.6. We write (3.13) as $G_\Delta(T, 0)f$. Let $0 \leq t_0 \leq t \leq T$. For $f \in
B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \ldots$) we can define $G_\Delta(t, t_0)f$ as in (3.13) in the same way
that $C_\Delta(t, t_0)f$ is defined in Remark 3.3. See also (9.20) in the present paper for the
precise definition. As will be seen in the proof of Theorem 3.2, under the assump-
tions of Theorem 3.1, $G_\Delta(t, t_0)f$ is well defined in $B^a$ and is equal to $C_\Delta(t, t_0)f$.

We consider an external electromagnetic field $E_{ex}(t, x) = (E_{ex1}(t, x), E_{ex2}(t, x),
E_{ex3}(t, x)) \in \mathbb{R}^3$ and $B_{ex}(t, x) = (B_{ex1}(t, x), B_{ex2}(t, x), B_{ex3}(t, x)) \in \mathbb{R}^3$ such that
$\partial_x^\alpha E_{exj}(t, x)$, $\partial_x^\alpha B_{exj}(t, x)$ and $\partial_t B_{exj}(t, x)$ ($j = 1, 2, 3$) are continuous in $[0, T] \times
\mathbb{R}^n$ for all $\alpha$. Let $\phi_{ex}(t, x) \in \mathbb{R}$ and $A_{ex}(t, x) \in \mathbb{R}^3$ be the electromagnetic potential
to $E_{ex}$ and $B_{ex}$. Then we obtain Theorem 3.3 below. Though Theorem 3.3 gives
the generalization of Theorems 3.1 and 3.2, the results are stated separately from
Theorems 3.1 and 3.2 to avoid confusion.
We replace $\tilde{A}(x^{(j)}, a_{\Lambda'_2})$ in (3.3), (3.10) and (3.11) by $\tilde{A}(x^{(j)}, a_{\Lambda'_2}) + A_{ex}(t, x^{(j)})$.
Moreover we add $-\sum_{j=1}^{n} e_j \phi_{ex}(t, x^{(j)})$ to (3.3) and (3.11), and $\sum_{j=1}^{n} e_j \phi_{ex}(t, x^{(j)})$
to (3.10), respectively. Then we have

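Theorem 3.3. Besides the assumptions of Theorem 3.1 we suppose as in [19–21]
that for any $\alpha \neq 0$ there exist constants $C_\alpha$ and $\delta_\alpha > 0$ satisfying
$$|\partial_x^\alpha E_{exj}(t, x)| \leq C_\alpha, \quad |\partial_x^\alpha B_{exj}(t, x)| \leq C_\alpha \langle x\rangle^{-(1+\delta_\alpha)} \tag{3.15}$$
and
$$|\partial_x^\alpha A_{exj}(t, x)| \leq C_\alpha, \quad |\partial_x^\alpha \phi_{ex}(t, x)| \leq C_\alpha \langle x\rangle \tag{3.16}$$
for j = 1, 2 and 3 in $[0, T] \times \mathbb{R}^n$. Then, the same assertions as in Theorems 3.1 and
3.2 hold.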

Remark 3.7. It follows from [19, Lemma 6.1] that under the assumptions (3.15)
there exist $A_{ex}$ and $\phi_{ex}$ satisfying (3.16).

4. The Appearance of the Coulomb Potentials


We will show rigorously that the Coulomb potentials appear as the limit of the
second term on the right-hand side of (3.3) and the limit of the second term on
the right-hand side of (3.10). This result is well known as a heuristic result in
physics (cf. [8, 11]). We will give a rigorous proof in our model. In the Hamiltonian
operators of QED models in [12, 14, 16, 32], the Coulomb potentials are assumed
from the beginning. Our proof is somewhat delicate.

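Theorem 4.1. Let $L_j$ ($j = 1, 2, 3$) tend to $\infty$ under the condition
$$\frac{1}{m_0} \leq \frac{L_i}{L_j} \leq m_0, \quad i, j = 1, 2, 3 \tag{4.1}$$
for a constant $m_0 \geq 1$. Then we have
$$
\begin{aligned}
& \lim_{L_1, L_2, L_3 \to \infty} \lim_{M_1 \to \infty} \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\quad = \lim_{M_1 \to \infty} \lim_{L_1, L_2, L_3 \to \infty} \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\quad = \frac{1}{2} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l}{|x^{(j)} - x^{(l)}|} \quad \text{in } \mathcal{S}'(\mathbb{R}^{3n}).
\end{aligned} \tag{4.2}
$$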

Let $\chi_0(k)$ be the function in $\mathbb{R}^3$ defined by
$$\chi_0(k) := \begin{cases} 1, & |k| \leq 1, \\ 0, & |k| > 1. \end{cases} \tag{4.3}$$
We first prove

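Lemma 4.2. Let $\epsilon > 0$. Then we have
$$
\begin{aligned}
& \lim_{\epsilon \to 0} \frac{1}{(2\pi)^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l \int \frac{\cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \chi_0(\epsilon k)\,dk \\
&\quad = \frac{1}{2} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l}{|x^{(j)} - x^{(l)}|} \quad \text{in } \mathcal{S}'(\mathbb{R}^{3n}).
\end{aligned} \tag{4.4}
$$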

Proof. Let x and k be in $\mathbb{R}^3$. Then, it is well known that
$$\frac{1}{(2\pi)^2} \int \frac{e^{ik \cdot x}}{|k|^2}\,dk = \frac{1}{2|x|} \quad \text{in } \mathcal{S}'(\mathbb{R}^3) \tag{4.5}$$
(cf. [25, 5.9]).


For the sake of simplicity, we consider the case n = 2. Let x = x(1) and y = x(2) .
We will prove

1 eik(xy) 1
lim 0 (
k)dk = in S  (R6 ). (4.6)
0 (2)2 |k|2 2|x y|

Let (x, y) S(R6 ). Then, with ,  understood as distributional pairing, from


(4.6) we have
#  ik(xy) $
1 e
lim 0 (
k)dk, (x, y)
0 (2)2 |k|2
#  $
1 cos k (x y)
= lim 0 (
k)dk, (x, y)
0 (2)2 |k|2
#  $
1 sin k (x y)
+i 0 (
k)dk, (x, y)
(2)2 |k|2
#  $
1 cos k (x y)
= lim 0 (
k)dk, (x, y)
0 (2)2 |k|2
# $
1
= , (x, y) .
2|x y|

Consequently we obtain (4.4).


Equation (4.6) is equivalent to
#  $ 
1 eik(xy) 1
lim 0 (
k)dk, (x, y) = (x, y)dxdy (4.7)
0 2 2 |k|2 |x y|

for all (x, y) S(R6 ). We set x = (x y)/ 2 and y  = (x + y)/ 2. Let 1 (x )
and 2 (y  ) be in S(R3 ). We take (x, y) = (x
 , y  ) := 1 (x )2 (y  ) in the left-hand
side of (4.7). Then the left-hand side of (4.7) is equal to

1 eik(xy)
lim 0 (
k)1 (x )2 (y  )dkdx dy 
0 2 2 |k|2
  ik2x 
1   e
= lim 2 1 (x )dx 0 (
k)dk 2 (y  )dy  ,
0 2 |k|2

which is also equal to


   
1  , y)  
(x (x, y)
1 (x )dx 2 (y  )dy  = dx dy = dxdy
2|x | 2|x | |x y|

from (4.5). So, (4.7) holds for (x, y) = 1 (x )2 (y  ). Since the set of all linear
combinations of 1 (x )2 (y  ) for all 1 and 2 in S(R3 ) is dense in S(Rx6  ,y ), so
(4.7) holds for all (x, y) S(R6 ). Hence we get (4.6).
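
The effect of the sharp momentum cut-off in Lemma 4.2 can be seen in closed form for a single fixed point $x \neq 0$. Integrating over the angles first, the integral of $e^{ik \cdot x}/|k|^2$ over $|k| \leq 1/\epsilon$, multiplied by $1/(2\pi)^2$, equals $\mathrm{Si}(|x|/\epsilon)/(\pi |x|)$, where Si denotes the sine integral; since $\mathrm{Si}(s) \to \pi/2$ as $s \to \infty$, the value $1/(2|x|)$ of (4.5) is recovered. The following Python sketch evaluates this numerically. It is only a pointwise illustration (the lemma itself is a statement in the sense of distributions), and the function names, the Simpson quadrature and the sample values are assumptions made for the example.

```python
import math


def sine_integral(x, steps=200000):
    """Si(x) = integral of sin(t)/t over [0, x], composite Simpson rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals

    def integrand(t):
        return 1.0 if t == 0.0 else math.sin(t) / t

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def cutoff_coulomb(r, eps):
    """(2 pi)^(-2) times the integral of exp(i k.x)/|k|^2 over |k| <= 1/eps,
    for |x| = r > 0. The angular integration gives 4 pi sin(|k| r)/(|k| r),
    so the result is Si(r/eps) / (pi r)."""
    return sine_integral(r / eps) / (math.pi * r)


if __name__ == "__main__":
    r = 2.0
    for eps in (0.2, 0.02, 0.002):
        print(eps, cutoff_coulomb(r, eps), 1.0 / (2.0 * r))
```

The printed values oscillate around $1/(2|x|)$ with an amplitude that decays as the cut-off is removed, which is why the convergence is naturally formulated after pairing with a test function.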

Proposition 4.3. Let c 0 be a constant. Let (k) be continuous in R3 \({0}


{|k| = c}). We suppose |(k)| (|k|) (k R3 ). We assume that (r) is non-
increasing in (0, ) and that r2 (r) is in L1 ([0, )) and is bounded in (0, ).

Then, ((2)3 /|V |) k=0 (k) is absolutely convergent, where the sum of k is taken
over (2s1 /L1 , 2s2 /L2 , 2s3 /L3 ) (s1 , s2 , s3 = 0, 1, 2, . . .). We also get

(2)3 
lim (k) = (k)dk (4.8)
L1 ,L2 ,L3 |V |
k=0

under the condition (4.1).

Proof. We write L = (L1 , L2 , L3 ). Let us dene the step function L (k) by


  

2s1 2s2 2s3 2(s1 1) 2s1


L (k) = , , , k ,
L1 L2 L3 L1 L1



2(s2 1) 2s2 2(s3 1) 2s3


, , ,
L2 L2 L3 L3
  

2s1 2s2 2s3 2(s1 1) 2s1


L (k) = , , , k ,
L1 L2 L3 L1 L1
 

2s2 2(s2 1) 2(s3 1) 2s3


, ,
L2 L2 L3 L3

for s1 , s2 , s3 = 1, 2, . . . . Then, for k (2(s1 1)/L1 , 2s1 /L1 ] (2(s2


1)/L2 , 2s2 /L2 ] (2(s3 1)/L3 , 2s3 /L3 ] we have
    
 2s1 2s2 2s3   2s1 2s2 2s3 
|L (k)| =  , ,  , ,  (|k|)
L1 L2 L3   L1 L2 L3 

since (r) is non-increasing. In the same way, for k (2(s1 1)/L1 , 2s1 /L1 ]
[2s2 /L2 , 2(s2 1)/L2 ) (2(s3 1)/L3 , 2s3 /L3 ] we get

|L (k)| (|k|). (4.9)

In the same way as the above, we can dene the step function L (k) for all
k R3 \{0} such that (4.9) and

(2)3  (2)3 
(k) = L (k)dk + (k). (4.10)
|V | R3 |V |
k=0 k=0,s1 s2 s3 =0

For a short while we suppose L1 L2 L3 . Since (r) is non-increasing, it


holds that for s1 2 we have
   
 2s1   2(s1 1) 2 2 
 
, 0, 0    , ,  (|k|),
L1 L1 L2 L3 




2(s1 2) 2(s1 1) 2 2
k , 0, 0,
L1 L1 L2 L3

and also for s1 2 and s2 1


   
 2s1 2s2   2(s1 1) 2s2 2 
 , , 0   , ,  (|k|),
L1 L2 L1 L2 L3 




2(s1 2) 2(s1 1) 2(s2 1) 2s2 2


k , , 0, .
L1 L1 L2 L2 L3

For s2 2, we also have


   
 2 2s2   2 2(s2 1) 2 
 , , 0   , ,  (|k|),
L1 L2 L1 L2 L3 




2 2(s2 2) 2(s2 1) 2
k 0, , 0, .
L1 L2 L2 L3

Thus we get

(2)3  (2)3 
|(k)| (|k|)
|V | |V |
k=0,s3 =0 k=0,s3 =0

(2)3 
(|k|)
|V |
k=0,s3 =0,s1 ,s2 =0,1

(2)3 
+ (|k|)
|V |
k=0,s3 =s1 =0,|s2 |2

+ 10 (|k|)dk. (4.11)
0k3 (2)/L3

We can take a constant 1 m m0 from (4.1) such that L2 mL1 L3 . We


add the renement {((2)/(mL1 ), (2s2 )/L2 , (2s3 )/L3 ); s2 , s3 = 0, 1, 2, . . .} to
{((2s1 )/L1 , (2s2 )/L2 , (2s3 )/L3 ); s1 , s2 , s3 = 0, 1, 2, . . .}. Then, for s2 2
noting
   
 2s2   2 2(s2 1) 2 

 0, 
,0    , ,  (|k|),
L2 mL1 L2 L3 




2 2(s2 2) 2(s2 1) 2
k 0, , 0, ,
mL1 L2 L2 L3

we have
 
(2)3
(|k|) 2 (|k|)dk
m|V | 0k1 (2)/(mL1 ),0k3 (2)/L3
k=0,s3 =s1 =0,|s2 |2

2 (|k|)dk.
0k3 (2)/L3

Consequently, from (4.11), we get

(2)3  (2)3 
|(k)| (|k|)
|V | |V |
k=0,s3 =0 k=0,s3 =0,s1 ,s2 =0,1

+ 2(5 + m0 ) (|k|)dk. (4.12)
0k3 (2)/L3

Let us consider the case of general L1 , L2 and L3 . We may suppose


L1 L2 . Noting L2 m0 L3 from (4.1), we add the renement {((2s1 )/L1 ,
(2s2 )/L2 , (2)/(m0 L3 )); s1 , s2 = 0, 1, 2, . . .} to {((2s1 )/L1 , (2s2 )/L2 ,
(2s3 )/L3 ); s1 , s2 , s3 = 0, 1, 2, . . .}. Then, as in the proof to (4.11), for s1 2
we have
   
 2s1   2(s1 1) 2 2 
  
, 0, 0    , ,  (|k|),
L1 L1 L 2 m0 L 3 




2(s1 2) 2(s1 1) 2 2
k , 0, 0,
L1 L1 L2 m0 L 3

and also for s1 2 and s2 1,


   
 2s1 2s2   2(s1 1) 2s2 2 
 , , 0   , ,  (|k|),
L1 L2 L1 L 2 m0 L 3 




2(s1 2) 2(s1 1) 2(s2 1) 2s2 2


k , , 0, .
L1 L1 L2 L2 m0 L 3

For s2 2 we also have


   
 2 2s2   2 2(s2 1) 2 
 , , 0   , ,  (|k|),
L1 L2 L1 L2 m0 L 3 




2 2(s2 2) 2(s2 1) 2
k 0, , 0, .
L1 L2 L2 m0 L 3

Hence we can prove

(2)3  (2)3 
|(k)| m0 (|k|)
|V | m0 |V |
k=0,s3 =0 k=0,s3 =0

(2)3 
(|k|)
|V |
k=0,s3 =0,s1 ,s2 =0,1

(2)3 
+ (|k|)
|V |
k=0,s3 =s1 =0,|s2 |2

+ 10m0 (|k|)dk.
0k3 (2)/(m0 L3 )

as in the proof of (4.11) and so


(2)3  (2)3 
|(k)| (|k|)
|V | |V |
k=0,s3 =0 k=0,s3 =0,s1 ,s2 =0,1

+ 2m0 (5 + m0 ) (|k|)dk
0k3 (2)/(m0 L3 )

as in the proof of (4.12). Thus, for general L1 , L2 and L3 we obtain


(2)3  (2)3 
|(k)| (|k|)
|V | |V |
k=0,sj =0 k=0,s1 ,s2 ,s3 =0,1

+ 2m0 (5 + m0 ) (|k|)dk (j = 1, 2, 3).
0kj (2)/(m0 Lj )
(4.13)
We assumed that r2 (r) is in L1 (R). So, from (4.9), (4.10) and (4.13) we can

prove that k=0 |(k)| is convergent. In addition, since r2 (r) is assumed to be
bounded in (0, ),
1
0 (|k|) Const. 2 , k = 0
|k|

holds. So we see that ((2)3 /|V |) k=0,s1 ,s2 ,s3 =0,1 (|k|) tends to zero as L1 , L2
and $L_3$ tend to infinity under the condition (4.1). Consequently, from (4.13),
we have
(2)3 
lim (k) = 0, j = 1, 2, 3
L1 ,L2 ,L3 |V |
k=0,sj =0

under (4.1). Hence, noting (4.9), from (4.10) we obtain (4.8) by means of the
Lebesgue dominated convergence theorem.

Now we will prove Theorem 4.1. For the sake of simplicity, let n = 2. Let
0 (k) be the function dened by (4.3). We write x = x(1) and y = x(2) . We take
(x, y) S(R6 ). Then, we have
% &
(2)3  cos k (x y)
0 (
k), (x, y)
|V | |k|2
k=0

(2)3  cos k (x y)
= 0 (
k)(x, y)dxdy
|V | |k|2
k=0

(2)3  cos k (x y)
= 0 (
k)Dx 2 (x, y)dxdy, (4.14)
|V | |k|2 k2
k=0

where we dene Dx  := (1 nj=1 x2j ). Let
2

(k) = |k|2 k2 cos k (x y)Dx 2 (x, y)dxdy

and

1
(|k|) := |Dx 2 (x, y)|dxdy.
|k|2 k2

Then from (4.14), Proposition 4.3 shows


% &
(2)3  cos k (x y)
lim lim 0 (
k), (x, y)
L1 ,L2 ,L3 0 |V | |k|2
k=0

(2)3  cos k (x y)
= lim Dx 2 (x, y)dxdy
L1 ,L2 ,L3 |V | |k|2 k2
k=0
 
1
= dk (cos k (x y))Dx 2 (x, y)dxdy. (4.15)
|k|2 k2

In the same way from (4.14), we also have


% &
(2)3  cos k (x y)
lim lim 0 (
k), (x, y)
0 L1 ,L2 ,L3 |V | |k|2
k=0
 
1
= dk (cos k (x y))Dx 2 (x, y)dxdy. (4.16)
|k|2 k2

On the other hand, Lemma 4.2 and Proposition 4.3 indicate


% &
(2)3  cos k (x y)
lim lim 0 (
k), (x, y)
0 L1 ,L2 ,L3 |V | |k|2
k=0

cos k (x y)
= lim 0 (
k)(x, y)dxdydk
0 |k|2

(x, y)
= 2 2 dxdy. (4.17)
|x y|

Hence we obtain (4.2) together with (4.15) and (4.16).

Remark 4.1. Let (k) S(R3 ) such that (0) = 1 and (k) = (k). We take
the limit of Lj (j = 1, 2, 3) under the condition (4.1). Then it holds that

2  
n
ej el cos k (x(j) x(l) )
lim lim (
k)
0 L1 ,L2 ,L3 |V | |k|2
k=0 j,l=1,j=l

1 
n
ej el
= (4.18)
2 |x(j) x(l) |
j,l=1,j=l

pointwise for x R3n such that x(j) x(l) = 0 (j, l = 1, 2, . . . , n, j = l). The proof is
easy. Consider the case n = 2 and e1 = e2 = 1. Let us write x = x(1) and y = x(2) .

We take 1 (k) C (R3 ) such that 1 (k) = 1 (|k| 1) and 1 (k) = 0 (|k| 2).
Then, Proposition 4.3 says for x = y that the left-hand side of (4.18) is equal to

1 cos k (x y)
lim (
k) dk
(2)2 0 |k|2

1 cos k (x y)
= lim 1 (k)(
k) dk
(2)2 0 |k|2
 
1 2
(cos k (x y)) k {(1 1 (k))(
k)|k| }dk
|x y|2

1 cos k (x y)
= 1 (k) dk
(2)2 |k|2
 
1 2
(cos k (x y)) k {(1 1 (k))|k| }dk (4.19)
|x y|2
pointwise, where k denotes the Laplacian operator with respect to k R3 and
we used |
 (
k)| =
1/3 |k|2/3 (
|k|)2/3 | (
k)| Const.
1/3 |k|2/3 . Since we have
|k {(1 1 (k))(
k)|k|2 }| Ck31/3 with a constant C independent of
, so
we can prove that Eq. (4.19) is also true in the distribution sense in $\mathcal{S}'(\mathbb{R}^6)$. On the other hand, we see as in the proof of Lemma 4.2 that the left-hand side of (4.19) is equal to $1/(2|x-y|)$ in $\mathcal{S}'(\mathbb{R}^6)$. Consequently we can prove that (4.19) is equal to $1/(2|x-y|)$. Hence (4.18) holds pointwise.
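
The value $1/(2|x-y|)$ can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a Gaussian cutoff $e^{-(\epsilon|k|)^2}$ as one admissible choice of the even Schwartz cutoff with value 1 at the origin, and the function name `regularized_coulomb` is hypothetical. After integrating over angles, $(2\pi)^{-2}\int e^{-(\epsilon|k|)^2}\cos(k\cdot x)\,|k|^{-2}\,dk$ reduces to a one-dimensional integral, which is evaluated by the trapezoid rule.

```python
import math


def regularized_coulomb(r, eps, k_max_factor=8.0, n=200_000):
    """Approximate (2*pi)**-2 * int_{R^3} exp(-(eps*|k|)**2) cos(k.x)/|k|**2 dk, |x| = r.

    The angular integration gives 4*pi*sin(k*r)/(k*r), so the integral equals
    (1/pi) * int_0^inf exp(-(eps*k)**2) * sin(k*r)/(k*r) dk.
    The Gaussian cutoff is an illustrative assumption.
    """
    k_max = k_max_factor / eps          # the cutoff is about exp(-64) at k_max
    h = k_max / n
    total = 0.0
    for j in range(n + 1):
        k = j * h
        sinc = 1.0 if k == 0.0 else math.sin(k * r) / (k * r)
        weight = 0.5 if j in (0, n) else 1.0   # trapezoid end-point weights
        total += weight * math.exp(-(eps * k) ** 2) * sinc
    return total * h / math.pi


if __name__ == "__main__":
    for r in (1.0, 2.0):
        print(r, regularized_coulomb(r, eps=0.05), 1.0 / (2.0 * r))
```

For small $\epsilon$ the computed value approaches $1/(2r)$, in line with the pointwise limit stated in Remark 4.1.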

5. The Expression for the Vacuum and the States of Photons


In this section, we express the vacuum and the states of photons with given momenta and polarizations concretely as functions of the variables $a_{\Lambda'}$ consisting of the Fourier coefficients of vector potentials. In [11, Problem 9-8] only the vacuum and the state of a photon of momentum $\hbar k$ and polarization state $l$ are expressed concretely. In this section, we generalize this result in [11] to general states of photons. In physics, the vacuum and the states of photons are usually considered abstractly rather than concretely (cf. [29, 33]). We also note that the states of photons of given momenta and polarizations are not discussed in the studies of QED models defined by means of the functional method (cf. [12, 14, 16, 32]), because in the functional method each photon with polarizations is expressed by an amplitude of momentum in $L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)$, as stated in the introduction.
To write down the vacuum and the states of photons concretely, we introduce creation operators and annihilation operators acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. Let us define
  
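$$
\hat a^{(i)}_{lk} := i\sqrt{\frac{|V|}{2\hbar c|k|}}\left(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}} - i\,\frac{c|k|}{|V|}\,a^{(i)}_{lk}\right)
= \sqrt{\frac{|V|}{2\hbar c|k|}}\left(\hbar\frac{\partial}{\partial a^{(i)}_{lk}} + \frac{c|k|}{|V|}\,a^{(i)}_{lk}\right) \tag{5.1}
$$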

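acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k \in \Lambda$ and $i, l = 1, 2$. From (2.13) we have
$$
\hat a^{(1)}_{l,-k} = -\hat a^{(1)}_{lk}, \qquad \hat a^{(2)}_{l,-k} = \hat a^{(2)}_{lk}.
$$
Let $\hat a^{(i)\dagger}_{lk}$ denote the formal adjoint operator $\sqrt{|V|/(2\hbar c|k|)}\,\bigl(-\hbar\,\partial/\partial a^{(i)}_{lk} + c|k|\,a^{(i)}_{lk}/|V|\bigr)$ of $\hat a^{(i)}_{lk}$ acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. For $f \in \mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ and $g \in \mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$ we have
$$
(\hat a^{(i)\dagger}_{lk} f, g) = (f, \hat a^{(i)}_{lk} g)
$$
from the definition of the distribution. So, $\hat a^{(i)\dagger}_{lk}$ is continuous from $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ into $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in the weak topology. In the same way $\hat a^{(i)}_{lk}$ is continuous from $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ into $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in the weak topology.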

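We can easily see from (5.1) that the commutator relations
$$
[\hat a^{(i)}_{lk}, \hat a^{(i')\dagger}_{l'k'}] = \delta_{ii'}\delta_{ll'}\delta_{kk'}, \qquad [\hat a^{(i)}_{lk}, \hat a^{(i')}_{l'k'}] = 0
$$
on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$, and so on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$, hold for $k$ and $k'$ in the bounded domain $\Lambda'$ (cf. [7, 34] and [6, 30]), because $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$ is dense in $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in the weak topology (cf. [26]) and the operators on both sides above are continuous in $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. We define the operator $\hat a_{lk}$ acting on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k \in \Lambda$ and $l = 1, 2$ by
$$
\hat a_{lk} := \frac{\hat a^{(1)}_{lk} - i\,\hat a^{(2)}_{lk}}{\sqrt{2}} \tag{5.2}
$$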

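(cf. (2.11)). We call $\hat a_{lk}$ the annihilation operator and $\hat a^{\dagger}_{lk}$ the creation operator. We can easily see from the commutator relations for $\hat a^{(i)}_{lk}$ that the operators $\hat a_{lk}$ and $\hat a^{\dagger}_{lk}$ also satisfy the commutator relations
$$
[\hat a_{lk}, \hat a^{\dagger}_{l'k'}] = \delta_{ll'}\delta_{kk'}, \qquad [\hat a_{lk}, \hat a_{l'k'}] = 0 \tag{5.3}
$$
on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k$ and $k'$ in $\Lambda$ (cf. [29, (2.26)]). It follows from the commutator relations (5.3) that we have
$$
\hat a_{lk}(\hat a^{\dagger}_{lk})^{n} - (\hat a^{\dagger}_{lk})^{n}\hat a_{lk} = n\,(\hat a^{\dagger}_{lk})^{n-1} \tag{5.4}
$$
on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ (cf. [7, 34]). Then we get the following expression as in physics (cf., e.g., [29, (2.60) and (2.64)], and [33, (6.165) and (6.172)]).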
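
The identity (5.4) can be illustrated for a single mode with finite matrices. This is only a sketch under stated assumptions: the operators are replaced by their matrices in the number basis, truncated to `dim` levels, and the names `ladder_operators` and `check_identity` are hypothetical. Truncation spoils the relation on the highest levels, so the comparison is restricted to states $|m\rangle$ with $m + n < \mathrm{dim}$.

```python
import numpy as np


def ladder_operators(dim):
    """Truncated annihilation and creation matrices: a|m> = sqrt(m)|m-1>."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    return a, a.T.copy()


def check_identity(n, dim=12):
    """Check a (a^dagger)^n - (a^dagger)^n a = n (a^dagger)^(n-1) below the cutoff."""
    a, adag = ladder_operators(dim)
    adag_n = np.linalg.matrix_power(adag, n)
    adag_nm1 = np.linalg.matrix_power(adag, n - 1)
    lhs = a @ adag_n - adag_n @ a
    rhs = n * adag_nm1
    keep = dim - n   # columns |m> with m + n < dim are unaffected by truncation
    return bool(np.allclose(lhs[:, :keep], rhs[:, :keep]))


if __name__ == "__main__":
    print([check_identity(n) for n in range(1, 6)])
```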

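Proposition 5.1. We can write the last term of $H(t)$ defined by (3.10) as
$$
H_{\mathrm{rad}} := \sum_{k\in\Lambda',\,l}\ \sum_{i=1}^{2}\left\{\frac{|V|}{2}\left(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}}\right)^{2} + \frac{(c|k|)^2}{2|V|}\bigl(a^{(i)}_{lk}\bigr)^2 - \frac{\hbar c|k|}{2}\right\}
= \sum_{k\in\Lambda,\,l}\hbar c|k|\,\hat a^{\dagger}_{lk}\hat a_{lk} \tag{5.5}
$$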

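on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. The vector potential $\tilde A(x, a_{\Lambda'_2})$ defined by (2.14), where the sum over $k$ is taken over $\Lambda'_2$, is given for each $x \in \mathbb{R}^3$ by the expression
$$
\tilde A(x, a_{\Lambda'_2}) = \sqrt{\frac{4\pi\hbar}{|V|}}\;c\sum_{k\in\Lambda_2}\sum_{l=1}^{2}\frac{1}{\sqrt{2c|k|}}\bigl(\hat a_{lk}e^{ik\cdot x} + \hat a^{\dagger}_{lk}e^{-ik\cdot x}\bigr)\vec e_l(k) \tag{5.6}
$$
acting on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$.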

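Proof. From (5.1) and (5.2) we have
$$
\begin{aligned}
\hbar c|k|\bigl(\hat a^{\dagger}_{lk}\hat a_{lk} + \hat a^{\dagger}_{l,-k}\hat a_{l,-k}\bigr)
&= \frac{\hbar c|k|}{2}\Bigl\{(\hat a^{(1)\dagger}_{lk} + i\hat a^{(2)\dagger}_{lk})(\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk}) + (-\hat a^{(1)\dagger}_{lk} + i\hat a^{(2)\dagger}_{lk})(-\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk})\Bigr\}\\
&= \hbar c|k|\bigl(\hat a^{(1)\dagger}_{lk}\hat a^{(1)}_{lk} + \hat a^{(2)\dagger}_{lk}\hat a^{(2)}_{lk}\bigr)\\
&= \sum_{i=1}^{2}\left\{\frac{|V|}{2}\left(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}}\right)^{2} + \frac{(c|k|)^2}{2|V|}\bigl(a^{(i)}_{lk}\bigr)^2 - \frac{\hbar c|k|}{2}\right\}
\end{aligned}
$$
on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k \in \Lambda'$, so we get (5.5) on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in the same way as before.
From (5.1) and (5.2), we have
$$
\begin{aligned}
\hat a_{lk}e^{ik\cdot x} + \hat a^{\dagger}_{lk}e^{-ik\cdot x}
&= \frac{1}{\sqrt 2}\Bigl\{(\hat a^{(1)}_{lk} + \hat a^{(1)\dagger}_{lk})\cos k\cdot x - i(\hat a^{(2)}_{lk} - \hat a^{(2)\dagger}_{lk})\cos k\cdot x\\
&\qquad\quad + i(\hat a^{(1)}_{lk} - \hat a^{(1)\dagger}_{lk})\sin k\cdot x + (\hat a^{(2)}_{lk} + \hat a^{(2)\dagger}_{lk})\sin k\cdot x\Bigr\}\\
&= \sqrt{\frac{|V|}{\hbar c|k|}}\left\{\frac{c|k|}{|V|}a^{(1)}_{lk}\cos k\cdot x + \frac{c|k|}{|V|}a^{(2)}_{lk}\sin k\cdot x - i\hbar(\cos k\cdot x)\frac{\partial}{\partial a^{(2)}_{lk}} + i\hbar(\sin k\cdot x)\frac{\partial}{\partial a^{(1)}_{lk}}\right\}
\end{aligned}
$$
on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$. So, it is shown from (2.8) and (2.13) that
$$
\sum_{k\in\Lambda_2}\frac{1}{\sqrt{2c|k|}}\bigl(\hat a_{lk}e^{ik\cdot x} + \hat a^{\dagger}_{lk}e^{-ik\cdot x}\bigr)\vec e_l(k)
= \sum_{k\in\Lambda_2}\frac{1}{\sqrt{2\hbar|V|}}\bigl(a^{(1)}_{lk}\cos k\cdot x + a^{(2)}_{lk}\sin k\cdot x\bigr)\vec e_l(k)
$$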

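on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$. Hence, we see that the right-hand side of (5.6) is equal to
$$
\frac{\sqrt{4\pi}}{|V|}\,c\sum_{k\in\Lambda_2}\sum_{l=1}^{2}\frac{1}{\sqrt 2}\bigl(a^{(1)}_{lk}\cos k\cdot x + a^{(2)}_{lk}\sin k\cdot x\bigr)\vec e_l(k)
$$
on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$, which is equal to the left-hand side of (5.6) from (2.14).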

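We know
$$
\int_{-\infty}^{\infty} e^{-a\theta^2}\,d\theta = \sqrt{\frac{\pi}{a}}
$$
for a constant $a > 0$. So, we can easily see from (5.2) and (5.5) that
$$
\Psi_0(a_{\Lambda'}) := \prod_{k\in\Lambda',\,l}\sqrt{\frac{c|k|}{\pi\hbar|V|}}\,\exp\left\{-\frac{c|k|}{2\hbar|V|}\Bigl(\bigl(a^{(1)}_{lk}\bigr)^2 + \bigl(a^{(2)}_{lk}\bigr)^2\Bigr)\right\} \tag{5.7}
$$
is the normalized ground state of $H_{\mathrm{rad}}$, called the vacuum, whose energy is 0, i.e.
$$
H_{\mathrm{rad}}\Psi_0 = 0 \tag{5.8}
$$
and that we have
$$
\hat a^{\dagger}_{lk}\Psi_0 = \sqrt{\frac{2c|k|}{\hbar|V|}}\;a^{*}_{lk}\Psi_0, \qquad \hat a_{lk}\Psi_0 = 0 \quad (k \in \Lambda) \tag{5.9}
$$
(cf. [11, 8-1, (9-43) and Problem 9-8]). We know that the eigenvalue 0 of (5.8) is simple in $L^2(\mathbb{R}^{4N})$ (cf. [4, Chap. 3, Theorem 3.4]).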
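
For one mode, the statement that the ground state energy is 0 can be checked numerically. The following sketch is an illustration only: it discretizes the single-variable operator $\frac{|V|}{2}\bigl(\frac{\hbar}{i}\frac{d}{da}\bigr)^2 + \frac{\omega^2}{2|V|}a^2 - \frac{\hbar\omega}{2}$, with $\omega$ standing for $c|k|$, by second-order finite differences on a bounded interval. The grid parameters and the function name `oscillator_levels` are assumptions made for this example.

```python
import numpy as np


def oscillator_levels(omega=1.0, hbar=1.0, volume=1.0, half_width=10.0, n=1001):
    """Three lowest eigenvalues of
        (V/2) (hbar/i d/da)^2 + omega^2/(2V) a^2 - hbar*omega/2
    on [-half_width, half_width] with Dirichlet boundary conditions."""
    a = np.linspace(-half_width, half_width, n)
    h = a[1] - a[0]
    # -(V hbar^2 / 2) d^2/da^2 as a tridiagonal finite-difference matrix
    kinetic = (volume * hbar**2 / (2.0 * h**2)) * (
        2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    )
    potential = np.diag(omega**2 * a**2 / (2.0 * volume) - hbar * omega / 2.0)
    return np.linalg.eigvalsh(kinetic + potential)[:3]


if __name__ == "__main__":
    print(oscillator_levels())
```

Up to discretization error, the lowest eigenvalue is 0 and the levels are spaced by $\hbar\omega$, which is the single-mode content of (5.5) and (5.8).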
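The function $\Psi_{n'lk}(a_{\Lambda'}) := (\hat a^{\dagger}_{lk})^{n'}\Psi_0(a_{\Lambda'})$ ($k \in \Lambda$, $n' = 0, 1, 2, \ldots$), which can be written concretely from (5.1), (5.2) and (5.7), expresses the state of $n'$ photons of momentum $\hbar k$ and polarization state $l$ (cf. [11, 9-2] and [29, 2-2]) and satisfies
$$
\sum_{k,l}\hat a^{\dagger}_{lk}\hat a_{lk}\,\Psi_{n'l'k'} = n'\,\Psi_{n'l'k'}, \qquad
\sum_{k,l}\hbar k\,\hat a^{\dagger}_{lk}\hat a_{lk}\,\Psi_{n'l'k'} = n'(\hbar k')\,\Psi_{n'l'k'}
$$
and
$$
H_{\mathrm{rad}}\Psi_{n'l'k'} = n'(\hbar c|k'|)\,\Psi_{n'l'k'}
$$
from (5.4), (5.5) and (5.9). The operators $\sum_{k,l}\hat a^{\dagger}_{lk}\hat a_{lk}$ and $\sum_{k,l}\hbar k\,\hat a^{\dagger}_{lk}\hat a_{lk}$ are called the total number operator and the momentum operator, respectively (cf. [6], and [29, (2.68) and (2.80)]). Let $n'(l,k) \ge 0$ be integers. Then $\prod_{k,l}(\hat a^{\dagger}_{lk})^{n'(l,k)}\Psi_0(a_{\Lambda'})$ denotes the state of $n'(l,k)$ photons of momentum $\hbar k$ and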

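polarization state $l$ in the same way. Setting $\Psi(a_{\Lambda'}) = \prod_{k,l}(\hat a^{\dagger}_{lk})^{n'(l,k)}\Psi_0(a_{\Lambda'})$, we get
$$
\sum_{k,l}\hat a^{\dagger}_{lk}\hat a_{lk}\,\Psi = \sum_{k,l} n'(l,k)\,\Psi, \tag{5.10}
$$
$$
\sum_{k,l}\hbar k\,\hat a^{\dagger}_{lk}\hat a_{lk}\,\Psi = \sum_{k,l} n'(l,k)\,\hbar k\,\Psi \tag{5.11}
$$
and
$$
H_{\mathrm{rad}}\Psi = \sum_{k,l} n'(l,k)\,\hbar c|k|\,\Psi. \tag{5.12}
$$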

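The family
$$
\Bigl\{\prod_{k\in\Lambda',\,l,\,i}\bigl(\hat a^{(i)\dagger}_{lk}\bigr)^{n'(l,k,i)}\Psi_0\Bigr\}_{n'(l,k,i)=0}^{\infty}
$$
makes a complete orthogonal system in $L^2(\mathbb{R}^{4N})$ (cf. [4, Chap. 3, Theorem 3.1] and [7, 34]). We have
$$
\hat a^{(1)}_{lk} = \frac{\hat a_{lk} - \hat a_{l,-k}}{\sqrt 2}, \qquad \hat a^{(2)}_{lk} = \frac{i(\hat a_{lk} + \hat a_{l,-k})}{\sqrt 2}
$$
from (2.13) and (5.2). So we see together with (5.4) and the second equation in (5.9) that the family
$$
\Bigl\{\prod_{k\in\Lambda,\,l}\frac{1}{\sqrt{n'(l,k)!}}\bigl(\hat a^{\dagger}_{lk}\bigr)^{n'(l,k)}\Psi_0\Bigr\}_{n'(l,k)=0}^{\infty} \tag{5.13}
$$
also makes a complete orthonormal system in $L^2(\mathbb{R}^{4N})$ (cf. [7, 34] and [29, (2.46)]). For example, we have
$$
\begin{aligned}
(\hat a^{\dagger}_{lk}\Psi_0, (\hat a^{\dagger}_{lk})^2\Psi_0) &= (\Psi_0, \hat a_{lk}(\hat a^{\dagger}_{lk})^2\Psi_0)\\
&= (\Psi_0, (\hat a^{\dagger}_{lk})^2\hat a_{lk}\Psi_0) + 2(\Psi_0, \hat a^{\dagger}_{lk}\Psi_0)\\
&= 2(\hat a_{lk}\Psi_0, \Psi_0) = 0.
\end{aligned}
$$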
= 2(

Remark 5.1. We considered the Lagrangian function (3.3) and the Hamiltonian operator (3.10), determining the indefinite constant in (2.3) by (2.18) or in Remark 3.1. On the other hand, in many references (cf. [11, 29, 32]) the indefinite constant is chosen to be 0. Consequently, the term $\infty = (1/2)\sum_{j=1}^{n} e_j^2/|x^{(j)} - x^{(j)}|$ appears in (4.2) from (2.21) and the ground state energy of $H_{\mathrm{rad}}$ is $\sum_{k}\hbar c|k|/2$, which tends to infinity as $M_3$ tends to infinity. Arguments are made about these

infinities in [11, 9-3 and 9-5]. In the present paper, we could see that the infinity arising from the term $(1/2)\sum_{j=1}^{n} e_j^2/|x^{(j)} - x^{(j)}|$ disappears in (4.2) and that the ground state energy of $H_{\mathrm{rad}}$ is 0.

6. Preliminaries for the Proofs of Main Results


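In Secs. 6–9 we often write $\vec x$ and $\vec y$ in $\mathbb{R}^{3n}$ as $x$ and $y$, respectively, for the sake of simplicity when no confusion arises.
Let $0 \le s < t \le T$. For $x$ and $y$ in $\mathbb{R}^{3n}$, we define
$$
q^{t,s}_{x,y}(\theta) := x - \frac{t-\theta}{t-s}\,(x-y), \qquad s \le \theta \le t. \tag{6.1}
$$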
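
As an illustration of (6.1), the straight-line path runs from $y$ at time $s$ to $x$ at time $t$, and its kinetic action $\int_s^t \frac{m}{2}|\dot q(\theta)|^2\,d\theta$ equals $m|x-y|^2/(2(t-s))$, the form of the first term in the classical action along such paths. The sketch below checks this for a single particle; the function names and the sample values are hypothetical and chosen only for this example.

```python
def straight_path(x, y, s, t, theta):
    """Point q(theta) on the segment with q(s) = y and q(t) = x, as in (6.1)."""
    return [xi - (t - theta) / (t - s) * (xi - yi) for xi, yi in zip(x, y)]


def kinetic_action(x, y, s, t, mass, steps=1000):
    """Riemann-sum approximation of int_s^t (mass/2) |dq/dtheta|^2 dtheta."""
    h = (t - s) / steps
    total = 0.0
    for j in range(steps):
        q0 = straight_path(x, y, s, t, s + j * h)
        q1 = straight_path(x, y, s, t, s + (j + 1) * h)
        speed_sq = sum(((b - a) / h) ** 2 for a, b in zip(q0, q1))
        total += 0.5 * mass * speed_sq * h
    return total


if __name__ == "__main__":
    x, y = [1.0, 2.0, 3.0], [0.0, 0.0, 1.0]
    s, t, mass = 0.5, 2.0, 2.0
    exact = mass * sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * (t - s))
    print(kinetic_action(x, y, s, t, mass), exact)
```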

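For $X$ and $Y$ in $\mathbb{R}^{4N}$, we also define
$$
a^{t,s}_{\Lambda' X,Y}(\theta) := X - \frac{t-\theta}{t-s}\,(X-Y), \qquad s \le \theta \le t. \tag{6.2}
$$
Then $a^{t,s}_{\Lambda X,Y}(\theta) \in \mathbb{R}^{8N}$ is defined by means of (2.13). We set
$$
V_1(x) := \frac{2\pi}{|V|}\sum_{k\in\Lambda_1}\ \sum_{j,l=1,\,j\ne l}^{n}\frac{e_j e_l\cos k\cdot(x^{(j)} - x^{(l)})}{|k|^2} \tag{6.3}
$$
and
$$
V_2(a_{\Lambda'}) := \sum_{k\in\Lambda',\,i,\,l}\left\{\frac{(c|k|)^2}{2|V|}\bigl(a^{(i)}_{lk}\bigr)^2 - \frac{\hbar c|k|}{2}\right\}. \tag{6.4}
$$
For the sake of simplicity we suppose $\Lambda'_2 = \Lambda'_3\ (= \Lambda')$ in Secs. 6–9. We write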


x = (x, X) R3n+4N and

t,s
qt,s q t,s
x,y () = (,  x,y (), a X,Y ()) R
1+3n+4N
, st (6.5)

for s < t. Then, from (3.3) and (3.5), we have

t,s
q t,s
Sc (t, s;  x,y , aX,Y )

1 n
= mj |x(j) y (j) |2
2(t s) j=1

  n
(j) , a ) dx(j) V2 (a )dt + |X Y |
2
V1 (x)dt + 1
+ ej A(x
qt,s
x,y
c j=1 2|V |(t s)

 n   
1 t
t
= mj |x(j) y (j) |2 V1 x (x y) d
2(t s) j=1 s ts


1
n 1
+ ej (x(j) y (j) ) (j) (x(j) y (j) ), X (X Y ))d
A(x
c j=1 0

  
|X Y |2 t
t
+ V2 X (X Y ) d
2|V |(t s) s ts

n  1
1
= mj |x (j)
y (j) 2
| (t s) V1 (x (x y))d
2(t s) j=1 0


1
n 1
+ ej (x(j) y (j) ) (j) (x(j) y (j) ), X (X Y ))d
A(x
c j=1 0


|X Y |2 1
+ (t s) V2 (X (X Y ))d. (6.6)
2|V |(t s) 0

Let M 0 and p(x, w, X, W ) a C function in R6n R8N such that


 

|w x W X p(x, w, X, W )| C,, ,  (x; wX; W )M (6.7)

 
with constants C,, ,  , where x; w :=
 all multi-indices , , and 3n+4N
for
1 + |x| + |w| . For f (x, X) S(R
2 2 ) we dene the operator P (t, s) by


n 
 3 4N  

m 1
(exp i1 Sc (t, s; q t,s
j


t,s
x,y , aX,Y )

2i(t s) 2i|V |(t s)

j=1



 

xy X Y



p x, , X, f (y, Y )dydY, s < t,

ts ts



n 
 3  4N  n

m j 1 1 mj |wj |2

Os- exp i

2i 2i|V | 2

j=1 j=1









|W |
2

+ p(x, w, X, W )dwdW f (x, X), s = t.
2|V |
(6.8)

When p(x, w, X, W ) = 1, P (t, s) is called the fundamental operator and denoted


by C(t, s).

Lemma 6.1. Let M1 and M2 be non-negative constants. Suppose that g(x)(x R3 )


and ()( R) in (3.4) satisfy

|x g(x)| C xM1 , x R3

for all and


 k 
d 
 
 dk () Ck  , R
M2


for all k = 0, 1, . . . . Let f S(R3n+4N ). Then, x X

(P (t, s)f )(x, X) are continu-
ous in 0 s t T and (x, X) R 3n+4N
for all and  .

Proof. Let $s < t$ and make the change of variables $y \to w = (x-y)/\sqrt{t-s}$ and $Y \to W = (X-Y)/\sqrt{t-s}$ in (6.8). Then from (6.6) we have

n  3 4N 
m 1
(exp i1 (t, s; x, w, X, W ))
j
P (t, s)f = Os-
j=1
2i 2i|V |

p(x, w, X, W )f (x w, X W )dwdW, = t s, (6.9)

where

(t, s; x, w, X, W )

n
mj 1
:= |w(j) |2 + |W |2 + (t, s; x, w, X, W )
j=1
2 2|V |

 
1  (j)
n n
mj 1 1

:= |w | +
(j) 2
|W |2 V1 (x w)d + ej w
j=1
2 2|V | 0 c j=1
 
(j) w(j) , X W )d
1 1
A(x V2 (X W )d. (6.10)
0 0

We note from (6.8) that (6.9) is also true for t = s.


3
Let L(j) := w(j) 2 (1 im1
(j) (j)
j k=1 wk /wk ) (j = 1, 2, . . . , n) and L1 :=

W 2 (1 i|V | k=1 Wk Wk ). Then, integrating by parts with respect to w and
4N

W in (6.9) by means of L(j) and L1 , we see that the integrand is bounded by

Const.x; Xl w(3n+1) W (4N +1)

for some real constant l. See the proof of [19, Lemma 2.1] for further details.
Consequently, we see that (P (t, s)f )(x, X) is continuous in 0 s t T and
(x, X) R3n+4N . We note (6.9) and (6.10). Then, in the same way as in the above

we can prove that x X (P (t, s)f )(x, X) are continuous in 0 s t T and
(x, X) R3n+4N for all and  .

For 0 1 , 2 1 we set := (1 , 2 ) and

() := t 1 (t s) R,
(j) () := z (j) + 1 (x(j) z (j) ) + 1 2 (y (j) x(j) ) R3 , j = 1, 2, . . . , n,
() := z + 1 (x z) + 1 2 (y x) R , 3n


() := Z + 1 (X Z) + 1 2 (Y X) R4N . (6.11)

We also set

Al (j) Am (j)
Bml (x(j) , a ) = (x , a ) (x , a ) (6.12)
xm xl

for l, m = 1, 2, 3 and j = 1, 2, . . . , n. Then, from (6.6), we have

Lemma 6.2. We can write for s < t

t,s t,s
q t,s
Sc (t, s;  q t,s
z,y , aZ,Y ) Sc (t, s;  z,x , aZ,X )
 
1 
n
x(j) + y (j)
= mj (x(j) y (j) ) z (j)
t s j=1 2
 1 1
V1
+ (t s)(x y) 1 (())d1 d2
0 0 x

n  1
1
+ ej (x(j) y (j) ) (j) (x(j) y (j) ), X (X Y ))d
A(x
c j=1 0

 1 1
1 
n 3
(j) (j)
+ ej (xm ym )(xl zl )
(j) (j)
1 Bml ( (j) (), ())d1 d2
c j=1 0 0
l,m=1
(  1 )
1
n 1
A (j)
+ ej (x(j) y (j) ) (Z X) 1 ( (), ())d1 d2
c j=1 0 0 a

 1 1
1
n 3
Am (j)
+ (X Y ) ej (x(j)
m z (j)
m ) 1 ( (), ())d1 d2
c j=1 m=1 0 0 a
 
1 X +Y
+ (X Y ) Z
(t s)|V | 2
 1 1
V2
+ (t s)(X Y ) 1 (())d1 d2 . (6.13)
0 0 a

Proof. We use (6.6). From (6.5) and (6.11), we see


 
(V1 (x))dt (V1 (x))dt
q zt,s
y
,y q zt,s
x
,x

 3 
n 
(j) (j)
= V1 /xl dt dxl
j=1 l=1

 3  1
n  1 (j)
(j) ( (), l ())
= V1 (())/xl det d1 d2
j=1 l=1 0 0 (1 , 2 )


n 
3  1 1
(j) (j) (j)
= (t s)(xl yl ) 1 V1 (())/xl d1 d2
j=1 l=1 0 0

 1 1
V1
= (t s)(x y) 1 (())d1 d2 , (6.14)
0 0 x

where = (t, s, x, y, z) is the 2-dimensional plane with oriented boundary con-


z,y ()), (, 
sisting of (, q t,s q t,s y,x ()) (s t), and in (6.11)
q s,s
z,x ()) and (, 
gives the positive orientation of . So the second term on the right-hand side of
(6.13) appears. In the same way the last term appears. It is easy to show that the
rst and the 7th terms appear.
As in the proof of (6.14), we have
 
(j) , a ) dx(j)
A(x (j) , a ) dx(j)
A(x
q zt,s
y
,y q zt,s
x
,x
 
= (j) , a ) dx(j) +
A(x (j) , a ) dx(j) )
d(A(x
s,s
qx y
,y
 1
= (x (j)
y (j)
) (j) (x(j) y (j) ), X (X Y ))d
A(x
0

   3 

(j) (i) (i)
+ Bml dx(j)
m dxl m dalk
( Am /alk )dx(j)
1m<l3 k ,i,l m=1

 1
= (x(j) y (j) ) (j) (x(j) y (j) ), X (X Y ))d
A(x
0
 (j) (j) (j) (j)
+ {(x(j)
m ym )(xl
(j)
zl ) (xl yl )(x(j)
m zm )}
(j)

1m<l3
 1 1

1 Bml ( (j) (), ())d1 d2
0 0


3
{(x(j)
m ym )(X Z) (X Y )(xm zm )}
(j) (j) (j)

m=1
 1 1
Am (j)
1
( (), ())d1 d2 . (6.15)
0 0 a
So we can complete the proof of (6.13) from (6.6).

(j)
Let us dene m (t, s; x(j) , y (j) , z (j) , X, Y, Z) R (m = 1, 2, 3, j = 1, 2, . . . , n)
and 1 (t, s; x, y, z, X, Y, Z) R4N by
 
ej (t s)  (j)
(j) (j) 3
xm + y m (j)
(j)
m = zm(j)
+ (xl zl )
2 mj c
l=1
 1 1

1 Bml ( (j) (), ())d1 d2
0 0
 1 1
ej (t s) Am (j)
(X Z) 1 ( (), ())d 1 d2
mj c 0 0 a

ej (t s) 1
+ Am (x(j) (x(j) y (j) ), X (X Y ))d
mj c 0
2  1 1
(t s)
+ 1 V1 (())/x(j)
m d1 d2 (6.16)
mj 0 0

and
 
(t s)|V |  
n 3
X +Y
1 = Z + m zm )
ej (x(j) (j)
2 c j=1 m=1
 1
Am (j)
1

1
( (), ())d 1 d2
0 0 a
 1 1
V2
+ (t s)2 |V | 1 (())d1 d2 , (6.17)
0 0 a
(j) (j) (j)
respectively. Let (j) := (1 , 2 , 3 ) R3 . Then it follows from (6.13), (6.16)
and (6.17) that
t,s t,s
q t,s
Sc (t, s;  q t,s
z,y , aZ,Y ) Sc (t, s;  z,x , aZ,X )

1 
n
= mj (x(j) y (j) ) (j) (t, s; x(j) , y (j) , z (j) , X, Y, Z)
t s j=1

1
+ (X Y ) 1 (t, s; x, y, z, X, Y, Z). (6.18)
(t s)|V |

7. The Stability of the Fundamental Operator

Lemma 7.1. Let f C 1 (Rd ) and |x f | C < x >(1+ ) for all x Rd and
all || = 1, where > 0 are constants. Then we have:

(1) f is a bounded function in Rd .


(2) We have
   
 1 1 

|x z| x y z 1 f (z + 1 (x z) + 1 2 (y x))d1 d2 
0 0

C,, , | + + | = 1, x, y, z Rd .

The proof is easy, following the proof of [18, Lemma 3.5].


We note (3.4) and (6.11). Then, it follows from Lemma 7.1 that under the
assumptions of Theorem 3.1 we have
  1 1 
 Am (j) 
    
x(j) y(j) z(j) X Y Z (Z X) 1 ( (), ())d1 d2 
 0 0 a  

C,,, ,  ,  , | + + +  +  +  | 0 (7.1)
for x(j) , y (j) , z (j) R3 and X, Y, Z  R4N . In the same way we have the same
(j) (j) 1 1 (j)
estimates as the above for (xl zl ) 0 0 1 Bml ( (j) (), ())d 1 d2 and (xm
(j)  
1 1
zm )
1 Am ( (j) (), ())d 1 d2 . To obtain these estimates we assumed (3.6)
0 0 a
and (3.7). Consequently, letting be a component of (j) and 1 , and | + +
+  +  +  | 1, then from (6.16) and (6.17) we obtain
  

|x y z X Y Z | C,,, ,  ,  (7.2)
together with (6.3) and (6.4) for 0 s t T, x, y, z R3n and X, Y, Z R4N .

Proposition 7.2. Under the assumptions of Theorem 3.1 we have:


(1) There exists a constant > 0 such that the mapping: R3n+4N  (z, Z)
(, ) = (, 1 ) := ((1) , (2) , . . . , (n) , 1 ) R3n+4N is homeomorphic
and det (, )/(z, Z) 1/2 for each xed 0 t s , x, y, X
and Y . We write its inverse mapping as R3n+4N  (, ) (z, Z) =
(z(t, s; x, , y, X, , Y ), Z(t, s; x, , y, X, , Y )) R3n+4N .
(2) Let (t, s; x, , y, X, , Y ) be a component of z and Z. Then, letting | + +
+  +  +  | 1, we have
  

| x y X Y (t, s; x, , y, X, , Y )| C,,, ,  ,  (7.3)
for 0 t s , x, , y R3n and X, , Y R4N .

Proof. (1) From (6.16) and (6.17), we can write


(, 1 )/(z, Z) = I + (t s)d(t, s; x, y, z, X, Y, Z), (7.4)

where I is the identity matrix of degree 3n + 4N . We can see as in the


proof of (7.2) that each component of d(t, s; x, y, z, X, Y, Z) satises (7.2) for
all , , ,  ,  and  . Hence, applying [31, Theorem 1.22] to the mapping:
(z, Z) (, 1 ), we prove (1).
(2) We see
(, ) = ((t, s; x, y, z, X, Y, Z), 1 (t, s; x, y, z, X, Y, Z))
with z = z(t, s; x, , y, X, , Y ) and Z = Z(t, s; x, , y, X, , Y ). So, (7.3) follows
from (7.2) and det (, )/(z, Z) 1/2.

Remark 7.1. Let us consider the general case 2 3 in Proposition 7.2. Then
a ) and Bml (x, a ) in (6.16) and (6.17).
from (3.4) and (6.12), we consider A(x, 2 2
Let 1 and 2 be xed. When 3 = 2 , we could determine > 0 from (7.4)
  

such that we get det (, 1 )/(z, Z) 1/2 for 0 t s , x, y, z R3n and


X, Y, Z R4N3 . Let 3 2 . Then, direct calculations show
det (, 1 )/(z, Z) 1/2
for 0 t s , x, y, z R3n and X, Y, Z R4N3 from (6.16) and (6.17) since
(i)
(t s)2 |V | 2 V2 (a )/(alk )2 = (t s)2 (c|k|)2 are non-negative. Consequently, we
can see that when 1 and 2 are xed, the constant > 0 is taken independently
of 3 ( 2 ).

Theorem 7.3. Let $\rho^* > 0$ be the constant determined in Proposition 7.2. Then under the assumptions of Theorem 3.1 we can find constants $K_a \ge 0$ ($a = 0, 1, 2, \ldots$) such that
$$
\|C(t,s)f\|_{B^a} \le e^{K_a(t-s)}\|f\|_{B^a}, \qquad 0 \le t-s \le \rho^* \tag{7.5}
$$
for all $f(x, a_{\Lambda'}) \in B^a(\mathbb{R}^{3n+4N})$.

Proof. The definition (6.8) says
$$
C(s,s) = \text{Identity}. \tag{7.6}
$$
So (7.5) holds for $t = s$.
Let $0 < t-s \le \rho^*$. We take $\chi \in C^\infty(\mathbb{R}^{3n+4N})$ with compact support such that $\chi(0) = 1$. Let $\epsilon > 0$ and $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. Then from (6.8) and (6.18), we can write
C(t, s) (
)2 C(t, s)f

 n  3  4N  
mj 1
= f (y, Y )dydY
2(t s) 2|V |(t s)
j=1

(
z,
Z)2 exp{i1 Sc (t, s; q t,s t,s
z,y , aZ,Y )

i1 Sc (t, s; 
q t,s t,s
z,x , aZ,X )}dzdZ


n  3  4N  
mj 1
= f (y, Y )dydY (
z,
Z)2
2(t s) 2|V |(t s)
j=1


n
m (j)

exp i dzdZ.
j 1
(x(j) y (j) ) + i(X Y ) (7.7)
j=1
(t s) |V |(t s)

We can make the change of variables: (z, Z) (, ) = (, 1 ) in (7.7) from


Proposition 7.2. Then

C(t, s) (
)2 C(t, s)f

 n  3  4N
mj 1
=
2(t s) 2|V |(t s)
j=1

  
n
mj (j)
f (y, Y )dydY (
z,
Z)2 exp i (x(j) y (j) )
(t s)
j=1

(z, Z)
+ i(X Y ) det dd.
|V |(t s) (, )

Equation (7.4) and Proposition 7.2(2) show

(z, Z)
det = 1 + (t s)h(t, s; x, , y, X, , Y ), (7.8)
(, )

where h(t, s; x, , y, X, , Y ) satises (7.3) for all , , ,  ,  and  . Consequently,


from Proposition 7.2(2), we have

lim C(t, s) (
)2 C(t, s)f
0
 3n+4N  
1
= lim f (y, Y )dydY (
z,
Z)2
2 0

(z, Z)
{exp(i(x y) + i(X Y ) )} det dd
(, )
 3n+4N 
1
= f (x, X) + (t s) Os- {exp(i(x y) + i(X Y ) )}
2
h(t, s; x, , y, X, , Y )f (y, Y )dydY dd, (7.9)

where (j) = (t s) (j) /mj (j = 1, 2, . . . , n) and = |V |(t s). We note that
the second term on the right-hand side of (7.9) is a pseudo-dierential operator.

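So, applying the Calderón–Vaillancourt theorem ([5]), we obtain
$$
\begin{aligned}
\lim_{\epsilon\to 0}\|\chi(\epsilon\cdot)C(t,s)f\|^2 &= \lim_{\epsilon\to 0}\bigl(C(t,s)^{\dagger}\chi(\epsilon\cdot)^2 C(t,s)f, f\bigr)\\
&= \Bigl(\lim_{\epsilon\to 0} C(t,s)^{\dagger}\chi(\epsilon\cdot)^2 C(t,s)f, f\Bigr)\\
&\le (1 + 2K_0(t-s))\|f\|^2 \le e^{2K_0(t-s)}\|f\|^2
\end{aligned}
$$
with a constant $K_0 \ge 0$. Hence we get (7.5) with $a = 0$ by Fatou's lemma.
Let $p(x, w, X, W)$ be a $C^\infty$ function satisfying (6.7) with an integer $M \ge 0$. Then we obtain
$$
\|P(t,s)f\| \le \mathrm{Const.}\,\|f\|_{B^M} \tag{7.10}
$$
as in the proof of (7.5) with $a = 0$. See the proof of [19, Proposition 4.3] for further details.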
Let us recall the expression (6.9) of C(t, s)f . Set := (x, X) and let
= (1 , 2 , . . . , 3n+4N ) be an arbitrary multi-index. Then we can see that
(C(t, s)f ) C(t, s)( f ) and (C(t, s)f ) C(t, s)( f ) are written in the form

(t s) P (t, s)( f )
||||

 4N
 
n
mj
3
1
:= (t s)
2i 2i|V |
|||| j=1


Os- (exp i1 (t, s; x, w, X, W ))p (t, s; x, w, X, W )

( f )(x w, X W )dwdW (7.11)
respectively, where p (t, s; x, w, X, W ) satises (6.7) with M = || || for
all , ,  and  . We can prove these results by induction with respect to
1 (j) 2 1 (j) 2 1 2
||, using w(j) eimj  |w | /2 = imj w(j) eimj  |w | /2 , W ei |W | /(2|V |) =
1 2
(iW/|V |)ei |W | /(2|V |) and the integration by parts in (6.9). See the proof of
[21, Lemma 3.2] for further details.
Let || = a (a = 0, 1, 2, . . .). Then we have

 (C(t, s)f ) C(t, s)( f ) + (t s) P (t, s)( f ).
||a

Applying (7.5) with a = 0 and (7.10) to the right-hand side above, we get

 (C(t, s)f ) eK0 (ts)  f  + Const.(t s) ||a  f B a|| .
We know from Lemma 2.3 with s = 1 and a = b in [17] that there exist a constant
a 0 and a (, ) satisfying
| a (, )| C, ; a (7.12)

for all and , and


a (, D ) = (a + a + D a )1 (7.13)
on S, where a (, D ) is the pseudo-dierential operator with symbol a (, ). So,
using [17, Lemma 2.4] and the Calder onVaillancourt theorem, we have
 f B a|| Const.(a|| + a|| + D a|| ) f 

= Const.{(a|| + a|| + D a|| ) a }


(a + a + D a )f 
Const.f B a . (7.14)
Hence we get
 (C(t, s)f ) eK0 (ts)  f  + Const.(t s)f B a . (7.15)
In the same way, we get
 (C(t, s)f ) eK0 (ts)  f  + Const.(t s)f B a . (7.16)
Thus we obtain
$$
\|C(t,s)f\|_{B^a} \le e^{K_0(t-s)}\|f\|_{B^a} + \mathrm{Const.}\,(t-s)\|f\|_{B^a} \le e^{K_a(t-s)}\|f\|_{B^a}.
$$
This completes the proof of Theorem 7.3.

Proposition 7.4. Let $0 \le t-s \le \rho^*$ and let $p(x, w, X, W)$ satisfy (6.7) with an integer $M \ge 0$. Then $P(t,s)$ is a continuous operator from $B^{a+M}$ ($a = 0, 1, 2, \ldots$) into $B^a$.

Proof. Let = (x, X) and f S(R3n+4N ). We also use (6.9) as in the proof of
Theorem 7.3. Then we have

P (t, s)f = P (t, s) f,

where denotes j j for all j and p (t, s; x, w, X, W ) satisfy (6.7) with



M + || || as M . Using = (x, X) = (x w, X W ) + (w, W ), we also
have

P (t, s)f = Q (t, s) f,

where q (t, s; x, w, X, W ) satisfy (6.7) with M + || || as M . Hence from (7.10)


and (7.14) we see

P (t, s)f B a = P (t, s)f  + ( P (t, s)f  +  P (t, s)f )
||=a

Const.f B a+M . (7.17)



8. The Consistency of the Fundamental Operator


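Let $C(t,s)$ and $H(t)$ be the fundamental operator defined in Sec. 6 and the operator defined by (3.10) with variables $a_{\Lambda'} = a_{\Lambda'_2} = X$, respectively.
Theorem 8.1. Under the assumptions of Theorem 3.1 there exist integers $M \ge 0$, $M' \ge 0$ and $C^\infty$ functions $r(t, s; x, w, X, W)$ and $r'(t, s; x, w, X, W)$ in $0 \le s \le t \le T$, $(x, w) \in \mathbb{R}^{6n}$ and $(X, W) \in \mathbb{R}^{8N}$ satisfying (6.7) for all $\alpha$, $\beta$, $\alpha'$ and $\beta'$, respectively, such that
$$
\left(i\hbar\frac{\partial}{\partial t} - H(t)\right)C(t,s)f = \sqrt{t-s}\,R(t,s)f \tag{8.1}
$$
and
$$
i\hbar\frac{\partial}{\partial s}C(t,s)f + C(t,s)H(s)f = \sqrt{t-s}\,R'(t,s)f \tag{8.2}
$$
for $f \in \mathcal{S}(\mathbb{R}^{3n}_x \times \mathbb{R}^{4N}_{a_{\Lambda'}})$, where $R(t,s)$ and $R'(t,s)$ are operators defined by (6.8).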

Proof. In this proof, we write $x$ and $y$ as $\vec x$ and $\vec y$, respectively. Let $x$ denote variables in $\mathbb{R}^3$. We may assume $s < t$ from Lemma 6.1. It follows from (3.10), (6.6) and (6.8) that direct calculations show
 

i H(t) C(t, s)f
t

n  3
4N
m j 1
=
j=1
2i(t s) 2i|V |(t s)
 
1 t,s t,s
(exp i Sc (t, s; q x,y , aX,Y ) r1 (t, s; x, y , X, Y )

i
+ r2 (t, s; x, y, X, Y ) f (y, Y )dy dY (8.3)
2
by means of (6.3) and (6.4), where
r1 (t, s; x, y , X, Y ) = t Sc (t, s; q t,s t,s
y , aX,Y )
x,
  2
1  
n
ej (j)
+  x(j) Sc A(x , X)
j=1
2mj c
|V |
+ V1 (x) + |X Sc |2 + V2 (X) (8.4)
2
and
3n + 4N  1
n
r2 = (j) Sc
ts j=1
mj x

1  ej
n
+ (x A)(x
(j) , X) |V |X Sc , x R3 (8.5)
c j=1 mj

(cf. the proof of [18, Proposition 2.3]).



Set = t s. From (6.6), we can write


ej (j)
x(j) Sc A(x , X)
c
mj (x(j) y (j) )
=

 1
ej
+ {A(x
(j) (x(j) y (j) ), X (X Y )) A(x
(j) , X)}d
c 0
 1
ej  (j)
3
(j) Al (j)
+ (xl yl ) (1 ) (x (x(j) y (j) ), X (X Y ))d
c 0 x
l=1
 1
V1
(1 ) (x (x y ))d
0 x(j)

mj (x(j) y (j) ) ej  (j)


3
(j) A
= (xm ym ) (x(j) , X)
2c m=1 xm

ej  ej  (j)
4N 3
A (j)
(j) Al
(Xm Ym ) (x , X) + (xl yl ) (x(j) , X)
2c m=1 Xm 2c x
l=1
 
x y X Y
+ q1 t, s; x, , X, (8.6)

and

1 
n 3
X Y 1
V2 (j) (j)
X Sc = (1 ) (X (X Y ))d + ej (xl yl )
|V | 0 X c j=1
l=1
 1
Al (j)
(1 ) (x (x(j) y (j) ), X (X Y ))d
0 X

1 
n 3
X Y (j)

(j) Al
= + ej (xl yl ) (x(j) , X)
|V | 2c j=1 X
l=1
 
x y X Y
+ q2 t, s; x, , X, . (8.7)

We can easily see


3
(j) (j) Ak (j)
(xk yk )(x(j)
m ym )
(j)
(x , X)
xm
k,m=1


3
(j) (j) (j) (j) Al (j)
+ (xk yk )(xl yl ) (x , X) = 0. (8.8)
xk
k,l=1

Equations (8.6)(8.8) show


 2
n
1  
S
ej (j)
A(x , X) + |V | |X Sc |2
2m j
 x (j) c
c  2
j=1
 
1 
n
|X Y |2 x y X Y
= 2 mj |x y | +
(j) (j) 2
+ q3 t, s; x, , X, .
2 j=1 2|V |2
(8.9)

From (6.6), we also have

1 
n
|X Y |2
q t,s
t Sc (t, s;  t,s
y , aX,Y ) = mj |x(j) y (j) |2 V1 (x)
x, 2
2 j=1 2|V |2
 
x y X Y
V2 (X) + q4 t, s; x, , X, .

(8.10)

Hence together with (8.4), we obtain


 
x y X Y
r1 (t, s; x, y , X, Y ) = q5 t, s; x, , X, . (8.11)

From (6.6) or (8.6) and (8.7), the same arguments as for (8.11) show
n
1
(j) Sc + |V |X Sc
j=1
mj x

2  ej
n 1
3n + 4N
= + (1 )
c j=1 mj 0

(x A)(x
(j) (x(j) y (j) ), X (X Y ))d
 
x y X Y
+ q6 t, s; x, , X,

1  ej
n
3n + 4N
= + (x A)(x
(j) , X)
c j=1 mj
 
x y X Y
+ q7 t, s; x, , X, . (8.12)

Hence together with (8.5), we get
 
x y X Y
r2 (t, s; x, y, X, Y ) = q7 t, s; x, , X, . (8.13)

Thus we could complete the proof of (8.1) from (8.3), (8.11) and (8.13).

Let us consider (8.2). By direct calculations we see that the left-hand side of
(8.2) is equal to

n  3
4N
m 1

j

j=1
2i(t s) 2i|V |(t s)
 
1
(exp i Sc (t, s; q t,s t,s
y , aX,Y
x,
 ) r1 (t, s; x, y , X, Y )

i 
+ r2 (t, s; x, y , X, Y ) f (y, Y )dy dY, (8.14)
2

where

r1 (t, s; x, y, X, Y ) = s Sc (t, s; q t,s t,s


y , aX,Y )
x,
 2
n
1  ej (j) 
 y(j) Sc + A(y , Y )
j=1
2mj c

|V |
V1 (y ) |Y Sc |2 V2 (Y ) (8.15)
2
and

3n + 4N  1
n
r2 = + (j) Sc
ts j=1
mj y

1  ej
n
+ (x A)(y
(j) , Y ) + |V |Y Sc . (8.16)
c j=1 mj

Consequently, we can prove (8.2) as in the proof of (8.1).

9. The Proofs of the Main Results


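We first prove Theorem 3.1. Let $\rho^* > 0$ be the constant determined in Proposition 7.2 and $\chi \in C^\infty(\mathbb{R}^{3n+4N})$ with compact support such that $\chi(0) = 1$. We consider bounded operators $K_j$ and $K'_j$ ($j = 1, 2, \ldots, \nu$) on $B^a(\mathbb{R}^{3n+4N})$. Then, it holds for $f \in B^a(\mathbb{R}^{3n+4N})$ that
$$
\begin{aligned}
&K_\nu\chi(\epsilon\cdot)K_{\nu-1}\chi(\epsilon\cdot)\cdots\chi(\epsilon\cdot)K_1\chi(\epsilon\cdot)f - K'_\nu K'_{\nu-1}\cdots K'_1 f\\
&\quad= \sum_{j=1}^{\nu} K_\nu\chi(\epsilon\cdot)\cdots\chi(\epsilon\cdot)K_{j+1}\chi(\epsilon\cdot)(K_j - K'_j)K'_{j-1}\cdots K'_1 f\\
&\qquad+ \sum_{j=0}^{\nu-1} K_\nu\chi(\epsilon\cdot)\cdots\chi(\epsilon\cdot)K_{j+1}(\chi(\epsilon\cdot) - 1)K'_j\cdots K'_1 f.
\end{aligned} \tag{9.1}
$$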
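
The telescoping identity (9.1) is purely algebraic, so it can be sanity-checked with finite matrices. The sketch below is an illustration under stated assumptions: the bounded operators $K_j$, $K'_j$ are replaced by random square matrices, the multiplication operator $\chi(\epsilon\cdot)$ by a diagonal matrix, and the helper names are hypothetical.

```python
import numpy as np


def ordered_product(mats, dim):
    """Return mats[-1] @ ... @ mats[0]; the identity for an empty list."""
    out = np.eye(dim)
    for m in mats:
        out = m @ out
    return out


def telescoping_sides(K, Kp, chi):
    """Both sides of (9.1) for K = [K_1..K_nu], Kp = [K'_1..K'_nu], chi diagonal."""
    nu, dim = len(K), chi.shape[0]
    eye = np.eye(dim)
    Kchi = [k @ chi for k in K]                       # K_j chi
    lhs = ordered_product(Kchi, dim) - ordered_product(Kp, dim)
    rhs = np.zeros((dim, dim))
    for j in range(1, nu + 1):                        # first sum, j = 1..nu
        head = ordered_product(Kchi[j:], dim)         # K_nu chi ... K_{j+1} chi
        rhs += head @ (K[j - 1] - Kp[j - 1]) @ ordered_product(Kp[: j - 1], dim)
    for j in range(nu):                               # second sum, j = 0..nu-1
        head = ordered_product(Kchi[j + 1:], dim) @ K[j]   # K_nu chi ... chi K_{j+1}
        rhs += head @ (chi - eye) @ ordered_product(Kp[:j], dim)
    return lhs, rhs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, nu = 4, 5
    K = [rng.normal(size=(dim, dim)) for _ in range(nu)]
    Kp = [rng.normal(size=(dim, dim)) for _ in range(nu)]
    chi = np.diag(rng.uniform(size=dim))
    lhs, rhs = telescoping_sides(K, Kp, chi)
    print(np.allclose(lhs, rhs))
```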

Noting (6.1) and (6.2), from (3.5) we have




, ,
Sc (T, 0; 
q , a ) = Sc (l , l1 ; q xl(l) l1
,x(l1)
l l1
, aX (l) ,X (l1) ),

l=1

(l)
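where $X^{(l)} = a^{(l)}_{\Lambda'}$ ($l = 1, 2, \ldots, \nu-1$) and $X^{(\nu)} = a_{\Lambda'}$. So, (3.8) is written as
$$
\lim_{\epsilon\to 0} C(T, \tau_{\nu-1})\chi(\epsilon\cdot)C(\tau_{\nu-1}, \tau_{\nu-2})\chi(\epsilon\cdot)\cdots C(\tau_2, \tau_1)\chi(\epsilon\cdot)C(\tau_1, 0)\chi(\epsilon\cdot)f
$$
for $f \in B^a(\mathbb{R}^{3n+4N})$. Let $f \in B^a(\mathbb{R}^{3n+4N})$ and $|\Delta| \le \rho^*$. We can easily see
$$
\sup_{0<\epsilon\le 1}\|\chi(\epsilon\cdot)f\|_{B^a} \le \mathrm{Const.}\,\|f\|_{B^a}
$$
and
$$
\lim_{\epsilon\to 0}\|(\chi(\epsilon\cdot) - 1)f\|_{B^a} = 0.
$$
Consequently, using Theorem 7.3 and (9.1), we can see that (3.8) is well defined in $B^a$, and is written as
$$
C(T, \tau_{\nu-1})C(\tau_{\nu-1}, \tau_{\nu-2})\cdots C(\tau_2, \tau_1)C(\tau_1, 0)f \quad (= C_\Delta(T, 0)f). \tag{9.2}
$$
We also see from Remark 3.5 that the limit (3.8) exists in $\mathcal{S}$ for $f \in \mathcal{S}$.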
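Let $0 \le t_0 \le t \le T$. For a subdivision $\Delta$ of $[0, T]$ we can find $j$ and $l$ such that $j \le l$, $\tau_{j-1} < t_0 \le \tau_j$ and $\tau_{l-1} < t \le \tau_l$, where we take $j = 1$ for $t_0 = 0$. Then we define
$$
C_\Delta(t, t_0)f := \lim_{\epsilon\to 0} C(t, \tau_{l-1})\chi(\epsilon\cdot)C(\tau_{l-1}, \tau_{l-2})\chi(\epsilon\cdot)\cdots\chi(\epsilon\cdot)C(\tau_{j+1}, \tau_j)\chi(\epsilon\cdot)C(\tau_j, t_0)\chi(\epsilon\cdot)f \tag{9.3}
$$
for $f \in B^a$ as was stated in Remark 3.3. Let $|\Delta| \le \rho^*$. Then we have
$$
C_\Delta(t, t_0)f = C(t, \tau_{l-1})C(\tau_{l-1}, \tau_{l-2})\cdots C(\tau_{j+1}, \tau_j)C(\tau_j, t_0)f
$$
as in the proof of (9.2). Consequently, from (7.5) we have
$$
\|C_\Delta(t, t_0)f\|_{B^a} \le e^{K_a(t-t_0)}\|f\|_{B^a} \quad (a = 0, 1, 2, \ldots) \tag{9.4}
$$
under the assumptions of Theorem 3.1.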
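
The step from (7.5) to (9.4) can be spelled out as follows; this is a short worked sketch, using only that each subinterval of the subdivision has length at most $|\Delta| \le \rho^*$, so that (7.5) applies to every factor, and that the exponents add up along the subdivision:
$$
\begin{aligned}
\|C_\Delta(t, t_0)f\|_{B^a}
&\le e^{K_a(t-\tau_{l-1})}\,\|C(\tau_{l-1}, \tau_{l-2})\cdots C(\tau_j, t_0)f\|_{B^a}\\
&\le e^{K_a(t-\tau_{l-1})}\,e^{K_a(\tau_{l-1}-\tau_{l-2})}\cdots e^{K_a(\tau_j - t_0)}\,\|f\|_{B^a}
= e^{K_a(t-t_0)}\,\|f\|_{B^a}.
\end{aligned}
$$
The bound is therefore independent of the number of subdivision points, which is what makes it usable in the limit $|\Delta| \to 0$.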

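Proposition 9.1. Let $|\Delta| \le \rho^*$. Then, under the assumptions of Theorem 3.1 we can find an integer $M \ge 2$ such that
$$
\|C_\Delta(t, t_0)f - C_\Delta(t', t'_0)f\|_{B^a} \le C_a\,(|t - t'| + |t_0 - t'_0|)\,\|f\|_{B^{a+M}} \tag{9.5}
$$
for $0 \le t_0 \le t \le T$, $0 \le t'_0 \le t' \le T$ and $a = 0, 1, 2, \ldots$.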



Proof. Let $R(t,s)$ and $R'(t,s)$ be the operators defined by (8.1) and (8.2), respectively. We determine $M$ in Proposition 9.1 by $\max(M, M', 2)$ for $M$ and $M'$ in Theorem 8.1. We can easily see
$$
i\hbar\,(C(t,s)f - C(t',s)f) = \int_{t'}^{t}\bigl(H(\theta)C(\theta,s)f + \sqrt{\theta-s}\,R(\theta,s)f\bigr)\,d\theta \tag{9.6}
$$
from (8.1) for $s \le t' \le t \le T$. Let $\tau_j < t \le \tau_{j+1}$ and $\tau_k < t' \le \tau_{k+1}$. So $j \ge k$ holds.
Using the equation just after (9.3) and (9.6), we get

i(C (t, t0 )f C (t , t0 )f )


 t  t
= H()C (, t0 )f d + j R(, j )dC (j , t0 )f
t j

  jl+1
jk1

+ jl R(, jl )dC (jl , t0 )f
l=1 jl
 k+1 
+ k R(, k )dC (k , t0 )f. (9.7)
t

See the proof of [21, Theorem 4.2] for further details.


As in the proof of (7.14), we see
$$
\|H(t)f\|_{B^a} \le \mathrm{Const.}\,\|f\|_{B^{a+M}} \tag{9.8}
$$
from (3.10) because of $M \ge 2$. We also see
$$
\|R(t,s)f\|_{B^a} \le \mathrm{Const.}\,\|f\|_{B^{a+M}} \tag{9.9}
$$
from Proposition 7.4 for $0 \le t-s \le \rho^*$. Consequently, (9.4) and (9.7) show



C(t, t0 )f C (t , t0 )f B a Const. eKa+M T (1 + )|t t |f B a+M

for 0 t0 t t T . The inequality above holds for 0 t0 t , t T . In the


same way we get

C(t, t0 )f C (t, t0 )f B a Const. eKa+M T (1 + )|t0 t0 |f B a+M

for 0 t0 , t0 t T . Hence, we can complete the proof of Proposition 9.1.

Let $M \ge 2$ be the integer determined in Proposition 9.1. We consider a solution
$u(t)$, which is $B^M$-valued continuous and $L^2$-valued continuously differentiable in
$[t_0, T]$, to (3.9) with $u(t_0) = 0$ for a $t_0 \in [0, T)$. Then, noting $M \ge 2$, from (3.9)
and (3.10) we can easily see

$$\frac{d}{dt}(u(t),u(t)) = 2\,\mathrm{Re}\left(\frac{du}{dt}(t),u(t)\right) = -2\hbar^{-1}\,\mathrm{Re}\,i\,(H(t)u(t),u(t)) = 0$$
Feynman Path Integral for Nonrelativistic Quantum Electrodynamics 591

and so $u(t) = 0$ in $[t_0, T]$, where $\mathrm{Re}\,a$ for a complex number $a$ denotes the real part
of $a$. Consequently, we can see for a given $f \in B^{a+M}$ that the solution to (3.9)
with $u(t_0) = f$ is determined uniquely in the space of all $B^M$-valued continuous
and $L^2$-valued continuously differentiable functions in $[t_0, T]$.
Let $\{\Delta_j\}_{j=1}^{\infty}$ be a family of subdivisions of $[0, T]$ such that $|\Delta_j| \le \rho^*$ and
$\lim_{j\to\infty}|\Delta_j| = 0$. Take an arbitrary $f \in B^{a+2M}$ $(a = 0, 1, 2, \ldots)$. Then we see
from (9.4) and (9.5) that $\{C_{\Delta_j}(t,t_0)f\}_{j=1}^{\infty}$ is uniformly bounded as a family of
$B^{a+2M}$-valued continuous functions and equicontinuous as a family of $B^{a+M}$-valued
functions in $0 \le t_0 \le t \le T$, respectively. It follows from the Rellich criterion
(cf. [28, Theorem XIII.65]) that the embedding map from $B^M$ into $L^2$ is compact.
So is the embedding map from $B^{a+2M}$ into $B^{a+M}$ from (7.12), (7.13) and [17,
Lemma 2.5] with $a = b = 1$. Consequently, from the Ascoli-Arzelà theorem we can find
a subsequence $\{\Delta_{j_k}\}_{k=1}^{\infty}$, which may depend on $f$, such that $C_{\Delta_{j_k}}(t,t_0)f$ converges
in $B^{a+M}$ uniformly in $0 \le t_0 \le t \le T$ as $k \to \infty$. Since $C_{\Delta_j}(t_0,t_0)f = f$ follows
from Lemma 6.1, (9.7)-(9.9) show that $\lim_{k\to\infty} C_{\Delta_{j_k}}(t,t_0)f =: U(t,t_0)f$, where
$U(t,t_0)f$ is a $B^{a+M}$-valued continuous and $B^a$-valued continuously differentiable
function in $0 \le t_0 \le t \le T$ satisfying (3.9) with $u(t_0) = f$. Hence, it follows from the
uniqueness of solutions to (3.9) proved above that $C_\Delta(t,t_0)f$ converges to $U(t,t_0)f$
in $B^{a+M}$ uniformly in $0 \le t_0 \le t \le T$ as $|\Delta| \to 0$.
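Take an arbitrary $f \in B^a$. Let $\Delta$ and $\Delta'$ be subdivisions such that $|\Delta| \le \rho^*$
and $|\Delta'| \le \rho^*$. For any $\epsilon > 0$ we can take a $g \in B^{a+2M}$ such that $\|g - f\|_{B^a} < \epsilon$.
Then from (9.4) we have

$$\begin{aligned}
\|C_\Delta(t,t_0)f - C_{\Delta'}(t,t_0)f\|_{B^a} &\le \|C_\Delta(t,t_0)g - C_{\Delta'}(t,t_0)g\|_{B^a} \\
&\quad + \|C_\Delta(t,t_0)(f-g)\|_{B^a} + \|C_{\Delta'}(t,t_0)(f-g)\|_{B^a} \\
&\le \|C_\Delta(t,t_0)g - C_{\Delta'}(t,t_0)g\|_{B^{a+M}} + 2e^{K_a T}\epsilon.
\end{aligned}$$

So,

$$\lim_{|\Delta|,|\Delta'|\to 0}\ \max_{0\le t_0\le t\le T} \|C_\Delta(t,t_0)f - C_{\Delta'}(t,t_0)f\|_{B^a} \le 2e^{K_a T}\epsilon. \tag{9.10}$$

Hence, we can see that $C_\Delta(t,t_0)f$ converges in $B^a$ uniformly in $0 \le t_0 \le t \le T$ as
$|\Delta| \to 0$. We write this limit as $W(t,t_0)f$.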
Let $f \in B^a$. Take $f_j \in B^{a+M}$ such that $\lim_{j\to\infty} f_j = f$ in $B^a$. From (9.7) we
have

$$i\hbar\,(W(t,t_0)f_j - f_j) = \int_{t_0}^{t} H(\theta)W(\theta,t_0)f_j\,d\theta$$

in $B^a$. The inequality $\|W(t,t_0)f\|_{B^a} \le e^{K_a(t-t_0)}\|f\|_{B^a}$ holds from (9.4). So, from
[17, Lemma 2.5] with $a = b = 1$ we can see

$$i\hbar\,(W(t,t_0)f - f) = \int_{t_0}^{t} H(\theta)W(\theta,t_0)f\,d\theta$$

in $B^{a-2}$ and that $W(t,t_0)f$ is $B^a$-valued continuous and $B^{a-2}$-valued continuously
differentiable in $0 \le t_0 \le t \le T$. Hence $\lim_{|\Delta|\to 0} C_\Delta(t,t_0)f$ $(= W(t,t_0)f)$ satisfies
(3.9) with $u(t_0) = f$. Thus, we have completed the proof of Theorem 3.1.
We shall consider the proof of Theorem 3.2. Let q t,s  t,s
x,y () and a  X,Y () be the
paths dened by (6.1) and (6.2) for s < t, respectively. For k R2 (k  ) we

1
dene the path by

4k (q t,s
x,y ())
t,s () := k + R2 , st (9.11)
 k |k|2

as in (3.12). The path t,s () R2 (k 1 ) is dened by (2.13). So from (2.16)


 k
and (2.17) we have
(1) (1) (2) (2)
k = k , k = k .

For k 1 , we can easily see


 2
 
2  t,s
|k|  () 8k (q t,s t,s
x,y ())  ()
k k

 
 t,s 4k 2 16 2

= |k| 
2  |k |2
k |k|2  |k|2
16 2
= |k|2 | k |2 |k (q t,s 2
x,y ())| . (9.12)
|k|2
dened by (3.11) is written as
So, the classical action for L
t,s t,s
S(t, s;  x,y , aX,Y , { }k1 )
q t,s
k

t,s (t s) 
= Sc (t, s; q t,s
x,y , aX,Y ) + |k|2 |k |2 (9.13)
4|V |  k1

from (2.21) and (3.3).


Let 1 C (R2N1 ) with compact support such that 1 (0) = 1. Let
> 0 and
:= {k }k1 R2N1 . For f S(R3n+4N ) we dene G (t, s)f (0 s t T ) by



n 
 3 4N 

m j 1 |k|2
(t s)



2i(t s) 2i|V |(t s) 4i 2 |V |

j=1 k1 

   (9.14)


1
ei S 1 (
)f (y, Y )dydY dk , s < t,





k1

f, s = t,
t,s t,s
q t,s
where S = S(t, s;  x,y , aX,Y , { }k1 ).
k

Proposition 9.2. Let $f \in B^a(\mathbb{R}^{3n+4N})$ $(a = 0, 1, 2, \ldots)$. Then, under the assumptions
of Theorem 3.1 we have

$$\lim_{\epsilon\to 0} G_\epsilon(t,s)f = C(t,s)f \tag{9.15}$$

in $B^a$ for $0 \le t - s \le \rho^*$.

Proof. In the case $t = s$ Eq. (9.15) is clear from (7.6). Let $0 < t - s \le \rho^*$ and
$f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. From (9.13) we have

n  3
4N
m j 1
G (t, s)f =
j=1
2i(t s) 2i|V |(t s)

(exp i1 Sc (t, s; q t,s t,s
x,y , aX,Y ))f (y, Y )dydY


 |k|2 (t s)  
i(t s) 
exp |k|2 |k |2

4i 2 |V | 4|V | 
k1 k1

1 (
) dk .
k1

(1) (2)
k := (k , k ) R2 and := {k }k1 . We know
Let 
 
ia 2 i
e d = (9.16)
a
for a constant a > 0. So we can write

G (t, s)f = P (t, s)f, (9.17)

where

 |k|2   
p (t, s) = exp i |k|2 |k |2

i 
k1 k1
 
(
4|V |/(t s)) dk . (9.18)
k1

We see that

lim p (t, s) = 1 (9.19)


0

pointwise. Letting q (t, s) = p (t, s) 1, we have

P (t, s)f C(t, s)f = Q (t, s)f.



We consider

$$\begin{aligned}
\|G_\epsilon(t,s)f - C(t,s)f\|^2 &= \|P_\epsilon(t,s)f - C(t,s)f\|^2 \\
&= ((P_\epsilon(t,s) - C(t,s))^*(P_\epsilon(t,s) - C(t,s))f, f) \\
&= (Q_\epsilon(t,s)^* Q_\epsilon(t,s)f, f).
\end{aligned}$$

Hence, we obtain (9.15) as in the proof of Theorem 7.3 in the present paper together
with [17, Lemma 2.2]. See the proof of [20, Lemma 4.1] for further details.

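We can write (3.13) as

$$\lim_{\epsilon\to 0} G_\epsilon(T,\tau_{\nu-1})\chi(\epsilon\,\cdot)G_\epsilon(\tau_{\nu-1},\tau_{\nu-2})\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)G_\epsilon(\tau_2,\tau_1)\chi(\epsilon\,\cdot)G_\epsilon(\tau_1,0)\chi(\epsilon\,\cdot)f \tag{9.20}$$

in the same way that (3.8) is written just above (9.2). Integrating by parts in
(9.18), we see that $\sup_{0<\epsilon\le 1}|p_\epsilon(t,s)|$ is finite. So the same proof as for (7.5) shows

$$\sup_{0<\epsilon\le 1}\|G_\epsilon(t,s)f\|_{B^a} \le C_a\|f\|_{B^a}, \quad a = 0, 1, 2, \ldots$$

with constants $C_a$ from (9.17). Hence, using (9.1), we can prove Theorem 3.2 as in
the proof of the convergence of (3.8) to (9.2) together with (9.15).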
Finally, we will prove Theorem 3.3. As in the proof of (6.15) we get
   
1
Aex (t, x ) dx ex (t, x )dt
(j) (j) (j)
q zt,s
y
,y q zt,s
x
,x
c
 1
1
= (x(j) y (j) ) Aex (s, x(j) (x(j) y (j) ))d
c 0
 1 1
(t s)(x (j)
y (j)
) 1 Eex ( (), (j) ())d1 d2
0 0
 1 1
1  (j) 
3 3
(j) (j) 
(x ym )
(j)
(zl xl ) 1 Bml ( (), (j) ())d1 d2
c m=1 m 0 0
l=1
(9.21)

    
for s < t, where (B23 (t, x), B31 (t, x), B12 (t, x)) = Bex (t, x), Blm = Bml , and ()
(j)
and () were dened by (6.11). See the proof of [18, Proposition 3.3] for further
details. So, we get Eq. (6.18) with the sum over $j = 1, 2, \ldots, n$ of (9.21) multiplied
by $m_j e_j/(t-s)$ added to it. Hence, under the assumptions of Theorem 3.3 we obtain
the same assertion as in Theorem 3.1 in the same way that Theorem 3.1 is proved.
In the same way as in the proof of Theorem 3.2 we also get the same assertion as in
Theorem 3.2 under the assumptions of Theorem 3.3. Thus, we have completed the
proof of the main results.

Acknowledgements
The author thanks the referee for many useful suggestions. This research was
partially supported by Grant-in-Aid for Scientific Research No. 16540145 and
No. 19540175, Ministry of Education, Culture, Sports, Science and Technology,
Japanese Government.

References
[1] S. Albeverio, A. Hahn and A. N. Sengupta, Chern-Simons theory, Hida distribution,
and state models, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6 (2003) 65-81.
[2] S. Albeverio, R. J. Høegh-Krohn and S. Mazzucchi, Mathematical Theory of Feynman
Path Integrals, Lecture Notes in Math. Vol. 523, 2nd edn. (Springer-Verlag, Berlin
and Heidelberg, 2008).
[3] A. Arai, Fock Space and Quantum Field (Nihon Hyoron Co., Tokyo, 2000) (in
Japanese).
[4] F. A. Berezin and M. A. Shubin, The Schrödinger Equation (Kluwer Academic
Publishers, Dordrecht, 1983).
[5] A. P. Calderón and R. Vaillancourt, On the boundedness of pseudo-differential
operators, J. Math. Soc. Japan 23 (1971) 374-378.
[6] J. M. Cook, The mathematics of second quantization, Trans. Amer. Math. Soc. 74
(1953) 222-245.
[7] P. A. M. Dirac, The Principles of Quantum Mechanics, 4th edn. (Oxford Univ. Press,
London, 1958).
[8] E. Fermi, Quantum theory of radiation, Rev. Mod. Phys. 4 (1932) 87-132.
[9] R. P. Feynman, Space-time approach to nonrelativistic quantum mechanics, Rev.
Mod. Phys. 20 (1948) 367-387.
[10] R. P. Feynman, Mathematical formulation of the quantum theory of electrodynamic
interaction, Phys. Rev. 80 (1950) 440-457.
[11] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals
(McGraw-Hill, New York, 1965).
[12] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral theory for the standard model of
non-relativistic QED, Comm. Math. Phys. 283 (2008) 613-646.
[13] I. M. Gelfand and N. Y. Vilenkin, Generalized Functions. Vol. IV, Applications of
Harmonic Analysis (Academic Press, New York-London, 1964).
[14] S. J. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics
(Springer, Berlin, 2003).
[15] T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise (Kluwer Academic
Publishers, Dordrecht, 1993).
[16] F. Hiroshima, Functional integral representation of a model in quantum
electrodynamics, Rev. Math. Phys. 9 (1997) 489-530.
[17] W. Ichinose, A note on the existence and ħ-dependency of the solution of equations
in quantum mechanics, Osaka J. Math. 32 (1995) 327-345.
[18] W. Ichinose, On the formulation of the Feynman path integral through broken line
paths, Comm. Math. Phys. 189 (1997) 17-33.
[19] W. Ichinose, On convergence of the Feynman path integral formulated through broken
line paths, Rev. Math. Phys. 11 (1999) 1001-1025.
[20] W. Ichinose, The phase space Feynman path integral with gauge invariance and its
convergence, Rev. Math. Phys. 12 (2000) 1451-1463.
[21] W. Ichinose, Convergence of the Feynman path integral in the weighted Sobolev
spaces and the representation of correlation functions, J. Math. Soc. Japan 55 (2003)
957-983.
[22] W. Ichinose, The continuity of the solution with respect to an electromagnetic
potential to the Schrödinger equation and the Dirac equation, preprint (2009).
[23] G. W. Johnson and M. L. Lapidus, The Feynman Integral and Feynman's Operational
Calculus (Oxford Univ. Press, Oxford, 2000).
[24] H. Kumano-go, Pseudo-Differential Operators (MIT Press, Cambridge, 1981).
[25] E. H. Lieb and M. Loss, Analysis (Amer. Math. Soc., Providence, 1997).
[26] S. Mizohata, The Theory of Partial Differential Equations (Cambridge Univ. Press,
New York, 1973).
[27] E. Nelson, Schrödinger particles interacting with a quantized scalar field, in
Proceedings of a Conference on the Theory and Applications of Analysis in
Function Space (M.I.T. Press, Cambridge, 1964), pp. 88-120.
[28] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of
Operators (Academic Press, New York, 1978).
[29] J. J. Sakurai, Advanced Quantum Mechanics (Addison-Wesley, Massachusetts, 1967).
[30] F. E. Schroeck, Jr., Generalization of the Cook formalism for Fock space, J. Math.
Phys. 12 (1971) 1849-1857.
[31] J. T. Schwartz, Nonlinear Functional Analysis (Gordon and Breach Science
Publishers, New York, 1969).
[32] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge
University Press, Cambridge, 2004).
[33] M. S. Swanson, Path Integrals and Quantum Processes (Academic Press, San Diego,
1992).
July 12, 2010 12:0 WSPC/S0129-055X 148-RMP
J070-S0129055X10004053

Reviews in Mathematical Physics
Vol. 22, No. 6 (2010) 597-667
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004053

GRADIENT FLOWS FOR OPTIMIZATION IN QUANTUM
INFORMATION AND QUANTUM DYNAMICS:
FOUNDATIONS AND APPLICATIONS

THOMAS SCHULTE-HERBRÜGGEN and STEFFEN J. GLASER
Department of Chemistry, Technical University of Munich (TUM),
Lichtenbergstrasse 4, D-85747 Garching, Germany
tosh@tum.de

GUNTHER DIRR and UWE HELMKE
Institute of Mathematics, University of Würzburg,
Am Hubland, D-97074 Würzburg, Germany
dirr@mathematik.uni-wuerzburg.de

Received 14 December 2008
Revised 26 February 2010

Many challenges in quantum information and quantum control are rooted in constrained
optimization problems on finite-dimensional quantum systems. The constraints often arise
from two facts: (i) quantum dynamic state spaces are naturally smooth manifolds (orbits
of the respective initial states) rather than being Hilbert spaces; (ii) the dynamics of
the respective quantum system may be restricted to a proper subset of the entire state
space. Mathematically, either case can be treated by constrained optimization over the
reachable set of an underlying control system. Thus, whenever the reachable set takes
the form of a smooth manifold, Riemannian optimization methods apply. Here, we give a
comprehensive account of the foundations of gradient flows on Riemannian manifolds
including new applications in quantum information and quantum dynamics. Yet, we do
not pursue the problem of designing explicit controls for the underlying control systems.
The framework is sufficiently general for setting up gradient flows on (sub)manifolds,
Lie (sub)groups, and (reductive) homogeneous spaces. Relevant convergence conditions
are discussed, in particular for gradient flows on compact and analytic manifolds. This
is meant to serve as a foundation for new achievements and further research.
Illustrative examples and new applications are given: we extend former results on
unitary groups to closed subgroups with tensor-product structure, where the finest
product partitioning relates to $SU_{\rm loc}(2^n) := SU(2) \otimes \cdots \otimes SU(2)$, known as (qubit-wise)
local unitary operations. Such applications include, e.g., optimizing figures of merit on
$SU_{\rm loc}(2^n)$ relating to distance measures of pure-state entanglement as well as to best
rank-1 approximations of higher-order tensors. In quantum information, our gradient
flows provide a numerically favorable alternative to standard tensor-SVD techniques.

Keywords: Constrained optimization in quantum systems; Riemannian optimization;
Riemannian gradient flows and algorithms; double-bracket flows; quantum control;
low-rank approximation of tensors; tensor SVD.

Mathematics Subject Classification 2010: 49-02, 49R50, 53-02, 53Cxx, 65Kxx, 81V70,
90C30, 15A18, 15A69


Contents

1. Introduction
2. Overview
2.1. Flows and dynamical systems
2.2. Gradient flows for optimization
2.3. Discretized gradient flows
2.4. Reachability and controllability
2.5. Settings of interest
3. Theory: Gradient Flows
3.1. Gradient flows on Riemannian manifolds
3.2. Gradient flows on Lie groups
3.3. Gradient flows on homogeneous spaces
3.4. Examples
4. Applications to Quantum Information and Quantum Control
4.1. A geometric measure of pure-state entanglement
4.2. Generalized local subgroups
4.3. Locally reversible interaction Hamiltonians
4.4. Intrinsic versus penalty approach: An example
5. Conclusions

1. Introduction
Controlling quantum systems offers a great potential for performing computational
tasks or for simulating the behavior of other quantum systems (which are difficult to
handle experimentally) or classical systems [1,2], when the complexity of a problem
reduces upon going from a classical to a quantum setting [3]. Important examples
are known in quantum computation, quantum search and quantum simulation.
Most prominently, there is the exponential speed-up by Shor's quantum algorithm
of prime factorisation [4,5], which on a general level relates to the class of quantum
algorithms [6,7] solving hidden subgroup problems in an efficient way [8]. In Grover's
quantum search algorithm [9,10] one still finds a quadratic acceleration compared to
classical approaches [11]. Recently, the simulation of quantum phase-transitions [12]
has shifted into focus [13, 14].
Among the generic tools needed for advances in quantum simulation and quantum
technology, quantum control plays a major role. For a survey see, e.g., [15,16].
Its key concern is to develop (optimal) control strategies and constructive ways for
implementing them under realistic experimental settings such that a certain
performance index is maximized. In a wider sense, such figures of merit depend on
terminal conditions as well as on running cost like time or energy. Yet in quantum
control important classes of performance indices are completely determined by

their value at the final state, typical examples being quantum gate fidelities,
efficiencies of state transfer or coherence transfer, as well as distance functions related
to Euclidean entanglement measures.
Since realistic quantum systems are mostly beyond analytical tractability,
numerical methods are often indispensable. A good strategy is to proceed in two
steps: (a) firstly, by exploring the possible gains on an abstract and computationally
cheap level, i.e. by maximizing the quality function either over the entire state
space or over the set of possible states (the so-called reachable set); (b) secondly,
by going into optimizing the experimental controls (pulse shapes) in a concrete
setting. However, (b) is often computationally expensive and highly sensitive, as
it actually approximates the solution of a constrained variational problem in an
infinite-dimensional function space. Here we almost entirely focus on task (a) and
do not pursue issues of optimal control (b) any further. In particular, we do not address the
problem of designing controls that steer concrete experimental setups to achieve
optimal figures of merit.
By merely depending on the geometry of the underlying state-space manifold, the
first instance (a) allows for analyzing in advance and on an abstract level the limits
of what one can achieve in step (b). We therefore refer to (a) as the abstract
optimization task. The second instance, in contrast, hinges on introducing the specific
time scales and control parameters of an experimental setting for finding steerings
of the quantum dynamical system such that the optima determined in (a) are actually
attained. This is why we term (b) the dynamic optimal control task. Certainly,
one can approach the entire problem only in terms of (b) and sometimes one is
even forced to do so, e.g., if nothing is known about the geometric structure of
the reachable set. Yet, the above two-fold strategy serves to yield (strict) upper
bounds (independent of the concrete experimental setting) in (a) which provide
benchmarks for the reliability of the numerical methods applied in (b).
In a pioneering paper [17], Brockett introduced the idea of exploiting gradient
flows on the orthogonal group for diagonalizing real symmetric matrices and for
sorting lists. In a series of subsequent papers he extended the concept to intrinsic
gradient methods for (constrained) optimization [18,19]. Soon these techniques were
generalized to Riemannian manifolds, their mathematical and numerical details
were worked out [20-22] and thus they turned out to be applicable to a broad
range of optimization tasks including eigenvalue and singular-value problems,
principal component analysis, matrix least-squares matching problems, balanced matrix
factorisations, and combinatorial optimization (for an overview see, e.g., [22, 23]).
Implementing a gradient method for optimization on a smooth (constrained) manifold,
such as a unitary orbit, via the Riemannian exponential map inherently
ensures that the (discretized) flow remains within the manifold. Alternatively,
formulating the optimization problem on some embedding Euclidean space comes at
the expense of additional constraints (e.g., enforcing unitarity) to be taken care
of by penalty-type or augmented Lagrange-type techniques. In this sense, gradient
flows on manifolds are intrinsic optimization methods [24], whereas extrinsic

optimizations on an embedding space require in general nonlinear projective
techniques in order to stay on the (constrained) manifold. In particular, using the
differential geometry of matrix manifolds has thus become a field of active research.
For new developments (however without exploiting the Lie structure to the full
extent) see, e.g., [25]. Even beyond manifolds, gradient flows have recently been
described for metric spaces with applications in probability theory [26].
For optimization in quantum dynamics, gradient flows and their discrete numerical
integration schemes have also proven powerful tools. This holds in both types
of tasks: (a) for exploring the maxima of pertinent quality functions on the reachable
set of a quantum system, e.g., on the unitary group and its orbits [27-29], and
(b) (beyond the current focus) also for arriving at concrete experimental steerings
(i.e. pulse shapes). These steerings actually achieve the quality limits established
in (a) under given experimental conditions for closed systems [30-32], whereas they
give (best) approximations in open systems [33-35]. Note that in task (b) gradient
flows on the set of admissible control amplitudes can be viewed as another
instance of flows on Riemannian manifolds. However, this instance will not be
pursued here.
Moreover, in view of unifying variational approaches to ground-state calculations
[36, 37], a common framework of gradient flows on Riemannian manifolds
as well as projective techniques on their tangent spaces will be useful. The latter
allow for restricting the flows, e.g., from Lie groups G to any closed subgroup
H, in particular to any closed subgroup of SU(N). Consecutive partitionings into
different subgroups of SU(N) are exploited in unitary networks addressing
large-scale quantum systems by neglecting long-range entangling correlations [38-40].
Related techniques for truncating the Hilbert space to pertinent parametrized
subsets include matrix-product states (MPS) [41, 42] of density-matrix
renormalization groups (DMRG) [43, 44], quantum cellular automata with Margolus
partitionings [45], projected entangled pair states (PEPS) [46], weighted graph states
(WGS) [47], multi-scale entanglement renormalization approaches (MERA) [48],
string-bond states (SBS) [49] as well as combined methods [36,50]. It is noteworthy
that in many-particle physics gradient flows for diagonalizing Hamiltonians were
re-introduced independently of Brockett's work [17] by Wegner [51] and were further
elaborated on (again independently of the monograph by Helmke and Moore [22]
or the one by Bloch [23]) in the tract by Kehrein [52]. Let this suffice to illustrate the
need for making the mathematical methods available to the physics community in
a comprehensive way.
Another field of applications of restricting flows to closed subgroups of SU(N)
is entanglement of multi-partite quantum systems [53, 54]: we present a connection
from gradient flows on the subgroup of local unitary operations to best
rank-1 approximations of higher-order tensors as well as a relation to tensor-SVDs.
These methods are of importance, e.g., in view of optimization of entanglement
witnesses [55]. Applying the same approach to other subgroups of SU(N) with
tensor product structure is anticipated to be of use also for classifying multi-partite

systems by partial separability, an example being three-tangles of GHZ-type and
W-type states [56-58].
Here the goal is to give a comprehensive account of the foundations of gradient
flows (and thus the justification for some recent developments) as well as to
present new applications to quantum information. Terms are kept general enough
to trigger future developments, since we elucidate the necessary requirements for
implementing gradient-based optimization methods in different geometric settings:
Riemannian manifolds and submanifolds, Lie groups and homogeneous spaces. We
will also show how they can be carried over to homogeneous spaces that no longer
form Lie groups themselves. Standard examples are coset spaces G/H, i.e. the
quotient of a Lie group G by a closed (yet not necessarily normal) subgroup H.
Here naturally reductive homogeneous spaces are of particular interest and the
well-known double-bracket flows will be demonstrated to form a special case precisely
of this kind.
A separate paper on open quantum systems [59] sets up a formal approach within
the framework of Lie semigroups accounting for Markovian quantum evolutions (or
Markovian channels). There we also show the current limits of abstract optimization
over reachable sets specifically arising in open systems. The differential geometry
of the set of all completely positive, trace-preserving invertible maps is analyzed in
the framework of Lie semigroups. In particular, the set of all Kossakowski-Lindblad
generators is retrieved as its tangent cone (Lie wedge). Moreover, it shows how the
Lie-semigroup structure corresponds to the Markov properties recently studied in
terms of divisibility [60]. It illustrates why abstract optimization tasks for open
systems are much more intricate than in the case of closed systems, while dynamic
optimal control tasks for open systems can be handled completely analogously. It
specifies algebraic conditions for time-optimal controls to be the method of choice
in open systems. Finally it draws perspectives to a new algorithmic approach for
optimization on semigroup orbits combining (a priori) knowledge of the respective
Lie wedge with well-known techniques from optimal control.

Outline
To begin with, we recall some basic terminology on dynamical systems and Riemannian
geometry. Then the aim is to provide the differential geometric tools for setting
up Riemannian optimization methods (primarily focussing on gradient flows)
in different scenarios ranging from optimization over the entire unitary group to
closed subgroups or homogeneous spaces. Finally we give a number of applications
including worked examples.
More precisely, the paper is organized as follows: Section 2 draws a general sketch
of dynamical systems and flows on manifolds including issues of reachability and
controllability. It provides the manifold setting for gradient-flow-based algorithms
like steepest ascent, conjugate gradients, Jacobi-type, and Newton methods.
A detailed analysis is then given in Sec. 3, where (1) we summarize the general
preconditions for gradient flows on smooth manifolds. In particular, we recall the role

of a Riemannian metric that allows for identifying the cotangent bundle $T^*M$ with
the tangent bundle $TM$. Though major parts of the foundations can be found scattered
in [22,25,61], here we add a comprehensive overview of the interplay between
Riemannian geometry, Lie groups, and (reductive) homogeneous spaces. (2) We give
examples of gradient flows on compact Lie groups as well as their closed subgroups.
(3) In view of further developments, we address gradient flows on reductive
homogeneous spaces including specializations to Cartan-like cases as well as naturally
reductive homogeneous spaces. In particular, double-bracket flows turn out to be
gradient flows on naturally reductive homogeneous spaces. (4) Examples interspersed
in the main text illustrate the relevance in a plethora of different settings.
Section 4 is dedicated to specific applications in quantum information and quantum
control. (1) We show how gradient flows on the subgroup of local unitaries
$SU_{\rm loc}(2^n)$ in $n$ qubits not only provide a valuable tool in witness optimization,
but also relate to generalized singular-value decompositions (SVD), namely the
tensor-SVD. Here, our gradient flows yield an alternative to common algorithms for best
rank-1 approximations of higher-order tensors, e.g., higher-order power methods
(HOPM) or higher-order orthogonal iteration (HOOI). (2) Flows on $SU_{\rm loc}(2^n)$ also
serve as a convenient tool to decide whether Hamiltonian interactions can be
time-reversed solely by local unitary manipulations, thus complementing the algebraic
assessment given in [29]. (3) Optimization tasks with additional extrinsic constraints
are addressed by tailored gradient flows on the respective subgroups or by auxiliary
penalty methods. By including practical applications and worked examples we
illustrate the ample range of problems to which gradient flows on manifolds provide
valuable solutions. To this end, we start out with an extended overview of Riemannian
optimization techniques on manifolds with particular emphasis on gradient
techniques.

2. Overview
2.1. Flows and dynamical systems
In this paper, we treat various optimization tasks for quantum dynamical systems
in a common framework, namely by gradient flows on smooth manifolds. Let $M$
denote a smooth manifold, e.g., the unitary orbit of all quantum states relating to
an initial state $X_0$. By a continuous-time dynamical system or a flow one defines a
smooth map
$$\Phi : \mathbb{R} \times M \to M \tag{2.1}$$
such that for all states $X \in M$ and times $t, \tau \in \mathbb{R}$ one has
$$\Phi(0, X) = X, \qquad \Phi(\tau, \Phi(t, X)) = \Phi(t + \tau, X). \tag{2.2}$$
Since these equations hold for any $X \in M$, one gets the operator identity
$$\Phi_\tau \circ \Phi_t = \Phi_{t+\tau} \tag{2.3}$$

for all $t, \tau \in \mathbb{R}$, thus showing the flow acts as a one-parameter group, and for positive
times $t, \tau \ge 0$ as a one-parameter semigroup of diffeomorphisms on $M$.
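As a minimal sketch of the defining flow properties, consider the linear system dx/dt = rate * x on M = R, whose flow is known in closed form. The function name `flow`, the parameter `rate`, and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def flow(t: float, x: float, rate: float = -0.5) -> float:
    """Flow of the linear ODE dx/dt = rate * x on M = R: exp(rate * t) * x."""
    return math.exp(rate * t) * x


x0, t, tau = 2.0, 0.7, 1.3

# Zero evolution time leaves the state unchanged.
identity_holds = flow(0.0, x0) == x0

# Composing two flows adds the evolution times (one-parameter group law).
group_law_holds = math.isclose(flow(tau, flow(t, x0)), flow(t + tau, x0), rel_tol=1e-12)

# Negative times invert the flow, so each time-t map is invertible.
inverse_holds = math.isclose(flow(-t, flow(t, x0)), x0, rel_tol=1e-12)

print(identity_holds, group_law_holds, inverse_holds)
```

Restricting to non-negative times keeps the composition law but drops the inverses, which is the semigroup case mentioned above.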

2.2. Gradient flows for optimization


Now, the general idea for optimizing a scalar quality function on a smooth manifold
$M$ (which might either arise naturally or from including smooth equality
constraints, vide infra) by dynamical systems is as follows: Let $f : M \to \mathbb{R}$ be
a smooth quality function on $M$. The differential of $f : M \to \mathbb{R}$ is a mapping
$Df : M \to T^*M$ of the manifold to its cotangent bundle $T^*M$, while the gradient
vector field is a mapping $\operatorname{grad} f : M \to TM$ to its tangent bundle $TM$. So
the gradient of $f$ at $X \in M$, denoted $\operatorname{grad} f(X)$, is the vector in $T_X M$ uniquely
determined by
$$Df(X)\,\xi = \langle \operatorname{grad} f(X) \,|\, \xi \rangle_X \quad \text{for all } \xi \in T_X M. \tag{2.4}$$

Here, the scalar product $\langle \cdot \,|\, \cdot \rangle_X$ plays a central role: it allows for identifying $T_X^* M$
with $T_X M$. The pair $(M, \langle \cdot \,|\, \cdot \rangle)$ is called a Riemannian manifold with Riemannian
metric $\langle \cdot \,|\, \cdot \rangle$. In view of gradient flows, the convenience of Riemannian manifolds
lies in the fact that by duality in particular the differential $Df(X)$ of $f$ at $X$ can
be identified with a tangent vector of $T_X M$.
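To make the role of the metric concrete, here is a small sketch on M = R^2 with a constant diagonal metric. The metric entries, the quality function, and all function names are hypothetical choices for illustration only. The gradient is obtained by solving the defining duality relation, and it visibly differs from the plain vector of partial derivatives.

```python
import math
from typing import Tuple

Vector = Tuple[float, float]

# Hypothetical constant diagonal metric on R^2: <u|v> = 2*u0*v0 + 0.5*u1*v1.
METRIC: Vector = (2.0, 0.5)


def differential(x: float, y: float) -> Vector:
    """Partial derivatives of the sample function f(x, y) = -(x**2 + 3*y**2)."""
    return (-2.0 * x, -6.0 * y)


def inner(u: Vector, v: Vector) -> float:
    """Scalar product induced by the diagonal metric."""
    return METRIC[0] * u[0] * v[0] + METRIC[1] * u[1] * v[1]


def riemannian_gradient(x: float, y: float) -> Vector:
    """Vector g with Df(xi) = <g|xi> for all xi; here g = metric^{-1} * Df."""
    dfx, dfy = differential(x, y)
    return (dfx / METRIC[0], dfy / METRIC[1])


point = (1.0, -2.0)
xi = (0.3, 0.8)
df = differential(*point)
directional_derivative = df[0] * xi[0] + df[1] * xi[1]
pairing = inner(riemannian_gradient(*point), xi)
print(directional_derivative, pairing)
```

With the Euclidean metric (both entries equal to 1) the two notions coincide, which is why the distinction is invisible in flat space.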
Then, the flow $\Phi : \mathbb{R} \times M \to M$ determined by the ordinary differential equation
$$\dot{X} = \operatorname{grad} f(X) \tag{2.5}$$
is termed gradient flow. Formally, it is obtained by integrating Eq. (2.5), i.e.
$$\Phi(t, X) = \Phi(t, \Phi(0, X)) = X(t), \tag{2.6}$$
where $X(t)$ denotes the unique solution of Eq. (2.5) with initial value $X(0) = X$.
Observe this ensures that $f$ does increase along trajectories of the system by
virtue of following the gradient direction of $f$. In generic problems, gradient flows
typically run into some local extremum as sketched in Fig. 2. Therefore a sufficiently
large set of independent initial conditions may be needed to provide confidence in
numerical results. However, in some pertinent applications, local extrema can be
ruled out; prominent examples of this type will be discussed in detail in the context
of Brockett's double bracket flow [17, 22].

2.3. Discretized gradient flows


Gradient flows may be envisaged as natural continuous versions of the steepest
ascent method for optimizing a real-valued function $f : \mathbb{R}^m \to \mathbb{R}$ by moving along
its gradient $\operatorname{grad} f \in \mathbb{R}^m$, i.e.
Steepest ascent method
$$X_{k+1} = X_k + \alpha_k \operatorname{grad} f(X_k), \tag{2.7}$$
where $\alpha_k \ge 0$ is an appropriate step size.
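
As a minimal sketch of the steepest ascent iteration (2.7), the following Python snippet maximizes a concave quadratic $f(X) = -\tfrac12 X^\top Q X + b^\top X$ on $\mathbb{R}^m$. The function names, the test problem and the constant step size $\alpha_k = 1/\lambda_{\max}(Q)$ are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def steepest_ascent(grad_f, x0, step, n_iter=200):
    """Iterate X_{k+1} = X_k + step * grad f(X_k), cf. Eq. (2.7), with a constant step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x + step * grad_f(x)
    return x

# Hypothetical test problem: f(X) = -1/2 X^T Q X + b^T X with Q positive definite.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
f = lambda x: -0.5 * x @ Q @ x + b @ x
grad_f = lambda x: -Q @ x + b

alpha = 1.0 / np.linalg.eigvalsh(Q).max()   # any constant step below 2/lambda_max(Q) converges here
x_opt = steepest_ascent(grad_f, np.zeros(2), alpha)
x_exact = np.linalg.solve(Q, b)             # unique maximizer of f, where grad f vanishes
```

For this quadratic the iteration contracts the error by the factor $1 - \alpha\lambda_i(Q)$ in each eigendirection, so `x_opt` agrees with `x_exact` up to round-off.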

Here, the right-hand side of Eq. (2.7) does make sense, as the manifold $M = \mathbb{R}^m$
coincides with its tangent space $T_X M = \mathbb{R}^m$ containing $\operatorname{grad} f(X)$. Clearly, a
generalization is required as soon as $M$ and $T_X M$ are no longer identifiable. This
gap is filled by the Riemannian exponential map defined by

$$\exp_X : T_X M \to M, \quad \xi \mapsto \exp_X(\xi) \tag{2.8}$$

so that $t \mapsto \exp_X(t\xi)$ describes the unique geodesic with initial value $X \in M$ and
initial velocity $\xi \in T_X M$ as illustrated in Fig. 1.
If the manifold $M$ carries the structure of a matrix Lie group $G$, we may identify
the tangent space element $\xi \in T_X G$ with $\Omega X$, where $\Omega$ is itself an element
of the Lie algebra $\mathfrak{g}$, i.e. the tangent space at the unity element $\mathfrak{g} = T_1 G$. Moreover,
if the Lie-group structure matches with the Riemannian framework in the
sense that the metric is bi-invariant (as will be explained in more detail later),
then the Riemannian exponential of $\xi = \Omega X$ can readily be calculated explicitly.
This is done in three steps by (i) right translation with the inverse group element
$X^{-1}$, (ii) taking the conventional exponential map of the Lie algebra element $\Omega$,
(iii) right translation with the group element $X$ as summarized in the following
diagram
$$\begin{array}{ccc}
\xi = \Omega X \in T_X G & \xrightarrow{\ \exp_X\ } & e^{\Omega} X \in G \\
\big\downarrow R_{X^{-1}} & & \big\uparrow R_{X} \\
\Omega \in \mathfrak{g} & \xrightarrow{\ \exp\ } & e^{\Omega} \in G.
\end{array} \tag{2.9}$$
Next, the gradient system (2.5) will be integrated (to sufficient approximation)
by a discrete scheme that can be seen as an intrinsic Euler step method. This can
be performed by way of the Riemannian exponential map, which is to say straight
line segments used in the standard method are replaced by geodesics on $M$. This
leads to the following integration scheme which is well-defined on any Riemannian
manifold $M$.

(1) Riemannian gradient method

$$X_{k+1} := \exp_{X_k}(\alpha_k \operatorname{grad} f(X_k)) \tag{2.10}$$

where $\alpha_k \ge 0$ denotes a step size appropriately selected to guarantee convergence,
cf. Sec. 3.

Fig. 1. The Riemannian exponential $\exp_X$ is a smooth map taking the tangent vector $t\xi \in T_X M$
at $X \in M$ to $\exp_X(t\xi) \in M$. By varying $t \in \mathbb{R}$, it yields the unique geodesic with initial value
$X \in M$ and initial velocity $\xi \in T_X M$. (Color online)

For matrix Lie groups $G$ with bi-invariant metric, Eq. (2.10) simplifies to

(1') Gradient method on a Lie group

$$X_{k+1} := \exp(\alpha_k \operatorname{grad} f(X_k)\, X_k^{-1})\, X_k, \tag{2.11}$$

where $\exp : \mathfrak{g} \to G$ denotes the conventional exponential map.

In either case, the iterative procedure can be pictured as follows: at each point
$X_k \in M$ one evaluates $\operatorname{grad} f(X_k)$ in the tangent space $T_{X_k} M$. Then one moves via
the Riemannian exponential map in direction $\operatorname{grad} f(X_k)$ to the next point $X_{k+1}$ on
the manifold so that the quality function $f$ improves, $f(X_{k+1}) \ge f(X_k)$, as shown
in Fig. 2.
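
To illustrate iteration (2.11), the following sketch maximizes $f(U) = \operatorname{tr}(C\,UAU^{\dagger})$ over the unitary group for Hermitian $A$ and $C$. With respect to the bi-invariant metric induced by $\langle \Omega \,|\, \Xi \rangle = \operatorname{tr}(\Omega^{\dagger}\Xi)$ one finds $\operatorname{grad} f(U) = [C, UAU^{\dagger}]\,U$, so the scheme is a discretized double bracket flow. The quality function, the matrices, the helper names and the conservative constant step size $\alpha = 1/(4\|A\|_F\|C\|_F)$ are assumptions chosen here for illustration only.

```python
import numpy as np

def expm_skew(omega):
    """Matrix exponential of a skew-Hermitian matrix via the Hermitian matrix K = -i*omega."""
    k = -1j * omega
    k = 0.5 * (k + k.conj().T)                 # symmetrize against round-off
    w, v = np.linalg.eigh(k)
    return (v * np.exp(1j * w)) @ v.conj().T   # V diag(exp(i w)) V^dagger

def quality(u, a, c):
    """f(U) = tr(C U A U^dagger), which is real for Hermitian A and C."""
    return np.trace(c @ u @ a @ u.conj().T).real

def lie_group_gradient_ascent(a, c, u0, n_iter=500):
    """Iterate U_{k+1} = exp(alpha * grad f(U_k) U_k^{-1}) U_k, cf. Eq. (2.11)."""
    alpha = 1.0 / (4.0 * np.linalg.norm(a) * np.linalg.norm(c))  # assumed constant step size
    u, values = u0, [quality(u0, a, c)]
    for _ in range(n_iter):
        h = u @ a @ u.conj().T
        omega = c @ h - h @ c                  # grad f(U) U^{-1} = [C, U A U^dagger]
        u = expm_skew(alpha * omega) @ u
        values.append(quality(u, a, c))
    return u, values

a = np.diag([1.0, 2.0, 3.0])
c = np.diag([3.0, 1.0, 2.0])
rng = np.random.default_rng(0)
u0, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))  # random unitary
u_final, values = lie_group_gradient_ascent(a, c, u0)
```

Because every update multiplies by the exponential of a skew-Hermitian matrix, the iterates stay unitary without any re-projection, and with the step size above the sequence `values` is non-decreasing. For these matrices the quality function is bounded above by $1\cdot 1 + 2\cdot 2 + 3\cdot 3 = 14$.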

Higher-order Riemannian optimization methods


The steepest ascent approach just outlined is most basic for addressing abstract
optimization tasks intrinsically. Other intrinsic iterative schemes exploiting the
underlying Riemannian geometry like conjugate gradients, Jacobi-type methods
or Newton's method can be obtained similarly. For an introduction to these more
advanced topics beyond the subsequent sketch see, e.g., [20, 62, 63].

Fig. 2. Abstract optimization task: The quality function $f : M \to \mathbb{R}$, $X \mapsto f(X)$ (top trace) is
driven into a (local) maximum by following the gradient flow $\dot{X} = \operatorname{grad} f(X)$ on the manifold $M$
(lower trace). (Color online)

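(2) Conjugate gradient method

$$\begin{aligned}
X_k^{l+1} &:= \operatorname*{argmax}_{t \ge 0}\, f\big(\exp_{X_k^l}(t\,\Omega_k^l)\big), \\
X_{k+1}^{0} &:= X_k^{n},
\end{aligned} \tag{2.12}$$
$$\Omega_k^l := \begin{cases}
\operatorname{grad} f(X_k^l) & \text{for } l = 0, \\
\operatorname{grad} f(X_k^l) + \beta_k^l\, \Pi_{X_k^{l-1}, X_k^l}(\Omega_k^{l-1}) & \text{for } l = 1, \dots, n-1,
\end{cases}$$
where $\beta_k^l$ is a real parameter and $\Pi_{X,Y}(\xi)$ denotes the parallel transport of
$\xi$ along the geodesic from $X$ to $Y$.
(3) Jacobi-type method
$$\begin{aligned}
X_k^{l+1} &:= \operatorname*{argmax}_{t \in \mathbb{R}}\, f\big(\exp_{X_k^l}(t\,\Omega_l(X_k^l))\big), \\
X_{k+1}^{0} &:= X_k^{s},
\end{aligned} \tag{2.13}$$
where $\Omega_0, \Omega_1, \dots, \Omega_{s-1}$ are vector fields such that $\Omega_0(X), \Omega_1(X), \dots, \Omega_{s-1}(X)$
span $T_X M$ for all $X \in M$. The integer $s$ is called sweep length.
(4) Newton's method
$$X_{k+1} := \exp_{X_k}\big(-(\operatorname{Hess} f(X_k))^{-1} \operatorname{grad} f(X_k)\big), \tag{2.14}$$
where $\operatorname{Hess} f(X)$ denotes the Hessian of $f$ at $X$. Since inverting the Hessian or,
more precisely, solving the equation $\operatorname{Hess} f(X_k)\,Z = \operatorname{grad} f(X_k)$ is numerically
costly, in higher-dimensional problems it is customary to use approximative
methods with partial updates, e.g., the limited memory variant of the
Broyden–Fletcher–Goldfarb–Shanno algorithm (LBFGS) [64–66].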

2.4. Reachability and controllability


Up to now we have addressed optimization tasks over abstract state spaces forming
Riemannian manifolds $M$, hence the term abstract optimization task (AOT).
However, often there may be restrictions of the original manifold $M$ to a (proper)
submanifold $N \subsetneq M$.
In this paragraph, we sketch how restrictions arising from an underlying control
system can be accounted for. To this end, some general remarks on reachable
sets and controllability are in order. The abstract optimization task (AOT) then
amounts to optimizing over reachable sets, which is the topic we focus on here. In
contrast, the dynamic optimal control task (OCT) would give explicit steerings,
which will not be discussed here. Instead, we refer the reader to [67, 68], or, for the
quantum case, to [69, 70] and for numerical methods and applications to [30, 71–74].
For simplicity, let $(\Sigma)$ denote a smooth control system on the state manifold $M$,
i.e. a family of (ordinary) differential equations
$$(\Sigma) \qquad \dot{X} = F(X, u), \quad u \in U \subset \mathbb{R}^m \tag{2.15}$$
with control parameters $u \in U$ and smooth vector fields $F_u := F(\cdot, u)$ on $M$. While
the vector fields $F_u$ are assumed to be time-independent, the controls are allowed
to vary in time. For convenience, the resulting control function $t \mapsto u(t) \in U$ is
denoted again by $u$. Moreover, the set $\mathcal{U}$ of all admissible control functions is supposed
to contain at least all piecewise constant ones.
For an admissible control function $u$, we refer to $X(t, X_0, u)$ as the unique solution
of (2.15) with initial value $X_0$. Thereby the reachable set of $X_0$ is defined by

$$\operatorname{Reach}(X_0) := \bigcup_{0 \le T} \operatorname{Reach}(X_0, T). \tag{2.16}$$

Here $\operatorname{Reach}(X_0, T)$ denotes the set of all states which can be reached in time $T$, i.e.

$$\operatorname{Reach}(X_0, T) := \{X(T, X_0, u) \in M \mid u \in \mathcal{U}\}. \tag{2.17}$$

The system $(\Sigma)$ is said to be controllable, if $\operatorname{Reach}(X_0) = M$ for all $X_0 \in M$, i.e.
if for each pair $(X_0, Y_0)$ of states there exists an admissible control $u$ and a time
$T \ge 0$ such that $X(T, X_0, u) = Y_0$.
In general, it is hard to decide whether a given system $(\Sigma)$ is controllable or not.
However, for dynamics evolving on some Lie group $G$, the situation is much easier.
Let $(\Sigma_G)$ be a bilinear or, equivalently, a right invariant, control-affine system on
a matrix Lie group $G$ with Lie algebra $\mathfrak{g}$, i.e.

$$(\Sigma_G) \qquad \dot{X} = \Big(A_0 + \sum_{j=1}^{m} u_j A_j\Big) X, \quad u \in U \subset \mathbb{R}^m \tag{2.18}$$

with drift $A_0 \in \mathfrak{g}$ and control directions $A_j \in \mathfrak{g}$. For compact Lie groups $G$, a simple
algebraic test for controllability is known: If the system Lie algebra

$$\mathfrak{s} := \langle A_0, \dots, A_m \rangle_{\rm Lie} \tag{2.19}$$

generated by $A_0, \dots, A_m$ via nested commutators coincides with $\mathfrak{g}$, then the
corresponding system $(\Sigma_G)$ is controllable, cf. [75, 76]. In particular, there exists a
(minimal) finite time $T^* > 0$, such that the entire group $G$ can be reached from
any initial point $X_0 \in G$ within this time, i.e.

$$G = \operatorname{Reach}(X_0, T^*) := \bigcup_{0 \le T \le T^*} \operatorname{Reach}(X_0, T) \tag{2.20}$$

for all $X_0 \in G$. Estimates on $T^*$, which lead to upper and lower bounds for the
optimal time of state-to-state transfer in controlled quantum systems, can be found
in [77]. If $\mathfrak{s} \subsetneq \mathfrak{g}$, but $\mathfrak{s}$ generates a closed subgroup of $G$, one can still conclude what
the reachable set of (2.18) looks like:

$$\operatorname{Reach}(X_0) = S X_0, \tag{2.21}$$

where $S$ denotes the closed subgroup generated by $\mathfrak{s}$.
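
The algebraic test behind (2.19) can be sketched numerically: one repeatedly adds commutators of the generators until the real linear span stops growing and then compares the resulting dimension with $\dim \mathfrak{su}(N) = N^2 - 1$. The function name, the tolerance and the example spin systems below (one qubit, and two qubits with Ising coupling and local $x$, $y$ controls) are assumptions made for illustration.

```python
import numpy as np

def lie_closure_dimension(generators, tol=1e-10):
    """Real dimension of the Lie algebra generated by the given matrices via nested commutators."""
    basis_vecs, basis_mats = [], []            # orthonormal basis, as real vectors and as matrices
    queue = [np.asarray(g, dtype=complex) for g in generators]
    while queue:
        m = queue.pop()
        v = np.concatenate([m.real.ravel(), m.imag.ravel()])
        for b in basis_vecs:                   # Gram-Schmidt against the current basis
            v = v - (b @ v) * b
        norm = np.linalg.norm(v)
        if norm <= tol:
            continue                           # m already lies in the span found so far
        v = v / norm
        half = v.size // 2
        e = (v[:half] + 1j * v[half:]).reshape(m.shape)
        # schedule the commutators of the new element with all previous basis elements
        queue.extend(e @ b - b @ e for b in basis_mats)
        basis_vecs.append(v)
        basis_mats.append(e)
    return len(basis_vecs)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
id2 = np.eye(2)

# One qubit: drift -i*sz/2 and a single control -i*sx/2.
dim_one = lie_closure_dimension([-0.5j * sz, -0.5j * sx])

# Two qubits: Ising drift plus local x and y controls on each qubit.
drift = -0.5j * np.kron(sz, sz)
controls = ([-0.5j * np.kron(s, id2) for s in (sx, sy)]
            + [-0.5j * np.kron(id2, s) for s in (sx, sy)])
dim_two = lie_closure_dimension([drift] + controls)
dim_local = lie_closure_dimension(controls)    # no coupling: only local rotations
```

Here `dim_one` equals $3 = \dim \mathfrak{su}(2)$ and `dim_two` equals $15 = \dim \mathfrak{su}(4)$, so both example systems pass the test, whereas the uncoupled controls only generate the six-dimensional algebra $\mathfrak{su}(2) \oplus \mathfrak{su}(2)$.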


In contrast, for non-compact groups $G$, which are indispensable for describing
open quantum systems, the situation gets more involved. Here, $\mathfrak{s} = \mathfrak{g}$ implies only
accessibility of $(\Sigma_G)$, i.e. that all reachable sets $\operatorname{Reach}(X_0)$ have non-empty interior.
This follows from a more general result on smooth non-linear control systems, which
says that the so-called Lie algebra rank condition (LARC)
$$\{F(X) \mid F \in \langle F_u \mid u \in U \rangle_{\rm Lie}\} = T_X M, \tag{2.22}$$
for all $X \in M$ implies accessibility. Here, $\langle F_u \mid u \in U \rangle_{\rm Lie}$ denotes the Lie subalgebra
of vector fields generated by $F_u$, $u \in U$ via Lie bracket operation, cf. [67]. Note
that for right invariant vector fields on $G$, the Lie bracket coincides (up to sign)
with the commutator such that (2.22) boils down to $\mathfrak{s} = \mathfrak{g}$. Moreover, by exploiting
the identity
$$\operatorname{Reach}(1, T_1) \cdot \operatorname{Reach}(1, T_2) = \operatorname{Reach}(1, T_1 + T_2), \tag{2.23}$$
one can show that $\operatorname{Reach}(1)$ is always a Lie subsemigroup of $G$. A subsemigroup is a
subset $S \subset G$ which contains the unity and is closed under multiplication, i.e. $1 \in S$
and $S \cdot S \subseteq S$. However, the geometry of subsemigroups is rather subtle compared
to Lie subgroups and therefore at present not amenable to intrinsic optimization
methods, as shown in more detail in a paper dwelling on open quantum systems in
terms of Lie semigroups [59].

2.5. Settings of interest


In terms of reachability, there are different scenarios that structure the subsequent
line of thought: we start out with fully controllable or operator controllable quantum
systems [28, 75, 76, 78–81] represented as spin- or pseudo-spin systems. Then,
neglecting decoherence, to any initial state represented by its density operator $A$,
the entire unitary orbit $\mathcal{O}(A) := \{U A U^{-1} \mid U \text{ unitary}\}$ can be reached [81]. In
systems of $n$ qubits (e.g., spin-$\tfrac{1}{2}$ particles), this is the case under the following
mild conditions [82]: (1) the qubits have to be inequivalent, i.e. distinguishable and
selectively addressable, and (2) they have to be pairwise coupled (e.g., by Ising
or Heisenberg-type interactions), where the coupling topology may take the form
of any graph as long as it is connected, (3) the Hamiltonians must not show any
symmetry (so the system algebra has to be given in irreducible representation),
and finally (4) the Hamiltonians must not (simultaneously) allow for an orthogonal
or a symplectic representation. In other instances not the entire (unitary) group,
but just a subgroup $K$ can be reached. This is the case if the system Lie algebra
is a proper subalgebra of the full unitary algebra, so $\mathfrak{k} \subsetneq \mathfrak{su}(N)$ or equivalently
$\exp \mathfrak{k} = K \subsetneq SU(N)$. Such restrictions may arise from symmetry constraints,
which can be conveniently characterized by the centralizer^a of $\mathfrak{k}$ in $\mathfrak{su}(N)$, see [82, 83].
Otherwise, the system itself can be fully controllable, but the focus of interest
may be reduced: e.g., the subgroup $K = SU_{\rm loc}(2^n) := SU(2) \otimes SU(2) \otimes \cdots \otimes SU(2)$
of (possibly fast) local actions on each qubit is of interest to study local reachability,
or whether an effective multi-qubit interaction Hamiltonian is locally reversible in
the sense of Hahn's spin echo [29]. Or, one may ask what is the Euclidean distance
of some pure state to the nearest point on the local unitary orbit of a pure product
state. This may be useful when optimizing entanglement witnesses [55]. Likewise,
one may address other than the finest partitioning of the entire unitary group, e.g.,
$K = SU(2^{n_1}) \otimes \cdots \otimes SU(2^{n_j}) \otimes \cdots \otimes SU(2^{n_r}) \subsetneq SU(2^n)$, where $\sum_{j=1}^{r} n_j = n$.
Another type of reduction arises not by restriction to a subgroup $H$, but by the
fact that the quality function of interest $f$ is equivariant, i.e. constant on cosets
$HG$. Consider, for instance, a fully controllable system where $f$ is equivariant with
respect to the closed subgroup $H \subset G$. Then, it may be favorable to transfer the
optimization problem from the original Lie group $G$ to the homogeneous space
$G/H$.

^a i.e. by $\mathfrak{k}' := \{ s \in \mathfrak{su}(N) \mid [s, k] = 0 \ \ \forall\, k \in \mathfrak{k} \}$.

3. Theory: Gradient Flows


Gradient systems are a standard tool of Riemannian optimization for maximizing
smooth quality functions on a manifold $M$. Here, the manifold structure of $M$ arises
either naturally by the problem itself or by smooth equality-constraints imposed
on a previously unconstrained problem. Note that in general inequality-constraints
would entail manifolds with a boundary and thus are a much more subtle issue,
not to be developed any further here.
The case $M = \mathbb{R}^m$, sometimes referred to as the unconstrained case, is well-known
and can be found in many texts on ordinary differential equations or nonlinear
programming, cf. [84–87]. However, gradient systems on abstract Riemannian
manifolds provide a rather new approach to constrained optimization problems.
Although the resulting numerical algorithms are in general only linearly convergent,
their global behavior is often much better than the global behavior of locally
quadratic methods.
Textbooks combining the different areas of Riemannian geometry, gradient systems
and constrained optimization are quite rare. The best choices to our knowledge
are [22, 61]. For further reading we also suggest the papers [19–21, 62]. Nevertheless,
most of the material which is necessary to understand the intrinsic optimization
approach applied in Sec. 4 is scattered in many different references. For the reader's
convenience, we therefore review the basic ideas on these topics. First, we discuss
the general setting on Riemannian manifolds, then we proceed with Lie groups
and finally summarize some more advanced results on homogeneous spaces. For
standard definitions and terminology from Riemannian geometry we refer to any
modern text on this subject such as [88–91].

3.1. Gradient flows on Riemannian manifolds


In the following, let $M$ denote a finite dimensional smooth manifold with tangent
and cotangent bundles $TM$ and $T^*M$, respectively. Moreover, let $M$ be equipped
with a Riemannian metric $\langle \cdot \,|\, \cdot \rangle$, i.e. with a scalar product $\langle \cdot \,|\, \cdot \rangle_X$ on each tangent
space $T_X M$ varying smoothly with $X \in M$. More precisely, $\langle \cdot \,|\, \cdot \rangle$ has to be a
smooth, positive definite section in the bundle of all symmetric bilinear forms over
$M$. Such sections always exist for finite dimensional smooth manifolds, cf. [89, 92].
The pair $(M, \langle \cdot \,|\, \cdot \rangle)$ is called a Riemannian manifold.
Let $f : M \to \mathbb{R}$ be a smooth quality function on $M$ with differential $Df : M \to T^*M$.
Then the gradient of $f$ at $X \in M$, denoted by $\operatorname{grad} f(X)$, is the vector in
$T_X M$ uniquely determined by the equation
$$Df(X)\,\xi = \langle \operatorname{grad} f(X) \,|\, \xi \rangle_X \tag{3.1}$$
for all $\xi \in T_X M$. Equation (3.1) naturally defines a vector field on $M$ via
$$\operatorname{grad} f : M \to TM, \quad X \mapsto \operatorname{grad} f(X) \tag{3.2}$$
called the gradient vector field of $f$. The corresponding ordinary differential equation
$$\dot{X} = \operatorname{grad} f(X), \tag{3.3}$$
and its flow are referred to as the gradient system and the gradient flow of $f$,
respectively.
Obviously, the critical points of $f : M \to \mathbb{R}$ coincide with the equilibria of the
gradient flow. Moreover, the quality function $f$ is monotonically increasing along
solutions $X(t)$ of (3.3), i.e. the real-valued function $t \mapsto f(X(t))$ is monotonically
increasing in $t$, as
$$\frac{d}{dt} f(X(t)) = \big\langle \operatorname{grad} f(X(t)) \,\big|\, \dot{X}(t) \big\rangle_{X(t)} = \big\| \operatorname{grad} f(X(t)) \big\|_{X(t)}^{2} \ge 0.$$

Here $\|\cdot\|_X$ denotes the norm on $T_X M$ induced by $\langle \cdot \,|\, \cdot \rangle_X$, i.e. $\|\xi\|_X := \sqrt{\langle \xi \,|\, \xi \rangle_X}$
for all $\xi \in T_X M$.

3.1.1. Convergence of gradient flows


Recall that the asymptotic behavior (for $t \to +\infty$) of a solution of (3.3) is characterized
by its $\omega$-limit set

$$\omega(X_0) := \bigcap_{0 < t < t^+(X_0)} \overline{\{X(\tau, X_0) \mid t \le \tau < t^+(X_0)\}},$$

where $\overline{\{\cdots\}}$ denotes the closure of the set $\{\cdots\}$ and $X(t, X_0)$ the unique solution
of (3.3) with initial value $X(0) = X_0$ and positive escape time $t^+(X_0) > 0$. The
following result gives a sufficient condition for solutions of Eq. (3.3) to converge to
the set of critical points of $f$.

Proposition 3.1. If $f$ has compact superlevel sets, i.e. if the sets
$\{X \in M \mid f(X) \ge C\}$ are compact for all $C \in \mathbb{R}$, then any solution of Eq. (3.3) exists for
all $t \ge 0$ and its $\omega$-limit set is a non-empty, compact and connected subset of the set of
critical points of $f$.

Proof. Since $f$ is monotonically increasing along solutions of Eq. (3.3), the compact
sets $\{X \in M \mid f(X) \ge C\}$ are positively invariant, i.e. invariant for $t \ge 0$ under
the gradient flow of Eq. (3.3). Thus the assertion follows from standard results on
$\omega$-limit sets and Lyapunov theory, cf. [22, 85].

Although Proposition 3.1 guarantees that $\omega(X_0)$ is contained in the set of
critical points of $f$, this does not imply convergence to a critical point. Indeed,
there are smooth gradient systems which exhibit solutions converging only to the
set of critical points, cf. [93]. The next two results provide sufficient conditions
for convergence to a single critical point under different settings. In particular,
Theorem 3.1 yields a powerful tool for analyzing real analytic gradient systems.

Corollary 3.1. If $f$ has compact superlevel sets and if all critical points are
isolated, then any solution of (3.3) converges to a critical point of $f$ for $t \to +\infty$.

Proof. This is an immediate consequence of Proposition 3.1.

Theorem 3.1 ([94]). If $(M, \langle \cdot \,|\, \cdot \rangle)$ and $f$ are real analytic, then all non-empty
$\omega$-limit sets $\omega(X_0)$ of Eq. (3.3) are singletons, i.e. $\omega(X_0) \neq \emptyset$ implies that $X(t, X_0)$
converges to a single critical point $X_*$ of $f$ for $t \to +\infty$.

Proof. The main argument is based on Łojasiewicz's inequality which says that
in a neighborhood of $X_*$ an estimate of the type
$$|f(X) - f(X_*)|^{p} \le C\, \|\operatorname{grad} f(X)\|$$
for some $p < 1$ and $C > 0$ holds. A complete proof can be found in [94, 95].

3.1.2. Restriction to submanifolds


Now, consider the restriction of $f$ to a smooth submanifold $N \subset M$. Obviously, the
Riemannian metric $\langle \cdot \,|\, \cdot \rangle$ on $M$ restricts to a Riemannian structure on $N$. Thus
$(N, \langle \cdot \,|\, \cdot \rangle|_{TN})$ constitutes a Riemannian manifold in a canonical way. Moreover,
the equality $D(f|_N)(X) = Df(X)|_{T_X N}$ immediately implies that the gradient of the
restriction $f|_N$ at $X \in N$ is given by the orthogonal projection of $\operatorname{grad} f(X)$ onto
$T_X N$, i.e.
$$\operatorname{grad} f|_N (X) = P_X(\operatorname{grad} f(X)), \tag{3.4}$$
where $P_X$ denotes the orthogonal projector onto $T_X N$. Hence the gradient system
of $f|_N$ on an arbitrary submanifold $N$ is well-defined and reads
$$\dot{X} = P_X(\operatorname{grad} f(X)). \tag{3.5}$$
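
A simple instance of (3.4), chosen here purely for illustration, is the unit sphere $N = S^{n-1} \subset \mathbb{R}^n$ with the quadratic form $f(x) = x^\top A x$: the tangent space at $x$ is the orthogonal complement of $x$, so $P_x = \mathbb{1} - x x^\top$. The function names and the matrix in the following sketch are assumptions.

```python
import numpy as np

def tangent_projector(x):
    """Orthogonal projector P_X onto T_X S^{n-1} = {xi : <x, xi> = 0} for a unit vector x."""
    return np.eye(x.size) - np.outer(x, x)

def restricted_gradient(a, x):
    """grad f|_N (X) = P_X(grad f(X)) for f(x) = x^T A x with symmetric A, cf. Eq. (3.4)."""
    euclidean_grad = 2.0 * a @ x          # gradient of f in the ambient space R^n
    return tangent_projector(x) @ euclidean_grad

a = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
x = np.array([1.0, 2.0, 2.0]) / 3.0       # a point on the unit sphere
g = restricted_gradient(a, x)
```

The projected gradient `g` is tangent to the sphere at `x`, and it vanishes exactly at the unit eigenvectors of $A$, which are therefore the critical points of the restricted function.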

3.1.3. Analyzing critical points by the Hessian


Subsequently, we address the problem of how to define and compute the Hessian of $f$,
as its knowledge is essential for a deeper insight into (3.3). For instance, the stability
of critical points is determined by its eigenvalues, and the computation of explicit
discretization schemes, preserving the convergence behavior of (3.3), can be based
on it, cf. [17, 22].
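At critical points $X_* \in M$ of $f$, the Hessian is given by the symmetric bilinear
form $\operatorname{Hess} f(X_*) : T_{X_*} M \times T_{X_*} M \to \mathbb{R}$,
$$\operatorname{Hess} f(X_*)(\xi, \eta) := (D\varphi(X_*)\xi)^{\top} \operatorname{Hess}(f \circ \varphi^{-1})(\varphi(X_*))\, D\varphi(X_*)\eta, \tag{3.6}$$
where $\varphi$ is any chart around $X_*$ and $\operatorname{Hess}(f \circ \varphi^{-1})$ denotes the ordinary Hesse
matrix of $f \circ \varphi^{-1}$. It is straightforward to show that (3.6) is independent of $\varphi$.
Equivalently, $\operatorname{Hess} f(X_*)$ is uniquely determined by
$$\operatorname{Hess} f(X_*)(\xi, \xi) := \frac{d^2 (f \circ \gamma)}{dt^2}\bigg|_{t=0}, \tag{3.7}$$
where $\gamma$ is any smooth curve with $X_* = \gamma(0)$ and $\dot{\gamma}(0) = \xi$, while the remaining
values of $\operatorname{Hess} f(X_*)$ can be obtained by a standard polarization argument,^b i.e.
via the formula
$$2 \operatorname{Hess} f(X_*)(\xi, \eta) = \operatorname{Hess} f(X_*)(\xi + \eta, \xi + \eta) - \operatorname{Hess} f(X_*)(\xi, \xi) - \operatorname{Hess} f(X_*)(\eta, \eta). \tag{3.8}$$
However, the previous definition does not apply to regular points of $f$. In general,
one has to establish the concept of geodesics, cf. Remark 3.1. More precisely, the
Hessian of $f$ at an arbitrary point $X \in M$ is given by
$$\operatorname{Hess} f(X)(\xi, \xi) := \frac{d^2 (f \circ \gamma)}{dt^2}\bigg|_{t=0}, \tag{3.9}$$
where $\gamma$ is the unique geodesic with $X = \gamma(0)$ and $\dot{\gamma}(0) = \xi$. Again, the remaining
values can be computed by (3.8). As usual, we associate to $\operatorname{Hess} f(X)$ a unique
selfadjoint linear operator $\mathbf{Hess}\, f(X) : T_X M \to T_X M$ such that
$$\langle \xi \,|\, \mathbf{Hess}\, f(X)\, \eta \rangle_X = \operatorname{Hess} f(X)(\xi, \eta) \tag{3.10}$$
holds for all $\xi, \eta \in T_X M$. It is called the Hessian operator of $f$ at $X \in M$.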
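
Definition (3.9) can be checked numerically in a case where geodesics are known in closed form. The sketch below, an illustration under assumptions not made in the text, again uses $f(x) = x^\top A x$ on the unit sphere, whose geodesics are great circles, and compares a central finite difference of $t \mapsto f(\gamma(t))$ with the closed form $2\,(\xi^\top A \xi - \|\xi\|^2\, x^\top A x)$ that follows by differentiating $f(\gamma(t))$ twice at $t = 0$.

```python
import numpy as np

def geodesic(x, xi, t):
    """Great circle on the unit sphere with gamma(0) = x and gamma'(0) = xi (xi tangent at x)."""
    speed = np.linalg.norm(xi)
    return np.cos(speed * t) * x + np.sin(speed * t) * xi / speed

def hessian_quadratic_form(f, x, xi, h=1e-4):
    """Approximate Hess f(X)(xi, xi) = d^2/dt^2 f(gamma(t)) at t = 0 by a central difference."""
    return (f(geodesic(x, xi, h)) - 2.0 * f(x) + f(geodesic(x, xi, -h))) / h**2

a = np.diag([1.0, 2.0, 5.0])
f = lambda x: x @ a @ x
x_star = np.array([0.0, 0.0, 1.0])        # eigenvector of the largest eigenvalue: a critical point
xi = np.array([0.6, 0.8, 0.0])            # unit tangent vector at x_star
numeric = hessian_quadratic_form(f, x_star, xi)
closed_form = 2.0 * (xi @ a @ xi - (xi @ xi) * f(x_star))
```

At this critical point the quadratic form is negative for every tangent direction, because $\xi^\top A \xi < 5\,\|\xi\|^2$ for all $\xi \perp x_*$, in line with $x_*$ being a strict local maximum of $f$ on the sphere.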

Remark 3.1. In modern textbooks on differential geometry, the concept of
geodesics as well as the notion of (higher) covariant derivatives are defined via
linear connections, cf. [90, 96]. Therefore, Eq. (3.9) is usually derived as a consequence
and not introduced as a definition of the Hessian. For Riemannian manifolds,
however, it is also possible to establish (Riemannian) geodesics as curves of minimal
arc length. Both approaches coincide if one picks the so-called Riemannian or
Levi-Civita connection as linear connection on $M$.

^b More precisely, the polarization procedure is defined as follows: Let $\mathcal{H}$ be a real Hilbert space
and $q : \mathcal{H} \to \mathbb{R}$ a bounded quadratic form, i.e. there exists a bounded symmetric bilinear form
$B : \mathcal{H} \times \mathcal{H} \to \mathbb{R}$ such that $q(v) = B(v, v)$ for all $v \in \mathcal{H}$. By the symmetry and bilinearity of $B$
we have $B(v + w, v + w) = B(v, v) + B(w, w) + 2B(v, w)$ and hence
$B(v, w) = \tfrac{1}{2}\big(B(v + w, v + w) - B(v, v) - B(w, w)\big) = \tfrac{1}{2}\big(q(v + w) - q(v) - q(w)\big)$ for all $v, w \in \mathcal{H}$. Therefore, $B$ is uniquely
determined by the quadratic form $q$ and the latter identity is known as law of polarization.
Unfortunately, the computation of geodesics is in general a non-trivial problem,
as one has to solve (in local charts) a second order differential equation. However,
on compact Lie groups their calculation is rather simple as we will see at the end
of Sec. 3.2.
The above concepts yield the following generalization of a familiar result from
elementary calculus for characterizing local extreme points.
Theorem 3.2. Let $M$ be a Riemannian manifold and let $X_*$ be a critical point
of the quality function $f : M \to \mathbb{R}$. If $\operatorname{Hess} f(X_*)$ or, equivalently, $\mathbf{Hess}\, f(X_*)$ is
negative definite, then $X_*$ is a strict local maximum of $f$.

Proof. In local coordinates the result follows straightforwardly from Eq. (3.6).

In general, (asymptotic) stability of an equilibrium $X_* \in M$ of (3.3) may depend
on the Riemannian metric $\langle \cdot \,|\, \cdot \rangle$. However, the property of being a strict local
maximum or an isolated critical point of a smooth function $f$ obviously does not
depend on the choice of any Riemannian metric. Therefore, the following result shows that
in fact certain (asymptotically) stable equilibria $X_* \in M$ of (3.3) are independent
of the Riemannian metric.
Theorem 3.3. (a) If $X_* \in M$ is a strict local maximum of $f$, then $X_*$ is a stable
equilibrium of (3.3). In particular, for any neighborhood $U$ of $X_*$ there exists
a neighborhood $V$ of $X_*$ such that the $\omega$-limit sets $\omega(X_0)$ are non-empty and
contained in $U$ for all $X_0 \in V$.
(b) If $X_* \in M$ is a strict local maximum and an isolated critical point of $f$, then
$X_*$ is an asymptotically stable equilibrium of (3.3). In particular, there is a
neighborhood $V$ of $X_*$ such that $\omega(X_0) = \{X_*\}$ for all $X_0 \in V$, i.e. all solutions
$X(t, X_0)$ with initial value $X_0 \in V$ converge to $X_*$ for $t \to +\infty$.

Proof. Both assertions follow immediately from classical stability theory by taking
f as Lyapunov function, cf. [22, 85].

Note that the convergence analysis near arbitrary equilibria, i.e. near arbitrary
critical points of f is quite subtle and may depend on the Riemannian metric,
cf. [97].

3.1.4. Discretized gradient flows


Finally, we approach the problem of finding discretizations of (3.3) which lead to
convergent gradient ascent methods. The ideas presented below can be traced back
to Brockett, cf. [17]. Let
$$\exp_X : T_X M \to M \tag{3.11}$$
be the Riemannian exponential map at $X \in M$, i.e. $t \mapsto \exp_X(t\xi)$ denotes the
unique geodesic with initial value $X \in M$ and initial velocity $\xi \in T_X M$. Moreover,
we assume that $M$ is (geodesically) complete, i.e. any geodesic is defined for all
$t \in \mathbb{R}$. Hence, (3.11) is well-defined for the entire tangent bundle $TM$.
The simplest discretization approach, a scheme that can be seen as an intrinsic
Euler step method, leads to
Riemannian gradient method
$$X_{k+1} := \exp_{X_k}(\alpha_k \operatorname{grad} f(X_k)) \tag{3.12}$$
where $\alpha_k$ denotes an appropriate step size.
In order to guarantee convergence of (3.12) to the set of critical points, it is sufficient
to apply the Armijo rule [87]. An alternative to Armijo's rule is provided by the step size
selection suggested by Brockett in [19], see also [22]. Convergence to a single critical
point is a more subtle issue. If $(M, \langle \cdot \,|\, \cdot \rangle)$ and $f$ are analytic, and the step sizes are
chosen according to a version of the first Wolfe–Powell condition for Riemannian
manifolds, then pointwise convergence holds. A detailed proof can be found in [98].
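
The Riemannian gradient method (3.12) with an Armijo-type step size selection can be sketched on the unit sphere, where the exponential map is explicit. The backtracking variant below (initial step `alpha0`, halving factor, constant `sigma`), the quality function $f(x) = x^\top A x$ and all names are assumptions chosen for illustration; they are not the specific rules of [87] or [19].

```python
import numpy as np

def exp_sphere(x, v):
    """Riemannian exponential on the unit sphere: follow the great circle from x with velocity v."""
    n = np.linalg.norm(v)
    if n < 1e-15:
        return x
    return np.cos(n) * x + np.sin(n) * v / n

def riemannian_gradient_ascent(a, x0, n_iter=500, sigma=1e-4, alpha0=1.0):
    """Iteration (3.12) for f(x) = x^T A x on the sphere with Armijo backtracking."""
    f = lambda x: x @ a @ x
    x = x0 / np.linalg.norm(x0)
    values = [f(x)]
    for _ in range(n_iter):
        g = 2.0 * (a @ x - f(x) * x)          # Riemannian gradient: projection of 2Ax onto T_x S
        g2 = g @ g
        alpha, x_new = alpha0, exp_sphere(x, alpha0 * g)
        # halve the step until the sufficient-increase (Armijo) condition holds
        while f(x_new) < f(x) + sigma * alpha * g2 and alpha > 1e-12:
            alpha *= 0.5
            x_new = exp_sphere(x, alpha * g)
        if f(x_new) >= f(x):                  # accept only steps that do not decrease f
            x = x_new / np.linalg.norm(x_new) # remove round-off drift from the sphere
        values.append(f(x))
    return x, values

a = np.diag([1.0, 2.5, 4.0])
x_final, values = riemannian_gradient_ascent(a, np.ones(3))
```

In this example the iterates approach the eigenvector of the largest eigenvalue, so `values` increases towards $4$; the intermediate eigenvectors are also critical points but are saddles of $f$ on the sphere.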

3.2. Gradient flows on Lie groups


In the following, we apply the previous results to Lie groups and Lie subgroups.
However, to fully exploit Lie-theoretic tools, the Riemannian structure and the
group structure have to match, i.e. the metric $\langle \cdot \,|\, \cdot \rangle$ has to be invariant under the
group action. For basic concepts and results on Lie groups and their Riemannian
geometry we refer to [88, 89, 99–101]. In particular, we recommend the AMS booklet
of Arvanitoyeorgos [102] for a rather comprehensive, but condensed overview including
many references for further reading. Sometimes we refer to [102] although it does
not contain a full proof of the corresponding statement. Nevertheless, the details
given therein will help the reader to get a better understanding of the subject. In
any case, we have always added a second reference containing a complete proof.
Let $\mathbf{G}$ denote a finite dimensional Lie group, i.e. a group which carries a smooth
manifold structure such that the group operations are smooth mappings.^c For notational
convenience we will assume that $\mathbf{G}$ can be represented as a (closed) matrix Lie
group, i.e. as an (embedded) Lie subgroup of some general linear group $GL(N, \mathbb{K})$
of invertible $N \times N$ matrices over $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$.

Remark 3.2. According to a well-known result by Cartan, a subgroup $\mathbf{G}$ of
$GL(N, \mathbb{K})$ is an (embedded) Lie subgroup, i.e. a smooth submanifold of $GL(N, \mathbb{K})$,
if and only if it is closed in $GL(N, \mathbb{K})$, cf. [103]. Note, however, that there is a
subtle difference between embedded and immersed Lie subgroups. Moreover, not
every abstract Lie group admits a faithful representation as a matrix Lie group.
Nevertheless, the class of matrix Lie groups is rich enough for all of our subsequent
applications. For more details on these topics we also refer to [100, 103].

^c Actually, any Lie group $\mathbf{G}$ exhibits a real analytic substructure (induced by the exponential
map), i.e. $\mathbf{G}$ can also be regarded as a real analytic manifold [101, 103].

3.2.1. Invariant metrics


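A Lie group $\mathbf{G}$ can be endowed in a canonical way with a Riemannian metric $\langle \cdot \,|\, \cdot \rangle$.
Let $\mathfrak{g} := T_1 \mathbf{G}$ be the Lie algebra of $\mathbf{G}$, i.e. the tangent space of $\mathbf{G}$ at the unity $1$.
From the fact that the right multiplication $r_H : \mathbf{G} \to \mathbf{G}$ and left multiplication
$l_H : \mathbf{G} \to \mathbf{G}$ are diffeomorphisms of $\mathbf{G}$ for all $H \in \mathbf{G}$, it follows
$$T_H \mathbf{G} = \mathfrak{g} H = H \mathfrak{g} \tag{3.13}$$
for all $H \in \mathbf{G}$. Now, let $(\cdot \,|\, \cdot)$ be any scalar product on $\mathfrak{g}$. Then
$$\langle g \,|\, h \rangle_G := (g G^{-1} \,|\, h G^{-1}) \tag{3.14}$$
for all $G \in \mathbf{G}$ and $g, h \in T_G \mathbf{G}$ yields a right invariant metric on $\mathbf{G}$, where right
invariance stands for
$$\langle g \,|\, h \rangle_G = \langle g H \,|\, h H \rangle_{GH} \tag{3.15}$$
for all $G, H \in \mathbf{G}$ and $g, h \in T_G \mathbf{G}$. Thus right multiplication $r_H$ represents an
isometry of $\mathbf{G}$. In the same way, one could obtain left invariant metrics on $\mathbf{G}$.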

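Remark 3.3. In an abstract setting, one has to replace (3.13) by
$$T_H \mathbf{G} = D r_H(1)\, \mathfrak{g} = D l_H(1)\, \mathfrak{g} \tag{3.16}$$
for all $H \in \mathbf{G}$, where $D r_H$ and $D l_H$ denote the tangent maps of $r_H$ and $l_H$,
respectively. For a matrix Lie group, however, the respective tangent maps are
given by $D r_H(G)\,\xi = \xi H$ and $D l_H(G)\,\xi = H \xi$ for all $G \in \mathbf{G}$ and $\xi \in T_G \mathbf{G}$. Hence
(3.16) reduces to (3.13).
The construction of bi-invariant, i.e. right and left invariant metrics is much
more subtle and in general even impossible. To summarize the basic results on this
topic we need some further terminology. The adjoint maps $\operatorname{Ad} : \mathbf{G} \to GL(\mathfrak{g})$ and
$\operatorname{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$ are defined by
$$\operatorname{Ad}_G h := G h G^{-1} \quad \text{and} \quad \operatorname{ad}_g h := [g, h] := gh - hg$$
for all $G \in \mathbf{G}$ and all $g, h \in \mathfrak{g}$, where $GL(\mathfrak{g})$ and $\mathfrak{gl}(\mathfrak{g})$ denote the set of all
automorphisms and, respectively, endomorphisms of $\mathfrak{g}$. Note both notations $\operatorname{ad}_g h$ and
$[g, h]$ are used interchangeably in the literature. A bilinear form $(\cdot \,|\, \cdot)$ on $\mathfrak{g}$ is
called
(a) $\operatorname{Ad}_{\mathbf{G}}$-invariant if the identity
$$(g \,|\, h) = (\operatorname{Ad}_G g \,|\, \operatorname{Ad}_G h) \tag{3.17}$$
is satisfied for all $g, h \in \mathfrak{g}$ and $G \in \mathbf{G}$.
(b) $\operatorname{ad}_{\mathfrak{g}}$-invariant if the identity
$$(\operatorname{ad}_g h \,|\, k) = -(h \,|\, \operatorname{ad}_g k) \tag{3.18}$$
is satisfied for all $g, h, k \in \mathfrak{g}$.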
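
Both invariance properties can be checked numerically for a concrete candidate. The sketch below takes, as an assumed example, the scalar product $(g \,|\, h) = \operatorname{Re}\operatorname{tr}(g^{\dagger} h)$ on the Lie algebra $\mathfrak{u}(n)$ of skew-Hermitian matrices and evaluates the defects of the $\operatorname{Ad}$-invariance identity and of the skew-symmetry $(\operatorname{ad}_g h \,|\, k) + (h \,|\, \operatorname{ad}_g k) = 0$ for random elements. All names are illustrative.

```python
import numpy as np

def random_skew_hermitian(n, rng):
    """Random element of u(n), i.e. a skew-Hermitian n x n matrix."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return 0.5 * (m - m.conj().T)

def inner(g, h):
    """Candidate scalar product (g | h) = Re tr(g^dagger h) on u(n)."""
    return np.trace(g.conj().T @ h).real

def bracket(g, h):
    """ad_g h = [g, h] = gh - hg."""
    return g @ h - h @ g

rng = np.random.default_rng(1)
n = 3
g, h, k = (random_skew_hermitian(n, rng) for _ in range(3))
u, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))  # a unitary matrix

# Ad-invariance: conjugation by a unitary leaves the scalar product unchanged.
ad_group_defect = inner(u @ g @ u.conj().T, u @ h @ u.conj().T) - inner(g, h)
# ad-invariance: ad_g is skew-symmetric with respect to the scalar product.
ad_algebra_defect = inner(bracket(g, h), k) + inner(h, bracket(g, k))
```

Both defects vanish up to round-off, which reflects the cyclicity of the trace together with $U^{-1} = U^{\dagger}$.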

Proposition 3.2. The following statements are equivalent:


(a) There exists a bi-invariant Riemannian metric $\langle \cdot \,|\, \cdot \rangle$ on $\mathbf{G}$.
(b) There exists an $\operatorname{Ad}_{\mathbf{G}}$-invariant scalar product $(\cdot \,|\, \cdot)$ on $\mathfrak{g}$.
Moreover, each of the statements (a) and (b) implies
(c) There exists an $\operatorname{ad}_{\mathfrak{g}}$-invariant scalar product $(\cdot \,|\, \cdot)$ on $\mathfrak{g}$.
If $\mathbf{G}$ is also connected, then (c) is equivalent to (a) and (b), respectively.

Proof. The equivalence (a) $\Longleftrightarrow$ (b) follows easily by exploiting Eq. (3.15) at $G = 1$.
Moreover, applying (b) to a one-parameter subgroup $t \mapsto \exp(tg)$ and taking the
derivative at $t = 0$ yields (c). The implication (c) $\Longrightarrow$ (b) is obtained in the same
way, i.e. by differentiating
$$t \mapsto (\operatorname{Ad}_{e^{tg}} h \,|\, \operatorname{Ad}_{e^{tg}} k)$$
with respect to $t$, cf. [102]. Note, however, that this implies $\operatorname{Ad}_{\mathbf{G}}$-invariance only
on the connected component of the unity. Therefore, connectedness is necessary for
the implication (c) $\Longrightarrow$ (b) as counter-examples show.

Now, the main result on the existence of bi-invariant metrics reads as follows.

Theorem 3.4. A connected Lie group $\mathbf{G}$ admits a bi-invariant Riemannian metric
if and only if $\mathbf{G}$ is the direct product of a compact Lie group $\mathbf{G}_0$ and an abelian
one, which is isomorphic to some $(\mathbb{R}^m, +)$, i.e. $\mathbf{G} \cong \mathbf{G}_0 \times \mathbb{R}^m$.

Proof. Cf. [89, 104].

Finally, we focus on a special class of Lie groups. A connected Lie group $\mathbf{G}$
is called semisimple if the Killing form, i.e. the bilinear form $(g, h) \mapsto \kappa(g, h) :=
\operatorname{tr}(\operatorname{ad}_g \operatorname{ad}_h)$ is non-degenerate on $\mathfrak{g}$. Most prominent representatives of this class are
$SL(N, \mathbb{R})$, $SL(N, \mathbb{C})$, $SO(N, \mathbb{R})$ and $SU(N)$. More on semisimple Lie groups and
their algebras can be found in [99, 103].

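Theorem 3.5. (a) If $\mathbf{G}$ is semisimple then the Killing form $\kappa$ defines an
$\operatorname{ad}_{\mathfrak{g}}$-invariant bilinear form on $\mathfrak{g}$.
(b) If $\mathbf{G}$ is semisimple and compact then $-\kappa$ defines an $\operatorname{ad}_{\mathfrak{g}}$-invariant scalar product
on $\mathfrak{g}$. Thus $-\kappa$ induces a bi-invariant Riemannian metric on $\mathbf{G}$.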

Proof. Cf. [102, 103].
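
For the compact semisimple group $SU(2)$ the Killing form can be computed directly from its definition. The following sketch, with an assumed basis $e_a = -\tfrac{i}{2}\sigma_a$ of $\mathfrak{su}(2)$ and illustrative function names, builds the matrices of $\operatorname{ad}_g$ in that basis and evaluates $\kappa(g, h) = \operatorname{tr}(\operatorname{ad}_g \operatorname{ad}_h)$.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [-0.5j * s for s in (sx, sy, sz)]      # basis of su(2), orthogonal w.r.t. Re tr(g^dagger h)

def coordinates(m):
    """Coordinates of m in su(2) with respect to `basis` (each basis element has norm^2 = 1/2)."""
    return np.array([np.trace(b.conj().T @ m).real / 0.5 for b in basis])

def ad_matrix(g):
    """Matrix of ad_g = [g, .] acting on su(2), expressed in the chosen basis."""
    return np.column_stack([coordinates(g @ b - b @ g) for b in basis])

def killing(g, h):
    """Killing form kappa(g, h) = tr(ad_g ad_h)."""
    return np.trace(ad_matrix(g) @ ad_matrix(h))

kappa = np.array([[killing(g, h) for h in basis] for g in basis])
```

In this basis `kappa` equals $-2\,\mathbb{1}_3$, so $-\kappa$ is positive definite as stated in Theorem 3.5(b). The values also agree with $4 \operatorname{tr}(gh)$, the standard expression $\kappa(g, h) = 2N \operatorname{tr}(gh)$ for $\mathfrak{su}(N)$ at $N = 2$.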

3.2.2. Gradient flows with respect to an invariant metric


Next, we study gradient flows on $\mathbf{G}$ or on a closed subgroup $\mathbf{H} \subset \mathbf{G}$ with respect
to an invariant metric $\langle \cdot \,|\, \cdot \rangle$. To this end, let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function
and let $\varphi : \mathbf{G} \to \mathbf{G}$ be any diffeomorphism. Using the identity
$$\operatorname{grad}(f \circ \varphi)(G) = (D\varphi(G))^{*} \operatorname{grad} f(\varphi(G)) \tag{3.19}$$
for all $G \in \mathbf{G}$, where $(\cdot)^{*}$ denotes the adjoint operator, we obtain by the right
invariance of the metric
$$\operatorname{grad} f(G) = \operatorname{grad}(f \circ r_G)(1)\, G \tag{3.20}$$
for all $G \in \mathbf{G}$. Hence
$$\dot{G} = \operatorname{grad} f(G) \tag{3.21}$$
can be rewritten as
$$\dot{G} = \operatorname{grad}(f \circ r_G)(1)\, G. \tag{3.22}$$
Thus the gradient flow of $f$ is determined by the map $G \mapsto \operatorname{grad}(f \circ r_G)(1) \in \mathfrak{g}$.
To study the asymptotic behavior of Eq. (3.21) we can apply the results of the
previous section. For instance, for compact Lie groups we have the following.

Corollary 3.2. Let $\mathbf{G}$ be a compact Lie group with a right invariant Riemannian
metric $\langle\cdot|\cdot\rangle$ and let $f : \mathbf{G} \to \mathbb{R}$ be a real analytic quality function. Then any
solution of Eq. (3.21) converges to a critical point of f for $t \to +\infty$.

Proof. This follows immediately from Proposition 3.1 and Theorem 3.1, as the pair
$(\mathbf{G}, \langle\cdot|\cdot\rangle)$ constitutes a real analytic Riemannian manifold whenever the metric
$\langle\cdot|\cdot\rangle$ is invariant, cf. footnote [146].

Now, let $\mathbf{H}$ be a closed subgroup of $\mathbf{G}$. By Remark 3.2, we know that $\mathbf{H}$ is
actually an (embedded) submanifold of $\mathbf{G}$. Therefore, the gradient flow of $f|_{\mathbf{H}}$ with
respect to $\langle\cdot|\cdot\rangle|_{\mathbf{H}}$ is well-defined and can be given explicitly via the orthogonal
projectors $P_H$, cf. (3.5). However, for an invariant metric the computation of $P_H$
simplifies considerably, as all calculations can be carried out on the Lie algebra $\mathfrak{g}$
of $\mathbf{G}$.

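Lemma 3.1. Let $\mathbf{G}$ be a Lie group with a right invariant Riemannian metric $\langle\cdot|\cdot\rangle$
and let $\mathbf{H}$ be a closed subgroup of $\mathbf{G}$. Furthermore, let $\mathfrak{g}$ and $\mathfrak{h}$ be their corresponding
Lie algebras and denote by $P_{\mathfrak{h}}$ the orthogonal projection of $\mathfrak{g}$ onto $\mathfrak{h}$. Then the
orthogonal projection $P_H$ in (3.4) is given by
$P_H(gH) := P_{\mathfrak{h}}(g)\, H$ (3.23)
for all $gH \in T_H \mathbf{G}$.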

Proof. This is a straightforward consequence of the identity $T_H \mathbf{H} = \mathfrak{h} H$ and the
right invariance of $\langle\cdot|\cdot\rangle$.

According to (3.5), (3.20) and (3.23), the gradient flow of $f|_{\mathbf{H}}$ finally reads
$\dot H = P_{\mathfrak{h}}(\mathrm{grad}(f \circ r_H)(1))\, H$. (3.24)
3.2.3. Geodesics with respect to an invariant metric


The remainder of this subsection is devoted to the question of how to compute
geodesics and the Hessian of a smooth quality function with respect to an invariant
metric. The main results for the forthcoming applications are summarized in
Theorem 3.6(b) and Proposition 3.3. For readers with a basic differential geometric
background we provide some details of the proof, which however can be skipped so
as not to lose the thread. First, we need some further notation. Let
$X^r_g : G \mapsto gG$ and $X^l_g : G \mapsto Gg$
be the right and left invariant vector fields on $\mathbf{G}$ which are uniquely determined
by $X^r_g(1) = g$ and $X^l_g(1) = g$, respectively. Moreover, let $L_X(\cdot)$ denote the Lie
derivative with respect to the vector field X, i.e. for a smooth function $f : \mathbf{G} \to \mathbb{R}$
one has
$L_X(f)(G) := Df(G)\, X(G)$.
On vector fields Y, the action of $L_X(\cdot)$ is given by
$L_X(Y)(G) := \lim_{t \to 0} \frac{(D\Phi_X(t, G))^{-1}\, Y(\Phi_X(t, G)) - Y(G)}{t}$,
where $\Phi_X(t, \cdot)$ denotes the corresponding flow of X.
Next, we recall two basic facts from differential geometry which play a key role
for the proof of Theorem 3.6. The first one shows that the set of right/left invariant
vector fields is invariant under Lie derivation, cf. [99, 102]. The second one relates
a Riemannian metric of a manifold M with a particular linear connection on M.
For more details see e.g., [89].

Fact 1. The Lie derivative of a right/left invariant vector field is again right/left
invariant and satisfies
$L_{X^r_g} X^r_h = -X^r_{[g,h]}$ and $L_{X^l_g} X^l_h = X^l_{[g,h]}$. (3.25)

Fact 2. On any Riemannian manifold M there exists a unique Riemannian
connection $\nabla$ determined by the properties
$L_X Y = \nabla_X Y - \nabla_Y X$ (3.26)
and
$X\langle Y | Z\rangle = \langle \nabla_X Y | Z\rangle + \langle Y | \nabla_X Z\rangle$. (3.27)
Now, combining both facts yields the main result about geodesics on Lie groups.

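Theorem 3.6. Let $\mathbf{G}$ be a Lie group with a bi-invariant metric $\langle\cdot|\cdot\rangle$ and let $\nabla$
denote the unique Riemannian connection on $\mathbf{G}$ induced by $\langle\cdot|\cdot\rangle$.
(a) For right/left invariant vector fields the Riemannian connection is given by
$\nabla_{X^r_g} X^r_h = -\tfrac{1}{2} X^r_{[g,h]}$ and $\nabla_{X^l_g} X^l_h = \tfrac{1}{2} X^l_{[g,h]}$. (3.28)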
(b) The geodesics through any $G \in \mathbf{G}$ are of the form $t \mapsto G \exp(tg)$ or
$t \mapsto \exp(tg)G$ with $g \in \mathfrak{g}$. In particular, the geodesics through the unity 1 are
precisely the one-parameter subgroups of $\mathbf{G}$.

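Proof. (a) Applying Koszul's identity, cf. [89, 91],
$2\langle \nabla_X Y | Z\rangle = L_X\langle Y | Z\rangle + L_Y\langle Z | X\rangle - L_Z\langle X | Y\rangle - \langle X | L_Y Z\rangle + \langle Y | L_Z X\rangle + \langle Z | L_X Y\rangle$,
to $X^r_g$, $X^r_h$ and $X^r_k$ we obtain
$2\langle \nabla_{X^r_g} X^r_h | X^r_k\rangle = +\langle X^r_g | X^r_{[h,k]}\rangle - \langle X^r_h | X^r_{[k,g]}\rangle - \langle X^r_k | X^r_{[g,h]}\rangle$.
Now Proposition 3.2 and Fact 1 imply
$2\langle \nabla_{X^r_g} X^r_h | X^r_k\rangle = -\langle X^r_k | X^r_{[g,h]}\rangle$ and hence $\nabla_{X^r_g} X^r_h = -\tfrac{1}{2} X^r_{[g,h]}$.
Obviously, for left invariant vector fields the same arguments apply.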
(b) Let $\gamma(t) := \exp(tg)G$. Part (a) implies that the covariant derivative
$\nabla_{\dot\gamma(t)} \dot\gamma(t) = \nabla_{X^r_g} X^r_g(\gamma(t))$ of $\dot\gamma$ vanishes and thus $\gamma$ represents the unique geodesic through
G with initial velocity $\dot\gamma(0) = gG$. The same holds for $\gamma(t) := G \exp(tg)$,
cf. [99, 102].
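
Part (b) can be illustrated numerically. The sketch below assumes the group SO(3) with the metric $\langle \Omega G | \Xi G\rangle = \mathrm{tr}(\Omega^{T}\Xi)$, which is bi-invariant and coincides with the metric induced by the Frobenius inner product on $\mathbb{R}^{3\times 3}$. Under this assumption a curve on SO(3) is a geodesic exactly when its Euclidean acceleration has no component tangent to SO(3). The generator `g`, the base point `G0` and all function names are hypothetical choices made for this example.

```python
import numpy as np


def expm_skew(a):
    """exp(a) for a real skew-symmetric matrix a, via the Hermitian matrix i*a."""
    w, v = np.linalg.eigh(1j * a)
    return (v @ np.diag(np.exp(-1j * w)) @ v.conj().T).real


# Basis of so(3): infinitesimal rotations about the coordinate axes.
so3_basis = [
    np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float),
    np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float),
    np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float),
]

# Hypothetical generator g in so(3) and base point G0 in SO(3).
g = 0.3 * so3_basis[0] - 1.1 * so3_basis[1] + 0.7 * so3_basis[2]
G0 = expm_skew(0.5 * so3_basis[0] + 0.2 * so3_basis[2])


def gamma(t):
    """Candidate geodesic t -> exp(t g) G0."""
    return expm_skew(t * g) @ G0


def tangential_acceleration(t):
    """Components of gamma''(t) = g^2 gamma(t) along the tangent vectors Omega @ gamma(t)."""
    point = gamma(t)
    acceleration = g @ g @ point
    # np.sum(A * B) equals tr(A^T B), the Frobenius inner product.
    return np.array([np.sum(acceleration * (om @ point)) for om in so3_basis])


def speed(t):
    """Frobenius norm of the velocity gamma'(t) = g gamma(t)."""
    return np.linalg.norm(g @ gamma(t))
```

Evaluating `tangential_acceleration(t)` at several times returns values at round-off level, and `speed(t)` stays equal to the norm of `g`, as expected for a geodesic.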

Observe that the bi-invariance of the metric and the invariance of the vector fields
are essential for the above result. For example, Eq. (3.28) fails if the Riemannian
metric is just right invariant. More details on this topic can be found in [99, 105].
Finally, by Theorem 3.6, the Hessian of the restriction $f|_{\mathbf{H}}$ can easily be obtained
by restricting the Hessian of f to $T\mathbf{H}$. More precisely, we have the following.

Proposition 3.3. Let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function on a Lie group with
bi-invariant metric $\langle\cdot|\cdot\rangle$ and let $\mathbf{H}$ be a closed subgroup. Then the Hessian of $f|_{\mathbf{H}}$
at H is given by
$\mathrm{Hess}\, f|_{\mathbf{H}}(H) = \mathrm{Hess}\, f(H)|_{T_H \mathbf{H} \times T_H \mathbf{H}}$. (3.29)
Note that in general Eq. (3.29) fails unless $\mathbf{H}$ is a Lie subgroup.
Counterexamples can be obtained easily for $\mathbf{G} = \mathbb{R}^m$.

3.3. Gradient flows on homogeneous spaces


The subsequent section on homogeneous spaces is motivated by the following
observation, cf. Sec. 3.4. As before, let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function. In many
applications f can be decomposed into a function F defined on a smooth manifold
M and a (right) group action $\alpha : (X, G) \mapsto X \cdot G$ on M such that
$f(G) := F(X \cdot G)$ (3.30)
for some fixed $X \in M$. Then we can think of f as defined on the orbit of X. More
precisely, let $\hat f = F|_{O(X)}$, where $O(X) := \{X \cdot G \mid G \in \mathbf{G}\}$ denotes the orbit of X.
Thus
$\hat f(Y) = f(G)$ (3.31)
for $Y = X \cdot G$ with $G \in \mathbf{G}$. Such quality functions f are called induced by F, cf.
Sec. 3.4. By construction, we have
$\max_{G \in \mathbf{G}} f(G) = \max_{Y \in O(X)} \hat f(Y)$. (3.32)

Moreover, let $\mathbf{H}_X := \{G \in \mathbf{G} \mid X \cdot G = X\}$ denote the stabilizer or, equivalently,
isotropy subgroup of X. Then f can also be viewed as a function on the right coset
space,$^{d}$
$\mathbf{G}/\mathbf{H}_X := \{\mathbf{H}_X G \mid G \in \mathbf{G}\}$, (3.33)
which is equivalent to saying that f is equivariant with respect to $\mathbf{H}_X$, i.e.
$f(G) = f(HG)$ (3.34)
for all $H \in \mathbf{H}_X$. Therefore, coset spaces show up quite naturally in optimizing
equivariant quality functions. Note that passing from $\mathbf{G}$ to $\mathbf{G}/\mathbf{H}_X$ can be rather
useful in order to avoid certain degeneracies such as continua of critical points.

3.3.1. Coset spaces


We first collect the fundamental facts on the differential structure of $\mathbf{G}/\mathbf{H}$, where
$\mathbf{H}$ is any closed subgroup of $\mathbf{G}$. Detailed expositions can be found in [91, 99, 101,
102, 106].

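Theorem 3.7. Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\mathbf{H} \subset \mathbf{G}$ be a closed
subgroup with Lie algebra $\mathfrak{h}$. Moreover, let $\mathfrak{p}$ be any complementary subspace to $\mathfrak{h}$,
i.e. $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. Then the following holds:
(a) The quotient topology turns the set of right cosets $\mathbf{G}/\mathbf{H} := \{[G] := \mathbf{H}G \mid G \in \mathbf{G}\}$
into a locally compact Hausdorff space.
(b) There exists a unique manifold structure on $\mathbf{G}/\mathbf{H}$ such that the canonical
projection $\Pi : \mathbf{G} \to \mathbf{G}/\mathbf{H}$, $G \mapsto [G]$ is a submersion. In particular, the
tangent space of $\mathbf{G}/\mathbf{H}$ at [1] is isomorphic to $\mathfrak{p}$ via the canonical identification
$p \mapsto \frac{d}{dt}[\exp tp]\big|_{t=0}$ and thus $\dim \mathbf{G}/\mathbf{H} = \dim \mathbf{G} - \dim \mathbf{H}$.
The following statements refer to the unique manifold structure on $\mathbf{G}/\mathbf{H}$ given in
part (b).
(c) The Lie group $\mathbf{G}$ acts smoothly from the right on $\mathbf{G}/\mathbf{H}$ via
$([G'], G) \mapsto [G'G]$ (3.35)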

d Note that the coset terminology in the group literature is not consistent, i.e. right cosets are
sometimes called left cosets and vice versa. Here, we stick to the term right coset if the group
element is on the right side, i.e. $[G] = \mathbf{H}G$.
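such that
$\hat r_G : \mathbf{G}/\mathbf{H} \to \mathbf{G}/\mathbf{H}$, $[G'] \mapsto [G'G]$ (3.36)
are diffeomorphisms for all $G \in \mathbf{G}$. Moreover,
$\Pi \circ l_G : \mathbf{G} \to \mathbf{G}/\mathbf{H}$, $G' \mapsto [GG']$ (3.37)
are submersions for all $G \in \mathbf{G}$. Thus the tangent space $T_{[G]} \mathbf{G}/\mathbf{H}$ is given by
$T_{[G]} \mathbf{G}/\mathbf{H} = D\hat r_G([1])\, T_{[1]} \mathbf{G}/\mathbf{H} = D(\Pi \circ l_G)(1)\,\mathfrak{g} = D(\Pi \circ l_G)(1)(\mathrm{Ad}_{G^{-1}} \mathfrak{p})$. (3.38)
(d) Moreover, if $\mathbf{H}$ is a normal subgroup, i.e. $G\mathbf{H}G^{-1} = \mathbf{H}$ for all $G \in \mathbf{G}$, then the
multiplication $[G] \cdot [G'] := [GG']$ is well-defined and yields a Lie group structure
on $\mathbf{G}/\mathbf{H}$.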

Proof. Cf. [99, 101, 106].

The Lie group G/H given by Theorem 3.7(d) is called the quotient Lie group of G
by H. Moreover, the result provides the possibility to extend the well-known First
Isomorphism Law to the category of Lie groups.

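Theorem 3.8. Let $\Phi : \mathbf{G} \to \mathbf{G}'$ be a smooth surjective Lie group homomorphism.
Then there exists a well-defined Lie group isomorphism $\hat\Phi : \mathbf{G}/\mathbf{H} \to \mathbf{G}'$ with $\mathbf{H} := \ker \Phi$
such that the diagram formed by $\Phi : \mathbf{G} \to \mathbf{G}'$, $\Pi : \mathbf{G} \to \mathbf{G}/\mathbf{H}$ and $\hat\Phi : \mathbf{G}/\mathbf{H} \to \mathbf{G}'$,
$\Phi = \hat\Phi \circ \Pi$, (3.39)
commutes. Moreover, let $\mathfrak{g}$, $\mathfrak{g}'$ and $\mathfrak{h}$ denote the corresponding Lie algebras and let
$\mathfrak{p}$ be any complementary space to $\mathfrak{h}$. Then $D\Phi(1)$ is a surjective Lie algebra
homomorphism with $\ker D\Phi(1) = \mathfrak{h}$ and commutative diagram formed by
$D\Phi(1) : \mathfrak{g} \to \mathfrak{g}'$, $D\Pi(1) : \mathfrak{g} \to \mathfrak{p} \cong \mathfrak{g}/\mathfrak{h}$ and $D\hat\Phi([1]) : \mathfrak{p} \cong \mathfrak{g}/\mathfrak{h} \to \mathfrak{g}'$,
$D\Phi(1) = D\hat\Phi([1]) \circ D\Pi(1)$. (3.40)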

Proof. Note that $\mathbf{H} = \ker \Phi$ is a closed normal subgroup of $\mathbf{G}$. Thus by the First
Isomorphism Law $\hat\Phi([G]) := \Phi(G)$ for $[G] \in \mathbf{G}/\mathbf{H}$ is a well-defined group
isomorphism. Moreover, $\hat\Phi$ is smooth, since $\Pi$ is a smooth submersion by Theorem 3.7.
The assertion that $D\Phi(1)$ is a surjective Lie algebra homomorphism follows easily
from the properties of the exponential map. Finally, a straightforward application
of the chain rule yields Eq. (3.40).
3.3.2. Orbit theorems and homogeneous spaces


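Next, we analyze the relation between group actions and coset spaces. A smooth
right Lie group action is a smooth map $\alpha : M \times \mathbf{G} \to M$, $(X, G) \mapsto X \cdot G$ with
$(X \cdot G) \cdot H = X \cdot (GH)$ and $X \cdot 1 = X$
for all $X \in M$ and $G, H \in \mathbf{G}$. The orbit of $X \in M$ under the group action $\alpha$
is defined by $O(X) := \{X \cdot G \mid G \in \mathbf{G}\}$. The action is called transitive if
$M = O(X)$ for some and hence for all $X \in M$. Equivalently, one can say that for
all $X, Y \in M$ there exists an element $G \in \mathbf{G}$ with $Y = X \cdot G$. Moreover, for $X \in M$
let $\mathbf{H}_X := \{G \in \mathbf{G} \mid X \cdot G = X\}$ denote the stabilizer of X and $\alpha_X : \mathbf{G} \to M$
the map $G \mapsto X \cdot G$. Then the canonical map $\hat\alpha_X : \mathbf{G}/\mathbf{H}_X \to M$ is defined by
$[G] \mapsto X \cdot G$.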
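Theorem 3.9 (Orbit Theorem). Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let
$\alpha : M \times \mathbf{G} \to M$ be a smooth right action of $\mathbf{G}$ on a smooth manifold M. Moreover,
let X be any point in M. Then the following statements are satisfied:
(a) The stabilizer subgroup $\mathbf{H}_X$ is a closed subgroup of $\mathbf{G}$.
(b) Let $\mathfrak{h}_X$ be the Lie algebra of $\mathbf{H}_X$. Then
$\ker D\alpha_X(1) = \mathfrak{h}_X$. (3.41)
In particular, the canonical map $\hat\alpha_X : \mathbf{G}/\mathbf{H}_X \to M$ is an injective immersion.
(c) The canonical map $\hat\alpha_X$ is an embedding, i.e. O(X) is a submanifold of M
diffeomorphic to $\mathbf{G}/\mathbf{H}_X$, if and only if $\hat\alpha_X$ is proper.$^{e}$ In this case, the tangent
space of O(X) at $Y = X \cdot G$ is given by
$T_Y O(X) = D\alpha_X(G)\, T_G \mathbf{G} = D\alpha_Y(1)\, \mathfrak{g} = D\alpha_Y(1)\, \mathrm{Ad}_{G^{-1}} \mathfrak{p}_X$, (3.42)
where $\mathfrak{p}_X$ is any complementary subspace of $\mathfrak{h}_X$, i.e. $\mathfrak{g} = \mathfrak{h}_X \oplus \mathfrak{p}_X$.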

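Proof. (a) The continuity of $\alpha_X$ implies that $\mathbf{H}_X = \alpha_X^{-1}(X)$ is closed.
(b) In order to see that $\hat\alpha_X$ is an injective immersion, consider the identity
$\alpha_X \circ r_G = \alpha(\alpha_X(\cdot), G)$ and thus
$D\alpha_X(G)\, gG = D_1\alpha(X, G)\, D\alpha_X(1)\, g$.
Therefore, $D\alpha_X(1)\, g = 0$ implies
$\frac{d}{dt} \alpha_X(\exp(tg)) = 0$
for all $t \in \mathbb{R}$ and hence $\ker D\alpha_X(1) \subset \mathfrak{h}_X$. As the inclusion $\mathfrak{h}_X \subset \ker D\alpha_X(1)$
is obvious, we obtain $\ker D\alpha_X(1) = \mathfrak{h}_X$. Moreover, let $\mathfrak{p}_X$ be any complementary
subspace of $\mathfrak{h}_X$. Then, identifying $\mathfrak{p}_X$ with $T_{[1]} \mathbf{G}/\mathbf{H}_X$ yields
$D\hat\alpha_X([1]) = D\alpha_X(1)|_{\mathfrak{p}_X}$, cf. Theorem 3.7. Thus $D\hat\alpha_X([1])$ is injective and the same holds for
any other $[G] \in \mathbf{G}/\mathbf{H}_X$ by right multiplication $\hat r_G$.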

e A map is called proper if the pre-image of any compact set K is also compact.
(c) The first part follows from a standard embedding criterion on immersed manifolds,
cf. [92]. The first equality of Eq. (3.42) is a straightforward consequence
of the identity $\alpha_X = \hat\alpha_X \circ \Pi_X$, where $\Pi_X : \mathbf{G} \to \mathbf{G}/\mathbf{H}_X$ denotes the canonical
projection. The second one is obtained by $\alpha_Y = \alpha_X \circ l_G = \alpha_X \circ r_G \circ \mathrm{Ad}_G$, while
the third one follows from $\mathbf{H}_Y = \mathrm{Ad}_{G^{-1}} \mathbf{H}_X$. For further details see also.

Corollary 3.3. Let $\alpha : M \times \mathbf{G} \to M$ be as in Theorem 3.9 and let $X \in M$ be any
point.

(a) If $\mathbf{G}$ is compact then $\mathbf{G}/\mathbf{H}_X$ is diffeomorphic to O(X).

(b) If $\alpha$ is transitive then $\mathbf{G}/\mathbf{H}_X$ is diffeomorphic to M.

Proof. (a) This follows readily from Theorem 3.9(c) and the compactness of $\mathbf{G}$.
(b) Observe that transitivity of $\alpha$ implies surjectivity of $D\alpha_X(G)$ and $D\hat\alpha_X([G])$.
Thus Theorem 3.9(b) yields the desired result, cf. [106].

This gives rise to the following definition. A manifold M is called a homogeneous
$\mathbf{G}$-space or for short a homogeneous space, if there exists a transitive smooth Lie
group action of $\mathbf{G}$ on M. In particular, any coset space $\mathbf{G}/\mathbf{H}$ can be regarded as
a homogeneous space via the canonical action $([G'], G) \mapsto [G'G]$ for $[G'] \in \mathbf{G}/\mathbf{H}$
and $G \in \mathbf{G}$. Further results on homogeneous spaces, orbit spaces and principal
$\mathbf{G}$-bundles can be found in [96, 101, 106].

Remark 3.4. Note that by Theorem 3.9 the orbit O(X) always carries a manifold
structure the topology of which is equal to or finer than the topology induced
by M.

3.3.3. Reductive homogeneous spaces


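Let M be a homogeneous space with transitive Lie group action $\alpha : M \times \mathbf{G} \to M$ and
let $\mathbf{H} := \mathbf{H}_X$ be the stabilizer subgroup of a fixed element $X \in M$. Next, we are
interested in carrying over the Riemannian structure of $\mathbf{G}$ to M or, equivalently,
to $\mathbf{G}/\mathbf{H}$. First, we need some further terminology. As most of the following terms
are conveniently expressed via algebraic properties of the pair $(\mathbf{G}, \mathbf{H})$, we focus on
the case $M = \mathbf{G}/\mathbf{H}$. Yet one could restate all results in terms of an abstract group
action on M.
A homogeneous space $\mathbf{G}/\mathbf{H}$ is reductive, if the Lie algebra $\mathfrak{h}$ of $\mathbf{H}$ has a
complementary subspace $\mathfrak{p}$ in $\mathfrak{g}$ such that $\mathfrak{p}$ is $\mathrm{Ad}_{\mathbf{H}}$-invariant, i.e. $H\mathfrak{p}H^{-1} \subset \mathfrak{p}$ for
all $H \in \mathbf{H}$. A Riemannian metric $\langle\cdot|\cdot\rangle$ on $\mathbf{G}/\mathbf{H}$ is called $\mathbf{G}$-invariant if the
mappings $\hat r_G$ are isometries, i.e. if for all $\xi, \eta \in T_{[G']} \mathbf{G}/\mathbf{H}$ and $G, G' \in \mathbf{G}$ the
identity
$\langle \xi | \eta \rangle_{[G']} = \langle D\hat r_G([G'])\,\xi \mid D\hat r_G([G'])\,\eta \rangle_{[G'G]}$ (3.43)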
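holds. Moreover, a bilinear form $(\cdot|\cdot)$ on $\mathfrak{p}$ is called

(a) $\mathrm{Ad}_{\mathbf{H}}$-invariant if the identity
$(p \mid p') = (\mathrm{Ad}_H p \mid \mathrm{Ad}_H p')$ (3.44)
is satisfied for all $p, p' \in \mathfrak{p}$ and $H \in \mathbf{H}$.
(b) $\mathrm{ad}_{\mathfrak{h}}$-invariant if the identity
$(\mathrm{ad}_h p \mid p') = -(p \mid \mathrm{ad}_h p')$ (3.45)
is satisfied for all $p, p' \in \mathfrak{p}$ and $h \in \mathfrak{h}$.
Note that $\mathbf{G}/\mathbf{H}$ is reductive, if $\mathbf{G}$ has a bi-invariant metric, as one can choose
$\mathfrak{p} := \mathfrak{h}^{\perp}$. Next, we give a generalization of Proposition 3.2 and Theorem 3.4 to
homogeneous spaces.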

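Proposition 3.4. Let $\mathbf{G}/\mathbf{H}$ be a homogeneous space with reductive decomposition
$\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. The following statements are equivalent:
(a) There exists a $\mathbf{G}$-invariant metric $\langle\cdot|\cdot\rangle$ on $\mathbf{G}/\mathbf{H}$.
(b) There exists an $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$.
In addition, if $\mathbf{H}$ is connected then (a) and (b) are equivalent to
(c) There exists an $\mathrm{ad}_{\mathfrak{h}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$.

Proof. Cf. [102] and Proposition 3.2.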

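Theorem 3.10. Let $\mathbf{G}/\mathbf{H}$ be a homogeneous space with reductive decomposition
$\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. Then $\mathbf{G}/\mathbf{H}$ admits a $\mathbf{G}$-invariant metric if and only if the closure of
$\mathrm{Ad}_{\mathbf{H}}|_{\mathfrak{p}} := \{\mathrm{Ad}_H : \mathfrak{p} \to \mathfrak{p} \mid H \in \mathbf{H}\}$ is compact in $GL(\mathfrak{p})$.

Proof. Cf. [89].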

Remark 3.5. (a) As a special case, Theorem 3.10 implies the existence of
bi-invariant metrics on compact Lie groups, cf. Theorem 3.4 and [89].
(b) Replacing $\mathfrak{p}$ by the quotient space $\mathfrak{g}/\mathfrak{h}$ allows one to state Theorem 3.10 without
referring to any reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ of $\mathfrak{g}$, cf. [89]. Moreover, it
can be shown that any homogeneous space $\mathbf{G}/\mathbf{H}$ which admits a $\mathbf{G}$-invariant
metric is reductive, cf. [107].
Theorem 3.10 can easily be rephrased for an arbitrary homogeneous $\mathbf{G}$-space M
with transitive group action $\alpha : M \times \mathbf{G} \to M$, by choosing $\mathbf{H} := \mathbf{H}_X$ with $X \in M$.
Note, however, that for orbits $M := O(X)$ embedded in some larger Riemannian manifold
N, the invariant metric given by Theorem 3.10 does in general not coincide with
the induced metric. This gives rise to the following definition.
A manifold M is called a Riemannian homogeneous $\mathbf{G}$-space or for short
Riemannian homogeneous space, if M is a homogeneous $\mathbf{G}$-space with $\alpha$-invariant
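metric, which is to say that the mappings $\alpha_G : M \to M$, $\alpha_G(X) := X \cdot G$ are
isometries of M for all $G \in \mathbf{G}$, i.e. for all $\xi, \eta \in T_X M$ and $G \in \mathbf{G}$ one has
$\langle \xi | \eta \rangle_X = \langle D\alpha_G(X)\,\xi \mid D\alpha_G(X)\,\eta \rangle_{X \cdot G}$. (3.46)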

Proposition 3.5. (a) Any homogeneous space of the form G/H with a G-invariant
metric is a Riemannian homogeneous space.
(b) Any Riemannian homogeneous space is isometric to a homogeneous space of
the form G/H with a G-invariant metric.

Proof. Follows readily from the previous definitions and Corollary 3.3(b).

3.3.4. Naturally reductive homogeneous spaces and geodesics


Characterizing the Riemannian connection of a homogeneous space and its geodesics
are in general advanced issues which we do not want to address here, cf. [102] and
the references therein, e.g., [108]. However, there are two cases, see (a) and (b)
below, which are easy to handle. A homogeneous space $\mathbf{G}/\mathbf{H}$ is called
(a) Naturally reductive if it is reductive with complementary space $\mathfrak{p}$ and
$\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$ such that the identity
$(P\,\mathrm{ad}_g h \mid k) = -(h \mid P\,\mathrm{ad}_g k)$ (3.47)
is satisfied for all $g, h, k \in \mathfrak{p}$, where P denotes the projection onto $\mathfrak{p}$ along $\mathfrak{h}$.
(b) Cartan-like if it is reductive with complementary space $\mathfrak{p}$ and $\mathrm{Ad}_{\mathbf{H}}$-invariant
scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$ such that the commutator relations
$[\mathfrak{h}, \mathfrak{h}] \subset \mathfrak{h}$, $[\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p}$ and $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h}$ (3.48)
are satisfied.

Remark 3.6. If, in definition (a), the complementary space $\mathfrak{p}$ can be chosen as
the orthogonal complement of $\mathfrak{h}$ with respect to some $\mathrm{Ad}_{\mathbf{G}}$-invariant scalar product
$(\cdot|\cdot)$ on $\mathfrak{g}$, then condition (3.47) reduces to
$(\mathrm{ad}_g h \mid k) = -(h \mid \mathrm{ad}_g k)$ (3.49)
for all $g, h, k \in \mathfrak{p}$.
Lemma 3.2. (a) Every Cartan-like homogeneous space $\mathbf{G}/\mathbf{H}$ is naturally reductive.
(b) Every naturally reductive homogeneous space $\mathbf{G}/\mathbf{H}$ is a Riemannian homogeneous
space.

Proof. (a) By the commutator relation $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h}$, we have $P\,\mathrm{ad}_g h = 0$ for all
$g, h \in \mathfrak{p}$. Thus Eq. (3.47) is satisfied.
(b) The assertion follows immediately from Proposition 3.4.
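
The commutator relations (3.48) are easy to verify for a concrete pair. The sketch below assumes the standard example SO(3)/SO(2) (the 2-sphere), with $\mathfrak{h}$ spanned by the rotation generator about the z-axis and $\mathfrak{p}$ spanned by the other two generators; this choice and the helper names (`component`, `lies_in`, `project_p`) are illustrative and not taken from the text. It also checks the argument of Lemma 3.2(a), namely that the $\mathfrak{p}$-component of $[g, h]$ vanishes for $g, h \in \mathfrak{p}$.

```python
import numpy as np

# Generators of so(3); h = span{Lz}, p = span{Lx, Ly} (assumed example).
Lx = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
Ly = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
Lz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
h_basis = [Lz]
p_basis = [Lx, Ly]


def bracket(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


def component(m, b):
    """Coefficient of m along b for the scalar product (a|b) = tr(a^T b)."""
    return np.sum(m * b) / np.sum(b * b)


def lies_in(m, subspace_basis):
    """True if m equals its orthogonal projection onto the given (orthogonal) basis."""
    projection = sum(component(m, b) * b for b in subspace_basis)
    return np.allclose(m, projection)


def project_p(m):
    """Orthogonal projection P onto p."""
    return sum(component(m, b) * b for b in p_basis)


# Commutator relations (3.48).
hh_in_h = all(lies_in(bracket(a, b), h_basis) for a in h_basis for b in h_basis)
hp_in_p = all(lies_in(bracket(a, b), p_basis) for a in h_basis for b in p_basis)
pp_in_h = all(lies_in(bracket(a, b), h_basis) for a in p_basis for b in p_basis)
cartan_like = hh_in_h and hp_in_p and pp_in_h

# Lemma 3.2(a): P ad_g h = 0 for g, h in p.
p_component_vanishes = all(
    np.allclose(project_p(bracket(a, b)), 0.0) for a in p_basis for b in p_basis
)
```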

Theorem 3.11 (Coset Version). Let $\mathbf{G}/\mathbf{H}$ be naturally reductive. Then $\mathbf{G}/\mathbf{H}$ is a
Riemannian homogeneous space such that all geodesics through $[G] \in \mathbf{G}/\mathbf{H}$ are of
the form
$t \mapsto [G \exp(t\, \mathrm{Ad}_{G^{-1}} p)] = [\exp(tp)G]$ (3.50)
with $p \in \mathfrak{p}$.

Proof. By Lemma 3.2(b), the quotient space $\mathbf{G}/\mathbf{H}$ is also a Riemannian homogeneous
space. For a proof of Eq. (3.50) we refer to [91, 102, 105].

The above result can be restated for an arbitrary naturally reductive Riemannian
homogeneous $\mathbf{G}$-space.
Theorem 3.12 (Orbit Version). Let M be a homogeneous $\mathbf{G}$-space with transitive
group action $\alpha : M \times \mathbf{G} \to M$. Assume that $\mathbf{G}/\mathbf{H}_X$ is naturally reductive with
decomposition $\mathfrak{g} = \mathfrak{h}_X \oplus \mathfrak{p}_X$. Then M is a Riemannian homogeneous $\mathbf{G}$-space such
that all geodesics through $Y = X \cdot G \in M$ are of the form
$t \mapsto Y \cdot \exp(t\, \mathrm{Ad}_{G^{-1}} p)$ (3.51)
with $p \in \mathfrak{p}_X$.

Proof. The result is a straightforward consequence of Theorem 3.11.

Thus naturally reductive homogeneous spaces are Riemannian spaces, where
the exponential map is particularly simple to express. By taking the basic picture
of [91] further to discuss geodesics, Fig. 3 illustrates that only in naturally
reductive homogeneous spaces the geodesics on $\mathbf{G}$ project to geodesics on $\mathbf{G}/\mathbf{H}$. In
this sense, projection and exponentiation of tangent vectors commute in naturally
reductive homogeneous spaces. However, on reductive homogeneous spaces that
are not naturally reductive, the problem is considerably more involved. A necessary
and sufficient condition for $t \mapsto [G \exp(tg)]$ being a geodesic in $\mathbf{G}/\mathbf{H}$ can be found
in [102, 109].
On the other hand, for numerical purposes it is often enough and even advisable
to approximate the Riemannian exponential map by another computationally more
efficient local parametrisation. Here, the map
p  p  lG exp(AdG1 p) (3.52)
might be a natural candidate, even if it fails to give the exact Riemannian exponential
map. These issues are subject to current research, and recent details can be
found in [25, 110]. Figure 3 also shows how in reductive homogeneous spaces that
are no longer naturally reductive, the projected geodesic still provides a first-order
approximation to the geodesic generated by the projection of the tangent vector.

3.3.5. Adjoint orbits


A prime example for naturally reductive homogeneous spaces is provided by the
adjoint action of a compact Lie group, a scenario which is of major interest in
Fig. 3. Geodesics in reductive homogeneous spaces $\mathbf{G}/\mathbf{H}$. The tangent vector $p \in \mathfrak{p}$ projects to
the tangent vector $\xi$ at the coset $[1] = \mathbf{H}$. Note that only in naturally reductive homogeneous
spaces the geodesic in $\mathbf{G}$ generated by p projects onto $\mathbf{G}/\mathbf{H}$ such that it coincides with the geodesic
of the projected tangent vector in the sense $\Pi(e^{tp}) = \exp_{[1]}(t\xi)$. In reductive homogeneous spaces
that are not naturally reductive, the projection yields in general only a first-order approximation
at $[1] = \mathbf{H}$ as shown in the lower part, where $\Pi(e^{tp}) \neq \exp_{[1]}(t\xi)$. (Color online)

the forthcoming applications. Therefore, we summarize the previous results for the
particular case of adjoint orbits. Note that the adjoint action given by
$(X, G) \mapsto \mathrm{Ad}_G X := GXG^{-1}$ is a left action. However, all previous statements and formulas
remain valid mutatis mutandis, e.g., right cosets have to be replaced by left cosets,
etc.

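Corollary 3.4. Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\mathbf{K} \subset \mathbf{G}$ be a
compact subgroup with Lie algebra $\mathfrak{k}$ and bi-invariant metric $\langle\cdot|\cdot\rangle$. Moreover, let
$\alpha : \mathfrak{g} \times \mathbf{K} \to \mathfrak{g}$, $(X, K) \mapsto \mathrm{Ad}_K X := KXK^{-1}$ be the adjoint action of $\mathbf{K}$ on $\mathfrak{g}$ and
denote by $\alpha_X : \mathbf{K} \to \mathfrak{g}$ the map $K \mapsto \mathrm{Ad}_K X$. Then the following assertions hold:

(a) The stabilizer group $\mathbf{H} := \mathbf{H}_X$ of X is a closed subgroup of $\mathbf{K}$.

(b) The coset space $\mathbf{K}/\mathbf{H}$ is diffeomorphic to the adjoint orbit $O(X) := \{\mathrm{Ad}_K X \mid K \in \mathbf{K}\}$
of X. In particular, the map $\hat\alpha_X : \mathbf{K}/\mathbf{H} \to O(X)$, $[K] \mapsto \mathrm{Ad}_K X$ is a
well-defined diffeomorphism satisfying the commutative diagram formed by
$\alpha_X : \mathbf{K} \to O(X) \subset \mathfrak{g}$, $\Pi : \mathbf{K} \to \mathbf{K}/\mathbf{H}$ and $\hat\alpha_X : \mathbf{K}/\mathbf{H} \to O(X)$,
$\alpha_X = \hat\alpha_X \circ \Pi$. (3.53)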
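(c) Let $\mathfrak{h} := \mathfrak{h}_X$ denote the Lie algebra of $\mathbf{H}$ and $\mathfrak{p}$ be any complementary space to
$\mathfrak{h}$ in $\mathfrak{k}$, then $D\alpha_X(1) = -\mathrm{ad}_X$ is a surjective homomorphism with $\ker \mathrm{ad}_X = \mathfrak{h}$
and commutative diagram formed by $D\alpha_X(1) : \mathfrak{k} \to T_X O(X) \subset \mathfrak{g}$,
$D\Pi(1) : \mathfrak{k} \to \mathfrak{p} \cong \mathfrak{k}/\mathfrak{h}$ and $D\hat\alpha_X([1]) : \mathfrak{p} \cong \mathfrak{k}/\mathfrak{h} \to T_X O(X)$,
$D\alpha_X(1) = D\hat\alpha_X([1]) \circ D\Pi(1)$. (3.54)
Moreover, the tangent space of O(X) at $Y = \mathrm{Ad}_K X$ is given by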
TY O(X) = adY k = adY (AdK 1 p).
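(d) $O(X) \cong \mathbf{K}/\mathbf{H}$ is naturally reductive. More precisely, $\mathfrak{p} := \mathfrak{h}^{\perp}$ yields a naturally
reductive decomposition of $\mathfrak{k}$, where the $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product on $\mathfrak{p}$ is given
by the restriction of $\langle\cdot|\cdot\rangle$.
(e) There is a well-defined $\alpha$-invariant metric on O(X) given by
$\langle \xi | \eta \rangle_{\mathrm{Ad}_K X} := \langle p_\xi | p_\eta \rangle$ (3.55)
with $\xi = \mathrm{ad}_Y(\mathrm{Ad}_K p_\xi)$, $\eta = \mathrm{ad}_Y(\mathrm{Ad}_K p_\eta)$ and $p_\xi, p_\eta \in \mathfrak{p}$.
(f) All geodesics through $Y = \mathrm{Ad}_K X \in O(X)$ with respect to the metric given in
part (e) are of the form
$t \mapsto \mathrm{Ad}_{\exp(t\, \mathrm{Ad}_K p)} Y$ (3.56)
with $p \in \mathfrak{p}$.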

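Proof. Parts (a) and (b) follow immediately from Theorem 3.9 and Corollary 3.3.
(c) For $k \in \mathfrak{k}$ we have
$\frac{d}{dt} \mathrm{Ad}_{\exp(tk)} X \Big|_{t=0} = -\mathrm{ad}_X k$
and thus $D\alpha_X(1) = -\mathrm{ad}_X$. All other statements are again consequences of
Theorem 3.9.
(d) First, observe that the bi-invariance of $\langle\cdot|\cdot\rangle$ implies that $\mathfrak{k} = \mathfrak{h} \oplus \mathfrak{p}$ with $\mathfrak{p} := \mathfrak{h}^{\perp}$
is reductive. Now, let P denote the orthogonal projection onto $\mathfrak{p}$. In turn, the
bi-invariance of $\langle\cdot|\cdot\rangle$ yields
$\langle P\,\mathrm{ad}_g h | k\rangle = \langle \mathrm{ad}_g h | k\rangle = -\langle h | \mathrm{ad}_g k\rangle = -\langle h | P\,\mathrm{ad}_g k\rangle$
for all $g, h, k \in \mathfrak{p}$, cf. Proposition 3.2. Therefore, $O(X) \cong \mathbf{K}/\mathbf{H}$ is naturally
reductive.
(e) Let $Y \in O(X)$ and $\tilde K \in \mathbf{K}$. A straightforward calculation using the identities
$D\alpha_{\tilde K}(Y)\,\xi = \mathrm{Ad}_{\tilde K}\,\xi$ for $\xi \in T_Y O(X)$ and
$\mathrm{Ad}_{\tilde K}(\mathrm{ad}_Y k) = \mathrm{ad}_{\mathrm{Ad}_{\tilde K} Y}(\mathrm{Ad}_{\tilde K} k)$ for
all $k \in \mathfrak{k}$ yields the required invariance.
Part (f) follows immediately from Theorem 3.12 and the identity
$\mathfrak{h}_Y = \mathrm{Ad}_K \mathfrak{h}_X$ for $Y = \mathrm{Ad}_K X$, which implies $\mathfrak{h}_Y^{\perp} = \mathrm{Ad}_K \mathfrak{p}$.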
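
Part (c) implies $\dim O(X) = \dim \mathfrak{k} - \dim \mathfrak{h}$, i.e. the orbit dimension is the rank of $\mathrm{ad}_X$ on $\mathfrak{k}$. The following minimal sketch evaluates this rank numerically, assuming $\mathbf{K} = SU(3)$ acting on its own Lie algebra su(3); the two test elements and the helper names (`su_basis`, `realify`, `orbit_dimension`) are hypothetical choices for illustration only.

```python
import numpy as np


def su_basis(n):
    """Real basis of su(n): traceless skew-Hermitian n x n matrices."""
    basis = []
    for j in range(n):
        for k in range(j + 1, n):
            e = np.zeros((n, n), dtype=complex)
            e[j, k] = 1.0
            basis.append(e - e.T)          # real antisymmetric element
            basis.append(1j * (e + e.T))   # imaginary symmetric element
    for j in range(n - 1):
        d = np.zeros((n, n), dtype=complex)
        d[j, j], d[j + 1, j + 1] = 1j, -1j
        basis.append(d)                    # diagonal (Cartan) element
    return basis


def realify(m):
    """Flatten a complex matrix into a real vector (real parts, then imaginary parts)."""
    return np.concatenate([m.real.ravel(), m.imag.ravel()])


def orbit_dimension(x, basis):
    """Rank of the real-linear map k -> [x, k] on the span of `basis`."""
    images = np.column_stack([realify(x @ b - b @ x) for b in basis])
    return int(np.linalg.matrix_rank(images, tol=1e-10))


basis = su_basis(3)
X_generic = np.diag([1j, 2j, -3j])      # pairwise distinct eigenvalues
X_degenerate = np.diag([1j, 1j, -2j])   # one doubly degenerate eigenvalue

dim_generic = orbit_dimension(X_generic, basis)
dim_degenerate = orbit_dimension(X_degenerate, basis)
```

For the generic element the stabilizer is a maximal torus (dimension 2), so the orbit has dimension 8 - 2 = 6; for the degenerate element the stabilizer grows to dimension 4 and the orbit dimension drops to 4.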
3.3.6. Gradient flows on Riemannian homogeneous spaces

Applying the previous results on gradient flows to quality functions $\hat f$ on Riemannian
homogeneous spaces $\mathbf{G}/\mathbf{H}$, we obtain by the $\mathbf{G}$-invariance of the Riemannian
metric, similar to (3.20), the gradient equality
$\mathrm{grad}\, \hat f([G]) = D\hat r_G([1])\, \mathrm{grad}(\hat f \circ \hat r_G)([1])$ (3.57)
for all $G \in \mathbf{G}$, where $\hat r_G$ denotes the mapping $[G'] \mapsto [G'G]$. Similar to Eq. (3.20),
the gradient of $\hat f$ is therefore completely determined by
$G \mapsto \mathrm{grad}(\hat f \circ \hat r_G)([1]) \in \mathfrak{p}$. (3.58)
However, Eq. (3.58) does not induce a mapping from $\mathbf{G}/\mathbf{H}$ to $\mathfrak{p}$, as in general
$\mathrm{grad}(\hat f \circ \hat r_G)([1]) \neq \mathrm{grad}(\hat f \circ \hat r_{HG})([1])$
for $H \in \mathbf{H} \setminus \{1\}$. Now, for analyzing the asymptotic behavior of
$\dot{[G]} = \mathrm{grad}\, \hat f([G])$ (3.59)
Sec. 3.1 provides again the appropriate tools. For instance, if $\mathbf{G}/\mathbf{H}$ is compact we
have the following.
Corollary 3.5. Let $\mathbf{G}/\mathbf{H}$ be a compact Riemannian homogeneous space and let
$\hat f : \mathbf{G}/\mathbf{H} \to \mathbb{R}$ be real analytic. Then any solution of Eq. (3.59) converges to a
critical point of $\hat f$ for $t \to +\infty$.

Proof. This follows immediately from Proposition 3.1 and Theorem 3.1, as a
Riemannian homogeneous space always constitutes a real analytic Riemannian
manifold, cf. [99, 101].

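Finally, we return to our starting point and ask for the relation between (3.59)
and (3.21) in the case of an $\mathbf{H}$-equivariant quality function f. Then, f induces a
quality function $\hat f$ on $\mathbf{G}/\mathbf{H}$ via
$\hat f([G]) := f(G)$ (3.60)
for all $G \in \mathbf{G}$. Moreover, assume $\mathbf{G}$ carries a bi-invariant metric $\langle\cdot|\cdot\rangle$ and $\mathbf{G}/\mathbf{H}$
is a homogeneous space with reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ and $\mathfrak{p} := \mathfrak{h}^{\perp}$. This
implies that the restriction of $\langle\cdot|\cdot\rangle$ to $\mathfrak{p} \times \mathfrak{p}$ is $\mathrm{Ad}_{\mathbf{H}}$-invariant. Now, the identity
$f = \hat f \circ \Pi$ yields
$D\hat f([G])\, D\Pi(G) = Df(G)$ for all $G \in \mathbf{G}$ (3.61)
and hence
$(D\Pi(G))^{*}\, \mathrm{grad}\, \hat f([G]) = \mathrm{grad} f(G)$ (3.62)
for all $G \in \mathbf{G}$, where $\Pi$ denotes the canonical projection and $(\cdot)^{*}$ the adjoint
mapping. By identifying $\mathfrak{p}$ with the tangent space of $\mathbf{G}/\mathbf{H}$ at [1], the map $D\Pi(1)$
represents the orthogonal projector $h + p \mapsto p$ for $h \in \mathfrak{h}$ and $p \in \mathfrak{p}$. Thus we obtain
$D\Pi(1)(D\Pi(1))^{*} = \mathrm{id}_{\mathfrak{p}}$.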
In the same way, using the identity $\Pi \circ r_G = \hat r_G \circ \Pi$, one shows
$D\Pi(G)(D\Pi(G))^{*} = \mathrm{id}_{T_{[G]} \mathbf{G}/\mathbf{H}}$
for all $G \in \mathbf{G}$. Consequently, (3.62) yields
$\mathrm{grad}\, \hat f([G]) = D\Pi(G)\, \mathrm{grad} f(G)$ (3.63)
for all $G \in \mathbf{G}$. Therefore, we have proven the following result:
Theorem 3.13. Suppose $\mathbf{G}/\mathbf{H}$ satisfies the above assumptions and $f : \mathbf{G} \to \mathbb{R}$
is an $\mathbf{H}$-equivariant quality function with induced quality function $\hat f : \mathbf{G}/\mathbf{H} \to \mathbb{R}$.
Then the canonical projection of the gradient flow of Eq. (3.21) onto $\mathbf{G}/\mathbf{H}$ yields
the gradient flow of Eq. (3.59), i.e. if G(t) is a solution of Eq. (3.21) then $\Pi(G(t))$
is one of Eq. (3.59).

3.3.7. Discretized gradient flows on naturally reductive homogeneous spaces

As before, let $\mathbf{G}$ be a Lie group with bi-invariant metric and let f be an equivariant
quality function with respect to the closed subgroup $\mathbf{H}$, i.e. for all $H \in \mathbf{H}$ one has
$f(G) = f(HG)$, so
$f|_{\mathbf{H}G} = \mathrm{constant}$
for every $G \in \mathbf{G}$. Moreover, assume that $\mathbf{G}/\mathbf{H}$ is a naturally reductive coset space.
Implementing a gradient algorithm for the induced quality function $\hat f$ on $\mathbf{G}/\mathbf{H}$
finally yields the following recursion scheme
$[G_{k+1}] := [\exp(\alpha_k\, \mathrm{grad} f(G_k)\, G_k^{-1})\, G_k]$, (3.64)
where $\alpha_k > 0$ denotes a suitable step size. This, however, is not surprising, which
can be seen as follows. With $\mathbf{G}/\mathbf{H}$ being naturally reductive, there is the reductive
decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ with $\mathfrak{p} := \mathfrak{h}^{\perp}$, such that any $\xi \in \mathfrak{g}$ decomposes uniquely
into $\xi = \xi_{\mathfrak{h}} + \xi_{\mathfrak{p}}$. Then the equivariance of f guarantees that its gradient at $G \in \mathbf{G}$
is orthogonal to the coset $\mathbf{H}G$. Thus one finds
$\langle \mathrm{grad} f(G) \mid \xi_{\mathfrak{h}} G \rangle = Df(G)\, \xi_{\mathfrak{h}} G = 0$
for all $\xi_{\mathfrak{h}} \in \mathfrak{h}$. Therefore, the pullback of the gradient of f to $\mathfrak{g}$ satisfies
$\mathrm{grad} f(G)\, G^{-1} \in \mathfrak{p}$. Furthermore, combining Eqs. (3.38) and (3.63) with the identity
$D(\Pi \circ l_G)(1)\, \xi = D\Pi(G)\, G\xi$
for all $\xi \in \mathfrak{g}$ (cf. Remark 3.3) yields
$\mathrm{grad}\, \hat f([G]) = D(\Pi \circ l_G)(1)(G^{-1}\, \mathrm{grad} f(G))$.
Thus from Eq. (3.50) we finally obtain
$\exp_{[G]}(t\, \mathrm{grad}\, \hat f([G])) = [\exp(t\, \mathrm{grad} f(G)\, G^{-1})\, G]$
for all $t \in \mathbb{R}$, where $\exp_{[G]}$ denotes the Riemannian exponential map at [G], cf.
Eqs. (2.10) and (2.11). This precisely explains why recursion scheme (3.64) resembles
the corresponding one on the group level.
3.4. Examples
Often practically relevant quality functions take the form of a linear functional
restricted to an adjoint orbit O(X). For instance, in quantum dynamics the unitary
orbit
$O(A) := \{UAU^{\dagger} \mid U \in SU(N)\}$ (3.65)
of an initial state A plays a central role, because it defines the largest reachability
set under closed Hamiltonian dynamics. Then the set of feasible expectation values
is such a linear map, since it is the projection onto an observable C in the sense of
a Hilbert-Schmidt scalar product. These expectation values can be generalized to
arbitrary complex square matrices $A, C \in \mathbb{C}^{N \times N}$ such as to coincide with elements
of the C-numerical range
$W(C, A) := \{\mathrm{tr}(C^{\dagger} UAU^{\dagger}) \mid U \in SU(N)\}$. (3.66)
As C-numerical ranges are well established in the mathematical literature [111, 112],
in the sequel we will adopt the notation. Note that finding the maximum absolute
value, i.e. the C-numerical radius
$r(C, A) := \max_{U \in SU(N)} |\mathrm{tr}\{C^{\dagger} UAU^{\dagger}\}|$ (3.67)
is straightforward for Hermitian A, C (it amounts to sorting the respective
eigenvalues, cf. Corollary 3.8), while for arbitrary complex A, C there is no
general analytical solution. Moreover, when restricting to local unitary operations
$K \in SU_{\mathrm{loc}}(2^n) := SU(2)^{\otimes n}$, the maximization task becomes non-trivial even for
Hermitian A, C [113, 114].
Having set the frame, we now illustrate the previous theory by gradient flows
on the entire unitary group $SU(2^n)$, on the local unitary group $SU(2)^{\otimes n}$ as well as
their adjoint orbits.

3.4.1. Geometric optimization by gradient flows on SU(N)

Consider a fully controllable system $(\Sigma)$ on SU(N) in the sense that the entire
group SU(N) can be generated by evolutions under the Hamiltonian of the system
plus the available controls. If A is an initial density operator or a matrix collecting
its signal-relevant terms, then the reachable set to A coincides with the orbit of
the canonical (semi)group action of $(\Sigma)$ on A, which yields the entire unitary
orbit O(A), cf. Eq. (3.65). Recall that its projection on some observable C (or its
signal-relevant terms) forms the C-numerical range of A, cf. Eq. (3.66).
In this setting, there are two geometric optimization tasks of particular practical
relevance as they determine maximal signal intensity in coherent spectroscopy [27].

(a) Find all points on the unitary orbit of A that minimize the Euclidean distance
to C.
(b) Find all points on the unitary orbit of A that minimize the angle to the
1-dimensional, complex subspace spanned by C.
Clearly, the distance
$\|UAU^{\dagger} - C\|_2^2 = \|A\|_2^2 + \|C\|_2^2 - 2\, \mathrm{Re}\{\mathrm{tr}(C^{\dagger} UAU^{\dagger})\}$ (3.68)
is minimal if the overlap $\mathrm{Re}\{\mathrm{tr}(C^{\dagger} UAU^{\dagger})\}$ is maximal. Moreover, making use of
the definition of the angle between 1-dimensional complex subspaces
$\cos^2(\angle\{UAU^{\dagger}, C\}) := \frac{|\mathrm{tr}\{C^{\dagger} UAU^{\dagger}\}|^2}{\|A\|_2^2\, \|C\|_2^2}$, (3.69)
problem (b) is equivalent to maximizing the function $|\mathrm{tr}(C^{\dagger} UAU^{\dagger})|$. Its maximal
value is the C-numerical radius of A (see Eq. (3.67)). Obviously, $r_C(A) \le \|A\|_2 \|C\|_2$
with equality if and only if $UAU^{\dagger}$ and C are complex collinear for some $U \in SU(N)$.
Note that the two tasks (a) and (b) are equivalent whenever the C-numerical range
forms a circular disk in the complex plane (centred at the origin); conditions for
circular symmetry have been characterized in [115].
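
The distance identity (3.68) and the bound on the overlap can be checked numerically. The following is a minimal sketch with randomly drawn matrices; the dimension, the random seed and the way a special unitary U is generated (QR factorization followed by a determinant normalization) are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4


def random_complex(size):
    """Random complex square matrix with Gaussian entries."""
    return rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))


A = random_complex(n)
C = random_complex(n)

# Unitary factor of a QR decomposition, rescaled so that det(U) = 1.
Q, _ = np.linalg.qr(random_complex(n))
U = Q / np.linalg.det(Q) ** (1.0 / n)

X = U @ A @ U.conj().T                 # point U A U^dagger on the unitary orbit
overlap = np.trace(C.conj().T @ X)     # tr(C^dagger U A U^dagger)

# Both sides of Eq. (3.68) in the Frobenius (Hilbert-Schmidt) norm.
lhs = np.linalg.norm(X - C) ** 2
rhs = np.linalg.norm(A) ** 2 + np.linalg.norm(C) ** 2 - 2.0 * overlap.real

# Cauchy-Schwarz bound |tr(C^dagger U A U^dagger)| <= ||A||_2 ||C||_2.
bound = np.linalg.norm(A) * np.linalg.norm(C)
```

Since the norm of $UAU^{\dagger}$ equals that of A, minimizing `lhs` over U is the same as maximizing `overlap.real`, which is the reformulation used in the text.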
Extending concepts of Brockett [17] from the orthogonal to the special unitary
group [27, 28, 116], the above optimization problems (a) and (b) can be treated by
the previously presented gradient-flow methods, cf. also [22, 23].
For fixed matrices $A, C \in \mathbb{C}^{N \times N}$ define
$f_1 : SU(N) \to \mathbb{R}$, $f_1(U) := \mathrm{Re}\, \mathrm{tr}(C^{\dagger} UAU^{\dagger})$ (3.70)
and
$f_2 : SU(N) \to \mathbb{R}$, $f_2(U) := |\mathrm{tr}(C^{\dagger} UAU^{\dagger})|^2$. (3.71)
Observe that the distance problem (a) is solved by maximizing $f_1$, while the angle
problem (b) is solved for maximal $f_2$.
Now, the differential and the gradient of $f_1$ with respect to the bi-invariant
Riemannian metric Eq. (3.77) are given by
Df1 (U )(U ) = Re tr([UAU , C ]),
grad f1 (U ) = [UAU , C ]S U,
as will be illustrated in the worked example below. The dierential and the gradient
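of $f_2$ can be obtained in the same manner as
$$Df_2(U)(\Omega U) = \overline{\mathrm{tr}(C^\dagger UAU^\dagger)}\;\mathrm{tr}([UAU^\dagger, C^\dagger]\,\Omega) + \mathrm{tr}(C^\dagger UAU^\dagger)\;\overline{\mathrm{tr}([UAU^\dagger, C^\dagger]\,\Omega)},$$
$$\mathrm{grad}\, f_2(U) = 2\big(\overline{\mathrm{tr}(C^\dagger UAU^\dagger)}\;[UAU^\dagger, C^\dagger]\big)_S^\dagger\, U,$$
where $(\cdot)_S$ denotes the skew-Hermitian part.
This yields the following result.
Theorem 3.14. The gradient systems of $f_\nu$, $\nu = 1, 2$, with respect to the bi-invariant Riemannian metric (3.77) are given by
$$\dot U = \Omega_\nu(U)\,U \tag{3.72}$$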
with
$$\Omega_1(U) := [UAU^\dagger, C^\dagger]_S^\dagger \quad\text{and}\quad \Omega_2(U) := 2\big(\overline{\mathrm{tr}(C^\dagger UAU^\dagger)}\;[UAU^\dagger, C^\dagger]\big)_S^\dagger, \tag{3.73}$$
respectively. Each solution of (3.72) converges to a respective critical point for $t \to +\infty$. Thereby, the critical points of $f_\nu$ are characterized by $\Omega_\nu(U) = 0$, $\nu = 1, 2$.

Proof. The above computations immediately yield Eq. (3.72). As $f_\nu$, $\nu = 1, 2$, are real analytic, the convergence of each solution to a critical point is guaranteed by Proposition 3.1 and Theorem 3.1, cf. [116].

An implementable numerical integration scheme for the above gradient systems making use of the Riemannian exponential, see Eqs. (2.9) and (2.11), is given by
$$U^{(\nu)}_{k+1} = \exp\big(\alpha^{(\nu)}_k\,\Omega_\nu(U^{(\nu)}_k)\big)\,U^{(\nu)}_k, \qquad U^{(\nu)}_0 = \mathbf{1}_N. \tag{3.74}$$
A suitable choice of step sizes $\alpha^{(\nu)}_k > 0$ ensuring convergence can be found in [27, 28, 116]. Generically, it drives $U^{(\nu)}_k$ into final states attaining the maxima of the quality functions $f_\nu$, $\nu = 1, 2$. However, there is no guarantee that the gradient flows always reach the global maxima. Standard numerical integration procedures such as the Euler method are not applicable here, as they would not preserve unitarity.

3.4.2. Worked example


We now derive the discretized integration scheme maximizing the quality function $f_1$ in full detail. To this end, recall that $SU(N)$ is a compact connected Lie group of real dimension $N^2 - 1$. Its Lie algebra, i.e. its tangent space at the identity, is given by the set $\mathfrak{su}(N)$ of all skew-Hermitian matrices with vanishing trace, i.e.
$$\mathfrak{su}(N) := \{\Omega \in \mathbb{C}^{N\times N} \mid \Omega^\dagger = -\Omega,\ \mathrm{tr}\,\Omega = 0\}. \tag{3.75}$$
So elements $\Omega \in \mathfrak{su}(N)$ relate to Hamiltonians $H$ via $\Omega = iH$. The tangent space at an arbitrary element $U \in SU(N)$ is
$$T_U SU(N) = \mathfrak{su}(N)\,U = \{\Omega U \mid \Omega \in \mathfrak{su}(N)\}, \tag{3.76}$$
cf. Eq. (3.13). Moreover, let $SU(N)$ be endowed with the bi-invariant Riemannian metric
$$\langle \Omega U \mid \Xi U\rangle_U := \mathrm{tr}(\Omega^\dagger \Xi), \tag{3.77}$$
defined on the tangent spaces $T_U SU(N)$, cf. Eq. (3.15).
Now set
$$F : SU(N) \to \mathbb{C}^{N\times N}, \quad F(U) := C^\dagger UAU^\dagger,$$
$$f : SU(N) \to \mathbb{R}, \quad f(U) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger UAU^\dagger\}.$$
For computing the tangent map of $F$, we exploit the fact that $SU(N)$ is an embedded submanifold of $\mathbb{C}^{N\times N}$. Therefore, the tangent map is obtained by restricting
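the ordinary Fréchet derivative $DF(U)$ to the tangent space $T_U SU(N)$, cf. Eqs. (3.4) and (3.5). Thus, by applying the product rule and $\Omega^\dagger = -\Omega$, one easily finds
$$DF(U)(\Omega U) = C^\dagger \Omega UAU^\dagger + C^\dagger UA(\Omega U)^\dagger = C^\dagger \Omega UAU^\dagger - C^\dagger UAU^\dagger \Omega.$$
Now, the chain rule as well as the short-hand notations $\tilde A := UAU^\dagger$ and $[\cdot,\cdot]_S$ to denote the skew-Hermitian part of the commutator $[\cdot,\cdot]$ give
$$\begin{aligned}
Df(U)(\Omega U) &= D(\mathrm{Re}\,\mathrm{tr})(F(U)) \circ DF(U)(\Omega U) \\
&= \mathrm{Re}\,\mathrm{tr}\{C^\dagger \Omega \tilde A - C^\dagger \tilde A \Omega\} = \mathrm{Re}\,\mathrm{tr}\{[\tilde A, C^\dagger]\,\Omega\} = \mathrm{tr}\{[\tilde A, C^\dagger]_S\,\Omega\} \\
&= \langle [\tilde A, C^\dagger]_S^\dagger \mid \Omega\rangle = \langle [\tilde A, C^\dagger]_S^\dagger\, U \mid \Omega U\rangle,
\end{aligned}$$
where the last identity explicitly invokes the right-invariance of the Riemannian metric on $SU(N)$, cf. Eq. (3.77). Next, identifying the above expression with
$$Df(U)(\Omega U) = \langle \mathrm{grad}\, f(U) \mid \Omega U\rangle \tag{3.78}$$
one gets the gradient vector field
$$\mathrm{grad}\, f(U) = [\tilde A, C^\dagger]_S^\dagger\, U \tag{3.79}$$
and thus the gradient system
$$\dot U = \mathrm{grad}\, f(U) = [\tilde A, C^\dagger]_S^\dagger\, U. \tag{3.80}$$
By the Riemannian exponential, see Eqs. (2.9) and (2.11), and with $\alpha_k \ge 0$ as an appropriate step size, we finally arrive at the discretization
$$U_{k+1} = e^{-\alpha_k [U_k A U_k^\dagger,\, C^\dagger]_S}\, U_k, \tag{3.81}$$
where the sign in the exponent stems from $[\cdot,\cdot]_S^\dagger = -[\cdot,\cdot]_S$.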
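The recursion (3.81) can be sketched in a few lines. The following is a minimal illustration, assuming NumPy; the names `skew_part`, `expm_skew` and `gradient_ascent` are hypothetical helpers, and the constant step size $1/(4\|A\|_2\|C\|_2)$ is a conservative assumption of this sketch (it ensures that $f_1$ never decreases) rather than the step-size selection of [27, 28, 116].

```python
import numpy as np

def skew_part(m):
    """Skew-Hermitian part (m - m^dagger)/2 of a square matrix."""
    return 0.5 * (m - m.conj().T)

def expm_skew(omega):
    """exp(omega) for skew-Hermitian omega, via the Hermitian matrix -i*omega."""
    w, v = np.linalg.eigh(-1j * omega)
    return (v * np.exp(1j * w)) @ v.conj().T

def f1(U, A, C):
    """Quality function Re tr(C^dagger U A U^dagger), Eq. (3.70)."""
    return np.trace(C.conj().T @ U @ A @ U.conj().T).real

def gradient_ascent(A, C, steps=200):
    # Constant step: an assumption of this sketch, chosen small enough that
    # the second-order term cannot outweigh the first-order gain.
    alpha = 1.0 / (4.0 * np.linalg.norm(A) * np.linalg.norm(C))
    U = np.eye(A.shape[0], dtype=complex)
    values = [f1(U, A, C)]
    for _ in range(steps):
        At = U @ A @ U.conj().T
        S = skew_part(At @ C.conj().T - C.conj().T @ At)   # [U A U^dagger, C^dagger]_S
        U = expm_skew(-alpha * S) @ U                      # Eq. (3.81)
        values.append(f1(U, A, C))
    return U, values

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = M + M.conj().T                                  # random Hermitian A
C = np.diag([3.0, 2.0, 1.0]).astype(complex)
U, values = gradient_ascent(A, C)
```

Because every factor is the exponential of a traceless skew-Hermitian matrix, the iterates stay in $SU(N)$ up to rounding, in contrast to an Euler step. For Hermitian $A$ and $C$ the values are bounded above by the sum of products of equally ordered eigenvalues.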

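3.4.3. Gradient flows on the local subgroup $SU_{\mathrm{loc}}(2^n)$

The quality functions introduced in the previous subsection may be restricted to the subgroup of local action, i.e. to
$$SU_{\mathrm{loc}}(2^n) := \underbrace{SU(2) \otimes \cdots \otimes SU(2)}_{n\text{-times}} \subset SU(2^n). \tag{3.82}$$
Let the Pauli matrices be defined as
$$\sigma_x := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.83}$$
Moreover, the $\sigma_{k,\alpha}$, $\alpha \in \{x, y, z\}$, are defined by
$$\sigma_{k,\alpha} := \mathbf{1}_2 \otimes \cdots \otimes \mathbf{1}_2 \otimes \sigma_\alpha \otimes \mathbf{1}_2 \otimes \cdots \otimes \mathbf{1}_2, \tag{3.84}$$
where the term $\sigma_\alpha$ appears in the $k$th position of the Kronecker product and $\mathbf{1}_2$ denotes the $2\times 2$ identity matrix.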
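The Lie subalgebra to $SU_{\mathrm{loc}}(2^n) \subset SU(2^n)$ can be specified by
$$\mathfrak{su}_{\mathrm{loc}}(2^n) := \Big\{ \sum_{k=1}^{n} \mathbf{1}_2 \otimes \cdots \otimes \mathbf{1}_2 \otimes \Omega_k \otimes \mathbf{1}_2 \otimes \cdots \otimes \mathbf{1}_2 \ \Big|\ \Omega_k \in \mathfrak{su}(2) \Big\},$$
with the term $\Omega_k \in \mathfrak{su}(2)$ appearing at the $k$th position, cf. Eq. (3.84). Then the tangent space of $SU_{\mathrm{loc}}(2^n)$ at an arbitrary element $U$ is given by
$$T_U SU_{\mathrm{loc}}(2^n) = \{\Omega U \mid \Omega \in \mathfrak{su}_{\mathrm{loc}}(2^n)\}. \tag{3.85}$$
Finally, $SU_{\mathrm{loc}}(2^n)$ is endowed with the bi-invariant Riemannian metric induced by $SU(2^n)$, i.e.
$$\langle \Omega U, \Xi U\rangle_U := \mathrm{tr}(\Omega^\dagger \Xi) \tag{3.86}$$
for $\Omega U, \Xi U \in T_U SU_{\mathrm{loc}}(2^n)$.

Lemma 3.3. Let $H \subset GL(N, \mathbb{C})$ be any closed subgroup with Lie algebra $\mathfrak{h} \subset \mathfrak{gl}(N, \mathbb{C}) := \mathbb{C}^{N\times N}$. Moreover, let $h_1, \dots, h_m$ be a real orthonormal basis of $\mathfrak{h}$ with respect to the real scalar product
$$(g_1 \mid g_2) := \mathrm{Re}\,\mathrm{tr}(g_1^\dagger g_2), \quad g_1, g_2 \in \mathbb{C}^{N\times N}, \tag{3.87}$$
i.e. $\mathrm{span}_{\mathbb{R}}\{h_1, \dots, h_m\} = \mathfrak{h}$ and $(h_i \mid h_j) = \delta_{ij}$.
(a) Then the orthogonal projection $P : \mathbb{C}^{N\times N} \to \mathbb{C}^{N\times N}$ onto $\mathfrak{h}$ is given by
$$g \mapsto Pg := \sum_{j=1}^{m} \mathrm{Re}\,\mathrm{tr}\{h_j^\dagger g\}\, h_j. \tag{3.88}$$
(b) The orthogonal projection $P^\perp : \mathbb{C}^{N\times N} \to \mathbb{C}^{N\times N}$ onto the orthogonal complement $\mathfrak{h}^\perp$ is given by
$$g \mapsto P^\perp g = g - Pg.$$

Proof. Both (a) and (b) are basic and well-known facts from linear algebra.

Remark 3.7. For the unitary case, i.e. for $\mathfrak{h} \subset \mathfrak{su}(N)$, the real part in Eq. (3.88) can be neglected and the projector $P$ can be rewritten in the more convenient matrix form $\widehat P$ as
$$\widehat P := \sum_{j=1}^{m} \mathrm{vec}(h_j)\, \mathrm{vec}(h_j)^\dagger, \tag{3.89}$$
where the terms $\mathrm{vec}(h_j)\,\mathrm{vec}(h_j)^\dagger$ represent the rank-1 projectors $P_j = |h_j\rangle\langle h_j|$ in vec-notation.
Corollary 3.6. The orthogonal projection $P : \mathbb{C}^{N\times N} \to \mathbb{C}^{N\times N}$ onto $\mathfrak{su}_{\mathrm{loc}}(2^n)$ with respect to (3.87) is given by
$$Pg := \frac{1}{2^n} \sum_{k=1}^{n} \big( \mathrm{Re}(\mathrm{tr}(g^\dagger X_k))\,X_k + \mathrm{Re}(\mathrm{tr}(g^\dagger Y_k))\,Y_k + \mathrm{Re}(\mathrm{tr}(g^\dagger Z_k))\,Z_k \big),$$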
where $X_k$, $Y_k$ and $Z_k$ are defined by, cf. Eq. (3.84),
$$X_k := i\sigma_{k,x}, \quad Y_k := i\sigma_{k,y}, \quad Z_k := i\sigma_{k,z}.$$

Proof. This follows straightforwardly from the orthogonality of the set $\{X_k, Y_k, Z_k \mid k = 1, \dots, n\}$ and Lemma 3.3.
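The projection of Corollary 3.6 is straightforward to realize numerically. Below is a minimal sketch, assuming NumPy; `local_basis` and `project_local` are hypothetical helper names introduced only for this illustration. The factor $1/2^n$ accounts for $\mathrm{tr}(X_k^\dagger X_k) = 2^n$.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),     # sigma_z
]

def local_basis(n):
    """Return [X_1, Y_1, Z_1, ..., X_n, Y_n, Z_n] with X_k = i*sigma_{k,x} etc."""
    basis = []
    for k in range(n):
        for sigma in PAULIS:
            factors = [I2] * n
            factors[k] = sigma                       # Pauli matrix at position k
            basis.append(1j * reduce(np.kron, factors))
    return basis

def project_local(g, basis):
    """Orthogonal projection of g onto su_loc(2^n) w.r.t. Re tr(g1^dagger g2)."""
    dim = g.shape[0]                                 # dim = 2^n
    return sum((np.trace(g.conj().T @ h).real / dim) * h for h in basis)

n = 2
basis = local_basis(n)
rng = np.random.default_rng(2)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Pg = project_local(g, basis)
```

The result `Pg` is skew-Hermitian and traceless, applying the projection twice changes nothing, and the remainder `g - Pg` is orthogonal to `Pg` with respect to (3.87).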

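Theorem 3.15. Let $f_{\mathrm{loc}}$ be the restriction of (3.70) to $SU_{\mathrm{loc}}(2^n)$.

(a) The gradient of $f_{\mathrm{loc}}$ with respect to (3.86) and the corresponding gradient system are given by
$$\mathrm{grad}\, f_{\mathrm{loc}}(U) = P([C^\dagger, UAU^\dagger])\,U \tag{3.90}$$
and
$$\dot U = P([C^\dagger, UAU^\dagger])\,U, \tag{3.91}$$
where $P$ denotes the orthogonal projection $P : \mathfrak{gl}(2^n, \mathbb{C}) \to \mathfrak{gl}(2^n, \mathbb{C})$ onto $\mathfrak{su}_{\mathrm{loc}}(2^n)$. More explicitly, for $U = U_1 \otimes \cdots \otimes U_n$, (3.91) is equivalent to a system of $n$ coupled equations
$$\dot U_k = \Omega_k U_k, \quad k = 1, \dots, n \tag{3.92}$$
on $SU(2)$, where
$$\Omega_k = \frac{1}{2^n}\big( \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger X_k))\,X + \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger Y_k))\,Y + \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger Z_k))\,Z \big).$$
Each solution of (3.91) converges for $t \to +\infty$ to a critical point of $f_{\mathrm{loc}}$ characterized by
$$P([C^\dagger, UAU^\dagger]) = 0. \tag{3.93}$$
(b) The Hessian form $\mathrm{Hess}\, f_{\mathrm{loc}}(U)$ and the Hessian operator $\mathbf{Hess}\, f_{\mathrm{loc}}(U)$ of $f_{\mathrm{loc}}$ at $U$ are given by
$$\mathrm{Hess}\, f_{\mathrm{loc}}(U)(\Omega U, \Xi U) = \frac{1}{2}\big( \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [C^\dagger, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [UAU^\dagger, [\Xi, C^\dagger]])) \big) \tag{3.94}$$
and
$$\mathbf{Hess}\, f_{\mathrm{loc}}(U)\,\Xi U = (S(U)\Xi)\,U, \tag{3.95}$$
respectively, with $\Xi \in \mathfrak{su}_{\mathrm{loc}}(2^n)$ and
$$S(U)\Xi := \frac{1}{2} P\big([C^\dagger, [\Xi, UAU^\dagger]] + [UAU^\dagger, [\Xi, C^\dagger]]\big).$$
(c) For all initial points $U_0 \in SU_{\mathrm{loc}}(2^n)$ the discretization scheme
$$U_{k+1} := \exp\big(\alpha_k P([C^\dagger, U_k A U_k^\dagger])\big)\,U_k \tag{3.96}$$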
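with step size
$$\alpha_k = \frac{\|P([C^\dagger, U_k A U_k^\dagger])\|^2}{\|[C^\dagger, P([C^\dagger, U_k A U_k^\dagger])]\| \cdot \|[P([C^\dagger, U_k A U_k^\dagger]), U_k A U_k^\dagger]\|} \tag{3.97}$$
converges to the set of critical points of $f_{\mathrm{loc}}$.

Proof. The subsequent arguments follow our conference report [117], which also contains a complete proof for the flow on entire groups such as $SU(2^n)$.

(a) Since $SU_{\mathrm{loc}}(2^n)$ is a closed subgroup of $SU(2^n)$, it is also an embedded Lie subgroup and thus a submanifold of $SU(2^n)$, cf. Remark 3.2. Therefore, the gradient of $f_{\mathrm{loc}}$ is well-defined by (3.4). Furthermore, by (3.23) and (3.73) we obtain
$$\mathrm{grad}\, f_{\mathrm{loc}}(U) = P(\mathrm{grad}\, f_1(U)) = P([UAU^\dagger, C^\dagger]_S^\dagger)\,U = P([C^\dagger, UAU^\dagger])\,U,$$
where the last equality follows from $P([UAU^\dagger, C^\dagger]_S^\dagger) = -P([UAU^\dagger, C^\dagger])$ and the skew-symmetry of the commutator. Moreover, Eq. (3.92) is derived by Corollary 3.6 and the identity
$$\frac{d}{dt}\big(U_1(t) \otimes \cdots \otimes U_n(t)\big) = \Big( \sum_{k=1}^{n} \mathbf{1}_2 \otimes \cdots \otimes \dot U_k(t) U_k^{-1}(t) \otimes \cdots \otimes \mathbf{1}_2 \Big)\big(U_1(t) \otimes \cdots \otimes U_n(t)\big).$$
Compactness of $SU_{\mathrm{loc}}(2^n)$ and real analyticity of $f_{\mathrm{loc}}$ imply that each solution converges to a critical point for $t \to +\infty$, cf. Proposition 3.1 and Theorem 3.1.
(b) By (3.9), the Hessian of $f_{\mathrm{loc}}$ at $U$ is determined by evaluating the second derivative of $\varphi := f_{\mathrm{loc}} \circ \gamma$ at $t = 0$, where $\gamma$ is any geodesic. This yields
$$\mathrm{Hess}\, f_{\mathrm{loc}}(U)(\Omega U, \Omega U) := \varphi''(0) = \mathrm{Re}(\mathrm{tr}(C^\dagger [\Omega, [\Omega, UAU^\dagger]])), \tag{3.98}$$
for $\Omega \in \mathfrak{su}_{\mathrm{loc}}(2^n)$. The Hessian form then is obtained from the quadratic form (3.98) by a standard polarisation argument, Eq. (3.8), i.e.
$$\mathrm{Hess}\, f_{\mathrm{loc}}(U)(\Omega U, \Xi U) = \frac{1}{2}\big( \mathrm{Re}(\mathrm{tr}(C^\dagger [\Omega, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(C^\dagger [\Xi, [\Omega, UAU^\dagger]])) \big).$$
Finally, by the identity $\mathrm{tr}([X, Y]Z) = -\mathrm{tr}(Y[X, Z])$ we conclude
$$\mathrm{Hess}\, f_{\mathrm{loc}}(U)(\Omega U, \Xi U) = \frac{1}{2}\big( \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [C^\dagger, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [UAU^\dagger, [\Xi, C^\dagger]])) \big).$$
Therefore, the Hessian operator of $f_{\mathrm{loc}}$ at $U$ is given by
$$\mathbf{Hess}\, f_{\mathrm{loc}}(U)\,\Xi U = (S(U)\Xi)\,U$$
with $\Xi \in \mathfrak{su}_{\mathrm{loc}}(2^n)$ and
$$S(U)\Xi := \frac{1}{2} P\big([C^\dagger, [\Xi, UAU^\dagger]] + [UAU^\dagger, [\Xi, C^\dagger]]\big).$$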
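(c) Estimating the second derivative
$$\varphi''(t) = \mathrm{Re}(\mathrm{tr}([C^\dagger, \Omega]\,[\Omega, e^{t\Omega} UAU^\dagger e^{-t\Omega}]))$$
for $\Omega := P([C^\dagger, UAU^\dagger])$, i.e. $\mathrm{grad}\, f_{\mathrm{loc}}(U) = \Omega U$, and $U \in SU_{\mathrm{loc}}(2^n)$ yields
$$|\varphi''(t)| \le \|[C^\dagger, \Omega]\| \cdot \|[\Omega, e^{t\Omega} UAU^\dagger e^{-t\Omega}]\| = \|[C^\dagger, \Omega]\| \cdot \|[\Omega, UAU^\dagger]\|.$$
Therefore, we get the estimate
$$\max_{t \ge 0} \Big| \frac{d^2}{dt^2} f_{\mathrm{loc}}(\exp_U(t\,\Omega U)) \Big| \le \|[C^\dagger, \Omega]\| \cdot \|[\Omega, UAU^\dagger]\|$$
for $\Omega U = \mathrm{grad}\, f_{\mathrm{loc}}(U)$. Now, a standard Lyapunov-type argument, similar to the proof of Theorem 3.3 in [22], yields the desired result.

For similar discretization schemes in different contexts or other intrinsic Riemannian methods see also [19, 22, 27, 118].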
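The scheme (3.96) with the step size (3.97) may be sketched as follows for $n$ qubits. This is a minimal illustration, assuming NumPy and Frobenius norms in (3.97); all function names are hypothetical, and the example data (a random two-qubit state with $C := xx^\dagger$, $A := \mathrm{diag}(1, 0, 0, 0)$) are chosen only for demonstration.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def local_basis(n):
    """Basis X_k, Y_k, Z_k of su_loc(2^n), cf. Corollary 3.6."""
    basis = []
    for k in range(n):
        for sigma in PAULIS:
            factors = [I2] * n
            factors[k] = sigma
            basis.append(1j * reduce(np.kron, factors))
    return basis

def project_local(g, basis):
    return sum((np.trace(g.conj().T @ h).real / g.shape[0]) * h for h in basis)

def expm_skew(omega):
    """exp(omega) for skew-Hermitian omega."""
    w, v = np.linalg.eigh(-1j * omega)
    return (v * np.exp(1j * w)) @ v.conj().T

def comm(a, b):
    return a @ b - b @ a

def local_flow(A, C, n, steps=500, tol=1e-12):
    basis, Cd = local_basis(n), C.conj().T
    U = np.eye(2 ** n, dtype=complex)
    values = [np.trace(Cd @ A).real]
    for _ in range(steps):
        At = U @ A @ U.conj().T
        omega = project_local(comm(Cd, At), basis)      # P([C^dagger, U A U^dagger])
        denom = np.linalg.norm(comm(Cd, omega)) * np.linalg.norm(comm(omega, At))
        if denom < tol:                                  # numerically critical
            break
        alpha = np.linalg.norm(omega) ** 2 / denom       # Eq. (3.97)
        U = expm_skew(alpha * omega) @ U                 # Eq. (3.96)
        values.append(np.trace(Cd @ U @ A @ U.conj().T).real)
    return U, values

rng = np.random.default_rng(6)
x = rng.normal(size=4) + 1j * rng.normal(size=4)
x /= np.linalg.norm(x)
C = np.outer(x, x.conj())
A = np.diag([1.0, 0.0, 0.0, 0.0]).astype(complex)
U, values = local_flow(A, C, n=2)
sigma_max = np.linalg.svd(x.reshape(2, 2), compute_uv=False)[0]
```

With this step size the second-order estimate from part (c) of the proof gives a non-negative gain in every iteration. For two qubits the values cannot exceed the squared largest singular value `sigma_max ** 2` of the reshaped state, which is the largest overlap with any product state.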

3.4.4. Double-bracket flows as gradient flows on naturally reductive homogeneous spaces

The well-known double-bracket flows have established themselves as useful tools for diagonalizing matrices (usually real symmetric ones) as well as for sorting lists [17, 19, 22, 23]. Moreover, they relate to Hamiltonian integrable systems [119, 120]. (Note again that in many-particle physics gradient flows were later introduced independently for diagonalizing Hamiltonians [51, 52].) In summarizing the most important results, we show that double-bracket flows can be viewed as special cases of gradient flows on naturally reductive homogeneous spaces $G/H$ in terms of Sec. 3.3, where $H$ is a stabilizer group, which is typically not normal. Then the homogeneous space $G/H$ does not constitute a group itself.
Let $O(A)$ as in Eq. (3.65) denote the unitary orbit of some $A \in \mathbb{C}^{N\times N}$. Note that the adjoint action $(U, A) \mapsto \mathrm{Ad}_U A := UAU^\dagger$ of $SU(N)$ constitutes a left action on the Lie algebra $\mathfrak{g} := \mathbb{C}^{N\times N}$. However, this should not cause any confusion for the reader since the key result we refer to (Corollary 3.4) was presented for left actions.
Let $C \in \mathbb{C}^{N\times N}$ be another complex matrix. For minimizing the (squared) Euclidean distance $\|X - C\|_2^2$ between $C$ and the unitary orbit of $A$ we derive a gradient flow maximizing the target function
$$\hat f(X) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger X\} \tag{3.99}$$
over $X \in O(A)$. Clearly, this is but an alternative to tackling the problem by a gradient flow on the unitary group, since as in Sec. 3.3, we have the equivalence
$$\max_{X \in O(A)} \hat f(X) = \max_{U \in SU(N)} f(U) \tag{3.100}$$
for $f(U) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger UAU^\dagger\}$.


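Building upon Corollary 3.4, we have the following facts: $O(A)$ constitutes a compact and connected naturally reductive homogeneous space isomorphic to $SU(N)/H$. Here,
$$H := \{U \in SU(N) \mid \mathrm{Ad}_U A = A\} \tag{3.101}$$
denotes the stabilizer group of $A$. Recalling that the Lie algebra of $SU(N)$ is $\mathfrak{su}(N)$, we further obtain for the tangent space of $O(A)$ at $X = \mathrm{Ad}_U A$ the form
$$T_X O(A) = \{\mathrm{ad}_X \Omega \mid \Omega \in \mathfrak{su}(N)\} \tag{3.102}$$
with $\mathrm{ad}_X \Omega := [X, \Omega]$. Moreover, the kernel of $\mathrm{ad}_A : \mathfrak{su}(N) \to \mathfrak{g}$ reads
$$\mathfrak{h} = \{\Omega \in \mathfrak{su}(N) \mid [A, \Omega] = 0\} \tag{3.103}$$
and forms the Lie subalgebra to $H$. Now, by the standard Hilbert-Schmidt scalar product $(\Omega_1, \Omega_2) \mapsto \mathrm{tr}\{\Omega_1^\dagger \Omega_2\}$ on $\mathfrak{su}(N)$ one can define the ortho-complement to the above kernel as
$$\mathfrak{p} := \mathfrak{h}^\perp. \tag{3.104}$$
This induces a unique decomposition of any skew-Hermitian matrix $\Omega = \Omega^{\mathfrak{h}} + \Omega^{\mathfrak{p}}$ with $\Omega^{\mathfrak{h}} \in \mathfrak{h}$ and $\Omega^{\mathfrak{p}} \in \mathfrak{p}$. Finally, we obtain an $\mathrm{Ad}_{SU(N)}$-invariant Riemannian metric on $O(A)$ via
$$\langle \mathrm{ad}_X(\mathrm{Ad}_U \Omega_1) \mid \mathrm{ad}_X(\mathrm{Ad}_U \Omega_2)\rangle_X := \mathrm{tr}\{\Omega_1^{\mathfrak{p}\,\dagger}\, \Omega_2^{\mathfrak{p}}\} \tag{3.105}$$
for $X := \mathrm{Ad}_U A$, which is equivalent to saying
$$\langle \mathrm{ad}_X(\Omega_1) \mid \mathrm{ad}_X(\Omega_2)\rangle_X := \mathrm{tr}\{\Omega_1^{\mathfrak{p}_X\,\dagger}\, \Omega_2^{\mathfrak{p}_X}\} \tag{3.106}$$
with $\mathfrak{p}_X := \mathrm{Ad}_U\, \mathfrak{p}$. Now, the main results on double-bracket flows read as follows:

Theorem 3.16. Set $\hat f : O(A) \to \mathbb{R}$, $\hat f(X) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger X\}$. Then one finds:

(a) The gradient of $\hat f$ with respect to the Riemannian metric defined by Eq. (3.105) is given by
$$\mathrm{grad}\, \hat f(X) = [X, [X, C^\dagger]_S], \tag{3.107}$$
where $[X, C^\dagger]_S$ denotes the skew-Hermitian part of $[X, C^\dagger]$.

(b) The gradient flow
$$\dot X = \mathrm{grad}\, \hat f(X) = [X, [X, C^\dagger]_S] \tag{3.108}$$
defines an isospectral flow on $O(A) \subset \mathfrak{g}$. The solutions exist for all $t \ge 0$ and converge to a critical point $X_\infty$ of $\hat f(X)$ characterized by $[X_\infty, C^\dagger]_S = 0$.

Proof. (A detailed proof for the real case can be found in [22]; for an abstract Lie algebraic version see also [19].)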
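(a) For $X = \mathrm{Ad}_U A$ and $\xi = \mathrm{ad}_X \Omega \in T_X O(A)$ we obtain
$$D\hat f(X)\, \mathrm{ad}_X \Omega = \frac{d}{dt}\, \mathrm{Re}\,\mathrm{tr}\{C^\dagger e^{-t\Omega} X e^{t\Omega}\}\Big|_{t=0} = \mathrm{Re}\,\mathrm{tr}\{C^\dagger\, \mathrm{ad}_X \Omega\}.$$
Therefore, the gradient of $\hat f$ has to satisfy
$$\mathrm{Re}\,\mathrm{tr}\{C^\dagger\, \mathrm{ad}_X \Omega^{\mathfrak{p}_X}\} = \langle \mathrm{grad}\, \hat f(X) \mid \mathrm{ad}_X \Omega^{\mathfrak{p}_X}\rangle_X$$
for all $\Omega^{\mathfrak{p}_X} \in \mathfrak{p}_X$. Applying Eq. (3.105) to $X = A$ gives
$$\mathrm{Re}\,\mathrm{tr}\{C^\dagger\, \mathrm{ad}_A \Omega^{\mathfrak{p}}\} = \mathrm{tr}\{\Xi^{\mathfrak{p}\,\dagger}\, \Omega^{\mathfrak{p}}\}$$
for all $\Omega^{\mathfrak{p}} \in \mathfrak{p}$, where $\Xi^{\mathfrak{p}}$ is defined by $\mathrm{grad}\, \hat f(A) = \mathrm{ad}_A \Xi^{\mathfrak{p}}$ with $\Xi^{\mathfrak{p}} \in \mathfrak{p}$. Thus we finally arrive at
$$\mathrm{tr}\{(\mathrm{ad}_A C^\dagger)_S^\dagger\, \Omega^{\mathfrak{p}}\} = \mathrm{tr}\{\Xi^{\mathfrak{p}\,\dagger}\, \Omega^{\mathfrak{p}}\}$$
for all $\Omega^{\mathfrak{p}} \in \mathfrak{p}$, where $(\mathrm{ad}_A C^\dagger)_S$ denotes the skew-Hermitian part of $\mathrm{ad}_A C^\dagger$. Hence, $\Xi^{\mathfrak{p}} = \big((\mathrm{ad}_A C^\dagger)_S\big)^{\mathfrak{p}}$. Moreover, for $\Omega^{\mathfrak{h}} \in \mathfrak{h}$, we have
$$\mathrm{tr}\{(\mathrm{ad}_A C^\dagger)_S^\dagger\, \Omega^{\mathfrak{h}}\} = -\mathrm{Re}\,\mathrm{tr}\{(\mathrm{ad}_A C^\dagger)\, \Omega^{\mathfrak{h}}\} = \mathrm{Re}\,\mathrm{tr}\{C^\dagger\, \mathrm{ad}_A \Omega^{\mathfrak{h}}\} = 0.$$
Hence, $(\mathrm{ad}_A C^\dagger)_S \in \mathfrak{p}$ and therefore
$$\mathrm{grad}\, \hat f(A) = \mathrm{ad}_A (\mathrm{ad}_A C^\dagger)_S = [A, [A, C^\dagger]_S].$$
The same arguments apply to $X = \mathrm{Ad}_U A$ and thus
$$\mathrm{grad}\, \hat f(X) = [X, [X, C^\dagger]_S].$$

(b) Since Eq. (3.107) evolves on the unitary orbit of $A$, the associated flow is isospectral by construction. The compactness of $O(A)$ then implies that each solution $X(t)$ of Eq. (3.107) exists for all $t \ge 0$ and converges to the set of critical points, cf. Proposition 3.1. Moreover, from Theorem 3.1 we derive that $X(t)$ actually converges to a single critical point $X_\infty$ of $\hat f$, i.e. to a point $X_\infty$ which satisfies
$$[X_\infty, [X_\infty, C^\dagger]_S] = 0. \tag{3.109}$$
Since $[X_\infty, C^\dagger]_S \in \mathfrak{p}_{X_\infty}$, Eq. (3.109) is equivalent to $[X_\infty, C^\dagger]_S = 0$.

In order to obtain a numerical algorithm for maximizing $\hat f$ one can discretize the continuous-time gradient flow (3.107) as in the previous examples via
$$X_{k+1} = e^{-\alpha_k [X_k, C^\dagger]_S}\, X_k\, e^{\alpha_k [X_k, C^\dagger]_S} \tag{3.110}$$
with appropriate step sizes $\alpha_k > 0$. Note that Eq. (3.110) heavily exploits the fact that the adjoint orbit $O(A)$ constitutes a naturally reductive homogeneous space and thus the knowledge on its geodesics, cf. Corollary 3.4.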
Remark 3.8. As an alternative to Eq. (3.110), taking the standard Euler-type iteration
$$X_{k+1} = X_k + \alpha_k [X_k, [X_k, C^\dagger]_S] \tag{3.111}$$
does not retain the isospectral nature of the flow. Therefore, it should only be used as a computationally inexpensive, rough scheme in the neighborhood of equilibrium points, if at all.
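The difference between the geodesic scheme (3.110) and the Euler-type iteration (3.111) can be made visible on a small example. The following is a minimal sketch, assuming NumPy; the helper names and the particular $3\times 3$ test matrices are illustrative assumptions.

```python
import numpy as np

def skew_part(m):
    return 0.5 * (m - m.conj().T)

def expm_skew(omega):
    """exp(omega) for skew-Hermitian omega."""
    w, v = np.linalg.eigh(-1j * omega)
    return (v * np.exp(1j * w)) @ v.conj().T

def bracket(X, C):
    """Skew-Hermitian part [X, C^dagger]_S."""
    Cd = C.conj().T
    return skew_part(X @ Cd - Cd @ X)

def step_geodesic(X, C, alpha):
    E = expm_skew(-alpha * bracket(X, C))
    return E @ X @ E.conj().T                   # Eq. (3.110)

def step_euler(X, C, alpha):
    S = bracket(X, C)
    return X + alpha * (X @ S - S @ X)          # Eq. (3.111)

A = np.array([[1, 1, 0], [1, 0, 1], [0, 1, -1]], dtype=complex)
C = np.diag([3.0, 2.0, 1.0]).astype(complex)
alpha = 0.05

X_geo = A.copy()
for _ in range(20):
    X_geo = step_geodesic(X_geo, C, alpha)
X_eul = step_euler(A, C, alpha)                 # one single Euler step

spec_A = np.linalg.eigvalsh(A)
drift_geo = np.max(np.abs(np.linalg.eigvalsh(X_geo) - spec_A))
# The sum of squared eigenvalues (squared Frobenius norm) is a spectral
# invariant; the Euler step increases it by alpha^2 * ||[X,[X,C]_S]||^2.
drift_eul = np.linalg.norm(X_eul) ** 2 - np.linalg.norm(A) ** 2
```

After twenty geodesic steps the spectrum of `X_geo` still agrees with that of `A` to rounding accuracy, whereas already one Euler step changes the squared Frobenius norm, so the iterate has left the orbit $O(A)$.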
For $A, C$ complex Hermitian (real symmetric) and the full unitary (or orthogonal) group or its respective orbit, the gradient flow (3.107) is well understood, cf. Corollary 3.8. However, for non-Hermitian $A$ and $C$, the nature of the flow and in particular the critical points have not been analyzed in depth, because the Hessian at critical points is difficult to come by. Even for $A, C$ Hermitian, a full critical point analysis becomes non-trivial as soon as the flow is restricted to a closed and connected subgroup $\mathbf{K} \subset SU(N)$. Nevertheless, the techniques from Theorem 3.16 can be taken over to establish a gradient flow and a respective gradient algorithm on the orbit $O_{\mathbf{K}}(A)$ in a straightforward manner.
Corollary 3.7. The gradient flow of Eq. (3.107) restricts to the subgroup orbit $O_{\mathbf{K}}(A) := \{KAK^\dagger \mid K \in \mathbf{K} \subseteq SU(N)\}$ by taking the respective orthogonal projection $P_{\mathfrak{k}}$ onto the subalgebra $\mathfrak{k} \subseteq \mathfrak{su}(N)$ of $\mathbf{K}$ instead of projecting onto the skew-Hermitian part, i.e. $\dot X = [X, P_{\mathfrak{k}}[X, C^\dagger]]$.
With step sizes $\alpha_k > 0$ the corresponding discrete integration scheme reads
$$X_{k+1} = e^{-\alpha_k P_{\mathfrak{k}}[X_k, C^\dagger]}\, X_k\, e^{\alpha_k P_{\mathfrak{k}}[X_k, C^\dagger]}. \tag{3.112}$$
In view of unifying the interpretation of unitary networks, e.g., for the task of computing ground states of quantum mechanical Hamiltonians $H \equiv A$, the double-bracket flows for complex Hermitian $A, C$ on the full unitary orbit $O_u(A)$ as well as on the subgroup orbits $O_{\mathbf{K}}(A)$ for different partitionings brought about by $\mathbf{K} := \{K \in SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_r) \mid \prod_{j=1}^{r} N_j = 2^n\}$ have shifted into focus [36]. Therefore, we have given the foundations for the recursive schemes of Eqs. (3.110) and (3.112), which are listed in Table 2 as U1P and U1KP.
Finally, we summarize what is known about the nature of critical points for the real symmetric or complex Hermitian case. For a detailed discussion of the real symmetric case and the orthogonal group see, e.g., [22].
Corollary 3.8. Let $C$ and $A$ be real symmetric or complex Hermitian and assume for simplicity that they have distinct eigenvalues in either case. Then one finds:
(a) For $A, C$ real symmetric, define with respect to the special orthogonal group $SO(N)$ and $Y \in O_o(A) := \{OAO^\top \mid O \in SO(N)\}$ a pair of target functions on the group and on the respective orbit by
$$g(O) := \mathrm{tr}\{C^\top OAO^\top\} \tag{3.113}$$
$$\hat g(Y) := \mathrm{tr}\{C^\top Y\}. \tag{3.114}$$
Then the gradient flow
$$\dot O := \mathrm{grad}\, g(O) = [OAO^\top, C]^\top\, O \tag{3.115}$$
shows $2^{N-1}\, N!$ critical points, while the double-bracket flow
$$\dot Y := \mathrm{grad}\, \hat g(Y) = [Y, [Y, C]] \tag{3.116}$$
only shows $N!$ equilibrium points.

(b) For $A, C$ complex Hermitian, and $X \in O_u(A) := \{UAU^\dagger \mid U \in SU(N)\}$, with
$$f(U) := \mathrm{tr}\{C^\dagger UAU^\dagger\} \tag{3.117}$$
$$\hat f(X) := \mathrm{tr}\{C^\dagger X\}, \tag{3.118}$$
the gradient flow on the special unitary group $SU(N)$
$$\dot U := \mathrm{grad}\, f(U) = [UAU^\dagger, C]^\dagger\, U \tag{3.119}$$
shows a continuum of critical points, while the double-bracket flow on the unitary orbit
$$\dot X := \mathrm{grad}\, \hat f(X) = [X, [X, C]] \tag{3.120}$$
again shows only $N!$ equilibrium points.

(c) On the orbit, the respective target function has a unique global maximum, which is given by the diagonalization $\mathrm{diag}(\lambda_1, \dots, \lambda_N)$, $\lambda_1 > \cdots > \lambda_N$, of $A$, if $C$ is assumed to be diagonal of the form $C = \mathrm{diag}(\mu_1, \dots, \mu_N)$, $\mu_1 > \cdots > \mu_N$. Moreover, the respective gradient flow converges to the unique global maximum for almost all initial values with an exponential bound on the rate.

Proof. (a) and (b) The counting arguments follow immediately from the fact that in either case for $C$ diagonal with distinct eigenvalues, the set of critical points $\mathcal{C} := \{X_\infty \in O(A) \mid [X_\infty, C] = 0\}$ on the orthogonal or unitary orbit is given by the $N!$ different diagonalizations of $A$ and therefore remains invariant under conjugation by any permutation matrix.
Moreover, on the orthogonal group $O(N)$, the stabilizer group of $A$ is given by
$$\{\mathrm{diag}(\pm 1, \pm 1, \dots, \pm 1)\},$$
which adds $2^N$ independent further degrees of freedom. Finally, restricting to $SO(N)$ we obtain $2^{N-1}\, N!$ critical points on the group level.
In contrast, for the unitary case $SU(N)$, the stabilizer group of $A$ reads
$$\Big\{ \mathrm{diag}(e^{i\phi_1}, \dots, e^{i\phi_\nu}, \dots, e^{i\phi_N}) \ \Big|\ \sum_{\nu=1}^{N} \phi_\nu \in 2\pi\mathbb{Z},\ \phi_\nu \in \mathbb{R} \Big\},$$
which is always continuous.


(c) Since $C$ is symmetric or Hermitian, we can assume without loss of generality that $C$ is diagonal. Then, the critical point condition $[X_\infty, C] = 0$ yields that the critical points of $\hat g$ and, respectively, $\hat f$ are given by the diagonalizations of $A$. Moreover, analyzing the Hessian at critical points shows that there is only one global maximum in both cases and no local ones [22].

The exponential convergence of the gradient flows Eqs. (3.116) and (3.120) to the respective unique global maximum for almost all initial values is also established via the Hessian, i.e. by linearizing the respective gradient flows at critical points [22].
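For diagonal $C$ with distinct entries, the $N!$ equilibrium points on the orbit and the unique maximum among them can be enumerated directly. The following is a minimal sketch, assuming NumPy; the eigenvalues `lam` and `mu` are arbitrary illustrative numbers, and the enumeration only inspects the critical values, not the Hessian analysis of [22].

```python
import itertools
import numpy as np

lam = np.array([2.0, 0.5, -1.0])     # eigenvalues of A, sorted decreasingly
mu = np.array([3.0, 2.0, 1.0])       # diagonal entries of C, sorted decreasingly
C = np.diag(mu)

critical = []
for perm in itertools.permutations(range(len(lam))):
    X = np.diag(lam[list(perm)])                 # one diagonalization of A
    assert np.allclose(X @ C - C @ X, 0.0)       # equilibrium: [X, C] = 0
    critical.append((perm, float(np.trace(C @ X))))

best_perm, best_value = max(critical, key=lambda item: item[1])
n_maximizers = sum(1 for _, value in critical if np.isclose(value, best_value))
```

The largest value is attained only when the eigenvalues of $A$ appear in the same order as the diagonal entries of $C$, in line with part (c) of Corollary 3.8.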

3.4.5. Some final remarks on the naturally reductive case

Let $f : SU(2^n) \to \mathbb{R}$ be an arbitrary smooth function that is equivariant under local unitary operations of the $n$-fold tensor product $SU_{\mathrm{loc}}(2^n) := SU(2) \otimes \cdots \otimes SU(2)$. This includes, e.g., any measure of entanglement $E(U)$ that varies smoothly with $U$. By construction $\mathrm{grad}\, f|_{SU_{\mathrm{loc}}(2^n)} = 0$, so we may then consider the induced flow $[\dot U] = \mathrm{grad}\, \hat f([U])$ on the homogeneous space
$$G/K = SU(2^n)/SU_{\mathrm{loc}}(2^n),$$
which is naturally reductive for all $n$ and even Cartan-like for $n = 2$. This can be seen because (i) $SU(2^n)$ carries a bi-invariant metric induced by the Killing form, allowing one to define $\mathfrak{p} := \mathfrak{k}^\perp$, which gives the reductive decomposition $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$, yet only for $n = 2$ one recovers the commutator inclusions $[\mathfrak{k}, \mathfrak{k}] \subseteq \mathfrak{k}$, $[\mathfrak{p}, \mathfrak{p}] \subseteq \mathfrak{k}$, and $[\mathfrak{k}, \mathfrak{p}] \subseteq \mathfrak{p}$; (ii) in any case, by Proposition 3.4 there is an $\mathrm{Ad}_K$-invariant scalar product on $\mathfrak{p}$; and (iii) Eq. (3.47) is fulfilled for all $a, b, c \in \mathfrak{p}$, as $\mathrm{tr}\{[a, b]^\dagger c\} = -\mathrm{tr}\{b^\dagger [a, c]\}$, cf. Remark 3.6. Therefore, one finally arrives at a discretized gradient algorithm of the form
$$[U_{k+1}] := [\exp(\alpha_k\, \mathrm{grad}\, f(U_k)\, U_k^{-1})\, U_k], \tag{3.121}$$
cf. Eq. (3.64). Clearly, this example extends analogously to functions that are equivariant under the action of generalized local subgroups $SU(N_1) \otimes \cdots \otimes SU(N_r)$ with $\prod_{j=1}^{r} N_j = N$, cf. (4.8), giving flows on the corresponding reductive homogeneous spaces
$$G/K = SU(N)/(SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_r)).$$
Comparing Eq. (3.121) with the results of the previous subsection on double-bracket flows shows the following: having a model of the coset space $G/K$, i.e. having a smooth group action of $G$ (e.g. on some vector space) such that one of its orbits is diffeomorphic to $G/K$, facilitates the implementation of Eq. (3.121) compared with implementing it on the abstract coset level.
4. Applications to Quantum Information and Quantum Control


4.1. A geometric measure of pure-state entanglement
The Euclidean distance of a pure state to the set $S_{pp}$ of all pure product states may be seen as a geometric measure of entanglement [55, 121, 122]. Since $S_{pp}$ coincides with the local unitary orbit
$$O_{\mathrm{loc}}(yy^\dagger) := \{U yy^\dagger U^\dagger \mid U \in SU_{\mathrm{loc}}(2^n)\} \tag{4.1}$$
of any pure product state $y \in S_{pp}$, it relates to the following optimization task
$$\Delta(x) := \min_{U \in SU_{\mathrm{loc}}(2^n)} \|xx^\dagger - U yy^\dagger U^\dagger\|_2, \tag{4.2}$$
where $x \in \mathbb{C}^{2^n}$ denotes a normalized pure state and $y \in \mathbb{C}^{2^n}$ a pure product state, e.g., $y = (1, 0, \dots, 0)^\top = e_1 \otimes \cdots \otimes e_1$. This notation replaces $|x\rangle$ by $x$ and $|x\rangle\langle x|$ by $xx^\dagger$ for the sake of convenient generalization to higher-order tensor products. Obviously, minimizing (4.2) is equivalent to maximizing the so-called local transfer
$$\max_{U \in SU_{\mathrm{loc}}(2^n)} \mathrm{Re}(\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger)) \tag{4.3}$$
between $xx^\dagger$ and $yy^\dagger$. Further, since
$$\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger) = |\mathrm{tr}(x^\dagger U y)|^2,$$
taking the real part in (4.3) is redundant.
Now, the techniques developed in Sec. 3.4.3 are well suited to tackle problem (4.3). Let $C := xx^\dagger$, $A := \mathrm{diag}(1, 0, \dots, 0)$ and define the so-called local unitary transfer between $C$ and $A$ by the real-valued function
$$f_{\mathrm{loc}}(U) := \mathrm{tr}(C\,UAU^\dagger). \tag{4.4}$$
Then the gradient flow (3.91) or more precisely its discretization (3.96) will generically solve (4.3). For explicit numerical results see Sec. 4.2.3 and [117, 123].
In general, neither an algebraic characterization of the maximal value of $f_{\mathrm{loc}}$ nor the structure of its critical points is known, the major difficulty arising from the fact that $U$ is restricted to $SU_{\mathrm{loc}}(2^n)$. As soon as $U$ may be taken from the entire special unitary group, the solution is well-known: it is simply obtained by arranging the (real) eigenvalues of both $A$ and $C$ magnitude-wise in the same order [17, 22, 124, 125].

4.2. Generalized local subgroups


4.2.1. Bipartite systems and relations to singular-value decompositions
An exceptional case, where the restricted problem (4.3) can be solved, is that of bipartite pure systems. These systems are particularly simple inasmuch as the maxima of $f_{\mathrm{loc}}$ can be linked to the singular-value decomposition (SVD) of the matrices $X$ and $Y$ associated to $x$ and $y$ by $x := \mathrm{vec}\, X$ and $y := \mathrm{vec}\, Y$. Since these ideas
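readily extend to arbitrary finite-dimensional bipartite systems, we generalize the formulation of problem (4.3), thus leading to Eq. (4.5), before going into multipartite systems.

Proposition 4.1. Let $X = V_X \Sigma_X W_X^\dagger$, $Y = V_Y \Sigma_Y W_Y^\dagger$ be singular value decompositions with $V_X, V_Y \in U(N_1)$, $W_X, W_Y \in U(N_2)$ and $\Sigma_X$, $\Sigma_Y$ sorted by magnitude. Moreover, let $x := \mathrm{vec}\, X$ and $y := \mathrm{vec}\, Y$. Then the maximum value of the local transfer between $xx^\dagger$ and $yy^\dagger$ is bounded by
$$\max_{U \in SU(N_2) \otimes SU(N_1)} \mathrm{Re}(\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger)) \le (\mathrm{tr}\, \Sigma_X^\top \Sigma_Y)^2. \tag{4.5}$$
Equality is actually achieved for $V_X, V_Y \in SU(N_1)$ and $W_X, W_Y \in SU(N_2)$ with
$$U := (\overline{W}_X \otimes V_X)\,(\overline{W}_Y \otimes V_Y)^\dagger.$$

Proof. For $U := W \otimes V \in SU(N_2) \otimes SU(N_1)$ we obtain
$$\begin{aligned}
\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger) &= \mathrm{tr}(xx^\dagger (W \otimes V)\, yy^\dagger\, (W \otimes V)^\dagger) \\
&= \mathrm{tr}(xx^\dagger\, \mathrm{vec}(VYW^\top)\, \mathrm{vec}(VYW^\top)^\dagger) \\
&= |x^\dagger\, \mathrm{vec}(VYW^\top)|^2 = |\mathrm{tr}(X^\dagger VYW^\top)|^2.
\end{aligned} \tag{4.6}$$
Here, we have used the identities $\mathrm{vec}(VYW) = (W^\top \otimes V)\, \mathrm{vec}\, Y$ and $(\mathrm{vec}\, X)^\dagger\, \mathrm{vec}\, Y = \mathrm{tr}\, X^\dagger Y$ for all $X, Y \in \mathbb{C}^{N_1 \times N_2}$. Now, (4.6) implies
$$\max_{U \in SU(N_2) \otimes SU(N_1)} \mathrm{Re}\,\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger) = \max_{\substack{V \in SU(N_1) \\ W \in SU(N_2)}} |\mathrm{tr}(X^\dagger VYW^\top)|^2 \le (\mathrm{tr}\, \Sigma_X^\top \Sigma_Y)^2, \tag{4.7}$$
where the last inequality is due to von Neumann, cf. [111, 124]. If $V_X, V_Y \in SU(N_1)$ and $W_X, W_Y \in SU(N_2)$, equality is attained in Eq. (4.7) for
$$U := (W_Y W_X^\dagger)^\top \otimes V_X V_Y^\dagger = (\overline{W}_X \otimes V_X)\,(\overline{W}_Y \otimes V_Y)^\dagger.$$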
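The bound (4.5) and the maximizing local unitary can be checked numerically. Below is a minimal sketch, assuming NumPy, the column-stacking convention for vec, and random test matrices; the helper names are hypothetical. The SVD factors returned by `numpy.linalg.svd` are merely unitary, so the resulting $U$ lies in $U(N_2) \otimes U(N_1)$; rescaling both factors to determinant one only changes $U$ by a global phase, which does not affect the transfer.

```python
import numpy as np

rng = np.random.default_rng(3)
N1, N2 = 2, 3

def vec(M):
    """Column-stacking vec, so that vec(V Y W^T) = (W kron V) vec(Y)."""
    return M.reshape(-1, order="F")

def transfer(x, y, W, V):
    """tr(x x^dagger U y y^dagger U^dagger) = |x^dagger U y|^2 for U = W kron V."""
    return abs(np.vdot(x, np.kron(W, V) @ y)) ** 2

def random_unitary(n):
    q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q

X = rng.normal(size=(N1, N2)) + 1j * rng.normal(size=(N1, N2))
Y = rng.normal(size=(N1, N2)) + 1j * rng.normal(size=(N1, N2))
x, y = vec(X), vec(Y)

VX, sX, WXh = np.linalg.svd(X)          # X = VX @ Sigma_X @ WXh, WXh = W_X^dagger
VY, sY, WYh = np.linalg.svd(Y)
bound = float(np.sum(sX * sY)) ** 2     # (tr Sigma_X^T Sigma_Y)^2

V_opt = VX @ VY.conj().T                # V_X V_Y^dagger
W_opt = WXh.T @ WYh.conj()              # conj(W_X) W_Y^T = (W_Y W_X^dagger)^T
value_opt = transfer(x, y, W_opt, V_opt)

# Random local unitaries never exceed the bound (von Neumann's inequality).
samples = [transfer(x, y, random_unitary(N2), random_unitary(N1)) for _ in range(50)]
```

Here `value_opt` coincides with `bound` up to rounding, while all entries of `samples` stay below it.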
Corollary 4.1. Set $x := \mathrm{vec}\, A$ and $y := \mathrm{vec}\, C$. Then the maximum local transfer between $xx^\dagger$ and $yy^\dagger$ in the sense of Proposition 4.1 is bounded by
$$\|A\|_C^2 := \max_{\substack{V \in U(N_1) \\ W \in U(N_2)}} |\mathrm{tr}(C^\dagger V A W)|^2,$$
where $\|A\|_C$ is known as the $C$-spectral norm of $A$, cf. [112].

Note that in the context of finding maximal distances between global unitary orbits for the purpose of geometric discrimination of generic non-pure quantum states [126], results similar to [125, 127] show up, while here we treat local unitary orbits of pure bipartite states as made explicit in Eq. (4.5).
4.2.2. Multipartite systems and relations to best rank-1 approximations of higher-order tensors

Proposition 4.1 has a straightforward generalization to multipartite systems, which relates to best rank-1 approximations of higher-order tensors. To outline this relation, we define the concept of a generalized local subgroup
$$SU_{\mathrm{loc}}(N_1, \dots, N_r) := SU(N_1) \otimes \cdots \otimes SU(N_r) \tag{4.8}$$
of type $(N_1, \dots, N_r)$ with $N_k \in \mathbb{N}$, $k = 1, \dots, r$. Thus the associated generalized local subgroup optimization problem can be stated as follows.
Generalized Local Subgroup Problem (GLSP). For $C, A \in \mathbb{C}^{N\times N}$ with $N := N_1 N_2 \cdots N_r$ find
$$\max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} \mathrm{Re}(\mathrm{tr}(C^\dagger UAU^\dagger)). \tag{4.9}$$
To our knowledge, the GLSP is unsolved so far. To introduce higher-order tensors, we have to fix some further notation. For simplicity, we regard a tensor of order $r \in \mathbb{N}$ as an array
$$X = (X_{i_1 \dots i_r})_{1 \le i_1 \le N_1, \dots, 1 \le i_r \le N_r}$$
of size $N_1 \times \cdots \times N_r$. The space of all $N_1 \times \cdots \times N_r$-tensors is denoted by $\mathbb{C}^{N_1 \times \cdots \times N_r}$. A natural scalar product for tensors of the same size is given by
$$\langle Y \mid X\rangle := \sum_{i_1, \dots, i_r} Y_{i_1 \dots i_r}\, \overline{X_{i_1 \dots i_r}}. \tag{4.10}$$
Moreover, a tensor $X$ is called a rank-1 tensor if there exist $x^k \in \mathbb{C}^{N_k}$, $k = 1, \dots, r$, such that
$$X = x^1 \circ x^2 \circ \cdots \circ x^r, \tag{4.11}$$
where the $(i_1 \dots i_r)$-entry of the outer product $\circ$ is defined by
$$(x^1 \circ x^2 \circ \cdots \circ x^r)_{i_1 \dots i_r} := x^1_{i_1} x^2_{i_2} \cdots x^r_{i_r}.$$
Thus the question of decomposing a given tensor by tensors of lower rank leads to the following fundamental approximation problem:
Best Rank-1 Approximation Problem (BRAP). Let $\|\cdot\|$ denote the norm induced by the scalar product (4.10). For $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ solve
$$\min_{\substack{C \in \mathbb{C},\ \|x^k\| = 1 \\ k = 1, \dots, r}} \|X - C\, x^1 \circ \cdots \circ x^r\|^2. \tag{4.12}$$
Note that the above notation $\circ$ is necessary to distinguish between two different types of outer products: the Kronecker product $\otimes$ (of column-vectors), which maps $r$-tuples of column-vectors to a column-vector of larger size, and the abstract outer product $\circ$, which maps $r$-tuples of column-vectors to arrays (= tensors) of
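order $r$. The relation between both is given by the canonical isomorphism $\mathrm{vec} : \mathbb{C}^{N_1 \times \cdots \times N_r} \to \mathbb{C}^N$ with $N := N_1 N_2 \cdots N_r$, which is uniquely determined by
$$x^1 \circ x^2 \circ \cdots \circ x^r \mapsto x^1 \otimes x^2 \otimes \cdots \otimes x^r, \tag{4.13}$$
i.e. vec assigns to each array $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ a column-vector in $\mathbb{C}^N$ by arranging the entries of $X$ in lexicographical order. With these notations at hand, the relation between GLSP and BRAP can be stated as follows.

Theorem 4.1. Let $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ be a tensor of order $r$ and let $x := \mathrm{vec}(X) \in \mathbb{C}^N$ with $N := N_1 N_2 \cdots N_r$. Then the BRAP is equivalent to the GLSP
$$\max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} \mathrm{Re}(\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger)), \tag{4.14}$$
where $y \in \mathbb{C}^N$ can be any pure product state, e.g., $y = (1, 0, \dots, 0)^\top = e_1 \otimes \cdots \otimes e_1$. More precisely,
(a) If $U_1 \otimes \cdots \otimes U_r$ is a solution of (4.14) then $x^k := U_k e_1$, $k = 1, \dots, r$, and $C := \langle X \mid x^1 \circ \cdots \circ x^r\rangle$ solve (4.12).
(b) If $C \in \mathbb{C}$ and $x^k$, $k = 1, \dots, r$, solve (4.12) then any $U_1 \otimes \cdots \otimes U_r$ with $x^k = U_k e_1$, $k = 1, \dots, r$, yields a solution of (4.14).
For proving Theorem 4.1 we need the following technical lemma.

Lemma 4.1. The pair $(x^1 \circ \cdots \circ x^r, C)$ solves (4.12) if and only if $x^1 \circ \cdots \circ x^r$ is a maximum of
$$\max_{\|z^k\| = 1,\ k = 1, \dots, r} |\langle X \mid z^1 \circ \cdots \circ z^r\rangle| \tag{4.15}$$
and $C = \langle X \mid x^1 \circ \cdots \circ x^r\rangle$.

Proof. Consider the following identity, which uses $\|z^1 \circ \cdots \circ z^r\| = 1$ and a scalar product that is conjugate-linear in its second argument:
$$\begin{aligned}
\|X - C\, z^1 \circ \cdots \circ z^r\|^2 &= \|X\|^2 + |C|^2 - 2\,\mathrm{Re}(\overline{C}\, \langle X \mid z^1 \circ \cdots \circ z^r\rangle) \\
&= \|X\|^2 + |C - \langle X \mid z^1 \circ \cdots \circ z^r\rangle|^2 - |\langle X \mid z^1 \circ \cdots \circ z^r\rangle|^2.
\end{aligned}$$
Thus we obtain
$$\min_{\substack{C \in \mathbb{C},\ \|z^k\| = 1 \\ k = 1, \dots, r}} \|X - C\, z^1 \circ \cdots \circ z^r\|^2 = \|X\|^2 - \max_{\substack{\|z^k\| = 1 \\ k = 1, \dots, r}} |\langle X \mid z^1 \circ \cdots \circ z^r\rangle|^2.$$
This yields the desired result.

Proof of Theorem 4.1. Let $y = e_1 \otimes \cdots \otimes e_1$. Then
$$(U_1 \otimes \cdots \otimes U_r)\, y = (U_1 e_1) \otimes \cdots \otimes (U_r e_1)$$
and thus
$$\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger) = \mathrm{tr}(x^\dagger U yy^\dagger U^\dagger x) = |x^\dagger U y|^2 = |\langle X \mid (U_1 e_1) \circ \cdots \circ (U_r e_1)\rangle|^2.$$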
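Therefore, we obtain
$$\begin{aligned}
\max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} \mathrm{Re}(\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger)) &= \max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} |\langle X \mid (U_1 e_1) \circ \cdots \circ (U_r e_1)\rangle|^2 \\
&= \max_{\substack{\|z^k\| = 1 \\ k = 1, \dots, r}} |\langle X \mid z^1 \circ \cdots \circ z^r\rangle|^2,
\end{aligned}$$
and hence Lemma 4.1 implies (a) and (b).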
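The identity underlying Lemma 4.1 can be verified on random data. The following is a minimal sketch, assuming NumPy and taking the optimal coefficient as $C = \sum X_{i_1 \dots i_r}\, \overline{Z_{i_1 \dots i_r}}$ for a rank-1 tensor $Z$ of norm one; the tensor size and the helper `unit` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
shape = (2, 3, 2)
X = rng.normal(size=shape) + 1j * rng.normal(size=shape)

def unit(n):
    """Random complex unit vector of length n."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

z = [unit(n) for n in shape]
Z = np.einsum("i,j,k->ijk", *z)                  # rank-1 tensor z1 o z2 o z3, norm one

inner = np.sum(X * Z.conj())                     # coefficient <X | Z>
residual_opt = np.linalg.norm(X - inner * Z) ** 2
predicted = np.linalg.norm(X) ** 2 - abs(inner) ** 2

# Any other coefficient increases the residual by |C - <X|Z>|^2.
delta = 0.3 - 0.2j
residual_other = np.linalg.norm(X - (inner + delta) * Z) ** 2
```

For fixed unit vectors the best coefficient is thus fixed by the overlap, and the remaining minimization over the unit vectors is exactly the maximization (4.15).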

Remark 4.1. (1) The isomorphism vec of Eq. (4.13) coincides almost with the standard vec-operation on matrices for $r = 2$; more precisely, it maps a matrix $X$ to the standard $\mathrm{vec}(X^\top)$.
(2) Since any phase factor can readily be absorbed into $x^1 \circ \cdots \circ x^r$, it is easy to show that
$$\max_{\|x^k\| = 1,\ k = 1, \dots, r} |\langle X \mid x^1 \circ \cdots \circ x^r\rangle| = \max_{\|x^k\| = 1,\ k = 1, \dots, r} \mathrm{Re}(\langle X \mid x^1 \circ \cdots \circ x^r\rangle).$$
Therefore, maxima of the real-part expression on the right-hand side are always maxima of the absolute-value term on the left.
(3) By replacing $yy^\dagger$ in (4.14) with an appropriate sum $\sum_{i=1}^{l} y_i y_i^\dagger$, the above ideas can be extended to best approximations of higher rank, i.e. to best approximations of the form
$$\min_{C_i \in \mathbb{C},\ \|x^{i,k}\| = 1} \Big\| X - \sum_{i=1}^{l} C_i\, x^{i,1} \circ \cdots \circ x^{i,r} \Big\|^2,$$
with $l \le \min\{N_1, \dots, N_r\}$ and all $x^{i,1} \circ \cdots \circ x^{i,r}$ mutually orthogonal, cf. [128, 129].
(4) Unfortunately, an analogue of Proposition 4.1 involving the tensor SVD as defined in [130] does not hold for higher-order tensors. Even the classical Eckart-Young Theorem, which asserts that the best rank-$k$ approximation of a matrix is given by its truncated SVD, is false for higher-order tensors, cf. [131].
(5) Higher-order methods, like Newton, BFGS or conjugate gradient methods, for computing best approximations of higher-order tensors can be found in [132-135]. Near local maxima these methods are in general faster than gradient algorithms: although a single iteration is more time-consuming than a gradient step, the number of iterations needed to guarantee a certain error threshold is considerably lower due to the local higher-order convergence rate. However, their global convergence behavior is a rather delicate issue. In practice, therefore, one often applies a combined strategy: (i) first, run a gradient algorithm to reach the region of attraction of a higher-order method; (ii) then switch to a higher-order method.

4.2.3. Numerical results


For comparing our gradient-flow approach to tensor-SVD techniques, here we focus
on two examples that are well-established in the literature, since analytical solutions
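[136] as well as numerical results from semidefinite programming are known [55].
First, consider a pure 3-qubit state depending on a real parameter $s \in [0, 1]$

$$|X(s)\rangle := \sqrt{s}\, |W\rangle + \sqrt{1-s}\, |V\rangle , \qquad (4.16)$$
where one defines
$$|W\rangle := \tfrac{1}{\sqrt{3}} (|001\rangle + |010\rangle + |100\rangle) \quad \text{and} \quad |V\rangle := \tfrac{1}{\sqrt{3}} (|110\rangle + |101\rangle + |011\rangle)$$
with the usual shorthand notation of quantum information $|0\rangle := \binom{1}{0}$, $|1\rangle := \binom{0}{1}$
and $|001\rangle := \binom{1}{0} \otimes \binom{1}{0} \otimes \binom{0}{1}$, etc. With these stipulations one finds the corresponding
$2 \times 2 \times 2$ tensor representations for $|W\rangle$ and $|V\rangle$ to take the form
$$W(1,:,:) = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad W(2,:,:) = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \qquad (4.17)$$
and
$$V(1,:,:) = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad V(2,:,:) = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad (4.18)$$
Likewise, observe the pure 4-qubit state

$$|X(s)\rangle := \sqrt{s}\, |GHZ'\rangle - \sqrt{1-s}\, |X^+\rangle \otimes |X^+\rangle , \qquad (4.19)$$
with the definitions
$$|GHZ'\rangle := \tfrac{1}{\sqrt{2}} (|0011\rangle + |1100\rangle) \quad \text{and} \quad |X^+\rangle := \tfrac{1}{\sqrt{2}} (|10\rangle + |01\rangle).$$
Consider the target function $f(K) = \mathrm{tr}\{C^\dagger K A K^\dagger\}$ with $C = \mathrm{diag}(1, 0, 0, \dots, 0)$
and $A := |X(s)\rangle \langle X(s)|$. As shown in Fig. 4, with the gradient flow restricted
to the local unitaries $K \in SU_{loc}(2^n)$ one obtains results perfectly matching the
analytical solutions of [136] as well as the numerical ones from semidefinite
programming ensuring global optimality, yet in drastically less CPU time as
compared to [55], see Table 1. Gradient flows are some 30 to 150 times faster in CPU
time than semidefinite programming methods for the 3-qubit and 4-qubit example,
respectively.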
In the tensor-SVD algorithms [131] such as the higher-order power method
(HOPM) or the higher-order orthogonal iteration (HOOI) as implemented in the
MATLAB package [137], N = 50 to N = 60 iterations are required for quantitative
agreement with the algebraically established results. In the 3-qubit example, all
minimal distances are also reproduced correctly with N = 5 iterations except
for the limiting values s near 0 and near 1: there, either tensor method yields a
minimal distance of 2/3 for both $|X(0)\rangle$ and $|X(1)\rangle$ instead of
the correct analytical value of 5/9, which requires N = 60 iterations as shown
in Fig. 4(c). In the 4-qubit example, however, for N = 5 iterations, both tensor
methods suffer from apparently random numerical instabilities, which only vanish
when allowing for N = 50 iterations in either method. It is the considerably high
[Fig. 4 plots: panels (a) and (b) show 1 − max. local transfer versus s for s between 0 and 1; panel (c) shows the same quantity for N = 5 and N = 60 iterations for s between 0.998 and 1.]

Fig. 4. Numerical results by gradient flows on the local unitary group $\mathbf{K} = SU_{loc}(2^n)$
determining (a) the Euclidean distance of the 3-qubit state $|X(s)\rangle = \sqrt{s}\,|W\rangle + \sqrt{1-s}\,|V\rangle$ (see
Eq. (4.16)) to the nearest product state as a function of s; (b) the distance of the 4-qubit state
$|X(s)\rangle = \sqrt{s}\,|GHZ'\rangle - \sqrt{1-s}\,|X^+\rangle \otimes |X^+\rangle$ (see Eq. (4.19)) to the nearest product state. (c) Tensor-SVD
results for the Euclidean distance of the 3-qubit state $|X(s)\rangle$ to the nearest product state as
in part (a). With the standard of N = 5 iterations, both methods (here shown for HOPM) give
systematic errors as indicated by the arrow. N = 60 iterations are needed for quantitatively
matching the well-established distance values. The high number of iterations required slows down
the method as indicated in Table 1. (Color online)

number of iterations that makes the tensor methods substantially slower than our
gradient-flow algorithm as shown in Table 1.
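
To illustrate why the iteration count matters, here is a minimal sketch of a higher-order power method for the best rank-1 approximation, written with NumPy. It is not the MATLAB implementation of [137]; the function names, the random complex initialization and the seed are assumptions made for illustration. Each sweep replaces one factor by the normalized contraction of the tensor with the remaining factors, so the overlap with the product state cannot decrease from sweep to sweep.

```python
import numpy as np

def contract_all_but(X, vecs, k):
    """Contract X with the conjugates of all factors except the k-th (k=None: all)."""
    T = X
    # Going from the last axis to the first keeps the remaining axis positions valid.
    for j in reversed(range(X.ndim)):
        if j != k:
            T = np.tensordot(T, vecs[j].conj(), axes=([j], [0]))
    return T

def hopm_rank1(X, n_iter, seed=0):
    """Alternating power iteration; returns the factors and the overlap modulus."""
    rng = np.random.default_rng(seed)
    vecs = []
    for n in X.shape:
        v = rng.normal(size=n) + 1j * rng.normal(size=n)
        vecs.append(v / np.linalg.norm(v))
    for _ in range(n_iter):
        for k in range(X.ndim):
            t = contract_all_but(X, vecs, k)
            vecs[k] = t / np.linalg.norm(t)
    overlap = abs(contract_all_but(X, vecs, None))
    return vecs, overlap

# Tensor of the 3-qubit W state, cf. Eq. (4.17), i.e. |X(s)> at s = 1.
W = np.zeros((2, 2, 2), dtype=complex)
W[0, 0, 1] = W[0, 1, 0] = W[1, 0, 0] = 1 / np.sqrt(3)

for n_iter in (5, 60):
    _, overlap = hopm_rank1(W, n_iter)
    print(n_iter, 1 - overlap ** 2)  # squared distance to the product state found
```

For a normalized state the printed quantity is 1 minus the squared overlap, which is bounded below by the analytical value 5/9 quoted above; how closely a given run approaches it depends on the starting point and on the number of sweeps.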
Therefore, at least for lower-order tensors, gradient flows provide an appealing
alternative to standard tensor-SVD methods for best rank-1 approximations. Moreover,
one should take into account that the above gradient methods are developed
to solve the GLSOP and thus a considerable speed-up can be expected by adjusting
them to the local orbit $O_{loc}(y y^\dagger)$ of a pure product state. For similar results obtained
by an intrinsic Newton and conjugate gradient method see also [118, 123].
Generalizations of such higher-order methods to Grassmann manifolds, which perfectly
Table 1. CPU times for determining Euclidean distance to orbit of separable pure
states in Fig. 4.

Qubits   Semidefinite programming   By gradient flow     Speed-up
         CPU-time [sec]^a           CPU-time [sec]^b
3        10.92                      0.30                 36.4
4        103.97                     0.71                 147.0

Qubits   Higher-order tensor-SVD (HOPM)   Higher-order tensor-SVD (HOOI)   Speed-ups
         CPU-time [sec]^b                 CPU-time [sec]^b
3        2.39                             5.37                             4.6 (2.0)
4        3.93                             7.03                             26.5 (14.8)

a Eisert et al. (processor with 2.2 GHz, 1 GB RAM) [55].
b Average of 50 runs, Athlon XP1800+ (1.1 GHz, 512 MB RAM).

fit in the previous theory of Riemannian homogeneous spaces [110], are provided
in [132–135]. As also discussed therein, the applications to tensor approximation in
signal processing and data compression or subspace reconstruction in image
processing are numerous. Moreover, we anticipate that these numerical approaches will
also prove useful tools in tensor and rank aspects of entanglement and kinematics
of qubit pairs as addressed, e.g., in [138, 139].

4.3. Locally reversible interaction Hamiltonians


4.3.1. Joint local reversibility
In a recent study [29], we have addressed the decision problem whether a
time-independent (self-adjoint) Hamiltonian $H$ normalized to $\|H\|_2 = 1$ generates a
one-parameter unitary group $U(t) = \{e^{-itH} \mid t \in \mathbb{R}\}$ that is jointly invertible for all
$t$ by local unitary operations $K \in SU_{loc}(2^n) = SU(2)^{\otimes n}$ in the sense
$$K H K^\dagger = -H. \qquad (4.20)$$
Apart from complete algebraic classification, in [29] we used that the question
obviously finds an affirmative answer, if there is an element $K \in SU_{loc}(2^n)$ such
that
$$\|K H K^\dagger + H\|_2 = 0, \qquad (4.21)$$
which amounts to minimizing the transfer function
$$f(K) = \mathrm{Re}\, \mathrm{tr}\{H K H K^\dagger\}. \qquad (4.22)$$
With $P$ denoting the projector onto $\mathfrak{k}$, i.e. the Lie algebra of $\mathbf{K} = SU_{loc}(2^n)$, we
therefore used the gradient flow
$$\dot{K} = -\mathrm{grad}\, f(K) = P([K H K^\dagger, H])\, K \qquad (4.23)$$
as another application of Theorem 3.15. If (due to normalization)
$\mathrm{Re}\, \mathrm{tr}\{H K H K^\dagger\} = -1$ can be reached, the interaction Hamiltonian is locally
reversible.
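
A discretized version of the flow (4.23) can be sketched as follows for two qubits. This is an illustrative NumPy sketch under stated assumptions, not the authors' code: the fixed step size, the number of steps, the random local starting point and all function names are hypothetical choices. The projector is realized by an orthonormal basis of the local algebra with respect to the Frobenius inner product.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def local_basis():
    """Orthonormal basis of the local algebra su(2) + su(2) inside u(4)."""
    basis = []
    for s in (SX, SY, SZ):
        basis.append(1j * np.kron(s, I2) / 2)
        basis.append(1j * np.kron(I2, s) / 2)
    return basis

def project_local(omega, basis):
    """Orthogonal projection of a skew-Hermitian matrix onto the local algebra."""
    return sum(np.vdot(b, omega).real * b for b in basis)

def exp_skew(omega):
    """Matrix exponential of a skew-Hermitian matrix via the Hermitian matrix i*omega."""
    w, v = np.linalg.eigh(1j * omega)
    return (v * np.exp(-1j * w)) @ v.conj().T

def f(K, H):
    """Transfer function Re tr{H K H K^dagger}, cf. Eq. (4.22)."""
    return np.trace(H @ K @ H @ K.conj().T).real

def reversal_flow(H, K0, step=0.1, n_steps=200):
    """Fixed-step discretization of K' = P([K H K^dagger, H]) K, cf. Eq. (4.23)."""
    basis = local_basis()
    K = K0
    for _ in range(n_steps):
        A = K @ H @ K.conj().T
        omega = project_local(A @ H - H @ A, basis)
        K = exp_skew(step * omega) @ K
    return K

# Ising-ZZ coupling of two qubits, normalized to Frobenius norm 1.
H = np.kron(SZ, SZ) / 2
rng = np.random.default_rng(1)
K0 = exp_skew(sum(c * b for c, b in zip(rng.normal(size=6), local_basis())))
K = reversal_flow(H, K0)
print(f(K0, H), f(K, H))
```

A final value close to −1 would indicate local reversibility in the sense of (4.20); for this two-qubit ZZ example the local unitary $\sigma_x \otimes 1$ attains −1 exactly, which provides a simple reference value.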
Remark 4.2. There is an interesting relation to local C-numerical ranges as
described in detail in [113, 114]: if the local C-numerical range

$$W_{loc}(H, H) := \{\mathrm{tr}(H K H K^{-1}) \mid K \in \mathbf{K}\} = [-1; +1]$$

then the interaction Hamiltonian H is locally reversible. The references also
establish the interconnection to local C-numerical ranges of circular symmetry and
multi-quantum interaction components transforming like irreducible spherical spin
tensors.

In Fig. 5, we give some examples: e.g., the Ising-ZZ interaction in a cyclic


four-qubit coupling topology is locally reversible, while in the cyclic three-qubit
topology it is not, and also for two qubits coupled by an isotropic Heisenberg-XXX
interaction it is not. Thus numerical tests provide convenient answers particularly in
problems where an algebraic assessment becomes more tedious than in the examples
presented here, which are fully understood on algebraic grounds [29].

4.3.2. Pointwise local reversibility


In [29] we also generalized the above problem to the question, whether for a fixed
$\tau \in \mathbb{R}$ there is a pair $K_1, K_2 \in \mathbf{K} = SU_{loc}(2^n)$ so that

$$K_1 e^{-i\tau H} K_2 = e^{+i\tau H} \qquad (4.24)$$

which upon setting $A := e^{-i\tau H}$ and $C := e^{+i\tau H}$ is equivalent to

$$\|K_1 A K_2 - C\|_2 = 0. \qquad (4.25)$$

[Fig. 5 plot: normalised tr{K H K^{-1} H} versus iteration (0 to 150), curves (a), (b) and (c).]

Fig. 5. Gradient-flow driven local reversion of different Heisenberg interaction Hamiltonians:
(a) the Ising-ZZ interaction on a cyclic four-qubit topology $C_4$ can in fact be locally reversed,
whereas (b) neither the ZZ interaction on a cyclic three-qubit topology $C_3$ can be reversed locally,
(c) nor the Heisenberg-XXX interaction between two qubits. (Color online)
[Fig. 6 plot: normalised Re tr{K e^{-itH} K^\dagger (e^{-itH})} versus iteration (0 to 50), curves (a) and (b).]

Fig. 6. Gradient-flow driven local inversion of the exponential of the Hamiltonian
$H = \tfrac{1}{2}(\sigma_z \otimes 1 + 1 \otimes \sigma_z + \sigma_z \otimes \sigma_z)$ and $U(\tau) := e^{-i\tau H}$ (a) by a gradient flow with independent $K_1$ and $K_2$, (b)
by a gradient flow with $K_1 = K_2^\dagger =: K$. (Color online)

Thus one may choose a gradient flow to minimize (4.25), i.e. to maximize

$$f(K_1, K_2) := \frac{1}{2^n}\, \mathrm{Re}\, \mathrm{tr}\{C^\dagger K_1 A K_2\} \qquad (4.26)$$
by the coupled system
$$\dot{K}_1 = \mathrm{grad}\, f(K_1) = -P(K_1 A K_2 C^\dagger)\, K_1 , \qquad \dot{K}_2 = \mathrm{grad}\, f(K_2) = -P(K_2 C^\dagger K_1 A)\, K_2 . \qquad (4.27)$$
So if $f(K_1, K_2) = 1$ can be reached, then $U(t) = e^{-itH}$ is locally reversible at time
$t = \tau$. See Fig. 6 for examples comparing pointwise and universal local reversibility.

4.4. Intrinsic versus penalty approach: An example


So far, we have demonstrated that in quantum information and control, constrained
optimization tasks arise that lend themselves to Riemannian, i.e. intrinsic
optimization methods. This is because the differential geometry of their constraint sets is
well understood; in particular, many of their Riemannian quantities, like the
exponential map, are given explicitly by well-known formulas. In other cases, however,
the use of sophisticated tools from differential geometry may be too time-consuming.
Therefore, it is sometimes advisable to combine intrinsic techniques with extrinsic
methods, like a penalty term or an augmented Lagrange multiplier approach. Here,
we only sketch how to incorporate a basic penalty term.
For instance, one may face the problem to maximize a quality function f on
the reachable set of a quantum system under additional state space constraints. An
example amounts to finding the maximal unitary transfer from matrix (state) A
to C subject to leaving another state E invariant (provided A and E do not share
the same stabilizer group). Another variant amounts to optimizing the contrast
between the transfer from A to C and the transfer from A to D; so the task is to
maximize the transfer from A to C subject to suppressing the transfer from A to D.
For tackling those types of problems, we address two basically different
approaches: a purely intrinsic one and a combined method joining intrinsic and
penalty-type techniques. Both methods will be briefly illustrated for the problem
of maximizing the transfer from A to C while leaving E invariant, i.e.
$$\max_{U \in U(N)} |\mathrm{tr}\{U A U^\dagger C^\dagger\}| \quad \text{subject to} \quad U E U^\dagger = E. \qquad (4.28)$$

It is straightforward to see that the stabilizer group

$$\mathbf{K}_E := \{K \in U(N) \mid K E K^\dagger = E\} \qquad (4.29)$$
of E forms a compact connected Lie subgroup of U(N). Differentiating the identity
$e^{tk} E e^{-tk} = E$ for t = 0 yields its Lie algebra
$$\mathfrak{k}_E := \{k \in \mathfrak{u}(N) \mid \mathrm{ad}_k(E) \equiv [k, E] = 0\}. \qquad (4.30)$$
By the Jacobi identity
$$[[k_1, k_2], E] + [[k_2, E], k_1] + [[E, k_1], k_2] = 0$$
one can easily verify that $\mathfrak{k}_E$ is indeed a Lie subalgebra of $\mathfrak{u}(N)$. Moreover, from
the compactness of $\mathbf{K}_E$ we conclude that the exponential map $\exp : \mathfrak{k}_E \to \mathbf{K}_E$ is
not only locally, but globally onto. Note, however, that this fact is not exploited in what
follows. A set of generators of $\mathfrak{k}_E$ may constructively be found by solving a system
of homogeneous linear equations, i.e.
$$\mathfrak{k}_E = \ker \mathrm{ad}_E \cap \mathfrak{u}(N) = \{k \in \mathfrak{u}(N) \mid (1 \otimes E - E^\top \otimes 1)\, \mathrm{vec}(k) = 0\}.$$
In particular, if E is of the form $E = \lambda 1 + \Omega$ with $\lambda \in \mathbb{C}$ and $\Omega \in \mathfrak{u}(N)$, then $\mathfrak{k}_E$ is
identical to the centralizer of $\Omega$ in $\mathfrak{u}(N)$.
By ortho-normalizing the elements $k_j \in \mathfrak{k}_E$ of the generating set with
$j = 1, 2, \dots, n_E$, one obtains the projectors $P_j := |k_j\rangle \langle k_j|$ (see also Eq. (3.88)) to give
the total projection operator $P := \sum_j P_j$. With this definition, the gradient flow
U2K of the summarizing Table 2 applies and solves Eq. (4.28). Therefore, the
constraint of leaving a neutral state E invariant during the transfer from A to C
can be approached intrinsically by restricting the flow from the full unitary group
to a compact connected Lie subgroup, the stabilizer group $\mathbf{K}_E$ of E.
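
The linear-algebra step for finding generators of the stabilizer algebra can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: instead of solving the Kronecker-product equation over the complex numbers and intersecting with u(N) afterwards, it expands k in a real basis of u(N) and computes the null space of the map k ↦ [k, E] by an SVD. The function names, the tolerance and the example matrix E are hypothetical.

```python
import numpy as np

def u_basis(n):
    """Real basis of u(n), the skew-Hermitian n x n matrices (n*n elements)."""
    basis = []
    for a in range(n):
        d = np.zeros((n, n), dtype=complex)
        d[a, a] = 1j
        basis.append(d)
        for b in range(a + 1, n):
            s = np.zeros((n, n), dtype=complex)
            s[a, b], s[b, a] = 1.0, -1.0
            basis.append(s)
            t = np.zeros((n, n), dtype=complex)
            t[a, b] = t[b, a] = 1j
            basis.append(t)
    return basis

def stabilizer_algebra(E, tol=1e-10):
    """Generators of {k in u(N) : [k, E] = 0} from the null space of k -> [k, E]."""
    basis = u_basis(E.shape[0])
    cols = []
    for b in basis:
        c = (b @ E - E @ b).ravel()
        cols.append(np.concatenate([c.real, c.imag]))  # treat the map as real-linear
    M = np.array(cols).T
    _, s, vh = np.linalg.svd(M)
    null_rows = vh[np.sum(s > tol):]  # right singular vectors with zero singular value
    return [sum(c * b for c, b in zip(row, basis)) for row in null_rows]

def vec_form(E):
    """The matrix 1 (x) E - E^T (x) 1 acting on the column-stacked vec(k)."""
    eye = np.eye(E.shape[0])
    return np.kron(eye, E) - np.kron(E.T, eye)

# Hypothetical example: E with a doubly degenerate eigenvalue.
E = np.diag([1.0, 1.0, 2.0]).astype(complex)
gens = stabilizer_algebra(E)
print(len(gens))
```

Each generator k returned here is skew-Hermitian and satisfies the vec equation of the text when vec denotes column stacking, i.e. `vec_form(E) @ k.ravel(order="F")` vanishes up to rounding. After orthonormalization, such generators can serve as the $k_j$ entering the projector P.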
However, it may be tedious to check for the stabilizer group $\mathbf{K}_E$ in each and
every practical instance and then project the gradients onto the corresponding
subalgebra $\mathfrak{k}_E$. In [28], we therefore presented a combined approach based on the
penalty function
$$L(U) = f_2(U) - \lambda\, (\mathrm{tr}\{E^\dagger U E U^\dagger\} - \|E\|_2^2) \qquad (4.31)$$
with $f_2(U) := |\mathrm{tr}\{C^\dagger U A U^\dagger\}|^2$ and penalty term $\lambda\, (\mathrm{tr}\{E^\dagger U E U^\dagger\} - \|E\|_2^2)$. Here, the
constraint $U E U^\dagger - E = 0$ was rewritten in the more convenient form
$\mathrm{tr}\{E^\dagger U E U^\dagger\} - \|E\|_2^2 = 0$. The algorithm given in Table 2 as U2C implements a discretized gradient
flow of L obtained from the identity

$$DL(U)\,(\Omega U) = \mathrm{tr}\{(2 (f_2^*(U)\, [U A U^\dagger, C^\dagger])_S - \lambda\, [U E U^\dagger, E^\dagger])\, \Omega\}.$$
Note that the penalty parameter $\lambda$ is increased within the recursion to guarantee
that the constraint is (at least approximately) satisfied in the limit.
Thus, for the constrained optimization task of maximizing the transfer from A
to C subject to leaving the state E invariant, one has the choice of taking either the
intrinsic approach U2K or the combined approach of U2C. Note, however, that the
intrinsic approach restricts the flow to the stabilizer group $\mathbf{K}_E$ at any time, whereas
the combined method is designed so as to start arbitrarily on U(N) but finally to
give an equilibrium point on $\mathbf{K}_E$. Therefore, the intrinsic approach has the
advantage that the constraint is (at least in principle) properly satisfied for the entire
iteration. However, there are situations where an intrinsic method is impractical as
the computational costs are too high. The combined method, in contrast, does
not suffer from this shortcoming and thus has a wider range of applications. On
the other hand, it is well-known that simple penalty methods as presented above
become ill-conditioned for large values of $\lambda$. Therefore, an augmented Lagrange
multiplier approach may be a good alternative if numerical difficulties arise,
cf. [86, 87].
Note that the intrinsic approach paves the way to perform (or approximate)
a transfer from A to C robustly by taking $\mathbf{K}_E$ as the stabilizer group resistant
against a certain error class in the sense familiar from stabilizer codes [142–144].
The extrinsic approach, on the other hand, could be taken to transfer one protected
state A to another one C via intermediate states that are no longer necessarily
protected against errors as in the intrinsic case.
Finally, in [28, 113], we devised a penalty-type gradient flow algorithm for
solving the constrained optimization
$$\max_U |\mathrm{tr}\{C^\dagger U A U^\dagger\}| \quad \text{subject to} \quad \mathrm{tr}\{D^\dagger U A U^\dagger\} = \min . \qquad (4.32)$$
To this end, we introduced the penalty function
$$L(U) := |\mathrm{tr}\{C^\dagger U A U^\dagger\}|^2 - \lambda\, |\mathrm{tr}\{D^\dagger U A U^\dagger\}|^2 , \qquad (4.33)$$
to maximize the transfer from A to C while suppressing the transfer from A
to D. This leads to the recursive scheme U3C in Table 2. For the relation of
unconstrained and constrained gradient flows to the topic of C-numerical ranges
and relative C-numerical ranges, see [113, 114, 145], where the latter explicitly
compares gradient results with those of quadratic programming with quadratic
constraints.

5. Conclusions
The ability to calculate optima of quality functions for quantum dynamical pro-
cesses and to determine steerings in concrete experimental settings that actually
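
Table 2. Examples of optimization tasks and related gradient flows.

No. | Target function | Discretized gradient flow | Ref.

I. Unconstrained optimization

Maximization over the orthogonal group: $O \in SO(N, \mathbb{R})$ and $A, \Lambda \in \mathbb{R}^{N \times N}$ with $\Lambda$ diagonal, $\alpha_k > 0$ stepsize
O1 | $f(O) = \mathrm{tr}\{\Lambda^\top O A O^\top\}$ | $O_{k+1} = \exp\{-\alpha_k [O_k A O_k^\top, \Lambda^\top]\}\, O_k$ | [17, 22, 23]

Maximization over the unitary group: $U, V \in SU(N)$ and $A, C \in \mathbb{C}^{N \times N}$; $[\cdot,\cdot]_S$ and $(\cdot)_S$ denote skew-Hermitian parts
U1 | $f(U) = \mathrm{Re}\, \mathrm{tr}\{C^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k [U_k A U_k^\dagger, C^\dagger]_S\}\, U_k$ | [27, 28]
U2 | $f(U) = |\mathrm{tr}\{C^\dagger U A U^\dagger\}|^2$ | $U_{k+1} = \exp\{-\alpha_k ([A_k, C^\dagger]\, f^*(U_k) - [A_k, C^\dagger]^\dagger f(U_k))\}\, U_k$ where $A_k := U_k A U_k^\dagger$ | [27, 28]
U3 | $f(U, V) = \mathrm{Re}\, \mathrm{tr}\{C^\dagger U A V\}$ | $U_{k+1} = \exp\{-\alpha_k (U_k A V_k C^\dagger)_S\}\, U_k$, $V_{k+1} = \exp\{-\alpha_k (V_k C^\dagger U_k A)_S\}\, V_k$ | [23, 29]

Maximization restricted to subgroups $\mathbf{K} \subset U(N)$ of the unitary group with $K \in \mathbf{K}$ and $P_{\mathfrak{k}}$ as projection from $\mathfrak{gl}(N, \mathbb{C})$ onto $\mathfrak{k}$, i.e. the Lie algebra of $\mathbf{K}$
U1K | $f(K) = \mathrm{Re}\, \mathrm{tr}\{C^\dagger K A K^\dagger\}$ | $K_{k+1} = \exp\{-\alpha_k P_{\mathfrak{k}} [K_k A K_k^\dagger, C^\dagger]\}\, K_k$ | [here^a]
U2K | $f(K) = |\mathrm{tr}\{C^\dagger K A K^\dagger\}|^2$ | $K_{k+1} = \exp\{-\alpha_k (P_{\mathfrak{k}} [A_k, C^\dagger]\, f^*(K_k) - P_{\mathfrak{k}} [A_k, C^\dagger]^\dagger f(K_k))\}\, K_k$ where $A_k := K_k A K_k^\dagger$ | [here^a]
U3K | $f(K_1, K_2) = \mathrm{Re}\, \mathrm{tr}\{C^\dagger K_1 A K_2\}$ | $K^{(1)}_{k+1} = \exp\{-\alpha_k P_{\mathfrak{k}} (K^{(1)}_k A K^{(2)}_k C^\dagger)\}\, K^{(1)}_k$, $K^{(2)}_{k+1} = \exp\{-\alpha_k P_{\mathfrak{k}} (K^{(2)}_k C^\dagger K^{(1)}_k A)\}\, K^{(2)}_k$ | [29]
U4K | $\min_{U_j \in SU(n)} \| \sum_{j=1}^{N} U_j A_j U_j^\dagger - A_0 \|_2^2$ | $U^{(j)}_{k+1} = \exp\{-\alpha_k [A^{(j)}_k, A_{0jk}^\dagger]_S\}\, U^{(j)}_k$ where $A^{(j)}_k := U^{(j)}_k A_j U^{(j)\dagger}_k$ and $A_{0jk} := A_0 - \sum_{\nu=1,\, \nu \neq j}^{N} A^{(\nu)}_k$ | [141]
U5K | $\min_{U_j, V_j \in SU(n)} \| \sum_{j=1}^{N} U_j A_j V_j - A_0 \|_2^2$ | $U^{(j)}_{k+1} = \exp\{-\alpha_k (U^{(j)}_k A_j V^{(j)}_k A_{0jk}^\dagger)_S\}\, U^{(j)}_k$, $V^{(j)}_{k+1} = \exp\{-\alpha_k (V^{(j)}_k A_{0jk}^\dagger U^{(j)}_k A_j)_S\}\, V^{(j)}_k$, where $A_{0jk} := A_0 - \sum_{\nu=1,\, \nu \neq j}^{N} U^{(\nu)}_k A_\nu V^{(\nu)}_k$ | [141]

Maximization restricted to homogeneous spaces G/H of the orthogonal group with $X \in G/H$ and A, C real symmetric
O1P | $f(X) = \mathrm{tr}\{C X\}$ with $X_k := \mathrm{Ad}_{O_k}(A)$ | $X_{k+1} = e^{-\alpha_k [X_k, C]}\, X_k\, e^{+\alpha_k [X_k, C]}$ | [22, 119]

Maximization restricted to homogeneous spaces G/H of the unitary group with $X \in G/H$ and A, C arbitrary complex square and $P_{\mathfrak{k}}$ as projection from $\mathfrak{gl}(N, \mathbb{C})$ onto $\mathfrak{k}$
U1P | $f(X) = \mathrm{Re}\, \mathrm{tr}\{C^\dagger X\}$ with $X_k := \mathrm{Ad}_{U_k}(A)$ | $X_{k+1} = e^{-\alpha_k [X_k, C^\dagger]_S}\, X_k\, e^{+\alpha_k [X_k, C^\dagger]_S}$ | [here]
U1KP | $f(X) = \mathrm{Re}\, \mathrm{tr}\{C^\dagger X\}$ with $X_k := \mathrm{Ad}_{K_k}(A)$ | $X_{k+1} = e^{-\alpha_k P_{\mathfrak{k}} [X_k, C^\dagger]}\, X_k\, e^{+\alpha_k P_{\mathfrak{k}} [X_k, C^\dagger]}$ | [here]

II. Constrained optimization

Maximizing L(U) with penalty parameter $\lambda \in \mathbb{R}$ over the unitary group: $U \in SU(N)$; $A, C, D, E \in \mathbb{C}^{N \times N}$
U1C | $L(U) = \mathrm{Re}\, f_C(U) - \lambda\, \mathrm{Im}^2 f_C(U)$ with $f_C(U) := \mathrm{tr}\{C^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k ([A_k, C^\dagger]_S + 2i\lambda\, \mathrm{Im}\, f_C(U_k)\, [A_k, C^\dagger]_H)\}\, U_k$ where $A_k := U_k A U_k^\dagger$ and $X_{H,S} := \tfrac{1}{2}(X \pm X^\dagger)$ | [28]
U2C | $L(U) = |f_C(U)|^2 - \lambda\, (f_E(U) - \|E\|_2^2)$ with $f_C(U)$ (see above) and $f_E(U) := \mathrm{tr}\{E^\dagger U E U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k ((2 f_C^*(U_k)\, [A_k, C^\dagger])_S - \lambda\, [E_k, E^\dagger])\}\, U_k$ where $A_k := U_k A U_k^\dagger$ and $E_k := U_k E U_k^\dagger$ | [28]
U3C | $L(U) = |f_C(U)|^2 - \lambda\, |f_D(U)|^2$ with $f_C(U)$ (see above) and $f_D(U) := \mathrm{tr}\{D^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-2\alpha_k ((f_C^*(U_k)\, [A_k, C^\dagger])_S - \lambda\, (f_D^*(U_k)\, [A_k, D^\dagger])_S)\}\, U_k$ where $A_k := U_k A U_k^\dagger$ | [28]

a Work presented in part at the MTNS 2006 [117].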
Summary:
General Gradient Algorithm for Steepest Ascent on Riemannian Manifolds

Requirements: Riemannian manifold $M$, e.g., Lie group $G$ with (bi-invariant)
metric $\langle \cdot | \cdot \rangle$ or its group orbits; smooth target function $f : M \to \mathbb{R}$; associated
gradient system $\dot{X} = \mathrm{grad}\, f(X)$.
Input: initial state $X(0) \in M$, parameters for target function.
Output: sequence of iterative pairs $\{(X_k, f(X_k))\}$ approximating critical points
$X_*$ and their critical values $f(X_*)$.
Initialization: If possible, generate generic initial state $X_0$, e.g., for compact Lie
groups pick random $G_0 \in G$ according to Haar measure (for SU(N) see [140])
and set $X_0 := G_0 \cdot X(0)$, otherwise identify $X_0 := X(0)$; calculate $f(X_0)$,
$\mathrm{grad}\, f(X_0)$, and step size $\alpha_0$ according to Sec. 3.
Recursion:
while $k = 0, 1, 2, \dots, k_{limit}$ and $\alpha_k >$ threshold $> 0$ do
1: iterate $X_{k+1} = \exp_{X_k}(\alpha_k\, \mathrm{grad}\, f(X_k))$ according to examples in Table 2.
2: calculate $f(X_{k+1})$.
3: update step size $\alpha_{k+1}$ according to Sec. 3.
4: go to step 1.
end
Fig. 7. Summarizing scheme for steepest-ascent gradient flows on Riemannian manifolds. For
related methods, like conjugate gradients, Jacobi- or Newton-type schemes, step (1) has to be
modified in a straightforward way according to Sec. 2; for details see [20, 62, 63]. If the dynamic
stepsize selection of Sec. 3 is too costly in CPU time, one may start out with constant stepsizes,
and halve them whenever $(f(X_{k+1}) - f(X_k)) \le 0$, cf. Armijo's rule. In cases where local extrema
exist (see Sec. 3), make sure to run with a sufficient number of generic initial conditions.
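
The scheme above can be sketched in a few lines for instance U1 of Table 2, i.e. for $f(U) = \mathrm{Re}\,\mathrm{tr}\{C^\dagger U A U^\dagger\}$ on the unitary group. The following NumPy sketch is illustrative only and rests on assumptions: it uses the simple halving rule mentioned in the caption instead of the step-size selection of Sec. 3, it additionally rejects a step that does not increase f (a choice made here so that the recorded values increase by construction), and the names `ascent_u1`, `step`, `k_limit` and `threshold` are hypothetical.

```python
import numpy as np

def exp_skew(omega):
    """Matrix exponential of a skew-Hermitian matrix via the Hermitian matrix i*omega."""
    w, v = np.linalg.eigh(1j * omega)
    return (v * np.exp(-1j * w)) @ v.conj().T

def skew(m):
    """Skew-Hermitian part of a square matrix."""
    return (m - m.conj().T) / 2

def ascent_u1(A, C, U0, step=0.5, k_limit=500, threshold=1e-8):
    """Steepest ascent for f(U) = Re tr{C^dagger U A U^dagger} with step halving."""
    def f(U):
        return np.trace(C.conj().T @ U @ A @ U.conj().T).real

    U, values = U0, [f(U0)]
    k = 0
    while k < k_limit and step > threshold:
        Ak = U @ A @ U.conj().T
        # Ascent direction -[A_k, C^dagger]_S, cf. row U1 of Table 2.
        omega = -skew(Ak @ C.conj().T - C.conj().T @ Ak)
        U_new = exp_skew(step * omega) @ U
        f_new = f(U_new)
        if f_new - values[-1] <= 0:
            step /= 2  # no increase: halve the step size and try again
        else:
            U = U_new
            values.append(f_new)
        k += 1
    return U, values

# Hypothetical test data: random Hermitian A and C.
rng = np.random.default_rng(2)

def rand_herm(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

A, C = rand_herm(4), rand_herm(4)
U, values = ascent_u1(A, C, np.eye(4, dtype=complex))
print(values[0], values[-1])
```

For Hermitian A and C the target function is bounded above by the sum of products of the eigenvalues of A and C taken in the same order, which gives a convenient reference for judging how far a run has progressed; as the caption notes, several generic initial conditions may be needed when local extrema exist.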

achieve these optima is tantamount to exploiting and manipulating quantum effects
in future technology. To this end, we have presented a comprehensive account
of gradient flows on Riemannian manifolds (see general scheme of Fig. 7) allowing
for generically convergent quantum optimization algorithms, with an ample
array of explicit examples being given in Table 2. Since the state spaces of
quantum dynamical systems can often be represented by smooth manifolds,
the unified foundations given here are also illustrated by many applications for
numerically addressing optimization tasks in quantum information and quantum
control.
In the present work, a variety of applications are addressed by relating the
dynamics to Lie group actions of the unitary group and its closed subgroups, which
also includes recent least-squares approximations by a sum of several elements on
independent matrix orbits [141] given as instances U4K and U5K in Table 2. Since
symmetries give rise to stabilizer groups, particular attention has been paid to
gradient flows on homogeneous spaces.
Theory and algorithms have been structured and tailored for the following
scenarios:
(i) for Lie groups with bi-invariant metric,
(ii) for closed subgroups,
(iii) for compact Riemannian symmetric spaces, or, more generally,
(iv) for naturally reductive homogeneous spaces.
As soon as the homogeneous spaces are no longer naturally reductive, the
standard way of representing geodesics on quotient spaces (by projecting geodesics
from the group level to the quotient) fails. Alternatives of local approximations
have been sketched in these cases in order to structure future developments.
Techniques based on the Riemannian exponential are easy to implement on Lie
groups (with bi-invariant metric) and their closed subgroups. In particular, gradient
flows on subgroups of the unitary group with tensor product structure allow one to
address different partitionings of m-party quantum systems, the finest one being the
group of purely local operations $SU(2) \otimes SU(2) \otimes \cdots \otimes SU(2)$. The corresponding
gradient flows have several applications in quantum dynamics: for instance, they
prove useful tools to decide whether effective multi-qubit interaction Hamiltonians
generate time evolutions that can be reversed in the sense of Hahn's spin echo solely
by local operations. As a new application, gradient flows on $SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_m)$
turned out to be a valuable and reliable alternative to conventional
tensor-SVD methods for determining best rank-1 tensor approximations to
higher-order tensors. In the case of m-party multipartite pure quantum states, they can
readily be applied to optimizing entanglement witnesses.
Double-bracket flows have been characterized as a special case of gradient flows
on naturally reductive homogeneous orbit spaces. Here, in view of using gradient
techniques for ground-state calculations [36], it is important to note that
double-bracket flows can also be established for any closed subgroup of SU(N): by allowing
for different partitionings $SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_m)$, one may set up a
common frame to compare different types of unitary networks [36, 50] for calculating
and simulating large-scale quantum systems.
Moreover, we have shown how techniques of restricting a gradient flow to
subgroups also prove a useful tool for addressing constrained optimization tasks
by ensuring the constraints are fulfilled intrinsically. As an alternative, we have
sketched gradient flows that respect constraints extrinsically, e.g., by way of
penalty-type parameters. These methods await application, e.g., in error-correction and
robust state transfer.
Finally, in a follow-up study, we discuss the dynamics of open quantum
systems in terms of Lie semigroups [59], including relations between the theory of Lie
semigroups and completely positive semigroups. In particular in open systems, an
easy characterization of reachable sets arises only in very simple cases. It thus poses
a current limit to an abstract optimization approach on reachable sets. However,
in these cases, gradient-assisted optimal control methods again prove valuable.
Therefore, not only does the current work give the justification to some recent
developments, it also provides new techniques to the field of quantum dynamics. It
shows how to exploit the differential geometry in Lie theoretical terms for
optimization on quantum-state manifolds. Thus the comprehensive theoretical treatment
illustrated by known examples and new practical applications has been given to fill
a gap. We anticipate that the ample array of methods and their exemplifications
will find broad application, in particular since tensor approximations begin to play
a key role in tensor-network approaches. They are used to approximate ground
state energies (Rayleigh coefficients) of large-system Hamiltonians exceeding the
memory capacity of any (classical) computer hardware [36, 41–50]. The account of
theoretical foundations is also meant to structure and trigger further basic research,
thus widening the set of useful tools.

Acknowledgments
Fruitful discussion with Jens Eisert on [36] is gratefully acknowledged. We wish
to thank Otfried Gühne for drawing our attention to witness optimization and
Ref. [55]. This work was supported in part by the integrated EU programmes QAP,
Q-ESSENCE and the exchange with COQUIT, as well as by Deutsche
Forschungsgemeinschaft, DFG, in the incentives SPP 1078 and SFB 631. Support and exchange
enabled by the two Bavarian PhD programmes of excellence "Quantum Computing,
Control, and Communication" (QCCC) as well as "Identification, Optimization and
Control with Applications in Modern Technologies" is gratefully acknowledged.

References
[1] R. P. Feynman, Simulating physics with computers, Int. J. Theoret. Phys. 21 (1982)
467–488.
[2] R. P. Feynman, Feynman Lectures on Computation (Perseus Books, Reading, MA,
1996).
[3] A. Y. Kitaev, A. H. Shen and M. N. Vyalyi, Classical and Quantum Computation
(American Mathematical Society, Providence, 2002).
[4] P. W. Shor, Algorithms for quantum computation: Discrete logarithms and
factoring, in Proceedings of the Symposium on the Foundations of Computer Science
(1994), Los Alamitos, California, USA (IEEE Computer Society Press, New York,
1994), pp. 124–134.
[5] P. W. Shor, Polynomial-time algorithms for prime factorisation and discrete
logarithm on a quantum computer, SIAM J. Comput. 26 (1997) 1484–1509.
[6] R. Jozsa, Quantum algorithms and the Fourier transform, Proc. R. Soc. A 454
(1998) 323–337.
[7] R. Cleve, A. Ekert, C. Macchiavello and M. Mosca, Quantum algorithms revisited,
R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. 454 (1998) 339–354.
[8] M. Ettinger, P. Høyer and E. Knill, The quantum query complexity of the hidden
subgroup problem is polynomial, Inf. Process. Lett. 91 (2004) 43–48.
[9] L. K. Grover, A fast quantum mechanical algorithm for database search, in
Proceedings of the 28th Annual Symposium on the Theory of Computing (1996),
Philadelphia, Pennsylvania, USA (ACM Press, New York, 1996), pp. 212–219.
[10] L. K. Grover, Quantum mechanics helps in searching for a needle in a haystack,
Phys. Rev. Lett. 79 (1997) 325–328.
[11] C. H. Papadimitriou, Computational Complexity (Addison Wesley, Reading MA,
1995).
[12] S. Sachdev, Quantum Phase Transitions (Cambridge University Press, Cambridge,
1999).
[13] E. Jané, G. Vidal, W. Dür, P. Zoller and J. Cirac, Simulation of quantum dynamics
with quantum optical systems, Quant. Inf. Computation 3 (2003) 15–37.
[14] D. Porras and J. Cirac, Effective quantum spin systems with trapped ions, Phys.
Rev. Lett. 92 (2004) 207901.
[15] J. Dowling and G. Milburn, Quantum technology: The second quantum revolution,
Phil. Trans. R. Soc. Lond. A 361 (2003) 1655–1674.
[16] H. M. Wiseman and G. J. Milburn, Quantum Measurement and Control (Cambridge
University Press, Cambridge, 2009).
[17] R. W. Brockett, Dynamical systems that sort lists, diagonalise matrices, and solve
linear programming problems, in Proc. IEEE Decision Control (1988), Austin,
Texas, USA (1988), pp. 779–803; reproduced in: Lin. Alg. Appl. 146 (1991) 79–91.
[18] R. W. Brockett, Least-squares matching problems, Lin. Alg. Appl. 122(4) (1989)
761–777.
[19] R. W. Brockett, Differential geometry and the design of gradient algorithms, Proc.
Symp. Pure Math. 54 (1993) 69–91.
[20] S. T. Smith, Geometric optimization methods for adaptive filtering, PhD Thesis,
Harvard University, Cambridge MA (1993).
[21] S. T. Smith, Hamiltonian and Gradient Flows, Algorithms and Control, Fields
Institute Communications (American Mathematical Society, Providence, 1994),
pp. 113–136, chap. Optimization techniques on Riemannian manifolds.
[22] U. Helmke and J. B. Moore, Optimization and Dynamical Systems (Springer, Berlin,
1994).
[23] A. Bloch (ed.), Hamiltonian and Gradient Flows, Algorithms and Control, Fields
Institute Communications (American Mathematical Society, Providence, 1994).
[24] M. T. Chou and K. R. Driessel, The projected gradient method for least-squares
matrix approximations with spectral constraints, SIAM J. Numer. Anal. 27 (1990)
1050–1060.
[25] P. A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms on Matrix
Manifolds (Princeton University Press, Princeton, 2008).
[26] L. Ambrosio, N. Gigli and G. Savaré, Gradient Flows in Metric Spaces and in the
Space of Probability Measures, Lectures in Mathematics, 2nd edn. (ETH-Zürich,
Birkhäuser, Basel, 2008).
[27] S. J. Glaser, T. Schulte-Herbrüggen, M. Sieveking, O. Schedletzky, N. C. Nielsen,
O. W. Sørensen and C. Griesinger, Unitary control in quantum ensembles:
Maximising signal intensity in coherent spectroscopy, Science 280 (1998) 421–424.
[28] T. Schulte-Herbrüggen, Aspects and prospects of high-resolution NMR, PhD Thesis,
Diss-ETH 12752, Zürich (1998).
[29] T. Schulte-Herbrüggen and A. Spörl, Which quantum evolutions can be reversed by
local unitary operations? Algebraic classification and gradient-flow based numerical
checks (2006); http://arXiv.org/pdf/quant-ph/0610061.
[30] N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen and S. J. Glaser, Optimal
control of coupled spin dynamics: Design of NMR pulse sequences by gradient ascent
algorithms, J. Magn. Reson. 172 (2005) 296–305.
[31] T. Schulte-Herbrüggen, A. K. Spörl, N. Khaneja and S. J. Glaser, Optimal
control-based efficient synthesis of building blocks of quantum algorithms: A perspective
from network complexity towards time complexity, Phys. Rev. A 72 (2005) 042331.
[32] A. K. Spörl, T. Schulte-Herbrüggen, S. J. Glaser, V. Bergholm, M. J. Storcz,
J. Ferber and F. K. Wilhelm, Optimal control of coupled Josephson qubits, Phys. Rev. A
75 (2007) 012302.
[33] T. Schulte-Herbrüggen, A. Spörl, N. Khaneja and S. Glaser, Optimal control for
generating quantum gates in open dissipative systems (2006);
http://arXiv.org/pdf/quant-ph/0609037.
[34] P. Rebentrost, I. Serban, T. Schulte-Herbrüggen and F. Wilhelm, Optimal control
of a qubit coupled to a non-Markovian environment, Phys. Rev. Lett. 102 (2009)
090401.
[35] M. Grace, C. Brif, H. Rabitz, I. Walmsley, R. Kosut and D. Lidar, Optimal control
of quantum gates and suppression of decoherence in a system of interacting two-level
particles, J. Phys. B: At. Mol. Opt. Phys. 40 (2007) S103–S125.
[36] C. Dawson, J. Eisert and T. J. Osborne, Unifying variational methods for simulating
quantum many-body systems, Phys. Rev. Lett. 100 (2008) 130501.
[37] T. Huckle, K. Waldherr and T. Schulte-Herbrüggen, Unifying large-scale tensor
approximations: Concepts and algorithms (2010); to be submitted.
[38] M. Plenio, J. Eisert, J. Dreissig and M. Cramer, Entropy, entanglement, and area:
Analytical results for harmonic lattice systems, Phys. Rev. Lett. 94 (2003) 060503.
[39] M. Cramer, J. Eisert, M. Plenio and J. Dreissig, Entanglement-area law for general
bosonic harmonic lattice systems, Phys. Rev. A 73 (2006) 012309.
[40] M. Wolf, F. Verstraete, M. B. Hastings and I. Cirac, Area laws in quantum systems:
Mutual information and correlations, Phys. Rev. Lett. 100 (2008) 070502.
[41] M. Fannes, B. Nachtergaele and R. Werner, Abundance of translation invariant pure
states on quantum spin chains, Lett. Math. Phys. 25 (1992) 249–258.
[42] M. Fannes, B. Nachtergaele and R. F. Werner, Finitely correlated states on quantum
spin chains, Comm. Math. Phys. 144 (1992) 443–490.
[43] I. Peschel, X. Wang, M. Kaulke and K. Hallberg (eds), Density-Matrix
Renormalization: A New Numerical Method in Physics, Lecture Notes in Physics, Vol. 528
(Springer, Berlin, 1999).
[44] U. Schollwöck, The density-matrix renormalization group, Rev. Mod. Phys. 77
(2005) 259–315.
[45] B. Schumacher and R. Werner, Reversible quantum cellular automata (2004);
http://arXiv.org/pdf/quant-ph/0405174.
[46] F. Verstraete, D. Porras and I. Cirac, DMRG and periodic boundary conditions: A
quantum information perspective, Phys. Rev. Lett. 93 (2004) 227205.
[47] S. Anders, M. B. Plenio, W. D ur, F. Verstraete and H. J. Briegel, Ground-state
approximation for strongly interacting spin systems in arbitrary spatial dimension,
Phys. Rev. Lett. 97 (2006) 107206.
[48] G. Vidal, Entanglement renormalization, Phys. Rev. Lett. 99 (2007) 220405.
[49] N. Schuch, M. Wolf, F. Verstraete and I. Cirac, Strings, projected entangled pair
states, and variational Monte Carlo methods, Phys. Rev. Lett. 100 (2008) 040501.
[50] R. Hubner, C. Kruszynska, L. Hartmann, W. D ur, F. Verstraete, J. Eisert and
M. Plenio, Renormalization algorithm with graph enhancement, Phys. Rev. A 79
(2009) 022317.
[51] F. Wegner, Flow-equations for Hamiltonians, Ann. Phys. (Leipzig) 3 (1994) 7791.
[52] S. Kehrein, The Flow-Equation Approach to Many-Particle Systems, Springer Tracts
in Physics, Vol. 217 (Springer, Berlin, 2006).
July 12, 2010 12:0 WSPC/S0129-055X 148-RMP
J070-S0129055X10004053

Gradient Flows for Optimization in Quantum Information and Quantum Dynamics 663

[53] M. B. Plenio and S. Virmani, An introduction to entanglement measures, Quant.


Comp. Inf. 7 (2007) 151.
[54] R. Horodecki, P. Horodecki, M. Horodecki and K. Horodecki, Quantum entangle-
ment, Rev. Mod. Phys. 81 (2009) 865942.
[55] J. Eisert, P. Hyllus, O. G uhne and M. Curty, Complete hierarchies of ecient
approximations to problems in entanglement theory, Phys. Rev. A 70 (2004) 062317.
[56] R. Lohmayer, A. Osterloh, J. Siewert and A. Uhlmann, Entangled three-qubit states
without concurrence and three-tangle, Phys. Rev. Lett. 97 (2006) 260502.
[57] A. Osterloh, J. Siewert and A. Uhlmann, Tangles of superpositions and the convex-
roof extension, Phys. Rev. A 77 (2008) 032210.
[58] C. Eltschka, A. Osterloh, J. Siewert and A. Uhlmann, Three-tangle for mixtures of
generalized GHZ and generalized W states, New J. Phys. 10 (2008) 043014.
[59] G. Dirr, U. Helmke, I. Kurniawan and T. Schulte-Herbr uggen, Lie semigroup struc-
tures for reachability and control of open quantum systems, Rep. Math. Phys. 64
(2009) 93121; http://arXiv.org/pdf/0811.3906.
[60] M. M. Wolf and J. I. Cirac, Dividing quantum channels, Comm. Math. Phys. 279
(2008) 147168.
[61] C. Udriste, Convex Functions and Optimization Methods on Riemannian Manifolds
(Kluwer, Dordrecht, 1994).
[62] D. Gabay, Minimizing a dierential function over a dierential manifold, J. Optim.
Theory Appl. 37 (1982) 177219.
[63] M. Kleinsteuber, Jacobi-type methods on semisimple Lie algebras A Lie algebraic
approach to the symmetric eigenvalue problem, PhD Thesis, Universit at Wurzburg
(2006).
[64] J. Nocedal, Updating quasi-Newton matrices with limited storage, Math. Comp. 35
(1980) 773782.
[65] R. H. Byrd, P. Lu and R. B. Schnabel, Representation of quasi-Newton matrices
and their use in limited memory methods, Math. Program. 63 (1994) 129156.
[66] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd edn. (Springer, New York,
2006).
[67] V. Jurdjevic, Geometric Control Theory (Cambridge University Press, Cambridge,
1997).
[68] Y. L. Sachkov, Controllability of invariant systems on Lie groups and homogeneous
spaces, J. Math. Sci. 100 (2000) 23552427.
[69] G. Dirr and U. Helmke, Lie theory for quantum control, GAMM-Mitteilungen 31
(2008) 5993.
[70] D. DAlessandro, Introduction to Quantum Control and Dynamics (Chapman &
Hall/CRC, Boca Raton, 2008).
[71] V. F. Krotov, Global Methods in Optimal Control (Marcel Dekker, New York, 1996).
[72] A. Peirce, M. Dahleh and H. Rabitz, Optimal control of quantum mechanical sys-
tems: Existence, numerical approximations and applications, Phys. Rev. A 37 (1987)
49504962.
[73] K. L. Teo, C. J. Goh and K. H. Wong, A Unied Computational Approach to Optimal
Control Problems (Chapman & Hall/CRC, Boca Raton, 1991).
[74] Y. Maday and G. Turinici, New formulation of monotonically convergent quantum
control algorithms, J. Chem. Phys. 118 (2003) 81918196.
[75] H. Sussmann and V. Jurdjevic, Controllability of nonlinear systems, J. Dierential
Equations 12 (1972) 95116.
[76] V. Jurdjevic and H. Sussmann, Control systems on Lie groups, J. Dierential Equa-
tions 12 (1972) 313329.
July 12, 2010 12:0 WSPC/S0129-055X 148-RMP
J070-S0129055X10004053

664 T. Schulte-Herbr
uggen et al.

[77] A. Agrachev and T. Chambrion, An estimation of the controllability time for single-
input systems on compact Lie groups, ESAIM Control Optim. Calc. Var. 12 (2006)
409441.
[78] R. W. Brockett, System theory on group manifolds and coset spaces, SIAM J.
Control 10 (1972) 265284.
[79] R. W. Brockett, Lie theory and control systems dened on spheres, SIAM J. Appl.
Math. 25 (1973) 213225.
[80] W. M. Boothby and E. N. Wilson, Determination of the transitivity of bilinear
systems, SIAM J. Control Optim. 17 (1979) 212221.
[81] F. Albertini and D. DAlessandro, Notions of controllability for bilinear multilevel
quantum systems, IEEE Trans. Automat. Control 48 (2003) 13991403.
[82] R. Zeier, U. Sander and T. Schulte-Herbr uggen, Symmetry in quantum system
theory of multi-qubit systems, in Proc. 19th MTNS, Budapest, Hungary (2010),
in press.
[83] U. Sander and T. Schulte-Herbr uggen, Symmetry in quantum system the-
ory of multi-qubit systems: Rules for quantum architecture design (2009);
http://arXiv.org/pdf/0904.4654.
[84] M. W. Hirsch and S. Smale, Dierential Equations, Dynamical Systems, and Linear
Algebra (Academic Press, San Diego, 1974).
[85] M. C. Irwin, Smooth Dynamical Systems (Academic Press, New York, 1980).
[86] R. Fletcher, Practical Methods of Optimization, 2nd edn. (Wiley & Sons, Chichester,
1987).
[87] D. G. Luenberger and Y. Ye, Linear and Nonlinear Programming, 3rd edn. (Springer,
Berlin, 2008).
[88] W. Boothby, An Introduction to Dierential Manifolds and Riemannian Geometry
(Academic Press, New York, 1975).
[89] S. Gallot, D. Hulin and J. Lafontaine, Riemannian Geometry, 3rd edn. (Universitext,
Springer, Berlin, 2004).
[90] M. Spivak, A Comprehensive Introduction to Dierential Geometry, Vols. III, 3rd
edn. (Publish or Perish, Houston, 1999).
[91] B. ONeill, Semi-Riemannian Geometry (Academic Press, San Diego, 1983).
[92] R. Abraham, J. E. Marsden and T. Ratiu, Manifolds, Tensor Analysis and Appli-
cations, 2nd edn. (Springer, New York, 1988).
[93] J. Palis and W. de Melo, Geometric Theory of Dynamical Systems (Springer, New
York, 1982).
[94] S. L
 ojasiewicz, Sur les Trajectoires du Gradient dune Fonction Analytique. Seminari
di Geometria 19821983, Universit` a di Bologna, Istituto di Geometria, Dipartimento
di Matematica (1984).
[95] K. Kurdyka, On gradients of functions denable in O-minimal structures, Ann. Inst.
Fourier 48 (1998) 769783.
[96] S. Kobayashi and K. Nomizu, Foundations of Dierential Geometry, Vols. III
(Wiley Interscience, New York, 1996).
[97] F. Takens, A solution, in Manifolds Amsterdam 1970, ed. N. Kuiper, Lecture
Notes in Math., Vol. 197 (Springer, New York, 1971), p. 231.
[98] C. Lageman, Convergence of gradient-like dynamical systems and optimization algo-
rithms, PhD Thesis, Universit at W urzburg (2007).
[99] S. Helgason, Dierential Geometry, Lie Groups, and Symmetric Spaces (Academic
Press, New York, 1978).
[100] B. C. Hall, Lie Groups, Lie Algebras, and Representations (Springer, New York,
2003).
July 12, 2010 12:0 WSPC/S0129-055X 148-RMP
J070-S0129055X10004053

Gradient Flows for Optimization in Quantum Information and Quantum Dynamics 665

[101] J. J. Duistermaat and J. A. C. Kolk, Lie Groups (Springer, New York, 2000).
[102] A. Arvanitoyeorgos, An Introduction to Lie Groups and the Geometry of Homoge-
neous Spaces (American Mathematical Society, Providence, 2003).
[103] A. W. Knapp, Lie Groups beyond an Introduction, 2nd edn. (Birkh auser, Boston,
2002).
[104] J. Milnor, Curvatures of left invariant metrics on Lie groups, Adv. Math. 21 (1976)
293329.
[105] J. Cheeger and D. G. Ebin, Comparison Theorems in Riemannian Geometry (North-
Holland, Amsterdam, 1975).
[106] T. Br ocker and T. tom Dieck, Representation of Compact Lie Groups (Springer,
New York, 1985).
[107] O. Kowalski and J. Szenthe, On the existence of homogeneous geodesics in homo-
geneous Riemannian manifolds, Geom. Dedicata 81 (2000) 209214; Erratum, ibid.
84 (2001) 331.
[108] A. Besse, Einstein Manifolds (Spinger, Berlin, 1986).
[109] B. Kostant, Holonomy and Lie algebra of motions in Riemannian manifolds, Trans.
Amer. Math. Soc. 80 (1955) 520542.
[110] U. Helmke, K. H uper and J. Trumpf, Newtons method on Grassmann manifolds
(2007); http://arXiv.org/pdf/0709.2205.
[111] M. Goldberg and E. Straus, Elementary inclusion relations for generalized numerical
ranges, Linear Algebra Appl. 18 (1977) 124.
[112] C.-K. Li, C-numerical ranges and C-numerical radii, Lin. Multilin. Alg. 37 (1994)
5182.
[113] T. Schulte-Herbr uggen, G. Dirr, U. Helmke, M. Kleinsteuber and S. Glaser, The
signicance of the C-numerical range and the local C-numerical range in quantum
control and quantum information, Lin. Multin. Alg. 56 (2008) 326.
[114] G. Dirr, U. Helmke, M. Kleinsteuber and T. Schulte-Herbr uggen, Relative
C-numerical ranges for applications in quantum control and quantum information,
Lin. Multin. Alg. 56 (2008) 2751.
[115] C.-K. Li and N. K. Tsing, Matrices with circular symmetry on their unitary orbits
and C-numerical ranges, Proc. Amer. Math. Soc. 111 (1991) 1928.
[116] U. Helmke, K. H uper, J. B. Moore and T. Schulte-Herbr uggen, Gradient ows com-
puting the C-numerical range with applications in NMR spectroscopy, J. Global
Optim. 23 (2002) 283308.
[117] G. Dirr, U. Helmke, M. Kleinsteuber, S. Glaser and T. Schulte-Herbr uggen, The
local C-numerical range: Examples, conjectures and numerical algorithms, in Proc.
MTNS (2006), Kyoto, Japan (2006), pp. 14191426.
[118] G. Dirr, U. Helmke, M. Kleinsteuber and T. Schulte-Herbr uggen, A new type of C-
numerical range arising in quantum computing, PAMM 6 (2006) 711712; Special
issue on 80th Annual Meeting GAMM.
[119] A. Bloch, R. W. Brockett and T. Ratiu, A new formulation of the generalized Toda
lattice equations and their x-point analysis via the moment map, Bull. Am. Math.
Soc. 56 (1990) 447451.
[120] A. Bloch, R. W. Brockett and T. Ratiu, Completely integrable gradient ows,
Comm. Math. Phys. 147 (1992) 5774.
[121] R. Bertlman, H. Narnhofer and W. Thirring, A geometric picture of entanglement
and Bell inequalities, Phys. Rev. A 66 (2002) 032319.
[122] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information
(Cambridge University Press, Cambridge, UK, 2000).
July 12, 2010 12:0 WSPC/S0129-055X 148-RMP
J070-S0129055X10004053

666 T. Schulte-Herbr
uggen et al.

[123] O. Curtef, G. Dirr and U. Helmke, Conjugate gradient algorithms for best rank-1
approximation of tensors, PAMM 7 (2008) 10622011062202; Proceedings of the
ICIAM (2007), Z urich, Switzerland.
[124] J. von Neumann, Some matrix-inequalities and metrization of matrix-space, Tomsk
Univ. Rev. 1 (1937) 286300; reproduced in John von Neumann: Collected Works,
Vol. IV: Continuous geometry and other topics, ed. A. H. Taub (Pergamon Press,
Oxford, 1962), pp. 205219.
[125] O. Srensen, Polarization transfer experiments in high-resolution NMR spec-
troscopy, Prog. NMR Spectrosc. 21 (1989) 503569.
[126] D. Markham, J. A. Miszczak, Z. Puchala and K. Zyczkowski, Quantum state
discrimination: A geometric approach, Phys. Rev. A 77 (2008) 042111.
[127] J. Stoustrup, O. Schedletzky, S. J. Glaser, C. Griesinger, N. C. Nielsen and O. W.
Srensen, Generalized bound on quantum dynamics: Eciency of unitary transfor-
mations between non-Hermitian states, Phys. Rev. Lett. 74 (1995) 29212924.
[128] T. Kolda, Orthogonal tensor decompositions, SIAM J. Matrix Anal. Appl. 23 (2001)
243255.
[129] T. Zhang and G. H. Golub, Rank-one approximation to higher-order tensors, SIAM.
J. Matrix Anal. Appl. 23 (2001) 534550.
[130] L. de Lathauwer, B. de Moor and J. Vandewalle, A multilinear singular value decom-
position, SIAM J. Matrix Anal. Appl. 21 (2000) 12531278.
[131] L. de Lathauwer, B. de Moor and J. Vandewalle, On the best rank-1 and rank-
(R1 , R2 , . . . , Rn ) approximation of higher-order tensors, SIAM J. Matrix Anal.
Appl. 21 (2000) 13241342.
[132] L. Elden and B. Savas, A NewtonGrassmann method for computing the best mul-
tilinear rank-(R1 , R2 , R3 ) approximation of a tensor, SIAM J. Matrix Anal. Appl.
31 (2009) 248271.
[133] B. Savas and L. H. Lim, Quasi-Newton methods on Grassmannians and multilinear
approximations of tensors, Optimization Online 2009 (2009) 2362; arXiv:0907.2214.
[134] O. Curtef, G. Dirr and U. Helmke, Riemannian optimization on tensor manifolds:
Applications to generalized Rayleigh quotients (2010); arXiv:1005.4854.
[135] M. Ishteva, L. D. Lathauwer, P. A. Absil and S. V. Huel, Dierential-geometric
Newton method for the best rank-(R1 , R2 , R3 ) approximation of tensors, Numer.
Algorithms 51 (2009) 179194; Tributes to Gene H. Golub, Part II.
[136] T. Wei and P. Goldbart, Geometric measure of entanglement and applications to
bipartite and multipartite quantum states, Phys. Rev. A 68 (2003) 022307.
[137] T. G. Kolda and B. W. Bader, Tutorial on MATLAB for tensors and the Tucker
decomposition, Talk at workshop on tensor decomposition and applications, CIRM,
Marseille (2005).
[138] J. L. Brylinski, Mathematics of Quantum Computation, Computational Mathemat-
ics Series (Chapman & Hall/CRC, Boca Raton, 2002), pp. 323, chap. on Algebraic
Measures of Entanglement.
[139] B. G. Englert and N. Metwally, Mathematics of Quantum Computation, Computa-
tional Mathematics Series (Chapman & Hall/CRC, Boca Raton, 2002), pp. 2475,
chap. on Kinematics of Qubit Pairs.
[140] F. Mezzadri, How to generate random matrices from the classical compact groups,
Notices Amer. Math. Soc. 54 (2007) 592604.
[141] C. K. Li, Y. T. Poon and T. Schulte-Herbr uggen, Least-squares approximation by
elements from matrix orbits achieved by gradient ows on compact Lie groups,
Math. Comp., in press (2010); arXiv:0812.1817.
July 12, 2010 12:0 WSPC/S0129-055X 148-RMP
J070-S0129055X10004053

Gradient Flows for Optimization in Quantum Information and Quantum Dynamics 667

[142] M. Grassl, Lectures on Quantum Information (Wiley-VCH, Weinheim, 2007),


pp. 105120, chap. on Quantum Error Correction.
[143] A. R. Calderbank, E. M. Rains, P. W. Shor and N. J. A. Sloane, Quantum error
correction via codes over GF (4), IEEE Trans. Inf. Theory 44 (1998) 13691387.
[144] A. R. Calderbank and P. W. Shor, Good quantum error-correcting codes exist, Phys.
Rev. A 54 (1998) 10891105.
[145] B. Tibken, Y. Fan, S. J. Glaser and T. Schulte-Herbr uggen, Semidenite pro-
gramming relaxations applied to determining upper bounds of C-numerical ranges,
in Proc. IEEE Intl. Conference on Control Applications (CCA) (2004 ), Munich,
Germany (2004); published as CD-ROM Proceedings (2006).
July 12, 2010 11:50 WSPC/S0129-055X 148-RMP
J070-S0129055X10004041

Reviews in Mathematical Physics


Vol. 22, No. 6 (2010) 669–697

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004041

DENSITY DEPENDENT STOCHASTIC NAVIER–STOKES
EQUATIONS WITH NON-LIPSCHITZ RANDOM FORCING

MAMADOU SANGO
School of Mathematics, Institute for Advanced Study,
1, Einstein Drive, Princeton, NJ 08540, USA
and
Department of Mathematics and Applied Mathematics,
University of Pretoria, Pretoria 0002, South Africa
sango1767@gmail.com
mamadou.sango@up.ac.za

Received 24 September 2009
Revised 17 March 2010

In this work, we investigate the question of existence of weak solutions to the density
dependent stochastic Navier–Stokes equations. The noise considered contains functions
which depend nonlinearly on the velocity and which do not satisfy the Lipschitz
condition. Furthermore, the initial density is allowed to vanish. We introduce a suitable
notion of probabilistic weak solution for the problem and prove its existence.

Keywords: Density dependent stochastic Navier–Stokes equations; weak solutions;
Galerkin scheme; tightness of probability measures; Prokhorov's and Skorokhod's
theorems.

Mathematics Subject Classification 2010: 35R60, 35D05, 35Q35

1. Introduction
The mathematical study of incompressible Navier–Stokes equations goes back to
the pioneering work of Leray [29–31]. Since then a considerable wealth of work
and ground-breaking results have been obtained by some of the brightest minds in
Mathematics and Applied Mathematics. For an in-depth historical overview of the
body of work done in this direction, we refer to the monographs [1, 27, 32, 35, 57].
One of the greatest challenges in the field of fluid dynamics is understanding
the complex phenomenon of turbulence. With the development of
stochastic processes, models of Navier–Stokes equations perturbed by white noise
were proposed and investigated in the quest for a better understanding of turbulence
in fluids (see [3–5, 10, 11, 15–17, 19, 40, 41, 43, 46, 47, 60–62], just to cite a few). The
main feature of these equations is the decomposition of the force acting on the
fluid into a regular (deterministic) part and a very irregular (turbulent) part driven
by white noise. The mathematical theory of stochastic (mainly incompressible)
Navier–Stokes equations is a very rich and broad area covering deep results on
existence of solutions, dynamical systems features, ergodicity, and many more; see
[11, 14, 17, 42], for instance.
However, while research in the density independent case has seen relatively
sustained growth over the years, very little is known about the density dependent
case, which even in the deterministic setting has a relatively recent history. On the
deterministic front, results on global existence and some uniqueness results have
been obtained by Antontsev and Kazhikov [1, 24] in the case of nonvanishing initial
density; see also [28]. These results were subsequently extended to the vanishing
initial density case in [20, 22, 23, 33, 35, 52–54] (the magnetohydrodynamic version).
The most difficult case of compressible fluids has been a very active area since the work of
Lions [37–39], where the notion of renormalized solution introduced earlier by him
and DiPerna led to a breakthrough in the field; we refer to his monograph [36] and
those of Feireisl [18] and Novotny [44] for a greater wealth of information.
In the present work, we provide a detailed investigation of a large class of
stochastic density dependent Navier–Stokes equations. We consider a sufficiently
general forcing consisting of a regular part and a stochastic part, both depending
nonlinearly on the velocity of the fluid. We do not require the functions
involved in the forcing to satisfy the Lipschitz condition, and we allow the initial
density of the fluid to be merely nonnegative. The main result is the construction of
a probabilistic weak solution for the problem considered. The result is achieved
thanks to a delicate blending of the semi-Galerkin approximation and deep compactness
theorems of both probabilistic and analytic nature, an approach which has proved
very efficient in establishing existence of solutions in other problems; we refer
to [3–5, 15, 16, 19, 46–50, 62, 64]. Securing the strong convergence of several sequences
of the approximating solutions through the tightness of the corresponding probability
distributions and fine measure-theoretic results presented far more challenges
than in the deterministic and the density independent stochastic Navier–Stokes
cases. Our results extend most of the known deterministic existence results referred
to above to the stochastic case. Yashima was the first to study stochastic density
dependent equations in his thesis [64]. He considered additive noise and the
case of positive initial density. One of his main contributions is the extension of
some results of Bensoussan and Temam [5] to the density dependent case and the
extension of some results of Antontsev and Kazhikov [1] to the stochastic case.
The next work known to us in this direction is that of Cutland and Enright [12],
who treat the case of positive initial density with nonlinear noise depending on the
velocity. Their approach is based on nonstandard analysis and Loeb space techniques.
It is worth noting that some existence results in the one-dimensional and
two-dimensional compressible cases were obtained in the work of Tornatore and
Yashima [58, 59, 63].
Since the forces are not assumed to be Lipschitz continuous, uniqueness is out of
reach for the problem we study. The genuine uniqueness question is similar to the
still unsolved deterministic case.
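Let $D$ be a bounded domain in $\mathbb{R}^3$ with a sufficiently smooth boundary $\partial D$
(at least Lipschitz). We fix a final time $T > 0$ and denote by $Q_T$ the cylindrical
domain $(0, T) \times D$. We consider the initial boundary value problem for the density
dependent stochastic Navier–Stokes equations

\[ \rho\, du + \bigl(\rho (u \cdot \nabla) u - \Delta u + \nabla P\bigr)\, dt = \rho f(t, u)\, dt + \rho g(t, u)\, dW \quad \text{in } Q_T, \tag{1} \]
\[ \frac{\partial \rho}{\partial t} + (u \cdot \nabla) \rho = 0 \quad \text{in } Q_T, \tag{2} \]
\[ \operatorname{div} u = 0 \quad \text{in } Q_T, \tag{3} \]
\[ u = 0 \quad \text{on } (0, T) \times \partial D, \tag{4} \]
\[ \rho(0) = \rho_0, \quad u(0) = u_0 \quad \text{in } D; \tag{5} \]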

$u$ is the velocity of the particles of fluid, $P$ the pressure, $\rho$ the density, $W$ an
$l$-dimensional Wiener process, and the right-hand side of (1) represents the force
acting on the fluid, consisting of a regular part involving the function $f$ and a
chaotic part involving the function $g$ and $W$.
As a closing remark in this introduction, we note that the framework elaborated
in the present paper opens some opportunities for attacking ergodic problems
related to density dependent turbulent Navier–Stokes fluids. The density independent
case was considered in [7–9, 14, 17, 25, 26], just to cite a few; see also the
references therein. The Galerkin approximation plays an important role in these
works.
The plan of the paper is as follows. In Sec. 2, we gather some preliminary results
that will be needed in the work, introduce the definition of the probabilistic
weak solution for problem (1)–(5), and formulate our main result. In Sec. 3,
we introduce a semi-Galerkin approximation scheme for problem (1)–(5) and
obtain a priori estimates for the approximating solutions needed for the application
of several compactness results. In Sec. 4, we prove the crucial result of tightness of
the Galerkin solutions and apply Prokhorov's and Skorokhod's compactness results.
In the last section, Sec. 5, we prove our main result.

2. Preliminaries and Main Result


We introduce some function spaces.
Let $\mathcal{D}(D)$ be the space of $C^\infty$ functions compactly supported in $D$ and let $\mathcal{D}'(D)$
be the space of distributions on $D$.
For $1 \le r \le \infty$ and $l$ a nonnegative integer, we define the Sobolev spaces

\[ W_r^l(D) = \{ v \in L^r(D) : D^\alpha v \in (L^r(D))^3 \text{ for } |\alpha| \le l \}, \]

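\[ D^\alpha = D_1^{\alpha_1} \cdots D_3^{\alpha_3}, \quad \alpha = (\alpha_1, \alpha_2, \alpha_3), \quad |\alpha| = \alpha_1 + \alpha_2 + \alpha_3, \quad D_i = \partial / \partial x_i. \]

$W_{r,0}^l(D)$ is the closure of $\mathcal{D}(D)$ in $W_r^l(D)$,

\[ W_r^{-1}(D) = \Bigl\{ v = v_0 + \sum_{i=1}^{3} \frac{\partial v_i}{\partial x_i} : v_i \in L^r(D),\ i = 0, 1, 2, 3 \Bigr\}, \]

\[ H^l(D) = W_2^l(D), \quad H_0^l(D) = W_{2,0}^l(D), \quad H^{-1}(D) = W_2^{-1}(D); \]

these spaces are endowed with their respective usual norms.
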

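Next let

\[ \mathcal{V} = \{ v \in \mathcal{D}(D) : \operatorname{div} v = 0 \}. \]

Denote by $V$ the closure of $\mathcal{V}$ in $(H^1(D))^3$ and by $H$ the closure of $\mathcal{V}$ in $(L^2(D))^3$.
$V$ and $H$ are Hilbert spaces with norms $\|\cdot\|_V$ and $\|\cdot\|_H$, respectively. We denote
the Euclidean norm by $|\cdot|$.
Since the boundary of $D$ is Lipschitz, the following characterizations
of $V$ and $H$ hold:

\[ V = \{ v \in (H^1(D))^3 : \operatorname{div} v = 0 \text{ in } D \text{ and } v|_{\partial D} = 0 \}, \]
\[ H = \{ v \in (L^2(D))^3 : \operatorname{div} v = 0 \text{ in } D \text{ and } v|_{\partial D} \cdot n = 0 \}, \]

where $v|_{\partial D}$ denotes the trace of $v$ on $\partial D$ and $n$ is a vector normal to $\partial D$.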


The inner product in $H$ is induced by the inner product $(\cdot, \cdot)$ in $L^2(D)$. We
denote by $\langle \cdot, \cdot \rangle$ the duality pairing between $V$ and $V'$, the dual of $V$.
We denote by $(\cdot, \cdot)_D$ the duality product in all function spaces on $D$.
In particular,

\[ (v, w)_D = \int_D v(x) w(x)\, dx, \]

if $v \in L^p(D)$ and $w \in L^{p'}(D)$, $p^{-1} + (p')^{-1} = 1$.

We recall some properties of products in Sobolev spaces $W_p^1(D)$, $p \ge 1$; the
Sobolev conjugate $p^*$ is given by $\frac{1}{p^*} = \frac{1}{p} - \frac{1}{3}$ for $p < 3$; $p^*$ is any finite
nonnegative real number if $p = 3$, and $p^* = \infty$ if $p > 3$.
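
The three cases in the definition of $p^*$ can be checked by hand; the following is a minimal sketch, assuming the three-dimensional setting of the text. The function name `sobolev_conjugate` and the convention of returning `None` for the borderline case $p = 3$ are choices made here for illustration only.

```python
from fractions import Fraction
import math


def sobolev_conjugate(p, dim=3):
    """Sobolev conjugate p* of p >= 1 in dimension `dim`.

    p < dim : 1/p* = 1/p - 1/dim (returned as an exact Fraction)
    p > dim : p* = infinity
    p == dim: any finite exponent is admissible, signalled here by None
    """
    p = Fraction(p)
    if p < dim:
        return 1 / (1 / p - Fraction(1, dim))
    if p > dim:
        return math.inf
    return None


print(sobolev_conjugate(1))  # 3/2
print(sobolev_conjugate(2))  # 6
print(sobolev_conjugate(4))  # inf
```

For instance, $p = 2$ gives $\frac{1}{p^*} = \frac12 - \frac13 = \frac16$, that is, the familiar embedding of $H^1(D)$ into $L^6(D)$ in three dimensions.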

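Lemma 1.

(i) For $1 \le p \le q \le \infty$, the product

\[ W_p^1(D) \times W_q^1(D) \to W_r^1(D) \]

is continuous if $r \ge 1$ and $\frac{1}{r} = \frac{1}{p} + \frac{1}{q^*}$.
(ii) For $1 \le p \le \infty$, $1 \le q \le \infty$, the product

\[ W_p^1(D) \times W_q^{-1}(D) \to W_r^{-1}(D) \tag{6} \]

is continuous if $\frac{1}{p^*} + \frac{1}{q} \le 1$ and $\frac{1}{r} = \frac{1}{p^*} + \frac{1}{q}$.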
For a probability space $(\Omega, \mathcal{F}, P)$ and a Banach space $X$ we introduce the space
$L^p(\Omega, \mathcal{F}, P, L^q(0, T, X))$ $(1 \le p, q \le \infty)$ of random functions defined on $\Omega$ with values
in $L^q(0, T, X)$. We endow $L^p(\Omega, \mathcal{F}, P, L^q(0, T, X))$ with the norm

\[ \|\varphi\|_{L^p(\Omega, \mathcal{F}, P, L^q(0,T,X))} = \bigl( E \|\varphi(\omega, \cdot, \cdot)\|^p_{L^q(0,T,X)} \bigr)^{1/p}. \]

We shall need in the sequel some important compactness results that we formulate
now. The proofs of these results can be found in the given references. We
have [32, Chap. 1, Lemma 1.3].

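Lemma 2. Let $(g_\mu)_{\mu = 1, 2, \ldots}$ and $g$ be some functions in $L^q(0, T, L^q(D))$ with
$q \in (1, \infty)$ such that

\[ \|g_\mu\|_{L^q(0,T,L^q(D))} \le C, \]

and, as $\mu \to \infty$,

\[ g_\mu \to g \quad \text{for almost all } (x, t) \in Q_T. \]

Then $g_\mu$ weakly converges to $g$ in $L^q(0, T, L^q(D))$.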

Remark 3. The results of the lemma hold for the space $L^q(\Omega, \mathcal{F}, P, L^q(0, T, D))$
in $Q_T$.
The next result is a sharper version of a theorem of Aubin (cf. [32, Chap. 1,
Par. 5]) due to Simon [51, Sec. 8, Theorem 5].

Lemma 4. Let $X$, $B$ and $Y$ be some Banach spaces such that $X$ is compactly
embedded into $B$ and let $B$ be a subset of $Y$. For any $1 \le p, q \le \infty$ and $0 < s \le 1$,
let $E$ be a set bounded in $L^q(0, T, X) \cap N^{s,p}(0, T, Y)$, where

\[ N^{s,p}(0, T, Y) = \Bigl\{ v \in L^p(0, T, Y) : \sup_{h > 0} h^{-s} \| v(t + h) - v(t) \|_{L^p(0, T - h, Y)} < \infty \Bigr\}. \]

Then $E$ is relatively compact in $L^p(0, T, B)$.
We shall need in the sequel two deep results due to Prokhorov and Skorokhod.
We begin by introducing the concept of tightness of probability measures.
Let $E$ be a separable Banach space and let $\mathcal{B}(E)$ be its Borel $\sigma$-field.

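Definition 5. A family of probability measures $\mathcal{P}$ on $(E, \mathcal{B}(E))$ is tight if for any
$\varepsilon > 0$, there exists a compact set $K_\varepsilon \subset E$ such that

\[ \mu(K_\varepsilon) \ge 1 - \varepsilon \quad \text{for all } \mu \in \mathcal{P}. \]

A sequence of measures $\{\mu_n\}$ on $(E, \mathcal{B}(E))$ is weakly convergent to a measure $\mu$ if
for all continuous and bounded functions $\varphi$ on $E$

\[ \lim_{n \to \infty} \int_E \varphi(x)\, \mu_n(dx) = \int_E \varphi(x)\, \mu(dx). \]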
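
To make the tightness condition concrete, here is a minimal sketch on $E = \mathbb{R}$, which is not taken from the paper: the family of centred Gaussian laws $N(0, \sigma^2)$ with $\sigma \le \sigma_{\max}$ is tight, because a single interval $[-R, R]$ carries mass at least $1 - \varepsilon$ for every member. The helper names `gaussian_mass` and `tight_radius` and the doubling search are assumptions made for illustration.

```python
import math


def gaussian_mass(radius, sigma):
    """P(|X| <= radius) for X ~ N(0, sigma^2)."""
    return math.erf(radius / (sigma * math.sqrt(2.0)))


def tight_radius(eps, sigma_max):
    """First R in 1, 2, 4, ... with P(|X| <= R) >= 1 - eps for sigma = sigma_max.

    The mass of [-R, R] grows as sigma shrinks, so the same compact set
    K_eps = [-R, R] works for every sigma <= sigma_max.
    """
    radius = 1.0
    while gaussian_mass(radius, sigma_max) < 1.0 - eps:
        radius *= 2.0
    return radius


eps = 1e-3
R = tight_radius(eps, sigma_max=2.0)
# Every member of the family puts mass at least 1 - eps on [-R, R].
for sigma in (0.1, 0.5, 1.0, 2.0):
    assert gaussian_mass(R, sigma) >= 1.0 - eps
```

If the standard deviations were unbounded, no single radius would work, and the family would fail to be tight.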

The following result due to Prokhorov [45] shows that the tightness property is
a compactness criterion.
Lemma 6. A sequence of measures $\{\mu_n\}$ on $(E, \mathcal{B}(E))$ is tight if and only if it is
relatively compact, that is, there exists a subsequence $\{\mu_{n_k}\}$ which weakly converges
to a probability measure $\mu$.

Skorokhod proves in [55] the next result, which relates the weak convergence
of probability measures to the almost everywhere convergence of random
variables.

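Lemma 7. For an arbitrary sequence of probability measures $\{\mu_n\}$ on $(E, \mathcal{B}(E))$
weakly convergent to a probability measure $\mu$, there exist a probability space
$(\Omega, \mathcal{F}, P)$ and random variables $\xi, \xi_1, \ldots, \xi_n, \ldots$ with values in $E$ such that the
probability law of $\xi_n$,

\[ \mathcal{L}(\xi_n)(A) = P\{ \omega \in \Omega : \xi_n(\omega) \in A \}, \quad \text{for all } A \in \mathcal{F}, \]

is $\mu_n$, the probability law of $\xi$ is $\mu$, and

\[ \lim_{n \to \infty} \xi_n = \xi, \quad P\text{-a.s.} \]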

We borrowed the presentation of these lemmas from [13].


We now formulate the conditions on $f$ and $g$.
We assume that $f : (0, T) \times H \to V'$ is a nonlinear mapping:

(i) continuous in both its variables,
(ii) there exists a positive constant $C$ such that

\[ \| f(t, v) \|_{V'} \le C (1 + \| v \|_H). \tag{7} \]

We assume that $g : (0, T) \times H \to H^l$ is a nonlinear mapping:

(i) continuous in both its variables,
(ii) there exists a positive constant $C$ such that

\[ \| g(t, v) \|_{H^l} \le C (1 + \| v \|_H); \tag{8} \]

$H^l$ is the product of $l$ copies of the space $H$.

We state the following:

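Definition 8. A weak solution of (1)–(5) is a probabilistic system $(\Omega, \mathcal{F}, \mathcal{F}^t, P,
W, u, \rho)$ where

(i) $(\Omega, \mathcal{F}, P)$ is a probability space, $\mathcal{F}^t$ is a filtration on $(\Omega, \mathcal{F}, P)$,
(ii) $W(t)$ is an $l$-dimensional $\mathcal{F}^t$ standard Wiener process,
(iii) for almost every $t$, $u(t)$ and $\rho(t)$ are $\mathcal{F}^t$-measurable,
(iv) $u \in L^4(\Omega, \mathcal{F}, P, L^\infty(0, T, H)) \cap L^2(\Omega, \mathcal{F}, P, L^2(0, T, V))$,
$\rho \in L^\infty(\Omega, \mathcal{F}, P, L^\infty(0, T, L^\infty(D)))$,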
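(v) for any $\varphi \in V$, $\psi \in H^1(D)$

\[ \int_D (\rho u)(t) \cdot \varphi\, dx - \int_0^t \!\! \int_D \rho\, (u \cdot \nabla) \varphi \cdot u\, dx\, ds + \int_0^t \!\! \int_D \nabla u \cdot \nabla \varphi\, dx\, ds \]
\[ \quad = \int_D \rho_0 u_0 \cdot \varphi\, dx + \int_0^t \!\! \int_D \rho f(s, u) \cdot \varphi\, dx\, ds + \int_0^t \!\! \int_D \rho g(s, u) \cdot \varphi\, dx\, dW, \tag{9} \]
\[ \int_D \rho(t) \psi\, dx - \int_D \rho_0 \psi\, dx - \int_0^t \!\! \int_D \rho u \cdot \nabla \psi\, dx\, ds = 0, \tag{10} \]

and

\[ \rho(0) = \rho_0, \quad \int_D \rho(0) u(0) \cdot \varphi\, dx = \int_D \rho_0 u_0 \cdot \varphi\, dx \tag{11} \]

almost surely and for all $t \in [0, T]$.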

Our main result is

Theorem 9. Let the above conditions on $f$ and $g$ be satisfied and assume that
$u_0 \in H$, $\rho_0 \in L^\infty(D)$, $\rho_0 \ge 0$. Then there exists a solution of problem (1)–(5) in
the sense of Definition 8.

Remark 10. We hereby emphasize the fact that the initial conditions (5) are
understood in the sense of (11). Under the above estimates satisfied by $u$ and $\rho$,
and the integral identities (9) and (10), it can be shown as in the deterministic
case ([35, 54, Chap. 2]) that (11) holds almost surely. [54, Proposition 13, p. 1110]
shows that conditions (11) are equivalent to

\[ \Pi\bigl( \rho_0 (u(0) - u_0) \bigr) = \nabla q, \]

where $\Pi$ is the Leray projector and $q \in H^1(D)$. Therefore, unless $\rho(0)$ is constant,
this condition is weaker than the one usually required,

\[ \rho_0 (u(0) - u_0) = 0. \]

This means the velocity fields $u(0)$ and $u_0$ are equal outside the vacuum.

3. Semi-Galerkin Approximation and A Priori Estimates


3.1. The semi-Galerkin scheme
In this section, we introduce a semi-Galerkin approximation following [1, 24, 33, 54].
We obtain key a priori estimates for the approximating sequences of the presumed
solutions of our problem.
Let $A$ be the Stokes operator with domain $D(A) = H^2 \cap V$. We consider an
orthonormal basis of $D(A)$ consisting of the eigenvectors $w^1, \ldots, w^m, \ldots$ of $A$. We
denote the span of $w^1, \ldots, w^m$ by $V^m$. On the probability space $(\Omega, \mathcal{F}, P)$ with a
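given $l$-dimensional Wiener process $W$, we look for the pair of sequences $(\rho^m, u^m)$
($u^m$ is sought as a linear combination of $w^1, \ldots, w^m$, as will be made precise below)
satisfying the integral equation

\[ \int_D (\rho^m\, du^m)(t) \cdot v\, dx + \int_D \rho^m (u^m \cdot \nabla) u^m \cdot v\, dx\, dt + \int_D \nabla u^m \cdot \nabla v\, dx\, dt \]
\[ \quad = \int_D \rho^m f(t, u^m) \cdot v\, dx\, dt + \int_D \rho^m g(t, u^m) \cdot v\, dx\, dW \tag{12} \]

for all $v \in V^m$ and

\[ \frac{\partial \rho^m}{\partial t} + (u^m \cdot \nabla) \rho^m = 0 \quad \text{in } Q_T, \tag{13} \]
\[ u^m(0) = u_0^m, \quad \rho^m(0) = \rho_0^m \quad \text{in } D. \tag{14} \]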

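We assume that

$$u_0^m \in V^m, \qquad u_0^m \to u_0 \ \text{ in } (L^2(D))^3, \tag{15}$$

$$\rho_0^m \in C^1(\bar D), \qquad \rho_0^m \to \rho_0 \ \text{ in } L^\infty(D) \text{ weakly-star}, \tag{16}$$

$$\alpha_m = \frac1m + \inf_D \rho_0 \le \rho_0^m \le \frac1m + \sup_D \rho_0 = \beta_m. \tag{17}$$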

In solving (13) with the second initial condition in (14), we assume that $u^m$ exists
and let $y^m(\tau, t, x)$ be the flow of $u^m(\tau, \cdot)$; that is, $y^m$ is the solution of the Cauchy
problem

$$\frac{dy(\tau,t,x)}{d\tau} = u^m(\tau, y(\tau,t,x)), \qquad y(\tau,t,x)|_{\tau=t} = x. \tag{18}$$

By the method of characteristics, we have the representation

$$\rho^m(t,x) = \rho_0^m(y^m(0,t,x)) \tag{19}$$

for the requested solution. This implies that

$$0 < \alpha_m \le \rho^m(t,x) \le \beta_m. \tag{20}$$

We note that $\rho^m$ is a random function through the relations (18) and (19), which is
bounded above and below by deterministic values in (20).
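
The representation of the density along characteristics can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the velocity field (a rigid rotation, which is divergence free), the initial density `rho0` and the helper names are hypothetical and are not taken from the paper. It integrates the characteristic equation backwards in time with a classical Runge–Kutta scheme and evaluates the initial density at the foot of the characteristic, so the transported density inherits the lower and upper bounds of the initial density.

```python
import math

def velocity(tau, y):
    """Hypothetical divergence-free velocity field: rigid rotation about the origin."""
    return (-y[1], y[0])

def flow_back(t, x, steps=200):
    """Integrate dy/dtau = u(tau, y) from tau = t down to tau = 0 with y(t) = x (RK4)."""
    h = -t / steps
    tau, y = t, (x[0], x[1])
    for _ in range(steps):
        k1 = velocity(tau, y)
        k2 = velocity(tau + h / 2, (y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]))
        k3 = velocity(tau + h / 2, (y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]))
        k4 = velocity(tau + h, (y[0] + h * k3[0], y[1] + h * k3[1]))
        y = (y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        tau += h
    return y

# Assumed bounds of the hypothetical initial density below.
RHO_MIN, RHO_MAX = 0.5, 1.5

def rho0(y):
    """Hypothetical initial density with values in [RHO_MIN, RHO_MAX]."""
    return 1.0 + 0.5 * math.sin(3.0 * y[0]) * math.cos(2.0 * y[1])

def rho(t, x):
    """Density transported along characteristics: rho(t, x) = rho0(y(0; t, x))."""
    return rho0(flow_back(t, x))
```

Because `rho` only ever evaluates `rho0` at some point of the domain, its values cannot leave the interval `[RHO_MIN, RHO_MAX]`, whatever the (possibly random) velocity field is.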
For the existence of a solution $u^m$ to (12), we substitute the function $\rho^m$
from (19) in (12) and look for $u^m$ in the form of the expansion

$$u^m = \sum_{k=1}^m \gamma_k^m(t)\,w^k(x). \tag{21}$$

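Substituting $v = w^1, \dots, w^m$ successively into (12) we obtain a system of ordinary
stochastic differential equations for the coefficients $\gamma_k^m(t)$:

$$\sum_{k=1}^m \int_D \rho^m w^k\cdot w^l\,dx\; d\gamma_k^m(t) + \sum_{j,k=1}^m \int_D \rho^m (w^j\gamma_j^m\cdot\nabla)w^k\gamma_k^m\cdot w^l\,dx\,dt - \int_D \rho^m f\Big(t, \sum_{j=1}^m w^j\gamma_j^m\Big)\cdot w^l\,dx\,dt + \sum_{j=1}^m \gamma_j^m(t)\int_D \nabla w^j : \nabla w^l\,dx\,dt = \int_D \rho^m g\Big(t, \sum_{j=1}^m w^j\gamma_j^m(t)\Big)\cdot w^l\,dx\,d\bar W, \qquad l = 1, \dots, m. \tag{22}$$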

The matrix

$$\Big(\int_D \rho^m w^k\cdot w^l\,dx\Big)_{k,l=1}^m$$

is non-degenerate since the family $\{\sqrt{\rho^m}\,w^k\}_{k=1,\dots,m}$ is linearly independent; in view of (20), $\rho^m > 0$.
Thus (22) can be reduced to the canonical form

$$d\gamma_l^m(t) + F_l^m(t, \gamma_1^m, \dots, \gamma_m^m)\,dt = G_l^m(t, \gamma_1^m, \dots, \gamma_m^m)\,d\bar W, \tag{23}$$

with the initial conditions

$$\gamma_l^m(0) = \gamma_{l0}^m, \tag{24}$$

where $\gamma_{l0}^m$ are the coefficients in the expansion

$$u_0^m = \sum_{k=1}^m \gamma_{k0}^m\,w^k.$$

In view of the conditions on f and g, the functions $F_l^m$ and $G_l^m$ are continuous
in their variables $t, \gamma_1^m, \dots, \gamma_m^m$. Thus, thanks to an existence result for systems
of stochastic ordinary differential equations due to Skorokhod [56, Theorem 2,
Chap. 5], a local solution of (23) exists on an interval $[0, T_m]$. Therefore, for any
$t \in [0, T_m]$, the representation (21) holds. The existence over the whole interval
$[0, T]$ will follow from uniform a priori estimates in the next subsection.

3.2. The a priori estimates


We now proceed to the task of deriving the needed a priori estimates.
Substituting $v = w^k$ into (12), multiplying the resulting relation by $\gamma_k^m(t)$ and
summing over $k = 1, \dots, m$, we get

$$\int_D (u^m\cdot\rho^m\,du^m)(t)\,dx + \int_D \rho^m (u^m\cdot\nabla)u^m\cdot u^m\,dx\,dt + \int_D \nabla u^m : \nabla u^m\,dx\,dt = \int_D \rho^m f(t,u^m)\cdot u^m\,dx\,dt + \int_D \rho^m g(t,u^m)\cdot u^m\,dx\,d\bar W. \tag{25}$$

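We introduce the stopping times

$$\tau_N = \begin{cases} \inf\{t > 0 : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\}, & \text{if } \{\bar\omega : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\} \ne \emptyset, \\ \infty, & \text{if } \{\bar\omega : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\} = \emptyset. \end{cases}$$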

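Applying Itô's formula to

$$\int_D \rho^m u^m\cdot u^m\,dx,$$

we deduce from (25) that

$$d\int_D |\sqrt{\rho^m}\,u^m|^2\,dx = \int_D \frac{\partial\rho^m(s)}{\partial s}\,u^m\cdot u^m\,dx\,ds - 2\int_D u^m\cdot Au^m\,dx\,ds - 2\int_D \rho^m u^m\cdot(u^m\cdot\nabla)u^m\,dx\,ds + 2\int_D \rho^m u^m\cdot[f(s,u^m)\,ds + g(s,u^m)\,d\bar W]\,dx + \int_D |\sqrt{\rho^m}\,g(s,u^m)|^2\,dx\,ds, \tag{26}$$

where $s \in [0, t\wedge\tau_N]$, $t \in [0, T_m]$, $t\wedge\tau_N = \min\{t, \tau_N\}$.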


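We have

$$\int_D \operatorname{div}(u^m\cdot u^m\,\rho^m u^m)\,dx = \int_D [u^m\cdot u^m\,\operatorname{div}(\rho^m u^m) + \rho^m u^m\cdot\nabla(u^m\cdot u^m)]\,dx = \int_D [u^m\cdot u^m\,(u^m\cdot\nabla)\rho^m + 2\rho^m u^m\cdot(u^m\cdot\nabla)u^m]\,dx,$$

where in the last step we made use of the divergence freeness of $u^m$. The left-hand
side is equal to zero in view of the vanishing of $u^m$ on $(0, T)\times\partial D$. Hence from (13),
we have

$$\int_D \frac{\partial\rho^m(s)}{\partial s}\,u^m\cdot u^m\,dx = -\int_D u^m\cdot u^m\,(u^m\cdot\nabla)\rho^m\,dx = 2\int_D u^m\cdot\rho^m(u^m\cdot\nabla)u^m\,dx. \tag{27}$$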

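Thus substituting the right-hand side of (27) into (26), we get for all $s \in [0, t\wedge\tau_N]$

$$\|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2 + 2\int_0^s \|u^m(r)\|_V^2\,dr \le \|\sqrt{\rho_0^m}\,u_0^m\|_{L^2(D)}^2 + \int_0^s 2|\langle u^m, \rho^m f(r,u^m)\rangle|\,dr + \int_0^s \|\sqrt{\rho^m(r)}\,g(r,u^m)\|_{L^2(D)}^2\,dr + 2\Big|\int_0^s (u^m, \rho^m g(r,u^m))\,d\bar W\Big|. \tag{28}$$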

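Taking the supremum on both sides of (28) over the interval $[0, t\wedge\tau_N]$, followed by the
expectation, we have

$$\bar E\sup_{0\le s\le t\wedge\tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2 + 2\bar E\int_0^{t\wedge\tau_N} \|u^m(s)\|_V^2\,ds \le \bar E\|\sqrt{\rho_0^m}\,u_0^m\|_{L^2(D)}^2 + \bar E\int_0^{t\wedge\tau_N} 2|\langle u^m, \rho^m f(s,u^m)\rangle|\,ds + \bar E\int_0^{t\wedge\tau_N} \|\sqrt{\rho^m(s)}\,g(s,u^m)\|_{L^2(D)}^2\,ds + 2\bar E\Big|\int_0^{t\wedge\tau_N} (u^m, \rho^m g(s,u^m))\,d\bar W\Big|. \tag{29}$$

We estimate the terms on the right-hand side of this inequality. By Young's inequality
and the conditions on f, we have for any $\varepsilon > 0$

$$\int_0^{t\wedge\tau_N} 2\langle u^m, \rho^m f(s,u^m)\rangle\,ds \le \varepsilon\int_0^{t\wedge\tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2\,ds + C_\varepsilon\int_0^{t\wedge\tau_N} \|\sqrt{\rho^m(s)}\,f(s,u^m)\|_{L^2(D)}^2\,ds \le C_\varepsilon\int_0^{t\wedge\tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2\,ds + C_\varepsilon. \tag{30}$$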

Similarly in view of the conditions on g we have

$$\int_0^{t\wedge\tau_N} \|\sqrt{\rho^m(s)}\,g(s,u^m)\|_{L^2(D)}^2\,ds \le C\int_0^{t\wedge\tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2\,ds + C. \tag{31}$$

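We now estimate the stochastic integral in (28). We have for any $\varepsilon > 0$,

$$\begin{aligned} \bar E\sup_{0\le s\le t\wedge\tau_N}\Big|\int_0^s 2(\rho^m(r)g(r,u^m(r)), u^m(r))\,d\bar W\Big| &\le C\,\bar E\Big(\int_0^{t\wedge\tau_N} (\rho^m(s)g(s,u^m(s)), u^m(s))^2\,ds\Big)^{1/2} \\ &\le C\,\bar E\Big(\int_0^{t\wedge\tau_N} \|\rho^m(s)\|_{L^\infty(D)}(1 + \|u^m(s)\|_{L^2(D)}^2)\,\|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2\,ds\Big)^{1/2} \\ &\le C\,\bar E\sup_{0\le s\le t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}\Big(\int_0^{t\wedge\tau_N} \|\rho^m(s)\|_{L^\infty(D)}(1 + \|u^m(s)\|_{L^2(D)}^2)\,ds\Big)^{1/2} \\ &\le \varepsilon\,\bar E\sup_{0\le s\le t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2 + C_\varepsilon\,\bar E\int_0^{t\wedge\tau_N} (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2)\,ds. \end{aligned} \tag{32}$$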

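Substituting the inequalities (30)–(32) into (29) we get for sufficiently small $\varepsilon > 0$

$$\bar E\sup_{0\le s\le t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2 + 2\bar E\int_0^{t\wedge\tau_N}\|u^m(s)\|_V^2\,ds \le \bar E\|\sqrt{\rho_0^m}\,u_0^m\|_{L^2(D)}^2 + C\,\bar E\int_0^{t\wedge\tau_N}(1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2)\,ds.$$

In view of Gronwall's inequality, it follows that

$$\bar E\sup_{0\le s\le t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2 + 2\bar E\int_0^{t\wedge\tau_N}\|u^m(s)\|_V^2\,ds \le C.$$

As $N \to \infty$, $t\wedge\tau_N \to t$. Thus passing to the limit in this inequality, we find that

$$\bar E\sup_{0\le s\le t}\|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2 + 2\bar E\int_0^t\|u^m(s)\|_V^2\,ds \le C, \qquad t \in [0, T_m]. \tag{33}$$

Since the constant $C$ is independent of $m$, we have $T_m = T$.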


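Applying Itô's formula to Eq. (26) with $p \ge 1$, we get

$$\begin{aligned} d\|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)}^p &+ p\,\|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)}^{p-2}\,\|u^m(t)\|_V^2\,dt \\ &= \frac p2\,\|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)}^{p-2}\,\{2\langle u^m, \rho^m f(t,u^m)\rangle + \|\sqrt{\rho^m(t)}\,g(t,u^m)\|_{L^2(D)}^2\}\,dt \\ &\quad + p\,\|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)}^{p-2}\,(u^m, \rho^m g(t,u^m))\,d\bar W \\ &\quad + \frac p2\Big(\frac p2 - 1\Big)\|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)}^{p-4}\,(u^m, \rho^m g(t,u^m))^2\,dt, \qquad t \in [0, T]. \end{aligned}$$

Integrating this equation over $[0, t]$ and squaring the resulting equation we get

$$\|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)}^{2p} + p^2\Big(\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-2}\,\|u^m(s)\|_V^2\,ds\Big)^2 \le C\{\|\sqrt{\rho_0^m}\,u_0^m\|_{L^2(D)}^{2p} + I_1 + I_2 + I_3 + I_4\}, \tag{34}$$

where

$$\begin{aligned} I_1 &= \Big(\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-2}\,\langle u^m(s), \rho^m(s)f(s,u^m(s))\rangle\,ds\Big)^2, \\ I_2 &= \Big(\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-2}\,\|\sqrt{\rho^m(s)}\,g(s,u^m(s))\|_{L^2(D)}^2\,ds\Big)^2, \\ I_3 &= \Big(\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-2}\,(u^m(s), \rho^m(s)g(s,u^m(s)))\,d\bar W\Big)^2, \\ I_4 &= \Big(\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-4}\,(u^m(s), \rho^m(s)g(s,u^m(s)))^2\,ds\Big)^2. \end{aligned}$$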

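The following inequalities readily follow:

$$I_1 + I_2 \le C\Big(\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-2}(1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^2)\,ds\Big)^2 \le C\int_0^t (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{2p})\,ds,$$

$$I_4 \le C\Big(\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-4}(1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^4)\,ds\Big)^2 \le C\int_0^t (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{2p})\,ds.$$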

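For the estimation of $I_3$, we use the martingale inequality

$$\begin{aligned} \bar E\sup_{0\le t\le T}\Big|\int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{p-2}(u^m(s), \rho^m(s)g(s,u^m(s)))\,d\bar W\Big|^2 &\le C\,\bar E\int_0^T \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{2p-4}(u^m(s), \rho^m(s)g(s,u^m(s)))^2\,ds \\ &\le C\,\bar E\int_0^T \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{2p-4}(1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^4)\,ds \\ &\le C\,\bar E\int_0^T (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)}^{2p})\,ds. \end{aligned}$$

In view of these estimates and (34), making use of Gronwall's inequality, we obtain

$$\bar E\sup_{0\le t\le T}\|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)}^{2p} \le C, \qquad p \ge 1. \tag{35}$$

Raising both sides of (28) to the power $p \ge 1$, and using the above inequality (35),
we also get along the previous lines

$$\bar E\Big(\int_0^T \|u^m(s)\|_V^2\,ds\Big)^p \le C. \tag{36}$$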

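Our next task is to estimate some increments in time of $u^m$ and $\rho^m$ in the space
$V'$. But before that let us make a few remarks.
In view of estimate (35), for any $p \ge 1$, and the fact that $\rho^m \in L^\infty(0, T, L^\infty(D))$,

$$\rho^m u^m \in L^{2p}(\bar\Omega, \bar{\mathcal F}, \bar P, L^\infty(0, T, (L^2(D))^3)). \tag{37}$$

Thus

$$\nabla\cdot(\rho^m u^m) \in L^{2p}(\bar\Omega, \bar{\mathcal F}, \bar P, L^\infty(0, T, H^{-1}(D)))$$

and by (13), it follows that

$$\frac{\partial\rho^m}{\partial t} \in L^{2p}(\bar\Omega, \bar{\mathcal F}, \bar P, L^\infty(0, T, H^{-1}(D))). \tag{38}$$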

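Also by (36), for all $p \ge 1$,

$$u^m \in L^p(\bar\Omega, \bar{\mathcal F}, \bar P, L^2(0, T, V)).$$

Thus in view of the Sobolev embedding

$$V \hookrightarrow (L^6(D))^3$$

we have

$$u^m \in L^p(\bar\Omega, \bar{\mathcal F}, \bar P, L^2(0, T, (L^6(D))^3)), \tag{39}$$

and

$$\rho^m u^m \in L^p(\bar\Omega, \bar{\mathcal F}, \bar P, L^2(0, T, (L^6(D))^3)). \tag{40}$$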
Recall the following result due to Riesz and Thorin (cf. [6, Theorem 1.1.1]).

Lemma 11. Let $T$ be a linear operator from $L^{p_1}(0, T)$ into $L^{p_2}(D)$ and from
$L^{q_1}(0, T)$ into $L^{q_2}(D)$ with $q_1 \ge p_1$ and $q_2 \le p_2$. Then for any $s \in (0, 1)$, $T$ maps
$L^{r_1}(0, T)$ into $L^{r_2}(D)$ with

$$r_1 = \frac{1}{s/p_1 + (1 - s)/q_1}, \qquad r_2 = \frac{1}{s/p_2 + (1 - s)/q_2}.$$
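Applying this lemma with $p_1 = 2$, $p_2 = 6$, $q_1 = \infty$, $q_2 = 2$ and $s = 3/4$, we get
from (37) and (40) that

$$\rho^m u^m,\ u^m \in L^p(\bar\Omega, \bar{\mathcal F}, \bar P, L^{8/3}(0, T, (L^4(D))^3)); \tag{41}$$

where we have also used the lemma with respect to $L^{2p}(\bar\Omega, \bar{\mathcal F}, \bar P) \otimes X$ and
$L^p(\bar\Omega, \bar{\mathcal F}, \bar P) \otimes X$.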
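
As a quick check of the exponents used here, the interpolation formula can be evaluated exactly with rational arithmetic. This is only an illustrative sketch; the helper name `interpolated_exponent` is hypothetical, and an infinite exponent is represented by `None`.

```python
from fractions import Fraction

def interpolated_exponent(p, q, s):
    """Return r with 1/r = s/p + (1 - s)/q; pass None for an infinite exponent."""
    inv_p = Fraction(0) if p is None else 1 / Fraction(p)
    inv_q = Fraction(0) if q is None else 1 / Fraction(q)
    inv_r = s * inv_p + (1 - s) * inv_q
    return None if inv_r == 0 else 1 / inv_r

s = Fraction(3, 4)
r1 = interpolated_exponent(2, None, s)  # time exponent: between L^2 and L^infinity
r2 = interpolated_exponent(6, 2, s)     # space exponent: between L^6 and L^2
print(r1, r2)
```

With $s = 3/4$ one gets $1/r_1 = 3/8$ and $1/r_2 = 1/8 + 1/8 = 1/4$, that is, the pair $(8/3, 4)$ appearing in the mixed space $L^{8/3}(0, T, (L^4(D))^3)$.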

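Next we have

$$\rho^m u^m \otimes u^m \in L^p(\bar\Omega, \bar{\mathcal F}, \bar P, L^{4/3}(0, T, (L^2(D))^9)), \tag{42}$$

and thus

$$\nabla\cdot(\rho^m u^m \otimes u^m) \in L^p(\bar\Omega, \bar{\mathcal F}, \bar P, L^{4/3}(0, T, (H^{-1}(D))^3)). \tag{43}$$

Indeed, applying Hölder's inequality we have for $k = 1, 2, 3$

$$\begin{aligned} \int_0^T \Big(\int_D (\rho^m u_k^m u_k^m)^2\,dx\Big)^{4/6} dt &\le \int_0^T \Big(\int_D (\rho^m u_k^m)^4\,dx\Big)^{4/12}\Big(\int_D (u_k^m)^4\,dx\Big)^{4/12} dt \\ &\le \Big(\int_0^T \Big(\int_D (\rho^m u_k^m)^4\,dx\Big)^{8/12} dt\Big)^{1/2}\Big(\int_0^T \Big(\int_D (u_k^m)^4\,dx\Big)^{8/12} dt\Big)^{1/2} \\ &\le C\Big[\int_0^T \Big(\int_D (\rho^m u_k^m)^4\,dx\Big)^{8/12} dt + \int_0^T \Big(\int_D (u_k^m)^4\,dx\Big)^{8/12} dt\Big]. \end{aligned}$$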

The integrals on the right-hand side are bounded for a.e. $\bar\omega$ in view of the
estimates (41). The sought estimates thus follow.
Recalling the definition of the norm of $V'$, we have

$$\|(\rho^m u^m)(t + \theta) - (\rho^m u^m)(t)\|_{V'} = \sup_{v \in V : \|v\|_V = 1} \int_D [(\rho^m u^m)(t + \theta) - (\rho^m u^m)(t)]\cdot v\,dx.$$

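Thus owing to the integral identity (12), we have

$$\bar E\int_0^T \|(\rho^m u^m)(t + \theta) - (\rho^m u^m)(t)\|_{V'}^2\,dt = \bar E\int_0^T \Big\|\int_t^{t+\theta} d(\rho^m u^m)\Big\|_{V'}^2 dt \le C\,\bar E\int_0^T [R_1(t) + R_2(t) + R_3(t) + R_4(t)]\,dt, \tag{44}$$

where

$$R_1(t) = \Big\|\int_t^{t+\theta} \nabla\cdot(\rho^m u^m \otimes u^m)\,ds\Big\|_{V'}^2, \qquad R_2(t) = \Big\|\int_t^{t+\theta} \Delta u^m\,ds\Big\|_{V'}^2,$$

$$R_3(t) = \Big\|\int_t^{t+\theta} \rho^m f(s,u^m)\,ds\Big\|_{V'}^2, \qquad R_4(t) = \Big\|\int_t^{t+\theta} \rho^m g(s,u^m)\,d\bar W\Big\|_{V'}^2.$$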

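We have

$$R_1^{1/2} = \sup\Big\{\int_D \int_t^{t+\theta} \nabla\cdot(\rho^m u^m \otimes u^m)\,ds\cdot\varphi(x)\,dx : \varphi \in V,\ \|\varphi\|_V = 1\Big\} \le C\int_t^{t+\theta} \|\rho^m u^m \otimes u^m\|_{L^2(D)}\,ds.$$

Then in view of (42)

$$\bar E\int_0^T R_1(t)\,dt \le C\theta^{1/2}\,\bar E\Big(\int_0^T\!\!\int_t^{t+\theta} \|\rho^m u^m \otimes u^m\|_{L^2(D)}^{4/3}\,ds\,dt\Big)^{3/4} \le C\theta^{1/2}.$$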

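Next using (33), we get

$$\bar E\int_0^T R_2(t)\,dt \le \bar E\int_0^T \Big(\int_t^{t+\theta} \|\nabla u^m\|_{L^2(D)}\,ds\Big)^2 dt \le \theta\,\bar E\int_0^T\!\!\int_t^{t+\theta} \|\nabla u^m\|_{L^2(D)}^2\,ds\,dt \le C\theta.$$

Using the conditions on f and estimate (35), we have

$$\bar E\int_0^T R_3(t)\,dt \le C\,\bar E\int_0^T \Big(\int_t^{t+\theta} \|\rho^m(s)\|_{L^\infty(D)}(1 + \|u^m(s)\|_{L^2(D)})\,ds\Big)^2 dt \le C\theta\,\bar E\,\|\rho^m\|_{L^\infty(0,T,L^\infty(D))}^2 \int_0^T\!\!\int_t^{t+\theta} (1 + \|u^m(s)\|_{L^2(D)}^2)\,ds\,dt \le C\theta.$$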

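For the stochastic integral we use the martingale inequality. We have

$$\begin{aligned} \bar E\int_0^T \Big\|\int_t^{t+\theta} \rho^m g(s,u^m)\,d\bar W\Big\|_{V'}^2 dt &\le \bar E\int_0^T \sup_{\varphi \in V : \|\varphi\|_V = 1}\Big(\int_t^{t+\theta}\!\!\int_D \rho^m g(s,u^m)\cdot\varphi(x)\,dx\,d\bar W\Big)^2 dt \\ &\le \bar E\int_0^T \sup_{\varphi \in V : \|\varphi\|_V = 1}\Big(\int_t^{t+\theta} \Big(\int_D \rho^m g(s,u^m)\cdot\varphi(x)\,dx\Big)^2 ds\Big)\,dt \\ &\le \bar E\int_0^T \Big(\int_t^{t+\theta}\!\!\int_D [\rho^m g(s,u^m)]^2\,dx\,ds\Big)\,dt \\ &\le \bar E\int_0^T \Big(\int_t^{t+\theta} \|\rho^m\|_{L^\infty(D)}^2\,\|g(s,u^m)\|_{(L^2(D))^l}^2\,ds\Big)\,dt \\ &\le C\theta\int_0^T \bar E\sup_{0\le t\le T} \|\rho^m\|_{L^\infty(D)}^2 (1 + \|u^m\|_H^2)\,dt \\ &\le C\theta; \end{aligned}$$

at some steps we made use of Fubini's theorem and the estimate (35). Combining
the estimates that we have just derived with (44) we get the crucial estimate

$$\bar E\int_0^T \|(\rho^m u^m)(t + \theta) - (\rho^m u^m)(t)\|_{V'}^2\,dt \le C\theta^{1/2}. \tag{45}$$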

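We also need to show that

$$\bar E\int_0^T \|u^m(t + \theta) - u^m(t)\|_{W_{3/2}^{-1}(D)}^2\,dt \le C\theta^{1/2}. \tag{46}$$

We note that

$$\rho^m(t + \theta)(u^m(t + \theta) - u^m(t)) = (\rho^m u^m)(t + \theta) - (\rho^m u^m)(t) - u^m(t)(\rho^m(t + \theta) - \rho^m(t)). \tag{47}$$

Let us estimate

$$\bar E\int_0^T \|u^m(t)(\rho^m(t + \theta) - \rho^m(t))\|_{W_{3/2}^{-1}(D)}^2\,dt.$$

We have, by (38), (33) and (6) ($W_2^1(D) \times W_2^{-1}(D) \subset W_{3/2}^{-1}(D)$), that

$$\int_0^T \|u^m(t)(\rho^m(t + \theta) - \rho^m(t))\|_{W_{3/2}^{-1}(D)}^2\,dt \le \int_0^T \|u^m(t)\|_V^2 \Big(\int_t^{t+\theta} \Big\|\frac{\partial\rho^m(s)}{\partial s}\Big\|_{V'} ds\Big)^2 dt \le C\theta^2\,\Big\|\frac{\partial\rho^m(s)}{\partial s}\Big\|_{L^\infty(0,T,V')}^2 \int_0^T \|u^m(t)\|_V^2\,dt.$$

Taking mathematical expectation in this inequality and using (36) and (38), we get

$$\bar E\int_0^T \|u^m(t)(\rho^m(t + \theta) - \rho^m(t))\|_{W_{3/2}^{-1}(D)}^2\,dt \le C\theta^2. \tag{48}$$

Combining (45), (48) and (47), we get

$$\bar E\int_0^T \|\rho^m(t + \theta)(u^m(t + \theta) - u^m(t))\|_{W_{3/2}^{-1}(D)}^2\,dt \le C\theta^{1/2}.$$

This implies (46).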


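We are left with another key estimate on the function

$$\Phi^m(t) = \int_D \rho^m(t)u^m(t)\cdot v\,dx$$

for $v \in V$. We claim that

$$\bar E\,\|\Phi^m(t + h) - \Phi^m(t)\|_{C([0,T])} \le c\,h^{1/4}. \tag{49}$$

We have from (12),

$$\bar E|\Phi^m(t + h) - \Phi^m(t)| \le \bar E\Big|\int_t^{t+h}\!\!\int_D \rho^m (u^m\cdot\nabla)u^m\cdot v\,dx\,ds\Big| + \bar E\Big|\int_t^{t+h}\!\!\int_D \nabla u^m : \nabla v\,dx\,ds\Big| + \bar E\Big|\int_t^{t+h}\!\!\int_D \rho^m f(s,u^m)\cdot v\,dx\,ds\Big| + \bar E\Big|\int_t^{t+h}\!\!\int_D \rho^m g(s,u^m)\cdot v\,dx\,d\bar W\Big|.$$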

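In view of (42), we have

$$\sup_{t \in [0, T-h]} \bar E\Big|\int_t^{t+h}\!\!\int_D \rho^m (u^m\cdot\nabla)u^m\cdot v\,dx\,ds\Big| \le C h^{1/4}\,\|v\|_H\,\bar E\,\|\rho^m u^m \otimes u^m\|_{L^{4/3}(0,T,(L^2(D))^9)} \le C h^{1/4}.$$

Next, using (36), we have

$$\bar E\sup_{t \in [0, T-h]}\Big|\int_t^{t+h}\!\!\int_D \nabla u^m : \nabla v\,dx\,ds\Big| \le C h^{1/2}\,\|v\|_V\,\bar E\Big(\int_t^{t+h} \|u^m\|_V^2\,ds\Big)^{1/2} \le C h^{1/2}.$$

By similar arguments, we have by (35)

$$\bar E\Big|\int_t^{t+h}\!\!\int_D \rho^m f(s,u^m)\cdot v\,dx\,ds\Big| \le C h^{1/2}\,\|\rho^m\|_{L^\infty(Q)}\,\|v\|_H\,\bar E\sup_{s \in [t, t+h]} (1 + \|u^m(s)\|_H) \le C h^{1/2}.$$

Finally, using the martingale inequality we have

$$\sup_{t \in [0, T-h]} \bar E\Big|\int_t^{t+h}\!\!\int_D \rho^m g(s,u^m)\cdot v\,dx\,d\bar W\Big| \le \bar E\Big(\int_t^{t+h} \Big|\int_D \rho^m g(s,u^m)\cdot v\,dx\Big|^2 dt\Big)^{1/2} \le \bar E\,\sqrt h\,\|v\|_H\,\|\rho^m\|_{L^\infty(Q)} \sup_{s \in [t, t+h]} (1 + \|u^m(s)\|_H) \le C\sqrt h.$$

Hence summarizing these estimates we arrive at (49). Furthermore, in view of (35),
we have for any $p \ge 1$

$$\bar E\sup_{t \in [0,T]} |\Phi^m(t)|^p \le C\|v\|_H^p, \qquad \bar E\sup_{t \in [0,T]} \|\rho^m u^m\|_{L^2(D)}^p \le C. \tag{50}$$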

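We now summarize our key estimates in this section. For that we introduce the
spaces $X^k_{p,\mu_n,\nu_n}$ ($1 \le p < \infty$) ($k = 1, 2$) of random variables $y$ such that

(i) For $X^1_{p,\mu_n,\nu_n}$:

$$\bar E\sup_{0\le t\le T} \|y(t)\|_{L^2(D)}^{2p} \le C, \qquad \bar E\Big(\int_0^T \|y(s)\|_V^2\,ds\Big)^p \le C,$$

$$\bar E\sup_n \frac{1}{\nu_n}\sup_{|\theta|\le\mu_n} \int_0^T \|y(t + \theta) - y(t)\|_{W_{3/2}^{-1}(D)}^2\,dt \le C;$$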

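endowed with the norm

$$\|y\|_{X^1_{p,\mu_n,\nu_n}} = \Big(\bar E\sup_{0\le t\le T} \|y(t)\|_{L^2(D)}^{2p}\Big)^{1/2p} + \Big(\bar E\Big(\int_0^T \|y(s)\|_V^2\,ds\Big)^{p/2}\Big)^{2/p} + \bar E\sup_n \frac{1}{\nu_n}\Big(\sup_{|\theta|\le\mu_n} \int_0^T \|y(t + \theta) - y(t)\|_{W_{3/2}^{-1}(D)}^2\,dt\Big)^{1/2},$$

$X^1_{p,\mu_n,\nu_n}$ is a Banach space.

(ii) For $X^2_{p,\mu_n,\nu_n}$:

$$\bar E\,\|y\|_{L^{8/3}(0,T,(L^4(D))^3)}^p \le C, \qquad \bar E\sup_n \frac{1}{\nu_n}\sup_{|\theta|\le\mu_n} \int_0^T \|y(t + \theta) - y(t)\|_{V'}^2\,dt \le C;$$

endowed with the norm

$$\|y\|_{X^2_{p,\mu_n,\nu_n}} = \Big(\bar E\,\|y\|_{L^{8/3}(0,T,(L^4(D))^3)}^p\Big)^{1/p} + \bar E\sup_n \frac{1}{\nu_n}\Big(\sup_{|\theta|\le\mu_n} \int_0^T \|y(t + \theta) - y(t)\|_{V'}^2\,dt\Big)^{1/2},$$

$X^2_{p,\mu_n,\nu_n}$ is a Banach space.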

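We define $X_q^3$ ($q$ is any positive number) as the space of random variables $y$ such
that

$$\bar E\,\|y\|_{L^\infty(0,T,L^\infty(D))}^q \le C, \qquad \bar E\,\Big\|\frac{\partial y}{\partial t}\Big\|_{L^\infty(0,T,H^{-1}(D))}^q \le C;$$

endowed with the norm

$$\|y\|_{X_q^3} = \Big(\bar E\,\|y\|_{L^\infty(0,T,L^\infty(D))}^q\Big)^{1/q} + \Big(\bar E\,\Big\|\frac{\partial y}{\partial t}\Big\|_{L^\infty(0,T,H^{-1}(D))}^q\Big)^{1/q},$$

$X_q^3$ is a Banach space.

Finally we have the space $X^4_{p,\mu_n,\nu_n}$ of random variables $y$ such that

$$\bar E\,\|y\|_{L^\infty(0,T)}^p \le C, \qquad \sup_n \frac{1}{\nu_n}\sup_{|\theta|\le\mu_n} \bar E\,\|y(t + \theta) - y(t)\|_{C[0,T]} \le C,$$

which endowed with the norm

$$\|y\|_{X^4_{p,\mu_n,\nu_n}} = \Big(\bar E\,\|y\|_{L^\infty(0,T)}^p\Big)^{1/p} + \sup_n \frac{1}{\nu_n}\sup_{|\theta|\le\mu_n} \bar E\,\|y(t + \theta) - y(t)\|_{C[0,T]}$$

is a Banach space.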

Combining the estimates (20), (35), (36), (38), (41), (45), (46), (49) and (50),
we have

Theorem 12. For any $p \ge 1$ and for $\mu_n$, $\nu_n$ such that the series

$$\sum_{n=1}^\infty \frac{\mu_n^{1/4}}{\nu_n}$$

converges, the sequences $u^m$, $\rho^m u^m$, $\rho^m$ and $\Phi^m$ are bounded in $X^1_{p,\mu_n,\nu_n}$, $X^2_{p,\mu_n,\nu_n}$,
$X_q^3$ and $X^4_{p,\mu_n,\nu_n}$ for any n, respectively.

4. Tightness Property of Probability Measures Induced by Galerkin Solutions
We may rewrite Lemma 4 in the following more convenient form adapted to our
situation as in [4]. For any sequences $\mu_n$, $\nu_n$ which converge to zero as $n \to \infty$, and
any $1 \le p_k, q_k \le \infty$ ($k = 1, 2, 3, 4$), the set $Y^k_{\mu_n,\nu_n}$ of functions

$$y \in L^{q_k}(0, T, X_k) \cap N^{p_k}_{\mu_n,\nu_n}(0, T, Y_k),$$

where $N^{p_k}_{\mu_n,\nu_n}(0, T, Y_k)$ is the set

$$\Big\{v \in L^{p_k}(0, T, Y_k) : \sup_n \frac{1}{\nu_n}\sup_{|\theta|\le\mu_n} \|v(t + \theta) - v(t)\|_{L^{p_k}(0,T-\theta,Y_k)} < \infty\Big\},$$

is relatively compact in $L^{p_k}(0, T, B_k)$; $X_k$, $B_k$ and $Y_k$ play respectively the role of
$X$, $B$ and $Y$ in Lemma 4.
Let $Y^1_{\mu_n,\nu_n}$ be the space with $X_1 = V$, $Y_1 = W_{3/2}^{-1}(D)$, $q_1 = 2$, $p_1 = 2$ and let
$B_1 = L^2(D)$. Let $Y^2_{\mu_n,\nu_n}$ be the space with $X_2 = L^4(D)$, $Y_2 = V'$, $q_2 = 8/3$, $p_2 = 8/3$
and $B_2 = W_2^{-\delta}(D)$ ($0 < \delta < 1$), $W_2^{-\delta}(D)$ being the interpolation space $[L^2(D) = W_2^0(D), H^{-1}(D)]_\delta$; we refer to [34] for the needed information. Also by [34, Theorem 16.1, Chap. 1], we have that $W_2^{-\delta}(D)$ is compactly embedded into $H^{-1}(D)$.
Let $Y^3_{\mu_n,\nu_n}$ be the space with $X_3 = L^\infty(D)$, $Y_3 = H^{-1}(D)$, $q_3 = \infty$, $p_3 = \infty$ and let
$B_3 = W_\infty^{-1}(D)$. Let $Y^4_{\mu_n,\nu_n}$ be the space with $X_4 = B_4 = Y_4 = \mathbb R$, $p_4 = q_4 = \infty$.
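Now we consider the set

$$S = C(0, T, \mathbb R^l) \times \prod_{k=1}^4 L^{p_k}(0, T, B_k)$$

and $\mathcal B(S)$ the $\sigma$-algebra of the Borel sets of $S$. For each $m$, let $\Psi$ be the map
$\Psi : \bar\Omega \to S$: $\bar\omega \mapsto (\bar W(\bar\omega, \cdot), u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Phi^m(\bar\omega, \cdot))$. Since the solution
is not unique in general this map is multivalued. However a selection can be made
to suit our needs. Precise arguments can be found in [5]. So we make use of the
map $\Psi$ modulo a selection. For each $m$, we introduce a probability measure $\Pi^m$ on
$(S, \mathcal B(S))$ by

$$\Pi^m(A) = \bar P(\Psi^{-1}(A)) \quad\text{for all } A \in \mathcal B(S).$$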

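The main result of this section is

Theorem 13. The family of probability measures $\{\Pi^m : m \in \mathbb N\}$ is tight.

Proof. For $\varepsilon > 0$ we should find the compact subsets

$$\Sigma_\varepsilon \subset C(0, T, \mathbb R^l), \qquad Y_\varepsilon \subset \prod_{k=1}^4 L^{p_k}(0, T, B_k)$$

such that

$$\bar P\{\bar\omega : \bar W(\bar\omega, \cdot) \notin \Sigma_\varepsilon\} \le \varepsilon/2, \tag{51}$$

$$\bar P\{\bar\omega : (u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Phi^m(\bar\omega, \cdot)) \notin Y_\varepsilon\} \le \varepsilon/2. \tag{52}$$

The quest for $\Sigma_\varepsilon$ is made by taking account of some facts about the Wiener process
such as the formula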

$$E|B(t_2) - B(t_1)|^{2j} = (2j - 1)!!\,(t_2 - t_1)^j, \qquad j = 1, 2, \dots. \tag{53}$$
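
For a one-dimensional standard Brownian motion the increment $B(t_2) - B(t_1)$ is Gaussian with mean zero and variance $t_2 - t_1$, so its even moments are $(2j - 1)!!\,(t_2 - t_1)^j$, where $!!$ denotes the double factorial. The short sketch below (the helper names are hypothetical) compares this expression with the closed form $(2j)!/(2^j j!)$ for Gaussian even moments; the case $j = 2$ gives the fourth moment $3(t_2 - t_1)^2$, which is the kind of bound used for the increments of the Wiener process in the tightness argument.

```python
import math

def double_factorial_odd(j):
    """(2j - 1)!! = 1 * 3 * 5 * ... * (2j - 1)."""
    result = 1
    for k in range(1, 2 * j, 2):
        result *= k
    return result

def gaussian_even_moment(j, variance):
    """E|X|^(2j) for X ~ N(0, variance), via (2j)! / (2^j j!) * variance^j."""
    return math.factorial(2 * j) // (2 ** j * math.factorial(j)) * variance ** j

def increment_moment(j, t1, t2):
    """E|B(t2) - B(t1)|^(2j) for a one-dimensional standard Brownian motion B."""
    return double_factorial_odd(j) * (t2 - t1) ** j
```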

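For a constant $L_\varepsilon$ depending on $\varepsilon$ to be chosen later and $n \in \mathbb N$, we consider the set

$$\Sigma_\varepsilon = \Big\{w(\cdot) \in C(0, T, \mathbb R^l) : \sup\{n|w(t_2) - w(t_1)| : t_1, t_2 \in [0, T],\ |t_2 - t_1| < n^{-6}\} \le L_\varepsilon\Big\}.$$

The set $\Sigma_\varepsilon$ is relatively compact in $C(0, T, \mathbb R^l)$ by the Arzelà–Ascoli theorem. Making use
of Markov's inequality

$$\bar P\{\bar\omega : \xi(\bar\omega) \ge \alpha\} \le \frac{1}{\alpha^k}\,\bar E[|\xi(\bar\omega)|^k]$$

for a random variable $\xi$ on $(\bar\Omega, \bar{\mathcal F}, \bar P)$ and positive numbers $\alpha$ and $k$, we get

$$\begin{aligned} \bar P\{\bar\omega : \bar W(\bar\omega, \cdot) \notin \Sigma_\varepsilon\} &\le \bar P\Big[\bigcup_n \Big\{\bar\omega : \sup_{t_1, t_2 \in [0,T],\ |t_2 - t_1| < n^{-6}} |\bar W(t_2) - \bar W(t_1)| > L_\varepsilon/n\Big\}\Big] \\ &\le \sum_{n=1}^\infty \sum_{i=0}^{n^6 - 1} \Big(\frac{n}{L_\varepsilon}\Big)^4\,\bar E\sup_{iTn^{-6} \le t \le (i+1)Tn^{-6}} |\bar W(t) - \bar W(iTn^{-6})|^4 \\ &\le C\sum_{n=1}^\infty \Big(\frac{n}{L_\varepsilon}\Big)^4 (Tn^{-6})^2 n^6 = \frac{C}{L_\varepsilon^4}\sum_{n=1}^\infty \frac{1}{n^2}. \end{aligned}$$

We choose

$$L_\varepsilon^4 = \frac{2C}{\varepsilon}\sum_{n=1}^\infty \frac{1}{n^2}$$

to get (51).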

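Next we choose $Y_\varepsilon$ as a ball of radius $M_\varepsilon$ in $Y^1_{\mu_n,\nu_n} \times Y^2_{\mu_n,\nu_n} \times Y^3_{\mu_n,\nu_n} \times Y^4_{\mu_n,\nu_n}$
centered at zero and with $\mu_n$, $\nu_n$ independent of $\varepsilon$, converging to zero and such that
$\sum_n \nu_n^{-1}\mu_n^{1/4}$ converges. As remarked above $Y_\varepsilon$ is a compact subset of

$$\prod_{k=1}^4 L^{p_k}(0, T, B_k).$$

We have further

$$\begin{aligned} \bar P\{\bar\omega : (u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Phi^m(\bar\omega, \cdot)) \notin Y_\varepsilon\} &\le \bar P\{\bar\omega : \|u^m\|_{Y^1_{\mu_n,\nu_n}} > M_\varepsilon\} + \bar P\{\bar\omega : \|\rho^m u^m\|_{Y^2_{\mu_n,\nu_n}} > M_\varepsilon\} \\ &\quad + \bar P\{\bar\omega : \|\rho^m\|_{Y^3_{\mu_n,\nu_n}} > M_\varepsilon\} + \bar P\{\bar\omega : \|\Phi^m\|_{Y^4_{\mu_n,\nu_n}} > M_\varepsilon\} \\ &\le \frac{1}{M_\varepsilon}\big(\bar E\|u^m\|_{Y^1_{\mu_n,\nu_n}} + \bar E\|\rho^m u^m\|_{Y^2_{\mu_n,\nu_n}} + \bar E\|\rho^m\|_{Y^3_{\mu_n,\nu_n}} + \bar E\|\Phi^m\|_{Y^4_{\mu_n,\nu_n}}\big) \\ &\le \frac{C}{M_\varepsilon}. \end{aligned}$$

Choosing $M_\varepsilon = 2C\varepsilon^{-1}$ we get (52). From (51) and (52), we have

$$\bar P\{\bar\omega : \bar W(\bar\omega, \cdot) \in \Sigma_\varepsilon;\ (u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Phi^m(\bar\omega, \cdot)) \in Y_\varepsilon\} \ge 1 - \varepsilon.$$

This proves that

$$\Pi^m(\Sigma_\varepsilon \times Y_\varepsilon) \ge 1 - \varepsilon, \qquad \forall\varepsilon > 0,$$

and hence the theorem.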

In view of the just proven tightness of $\{\Pi^m\}$ we have from Lemma 6 that there
exists a subsequence $\{\Pi^{m_j}\}$ and a measure $\Pi$ such that

$$\Pi^{m_j} \to \Pi$$

weakly. By Skorokhod's Lemma 7, there exist a probability space $(\Omega, \mathcal F, P)$ and
random variables $(W_{m_j}, u_{m_j}, \rho_{m_j}u_{m_j}, \rho_{m_j}, \Phi_{m_j})$, $(W, u, g, \rho, \Phi)$ on $(\Omega, \mathcal F, P)$ with
values in $S$ such that the probability law of $(W_{m_j}, u_{m_j}, \rho_{m_j}u_{m_j}, \rho_{m_j}, \Phi_{m_j})$ is $\Pi^{m_j}$;
hence $\{W_{m_j}\}$ is a sequence of $l$-dimensional Wiener processes. Furthermore

$$(W_{m_j}, u_{m_j}, \rho_{m_j}u_{m_j}, \rho_{m_j}, \Phi_{m_j}) \to (W, u, g, \rho, \Phi) \quad\text{in } S,\ P\text{-a.s.} \tag{54}$$

and the probability law of $(W, u, g, \rho, \Phi)$ is $\Pi$.

Set

$$\mathcal F^t = \sigma\{W(s), u(s), \rho(s)\}_{s \in [0,t]}.$$

We show that $W(t)$ is an $\mathcal F^t$-standard Wiener process. For this we use the following
characterization of Wiener processes through their characteristic functions (see [21])

which stipulates that for any $m \in \mathbb N$, $0 = t_0 < t_1 < \dots < t_m$ and $v_0, v_1, \dots, v_m$,

$$E\exp\Big\{\sum_{k=1}^m i v_k[W(t_k) - W(t_{k-1})] - i v_0 W(t_0)\Big\} = \exp\Big\{-\frac12\sum_{k=1}^m v_k^2 (t_k - t_{k-1})\Big\}. \tag{55}$$

(55) will follow if we can prove that for the conditional characteristic function
we have

$$E[\exp\{iv[W(t + h) - W(t)]\}/\mathcal F^t] = \exp\Big\{-\frac{v^2 h}{2}\Big\} \tag{56}$$

for all $h > 0$ and any $v$. Note that for any given $\sigma$-algebra $\mathcal F$ and random variables
$X$ and $Y$ on a probability space $(\Omega, \mathcal F, P)$ on which the mathematical expectation
is denoted by $E$, if $X$ is $\mathcal F$-measurable and $E|Y|, E|XY| < \infty$,

$$E(XY/\mathcal F) = X E(Y/\mathcal F), \qquad E E(Y/\mathcal F) = E(Y),$$

that is

$$E(XY) = E(X E(Y/\mathcal F)).$$

Using this fact we see that (56) will be proved if for any continuous bounded
functional $\varphi_t(W(\cdot), u(\cdot), \rho(\cdot))$ on $S$ depending only on the values of $W$, $u$ and $\rho$ on
the interval $(0, t)$, we have

$$E[\exp\{iv[W(t + h) - W(t)]\}\,\varphi_t(W(\cdot), u(\cdot), \rho(\cdot))] = \exp\Big\{-\frac{v^2 h}{2}\Big\}\,E\varphi_t(W(\cdot), u(\cdot), \rho(\cdot)). \tag{57}$$

Since $[W_{m_j}(t + h) - W_{m_j}(t)]$ are independent of $\varphi_t(W_{m_j}, u_{m_j}, \rho_{m_j})$ and $W_{m_j}$ is a
Wiener process,

$$E[\exp\{iv[W_{m_j}(t + h) - W_{m_j}(t)]\}\,\varphi_t(W_{m_j}, u_{m_j}, \rho_{m_j})] = E\exp\{iv[W_{m_j}(t + h) - W_{m_j}(t)]\}\,E\varphi_t(W_{m_j}, u_{m_j}, \rho_{m_j}) = \exp\Big\{-\frac{v^2 h}{2}\Big\}\,E\varphi_t(W_{m_j}, u_{m_j}, \rho_{m_j}).$$

In view of (54) and the continuity of $\varphi_t$, we can pass to the limit in this equality
and get (57).
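It can be shown that $W_{m_j}$, $u_{m_j}$, $\rho_{m_j}$ satisfy the approximating equations (12)
and (25) with $m$ replaced by $m_j$. In particular

$$\operatorname{div} u_{m_j} = 0, \tag{58}$$

$$\frac{\partial\rho_{m_j}}{\partial t} + (u_{m_j}\cdot\nabla)\rho_{m_j} = 0, \tag{59}$$

$$u_{m_j}(0) = u_0^{m_j}, \qquad \rho_{m_j}(0) = \rho_0^{m_j}, \tag{60}$$

$$\int_D (\rho_{m_j}u_{m_j})(t)\cdot v\,dx - \int_0^t\!\!\int_D \rho_{m_j}u_{m_j} \otimes u_{m_j} : \nabla v\,dx\,dt + \int_0^t\!\!\int_D \nabla u_{m_j} : \nabla v\,dx\,dt = \int_D \rho_0^{m_j}u_0^{m_j}\cdot v\,dx + \int_0^t\!\!\int_D \rho_{m_j}f(t, u_{m_j})\cdot v\,dx\,dt + \int_0^t\!\!\int_D \rho_{m_j}g(t, u_{m_j})\cdot v\,dx\,dW_{m_j}. \tag{61}$$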

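5. Passage to the Limit

In Theorem 12 let us take $p = 2$.
We have that

$$u_{m_j} \to u \quad\text{weakly-star in } L^4(\Omega, \mathcal F, P, L^\infty(0, T, H)), \tag{62}$$

$$u_{m_j} \to u \quad\text{weakly in } L^2(\Omega, \mathcal F, P, L^2(0, T, V)). \tag{63}$$

By (35), (54) and Vitali's theorem, we have

$$u_{m_j} \to u \quad\text{strongly in } L^2(\Omega, \mathcal F, P, L^2(0, T, H)). \tag{64}$$

Thus for fixed $x$,

$$u_{m_j} \to u \quad\text{a.e. } (t, \omega) \text{ with respect to the measure } dt \otimes dP. \tag{65}$$

Next we have

$$\rho_{m_j}u_{m_j} \to g \quad\text{weakly-star in } L^2(\Omega, \mathcal F, P, L^\infty(0, T, L^2(D))). \tag{66}$$

By (35) and the uniform boundedness of $\rho_{m_j}$,

$$E\,\|\rho_{m_j}u_{m_j}\|_{L^\infty(0,T,L^2(D))}^4 \le C.$$

This implies that

$$E\,\|\rho_{m_j}u_{m_j}\|_{L^\infty(0,T,H^{-1/2}(D))}^4 \le C.$$

This together with (54) and Vitali's theorem gives

$$\rho_{m_j}u_{m_j} \to g \quad\text{strongly in } L^2(\Omega, \mathcal F, P, L^\infty(0, T, W_2^{-\delta}(D))). \tag{67}$$

We have that $\rho_{m_j}$ is bounded in $X_q^3$ for any $q > 0$. Taking $q = 4$ we get

$$\rho_{m_j} \to \rho \quad\text{weakly-star in } L^4(\Omega, \mathcal F, P, L^\infty(0, T, L^\infty(D))) \tag{68}$$

and

$$E\,\|\rho_{m_j}\|_{L^\infty(0,T,W_\infty^{-1}(D))}^4 \le C.$$

This estimate combined with (54) and Vitali's theorem implies that

$$\rho_{m_j} \to \rho \quad\text{strongly in } L^2(\Omega, \mathcal F, P, C(0, T, W_\infty^{-1}(D))). \tag{69}$$

(42) gives

$$\rho_{m_j}u_{m_j} \otimes u_{m_j} \to h \quad\text{weakly in } L^2(\Omega, \mathcal F, P, L^{4/3}(0, T, (L^2(D))^9)).$$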

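The product $W_\infty^{-1}(D) \times H^1(D) \to W_6^{-1}(D)$ is continuous. Thus (69) and (63)
give

$$\rho_{m_j}u_{m_j} \to \rho u \quad\text{weakly in } L^2(\Omega, \mathcal F, P, L^2(0, T, W_6^{-1}(D))). \tag{70}$$

And taking account of (67) we get

$$g = \rho u. \tag{71}$$

Similarly, since the product $W_6^{-1}(D) \times H^1(D) \to W_{3/2}^{-1}(D)$ is continuous, we have
from (67), (70) and (63) that

$$\rho_{m_j}u_{m_j} \otimes u_{m_j} \to g \otimes u = \rho u \otimes u \quad\text{weakly in } L^2(\Omega, \mathcal F, P, L^2(0, T, W_{3/2}^{-1}(D))). \tag{72}$$

Next, in view of (67), (35), the conditions on f and Vitali's theorem we have

$$f(\cdot, u_{m_j}(\cdot)) \to f(\cdot, u(\cdot)) \quad\text{in } L^2(\Omega, \mathcal F, P, L^2(0, T, H)). \tag{73}$$

Similarly owing to the conditions on g,

$$g(\cdot, u_{m_j}(\cdot)) \to g(\cdot, u(\cdot)) \quad\text{in } L^2(\Omega, \mathcal F, P, L^2(0, T, H^d)). \tag{74}$$

Using this convergence with (69) and (54), we can show that

$$\int_0^t \rho_{m_j}g(s, u_{m_j}(s))\,dW_{m_j}(s) \to \int_0^t \rho\,g(s, u(s))\,dW(s) \tag{75}$$

weakly in $L^2(\Omega, \mathcal F, P, L^2(D))$. We skip the details and instead refer to [50] where a
similar situation is dealt with thoroughly. A key role is played by Lemma 2.
Next in view of (54),

$$\Phi_{m_j}(\cdot, \omega) \to \Phi(\cdot, \omega)$$

uniformly in $C([0, T])$, $P$-a.s. Hence owing to (50) and Vitali's theorem we get

$$\Phi_{m_j} \to \Phi$$

strongly in $L^1(\Omega, \mathcal F, P, C(0, T, \mathbb R))$. Hence

$$\Phi_{m_j}(0) = \int_D \rho_{m_j}(0)u_{m_j}(0)\cdot v\,dx \to \int_D \rho(0)u(0)\cdot v\,dx.$$

But, by (60) and the convergences (15) and (16),

$$\int_D \rho_{m_j}(0)u_{m_j}(0)\cdot v\,dx = \int_D \rho_0^{m_j}u_0^{m_j}\cdot v\,dx \to \int_D \rho_0 u_0\cdot v\,dx.$$

Thus

$$\int_D \rho(0)u(0)\cdot v\,dx = \int_D \rho_0 u_0\cdot v\,dx. \tag{76}$$

Also passing to the limit in (17), we get

$$\inf_D \rho_0 \le \rho \le \sup_D \rho_0. \tag{77}$$

Combining all these convergences we can pass to the limit in the weak formulation
of problem (58)–(61) and obtain the claim of our main result.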

Acknowledgments
This work is supported by the National Science Foundation under the agreement
No. DMS-0635607 and by the National Research Foundation of South Africa. The
results were obtained during my stay at the Institute for Advanced Study in fall
of 2009. I thank the institute for providing excellent conditions of work. I thank
Professor Ya. G. Sinai for stimulating discussions on the results of the paper and
encouragement. Until the paper was completed, I was not aware of the work of
Professor F. H. Yashima, who informed me during the Panafrican Congress of
Mathematicians in Yamoussoukro (Côte d'Ivoire) in August 2009. My sincere gratitude
is due to him for sending his thesis [64]. I thank one of the reviewers for interesting
comments that improved the paper.

References
[1] S. N. Antontsev, A. V. Kazhikhov and V. N. Monakhov, Boundary Value Problems in
Mechanics of Nonhomogeneous Fluids, Studies in Mathematics and Its Applications,
Vol. 22 (North-Holland Publishing Co., Amsterdam, 1990).
[2] A. Bensoussan, Some existence results for stochastic partial differential equations,
in Stochastic Partial Differential Equations and Applications (Trento, 1990), Pitman
Res. Notes Math. Ser., Vol. 268 (Longman Scientific and Technical, Harlow, UK,
1992), pp. 37–53.
[3] A. Bensoussan, Results on stochastic Navier–Stokes equations, in Control of Partial
Differential Equations (Trento, 1993), Lecture Notes in Pure and Appl. Math.,
Vol. 165 (Dekker, New York, 1994), pp. 11–21.
[4] A. Bensoussan, Stochastic Navier–Stokes equations, Acta Appl. Math. 38 (1995)
267–304.
[5] A. Bensoussan and R. Temam, Équations stochastiques du type Navier–Stokes,
J. Funct. Anal. 13 (1973) 195–222.
[6] J. Bergh and J. Löfström, Interpolation Spaces. An Introduction, Grundlehren der
Mathematischen Wissenschaften, No. 223 (Springer-Verlag, Berlin-New York, 1976).
[7] J. Bricmont, A. Kupiainen and R. Lefevere, Exponential mixing of the 2D stochastic
Navier–Stokes dynamics, Comm. Math. Phys. 230(1) (2002) 87–132.
[8] J. Bricmont, A. Kupiainen and R. Lefevere, Ergodicity of the 2D Navier–Stokes
equations with random forcing. Dedicated to Joel L. Lebowitz, Comm. Math. Phys.
224(1) (2001) 65–81.
[9] J. Bricmont, A. Kupiainen and R. Lefevere, Probabilistic estimates for the
two-dimensional stochastic Navier–Stokes equations, J. Statist. Phys. 100(3–4) (2000)
743–756.
[10] Z. Brzeźniak, M. Capiński and F. Flandoli, Stochastic Navier–Stokes equations with
multiplicative noise, Stochastic Anal. Appl. 10(5) (1992) 523–532.
[11] Z. Brzeźniak and Y. Li, Asymptotic compactness and absorbing sets for 2D stochastic
Navier–Stokes equations on some unbounded domains, Trans. Amer. Math. Soc.
358(12) (2006) 5587–5629.
[12] N. J. Cutland and B. Enright, Stochastic nonhomogeneous incompressible
Navier–Stokes equations, J. Differential Equations 228(1) (2006) 140–170.
[13] G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions,
Encyclopedia of Mathematics and Its Applications, Vol. 44 (Cambridge University Press,
Cambridge, 1992).

[14] G. Da Prato and J. Zabczyk, Ergodicity for Infinite-Dimensional Systems, London
Mathematical Society Lecture Note Series, Vol. 229 (Cambridge University Press,
Cambridge, 1996).
[15] G. Deugoue and M. Sango, On the stochastic 3D Navier–Stokes-alpha model of fluids
turbulence, Abstr. Appl. Anal. 2009 (2009), Article ID 723236, 27 pp.
[16] G. Deugoue and M. Sango, On the strong solution for the 3D stochastic Leray-Alpha
Model, Boundary Value Problems 2010 (2010), Article ID 723018, 31 pp.
[17] E. Weinan and Ya. G. Sinai, New results in mathematical and statistical
hydrodynamics, Russian Math. Surveys 55(4) (2000) 635–666.
[18] E. Feireisl, Dynamics of Viscous Compressible Fluids, Oxford Lecture Series in
Mathematics and Its Applications, Vol. 26 (Oxford University Press, Oxford, 2004).
[19] F. Flandoli and D. Gątarek, Martingale solutions and stationary solutions for
stochastic Navier–Stokes equations, Probab. Theory Relat. Fields 102 (1995)
367–391.
[20] J.-F. Gerbeau and C. Le Bris, Existence of solution for a density-dependent
magnetohydrodynamic equation, Adv. Differential Equations 2(3) (1997) 427–452.
[21] I. I. Gikhman and A. V. Skorohod, Stochastic Differential Equations, Ergebnisse
der Mathematik und ihrer Grenzgebiete, Band 72 (Springer-Verlag, New York-
Heidelberg, 1972).
[22] Y. Cho and H. Kim, Unique solvability for the density-dependent Navier–Stokes
equations, Nonlinear Anal. 59(4) (2004) 465–489.
[23] H. J. Choe and H. Kim, Strong solutions of the Navier–Stokes equations for
nonhomogeneous incompressible fluids, Comm. Partial Differential Equations 28(5–6)
(2003) 1183–1201.
[24] A. V. Kazhikhov, Solvability of the initial-boundary value problem for the equations
of the motion of an inhomogeneous viscous incompressible fluid, Dokl. Akad. Nauk
SSSR 216 (1974) 1008–1010 (in Russian).
[25] S. B. Kuksin, Randomly Forced Nonlinear PDEs and Statistical Hydrodynamics in
2 Space Dimensions, Zurich Lectures in Advanced Mathematics (European
Mathematical Society (EMS), Zürich, 2006), x+93 pp.
[26] S. Kuksin and A. Shirikyan, Ergodicity for the randomly forced 2D Navier–Stokes
equations, Math. Phys. Anal. Geom. 4(2) (2001) 147–195.
[27] O. A. Ladyzhenskaya, The Mathematical Theory of Viscous Incompressible Flow,
2nd edn., revised and enlarged (Gordon and Breach, Science Publishers, New York-
London-Paris, 1969).
[28] O. A. Ladyzhenskaja and V. A. Solonnikov, The unique solvability of an
initial-boundary value problem for viscous incompressible inhomogeneous fluids. Boundary
value problems of mathematical physics, and related questions of the theory of
functions, 8, Zap. Nauch. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 52 (1975)
52–109, 218–219 (in Russian).
[29] J. Leray, Sur le système d'équations aux dérivées partielles qui régit l'écoulement
permanent des fluides visqueux, C. R. Acad. Sci. Paris 192 (1931) 1180–1182.
[30] J. Leray, Sur le mouvement d'un liquide visqueux emplissant l'espace, Acta Math.
63(1) (1934) 193–248.
[31] J. Leray, Étude de diverses équations intégrales non linéaires et de quelques problèmes
que pose l'hydrodynamique, J. Math. Pure Appl. (9) 12 (1933) 1–82.
[32] J. L. Lions, Quelques méthodes de résolution des problèmes aux limites non linéaires
(Dunod, Gauthier-Villars, Paris, 1969; Russian translation by Mir).
[33] J.-L. Lions, On some problems connected with Navier–Stokes equations, in Nonlinear
Evolution Equations (Proc. Sympos., Univ. Wisconsin, Madison, Wis., 1977), Publ.
Math. Res. Center Univ. Wisconsin, Vol. 40 (Academic Press, New York-London,
1978), pp. 59–84.
[34] J.-L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and
Applications, Vol. I, Die Grundlehren der Mathematischen Wissenschaften, Band 181
(Springer-Verlag, New York-Heidelberg, 1972).
[35] P.-L. Lions, Mathematical Topics in Fluid Mechanics, Vol. 1, Incompressible Models
(The Clarendon Press, Oxford University Press, New York, 1996).
[36] P.-L. Lions, Mathematical Topics in Fluid Mechanics, Vol. 2, Compressible Models
(The Clarendon Press, Oxford University Press, New York, 1998).
[37] P.-L. Lions, Limites incompressible et acoustique pour des fluides visqueux,
compressibles et isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 317(12) (1993) 1197–1202.
[38] P.-L. Lions, Compacité des solutions des équations de Navier–Stokes compressibles
isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 317(1) (1993) 115–120.
[39] P.-L. Lions, Existence globale de solutions pour les équations de Navier–Stokes
compressibles isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 316(12) (1993)
1335–1340.
[40] R. Mikulevicius and B. L. Rozovskii, Global $L_2$-solutions of stochastic Navier–Stokes
equations, Ann. Probab. 33(1) (2005) 137–176.
[41] R. Mikulevicius and B. L. Rozovskii, Stochastic Navier–Stokes equations for turbulent
flows, SIAM J. Math. Anal. 35(5) (2004) 1250–1310.
[42] S.-E. Mohammed and T. S. Zhang, Dynamics of Stochastic 2D Navier–Stokes, to
appear in J. Funct. Anal.
[43] A. S. Monin and A. M. Yaglom, Statistical Fluid Mechanics: Mechanics of Turbulence,
Vols. I, II (Dover Publications, Dover Ed Edition, 2007).
[44] A. Novotný and I. Straškraba, Introduction to the Mathematical Theory of
Compressible Flow, Oxford Lecture Series in Mathematics and Its Applications, Vol. 27
(Oxford University Press, Oxford, 2004), xx+506 pp.
[45] Yu. V. Prohorov, Convergence of random processes and limit theorems in probability
theory, Teor. Veroyatnost. i Primenen. 1 (1956) 177–238 (in Russian).
[46] P. A. Razafimandimby and M. Sango, Weak solutions of a stochastic model for
two-dimensional second grade fluids, Boundary Value Problems 2010 (2010), Article
ID 636140, 47 pp.
[47] P. A. Razafimandimby and M. Sango, Asymptotic behavior of solutions of stochastic
evolution equations for second grade fluids, to appear in C. R. Math. Acad. Sci.
Paris.
[48] M. Sango, Existence result for a doubly degenerate quasilinear stochastic parabolic
equation, Proc. Japan Acad. Ser. A Math. Sci. 81(5) (2005) 89–94.
[49] M. Sango, Weak solutions for a doubly degenerate quasilinear parabolic equation
with random forcing, Discrete Contin. Dyn. Syst. Ser. B 7(4) (2007) 885–905.
[50] M. Sango, Magnetohydrodynamic turbulent flows: Existence results, Phys. D 239
(2010) 912–923.
[51] J. Simon, Compact sets in the space $L^p(0, T; B)$, Ann. Mat. Pura Appl. 146(4) (1987)
65–96.
[52] J. Simon, Sur les fluides visqueux incompressibles et non homogènes, C. R. Acad.
Sci. Paris Sér. I Math. 309(7) (1989) 447–452.
[53] J. Simon, Écoulement d'un fluide non homogène avec une densité initiale s'annulant,
C. R. Acad. Sci. Paris Sér. A-B 287(15) (1978) A1009–A1012.
[54] J. Simon, Nonhomogeneous viscous incompressible fluids: Existence of velocity,
density, and pressure, SIAM J. Math. Anal. 21(5) (1990) 1093–1117.
July 12, 2010 11:50 WSPC/S0129-055X 148-RMP
J070-S0129055X10004041

Density Dependent Stochastic NavierStokes Equations 697

[55] A. V. Skorokhod, Limit theorems for stochastic processes, Teor. Veroyatnost. i Prime-
nen. 1 (1956) 289319.
[56] A. V. Skorokhod, Studies in the Theory of Random Processes (Scripta Technica, Inc.
Addison-Wesley Publishing Co., Inc., Reading, Mass., 1965); Translated from the
Russian.
[57] R. Temam, NavierStokes Equations. Theory and Numerical Analysis, Studies
in Mathematics and Its Applications, Vol. 2 (North-Holland Publishing Co.,
Amsterdam-New York-Oxford, 1977).
[58] E. Tornatore and H. Fujita Yashima, One-dimensional stochastic equations for a
viscous barotropic gas, Ricerche Mat. 46(2) (1997) 255283 (in Italian).
[59] E. Tornatore, Global solution of bi-dimensional stochastic equation for a viscous gas,
NoDEA Nonlinear Dierential Equations Appl. 7(4) (2000) 343360.
[60] M. I. Vishik, A. I. Komech and A. V. Fursikov, Some mathematical problems of sta-
tistical hydromechanics, Uspekhi Mat. Nauk 34(5)(209) (1979) 135210 (in Russian).
[61] M. I. Vishik and A. V. Fursikov, Mathematical Problems of Statistical Hydromechan-
ics, Mathematics and Its Applications (Kluwer, Drodrecht, 1988).
[62] M. Viot, Solutions faibles dequations aux derivees partielles stochastiques non lin-
eaires, Doctor of Sciences thesis, Parix 6 (1973).
[63] H. F. Yashima, Equations stochastiques dun gaz visqueux isotherme dans un
domaine monodimensionnel inni, Acta Math. Vietnam. 26(2) (2001) 147168.
[64] H. F. Yashima, Equations de NavierStokes stochastiques non homog`enes et appli-
cations, Tesi di Perfezionamento, Scuola Normale Superiore, Pisa (1992), 169 pp.
July 12, 2010 12:1 WSPC/S0129-055X 148-RMP
J070-S0129055X10004065

Reviews in Mathematical Physics
Vol. 22, No. 6 (2010) 699–732
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004065

DERIVATIONS OF THE TRIGONOMETRIC $BC_n$ SUTHERLAND MODEL
BY QUANTUM HAMILTONIAN REDUCTION

L. FEHÉR and B. G. PUSZTAI
Department of Theoretical Physics, MTA KFKI RMKI,
H-1525 Budapest, P.O.B. 49, Hungary
and
Department of Theoretical Physics, University of Szeged,
Tisza Lajos krt 84-86, H-6720 Szeged, Hungary
Bolyai Institute, University of Szeged,
Aradi vértanúk tere 1, H-6720 Szeged, Hungary
lfeher@rmki.kfki.hu
gpusztai@math.u-szeged.hu

Received 7 October 2009

The $BC_n$ Sutherland Hamiltonian with coupling constants parametrized by three arbitrary
integers is derived by reductions of the Laplace operator of the group $U(N)$. The
reductions are obtained by applying the Laplace operator on spaces of certain vector valued
functions equivariant under suitable symmetric subgroups of $U(N) \times U(N)$. Three
different reduction schemes are considered, the simplest one being the compact real
form of the reduction of the Laplacian of $GL(2n, \mathbb{C})$ to the complex $BC_n$ Sutherland
Hamiltonian previously studied by Oblomkov.

Keywords: Integrable many-body systems; quantum Hamiltonian reduction; polar action.

Mathematics Subject Classification 2010: 22E70, 53C80, 81R12

1. Introduction
The family of Calogero–Sutherland type many-body models is very important both
in physics and mathematics, as is amply demonstrated in the reviews [1–6]. In this
paper, we focus on the group theoretic derivation of the trigonometric Sutherland
models introduced by Olshanetsky and Perelomov [7] in correspondence with the
crystallographic root systems. The Hamiltonian of the model associated with the
root system $R$ is given by
\[ H_R = -\frac{1}{2}\Delta + \frac{1}{4}\sum_{\alpha \in R} \frac{|\alpha|^2 \mu_\alpha(\mu_\alpha + 2\mu_{2\alpha} - 1)}{\sin^2(\alpha \cdot q)}, \tag{1.1} \]
where $\Delta$ is the Laplacian on the Euclidean space of the roots and the $\mu_\alpha$ are
arbitrary real constants depending only on the lengths of the roots, with $\mu_{2\alpha} := 0$


if $2\alpha \notin R$. In the original $A_{n-1}$ case, the model was solved by Sutherland [8]. An
interesting general observation [9] is that the radial part of the Laplace operator
of any compact Riemannian symmetric space is always conjugate to a Sutherland
operator (1.1) built on the root system of the symmetric space, with coupling
constants determined by the multiplicities of the roots. This observation showed
the algebraic integrability of the resulting Hamiltonians $H_R$ at (small) finite sets
of coupling constants and inspired later developments. The integrability, and exact
solvability in terms of a triangular structure, was first established for the models
(1.1) in full generality by Heckman and Opdam [10, 11]. Their technique is based
on differential-reflection operators belonging to the Hecke algebraic generalization
of harmonic analysis [2, 12].
The Hecke algebraic approach is very powerful, but it is still desirable to treat
as many cases of the models (1.1) in group theoretic terms as possible. Important
progress in this direction was achieved by Etingof, Frenkel and Kirillov [13] who
worked out the quantum mechanical version of the classical Hamiltonian reduction
due to Kazhdan, Kostant and Sternberg [14] and thereby showed that the $A_{n-1}$
Sutherland Hamiltonian arises as the restriction of the Laplace operator of $SU(n)$
to certain vector valued spherical functions. A spherical function $F$ on $SU(n)$ with
values in the $SU(n)$ module $V$ satisfies the equivariance condition $F(gxg^{-1}) =
g \cdot F(x)$ and thus it is uniquely determined by its restriction to the maximal torus
$\mathbb{T} < SU(n)$. It is easily seen that the restricted function $f = F|_{\mathbb{T}}$ must vary in
the zero-weight subspace $V^{\mathbb{T}}$ and the action of the Laplace operator of $SU(n)$ on
$F$ can be expressed by the action of a scalar differential operator on $f$ whenever
$\dim(V^{\mathbb{T}}) = 1$. This latter condition singles out the symmetric tensorial powers
$V = S^{kn}(\mathbb{C}^n)$ ($k \in \mathbb{Z}_{\geq 0}$) and their duals among the irreducible highest weight
representations of $SU(n)$, and the resulting scalar differential operator turns out to
be the Sutherland operator $H_{A_{n-1}}$ with coupling parameter $\mu_\alpha = k + 1$.
The above arguments cannot be extended to the simple Lie groups beyond
$SU(n)$, since in general they do not admit non-trivial highest weight
representations with multiplicity one for the zero weight.$^a$ However, taking any compact
connected Lie group $Y$, there exist other nice actions of certain subgroups of $Y \times Y$
on $Y$ for which one can try to generalize the above arguments. Indeed [17], if $G$ is
the fixed point set of an involution of $Y \times Y$, then every orbit of the natural action of
$G$ on $Y$ can be intersected by a toral subgroup $A < Y$. Therefore the $G$-equivariant
functions on $Y$ with values in a representation $V$ of $G$ give rise to $V^K$-valued
functions on $A$, where $K$ is the isotropy group of the generic elements of $A$. Moreover, if
$\dim(V^K) = 1$, then the application of the Laplace operator of $Y$ on $C^\infty(Y, V)^G$ may
induce a scalar Sutherland operator. The group actions just alluded to are called
Hermann actions. They received a lot of attention in differential geometry (see,
e.g., [17, 18] and references therein), but their use for the construction of integrable
systems still has not been explored systematically.

$^a$ The only exceptions [15, 16] are the defining representation of $SO(2n + 1)$ and the
7-dimensional representation of $G_2$. In the former case, we have checked that the reduced
Laplacian gives a decoupled system.
The goal of this paper is to explain that certain Hermann actions on $Y = U(N)$
permit derivations of the $BC_n$ Sutherland Hamiltonian from the Laplacian of $U(N)$.
The derivations that we present are partly motivated by an earlier derivation found
in the complex holomorphic setting in [19], and by our previous paper [20] where
we discussed how the classical mechanical version of the trigonometric $BC_n$ model
with three arbitrary coupling constants can be obtained by reducing the free particle
moving on the group $U(N)$. Taking for $R$ the root system
\[ BC_n = \{\pm\epsilon_i \pm \epsilon_j,\ \pm\epsilon_k,\ \pm 2\epsilon_k \mid i, j, k \in \{1, \dots, n\},\ i \neq j\}, \tag{1.2} \]
with orthonormal vectors $\{\epsilon_i\}$, and introducing new coupling parameters $a$, $b$, $c$ by
the definition
\[ \mu_{\pm\epsilon_i \pm \epsilon_j} := a + 1, \qquad \mu_{\pm\epsilon_k} := b - c, \qquad \mu_{\pm 2\epsilon_k} := c + \frac{1}{2}, \tag{1.3} \]
the Hamiltonian (1.1) reads
\[ H_{BC_n} = -\frac{1}{2}\sum_{j=1}^{n} \frac{\partial^2}{\partial q_j^2}
 + \sum_{1 \leq k < l \leq n} \left( \frac{a(a+1)}{\sin^2(q_k - q_l)} + \frac{a(a+1)}{\sin^2(q_k + q_l)} \right)
 + \frac{1}{2}\sum_{j=1}^{n} \frac{b^2 - \frac{1}{4}}{\sin^2(q_j)}
 + \frac{1}{2}\sum_{j=1}^{n} \frac{c^2 - \frac{1}{4}}{\cos^2(q_j)}. \tag{1.4} \]
In fact, we shall obtain this Hamiltonian with arbitrary non-negative integers $a$,
$b$ and $c$ as a reduction of the Laplace operator of $U(N)$. More precisely, we shall
present 3 different derivations, for which $N = 2n$, $N = 2n + 1$ or $N = 2n + 2$.
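The passage from (1.1) to (1.4) can be cross-checked numerically. The following is a minimal sketch, assuming the sum in (1.1) runs over all roots of $BC_n$ (both signs) and that the couplings are those of (1.3); the function names `bcn_roots`, `potential_from_roots` and `potential_bcn` are ours and purely illustrative.

```python
import math
from itertools import combinations


def bcn_roots(n):
    """Return all roots of BC_n as (vector, kind) pairs.

    kind is 'medium' for +-e_i +-e_j, 'short' for +-e_k, 'long' for +-2e_k.
    """
    roots = []
    for i, j in combinations(range(n), 2):
        for si in (1, -1):
            for sj in (1, -1):
                v = [0] * n
                v[i], v[j] = si, sj
                roots.append((tuple(v), "medium"))
    for k in range(n):
        for s in (1, -1):
            short = [0] * n
            short[k] = s
            roots.append((tuple(short), "short"))
            long_root = [0] * n
            long_root[k] = 2 * s
            roots.append((tuple(long_root), "long"))
    return roots


def potential_from_roots(q, a, b, c):
    """Potential term of (1.1), summed over all roots, with couplings (1.3)."""
    mu = {"medium": a + 1, "short": b - c, "long": c + 0.5}
    total = 0.0
    for v, kind in bcn_roots(len(q)):
        norm2 = sum(x * x for x in v)
        # mu_{2 alpha} is zero unless 2*alpha is a root, i.e. unless alpha is short
        mu_double = mu["long"] if kind == "short" else 0.0
        dot = sum(x * y for x, y in zip(v, q))
        total += 0.25 * norm2 * mu[kind] * (mu[kind] + 2 * mu_double - 1) / math.sin(dot) ** 2
    return total


def potential_bcn(q, a, b, c):
    """Potential term of (1.4)."""
    total = 0.0
    for k, l in combinations(range(len(q)), 2):
        total += a * (a + 1) / math.sin(q[k] - q[l]) ** 2
        total += a * (a + 1) / math.sin(q[k] + q[l]) ** 2
    for qj in q:
        total += 0.5 * (b * b - 0.25) / math.sin(qj) ** 2
        total += 0.5 * (c * c - 0.25) / math.cos(qj) ** 2
    return total
```

Evaluating both functions at a generic point, for instance `q = [0.3, 0.7, 1.1]` with `a, b, c = 2, 3, 1`, should give the same value up to rounding. The agreement rests on the identity $1/\sin^2(2x) = \frac{1}{4}(1/\sin^2 x + 1/\cos^2 x)$ applied to the long roots.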
There is considerable conceptual overlap between this paper and the above-mentioned
work [19] of Oblomkov, who related the eigenfunctions of the holomorphic
$BC_n$ Sutherland operator to vector valued spherical functions on the group
$GL(N, \mathbb{C})$. If we replace $GL(N, \mathbb{C})$ by $U(N)$, then Oblomkov's construction leads
to our construction in the most important $N = 2n$ case. However, there are also
different cases considered in [19] and in this paper even after such replacement, and
the language and the techniques used are rather different. In fact, we shall obtain
the results by applying a recently developed general framework of quantum
Hamiltonian reduction under polar group actions [21]. We shall raise interesting open
questions, too, and to facilitate their future investigation we describe our analysis
in a self-contained manner.
The organization of the article is as follows. In the next section, we recall the
necessary notions and results concerning quantum Hamiltonian reductions of the
Laplace operator on a Riemannian manifold that admits generalized polar
coordinates adapted to the symmetry group in the sense of [22]. In Sec. 3, we specialize to
Hermann actions on a compact Lie group $Y$, and describe those Hermann actions on
$Y = U(N)$ that are expected to lead to $BC_n$ Sutherland models if the representation

of the symmetry group $G < Y \times Y$ is chosen appropriately. The key part of the
paper is Sec. 4, where we confirm the above expectation for three infinite families of
cases. In Sec. 5, we summarize the results, further discuss the comparison with [19]
and formulate open questions. There is also an appendix containing background
material.

2. Quantum Hamiltonian Reduction Under Polar Actions


We here collect general definitions and results that will be used subsequently. Our
main purpose is to explain that formula (2.14) characterizes the reductions of the
Laplace operator of a Riemannian manifold under so-called polar actions [22] of
compact symmetry groups. The exposition is restricted to the necessary minimum;
for more details see [21] and references therein.
Let $Y$ be a smooth, connected, complete Riemannian manifold with metric $\eta$.
Consider the Laplace operator $\Delta_Y$ corresponding to $\eta$. For a smooth function $F$,
in local coordinates $\{y^\mu\}$ on $Y$ one has
$\Delta_Y F = \sum_{\mu,\nu} |\eta|^{-\frac{1}{2}} \partial_\mu (|\eta|^{\frac{1}{2}} \eta^{\mu,\nu} \partial_\nu F)$ with $|\eta| := \det(\eta_{\mu,\nu})$.
The restriction of $\Delta_Y$ onto the space of the complex-valued compactly
supported smooth functions,
\[ \Delta_Y^0 := \Delta_Y|_{C_c^\infty(Y)} : C_c^\infty(Y) \to C_c^\infty(Y), \tag{2.1} \]
is an essentially self-adjoint linear operator of the Hilbert space $L^2(Y, d\mu_Y)$, where
$\mu_Y$ denotes the measure generated by the Riemannian volume form, locally defined
by $|\eta|^{\frac{1}{2}} \prod_\mu dy^\mu$. Suppose that a compact Lie group $G$ acts on $(Y, \eta)$ by isometries.
The action is given by a smooth map
\[ \phi : G \times Y \to Y, \qquad (g, y) \mapsto \phi(g, y) = \phi_g(y) = g.y \tag{2.2} \]
such that $\phi_g^* \eta = \eta$ for every $g \in G$. The measure $\mu_Y$ inherits the $G$-invariance
and therefore the Hilbert space $L^2(Y, d\mu_Y)$ naturally carries a continuous unitary
representation of $G$. This in turn is unitarily equivalent to an orthogonal direct
sum, $L^2(Y, d\mu_Y) \cong \oplus_\rho M_\rho \otimes V_{\rho^*}$, where $(\rho, V_\rho)$ runs over a complete set of pairwise
inequivalent irreducible unitary representations of $G$, $\rho^*$ denotes the contragredient
of the representation $\rho$, and $M_\rho$ is a multiplicity space on which $G$ acts trivially.
Correspondingly, the self-adjoint scalar Laplace operator, $\bar{\Delta}_Y^0$, which by definition
is the closure of $\Delta_Y^0$ (2.1), can be decomposed as
$\bar{\Delta}_Y^0 = \oplus_\rho \Delta_\rho \otimes \mathrm{id}_{V_{\rho^*}}$, where $\Delta_\rho$ is
a self-adjoint operator on the Hilbert space $M_\rho$. The system $(M_\rho, \Delta_\rho)$ is called the
reduction of the system $(L^2(Y, d\mu_Y), \bar{\Delta}_Y^0)$ having the symmetry type $\rho$.
In order to present a convenient model of $(M_\rho, \Delta_\rho)$, consider now an irreducible
unitary representation $(\rho, V)$ of $G$, where $V$ is a finite dimensional complex vector
space with inner product $(\ ,\ )_V$. By simply acting componentwise, the differential
operator $\Delta_Y^0$ extends onto the complex vector space of the $V$-valued
compactly supported smooth functions, $C_c^\infty(Y, V)$. This gives the essentially self-adjoint
operator
\[ \Delta_Y^0 : C_c^\infty(Y, V) \to C_c^\infty(Y, V) \tag{2.3} \]

of the Hilbert space $L^2(Y, V, d\mu_Y)$. Because of the $G$-symmetry of the metric $\eta$,
the set
\[ C_c^\infty(Y, V)^G := \{F \mid F \in C_c^\infty(Y, V),\ F \circ \phi_g = \rho(g) \circ F\ (\forall g \in G)\} \tag{2.4} \]
of the $V$-valued, compactly supported $G$-equivariant smooth functions is an
invariant linear subspace of $\Delta_Y^0$. Moreover, the restriction of $\Delta_Y^0$ (2.3) onto
$C_c^\infty(Y, V)^G$,
\[ \Delta_\rho := \Delta_Y^0|_{C_c^\infty(Y,V)^G} : C_c^\infty(Y, V)^G \to C_c^\infty(Y, V)^G, \tag{2.5} \]
is a densely defined, symmetric, essentially self-adjoint linear operator on the Hilbert
space $L^2(Y, V, d\mu_Y)^G$ of the square-integrable $G$-equivariant functions. It is not
difficult to demonstrate the unitary equivalence
\[ (M_\rho, \Delta_\rho) \cong (L^2(Y, V, d\mu_Y)^G, \bar{\Delta}_\rho) \quad \text{with } V := V_\rho, \tag{2.6} \]
where $\bar{\Delta}_\rho$ denotes the closure of $\Delta_\rho$ in (2.5). It is convenient for many purposes to
use the realization of the reduced quantum system furnished by $L^2(Y, V, d\mu_Y)^G$.
Particularly simple cases of the reduction arise if the reduced configuration space
$Y_{\mathrm{red}} := Y/G$ is a smooth manifold, although this happens very rarely. However,
restricting to the principal orbit type, $\check{Y} \subset Y$, one always obtains a smooth fiber
bundle $\pi : \check{Y} \to \check{Y}/G$. Note that $\check{Y}$ consists of the points of $Y$ having the smallest
isotropy subgroups for the $G$-action [23]. The big cell of the reduced configuration
space, given by $\check{Y}_{\mathrm{red}} := \check{Y}/G$, is naturally endowed with a Riemannian metric, $\eta_{\mathrm{red}}$,
making $\pi$ a Riemannian submersion. From a quantum mechanical point of view,
neglecting the non-principal orbits is harmless, in some sense, since $\check{Y}$ is not only
open and dense in $Y$, but it is also of full measure.
In many applications polar group actions are important, whose characteristic
property is that the $G$-orbits possess representatives that form sections in the sense
of Palais and Terng [22]. By definition, a section $\Sigma \subset Y$ is a connected, closed,
regularly embedded smooth submanifold of $Y$ that meets every $G$-orbit and it does
so orthogonally at every intersection point of $\Sigma$ with an orbit. If a section exists,
then any two sections are $G$-related. The induced metric on $\Sigma$ is denoted by $\eta_\Sigma$,
and for the measure generated by $\eta_\Sigma$, we introduce the notation $\mu_\Sigma$. For a section
$\Sigma$, denote by $\check{\Sigma}$ a connected component of the manifold $\hat{\Sigma} := \Sigma \cap \check{Y}$. The isotropy
subgroups of all elements of $\hat{\Sigma}$ are the same and for a fixed section we define $K := G_y$
for $y \in \hat{\Sigma}$. The group $K$ is called the centralizer of the section $\Sigma$. By restricting
$\pi : \check{Y} \to \check{Y}/G$ onto $\check{\Sigma}$, $(\check{Y}_{\mathrm{red}}, \eta_{\mathrm{red}})$ becomes identified with $(\check{\Sigma}, \eta_{\check{\Sigma}})$, where $\eta_{\check{\Sigma}}$ is the
induced metric on $\check{\Sigma}$. We let $\Delta_{\check{\Sigma}}$ stand for the Laplace operator of the Riemannian
manifold $(\check{\Sigma}, \eta_{\check{\Sigma}})$. The $G$-equivariant diffeomorphism
\[ \check{\Sigma} \times (G/K) \ni (Q, gK) \mapsto \phi_g(Q) \in \check{Y} \tag{2.7} \]
provides a trivialization of the fiber bundle $\pi : \check{Y} \to \check{Y}/G$. Generalized polar
coordinates on $\check{Y}$ consist of radial coordinates on $\check{\Sigma}$ and angular coordinates on $G/K$.

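To concretize the reduced system (2.6) for polar actions, we introduce the
space
\[ \mathrm{Fun}(\check{\Sigma}, V^K) := \{f \mid f \in C_c^\infty(\check{\Sigma}, V^K),\ f = F|_{\check{\Sigma}} \text{ for some } F \in C_c^\infty(Y, V)^G\}, \tag{2.8} \]
where $V^K$ is spanned by the $K$-invariant vectors in the representation space $V$. We
assume that the representation $(\rho, V)$ of the symmetry group $G$ is admissible in the
sense that
\[ \dim(V^K) > 0. \tag{2.9} \]
The restriction of functions appearing in the definition (2.8) gives rise to a linear
isomorphism $\mathrm{Fun}(\check{\Sigma}, V^K) \cong C_c^\infty(\check{Y}, V)^G \subset L^2(Y, V, d\mu_Y)^G$. This induces a scalar
product on $\mathrm{Fun}(\check{\Sigma}, V^K)$ making it a pre-Hilbert space whose closure satisfies the
Hilbert space isomorphism $\overline{\mathrm{Fun}}(\check{\Sigma}, V^K) \cong L^2(Y, V, d\mu_Y)^G$. Next, consider the Lie
algebra $\mathcal{G} := \mathrm{Lie}(G)$ and its subalgebra $\mathcal{K} := \mathrm{Lie}(K)$. Fix a $G$-invariant positive
definite scalar product, $B_{\mathcal{G}}$, on $\mathcal{G}$ and thereby determine the orthogonal complement
$\mathcal{K}^\perp$ of $\mathcal{K}$ in $\mathcal{G}$. For any $\xi \in \mathcal{G}$ denote by $\xi^\sharp$ the associated vector field on $Y$. Then at
each point $Q \in \check{\Sigma}$ the linear map $\mathcal{K}^\perp \ni \xi \mapsto \xi^\sharp_Q \in T_Q Y$ is injective, and the inertia
operator $J(Q) \in \mathrm{End}(\mathcal{K}^\perp)$ can be defined by the requirement
\[ \eta_Q(\xi^\sharp_Q, \zeta^\sharp_Q) = B_{\mathcal{G}}(\xi, J(Q)\zeta), \qquad \forall \xi, \zeta \in \mathcal{K}^\perp. \tag{2.10} \]
Note that $J(Q)$ is symmetric and positive definite with respect to $B_{\mathcal{G}}|_{\mathcal{K}^\perp \times \mathcal{K}^\perp}$. By
choosing dual bases $\{T_\alpha\}, \{T^\beta\} \subset \mathcal{K}^\perp$, that is, $B_{\mathcal{G}}(T_\alpha, T^\beta) = \delta_\alpha^\beta$, we let
\[ b_{\alpha,\beta}(Q) := B_{\mathcal{G}}(T_\alpha, J(Q) T_\beta), \qquad b^{\alpha,\beta}(Q) := B_{\mathcal{G}}(T^\alpha, J(Q)^{-1} T^\beta). \tag{2.11} \]
The $G$-orbit $G.Q \subset Y$ through any point $Q \in \check{\Sigma}$ is an embedded submanifold of
$Y$ and by its embedding it inherits a Riemannian metric, $\eta_{G.Q}$. Thus we can define
the smooth density function $\delta : \check{\Sigma} \to (0, \infty)$ by
\[ \delta(Q) := \text{volume of the Riemannian manifold } (G.Q, \eta_{G.Q}), \tag{2.12} \]
where the volume is understood with respect to the measure, $\mu_{G.Q}$, belonging to
$\eta_{G.Q}$. It is easy to see that
\[ \delta(Q) = C |\det(J(Q))|^{\frac{1}{2}} = C |\det(b_{\alpha,\beta}(Q))|^{\frac{1}{2}} \tag{2.13} \]
with some constant $C > 0$. In the following proposition, quoted from [21], $\rho'$ denotes
the representation of $\mathcal{G}$ corresponding to the representation $\rho$ of $G$.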

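Proposition 2.1. Let us consider a polar $G$-action using the above notations.
Then the reduced system (2.6) associated with an admissible irreducible unitary
representation $(\rho, V)$ of $G$ can be identified with the pair
$(L^2(\check{\Sigma}, V^K, d\mu_{\check{\Sigma}}), \bar{\Delta}_{\mathrm{red}})$, where
\[ \Delta_{\mathrm{red}} = \Delta_{\check{\Sigma}} - \delta^{-\frac{1}{2}} \Delta_{\check{\Sigma}}(\delta^{\frac{1}{2}}) + b^{\alpha,\beta} \rho'(T_\alpha) \rho'(T_\beta) \tag{2.14} \]
with domain $D(\Delta_{\mathrm{red}}) = \delta^{\frac{1}{2}} \mathrm{Fun}(\check{\Sigma}, V^K)$ is a densely defined, symmetric, essentially
self-adjoint operator on the Hilbert space $L^2(\check{\Sigma}, V^K, d\mu_{\check{\Sigma}})$.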

The above statement results by calculating the action of $\Delta_Y$ on the $V$-valued
equivariant functions in (2.8) with the aid of polar coordinates, using also the
Hilbert space identifications
\[ \overline{\mathrm{Fun}}(\check{\Sigma}, V^K) \cong L^2(Y, V, d\mu_Y)^G \cong L^2(\check{\Sigma}, V^K, \delta\, d\mu_{\check{\Sigma}}). \tag{2.15} \]
The last equality follows by integrating out the angular coordinates in the scalar
product of equivariant functions. One also uses the unitary map
$U : L^2(\check{\Sigma}, V^K, \delta\, d\mu_{\check{\Sigma}}) \to L^2(\check{\Sigma}, V^K, d\mu_{\check{\Sigma}})$ defined by $U : f \mapsto \delta^{\frac{1}{2}} f$.
The first term in (2.14) corresponds to the kinetic energy of a particle moving on
$(\check{Y}_{\mathrm{red}}, \eta_{\mathrm{red}}) \cong (\check{\Sigma}, \eta_{\check{\Sigma}})$ and the rest represents potential energy if $\dim(V^K) = 1$. The
second term of (2.14) is always potential energy, which is constant in some cases.
We refer to this term as the measure factor. It represents a significant difference
between the outcomes of the corresponding classical and quantum Hamiltonian
reductions [21]. If $\dim(V^K) > 1$, then one says that the reduced system contains
internal spin degrees of freedom and then the third term of (2.14) encodes
spin-dependent potential energy.
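To see how the measure factor in (2.14) can reduce to a constant, consider a one-dimensional toy setting. The sketch below assumes, purely for illustration and not taken from the text, that the section is the interval $(0, \pi)$ with $\Delta_{\check{\Sigma}} = d^2/dq^2$ and that the density is $\delta(q) = \sin^2(q)$; the helper `measure_factor` is hypothetical and approximates $-\delta^{-1/2} (\delta^{1/2})''$ by a central difference.

```python
import math


def measure_factor(delta, q, h=1e-4):
    """Approximate -delta^{-1/2} * d^2/dq^2 (delta^{1/2}) at the point q."""
    def root(x):
        return math.sqrt(delta(x))

    # central second difference of delta^{1/2}
    second = (root(q + h) - 2.0 * root(q) + root(q - h)) / (h * h)
    return -second / root(q)


def toy_density(q):
    """Assumed density delta(q) = sin^2(q) on the interval (0, pi)."""
    return math.sin(q) ** 2
```

With this choice $\delta^{1/2} = \sin(q)$ and $(\sin q)'' = -\sin q$, so the measure factor equals $1$ at every point of the interval; `measure_factor(toy_density, q)` should reproduce this up to discretization error.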

3. Examples of Polar Actions on Compact Lie Groups


From now on we take the unreduced configuration space $Y$ to be a compact,
connected, real Lie group endowed with a bi-invariant metric $\eta$, induced by a positive
definite, $Y$-invariant bilinear form $B_{\mathcal{Y}}$ of the Lie algebra $\mathcal{Y} := \mathrm{Lie}(Y)$. For the
reduction group $G$ one may choose any symmetric subgroup of the direct product group
$Y \times Y$, that is,
\[ (Y \times Y)^\sigma_0 \subseteq G \subseteq (Y \times Y)^\sigma, \tag{3.1} \]
where $(Y \times Y)^\sigma$ stands for the fixed-point set of some involutive automorphism
$\sigma \in \mathrm{Inv}(Y \times Y)$, and $(Y \times Y)^\sigma_0$ is the connected component of the identity in
$(Y \times Y)^\sigma$. The group $G$ acts on $Y$ by the map
\[ \phi : G \times Y \to Y, \qquad ((g_L, g_R), y) \mapsto \phi_{(g_L, g_R)}(y) := g_L y g_R^{-1}. \tag{3.2} \]
The group actions of this form are often called Hermann actions. Under mild
conditions, which hold in the examples below, these are polar actions in the sense of
[22]. In fact, the sections are provided by certain toral subgroups$^b$ $A < Y$. Thus the
sections are flat in the induced metric, which is the characteristic property of the
so-called hyperpolar actions [17]. In the simplest special case $\sigma(y_1, y_2) = (y_2, y_1)$,
$G = Y_{\mathrm{diag}} = \{(y, y) \mid y \in Y\} \cong Y$ and (3.2) is just the adjoint action of $Y$ on itself,
for which the sections are the maximal tori of $Y$.

$^b$ A toral subgroup $A < Y$ is a connected and closed Abelian subgroup. It is the closedness of the
relevant subgroups that requires some conditions. If $Y$ is semi-simple, then a sufficient condition
is to take $B_{\mathcal{Y}}$ as a multiple of the Killing form [17].

3.1. Hermann actions associated with pairs of involutions


The reductions that we study later arise from the following construction. Let
$\sigma_L, \sigma_R \in \mathrm{Inv}(Y)$ be two involutions of $Y$, and let $Y_L, Y_R \subseteq Y$ be corresponding
symmetric subgroups of $Y$,
\[ (Y^{\sigma_I})_0 \subseteq Y_I \subseteq Y^{\sigma_I} \qquad (I \in \{L, R\}). \tag{3.3} \]
We suppose that the scalar product $B_{\mathcal{Y}}$ is invariant under both $\sigma_L$ and $\sigma_R$ and
introduce $\sigma \in \mathrm{Inv}(Y \times Y)$ by $\sigma(y_1, y_2) := (\sigma_L(y_1), \sigma_R(y_2))$. Then
\[ G := Y_L \times Y_R \tag{3.4} \]
is a symmetric subgroup of $Y \times Y$ and Eq. (3.2) defines a hyperpolar Hermann action
of $G$ on $Y$. The classification of the inequivalent pairs of involutions $(\sigma_L, \sigma_R)$ has
been worked out by Matsuki [24]. We assume for simplicity that the two involutions
$\sigma_L$ and $\sigma_R$ commute with each other, which holds for the large majority of cases
in the classification. Subsequently, the induced Lie algebra involutions are denoted
by the same letters $\sigma_L$ and $\sigma_R$.
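Now, with the aid of the subspaces
\[ \mathcal{Y}^{\sigma_I, \pm} := \ker(\sigma_I \mp \mathrm{Id}_{\mathcal{Y}}) \subset \mathcal{Y} \quad (I \in \{L, R\}) \qquad \text{and} \qquad \mathcal{Y}^{\pm\pm} := \mathcal{Y}^{\sigma_L, \pm} \cap \mathcal{Y}^{\sigma_R, \pm} \subset \mathcal{Y} \tag{3.5} \]
(where the first sign refers to $\sigma_L$, the second to $\sigma_R$, and the two are chosen
independently) we obtain the orthogonal decomposition
\[ \mathcal{Y} = \mathcal{Y}^{++} \oplus \mathcal{Y}^{+-} \oplus \mathcal{Y}^{-+} \oplus \mathcal{Y}^{--}, \tag{3.6} \]
which gives also a $\mathbb{Z}_2 \times \mathbb{Z}_2$-gradation of $\mathcal{Y}$. The Lie algebra of the symmetric
subgroup $Y_I \subseteq Y$ is $\mathrm{Lie}(Y_I) \cong \mathcal{Y}^{\sigma_I, +}$ ($I \in \{L, R\}$). Then, we choose a maximal Abelian
subalgebra $\mathcal{A}$ in $\mathcal{Y}^{--}$ and also define $A := \exp(\mathcal{A})$, which is a toral subgroup of $Y$.
According to an important theorem proved in [25, 26], the Lie group $Y$ admits the
generalized Cartan decomposition
\[ Y = Y_L A Y_R. \tag{3.7} \]
This means that every element of $Y$ can be written as a product of the elements
of the subgroups in (3.7). Recalling the definition of the Hermann action (3.2) for
$G = Y_L \times Y_R$, Eq. (3.7) says that the subgroup $A$ intersects every $G$-orbit. Moreover,
it does so orthogonally at every intersection point, and thus $A$ provides a section
for the $G$-action in the sense of [22]. Below $\check{A}$ denotes a connected component of
the regular part of the section $A$.
Let us introduce the subgroups $Y_{LR} := Y_L \cap Y_R \subset Y$ and
\[ M := \{g \mid g \in Y_{LR},\ g a g^{-1} = a\ (\forall a \in A)\} \subset Y_{LR}. \tag{3.8} \]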


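Their Lie algebras are
\[ \mathrm{Lie}(Y_{LR}) \cong \mathrm{Lie}(Y_L) \cap \mathrm{Lie}(Y_R) \cong \mathcal{Y}^{\sigma_L, +} \cap \mathcal{Y}^{\sigma_R, +} = \mathcal{Y}^{++}, \tag{3.9} \]
\[ \mathcal{M} := \mathrm{Lie}(M) = \{X \mid X \in \mathcal{Y}^{++},\ \mathrm{ad}_X(q) = 0\ (\forall q \in \mathcal{A})\}, \tag{3.10} \]
where $\mathrm{ad}_X$ is defined by the Lie bracket on $\mathcal{Y}$. It can be shown that the centralizer
of the section $A = \exp(\mathcal{A})$ (the isotropy subgroup of the elements of $\check{A}$) is now
furnished by
\[ K = M_{\mathrm{diag}} = \{(g, g) \mid g \in M\} \subset G. \tag{3.11} \]
To specialize the inertia operator $J$ defined in (2.10), we introduce a $G$-invariant
scalar product on the Lie algebra
\[ \mathcal{G} = \mathrm{Lie}(G) = \mathrm{Lie}(Y_L \times Y_R) \cong \mathrm{Lie}(Y_L) \oplus \mathrm{Lie}(Y_R) \cong \mathcal{Y}^{\sigma_L, +} \oplus \mathcal{Y}^{\sigma_R, +} \tag{3.12} \]
by the formula
\[ B_{\mathcal{G}}((\xi_L, \xi_R), (\zeta_L, \zeta_R)) := B_{\mathcal{Y}}(\xi_L, \zeta_L) + B_{\mathcal{Y}}(\xi_R, \zeta_R), \qquad \forall (\xi_L, \xi_R), (\zeta_L, \zeta_R) \in \mathcal{G}. \tag{3.13} \]
This induces the decomposition $\mathcal{G} = \mathcal{K} \oplus \mathcal{K}^\perp$, where $\mathcal{K} = \mathrm{Lie}(K)$. By using the
decomposition $\mathcal{Y} = \mathcal{M} \oplus \mathcal{M}^\perp$ defined by $B_{\mathcal{Y}}$, we also introduce the subspaces
\[ \mathcal{K}_a^\perp := \{(X, -X) \mid X \in \mathcal{M}\} \subset \mathcal{K}^\perp, \tag{3.14} \]
\[ \mathcal{K}_e^\perp := \{(\xi_L, \xi_R) \mid \xi_L, \xi_R \in \mathcal{M}^\perp \cap \mathcal{Y}^{++}\} \subset \mathcal{K}^\perp, \tag{3.15} \]
\[ \mathcal{K}_o^\perp := \{(\xi_L, \xi_R) \mid \xi_L \in \mathcal{Y}^{+-},\ \xi_R \in \mathcal{Y}^{-+}\} \subset \mathcal{K}^\perp, \tag{3.16} \]
which yield the orthogonal decomposition
\[ \mathcal{K}^\perp = \mathcal{K}_a^\perp \oplus \mathcal{K}_e^\perp \oplus \mathcal{K}_o^\perp. \tag{3.17} \]
Now consider the vector field $\xi^\sharp = (\xi_L, \xi_R)^\sharp$ on $Y$ associated with $\xi = (\xi_L, \xi_R) \in \mathcal{G}$
by means of the $G$-action. At an arbitrary point $e^q \in A$ ($q \in \mathcal{A}$) of the section $A$
we find
\[ \xi^\sharp_{e^q} = (\xi_L, \xi_R)^\sharp_{e^q} = (dL_{e^q})_e \left( \xi_R - e^{-\mathrm{ad}_q}(\xi_L) \right) \in T_{e^q} Y, \tag{3.18} \]
where $L_y$ denotes the left-translation on $Y$ by group element $y \in Y$. Simply by
plugging (3.18) into the definition (2.10), routine algebraic manipulations lead to
the following result: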

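Lemma 3.1. Equation (3.17) is a decomposition of $\mathcal{K}^\perp$ into invariant subspaces
of the inertia operator $J(e^q)$ at any point $e^q \in \check{A}$. One has
$J(e^q)|_{\mathcal{K}_a^\perp} = 2\, \mathrm{Id}_{\mathcal{K}_a^\perp}$ and,
writing $\xi = (\xi_L, \xi_R) \in \mathcal{G}$ as a 2-component column vector with components $\xi_L$ and
$\xi_R$, the action of $J(e^q)$ on $\mathcal{K}_e^\perp$ and $\mathcal{K}_o^\perp$ is encoded by the matrices
\[ J(e^q)|_{\mathcal{K}_e^\perp} = \begin{bmatrix} 1 & -\cosh(\mathrm{ad}_q) \\ -\cosh(\mathrm{ad}_q) & 1 \end{bmatrix}\Bigg|_{\mathcal{K}_e^\perp}, \qquad
J(e^q)|_{\mathcal{K}_o^\perp} = \begin{bmatrix} 1 & -\sinh(\mathrm{ad}_q) \\ \sinh(\mathrm{ad}_q) & 1 \end{bmatrix}\Bigg|_{\mathcal{K}_o^\perp}. \tag{3.19} \]
For the inverse of $J(e^q)$ one has $J(e^q)^{-1}|_{\mathcal{K}_a^\perp} = \frac{1}{2} \mathrm{Id}_{\mathcal{K}_a^\perp}$ together with
\[ J(e^q)^{-1}|_{\mathcal{K}_e^\perp} = \begin{bmatrix} -\sinh^{-2}(\mathrm{ad}_q) & -\cosh(\mathrm{ad}_q) \sinh^{-2}(\mathrm{ad}_q) \\ -\cosh(\mathrm{ad}_q) \sinh^{-2}(\mathrm{ad}_q) & -\sinh^{-2}(\mathrm{ad}_q) \end{bmatrix}\Bigg|_{\mathcal{K}_e^\perp}, \tag{3.20} \]
\[ J(e^q)^{-1}|_{\mathcal{K}_o^\perp} = \begin{bmatrix} \cosh^{-2}(\mathrm{ad}_q) & \sinh(\mathrm{ad}_q) \cosh^{-2}(\mathrm{ad}_q) \\ -\sinh(\mathrm{ad}_q) \cosh^{-2}(\mathrm{ad}_q) & \cosh^{-2}(\mathrm{ad}_q) \end{bmatrix}\Bigg|_{\mathcal{K}_o^\perp}. \tag{3.21} \]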
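The inverse formulas of Lemma 3.1 can be checked formally. The sketch below treats $\mathrm{ad}_q$ as a commuting symbol and replaces it by the number $\mathrm{i}\theta$, which is the value it takes on directions where $(\mathrm{ad}_q)^2$ acts as $-\theta^2$; this scalar substitution is our simplification. The block forms are assumed to be $J|_e = \begin{bmatrix} 1 & -\cosh \\ -\cosh & 1 \end{bmatrix}$ and $J|_o = \begin{bmatrix} 1 & -\sinh \\ \sinh & 1 \end{bmatrix}$, with the inverses built from $-\sinh^{-2}$ and $\cosh^{-2}$ respectively; the names `inertia_blocks` and `matmul2` are hypothetical.

```python
import cmath


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inertia_blocks(theta):
    """Return (J_e, J_e_inv, J_o, J_o_inv) with ad_q replaced by i*theta."""
    x = 1j * theta
    ch, sh = cmath.cosh(x), cmath.sinh(x)  # equal to cos(theta) and i*sin(theta)
    J_e = [[1, -ch], [-ch, 1]]
    J_e_inv = [[-1 / sh**2, -ch / sh**2], [-ch / sh**2, -1 / sh**2]]
    J_o = [[1, -sh], [sh, 1]]
    J_o_inv = [[1 / ch**2, sh / ch**2], [-sh / ch**2, 1 / ch**2]]
    return J_e, J_e_inv, J_o, J_o_inv


def max_deviation_from_identity(P):
    """Largest absolute entry of P - Id for a 2x2 matrix P."""
    return max(abs(P[i][j] - (1 if i == j else 0)) for i in range(2) for j in range(2))
```

For a generic angle such as `theta = 0.7`, both products `matmul2(J_e, J_e_inv)` and `matmul2(J_o, J_o_inv)` should agree with the identity up to rounding. The check fails, as it must, when $\sin\theta = 0$ or $\cos\theta = 0$, where the corresponding block is singular.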

3.2. A family of two involutions on U (N )


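For our later purpose, we now focus on the unitary group
\[ Y := U(N) = \{y \mid y \in GL(N, \mathbb{C}),\ y^\dagger y = 1_N\}. \tag{3.22} \]
We equip the Lie algebra
\[ \mathcal{Y} := u(N) = \{X \mid X \in gl(N, \mathbb{C}),\ X^\dagger + X = 0\} \tag{3.23} \]
with the scalar product
\[ B_{\mathcal{Y}}(X, Z) := -\mathrm{tr}(XZ), \qquad \forall X, Z \in u(N). \tag{3.24} \]
To any pair $(m, n) \in \mathbb{Z}^2_{>0}$ with $m \geq n$ and $m + n = N$ we associate the block-matrix
\[ I_{m,n} := \mathrm{diag}(1_m, -1_n) = \begin{bmatrix} 1_m & 0 \\ 0 & -1_n \end{bmatrix} \in U(N), \tag{3.25} \]
and the involutive inner automorphism
\[ \Theta_{m,n} : U(N) \to U(N), \qquad y \mapsto \Theta_{m,n}(y) := I_{m,n}\, y\, I_{m,n}^{-1}. \tag{3.26} \]
The fixed-point set of $\Theta_{m,n}$ is
\[ U(N)^{\Theta_{m,n}} = \left\{ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \,\middle|\, a \in U(m),\ b \in U(n) \right\} \cong U(m) \times U(n). \tag{3.27} \]
Note that $U(N)^{\Theta_{m,n}}$ is connected. The induced Lie algebra involution operates as
\[ \Theta_{m,n}(X) = I_{m,n}\, X\, I_{m,n}^{-1}, \qquad \forall X \in u(N). \tag{3.28} \]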

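Using the block-matrix realization
\[ u(N) = \left\{ \begin{bmatrix} A & C \\ -C^\dagger & B \end{bmatrix} \,\middle|\, A \in u(m),\ B \in u(n),\ C \in \mathbb{C}^{m \times n} \right\}, \tag{3.29} \]
the eigenspaces $u(N)^{\Theta_{m,n}, \pm}$ are
\[ u(N)^{\Theta_{m,n}, +} = \left\{ \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} \,\middle|\, A \in u(m),\ B \in u(n) \right\}, \qquad
u(N)^{\Theta_{m,n}, -} = \left\{ \begin{bmatrix} 0 & C \\ -C^\dagger & 0 \end{bmatrix} \,\middle|\, C \in \mathbb{C}^{m \times n} \right\}. \tag{3.30} \]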
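The splitting (3.30) is easy to reproduce on a small example. The following sketch assumes $I_{m,n} = \mathrm{diag}(1_m, -1_n)$, so that conjugation by $I_{m,n}$ flips the sign of the off-diagonal blocks; the helper names `theta`, `eigen_parts` and `sample_antihermitian` are ours and only serve the illustration.

```python
def theta(X, m, n):
    """Conjugate X by I_{m,n} = diag(1_m, -1_n); note that I_{m,n} is its own inverse."""
    size = m + n
    sign = [1] * m + [-1] * n
    return [[sign[i] * X[i][j] * sign[j] for j in range(size)] for i in range(size)]


def eigen_parts(X, m, n):
    """Split X into its components in the +1 and -1 eigenspaces of theta."""
    size = m + n
    T = theta(X, m, n)
    even = [[(X[i][j] + T[i][j]) / 2 for j in range(size)] for i in range(size)]
    odd = [[(X[i][j] - T[i][j]) / 2 for j in range(size)] for i in range(size)]
    return even, odd


def sample_antihermitian(size):
    """Build a fixed anti-Hermitian test matrix (X^dagger + X = 0)."""
    X = [[0j] * size for _ in range(size)]
    for i in range(size):
        X[i][i] = 1j * (i + 1)
        for j in range(i + 1, size):
            X[i][j] = complex(i + 1, j + 2)
            X[j][i] = -X[i][j].conjugate()
    return X
```

For `m, n = 2, 1` and `X = sample_antihermitian(3)`, the `even` part is block diagonal and the `odd` part has only the off-diagonal blocks, matching the two sets in (3.30).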

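Now we take two pairs $(m, n), (r, s) \in \mathbb{Z}^2_{>0}$ with the additional requirements
$m \geq r \geq s \geq n$ and $m + n = r + s = N$, and consider the commuting involutions
\[ \sigma_L := \Theta_{r,s} \qquad \text{and} \qquad \sigma_R := \Theta_{m,n}. \tag{3.31} \]
The corresponding symmetric subgroups $Y_L, Y_R \subseteq Y$ are
\[ U(N)_L := U(N)^{\sigma_L} \cong U(r) \times U(s) \qquad \text{and} \qquad U(N)_R := U(N)^{\sigma_R} \cong U(m) \times U(n). \tag{3.32} \]
The partition $N = n + (r - n) + (s - n) + n$ leads to a $4 \times 4$ block-matrix decomposition
of any $N \times N$ matrix in general. (Of course, if $r = n$ or $s = n$, then the
block-matrix decomposition contains fewer blocks.) That is, any matrix $X \in \mathbb{C}^{N \times N}$ can
be written as
\[ X = \begin{bmatrix} X_{1,1} & X_{1,2} & X_{1,3} & X_{1,4} \\ X_{2,1} & X_{2,2} & X_{2,3} & X_{2,4} \\ X_{3,1} & X_{3,2} & X_{3,3} & X_{3,4} \\ X_{4,1} & X_{4,2} & X_{4,3} & X_{4,4} \end{bmatrix}, \tag{3.33} \]
where the entries $X_{i,j}$ are themselves matrices, $X_{1,1} \in \mathbb{C}^{n \times n}$, $X_{1,2} \in \mathbb{C}^{n \times (r-n)}$,
$X_{1,3} \in \mathbb{C}^{n \times (s-n)}$, $X_{1,4} \in \mathbb{C}^{n \times n}$, etc. Then for the Lie group $Y_{LR} = Y_L \cap Y_R$ we have
\[ U(N)_{LR} = \left\{ \begin{bmatrix} a_{1,1} & a_{1,2} & 0 & 0 \\ a_{2,1} & a_{2,2} & 0 & 0 \\ 0 & 0 & a_{3,3} & 0 \\ 0 & 0 & 0 & a_{4,4} \end{bmatrix} \,\middle|\, \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \in U(r),\ a_{3,3} \in U(s - n),\ a_{4,4} \in U(n) \right\}. \tag{3.34} \]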

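Therefore $U(N)_{LR} \cong U(r) \times U(s - n) \times U(n)$ and the Lie algebra $\mathrm{Lie}(U(N)_{LR}) =
u(N)^{++}$ is isomorphic to $u(r) \oplus u(s - n) \oplus u(n)$. In our case the subspace $\mathcal{Y}^{--}$ in
(3.5) reads
\[ u(N)^{--} = \left\{ \begin{bmatrix} 0 & 0 & 0 & A_{1,4} \\ 0 & 0 & 0 & A_{2,4} \\ 0 & 0 & 0 & 0 \\ -A_{1,4}^\dagger & -A_{2,4}^\dagger & 0 & 0 \end{bmatrix} \,\middle|\, A_{1,4} \in \mathbb{C}^{n \times n},\ A_{2,4} \in \mathbb{C}^{(r-n) \times n} \right\}. \tag{3.35} \]
To proceed, we define the diagonal $n \times n$ matrix
\[ \mathbf{q} := \mathrm{diag}(q_1, q_2, \dots, q_n) \in \mathbb{R}^{n \times n} \tag{3.36} \]
for any real $n$-tuple $(q_1, q_2, \dots, q_n) \in \mathbb{R}^n$, and we also set
\[ q := \begin{bmatrix} 0 & 0 & 0 & \mathbf{q} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\mathbf{q} & 0 & 0 & 0 \end{bmatrix} \in u(N)^{--}. \tag{3.37} \]
Then the set of matrices
\[ \mathcal{A} := \{q \mid (q_1, q_2, \dots, q_n) \in \mathbb{R}^n\} \subset u(N)^{--} \tag{3.38} \]
is a maximal Abelian subalgebra in $u(N)^{--}$. A basis of the dual space $\mathcal{A}^*$ is given
by the functionals
\[ \epsilon_k : \mathcal{A} \to \mathbb{R}, \qquad q \mapsto \epsilon_k(q) := q_k. \tag{3.39} \]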
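The corresponding subgroup $A = \exp(\mathcal{A})$ has the form
\[ A = \left\{ e^q = \begin{bmatrix} \cos(\mathbf{q}) & 0 & 0 & \sin(\mathbf{q}) \\ 0 & 1_{r-n} & 0 & 0 \\ 0 & 0 & 1_{s-n} & 0 \\ -\sin(\mathbf{q}) & 0 & 0 & \cos(\mathbf{q}) \end{bmatrix} \,\middle|\, (q_1, q_2, \dots, q_n) \in \mathbb{R}^n \right\}. \tag{3.40} \]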
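The block form (3.40) follows from exponentiating a matrix whose only non-zero blocks are $\pm\mathrm{diag}(q_1, \dots, q_n)$ in the corners. A minimal numerical sketch is given below; it assumes the sign convention in which the upper right block of the generator is $+\mathrm{diag}(q_1, \dots, q_n)$ and the lower left block is its negative, and it uses a truncated Taylor series as a stand-in for the matrix exponential. The helper names are hypothetical.

```python
import math


def q_matrix(q, r, s):
    """N x N generator (N = r + s) with +diag(q) in block (1,4) and -diag(q) in block (4,1)."""
    n, size = len(q), r + s
    Q = [[0.0] * size for _ in range(size)]
    for k, qk in enumerate(q):
        Q[k][size - n + k] = qk
        Q[size - n + k][k] = -qk
    return Q


def mat_mul(A, B):
    """Plain matrix product of two square nested lists."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)] for i in range(size)]


def mat_exp(A, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small entries)."""
    size = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(size)] for i in range(size)]
    return result


def expected_exp(q, r, s):
    """Closed form with cos on the corner diagonals and +-sin in the corner off-diagonals."""
    n, size = len(q), r + s
    E = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for k, qk in enumerate(q):
        E[k][k] = E[size - n + k][size - n + k] = math.cos(qk)
        E[k][size - n + k] = math.sin(qk)
        E[size - n + k][k] = -math.sin(qk)
    return E
```

For example, with `q = [0.3, 1.1]` and `r = s = 3`, the entries of `mat_exp(q_matrix(q, r, s))` should coincide with those of `expected_exp(q, r, s)` up to rounding, including the identity blocks of sizes $r - n$ and $s - n$ in the middle.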
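If $\mathbb{T}(n)$ denotes the diagonally embedded standard torus in $U(n)$, then it is
straightforward to show that the subgroup $M$ (3.8) is now furnished by
\[ M = \left\{ \begin{bmatrix} a & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ 0 & 0 & c & 0 \\ 0 & 0 & 0 & a \end{bmatrix} \,\middle|\, a \in \mathbb{T}(n),\ b \in U(r - n),\ c \in U(s - n) \right\}. \tag{3.41} \]
Note that $M$ is connected, and therefore so is the centralizer $K = M_{\mathrm{diag}}$ of the
section $A$. Moreover, we have the identifications
\[ K = M_{\mathrm{diag}} \cong M \cong \mathbb{T}(n) \times U(r - n) \times U(s - n) \cong U(1)^n \times U(r - n) \times U(s - n). \tag{3.42} \]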

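It is shown in [26, p. 63] that the closed, connected subset
\[ A_+ := \left\{ e^q \,\middle|\, 0 \leq q_1 \leq q_2 \leq \cdots \leq q_n \leq \frac{\pi}{2} \right\} \subset A \tag{3.43} \]
intersects each orbit of $G = U(N)_L \times U(N)_R$ under the action (3.2) precisely
once. Note also that matrix exponentiation provides a bijection from
\[ \mathcal{A}_+ := \left\{ q \,\middle|\, 0 \leq q_1 \leq q_2 \leq \cdots \leq q_n \leq \frac{\pi}{2} \right\} \subset \mathcal{A} \tag{3.44} \]
onto $A_+$. By inspecting the isotropy subgroup $G_{e^q} \subset G$ for $e^q \in A_+$, we find that
$G_{e^q} = K$ if and only if $q \in \check{\mathcal{A}}_+$, where $\check{\mathcal{A}}_+$ denotes the connected open subset
\[ \check{\mathcal{A}}_+ := \left\{ q \,\middle|\, 0 < q_1 < q_2 < \cdots < q_n < \frac{\pi}{2} \right\} \subset \mathcal{A}_+. \tag{3.45} \]
We can conclude from the above that the subset $\check{A} := \exp(\check{\mathcal{A}}_+)$ provides a connected
component for the regular part of the section $A$. Regarding the components $q_k$ in
(3.45) as global coordinates on $\check{A}$, for the Laplace operator $\Delta_{\check{A}}$ defined by the
induced metric we obtain
\[ \Delta_{\check{A}} = \frac{1}{2} \sum_{k=1}^{n} \frac{\partial^2}{\partial q_k^2}. \tag{3.46} \]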

3.3. Diagonalization of the inertia operator


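We continue the study of the examples (3.31) by presenting a basis of $\mathcal{K}^\perp$ that
diagonalizes $J(e^q)$ (3.19) for any $q \in \check{\mathcal{A}}_+$ in (3.45). We then use this basis to
compute the density $\delta^{\frac{1}{2}}$ that enters the second term of the reduced Laplacian (2.14).
Note that $\delta^{\frac{1}{2}}$ could be found also by the specialization of general formulae available
for two commuting involutions [25, 2], but we need to fix a basis for the evaluation
of the third term of (2.14), which will be performed later.
We start by defining an orthonormal basis (ONB) in the space $\mathcal{M}^\perp \cap u(N)^{++}$,
which (due to (3.34) and (3.41)) has the form
\[ \mathcal{M}^\perp \cap u(N)^{++} = \left\{ \begin{bmatrix} X_{1,1} & X_{1,2} & 0 & 0 \\ -X_{1,2}^\dagger & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & X_{4,4} \end{bmatrix} \,\middle|\, X_{1,1}, X_{4,4} \in u(n),\ (X_{1,1} + X_{4,4})_{\mathrm{diag}} = 0,\ X_{1,2} \in \mathbb{C}^{n \times (r-n)} \right\}. \tag{3.47} \]
If $r = n$, then there are no off-diagonal blocks, and in general $\dim(\mathcal{M}^\perp \cap u(N)^{++}) =
n(2r - 1)$. For all $1 \leq j \leq n$ we let
\[ E^{\mathrm{i}}_{2\epsilon_j} := \frac{\mathrm{i}}{\sqrt{2}} \begin{bmatrix} E_{jj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -E_{jj} \end{bmatrix}, \tag{3.48} \]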

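and for all $1 \leq k < l \leq n$ we define
\[ \begin{aligned}
E^{\mathrm{r}}_{\epsilon_k + \epsilon_l} &:= \frac{1}{2} \begin{bmatrix} E_{kl} - E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{lk} - E_{kl} \end{bmatrix}, &
E^{\mathrm{i}}_{\epsilon_k + \epsilon_l} &:= \frac{\mathrm{i}}{2} \begin{bmatrix} E_{kl} + E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -E_{kl} - E_{lk} \end{bmatrix}, \\
E^{\mathrm{r}}_{\epsilon_k - \epsilon_l} &:= \frac{1}{2} \begin{bmatrix} E_{kl} - E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{kl} - E_{lk} \end{bmatrix}, &
E^{\mathrm{i}}_{\epsilon_k - \epsilon_l} &:= \frac{\mathrm{i}}{2} \begin{bmatrix} E_{kl} + E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{kl} + E_{lk} \end{bmatrix}.
\end{aligned} \tag{3.49} \]
For all $1 \leq j \leq n$ and $1 \leq d \leq r - n$ we set
\[ E^{\mathrm{r},d}_{\epsilon_j} := \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & E_{jd} & 0 & 0 \\ -E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
E^{\mathrm{i},d}_{\epsilon_j} := \frac{\mathrm{i}}{\sqrt{2}} \begin{bmatrix} 0 & E_{jd} & 0 & 0 \\ E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \tag{3.50} \]
The superscripts i and r refer to purely imaginary and to real matrices, respectively,
and the elementary matrices $E_{ab}$ are always understood to be of the correct size as
dictated by (3.33). The set of matrices
\[ \{E^D_\alpha\}_{\alpha, D} := \{E^{\mathrm{r}}_{\epsilon_k \pm \epsilon_l}, E^{\mathrm{i}}_{\epsilon_k \pm \epsilon_l}\}_{1 \leq k < l \leq n} \cup \{E^{\mathrm{i}}_{2\epsilon_j}\}_{j=1}^{n} \cup \{E^{\mathrm{r},d}_{\epsilon_j}, E^{\mathrm{i},d}_{\epsilon_j}\}_{1 \leq j \leq n,\ 1 \leq d \leq r-n} \tag{3.51} \]
forms an ONB in $\mathcal{M}^\perp \cap u(N)^{++}$. Here $D$ is an index of degeneration and $\alpha$ runs
over the positive roots $R_+$ for the root system $C_n$ or $BC_n$. More precisely,
\[ R_+ = \begin{cases} R_+(C_n) & \text{if } r = n, \\ R_+(BC_n) & \text{if } r > n. \end{cases} \tag{3.52} \]
One can easily verify the relations
\[ (\mathrm{ad}_q)^2 E^D_\alpha = -\alpha(q)^2 E^D_\alpha. \tag{3.53} \]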



Next, we deal with the subspaces u(N )+ and u(N )+ given by



0 0 0 0 




0 

0 0 0 
u(N )+ =
0
 X3,4 C(sn)n ,
 (3.54)

0 0 X3,4 





0 0 X3,4 0

u(N )+


0 0 X1,3 0 




0 0 X2,3

0 


= X
 X1,3 Cn(sn) , X2,3 C(rn)(sn) .




X2,3 0 0 





1,3


0 0 0 0 
(3.55)

Note that both u(N )+ and u(N )+ are trivial if s = n. In general, dim(u(N )+ ) =
2n(s n) and dim(u(N )+ ) = 2r(s n). For all 1 j n and 1 d s n we
dene

0 0 0 0 0 0 0 0

1
0 0 0 0
, i
0 0 0 0
,
r,d
Ej :=
i,d
Ej := (3.56)
2 0 0 0 Edj


2 0 0 0 Edj

0 0 Ejd 0 0 0 Ejd 0


0 0 Ejd 0 0 0 Ejd 0

1 0 0 0 0 i 0 0 0 0
Fj :=
r,d

, Fj :=
i,d . (3.57)
2 Edj 0 0 0

2 Edj 0 0 0
0 0 0 0 0 0 0 0

For all 1 c r n and 1 d s n we introduce



0 0 0 0 0 0 0 0

1 0 0 Ecd 0 i 0 0 Ecd 0

F0r,c,d
:=
,
F0i,c,d
:= .
2 0 Edc 0 0

2 0 Edc 0 0
0 0 0 0 0 0 0 0
(3.58)

The set of matrices

{E
D }j,D := {E
j j
i,d }1jn
r,d , E
j
(3.59)
1dsn

forms an ONB in u(N )+ . The set of matrices

{FDj }j,D := {Fr,d


j
, Fi,d
j
}1jn (3.60)
1dsn

together with the set

{F0D }D := {F0r,c,d, F0i,c,d}1crn (3.61)


1dsn

form an ONB in u(N )+ . They verify the relations

D ) = qj F D ,
adq (Ej j adq (FDj ) = qj E
D,
j adq (F0D ) = 0. (3.62)

Now we compute the matrix of J and of J^{-1} on the invariant subspaces in
(3.17). First, choose an arbitrary ONB {L_j}_{j=1}^{dim(M)} in M. Then the vectors

Lj
j := 1 (Lj , Lj ) 1
L (3.63)
2 2 Lj

yield an ONB in Ka . The matrix entries of J(eq )|K


a
and J(eq )1 |K
a
read

k , J(eq )L
BG (L l ) = 2k,l , BG (L l ) = 1 k,l .
k , J(eq )1 L (3.64)
2
Second, upon introducing the vectors
 
1 ED 1 ED
V :=
D
, W :=
D
, (3.65)
2 ED 2 ED

we obtain an ONB in Ke , and by applying (3.19) on these vectors we get



q 1 (1 cosh(adq ))ED
J(e )V =
D
,
2 (1 cosh(adq ))ED
 (3.66)
q 1 (1 + cosh(adq ))ED
J(e )W =
D
.
2 (1 + cosh(adq ))ED

We find from the relations (3.53) that cosh(ad_q)E_α^D = cos(α(q))E_α^D, and then
elementary trigonometric identities yield
\[
J(e^q)V_\alpha^D = 2\sin^2\!\left(\frac{\alpha(q)}{2}\right) V_\alpha^D, \qquad
J(e^q)W_\alpha^D = 2\cos^2\!\left(\frac{\alpha(q)}{2}\right) W_\alpha^D. \tag{3.67}
\]
Therefore the only non-trivial matrix entries of J(e^q)|_{K_e} and J(e^q)^{-1}|_{K_e} are the
following ones:
\[
\begin{aligned}
B_G(V_\alpha^D, J(e^q)V_\alpha^D) &= 2\sin^2\!\left(\frac{\alpha(q)}{2}\right), &
B_G(W_\alpha^D, J(e^q)W_\alpha^D) &= 2\cos^2\!\left(\frac{\alpha(q)}{2}\right), \\
B_G(V_\alpha^D, J(e^q)^{-1}V_\alpha^D) &= \frac{1}{2\sin^2\!\left(\frac{\alpha(q)}{2}\right)}, &
B_G(W_\alpha^D, J(e^q)^{-1}W_\alpha^D) &= \frac{1}{2\cos^2\!\left(\frac{\alpha(q)}{2}\right)}.
\end{aligned} \tag{3.68}
\]
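The passage from (3.53) to (3.67) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: a 2×2 matrix `A` with A² = −a²·1 stands in for ad_q restricted to a two-dimensional invariant subspace (a plays the role of α(q)), cosh(A) is summed as a truncated power series, and the two half-angle identities are compared directly. All function names are hypothetical.

```python
import math

def mat_mul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cosh_series(a_mat, terms=30):
    """cosh(A) = sum over m of A^(2m) / (2m)!, truncated after `terms` terms."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    a_squared = mat_mul(a_mat, a_mat)
    for m in range(1, terms):
        power = mat_mul(power, a_squared)
        inv_fact = 1.0 / math.factorial(2 * m)
        for i in range(2):
            for j in range(2):
                result[i][j] += power[i][j] * inv_fact
    return result

def max_deviation(a):
    """Largest deviation among the three identities for the angle a."""
    A = [[0.0, -a], [a, 0.0]]  # assumption: A squared equals -a^2 times the identity
    c = cosh_series(A)
    dev_cosh = max(abs(c[0][0] - math.cos(a)), abs(c[1][1] - math.cos(a)),
                   abs(c[0][1]), abs(c[1][0]))
    dev_minus = abs((1 - math.cos(a)) - 2 * math.sin(a / 2) ** 2)
    dev_plus = abs((1 + math.cos(a)) - 2 * math.cos(a / 2) ** 2)
    return max(dev_cosh, dev_minus, dev_plus)
```

Calling `max_deviation(a)` for a few angles should return values at the level of floating-point rounding, which is the content of cosh(ad_q) acting as cos(α(q)) together with 1 ∓ cos x = 2 sin²(x/2), 2 cos²(x/2).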
Third, by introducing
 D  
1 E  ED 0
Vj :=
D j
, D := 1
W
j
, Z0D := , (3.69)
j
2 FD j
2 FD
j
F D
0

we obtain an ONB in Ko , and the application of (3.19) on these basis vectors gives

1 D sinh(adq )FD
E j j
J(eq )VD
j
= ,
2 sinh(adq )EDj + FDj
 (3.70)
1 D + sinh(adq )F D
Ej j
J(eq )W
D
j = .
2 sinh(adq )EDj FDj
By using the relations (3.62) we see that
J(eq )VD
j
= (1 + sin(qj ))VD
j
, J(eq )W
D = (1 sin(qj ))W
j
D .
j
(3.71)
q
Since J(e )Z0D
= Z0D ,
we conclude that the only non-trivial matrix entries of
q
J(e )|K
o
and its inverse J(eq )1 |K
o
are the following ones:
BG (VD
j
, J(eq )VD
j
) = 1 + sin(qj ), D , J(eq )W
BG (W j
D ) = 1 sin(qj ),
j

1 1
BG (VD , J(eq )1 VD )= , D , J(eq )1 W
BG (W D ) = ,
j j
1 + sin(qj ) j j
1 sin(qj )

BG (Z0D , J(eq )Z0D ) = 1, BG (Z0D , J(eq )1 Z0D ) = 1. (3.72)


Lemma 3.2. By using the identification := A = exp(A+ ) with A+ in (3.45),
the second term of the reduced Laplacian (2.14) is given by
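\[
\delta^{-\frac{1}{2}}\,\Delta_A(\delta^{\frac{1}{2}})
= \frac{(m-n)(r-s)}{2}\sum_{j=1}^{n}\frac{1}{\sin^2(q_j)}
+ \frac{4(s-n)^2-1}{2}\sum_{j=1}^{n}\frac{1}{\sin^2(2q_j)}
- \frac{n(3m^2+n^2-1)}{6}. \tag{3.73}
\]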

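Proof. Consider the function
\[
J := \prod_{1 \le k < l \le n} [\sin(q_k - q_l)\sin(q_k + q_l)]^{\alpha}
\prod_{j=1}^{n} [\sin(q_j)]^{\beta_1}
\prod_{j=1}^{n} [\sin(2q_j)]^{\beta_2}, \tag{3.74}
\]
where the domain of the variables q_1, q_2, . . . , q_n is such that all sin functions are
positive and α, β_1, β_2 ∈ R are arbitrary parameters. Recall from [9] the identity
\[
\begin{aligned}
J^{-1}\sum_{a=1}^{n}\frac{\partial^2 J}{\partial q_a^2}
={}& 2\alpha(\alpha-1)\sum_{1 \le k < l \le n}
\left(\frac{1}{\sin^2(q_k - q_l)} + \frac{1}{\sin^2(q_k + q_l)}\right) \\
&+ \beta_1(\beta_1 + 2\beta_2 - 1)\sum_{j=1}^{n}\frac{1}{\sin^2(q_j)}
+ 4\beta_2(\beta_2 - 1)\sum_{j=1}^{n}\frac{1}{\sin^2(2q_j)} \\
&- n\left((\beta_1 + 2\beta_2)^2 + 2\alpha(\beta_1 + 2\beta_2)(n-1)
+ \frac{2}{3}\alpha^2(n-1)(2n-1)\right).
\end{aligned} \tag{3.75}
\]
By calculating det(J(e^q)) using the above basis of K, it is easily obtained from
(2.13) that δ^{1/2}(e^q) ∝ J(q_1, q_2, . . . , q_n) with
\[
\alpha = 1, \qquad \beta_1 = r - s, \qquad \beta_2 = s - n + \frac{1}{2}. \tag{3.76}
\]
Taking into account (3.46), the required statement follows immediately.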
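The two steps of this proof lend themselves to a numerical sanity check. The Python sketch below is an illustration, not part of the argument: it differentiates the product of sine powers (3.74) by central finite differences and compares the result with a closed form in which the pair term carries the coefficient 2α(α − 1) (the value obtained by direct differentiation), and it then compares one half of that closed form at α = 1, β_1 = r − s, β_2 = s − n + 1/2 with the right-hand side of (3.73). The function names, the step size `h` and the sample point are assumptions made for the example.

```python
import math

def J(q, alpha, beta1, beta2):
    """Product of sine powers as in (3.74); every sine must be positive at q."""
    value = 1.0
    n = len(q)
    for k in range(n):
        for l in range(k + 1, n):
            value *= (math.sin(q[k] - q[l]) * math.sin(q[k] + q[l])) ** alpha
    for x in q:
        value *= math.sin(x) ** beta1 * math.sin(2 * x) ** beta2
    return value

def laplacian_over_J(q, alpha, beta1, beta2, h=1e-4):
    """Central-difference estimate of J^{-1} * sum_a d^2 J / dq_a^2."""
    base = J(q, alpha, beta1, beta2)
    total = 0.0
    for a in range(len(q)):
        up = list(q)
        up[a] += h
        down = list(q)
        down[a] -= h
        total += (J(up, alpha, beta1, beta2) - 2 * base
                  + J(down, alpha, beta1, beta2)) / h ** 2
    return total / base

def inverse_square_sums(q):
    """Return (sum_j 1/sin^2(q_j), sum_j 1/sin^2(2 q_j))."""
    s1 = sum(1 / math.sin(x) ** 2 for x in q)
    s2 = sum(1 / math.sin(2 * x) ** 2 for x in q)
    return s1, s2

def closed_form(q, alpha, beta1, beta2):
    """Closed form of J^{-1} * sum_a d^2 J / dq_a^2 used for the comparison."""
    n = len(q)
    pair = sum(1 / math.sin(q[k] - q[l]) ** 2 + 1 / math.sin(q[k] + q[l]) ** 2
               for k in range(n) for l in range(k + 1, n))
    s1, s2 = inverse_square_sums(q)
    b = beta1 + 2 * beta2
    const = n * (b ** 2 + 2 * alpha * b * (n - 1)
                 + (2 / 3) * alpha ** 2 * (n - 1) * (2 * n - 1))
    return (2 * alpha * (alpha - 1) * pair + beta1 * (b - 1) * s1
            + 4 * beta2 * (beta2 - 1) * s2 - const)

def second_term(q, r, s, m):
    """Right-hand side of (3.73), with n = len(q) and r + s = m + n."""
    n = len(q)
    s1, s2 = inverse_square_sums(q)
    return ((m - n) * (r - s) / 2 * s1 + (4 * (s - n) ** 2 - 1) / 2 * s2
            - n * (3 * m ** 2 + n ** 2 - 1) / 6)
```

For instance, at q = (0.9, 0.4), where all sines in (3.74) are positive, `laplacian_over_J` and `closed_form` should agree up to discretization error, and `0.5 * closed_form(q, 1.0, r - s, s - n + 0.5)` should reproduce `second_term(q, r, s, m)`; the factor 1/2 comes from (3.46).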

The subsequent formula is obtained by direct substitution since we have determined
the matrix elements of J(e^q)^{-1} (cf. (2.11)). It will be used in Sec. 4, when
we shall further inspect the reduced Laplace operator (2.14) in interesting cases.

Lemma 3.3. In terms of the above notations, the third term of the reduced Lapla-
cian (2.14) takes the following form:

b,  (T ) (T )
 
1   n
 (V2
i
)2  (W2
i
)2
= j )2 + 1

(L j
+ j

2
1jdim(M)
2 j=1 sin2 (qj ) cos2 (qj )

1   (Vrk l )2 +  (Vik l )2  (Wrk l )2 +  (Wik l )2
+   +  
2 qk ql qk ql
2 2
1k<ln sin cos
2 2

1   (Vrk +l )2 +  (Vik +l )2  (Wrk +l )2 +  (Wik +l )2
+   +  
2 qk + ql qk + ql
2 2
1k<ln sin cos
2 2


1   (Vj ) + (Vj )
 
n rn r,d 2 i,d 2
 (Wr,d )2 +  (Wi,d )2
+ & ' + j
&q ' j

2 j=1 2 qj 2 j
d=1 sin cos
2 2
 
  (Vj ) + (Vj )
n sn  r,d 2  i,d 2  r,d 2
(Wj ) + (W  i,d )2
j
+ +
j=1
1 + sin(qj ) 1 sin(qj )
d=1

 sn
rn 
+ ( (Z0r,c,d )2 +  (Z0i,c,d)2 ). (3.77)
c=1 d=1

4. BCn Sutherland Models from the KKS Ansatz


In this section, we study interesting examples of the quantum Hamiltonian reduction
based on the Hermann action (3.2) on Y = U(N) associated with the involutions
(3.31). The reductions correspond to certain UIRREPS of the symmetry
group
G = U (N )L U (N )R = (U (r) U (s)) (U (m) U (n)). (4.1)
To describe them, we now briefly summarize our notations for the UIRREPS of
U (n), for arbitrary n. (See also the Appendix.) First, we have the UIRREP ( , V )
of SU (n) in correspondence to any highest weight P+ (SU (n)), that can be
(n1
written as = i=1 ai i using the fundamental weights i and integers ai
Z0 . A label n () {0, 1, . . . , n 1} is attached to the highest weight by the
congruence relation

n1 
n1
n () kak (mod n) for = ai  i . (4.2)
k=1 i=1
2 2
It enters the equality (ei n 1n ) = ei n n () IdV . Then, for any k Z, the rep-
resentation of SU (n) extends to the representation (k,) of U (n) dened by

(k,) (g) = nk+n () (g), U (1), g SU (n). (4.3)


Up to equivalence, all UIRREPS of U (n) are obtained in this way. The notation
makes sense even for n = 1, by putting P+ (SU (1)) := {0}, and we have (k,0) (g) =
(det g)k (g U (n)). By letting (k,) and stand for the innitesimal version of
the representations (k,) and , respectively, we have
 
 tr(Z) tr(Z)
(k,) (Z) = Z 1n + (n () + nk) IdV , Z u(n). (4.4)
n n
(n) (n) (n)
We use the notations , V , (k,) etc. when considering various values of n
simultaneously.
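The label defined by the congruence (4.2) and the scalar part of (4.4) involve only integer arithmetic, which the following Python sketch illustrates. It assumes that a highest weight of SU(n) is passed as the list [a_1, ..., a_{n-1}] of its coefficients with respect to the fundamental weights, so that n is recovered from the length of the list; the function names are hypothetical and not taken from the text.

```python
def congruence_label(a):
    """Label of (4.2): sum over k of k * a_k, reduced modulo n.

    `a` holds the coefficients a_1, ..., a_{n-1}; the empty list
    corresponds to n = 1, where the only highest weight is 0.
    """
    n = len(a) + 1
    return sum(k * a_k for k, a_k in enumerate(a, start=1)) % n

def central_scalar_factor(k, a):
    """Integer multiplying tr(Z)/n in the scalar part of (4.4): label + n*k."""
    n = len(a) + 1
    return congruence_label(a) + n * k
```

For a multiple of the first fundamental weight the label is simply a_1 modulo n, so it vanishes exactly when n divides a_1; for the trivial highest weight the factor reduces to n·k, in agreement with the representation g ↦ (det g)^k.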
The UIRREPS of the direct product group G (4.1) have the form
& ' & '
(r) (s) (m) (n)
= (k1 ,1 )  (k2 ,2 )  (k1 ,1 )  (k2 ,2 ) , (4.5)
L L L L R R R R

where 1L , 2L , 1R , 2R are the highest weights and kL


1 2
, kL 1
, kR 2
, kR Z according to
(4.3). The main problem is to find the UIRREPS (, V ) for which
dim(V K ) = 1, (4.6)
where K = Mdiag < G is given by (3.42). We investigate this problem by adopt-
ing the ansatz that one of the four constituent representations in (4.5) has the
(l)
form (k,a1
1 ) (l {r, s, m, n}) and the other three constituent representations are
(l)
1-dimensional. More exactly, (k,a1
1 ) will be used for a factor of the maximal size,
l = max{r, s, m, n}. We call this assumption the KKS ansatz, since it eventually
originates from the seminal paper by Kazhdan, Kostant and Sternberg [14]. The
usefulness of this assumption is also supported by results in [13, 19, 20]. The key
(l)
property is that all weight-multiplicities of (k,a1
1 ) are equal to one. The analysis
of the condition (4.6) is the easiest if the group K (3.42) is Abelian, which happens
in the following cases:
Case I: m = r = s = n, N = 2n,
Case II: m = r = n + 1, s = n, N = 2n + 1,
Case III: m = n + 2, r = s = n + 1, N = 2n + 2.
Next we describe the simplest Case I in detail, then present the essential points for
the other two cases. The complex holomorphic analogue of Case I was studied in
[19]; and the results are consistent. The other two cases of our KKS ansatz have
not been investigated before.
Remark. The reader may wonder why we take l = max{r, s, m, n} in our KKS
ansatz in Cases II and III. In fact, we previously studied ([20] and unpublished
work) the classical Hamiltonian reductions of the free particle on U (N ) based on
the symmetry group (4.1) by using a minimal coadjoint orbit of positive dimension
for any one of the four factors and one-point orbits for the other three factors. We
found that this leads to the classical BCn Sutherland model with three independent
coupling constants only in the three cases mentioned above, and only if the minimal
coadjoint orbit of positive dimension, 2(l − 1) for U(l), is associated with a factor
of maximal size. The connection to quantum Hamiltonian reduction is clear from
the relation between the coadjoint orbits of U(l) of dimension 2(l − 1) and the
(l)
representations (k,a1
1 ) (and their contragredients), which follows for example
from geometric quantization.

4.1. Case I: m = r = s = n, N = 2n
Now L = R = n,n and U (N )L = U (N )R = U (n) U (n). The decomposition
(3.33) of any matrix in C^{N×N} simplifies to a two by two block form with all four
blocks having size n × n. We look for admissible UIRREPS of G (4.1) by adopting
the KKS ansatz
& ' & '
(n) (n) (n) (n)
:= (k1 ,a1
1 )  (k2 ,0)  (k1 ,0)  (k2 ,0) , (4.7)
L L R R

where a1 Z0 , kL
1 2
, kL 1
, kR 2
, kR Z and the representation space is identied as
V Va(n)
1
1
. (4.8)
Note that any element X G = u(N )L ,+ u(N )R ,+ of the symmetry algebra G
can be realized as a pair X = (XL , XR ) with XL , XR u(N )L ,+ = u(N )R ,+
=
u(n) u(n). So, for any X G we have the rened decomposition
X = (XL , XR ) = ((XL1 , XL2 ), (XR
1 2
, XR )), (4.9)
where XL1 , XL2 , XR
1 2
, XR u(n) and as block-matrices
 
1 2
XL1 0 1 2
XR1
0
(XL , XL ) := , (X R , X R ) := . (4.10)
0 XL2 0 2
XR
With these notations, the formula of the Lie algebra representation corresponding
to (4.7) reads
   
 tr(XL1 ) n (a1 1 )
(X) = a1
1 XL
(n) 1
1n + kL + 1
tr(XL1 )
n n

2
+ tr(kL XL2 + kR
1
XR1
+ kR2 2
XR ) IdV . (4.11)

Lemma 4.1. The KKS ansatz (4.7) defines admissible UIRREPS of G satisfying
dim(V^K) ≠ 0 if and only if kL
1 2
+ kL 1
+ kR 2
+ kR = 0 and a1 = n with some Z0 .
In these cases dim(V K ) = 1. Using the bosonic oscillator realization of V (4.8)
described in the Appendix, V K has the form
VK (n)
= Vn
1
[0] = spanC {|, , . . . , }. (4.12)

Proof. The isotropy subalgebra is K = Mdiag = {X = (X0 , X0 ) | X0 M},


where M can be parametrized as

 

H + ix1n 0  (n)
M = X0 =  H iHR , x R . (4.13)
0 H + ix1n 
That is, for the components of any X K we have the parametrization
XL1 = XL2 = XR
1 2
= XR = H + ix1n . (4.14)
(n)
Thus, using Eq. (4.11), for any v and X K we can write
Va1
1
 (n)
 1 2 1 2

(X)v = a1
1 (H)v + ix n (a1 1 ) + n(kL + kL + kR + kR ) v. (4.15)
Clearly  (X)v = 0 (X K) if and only if
(n)
a(n)
1
1
(H)v = 0 (H iHR ) and n (a1 1 ) + n(kL
1 2
+ kL 1
+ kR 2
+ kR ) = 0.
(4.16)
VK = VK
(n) 1 2 1 2
Therefore = Va1
1 [0], provided that n (a1 1 )+n(kL +kL +kR +kR ) = 0.
(n)
It is easy to see that Va1
1 [0] = {0} if and only if a1 = n for some Z0 .

1 2 1 2
Since n (n1 ) = 0 by (4.2), the requirement kL + kL + kR + kR = 0 then also
(n)
follows from (4.16). Finally, note that by using the oscillator realization of Vn
1
one has the second equality in (4.12).
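The divisibility condition on a_1 can be illustrated by brute force. The sketch below assumes the oscillator realization of the Appendix in the following form: the representation space is spanned by the symbols |l_1, ..., l_n⟩ with l_1 + ... + l_n = a_1, and such a symbol has weight l_1 e_1 + ... + l_n e_n, which vanishes on traceless diagonal matrices exactly when all occupation numbers coincide. The function name is hypothetical.

```python
from itertools import product

def zero_weight_states(n, a1):
    """Occupation tuples (l_1, ..., l_n) with sum a1 and vanishing weight.

    The weight sum_j l_j e_j is zero on traceless diagonal matrices
    if and only if l_1 = l_2 = ... = l_n.
    """
    states = []
    for occupation in product(range(a1 + 1), repeat=n):
        if sum(occupation) == a1 and len(set(occupation)) == 1:
            states.append(occupation)
    return states
```

The list is empty unless n divides a_1, and otherwise it contains the single tuple with all entries equal to a_1/n, in line with dim(V^K) = 1 in the admissible cases.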

In what follows we make use of the basis of K constructed in Sec. 3.3. In the
present case this is given by the basis {V_α^a, W_α^a}_{a ∈ {r,i}, α ∈ R+(C_n)} of K_e together with
the basis of K_a defined according to (3.63) by using the following orthonormal
basis {L_j}_{j=1}^{n} of M:
\[
L_j := \frac{i}{\sqrt{2}}
\begin{pmatrix}
E_{jj} & 0 \\
0 & E_{jj}
\end{pmatrix} \in M \qquad (1 \le j \le n). \tag{4.17}
\]
Lemma 4.2. In the case of the KKS ansatz (4.7) subject to the conditions of
Lemma 4.1 the third term in the reduced Laplacian (2.14) gives
b,  (T ) (T )
  
1 1 1
= n(kL
1
) ( + 1)
2 2
+ kL +
2
1k<ln
sin (qk ql ) sin (qk + ql )
2 2

1 2 n n
1
(kL + kR ) (kL
1 2 2
+ kR ) 1 1
2 2(kL
2
+ kR
1 2
) 2 .
2 j=1
sin (qj ) j=1
sin (2qj )
(4.18)
Proof. Note that in the present case only the first four sums occur in the formula
(3.77). Recalling that n (n1 ) = 0 and utilizing formula (4.11) for  , we can
calculate the action of the various terms. For example, since

L j = 1 (Lj , Lj ) = i ((Ejj , Ejj ), (Ejj , Ejj )) , (4.19)


2 2
we get
   
j ) = i n

 (L (n)
Ejj
1
1n + (kL
1
+ kL
2
kR
1
kR
2
)IdV . (4.20)
1
2 n
The action of  (L
j ) on V K can be easily calculated in the bosonic oscillator picture.
(n)  
Since n
1 Ejj n1 1n |, , . . . ,  = 0, and since kR 2
= kL1
kL 2
kR 1
, it follows
that on the subspace V K = spanC {|, , . . . , } the operator 
( L j ) acts as the
 1 2  i 1 1
scalar (Lj ) = i(kL + kL ). In the same manner, the equalities (V2j ) = i(kL + kR )

and (W2j ) = i(kL + kR ) hold on V . Furthermore, we have on V
i 2 1 K

 (Vrk l ) =  (Wrk l ) =  (Vrk +l ) =  (Wrk +l )


1
= (n
(n)
1
(Ekl ) n

(n)
1
(Elk )), (4.21)
2 2
 (Vik l ) =  (Wik l ) =  (Vik +l ) =  (Wik +l )
i
= (n

(n)
1
(n)
(Ekl ) + n
1
(Elk )). (4.22)
2 2

Next, k, l {1, 2, . . . , n}, k = l, we obtain


(n)
n
1
(n)
(Ekl )n
1
(Elk )|, , . . . ,  = bk bl bl bk |, , . . . , 
= ( + 1)|, , . . . , . (4.23)
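The eigenvalue in (4.23) can be reproduced with a few lines of Python. This is a sketch under an explicit assumption: the annihilation and creation operators are taken with the standard normalization, b_i lowering the occupation number l_i with factor √l_i and its adjoint raising it with factor √(l_i + 1). Modes are indexed from 0 and the helper names are hypothetical.

```python
import math

def annihilate(state, i):
    """Apply b_i to (coefficient, occupations); None if mode i is empty."""
    coeff, occ = state
    if occ[i] == 0:
        return None
    lowered = list(occ)
    lowered[i] -= 1
    return (coeff * math.sqrt(occ[i]), tuple(lowered))

def create(state, i):
    """Apply the adjoint of b_i to (coefficient, occupations)."""
    coeff, occ = state
    raised = list(occ)
    raised[i] += 1
    return (coeff * math.sqrt(occ[i] + 1), tuple(raised))

def pair_eigenvalue(n, level, k, l):
    """Coefficient c with b_k^+ b_l b_l^+ b_k |level,...,level> = c |level,...,level>."""
    start = (level,) * n
    state = annihilate((1.0, start), k)   # rightmost operator acts first
    if state is None:
        return 0.0
    state = create(state, l)
    state = annihilate(state, l)
    state = create(state, k)
    assert state[1] == start              # the vector returns to itself
    return state[0]
```

With all occupation numbers equal to `level` and k ≠ l, the four factors are √level, √(level + 1), √(level + 1) and √level, so the product is level·(level + 1).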
The above equations imply that on V K
 (L
j )2 = (k 1 + k 2 )2 ,
L L  (V2
i
j
)2 = (kL
1 1 2
+ kR ) ,

 (W2
i
j
)2 = (kL
2 1 2
+ kR ) , (4.24)
1
 (Vr )2 +  (Vi )2 =  (Wr )2 +  (Wi )2 = ( + 1) for
2
= k l , k = l. (4.25)
Now (4.18) results by substitution into (3.77), using obvious trigonometric
identities.

The following proposition is obtained by putting together the statements of


Eq. (3.46), Lemmas 3.2 and 4.2.

Proposition 4.3. Under the KKS ansatz (4.7) the general formula (2.14) gives the
following result for the reduction of the Laplace operator of U (N ):
1 1
red = HBCn + n(kL1
) n(2n 1)(2n + 1),
2 2
+ kL (4.26)
2 6
where HBCn is the Sutherland Hamiltonian (1.4) with the coupling parameters
defined by
a , b |kL
1 1
+ kR |, c |kL
2 1
+ kR | (4.27)
1
in terms of the free parameters kL 2
, kL 1
, kR Z and Z0 determined by
Lemma 4.1.
1 2 1
Remark. By varying , kL , kL , kR , the coupling parameters a, b, c in (1.4) can take
arbitrary non-negative integer values. As further discussed in Sec. 5, Proposition 4.3
follows also from the results of Oblomkov [19].

4.2. Case II: m = r = n + 1, s = n, N = 2n + 1


In this case L = R = n+1,n and correspondingly U (N )L = U (N )R = U (n + 1)
U (n). We consider the following ansatz for the UIRREP (, V ) of the symmetry
group G (4.1),
& ' & '
(n+1) (n) (n+1) (n)
:= (k1 ,a1
1 )  (k2 ,0)  (k1 ,0)  (k2 ,0) , (4.28)
L L R R

(n+1)
where a1 Z0 , kL
1 2
, kL 1
, kR 2
, kR Z and the carrier space is identied as V Va1
1 .
Similarly to (4.9), any X G = u(N )L ,+ u(N )R ,+ can be realized as a pair
X = (XL , XR ) with XL , XR u(N )L ,+ = u(N )R ,+ = u(n + 1) u(n). So, we

 
write X G as X = (XL , XR ) = (XL1 , XL2 ), (XR 1
, XR2
) with XL1 , XR1
u(n + 1),
XL , XR u(n). Then (4.28) implies the formula
2 2
 
 tr(XL1 )
(X) = a1
1 XL
(n+1) 1
1n+1
n+1
  
1 n+1 (a1 1 ) 1 2 2 1 1 2 2
+ kL + tr(XL ) + kL tr(XL ) + kR tr(XR ) + kR tr(XR ) IdV .
n+1
(4.29)
Lemma 4.4. The KKS ansatz (4.28) yields admissible UIRREPS of G if and only
if , Z0 such that the parameters kL
1 2
, kL 1
, kR 2
, kR Z, and a1 Z0 satisfy
the conditions
a1 = n + , 2
kL 2
+ kR = , 1
kL 1
+ kR = R (
), (4.30)
where = Q + (n + 1)R with uniquely determined Q = Q(, ) {0, 1, . . . , n}
and R = R(, ) Z. If these conditions hold, then dim(V K ) = 1 and V K is
given by
VK
= Va(n+1)
1
1
[e1 + e2 + + en + en+1 ] = spanC {|, , . . . , , }, (4.31)
(n+1)
where the last equality refers to the bosonic oscillator realization of Va1
1 .

Proof. For the isotropy subalgebra we have K = Mdiag = {X = (X0 , X0 ) | X0


M}, where

D 0 0 
M = X0 = i 0 0  D = diag(d1 , d2 , . . . , dn ) Rnn , R . (4.32)

0 0 D 
So, for any X K we have XL = XR = X0 , and
 
1 1 D 0
XL = XR = i , XL2 = XR 2
= iD. (4.33)
0
(n
Now, for each = (1 , 2 , . . . , n ) Rn we let := j=1 j , and consider the
traceless Cartan elements
(n+1)
H := diag(1 , 2 , . . . , n , )
HR ,
(4.34)
:= diag(1 , 2 , . . . , n ) 1 1
H
(n)
n HR .
n
Then the components of X K can be parametrized as
 
XL1 = XR
1
= iH + ix1n+1 , XL2 = XR 2 + i x + 1 1n ,
= iH (4.35)
n
(n+1)
where Rn and x R. From (4.29), it follows that v Va1
1 we have
 (X)v = a(n+1)
1
1
2
(iH )v + i(kL 2
+ kR )v
+ ix(n+1 (a1 1 )
1 1 2 2
+ (n + 1)(kL + kR ) + n(kL + kR ))v. (4.36)

Clearly  (X)v = 0 (X K) if and only if

a(n+1)
1
1
(H )v = (kL
2 2
+ kR )v
( Rn ), (4.37)
1 1 2 2
(n
and n+1 (a1 1 ) + (n + 1)(kL + kR ) + n(kL + kR ) = 0. Note that = j =
(n j=1
j=1 e j (H ), so after introducing the shorthand notations
1
1 := kL 1
+ kR Z 2
and 2 := kL 2
+ kR Z, (4.38)

we conclude that


n
VK =VK
= Va(n+1)
1
1
2 ej , (4.39)
j=1

provided that n+1 (a1 1 ) + (n + 1)1 + n2 = 0. Our next goal is to identify the
(n+1)
weight space Va1
1 [2 (e1 + e2 + + en )]. Recall that 2 (e1 + e2 + + en )
(n+1)
Wa1
1 if and only if (l1 , l2 , . . . , ln+1 ) Zn+1
0 with l1 + l2 + + ln+1 = a1 , such
that

n+1 
n
2 (e1 + e2 + + en ) = lj ej = (lj ln+1 )ej . (4.40)
j=1 j=1

Since the functionals e1 , e2 , . . . , en are linearly independent, we end up with the


requirement l1 = l2 = = ln = ln+1 2 . For the free parameters we choose
2 2
:= l1 and := ln+1 , then the parameters 2 = kL + kR and a1 have to obey the
equations 2 = and a1 = n + . Note that under these assumptions we have

Va(n+1)
1
1
[2 (e1 + e2 + + en )] = Va(n+1)
1
1
[e1 + e2 + + en + en+1 ]
= spanC {|, , . . . , , }. (4.41)

Now let us express the value of the label n+1 (a1 1 ) {0, 1, . . . , n} in terms of
and . Recalling (4.2), we can write

n+1 (a1 1 ) = n+1 ((n + )1 ) n + (mod(n + 1)). (4.42)

Notice that ! Q = Q(, ) {0, 1, . . . , n} and ! R = R(, ) Z such that =


Q + (n + 1)R, thereby the previous congruence relation translates into the equation
n+1 (a1 1 ) = Q. Plugging this equation into the requirement n+1 (a1 1 ) + (n +
1)1 + n2 = 0, we get

R + 1 ),
0 = Q + (n + 1)1 + n(Q + (n + 1)R) = (n + 1)( (4.43)
1
therefore we end up with the additional constraint kL 1
+ kR = 1 = R (
).
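The bookkeeping at the end of this proof can be replayed with integer arithmetic. The sketch below is an illustration, not part of the proof; it assumes the reading of (4.40) to (4.43) in which the common value l_1 = ... = l_n and the value l_{n+1} are the free parameters, k_L^2 + k_R^2 equals l_{n+1} − l_1, and the division with remainder by n + 1 is applied to this difference. The function and key names are hypothetical.

```python
def case_two_parameters(n, l_first, l_last):
    """Case II data fixed by l_1 = ... = l_n = l_first and l_{n+1} = l_last."""
    a1 = n * l_first + l_last        # total occupation number
    kappa2 = l_last - l_first        # assumed value of k_L^2 + k_R^2
    R, Q = divmod(kappa2, n + 1)     # kappa2 = Q + (n + 1) * R with 0 <= Q <= n
    kappa1 = R - kappa2              # resulting value of k_L^1 + k_R^1
    label = a1 % (n + 1)             # label (4.2) of a1 times the first fundamental weight
    return {"a1": a1, "kappa1": kappa1, "kappa2": kappa2,
            "Q": Q, "R": R, "label": label}
```

For every choice of non-negative occupation numbers the label coincides with Q, and label + (n + 1)·kappa1 + n·kappa2 vanishes, which is the requirement stated after (4.37). Python's `divmod` uses floor division, so Q stays in {0, ..., n} also when l_{n+1} < l_1.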

Observe from Lemma 4.4 that kR 1 2


, kR Z and , Z0 can be taken as free
parameters that label the admissible cases of the KKS ansatz (4.28). By proceeding
like in Sec. 4.1, it is a matter of straightforward substitutions to specialize the
reduced Laplacian (2.14) to our case. In this way we found the following result.

Proposition 4.5. Under the KKS ansatz (4.28) with parameters satisfying (4.30)
the Laplace operator of U (N ) reduces to
1 1
red = HBCn + n(kR 1 2 2
+ kR ) + (kR ) n(n + 1)(2n + 1),
1 2
(4.44)
2 3
where HBCn is given by (1.4) with the coupling parameters determined in terms of
1
the arbitrary parameters kR 2
, kR Z and , Z0 according to
a , b + + 1, c |
+ kR
1
kR
2
|. (4.45)

Remark. The non-negative integer coupling parameters a, b, c that arise in this
case satisfy the condition b ≥ a + 1.

4.3. Case III: m = n + 2, r = s = n + 1, N = 2n + 2


Now the fixpoint subgroups of the two different involutions L = n+1,n+1 and
R = n+2,n are U (N )L
= U (n + 1) U (n + 1) and U (N )R = U (n + 2) U (n).
We consider the reductions associated with UIRREPS (, V ) of G (4.1) having the
form
& ' & '
(n+1) (n+1) (n+2) (n)
:= (k1 ,0)  (k2 ,0)  (k1 ,a1
1 )  (k2 ,0) , (4.46)
L L R R

where a1 Z0 and 1
kL 2
, kL 1
, kR 2
, kR Z, and the representation space is identied
(n+2)
as V Va1
1 . Any X G is a pair X = (XL , XR ) with XL u(n + 1) u(n + 1)
and XR u(n + 2) u(n), and we may further write XL = (XL1 , XL2 ) and XR =
(XR 1 2
, XR ), where now XL1 , XL2 u(n + 1), XR 1
u(n + 2) and XR 2
u(n). Then the
G-representation can be written as
 1

tr(XR )
 (X) = a(n+2)
1
1
X 1
R 1 n+2
n+2
   
1 n+2 (a1 1 )
+ kL tr(XL1 ) + kL
2
tr(XL2 )+ kR 1
+ 1
tr(XR 2
) + kR tr(XR2
) IdV .
n+2
(4.47)
Lemma 4.6. The KKS ansatz (4.46) yields admissible UIRREPS if and only if
, , Z0 and k Z such that the parameters kL
1 2
, kL 1
, kR 2
, kR Z and a1 Z0
satisfy the conditions
a1 = n + + , 1
kL = k, 2
kL = + k,
(4.48)
1
kR = R k, 2
kR = k,
where a1 = Q+(n+2)R with uniquely determined Q = Q(, , ) {0, 1, . . . , n+1}
and R = R(, , ) Z. If the above conditions are met, then dim(V K ) = 1 and

concretely
V K = Va(n+2)
1
1
[e1 + e2 + + en + en+1 + en+2 ]
= spanC {|, , . . . , , , }, (4.49)
(n+2)
where the last equality refers to the bosonic oscillator realization of Va1
1 .

Proof. For the isotropy subalgebra we have K = Mdiag = {X = (X0 , X0 ) | X0


M}, where


D 0 0 0 

0
0 0 
M = X0 = i 0 0
 D = diag(d1 , d2 , . . . , dn ) R

nn
, ,
R .


0 


0 0 0 D 
(4.50)
Any X = (XL , XR ) K satises XL = XR = X0 , and therefore it has the
components

  D 0 0
1
D 0 2

0 1 2
XL = i , XL = i , XR = i 0 0 , XR = iD.
0 0 D
0 0

(4.51)
(n+1
For any real (n + 1)-tuple = (1 , 2 , . . . , n+1 ) Rn+1 we let := j=1 j ,
(n
:= j=1 j , and introduce the traceless matrices
H := diag(1 , 2 , . . . , n+1 , ),


2
HR := diag(1 , 2 , . . . , n ) 1n , (4.52)
n

HL1 := diag(1 , 2 , . . . , n+1 ) 1n+1 ,
n+1
n+1
HL2 := diag(,
1 , . . . , n ) + 1n+1 . (4.53)
n+1
We then write the components of X K in the form
 
1 2 2
XR = iH + ix1n+2 , XR = iHR +i x+ 1n , (4.54)
n
   
n+1
1 1
XL = iHL + i x + 1n+1 , XL = iHL + i x
2 2
1n+1 . (4.55)
n+1 n+1
(n+2)
From (4.47), it follows that for any v Va1
1 and X K we have
 (X)v = a(n+2)
1
1
1
(iH )v + i(kL kL
2 2
n+1 + kR )v

 1 1 2 2

+ ix n+2 (a1 1 ) + (n + 2)kR + (n + 1)(kL + kL ) + nkR v.
(4.56)

Clearly  (X)v = 0 (X K) if and only if

a(n+2)
1
1
2
(H )v = (kL n+1 kL
1
kR
2
)v
( Rn ), (4.57)

and
1 1 2 2
n+2 (a1 1 ) + (n + 2)kR + (n + 1)(kL + kL ) + nkR = 0. (4.58)

Since
2
kL n+1 kL
1
kR
2
= (kL
1 2
+ kR )(e1 + e2 + + en )(H )
2
+ (kL kL
1
)en+1 (H ), (4.59)

we obtain from (4.57) that we must have

V K = Va(n+2)
1
1
1
[(kL 2
+ kR )(e1 + e2 + + en ) + (kL
2
kL
1
)en+1 ]. (4.60)

It is easy to see (cf. the Appendix) that the weight space in (4.60) is non-trivial if
and only if (l1 , l2 , . . . , ln+2 ) Zn+2
0 with l1 + l2 + + ln+2 = a1 , such that


n+1
(kL
1 2
+ kR )(e1 + e2 + + en ) + (kL
2
kL
1
)en+1 = (lj ln+2 )ej . (4.61)
j=1

We set
1
:= l1 , := ln+1 , := ln+2 , k := kL . (4.62)

Then (4.61) requires l1 = l2 = = ln = and = k + kR


2
with = kL
2
k.
So, regarding , , Z and k Z as free parameters, we see that the other
parameters have to obey the relations
2
kL = + k, 2
kR = k, a1 = n + + . (4.63)

To satisfy the remaining condition (4.58), we now dene Q = Q(, , )


{0, 1, . . . , n + 1} and R = R(, , ) Z by the equality

a1 = n + + = Q + (n + 2)R. (4.64)
1
Then (4.58) translates into the condition kR = R k, which completes the
proof.
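As in Case II, the relations obtained in this proof can be cross-checked by integer arithmetic. The sketch below is an illustration under an assumed reading of (4.57) to (4.64): the free data are the common value l_1 = ... = l_n, the values l_{n+1} and l_{n+2}, and k = k_L^1, with k_L^2 = l_{n+1} − l_{n+2} + k, k_R^2 = l_{n+2} − l_1 − k and k_R^1 = R − l_{n+1} − k, where a_1 = Q + (n + 2)R. All names are hypothetical.

```python
def case_three_parameters(n, l_first, l_np1, l_np2, k):
    """Case III data from l_1 = ... = l_n = l_first, l_{n+1}, l_{n+2} and k = k_L^1."""
    a1 = n * l_first + l_np1 + l_np2
    R, Q = divmod(a1, n + 2)          # a1 = Q + (n + 2) * R with 0 <= Q <= n + 1
    kL1 = k
    kL2 = l_np1 - l_np2 + k
    kR2 = l_np2 - l_first - k
    kR1 = R - l_np1 - k
    return {"a1": a1, "Q": Q, "R": R,
            "kL1": kL1, "kL2": kL2, "kR1": kR1, "kR2": kR2}

def condition_4_58(n, p):
    """Left-hand side of (4.58); the label of a1 times the first fundamental weight is Q."""
    return (p["Q"] + (n + 2) * p["kR1"]
            + (n + 1) * (p["kL1"] + p["kL2"]) + n * p["kR2"])
```

Under these assumptions `condition_4_58` returns zero for all inputs, and the coefficients −(k_L^1 + k_R^2) and k_L^2 − k_L^1 of the weight in (4.60) reproduce l_1 − l_{n+2} and l_{n+1} − l_{n+2}, respectively.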

Further direct calculations yield the explicit form of the reduced Laplacian
(2.14).

Proposition 4.7. Under the KKS ansatz (4.46) parametrized by arbitrary , ,


Z0 and k Z according to Lemma 4.6, the reduced Laplacian of U (N ) satisfies

red = HBCn + C with the constant


1 1
C = n(4n2 + 12n + 11) + n(2k + )2 + (
+ k)(
+ k + 1)
6 2
k)(
+ ( k + 1) (4.65)
and coupling parameters given in the notation (1.4) by
a , b + + 1, c + + 1. (4.66)

Remark. The integer coupling parameters a, b, c arising in this case satisfy b, c ≥ a + 1.

5. Discussion
We here summarize the results, discuss the related work [19] and point out open
problems.
In this paper, we applied the formalism of quantum Hamiltonian reduction under
polar group actions to study the reductions of the Laplace operator of U(N) by
means of the Hermann action (3.2) of the symmetry group G = (U(r) × U(s)) ×
(U(m) × U(n)) with N = m + n = r + s. We concentrated on the three series
of cases for which the centralizer of the corresponding section, the group K =
Mdiag (3.42), is Abelian. We built the representation of the symmetry group on V
that enters the definition of the reduction by using as building blocks in (4.5)
1-dimensional representations and a symmetric power of the defining representation
of the largest factor of G. In the framework of this KKS ansatz we determined
all cases for which the reduction is consistent (that is dim(V^K) ≠ 0), and saw
also that in these admissible cases dim(V^K) = 1. We then calculated the explicit
formula of the reduced Laplacian by specializing Eq. (2.14), and found that up to
an additive constant it yields the BC_n Sutherland Hamiltonian (1.4) with coupling
parameters given as follows:
Case I: a, b, c ∈ Z≥0,
Case II: a, b, c ∈ Z≥0 with b ≥ a + 1,
Case III: a, b, c ∈ Z≥0 with b, c ≥ a + 1.
The dependence of the additive constant and of the coupling parameters a, b, c
on the parameters of the respective representation is given by the three
propositions formulated in Sec. 4.
The above results show that Case I, which is the simplest case, covers all integral
values of the coupling parameters a, b, c and the other two cases allow for alternative
group theoretic descriptions of the BC_n model at proper subsets of the integral
coupling parameters. This state of affairs could not be foreseen before performing
the analysis of the different reduction schemes. Observe also that if b = c, then the
Hamiltonian (1.4) becomes of type C_n, but the B_n and D_n type Sutherland models
do not arise from (1.4) at any values of the integers a, b, c. This is in contrast with

the corresponding classical Hamiltonian reduction [20], which covers all coupling
constants of the classical BCn model, and is due to the never vanishing second term
of the measure factor given by (3.73). The measure factor represents a kind of
quantum anomaly since it gives the difference between the naive quantization of
the reduced classical Hamiltonian and the outcome of the corresponding quantum
Hamiltonian reduction [21].
In Case I, our analysis is consistent with the results of Oblomkov [19], who
studied reductions of the Laplace operator of GL(m + n, C) using the symmetry
group
G^C := (GL(m, C) × GL(n, C)) × (GL(m, C) × GL(n, C)), m ≥ n. (5.1)
In fact, in Case I, our reduction is nothing but the compact real form of the reduction
studied in [19] for m = n. For the m > n cases of the symmetry group (5.1) a
generalization of the KKS ansatz was employed in [19], which was found to yield
the complex version of the BCn Sutherland Hamiltonian (1.4) with integer coupling
parameters subject to the restriction c b (m n) 0. Thus the coupling
parameters obtained for m > n form a proper subset of those obtained for m = n,
and this proper subset is different from those that we derived in our Cases II and
III. For clarity we note that the KKS ansatz (4.28) that we adopted in Case II was
motivated by the corresponding classical reduction [20], and it does not correspond
to the ansatz used in [19] for m − n = 1. It is not clear to us how the classical
analogues of the m > n reductions of [19] work.
Of course, the reductions can be applied also to the differential operators associated
with the higher Casimirs. This can be used to explain the complete integrability
of the BCn Sutherland model and to derive the spectra as well as the form of the
joint eigenfunctions of the corresponding commuting Hamiltonians at the pertinent
values of the coupling constants from representation theory [19].
We stress that the general method that we applied in our analysis can be used
also to study other problems in the future. For example, one may try to determine
all possible values of the coupling constants of the Sutherland models (1.1) that
may result as reductions of the Laplacian of a compact Lie group in general. This
is closely related to the open problem concerning the classification of the Hermann
actions and of the representations of symmetric subgroups G (3.1) on spaces V such that the
condition dim(V^K) = 1 holds for the centralizer K < G of the section. In all
such cases the reduced Laplace operator (2.14) is expected to provide a many-body
model that can be solved by the group theoretic method because of its very origin.
Besides the trigonometric real form that we considered, the complex BCn
Sutherland model admits the well-known hyperbolic real form and other physically
very different real forms associated with two types of particles [27, 28]. The
derivation of the hyperbolic model by quantum Hamiltonian reduction can be done
similarly to the present work, but starting from U (n, n) instead of U (2n) (in Case I)
taking the Cartan involution both for L and for R (see also [20]). The models
with two types of particles pose a more difficult problem. At the classical level, it

can be seen from [28] that to derive them one needs to take the Cartan involution
of U(n, n) for L and a different involution for R that has a non-compact fixpoint
subgroup. Therefore the corresponding quantum Hamiltonian reduction would
require some modifications of the method used in this paper, which need further
investigation.

Acknowledgments
We thank J. Balog for useful comments on the manuscript. This work was supported
by the Hungarian Scientific Research Fund (OTKA) under the grant K 77400.

Appendix. Some Representation Theoretic Facts


In this Appendix, we gather some basic facts in order to fix the notations used in
Sec. 4.

A.1. On the UIRREPS of SU (n) and U (n)


Since the Lie group SU(n) is compact, connected and simply-connected, there is a
one-to-one correspondence between the UIRREPS of SU(n) and the finite
dimensional complex IRREPS of sl(n, C) = su(n)^C. In the complex simple Lie
algebra sl(n, C) we have the Cartan subalgebra H consisting of diagonal matrices,
and use also the real Cartan subalgebra
HR := {H | H sl(n, C), H is diagonal with real entries} H. (A.1)
The functionals {ei }ni=1 H are dened by the formula ei (H) := Hii (H H).
The roots with respect to H form the set R := {ei ej | 1 i, j n, i =
j} H and we x the root vectors Eei ej := Eij . The set of positive roots
is R+ := {ei ej | 1 i < j n} and the simple roots are i := ei ei+1
(i
(1 i n 1). Let i = k=1 ek H (1 i n 1) denote the fundamental
weights. The equivalence classes of the IRREPS of sl(n, C) can be uniquely labeled
by the highest (dominant integral) weights, which are the elements of
P+ (SU (n)) = {a1 1 + a2 2 + + an1 n1 | a1 , a2 , . . . , an1 Z0 }
= Zn1
0 .
(A.2)
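Now take an $sl(n,\mathbb{C})$ IRREP $(\varrho_\mu, V_\mu)$ of highest weight $\mu \in P_+(SU(n))$. To any
linear functional $\nu \in \mathcal{H}^*$ we associate the weight space
$$V_\mu[\nu] := \bigcap_{H \in \mathcal{H}} \ker\bigl(\varrho_\mu(H) - \nu(H)\,\mathrm{Id}_{V_\mu}\bigr) \subset V_\mu, \tag{A.3}$$
and we also define the set of weights $W_\mu := \{\nu \mid \nu \in \mathcal{H}^*,\ V_\mu[\nu] \ne \{0\}\}$. Then we
have the weight space decomposition $V_\mu = \bigoplus_{\nu \in W_\mu} V_\mu[\nu]$. Note that $\mu \in W_\mu$ and
$\dim(V_\mu[\mu]) = 1$, so we can write $V_\mu[\mu] = \mathbb{C} v_\mu$ with some highest weight vector $v_\mu$.
The characteristic property of the non-zero vector $v_\mu$ is that $\varrho_\mu(E_\alpha) v_\mu = 0$ holds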
for all $\alpha \in \mathcal{R}_+$. The IRREP $(\varrho_\mu, V_\mu)$ of $sl(n,\mathbb{C})$ induces the UIRREP $(\rho_\mu, V_\mu)$ of
$SU(n)$ by the requirement $\rho_\mu(e^X) = e^{\varrho_\mu(X)}$ for all $X \in su(n)$. The corresponding
scalar product on $V_\mu$ can be defined by fixing the norm of $v_\mu$ and requiring the
anti-hermiticity of $\varrho_\mu(X)$ for all $X \in su(n)$.
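The UIRREPS of $U(n)$ are usually parametrized by the set
$$P_+(U(n)) = \{m = (m_1, m_2, \ldots, m_n) \in \mathbb{Z}^n \mid m_1 \ge m_2 \ge \cdots \ge m_n\}. \tag{A.4}$$
The representation $\rho_m$ of $U(n)$ may be defined as the extension of the representation
$\rho_\mu$ of $SU(n) < U(n)$ characterized by the properties
$$\mu = \sum_{i=1}^{n-1} (m_i - m_{i+1})\lambda_i \quad\text{and}\quad \rho_m(\xi 1_n) = \xi^{m_1 + \cdots + m_n}\,\mathrm{Id}_V \quad \forall \xi \in U(1). \tag{A.5}$$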

In the main text we use a slightly different parametrization by pairs (k, ) Z


P+ (SU (n)). The correspondence is given by the relation m1 + +mn = n ()+kn,
as is seen from the comparison between (A.5) and (4.2) and (4.3).

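A.2. On the bosonic oscillator realization of $(\varrho_{m\lambda_1}, V_{m\lambda_1})$

Fix an integer $n \ge 2$ and to each $n$-tuple $(l_1, l_2, \ldots, l_n) \in \mathbb{Z}_{\ge 0}^n$ associate a symbol
$|l_1, l_2, \ldots, l_n\rangle$. Let $\mathcal{F}$ denote the complex vector space generated by these symbols,
$$\mathcal{F} := \bigoplus_{(l_1, l_2, \ldots, l_n) \in \mathbb{Z}_{\ge 0}^n} \mathbb{C}\,|l_1, l_2, \ldots, l_n\rangle. \tag{A.6}$$
Endow $\mathcal{F}$ with the scalar product $(\cdot\,,\cdot)$ for which the vectors
$\{|l_1, l_2, \ldots, l_n\rangle\}_{(l_1, l_2, \ldots, l_n) \in \mathbb{Z}_{\ge 0}^n}$ satisfy
$$(|l_1, l_2, \ldots, l_n\rangle, |l'_1, l'_2, \ldots, l'_n\rangle) = \delta_{l_1, l'_1}\,\delta_{l_2, l'_2} \cdots \delta_{l_n, l'_n}, \tag{A.7}$$
and introduce the annihilation and creation operators $b_i$ and $b_i^\dagger$ ($1 \le i \le n$) on $\mathcal{F}$ by
$$b_i\,|l_1, l_2, \ldots, l_n\rangle := \begin{cases} \sqrt{l_i}\,|l_1, l_2, \ldots, l_i - 1, \ldots, l_n\rangle & \text{if } l_i \ge 1, \\ 0 & \text{if } l_i = 0, \end{cases} \tag{A.8}$$
$$b_i^\dagger\,|l_1, l_2, \ldots, l_n\rangle := \sqrt{l_i + 1}\,|l_1, l_2, \ldots, l_i + 1, \ldots, l_n\rangle. \tag{A.9}$$
Then $b_i^\dagger$ is the adjoint of $b_i$, and one has the commutation relations
$$[b_i, b_j] = 0, \quad [b_i^\dagger, b_j^\dagger] = 0, \quad [b_i, b_j^\dagger] = \delta_{i,j}\,\mathrm{Id}_{\mathcal{F}}. \tag{A.10}$$
The bosonic Fock space $\mathcal{F}$ decomposes as the orthogonal direct sum
$\mathcal{F} = \bigoplus_{m \in \mathbb{Z}_{\ge 0}} \mathcal{F}_m$ with
$$\mathcal{F}_m := \mathrm{span}_{\mathbb{C}}\{|l_1, l_2, \ldots, l_n\rangle \mid (l_1, l_2, \ldots, l_n) \in \mathbb{Z}_{\ge 0}^n,\ l_1 + l_2 + \cdots + l_n = m\}. \tag{A.11}$$
Now consider the linear map $\zeta : gl(n,\mathbb{C}) \to \mathrm{End}(\mathcal{F})$ defined on the standard
basis $\{E_{ij}\}_{1 \le i,j \le n}$ of $gl(n,\mathbb{C})$ by
$$\zeta(E_{ij}) := b_i^\dagger b_j. \tag{A.12}$$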
Then $(\zeta, \mathcal{F})$ is a representation of $gl(n,\mathbb{C})$ and the subspace $\mathcal{F}_m$ is invariant under
$\zeta$. The map
$$\zeta_m : gl(n,\mathbb{C}) \to \mathrm{End}(\mathcal{F}_m), \quad X \mapsto \zeta_m(X) := \zeta(X)|_{\mathcal{F}_m} \tag{A.13}$$
provides a finite dimensional representation of the Lie algebra $gl(n,\mathbb{C})$. By restricting
$\zeta_m$ to the subalgebra $sl(n,\mathbb{C}) < gl(n,\mathbb{C})$, we end up with a finite dimensional
representation $(\zeta_m, \mathcal{F}_m)$ of $sl(n,\mathbb{C})$. The set of weights of the representation
$(\zeta_m, \mathcal{F}_m)$ is
$$W_m := \Bigl\{\sum_{i=1}^{n} l_i e_i \Bigm| (l_1, l_2, \ldots, l_n) \in \mathbb{Z}_{\ge 0}^n,\ l_1 + l_2 + \cdots + l_n = m\Bigr\}, \tag{A.14}$$
and the weight space $\mathcal{F}_m[\nu] \subset \mathcal{F}_m$ corresponding to the weight $\nu = \sum_{i=1}^{n} l_i e_i \in W_m$
takes the form
$$\mathcal{F}_m[l_1 e_1 + l_2 e_2 + \cdots + l_n e_n] = \mathbb{C}\,|l_1, l_2, \ldots, l_n\rangle. \tag{A.15}$$
Note that each weight space is 1-dimensional. The representation $(\zeta_m, \mathcal{F}_m)$ contains
the (up to rescaling) unique highest weight vector $v_m := |m, 0, \ldots, 0\rangle$, with
weight $m\lambda_1 = m e_1 \in W_m$. This shows that $(\zeta_m, \mathcal{F}_m)$ is equivalent to the IRREP
$(\varrho_{m\lambda_1}, V_{m\lambda_1})$. We identify these $sl(n,\mathbb{C})$ (and the naturally corresponding $su(n)$)
representations in the proofs presented in Sec. 4.
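
The oscillator realization above can be checked numerically for small n and m. The
following is a minimal sketch, not taken from the paper: the function names are
hypothetical, indices are 0-based, and matrices are plain Python lists. It builds the
matrices of b_i^dagger b_j on the subspace with total occupation number m and tests the
gl(n) relations [E_ij, E_kl] = delta_jk E_il - delta_il E_kj there.

```python
from itertools import product
from math import comb, sqrt


def basis(n, m):
    """Occupation tuples (l_1, ..., l_n) of non-negative integers with sum m."""
    return [l for l in product(range(m + 1), repeat=n) if sum(l) == m]


def apply_E(i, j, ket):
    """Apply b_i^dagger b_j to a basis ket.

    Returns (coefficient, new_ket), or None when the result is the zero vector.
    """
    l = list(ket)
    if l[j] == 0:
        return None
    coeff = sqrt(l[j])        # b_j lowers l_j, cf. (A.8)
    l[j] -= 1
    coeff *= sqrt(l[i] + 1)   # b_i^dagger raises l_i, cf. (A.9)
    l[i] += 1
    return coeff, tuple(l)


def matrix_E(i, j, n, m):
    """Matrix of b_i^dagger b_j restricted to the occupation-m subspace."""
    kets = basis(n, m)
    index = {ket: a for a, ket in enumerate(kets)}
    mat = [[0.0] * len(kets) for _ in kets]
    for col, ket in enumerate(kets):
        image = apply_E(i, j, ket)
        if image is not None:
            coeff, new_ket = image
            mat[index[new_ket]][col] = coeff
    return mat


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def check_gl_relations(n, m, tol=1e-12):
    """Check [E_ij, E_kl] = delta_jk E_il - delta_il E_kj on the occupation-m subspace."""
    E = {(i, j): matrix_E(i, j, n, m) for i in range(n) for j in range(n)}
    size = comb(m + n - 1, n - 1)  # dimension of the subspace
    for (i, j), (k, l) in product(E, repeat=2):
        left = matmul(E[i, j], E[k, l])
        right = matmul(E[k, l], E[i, j])
        for r in range(size):
            for c in range(size):
                expected = (j == k) * E[i, l][r][c] - (i == l) * E[k, j][r][c]
                if abs(left[r][c] - right[r][c] - expected) > tol:
                    return False
    return True
```

In this sketch the ket (m, 0, ..., 0) is sent to zero by every b_i^dagger b_j with i < j,
which is the highest weight property used in the text.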

References
[1] M. A. Olshanetsky and A. M. Perelomov, Quantum integrable systems related to Lie
algebras, Phys. Rept. 94 (1983) 313–404.
[2] G. Heckman, Hypergeometric and spherical functions, Harmonic Analysis and Special
Functions on Symmetric Spaces, eds. G. Heckman and H. Schlichtkrull, Perspectives
in Mathematics, Vol. 16 (Academic Press, 1994), pp. 1–89.
[3] B. Sutherland, Beautiful Models (World Scientific, 2004).
[4] R. Sasaki, Quantum Calogero–Moser systems, in Encyclopaedia of Mathematical
Physics (Academic Press, 2006), pp. 123–129.
[5] P. Etingof, Calogero–Moser Systems and Representation Theory (European
Mathematical Society, 2007).
[6] A. P. Polychronakos, Physics and mathematics of Calogero particles, J. Phys. A
Math. Gen. 39 (2006) 12793–12845; arXiv:hep-th/0607033.
[7] M. A. Olshanetsky and A. M. Perelomov, Completely integrable Hamiltonian systems
connected with semisimple Lie algebras, Invent. Math. 37 (1976) 93–108.
[8] B. Sutherland, Exact results for a quantum many-body problem in one dimension II,
Phys. Rev. A 5 (1972) 1372–1376.
[9] M. A. Olshanetsky and A. M. Perelomov, Quantum systems related to root
systems, and radial parts of Laplace operators, Funct. Anal. Appl. 12 (1978) 121–128;
arXiv:math-ph/0203031.
[10] G. J. Heckman and E. M. Opdam, Root systems and hypergeometric functions I,
Compositio Math. 64 (1987) 329–352.
[11] E. M. Opdam, Root systems and hypergeometric functions IV, Compositio Math. 67
(1988) 191–209.
[12] I. Cherednik, Double Affine Hecke Algebras, London Mathematical Society Lecture
Notes Series, Vol. 319 (Cambridge University Press, 2005).
[13] P. I. Etingof, I. B. Frenkel and A. A. Kirillov, Jr., Spherical functions on affine Lie
groups, Duke Math. J. 80 (1995) 59–90; arXiv:hep-th/9407047.
[14] D. Kazhdan, B. Kostant and S. Sternberg, Hamiltonian group actions and dynamical
systems of Calogero type, Commun. Pure Appl. Math. 31 (1978) 481–507.
[15] A. D. Berenstein and A. V. Zelevinsky, When is the multiplicity of a weight equal
to 1?, Funct. Anal. Appl. 24 (1990) 259–269.
[16] R. Howe, Perspectives on invariant theory: Schur duality, multiplicity-free actions
and beyond, in The Schur Lectures (1992), Israel Math. Conf. Proc., Vol. 8 (Bar-Ilan
Univ., 1995), pp. 1–182.
[17] E. Heintze, R. Palais, C.-L. Terng and G. Thorbergsson, Hyperpolar actions on
symmetric spaces, in Geometry, Topology, and Physics for Raoul Bott, ed. S.-T. Yau
(International Press, 1995), pp. 214–245.
[18] A. Kollross, Polar actions on symmetric spaces, J. Differential Geom. 77 (2007)
425–482; arXiv:math/0506312 [math.DG].
[19] A. Oblomkov, Heckman–Opdam's Jacobi polynomials for the BCn root system and
generalized spherical functions, Adv. Math. 186 (2004) 153–180; arXiv:math/0202076
[math.RT].
[20] L. Fehér and B. G. Pusztai, A class of Calogero type reductions of free motion on a
simple Lie group, Lett. Math. Phys. 79 (2007) 263–277; arXiv:math-ph/0609085.
[21] L. Fehér and B. G. Pusztai, Hamiltonian reductions of free particles under
polar actions of compact Lie groups, Theoret. Math. Phys. 155 (2008) 646–658;
arXiv:0705.1998 [math-ph].
[22] R. Palais and C.-L. Terng, A general theory of canonical forms, Trans. Amer. Math.
Soc. 300 (1987) 771–789.
[23] V. V. Gorbatsevich, A. L. Onishchik and E. B. Vinberg, Foundations of Lie Theory
and Lie Transformation Groups (Springer, 1997).
[24] T. Matsuki, Classification of two involutions on compact semisimple Lie groups and
root systems, J. Lie Theory 12 (2002) 41–68.
[25] B. Hoogenboom, Intertwining Functions on Compact Lie Groups, CWI Tract, Vol. 5
(Centrum Wisk. Inform., Amsterdam, 1984).
[26] T. Matsuki, Double coset decomposition of reductive Lie groups arising from two
involutions, J. Algebra 197 (1997) 49–91.
[27] F. Calogero, Exactly solvable one-dimensional many-body problems, Lett. Nuovo
Cim. 13 (1975) 411–416.
[28] M. Hashizume, Geometric approach to the completely integrable Hamiltonian
systems attached to the root systems with signature, Adv. Stud. Pure Math. 4 (1984)
291–330.
August 10, 2010 15:0 WSPC/S0129-055X 148-RMP
J070-S0129055X10004077

Reviews in Mathematical Physics


Vol. 22, No. 7 (2010) 733–838

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004077

A CLASSICAL MECHANICAL MODEL OF BROWNIAN


MOTION WITH PLURAL PARTICLES

SHIGEO KUSUOKA and SONG LIANG


Graduate School of Mathematical Sciences,

The University of Tokyo, Komaba 3-8-1, Meguro-ku


Tokyo 153-8914, Japan
kusuoka@ms.u-tokyo.ac.jp
Institute of Mathematics,

Tsukuba University, Tennoudai 1-1-1


Tsukuba 305-8571, Japan
liang@math.tsukuba.ac.jp

Received 15 December 2008


Revised 26 May 2010

We give a connection between diffusion processes and classical mechanical systems in
this paper. Precisely, we consider a system of plural massive particles interacting with an
ideal gas, evolved according to classical mechanical principles, via interaction potentials.
We prove the almost sure existence and uniqueness of the solution of the considered
dynamics, prove the convergence of the solution under a certain scaling limit, and give
the precise expression of the limiting process, a diffusion process.

Keywords: Infinite particle systems; classical mechanics; Markov processes; diffusions;
convergence; Brownian motion.

Mathematics Subject Classification 2010: 70F45, 34F05, 60B10, 60J60

1. Introduction
Brownian motion is a well-known physical phenomenon concerning the dynamics
of a small particle put into a fluid in equilibrium, e.g., a grain of pollen in a glass
of water [10]. It is an interesting problem in mathematical physics to describe the
Brownian motion phenomenology by classical mechanical models.
Brownian motion was first observed accidentally by Brown in 1827. The first
physical explanation of it was given by Einstein, who explained the motion as
the result of repeated collisions of the particle with the numerous, much smaller
fluid atoms. In more mathematical terms, the explanation is often presented in
the following rough way: since a large number of water atoms collide with the
massive particle randomly, and each atom is light enough, if we assume that the
interactions from each atom at each time are independent, the motion

of the massive particle could be considered as a sum of many i.i.d. random
variables; by the central limit theorem, this gives Brownian motion in a suitable
limit.
However, we have to notice that, even in a model where only interactions through
collisions are considered, there exists the possibility of re-collisions, so the states
(i.e. positions and velocities) of the light particles at each time are not independent
of each other, and one is not really in the presence of a sum of i.i.d. variables. This
becomes a more evident and significant drawback when considering a model with
interactions caused by potentials, or a model with more than one massive particle.
Indeed, the actual motion of the massive particles cannot be the result of a sum of
i.i.d. random variables; it is not even a Markov process. So to study this phenomenon
more precisely, we need to construct a new model which takes the mentioned
re-interactions into account.
In such a model, a finite number of massive particles (molecules) interact with a
gas of infinitely many light atoms, which have a random initial state. The dynamics
is fully deterministic (Newtonian) once the initial condition is given, and
the only source of randomness is the initial configuration of the light atoms.
The problem is to prove that in an appropriate limit, as the mass m of the gas
atoms goes to 0 while their velocities and their density increase in an appropriate
manner such that the variance of the momentum transfer stays of order 1, the
motion of the molecules converges to a Markov process, in particular a diffusion
process.
This type of model, called a mechanical model of Brownian motion, was first
introduced and studied by Holley [6], for the case of only one molecule, with
the whole system in dimension d = 1, and the interactions given only by
collisions. This model was later extended by, e.g., Dürr–Goldstein–Lebowitz [3–5],
Calderoni–Dürr–Kusuoka [2], and others, to the case of higher dimensional spaces.
But in all of these papers, only collisional interactions of one molecule with light
atoms are considered.
In the present paper, we consider the case of plural molecules interacting via
smooth compact support potentials with an ideal gas of atoms. This increases the
difficulties in many aspects, for example, (1) strong non-Markovian character of the
dynamics (for every positive mass m of the atoms), due to possibly multiple, or even
extended in time, interactions between a particular atom and the molecules; (2) the
appearance of an interaction (mediated by the gas atoms) between the molecules
in the limiting process; and (3) irregular behaviour of the above interaction when
the interaction ranges of the molecules overlap, i.e. an atom can interact with more
than one molecule at a time.
Let us note that one expects that the non-Markovian character of the dynamics
mentioned above disappears when the gas atoms become infinitely fast, in the limit
$m \to 0$.
We show the existence and the uniqueness of the solution of an infinite system of
ordinary differential equations (ODEs) describing the model, for almost every initial
condition of the ideal gas, and study the motion of the molecules in the Brownian
limit, where $m \to 0$ and the intensity function of the initial configuration of the
atoms is given by
$$\tilde\lambda_m(dx, dv) = m^{\frac{d-1}{2}}\,\rho\Bigl(\frac{m}{2}|v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0})\Bigr)\,dx\,dv,$$
where $\rho$ is a function giving the initial distribution, $x$ and $v$ denote the position
and the velocity of an atom, respectively, $N$ is the number of molecules, $X_{i,0}$ are the
initial positions of the molecules and $U_i$ are the potentials. (See Sec. 2 for more details
and the notations.)
A heuristic central limit theorem type argument for this scaling, which assumes
all of the necessary independence in the limit, may be given as follows: when $m \to 0$,
the average energy $mv^2$ of the atoms remains of order 1 due to the velocity scaling.
For the momentum transfer, notice that since the potentials are compactly supported
by our assumption, we have that (as long as the molecules stay in a bounded region)
interactions occur only if the atoms are in a certain area $A$ which does not depend
on time. So roughly speaking, for a fixed time $t$, $\frac{d}{dt}V_i(t)$ is a result of the sum of the
effects from those atoms whose positions $x$ and velocities $v$ satisfy $x + tv \in A$. Also,
the effect from each such atom is of order 1. Since an atom has velocity of order
$m^{-1/2}$, the length of the time that it stays in the area $A$ has order $m^{1/2}$. In summary,
in a fixed time interval $[0, T]$, $T > 0$, the momentum transfer for a molecule from
one atom is of order $m^{1/2}$, hence the momentum will remain constant if the total
number of interacting atoms has mean $\int_0^T ds \int_{x + sv \in A} \tilde\lambda_m(dx, dv)$ behaving as $m^{-1/2}$,
for $m \to 0$.
Let us explain further main ideas and provide a sketch of the content of this
paper before closing this section. First, we introduce a system $\varphi(t, x, v; \vec X)$ (or
$\psi(t, x, v; \vec X)$) (see (2.2) and (3.4)), describing the motion of the light atoms when
the molecules are frozen, and consider the classical scattering theory (including a
ray representation) for it. As a result of our cut-off in the potentials and the initial
distribution of the atoms, as long as the velocities of the molecules are not too
fast, each light atom interacts with the molecules for a time length of order $m^{1/2}$
only, so the interaction given by $\varphi(t, x, v; \vec X)$ (instead of $x(t, x, v)$) gives the 0-order
approximation of the momentum variance of the molecules (see Proposition 3.6.3).
We use this fact to get the tightness of the states of the molecules as long as their
velocities are of order O(1) (Lemma 3.5.1 and Sec. 4). Next, with the help of this
tightness, by adding the effect of the error caused by the described freezing
approximation as a 1-order term (see Proposition 3.6.4), we prove the desired
convergence for $m \to 0$. This is done by characterizing the possible accumulation points
by martingale problem theory (Sec. 5). Finally, we show that when there is only
one molecule, or when there are two molecules but the potential functions satisfy
certain conditions (see Theorem 2.0.1(4)), the velocities of the molecules do
not go beyond order O(1), so the stopping time $\sigma_n$ (introduced to keep the velocities O(1))
converges to $\infty$, which means that our convergence is valid for all time intervals
(Sec. 6).

2. Description of the Model and Statement of the Result


Let us describe our model and results precisely in this section.
Let $N \ge 1$ and $d \ge 1$ be integers, and let $M_1, \ldots, M_N, m > 0$. Here $N$ stands for
the number of molecules, and $d$ for the dimension of the space $\mathbb{R}^d$ in which the whole
system is considered. $M_1, \ldots, M_N$ are the masses of the molecules, and $m$ stands for the
mass of the light particles (the environmental ideal gas atoms); later on the limit
$m \to 0$ will be taken. We use $U_i \in C_0^\infty(\mathbb{R}^d)$, $i = 1, \ldots, N$, to denote the (cut-off)
potential functions, which, as (2.1) shows, are assumed to provide potentials that
only depend on the relative positions of the molecules and the atoms. Also, let
$X_{i,0}, V_{i,0} \in \mathbb{R}^d$, $i = 1, \ldots, N$, be given, which stand for the initial positions and the
initial velocities of the molecules.
Assume that the initial condition of the environment, i.e. the positions and the
velocities of the ideal gas atoms at time 0, is given by $\tilde\omega \in \tilde\Omega = \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$. The
distribution of $\tilde\omega$ will be specified later. (We ask for the reader's tolerance for using
the tilde for a while. We do so because we will soon convert the problem to a new
probability space (see Sec. 3.1) by using the ray representation, and we believe that
it is better to keep the notations without the tilde until then.) Here $\mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$
stands for the set of all non-empty closed subsets of $\mathbb{R}^d \times \mathbb{R}^d$ which have no cluster
point. $\mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ is equipped with the $\sigma$-algebra $\mathcal{E}_0$, the $\sigma$-algebra generated
by $\{\{C \subset \mathbb{R}^d \times \mathbb{R}^d;\ C \ne \emptyset, \text{ closed},\ C \cap G \ne \emptyset\};\ G \text{ is open in } \mathbb{R}^d \times \mathbb{R}^d\}$. Each $\tilde\omega \in \tilde\Omega$ is
a subset of $\mathbb{R}^d \times \mathbb{R}^d$, and $(x, v) \in \tilde\omega$ means that there exists an atom at position $x$
with velocity $v$ at time 0.
As claimed before, we assume that as long as the initial conditions
$\tilde\omega \in \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ and $X_{i,0}, V_{i,0} \in \mathbb{R}^d$, $i = 1, \ldots, N$, are given, the whole system
evolves according to Newtonian mechanics via interaction potentials depending
only on the relative positions.
We use $X_i^{(m)}(t) = X_i^{(m)}(t, \tilde\omega)$ and $V_i^{(m)}(t) = V_i^{(m)}(t, \tilde\omega) \in \mathbb{R}^d$ to denote the
position and the velocity of the $i$th molecule at time $t$ with initial environmental
condition $\tilde\omega$, and for each $(x, v) \in \tilde\omega$, we use $x^{(m)}(t, x, v, \tilde\omega), v^{(m)}(t, x, v, \tilde\omega) \in \mathbb{R}^d$ to
denote the position and the velocity at time $t$ of the atom which had state $(x, v)$ at
time 0.
Also, for the sake of simplicity, we assume that there is no direct interaction
between molecules or between atoms. Actually, adding the effect of interactions
between molecules causes no mathematical difficulty at all, while making the
formulas more complicated. We would rather say that one of the most interesting
points of our results in this paper is that, even for the case with no direct
interactions between molecules, after taking the limit $m \to 0$, we get a diffusion in which
interactions between molecules appear. (See Theorem 2.0.1, especially the definition
of the generator $L$ below.)
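In conclusion, for each initial environmental condition $\tilde\omega$, we assume that
the motion of the system is described by the following infinite system of
ODEs:
$$
\begin{cases}
\dfrac{d}{dt} X_i^{(m)}(t, \tilde\omega) = V_i^{(m)}(t, \tilde\omega), \\[1ex]
M_i \dfrac{d}{dt} V_i^{(m)}(t, \tilde\omega) = -\displaystyle\int_{\mathbb{R}^d \times \mathbb{R}^d} \nabla U_i\bigl(X_i^{(m)}(t, \tilde\omega) - x^{(m)}(t, x, v, \tilde\omega)\bigr)\,\mu_{\tilde\omega}(dx, dv), \\[1ex]
\bigl(X_i^{(m)}(0, \tilde\omega), V_i^{(m)}(0, \tilde\omega)\bigr) = (X_{i,0}, V_{i,0}), \quad i = 1, \ldots, N, \\[1ex]
\dfrac{d}{dt} x^{(m)}(t, x, v, \tilde\omega) = v^{(m)}(t, x, v, \tilde\omega), \\[1ex]
m \dfrac{d}{dt} v^{(m)}(t, x, v, \tilde\omega) = -\displaystyle\sum_{i=1}^{N} \nabla U_i\bigl(x^{(m)}(t, x, v, \tilde\omega) - X_i^{(m)}(t, \tilde\omega)\bigr), \\[1ex]
\bigl(x^{(m)}(0, x, v, \tilde\omega), v^{(m)}(0, x, v, \tilde\omega)\bigr) = (x, v), \quad (x, v) \in \tilde\omega.
\end{cases}
\tag{2.1}
$$
Here $\mu_{\tilde\omega}(\cdot)$ is the counting measure determined by $\tilde\omega$: $\mu_{\tilde\omega}(A) = \sharp(\tilde\omega \cap A)$ for any
$A \in \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d)$, where $\sharp(\cdot)$ denotes the number of points in the argument.
Since we are only interested in the motion of the molecules, from now on,
when talking about the solution of (2.1), we always mean the value of
$(\vec X^{(m)}(t, \tilde\omega), \vec V^{(m)}(t, \tilde\omega)) = \bigl((X_1^{(m)}(t, \tilde\omega), \ldots, X_N^{(m)}(t, \tilde\omega)), (V_1^{(m)}(t, \tilde\omega), \ldots, V_N^{(m)}(t, \tilde\omega))\bigr)$.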
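Finally, let us give the distribution of the environmental initial condition $\tilde\omega$. Let
$\rho : \mathbb{R} \to [0, \infty)$ be a continuous function such that $\rho(s) \to 0$ rapidly as $s \to \infty$ (see
conditions A1 and A2 below for details). Let $\tilde\lambda_m$ be the non-atomic Radon measure
on $\mathbb{R}^d \times \mathbb{R}^d$ given by
$$\tilde\lambda_m(dx, dv) = m^{\frac{d-1}{2}}\,\rho\Bigl(\frac{m}{2}|v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0})\Bigr)\,dx\,dv,$$
and let $\tilde P_m(d\tilde\omega)$ be the Poisson point process with the intensity measure $\tilde\lambda_m$. So
$\tilde P_m$ is a probability measure on $\tilde\Omega$ ($= \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$). We assume that the
distribution of $\tilde\omega$ is given by $\tilde P_m$. (See, e.g., [7] for more details about Poisson point
processes.)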
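
To make the role of the intensity measure concrete, here is a minimal sketch of sampling
such a Poisson point process by thinning. It is not the paper's construction. It assumes
d = 1, a single molecule at the origin, a bounded window in (x, v), and a hypothetical
continuous profile `rho` that vanishes below a threshold `e0`, in the spirit of (A1).
All function names and parameter values are illustrative.

```python
import random
from math import exp


def rho(s, e0=1.0):
    """Hypothetical continuous profile: zero for s <= e0, decaying like exp(-s) above."""
    return (s - e0) * exp(-s) if s > e0 else 0.0


def intensity(x, v, m, potential, d=1):
    """Density m^((d-1)/2) * rho(m |v|^2 / 2 + U(x)) with respect to dx dv."""
    return m ** ((d - 1) / 2) * rho(0.5 * m * v * v + potential(x))


def poisson_count(mean, rng):
    """Count unit-rate exponential arrivals in [0, mean]; this is Poisson(mean)."""
    count, clock = 0, rng.expovariate(1.0)
    while clock <= mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count


def sample_configuration(m, x_max, v_max, potential, bound, seed=0):
    """Thinning on the window [-x_max, x_max] x [-v_max, v_max].

    `bound` must dominate the intensity density on the window.
    """
    rng = random.Random(seed)
    area = (2.0 * x_max) * (2.0 * v_max)
    points = []
    for _ in range(poisson_count(bound * area, rng)):
        x = rng.uniform(-x_max, x_max)
        v = rng.uniform(-v_max, v_max)
        # Keep the proposal with probability intensity / bound.
        if rng.random() * bound < intensity(x, v, m, potential):
            points.append((x, v))
    return points
```

With the profile above and e0 = 1, the density is bounded by exp(-2), so `bound=exp(-2.0)`
is admissible. Every retained point then has energy m v^2 / 2 + U(x) above the threshold,
which is how slow atoms are excluded in this sketch.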
In this paper, we consider the following questions:

(Q1) Does the dynamics have a unique solution for $\tilde P_m$-almost every initial
condition?
(Q2) What is the limit behavior of the solution $(\vec X^{(m)}(t, \tilde\omega), \vec V^{(m)}(t, \tilde\omega))$ as $m \to 0$?

Throughout this paper, we assume that $U_i \in C_0^\infty(\mathbb{R}^d)$ satisfy $U_i(x) = U_i(-x)$,
$x \in \mathbb{R}^d$, $i = 1, \ldots, N$. Let $R_i$ be constants such that $U_i(x) = 0$ if
$|x| \ge R_i$. Define the constants
$$C_0 = \Bigl(2 \sum_{i=1}^{N} R_i \|\nabla U_i\|_\infty\Bigr)^{1/2}, \qquad e_0 = \frac{1}{2}(2C_0 + 1)^2 + \sum_{i=1}^{N} \|U_i\|_\infty.$$
Assume that $\rho : \mathbb{R} \to [0, \infty)$ is a measurable function satisfying the
following:

(A1) $\rho(s) = 0$ if $s \le e_0$,
(A2) for any $c > 0$, there exists a $\rho_c : \mathbb{R} \to [0, \infty)$ such that
$$\sup_{|a| \le c} \rho(s + a) \le \rho_c(s), \quad \text{for any } s \in \mathbb{R},$$
and
$$\int_{\mathbb{R}^d} (1 + |v|^3)\,\rho_c\Bigl(\frac{1}{2}|v|^2\Bigr)\,dv < \infty.$$
The meaning of the assumption (A1) is that those atoms with their initial momenta
less than a certain value are ignored. The point is that, under this condition (as
in the case with the molecules frozen, which we call the classical case),
since the initial velocities of the atoms are fast enough, the interactions are not
strong enough to stop the atoms, so they keep their velocities at a certain level
for all time, hence they will leave the valid region for interaction very quickly (see
Proposition 3.2.2 and Corollary 3.2.3 for the classical case, and Propositions 3.6.1
and 3.6.5 for our case). This helps us to avoid the problem of too many collisions
in a short period of time. (A2) is an assumption on how rapidly $\rho$ decreases.
Also, assume that the initial position $(X_{1,0}, \ldots, X_{N,0})$ satisfies $|X_{i,0} - X_{j,0}| >
R_i + R_j$ for any $i \ne j$, i.e. the molecules are originally separated enough that
their potential ranges do not overlap.
We answer in this paper the two questions (Q1), (Q2) described above under
our present assumptions. For (Q1), we will show that there exists a unique solution
of (2.1) for $\tilde P_m$-almost every initial condition for every $m > 0$ (Theorem 2.0.1(1)
below).
In order to answer (Q2), let us first define some notations to describe the limit
process. For any $\vec X = (X_1, \ldots, X_N) \in \mathbb{R}^{dN}$, let
$$\varphi(t, x, v; \vec X) = \bigl(\varphi^0(t, x, v; \vec X), \varphi^1(t, x, v; \vec X)\bigr) = \bigl(x(t, x, v; \vec X), v(t, x, v; \vec X)\bigr)$$
denote the solution of Newton's equation
$$
\begin{cases}
\dfrac{d}{dt} x(t, x, v; \vec X) = v(t, x, v; \vec X), \\[1ex]
\dfrac{d}{dt} v(t, x, v; \vec X) = -\displaystyle\sum_{i=1}^{N} \nabla U_i\bigl(x(t, x, v; \vec X) - X_i\bigr), \\[1ex]
\bigl(x(0, x, v; \vec X), v(0, x, v; \vec X)\bigr) = (x, v).
\end{cases}
\tag{2.2}
$$
Comparing (2.2) with the second half of (2.1) with $m = 1$, one finds that the
only difference is that in (2.2), we have the molecules fixed, whereas in (2.1), the
molecules are also moving. We will use this $\varphi(t, x_0, v_0; \vec X)$ (with proper $\vec X$) as an
approximation of $(x(t, x_0, v_0), v(t, x_0, v_0))$. As mentioned in Sec. 1, this is actually
one of the main ideas of the present paper.
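
The frozen-molecule dynamics (2.2) can be explored numerically. The sketch below is only
an illustration under stated assumptions, not the paper's method: it takes d = 1, uses a
standard smooth bump as a hypothetical compactly supported even potential, and integrates
with the velocity-Verlet scheme. The names `bump`, `bump_grad`, `frozen_flow` and `energy`
are hypothetical.

```python
from math import exp


def bump(x, height=1.0, radius=1.0):
    """Smooth even potential supported in |x| < radius (an illustrative choice)."""
    w = 1.0 - (x / radius) ** 2
    return height * exp(-1.0 / w) if w > 0.0 else 0.0


def bump_grad(x, height=1.0, radius=1.0):
    """Derivative of `bump`; zero outside the support."""
    w = 1.0 - (x / radius) ** 2
    if w <= 0.0:
        return 0.0
    return -bump(x, height, radius) * 2.0 * x / (radius ** 2 * w ** 2)


def frozen_flow(t_end, x, v, centers, dt=1e-3):
    """Velocity-Verlet integration of (2.2) in d = 1 with molecules fixed at `centers`."""
    def accel(pos):
        # Unit atom mass, as in (2.2): dv/dt = -sum_i U_i'(x - X_i).
        return -sum(bump_grad(pos - c) for c in centers)

    steps = int(round(t_end / dt))
    a = accel(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * a
        x = x + dt * v_half
        a = accel(x)
        v = v_half + 0.5 * dt * a
    return x, v


def energy(x, v, centers):
    """Conserved energy of an atom in the frozen potential."""
    return 0.5 * v * v + sum(bump(x - c) for c in centers)
```

For example, `frozen_flow(2.5, -2.0, 2.0, [0.0])` sends an atom with energy 2 through a
bump of height exp(-1). Since its energy exceeds the maximum of the potential, the atom
is not stopped and leaves the interaction range with its initial speed, in line with
the role of assumption (A1).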
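Also, we will use the so-called ray representation: let
$$E = \{(x, v) \in \mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\});\ x \cdot v = 0\},$$
$$E_v = \{x \in \mathbb{R}^d;\ x \cdot v = 0\}, \quad v \in \mathbb{R}^d \setminus \{0\},$$
and let $\nu(dx, dv)$ be the measure on $E$ given by $\nu(dx, dv) = |v|\,\tilde\nu(dx; v)\,dv$, where
$\tilde\nu(dx; v)$ is the Lebesgue measure on $E_v$. Define
$$\Psi : \mathbb{R} \times E \to \mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\}),$$
$$(s, (x, v)) \mapsto \Psi(s, (x, v)) = \bigl(\Psi^0(s, (x, v)), \Psi^1(s, (x, v))\bigr) = (x - sv, v),$$
in other words, we decompose the position of each atom into two parts: one parallel
to its velocity and the other orthogonal to its velocity.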
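
The decomposition just described can be written out explicitly. The following minimal
sketch assumes the ray map has the form (s, (x, v)) -> (x - s v, v); the helper names
`ray_map` and `ray_coordinates` are hypothetical. It computes the ray coordinates of a
phase-space point and maps them back.

```python
def dot(a, b):
    """Euclidean inner product of two tuples."""
    return sum(p * q for p, q in zip(a, b))


def ray_map(s, x_perp, v):
    """Send (s, (x_perp, v)) with x_perp . v = 0 to the phase-space point (x_perp - s v, v)."""
    return tuple(xi - s * vi for xi, vi in zip(x_perp, v)), tuple(v)


def ray_coordinates(x, v):
    """Split (x, v), v != 0, into (s, x_perp) with x = x_perp - s v and x_perp . v = 0."""
    vv = dot(v, v)
    if vv == 0:
        raise ValueError("velocity must be non-zero")
    s = -dot(x, v) / vv
    x_perp = tuple(xi + s * vi for xi, vi in zip(x, v))
    return s, x_perp
```

Here `x_perp` is the component of the position orthogonal to the velocity and `-s v` is
the parallel component, so `ray_map(*ray_coordinates(x, v), v)` recovers `(x, v)` up to
rounding.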
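Then by Lemma 3.2.1, we have that $\lim_{s \to \infty} \varphi^0(t + s, \Psi(s, x, v); \vec X)$ is
well-defined for any $t \in \mathbb{R}$ and $(x, v) \in E$. Denote it by $\psi^0(t, x, v; \vec X)$, i.e. let
$$\psi^0(t, x, v; \vec X) = \lim_{s \to \infty} \varphi^0(t + s, \Psi(s, x, v); \vec X).$$
Now we are ready to give the quadratic term of the diffusion generator of the limit
process: Let
$$a_{ik;jl}(\vec X) = \frac{1}{M_i M_j} \int_E \Bigl(\int_{-\infty}^{\infty} \nabla_k U_i\bigl(\psi^0(t, x, v; \vec X) - X_i\bigr)\,dt\Bigr) \Bigl(\int_{-\infty}^{\infty} \nabla_l U_j\bigl(\psi^0(t, x, v; \vec X) - X_j\bigr)\,dt\Bigr) \rho\Bigl(\frac{1}{2}|v|^2\Bigr)\,\nu(dx, dv).$$
Notice that the integral above, although it might look infinite at a glance, is
actually finite by Corollary 3.2.3 and assumptions (A1) and (A2).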
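We next give the definition of the drift term of the limit process. For any $(x, v) \in
E$, $\vec X, \vec V \in \mathbb{R}^{dN}$ and $a \in \mathbb{R}$, let $z(t; x, v, \vec X, \vec V, a) \in \mathbb{R}^d$ denote the solution of
$$
\begin{cases}
\dfrac{d^2}{dt^2} z(t) = -\displaystyle\sum_{i=1}^{N} \nabla^2 U_i\bigl(\psi^0(t, x, v, \vec X) - X_i\bigr)\bigl(z(t) - (t + a)V_i\bigr), \\[1ex]
\displaystyle\lim_{t \to -\infty} z(t) = \lim_{t \to -\infty} \dfrac{d}{dt} z(t) = 0.
\end{cases}
\tag{2.3}
$$
Then $z(t; x, v, \vec X, \vec V, a)$ is a linear function of $\vec V$. Let $b_{ik;jl} : \mathbb{R}^{dN} \to \mathbb{R}$ be the
$C^\infty$-functions determined by the following:
$$\int_E \Bigl(\int_{-\infty}^{\infty} \nabla^2 U_i\bigl(\psi^0(t, x, v, \vec X) - X_i\bigr)\, z(t, x, v, \vec X, \vec V, -t)\,dt\Bigr) \rho\Bigl(\frac{1}{2}|v|^2\Bigr)\,\nu(dx, dv) = \sum_{\ell=1}^{d} \sum_{j=1}^{N} b_{i\cdot;j\ell}(\vec X)\,V_{j\ell},$$
or equivalently,
$$\int_E \Bigl(\int_{-\infty}^{\infty} \sum_{p=1}^{d} \nabla_k \nabla_p U_i\bigl(\psi^0(t, x, v, \vec X) - X_i\bigr)\, z_p(t, x, v, \vec X, \vec V, -t)\,dt\Bigr) \rho\Bigl(\frac{1}{2}|v|^2\Bigr)\,\nu(dx, dv) = \sum_{\ell=1}^{d} \sum_{j=1}^{N} b_{ik;j\ell}(\vec X)\,V_{j\ell}, \quad k = 1, \ldots, d,$$
where $z_p$ means the $p$th
element of the vector $z$ for $p = 1, \ldots, d$. For the same reason as for the quadratic
term, the integral on the left-hand side above is finite.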
Now we are in a position to give the definition of the limit diffusion generator
$L$ on $\mathbb{R}^{2dN}$:
$$L = \frac{1}{2} \sum_{i,j=1}^{N} \sum_{k,l=1}^{d} a_{ik;jl}(\vec X)\,\frac{\partial^2}{\partial V_{ik}\,\partial V_{jl}} + \sum_{i,j=1}^{N} \sum_{k,l=1}^{d} b_{ik;jl}(\vec X)\,V_{jl}\,\frac{\partial}{\partial V_{ik}} + \sum_{i=1}^{N} \sum_{k=1}^{d} V_{ik}\,\frac{\partial}{\partial X_{ik}}.$$
The coefficients $a$ and $b$ correspond to the 0-order and the 1-order approximations,
respectively, given by the freezing approximation of the molecules (see also Sec. 1).
Our main results in the present paper are formulated in Theorem 2.0.1 below.
(1) ensures the existence of a unique dynamics for $\tilde P_m$-almost every initial condition.
(2)–(4) of Theorem 2.0.1 are to be understood with respect to the convergence of
the distribution of $\{(\vec X^{(m)}(t, \tilde\omega), \vec V^{(m)}(t, \tilde\omega)),\ t \ge 0\}$ under $\tilde P_m(d\tilde\omega)$ as $m \to 0$: for the
case of only one molecule, we have the convergence with no further assumption
(assertion (2)); when there is more than one molecule, in the general case, the
convergence is valid until the stopping time given as the first time at which the
potential ranges of any pair of molecules overlap (assertion (3)); finally, for the special
case of exactly two molecules with spherically-symmetric potentials, we strengthen
the result by allowing the process to run until an arbitrary time (assertion (4)).
The precise description is as follows.

Theorem 2.0.1. Under our present setting, we have the following.

(1) For any $m > 0$, there exists a unique solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$.
(2) Assume $N = 1$. Then as $m \to 0$, the distribution of $\{(X_1^{(m)}(t), V_1^{(m)}(t)),\ t \ge 0\}$
under $\tilde P_m$ converges weakly to the diffusion process with generator $L$ in
$C([0, \infty); \mathbb{R}^{2d})$ equipped with the Skorohod metric.
(3) Assume $N \ge 2$. Let
$$\sigma_0(\tilde\omega) = \inf\Bigl\{t > 0;\ \min_{i \ne j}\bigl\{|X_i(t; \tilde\omega) - X_j(t; \tilde\omega)| - (R_i + R_j)\bigr\} \le 0\Bigr\}$$
be the first time at which the distance between the molecules in some pair is at most
the sum of the radii of their potentials. Then as $m \to 0$, the distribution of
$\{(\vec X^{(m)}(t \wedge \sigma_0), \vec V^{(m)}(t \wedge \sigma_0)),\ t \ge 0\}$ under $\tilde P_m$ converges weakly to the diffusion
with generator $L$ stopped at $\sigma_0$ in $C([0, \infty); \mathbb{R}^{2dN})$ equipped with the Skorohod
metric.
(4) Let $N = 2$ and $d \ge 3$. Assume that there exist functions $h_1, h_2$ such that
$U_i(x) = h_i(|x|)$, $i = 1, 2$,
and there exists a constant $\varepsilon_0 > 0$ such that
(1)i1 hi (s) > 0, (1)i1 hi (s) > 0, s (Ri 0 , Ri ), i = 1, 2.
Then we have that when $m \to 0$, $\{(\vec X^{(m)}(t), \vec V^{(m)}(t)),\ t \ge 0\}$ under $\tilde P_m$
converges weakly to the Markov process given by the following: it acts as the
diffusion with generator $L$ as long as the potential ranges of the two molecules
do not overlap, and the two molecules collide whenever their potential ranges
touch each other. (See Theorem 6.3.2 for the precise definition of the limiting
process.)

Let us comment a little more on the conditions in Theorem 2.0.1. We do so
for (4) first. The first half of the conditions requires that the potential functions for
the two molecules depend only on the distances from the atoms. The condition $d \ge 3$
is used in the proof of (4), and it is not surprising to have it here
since, as remarked at the end of Sec. 3, our cut-off fits reality (in the sense described
there) only if $d \ge 3$. Finally, the second half of the assumptions above implies that
at least near the edges of the potential ranges, one molecule experiences repulsive
forces with the atoms, and the other molecule experiences attractive forces with the
atoms. We use this condition to keep the velocities of the two molecules bounded
(for $m \to 0$).
This is also the reason why we need to stop the process at $\sigma_0$ in (3): our
decomposition of $V_i(t)$ (see (3.30)) is valid only when the velocities of the molecules are
O(1), which holds until $\sigma_0$ without further assumption (see (3.31)), while this is
not always true after $\sigma_0$ (to see this, notice that the resulting direct interactions
$m^{-1/2} \int_0^t \nabla_i \tilde U(\vec X(s))\,ds$ between molecules in (3.30) become $\infty$ when $m \to 0$ if
$\nabla_i \tilde U(\vec X(s)) \ne 0$). We succeeded in extending the result to any time for the special
case described in (4), by showing that in that case, the resulting direct
interactions turn out to be colliding forces, which do not change the total momenta of
the molecules; while in the general case, these might accelerate the molecules to
$\infty$ immediately (to see this, just consider the case of two molecules of the same type),
making the decomposition itself no longer valid. (See also Lemma 3.5.1 and the
paragraphs following it.)

Remark 1. We can also get the unique existence of the solution to (2.1) for $\tilde P_m$-almost
every $\tilde\omega$ under some simpler-looking assumptions (see Proposition 3.3.9).

Remark 2. We emphasize again that, as explained in Sec. 1, in our present
problem, the forces at any fixed time are not independent of the history. Therefore,
since both the molecules and the light environmental atoms are moving, the
system is very complicated and difficult to handle. Our basic idea for the proof
is that, although all of the particles are moving all the time, since the molecules
are very heavy compared with the atoms, when considering the scattering of the
atoms, we can use the approximation that the molecules are frozen (see (2.2)),
which gives us the 0-order approximation of the momentum variance of the molecules.
The 1-order error appears in our result as $z(t, x, v; \vec X, \vec V)$. (See Secs. 3–5 for more
details.)
Remark 3. For any fixed $m > 0$, although $V_i(t)$ is continuous with respect to
$t$ (since it is described by the ODEs (2.1)), our martingale part $M_i(t)$ in the
decomposition of $V_i(t)$ (see Lemma 3.5.1) need not be continuous. The only thing
we can say is that its jumps are dominated by some constant multiple of $m^{1/2}$ (see
Lemma 3.5.1). This is also one of our ideas, namely to use the martingale theorem
only for the terms for which it is applicable. For the remaining terms, instead of
trying to deal with them in detail, we show that they are negligible as $m \to 0$.
The rest of this paper is organized as follows: In Sec. 3, we prove the unique
existence of the solution, and present some preparation for the proof of convergence.
In particular, we formulate the decomposition of $V_i(t)$ (see Lemma 3.5.1) and
deduce some properties from it. The freezing approximation is also discussed in
this section. Section 4 gives the proofs of the lemmas formulated in Sec. 3. In Sec. 5,
we use these lemmas to prove the first two convergence results (Theorems 2.0.1(2)
and 2.0.1(3)), with the help of martingale theory. The proof of the last part of
Theorem 2.0.1 is given in Sec. 6.

3. Preparations
In this section, we formulate the ray representation, prove the unique existence of
the solution of the dynamics for each fixed $m > 0$, and give some preparations for
the proof of our convergence results. For the sake of simplicity, from now on, we
will omit the superscript $(m)$ when there is no risk of confusion.
We present related results of classical mechanics, especially Newton's equation
and the ray representation, in Sec. 3.1; some results with respect to
$\tilde\varphi(t, x, v; \vec X)$ are prepared in Sec. 3.2; Sec. 3.3 is devoted to the almost sure unique
existence of the solution of (2.1) with the help of the ray representation; in Sec. 3.4, we recall some
basic facts about the Skorohod spaces $(D([0, T]; \mathbf{R}^d), d^0)$ and $(D([0, \infty); \mathbf{R}^d), dis)$,
which will be used later (as described in Remark 3, although both $(\vec X(t, \omega), \vec V(t, \omega))$
and the limit processes are continuous with respect to $t$, this new space is necessary
in our proof); in Sec. 3.5, we state several basic lemmas, especially the decomposition
of $V_i(t)$, the proof of which will be given in Sec. 4; finally, in Sec. 3.6, we
prepare some basic calculations for later use.
Since we are considering the Skorohod metric, it suffices to prove our
assertions for $t \in [0, T]$ for any $T > 0$, instead of $t \in [0, \infty)$. (See [1].) So from now
on, we choose an arbitrary $T > 0$ and fix it. Also, as mentioned in Sec. 1, we use
the first time at which the velocity of some molecule becomes larger than or equal to $n$
as a stopping time: choose any $n \ge 1$ and fix it for a while (we will take $n \to \infty$ at the end). Now, we
are ready to define the following notations: Let
\[ \sigma(\omega) = \sigma_n(\omega) = \inf\Big\{ t \ge 0;\ \max_{i=1,\dots,N} |V_i(t, \omega)| \ge n \Big\}, \]
\[ R_0 = R_0(n, T) = \max_{i=1,\dots,N} (R_i + |X_{i,0}| + nT) + 1, \]
\[ C_0 = \Big( 2 \sum_{i=1}^{N} R_i \|\nabla U_i\|_\infty \Big)^{1/2}, \]
\[ \tau = \tau(n, T) = C_0^{-1} R_0. \]
Also, for the reader's convenience, we give a brief list of the other main notations
used in this paper and their meanings:

$(\vec X(t), \vec V(t), x(t, x, v), v(t, x, v))$: Solution of the dynamics,
$\tilde\varphi(t, x, v; \vec X)$: Solution of (2.2), the motion of the atoms with the molecules frozen,
which is shown to be the 0-order term in our approximation of
$(x(t, x, v), v(t, x, v))$,
$\psi(t, x, v; \vec X)$: The corresponding scattering,
$\Psi(t, x, v) := (x - tv, v)$, used in the ray representation,
$z(t, x, v; \vec X, \vec V)$: Solution of (2.3), the 1-order term in our approximation of
$(x(t, x, v), v(t, x, v))$.
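The constants introduced just before the notation list can be evaluated directly once the model data are fixed. The following is a minimal sketch, assuming the readings $R_0 = \max_i (R_i + |X_{i,0}| + nT) + 1$, $C_0 = (2\sum_i R_i \|\nabla U_i\|_\infty)^{1/2}$ and the ratio $C_0^{-1} R_0$; the function name, the argument names and the sample values are hypothetical and are not taken from the paper.

```python
import math

def constants(R, grad_sup, X0, n, T):
    """R[i]: interaction radius R_i; grad_sup[i]: assumed sup-norm of grad U_i;
    X0[i]: initial position X_{i,0} as a tuple of coordinates."""
    R0 = max(r + math.hypot(*x) + n * T for r, x in zip(R, X0)) + 1.0
    C0 = math.sqrt(2.0 * sum(r * g for r, g in zip(R, grad_sup)))
    tau = R0 / C0  # the ratio C_0^{-1} R_0
    return R0, C0, tau

# Hypothetical data: two molecules in the plane.
R0, C0, tau = constants(R=[1.0, 2.0], grad_sup=[0.5, 0.25],
                        X0=[(0.0, 0.0), (3.0, 4.0)], n=2, T=1.5)
```

With these sample values the second molecule dominates the maximum, so `R0` is governed by $R_2 + |X_{2,0}| + nT$.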

3.1. Classical mechanics


In this and the next subsection, we prepare some results with respect to the solution
of Newton's equation (2.2).
As in Sec. 2, for any $\vec X \in (\mathbf{R}^d)^N$, let $\tilde\varphi(t, x, v; \vec X)$ be the solution of (2.2).
First, let us recall the following well-known result about Newton's equation.

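Proposition 3.1.1. For any $f : \mathbf{R}^{2d} \to [0, \infty)$, we have
\[
\int_{\mathbf{R}^{2d}} f(\tilde\varphi(t, x, v; \vec X))\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_i) \Big)\, dxdv
= \int_{\mathbf{R}^{2d}} f(x, v)\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_i) \Big)\, dxdv. \tag{3.1}
\]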

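Proof. As the proof is fundamental and well known, we give only a sketch.
First, since the total energy is constant, we have
\[
\frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_i)
= \frac12 |\tilde\varphi^1(t, x, v; \vec X)|^2 + \sum_{i=1}^{N} U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i),
\]
so the left-hand side of (3.1) is equal to
\[
\int_{\mathbf{R}^{2d}} f(\tilde\varphi(t, x, v; \vec X))\, \rho\Big( \frac12 |\tilde\varphi^1(t, x, v; \vec X)|^2 + \sum_{i=1}^{N} U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \Big)\, dxdv.
\]
Therefore, in order to show the assertion, it is sufficient to show that
$\big| \frac{\partial(\tilde\varphi^0, \tilde\varphi^1)}{\partial(x, v)} \big| = 1$ for any $t > 0$.
On the other hand, by a straightforward calculation, we get from the definition of $\tilde\varphi(t, x, v; \vec X)$ that
$\frac{d}{dt} \big( \big| \frac{\partial(\tilde\varphi^0, \tilde\varphi^1)}{\partial(x, v)} \big| \big) = 0$; also, we
have by definition $(\tilde\varphi^0, \tilde\varphi^1)(0, x, v; \vec X) = (x, v)$. This completes the proof of our
assertion.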
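The two facts used in this sketch, conservation of the energy and a Jacobian determinant equal to 1, can be checked numerically on a toy example. The block below is only an illustration under stated assumptions: it works in one space dimension, replaces the compactly supported potentials of the paper by the smooth bump $U(x) = e^{-x^2}$, and integrates Newton's equation with the velocity-Verlet scheme. None of these choices come from the paper.

```python
import math

def grad_U(x):
    # Gradient of the illustrative potential U(x) = exp(-x^2) (an assumption).
    return -2.0 * x * math.exp(-x * x)

def energy(x, v):
    # Kinetic plus potential energy of a unit-mass particle.
    return 0.5 * v * v + math.exp(-x * x)

def flow(x, v, t, dt=1e-3):
    """Velocity-Verlet approximation of the time-t flow of x' = v, v' = -U'(x)."""
    for _ in range(int(round(t / dt))):
        v -= 0.5 * dt * grad_U(x)
        x += dt * v
        v -= 0.5 * dt * grad_U(x)
    return x, v

def jacobian_det(x, v, t, h=1e-5):
    """Central-difference estimate of the determinant of d(flow)/d(x, v)."""
    xp, vp = flow(x + h, v, t)
    xm, vm = flow(x - h, v, t)
    xq, vq = flow(x, v + h, t)
    xr, vr = flow(x, v - h, t)
    dxdx, dvdx = (xp - xm) / (2 * h), (vp - vm) / (2 * h)
    dxdv, dvdv = (xq - xr) / (2 * h), (vq - vr) / (2 * h)
    return dxdx * dvdv - dxdv * dvdx

x0, v0, t = -3.0, 2.0, 4.0
x1, v1 = flow(x0, v0, t)
energy_drift = abs(energy(x1, v1) - energy(x0, v0))
det = jacobian_det(x0, v0, t)
```

The energy drift is of the order of the squared step size, and the determinant estimate stays close to 1 because each velocity-Verlet substep is a shear map.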

The rest of this subsection is dedicated to a discussion of the ray representation.
Let $E$, $E_v$, $\nu$, etc., be as given in Sec. 2. Note that for any measurable
$f : \mathbf{R}^{2d} \to [0, \infty)$, we have by definition and a simple change of variables that
\[
\int_{\mathbf{R}^{2d}} f(x, v)\, dxdv = \int_{\mathbf{R} \times E} f(\Psi(t, x, v))\, dt\, \nu(dx, dv). \tag{3.2}
\]
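The change of variables behind (3.2) rests on the fact that every phase point $(y, v)$ with $v \neq 0$ can be written uniquely as $\Psi(t, x, v) = (x - tv, v)$ with $x \cdot v = 0$. A minimal sketch of this decomposition follows; the function names are hypothetical, and vectors are represented as plain tuples.

```python
def ray_map(t, x, v):
    """Psi(t, x, v) = (x - t v, v)."""
    return tuple(xi - t * vi for xi, vi in zip(x, v)), tuple(v)

def ray_coordinates(y, v):
    """Return (t, x) with x . v = 0 and Psi(t, x, v) = (y, v); requires v != 0."""
    vv = sum(vi * vi for vi in v)
    t = -sum(yi * vi for yi, vi in zip(y, v)) / vv
    x = tuple(yi + t * vi for yi, vi in zip(y, v))
    return t, x

y, v = (1.0, 2.0, -0.5), (0.5, -1.0, 2.0)
t, x = ray_coordinates(y, v)
y_back, _ = ray_map(t, x, v)
x_dot_v = sum(xi * vi for xi, vi in zip(x, v))
```

Here `x` is the foot of the perpendicular from the origin to the line through `y` with direction `v`, and `t` measures the position along that line.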

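In order to derive our new intensity function $\lambda_m$, for the sake of simplicity, we
introduce the following notations. Let
\[ \Psi_m : \mathbf{R} \times E \to \mathbf{R}^d \times (\mathbf{R}^d \setminus \{0\}), \quad (s, x, v) \mapsto \Psi_m(s, x, v) = \Psi(s, x, m^{-1/2} v), \]
and let
\[ f_m(x, v) = f(x, m^{-1/2} v), \qquad \rho_0(x, v) = \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0}) \Big). \]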

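Then we have
\begin{align*}
\int_{\mathbf{R}^{2d}} f(x, v)\, \tilde\lambda_m(dx, dv)
&= m^{\frac{d-1}{2}} \int_{\mathbf{R}^{2d}} f(x, v)\, \rho\Big( \frac{m}{2} |v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0}) \Big)\, dxdv \\
&= m^{-\frac12} \int_{\mathbf{R}^{2d}} f(x, m^{-\frac12} v)\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0}) \Big)\, dxdv \\
&= m^{-\frac12} \int_{\mathbf{R} \times E} f_m(\Psi(s, x, v))\, \rho_0(\Psi(s, x, v))\, ds\, \nu(dx, dv) \\
&= m^{-1} \int_{\mathbf{R} \times E} f_m(\Psi(m^{-\frac12} s, x, v))\, \rho_0(\Psi(m^{-\frac12} s, x, v))\, ds\, \nu(dx, dv),
\end{align*}
where we used (3.2) when passing to the fourth line. On the other hand,
\[ f_m(\Psi(m^{-\frac12} s, x, v)) = f(x - m^{-\frac12} s v, m^{-\frac12} v) = f(\Psi_m(s, x, v)). \]
Therefore,
\[ \int_{\mathbf{R}^{2d}} f(x, v)\, \tilde\lambda_m(dx, dv) = \int_{\mathbf{R} \times E} f(\Psi_m(s, x, v))\, \lambda_m(ds, dx, dv), \]
where $\lambda_m(ds, dx, dv)$ is the measure on Conf$(\mathbf{R} \times E)$ defined by
\begin{align*}
\lambda_m(ds, dx, dv) &= m^{-1} \rho_0(\Psi(m^{-\frac12} s, x, v))\, ds\, \nu(dx, dv) \\
&= m^{-1} \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - m^{-1/2} s v - X_{i,0}) \Big)\, ds\, \nu(dx, dv).
\end{align*}
Also, with a little abuse of notation, we use $\Psi_m$ to denote the natural map
from $\Omega = \mathrm{Conf}(\mathbf{R} \times E)$ to $\mathrm{Conf}(\mathbf{R}^d \times (\mathbf{R}^d \setminus \{0\}))$, i.e. $\Psi_m(A) = \{\Psi_m(a) \mid a \in A\}$.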

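Let $P_m(d\omega)$ be the Poisson point process on Conf$(\mathbf{R} \times E)$ with intensity
function $\lambda_m(ds, dx, dv)$. Then since $\lambda_m(B) = \tilde\lambda_m(\Psi_m(B))$ for any $B \in \mathcal{B}(\mathbf{R} \times E)$,
we have that
\[ P_m(A) = \tilde P_m(\Psi_m(A)), \quad \text{for all } A \in \mathcal{E}_0. \]
Therefore, we can convert our problem with respect to Conf$(\mathbf{R}^d \times \mathbf{R}^d)$ to a problem
with respect to Conf$(\mathbf{R} \times E)$.
In summary, we let $\Omega = \mathrm{Conf}(\mathbf{R} \times E)$,
\[
\lambda(ds, dx, dv) = \lambda_m(ds, dx, dv)
= m^{-1} \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - m^{-1/2} s v - X_{i,0}) \Big)\, ds\, \nu(dx, dv),
\]
and let $P_m$ be the Poisson point process on $\Omega$ with intensity $\lambda_m(ds, dx, dv)$.
The configuration $\omega$ has distribution $P_m$, and for each initial condition $\omega$ we consider the
following infinite system of ODEs (we omit the superscript $(m)$ for the sake of
simplicity):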


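\[
\begin{cases}
\dfrac{d}{dt} X_i(t, \omega) = V_i(t, \omega), \\[4pt]
M_i \dfrac{d}{dt} V_i(t, \omega) = -\displaystyle\int_{\mathbf{R} \times E} \nabla U_i\big( X_i(t, \omega) - x(t, \Psi(s, x, m^{-\frac12} v)) \big)\, \omega(ds, dx, dv), \\[4pt]
(X_i(0, \omega), V_i(0, \omega)) = (X_{i,0}, V_{i,0}), \quad i = 1, \dots, N, \\[4pt]
\dfrac{d}{dt} x(t, x, v, \omega) = v(t, x, v, \omega), \\[4pt]
m \dfrac{d}{dt} v(t, x, v, \omega) = -\displaystyle\sum_{i=1}^{N} \nabla U_i\big( x(t, x, v, \omega) - X_i(t, \omega) \big), \\[4pt]
(x(0, x, v, \omega), v(0, x, v, \omega)) = (x, v), \quad (x, v) \in \Psi_m(\omega).
\end{cases}
\tag{3.3}
\]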

3.2. Classical scattering


Continuing as in Sec. 3.1, let $\vec X = (X_1, \dots, X_N) \in (\mathbf{R}^d)^N$, and let $\tilde\varphi$ be the solution
of (2.2). In this subsection, we prove some results with respect to $\psi(t, x, v; \vec X)$ (see
(3.4) below for its definition). We call it classical scattering since, as opposed to
$(x(t, x, v, \omega), v(t, x, v, \omega))$, the massive particles are not moving when considering
$\tilde\varphi(t, x, v; \vec X)$.


Lemma 3.2.1. Let $R(\vec X) = \max\{R_i + |X_i|;\ i = 1, \dots, N\}$, and let $s_0 = \frac{R(\vec X)}{|v|}$. Then
for any $(x, v) \in E$ and $t \in \mathbf{R}$, we have that $\tilde\varphi(t + s, \Psi(s, x, v); \vec X)$ is independent of
$s$ as long as $s \ge s_0$.

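Proof. For any $(x, v) \in E$, notice that $x \cdot v = 0$ by definition of $E$, so
\[
\inf_{0 \le r \le u} |x - (s_0 + u) v + r v| = \inf_{0 \le r \le u} |x - s_0 v - (u - r) v| \ge s_0 |v| = R(\vec X), \quad u \ge 0,
\]
which implies that the right-hand side of (2.2) (the force acting on the atom) is 0 along
this trajectory, hence
\[ \tilde\varphi(u, \Psi(s_0 + u, x, v); \vec X) = (x - s_0 v, v) = \Psi(s_0, x, v), \quad u \ge 0. \]
Therefore, by the Markovian property of $\tilde\varphi$, we have
\begin{align*}
\tilde\varphi(t + s_0 + u, \Psi(s_0 + u, x, v); \vec X)
&= \tilde\varphi\big(t + s_0, \tilde\varphi(u, \Psi(s_0 + u, x, v); \vec X); \vec X\big) \\
&= \tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X), \quad t \in \mathbf{R},\ u \ge 0.
\end{align*}
That is,
\[ \tilde\varphi(t + s, \Psi(s, x, v); \vec X) = \tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X), \quad \text{for any } s \ge s_0, \]
or equivalently, $\tilde\varphi(t + s, \Psi(s, x, v); \vec X)$ is independent of $s$ as long as $s \ge s_0$.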
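The geometric step of this proof, namely that a point moving freely from $x - (s_0 + u)v$ with velocity $v$ stays at distance at least $s_0 |v| = R(\vec X)$ from the origin for times $r \in [0, u]$ when $x \cdot v = 0$, can be checked on a grid. The sketch below uses hypothetical sample values for $x$, $v$ and $R(\vec X)$; it illustrates the inequality and is not part of the argument.

```python
import math

def min_distance_on_incoming_ray(x, v, s0, u, steps=1000):
    """Smallest |x - (s0 + u) v + r v| over r in [0, u], sampled on a uniform grid."""
    best = float("inf")
    for k in range(steps + 1):
        r = u * k / steps
        p = [xi - (s0 + u - r) * vi for xi, vi in zip(x, v)]
        best = min(best, math.sqrt(sum(c * c for c in p)))
    return best

x, v = (0.0, 3.0), (2.0, 0.0)   # x . v = 0, so (x, v) lies in E
R = 5.0                          # hypothetical value of R(X)
s0 = R / math.hypot(*v)
d = min_distance_on_incoming_ray(x, v, s0, u=4.0)
```

In this example the minimum is attained at $r = u$, where the distance equals $\sqrt{|x|^2 + s_0^2 |v|^2} \ge R$.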

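By Lemma 3.2.1, we get that $\lim_{s \to \infty} \tilde\varphi(t + s, \Psi(s, x, v); \vec X)$ is well
defined, and is equal to $\tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X)$. Write it as
$\psi(t, x, v; \vec X) = (\psi^0(t, x, v; \vec X), \psi^1(t, x, v; \vec X))$, i.e.
\begin{align*}
\psi(t, x, v; \vec X) &= (\psi^0(t, x, v; \vec X), \psi^1(t, x, v; \vec X)) \\
&= \lim_{s \to \infty} \tilde\varphi(t + s, \Psi(s, x, v); \vec X) = \tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X). \tag{3.4}
\end{align*}
With the same notations as in Sec. 2, we shall present one more result concerning
$\tilde\varphi(t, x, v; \vec X)$.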

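Proposition 3.2.2. Suppose that $|v| > 2C_0$. Then
\[ \tilde\varphi^1(t, x, v; \vec X) \cdot (|v|^{-1} v) > C_0, \quad \text{for any } t \in \mathbf{R},\ x \in E_v. \]

Proof. Notice that $\tilde\varphi^1(0, x, v; \vec X) = v$. Write $\theta = |v|^{-1} v$. Then by assumption,
$v \cdot \theta = |v| > 2C_0$. Let
\[ \tau_1 = \inf\{ t \ge 0;\ \tilde\varphi^1(t, x, v; \vec X) \cdot \theta \le C_0 \}. \]
We show that $\tau_1 = +\infty$.
Suppose $\tau_1 < +\infty$. Then $\tilde\varphi^1(\tau_1, x, v; \vec X) \cdot \theta = C_0$. By definition, we have
\[
\big( \tilde\varphi^0(t, x, v; \vec X) - \tilde\varphi^0(s, x, v; \vec X) \big) \cdot \theta = \int_s^t \tilde\varphi^1(u, x, v; \vec X) \cdot \theta\, du
> C_0 |t - s|, \quad \text{for any } 0 \le s < t \le \tau_1,
\]
which implies that
\[ \frac{d}{dt} \big( \tilde\varphi^0(t, x, v; \vec X) \cdot \theta \big) \ge C_0, \quad 0 \le t \le \tau_1. \tag{3.5} \]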

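In particular, $\frac{d}{dt} ( \tilde\varphi^0(t, x, v; \vec X) \cdot \theta ) > 0$ for $0 \le t \le \tau_1$. Also, since
\[
\tilde\varphi^1(\tau_1, x, v; \vec X) - v = -\int_0^{\tau_1} \sum_{i=1}^{N} \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i)\, dt,
\]
we have by definition that
\[
-\int_0^{\tau_1} \sum_{i=1}^{N} \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \theta\, dt
= \tilde\varphi^1(\tau_1, x, v; \vec X) \cdot \theta - v \cdot \theta < C_0 - 2C_0 = -C_0.
\]
Therefore, with the help of (3.5), we have
\begin{align*}
C_0 &< \int_0^{\tau_1} \sum_{i=1}^{N} \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \theta\, dt \\
&\le \frac{1}{C_0} \sum_{i=1}^{N} \int_0^{\tau_1} \big| \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \theta \big|\, \frac{d}{dt} \big( \tilde\varphi^0(t, x, v; \vec X) \cdot \theta \big)\, dt \\
&= \frac{1}{C_0} \sum_{i=1}^{N} \int_{\{t \in [0, \tau_1],\ |\tilde\varphi^0(t, x, v; \vec X) - X_i| < R_i\}} \big| \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \theta \big|\, \frac{d}{dt} \big( \tilde\varphi^0(t, x, v; \vec X) \cdot \theta \big)\, dt \\
&\le \frac{1}{C_0} \sum_{i=1}^{N} \|\nabla U_i\|_\infty \int_{\{t \in [0, \tau_1],\ |\tilde\varphi^0(t, x, v; \vec X) - X_i| < R_i\}} \frac{d}{dt} \big( (\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \theta \big)\, dt \\
&\le \frac{1}{C_0} \sum_{i=1}^{N} \|\nabla U_i\|_\infty \cdot 2R_i \\
&= C_0,
\end{align*}
which is a contradiction. Therefore, $\tau_1 = +\infty$, i.e. $\tilde\varphi^1(t, x, v; \vec X) \cdot (|v|^{-1} v) > C_0$
for any $t \ge 0$.
The assertion for $t < 0$ can be shown in the same way by considering
\[ \tau_2 = \sup\{ t < 0;\ \tilde\varphi^1(t, x, v; \vec X) \cdot \theta \le C_0 \}. \]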

By Proposition 3.2.2, we get the following important result with respect to
$\psi^0(t, x, v; \vec X)$.

Corollary 3.2.3. For any $(x, v) \in E$ with $|v| > 2C_0$, we have that
\[ |\psi^0(t, x, v; \vec X) - X_i| > R_i, \quad i = 1, \dots, N, \]
if $t \ge 2C_0^{-1} R(\vec X)$ or $t \le -C_0^{-1} R(\vec X)$.


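Proof. Choose and fix any $(x, v) \in E$ with $|v| > 2C_0$, and let $\theta = |v|^{-1} v$. Then
since $x \cdot v = 0$, we have that
\[ \Psi^0(s, x, v) \cdot \theta = (x - sv) \cdot \theta = -s v \cdot \theta = -s|v|, \quad \text{for any } s > 0. \]
Let $s_0 = \frac{R(\vec X)}{|v|}$ as before. Then $s_0 < C_0^{-1} R(\vec X)$, and
\[ \tilde\varphi^0(0, \Psi(s_0, x, v); \vec X) \cdot \theta = \Psi^0(s_0, x, v) \cdot \theta = -s_0 |v| = -R(\vec X). \tag{3.6} \]
Also, $|\Psi^1(s_0, x, v)| = |v| > 2C_0$ by assumption. Combining (3.4), Proposition 3.2.2
and (3.6), we get
\begin{align*}
\psi^0(t, x, v; \vec X) \cdot \theta &= \tilde\varphi^0(t + s_0, \Psi(s_0, x, v); \vec X) \cdot \theta \\
&= \int_0^{t + s_0} \tilde\varphi^1(u, \Psi(s_0, x, v); \vec X) \cdot \theta\, du + \tilde\varphi^0(0, \Psi(s_0, x, v); \vec X) \cdot \theta \\
&> (t + s_0) C_0 - R(\vec X), \quad \text{for any } t > -s_0.
\end{align*}
In particular, if $t > 2C_0^{-1} R(\vec X)$, then
\[ \psi^0(t, x, v; \vec X) \cdot \theta > (t + s_0) C_0 - R(\vec X) \ge R(\vec X). \]
In the same way, if $t < -C_0^{-1} R(\vec X)$, then $t + s_0 < 0$, so
\begin{align*}
\psi^0(t, x, v; \vec X) \cdot \theta &= \tilde\varphi^0(t + s_0, \Psi(s_0, x, v); \vec X) \cdot \theta \\
&= -\int_{t + s_0}^{0} \tilde\varphi^1(u, \Psi(s_0, x, v); \vec X) \cdot \theta\, du + \tilde\varphi^0(0, \Psi(s_0, x, v); \vec X) \cdot \theta \\
&< -C_0 (-(t + s_0)) - R(\vec X) \\
&< -R(\vec X).
\end{align*}
This completes the proof of our assertion.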

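Proposition 3.2.4. For any measurable $f : \mathbf{R}^{2d} \to [0, \infty)$ such that the integrand
below is integrable, we have
\[
\int_{\mathbf{R}^{2d}} f(x, v)\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_i) \Big)\, dxdv
= \int_E \Big( \int_{-\infty}^{\infty} f(\psi(t, x, v; \vec X))\, dt \Big) \rho\Big( \frac12 |v|^2 \Big)\, \nu(dx, dv). \tag{3.7}
\]

Remark 4. The integral $\int_{-\infty}^{\infty} f(\psi(t, x, v; \vec X))\, dt$ on the right-hand side of (3.7),
although it might look like an infinite integral, is actually a finite one by
Corollary 3.2.3.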

Proof. By approximation and passage to the limit with the help of the convergence
theorem, we may and do assume, without loss of generality, that there exists a
constant $\tilde R > 0$ such that
\[ \mathrm{supp}(f) \subset \{(x, v);\ |x| + |v| \le \tilde R\}. \]
Let
\[ T = 2C_0^{-1} (\tilde R + R(\vec X)), \]
and let
\[ \tilde E(x, v) = \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_i). \]
Then by Proposition 3.1.1 and a simple change of variables, we have
\begin{align*}
\int_{\mathbf{R}^{2d}} f(x, v)\, \rho(\tilde E(x, v))\, dxdv
&= \int_{\mathbf{R}^{2d}} f(\tilde\varphi(T, x, v; \vec X))\, \rho(\tilde E(x, v))\, dxdv \\
&= \int_{\mathbf{R} \times E} f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v)))\, dt\, \nu(dx, dv). \tag{3.8}
\end{align*}
Therefore, it suffices to show that the right-hand side of (3.8) is equal to
\[ \int_{\mathbf{R} \times E} f(\psi(T - t, x, v; \vec X))\, \rho\Big( \frac12 |v|^2 \Big)\, dt\, \nu(dx, dv). \]
We only need to show that the integrands are equal, i.e. it suffices to show that
\[ f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v))) = f(\psi(T - t, x, v; \vec X))\, \rho\Big( \frac12 |v|^2 \Big). \tag{3.9} \]
Let us prove this in what follows. We first show that if the left-hand side of (3.9)
is not 0, then it is equal to the right-hand side. Assume that
\[ f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v))) \neq 0. \]
Then $\rho(\tilde E(\Psi(t, x, v))) > 0$ implies by our assumption that $\tilde E(\Psi(t, x, v)) > e_0$, so
$|v| > 2C_0$, hence by Proposition 3.2.2,
\[ \tilde\varphi^1(s, \Psi(t, x, v); \vec X) \cdot \theta > C_0 \]
for any $s \in \mathbf{R}$, where $\theta = |v|^{-1} v$. Therefore, since $(\tilde\varphi^0, \tilde\varphi^1)$ is the solution of (2.2),
we have by definition that
\begin{align*}
\big( \tilde\varphi^0(T, \Psi(t, x, v); \vec X) - \Psi^0(t, x, v) \big) \cdot \theta
&= \int_0^T \frac{d}{ds} \tilde\varphi^0(s, \Psi(t, x, v); \vec X) \cdot \theta\, ds \\
&= \int_0^T \tilde\varphi^1(s, \Psi(t, x, v); \vec X) \cdot \theta\, ds \\
&> T C_0 = 2(\tilde R + R(\vec X)), \tag{3.10}
\end{align*}
where in the last step we used the definition of $T$.

We also have $f(\tilde\varphi(T, \Psi(t, x, v); \vec X)) > 0$ in addition, which gives us
\[
|\tilde\varphi^0(T, \Psi(t, x, v); \vec X) \cdot \theta| \le |\tilde\varphi^0(T, \Psi(t, x, v); \vec X)| + |\tilde\varphi^1(T, \Psi(t, x, v); \vec X)| \le \tilde R. \tag{3.11}
\]
Combining (3.10) with (3.11), and noticing that $x \cdot v = 0$ since $(x, v) \in E$, we get
by the definition of $\Psi$ that
\[ \tilde R + 2R(\vec X) < -\Psi^0(t, x, v) \cdot \theta = -(x - tv) \cdot \theta = t|v|, \tag{3.12} \]
hence $t \ge \frac{\tilde R + 2R(\vec X)}{|v|} \ge s_0$. So by the definition of $\psi$, we get
\[ \psi(T - t, x, v) = \tilde\varphi(T - t + t, \Psi(t, x, v); \vec X) = \tilde\varphi(T, \Psi(t, x, v); \vec X). \]
Also, (3.12) gives us that $|\Psi^0(t, x, v)| = |x - tv| \ge t|v| \ge R(\vec X)$, so by the definition
of $\tilde E$, we also get
\[ \tilde E(\Psi(t, x, v)) = \frac12 |v|^2. \]
This completes the proof of the fact that if the left-hand side of (3.9) is not 0, then
it is equal to the right-hand side.
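We next show the opposite, i.e. we assume that the right-hand side of (3.9),
$f(\psi(T - t, x, v; \vec X))\, \rho(\frac12 |v|^2)$, is not 0, and show that it is equal to the left-hand side,
$f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v)))$. It is sufficient to show that $t \ge s_0$ $(= \frac{R(\vec X)}{|v|})$.
(Indeed, if $t \ge s_0$, then by using $x \cdot v = 0$, we get $|x - tv| \ge t|v| \ge R(\vec X)$, hence
$\tilde E(\Psi(t, x, v)) = \frac12 |v|^2$ by definition. Also, since $t \ge s_0$, we have by the definition of
$\psi$ that $\psi(T - t, x, v; \vec X) = \tilde\varphi(T - t + t, \Psi(t, x, v); \vec X) = \tilde\varphi(T, \Psi(t, x, v); \vec X)$, which
will complete our proof.) Since $\rho(\frac12 |v|^2) > 0$, we have $\frac12 |v|^2 > 2C_0^2$, hence $|v| > 2C_0$,
which in turn by Proposition 3.2.2 gives us that
\[ \tilde\varphi^1(u, \Psi(s, x, v); \vec X) \cdot \theta > C_0 \tag{3.13} \]
for any $u, s \in \mathbf{R}$ and $x \in E_v$. If $t \ge T$, then by the definition of $T$, since $|v| > 2C_0$,
we have
\[ t \ge T = \frac{2}{C_0} (\tilde R + R(\vec X)) > \frac{4}{|v|} (\tilde R + R(\vec X)) > \frac{R(\vec X)}{|v|} = s_0. \]
If $t < T$, then we have by (3.13) and the definition of $T$ that for any $r > 0$
\begin{align*}
\big( \tilde\varphi^0(T - t + r, \Psi(r, x, v); \vec X) - \Psi^0(r, x, v) \big) \cdot \theta
&= \int_0^{T - t + r} \tilde\varphi^1(u, \Psi(r, x, v); \vec X) \cdot \theta\, du \\
&> (T - t + r) C_0 = 2\tilde R + 2R(\vec X) + (r - t) C_0. \tag{3.14}
\end{align*}
Also, since $f(\psi(T - t, x, v; \vec X)) > 0$, we have
\[ |\psi^0(T - t, x, v; \vec X)| + |\psi^1(T - t, x, v; \vec X)| \le \tilde R. \]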

Therefore, we have for any $r \ge s_0$
\[ |\tilde\varphi^0(T - t + r, \Psi(r, x, v); \vec X)| = |\psi^0(T - t, x, v; \vec X)| \le \tilde R. \tag{3.15} \]
Combining (3.14) and (3.15), we get
\[ (r|v| =)\ -\Psi^0(r, x, v) \cdot \theta > \tilde R + 2R(\vec X) + (r - t) C_0, \quad \text{for any } r \ge s_0. \]
Applying the above to $r = s_0 = \frac{R(\vec X)}{|v|}$, we get
\[ R(\vec X) > \tilde R + 2R(\vec X) + (s_0 - t) C_0. \]
Therefore, $t > s_0$. This completes our proof.

3.3. Existence and uniqueness of the solution


In this subsection, we prove the first assertion of Theorem 2.0.1, the almost sure
unique existence of the solution of the considered infinite system of ODEs for any
fixed $m > 0$. Recall that by Sec. 3.1, we have already converted the problem into
(3.3), which uses the ray representation. In the following we shall use $C$, $\tilde C$, $C_1$, etc.,
to denote constants which may be different in different places.
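For any open subset $G \subset \mathbf{R} \times E$, let
\[ \pi_G : \mathrm{Conf}(\mathbf{R} \times E) \to \mathrm{Conf}(\mathbf{R} \times E), \quad \pi_G(\omega) = \omega \cap G. \]
Then $\pi_G$ is $\mathcal{E}_0 / \mathcal{E}_0$-measurable. Here $\mathcal{E}_0$ is the $\sigma$-algebra on $O(\mathbf{R} \times E) = \{A \subset \mathbf{R} \times E \mid A \neq \emptyset,\ A \text{ is closed}\}$,
generated by $\{\{C \in O(\mathbf{R} \times E);\ C \cap A \neq \emptyset\};\ A \text{ is open in } \mathbf{R} \times E\}$.
Also, let $\mathcal{F}_G = \sigma\{X_K;\ K \subset G,\ K \text{ is compact}\} \vee \mathcal{N}$. Here $X_K$ is the random
variable defined by $X_K(\omega) = \omega(K)\ (= \sharp(\omega \cap K))$, $\omega \in \Omega$, and $\mathcal{N}$ stands for the set
of null sets. Then it is trivial that $\{\mathcal{F}_G \mid G \text{ is open}\}$ is an increasing family of $\sigma$-algebras.
Let Fin$(\mathbf{R} \times E)$ denote the set of non-empty finite subsets of $\mathbf{R} \times E$. It is easy
to see that if $\omega \in \mathrm{Fin}(\mathbf{R} \times E)$, then (3.3) has a unique solution. In the following,
we extend this unique existence of a solution for (3.3) to $P_m$-almost every $\omega$.
Fix any $T > 0$ as before. Let $R_0$ and $\tau$ be as given at the end of Sec. 2, set
\[ G_n = \{(t, x, v) \in \mathbf{R} \times E;\ |x| < R_0,\ |t| < T + m^{1/2} \tau\}, \]
and let $\omega_n = \pi_{G_n} \omega$.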

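Lemma 3.3.1. $\omega_n \in \mathrm{Fin}(\mathbf{R} \times E)$ for $P_m$-a.e. $\omega$.

Proof. Let $c = \sum_{i=1}^{N} \|U_i\|_\infty$. Then by definition and assumption,
\begin{align*}
\lambda_m(G_n) &= \int_{\mathbf{R} \times E} 1_{\{|x| < R_0,\ |t| < T + m^{1/2} \tau\}}\, m^{-1} \rho\Big( \frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - m^{-1/2} t v - X_{i,0}) \Big)\, dt\, \nu(dx, dv) \\
&\le (2R_0)^{d-1} \cdot 2(T + m^{1/2} \tau)\, m^{-1} \int_{\mathbf{R}^d} |v|\, \rho_c\Big( \frac12 |v|^2 \Big)\, dv \\
&< \infty. \tag{3.16}
\end{align*}
So $E^{P_m}[\sharp(\omega_n)] = E^{P_m}[\omega(G_n)] = \lambda_m(G_n) < \infty$.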
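The mechanism of this proof is that a Poisson point process restricted to a set of finite intensity has a Poisson-distributed, hence almost surely finite, number of points, with mean equal to the intensity of the set. The sketch below samples such counts with Knuth's multiplication method; the value `mean = 3.0` is a hypothetical stand-in for $\lambda_m(G_n)$, and the sampler is a generic textbook routine, not something taken from the paper.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

rng = random.Random(0)
mean = 3.0   # hypothetical stand-in for lambda_m(G_n)
counts = [sample_poisson(mean, rng) for _ in range(2000)]
empirical_mean = sum(counts) / len(counts)
```

Every sampled count is a finite non-negative integer, and the empirical mean is close to the prescribed intensity.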

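By Lemma 3.3.1, we have that $(\vec X(t, \omega_n), \vec V(t, \omega_n))$ is well defined for $P_m$-almost
every $\omega$.
Next, for any $t \in [0, T]$, we define $S_t : \mathrm{Fin}(\mathbf{R} \times E) \to O(\mathbf{R} \times E)$ as
\[
S_t(\omega) = \Big\{ (u, x, v) \in \mathbf{R} \times E;\ \min_{i=1,\dots,N} \min_{0 \le s \le t} \Big( |X_i(s, \omega) - (x - (u - s) m^{-1/2} v)| - \Big( R_i + \frac12 \Big) \Big) \le 0 \Big\}
\]
for any $\omega \in \mathrm{Fin}(\mathbf{R} \times E)$. Then we have the following:

Lemma 3.3.2. For any open set $G$ and $\omega \in \mathrm{Fin}(\mathbf{R} \times E)$, we have that
$\{S_t(\omega) \subset G\} = \{S_t(\pi_G \omega) \subset G\}$.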

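Proof. Choose and fix any $\omega \in \mathrm{Fin}(\mathbf{R} \times E)$. We give the proof of the part
$\{S_t(\omega) \subset G\} \subset \{S_t(\pi_G \omega) \subset G\}$; the opposite one can be proven in exactly the same way.
Notice that by definition,
\begin{align*}
(u, x, v) \notin S_t(\omega)
&\Rightarrow |X_i(s, \omega) - (x - u m^{-1/2} v + s m^{-1/2} v)| \ge R_i + \frac12, \quad s \in [0, t],\ i = 1, \dots, N \\
&\Rightarrow x(s, x - u m^{-1/2} v, m^{-1/2} v; \omega) = x - u m^{-1/2} v + s m^{-1/2} v, \\
&\phantom{\Rightarrow{}} v(s, x - u m^{-1/2} v, m^{-1/2} v; \omega) = m^{-1/2} v, \quad \text{for any } s \in [0, t].
\end{align*}
So
\begin{align*}
(u, x, v) \notin S_t(\omega)
&\Rightarrow |X_i(s; \omega) - x(s, x - u m^{-1/2} v, m^{-1/2} v; \omega)| \ge R_i + \frac12, \quad \text{for any } s \in [0, t], \\
&\Rightarrow \nabla U_i\big( X_i(s; \omega) - x(s, x - u m^{-1/2} v, m^{-1/2} v; \omega) \big) = 0, \quad \text{for any } s \in [0, t], \tag{3.17}
\end{align*}
for $i = 1, \dots, N$. Moreover, it is trivial to see that
\[ (u, x, v) \in G \Rightarrow \omega(du, dx, dv) = \pi_G \omega(du, dx, dv). \tag{3.18} \]
(3.17) and (3.18) combined with the definition (3.3) imply
\[ S_t(\omega) \subset G \Rightarrow (\vec X(s, \omega), \vec V(s, \omega)) = (\vec X(s, \pi_G \omega), \vec V(s, \pi_G \omega)), \quad \text{for any } s \in [0, t], \tag{3.19} \]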

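(as long as $\omega \in \mathrm{Fin}(\mathbf{R} \times E)$). Therefore,
\[ S_t(\omega) \subset G \Rightarrow S_t(\pi_G \omega) \subset G. \]
We next deal with general $\omega \in \mathrm{Conf}(\mathbf{R} \times E)$. As a special case of Lemma 3.3.2,
we have the following.

Corollary 3.3.3.
\[ \{S_t(\omega_k) \subset G_n\} = \{S_t(\omega_n) \subset G_n\}, \quad P_m\text{-a.e. } \omega, \text{ for any } k > n. \]

By Lemmas 3.3.1 and 3.3.2, we have that
\[ \{S_t(\omega_n) \subset G\} = \{S_t(\pi_{G_n \cap G}\, \omega) \subset G\}, \quad P_m\text{-a.e. } \omega, \]
so by the definition of $\mathcal{F}_\cdot$, we have $\{S_t(\omega_n) \subset G\} \in \mathcal{F}_{G_n \cap G} \subset \mathcal{F}_G$ for any open
$G \subset \mathbf{R} \times E$, i.e.
\[ S_t(\omega_n) \text{ is a } \{\mathcal{F}_G\}\text{-stopping time.} \]
Here, a map $T : \Omega \to O(\mathbf{R} \times E)$ is, by definition, called an $\mathcal{F}_\cdot$-stopping time if $T$ is
$\mathcal{B}(\Omega) / \mathcal{E}_0$-measurable and $\{\omega \in \Omega;\ T(\omega) \subset G\} \in \mathcal{F}_G$ for any $\mathcal{F}_\cdot$-regular open set $G$.
For any $\omega \in \Omega$, let
\[ \sigma_n(\omega) = \inf\Big\{ t \ge 0;\ \max_{i=1,\dots,N} |V_i(t, \omega_n)| > n \Big\} \wedge T. \]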
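The capped first-exceedance time defined here has a direct discrete analogue, which may help fix ideas. The sketch below assumes that the speeds $|V_i(t)|$ have already been computed on a time grid; the function name and the sample data are hypothetical, and on a grid the result only approximates the continuous-time infimum.

```python
def stopping_time(times, speeds, n, T):
    """times: increasing grid in [0, T]; speeds[k][i] = |V_i(times[k])|.
    Returns the first grid time at which some speed exceeds n, capped at T."""
    for t, row in zip(times, speeds):
        if max(row) > n:
            return min(t, T)
    return T

# Hypothetical speeds of two molecules on a coarse grid.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
speeds = [[0.1, 0.2], [0.8, 1.1], [1.9, 2.4], [3.2, 2.0], [1.0, 1.0]]
first_exceed_3 = stopping_time(times, speeds, n=3, T=2.0)
first_exceed_5 = stopping_time(times, speeds, n=5, T=2.0)
```

When no speed exceeds the threshold, the function returns the cap `T`, which corresponds to the event that the stopping time equals $T$.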

Lemma 3.3.4. For any $n \in \mathbf{N}$, there exists a unique solution to (3.3) for $P_m$-a.e.
$\omega$ satisfying $\sigma_n(\omega) = T$.

Proof. We first notice that
\[ \sigma_n(\omega) = T \Rightarrow S_T(\omega_n) \subset G_n. \tag{3.20} \]
Indeed, if $\sigma_n(\omega) = T$, then $|V_i(t, \omega_n)| \le n$ for any $t \in [0, T]$ and $i = 1, \dots, N$, hence
$|X_i(t, \omega_n)| \le nT + |X_{i,0}|$ for any $t \in [0, T]$ and $i = 1, \dots, N$. Assume $(u, x, v) \notin G_n$.
Then either $|x| \ge R_0 + nT$ or $|u| \ge m^{1/2} C_0^{-1} (R_0 + nT) + T$. If $|x| \ge R_0 + nT$,
then $|x + rv| \ge |x| \ge R_0 + nT$ for any $r \in \mathbf{R}$, so $|X_i(s, \omega_n) - (x - u m^{-1/2} v + s m^{-1/2} v)| \ge R_i + \frac12$
for any $s \in [0, T]$, which implies that $(u, x, v) \notin S_T(\omega_n)$.
If $|u| \ge m^{1/2} C_0^{-1} (R_0 + nT) + T$, then since $|v| > C_0$ $P_m$-almost surely, for any
$s \in [0, T]$, we have $|x - u m^{-1/2} v + s m^{-1/2} v| \ge C_0^{-1} (R_0 + nT) |v| \ge R_0 + nT$, so in
this case we also have $|X_i(s, \omega_n) - (x - u m^{-1/2} v + s m^{-1/2} v)| \ge R_i + \frac12$ for any
$s \in [0, T]$, which implies that $(u, x, v) \notin S_T(\omega_n)$. In conclusion, we have in either
case that $(u, x, v) \notin S_T(\omega_n)$. This completes the proof of (3.20).
Now, we are ready to show that the desired solution is well defined almost surely
on the set $\sigma_n(\omega) = T$ for any $n \in \mathbf{N}$. Indeed, if $\sigma_n(\omega) = T$, then we have by (3.20),
Corollary 3.3.3 and (3.19) that
\[ (\vec X(t, \omega_k), \vec V(t, \omega_k)) = (\vec X(t, \omega_n), \vec V(t, \omega_n)), \quad \text{for any } t \in [0, T] \text{ and } k \ge n, \]

so we can define
\[ (\vec X(t, \omega), \vec V(t, \omega)) = (\vec X(t, \omega_n), \vec V(t, \omega_n)), \]
\[ (x(t, x, v, \omega), v(t, x, v, \omega)) = (x(t, x, v, \omega_n), v(t, x, v, \omega_n)), \]
which exists for $P_m$-almost every $\omega$ satisfying our condition by Lemma 3.3.1. Then
$(\vec X(t, \omega), \vec V(t, \omega), x(t, x, v, \omega), v(t, x, v, \omega))$ satisfies (3.3).

Notice that $\sigma_n(\omega) = T \Rightarrow \sigma_{n+1}(\omega) = T$. Therefore, to complete the proof of
Theorem 2.0.1(1), it suffices to prove the following:

Lemma 3.3.5.
\[ P\Big( \bigcup_{n=1}^{\infty} \{\sigma_n = T\} \Big) = 1. \]

We divide the proof of Lemma 3.3.5 into several steps.

Lemma 3.3.6. There exist constants $C_1, C_2 > 0$ such that
\[
\sum_{i=1}^{N} \frac12 M_i |V_i(t, \omega_n)|^2 \le C_1 + C_2 \int_{S_t(\omega_n)} 1_{G_n}(u, x, v) (1 + |v|^2)\, \omega(du, dx, dv),
\]
for any $\omega_n \in \mathrm{Fin}(\mathbf{R} \times E)$.

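Proof. For any $\omega_n \in \mathrm{Fin}(\mathbf{R} \times E)$, we have by the invariance of the energy
\begin{align*}
&\sum_{i=1}^{N} \frac12 M_i |V_i(t, \omega_n)|^2
+ \frac{m}{2} \int_{\mathbf{R} \times E} |v(t, x - u m^{-1/2} v, m^{-1/2} v; \omega_n)|^2\, \omega_n(du, dx, dv) \\
&\qquad + \sum_{i=1}^{N} \int_{\mathbf{R} \times E} U_i\big( X_i(t, \omega_n) - x(t, x - u m^{-1/2} v, m^{-1/2} v; \omega_n) \big)\, \omega_n(du, dx, dv) \\
&= \sum_{i=1}^{N} \frac12 M_i |V_{i,0}|^2 + \frac{m}{2} \int_{\mathbf{R} \times E} |m^{-1/2} v|^2\, \omega_n(du, dx, dv) \\
&\qquad + \sum_{i=1}^{N} \int_{\mathbf{R} \times E} U_i\big( X_{i,0} - (x - u m^{-1/2} v) \big)\, \omega_n(du, dx, dv). \tag{3.21}
\end{align*}
If $(u, x, v) \notin S_t(\omega_n)$, then $|X_i(s, \omega_n) - (x - (u - s) m^{-1/2} v)| > R_i + \frac12$ for any
$s \in [0, t]$ and $i = 1, \dots, N$, so by (3.3), $v(t, x - u m^{-1/2} v, m^{-1/2} v; \omega_n) = m^{-1/2} v$
and $U_i( X_i(t, \omega_n) - x(t, x - u m^{-1/2} v, m^{-1/2} v; \omega_n) ) = 0$. Therefore, (3.21) implies
\begin{align*}
&\sum_{i=1}^{N} \frac12 M_i |V_i(t, \omega_n)|^2
+ \frac{m}{2} \int_{S_t(\omega_n)} |v(t, x - u m^{-1/2} v, m^{-1/2} v; \omega_n)|^2\, \omega_n(du, dx, dv) \\
&\qquad + \sum_{i=1}^{N} \int_{S_t(\omega_n)} U_i\big( X_i(t, \omega_n) - x(t, x - u m^{-1/2} v, m^{-1/2} v; \omega_n) \big)\, \omega_n(du, dx, dv) \\
&= \sum_{i=1}^{N} \frac12 M_i |V_{i,0}|^2 + \frac{m}{2} \int_{S_t(\omega_n)} |m^{-1/2} v|^2\, \omega_n(du, dx, dv) \\
&\qquad + \sum_{i=1}^{N} \int_{S_t(\omega_n)} U_i\big( X_{i,0} - (x - u m^{-1/2} v) \big)\, \omega_n(du, dx, dv).
\end{align*}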

N 1 2
N
i=1 2 Mi |Vi,0 | i=1 Ui
m
So with C1 := and C2 := 2 + 2, we get

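\begin{align*}
\sum_{i=1}^{N} \frac12 M_i |V_i(t, \omega_n)|^2
&\le C_1 + C_2 \int_{S_t(\omega_n)} (1 + |v|^2)\, \omega_n(du, dx, dv) \\
&= C_1 + C_2 \int_{S_t(\omega_n)} 1_{G_n}(u, x, v) (1 + |v|^2)\, \omega(du, dx, dv). \tag{3.22}
\end{align*}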

Let us prepare for later use the following general result with respect to stopping
times and Poisson point process.

Lemma 3.3.7. (1) Let f : R E [0, ) be measurable and let S be a stopping


time. Then
   
Pm Pm
E f d = E f dm .
S() S()

(2) Let f : R E [0, ) be measurable and S, T be two stopping times satisfying


(i) T ()  S() for any ,
(ii) E Pm [ S() |f |dm ] < .
Then
    


E f (d d) FT = E f (d d) .
S()  T ()

Proof. As the result is already known, we give a sketch only. (See, e.g., [8, 12] for
related results.)

We first have
\[ E^{P_m}[\omega(A \setminus S) \mid \mathcal{F}_S] = \lambda_m(A \setminus S), \quad A \in \mathcal{B}(\mathbf{R} \times E). \]
This is heuristically based on the definition of a Poisson point process and the
independence of $\mathcal{F}_S$ and $\mathcal{F}_{A \setminus S}$, and can be proved rigorously, for example, first for
non-random $S$, and then extended to stopping times in a routine way.
So for positive simple functions $f$ we have
  


E Pm
f d  FS = f dm . (3.23)
(RE)\S  (RE)\S

With the help of the monotone convergence theorem, this can be extended to any
positive measurable function f in a routine way. Therefore,
     
E Pm f d = E Pm f d E Pm f d
S() RE (RE)\S()

 
= f dm E Pm
f dm
RE (RE)\S()
 
Pm
=E f dm .
S()


 For the second assertion, (3.23) implies that E[ RE
f (d dm )|FS ] =
S() f (d dm ), hence

       
  

E f (d dm ) FT = E E f (d dm ) FS  FT
S()  RE
  

=E 
f (d dm ) FT
RE

=E f (d dm ).
T ()

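Since $S_t(\omega_n)$ is a $\{\mathcal{F}_G\}$-stopping time, $\mathcal{F}_{S_{t+\varepsilon}(\omega_n)}$ is well defined for any $\varepsilon > 0$
small enough. Let $\mathcal{F}_t^{(n)} = \bigcap_{\varepsilon > 0} \mathcal{F}_{S_{t+\varepsilon}(\omega_n)}$, $0 \le t < T$. Then $\sigma_n$ is a stopping time
with respect to the filtration $\{\mathcal{F}_t^{(n)}\}_{t \in [0, T)}$. Let
\[
M_t^{(n)} = \int_{S_t(\omega_n)} 1_{G_n}(u, x, v) (1 + |v|^2) \big( \omega(du, dx, dv) - \lambda(du, dx, dv) \big).
\]

Lemma 3.3.8. $\{M_t^{(n)}\}_{t \in [0, T]}$ is a $\{\mathcal{F}_t^{(n)}\}_{t \in [0, T]}$-martingale with mean 0.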

Proof. Notice that $S_t(\omega_n)$ is monotone non-decreasing with respect to $t$, and by
assumption, with $c = \sum_{i=1}^{N} \|U_i\|_\infty$, we have
\begin{align*}
&\int_{\mathbf{R} \times E} 1_{G_n}(u, x, v) (1 + |v|^2)\, \lambda(du, dx, dv) \\
&\qquad \le 2\big( T + m^{1/2} C_0^{-1} (R_0 + nT) \big) (R_0 + nT)^{d-1} m^{-1} \int_{\mathbf{R}^d} (1 + |v|^2)\, \rho_c\Big( \frac12 |v|^2 \Big) |v|\, dv,
\end{align*}
which is finite (but may depend on $n$) by assumption.
This combined with Lemma 3.3.7 gives us our assertion.

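Proof of Lemma 3.3.5. We have by Lemma 3.3.8 that $E[M_{\sigma_n}^{(n)}] = 0$. So by
Lemma 3.3.6,
\begin{align*}
\sum_{i=1}^{N} \frac12 M_i E[|V_i(\sigma_n, \omega_n)|^2]
&\le C_1 + C_2 E\Big[ \int_{S_{\sigma_n}(\omega_n)} 1_{G_n}(u, x, v) (1 + |v|^2)\, \lambda(du, dx, dv) \Big] \\
&\le C_1 + C_2 E\Big[ \int_{S_{\sigma_n}(\omega_n)} (1 + |v|^2)\, \lambda(du, dx, dv) \Big].
\end{align*}
So with $C_3 := (\min_i \frac{M_i}{2})^{-1}$, we have
\begin{align*}
P[\sigma_n < T] &= P\Big[ \max_{i=1,\dots,N} |V_i(\sigma_n, \omega_n)| \ge n \Big] \\
&\le \frac{C_3}{n^2} E\Big[ \sum_{i=1}^{N} \frac12 M_i |V_i(\sigma_n, \omega_n)|^2 \Big] \\
&\le \frac{1}{n^2} C_1 C_3 + \frac{1}{n^2} C_2 C_3 E\Big[ \int_{S_{\sigma_n}(\omega_n)} (1 + |v|^2)\, \lambda(du, dx, dv) \Big]. \tag{3.24}
\end{align*}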

Let us estimate the expectation on the right-hand side of (3.24). Let Sd (r) denote
N
the volume of the ball in Rd with radius r, and let C1 = i=1 m1/2 Sd (Ri + 12 ), C2 =
N
i=1 m
1/2
T Sd1 (Ri + 12 ). Then

|{(u, x) R Ev ; (u, x, v) St (n )}|




=  (u, x) R Ev ; i = 1, . . . , N, s.t.,
 s  
  1 

min (x um 1/2
v) + (m 1/2 
v Vi (r, n ))dr Ri +
0st  0 2 


  s  
N
   
m 1/2 1
|v|  y Rd ; min y + (m 1/2
v V (r, ))dr  Ri + 1 
 0st  i n  2 
i=1 0

N 
1/2 1 1/2 1
m |v| T (m |v| + max |Vi (s, n )|)Sd1 Ri +
0st 2
i=1

1
+ Sd Ri +
2

1   1/2
= |v| C1 + C2 (m |v| + max |Vi (s, n )|) .
0st

Also, notice  that |Vi (s, n )| n for any s [0, n ]. Therefore,  with
C1 = m1 Rd(1 + |v|2 )(C1 + C2 m1/2 |v|)c ( 12 |v|2 )dv and C2 = m1 C2 Rd(1 +
|v|2 )c ( 12 |v|2 )dv, which are nite by assumption, we have

(1 + |v|2 )(du, dx, dv)
Sn (n )

1 2
(1 + |v|2 )m1 c
|v| |v|dv|{(u, x) R Ev ; (u, x, v) St (n )}|
Rd 2

2 1 1 2
(1 + |v| )m c |v| (C1 + C2 (m1/2 |v| + n))dv
Rd 2
= C1 + C2 n.

This combined with (3.24) implies


1 1
P (n < T ) 2
C1 C3 + 2 C2 C3 (C1 + C2 n) 0, as n ,
n n
which completes the proof.
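The final estimate has the form $(a + b n)/n^2$ with constants $a, b$ that do not depend on $n$, so it decays like $1/n$. The short sketch below tabulates this bound; the values of `a` and `b` are hypothetical placeholders for the products of constants appearing in the estimate, not values from the paper.

```python
def tail_bound(n, a, b):
    """Upper bound of the form (a + b*n) / n**2 for P[sigma_n < T]."""
    return (a + b * n) / n ** 2

# Hypothetical constants a and b, chosen only to display the decay rate.
bounds = [tail_bound(n, a=10.0, b=4.0) for n in (1, 10, 100, 1000)]
```

For small $n$ the bound exceeds 1 and carries no information; only its behavior as $n \to \infty$ matters for the argument.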

As mentioned in Sec. 2, we can also get the unique existence of the solution
of (2.1) under the following condition (and without any further assumption such
as (A1) or (A2)): $d \ge 2$ and $\int_{\mathbf{R}} (1 + |s|)^d \rho(s)\, ds < \infty$. (See Proposition 3.3.9.)
This result is not necessary for the rest of this paper, but we include it here since
the condition is very simple: the intensity function decreases rapidly enough at
infinity.

Proposition 3.3.9. Assume that $d \ge 2$ and
\[ \int_{\mathbf{R}} (1 + |s|)^d \rho(s)\, ds < \infty. \tag{3.25} \]
Then there exists a unique solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$.

Notice that neither does Theorem 2.0.1(1) include Proposition 3.3.9 nor
vice versa.

Proof. The proof is almost the same as the one we just used for Theorem 2.0.1(1),
although we do not use the ray representation this time. The point is that in the
proof of Theorem 2.0.1(1), the assumption (A2) was only used to estimate several
integrals with respect to $\rho(\frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - m^{-1/2} t v - X_{i,0}))$ (e.g., (3.16)), while
if we do not use the ray representation, then in the corresponding term
$\rho(\frac12 |v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0}))$ the shift $\sum_{i=1}^{N} U_i(x - X_{i,0})$ does not depend on $v$,
so by the change of variable $r = |v|$ and a
suitable shift we can get similar estimates without the help of (A2).
We give a brief sketch of the proof in the following. Unless otherwise specified,
the notations have the same meanings as in the proof of Theorem 2.0.1(1).
First notice that for any $\alpha \ge 0$, we have $\alpha + \frac{d}{2} - 1 \ge 0$ since $d \ge 2$, so for any
$c \in \mathbf{R}$, we have by assumption and a simple calculation that
\[
\int_{\mathbf{R}^d} |v|^{2\alpha} \rho\Big( \frac{m}{2} |v|^2 + c \Big)\, dv \le C_{d,m} \int_{\mathbf{R}} (|c| + |s|)^{\alpha + \frac{d}{2} - 1} \rho(s)\, ds \tag{3.26}
\]
for some constant $C_{d,m} > 0$ independent of $c$. So
\[
\int_{\mathbf{R}^d} |v|^{2\alpha} \rho\Big( \frac{m}{2} |v|^2 + c \Big)\, dv < \infty, \quad \text{if } 0 \le \alpha \le \frac{d}{2} + 1. \tag{3.27}
\]

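Let
\[ \tilde G_n = \{(x, v) \in \mathbf{R}^{2d};\ |x| < R_0 + nT + |v| T\}, \]
and let $\tilde\omega_n = \tilde\omega \cap \tilde G_n$. Then since
\[ |\{x;\ (x, v) \in \tilde G_n\}| = 2^d (R_0 + nT + T|v|)^d \le 4^d (R_0 + nT)^d + 4^d T^d |v|^d, \]
with $C = \sum_{i=1}^{N} \|U_i\|_\infty$, we have the following by (3.26) and our assumption:
\begin{align*}
m^{-\frac{d-1}{2}} \tilde\lambda_m(\tilde G_n)
&= \int_{\tilde G_n} \rho\Big( \frac{m}{2} |v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0}) \Big)\, dxdv \\
&\le \int_{|x| \le R_0} \rho\Big( \frac{m}{2} |v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0}) \Big)\, dxdv
+ \int_{\tilde G_n \cap \{|x| > R_0\}} \rho\Big( \frac{m}{2} |v|^2 \Big)\, dxdv \\
&\le (2R_0)^d C_{d,m} \int_{\mathbf{R}} (C + |s|)^{\frac{d}{2} - 1} \rho(s)\, ds \\
&\qquad + 4^d (R_0 + nT)^d \int_{\mathbf{R}^d} \rho\Big( \frac{m}{2} |v|^2 \Big)\, dv + 4^d T^d \int_{\mathbf{R}^d} |v|^d \rho\Big( \frac{m}{2} |v|^2 \Big)\, dv \\
&\le (2R_0)^d C_{d,m} \int_{\mathbf{R}} (C + |s|)^{\frac{d}{2} - 1} \rho(s)\, ds
+ 4^d (R_0 + nT)^d C_{d,m} \int_{\mathbf{R}} |s|^{\frac{d}{2} - 1} \rho(s)\, ds \\
&\qquad + 4^d T^d C_{d,m} \int_{\mathbf{R}} |s|^{d-1} \rho(s)\, ds \\
&< \infty.
\end{align*}
So the conclusion of Lemma 3.3.1 still holds in our case.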


The proof of Theorem 2.0.1(1) up to Lemma 3.3.8 remains valid in the present case,
with only trivial modifications such as $\mathbf{R} \times E$ replaced by $\mathbf{R}^{2d}$, and with the
definition of $S_t$ modified as $S_t : \mathrm{Fin}(\mathbf{R}^{2d}) \to O(\mathbf{R}^{2d})$, given by
  
2d 1
) = (x, v) R ; min
St ( min |Xi (s,
 ) (x + sv)| Ri + 0
i=1,...,N 0st 2

for any  Fin(R2d).


The fact that R2d 1Gn (x, v)(1 + |v|2 ) m (dx, dv) < in the proof of
Lemma 3.3.8 is now proven as follows: since |{x; (x, v) Gn }| = 2d (R0 + nT +
T |v|)d and there exists a constant C2 > 0 (depending on R0 , n, T, d) such that
2d (R0 + nT + T |v|)d (1 + |v|2 ) C2 (1 + |v|d+2 ), we get by (3.26) and our assumption
that

1Gn (x, v)(1 + |v|2 )
1d
m 2 m (dx, dv)
R2d

m
N
(1 + |v|2 ) |v|2 + Ui (x Xi,0 ) dxdv
|x|R0 2 i=1

m 2
+ (1 + |v|2 ) |v| dxdv
Gn {|x|>R0 } 2

m 2

N
2
dx (1 + |v| ) |v| + Ui (x Xi,0 ) dv
|x|R0 Rd 2 i=1

m 2
+ C2 (1 + |v| )
d+2
|v| dv
Rd 2

[(C + |s|) 2 1 + (C + |s|) 2 ](s)ds
d d
(2R0 )d Cd,m


[|s| 2 1 + |s|d ](s)ds
d
+ C2 Cd,m

< ,
N
where C = i=1 Ui .

The part under the title "Proof of Lemma 3.3.5" changes the most, and we give
it as follows:
 

N
1 2 2 
 )| ] C1 + C2 E
Mi E[|Vi (n , n 1Gn (x, v)(1 + |v| )m (dx, dv) .
i=1
2 Sn (n
e)

Therefore, with C3 := (min M2i )1 , we have


 
P [n < T ] = P max |Vi (n , n  )| n (3.28)
i=1,...,N
N 
C3
1
2
2E Mi |Vi (n , n
 )|
n i=1
2
 
1 1 2 
2 C3 C1 + 2 C3 C2 E 1Gn (x, v)(1 + |v| )m (dx, dv) .
n n Sn (n e)

(3.29)
Notice that by denition,  2
m (dx, dv) = ( 2 |v| )dxdv if |x| > R0 . Also, there exist
m
 
constants C0 , C1 > 0 (depending on T, N, d and Ri ) such that
|{x Rd ; (x, v) St (n
 )}|
 
 1 

=  x R ; i {1, . . . , N }, s.t., min |x + sv Xi (s, n
d
 )| Ri +
0st 2 
 s 
 1 
=  x Rd ; i {1, . . . , N }, s.t., min |x + (v Vi (r, n  ))dr| Ri +
0st 0 2 


N
 
C0 + C1 |v| + max |Vi (s, n  )| .
0st
i=1

Moreover, |Vi (t, n


 )| n if t [0, n ]. Therefore, by assumption and (3.27), there
exist constants C0 , C1 > 0 such that

1Gn (x, v)(1 + |v|2 ) m (dx, dv)
Sn (n
e)

(1 + |v|2 )
m (dx, dv)
|x|R0


m 2
|v| dv(1 + |v|2 )|{x Rd ; (x, v) Sn (n
d1
+m 2  )}|
Rd 2

m
N
(1 + |v|2 ) |v|2 +
d1
m 2 dx Ui (x Xi,0 ) dv
|x|R0 Rd 2 i=1

m 2
(C0 + C1 (|v| + N n))(1 + |v|2 )
d1
+m 2 |v| dv
Rd 2


[(C + |s|) 2 1 + (C + |s|) 2 ](s)ds
d1 d d
(2R0 )d m 2 Cd,m


(C0 C1 nN )Cd,m [|s| 2 1 + |s| 2 ](s)ds
d1 d d
+m 2 +


C1 Cd,m
d1 d1 d+1
+m 2 [|s| 2 + |s| 2 ](s)ds

C0 + C1 n.

This combined with (3.29) implies $P(\sigma_n < T) \to 0$ as $n \to \infty$.

3.4. Some basic facts about Skorohod spaces


In this subsection, we recall some basic facts about the Skorohod spaces
(D([0, T ]; Rd ), d0 ) and (D([0, ); Rd ), dis), and the tightness of the probability
measures on them. As mentioned in Remark 3 of Sec. 2, these spaces will be needed
in order to carry out our proof. (See [1] for more details.)
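For any $T > 0$, let $D([0,T];\mathbf{R}^d)$ be the Skorohod space:
\[
  D([0,T];\mathbf{R}^d) = \Big\{ w\colon [0,T] \to \mathbf{R}^d;\ w(t) = w(t+) := \lim_{s \downarrow t} w(s),\ t \in [0,T),
  \text{ and } w(t-) := \lim_{s \uparrow t} w(s) \text{ exists},\ t \in (0,T] \Big\},
\]
with the metric $d^0 = d^0_T$ given by
\[
  d^0(w,\tilde w) = \inf_{\lambda \in \Lambda} \big\{ \|\lambda\|^0 \vee \|w - \tilde w \circ \lambda\|_\infty \big\}
\]
for any $w, \tilde w \in D([0,T];\mathbf{R}^d)$, where
\[
  \Lambda = \{\lambda\colon [0,T] \to [0,T];\ \text{continuous, non-decreasing},\ \lambda(0) = 0,\ \lambda(T) = T\},
\]
$\|w\|_\infty = \sup_{0 \le t \le T} |w(t)|$, and
\[
  \|\lambda\|^0 = \sup_{0 \le s < t \le T} \Big| \log \frac{\lambda(t) - \lambda(s)}{t - s} \Big|
\]
for any $\lambda \in \Lambda$.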
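To illustrate why this metric is weaker than the uniform one, here is a small worked example. It is only a sketch: the choices $T = 1$, $d = 1$, the step functions $w$, $w_n$ and the time changes $\lambda_n$ are introduced here for illustration and do not appear in the original text. The uniform distance between the two step functions stays equal to $1$, while a small deformation of time matches their jumps exactly.

```latex
% Illustrative example, assuming T = 1, d = 1 and n >= 3.
\[
  w = 1_{[1/2,\,1]}, \qquad w_n = 1_{[1/2 + 1/n,\,1]}, \qquad
  \|w - w_n\|_\infty = 1 .
\]
% Piecewise-linear time change sending 1/2 to 1/2 + 1/n:
\[
  \lambda_n(t) =
  \begin{cases}
    (1 + 2/n)\,t, & 0 \le t \le 1/2,\\
    1 - (1 - 2/n)(1 - t), & 1/2 \le t \le 1,
  \end{cases}
  \qquad w_n \circ \lambda_n = w .
\]
% All chord slopes of \lambda_n lie between 1 - 2/n and 1 + 2/n, hence
\[
  \|\lambda_n\|^0 = \max\{\log(1 + 2/n),\ -\log(1 - 2/n)\} = -\log(1 - 2/n),
  \qquad
  d^0(w, w_n) \le -\log(1 - 2/n) \to 0 \quad (n \to \infty).
\]
```

So $w_n \to w$ in the Skorohod metric although $w_n$ does not converge to $w$ uniformly.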
It is well known that $(D([0,T];\mathbf{R}^d), d^0)$ is a complete metric space. Also,
$C([0,T];\mathbf{R}^d) = \{w\colon [0,T] \to \mathbf{R}^d;\ \text{continuous}\}$ is closed in $(D([0,T];\mathbf{R}^d), d^0)$, and
the Skorohod topology relativized to $C([0,T];\mathbf{R}^d)$ coincides with the uniform topology
there. (See, e.g., [1].)
We have the following result about the tightness in $\wp(D([0,T];\mathbf{R}^d))$, the space
of all probabilities on $D([0,T];\mathbf{R}^d)$: let $(\Omega_n, \mathcal{F}_n, P_n)$, $n = 1, 2, \ldots$, be probability
spaces, and let $X_n\colon \Omega_n \to D([0,T];\mathbf{R}^d)$, $n \in \mathbf{N}$, be measurable. Let $\mu_{X_n} = P_n \circ X_n^{-1}$.

Then we have the following:

Theorem 3.4.1. Suppose that there exist constants , , , C > 0 such that

(1) E Pn [ Xn ( ) ] C,
(2) E Pn [|Xn (r)Xn (s)| |Xn (s)Xn (t)| ] C|tr|1+ for any 0 r s t 1,
(3) E Pn [|Xn (s) Xn (t)| ] C|t s| for any 0 s t 1,

for any n N. Then {Xn } d


n=1 is tight in (D([0, T ]; R )).

Proof. This is a corollary of results of [1]. Indeed, by [1, Theorem 13.2] and the
paragraph between pp. 140–141 there, the family of distributions of $X_n$, $n \ge 1$, is tight if the
following four conditions are satisfied (see [1] for the notations).

(1) lima lim supn Pn ( Xn a) = 0,



(2) lim0 supnN Pn (|wX n
()| a) = 0 for any a > 0,
(3) lim0 supnN Pn (|Xn () Xn (0)| a) = 0 for any a > 0,
(4) lim0 supnN Pn (|Xn (1) Xn (1 )| a) = 0 for any a > 0.

The fact that our conditions (1) and (3) imply (1) and (3) here, respectively, is
trivial by Chebyshev's inequality. The condition (4) here is also obtained in the same
way, with the help of our (1) and the dominated convergence theorem. So the only
thing left is to confirm that the (2) here is also satisfied. We do it in the following.
We use [1, Theorem 10.4], (the quantities , ((s, t]), and P there are
1
Xn , C 1+ (t s), /2 and Pn in our case, respectively, and the quantity L(, )

there is now replaced by wX n
()). Our condition (2) implies that

Pn (|Xn (s) Xn (r)| |Xn (t) Xn (s)| )


1 Pn
E [|Xn (r) Xn (s)| |Xn (s) Xn (t)| ]
2
1
2 C|t r|1+

1
= 2 ((r, t])1+ ,

i.e. [1, (10.20)] is satisfied. So by [1, Theorem 10.4], [1, (10.21)] holds, i.e.

 2K 1 1
Pn (|wX n
()| a) 2
(C 1+ T )(C 1+ 2) .
a
The right-hand side above certainly converges to $0$ as $\delta \to 0$ for any $a > 0$.

Finally, let $D([0,\infty);\mathbf{R}^d)$ be the set of functions on $[0,\infty)$ that are right continuous
and have left limits at every point, and let
\[
  dis(w_1, w_2) = \sum_{n=1}^{\infty} 2^{-n} \big( 1 \wedge d^0_n(g_n w_1, g_n w_2) \big),
\]
where $d^0_n$ is the Skorohod metric on $D([0,n];\mathbf{R}^d)$ as just defined, and $g_n$ is the
function given by $g_n(t) = 1_{\{t \in [0,n-1]\}} + (n - t) 1_{\{t \in (n-1,n]\}}$.
Then the convergence to a continuous process in $(D([0,\infty);\mathbf{R}^d), dis)$ is equivalent
to the convergence to it in $C([0,T];\mathbf{R}^d)$, equipped with the uniform norm on $[0,T]$, for all $T > 0$.
By [1, Theorem 16.7], we have that in order to prove the weak convergence of
the distribution of a process with $t \in [0,\infty)$ in $(D([0,\infty);\mathbf{R}^d), dis)$, it is sufficient
to show it for $t \in [0,T]$, for all $T > 0$.

3.5. Basic lemmas


In this subsection, we state several key lemmas which are used for the proof of our
results. The proof of these lemmas will be given in Secs. 4 and 5.
Let
(T,n)
Ft = Ft = F(,t+2m1/2 )E

= { (, t + 2m1/2 ) E} .
Proposition 3.6.5 below ensures that (Xi (t ), Vi (t )), i = 1, . . . , N , are
Ft -measurable.
Also, we dene a new potential in the following way. Let

(t) = (s)ds, t R,
t


1 2
p(s) =  |v| + s dv,
Rd 2
and let

N
 X)
U(  = p Ui (Xi x) p(0) dx.
Rd i=1

Some more discussion concerning U  will be given after Lemma 3.5.1.


Our key decomposition is given in Lemma 3.5.1. Its result suffices for the proof
of the tightness, but in order to find the limit, concrete expressions for $M_i(t)$ and
$P_i^1(t)$ are necessary, and will be given later (see (4.22)). In order to keep the line
of our proof sharp we shall first avoid presenting such concrete expressions.

Lemma 3.5.1. For any i = 1, . . . , N, there exist an Rd -valued (Ft )t -martingale


Mi (t), an Rd -valued (Ft )t -adapted process i (t) and an Rd -valued (Ft )t -adapted
C 1 -class (in t) process Pi1 (t), such that
(1)
Mi (Vi (t ) Vi (0))
t
= Mi (t) + i (t) + Pi1 (t) m1/2  (X(s))ds,
i U (3.30)
0

(2)
  
 d 1 2
sup sup E Pm  Pi (t) < ,
 dt 
m(0,1] t[0,T ]

(3) there exists a constant C independent of m such that for any i = 1, . . . , N, 0


s t T and m (0, 1], we have
E Pm [|Mi (t) Mi (s)|2 |Fs ] C|t s|,
and the jumps of Mi () satisfy |Mi (t)| Cm1/2 ,
(4)
 
2
E Pm
sup |i (t)| 0, as m 0
t[0,T ]

for any i = 1, . . . , N .
In particular, the distributions of {Mi (t) + i (t); t [0, T ]} and {Pi1 (t); t
[0, T ]} under Pm are tight in (D([0, T ]; Rd)) as m 0, and any of their
cluster points have continuous canonical processes.
Let us explain a little bit before going further. As claimed in Sec. 1, in our
model, the molecules feel each other through the mediation of the gas atoms, and
the molecules do not interact with each other directly. In Lemma 3.5.1, we re-express
the interactions in such a way that the light atoms do not appear explicitly
this time. In this new expression, the function $\tilde U(\vec X)$ appears as a new potential.
As will be shown later (Lemma 4.3.3), it is approximately the expected total force
given by the freezing approximations $\psi^0(t, x, v, \vec X)$.
By the definition of $\tilde U$, it is easy to see that if $|X_i - X_j| > R_i + R_j$ for any $i \ne j$,
then $\tilde U(\vec X) = \sum_{i=1}^{N} \int_{\mathbf{R}^d} (p(U_i(x)) - p(0))\,dx$, therefore,
\[
  \nabla \tilde U(\vec X) = 0, \quad \text{if } |X_i - X_j| > R_i + R_j \text{ for any } i \ne j. \tag{3.31}
\]
So in this case, the value of $\tilde U$ at $\vec X$ is a constant. Write this constant as $U_0$.
So the gradient $\nabla \tilde U(\vec X(t))$ of our new potential stays $0$ until some pair of molecules comes
so close that their (original) potentials overlap. This is intuitively natural: when
the molecules are far enough from each other, as a result of our cut-off, they feel
the influence of different atoms, so by the symmetry of the potentials and the
initial distribution $\lambda_m$, we get our assertion. Also notice that as soon as this term
becomes non-zero, since $m^{-1/2} \to \infty$, it gives us an infinitely strong force. This
is why we needed to stop the process in Theorem 2.0.1(2) (see also the paragraphs
following it).
Also, we will use the following lemmas to prove Theorem 2.0.1(4):

Lemma 3.5.2. Let D be any open subset of RdN , and assume that for any i =
1, . . . , N, there exists a Cb1 -class function gi : D
Rd satisfying

gi (X)  X)
 i U(  (X)|,
 = |i U   D,
for any X i = 1, . . . , N.

Let
D = inf{t 0; X(t) D }.
  C

Then
(1)
 
T f
D

sup E Pm
m 1/2  X(t))|dt
|i U(  <
m(0,1] 0

for any i = 1, . . . , N,
N
 (X(t
(2) the distributions of m1/2 (U   
D )) U0 ) +
2
2 |Vi (t 
Mi
i=1 D )|
under Pm is tight in (C([0, T ]; R)) as m 0.
Let $L$ be the operator defined in Sec. 2. By looking into the concrete expressions
of the decomposition (3.30), we can get the following lemma. In particular, this
implies Theorems 2.0.1(2) and 2.0.1(3). The proof will be given in Sec. 5.
Lemma 3.5.3. Let D0 = (supp U  U 0 )C RdN , and assume that f C0 (D0
dN
R ). Then we have that the distribution of f (X(t  ), V (t )) under Pm is
tight in (C([0, T ]; R)) as m 0, and its limit is the solution of the L-martingale
problem stopped at .

3.6. Some basic calculation


In this subsection, we prepare some estimates, especially some properties of
$x(t, x, v, \omega)$ (see Propositions 3.6.3–3.6.5), for later use.
First notice that it is trivial by definition that
\[
  |X_i(t,\omega)| \le |X_{i,0}| + nT, \quad \text{for any } t \in [0, \sigma(\omega) \wedge T]. \tag{3.32}
\]
Proposition 3.6.1. Suppose that $(x,v) \in E$, $|v| > (2C_0 + 1)m^{-1/2}$ and $n \le m^{-1/2}$.
Then
\[
  (|v|^{-1}v) \cdot v(t, x, v; \omega) \ge m^{-1/2}(C_0 + 1), \quad \text{for any } t \in [0, \sigma(\omega)].
\]

Proof. Let = |v|1 v and let


= inf{t > 0; v(t, x, v, ) < m1/2 (C0 + 1)}.
We only need to show that (). Suppose that the contrary holds. Notice that
by denition,
N

(v(, x, v, ) v) = m1 (Ui (x(t, x, v, ) Xi (t, )) )dt.


i=1 0

Also, for any t [0, ()], we have by assumption


d
(x(t, x, v, ) Xi (t, )) = v(t, x, v, ) Vi (t, )
dt
m1/2 (C0 + 1) n

m1/2 (C0 + 1) m1/2


= m1/2 C0 ,
in particular, (x(t, x, v, ) Xi (t, )) is monotone increasing with respect to t.
So since v = |v| > (2C0 + 1)m1/2 by assumption, we have
m1/2 C0 < (v(, x, v, ) v)

N
= m1 (Ui (x(t, x, v, ) Xi (t, )) )dt
i=1 0

N


m1 |Ui (x(t, x, v, ) Xi (t, )) |
i=1 0

(m1/2 C0 )1 d[(x(t, x, v, ) Xi (t, )) ]


N
m1 (m1/2 C0 )1 Ui
i=1

d[(x(t, x, v, ) Xi (t, )) ]
|(x(t,x,v,)Xi (t,))|Ri

N
m1 (m1/2 C0 )1 Ui 2Ri
i=1

= m1/2 C0 ,
which yields a contradiction. Therefore, ().

Since we are considering the limit behavior as $m \to 0$, without loss of generality,
we assume $n < m^{-1/2}$ from now on. Also, for the sake of simplicity, from now on,
we omit the notation $\omega$ when there is no risk of confusion.
Note that in our setting, since

d x(t, (s, x, m1/2 v)) = v(t, (s, x, m1/2 v)),



dt

d

N

m v(t, (s, x, m 1/2
v)) = Ui (x(t, (s, x, m1/2 v)) Xi (t)),
dt
i=1
we have
d2
x(m1/2 t + s, (s, x, m1/2 v))
dt2

N
= Ui (x(m1/2 t + s, (s, x, m1/2 v)) Xi (m1/2 t + s, )).
i=1
Also, for any s > 0 and t [0, T ()], we have by denition and (3.32) that
(x(t, (s, x, m1/2 v)), v(t, (s, x, m1/2 v))) = (s t, x, m1/2 v) (3.33)

if t < s (m1/2 C0 )1 R0 , (x, v) E and |v| 2C0 + 1. Indeed, since 0 t <


s (m1/2 C0 )1 R0 , for any u [0, t], we have that |s u| > (m1/2 C0 )1 R0 . This
combined with (x, v) E and |v| 2C0 + 1 gives us that |x (s u)m1/2 v|
|(s u)m1/2 v| > R0 Ri + |Xi,0 | + nT , which in turn combined with |Xi (u, )|
|Xi,0 | + nT implies that |x (s u)m1/2 v Xi (u, )| Ri for any u [0, t] and
i = 1, . . . , N . Therefore, until t, the velocity of this atom keeps unchanged, hence
its position at time t is equal to x (s t)m1/2 v.
Therefore,

1/2 1/2 d 1/2 1/2
x(m t + s, (s, x, m v)), x(m t + s, (s, x, m v))
dt
= (0 (m1/2 t, x, m1/2 v), m1/2 1 (m1/2 t, x, m1/2 v)) = (x + tv, v)
= (t, x, v) (3.34)

if t < C01 R0 , (x, v) E, |v| 2C0 + 1, and 0 m1/2 t + s T ().


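We recall the following well-known Gronwall's Lemma, for later use.

Lemma 3.6.2 (Gronwall's Lemma). Suppose that a continuous function $g(\cdot)$
satisfies
\[
  0 \le g(t) \le \alpha(t) + \beta \int_0^t g(s)\,ds, \quad 0 \le t \le T,
\]
with $\beta \ge 0$ and $\alpha\colon [0,T] \to \mathbf{R}$ integrable. Then
\[
  g(t) \le \alpha(t) + \beta \int_0^t \alpha(s) e^{\beta(t-s)}\,ds, \quad 0 \le t \le T.
\]
In particular, if $\alpha(t) = \alpha$ is a constant, then
\[
  g(t) \le \alpha e^{\beta t}, \quad 0 \le t \le T.
\]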
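The constant case follows from the general bound by a one-line integration. The short derivation below is a sketch; it writes $\alpha$ for the (constant) forcing term and $\beta \ge 0$ for the constant in front of the integral, and these two names are assumptions made here because the symbols are not legible in every rendering of the lemma.

```latex
% Constant case \alpha(t) \equiv \alpha of Gronwall's Lemma, for \beta > 0:
\[
  g(t) \le \alpha + \beta \int_0^t \alpha\, e^{\beta (t-s)}\,ds
       = \alpha + \alpha \Big[ -e^{\beta (t-s)} \Big]_{s=0}^{s=t}
       = \alpha + \alpha \big( e^{\beta t} - 1 \big)
       = \alpha\, e^{\beta t}.
\]
% For \beta = 0 the hypothesis already reads g(t) \le \alpha = \alpha e^{0 \cdot t}.
```

This is the form used repeatedly in this section: a bound of the type $g(t) \le A + B \int_0^t g(s)\,ds$ with constants $A, B \ge 0$ yields $g(t) \le A e^{Bt}$.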

As claimed in Sec. 2, we will use 0 (t, x, v, X(sam


 1/2
, )) as an approximation
of x(m1/2 t + s; (s, x, m1/2 v)). In the following two propositions, with the help
of Gronwall's Lemma, we show that this is a good approximation by giving an
estimate for the error (see Proposition 3.6.3(3)), which is necessary when showing
the tightness, and by giving the coefficient of the next term in its expansion (see
Proposition 3.6.4), which is necessary when showing the convergence to the limit.

Proposition 3.6.3. Fix any a R. Suppose that 0 s am1/2 T () and


0 s m1/2 T (). Let

y(t) = x(m1/2 t + s, (s, x, m1/2 v)) 0 (t, x, v; X(s


 am1/2 , )).

Also, suppose that (x, v) E and |v| > 2C0 + 1. Then

(1) y(t) = 0 if 0 m1/2 t + s T () and t ,



(2)
d2
N
y(t) = {Ui (y(t) + 0 (t, x, v; X(s
 am1/2 , )) Xi (m1/2 t + s, ))
dt2 i=1

Ui ( 0 (t, x, v; X(s
 am1/2 , )) Xi (s am1/2 , ))}.

(3) there exists a constant C, depending only on n, and N 2 Ui , such that
i=1
 
d 
|y(t)| +  y(t) m1/2 C(2
 + |a|), (3.35)
dt
if 0 m1/2 t + s T () and |t| 2 .

Proof. We rst show the rst assertion. We have by (3.34) that x(m1/2 t +
s, (s, x, m1/2 v)) = x+tv in our setting. We next look at the term 0 (t, x, v; X(s

1/2 1/2
am , )). It is trivial that |Xi (s am , )| |Xi,0 | + nT under our assump-
tion. Also, since t and |v| 2C0 + 1, we have for any s big enough that
u [0, t + s] u s [
s, t] [
s, ], hence
inf |x sv + uv| |t||v| C01 R0 (2C0 + 1) R0 ,
s]
u[0,t+e

(this might look incorrect if one forgets the fact that t is now taken to be nega-
tive). Therefore, 0 (t, x, v; X(s 0 (t + s, x sv, v; X(s
 am1/2 , )) = limes 
1/2
am , )) = x + tv. This proves our rst assertion.
The second assertion is trivial by definition.
Let us prove the third assertion. Notice that for any |t| 2 satisfying 0
m1/2 t + s T (), we have
|Xi (m1/2 t + s, ) Xi (s am1/2 , )|
n|(m1/2 t + s) (s am1/2 )| nm1/2 (2 + |a|),
so by (2),
 2 

d  N
 y(t) 2 Ui |y(t) [Xi (m1/2 t + s, ) Xi (s am1/2 , )]|
 dt2 
i=1

N
2 1/2 2
Ui m n(2 + |a|) + Ui |y(t)|.
i=1 i=1
Therefore,
      2 
d      
  y(t), d y(t)   d y(t) +  d y(t)
 dt  dt   dt   dt 2 
N

m1/2 2 Ui n (2 + |a|)
i=1
 

N
 d 
+ 1+ Ui  y(t), y(t) 
2

i=1
dt

if |t| 2 and 0 m1/2 t + s T (). Also, by (1), y( ) = dtd


y( ) = 0. Let
g(t) = |(y(t ), dt y(t ))|, then we have g(0) = 0 and
d

    
d    
 g(t) =  d  y(t ), d y(t ) 
 dt   dt  dt 
N

N
1/2 2 2
m Ui n (2 + |a|) + 1 + Ui g(t),
i=1 i=1

if t 3 and 0 m1/2 (t ) + s T (). (Notice that t = 0 satises


these conditions since 0 s m1/2 T () under our assumption.) Therefore,
if 0 t 3 and 0 m1/2 (t ) + s T (), then
N


N t
1/2 2 2
g(t) m Ui n (2 + |a|)3 + 1 + Ui g(s)ds,
i=1 i=1 0

so by Gronwall's inequality, we get


N

PN 2
1/2
g(t) m Ui n (2 + |a|)3 e(1+ i=1 Ui )t .
2

i=1

The assertion for t [, 0] satisfying 0 m1/2 (t ) + s T () is proved in


the same way, and we omit the proof here. This completes the proof.

Let $z(t; x, v, \vec X, \vec V, a)$ be the solution of (2.3). In the following, we show that this
$z(t)$ gives the next term in the approximation of $x(m^{1/2}t + s, \Psi(s, x, m^{-1/2}v))$.

Proposition 3.6.4. Let a R. Suppose that t a, 0 s m1/2 T


(), t 2 and 0 s am1/2 s + m1/2 t T (). Also, let (x, v) E
and |v| > 2C0 + 1. Then

|x(m1/2 t + s, (s, x, m1/2 v))


( 0 (t, x, v, X(s
 am1/2 )) + m1/2 z(t; x, v, X(s
 am1/2 ), V  (s am1/2 ), a))|
 s+m1/2 t 
1/2 2 1/2 1/2 1/2
Cm (1 + |a|) m + m |V (r) V (s am )|dr .
 
sam1/2

N N
Here C is a constant depending only on , n, i=1 3 Ui and i=1 2 Ui .

Proof. The main tool is again Gronwall's Lemma. Let

y(t) = x(m1/2 t + s, (s, x, m1/2 v)) 0 (t, x, v, X(s


 am1/2 , ))

as in Proposition 3.6.3, and let

(t) = y(t) m1/2 z(t; x, v, X(s


 am1/2 ), V (s am1/2 ), a).

We need to estimate the absolute value of this difference. By a simple calculation,

d2
N
y(t) = {Ui (y(t) + 0 (t, x, v; X(s
 am1/2 )) Xi (m1/2 t + s))
dt2 i=1

Ui ( 0 (t, x, v; X(s
 am1/2 )) Xi (s am1/2 ))}
N 1

= 2 Ui ([y(t) {Xi (m1/2 t + s) Xi (s am1/2 )}]


i=1 0

+ 0 (t, x, v, X(s
 am1/2 )) Xi (s am1/2 ))

[y(t) {Xi (m1/2 t + s) Xi (s m1/2 a)}]d,

so


N 1
d2
2
(t) = d{2 Ui ([y(t) {Xi (m1/2 t + s) Xi (s m1/2 a)}]
dt i=1 0

+ 0 (t, x, v; X(s
 am1/2 )) Xi (s am1/2 ))

2 Ui ( 0 (t, x, v; X(s
 am1/2 )) Xi (s am1/2 ))}

(y(t) {Xi (m1/2 t + s) Xi (s m1/2 a)})


N
2 Ui ( 0 (t, x, v, X(s
 am1/2 )) Xi (s am1/2 ))
i=1

((t) {Xi (m1/2 t + s) Xi (s m1/2 a) m1/2 (t + a)Vi (s m1/2 a)}).

Therefore, since |Xi (m1/2 t + s) Xi (s m1/2 a)| n(t + |a|)m1/2 in our domain,
 s+m1/2 t
and Xi (m1/2 t + s) Xi (s m1/2 a) = sam1/2 Vi (r)dr, we get
 2 
N
d  N

  3 Ui (|y(t)| + n(t + |a|)m1/2 )2 + 2 Ui |(t)|


 dt2 (t)
i=1 i=1

N s+m1/2 t
+ 2 Ui |Vi (r) Vi (s m1/2 a)|dr. (3.36)
i=1 sam1/2

Let $\tilde C$ be the constant in Proposition 3.6.3(3), and let
\[
  C_1 = \sum_{i=1}^{N} \|\nabla^3 U_i\|_\infty (\tilde C + n)^2 (2\tau + 1)^2,
  \qquad
  C_2 = \sum_{i=1}^{N} \|\nabla^2 U_i\|_\infty .
\]

Then (3.36) combined with Proposition 3.6.3(3) gives us


 2  s+m1/2 t
d 
 (t) C1 m(1 + |a|)2 + C2 |Vi (r) Vi (s m1/2 a)|dr + C2 |(t)|,
 dt2 
sam1/2

if 0 m1/2 t + s T (), |t| 2 and t a. Let


 
 d 
g(t) =  (t ), (t ) .

dt
Then the estimate above gives us
     2 
d     
 g(t)  d (t ) +  d (t )
 dt   dt   dt2 
s+m1/2 (t )
C1 m(1 + |a|)2 + C2 |Vi (r) Vi (s m1/2 a)|dr
sam1/2

+ (C2 + 1)g(t),

if t a, |t | 2 and 0 m1/2 (t ) + s T (). Since ( ) =


 s+m1/2 (t )
d
dt ( ) = 0, we have g(0) = 0. Also, sam1/2 |Vi (r) Vi (s m1/2 a)|dr is
monotone non-decreasing with respect to t. So if t a and 0 t 3 , then
1/2

s+m (t )
g(t) 3 C1 m(1 + |a|)2 + C2 |Vi (r) Vi (s m1/2 a)|dr
sam1/2
t
+ (C2 + 1) g(u)du.
0

 s+m1/2 t
Therefore, by Gronwalls inequality and the monotonicity of sam1/2 |Vi (r)Vi (s
m1/2 a)|dr again, the above implies
1/2

s+m (t )
(C2 +1)3 2 1/2
g(t) 3 e C1 m(1 + |a|) + C2 |Vi (r) Vi (s m a)|dr ,
sam1/2

if t a, t 2 and 0 m1/2 (t ) + s T (). This completes


the proof of our assertion.

In the following proposition, we show that, similarly to the solution of Newton's
equation (see Corollary 3.2.3), $x(m^{1/2}t + s, \Psi(s, x, m^{-1/2}v))$ does not interact
with $X_i(m^{1/2}t + s, \omega)$ if $|t|$ is big.

Proposition 3.6.5. Let (x, v) E and |v| > 2C0 +1. Suppose that 0 m1/2 t+s
T () and that either t < or t > 2 . Then

Ui (x(m1/2 t + s, (s, x, m1/2 v)) Xi (m1/2 t + s, )) = 0.



Proof. Let = |v|1 v. Notice that |Xi (m1/2 t + s, )| |X0,i | + nT if 0 m1/2 t +


s T (). So it suces to show that |x(m1/2 t + s, (s, x, m1/2 v))| R0 for t
satisfying our condition. We show it in the following.
First notice that by (3.34), if t < = C01 R0 , then |x(m1/2 t +
s, (s, x, m1/2 v))| = |x + tv| |t||v| C01 R0 (2C0 + 1) > R0 .
We next prove the assertion for t > 2 . Let us divide it into two cases, accord-
ing to whether s < 0 or not. We rst deal with the case s < 0. Notice that by
Proposition 3.6.1, we have that v(u, (s, x, m1/2 v)) m1/2 (C0 + 1) for any
u (0, T ). Also, x(0, (s, x, m1/2 v)) = x sm1/2 v, x = 0 and v = |v|.
Therefore,

x(m1/2 t + s, (s, x, m1/2 v))


m1/2 t+s
= v(u, (s, x, m1/2 v))du + (x sm1/2 v)
0

m1/2 (C0 + 1)(m1/2 t + s) sm1/2 |v|


= t(C0 + 1) + m1/2 s(C0 + 1 |v|)
t(C0 + 1) > 2C01 R0 (C0 + 1) > R0 ,

where when passing to the last line, we used the fact that s < 0 and C0 +
1 |v| < 0.
Let us now prove the assertion for t > 2 and s > 0. Notice that s < T in
this case since we have by assumption 0 m1/2 t + s T ().
We rst show that

x(s, (s, x, m1/2 v)) R0 , for all s [0, T ). (3.37)

In the following, again, we use the fact that v(u, (s, x, m1/2 v)) m1/2 (C0 +
1) > 0 for any u (0, T ), which is guaranteed by Proposition 3.6.1. We also
use the fact that x(0, (s, x, m1/2 v)) = x sm1/2 v, x = 0 and v = |v|. Let
1/2
s0 = R 0
|v| m . If s [0, s0 ], then we have that
s
1/2
x(s, (s, x, m v)) = v(u, (s, x, m1/2 v))du + (x sm1/2 v)
0
R0 1/2
0 m1/2 |v|s m1/2 |v| m = R0 .
|v|

If s [s0 , T ], then by using a similar argument as in the proof of (3.33), it is


easy to see by denition that

x(s s0 , (s, x, m1/2 v)) = x s0 m1/2 v,


v(s s0 , (s, x, m1/2 v)) = m1/2 v,

therefore,
s
1/2
x(s, (s, x, m v)) = v(u, (s, x, m1/2 v))du
ss0

+ x(s s0 , (s, x, m1/2 v))


0 + (x s0 m1/2 v)
R0 1/2
= s0 m1/2 |v| = m m1/2 |v| = R0 .
|v|
This completes the proof of (3.37).
Since
d
x(m1/2 t + s, (s, x, m1/2 v)) = m1/2 v(m1/2 t + s, (s, x, m1/2 v)),
dt
and 0 m1/2 t + s () by assumption, we have by Proposition 3.6.1 that
d
( x(m1/2 t + s, (s, x, m1/2 v))) > C0 . (3.38)
dt
This combined with (3.37) implies that
t
1/2 1/2 d
x(m t + s, (s, x, m v)) = ( x(m1/2 u + s, (s, x, m1/2 v))du
0 du
+ x(s, (s, x, m1/2 v))
C0 t R0 C0 2C01 R0 R0 = R0 .

This completes the proof of our assertion, hence the proposition is proven.

Before closing this section, let us discuss a little bit more about the new potential
$\tilde U$ and the function $p$ defined in Sec. 3.5.
The following equation will be used later:

1
N
i U (X)
 = Ui (Xi x) |v|2 + Ui (x Xi ) dxdv. (3.39)
R2d 2 i=1

Also, by a simple calculation, there exists a global constant Cd such that



(r + s)r 2 1 dr,
d
p(s) = Cd
0

hence

p (s) = Cd (r + s)r 2 1 dr
d

0

(t)(t s) 2 1 dt.
d
= Cd
s

So if s < e0 , then


(t)(t s) 2 1 dt,
d
p (s) = Cd (3.40)
0

d
p (s) = Cd 1 (t)(t s) 2 2 dt,
d
(3.41)
2 0
 
 d d
(t)(t s) 2 3 dt.
d
p (s) = Cd 1 2
2 2 0

Also notice that under the condition s < e0 , if 0 t < s, then t < e0 , hence
(t) = 0. Therefore, we get that

< 0, if d 3,


p (s) = 0, if d = 2, (3.42)


> 0, if d = 1.

We remark that in reality, we have (t) = et , so (t) = et and p(s) =


Ces , for some constant C > 0, so p (s) < 0.
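This remark can be checked by a direct computation. The sketch below assumes the definitions of Sec. 3.5 in the form $\bar\rho(t) = \int_t^\infty \rho(s)\,ds$ and $p(s) = \int_{\mathbf{R}^d} \bar\rho(\frac12 |v|^2 + s)\,dv$; the symbol $\bar\rho$ is a notational assumption made here, and the value of the constant is computed only for this exponential model case.

```latex
% Model case \rho(t) = e^{-t}:
\[
  \bar\rho(t) = \int_t^\infty e^{-u}\,du = e^{-t},
\]
\[
  p(s) = \int_{\mathbf{R}^d} e^{-\frac12 |v|^2 - s}\,dv
       = e^{-s} \int_{\mathbf{R}^d} e^{-\frac12 |v|^2}\,dv
       = (2\pi)^{d/2}\, e^{-s},
\]
\[
  p^{(k)}(s) = (-1)^k\, (2\pi)^{d/2}\, e^{-s}, \qquad k = 0, 1, 2, \ldots
\]
```

In this model case the constant is $C = (2\pi)^{d/2} > 0$, every odd-order derivative of $p$ is negative and every even-order derivative is positive, in every dimension $d$.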

4. Proof of Basic Lemmas


We give the proofs of Lemmas 3.5.1 and 3.5.2 in this section. The proof of
Lemma 3.5.3 will be given in Sec. 5.

4.1. First decomposition


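Let $\sigma(\omega) = \sigma_n(\omega) = \inf\{t \ge 0;\ \max_{i=1,\ldots,N} |V_i(t,\omega)| \ge n\}$, $R_0 = \max_{i=1,\ldots,N}\{R_i + |X_{i,0}|\} + nT + 1$,
and $\tau = C_0^{-1} R_0$ as before. Also, we always assume that $(x,v) \in E$,
i.e. $x \cdot v = 0$.
First, for any $t \le T \wedge \sigma$, we have by (3.3) that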

Mi (Vi (t) Vi (0))


t
= ds Ui (Xi (s, ) x(s, (r, x, m1/2 v))) (dr, dx, dv),
0 RE

so we have the following decomposition.

Mi (Vi (t n ) Vi (0)) = Vi0 (t) + Vi1 (t),

with
tn
Vi0 (t) = 1[4m1/2 ,) (s)ds
0

Ui (Xi (s, ) x(s, (r, x, m1/2 v))) (dr, dx, dv),
RE

tn
Vi1 (t) = 1[0,4m1/2 ) (s)ds
0

Ui (Xi (s, ) x(s, (r, x, m1/2 v))) (dr, dx, dv).
RE

4.2. The term Vi1 (t)


Let us deal with $V_i^1(t)$ in this subsection. We will show that it is negligible as
$m \to 0$.
Let us decompose $V_i^1(t)$ as follows:

Vi1 (t) = Vi10 (t) + Vi11 (t),

with
tn
Vi10 (t) = 1[0,4m1/2 ) (s)ds
0

{Ui (Xi (s, ) x(s, (r, x, m1/2 v)))
RE

Ui (Xi (0)  0 (m1/2 s, (m1/2 r, x, v); X(0)))}


 (dr, dx, dv),
tn
Vi11 (t) = 1[0,4m1/2 ) (s)ds
0

 0 (m1/2 s, (m1/2 r, x, v); X(0)))
Ui (Xi (0)  (dr, dx, dv).
RE

Before discussing the behavior of Vi10 (t), let us prepare the following result. Fix
any t0 > 0. Then we have the following:

Lemma 4.2.1. For any s [0, t0 ] satisfying 0 m1/2 s T n (), we have that

|x(m1/2 s, (r, x, m1/2 v))


 0 (s, (m1/2 r, x, v); X(0)))|


N PN
2 Ui +1)t0
nm1/2 s 2 Ui t0 e( i=1 .
i=1

Proof. The main tool is again Gronwall's lemma. First notice that under our
condition, $|X_i(m^{1/2}s) - X_i(0)| \le n m^{1/2} s$. Let

(s) = x(m1/2 s, (r, x, m1/2 v))


 0 (s, (m1/2 r, x, v); X(0))).


Then we have
d2
N
(s) = {Ui (x(m1/2 s, (r, x, m1/2 v)) Xi (m1/2 s))
ds2 i=1

 0 (s, (m1/2 r, x, v); X(0)))


+ Ui (  Xi (0))}.

Therefore, since $\nabla^2 U_i$, $i = 1, \ldots, N$, are bounded, we have that


 2 
d 
N
 (s) 2 Ui (|(s)| + |Xi (m1/2 s) Xi (0)|)
 ds2 
i=1

N
2 Ui (|(s)| + nm1/2 s).
i=1

Let g(s) = |((s), d


ds (s))|.
Then the above implies that
     2 
d  d   
 g(s)  (s) +  d (s)
 ds   ds   ds2 
N

N

nm1/2 s 2 Ui + 2 Ui + 1 g(s).
i=1 i=1

Also, g(0) = 0. So for any 0 s t0 , we get that


N

N
s
1/2 2 2
g(s) nm s Ui t0 + Ui + 1 g(u)du.
i=1 i=1 0

Therefore, by Gronwall's Lemma, we have


N PN
1/2 2 Ui +1)s
g(s) nm s 2 Ui t0 e( i=1 .
i=1

This gives us our assertion.

In particular, applying Lemma 4.2.1 to t0 = 4 , we get that


|x(s, (r, x, m1/2 v))
 0 (m1/2 s, (m1/2 r, x, v); X(0)))|


N PN
2 Ui +1)4
ns 2 Ui 4 e( i=1 , (4.1)
i=1

|Xi (s) Xi (0)| ns, for any s [0, 4m1/2 T ()).


We use this to prove the following. The key point here is that the domain of $s$
now is close to $0$ and narrow enough.

Lemma 4.2.2. $E^{P_m}\big[\sup_{0 \le t \le T} |V_i^{10}(t)|^2\big] \to 0$ as $m \to 0$.

Proof. First notice that in the denition of Vi10 , we are taking an integral for
s [0, 4m1/2 T ()), so if r > 6m1/2 or r < 2m1/2 , then we have
|u| > 2m1/2 for any u [r s, r], so since x v = 0, we get by denition
 0 (m1/2 s, (m1/2 r, x, v); X(0)))|
|  = |x m1/2 (r s)v|
m1/2 |r s||v| 2 |v|
R0 .

Therefore, for any s [0, 4m1/2 T ()), we have

 0 (m1/2 s, (m1/2 r, x, v); X(0)))


Ui (Xi (0)  =0 (4.2)

if r > 6m1/2 or r < 2m1/2 . Also, (4.2) holds if |x| R0 + 1. Similarly, the same
holds with X(0) substituted by X(s) (since 0 s ). Let


N PN 2
C1 = 2 Ui 2 Uj 4 e( j=1 Uj +1)4 + 1.
j=1

Then by combining these facts with (4.1), we get that for any s [0, 4m1/2 T
()),

|Ui (Xi (s, ) x(s, (r, x, m1/2 v)))


 0 (m1/2 s, (m1/2 r, x, v); X(0)))|
Ui (Xi (0) 

1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r)nsC1 .

Therefore, by the denition of Vi10 (t), we get that


tn
10
|Vi (t)| 1[0,4m1/2 ) (s)ds
0

C1 ns1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r) (dr, dx, dv)
RE

C1
n(4m1/2 )2 1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r) (dr, dx, dv).
2 RE
(4.3)

We need to discuss the $L^2(P_m)$-norm of the integral on the right-hand side above.
Notice that in general, it is easy to see by the definition of a Poisson point process
that $E^{P_m}\big[(\int g\,d\mu_\omega)^2\big] = \int g^2\,d\lambda_m + (\int g\,d\lambda_m)^2$ for any $g \in L^2(\lambda_m)$.
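For completeness, here is the standard computation behind this second-moment identity. It is a sketch for a generic Poisson point process $\mu$ with intensity measure $\lambda$ (generic names, used here in place of the specific measures of the model), written for a simple function $g = \sum_k c_k 1_{A_k}$ with disjoint sets $A_k$ of finite $\lambda$-measure; the general case follows by approximation. The only facts used are that the counts $\mu(A_k)$ are independent Poisson variables with mean and variance $\lambda(A_k)$.

```latex
% Mean and variance of the Poisson integral of a simple function g = \sum_k c_k 1_{A_k}:
\[
  E\Big[ \int g\,d\mu \Big] = \sum_k c_k\, \lambda(A_k) = \int g\,d\lambda,
  \qquad
  \mathrm{Var}\Big( \int g\,d\mu \Big) = \sum_k c_k^2\, \lambda(A_k) = \int g^2\,d\lambda .
\]
% Second moment = variance + (mean)^2:
\[
  E\Big[ \Big( \int g\,d\mu \Big)^2 \Big]
  = \mathrm{Var}\Big( \int g\,d\mu \Big) + \Big( E\Big[ \int g\,d\mu \Big] \Big)^2
  = \int g^2\,d\lambda + \Big( \int g\,d\lambda \Big)^2 .
\]
```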
N
Let c = j=1 Uj , and set C2 = 8 (2(R0 + 1))d1 Rd c ( 12 |v|2 )|v|dv, which
is nite by our assumption. Then we have by denition that

1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r)(dr, dx, dv)
RE

= 1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r)m1 0 (x m1/2 rv, v)dr(dx, dv)
RE

1 2
1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r)m1 c |v| dr(dx, dv)
RE 2

1/2 1 1 2
8m m (2(R0 + 1)) d1
c |v| |v|dv
Rd 2
= C2 m1/2 .

Therefore,
 2 
Pm
E 1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r) (dr, dx, dv)
RE

= 1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r)(dr, dx, dv)
RE
2
+ 1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r)(dr, dx, dv)
RE
1/2
C2 m + C22 m1 . (4.4)

This combined with (4.3) gives us that


  2
1
E Pm sup |Vi10 (t)|2 C1 n(4m1/2 )2 (C2 m1/2 + C22 m1 ).
0tT 2
The right-hand side above converges to $0$ as $m \to 0$. This completes the proof of
our assertion.
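The rate of convergence can be read off by collecting powers of $m$. The computation below is a sketch under one assumption: the factor written $(4m^{1/2})^2$ in the bound is taken to be $(4m^{1/2}\tau)^2$, with $\tau = C_0^{-1}R_0$, since the integration in $s$ runs over $[0, 4m^{1/2}\tau)$; the symbol $\tau$ is not legible in every rendering of the formula. Here $n$, $C_1$, $C_2$ and $\tau$ are fixed.

```latex
% Collecting powers of m in the final bound of the proof of Lemma 4.2.2:
\[
  \Big( \tfrac12\, C_1 n\, (4 m^{1/2} \tau)^2 \Big)^2 \big( C_2 m^{-1/2} + C_2^2 m^{-1} \big)
  = 64\, C_1^2 n^2 \tau^4\, m^2 \big( C_2 m^{-1/2} + C_2^2 m^{-1} \big)
  = 64\, C_1^2 n^2 \tau^4 \big( C_2 m^{3/2} + C_2^2 m \big) .
\]
```

So the bound is of order $m$ as $m \to 0$, for each fixed $n$.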

For the term $V_i^{11}(t)$, we show in the following that it is also negligible when
$m \to 0$. The main idea is to use the fact that the expectation (of the integral with
respect to the counting measure) is $0$ (see (4.5) below), which means that we only
need to calculate its variance.

Lemma 4.2.3.
 
E Pm
sup |Vi11 (t)|2 0 as m 0.
0tT

Proof. We rst notice that



 0 (m1/2 s, (m1/2 r, x, v); X(0)))(dr,
Ui (Xi (0)  dx, dv) = 0 (4.5)
RE

for any s [0, 4m1/2 T ) and |v| C0 . Indeed, since |Xi (0) Xj (0)| > Ri + Rj
 X(0))
for any i = j, we have by (3.31) that i U(  = 0. Combining this with (3.39),
we get that


N
1
Ui (Xi (0) x) |v|2 + Uj (x Xj (0)) dxdv = 0.
R 2d 2 j=1

Applying Proposition 3.1.1 to this with t = m1/2 s and f (x, v) = Ui (Xi (0) x),
we get


N
 0 (m1/2 s, x, v; X(0)))
Ui (Xi (0)  1 |v|2 + Uj (x Xj (0)) dxdv = 0.
R 2d 2 j=1

Reformulating this using the ray representation yields



Ui (Xi (0)  0 (m1/2 s, (r, x, v); X(0)))

RE

1

N
|v|2 + Uj (0 (r, x, v) Xj (0)) dr(dx, dv) = 0.
2 j=1

By changing variable r = m1/2 r, we obtain (4.5).


By (4.5), we get that
tn
Vi11 (t) = 1[0,4m1/2 ) (s)ds
0

 0 (m1/2 s, (m1/2 r, x, v); X(0)))
Ui (Xi (0) 
RE

( (dr, dx, dv) (dr, dx, dv)). (4.6)

As in the proof of Lemma 4.2.2, (4.2) holds if r > 6m1/2 or r < 2m1/2 , or
if |x| R0 + 1. Let

1
C3 = 8 (2(R0 + 1))d1 Ui 2 c ( |v|2 )|v|dv,
R d 2

which is nite by our assumption. Then we have that




E Pm  Ui (Xi (0)  0 (m1/2 s, (m1/2 r, x, v); X(0)))

RE
2 

( (dr, dx, dv) (dr, dx, dv))

=  0 (m1/2 s, (m1/2 r, x, v); X(0)))|
|Ui (Xi (0)  2
(dr, dx, dv)
RE

1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r) Ui 2 (dr, dx, dv)
RE

= Ui 2 1[0,R0 +1) (|x|)1[2m1/2 ,6m1/2 ] (r)
RE

1
N
m1 |v|2 + Uj (Xj,0 (x m1/2 rv)) dr(dx, dv)
2 j=1

1 1/2 1 2
m 8m (2(R0 + 1)) d1
Ui 2 c |v| |v|dv
Rd 2
= C3 m1/2 . (4.7)

Therefore,
 
E Pm
sup |Vi11 (t)|2
t[0,T ]


T 
E Pm
1[0,4m1/2 ) (s)  Ui (Xi (0)
0 RE

 2 


 (m0 1/2
s, (m 1/2
r, x, v); X(0)))( (dr, dx, dv) (dr, dx, dv)) ds


 
T 
1/2 
E Pm
4m 1[0,4m1/2 ) (s)  Ui (Xi (0)
0  RE
2


 0 (m1/2 s, (m1/2 r, x, v); X(0)))(
 (dr, dx, dv) (dr, dx, dv)) ds


 
4m1/2 
(4m 1/2
) dsE Pm  Ui (Xi (0)

0 RE
2 


 (m 0 1/2
s, (m 1/2 
r, x, v); X(0)))( 
(dr, dx, dv) (dr, dx, dv))

(4m1/2 )2 C3 m1/2 ,
which converges to $0$ as $m \to 0$. This completes the proof of our assertion.
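The power of $m$ in this last bound can be made explicit. As a sketch, and assuming (as the range of integration $s \in [0, 4m^{1/2}\tau)$ suggests) that the prefactor is $(4m^{1/2}\tau)^2$ with $\tau = C_0^{-1}R_0$:

```latex
% Power counting for the final bound in the proof of Lemma 4.2.3:
\[
  (4 m^{1/2} \tau)^2 \cdot C_3\, m^{-1/2}
  = 16\, \tau^2 m \cdot C_3\, m^{-1/2}
  = 16\, \tau^2 C_3\, m^{1/2} .
\]
```

The bound is therefore of order $m^{1/2}$, which is slower than the order $m$ obtained for the term treated in Lemma 4.2.2 but still tends to $0$.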

Combining Lemmas 4.2.2 and 4.2.3, we get the following main result of this
subsection.

Lemma 4.2.4.
\[
  E^{P_m}\Big[ \sup_{0 \le t \le T} |V_i^1(t)|^2 \Big] \to 0 \quad \text{as } m \to 0.
\]

4.3. The term Vi0 (t)


Let us discuss the term Vi0 (t) in this subsection.
For any r R, let r = r() = ((r 2m1/2 ) 0) T (). Notice that by
Corollary 3.2.3,
r ) 0 (m1/2 (s r), x, v; X(
Ui (Xi (  r ))) = 0

|m1/2 (s r)| 2.
So for s [4m1/2 , ),
r < 2m1/2 Ui (Xi (
r ) 0 (m1/2 (s r), x, v; X(
 r ))) = 0.

Therefore, we have the following decomposition:


Vi0 (t) = Vi01 (t) + Vi02 (t) + Vi03 (t) Vi04 (t) + Vi05 (t),
with
tn
Vi01 (t) = 1[4m1/2 ,) (s)ds
0

Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))(dr,
 dx, dv),
RE
tn
Vi02 (t) = 1[4m1/2 ,) (s)ds fi (s, r, x, v) (dr, dx, dv),
0 RE
tn
Vi03 (t) = ds r ) 0 (m1/2 (s r), x, v; X(
Ui (Xi (  r )))
0 (2m1/2 ,)E

( (dr, dx, dv) (dr, dx, dv)),


tn
Vi04 (t) = 1[0,4m1/2 ) (s)ds
0 [2m1/2 ,)E

r ) 0 (m1/2 (s r), x, v; X(


Ui (Xi (  r )))

( (dr, dx, dv) (dr, dx, dv)),


tn
Vi05 (t) = 1[4m1/2 ,) (s)ds $
F 05
i (s, r, x, v)(dr, dx, dv),
0 RE

where
fi (s, r, x, v) = Ui (Xi (s) x(s, (r, x, m1/2 v)))
r ) 0 (m1/2 (s r), x, v; X(
Ui (Xi (  r ))),
$ 05 0 1/2
i (s, r, x, v) = {Ui (Xi (s) (m (s r), x, v; X(s)))
F 

r ) 0 (m1/2 (s r), x, v; X(


Ui (Xi (  r )))}.

We discuss each term in the above decomposition in the following. We will show
that $V_i^{02}(t)$ and $V_i^{05}(t)$ give us the smooth term in (3.30), and the martingale
part of $V_i^{03}(t)$ gives us the martingale term there (see the end of Sec. 4).
For the term $V_i^{02}$, we have by definition

d 02
Vi (t) = 1(4m1/2 ,) (t) fi (t, r, x, v; ) (dr, dx, dv).
dt RE

By denition and assumption, we have that m (dr, dx, dv) = 0 if |v| 2C0 + 1.
Also, by Proposition 3.6.5 and Corollary 3.2.3, fi (t, r, x, v) = 0 if |r t| 2m1/2 .
So we only need to consider the case where t [4m1/2 , T ), r [2m1/2 , T
() + 2m1/2 ] and |v| 2C0 + 1.
Before going further, we first show the following, with the help of Proposition 3.6.5,
Corollary 3.2.3 (which claimed that both of the two interactions exist
only for a certain range of $t - r$), and Proposition 3.6.3 (which gave an estimate for
the error of our approximation of $x(t, \Psi(r, x, m^{-1/2}v))$).

Lemma 4.3.1. There exists a constant C > 0 such that

|fi (t, r, x, v)| 1[0,R0 +1) (|x|)1[m1/2 ,2m1/2 ) (t r) Cm1/2 ,

if t [4m1/ , T ], |r 2m1/2 | T () and |v| 2C0 + 1.

Proof. First, since $t\in[0,T\wedge\sigma_n)$, we have by Proposition 3.6.5 that $\nabla U_i(X_i(t)-x(t,(r,x,m^{-1/2}v)))=0$ if $t-r>2m^{1/2}\tau$ or $t-r<-m^{1/2}\tau$. Also, since $\tilde r\in[0,T\wedge\sigma_n)$ by definition, we have $|X_i(\tilde r)|\le|X_{i,0}|+nT$, so by Corollary 3.2.3, $\nabla U_i(X_i(\tilde r)-\psi^0(m^{-1/2}(t-r),x,v;\vec X(\tilde r)))=0$ if $t-r\ge 2m^{1/2}\tau$ or $t-r\le -m^{1/2}\tau$. Combining the above, we get that $f_i(t,r,x,v)=0$ if $r\notin[t-2m^{1/2}\tau,\,t+m^{1/2}\tau]$.
Next, for $r\in[t-2m^{1/2}\tau,\,t+m^{1/2}\tau]$, if $|x|\ge R_0+1$, then since $x\cdot v=0$, we easily get $|x(t,(r,x,m^{-1/2}v))|=|x-(r-t)m^{-1/2}v|\ge|x|\ge R_0+1$, hence both of the terms of $f_i(t,r,x,v)$ are equal to 0.
Finally, we show, for $|x|<R_0+1$ and $r\in[t-2m^{1/2}\tau,\,t+m^{1/2}\tau]$, that $|f_i(t,r,x,v)|\le Cm^{1/2}$. For such $x$ and $r$, since $t\in[4m^{1/2}\tau,\,T\wedge\sigma(\omega)]$, we have by definition $2m^{1/2}\tau\le r\le T\wedge\sigma+m^{1/2}\tau$, so $\tilde r=r-2m^{1/2}\tau$. We have
\[
|f_i(t,r,x,v)| \le \|\nabla^2U_i\|_\infty\Big(|X_i(t)-X_i(\tilde r)| + \big|x(t,(r,x,m^{-1/2}v))-\psi^0(m^{-1/2}(t-r),x,v;\vec X(\tilde r))\big|\Big).
\]
The term involving $X$ is easy. Indeed, since $t,\tilde r\in[0,T\wedge\sigma(\omega)]$, we have by definition
\[
|X_i(t)-X_i(\tilde r)| \le n|t-\tilde r| = n|t-(r-2m^{1/2}\tau)| \le n(|t-r|+2m^{1/2}\tau) \le 4n\tau m^{1/2}.
\]

We next deal with the second absolute value above. Notice that by assumption, $0\le r-2m^{1/2}\tau\le T\wedge\sigma(\omega)$, $0\le r-m^{1/2}\tau\le T\wedge\sigma(\omega)$ and $0\le t\le T\wedge\sigma(\omega)$. Therefore, by Proposition 3.6.3 (3) (with $(t,s,a)$ there given by $(m^{-1/2}(t-r),\,r,\,2\tau)$), there exists a constant $\tilde C$ such that

|x(t, (r, x, m1/2 v)) 0 (m1/2 (t r), x, v; X(r 


 2m1/2 ))| m1/2 C(2 + 2 ).

Combining the above, we get our assertion.

Now we are ready to prove the following result concerning the term $V_i^{02}(t)$.

Lemma 4.3.2. We have that
\[
\sup_{m\in(0,1]}\ \sup_{0\le t\le T} E^{P_m}\Big[\Big|\frac{d}{dt}V_i^{02}(t)\Big|^2\Big] < \infty.
\]
In particular, $\{$the distribution of $\{V_i^{02}(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbf{R}^d))$.

Before giving the proof, we remark that this result is natural, since by Lemma 4.3.1, $f_i(s,r,x,v)$ is nonzero only if $r$ is very near to $s$, which implies by Propositions 3.6.3(3) and 3.6.4 that $\psi^0(m^{-1/2}(s-r),x,v;\vec X(r-2m^{1/2}\tau))$ is a good approximation of $x(s,(r,x,m^{-1/2}v))$.

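Proof. By Lemma 4.3.1, we have
\[
\Big|\frac{d}{dt}V_i^{02}(t)\Big| \le Cm^{1/2}\int_{\mathbf{R}\times E} 1_{[0,R_0+1)}(|x|)\,1_{[-m^{1/2}\tau,\,2m^{1/2}\tau)}(t-r)\,\mu_\omega(dr,dx,dv).
\]
Therefore,
\begin{align*}
E^{P_m}\Big[\Big|\frac{d}{dt}V_i^{02}(t)\Big|^2\Big]
&\le E^{P_m}\Big[\Big(Cm^{1/2}\int_{\mathbf{R}\times E} 1_{[0,R_0+1)}(|x|)\,1_{[-m^{1/2}\tau,\,2m^{1/2}\tau)}(t-r)\,\mu_\omega(dr,dx,dv)\Big)^2\Big]\\
&= C^2m\int_{\mathbf{R}\times E} 1_{[0,R_0+1)}(|x|)\,1_{[-m^{1/2}\tau,\,2m^{1/2}\tau)}(t-r)\,\lambda_m(dr,dx,dv)\\
&\quad + \Big(Cm^{1/2}\int_{\mathbf{R}\times E} 1_{[0,R_0+1)}(|x|)\,1_{[-m^{1/2}\tau,\,2m^{1/2}\tau)}(t-r)\,\lambda_m(dr,dx,dv)\Big)^2. \tag{4.8}
\end{align*}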
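The step leading to (4.8) uses the second-moment identity for Poisson integrals, $E[(\int g\,d\mu)^2]=\int g^2\,d\lambda+(\int g\,d\lambda)^2$. A minimal sanity check, assuming the simplest case where the integrand is a constant $c$ so that the integral reduces to $cZ$ with $Z$ Poisson of mean $a$ (the values of `a` and `c` and the function name are illustrative, not from the paper), can be sketched as:

```python
import math

def poisson_second_moment(a, terms=200):
    """E[Z^2] for Z ~ Poisson(a), by direct summation of the series."""
    total, pmf = 0.0, math.exp(-a)   # pmf starts at P(Z = 0)
    for k in range(terms):
        total += pmf * k ** 2
        pmf *= a / (k + 1)           # P(Z = k + 1) obtained from P(Z = k)
    return total

a, c = 3.7, 0.25                          # hypothetical mean and constant integrand
lhs = c ** 2 * poisson_second_moment(a)   # E[(c Z)^2]
rhs = c ** 2 * a + (c * a) ** 2           # int g^2 d(lambda) + (int g d(lambda))^2
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point error; this mirrors how the variance part and the squared mean appear as the two terms on the right-hand side of (4.8).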
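Let $c=\sum_{i=1}^N\|U_i\|_\infty$, and $C=3\tau[2(R_0+1)]^{d-1}\int_{\mathbf{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv$, which is finite by assumption. Then
\begin{align*}
&\int_{\mathbf{R}\times E} 1_{[0,R_0+1)}(|x|)\,1_{[-m^{1/2}\tau,\,2m^{1/2}\tau)}(t-r)\,\lambda_m(dr,dx,dv)\\
&\quad= \int_{\mathbf{R}\times E} 1_{[0,R_0+1)}(|x|)\,1_{[-m^{1/2}\tau,\,2m^{1/2}\tau)}(t-r)\,
m^{-1}\rho\Big(\tfrac12|v|^2+\sum_{i=1}^N U_i(x-m^{-1/2}rv-X_{i,0})\Big)\,dr\,|v|\,\nu(dx;v)\,dv\\
&\quad\le m^{-1}\,3m^{1/2}\tau\int_E 1_{[0,R_0+1)}(|x|)\,\rho_c\big(\tfrac12|v|^2\big)|v|\,\nu(dx;v)\,dv\\
&\quad\le 3m^{-1/2}\tau\,[2(R_0+1)]^{d-1}\int_{\mathbf{R}^d}\rho_c\big(\tfrac12|v|^2\big)|v|\,dv
= Cm^{-1/2}. \tag{4.9}
\end{align*}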

Combining (4.8) and (4.9), we get that
\[
\tilde C := \sup_{m\in(0,1]}\ \sup_{0\le t\le T} E^{P_m}\Big[\Big|\frac{d}{dt}V_i^{02}(t)\Big|^2\Big] < \infty,
\]
which is exactly the first half of our assertion. Therefore,
\[
E^{P_m}\big[|V_i^{02}(t)-V_i^{02}(t')|^2\big] \le \tilde C|t-t'|^2,
\]
hence by Theorem 3.4.1 (with all three parameters there equal to 1), $\{\{V_i^{02}(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbf{R}^d))$.

We next deal with Vi01 (t). By using Proposition 3.2.4, we show that it is equal to
m 1/2 t  (X(s))ds,
i U  which gives us the colliding term in Theorem 2.0.1(4).
0

Lemma 4.3.3. There exists an m0 > 0 (depending on X  0 , n, T and Ui , i =


1, . . . , N ) such that for any m m0 ,
t
01
Vi (t) = m 1/2  (X(s))ds,
i U 
0

 is as dened in Sec. 3.5.


where U

Proof. Suppose that Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s))


 = 0. Then s r <
2m by Proposition 3.6.5, this combined with s 4m1/2 implies that r >
1/2

2m1/2 = 2m1/2 C01 R. Since |v| 2C0 +1 and xv = 0 for m -almost every (r, x, v),
this implies |x m1/2 rv| m1/2 r|v| R0 , hence Ui (Xi,0 (x m1/2 rv)) = 0.
Therefore, by definition, Proposition 3.2.4 and (3.39),
tn
Vi01 (t) = 1[4m1/2 ,) (s)ds Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))

0 RE

1
N
m1 |v|2 + Ui (x m1/2 rv Xi,0 ) dr(dx, dv)
2 i=1
tn
= 1[4m1/2 ,) (s)ds Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))

0 RE

1 2
m1 |v| dr(dx, dv)
2
tn
= 1[4m1/2 ,) (s)ds
0

1 2

N
1/2
m Ui (Xi (s) x) |v| + Uk (x Xk,0 ) dxdv
R2d 2
k=1
tn
=  X(s))ds,
1[4m1/2 ,) (s)m1/2 i U( 
0

where we used Proposition 3.2.4 in passing to the third equality, and (3.39) in passing to the last equality.
So in order to complete the proof of our assertion, it suffices to show that $\nabla_i\tilde U(\vec X(s))=0$ for any $s\in[0,4m^{1/2}\tau]$, if $m$ is small enough. We show this now. Notice that since $|X_{i,0}-X_{j,0}|>R_i+R_j$ for any $i,j=1,\dots,N$ with $i\ne j$ by assumption, there exists an $m_0>0$ (small enough) such that for any $m\le m_0$, we have $|X_{i,0}-X_{j,0}|>R_i+R_j+8m^{1/2}\tau n$ for any $i\ne j$. Also, by definition, we have $|X_i(s)-X_{i,0}|\le sn\le 4m^{1/2}\tau n$ for any $s\in[0,4m^{1/2}\tau]$ and $i=1,\dots,N$. Therefore,
\begin{align*}
|X_i(s)-X_j(s)| &\ge |X_{i,0}-X_{j,0}|-|X_i(s)-X_{i,0}|-|X_j(s)-X_{j,0}|\\
&> R_i+R_j+8m^{1/2}\tau n-4m^{1/2}\tau n-4m^{1/2}\tau n\\
&= R_i+R_j,
\end{align*}
so by (3.31), $\nabla_i\tilde U(\vec X(s))=0$ for any $s\in[0,4m^{1/2}\tau]$. This completes the proof of our assertion.

Before discussing the term $V_i^{05}(t)$, let us first prepare, by using Gronwall's Lemma, the continuity of $\psi^0(t,x,v;\vec X)$ with respect to $\vec X$:

Lemma 4.3.4. For any $Y>0$, there exists a constant $\tilde C$ (depending on $\max_{i=1}^N R_i+Y$, $\tau$, $C_0$ and $\sum_{i=1}^N\|\nabla^2U_i\|_\infty$) such that
\[
|\psi^0(t,x,v;\vec X_1)-\psi^0(t,x,v;\vec X_2)| \le \tilde C\,\|\vec X_1-\vec X_2\|_{\mathbf{R}^{dN}},
\]
for any $(x,v)\in E$, $|v|\ge 2C_0+1$, $|t|\le 2\tau$ and $|\vec X_1|,|\vec X_2|\le Y$.

Proof. Choose and fix any $v\in\mathbf{R}^d$ with $|v|\ge 2C_0+1$, and let $s_0=\frac{\max_{i=1}^N R_i+Y}{|v|}\vee 2\tau$. Let $g(t)=\psi^0(t,x,v;\vec X_1)-\psi^0(t,x,v;\vec X_2)$. Then by definition,
\[
g(t)=\psi^0(t+s_0,\,x-s_0v,\,v;\vec X_1)-\psi^0(t+s_0,\,x-s_0v,\,v;\vec X_2),
\]
so
\[
\frac{d^2}{dt^2}g(t) = -\sum_{i=1}^N\nabla U_i\big(\psi^0(t+s_0,x-s_0v,v;\vec X_1)-X_i^1\big)
+\sum_{i=1}^N\nabla U_i\big(\psi^0(t+s_0,x-s_0v,v;\vec X_2)-X_i^2\big).
\]
Let $C=\sum_{i=1}^N\|\nabla^2U_i\|_\infty$. Then
\[
\Big|\frac{d^2}{dt^2}g(t)\Big| \le \sum_{i=1}^N\|\nabla^2U_i\|_\infty\big(|g(t)|+|X_i^1-X_i^2|\big) \le C\big(|g(t)|+\|\vec X_1-\vec X_2\|_{\mathbf{R}^{dN}}\big),
\]
therefore,
\[
\Big|\frac{d}{dt}\Big(g(t),\frac{d}{dt}g(t)\Big)\Big| \le \Big|\frac{d}{dt}g(t)\Big|+\Big|\frac{d^2}{dt^2}g(t)\Big|
\le C\|\vec X_1-\vec X_2\|_{\mathbf{R}^{dN}}+(1+C)\Big|\Big(g(t),\frac{d}{dt}g(t)\Big)\Big|.
\]
Also, $g(-s_0)=\frac{d}{dt}g(-s_0)=0$. Let $h(t)=\big|\big(g(t-s_0),\frac{d}{dt}g(t-s_0)\big)\big|$. Then $h(0)=0$, and for any $t\in[0,s_0+2\tau]$,
\[
h(t)\le C\|\vec X_1-\vec X_2\|_{\mathbf{R}^{dN}}(s_0+2\tau)+(1+C)\int_0^t h(s)\,ds,
\]
so by Gronwall's Lemma,
\[
h(t)\le C\|\vec X_1-\vec X_2\|_{\mathbf{R}^{dN}}(s_0+2\tau)\,e^{(1+C)(s_0+2\tau)},\qquad t\in[0,s_0+2\tau].
\]
Notice that since $|v|\ge 2C_0+1$, we have $2\tau\le s_0\le\frac{\max_{i=1}^N R_i+Y}{2C_0+1}\vee 2\tau$. Therefore,
\[
|g(t)|\le h(t+s_0)\le C\Big(\frac{\max_{i=1}^N R_i+Y}{2C_0+1}\vee 2\tau+2\tau\Big)
\exp\Big((1+C)\Big(\frac{\max_{i=1}^N R_i+Y}{2C_0+1}\vee 2\tau+2\tau\Big)\Big)\,\|\vec X_1-\vec X_2\|_{\mathbf{R}^{dN}},
\]
for any $t\in[-2\tau,2\tau]$. This completes the proof of our assertion.
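The Gronwall step used above, namely that $h(t)\le K+L\int_0^t h(s)\,ds$ implies $h(t)\le Ke^{Lt}$, can be illustrated numerically. The sketch below is only an illustration under assumed values: `K` plays the role of $C\|\vec X_1-\vec X_2\|(s_0+2\tau)$ and `L` the role of $1+C$, and the numbers are hypothetical. It integrates the equality case $h'=Lh$, $h(0)=K$ with explicit Euler steps and compares the result with the exponential bound.

```python
import math

def gronwall_bound(K, L, t):
    """Upper bound K * exp(L * t) implied by h(t) <= K + L * int_0^t h(s) ds."""
    return K * math.exp(L * t)

def solve_extremal(K, L, t_end, steps=10_000):
    """Explicit Euler for h' = L * h, h(0) = K (the equality case of the inequality)."""
    h, dt = K, t_end / steps
    for _ in range(steps):
        h += dt * L * h
    return h

K, L, t_end = 0.3, 2.5, 1.2        # hypothetical constants, not from the paper
h_num = solve_extremal(K, L, t_end)
bound = gronwall_bound(K, L, t_end)
print(h_num, bound)
```

Since the explicit Euler scheme for a growing exponential never overshoots the exact solution, the numerical value stays below the Gronwall bound and approaches it as the step size shrinks.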

We use Lemma 4.3.4 to prove the following:

Lemma 4.3.5. There exists a constant $C>0$ such that
\[
|\widehat F_i^{05}(s,r,x,v)| \le Cm^{1/2}\,1_{[0,2m^{1/2}\tau]}(|s-r|)\,1_{[0,R_0+1)}(|x|)
\]
for $s\in[4m^{1/2}\tau,\,T\wedge\sigma_n]$.

Proof. First, since $s,\tilde r\in[0,T\wedge\sigma(\omega)]$ in our domain, it is easy to see that $|\widehat F_i^{05}(s,r,x,v)|=0$ if $|x|\ge R_0+1$. Also, by Corollary 3.2.3, $|\widehat F_i^{05}(s,r,x,v)|=0$ if $|m^{-1/2}(s-r)|\ge 2\tau$. Finally, for $|x|\le R_0+1$ and $|s-r|\le 2m^{1/2}\tau$, by definition and Lemma 4.3.4, we only need to show the following:
\[
|X_i(s)-X_i(\tilde r)| \le Cm^{1/2},\qquad s\ge 4m^{1/2}\tau. \tag{4.10}
\]
To show (4.10), again, notice that in the present setting, $0\le r-2m^{1/2}\tau\le T\wedge\sigma$, so $\tilde r=r-2m^{1/2}\tau$. So the left-hand side of (4.10) equals
\[
|X_i(s)-X_i(r-2m^{1/2}\tau)| \le n|s-(r-2m^{1/2}\tau)| \le n(|s-r|+2m^{1/2}\tau) \le 4n\tau m^{1/2}.
\]
This completes the proof of our assertion.

By Lemma 4.3.5, we get the following lemma in the same way as we derived Lemma 4.3.2 from Lemma 4.3.1.

Lemma 4.3.6. (1) $\sup_{m\in(0,1]}\sup_{0\le t\le T}E^{P_m}\big[\big|\frac{d}{dt}V_i^{05}(t)\big|^2\big]<\infty$,
(2) $\{$the distribution of $\{V_i^{05}(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbf{R}^d))$.

We show that the term $V_i^{04}$ is negligible. Precisely, we show the following:

Lemma 4.3.7.
\[
E^{P_m}\Big[\sup_{0\le t\le T}|V_i^{04}(t)|^2\Big]\to 0\quad\text{as } m\to 0.
\]

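Proof. The proof is similar to the previous ones. It is easier than that of Lemma 4.2.3, where we first had to show that the expectation is 0 (see (4.5)), whereas now we are considering only the variance from the very beginning.
We have for any $s\in[0,4m^{1/2}\tau]$ that
\begin{align*}
\big|\nabla U_i\big(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\big)\big|
&\le \|\nabla U_i\|_\infty\,1_{[0,R_0+1)}(|x|)\,1_{[0,2m^{1/2}\tau)}(|s-r|)\\
&\le \|\nabla U_i\|_\infty\,1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,\,6m^{1/2}\tau]}(r).
\end{align*}
Let $C_4=8\tau\|\nabla U_i\|_\infty^2(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv$, which is finite. Then we have by the definition of $\mu_\omega$ and the assumption A2 that
\begin{align*}
&E^{P_m}\Big[\Big|\int_{[2m^{1/2}\tau,\infty)\times E}\nabla U_i\big(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\big)\,\big(\mu_\omega(dr,dx,dv)-\lambda_m(dr,dx,dv)\big)\Big|^2\Big]\\
&\quad= E\Big[\int_{[2m^{1/2}\tau,\infty)\times E}\big|\nabla U_i\big(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\big)\big|^2\,\lambda_m(dr,dx,dv)\Big]\\
&\quad\le \|\nabla U_i\|_\infty^2\int_{[2m^{1/2}\tau,\infty)\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,\,6m^{1/2}\tau]}(r)\,
m^{-1}\rho\Big(\tfrac12|v|^2+\sum_{j=1}^N U_j(x-m^{-1/2}rv-X_{j,0})\Big)\,dr\,\nu(dx,dv)\\
&\quad\le \|\nabla U_i\|_\infty^2\,8m^{1/2}\tau\,(2(R_0+1))^{d-1}m^{-1}\int_{\mathbf{R}^d}\rho_c\big(\tfrac12|v|^2\big)|v|\,dv
= C_4m^{-1/2}. \tag{4.11}
\end{align*}
Therefore,
\begin{align*}
E^{P_m}\Big[\sup_{0\le t\le T}|V_i^{04}(t)|^2\Big]
&\le E^{P_m}\Big[4m^{1/2}\tau\int_0^{4m^{1/2}\tau}\Big|\int_{[2m^{1/2}\tau,\infty)\times E}\nabla U_i\big(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\big)\,\big(\mu_\omega(dr,dx,dv)-\lambda_m(dr,dx,dv)\big)\Big|^2ds\Big]\\
&\le (4m^{1/2}\tau)^2\,C_4m^{-1/2},
\end{align*}
which converges to 0 as $m\to 0$. This completes the proof of our assertion.
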

Now, the only term left to be discussed is $V_i^{03}$. We deal with it in the next subsection.

4.4. The term $V_i^{03}$

We deal with the term $V_i^{03}$ in this subsection. More precisely, we show that it is equal to a martingale plus a negligible term.
We first prepare some notation. Let $\mathcal F_t=\mathcal F_t^{(m,n)}=\mathcal F_{(-\infty,\,2m^{1/2}\tau+t)\times E}$ as in Sec. 3.5. Then $\mathcal F_t$ is increasing and right continuous. Let
\[
N((0,t]\times A) := \mu_\omega\big((2m^{1/2}\tau,\,2m^{1/2}\tau+t]\times A\big)
\]
for any $A\in\mathcal B(E)$. Notice that if $\rho\big(\tfrac12|v|^2+\sum_{j=1}^N U_j(X_{j,0}-(x-m^{-1/2}rv))\big)>0$, then $|v|\ge 2C_0+1$, hence if $r\ge m^{1/2}\tau$ in addition, then $|x-m^{-1/2}rv|\ge\tau|v|>R_0$, so $\rho\big(\tfrac12|v|^2+\sum_{j=1}^N U_j(X_{j,0}-(x-m^{-1/2}rv))\big)=\rho(\tfrac12|v|^2)$. Therefore, if we let
\[
\tilde\nu(dx,dv)=\rho\big(\tfrac12|v|^2\big)\,\nu(dx,dv),
\]
then $N$ is the $\mathcal F_t$-adapted Poisson point process with intensity measure $\lambda(dt,dx,dv)=m^{-1}dt\,\tilde\nu(dx,dv)=m^{-1}dt\,\rho(\tfrac12|v|^2)\,\nu(dx,dv)$. Notice that $N((s,t]\times A)$ is independent of $\mathcal F_s$ for any $s<t$ and $A\in\mathcal B(E)$. Let
\[
\tilde N(dt,dx,dv)=N(dt,dx,dv)-m^{-1}dt\,\tilde\nu(dx,dv).
\]
Notice that $X_i(t\wedge\sigma)$ and $V_i(t\wedge\sigma)$ are $\mathcal F_t$-measurable. Also, since $\nabla U_i\big(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\big)\ne 0$ only if $|m^{-1/2}(s-r)|\le 2\tau$, which combined with $r\ge 2m^{1/2}\tau$ and $s\le T\wedge\sigma$ implies $\tilde r=r-2m^{1/2}\tau$, we get by definition that
\begin{align*}
V_i^{03}(t)&=\int_0^{t\wedge\sigma}ds\int_{[2m^{1/2}\tau,\,2m^{1/2}\tau+(T\wedge\sigma))\times E}
\nabla U_i\big(X_i(r-2m^{1/2}\tau)-\psi^0(m^{-1/2}(s-r),x,v;\vec X(r-2m^{1/2}\tau))\big)\,\big(\mu_\omega(dr,dx,dv)-\lambda_m(dr,dx,dv)\big)\\
&=\int_0^{t\wedge\sigma}ds\int_{[0,T\wedge\sigma)\times E}
\nabla U_i\big(X_i(r)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;\vec X(r))\big)\,\tilde N(dr,dx,dv).
\end{align*}
In the last expression above, if $r>t\wedge\sigma$, then since $s\le t\wedge\sigma$, we get $m^{-1/2}(s-r)-2\tau<-2\tau$, hence $\nabla U_i\big(X_i(r)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;\vec X(r))\big)=0$. Therefore,
\[
V_i^{03}(t)=\int_0^{t\wedge\sigma}ds\int_{[0,t\wedge\sigma)\times E}
\nabla U_i\big(X_i(r)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;\vec X(r))\big)\,\tilde N(dr,dx,dv).
\]

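Let
\[
\widehat V_i^{03}(t)=\int_{(0,t]\times E}\tilde N(dr,dx,dv)\int_0^t ds\,
\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;\vec X(r-))\big).
\]
Then
\[
V_i^{03}(t)=\widehat V_i^{03}(t\wedge\sigma).
\]
By Corollary 3.2.3, $\nabla U_i\big(X_i(r-)-\psi^0(u,x,v;\vec X(r-))\big)=0$ if $|u|\ge 2\tau$. So the integral domain $s\in[0,t]$ in the definition of $\widehat V_i^{03}(t)$, which is equivalent to $s-r\in[-r,t-r]$, can be replaced by $s-r\in[0,(t-r)\wedge 4m^{1/2}\tau]=[0,4m^{1/2}\tau]\setminus[(t-r)\wedge(4m^{1/2}\tau),\,4m^{1/2}\tau]$. Therefore, $\widehat V_i^{03}(t)$ can be decomposed into
\[
\widehat V_i^{03}(t)=\tilde M_i(t)+\tilde\eta_i(t),
\]
where
\begin{align*}
\tilde M_i(t)&=\int_{(0,t]\times E}\tilde N(dr,dx,dv)\int_0^{4m^{1/2}\tau}ds\,
\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}s-2\tau,x,v;\vec X(r-))\big),\\
\tilde\eta_i(t)&=-\int_{(0,t]\times E}\tilde N(dr,dx,dv)\int_{(t-r)\wedge(4m^{1/2}\tau)}^{4m^{1/2}\tau}ds\,
\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}s-2\tau,x,v;\vec X(r-))\big).
\end{align*}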
$
By definition (notice that the integral domain $(0,t]\times E$ in the definition of $\widehat V_i^{03}(t)$ can always be converted into $(0,T]\times E$ whenever necessary, and vice versa),
\[
\frac{d}{dt}\widehat V_i^{03}(t)=\int_{(0,t]\times E}\tilde N(dr,dx,dv)\,
\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}(t-r)-2\tau,x,v;\vec X(r-))\big),
\]
so with $C_1=4\tau\|\nabla U_i\|_\infty^2(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho(\tfrac12|v|^2)|v|\,dv$, we have
\begin{align*}
E^{P_m}\Big[\Big|\frac{d}{dt}\widehat V_i^{03}(t)\Big|^2\Big]
&=E\Big[\int_{(0,t]\times E}\big|\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}(t-r)-2\tau,x,v;\vec X(r-))\big)\big|^2\,\lambda(dr,dx,dv)\Big]\\
&\le \|\nabla U_i\|_\infty^2\int_{(0,t]\times E}1_{[0,R_0+1)}(|x|)\,1_{[0,2\tau]}\big(|m^{-1/2}(t-r)-2\tau|\big)\,m^{-1}dr\,\rho\big(\tfrac12|v|^2\big)\,\nu(dx,dv)\\
&\le 4m^{1/2}\tau\,\|\nabla U_i\|_\infty^2(2(R_0+1))^{d-1}m^{-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv
=C_1m^{-1/2}. \tag{4.12}
\end{align*}
This fact will be used later.


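Let us study the term $\tilde M_i(t)$. First, it is easy to see by definition that $\tilde M_i(t)$ is an $\mathcal F_t$-martingale, with its jumps satisfying $|\Delta\tilde M_i|\le 4\tau m^{1/2}\|\nabla U_i\|_\infty$. Also, with $C=(4\tau)^2\|\nabla U_i\|_\infty^2(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho(\tfrac12|v|^2)|v|\,dv$, we have that, for any $0\le s\le t\le T$,
\begin{align*}
E^{P_m}\big[|\tilde M_i(t)-\tilde M_i(s)|^2\,\big|\,\mathcal F_s\big]
&=E^{P_m}\Big[\int_{(s,t]\times E}\Big|\int_0^{4m^{1/2}\tau}\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}u-2\tau,x,v;\vec X(r-))\big)\,du\Big|^2
1_{[0,R_0+1)}(|x|)\,m^{-1}dr\,\rho\big(\tfrac12|v|^2\big)\,\nu(dx,dv)\,\Big|\,\mathcal F_s\Big]\\
&\le C|t-s|, \tag{4.13}
\end{align*}
hence for any $0\le r\le s\le t\le T$,
\[
E^{P_m}\big[|\tilde M_i(t)-\tilde M_i(s)|^2\,|\tilde M_i(s)-\tilde M_i(r)|^2\big]\le C^2|t-s||s-r|. \tag{4.14}
\]
Also, by Doob's inequality and (4.13), we get
\begin{align*}
E^{P_m}\Big[\sup_{t\in[0,T]}|\tilde M_i(t)|\Big]
&\le E^{P_m}\Big[\sup_{t\in[0,T]}|\tilde M_i(t)|^2\Big]^{1/2}
\le 2\sup_{t\in[0,T]}E^{P_m}\big[|\tilde M_i(t)|^2\big]^{1/2}\\
&\le 2\sup_{t\in[0,T]}\sqrt{Ct}=2\sqrt{CT}<\infty. \tag{4.15}
\end{align*}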

By Theorem 3.4.1 (with the three parameters there equal to 1, 2 and 1/2, respectively), (4.13)–(4.15) imply the following:

Lemma 4.4.1. $\{$The distribution of $\{\tilde M_i(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbf{R}^d))$.

We next show that under any of its cluster points as $m\to 0$, the canonical process is continuous with probability 1. We first make the following preparation.


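Lemma 4.4.2. For any $\varepsilon\in(0,1]$, let
\begin{align*}
A&=\bigcap_{\delta>0}\Big\{w\in D([0,T];\mathbf{R}^d):\ \sup_{|t-s|\le\delta}|w(t)-w(s)|>\varepsilon\Big\},\\
B&=\bigcap_{\delta>0}\Big\{w\in D([0,T];\mathbf{R}^d):\ \sup_{|t-s|\le e\delta}|w(t)-w(s)|>\frac{\varepsilon}{2}\Big\}.
\end{align*}
Then
\[
A\subset\bar A\subset B^o\subset B.
\]
Here $\bar A$ and $B^o$ mean the closure of $A$ and the interior of $B$ in $(D([0,T];\mathbf{R}^d),d_0)$, respectively.

Proof. For any $w_0\in A$ and $w\in D([0,T];\mathbf{R}^d)$ with $d_0(w,w_0)<\frac{\varepsilon}{5}$, we have that $w\in B$. Indeed, by definition, there exists a continuous non-decreasing function $\lambda:[0,T]\to[0,T]$ such that $\lambda(0)=0$, $\lambda(T)=T$, and
\begin{align*}
&|\lambda(t)-\lambda(s)|\le e^{\varepsilon/4}|t-s|\le e|t-s|,\quad\text{for any } 0\le s<t\le T,\\
&\sup_{0\le t\le T}|w_0(t)-w(\lambda(t))|\le\varepsilon/4.
\end{align*}
Therefore, for any $\delta>0$,
\begin{align*}
\sup_{|t-s|\le e\delta}|w(t)-w(s)|&=\sup_{|\lambda(t)-\lambda(s)|\le e\delta}|w(\lambda(t))-w(\lambda(s))|
\ge\sup_{|t-s|\le\delta}|w(\lambda(t))-w(\lambda(s))|\\
&\ge\sup_{|t-s|\le\delta}|w_0(t)-w_0(s)|-\sup_{0\le t\le T}|w_0(t)-w(\lambda(t))|-\sup_{0\le s\le T}|w_0(s)-w(\lambda(s))|\\
&>\varepsilon-\frac{\varepsilon}{4}-\frac{\varepsilon}{4}=\varepsilon/2,
\end{align*}
which means that $w\in B$. This completes the proof of our assertion.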

Now, we are ready to prove the continuity of canonical processes of cluster points of $\{\{\tilde M_i(t)\}_{t\in[0,T]}$ under $P_m\}_{m\to 0}$.

Lemma 4.4.3. Any cluster point of $\{\{\tilde M_i(t)\}_{t\in[0,T]}$ under $P_m\}_{m\to 0}$ in $\wp(D([0,T];\mathbf{R}^d))$ must have continuous canonical processes.

Proof. Suppose there exists a sequence $m_k\to 0$ (as $k\to\infty$) such that $P_{m_k}\circ(\tilde M_i)^{-1}$ (which we write as $Q_k$ for the sake of simplicity) converges to some $Q_\infty\in\wp(D([0,T];\mathbf{R}^d))$ as $k\to\infty$. We show that the canonical process under $Q_\infty$ is continuous with probability 1. Suppose not. Then there exists a constant $\varepsilon>0$ such that
\[
Q_\infty\Big(\bigcap_{\delta>0}\Big\{w\in D([0,T];\mathbf{R}^d):\ \sup_{|t-s|\le\delta}|w(t)-w(s)|>\varepsilon\Big\}\Big)=a>0.
\]
Without loss of generality, we assume that $\varepsilon\le 1$. Let $A$ and $B$ be the sets defined in Lemma 4.4.2. Then $Q_\infty(A)=a>0$, so by Lemma 4.4.2, $Q_\infty(B^o)\ge a>0$. Also, $B^o$ is an open set, and $Q_k\to Q_\infty$ weakly in $\wp(D([0,T];\mathbf{R}^d))$, so we have $\liminf_{k\to\infty}Q_k(B^o)\ge Q_\infty(B^o)$. Therefore, there exists an $N\in\mathbf{N}$ such that for any $k\ge N$, $Q_k(B^o)\ge\frac a2$, hence $Q_k(B)\ge\frac a2$, which means that $P_{m_k}(\tilde M_i$ has a jump greater than $\varepsilon/2)\ge\frac a2$. Since $m_k\to 0$ as $k\to\infty$, this contradicts the fact that all of the jumps of $\tilde M_i$ under $P_{m_k}$ are smaller than $4\tau m_k^{1/2}\|\nabla U_i\|_\infty$.
This completes the proof of our assertion.
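The mechanism behind Lemmas 4.4.1 and 4.4.3 (jump sizes of order $m^{1/2}$ arriving at rate of order $m^{-1}$, so that the variance stays of order one while individual jumps vanish) can be seen in a toy model. The following sketch is only an illustration under simplifying assumptions: it replaces $\tilde M_i$ by the scalar process $\sqrt m\,(N_t-t/m)$ with $N$ a rate-$1/m$ Poisson process, and all names are hypothetical.

```python
import random

def compensated_poisson_path(m, T, seed=0):
    """Jump times and post-jump values of sqrt(m) * (N_t - t/m), N Poisson of rate 1/m."""
    rng = random.Random(seed)
    rate, jump = 1.0 / m, m ** 0.5
    t, count, times, values = 0.0, 0, [], []
    while True:
        t += rng.expovariate(rate)      # exponential waiting time to the next jump
        if t > T:
            break
        count += 1
        times.append(t)
        values.append(jump * (count - rate * t))
    return times, values, jump

def variance_at(m, t):
    """Exact variance at time t: jump^2 * rate * t, which equals t for every m."""
    return (m ** 0.5) ** 2 * (1.0 / m) * t

times, values, jump = compensated_poisson_path(m=0.01, T=1.0)
print(len(times), jump, variance_at(0.01, 1.0))
```

In this toy model every jump has size exactly $\sqrt m$, which tends to 0 with $m$, while the variance at time $t$ equals $t$ regardless of $m$; this is the same scaling that forces the limit points to be continuous.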

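We next use Lemma 4.4.3 to show the following, which will be used later.

Lemma 4.4.4. For any $\varepsilon>0$, we have that
\[
\limsup_{\delta\to 0}\ \limsup_{m\to 0}\ P_m\Big(\sup_{0\le s\le t\le T,\,|s-t|\le\delta}|\tilde M_i(t)-\tilde M_i(s)|>\varepsilon\Big)=0. \tag{4.16}
\]

Proof. Let $a(m,\delta)=P_m\big(\sup_{0\le s\le t\le T,\,|s-t|\le\delta}|\tilde M_i(t)-\tilde M_i(s)|>\varepsilon\big)$. If
\[
\limsup_{\delta\to 0}\ \limsup_{m\to 0}\ a(m,\delta)>0,
\]
then there exist a constant $a>0$ and sequences $\delta_k\to 0$, $m_k\to 0$ (as $k\to\infty$) such that
\[
P_{m_k}\Big(\sup_{0\le s\le t\le T,\,|s-t|\le\delta_k}|\tilde M_i(t)-\tilde M_i(s)|>\varepsilon\Big)\ge a \tag{4.17}
\]
for any $k\in\mathbf{N}$. As before, let $Q_k=P_{m_k}\circ(\tilde M_i)^{-1}$, $k\in\mathbf{N}$. Also, let
\begin{align*}
A_k&=\Big\{w\in D([0,T];\mathbf{R}^d):\ \sup_{0\le s\le t\le T,\,|t-s|\le\delta_k}|w(t)-w(s)|>\varepsilon\Big\},\\
B_k&=\Big\{w\in D([0,T];\mathbf{R}^d):\ \sup_{0\le s\le t\le T,\,|t-s|\le e\delta_k}|w(t)-w(s)|>\frac{\varepsilon}{2}\Big\}.
\end{align*}
Then $Q_k(A_k)\ge a$ by assumption, and by the same argument as in the proof of Lemma 4.4.2, we get that $A_k\subset\bar A_k\subset B_k^o\subset B_k$ for any $k\in\mathbf{N}$. Also, $A_k$ is monotone decreasing with respect to $k$, hence for any $\ell\ge k$, we have that $Q_\ell(A_k)\ge Q_\ell(A_\ell)\ge a$. Therefore, since $\bar A_k$ is a closed set, we get that
\[
Q_\infty(B_k)\ge Q_\infty(\bar A_k)\ge\limsup_{\ell\to\infty}Q_\ell(\bar A_k)\ge a.
\]
This is true for any $k\in\mathbf{N}$, so since $B_k$ is monotone decreasing with respect to $k$, we get that
\[
Q_\infty\Big(\bigcap_{k=1}^\infty B_k\Big)\ge a,
\]
which means that $Q_\infty(\{$canonical process has a jump $\ge\varepsilon/2\})\ge a$, which contradicts Lemma 4.4.3. This completes the proof of our assertion.

Before dealing with $\tilde\eta_i(t)$, we prepare one more result about $\tilde M_i(t)$, for later use.

Lemma 4.4.5. There exists a constant $C>0$ (not depending on $m$) such that
\[
\sup_{m\in(0,1]}E^{P_m}\Big[\sup_{t\in[0,T]}|\tilde M_i(t)|^4\Big]\le C.
\]

Proof. By the general fact for Poisson point processes that $E[|\int f\,d\tilde N|^4]\le E[3(\int f^2\,d\lambda)^2+\int f^4\,d\lambda]$, we get with the help of Doob's inequality that
\begin{align*}
E^{P_m}\Big[\sup_{t\in[0,T]}|\tilde M_i(t)|^4\Big]
&\le(4/3)^4E^{P_m}\big[|\tilde M_i(T)|^4\big]\\
&\le(4/3)^4E\Big[3\Big(\int_{(0,T]\times E}\lambda(dr,dx,dv)\Big|\int_0^{4m^{1/2}\tau}ds\,\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}s-2\tau,x,v;\vec X(r-))\big)\Big|^2\Big)^2\\
&\qquad\qquad+\int_{(0,T]\times E}\lambda(dr,dx,dv)\Big|\int_0^{4m^{1/2}\tau}ds\,\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}s-2\tau,x,v;\vec X(r-))\big)\Big|^4\Big]\\
&\le(4/3)^4\Big[3\Big(\int_{(0,T]\times E}m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\,\big(4\tau m^{1/2}\|\nabla U_i\|_\infty1_{[0,R_0+1)}(|x|)\big)^2\Big)^2\\
&\qquad\qquad+\int_{(0,T]\times E}m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\,\big(4\tau m^{1/2}\|\nabla U_i\|_\infty1_{[0,R_0+1)}(|x|)\big)^4\Big]\\
&\le(4/3)^4\Big[3(4\tau\|\nabla U_i\|_\infty)^4\Big(T(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\Big)^2\\
&\qquad\qquad+(4\tau\|\nabla U_i\|_\infty)^4\,mT(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\Big].
\end{align*}
The right-hand side above is dominated by a finite global constant for $m\in(0,1]$.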

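We next deal with $\tilde\eta_i(t)$. First, we use some basic properties of Poisson point processes to show that there exists a constant $C$ such that
\[
E^{P_m}\big[|\tilde\eta_i(t)|^6\big]\le Cm^{3/2},\qquad t\in[0,T],\ m\in(0,1]. \tag{4.18}
\]
In fact, notice that $\tilde\eta_i(t)$ can be expressed as
\[
\tilde\eta_i(t)=-\int_{[(t-4m^{1/2}\tau)\vee 0,\,t]\times E}\tilde N(dr,dx,dv)\int_{(t-r)\wedge(4m^{1/2}\tau)}^{4m^{1/2}\tau}ds\,
\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}s-2\tau,x,v;\vec X(r-))\big).
\]
Also, in general, if $Z$ is a Poisson random variable with mean $a$, then we have $E[Z-a]=0$, $E[(Z-a)^2]=E[(Z-a)^3]=a$, $E[(Z-a)^4]=3a^2+a$, and $E[(Z-a)^6]=15a^3+25a^2+a$. Therefore, by the definition of a Poisson point process and a simple calculation, there exists a global constant $C$ such that
\[
E\Big[\Big|\int f\,d\tilde N\Big|^6\Big]\le CE\Big[\Big(\int f^2\,d\lambda\Big)^3+\Big(\int f^3\,d\lambda\Big)^2+\int f^2\,d\lambda\int f^4\,d\lambda+\int f^6\,d\lambda\Big],
\]
for any measurable function $f$. We use this to prove (4.18).
Let $A=\big|\int_{(t-r)\wedge(4m^{1/2}\tau)}^{4m^{1/2}\tau}\nabla U_i\big(X_i(r-)-\psi^0(m^{-1/2}s-2\tau,x,v;\vec X(r-))\big)\,ds\big|$. Then since $t-r\ge 0$, we get that $A\le 4\tau m^{1/2}\|\nabla U_i\|_\infty$. Therefore, writing $I_t=[(t-4m^{1/2}\tau)\vee 0,\,t]\times E$ for the domain of integration,
\begin{align*}
E^{P_m}\big[|\tilde\eta_i(t)|^6\big]
&\le CE\Big[\Big(\int_{I_t}A^2m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\Big)^3
+\Big(\int_{I_t}A^3m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\Big)^2\\
&\qquad+\int_{I_t}A^2m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\int_{I_t}A^4m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)
+\int_{I_t}A^6m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\Big]\\
&\le C\Big[\Big(4\tau m^{1/2}(4\tau m^{1/2}\|\nabla U_i\|_\infty)^2m^{-1}(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\Big)^3\\
&\qquad+\Big(4\tau m^{1/2}(4\tau m^{1/2}\|\nabla U_i\|_\infty)^3m^{-1}(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\Big)^2\\
&\qquad+\Big(4\tau m^{1/2}(4\tau m^{1/2}\|\nabla U_i\|_\infty)^2m^{-1}(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\Big)\\
&\qquad\quad\times\Big(4\tau m^{1/2}(4\tau m^{1/2}\|\nabla U_i\|_\infty)^4m^{-1}(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\Big)\\
&\qquad+4\tau m^{1/2}(4\tau m^{1/2}\|\nabla U_i\|_\infty)^6m^{-1}(2(R_0+1))^{d-1}\int_{\mathbf{R}^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\Big],
\end{align*}
which gives us our assertion, since the four terms are of order $m^{3/2}$, $m^2$, $m^2$ and $m^{5/2}$, respectively, and $m\le 1$.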
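The central moments of a Poisson random variable quoted in the derivation of (4.18) can be checked numerically. The sketch below sums the series defining each moment and compares it with the closed forms $a$, $a$, $3a^2+a$ and $15a^3+25a^2+a$; the function names and the value of `a` are illustrative choices, not taken from the paper.

```python
import math

def poisson_central_moment(a, n, terms=200):
    """n-th central moment E[(Z - a)^n] of Z ~ Poisson(a), by direct summation."""
    total, pmf = 0.0, math.exp(-a)   # pmf starts at P(Z = 0)
    for k in range(terms):
        total += pmf * (k - a) ** n
        pmf *= a / (k + 1)           # P(Z = k + 1) obtained from P(Z = k)
    return total

def closed_forms(a):
    """Closed-form central moments of orders 2, 3, 4 and 6."""
    return {2: a, 3: a, 4: 3 * a ** 2 + a, 6: 15 * a ** 3 + 25 * a ** 2 + a}

a = 2.5                               # hypothetical mean
for n, expected in closed_forms(a).items():
    print(n, poisson_central_moment(a, n), expected)
```

Each printed pair should agree up to floating-point error. The leading term $15a^3$ of the sixth moment is what produces the dominant contribution $(\int f^2\,d\lambda)^3$ in the sixth-moment bound.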


We use (4.18) to show the following, with the help of (4.12) (the estimate for the derivative of $\widehat V_i^{03}$), Lemma 4.4.4 (the continuity of the limit of $\tilde M_i(t)$), and Lemma 4.4.5 (the estimate with respect to $|\tilde M_i(t)|^4$).

Lemma 4.4.6.
\[
\lim_{m\to 0}E^{P_m}\Big[\sup_{0\le t\le T}|\tilde\eta_i(t)|^2\Big]=0.
\]

Proof. By (4.18),
\[
\sum_{k=0}^{[m^{-4/3}T]}E^{P_m}\big[|\tilde\eta_i(km^{4/3})|^6\big]\le Cm^{3/2}\,m^{-4/3}T\to 0,\quad\text{as } m\to 0.
\]
In particular we have
\[
E^{P_m}\Big[\max_{0\le k\le[m^{-4/3}T]}|\tilde\eta_i(km^{4/3})|^6\Big]\to 0,\quad\text{as } m\to 0. \tag{4.19}
\]
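The choice of the grid spacing $m^{4/3}$ in (4.19) is a matter of exponent bookkeeping: the sixth-moment bound contributes $m^{3/2}$ per grid point and there are about $m^{-4/3}T$ grid points. A small sketch of this arithmetic, with exact fractions (the variable and function names are illustrative, and the constants `C` and `T` are placeholders):

```python
from fractions import Fraction

moment_exp = Fraction(3, 2)     # E|eta(t)|^6 <= C m^{3/2}, from (4.18)
grid_exp = Fraction(-4, 3)      # about m^{-4/3} T grid points on [0, T]
total_exp = moment_exp + grid_exp

def grid_sum_bound(m, C=1.0, T=1.0):
    """Bound C * m^{3/2} * m^{-4/3} * T on the sum over the grid."""
    return C * T * m ** float(total_exp)

print(total_exp, grid_sum_bound(1e-3), grid_sum_bound(1e-6))
```

The net exponent is $1/6>0$, so the bound tends to 0 as $m\to 0$. At the same time the spacing $m^{4/3}$ is fine enough that the increment estimate coming from (4.12), of order $m^{4/3}\cdot m^{-1/2}=m^{5/6}$, also vanishes.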

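Since $\tilde\eta_i(t)$ is a càdlàg process, there exists a measurable $\xi_m:\Omega\to[0,T]$ such that
\[
|\tilde\eta_i(\xi_m)|\vee|\tilde\eta_i(\xi_m-)|=\sup_{0\le t\le T}|\tilde\eta_i(t)|. \tag{4.20}
\]
Also, the jumps of $\tilde\eta_i$ satisfy $|\Delta\tilde\eta_i|\le 4\tau m^{1/2}\|\nabla U_i\|_\infty$, so $|\tilde\eta_i(\xi_m-)|\le|\tilde\eta_i(\xi_m)|+4\tau m^{1/2}\|\nabla U_i\|_\infty$. Let $\tilde\xi_m=m^{4/3}[m^{-4/3}\xi_m]$. Then $0\le\xi_m-\tilde\xi_m\le m^{4/3}$.
Combining the above, we get that
\begin{align*}
E^{P_m}\Big[\sup_{0\le t\le T}|\tilde\eta_i(t)|^2\Big]
&=E^{P_m}\big[|\tilde\eta_i(\xi_m)|^2\vee|\tilde\eta_i(\xi_m-)|^2\big]\\
&\le 2(4\tau m^{1/2}\|\nabla U_i\|_\infty)^2+2E^{P_m}\big[|\tilde\eta_i(\xi_m)|^2\big]\\
&\le 2(4\tau m^{1/2}\|\nabla U_i\|_\infty)^2+4E^{P_m}\Big[\max_{0\le k\le[m^{-4/3}T]}|\tilde\eta_i(km^{4/3})|^2\Big]
+4E^{P_m}\big[|\tilde\eta_i(\xi_m)-\tilde\eta_i(\tilde\xi_m)|^2\big].
\end{align*}
The first term on the right-hand side above evidently converges to 0 as $m\to 0$. By (4.19), the second term also converges to 0 as $m\to 0$. So in order to show that $E^{P_m}[\sup_{0\le t\le T}|\tilde\eta_i(t)|^2]\to 0$, it suffices to prove that the third term $E^{P_m}[|\tilde\eta_i(\xi_m)-\tilde\eta_i(\tilde\xi_m)|^2]$ converges to 0. We show this in the following.
Notice that
\[
E^{P_m}\big[|\tilde\eta_i(\xi_m)-\tilde\eta_i(\tilde\xi_m)|^2\big]\le 2E^{P_m}\big[|\widehat V_i^{03}(\xi_m)-\widehat V_i^{03}(\tilde\xi_m)|^2\big]
+2E^{P_m}\big[|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|^2\big].
\]
Since $0\le\xi_m-\tilde\xi_m\le m^{4/3}$, we get by (4.12) that
\begin{align*}
E^{P_m}\big[|\widehat V_i^{03}(\xi_m)-\widehat V_i^{03}(\tilde\xi_m)|^2\big]
&\le E^{P_m}\Big[\Big(\int_0^T1_{[\tilde\xi_m,\xi_m]}(t)\Big|\frac{d}{dt}\widehat V_i^{03}(t)\Big|\,dt\Big)^2\Big]\\
&\le E^{P_m}\Big[\int_0^T1_{[\tilde\xi_m,\xi_m]}(t)\,dt\int_0^T\Big|\frac{d}{dt}\widehat V_i^{03}(t)\Big|^2dt\Big]\\
&\le m^{4/3}\int_0^TE^{P_m}\Big[\Big|\frac{d}{dt}\widehat V_i^{03}(t)\Big|^2\Big]dt\\
&\le m^{4/3}\,T\,C_1m^{-1/2}\to 0,\quad\text{as } m\to 0.
\end{align*}
For the term $E^{P_m}[|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|^2]$, we first notice that since $0\le\xi_m-\tilde\xi_m\le m^{4/3}$ by definition, (4.16) gives us that
\[
\lim_{m\to 0}P_m\big(|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|>\varepsilon\big)=0. \tag{4.21}
\]
This is true for any $\varepsilon>0$. Also, we have by Lemma 4.4.5 that for any $\varepsilon>0$,
\begin{align*}
E^{P_m}\big[|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|^2\big]
&\le E^{P_m}\big[|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|^2,\ |\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|>\varepsilon\big]+\varepsilon^2\\
&\le E^{P_m}\big[|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|^4\big]^{1/2}P_m\big(|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|>\varepsilon\big)^{1/2}+\varepsilon^2\\
&\le 4E^{P_m}\Big[\sup_{t\in[0,T]}|\tilde M_i(t)|^4\Big]^{1/2}P_m\big(|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|>\varepsilon\big)^{1/2}+\varepsilon^2\\
&\le 4C^{1/2}P_m\big(|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|>\varepsilon\big)^{1/2}+\varepsilon^2.
\end{align*}
This combined with (4.21) gives us that
\[
\lim_{m\to 0}E^{P_m}\big[|\tilde M_i(\xi_m)-\tilde M_i(\tilde\xi_m)|^2\big]=0,
\]
and completes the proof of the fact that
\[
\lim_{m\to 0}E^{P_m}\big[|\tilde\eta_i(\xi_m)-\tilde\eta_i(\tilde\xi_m)|^2\big]=0,
\]
which in turn completes the proof of our assertion.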
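The truncation step used at the end of this proof, $E[X^2]\le E[X^4]^{1/2}P(|X|>\varepsilon)^{1/2}+\varepsilon^2$, is the Cauchy–Schwarz inequality applied on the event $\{|X|>\varepsilon\}$. A minimal check on a finite distribution, with hypothetical values chosen only for illustration, can be sketched as:

```python
def truncation_bound(values, probs, eps):
    """Return (E[X^2], E[X^4]^{1/2} * P(|X| > eps)^{1/2} + eps^2) for a finite distribution."""
    second = sum(p * x ** 2 for x, p in zip(values, probs))
    fourth = sum(p * x ** 4 for x, p in zip(values, probs))
    tail = sum(p for x, p in zip(values, probs) if abs(x) > eps)
    return second, fourth ** 0.5 * tail ** 0.5 + eps ** 2

# Hypothetical distribution: mostly small values, a rare large one.
values, probs = [0.0, 0.1, 3.0], [0.7, 0.29, 0.01]
second, bound = truncation_bound(values, probs, eps=0.5)
print(second, bound)
```

The bound is useful exactly in the situation of the proof: the fourth moment is uniformly bounded (Lemma 4.4.5), the tail probability tends to 0 by (4.21), and $\varepsilon$ can then be taken arbitrarily small.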

Combining all of the results in Secs. 4.1–4.3, we get Lemma 3.5.1, with

i (t ),
Mi (t) = M
Pi1 (t) = Vi02 (t) Vi05 (t), (4.22)
i (t) = Vi1 (t) + Vi04 (t) i (t ).

Before closing this subsection, we state the following result with respect to the quadratic variation of the martingale $M_i(\cdot)$. The proof is easy and we omit it. For $i=1,\dots,N$ and $k=1,\dots,d$, let
\[
A_{ik}(r)=A_{ik}(r,x,v)=\int_{-2\tau}^{2\tau}\nabla_kU_i\big(X_i(r)-\psi^0(u,x,v;\vec X(r))\big)\,du.
\]
Then we have:

Lemma 4.4.7. For any $l_1,l_2=1,\dots,N$ and $k_1,k_2=1,\dots,d$, the following equality holds:
\[
[M_{l_1}^{k_1},M_{l_2}^{k_2}]_s=m\int_{[0,s]\times E}A_{l_1k_1}(r,x,v)A_{l_2k_2}(r,x,v)\,N(dr,dx,dv).
\]

4.5. Proof of Lemma 3.5.2


In this subsection, we present the proof of Lemma 3.5.2.
The first assertion is just an easy consequence of Lemma 3.5.1 and the formula of integration by parts. Indeed, for any $t\ge 0$, we have by assumption and the formula of integration by parts that


tfD tfD
m 1/2 
|i U (X(s))|ds =
 
g(X(s))  (X(s)))ds
(m1/2 i U 
0 0
tf
D
 
= g(X(t D ))
 (X(s))ds
m1/2 i U 
0
tf
D
s

ds(g(X(s))V (s))  (X(r))dr.
m1/2 i U 
0 0

Therefore, by Lemma 3.5.1(1), we get


T fD
 X(s))|ds
m1/2 |i U( 
0


= g(X(T D ))(Mi (Vi (T
D ) Vi (0)) + Mi (T
D)
T fD
1
+ i (T 
D ) + Pi (T D ))

(g(X(t))  (t))
V
0

{Mi (Vi (t 
D ) Vi (0)) + Mi (T
D ) + i (T
D)

+ Pi1 (t 
D )}dt
 
( g + g N nT ) 2Mi n + sup |Mi (t) + i (t)| + sup |Pi1 (t)| .
0tT 0tT

Therefore, we get our first assertion by Lemmas 3.5.1(2), 3.5.1(4) and (4.15).
Before giving the proof of the second assertion, let us make some preparation.
With the help of Lemma 4.4.7, we have the following.

Lemma 4.5.1.
  t 2 
 
lim E Pm
sup  i (s)dMi (s) = 0.
m0 t[0,T ] 0
t
Proof. Since Mi () is a martingale, Lemma 3.5.1(4) implies that 0 i (s)dMi (s) is
also a martingale. Therefore, with the help of Lemma 4.4.7 and Doobs inequality,
we get that
  t 2 
 
E Pm 
sup  i (s)dMi (s)
t[0,T ] 0
 2
 T 
 
4E Pm  i (s)dMi (s)
 0 
 

d T
Pm
=2 E ik (s)i (s)d[Mik , Mi ]s
k,=1 0
 

d T
Pm
= 2m E ik (s)i (s) Aik (s, x, v)Ai (s, x, v)N (ds, dx, dv)
k,=1 0 E

1 2
2(4 Ui )2 E Pm [|i (s)|2 ]1[0,R0 +1) (|x|) |v| (dx, dv)ds
[0,T ]E 2
  
 1  2
2(4 Ui ) T (2(R0 + 1))2 d1
  v  |v|dvE Pm [|i (s)|2 ].
Rd 2

This combined with Lemma 3.5.1(4) completes the proof of our assertion.

We next show the second assertion of Lemma 3.5.2. The basic idea is to first add an extra term $\sum_{i=1}^N\frac{1}{M_i}\int_0^{t\wedge\sigma}\eta_i(s)\,dM_i(s)$, use the decomposition and the estimates of Lemma 3.5.1 to show that the resulting quantity is tight, and finally delete the added term by Lemma 4.5.1.
First, by Lemma 3.5.1, we have

 (X(t
m1/2 (U  (X(0)))
 )) U 
N 

t 
Mi 1 2
+ |Vi (t )| + i (s)dMi (s)
i=1
2 Mi 0

N N 

t
Mi 2  (X(s))
= |Vi (0)| + m1/2 i U  Vi (s)ds
i=1
2 i=1 0

t  t 
d 1
+ Mi Vi (s) Vi (s)ds + i (s)dMi (s)
0 ds Mi 0

N N 

t t
Mi d 1
= |Vi (0)|2 + Vi (s) P (s)ds + Vi (s)dMi (s)
i=1
2 i=1 0 ds i 0

t t 
1
+ Vi (s)di (s) + i (s)dMi (s) .
0 Mi 0

Since $|V_i(t\wedge\sigma)|\le n$ by the definition of $\sigma$, we have by Lemma 3.5.1(2) that
\[
\sup_{m\in(0,1]}\ \sup_{0\le t\le T}E^{P_m}\Big[\Big|V_i(t\wedge\sigma)\cdot\frac{d}{dt}P_i^1(t)\Big|^2\Big]<\infty.
\]
Therefore, by Theorem 3.4.1, we get that $\int_0^{t\wedge\sigma}V_i(s)\cdot\frac{d}{ds}P_i^1(s)\,ds$ under $P_m$ is tight for $m\in(0,1]$.
For the term $\int_0^t1_{[0,\sigma]}(s)V_i(s-)\cdot dM_i(s)$, we recall that $\sigma=\inf\{t>0;\ \max_{i=1,\dots,N}|V_i(t)|=n\}$, so $\sigma$ is an $\mathcal F_t$-stopping time. Therefore, since $\{M_i(s)\}_s$ is a martingale, we get that
\[
N_i(t):=\int_0^t1_{[0,\sigma]}(s)V_i(s-)\cdot dM_i(s)
\]
is also an $\mathcal F_t$-martingale. Notice that
\[
E^{P_m}\big[|N_i(t)-N_i(s)|^2\,\big|\,\mathcal F_s\big]\le n^2d^2E^{P_m}\big[|M_i(t)-M_i(s)|^2\,\big|\,\mathcal F_s\big].
\]
So by Lemma 3.5.1(3), we get that
\[
E^{P_m}\big[|N_i(t)-N_i(s)|^2\,\big|\,\mathcal F_s\big]\le n^2d^2C|t-s|,\qquad|\Delta N_i(t)|\le dnCm^{1/2}.
\]
Therefore, similarly as in the proofs of Lemmas 4.4.1 and 4.4.3, we get that $\{N_i(t)\}_t$ under $P_m$ is tight for $m\to 0$, and the canonical process under any of its cluster points is continuous with probability 1.
Finally, we show that 0 Vi (s)di (s)+ M1i 0 i (s)dMi (s) is negligible. Notice
t

that by Lemma 3.5.1(3),


t t
1
Vi (s)di (s) + i (s)dMi (s)
0 Mi 0
t t
1
= Vi (t )i (t) i (s)dVi (s) + i (s)dMi (s)
0 Mi 0
t  t
1 d 1 1
= Vi (t )i (t) i (s) Pi (s) ds i (s)di (s)
Mi 0 ds Mi 0
t
1  (X(s))ds
+ i (s)m1/2 i U 
Mi 0
1 1
= Vi (t )i (t) i (t)2 + [i , i ]t
2Mi 2Mi
t 
1  (X(s)) d
+ i (s) m1/2 i U  Pi1 (s) ds.
Mi 0 ds
Since |Vi (t )| n, Lemma 3.5.1(4) gives us that
  
 1 1 
lim E Pm 
sup Vi (t )i (t) 2
i (t) + [i , i ]t  = 0.
m0 t[0,T ] 2Mi 2Mi
Also, for any > 0, we have for any A > 0,
 t  
 d 1 
Pm sup   i (s) m 1/2  
i U (X(s)) Pi (s) ds >

t[0,T ] 0 ds

Pm sup |i (s)| > A
s[0,T ]
 
t  d 1 
+ Pm sup |m 1/2  (X(s))|
i U  +  Pi (s) ds >
s[0,T ] 0 ds A
 
1
E Pm sup |i (s)|
A s[0,T ]
   
A T  d 1 
+ E Pm |m 1/2   
i U(X(s))| +  Pi (s) ds .
0 ds

Combining this with Lemmas 3.5.1(2), 3.5.1(4) and 3.5.2(1), by first taking $A>0$ small enough and then $m>0$ small enough, we get that
 t  
 d 
lim Pm sup  i (s) m 1/2  (X(s))
i U  Pi (s) ds > = 0
1
m0  ds
t[0,T f
D] 0

for any > 0. This completes the proof of the fact that m1/2 (U  (X(t
 D ))
  tfD
U (X(0)))
 + N M i
|Vi (t 
D )|2
+ 1
i (s)dM i (s) under Pm is tight
i=1 2 Mi 0
as m 0, and the canonical process under any of its cluster points is continuous
with probability 1. This combined with Lemma 4.5.1 gives us our second assertion
of Lemma 3.5.2.

5. Convergence until Near


As mentioned at the end of Sec. 3.4, weak convergence of the distribution of a process with $t\in[0,T]$ for any $T>0$ implies the weak convergence of the distribution of the process with $t\in[0,\infty)$. So in order to prove Theorems 2.0.1(2)–2.0.1(4), it suffices to prove the assertions for $t\in[0,T]$ for any $T>0$. Fix a $T>0$ from now on.
By Lemma 3.5.1, we have that $\{\{M_iV_i(t\wedge\sigma_n)+m^{-1/2}\int_0^{t\wedge\sigma_n}\nabla_i\tilde U(\vec X_s)\,ds\}_{t\in[0,T]}$ under $P_m\}$ is tight in $\wp(D([0,T];\mathbf{R}^d))$ as $m\to 0$, and the canonical process under any of its cluster points is continuous with probability 1.
Let $\sigma_0(\omega)=\inf\{t>0;\ \min_{i\ne j}\{|X_i(t)-X_j(t)|-(R_i+R_j)\}\le 0\}$. Then by (3.31), $\nabla_i\tilde U(\vec X_s)=0$ for any $s\le\sigma_0$. Therefore, there exists (at least) one sequence $m_k\to 0$ (as $k\to\infty$) such that $\{$distribution of $\{(\vec X(t\wedge\sigma_n\wedge\sigma_0),\vec V(t\wedge\sigma_n\wedge\sigma_0))\}_{t\in(0,T]}$ under $P_{m_k}\}$ converges in $\wp(D([0,T];\mathbf{R}^d))$.
In this section, we give the proof of the fact that any cluster point obtained above is the stopped diffusion process with generator $L$ as given in Sec. 2, by proving that it is the solution of the martingale problem $L$. This certainly implies Theorem 2.0.1(2) and 2.0.1(3).
For the sake of simplicity, in this section, we let $\sigma=\sigma_n\wedge\sigma_0$. We use the same notation as in Sec. 4. Also, we use the notation $D_0=(\operatorname{supp}\tilde U)^C\subset\mathbf{R}^{dN}$.

5.1. Decomposition
As claimed, we show from now on that any cluster point of $\{$distribution of $\{(\vec X(t\wedge\sigma),\vec V(t\wedge\sigma))\}_{t\in[0,T]}$ under $P_m\}$ is a solution of the martingale problem $L$, i.e. for any $f\in C_0^\infty(D_0\times\mathbf{R}^{dN})$,
\[
\Big\{f(\vec X(t\wedge\sigma),\vec V(t\wedge\sigma))-f(\vec X_0,\vec V_0)-\int_0^{t\wedge\sigma}Lf(\vec X_s,\vec V_s)\,ds\Big\}_t, \tag{5.1}
\]
after taking the limit $m\to 0$, is a martingale.


First, since we do not have enough information about the term $\eta_i(t)$, we use the following to convert the problem to one without $\eta_i(t)$. Let
\begin{align*}
Y_i(t)&=V_i(0)+M_i^{-1}\Big(M_i(t)+P_i^1(t)-m^{-1/2}\int_0^{t\wedge\sigma_n}\nabla_i\tilde U(\vec X_s)\,ds\Big)\\
&=V_i(t)-M_i^{-1}\eta_i(t),\qquad i=1,\dots,N,
\end{align*}
and let $\vec Y_t=\vec Y(t)=(Y_1(t),\dots,Y_N(t))$. Then we have the following. (We use the notations $\vec X_t=\vec X(t)$ and $\vec V_t=\vec V(t)$.)

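Lemma 5.1.1. For any $f\in C_0^\infty(D_0\times\mathbf{R}^{dN})$, we have that $\{f(\vec X_{t\wedge\sigma_n},\vec V_{t\wedge\sigma_n})\}_t$ and $\{f(\vec X_{t\wedge\sigma_n},\vec Y_{t\wedge\sigma_n})\}_t$ converge or do not converge for $m\to 0$ at the same time, and when they converge, they have the same limit.

Proof. Just notice that if we let $\nabla_Vf$ denote the partial differential of $f$ with respect to $V$, then $\|\nabla_Vf\|_\infty<\infty$ and
\[
|f(\vec X_{t\wedge\sigma_n},\vec V_{t\wedge\sigma_n})-f(\vec X_{t\wedge\sigma_n},\vec Y_{t\wedge\sigma_n})|\le\|\nabla_Vf\|_\infty\max_{i=1,\dots,N}\frac{1}{M_i}\sup_{s\in[0,T]}|\eta_i(s)|,
\]
hence
\[
E^{P_m}\Big[\sup_{0\le t\le T}|f(\vec X_{t\wedge\sigma_n},\vec V_{t\wedge\sigma_n})-f(\vec X_{t\wedge\sigma_n},\vec Y_{t\wedge\sigma_n})|\Big]
\le\|\nabla_Vf\|_\infty\max_{i=1,\dots,N}\frac{1}{M_i}E^{P_m}\Big[\sup_{s\in[0,T]}|\eta_i(s)|\Big],
\]
which, by Lemma 3.5.1(4), converges to 0 as $m\to 0$.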

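By Lemma 5.1.1, in order to prove that any cluster point of (5.1) is a martingale, it suffices to prove that any cluster point of
\[
\Big\{f(\vec X_{t\wedge\sigma},\vec Y_{t\wedge\sigma})-f(\vec X_0,\vec Y_0)-\int_0^{t\wedge\sigma}Lf(\vec X_s,\vec V_s)\,ds\Big\}_t
\]
is a martingale. Since $f\in C_0^\infty(D_0\times\mathbf{R}^{dN})$ (notice that all the terms involved except $M_i(t)$ are continuous with respect to $t$), we have
\[
\int_0^{t\wedge\sigma}\nabla_Vf(\vec X_s,\vec Y_s)\cdot\nabla\tilde U(\vec X_s)\,ds=0,
\]
so we obtain by Itô's formula and the definition of $Y_i(t)$ that
\begin{align*}
f(\vec X_{t\wedge\sigma},\vec Y_{t\wedge\sigma})-f(\vec X_0,\vec Y_0)
&=\int_0^{t\wedge\sigma}\nabla_Xf(\vec X_s,\vec Y_s)\cdot\vec V_s\,ds\\
&\quad+\sum_{i=1}^N\frac{1}{M_i}\int_0^{t\wedge\sigma}\nabla_{V_i}f(\vec X_s,\vec Y_{s-})\cdot dM_i(s)+(\mathrm{II})+(\mathrm{III})+(\mathrm{IV}),
\end{align*}
with
\begin{align*}
(\mathrm{II})&=\sum_{i=1}^N\frac{1}{M_i}\int_0^{t\wedge\sigma}\nabla_{V_i}f(\vec X_s,\vec Y_s)\cdot dP_i^1(s),\\
(\mathrm{III})&=\sum_{l_1,l_2=1}^N\sum_{k_1,k_2=1}^d\frac{1}{2M_{l_1}M_{l_2}}\int_{0+}^{t\wedge\sigma}\frac{\partial^2f}{\partial V_{l_1}^{k_1}\partial V_{l_2}^{k_2}}(\vec X_s,\vec Y_{s-})\,d[M_{l_1}^{k_1},M_{l_2}^{k_2}]_s,\\
(\mathrm{IV})&=\sum_{0<s\le t\wedge\sigma}\Big\{f(\vec X_s,\vec Y_s)-f(\vec X_s,\vec Y_{s-})-\sum_{l=1}^N\frac{1}{M_l}\nabla_{V_l}f(\vec X_s,\vec Y_{s-})\cdot\Delta M_l(s)\\
&\qquad\qquad-\frac12\sum_{l_1,l_2=1}^N\frac{1}{M_{l_1}M_{l_2}}\nabla^2_{V_{l_1}V_{l_2}}f(\vec X_s,\vec Y_{s-})(\Delta M_{l_1}(s))(\Delta M_{l_2}(s))\Big\}.
\end{align*}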

The term $\sum_{i=1}^N\frac{1}{M_i}\int_0^{t\wedge\sigma}\nabla_{V_i}f(\vec X_s,\vec Y_{s-})\cdot dM_i(s)$ is already a martingale since $\{M_i(t)\}_t$, $i=1,\dots,N$, are martingales and $\nabla_Vf$ is bounded, hence it remains a martingale when taking $m\to 0$. So it suffices to show that
\[
\int_0^{t\wedge\sigma}\nabla_Xf(\vec X_s,\vec Y_s)\cdot\vec V_s\,ds+(\mathrm{II})+(\mathrm{III})+(\mathrm{IV})-\int_0^{t\wedge\sigma}Lf(\vec X_s,\vec V_s)\,ds\to 0.
\]
The fact that the difference between $\int_0^{t\wedge\sigma}\nabla_Xf(\vec X_s,\vec Y_s)\cdot\vec V_s\,ds$ and $\int_0^{t\wedge\sigma}\nabla_Xf(\vec X_s,\vec V_s)\cdot\vec V_s\,ds$, its corresponding term in $\int_0^{t\wedge\sigma}Lf(\vec X_s,\vec V_s)\,ds$, converges to 0 is a direct consequence of Lemma 5.1.1. In the following sections, we show the convergence of the other terms. Precisely, we show that when $m\to 0$,
\begin{align*}
(\mathrm{II})-\int_0^{t\wedge\sigma}\sum_{i,j=1}^N\sum_{k,l=1}^dV_j^l(s)\,b_{ik,jl}(\vec X_s)\frac{\partial}{\partial V_i^k}f(\vec X_s,\vec V_s)\,ds&\to 0, \tag{5.2}\\
(\mathrm{III})-\int_0^{t\wedge\sigma}\sum_{i,j=1}^N\sum_{k,l=1}^da_{ik,jl}(\vec X_s)\frac{\partial^2}{\partial V_i^k\partial V_j^l}f(\vec X_s,\vec V_s)\,ds&\to 0, \tag{5.3}\\
(\mathrm{IV})&\to 0. \tag{5.4}
\end{align*}


5.2. The term (IV)


By using the fact that any jump of Mi (t) is dominated by Cm1/2 (Lemma 3.5.1(3))
and the denition of Mi (t), we show with the help of the properties of a Poisson
point process that (IV) is negligible. Precisely, we show the following.

Lemma 5.2.1.
 
lim E Pm
sup |(IV)| = 0.
m0 0tT

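Proof. Since $f \in C_0^\infty(D_0 \times \mathbb{R}^{dN})$, the third partial derivatives $\nabla_{V_{l_1}^{k_1}}\nabla_{V_{l_2}^{k_2}}\nabla_{V_{l_3}^{k_3}} f$, $l_1, l_2, l_3 = 1, \dots, N$, $k_1, k_2, k_3 = 1, \dots, d$, are bounded. Also, by Lemma 3.5.1(3), the jumps of $\{M_i(t)\}$ satisfy $|\Delta M_i(s)| \le Cm^{1/2}$. Therefore, by Taylor's expansion, with $C_1 = \sum_{l_1,l_2,l_3=1}^N \sum_{k_1,k_2,k_3=1}^d \|\nabla_{V_{l_1}^{k_1}}\nabla_{V_{l_2}^{k_2}}\nabla_{V_{l_3}^{k_3}} f\|_\infty C$, we have

\[
|(\mathrm{IV})| \le \sum_{0<s\le t} \sum_{l_1,l_2,l_3=1}^N \sum_{k_1,k_2,k_3=1}^d
\|\nabla_{V_{l_1}^{k_1}}\nabla_{V_{l_2}^{k_2}}\nabla_{V_{l_3}^{k_3}} f\|_\infty
|\Delta M_{l_1}^{k_1}(s)|\,|\Delta M_{l_2}^{k_2}(s)|\,|\Delta M_{l_3}^{k_3}(s)|
\le C_1 m^{1/2} \sum_{0<s\le t} \sum_{l=1}^N |\Delta M_l(s)|^2 .
\]

Therefore, to complete the proof of this lemma, it suffices to show that $E^{P_m}\big[\sum_{0<s\le T} |\Delta M_i(s)|^2\big]$ is bounded for $m > 0$, which we are now going to show.
We have by the definition of $\{M_i(t)\}$ that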
4m1/2
Mi (t) = (dr, dx, dv)
N du
(0,t]E 0

Ui (Xi (r ) 0 (m1/2 u 2, x, v; X(r


 ))),

so


4m1/2
2
|Mi (s)| = N (dr, dx, dv) du
0<st (0,t]E 0

2
Ui (Xi (r ) 0 (m1/2 u 2, x, v; X(r
 ))) .

Recall that N is the Poisson point process with intensity m1 ( 12 |v|2 )dr(dx, dv).
Therefore, since

|Ui (Xi (r ) 0 (m1/2 u 2, x, v; X(r


 )))| Ui 1[0,R +1) (|x|),
0

we get that

E Pm |Mi (s)|2
0<sT

4m1/2
= E Pm N (dr, dx, dv) du
(0,T ]E 0

2
Ui (Xi (r ) 0 (m1/2 u 2, x, v; X(r
 )))


1 2
m1
|v| 1[0,R0 +1) (|x|)(4m1/2 )2 Ui 2 dr(dx, dv)
[0,T ]E 2

1 2
16 2 Ui 2 T (2(R0 + 1))d1 |v| |v|dv,
Rd 2
which is finite by our assumption.
This completes the proof of our assertion.
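
The scaling behind Lemma 5.2.1 can be summarized in a single chain of inequalities. The following is only a sketch under the bounds stated in the proof; the symbol $K$ is a hypothetical name, not used in the text, for a uniform bound (in $m$ and $l$) on $E^{P_m}\big[\sum_{0<s\le T}|\Delta M_l(s)|^2\big]$.

```latex
% Sketch: K is an assumed uniform bound on
% E^{P_m}[ \sum_{0<s\le T} |\Delta M_l(s)|^2 ], l = 1,...,N.
% The supremum over t is attained at t = T because the sum of squared
% jumps is nondecreasing in t.
\begin{align*}
E^{P_m}\Big[\sup_{0\le t\le T}|(\mathrm{IV})|\Big]
  &\le C_1 m^{1/2} \sum_{l=1}^N
     E^{P_m}\Big[\sum_{0<s\le T}|\Delta M_l(s)|^2\Big] \\
  &\le C_1 N K\, m^{1/2} \;\longrightarrow\; 0 \qquad (m\to 0).
\end{align*}
```

Heuristically, each squared jump is of order $m$ while the intensity of the jumps is of order $m^{-1}$, so the expected sum of squared jumps stays of order one and the extra factor $m^{1/2}$ from the third-order Taylor remainder gives the convergence.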

5.3. The term (III)


For the term (III), we show in this subsection that (5.3) holds, i.e. when $m \to 0$,
(III) corresponds to the quadratic term of the generator $L$.
By Lemma 4.4.7, we have that

\[
\int_{0+}^t \nabla_{V_{l_1}^{k_1}}\nabla_{V_{l_2}^{k_2}} f(X_s, Y_s)\,d[M_{l_1}^{k_1}, M_{l_2}^{k_2}]_s
= m \int_{0+}^t\!\!\int_E \nabla_{V_{l_1}^{k_1}}\nabla_{V_{l_2}^{k_2}} f(X_s, Y_s)\,
A_{l_1 k_1}(s, x, v)\,A_{l_2 k_2}(s, x, v)\,N(ds, dx, dv).
\]

Let

\[
(\mathrm{III}') = \sum_{l_1,l_2=1}^N \sum_{k_1,k_2=1}^d \frac{1}{2M_{l_1}M_{l_2}}
\int_{0+}^t \nabla_{V_{l_1}^{k_1}}\nabla_{V_{l_2}^{k_2}} f(X_s, Y_s)
\Big( \int_E A_{l_1 k_1}(s, x, v)\,A_{l_2 k_2}(s, x, v)\,\rho\big(\tfrac{1}{2}|v|^2\big)\,\nu(dx, dv) \Big)\,ds.
\]

Then we have the following. The reason is intuitively as follows: when subtracting
(III$'$), we are subtracting the corresponding expectation, so the resulting quantity
is controlled by its variance, which converges to 0.

Lemma 5.3.1.
\[
\lim_{m\to 0} E^{P_m}\Big[\sup_{0\le t\le T} |(\mathrm{III}) - (\mathrm{III}')|\Big] = 0.
\]

Proof. By denition, N (ds, dx, dv) is the Poisson point process with intensity
(ds, dx, dv) = m1 ( 12 |v|2 )ds(dx, dv). Also, notice that there exists a constant
C > 0 such that |Alk (s, x, v)| C1[0,R0 +1] (|x|). Let C1 = 2C 2 fV V (T ((2R0 +

1))d1 Rd ( 12 |v|2 )|v|dv)1/2 , which is nite by assumption. Then by Doobs inequal-
ity, for any l1 , l2 = 1, . . . , N and k1 , k2 = 1, . . . , d, we have
  t 
 
E Pm
sup  mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s)(N )(ds, dx, dv)
0tT 0 E l 1 l 2

  t

E Pm sup  mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s)
0tT 0 l1 l2
E

2 1/2

(N )(ds, dx, dv)

T
2E Pm mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s)
0 E l1 l2

2 1/2
(N )(ds, dx, dv)

 1/2
T
= 2E Pm (mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s))2 (ds, dx, dv)
0 E l1 l2

 1/2
T
2 1 1 2
2C fV V m 1[0,R0 +1] (|x|)m |v| ds(dx, dv)
0 E 2

C1 m1/2 .

This completes the proof of our assertion.
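
The key step in the proof above is the second-moment identity for integrals against a compensated Poisson point process, followed by a count of powers of $m$. As a reminder, here is a sketch in generic notation that is an assumption of this note rather than the paper's own: $N$ is a Poisson point process on $[0,T]\times E$ with intensity measure $\hat N$, $g$ is a predictable integrand with $E\int g^2\,d\hat N<\infty$, and $c$, $\kappa$ are hypothetical names for a constant and a finite measure on $E$.

```latex
% Sketch: isometry for compensated Poisson integrals (generic notation).
\[
E\Big[\Big(\int_0^T\!\!\int_E g\; d(N-\hat N)\Big)^2\Big]
  = E\Big[\int_0^T\!\!\int_E g^2\; d\hat N\Big].
\]
% Power counting, assuming |g| \le c\,m and \hat N(ds,dx,dv) = m^{-1}\,ds\,\kappa(dx,dv)
% with \kappa(E) < \infty on the support of g:
\[
E\Big[\int_0^T\!\!\int_E g^2\; d\hat N\Big]
  \le c^2 m^2 \cdot m^{-1}\, T\, \kappa(E) = c^2 T \kappa(E)\, m ,
\]
% so the L^2 norm of the compensated integral is O(m^{1/2}); Doob's
% inequality only contributes the constant factor 2 for the supremum in t.
```

This is consistent with the final bound $C_1 m^{1/2}$ obtained in the proof.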

Lemma 5.3.1 combined with Corollary 3.2.3 implies that (5.3) holds, i.e.
after taking the limit $m \to 0$, (III) corresponds to the term
$\sum_{i,j=1}^N \sum_{k,l=1}^d a_{ik,jl}(X)\, \frac{\partial^2}{\partial V_i^k\,\partial V_j^l}$.

5.4. The term (II)


In this subsection, we deal with the term (II). The basic idea is the same as up
to now: exploit the fact that the variance of the corresponding Poisson point process
is small (see Lemmas 5.4.1 and 5.4.7). Proposition 3.6.4 is also used, to derive the
limit, which gives us $z(\cdot\,; x, v, X, V, a)$.

Recall that Pi1 is given by Pi1 (t) = Vi02 (t) Vi05 (t). So we have the
decomposition
t
fV (Xs , Ys ) dPi1 (s)
0
t 
= fV (Xs , Ys )1[4m1/2 ,) (s) 
fi (s, r, x, v) (dr, dx, dv) ds
0 RE
t 
+ fV (Xs , Ys )1[4m1/2 ,) (s) $
F 05 (s, r, x, v)(dr, dx, dv) ds
i
0 RE
t
= fV (Xs , Ys )1[4m1/2 ,) (s)
0

 $ 05
(fi (s, r, x, v) + Fi (s, r, x, v)) (dr, dx, dv) ds
RE
t
+ fV (Xs , Ys )1[4m1/2 ,) (s)
0

$ 05
Fi (s, r, x, v)((dr, dx, dv) (dr, dx, dv)) ds. (5.5)
RE

We first show in the following lemma that the second term on the right-hand
side above is negligible.

Lemma 5.4.1.
 
 t
lim E Pm
sup  fV (Xs , Ys )1[4m1/2 ,) (s)
m0 0tT 0
 
$ 
F 05 (s, r, x, v)((dr, dx, dv) (dr, dx, dv)) ds = 0.
i 
RE

Proof. As mentioned, this result intuitively rests on the fact that only the variance
of the Poisson point process is involved. We prove it by first performing a suitable
decomposition (see (5.6)) and then showing that each of the resulting terms is small enough
(see Lemmas 5.4.2–5.4.4).
Let

$
R(s, r, x, v) = F 05 2
r ) 0 (m1/2 (s r), x, v; X(
i (s, r, x, v) Ui (Xi (
 r )))

r ) 0 (m1/2 (s r), x, v; X(s))


{Xi (s) Xi ( 

+ 0 (m1/2 (s r), x, v; X(


 r ))}.

Then we have the decomposition


t
fV (Xs , Ys )1[4m1/2 ,) (s)
0

$
F 05 (s, r, x, v)((dr, dx, dv) (dr, dx, dv)) ds
i
RE

= (5I) + (5II) + (5III), (5.6)

where
t
(5I) = fV (Xs , Ys )1[4m1/2 ,) (s)
0

R(s, r, x, v)( (dr, dx, dv) (dr, dx, dv)) ds,
RE
t
(5II) = fV (Xs , Ys )1[4m1/2 ,) (s)
0

2 Ui (Xi (
r ) 0 (m1/2 (s r), x, v; X(
 r )))
RE

{(Xi (s) Xi ( r )) ( 0 (m1/2 (s r), x, v; X(s))


r ) (s r)Vi ( 

0 (m1/2 (s r), x, v; X(


 r ) + (s r)V
 (
r )))}

( (dr, dx, dv) (dr, dx, dv)) ds,
t
(5III) = fV (Xs , Ys )1[4m1/2 ,) (s)
0

g(r, s, x, v)( (dr, dx, dv) (dr, dx, dv)) ds,
RE

with

g(r, s, x, v) = 2 Ui (Xi (
r ) 0 (m1/2 (s r), x, v; X(
 r )))

r ) 0 (m1/2 (s r), x, v; X(


{(s r)Vi (  r ) + (s r)V
 (
r ))
+ 0 (m1/2 (s r), x, v; X(
 r ))}.

So Lemma 5.4.1 follows from Lemmas 5.4.2–5.4.4 below:

Lemma 5.4.2.
\[
\lim_{m\to 0} E^{P_m}\Big[\sup_{0\le t\le T} |(5\mathrm{III})|\Big] = 0.
\]

Proof. First notice that 0 r , hence |V r )| N n. Let C1 =


 (
2  N n). Then by Corollary 3.2.3 and Lemma 4.3.4, we
Ui (2 n + 4C

have that
|g(r, s, x, v)|
2 Ui 1[0,2m1/2 ) (|s r|)(|s r||Vi (  r||V
r )| + C|s  (
r )|)1[0,R0 +1) (|x|)
m1/2 C1 1[0,2m1/2 ) (|s r|)1[0,R0 +1) (|x|).
Also, it is easy to see that g(r, s, x, v) is Fr -measurable. Therefore by
N
assumption, with c = i=1 Ui and C2 = fV T C1 (2(R0 +
1))(d1)/2 (4 Rd c ( 12 |v|2 )|v|dv)1/2 , we have
 
E Pm
sup |(5III)|
0tT
  
T  
fV E Pm
ds  g(r, s, x, v)( (dr, dx, dv) (dr, dx, dv))
0 RE

 2 1/2
T 
Pm 

fV dsE  g(r, s, x, v)( (dr, dx, dv) (dr, dx, dv))
0 RE
T  1/2
2
= fV ds E Pm
[|g(r, s, x, v)| ](dr, dx, dv)
0 RE

T
fV ds (m1/2 C1 1[0,2m1/2 ) (|s r|)1[0,R0 +1) (|x|))2
0 RE
 1/2
1 2

N
1 1/2
m |v| + Ui (x m rv Xi,0 ) dr(dx, dv)
2 i=1

C2 m1/4 ,
which converges to 0 as $m \to 0$.

Lemma 5.4.3.
\[
\lim_{m\to 0} E^{P_m}\Big[\sup_{0\le t\le T} |(5\mathrm{I})|\Big] = 0.
\]

Proof. By the definition of $R(s, r, x, v)$, a Taylor expansion, Corollary 3.2.3 and
Lemma 4.3.4, we get that
|R(s, r, x, v)| 3 Ui |(Xi (s) Xi (
r ))
( 0 (m1/2 (s r), x, v; X(s))
 0 (m1/2 (s r), x, v; X(
 r )))|2
 2 |Xi (s) Xi (
(1 + C) r )|2 1[0,2m1/2 ) (|s r|)1[0,R0 +1) (|x|).
Notice that when |s r| 2m1/2 , since s, r [0, ], we get that |Xi (s) Xi (
r )|
n|s r| n4m1/2 . So the above gives us that
 2 m1[0,2m1/2 ) (|s r|)1[0,R +1) (|x|).
|R(s, r, x, v)| (4n (1 + C)) 0


 2 4 (2(R0 + 1))d1
Therefore, with C1 = 2 fV T (4n (1 + C)) c ( 12 |v|2 )|v|dv,
Rd
we have
 
E Pm sup |(5I)|
0tT
T
2 fV ds E Pm [1[0,] (s)|R(s, r, x, v)|](dr, dx, dv)
0 RE
T
2 fV ds  2 m1[0,2m1/2 ) (|s r|)1[0,R +1) (|x|)
(4n (1 + C)) 0
0 RE

1 2

N
1 1/2
m |v| + Ui (x m rv Xi,0 ) dr(dx, dv)
2 i=1

C1 m1/2 ,
which converges to 0 as $m \to 0$.

Lemma 5.4.4.
\[
\lim_{m\to 0} E^{P_m}\Big[\sup_{0\le t\le T} |(5\mathrm{II})|\Big] = 0.
\]

Proof. First, by Lemma 4.3.4,


| 0 (m1/2 (s r), x, v; X(s))
 0 (m1/2 (s r), x, v; X(
 r ) + (s r)V
 (
r ))|
 X(s)
C|  X(
 r ) (s r)V
 (
r ))|.
Notice that if s 4m1/2 and |s r| 2m1/2 , then by the denition of r, we
always have that r s. Therefore,
s

X(s) X(
 r ) (s r)V
 (
r) =  (u) V
(V  (
r ))du. (5.7)
e
r
For any l s ( ), we have that |Xi (l) Xj (l)| Ri + Rj , i = j, which implies
 X(l))
that i U(  = 0, i = 1, . . . , N . Therefore we have by Lemma 3.5.1 that
u
1 d 1
Vi (u) Vi (
r) = Pi (l)dl + i (u) i (
r)
Mi e dl
r
u 
+ Mi (u) Mi ( r) m 1/2 
i U(X(l))dl

e
r
u 
1 d 1
= Pi (l)dl + i (u) i (
r ) + Mi (u) Mi (
r) . (5.8)
Mi e
r dl
Let
 
am = 4m1/2 + 2 max E Pm sup |i (u)| + (4m1/2 )1/2 .
i=1,...,N 0uT

Then by Lemma 3.5.1(4), we have am 0 as m 0. Notice that for s [0, T


], |s r| 2m1/2 implies |s r| 4m1/2 . Let C be the constant in (4.13),

N
and let C1 = 4 i=1 M1i (supm(0,1] supt[0,T ] E Pm [| dl
d 1
Pi (l)|2 ]1/2 + 1 + C), which is
nite by Lemma 3.5.1. Then we get by (5.7), (5.8) and (4.13) that


E Pm [|X(s) X(
 r ) (s r)V (r ))|]

  
N
1 Pm 
 s u
d 1 
E  du Pi (l)dl
i=1
M i re e
r dl
 s   s 
   
+  r )) + 
du(i (u) i ( r ))
du(Mi (u) Mi (
re e
r
 2 1/2

1  
N
d
(4m1/2 )2 sup sup E Pm  Pi1 (l)
i=1
Mi m(0,1] t[0,T ] dl

s
+ 4m1/2 2 sup E Pm [|i (u)|] + r )|2 ]1/2
duE Pm [|Mi (u) Mi (
0uT e
r

C1 (4m1/2 )am . (5.9)



Also,by properties  of Poisson point processes, we have in general E Pm [ f d ] =
E Pm [ f dm ] = E[f ]dm for any measurable f : R E Conf(R E) R. Let
C be as in Lemma 4.3.4, and let C2 = 2(1 + C) f  V 2 Ui T C1 4 (2(R0 +

1))d1 Rd c ( 12 |v|2 )|v|dv. Then by Corollary 3.2.3, Lemma 4.3.4 and (5.9), we get
that
  T
E Pm sup |(5II)| fV 
ds(1 + C)
0tT 0

E Pm 1[0,) (s) 2 Ui |X(s)
 X(
 r ) (s r)V
 (
r ))|
RE

1[0,2m1/2 ) (|s r|)1[0,R0 +1) (|x|)



( (dr, dx, dv) + m (dr, dx, dv))
T
 V 2 Ui
= 2(1 + C) f ds
0


E Pm [1[0,) (s)|X(s) X(
 r ) (s r)V
 (
r ))|]
RE

1[0,2m1/2 ) (|s r|)1[0,R0 +1) (|x|)m (dr, dx, dv))


C2 am ,

which converges to 0 as $m \to 0$. This completes the proof of Lemma 5.4.4.

Lemmas 5.4.2–5.4.4 complete the proof of Lemma 5.4.1.



We next deal with the first term on the right-hand side of (5.5). We first make
the decomposition
$
fi (s, r, x, v) + F 05 1/2
i (s, r, x, v) = Ui (Xi (s) x(s, (r, x, m v)))
Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))


= f1 2 3
i (s, r, x, v) + fi (s, r, x, v) + fi (s, r, x, v),

with

f1
i (s, r, x, v) = Ui (Xi (s) x(s, (r, x, m
1/2
v)))
Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))


2 Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))




(x(s, (r, x, m1/2 v)) 0 (m1/2 (s r), x, v; X(s))),




f2 2 0
i (s, r, x, v) = Ui (Xi (s) (m
1/2
(s r), x, v; X(s)))


(x(s, (r, x, m1/2 v)) 0 (m1/2 (s r), x, v; X(s))




m1/2 z(m1/2 (s r); x, v, X(s),


 V (s), m1/2 (s r))),

f3 2 0
i (s, r, x, v) = Ui (Xi (s) (m
1/2
(s r), x, v; X(s)))


m1/2 z(m1/2 (s r); x, v, X(s),


 V (s), m1/2 (s r)),

where $z$ is as defined in (2.3).


Although we are not subtracting the expectations of the corresponding terms,
the terms involving $f_i^1(s, r, x, v)$ and $f_i^2(s, r, x, v)$ are negligible, as $f_i^1(s, r, x, v)$ and
$f_i^2(s, r, x, v)$ themselves are small enough. Indeed, it is easy to see by Taylor expansion
that $f_i^1(s, r, x, v)$ is of higher order than $x(s, (r, x, m^{-1/2}v)) - \psi^0(m^{-1/2}(s - r), x, v; X(s))$,
so the term corresponding to it is clearly negligible. The fact that the term involving
$f_i^2(s, r, x, v)$ is also negligible comes from Proposition 3.6.4. We formulate the result
in the following.

Lemma 5.4.5. We have that


  t

lim E Pm sup  fV (Xs , Ys )1[4m1/ ,] (s)
m0 0tT 0
 

f
i
k (s, r, x, v) (dr, dx, dv) ds = 0,
 k = 1, 2.
RE

Proof. We rst show the assertion for k = 1. First notice that by Corollary 3.2.3
and Proposition 3.6.5, we have that for s [0, T ], f 1
i (s, r, x, v) = 0 only if
1/2 1/2 1/2
|x| R0 + 1 and s r [m , 2m ]. Since s [4m , T ], this implies
that r m1/2 [0, T ]. So in this region, we have by Proposition 3.6.3 that

 > 0 such that


there exists a constant C
|x(s, (r, x, m1/2 v)) 0 (m1/2 (s r), x, v; X(s))|


m1/2 C(2  m1/2 .
+ |m1/2 (r s)|) 4C
 )2 , we have
So with C1 = 3 Ui (4C

|f1 3
i (s, r, x, v)| Ui |x(s, (r, x, m
1/2
v))
0 (m1/2 (s r), x, v; X(s))|
 2

C1 m1[0,2m1/2 ] (|s r|)1[0,R0 +1] (|x|).



Let C2 = fV T C12 (2(R0 + 1))d1 4 Rd c ( 12 |v|2 )|v|dv. Then by the denition
of , we get
  t  
  
E Pm 
sup  fV (Xs , Ys )1[4m1/ ,] (s) fi (s, r, x, v) (dr, dx, dv) ds
1
0tT 0 RE
T
ds fV C1 m 1[0,2m1/2 ] (|s r|)1[0,R0 +1] (|x|)
0 RE

1 2

N
1 1/2
m |v| + Ui (x m rv Xi,0 ) dr(dx, dv)
2 i=1

C2 m1/2 ,
which converges to 0 as $m \to 0$.
The assertion for k = 2 is similar. Again, for s [0, T ], we have by Corol-
lary 3.2.3 that f2
i (s, r, x, v) = 0 only if |x| R0 + 1 and s r [m
1/2
, 2m1/2 ].
1/2
For any s and r satisfying |s r| 2m , we have by Proposition 3.6.4 that
|x(s, (r, x, m1/2 v)) 0 (m1/2 (s r), x, v; X(s))


m1/2 z(m1/2 (s r); x, v, X(s),


 V (s), m1/2 (s r))|
Cm1/2 (1 + 2 )2 m1/2 .

Let C3 = 4 fV 2 Ui C(1 + 2 )2 T (2(R0 + 1))d1 Rd c ( 12 |v|2 )|v|dv. Then
  t  
  
E Pm 
sup  fV (Xs , Ys )1[4m1/ ,] (s) fi (s, r, x, v) (dr, dx, dv) ds
2
0tT 0 RE
T
ds fV 2 Ui C(1 + 2 )2 m 1[0,2m1/2 ] (|s r|)1[0,R0 +1] (|x|)
0 RE

1 2

N
1 1/2
m |v| + Ui (x m rv Xi,0 ) dr(dx, dv)
2 i=1

C3 m1/2 ,
which converges to 0 as $m \to 0$.
This completes the proof of our lemma.

Before dealing with the main term, namely the one involving $f_i^3(s, r, x, v)$, let
us prove the following continuity of $z(t; x, v, X, V, a)$ with respect to $X$ and $V$, by
again using Gronwall's Lemma.

Lemma 5.4.6. For any $T_1 > 0$ and $b, A, B > 0$, there exists a constant $C =
C(T_1, b, A, B)$ such that
\[
|z(t; x, v, X^1, V^1, a) - z(t; x, v, X^2, V^2, a)| \le C\big( \|X^1 - X^2\| + \|V^1 - V^2\| \big)
\]
for any $t \in [-\tau, T_1]$, $|a| \le A$, $\|X^1\|, \|X^2\| \le B$, $\|V^1\|, \|V^2\| \le b$.

 V
Proof. First notice that for any a, x, v, X,  , by using the same method as in the
proofs of Lemmas 3.6.3, 4.3.4, etc., with the help of Gronwalls Lemma, we get
easily that for any T0 > 0,
PN
2 Ui )T0
|z(t)| |z  (t)| (T0 + |a|) V
 T0 e(1+ i=1 , |t| T0 . (5.10)

For the sake of simplicity, we write z k (t) = z(t; x, v, X  k, V


 k , a), k = 1, 2, and let
1 2
(t) = z (t) z (t). Then we have that in our domain, there exists a constant C0 =
C0 (T, b, A, B) > 0 such that |z 1 (t)| C0 . Let C  > 0 be the constant in Lemma 4.3.4
N 3  + 1)(C0 + (T + A)b) + 2 Ui (1 + T + A)}. Then
and let C = i=1 { Ui (C
by denition and Lemma 4.3.4,
 2   N
d  
2
  Ui ( 0 (t, x, v; X  1 ) Xi1 )(z 1 (t) (t + a)V  1)
 dt2 (t) = 
i=1


N 
+ 2 Ui ( 0 (t, x, v; X
 2 ) Xi2 )(z 2 (t) (t + a)V  2 )

i=1
 N



=  {2 Ui ( 0 (t, x, v; X 1 ) X 1 ) 2 Ui ( 0 (t, x, v; X  2 ) X 2 )}
 i i
i=1

(z 1 (t) (t + a)V
 1)


N 
2 0 2
Ui ( (t, x, v; X ) Xi2 )(z 1 (t) 2
z (t) (t + a)(V  ))
 V 1 2

i=1

N
 + 1) X
3 Ui (C 1X
 2 (|z 1 (t)| + (T + |a|) V
 1 )
i=1

N
+ 2 Ui (|z 1 (t) z 2 (t)| + (T + |a|) V
1V
 2 )
i=1

1X
C( X  2 + V
1V
 2 ) + C|z 1 (t) z 2 (t)|
1X
= C( X  2 + V
1V
 2 ) + C|(t)|.

Let g(t) = |((t), dt


d
(t))|. Then
     2 
d     
 g(t)  d (t) +  d (t)
 dt   dt   dt2 

1X
C( X  2 + V
 1 V 2 ) + (C + 1)g(t).

Hence if we set 
g(t) = g(t ), then g(0) = d
(0)
dt g = 0 by denition, and the
discussion above gives us that
d 1 X  2 + V
1V  2 ) + (C + 1)
g(t) C( X g (t).
dt
So we have for any t [0, T1 + ] that
t
1 2 1 2
g(t) Ct( X X + V V ) + (C + 1)
    g(s)ds.
0
This combined with Gronwall's Lemma implies that
1 X
g(t) C(T1 + )( X  2 + V
 1 V 2 )e(C+1)(T1 + ) , t [0, T1 + ],
which completes the proof of our assertion.
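
For convenience, the form of Gronwall's Lemma used in the last step can be stated as follows. This is a sketch in generic notation: $u$, $\alpha$, $\beta$ and $S$ are hypothetical names, with $u$ playing the role of the shifted function estimated above, $\alpha$ the constant multiplying $t$, $\beta = C + 1$, and $S$ the length of the time interval.

```latex
% Sketch: integral form of Gronwall's Lemma with a linear inhomogeneity.
% Assumptions: u : [0,S] -> [0,\infty) continuous, \alpha, \beta \ge 0.
\[
u(t) \le \alpha t + \beta \int_0^t u(s)\,ds \quad (0 \le t \le S)
\qquad\Longrightarrow\qquad
u(t) \le \alpha t\, e^{\beta t} \le \alpha S\, e^{\beta S} \quad (0 \le t \le S).
\]
% Reason: the general Gronwall bound reads
%   u(t) \le a(t) + \int_0^t a(s)\,\beta\, e^{\beta (t-s)}\,ds ,  a(t) = \alpha t,
% and since a is nondecreasing,
%   \int_0^t a(s)\,\beta\, e^{\beta(t-s)}\,ds \le a(t)\,(e^{\beta t} - 1).
```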

Now, let us come back to the term corresponding to $f_i^3(s, r, x, v)$. We
make once more a decomposition of the form
t 
fV (Xs , Ys )1[4m1/ ,] (s) f
i
3 (s, r, x, v) (dr, dx, dv) ds = (V 1) + (V 2),

0 RE
with
t 
(V1) = fV (Xs , Ys )1[4m1/ ,] (s) 3
fi (s, r, x, v)(dr, dx, dv) ds,
0 RE
t 
(V2) = fV (Xs , Ys )1[4m1/ ,] (s) 3
fi (s, r, x, v)( )(dr, dx, dv) ds.
0 RE

The term (V1) (after a slight modification to get rid of the restriction that
$s \ge 4m^{1/2}\tau$) is actually our goal term. The term (V2), being the variance with
respect to the corresponding Poisson point process, is expected to be negligible. We
show the second assertion in Lemma 5.4.7.
Notice that up to n , V  (t) and X(t)
 are bounded. Also, m1/2 |s r| 2 and
2 0 1/2
|x| R0 + 1 if Ui (Xi (s) (m (s r), x, v; X(s)))
 = 0. So by (5.10), in
1/2 1/2
this case, z(m (s r); x, v, X(s), V (s), m
  (s r)) is bounded. So by the
denition of f 3 2
i and the boundedness of Ui , we get that there exists a constant
C > 0 such that
|f3 (s, r, x, v)| Cm1/2 1
i 1/2 (|s r|)1
[0,2m ] (|x|).[0,R0 +1]

Lemma 5.4.7.
\[
\lim_{m\to 0} E^{P_m}\Big[\sup_{0\le t\le T} |(\mathrm{V2})|\Big] = 0.
\]

Proof. Let

R3 (s, r, x, v) = f3 2
r ) 0 (m1/2 (s r), x, v; X(
i (s, r, x, v) Ui (Xi (
 r )))

m1/2 z(m1/2 (s r); x, v, X( r ); m1/2 (s r)).


 r ), V (

Then

(V2) = (V21) + (V22),

with
t
(V21) = fV (Xs , Ys )1[4m1/ ,] (s)
0

R3 (s, r, x, v)( )(dr, dx, dv) ds,
RE
t
(V22) = fV (Xs , Ys )1[4m1/ ,] (s)
0

2 Ui (Xi (
r ) 0 (m1/2 (s r), x, v; X(
 r )))
RE

m1/2 z(m1/2 (s r); x, v, X( r ), m1/2 (s r))


 r ), V (

( )(dr, dx, dv) ds.

We rst deal with (V21). We have by Corollary 3.2.3 and Proposition 3.6.5 that
R3 (s, r, x, v) = 0 if |x| R0 + 1 or if |s r| 2m1/2 . For s [0, T ] and
|s r| 2m1/2 , we have by denition PN
|s r| 4m1/2 . Let C1 = 2 Ui C +
3  0 + 2 )nN 2 e
Ui (1 + C)(T (1+ 2
i=1 Ui 2 ) , where C is the constant in

Lemma 5.4.6, and C  is the one in Lemma 4.3.4. Then by (5.10), Lemmas 5.4.6
and 4.3.4, we have

|R3 (s, r, x, v)| = 2 Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))


m1/2 {z(m1/2 (s r); x, v, X(s),


  (s), m1/2 (s r))
V
z(m1/2 (s r); x, v, X( r ), m1/2 (s r))}
 r ), V (

+ {2 Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))




2 Ui (Xi (
r ) 0 (m1/2 (s r), x, v; X(
 r )))}

m1/2 z(m1/2 (s r); x, v, X(
 r ), V r ), m1/2 (s r))
 (

2 Ui m1/2 |z(m1/2 (s r); x, v, X(s),


  (s), m1/2 (s r))
V
z(m1/2 (s r); x, v, X( r ), m1/2 (s r))|
 r ), V (

+ 3 Ui (|Xi (s) Xi (
r )|
+ | 0 (m1/2 (s r), x, v; X(s))
 0 (m1/2 (s r), x, v; X(
 r ))|)

m1/2 |z(m1/2 (s r); x, v, X( r ), m1/2 (s r))|


 r ), V (

C1 m1/2 ( X(s)
 X(
 r ) + V
 (s) V (
r ) ).
Since |Vi (t)| n until n , we have |X(s)
  r )| N n|s r| 4m1/2 N n. To
X(
estimate the term with respect to V () in the equation above, let
 
am = 4m1/2 + 2 max E Pm sup |i (u)| + (4m1/2 )1/2
i=1,...,N 0uT

as before. Then by Lemma 3.5.1(4), am 0 as m 0.


N
Let C2 = i=1 M1i (supm(0,1] sup0ut E Pm [| du
d
Pi1 (u)|] + 1 + C), where C is
the constant in (4.13). Then we have that
 (s) V
E Pm [|V r )|] C2 am ,
 ( |s r| 2m1/2 .
Indeed, since s, r [0, 0 n ], we have by Lemma 3.5.1(1) and (3.31) that
s 
1 d 1
Vi (s) Vi (
r) = Pi (l)dl + i (s) i (
r ) + Mi (s) Mi (
r) ,
Mi e dl
r
hence by Lemma 3.5.1(2) and (4.13),

 
d
1 
Pm  d

E [|V (s) V (
Pm   r )|] |s r| sup E  Pi (u)
1

i=1
Mi 0uT du
  
+ 2E Pm sup |i (u)| + E Pm [|Mi (s) Mi (
r )|]
0uT


 
d
1  d 
|s r| sup E Pm  Pi1 (u)
i=1
Mi 0ut du
  
1/2
+ 2E Pm
sup |i (u)| + C|s r|
0uT

C2 am ,
which gives us our assertion.
Combining the above and the denition  =
of , we get that with C
 1 2
8 T fV (2(R0 + 1)) C1 (4 N n + C2 ) Rd c ( 2 |v| )|v|dv,
d1
 
E Pm
sup |(V 21)|
0tT
T 
ds fV E Pm 1[0,T ] (s) C1 m1/2 (4m1/2 N n + V
 (s) V
 (
r ) )
0 RE

1[0,2m1/2 ] (|s r|)1[0,R0 +1] (|x|)( + )(dr, dx, dv)

T
2 ds fV E Pm [1[0,T ] (s)C1 m1/2 (4m1/2 N n + C2 am )]
0 RE

1[0,2m1/2 ] (|s r|)1[0,R0 +1] (|x|)(dr, dx, dv)


 1/2 + am ) 0,
C(m as m 0.
To handle the term (V22) is easier. We have |2 Ui (Xi ( r ) 0 (m1/2 (s r),
2

x, v; X(r )))| Ui 1[0,2m1/2 ] (|s r|)1[0,R0 +1] (|x|). Also, for s [0, T ] and
|s r| 2m1/2 , we have by (5.10) that z(m1/2 (s r); x, v, X(  r ), V (
r ),
m 1/2
(sr)) is bounded. Let C 3 be a bound of it, and let 
C = T f C3 ((2(R 0+
 V
1))d1 4 2 Ui Rd c ( 12 |v|2 )|v|dv)1/2 . Then since X(
 r ) is Fr -measurable, by the
denition of Poisson point processes and the denition of , we have
 
E Pm sup |(V 22)|
0tT

 
T 
ds fV E Pm  2 Ui (Xi (
r ) 0 (m1/2 (s r), x, v; X(
 r ))
0 RE

m1/2 z(m1/2 (s r); x, v, X( r ), m1/2 (s r))


 r ), V (
2 1/2

( (dr, dx, dv) (dr, dx, dv))


T +
= ds fV E Pm (2 Ui (Xi (
r ) 0 (m1/2 (s r), x, v; X(
 r ))
0 RE
,
m1/2 z(m1/2 (s r); x, v, X( r ), m1/2 (s r)))2
 r ), V (

1/2
(dr, dx, dv)

T
ds fV ( 2 Ui m1/2 C3 )2 1[0,R0 +1) (|x|)
0 RE
 1/2
1[0,2m1/2 ) (|s r|)(dr, dx, dv)

 1/4 0,
Cm as m 0.
This completes the proof of Lemma 5.4.7.
Up to now, we have shown that all of the terms of (II) except $\sum_{i=1}^N \frac{1}{M_i}$(V1)
are negligible. We are almost done with our discussion of (II), except
for getting rid of the indicator function of $s$ in the definition of (V1). We do it now.

Notice that in the integral domain of (V1), we have s 4m1/2 . So if


Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s))
2  = 0, then r 2m1/2 . If ( 12 |v|2 +
N 1/2
i=1 Ui (x m rv Xi,0 )) = 0 in addition, then |v| 2C0 + 1. Therefore, in this
1/2
case, |m rv| 2 (2C0 +1) R0 , hence since xv = 0, we get |xm1/2 rv| R0 ,

which in turn gives us that ( 12 |v|2 + N i=1 Ui (x m
1/2
rv Xi,0 )) = ( 12 |v|2 ).
Therefore, by denition,
t
(V1) = dsfV (Xs , Ys )1[4m1/ ,] (s)
0

2 Ui (Xi (s) 0 (m1/2 (s r), x, v; X(s)))

RE

m1/2 z(m1/2 (s r); x, v, X(s),


  (s), m1/2 (s r))
V
 
1 2
m1 |v| dr(dx, dv)
2
t
= dsfV (Xs , Ys )1[4m1/ ,] (s)
0
+
du2 Ui (Xi (s) 0 (u, x, v; X(s)))

E
  
z(u; x, v, X(s),
  (s), u) 1 |v|2 (dx, dv) ,
V
2
where, when passing to the last equality, we used the change of variable $u =
m^{-1/2}(s - r)$ for every fixed $s$.
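
The powers of $m$ in this change of variable cancel exactly. The following sketch makes the bookkeeping explicit; $G$ is a hypothetical name for the integrand as a function of the rescaled time, and the exponents $m^{1/2}$ (prefactor of $z$) and $m^{-1}$ (intensity) are taken as in the displayed formula above.

```latex
% Sketch: for fixed s, put u = m^{-1/2}(s - r), i.e. r = s - m^{1/2} u,
% so that dr = m^{1/2}\,du in absolute value, and r \in \mathbb{R}
% corresponds to u \in \mathbb{R}.
\[
\int_{\mathbb{R}} G\big(m^{-1/2}(s - r)\big)\; m^{1/2}\cdot m^{-1}\; dr
  = \int_{\mathbb{R}} G(u)\; m^{1/2}\cdot m^{-1}\cdot m^{1/2}\; du
  = \int_{\mathbb{R}} G(u)\; du .
\]
```

This is why the last expression for (V1) no longer contains $m$ except through the indicator in $s$.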
With the help of this re-expression, we make a decomposition once more,
(V1) = (V11) + (V12),
with
t
(V11) = dsfV (Xs , Ys )
0
+
du2 Ui (Xi (s) 0 (u, x, v; X(s)))

E
  
1 2
z(u; x, v, X(s), V (s), u)
  |v| (dx, dv) ,
2
t
(V12) = dsfV (Xs , Ys )1[0,4m1/2 ] (s)
0
+
du2 Ui (Xi (s) 0 (u, x, v; X(s)))

E
  
1 2
z(u; x, v, X(s), V (s), u)
  |v| (dx, dv) .
2

Notice that for s [0, T ], 2 Ui (Xi (s) 0 (u, x, v; X(s)))


 = 0 only if |u| 2
and |x|
  + R 0 + 1, and z(u; x, v, 
X(s), 
V (s), u) is bounded in this domain. So
(
E du 2
U i (X i (s) 0
(u, x, v; 
X(s)))z(u; x, v, 
X(s), 
V (s), u))( 12 |v|2 )(dx,
dv) is bounded. Let C be a bound of it. Then

|(V12)| 4C fV m1/2 .

This completes the proof of (5.2), i.e. the fact that the term (II) converges
to $\sum_{i=1}^N \frac{1}{M_i}$(V11) as $m \to 0$.

5.5. Conclusion
Combining the results of Secs. 5.1–5.4, and taking the limit $n \to \infty$ at last (notice
that $\sigma_n \to \infty$ a.s.), we get Theorems 2.0.1(2) and 2.0.1(3).
Notice that this also gives us Lemma 3.5.3, by considering each time interval
[n1 , n ], with n , n given by the following: 0 = 0,
 
n = inf t n1 ; X(t)
  , f
B supp U ,
2
n = inf{t n ; X(t)
  , f )},
/ B(supp U n 1.

 , 2f ) RdN )C .
Here f > 0 is chosen such that supp f (B(supp U

6. Case of Two Molecules


In this section, we consider the special case of two molecules with $d \ge 3$ and
spherically symmetric potential functions $U_1$, $U_2$, as described in Theorem 2.0.1(4).
Precisely, in addition to all of the assumptions in Secs. 3–5, we assume from
now on that $d \ge 3$ and there exist functions $h_1, h_2 : [0, \infty) \to \mathbb{R}$ such that
Ui (x) = hi (|x|), i = 1, 2, and, moreover, there exists a constant 0 > 0 such that
(1)i1 hi (s) > 0, (1)i1 hi (s) > 0, s (Ri 0 , Ri ), i = 1, 2. See Sec. 2 for
the explanation of these assumptions. Without loss of generality, we assume that
0 < R1 R2 .
In the following, we show that in this special case, as announced in Sec. 2,
as $m \to 0$, $\{(X(t), V(t))\}_t$ under $P_m$ converges to the reflecting diffusion process
which has generator $L$ and behaves like a collision when the potential ranges of the
two molecules overlap. (See Theorem 6.3.2 for the precise definition of the limiting
process.)
We rst discuss a little bit more about the new potentials U  . We then show
that in our present setting, the condition of Lemma 3.5.2 is satised, and that
when m 0 {the distribution of {(X(t  (t n ))}t under Pm }m is tight in
 n ), V

$d ) with the metric function dis


(W $d = C([0, ); Rd )D([0, ); Rd ) given by
 of W



 1 , 2 ) =
dis( 2n 1 max |x1 (t) x2 (t)| + dis(v1 , v2 )
t[0,n]
n=1

$d , i = 1, 2. Here dis is the Skorohod metric on


for i = (xi (), vi ()) W
D([0, ), R ) dened in Sec. 3.4. Finally, we use these to show the desired
d

convergence.


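6.1. The new potential $\bar U$

Let $p$ and $\bar U$ be as defined in Sec. 3.6, and let $\bar U_0$ be the constant

\[
\bar U_0 = \sum_{i=1}^2 \int_{\mathbb{R}^d} \big( p(U_i(X_i - x)) - p(0) \big)\,dx,
\]

which, as claimed in Sec. 3.6, is the value of $\bar U(X_1, X_2)$ when $X_1$ and $X_2$ are far
enough apart, precisely, when $|X_1 - X_2| \ge R_1 + R_2$. Then

\begin{align*}
\bar U(X_1, X_2) - \bar U_0
&= \int_{\mathbb{R}^d} \Big\{ \big[ p(U_1(X_1 - x) + U_2(X_2 - x)) - p(0) \big] \\
&\qquad\quad - \big[ (p(U_1(X_1 - x)) - p(0)) + (p(U_2(X_2 - x)) - p(0)) \big] \Big\}\,dx \\
&= \int_{\mathbb{R}^d} dx \Big\{ \int_0^{U_1(X_1 - x) + U_2(X_2 - x)} p'(s)\,ds
   - \int_0^{U_1(X_1 - x)} p'(s)\,ds - \int_0^{U_2(X_2 - x)} p'(s)\,ds \Big\} \\
&= \int_{\mathbb{R}^d} dx \Big\{ \int_{U_2(X_2 - x)}^{U_1(X_1 - x) + U_2(X_2 - x)} p'(s)\,ds
   - \int_0^{U_1(X_1 - x)} p'(s)\,ds \Big\} \\
&= \int_{\mathbb{R}^d} dx \int_0^{U_1(X_1 - x)} \big( p'(s + U_2(X_2 - x)) - p'(s) \big)\,ds \\
&= \int_{\mathbb{R}^d} dx \int_0^{U_1(X_1 - x)} ds \int_0^{U_2(X_2 - x)} p''(s + u)\,du.
\end{align*}

Therefore,

\[
\nabla_1 \bar U(X_1, X_2) = \int_{\mathbb{R}^d} dx \int_0^{U_2(X_2 - x)} p''(U_1(X_1 - x) + u)\,du\; \nabla U_1(X_1 - x). \tag{6.1}
\]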
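
The pointwise identity behind this computation is the representation of a mixed second difference as a double integral of the second derivative. As a sanity check, here is a sketch with generic real numbers $a$, $b$ (hypothetical names standing for $U_1(X_1 - x)$ and $U_2(X_2 - x)$) and, purely for illustration, the test function $p(s) = s^2$, which is not the $p$ of Sec. 3.6.

```latex
% Sketch: mixed second difference, for any twice continuously differentiable p.
\[
p(a+b) - p(a) - p(b) + p(0) = \int_0^a ds \int_0^b p''(s+u)\,du .
\]
% Illustration with the test function p(s) = s^2 (so p'' \equiv 2):
\[
(a+b)^2 - a^2 - b^2 + 0 = 2ab
\qquad\text{and}\qquad
\int_0^a ds \int_0^b 2\,du = 2ab .
\]
% Differentiating the right-hand side in a gives \int_0^b p''(a+u)\,du,
% which is the inner integral appearing in (6.1).
```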
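Notice that the integrand in (6.1) is 0 outside of
\[
B_2 = B_{X_1, X_2} = \{ x \in \mathbb{R}^d ;\ |x - X_1| \le R_1,\ |x - X_2| \le R_2 \}.
\]
Therefore,
\[
\nabla_1 \bar U(X_1, X_2) = \int_{B_2} dx \int_0^{U_2(X_2 - x)} p''(U_1(X_1 - x) + u)\,du\; \nabla U_1(X_1 - x). \tag{6.2}
\]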

We will use this expression in the following calculations.

In this subsection, we show, by using the spherical symmetry of the potentials,
that $\nabla_1 \bar U(X_1, X_2)$ has the same direction as $X_2 - X_1$. Therefore, the term
$m^{-1/2} \int_0^t \nabla_i \bar U(X(s))\,ds$ in the decomposition (3.30) of $M_i V_i(t)$ gives us the
reflecting force.
First, we have the following:
Lemma 6.1.1. Let (0, 0 ]. Then there exists a C > 0 such that for any
X1 , X2 Rd satisfying |X1 X2 | [R1 + R2 , R1 + R2 2 ), we have that
 1 , X2 ) is parallel to X2 X1 in Rd , and
i U(X
 (X1 , X2 ) C ,
(X1 X2 ) 1 U  (X1 , X2 ) C .
(X1 X2 ) 2 U

 = (X1 , X2 )
Proof. First notice that by assumption and (3.39), we have for any X
2d
R

 1 2
i U(X) =
 Ui (Xi x) |v| + U1 (X1 x) + U2 (X2 x) dxdv
R2d 2

Xi x  1 2
= hi (|Xi x|) |v| + h1 (|X1 x|) + h2 (|X2 x|) dxdv.
R2d |Xi x| 2
From this, it is easy to see that $\nabla_i \bar U(X)$ is parallel to $X_1 - X_2$ in $\mathbb{R}^d$.
For the second half of the lemma, since the proofs are similar, we only prove
the first assertion.
Notice that for any x B2 , since |X1 X2 | R1 + R2 , we have that
|X1 x| R1 , |X2 x| R2 .
By our assumption, U1 (X1 x) = h1 (|X1 x|), U2 (X2 x) = h2 (|X2 x|).
Therefore, by (6.2),
h2 (|X2 x|)
 (X1 , X2 ) = X1 x
1 U dx p (h1 (|X1 x|) + u)duh1 (|X1 x|) .
B2 0 |X 1 x|

Notice that in this integral domain, since 0 < R1 R2 , we have (X1 X2 )


X1 x
|X1 x| > 0. By assumption,

h1 (|X1 x|) > 0, h2 (|X2 x|) < 0,


h1 (|X1 x|) < 0, h2 (|X2 x|) > 0.
Also, since d 3, we have by (3.42) that p (s) < 0 for any s < e0 . Therefore,
if we set
- .
2 = x; |X1 x| R1 , |X2 x| R2 B2 ,
B
6 6
then
h2 (|X2 x|)
 1 , X2 )
(X1 X2 ) 1 U(X dx p (h1 (|X1 x|) + u)du
f2
B 0

X1 x
h1 (|X1 x|)(X1 X2 ) .
|X1 x|

We have by (3.42) that p (s) > 0 for any |s| U1 + U2 , also, p ()
is continuous in this closed interval. Therefore, there exists a constant C1 > 0 such
that
inf{p (s); |s| U1 + U2 } C1 .
 , we have that
Moreover, for any x B

5
|X1 x| |X1 X2 | |X2 x| (R1 + R2 ) R2 = R1 ,
6 6
i.e. |X1 x| [R1 56 , R1 6 ]. In the same way, |X2 x| [R2 56 , R2 6 ]. So
by assumption, there exists a constant C1 > 0 (which does not depend on x) such
that
h1 (|X1 x|) C1 , h2 (|X2 x|) C1 ,
h1 (|X1 x|) C1 , h2 (|X2 x|) C1 .
Also, we have that
X1 x (R1 + R2 )(R1 )
(X1 X2 ) .
|X1 x| R1
Indeed, if we decompose X1 x into
X1 x = X1 x
 + (x x
)
with X1 x X1 X2 and x x  X1 X2 , then X2 x = X2 x  + (x x ) is
2 2 2 2
also an orthogonal decomposition. So R2 |X2 x| = |X2 x | + |x x | , hence
|X2 x
| R2 . Also, |X1 X2 | R1 + R2 , So |X1 x
| |X1 X2 | |X2 x|
(R1 + R2 ) R2 = R1 . Therefore,
X1 x |X1 X2 | |X1 x
| (R1 + R2 )(R1 )
(X1 X2 ) .
|X1 x| R1 R1
Combining these, we get that
h2 (|X2 x|)
 (X1 , X2 )
(X1 X2 ) 1 U dx p (h1 (|X1 x|) + u)du
f2
B 0

X1 x
h1 (|X1 x|)(X1 X2 )
|X1 x|

(R1 + R2 )(R1 )
C1 C1 C1 dx,
R1 Bf

which gives us our first assertion.

As a direct corollary of Lemma 6.1.1, we have the following.

Lemma 6.1.2. Let (0, 0 ], and let X1 , X2 Rd satisfying |X1 X2 | [R1 +


R2 , R1 + R2 ). Then we have that
 (X1 , X2 ) < 0,
(X1 X2 ) 1 U  1 , X2 ) > 0.
(X1 X2 ) 2 U(X

We also have the following as an easy corollary of Lemma 6.1.1.

Corollary 6.1.3. Assume that t1 , t2 [0, n ] satisfy |t1 t2 | 4n


, and |X1 (t1 )
X2 (t1 )| [R1 + R2 , R1 + R2 2 ). Then


 /2
(X1 (t2 ) X2 (t2 )) 1 U (X1 (t1 ), X2 (t1 )) C 1 .
R1 + R2

Proof. By using the general fact that $\frac{(a, b)}{|b|^2} \ge 1 - \frac{|a - b|}{|b|}$ for any $a, b \in \mathbb{R}^d$ with $b \ne 0$, we get

1 , X
by Lemma 6.1.1 that for any (X 2 ) with |(X
1 X
2 ) (X1 X2 )| < |X1 X2 |,
we have
1 X
(X  (X1 , X2 )
2 ) 1 U
1 X
(X  2 , X1 X2 )
 1 , X2 )
= (X1 X2 ) 1 U(X
|X1 X2 |2

1 X
|(X 2 ) (X1 X2 )|
C 1 .
|X1 X2 |

Under our assumption, we have |X1 (t1 ) X1 (t2 )| n|t1 t2 |


4, similarly,
|X2 (t1 ) X2 (t2 )| 4 . Therefore, by the argument above,
 1 (t1 ), X2 (t1 ))
(X1 (t2 ) X2 (t2 )) 1 U(X

|(X1 (t2 ) X2 (t2 )) (X1 (t1 ) X2 (t1 ))|
C 1
|X1 (t1 ) X2 (t1 )|

/2
C 1 .
R1 + R2
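
The elementary inequality quoted at the start of this proof follows from the Cauchy–Schwarz inequality. A short derivation, as a sketch in the same notation ($(\cdot,\cdot)$ the inner product of $\mathbb{R}^d$, $b \ne 0$):

```latex
% Sketch: lower bound for (a,b) in terms of |a-b|.
\begin{align*}
(a, b) &= (b, b) + (a - b, b) \\
       &\ge |b|^2 - |a - b|\,|b| \\
       &= |b|^2 \Big( 1 - \frac{|a - b|}{|b|} \Big),
\end{align*}
% so dividing by |b|^2 > 0 gives (a,b)/|b|^2 \ge 1 - |a-b|/|b|.
```

In the proof it is applied with $b$ the difference of positions at time $t_1$ and $a$ the difference at time $t_2$, so that a small displacement $|a - b|$ costs only a small fraction of the lower bound.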

6.2. Tightness
As before, we only need to work under the condition $|V_i| \le n$, i.e. use $t \wedge \sigma_n$
instead of $t$, and finally take $n \to \infty$.
We first show that the condition of Lemma 3.5.2 is satisfied.
For any X  = (X1 , X2 ) R2d with |X1 X2 | < R1 + R2 big enough,
by Lemma 6.1.1, we have that i U  (X) is parallel to X1 X2 in Rd , and by

Lemma 6.1.2, 1 U(X) has the opposite direction as X1 X2 , and 2 U
  (X)
 has the
same direction as X1 X2 .
Therefore, if we let g1 (X)  = X2 X1 , g2 (X)  = X1 X2 , and let D =
|X2 X1 | |X1 X2 |
{(X1 , X2 )||Xi | |Xi,0 |+nT, |X1 X2 | R1 +R2 0 }. Then since R1 +R2 0 > 0,
we have that g1 , g2 Cb1 (D)  i U(
and gi (X)  X)
 = |i U (X)|
 for any x D,
i.e.
the condition of Lemma 3.5.2 is satisfied.
We next give a brief proof of the tightness of {the distribution of $\{(X(t \wedge \sigma_n), V(t \wedge \sigma_n))\}_t$ under $P_m$}$_m$ as $m \to 0$. The only difficulty is the assertion with

T
respect to V (). We deal with it from now on. Let Ak = {Yt : 0 |dYt | k}, k N.
Then we have by Kusuoka [9, Corollary 8] that Ak is compact in Lp ([0, T ]; Rd) with
cluster points in D([0, T ]; Rd) for any k N. Also, by Lemma 3.5.2(1), there exists
a constant C > 0 such that
n 1
Pm m 1/2 
i U(X(s))ds
 (Ak )
0

T
= 1 Pm m 1/2  X(s))|ds
|i U(  >k
0
 
T
1 1/2  (X(s))|ds
1 E Pm m |i U 
k 0

C
1 ,
k
 t
which converges to 1 as k . Therefore, {{m1/2 0 n i U  (X(s))ds}
 t
under Pm }m(0,1] is tight in (D([0, T ]; R )). Therefore, since by Lemma 3.5.1,
d

t
1
Mi (Vi (t ) Vi (0)) = Mi (t) + i (t) + Pi m 1/2  (X(s))ds,
i U 
0

and the distributions of Mi (t) + i (t) and Pi1


under Pm are tight in
(D([0, T ]; Rd )), we get the conclusion that {{Vi (tn )}t under Pm }m0 is tight in
(D([0, T ]; Rd )).

6.3. Convergence to a Markov process


The idea is similar to that presented by Kusuoka in [9].
Let us rst recall the following existence and uniqueness theorem of Kusuoka [9,
Theorem 1]. Let D be a bounded domain in Rd with a smooth boundary D and
let n(x), x D, be the outer normal vector at x D. Let

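\[
L_0 = \sum_{i=1}^d v^i \frac{\partial}{\partial x^i} + \frac{1}{2} \sum_{i,j=1}^d a^{ij}(x) \frac{\partial^2}{\partial v^i \partial v^j} + \sum_{i=1}^d b^i(x, v) \frac{\partial}{\partial v^i},
\]

where $a^{ij} : \mathbb{R}^d \to \mathbb{R}$, $i, j = 1, \ldots, d$, are smooth functions, symmetric with respect
to $i, j$ and uniformly elliptic with respect to $x$, and $b^i : \mathbb{R}^{2d} \to \mathbb{R}$, $i = 1, \ldots, d$, are
bounded measurable functions.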
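Let $\Phi : \mathbb{R}^d \times \partial D \to \mathbb{R}^d$ be a smooth map satisfying the following:
(1) $\Phi(\cdot, x) : \mathbb{R}^d \to \mathbb{R}^d$ is linear for all $x \in \partial D$,
(2) $\Phi(v, x) = v$ for any $x \in \partial D$ and $v \in T_x(\partial D)$, i.e., $\Phi(v, x) = v$ if $x \in \partial D$, $v \in \mathbb{R}^d$
and $v \cdot n(x) = 0$,
(3) $\Phi(\Phi(v, x), x) = v$ for all $v \in \mathbb{R}^d$ and $x \in \partial D$,
(4) $\Phi(n(x), x) = -n(x)$ for any $x \in \partial D$.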
Then Kusuoka [9, Theorem 1] proved the following:

Theorem 6.3.1. Let (x0 , v0 ) (D)


C Rd . Then there exists a unique probability
satisfying the following:
measure over W d

(1) ((0) = (x0 , v0 )) = 1,


(2) ((t) DC Rd , t [0, )) = 1, 
(3) For any f C0 ((D) C Rd ), {f ((t)) t L0 f (w(s))ds; t 0} is a martingale
0
under (),
(4) (1D (x(t))(v(t) (v(t), x(t))) = 0 for all t [0, )) = 1.

Here () = (x(), v()) W


d.

By using this, we get the following slight variation. Recall that D0 = {(X1 , X2 )
R2d ; |X1 X2 | > R1 + R2 } in our present setting.

Theorem 6.3.2. There exists a unique probability measure P,0 over D([0, );
R4d ) satisfying the following.

(1) P,0 ((0) = (x0 , v0 )) = 1,



(2) P,0 (X(t) D 0 , t [0, )) = 1,
t
(3) For any f C0 (D0 R2d ), {f (X(t),
 V (t)) 0 (Lf )(X(s),
  (s))ds; t 0} is
V
a martingale under P,0 ,
(4) If f C0 (R4d ) satises

M11 (v1 f )(x, v) (x1 x2 ) + M21 (v2 f )(x, v) (x2 x1 ) = 0 (6.3)

for any (x, v) D0 R2d , then f (X(t),


 V (t)) is continuous in t, P,0 -a.s.,
2 2
(5) M1 |V1 (t)| + M2 |V2 (t)| is continuous in t, P,0 -a.s.

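Proof. We define $\Phi(v, x)$, $(v, x) = (v_1, v_2, x_1, x_2) \in \mathbb{R}^{4d}$, in the following way: For
any such $v_1, v_2, x_1, x_2 \in \mathbb{R}^d$, decompose $v_1$ and $v_2$ into $v_i = u_i + w_i$ with $u_i \perp x_1 - x_2$
and $w_i \parallel x_1 - x_2$, $i = 1, 2$, and let $\Phi(v, x) = (\Phi_1(v, x), \Phi_2(v, x))$ with

\[
\begin{aligned}
\Phi_1(v, x) &= u_1 + \frac{M_1 - M_2}{M_1 + M_2}\, w_1 + \frac{2M_2}{M_1 + M_2}\, w_2, \\
\Phi_2(v, x) &= u_2 + \frac{2M_1}{M_1 + M_2}\, w_1 + \frac{M_2 - M_1}{M_1 + M_2}\, w_2.
\end{aligned}
\]

Then $\Phi$ satisfies the conditions before Theorem 6.3.1.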
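The map in the proof above is the usual elastic-collision rule: the velocity components along the line of centers are exchanged according to the two masses, and the orthogonal components are kept. The following is a minimal Python sketch of that rule, not the authors' code; the helper names `split`, `collision_map`, `momentum` and `energy` are hypothetical and introduced only for illustration. It can be used to check numerically that the map is an involution (condition (3) before Theorem 6.3.1) and that it conserves total momentum and kinetic energy.

```python
import math


def split(v, direction):
    """Split v into (u, w): w is parallel to `direction`, u is orthogonal to it."""
    norm2 = sum(d * d for d in direction)
    coeff = sum(a * d for a, d in zip(v, direction)) / norm2
    w = [coeff * d for d in direction]
    u = [a - b for a, b in zip(v, w)]
    return u, w


def collision_map(v1, v2, x1, x2, m1, m2):
    """Hypothetical helper: velocities after an elastic collision of masses m1, m2."""
    axis = [a - b for a, b in zip(x1, x2)]  # line of centers x1 - x2
    u1, w1 = split(v1, axis)
    u2, w2 = split(v2, axis)
    total = m1 + m2
    new1 = [a + (m1 - m2) / total * b + 2 * m2 / total * c
            for a, b, c in zip(u1, w1, w2)]
    new2 = [a + 2 * m1 / total * b + (m2 - m1) / total * c
            for a, b, c in zip(u2, w1, w2)]
    return new1, new2


def momentum(v1, v2, m1, m2):
    """Total momentum m1*v1 + m2*v2 as a list of components."""
    return [m1 * a + m2 * b for a, b in zip(v1, v2)]


def energy(v1, v2, m1, m2):
    """Twice the kinetic energy, m1*|v1|^2 + m2*|v2|^2."""
    return m1 * sum(a * a for a in v1) + m2 * sum(b * b for b in v2)
```

Applying `collision_map` twice with the same positions should return the original velocities up to rounding, which is the numerical counterpart of condition (3).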


We first check that a probability measure satisfying conditions (1)–(4) of
Theorem 6.3.1 with $\Phi$ given above also satisfies conditions (1)–(5) of Theorem 6.3.2.
All except (4) are trivial. For (4), it is sufficient to show that $f(x, \Phi(v, x)) = f(x, v)$
for any $x \in \partial D_0$ if $f$ satisfies (6.3). We show this in the following. Since

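\[
\Phi_1(v, x) - v_1 = \frac{2M_2}{M_1 + M_2}(w_2 - w_1), \qquad \Phi_2(v, x) - v_2 = \frac{2M_1}{M_1 + M_2}(w_1 - w_2),
\]
we have
\[
\begin{aligned}
f(x, \Phi(v, x)) - f(x, v) &= \int_0^1 \big[ \nabla_{v_1} f(x, v + t(\Phi(v, x) - v)) \cdot (\Phi_1(v, x) - v_1) \\
&\qquad\quad + \nabla_{v_2} f(x, v + t(\Phi(v, x) - v)) \cdot (\Phi_2(v, x) - v_2) \big]\, dt \\
&= \frac{2M_1 M_2}{M_1 + M_2} \int_0^1 \big[ M_1^{-1} \nabla_{v_1} f - M_2^{-1} \nabla_{v_2} f \big](x, v + t(\Phi(v, x) - v)) \cdot (w_2 - w_1)\, dt \\
&= 0,
\end{aligned}
\]
where in the last line we used (6.3) and the fact that $w_2 - w_1 \parallel x_2 - x_1$.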
For the opposite direction, i.e., the fact that a probability measure satisfying
conditions (1)–(5) of Theorem 6.3.2 also satisfies conditions (1)–(4) of Theorem 6.3.1
with $\Phi$ given above, we only need to check that (4) of Theorem 6.3.1 is satisfied,
or equivalently, to show that $\vec V(\tau) = \Phi(\vec V(\tau-), \vec X(\tau))$ if $\vec X(\tau) \in \partial D_0$. Choose any
$w \in \mathbb{R}^d$ and fix it for a while. Let $f(x, v) = M_1 v_1 \cdot w + M_2 v_2 \cdot w$. Then $f$ satisfies
(6.3), so by (4) of Theorem 6.3.2, $f(\vec X(t), \vec V(t))$ is continuous in $t$. We write it down
together with (5) of Theorem 6.3.2:
\[
M_1 V_1(t) + M_2 V_2(t) \ \text{is continuous in } t, \qquad M_1 V_1^2(t) + M_2 V_2^2(t) \ \text{is continuous in } t.
\]
Solving these two equations, we get that either
\[
\Phi_1(\vec V(\tau-), \vec X(\tau)) \cdot w = V_1(\tau) \cdot w \quad \text{and} \quad \Phi_2(\vec V(\tau-), \vec X(\tau)) \cdot w = V_2(\tau) \cdot w \tag{6.4}
\]
or
\[
V_1(\tau-) \cdot w = V_1(\tau) \cdot w \quad \text{and} \quad V_2(\tau-) \cdot w = V_2(\tau) \cdot w. \tag{6.5}
\]
If $w$ is orthogonal to $X_1(\tau) - X_2(\tau)$, then these two conditions are equivalent, so
both of them hold, which means that there is no jump at time $\tau$ in any of these
directions. Now, the only thing left to check is that (6.4) also holds for any
$w \parallel X_1(\tau) - X_2(\tau)$. If not, then (6.5) holds, so $V_i(\tau-) = V_i(\tau)$ for $i = 1, 2$. Since
\[
\frac{d}{dt}(X_1(t) - X_2(t))^2 = 2 (X_1(t) - X_2(t)) \cdot (V_1(t) - V_2(t)),
\]
this implies that
\[
\frac{d}{dt}(X_1(t) - X_2(t))^2 \Big|_{t = \tau-} = \frac{d}{dt}(X_1(t) - X_2(t))^2 \Big|_{t = \tau+}. \tag{6.6}
\]
If (6.6) is equal to 0, then $V_1(\tau) - V_2(\tau)$ is orthogonal to $X_1(\tau) - X_2(\tau)$, so by
the definition of $\Phi$ this implies that $\Phi(\vec V(\tau-), \vec X(\tau)) = \vec V(\tau-)$, which combined
with our assumption implies that $\Phi(\vec V(\tau-), \vec X(\tau)) = \vec V(\tau)$, the very equation that
we need. If (6.6) is not equal to 0, write it as $C \in \mathbb{R}$; then by the continuity
of $\frac{d}{dt}(X_1(t) - X_2(t))^2$ at $t = \tau$, there exists a $\delta > 0$ small enough such that either
$(X_1(\tau - \delta) - X_2(\tau - \delta))^2$ or $(X_1(\tau + \delta) - X_2(\tau + \delta))^2$ is less than
$(X_1(\tau) - X_2(\tau))^2 - \frac{|C|}{2}\delta = (R_1 + R_2)^2 - \frac{|C|}{2}\delta$. This contradicts condition (2). Therefore, (6.4),
i.e., (4) of Theorem 6.3.1, holds.

We have already shown in Sec. 6.2 that $\{\{(\vec X(t \wedge \sigma_n), \vec V(t \wedge \sigma_n))\}_t$ under $P_m\}_{m \to 0}$
is tight. We show from now on that any cluster point of it satisfies all of the
conditions of Theorem 6.3.2.
The fact that any of its cluster points satisfies (1) is trivial. The fact that it
satisfies (3) is nothing but Lemma 3.5.3. So we only need to show that (2), (4) and
(5) are also satisfied.
We show (2) first. Choose an arbitrary $\varepsilon > 0$ and fix it for a while. Let
\[
\tau = \tau_\varepsilon = \inf\Big\{ t > 0;\ |X_1(t) - X_2(t)| \le R_1 + R_2 - \frac{3}{4}\varepsilon \Big\} \wedge \sigma_n \wedge T.
\]

Then (2) is implied by the following.

Lemma 6.3.3. Let $\varepsilon \in (0, \varepsilon_0]$ and let $\tau$ be as defined above. Then

\[
\lim_{m \to 0} P_m(\tau < T \wedge \sigma_n) = 0.
\]

This result is easy to imagine, since $m^{-1/2} \to \infty$ as $m \to 0$, so by Corollary 6.1.3,
the term $m^{-1/2} \int_0^t \nabla_i \tilde U(\vec X(s))\, ds$ in the decomposition of $M_i V_i(t)$
gives us a very strong force as soon as the distance between the two molecules is
less than $R_1 + R_2$.

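Proof. Notice that if $\tau < T \wedge \sigma_n$, then $|X_1(\tau) - X_2(\tau)| = R_1 + R_2 - \frac{3}{4}\varepsilon$, hence
\[
|X_1(t) - X_2(t)| \in \Big[ R_1 + R_2 - \varepsilon,\ R_1 + R_2 - \frac{\varepsilon}{2} \Big] \quad \text{for any } t \in \Big[ \tau - \frac{\varepsilon}{8n},\ \tau \Big].
\]
We have by Itô's formula and Lemma 3.5.1 that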
t
2 2
|X1 (t) X2 (t)| = |X1 (0) X2 (0)| + 2 (X1 (s) X2 (s))
0

M1 (s) M2 (s) + 1 (s) 2 (s) + P11 (s) P21 (s)
s 
m 1/2  
(1 U (X1 (u), X2 (u)) 2 U (X1 (u), X2 (u)))du ds,
0
so
2
3
R1 + R2 (R1 + R2 )2
4
  2
 
|X1 () X2 ()|2 X1 X2
8n 8n 


=2 (X1 (s) X2 (s)) M1 (s) M2 (s) + 1 (s) 2 (s)

8n

+ P11 (s) P21 (s)



8n
m1/2  1 (u), X2 (u)) 2 U
(1 U(X  (X1 (u), X2 (u)))du
0


s
m 1/2  1 (u), X2 (u)) 2 U
(1 U(X  (X1 (u), X2 (u)))du ds

8n

 


2 R1 + R2 |M1 (s)| + |M2 (s)| + |1 (s)| + |2 (s)|

8n 2

+ |P11 (s)| + |P21 (s)|



T n
+m 1/2  (X1 (u), X2 (u))| + |2 U
(|1 U  (X1 (u), X2 (u)))| du ds
0

s
+ 2m1/2 ds [(X1 (s) X2 (s))

8n 8n

 (X1 (u), X2 (u)) 2 U


(1 U  (X1 (u), X2 (u)))]du. (6.7)

Let C1 = (R1 + R2 )2 (R1 + R2 34 )2 and C2 = ( 8n 2 /2


) C (1 R1 +R 2
) > 0,
where C is the constant given in Lemma 6.1.1 and Corollary 6.1.3. Notice that C1
and C2 depend only on R1 + R2 , and n, and do not depend on m. Also, write
Ys = |M1 (s)| + |M2 (s)| + |1 (s)| + |2 (s)| + |P11 (s)| + |P21 (s)|. Then with the help
of Corollary 6.1.3, (6.7) implies that

< T n
 

2 R1 + R2 Ys ds + R1 + R2
2
8n 4n 2
T n
 (X1 (u), X2 (u))| + |2 U
m1/2 (|1 U  (X1 (u), X2 (u)))|)du
0
2
2 3
(R1 + R2 ) R1 + R2
4
s
+ 2m1/2 ds [(X1 (s) X2 (s))

8n 8n

 (X1 (u), X2 (u)) 2 U


(1 U  (X1 (u), X2 (u)))]du
 s
1/2 /2
C1 + 2m C 1 ds du
R1 + R2
8n
8n
 2
1/2 /2 1
= C1 + 2m C 1
R1 + R2 2 8n
= C1 + m1/2 C2 .
T 2 
Pm T n
Let C3 = supm(0,1] {2 0 E Pm [Ys ]ds + 4n
 (X1 (u),
m1/2 |i U
i=1 E [ 0
X2 (u))|du]}, which is nite by Lemmas 3.5.1 and 3.5.2. Then the above implies
that
 

Pm ( < T n ) Pm 2 R1 + R2 Ys ds + R1 + R2
2
8n 4n 2
T n
 1 (u), X2 (u))|
m1/2 (|1 U(X
0

 (X1 (u), X2 (u)))|)du C1 + m1/2 C2
+ |2 U

  n
1
E Pm
2 R1 + R2 Ys ds
C1 + m1/2 C2 2
8n
 T n
 (X1 (u), X2 (u))|
+ R1 + R2 m1/2 (|1 U
4n 2 0

 (X1 (u), X2 (u)))|)du
+ |2 U


1
R1 + R2 C3 ,
C1 + m1/2 C2 2
which converges to 0 as $m \to 0$.

This completes the proof of our assertion.

We next show that condition (5) of Theorem 6.3.2 is satisfied, i.e.,
$M_1 |V_1(t)|^2 + M_2 |V_2(t)|^2$ is continuous in $t$ almost surely, under any limit
probability.

We first prepare the following:

Lemma 6.3.4. $-\nabla_1 \tilde U(Y_1, Y_2) \cdot \frac{Y_1 - Y_2}{|Y_1 - Y_2|}$ is monotone non-increasing with respect to
$|Y_1 - Y_2|$ for $|Y_1 - Y_2| \in [R_1 + R_2 - \varepsilon_0, R_1 + R_2]$.

Proof. As in the proof of Lemma 6.1.1, by (6.2), we have that in our present
setting,
h2 (|Y2 x|)
1 U (Y1 , Y2 ) Y1 Y2 = dx p (h1 (|Y1 x|) + u)du
|Y1 Y2 | BY1 ,Y2 0

Y1 x Y1 Y2
h1 (|Y1 x|) .
|Y1 x| |Y1 Y2 |

Let B Y1 ,Y2 = {(s, t)|x BY1 ,Y2 , s = |Y1 x|, t = |Y2 x|}, and for any (s, t)

BY1 ,Y2 , let , , be the angles between Y1 Y2 and Y1 x, Y2 Y1 and Y2 x, xY1 and xY2 ,
respectively. Write A = |Y1 Y2 |. Finally, let l(s, t) denote the length of the hyper-
circle {x Rd ; |Y1 x| = s, |Y2 x| = t} in Rd2 . Then by using a change of
variables,

 (Y1 , Y2 ) Y1 Y2
1 U
|Y1 Y2 |
0
= dsdt (p (h1 (s) + u))du(h1 (s))l(s, t) cos sin .
B
Y1 ,Y2 h2 (t)

Notice that all of the terms above are positive. The integration domain B 
Y1 ,Y2
is decreasing with respect to |Y1 Y2 |. Also, for any xed s and t, the term l(s, t)
is also decreasing with respect to |Y1 Y2 |. Therefore, it is sucient to show that
for any s, t xed, cos sin is decreasing with respect to A = |Y1 Y2 |. We shall
show it from now on.
By the sine formula, cos sin = At sin cos . So it suces to show that
A sin cos is monotone decreasing with respect to A, or equivalent, is mono-
tone increasing /with respect to , for > 0 small enough. It is easy to see that
A = s cos + t2 s2 sin2 . So
/
A sin cos = s sin cos2 + t2 s2 sin2 sin cos
0
= s sin (1 sin2 ) + (t2 s2 sin2 )(1 sin2 ) sin2 .

Since > 0 is small enough, we have sin2 > 0 small enough and monotone
increasing with respect to . Also, since s/t is near to R 1
R2 (> 0), there exists an
1 > 0 such that the functions f1 (x) = sx(1 x ) and f2 (x) = (t2 s2 x)(1 x)x =
2
2
t2 x(x 1)(x st2 ) are monotone increasing in x [0, 1 ]. Combining these, we get
the desired property of A sin cos to be increasing with respect to for > 0
small enough, or equivalent, decreasing with respect to A.
This completes the proof of our assertion.

Let 0 = 0 = inf{t > 0; |X1 (t) X2 (t)| R1 + R2 34 0 } n T . Then by


Lemma 6.3.3, Pm (0 < T n ) 0 as m . We next use Lemma 6.3.4 to prove
the following:
 T
Lemma 6.3.5. limm0 Pm ( 0 n 0 m1/2 (U  (X(s))
 0 )ds > ) = 0 for any
U
> 0.

Proof. By Lemma 6.1.2, we have that 1 U(Y  1 , Y2 ) Y1 Y2 is positive for |Y1


|Y1 Y2 |
Y2 | [R1 + R2 0 , R1 + R2 ). Also, by Lemma 6.3.4, the same quantity is monotone
non-increasing with respect to |Y1 Y2 |. Notice that U  (X1 , X2 ) = U  (X1 X2 , 0). So
 
with a little bit abuse of notation, we can write U (X1 , X2 ) = U (X1 X2 ). We have
 1 , X2 ) U
that U(X 0 = 0 if |X1 X2 | R1 + R2 . Also, for any |X1 X2 | < R1 + R2 ,
 R1 +R2
we have U( |X1 X2 | (X1 X2 )) = U 0 , and R1 +R2 + t(1 R1 +R2 ) 1 for t [0, 1],
|X1 X2 | |X1 X2 |
hence
 (X1 , X2 ) U
U 0

= U(X 1 X2 ) U  R1 + R2 (X1 X2 )
|X1 X2 |
1  
= 1 U R1 + R2 (X1 X2 ) + t 1 R1 + R2 (X1 X2 )
0 |X1 X2 | |X1 X2 |

R1 + R2
1 + (X1 X2 )dt
|X1 X2 |
1 
1 U (X1 X2 ) (X1 X2 ) 1 + R1 + R2 dt
0 |X1 X2 |

 R1 + R2
= 1 U(X1 X2 ) (X1 X2 ) 1 +
|X1 X2 |

 (X1 X2 )||X1 X2 | R1 + R2 |X1 X2 |


|1 U
|X1 X2 |
 (X1 X2 )|(R1 + R2 |X1 X2 |).
= |1 U
The rst equation in the calculation above also gives us that U 0 is
 (X1 , X2 ) U
non-negative. Also, by (3.31),  
(X(s)) U0 = 0 if |X1 (s) X2 (s)| R1 + R2 .

 T U

Let C = supm(0,1] E Pm [ 0 n 0 m1/2 |1 U(X  1 (s) X2 (s))|ds], which is nite
by Lemma 3.5.2. Then for any (0, 34 0 ), we have

T n 0
1/2   
Pm m (U (X(s)) U0 )ds >
0

T n 0
Pm  (X1 (s) X2 (s))|
m1/2 |1 U
0

(R1 + R2 |X1 (s) X2 (s)|)1{|X1 (s)X2 (s)|<R1 +R2 } ds >

Pm inf |X1 (s) X2 (s)| R1 + R2
s[0,T n ]

T n 0
 1 (s) X2 (s))|ds >
1/2
+ Pm m |1 U(X
0
 
T n 0
Pm 1/2  (X1 (s) X2 (s))|ds
Pm ( 43 < T n ) + E m |1 U
0

Pm ( 43 < T n ) + C.

By Lemma 6.3.3, Pm ( 43 < T n ) 0 as m 0 for any > 0. Therefore,
by taking rst > 0 small enough and then m > 0 small enough, we get our
assertion.

We are now ready to show that the condition (5) of Theorem 6.3.2 is satised.
Lemma 6.3.6. M1 |V1 (t)|2 +M2 |V2 (t)|2 is continuous in t almost surely, under any
cluster point of {(X(t),
  (t))t under Pm } as m 0.
V

Proof. Let mk be a sequence and P be a probability such that limk mk = 0


and {(X(t),
 V (t))t under Pm } converges to P as k . (This is possible by
Sec. 6.2.) Then
(Vi2 (s))s under Pm (Vi2 (s))s under P
in (D([0, T ]; Rd)), as m 0. Also, let

2
 (X(s))
Hsm = m1/2 (U  0 ) + 1
U Mi |Vi (s)|2 .
2 i=1
Then we have by Lemma 3.5.2(2) that under our present setting, {(Ht m
) under
n 0 t
Pm }m0 is tight in (C([0, T ]; R )). That is, there exists a Hs C([0, T ]; Rd) such
d

that
(Hsm )s under Pm (Hs )s under P
in (C([0, T ]; Rd )), as m 0. Combining the above, we get
2

1
2
Hs
m
Mi Vi (s) under Pm
2 i=1
s[0,T n 0 )
2

1

Hs Mi Vi (s)2 under P
2 i=1
s[0,T n 0 )

in (D([0, T ]; Rd)), as m 0. However, for any > 0, we have by Lemma 6.3.5


that
 
T n 0  2 
 m 1
2
Pm  Hs Mi Vi (s)  ds > 0, as m 0.
0  2 
i=1
So
 
T n 0  1

2 
 2
P  Hs Mi Vi (s)  ds > = 0
0  2 i=1 

for any > 0. Also, 0 T n as m 0. Therefore,


T n  2


 1
2
 Hs Mi Vi (s)  ds = 0, P -a.s.
0  2 
i=1

This combined with the continuity of Hs and the fact that n a.e. gives us
that M1 |V1 (t)|2 + M2 |V2 (t)|2 is continuous in t, P -almost surely.

We finally show that condition (4) of Theorem 6.3.2 is satisfied. The method
is similar to that of the proof of (5).
As in Sec. 5.1, let Yi (t) = Vi (t) Mi1 i (t), i = 1, 2, where i (t) is as given in
Lemma 3.5.1. Let Y  (t) = (Y1 (t), Y2 (t)), and let
t
Gt = m1/2 {M 1 fV1 (X(s),
  (X(s))
Y (s)) 1 U 
1
0

+ M21 fV2 (X(s),


 Y  (X(s))}ds
 (s)) 2 U  
+ f (X(t), V (t)).

We rst show the following.

Lemma 6.3.7. {(Gtn )t under Pm }m0 is tight in (C([0, T ]; Rd )).

Proof. Let
t = Gt f (X(t),
G   (t)) + f (X(t),
V  Y (t)).

Then
 t | fV1 M 1 |1 (t)| + fV2 M 1 |2 (t)|.
|Gt G 1 2

Therefore, by Lemma 3.5.1(4), we have that the tightness of {(Gtn )t


 tn )t
under Pm }m0 in (C([0, T ]; Rd)) is equivalent to the tightness of {(G
under Pm }m0 in (C([0, T ]; Rd )).
On the other hand, we have by Lemma 3.5.1 and Itos formula that
t = fX1 (X(t),
dG  Y (t)) V1 (t)dt + fX2 (X(t),
 Y (t)) V2 (t)dt
+ M11 fV1 (X(t),
 Y (t)) (dM1 (t) + dP11 (t))
+ M21 fV2 (X(t),
 Y (t)) (dM2 (t) + dP21 (t)).

So by Lemma 3.5.1(2), (4.13) and Theorem 3.4.1, {(G tn )t under Pm }m0 is tight
d
in (C([0, T ]; R )). This completes the proof of our assertion.

Lemma 6.3.8. Suppose that f C0 (R4d ) satises the condition in (4) of Theo-
rem 6.3.2. Then for any > 0, we have that

T n 0  t
lim Pm m1/2 {M11 fV1 (X(s),
  X(s))
Y (s)) 1 U( 
m0 
0 0


+ M21 fV2 (X(s),
 
Y  (X(s))}ds
(s)) 2 U   dt > = 0.


Proof. First notice that i U (X1 , X2 ) = 0 if |X1 X2 | > R1 +R2 . For any X1 , X2
R with |X1 X2 | R1 + R2 , let X
d i = R1 +R2 Xi , i = 1, 2. Then |X1 X
2 | = R1 +
|X1 X2 |
R2 . Since D0 = {(X1 , X2 ) R2d ; |X1 X2 | > R1 +R2 }, this means X  = (X1 , X
2 )
 
D0 , Also, as in the proof of Corollary 6.1.3, 1 U (X1 , X2 ) = 2 U (X1 , X2 ) is
parallel with same direction to X1 X2 , so
 
 (X1 , X2 ) = |1 U(X1 , X2 )| (X1 X2 ) = |1 U(X1 , X2 )| (X
1 U 1 X
2 ),
|X1 X2 | R1 + R2
 
 (X1 , X2 ) = + |2 U(X1 , X2 )| (X1 X2 ) = + |1 U(X1 , X2 )| (X
2 U 1 X
2 ).
|X1 X2 | R1 + R2
So by assumption, for any Y R2d ,

M11 fV1 (X,  1 , X2 ) + M 1 fV2 (X,


 Y ) 1 U(X  (X1 , X2 )
 Y ) 2 U
2

 1 , X2 )|
|1 U(X
=  Y ) (X
(M11 fV1 (X, 1 X
2 ) + M 1 fV2 (X,
 Y ) (X
1 X
2 ))
2
R1 + R2
= 0,

hence if we set C1 M11 fXV1 M21 fXV2 , then


 1 , X2 ) + M 1 fV2 (X, Y ) 2 U(X
|M11 fV1 (X, Y ) 1 U(X  1 , X2 )|
2

 Y )) 1 U
= |M11 (fV1 (X, Y ) fV1 (X,  (X1 , X2 )

 Y )) 2 U
+ M21 (fV2 (X, Y ) fV2 (X,  (X1 , X2 )|

 1U
M11 fXV1 |X X||  (X1 , X2 )|

+ M21 fXV2 |X X|| 2U  (X1 , X2 )|



  R1 + R2
C1 (|1 U (X1 , X2 )| + |2 U(X1 , X2 )|) 1 |X|.
|X1 X2 |
 0 | + 2nT )(R1 + R2 )1 , and let
Let C2 = 2(|X
 
T n
C3 = C1 C2 sup E Pm
m 1/2  X(s))|
(|1 U(   X(s))|)ds
+ |2 U(  ,
m(0,1] 0
which is nite by Lemma 3.5.2. Then by the calculation above, we have for any
[0, 34 0 12 (R1 + R2 )), (hence R1 + R2 > 12 (R1 + R2 )),

T n 0  t
Pm m1/2 {M11 fV1 (X(s),
 Y  (X(s))
 (s)) 1 U 

0 0


+ M21 fV2 (X(s),
 Y 
(s)) 2 U(X(s))}ds dt >



T n 0
Pm  (X(s))|
m1/2 C1 (|1 U   (X(s))|)
+ |2 U 
0

R 1 + R2
(|X
 0 | + 2nT ) 1 1{|X1 (s)X2 (s)|<R1 +R2 } ds >
|X1 (s) X2 (s)|

Pm inf |X1 (s) X2 (s)| R1 + R2
s[0,T n ]

T n 0
+ Pm  X(s))|
m1/2 C1 (|1 U(   X(s))|)ds
+ |2 U( 
0
1
 0 | + 2nT )
> (|X
(R1 + R2 )/2

Pm ( 43 < T n )
 2 
Pm T n 0

+ C1 C2 E m 1/2
C1 
|i U (X(s))| ds

0 i=1

Pm ( 43 < T n ) + C3 .

Since Pm ( 43 < T n ) 0 as m 0 for any (0, 34 0 ] by Lemma 6.3.3, we get
our assertion by taking rst > 0 small enough then m > 0 small enough.

By the same argument as in the derivation of Lemma 6.3.6 from Lemmas 3.5.2
and 6.3.5, with the help of Lemmas 6.3.7 and 6.3.8, we get the following, which
means that condition (4) of Theorem 6.3.2 is also satisfied.

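Lemma 6.3.9. Assume that $f \in C_0^\infty(\mathbb{R}^{4d})$ satisfies

\[
M_1^{-1} (\nabla_{V_1} f)(\vec X, \vec V) \cdot (X_1 - X_2) + M_2^{-1} (\nabla_{V_2} f)(\vec X, \vec V) \cdot (X_2 - X_1) = 0
\]

for any $(\vec X, \vec V) \in \partial D_0 \times \mathbb{R}^{2d}$. Then $f(\vec X(t), \vec V(t))$ is continuous in $t$ almost surely,
under any cluster point of $\{(\vec X(t), \vec V(t))_t$ under $P_m\}$ as $m \to 0$.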
 

This completes the proof of the fact that in our setting any cluster point of
the distribution of $\{(X_t, V_t)\}_t$ under $P_m$ as $m \to 0$ satisfies all of the conditions of
Theorem 6.3.2. Therefore, by the uniqueness of Theorem 6.3.2, the distribution of


{(Xt , Vt )}t under Pm converges to P,0 as m 0.

Acknowledgment
We would like to thank the referees for their valuable comments which helped to
improve the manuscript in many ways. We would also like to thank Professor Sergio
Albeverio for reading the manuscript carefully. This work was financially supported
by Grant-in-Aid for the Encouragement of Young Scientists (No. 21740063), Japan
Society for the Promotion of Science.

References
[1] P. Billingsley, Convergence of Probability Measures (John Wiley & Sons, Inc., 1968).
[2] P. Calderoni, D. Dürr and S. Kusuoka, A mechanical model of Brownian motion in
half-space, J. Statist. Phys. 55(3–4) (1989) 649–693.
[3] D. Dürr, S. Goldstein and J. L. Lebowitz, A mechanical model of Brownian motion,
Comm. Math. Phys. 78(4) (1980/81) 507–530.
[4] D. Dürr, S. Goldstein and J. L. Lebowitz, A mechanical model for the Brownian
motion of a convex body, Z. Wahrsch. Verw. Gebiete 62(4) (1983) 427–448.
[5] D. Dürr, S. Goldstein and J. L. Lebowitz, Stochastic processes originating in deter-
ministic microscopic dynamics, J. Statist. Phys. 30(2) (1983) 519–526.
[6] R. Holley, The motion of a heavy particle in an infinite one dimensional gas of hard
spheres, Z. Wahrsch. Verw. Gebiete 17 (1971) 181–219.
[7] N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes,
North-Holland Mathematical Library, Vol. 24 (North-Holland Publishing Co., Kodan-
sha, Ltd., 1981).
[8] O. Kallenberg, Foundations of Modern Probability, Probability and Its Applications,
2nd edn. (Springer-Verlag, New York, 2002).
[9] S. Kusuoka, Stochastic Newton equation with reflecting boundary condition, in
Stochastic Analysis and Related Topics in Kyoto, Adv. Stud. Pure Math., Vol. 41
(Math. Soc. Japan, 2004), pp. 233–246.
[10] E. Nelson, Dynamical Theories of Brownian Motion (Princeton University Press,
Princeton, 1967).
[11] M. Reed and B. Simon, Methods of Modern Mathematical Physics. III. Scattering
Theory (Academic Press, 1979).
[12] J. A. M. van der Weide, Stochastic Processes and Point Processes of Excursions,
CWI Tract, Vol. 102 (Stichting Mathematisch Centrum, Centrum voor Wiskunde en
Informatica, Amsterdam, 1994).
August 10, 2010 15:1 WSPC/S0129-055X 148-RMP
J070-S0129055X10004089

Reviews in Mathematical Physics


Vol. 22, No. 7 (2010) 839–858
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004089

A NOTE ON THE NON-COMMUTATIVE
LAPLACE–VARADHAN INTEGRAL LEMMA

W. DE ROECK
Institut für Theoretische Physik, Universität Heidelberg, Germany
w.deroeck@thphys.uni-heidelberg.de

CHRISTIAN MAES
Instituut voor Theoretische Fysica, K. U. Leuven, Belgium
christian.maes@fys.kuleuven.be

KAREL NETOČNÝ
Institute of Physics, Academy of Sciences of the Czech Republic
Prague, Czech Republic
netocny@fzu.cz

LUC REY-BELLET
Department of Mathematics and Statistics,
University of Massachusetts, Amherst, USA
luc@math.umass.edu

Received 10 September 2009


Revised 21 May 2010

We continue the study of the free energy of quantum lattice spin systems where to the
local Hamiltonian H an arbitrary mean field term is added, a polynomial function of the
arithmetic mean of some local observables X and Y that do not necessarily commute.
By slightly extending a recent paper by Hiai, Mosonyi, Ohno and Petz [10], we prove
in general that the free energy is given by a variational principle over the range of
the operators X and Y. As in [10], the result is a non-commutative extension of the
Laplace–Varadhan asymptotic formula.

Keywords: Quantum large deviations; quantum lattice systems; Laplace–Varadhan lemma.

Mathematics Subject Classification 2010: 82B10

1. Introduction
1.1. Large deviations
One of the highlights in the combination of analysis and probability theory is the
asymptotic evaluation of certain integrals. We have here in mind integrals of the
form, for some real-valued function $G$,

\[
\int d\mu_n(x)\, \exp\{v_n G(x)\}, \qquad v_n \nearrow +\infty \ \text{as}\ n \nearrow +\infty, \tag{1.1}
\]

for which the measures $\mu_n$ satisfy a law of large numbers. Such integrals can be
evaluated depending on the asymptotics of the $\mu_n$. The latter is the subject of
the theory of large deviations, characterizing the rate of convergence in the law of
large numbers. In a typical scenario, the $\mu_n$ are the probabilities of some macroscopic
variable, such as the average magnetization or the particle density in ever
growing volumes $v_n$ and as distributed in a given equilibrium Gibbs ensemble.
Then, depending on the case, thermodynamic potentials $J$ make the rate function
$d\mu_n(x) \sim dx\, \exp\{-v_n J(x)\}$ in the sense of large deviations for Gibbs measures,
see [8, 9, 16, 22, 23]. That theory of large deviations is, however, broader than the
applications in equilibrium statistical mechanics. Essentially, when the rate function
for $\mu_n$ is given by $J$, then the integral (1.1) is computed as

\[
\frac{1}{v_n} \log \int d\mu_n(x)\, \exp\{v_n G(x)\} \xrightarrow[n \nearrow +\infty]{} \sup_x \{G(x) - J(x)\}. \tag{1.2}
\]
This is a typical application of Laplace's asymptotic formula for the evaluation of
real-valued integrals. The systematic combination with the theory of large deviations
gives the so-called Laplace–Varadhan integral lemma.
We first recall the large deviation principle (LDP). Let $(M, d)$ be some complete
separable metric space.

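Definition 1.1. The sequence of measures $\mu_n$ on $M$ satisfies a LDP with rate
function $J : M \to \mathbb{R}^+ \cup \{+\infty\}$ and speed $v_n \in \mathbb{R}^+$ if

(1) $J$ is convex and has closed level sets, i.e.,

\[
\{J^{-1}(x),\ x \le c\} \tag{1.3}
\]

is closed in $(M, d)$ for all $c \in \mathbb{R}^+$;

(2) for all Borel sets $U \subset M$ with interior $\operatorname{int} U$ and closure $\operatorname{cl} U$, one has

\[
\liminf_{n \nearrow +\infty} \frac{1}{v_n} \log \mu_n(U) \ge -\inf_{u \in \operatorname{int} U} J(u),
\]

\[
\limsup_{n \nearrow +\infty} \frac{1}{v_n} \log \mu_n(U) \le -\inf_{u \in \operatorname{cl} U} J(u).
\]

We say that the rate function $J$ is good whenever the level sets (1.3) are compact.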

For the transfer of LDP, one considers a pair $(\mu_n, \nu_n)$, $n \nearrow +\infty$, of sequences of
absolutely continuous measures on $(M, d)$ such that
\[
\frac{d\nu_n}{d\mu_n}(x) = \exp\{v_n G(x)\}, \qquad \mu_n\text{-almost everywhere},
\]
for some measurable mapping $G : M \to \mathbb{R}$. We now state an instance of the
Laplace–Varadhan lemma.

Lemma 1.1 (Laplace–Varadhan Integral Lemma). Assume that $G$ is bounded
and continuous and that the sequence $(\mu_n)$ satisfies a large deviation principle with
good rate function $J$ and speed $v_n$. Then $(\nu_n)$ satisfies a large deviation principle
with good rate function $J - G$ and speed $v_n$.

For more general versions and proofs we refer to the literature, see e.g. [5–7, 22,
23]; it remains an important subject of analytic probability theory to extend the
validity of the variational formulation (1.2) and to deal with its applications.

1.2. Mean-field interactions


From the point of view of equilibrium statistical mechanics, one can also think of
the formula (1.1) as giving (the exponential of) the pressure or free energy when
adding a mean field type term to a Hamiltonian which is a sum of local interactions.
The choice of the function $G$ is then typically monomial with a power decided
by the number of particles or spins that are in direct interaction. For example, the
free energy of an Ising-like model with such an extra mean field interaction would
be given by the limit
\[
\lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \sum_{\sigma \in \{+,-\}^\Lambda} \exp\Big\{ -\beta H_\Lambda(\sigma) + \lambda_p |\Lambda| \Big( \frac{1}{|\Lambda|} \sum_{i \in \Lambda} \sigma_i \Big)^p \Big\} \tag{1.4}
\]
for $p = 1, 2, \ldots$, where $H_\Lambda(\sigma)$ is the (local) energy of the spin configuration $\sigma$ and
the limit takes a sequence of regularly expanding boxes $\Lambda$ to cover some given lattice.
The case $p = 1$ corresponds to the addition of a magnetic field $\lambda_1$; $p = 2$ is
most standard and effectively adds a very small but long range two-spin interaction.
Higher $p$-values are also not uncommon in the study of Ising interactions on
hypergraphs, and even very large $p$ has been found relevant, e.g., in models of spin
glasses and in information theory [4].
The form (1.1) is easily recognized in (1.4), with
\[
\mu_n(x) \sim \sum_{\sigma \in \{+,-\}^\Lambda,\ \sum_i \sigma_i = x|\Lambda|} \exp\{-\beta H_\Lambda(\sigma)\}, \qquad v_n = |\Lambda|,
\]
and the function $G(x) = \lambda_p x^p$. The Laplace–Varadhan lemma applies to (1.4) since
we know that the sequence of Gibbs states with density $\sim \exp\{-\beta H_\Lambda(\sigma)\}$ satisfies
a LDP with a good rate function $J_{cl}$ and speed $|\Lambda|$. The result reads that (1.4) is
given by the variational formula
\[
\sup_{u \in [-1, 1]} \{\lambda_p u^p - J_{cl}(u)\}. \tag{1.5}
\]
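The variational formula can be checked numerically in the simplest case. The sketch below is an illustration only and assumes a vanishing local Hamiltonian, so that the sum over spin configurations reduces to a binomial sum and the rate function relative to the counting measure is minus the entropy per spin; the function names `finite_volume_pressure`, `entropy` and `variational_value` are hypothetical. It compares the finite-volume quantity in (1.4) with the supremum in (1.5).

```python
import math


def log_binom(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def finite_volume_pressure(n, lam, p):
    """(1/n) log of the sum over spin configurations of exp(lam * n * m**p).

    Assumes H = 0; m = (2k - n)/n is the mean spin when k spins are up.
    """
    terms = [log_binom(n, k) + lam * n * ((2 * k - n) / n) ** p
             for k in range(n + 1)]
    top = max(terms)  # log-sum-exp for numerical stability
    return (top + math.log(sum(math.exp(t - top) for t in terms))) / n


def entropy(u):
    """Entropy per spin at mean spin u; here J_cl(u) = -entropy(u) (assumption: H = 0)."""
    total = 0.0
    for q in ((1 + u) / 2, (1 - u) / 2):
        if q > 0:
            total -= q * math.log(q)
    return total


def variational_value(lam, p, grid=20001):
    """Grid approximation of sup over u in [-1, 1] of lam * u**p - J_cl(u)."""
    us = [-1 + 2 * i / (grid - 1) for i in range(grid)]
    return max(lam * u ** p + entropy(u) for u in us)
```

For moderate system sizes the two values should agree up to a correction that shrinks as the number of spins grows, which is the content of the Laplace-type asymptotics.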

In non-commutative versions the local Hamiltonian $H$ and the additional mean
field term are allowed not to commute with each other. That is natural within the
statistical mechanics of quantum spin systems and this is also the context of the
present paper.

1.3. Non-commutative extensions


Although it has proven very useful to think of integrals (1.1) within the framework
of probability and large deviation theory, it is fundamentally a problem of analysis.
However, without such a probabilistic context, the question of a non-commutative
extension of the Laplace–Varadhan Lemma 1.1 becomes ambiguous and in fact
allows for different formulations, each possibly having a physical interpretation of
its own.
One approach is to ask for the asymptotic evaluation of the expectations
\[
\lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \omega_\Lambda\big( e^{|\Lambda| G(\bar X_\Lambda)} \big) \tag{1.6}
\]
under a family of quantum states $\omega_\Lambda$, where $\bar X_\Lambda$ would now be the arithmetic mean
of some quantum observable in volume $\Lambda$. To be specific, one can take $\omega_\Lambda$ a quantum
Gibbs state for a Hamiltonian $H_\Lambda$ at inverse temperature $\beta$, with density matrix
$\sim \exp\{-\beta H_\Lambda\}$, and $\bar X_\Lambda = (\sum_{i \in \Lambda} X_i)/|\Lambda|$ the mean magnetization in some fixed
direction. Arguably, this formulation is closely related to the asymptotic statistics
of outcomes in von Neumann measurements of $\bar X_\Lambda$. Indeed, let $\mu_\Lambda$ be the measure
on $[-\|X\|, \|X\|]$ defined by
\[
\mu_\Lambda(f) := \omega_\Lambda(f(\bar X_\Lambda)) \quad \text{for } f \in C([-\|X\|, \|X\|]). \tag{1.7}
\]
Then, (1.6) can be evaluated with the help of Lemma 1.1 (the commutative Laplace–Varadhan
integral lemma) if the family $\mu_\Lambda$ satisfies a LDP with speed $|\Lambda|$. In recent
years, this LDP has been established for $\omega_\Lambda \sim \exp\{-\beta H_\Lambda\}$ in the regime of small $\beta$
(high temperature) or $d = 1$, see [11, 13–15].
A more general class of possible extensions is obtained by considering the
limits of
\[
\frac{1}{|\Lambda|} \log \operatorname{Tr}\Big( \rho_\Lambda^{1/K}\, e^{\frac{|\Lambda|}{K} G(\bar X_\Lambda)} \Big)^K, \qquad \Lambda \nearrow \mathbb{Z}^d \tag{1.8}
\]
for different $K > 0$, where $\rho_\Lambda$ is the density matrix of a quantum state in box $\Lambda$.
For the canonical form $\rho_\Lambda = \exp(-\beta H_\Lambda)/Z_\Lambda$ with local Hamiltonian $H_\Lambda$ at inverse
temperature $\beta$, (1.8) becomes
\[
\frac{1}{|\Lambda|} \log \frac{1}{Z_\Lambda} \operatorname{Tr}\Big( e^{-\frac{\beta}{K} H_\Lambda}\, e^{\frac{|\Lambda|}{K} G(\bar X_\Lambda)} \Big)^K, \qquad \Lambda \nearrow \mathbb{Z}^d. \tag{1.9}
\]
There is no a priori reason to exclude any particular value of $K$ from consideration.
Two standard options are: $K = 1$, which corresponds to the expression (1.6) above,
and $K \nearrow +\infty$, which, by the Trotter product formula, boils down to
\[
\frac{1}{|\Lambda|} \log \frac{1}{Z_\Lambda} \operatorname{Tr}\big( e^{-\beta H_\Lambda + |\Lambda| G(\bar X_\Lambda)} \big), \qquad \Lambda \nearrow \mathbb{Z}^d \tag{1.10}
\]
which is the free energy of a corresponding quantum spin model, cf. (1.4). In the
present paper, we study the case $K \nearrow +\infty$ (without touching the question of
interchangeability of both limits).
One of our results, Theorem 3.1 with Y = Y = 0, is of the form
\[
\lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \operatorname{Tr}\big( e^{-\beta H_\Lambda + |\Lambda| G(\bar X_\Lambda)} \big) = \sup_{-\|X\| \le u \le \|X\|} \{G(u) - J(u)\}. \tag{1.11}
\]
Note that we omitted the normalization factor $1/Z_\Lambda$ since it merely adds a constant
(independent of $G$) to (1.10). In the usual context of the theory of large deviations,
formula (1.11) arises as a change of rate function. However, while our result (1.11)
very much looks like Varadhan's formula in Lemma 1.1, there is a big difference in
interpretation: The function $J$ is not as such the rate function of large deviations
for $\bar X_\Lambda$. Instead, it is given as the Legendre transform
\[
J(u) = \sup_{t \in \mathbb{R}} \{tu - q(t)\}, \qquad u \in \mathbb{R} \tag{1.12}
\]
of a function $q(\cdot)$ which is the pressure corresponding to a linearized interaction, i.e.,
\[
q(t) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \operatorname{Tr}\big( e^{-\beta H_\Lambda + t |\Lambda| \bar X_\Lambda} \big). \tag{1.13}
\]
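To make the Legendre transform (1.12) concrete, the following sketch evaluates it numerically for a toy pressure. It assumes non-interacting spin-1/2 systems with vanishing Hamiltonian and the single-site observable taken to be a Pauli matrix, for which (1.13) gives $q(t) = \log(2\cosh t)$; this choice, and the function names `pressure`, `legendre` and `closed_form`, are assumptions made for illustration and are not taken from the paper.

```python
import math


def pressure(t):
    """Toy pressure q(t) = log(2 cosh t), written in an overflow-safe form."""
    a = abs(t)
    return a + math.log1p(math.exp(-2 * a))


def legendre(u, lo=-40.0, hi=40.0, iters=200):
    """Numerical J(u) = sup over t of (t*u - q(t)) by ternary search.

    The map t -> t*u - q(t) is concave, so ternary search applies.
    Intended for |u| < 1, where the supremum is attained at a finite t.
    """
    def objective(t):
        return t * u - pressure(t)

    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if objective(a) < objective(b):
            lo = a
        else:
            hi = b
    return objective((lo + hi) / 2)


def closed_form(u):
    """Closed form of J(u) for the toy pressure, valid for |u| < 1."""
    return ((1 + u) / 2) * math.log(1 + u) + ((1 - u) / 2) * math.log(1 - u) - math.log(2)
```

In this toy case the supremum of $-J$ is attained at $u = 0$ and equals $q(0) = \log 2$, in line with (1.11) for $G = 0$.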

1.4. Several non-commuting observables: Towards joint large deviations?
In the previous Sec. 1.3, we made the tacit assumption that there is a single observable
$\bar X_\Lambda$ corresponding to some Hermitian operator on Hilbert space. However, in
formula (1.4), the observable $\frac{1}{|\Lambda|} \sum_{i \in \Lambda} \sigma_i$ could equally well represent a vector-valued
magnetization which, upon quantization, would correspond to several non-commuting
observables $\bar X_\Lambda$, $\bar Y_\Lambda$, say, the magnetization along the x-axis and y-axis,
respectively. In the commutative theory, this case does not require special attention;
the framework of large deviations applies equally regardless of whether the observable
takes values in $\mathbb{R}$ or $\mathbb{R}^2$. Obviously, this is not true in the non-commutative
setting and in fact, we do not even know a natural analogue of the generating function
(1.6), since we do not have at our disposal a simultaneous von Neumann measurement
of $\bar X_\Lambda$ and $\bar Y_\Lambda$. One can take the point of view that this is inevitable in quantum
mechanics, and insisting is pointless. Yet, as $\Lambda \nearrow \mathbb{Z}^d$, the commutator
\[
[\bar X_\Lambda, \bar Y_\Lambda] = O\Big( \frac{1}{|\Lambda|} \Big) \tag{1.14}
\]
vanishes and hence the joint measurability of $\bar X_\Lambda$, $\bar Y_\Lambda$ is restored on the macroscopic
scale. We refer the reader to [19] where this issue is discussed and studied in more
depth.
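The scaling in (1.14) can be verified directly for small systems. The sketch below assumes spin-1/2 sites with the single-site observables taken to be the Pauli matrices along the x- and y-axes, and builds the two mean observables as explicit matrices in pure Python; all helper names are hypothetical. For this choice the commutator is diagonal, so its largest absolute entry coincides with its operator norm.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def site_operator(op, site, n):
    """Operator acting as `op` on one site and as the identity elsewhere."""
    result = [[1]]
    for k in range(n):
        result = kron(result, op if k == site else I2)
    return result


def mean_operator(op, n):
    """Arithmetic mean over n sites of the single-site operator `op`."""
    dim = 2 ** n
    total = [[0j] * dim for _ in range(dim)]
    for site in range(n):
        term = site_operator(op, site, n)
        for r in range(dim):
            for c in range(dim):
                total[r][c] += term[r][c] / n
    return total


def matmul(a, b):
    """Plain matrix product for lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def commutator_size(n):
    """Largest absolute entry of the commutator of the two mean observables."""
    x = mean_operator(SX, n)
    y = mean_operator(SY, n)
    xy = matmul(x, y)
    yx = matmul(y, x)
    return max(abs(p - q) for rp, rq in zip(xy, yx) for p, q in zip(rp, rq))
```

For this particular pair of observables the size of the commutator is exactly twice the inverse of the number of sites, which illustrates the order of magnitude stated in (1.14).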
The advantage of the approach via the Laplace–Varadhan Lemma is that one
can set aside these conceptual questions and study joint large deviations of $\bar X_\Lambda$ and
$\bar Y_\Lambda$ by choosing $G$ to be a joint function of $\bar X_\Lambda$ and $\bar Y_\Lambda$, for example a symmetrized
monomial
\[
G(\bar X_\Lambda, \bar Y_\Lambda) = (\bar X_\Lambda)^k (\bar Y_\Lambda)^l + (\bar Y_\Lambda)^l (\bar X_\Lambda)^k, \qquad \text{for some } k, l \in \mathbb{N}, \tag{1.15}
\]
and check whether the formula (1.11) remains valid with some obvious adjustments.
This turns out to be the case and it is our main result: Theorem 3.1.

1.5. Comparison with previous results


The asymptotics of the expression (1.10) was first studied, and the result (1.11) was
first obtained, by Petz et al. [17], in the case where the Hamiltonian $H_\Lambda$ is made
solely from a one-body interaction. The corresponding equilibrium state is then a
product state. In [10], Hiai et al. generalized this result to the case of locally interacting
spins, but the lattice dimension was restricted to d = 1. However, the authors
of [10] argue that the restriction to d = 1 can be lifted in the high-temperature
regime. The main reason is that their work relies heavily on an asymptotic decoupling
condition which is proven in that regime [1]. One should observe here that
this asymptotic decoupling condition in fact implies a large deviation principle for
$\bar X_\Lambda$, as follows from the work of Pfister [18]. Hence, in the language of Sec. 1.3, [10]
evaluates (1.10) (the case $K = \infty$) in those regimes where (1.6) (the case $K = 1$)
can be evaluated as well.
The present paper elaborates on the result of [10] in two ways. First, we remark
that, in our setup, the decoupling condition is actually not necessary for (1.11)
to hold, and therefore one can do away with the restriction to d = 1 or high
temperature. Hence, again referring to Sec. 1.3, the case $K = \infty$ can be controlled
even when we know little about the case $K = 1$. To drop the decoupling condition,
it is absolutely essential that we start from finite-volume Gibbs states, and not from
finite-volume restrictions of infinite-volume Gibbs states, as is done in [10].
Second, we show that by the same formalism, one can treat the case of sev-
eral noncommuting observables, as explained in Sec. 1.4. The most serious step in
this generalization is actually an extension of the result of [17] to noncommuting
observables. This extension is stated in Lemma 6.1 and proven in Sec. 7.
Note. While we were finishing this paper, we learnt of a similar project by J.-B. Bru
and W. de Siqueira Pedra. Their result [3] is nothing less than a full-fledged theory
of equilibrium states with mean-field terms in the Hamiltonian, describing not only
the mean-field free energy (as we do here), but also the states themselves. Also,
their results hold for fermions, while ours are restricted to spin systems, and they
provide interesting examples. Yet, the focus of our paper differs from theirs and our
main result is not contained in their paper.

1.6. Outline
In Sec. 2, we sketch the setup. We introduce spin systems on the lattice, non-
commutative polynomials and ergodic states. Section 3 describes the result of the
paper. The remaining Secs. 4-7 contain the proofs.

2. Setup
2.1. Hamiltonian and observables
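We consider a quantum spin system on the regular lattice $\mathbb{Z}^d$, $d = 1, 2, \ldots$. We
briefly introduce the essential setup below, and we refer to [12, 20] for more
expanded, standard introductions.
The single-site Hilbert space $\mathcal{H}$ is finite-dimensional (isomorphic to $\mathbb{C}^n$) and for
any finite volume $\Lambda \subset \mathbb{Z}^d$, we set $\mathcal{H}_\Lambda = \otimes_{\Lambda} \mathcal{H}$. The $C^*$-algebra of bounded operators
on $\mathcal{H}_\Lambda$ is denoted by $\mathcal{B}_\Lambda \equiv \mathcal{B}(\mathcal{H}_\Lambda)$. The standard embedding $\mathcal{B}_\Lambda \subset \mathcal{B}_{\Lambda'}$ for $\Lambda \subset \Lambda'$
is assumed throughout. The quasi-local algebra $\mathcal{U}$ is defined as the norm closure of
the finite-volume algebras

    $\mathcal{U} := \overline{\bigcup_{\Lambda\ \mathrm{finite}} \mathcal{B}_\Lambda}$.    (2.1)

Denote by $\tau_i$, $i \in \mathbb{Z}^d$, the translation which shifts all observables over a lattice
vector $i$, i.e. $\tau_i$ is a homomorphism from $\mathcal{B}_\Lambda$ onto $\mathcal{B}_{i+\Lambda}$.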
We introduce an interaction potential $\Phi$, that is, a collection $(\Phi_A)$ of Hermitian
elements of $\mathcal{B}_A$, labeled by finite subsets $A \subset \mathbb{Z}^d$. We assume translation invariance
(i) and a finite range (ii):
(i) $\tau_i(\Phi_A) = \Phi_{i+A}$ for all finite $A \subset \mathbb{Z}^d$;
(ii) there is a $d_{\max} < \infty$ such that, if $\mathrm{diam}(A) > d_{\max}$, then $\Phi_A = 0$.
In estimates, we will frequently use the number

    $r(\Phi) := \sum_{A \ni 0} \|\Phi_A\| < \infty$.    (2.2)

The local Hamiltonian in a finite volume $\Lambda$ is

    $H_\Lambda \equiv H_\Lambda^\Phi = \sum_{A \subset \Lambda} \Phi_A$,    (2.3)

which corresponds to free or open boundary conditions. Boundary conditions will,
however, turn out to be irrelevant for our results. We will drop the superscript $\Phi$
since we will keep the interaction potential fixed.
Let $X, Y, \ldots$ denote local observables on the lattice, located at the origin, i.e.
$\mathrm{Supp}\, X$ (which is defined as the smallest set $A$ such that $X \in \mathcal{B}_A$) is a finite set
which includes $0 \in \mathbb{Z}^d$.
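We write

    $X_\Lambda := \sum_{j \in \mathbb{Z}^d,\ \mathrm{Supp}\, \tau_j X \subset \Lambda} \tau_j X$    (2.4)

and

    $\bar X_\Lambda := \frac{1}{|\Lambda|} X_\Lambda$    (2.5)

for the corresponding intensive observable (the empirical average of $X$).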
All of these operators are naturally embedded into the quasi-local algebra $\mathcal{U}$. At
some point, we will also require the intensive infinite-volume observable

    $\bar X \equiv \lim_{\Lambda \nearrow \mathbb{Z}^d} \bar X_\Lambda$.

Some care is required in dealing with $\bar X$ since it does not belong to the quasi-local
algebra $\mathcal{U}$. We will further comment on this in Sec. 2.3.

2.2. Non-commutative polynomials


We will perturb the Hamiltonian $H_\Lambda$ by a mean-field term of the form $|\Lambda| G(\bar X_\Lambda, \bar Y_\Lambda)$,
where $G$ is a non-commutative polynomial of the operators $\bar X_\Lambda$, $\bar Y_\Lambda$, e.g., as
in (1.15).
In this section, we introduce these non-commutative polynomials $G$ as quantizations
of polynomial functions $g$. First, we define

    $\mathrm{Ran}(X, Y) := [-\|X\|, \|X\|] \times [-\|Y\|, \|Y\|]$.    (2.6)

This definition is motivated by the fact that (sp stands for spectrum)

    $\mathrm{sp}\, \bar X_\Lambda \times \mathrm{sp}\, \bar Y_\Lambda \subset \mathrm{Ran}(X, Y)$,  for all $\Lambda$.    (2.7)

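Let $g$ be a real polynomial function on the rectangular set $\mathrm{Ran}(X, Y)$. Using
the symbol $I$ for the collection of all finite sequences from the binary set $\{1, 2\}$,
any map $G : I \to \mathbb{C}$ is called a quantization of $g$ whenever

    $\sum_{n=0}^{N} \sum_{\alpha = (\alpha(1), \ldots, \alpha(n)) \in I} G(\alpha)\, x_{\alpha(1)} \cdots x_{\alpha(n)} = g(x_1, x_2)$    (2.8)

for all $(x_1, x_2) \in \mathrm{Ran}(X, Y)$ and for some $N \in \mathbb{N}$. A quantization $G$ is called
symmetric whenever

    $G(\alpha(1), \ldots, \alpha(n)) = \overline{G(\alpha(n), \ldots, \alpha(1))}$.    (2.9)

Any such symmetric quantization $G$ defines a self-adjoint operator

    $G(X, Y) = \sum_{n=0}^{N} \sum_{\alpha = (\alpha(1), \ldots, \alpha(n)) \in I} G(\alpha)\, X_{\alpha(1)} \cdots X_{\alpha(n)}$,    (2.10)

taking $X_1 \equiv X$ and $X_2 \equiv Y$.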
In the thermodynamic limit, one expects different quantizations of $g$ to be
equivalent:

Lemma 2.1. Let $G$ and $G'$ be any two quantizations of $g : \mathrm{Ran}(X, Y) \to \mathbb{R}$. Then

    $\| G(\bar X_\Lambda, \bar Y_\Lambda) - G'(\bar X_\Lambda, \bar Y_\Lambda) \| \leq \frac{C_g(X, Y)}{|\Lambda|}$    (2.11)

for some $C_g(X, Y) < \infty$, and for all finite volumes $\Lambda$.

Proof. This is a simple consequence of the fact that the commutator of macroscopic
observables vanishes in the thermodynamic limit; more precisely,

    $\| [\bar X_\Lambda, \bar Y_\Lambda] \| \leq \frac{2}{|\Lambda|} \|X\|\, |\mathrm{Supp}\, X|\, \|Y\|\, |\mathrm{Supp}\, Y|$.    (2.12)

Indeed, our results, Theorems 3.1 and 3.2, do not depend on the choice of
quantization. This can also be checked a priori using the above lemma and the
log-trace inequality in (3.11).
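As a minimal worked illustration of Lemma 2.1 (the polynomial $g(x_1, x_2) = x_1 x_2$ and the two quantizations below are our own choice, not taken from the text), compare the unsymmetrized product with the symmetrized one. Both reduce to $x_1 x_2$ on commuting variables, and their difference is a single commutator of empirical averages:

```latex
\begin{align*}
G(\bar X_\Lambda, \bar Y_\Lambda) &= \bar X_\Lambda \bar Y_\Lambda, \qquad
G'(\bar X_\Lambda, \bar Y_\Lambda) = \tfrac12 \bigl( \bar X_\Lambda \bar Y_\Lambda + \bar Y_\Lambda \bar X_\Lambda \bigr), \\
G(\bar X_\Lambda, \bar Y_\Lambda) - G'(\bar X_\Lambda, \bar Y_\Lambda)
  &= \tfrac12\, [\bar X_\Lambda, \bar Y_\Lambda], \\
\bigl\| G(\bar X_\Lambda, \bar Y_\Lambda) - G'(\bar X_\Lambda, \bar Y_\Lambda) \bigr\|
  &= \tfrac12\, \bigl\| [\bar X_\Lambda, \bar Y_\Lambda] \bigr\|
   = O\!\left( \frac{1}{|\Lambda|} \right).
\end{align*}
```

Longer monomials can be treated in the same way, reordering one pair of adjacent factors at a time and bounding the remaining factors by their norms.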

2.3. Infinite-volume states


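A state $\omega_\Lambda$ is a positive linear functional on $\mathcal{B}_\Lambda$, normalized by
$\|\omega_\Lambda\| = \omega_\Lambda(1) = 1$.
An example is the tracial state, $\omega_\Lambda(\cdot) \propto \mathrm{Tr}_\Lambda(\cdot)$. In general we consider states $\omega_\Lambda$
as characterized by their density matrix $\rho_\Lambda$, $\omega_\Lambda(\cdot) = \mathrm{Tr}_\Lambda(\rho_\Lambda\, \cdot)$.
An infinite-volume state $\omega$ is a positive normalized functional on the $C^*$-algebra
$\mathcal{U}$ (the quasi-local algebra). It is translation invariant when $\omega(A) = \omega(\tau_j A)$ for all
$j \in \mathbb{Z}^d$ and $A \in \mathcal{U}$. A translation-invariant state $\omega$ is ergodic whenever it is an
extremal point in the convex set of translation-invariant states. A state $\omega$ is called
symmetric whenever it is invariant under a permutation of the lattice sites, that is,
for any sequence of one-site observables $A_1, \ldots, A_n \in \mathcal{B}_{\{0\}} \subset \mathcal{U}$ and $i_1, \ldots, i_n \in \mathbb{Z}^d$,

    $\omega(\tau_{i_1}(A_1) \tau_{i_2}(A_2) \cdots \tau_{i_n}(A_n)) = \omega(\tau_{i_{\pi(1)}}(A_1) \tau_{i_{\pi(2)}}(A_2) \cdots \tau_{i_{\pi(n)}}(A_n))$    (2.13)

for any permutation $\pi$ of the set $\{1, \ldots, n\}$. The set of ergodic/symmetric states
on $\mathcal{U}$ is denoted by $S_{\mathrm{erg}}$, $S_{\mathrm{sym}}$, respectively.
At some point we will need the theorem by Størmer [21] which states that any
$\omega \in S_{\mathrm{sym}}$ can be decomposed as

    $\omega = \int_{\mathrm{prod.}} d\nu_\omega(\phi)\, \phi$    (2.14)

for some regular probability measure $\nu_\omega$ whose support consists of product states.
Of course, the set of product states can be identified with the (finite-dimensional)
set of states on the one-site algebra $\mathcal{B}_{\{0\}} = \mathcal{B}(\mathcal{H})$.
For a finite-volume state $\omega_\Lambda$ on $\mathcal{B}_\Lambda$, we consider the entropy functional

    $S(\omega_\Lambda) \equiv S_\Lambda(\omega_\Lambda) = -\mathrm{Tr}_\Lambda\, \rho_\Lambda \log \rho_\Lambda$.    (2.15)

The mean entropy of a translation-invariant infinite-volume state $\omega$ is defined as

    $s(\omega) := \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} S(\omega_\Lambda)$,  with $\omega_\Lambda := \omega|_{\mathcal{B}_\Lambda}$ (restriction to $\Lambda$).    (2.16)

In this formula and in the rest of the paper, the limit $\lim_{\Lambda \nearrow \mathbb{Z}^d}$ is meant in the
sense of Van Hove, see, e.g., [12, 20]. Standard properties of the functional $s$ are its
affinity and upper semicontinuity (with respect to the weak*-topology on states).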
In Sec. 2.1, we mentioned the observables at infinity $\bar X$ and $\bar Y$, postponing
their definition to the present section. Expressions like $\omega(\bar X^l \bar Y^k)$ (for some positive
integers $l, k$) can be defined as

    $\omega(\bar X^l \bar Y^k) := \lim_{\Lambda, \Lambda' \nearrow \mathbb{Z}^d} \omega(\bar X_\Lambda^l \bar Y_{\Lambda'}^k)$,    (2.17)

provided that the limit exists. We use the following standard result, which can be
viewed as a non-commutative law of large numbers.

Lemma 2.2. For $\omega \in S_{\mathrm{erg}}$, the limit (2.17) exists and

    $\omega(\bar X^l \bar Y^k) = [\omega(\bar X)]^l [\omega(\bar Y)]^k$.    (2.18)

Note that $\omega(\bar X) = \omega(X)$ and $\omega(\bar Y) = \omega(Y)$ by translation invariance. An immediate
corollary is that for a non-commutative polynomial $G$ which is a quantization
of $g$ (see Sec. 2.2), and $\omega \in S_{\mathrm{erg}}$:

    $\omega(G(\bar X, \bar Y)) = g(\omega(X), \omega(Y))$.    (2.19)

For the convenience of the reader, we sketch the proof of Lemma 2.2 in the
Appendix.
Finally, we note that Lemma 2.2 does not require the state $\omega$ to be trivial at
infinity. Triviality at infinity is a stronger notion which is not used in the present
paper. In particular, the state constructed in Sec. 4 is ergodic, but not trivial at
infinity, since it fails to be ergodic with respect to a subgroup of lattice translations.

3. Result
Choose $X, Y$ to be local operators and let $H_\Lambda$ be the Hamiltonian corresponding
to a finite-range, translation-invariant interaction $\Phi$, as in Sec. 2.1. Let $G$ be a symmetric
quantization of a polynomial $g$ on the rectangle $\mathrm{Ran}(X, Y)$ and $G(\cdot, \cdot)$ the
corresponding self-adjoint operator, as defined in Sec. 2.2. We define the G-mean-field
partition function

    $Z_\Lambda^G(\Phi) := \mathrm{Tr}_\Lambda\left( e^{-H_\Lambda + |\Lambda| G(\bar X_\Lambda, \bar Y_\Lambda)} \right)$    (3.1)

with $\bar X_\Lambda$, $\bar Y_\Lambda$ the empirical averages of $X$, $Y$. The following theorem is our main result:

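Theorem 3.1. Define the pressure

    $p(u, v) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda\, e^{-H_\Lambda + u X_\Lambda + v Y_\Lambda}$    (3.2)

and its Legendre transform

    $I(x, y) = \sup_{(u, v) \in \mathbb{R}^2} (ux + vy - p(u, v))$.    (3.3)

Then

    $\lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) = \sup_{(x, y) \in \mathbb{R}^2} (g(x, y) - I(x, y))$    (3.4)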
where the limit $\Lambda \nearrow \mathbb{Z}^d$ is in the sense of Van Hove, as in (3.2). In particular, the
left-hand side of (3.4) does not depend on the particular form of quantization taken.
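The variational formula can be checked numerically in the simplest commuting special case. The sketch below is only an illustration under our own assumptions, none of which come from the text: spin-1/2 sites, $H_\Lambda = 0$, a single observable $X = \sigma^z$, and $g(x) = \frac{J}{2} x^2$ with a hypothetical coupling $J = 2$. In that case $p(u) = \log(2\cosh u)$, its Legendre transform has a closed form, and the trace reduces to a binomial sum over the values of the mean spin. All function names are illustrative.

```python
import math

def log_partition(n, g):
    """Return (1/n) * log Tr exp(n * g(mean spin)) for n free spin-1/2 sites."""
    # The mean of sigma^z over n sites equals (2k - n)/n with multiplicity C(n, k).
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + n * g((2 * k - n) / n)
        for k in range(n + 1)
    ]
    top = max(log_terms)  # log-sum-exp for numerical stability
    return (top + math.log(sum(math.exp(t - top) for t in log_terms))) / n

def rate_function(x):
    """Legendre transform of p(u) = log(2 cosh u), in closed form for |x| <= 1."""
    total = -math.log(2.0)
    for q in (1.0 + x, 1.0 - x):
        if q > 0.0:  # the term q*log(q)/2 vanishes in the limit q -> 0
            total += 0.5 * q * math.log(q)
    return total

def variational_value(g, grid=20001):
    """Approximate sup over x in [-1, 1] of g(x) - I(x) on a uniform grid."""
    xs = (-1.0 + 2.0 * i / (grid - 1) for i in range(grid))
    return max(g(x) - rate_function(x) for x in xs)

def g_quadratic(x, coupling=2.0):
    """Hypothetical mean-field polynomial g(x) = coupling * x^2 / 2."""
    return 0.5 * coupling * x * x

finite_volume = log_partition(4000, g_quadratic)
limit_value = variational_value(g_quadratic)
print(finite_volume, limit_value)
```

The two printed numbers should agree up to a finite-volume correction of order $\log(n)/n$. The genuinely non-commutative case of the theorem is not covered by this check, since a single diagonal observable makes the trace classical.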

As discussed in Sec. 1, our result expresses the pressure of the mean-field Hamiltonian
through a variational principle. To derive this result, it is helpful to represent
this pressure first as a variational problem on a larger space, namely that of ergodic
states, as in Theorem 3.2. Theorem 3.1 then follows by parametrizing these states
by their values on $X$ and $Y$.
We also need the local energy operator associated to the interaction $\Phi$,

    $E_\Phi := \sum_{A \ni 0} \frac{1}{|A|} \Phi_A$.    (3.5)

Theorem 3.2 (Mean-Field Variational Principle). Let $s(\cdot)$ be the mean
entropy functional, as in Sec. 2.3. Then

    $\lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) = \sup_{\omega \in S_{\mathrm{erg}}} (g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi))$.    (3.6)

To understand how the first term on the right-hand side of (3.6) originates from
(3.1), we recall the equality (2.19) for ergodic states $\omega$.
The proof of Theorem 3.2 is postponed to Secs. 5 and 6. Here we prove that
Theorem 3.1 is a rather immediate consequence of Theorem 3.2.

Proof of Theorem 3.1. We write the right-hand side of (3.6) in the form

    $\sup_{(x, y) \in \mathbb{R}^2} (g(x, y) - \tilde I(x, y))$    (3.7)

where

    $\tilde I(x, y) = \inf_{\omega \in S_{\mathrm{erg}},\ \omega(X) = x,\ \omega(Y) = y} (-s(\omega) + \omega(E_\Phi))$    (3.8)

is a convex function on $\mathbb{R}^2$, infinite on the complement of $\mathrm{Ran}(X, Y)$. To establish
that $\tilde I(x, y)$ is lower semi-continuous (l.s.c.), we proceed as in the proof of the
contraction principle in large deviation theory, see, e.g., [5]: The function
$\omega \mapsto (-s(\omega) + \omega(E_\Phi))$ is l.s.c. and the set $\{\omega \in S_{\mathrm{erg}},\ \omega(X) = x,\ \omega(Y) = y\}$ is compact
by the continuity of $\omega \mapsto (\omega(X), \omega(Y))$ (compactness and continuity with respect
to the weak*-topology). Therefore, the infimum is attained and we can deduce that

    $\{(x, y) \mid \tilde I(x, y) \leq a\} = F(\{\omega \in S_{\mathrm{erg}} \mid -s(\omega) + \omega(E_\Phi) \leq a\})$    (3.9)

where $F : \omega \mapsto (\omega(X), \omega(Y))$. The level set on the left-hand side is closed and hence
$\tilde I$ is l.s.c.
By using the infinite-volume Gibbs variational principle [12, 20], the Legendre-Fenchel
transform of $\tilde I$ reads

    $\sup_{(x, y) \in \mathbb{R}^2} (ux + vy - \tilde I(x, y)) = \sup_{\omega \in S_{\mathrm{erg}}} (s(\omega) - \omega(E_\Phi) + u\, \omega(X) + v\, \omega(Y)) = p(u, v)$.    (3.10)

The equality $\tilde I = I$ then follows by the involution property of the Legendre-Fenchel
transform on the set of convex lower-semicontinuous functions, see, e.g., [20].

Independence of boundary conditions. Observe that both Theorems 3.1
and 3.2 have been formulated for the finite-volume Gibbs states with open boundary
conditions. It is however easy to check that this choice is not essential and
other equivalent formulations can be obtained. Indeed, by the standard log-trace
inequality,

    $| \log \mathrm{Tr}_\Lambda(e^{-H_\Lambda + W_\Lambda + |\Lambda| G(\bar X_\Lambda, \bar Y_\Lambda)}) - \log \mathrm{Tr}_\Lambda(e^{-H_\Lambda + |\Lambda| G(\bar X_\Lambda, \bar Y_\Lambda)}) | \leq \|W_\Lambda\|$    (3.11)

and hence if one chooses $W_\Lambda$ such that $\lim_{\Lambda \nearrow \mathbb{Z}^d} \|W_\Lambda\| / |\Lambda| = 0$, then we can replace
$H_\Lambda$ by $H_\Lambda + W_\Lambda$ in Theorems 3.1 and 3.2.
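The log-trace inequality can be checked by hand on small matrices. The sketch below is an illustration only: it uses the closed-form eigenvalues of a 2x2 Hermitian matrix, and the matrices `M` (playing the role of the exponent) and `W` (the perturbation) are hypothetical test data, not objects from the text.

```python
import math

def eigenvalues_2x2(a, d, b):
    """Eigenvalues of the Hermitian matrix [[a, b], [conj(b), d]] with real a, d."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), abs(b))
    return mean - radius, mean + radius

def log_trace_exp(a, d, b):
    """log Tr exp(M) for the Hermitian matrix M = [[a, b], [conj(b), d]]."""
    low, high = eigenvalues_2x2(a, d, b)
    return high + math.log1p(math.exp(low - high))

def operator_norm(a, d, b):
    """Operator norm of a Hermitian 2x2 matrix: the largest |eigenvalue|."""
    low, high = eigenvalues_2x2(a, d, b)
    return max(abs(low), abs(high))

# Hypothetical test data, stored as (a, d, b) = (diagonal, diagonal, off-diagonal).
M = (0.3, -1.2, 0.7 + 0.4j)
W = (0.1, 0.25, -0.2 + 0.05j)
M_plus_W = tuple(m + w for m, w in zip(M, W))

gap = abs(log_trace_exp(*M_plus_W) - log_trace_exp(*M))
bound = operator_norm(*W)
print(gap, bound)
```

The printed gap should not exceed the printed bound, in line with (3.11); dividing both by the volume is what makes a boundary term with $\|W_\Lambda\| = o(|\Lambda|)$ irrelevant in the limit.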
Finite-range restrictions. Our paper contains some restrictions
that are not essential. In particular, by standard estimates (in particular, those used
to prove the existence of the pressure, see, e.g., [20]) one can relax the finite-range
conditions on the interaction $\Phi$ to the condition that

    $\sum_{A \ni 0} \frac{\|\Phi_A\|}{|A|} < \infty$,    (3.12)

and similarly for the local observables $X$, $Y$. Moreover, it is not necessary that $G$
is a non-commutative polynomial. Starting from (3.11), one checks that it suffices
that $G$ can be approximated in operator norm by non-commutative polynomials.

4. Approximation by Ergodic States


In this section, we describe a construction that is the main ingredient of our proofs,
as well as of those in [10, 17]. This construction will be used in Secs. 6 and 7.
Let $V$ be a hypercube centered at the origin, i.e. $V = [-L, L]^d$ for some $L > 1$,
and let

    $\partial V := \{ i \in V \mid \exists\, i' \in \mathbb{Z}^d \setminus V \text{ such that } i, i' \text{ are nearest neighbors} \}$.    (4.1)

We write

    $\mathbb{Z}^d / V = ((2L + 1)\mathbb{Z})^d$    (4.2)

to denote the block lattice whose points can be thought of as translates of $V$. In
other words, $\mathbb{Z}^d = \bigcup_{i \in \mathbb{Z}^d / V} (V + i)$. Consider a state $\omega_V$ on $\mathcal{B}_V$.
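We aim to build an infinite-volume ergodic state out of $\omega_V$. First, we define the
block product state

    $\tilde\omega := \otimes_{\mathbb{Z}^d / V}\, \omega_V$.    (4.3)

We also define the translation-average $\bar\omega$ of $\tilde\omega$,

    $\bar\omega := \frac{1}{|V|} \sum_{j \in V} \tilde\omega \circ \tau_j$.    (4.4)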

We can now check the following properties, where $\tilde\omega$ denotes the block product state
and $\bar\omega$ its translation-average:

• We have the exact equality of entropies

      $s(\bar\omega) = s(\tilde\omega) = \frac{1}{|V|} S(\omega_V)$.    (4.5)

  This follows from the affinity of the entropy in infinite volume. A remark is in
  order: a priori, the infinite-volume entropy is defined for translation-invariant
  states, whereas $\tilde\omega$ is only periodic. However, one easily sees that the entropy can
  still be defined, e.g. by viewing $\tilde\omega$ as a translation-invariant state on the block
  lattice $\mathbb{Z}^d / V$, and correcting the definition by dividing by $|V|$.
• The state $\bar\omega$ is ergodic. This follows, for example, from an explicit calculation that
  is presented in [10]. Note however that $\bar\omega$ is in general not ergodic with respect
  to the translations over the sublattice $\mathbb{Z}^d / V = ((2L + 1)\mathbb{Z})^d$. This phenomenon
  (though in a slightly different setting) is commented upon in [20] (the end of
  Sec. III.5).
• The state $\bar\omega$ is a good approximation of $\omega_V$ for observables which are empirical
  averages, provided $V$ is large. Consider the local observable $X$ as in Sec. 2.1.
  A translate $\tau_j X$ can lie inside a translate of $V$, i.e. $\mathrm{Supp}\, \tau_j X \subset V + i$ for some
  $i \in \mathbb{Z}^d / V$, or it can lie on the boundary between two translates of $V$. The difference
  between $\bar\omega(X) = \bar\omega(\bar X)$ and $\omega_V(\bar X_V)$ clearly stems from those translates where $X$
  lies on a boundary, and the fraction of such translates is bounded by

      $|\mathrm{Supp}\, X|\, \frac{|\partial V|}{|V|}$.    (4.6)

  Hence

      $| \bar\omega(X) - \omega_V(\bar X_V) | \leq \|X\|\, |\mathrm{Supp}\, X|\, \frac{|\partial V|}{|V|}$.    (4.7)
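To see how fast the boundary ratio appearing in these bounds decays, one can count sites directly. The short sketch below assumes the cube $V = [-L, L]^d \cap \mathbb{Z}^d$ and the inner boundary of (4.1), i.e. the sites of $V$ with at least one coordinate equal to $\pm L$; the function name is illustrative.

```python
def boundary_fraction(half_side, dim):
    """Return |dV| / |V| for the cube V = [-L, L]^d in Z^d, with L = half_side >= 1."""
    volume = (2 * half_side + 1) ** dim
    interior = (2 * half_side - 1) ** dim  # sites with no coordinate equal to +-L
    return (volume - interior) / volume

for half_side in (1, 10, 100, 1000):
    print(half_side, boundary_fraction(half_side, 3))
```

The ratio behaves like $d/L$ for large $L$, which is why all error terms of this type can be made arbitrarily small by enlarging the block.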

5. The Lower Bound


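In this section, we prove the following lower bound.
Lemma 5.1. Recall $Z_\Lambda^G(\Phi)$ as defined in (3.1). Then

    $\liminf_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \geq \sup_{\omega \in S_{\mathrm{erg}}} (g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi))$    (5.1)

where all symbols have the same meaning as in Sec. 3.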


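Proof. Consider a state $\omega \in S_{\mathrm{erg}}$. We show that

    $\liminf_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \geq g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi)$.    (5.2)

Consider, for each volume $\Lambda$, the restriction $\omega_\Lambda := \omega|_{\mathcal{B}_\Lambda}$. By the finite-volume
variational principle (see, e.g., [2, Proposition 6.2.22]),

    $\frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \geq \omega_\Lambda(G(\bar X_\Lambda, \bar Y_\Lambda)) + \frac{1}{|\Lambda|} S(\omega_\Lambda) - \frac{1}{|\Lambda|} \omega_\Lambda(H_\Lambda)$.    (5.3)

The following convergence properties apply with $\Lambda \nearrow \mathbb{Z}^d$ in the sense of Van Hove:

    (1)  $\omega_\Lambda(G(\bar X_\Lambda, \bar Y_\Lambda)) = \omega(G(\bar X_\Lambda, \bar Y_\Lambda)) \to g(\omega(X), \omega(Y))$,    (5.4)
    (2)  $\frac{1}{|\Lambda|} S(\omega_\Lambda) \to s(\omega)$,    (5.5)
    (3)  $\frac{1}{|\Lambda|} \omega_\Lambda(H_\Lambda) \to \omega(E_\Phi)$.    (5.6)

The relation (5.6) is obvious from the finite-range condition on $\Phi$, see Sec. 2.1. The
convergence (5.5) is the definition of the mean entropy $s$. Finally, (5.4) follows from
the ergodicity of $\omega$ as explained in Sec. 2.3. Inserting (5.4)-(5.6) into (5.3) gives (5.2).
The relation (5.1) now follows immediately, since one can repeat the above
construction for any ergodic state $\omega$.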

6. The Upper Bound


6.1. Reduction to product states
In this section, we outline how to approximate

    $\frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi)$    (6.1)

by a similar expression involving the partition function of a block-product state.
Fix a hypercube $V = [-L, L]^d$ and cover the lattice with its translates, as explained
in Sec. 4. From now on, $\Lambda$ is chosen such that it is a multiple of $V$. One can easily
adapt the arguments so as to cover the case where $\Lambda$ tends to infinity in the sense
of Van Hove (as one has to do as well in the proof of the existence of the pressure
for local interactions, see [12]).
Define the observables

    $H_\Lambda^V \equiv H_\Lambda^{\Phi, V}$,  $\bar X_\Lambda^V$,  $\bar Y_\Lambda^V$    (6.2)

by cutting all terms that connect any two translates of $V$, i.e.

    $\bar X_\Lambda^V := \frac{1}{|\Lambda|} \sum_{j \in \Lambda,\ \exists\, i \in \mathbb{Z}^d / V :\ \mathrm{Supp}\, \tau_j X \subset V + i} \tau_j X$,    (6.3)

and analogously for $H_\Lambda^V$ and $\bar Y_\Lambda^V$. One can say that these observables with superscript
$V$ are one-block observables with the blocks being translates of $V$. One easily
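derives that

    $\| \bar X_\Lambda^V - \bar X_\Lambda \| \leq \|X\|\, |\mathrm{Supp}\, X|\, \frac{|\partial V|}{|V|}$,    $\| H_\Lambda^V - H_\Lambda \| \leq r(\Phi)\, |\Lambda|\, \frac{|\partial V|}{|V|}$    (6.4)

with the number $r(\Phi)$ as defined in Sec. 2.1.
Using the log-trace inequality, we bound

    $\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda(e^{-H_\Lambda + |\Lambda| G(\bar X_\Lambda, \bar Y_\Lambda)}) - \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda(e^{-H_\Lambda^V + |\Lambda| G(\bar X_\Lambda^V, \bar Y_\Lambda^V)})$    (6.5)

as follows:

    $(6.5) \leq \frac{1}{|\Lambda|} \| H_\Lambda - H_\Lambda^V \| + \| G(\bar X_\Lambda, \bar Y_\Lambda) - G(\bar X_\Lambda^V, \bar Y_\Lambda^V) \|$
           $\leq \frac{|\partial V|}{|V|} \left( r(\Phi) + C_g ( \|X\|\, |\mathrm{Supp}\, X| + \|Y\|\, |\mathrm{Supp}\, Y| ) \right)$

where $C_g$ is a constant depending on the function $G$. The second term of (6.5) is
clearly the pressure of a product state with mean-field interaction. We will find an
upper bound for this pressure by slightly extending the treatment of Petz et al.
in [17]. We prove an extended PRV lemma, Lemma 6.1, in the next section.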

6.2. The extended Petz-Raggio-Verbeure upper bound


In this section, we outline the bound from above on the quantity

    $\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda(e^{-H_\Lambda^V + |\Lambda| G(\bar X_\Lambda^V, \bar Y_\Lambda^V)})$    (6.6)

that appeared in (6.5).
To do this, let us make the setting slightly more abstract. Consider the lattice
$\mathbb{Z}^d$ with the one-site Hilbert space $\mathcal{G}$ given by

    $\mathcal{G} := \otimes_V \mathcal{H}$.    (6.7)

In words, $\mathbb{Z}^d$ should be thought of as the block lattice $\mathbb{Z}^d / V$. Let $D, A, B$ be one-site
observables on the new lattice, i.e. $D, A, B$ are Hermitian operators on $\mathcal{G}$. The
extended PRV (Petz-Raggio-Verbeure) lemma states the following.
Lemma 6.1 (Extended PRV). Let all symbols have the same meaning as in
Secs. 2.1-2.3, except that the one-site Hilbert space is changed from $\mathcal{H}$ to $\mathcal{G}$. Then

    $\limsup_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)}) \leq \sup_{\omega \in S_{\mathrm{sym}}} (\omega(G(\bar A, \bar B)) + s(\omega) - \omega(D))$.    (6.8)

In particular, $\omega(G(\bar A, \bar B))$, defined as in (2.17), exists.

To appreciate the similarity between (6.8) and (3.6), one should realize that $D$
is a local energy operator, as $E_\Phi$ in (3.6). The proof of this lemma in the case that
$A = B$ is in the original paper [17]. The proof for the more general case is presented
in Sec. 7. Of course, one can prove that the right-hand side of (6.8) is also a lower
bound: it suffices to copy Sec. 5.
By the Størmer theorem, see (2.14), each symmetric state on $\mathcal{U}$ can be written
as the barycenter of a regular probability measure on the product states, and since
all terms on the right-hand side of (6.8) are affine and upper semicontinuous functions
of $\omega$, it follows that the sup can be restricted to product states (see [17] for the fine
details of this argument). Since, moreover, all product states are ergodic, we can
replace $\omega(G(\bar A, \bar B))$ by $g(\omega(A), \omega(B))$. Hence, Lemma 6.1 implies that

    $\limsup_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)}) \leq \sup_{\omega\ \mathrm{prod.}} (g(\omega(A), \omega(B)) + s(\omega) - \omega(D))$.    (6.9)

6.2.1. From the extended PRV to the upper bound


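Next, we use (6.9) to formulate an upper bound on the quantity

    $\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda(e^{-H_\Lambda^V + |\Lambda| G(\bar X_\Lambda^V, \bar Y_\Lambda^V)})$    (6.10)

for $\Lambda$ a multiple of $V$. This means that we have to recall that the lattice sites in
(6.9) are in fact blocks. We write $\Lambda' := \Lambda / V$ and choose

    $D := H_V^V$,  $A := \bar X_V^V$,  $B := \bar Y_V^V$.

Then, by the extended PRV,

    $(6.10) \leq \sup_{\omega\ \mathrm{prod.\ on\ the\ block\ lattice}} \left( g(\omega(A), \omega(B)) + \frac{1}{|V|} s'(\omega) - \frac{1}{|V|} \omega(D) \right)$
            $= \sup_{\omega_V\ \mathrm{on}\ \mathcal{B}_V} \left( g(\omega_V(\bar X_V^V), \omega_V(\bar Y_V^V)) + \frac{1}{|V|} S(\omega_V) - \frac{1}{|V|} \omega_V(H_V^V) \right)$

where $s'$ indicates that this is the entropy density on the block lattice $\Lambda'$, hence
it should be divided by $|V|$ to obtain the density on $\Lambda$. Now, let $\tilde\omega$ be the
infinite-volume state obtained by taking a block-product over states $\omega_V$ and let $\bar\omega$ be its
translation-average, as in Sec. 4. By the conclusions of Sec. 4, it follows that
$\bar\omega$ is ergodic and $s(\bar\omega) = \frac{1}{|V|} S(\omega_V)$. Also, we see that

    $| \omega_V(\bar X_V^V) - \bar\omega(X) | \leq \|X\|\, |\mathrm{Supp}\, X|\, \frac{|\partial V|}{|V|}$,
    $\left| \frac{1}{|V|} \omega_V(H_V^V) - \bar\omega(E_\Phi) \right| \leq r(\Phi)\, \frac{|\partial V|}{|V|}$,

and analogously for $\bar Y_V^V$. Consequently, we obtain

    $(6.10) \leq \sup_{\omega \in S_{\mathrm{erg}}} (g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi)) + O\left( \frac{|\partial V|}{|V|} \right)$,  $V \nearrow \mathbb{Z}^d$,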
which proves the upper bound for Theorem 3.2, since the $O(|\partial V| / |V|)$-term can be made
arbitrarily small by increasing $V$.

7. Proof of Lemma 6.1


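Let the state $\omega_\Lambda$ on $\mathcal{B}_\Lambda$ be given by

    $\omega_\Lambda(\cdot) = \frac{1}{Z_\Lambda^G(D)} \mathrm{Tr}_\Lambda(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)}\, \cdot)$

with

    $Z_\Lambda^G(D) := \mathrm{Tr}_\Lambda(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)})$.

Naturally, $\omega_\Lambda$ is the finite-volume Gibbs state that saturates the variational
principle, i.e.

    $\frac{1}{|\Lambda|} \log Z_\Lambda^G(D) = \sup_{\omega'_\Lambda\ \mathrm{on}\ \mathcal{B}_\Lambda} \left( \omega'_\Lambda(G(\bar A_\Lambda, \bar B_\Lambda)) + \frac{1}{|\Lambda|} S(\omega'_\Lambda) - \omega'_\Lambda(D) \right)$
                                    $= \omega_\Lambda(G(\bar A_\Lambda, \bar B_\Lambda)) + \frac{1}{|\Lambda|} S(\omega_\Lambda) - \omega_\Lambda(D)$.    (7.1)

Our strategy is to attain the entropy and energy of the state $\omega_\Lambda$ via ergodic
states. For definiteness, we assume that $G$ is of the form

    $G(\bar A_\Lambda, \bar B_\Lambda) := [\bar A_\Lambda]^k [\bar B_\Lambda]^l$  for some integers $k, l$

(which, strictly speaking, is not allowed since $G(\bar A_\Lambda, \bar B_\Lambda)$ has to be a self-adjoint
operator, but this does not matter for the argument in this section). The general
case follows by the same argument.
We apply the construction in Sec. 4 to $\omega_\Lambda$, thus obtaining infinite-volume states
$\tilde\omega$ and $\bar\omega$. Since we will repeat the construction for different $\Lambda$, we indicate the
$\Lambda$-dependence in $\tilde\omega^{\{\Lambda\}}$ and $\bar\omega^{\{\Lambda\}}$, while remembering that these are states on the
infinite lattice. They satisfy

    $s(\bar\omega^{\{\Lambda\}}) = \frac{1}{|\Lambda|} S(\omega_\Lambda)$.    (7.2)

We have also established in Sec. 4 that $\bar\omega^{\{\Lambda\}}$ is ergodic and that the states $\tilde\omega^{\{\Lambda\}}$
and $\bar\omega^{\{\Lambda\}}$ approximate $\omega_\Lambda$ for observables which are empirical averages. However,
we cannot conclude yet that they have comparable values for $G(\bar A, \bar B)$, except in
the case where $G$ is linear. Essentially, such a comparison is achieved next by using
the fact that $\omega_\Lambda$ is symmetric.
Choose a sequence of volumes $\Lambda_n$ such that along that sequence the right-hand
side of (7.1) converges. We assume that $\bar\omega^{\{\Lambda_n\}}$ has a weak*-limit, as $n \nearrow \infty$, which
can always be achieved (by the weak*-compactness) by restricting to a subsequence
of $\Lambda_n$. We call this limit $\omega$. By construction, it is a symmetric state.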
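Energy estimate. Since $\omega_{\Lambda_n}(D) = \bar\omega^{\{\Lambda_n\}}(D)$ and $\bar\omega^{\{\Lambda_n\}} \to \omega$ in the weak*-topology,
we have

    $\omega_{\Lambda_n}(D) \to \omega(D)$.    (7.3)

G-estimates. Using the symmetry of the state $\omega_\Lambda$, we estimate

    $| \omega_\Lambda(G(\bar A_\Lambda, \bar B_\Lambda)) - \omega_\Lambda(\otimes^k A \otimes \otimes^l B) | \leq \max(\|A\|, \|B\|)^{k+l} \left( \frac{(k + l)^2}{|\Lambda|} + O\left( \frac{c(k, l)}{|\Lambda|^2} \right) \right)$,  $|\Lambda| \nearrow \infty$,    (7.4)

where the tensor products

    $\otimes^k A \otimes \otimes^l B := \underbrace{A \otimes \cdots \otimes A}_{k\ \mathrm{copies}} \otimes \underbrace{B \otimes \cdots \otimes B}_{l\ \mathrm{copies}}$    (7.5)

denote that all one-site operators are placed on different sites. Since $\omega_\Lambda$ is symmetric,
we need not specify on which sites. The error term of order $1/|\Lambda|$ comes
from those terms in the expansion of the monomial containing a product of $k + l$
one-site operators but only involving $k + l - 1$ sites. Since $\omega$ is symmetric, we obtain
analogously that

    $\omega(G(\bar A, \bar B)) = \omega(\otimes^k A \otimes \otimes^l B)$.    (7.6)

In particular, the left-hand side is well-defined. Hence, by combining (7.4) and (7.6),
we obtain

    $\omega_{\Lambda_n}(G(\bar A_{\Lambda_n}, \bar B_{\Lambda_n})) \to \omega(G(\bar A, \bar B))$.    (7.7)

For a more general non-commutative polynomial $G$ as defined in Sec. 2.2 (not
necessarily a monomial), the convergence (7.7) follows easily since $G(\bar A_{\Lambda_n}, \bar B_{\Lambda_n})$
can be approximated in operator norm by polynomials.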
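The size of the error term in the G-estimate can be made concrete by a counting argument. Expanding a monomial of degree $m = k + l$ in the empirical averages over $N = |\Lambda|$ sites gives $N^m$ products of one-site operators, indexed by tuples of sites; only the tuples with a repeated site contribute to the error. The sketch below (exact rational arithmetic; the function name and the chosen powers are illustrative assumptions) computes the fraction of such tuples and compares it with $(k + l)^2 / N$.

```python
from fractions import Fraction

def distinct_fraction(num_sites, num_factors):
    """Fraction of site tuples (i_1, ..., i_m) in {1..N}^m with pairwise distinct entries."""
    result = Fraction(1)
    for used in range(num_factors):
        # The next factor may sit on any site not already occupied.
        result *= Fraction(num_sites - used, num_sites)
    return result

k, l = 2, 3  # hypothetical powers of the two averages
for num_sites in (10, 100, 1000):
    repeated = 1 - distinct_fraction(num_sites, k + l)
    print(num_sites, float(repeated), (k + l) ** 2 / num_sites)
```

By a union bound over pairs of factors, the fraction of tuples with a repeated site is at most $m(m-1)/(2N)$, which is consistent with the leading $(k + l)^2 / |\Lambda|$ term in (7.4).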
Entropy estimates. As established in Sec. 4, we have

    $\frac{1}{|\Lambda|} S(\omega_\Lambda) = s(\bar\omega^{\{\Lambda\}})$,  for all $\Lambda$.    (7.8)

By the upper semi-continuity of the infinite-volume entropy and the convergence
$\bar\omega^{\{\Lambda_n\}} \to \omega$, we get that

    $\limsup_{n \nearrow \infty} s(\bar\omega^{\{\Lambda_n\}}) \leq s(\omega)$.    (7.9)

Hence

    $\limsup_{n \nearrow \infty} \frac{1}{|\Lambda_n|} S(\omega_{\Lambda_n}) \leq s(\omega)$.    (7.10)

By combining the convergence results (7.3), (7.7) and (7.10), we have proven
that there is a symmetric state $\omega$ such that the right-hand side of (6.8) with $\omega$ is
at least as large as a given limit point of the right-hand side of (7.1). Since the construction
can be repeated for any limit point, this concludes the proof of Lemma 6.1.

Acknowledgment
The authors thank M. Fannes, M. Mosonyi, Y. Ogata, D. Petz and A. Verbeure
for fruitful discussions. K. N. is also grateful to the Instituut voor Theoretische
Fysica, K. U. Leuven, and to Budapest University of Technology and Economics for
kind hospitality, and acknowledges the support from the Grant Agency of the Czech
Republic (Grant no. 202/07/J051). W. D. R. was a postdoctoral fellow of the FWO-
Flanders at the time when the paper was written and he acknowledges the financial
support. L. R. B. acknowledges the support of the NSF (DMS-0605058).

Appendix. Proof of Lemma 2.2


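To prove Lemma 2.2, it is convenient to introduce an extended framework: Let
$\pi_\omega$ be the cyclic GNS-representation associated to the state $\omega$, $\mathcal{H}_\omega$ the associated
Hilbert space and $\Omega_\omega \in \mathcal{H}_\omega$ the representative of the state $\omega$, i.e.

    $\omega(A) = \langle \Omega_\omega, \pi_\omega(A) \Omega_\omega \rangle_{\mathcal{H}_\omega}$,  $A \in \mathcal{U}$.    (A.1)

The set $\pi_\omega(\mathcal{U})$ is a subalgebra of $\mathcal{B}(\mathcal{H}_\omega)$. Let $U_j$, $j \in \mathbb{Z}^d$, be the unitary representation
of the translation group induced on $\pi_\omega(\mathcal{U})$, i.e.

    $U_j \pi_\omega(A) U_j^* = \pi_\omega(\tau_j A)$.    (A.2)

Ergodicity of $\omega$ implies (see, e.g., the proof of [20, Theorem III.1.8]) that

    $\frac{1}{|\Lambda|} \sum_{j \in \Lambda} U_j \xrightarrow[\Lambda \nearrow \mathbb{Z}^d]{\mathrm{strongly}} P_{\Omega_\omega}$    (A.3)

where $P_{\Omega_\omega}$ is the one-dimensional orthogonal projector associated to the vector $\Omega_\omega$,
and $\Lambda \nearrow \mathbb{Z}^d$ in the sense of Van Hove. Using (A.3) and the translation-invariance
$U_j \Omega_\omega = \Omega_\omega$, one calculates

    $\pi_\omega(\bar X_\Lambda) \pi_\omega(\bar Y_\Lambda) \Omega_\omega = \frac{1}{|\Lambda|^2} \sum_{j, j' \in \Lambda} U_j \pi_\omega(X) U_j^* U_{j'} \pi_\omega(Y) U_{j'}^* \Omega_\omega$
        $\xrightarrow[\Lambda \nearrow \mathbb{Z}^d]{} P_{\Omega_\omega} \pi_\omega(X) P_{\Omega_\omega} \pi_\omega(Y) \Omega_\omega = \omega(X) \omega(Y) \Omega_\omega$

for local observables $X, Y \in \mathcal{U}$. Taking the scalar product with $\Omega_\omega$, we conclude
that $\omega(\bar X_\Lambda \bar Y_\Lambda) \to \omega(X) \omega(Y)$. The same argument works for all polynomials in
$\bar X_\Lambda$, $\bar Y_\Lambda$, thus proving Lemma 2.2. Finally, we remark that one can also construct
the operators $\bar X$, $\bar Y$ as weak limits of $\bar X_\Lambda$, $\bar Y_\Lambda$, as $\Lambda \nearrow \mathbb{Z}^d$ (these weak limits are
simply multiples of the identity: $\omega(X) 1$, $\omega(Y) 1$). This is however not necessary for our
results.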

References
[1] H. Araki and P. D. F. Ion, On the equivalence of KMS and Gibbs conditions for
states of quantum lattice systems, Comm. Math. Phys. 35 (1974) 1-12.
[2] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics:
2, 2nd edn. (Springer-Verlag, Berlin, 1996).
[3] J.-B. Bru and W. de Siqueira Pedra, Equilibrium states of Fermi systems with long
range interactions, in preparation.
[4] R. Heylen, D. Bollé and N. S. Skantzos, Thermodynamics of spin systems on small-
world hypergraphs, Phys. Rev. E 74 (2006) 056111.
[5] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications (Springer,
Berlin, 1993).
[6] F. den Hollander, Large Deviations, Fields Institute Monographs, Vol. 14 (Amer.
Math. Soc., 2000).
[7] J. D. Deuschel and D. W. Stroock, Large Deviations, Pure and Applied Mathematics,
Vol. 137 (Academic Press, Boston, 1989).
[8] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics (Springer, 2005).
[9] H.-O. Georgii, Gibbs Measures and Phase Transitions, De Gruyter Studies in
Mathematics, Vol. 9 (De Gruyter, 1988).
[10] F. Hiai, M. Mosonyi, H. Ohno and D. Petz, Free energy density for mean field
perturbation of states of a one-dimensional spin chain, Rev. Math. Phys. 20 (2008)
335-365.
[11] F. Hiai, M. Mosonyi and T. Ogawa, Large deviations and Chernoff bound for
certain correlated states on the spin chain, J. Math. Phys. 48(12) (2007)
123301-123319.
[12] R. B. Israel, Convexity in the Theory of Lattice Gases, Princeton Series in Physics
(Princeton University Press, 1979).
[13] M. Lenci and L. Rey-Bellet, Large deviations in quantum lattice systems: One-phase
region, J. Stat. Phys. 119 (2005) 715-746.
[14] K. Netočný and F. Redig, Large deviations for quantum spin systems, J. Stat. Phys.
117 (2004) 521-547.
[15] Y. Ogata, Large deviations in quantum spin chain, arXiv:0803.0113.
[16] S. Olla, Large deviations for Gibbs random fields, Probab. Theory Related Fields
77 (1988) 343-357.
[17] D. Petz, G. A. Raggio and A. Verbeure, Asymptotics of Varadhan-type and the Gibbs
variational principle, Comm. Math. Phys. 121 (1989) 271-282.
[18] C.-E. Pfister, Thermodynamical aspects of classical lattice systems, in In and
Out of Equilibrium, Probability with a Physics Flavor, Vol. 1, ed. V. Sidoravicius
(Birkhäuser, 2002).
[19] W. De Roeck, C. Maes and K. Netočný, Quantum macrostates, equivalence of
ensembles and an H-theorem, J. Math. Phys. 47 (2006) 073303.
[20] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton University Press,
Princeton, 1993).
[21] E. Størmer, Symmetric states on infinite tensor products of C*-algebras, J. Funct.
Anal. 3 (1969) 48-68.
[22] S. R. S. Varadhan, Asymptotic probabilities and differential equations, Comm. Pure
Appl. Math. 19 (1966) 261-286.
[23] S. R. S. Varadhan, Large Deviations and Applications (Society for Industrial and
Applied Mathematics, 1984).
September 14, 2010 13:28 WSPC/S0129-055X 148-RMP
J070-S0129055X10004090

Reviews in Mathematical Physics


Vol. 22, No. 8 (2010) 859-879
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004090

DYNAMICAL BOUNDS FOR STURMIAN
SCHRÖDINGER OPERATORS

L. MARIN
UMR 6628-MAPMO, Université d'Orléans,
B.P. 6759, 45067 Orléans cedex, France
laurent.marin@univ-orleans.fr

Received 3 November 2009

The Fibonacci Hamiltonian, that is, a Schrödinger operator associated to a quasiperiodical
Sturmian potential with respect to the golden mean, has been investigated intensively
in recent years. Damanik and Tcheremchantsev developed a method in [10] and used it
to exhibit a nontrivial dynamical upper bound for this model. In this paper, we use this
method to generalize dynamical upper bounds to a large family of Sturmian operators,
and show anomalous transport at sufficiently large coupling for operators associated to
irrational numbers satisfying a generic diophantine condition. As a counterexample, we exhibit
a pathological irrational number which does not satisfy this condition and show that its
associated dynamical exponent only has a ballistic bound. Moreover, we establish a global lower
bound for the lower box counting dimension of the spectrum that is used to obtain a
dynamical lower bound for bounded density irrational numbers.

Keywords: Sturmian Schrödinger operators; quasiperiodical potential; dynamical
bounds.

Mathematics Subject Classification 2010: 81Q10, 47B36

1. Introduction
If H is a self-adjoint operator on a separable Hilbert space H, the time depen-
odinger equation of quantum mechanics, it = H, yields to a unitary
dent Schr
dynamical evolution in H,
(t) = eitH (0).
Under the time evolution, (t) will generally spread out with time. This could
be a complicated question to quantify this spreading in concrete cases. One of the
most studied case is where H is given by L2 (Rd ) or l2 (Zd ), H is a Sch odinger
operator of the form + V , and (0) is a localized wavepacket. The form of the
potential V is depending on the physical model one studies. One of the most studied
is the Sturmian potential and its particular subcase, the Fibonacci Hamiltonian,
describing a standard one-dimensional quasicrystal.
The first approach to the study of quantum dynamics is the spectral theorem. Recall
that each initial vector $\psi(0) = \psi$ has a spectral measure, defined as the unique
September 14, 2010 13:28 WSPC/S0129-055X 148-RMP
J070-S0129055X10004090


Borel measure verifying
$$\langle \psi, f(H)\psi\rangle = \int_{\sigma(H)} f(E)\,d\mu_\psi(E)$$
for every measurable function $f$. Here $\langle\cdot,\cdot\rangle$ denotes the scalar product of $\mathcal{H}$. A major
step in the theory, discovered by Guarneri ([14, 15]), was that suitable continuity
properties of the spectral measure $d\mu_\psi$ imply lower bounds on the spreading of
the wavepacket. It was then extended by many authors in [3, 16, 25, 23]. Continuity
properties of the spectral measure follow from upper bounds on the measure
of intervals, $\mu_\psi([E-\varepsilon, E+\varepsilon])$, $E \in \sigma(H)$, $\varepsilon \to 0$. Later on, many authors refined
Guarneri's method ([2, 17, 30]), allowing one to take into account the whole statistics
of $\mu_\psi([E-\varepsilon, E+\varepsilon])$, $E \in \mathbb{R}$. One can find better lower bounds with information
about both the measure of intervals and the growth of the generalized eigenfunctions
$u_\psi(n, E)$ ([23, 30]).
In the case of Schrödinger operators in one space dimension, the information
on the spectral measure and on generalized eigenfunctions is linked to the properties
of solutions to the difference (sometimes also called free) equation $Hu = Eu$
([6, 11, 13, 19, 20, 31]). Explicit lower bounds on the spreading rate for numerous
concrete cases come from an analysis of these solutions ([5, 6, 13, 19, 20, 23]).
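The second approach to dynamical lower bounds in one dimension is based on
the Parseval formula,
$$2\pi\int_0^{\infty} e^{-2t/T}\,|\langle e^{-itH}\delta_1,\delta_n\rangle|^2\,dt = \int_{-\infty}^{\infty}\left|\left\langle\left(H - E - \frac{i}{T}\right)^{-1}\delta_1,\delta_n\right\rangle\right|^2 dE.$$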

This method, developed in [8, 9, 31], is the basis for the results in [7, 21]. It
has the advantage of giving dynamical bounds directly, without any
knowledge of the properties of the spectral measure. What is required is upper bounds
for solutions corresponding to some set of energies, which can be very small (non-empty
is sufficient). Moreover, additional information allows one to improve the results.
A combination of both approaches leads to optimal dynamical bounds for growing
sparse potentials (see [31]).
As mentioned before, there is a fairly good understanding of how to prove
dynamical lower bounds, especially in one space dimension. Results on dynamical
upper bounds are few and more recent. Proving upper bounds is hard because
one needs to control the entire wavepacket. In fact, the dynamical lower bounds that
are typically established only bound some (fast) part of the wavepacket from below, and
this is sufficient for the desired growth of the standard dynamical quantities. In the
same way, it is of course much easier to prove upper bounds only for a (slow) portion
of the wavepacket. Killip, Kiselev and Last developed this idea with success
in [24]. Their work provides explicit criteria for upper bounds on the slow part of the
wavepacket in terms of lower bounds on solutions. Applied to
the Fibonacci operator, their general method supports the conjecture that this model exhibits
anomalous transport (i.e. neither localized, nor diffusive, nor ballistic).

Dynamical Bounds for Sturmian Schrödinger Operators

The conjecture for the Fibonacci model was finally proved at sufficiently large coupling
by Damanik and Tcheremchantsev in [10]. They developed a general method establishing
a connection between properties of solutions and dynamical upper bounds.
Based on the Parseval formula, this method allows one to bound the entire wavepacket
from above, provided that suitable lower bounds for solution (or rather transfer
matrix) growth at complex energies are available.
It is the main purpose of this paper to extend the application of this general
method, used for the concrete Fibonacci model, to almost every Sturmian potential.
We will show that one has anomalous transport for Sturmian models associated
to irrational numbers far enough from rational numbers, in a sense made precise
below. On the other hand, we construct an irrational number close enough to
rational numbers that leads to ballistic motion.
In this paper, we also use these tools to give a new lower bound for the
box counting dimension of the spectrum that is better for almost every irrational
number. Since the spectrum is a Cantor set of Lebesgue measure zero, it is
natural to investigate its fractal dimension. It is well known that this Cantor set
is the limit of band spectra of approximant operators [29, 1]. To find the bound,
we use the band spectra at rank $n$ as a sequence of $\varepsilon_n$-covers of the spectrum. Using
the information given in [28] about the number of bands in periodic band spectra
and in [27] about the length of the bands, we estimate $\varepsilon_n$ and give a bound for the
number of bands of this diameter. This leads to a lower bound on the minimal
number of balls of diameter $\varepsilon_n$ one needs to cover the spectrum. This bound also
has a direct dynamical application and allows us to state a dynamical lower bound
using the method in [30]. This lower bound requires the transfer
matrix norms to be polynomially bounded. This property is shown to hold for bounded
density irrational numbers in [18], and more is not expected. This limits the dynamical
implication of this lower bound to a set of irrational numbers of Lebesgue measure 0.
We give precise statements of the model we study and of our results in the
next section. Section 3 is devoted to the proof of our main result. We give
a pathological example in Sec. 4 and a new lower bound for the box counting
dimension of the spectrum in Sec. 5.

2. Model and Statements


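We limit our study to the one-dimensional discrete Schrödinger operator $H_\alpha$,
$$[H_\alpha\psi](n) = \psi(n+1) + \psi(n-1) + V_\alpha(n)\psi(n) \tag{1}$$
acting on $\ell^2(\mathbb{Z})$, associated to a Sturmian potential $V_\alpha(n)$ given by
$$V_\alpha(n) = (\lfloor (n+1)\alpha\rfloor - \lfloor n\alpha\rfloor)V$$
with $\alpha$ an irrational number in $[0, 1]$ and $V$ a positive constant. We denote the continued
fraction expansion of $\alpha$ by
$$\alpha = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} = [0, a_1, a_2, \ldots].$$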


The Fibonacci Hamiltonian, $H_\alpha$ with $\alpha = \frac{\sqrt{5}-1}{2} = [0, 1, 1, \ldots]$, is the simplest
example among Sturmian models because of its particular continued fraction
expansion.
Since we are interested in dynamical bounds, let us recall the quantities we
want to bound.
We denote the time-averaged outside probabilities by
$$P(N, T) = \sum_{|n|>N} a(n, T),$$
with
$$a(n, T) = \frac{2}{T}\int_0^{\infty} e^{-2t/T}\,|\langle e^{-itH}\delta_1,\delta_n\rangle|^2\,dt.$$
For all $\alpha \in [0, +\infty]$ (see [13]), define
$$S^-(\alpha) = -\liminf_{T\to\infty}\frac{\log P(T^\alpha - 2, T)}{\log T}$$
and
$$S^+(\alpha) = -\limsup_{T\to\infty}\frac{\log P(T^\alpha - 2, T)}{\log T}.$$
The following critical exponents are of particular interest:
$$\alpha_l^\pm = \sup\{\alpha \ge 0 : S^\pm(\alpha) = 0\},$$
$$\alpha_u^\pm = \sup\{\alpha \ge 0 : S^\pm(\alpha) < \infty\}.$$
They satisfy $0 \le \alpha_l^\pm \le \alpha_u^\pm$. In particular, if $\alpha > \alpha_u^+$, then $P(T^\alpha, T)$ goes to 0
faster than any inverse power of $T$. The exponents $\alpha_l^\pm$ can be interpreted as the (lower and upper) rates of propagation of the
essential part of the wavepacket, and $\alpha_u^\pm$ as the rates of propagation of the fastest
part of the wavepacket.
Moreover, for this kind of model we always have $\alpha_u^+ \le 1$. This upper bound,
called ballistic, is the fastest rate of spreading of the wavepacket.
Sturmian potentials (quasiperiodic structure) are the intermediate situation between
random potentials (no structure in the potential), which imply dynamical localization
($\alpha_u^\pm = 0$), and periodic potentials, which imply ballistic spreading, that is, $\alpha_u^\pm = 1$.
More precisely, one has a non-trivial, strictly positive bound for almost all irrational
numbers. In a sense we will make more precise later, these irrational numbers
are far enough from rational numbers. On the other hand, we show that for an irrational
number close enough to rational numbers, one has ballistic motion.
The first objective of this paper is to give a non-ballistic upper bound for a large
set of irrational numbers.
Recall the sequences associated to $\alpha$:
$$p_{-1} = 1,\quad p_0 = 0,$$
$$q_{-1} = 0,\quad q_0 = 1,$$
$$p_{k+1} = a_{k+1}p_k + p_{k-1},$$
$$q_{k+1} = a_{k+1}q_k + q_{k-1}. \tag{2}$$
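As an illustration of recursion (2), the following minimal sketch computes the convergents $p_k/q_k$ from a finite list of partial quotients. The function name `convergents` and the truncation to finitely many $a_k$ are choices made here for illustration only; they are not part of the paper.

```python
from fractions import Fraction


def convergents(partial_quotients):
    """Return [(p_0, q_0), ..., (p_n, q_n)] for a_1..a_n, following recursion (2)."""
    p_prev, p = 1, 0  # p_{-1}, p_0
    q_prev, q = 0, 1  # q_{-1}, q_0
    out = [(p, q)]
    for a in partial_quotients:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q))
    return out


golden = convergents([1] * 8)  # alpha = [0, 1, 1, ...]
silver = convergents([2] * 5)  # alpha = [0, 2, 2, ...]

print([q for _, q in golden])                    # denominators q_0..q_8
print([Fraction(p, q) for p, q in silver[1:]])   # rational approximants p_k/q_k
```

For the golden mean the denominators $q_k$ are the Fibonacci numbers, which is why the associated operator is called the Fibonacci Hamiltonian.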
We can now state our main result:
Theorem 1. Let $\alpha$ be an irrational number and $H_\alpha$ defined as in (1) with a
Sturmian potential associated to $\alpha$. Assume that $V > 20$. If $D = \limsup_k \frac{\log q_k}{k}$
is finite, then
$$\alpha_u^+ \le \frac{2D}{\log\frac{V-8}{3}}.$$
Moreover, for an irrational number with continued fraction expansion containing
no 1, the dynamical upper bound becomes
$$\alpha_u^+ \le \frac{D}{\log\frac{V-8}{3}}.$$
Remark 1. It is clear that, taking $V$ large enough, one obtains a non-trivial
bound that is smaller than 1.
It is well known that the set of irrational numbers with finite $D$ has full Lebesgue
measure. In fact, for any quadratic irrational, that is, a number with an eventually periodic continued fraction
expansion, one can easily compute $D$. Moreover, the explicit value of $D$ is known
for almost all $\alpha$ by the result of Khinchin discussed next.
Lemma 1 ([22]). For almost all $\alpha$ with respect to Lebesgue measure,
$$D = \limsup_k \frac{\log q_k}{k} = D_K = \frac{\pi^2}{12\log 2},$$
where $q_k$ is the sequence defined as in (2), and
$$M = \liminf_k\,(a_1\cdots a_k)^{1/k} = C_K = 2.685\ldots$$
$C_K$ is called the Khinchin constant.
Corollary 1. For Lebesgue almost every irrational number $\alpha$, we have
$$\alpha_u^+ \le \frac{2D_K}{\log\frac{V-8}{3}}.$$

Proof. It follows directly from Theorem 1 and Khinchin's lemma (Lemma 1).
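To make Remark 1 and Corollary 1 concrete, here is a small sketch that evaluates the right-hand side of Theorem 1 and the coupling beyond which it drops below the ballistic value 1. The helper names `upper_bound` and `nontrivial_threshold` are hypothetical; the formulas are only a direct transcription of the bounds stated above.

```python
import math

LEVY_D = math.pi ** 2 / (12 * math.log(2))  # D_K from Lemma 1


def upper_bound(D, V, no_ones=False):
    """Right-hand side of Theorem 1 for a given D and coupling V > 20."""
    if V <= 20:
        raise ValueError("Theorem 1 assumes V > 20")
    factor = 1.0 if no_ones else 2.0
    return factor * D / math.log((V - 8) / 3)


def nontrivial_threshold(D, no_ones=False):
    """Coupling at which the bound equals 1; above it the bound is non-trivial."""
    factor = 1.0 if no_ones else 2.0
    return max(20.0, 8 + 3 * math.exp(factor * D))


print(LEVY_D)                        # about 1.1866
print(nontrivial_threshold(LEVY_D))  # slightly above 40
print(upper_bound(LEVY_D, 100))      # about 0.69
```

Under these formulas, for almost every $\alpha$ the bound of Corollary 1 only becomes informative once $V$ exceeds roughly 40, which is stronger than the standing assumption $V > 20$.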

Corollary 2. For a precious number, that is, $\alpha = [0, a, a, a, a, \ldots]$ with $a \ne 1$, the bound
becomes
$$\alpha_u^+ \le \frac{\log(a + \alpha)}{\log\frac{V-8}{3}}.$$

Proof. One can compute qk easily for such numbers.
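The computation behind this proof can be sketched numerically. For $\alpha = [0, a, a, \ldots]$ one has $\alpha = 1/(a + \alpha)$, so $\alpha = (\sqrt{a^2+4} - a)/2$, and the ratios $q_{k+1}/q_k$ tend to $a + \alpha$. The code below (function names are illustrative assumptions) compares $\frac{\log q_k}{k}$ at a finite $k$ with $\log(a + \alpha)$.

```python
import math


def q_sequence(a, n):
    """q_0..q_n for the constant continued fraction [0, a, a, ...], via recursion (2)."""
    q_prev, q = 0, 1  # q_{-1}, q_0
    qs = [q]
    for _ in range(n):
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs


def precious_alpha(a):
    """Positive solution of alpha = 1 / (a + alpha)."""
    return (math.sqrt(a * a + 4) - a) / 2


a, k = 2, 60
D_numeric = math.log(q_sequence(a, k)[-1]) / k   # finite-k proxy for the lim sup
D_exact = math.log(a + precious_alpha(a))        # log(1 + sqrt(2)) for a = 2
print(D_numeric, D_exact)
```

The finite-$k$ value differs from the limit by a term of order $1/k$, so the two printed numbers agree only to about two decimal places at $k = 60$.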

On the contrary, if $D$ is infinite, one can have ballistic motion at all large
couplings:

Theorem 2. There exists an irrational number $\alpha$ with $D = +\infty$ such that for any
$V > 20$ the dynamics of $H_\alpha$ is ballistic.

We also prove a new lower bound for the fractal dimension of the spectrum:

Theorem 3. Set $C_k = \frac{3}{k}\sum_{j=1}^{k}\log(a_j + 2)$. For any irrational number $\alpha$
satisfying $C = \limsup_k C_k < +\infty$ and any $V > 20$, we have
$$\dim_B^+(\sigma) \ge \frac{1}{2}\,\frac{\log 2}{C + \log(V+5)} \tag{3}$$
where $\sigma$ is the spectrum of $H_\alpha$.

3. Proof of Theorem 1
When one wants to bound all these dynamical quantities for specific models, it is
useful to connect them to the qualitative behavior of the solutions of the difference
equation
$$\psi(n+1) + \psi(n-1) + V(n)\psi(n) = z\psi(n) \tag{4}$$
with $z \in \mathbb{C}$ and $\psi$ a non-zero vector.


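One can reformulate this equation in terms of transfer matrices:
$$\begin{pmatrix}\psi(n+1)\\ \psi(n)\end{pmatrix} = F(n, z)\begin{pmatrix}\psi(1)\\ \psi(0)\end{pmatrix}$$
with
$$F(n, z) = \begin{cases} T(n, z)\cdots T(1, z) & n \ge 1,\\ \mathrm{Id} & n = 0,\\ [T(n+1, z)]^{-1}\cdots[T(0, z)]^{-1} & n \le -1,\end{cases}$$
and
$$T(m, z) = \begin{pmatrix} z - V(m) & -1\\ 1 & 0\end{pmatrix}.$$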
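The equivalence between the difference equation (4) and the transfer matrix product can be checked on a small case. The sketch below uses exact integer arithmetic with illustrative values ($z = 3$, $V = 25$, golden mean $\alpha$, and initial data $\psi(0) = 1$, $\psi(1) = 2$ are arbitrary choices made here), and only treats $n \ge 0$.

```python
import math


def sturmian_potential(n, alpha, V):
    """V_alpha(n) = (floor((n+1)*alpha) - floor(n*alpha)) * V."""
    return (math.floor((n + 1) * alpha) - math.floor(n * alpha)) * V


def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def transfer_product(n, z, alpha, V):
    """F(n, z) = T(n, z) ... T(1, z) for n >= 0 (identity for n = 0)."""
    F = ((1, 0), (0, 1))
    for m in range(1, n + 1):
        T = ((z - sturmian_potential(m, alpha, V), -1), (1, 0))
        F = mat_mul(T, F)
    return F


def iterate_solution(n, z, alpha, V, psi0, psi1):
    """Run equation (4) forward and return (psi(n+1), psi(n))."""
    prev, cur = psi0, psi1
    for m in range(1, n + 1):
        prev, cur = cur, (z - sturmian_potential(m, alpha, V)) * cur - prev
    return cur, prev


alpha = (math.sqrt(5) - 1) / 2
z, V, psi0, psi1 = 3, 25, 1, 2
F = transfer_product(8, z, alpha, V)
via_matrix = (F[0][0] * psi1 + F[0][1] * psi0, F[1][0] * psi1 + F[1][1] * psi0)
print(via_matrix == iterate_solution(8, z, alpha, V, psi0, psi1))
```

Each $T(m, z)$ has determinant 1, so $F(n, z)$ does too; this is what later allows the norms to be controlled through traces.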

We set, for $k \ge 1$,
$$M_k(z) = F(q_k, z) = T(q_k, z)\cdots T(1, z).$$


The following statement allows us to connect transfer matrix norms with dynamical
exponents (see [10] for details). Here and in what follows, $f \lesssim g$ means that
$f \le Cg$ for some positive constant $C$ that we leave implicit.

Theorem 4. Let $H$ be defined as in (1) and $K \ge 4$ be such that $\sigma(H) \subset [-K+1, K-1]$.
Then the outside probabilities can be bounded from above in terms of
transfer matrix norms as follows:
$$P_r(N, T) \lesssim \exp(-cN) + T^3\int_{-K}^{K}\left(\max_{1\le q_k\le N}\left\|M_k\left(E + \frac{i}{T}\right)\right\|^2\right)^{-1} dE,$$
$$P_l(N, T) \lesssim \exp(-cN) + T^3\int_{-K}^{K}\left(\max_{-N\le -q_k\le -1}\left\|M_k\left(E + \frac{i}{T}\right)\right\|^2\right)^{-1} dE,$$
where the implicit constants depend only on $K$ and $c$ is a universal positive constant.

This theorem connects transfer matrix behavior with a dynamical upper bound
in the following way. Choosing $N = N(T) = CT^\alpha$ such that both integrals
decay faster than any inverse power of $T$ implies that $P(N(T), T)$ goes to 0 faster
than any inverse power of $T$. By definition of $\alpha_u^+$, it follows that $\alpha_u^+ \le \alpha$. To
obtain such a condition, we have to prove that the considered energy is not in the
spectrum; the transfer matrix norm is then shown to grow super-exponentially.
We now recall a few properties of the transfer matrices and their traces. The
transfer matrix sequence satisfies the following recursion in $k$ (see, e.g., [1, 28]):

$$M_{k+1}(z) = M_{k-1}(z)\,M_k(z)^{a_{k+1}}. \tag{5}$$
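Recursion (5) can be compared with the direct product $T(q_k, z)\cdots T(1, z)$ on small examples. The sketch below assumes the initial matrices $M_{-1} = \begin{pmatrix}1 & -V\\ 0 & 1\end{pmatrix}$ and $M_0 = \begin{pmatrix} z & -1\\ 1 & 0\end{pmatrix}$; these are not written out in the text and are chosen here because they reproduce the trace initial conditions $x_{-1} = 2$, $x_0 = z$, $z_0 = z - V$. Integer values of $z$ and $V$ keep the arithmetic exact.

```python
import math


def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def mat_pow(A, n):
    """A**n for a non-negative integer n."""
    R = ((1, 0), (0, 1))
    for _ in range(n):
        R = mat_mul(R, A)
    return R


def direct_product(n_sites, z, alpha, V):
    """T(n_sites, z) ... T(1, z) for the Sturmian potential of slope alpha."""
    M = ((1, 0), (0, 1))
    for m in range(1, n_sites + 1):
        v = (math.floor((m + 1) * alpha) - math.floor(m * alpha)) * V
        M = mat_mul(((z - v, -1), (1, 0)), M)
    return M


def recursive_matrices(partial_quotients, z, V):
    """[M_1, M_2, ...] from recursion (5), with the assumed M_{-1} and M_0."""
    M_prev = ((1, -V), (0, 1))  # assumed M_{-1}
    M = ((z, -1), (1, 0))       # assumed M_0
    out = []
    for a in partial_quotients:
        M_prev, M = M, mat_mul(M_prev, mat_pow(M, a))
        out.append(M)
    return out


z, V = 3, 25
golden_alpha, golden_q = (math.sqrt(5) - 1) / 2, [1, 2, 3, 5, 8, 13]
checks = [
    Mk == direct_product(qk, z, golden_alpha, V)
    for Mk, qk in zip(recursive_matrices([1] * 6, z, V), golden_q)
]
print(checks)
```

The point of (5) is that $M_k$ can be obtained in $O(k)$ matrix operations even though it is a product of $q_k$ one-step matrices, and $q_k$ grows at least exponentially in $k$.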

In order to bound the norms of the transfer matrices from below, it
is enough to consider their traces. We now recall the following result, which can be found
in [28].

Proposition 1. Let $t_{k,p}$ be the trace of the matrix $M_{k-1}M_k^p$. The evolution along
the $p$ index is given by
$$t_{k,p+1} = t_{k+1,0}\,t_{k,p} - t_{k,p-1},$$
and consequently,
$$t_{k,p+1} = S_p(t_{k+1,0})\,t_{k,1} - S_{p-1}(t_{k+1,0})\,t_{k,0} \tag{6}$$
$$\phantom{t_{k,p+1}} = S_{p+1}(t_{k+1,0})\,t_{k,0} - S_p(t_{k+1,0})\,t_{k,-1}. \tag{7}$$
The evolution along the $k$ index is related to the $p$-evolution by
$$t_{k+2,0} = t_{k,a_{k+1}},$$
$$t_{k+1,1} = t_{k,a_{k+1}+1},$$
$$t_{k+1,-1} = t_{k,a_{k+1}-1}.$$

If one denotes by $x_k = t_{k+1,0}$ the trace of $M_k$ and by $z_k = t_{k,1}$ the trace of $M_{k-1}M_k$,
this can be reduced, using (6), to the usual trace map relation
$$x_{k+1} = z_k S_{a_{k+1}-1}(x_k) - x_{k-1}S_{a_{k+1}-2}(x_k),$$
$$z_{k+1} = z_k S_{a_{k+1}}(x_k) - x_{k-1}S_{a_{k+1}-1}(x_k),$$
with initial conditions $x_{-1} = 2$, $x_0 = z$ and $z_0 = z - V$.
Remark 2. These two sequences depend on $z$, but we omit this dependence in order to
simplify notation.
Here, $S_l$ denotes the $l$th Chebyshev polynomial of the second kind:
$$S_{-1}(x) = 0,$$
$$S_0(x) = 1,$$
$$S_{l+1}(x) = xS_l(x) - S_{l-1}(x), \quad l \ge 0.$$
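The role of the polynomials $S_l$ comes from the Cayley-Hamilton theorem: for a $2\times 2$ matrix $M$ with determinant 1 and trace $x$, one has $M^p = S_{p-1}(x)M - S_{p-2}(x)\,\mathrm{Id}$. The following sketch checks this identity on one integer matrix; the function names and the test matrix are illustrative choices.

```python
def cheb_S(l, x):
    """S_l(x) with S_{-1} = 0, S_0 = 1, S_{l+1} = x*S_l - S_{l-1}; valid for l >= -1."""
    if l == -1:
        return 0
    s_prev, s = 0, 1  # S_{-1}, S_0
    for _ in range(l):
        s_prev, s = s, x * s - s_prev
    return s


def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def power_via_chebyshev(M, p):
    """M**p = S_{p-1}(tr M) * M - S_{p-2}(tr M) * Id, assuming det M = 1 and p >= 1."""
    x = M[0][0] + M[1][1]
    a, b = cheb_S(p - 1, x), cheb_S(p - 2, x)
    return ((a * M[0][0] - b, a * M[0][1]), (a * M[1][0], a * M[1][1] - b))


M = ((3, -1), (1, 0))  # determinant 1, trace 3
power = M
for p in range(1, 7):
    print(p, power == power_via_chebyshev(M, p))
    power = mat_mul(power, M)
```

Taking the trace of $M_{k-1}M_k^{p}$ with this identity gives relation (6), and hence the trace map for $x_k$ and $z_k$.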
The sequence $\{x_k(z)\}_k$ can have two different behaviors depending on $z$. It is
bounded if and only if $z$ lies in the spectrum of $H_\alpha$. A criterion
was first stated by Sütő in [29] for the Fibonacci Hamiltonian and extended by
Bellissard et al. in [1] to other irrational numbers. The appearance of $\delta$ in the next
lemma is purely technical and does not change the proof.
Lemma 2. A necessary and sufficient condition for $\{x_k(z)\}_k$ to be unbounded is that
$$|x_{N-1}(z)| \le 2+\delta,\quad |x_N(z)| > 2+\delta,\quad |z_N(z)| > 2+\delta$$
for some $N \ge 0$. This $N$ is unique. Set
$$G_k = G_{k-1} + a_k G_{k-2},\quad G_0 = 1,\quad G_1 = 1.$$
We have
$$|x_{k+1}| \ge |z_k| \ge e^{cG_{k-N}} + 1 \quad \forall k > N,$$
with $c = \log(1+\delta) > 0$ a constant.

Proof. We start by stating the following inequality on Chebyshev polynomials, valid for $|x| \ge 2$:
$$|S_l(x)| - |S_{l-1}(x)| \ge (|x|-1)|S_{l-1}(x)| - |S_{l-2}(x)| \ge (|x|-1)\big[|S_{l-1}(x)| - |S_{l-2}(x)|\big].$$
Iterating this, one obtains
$$|S_l(x)| - |S_{l-1}(x)| \ge (|x|-1)^l\big[|S_0(x)| - |S_{-1}(x)|\big] = (|x|-1)^l.$$
The proof is by induction. Hypothesis $H_N$ is the following:
one has $|x_N| > 2+\delta$ and $|z_N| > 2+\delta$; moreover, $|x_{N-1}| \le |z_N|$.
It is clear that the hypothesis of the lemma implies $H_N$.
We now show the induction step, namely that $H_N$ implies
$$|z_{N+1}| > |z_N|,$$
$$|x_{N+1}| > |z_N|,$$
and
$$|x_N| \le |z_{N+1}|.$$
It is easy to see that these three relations together with $H_N$ imply $H_{N+1}$.
Suppose $H_N$ is true; then one has
$$\begin{aligned}|z_{N+1}| &\ge |z_N S_{a_{N+1}}(x_N)| - |x_{N-1}S_{a_{N+1}-1}(x_N)|\\ &\ge |z_N|\big[|S_{a_{N+1}}(x_N)| - |S_{a_{N+1}-1}(x_N)|\big]\\ &\ge |z_N|(|x_N|-1)^{a_{N+1}}.\end{aligned} \tag{8}$$
This shows that $|z_{N+1}| > |z_N|$, since $|x_N| > 2+\delta$. One also has $|z_{N+1}| > |x_N|$.
Indeed, one can write
$$\begin{aligned}|z_{N+1}| &\ge |z_N|(|x_N|-1)\\ &= |x_N| + (|z_N|-1)|x_N| - |z_N|\\ &\ge |x_N| + 2(|z_N|-1) - |z_N|\\ &= |x_N| + |z_N| - 2 > |x_N|.\end{aligned}$$
Only the last relation remains to be shown.
In the same way as before, one shows
$$\begin{aligned}|x_{N+1}| &\ge |z_N S_{a_{N+1}-1}(x_N)| - |x_{N-1}S_{a_{N+1}-2}(x_N)|\\ &\ge |z_N|\big[|S_{a_{N+1}-1}(x_N)| - |S_{a_{N+1}-2}(x_N)|\big]\\ &\ge |z_N|(|x_N|-1)^{a_{N+1}-1},\end{aligned}$$
which leads to $|x_{N+1}| > |z_N|$.
Taking logarithms in (8), one obtains
$$\log|z_{k+1}| \ge \log|z_k| + a_{k+1}\log(|x_k|-1).$$
Using $|z_{k+1}| > |z_k|$ and $|z_{k-1}| < |x_k|$ leads to
$$\log(|z_{k+1}|-1) \ge \log(|z_k|-1) + a_{k+1}\log(|z_{k-1}|-1).$$
The sequence $\{\log(|z_k|-1)\}_{k>N}$ therefore grows at least as fast as the exponentially growing sequence $G_k$,
defined in the following way:
$$G_k = G_{k-1} + a_{k+N}G_{k-2},\quad G_0 = 1,\quad G_1 = 1.$$
One has
$$|x_{k+1}| \ge |z_k| \ge e^{cG_{k-N}} + 1 \quad \forall k > N,$$
with $c = \log(1+\delta) > 0$ a fixed constant. This constant $c$ comes from the
difference in the initial conditions between the sequence $\{G_k\}_k$ and the sequence
$\{\log(|z_k|-1)\}_{k>N}$.

This criterion motivates the following definition:

Set $\sigma_{k,p} = \{E \in \mathbb{R} : |t_{k,p}(E)| \le 2\}$.

Denote by $\alpha_n = \frac{p_n}{q_n}$ the rational approximation of $\alpha$. It is well known that the
spectrum of the operator $H_{\alpha_n}$, where $\alpha_n$ replaces $\alpha$ in the definition of $H_\alpha$, coincides
with the set $\sigma_{n+1,0}$ (recall that $t_{n+1,0} = x_n$ is the trace of $M_n$). The sequence of operators $\{H_{\alpha_n}\}$ is called the periodic approximants
of $H_\alpha$ and converges strongly to $H_\alpha$. It is well known that the spectrum of $H_\alpha$ is a
Cantor set that can be approximated by the band spectra of the periodic approximants.
The following proposition recalls this statement precisely ([29, 1, 32]):

Proposition 2. The sequence of spectra of periodic approximants of $H_\alpha$ satisfies

(i) the set $\sigma_{k,p}$ is made of $p\,q_k + q_{k-1}$ distinct intervals,
(ii) k+1,0 k,0 and k,p+1 k+1,0 (k+1,0
c
k,p ), k N,
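(iii) $\sigma_{k+1,0} \cap \sigma_{k,p} \cap \sigma_{k,p-1} = \emptyset$, $\forall V > 4$ and $\forall k \in \mathbb{N}$, $p \ge 0$.
We now recall an important result about the structure of the spectra of periodic approximants.
It describes how the intervals of $\sigma_{k,p}$ are included in $\sigma_{k-1,p}$. It requires
some definitions: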

Definition 1. For a given $k$, we call

Type I gap: a band of $\sigma_{k,1}$ included in a band of $\sigma_{k,0}$ and therefore in a gap of
$\sigma_{k+1,0}$,
Type II band: a band of $\sigma_{k+1,0}$ included in a band of $\sigma_{k,-1}$ and in a gap of $\sigma_{k,0}$,
Type III band: a band of $\sigma_{k+1,0}$ included in a band of $\sigma_{k,0}$ and in a gap of $\sigma_{k,-1}$.
As proved in [28], these definitions exhaust all the possible configurations, with
the following lemma.
Lemma 3 ([28]). At a given level $k$,
(i) a type I gap contains a unique type II band of $\sigma_{k+2,0}$;
(ii) a type II band contains $(a_{k+1}+1)$ bands of type I of $\sigma_{k+1,1}$. They alternate
with $a_{k+1}$ type III bands of $\sigma_{k+2,0}$;
(iii) a type III band contains $a_{k+1}$ bands of type I of $\sigma_{k+1,1}$. They alternate
with $(a_{k+1}-1)$ type III bands of $\sigma_{k+2,0}$.
As stated above, the spectrum of $H_{\alpha_n}$ consists of a growing number of intervals
of decreasing length as $n$ increases. We now recall a result obtained in [27] which
allows one to control the length of the bands of $\sigma_{k,p}$ at any level $k$. We again need some
notation to summarize it:
Let $A = \{I, II, III\}$ be an alphabet. To each band $B$ of the spectrum at level $k$
corresponds a unique word $i_0 i_1\cdots i_k \in A^{k+1}$ such that $B$ is a band of type $i_k$
included in a band of type $i_{k-1}$ at level $k-1$, ..., included in a band of type $i_0$
at level 0. This word is called the index of $B$. More than one band can have
the same index. Let $T_n = (t_{i,j}(n))_{3\times 3}$ be a sequence of matrices and $\tau = i_0 i_1\cdots i_k$ an
index; we define
$$L_\tau(T) = t_{i_0,i_1}(1)\,t_{i_1,i_2}(2)\cdots t_{i_{k-1},i_k}(k).$$

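We can now recall the result in [27]:

Theorem 5 ([27]). If $\alpha = [a_1, a_2, \ldots]$ is an irrational number in $[0, 1]$ and $H_\alpha$ is
defined as above with $V > 20$, then any band $B$ of index $\tau$ satisfies
$$4L_\tau(Q) \le |B| \le 4L_\tau(P)$$
where $P = (P_n)_{n>0}$,
$$P_n = \begin{pmatrix} 0 & c_1^{a_n-1} & 0\\ c_1/a_n & 0 & c_1/a_n\\ c_1/a_n & 0 & c_1/a_n\end{pmatrix}$$
with $c_1 = \frac{3}{V-8}$, and $Q = (Q_n)_{n>0}$,
$$Q_n = \begin{pmatrix} 0 & c_2^{a_n-1} & 0\\ c_2(a_n+2)^{-3} & 0 & c_2(a_n+2)^{-3}\\ c_2(a_n+2)^{-3} & 0 & c_2(a_n+2)^{-3}\end{pmatrix}$$
with $c_2 = \frac{1}{V+5}$.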

From now on, we define the spectra of the periodic approximants not only in $\mathbb{R}$ but in $\mathbb{C}$:
$$\sigma^\delta_{k,0} = \{z \in \mathbb{C} : |x_k(z)| \le 2+\delta\}.$$
The statements of the preceding propositions remain true if one replaces $\sigma_{k,p}$ by
$\sigma^\delta_{k,p}$ for some small enough fixed $\delta$. A condition on $V$ should be added to preserve the
invariant formula, $V > V_\delta = [16 + 24\delta + 9\delta^2 + 4]^{1/2}$ (see [10]). Since the invariant
is preserved, the whole structure of the sets $\sigma^\delta_{k,0}$ remains the same. The proof is the very
same, see [28, 24].
The following proposition states that, by the classical Koebe distortion theorem, the
height of this set is almost the same as its length.

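Proposition 3. If $k \ge 3$, $\delta > 0$ and $V > 20$, then there exist constants $c_\delta, d_\delta > 0$
such that
$$\bigcup_{j=1}^{q_{k-1}} B(x_k^{(j)}, r_k) \subset \sigma^\delta_{k,0} \subset \bigcup_{j=1}^{q_{k-1}} B(x_k^{(j)}, R_k)$$
where $\{x_k^{(j)}\}_{1\le j\le q_{k-1}}$ are the zeros of $x_k$, $r_k = c_\delta\inf_{\tau\in A^k} L_\tau(Q)$ and $R_k = d_\delta\sup_{\tau\in A^k} L_\tau(P)$.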

Proof. The proof follows the same steps as in [10]. Let $C_j^{2\delta}$ be a connected component
of $\sigma^{2\delta}_{k,0}$. With $V > \max\{20, V_{2\delta}\}$, $C_j^{2\delta}$ contains exactly one of the $q_{k-1}$ zeros
of $x_k$, namely $x_k^{(j)}$. Moreover, $C_j^{2\delta}$ contains one connected component of $\sigma^\delta_{k,0}$, denoted by
$C_j^\delta$. It suffices to show that
$$B(x_k^{(j)}, r_k) \subset C_j^\delta \subset B(x_k^{(j)}, R_k) \tag{9}$$
to obtain the result.

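As $x_k$ is a proper function (as a polynomial in $z$) and $C_j^{2\delta}$ contains a unique
zero, its degree is 1. Hence
$$x_k : \mathrm{int}(C_j^{2\delta}) \to B(0, 2+2\delta)$$
is univalent (as a proper function of degree one), and so
$$x_k^{-1} : B(0, 2+2\delta) \to \mathrm{int}(C_j^{2\delta})$$
is well defined and univalent too. Consequently, the function
$$F : B(0, 1) \to \mathbb{C},\qquad F(z) = \frac{x_k^{-1}((2+2\delta)z) - x_k^{(j)}}{(2+2\delta)\,(x_k^{-1})'(0)}$$
is univalent on $B(0, 1)$. We have $F(0) = 0$ and $F'(0) = 1$.

Applying the Koebe distortion theorem, we get
$$\frac{|z|}{(1+|z|)^2} \le |F(z)| \le \frac{|z|}{(1-|z|)^2},\qquad |z| \le 1.$$
Evaluating this for $|z| = \frac{2+\delta}{2+2\delta}$, one has
$$\frac{(2+\delta)(2+2\delta)}{(4+3\delta)^2} \le |F(z)| \le \frac{(2+\delta)(2+2\delta)}{\delta^2}.$$
By definition of $F$, this implies
$$|x_k^{-1}((2+2\delta)z) - x_k^{(j)}| \le \frac{(2+\delta)(2+2\delta)^2}{\delta^2}\,|(x_k^{-1})'(0)|,$$
$$|x_k^{-1}((2+2\delta)z) - x_k^{(j)}| \ge \frac{(2+\delta)(2+2\delta)^2}{(4+3\delta)^2}\,|(x_k^{-1})'(0)|.$$
And then, for $|z| = 2+\delta$,
$$|x_k^{-1}(z) - x_k^{(j)}| \le \frac{(2+\delta)(2+2\delta)^2}{\delta^2}\,|(x_k^{-1})'(0)|,$$
$$|x_k^{-1}(z) - x_k^{(j)}| \ge \frac{(2+\delta)(2+2\delta)^2}{(4+3\delta)^2}\,|(x_k^{-1})'(0)|.$$
To conclude, it suffices, with $|(x_k^{-1})'(0)| = |x_k'(x_k^{(j)})|^{-1}$, to remark that
$$r_k \lesssim |(x_k^{-1})'(0)| \lesssim R_k$$
and that, for $|z| = 2+\delta$, $x_k^{-1}(z)$ runs through the entire boundary of $C_j^\delta$.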

Proof of Theorem 1. We now have all the tools required to finish the proof of
Theorem 1.
As the $x_k^{(j)}$ are real, we have
$$\sigma^\delta_{k,0} \subset \{z \in \mathbb{C} : |\mathrm{Im}\,z| < R_k\} \subset \{z \in \mathbb{C} : |\mathrm{Im}\,z| < d_\delta\,q_k^{-\gamma(V)}\}$$
for a suitable $\gamma(V)$. With Proposition 2, this implies
$$\sigma^\delta_{k,0} \cup \sigma^\delta_{k,1} \subset \{z \in \mathbb{C} : |\mathrm{Im}\,z| < d_\delta\,q_k^{-\gamma(V)}\}. \tag{10}$$
Let us be more precise on how to choose $\gamma(V)$.
We need to bound all $R_k$ from above. $R_k$ is the supremum of products of $k$
entries of the matrices $P_n$. All the coefficients in $P_n$ are maximal for $a_n = 1$. The worst
possible case happens when the index history of a band has a type I band containing a band
of type II; in that case the coefficient can be trivial, equal to 1 (if $a_n = 1$). But
because of the combinatorial behavior of bands described by Lemma 3, this situation
cannot occur more than half of the time. Consequently,
$$R_k \lesssim c_1^{k/2}.$$
We should have $R_k < d_\delta\,q_k^{-\gamma(V)}$, so a suitable $\gamma$ can be chosen by taking
$$\gamma(V) \le \limsup_k \frac{k\log c_1^{-1}}{2\log q_k}.$$
For $\varepsilon = \mathrm{Im}\,z > 0$, we get a uniform lower bound for $|x_n(E + i\varepsilon)|$ with $E \in [-K, K] \subset \mathbb{R}$.
For a fixed $\varepsilon > 0$, we choose $k$ such that $d_\delta\,q_k^{-\gamma(V)} < \varepsilon$. With (10), this
shows $|x_k(E + i\varepsilon)| > 2+\delta$ and $|z_k(E + i\varepsilon)| > 2+\delta$. As $|x_{-1}(E + i\varepsilon)| = 2 \le 2+\delta$,
we are in the situation of Lemma 2 and we have the bound
$$|x_j| \ge e^{\log(1+\delta)\,G_{j-k}} + 1,\qquad \forall j > k. \tag{11}$$
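All this motivates the following definitions:
For $\delta > 0$, $T > 1$, denote by $k(T)$ the unique integer with
$$\frac{q_{k(T)-1}^{\gamma(V)}}{d_\delta} \le T \le \frac{q_{k(T)}^{\gamma(V)}}{d_\delta}$$
and let
$$N(T) = q_{k(T)+\lfloor\sqrt{k(T)}\rfloor}.$$

It is then easy to see that, for $T$ large enough and for every $\nu > 0$, there is a
constant $C_\nu > 0$ such that
$$N(T) \le C_\nu\,T^{\frac{1}{\gamma(V)}}\,T^{\nu}. \tag{12}$$
Let us give an explicit argument for this statement:
$$\begin{aligned}\frac{\log N(T)}{\log T} &= \frac{\log q_{k(T)+\lfloor\sqrt{k(T)}\rfloor}}{k(T)+\lfloor\sqrt{k(T)}\rfloor}\cdot\frac{k(T)+\lfloor\sqrt{k(T)}\rfloor}{\log T}\\ &\le \frac{\log q_{k(T)+\lfloor\sqrt{k(T)}\rfloor}}{k(T)+\lfloor\sqrt{k(T)}\rfloor}\cdot\frac{k(T)+\lfloor\sqrt{k(T)}\rfloor}{\frac{k(T)+1}{2}\log c_1^{-1}}\\ &\le 2D\,\frac{k(T)+\lfloor\sqrt{k(T)}\rfloor}{(k(T)+1)\log c_1^{-1}}.\end{aligned}$$
For $k(T)$ large enough, the last expression is close to $\frac{1}{\gamma(V)} = \frac{2D}{\log c_1^{-1}}$. So for $T$ large
enough, one gets
$$N(T) \le C_\nu\,T^{\frac{2D}{\log c_1^{-1}}}\,T^{\nu}$$
with $\nu$ arbitrarily small.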
Applying (11) to Theorem 4, we get
$$\begin{aligned}P_r(N(T), T) &\lesssim \exp(-cN(T)) + T^3\int_{-K}^{K}\left(\max_{1\le q_n\le N(T)}\left\|M_n\left(E + \frac{i}{T}\right)\right\|^2\right)^{-1} dE\\ &\lesssim \exp(-cN(T)) + T^3 e^{-2\log(1+\delta)\,G_{\lfloor\sqrt{k(T)}\rfloor}}.\end{aligned}$$
From this bound, it is clear that $P_r(N(T), T)$ goes to zero faster than any inverse
power of $T$, since the sequence $G$ has exponential growth. One gets the same bound for
$P_l(N(T), T)$ because of the symmetry of the potential. Finally, one can conclude with
(12) that
$$\alpha_u^+ \le \alpha$$
with
$$\alpha = \frac{1}{\gamma(V)} + \nu$$
and $\nu$ arbitrarily small.
For the second part of the theorem, notice that the constant 2 comes from the choice
of $\gamma(V)$, which considers the worst coefficient in the matrix $P_n$. But assuming there is no 1
in the continued fraction expansion, one gets
$$R_k \lesssim c_1^{k}$$
and
$$\gamma(V) \le \limsup_k \frac{k\log c_1^{-1}}{\log q_k}. \qquad\square$$

4. A Pathological Counterexample
The statements above hold if $D < +\infty$. In the case $D = +\infty$, we exhibit
a counterexample in the next statement. It is still an open question whether $D = +\infty$ implies
ballistic motion.
Theorem 6. There exists an irrational number $\alpha$ with $D = +\infty$ such that for any
$V > 20$,
$$\alpha_u^+ = 1.$$

The proof, by induction, follows the lines of the pathological example in [25].
The main idea is that, choosing an irrational number close to rational numbers
(with large values in the sequence $\{a_k\}_k$), the potentials of $H_\alpha$ and $H_{\alpha_n}$ coincide on
a large spatial scale, large enough to ensure that $H_\alpha$ and $H_{\alpha_n}$ have the same dynamical
behavior over a long time. It is well known that the periodic operator $H_{\alpha_n}$ has ballistic motion.
We now make these ideas more precise and first prove the following lemma.

Let $\alpha_n = [a_1, \ldots, a_n]$ be fixed and let $\alpha$ be any irrational number satisfying
$\alpha = [a_1, \ldots, a_n, \ldots]$.

Lemma 4. The Sturmian potentials of the operators $H_\alpha$ and $H_{\alpha_n}$ have the same
first $q_{n+1}$ values.

Proof. To prove this, we recall the iterative construction of the Sturmian word that
coincides with our potential. For details and proofs, see, e.g., [26]. Set $W_0 = 0$ and
$W_1 = 0^{a_1-1}V$, and define the sequence of Sturmian words by
$$W_{k+1} = W_k^{a_{k+1}}\,W_{k-1},\qquad k \ge 1.$$

Each word $W_k$ has length $q_k$.


As $H_\alpha$ and $H_{\alpha_n}$ have the same first $n$ terms of continued fraction expansion,
the words $W_0, W_1, \ldots, W_n$ are the same for $H_\alpha$ and $H_{\alpha_n}$.
For $H_{\alpha_n}$, the limit word $W_\infty$ is periodic with period $q_n$ and repeats the
word $W_n$ endlessly. As $W_n = W_{n-1}^{a_n}W_{n-2}$, one has
$$W_\infty = W_n^{a_{n+1}}\,W_{n-1}^{a_n}\,W_{n-2}\,W_n\cdots.$$
This shows that the potential of $H_{\alpha_n}$ begins with the word $W_n^{a_{n+1}}W_{n-1}$, which is the
word $W_{n+1}$ for $H_\alpha$. As $W_{n+1}$ has length $q_{n+1}$, this ends the proof.

We need another lemma, which can be found in [25]. It states that two operators have
close dynamics (on some time scale $T$) if their potentials are close enough. We
make this idea more precise by recalling this lemma:

Lemma 5. Let $H_1 = \Delta + V_1$ and $H_2 = \Delta + V_2$ act on $\ell^2(\mathbb{Z})$, with
$|V_1(k)|, |V_2(k)| < C$ for all $k \in \mathbb{Z}$ and some constant $C$. Let $T > 0$ and $\varepsilon > 0$ be
fixed constants. Then there exist $L(T, \varepsilon), \delta > 0$ such that, if $|V_1(k) - V_2(k)| < \delta$ for all
$|k| < L$, then
$$\big|\langle|X|^2_{H_1}\rangle_T - \langle|X|^2_{H_2}\rangle_T\big| < \varepsilon.$$

We get back to the construction.

Proof of Theorem 6. As $H_{\alpha_n}$ is an operator with periodic potential, one has
$$\langle|X|^2_{H_{\alpha_n}}\rangle_T > C_n T^2;$$
choose $T_n$ big enough such that
$$C_n > \frac{1}{\log T_n}.$$
One can then choose $a_{n+1}$ such that $L(T_n, 1) \le q_{n+1}$.
Inductively, we obtain a sequence $T_n$ going to infinity and an irrational number $\alpha$
with
$$\langle|X|^2\rangle_{T_n} > \frac{T_n^2}{\log T_n} - 1 > T_n^{2-\varepsilon},\qquad \forall\varepsilon > 0.$$
Now, since $\alpha$ is fully constructed, one can compare $H_\alpha$ with $H_{\alpha_n}$.
Then Lemma 5 implies
$$\langle|X|^2_{H_\alpha}\rangle_{T_n} > \frac{T_n^2}{\log T_n} - 1, \tag{13}$$
which leads to
$$\alpha_u^+ \ge \frac{1}{2}(2-\varepsilon) > 1 - \varepsilon,\qquad \forall\varepsilon > 0.$$


5. Lower Bound for the Box Counting Dimension of the Spectrum


We now give a lower bound for the fractal box counting dimension of the spectrum
of the operator $H_\alpha$. We first recall the definition. If $N(\varepsilon)$ denotes the number
of balls of diameter at most $\varepsilon$ needed to cover $\sigma$, then the upper box counting
dimension is defined by
$$\dim_B^+(\sigma) = \limsup_{\varepsilon\to 0}\frac{\log N(\varepsilon)}{-\log\varepsilon}.$$
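As a toy illustration of this definition (not of the Sturmian spectrum itself), the sketch below counts covering intervals for the level-$n$ approximation of the middle-third Cantor set at scale $\varepsilon = 3^{-n}$. The set, the scale and the function names are all chosen here for illustration; the ratio recovers the well-known value $\log 2/\log 3$.

```python
import math


def cantor_left_endpoints(level):
    """Left endpoints, in integer units of 3**-level, of the 2**level intervals
    of the level-th step of the middle-third Cantor construction."""
    pts = [0]
    for _ in range(level):
        # Rescale by 3, then keep the left third and the right third.
        pts = [3 * p for p in pts] + [3 * p + 2 for p in pts]
    return sorted(pts)


def box_count(level):
    """Number of grid cells of size 3**-level occupied by the level-th approximation."""
    return len(set(cantor_left_endpoints(level)))


def box_dimension_estimate(level):
    """log N(eps) / (-log eps) at eps = 3**-level."""
    return math.log(box_count(level)) / (level * math.log(3))


print(box_count(5), box_dimension_estimate(5))
```

The argument for the spectrum follows the same pattern: a lower bound on the number of bands of a given length at level $k$, combined with the length itself, gives a lower bound on the dimension.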
The spectrum is approached by the band spectra of the periodic operators $H_{\alpha_n}$. Moreover,
[28, 27] give precise information on the number of bands and their lengths. This
allows us to give a lower bound on the minimal number of sets of some decreasing scale
needed to cover the spectrum, and then a lower bound on the box dimension
of the limit set. A first idea for covering the spectrum is to take into account
all the bands and to take the smallest length as the scale, but this is a bad idea because
this minimal length decreases faster than the number of intervals grows. A second
idea is to count the number of bands that have the maximal length, in terms
of inverse powers of $V$. This leads to a better lower bound for the box dimension of
the spectrum for almost every irrational number.
Fixing the irrational number, one can improve this method by counting precisely
the number of bands that have a particular length. This has been done for the
Fibonacci number in [12], where the full fractal spectrum has been investigated. The
length of a band depends on its history, in this case the number of I in the
index history. Hence, one obtains in this way all the contributions at every scale to the
box dimension. It is shown that their result is optimal as $V$ increases, and one has
for $\alpha = [0, 1, 1, \ldots]$
$$\dim_B(\sigma(H_\alpha)) \sim \frac{\log(1+\sqrt{2})}{\log V}.$$
Another example, simpler than the golden mean, is the silver ratio. Fix $\alpha = [0, 2, 2, \ldots]$;
then all the bands have the same length up to a constant independent of $V$. Namely,
all bands at level $k$ have length $c_k V^{-k}$, where $c_k$ is a constant depending on the history
of the band but not on $V$. At a given rank $k$, the number of bands of length $c_k V^{-k}$
needed to cover the spectrum is bounded from below by $q_k$.
This implies that
$$\dim_B(\sigma(H_\alpha)) \ge \liminf_k \frac{\log q_k}{-\log(c_k V^{-k})} \sim \frac{\log(1+\sqrt{2})}{\log V}.$$
It is easy to show the reverse inequality by direct computation, and hence we
obtain the same estimate in this case:
$$\dim_B(\sigma(H_\alpha)) \sim \frac{\log(1+\sqrt{2})}{\log V}.$$
It is quite astonishing that both the golden mean and the silver ratio yield the same fractal
dimension estimate.
Going back to the general case, we apply the same method as for the silver
mean, that is, we count the number of bands at level $k$ that have length equal to $c_k V^{-k}$.
We obtain:
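Theorem 7. Set $C_k = \frac{3}{k}\sum_{j=1}^{k}\log(a_j + 2)$. For any irrational number $\alpha$
satisfying $C = \limsup_k C_k < +\infty$ and any $V > 20$, we have
$$\dim_B^+(\sigma) \ge \frac{1}{2}\,\frac{\log 2}{C + \log(V+5)} \tag{14}$$
where $\sigma$ is the spectrum of $H_\alpha$.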
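For orientation, the right-hand side of (14) can be evaluated on a finite list of partial quotients. In the sketch below, $C_k$ for the given finite list is used as a stand-in for the $\limsup$, which is exact only for constant sequences such as $[0, a, a, \ldots]$; the function names are hypothetical.

```python
import math


def C_k(partial_quotients):
    """C_k = (3/k) * sum_{j<=k} log(a_j + 2) for the given finite list a_1..a_k."""
    k = len(partial_quotients)
    return 3.0 / k * sum(math.log(a + 2) for a in partial_quotients)


def box_dimension_lower_bound(partial_quotients, V):
    """Right-hand side of (14), with C replaced by C_k of the finite list."""
    if V <= 20:
        raise ValueError("Theorem 7 assumes V > 20")
    return 0.5 * math.log(2) / (C_k(partial_quotients) + math.log(V + 5))


print(box_dimension_lower_bound([1] * 50, 25))    # golden mean, V = 25
print(box_dimension_lower_bound([1] * 50, 1000))  # larger coupling
print(box_dimension_lower_bound([10] * 50, 25))   # larger partial quotients
```

The bound decreases both when $V$ grows and when the partial quotients grow, in line with the behavior described for $\alpha = [0, c, c, \ldots]$ as $c \to \infty$.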

Remark 3. As in Lemma 1, $C$ is finite for a set of full Lebesgue measure.

The following lemma gives a precise statement of the counting idea.

Lemma 6. Denote by $n_{k,I}$, $n_{k,II}$ and $n_{k,III}$ the number of bands of type
I, II and III in $\sigma_{k,1}$, $\sigma_{k+1,0}$ and $\sigma_{k+1,0}$, respectively, with length greater than
$\varepsilon_k = 4\prod_{j=1}^{k}(V+5)^{-1}(a_j+2)^{-3}$.
For all $k$, we have the following recursion:
$$n_{k+1,I} = (a_{k+1}+1)\,n_{k,II} + a_{k+1}\,n_{k,III},$$
$$n_{k+1,II} = \mathbf{1}_{\{a_{k+1}\le 2\}}\,n_{k,I},$$
$$n_{k+1,III} = a_{k+1}\,n_{k,II} + (a_{k+1}-1)\,n_{k,III}.$$
Here, the initial conditions are $n_{0,I} = 1$, $n_{0,II} = 0$, $n_{0,III} = 1$.
Moreover, these three sequences satisfy the following properties:
$$n_{k,II} \ne 0 \ \vee\ n_{k,III} \ne 0,$$
$$n_{k,I} \ne 0,$$
$$n_{k,I} > n_{k,III},$$
and
$$n_{k,II} + n_{k,III} > 2^{k/2}.$$
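The recursion of Lemma 6 is easy to iterate. The sketch below (function name chosen here) tabulates the three counts and checks the two relations that the proof actually establishes step by step: the identity $n_{k,I} = n_{k,III} + n_{k-1,II} + n_{k-1,III}$ for $k \ge 1$, and the two-step doubling $n_{k,II} + n_{k,III} \ge 2(n_{k-2,II} + n_{k-2,III})$, which is what drives growth of order $2^{k/2}$.

```python
def band_counts(partial_quotients):
    """[(n_{k,I}, n_{k,II}, n_{k,III})] for k = 0..len(partial_quotients)."""
    n_I, n_II, n_III = 1, 0, 1  # initial conditions at level 0
    history = [(n_I, n_II, n_III)]
    for a in partial_quotients:
        n_I, n_II, n_III = (
            (a + 1) * n_II + a * n_III,
            n_I if a <= 2 else 0,       # indicator 1_{a <= 2}
            a * n_II + (a - 1) * n_III,
        )
        history.append((n_I, n_II, n_III))
    return history


def check_relations(history):
    """Check the identity for n_I and the two-step doubling of n_II + n_III."""
    s = [n_II + n_III for _, n_II, n_III in history]
    identity = all(
        history[k][0] == history[k][2] + s[k - 1] for k in range(1, len(history))
    )
    doubling = all(s[k] >= 2 * s[k - 2] for k in range(2, len(history)))
    return identity, doubling


fib = band_counts([1] * 10)
print(fib[:5])
print(check_relations(fib))
print(check_relations(band_counts([3, 1, 4, 1, 5, 9, 2, 6])))
```

For the Fibonacci case the counts themselves follow the Fibonacci sequence, which reflects the self-similar band structure described in Lemma 3.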

Proof. The recursion is obvious with (5).

The first two properties are proved by induction. The initial conditions give level 0.
Assume the properties hold at level $k$; then, as $a_{k+1} > 0$, $n_{k,II} \ne 0 \vee n_{k,III} \ne 0$ implies
$n_{k+1,I} \ne 0$. For the second part, if $a_{k+1} \le 2$ then $n_{k+1,II} \ne 0$; otherwise $a_{k+1} > 2$ implies
$n_{k+1,III} \ne 0$.
To prove
$$n_{k,I} > n_{k,III}$$
it suffices to see that
$$n_{k,I} = n_{k,III} + n_{k-1,II} + n_{k-1,III}.$$
For the last property, it suffices to show that
$$n_{k,II} + n_{k,III} \ge 2(n_{k-2,II} + n_{k-2,III}).$$
Using the recursion, we get
$$n_{k,II} = \big[(a_{k-1}+1)\,n_{k-2,II} + a_{k-1}\,n_{k-2,III}\big]\mathbf{1}_{\{a_k\le 2\}},$$
$$n_{k,III} = (a_k-1)\big(a_{k-1}\,n_{k-2,II} + (a_{k-1}-1)\,n_{k-2,III}\big) + a_k\,n_{k-2,I}\,\mathbf{1}_{\{a_{k-1}\le 2\}}.$$
We distinguish 4 cases depending on the values of $a_k$ and $a_{k-1}$.

If $a_k > 2$ and $a_{k-1} > 2$, then we simply get
$$\begin{aligned}n_{k,II} + n_{k,III} &= (a_k-1)\big(a_{k-1}\,n_{k-2,II} + (a_{k-1}-1)\,n_{k-2,III}\big)\\ &\ge (a_k-1)(a_{k-1}-1)(n_{k-2,II} + n_{k-2,III})\\ &\ge 4(n_{k-2,II} + n_{k-2,III}).\end{aligned}$$
If $a_k \le 2$ and $a_{k-1} > 2$, then one has
$$\begin{aligned}n_{k,II} + n_{k,III} &= (a_k-1)\big(a_{k-1}\,n_{k-2,II} + (a_{k-1}-1)\,n_{k-2,III}\big)\\ &\quad + (a_{k-1}+1)\,n_{k-2,II} + a_{k-1}\,n_{k-2,III}\\ &\ge a_k a_{k-1}(n_{k-2,II} + n_{k-2,III})\\ &\ge 3(n_{k-2,II} + n_{k-2,III}).\end{aligned}$$
If $a_k > 2$ and $a_{k-1} \le 2$, then one has
$$\begin{aligned}n_{k,II} + n_{k,III} &= (a_k-1)\big(a_{k-1}\,n_{k-2,II} + (a_{k-1}-1)\,n_{k-2,III}\big) + a_k\,n_{k-2,I}\\ &\ge (a_k-1)\big(a_{k-1}\,n_{k-2,II} + (a_{k-1}-1)\,n_{k-2,III}\big) + a_k\,n_{k-2,III}\\ &\ge (a_k-1)\,a_{k-1}(n_{k-2,II} + n_{k-2,III})\\ &\ge 2(n_{k-2,II} + n_{k-2,III}).\end{aligned}$$
If $a_k \le 2$ and $a_{k-1} \le 2$, then one gets
$$\begin{aligned}n_{k,II} + n_{k,III} &= (a_k-1)\big(a_{k-1}\,n_{k-2,II} + (a_{k-1}-1)\,n_{k-2,III}\big)\\ &\quad + a_k\,n_{k-2,I} + (a_{k-1}+1)\,n_{k-2,II} + a_{k-1}\,n_{k-2,III}.\end{aligned}$$
And one obtains
$$\begin{aligned}n_{k,II} + n_{k,III} &\ge \big((a_k-1)a_{k-1} + a_{k-1} + 1\big)\,n_{k-2,II}\\ &\quad + \big((a_k-1)(a_{k-1}-1) + a_{k-1} + a_k\big)\,n_{k-2,III}\\ &\ge (a_k a_{k-1} + 1)\,n_{k-2,II} + (a_{k-1} + a_k)\,n_{k-2,III}\\ &\ge 2(n_{k-2,II} + n_{k-2,III}).\end{aligned}$$

Proof of Theorem 7. With the previous lemma, we find a bound for $n_{k,II} + n_{k,III}$,
that is, the number of bands of length at least $\varepsilon_k$. To make sure we have a disjoint
cover, we consider only half of the bands. Each band is then separated from the next by a
band we do not count. Then, by definition of the box dimension, we have
$$\dim_B^+(\sigma) \ge \liminf_k \frac{\log\big(\tfrac{1}{2}(n_{k,II} + n_{k,III})\big)}{-\log\varepsilon_k},$$
and the stated result follows. $\square$

Remark 4. The former bound for box dimension provided in [27] was
 
+ log 2 log M log 3
dimB () dimH () max , ,
10 log 2 3 log t2 log M log t2 /3
1 1
where M = lim inf k (a1 a2 ak ) k and t2 = 4(V +8) .
For almost all irrational numbers, that is, with M equal to the Khintchin constant
(2.685. . .), our bound is better than the one above for any V > 20. On the other
hand, for every fixed V , there is no improvement for some specific numbers. Fixing
= [0, c, c, . . .], the bound above goes to 1 and (14) to 0 as c goes to infinity.
A lower bound for the box dimension can be relevant to obtain a bound for the dynamic
lower exponent u .

Definition 2. An irrational number is said to be a bounded density irrational number if it fulfills the following condition
\[ \limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} a_i < +\infty. \]

Theorem 8. For any bounded density irrational number, we have


1 log 2

u
2 C + log(V + 5)
k
with C = lim sup k3 j=1 log(aj + 2).

Proof. It is shown in [30, 12] that if the norms of the transfer matrix are poly-
+
nomially bounded on the spectrum then we have u dimB (). This property
on the norm of the transfer matrix is shown for irrational with bounded density
in [18].

Acknowledgments
It is a pleasure to thank Dominique Vieugue for useful conversations about number
theory.

References
[1] J. Bellissard, B. Iochum, E. Scoppola and D. Testard, Spectral properties of one dimensional quasi-crystals, Comm. Math. Phys. 125 (1989) 527–543.
[2] J.-M. Barbaroux, F. Germinet and S. Tcheremchantsev, Fractal dimensions and the phenomenon of intermittency in quantum dynamics, Duke Math. J. 110 (2001) 161–193.
[3] J. M. Combes, Connections between quantum dynamics and spectral properties of time-evolution operators, in Differential Equations with Applications to Mathematical Physics (Academic Press, Boston, 1993), pp. 59–68.
[4] D. Damanik, Dynamical upper bounds for one-dimensional quasicrystals, J. Math. Anal. Appl. 303 (2005) 327–341.
[5] D. Damanik, α-continuity properties of one-dimensional quasicrystals, Comm. Math. Phys. 192 (1998) 169–182.
[6] D. Damanik, R. Killip and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals. III. α-continuity, Comm. Math. Phys. 212 (2000) 191–204.
[7] D. Damanik, D. Lenz and G. Stolz, Lower transport bounds for one-dimensional continuum Schrödinger operators, Math. Ann. 336 (2006) 361–389.
[8] D. Damanik, A. Sütő and S. Tcheremchantsev, Power-law bounds on transfer matrices and quantum dynamics in one dimension II, J. Funct. Anal. 216 (2004) 362–387.
[9] D. Damanik and S. Tcheremchantsev, Power-law bounds on transfer matrices and quantum dynamics in one dimension, Comm. Math. Phys. 236 (2003) 513–534.
[10] D. Damanik and S. Tcheremchantsev, Upper bounds in quantum dynamics, J. Amer. Math. Soc. 20 (2007) 799–827.
[11] D. Damanik and S. Tcheremchantsev, Scaling estimates for solutions and dynamical lower bounds on wavepacket spreading, J. Anal. Math. 97 (2005) 103–131.
[12] D. Damanik, M. Embree, A. Gorodetski and S. Tcheremchantsev, The fractal dimension of the spectrum of the Fibonacci Hamiltonian, Comm. Math. Phys. 280 (2008) 499–516.
[13] F. Germinet, A. Kiselev and S. Tcheremchantsev, Transfer matrices and transport for Schrödinger operators, Ann. Inst. Fourier (Grenoble) 54 (2004) 787–830.
[14] I. Guarneri, Spectral properties of quantum diffusion on discrete lattices, Europhys. Lett. 10 (1989) 95–100.
[15] I. Guarneri, On an estimate concerning quantum diffusion in the presence of a fractal spectrum, Europhys. Lett. 21 (1993) 729–733.
[16] I. Guarneri and H. Schulz-Baldes, Lower bounds on wave packet propagation by packing dimensions of spectral measures, Math. Phys. Electron. J. 5 (1999), Paper 1, 16 pp.
[17] I. Guarneri and H. Schulz-Baldes, Intermittent lower bounds on quantum diffusion, Lett. Math. Phys. 49 (1999) 317–324.
[18] B. Iochum, L. Raymond and D. Testard, Resistance of one-dimensional quasicrystals, Phys. A 187 (1992) 353–368.
[19] S. Jitomirskaya and Y. Last, Power-law subordinacy and singular spectra. I. Half-line operators, Acta Math. 183 (1999) 171–189.

[20] S. Jitomirskaya and Y. Last, Power-law subordinacy and singular spectra. II. Line operators, Comm. Math. Phys. 211 (2000) 643–658.
[21] S. Jitomirskaya, H. Schulz-Baldes and G. Stolz, Delocalization in random polymer models, Comm. Math. Phys. 233 (2003) 27–48.
[22] A. Ya. Khinchin, Continued Fractions (University of Chicago Press, 1964).
[23] A. Kiselev and Y. Last, Solutions, spectrum and dynamics for Schrödinger operators on infinite domains, Duke Math. J. 102 (2000) 125–150.
[24] R. Killip, A. Kiselev and Y. Last, Dynamical upper bounds on wavepacket spreading, Amer. J. Math. 125 (2003) 1165–1198.
[25] Y. Last, Quantum dynamics and decompositions of singular continuous spectra, J. Funct. Anal. 142 (1996) 406–445.
[26] M. Lothaire, Algebraic Combinatorics on Words (Cambridge Univ. Press, 2002), Chap. 2, pp. 40–97.
[27] Q. Liu and Z. Wen, Hausdorff dimension of spectrum of one-dimensional Schrödinger operator with Sturmian potentials, Potential Anal. 20 (2004) 33–59.
[28] L. Raymond, A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain, preprint (1997).
[29] A. Sütő, The spectrum of a quasiperiodic Schrödinger operator, Comm. Math. Phys. 111 (1987) 409–415.
[30] S. Tcheremchantsev, Mixed lower bound in quantum transport, J. Funct. Anal. 197 (2003) 247–282.
[31] S. Tcheremchantsev, Dynamical analysis of Schrödinger operators with growing sparse potentials, Comm. Math. Phys. 253 (2005) 221–252.
[32] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Mathematical Surveys and Monographs, Vol. 72 (Amer. Math. Soc., 2000).
September 14, 2010 13:29 WSPC/S0129-055X 148-RMP
J070-S0129055X10004107

Reviews in Mathematical Physics
Vol. 22, No. 8 (2010) 881–961
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004107

ASYMPTOTICS FOR FERMI CURVES: SMALL MAGNETIC POTENTIAL

GUSTAVO DE OLIVEIRA
Department of Mathematics,
University of British Columbia, Canada
goliveira5d@gmail.com

Received 9 March 2010

We consider complex Fermi curves of electric and magnetic periodic fields. These are
analytic curves in $\mathbb{C}^2$ that arise from the study of the eigenvalue problem for periodic
Schrödinger operators. We characterize a certain class of these curves in the region of
$\mathbb{C}^2$ where at least one of the coordinates has large imaginary part. The new results in
this work extend previous results in the absence of magnetic field to the case of small
magnetic field. Our theorems can be used to show that generically these Fermi curves
belong to a class of Riemann surfaces of infinite genus.

Keywords: Fermi curves; Bloch variety; Fermi surfaces; periodic Schrödinger operators.

Mathematics Subject Classification 2010: 47B99, 81Q99, 14H55

1. Introduction
In [1], the authors introduced a class of Riemann surfaces of infinite genus that are asymptotic to a finite number of complex lines joined by infinitely many handles. These surfaces are constructed by pasting together a compact submanifold of finite genus, plane domains, and handles. All these components satisfy a number of geometric/analytic hypotheses stated in [1] that specify the asymptotic holomorphic structure of the surface. The class of surfaces obtained in this way yields an extension of the classical theory of compact Riemann surfaces that has analogues of many theorems of the classical theory. It was proven in [1] that this new class includes quite general hyperelliptic surfaces, heat curves (which are spectral curves associated to a certain heat equation), and Fermi curves with zero magnetic potential. In order to verify the geometric/analytic hypotheses for the latter, the authors proved two asymptotic theorems similar to the ones we prove below. This is the main step needed to verify these hypotheses. In this work, we extend their results to Fermi curves with small magnetic potential.
There are two immediate applications of our results. First, as we have already mentioned, one can use our theorems for verifying the geometric/analytic hypotheses of [1] for Fermi curves with small magnetic potential. This would show that


these curves belong to the class of Riemann surfaces mentioned above. Secondly, one can prove that a class of these curves is irreducible (in the usual algebraic-geometrical sense). Both these applications were done in [1] for Fermi curves with zero magnetic potential.
Complex Fermi curves (and other similar spectral curves) have been studied, from different perspectives, in the absence of magnetic field [1–5], and in the presence of magnetic field [6]. Some results on the real Fermi curve in the high-energy region were obtained in [7]. There one also finds a short description of the existing results on periodic magnetic Schrödinger operators. An even more general review is presented in [8]. To our knowledge, our work provides new results on complex Fermi curves with magnetic field. At this moment, we are only able to handle the case of small magnetic potential. The asymptotic characterization of Fermi curves with arbitrarily large magnetic potential remains an open problem. In order to prove our theorems, we follow the same strategy as [1]. The presence of magnetic field makes the analysis considerably harder and requires new estimates. As was pointed out in [7, 8], the study of an operator with magnetic potential is essentially more complicated than the study of the operator with just an electric potential. This seems to be the case in this problem as well.
Before we outline our results let us introduce some definitions. Let $\Gamma$ be a lattice in $\mathbb{R}^2$ and let $A_1$, $A_2$ and $V$ be real-valued functions in $L^2(\mathbb{R}^2)$ that are periodic with respect to $\Gamma$. Set $A := (A_1, A_2)$ and define the operator
\[ H(A, V) := (i\nabla + A)^2 + V \]
acting on $L^2(\mathbb{R}^2)$, where $\nabla$ is the gradient operator in $\mathbb{R}^2$. For $k \in \mathbb{R}^2$ consider the following eigenvalue-eigenvector problem in $L^2(\mathbb{R}^2)$ with boundary conditions,
\[ H(A, V)\varphi = \lambda\varphi, \qquad \varphi(x + \gamma) = e^{ik\cdot\gamma}\varphi(x) \]
for all $x \in \mathbb{R}^2$ and all $\gamma \in \Gamma$. Under suitable hypotheses on the potentials $A$ and $V$ this problem is self-adjoint and its spectrum is discrete. It consists of a sequence of real eigenvalues
\[ E_1(k, A, V) \leq E_2(k, A, V) \leq \cdots \leq E_n(k, A, V) \leq \cdots. \]
For each integer $n \geq 1$, the eigenvalue $E_n(k, A, V)$ defines a continuous function of $k$. From the above boundary condition, it is easy to see that this function is periodic with respect to the dual lattice
\[ \Gamma^\# := \{ b \in \mathbb{R}^2 \mid b\cdot\gamma \in 2\pi\mathbb{Z} \text{ for all } \gamma \in \Gamma \}, \]
where $b\cdot\gamma$ is the usual scalar product on $\mathbb{R}^2$. It is customary to refer to $k$ as the crystal momentum and to $E_n(k, A, V)$ as the $n$th band function. The corresponding normalized eigenfunctions $\varphi_{n,k}$ are called Bloch eigenfunctions.
The operator $H(A, V)$ (and its three-dimensional counterpart) is important in solid state physics. It is the Hamiltonian of a single electron under the influence of

magnetic field with vector potential $A$, and electric field with scalar potential $V$, in the independent electron model of a two-dimensional solid [9]. The classical framework for studying the spectrum of a differential operator with periodic coefficients is the Floquet (or Bloch) theory [9–11]. Roughly speaking, the main idea of this theory is to decompose the original eigenvalue problem, which usually has continuous spectrum, into a family of boundary value problems, each one having discrete spectrum. In our context this leads to decomposing the problem $H(A, V)\varphi = \lambda\varphi$ (without boundary conditions) into the above $k$-family of boundary value problems.
Let $U_k$ be the unitary transformation on $L^2(\mathbb{R}^2)$ that acts as
\[ U_k : \varphi(x) \mapsto e^{ik\cdot x}\varphi(x). \]
By applying this transformation, we can rewrite the above problem and put the boundary conditions into the operator. Indeed, if we define
\[ H_k(A, V) := U_k^{-1} H(A, V)\, U_k \quad\text{and}\quad \psi := U_k^{-1}\varphi, \]
then the above problem is unitarily equivalent to
\[ H_k(A, V)\psi = \lambda\psi \quad\text{for } \psi \in L^2(\mathbb{R}^2/\Gamma). \]
Furthermore, a simple (formal) calculation shows that
\[ H_k(A, V) = (i\nabla + A - k)^2 + V. \]
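
The formal calculation can be spelled out as follows. This is a sketch that assumes $\psi$ is smooth, writes $\nabla$ for the gradient and $\psi$ for a $\Gamma$-periodic function (symbols reconstructed from context), and ignores domain questions. Conjugating the first-order part gives
\[
U_k^{-1}(i\nabla + A)\,U_k\psi
= e^{-ik\cdot x}\bigl(i\nabla + A\bigr)\bigl(e^{ik\cdot x}\psi\bigr)
= e^{-ik\cdot x}\bigl(e^{ik\cdot x}\, i\nabla\psi + i(ik)\,e^{ik\cdot x}\psi + A\,e^{ik\cdot x}\psi\bigr)
= (i\nabla + A - k)\psi .
\]
Applying this identity twice, and using that multiplication by $V$ commutes with multiplication by $e^{ik\cdot x}$, yields
\[
U_k^{-1}\bigl((i\nabla + A)^2 + V\bigr)U_k = (i\nabla + A - k)^2 + V .
\]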
The real lifted Fermi curve of $(A, V)$ with energy $\lambda \in \mathbb{R}$ is defined as
\[ \widehat{F}_{\lambda,\mathbb{R}}(A, V) := \{ k \in \mathbb{R}^2 \mid (H_k(A, V) - \lambda)\varphi = 0 \text{ for some } \varphi \in D_{H_k(A,V)} \setminus \{0\} \}, \]
where $D_{H_k(A,V)} \subset L^2(\mathbb{R}^2/\Gamma)$ denotes the (dense) domain of $H_k(A, V)$. The adjective lifted indicates that $\widehat{F}_{\lambda,\mathbb{R}}(A, V)$ is a subset of $\mathbb{R}^2$ rather than $\mathbb{R}^2/\Gamma^\#$. As we may replace $V$ by $V - \lambda$, we only discuss the case $\lambda = 0$ and write $\widehat{F}_{\mathbb{R}}(A, V)$ in place of $\widehat{F}_{0,\mathbb{R}}(A, V)$ to simplify the notation. Let $|\Gamma| := \int_{\mathbb{R}^2/\Gamma} dx$ and $\hat{A}(0) := \frac{1}{|\Gamma|}\int_{\mathbb{R}^2/\Gamma} A(x)\,dx$. Since $H_k(A, V)$ is equal to $H_{k - \hat{A}(0)}(A - \hat{A}(0), V)$, if we perform the change of coordinates $k \to k + \hat{A}(0)$ and redefine $A - \hat{A}(0) \to A$ we may assume, without loss of generality, that $\hat{A}(0) = 0$. The dual lattice $\Gamma^\#$ acts on $\mathbb{R}^2$ by translating $k \mapsto k + b$ for $b \in \Gamma^\#$. This action maps $\widehat{F}_{\mathbb{R}}(A, V)$ to itself because for each $n \geq 1$ the function $k \mapsto E_n(k, A, V)$ is periodic with respect to $\Gamma^\#$. In other words, the real lifted Fermi curve is periodic with respect to $\Gamma^\#$. Define
\[ F_{\mathbb{R}}(A, V) := \widehat{F}_{\mathbb{R}}(A, V)/\Gamma^\#. \]
We call $F_{\mathbb{R}}(A, V)$ the real Fermi curve of $(A, V)$. It is a curve in the torus $\mathbb{R}^2/\Gamma^\#$.
The above definitions and the real Fermi curve have physical meaning. It is useful and interesting, however, to study the complexification of these curves. Knowledge about the complexified curves may provide information about the real counterparts. For complex-valued functions $A_1$, $A_2$ and $V$ in $L^2(\mathbb{R}^2)$ and for $k \in \mathbb{C}^2$ the above problem is no longer self-adjoint. Its spectrum, however, remains discrete. It is a sequence of eigenvalues in the complex plane. From the boundary condition

in the original problem it is easy to see that the family of functions $k \mapsto E_n(k, A, V)$ remains periodic with respect to $\Gamma^\#$. Moreover, the transformation $U_k$ is no longer unitary but it is still bounded and invertible and it still preserves the spectrum, that is, we can still rewrite the original problem in the form $H_k(A, V)\psi = \lambda\psi$ for $\psi \in L^2(\mathbb{R}^2/\Gamma)$ without modifying the eigenvalues. Thus, it makes sense to define
\[ \widehat{F}(A, V) := \{ k \in \mathbb{C}^2 \mid H_k(A, V)\varphi = 0 \text{ for some } \varphi \in D_{H_k(A,V)} \setminus \{0\} \}, \]
\[ F(A, V) := \widehat{F}(A, V)/\Gamma^\#. \]
We call $\widehat{F}(A, V)$ and $F(A, V)$ the complex lifted Fermi curve and the complex Fermi curve, respectively. When there is no risk of confusion we refer to either simply as Fermi curve.
We are now ready to outline our results. When $A$ and $V$ are zero the (free) Fermi curve can be found explicitly. It consists of two copies of $\mathbb{C}$ with the points $-b_2 + ib_1$ (in the first copy) and $b_2 + ib_1$ (in the second copy) identified for all $(b_1, b_2) \in \Gamma^\#$ with $b_2 \neq 0$. In this work, we prove that in the region of $\mathbb{C}^2$ where $k \in \mathbb{C}^2$ has large imaginary part the Fermi curve (for nonzero $A$ and $V$) is close to the free Fermi curve. In a compact form, our main result (that will be stated precisely in Theorems 1 and 2) is essentially the following.

Main result. Suppose that $A$ and $V$ have some regularity and assume that (in a suitable norm) $A$ is smaller than a constant given by the parameters of the problem. Write $k$ in $\mathbb{C}^2$ as $k = u + iv$ with $u$ and $v$ in $\mathbb{R}^2$ and suppose that $|v|$ is larger than a constant given by the parameters of the problem. (Recall that the free Fermi curve is two copies of $\mathbb{C}$ with certain points in one copy identified with points in the other one.) Then, in this region of $\mathbb{C}^2$, the Fermi curve of $A$ and $V$ is very close to the free Fermi curve, except that instead of two planes we may have two deformed planes, and identifications between points can open up to handles that look like $\{(z_1, z_2) \in \mathbb{C}^2 \mid z_1 z_2 = \text{constant}\}$ in suitable local coordinates.

The proof of our results has basically three steps:

We first derive very detailed information about the free Fermi curve (which is explicitly known). Then, to compute the interacting Fermi curve we have to find the kernel of $H$ in $L^2(\mathbb{R}^2)$ with the above boundary conditions.
In the second step of the proof, we derive a number of estimates for showing that this kernel has finite dimension for small $A$ and $k \in \mathbb{C}^2$ with large imaginary part. Our strategy here is similar to the Feshbach method in perturbation theory [12]. Indeed, we prove that in the complement of the kernel of $H$ in $L^2(\mathbb{R}^2)$, after a suitable invertible change of variables in $L^2(\mathbb{R}^2)$, the operator $H$ multiplied by the inverse of the operator that implements this change of variables is a compact perturbation of the identity that is invertible for such $A$ and $k$. This reduces the problem of finding the kernel to finite dimension and thus we can write local defining equations for the Fermi curve.

In the third step of the proof, we use these equations to study the Fermi curve. A few more estimates and the implicit function theorem give us the deformed planes. The handles are obtained using a quantitative Morse lemma from [13] that is available in Appendix A.

Steps two and three contain most of the novelties in this work. The critical part of the proof is the second step. The main difficulty arises due to the presence of the term $A \cdot i\nabla$ in the Hamiltonian $H(A, V)$. When $A$ is large, taking the imaginary part of $k \in \mathbb{C}^2$ arbitrarily large is not enough to control this term: it is not enough to make its contribution small and hence have the interacting Fermi curve as a perturbation of the free Fermi curve. (The term $V$ in $H(A, V)$ is easily controlled by this method.) However, the proof can be implemented by assuming that $A$ is small.
This work is organized as follows. In Sec. 2, we collect some properties of the free Fermi curve and in Sec. 3, we define $\varepsilon$-tubes about it. In Sec. 4, we state our main results and in Sec. 5, we describe the general strategy of analysis used to prove them. Subsequently, we implement this strategy by proving a number of lemmas and propositions in Secs. 6–10, which we put together later in Secs. 11 and 12 to prove our main theorems. The proofs of the estimates of Secs. 9 and 10 are left to Appendices B and C.

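2. The Free Fermi Curve

When the potentials $A$ and $V$ are zero the curve $\widehat{F}(A, V)$ can be found explicitly. In this section we collect some properties of this curve. For $\nu \in \{1, 2\}$ and $b \in \Gamma^\#$ set
\[
\begin{aligned}
N_{b,\nu}(k) &:= (k_1 + b_1) + i(-1)^\nu (k_2 + b_2), \\
\mathcal{N}_\nu(b) &:= \{ k \in \mathbb{C}^2 \mid N_{b,\nu}(k) = 0 \}, \\
N_b(k) &:= N_{b,1}(k)\,N_{b,2}(k), \\
\mathcal{N}_b &:= \mathcal{N}_1(b) \cup \mathcal{N}_2(b), \\
\theta_\nu(b) &:= \tfrac{1}{2}\bigl((-1)^\nu b_2 + i b_1\bigr).
\end{aligned}
\]
Observe that $\mathcal{N}_\nu(b)$ is a line in $\mathbb{C}^2$. The free lifted Fermi curve is a union of these lines. Here is the precise statement.

Proposition 1 (The Free Fermi Curve). The curve $\widehat{F}(0, 0)$ is the locally finite union
\[ \bigcup_{b \in \Gamma^\#} \; \bigcup_{\nu \in \{1,2\}} \mathcal{N}_\nu(b). \]
In particular, the curve $F(0, 0)$ is a complex analytic curve in $\mathbb{C}^2/\Gamma^\#$.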



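The proof of this proposition is straightforward. It can be found in [13]. Here we only give its first part.

Proof of Proposition 1 (First Part). For all $k \in \mathbb{C}^2$ the functions $\{ e^{ib\cdot x} \mid b \in \Gamma^\# \}$ form a complete set of eigenfunctions for $H_k(0, 0)$ in $L^2(\mathbb{R}^2/\Gamma)$ satisfying
\[ H_k(0, 0)\,e^{ib\cdot x} = (i\nabla - k)^2 e^{ib\cdot x} = (b + k)^2 e^{ib\cdot x} = N_b(k)\,e^{ib\cdot x}. \]
Hence,
\[ \widehat{F}(0, 0) = \{ k \in \mathbb{C}^2 \mid N_b(k) = 0 \text{ for some } b \in \Gamma^\# \} = \bigcup_{b \in \Gamma^\#} \mathcal{N}_b = \bigcup_{b \in \Gamma^\#} \; \bigcup_{\nu \in \{1,2\}} \mathcal{N}_\nu(b). \]
This is the desired expression for $\widehat{F}(0, 0)$.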
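
The factorization used in this step, $(b + k)^2 = N_{b,1}(k)\,N_{b,2}(k)$ for complex $k$, is easy to check numerically. The sketch below assumes the definition $N_{b,\nu}(k) = (k_1 + b_1) + i(-1)^\nu(k_2 + b_2)$; the sample values of $b$ and $k$ are hypothetical and chosen only for illustration.

```python
def N(nu, b, k):
    """N_{b,nu}(k) = (k1 + b1) + i(-1)^nu (k2 + b2) for nu in {1, 2}."""
    return (k[0] + b[0]) + 1j * (-1) ** nu * (k[1] + b[1])

def free_symbol(b, k):
    """(b + k)^2 = (b1 + k1)^2 + (b2 + k2)^2, the eigenvalue of H_k(0, 0) on e^{ib.x}."""
    return (b[0] + k[0]) ** 2 + (b[1] + k[1]) ** 2

b = (2.0, -3.0)                 # hypothetical dual-lattice vector
k = (0.5 + 1.5j, -1.0 + 4.0j)   # hypothetical complex quasi-momentum

# The product of the two linear factors reproduces the free symbol.
print(abs(N(1, b, k) * N(2, b, k) - free_symbol(b, k)) < 1e-9)

# A point on the line N_1(b): pick k2 freely and solve N_{b,1}(k) = 0 for k1.
k2 = 0.7 - 2.0j
k_on_line = (-b[0] + 1j * (k2 + b[1]), k2)
print(abs(N(1, b, k_on_line)) < 1e-9, abs(free_symbol(b, k_on_line)) < 1e-9)
```

The second check illustrates why each zero set is a complex line: one coordinate is free and the other is determined linearly.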


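The lines $\mathcal{N}_\nu(b)$ have the following properties (see [13] for a proof).

Proposition 2 (Properties of $\mathcal{N}_\nu(b)$). Let $\nu \in \{1, 2\}$ and let $b, c, d \in \Gamma^\#$. Then:

(a) $\mathcal{N}_\nu(b) \cap \mathcal{N}_\nu(c) = \emptyset$ if $b \neq c$;
(b) $\operatorname{dist}(\mathcal{N}_\nu(b), \mathcal{N}_\nu(c)) = \frac{1}{\sqrt{2}}|b - c|$;
(c) $\mathcal{N}_1(b) \cap \mathcal{N}_2(c) = \{ (i\theta_1(c) + i\theta_2(b),\ \theta_1(c) - \theta_2(b)) \}$;
(d) the map $k \mapsto k + d$ maps $\mathcal{N}_\nu(b)$ to $\mathcal{N}_\nu(b - d)$;
(e) the map $k \mapsto k + d$ maps $\mathcal{N}_1(b) \cap \mathcal{N}_2(c)$ to $\mathcal{N}_1(b - d) \cap \mathcal{N}_2(c - d)$.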

Let us briefly describe what the free Fermi curve looks like. In Fig. 1, there is a sketch of the set of $(k_1, k_2) \in \widehat{F}(0, 0)$ for which both $ik_1$ and $k_2$ are real, for the case where the lattice $\Gamma^\#$ has points over the coordinate axes, that is, it has points

Fig. 1. Sketch of $\widehat{F}(0, 0)$ and $F(0, 0)$ when both $ik_1$ and $k_2$ are real (axes: $ik_1$ and $k_2$).

of the form $(b_1, 0)$ and $(0, b_2)$. Observe that, in particular, Proposition 2 yields
\[
\begin{aligned}
\mathcal{N}_1(0) \cap \mathcal{N}_2(b) &= \{ (i\theta_1(b),\ \theta_1(b)) \}, \\
\mathcal{N}_1(b) \cap \mathcal{N}_2(0) &= \{ (i\theta_2(b),\ -\theta_2(b)) \},
\end{aligned}
\]
and that the map $k \mapsto k + b$ maps $\mathcal{N}_1(0) \cap \mathcal{N}_2(b)$ to $\mathcal{N}_1(-b) \cap \mathcal{N}_2(0)$.

Recall that points in $\widehat{F}(0, 0)$ that differ by elements of $\Gamma^\#$ correspond to the same point in $F(0, 0)$. Thus, in the sketch on the left, we should identify the lines $k_2 = -b_2/2$ and $k_2 = b_2/2$ for all $b \in \Gamma^\#$ with $b_2 \neq 0$, to get a pair of helices climbing up the outside of a cylinder, as illustrated by the figure on the right. The helices intersect each other twice on each cycle of the cylinder: once on the front half of the cylinder and once on the back half. Hence, viewed as a manifold (with singularities), the pair of helices is just two copies of $\mathbb{R}$ with the points that correspond to intersections identified. We can use $k_2$ as a coordinate in each copy of $\mathbb{R}$ and then the pairs of identified points are $k_2 = -b_2/2$ and $k_2 = b_2/2$ for all $b \in \Gamma^\#$ with $b_2 \neq 0$. So far we have only considered $k_2$ real. The full $F(0, 0)$ is just two copies of $\mathbb{C}$ with $k_2$ as a coordinate in each copy, provided we identify the points $\theta_1(b) = \frac{1}{2}(-b_2 + ib_1)$ (in the first copy) and $\theta_2(b) = \frac{1}{2}(b_2 + ib_1)$ (in the second copy) for all $b \in \Gamma^\#$ with $b_2 \neq 0$.

3. The $\varepsilon$-Tubes about the Free Fermi Curve

We now introduce real and imaginary coordinates in $\mathbb{C}^2$ and define $\varepsilon$-tubes about the free Fermi curve. We derive some properties of the $\varepsilon$-tubes as well. For $k \in \mathbb{C}^2$ write
\[ k_1 = u_1 + iv_1 \quad\text{and}\quad k_2 = u_2 + iv_2, \]
where $u_1$, $u_2$, $v_1$ and $v_2$ are real numbers. Then,
\[
\begin{aligned}
N_{b,\nu}(k) &= (k_1 + b_1) + i(-1)^\nu (k_2 + b_2) \\
&= i\bigl(v_1 + (-1)^\nu (u_2 + b_2)\bigr) - (-1)^\nu \bigl(v_2 - (-1)^\nu (u_1 + b_1)\bigr),
\end{aligned}
\]
so that
\[ |N_{b,\nu}(k)| = |v + (-1)^\nu (u + b)^\perp|, \]
where $(y_1, y_2)^\perp := (y_2, -y_1)$. Since $N_b(k) = N_{b,1}(k)N_{b,2}(k)$, we have $N_b(k) = 0$ if and only if
\[ v - (u + b)^\perp = 0 \quad\text{or}\quad v + (u + b)^\perp = 0. \]
Let $2\Lambda$ be the length of the shortest nonzero vector in $\Gamma^\#$. Then there is at most one $b \in \Gamma^\#$ with $|v + (u + b)^\perp| < \Lambda$ and at most one $b \in \Gamma^\#$ with $|v - (u + b)^\perp| < \Lambda$ (see [13] for the proof).
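
The passage from the complex number $N_{b,\nu}(k)$ to the length of a real vector in $\mathbb{R}^2$ can be checked numerically. The sketch below assumes the convention $(y_1, y_2)^\perp := (y_2, -y_1)$, which is the reading consistent with the expansion of $N_{b,\nu}(k)$ into real and imaginary parts; the sample vectors are hypothetical.

```python
import math

def perp(y):
    """(y1, y2)^perp := (y2, -y1), the convention assumed in this sketch."""
    return (y[1], -y[0])

def N(nu, b, k):
    """N_{b,nu}(k) = (k1 + b1) + i(-1)^nu (k2 + b2)."""
    return (k[0] + b[0]) + 1j * (-1) ** nu * (k[1] + b[1])

def tube_distance(nu, b, u, v):
    """|v + (-1)^nu (u + b)^perp| for real vectors u, v, b in R^2."""
    p = perp((u[0] + b[0], u[1] + b[1]))
    s = (-1) ** nu
    return math.hypot(v[0] + s * p[0], v[1] + s * p[1])

u, v, b = (0.3, -1.2), (2.5, 0.4), (1.0, 2.0)   # hypothetical sample values
k = (complex(u[0], v[0]), complex(u[1], v[1]))  # k = u + iv

for nu in (1, 2):
    print(nu, abs(abs(N(nu, b, k)) - tube_distance(nu, b, u, v)) < 1e-9)

# The two vectors v + (u+b)^perp and v - (u+b)^perp add up to 2v,
# so at least one of them has length >= |v|.
print(max(tube_distance(1, b, u, v), tube_distance(2, b, u, v)) >= math.hypot(*v))
```

The last line is the elementary observation that drives the estimates on $|N_b(k)|$ inside and outside the tubes.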

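Let $\varepsilon$ be a constant satisfying $0 < \varepsilon < \Lambda/6$. For $\nu \in \{1, 2\}$ and $b \in \Gamma^\#$, define the $\varepsilon$-tube about $\mathcal{N}_\nu(b)$ as
\[ T_\nu(b) := \{ k \in \mathbb{C}^2 \mid |N_{b,\nu}(k)| = |v + (-1)^\nu (u + b)^\perp| < \varepsilon \}, \]
and the $\varepsilon$-tube about $\mathcal{N}_b = \mathcal{N}_1(b) \cup \mathcal{N}_2(b)$ as
\[ T_b := T_1(b) \cup T_2(b). \]
Since $(v + (u + b)^\perp) + (v - (u + b)^\perp) = 2v$, at least one of the factors $|v + (u + b)^\perp|$ or $|v - (u + b)^\perp|$ in $|N_b(k)|$ must always be greater than or equal to $|v|$. If $k \notin T_b$ both factors are also greater than or equal to $\varepsilon$. If $k \in T_b$ one factor is bounded by $\varepsilon$ and the other must lie within $\varepsilon$ of $|2v|$. Thus,
\[ k \notin T_b \;\Rightarrow\; |N_b(k)| \geq \varepsilon|v|, \tag{1} \]
\[ k \in T_b \;\Rightarrow\; |N_b(k)| \leq \varepsilon(2|v| + \varepsilon). \tag{2} \]
Finally, denote by $\overline{T}_b$ the closure of $T_b$. The intersection $\overline{T}_b \cap \overline{T}_{b'}$ is compact whenever $b \neq b'$, and $\overline{T}_b \cap \overline{T}_{b'} \cap \overline{T}_{b''}$ is empty for all distinct elements $b, b', b'' \in \Gamma^\#$ (see [13] for details).
If a point $k$ belongs to the free Fermi curve the function $N_b(k)$ vanishes for some $b \in \Gamma^\#$. We now give a lower bound for this function when $(b, k)$ is not in the zero set.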
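Proposition 3 (Lower Bound for $|N_b(k)|$).

(a) If $|b + u + v^\perp| \geq \Lambda$ and $|b + u - v^\perp| \geq \Lambda$, then $|N_b(k)| \geq \frac{\Lambda}{2}(|v| + |u + b|)$.
(b) If $|v| > 2\Lambda$ and $k \in T_0$, then $|N_b(k)| \geq \frac{\Lambda}{2}(|v| + |u + b|)$ for all $b \neq 0$ but at most one $\tilde{b} \neq 0$. This exceptional $\tilde{b}$ obeys $|\tilde{b}| > |v|$ and $\bigl|\, |u + \tilde{b}| - |v| \,\bigr| < \Lambda$.
(c) If $|v| > 2\Lambda$ and $k \in T_0 \cap T_d$ with $d \neq 0$, then $|N_b(k)| \geq \frac{\Lambda}{2}(|v| + |u + b|)$ for all $b \notin \{0, d\}$. Furthermore we have $|d| > |v|$ and $\bigl|\, |u + d| - |v| \,\bigr| < \Lambda$.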

Proof. (a) By hypothesis, both factors in $|N_b(k)| = |v + (u + b)^\perp|\,|v - (u + b)^\perp|$ are greater than or equal to $\Lambda$. We now prove that at least one of the factors must also be greater than or equal to $\frac{1}{2}(|v| + |u + b|)$. Suppose that $|v| \geq |u + b|$. Then, since $(v + (u + b)^\perp) + (v - (u + b)^\perp) = 2v$, at least one of the factors must also be greater than or equal to $|v| = \frac{1}{2}(|v| + |v|) \geq \frac{1}{2}(|v| + |u + b|)$. Now suppose that $|v| < |u + b|$. Then similarly we prove that $|u + b| > \frac{1}{2}(|v| + |u + b|)$. All this together implies that $|N_b(k)| \geq \frac{\Lambda}{2}(|v| + |u + b|)$, which proves part (a).
(b) By hypothesis $\varepsilon < \Lambda/6 < |v|$. Let $k \in T_0$. Then, by (2),
\[ |N_0(k)| \leq \varepsilon(2|v| + \varepsilon) < 3\varepsilon|v| < \frac{\Lambda}{2}|v|. \tag{3} \]
Thus we have either $|u + v^\perp| < \Lambda$ or $|u - v^\perp| < \Lambda$ (otherwise we apply part (a) to get a contradiction). Suppose that $|u + v^\perp| < \Lambda$. Then there is no $b \in \Gamma^\# \setminus \{0\}$ with $|b + u + v^\perp| < \Lambda$ and there is at most one $\tilde{b} \in \Gamma^\# \setminus \{0\}$ satisfying

$|\tilde{b} + u - v^\perp| < \Lambda$. This inequality implies $\bigl|\, |u + \tilde{b}| - |v| \,\bigr| < \Lambda$. Furthermore, for this $\tilde{b}$,
\[ |\tilde{b}| = |2v^\perp - (u + v^\perp) + (\tilde{b} + u - v^\perp)| > 2|v| - 2\Lambda > |v|, \]
since $-2\Lambda > -|v|$. Now suppose that $|u - v^\perp| < \Lambda$. Then similarly we prove that $|\tilde{b}| > |v|$. Finally observe that, if $b \notin \{0, \tilde{b}\}$ then $|b + u + v^\perp| \geq \Lambda$ and $|b + u - v^\perp| \geq \Lambda$. Hence, by applying part (a) it follows that $|N_b(k)| \geq \frac{\Lambda}{2}(|v| + |u + b|)$. This proves part (b).
(c) As in the proof of part (b), if $k \in T_0 \cap T_d$ then in addition to (3), we have $|N_d(k)| < \frac{\Lambda}{2}|v|$. Thus, applying part (b) we conclude that $d$ must be the exceptional $\tilde{b}$ of part (b). The statement of part (c) follows then from part (b). This completes the proof.
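
The elementary inequality behind part (a) can be tested by random sampling. The sketch below assumes the convention $(y_1, y_2)^\perp := (y_2, -y_1)$ and treats the threshold as a free positive parameter `lam`, since the argument only uses that both factors are at least `lam` and that one of them is at least $\frac{1}{2}(|v| + |u + b|)$. The square dual lattice and the sampling ranges are hypothetical.

```python
import math
import random

def factors(u, v, b):
    """Return (|v + (u+b)^perp|, |v - (u+b)^perp|) with (y1, y2)^perp = (y2, -y1)."""
    w = (u[0] + b[0], u[1] + b[1])
    p = (w[1], -w[0])
    plus = math.hypot(v[0] + p[0], v[1] + p[1])
    minus = math.hypot(v[0] - p[0], v[1] - p[1])
    return plus, minus

def check_part_a(lam=0.5, trials=20000, seed=0):
    """Sample (u, v, b); whenever both factors are >= lam, test the lower bound."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        v = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        b = (rng.randint(-4, 4), rng.randint(-4, 4))  # hypothetical square lattice
        plus, minus = factors(u, v, b)
        if plus < lam or minus < lam:
            continue  # hypothesis of part (a) fails; nothing to check
        size = math.hypot(*v) + math.hypot(u[0] + b[0], u[1] + b[1])
        if plus * minus < 0.5 * lam * size - 1e-12:
            return False
    return True

print(check_part_a())
```

Because $|N_b(k)|$ is the product of the two factors, a failure of this check would contradict part (a); none is expected, since the larger factor always dominates both $|v|$ and $|u + b|$.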

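4. Main Results
The Riemann surfaces introduced in [1] can be decomposed into
\[ X^{\mathrm{com}} \cup X^{\mathrm{reg}} \cup X^{\mathrm{han}}, \]
where $X^{\mathrm{com}}$ is a compact submanifold with smooth boundary and finite genus, $X^{\mathrm{reg}}$ is a finite union of open regular pieces, and $X^{\mathrm{han}}$ is an infinite union of closed handles. All these components satisfy a number of geometric/analytic hypotheses stated in [1] that specify the asymptotic holomorphic structure of the surface. Below we state two asymptotic theorems that essentially characterize the $X^{\mathrm{reg}}$ and $X^{\mathrm{han}}$ components of Fermi curves with small magnetic potential. Before we move to the theorems let us introduce some definitions.
For any $\varphi \in L^2(\mathbb{R}^2/\Gamma)$ define $\hat{\varphi} : \Gamma^\# \to \mathbb{C}$ as
\[ \hat{\varphi}(b) := (\mathcal{F}\varphi)(b) := \frac{1}{|\Gamma|} \int_{\mathbb{R}^2/\Gamma} \varphi(x)\,e^{-ib\cdot x}\,dx, \]
where $|\Gamma| := \int_{\mathbb{R}^2/\Gamma} dx$. Then,
\[ \varphi(x) = (\mathcal{F}^{-1}\hat{\varphi})(x) = \sum_{b \in \Gamma^\#} \hat{\varphi}(b)\,e^{ib\cdot x}, \]
\[ \|\varphi\|_{L^2(\mathbb{R}^2/\Gamma)} = |\Gamma|^{1/2}\,\|\hat{\varphi}\|_{l^2(\Gamma^\#)}. \]
Recall that $k = u + iv$ with $u, v \in \mathbb{R}^2$, let $\rho$ be a positive constant, and set
\[ K_\rho := \{ k \in \mathbb{C}^2 \mid |v| \leq \rho \}. \]
Finally, consider the projection
\[ \mathrm{pr} : \mathbb{C}^2 \to \mathbb{C}, \qquad (k_1, k_2) \mapsto k_2, \]
and define
\[ q := (i\nabla \cdot A) + A^2 + V. \]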

It is easy to construct a holomorphic map $E : \widehat{F}(A, V) \to F(A, V)$ [13]. The precise form of this map is irrelevant here. For our purposes it is enough to think of it simply as a projection (or exponential map).
We are ready to state our results. Clearly, the set $K_\rho$ is invariant under the action of $\Gamma^\#$ and $K_\rho/\Gamma^\#$ is compact. Hence, the image of $\widehat{F}(A, V) \cap K_\rho$ under the holomorphic map $E$ is compact in $F(A, V)$. This image set will essentially play the role of $X^{\mathrm{com}}$ in the decomposition of $F(A, V)$. Our first theorem characterizes the regular piece $X^{\mathrm{reg}}$ of $F(A, V)$.

Theorem 1 (The Regular Piece). Let 0 < < /6 and suppose that A1 , A2 and
V are functions in L2 (R2 /) with b2 q(b) l1 (# ) < and (1+b2 )A(b)
l1 (# \{0}) <
2/63. Then there is a constant = ,,q,A such that, for {1, 2}, the projection
pr induces a biholomorphic map between


(F (A, V ) T (0)) K Tb
b# \{0}

and its image in C. This image component contains


{z C | 8|z| > and |z + (1) (b)| > for all b # \{0}}
and is contained in


1 2
z C |z + (1) (b)| > for all b # \{0} ,
2 40
where (b) = 12 ((1) b2 + ib1 ). Furthermore,
pr1 : Image(pr) T (0),
(1,0)
y  (2 i(1) y r(y), y),
(1,0)
where 2 is a constant given by (24) that depends only on and A,
(1,0) 2 3 C
|2 |< and |r(y)| + ,
100 502
where C = C,,q,A is a constant.
Now observe that, since Tb + c = Tb+c for all b, c # , the complement of
E(F (A, V ) K ) in F (A, V ) is the disjoint union of


A
A 
E
(F (A, V ) T0 ) A K
Tb

A b #

A b2 =0

and

E(F (A, V ) T0 Tb ).
b#
b2 =0

Basically, the first of the two sets will be the regular piece of $F(A, V)$, while the second set will be the handles. The map parametrizing the regular part will be the composition of the map $E$ with the inverse of the map discussed in the above theorem. The detailed information about the handles $X^{\mathrm{han}}$ in $F(A, V)$ comes from our second main theorem.
Theorem 2 (The Handles). Let 0 < < /6 and suppose that A1 , A2 and V
are functions in L2 (R2 /) with b2 q(b) l1 (# ) < and (1 + b2 )A(b)
l1 (# \{0}) <
2/63. Then, for every suciently large constant and for every d # \{0} with
2|d| > , there are maps



d,1 : (z1 , z2 ) C2 |z1 | and |z2 | T1 (0) T2 (d),
2 2


2

d,2 : (z1 , z2 ) C |z1 | and |z2 | T1 (d) T2 (0),
2
and a complex number td with |td | C
|d|4 such that:

(i) For {1, 2} the domain of the map d, is biholomorphic to its image, and
the image contains



2
k C |k1 + i(1) k2 | and
8


|k1 + (1) +1
d1 i(1) (k2 + (1)
+1
d2 )| .
8
Furthermore,

1 1 1 1
Dd, = I +O
2 i(1) i(1) |d|2
and

1
d, (0) = (i (d), (1)+1 (d)) + O +O .
900
(ii)

1
d,1 (T1 (0) T2 (d) F (A, V ))




= (z1 , z2 ) C2 z1 z2 = td , |z1 | and |z2 | ,
2 2
1
d,2 (T1 (d) T2 (0) F (A, V ))




= (z1 , z2 ) C2 z1 z2 = td , |z1 | and |z2 | .
2 2
(iii)

d,1 (z1 , z2 ) = d,2 (z2 , z1 ) d.



These are the main results in this paper. In the next section, we outline the
strategy for proving them. The proofs are presented in the subsequent sections
divided in many steps.

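5. Strategy Outline
Below we briefly describe the general strategy of analysis used to prove our results. We first introduce some notation and definitions. Observe that
\[
\begin{aligned}
H_k(A, V)\varphi &= ((i\nabla + A - k)^2 + V)\varphi \\
&= ((i\nabla - k)^2 + 2A\cdot(i\nabla - k) + (i\nabla\cdot A) + A^2 + V)\varphi,
\end{aligned}
\]
and write
\[ H_k(A, V) = \Delta_k + h(k, A) + q(A, V) \]
with
\[ \Delta_k := (i\nabla - k)^2, \qquad h(k, A) := 2A\cdot(i\nabla - k) \qquad\text{and}\qquad q(A, V) := (i\nabla\cdot A) + A^2 + V. \]
For each finite subset $G$ of $\Gamma^\#$ set
\[ G' := \Gamma^\# \setminus G \qquad\text{and}\qquad \mathbb{C}^2_G := \mathbb{C}^2 \setminus \bigcup_{b \in G'} \mathcal{N}_b, \]
\[ L^2_G := \operatorname{span}\{ e^{ib\cdot x} \mid b \in G \} \qquad\text{and}\qquad L^2_{G'} := \operatorname{span}\{ e^{ib\cdot x} \mid b \in G' \}. \]
To simplify the notation write $L^2$ in place of $L^2(\mathbb{R}^2/\Gamma)$. Let $I$ be the identity operator on $L^2$, and let $\pi_G$ and $\pi_{G'}$ be the orthogonal projections from $L^2$ onto $L^2_G$ and $L^2_{G'}$, respectively. Then,
\[ L^2 = L^2_G \oplus L^2_{G'} \qquad\text{and}\qquad I = \pi_G + \pi_{G'}. \]
For $k \in \mathbb{C}^2_G$ define the partial inverse $(\Delta_k)^{-1}_G$ on $L^2$ as
\[ (\Delta_k)^{-1}_G := \pi_G + \Delta_k^{-1}\pi_{G'}. \]
Its matrix elements are
\[
((\Delta_k)^{-1}_G)_{b,c} := \left\langle \frac{e^{ib\cdot x}}{|\Gamma|^{1/2}},\ (\Delta_k)^{-1}_G \frac{e^{ic\cdot x}}{|\Gamma|^{1/2}} \right\rangle_{L^2}
= \begin{cases} \delta_{b,c} & \text{if } c \in G, \\[4pt] \dfrac{1}{N_c(k)}\,\delta_{b,c} & \text{if } c \notin G, \end{cases}
\]
where $b, c \in \Gamma^\#$.
Here is the main idea. By definition, a point $k$ is in $\widehat{F}(A, V)$ if $H_k(A, V)$ has a nontrivial kernel in $L^2$. Hence, to study the part of the curve in the intersection of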
 
d G Td with C \ bG Tb for some nite subset G of , it is natural to look
2 #

for a nontrivial solution of
\[ (\Delta_k + h + q)(\psi_G + \psi_{G'}) = 0, \]

where G L2G and G L2G . Equivalently, if we make the following (invertible)


change of variables in L2 ,

(G + G ) = (k )1
G (G + G ),

where G L2G and G L2G , we may consider the equation

(k + h + q)G + (I + (h + q)1
k )G = 0. (4)

The projections of this equation onto L2G and L2G are, respectively,

G (h + q)G + G (I + (h + q)1
k )G = 0, (5)
G (k + h + q)G + G (h + q)1
k G = 0. (6)

Now dene RG G on L2 as

RG G := G (I + (h + q)1
k )G .

Observe that RG G is the zero operator on L2G . Then, if RG G has a bounded
inverse on L2G , Eq. (5) is equivalent to
1
G = RG  G G (h + q)G .

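Substituting this into (6) yields
$$\pi_G\big(\Delta_k + h + q - (h+q)\Delta_k^{-1}R_{G'G'}^{-1}\pi_{G'}(h+q)\big)\varphi_G = 0.$$
This equation has a nontrivial solution if and only if the (finite) $|G|\times|G|$ determinant
$$\det\big[\pi_G\big(\Delta_k + h + q - (h+q)\Delta_k^{-1}R_{G'G'}^{-1}\pi_{G'}(h+q)\big)\pi_G\big] = 0$$
or, equivalently, expressing all operators as matrices in the basis $\{|\Gamma|^{-1/2}e^{ib\cdot x} \mid b\in\Gamma^\#\}$,
$$\det\Big[N_{d'}(k)\,\delta_{d',d''} + w_{d',d''} - \sum_{b,c\in G'}\frac{w_{d',b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,w_{c,d''}\Big]_{d',d''\in G} = 0, \tag{7}$$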

where
$$w_{b,c} := h_{b,c} + \hat q(b-c) = -2(c+k)\cdot\hat A(b-c) + \hat q(b-c).$$
Therefore, if $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$ (which is in fact the case under suitable conditions in the region under consideration), we can study the Fermi curve in detail using the (local) defining Eq. (7).
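The passage from (4) to (7) is a Schur-complement reduction: the component in $L^2_{G'}$ is eliminated and a finite determinant on $L^2_G$ remains. The following sketch is only a finite-dimensional illustration and is not taken from the paper; the $3\times 3$ matrix `M`, its split into a $1\times 1$ block (playing the role of $G$) and a $2\times 2$ block (playing the role of $G'$), and all function names are assumptions made for the example. It checks the identity $\det M = \det D\cdot(a - B D^{-1} C)$ in exact rational arithmetic.

```python
from fractions import Fraction as Fr

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def schur_complement(m):
    """a - B D^{-1} C for the block split m = [[a, B], [C, D]] with a scalar."""
    a = m[0][0]
    row = m[0][1:]                    # B: coupling from the large block to the small one
    col = [m[1][0], m[2][0]]          # C: coupling from the small block to the large one
    dinv = inv2([r[1:] for r in m[1:]])
    dinv_col = [sum(dinv[i][j] * col[j] for j in range(2)) for i in range(2)]
    return a - sum(row[i] * dinv_col[i] for i in range(2))

# Hypothetical example matrix (not from the paper).
M = [[Fr(2), Fr(1), Fr(3)],
     [Fr(1), Fr(4), Fr(1)],
     [Fr(0), Fr(2), Fr(5)]]
D = [r[1:] for r in M[1:]]
lhs = det3(M)
rhs = det2(D) * schur_complement(M)
print(lhs, rhs)
```

In the setting of the text the large block is infinite-dimensional, so the determinant identity itself is not available; what carries over is the statement that, once the large block is invertible, the full operator has a nontrivial kernel exactly when the reduced finite matrix is singular.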

894 G. de Oliveira

6. Invertibility of $R_{G'G'}$
The following notation will be used whenever we consider vector-valued quantities. Let $X$ be a Banach space and let $A, B\in X^2$, where $A = (A_1, A_2)$ and $B = (B_1, B_2)$. Then,
$$\|A\|_X := (\|A_1\|_X^2 + \|A_2\|_X^2)^{1/2} \quad\text{and}\quad A\cdot B := A_1B_1 + A_2B_2.$$
Furthermore, we will denote by $\|\cdot\|$ the operator norm on $L^2(\mathbb{R}^2/\Gamma)$.
In general, for any $B, C\subset\Gamma^\#$ ($C$ such that $\Delta_k^{-1}\pi_C$ exists) define the operator $R_{BC}$ as
$$R_{BC} := \pi_B(I + (h+q)\Delta_k^{-1})\pi_C = \pi_B\pi_C + \pi_B\,q\,\Delta_k^{-1}\pi_C + \pi_B(2A\cdot i\nabla)\Delta_k^{-1}\pi_C - \pi_B(2k\cdot A)\Delta_k^{-1}\pi_C. \tag{8}$$

Its matrix elements are
$$(R_{BC})_{b,c} = \delta_{b,c} + \frac{\hat q(b-c)}{N_c(k)} - \frac{2c\cdot\hat A(b-c)}{N_c(k)} - \frac{2k\cdot\hat A(b-c)}{N_c(k)}, \tag{9}$$
where $b\in B$ and $c\in C$. We first estimate the norm of the last three terms on the right-hand side of (8). We begin with the following proposition.

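Proposition 4. Let $k\in\mathbb{C}^2$ and let $B, C\subset\Gamma^\#$ with $C\subset\{b\in\Gamma^\# \mid N_b(k)\neq 0\}$. Then,
$$\|\pi_B\,q\,\Delta_k^{-1}\pi_C\| \le \|\hat q\|_{l^1}\sup_{c\in C}\frac{1}{|N_c(k)|},$$
$$\|\pi_B(A\cdot i\nabla)\Delta_k^{-1}\pi_C\| \le \|\hat A\|_{l^1}\sup_{c\in C}\frac{|c|}{|N_c(k)|},$$
$$\|\pi_B(k\cdot A)\Delta_k^{-1}\pi_C\| \le \|\hat A\|_{l^1}\,|k|\sup_{c\in C}\frac{1}{|N_c(k)|}.$$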
To prove this proposition we apply the following well-known inequality (see [13]).

Proposition 5. Consider a linear operator $T: L^2_C\to L^2_B$ with matrix elements $T_{b,c}$. Then,
$$\|T\| \le \max\Big\{\sup_{c\in C}\sum_{b\in B}|T_{b,c}|,\ \sup_{b\in B}\sum_{c\in C}|T_{b,c}|\Big\}.$$

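Proof of Proposition 4. We only prove the first inequality. The proof of the other ones is similar. Write $T := \pi_B\,q\,\Delta_k^{-1}\pi_C$. Then, in view of (8) and (9),
$$\sup_{c\in C}\sum_{b\in B}|T_{b,c}| \le \sup_{c\in C}\sum_{b\in B}\frac{|\hat q(b-c)|}{|N_c(k)|} \le \|\hat q\|_{l^1}\sup_{c\in C}\frac{1}{|N_c(k)|},$$
$$\sup_{b\in B}\sum_{c\in C}|T_{b,c}| \le \sup_{b\in B}\sum_{c\in C}\frac{|\hat q(b-c)|}{|N_c(k)|} \le \|\hat q\|_{l^1}\sup_{c\in C}\frac{1}{|N_c(k)|}.$$
By Proposition 5, these estimates yield the desired inequality.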
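The inequality of Proposition 5 is a form of the Schur test. As a quick numerical sanity check (not part of the paper), the sketch below draws a hypothetical random complex $4\times 5$ matrix and compares the ratio $\|Tx\|/\|x\|$ for many random vectors with the bound given by the larger of the maximal column sum and the maximal row sum. All names and the random data are assumptions of the example, and sampling only gives a lower estimate of the true operator norm.

```python
import math
import random

def schur_bound(T):
    """Larger of the maximal column sum and maximal row sum of |T[b][c]|."""
    rows, cols = len(T), len(T[0])
    col_sup = max(sum(abs(T[b][c]) for b in range(rows)) for c in range(cols))
    row_sup = max(sum(abs(T[b][c]) for c in range(cols)) for b in range(rows))
    return max(col_sup, row_sup)

def apply(T, x):
    """Matrix-vector product T x."""
    return [sum(T[b][c] * x[c] for c in range(len(x))) for b in range(len(T))]

def norm2(x):
    """Euclidean (l^2) norm of a complex vector."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

rng = random.Random(0)

def rand_c():
    """Random complex number with real and imaginary parts in [-1, 1]."""
    return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

# Hypothetical test matrix: 4 rows (index b), 5 columns (index c).
T = [[rand_c() for _ in range(5)] for _ in range(4)]
bound = schur_bound(T)

# Sampled lower estimate of the operator norm.
worst = 0.0
for _ in range(1000):
    x = [rand_c() for _ in range(5)]
    worst = max(worst, norm2(apply(T, x)) / norm2(x))

print(worst <= bound)
```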



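The key estimate for the existence of $R_{G'G'}^{-1}$ is given below.

Proposition 6 (Estimate of $\|R_{SS} - \pi_S\|$). Let $k\in\mathbb{C}^2$ with $|u|\le 2|v|$ and $|v| > 2\varepsilon$. Suppose that $S\subset\{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$. Then,
$$\|R_{SS} - \pi_S\| \le \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \frac{14}{\varepsilon}\|\hat A\|_{l^1}. \tag{10}$$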
|v|

If $A = 0$, the right-hand side of (10) can be made arbitrarily small for any $V$ by taking $|v|$ sufficiently large (recall that $q(0,V) = V$). If $A\neq 0$, however, we need to take $\|\hat A\|_{l^1}$ small to make that quantity less than 1. The term $\frac{14}{\varepsilon}\|\hat A\|_{l^1}$ in (10) comes from the estimate we have for $\|\pi_{G'}\,h\,\Delta_k^{-1}\pi_{G'}\|$.
 

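Proof of Proposition 6. By hypothesis, for all $b\in S$,
$$\frac{1}{|N_b(k)|} \le \frac{1}{\varepsilon|v|}. \tag{11}$$
We now show that, for all $b\in S$,
$$\frac{|b|}{|N_b(k)|} \le \frac{4}{\varepsilon}. \tag{12}$$
First suppose that $|b|\le 4|v|$. Then,
$$\frac{|b|}{|N_b(k)|} \le \frac{4|v|}{\varepsilon|v|} = \frac{4}{\varepsilon}.$$
Now suppose that $|b|\ge 4|v|$. Again, by hypothesis we have $|u|\le 2|v|$ and $|v| > 2\varepsilon > \varepsilon$. Hence,
$$|v\pm(u+b)^\perp| \ge |b| - |u| - |v| \ge |b| - 3|v| \ge |b| - \frac{3}{4}|b| = \frac{|b|}{4}.$$
Consequently,
$$\frac{|b|}{|N_b(k)|} = \frac{|b|}{|v+(u+b)^\perp|\,|v-(u+b)^\perp|} \le |b|\,\frac{4}{|b|}\,\frac{4}{|b|} = \frac{16}{|b|} \le \frac{4}{|v|} \le \frac{4}{\varepsilon}.$$
This proves (12).

The expression for $R_{SS} - \pi_S$ is given by (8). Observe that $|k|\le|u|+|v|\le 3|v|$. Then, applying Proposition 4 and using (11) and (12) we obtain
$$\|R_{SS} - \pi_S\| \le (6|v|\,\|\hat A\|_{l^1} + \|\hat q\|_{l^1})\sup_{c\in S}\frac{1}{|N_c(k)|} + 2\|\hat A\|_{l^1}\sup_{c\in S}\frac{|c|}{|N_c(k)|}$$
$$\le (6|v|\,\|\hat A\|_{l^1} + \|\hat q\|_{l^1})\frac{1}{\varepsilon|v|} + \frac{8}{\varepsilon}\|\hat A\|_{l^1} = \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \frac{14}{\varepsilon}\|\hat A\|_{l^1}.$$
This is the desired inequality.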
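Inequality (12) can be probed numerically. The sketch below is not part of the paper: it assumes the square lattice $\mathbb{Z}^2$ in place of $\Gamma^\#$, the value $\varepsilon = 0.1$, and one sample point $k = u + iv$ satisfying $|u|\le 2|v|$ and $|v| > 2\varepsilon$. It scans a finite window of lattice points $b$ with $|N_b(k)|\ge\varepsilon|v|$ and records the largest value of $|b|/|N_b(k)|$.

```python
import math

def N(b, k):
    """N_b(k) = (k1 + b1)^2 + (k2 + b2)^2 for complex k and real b."""
    return (k[0] + b[0]) ** 2 + (k[1] + b[1]) ** 2

# Assumed sample data (hypothetical, chosen only for illustration).
eps = 0.1
u = (0.3, -0.2)
v = (3.0, 1.0)
k = (complex(u[0], v[0]), complex(u[1], v[1]))
abs_u = math.hypot(*u)
abs_v = math.hypot(*v)

worst = 0.0
for b1 in range(-60, 61):
    for b2 in range(-60, 61):
        nb = abs(N((b1, b2), k))
        if nb >= eps * abs_v:            # b belongs to the set S of Proposition 6
            worst = max(worst, math.hypot(b1, b2) / nb)

print(worst, 4 / eps)
```

For this sample the observed maximum is far below $4/\varepsilon$, which is consistent with (12) being a worst-case bound over all admissible $k$.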



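From the last proposition it follows easily that $R_{SS}$ has a bounded inverse for large $|v|$ and weak magnetic potential.

Lemma 1 (Invertibility of $R_{SS}$). Let $k\in\mathbb{C}^2$,
$$|u|\le 2|v|, \quad |v| > \max\Big\{2\varepsilon,\ \frac{2}{\varepsilon}\|\hat q\|_{l^1}\Big\}, \quad \|\hat q\|_{l^1} < \infty, \quad\text{and}\quad \|\hat A\|_{l^1} < \frac{2}{63}\varepsilon.$$
Suppose that $S\subset\{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$. Then the operator $R_{SS}$ has a bounded inverse with
$$\|R_{SS} - \pi_S\| < \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \|\hat A\|_{l^1}\frac{14}{\varepsilon} < \frac{17}{18},$$
$$\|R_{SS}^{-1} - \pi_S\| < 18\,\|R_{SS} - \pi_S\|.$$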

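Proof. Write $R_{SS} = \pi_S + T$ with $T = R_{SS} - \pi_S$. Then, by Proposition 6,
$$\|T\| = \|R_{SS} - \pi_S\| \le \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \|\hat A\|_{l^1}\frac{14}{\varepsilon} < \frac{1}{2} + \frac{4}{9} = \frac{17}{18} < 1.$$
Hence, the Neumann series for $R_{SS}^{-1} = (\pi_S + T)^{-1}$ converges (and is a bounded operator). Furthermore,
$$\|R_{SS}^{-1} - \pi_S\| = \|(\pi_S+T)^{-1} - \pi_S\| = \|(\pi_S+T)^{-1} - (\pi_S+T)^{-1}(\pi_S+T)\|$$
$$= \|(\pi_S+T)^{-1}T\| \le (1-\|T\|)^{-1}\|T\| < 18\,\|R_{SS} - \pi_S\|,$$
as was to be shown.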
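The Neumann-series argument in this proof works in any submultiplicative norm. The sketch below (not from the paper) illustrates it on a hypothetical $3\times 3$ real matrix `T`, using the maximal row sum norm instead of the $L^2$ operator norm; it also checks the arithmetic behind the constants $17/18$ and $18$ with exact fractions. All names and the sample matrix are assumptions of the example.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def norm_inf(A):
    """Maximal row sum norm (a submultiplicative matrix norm)."""
    return max(sum(abs(x) for x in row) for row in A)

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_inverse(T, terms=200):
    """Partial sum of sum_{n>=0} (-T)^n, which approximates (I + T)^{-1}."""
    n = len(T)
    minus_T = [[-x for x in row] for row in T]
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, minus_T)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical perturbation with norm_inf(T) = 0.5 < 1.
T = [[0.2, -0.1, 0.05], [0.0, 0.3, -0.2], [0.1, 0.1, 0.25]]
n = len(T)
I = identity(n)
t = norm_inf(T)
inv = neumann_inverse(T)

# || (I + T)^{-1} - I || should not exceed t / (1 - t).
diff = [[inv[i][j] - I[i][j] for j in range(n)] for i in range(n)]
bound = t / (1 - t)

# Residual of (I + T) * inv - I, to confirm that inv is an approximate inverse.
I_plus_T = [[I[i][j] + T[i][j] for j in range(n)] for i in range(n)]
prod = matmul(I_plus_T, inv)
residual = norm_inf([[prod[i][j] - I[i][j] for j in range(n)] for i in range(n)])

# Exact constants of Lemma 1: 1/2 + (14/eps) * (2 eps / 63) = 17/18, 1/(1 - 17/18) = 18.
t_max = Fraction(1, 2) + Fraction(14) * Fraction(2, 63)
factor = 1 / (1 - t_max)
print(norm_inf(diff), bound, residual, t_max, factor)
```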

Lemma 1 says that if $G'$ is such that $G'\subset\{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$ the operator $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$ for $|u|\le 2|v|$, large $|v|$, and weak magnetic potential. We are now able to write local defining equations for $\widehat{F}(A,V)$ under such conditions.

7. Local Defining Equations

In this section we derive local defining equations for the Fermi curve. We begin with a simple proposition.

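Proposition 7. Suppose either (i) or (ii) or (iii) where:

(i) $G = \{0\}$ and $k\in T_0\setminus\bigcup_{b\in\Gamma^\#\setminus\{0\}}T_b$;
(ii) $G = \{0,d\}$ and $k\in T_0\cap T_d$;
(iii) $G = \emptyset$ and $k\in\mathbb{C}^2\setminus\bigcup_{b\in\Gamma^\#}T_b$.

Then $G' = \Gamma^\#\setminus G = \{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$.

Proof. The proposition follows easily if we observe that $G' = \Gamma^\#\setminus G$ and recall from (1) that
$$k\notin T_b \implies |N_b(k)|\ge\varepsilon|v|.$$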

We now introduce some notation. Let $B$ be a fundamental cell for $\Gamma^\#\subset\mathbb{R}^2$ (see [9, p. 310]). Then any vector $u\in\mathbb{R}^2$ can be written as $u = \xi + \tilde u$ for some $\xi\in\Gamma^\#$ and $\tilde u\in B$. Define


2
:= sup{|u| | u B}, R := max , 2, q l1 , KR := {k C2 | |v| R}.

We first show that in $\mathbb{C}^2\setminus K_R$ the Fermi curve is contained in the union of $\varepsilon$-tubes about the free Fermi curve.

Proposition 8 ($\widehat{F}(A,V)\setminus K_R$ is Contained in the Union of $\varepsilon$-Tubes).
$$\widehat{F}(A,V)\setminus K_R \subset \bigcup_{b\in\Gamma^\#}T_b.$$

Proof. Without loss of generality, we may consider $k\in\mathbb{C}^2$ with real part in $B$. We now prove that any point outside the region $K_R$ and outside the union of $\varepsilon$-tubes does not belong to $\widehat{F}(A,V)$. Suppose that $k\in\mathbb{C}^2\setminus(K_R\cup\bigcup_{b\in\Gamma^\#}T_b)$ and recall that $k$ is in $\widehat{F}(A,V)$ if and only if (4) has a nontrivial solution. If we choose $G = \emptyset$ then $G' = \Gamma^\#$ and this equation reads
$$R_{G'G'}\,\varphi_{G'} = 0.$$
By Proposition 7(iii), we have $G' = \Gamma^\# = \{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$. Furthermore, since $u\in B$ and $|v| > R$, it follows that $|u| < |v| < 2|v|$. Consequently, the operator $R_{G'G'}$ has a bounded inverse by Lemma 1. Thus, the only solution of the above equation is $\varphi_{G'} = 0$. That is, there is no nontrivial solution of this equation and therefore $k\notin\widehat{F}(A,V)$.

We are left to study the Fermi curve inside the $\varepsilon$-tubes. There are two types of regions to consider: intersections and non-intersections of tubes. To study non-intersections we choose $G = \{0\}$ and consider the region $(T_0\setminus\bigcup_{b\in\Gamma^\#\setminus\{0\}}T_b)\setminus K_R$. For intersections we take $G = \{0,d\}$ for some $d\in\Gamma^\#\setminus\{0\}$ and consider $(T_0\cap T_d)\setminus K_R$. Observe that, since the tubes $T_b$ have the translational property $T_b + c = T_{b+c}$ for all $b, c\in\Gamma^\#$, and the curve $\widehat{F}(A,V)$ is invariant under the action of $\Gamma^\#$, there is no loss of generality in considering only the two regions above. Any other part of the curve can be reached by translation.
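Recall that $G' = \Gamma^\#\setminus G$ and for $d', d''\in G$ and $i, j\in\{1,2\}$ set
$$B_{ij}^{d'd''}(k;G) := -4\sum_{b,c\in G'}\frac{\hat A_i(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat A_j(c-d''),$$
$$C_i^{d'd''}(k;G) := -2\hat A_i(d'-d'') + 2\sum_{b,c\in G'}\frac{\hat q(d'-b) - 2b\cdot\hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat A_i(c-d'')$$
$$\qquad + 2\sum_{b,c\in G'}\frac{\hat A_i(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c-d'') - 2d''\cdot\hat A(c-d'')\big),$$
$$C_0^{d'd''}(k;G) := \hat q(d'-d'') - 2d''\cdot\hat A(d'-d'') - \sum_{b,c\in G'}\frac{\hat q(d'-b) - 2b\cdot\hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c-d'') - 2d''\cdot\hat A(c-d'')\big). \tag{13}$$
Then,
$$D_{d',d''}(k;G) := w_{d',d''} - \sum_{b,c\in G'}\frac{w_{d',b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,w_{c,d''}$$
$$= B_{11}^{d'd''}k_1^2 + B_{22}^{d'd''}k_2^2 + (B_{12}^{d'd''} + B_{21}^{d'd''})k_1k_2 + C_1^{d'd''}k_1 + C_2^{d'd''}k_2 + C_0^{d'd''}.$$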

These functions have the following property.

Proposition 9. For $d', d''\in G$ and $i, j\in\{1,2\}$, the functions $B_{ij}^{d'd''}$, $C_i^{d'd''}$, $C_0^{d'd''}$ (and consequently $D_{d',d''}$) are analytic on $(T_0\setminus\bigcup_{b\in\Gamma^\#\setminus\{0\}}T_b)\setminus K_R$ and $(T_0\cap T_d)\setminus K_R$ for $G = \{0\}$ and $G = \{0,d\}$, respectively.

Sketch of the proof. It suffices to show that $B_{ij}^{d'd''}$, $C_i^{d'd''}$ and $C_0^{d'd''}$ are analytic functions. This property follows from the fact that all the series involved in the definition of these functions are uniformly convergent sums of analytic functions. The argument is similar for all cases. See [13] for details.

Using the above functions we can write (local) defining equations for the Fermi curve.

Lemma 2 (Local Defining Equations for $\widehat{F}(A,V)$).

(i) Let $G = \{0\}$ and $k\in(T_0\setminus\bigcup_{b\in\Gamma^\#\setminus\{0\}}T_b)\setminus K_R$. Then $k\in\widehat{F}(A,V)$ if and only if
$$N_0(k) + D_{0,0}(k) = 0.$$
(ii) Let $G = \{0,d\}$ and $k\in(T_0\cap T_d)\setminus K_R$. Then $k\in\widehat{F}(A,V)$ if and only if
$$(N_0(k) + D_{0,0}(k))(N_d(k) + D_{d,d}(k)) - D_{0,d}(k)D_{d,0}(k) = 0.$$

Proof. We only prove part (i). The proof of part (ii) is similar. First, by Proposition 7(i) we have $G' = \Gamma^\#\setminus\{0\} = \{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$. Furthermore, since $k\in T_0$, we have either $|v - u^\perp| < \varepsilon$ or $|v + u^\perp| < \varepsilon$. In either case this implies $|u| < \varepsilon + |v| < 2\varepsilon + |v| < 2|v|$. Hence, the operator $R_{G'G'}$ has a bounded inverse by Lemma 1. Thus, in the region under consideration $\widehat{F}(A,V)$ is given by (7):
$$0 = N_0(k) + w_{0,0} - \sum_{b,c\in G'}\frac{w_{0,b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,w_{c,0} = N_0(k) + D_{0,0}(k).$$
This is the desired expression.

To study in detail the defining equations above we shall estimate the asymptotic behavior of the functions $B_{ij}^{d'd''}$, $C_i^{d'd''}$, $C_0^{d'd''}$ and $D_{d',d''}$ for large $|v|$. (We sometimes refer to these functions as coefficients.) Since all these functions have a similar form it is convenient to prove these estimates in a general setting and specialize them later. This is the content of Secs. 9 and 10. We next introduce a change of variables in $\mathbb{C}^2$ that will be useful for proving these bounds.

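8. Change of Coordinates
Define the (complementary) index $\nu'$ as $\nu' := \nu - (-1)^\nu$. Observe that $\nu' = 2$ if $\nu = 1$, $\nu' = 1$ if $\nu = 2$, and $(-1)^{\nu'} = -(-1)^\nu$. The following change of coordinates in $\mathbb{C}^2$ will be useful for our analysis. For $\nu\in\{1,2\}$ and $d', d''\in G$ define the functions $w_{\nu,d'}, z_{\nu,d'}: \mathbb{C}^2\to\mathbb{C}$ as
$$w_{\nu,d'}(k) := k_1 + d'_1 + i(-1)^\nu(k_2 + d'_2),$$
$$z_{\nu,d'}(k) := k_1 + d'_1 - i(-1)^\nu(k_2 + d'_2). \tag{14}$$
Observe that the transformation $(k_1,k_2)\mapsto(w_{\nu,d'}, z_{\nu,d'})$ is just a translation composed with a rotation. Furthermore, if $k\in T_\nu(d')\setminus K_R$ then $|w_{\nu,d'}(k)|$ is small and $|z_{\nu,d'}(k)|$ is large. Indeed, $|w_{\nu,d'}(k)| = |N_{d',\nu}(k)| < \varepsilon$ and $|z_{\nu,d'}(k)| = |N_{d',\nu'}(k)|\ge|v| > R$. Define also
$$J_\nu^{d'd''} := \frac{1}{4}\big(B_{11}^{d'd''} - B_{22}^{d'd''} + i(-1)^\nu(B_{12}^{d'd''} + B_{21}^{d'd''})\big),$$
$$K^{d'd''} := \frac{1}{2}\big(B_{11}^{d'd''} + B_{22}^{d'd''}\big),$$
$$L_\nu^{d'd''} := -d'_1B_{11}^{d'd''} - i(-1)^\nu d'_2B_{22}^{d'd''} - \frac{1}{2}(d'_2 + i(-1)^\nu d'_1)(B_{12}^{d'd''} + B_{21}^{d'd''}) + \frac{1}{2}\big(C_1^{d'd''} + i(-1)^\nu C_2^{d'd''}\big),$$
$$M^{d'd''} := d_1'^2B_{11}^{d'd''} + d_2'^2B_{22}^{d'd''} + d'_1d'_2(B_{12}^{d'd''} + B_{21}^{d'd''}) - d'_1C_1^{d'd''} - d'_2C_2^{d'd''} + C_0^{d'd''},$$
where $J_\nu^{d'd''}$, $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ are functions of $k\in\mathbb{C}^2$ that also depend on the choice of $G\subset\Gamma^\#$. Using these functions we can express $N_{d'}(k) + D_{d',d'}(k)$ and $D_{d',d''}(k)$ as follows.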

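Proposition 10. Let $\nu\in\{1,2\}$ and let $d', d''\in G$. Then,
$$N_{d'} + D_{d',d'} = J_{\nu'}^{d'd'}w_{\nu,d'}^2 + J_\nu^{d'd'}z_{\nu,d'}^2 + (1 + K^{d'd'})w_{\nu,d'}z_{\nu,d'} + L_{\nu'}^{d'd'}w_{\nu,d'} + L_\nu^{d'd'}z_{\nu,d'} + M^{d'd'},$$
$$D_{d',d''} = J_{\nu'}^{d'd''}w_{\nu,d'}^2 + J_\nu^{d'd''}z_{\nu,d'}^2 + K^{d'd''}w_{\nu,d'}z_{\nu,d'} + L_{\nu'}^{d'd''}w_{\nu,d'} + L_\nu^{d'd''}z_{\nu,d'} + M^{d'd''}.$$
Furthermore,
$$J_\nu^{d'd''}(k) = -\sum_{b,c\in G'}\frac{(1, i(-1)^\nu)\cdot\hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, i(-1)^\nu)\cdot\hat A(c-d''),$$
$$K^{d'd''}(k) = -2\sum_{b,c\in G'}\frac{\hat A(d'-b)\cdot\hat A(c-d'')}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c},$$
$$L_\nu^{d'd''}(k) = \sum_{b,c\in G'}\frac{\hat q(d'-b) + 2(d'-b)\cdot\hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, i(-1)^\nu)\cdot\hat A(c-d'')$$
$$\qquad + \sum_{b,c\in G'}\frac{(1, i(-1)^\nu)\cdot\hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c-d'') + 2(d'-d'')\cdot\hat A(c-d'')\big) - (1, i(-1)^\nu)\cdot\hat A(d'-d''),$$
$$M^{d'd''}(k) = -\sum_{b,c\in G'}\frac{\hat q(d'-b) + 2(d'-b)\cdot\hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat q(c-d'') + \hat q(d'-d'') + 2(d'-d'')\cdot\hat A(d'-d'').$$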

 
Proof. To simplify the notation write $w = w_{\nu,d'}$, $z = z_{\nu,d'}$, $B_{ij} = B_{ij}^{d'd''}$ and $C_i = C_i^{d'd''}$. First observe that, in view of (14),
$$N_{d'} = (k_1 + d'_1 + i(-1)^\nu(k_2 + d'_2))(k_1 + d'_1 - i(-1)^\nu(k_2 + d'_2)) = wz.$$
Furthermore,
$$k_1 = \frac{1}{2}(w+z) - d'_1,$$
$$k_2 = \frac{(-1)^\nu}{2i}(w-z) - d'_2,$$
$$k_1^2 = \frac{1}{4}(w^2+z^2) + \frac{1}{2}wz - d'_1(w+z) + d_1'^2,$$
$$k_2^2 = -\frac{1}{4}(w^2+z^2) + \frac{1}{2}wz + i(-1)^\nu d'_2(w-z) + d_2'^2,$$
$$k_1k_2 = \frac{i(-1)^\nu}{4}(z^2-w^2) - \frac{1}{2}(d'_2 - i(-1)^\nu d'_1)w - \frac{1}{2}(d'_2 + i(-1)^\nu d'_1)z + d'_1d'_2.$$
Hence,
$$D_{d',d''} = B_{11}k_1^2 + B_{22}k_2^2 + (B_{12}+B_{21})k_1k_2 + C_1k_1 + C_2k_2 + C_0$$
$$= \frac{1}{4}\big(B_{11} - B_{22} - i(-1)^\nu(B_{12}+B_{21})\big)w^2 + \frac{1}{4}\big(B_{11} - B_{22} + i(-1)^\nu(B_{12}+B_{21})\big)z^2$$
$$+ \Big[-d'_1B_{11} + i(-1)^\nu d'_2B_{22} - \frac{1}{2}(d'_2 - i(-1)^\nu d'_1)(B_{12}+B_{21}) + \frac{1}{2}(C_1 - i(-1)^\nu C_2)\Big]w$$
$$+ \Big[-d'_1B_{11} - i(-1)^\nu d'_2B_{22} - \frac{1}{2}(d'_2 + i(-1)^\nu d'_1)(B_{12}+B_{21}) + \frac{1}{2}(C_1 + i(-1)^\nu C_2)\Big]z$$
$$+ d_1'^2B_{11} + d_2'^2B_{22} + d'_1d'_2(B_{12}+B_{21}) - d'_1C_1 - d'_2C_2 + C_0 + \frac{1}{2}(B_{11}+B_{22})wz$$
$$= J_{\nu'}^{d'd''}w^2 + J_\nu^{d'd''}z^2 + K^{d'd''}wz + L_{\nu'}^{d'd''}w + L_\nu^{d'd''}z + M^{d'd''}.$$
This proves the first claim. Consequently,
$$N_{d'} + D_{d',d'} = J_{\nu'}^{d'd'}w^2 + J_\nu^{d'd'}z^2 + (1 + K^{d'd'})wz + L_{\nu'}^{d'd'}w + L_\nu^{d'd'}z + M^{d'd'},$$
which proves the second claim.
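The algebra in this part of the proof can be checked numerically. The sketch below (not from the paper) draws hypothetical random complex values for $k$, $B_{ij}$, $C_i$, $C_0$ and a random real $d'$, and verifies that $N_{d'} = wz$ and that the quadratic form in $(k_1,k_2)$ agrees with the expression in $(w,z)$ built from $J$, $K$, $L$, $M$ as defined in Sec. 8. The function and variable names are assumptions of the example.

```python
import random

def coefficients(nu, d, B, C1, C2, C0):
    """Return (J_nu, J_nu', K, L_nu, L_nu', M) as defined in Sec. 8."""
    s = (-1) ** nu
    B11, B12, B21, B22 = B
    off = B12 + B21

    def J(sign):
        return (B11 - B22 + 1j * sign * off) / 4

    def L(sign):
        return (-d[0] * B11 - 1j * sign * d[1] * B22
                - (d[1] + 1j * sign * d[0]) * off / 2
                + (C1 + 1j * sign * C2) / 2)

    K = (B11 + B22) / 2
    M = (d[0] ** 2 * B11 + d[1] ** 2 * B22 + d[0] * d[1] * off
         - d[0] * C1 - d[1] * C2 + C0)
    # The complementary index nu' corresponds to the opposite sign -s.
    return J(s), J(-s), K, L(s), L(-s), M

rng = random.Random(1)

def rc():
    """Random complex number with real and imaginary parts in [-1, 1]."""
    return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

max_err = 0.0
for nu in (1, 2):
    for _ in range(100):
        k = (rc(), rc())
        d = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        B = (rc(), rc(), rc(), rc())          # (B11, B12, B21, B22)
        C1, C2, C0 = rc(), rc(), rc()
        s = (-1) ** nu
        w = k[0] + d[0] + 1j * s * (k[1] + d[1])
        z = k[0] + d[0] - 1j * s * (k[1] + d[1])
        N = (k[0] + d[0]) ** 2 + (k[1] + d[1]) ** 2
        D_k = (B[0] * k[0] ** 2 + B[3] * k[1] ** 2 + (B[1] + B[2]) * k[0] * k[1]
               + C1 * k[0] + C2 * k[1] + C0)
        J_nu, J_nup, K, L_nu, L_nup, M = coefficients(nu, d, B, C1, C2, C0)
        D_wz = J_nup * w ** 2 + J_nu * z ** 2 + K * w * z + L_nup * w + L_nu * z + M
        max_err = max(max_err, abs(N - w * z), abs(D_k - D_wz))

print(max_err)
```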


Now, again to simplify the notation write
$$f\circ g = \sum_{b,c\in G'}\frac{f(b,d')}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,g(c,d''),$$
that is, to represent sums of this form suppress the summation and the other factors. Note that $f\circ g\neq g\circ f$ according to this notation. Then, substituting (13) into the definition of $J_\nu^{d'd''}$ we have
$$J_\nu^{d',d''} = \frac{1}{4}\big(B_{11} - B_{22} + i(-1)^\nu(B_{12}+B_{21})\big)$$
$$= -A_1\circ A_1 + A_2\circ A_2 - i(-1)^\nu(A_1\circ A_2 + A_2\circ A_1)$$
$$= (-A_1 - i(-1)^\nu A_2)\circ(A_1 + i(-1)^\nu A_2)$$
$$= -((1, i(-1)^\nu)\cdot A)\circ((1, i(-1)^\nu)\cdot A)$$
$$= -\sum_{b,c\in G'}\frac{(1, i(-1)^\nu)\cdot\hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, i(-1)^\nu)\cdot\hat A(c-d'').$$
Similarly, substituting (13) into the definitions of $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ we derive the other expressions.

9. Asymptotics for the Coefficients

Let $f$ and $g$ be functions on $\Gamma^\#$ and for $k\in\mathbb{C}^2$ and $d', d''\in G$ set
$$\Phi_{d',d''}(k;G) := \sum_{b,c\in G'}\frac{f(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,g(c-d''). \tag{15}$$
In this section we study the asymptotic behaviour of the function $\Phi_{d',d''}(k)$ for $k$ in the union of $\varepsilon$-tubes with large $|v|$. Here we only give the statements. See Appendix B for the proofs. Reset the constant $R$ as


l1 , (1 + b2 ) 4
R := max 1, , 2, 140 A q (b) l1 , (16)

and make the following hypothesis.

Hypothesis 1.
$$\|b^2\hat q(b)\|_{l^1} < \infty \quad\text{and}\quad \|(1+b^2)\hat A(b)\|_{l^1} < \frac{2}{63}\varepsilon.$$
Our first lemma provides an expansion for $\Phi_{d',d'}(k)$ in powers of $1/|z_{\mu,d'}(k)|$.
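Lemma 3 (Asymptotics for $\Phi_{d',d'}(k)$). Under Hypothesis 1, let $\nu\in\{1,2\}$ and let $f$ and $g$ be functions on $\Gamma^\#$ with $\|b^2f(b)\|_{l^1} < \infty$ and $\|b^2g(b)\|_{l^1} < \infty$. Suppose either (i) or (ii) where:

(i) $G = \{0\}$ and $k\in(T_\nu(0)\setminus\bigcup_{b\in G'}T_b)\setminus K_R$;
(ii) $G = \{0,d\}$ and $k\in(T_\nu(0)\cap T_{\nu'}(d))\setminus K_R$.

Then, for $(\mu,d') = (\nu,0)$ if (i) or $(\mu,d')\in\{(\nu,0),(\nu',d)\}$ if (ii),
$$\Phi_{d',d'}(k) = \alpha^{(1)}_{\mu,d'}(k) + \alpha^{(2)}_{\mu,d'}(k) + \alpha^{(3)}_{\mu,d'}(k),$$
where for $1\le j\le 2$,
$$|\alpha^{(j)}_{\mu,d'}(k)| \le \frac{C_j}{(2|z_{\mu,d'}(k)| - R)^j} \quad\text{and}\quad |\alpha^{(3)}_{\mu,d'}(k)| \le \frac{C_3}{|z_{\mu,d'}(k)|\,R^2},$$
where $C_j = C_{j;\Lambda,A,q,f,g}$ and $C_3 = C_{3;\varepsilon,\Lambda,A,q,f,g}$ are constants. Furthermore, the functions $\alpha^{(j)}_{\mu,d'}(k)$ are given by (66) and (69) and are analytic in the region under consideration.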
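Below we have more information about the function $\alpha^{(1)}_{\mu,d'}(k)$.

Lemma 4 (Asymptotics for $\alpha^{(1)}_{\mu,d'}(k)$). Consider the same hypotheses of Lemma 3. Then, for $(\mu,d') = (\nu,0)$ if (i) or $(\mu,d')\in\{(\nu,0),(\nu',d)\}$ if (ii),
$$z_{\mu,d'}(k)\,\alpha^{(1)}_{\mu,d'}(k) = \alpha^{(1,0)}_{\mu,d'} + \alpha^{(1,1)}_{\mu,d'}(w(k)) + \alpha^{(1,2)}_{\mu,d'}(k) + \alpha^{(1,3)}_{\mu,d'}(k),$$
where $\alpha^{(1,0)}_{\mu,d'}$ is a constant given by (80), and the remaining functions $\alpha^{(1,j)}_{\mu,d'}$ are given by (79). Furthermore, for $0\le j\le 2$,
$$|\alpha^{(1,j)}_{\mu,d'}| \le C_j \quad\text{and}\quad |\alpha^{(1,3)}_{\mu,d'}| \le \frac{C_3}{2|z_{\mu,d'}(k)| - R},$$
where $C_j = C_{j;\Lambda,A,f,g}$ and $C_3 = C_{3;\Lambda,A,f,g}$ are constants given by (81).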

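The next lemma estimates the decay of $\Phi_{d',d''}(k)$ with respect to $z_{\nu',d}(k)$ for $d'\neq d''$.

Lemma 5 (Decay of $\Phi_{d',d''}(k)$ for $d'\neq d''$). Under Hypothesis 1, let $\nu\in\{1,2\}$ and let $f$ and $g$ be functions on $\Gamma^\#$ with $\|b^2f(b)\|_{l^1} < \infty$ and $\|b^2g(b)\|_{l^1} < \infty$. Suppose further that $G = \{0,d\}$ and $k\in(T_\nu(0)\cap T_{\nu'}(d))\setminus K_R$. Then, for $d', d''\in G$ with $d'\neq d''$,
$$|\Phi_{d',d''}(k)| \le \frac{C_{\Gamma^\#,\varepsilon,f,g}}{|z_{\nu',d}(k)|^{3-10^{-1}}},$$
where $C_{\Gamma^\#,\varepsilon,f,g}$ is a constant.

The next proposition relates the quantities $|v|$, $|k_2|$, $|z_{\nu,d'}(k)|$ and $|d|$ for $k$ in the $\varepsilon$-tubes with large $|v|$.

Proposition 11. For $\nu\in\{1,2\}$ we have:

(i) Let $k\in T_\nu(0)\setminus K_R$. Then,
$$\frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|} \quad\text{and}\quad \frac{1}{4|v|} \le \frac{1}{|k_2|} \le \frac{8}{|v|}.$$
(ii) Let $k\in(T_\nu(0)\cap T_{\nu'}(d))\setminus K_R$. Then,
$$\frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|}, \qquad \frac{1}{|z_{\nu',d}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu',d}(k)|},$$
$$\frac{1}{2|z_{\nu',d}(k)|} \le \frac{1}{|d|} \le \frac{2}{|z_{\nu',d}(k)|}.$$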


10. Bounds on the Derivatives


In the last section, we expressed $\Phi_{d',d'}(k)$ as a sum of certain functions $\alpha^{(j)}_{\mu,d'}(k)$ for $k$ in the $\varepsilon$-tubes with large $|v|$. In this section we provide bounds for the derivatives of all these functions. Here we only give the statements. See Appendix C for the proofs.

Our first lemma concerns the derivatives of $\Phi_{d',d''}(k)$.

Lemma 6 (Derivatives of $\Phi_{d',d''}(k)$). Under Hypothesis 1, let $f$ and $g$ be functions in $l^1(\Gamma^\#)$ and suppose either (i) or (ii) where:

(i) $G = \{0\}$ and $k\in(T_0\setminus\bigcup_{b\in G'}T_b)\setminus K_R$;
(ii) $G = \{0,d\}$ and $k\in(T_0\cap T_d)\setminus K_R$.

Then, for any integers $n$ and $m$ with $n+m\ge 1$ and for any $d', d''\in G$,
$$\Big|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\Phi_{d',d''}(k)\Big| \le \frac{C}{|v|},$$
where $C$ is a constant with $C = C_{\varepsilon,\Lambda,A,f,g,m,n}$ if (i) or $C = C_{\Lambda,A,f,g,m,n}$ if (ii).

We now improve the estimate of Lemma 6(ii) for $d'\neq d''$.

Lemma 7 (Derivatives of d ,d (k) for d = d). Consider a constant 2


and suppose that |b| q(b) l1 < and (1 + |b| )A(b) l1 < 2/63. Let {1, 2}
and let f and g be functions on obeying |b| f (b) l1 < and |b| g(b) l1 < .
#

Suppose further that G = {0, d} and k T0 Td with |v| > 2 |b| q(b) l1 . Then, for
any integers n and m with n + m 0 and for any d , d G with d = d ,
n+m
C

k n k m d ,d (k) |d|1+ ,
1 2

where C = C,,A,f,g,m,n is a constant.

Observe that, in particular, this lemma with m = n = 0 generalizes Lemma 5.


(j)
We next have bounds for the derivatives of ,d (k).
(j)
Lemma 8 (Derivatives of ,d (k)). Under Hypothesis 1, let {1, 2} and let
f and g be functions in l1 (# ). Suppose either (i) or (ii) where:

(i) G = {0} and k (T (0)\ bG Tb )\KR ;
(ii) G = {0, d} and k (T (0) T  (d))\KR .

Then, there is a constant = ,A,q,m,n with R such that, for |v| and for
(, d ) = (, 0) if (i) or (, d ) {(, 0), (  , d)} if (ii), for any integers n and m
with n + m 1 and for 1 j 2,
n+m n+m
Cj C3
(j) (3)
(k)
k n k m ,d (2|z,d (k)| R)j and k n k m ,d |z,d (k)|R2 ,
(k)
1 2 1 2

where Cl = Cl;f,g,,A,q,n,m for 1 l 3 are constants. Furthermore,

C1;f,g,,A,1,0 , C1;f,g,,A,0,1 132 f l1 g l1 and


C1;f,g,,A,1,1 653 f l1 g l1 .

11. The Regular Piece


Proof of Theorem 1. Step 1 (Defining Equation). We first derive a defining equation for the Fermi curve. Without loss of generality we may assume that $\hat A(0) = 0$. Let $G = \{0\}$, recall that $G' = \Gamma^\#\setminus\{0\}$, and consider the region $(T_\nu(0)\setminus\bigcup_{b\in G'}T_b)\setminus K_\rho$, where $\rho$ is a constant to be chosen sufficiently large obeying $\rho\ge R$. By Proposition 7(i) we have $G' = \{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$. To simplify the notation write
$$M_\nu := \widehat{F}(A,V)\cap\Big(T_\nu(0)\setminus\Big(K_\rho\cup\bigcup_{b\in\Gamma^\#\setminus\{0\}}T_b\Big)\Big).$$
By Lemma 2(i), a point $k$ is in $M_\nu$ if and only if
$$N_0(k) + D_{0,0}(k) = 0.$$
By Proposition 10, if we set
$$w(k) := w_{\nu,0}(k) = k_1 + i(-1)^\nu k_2 \quad\text{and}\quad z(k) := z_{\nu,0}(k) = k_1 - i(-1)^\nu k_2,$$
this equation becomes
$$\beta_1w^2 + \beta_2z^2 + (1+\beta_3)wz + \beta_4w + \beta_5z + \beta_6 + \hat q(0) = 0, \tag{17}$$
where
$$\beta_1 := J_{\nu'}^{00}, \quad \beta_2 := J_\nu^{00}, \quad \beta_3 := K^{00}, \quad \beta_4 := L_{\nu'}^{00}, \quad \beta_5 := L_\nu^{00}, \quad \beta_6 := M^{00} - \hat q(0),$$
with $J_\nu^{00}$, $K^{00}$, $L_\nu^{00}$ and $M^{00}$ given by Proposition 10. Observe that all the coefficients $\beta_1,\dots,\beta_6$ have exactly the same form as the function $\Phi_{0,0}(k)$ of Lemma 3(i) (see (15)). Thus, by this lemma, for $1\le i\le 6$ we have
$$\beta_i = \beta_i^{(1)} + \beta_i^{(2)} + \beta_i^{(3)}, \tag{18}$$
where the function $\beta_i^{(j)}$ is analytic in the region under consideration with
$$|\beta_i^{(j)}(k)| \le \frac{C}{(2|z(k)| - \rho)^j} \le \frac{C}{|z(k)|^j} \ \text{ for } 1\le j\le 2 \quad\text{and}\quad |\beta_i^{(3)}(k)| \le \frac{C}{|z(k)|\,\rho^2},$$
where $C = C_{\varepsilon,\Lambda,q,A}$ is a constant. The exact expression for $\beta_i^{(j)}$ can be easily obtained from the definitions and from Lemma 3(i). Substituting (18) into (17) and dividing both sides of the equation by $z$ yields
$$w + \beta_2^{(1)}z + g = 0, \tag{19}$$
where
$$g := \beta_1\frac{w^2}{z} + (\beta_2^{(2)} + \beta_2^{(3)})z + \beta_3w + \beta_4\frac{w}{z} + \beta_5 + \frac{\beta_6}{z} + \frac{\hat q(0)}{z} \tag{20}$$
obeys
$$|g(k)| \le \frac{C}{\rho}, \tag{21}$$
with a constant $C = C_{\varepsilon,\Lambda,q,A}$. Therefore, a point $k$ is in $M_\nu$ if and only if
$$F(k) = 0,$$
where
$$F(k) := w(k) + \beta_2^{(1)}(k)\,z(k) + g(k)$$
is an analytic function (in the region under consideration).

Step 2 (Candidates for a Solution). Let us now identify which points are candidates
to solve the equation F (k) = 0. First observe that, by Proposition 2(c) the lines


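$\mathcal{N}_\nu(0)$ and $\mathcal{N}_{\nu'}(d)$ intersect at $\mathcal{N}_\nu(0)\cap\mathcal{N}_{\nu'}(d) = \{(i\theta_\nu(d), (-1)^{\nu'}\theta_\nu(d))\}$. Hence, the second coordinate of this point and the second coordinate of a point $k$ differ by
$$\mathrm{pr}(k) - \mathrm{pr}(\mathcal{N}_\nu(0)\cap\mathcal{N}_{\nu'}(d)) = k_2 - (-1)^{\nu'}\theta_\nu(d) = k_2 + (-1)^\nu\theta_\nu(d).$$
Now observe that, if $k\in T_\nu(0)\cap T_{\nu'}(d)$ then $|k_1 + i(-1)^\nu k_2| < \varepsilon$ and
$$|k_2 + (-1)^\nu\theta_\nu(d)| = \Big|\frac{1}{2}(k_1 + i(-1)^\nu k_2) - \frac{1}{2}(k_1 + d_1 - i(-1)^\nu(k_2 + d_2))\Big| = \frac{1}{2}|N_{0,\nu}(k) - N_{d,\nu'}(k)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
That is, the second coordinate of $k$ and the second coordinate of $\mathcal{N}_\nu(0)\cap\mathcal{N}_{\nu'}(d)$ must be apart from each other by at most $\varepsilon$. This gives a necessary condition on the second coordinate of a point $k$ for being in $M_\nu$. Conversely, if a point $k$ is in the $(\varepsilon/4)$-tube inside $T_\nu(0)$, that is, $|k_1 + i(-1)^\nu k_2| < \frac{\varepsilon}{4}$, and its second coordinate differs from the second coordinate of $\mathcal{N}_\nu(0)\cap\mathcal{N}_{\nu'}(d)$ by at most $\varepsilon/4$, that is, $|k_2 + (-1)^\nu\theta_\nu(d)| < \frac{\varepsilon}{4}$, then
$$|N_{d,\nu'}(k)| = |N_{0,\nu}(k) - 2i(-1)^\nu(k_2 + (-1)^\nu\theta_\nu(d))| < \frac{\varepsilon}{4} + 2\,\frac{\varepsilon}{4} < \varepsilon,$$
that is, the point $k$ is also in $T_{\nu'}(d)$ and hence lies in the intersection $T_\nu(0)\cap T_{\nu'}(d)$. This gives a sufficient condition on the first and second coordinates of a point $k$ for being in $T_\nu(0)\cap T_{\nu'}(d)$.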
For y C dene the set of candidates for a solution of F (k) = 0 as


M (y) := pr1 (y) T (0) Tb
b# \{0}


= pr1 (y) T (0) T  (b) .
b# \{0}

Observe that, if |y + (1) (b)| for all b # \{0} then


M (y) = pr1 (y) T (0) = {(k1 , y) C2 | |k1 + i(1) y| < }. (22)
On the other hand, if |y + (1) (d)| < for some d # \{0}, then there is at
most one such d and consequently
M (y) = pr1 (y) (T (0)\T  (d))

= {(k1 , y) C2 ||k1 + i(1) y| < and |k1 + d1 + i(1) (y + d2 )| }.
(23)
Indeed, suppose there is another d = 0 such that |y + (1) (d )| < . Then,
|d d | = |2(1) (d d )| = |y + (1) (d) (y + (1) (d ))| 2 < 2,
which contradicts the denition of . Thus, there is no such d = 0.

Step 3 (Uniqueness). We now prove that, given k2 , if there exists a solution k1 (k2 )
of F (k1 , k2 ) = 0, then this solution is unique and it depends analytically on k2 . This
follows easily using the implicit function theorem and the estimates below, which
we prove later.

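Proposition 12. Under the hypotheses of Theorem 1 we have
$$|F(k) - w(k)| \le \frac{\varepsilon}{900} + \frac{C_1}{\rho}, \tag{a}$$
$$\Big|\frac{\partial F}{\partial k_1}(k) - 1\Big| \le \frac{1}{7\cdot 3^4} + \frac{C_2}{\rho}, \tag{b}$$
where the constants $C_1$ and $C_2$ depend only on $\varepsilon$, $\Lambda$, $q$ and $A$.

Now suppose that $(k_1,y)\in M_\nu(y)$. Then,
$$\Big|\frac{\partial F}{\partial k_1}(k_1,y) - 1\Big| \le \frac{1}{7\cdot 3^4} + \frac{C_2}{\rho}.$$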

Hence, by the implicit function theorem, by choosing the constant R suciently


large, if F (k1 , y) = 0 for some (k1 , y) M (y), then there is a neighborhood
U V C2 which contains (k1 , y), and an analytic function : V U such that
F (k1 , k2 ) = 0 for all (k1 , k2 ) U V if and only if k1 = (k2 ). In particular this
implies that the equation F (k1 , k2 ) = 0 has at most one solution ((y), y) in M (y)
for each y C. We next look for conditions on y to have a solution or have no
solution in M (y).

Step 4 (Existence). We rst state an improved version of Proposition 12(a).

Proposition 13. Under the hypotheses of Theorem 1 we have


(1,0) (1,1) (1,2)
F (k) w(k) = 2 + 2 (w(k)) + 2 (k) + h(k),

where
! "
(1,0)
  (A(b))
c))
 (A(b
2 = 2i b,c +
(A(c)) (24)

 (b)  (c)
b,cG1

is a constant that depends only on and A and


(1,3)
h := 2 + g.

Furthermore,
(1,0) 1 (1,1) 1
|2 |<2 , |2 (k)| <3 ,
100 402
(1,2) 1 1
|2 (k)| < 4 3 4 , |h(k)| C,,q,A .
7

We now derive conditions for the existence of solutions. Suppose that


F ((y), y) = 0. Then, since (y) + i(1) y = w((y), y) and < /6, using the
above proposition we obtain
|(y) + i(1) y| = |w((y), y)| = |F ((y), y) w((y), y)|
2 3 4 C 2 C
+ + + + .
100 402 74 3 50
Hence, by choosing the constant suciently large we nd that
2
|(y) + i(1) y| < .
40
In view of (23), there is no solution in M (y) if for some d # \{0} we have

|y + (1) (d)| < and |(y) + d1 + i(1) (y + d2 )| < .
This happens if

1 2
|y + (1) (d)|


2 40
because in this case

|(y) + d1 + i(1) (y + d2 )| = |(y) + i(1) y 2i(1) y + d1 i(1) d2 |
|(y) + i(1) y| + 2|y + (1) (d)| < .
Therefore, the image set of pr is contained in


1 2

1 := z C |z + (1) (b)| >

for all b \{0} .
#
2 40
On the other hand, in view of (22), there is a solution in M (y) if |y+(1) (b)| >
for all b # \{0}. Recall from Proposition 11(a) that < |v| < 8|k2 |. Thus, the
image set of pr contains the set
2 := {z C | 8|z| > and |z + (1) (b)| > for all b # \{0}}.
Step 5. Summarizing, we have the following biholomorphic correspondence:
pr
M  k k2 ,
pr 1
M  ((y), y) y ,
where
(1,0)
2 1 and (y) = 2 i(1) y r(y),
(1,0)
with the constant 2 given by (24),

(1,0) 2 3 C
|2 |< and |r(y)| + .
100 502
This completes the proof of the theorem.

Proof of Proposition 12. (a) Recallthat 2 = J00 . First observe that, by


Proposition 10, Lemma 3, and (66), we have
(1)
2 (k) = (J00 )(1) (k)
 (1, i(1) ) A(b)

= Sb,c (1, i(1) ) A(c).
(25)

Nb (k)
b,cG1

Thus, by (94) and (99),


(1)
2 45
|2 (k)| l1
2 A 2 A l1
(2|z(k)| R) 44
2
4 44 2 1
. (26)
|z(k)| 45 63 900 |z(k)|
Now recall that |g(k)| C,,q,A 1 . Hence,
(1) 1
|F (k) w(k)| = |2 (k)z(k) + g(k)| + C,,q,A .
900
This proves part (a).
(b) We rst compute
# $
(2) (3)
g 1 w2 2wz w2 2
= + 1 + + 2 z
k1 k1 z z2 k1 k1

(2) (3) 3 4 w
+ 2 + 2 + w + 3 +
k1 k1 z
z w 5 6 1 6 q(0)
+ 4 + + 2 2 . (27)
z2 k1 k1 z z z
Now observe that, since k T (0)\K we have |w(k)| < , 3|v| |z| and
< |v| |z|. Furthermore, by Lemmas 3(i), 6(i) and 8(i), for 1 i 6 and
1 j 2,
C (j) C (3) C
|i (k)| , |i (k)| , |i (k)| ,
|z(k)| |z(k)|j |z(k)|2

i (k) (j) (k) (3) (k) (28)
C i C i C
k1 |z(k)| ,
k1 |z(k)|j
,
k1 |z(k)|2
,

where C = C,,q,A in all cases. Hence,



g(k) 1

k1 C,,q,A . (29)

By Lemma 8(i) with f = g = (1, i(1) ) A,


we obtain

2 (k)
(1)
13
z(k) |z(k)| 2 (1, i(1) ) A
21
k1 |z(k)| l

26 2 1
A l1 < . (30)
2 7 34

Therefore,

F
(k) 1 = (F (k) w(k)) = (
(1)
(k)z(k) + g(k))
k1 k1 k1 2

(1)
(1) g 1 1
= 2 (k)z(k) + 2 (k) + (k) + C,,q,A .
k1 k1 73 4

This proves part (b) and completes the proof of the proposition.

Proof of Proposition 13. First observe that



(1, i(1) ) A = A1 + i(1) A2 = A1 i(1) A2 = 2i  (A).
Thus, recalling (25),

(1)
 2i  (A(b))

2 (k) = (J00 )(1) (k) =
Sb,c 2i (A(c)).

Nb (k)
b,cG1

Now, by Lemma 4, we have


(1) (1,0) (1,1) (1,2) (1,3)
z(k)2 (k) = 2 + 2 (w(k)) + 2 (k) + 3 (k),
where
! "
(1,0)
  (A(b))
c))
 (A(b
2 = 2i b,c +
(A(c))

 (b)  (c)
b,cG1

and
(1,3) 1 1
|3 (k)| C,A < C,A .
|z(k)|
Hence,
(1) (1,0) (1,1) (1,2)
F (k) w(k) = z(k)2 (k) + g(k) = 2 + 2 (w(k)) + 2 (k) + h(k)
(1,3)
with h := 3 + g. Furthermore, in view of (21),
(1,3) 1
|h(k)| |3 (k)| + |g(k)| < C,,q,A .

This proves the rst part of the proposition. Finally, by (81), since A l1 < 2/63
and < /6, we nd that

(1,0) 1 1
|2 | 1+  (A) l1 2i  (A)
l1 2i (A)
l1
2 2
4 2 1
A 1 < 2 ,
l 100
(1,1) 7
|2 | 2 1 +  (A) l1 2i  (A)
l1 2i (A)
l1
6
8 21 < 1 3
2 A l
402

and
(1,2) 64 l1 256 A
41 < 1 4 .
|2 |
3
 (A)
21 2i  (A)
l
l1 2i (A)
3 l
74 3
This completes the proof.

12. The Handles


Proof of Theorem 2. Step 1 (Defining Equation). Let $G = \{0,d\}$ and consider the region $(T_\nu(0)\cap T_{\nu'}(d))\setminus K_\rho$, where $\rho$ is a constant to be chosen sufficiently large obeying $\rho\ge R$. Observe that this requires $d$ to be sufficiently large for $(T_\nu(0)\cap T_{\nu'}(d))\setminus K_\rho$ to be nonempty. In fact, by Proposition 11(ii), for $k$ in this region we have $\rho < |v| \le 2|d|$. Now, recall from Proposition 7(ii) that $G' = \{b\in\Gamma^\# \mid |N_b(k)|\ge\varepsilon|v|\}$, and to simplify the notation write
$$H_\nu := \widehat{F}(A,V)\cap(T_\nu(0)\cap T_{\nu'}(d))\setminus K_\rho.$$
By Lemma 2(ii), a point $k$ is in $H_\nu$ if and only if
$$(N_0(k) + D_{0,0}(k))(N_d(k) + D_{d,d}(k)) - D_{0,d}(k)D_{d,0}(k) = 0. \tag{31}$$
Define
$$w_1(k) := w_{\nu,0} = k_1 + i(-1)^\nu k_2,$$
$$z_1(k) := z_{\nu,0} = k_1 - i(-1)^\nu k_2,$$
$$w_2(k) := w_{\nu',d} = k_1 + d_1 + i(-1)^{\nu'}(k_2 + d_2),$$
$$z_2(k) := z_{\nu',d} = k_1 + d_1 - i(-1)^{\nu'}(k_2 + d_2). \tag{32}$$
Note that, by Proposition 11(ii),
|v| |z1 | 3|v|, |v| |z2 | 3|v| and |d| |z2 | 2|d|.
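By Proposition 10,
$$N_0 + D_{0,0} = \beta_1w_1^2 + \beta_2z_1^2 + (1+\beta_3)w_1z_1 + \beta_4w_1 + \beta_5z_1 + \beta_6 + \hat q(0),$$
$$N_d + D_{d,d} = \eta_1w_2^2 + \eta_2z_2^2 + (1+\eta_3)w_2z_2 + \eta_4w_2 + \eta_5z_2 + \eta_6 + \hat q(0), \tag{33}$$
where
$$\beta_1 := J_{\nu'}^{00}, \quad \beta_2 := J_\nu^{00}, \quad \beta_3 := K^{00}, \quad \beta_4 := L_{\nu'}^{00}, \quad \beta_5 := L_\nu^{00}, \quad \beta_6 := M^{00} - \hat q(0),$$
and
$$\eta_1 := J_\nu^{dd}, \quad \eta_2 := J_{\nu'}^{dd}, \quad \eta_3 := K^{dd}, \quad \eta_4 := L_\nu^{dd}, \quad \eta_5 := L_{\nu'}^{dd}, \quad \eta_6 := M^{dd} - \hat q(0),$$
with $J_\nu^{d'd'}$, $K^{d'd'}$, $L_\nu^{d'd'}$ and $M^{d'd'}$ given by Proposition 10. Observe that all the coefficients $\beta_1,\dots,\beta_6$ and $\eta_1,\dots,\eta_6$ have exactly the same form as the function $\Phi_{d',d'}(k)$ of Lemma 3(ii) (see (15)). Thus, by this lemma, for $1\le i\le 6$ we have
$$\beta_i = \beta_i^{(1)} + \beta_i^{(2)} + \beta_i^{(3)} \quad\text{and}\quad \eta_i = \eta_i^{(1)} + \eta_i^{(2)} + \eta_i^{(3)}, \tag{34}$$
where the functions $\beta_i^{(j)}$ and $\eta_i^{(j)}$ are analytic in the region under consideration with
$$|\beta_i^{(j)}(k)| \le \frac{C}{(2|z_1(k)| - \rho)^j} \le \frac{C}{|z_1(k)|^j} \ \text{ for } 1\le j\le 2 \quad\text{and}\quad |\beta_i^{(3)}(k)| \le \frac{C}{|z_1(k)|\,\rho^2},$$
$$|\eta_i^{(j)}(k)| \le \frac{C}{(2|z_2(k)| - \rho)^j} \le \frac{C}{|z_2(k)|^j} \ \text{ for } 1\le j\le 2 \quad\text{and}\quad |\eta_i^{(3)}(k)| \le \frac{C}{|z_2(k)|\,\rho^2},$$
where $C = C_{\varepsilon,\Lambda,q,A}$ is a constant. The exact expressions for $\beta_i^{(j)}$ and $\eta_i^{(j)}$ can be easily obtained from the definitions and from Lemma 3(ii). Substituting (34) into (33) yields
$$\frac{1}{z_1}(N_0 + D_{0,0}) = w_1 + \beta_2^{(1)}z_1 + g_1,$$
$$\frac{1}{z_2}(N_d + D_{d,d}) = w_2 + \eta_2^{(1)}z_2 + g_2, \tag{35}$$
where
$$g_1 := \beta_1\frac{w_1^2}{z_1} + (\beta_2^{(2)} + \beta_2^{(3)})z_1 + \beta_3w_1 + \beta_4\frac{w_1}{z_1} + \beta_5 + \frac{\beta_6}{z_1} + \frac{\hat q(0)}{z_1},$$
$$g_2 := \eta_1\frac{w_2^2}{z_2} + (\eta_2^{(2)} + \eta_2^{(3)})z_2 + \eta_3w_2 + \eta_4\frac{w_2}{z_2} + \eta_5 + \frac{\eta_6}{z_2} + \frac{\hat q(0)}{z_2} \tag{36}$$
obey
$$|g_1(k)| \le \frac{C}{\rho} \quad\text{and}\quad |g_2(k)| \le \frac{C}{\rho}, \tag{37}$$
with a constant $C = C_{\varepsilon,\Lambda,q,A}$. This gives us more information about the first term in (31). We next consider the second term in that equation.
Write
$$D_{0,d} = c_1(d) + p_1 \quad\text{and}\quad D_{d,0} = c_2(d) + p_2 \tag{38}$$
with
$$c_1(d) := \hat q(-d) - 2d\cdot\hat A(-d), \qquad p_1 := D_{0,d} - \hat q(-d) + 2d\cdot\hat A(-d),$$
$$c_2(d) := \hat q(d) + 2d\cdot\hat A(d), \qquad p_2 := D_{d,0} - \hat q(d) - 2d\cdot\hat A(d).$$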

We have the following estimates.

Proposition 14. Under the hypotheses of Theorem 2 we have, for any integers n
and m with n + m 0 and for 1 j 2,
n+m
C1 C2

k n k m pj (k) |d| and |cj (d)| |d| ,
1 2
where the constants C1 and C2 depend only on , , q and A.

Thus, by dividing both sides of (31) by $z_1z_2$ and substituting (35) and (38) we find that
$$0 = \frac{1}{z_1z_2}\big[(N_0 + D_{0,0})(N_d + D_{d,d}) - D_{0,d}D_{d,0}\big]$$
$$= (w_1 + \beta_2^{(1)}z_1 + g_1)(w_2 + \eta_2^{(1)}z_2 + g_2) - \frac{1}{z_1z_2}(c_1(d) + p_1)(c_2(d) + p_2). \tag{39}$$
We now introduce a (nonlinear) change of variables in $\mathbb{C}^2$. Set
$$x_1(k) := w_1(k) + \beta_2^{(1)}(k)\,z_1(k) + g_1(k),$$
$$x_2(k) := w_2(k) + \eta_2^{(1)}(k)\,z_2(k) + g_2(k). \tag{40}$$

This transformation obeys the following estimates.

Proposition 15. Under the hypotheses of Theorem 2 we have:

(i) For 1 j 2 and for suciently large,


C
|xj (k) wj (k)| + < .
900 8
(ii)

x1 x1 # $
k1 k2 1 i(1)
= (I + M )
x2 x2 
1 i(1)
k1 k2
and

k1 k1

x1 x2
= 1 1 1
(I + N )
k2 k2 2 i(1)

i(1)
x1 x2
with
4 C 1
M + < and N 4 M .
7 34 2
Furthermore, for all m, i, j {1, 2},
2
km 3 2 C

xi xj 3 + .

Here, all the constants C depend only on , , q and A.

By the inverse function theorem, these estimates imply that the above transformation is invertible. Therefore, by rewriting Eq. (39) in terms of these new variables, we conclude that a point $k$ is in $H_\nu$ if and only if $x_1(k)$ and $x_2(k)$ satisfy the equation
$$x_1x_2 + r(x_1,x_2) = 0, \tag{41}$$
where
$$r(x_1,x_2) := -\frac{1}{z_1z_2}(c_1(d) + p_1)(c_2(d) + p_2).$$
In order to study this defining equation we need some estimates.

Step 2 (Estimates). Using the above inequalities we have, for i, j, l {1, 2},
 2
pj km C

xi pj (k(x)) km xi |d|
m=1

and
 2 2
2 2
pj km kn  pj 2 km C

xi xl pj (k(x)) km kn xi xl + km xi xl |d| ,
m,n=1 m=1

so that
1 1 1 C
|r(x)| C 4,
|d| |d| |d|
2 |d|

1 1 1 1 1 1 C

xi r(x) C |d|3 |d| |d| + C |d|2 |d| |d| |d|4

and

2
r(x) C .
xi xj |d|4

Here, all the constants depend only on , , q and A.

Step 3 (Morse Lemma). We now apply the quantitative Morse lemma in


Appendix A for studying Eq. (41). We consider this lemma with a = b = C/|d|4 ,
= , and d sufficiently large so that b < max{ 23 55
1
, 4 }. Observe that, under this
condition we have

( a)(1 19b) > and ( a)(1 55b) > .
2 4
According to this lemma, there is a biholomorphism defined on



2
1 := (z1 , z2 ) C |z1 | < and |z2 | <
2 2

with range containing





2
(x1 , x2 ) C |x1 | < and |x2 | < (42)
4 4
such that
C
D I ,
|d|2
((x1 x2 + r) )(z1 , z2 ) = z1 z2 + td ,
C (43)
|td | ,
|d|4
C
| (0)| ,
|d|4
where D is the derivative of and td is a constant that depends on d. Hence,
if for = 1 we define
d,1 : 1 T1 (0) T2 (d)
as
d,1 (z1 , z2 ) := (k1 (1 (z1 , z2 )), k2 (1 (z1 , z2 ))),
where k(x) is the inverse of the transformation (40), we obtain the desired map. Note
that the conclusion (ii) of the theorem is immediate. We next prove (i) and (iii).

Step 4 (Proof of (i)). By Proposition 15(i), for 1 j 2 we have |xj (k)wj (k)| 8 .
Now, recall from (32) the definition of w1 (k) and w2 (k). Then, since

|xj (k)| |xj (k) wj (k)| + |wj (k)| < + |wj (k)|,
8
the set



2
(k1 , k2 ) C |w1 (k)| < and |w2 (k)| <
8 8
is contained in the set (42). This proves the first part of (i). To prove the second
part we use Proposition 15 and (43). First observe that
# $
k 1 1 1
Dd,1 = D1 = (I + N )(I + D1 I)
x 2 i i
# $
1 1 1
= (I + N + R),
2 i i
where
1 C C
N + and R .
33 |d|2
Furthermore, from (32) and (40) we have
1 1 (1) (1)
k1 = i (d) + (w1 + w2 ) = i (d) + (x1 + x2 + 2 z1 + 2 z2 + g1 + g2 )
2 2

and similarly
(1) (1) (1)
k2 = (1) (d) + (x1 x2 2 z1 + 2 z2 g1 + g2 ),
2i
so that

1
d,1 (0) = k(1 (0)) = k O
|d|4

1
= (i (d), (1) (d)) + O +O .
900

Step 5 (Proof of (iii)). To prove part (iii) it suffices to note that T1 (0) T2 (d)
F (A, V ) is mapped to T1 (d) T2 (0) F (A, V ) by translation by d and define
d,2 by
d,2 (z1 , z2 ) := d,1 (z2 , z1 ) + d.
This completes the proof of the theorem.

Proof of Proposition 14. It suffices to estimate


cd ,d := q(d d ) 2(d d ) A(d
 d ) and pd ,d := Dd ,d cd ,d
 
for d , d {0, d} with d = d . Dene ld d := (1, i(1) ) A(d
 d ). Observe
that, since
1
q (d d )| =
| |d d |2 |
q (d d )|
|d
d |2
1  1
 
|b|2 |q (b)| b2 q(b) l1 2 ,
|d d | 2
#
|d|
b

and similarly
 d )| b2 A(b) 1
|A(d l1 ,
|d|2
it follows that
CA,q   CA
|cd ,d | and |ld d | .
|d| |d|2
This gives the desired bounds for c1 and c2 .
Now, by Proposition 10, we have
         
p = Jd d w,d
2 dd
 + J
2
z,d  + K
dd d d ld d )w,d
w,d z,d + (L
     
d d ld d )z,d + M
+ (L dd

         
with L d d := Ld d +ld d and M d d := M d d c. Observe that all the coefficients
       
Jd d , K d d , L d d and M
d d have exactly the same form as the function d ,d (k)
of Lemma 7 (see Proposition 10 and (15)). Thus, by this lemma with = 2, for
n+m
any integers n and m with n + m 0, the absolute value of the k n km -derivative
1 2

of each of these functions is bounded above by C,,A,q,m,n |d|1 3 . Hence, if we recall


from Proposition 11(ii) that |z1 (k)| 6|d| and |z2 (k)| 2|d|, and apply the Leibniz
rule we find that
n+m
C

k n k m pd ,d (k) Cm,n |d| .
1 2

This yields the desired bounds for p1 and p2 and completes the proof.

Proof of Proposition 15. (i) Similarly as in (26) we have

(1) 1 (1) 1
|2 (k)| and |2 (k)| .
900 |z1 (k)| 900 |z2 (k)|
Thus, in view of (37), and by choosing sufficiently large,
(1) C
|x1 (k) w1 (k)| |2 (k) z1 (k) + g1 (k)| + < ,
900 8
and similarly |x2 (k) w2 (k)| < /8. This proves part (i).
(ii) Recall (32) and (40). Then, for 1 j 2,
(1)
x1 (1) w1 z1 (1) g1
= (w1 + z1 2 + g1 ) = + z1 2 + + ,
kj kj kj kj kj 2 kj
(1)
x2 (1) w2 z2 (1) g2
= (w2 + z2 2 + g2 ) = + z2 2 + + .
kj kj kj kj kj 2 kj
First observe that the functions g1 and g2 are similar to the function g (see
g1 g2
(36) and (20)). Thus, it is easy to see that k j
and kj
are given by expressions
similar to (27). Since k T (0) T (d) we have |w1 (k)| < and |w2 (k)| < .


Recall also the inequalities in Proposition 11(ii). Hence, by Lemmas 3(ii), 6(ii)
and 8(ii), we obtain (28) with k1 and z(k) replaced by kj and z1 (k), respectively,
and for k1 , z(k) and replaced by kj , z2 (k) and , respectively. Consequently,
similarly as in (29) and using again Lemma 3(ii), for 1 j 2 we have

z1 (1) g1 1 z2 (1) g2 1

kj 2 + kj C,,q,A and kj 2 + kj C,,q,A .

Now recall that 2 = J00 and 2 = Jdd . Then, by Proposition 10, Lemma 3(ii),
and (66), it follows that

(1)
 (1, i(1) ) A(b)

2 (k) = (J00 )(1) (k) = Sb,c (1, i(1) ) A(c),


N b (k)
b,cG1

(1)
2 (k) = (Jdd )(1) (k)
 (1, i(1)  ) A(d
b) 
= Sb,c (1, i(1) ) A(c
d).

N b (k)
b,cG1

Hence, by Lemma 8(ii), similarly as in (30), for 1 j 2,



2 (k)
(1)
13 21 < 1
z1 (k) (1, i(1) ) A and
kj 2 l
7 34

2 (k)
(1)
1
z2 (k) < .
kj 7 34
Therefore,
(1) (1)
x1 x1 2 (k) 2 (k)
z1 (k)
k2
z1 (k)
k1 k2 i(1) k1
= 1 +
x2 x2


1 i(1) 2 (k)
(1) (1)
2 (k)
k1 k2 z2 (k) z2 (k)
k1 k2

g1 g1
(1) (1)
2 i(1) 2 k1 k 2
+ +
(1)  (1) g g
2 i(1) 2 2 2
k1 k2

1 i(1)
:=  (I + M1 + M2 + M3 ),
1 i(1)
where
2 1
M1 2 and M2 + M3 C,,q,A .
7 34
Set M := M1 + M2 + M3 . This proves the first claim.
Now, by choosing sufficiently large we can make ‖M‖ < 1/2. Write
# $
1 i(1)
P :=  .
1 i(1)
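Then, by the inverse function theorem and using the Neumann series,
\[
\begin{pmatrix}
\dfrac{\partial k_1}{\partial x_1} & \dfrac{\partial k_1}{\partial x_2} \\[2mm]
\dfrac{\partial k_2}{\partial x_1} & \dfrac{\partial k_2}{\partial x_2}
\end{pmatrix}
=
\begin{pmatrix}
\dfrac{\partial x_1}{\partial k_1} & \dfrac{\partial x_1}{\partial k_2} \\[2mm]
\dfrac{\partial x_2}{\partial k_1} & \dfrac{\partial x_2}{\partial k_2}
\end{pmatrix}^{-1}
= (I + M)^{-1} P^{-1} = (I + \tilde M) P^{-1}
= P^{-1} (I + P \tilde M P^{-1})
= \frac{1}{2}
\begin{pmatrix} 1 & 1 \\ -i(-1)^{\nu} & i(-1)^{\nu} \end{pmatrix}
(I + P \tilde M P^{-1}),
\]
where $\tilde M := (I + M)^{-1} - I$, with
\[
\| P \tilde M P^{-1} \| \le 2 \| \tilde M \| \le \frac{2 \| M \|}{1 - \| M \|} \le 4 \| M \| .
\]
Set $N := P \tilde M P^{-1}$. This proves the second claim.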
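The Neumann-series step above can be illustrated numerically. The following is a minimal sketch, not the author's computation: the sample matrix `M`, the sign value `s` standing in for the sign factor in P, and the use of the maximum-row-sum norm are all assumptions made here for illustration. With that norm one has ‖P‖ = 2 and ‖P^{-1}‖ = 1, so the conjugated matrix should obey the bound 4‖M‖ whenever ‖M‖ < 1/2.

```python
# Illustrative 2x2 check; M, s and the choice of norm are assumptions.

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mat_inv(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def norm_inf(a):
    """Maximum absolute row sum (an induced, hence submultiplicative, norm)."""
    return max(sum(abs(x) for x in row) for row in a)

IDENTITY = [[1, 0], [0, 1]]
s = -1                                      # assumed value of the sign factor
P = [[1, 1j * s], [1, -1j * s]]
M = [[0.10 + 0.05j, -0.20], [0.15j, 0.10]]  # sample matrix with norm < 1/2

# M_tilde = (I + M)^{-1} - I is the tail of the Neumann series.
M_tilde = mat_sub(mat_inv(mat_add(IDENTITY, M)), IDENTITY)
N = mat_mul(mat_mul(P, M_tilde), mat_inv(P))

m = norm_inf(M)
neumann_bound = m / (1 - m)
print(norm_inf(M_tilde), "<=", neumann_bound)
print(norm_inf(N), "<=", 4 * m)
```

The factor 4 arises as 2 (from the conjugation by P) times 2 (from 1/(1 − ‖M‖) ≤ 2 when ‖M‖ ≤ 1/2).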

Differentiating the matrix identity T T^{-1} = I and applying the chain rule we find that
2 2
2 km km xl kp km 2 xl kr kp
= = .
xi xj xl xi kp xj xl kr xp xi xj
l,p=1 l,p=1

Furthermore, in view of the above calculations we have



ki 1
(1 + N ) 1 (1 + 4 M ) 1 1 + 4 1 < 3 .
xj 2 2 2 2 2
Thus,
2 3 2
km
4 3 sup xl .
xi xj 2 l,r,p kr xp

We now estimate
(1) (1) (1)
2 x1 z1 2 2 2 z1 2 2 g1 2 x2
= + z1 + + and .
ki kj ki kj ki kj kj ki ki kj ki kj

From (27) with g, w and z replaced by g1 , w1 and z1 , respectively, we obtain

2 g1 2 1 w12 1 2w1 z1 w12 2z 2 6w1 z1 + 4w12


2 = 2 +2 2 + 1 1
k1 k1 z1 k1 z1 z13
# $ # $
(2) (3) (2) (3)
2 2 2 2 2 2 2 3
+ + z 1 + 2 + + w1
k12 k12 k1 k1 k12

3 2 4 w1 4 z1 w1 2(w1 z1 )
+2 + +2 + 4
k1 k12 z1 k1 z12 z13
2 5 2 6 1 6 1 6 2
q (0)
+ 2 + 2 +2 3 + .
k1 k12 z1 k1 z12 z1 z13
Hence, by Lemmas 3(ii), 6(ii) and 8(ii),
2
g1 1

k 2 C,,q,A .
1

Similarly we prove that


2
gl 1

ki kj C,,q,A

for all l, i, j {1, 2} because all the derivatives acting on gl are essentially the
same up to constant factors (see [13]). Furthermore, again by Lemma 8(ii),

(1) (1)
2 1 2 1
C,,q,A , C,,q,A ,
kj kj

and

2 2 (k)
(1)
65 21 < 1 2 ,
z1 (k) (1, i(1) ) A
k1 kj 3 l
53

2 2 (k)
(1)
1 2
z2 (k) < .
ki kj 53
Hence,
2
xl 1 2 1

ki kj 53 + C,,q,A .

Therefore,
2 3 2
km
4 3 sup xl 3 2 + C,,q,A 1 .
xi xj 2 l,r,p kr xp 3

This completes the proof of the proposition.

Acknowledgments
I would like to thank Professor Joel Feldman for suggesting this problem and for the
many discussions I have had with him. I am also grateful to Alessandro Michelangeli
for useful comments about the manuscript. This work is part of the author's Ph.D.
thesis [13] defended at the University of British Columbia in Vancouver, Canada.

Appendix A. Quantitative Morse Lemma


Lemma 9 (Quantitative Morse Lemma [13]). Let be a constant with 0 <
< 1 and assume that
f (x1 , x2 ) = x1 x2 + r(x1 , x2 )
is a holomorphic function on D = {(x1 , x2 ) C2 ||x1 | and |x2 | }. Suppose
further that, for all x ∈ D and 1 ≤ i ≤ 2, the function r satisfies
%& ' %
r % 2r %
% % 1
xi (x) a < and % % xi xj
(x) %b<
% 55
,
i,j{1,2}

where a and b are constants. Then f has a unique critical point = (1 , 2 ) D


with |1 | a and |2 | a. Furthermore, let s = max{|1 |, |2 |}. Then there is a
biholomorphic map from the domain D(s)(119b) to a neighbourhood of D
that contains
{(z1 , z2 ) C2 | |zi i | < ( s)(1 55b) for 1 i 2}
such that (f )(z1 , z2 ) = z1 z2 + c, where c C is a constant fulfilling |c r(0, 0)|
a2 . The differential D obeys D I 18b. If x r
1
r
(0, 0) = 0 and x 2
(0, 0) = 0,
then = 0 and s = 0.

Appendix B. Asymptotics for the Coefficients: Proofs


Proof of Proposition 11. We first derive a more general inequality and then we
prove parts (i) and (ii). First observe that, if k T (d )\KR then
|v + (1) (u + d ) | = |Nd , (k)| < < |v|.
Hence,
|v| |2v (v + (1) (u + d ) )| 3|v|.
But
|2v (v + (1) (u + d ) )| = |v (1) (u + d ) |
= |k1 + d1 i(1) (k2 + d2 )| = |z,d (k)|.
Therefore,
1 1 3
. (44)
|z,d (k)| |v| |z,d (k)|
We now prove parts (i) and (ii).
(i) The first inequality of part (i) follows from the above estimate setting (, d ) =
(, 0). To prove the second inequality observe that, since |v| > R 2 > 12
by hypothesis and |v| |z,0 (k)| by (44), on the one hand we have
1 11 1 1
|v| |v| = |v| |v| |v| |v|
4 12 12 6
|z,0 (k)| |k1 + i(1) k2 | |z,0 (k) k1 i(1) k2 | = 2|k2 |.
On the other hand, since |z,0 (k)| < 3|v| by (44),
|k2 | = |2i(1) k2 | = |k1 + i(1) k2 (k1 i(1) k2 )|
= |k1 + i(1) k2 z,0 (k)| + 3|v| 4|v|.
Combining these estimates we obtain the second inequality of part (i).
(ii) Similarly, in view of (44), if k T (d )\KR for (, d ) {(, 0), (  , d)} then
1 1 3 1 1 3
and . (45)
|z,0 (k)| |v| |z,0 (k)| |z  ,d (k)| |v| |z  ,d (k)|
These are the first two inequalities of part (ii). Now, since
 
z  ,d (k) = k1 i(1) k2 + d1 i(1) d2
 
= z  ,0 (k) + d1 i(1) d2 = w,0 (k) + d1 i(1) d2 ,

|w,0 (k)| < , and |d1 i(1) d2 | = |d|, it follows that
|z  ,d (k)| |d| |z  ,d (k)| + .
Furthermore, by (45),
|v| |z  ,d (k)|
< .
6 12 12

Thus,
1
|z  ,d (k)| |d| 2|z  ,d (k)|.
2
This yields the third inequality of part (ii) and completes the proof.

Proof of Lemma 3. We consider all cases at the same time. Therefore, we have
either hypothesis (i) with (, d ) = (, 0) or hypothesis (ii) with (, d ) {(, 0),
(  , d)}. Observe that either (,  ) = (1, 2) or (,  ) = (2, 1).

Step 1. Recall the change of variables (14) and set







  1 

  1
G1 := b G |b d | < R , G2 := b G |b d | R .
4 4
Then G = G1 G2 and G1 , G2 {b # | |Nb (k)| |v|} by Proposition 7.
Furthermore, by Proposition 11, for (, d ) = (, 0) if (i) or (, d ) {(, 0), (  , d)}
if (ii) we have |z,d | 3|v|. Thus, observing the definition of G2 ,


  f (d b) 1
|R1 (k)| := 
(RG G )b,c g(c d )
 
Nb (k)
bG1 cG2

1   |c d |2
1
RG  G |f (d b)| |g(c d )|
|v|  
|c d |2
bG1 cG2

1 1 16 2 C,f,g
RG  G f l1 c g(c) l1 , (46)
|v| R 2 |z,d |R2
and similarly
C,f,g
|R2 (k)| . (47)
|z,d |R2
Hence,

  
f (d b) 1
d ,d (k) = + + (RG G )b,c g(c d )
  
N b (k)
b,cG1 bG1 bG2
cG2 cG

 f (d b)
1 
= (RG  G )b,c g(c d ) + R1 (k) + R2 (k) (48)

Nb (k)
b,cG1

with
C,f,g
|R1 (k) + R2 (k)| . (49)
|z,d |R2
Now, if we set TG G := G RG G and recall the convergent series expansion


1 1
RG  G = (G TG G ) = TGj  G ,
j=0

we can write
 f (d b)
1 
(RG  G )b,c g(c d )

N b (k)
b,cG1

 
f (d b) j
= (TG G )b,c g(c d ). (50)
j=0 
N b (k)
b,cG1

Note that the above equality is fine because G1 is a finite set. Let





 
 1 

  1
G3 := b G |b d | < R , G4 := b G |b d | R .
2 2
Again, observe that G = G3 G4 . Thus, we can break TG G into

TG G = G T G = (G3 + G4 )T (G3 + G4 ) = T33 + T43 + T34 + T44 ,

where Tij := Gi T Gj for i, j {3, 4}. Using this decomposition we prove the
following.

Proposition 16. Under the hypotheses of Lemma 3 we have


 
f (d b) j
(TG G )b,c g(c d )
j=0 
N b (k)
b,cG1

 
f (d b) j
= (T33 )b,c g(c d ) + R3 (k)
j=0 
N b (k)
b,cG1

with R3 (k) given by (75) and


C,f,g
|R3 (k)| . (51)
|z,d |R2
This proposition will be proved below. Combining this with (48) and (50) we
obtain
  3
f (d b) j 
d ,d (k) = (T33 )b,c g(c d ) + Rj (k). (52)
j=0 
Nb (k) j=1
b,cG1

Step 2. We now look in detail at the operator T33 and its powers T33^j . Recall that


(b) = 2 ((1) b2 + ib1 ) and set := (1) so that (1) = (1) . Then,
1

Nb (k) = Nb, (k)Nb, (k)


= (w,d 2i (b d ))(z,d 2i (b d )).

Extend the definition of (y) to any y C2 . Thus,

2(k + d ) A(b
c) = 2i (A(b
c))w,d 2i (A(b
c))z,d .

Hence,
1
Tb,c = (2(c + k) A(b
c) q(b c))
Nc (k)
2(c d ) A(b
c) q(b c) + 2(k + d ) A(b
c)
=
(w,d 2i (c d ))(z,d 2i (c d ))
= Xb,c + Yb,c , (53)

where
2(c d ) A(b
c) q(b c) 2i (A(b
c))w,d
Xb,c := , (54)
(w,d 2i (c d ))(z,d 2i (c d ))
c))z,d
2i (A(b
Yb,c := . (55)
(w,d 2i (c d ))(z,d 2i (c d ))
Let X and Y be the operators whose matrix elements are, respectively, Xb,c and
Yb,c . Set

X33 := G3 XG3 and Y33 := G3 Y G3 .

We next prove the following estimates,



X33 20 A l1 + 4
q l1
1 1
< ,
|z,d |R 3
(56)
8 l1 < 1 ,
Y33  (A)
14
where

|z,d |R := 2|z,d | R.

First observe that the vector b # has the same length as the complex
number 2i (b):

|b| = |(b1 , b2 )| = |b1 + i(1) b2 | = |2i (b)|. (57)

Thus, for b G3 ,


|2i (b d )| |b d | 1
= < .
R R 2
Consequently,
1 1


|z,d 2i (b d )| |z,d | |2i (b d )|
1 2
< = . (58)
1 |z,d |R
|z,d | R
2

Furthermore, for b G ,
1 1 1


(59)
|w,d 2i (b d )| |b d | |w,d | |b d |
1 1
= . (60)
2
Here we have used that |w,d | < < and |b d | 2 for all b G . Using again
that < |c d |/2 for all c G we have
|c d |
< 2. (61)
|c d |
Finally recall that
1 1 1 1
< and < , (62)
6 |z,d | |v| R
where the last inequality follows from Proposition 11 since |v| > R by hypothesis.
Then, using the above inequalities and Proposition 5, the bounds (56) for X33
and Y33 follow from the estimates

   
sup + sup |Xb,c | sup + sup
cG3 bG3 cG3 bG3
bG3 cG3 bG3 cG3

2|c d | |A(b
c)| + | q (b c)| + |2i (A(b
c))| |w,d |

|w,d 2i (c d )| |z,d 2i (c d )|


2  
sup + sup
|z,d |R cG3  bG3 
bG3 cG3
! "
2|c d | |A(b
c)| |
q (b c)| + 2 |A(b
c)|
+
|w,d 2i (c d )| |w,d 2i (c d )|

2  
sup + sup
|z,d |R cG3  bG3 
bG3 cG3
! "
2|c d | |A(b
c)| | q (b c)| + 2 |A(b c)|
+
|c d |
!! " "
2 2
q l 1
2 4+ A l1 +
|z,d |R
& '
l1 + 4 q l1 1
20 A
|z,d |R
& '
l1 + 4 q l1 1 1 1 1
20 A < + =
R 7 4 3

and similarly

 
sup + sup |Yb,c | 8  (A)
l1 < 1 .
cG3 bG bG3 cG 
14
3 3

Step 3. We now look in detail at $T_{33}^j$. For each integer $j \ge 1$ write
\[
T_{33}^j = (X_{33} + Y_{33})^j = Z_j + W_j + Y_{33}^j, \tag{63}
\]
where $W_j$ is the sum of the $j$ terms containing exactly one factor $X_{33}$ and $j - 1$ factors
$Y_{33}$,
\[
W_j := \sum_{m=1}^{j} (Y_{33})^{m-1} X_{33} (Y_{33})^{j-m},
\qquad
Z_j := (X_{33} + Y_{33})^j - W_j - Y_{33}^j .
\]
In view of (56) we have


j
1
Y33 j ,
14
j1
C,A,q 1
Wj j X33 Y33 j1
j ,
|z,d |R 14
j2 j
1 C,A,q 2
Zj (2 j 1) X33
j 2
.
3 |z,d |2R 3
Hence, the series
\[
S := \sum_{j=0}^{\infty} Y_{33}^j = (I - Y_{33})^{-1}, \qquad
W := \sum_{j=1}^{\infty} W_j \qquad \text{and} \qquad
Z := \sum_{j=2}^{\infty} Z_j \tag{64}
\]
converge, and the operator norms of W and Z decay with respect to |z,d |. Indeed,
  j
1
S Y33 j < C,
j=0 j=0
14

   j1
C,A,q 1 C,A,q
W Wj j < ,
j=1
2|z,d | R j=1 14 |z,d |R

   j
C,A,q 2 C,A,q
Z Zj .
j=2
|z,d |R j=2 3

2 |z,d |2R

Thus, we have the expansion
\[
\sum_{j=0}^{\infty} T_{33}^j = S + W + Z .
\]

Step 4. Consequently,
 
f (d b) j
(T33 )b,c g(c d )
j=0 
N b (k)
b,cG1

 f (d b) (S + W + Z)b,c g(c d )


=
(w,d 2i (b d ))(z,d 2i (b d ))
b,cG1

(1) (2)
= ,d + ,d + R4 , (65)
where
(1)
 f (d b) Sb,c (k)g(c d )
,d (k) := ,
(w,d (k) 2i (b d ))(z,d (k) 2i (b d ))
b,cG1
(66)
(2)
 f (d b) Wb,c (k)g(c d )
,d (k) :=
(w,d (k) 2i (b d ))(z,d (k) 2i (b d ))
b,cG1

and
 f (d b) Zb,c (k)g(c d )
R4 (k) := . (67)
(w,d (k) 2i (b d ))(z,d (k) 2i (b d ))
b,cG1

By a short calculation as in (74), using (58) and (60) we find that


(1) 1 2 C,f,g
|,d (k)| f l1 g l1 S ,
2|z,d | R |z,d |R
(2) 1 2 C,A,q,f,g
|,d (k)| f l1 g l1 W , (68)
2|z,d | R |z,d |2R
1 2 C,A,q,f,g
|R4 (k)| f l1 g l1 Z .
2|z,d | R |z,d |3R
Hence, recalling (52) we conclude that
(1) (2) (3)
d ,d = ,d + ,d + ,d ,
where
(3)

4
,d (k) := Rj (k). (69)
j=1

Furthermore, in view of (49), (51) and (68), since


1 1 1
= < ,
|z,d |3R (2|z,d | R)3 |z,d |R2
for 1 j 2 we have
(j) Cj (3) C3
|,d (k)| and |,d (k)| ,
|z,d (k)|jR |z,d (k)|R2
where Cj = Cj;,A,q,f,g and C3 = C3;,,A,q,f,g are constants. This proves the main
statement of the lemma. Finally observe that, since G3 is a finite set, the matrices

X33 and Y33 are analytic in k because their matrix elements are analytic functions of
k. (Note, the functions w,d (k) and z,d (k) are analytic.) Consequently, the matri-
ces Wj and Zj are also analytic and so are Sb,c , Wb,c and Zb,c because the series
(j)
(64) converge uniformly with respect to k. Thus, all the functions ,d (k) are ana-
lytic in the region under consideration. This completes the proof of the lemma.

Proof of Proposition 16. Step 1. Recall that TG G = T33 + T34 + T43 + T44 with
(0) (0) (0) (0)
Tij = Gi T Gj and set X33 := 0, Y34 := T34 , W43 := T43 , and Z44 := T44 . It is
straightforward to verify that, for any integer j 0,
(j) (j) (j) (j)
TGj+1 j+1
 G = T33 + X33 + Y34 + W43 + Z44 , (70)
where
(j) (j1) (j1)
X33 := T33 X33 + T34 W43 : L2G3 L2G3 ,
(j) (j1) (j1)
Y34 := T33 Y34 + T34 Z44 : L2G3 L2G4 ,
(71)
(j) j (j1) (j1)
W43 := T43 T33 + T43 X33 + T44 W43 : L2G4 L2G3 ,
(j) (j1) (j1)
Z44 := T43 Y34 + T44 Z44 : L2G4 L2G4 .

Step 2. Since G1 G4 = G4 G1 = 0 and G1 G3 = G3 G1 = G1 , substituting
(0)
(70) into the sum below for the terms where j 1 we have, recalling that X33 = 0,
 
f (d b) j
(TG G )b,c g(c d )
j=0 
N b (k)
b,cG1

 
f (d b) j
= (T33 )b,c g(c d )
j=0 
Nb (k)
b,cG1

 
f (d b) (j)
+ (X33 )b,c g(c d ). (72)
j=1 
N b (k)
b,cG1

Now recall from (58) and (60) that, for all b G3 ,
1 2 1
, (73)
|Nb (k)| |z,d |R
and observe that G1 G3 . Let M be either TG G or T33 . Then, the estimate



f (d b)
(M j
) g(c d
)
b,c
b,cG Nb (k)
1

 ibx 
 f (d b)  e j e
icx
= , M g(c d
)
||1/2 ||1/2

Nb (k)
bG1 cG1
2 1
f l1 g l1 M j (74)
|z,d |R

implies that the left-hand side and the first term on the right-hand side of (72)
converge because M < 17/18. Thus, the last term in (72) also converges. Hence,
we are left to show that
 
f (d b) (j)
R3 (k) := (X33 )b,c g(c d ) (75)
j=1 
N b (k)
b,cG1

obeys
C,f,g
|R3 (k)| .
|z,d | R2
In order to do this we need the following inequality, which we prove later.

Proposition 17. Consider a constant 0 and suppose that (1 + |b| ) q (b) l1 <
and (1 + |b| )A(b)
l 1 < 2/63. Suppose further that |v| > 2
(1 + |b|
)A(b) l1 .

Then, for any B, C G and m 1,
m
17 1
B TGm G C (1 + (2) m 1 ) sup ,
bB 1 + |b c|
18
cC

where  is the smallest integer greater than or equal to .

Step 3. Now observe that, if b G1 and c G4 then


R R R
|b c| = |b d (c d )| |c d | |b d | = .
2 4 4
Thus, applying the last proposition with = 2 and recalling that G3 G , for
m 0 we have
m+1
3(m + 1) 17
G1 T33
m
T34 G1 TGm G TG G4 = G1 TGm+1
 G G  .
4 1 18
1 + R2
16
Furthermore, since G4 G3 = G4 G1 = 0 and G3 G1 = G1 , from (70) we obtain
(j)
W43 G1 = G4 TGj+1 3 1 4
j+1
 G G G = G TG G G .
1

Hence,
j+1
(j) 17
W43 G1 = G4 TGj+1
 G G TG G
1
j+1
< .
18
Therefore, for 0 m < j,
(jm1)
G1 T33
m
T34 W (jm1) G1 G1 T33
m
T34 W43 G1
j+1
3(m + 1) 17
.
1 18
1 + R2
16

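Iterating the first expression in (71) we find that
\begin{align*}
X_{33}^{(j)} &= T_{34} W_{43}^{(j-1)} + T_{33} X_{33}^{(j-1)} \\
&= T_{34} W_{43}^{(j-1)} + T_{33} T_{34} W_{43}^{(j-2)} + T_{33}^2 X_{33}^{(j-2)} \\
&\;\;\vdots \\
&= T_{34} W_{43}^{(j-1)} + T_{33} T_{34} W_{43}^{(j-2)} + \cdots
   + T_{33}^{j-2} T_{34} W_{43}^{(1)} + T_{33}^{j-1} T_{34} W_{43}^{(0)} \\
&= \sum_{m=0}^{j-1} T_{33}^m T_{34} W_{43}^{(j-m-1)} . \tag{76}
\end{align*}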

Thus, using the above inequality,


% j1 %
% %
(j) % (jm1) %
G1 X33 G1 =% m
G1 T33 T34 W43 G1 %
% %
m=0


j1
(jm1)
G1 T33
m
T34 W43 G1
m=0
j+1 
j1
3 17
(m + 1)
1 18
1 + R2 m=0
16
j+1
3 17
= (j 2 + j) .
1 2 18
2+ R
8

Consequently,
% %
% %
%  % 
%G X
(j)
%
(j)
G1 X33 G1
% 1 33 G1 %


% j=1 % j=1

 j+1
3 17 C
2
(j + j) 2,
1 2 18 R
2 + R j=1
8

where C is a universal constant. Finally, using this and (73), since |z,d | 3|v|
we have


 f (d b)  6C 1

|R3 (k)| = (j)
X33 
g(c d ) f l1 g l1 .
b,cG1 N b (k) j=1 |z ,d | R
2
b,c

In view of (72) and (75) this completes the proof.
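The finiteness of the universal constant in the last step rests on the convergence of the series of terms (j^2 + j)(17/18)^{j+1}. As a hedged illustration (the paper does not state the value of the constant, and the R-dependent prefactor is not reproduced here), the sketch below compares the standard closed form 2r^2/(1 − r)^3 for the sum over j ≥ 1 of (j^2 + j) r^{j+1} with a long partial sum at r = 17/18.

```python
from fractions import Fraction

def closed_form(r):
    """Sum over j >= 1 of (j^2 + j) * r^(j+1), valid for |r| < 1."""
    return 2 * r ** 2 / (1 - r) ** 3

def partial_sum(r, terms):
    """Partial sum of the same series up to j = terms."""
    return sum((j * j + j) * r ** (j + 1) for j in range(1, terms + 1))

RATIO = Fraction(17, 18)
exact = closed_form(RATIO)                # exact rational arithmetic
approx = partial_sum(float(RATIO), 3000)  # floating-point partial sum

print(exact)   # 10404
print(approx)
```

The closed form follows from differentiating the geometric series twice; any constant obtained from it still has to be multiplied by the prefactor appearing in the proof.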



Proof of Proposition 17. For any b, c # set Qb,c := (1 + |b c| )Tb,c . We rst


claim that, for any B, C G ,
 17  17
sup |Qb,c | < and sup |Qb,c | < . (77)
bB cC 18 cC 18
bB

In fact, using the bounds (11), (12) and |k| 3|v|, it follows that

  q(b c) 2c A(b c)
c) 2k A(b

sup |Qb,c | = sup (1 + |b c| )
bB bB Nc (k) Nc (k) Nc (k)
cC cC

1 14 1 4 17
(1 + |b| )
q (b) l1 + (1 + |b| )A(b)
l1 < + = ,
|v| 2 9 18

and similarly we prove the second bound in (77). Furthermore, since |Tb,c | |Qb,c |
for all b, c # , for any integer m 1 we have
 m  m
17 17
sup |(TBC )b,c | <
m
and sup |(TBC )b,c | <
m
.
bB 18 cC 18
cC bB

Now, let p be the smallest integer greater than or equal to , and for any integer
m 1 and any 0 , 1 , . . . , m # , let b = 0 and c = m . Then,
& ' & 'p
|b c| |b c|
|b c| = (2)

(2)
2 2
(2) 
m
= |i1 1 i1 | |ip 1 ip |
(2)p i1 ,...,ip =1


m
(2)p (|i1 1 i1 |p + + |ip 1 ip |p )
i1 ,...,ip =1


m
= (2)p p mp1 |i1 i |p
i=1

*
m
(2)p p mp1 (1 + |i1 i |p ). (78)
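Estimate (78) combines a telescoping triangle inequality along the chain of lattice points with an elementary convexity bound on the p-th power of a sum of m non-negative terms. The following sketch checks those two ingredients only; it does not reproduce the exact constants of (78), and the random test data (integer lengths, points of Z^2) are assumptions made here.

```python
import random

def power_mean_holds(lengths, p):
    """Check (a_1 + ... + a_m)^p <= m^(p-1) * (a_1^p + ... + a_m^p)."""
    m = len(lengths)
    return sum(lengths) ** p <= m ** (p - 1) * sum(a ** p for a in lengths)

def telescoping_holds(points):
    """Check |xi_0 - xi_m| <= sum over i of |xi_{i-1} - xi_i| in the plane."""
    def dist(u, v):
        return ((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2) ** 0.5
    total = sum(dist(points[i - 1], points[i]) for i in range(1, len(points)))
    return dist(points[0], points[-1]) <= total + 1e-9

random.seed(0)
all_ok = True
for _ in range(200):
    m = random.randint(1, 6)          # number of steps in the chain
    p = random.randint(1, 4)          # integer exponent
    lengths = [random.randint(0, 10) for _ in range(m)]
    points = [(random.randint(-5, 5), random.randint(-5, 5))
              for _ in range(m + 1)]
    all_ok = all_ok and power_mean_holds(lengths, p) and telescoping_holds(points)

print(all_ok)
```

The convexity bound is exact in integer arithmetic here, and equality occurs when all the terms are equal, for instance for lengths [5, 5] and p = 3.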
i=1

1
To simplify the notation write s := supbB, cC 1+|bc|
. Hence,

sup |(TGm G )b,c |
bB cC

1 
sup sup (1 + |b c| )|(TGm G )b,c |
bB 1 + |b c| bB
cC
cC


 
s sup |(TGm G )b,c | + (2)p p mp1 sup (1 + |b 1 | )|Tb,1 |
bB cC bB G
1

 
(1 + |1 2 |2 )|T1 ,2 | (1 + |m1 c|2 )|Tm1 ,c |
2 G cC

m 
17
s + (2)p p mp1 sup (1 + |b 1 |2 )|Tb,1 |
18 bB  1 G

 
sup (1+|1 2 |2 )|T1 ,2 | sup (1 + |m1 c|2 )|Tm1 ,c |
1 G G m1 G cC
2

m
17
s (1 + (2)p p mp1 ) ,
18

and similarly we prove the other inequality. Therefore, by Proposition 5,


m
 1 17 1
B TGm G C (1 + (2) 
m ) sup ,
18 bB 1 + |b c|
cC

where  is the smallest integer greater than or equal to . This is the desired
estimate.

Proof of Lemma 4. To simplify the notation write w = w,d , z = z,d , and


|z|R = 2|z| R. First observe that

1 1 w
= + ,
w 2i (c d ) 2i (c d ) 2i (c d )(w 2i (c d ))
 

so that

z 1 w
= +
Nc (k) 2i (c d ) 2i (c d )(w 2i (c d ))
2i (c d ) 1
+
w 2i (c d ) z 2i (c d )
 

=: c(0) + c(w) + c(z) ,

where, in view of (58) to (61), since |w| < ,

1 4
|c(0) | , |c(w) | and |c(z) | .
2 22 |z|R

Hence,
c))z
2i (A(b
Yb,c =
Nc (k)
c)) (0) 2i (A(b
= 2i (A(b c
c)) (w) 2i (A(b
c
c)) (z)
c
(0) (w) (z)
=: Yb,c + Yb,c + Yb,c .
() ()
Let Y ( ) be the operator whose matrix elements are Yb,c and set Y33 :=
G3 Y ( ) G3 . Then, similarly as we estimated Y33 , using (58) to (61) and
Proposition 5, it follows easily that
(0) 1 l1 , Y (w)  (A)
l1 , Y (z) 4  (A)
Y33  (A) l1 .
2 33
22 33
|z|R
Furthermore,
S = (I Y33 )1 = 1 + (1 Y33 )1 Y33 = 1 + SY33
2(0) (w) (z)
= 1 + (1 + SY33 )Y33 = 1 + Y33 + Y33 + Y33 + SY33 ,
where, recalling (56),
2
1 Y33 2 14 8
SY33
2
(1 Y33 ) Y33 2
<  (A)
21 .
1 Y33 13 l

Combining all this we have


z Sb,c (0) (w) (0) (w) (z) 2 (z)
= (b + b )(b,c + Yb,c + Yb,c + Yb,c + (SY33 )b,c ) + b Sb,c
Nb (k)
(0) (0) (0) (w) (w) (0) (w)
= [b (b,c + Yb,c )] + [b Yb,c + b (b,c + Yb,c + Yb,c )]
(0) 2 (w) (0) (w) (z) (z)
+ [(b + b )(SY33 )b,c ] + [(b + b )Yb,c + b Sb,c ]
(0) (1) (2) (3)
=: Kb,c + Kb,c + Kb,c + Kb,c
with

(0) 1 1
|Kb,c | 1+  (A) l1 ,

2 2

(1) 1
|Kb,c |  (
A) l 1 + 1 +  (
A) l 1 +  (
A) l 1
43 22 2 2

7
< 1 +
l1 ,
 (A)
22 6
2
(2) 1 8 21 < 64  (A)
|Kb,c |  (A) l
21 ,
l
3
(3) 3 l1 4 + 14 4 < C,A
|Kb,c |  (A)
2 |z|R 13 |z|R |z|R
for all b, c G3 . Here, to estimate |Kb,c | we have used that < /6.
(1)

Finally, recalling (66) and using the above estimates we nd that


 z Sb,c
f (d b) g(c d )
(1)
z,d (k),d (k) =

N b (k)
b,cG1

 3
f (d b) K g(c d )
(j)
= b,c
b,cG1 j=0

(1,0) (1,1) (1,2) (1,3)


=: ,d + ,d (w(k)) + ,d (k) + ,d (k), (79)
where, in particular,
! "
 f (d b) c))
 (A(b
g(c d ).
(1,0)
,d = b,c + (80)
2i (b d )  (c d )
b,cG1

(1,j)
Furthermore, for 0 j 2, it follows easily from (79) that |,d | Cj with

1 1
C0 := 1+  (A)
l1 f l1 g l1 ,
2 2

7
C1 := 1+ (A) l1 f l1 g l1 ,
 (81)
22 6
64
C2 :=  (A)
21 f l1 g l1 ,
l
3
while for j = 3,
(1,3) 1
|,d | C,A,f,g .
|z|R
This completes the proof of the lemma.

Proof of Lemma 5. To prove this lemma we apply the following (well-known)


inequality (see [13] for a proof).

Proposition 18. Let and be constants with 1 < 2 and 1 < 2. Suppose
that f is a function on # obeying |b| f (b) l1 < . Then, for any 1 , 2 #
with 1 = 2 ,

 |f (b 1 )| C 1 if , < 2,

#
|b 2 | |1 2 |+2
ln|1 2 | if = 2 or = 2,
b \{1 ,2 }

where C = C# ,,,f is a constant.


First observe that {b} TGm G {c} = |(TGm G )b,c |. Hence, by Proposition 17 with
= 2, for all b, c G and m 1,
m
17 1
|(TG G )b,c | = {b} TG G {c} (1 + 2m)
m m
.
18 1 + |b c|2

Note that this inequality is also valid for m = 0. Thus,




  f (d b) m
|d ,d (k)| = (TG G )b,c g(c d )
m=0 b,cG Nb (k)
! m  "
1  17  |g(c d )|
(1 + 2m) |f (d b)|
|v| m=0 18 1 + |b c|2
bG cG

C   |g(c d )|
|f (d b)| |g(b d )| + , (82)
|v|  
|b c|2
bG cG \{b}

where C is a universal constant.


Now, by the triangle inequality, Hölder's inequality, and since l2 l1 ,

|f (d b)| |g(b d )|
bG
 |d d |2
=  d |2
|f (d b)| |g(b d )|

|d
bG

4 
(|d b|2 + |b d |2 ) |f (d b)| |g(b d )|
|d d |2 
bG

4
( b2 f (b) l2 g l2 + f l2 b2 g(b) l2 )
|d d |2
4 Cf,g
( b2 f (b) l1 g l1 + f l1 b2 g(b) l1 )  . (83)
|d d |2 |d d |2
Furthermore, by Proposition 18 with = = 2, for any 0 < 1 < 2,
 |g(c d )| ln|b d | C# ,g, 1
C# ,g
 |2
.

|b c|2 |b d |b d |2 1
cG \{b}

Applying this inequality and (83) to (82) we obtain


! "
C Cf,g  |f (d b)|
|d ,d (k)| + C# ,g, 1 .
|v| |d d |2 
|b d |2 1
bG

Again, by Proposition 18 with = 2 and = 2 1 we conclude that, for any


0 < 2 < 2 1 ,
& '
C Cf,g ln |d d | C,# ,f,g, 1 , 2
|d ,d (k)|  
+ C# ,f,g, 1  
.
|v| |d d | 2 |d d | 2 1 |v| |d d |2 1 2
Finally, recall from Proposition 11(ii) that |z  ,d | < 3|d| and |z  ,d | < 3|v|, observe
that |d d | = |d|, and set  = 1 + 2 . Then, for any 0 <  < 2,
C,# ,f,g, 1 , 2 C,# ,f,g,
|d ,d (k)| .
|d| |d|2 1 2 |z  ,d |3
Choosing  = 101 we obtain the desired inequality.

Appendix C. Bounds on the Derivatives: Proofs


Proof of Lemma 6. Step 0. When there is no risk of confusion we shall use the
same notation to denote an operator or its matrix. Dene

FBC := [f (b c)]bB,cC , GBC := [g(b c)]bB,cC ,


G (k) := [d ,d (k; G)]d ,d G .

Here FBC and GBC are |B| |C| matrices and G (k) is a |G| |G| matrix. First
observe that

 f (d b)
G (k) = 1
(RG 
 G )b,c g(c d )

N b (k)
b,cG
d ,d G

can be written as the product of matrices FGG 1 1


k RG G GG G . Furthermore, since
on L2G we have 1 1
k RG G = (RG G k )
1
= Hk1 , we can write G (k) as
1
FGG Hk GG G . Hence,

n+m n+m Hk1


G (k) = F GG  GG G . (84)
k1n k2m k1n k2m
This is the quantity we want to estimate.

m0
Step 1. Let T = T (k) be an invertible matrix. Then applying m
ki 0
to the identity
m0
T T 1 = I and using the Leibniz rule for m
ki 0
(T T 1) we nd that


m 0 1
m0 m1
m0 T 1 1 m0 T m1 T 1
m0 = T m0 m1 .
ki m =0
m1 ki kim1
1

Iterating this formula m0 1 times we obtain



m0 T 1 * 1 mj1
m0 mj1
mj1 mj
T mm0 T 1
m0 = (T 1 ) mj1 mj m
ki j=1 m =0 j
mj ki ki m0

1
0 1 mj1
m*
mj1

mj1 mj
T
= (T 1 ) mj1 mj
j=1 mj =0
mj ki
mm0 1 1
 mm0 1 mm0 1 mm0 T mm0 T 1
(T 1 ) mm 1 mm m
mm0 =0
m m0 ki 0 0 ki m0

m* 1 mj1
0 1 mj1
mj1 mj
T mm0 1 T
= (1)m0 T 1 mj1 mj T 1 mm0 1 T 1 .
j=1 mj =0
mj ki ki
(85)
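Formula (85) is obtained by iterating the first-order case of the identity in Step 1, namely that the derivative of T^{-1} equals −T^{-1} T' T^{-1}. A minimal numerical sketch of that first-order case follows; the matrix family `T(k)` is a made-up example and the comparison uses a central finite difference, so this is a sanity check under stated assumptions rather than part of the proof.

```python
def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def T(k):
    """Hypothetical invertible matrix depending smoothly on k."""
    return [[2.0 + k, 1.0], [k * k, 3.0]]

def dT(k):
    """Entrywise derivative of T with respect to k."""
    return [[1.0, 0.0], [2.0 * k, 0.0]]

def d_inverse_formula(k):
    """Derivative of T^{-1} from the identity -T^{-1} T' T^{-1}."""
    t_inv = inv2(T(k))
    prod = mul(mul(t_inv, dT(k)), t_inv)
    return [[-x for x in row] for row in prod]

def d_inverse_numeric(k, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    plus, minus = inv2(T(k + h)), inv2(T(k - h))
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

print(max_abs_diff(d_inverse_formula(0.5), d_inverse_numeric(0.5)))
```

Higher derivatives then follow by applying the Leibniz rule to this identity, which is where the nested sums and binomial factors come from.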

Step 2. In view of (85), it is not difficult to see that $\partial^m H_k^{-1} / \partial k_2^m$ is given by a finite
linear combination of terms of the form

*
m
nj
H
Hk1 nj
k 1
Hk , (86)
j=1
k 2

+m m 1
n Hk n
where j=1 nj = m. Thus, when we compute k1n k2m , the derivative k1n acts
nj , -
either on Hk1 or nHjk . However, since H
k2 b,c = 2(k2 + b2 )b,c 2A2 (b c), we
k
k 2
n n
n j Hk n j Hk n Hk
have k1n knj = 0 if nj 1 and k1n knj = k1n if nj = 0. Similarly, using
2 2
n
Hk1
again (85), one can see that is given by a nite linear combination of terms of
k1n
+n
the form (86), with m and k2 replaced by n and k1 , respectively, and j=1 nj = n.
Therefore, combining all this we conclude that $\partial^{n+m} H_k^{-1} / \partial k_1^n \partial k_2^m$ is given by a finite linear
combination of terms of the form

*
n+m
nj
H
1 1 k 1 R1  ,
k RG G nj k GG (87)
j=1
k ij

+ +n+m
where n+mj=1 nj 2,ij = m and j=1 nj 1,ij = n, that is, where the sum of nj for
which ij = 2 is equal to m, and the sum of nj for which ij = 1 is equal to n.
nj
Hk 1
Step 3. The first step in bounding (87) is to estimate n k G . A simple
ki j
j
calculation shows that

# $
2(kij + bij )b,c + 2Aij (b c) if nj = 1,

nj Hk 1 1
n = 2b,c if nj = 2,
kijj k Nc (k)

b,c 0 if nj 3.

Furthermore, by Proposition 7,
1 1

|Nb (k)| |v|
for all b G , while by Proposition 3 we have
1 2
(88)
|Nb (k)| |v|
and
2
|ki + bi | |ui + bi | + |vi | |v| + |u + b| |Nb (k)|

for all b G if G = {0, d}, and for all b G \{b} if G = {0}. Furthermore,

|b| + |u| + |v| < + 3|v|, (89)



since |u| < 2|v| because k T0 . Now, let 1B (x) be the characteristic function of
the set B. Then, using the above estimates we have
# $
 nj Hk
sup 1
nj k G


cG 
bG  kij b,c
! "
 2|kij + bij |nj ,1 + 2nj ,2 2|Aij (b c)|
sup b,c + nj ,1
cG |Nb (k)| |Nb (k)|
bG
! "
2|kij + bij | + 2 2|Aij (b c)|
sup b,c + 1G (b)
cG |Nb (k)| |Nb (k)|
! "
 2|kij + bij | + 2 2|Aij (b c)|
+ sup b,c +
cG 
|Nb (k)| |Nb (k)|
bG \{
b}

2|kij + bij | + 2 + 2 A
l1
1G (b)
|v|
!& ' "
 4 2 2|Aij (b c)|
+ sup + b,c +
cG 
|Nb (k)| |Nb (k)|
bG \{
b}

2 l1 )1G (b) + 4 + 4 + 4 A
(2(|u| + |v| + |b|) + 2 + 2 A l1
|v| |v| |v|
2 l1 )1G (b) + 4 + 4 + 4 A
(12|v| + 2 + 2 + 2 A l1
|v| |v| |v|
1G (b) 1 C,A + C,A .

Similarly,
# $

 nj Hk
sup 1 1G (b) 1 C,A + C,A .
k nj k G
bG cG

ij b,c

Hence, by Proposition 5,
% %
% nj H %
% k 1 % 1 C,A + C,A .
% G % 1G (b)

% kinjj k %

Step 4. By a similar (and much simpler) calculation (using Proposition 5) we get

FGG f l1 ,
GGG g l1 ,
(90)
1 2
1 
k G 1G (b)
 + (1 1G (b)) .
|v| |v|

From Lemma 1 we have (RG G )1 18. Thus, the operator norm of (87) is
bounded by
% %
% n+m %
% * 1 1 nj Hk %
% k RG G 1 1 %
k RG G %
% nj
% j=1 k ij %
% %
* %
n+m nj
% Hk 1
%
% 1
1 1
k RG G % G % RG
% kinjj k %
 G ,

j=1

which is bounded either by



1 *
n+m
1
18 (1 C,A + C,A ) 18 (n+m+1) C,A,n,m
|v| j=1
|v|

if G = {0}, or by

1 *
n+m
1
18 C,A 18 g l1 C,A,n,m
|v| j=1
|v|

if G = {0, d}. Therefore,


% n+m 1 %
% Hk %  C C C
% %
% k n k m % |v|
Cn,m
|v|

|v|
, (91)
1 2 nite sum where
# of terms depend
on n and m

with C = C,,A,n,m if G = {0} or C = C,A,n,m if G = {0, d}. Finally, recalling


(84) and (90) we have
n+m
n+m 1
FGG Hk
G
k n k m G =
(k) n
k1 k2m G G

1 2
% n+m 1 %
% Hk % C
FGG % %
% k n k m % GG G |v| ,
1 2

where C = C,,A,n,m,f,g if G = {0} or C = C,A,n,m,f,g if G = {0, d}. This is the


desired inequality. The proof of the lemma is complete.

Proof of Lemma 7. Let $\mathbb{R}_+$ be the set of non-negative real numbers and let $\sigma$ be a real-valued function on $\mathbb{R}_+$ such that:

(i) $\sigma(t) \ge 1$ for all $t \in \mathbb{R}_+$ with $\sigma(0) = 1$;
(ii) $\sigma(s)\sigma(t) \ge \sigma(s + t)$ for all $s, t \in \mathbb{R}_+$;
(iii) $\sigma$ increases monotonically.

For example, for any $\beta \ge 0$ the functions $t \mapsto e^{\beta t}$ and $t \mapsto (1 + t)^{\beta}$ satisfy these properties. Now, let $T$ be a linear operator from $L^2_C$ to $L^2_B$ with $B, C \subset \Gamma^{\#}$ (or a matrix $T = [T_{b,c}]$ with $b \in B$ and $c \in C$) and consider the $\sigma$-norm

$$\|T\|_\sigma := \max\Big\{\sup_{b\in B}\sum_{c\in C}|T_{b,c}|\,\sigma(|b-c|),\ \sup_{c\in C}\sum_{b\in B}|T_{b,c}|\,\sigma(|b-c|)\Big\}.$$

In [13] we prove that this norm has the following properties.

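Proposition 19 (Properties of $\|\cdot\|_\sigma$). Let $S$ and $T$ be linear operators from $L^2_C$ to $L^2_B$ with $B, C \subset \Gamma^{\#}$. Then:

(a) $\|T\| \le \|T\|_{\sigma\equiv 1} \le \|T\|_\sigma$;
(b) If $B = C$, then $\|ST\|_\sigma \le \|S\|_\sigma\,\|T\|_\sigma$;
(c) If $B = C$, then $\|(I + T)^{-1}\|_\sigma \le (1 - \|T\|_\sigma)^{-1}$ if $\|T\|_\sigma < 1$;
(d) $|T_{b,c}| \le \frac{1}{\sigma(|b-c|)}\,\|T\|_\sigma$ for all $b \in B$ and all $c \in C$.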
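As a quick numerical illustration of properties (b) and (d), the following Python sketch evaluates a weighted row/column-sum norm on a small finite index set. It is only a sanity check under stated assumptions: the names `weight`, `weighted_norm` and `matmul`, the index set (a 3 x 3 block of the integer lattice), the exponent 1.5 and the random test matrices are all illustrative choices made here, not taken from the paper or from [13].

```python
import itertools
import math
import random

def weight(t, beta=1.5):
    """Example weight (1 + t)**beta; beta = 1.5 is an arbitrary illustrative choice."""
    return (1.0 + t) ** beta

def weighted_norm(T, points, w=weight):
    """Maximum of the weighted row sums and weighted column sums of the matrix T."""
    n = len(points)
    W = [[w(math.dist(points[b], points[c])) for c in range(n)] for b in range(n)]
    rows = max(sum(abs(T[b][c]) * W[b][c] for c in range(n)) for b in range(n))
    cols = max(sum(abs(T[b][c]) * W[b][c] for b in range(n)) for c in range(n))
    return max(rows, cols)

def matmul(S, T):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(S)
    return [[sum(S[i][k] * T[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# Hypothetical index set: a 3 x 3 block of the integer lattice.
points = list(itertools.product(range(3), repeat=2))
rng = random.Random(0)
n = len(points)
S = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
T = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

norm_S = weighted_norm(S, points)
norm_T = weighted_norm(T, points)
norm_ST = weighted_norm(matmul(S, T), points)

# Property (b): the weighted norm of a product is at most the product of the norms.
submultiplicative = norm_ST <= norm_S * norm_T * (1 + 1e-12)

# Property (d): each entry decays at least like norm / weight(|b - c|).
entrywise = all(
    abs(T[b][c]) * weight(math.dist(points[b], points[c])) <= norm_T * (1 + 1e-12)
    for b in range(n) for c in range(n)
)
print(submultiplicative, entrywise)
```

The submultiplicativity check relies on the weight being monotone and satisfying the product inequality in (ii), together with the triangle inequality for the distance between indices.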

Now, by using these properties we prove Lemma 7. We follow the same notation
as above. First observe that, similarly as in the last proof we can write

d ,d (k) = F{d }G 1 1 1


k RG G GG {d } = F{d }G Hk GG {d } .

Now, let (|b|) = (1 + |b|) , and observe that there is a positive constant C such
that (|b|) C (1 + |b| ) for all b # . Then, it is easy to see that

F{d }G = f C (1 + |b| )f (b) l1 ,


GG {d } = g C (1 + |b| )g(b) l1 .

Furthermore, by (77) and Proposition 5,




1 1
RG  G = (I + TG G ) TG G j < 18, (92)
j=0

and since for diagonal operators the $\sigma$-norm and the operator norm agree, from
(90) we have
2
1
k G .
|v|
Hence, in view of Propositions 19(b) and 11(ii),
1
|d ,d (k)| F{d }G 1 1
k RG G GG {d } C,f,g,,A,m,n ,
|d|
and by repeating the proof of Lemma 6 with the operator norm replaced by the
$\sigma$-norm we obtain
% n+m %
% % 1
% %
% k n k m d ,d (k)% C,f,g,,A,m,n |d| .
1 2

Therefore, by Proposition 19(d), for any integers $n$ and $m$ with $n + m \ge 0$,


n+m % n+m %
1 % %
(k) % (k)%
k n k m d ,d
 
1 + |d d | % k n k m d ,d
 
%
1 2 1 2
1
C,f,g,,A,m,n .
|d|1+
This is the desired inequality.

Proof of Lemma 8. Define the operator $M^{(j)} : L^2_{G'_3} \to L^2_{G'_3}$ as

$$M^{(j)} := \begin{cases} S & \text{if } j = 1,\\ W & \text{if } j = 2,\\ Z & \text{if } j = 3,\end{cases}$$

where $S$, $W$ and $Z$ are given by (64). In order to prove Lemma 8, we first prove
the following proposition.

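Proposition 20. Assume the hypotheses of Lemma 8. Then, for any integers $n$ and $m$ with $n + m \ge 1$ and for $1 \le j \le 3$,

$$\Big\|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\,\Delta_k^{-1}M^{(j)}\Big\| \le \frac{C_j}{(2|z_{\nu,d'}(k)| - R)^j},$$

where $C_1 = C_{1;\Lambda,A,n,m}$ and $C_j = C_{j;\Lambda,A,q,n,m}$ for $2 \le j \le 3$ are constants. Furthermore,

$$C_{1;\Lambda,A,1,0} \le \frac{13}{\Lambda^2},\qquad C_{1;\Lambda,A,0,1} \le \frac{13}{\Lambda^2}\qquad\text{and}\qquad C_{1;\Lambda,A,1,1} \le \frac{65}{\Lambda^3}.$$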

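Proof. Step 0. To simplify the notation write $w = w_{\nu,d'}$, $z = z_{\nu,d'}$ and $|z|_R = 2|z| - R$. First observe that, for any analytic function of the form $h(k) = h(w(k), z(k))$ we have

$$\frac{\partial h}{\partial k_1} = \Big(\frac{\partial}{\partial w} + \frac{\partial}{\partial z}\Big)h,\qquad \frac{\partial h}{\partial k_2} = i(-1)^{\nu}\Big(\frac{\partial}{\partial w} - \frac{\partial}{\partial z}\Big)h.$$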
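The change of variables above can be checked numerically. The sketch below assumes $w = k_1 + i(-1)^{\nu}k_2$ and $z = k_1 - i(-1)^{\nu}k_2$ (up to additive constants, which do not affect derivatives); this form is inferred from the derivative identities used later in the proof and is an assumption here. The test function `h` and the sample point are hypothetical choices for illustration only.

```python
import cmath

def h(w, z):
    """Hypothetical analytic test function of the two variables (w, z)."""
    return w * w * z + cmath.exp(w - 2 * z)

def h_w(w, z):
    """Exact partial derivative of h with respect to w."""
    return 2 * w * z + cmath.exp(w - 2 * z)

def h_z(w, z):
    """Exact partial derivative of h with respect to z."""
    return w * w - 2 * cmath.exp(w - 2 * z)

def chain_rule_errors(nu, k1, k2, step=1e-6):
    """Compare central differences in k1, k2 with the chain-rule expressions."""
    s = (-1) ** nu
    w = lambda a, b: a + 1j * s * b   # assumed form of w(k)
    z = lambda a, b: a - 1j * s * b   # assumed form of z(k)
    H = lambda a, b: h(w(a, b), z(a, b))
    d1 = (H(k1 + step, k2) - H(k1 - step, k2)) / (2 * step)
    d2 = (H(k1, k2 + step) - H(k1, k2 - step)) / (2 * step)
    W, Z = w(k1, k2), z(k1, k2)
    expected1 = h_w(W, Z) + h_z(W, Z)
    expected2 = 1j * s * (h_w(W, Z) - h_z(W, Z))
    return abs(d1 - expected1), abs(d2 - expected2)

errors_nu1 = chain_rule_errors(1, 0.3 + 0.1j, -0.2 + 0.4j)
errors_nu2 = chain_rule_errors(2, 0.3 + 0.1j, -0.2 + 0.4j)
print(errors_nu1, errors_nu2)
```

Both error pairs should be small (limited by the finite-difference step), for either value of the index.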

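Thus,

$$\Big\|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\,\Delta_k^{-1}M^{(j)}\Big\| = \Big\|(i(-1)^{\nu})^m\sum_{p=0}^{m}\sum_{r=0}^{n}\binom{m}{p}\binom{n}{r}(-1)^{m-p}\,\frac{\partial^{n-r+m-p}}{\partial z^{n-r+m-p}}\frac{\partial^{r+p}}{\partial w^{r+p}}\,\Delta_k^{-1}M^{(j)}\Big\| \le 2^{n+m}\sup_{p\le m}\,\sup_{r\le n}\Big\|\frac{\partial^{n-r+m-p}}{\partial z^{n-r+m-p}}\frac{\partial^{r+p}}{\partial w^{r+p}}\,\Delta_k^{-1}M^{(j)}\Big\|.$$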

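Now, by the Leibniz rule,

$$\Big\|\frac{\partial^{n+m}}{\partial z^n\,\partial w^m}\,\Delta_k^{-1}M^{(j)}\Big\| = \Big\|\sum_{p=0}^{m}\sum_{r=0}^{n}\binom{m}{p}\binom{n}{r}\,\frac{\partial^{n-r+m-p}\Delta_k^{-1}}{\partial z^{n-r}\,\partial w^{m-p}}\,\frac{\partial^{r+p}M^{(j)}}{\partial z^r\,\partial w^p}\Big\| \le 2^{n+m}\sup_{p\le m}\,\sup_{r\le n}\Big\|\frac{\partial^{n-r+m-p}\Delta_k^{-1}}{\partial z^{n-r}\,\partial w^{m-p}}\Big\|\,\Big\|\frac{\partial^{r+p}M^{(j)}}{\partial z^r\,\partial w^p}\Big\|.$$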
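The factor $2^{n+m}$ in the last inequality is just the total weight of the binomial coefficients in the double Leibniz sum. A minimal check, with the helper name `leibniz_weight` chosen here for illustration:

```python
from math import comb

def leibniz_weight(n, m):
    """Sum of binom(m, p) * binom(n, r) over 0 <= p <= m and 0 <= r <= n."""
    return sum(comb(m, p) * comb(n, r) for p in range(m + 1) for r in range(n + 1))

# The double sum factorises as (sum_p binom(m, p)) * (sum_r binom(n, r)) = 2**m * 2**n.
checks = {(n, m): leibniz_weight(n, m) == 2 ** (n + m) for n in range(6) for m in range(6)}
print(all(checks.values()))
```

Bounding every term of the sum by the supremum of the norms therefore costs exactly this factor.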
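Furthermore, we shall prove below that

$$\sup_{p\le m}\,\sup_{r\le n}\Big\|\frac{\partial^{n-r+m-p}\Delta_k^{-1}}{\partial z^{n-r}\,\partial w^{m-p}}\Big\|\,\Big\|\frac{\partial^{r+p}M^{(j)}}{\partial z^r\,\partial w^p}\Big\| \le \frac{C_{j,n,m}}{|z|_R^{\,n+j}}, \tag{93}$$

with constants $C_{1,n,m} = C_{1,n,m;\Lambda,A}$ and $C_{j,n,m} = C_{j,n,m;\Lambda,A,q}$ for $2 \le j \le 3$. Hence,

$$\Big\|\frac{\partial^{n+m}}{\partial z^n\,\partial w^m}\,\Delta_k^{-1}M^{(j)}\Big\| \le 2^{n+m}\,\frac{C_{j,n,m}}{|z|_R^{\,n+j}}.$$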

Therefore, being careful with the indices,

$$\Big\|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\,\Delta_k^{-1}M^{(j)}\Big\| \le 2^{n+m}\sup_{p\le m}\,\sup_{r\le n}\,2^{\,n-r+m-p+r+p}\,\frac{C_{j,n-r+m-p,r+p}}{|z|_R^{\,n-r+m-p+j}} \le \frac{C_j}{|z|_R^{\,j}},$$

where $C_1 = C_{1;\Lambda,A,n,m}$ and $C_j = C_{j;\Lambda,A,q,n,m}$ for $2 \le j \le 3$. This is the desired inequality. We are left to prove (93) and estimate the constants $C_{1;\Lambda,A,i,j}$ for $i, j \in \{0, 1\}$ to finish the proof of the proposition.

Step 1. The first step for obtaining (93) is to estimate $\big\|\frac{\partial^{r+p}\Delta_k^{-1}}{\partial z^r\,\partial w^p}\,\pi_{G'_3}\big\|$. Observe that

r+p 1 r+p (1 )
k b,c
k
=
z r wp b,c z r wp
p
1 r b,c

= p
w w 2i (b d ) z r z 2i (b d )

(1)p p! (1)r r! b,c
=
(w 2i (b d ))p+1 (z 2i (b d ))r+1
p! r! b,c

,
|w 2i (b d )|p+1 |z 2i (b d )|r+1
and recall from (58) and (59) that, for all b G3 ,
1 2 1 1
and . (94)
|z 2i (b d )| |z|R |w 2i (b d )|
Then,

r+p 1 p! r! 2r+1
b,c
k
p+1 r+1 ,
z r wp b,c |z|R

and consequently,

  r+p 1
sup + sup k

bG3 cG cG3 bG 
z r wp b,c
3 3


p! r! 2 r+1   r+2
sup + sup b,c = p! r! 2 .
p+1 |z|Rr+1
bG3 cG cG3 bG p+1 |z|r+1
R
3 3

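Therefore, by Proposition 5,

$$\Big\|\frac{\partial^{r+p}\Delta_k^{-1}}{\partial z^r\,\partial w^p}\,\pi_{G'_3}\Big\| \le \frac{p!\,r!\,2^{r+2}}{\Lambda^{p+1}}\,\frac{1}{|z|_R^{\,r+1}}. \tag{95}$$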

Step 2. We now estimate the second factor in (93). Let us first consider the case $j = 1$, that is, $M^{(1)} = S$. Since $S = (I - Y_{33})^{-1}$, the operator $S$ is clearly invertible. Thus, by applying (85) with $T = S^{-1}$, one can see that $\frac{\partial^p S}{\partial w^p}$ is given by a finite linear combination of terms of the form

$$\prod_{j=1}^{p}\Big(S\,\frac{\partial^{n_j}S^{-1}}{\partial w^{n_j}}\Big)S, \tag{96}$$

where $\sum_{j=1}^{p} n_j = p$. Hence, when we compute $\frac{\partial^r}{\partial z^r}\frac{\partial^p S}{\partial w^p}$, the derivative $\frac{\partial^r}{\partial z^r}$ acts either on $S$ or on $\frac{\partial^{n_j}S^{-1}}{\partial w^{n_j}}$. Similarly, using again (85) with $T = S^{-1}$, one can see that $\frac{\partial^r S}{\partial z^r}$ is given by a finite linear combination of terms of the form (96), with $p$ and $w$ replaced by $r$ and $z$, respectively, and $\sum_{j=1}^{r} m_j = r$. Thus, we conclude that $\frac{\partial^{r+p}S}{\partial z^r\,\partial w^p}$ is given by a finite linear combination of terms of the form

$$\prod_{j=1}^{r+p}\Big(S\,\frac{\partial^{m_j+n_j}S^{-1}}{\partial z^{m_j}\,\partial w^{n_j}}\Big)S, \tag{97}$$

where $\sum_{j=1}^{r+p} m_j = r$ and $\sum_{j=1}^{r+p} n_j = p$. Indeed, observe that the general form of the terms (97) follows directly from (85) because that identity is also valid for mixed derivatives.
Since $S = (I - Y_{33})^{-1}$ with $\|Y_{33}\| < 1/14$ and

2i (A(b c)) z


Yb,c = , (98)
(w 2i (c d ))(z 2i (c d ))


we have

$$\|S\| = \|(I - Y_{33})^{-1}\| \le \frac{1}{1 - \|Y_{33}\|} \le \frac{14}{13} \tag{99}$$
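The bound 14/13 is the usual Neumann-series estimate: if a matrix has norm below 1/14, the geometric series for its resolvent sums to at most 1/(1 - 1/14). The sketch below illustrates this on a small random matrix. It uses the maximum absolute row sum as the norm, which is an assumption made here for simplicity (any submultiplicative norm behaves the same way), and all names and sizes are illustrative.

```python
import random

def row_sum_norm(M):
    """Maximum absolute row sum, a submultiplicative matrix norm."""
    return max(sum(abs(x) for x in row) for row in M)

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(Y, terms):
    """Return I + Y + Y^2 + ... + Y^terms."""
    n = len(Y)
    total, power = identity(n), identity(n)
    for _ in range(terms):
        power = matmul(power, Y)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

rng = random.Random(1)
n = 5
raw = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
scale = 0.07 / row_sum_norm(raw)          # 0.07 is just below 1/14
Y = [[scale * x for x in row] for row in raw]

S_approx = neumann_partial_sum(Y, 40)     # approximates (I - Y)^(-1)
I_minus_Y = [[(1.0 if i == j else 0.0) - Y[i][j] for j in range(n)] for i in range(n)]
residual = matmul(I_minus_Y, S_approx)
defect = max(abs(residual[i][j] - (1.0 if i == j else 0.0)) for i in range(n) for j in range(n))
print(row_sum_norm(S_approx), 14 / 13, defect)
```

The printed norm should stay below 14/13, and the defect confirms that the partial sum is numerically the inverse of I - Y.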

and
j+l
j+l

S 1
= j l Yb,c
z w
j l
b,c z w

j 2i  (A(b
c)) z l
1
= j .
z z 2i (c d ) wl w 2i (c d )

Furthermore,

j 2i (A(b
c)) z c)) 2i (c d )
(1)j1 j! 2i (A(b

= for j 1,
z z 2i (c d )
j (z 2i (c d ))j+1
l 1 (1)l l!

= for l 0.
w w 2i (c d )
l (w 2i (c d ))l+1
Recall from (59) and (61) that, for all c G ,
|c d | |c d |

2. (100)
|w 2i (c d )| |c d |
Then, using this and (94), for j 1 and l 0,

j+l j! l! |A(b
c)| |c d |
1
S
z w
j l
b,c |z 2i (c d )| |w 2i (c d )| |w 2i (c d )|
 j+1  l

2j+2 j! l! |A(b
c)|
, (101)
l |z|j+1
R

while for j = 0 and l 0,



j+l l! |A(b
c)| |z|
1
S
z w
j l
b,c |z 2i (c d )| |w 2i (c d )|l+1


2 l! |A(b
c)|
. (102)
l+1
Consequently,

  j+l
sup 1
+ sup S
bG3  cG3 
z j wl b,c
cG3 bG3

j+2  
|z|R 2 j! l! |A(b
1 0,j + 0,j sup + sup c)|
2 l |z|j+1
R bG3  cG3
cG3  bG3
j+3
|z|R 2 j! l!
1 0,j + 0,j A l1 .
2 l |z|j+1
R

Therefore, by Proposition (5),


% j+l % j+3
% % |z|R 2 j! l!
% 1 %
% z j wl S % 1 0,j + 2 0,j l |z|j+1 A l1 . (103)
R
+r+p
Thus, for r 1, in view of (97) where j=1 mj = r,

% r+p % * % m +n %
% % r+p
% j j %
% % S % 1 %
% z r wp S % Cr,p % z mj wnj S % S
j=1


*
r+p
2 mj +3
mj ! n j !
Cr,p C,A A l1
j=1
nj

*
r+p
|z|R

1
C,A 1 0,mj + 0,mj m +1
j=1
2 |z|R j

1
C,A,r,p ,
|z|r+1
R

since $m_j \ge 1$ for at least one $1 \le j \le r + p$. Similarly, if $r = 0$ then


% r+p %
% %
% %
% z r wp S % C,A,r,p .

Hence, in view of (95),


% nr+mp 1 % % r+p (1) %
% k % % M %
sup sup % % % %
pm rn
% z nr w mp % % z wp %
r

(m p)! (n r)! 2nr+2


sup sup C,A,r,p A
l1
pm rn mp+1 |z|nr+1
R

|z|R 1
1 0,r + 0,r
2 |z|r+1
R

1
C,A,n,m .
|z|n+1
R

This proves (93) for j = 1.

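Step 3. We now estimate the constant $C_{1;\Lambda,A,i,j}$ for $i, j \in \{0, 1\}$. First observe that

$$\Big|\frac{\partial w}{\partial k_j}\Big| = |\delta_{1,j} + i(-1)^{\nu}\delta_{2,j}| = 1\qquad\text{and}\qquad\Big|\frac{\partial z}{\partial k_j}\Big| = |\delta_{1,j} - i(-1)^{\nu}\delta_{2,j}| = 1.$$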

Thus, in view of (99) and (103), since $|z| \ge |v| > R \ge 2\Lambda$,


% % % % % %
% S % % 1 % % 1 1 %
% % = %S S S % = %S w S + z S S%
% kj % % kj % % kj w kj z %

% 1 % % 1 % 2 # 4 $
% S % % S % 3 2 A l1 22 A
l1
%
S % 2 % + % % +
w % % z % 2 |z|2R 2

l1
18 A
.
2
Similarly,

2S S w S 1 z S 1
= + S
ki kj ki kj w kj z

w S 1 z S 1 S
S +
kj w kj z ki

w w 2 S 1 z 2 S 1
S +
kj ki w2 ki zw

z w 2 S 1 z 2 S 1
+ + S,
kj ki wz ki z 2

so that, using the above inequality as well,


% 2 % % % % 1 % % 1 %
% S % % % % % % %
% % 2 S % S % % S % + % S %
% ki kj % % ki % % w % % z %

% 2 1 % % 2 1 % % 2 1 %
% S % % S % % S %
%
+ S % 2 % +2% % % %
w 2 % % zw % + % z 2 %

2 # 3 $
l1 8 A
3 18 A l1 3 2 A l1 25 A
l1 26 A
l1
2 + + +
2 2 2 2 3 |z|2R |z|3R
# $
432 2 54 l1
55 A l1
8 A
4 A l1 + 3 A l1 +1 .
3

Furthermore, by (95),
% % % % % %
% 1 % % 1 % % 1 % 22 23 8
% k % % k % % k %
% kj % % w % % z % 2 |z|R + |z|2 2 |z|R
+
R

and
% 2 1 % % 2 1 % % 2 1 % % 2 1 %
% k % % k % % % % %
% %% % + 2 % k % + % k %
% ki kj % % w2 % % zw % % z 2 %

23 24 26 5 23 1
+ + < .
3 |z|R 2 |z|R
2 3
|z|R 3 |z|R

Hence, since $\|\hat A\|_{l^1} < 2\varepsilon/63$ and $\varepsilon < \Lambda/6$,
% % % % % %
% 1 % % 1 % %
1 % S %
%
% % % k %
% kj k S % % kj % S + k % kj %

8 3 l1
2 18 A 13 1
+ 2
2 |z|R 2 |z|R 2 |z|R
and
% %
% 2 %
% 1 %
% ki kj k S %
% 2 1 % % %% % % %% % % 2 %
% k % % 1 %%
k % % S %
% % 1 %%
k % % S %
% %
1 % S %
%
%% % %
S + % +%% + k %
ki kj % kj % % ki % ki % % kj % ki kj %
# # $$
1 5 23 3 l1
8 18 A l1 8 A
2 55 A l1 65 1
+2 2 + +1 < 3 .
|z|R 23 2 3 |z|R

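Therefore,

$$C_{1;\Lambda,A,1,0} \le \frac{13}{\Lambda^2},\qquad C_{1;\Lambda,A,0,1} \le \frac{13}{\Lambda^2}\qquad\text{and}\qquad C_{1;\Lambda,A,1,1} \le \frac{65}{\Lambda^3},$$

as was to be shown.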
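Step 4. To prove (93) for $j = 2$ we need to bound $\frac{\partial^{r+p}M^{(2)}}{\partial z^r\,\partial w^p} = \frac{\partial^{r+p}W}{\partial z^r\,\partial w^p}$. Recall from (64) that

$$W = \sum_{j=1}^{\infty}W_j = \sum_{j=1}^{\infty}\sum_{m=1}^{j}(Y_{33})^{m-1}X_{33}(Y_{33})^{j-m},$$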

where Yb,c is given above by (98) and X33 C/|z| < 1/3 with

(c d ) A(b
c) q(b c) 2i (A(b
c))w
Xb,c = .
(w 2i (c d ))(z 2i (c d ))


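First observe that

$$\frac{\partial^{r+p}}{\partial z^r\,\partial w^p}\,(Y_{33})^{m-1}X_{33}(Y_{33})^{j-m}$$

is given by a sum of $j^{\,r+p}$ terms of the form

$$\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\cdots\frac{\partial^{l_{m-1}+n_{m-1}}Y_{33}}{\partial z^{l_{m-1}}\partial w^{n_{m-1}}}\,\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\,\frac{\partial^{l_{m+1}+n_{m+1}}Y_{33}}{\partial z^{l_{m+1}}\partial w^{n_{m+1}}}\cdots\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}},$$

where there are $j$ factors ordered as in the product $(Y_{33})^{m-1}X_{33}(Y_{33})^{j-m}$. Furthermore, for each term in the sum we have $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$. Thus,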
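$$\begin{aligned}
\Big\|\frac{\partial^{r+p}W}{\partial z^r\,\partial w^p}\Big\|
&= \Big\|\sum_{j=1}^{\infty}\frac{\partial^{r+p}W_j}{\partial z^r\,\partial w^p}\Big\| \qquad (104)\\
&= \Big\|\sum_{j=1}^{\infty}\sum_{m=1}^{j}\frac{\partial^{r+p}}{\partial z^r\,\partial w^p}\,(Y_{33})^{m-1}X_{33}(Y_{33})^{j-m}\Big\|\\
&\le \sum_{j=1}^{\infty}\sum_{m=1}^{j}\Big\|\frac{\partial^{r+p}}{\partial z^r\,\partial w^p}\,(Y_{33})^{m-1}X_{33}(Y_{33})^{j-m}\Big\|\\
&\le \sum_{j=1}^{\infty}\sum_{m=1}^{j} j^{\,r+p}\,\sup_{I}\Big\|\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\cdots\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\cdots\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}}\Big\|\\
&\le \sum_{j=1}^{\infty}\sum_{m=1}^{j} j^{\,r+p}\,\sup_{I}\Big\|\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\Big\|\cdots\Big\|\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\Big\|\cdots\Big\|\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}}\Big\|, \qquad (105)
\end{aligned}$$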

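where

$$I := \Big\{(l_i, n_i)\ \Big|\ l_i \le r \text{ and } n_i \le p \text{ for } 1 \le i \le j \text{ with } \sum_{i=1}^{j} l_i = r \text{ and } \sum_{i=1}^{j} n_i = p\Big\}. \tag{106}$$

Note that we can differentiate the series (104) term-by-term because the sum $\sum_{j=1}^{\infty}W_j$ converges uniformly and the sum $\sum_{m=1}^{j}$ is finite. We next estimate the factors in (105).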
Combining (101) and (102) we have
l +n li +2
i i |z|R 2 l i ! ni !

z li wni Yb,c 1 0,li + 2 0,li ni |z|li +1 |A(b c)|. (107)
R

Furthermore, using (94) and (100),


l +n
i i

z li wni Xb,c

li ni (c d ) A(b c))w
c) q(b c) 2i (A(b
1
= li
z z 2i (c d ) wni w 2i (c d )


(1)li l !(1)ni n !(2 (A(b
c))2 (c d ) (c d ) A(b c) q(b c))
i i
=
(z 2i (c d ))li +1 (w 2i (c d ))ni +1
c)| |c d | + |
li ! ni ! (2|A(b q (b c)|)

|z 2i (c d )|li +1 |w 2i (c d )|ni +1
c)| |c d | + |
2li +1 li ! ni ! 2|A(b q (b c)|

i |z|R
n l i +1 |w 2i (c d )|

2li +1 li ! ni ! c)| + 1 |
4|A(b q (b c)| . (108)
ni |z|lRi +1

Hence,

  l +n
i i
sup + sup
z li wni Yb,c
bG3 cG3
cG3 bG3

li +2  
|z|R 2 l i ! ni ! |A(b
1 0,li + 0,li sup + sup c)|
2 ni |z|lRi +1 bG3  cG3 
cG3 bG3
li +3
|z|R 2 l i ! ni !
1 0,li + 0,li A l1
2 i |z|lRi +1
n

and similarly

  l +n
i i 2li +2 li ! ni !
q l1
sup
z li wni Xb,c ni |z|li +1 4 A l1 +
+ sup .
bG3 cG3
cG3 bG 3
R

Thus, by Proposition (5), since |z| |v| > R 2,


% l +n % li +3
% i i % |z|R 2 l i ! ni !
% Y % 1 + A l1
% z li wni 33 % 0,li
2
0,li
ni |z|lRi +1

1 1 2li +3 li ! ni ! 2li +3 li ! ni !
+ A l 1 A l1 (109)
|z|R 2 ni |z|Ril
ni +1 |z|lRi

and
% l +n %
% i i % 2li +2 li ! ni !
q l1
% X % 4
A +
% z li wni 33 % ni |z|lRi +1
l 1

# $

q l1 1 2li +3 li ! ni !
= 2 + A l1 . (110)
2 A l1 |z|R ni +1 |z|li
R

Applying these estimates to (105) and recalling that $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$
we have
% r+p %
% %
% %
% z r wp W %

  % l +n % % l +n % % l +n %
j
% 1 1 Y33 % % m m X33 % % j j Y33 %
j r+p
sup % l1 n1 % % lm nm % % lj nj %
% % % % %
j=1 m=1 I
z w z w z w %


# $ 
 
j

q l1 1 * 2li +3 li ! ni !
j
j r+p sup 2 + A l1
j=1 m=1 I
l1
2 A |z|R i=1 ni +1 |z|lRi
# $
# $j  j  j
2r  r+p 8 A * * 
j

q l1 1 l1
= 2 + j sup li ! nm ! 1
l1
2 A |z|R |z|R j=1
p r I i=1 m=1 m=1
# $ j

q l1 2r r!p!  r+p+1 1  1
2 + j C,A,q,r,p r+1 .

2 A l1 p |z|r+1
R j=1
21 |z| R

This is the inequality we needed to prove (93) for j = 2. In fact, using (95) we
obtain
% nr+mp 1 % % r+p (2) %
% k % % M %
sup sup % % % %
pm rn
% z nr wmp % % z r wp %


(m p)! (n r)! 2nr+2 C,A,q,r,p
sup sup
pm rn mp+1 |z|nr+1
R |z|r+1
R
1
C,A,q,m,n .
|z|n+2
R

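Step 5. To prove (93) for $j = 3$ we need to estimate $\frac{\partial^{r+p}M^{(3)}}{\partial z^r\,\partial w^p} = \frac{\partial^{r+p}Z}{\partial z^r\,\partial w^p}$, where

$$Z = \sum_{j=2}^{\infty}Z_j = \sum_{j=2}^{\infty}\big((X_{33} + Y_{33})^j - W_j - Y_{33}^{\,j}\big).$$

First observe that

$$\frac{\partial^{r+p}Z_j}{\partial z^r\,\partial w^p} = \frac{\partial^{r+p}}{\partial z^r\,\partial w^p}\big((X_{33} + Y_{33})^j - W_j - Y_{33}^{\,j}\big)$$

is given by a sum of $(2^j - j - 1)\,j^{\,r+p}$ terms of the form

$$\underbrace{\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\cdots\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\cdots\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}}}_{j\ \text{factors}}, \tag{111}$$

where there are $j - 2$ factors involving $X_{33}$ or $Y_{33}$ and two factors containing $X_{33}$. Furthermore, for each term in the sum we have $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$. Thus,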
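$$\Big\|\frac{\partial^{r+p}Z_j}{\partial z^r\,\partial w^p}\Big\| \le (2^j - j - 1)\,j^{\,r+p}\,\sup_{I}\Big\|\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\Big\|\cdots\Big\|\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\Big\|\cdots\Big\|\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}}\Big\|,$$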
where the set $I$ is given above by (106). Now observe that the estimate for the derivatives of $X_{33}$ in (110) is better than the estimate for the derivatives of $Y_{33}$ in (109), because the former has an extra factor $C_{\Lambda,A,q}/|z|_R < 1$. Since the product (111) has at least two factors containing $X_{33}$, we can estimate any of these products by considering the worst case. This happens when there are exactly two factors involving $X_{33}$. Hence, by proceeding in this way, for each $j \ge 2$ we have
% r+p %
% %
% %
% z r wp Zj %
# $2
q 1 1 *j
2 li +3
l !n !
l i i
(2j j 1) j r+p sup 2 + l1
A
I 2 A l1 |z|2R i=1 ni +1 |z|lRi
# $2 # $j

q l1 1 2r r!p! l1
8 A
2 jj r+p
2 +
l1
2 A |z|2R p |z|rR
j
 2 1
C,A,q,r,p j r+p ,
21 |z|r+2
R

since $\|\hat A\|_{l^1} \le 2\varepsilon/63$ and $\varepsilon < \Lambda/6$. Thus,


% r+p % % % j
% %  % r+p 
% C,A,q,r,p  2
% % % %
% z r wp Z % % z r wp Zj % |z|r+2
r+p
j
j=2 R j=2
21

C,A,q,r,p
.
|z|r+2
R

Therefore, recalling (95),


% nr+mp 1 % % r+p (3) %
% k % % M %
sup sup % % % %
pm rn
% z nr wmp % % z r wp %


(m p)! (n r)! 2nr+2 C,A,q,r,p
sup sup
pm rn mp+1 |z|nr+1
R |z|r+2
R
1
C,A,q,m,n .
|z|n+3
R

This is the desired inequality for j = 3. The proof of the proposition is complete.

We can now prove Lemma 8. We first prove it for $1 \le j \le 2$ and then for $j = 3$ separately.

Proof of Lemma 8 for $1 \le j \le 2$. Define the $|B| \times |C|$ matrices

$$F_{BC} := [f(b - c)]_{b\in B,\,c\in C}\qquad\text{and}\qquad G_{BC} := [g(b - c)]_{b\in B,\,c\in C},$$

and write $w = w_{\nu,d'}$, $z = z_{\nu,d'}$ and $|z|_R = 2|z| - R$. First observe that, for $1 \le j \le 2$, the functions

 f (d 
b)M
(j)
g(c d
)
[,d (k)]d G =
(j) b,c
(w 2i  (b d ))(z 2i (b d ))

b,cG1
d G

are the diagonal entries of the matrix FGG1 1


GG1 G . Thus, similarly as in
k M
(j)

the proof of Lemma 6, by Proposition 20, for 1 j 2,


n+m % n m %
% %

(j)
(k) FGG % 1 M (j) % GG G Cj ,
k n k m ,d 
1 % k n k m k % 1
|z|j
1 2 1 2 R
where C1 = C1;,A,n,m,f,g and C2 = C2;,A,q,n,m,f,g are constants. Furthermore,
13 13
C1;,A,1,0,f,g f l1 g l1 , C1;,A,0,1,f,g f l1 g l1
2 2
and
65
C1;,A,1,1,f,g f l1 g l1 .
3
This proves the lemma for 1 j 2.

Proof of Lemma 8 for j = 3. We need to estimate


n+m (3)  n+m 4
 (k) = Rj (k),
k1n k2m ,d j=1
k1n k2m

where R1 , . . . , R4 are given by (46), (47), (75) and (67), respectively.

Step 1. We begin with the terms involving $R_1$ and $R_2$, which are easier. We follow the same notation as above. First observe that, as in the proof of Lemma 6, since $\Delta_k^{-1}R_{G'G'}^{-1} = H_k^{-1}$ on $L^2_{G'}$, we have
n+m % %
% n+m Hk1 %
R (k) = %F G %
k n k m 1 % {d }G1 k n k m G2 {d } %
   
1 2 1 2
% n+m 1 %
% Hk %
F{d }G1 % %
% k n k m % GG2 {d } ,
1 2
n+m % %
% n+m Hk1 %
% %
k n k m R2 (k) = %F{d }G2 k n k m GG {d } %
1 2 1 2
% n+m 1 %
% Hk %
F{d }G2 %
 %
% k n k m % GG {d } .
1 2

Furthermore, we have already proved that F{d }G1 f l1 and GG {d } g l1
(see (90) and (91)), and since |z| 3|v|, by Proposition 11,
% n+m 1 %
% Hk % 1
% % (n+m+1)
% k n k m % C,A,n,m .
|z|
1 2

Now recall that $G'_2 = \{b \in G' : |b - d'| > \tfrac{1}{4}R\}$. Then,

  |d c|2 1
sup |f (b c)| 
|f (d c)| b2 f (b) l1 sup 
b{d } cG 
|d c|2
cG2 |d c|
2
2 cG2

16 2
b f (b) l1 ,
R2
 |d c|2 1
sup |f (b c)| sup |f (d c)| b2 f (b) l1 sup 
cG2 cG2 |d c|2 cG2 |d c|
2
b{d }

16 2
b f (b) l1 .
R2
Hence, by Proposition 5,

1
F{d }G2 16 b2 f (b) l1 .
R2
Similarly,

1
GG2 {d } 16 b2 f (b) l1 .
R2
Therefore, combining all this, for $1 \le j \le 2$ we obtain
n+m
1
(n+m+1)
k n k m Rj (k) C,A,n,m,f,g
|z|R2
.
1 2

Step 2. Recall from (67) the expression for $R_4$. Then, as above, by applying
Proposition 20 for $j = 3$ we find that
n+m % n+m %
% %
R F % 1 %
(k)
k n k m 4 {d }G1 %
 
n m
k1 k2
k Z % GG1 {d }
1 2
1
f l1 g l1 C,A,q,n,m .
|z|3R

Step 3. To bound the derivatives of $R_3$ (which is given by (75)) we need a few more estimates. Recall from (70) that $W_{43}^{(j)} = \pi_{G'_4}\,T_{G'G'}^{\,j+1}\,\pi_{G'_3}$. First observe that

r+p 1 m r+p
1 G1 T33
(jm1)
p G  T T 34 W = m
T34 TGjm
 G G
k1r k2 1 k 33 43
k1r k2p k 3

is given by a sum of (j + 2)r+p terms of the form

l1 +n1 1 l2 +n2 T33 lm+2 +nm+2 T34


k
G1
k1l1 k2n1 k1l2 k2n2 l n
k1m+2 k2 m+2
lm+3 +nm+3 TG G lj+2 +nj+2 TG G
l n
l n
G3 .
k2m+3 k2 m+3 k1j+2 k2 j+2
+j+2 +j+2
Moreover, for each term in the sum we have i=1 li = r and i=1 ni = p. Thus,
% r+p %
% (jm1) %
% 1 m
T T W %
% k r k p G1 k 33 34 43

%
1 2
%#j+2 $ %
% * li +ni T %
% (i) %
(j + 2)r+p sup % G  %, (112)
I  % li
k1 k2 ni 3
%
i=1


where the set I is given by (106) with j replaced by j + 2 and


1  for i = 1,
k G1

T for 2 i m + 1,
33
T(i) := (113)
T34
for i = m + 2,



T G G for m + 3 i j + 2.

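Step 3a. The first step in bounding (112) is to estimate $\frac{\partial^{r+p}\Delta_k^{-1}}{\partial k_1^r\,\partial k_2^p}\,\pi_{G'_1}$. We follow the same argument that we have used in the proof of Lemma 6 to bound $\frac{\partial^{n+m}H_k^{-1}}{\partial k_1^n\,\partial k_2^m}$.
In fact, in view of (85) one can see that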

p 1  *p
nj

k
= 1
k nj
k 1 , (114)
k2p j=1
k 2
k
nite sum
where # of terms
depend on p

+p p 1
r k r
where j=1 nj = p. Hence, when we compute k1r k2p , the derivative k1r
nj k
acts either on 1 or n . However, since ( k
k2 )b,c = 2(k2 + c2 )b,c , we have
k k2 j
r nj r nj r
k1r knj k = 0 if nj 1 and k1r knj k = k1r k if nj = 0. Similarly, using again
2 2
r 1
k
(85) one can see that is given by a nite sum as in (114), with p and k2
k1r
+r
replaced by r and k1 , respectively, and j=1 nj = r. Thus, combining all this we
conclude that

r+p 1
k  *
r+p nj
k 1
= 1 n k , (115)
k1r k2p j=1
k
kijj
nite sum where
# of terms depend
on r and p

+r+p +r+p
where j=1 nj 2,ij = p and j=1 nj 1,ij = r. If we observe that

# $

nj k 2(kij + cij )b,c if nj = 1,
= 2b,c if nj = 2,
n
kijj

b,c 0 if nj 3,

and extract the leading term from the summation in (115), in a sense that will
be clear below, we can rewrite (115) in terms of matrix elements as
& 'r & 'p
r+p 1 (1)r+p (r + p)! 2(k1 + c1 ) 2(k2 + c2 )
=
k1r k2p Nc (k) Nc (k) Nc (k) Nc (k)
 (2(k1 + c1 ))j (2(k2 + c2 ))j
+ ,
Nc (k)r+p+1
nite sum where
# of terms depend
on r and p

where j + j < r + p for every j in the summation. Recall from (88) and (89) that,
for all c G \{
c},
|ki + ci | 2 1 7 |ki + ci | + 3|v| 7
< < and . (116)
|Nc (k)| 3 2 |Nc(k)| |v| 2
Hence,
r+p r+p  j +j
1 (r + p)! 7 7 1

k r k p Nc (k) |Nc (k)| +
|Nc (k)|2
1 2 nite sum where
# of terms depend
on r and p
r+p
(r + p)! 7 1
+ C,r,p . (117)
|Nc (k)| |Nc (k)|2
Thus, by Proposition 5, since |Nc (k)| |v| |z|/3 for all c G , we have
% r+p 1 %
% k % 7r+p (r + p)! 3 C,r,p
% %
% k r k p G1 %

r+p+1 |z|
+
|z|2
. (118)
1 2

Now, let 1 = 1;,r,p be the constant


l1 +n1 +1 C,l1 ,n1
1;,r,p := max ,
l1 r 4(l1 + n1 )! 7l1 +n1
n1 p

where C,l1 ,n1 is the constant in (118). Then, for |z| > 1 and for any l1 r and
any n1 p,
% %
% l1 +n1 1 % 7l1 +n1 (l + n )! 3 7l1 +n1 (l1 + n1 )! 4
% %
 %
1 1
% k
G +
% k11 k2 1
l n 1
% l1 +n1 +1 |z| l1 +n1 +1 |z|
l1 +n1 +1
7 1
= (l1 + n1 )! . (119)
|z|

This is the first inequality we need to bound (112). We next estimate the other
factors in that expression.

Step 3b. Recall from (53) that

$$T_{b,c} = \frac{1}{N_c(k)}\,\big(2(c + k)\cdot\hat A(b - c) - \hat q(b - c)\big).$$
By direct calculation we have
r+p
r+p Tb,c 1
= (2(c + k) A(b
c) q(b c))
k1r k2p k1r k2p Nc (k)
r1+p
1
+r 2Aj (b c)
k1r1 k2p Nc (k)
# $
r+p1 1
+p 2Aj (b c).
k1r k2p1 Nc (k)

Hence, using (116) and (117), since |Nc (k)| |v| |z|/3 for all c G and
|v| > 1,
r+p # r+p $
T 7 C 7 |
q (b c)|
b,c (r + p)! +
,r,p
|A(b c)| +
k r k p |v| |v|
1 2

C,r,p
+ |A(b c)|
|v|
r+p+1
7 c)| + C,r,p (|A(b
(r + p)! |A(b c)| + |
q (b c)|).
|z|
(120)

Therefore, by Proposition 5,
% r+p %
% T G G %
% %
% k r k p % r,p , (121)
1 2

where
r+p+1
7 l1 + C,A,q,r,p 1 .
r,p := (r + p)! A (122)
|z|

This is the second estimate we need to bound (112). We next derive one more
inequality.

Step 3c. Set


r+p Tb,c
Qr,p
b,c := (1 + |b c| )
2
.
k1r k2p

We rst prove that, for any B, C G ,


 r,p 
sup |Qb,c | r,p and sup |Qr,p
b,c | r,p ,
bB cC cC
bB

where
r+p+1
7 1
r,p := (r + p)! (1 + b2 )A(b)
l1 + C,A,q,r,p . (123)
|z|
In fact, in view of (120) we have
r+p
 r,p  Tb,c
2
sup |Qb,c | = sup (1 + |b c| ) r p
bB cC bB cC k1 k2

sup (1 + |b c|2 )
bB cC
! r+p+1 "
7 C,r,p
(r + p)! |A(b c)| +
(|A(b c)| + |
q (b c)|)
|z|
r+p+1
7 1
(r + p)! (1 + b2 )A(b)
l1 + C,A,q,r,p ,
|z|
and similarly we estimate $\sup_{c\in C}\sum_{b\in B}|Q^{r,p}_{b,c}|$. Now observe that, as in (78), for any integer $m \ge 0$ and for any $\xi_0, \xi_1, \dots, \xi_{m+2} \in \Gamma^{\#}$, let $b = \xi_0$ and $c = \xi_{m+2}$. Then,

$$|b - c|^2 \le 2(m + 2)\sum_{i=1}^{m+2}|\xi_{i-1} - \xi_i|^2.$$
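The inequality above bounds the squared distance between the endpoints of a chain of lattice points by the sum of the squared steps. The sketch below tests it on random integer chains; the function name `chain_inequality_holds`, the use of the integer lattice and the sampling ranges are illustrative assumptions. Since the Cauchy-Schwarz inequality already gives the bound with the smaller factor m + 2, the stated factor 2(m + 2) leaves room to spare.

```python
import random

def chain_inequality_holds(m, rng, span=10):
    """Check |x_0 - x_{m+2}|^2 <= 2 (m + 2) * sum_i |x_{i-1} - x_i|^2 for a random chain."""
    chain = [(rng.randint(-span, span), rng.randint(-span, span)) for _ in range(m + 3)]
    first, last = chain[0], chain[-1]
    lhs = (first[0] - last[0]) ** 2 + (first[1] - last[1]) ** 2
    steps = sum(
        (chain[i - 1][0] - chain[i][0]) ** 2 + (chain[i - 1][1] - chain[i][1]) ** 2
        for i in range(1, m + 3)
    )
    return lhs <= 2 * (m + 2) * steps

rng = random.Random(2)
all_hold = all(chain_inequality_holds(m, rng) for m in range(8) for _ in range(200))
print(all_hold)
```

All arithmetic is done in exact integers, so the check involves no rounding.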

li ,ni li +ni
To simplify the notation write = l n , and recall from (113) and (123)
k1i k2 i
the denition of T(i) and r,p . Hence, similarly as in the proof of Proposition 17,
since |b c| R/4 for all b G1 and c G4 ,
# $
 m+2
*
sup li ,ni T(i)

bG1

cG4 i=2 b,c

# $
m+2
1  *
2
sup sup (1 + |b c| ) li ,ni
T (i)
bG1 1 + |b c| bG1
2
 cG4 i=2 b,c
cG4

2(m + 2) 
sup (1 + |b 1 |2 )| l2 ,n2 Tb,1 |
1 2 bG
1+ R 1 1 G
3
16

(1 + |1 2 |2 )| l3 ,n3 T1 ,2 |
2 G3

(1 + |m+1 c|2 )| lm+2 ,nm+2 Tm+1 ,c |
cG4

2(m + 2) 
sup (1 + |b 1 |2 )| l2 ,n2 Tb,1 |
1 2 bG
1+ R 1 1 G
3
16

sup (1 + |1 2 |2 )| l3 ,n3 T1 ,2 |
1 G3 G
2 3

sup (1 + |m+1 c|2 )| lm+2 ,nm+2 Tm+1 ,c |
m+1 G3 cG4

2(m + 2)   l ,nm+2
= sup |Qlb,
2 ,n2
| sup |Qm+2
m+1 ,c
|
1 2 bG 1 
m+1 G3
1+ R 1 1 G
3 cG4
16
2(m + 2) *
m+2
li ,ni
1
1 + R2 i=2
16
and similarly
# $
 m+2
* 2(m + 2) *
m+2


li ,ni
sup T(i) li ,ni .
cG4 bG i=2
b,c
1 + 1 R2 i=2
1
16
Therefore, by Proposition 5,
% %
% %
%  * i i T(i) % 2(m + 2) *
m+2 l +n m+2
%G1 % li ,ni .
% k1li k2ni % 1 + 1 R2 i=2
i=2
16
We have all we need to bound (112).

Step 3d. From (121) and (119) it follows that


% j+2 %
% * li +ni T % *
j+2
% (i) %
% % li ,ni
% k1li k2ni % i=m+3
i=m+3

and
% % r+p+1
% l1 +n1 T %
% (1) % 7 1
% % (r + p)! .
% k1 k2 %
l 1 n1 |z|

Thus, recalling (112) we get


% r+p %
% (jm1) %
% 1
T m
T W %
% k r k p k G1 33 34 43

%
1 2
%#j+2 $ %
% * li +ni T %
% (i) %
(j + 2)r+p
sup % G3 %

I  % i=1 k1 k2 %
li ni



1 2(m + 2) r+p+1 !m+2*
" j+2
*


7
(j + 2)r+p sup (r + p)! li ,ni li ,ni
I
|z| 1 + 1 R2 i=2 i=m+3


16
C
(j + 2)r+p (m + 2)
|z|R2
 l1 +n1 +1 !m+2 " j+2 
7 * *
sup (l1 + n1 )! li ,ni li ,ni ,
I i=2 i=m+3

where C is an universal constant. Now, recall the denition of r,p and r,p in
(122) and (123), observe that A l1 < (1 + b2 )A
l1 , and let 2 = 2;,A,q,r,p be a
suciently large constant such that, for |z| > 2 and for any li r and any ni p,
li +ni +1
7
li ,ni , li ,ni 2(li + ni )! (1 + b2 )A(b)
l1 .

Then,
% r+p %
% (jm1) %
% 1 m %
% k r k p k G1 T33 T34 W43 %
1 2
C
(j + 2)r+p (m + 2)
|z|R2
 l1 +n1 +1 !m+2 " j+2 
7 * *
sup (l1 + n1 )! li ,ni li ,ni
I i=2 i=m+3
j+2
(m + 2)C 7
(j + 2)r+p
(2 (1 + b2 )A(b) l1 )
j+1
|z|R2
 Pj+2 
i=1 (li +ni ) *
j+2
7
sup (li + ni )!
I i=1
+j+2 +j+2 5j+2
(since i=1 li = r, i=1 ni = p and
i=1 (li + ni )! < (r + p)!)
r+p+1 j+1
7 14 1
C(r + p)! (m + 2)(j + 2)r+p (1 + b2 )A(b)
l 1
|z|R2
j+1
C,r,p 4
(m + 2)(j + 2)r+p
,
|z|R2 9
since $\|(1 + b^2)\hat A(b)\|_{l^1} < 2\varepsilon/63$. This establishes a bound for (112).

Step 4. We now apply the last inequality to derive an estimate for the derivatives of $R_3$ and complete the proof of the lemma for $j = 3$. Recall from (76) that

$$X_{33}^{(j)} = \sum_{m=0}^{j-1}T_{33}^{\,m}\,T_{34}\,W_{43}^{(j-m-1)}.$$

Then,
% r+p %
% %
% 1 (j) %
% k r k p G1 k X33 %
1 2

 % r+p %
j1
% (jm1) %
% 1 m %
% k r k p k G1 T33 T34 W43

%
m=0 1 2


j1 j+1
C,r,p 4
(m + 2)(j + 2)r+p

m=0
|z|R 2 9
j+1 
j1
C,r,p 4
(j + 2)r+p (m + 2)
|z|R2 9 m=0
j+1
C,r,p 4 1 2
= (j + 2)r+p (j + 3j).
|z|R2 9 2
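Two elementary facts are used in this chain and in the summation over $j$ that follows: the inner sum of $(m + 2)$ over $0 \le m \le j - 1$ equals $(j^2 + 3j)/2$, and a series whose terms are a polynomial in $j$ times $(4/9)^{j+1}$ converges. The sketch below checks both; the helper names and the sample exponent (standing in for $r + p$) are illustrative choices.

```python
def inner_sum(j):
    """Sum of (m + 2) over m = 0, ..., j - 1."""
    return sum(m + 2 for m in range(j))

# Closed form used in the text: (j^2 + 3j) / 2, compared without division.
closed_form_ok = all(2 * inner_sum(j) == j * j + 3 * j for j in range(1, 50))

def series_partial(exponent, terms):
    """Partial sum over j = 1..terms of (j + 2)^exponent * (4/9)^(j + 1) * (j^2 + 3j) / 2."""
    return sum(
        (j + 2) ** exponent * (4 / 9) ** (j + 1) * (j * j + 3 * j) / 2
        for j in range(1, terms + 1)
    )

# The geometric factor (4/9)^(j + 1) dominates any fixed power of j, so the tail is negligible.
tail = series_partial(2, 200) - series_partial(2, 100)
print(closed_form_ok, series_partial(2, 200), tail)
```

The limit of the partial sums depends only on the exponent, which is why the resulting constant can be absorbed into a constant depending on $r$ and $p$.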
Thus, since G1 G3 ,
% %
%  %
% r+p %
%G 1
X
(j)
%
% 1 k r k p k 33 G1 %


% 1 2 j=1 %
%
 %
% r+p %
% 1 (j) %
% k r k p G1 k X33 %
j=1 1 2
j+1
C,r,p  4 1 2 1
(j + 2)r+p
(j + 3j) CC,r,p ,
|z|R2 j=1 9 2 |z|R2
where C is a universal constant. Therefore,

r+p 
r+p
R (k) = F 1
X
(j)
G
k r k p 3 {d }G1 k r k p
 
k 33 G1 {d }
 
1 2 1 2 j=1
% %
%  %
% r+p %
%
F{d }G1 %G1 r p k 1
X33 G1 %
(j)
k1 k2 % GG1 {d }
% j=1 %
1
CC,r,p f l1 g l1 .
|z|R2
Finally, combining all the estimates we have
% n+m %  4 % %
% % % n+m %
% (3) % % %
% k n k m ,d (k)% % k n k m Rj (k)%
1 2 j=1 1 2

C C 4C
3 + ,
|z|R2 |z|3R |z|R2
where C = C,,A,q,f,g,m,n is a constant. Set ,A,q,m,n := max{1;,m,n ,
2;,A,q,m,n }. The proof of the lemma for j = 3 is complete.

References
[1] J. Feldman, H. Knörrer and E. Trubowitz, Riemann Surfaces of Infinite Genus, CRM Monograph Series (Amer. Math. Soc., 2003).
[2] D. Gieseker, H. Knörrer and E. Trubowitz, The Geometry of Algebraic Fermi Curves, Perspectives in Mathematics, Vol. 14 (Academic Press, Inc., 1993).
[3] H. Knörrer and E. Trubowitz, A directional compactification of the complex Bloch variety, Comment. Math. Helv. 65 (1990) 114–149.
[4] I. Krichever, Spectral theory of two-dimensional periodic operators and its applications, Russian Math. Surveys 44(2) (1989) 145–225.
[5] H. McKean, Integrable systems and algebraic curves, in Global Analysis (Proc. Biennial Sem. Canad. Math. Congr. Univ. Calgary, 1978), Lecture Notes in Math., Vol. 755 (Springer, 1979), pp. 83–200.
[6] J. Feldman, H. Knörrer and E. Trubowitz, Asymmetric Fermi surfaces for magnetic Schrödinger operators, Comm. Partial Differential Equations 26 (2000) 319–336.
[7] Y. Karpeshina, Spectral properties of the periodic magnetic Schrödinger operator in the high-energy region. Two-dimensional case, Comm. Math. Phys. 251 (2004) 473–514.
[8] L. Erdős, Recent developments in quantum mechanics with magnetic fields, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon's 60th Birthday, Proc. Sympos. Pure Math., Vol. 76, Part 1 (Amer. Math. Soc., 2007), pp. 401–428.
[9] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators (Academic Press, 1978).
[10] P. Kuchment, Floquet Theory for Partial Differential Equations (Birkhäuser, 1993).
[11] W. Magnus and S. Winkler, Hill's Equation (Dover, 2004).
[12] S. Gustafson and I. Sigal, Mathematical Concepts of Quantum Mechanics (Springer, 2006).
[13] G. de Oliveira, Asymptotics for Fermi curves of electric and magnetic periodic fields, Ph.D. thesis, The University of British Columbia (2009); http://hdl.handle.net/2429/11114.
September 14, 2010 13:30 WSPC/S0129-055X 148-RMP
J070-S0129055X10004119

Reviews in Mathematical Physics
Vol. 22, No. 8 (2010) 963–993
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004119

THE 3D SPIN GEOMETRY OF THE QUANTUM TWO-SPHERE

SIMON BRAIN and GIOVANNI LANDI
Dipartimento di Matematica e Informatica,
Università di Trieste, Via A. Valerio 12/1,
34127 Trieste, Italy
INFN, Sezione di Trieste, Trieste, Italy
brain@sissa.it
landi@univ.trieste.it

Received 23 March 2010

We study a three-dimensional differential calculus $\Omega^1 S_q^2$ on the standard Podleś quantum two-sphere $S_q^2$, coming from the Woronowicz $4D_+$ differential calculus on the quantum group $SU_q(2)$. We use a frame bundle approach to give an explicit description of $\Omega^1 S_q^2$ and its associated spin geometry in terms of a natural spectral triple over $S_q^2$. We equip this spectral triple with a real structure for which the commutant property and the first order condition are satisfied up to infinitesimals of arbitrary order.

Keywords: Noncommutative geometry; spectral triples; quantum groups; quantum spheres.

Mathematics Subject Classification 2010: 58B34, 17B37

1. Introduction
The standard quantum two-sphere $S^2_q$ has proven to be one of the most important
and useful examples in trying to understand the relationship between the
geometric/analytic world of noncommutative geometry and the algebraic setting of
quantum group theory. At the algebraic level, it is known that $S^2_q$ has a unique
left-covariant two-dimensional differential calculus [17, 18]. On the other hand, it
is known that this same calculus is recovered via analytic techniques by means of a
noncommutative spin geometry [4, 20]. This compatibility has led to the discovery
of other noncommutative two-dimensional geometries on $S^2_q$ with a range of interesting
properties [7]. In this paper, we extend the investigation to the noncommutative
spin geometry of a differential calculus on $S^2_q$ whose dimension is equal to three.
Quantum two-spheres were constructed and classified by Podleś in [16]. The
standard sphere $S^2_q$ is unique amongst the Podleś family in that it also appears
as the base space of the noncommutative Hopf fibration $SU_q(2) \to S^2_q$ constructed
in [1] as a basic example of a quantum principal bundle. By equipping the total space
$SU_q(2)$ with the 3D differential calculus of [22], one finds that the two-dimensional
963
September 14, 2010 13:30 WSPC/S0129-055X 148-RMP
J070-S0129055X10004119

964 S. Brain & G. Landi

differential calculus on $S^2_q$ appears as an associated vector bundle. This quantum
frame bundle approach to noncommutative geometry, developed in [13, 14], has
been applied successfully to study a host of examples, not least the two-dimensional
geometry of the quantum sphere $S^2_q$ itself.
The present paper also uses the frame bundle approach to study the geometry
of $S^2_q$, but this time starting with the $4D_+$ differential calculus on $SU_q(2)$ of [22].
This calculus has the advantage of being bicovariant under both left and right
translations, in contrast with the 3D calculus, which is only left-covariant. Using
the framing theory we recover the three-dimensional differential calculus $\Omega^1 S^2_q$ of
[9, 10, 17] on $S^2_q$. The methods we use are well-adapted to the principal bundle
structure and as a consequence we immediately find an explicit description of the
bimodule relations in $\Omega^1 S^2_q$, including a decomposition into irreducible components.
We do not discuss the deeper aspects of the Riemannian geometry such as Hodge
structure and connection theory: these will be developed elsewhere [12].
Our main results concern the spin geometry of the three-dimensional calculus
$\Omega^1 S^2_q$. Remarkably, we find that the spinor bundle of $S^2_q$ is unchanged from the one
used in [4, 14, 20] for the two-dimensional calculus. We construct a Dirac operator
$D$ which implements the exterior derivative in $\Omega^1 S^2_q$, finding that the eigenvalues
of $|D|$ grow not faster than $q^{-2j}$ for large $j$ and hence that the associated spectral
triple has metric dimension zero.
Moreover, we equip this spectral triple with a $\mathbb{Z}_2$-grading operator and a real
structure which is defined up to compact operators, in the sense that the commutant
property and the first order condition for a real spectral triple [3] are
satisfied up to infinitesimals of arbitrary order. As we shall see, this is in contrast
with [4], where a true real structure for the two-dimensional calculus on $S^2_q$
was given (cf. also [20]), but is parallel to the results of [7] for the sphere $S^2_q$. We
also find that the KO-theoretic dimension of this real spectral triple is equal to
the classical value, just two.
The paper is organized as follows. In Sec. 2, we give a brief overview of the
construction of quantum differential calculi on quantum groups and their homogeneous
spaces, followed by the general quantum frame bundle construction itself. Following
this, Sec. 3 recalls the elementary geometry of the Hopf fibration $SU_q(2) \to S^2_q$ and
the Hopf algebra $U_q(su(2))$ which describes its symmetries. In Sec. 4, we describe
the differential structure of the Hopf fibration. We start from the 4D quantum
differential calculus on the total space $SU_q(2)$ from which we derive the calculus
on the bundle fiber $U(1)$. The structure of the calculus $\Omega^1 S^2_q$ is then obtained as a
framed quantum manifold in the sense of [14]. Finally, in Sec. 5 we construct our
spectral triple $(A[S^2_q], \mathcal{H}, D)$ over $S^2_q$, which in addition we equip with a $\mathbb{Z}_2$-grading
of the spinor bundle $\mathcal{H}$ and a real structure $J: \mathcal{H} \to \mathcal{H}$.
Notation. In this paper, we make frequent use of the "q-numbers" defined by

\[ [x] := \frac{q^{x} - q^{-x}}{q - q^{-1}} \tag{1.1} \]

for each $x \in \mathbb{R}$ and $q \neq 1$. Furthermore, for the sake of brevity we introduce the
constants

\[ \nu := q + q^{-1}, \qquad \mu := q - q^{-1} \tag{1.2} \]

to be used throughout the paper. Our convention is that N = {0, 1, 2, . . .}.
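
As a quick numerical illustration of the q-numbers (1.1), the following Python sketch evaluates $[x]$ and checks three elementary facts: $[1] = 1$, $[2] = q + q^{-1}$, and $[x] \to x$ as $q \to 1$. The function name `q_number` and the sample values are ours, chosen only for illustration.

```python
import math


def q_number(x: float, q: float) -> float:
    """Return the q-number [x] = (q**x - q**(-x)) / (q - 1/q) for q > 0, q != 1."""
    if q <= 0 or q == 1:
        raise ValueError("q must be positive and different from 1")
    return (q**x - q**(-x)) / (q - 1.0 / q)


q = 0.5
# [1] = 1 and [2] = q + 1/q for every admissible q.
one = q_number(1, q)
two = q_number(2, q)
# As q -> 1 the q-number [x] tends to the ordinary number x.
near_classical = q_number(3.5, 1 - 1e-6)
print(one, two, near_classical)
```

The symmetry $[-x] = -[x]$ is visible from the formula, so only non-negative arguments need to be tabulated in practice.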

2. Preliminaries on Quantum Principal Bundles


We start with some generalities on differential calculi and quantum principal
bundles. These will be endowed both with universal and non-universal compatible
calculi.

2.1. Differential structures

Let $P$ be a complex $*$-algebra with unit. A first order differential calculus over $P$
is a pair $(\Omega^1 P, d)$ where $\Omega^1 P$ is a $P$-$P$-bimodule (the one-forms) and $d: P \to \Omega^1 P$
is a linear map obeying the Leibniz rule

\[ d(ab) = a(db) + (da)b, \qquad a, b \in P, \]

and such that the map $P \otimes P \to \Omega^1 P$ defined by $a \otimes b \mapsto a\,db$ is surjective.
The universal differential calculus over $P$ is the pair $(\tilde\Omega^1 P, \tilde d)$, where
$\tilde\Omega^1 P := \ker m$ is the kernel of the product map $m: P \otimes P \to P$ on $P$, with obvious
bimodule structure

\[ p(a \otimes b) = pa \otimes b, \qquad (a \otimes b)p = a \otimes bp, \qquad a, b, p \in P, \]

and $\tilde d$ is defined by $\tilde d p := 1 \otimes p - p \otimes 1$ for each $p \in P$. It is so-called because any
other differential calculus $(\Omega^1 P, d)$ over $P$ arises as a quotient $\Omega^1 P = \tilde\Omega^1 P / N_P$,
where $N_P$ is some $P$-$P$-sub-bimodule of $\tilde\Omega^1 P$. With the projection
$\pi_P: \tilde\Omega^1 P \to \Omega^1 P$ one has $d = \pi_P \circ \tilde d$.
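
To make the universal calculus concrete, here is a minimal Python sketch for the simplest possible case, which is our own choice of example and not taken from the text: $P$ is the algebra of functions on a three-point set, so that $P \otimes P$ is the algebra of functions of two points, the product map is restriction to the diagonal, and $(1 \otimes f - f \otimes 1)(x, y) = f(y) - f(x)$. The sketch checks the Leibniz rule and that $\tilde d(ab)$ lies in $\ker m$.

```python
from itertools import product

points = range(3)  # a three-point "space"; P = functions on it (illustrative choice)


def d_univ(f):
    """Universal differential: (1 (x) f - f (x) 1)(x, y) = f(y) - f(x)."""
    return {(x, y): f[y] - f[x] for x, y in product(points, repeat=2)}


def left(a, w):
    """Left module action: (a . w)(x, y) = a(x) w(x, y)."""
    return {(x, y): a[x] * w[(x, y)] for (x, y) in w}


def right(w, b):
    """Right module action: (w . b)(x, y) = w(x, y) b(y)."""
    return {(x, y): w[(x, y)] * b[y] for (x, y) in w}


def add(v, w):
    return {k: v[k] + w[k] for k in v}


a = [2, -1, 5]
b = [3, 4, -2]
ab = [a[x] * b[x] for x in points]

lhs = d_univ(ab)
rhs = add(left(a, d_univ(b)), right(d_univ(a), b))
leibniz_holds = lhs == rhs
# ker m consists of the two-point functions vanishing on the diagonal.
in_kernel_of_m = all(lhs[(x, x)] == 0 for x in points)
print(leibniz_holds, in_kernel_of_m)  # prints: True True
```

Integer-valued functions are used so that the comparison is exact rather than up to rounding.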
If $H$ is a Hopf algebra, we write $m_H: H \otimes H \to H$ and $1_H$ for its product
and unit, $\Delta_H: H \to H \otimes H$ and $\epsilon_H: H \to \mathbb{C}$ for its coproduct and counit
and $S_H: H \to H$ for its antipode (when there is no possibility of confusion, we
omit the subscript $H$). We use Sweedler notation $\Delta(h) = h_{(1)} \otimes h_{(2)}$ for the
coproduct. A differential calculus $\Omega^1 H$ over a Hopf algebra $H$ is said to be
left-covariant if the coproduct $\Delta$, viewed as a left coaction of $H$ on itself, extends to
a left coaction $\Delta_L: \Omega^1 H \to H \otimes \Omega^1 H$ such that $d$ is an intertwiner and $\Delta_L$ is a
bimodule map:

\[ \Delta_L(dh) = (\mathrm{id} \otimes d)\Delta(h), \qquad \Delta_L(h\omega) = \Delta(h)\Delta_L(\omega), \qquad \Delta_L(\omega h) = \Delta_L(\omega)\Delta(h) \]

for all $h \in H$, $\omega \in \Omega^1 H$. A similar definition holds for a right-covariant calculus,
now with a right coaction $\Delta_R: \Omega^1 H \to \Omega^1 H \otimes H$. A calculus is said to be bicovariant
if it is both left and right covariant with commuting coactions. The universal
calculus $\tilde\Omega^1 H$ is bicovariant when equipped with the left and right tensor product
coactions on $H \otimes H$.
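Left-covariant differential calculi on a Hopf algebra $H$ are classified as follows,
after [22]. First, it may be shown that the linear map

\[ r: H \otimes H \to H \otimes H, \qquad r(a \otimes b) := ab_{(1)} \otimes b_{(2)}, \tag{2.1} \]

is an isomorphism with inverse

\[ r^{-1}: H \otimes H \to H \otimes H, \qquad r^{-1}(a \otimes b) = aS(b_{(1)}) \otimes b_{(2)}. \tag{2.2} \]

Upon restricting $r$ to the universal calculus $\tilde\Omega^1 H$ we obtain an isomorphism

\[ r: \tilde\Omega^1 H \to H \otimes H^+, \]

where $H^+ := \ker \epsilon_H$ denotes the augmentation ideal of $H$. This is in fact an
isomorphism of $H$-$H$ bimodules if we equip $H \otimes H^+$ with the bimodule structure

\[ a(b \otimes \eta) = ab \otimes \eta, \qquad (a \otimes \eta)b = ab_{(1)} \otimes \eta b_{(2)}, \qquad a, b \in H,\ \eta \in H^+ \tag{2.3} \]

and an isomorphism of $H$-$H$-bicomodules if we equip $H \otimes H^+$ with the bicomodule
structure

\[ \Delta_L(a \otimes \eta) = a_{(1)} \otimes (a_{(2)} \otimes \eta), \]
\[ \Delta_R(a \otimes \eta) = (a_{(1)} \otimes \eta_{(1)}) \otimes a_{(2)}\eta_{(2)}, \qquad a \in H,\ \eta \in H^+. \]

Any left-covariant sub-bimodule $N_H$ of $\tilde\Omega^1 H$ is carried to a right ideal $I_H$ of $H^+$
by the map $r$ in (2.1). Conversely, any right ideal $I_H$ arises in this way from a
left-covariant sub-bimodule of $\tilde\Omega^1 H$. It follows that the left-covariant differential
calculi on $H$ are in one-to-one correspondence with right ideals $I_H \subset H^+$; indeed,
given such an $I_H$, one has $\Omega^1 H \cong H \otimes \Lambda^1$, where $\Lambda^1 = H^+/I_H$ are the left-invariant
one-forms. We also write $\Omega^1_{\mathrm{inv}} H := r^{-1}(1 \otimes \Lambda^1)$.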
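
The maps (2.1) and (2.2) can be made explicit in a toy case. The Python sketch below takes $H$ to be the functions on the cyclic group $\mathbb{Z}_4$ (our illustrative choice, not an example from the text), where $\Delta(f)(g, h) = f(g + h)$, $S(f)(g) = f(-g)$ and $\epsilon(f) = f(0)$. Elements of $H \otimes H$ are functions of two group elements, and under this identification $r$ and $r^{-1}$ become shears of the second argument. The sketch checks that the two maps are mutually inverse and that $r$ sends $\ker m$ into $H \otimes H^+$.

```python
from itertools import product

n = 4  # the cyclic group Z_4, written additively (illustrative choice)
G = range(n)


def r(F):
    """r(a (x) b) = a b_(1) (x) b_(2) becomes (rF)(g, h) = F(g, g + h)."""
    return {(g, h): F[(g, (g + h) % n)] for g, h in product(G, repeat=2)}


def r_inv(F):
    """r^{-1}(a (x) b) = a S(b_(1)) (x) b_(2) becomes (r^{-1}F)(g, h) = F(g, h - g)."""
    return {(g, h): F[(g, (h - g) % n)] for g, h in product(G, repeat=2)}


# An arbitrary element of H (x) H, stored as a function of two group elements.
F = {(g, h): (g + 1) * (h + 2) ** 2 for g, h in product(G, repeat=2)}
round_trip_ok = r_inv(r(F)) == F and r(r_inv(F)) == F

# An element of ker m vanishes on the diagonal; subtract the diagonal part of F.
W = {(g, h): F[(g, h)] - F[(g, g)] for g, h in product(G, repeat=2)}
# H (x) H^+ consists of functions vanishing when the second argument is 0,
# because the counit is evaluation at the identity.
lands_in_H_tensor_Hplus = all(r(W)[(g, 0)] == 0 for g in G)
print(round_trip_ok, lands_in_H_tensor_Hplus)  # prints: True True
```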
A left-covariant sub-bimodule $N_H$ is also right-covariant if and only if the
corresponding ideal $I_H$ is stable under the right adjoint coaction

\[ \mathrm{Ad}_R: H \to H \otimes H, \qquad \mathrm{Ad}_R(a) = a_{(2)} \otimes S(a_{(1)})a_{(3)}, \]

in the sense that $\mathrm{Ad}_R(I_H) \subset I_H \otimes H$. It follows that bicovariant calculi on $H$ are in
one-to-one correspondence with right ideals $I_H$ of $H^+$ which are $\mathrm{Ad}_R$-stable [22].
Given a left-covariant differential calculus $\Omega^1 H$ over $H$, the quantum tangent
space of $\Omega^1 H$ is the vector space

\[ T_H := \{X \in H' \mid X(1) = 0 \text{ and } X(a) = 0 \text{ for all } a \in I_H\}, \tag{2.4} \]

where the vector space $H'$ is the linear dual of $H$. This tangent space has many
properties analogous to the classical case; in particular there exists a unique bilinear
form $\langle\,\cdot \mid \cdot\,\rangle: T_H \times \Omega^1 H \to \mathbb{C}$ such that

\[ \langle X \mid a\,db \rangle = \epsilon_H(a)X(b), \qquad a, b \in H,\ X \in T_H. \tag{2.5} \]

With respect to this bilinear form, the vector spaces $\Omega^1_{\mathrm{inv}} H$ and $T_H$ are
non-degenerately paired, so that

\[ \dim \Omega^1_{\mathrm{inv}} H = \dim T_H = \dim \Lambda^1. \]

This number is said to be the dimension of the left-covariant differential calculus
$\Omega^1 H$.

2.2. Quantum principal bundles

The general set-up for a principal fibration of noncommutative spaces is an algebra
$P$ (playing the role of the algebra of functions on the total space) which is a right
comodule algebra for a Hopf algebra $H$ with coaction $\Delta_R: P \to P \otimes H$. The algebra
of functions on the base space of the fibration is the subalgebra $M$ of $P$ consisting
of coinvariant elements under $\Delta_R$,

\[ M := P^H = \{p \in P : \Delta_R(p) = p \otimes 1\}. \]

For a well-defined bundle structure at the level of universal differential calculi, one
requires exactness of the following sequence [1],

\[ 0 \to P(\tilde\Omega^1 M)P \xrightarrow{\ j\ } \tilde\Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes H^+ \to 0, \tag{2.6} \]

with $H^+$ the augmentation ideal, as before. The algebra inclusion $M \hookrightarrow P$ extends
to an inclusion $\tilde\Omega^1 M \hookrightarrow \tilde\Omega^1 P$ of universal differential calculi, hence $P(\tilde\Omega^1 M)P$ are
the analogues of the horizontal one-forms (classically this corresponds to the space
of one-forms which have been pulled back from the base of the fibration). The map
ver is defined by

\[ \mathrm{ver}(p \otimes p') = p\,\Delta_R(p'); \]

it is the generator of the vertical one-forms. We say that the inclusion $M \hookrightarrow P$ is a
quantum principal bundle with universal calculi and structure quantum group $H$.
Requiring exactness of the sequence (2.6) is equivalent to requiring that the induced
canonical map

\[ \chi: P \otimes_M P \to P \otimes H, \qquad p \otimes_M p' \mapsto p\,\Delta_R(p') \tag{2.7} \]

be bijective. If this is the case, one also says that the triple $(P, H, M)$ is an
$H$-Hopf-Galois extension. This bijection condition is enough for a principal bundle
structure at the level of universal differential calculi.
For a principal bundle with non-universal calculi extra conditions are required
that we briefly recall. Assume then that $P$ and $M$ are equipped with differential
calculi $\Omega^1 P = \tilde\Omega^1 P/N_P$ and $\Omega^1 M = \tilde\Omega^1 M/N_M$, where $N_P$ and $N_M$ are sub-bimodules
of $\tilde\Omega^1 P$ and $\tilde\Omega^1 M$, respectively. Assume further that $H$ is equipped with a
left-covariant calculus $\Omega^1 H$ corresponding to a right ideal $I_H$.
Compatibility of the differential structures means that the calculi satisfy the
conditions

\[ N_M = N_P \cap \tilde\Omega^1 M \quad \text{and} \quad \Delta_R(N_P) \subset N_P \otimes H. \tag{2.8} \]

The role of the first condition is to ensure that $\Omega^1 M$ is spanned by elements of the
form $m\,dn$ with $m, n \in M$ and is hence obtained by restricting the calculus on $P$.
The second condition in (2.8) is sufficient to ensure covariance of $\Omega^1 P$. Finally, we
need the sequence

\[ 0 \to P(\Omega^1 M)P \to \Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes \Lambda^1 \to 0 \tag{2.9} \]

to be exact. This sequence is the analogue of the sequence (2.6) but now at the
level of non-universal calculi. The $P$-$P$-bimodule $P(\Omega^1 M)P$ once again makes up
the horizontal one-forms and $\mathrm{ver}(p \otimes p') = p\,\Delta_R(p')$ is the canonical map which
generates the vertical one-forms. The condition

\[ \mathrm{ver}(N_P) = P \otimes I_H \tag{2.10} \]

ensures that the map

\[ \mathrm{ver}: \Omega^1 P \to P \otimes \Lambda^1, \qquad \Lambda^1 \cong H^+/I_H \]

is well-defined and yields that the sequence (2.9) is indeed exact.

2.3. Framed quantum manifolds

Suppose that the total space $P$ of the bundle is itself a Hopf algebra equipped
with a Hopf algebra surjection $\pi: P \to H$. Here we have a coaction of $H$ on $P$ by
coproduct and projection to $H$,

\[ \Delta_R: P \to P \otimes H, \qquad \Delta_R = (\mathrm{id} \otimes \pi)\Delta. \]

The base is then the quantum homogeneous space $M = P^H$ of coinvariants and
the algebra inclusion $M \hookrightarrow P$ is automatically an $H$-Hopf-Galois extension, i.e. a
quantum principal bundle with universal calculi. To impose non-universal differential
structure, we suppose that $\Omega^1 P$ is left-covariant for $P$ and $\Omega^1 H$ is left-covariant
for $H$, so that they are defined by right ideals $I_P$ and $I_H$ of $P^+$ and $H^+$, respectively.
We ensure the first of (2.8) by taking it as a definition of $\Omega^1 M$; in the case
at hand, the remaining compatibility conditions in (2.8)-(2.10) reduce to

\[ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(I_P) \subset I_P \otimes H, \qquad \pi(I_P) = I_H. \tag{2.11} \]

Thus a choice of left-covariant calculus on $P$ satisfying these conditions automatically
gives a principal bundle with non-universal calculi [14].
We say that an algebra $M$ is a framed quantum manifold if it is the base of a
quantum principal bundle, $M = P^H$, to which $\Omega^1 M$ is an associated vector bundle.

To give $M$ as a framed quantum manifold we therefore require not only a quantum
principal bundle $\Delta_R: P \to P \otimes H$ as above but also a right $H$-comodule $V$, so that
$E := (P \otimes V)^H$ plays the role of the sections of the corresponding associated vector
bundle (the space $P \otimes V$ is equipped with the tensor product coaction). Moreover,
we require a "soldering form" $\theta: V \to P\Omega^1 M$ such that the map

\[ s_\theta: E \to \Omega^1 M, \qquad p \otimes v \mapsto p\,\theta(v) \]

is an isomorphism.
For a general $M$, it is usually not obvious how to go about looking for a framing.
However, in the case of a quantum homogeneous space with compatible calculi one
has a standard framing in the following way [14]. If the conditions in (2.11)
are satisfied then the algebra $M = P^H$ is automatically framed by the bundle
$(P, H, M)$. The $H$-comodule $V$ and soldering form $\theta$ are given explicitly by the
formulæ

\[ V = (P^+ \cap M)/(I_P \cap M), \qquad \Delta_R v = \tilde v_{(2)} \otimes S\pi(\tilde v_{(1)}), \qquad \theta(v) = S\tilde v_{(1)}\,d\tilde v_{(2)}, \tag{2.12} \]

with $\tilde v$ any representative of $v$ in $P^+ \cap M$ and $\Delta(\tilde v) = \tilde v_{(1)} \otimes \tilde v_{(2)}$ the coproduct
on $P$.

3. The Standard Podleś Sphere

We recall here some of the basic geometry of the so-called standard Podleś quantum
two-sphere $S^2_q$ of [16]. We begin with the quantum group $A[SU_q(2)]$ and its
symmetries $U_q(su(2))$, from which we obtain the quantum sphere $S^2_q$ as the base space
of the quantum Hopf fibration $SU_q(2) \to S^2_q$. Finally we sketch the construction
of a family of quantum line bundles over $S^2_q$ which shall prove useful in what is to
follow.

3.1. The quantum group $SU_q(2)$

Recall that the coordinate algebra $A[M_q(2)]$ of functions on the quantum matrices
$M_q(2)$ is the associative unital algebra generated by the entries of the matrix

\[ x = (x^i{}_j) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

obeying the relations

\[ ab = qba, \quad ac = qca, \quad bd = qdb, \quad cd = qdc, \quad bc = cb, \quad ad - da = (q - q^{-1})bc, \tag{3.1} \]

with $0 \neq q \in \mathbb{C}$ a deformation parameter. The algebra $A[M_q(2)]$ has a coalgebra
structure given by $\Delta(x^i{}_j) = \sum_k x^i{}_k \otimes x^k{}_j$ and $\epsilon(x^i{}_j) = \delta^i{}_j$. From $A[M_q(2)]$ we obtain a
Hopf algebra $A[SL_q(2)]$ upon quotienting by the determinant relation $ad = 1 + qbc$
(equivalently $da = 1 + q^{-1}bc$) and defining an antipode by

\[ S\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -q^{-1}b \\ -qc & a \end{pmatrix}. \]
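When the deformation parameter $q$ is taken to be real, $A[M_q(2)]$ is made into
a $*$-algebra by defining the anti-linear involution

\[ x^* = \begin{pmatrix} a^* & b^* \\ c^* & d^* \end{pmatrix} := \begin{pmatrix} d & -qc \\ -q^{-1}b & a \end{pmatrix}. \tag{3.2} \]

It is not difficult to see that $A[SL_q(2)]$ inherits this $*$-structure. Without loss of
generality, we take $0 < q < 1$. The compact quantum group $A[SU_q(2)]$ is defined to
be the quotient of $A[SL_q(2)]$ by the additional relations $S(x^k{}_l) = (x^l{}_k)^*$. Thus in
$A[SU_q(2)]$ we have

\[ x = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & -qc^* \\ c & a^* \end{pmatrix}. \tag{3.3} \]

The algebra relations become

\[ ac = qca, \quad ac^* = qc^*a, \quad cc^* = c^*c, \quad aa^* + q^2cc^* = 1, \quad a^*a + c^*c = 1, \tag{3.4} \]

together with their conjugates. On generators, the counit is $\epsilon(a) = \epsilon(a^*) = 1$,
$\epsilon(c) = \epsilon(c^*) = 0$ and the antipode is now $S(a) = a^*$, $S(a^*) = a$, $S(c) = -qc$,
$S(c^*) = -q^{-1}c^*$, while the coproduct now reads $\Delta(a) = a \otimes a - qc^* \otimes c$,
$\Delta(c) = c \otimes a + a^* \otimes c$ and $\Delta(a^*) = a^* \otimes a^* - qc \otimes c^*$, $\Delta(c^*) = c^* \otimes a^* + a \otimes c^*$.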

3.2. The quantum universal enveloping algebra $U_q(su(2))$

The quantum universal enveloping algebra $U_q(su(2))$ is the unital $*$-algebra generated
by the four elements $K$, $K^{-1}$, $E$, $F$, with $KK^{-1} = K^{-1}K = 1$, subject to the
relations

\[ K^{\pm 1}E = q^{\pm 1}EK^{\pm 1}, \qquad K^{\pm 1}F = q^{\mp 1}FK^{\pm 1}, \qquad [E, F] = (q - q^{-1})^{-1}(K^2 - K^{-2}) \tag{3.5} \]

and the $*$-structure

\[ K^* = K, \qquad E^* = F, \qquad F^* = E. \]

It becomes a Hopf $*$-algebra when equipped with the coproduct $\Delta$ and counit $\epsilon$
defined on generators by

\[ \Delta(K^{\pm 1}) = K^{\pm 1} \otimes K^{\pm 1}, \qquad \Delta(E) = E \otimes K + K^{-1} \otimes E, \qquad \Delta(F) = F \otimes K + K^{-1} \otimes F, \]
\[ \epsilon(K) = 1, \qquad \epsilon(E) = 0, \qquad \epsilon(F) = 0, \]

and with antipode $S$ defined by $S(K) = K^{-1}$, $S(E) = -qE$, $S(F) = -q^{-1}F$ on
generators. The maps $\Delta$, $\epsilon$ are extended as $*$-algebra maps, whereas $S$ extends as a
$*$-anti-algebra map. From the relations (3.5), one finds that the quadratic Casimir
element

\[ C_q := FE + (q - q^{-1})^{-2}(qK^2 - 2 + q^{-1}K^{-2}) - \tfrac{1}{4} \tag{3.6} \]

generates the center of the algebra $U_q(su(2))$.
The finite-dimensional irreducible $*$-representations $\sigma_j$ of $U_q(su(2))$ are indexed
by a half-integer $j = 0, 1/2, 1, 3/2, \ldots$ called the spin of the representation.
Explicitly, these representations are given by

\[ \sigma_j(K)|j, m\rangle = q^{-m}|j, m\rangle, \]
\[ \sigma_j(F)|j, m\rangle = ([j - m][j + m + 1])^{1/2}|j, m + 1\rangle, \tag{3.7} \]
\[ \sigma_j(E)|j, m\rangle = ([j - m + 1][j + m])^{1/2}|j, m - 1\rangle, \]

where the vectors $|j, m\rangle$ for $m = -j, -j + 1, \ldots, j - 1, j$ form an orthonormal basis
of the $(2j + 1)$-dimensional irreducible $U_q(su(2))$-module $V_j$. Moreover, $\sigma_j$ is a
$*$-representation with respect to the Hermitian inner product on $V_j$ for which the
vectors $|j, m\rangle$ are orthonormal. In each representation, the Casimir $C_q$ of (3.6) acts
as a multiple of the identity, with constant given by

\[ \sigma_j(C_q) = \left[j + \tfrac{1}{2}\right]^2 - \tfrac{1}{4} \tag{3.8} \]

as one may easily verify by direct computation.
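
The direct computation behind (3.8) can be checked numerically. The Python sketch below builds the matrices of $K$, $E$, $F$ in the spin-$j$ representation and tests the relations (3.5) and the Casimir value (3.8). It assumes the reading $\sigma_j(K)|j,m\rangle = q^{-m}|j,m\rangle$ with $F$ raising and $E$ lowering $m$, which is the sign convention under which $[E, F] = (q - q^{-1})^{-1}(K^2 - K^{-2})$ and (3.8) hold simultaneously; the helper names (`qn`, `spin_rep`, `check`) are ours.

```python
import math


def qn(x, q):
    """q-number [x] = (q**x - q**(-x)) / (q - 1/q)."""
    return (q**x - q**(-x)) / (q - 1 / q)


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def spin_rep(two_j, q):
    """Matrices of K, K^{-1}, E, F in the basis |j,m>, m = -j..j (index i = m + j)."""
    n, j = two_j + 1, two_j / 2
    ms = [-j + i for i in range(n)]
    K = [[q ** (-ms[i]) if i == k else 0.0 for k in range(n)] for i in range(n)]
    Kinv = [[q ** ms[i] if i == k else 0.0 for k in range(n)] for i in range(n)]
    E = [[0.0] * n for _ in range(n)]
    F = [[0.0] * n for _ in range(n)]
    for i, m in enumerate(ms):
        if i + 1 < n:  # F raises m by one
            F[i + 1][i] = math.sqrt(qn(j - m, q) * qn(j + m + 1, q))
        if i >= 1:     # E lowers m by one
            E[i - 1][i] = math.sqrt(qn(j - m + 1, q) * qn(j + m, q))
    return K, Kinv, E, F


def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def check(two_j, q):
    """Return (KE = qEK, [E,F] relation, Casimir = ([j+1/2]^2 - 1/4) * identity)."""
    K, Kinv, E, F = spin_rep(two_j, q)
    n, j, mu = two_j + 1, two_j / 2, q - 1 / q
    K2, Kinv2 = matmul(K, K), matmul(Kinv, Kinv)
    EF, FE = matmul(E, F), matmul(F, E)
    rel_KE = close(matmul(K, E), [[q * x for x in row] for row in matmul(E, K)])
    rel_EF = close([[EF[i][k] - FE[i][k] for k in range(n)] for i in range(n)],
                   [[(K2[i][k] - Kinv2[i][k]) / mu for k in range(n)] for i in range(n)])
    casimir = [[FE[i][k] + (q * K2[i][k] - 2 * (i == k) + Kinv2[i][k] / q) / mu**2
                - 0.25 * (i == k) for k in range(n)] for i in range(n)]
    expected = qn(j + 0.5, q) ** 2 - 0.25
    rel_C = close(casimir, [[expected * (i == k) for k in range(n)] for i in range(n)])
    return rel_KE, rel_EF, rel_C


print(check(1, 0.7), check(4, 0.5))
```

The Casimir check rests on the identity $[a][b] = [\tfrac{a+b}{2}]^2 - [\tfrac{a-b}{2}]^2$ applied with $a = j - m + 1$ and $b = j + m$.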


The Hopf $*$-algebras $A[SU_q(2)]$ and $U_q(su(2))$ are dually paired via a bilinear
pairing

\[ (\,\cdot\,,\,\cdot\,): U_q(su(2)) \times A[SU_q(2)] \to \mathbb{C} \tag{3.9} \]

which is non-degenerate. It is defined on generators by

\[ (K, a) = q^{-1/2}, \quad (K^{-1}, a) = q^{1/2}, \quad (K, d) = q^{1/2}, \quad (K^{-1}, d) = q^{-1/2}, \]
\[ (E, c) = 1, \quad (F, b) = 1, \]

with all other combinations of generators pairing to give zero. The pairing is
extended to products of generators via the requirements

\[ (\Delta(X), p_1 \otimes p_2) = (X, p_1p_2), \qquad (X_1X_2, p) = (X_1 \otimes X_2, \Delta(p)), \]
\[ (X, 1) = \epsilon(X), \qquad (1, p) = \epsilon(p), \tag{3.10} \]

for all $X, X_1, X_2 \in U_q(su(2))$ and all $p, p_1, p_2 \in A[SU_q(2)]$. It is compatible
with the antipode and the $*$-structures in the sense that, for all $X \in U_q(su(2))$,
$p \in A[SU_q(2)]$,

\[ (S(X), p) = (X, S(p)), \qquad (X^*, p) = \overline{(X, (S(p))^*)}, \qquad (X, p^*) = \overline{((S(X))^*, p)}. \tag{3.11} \]

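Using the pairing, there is a canonical left action of $U_q(su(2))$ on $A[SU_q(2)]$
defined by

\[ \triangleright: U_q(su(2)) \otimes A[SU_q(2)] \to A[SU_q(2)], \qquad X \triangleright p := p_{(1)}(X, p_{(2)}) \tag{3.12} \]

where $X \in U_q(su(2))$, $p \in A[SU_q(2)]$ and $\Delta(p) = p_{(1)} \otimes p_{(2)}$ denotes the coproduct
on $A[SU_q(2)]$. In particular, this action works out on generators to be

\[ E \triangleright a = b, \quad E \triangleright c = d, \quad F \triangleright b = a, \quad F \triangleright d = c, \]
\[ K^{\pm 1} \triangleright a = q^{\mp 1/2}a, \quad K^{\pm 1} \triangleright c = q^{\mp 1/2}c, \quad K^{\pm 1} \triangleright b = q^{\pm 1/2}b, \quad K^{\pm 1} \triangleright d = q^{\pm 1/2}d, \tag{3.13} \]
\[ E \triangleright b = 0, \quad E \triangleright d = 0, \quad F \triangleright a = 0, \quad F \triangleright c = 0. \]

This action makes $A[SU_q(2)]$ into a left $U_q(su(2))$-module $*$-algebra, in the sense
that

\[ X \triangleright (p_1p_2) = (X_{(1)} \triangleright p_1)(X_{(2)} \triangleright p_2), \qquad X \triangleright 1 = \epsilon(X)1, \qquad X \triangleright p^* = ((S(X))^* \triangleright p)^* \]

for all $p, p_1, p_2 \in A[SU_q(2)]$, $X \in U_q(su(2))$. There is also a canonical right action
of $U_q(su(2))$ on $A[SU_q(2)]$, defined by

\[ \triangleleft: A[SU_q(2)] \otimes U_q(su(2)) \to A[SU_q(2)], \qquad p \triangleleft X := (X, p_{(1)})p_{(2)} \tag{3.14} \]

for $X \in U_q(su(2))$ and $p \in A[SU_q(2)]$, with properties similar to those for the left
action. These two canonical actions commute with one another.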

3.3. Line bundles on the quantum sphere $S^2_q$

The coordinate algebra $H := A[U(1)]$ of the group $U(1)$ is the commutative unital
$*$-algebra generated by $t$, $t^*$, subject to the relations $tt^* = t^*t = 1$. It is a Hopf
algebra when equipped with the coproduct, counit and antipode

\[ \Delta(t) = t \otimes t, \qquad \epsilon(t) = 1, \qquad S(t) = t^*, \]

extended as $*$-algebra maps. There is a canonical Hopf algebra projection given on
generators by

\[ \pi: A[SU_q(2)] \to A[U(1)], \qquad \pi\begin{pmatrix} a & b \\ c & d \end{pmatrix} := \begin{pmatrix} t & 0 \\ 0 & t^* \end{pmatrix}. \tag{3.15} \]

Using this projection a right coaction of $H = A[U(1)]$ on $P := A[SU_q(2)]$ is
defined by

\[ \Delta_R: A[SU_q(2)] \to A[SU_q(2)] \otimes A[U(1)], \qquad \Delta_R(x^i{}_j) := \sum_k x^i{}_k \otimes \pi(x^k{}_j). \tag{3.16} \]

In fact, this coaction is the same thing as a $\mathbb{Z}$-grading on $A[SU_q(2)]$ for which the
generators have degrees

deg(a) = deg(c) = 1, deg(b) = deg(d) = 1. (3.17)

The subalgebra of coinvariants under this coaction is denoted $A[S^2_q]$,

\[ A[S^2_q] := \{m \in A[SU_q(2)] \mid \Delta_R(m) = m \otimes 1\}. \]

We shall frequently write $M := A[S^2_q]$. This algebra is precisely the subalgebra
generated by elements of degree zero: it is the unital $*$-algebra generated by the
elements

\[ b_+ := cd, \qquad b_- := ab, \qquad b_0 := bc \tag{3.18} \]

subject to the relations

\[ b_0b_\pm = q^{\pm 2}b_\pm b_0, \qquad q^{-2}b_-b_+ = q^2b_+b_- + (1 - q^2)b_0, \]
\[ b_+b_- = b_0(1 + q^{-1}b_0) \]

inherited from those of $A[SU_q(2)]$. In the classical limit $q \to 1$, the first line of relations
becomes the statement that the algebra is commutative, whereas the second
line becomes the sphere relation for the classical two-sphere $S^2$. The quantum sphere
$S^2_q$ is precisely the standard Podleś sphere of [16]. The canonical algebra inclusion
$M \hookrightarrow P$ is well known to be a Hopf-Galois extension [1] and hence a quantum
principal bundle with universal differential calculi whose typical fiber is determined
by $H := A[U(1)]$.
The coaction (3.16) of $H$ on $A[SU_q(2)]$ is also used to define a family of line
bundles over the quantum sphere $S^2_q$, indexed by $n \in \mathbb{Z}$:

\[ L_n := \{x \in A[SU_q(2)] \mid \Delta_R(x) = x \otimes t^{-n}\}. \]

One has the decomposition [15]

\[ A[SU_q(2)] = \bigoplus_{n \in \mathbb{Z}} L_n. \]

In particular $L_0 = A[S^2_q]$ and one finds that $L_n^* \cong L_{-n}$ and $L_n \otimes_{A[S^2_q]} L_m \cong L_{n+m}$
for each $n, m \in \mathbb{Z}$. Moreover,

\[ E \triangleright L_n \subset L_{n+2}, \qquad F \triangleright L_n \subset L_{n-2}, \qquad K^{\pm 1} \triangleright L_n \subset L_n \]

for all $n \in \mathbb{Z}$, as can be checked directly using (3.13) and (3.10).
It is known that each $L_n$ is a finitely generated projective (say) left $A[S^2_q]$-module
of rank one [21]. In this way, we think of the module $L_n$ as the space of
sections of a line bundle over $S^2_q$ with winding number $n$.

4. Differential Structure of the Quantum Hopf Fibration

In this section we equip the quantum group $SU_q(2)$ with a four-dimensional
bicovariant differential calculus, originally described in [22]. Using this, the base space
$S^2_q$ of the Hopf fibration inherits a three-dimensional differential calculus which
was originally described in [17], although we describe it here in terms which are
more compatible with the principal bundle structure. Finally, we show that $S^2_q$ is a
framed quantum manifold, in the sense that its cotangent bundle is a vector bundle
associated to the Hopf fibration $SU_q(2) \to S^2_q$.

4.1. Differential structure on $SU_q(2)$

In the following we write $\epsilon_P$ for the counit of the Hopf algebra $P := A[SU_q(2)]$.
In terms of the matrix elements in (3.3), we define $I_P$ to be the right ideal of
$P^+ := \mathrm{Ker}\,\epsilon_P$ generated by the nine elements

\[ b^2, \quad c^2, \quad b(a - d), \quad c(a - d), \quad a^2 + q^2d^2 - (1 + q^2)(ad + q^{-1}bc), \]
\[ zb, \quad zc, \quad z(a - d), \quad z(q^2a + d - (q^2 + 1)), \tag{4.1} \]

where $z := q^2a + d - (q^3 + q^{-1})$. As discussed in Sec. 2.1, this ideal defines a
left-covariant first order differential calculus on $SU_q(2)$, which we denote by $\Omega^1 P$. In
fact, one checks that $I_P$ is stable under the right adjoint coaction $\mathrm{Ad}_R$ and so this
calculus is bicovariant under left and right coactions of $A[SU_q(2)]$. It is precisely the
$4D_+$ calculus on $SU_q(2)$ introduced in [22]: indeed, one may check that the space
$\Lambda^1 \cong P^+/I_P$ of left-invariant one-forms is a four-dimensional vector space.
Following [11], we define elements $L_-$, $L_0$, $L_+$, $L_z$ of $U_q(su(2))$ by

\[ L_- := q^{1/2}FK^{-1}, \qquad L_+ := q^{-1/2}EK^{-1}, \]
\[ L_0 := K^2 + \mu^2q^{-1}FE - 1, \qquad L_z := K^{-2} - 1. \]

The vectors $L_0$ and $L_z$ are related to the quantum Casimir (3.6) by

\[ (q - q^{-1})^2\left(C_q + \tfrac{1}{4} - \left[\tfrac{1}{2}\right]^2\right) = qL_0 + q^{-1}L_z. \tag{4.2} \]

The elements $L_\pm$, $L_0$, $L_z$ act upon $A[SU_q(2)]$ via the formula (3.12) and
together provide a basis for the quantum tangent space $T_P$ of the calculus. Note in
particular that the element $C_q - \epsilon(C_q)1$, with $\epsilon$ the counit of $U_q(su(2))$, is also an element of $T_P$.
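
The identity (4.2) is an algebraic one, so it can be spot-checked in any representation. The Python sketch below does this on the weight vectors $|j, m\rangle$ of the spin-$j$ representation, where both sides act diagonally. It assumes the readings $L_0 = K^2 + (q - q^{-1})^2q^{-1}FE - 1$ and $L_z = K^{-2} - 1$, together with the eigenvalues $q^{-m}$ for $K$, $[j - m + 1][j + m]$ for $FE$ and $[j + \tfrac12]^2 - \tfrac14$ for $C_q$; the function names are ours.

```python
def qn(x, q):
    """q-number [x] = (q**x - q**(-x)) / (q - 1/q)."""
    return (q**x - q**(-x)) / (q - 1 / q)


def casimir_identity_gap(two_j, q):
    """Largest deviation between the two sides of (4.2) over m = -j, ..., j."""
    j = two_j / 2
    mu = q - 1 / q
    # Left-hand side, using the Casimir eigenvalue (3.8): C_q + 1/4 = [j + 1/2]^2.
    lhs = mu**2 * (qn(j + 0.5, q) ** 2 - qn(0.5, q) ** 2)
    gap = 0.0
    for i in range(two_j + 1):
        m = -j + i
        k2 = q ** (-2 * m)                     # eigenvalue of K^2 (assumed reading)
        fe = qn(j - m + 1, q) * qn(j + m, q)   # eigenvalue of FE
        l0 = k2 + mu**2 / q * fe - 1           # eigenvalue of L_0
        lz = 1 / k2 - 1                        # eigenvalue of L_z
        gap = max(gap, abs(q * l0 + lz / q - lhs))
    return gap


print(casimir_identity_gap(3, 0.6))
```

A gap at the level of rounding error for several spins and values of $q$ is consistent with (4.2); it is of course not a proof.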
Let $\{\omega_-, \omega_0, \omega_+, \omega_z\}$ be a basis of the space of left-invariant one-forms $\Lambda^1$ such
that $(L_j, \omega_k) = \delta_{jk}$ for $j, k = -, 0, +, z$. As given in [19], the bimodule relations in
the calculus $\Omega^1 P$ with respect to these one-forms are:
     
a b a b 2 1 b 0
= + q 0 ;
c d c d d 0
     
a b a b 2 1 0 a
+ = + + q 0 ;
c d c d 0 c

   1 
a b q a qb
0 = 0 ;
c d q 1 c qd
     
a b 0 a 2 1 a 0
z = + q 0
c d 0 c c 0
   
b 0 qa q 1 b
+ + + z .
d 0 qc q 1 d
(4.3)

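In these terms, the exterior derivative $d: A[SU_q(2)] \to \Omega^1 P$ has the form

\[ dp = (L_- \triangleright p)\omega_- + (L_0 \triangleright p)\omega_0 + (L_+ \triangleright p)\omega_+ + (L_z \triangleright p)\omega_z, \qquad p \in A[SU_q(2)], \tag{4.4} \]

where $\triangleright$ is the left action of $U_q(su(2))$ on $A[SU_q(2)]$ defined in (3.12). By using
the formulæ (3.13) to compute the action of $L_0$, $L_z$, $L_+$, $L_-$ on the generators of
$A[SU_q(2)]$ and then substituting into (4.4), one obtains the explicit expressions

\[ da = (q^{-1} - 1 + \mu^2q^{-1})\,a\,\omega_0 + b\,\omega_+ + (q - 1)\,a\,\omega_z, \]
\[ db = a\,\omega_- + (q - 1)\,b\,\omega_0 + (q^{-1} - 1)\,b\,\omega_z, \]
\[ dc = (q^{-1} - 1 + \mu^2q^{-1})\,c\,\omega_0 + d\,\omega_+ + (q - 1)\,c\,\omega_z, \tag{4.5} \]
\[ dd = c\,\omega_- + (q - 1)\,d\,\omega_0 + (q^{-1} - 1)\,d\,\omega_z \]

for the differentials of the matrix generators of $A[SU_q(2)]$ in terms of these
left-invariant one-forms.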
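
The coefficients in (4.5) can be reproduced mechanically. Since $X \triangleright x^i{}_j = \sum_k x^i{}_k (X, x^k{}_j)$, the action of $X$ on the generator matrix is right multiplication by the $2 \times 2$ matrix $\rho(X)_{kj} = (X, x^k{}_j)$, and $\rho$ is multiplicative by (3.10). The Python sketch below computes $\rho$ for the four tangent vectors, assuming the pairing values $(K, a) = q^{-1/2}$, $(K, d) = q^{1/2}$, $(E, c) = (F, b) = 1$ and the definitions $L_- = q^{1/2}FK^{-1}$, $L_+ = q^{-1/2}EK^{-1}$, $L_0 = K^2 + (q - q^{-1})^2q^{-1}FE - 1$, $L_z = K^{-2} - 1$; the helper names are ours.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def lin(*terms):
    """Linear combination sum(c * M) of 2x2 matrices given as (c, M) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(2)] for i in range(2)]


def tangent_vectors(q):
    """Pairing matrices rho(X)[k][j] = (X, x^k_j) for L_-, L_+, L_0, L_z."""
    mu = q - 1 / q
    I = [[1.0, 0.0], [0.0, 1.0]]
    K = [[q ** -0.5, 0.0], [0.0, q ** 0.5]]     # diagonal entries (K, a), (K, d)
    Kinv = [[q ** 0.5, 0.0], [0.0, q ** -0.5]]
    E = [[0.0, 0.0], [1.0, 0.0]]                # (E, c) = 1, all other entries 0
    F = [[0.0, 1.0], [0.0, 0.0]]                # (F, b) = 1, all other entries 0
    return {
        "L-": lin((q ** 0.5, matmul(F, Kinv))),
        "L+": lin((q ** -0.5, matmul(E, Kinv))),
        "L0": lin((1.0, matmul(K, K)), (mu ** 2 / q, matmul(F, E)), (-1.0, I)),
        "Lz": lin((1.0, matmul(Kinv, Kinv)), (-1.0, I)),
    }


T = tangent_vectors(0.5)
# Column 0 describes the action on a and c, column 1 the action on b and d:
#   T["L+"][1][0] = 1 gives L_+ |> a = b and L_+ |> c = d,
#   T["L-"][0][1] = 1 gives L_- |> b = a and L_- |> d = c,
#   the diagonals of T["L0"], T["Lz"] give the omega_0, omega_z coefficients of (4.5).
for name, M in T.items():
    print(name, M)
```

For $q = 1/2$ the diagonal of `T["L0"]` is $(q^{-1} - 1 + \mu^2q^{-1},\ q - 1) = (5.5, -0.5)$ and that of `T["Lz"]` is $(q - 1,\ q^{-1} - 1) = (-0.5, 1)$, matching the coefficients of $\omega_0$ and $\omega_z$ in (4.5).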

4.2. Framed manifold structure of $S^2_q$

Next, we use Sec. 2.3 to compute the cotangent bundle $\Omega^1 S^2_q$ of the base space
$S^2_q$ of the Hopf fibration as an associated vector bundle. As before, we write
$P = A[SU_q(2)]$ for the algebra of functions on the total space of the Hopf fibration,
$M = A[S^2_q]$ for the algebra of functions on the base and $H = A[U(1)]$ for the
structure quantum group. Recall the right coaction $\Delta_R: P \to P \otimes H$ defined in (3.16)
and the canonical projection $\pi: P \to H$ defined in (3.15).
The differential calculus on $P$ is taken to be the four-dimensional bicovariant
calculus $\Omega^1 P$ defined in the previous section; it is defined in terms of the $\mathrm{Ad}_R$-invariant
ideal $I_P$ generated by the elements in (4.1). Now writing $\epsilon_H$ for the counit of $H$, we
obtain a bicovariant differential calculus $\Omega^1 H$ on $H = A[U(1)]$ by projecting the
ideal $I_P$ to obtain an ideal $I_H := \pi(I_P)$ of $\mathrm{Ker}\,\epsilon_H$. As such, $I_H$ is generated by the
three elements

\[ t^2 + q^2t^{*2} - (1 + q^2), \qquad z(t - t^*), \qquad z(q^2t + t^* - (q^2 + 1)), \tag{4.6} \]

again with $z = q^2t + t^* - (q^3 + q^{-1})$, where $t$, $t^*$ are the generators of $H$.



Lemma 4.1. The calculus $\Omega^1 H$ is one-dimensional. It is spanned as a left module
by the left-invariant one-form $\omega_t := t^*dt$ and has bimodule relations

\[ \omega_t\,t = q\,t\,\omega_t, \qquad \omega_t\,t^* = q^{-1}t^*\omega_t, \]

where $t$, $t^*$ are the generators of $H = A[U(1)]$.

Proof. We define an equivalence relation on $H^+$ by $x \sim y$ if and only if
$x - y \in I_H$. By taking a linear combination of the generators in (4.6), one finds in particular
that $(t - 1) + q(t^* - 1) \sim 0$, which is our key equivalence. Using it, one deduces that

\[ t^2 = (t + 1)(t - 1) + 1 \sim -q(t + 1)(t^* - 1) + 1 = q(t - t^*) + 1 \sim (q + 1)(t - 1) + 1, \]
\[ t^{*2} = (t^* + 1)(t^* - 1) + 1 \sim -q^{-1}(t^* + 1)(t - 1) + 1 = q^{-1}(t^* - t) + 1 \sim -q^{-1}(1 + q^{-1})(t - 1) + 1, \]

so that every quadratic polynomial in $t$, $t^*$ and 1 is equivalent to a linear
combination of $t - 1$ and $t^* - 1$. By induction any polynomial in $t$, $t^*$ is equivalent to such
a linear combination. Applying the key equivalence once more tells us that we can
always eliminate $t^* - 1$. Thus we take $t - 1$ as a representative of the quotient
space $H^+/I_H$ and $\omega_t := r^{-1}(1 \otimes [t - 1])$ as the corresponding left-invariant
one-form, which spans the calculus $\Omega^1 H$ as a left $H$-module. To obtain the bimodule
relations, we compute for example that

\[ \omega_t\,t = r^{-1}(1 \otimes [t - 1])\,t = r^{-1}(t \otimes [t^2 - t]) = q\,t\,r^{-1}(1 \otimes [t - 1]) = q\,t\,\omega_t, \]

where $[\,\cdot\,]$ denotes an equivalence class modulo $I_H$. The first and last equalities
use the definition of the map $r$ and the middle equality uses the bimodule
structure (2.3).

The differential calculus $\Omega^1 M$ on the base of the fibration is defined by
restricting the calculus $\Omega^1 P$ to $M$. This means that it is defined as the quotient
$\Omega^1 M := \tilde\Omega^1 M/N_M$, where $N_M$ is the $M$-$M$-bimodule $N_M := N_P \cap \tilde\Omega^1 M$. We
postpone the computation of generators and relations for $\Omega^1 M$ and observe that for
now we have the following expressions for the exterior derivative on $M$ in terms of
the left-invariant one-forms $\omega_\pm$, $\omega_0$.

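Lemma 4.2. The exterior derivative $d$ acts on $M = A[S^2_q]$ as

\[ \begin{pmatrix} db_+ \\ db_0 \\ db_- \end{pmatrix} = \begin{pmatrix} d^2 & \mu^2q^{-1}\nu\,cd & qc^2 \\ db & \mu^2q^{-1}(1 + \nu bc) & ac \\ b^2 & \mu^2q^{-1}\nu\,ab & qa^2 \end{pmatrix} \begin{pmatrix} \omega_+ \\ \omega_0 \\ \omega_- \end{pmatrix} \tag{4.7} \]

in terms of the generators $b_\pm$, $b_0$ of $M$ given in (3.18).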



Proof. This follows from direct computation. For example, to compute $db_+$ the
Leibniz rule yields

\[ db_+ = d(cd) = (dc)d + c(dd). \]

One uses the expressions (4.5) to rewrite $dc$, $dd$ in terms of $\omega_\pm$ and $\omega_0$, then the
bimodule relations in Eqs. (4.3) to collect all coefficients to the left. Combining
like terms yields the expression as stated. The same method works for
computing $db_0$ and $db_-$.

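Lemma 4.3. With $P$, $H$ and $M$ as above, the differential calculi $\Omega^1 P$, $\Omega^1 H$ and
$\Omega^1 M$ satisfy the compatibility conditions of (2.11).

Proof. The relation $\pi(I_P) = I_H$ holds by definition of the calculus on $H$. It is
sufficient to verify the $\mathrm{Ad}_R$-condition in (2.11) on generators: one finds that

\[ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(c^2) = c^2 \otimes t^4, \qquad (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(c(a - d)) = c(a - d) \otimes t^2, \]
\[ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(b^2) = b^2 \otimes t^{-4}, \qquad (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(b(a - d)) = b(a - d) \otimes t^{-2}, \]
\[ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(zc) = zc \otimes t^2, \qquad (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(zb) = zb \otimes t^{-2}, \]

with all other generators coinvariant under the map $(\mathrm{id} \otimes \pi)\mathrm{Ad}_R$.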

This means that we may apply Sec. 2.3 to express $S^2_q$ as a framed quantum
manifold. The framing comodule $V$ is computed as follows. Clearly $P^+ \cap M$ is
equal to $M^+ = \mathrm{Ker}\,\epsilon_M$, where $\epsilon_M$ is the restriction of the counit $\epsilon_P$ to the subalgebra $M$. In
our case, with $M = A[S^2_q]$ being generated by $b_\pm$, $b_0$, we have that $M^+ = \langle b_0, b_\pm \rangle$
as a right ideal. To compute $I_P \cap M$ we note that, since the generators $b(a - d)$,
$c(a - d)$, $a^2 + q^2d^2 - (1 + q^2)(ad + q^{-1}bc)$, $zb$, $zc$, $z(a - d)$, $z(q^2a + d - (q^2 + 1))$ are not
of homogeneous degree, the ideal that each of them generates has no intersection
with $M$. Thus we concentrate on the generators $b^2$, $c^2$ of $I_P$. The elements of degree
zero in $\langle b^2 \rangle$ include $b^2\{a^2, ac, c^2\}$ and so we see that $b_-^2$, $b_-b_0$, $b_0^2$ all lie in $I_P \cap M$.
Similarly, from the ideal $\langle c^2 \rangle$ we see that $b_+^2$ and $b_+b_0$ are also in $I_P \cap M$. From this
discussion we obtain

\[ V = \langle b_0, b_\pm \rangle / \langle b_\pm^2, b_0^2, b_\pm b_0 \rangle. \tag{4.8} \]

Hence $V$ is three-dimensional with representatives $b_\pm$ and $b_0$. We compute the right
coaction of $H$ on $V$ from (2.12) as

\[ \Delta_R(b_+) = cd \otimes S\pi(d^2) = b_+ \otimes t^2, \]
\[ \Delta_R(b_-) = ab \otimes S\pi(a^2) = b_- \otimes t^{-2}, \]
\[ \Delta_R(b_0) = bc \otimes 1 = b_0 \otimes 1. \]

Hence $V = \mathbb{C} \oplus \mathbb{C} \oplus \mathbb{C}$ and the associated bundle
E = L2 L0 L+2 = A[SUq (2)]2 A[SUq (2)]0 A[SUq (2)]2

is the direct sum of the line bundles over $S^2_q$ with winding numbers $-2$, $0$ and $2$.
This yields the following theorem.

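Theorem 4.4. The homogeneous space $S^2_q$ is a framed quantum manifold with
cotangent bundle

\[ \Omega^1 S^2_q \cong L_{-2} \oplus L_0 \oplus L_{+2}. \]

The isomorphism is given by the soldering form

\[ \theta(b_+) = q^2c^2\,db_- - q\nu\,ac\,db_0 + a^2\,db_+ = \omega_+, \]
\[ \theta(b_0) = -q\,dc\,db_- + (1 + \nu bc)\,db_0 - q^{-1}ba\,db_+ = \mu^2q^{-1}\omega_0, \]
\[ \theta(b_-) = d^2\,db_- - q^{-1}\nu\,bd\,db_0 + q^{-2}b^2\,db_+ = q\,\omega_- \]

and makes $\Omega^1 S^2_q$ projective as a left $A[S^2_q]$-module.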

Proof. The only remaining part is to compute the soldering form $\theta(b_\pm)$, $\theta(b_0)$. We
find the left coaction on $M = A[S^2_q]$ inherited from the coproduct on $A[SU_q(2)]$ to
be

\[ \Delta_L(b_+) = \Delta_L(cd) = c^2 \otimes b_- + cd \otimes (1 + \nu b_0) + d^2 \otimes b_+, \]
\[ \Delta_L(b_0) = \Delta_L(bc) = ca \otimes b_- + 1 \otimes b_0 + bc \otimes (1 + \nu b_0) + db \otimes b_+, \]
\[ \Delta_L(b_-) = \Delta_L(ab) = a^2 \otimes b_- + ab \otimes (1 + \nu b_0) + b^2 \otimes b_+. \]

In fact these coproducts were already used in computing $\Delta_R$ above. This time we
apply the antipode $S$ to the first tensor factor to obtain

\[ \theta(b_+) = S(b_{+(1)})\,d(b_{+(2)}) = q^2c^2\,db_- - q\nu\,ac\,db_0 + a^2\,db_+, \]

and similarly for $\theta(b_-)$ and $\theta(b_0)$. This yields the middle expressions as stated. We then
insert the expressions from Lemma 4.2 to obtain $\{\omega_+, \mu^2q^{-1}\omega_0, q\omega_-\}$ for the values
of the map $\theta$. According to Sec. 2.3, the map $\theta: V \to P\Omega^1 M$ is well-defined on
$V$. In order to get one-forms on $A[S^2_q]$, one must multiply $\theta(b_-)$ by an element of
$L_{-2}$, $\theta(b_+)$ by an element of $L_{+2}$ and $\theta(b_0)$ by an element of $L_0$, that is, of degree zero.
Moreover, every one-form is obtained in this way. This yields the isomorphism as
stated. Since all line bundles $L_n$ are projective, so is $\Omega^1 S^2_q$.

The above also shows that the exterior derivative $d$ in the calculus $\Omega^1 S^2_q$ is given
by restriction of the expression in (4.4), namely

\[ dm = (L_- \triangleright m)\omega_- + (L_0 \triangleright m)\omega_0 + (L_+ \triangleright m)\omega_+, \qquad m \in A[S^2_q]. \tag{4.9} \]

We stress that $L_\pm \triangleright m \in L_{\pm 2}$ rather than being an element in $A[S^2_q]$. Of course,
from (4.4) combined with the fact that the vertical vector field $L_z$ obeys $L_z \triangleright m = 0$
for all $m \in A[S^2_q]$, we already expected this to be the case. From Theorem 4.4 we
know that $\Omega^1 S^2_q$ is spanned as a left module by
{d2 , db, b2 } + := {+ b+ , + b0 , + b },
2 q 1 {cd, 1 + bc, ab} 0 := {0 b+ , 0 b0 , 0 b }, (4.10)
2 2
{qc , ac, qa } := { b+ , b0 , b }.
The bimodule relations in the calculus $\Omega^1 S^2_q$ are in general quite complicated to
compute directly, but we can use the expressions in Eqs. (4.10) to break them into
smaller pieces which are much easier to work with.

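Corollary 4.5. The cotangent bundle $\Omega^1 S^2_q$ has first order differential sub-calculi

\[ \Omega^1_+ = L_{-2} \oplus L_0, \qquad \Omega^1_0 = L_0, \qquad \Omega^1_- = L_0 \oplus L_{+2} \]

with differentials given by $d_+ := \partial_+ + \partial_0$, $d_0 := \partial_0$ and $d_- := \partial_0 + \partial_-$ respectively.
These calculi obey the bimodule relations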
2

q b+ (+ b+ ) + q 3 1 b+ (0 b+ )
b +

q 4 b ( b ) + 1 q 2 (1 + q 3 b )( b )
0 + + 0 0 +
+ b+ b0 =

q 2 b (+ b+ ) (q 2 q 2 )b+ (+ b ) + 0 b0



b
+ (q 2 q 2 )1 (q 2 b (0 b+ ) b+ (0 b )) q 1 b+ (0 b ),
3 1

b +
b+ (+ b0 ) + q b0 (0 b+ )
+ b0 b0 = q 2 b0 (+ b0 ) + q 2 1 b+ (0 b )


2
b q b (+ b0 ) q 1 b0 (+ b ) + q 2 (1 + q 1 b0 )(0 b ),
2 2 2 1 2

b + q b+ (+ b ) + (q q ) (q b (0 b+ ) b+ (0 b ))

+ b b0 = b0 (+ b ) + q 1 1 b0 (0 b )


2
b q b (+ b ) + q 3 1 b (0 b ),
2 3 1

b +
q b+ ( b+ ) + q b+ (0 b+ )
b+ b0 = b0 ( b+ ) + q1 b0 (0 b+ )


2
b q b ( b+ ) + (q 2 q 2 )1 (b (0 b+ ) q 2 b+ (0 b )),
2 1

b +
q b+ ( b0 ) + qb0 ( b+ ) + (1 + qb0 )(0 b+ )
b0 b0 = q 2 b0 ( b0 ) + 1 b (0 b+ )



b b ( b0 ) + q 3 1 b0 (0 b ),
2

q b+ ( b ) + (q 2 q 2 )b ( b+ ) + q 2 0 b0
b

+ + (q 2 q 2 )1 (b (0 b+ ) q 2 b+ (0 b )) + qb (0 b+ )
b b0 =

4 1 3
q b0 ( b ) + (1 + q b0 )(0 b )


b 2
q b ( b ) + q 3 1 b (0 b ).

Proof. Using the expressions in Eqs. (4.10), the bimodule relations in $\Omega^1 S^2_q$ are easily
determined from a straightforward but laborious computation along the following
lines. From the bimodule relations in Eqs. (4.3) one finds that
2 1 2

b +
b + + + q c 0
+ b0 = b0 + + 2 q 1 ca0



b b + + 2 q 1 a2 0 ,
2 2

b +
b + + d 0
b0 = b0 + 2 db0



b b + 2 b2 0 ,

with $\omega_0$ commuting with each of $b_\pm$, $b_0$. Combining these with the algebra relations
in $A[SU_q(2)]$ yields the bimodule relations as stated, together with


b +
b+ (0 b+ )
0 b+ b0 = q 2 b0 (0 b+ )


2
b q b (0 b+ ) q 2 b (0 b+ ) + b+ (0 b ),
2 1

b + q b+ (0 b0 ) q (0 b+ )

0 b0 b0 = b0 (0 b0 )


2
b q b (0 b0 ) + q 1 1 (0 b ),
2 2

b + q b+ (0 b ) + b (0 b+ ) q b+ (0 b )

0 b b0 = q 2 b0 (0 b )



b b (0 b ).

The fact that $\Omega^1_+ = L_{-2} \oplus L_0$, $\Omega^1_0 = L_0$ and $\Omega^1_- = L_0 \oplus L_{+2}$ close as sub-bimodules
is now clear by inspection. The Leibniz rules for the differentials $d_+$, $d_0$ and $d_-$
follow from the Leibniz rule for d and the direct sum decomposition of $\Omega^1 S^2_q$.

Corollary 4.6. The one-forms in the calculus $\Omega^1 S^2_q$ enjoy the relations

+ b0 = q 2 b (+ b+ ) q 2 b+ (+ b ),
b0 b (+ b+ ) = q 3 (1 + qb0 )b+ (+ b ),
b0 = b+ ( b ) q 4 b ( b+ ),
b0 b+ ( b ) = q 3 (1 + q 1 b0 )b ( b+ ),
b0 0 b0 = q 1 b (0 b+ ) + q 1 1 b+ (0 b ),
b+ (0 b0 ) = (1 + q 2 b0 )0 b+ ,
b (0 b0 ) = (1 + q 2 b0 )0 b+ .

Proof. These are obtained in analogy with the proof of Corollary 4.5, from the
relations in $A[SU_q(2)]$ acting on $\omega_\pm$ and $\omega_0$. One finds the relations as stated,
together with

b+ (+ b ) = q 1 b0 (+ b0 ), b (+ b+ ) = q 2 (1 + qb0 )(+ b0 ),
b ( b+ ) = q 2 b0 ( b0 ), b+ ( b ) = q 1 (1 + q 1 b0 )( b0 ).

There are other relations involving the differential $\partial_0$, but they are quite complicated
(since the sphere relation in $A[S^2_q]$ does not explicitly involve the unit) and
are not particularly illuminating, so we shall not give them here.

Finally, we use Theorem 4.4 to compute the differentials $\partial_\pm$ and $\partial_0$ in terms of
the exterior derivative d. Using the algebra relations in $A[SU_q(2)]$ and the expressions
in Eqs. (4.10) we find that

+ b+ = q 1 b2+ db b+ (1 + q 1 b0 )db0 + (1 + q 1 b0 )2 db+ + q 2 b+ b db+ ,


+ b0 = qb+ b0 db b+ b db0 + q 2 (1 + q 1 b0 )b db+ ,
+ b = q 2 b20 db q 1 b b0 db0 + q 3 b2 db+ ,
0 b+ = b2+ db + b+ (1 + b0 )db0 q 2 b+ b db+ ,
0 b0 = (1 + b0 )(b+ db + (1 + b0 )db0 q 2 b db+ ),
0 b = b b+ db + b (1 + b0 )db0 q 2 b2 db+ ,
b+ = qb2+ db q 1 b0 b+ db0 + q 2 b20 db+ ,
b0 = (1 + qb0 )b+ db qb0 (1 + qb0 )db0 + q 2 b b0 db+ ,
b = ((1 + qb0 )2 + b b+ )db b (1 + qb0 )db0 + q 1 b2 db+ .

These expressions may now be used to compute the full bimodule structure of the
calculus $\Omega^1 S^2_q$ in terms of the differential d, as well as the deeper structure of the
noncommutative Riemannian geometry of this calculus, along similar lines to [14].
However, since our objective is to study the spin geometry of the calculus, we have
all we need and so we shall not pursue these directions here.

5. The Spectral Geometry of $S^2_q$

In this section, we realize the three-dimensional differential calculus $\Omega^1 S^2_q$ by a
spectral triple on $S^2_q$. This means equipping $S^2_q$ with a spinor bundle $S$ and a Dirac
operator $D$ which together implement the exterior derivative d for $\Omega^1 S^2_q$. We then
equip this spectral triple with a real structure for which the commutant property
and the first order condition for the Dirac operator are satisfied up to infinitesimals
of arbitrary order, in parallel with the results of [7] for the two-dimensional
calculus on $S^2_q$.

5.1. Background on spectral triples


We recall briefly the notion of a spectral triple [2].

Definition 5.1. A unital spectral triple $(A, H, D)$ consists of a complex unital
$*$-algebra $A$, faithfully $*$-represented by bounded operators on a (separable) Hilbert
space $H$, and a self-adjoint operator $D\colon H \to H$ (the Dirac operator) with the
following properties:

(i) the resolvent $(D - \lambda)^{-1}$, $\lambda \notin \mathbb{R}$, is a compact operator on $H$;
(ii) for all $a \in A$ the commutator $[D, \pi(a)]$ is a bounded operator on $H$.

A spectral triple $(A, H, D)$ is called even if there exists a $\mathbb{Z}_2$-grading of $H$, i.e. an
operator $\gamma\colon H \to H$ with $\gamma = \gamma^*$ and $\gamma^2 = 1$, such that $\gamma D + D\gamma = 0$ and $\gamma a = a\gamma$
for all $a \in A$. Otherwise the spectral triple is said to be odd.

With $0 < n < \infty$, the Dirac operator $D$ is said to be $n^+$-summable if
$(D^2 + 1)^{-1/2}$ is in the Dixmier ideal $L^{n+}(H)$. The metric dimension of the spectral
triple $(A, H, D)$ is defined to be the infimum of the set of all $n$ such that $D$ is
$n^+$-summable.
Given a spectral triple $(A, H, D)$, one associates to it a canonical first order
differential calculus $(\Omega^1_D A, d_D)$. In particular, the $A$-$A$-bimodule $\Omega^1_D A$ is defined
to be

\[ \Omega^1_D A := \Big\{ \omega = \sum_j a^j_0 [D, a^j_1] \;\Big|\; a^j_0, a^j_1 \in A \Big\}, \tag{5.1} \]

with the differential $d_D$ given by $d_D a = [D, a]$ for $a \in A$.


The original definition [3] of a real structure on a spectral triple $(A, H, D)$ was
given by an anti-unitary operator $J\colon H \to H$ with the properties $J^2 = \pm 1$, $JD = \pm DJ$ and

\[ [\pi(a), J\pi(b)J^{-1}] = 0, \qquad [[D, \pi(a)], J\pi(b)J^{-1}] = 0, \qquad a, b \in A. \tag{5.2} \]

These are called the commutant property and the first order condition respectively.
However, in many examples involving quantum spaces, one needs to modify
these conditions in order to obtain non-trivial spin geometries [5–8]. Following the
approach there, we impose the weaker assumption that (5.2) holds only up to
infinitesimals of arbitrary order (i.e. up to compact operators $T$ with the property
that the singular values $s_k(T)$ satisfy $\lim_{k\to\infty} k^p s_k(T) = 0$ for all $p > 0$).

Definition 5.2. A real structure on a spectral triple $(A, H, D)$ is an anti-unitary
operator $J\colon H \to H$ such that

\[ J^2 = \pm 1, \qquad JD = \pm DJ, \]
\[ [\pi(a), J\pi(b)J^{-1}] \in I, \qquad [[D, \pi(a)], J\pi(b)J^{-1}] \in I, \qquad a, b \in A, \tag{5.3} \]

where $I$ is an operator ideal of infinitesimals of arbitrary order. We say that the
datum $(A, H, D, J)$ is a real spectral triple (up to infinitesimals). If $(A, H, D, \gamma)$ is
even and $J\gamma = \pm\gamma J$, we call the datum $(A, H, D, \gamma, J)$ an even real spectral triple
(up to infinitesimals).
The signs above depend on the so-called KO-dimension of the triple. We shall
only need the case where the KO-dimension is two; then $J^2 = -1$, $JD = DJ$ and
$J\gamma = -\gamma J$.

5.2. A Dirac operator on $S^2_q$

In order to define a spectral triple on $S^2_q$, we need a spinor bundle over $S^2_q$ and an
associated Dirac operator, which we require should recover the differential calculus
$\Omega^1 S^2_q$ via the commutator representation defined in (5.1). Since the differential
calculus $\Omega^1 S^2_q$ constructed in Theorem 4.4 is equivariant under a left coaction of
$A[SU_q(2)]$ and hence a right action of $U_q(su(2))$, we are led to consider spinor
bundles and Dirac operators which are right $U_q(su(2))$-equivariant.
Guided by this principle, as well as by the spin structure of the classical two-sphere
$S^2$, for the $A[S^2_q]$-module of spinors we take

\[ S = S_+ \oplus S_- := L_{-1} \oplus L_{+1}. \]

As right $U_q(su(2))$-modules, the vector spaces $S_\pm$ are both isomorphic to the direct
sum

\[ V := \bigoplus_{j \in \mathbb{N} + \frac12} V_j \tag{5.4} \]

over all irreducible $U_q(su(2))$-modules $V_j$ with spin $j \in \mathbb{N} + \tfrac12$ a half-odd integer.
A corresponding basis for $V$ is then given by

\[ \Big\{ |j, m\rangle \;\Big|\; j \in \mathbb{N} + \tfrac12,\ m = -j, \dots, j \Big\}, \]

where the vectors $|j, m\rangle$ span the irreducible $U_q(su(2))$-module $V_j$ in Eqs. (3.7). We
denote the orthonormal bases of the two different copies $S_\pm$ of $V$ respectively by

\[ |j, m\rangle_\pm, \qquad j \in \mathbb{N} + \tfrac12, \quad m = -j, \dots, j. \tag{5.5} \]

We equip $S$ with the inner product which makes this basis orthonormal and write
$H$ for the corresponding Hilbert space completion of $S$.
As $A[S^2_q]$-modules, the vector spaces $S_\pm$ each carry one of two inequivalent
$U_q(su(2))$-equivariant representations of $A[S^2_q]$,

\[ \pi_\pm\colon A[S^2_q] \to \mathrm{End}(S_\pm). \]

Recall that $S_\pm$ are just the subspaces of $A[SU_q(2)]$ with overall degrees $\mp 1$ with
respect to the $\mathbb{Z}$-grading (3.17), so the representations on $S_\pm$ are simply given
by restricting the multiplication in $A[SU_q(2)]$ to the appropriate degrees. However,
it is possible to describe these representations explicitly in terms of the basis (5.5)
in the following way.
Indeed, the $U_q(su(2))$-equivariant representations of $A[S^2_q]$ on $V$ were already
described in [7, 21]. To be able to simply quote them we make a change of generators,
now writing

x1 = q 1/2 b+ , x0 1 = b0 , x1 = q 3/2 b , (5.6)

where b , b0 are the generators of A[Sq2 ] dened in (3.18), and = q + q 1 . With


respect to these new generators, the algebra relations of A[Sq2 ] now read

x1 (x0 1) = q 2 (x0 1)x1 ,


x1 (x0 1) = q 2 (x0 1)x1 ,
(q 2 x0 + 1)(x0 1) = (q + q 1 )x1 x1 ,
(q 2 x0 + 1)(x0 1) = (q + q 1 )x1 x1 .

Then, with N = 1/2, the two representations = 1/2 of A[Sq2 ] on S have


the form

N (xi )|j, m = 0
i (j, m; N )|j 1, m + i + i (j, m; N )|j, m + i

+ +
i (j, m; N )|j + 1, m + i , (5.7)

where the coefficients are determined by


 1/2
[j + m + 1][j + m + 2]
+
1 (j, m; N ) =q j+m
N (j + 1),
[2j + 1][2j + 2]

01 (j, m; N ) = q m+2 ([2][j m][j + m + 1])1/2 [2j]1 N (j),


 1/2
j+m+1 [j m 1][j m]
1 (j, m; N ) = q N (j),
[2j 1][2j]
 1/2
+ m [2][j m + 1][j + m + 1]
0 (j, m; N ) = q N (j + 1),
[2j + 1][2j + 2]

00 (j, m; N ) = [2j]1 ([j m + 1][j + m] q 2 [j m][j + m + 1])N (j),


 1/2
m [2][j m][j + m]
0 (j, m; N ) = q N (j),
[2j 1][2j]
 1/2
[j m + 1][j m + 2]
+
1 (j, m; N ) = q j+m
N (j + 1),
[2j + 1][2j + 2]
01 (j, m; N ) = q m ([2][j m + 1][j + m])1/2 [2j]1 N (j),


 1/2
[j + m 1][j + m]

1 (j, m; N ) = q
j+m1
N (j)
[2j 1][2j]
(with the convention that 1 1
i ( 2 , 2 ; N ) = 0) and the real numbers N (j), N (j) are

N (j) = ([2j + 1][2j])1/2 ([2][j + N ][j N ])1/2 ([2j + 1][2j])1/2 q N ,


    
1 1 1 1 3
N (j) = q [2j + 2] (q (q q ) [j][j + 1] ,
2 2
with = sign(N ).
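Next we come to the Dirac operator. With the $2 \times 2$ Pauli matrices

\[ \sigma_+ := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_0 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \sigma_- := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \]

one has the relations

\[ \sigma_+\sigma_- = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_0^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_-\sigma_+ = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \]
\[ \sigma_0\sigma_+ = \sigma_+, \quad \sigma_+\sigma_0 = -\sigma_+, \quad \sigma_+^2 = \sigma_-^2 = 0, \quad \sigma_0\sigma_- = -\sigma_-, \quad \sigma_-\sigma_0 = \sigma_-. \tag{5.8} \]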
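The relations (5.8) are easy to confirm by direct multiplication. The following is a minimal sketch in exact integer arithmetic; the names `SIGMA_PLUS`, `SIGMA_ZERO`, `SIGMA_MINUS` and the helper functions are illustrative choices, not notation from the text. It also checks that the two anticommutators with the diagonal matrix vanish, which is the property that removes the cross terms when the Dirac operator is squared.

```python
# Exact 2x2 integer matrices as nested lists; no external libraries needed.
def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * x for x in row] for row in a]


SIGMA_PLUS = [[0, 1], [0, 0]]
SIGMA_ZERO = [[1, 0], [0, -1]]
SIGMA_MINUS = [[0, 0], [1, 0]]
ZERO = [[0, 0], [0, 0]]


def pauli_relations_hold():
    """Return True if every relation listed in (5.8) holds."""
    sp, s0, sm = SIGMA_PLUS, SIGMA_ZERO, SIGMA_MINUS
    checks = [
        matmul(sp, sm) == [[1, 0], [0, 0]],
        matmul(s0, s0) == [[1, 0], [0, 1]],
        matmul(sm, sp) == [[0, 0], [0, 1]],
        matmul(s0, sp) == sp,
        matmul(sp, s0) == scale(-1, sp),
        matmul(sp, sp) == ZERO,
        matmul(sm, sm) == ZERO,
        matmul(s0, sm) == scale(-1, sm),
        matmul(sm, s0) == sm,
        # The anticommutators with the diagonal matrix vanish.
        add(matmul(s0, sp), matmul(sp, s0)) == ZERO,
        add(matmul(s0, sm), matmul(sm, s0)) == ZERO,
    ]
    return all(checks)
```

Calling `pauli_relations_hold()` evaluates all eleven identities at once.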
Further, we use the differential operators $D_\pm$, $D_0$,
  2 
1 1
D := L , D0 := L0 + q 2 Lz = q 1 (q q 1 )2 Cq + , (5.9)
4 2

having used the expression (4.2) for the last equality. As will become clear momentarily,
the use of $D_0$ instead of $L_0$ (the extra $L_z$ vanishing identically on $A[S^2_q]$) will lead to
a Dirac operator whose square is diagonal. We define a Dirac operator $D\colon S \to S$ by

\[ D = D_+\sigma_+ + D_0\sigma_0 + D_-\sigma_-, \tag{5.10} \]

where the $2 \times 2$ Pauli matrices $\sigma_\pm$, $\sigma_0$ act upon the column vector of $S$ by left
multiplication and the vector fields $D_\pm$, $D_0$ operate via the left action of $U_q(su(2))$
(using the symbol $\triangleright$, which we omit from now on). As mentioned above, elements
$a \in A[S^2_q]$ act as multiplicative operators on $S$ via the representations $\pi_\pm$:

\[ \pi\colon A[S^2_q] \to \mathrm{End}(S), \qquad \pi(a) := \begin{pmatrix} \pi_+(a) & 0 \\ 0 & \pi_-(a) \end{pmatrix}, \]

although we will not always explicitly denote the representation $\pi$.

Proposition 5.3. The Dirac operator $D\colon S \to S$ obeys

\[ [D, a] = (L_+ a)\sigma_+ + (L_0 a)\sigma_0 + (L_- a)\sigma_- \]

for each $a \in A[S^2_q]$.

Proof. For $\psi = (\psi_+\ \psi_-)^{\mathrm{tr}} \in S_+ \oplus S_-$, using the derivation property of the vector
fields $D_\pm$, $D_0$, the commutator $[D, a]$ works out to be

\[ [D, a]\psi = \begin{pmatrix} (D_+ a)\psi_- \\ 0 \end{pmatrix} + \begin{pmatrix} (D_0 a)\psi_+ \\ -(D_0 a)\psi_- \end{pmatrix} + \begin{pmatrix} 0 \\ (D_- a)\psi_+ \end{pmatrix}
= \big( (D_+ a)\sigma_+ + (D_0 a)\sigma_0 + (D_- a)\sigma_- \big)\psi. \]

To obtain the desired result, one simply substitutes the definitions (5.9) of $D_\pm$ and $D_0$,
observing that $L_z a = 0$ for all $a \in A[S^2_q]$.

This also shows that for all $a \in A[S^2_q]$ the commutator $[D, a]$ recovers the one-form
$da$, acting on the spinors $S$ by Clifford multiplication.
The summand $D_+\sigma_+ + D_-\sigma_-$ in the operator (5.10) is precisely the Dirac
operator of [4], corresponding [20] to the two-dimensional differential calculus on
the sphere $S^2_q$. The extra term $D_0$ in our Dirac operator is the origin of the extra
direction in the calculus $\Omega^1 S^2_q$. It is clear from (4.2) that $D_0$ vanishes when $q \to 1$,
whence the classical limit of our construction is just the canonical spectral triple
on the classical two-sphere $S^2$.
Next, we compute the spectrum of the Dirac operator. We shall use the identities
 
2 1 q 1 K 2 2 + qK 2
L+ L = qEF K = q Cq + K 2 ,
4 (q q 1 )2
  (5.11)
2 1 2
1 qK 2 + q K
L L+ = q 1 F EK 2 = q 1 Cq + K 2 ,
4 (q q 1 )2

each obtained using the expression (3.6) for the quantum Casimir $C_q$. Moreover,
we know from (3.13) that for all $\psi_\pm \in S_\pm$ we have

K 2 = q 1 , K 2 = q 1 . (5.12)

These facts lead to the following result.

Proposition 5.4. The Dirac operator $D$ obeys

\[ D^2 = q^{-2}(q - q^{-1})^4 \Big( C_q + \tfrac14 - \big[\tfrac12\big]^2 \Big)^2 + \Big( C_q + \tfrac14 \Big), \]

where $C_q$ is the quantum Casimir.

Proof. Using the Pauli relations (5.8) one computes that, for $\psi = (\psi_+\ \psi_-)^{\mathrm{tr}} \in S$,

\[ D^2\psi = D_0^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\psi + D_+D_- \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\psi + D_-D_+ \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\psi. \tag{5.13} \]

The crucial fact in this calculation is that $D_0$ is a function of the Casimir $C_q$ and
therefore commutes with $D_\pm$. Next, using the relations (5.11) and (5.12) we find

\[ D_\pm D_\mp \psi_\pm = \Big( C_q + \tfrac14 \Big)\psi_\pm \]

for each $\psi_\pm \in S_\pm$. Furthermore, we have that

\[ D_0^2 = q^{-2}(q - q^{-1})^4 \Big( C_q + \tfrac14 - \big[\tfrac12\big]^2 \Big)^2. \]

Substituting these expressions into (5.13) yields the formula as claimed.
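The step leading to (5.13) can be spelled out by expanding the square of (5.10) term by term. The sketch below assumes only the Pauli relations (5.8), the fact that the operators $D_\pm$, $D_0$ commute with the constant matrices, and that $D_0$ commutes with $D_\pm$; the letter $\sigma$ for the Pauli matrices is the usual convention and is an assumption about the notation.

```latex
\begin{align*}
D^2 &= (D_+\sigma_+ + D_0\sigma_0 + D_-\sigma_-)^2 \\
    &= D_0^2\,\sigma_0^2 + D_+D_-\,\sigma_+\sigma_- + D_-D_+\,\sigma_-\sigma_+
       + D_+^2\,\sigma_+^2 + D_-^2\,\sigma_-^2 \\
    &\quad + D_0D_+\,(\sigma_0\sigma_+ + \sigma_+\sigma_0)
           + D_0D_-\,(\sigma_0\sigma_- + \sigma_-\sigma_0) \\
    &= D_0^2\,\sigma_0^2 + D_+D_-\,\sigma_+\sigma_- + D_-D_+\,\sigma_-\sigma_+ .
\end{align*}
```

The last equality uses $\sigma_\pm^2 = 0$ together with $\sigma_0\sigma_\pm + \sigma_\pm\sigma_0 = 0$, so that every surviving term is a diagonal matrix.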

As an immediate consequence we obtain the spectrum of our Dirac operator $D$.

Corollary 5.5. The Dirac operator $D$ defined in (5.10) has spectrum

\[ \mathrm{Spec}(D) = \Big\{ \pm\Big( q^{-2}(q - q^{-1})^4 [j]^2 [j+1]^2 + \big[j + \tfrac12\big]^2 \Big)^{1/2} \;\Big|\; j \in \mathbb{N} + \tfrac12 \Big\} \]

with multiplicities $2j + 1$.

Proof. The eigenvalues of $C_q$ are given in (3.8): each $|j, m\rangle_\pm$ is an eigenvector with
eigenvalue $[j + \tfrac12]^2 - \tfrac14$, whence the multiplicity of the $j$th eigenvalue is $2(2j + 1)$.
From the expression for $D^2$ in Proposition 5.4, we read off its eigenvalues using
those for $C_q$, yielding

\[ \mathrm{Spec}(D^2) = \Big\{ \lambda_j^2 := q^{-2}(q - q^{-1})^4 [j]^2 [j+1]^2 + \big[j + \tfrac12\big]^2 \;\Big|\; j \in \mathbb{N} + \tfrac12 \Big\}, \tag{5.14} \]

each having multiplicity $2(2j + 1)$. Here we have used the identity
$[j + \tfrac12]^2 - [\tfrac12]^2 = [j][j + 1]$. The eigenvalues of $D$ are therefore just $\pm\lambda_j$ with multiplicities
$2j + 1$.
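The spectrum can be tabulated numerically. The sketch below is illustrative only: it assumes $0 < q < 1$, the standard q-number $[x] = (q^x - q^{-x})/(q - q^{-1})$, and that $D_0$ acts on the spin-$j$ subspace as $q^{-1}(q - q^{-1})^2 [j][j+1]$, which is how the garbled prefactor in (5.9) is read here. The function names are hypothetical.

```python
import math


def qnum(x, q):
    """Standard q-number [x] = (q^x - q^(-x)) / (q - q^(-1))."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def dirac_eigenvalue(j, q):
    """Positive eigenvalue of D on spin j, under the assumed reading of (5.9)."""
    d0 = (q - 1.0 / q) ** 2 / q * qnum(j, q) * qnum(j + 1, q)  # D_0 part
    # hypot avoids overflow when d0 is very large
    return math.hypot(d0, qnum(j + 0.5, q))


def spectrum(q, levels):
    """List (j, eigenvalue, multiplicity) for j = 1/2, 3/2, ..., levels - 1/2."""
    out = []
    for k in range(levels):
        j = k + 0.5
        lam = dirac_eigenvalue(j, q)
        mult = 2 * k + 2  # equals 2j + 1
        out.append((j, lam, mult))
        out.append((j, -lam, mult))
    return out
```

For $q$ close to 1 the values returned by `dirac_eigenvalue` approach $j + \tfrac12$, the classical spectrum on the two-sphere, while for fixed $q < 1$ the ratio of consecutive eigenvalues tends to $q^{-2}$.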

By inspection, we see that the eigenvalues of $|D|$ grow not faster than $q^{-2j}$ for
large $j$, in contrast with the Dirac operator of [4], whose eigenvalues diverge not
faster than $q^{-j}$. It is the extra term $D_0$ which accounts for this behavior.
This result immediately gives us an expression for $D$ in terms of an orthonormal
basis of eigenspinors $|j, m; \uparrow\rangle$, $|j, m; \downarrow\rangle$ defined by

\[ D|j, m; \uparrow\rangle = \lambda_j |j, m; \uparrow\rangle, \qquad D|j, m; \downarrow\rangle = -\lambda_j |j, m; \downarrow\rangle \tag{5.15} \]

with eigenvalues

\[ \lambda_j := \Big( q^{-2}(q - q^{-1})^4 [j]^2 [j+1]^2 + \big[j + \tfrac12\big]^2 \Big)^{1/2}. \]

To proceed further, it will be necessary to have an explicit description of these
eigenspinors in terms of the basic spinors $|j, m\rangle_\pm$. By evaluating the actions of $D_\pm$,
$D_0$ on $S_\pm$ one finds that the Dirac operator is

\[ D|j, m\rangle_\pm = \pm q^{-1}(q - q^{-1})^2 [j][j+1]\,|j, m\rangle_\pm + \big[j + \tfrac12\big]\,|j, m\rangle_\mp, \tag{5.16} \]

the first term corresponding to the action of $D_0\sigma_0$, the second to the action of
$D_\pm\sigma_\pm$. Knowing the eigenvalues of $D$, we find the corresponding eigenspinors
to be

\[ |j, m; \uparrow\rangle := \frac{1}{\sqrt{2\lambda_j}} \Big( \sqrt{\lambda_j^+}\,|j, m\rangle_+ + \sqrt{\lambda_j^-}\,|j, m\rangle_- \Big), \]
\[ |j, m; \downarrow\rangle := \frac{1}{\sqrt{2\lambda_j}} \Big( -\sqrt{\lambda_j^-}\,|j, m\rangle_+ + \sqrt{\lambda_j^+}\,|j, m\rangle_- \Big), \tag{5.17} \]

for $m = -j, -j + 1, \dots, j - 1, j$ and $j \in \mathbb{N} + \tfrac12$, where we have written

\[ \lambda_j^+ = \lambda_j + q^{-1}(q - q^{-1})^2 [j][j+1], \qquad \lambda_j^- = \lambda_j - q^{-1}(q - q^{-1})^2 [j][j+1]. \tag{5.18} \]
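On the two-dimensional subspace $V_{j,m}$ spanned by $|j, m\rangle_+$, $|j, m\rangle_-$ for fixed values
of $j$, $m$, the operator which diagonalizes $D$ is just the orthogonal matrix

\[ W_j := \frac{1}{\sqrt{2\lambda_j}} \begin{pmatrix} \sqrt{\lambda_j^+} & \sqrt{\lambda_j^-} \\ -\sqrt{\lambda_j^-} & \sqrt{\lambda_j^+} \end{pmatrix}. \tag{5.19} \]

We write $W\colon H \to H$ for the closure of the operator defined by the matrices $W_j$,
$j \in \mathbb{N} + \tfrac12$.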
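The diagonalization can be checked on a single two-dimensional block. The sketch below is a hedged illustration: it assumes $0 < q < 1$, that in the basis of the two copies of the spin-$j$ vector the operator $D$ is the symmetric matrix with diagonal entries $\pm q^{-1}(q - q^{-1})^2[j][j+1]$ and off-diagonal entries $[j + \tfrac12]$, and a particular sign convention for the second eigenvector. All names are hypothetical.

```python
import math


def qnum(x, q):
    """Standard q-number [x] = (q^x - q^(-x)) / (q - q^(-1))."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def block_entries(j, q):
    """Diagonal (alpha) and off-diagonal (beta) entries of the assumed 2x2 block."""
    alpha = (q - 1.0 / q) ** 2 / q * qnum(j, q) * qnum(j + 1, q)
    beta = qnum(j + 0.5, q)
    return alpha, beta


def apply_block(j, q, v):
    """Apply the block [[alpha, beta], [beta, -alpha]] to the vector v."""
    alpha, beta = block_entries(j, q)
    return (alpha * v[0] + beta * v[1], beta * v[0] - alpha * v[1])


def eigenspinors(j, q):
    """Return (lam, up, down): the eigenvalue and unit eigenvectors for +lam, -lam."""
    alpha, beta = block_entries(j, q)
    lam = math.hypot(alpha, beta)
    lam_plus, lam_minus = lam + alpha, lam - alpha
    norm = math.sqrt(2.0 * lam)
    up = (math.sqrt(lam_plus) / norm, math.sqrt(lam_minus) / norm)
    down = (-math.sqrt(lam_minus) / norm, math.sqrt(lam_plus) / norm)
    return lam, up, down
```

Stacking `up` and `down` as the rows of a matrix gives an orthogonal $2 \times 2$ matrix that conjugates the block to $\mathrm{diag}(\lambda_j, -\lambda_j)$.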

5.3. Spectral properties of $S^2_q$

We now show that the datum $(A[S^2_q], H, D)$ fulfils the conditions required of
a spectral triple, which we then equip with a real structure in the sense of
Definition 5.2.

Theorem 5.6. The datum $(A[S^2_q], H, D)$ constitutes a unital spectral triple over
the sphere $S^2_q$ with metric dimension zero.

Proof. For each $a \in A[S^2_q]$ the commutator $[D, a]$ acts on $S$ by multiplication
operators and is therefore itself a bounded operator. In fact, for the summand
$D_+\sigma_+ + D_-\sigma_-$ this goes as in [4], whereas for the term $D_0$ one gets multiplication
by $L_0 a$ which belongs to $A[S^2_q]$ itself. The operator $D$ clearly satisfies $D^* = D$ on
the dense domain $S$ of $H$. From Corollary 5.5 it is clear that the only accumulation
points of the spectrum of $D$ are at infinity, so the resolvent of $D$ is compact. Since
the eigenvalues of $D$ grow exponentially with $j \in \mathbb{N} + \tfrac12$, the metric dimension is
just zero.
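The statement about the metric dimension can be illustrated numerically: with exponentially growing eigenvalues, the sum of $(\lambda^2 + 1)^{-s/2}$ over the spectrum converges for every $s > 0$. The sketch below assumes $0 < q < 1$ and the same reading of the eigenvalues as above, namely $\lambda_j^2 = q^{-2}(q - q^{-1})^4[j]^2[j+1]^2 + [j + \tfrac12]^2$ with total multiplicity $2(2j + 1)$; the function names are hypothetical.

```python
import math


def qnum(x, q):
    """Standard q-number [x] = (q^x - q^(-x)) / (q - q^(-1))."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def dirac_eigenvalue(j, q):
    """Positive eigenvalue of D on spin j (assumed form, see the lead-in)."""
    d0 = (q - 1.0 / q) ** 2 / q * qnum(j, q) * qnum(j + 1, q)
    return math.hypot(d0, qnum(j + 0.5, q))


def summability_partial_sum(s, q, levels):
    """Partial sum of (lambda^2 + 1)^(-s/2) over the first `levels` spins."""
    total = 0.0
    for k in range(levels):
        j = k + 0.5
        lam = dirac_eigenvalue(j, q)
        multiplicity = 2 * (2 * k + 2)  # 2(2j + 1): both signs of the eigenvalue
        total += multiplicity * math.hypot(lam, 1.0) ** (-s)
    return total
```

Comparing `summability_partial_sum(s, q, 40)` with `summability_partial_sum(s, q, 80)` for a small exponent such as `s = 0.5` shows that the partial sums have already stabilized, in contrast with the classical case where the sum diverges for $s \le 2$.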

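Proposition 5.7. With the $\mathbb{Z}_2$-grading $\gamma\colon H \to H$ defined by

\[ \gamma|j, m; \uparrow\rangle := |j, m; \downarrow\rangle, \qquad \gamma|j, m; \downarrow\rangle := |j, m; \uparrow\rangle \]

on the orthonormal basis (5.15) and extended by $A[S^2_q]$-linearity, the datum
$(A[S^2_q], H, D, \gamma)$ constitutes an even spectral triple.

Proof. It is obvious that $\gamma^2 = 1$ and $\gamma^* = \gamma$. The property $\gamma D + D\gamma = 0$ follows
from the fact that $\gamma$ interchanges the $+\lambda_j$ and $-\lambda_j$ eigenspaces of $D$, as may be
verified directly on the basis vectors (5.15).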

Next we turn to the real structure. Since we have made the same choice for the spinors as
in [4], it is tempting to take the same real structure as well. However, one quickly
finds that this choice is unsuitable, since it neither commutes nor anti-commutes
with our Dirac operator $D$. The reason for this lies mainly in the fact that the term
$D_0$ in our Dirac operator (5.10) is proportional to the Casimir operator, which
is rather a second order differential operator, if anything. Instead, we define an
anti-unitary operator $J\colon H \to H$ in terms of its action on the orthonormal basis
(5.15) by

J|j, m; = (1)m+1/2 |j, m; , J|j, m; = (1)m+1/2 |j, m;

and seek to show that this $J$ equips the datum $(A[S^2_q], H, D, \gamma)$ with a real structure.
It is not difficult to check that the $J$ above is equivariant under the right action of
$U_q(su(2))$ on $H$, making it a particularly natural choice.

Proposition 5.8. The operator $J$ satisfies $J^2 = -1$, $DJ = JD$ and $J\gamma = -\gamma J$.

Proof. The fact that $J^2 = -1$ is immediate. We find that

(DJ JD)|j, m; = (1)m+1/2 D|j, m; j D|j, m;


= (1)m+1/2 j |j, m; (1)m+1/2 j |j, m; = 0,
(J + J)|j, m; = J|j, m; (1)m+1/2 |j, m;
= (1)m+1/2 |j, m; (1)m+1/2 |j, m; = 0,

where we have used anti-linearity of $J$. Similar computations hold on the remaining basis vectors.

Aiming at (modified) commutant and first order conditions as in Definition 5.2,
and having in mind the strategy of [7], we denote by $L_q$ the positive trace-class
operator defined by

\[ L_q |j, m\rangle_\pm := q^j |j, m\rangle_\pm, \qquad j \in \mathbb{N} + \tfrac12, \]

on $H$ and let $K_q$ be the two-sided ideal of $B(H)$ generated by the operators $L_q$.
The ideal $K_q$ is an ideal of infinitesimals of arbitrarily high order and so we take
$I = K_q$ as our operator ideal in Definition 5.2. Thus, to prove that $J$ defines a
real structure, it remains to check that the commutant property and first order
condition in (5.3) are satisfied.
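That the generator of the ideal is an infinitesimal of arbitrary order can be seen concretely from its singular values. The sketch below assumes $0 < q < 1$ and that the operator has eigenvalue $q^j$ on each of the $2(2j + 1)$ basis vectors of spin $j$ (both copies of the spinor module); it lists the singular values in decreasing order and weights the $k$-th one by $k^p$. The function names are hypothetical.

```python
def singular_values(q, levels):
    """Singular values q^j, each repeated 2(2j + 1) times, for j = 1/2, 3/2, ..."""
    values = []
    for k in range(levels):
        j = k + 0.5
        values.extend([q**j] * (2 * (2 * k + 2)))  # 2(2j + 1) copies
    return values


def weighted_singular_values(q, p, levels):
    """The sequence k^p * s_k, with k counted from 1."""
    return [(k + 1) ** p * s
            for k, s in enumerate(singular_values(q, levels))]
```

Because the number of singular values above a given level grows only quadratically in $j$ while $q^j$ decays exponentially, the weighted sequence tends to zero for every fixed $p > 0$; evaluating `weighted_singular_values(0.5, 5.0, 200)[-1]` gives a value many orders of magnitude below one.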
The strategy of [7] is based on the fact that the operators $\pi(x_i)$, $i = -1, 0, 1$, can
be approximated by operators acting diagonally on the Hilbert space of spinors.
Specifically, these operators $z_i$, $i = -1, 0, 1$, on $H$ are defined by
zi |j, m = 0
i (j, m; 0)|j 1, m + i + i (j, m; 0)|j, m + i

+ +
i (j, m; 0)|j + 1, m + i . (5.20)
The coefficients are exactly the ones used in (5.7), unless $|m + i| > j + \nu$ for the relevant
shift $\nu = -1, 0, 1$ of $j$, in which case we set the corresponding coefficient to zero. Momentarily we shall show that
the operators $z_i$ approximate the operators $\pi(x_i)$ modulo the ideal $K_q$, but to do
this we first need the following technical lemma.

Lemma 5.9. With $W_j$, $j \in \mathbb{N} + \tfrac12$, the operators in (5.19), there exists a constant
$C$ (independent of $j$) such that

||Wj Wj+1 1|| < Cq j
for all $j \in \mathbb{N} + \tfrac12$.


Proof. One evaluates the norm in question by computing the eigenvalues
of the corresponding $2 \times 2$ matrix and choosing the larger of the two, finding it
to be


j+ j+1
+
+ j j+1

j j+1
+
+ j+ j+1

2 j j+1
Wj Wj+1 1 = .
2 j j+1

Using the inequalities [j] < (q q 1 )1 q j and [j]1 < q j1 , elementary estimates

for each of the terms in this expression yield that j < C  q j and j j+1 <
C  q 2j for real constants C  , C  , so it appears at rst glance that the above norm
has an $O(1)$ behavior. However, a more detailed analysis shows that the coefficient
of $q^{-2j}$ in the numerator is in fact zero; the behavior of the numerator is therefore
$O(q^{-j})$ and we have our result.
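The decay rate in the lemma can be probed numerically. The sketch below rests on several assumptions that go beyond the text: $0 < q < 1$; each diagonalizing matrix is treated as a plane rotation by an angle $\theta_j$ with $\tan 2\theta_j = [j + \tfrac12] / \big(q^{-1}(q - q^{-1})^2[j][j+1]\big)$; and the quantity measured is the operator norm of the product of one rotation with the transpose of the next, minus the identity, which equals $2|\sin((\theta_j - \theta_{j+1})/2)|$. The function names are hypothetical.

```python
import math


def qnum(x, q):
    """Standard q-number [x] = (q^x - q^(-x)) / (q - q^(-1))."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def mixing_angle(j, q):
    """Rotation angle theta_j with tan(2 theta_j) = beta / alpha (assumed block)."""
    alpha = (q - 1.0 / q) ** 2 / q * qnum(j, q) * qnum(j + 1, q)
    beta = qnum(j + 0.5, q)
    return 0.5 * math.atan2(beta, alpha)


def rotation_defect(j, q):
    """Operator norm of R(theta_j) R(theta_{j+1})^T - 1 for plane rotations."""
    delta = mixing_angle(j, q) - mixing_angle(j + 1, q)
    return 2.0 * abs(math.sin(delta / 2.0))


def scaled_defects(q, count):
    """Defects divided by q^j for j = 1/2, 3/2, ...; bounded if the defect is O(q^j)."""
    return [rotation_defect(0.5 + k, q) / q ** (0.5 + k) for k in range(count)]
```

For `q = 0.5` the list returned by `scaled_defects` stays bounded and settles to a constant, which is the numerical signature of an $O(q^j)$ bound.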

Proposition 5.10. There exist bounded operators $A_i$, $B_i$, $i = -1, 0, 1$, such that

\[ \pi(x_i) - z_i = A_i L_q = L_q B_i \]

when acting upon the basis vectors (5.15). In particular, $\pi(x_i) - z_i \in K_q$ for
$i = -1, 0, 1$.

Proof. From [7, Lemma 4.4], there exist bounded operators $A_i$, $B_i$, $i = -1, 0, 1$
such that

\[ \pi(x_i) - z_i = A_i L_q = L_q B_i \]

with respect to the basis $|j, m\rangle_\pm$ of $H$, and so the operators $\pi(x_i)$ are approximated
by the operators $z_i$ modulo the ideal $K_q$ of infinitesimals. We need to check
that using the operator $W$ to change the basis vectors from $|j, m\rangle_\pm$ to the eigenspinors (5.15)
does not spoil this approximation property. Evaluating Wj zi Wj zi on |j, m;
gives
(Wj zi Wj zi )|j, m; =
i (j, m; 0)(Wj1 Wj 1)|j 1, m + i;

+ +
i (j, m; 0)(Wj Wj+1 1)|j + 1, m + i; .

This and Lemma 5.9 yield that Wj zi Wj zi Kq for all i = 1, 0, 1 and all
j N + 12 .

As a consequence, we immediately get the commutant property, the first of the
two conditions in (5.3).

Proposition 5.11. For all $a, b \in A[S^2_q]$ we have $[\pi(a), J\pi(b)J^{-1}] \in K_q$.

Proof. From the derivation property of commutators, it suffices to check this
only for the generators $x_{-1}$, $x_0$, $x_1$ of $A[S^2_q]$. With the operators $z_{-1}$, $z_0$, $z_1$ defined
in (5.20), we have
Jzk J 1 |j, m = (1)k (
k (j, m; 0)|j 1, m k

+ 0k (j, m; 0)|j, m k + +
k (j, m; 0)|j + 1, m k ).
(5.21)
Using this, one computes as in [7, Lemma 6.2] that

\[ [z_i, Jz_kJ^{-1}] = 0, \qquad i, k = -1, 0, 1. \tag{5.22} \]

It is straightforward to check that

\[ [\pi(x_i), J\pi(x_k)J^{-1}] = [\pi(x_i) - z_i, J\pi(x_k)J^{-1}] + [z_i, J(\pi(x_k) - z_k)J^{-1}] + [z_i, Jz_kJ^{-1}], \]

whence the assertion follows from Proposition 5.10.

We are now ready for our main theorem regarding the differential structure
of $S^2_q$.

Theorem 5.12. The datum $(A[S^2_q], H, D, \gamma, J)$ constitutes a real even unital spectral
triple (up to infinitesimals) with KO-dimension equal to two.

Proof. Having already established Propositions 5.8 and 5.11, it remains to verify
the first order condition for $D$, namely that $[[D, a], JbJ^{-1}] \in K_q$ for all $a, b \in A[S^2_q]$.
For this, we split the Dirac operator into two pieces, $D = D' + D''$, where
$D' = D_0\sigma_0$ and $D'' = D_-\sigma_- + D_+\sigma_+$. By linearity it suffices to check the first order
condition for $D'$ and $D''$ individually.
Since $D_0$ is a function of the Casimir, each $a \in A[S^2_q]$ is an eigenfunction for the
derivation $[D', \,\cdot\,]$, whence the first order condition for $D'$ follows immediately from
the commutant property in Proposition 5.11. On the other hand, the component
$D''$ has eigenvalues $\pm[j + \tfrac12]$, whose growth with $j$ obeys $[j + \tfrac12] < Cq^{-j}$ for $C$ a
real constant (as already mentioned, $D''$ is precisely the Dirac operator considered
in [4]). It is easy to compute that
[D , zi ]|j, m = (j1 j )
i (j, m; 0)|j 1, m + i

+ (j+1 j )+
i (j, m; 0)|j + 1, m + i .

Using this expression, together with (5.21), one calculates the action of the commutators
$[[D'', z_i], Jz_kJ^{-1}]$ for $i, k = -1, 0, 1$ and finds them to be a sum of five
independent weighted shift operators with weights $S^{\nu}_{i,k}(j, m)$, $\nu = -2, \dots, 2$, i.e.
2

[[D , zi ], Jzk J 1 ]|j, m =
Si,k (j, m)|j + , m + i k .
=2

These weights $S^{\nu}_{i,k}(j, m)$ are estimated using exactly the same method as in
[7, Proposition 6.5]. In our case, the growth condition for the eigenvalues of $D''$ is sufficient to
guarantee that $|S^{\nu}_{i,k}(j, m)| < C' q^{j}$ for some real constant $C'$. We conclude that
$[[D'', z_i], Jz_kJ^{-1}] \in K_q$ for all $i, k = -1, 0, 1$. Since the $z_i$ approximate the operators
$\pi(x_i)$ modulo $K_q$, the proof is complete.

Acknowledgments
Both authors were partially supported by the Italian Project "Cofin08 –
Noncommutative Geometry, Quantum Groups and Applications". SB is grateful
to INdAM-GNSAGA for support and the Department of Mathematics at the
University of Trieste for its hospitality. We thank Francesco D'Andrea for very
useful comments.

References
[1] T. Brzeziński and S. Majid, Quantum group gauge theory on quantum spaces, Comm.
Math. Phys. 157 (1993) 591–638; Erratum, ibid. 167 (1995) 235.
[2] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[3] A. Connes, Gravity coupled with matter and the foundation of noncommutative
geometry, Comm. Math. Phys. 182 (1996) 155–176.
[4] L. Dąbrowski and A. Sitarz, Dirac operator on the standard Podleś quantum sphere,
in Noncommutative Geometry and Quantum Groups (Warsaw, 2001), Banach Center
Publ., Vol. 61 (Polish Acad. Sci., Warsaw, 2003), pp. 49–58.
[5] L. Dąbrowski, G. Landi, M. Paschke and A. Sitarz, The spectral geometry of the
equatorial Podleś sphere, C. R. Math. Acad. Sci. Paris 340 (2005) 819–822.
[6] L. Dąbrowski, G. Landi, A. Sitarz, W. D. van Suijlekom and J. C. Várilly, The Dirac
operator on SUq(2), Comm. Math. Phys. 259 (2005) 729–759.
[7] L. Dąbrowski, F. D'Andrea, G. Landi and E. Wagner, Dirac operators on all Podleś
spheres, J. Noncommut. Geom. 1 (2007) 213–239.

[8] F. D'Andrea, L. Dąbrowski and G. Landi, The isospectral Dirac operator on the
4-dimensional orthogonal quantum sphere, Comm. Math. Phys. 279 (2008) 77–116.
[9] M. Đurđević, Geometry of quantum principal bundles. I, Comm. Math. Phys. 175
(1996) 457–520.
[10] M. Đurđević, Geometry of quantum principal bundles. II, Rev. Math. Phys. 9 (1997)
531–607.
[11] A. Klimyk and K. Schmüdgen, Quantum Groups and Their Representations (Springer-Verlag,
Berlin Heidelberg, 1997).
[12] G. Landi and A. Zampini, in preparation.
[13] S. Majid, Quantum and braided group Riemannian geometry, J. Geom. Phys. 30
(1999) 113–146.
[14] S. Majid, Noncommutative Riemannian and spin geometry of the standard q-sphere,
Comm. Math. Phys. 256 (2005) 255–285.
[15] T. Masuda, K. Mimachi, Y. Nakagami, M. Noumi and K. Ueno, Representations of
the quantum group SUq(2) and the little q-Jacobi polynomials, J. Funct. Anal. 99
(1991) 357–387.
[16] P. Podleś, Quantum spheres, Lett. Math. Phys. 14 (1987) 193–202.
[17] P. Podleś, Differential calculus on quantum spheres, Lett. Math. Phys. 18 (1989)
107–119.
[18] P. Podleś, The classification of differential structures on quantum two-spheres,
Comm. Math. Phys. 150 (1992) 167–179.
[19] K. Schmüdgen, Commutator representations of differential calculi on the quantum
group SUq(2), J. Geom. Phys. 31 (1999) 241–264.
[20] K. Schmüdgen and E. Wagner, Dirac operator and a twisted cyclic cocycle on the
standard Podleś quantum sphere, J. Reine Angew. Math. 574 (2004) 219–235.
[21] K. Schmüdgen and E. Wagner, Representations of crossed product algebras of Podleś
quantum spheres, J. Lie Theory 17 (2007) 751–790.
[22] S. L. Woronowicz, Differential calculus on compact matrix pseudogroups (quantum
groups), Comm. Math. Phys. 122 (1989) 125–170.
October 12, 2010 10:1 WSPC/S0129-055X 148-RMP
J070-S0129055X10004120

Reviews in Mathematical Physics


Vol. 22, No. 9 (2010) 995–1032

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004120

BOREL SUMMABILITY OF $\varphi^4_4$ PLANAR THEORY VIA
MULTISCALE ANALYSIS

MARCELLO PORTA and SERGIO SIMONELLA

Dipartimento di Fisica, Università di Roma Sapienza,
Piazzale Aldo Moro 5, 00185 Roma, Italy
Dipartimento di Matematica, Università di Roma Sapienza,
Piazzale Aldo Moro 5, 00185 Roma, Italy
marcello.porta@roma1.infn.it
simonella@mat.uniroma1.it

Received 23 March 2010

We review the issue of Borel summability in the framework of multiscale analysis and
renormalization group, by discussing a proof of Borel summability of the $\varphi^4_4$ massive
Euclidean planar theory; this result is not new, since it was obtained by Rivasseau and
't Hooft. However, the techniques that we use have already proved effective in the
analysis of various models of condensed matter and field theory; therefore, we take the
$\varphi^4_4$ planar theory as a toy model for future applications.

Keywords: Borel summability; $\varphi^4_4$ theory; renormalization group.

Mathematics Subject Classification 2010: 81T08, 81T17, 40G10

1. Introduction
The problem of giving a meaning to the formal perturbative series defining the
scalar $\varphi^4_4$ theory, the simplest four-dimensional interacting field theory, has been
much debated (see [7] for a critical introduction to the problem) and it is still wide
open, although several triviality conjectures have been proposed since the work of
Landau [1]. Here we focus on the planar restriction of the full perturbative series;
that is, we consider only the graphs that can be drawn on a sheet of paper without
ever crossing lines in points where no interacting vertices are present. This problem
is much easier than the complete case, since the number of topological Feynman
graphs contributing to a given order $n$ is much smaller than the original $n!$. In fact,
in the planar theory this number is bounded by $(\mathrm{const.})^n$, see [10, 11]. Still, the
problem is far from being trivial, since the theory needs to be renormalized; this
can be done using the renormalization group, see [6, 12, 13, 25], for instance.


It is well known that the full $\varphi^4_4$ with the wrong sign of the renormalized
coupling constant, that is the one corresponding to an unstable self-interaction
potential, is perturbatively asymptotically free, in the sense that, truncating the beta
function to a finite order, the running coupling constant describing the interaction
of the fields at a given energy scale flows to zero in the ultraviolet as the inverse
logarithm of that scale. This fact
does not have any direct physical interpretation in the full $\varphi^4_4$, since the theory is not
defined for the considered value of the renormalized coupling constant. Moreover,
the beta function itself is not defined, because of the factorial growth of the number
of topological Feynman graphs in the order of the series.
However, these problems do not affect the planar theory, since it is only defined
perturbatively and the number of graphs at a given order is far smaller than the
original $n!$. Therefore, one can hope in this case to exploit asymptotic freedom to
rigorously construct the theory. This has been done independently by Rivasseau
and 't Hooft using quite different methods, see [25]; indeed, they proved that
the renormalized perturbative series defining the Schwinger functions, which are
the result of various resummations, are absolutely convergent. In particular, they
proved that the result is the Borel sum of the perturbative series in the renormalized
coupling constant. This last fact means in particular that the Schwinger functions
can be expressed to an arbitrary accuracy starting from their perturbative series in
the renormalized coupling constant, following a well-defined prescription; moreover,
the result is unique within a certain class of functions, the Borel summable ones.
Clearly, this does not exclude the existence of other less regular solutions with the
same formal perturbative expansion.
At the time of those works, besides the possibility of giving a mathematically
rigorous meaning to a simple quantum field theory, the physical motivation of the
study was that the $\phi^4_4$ planar theory is formally equal to the limit $N \to \infty$ of a
massive SU(N) theory in four dimensions, with interaction $\mathrm{Tr}\,\phi^4$ where $\phi$ is an
$N \times N$ matrix, see [3, 11]. In particular, in 't Hooft's work the planar approximation
was seen as a first step towards the more ambitious study of QCD with a large number
of colors.
In this paper, we review the issue of Borel summability of the $\phi^4_4$ planar theory
using the rigorous renormalization group techniques introduced in [6, 12, 13] (in [6,
13] the flow of the running coupling constants of the planar theory was heuristically
discussed), which make possible a transparent proof of the ultraviolet stability of
the massive Euclidean $\phi^4_4$ theory, through the so-called "n! bounds".
One of the motivations of our work lies in the fact that very few proofs of Borel
summability based on renormalization group methods are present in the literature,
[8, 9]. Moreover, we take the $\phi^4_4$ planar theory as a first step towards the study of
physically more interesting models, which can be analyzed by similar techniques.
As mentioned before, the great gain that one has in the planar restriction of the
full $\phi^4_4$ theory is that the topological Feynman graphs of a given order n are far fewer
(their number is bounded by $(\mathrm{const.})^n$, against the n! of the full case). This is in a
sense reminiscent of what happens in fermionic field theories, where it is possible

Borel Summability of $\phi^4_4$ Planar Theory via Multiscale Analysis 997

to control the factorial growth of the number of Feynman graphs by exploiting
the $-1$ arising in the anticommutation of the fields, showing that the nth order of
the series, which is given by n! addends, reconstructs the determinant of an $n \times n$
matrix, which is estimated by $(\mathrm{const.})^n$. For instance, we think that the methods
described in this paper could be useful to prove Borel summability for the
one-dimensional Hubbard model, where one sector of the theory is asymptotically free,
while to control the flow of the other running coupling constants one has to prove
that the beta function is vanishing, [21]. This model has been rigorously constructed
in [15] using renormalization group methods similar to those used here, but a proof
of Borel summability has not been given yet.
Informally, our main result can be stated as follows; we refer the reader to Sec. 3,
Theorem 1, for a precise formulation.
Main result. The Schwinger functions of the Euclidean massive planar $\phi^4_4$ theory
are Borel summable in the renormalized coupling constant; in particular, they satisfy
the hypotheses of the Nevanlinna–Sokal theorem [14], which are sufficient conditions
for Borel summability.
Roughly speaking, our proof goes as follows. First, by choosing the renormalized
coupling constant in a suitable complex domain, we prove that the flow equation
defining recursively the running coupling constants at all energy scales admits a
bounded solution which falls into the radius of convergence of the Schwinger
functions, and verifies some special regularity properties. To do that, we use a fixed
point argument, similar to the one introduced by 't Hooft in [2]. Then, to conclude
the check of the hypotheses of the Nevanlinna–Sokal theorem on Borel summability,
we show that it is possible to undo the resummation that allowed us to write the
Schwinger functions as power series in the running coupling constants, so that the
nth order Taylor remainder in the renormalized coupling constant $\lambda$ can be bounded
proportionally to $n!\,|\lambda|^{n+1}$ uniformly in the analyticity domain. To prove this second
statement, we rely in a crucial way on the Gallavotti–Nicolò tree representation
of the beta function; the undoing of the resummations, corresponding to rather
involved analytical operations, is made clear by a graphical manipulation of these
trees. This procedure is quite similar in spirit to what has been done by Rivasseau
in [4].
Therefore, we feel that our proof lies halfway between those of Rivasseau and
't Hooft. As mentioned above, in 't Hooft's approach, which is based on renormalization
group ideas, the flow of the beta function is studied in a way analogous
to the one we follow. However, instead of deriving bounds on the remainder of
the resummed perturbative series, 't Hooft, see [2], concludes the proof of Borel
summability by checking the analyticity properties of the Borel transform using a
totally independent argument, that we have not been able to rigorously reproduce
in our framework.
As for the comparison with Rivasseau's work, see [4], the main
difference is that in his approach the beta function is not introduced: to construct

the planar theory Rivasseau uses a minimal resummation procedure, involving
only a certain class of Feynman graphs with four external legs, the "parquet" ones.
This defines an asymptotically free running coupling constant, and it turns out
to be enough to prove the finiteness of the planar theory. To conclude, Rivasseau
shows that the result of these operations is the Borel sum of the nonrenormalized
series, by proving an n! bound on the Taylor remainder; this bound is obtained
by undoing the resummation of the parquet subgraphs in a suitable way.
The paper is organized as follows. In Sec. 2, we define the model, we set the
notations, we briefly review the ideas behind multiscale integration and we introduce
the beta function and the flow of the running coupling constants; we refer the
interested reader to [6, 12, 13] for a detailed introduction to these techniques. In
Sec. 3, we state our main result and we discuss the strategy of the proof. Finally,
in Sec. 4 and in the appendices cited therein, we prove the theorem.

2. Renormalization Group Analysis


In this section we describe the iterative procedure that allows one to express the
Schwinger functions of the full $\phi^4_4$ theory as power series that are finite order by order
in the ultraviolet limit, graphically represented in terms of renormalized Feynman
graphs; at the same time, we define the planar $\phi^4_4$ theory by considering at each
step only the planar graphs. Our discussion will be quite short; we refer the reader
to [6] for a detailed proof of the renormalizability of the $\phi^4_4$ theory.

The full $\phi^4_4$ theory. Let $\phi^{(\le N)}_x$ be a massive gaussian free field with ultraviolet
cut-off at length $\gamma^{-N}$, where $\gamma > 1$ is a fixed scale parameter, and $x \in \Lambda$ where
$\Lambda$ is a four-dimensional box of side size $L$ with periodic boundary conditions; for
simplicity, we set to 1 the value of the mass. We rewrite the field as:

\[ \phi^{(\le N)}_x = \sum_{j=0}^{N} \phi^{(j)}_x, \qquad x \in \Lambda, \tag{2.1} \]
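where $\{\phi^{(j)}\}_{j=0}^{N}$ are independent gaussian fields with propagators

\[ C^{(j)}_{x,y} := \int \frac{dp}{(2\pi)^4}\, \frac{f_j(p)}{p^2 + 1}\, e^{ip(x-y)}, \qquad
f_j(p) := \begin{cases} e^{-p^2/\gamma^{2j}} - e^{-p^2/\gamma^{2(j-1)}} & \text{if } j > 0, \\ e^{-p^2} & \text{if } j = 0, \end{cases} \tag{2.2} \]

and $\int \frac{dp}{(2\pi)^4}$ is a shorthand for $|\Lambda|^{-1} \sum_{p = 2\pi n/L}$ with $n \in \mathbb{Z}^4$; notice that

\[ \lim_{N \to +\infty} \sum_{j=0}^{N} f_j(p) = 1. \tag{2.3} \]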
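
The normalization (2.3) is a telescoping sum: the partial sum of the cut-off functions up to scale N collapses to $e^{-p^2/\gamma^{2N}}$, which tends to 1 as N grows. The following is a minimal numerical sketch of this, assuming $\gamma = 2$ (the text only requires $\gamma > 1$) and reading $f_j$ as a difference of two gaussians as in (2.2); the helper names are hypothetical.

```python
import math

GAMMA = 2.0  # assumed value of the scale parameter; any gamma > 1 would do


def f(j, p2, gamma=GAMMA):
    """Single-scale cut-off function f_j evaluated at squared momentum p2."""
    if j == 0:
        return math.exp(-p2)
    return math.exp(-p2 / gamma ** (2 * j)) - math.exp(-p2 / gamma ** (2 * (j - 1)))


def partial_sum(n_max, p2, gamma=GAMMA):
    """Sum of f_j over j = 0..n_max; telescopes to exp(-p2 / gamma**(2 * n_max))."""
    return sum(f(j, p2, gamma) for j in range(n_max + 1))


if __name__ == "__main__":
    p2 = 3.0
    for n_max in (0, 2, 5, 10, 40):
        # Second column is the closed form of the telescoped sum.
        print(n_max, partial_sum(n_max, p2), math.exp(-p2 / GAMMA ** (2 * n_max)))
```

Each $f_j$ with $j > 0$ vanishes at $p = 0$ and is concentrated at momenta of order $\gamma^j$, which is what makes the scale-by-scale integration meaningful.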

The generating functional of the Schwinger functions of the $\phi^4_4$ theory is given by:

\[ e^{W_N(f)} := \int \exp\Big\{ \eta \int_\Lambda dx\, \phi^{(\le N)}_x f_x \Big\}\, e^{V^{(N)}(\phi^{(\le N)})}\, P(d\phi^{(\le N)}), \tag{2.4} \]

where $f_x$ is a Schwartz test function, $\eta \in \mathbb{R}$, $P(d\phi^{(\le N)}) := \prod_{j=0}^{N} P(d\phi^{(j)})$ with
$P(d\phi^{(j)})$ the gaussian distribution of the field $\phi^{(j)}$ with covariance given by (2.2),
and the interaction $V^{(N)}$ is defined as

\[ V^{(N)}(\phi^{(\le N)}) := \int_\Lambda dx\, \big( \lambda_N :\!(\phi^{(\le N)}_x)^4\!: + \alpha_N :\!(\partial \phi^{(\le N)}_x)^2\!: + \mu_N :\!(\phi^{(\le N)}_x)^2\!: + \nu_N \big), \tag{2.5} \]

where $\lambda_N$, $\alpha_N$, $\mu_N$, $\nu_N$ are called bare coupling constants, and the dots denote the
Wick product of the fields (see [6, Appendix C]); notice that in our convention the
"wrong" sign of $\lambda_N$ is the positive one. The generic q-point Schwinger function of
the full $\phi^4_4$ theory is obtained by differentiating the generating functional q times with respect
to $\eta$ and setting $\eta = 0$. Now, let $W_N(f) =: W^p_N(f) + W^{np}_N(f)$, where $W^p_N$, $W^{np}_N$
are respectively the planar and non-planar parts of $W_N$, to be defined recursively in the
following; the q-point Schwinger function of the planar theory is defined as:

\[ S^T_{(N)}(f; q) := \frac{\partial^q}{\partial \eta^q} W^p_N(f) \Big|_{\eta = 0}. \tag{2.6} \]

We shall denote by $S^T(f; q)$ the limit for $N \to +\infty$ of (2.6).

Multiscale analysis. As explained in [6], we can try to evaluate (2.4) by proceeding
in an iterative fashion, integrating the independent fields $\phi^{(j)}$ starting from
the ultraviolet scale j = N going down to the infrared scale j = 0. This iterative
integration gives rise to an expansion in Feynman graphs; the restriction to the planar
theory will be enforced by considering at each integration step only the planar
ones. For simplicity, in what follows, we shall explicitly discuss only the case f = 0,
which corresponds to the integration of the partition function. The case $f \neq 0$ is
a straightforward extension of our argument, and it will be discussed later. After
the integration of $\phi^{(N)}, \phi^{(N-1)}, \ldots, \phi^{(k+1)}$, we rewrite the integral (2.4) as

\[ e^{W_N(0)} = \int e^{V^{(k)}(\phi^{(\le k)})}\, P(d\phi^{(\le k)}) = \int e^{V^{(k)}_p(\phi^{(\le k)}) + V^{(k)}_{np}(\phi^{(\le k)})}\, P(d\phi^{(\le k)}), \tag{2.7} \]

where $P(d\phi^{(\le k)}) := \prod_{j=0}^{k} P(d\phi^{(j)})$, the field $\phi^{(\le k)} = \sum_{j=0}^{k} \phi^{(j)}$ has a propagator
given by, in momentum space,

\[ C^{(\le k)}_p := \sum_{j=0}^{k} C^{(j)}_p, \qquad C^{(j)}_p := \frac{f_j(p)}{p^2 + 1}, \tag{2.8} \]

and the effective potential $V^{(k)}$ together with its planar and non-planar parts $V^{(k)}_p$, $V^{(k)}_{np}$
will be defined recursively. At the beginning, $V^{(N)}(\phi^{(\le N)}) = V^{(N)}_p(\phi^{(\le N)})$; on scale

k we will show that, if # = p, np:

\[ V^{(k)}_\#(\phi^{(\le k)}) = \sum_{m \ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_m}{(2\pi)^4}\, V^{(k)}_\#(p_1, \ldots, p_m; m)\, :\!\prod_{i=1}^{m} \phi^{(\le k)}_{p_i}\!:\; \delta\Big(\sum_i p_i\Big), \tag{2.9} \]

where $V^{(k)}_\#(p_1, \ldots, p_m; m)$ are suitable coefficients to be recursively defined, and the
product with m = 0 is interpreted as 1. Let us perform the single scale integration.
First, we split $V^{(k)}$ as $LV^{(k)} + RV^{(k)}$, where $R = 1 - L$ and $L$, the localization
operator, is a linear operator acting on functions of the form (2.9), defined by its
action on the kernels $V^{(k)}_\#(p_1, \ldots, p_m; m)$ in the following way (with a slight abuse
of notation, due to the presence of the delta function in (2.9) we only write the
independent values of the momenta in the arguments of the kernels):
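\[ LV^{(k)}_\#(p_1, p_2, p_3; 4) := V^{(k)}_\#(0, 0, 0; 4), \tag{2.10} \]
\[ LV^{(k)}_\#(p; 2) := V^{(k)}_\#(0; 2) + p \cdot \partial_p V^{(k)}_\#(0; 2) + \frac{1}{2}\, p_i p_j\, \partial_{p_i} \partial_{p_j} V^{(k)}_\#(0; 2), \]

and $LV^{(k)}_\#(p_1, \ldots, p_m; m) = 0$ otherwise. By symmetry, it follows that

\[ \partial_{p_i} V^{(k)}_\#(0; 2) = 0, \qquad \partial_{p_i} \partial_{p_j} V^{(k)}_\#(0; 2) = 0 \ \ \text{for } i \neq j, \qquad
\partial_{p_i} \partial_{p_i} V^{(k)}_\#(0; 2) = \partial_{p_j} \partial_{p_j} V^{(k)}_\#(0; 2) \ \ \text{for all } i, j; \tag{2.11} \]

finally, we define the running coupling constants of the planar theory on scale k as:

\[ \lambda_k := V^{(k)}_p(0, 0, 0; 4), \qquad \gamma^{2k} \mu_k := V^{(k)}_p(0; 2), \qquad
\alpha_k := \frac{1}{2}\, \partial_{p_1} \partial_{p_1} V^{(k)}_p(0; 2), \qquad \gamma^{4k} \nu_k := V^{(k)}_p(0); \tag{2.12} \]

the corresponding objects in the full theory are obtained by replacing the $V^{(k)}_p$ in
(2.12) with $V^{(k)}$. Therefore, setting $\phi^{(\le k)} =: \phi^{(\le k-1)} + \phi^{(k)}$, we can rewrite (2.7)
with k replaced by k − 1, and $V^{(k-1)}$ given by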

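\[ V^{(k-1)}(\phi^{(\le k-1)}) = \log \int P(d\phi^{(k)})\, e^{V^{(k)}(\phi^{(\le k-1)} + \phi^{(k)})} := \sum_{n \ge 0} \frac{1}{n!}\, E^T_k\big(V^{(k)}(\phi^{(\le k)}); n\big), \tag{2.13} \]

where $E^T_k$ is called truncated expectation on scale k, and it is defined as:

\[ E^T_h\big(X(\phi^{(h)}); n\big) := \frac{\partial^n}{\partial \zeta^n} \log \int P(d\phi^{(h)})\, e^{\zeta X(\phi^{(h)})} \Big|_{\zeta = 0}. \tag{2.14} \]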
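
For $n \ge 1$ the truncated expectations (2.14) are the cumulants of the random variable X, that is the derivatives at the origin of the logarithm of its moment generating function. As a minimal sketch, consider the single-variable toy case $X = \phi^2$ with $\phi$ a standard gaussian variable (an assumption made here purely for illustration; the fields of the text are infinite dimensional). The moments are $(2n-1)!!$ and the cumulants can be obtained from the standard moment-to-cumulant recursion; the function names are hypothetical.

```python
from math import comb, factorial


def moments_phi_squared(n_max):
    """Moments E[X^n], n = 0..n_max, of X = phi^2 with phi ~ N(0, 1): (2n-1)!!."""
    moments = [1]
    for n in range(1, n_max + 1):
        moments.append(moments[-1] * (2 * n - 1))
    return moments


def cumulants_from_moments(moments):
    """Cumulants via kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) kappa_k m_{n-k}."""
    kappa = [0] * len(moments)
    for n in range(1, len(moments)):
        correction = sum(
            comb(n - 1, k - 1) * kappa[k] * moments[n - k] for k in range(1, n)
        )
        kappa[n] = moments[n] - correction
    return kappa


if __name__ == "__main__":
    kappa = cumulants_from_moments(moments_phi_squared(6))
    for n in range(1, 7):
        # Known closed form for this toy case: 2**(n-1) * (n-1)!
        print(n, kappa[n], 2 ** (n - 1) * factorial(n - 1))
```

In this toy case the n-th cumulant counts the connected ways of joining n copies of $\phi^2$ into a single loop, a small instance of the general rule that truncated expectations of Wick monomials are sums over connected graphs.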

It is convenient to define also $V^{(-1)}$; for this purpose one thinks of $\phi^{(\le N)}$ as being
given by, see formula [6, Eq. (6.9)],
\[ \phi^{(\le N)} = \phi^{(-1)} + \phi^{(0)} + \cdots + \phi^{(N)}, \tag{2.15} \]

where the field $\phi^{(-1)}$ is distributed independently relative to the other $\phi^{(j)}$, $j \ge 0$,
and it has its own covariance $C^{(-1)}_{x,y}$ which need not be specified (because it will
eventually be taken to be identically zero whenever it appears in some interesting
formulas). The introduction of $V^{(-1)}$ allows one to treat the case k = 0 on the same
grounds as the cases k > 0.
Tree expansion and Feynman graphs. The iterative integration described
above leads to a representation of the effective potential on scale k − 1 as a power
series in the running coupling constants $\lambda_h$, $\alpha_h$, $\mu_h$, $\nu_h$ with $h \ge k$, where the
coefficients of the series can be represented in terms of connected Feynman graphs, as
briefly explained in the following. The key formula which we start from is (2.13);
iterating this formula as suggested by Fig. 1, we end up with a representation of
the effective potentials in terms of a sum over Gallavotti–Nicolò trees [6, 12, 13],
see Fig. 2:

\[ V^{(k-1)}(\phi^{(\le k-1)}) = \sum_{n \ge 1} \sum_{\tau \in \mathcal{T}_{k-1,n}} V^{(k-1)}(\tau), \]
\[ V^{(k-1)}(\tau) = \sum_{m \ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_m}{(2\pi)^4}\, V^{(k-1)}(p_1, \ldots, p_m; \tau, m)\, :\!\prod_{i=1}^{m} \phi^{(\le k-1)}_{p_i}\!:\; \delta\Big(\sum_i p_i\Big), \tag{2.16} \]

where $\mathcal{T}_{k-1,n}$ is the set of trees with root r on scale $h_r = k - 1$ and n endpoints,
with value $V^{(k-1)}(\tau)$. The trees involved in the sum are distinct; two trees are
considered identical if it is possible to superpose them together with the labels
appended to their vertices by stretching or shortening the branches. Proceeding
in a way analogous to [6, Sec. XVI and Appendix C], it follows that the kernels
$V^{(k)}(p_1, \ldots, p_m; \tau, m)$ satisfy the following recursion relation:

\[ V^{(k-1)}(p_1, \ldots, p_m; \tau, m) = \sum_{m_1, \ldots, m_s} \frac{1}{s!} \sum_{\substack{\Gamma \in \mathcal{G}_m,\ \tilde\Gamma \subseteq \Gamma \\ \tilde\Gamma\ \text{connected}}} \int \prod_{j=1}^{s} \tilde V^{(k)}(p_1, \ldots, p_{m_j}; \tau_j, m_j) \prod_{\ell \in \tilde\Gamma} C^{(k)}_{p(\ell)} \prod_{\ell \in \Gamma \setminus \tilde\Gamma} C^{(\le k-1)}_{p(\ell)}, \tag{2.17} \]

where $\tau_1, \ldots, \tau_s$ are the s subtrees of $\tau$ with root corresponding to the first nontrivial
vertex of $\tau$, $\tilde V^{(k)}(p_1, \ldots, p_{m_j}; \tau_j, m_j)$ is equal to $RV^{(k)}(p_1, \ldots, p_{m_j}; \tau_j, m_j)$
if $\tau_j$ is nontrivial and to $LV^{(k)}(p_1, \ldots, p_{m_j}; m_j)$ otherwise, $\mathcal{G}_m$ is a suitable set of
Feynman graphs defined below, and the integral is over their loop momenta. This
relation is a consequence of the rules of evaluation of the truncated expectations of
Wick monomials, see [6, Appendix C]. Formula (2.17) is iterated by replacing each
$V^{(k)}(p_1, \ldots, p_{m_j}; \tau_j, m_j)$ corresponding to nontrivial $\tau_j$'s with (2.17) with k − 1

Fig. 1. Graphical interpretation of (2.13). The graphical equations for $LV^{(k-1)}$, $RV^{(k-1)}$ are
obtained from the equation in the second line by putting an L, R label, respectively, over the
vertices on scale k.

Fig. 2. The effective potential $V^{(h)}$ can be represented as a sum over Gallavotti–Nicolò trees.
The small black dots will be called vertices of the tree. All the vertices except the first (i.e. the one
on scale k) have an R label attached, which means that they correspond to the action of $RE^T_{h_v}$,
while the first represents $E^T_k$. The generic endpoint e, represented by a fat endpoint, corresponds
to $LV^{(h_e - 1)}$. The sum is over distinct trees; two trees are considered identical if it is possible to
superpose them together with the labels appended to their vertices by stretching or shortening
the branches.

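replaced by k. Analogously, the planar part of the effective potential is defined as:

\[ V^{(k-1)}_p(p_1, \ldots, p_m; \tau, m) = \sum_{m_1, \ldots, m_s} \frac{1}{s!} \sum_{\substack{\Gamma \in \mathcal{G}_m\ \text{planar},\ \tilde\Gamma \subseteq \Gamma \\ \tilde\Gamma\ \text{connected}}} \int \prod_{j=1}^{s} \tilde V^{(k)}_p(p_1, \ldots, p_{m_j}; \tau_j, m_j) \prod_{\ell \in \tilde\Gamma} C^{(k)}_{p(\ell)} \prod_{\ell \in \Gamma \setminus \tilde\Gamma} C^{(\le k-1)}_{p(\ell)}. \tag{2.18} \]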

Represent a generic Wick monomial $M_j$ containing the product of $m_j$ fields as
a point or as a cluster with $m_j$ emerging lines, depending on whether the corresponding
$\tau_j$ is trivial or not; we shall consider the points as (trivial) clusters, too.
Given the Wick monomials $M_1, \ldots, M_s$, the symbol $\mathcal{G}_m$ denotes the set of connected
graphs that can be made joining pairwise some of the lines associated with the clusters
$M_1, \ldots, M_s$ in such a way that: (i) two lines emerging from the same cluster
cannot be contracted together, (ii) there should be enough lines so that, regarding
the clusters as points, the resulting graph is connected, (iii) after the contraction

there should be still m uncontracted lines, representing the Wick monomial M.
The resulting graph is enclosed in a new cluster, labeled by k. Furthermore, the
condition $\tilde\Gamma \subseteq \Gamma$ with the subscript "connected" means that the subgraph $\tilde\Gamma$ still
keeps the connection between the boxes. We graphically represent the propagators
$C^{(k)}$ by a solid line, while $C^{(\le k-1)}$ correspond to wavy lines. Finally, the restriction
to planarity means that we discard all the graphs that show lines crossing in points
where no interacting vertices are present. We refer the reader to [6, Sec. XVI], for a
more extensive discussion and for examples.
Clearly, the iteration stops when only trivial subtrees appear in (2.17), (2.18);
at this point, the resulting graph looks like a usual one, but enclosed in a
hierarchical cluster structure, where each cluster has a scale label; and given two
clusters $G_v$, $G_{v'}$ then $G_v \subset G_{v'}$ if and only if $h_v > h_{v'}$. After the iteration, the
effective potential on scale k is expressed as a power series in the running coupling
constants $\lambda_h$, $\alpha_h$, $\mu_h$, $\nu_h$ with h > k. From the analysis of [6, 12, 13], it follows
that the contribution of a given tree $\tau \in \mathcal{T}_{k,n}$ to a kernel of the planar theory can
be bounded in the following way, setting $\varepsilon := \max_h \{|\lambda_h|, |\alpha_h|, |\mu_h|, |\nu_h|\}$, for some
positive $C_m$, $\rho$:

\[ \big| V^{(k)}_p(p_1, \ldots, p_m; \tau, m) \big| \le C_m\, (\mathrm{const.})^n\, \varepsilon^n\, \gamma^{k(4-m)} \prod_{\substack{v > r \\ v\ \text{not e.p.}}} \gamma^{-\rho (h_v - h_{v'})}, \tag{2.19} \]

where the product runs over the vertices of the tree and $v'$ is the vertex immediately
preceding v; since the number of distinct trees is bounded by $(\mathrm{const.})^n$, it
follows that, see [6, Sec. XIX]:

\[ \sum_{\tau \in \mathcal{T}_{k,n}} \big| V^{(k)}_p(p_1, \ldots, p_m; \tau, m) \big| \le C_m\, C^n\, \varepsilon^n\, \gamma^{k(4-m)}, \tag{2.20} \]

which means that the planar part of the effective potential can be expressed as a
convergent power series in the running coupling constants, provided their absolute
values are small enough. This is not the case in the full theory; in the analogue
of (2.19), due to the combinatorics of the Feynman graphs, one has to take into
account an extra n! factor. Formula (2.19) implies in particular the so-called short
memory property of the Gallavotti–Nicolò trees, which states that if two scales of a
given tree are constrained to have fixed values, say h, k with h < k, then the bound
on the sum over all the remaining scales is improved by a factor $\gamma^{-(\rho/2)(k-h)}$ with
respect to (2.20); in other words, long trees are exponentially suppressed.
The expansion of the Schwinger functions. The generating functional of the
Schwinger functions can be evaluated by repeating a procedure completely analogous
to the one described for the effective potentials; after the integration of the scales
N, N − 1, …, k + 1 it turns out that:

\[ e^{W_N(f)} = \int P(d\phi^{(\le k)})\, e^{S^{(k)}(\phi^{(\le k)}; f)} = \int P(d\phi^{(\le k)})\, e^{S^{(k)}_p(\phi^{(\le k)}; f) + S^{(k)}_{np}(\phi^{(\le k)}; f)}, \tag{2.21} \]

where the effective potentials $S^{(k)}_\#(\phi^{(\le k)}; f)$ have the form:

\[ S^{(k)}_\#(\phi^{(\le k)}; f) = \sum_{\substack{m \ge 0 \\ t \ge 0}} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_{m+t}}{(2\pi)^4}\, S^{(k)}_\#(p_1, \ldots, p_{m+t}; m, t)\, :\!\prod_{i=1}^{m} \phi^{(\le k)}_{p_i}\!: \prod_{j=m+1}^{m+t} f_{p_j}\; \delta\Big(\sum_i p_i\Big), \tag{2.22} \]

and can be represented as sums over trees very similar to the ones introduced for
the effective potentials, up to the following differences, see [12, Sec. 7.5] and [13]:
(i) special vertices may appear, from which dotted lines representing the external
fields f emerge (that do not contribute to the total number of endpoints), and
(ii) no R operation is defined on the path from a given dotted line to the root. We
call $\mathcal{T}_{k,n,t}$ the set of such trees having root scale k, n endpoints and t dotted lines.
See Fig. 3 for an example.
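Setting

\[ S^{(k)}_\#(p_1, \ldots, p_{m+t}; m, t) = \sum_{n \ge 1} \sum_{\tau \in \mathcal{T}_{k,n,t}} S^{(k)}_\#(p_1, \ldots, p_{m+t}; \tau, m, t), \tag{2.23} \]

the planar parts of the kernels of the effective potentials are related by the following
recursive equation:

\[ S^{(k-1)}_p(p_1, \ldots, p_{m+t}; \tau, m, t) = \sum_{\substack{m_1, \ldots, m_s \\ t_1, \ldots, t_s}} \frac{1}{s!} \sum_{\substack{\Gamma \in \mathcal{G}_m\ \text{planar},\ \tilde\Gamma \subseteq \Gamma \\ \tilde\Gamma\ \text{connected}}} \int \prod_{j=1}^{s} \tilde S^{(k)}_p(p_1, \ldots, p_{m_j+t_j}; \tau_j, m_j, t_j) \prod_{\ell \in \tilde\Gamma} C^{(k)}_{p(\ell)} \prod_{\ell \in \Gamma \setminus \tilde\Gamma} C^{(\le k-1)}_{p(\ell)}, \tag{2.24} \]

Fig. 3. A generic tree belonging to $\mathcal{T}_{k,6,2}$.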



where $\tau_1, \ldots, \tau_s$ are the s subtrees of $\tau$ with root coinciding with the first
vertex of $\tau$ following the root. If $\tau_j$ is trivial and corresponds to a dotted
line then $\tilde S^{(k)}_p(p_1, \ldots, p_{m_j+t_j}; \tau_j, m_j, t_j) = \delta_{m_j,1} \delta_{t_j,1}$, while if it corresponds
to a solid line $\tilde S^{(k)}_p(p_1, \ldots, p_{m_j+t_j}; \tau_j, m_j, t_j) = \delta_{t_j,0}\, LV^{(k)}(p_1, \ldots, p_{m_j}; m_j)$; if
$\tau_j$ is a nontrivial subtree with $t_j > 0$ then $\tilde S^{(k)}_p(p_1, \ldots, p_{m_j+t_j}; \tau_j, m_j, t_j) =
S^{(k)}_p(p_1, \ldots, p_{m_j+t_j}; \tau_j, m_j, t_j)$, while if $\tau_j$ is nontrivial and $t_j = 0$ then
$\tilde S^{(k)}_p(p_1, \ldots, p_{m_j}; \tau_j, m_j, 0) = RV^{(k)}_p(p_1, \ldots, p_{m_j}; \tau_j, m_j)$, with $R = 1 - L$
defined as in (2.10). Clearly, $m_1, \ldots, m_s$ and $t_1, \ldots, t_s$ are subject to the constraints
$\sum_j m_j = m$, $\sum_j t_j = t$. Formula (2.24) is iterated by replacing each
$S^{(k)}_p(p_1, \ldots, p_{m_j}; \tau_j, m_j, t_j)$ corresponding to any nontrivial $\tau_j$ with $t_j > 0$. Therefore,
the generic planar Schwinger function $S^T_{(N)}(f; q)$ can be written as:

\[ S^T_{(N)}(f; q) = \sum_{n \ge 1} \sum_{\tau \in \mathcal{T}_{-1,n,q}} S_p(\tau), \]
\[ S_p(\tau) := \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_q}{(2\pi)^4}\, f_{p_1} \cdots f_{p_q}\, S^{(-1)}_p(p_1, \ldots, p_q; \tau, 0, q), \qquad \tau \in \mathcal{T}_{-1,n,q}, \tag{2.25} \]

where $S^{(-1)}_p$ is given by (2.24) with k = 0. Finally, from the theory of [12, Sec. 7.5],
it follows that

\[ \sum_{\tau \in \mathcal{T}_{-1,n,q}} |S_p(\tau)| \le \|f\|_1^q\, C_q\, C^n\, \varepsilon^n, \tag{2.26} \]

which implies that in the planar theory the Schwinger functions can be expressed
as absolutely convergent power series in the running coupling constants, provided
their absolute values are small enough. As is well known, this is not the case in
the full theory, since the bound (2.26) has to be multiplied by n!; see [6, 12, 13, 22].
The beta function and its tree expansion. From now on, we shall focus only
on the planar theory. The running coupling constants obey recursive equations
induced by the iterative integration; it follows that, setting $v^{(4)}_k := \lambda_k$, $v^{(2')}_k := \alpha_k$,
$v^{(2)}_k := \mu_k$, $v^{(0)}_k := \nu_k$:

\[ v^{(a)}_k = \gamma^{-2\delta_{a,2} - 4\delta_{a,0}}\, v^{(a)}_{k-1} - (Bv)^{(a)}_k, \qquad k \ge 0, \tag{2.27} \]

where the operator B, the beta function of the theory, has the form, see formula
[6, Eq. (9.15)]:

\[ (Bv)^{(a)}_k := \sum_{r=2}^{\infty}\ \sum_{h_1, \ldots, h_r \ge k}^{N}\ \sum_{a_1, \ldots, a_r} \beta^{(a)}_{a_1, \ldots, a_r}(k; h_1, \ldots, h_r) \prod_{i=1}^{r} v^{(a_i)}_{h_i}. \tag{2.28} \]

The quantities $\{v^{(a)}_{-1}\}$ are called the renormalized coupling constants. As the iterative
procedure described before suggests, the beta function $(Bv)^{(a)}_k$ can be represented
as a sum over trees; the only difference with respect to the trees which

have been introduced previously is that we attach an $L_a$ over the first vertex,
where $L_a$ is defined in the following way: $L_a V^{(k)}_p(p_1, p_2, p_3; 4) := V^{(k)}_p(0, 0, 0; 4)$ if
a = 4 and zero otherwise, $L_a V^{(k)}_p(p; 2) := \gamma^{-2k} V^{(k)}_p(0; 2)$ if a = 2 and zero otherwise,
$L_a V^{(k)}_p(p; 2) := (1/2)\, \partial_{p_1} \partial_{p_1} V^{(k)}_p(0; 2)$ if a = 2' and zero otherwise and finally
$L_a V^{(k)}_p(0) := \gamma^{-4k} V^{(k)}_p(0)$ if a = 0 and zero otherwise. From the theory of [6], it
follows that in the planar theory:

\[ \sum_{h_1, \ldots, h_r \ge k} \big| \beta^{(a)}_{a_1, \ldots, a_r}(k; h_1, \ldots, h_r) \big| \le (\mathrm{const.})^r, \tag{2.29} \]

which means that the beta function is defined as an absolutely convergent power
series provided the absolute values of the running coupling constants are small
enough; this is not the case in the full theory, since in that case the bound (2.29)
has to be multiplied by r!.

Remarks. (1) From the representation of the coefficients of the beta function
in terms of Feynman graphs, induced by the iterative integration previously
described (see also [6, Secs. IX, XVI–XIX]), it follows that for k > 0, calling
$r_4$ the number of indexes i such that $a_i = 4$ (corresponding to the number of
vertices with four external lines),

\[ \beta^{(4)}_{a_1, \ldots, a_r}(k; h_1, \ldots, h_r) = 0 \quad \text{unless } r_4 \ge 2, \tag{2.30} \]

\[ \beta^{(2')}_{a_1, \ldots, a_r}(k; h_1, \ldots, h_r) = 0 \quad \text{unless } r_4 \ge 2, \tag{2.31} \]

\[ \beta^{(2)}_{a_1, \ldots, a_r}(k; h_1, \ldots, h_r) = 0 \quad \text{unless } r_4 \ge 1. \tag{2.32} \]

These properties can be understood in the following way. The graphs contributing
to (2.30)–(2.32) are all computed at vanishing external momenta, and the
momenta flowing on the propagators must have absolute values bigger than 0;
in fact, the quantity $(Bv)^{(a)}_k$ arises from the integration of the fields $\phi^{(h)}$ with
$h \ge k$, which if k > 0 have support for momenta p such that |p| > 0. Then,
to see property (2.30), simply try to draw on a sheet of paper any graph with
four external lines evaluated at vanishing external momenta; as the reader may
check, the condition $r_4 < 2$ is not compatible with the fact that the momenta
flowing on the propagators have absolute values > 0. Property (2.32) can be
seen in an analogous way. To understand (2.31), notice that the graphs contributing
to $\beta^{(2')}_{a_1, \ldots, a_r}(k; h_1, \ldots, h_r)$ have two external lines, and are differentiated twice
with respect to the external momentum. Then, proceed as for (2.32), and notice
that the only two-legged graphs with $r_4 = 1$ compatible with the request on the
modulus of the inner momenta are tadpole graphs, which do not depend on
the value of the external momentum; therefore, their derivatives are vanishing.

(2) Note that the flow of $\nu_k$ is decoupled from the others, since $\nu_k$ does not appear
in the recursive equations defining $\lambda_k$, $\alpha_k$, $\mu_k$ (it is graphically represented by

a vertex with no external lines); moreover, the sequence $\nu_{-1}, \ldots, \nu_N$ solves the
following equation:

\[ \nu_k = \gamma^{-4} \nu_{k-1} - (Bv)^{(0)}_k, \tag{2.33} \]

which implies

\[ \nu_k = \gamma^{-4(k+1)} \nu_{-1} - \sum_{j=0}^{k} \gamma^{4(j-k)} (Bv)^{(0)}_j, \tag{2.34} \]

where $(Bv)^{(0)}_j$ is analytic in its arguments for $\max_k \{|\lambda_k|, |\alpha_k|, |\mu_k|\}$ small
enough. For these reasons, in what follows we shall focus only on the flows
of $\lambda_k$, $\alpha_k$, $\mu_k$.
We can rewrite Eq. (2.27) as:

\[ v^{(a)}_k = \gamma^{-(k+1)(2\delta_{a,2} + 4\delta_{a,0})}\, v^{(a)}_{-1} - \sum_{j=0}^{k} \gamma^{(j-k)(2\delta_{a,2} + 4\delta_{a,0})}\, (Bv)^{(a)}_j, \tag{2.35} \]

and this equation can be iterated in order to obtain the formal power series of $v^{(a)}_k$
in the renormalized coupling constants. Again, Eq. (2.35) can be represented graphically.
The second term in (2.35) corresponds to the sum of all the possible trees
with root scale k enclosed in a frame labeled by a type label a. The correspondence
between the framed trees and the trees discussed after (2.28) is made explicit by
the example in Fig. 4.
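
Equation (2.35) is simply the solution of the linear recursion (2.27) once the sequence of beta function values is regarded as given. The following minimal sketch checks this numerically under stated assumptions: `b` is an arbitrary test sequence standing in for $(Bv)^{(a)}_j$, `c` stands for the exponent $2\delta_{a,2} + 4\delta_{a,0}$, `gamma` is the scale parameter, and the beta function term is taken to enter with a minus sign. All names are hypothetical.

```python
def iterate_flow(v_init, b, gamma, c):
    """Iterate v_k = gamma**(-c) * v_{k-1} - b[k] for k = 0..len(b)-1, with v_{-1} = v_init."""
    v = v_init
    flow = []
    for b_k in b:
        v = gamma ** (-c) * v - b_k
        flow.append(v)
    return flow


def closed_form(v_init, b, gamma, c, k):
    """Closed form: gamma**(-(k+1)c) * v_{-1} - sum_{j=0}^{k} gamma**((j-k)c) * b[j]."""
    decay = gamma ** (-(k + 1) * c) * v_init
    source = sum(gamma ** ((j - k) * c) * b[j] for j in range(k + 1))
    return decay - source


if __name__ == "__main__":
    b = [0.3, -0.1, 0.25, 0.05]  # arbitrary stand-in values
    flow = iterate_flow(0.7, b, gamma=2.0, c=2)
    for k, v_k in enumerate(flow):
        print(k, v_k, closed_form(0.7, b, 2.0, 2, k))
```

For c > 0 the contribution of scale j to $v_k$ is damped by $\gamma^{(j-k)c}$, whereas for c = 0 (the labels 4 and 2') all scales contribute with the same weight.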
In general, the fat endpoint e labeled by $a_e$ and attached to a vertex on scale
$h_e - 1$ corresponds to the running coupling constant $v^{(a_e)}_{h_e - 1}$, while the first term in
(2.35) is represented as a trivial tree with a thin endpoint labeled by a and root
scale k. See Fig. 5 for a graphical representation of (2.35). The iteration of (2.35)
produces trees showing thin endpoints, and in general more than one frame; see
Fig. 6 for a picture of the situation. Therefore, the nth order contribution in the
renormalized coupling constant to $v^{(a)}_k$ is defined graphically as the sum of all the
possible framed trees with root scale k enclosed in a frame labeled by a, with n thin
endpoints, and where the generic vertex v either has an R label attached or the
corresponding subtree is enclosed in a frame. We stress that trees with different type

Fig. 4. Example of framed tree.



Fig. 5. Graphical interpretation of formula (2.35); a sum over the $a_i$'s is understood.

Fig. 6. Graphical interpretation of the iteration of Eq. (2.35).

labels attached to their frames and endpoints are considered different. The same
graphical procedure allows one to find the perturbative expansion of the Schwinger
functions (or equivalently of the effective potentials) in the renormalized coupling
constants, starting from their definition as trees with only fat endpoints.

Remark. Given a generic framed tree showing any number of inner frames, we
define the maximally pruned framed tree as the tree obtained by replacing the
maximal inner frames (i.e. the ones enclosed only by the outermost frame) with fat
endpoints of the corresponding type; by properties (2.30)–(2.32) the sum over the
scale of the first vertex of a framed tree, see Fig. 4, involves only the term with
j = 0 if:

• the type label of the frame is 2 and the maximally pruned framed tree has no
endpoints of type 4;
• the type label of the frame is 2' or 4 and the maximally pruned framed tree has
at most one endpoint of type 4.

We shall say that a frame is trivial if the enclosed tree verifies one of the above
properties; all the other frames will be called nontrivial.

Call $\mathcal{T}_{-1,m,q}$ the set of trees with root scale −1, any number of frames, m
endpoints fat or thin, and q dotted lines; given a generic tree $\tau \in \mathcal{T}_{-1,m,q}$ we call
$n_{2',4}(\tau)$ the number of nontrivial frames (see previous remark) labeled by a = 2', 4
and we denote by $m_a(\tau)$ the number of endpoints of type a. In the planar theory
the following remarkable result is true.

Theorem [n! Bound]. Let q > 0; there exist two positive constants C, $C_q$ such
that, if $m = m_4 + m_{2'} + m_2$:

\[ \sum_{\substack{\tau \in \mathcal{T}_{-1,m,q} \\ n_{2',4}(\tau) = n \\ m_a(\tau) = m_a}} |S(\tau)| \le C^m\, C_q\, \|f\|_1^q\, n! \prod_a \Big( \max_{k \ge -1} |v^{(a)}_k| \Big)^{m_a}; \tag{2.36} \]

for q = 0 the bound (2.36) has to be multiplied by $|\Lambda|$.

We refer the reader to [6, 12, 13], and to Appendix E (see item (2) in the Remark
below), for a proof of this result.
Remarks. (1) The n! bound (2.36) only applies to the planar theory; in the full
theory n is replaced by the number of endpoints of the tree. This proves the
ultraviolet stability of the full $\phi^4_4$ theory; see [6, 12, 13, 22–25].
(2) In References [6, 13], it was noticed that in the planar case the bound grows
factorially in the number of frames; as we show in Appendix E, it is possible to
improve the bound by considering only the nontrivial frames labeled by 4, 2'.
Roughly speaking, the factorial is produced by the sums appearing in the
definitions of the frames; the frames labeled by 2, 0 do not contribute to the
factorial because their sums can be controlled thanks to the exponential factor
appearing in (2.35) and Fig. 4, and if a frame is trivial the sum is missing.

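Notations. From now on we shall set

\[ \lambda := \lambda_{-1}, \qquad \alpha := \alpha_{-1}, \qquad \mu := \gamma^{-2} \mu_{-1}; \tag{2.37} \]

moreover, we define

\[ \underline{\lambda} := \{\lambda_k\}_{k \ge 1}, \qquad \underline{\alpha} := \{\alpha_k\}_{k \ge 1}, \qquad \underline{\mu} := \{\mu_k\}_{k \ge 1}. \tag{2.38} \]

Notice that the definition (2.38) does not involve the running coupling constants on
scale zero. In fact, for purely technical reasons, the running coupling constants on
scale zero have to be treated separately from those on scales > 0. In particular, we
first determine the running coupling constants on scales > 0 as functions of those on
scale 0, and then we express the running coupling constants on scale 0 as functions
of the renormalized ones. The motivation of this procedure is connected with the
fact that the properties of the beta function (2.30)–(2.32), that will play a key role
in our analysis, are true only for scales k > 0. It is also convenient to introduce

\[ \xi_k := (\xi_{2',k}, \xi_{2,k}) := (\alpha_k, \mu_k), \qquad \xi := (\alpha, \mu), \qquad \underline{\xi} := \{\xi_k\}_{k \ge 1}. \tag{2.39} \]

Finally, we define the sets $B_\varepsilon$, $C_\varepsilon$, $W_{\varepsilon,\theta}$, with $\varepsilon > 0$, $\theta \in (0, \pi/2]$, in the following way,
see Fig. 7:

\[ B_\varepsilon := \{z \in \mathbb{C} : |z| < \varepsilon\}, \qquad C_\varepsilon := \{z \in \mathbb{C} : \mathrm{Re}\, z^{-1} > \varepsilon^{-1}\}, \qquad
W_{\varepsilon,\theta} := \{z \in \mathbb{C} : |z| < \varepsilon,\ |\arg z| < \pi - \theta\}. \tag{2.40} \]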

Fig. 7. The domains $B_\varepsilon$, $C_\varepsilon$, $W_{\varepsilon,\theta}$.

3. Borel Summability of $\phi^4_4$ Planar Theory


In this section, we state our main result in a mathematically precise form, we recall
what Borel summability is and we outline the ideas of the proof. The technical
details are contained in Sec. 4 and in the appendices.
 
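Theorem 1 ('t Hooft–Rivasseau). For any $\vartheta \in (0, \pi/2]$ there exist $\varepsilon > 0$, $\bar\varepsilon > 0$
such that the Schwinger functions $S^T(f; q) = \lim_{N\to+\infty} S^T_{(N)}(f; q)$ of the planar $\phi^4_4$
theory are analytic for $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$, and Borel summable in $\lambda$ at the
origin.
Remark. Not surprisingly, $\varepsilon \to 0$ if $\vartheta \to 0$.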
Before discussing a sketch of the proof, let us briefly recall what Borel summability
is (see [14, 16]). A formal power series $\sum_n a_n z^n$, $z \in \mathbb{C}$, is said to be Borel
summable if the following properties are true:

• the Borel transform $B(t) := \sum_n \frac{a_n}{n!} t^n$ converges for every t in some circle $B_\delta$;
• B(t) admits an analytic continuation in a neighbourhood of the positive real axis;
• the integral
$$f(z) = \frac{1}{z}\int_0^{+\infty} e^{-t/z}\, B(t)\, dt \qquad (3.1)$$
is convergent for $z \in C_\varepsilon$ for some $\varepsilon > 0$.

Notice that $f(z) \sim \sum_n a_n z^n$ for $z \to 0$. The function f(z) is called the Borel
sum of the formal power series, and if f(z) exists it is unique. Therefore, Borel
summability is nothing else than a one-to-one mapping between a certain space of
functions and a certain space of power series: all the information on the function is
enclosed in the list of its Taylor coefficients. For these reasons, Borel summability
is, [17], the perfect substitute for ordinary analyticity when a function is expanded
on the boundary of its analyticity domain.
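As a concrete illustration of the three conditions above, the following minimal sketch (not taken from the paper) Borel-sums Euler's series $\sum_n (-1)^n n!\, z^n$, whose Borel transform is $B(t) = 1/(1+t)$. The function names, the cutoff `s_max` and the Simpson quadrature are assumptions made for this example only; after the substitution $t = zs$ the integral (3.1) becomes $\int_0^{\infty} e^{-s}/(1+zs)\, ds$.

```python
import math

def borel_sum_euler(z, s_max=60.0, n_steps=6000):
    """Borel sum of sum_n (-1)^n n! z^n for Re z > 0.

    Uses (3.1) with B(t) = 1/(1+t) and the substitution t = z*s, so that
    f(z) = int_0^inf exp(-s) / (1 + z*s) ds, truncated at s_max and
    evaluated with the composite Simpson rule (n_steps must be even).
    """
    h = s_max / n_steps
    g = lambda s: math.exp(-s) / (1.0 + z * s)
    total = g(0.0) + g(s_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def partial_sum_euler(z, order):
    """Partial sum sum_{n < order} (-1)^n n! z^n of the divergent series."""
    return sum((-1) ** n * math.factorial(n) * z ** n for n in range(order))

z = 0.1
f = borel_sum_euler(z)
for order in (2, 5, 10, 20, 30):
    # distance between the Borel sum and the truncated series
    print(order, partial_sum_euler(z, order), abs(f - partial_sum_euler(z, order)))
```

In this toy case the partial sums approach the Borel sum only up to an order of about 1/z and then move away again, while the integral stays finite; this is the sense in which the function, rather than the series, carries the information.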
By the Nevanlinna–Sokal theorem, [14], to establish whether f(z) is the Borel
sum of $\sum_n a_n z^n$ it is sufficient to check the following two properties:

• f(z) is analytic in $C_\varepsilon$ for some $\varepsilon > 0$;
• for every $z \in C_\varepsilon$ and for all M > 0 the following estimate holds:
$$\Big| f(z) - \sum_{n=0}^{M-1} a_n z^n \Big| \le C^M M!\, |z|^M, \qquad C > 0. \qquad (3.2)$$
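The estimate (3.2) can be checked numerically on a simple example. The sketch below is an illustration under stated assumptions, not part of the proof: it again takes Euler's series $a_n = (-1)^n n!$, for which the remainder equals $(-z)^M \int_0^\infty e^{-s} s^M/(1+zs)\,ds$, so that (3.2) is expected to hold with C = 1 whenever Re z > 0. The helper names and the sample points are hypothetical choices.

```python
import math

def borel_sum(z, s_max=60.0, n_steps=6000):
    # f(z) = int_0^inf exp(-s)/(1 + z*s) ds: Borel sum of sum_n (-1)^n n! z^n,
    # computed with the composite Simpson rule on [0, s_max]
    h = s_max / n_steps
    g = lambda s: math.exp(-s) / (1.0 + z * s)
    total = g(0.0) + g(s_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def remainder_ratio(z, M):
    # |f(z) - sum_{n<M} a_n z^n| / (M! |z|^M), with a_n = (-1)^n n!
    partial = sum((-1) ** n * math.factorial(n) * z ** n for n in range(M))
    return abs(borel_sum(z) - partial) / (math.factorial(M) * abs(z) ** M)

# largest ratio over a few sample points with Re z > 0 and orders M = 1..10
worst = max(remainder_ratio(z, M)
            for z in (0.1, 0.2, 0.1 + 0.05j)
            for M in range(1, 11))
print(worst)
```

A ratio below 1 at every sampled point is what the bound with C = 1 predicts for this series; for the Schwinger functions the constant C is of course not 1 and has to be produced by the tree estimates.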

Sketch of the Proof. Our proof consists in a check of the two hypotheses of
the Nevanlinna–Sokal theorem, and it goes as follows. First, we prove that for any
fixed ultraviolet cutoff N > 0 and any $\vartheta \in (0, \pi/2]$ the running coupling constants
are analytic for $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$; analyticity of the Schwinger functions
$S^T_{(N)}(f; q)$ in the same domain is straightforward, since $S^T_{(N)}(f; q)$ is given by an
absolutely convergent power series in the running coupling constants, see [6, 12, 13].
Then, we prove that $S^T(f; q) = \lim_{N\to+\infty} S^T_{(N)}(f; q)$ exists, and that the limit is
reached uniformly in the analyticity domain. Therefore, $S^T(f; q)$ is analytic in the
same analyticity domain of $S^T_{(N)}(f; q)$. To conclude, we show that $S^T(f; q)$, as a
function of $\lambda$ in $W_{\varepsilon,\vartheta}$, verifies the bound (3.2). These two properties imply Borel
summability, since $C_\varepsilon \subset W_{\varepsilon,\vartheta}$.
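The inclusion used in the last step can be made explicit: the condition $\mathrm{Re}\, z^{-1} > \varepsilon^{-1}$ is equivalent to $|z - \varepsilon/2| < \varepsilon/2$, so $C_\varepsilon$ is an open disc tangent to the imaginary axis at the origin and lies inside the half-disc $\{|z| < \varepsilon, |\arg z| < \pi/2\}$. The following sketch checks this on a grid; the value of `eps`, the grid spacing and the function names are arbitrary choices made for the illustration.

```python
import cmath
import math

def in_C(z, eps):
    # C_eps = {z : Re(1/z) > 1/eps}; the origin is excluded
    return z != 0 and (1 / z).real > 1 / eps

def in_half_disc(z, eps):
    # {z : |z| < eps, |arg z| < pi/2}
    return abs(z) < eps and abs(cmath.phase(z)) < math.pi / 2

eps = 0.5
# square grid of spacing 0.01 covering [-1, 1] x [-1, 1]
grid = [complex(i / 100, j / 100) for i in range(-100, 101) for j in range(-100, 101)]
inside = [z for z in grid if in_C(z, eps)]
violations = [z for z in inside if not in_half_disc(z, eps)]
print(len(inside), len(violations))
```

An empty list of violations is consistent with the inclusion; the check is only a sanity test on sample points, not a substitute for the one-line algebraic argument above.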
Analyticity. To solve the flow equations (2.27) and determine the analyticity properties
of the running coupling constants we use a fixed point argument. More precisely,
we show that the Eqs. (2.27) are solved by sequences parametrized by the
renormalized coupling constants $(\lambda, \alpha, \mu)$ which, for finite N, are the fixed points of
some operators acting on suitable finite dimensional spaces; all the technical work
is reduced to showing that in the considered spaces the operators are contractions.
After this, the sequences of running coupling constants are determined through
an exponentially convergent procedure. In particular, in the limit $N \to +\infty$, for
$(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$, we find that the Eqs. (2.27) admit a solution of the
form, for some positive C, c:
1
k = , |k | c(|| + ||2 ),
k
1 + k (3.3)
j=0

|k 2k | c[ 2k ||2 + (|| + ||)|k |],


= (1 + O()), |k k | C(|| + ||), k := (4) (k; k, k) > 0.
where 4,4
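To begin, we rewrite the flow equation for $\lambda_k$ as, see (2.27) with a = 4:
$$\lambda_k =: \lambda_{k+1} + \beta_{4,k+1}(\underline{\lambda}, \underline{\nu}), \qquad k \ge 0, \qquad (3.4)$$
$$\lambda =: \lambda_0 + f_{4,0}(\lambda_0, \nu_0) + \beta_{4,0}(\lambda_0, \underline{\lambda}, \nu_0, \underline{\nu}), \qquad (3.5)$$
where $f_{4,0}$ is linear in $\lambda_0$, and $\beta_{4,h}$ is given by a sum of terms proportional to at
least two among $\lambda_0, \ldots, \lambda_N$. Then, iterating (2.27) up to the scale 0 we get that,
for a = 2′, 2: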

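$$\alpha_k =: \alpha_0 - \sum_{j=1}^{k} \beta_{2',j}(\underline{\lambda}, \underline{\nu}), \qquad \mu_k =: \gamma^{-2k}\mu_0 - \sum_{j=1}^{k} \gamma^{2(j-k)} \beta_{2,j}(\underline{\lambda}, \underline{\nu}), \qquad k \ge 1, \qquad (3.6)$$
$$\alpha_0 =: \alpha - f_{2',0}(\lambda_0, \nu_0) - \beta_{2',0}(\lambda_0, \underline{\lambda}, \nu_0, \underline{\nu}), \qquad \mu_0 =: \mu - f_{2,0}(\nu_0) - \beta_{2,0}(\lambda_0, \underline{\lambda}, \nu_0, \underline{\nu}), \qquad (3.7)$$
where $f_{2',0}$ collects terms at most linear in $\lambda_0$, while $\beta_{2',h}$, $\beta_{2,h}$ are given by sums of
terms proportional to at least two or one among $\lambda_0, \ldots, \lambda_N$, respectively. Setting
$\beta_{4,h}(\underline{\lambda}, \underline{\nu}) =: \beta_h \lambda_h^2 + \bar\beta_{4,h}$, where $\beta_h > 0$ and $\bar\beta_{4,h}$ is of order $\ge 3$, Eq. (3.4) can
be rewritten as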
  
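$$\lambda_k^{-1} = \lambda_{k+1}^{-1} - \beta_{k+1} + R_{k+1}, \qquad \lambda_k^{-1} = \lambda_0^{-1} + \sum_{j=1}^{k} \beta_j - \sum_{j=1}^{k} R_j(\underline{\lambda}, \underline{\nu}), \qquad (3.8)$$
where $R_j$ is given by a sum of terms bounded proportionally to one among $\lambda_j$,
$\alpha_j$, $\mu_j$, and it depends only on running coupling constants on scales $\ge j$, see
Appendix B; the key remark is that, formally, Eq. (3.8) can be seen as defining
the fixed point of the map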
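$$(T_{\lambda_0,\underline{\nu}}\, x)_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k} \beta_j - \sum_{j=1}^{k} R_j(x, \underline{\nu})}, \qquad k \ge 1, \qquad (3.9)$$
where $x = (x_1, \ldots, x_N)$ with $x_i \in \mathbb{C}$ and $\underline{\alpha}$, $\underline{\mu}$ satisfy (3.6), which again can be
formally seen as the fixed point of the map
$$(\tilde T_{\nu_0,\underline{\lambda}}\, y)_k = \begin{pmatrix} \alpha_0 - \sum_{j=1}^{k} \beta_{2',j}(\underline{\lambda}, y) \\ \gamma^{-2k}\mu_0 - \sum_{j=1}^{k} \gamma^{2(j-k)} \beta_{2,j}(\underline{\lambda}, y) \end{pmatrix}, \qquad k \ge 1, \qquad (3.10)$$
where $y = (y_1, y_2, \ldots, y_N)$ and $y_k = (y_{k,2'}, y_{k,2})$ with $y_{k,i} \in \mathbb{C}$. Therefore, we can
in principle determine the running coupling constants on scales > 0 as functions of
$(\lambda_0, \alpha_0, \mu_0)$ by solving the equations:
$$\underline{\lambda} = T_{\lambda_0,\underline{\nu}}\, \underline{\lambda}, \qquad \underline{\nu} = \tilde T_{\nu_0,\underline{\lambda}}\, \underline{\nu}; \qquad (3.11)$$
after this, the dependence of the running coupling constants on the renormalized
ones can be deduced from Eqs. (3.5) and (3.7).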
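The mechanism behind (3.8) and (3.9) can be seen on a toy recursion. The sketch below is an illustration only: it keeps just the second-order term of the flow, $\lambda_k = \lambda_{k+1} + \beta\lambda_{k+1}^2$, with a constant $\beta > 0$, a real $\lambda_0 > 0$ and no wave function or mass couplings (all of these are simplifying assumptions, as are the function names). In this truncated model the remainder is exactly $R_j = \beta^2\lambda_j/(1+\beta\lambda_j)$, which is proportional to $\lambda_j$ as stated above.

```python
import math

def next_scale(lam_k, beta):
    # Solve lam_k = lam_next + beta * lam_next**2 for the positive root lam_next
    return 2.0 * lam_k / (1.0 + math.sqrt(1.0 + 4.0 * beta * lam_k))

def toy_flow(lam0, beta, n_scales):
    # Returns the couplings lam_0..lam_N and the remainders R_1..R_N of the toy model
    lams, remainders = [lam0], []
    for _ in range(n_scales):
        lam_next = next_scale(lams[-1], beta)
        # R_j = beta^2 * lam_j / (1 + beta * lam_j): exact for the truncated recursion
        remainders.append(beta ** 2 * lam_next / (1.0 + beta * lam_next))
        lams.append(lam_next)
    return lams, remainders

lam0, beta, N = 0.1, 1.0, 1000
lams, R = toy_flow(lam0, beta, N)
k = N
exact_inverse = 1.0 / lam0 + k * beta - sum(R[:k])   # toy analogue of (3.8)
leading = 1.0 / (1.0 / lam0 + k * beta)               # same formula without remainders
print(lams[k], 1.0 / exact_inverse, leading)
```

In this toy model the sum of the remainders grows only logarithmically in k, so the coupling on scale k stays close to $1/(\lambda_0^{-1} + k\beta)$; this is the behaviour that the fixed point argument has to reproduce for the full flow.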
To solve (3.11), in Sec. 4.1 and in Appendices A and B we prove that if $S \subset \mathbb{C}^N$
is the set of sequences close enough to the solution of the flow of $\lambda_k$ truncated
to second order and if $\tilde S \subset \mathbb{C}^{2N}$ is a 2N-dimensional ball centered in zero and of
suitably small radius, then: (i) if $x \in S$ and $|\alpha_0|$, $|\mu_0|$ are small enough the map
$\tilde T_{\nu_0,x}$ leaves $\tilde S$ invariant and is a contraction therein; (ii) the fixed point y(x) of $\tilde T_{\nu_0,x}$
in $\tilde S$ is Hölder continuous in x with exponent $0 < \eta < 1$; (iii) given $\vartheta \in (0, \pi/2]$,
for all $\lambda_0 \in W_{\varepsilon,\vartheta}$ with $\varepsilon$ small enough, the map $T_{\lambda_0,y(\cdot)}$ leaves S invariant and
is a contraction therein. To be specific, the distances $d$, $\tilde d$ that we shall adopt
in $S$, $\tilde S$ are defined as $d(x, x') := \max_k |x_k - x'_k|$, $\tilde d(y, y') := \max_{k,i} |y_{k,i} - y'_{k,i}|$,
respectively.
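Then, we can construct the sequences solving (3.11) in the following way: take
$$\underline{\nu}^{(0)} = (\underline{\alpha}^{(0)}, \underline{\mu}^{(0)}) \in \tilde S, \qquad \underline{\lambda}^{(0)} \in S, \qquad (3.12)$$
and define, for $m \ge 0$,
$$\underline{\nu}^{(m+1)} := \lim_{n\to\infty} \tilde T^{\,n}_{\nu_0,\underline{\lambda}^{(m)}}\, \underline{\nu}^{(m)}, \qquad \underline{\lambda}^{(m+1)} := T_{\lambda_0,\underline{\nu}^{(m+1)}}\, \underline{\lambda}^{(m)}. \qquad (3.13)$$
Assume inductively that for all $0 \le m' \le m$ the sequences $\underline{\lambda}^{(m')}$, $\underline{\nu}^{(m')}$ belong
respectively to $S$, $\tilde S$, which is true for m = 0. Property (i) above implies that
$\underline{\nu}^{(m+1)}$ belongs to $\tilde S$, while property (iii) implies that $\underline{\lambda}^{(m+1)}$ belongs to S. Then,
our procedure (3.13) converges exponentially to a limit; in fact, for $m \ge 1$, for some
$0 < \eta < 1$, C > 0 and $0 < \tau < 1$:
$$\max_{k,i} \big|\nu^{(m+1)}_{k,i} - \nu^{(m)}_{k,i}\big| \le C \max_k \big|\lambda^{(m)}_k - \lambda^{(m-1)}_k\big|^{\eta} \le C\, \tau^{\eta(m-1)} \max_k \big|\lambda^{(1)}_k - \lambda^{(0)}_k\big|^{\eta},$$
$$\max_k \big|\lambda^{(m+1)}_k - \lambda^{(m)}_k\big| \le \tau^{m} \max_k \big|\lambda^{(1)}_k - \lambda^{(0)}_k\big|, \qquad (3.14)$$
where we used property (ii) to get the first inequality in the first line, and property
(iii) for the remaining ones. Since $\underline{\lambda}^{(1)}$, $\underline{\lambda}^{(0)}$ are bounded, Eqs. (3.14) prove that the
limits
$$\underline{\lambda} = \lim_{m\to\infty} \underline{\lambda}^{(m)}, \qquad \underline{\nu} = \lim_{m\to\infty} \underline{\nu}^{(m)} \qquad (3.15)$$
exist in $S$, $\tilde S$ respectively, and by construction
$$\underline{\lambda} = T_{\lambda_0,\underline{\nu}}\, \underline{\lambda}, \qquad \underline{\nu} = \tilde T_{\nu_0,\underline{\lambda}}\, \underline{\nu}, \qquad (3.16)$$
i.e. $\underline{\lambda}$, $\underline{\nu}$ are the sequences of running coupling constants from scale 1 to N of the
planar $\phi^4_4$ theory, parametrized by $\lambda_0$, $\alpha_0$, $\mu_0$. The proof of analyticity of the limits
for $(\lambda_0, \alpha_0, \mu_0) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$ with $\varepsilon$, $\bar\varepsilon$ small enough is straightforward; it is
a consequence of the analyticity properties of the initial data and of the maps $T$, $\tilde T$,
and of the fact that convergence is uniform for $(\lambda_0, \alpha_0, \mu_0) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$.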
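After this, from Eqs. (3.5) and (3.7) we show that $\lambda_0$, $\alpha_0$, $\mu_0$ are analytic for
$(\lambda, \alpha, \mu) \in W_{\varepsilon',\vartheta'} \times B_{\bar\varepsilon'} \times B_{\bar\varepsilon'}$ with $\vartheta' > \vartheta$, $\varepsilon' < \varepsilon$, $\bar\varepsilon' < \bar\varepsilon$, and this concludes the
proof of analyticity of the running coupling constants in the renormalized ones.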
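The two-level iteration described above (an inner fixed point at frozen outer variable, followed by one step of the outer map) can be sketched on scalars. The maps below are hypothetical stand-ins chosen only because they are contractions near the origin; they are not the maps of the paper, and the iteration counts are arbitrary.

```python
def inner_map(x, y):
    # Toy stand-in for the map acting on y at fixed x (a contraction in y for |x| <= 1)
    return 0.1 + 0.3 * x * y

def outer_map(y, x):
    # Toy stand-in for the map acting on x at fixed y (a contraction in x for |y| <= 1)
    return 0.2 + 0.4 * y * x

def inner_fixed_point(x, y_start, n_iter=100):
    # Approximates lim_n (inner_map(x, .))^n applied to y_start
    y = y_start
    for _ in range(n_iter):
        y = inner_map(x, y)
    return y

def nested_iteration(x0=0.0, y0=0.0, n_outer=50):
    # y^(m+1) = fixed point of inner_map(x^(m), .); x^(m+1) = outer_map(y^(m+1), x^(m))
    x, y = x0, y0
    history = []
    for _ in range(n_outer):
        y = inner_fixed_point(x, y)
        x_new = outer_map(y, x)
        history.append(abs(x_new - x))   # size of the m-th outer update
        x = x_new
    return x, y, history

x, y, history = nested_iteration()
print(x, y, history[:5])
```

The successive outer updates stored in `history` shrink geometrically in this example, which is the scalar analogue of the exponential convergence of the procedure; the limit is a simultaneous fixed point of both maps.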
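Finally, to prove analyticity of the Schwinger functions we use that $S^T_{(N)}(f; q)$ is
given by an absolutely convergent power series in the running coupling constants,
see Sec. 2, and we prove that the limit for $N \to \infty$ exists and it is reached uniformly
for $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$ with $\varepsilon < \varepsilon'$, $\bar\varepsilon < \bar\varepsilon'$.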

Bound on the remainder. In Sec. 4.2, we show that, relying on the tree representation
of the beta function described in Sec. 2, it is possible to rewrite the q-point
Schwinger function as:
$$S^T(f; q) = S^{T,(n)}(f; q) + r^{(n)}(f; q), \qquad (3.17)$$
where $S^{T,(n)}(f; q)$ is the Taylor expansion of $S^T(f; q)$ up to order n in $\lambda = 0$,
and $r^{(n)}(f; q)$ is a quantity bounded by $(\mathrm{const.})^{n+1} C_q \|f\|_1^q (n+1)!\,|\lambda|^{n+1}$ uniformly
in the analyticity domain. The idea is to use the graphical representation of the
beta function depicted in Fig. 5 to extract in the tree expansion of the Schwinger
function all the possible trees with less than n + 1 thin endpoints corresponding to
$\lambda$, as suggested by Fig. 6; the main difficulty in this procedure is to check that after
having reproduced the Taylor series up to the order n the "unwanted" trees, i.e.
the ones showing more than n endpoints of type 4, have less than n + 1 nontrivial
frames labeled by a = 2′, 4, see remark after Fig. 6. After having checked this, the
desired bound is a straightforward consequence of the n! bound (2.36).

4. Proof of Theorem 1
4.1. Analyticity of the flow of the running coupling constants
In this section we present in a mathematically precise form the properties (i)–(iii)
mentioned in the previous section after Eq. (3.11), which, as we already discussed,
are the key ingredients in the construction of the sequences of the running coupling
constants on scales $\ge 1$ as functions of the ones on scale 0. After this, we express
the running coupling constants on scale 0 in terms of the renormalized ones, and
we prove the analyticity properties required for Borel summability.
The spaces of sequences that we shall consider are the following ones:
$$S_{\lambda_0,\delta} := \Big\{ x \in \mathbb{C}^N : x_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k} \beta_j + t_k},\ |t_k| \le \delta k \Big\}, \qquad \tilde S_{\bar\varepsilon} := \{ y \in \mathbb{C}^{2N} : |y_{k,i}| \le \bar\varepsilon \}. \qquad (4.1)$$
The following two lemmas imply, respectively, properties (i), (ii) and property (iii)
stated in Sec. 3.
 
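Lemma 1. For any $\vartheta \in (0, \pi/2]$ there exist $\varepsilon > 0$, $\bar\varepsilon > 0$ such that if $(\lambda_0, \alpha_0, \mu_0) \in W_{2\varepsilon,\vartheta/2} \times B_{2\bar\varepsilon} \times B_{2\bar\varepsilon}$ and $x \in S_{\lambda_0,\varepsilon+\bar\varepsilon}$:

(1) $\tilde T_{\nu_0,x}$ is a map from $\tilde S_{4\bar\varepsilon}$ to $\tilde S_{4\bar\varepsilon}$;
(2) $\tilde T_{\nu_0,x}$ is a contraction in $\tilde S_{4\bar\varepsilon}$, i.e. if $y \in \tilde S_{4\bar\varepsilon}$, $y' \in \tilde S_{4\bar\varepsilon}$
$$\max_{k,i} \big| (\tilde T_{\nu_0,x}\, y)_{k,i} - (\tilde T_{\nu_0,x}\, y')_{k,i} \big| \le \tau \max_{k,i} |y_{k,i} - y'_{k,i}|, \qquad 0 < \tau < 1; \qquad (4.2)$$
(3) given two sequences $x$, $x'$ belonging to $S_{\lambda_0,\varepsilon+\bar\varepsilon}$, the fixed points $y(x)$, $y(x')$ of
the maps $\tilde T_{\nu_0,x}$, $\tilde T_{\nu_0,x'}$ in $\tilde S_{4\bar\varepsilon}$ verify the following inequalities:
$$|y_{k,i}(x) - y_{k,i}(x')| \le C[\log(1 + k) + 1] \max_k |x_k - x'_k|, \qquad (4.3)$$
$$|y_{k,i}(x) - y_{k,i}(x')| \le \bar C \max_k |x_k - x'_k|^{\eta}, \qquad (4.4)$$
for some positive $C$, $\bar C$ and $0 < \eta < 1$.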


 
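Lemma 2. For any $\vartheta \in (0, \pi/2]$ there exist $\varepsilon > 0$, $\bar\varepsilon > 0$ such that if $(\lambda_0, \alpha_0, \mu_0) \in W_{2\varepsilon,\vartheta/2} \times B_{2\bar\varepsilon} \times B_{2\bar\varepsilon}$ the fixed point y(x) of $\tilde T_{\nu_0,x}$ in $\tilde S_{4\bar\varepsilon}$ for $x \in S_{\lambda_0,\varepsilon+\bar\varepsilon}$ exists and:

(1) $T_{\lambda_0,y(\cdot)}$ is a map from $S_{\lambda_0,\varepsilon+\bar\varepsilon}$ to $S_{\lambda_0,\varepsilon+\bar\varepsilon}$;
(2) $T_{\lambda_0,y(\cdot)}$ is a contraction in $S_{\lambda_0,\varepsilon+\bar\varepsilon}$, i.e. if $x \in S_{\lambda_0,\varepsilon+\bar\varepsilon}$, $x' \in S_{\lambda_0,\varepsilon+\bar\varepsilon}$,
$$\max_k \big| (T_{\lambda_0,y(x)}\, x)_k - (T_{\lambda_0,y(x')}\, x')_k \big| \le \tau \max_k |x_k - x'_k|, \qquad 0 < \tau < 1. \qquad (4.5)$$

We refer the reader to Appendices A and B for the proofs of these lemmas.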
As explained in Sec. 3, these two results allow us to construct the sequences of the
running coupling constants as functions of those on scale 0, and to determine their
analyticity properties. We take
 (0) 

(0)
= S4 , (0) S0 ,+ (4.6)
(0)
analytic for (0 , 0 , 0 ) W2, B2 B2 ; to be concrete, we can choose
2

(0) (0) (0) 1


k = k = , k = , 0 W2, . (4.7)
k 2
1
0 + j
j=1

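Then, we can construct the sequences of running coupling constants by proceeding
as explained after (3.12); analyticity for $(\lambda_0, \alpha_0, \mu_0) \in W_{2\varepsilon,\vartheta/2} \times B_{2\bar\varepsilon} \times B_{2\bar\varepsilon}$ is a
straightforward consequence of the analyticity properties of the maps and of the
initial data, and of the fact that convergence is uniform for $(\lambda_0, \alpha_0, \mu_0) \in W_{2\varepsilon,\vartheta/2} \times B_{2\bar\varepsilon} \times B_{2\bar\varepsilon}$.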
Now we turn to the flow Eqs. (3.5) and (3.7) for the running coupling constants
on scale 0. Notice that these equations are different from the ones corresponding to
higher scales, because of the presence of the functions $f_{a,0}$. The main consequence
of this fact is that choosing $\lambda$ inside $C_\varepsilon$ does not imply that $\lambda_0 \in C_{\varepsilon'}$ for some $\varepsilon'$;
this is the reason why we considered $\lambda_0 \in W_{\varepsilon,\vartheta}$ so far. The strategy that we shall
adopt is very similar to, but technically much simpler than, the one we followed for the
scales 1, . . . , N, see Appendix C for details: first, we determine with a fixed point
argument $\alpha_0$, $\mu_0$ as analytic functions of $\lambda_0$, $\alpha$, $\mu$ in $W_{2\varepsilon,\vartheta/2} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$ for $\varepsilon$, $\bar\varepsilon$ small
enough; then, we plug $\alpha_0$, $\mu_0$ into Eq. (3.5) for $\lambda_0$, and we solve it using again a
fixed point argument; finally, we show that the solution has the required analyticity
properties in $\lambda$, $\alpha$, $\mu$. In particular, it follows that $\lambda_0^{-1} = \lambda^{-1}(1 + O(\lambda)) + \beta_0$, up
to corrections bounded by $(\mathrm{const.})(|\alpha| + |\mu|)$.

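Asymptotic behavior of the running coupling constants. So far, our construction
allowed us to conclude that, if $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$ with $\varepsilon$, $\bar\varepsilon$ small
enough: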
1
k = , |k | , |k | , (4.8)

k
1
(1 + O()) + k + tk
j=0

with |tk | (k + 1) + ; however, these results can be improved to get (3.3). In
fact, the ows of k , k are given by, for k 1:


k
  
k
 
k = 0 2 ,j , , k = 2k 0 2(jk) 2,j , , (4.9)
j=1 j=1

where:
       
2 ,j ,  c |j |2 , 2,j ,  c || + || |j |. (4.10)

Therefore it follows that, using the expression for k in (4.8), for some c > 0:
    # 
|k | c || + ||2 , k 2k  c 2k ||2 + (|| + ||)|k | , (4.11)

which give the last two of (3.3). To prove the first of (3.3), simply use (4.11) and
the first of (4.8) to replace the running coupling constants appearing in $R_j$, see
(3.9) and (B.2).

Analyticity of the Schwinger functions. As we have discussed in Sec. 2, the
Schwinger functions $S^T_{(N)}(f; q)$ are given by absolutely convergent power series in
the running coupling constants on scales $\le N$; therefore, taking $\varepsilon$, $\bar\varepsilon$ smaller than
the radius of convergence of the series, $S^T_{(N)}(f; q)$ is analytic for $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$.
To prove analyticity in the limit $N \to +\infty$ we show that the sequence
$\{S^T_{(N)}(f; q)\}_{N \ge 1}$ is uniformly Cauchy in the analyticity domain. In fact, consider
two positive integers N, N′ such that N′ > N; then,
$$S^T_{(N')}(f; q) - S^T_{(N)}(f; q) := S^T_{1,(N,N')}(f; q) + S^T_{2,(N,N')}(f; q), \qquad (4.12)$$
where $S^T_{1,(N,N')}(f; q)$ is given by a sum of trees with at least one endpoint on scale
$k \le N$ corresponding to the difference of running coupling constants $v_k^{(a),N'} - v_k^{(a),N}$
of theories with cutoffs on scales N′, N, and $S^T_{2,(N,N')}(f; q)$ is given by a sum of
GN trees having root scale $-1$ and at least one endpoint on scale $\ge N + 1$. The
first term can be bounded using the results of Appendix D as:
$$\big| S^T_{1,(N,N')}(f; q) \big| \le (\mathrm{const.})\, N^{-1}, \qquad (4.13)$$
while the second can be estimated using the short memory property of the GN trees
(see discussion after (2.20)) as, for some $\rho > 0$,
$$\big| S^T_{2,(N,N')}(f; q) \big| \le (\mathrm{const.})\, \gamma^{-\rho N}, \qquad \gamma > 1; \qquad (4.14)$$
all the bounds are uniform in $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$. Therefore the limit exists,
and it is analytic in $W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$.

4.2. Bounds on the Taylor remainder of the Schwinger functions


In this section we show that for all n > 0, $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_{\bar\varepsilon} \times B_{\bar\varepsilon}$, the q-point
Schwinger function $S^T(f; q)$ verifies
$$S^T(f; q) = S^{T,(n)}(f; q) + r_n(f; q) \qquad (4.15)$$
where $S^{T,(n)}(f; q)$ is the Taylor expansion of $S^T(f; q)$ up to the order n in $\lambda = 0$
and $r_n(f; q)$ is a remainder bounded by $C^{n+1}(n + 1)!\,|\lambda|^{n+1}$ for some C > 0. Result
(4.15) concludes the proof of Borel summability of the Schwinger functions of the
planar theory.
One can try to prove decomposition (4.15) by iterating the graphical definition
of the running coupling constants (see discussion after (2.35) and, in particular,
Fig. 6 to get an idea of the graphical meaning of the iteration), to extract all
the possible trees with only thin endpoints and at most n of them labeled by 4; to
conclude the proof one has to check at the end that the sum of the values of the trees
not belonging to this category is bounded by $C^{n+1}(n + 1)!\,|\lambda|^{n+1}$. For simplicity, in
the following we shall call "a-endpoint" an endpoint labeled by a, and "a-frame" a
frame labeled by a; a-frames with a equal to 2′ or 4 will be called (2′, 4)-frames.

Empty and square endpoints. We can rewrite (3.4), (3.7) in the more compact
form:
$$v_k^{(a)} = \gamma^{-2\delta_{a,2}(k+1)}\, v_{-1}^{(a)} - \gamma^{-2\delta_{a,2}k} f_{a,0}(\lambda_0, \nu_0) - \sum_{j=0}^{k} \gamma^{2(j-k)\delta_{a,2}}\, \beta_{a,j}(\lambda_0, \underline{\lambda}, \nu_0, \underline{\nu}). \qquad (4.16)$$
We graphically represent $-\gamma^{-2\delta_{a,2}k} f_{a,0}$ as an empty a-endpoint and
$$-\sum_{j=0}^{k} \gamma^{2(j-k)\delta_{a,2}}\, \beta_{a,j}$$
as a square a-endpoint. Therefore, in general, the fat a-endpoint can be written as
the sum of thin, empty and square a-endpoints; see Fig. 8. In turn, the empty and
the square endpoints can be represented as sums of framed trees with root scale k,
no inner frames and only fat endpoints, see discussion after (2.35). It is important
to notice that the frames appearing in the tree representation of $\gamma^{-2\delta_{a,2}k} f_{a,0}$ are
trivial, see Remark after Fig. 5.

Fig. 8. Fat endpoints are equal to thin plus empty plus square endpoints.

We define the order and the 4-order of fat, thin, empty, and square endpoints as
the order of their values in all the renormalized coupling constants and in $\lambda$ only,
respectively. Therefore:
• Thin and fat endpoints have order 1; empty endpoints have order 2; square
a-endpoints have order 1 or 2 depending on whether a = 2′, 4 or a = 2.
• Thin, fat and empty a-endpoints have 4-order 0 or 1 depending on whether
a = 2′, 2 or a = 4; square endpoints have 4-order 1.
Notice that the reason why we set to 1 the order and the 4-order of the square
a-endpoints with a = 2′, 4, which are given by sums of trees with two 4-endpoints,
is that we have to exploit asymptotic freedom to control the sum in (4.16); the
result can be bounded uniformly in k by $|\lambda|$ but not by $|\lambda|^2$.
Notations. We shall use the following notations:
• $n_{2',4}(\tau)$ is the number of nontrivial (2′, 4)-frames appearing in a tree $\tau$;
• $n_{sq}^{(a)}(\tau)$ is the number of square a-endpoints appearing in a tree $\tau$, and $n_{sq}(\tau) := n_{sq}^{(4)}(\tau) + n_{sq}^{(2')}(\tau) + n_{sq}^{(2)}(\tau)$;
• the order $O(\tau)$ and the 4-order $O_4(\tau)$ of a tree $\tau$ are respectively equal to the
sums of the orders, 4-orders of the endpoints of $\tau$;
• the "expansion" of square and empty endpoints consists in replacing them with
their tree expansions in terms of framed trees with no inner frames and only fat
endpoints, see discussion after (2.35).
Proof of (3.17). We will proceed by induction. Assume that, at the step r of the
induction, for every n > 0, M > 0 with $M \ge n$ the Schwinger function $S^T(f; q)$ can
be written as
$$S^T(f; q) = F^{(r)}_{n,M} + R^{(r),1}_{n,M} + R^{(r),2}_{n,M}, \qquad (4.17)$$
where both $F^{(r)}_{n,M}$, $R^{(r),i}_{n,M}$ can be represented as sums over distinct trees such that
$n_{2',4}(\tau) \le n$. Moreover, we assume that:
• the trees contributing to $F^{(r)}_{n,M}$ are such that $O_4(\tau) \le n$, $O(\tau) \le M$ and show
fat and thin endpoints;
• the trees contributing to $R^{(r),i}_{n,M}$ are such that $O_4(\tau) > n$ or $O(\tau) > M$, depending
on whether i = 1, 2, and may have empty and square endpoints.
These assumptions are trivially true at the beginning of the induction, see Sec. 2.
As a consequence of result (2.36), and since the number of topologically distinct
trees with m endpoints is estimated by $(\mathrm{const.})^m$, $R^{(r),1}_{n,M}$, $R^{(r),2}_{n,M}$ are bounded
respectively by $C^n C_q \|f\|_1^q\, n!\,|\lambda|^{n+1}$, $C^M C_q \|f\|_1^q\, n!\, \bar v^{M+1}$ for some positive C and
$\bar v := \max_k \{|\lambda_k|, |\alpha_k|, |\mu_k|\}$. Now do the following.
(1) Substitute every fat 2-endpoint appearing in $F^{(r)}_{n,M}$ with the sum of a thin
plus an empty plus a square 2-endpoint: in this way the fat 2-endpoints disappear,
generating new trees such that $n_{2',4}(\tau) \le n$ that we organize by writing
$$F^{(r)}_{n,M} = A_1^{(r)} + A_2^{(r),1} + A_2^{(r),2}, \qquad (4.18)$$
where
$A_1^{(r)}$ := sum of trees such that $O_4(\tau) \le n$ and $O(\tau) \le M$,
$A_2^{(r),1}$ := sum of trees such that $O_4(\tau) > n$,
$A_2^{(r),2}$ := sum of trees such that $O_4(\tau) \le n$ and $O(\tau) > M$.
(2) Substitute every fat 2′-endpoint appearing in $A_1^{(r)}$ with the sum of a thin plus
an empty plus a square 2′-endpoint: in this way the fat 2′-endpoints disappear,
generating new trees such that $n_{2',4}(\tau) \le n$ that we organize by writing
$$A_1^{(r)} = A_3^{(r)} + A_4^{(r),1} + A_4^{(r),2}, \qquad (4.19)$$
where
$A_3^{(r)}$ := sum of trees s.t. $n_{2',4}(\tau) + n_{sq}^{(2')}(\tau) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\tau) \le M$,
$A_4^{(r),1}$ := sum of trees s.t. $n_{2',4}(\tau) + n_{sq}^{(2')}(\tau) > n - 1 + \delta_{n_{2',4},0}$,
$A_4^{(r),2}$ := sum of trees s.t. $n_{2',4}(\tau) + n_{sq}^{(2')}(\tau) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\tau) > M$.
Notice that the trees appearing in $A_4^{(r),1}$ are such that $O_4(\tau) > n$; in fact, for these
trees,
$$O_4(\tau) \ge n_{2',4}(\tau) + 1 + n_{sq}^{(2')}(\tau) - \delta_{n_{2',4},0} > n, \qquad (4.20)$$
where we used that each nontrivial 2′-frame contains trees of 4-order $\ge 2$, that the
square 2′-endpoints are of 4-order strictly bigger than their corresponding thin and
empty endpoints, and the definition of $A_4^{(r),1}$.
(3) Expand each square a-endpoint with a = 2′, 2 appearing in $A_3^{(r)}$, and write
$$A_3^{(r)} = A_5^{(r)} + A_6^{(r),1} + A_6^{(r),2}, \qquad (4.21)$$
where
$A_5^{(r)}$ := sum of the trees s.t. $O_4(\tau) \le n$ and $O(\tau) \le M$,
$A_6^{(r),1}$ := sum of the trees s.t. $O_4(\tau) > n$,
$A_6^{(r),2}$ := sum of the trees s.t. $O_4(\tau) \le n$ and $O(\tau) > M$.
Notice that the trees generated at this step are such that $n_{2',4}(\tau) \le n$; in fact, for a
generic tree $\tau'$ generated by $\tau \in A_3^{(r)}$ it follows that $n_{2',4}(\tau') = n_{2',4}(\tau) + n_{sq}^{(2')}(\tau) \le n$,
where the last inequality holds by definition of $A_3^{(r)}$.

(4) Substitute every fat 4-endpoint appearing in $A_5^{(r)}$ with the sum of a thin plus
an empty plus a square 4-endpoint: in this way the fat 4-endpoints disappear,
generating new trees such that $n_{2',4}(\tau) \le n$ that we organize by writing
$$A_5^{(r)} = A_7^{(r)} + A_8^{(r),1} + A_8^{(r),2}, \qquad (4.22)$$
where
$A_7^{(r)}$ := sum of trees s.t. $n_{2',4}(\tau) + n_{sq}^{(4)}(\tau) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\tau) \le M$,
$A_8^{(r),1}$ := sum of trees s.t. $n_{2',4}(\tau) + n_{sq}^{(4)}(\tau) > n - 1 + \delta_{n_{2',4},0}$,
$A_8^{(r),2}$ := sum of trees s.t. $n_{2',4}(\tau) + n_{sq}^{(4)}(\tau) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\tau) > M$.
Now, we show that $A_8^{(r),1}$ can be rewritten as a sum of trees such that $O_4(\tau) > n$ and
$n_{2',4}(\tau) \le n$. Notice that since the 4-order of the 4-square endpoint is equal to the
4-order of its corresponding fat, thin and empty endpoints, we cannot use a bound
like the one in (4.20). To raise the 4-order of a tree $\tau$ up to n + 1 we have to
expand a suitable number $\bar n_{sq}^{(4)}(\tau) \le n_{sq}^{(4)}(\tau)$ of square 4-endpoints (which are given
by sums of trees of 4-order $\ge 2$). If $n_{2',4}(\tau) = 0$ we choose $\bar n_{sq}^{(4)}(\tau) = 0$, because in
this case by definition of $S_4^{(r)}$ the 4-order of $\tau$ is already > n; if $n_{2',4}(\tau) > 0$ we
choose
$$\bar n_{sq}^{(4)}(\tau) := n - n_{2',4}(\tau), \qquad (4.23)$$
with this choice it follows that ($n_{2',4}(\tau)$ refers to the tree before this last expansion),
$$O_4(\tau) \ge n_{2',4}(\tau) + 1 + \bar n_{sq}^{(4)}(\tau) = n + 1. \qquad (4.24)$$
Finally, a generic tree $\tau'$ produced by this last expansion verifies
$$n_{2',4}(\tau') = n_{2',4}(\tau) + \bar n_{sq}^{(4)}(\tau) = n. \qquad (4.25)$$
(5) Expand each square 4-endpoint appearing in $A_7^{(r)}$, and write
$$A_7^{(r)} = A_9^{(r)} + A_{10}^{(r),1} + A_{10}^{(r),2}, \qquad (4.26)$$
where
$A_9^{(r)}$ := sum of the trees s.t. $O_4(\tau) \le n$ and $O(\tau) \le M$,
$A_{10}^{(r),1}$ := sum of the trees s.t. $O_4(\tau) > n$, (4.27)
$A_{10}^{(r),2}$ := sum of the trees s.t. $O_4(\tau) \le n$ and $O(\tau) > M$.
Notice that the trees generated at this step are such that $n_{2',4}(\tau) \le n$; in fact, for a
generic tree $\tau'$ generated by $\tau \in A_7^{(r)}$ it follows that $n_{2',4}(\tau') = n_{2',4}(\tau) + n_{sq}^{(4)}(\tau) \le n$,
where the last inequality holds by definition of $A_7^{(r)}$.

(6) Expand each empty a-endpoint appearing in $A_9^{(r)}$, and write
$$A_9^{(r)} = F^{(r+1)}_{n,M} + A_{12}^{(r),1} + A_{12}^{(r),2} \qquad (4.28)$$
where
$F^{(r+1)}_{n,M}$ := sum of the trees s.t. $O_4(\tau) \le n$ and $O(\tau) \le M$,
$A_{12}^{(r),1}$ := sum of the trees s.t. $O_4(\tau) > n$, (4.29)
$A_{12}^{(r),2}$ := sum of the trees s.t. $O_4(\tau) \le n$ and $O(\tau) > M$.
(7) We are now able to express the generic Schwinger function $S^T(f; q)$ as
$$S^T(f; q) = F^{(r+1)}_{n,M} + R^{(r),1}_{n,M} + R^{(r),2}_{n,M} + \sum_{j=1,2} \sum_{i=1}^{6} A_{2i}^{(r),j} =: F^{(r+1)}_{n,M} + R^{(r+1),1}_{n,M} + R^{(r+1),2}_{n,M}, \qquad (4.30)$$
where, by construction, all the trees are such that $n_{2',4}(\tau) \le n$, the remainder
$R^{(r+1),1}_{n,M}$ contains distinct trees such that $O_4(\tau) > n$, while $R^{(r+1),2}_{n,M}$ is given by a
sum of distinct trees such that $O(\tau) > M$. If $F^{(r+1)}_{n,M}$ still contains trees with fat
endpoints repeat the process starting from step (1), otherwise we have finished:
calling $r^*$ the final step (which is finite, see Remark below), i.e. the integer such
that $F^{(r^*)}_{n,M}$ contains trees with only thin endpoints, the n! bound (2.36) implies that,
if $\bar v = \max_h \{|\lambda_h|, |\alpha_h|, |\mu_h|\}$:
$$\big|R^{(r^*),1}_{n,M}\big| \le C^n C_q \|f\|_1^q\, n!\,|\lambda|^{n+1}, \qquad \big|R^{(r^*),2}_{n,M}\big| \le C^M C_q \|f\|_1^q\, n!\, \bar v^{M+1}. \qquad (4.31)$$
Moreover, $F^{(r^*)}_{n,M}$ differs from $F^{(r^*)}_{n,+\infty}$, the Taylor expansion in $\lambda$ to the order n, by
a quantity bounded by $C^M C_q \|f\|_1^q\, n!\, \bar v^{M+1}$; therefore, for each $\lambda$ in the analyticity
domain and for each $n \ge 0$ there exists a finite integer $M^*(\lambda, n) \ge n$ such that for
all $M \ge M^*(\lambda, n)$ it follows that:
$$\big| S^T(f; q) - F^{(r^*)}_{n,+\infty} \big| \le 4 C^n C_q \|f\|_1^q\, n!\,|\lambda|^{n+1}, \qquad (4.32)$$
and this bound concludes the proof of Borel summability of the $\phi^4_4$ planar theory.
Remark. The iteration ends in less than M + 1 steps (where each step is formed
by the seven substeps described above); this means that no trees with fat endpoints
are present in $F^{(M+1)}_{n,M}$. We can prove this fact with a simple induction. At the step
r = 0, the trees with fat endpoints are of order $\ge 0$. Assume inductively that at the
r-th step, the trees belonging to $F^{(r)}_{n,M}$ with at least one fat endpoint are of order
$\ge r$. If this is true, by repeating the six substeps described above, we find that
the new trees with at least one fat endpoint appearing in $F^{(r+1)}_{n,M}$ must be of order
$\ge r + 1$, since at the r-th step the fat endpoints are replaced by thin plus empty
plus square endpoints, and the empty endpoints are of order 2 while the squares
are given by sums of trees of order $\ge 2$. Hence, after at most $r^* = M + 1$ iterations
no more trees showing fat endpoints will be present in $F^{(r^*)}_{n,M}$.

5. Conclusions
In this paper, we discussed the issue of Borel summability in the framework of multiscale
analysis and renormalization group, by providing a proof of Borel summability
for the $\phi^4_4$ planar theory using the techniques of [6]. This result is not new, since it
has been proven independently by 't Hooft and Rivasseau, [25]. The proof given
by 't Hooft is based on renormalization group methods, and it does not rely on the
Nevanlinna–Sokal theorem; we have not been able to fully reproduce 't Hooft's argument
in our rigorous framework. The proof given by Rivasseau, instead, consists in
a check of the two hypotheses of the Nevanlinna–Sokal theorem. However, his methods
are quite different from the ones that we use, since in his approach the beta function
was not introduced. Moreover, in his work a particular choice of the wave function
renormalization and of the renormalized mass was made.
One of the motivations of our work is that very few proofs of Borel summability
of interacting field theories based on renormalization group methods are present in
the literature, [8, 9]. Moreover, our framework has already been proved effective in
the analysis of various models of condensed matter and field theory. Therefore, we
consider our work as a first step towards the analysis of more interesting models. For
instance, we think that the ideas of this paper can be applied to the one-dimensional
Hubbard model, which has been rigorously constructed through renormalization
group methods in [15], but where a proof of Borel summability has not been given
yet. In fact, due to the anticommutativity of the fermionic fields the factorial growth
of the Feynman graphs can be controlled using the so-called Gram bounds. Moreover,
one sector of the theory is asymptotically free, while to control the flow of the other
running coupling constants one has to exploit the vanishing of the beta function.
Regarding our work, the first part of this paper consists essentially in a rigorous
study of the beta function of an asymptotically free field theory. In particular,
we have shown that the theory is analytic for values of the renormalized coupling
constant belonging to a Watson domain, see [18] and definition (2.40), and for
values of the wave function renormalization and of the renormalized mass close to
1 in absolute value. In the second part of our work, to prove Borel summability
we have shown that it is possible to undo the resummation that allowed us to
write the Schwinger functions as a convergent power series in the running coupling
constants, in such a way that the difference between the generic Schwinger function
and its Taylor expansion to the order n in $\lambda$ is bounded by $C^n n!\,|\lambda|^{n+1}$ for some
positive C. Thanks to the Nevanlinna–Sokal theorem, see [14], this last fact along with
the above mentioned analyticity properties implies Borel summability.

Acknowledgments
It is a pleasure to thank Prof. G. Gallavotti for having introduced us to the theory
of renormalization, for having proposed the problem and for many very useful dis-
cussions, from which all the ideas of this paper emerged. We are also grateful to
Dr. A. Giuliani, for constant encouragement and constructive criticism.

Appendix A. Proof of Lemma 1


In this appendix, we present the proof of Lemma 1. Recall that r is the number of
running coupling constants of type 4 appearing at a given order r of the perturbative
series dening the beta function, see (2.28); moreover, we dene r := r r. We
remind also that with the notation y(x) we denote the xed point of the map T ,x
0

in S2 .
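All the estimates that we shall derive here and in the next appendix are
consequences of the fact that, as can be checked in a straightforward way, if
$x, x' \in S_{\lambda_0,\varepsilon+\bar\varepsilon}$, $\lambda_0 \in W_{\varepsilon,\vartheta}$ and $\varepsilon$, $\bar\varepsilon$ are small enough there exists a constant C > 0
such that
$$|x_k| \le \frac{C}{|\lambda_0|^{-1} + k}, \qquad \Big| \frac{x_k}{x_h} \Big| \le C \quad \text{if } k \ge h; \qquad (A.1)$$
the constant C grows as $\vartheta^{-1}$ for $\vartheta \to 0$.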
Proof of Lemma 1 (1). First, we have to prove that if (0 , 0, 0 ) W, B B
and x S0 ,+ the map T 0 ,x leaves invariant S2 , for 0, , and , small
2
enough; in fact, setting a = (a1 , . . . , ar ), h = (h1 , . . . , hr ):

k      
r
r
,x y)k,2 | |0 | +
|(T 0
a(2 ) (j; h) |xhi | |yhi ,ai |
j=1 r2 hi j {ai }ri=1 i=1 i=1
i=1,...,r ai =4 ai =4


k      
|0 | + a(2 ) (j; h)|xj |2 Cr r2 (2)r
j=1 r2 hi j {ai }ri=1
i=1,...,r

|0 | + C (A.2)
for some C > 0. Similarly,

k     
,x y)k,2 | |0 | +
|(T 0 2(jk) a(2) (j; h)Crr(2)r
j=1 r2 hi j {ai }ri=1
i=1,...,r

|0 | + C ( + )2 , (A.3)
for C large enough. Hence, if (0 , 0 ) B B , then both (A.2), (A.3) can be
made smaller than 2 taking small enough.
Proof of Lemma 1(2). Under the same assumption of Lemma 1(1), we show now
0 ,x is a contraction in S2 ; in fact,
that T


k     
r
|(T 0
,x y  )k,2 |
,x y)k,2 (T
0
a(2 ) (j; h) |xhi |
j=1 r3 hi j i=1
{ai }ri=1 i=1,...,r ai =4
 
(6)r1 r max yk,i yk,i
 
, (A.4)
k,i

where we used that the second order of the beta function depends only on xj ;
therefore, we can exploit two of the xhis to perform the sum, and it follows that,
for , small enough:
0 ,x y)k,2 (T
|(T 0 ,x y  )k,2 |  max |y  yk,i |, 0 <  < 1. (A.5)
k,i
k,i

The same result can be proved for the dierence of the 2-components, using the
2(jk) factor to perform the sum over the js; this concludes the proof of the
,x .
contractivity of T 0

Proof of Lemma 1(3). We prove here the last item of Lemma 1. Given y S2 ,
set
 n  
 n 

yk,i,n := T
0 ,x y k,i , yk,i,n := T 0 ,x y k,i , (A.6)
and assume inductively that for all 0 m n the following bound is true:
 

|yk,i,m yk,i,m | C(log(1 + k) + 1) max xk xk ; (A.7)
k

therefore, from (A.7) it follows that:


&

k     
|yk,2 ,n+1 
yk,2 ,n+1 | a(2 ) (j; h) C r1 |xj |
r (3)r2 (6)r

j=1 r2 hi j
{ai }ri=1 i=1,...,r
'

r
r2 r1
+ CCr (log(1 + h ) + 1)|xj | (3)
2
(6)
=1
a =4

max |xk xk |, (A.8)


k

and
&

k    
|yk,2,n+1 
yk,2,n+1 | 2(jk) a(2) (j; h) C r1 r(3)r1 (6)r

j=1 r2 hi j
{ai }ri=1 i=1,...,r
'

r
r1
+ CCr (log(1 r
+ h ) + 1)(3) (6)
=1
a =4

max |xk xk |. (A.9)


k

Using the short memory property of the GN trees, see discussion after (2.20), it
follows that:
  
 (i) (j; h) log(1 + h ) (const.)r log(1 + j); (A.10)
a
h1 ,...,hr
hi j

plugging this bound into (A.8), (A.9) we can reproduce our inductive assumption
(A.7) for m = n + 1, choosing for , small enough. This concludes the proof
Borel Summability of 44 Planar Theory via Multiscale Analysis 1025

of (4.3). The Hölder continuity bound (4.4) can be proved again by induction,
replacing (A.7) with
 
 
|yk,i,m yk,i,m | C max |xk xk | , 0 < < 1, (A.11)
k

and using in (A.8) the bound, if hi j for all i = 1, . . . , r and 0 < < 1:
 
   
 r r 
  
 xhi r max |xk xk | |xj |2 Cr (3)r2 .
xhi  2 
(A.12)
  k
 i=1 i=1 
ai =4 ai =4

Appendix B. Proof of Lemma 2


In this appendix we present a proof of Lemma 2.

Proof of Lemma 2(1). First, we have to prove that T0 ,y() leaves invariant
S0 ,+ for (0 , 0 , 0 ) W, B B for , small enough. We have that

1
(T0 ,y(x) x)k = , (B.1)

k k
1
0 + j Rj (x, y(x))
j=1 j=1

where y(x) S2 and

xj j2 + x1 2
j j 4,j (x, y(x)) xj 4,j (x, y(x))
Rj (x, y(x)) = , (B.2)
1 + j xj + x1 4,j (x, y(x))
j

with

  
r
r
4,j (x, y(x)) = a(4) (j; h) xhi yhi ,ai (x), (B.3)
r3 hi j {ai }ri=1 i=1 i=1
i=1,...,r ai =4 ai =4

(4)
where a1 ,...,ar (j; h1 , . . . , hr ) = 0 unless there are at least two ai equal to 4. The
nal statement follows from the fact that for , small enough

|Rj (x, y(x))| (const.)( + ). (B.4)

Proof of Lemma 2(2). To conclude, we have to show that under the same assump-
tions of the previous item, T0 ,y(x) is a contraction in S0 ,+ . Setting y(x) =: y,
y(x ) =: y  , from (B.1) we have that

(T0 ,y(x) x)k (T0 ,y(x ) x )k



k
(Rj (x, y) Rj (x , y  ))
(B.5)
= ,
j=1


k 
k 
k 
k
1
0 + j Rj (x, y) 1
0 + j Rj (x , y )
j=1 j=1 j=1 j=1

where $R_j$ is given by (B.2); therefore, to bound the difference of the $R_j$'s calculated
at different $x$ we have to estimate (the other terms can be worked out in a similar
way)
 2 
x 4,j (x, y) xj 2 4,j (x , y  )
j
   
a(4) (j; h)
r3 {hi }j
{ai }
 
 
 2
r r r r
 2   
xj xhi yhi ,ai xj xhi yhi ,ai ; (B.6)
 
i=1 i=1 i=1 i=1
ai =4 ai =4 ai =4 ai =4

we have that
 
 
 r r r r 
 2  2   
xj xhi yhi ,ai xj xhi yhi ,ai 
 
 i=1 i=1 i=1 i=1 
ai =4 ai =4 ai =4 ai =4
 
 
 r r r r 
 2   
xj xhi yhi ,ai xhi yhi ,ai  (B.7)
 
 i=1 i=1 i=1 i=1 
ai =4 ai =4 ai =4 ai =4
 
 
 (x + x ) r r 
  j j   
+ max |xk xk |  2  2 xhi yhi ,ai  (B.8)
k  xj xj 
 i=1 i=1 
a =4
i ai =4

and:

(B.7) |xj |1 Cr1 r(3)r2 (6)r max |xk xk |


k

r
+ Cr(3)r2 (6)r1 |yh ,a yh  ,a | (B.9)
=1
a =4

(B.8) max |xk xk ||xj |1 2Cr (3)r2 (6)r. (B.10)


k
Using (4.3) and the short memory property it follows that:


   
 (4) (j; h)yh ,a y   
h ,a (const.) [log(1 + j) + 1] max |xk xk |;
r
a  
k
h1 ,...,hr
hi j
(B.11)
therefore, since the other terms arising in the difference (B.5) can be treated exactly
in the same way, from (B.9)–(B.11) we find that:

k k
|Rj (x , y  ) Rj (x, y)| (const.) |xj |1 ( + ) + k(log(1 + k) + 1)
j=1 j=1

max |xk xk |, (B.12)


k

which gives statement (4.5) for , small enough. In fact, the denominator of (B.5)
is bounded from below as
  
  
 1  k
  1  k

 +  
(j Rj (x, y)) 0 +   
(j Rj (x , y )) (const.)|xk |2 ;
 0
 j=1  j=1 
(B.13)
using the second of (A.1) our claim (4.5) follows.

Appendix C. The Running Coupling Constants on Scale 0


In this appendix, we discuss how to express the running coupling constants on scale
zero as functions of the renormalized ones. First, a straightforward computation
shows that the second equation in (3.7) can be rewritten as:
1 0
0 = 2,0 (0 , , 0 , ); (C.1)
1+ 1+
(2)
this is a consequence of the fact that in (2.28) a1 ,...,ar (0; 0 . . . , 0) = 1 if ai = 2 for
all i [1, r]. Since the running coupling constants on scale > 0 are parametrized
by the ones on scale 0, we can rewrite (C.1) as

0 =: + g2 (0 , 0 , ) =: + f2 () + g2 (0 , 0 , ), (C.2)
1+
and plugging (C.2) in the rst equation of (3.7) we get
0 =: + f2 () + g2 (0 , 0 , ), (C.3)

where: fi () are analytic functions of B, and gi (0 , 0 , ) are analytic for


(0 , 0 , ) W2, B2 B2 B. Formulas (C.2), (C.3) can be regarded as a
2
xed point equation:
( )
i,0 = M ,0 0 ; (C.4)
i
all we have to do is to check that: (i) for |0 |, || small enough M ,0 leaves invariant
the set B 32 B 23 , and (ii) M
,0 is a contraction therein. The property (i) is a
straightforward consequence of the fact that

|fi ()| C||2 , |gi (0 , 0 , )| C|0 |(|0 | + ||), (C.5)

 
where in the second inequality  we used
 that | k | c| 0 |, |i,k | c |0 | + |0 | and
that, from (3.7), |i,0 | c |0 | + || ; if we choose (0 , ) W2, B B with
2
, small enough then the set B 23 B 32 B2 B2 is left invariant by (C.4).
To prove property (ii), we use a Cauchy estimate. In fact, the Cauchy bound
tells us that if y, y  B 23 B 23 then, since gi (0 , y, ) is analytic for y B2 B2
and bounded as (C.5), for (0 , ) W2, B with , small enough:
2


|gi (0 , y, ) gi (0 , y , )| 2C ( + ) max |yi yi |  max |yi yi | (C.6)
i i

with 0 <  < 1. Therefore, we can construct explicitly the solution i,0 (0 , ),
and the above properties allow us to conclude that it is analytic for (0 , )
W2, B B.
2
After this, we are left with Eq. (3.5) for 0 ; since all the couplings on scale 1
are functions of 0 , 0 and, as we know for our previous analysis, i,0 = i,0 (0 , ),
we can rewrite (3.5) as:

0 =: 0 f4 () 0 20 + h(0 , ) (C.7)

f4 () = O(), |h(0 , )| C|0 |2 (|0 | + ||),

where we used that 0 satises (C.4) with i = 2, and h(0 , ) is analytic for (0 , )
W2, B B. Therefore, we can rewrite (C.7) as:
2

0 = M,
0 ,

:=
, (C.8)
1 + f4 ()

1
M,
x := + (0 x2 + h(x, )).
1 + f4 ()

All we have to do is to check that: (i) if (, ) W, B B then M, leaves


invariant the set W 32 , 23 W2, , and (ii) M, is a contraction therein. Let us
2
prove property (i); for , small enough, it is easy to see that if W, then
W 4 3 and x W 3 2 M x W 3 2 .
3 , 4 2, 3 , 2
, 3
We now turn to property (ii). From the analyticity of h(x, ) in x W2, , using
2
that the distance from a point x W 32 , 23 to the boundary of W2, is bounded
2
|x|
from below by sin 6 , if x, x W 32 , 23 a Cauchy estimate tells us that:
3
 
 3C ( + )
|M,
x M
, x | 8 0
+ |x x | |x x | (C.9)
sin(/6)
with  < 1; the rst inequality follows from the bound on h in (C.7), while the
second holds taking small enough (remember that 0, 2 ).
In conclusion, we can explicitly construct the solution of (C.8), and by a simple
inductive argument it follows that it is analytic for (, , ) W, B B.
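
The two-step scheme used in this appendix (first check that the map leaves a set invariant, then that it is a contraction there) is what allows the solution to be constructed explicitly by iteration. The following is only a minimal numerical sketch of that idea: the scalar map `toy_map` and the values `LAMBDA_BAR`, `BETA` are hypothetical stand-ins that mimic the quadratic structure of (C.8), not the actual functions of the paper.

```python
import math


def iterate_fixed_point(update, start, tol=1e-14, max_iter=200):
    """Iterate x -> update(x) until successive values differ by less than tol."""
    x = start
    for _ in range(max_iter):
        nxt = update(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("iteration did not converge")


# Hypothetical toy data (assumptions for illustration, not taken from the paper).
LAMBDA_BAR = 0.05
BETA = 0.5


def toy_map(x):
    # Quadratic map; near x = 0 its derivative -2*BETA*x is small,
    # so it is a contraction on a small interval around LAMBDA_BAR.
    return LAMBDA_BAR - BETA * x * x


fixed_point = iterate_fixed_point(toy_map, 0.0)

# Closed-form root of BETA*x^2 + x - LAMBDA_BAR = 0 lying near LAMBDA_BAR.
exact = (-1.0 + math.sqrt(1.0 + 4.0 * BETA * LAMBDA_BAR)) / (2.0 * BETA)
```

In this toy setting the iterates converge geometrically, which is the one-dimensional analogue of the contraction estimate (C.9).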

Appendix D. Dependence of the Running Coupling Constants on the Ultraviolet Cutoff

In this appendix we show that the running coupling constants are weakly dependent
on the location of the ultraviolet cutoff; in particular, denoting with a superscript
N the quantities corresponding to a theory with cutoff on scale N , if (, , )
W, B B with , small enough we show that there exist two positive
constants C, such that for any k N and N < N  the following bounds hold:
 N  C
 N   + C N ,
k k
1 + N
 N  C
k N  + C N , (D.1)
k 1
+N
 N  C
 N   .
k k
1 + N
In the proof, we shall use in a crucial way the short memory property of the GN
N
trees, see discussion after (2.20). Consider rst the dierence of N
k , k . Denoting
by a prime the running coupling constants corresponding to a theory with cuto
N  and neglecting the N label in the others we have that

k
 N 
|k k | 2 ,j (, , ) 2N ,j ( ,  ,  ) + |0 0 |, (D.2)
j=1

where 2N ,j is the beta function the theory with an ultraviolet cuto on scale N .
Let a := maxk[0,N ] |ak |; using property (2.31) and the bounds in (A.1) it follows
that, for some C1 > 0, > 0 (neglecting for simplicity the arguments of the beta
function):
 N 
2 ,j 2N ,j  C1   C1 
1 2
   +    + 1 |h h | (jh)
( + j) +j
hj
(jN )

+ C1 ( + ) , (D.3)
1 + j
|0 0 | C1 ( + )(   +    +   ) + C1 ( + )2 N ,
(D.4)

where the last terms in (D.3) and (D.4) take into account the contribution of GN
trees with at least one endpoint on scale > N , and all the others bound the differences
of trees with all endpoints on scale < N . Therefore, plugging (D.3) and (D.4)
in (D.2) we have that, for some C1 > 0:

N
C1 
N
   C1 ( + )(   +   ) + |h h | (jh)
j=1
1 + j
hj

C1 ( + )
+ 1 + C1 ( + )2 N . (D.5)
+N
By what has been discussed in Secs. 3 and 4 and in Appendices A and B, it follows
that |k k | for k 1 can be estimated in the following way, for some C2 > 0:
C2 (   +    +   ) C2 ( + ) (kN )
|k k | + , (D.6)
1 + k (1 + k)2
where the first term takes into account the difference of running coupling constants
on scale N , while the last term takes into account trees with root scale k having
at least one endpoint on scale > N . Plugging (D.6) in (D.5) it is straightforward
to see that, for some $C_3 > 0$,

   C3 ( + )(   +   )
C3 ( + )
+ + C3 ( + )2 N , (D.7)
1 + N
which if inserted in (D.6) implies, for some positive C4 , C5 :
C4 ( + )
   C4    + C4 ( + )2 N + ,
1 + N
(D.8)
C5 ( + )
   C5 ( + )   + + C5 ( + )2 N .
1 + N
The dierence k k can be bounded in a way analogous to k k , and using
(D.8) it follows that
C6
   + C6 N , C6 > 0, (D.9)
1 + N
which together with (D.8) proves (D.1).

Appendix E. An Improvement of the n! Bounds in the Planar Theory

In this appendix, we discuss an improvement, valid in the planar case, of the n!
bounds proved in [6, Sec. XIX], see formulas (19.5) and (20.2). Here we shall follow
the notations of that work: we remind that the form factor r(a) (; k) of [6] corre-
sponds to the contribution of the tree with thin endpoints to the formal expansion
(a)
of vk (2a,2 +4a,0 )k in , , , which is obtained by iteration of the equation graph-
ically represented in Fig. 5. We claim that [6, Eq. (20.2)] is still valid if f is replaced
by an f denoting just the number of nontrivial frames (see remark after Fig. 5 for
the denition of trivial frame) labeled by a = 2 , 4.
To prove the claim, observe that one can repeat the proof of Sec. XIX in [6]
with the new inductive assumption


f
(bk)j
|r (a) n1 f!
(; k)| D
n
(2a,2 +4a,0 )k (E.1)
j=0
j!

instead of [6, Eq. (19.5)], the only difference being that the number of topological
Feynman graphs with $m$ vertices is bounded proportionally to $N_0^m$ where $N_0$ is
a suitable constant, because of the restriction to the planar theory. Then if f is
the number of nontrivial (2 , 4)-frames of excluding the external one, equation [6,
Eq. (19.13)] is replaced by, depending on whether the frame enclosing is trivial
or not:
nm Dm f!
|r(a) (; k)| D7 N m Dm n D
0 4 6


k f
(bh)r
(2a,2 +4a,0 )h , (nontrivial frame), (E.2)
r!
h=0 r=0
nm Dm f!,
|r(a) (; k)| D7 N0m D4m n D (trivial frame);
6

with respect to [6], we have kept the factor (2a,2 +4a,0 )h inside the sum, instead of
estimating it replacing h with k. If the frame enclosing is trivial the claim follows
from the second of (E.2), taking D large enough (as in [6], here m 2). If the frame

is nontrivial and a = 2 , 4, proceed as in [6, Eq. (19.15)], while if a = 2 substitute
that bound with


k 
f
(bh)r 2k  (bk)r
f
2h , (E.3)
r! 1 2 r=0 r!
h=0 r=0

and do the same for a = 0 ( 2k will be replaced by 4k ). From this the claim follows
suciently large, as explained in [6].
choosing D

References
[1] L. D. Landau, Collected Papers of L. D. Landau (Gordon and Breach, 1965).
[2] G. 't Hooft, Borel summability of a four-dimensional field theory, Phys. Lett. B 119
(1982) 369–371.
[3] G. 't Hooft, Rigorous construction of planar diagram field theories in four dimensional
euclidean space, Comm. Math. Phys. 88 (1983) 1–25.
[4] V. Rivasseau, Construction and Borel summability of planar 4-dimensional Euclidean
field theory, Comm. Math. Phys. 95 (1984) 445–486.
[5] V. Rivasseau, Rigorous construction and Borel summability for a planar four-dimensional
field theory, Phys. Lett. B 137 (1983) 98–102.
[6] G. Gallavotti, Renormalization theory and ultraviolet stability for scalar fields via
renormalization group methods, Rev. Mod. Phys. 57 (1985) 471–562.
[7] G. Gallavotti and V. Rivasseau, $\phi^4$-Field theory in dimension four: A modern
introduction to its open problems, Ann. Inst. H. Poincaré 40 (1984) 185–220.
[8] F. Feldman, J. Magnen, V. Rivasseau and R. Seneor, Construction and Borel summability
of infrared $\phi^4_4$ by a phase space expansion, Comm. Math. Phys. 109 (1987)
437–480.
[9] F. Feldman, J. Magnen, V. Rivasseau and R. Seneor, A renormalizable field theory:
The massive Gross–Neveu model in two-dimensions, Comm. Math. Phys. 103 (1986)
67–103.
[10] J. Koplik, A. Neveu and S. Nussinov, Some aspects of the planar perturbation series,
Nucl. Phys. B 123 (1977) 109–131.
[11] E. Brezin, C. Itzykson, G. Parisi and J. B. Zuber, Planar diagrams, Comm. Math.
Phys. 59 (1978) 35–51.
[12] G. Gallavotti and F. Nicolò, Renormalization theory for four-dimensional scalar fields,
I, Comm. Math. Phys. 100 (1985) 545–590.
[13] G. Gallavotti and F. Nicolò, Renormalization theory for four-dimensional scalar fields,
II, Comm. Math. Phys. 101 (1985) 247–282.
[14] A. Sokal, An improvement of Watson's theorem on Borel summability, J. Math. Phys.
21 (1980) 261–263.
[15] V. Mastropietro, Rigorous proof of Luttinger liquid behaviour in the 1d Hubbard
model, J. Stat. Phys. 121 (2005) 373–432.
[16] G. H. Hardy, Divergent Series (Oxford University Press, 1949).
[17] V. Rivasseau, Constructive field theory in zero dimension, Adv. Math. Phys. 2009
(2009) article ID 180159, 12 pp.
[18] G. N. Watson, A theory of asymptotic series, Philos. Trans. R. Soc. Lond. Ser. A
211 (1912) 279–313.
[19] G. Benfatto and G. Gallavotti, Perturbation theory of the Fermi surface in a quantum
liquid. A general quasi-particle formalism and one-dimensional systems, Comm.
Math. Phys. 258 (2005) 609–655.
[20] G. Gentile and V. Mastropietro, Renormalization group for one-dimensional fermions.
A review on mathematical results, Phys. Rep. 352 (2001) 273–437.
[21] G. Benfatto and V. Mastropietro, Ward identities and chiral anomaly in the Luttinger
liquid, Comm. Math. Phys. 258 (2005) 609–655.
[22] C. De Calan and V. Rivasseau, Local existence of the Borel transform in Euclidean
$\phi^4_4$, Comm. Math. Phys. 82 (1982) 69–100.
[23] K. Hepp, Proof of the Bogoliubov–Parasiuk theorem on renormalization, Comm.
Math. Phys. 2 (1966) 301–326.
[24] W. Zimmermann, Convergence of Bogoliubov's method of renormalization in momentum
space, Comm. Math. Phys. 15 (1969) 208–234.
[25] J. Polchinski, Renormalization and effective Lagrangians, Nucl. Phys. B 231 (1984)
269–295.
October 12, 2010 10:3 WSPC/S0129-055X 148-RMP
J070-S0129055X10004156

Reviews in Mathematical Physics


Vol. 22, No. 9 (2010) 1033–1059

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004156

PARALLEL TRANSPORT OVER PATH SPACES

SAIKAT CHATTERJEE and AMITABHA LAHIRI


S. N. Bose National Centre for Basic Sciences,
Block JD, Sector III, Salt Lake, Kolkata 700098,
West Bengal, India
saikat@math.tifr.res.in
amitabha@bose.res.in

AMBAR N. SENGUPTA
Department of Mathematics,
Louisiana State University,
Baton Rouge, Louisiana 70803, USA
ambarnsg@gmail.com

Received 10 October 2009


Revised 14 June 2010

We develop a differential geometric framework for parallel transport over path spaces
and a corresponding discrete theory, an integrated version of the continuum theory, using
a category-theoretic framework.

Keywords: Gauge theory; path spaces; double categories.

Mathematics Subject Classification 2010: 81T13, 58Z05, 16E45

1. Introduction
A considerable body of literature has grown up around the notion of surface
holonomy, or parallel transport on surfaces, motivated by the need to have a gauge
theory of interaction between charged string-like objects. Approaches include direct
geometric exploration of the space of paths of a manifold (Cattaneo et al. [5], for
instance), and a very different, category-theory flavored development (Baez and
Schreiber [2], for instance). In the present work, we develop both a path-space
geometric theory and a category theoretic approach to surface holonomy, and
describe some of the relationships between the two.
As is well known [1] from a group-theoretic argument and also from the fact that
there is no canonical ordering of points on a surface, attempts to construct a group-
valued parallel transport operator for surfaces lead to inconsistencies unless the

Current address: School of Mathematics, Tata Institute of Fundamental Research, Homi Bhabha
Road, Mumbai 400005, India.

group is abelian (or an abelian representation is used). So in our setting, there are
two interconnected gauge groups $G$ and $H$. We work with a fixed principal $G$-bundle
$\pi : P \to M$ and connection $\bar A$; then, viewing the space of $\bar A$-horizontal
paths itself as a bundle over the path space of $M$, we study a particular type of
connection on this path-space bundle which is specified by means of a second connection
$A$ and a field $B$ whose values are in the Lie algebra $LH$ of $H$. We derive explicit
formulas describing parallel-transport with respect to this connection. As far as we are
aware, this is the first time an explicit description for the parallel transport operator has
been obtained for a surface swept out by a path whose endpoints are not pinned.
We obtain, in Theorem 2.1, conditions for the parallel-transport of a given point
in path-space to be independent of the parametrization of that point, viewed as a
path. We also discuss $H$-valued connections on the path space of $M$, constructed
from the field $B$. In Sec. 3, we show how the geometrical data, including the field $B$,
lead to two categories. We prove several results for these categories and discuss how
these categories may be viewed as "integrated" versions of the differential geometric
theory developed in Sec. 2.
In working with spaces of paths, one is confronted with the problem of specifying
a differential structure on such spaces. It appears best to proceed within a simpler
formalism. Essentially, one continues to use terms such as "tangent space" and
"differential form", except that in each case the specific notion is defined directly
(for example, a tangent vector to a space of paths at a particular path $\gamma$ is a vector
field along $\gamma$) rather than by appeal to a general theory. Indeed, there is a good
variety of choices for general frameworks in this philosophy (see, for instance, [16,
17]). For this reason, we shall make no attempt to build a manifold structure on
any space of paths.

1.1. Background and motivation


Let us briefly discuss the physical background and motivation for this study.
Traditional gauge fields govern interaction between point particles. Such a gauge
field is, mathematically, a connection $A$ on a bundle over spacetime, with the
structure group of the bundle being the relevant internal symmetry group of the particle
species. The amplitude of the interaction, along some path $\gamma$ connecting the point
particles, is often obtained from the particle wave functions coupled together using
quantities involving the path-ordered exponential integral $\mathcal{P}\exp(-\int_\gamma A)$, which is
the same as the parallel-transport along the path $\gamma$ by the connection $A$. If we now
change our point of view concerning particles, and assume that they are extended
string-like entities, then each particle should be viewed not as a point entity but
rather a path (segment) in spacetime. Thus, instead of the two particles located
at two points, we now have two paths $\gamma_1$ and $\gamma_2$; in place of a path connecting
the two point particles we now have a parametrized path of paths, in other words
a surface $\Gamma$, connecting $\gamma_1$ with $\gamma_2$. The interaction amplitudes would, one may
expect, involve both the gauge field $A$, as expressed through the parallel transports
along $\gamma_1$ and $\gamma_2$, and an interaction between these two parallel transport fields. This
higher order, or higher dimensional interaction, could be described by means of a
gauge field at the higher level: it would be a gauge field over the space of paths in
spacetime.

Fig. 1. Point particles interacting via a gauge field.

1.2. Comparison with other works


The approach to higher gauge theory developed and explored by Baez [1], Baez
and Schreiber [2, 3], and Lahiri [13], and others cited in these papers, involves an
abstract category theoretic framework of 2-connections and 2-bundles, which are
higher-dimensional analogs of bundles and connections. There is also the framework
of gerbes [6, 4, 14].
We develop both a differential geometric framework and category-theoretic
structures. We prove in Theorem 2.1 that a requirement of parametrization invariance
imposes a constraint on a quantity called the "fake curvature" which has
been observed in a related but more abstract context by Baez and Schreiber [2,
Theorem 23]. Our differential geometric approach is close to the works of Cattaneo
et al. [5], Pfeiffer [15], and Girelli and Pfeiffer [11]. However, we develop, in addition
to the differential geometric aspects, the integrated version in terms of categories
of diagrams, an aspect not addressed in [5]; also, it should be noted that our
connection form is different from the one used in [5]. To link up with the integrated
theory it is essential to explore the effect of the $LH$-valued field $B$. To this end we
determine a "bi-holonomy" associated to a path of paths (Theorem 2.2) in terms
of the field $B$; this aspect of the theory is not studied in [5] or other works.
Our approach has the following special features:

• we develop the theory with two connections $A$ and $\bar A$ as well as a 2-form $B$ (with
  the connection $\bar A$ used for parallel-transport along any given string-like object,
  and the forms $A$ and $B$ used to construct parallel-transports between different
  strings);
• we determine, in Theorem 2.2, the "bi-holonomy" associated to a path of paths
  using the $B$-field;
• we allow "quadrilaterals" rather than simply bigons in the category theoretic
  formulation, corresponding to having strings with endpoints free to move rather
  than fixed-endpoint strings.

Our category theoretic considerations are related to notions about double categories
introduced by Ehresmann [9, 10] and explored further by Kelly and Street [12].
Fig. 2. Gauge fields along paths $c_1$ and $c_2$ interacting across a surface.

2. Connections on Path-Space Bundles


In this section we will construct connections and parallel-transport for a pair of
intertwined structures: path-space bundles with structure groups $G$ and $H$, which
are Lie groups intertwined as explained below in (2.1). For the physical motivation,
it should be kept in mind that $G$ denotes the gauge group for the gauge field along
each path, or string, while $H$ governs, along with $G$, the interaction between the
gauge fields along different paths.
An important distinction between existing differential geometric approaches
(such as Cattaneo et al. [5]) and the integrated theory encoded in the category-
theoretic framework is that the latter necessarily involves two gauge groups: a group
$G$ for parallel transport along paths, and another group $H$ for parallel transport
between paths (in path space). We shall develop the differential geometric framework
using a pair of groups $(G, H)$ so as to be consistent with the integrated
theory. Along with the groups $G$ and $H$, we use a fixed smooth homomorphism
$\tau : H \to G$ and a smooth map
\[ G \times H \to H : (g, h) \mapsto \alpha(g)h \]
such that each $\alpha(g)$ is an automorphism of $H$, and such that the identities
\[ \tau(\alpha(g)h) = g\,\tau(h)\,g^{-1}, \qquad \alpha(\tau(h))h' = h h' h^{-1}, \tag{2.1} \]
hold for all $g \in G$ and $h, h' \in H$. The derivatives $\tau'(e)$ and $\alpha'(e)$ will be denoted
simply as $\tau : LH \to LG$ and $\alpha : LG \to LH$. (This structure is called a Lie 2-group
in [1, 2].)
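
The two identities in (2.1) can be checked concretely in a simple case. The sketch below is only an illustration under stated assumptions: it takes both groups to be invertible 2x2 rational matrices, the homomorphism `tau` to be the identity map, and `alpha(g, h)` to be conjugation; this particular choice is not taken from the paper, it is merely one standard example satisfying the two identities.

```python
from fractions import Fraction


# 2x2 matrices stored as ((a, b), (c, d)) with exact rational entries.
def mat(a, b, c, d):
    return ((Fraction(a), Fraction(b)), (Fraction(c), Fraction(d)))


def mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inv(x):
    (a, b), (c, d) = x
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return ((d / det, -b / det), (-c / det, a / det))


# Illustrative assumption: G = H = GL(2, Q), tau = identity, alpha = conjugation.
def tau(h):
    return h


def alpha(g, h):
    return mul(mul(g, h), inv(g))


def check_identities(g, h, h2):
    # First identity: tau(alpha(g) h) = g tau(h) g^{-1}
    first = tau(alpha(g, h)) == mul(mul(g, tau(h)), inv(g))
    # Second identity: alpha(tau(h)) h' = h h' h^{-1}
    second = alpha(tau(h), h2) == mul(mul(h, h2), inv(h))
    return first and second


g = mat(1, 2, 3, 5)
h = mat(2, 1, 1, 1)
h2 = mat(0, 1, -1, 4)
identities_hold = check_identities(g, h, h2)
```

Exact rational arithmetic is used so that the equalities are tested without rounding error; each `alpha(g, .)` is also multiplicative, as an automorphism must be.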
To summarize very rapidly, anticipating some of the notions explained below,
we work with a principal $G$-bundle $\pi : P \to M$ over a manifold $M$, equipped with
connections $A$ and $\bar A$, and an $\alpha$-equivariant vertical 2-form $B$ on $P$ with values in
the Lie algebra $LH$. We then consider the space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths in $P$,
which forms a principal $G$-bundle over the path-space $\mathcal{P}M$ in $M$. Then there is
an associated vector bundle $E$ over $\mathcal{P}M$ with fiber $LH$; using the 2-form $B$ and
the connection form $\bar A$ we construct, for any section $\sigma$ of the bundle $P \to M$, an
$LH$-valued 1-form $\theta^\sigma$ on $\mathcal{P}M$. This being a connection over the path-space in $M$
with structure group $H$, parallel-transport by this connection associates elements
of $H$ to parametrized surfaces in $M$. Most of our work is devoted to studying a
second connection form $\omega_{(A,B)}$, which is a connection on the bundle $\mathcal{P}_{\bar A}P$ which we
construct using a second connection $A$ on $P$. Parallel-transport by $\omega_{(A,B)}$ is related
to parallel-transport by the $LH$-valued connection form $\theta^\sigma$.

2.1. Principal bundle and the connection $\bar A$

Consider a principal $G$-bundle
\[ \pi : P \to M \]
with the right-action of the Lie group $G$ on $P$ denoted
\[ P \times G \to P : (p, g) \mapsto pg = R_g p. \]
Let $\bar A$ be a connection on this bundle. The space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths in $P$
may be viewed as a principal $G$-bundle over $\mathcal{P}M$, the space of smooth paths in $M$.
We will use the notation $pK \in T_pP$, for any point $p \in P$ and Lie-algebra element
$K \in LG$, defined by
\[ pK = \frac{d}{dt}\Big|_{t=0} p\exp(tK). \]
It will be convenient to keep in mind that we always use $t$ to denote the parameter
for a path on the base manifold $M$ or in the bundle space $P$; we use the letter $s$ to
parametrize a path in path-space.

2.2. The tangent space to $\mathcal{P}_{\bar A}P$

The points of the space $\mathcal{P}_{\bar A}P$ are $\bar A$-horizontal paths in $P$. Although we call $\mathcal{P}_{\bar A}P$
a "space" we do not discuss any topology or manifold structure on it. However, it
is useful to introduce certain differential geometric notions such as tangent spaces
on $\mathcal{P}_{\bar A}P$. It is intuitively clear that a tangent vector at a "point" $\tilde\gamma \in \mathcal{P}_{\bar A}P$ ought
to be a vector field on the path $\tilde\gamma$. We formalize this idea here (as has been done
elsewhere as well, such as in Cattaneo et al. [5]).
If $\mathcal{P}X$ is a space of paths on a manifold $X$, we denote by $\mathrm{ev}_t$ the evaluation
map
\[ \mathrm{ev}_t : \mathcal{P}X \to X : \gamma \mapsto \mathrm{ev}_t(\gamma) = \gamma(t). \tag{2.2} \]
Our first step is to understand the tangent spaces to the bundle $\mathcal{P}_{\bar A}P$. The
following result is preparation for the definition (see also [5, Theorem 2.1]).
Proposition 2.1. Let $\bar A$ be a connection on a principal $G$-bundle $\pi : P \to M$, and
\[ \tilde\Gamma : [0, 1] \times [0, 1] \to P : (t, s) \mapsto \tilde\Gamma(t, s) = \tilde\Gamma_s(t) \]
a smooth map, and
\[ \tilde v_s(t) = \partial_s\tilde\Gamma(t, s). \]
Then the following are equivalent:
(i) Each transverse path
\[ \tilde\Gamma_s : [0, 1] \to P : t \mapsto \tilde\Gamma(t, s) \]
is $\bar A$-horizontal.
(ii) The initial path $\tilde\Gamma_0$ is $\bar A$-horizontal, and the tangency condition
\[ \frac{\partial \bar A(\tilde v_s(t))}{\partial t} = F^{\bar A}(\partial_t\tilde\Gamma(t, s), \tilde v_s(t)) \tag{2.3} \]
holds, and thus also
\[ \bar A(\tilde v_s(T)) - \bar A(\tilde v_s(0)) = \int_0^T F^{\bar A}(\partial_t\tilde\Gamma(t, s), \tilde v_s(t))\,dt, \tag{2.4} \]
for every $T, s \in [0, 1]$.
Equation (2.3), and variations on it, is sometimes referred to as the Duhamel
formula and sometimes a non-abelian Stokes formula. We can write it more compactly
by using the notion of a Chen integral. With suitable regularity assumptions,
a 2-form $\Theta$ on a space $X$ yields a 1-form, denoted $\int\Theta$, on the space $\mathcal{P}X$ of smooth
paths in $X$; if $c$ is such a path, a tangent vector $v \in T_c(\mathcal{P}X)$ is a vector field
$t \mapsto v(t)$ along $c$, and the evaluation of the 1-form $\int\Theta$ on $v$ is defined to be
\[ \Big(\int\Theta\Big)_c v = \Big(\int_c\Theta\Big)(v) = \int_0^1 \Theta(c'(t), v(t))\,dt. \tag{2.5} \]
The 1-form $\int\Theta$, or its localization to the tangent space $T_c(\mathcal{P}X)$, is called the Chen
integral of $\Theta$. Returning to our context, we then have
\[ \mathrm{ev}_T^*\bar A - \mathrm{ev}_0^*\bar A = \int_0^T F^{\bar A}, \tag{2.6} \]
where the integral on the right is a Chen integral; here it is, by definition, the 1-form
on $\mathcal{P}_{\bar A}P$ whose value on a vector $\tilde v_s \in T_{\tilde\Gamma_s}\mathcal{P}_{\bar A}P$ is given by the right-hand side of
(2.4). The pullback $\mathrm{ev}_t^*\bar A$ has the obvious meaning.
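
To make the definition (2.5) concrete, the following minimal sketch approximates a Chen integral numerically. Everything specific in it is an assumption chosen for illustration: the space is the plane, the 2-form is the area form $dx \wedge dy$, the path is a quarter circle, and the tangent vector is the radial field along that path; the helper names are hypothetical.

```python
import math


def chen_integral(two_form, path, field, steps=2000):
    """Midpoint-rule approximation of int_0^1 two_form(c'(t), v(t)) dt."""
    dt = 1.0 / steps
    eps = 1e-6
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        # Central finite difference for the velocity c'(t).
        ahead, behind = path(t + eps), path(t - eps)
        velocity = tuple((a - b) / (2.0 * eps) for a, b in zip(ahead, behind))
        total += two_form(path(t), velocity, field(t)) * dt
    return total


def area_form(point, u, w):
    # The 2-form dx ^ dy on the plane, evaluated on the pair (u, w).
    return u[0] * w[1] - u[1] * w[0]


def quarter_circle(t):
    angle = 0.5 * math.pi * t
    return (math.cos(angle), math.sin(angle))


def radial_field(t):
    # Vector field along the path, pointing radially outward.
    return quarter_circle(t)


value = chen_integral(area_form, quarter_circle, radial_field)
```

For this choice the integrand is constant and equal to $-\pi/2$, so the value of the 1-form on the radial field is $-\pi/2$; the sign reflects the order of the two arguments in (2.5).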

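Proof. From the definition of the curvature form $F^{\bar A}$, we have
\[ F^{\bar A}(\partial_t\tilde\Gamma, \partial_s\tilde\Gamma) = \partial_t(\bar A(\partial_s\tilde\Gamma)) - \partial_s(\bar A(\partial_t\tilde\Gamma)) - \bar A(\underbrace{[\partial_t\tilde\Gamma, \partial_s\tilde\Gamma]}_{0}) + [\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)]. \]
So
\[ \partial_t(\bar A(\partial_s\tilde\Gamma)) - F^{\bar A}(\partial_t\tilde\Gamma, \partial_s\tilde\Gamma) = \partial_s(\bar A(\partial_t\tilde\Gamma)) - [\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)] = 0 \quad \text{if } \bar A(\partial_t\tilde\Gamma) = 0, \tag{2.7} \]
thus proving (2.3) if (i) holds. Equation (2.4) then follows by integration.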
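Next suppose (ii) holds. Then, from the first line in (2.7), we have
\[ \partial_s(\bar A(\partial_t\tilde\Gamma)) - [\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)] = 0. \tag{2.8} \]
Now let $s \mapsto h(s) \in G$ describe parallel-transport along $s \mapsto \tilde\Gamma(t, s)$; then
\[ h'(s)h(s)^{-1} = -\bar A(\partial_s\tilde\Gamma(t, s)), \quad \text{and } h(0) = e. \]
Then
\[ \partial_s\big(h(s)^{-1}\bar A(\partial_t\tilde\Gamma(t, s))h(s)\big) = \mathrm{Ad}(h(s)^{-1})\big[\partial_s(\bar A(\partial_t\tilde\Gamma)) - [\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)]\big] \tag{2.9} \]
and the right-hand side here is 0, as seen in (2.8). Therefore,
\[ h(s)^{-1}\bar A(\partial_t\tilde\Gamma(t, s))h(s) \]
is independent of $s$, and hence is equal to its value at $s = 0$. Thus, if $\bar A$ vanishes
on $\partial_t\tilde\Gamma(t, 0)$ then it also vanishes on $\partial_t\tilde\Gamma(t, s)$ for all $s \in [0, 1]$. In conclusion, if the
initial path $\tilde\Gamma_0$ is $\bar A$-horizontal, and the tangency condition (2.3) holds, then each
transverse path $\tilde\Gamma_s$ is $\bar A$-horizontal.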
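
The group element $h(s)$ used in the proof solves a first-order linear equation, which can be approximated by an ordered product of short-step exponentials. The sketch below is only illustrative and rests on stated assumptions: the sign convention $h'(s) = -a(s)h(s)$ with $h(0) = e$, a hypothetical connection component $a(s) = 2sJ$ valued in the rotation algebra (with $J$ the generator of plane rotations), and hand-written 2x2 matrix helpers.

```python
import math


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expm(x, terms=25):
    """Exponential of a small 2x2 matrix via its power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = matmul(term, x)
        term = [[entry / n for entry in row] for row in term]  # x^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result


def transport(connection, steps=200):
    """Ordered product approximating h'(s) = -a(s) h(s), h(0) = identity."""
    h = [[1.0, 0.0], [0.0, 1.0]]
    ds = 1.0 / steps
    for k in range(steps):
        s = (k + 0.5) * ds
        a = connection(s)
        step = expm([[-ds * entry for entry in row] for row in a])
        h = matmul(step, h)  # the newest factor multiplies on the left
    return h


J = [[0.0, -1.0], [1.0, 0.0]]


def toy_connection(s):
    # Hypothetical component a(s) = 2 s J; all values commute with each other.
    return [[2.0 * s * entry for entry in row] for row in J]


h_final = transport(toy_connection)
```

Because the values of this toy component commute, the ordered product reduces to the exponential of minus the integral of $a$, that is, a rotation by $-1$ radian; in the non-abelian case the ordering of the factors matters and no such closed form is available in general.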

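In view of the preceding result, it is natural to define the tangent spaces to $\mathcal{P}_{\bar A}P$
as follows:

Definition 2.1. The tangent space to $\mathcal{P}_{\bar A}P$ at $\tilde\gamma$ is the linear space of all vector
fields $t \mapsto \tilde v(t) \in T_{\tilde\gamma(t)}P$ along $\tilde\gamma$ for which
\[ \frac{\partial \bar A(\tilde v(t))}{\partial t} - F^{\bar A}(\tilde\gamma'(t), \tilde v(t)) = 0 \tag{2.10} \]
holds for all $t \in [0, 1]$.

The vertical subspace in $T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ consists of all vectors $\tilde v(\cdot)$ for which $\tilde v(t)$ is
vertical in $T_{\tilde\gamma(t)}P$ for every $t \in [0, 1]$.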
Let us note one consequence:

Lemma 2.1. Suppose $\gamma : [0, 1] \to M$ is a smooth path, and $\tilde\gamma$ an $\bar A$-horizontal lift.
Let $v : [0, 1] \to TM$ be a vector field along $\gamma$, and $\tilde v(0)$ any vector in $T_{\tilde\gamma(0)}P$ with
$\pi_*\tilde v(0) = v(0)$. Then there is a unique vector field $\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ whose projection
down to $M$ is the vector field $v$, and whose initial value is $\tilde v(0)$.

Proof. The first-order differential equation (2.10) determines the vertical part of
$\tilde v(t)$, from the initial value. Thus $\tilde v(t)$ is this vertical part plus the $\bar A$-horizontal lift
of $v(t)$ to $T_{\tilde\gamma(t)}P$.
2.3. Connections induced from $B$

All through our work, $B$ will denote a vertical $\alpha$-equivariant 2-form on $P$ with
values in $LH$. In more detail, this means that $B$ is an $LH$-valued 2-form on $P$
which is vertical in the sense that
\[ B(u, v) = 0 \quad \text{if $u$ or $v$ is vertical}, \]
and $\alpha$-equivariant in the sense that
\[ R_g^*B = \alpha(g^{-1})B \quad \text{for all } g \in G \]
wherein $R_g : P \to P : p \mapsto pg$ is the right action of $G$ on the principal bundle space
$P$, and
\[ \alpha(g^{-1})B = d\alpha(g^{-1})|_e B, \]
recalling that $\alpha(g^{-1})$ is an automorphism $H \to H$.



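Consider an $\bar A$-horizontal $\tilde\gamma \in \mathcal{P}_{\bar A}P$, and a smooth vector field $X$ along $\gamma = \pi \circ \tilde\gamma$;
take any lift $\tilde X$ of $X$ along $\tilde\gamma$, and set
\[ \theta_{\tilde\gamma}(X) \overset{\text{def}}{=} \Big(\int_{\tilde\gamma} B\Big)(\tilde X) = \int_0^1 B(\tilde\gamma'(u), \tilde X(u))\,du. \tag{2.11} \]
This is independent of the choice of $\tilde X$ (as any two choices differ by a vertical
vector on which $B$ vanishes) and specifies a linear form $\theta_{\tilde\gamma}$ on $T_\gamma(\mathcal{P}M)$ with values
in $LH$. If we choose a different horizontal lift of $\gamma$, a path $\tilde\gamma g$, with $g \in G$, then
\[ \theta_{\tilde\gamma g}(X) = \alpha(g^{-1})\theta_{\tilde\gamma}(X). \tag{2.12} \]
Thus, one may view $\theta$ to be a 1-form on $\mathcal{P}M$ with values in the vector bundle
$E \to \mathcal{P}M$ associated to $\mathcal{P}_{\bar A}P \to \mathcal{P}M$ by the action $\alpha$ of $G$ on $LH$.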
Now x a section : M P , and for any path PM let () PA P be the
: PM PA P is a section of
A-horizontal lift with initial point ((0)). Thus,
the bundle PA P PM . Then we have the 1-form on PM with values in LH
given as follows: for any X T (PM ),

( )(X) = () (X). (2.13)

We shall view as a connection form for the trivial H-bundle over PM . Of course,
it depends on the section of PA P PM , but in a controlled manner, i.e. the
behavior of under change of is obtained using (2.12).

2.4. Constructing the connection $\omega_{(A,B)}$

Our next objective is to construct connection forms on $\mathcal{P}_{\bar A}P$. To this end, fix a connection $A$ on $P$, in addition to the connection $\bar A$ and the $\alpha$-equivariant vertical $LH$-valued 2-form $B$ on $P$.

The evaluation map at any time $t \in [0, 1]$, given by
\[ \mathrm{ev}_t : \mathcal{P}_{\bar A}P \to P : \tilde\gamma \mapsto \tilde\gamma(t), \]
commutes with the projections $\mathcal{P}_{\bar A}P \to \mathcal{P}M$ and $P \to M$, and the evaluation map $\mathcal{P}M \to M$. We can pull back any connection $A$ on the bundle $P$ to a connection $\mathrm{ev}_t^*A$ on $\mathcal{P}_{\bar A}P$.
Given a 2-form $B$ as discussed above, consider the $LH$-valued 1-form $Z$ on $\mathcal{P}_{\bar A}P$ specified as follows. Its value on a vector $\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ is defined to be
\[ Z(\tilde v) = \int_0^1 B(\tilde\gamma'(t), \tilde v(t))\,dt. \tag{2.14} \]
Thus
\[ Z = \int_0^1 B, \tag{2.15} \]
where on the right we have the Chen integral (discussed earlier in (2.5)) of the 2-form $B$ on $P$, lifting it to an $LH$-valued 1-form on the space of ($\bar A$-horizontal) smooth paths $[0, 1] \to P$. The Chen integral here is, by definition, the 1-form on $\mathcal{P}_{\bar A}P$ given by
\[ \tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P \mapsto \int_0^1 B(\tilde\gamma'(t), \tilde v(t))\,dt. \]
Note that $Z$ and the form $\theta$ are closely related:
\[ Z(\tilde v) = \theta_{\tilde\gamma}(\pi_*\tilde v). \tag{2.16} \]
Now define the 1-form $\omega_{(A,B)}$ by
\[ \omega_{(A,B)} = \mathrm{ev}_1^*A + \tau(Z). \tag{2.17} \]

Recall that $\tau : H \to G$ is a homomorphism, and, for any $X \in LH$, we are writing $\tau(X)$ to mean $\tau'(e)X$; here $\tau'(e) : LH \to LG$ is the derivative of $\tau$ at the identity. The utility of bringing in $\tau$ becomes clear only when connecting these developments to the category theoretic formulation of Sec. 3. A similar construction, but using only one algebra $LG$, is described by Cattaneo et al. ([5]). However, as we pointed out earlier, a parallel transport operator for a surface cannot be constructed using a single group unless the group is abelian. To allow non-abelian groups, we need to have two groups intertwined in the structure described in (2.1), and thus we need $\tau$.
Note that $\omega_{(A,B)}$ is simply the connection $\mathrm{ev}_1^*A$ on the bundle $\mathcal{P}_{\bar A}P$, shifted by the 1-form $\tau(Z)$. In the finite-dimensional setting it is a standard fact that such a shift, by an equivariant form which vanishes on verticals, produces another connection; however, given that our setting is, technically, not identical to the finite-dimensional one, we shall prove this below in Proposition 2.2.
Thus,
\[ \omega_{(A,B)}(\tilde v) = A(\tilde v(1)) + \tau\int_0^1 B(\tilde\gamma'(t), \tilde v(t))\,dt. \tag{2.18} \]
We can rewrite this as
\[ \omega_{(A,B)} = \mathrm{ev}_0^*A + [\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)] + \int_0^1 (F^{\bar A} + \tau B). \tag{2.19} \]
To obtain this we have simply used the relation (2.4). The advantage in (2.19) is that it separates off the endpoint terms and expresses $\omega_{(A,B)}$ as a perturbation of the simple connection $\mathrm{ev}_0^*A$ by a vector in the tangent space $T_{\mathrm{ev}_0^*A}\mathcal{A}$, where $\mathcal{A}$ is the space of connections on the bundle $\mathcal{P}_{\bar A}P$. Here note that the tangent vectors to the affine space $\mathcal{A}$ at a connection $\omega$ are the 1-forms $\omega_1 - \omega$, with $\omega_1$ running over $\mathcal{A}$. A difference such as $\omega_1 - \omega$ is precisely an equivariant $LG$-valued 1-form which vanishes on vertical vectors.
Recall that the group $G$ acts on $P$ on the right
\[ P \times G \to P : (p, g) \mapsto R_g p = pg \]
and this induces a natural right action of $G$ on $\mathcal{P}_{\bar A}P$:
\[ \mathcal{P}_{\bar A}P \times G \to \mathcal{P}_{\bar A}P : (\tilde\gamma, g) \mapsto R_g\tilde\gamma = \tilde\gamma g. \]
Then for any vector $X$ in the Lie algebra $LG$, we have a vertical vector
\[ \tilde X(\tilde\gamma) \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P \]
given by
\[ \tilde X(\tilde\gamma)(t) = \frac{d}{du}\Big|_{u=0} \tilde\gamma(t)\exp(uX). \]

Proposition 2.2. The form $\omega_{(A,B)}$ is a connection form on the principal $G$-bundle $\mathcal{P}_{\bar A}P \to \mathcal{P}M$. More precisely,
\[ \omega_{(A,B)}((R_g)_* v) = \mathrm{Ad}(g^{-1})\,\omega_{(A,B)}(v) \]
for every $g \in G$, $v \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ and
\[ \omega_{(A,B)}(\tilde X) = X \]
for every $X \in LG$.

Proof. It will suffice to show that
\[ Z((R_g)_* v) = \mathrm{Ad}(g^{-1}) Z(v) \]
for every $g \in G$ and every vector $v$ tangent to $\mathcal{P}_{\bar A}P$, and
\[ Z(\tilde X) = 0 \]
for every $X \in LG$.
From (2.15) and the fact that $B$ vanishes on verticals it is clear that $Z(\tilde X)$ is 0. The equivariance under the $G$-action follows also from (2.15), on using the $G$-equivariance of the connection form $A$ and of the 2-form $B$, and the fact that the right action of $G$ carries $\bar A$-horizontal paths into $\bar A$-horizontal paths.

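2.5. Parallel transport by $\omega_{(A,B)}$

Let us examine how a path is parallel-transported by $\omega_{(A,B)}$. At the infinitesimal level, all we need is to be able to lift a given vector field $v : [0, 1] \to TM$, along $\gamma \in \mathcal{P}M$, to a vector field $\tilde v$ along $\tilde\gamma$ such that:

(i) $\tilde v$ is a vector in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$, which means that it satisfies Eq. (2.10):
\[ \frac{\partial \bar A(\tilde v(t))}{\partial t} = F^{\bar A}(\tilde\gamma'(t), \tilde v(t)); \tag{2.20} \]
(ii) $\tilde v$ is $\omega_{(A,B)}$-horizontal, i.e. satisfies the equation
\[ A(\tilde v(1)) + \tau\int_0^1 B(\tilde\gamma'(t), \tilde v(t))\,dt = 0. \tag{2.21} \]

The following result gives a constructive description of $\tilde v$.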

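Proposition 2.3. Assume that $A$, $\bar A$, $B$, and $\omega_{(A,B)}$ are as specified before. Let $\tilde\gamma \in \mathcal{P}_{\bar A}P$, and $\gamma = \pi \circ \tilde\gamma \in \mathcal{P}M$ its projection to a path on $M$, and consider any $v \in T_\gamma\mathcal{P}M$. Then the $\omega_{(A,B)}$-horizontal lift $\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ is given by
\[ \tilde v(t) = \tilde v^h_{\bar A}(t) + \tilde v^v(t), \]
where $\tilde v^h_{\bar A}(t) \in T_{\tilde\gamma(t)}P$ is the $\bar A$-horizontal lift of $v(t) \in T_{\gamma(t)}M$, and
\[ \tilde v^v(t) = \tilde\gamma(t)\Big[\bar A(\tilde v(1)) - \int_t^1 F^{\bar A}(\tilde\gamma'(u), \tilde v^h_{\bar A}(u))\,du\Big] \tag{2.22} \]
wherein
\[ \tilde v(1) = \tilde v^h_A(1) + \tilde\gamma(1)X, \tag{2.23} \]
with $\tilde v^h_A(1)$ being the $A$-horizontal lift of $v(1)$ in $T_{\tilde\gamma(1)}P$, and
\[ X = -\tau\int_0^1 B(\tilde\gamma'(t), \tilde v^h_{\bar A}(t))\,dt. \tag{2.24} \]

Note that $X$ in (2.24) is $A(\tilde v(1))$.
Note also that since $\tilde v$ is tangent to $\mathcal{P}_{\bar A}P$, the vector $\tilde v^v(t)$ is also given by
\[ \tilde v^v(t) = \tilde\gamma(t)\Big[\bar A(\tilde v(0)) + \int_0^t F^{\bar A}(\tilde\gamma'(u), \tilde v^h_{\bar A}(u))\,du\Big]. \tag{2.25} \]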
0
Proof. The $\omega_{(A,B)}$-horizontal lift $\tilde v$ of $v$ in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ is the vector field $\tilde v$ along $\tilde\gamma$ which projects by $\pi_*$ to $v$ and satisfies the condition (2.21):
\[ A(\tilde v(1)) + \tau\int_0^1 B(\tilde\gamma'(t), \tilde v(t))\,dt = 0. \tag{2.26} \]
Now for each $t \in [0, 1]$, we can split the vector $\tilde v(t)$ into an $\bar A$-horizontal part and a vertical part $\tilde v^v(t)$ which is essentially the element $\bar A(\tilde v(t)) \in LG$ viewed as a vector in the vertical subspace in $T_{\tilde\gamma(t)}P$:
\[ \tilde v(t) = \tilde v^h_{\bar A}(t) + \tilde v^v(t) \]
and the vertical part here is given by
\[ \tilde v^v(t) = \tilde\gamma(t)\,\bar A(\tilde v(t)). \]
Since the vector field $\tilde v$ is actually a vector in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$, we have, from (2.20), the relation
\[ \bar A(\tilde v(t)) = \bar A(\tilde v(1)) - \int_t^1 F^{\bar A}(\tilde\gamma'(u), \tilde v^h_{\bar A}(u))\,du. \]
We need now only verify the expression (2.23) for $\tilde v(1)$. To this end, we first split this into $A$-horizontal and a corresponding vertical part:
\[ \tilde v(1) = \tilde v^h_A(1) + \tilde\gamma(1)\,A(\tilde v(1)). \]
The vector $A(\tilde v(1))$ is obtained from (2.26), and this proves (2.23).

There is an observation to be made from Proposition 2.3. Equation (2.24) has, on the right-hand side, the integral over the entire curve $\tilde\gamma$. Thus, if we were to consider parallel-transport of only, say, the left half of $\tilde\gamma$, we would, in general, end up with a different path of paths!

2.6. Reparametrization invariance

If a path is reparametrized, then, technically, it is a different point in path space. Does parallel-transport along a path of paths depend on the specific parametrization of the paths? We shall obtain conditions to ensure that there is no such dependence. Moreover, in this case, we shall also show that parallel transport by $\omega_{(A,B)}$ along a path of paths depends essentially on the surface swept out by this path of paths, rather than the specific parametrization of this surface.
For the following result, recall that we are working with Lie groups $G$, $H$, smooth homomorphism $\tau : H \to G$, smooth map $\alpha : G \times H \to H : (g, h) \mapsto \alpha(g)h$, where each $\alpha(g)$ is an automorphism of $H$, and the maps $\tau$ and $\alpha$ satisfy (2.1). Let $\pi : P \to M$ be a principal $G$-bundle, with connections $A$ and $\bar A$, and $B$ an $LH$-valued $\alpha$-equivariant 2-form on $P$ vanishing on vertical vectors. As before, on the space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths, viewed as a principal $G$-bundle over the space $\mathcal{P}M$ of smooth paths in $M$, there is the connection form $\omega_{(A,B)}$ given by
\[ \omega_{(A,B)} = \mathrm{ev}_1^*A + \tau\Big(\int_0^1 B\Big). \]
By a smooth path $s \mapsto \Gamma_s$ in $\mathcal{P}M$, we mean a smooth map
\[ [0, 1]^2 \to M : (t, s) \mapsto \Gamma(t, s) = \Gamma_s(t), \]
viewed as a path of paths $\Gamma_s \in \mathcal{P}M$.
With this notation and framework, we have:

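Theorem 2.1. Let
\[ \Phi : [0, 1]^2 \to [0, 1]^2 : (t, s) \mapsto (\Phi_s(t), \Phi^t(s)) \]
be a smooth diffeomorphism which fixes each vertex of $[0, 1]^2$. Assume that
(i) either
\[ F^{\bar A} + \tau(B) = 0 \tag{2.27} \]
and $\Phi$ carries each $s$-fixed section $[0, 1] \times \{s\}$ into an $s$-fixed section $[0, 1] \times \{\Phi^0(s)\}$;
(ii) or
\[ [\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)] + \int_0^1 (F^{\bar A} + \tau B) = 0, \tag{2.28} \]
$\Phi$ maps each boundary edge of $[0, 1]^2$ into itself, and $\Phi^0(s) = \Phi^1(s)$ for all $s \in [0, 1]$.
Then the $\omega_{(A,B)}$-parallel-translate of the point $\tilde\Gamma_0 \circ \Phi_0$ along the path $s \mapsto (\Gamma \circ \Phi)_s$, is $\tilde\Gamma_1 \circ \Phi_1$, where $\tilde\Gamma_1$ is the $\omega_{(A,B)}$-parallel-translate of $\tilde\Gamma_0$ along $s \mapsto \Gamma_s$.
As a special case, if the path $s \mapsto \Gamma_s$ is constant and $\Phi_0$ the identity map on $[0, 1]$, so that $\Gamma_1$ is simply a reparametrization of $\Gamma_0$, then, under conditions (i) or (ii) above, the $\omega_{(A,B)}$-parallel-translate of the point $\tilde\Gamma_0$ along the path $s \mapsto (\Gamma \circ \Phi)_s$, is $\tilde\Gamma_0 \circ \Phi_1$, i.e. the appropriate reparametrization of the original path $\tilde\Gamma_0$.

Note that the path $(\tilde\Gamma \circ \Phi)_0$ projects down to $(\Gamma \circ \Phi)_0$, which, by the boundary behavior of $\Phi$, is actually that path $\Gamma_0 \circ \Phi_0$, in other words $\Gamma_0$ reparametrized. Similarly, $(\tilde\Gamma \circ \Phi)_1$ is an $\bar A$-horizontal lift of the path $\Gamma_1$, reparametrized by $\Phi_1$.
If $A = \bar A$ then conditions (2.28) and (2.27) are the same, and so in this case the weaker condition on $\Phi$ in (ii) suffices.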

Proof. Suppose (2.27) holds. Then the connection $\omega_{(A,B)}$ has the form
\[ \mathrm{ev}_0^*A + [\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)]. \]
The crucial point is that this depends only on the endpoints, i.e. if $\tilde\gamma \in \mathcal{P}_{\bar A}P$ and $\tilde V \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ then $\omega_{(A,B)}(\tilde V)$ depends only on $\tilde V(0)$ and $\tilde V(1)$. If the conditions on $\Phi$ in (i) hold then reparametrization has the effect of replacing each $\tilde\Gamma_s$ with $\tilde\Gamma_{\Phi^0(s)} \circ \Phi_s$, which is in $\mathcal{P}_{\bar A}P$, and the vector field $t \mapsto \partial_s(\tilde\Gamma_{\Phi^0(s)} \circ \Phi_s(t))$ is an $\omega_{(A,B)}$-horizontal vector, because its endpoint values are those of $t \mapsto \partial_s(\tilde\Gamma_{\Phi^0(s)}(t))$, since $\Phi_s(t)$ equals $t$ if $t$ is 0 or 1.
Now suppose (2.28) holds. Then $\omega_{(A,B)}$ becomes simply $\mathrm{ev}_0^*A$. In this case $\omega_{(A,B)}(\tilde V)$ depends on $\tilde V$ only through the initial value $\tilde V(0)$. Thus, the $\omega_{(A,B)}$-parallel-transport of $\tilde\gamma \in \mathcal{P}_{\bar A}P$, along a path $s \mapsto \Gamma_s \in \mathcal{P}M$, is obtained by $A$-parallel-transporting the initial point $\tilde\gamma(0)$ along the path $s \mapsto \Gamma^0(s)$, and shooting off $\bar A$-horizontal paths lying above the paths $\Gamma_s$. (Since the paths $(\Gamma \circ \Phi)_s$ do not necessarily have the second component fixed, their horizontal lifts need not be of the form $\tilde\Gamma_s \circ \Phi_s$, except at $s = 0$ and $s = 1$, when the composition $\tilde\Gamma_s \circ \Phi_s$ is guaranteed to be meaningful.) From this it is clear that parallel translating $\tilde\Gamma_0 \circ \Phi_0$, by $\omega_{(A,B)}$ along the path $s \mapsto (\Gamma \circ \Phi)_s$, results, at $s = 1$, in the path $\tilde\Gamma_1 \circ \Phi_1$.

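2.7. The curvature of $\omega_{(A,B)}$

We can compute the curvature of the connection $\omega_{(A,B)}$. This is, by definition,
\[ \Omega_{(A,B)} = d\omega_{(A,B)} + \tfrac{1}{2}[\omega_{(A,B)} \wedge \omega_{(A,B)}], \]
where the exterior differential $d$ is understood in a natural sense that will become clearer in the proof below. More technically, we are using here notions of calculus on smooth spaces; see, for instance, [16] for a survey, and [17] for another approach.
First we describe some notation about Chen integrals in the present context. If $B$ is a 2-form on $P$, with values in a Lie algebra, then its Chen integral $\int_0^1 B$, restricted to $\mathcal{P}_{\bar A}P$, is a 1-form on $\mathcal{P}_{\bar A}P$ given on the vector $\tilde V \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ by
\[ \Big(\int_0^1 B\Big)(\tilde V) = \int_0^1 B(\tilde\gamma'(t), \tilde V(t))\,dt. \]
If $C$ is also a 2-form on $P$ with values in the same Lie algebra, we have a product 2-form on the path space $\mathcal{P}_{\bar A}P$ given on $\tilde X, \tilde Y \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ by
\begin{align*}
\Big(\int_0^1\Big)^2 [B \wedge C](\tilde X, \tilde Y)
&= \int_{0 \le u < v \le 1} [B(\tilde\gamma'(u), \tilde X(u)), C(\tilde\gamma'(v), \tilde Y(v))]\,du\,dv \\
&\quad - \int_{0 \le u < v \le 1} [C(\tilde\gamma'(u), \tilde X(u)), B(\tilde\gamma'(v), \tilde Y(v))]\,du\,dv \\
&= \int_0^1\!\!\int_0^1 [B(\tilde\gamma'(u), \tilde X(u)), C(\tilde\gamma'(v), \tilde Y(v))]\,du\,dv. \tag{2.29}
\end{align*}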
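Proposition 2.4. The curvature of $\omega_{(A,B)}$ is
\[ \Omega_{(A,B)} = \mathrm{ev}_1^*F^A + d\Big(\int_0^1 \tau B\Big) + \Big[\mathrm{ev}_1^*A \wedge \int_0^1 \tau B\Big] + \Big(\int_0^1\Big)^2 [\tau B \wedge \tau B], \tag{2.30} \]
where the integrals are Chen integrals.

Proof. From
\[ \omega_{(A,B)} = \mathrm{ev}_1^*A + \int_0^1 \tau B, \]
we have
\begin{align*}
\Omega_{(A,B)} &= d\omega_{(A,B)} + \tfrac{1}{2}[\omega_{(A,B)} \wedge \omega_{(A,B)}] \\
&= \mathrm{ev}_1^*dA + d\Big(\int_0^1 \tau B\Big) + W, \tag{2.31}
\end{align*}
where
\begin{align*}
W(\tilde X, \tilde Y) &= [\omega_{(A,B)}(\tilde X), \omega_{(A,B)}(\tilde Y)] \\
&= [\mathrm{ev}_1^*A(\tilde X), \mathrm{ev}_1^*A(\tilde Y)] \\
&\quad + \Big[\mathrm{ev}_1^*A(\tilde X), \tau\int_0^1 B(\tilde\gamma'(t), \tilde Y(t))\,dt\Big] \\
&\quad + \Big[\tau\int_0^1 B(\tilde\gamma'(t), \tilde X(t))\,dt, \mathrm{ev}_1^*A(\tilde Y)\Big] \\
&\quad + \int_0^1\!\!\int_0^1 [\tau B(\tilde\gamma'(u), \tilde X(u)), \tau B(\tilde\gamma'(v), \tilde Y(v))]\,du\,dv \\
&= [\mathrm{ev}_1^*A, \mathrm{ev}_1^*A](\tilde X, \tilde Y) + \Big[\mathrm{ev}_1^*A \wedge \int_0^1 \tau B\Big](\tilde X, \tilde Y) + \Big(\int_0^1\Big)^2 [\tau B \wedge \tau B](\tilde X, \tilde Y). \tag{2.32}
\end{align*}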

In the case $A = \bar A$, and without $\tau$, the expression for the curvature can be expressed in terms of the fake curvature $F^{\bar A} + B$. For a result of this type, for a related connection form, see Cattaneo et al. [5, Theorem 2.6].
A more detailed exploration of the fake curvature would be of interest.
2.8. Parallel-transport of horizontal paths

As before, $A$ and $\bar A$ are connections on a principal $G$-bundle $\pi : P \to M$, and $B$ is an $LH$-valued $\alpha$-equivariant 2-form on $P$ vanishing on vertical vectors. Also $\mathcal{P}X$ is the space of smooth paths $[0, 1] \to X$ in a space $X$, and $\mathcal{P}_{\bar A}P$ is the space of smooth $\bar A$-horizontal paths in $P$.
Our objective now is to express parallel-transport along paths in $\mathcal{P}M$ in terms of a smooth local section of the bundle $P \to M$:
\[ \sigma : U \to P \]
where $U$ is an open set in $M$. We will focus only on paths lying entirely inside $U$.
The section $\sigma$ determines a section $\tilde\sigma$ for the bundle $\mathcal{P}_{\bar A}P \to \mathcal{P}M$: if $\gamma \in \mathcal{P}M$ then $\tilde\sigma(\gamma)$ is the unique $\bar A$-horizontal path in $P$, with initial point $\sigma(\gamma(0))$, which projects down to $\gamma$. Thus,
\[ \tilde\sigma(\gamma)(t) = \sigma(\gamma(t))\,\bar a(t), \tag{2.33} \]
for all $t \in [0, 1]$, where $\bar a(t) \in G$ satisfies the differential equation
\[ \bar a(t)^{-1}\bar a'(t) = -\mathrm{Ad}(\bar a(t)^{-1})\,\bar A((\sigma \circ \gamma)'(t)) \tag{2.34} \]
for $t \in [0, 1]$, and the initial value $\bar a(0)$ is $e$.
Recall that a tangent vector $V \in T_\gamma(\mathcal{P}M)$ is a smooth vector field along the path $\gamma$. Let us denote $\tilde\sigma(\gamma)$ by $\tilde\gamma$:
\[ \tilde\gamma \stackrel{\mathrm{def}}{=} \tilde\sigma(\gamma). \]
Note, for later use, that
\[ \tilde\gamma'(t) = \sigma_*(\gamma'(t))\,\bar a(t) + \underbrace{\tilde\gamma(t)\,\bar a(t)^{-1}\bar a'(t)}_{\text{vertical}}. \tag{2.35} \]
Now define the vector
\[ \tilde V = \tilde\sigma_*(V) \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P) \tag{2.36} \]
to be the vector $\tilde V$ in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ whose initial value $\tilde V(0)$ is
\[ \tilde V(0) = \sigma_*(V(0)). \]
The existence and uniqueness of $\tilde V$ was proved in Lemma 2.1.

Fig. 3. The section $\tilde\sigma$ applied to a path $c$.

Note that $\tilde V(t) \in T_{\tilde\gamma(t)}P$ and $(\sigma_*V)(t) \in T_{\sigma(\gamma(t))}P$ are generally different vectors. However, $(\sigma_*V)(t)\bar a(t)$ and $\tilde V(t)$ are both in $T_{\tilde\gamma(t)}P$ and differ by a vertical vector because they have the same projection $V(t)$ under $\pi_*$:
\[ \tilde V(t) = (\sigma_*V)(t)\,\bar a(t) + \text{vertical vector}. \tag{2.37} \]
Our objective now is to determine the $LG$-valued 1-form
\[ \omega_{(A,\bar A,B)} = \tilde\sigma^*\omega_{(A,B)} \tag{2.38} \]
on $\mathcal{P}M$, defined on any vector $V \in T_\gamma(\mathcal{P}M)$ by
\[ \omega_{(A,\bar A,B)}(V) = \omega_{(A,B)}(\tilde\sigma_*V). \tag{2.39} \]
We can now work out an explicit expression for this 1-form.

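Proposition 2.5. With notation as above, and $V \in T_\gamma(\mathcal{P}M)$,
\[ \omega_{(A,\bar A,B)}(V) = \mathrm{Ad}(\bar a(1)^{-1})A_\sigma(V(1)) + \int_0^1 \mathrm{Ad}(\bar a(t)^{-1})\,\tau B_\sigma(\gamma'(t), V(t))\,dt, \tag{2.40} \]
where $C_\sigma$ denotes the pullback $\sigma^*C$ on $M$ of a form $C$ on $P$, and $\bar a : [0, 1] \to G$ describes parallel-transport along $\gamma$, i.e. satisfies
\[ \bar a(t)^{-1}\bar a'(t) = -\mathrm{Ad}(\bar a(t)^{-1})\,\bar A_\sigma(\gamma'(t)) \]
with initial condition $\bar a(0) = e$. The formula for $\omega_{(A,\bar A,B)}(V)$ can also be expressed as
\begin{align*}
\omega_{(A,\bar A,B)}(V) &= A_\sigma(V(0)) \\
&\quad + [\mathrm{Ad}(\bar a(1)^{-1})(A_\sigma - \bar A_\sigma)(V(1)) - (A_\sigma - \bar A_\sigma)(V(0))] \\
&\quad + \int_0^1 \mathrm{Ad}(\bar a(t)^{-1})(F^{\bar A}_\sigma + \tau B_\sigma)(\gamma'(t), V(t))\,dt. \tag{2.41}
\end{align*}

Note that in (2.41), the terms involving $\bar A_\sigma$ and $F^{\bar A}_\sigma$ cancel each other out.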

Proof. From the definition of $\omega_{(A,B)}$ in (2.17) and (2.14), we see that we need only focus on the $B$ term. To this end we have, from (2.35) and (2.37):
\begin{align*}
B(\tilde\gamma'(t), \tilde V(t)) &= B(\sigma_*(\gamma'(t))\bar a(t) + \text{vertical}, (\sigma_*V)(t)\bar a(t) + \text{vertical}) \\
&= B(\sigma_*(\gamma'(t))\bar a(t), (\sigma_*V)(t)\bar a(t)) \\
&= \alpha(\bar a(t)^{-1})\,B_\sigma(\gamma'(t), V(t)). \tag{2.42}
\end{align*}
Now recall the relation (2.1)
\[ \tau(\alpha(g)h) = g\,\tau(h)\,g^{-1}, \quad \text{for all } g \in G \text{ and } h \in H, \]
which implies
\[ \tau(\alpha(g)K) = \mathrm{Ad}(g)\,\tau(K) \quad \text{for all } g \in G \text{ and } K \in LH. \]
As usual, we are denoting the derivatives of $\tau$ and $\alpha$ by $\tau$ and $\alpha$ again. Applying this to (2.42) we have
\[ \tau B(\tilde\gamma'(t), \tilde V(t)) = \mathrm{Ad}(\bar a(t)^{-1})\,\tau B_\sigma(\gamma'(t), V(t)), \]
and this yields the result.

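Suppose
\[ \tilde\Gamma : [0, 1]^2 \to P : (t, s) \mapsto \tilde\Gamma(t, s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s) \]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0, s)$ being $A$-horizontal. Let $\Gamma = \pi \circ \tilde\Gamma$. We will need to use the bi-holonomy $g(t, s)$ which is specified as follows: parallel translate $\tilde\Gamma(0, 0)$ along $\Gamma_0|[0, t]$ by $\bar A$, then up the path $\Gamma^t|[0, s]$ by $A$, back along $\Gamma_s$-reversed by $\bar A$ and then down $\Gamma^0|[0, s]$ by $A$; then the resulting point is
\[ \tilde\Gamma(0, 0)\,g(t, s). \tag{2.43} \]
The path
\[ s \mapsto \tilde\Gamma_s \]
describes parallel transport of the initial path $\tilde\Gamma_0$ using the connection $\mathrm{ev}_0^*A$. In what follows we will compare this with the path
\[ s \mapsto \hat\Gamma_s \]
which is the parallel transport of $\hat\Gamma_0 = \tilde\Gamma_0$ using the connection $\mathrm{ev}_1^*A$. The following result describes the difference between these two connections.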
Proposition 2.6. Suppose
\[ \tilde\Gamma : [0, 1]^2 \to P : (t, s) \mapsto \tilde\Gamma(t, s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s) \]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0, s)$ being $A$-horizontal. Then the parallel translate of $\tilde\Gamma_0$ by the connection $\mathrm{ev}_1^*A$ along the path $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$, where $\Gamma = \pi \circ \tilde\Gamma$, results in $\tilde\Gamma_s\,g(1, s)$, with $g(1, s)$ being the bi-holonomy specified as in (2.43).

Proof. Let $\hat\Gamma_s$ be the parallel translate of $\tilde\Gamma_0$ by $\mathrm{ev}_1^*A$ along the path $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$. Then the right endpoint $\hat\Gamma_s(1)$ traces out an $A$-horizontal path, starting at $\tilde\Gamma_0(1)$. Thus, $\hat\Gamma_s(1)$ is the result of parallel transporting $\tilde\Gamma(0, 0)$ by $\bar A$ along $\Gamma_0$ then up the path $\Gamma^1|[0, s]$ by $A$. If we then parallel transport $\hat\Gamma_s(1)$ back by $\bar A$ along $\Gamma_s|[0, 1]$-reversed then we obtain the initial point $\hat\Gamma_s(0)$. This point is of the form $\tilde\Gamma_s(0)b$, for some $b \in G$, and so
\[ \hat\Gamma_s = \tilde\Gamma_s b. \]
Then, parallel-transporting $\hat\Gamma_s(0)$ back down $\Gamma^0|[0, s]$-reversed, by $A$, produces the point $\tilde\Gamma(0, 0)b$. This shows that $b$ is the bi-holonomy $g(1, s)$.

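Now we can turn to determining the parallel-transport process by the connection $\omega_{(A,B)}$. With $\tilde\Gamma$ as above, let now $\check\Gamma_s$ be the $\omega_{(A,B)}$-parallel-translate of $\tilde\Gamma_0$ along $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$. Since $\check\Gamma_s$ and the $\mathrm{ev}_1^*A$-parallel-translate $\hat\Gamma_s$ of Proposition 2.6 are both $\bar A$-horizontal and project by $\pi$ down to $\Gamma_s$, we have
\[ \check\Gamma_s = \hat\Gamma_s b_s, \]
for some $b_s \in G$. Since $\omega_{(A,B)} = \mathrm{ev}_1^*A + \tau(Z)$ applied to the $s$-derivative of $\check\Gamma_s$ is 0, and $\mathrm{ev}_1^*A$ applied to the $s$-derivative of $\hat\Gamma_s$ is 0, we have
\[ b_s^{-1}\partial_s b_s + \mathrm{Ad}(b_s^{-1})\,\tau Z(\partial_s\hat\Gamma_s) = 0. \tag{2.44} \]
Thus, $s \mapsto b_s$ describes parallel transport by $\theta^{\sigma}$ where the section $\tilde\sigma$ satisfies $\tilde\sigma \circ \Gamma = \hat\Gamma$.
Since $\hat\Gamma_s = \tilde\Gamma_s\,g(1, s)$, we then have
\begin{align*}
\frac{db_s}{ds}\,b_s^{-1} &= -\mathrm{Ad}(g(1, s)^{-1})\,\tau Z(\partial_s\tilde\Gamma_s) \\
&= -\mathrm{Ad}(g(1, s)^{-1})\,\tau\int_0^1 B(\partial_t\tilde\Gamma(t, s), \partial_s\tilde\Gamma(t, s))\,dt. \tag{2.45}
\end{align*}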
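To summarize:

Theorem 2.2. Suppose
\[ \tilde\Gamma : [0, 1]^2 \to P : (t, s) \mapsto \tilde\Gamma(t, s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s) \]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0, s)$ being $A$-horizontal. Then the parallel translate of $\tilde\Gamma_0$ by the connection $\omega_{(A,B)}$ along the path $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$, where $\Gamma = \pi \circ \tilde\Gamma$, results in
\[ \tilde\Gamma_s\,g(1, s)\,\tau(h_0(s)), \tag{2.46} \]
with $g(1, s)$ being the bi-holonomy specified as in (2.43), and $s \mapsto h_0(s) \in H$ solving the differential equation
\[ \frac{dh_0(s)}{ds}\,h_0(s)^{-1} = -\alpha(g(1, s)^{-1})\int_0^1 B(\partial_t\tilde\Gamma(t, s), \partial_s\tilde\Gamma(t, s))\,dt \tag{2.47} \]
with initial condition $h_0(0)$ being the identity in $H$.

Let $\sigma$ be a smooth section of the bundle $P \to M$ in a neighborhood of $\Gamma([0, 1]^2)$.
Let $a_t(s) \in G$ specify parallel transport by $A$ up the path $[0, s] \to M : v \mapsto \Gamma(t, v)$, i.e. the $A$-parallel-translate of $\sigma\Gamma(t, 0)$ up the path $[0, s] \to M : v \mapsto \Gamma(t, v)$ results in $\sigma(\Gamma(t, s))a_t(s)$.
On the other hand, $\bar a_s(t)$ will specify parallel transport by $\bar A$ along $[0, t] \to M : u \mapsto \Gamma(u, s)$. Thus,
\[ \tilde\Gamma(t, s) = \sigma(\Gamma(t, s))\,\bar a_s(t)\,a_0(s). \tag{2.48} \]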
The bi-holonomy is given by
\[ g(1, s) = a_0(s)^{-1}\,\bar a_s(1)^{-1}\,a_1(s)\,\bar a_0(1). \]
Let us look at parallel-transport along the path $s \mapsto \Gamma_s$, by the connection $\omega_{(A,B)}$, in terms of the trivialization $\sigma$. Let $\check\Gamma_s \in \mathcal{P}_{\bar A}P$ be obtained by parallel transporting $\tilde\Gamma_0 = \tilde\sigma(\Gamma_0) \in \mathcal{P}_{\bar A}P$ along the path
\[ [0, s] \to M : u \mapsto \Gamma^0(u) = \Gamma(0, u). \]
This transport is described through a map
\[ [0, 1] \to G : s \mapsto c(s), \]
specified through
\[ \check\Gamma_s = \tilde\sigma(\Gamma_s)\,c(s) = \tilde\Gamma_s\,a_0(s)^{-1}c(s). \tag{2.49} \]
Then $c(0) = e$ and
\[ c(s)^{-1}c'(s) = -\mathrm{Ad}(c(s)^{-1})\,\omega_{(A,\bar A,B)}(V(s)), \tag{2.50} \]
where $V_s \in T_{\Gamma_s}\mathcal{P}M$ is the vector field along $\Gamma_s$ given by
\[ V_s(t) = V(s, t) = \partial_s\Gamma(t, s) \quad \text{for all } t \in [0, 1]. \]
Equation (2.50), written out in more detail, is
\[ c(s)^{-1}c'(s) = -\mathrm{Ad}(c(s)^{-1})\Big[\mathrm{Ad}(\bar a_s(1)^{-1})A_\sigma(V_s(1)) + \int_0^1 \mathrm{Ad}(\bar a_s(t)^{-1})\,\tau B_\sigma(\Gamma_s'(t), V_s(t))\,dt\Big], \tag{2.51} \]
where $\bar a_s(t) \in G$ describes $\bar A_\sigma$-parallel-transport along $\Gamma_s|[0, t]$. By (2.46), $c(s)$ is given by
\[ c(s) = a_0(s)\,g(1, s)\,\tau(h_0(s)), \]
where $s \mapsto h_0(s)$ solves
\[ \frac{dh_0(s)}{ds}\,h_0(s)^{-1} = -\int_0^1 \alpha\big((\bar a_s(t)\,a_0(s)\,g(1, s))^{-1}\big)\,B_\sigma(\partial_t\Gamma(t, s), \partial_s\Gamma(t, s))\,dt, \tag{2.52} \]
with initial condition $h_0(0)$ being the identity in $H$. The geometric meaning of $\bar a_s(t)a_0(s)$ is that it describes parallel-transport first by $A$ up from $(0, 0)$ to $(0, s)$ and then to the right by $\bar A$ from $(0, s)$ to $(t, s)$.

3. Two Categories from Plaquettes

In this section we introduce two categories motivated by the differential geometric framework we have discussed in the preceding sections. We show that the geometric framework naturally connects with certain category theoretic structures introduced by Ehresmann [9, 10] and developed further by Kelly and Street [12].
We work with the pair of Lie groups $G$ and $H$, along with maps $\tau$ and $\alpha$ satisfying (2.1), and construct two categories. These categories will have the same set of objects, and also the same set of morphisms.
The set of objects is simply the group $G$:
\[ \mathrm{Obj} = G. \]
The set of morphisms is
\[ \mathrm{Mor} = G^4 \times H, \]
with a typical element denoted
\[ (a, b, c, d; h). \]
It is convenient to visualize a morphism as a plaquette labeled with elements of $G$ (Fig. 4).
To connect with the theory of the preceding sections, we should think of $a$ and $c$ as giving $\bar A$-parallel-transports, $d$ and $b$ as $A$-parallel-transports, and $h$ should be thought of as corresponding to $h_0(1)$ of Theorem 2.2. However, this is only a rough guide; we shall return to this matter later in this section.
For the category Vert, the source (domain) and target (co-domain) of a morphism are:
\[ s_{\mathrm{Vert}}(a, b, c, d; h) = a, \qquad t_{\mathrm{Vert}}(a, b, c, d; h) = c. \]
For the category Horz
\[ s_{\mathrm{Horz}}(a, b, c, d; h) = d, \qquad t_{\mathrm{Horz}}(a, b, c, d; h) = b. \]
We define vertical composition, that is composition in Vert, using Fig. 5. In this figure, the upper morphism is being applied first and then the lower.
Horizontal composition is specified through Fig. 6. In this figure, we have used the notation $\circ_{\mathrm{opp}}$ to stress that, as morphisms, it is the one to the left which is applied first and then the one to the right.
Our first observation is:
Proposition 3.1. Both Vert and Horz are categories, under the specified composition laws. In both categories, all morphisms are invertible.

Fig. 4. Plaquette: a square with bottom edge $a$, right edge $b$, top edge $c$, left edge $d$, and interior label $h$.

Fig. 5. Vertical composition: the plaquette $(a', b', c', d'; h')$ stacked above $(a, b, c, d; h)$, with $a' = c$, composes to the plaquette with bottom edge $a$, right edge $b'b$, top edge $c'$, left edge $d'd$, and interior $h(\alpha(d^{-1})h')$.

Fig. 6. Horizontal composition (for $b = d'$): $(a, b, c, d; h) \circ_{\mathrm{opp}} (a', b', c', d'; h')$ is the plaquette with bottom edge $a'a$, right edge $b'$, top edge $c'c$, left edge $d$, and interior $(\alpha(a^{-1})h')h$.

Fig. 7. Identity maps: the identity for Vert is the plaquette $(a, e, a, e; e)$; the identity for Horz is the plaquette $(e, a, e, a; e)$.

Proof. It is straightforward to verify that the composition laws are associative. The identity map $a \to a$ in Vert is $(a, e, a, e; e)$, and in Horz it is $(e, a, e, a; e)$. These are displayed in Fig. 7. The inverse of the morphism $(a, b, c, d; h)$ in Vert is $(c, b^{-1}, a, d^{-1}; \alpha(d)h^{-1})$; the inverse in Horz is $(a^{-1}, d, c^{-1}, b; \alpha(a)h^{-1})$.
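The claims in this proof can be checked mechanically in a small example. The following is a minimal sketch, assuming a toy crossed module that is not taken from the paper: $G = H = S_3$, $\tau$ the identity map, and $\alpha(g)h = ghg^{-1}$. The function names (`vert`, `horz`, `vert_inv`, and so on) are hypothetical; the composition rules are those of Figs. 5 and 6, in the form used in the proof of Proposition 3.2.

```python
import itertools
import random

# Toy crossed module, assumed for illustration only:
# G = H = S3 (permutations of {0,1,2}), tau = identity, alpha(g)h = g h g^-1.
G = list(itertools.permutations(range(3)))
E = (0, 1, 2)

def mul(p, q):
    """Group product pq (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0, 0, 0]
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def alpha(g, h):
    return mul(mul(g, h), inv(g))

def vert(m2, m1):
    """m2 o_V m1 = (a, b'b, c', d'd; h alpha(d^-1)h'), defined when c = a'."""
    (a, b, c, d, h), (a2, b2, c2, d2, h2) = m1, m2
    assert c == a2
    return (a, mul(b2, b), c2, mul(d2, d), mul(h, alpha(inv(d), h2)))

def horz(m2, m1):
    """m2 o_H m1 = (a'a, b', c'c, d; (alpha(a^-1)h') h), defined when b = d'."""
    (a, b, c, d, h), (a2, b2, c2, d2, h2) = m1, m2
    assert b == d2
    return (mul(a2, a), b2, mul(c2, c), d, mul(alpha(inv(a), h2), h))

def vert_id(a):
    return (a, E, a, E, E)

def horz_id(a):
    return (E, a, E, a, E)

def vert_inv(m):
    a, b, c, d, h = m
    return (c, inv(b), a, inv(d), alpha(d, inv(h)))

def horz_inv(m):
    a, b, c, d, h = m
    return (inv(a), d, inv(c), b, alpha(a, inv(h)))

def check_units_and_inverses():
    """Exhaustive check over all 6^5 morphisms of the toy model."""
    for m in itertools.product(G, repeat=5):
        a, b, c, d, _ = m
        if vert(m, vert_id(a)) != m or vert(vert_id(c), m) != m:
            return False
        if horz(m, horz_id(d)) != m or horz(horz_id(b), m) != m:
            return False
        if vert(vert_inv(m), m) != vert_id(a) or vert(m, vert_inv(m)) != vert_id(c):
            return False
        if horz(horz_inv(m), m) != horz_id(d) or horz(m, horz_inv(m)) != horz_id(b):
            return False
    return True

def check_associativity(trials=2000, seed=0):
    """Randomized check on composable triples."""
    rng = random.Random(seed)
    pick = lambda: rng.choice(G)
    for _ in range(trials):
        m1 = tuple(pick() for _ in range(5))
        m2 = (m1[2], pick(), pick(), pick(), pick())   # a' = c
        m3 = (m2[2], pick(), pick(), pick(), pick())
        if vert(m3, vert(m2, m1)) != vert(vert(m3, m2), m1):
            return False
        n2 = (pick(), pick(), pick(), m1[1], pick())   # d' = b
        n3 = (pick(), pick(), pick(), n2[1], pick())
        if horz(n3, horz(n2, m1)) != horz(horz(n3, n2), m1):
            return False
    return True
```

Both checks are expected to return `True` in this model; they illustrate the proposition in one example and are not a substitute for the general argument.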

The two categories are isomorphic, but it is best not to identify them.
We use $\circ_H$ to denote horizontal composition, and $\circ_V$ to denote vertical composition.
We have seen earlier that if $A$, $\bar A$ and $B$ are such that $\omega_{(A,B)}$ reduces to $\mathrm{ev}_0^*A$ (for example, if $A = \bar A$ and $F^{\bar A} + \tau(B)$ is 0) then all plaquettes $(a, b, c, d; h)$ arising from the connections $A$ and $\omega_{(A,B)}$, satisfy
\[ \tau(h) = a^{-1}b^{-1}cd. \]
Motivated by this observation, we could consider those morphisms $(a, b, c, d; h)$ which satisfy
\[ \tau(h) = a^{-1}b^{-1}cd. \tag{3.1} \]
However, we can look at a broader class of morphisms as well. Suppose
\[ \mathbf{h} \mapsto z(\mathbf{h}) \in Z(G) \]
is a mapping of the morphisms in the category Horz or in Vert into the center $Z(G)$ of $G$, which carries composition of morphisms to products in $Z(G)$:
\[ z(\mathbf{h} \circ \mathbf{h}') = z(\mathbf{h})z(\mathbf{h}'). \]
Then we say that a morphism $\mathbf{h} = (a, b, c, d; h)$ is quasi-flat with respect to $z$ if
\[ \tau(h) = (a^{-1}b^{-1}cd)\,z(\mathbf{h}). \tag{3.2} \]
A larger class of morphisms could also be considered, by replacing $Z(G)$ by an abelian normal subgroup, but we shall not explore this here.

Proposition 3.2. Composition of quasi-flat morphisms is quasi-flat. Thus, the quasi-flat morphisms form a subcategory in both Horz and Vert.

Proof. Let $\mathbf{h} = (a, b, c, d; h)$ and $\mathbf{h}' = (a', b', c', d'; h')$ be quasi-flat morphisms in Horz, such that the horizontal composition $\mathbf{h}' \circ_H \mathbf{h}$ is defined, i.e. $b = d'$. Then
\[ \mathbf{h}' \circ_H \mathbf{h} = (a'a, b', c'c, d; \{\alpha(a^{-1})h'\}h). \]
Applying $\tau$ to the last component in this, we have
\begin{align*}
a^{-1}\tau(h')a\,\tau(h) &= a^{-1}(a'^{-1}b'^{-1}c'd')a\,(a^{-1}b^{-1}cd)\,z(\mathbf{h})z(\mathbf{h}') \\
&= ((a'a)^{-1}b'^{-1}(c'c)d)\,z(\mathbf{h}' \circ_H \mathbf{h}), \tag{3.3}
\end{align*}
which says that $\mathbf{h}' \circ_H \mathbf{h}$ is quasi-flat.
Now suppose $\mathbf{h} = (a, b, c, d; h)$ and $\mathbf{h}' = (a', b', c', d'; h')$ are quasi-flat morphisms in Vert, such that the vertical composition $\mathbf{h}' \circ_V \mathbf{h}$ is defined, i.e. $c = a'$. Then
\[ \mathbf{h}' \circ_V \mathbf{h} = (a, b'b, c', d'd; h\{\alpha(d^{-1})h'\}). \]
Applying $\tau$ to the last component in this, we have
\begin{align*}
\tau(h)\,d^{-1}\tau(h')d &= (a^{-1}b^{-1}cd)\,d^{-1}(a'^{-1}b'^{-1}c'd')d\,z(\mathbf{h})z(\mathbf{h}') \\
&= (a^{-1}(b'b)^{-1}c'd'd)\,z(\mathbf{h}' \circ_V \mathbf{h}), \tag{3.4}
\end{align*}
which says that $\mathbf{h}' \circ_V \mathbf{h}$ is quasi-flat.
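The closure computations (3.3) and (3.4) can be illustrated numerically. The sketch below again assumes a toy model that is not from the paper: $G = H = S_3$, $\tau$ the identity, $\alpha$ conjugation, and $z$ trivial (the center of $S_3$ is trivial), so that a morphism is quasi-flat exactly when $h = a^{-1}b^{-1}cd$. The helper names `boundary`, `flat` and `closure_holds` are hypothetical.

```python
import itertools
import random

# Assumed toy model: G = H = S3, tau = identity, alpha(g)h = g h g^-1, z trivial.
G = list(itertools.permutations(range(3)))
E = (0, 1, 2)

def mul(p, q):
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0, 0, 0]
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def alpha(g, h):
    return mul(mul(g, h), inv(g))

def vert(m2, m1):
    (a, b, c, d, h), (a2, b2, c2, d2, h2) = m1, m2
    assert c == a2
    return (a, mul(b2, b), c2, mul(d2, d), mul(h, alpha(inv(d), h2)))

def horz(m2, m1):
    (a, b, c, d, h), (a2, b2, c2, d2, h2) = m1, m2
    assert b == d2
    return (mul(a2, a), b2, mul(c2, c), d, mul(alpha(inv(a), h2), h))

def boundary(m):
    """a^-1 b^-1 c d for the edge labels of m."""
    a, b, c, d, _ = m
    return mul(mul(inv(a), inv(b)), mul(c, d))

def flat(a, b, c, d):
    """The flat morphism with the given edges (unique here because tau = id)."""
    return (a, b, c, d, boundary((a, b, c, d, E)))

def is_flat(m):
    return m[4] == boundary(m)   # tau(h) = h in this model

def closure_holds(trials=3000, seed=1):
    rng = random.Random(seed)
    pick = lambda: rng.choice(G)
    for _ in range(trials):
        m1 = flat(pick(), pick(), pick(), pick())
        m2h = flat(pick(), pick(), pick(), m1[1])   # d' = b
        m2v = flat(m1[2], pick(), pick(), pick())   # a' = c
        if not is_flat(horz(m2h, m1)) or not is_flat(vert(m2v, m1)):
            return False
    return True
```

In this model `closure_holds()` is expected to return `True`, while a morphism such as `((1, 0, 2), E, E, E, E)` is not flat, so the subcategory is proper.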

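For a morphism $\mathbf{h} = (a, b, c, d; h)$ we set
\[ \tau(\mathbf{h}) = \tau(h). \]
If $\mathbf{h} = (a, b, c, d; h)$ and $\mathbf{h}' = (a', b', c', d'; h')$ are morphisms then we say that they are $\tau$-equivalent,
\[ \mathbf{h} =_\tau \mathbf{h}' \]
if $a = a'$, $b = b'$, $c = c'$, $d = d'$, and $\tau(h) = \tau(h')$.

Proposition 3.3. If $\mathbf{h}, \mathbf{h}', \mathbf{h}'', \mathbf{h}'''$ are quasi-flat morphisms for which the compositions on both sides of (3.5) are meaningful, then
\[ (\mathbf{h}''' \circ_H \mathbf{h}'') \circ_V (\mathbf{h}' \circ_H \mathbf{h}) =_\tau (\mathbf{h}''' \circ_V \mathbf{h}') \circ_H (\mathbf{h}'' \circ_V \mathbf{h}). \tag{3.5} \]
Thus, the structures we are using here correspond to double categories as described by Kelly and Street [12, Sec. 1.1].

Proof. This is a lengthy but straightforward verification. We refer to Fig. 8. For a morphism $\mathbf{h} = (a, b, c, d; h)$, let us write
\[ \partial(\mathbf{h}) = a^{-1}b^{-1}cd. \]
For the left-hand side of (3.5), we have
\begin{align*}
\mathbf{h}' \circ_H \mathbf{h} &= (a'a, b', c'c, d; \{\alpha(a^{-1})h'\}h) \\
\mathbf{h}''' \circ_H \mathbf{h}'' &= (c'c, b'', f'f, d'; \{\alpha(c^{-1})h'''\}h'') \tag{3.6} \\
\mathbf{h}^* \stackrel{\mathrm{def}}{=} (\mathbf{h}''' \circ_H \mathbf{h}'') \circ_V (\mathbf{h}' \circ_H \mathbf{h}) &= (a'a, b''b', f'f, d'd; h^*),
\end{align*}
where
\[ h^* = \{\alpha(a^{-1})h'\}\,h\,\{\alpha(d^{-1}c^{-1})h'''\}\{\alpha(d^{-1})h''\}. \tag{3.7} \]
Applying $\tau$ gives
\begin{align*}
\tau(h^*) &= a^{-1}\partial(\mathbf{h}')z(\mathbf{h}')a\;\partial(\mathbf{h})z(\mathbf{h})\;d^{-1}c^{-1}\partial(\mathbf{h}''')cd\,z(\mathbf{h}''')\;d^{-1}\partial(\mathbf{h}'')d\,z(\mathbf{h}'') \\
&= (a'a)^{-1}(b''b')^{-1}(f'f)(d'd)\,z(\mathbf{h}^*), \tag{3.8}
\end{align*}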
Fig. 8. Consistency of horizontal and vertical compositions: a $2 \times 2$ array of plaquettes, with $\mathbf{h}$ and $\mathbf{h}'$ in the bottom row and $\mathbf{h}''$ and $\mathbf{h}'''$ in the top row.

where we have used the fact, from (2.1), that $\alpha$ is converted to a conjugation on applying $\tau$, and the last line follows after algebraic simplification. Thus,
\[ \tau(h^*) = \partial(\mathbf{h}^*)\,z(\mathbf{h}^*). \tag{3.9} \]
On the other hand, by an entirely similar computation, we obtain
\[ \mathbf{h}_* \stackrel{\mathrm{def}}{=} (\mathbf{h}''' \circ_V \mathbf{h}') \circ_H (\mathbf{h}'' \circ_V \mathbf{h}) = (a'a, b''b', f'f, d'd; h_*), \tag{3.10} \]
where
\[ h_* = \{\alpha(a^{-1})h'\}\{\alpha(a^{-1}b^{-1})h'''\}\,h\,\{\alpha(d^{-1})h''\}. \tag{3.11} \]
Applying $\tau$ to this yields, after using (2.1) and computation,
\[ \tau(h_*) = \partial(\mathbf{h}_*)\,z(\mathbf{h}_*). \]
Since $\tau(h^*)$ is equal to $\tau(h_*)$, the result (3.5) follows.
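The interchange law (3.5) can also be tested in a small example where $\tau$ is far from injective. The sketch below assumes a toy crossed module that is not from the paper: $G = S_3$, $H = \mathbb{Z}/3$ written additively, $\tau(h) = e$ for all $h$, and $\alpha(g)h = \pm h$ according to the sign of $g$; with $z$ trivial, quasi-flatness reduces to $a^{-1}b^{-1}cd = e$. All function names are hypothetical. In this model the $H$-components of the two sides agree exactly for flat arrays, which is stronger than $\tau$-equivalence, and the last lines exhibit a non-flat array where they differ.

```python
import itertools
import random

# Assumed toy crossed module: G = S3, H = Z/3 (additive), tau(h) = e,
# alpha(g)h = h for even g and -h for odd g.
G = list(itertools.permutations(range(3)))
E = (0, 1, 2)

def mul(p, q):
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0, 0, 0]
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def sign(p):
    inversions = sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3))
    return -1 if inversions % 2 else 1

def alpha(g, h):
    return (sign(g) * h) % 3

def vert(m2, m1):
    (a, b, c, d, h), (a2, b2, c2, d2, h2) = m1, m2
    assert c == a2
    return (a, mul(b2, b), c2, mul(d2, d), (h + alpha(inv(d), h2)) % 3)

def horz(m2, m1):
    (a, b, c, d, h), (a2, b2, c2, d2, h2) = m1, m2
    assert b == d2
    return (mul(a2, a), b2, mul(c2, c), d, (alpha(inv(a), h2) + h) % 3)

def interchange_sides(m0, m1, m2, m3):
    """Both sides of (3.5); m0, m1 form the bottom row and m2, m3 the top row."""
    lhs = vert(horz(m3, m2), horz(m1, m0))
    rhs = horz(vert(m3, m1), vert(m2, m0))
    return lhs, rhs

def random_flat_array(rng):
    a, b, d, a1, b1, r, l, b3 = (rng.choice(G) for _ in range(8))
    c = mul(mul(b, a), inv(d))       # forces a^-1 b^-1 c d = e
    c1 = mul(mul(b1, a1), inv(b))    # bottom-right plaquette, left edge b
    f = mul(mul(r, c), inv(l))       # top-left plaquette, bottom edge c
    f1 = mul(mul(b3, c1), inv(r))    # top-right plaquette, left edge r
    hs = [rng.randrange(3) for _ in range(4)]
    return ((a, b, c, d, hs[0]), (a1, b1, c1, b, hs[1]),
            (c, r, f, l, hs[2]), (c1, b3, f1, r, hs[3]))

def flat_interchange_holds(trials=2000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        lhs, rhs = interchange_sides(*random_flat_array(rng))
        if lhs != rhs:
            return False
    return True

# A non-flat array (bottom-left plaquette has a^-1 b^-1 c d != e):
swap = (1, 0, 2)
non_flat = ((swap, E, E, E, 0), (E, E, E, E, 0), (E, E, E, E, 0), (E, E, E, E, 1))
```

For the `non_flat` array the edge labels of the two sides still agree, but the $H$-components do not, which suggests that the flatness hypothesis is doing real work in this example.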

Ideally, a discrete model would be the exact integrated version of the differential geometric connection $\omega_{(A,B)}$. However, it is not clear if such an ideal transcription is feasible for any such connection $\omega_{(A,B)}$ on the path-space bundle.
To make contact with the differential picture we have developed in earlier sections, we should compare quasi-flat morphisms with parallel translation by $\omega_{(A,B)}$ in the case where $B$ is such that $\omega_{(A,B)}$ reduces to $\mathrm{ev}_0^*A$ (for instance, if $A = \bar A$ and the fake curvature $F^{\bar A} + \tau(B)$ vanishes); more precisely, the $h$ for quasi-flat morphisms (taking all $z(\mathbf{h})$ to be the identity) corresponds to the quantity $h_0(1)$ specified through the differential Eq. (2.47). It would be desirable to have a more thorough relationship between the discrete structures and the differential geometric constructions, even in the case when $z(\cdot)$ is not the identity. We hope to address this in future work.
4. Concluding Remarks
We have constructed in (2.17) a connection $\omega_{(A,B)}$ from a connection $A$ on a principal $G$-bundle $P$ over $M$, and a 2-form $B$ taking values in the Lie algebra of a second structure group $H$. The connection $\omega_{(A,B)}$ lives on a bundle of $\bar A$-horizontal paths, where $\bar A$ is another connection on $P$ which may be viewed as governing the gauge theoretic interaction along each curve. Associated to each path $s \mapsto \Gamma_s$ of paths, beginning with an initial path $\Gamma_0$ and ending in a final path $\Gamma_1$ in $M$, is a parallel transport process by the connection $\omega_{(A,B)}$. We have studied conditions (in Theorem 2.1) under which this transport is surface-determined, that is, depends more on the surface $\Gamma$ swept out by the path of paths than on the specific parametrization, given by $\Gamma$, of this surface. We also described connections over the path space of $M$ with values in the Lie algebra $LH$ obtained from the $A$ and $B$.
We developed an integrated version, or a discrete version, of this theory, which is most conveniently formulated in terms of categories of quadrilateral diagrams. These diagrams, or morphisms, arise from parallel transport by $\omega_{(A,B)}$ when $B$ has a special form which makes the parallel transports surface-determined.
Our results and constructions extend a body of literature ranging from differential geometric investigations to category theoretic ones. We have developed both aspects, clarifying their relationship.

Acknowledgments
We are grateful to the anonymous referee for useful comments and pointing us to the reference [12]. Our thanks to Urs Schreiber for the reference [16]. We also thank Swarnamoyee Priyajee Gupta for preparing some of the figures. ANS acknowledges research support from US NSF grant DMS-0601141. AL acknowledges research support from Department of Science and Technology, India under Project No. SR/S2/HEP-0006/2008.

References
[1] J. Baez, Higher Yang–Mills theory, http://arxiv.org/abs/hep-th/0206130.
[2] J. Baez and U. Schreiber, Higher gauge theory, http://arXiv:hep-th/0511710v2.
[3] J. Baez and U. Schreiber, Higher gauge theory II: 2-connections on 2-bundles, http://arxiv.org/abs/hep-th/0412325.
[4] L. Breen and W. Messing, Differential geometry of gerbes, http://arxiv.org/abs/math/0106083.
[5] A. S. Cattaneo, P. Cotta-Ramusino and M. Rinaldi, Loop and path spaces and four-dimensional BF theories: Connections, holonomies and observables, Comm. Math. Phys. 204 (1999) 493–524.
[6] D. Chatterjee, On gerbs, Ph.D. thesis, University of Cambridge (1998).
[7] K.-T. Chen, Algebras of iterated path integrals and fundamental groups, Trans. Amer. Math. Soc. 156 (1971) 359–379.
[8] K.-T. Chen, Iterated integrals of differential forms and loop space homology, Ann. of Math. 97(2) (1973) 217–246.
[9] C. Ehresmann, Catégories structurées, Ann. Sci. École Norm. Sup. 80 (1963) 349–425.
[10] C. Ehresmann, Catégories et structures (Dunod, Paris, 1965).
[11] F. Girelli and H. Pfeiffer, Higher gauge theory: Differential versus integral formulation, J. Math. Phys. 45 (2004) 3949–3971; http://arxiv.org/abs/hep-th/0309173.
[12] G. M. Kelly and R. Street, Review of the elements of 2-categories, in Category Seminar (Proc. Sem., Sydney, 1972/1973), Lecture Notes in Math., Vol. 420 (Springer, Berlin, 1974), pp. 75–103.
[13] A. Lahiri, Surface holonomy and gauge 2-group, Int. J. Geom. Methods Mod. Phys. 1 (2004) 299–309.
[14] M. Murray, Bundle gerbes, J. London Math. Soc. 54 (1996) 403–416.
[15] H. Pfeiffer, Higher gauge theory and a non-abelian generalization of 2-form electrodynamics, Ann. Phys. 308 (2003) 447–477; http://arxiv.org/abs/hep-th/0304074.
[16] A. Stacey, Comparative smootheology; http://arxiv.org/abs/0802.2225.
[17] O. Viro, http://www.pdmi.ras.ru/olegviro/talks.html.
October 12, 2010 10:1 WSPC/S0129-055X 148-RMP
J070-S0129055X10004132

Reviews in Mathematical Physics


Vol. 22, No. 9 (2010) 1061–1097

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004132

MODULI SPACES OF $G_2$ MANIFOLDS

SERGEY GRIGORIAN
Max-Planck-Institut für Gravitationsphysik (Albert-Einstein-Institut),
Am Mühlenberg 1, D-14476 Golm, Germany
and
Simons Center for Geometry and Physics,
Stony Brook University, Stony Brook, NY 11794, USA
sergey.grigorian@stonybrook.edu

Received 27 January 2010


Revised 14 June 2010

This paper is a review of current developments in the study of moduli spaces of $G_2$
manifolds. $G_2$ manifolds are seven-dimensional manifolds with the exceptional holonomy
group $G_2$. Although they are odd-dimensional, in many ways they can be considered
as an analogue of Calabi–Yau manifolds in seven dimensions. They play an important
role in physics as natural candidates for supersymmetric vacuum solutions of M-theory
compactifications. Despite the physical motivation, many of the results are of purely
mathematical interest. Here we cover the basics of $G_2$ manifolds, local deformation
theory of $G_2$ structures and the local geometry of the moduli spaces of $G_2$ structures.

Keywords: Special holonomy; moduli space; M-theory.

Mathematics Subject Classification 2010: 53C25, 53C29, 53Z05

1. Introduction
Ever since antiquity there has been a very close relationship between physics and
geometry. Originally, in Timaeus, Plato related four of the five Platonic solids
(tetrahedron, hexahedron, octahedron, icosahedron) to the elements fire, earth, air
and water, respectively, while the fifth solid, the dodecahedron, was the quintessence
of which the cosmos itself is made. Later, Isaac Newton's Laws of Motion and Theory
of Gravitation gave a precise mathematical framework in which the motion of
objects can be calculated. However, Albert Einstein's General Relativity made it
very explicit that the physics of spacetime is determined by its geometry. More
recently, this fundamental relationship has been taken to a new level with the
development of String and M-theory. Over the past 25 years, superstring theory
has emerged as a successful candidate for the role of a theory that would unify
gravity with other interactions. It was later discovered that all five superstring
theories can be obtained as special limits of a more general 11-dimensional theory
known as M-theory, whose low energy limit is 11-dimensional supergravity [44, 46].
The complete formulation of M-theory is, however, not known yet.

One of the key features of String and M-theory is that these theories are formulated
in 10- and 11-dimensional spacetimes, respectively. One of the techniques to
relate this to the visible four-dimensional world is to assume that the remaining six
or seven dimensions are curled up as a small, compact, so-called internal space. This
is known as compactification. Such a procedure also leads to a remarkable
interrelationship between physics and geometry, since the effective physical content of
the resulting four-dimensional theory is determined by the geometry of the internal
space. Usually the full multidimensional spacetime is regarded as a direct product
$M_4 \times X$, where $M_4$ is a four-dimensional non-compact manifold with Lorentzian
signature $(-+++)$ and $X$ is a compact six- or seven-dimensional Riemannian
manifold. In general, the parameters that define the geometry of the internal space
give rise to massless scalar fields known as moduli, and the properties of the moduli
space are determined by the class of spaces used in the compactification.
The properties of the internal space in String and M-theory compactifications
are governed by physical considerations. A key ingredient of these theories is
supersymmetry [45]. Supersymmetry is a physical symmetry between particles whose
spins differ by $\frac{1}{2}$, that is, between integer spin bosons and half-integer
spin fermions. Mathematically, bosons are represented as functions or tensors and
fermions as spinors. When looking for a supersymmetric vacuum for which the
metric is the only non-zero field, that is a Ricci-flat solution that is invariant under
supersymmetry transformations, it turns out that a necessary requirement is the
existence of a covariantly constant, or parallel, spinor. That is, there must exist a
non-trivial spinor $\eta$ on the Riemannian manifold $X$ that satisfies
\[ \nabla \eta = 0 \tag{1.1} \]
where $\nabla$ is the relevant spinor covariant derivative [8]. This condition implies that
$\eta$ is invariant under parallel transport.
Properties of parallel transport on a Riemannian manifold are closely related
to the concept of holonomy. Consider a vector $v$ at some point $x$ on $X$. Using the
natural Levi-Civita connection that comes from the Riemannian metric, we can
parallel transport $v$ along paths in $X$. In particular, consider a closed contractible
path $\gamma$ based at $x$. As shown in Fig. 1, if we parallel transport $v$ along $\gamma$, then the
new vector $v'$ which we get will necessarily have the same magnitude as the original
vector $v$, but otherwise it does not have to be the same. This gives the notion of
holonomy group. Below we give the precise definition.

Definition 1. Let $(X, g)$ be a Riemannian manifold of dimension $n$ with metric $g$
and corresponding Levi-Civita connection $\nabla$, and fix a point $x \in X$. Let
$\gamma : [0, 1] \to X$ be a loop based at $x$, that is, a piecewise-smooth path such that
$\gamma(0) = \gamma(1) = x$. The parallel transport map $P_\gamma : T_x X \to T_x X$ is then an
invertible linear map which preserves the metric and hence lies in $O(n)$. Define the
Riemannian holonomy group $\mathrm{Hol}_x(X, g)$ of $\nabla$ based at $x$ to be
\[ \mathrm{Hol}_x(X, g) = \{P_\gamma : \gamma \text{ is a loop based at } x\} \subset O(n). \]
Fig. 1. Parallel transport of a vector.

If the manifold $X$ is connected, then the holonomy group is independent of the
base point up to conjugation, and can hence be defined for the whole manifold.
Parallel transport is initially defined for vectors, but can then be naturally extended
to other objects like tensors and spinors, with the holonomy group acting on these
objects via relevant representations.
Now going back to the covariantly constant spinor $\eta$, (1.1) implies that
$\eta$ is invariant under the action of the holonomy group. This shows that the
spinor representation of $\mathrm{Hol}(X, g)$ must contain the trivial representation. For
$\mathrm{Hol}(X, g) = SO(n)$, this is not possible since the spinor representation of $SO(n)$
contains no trivial subrepresentation, so $\mathrm{Hol}(X, g) \subsetneq SO(n)$. Hence the
condition (1.1) implies a reduced holonomy group. Thus, Ricci-flat special holonomy
manifolds occur very naturally in String and M-theory.
As shown by Berger [9], the list of possible special holonomy groups is very
limited. In particular, if $X$ is simply-connected, and neither locally a product nor
symmetric, the only possibilities are given in Fig. 2. In this list, manifolds with
holonomy $SU(k)$, $Sp(k)$, $G_2$ and $Spin(7)$ are Ricci-flat. Moreover, these groups
are subgroups of $SO(n)$ and are simply-connected. This implies that manifolds
with these holonomy groups always admit a spin structure ([30, Proposition 3.6.2]).
These are also precisely the manifolds that admit a parallel spinor. Kähler manifolds
only admit parallel projective spinors, that is, a line subbundle of the spinor bundle.

Geometry       Holonomy      Dimension
Kähler         $U(k)$        $2k$
Calabi–Yau     $SU(k)$       $2k$
Hyperkähler    $Sp(k)$       $4k$
Exceptional    $G_2$         7
Exceptional    $Spin(7)$     8

Fig. 2. List of special holonomy groups.


Thus, for a Ricci-flat supersymmetric vacuum in a 10-dimensional theory, $X$ has
to be six-dimensional in order to reduce to four dimensions, and hence necessarily
a Calabi–Yau manifold. Similarly, for an 11-dimensional theory, seven-dimensional
manifolds with $G_2$ holonomy arise naturally.
We have thus seen that even rather simple physical requirements restrict the
geometry of the manifold $X$ to rather special classes. In particular, the study of
Calabi–Yau manifolds has been crucial in the development of String Theory, and
in fact some very important discoveries in the theory of Calabi–Yau manifolds have
been made thanks to advances in physics. One such major discovery is Mirror
Symmetry [41, 27]. This symmetry first appeared in String Theory, where evidence
was found that conformal field theories (CFTs) related to compactifications on
a Calabi–Yau manifold with Hodge numbers $(h^{1,1}, h^{2,1})$ are equivalent to CFTs
on a Calabi–Yau manifold with Hodge numbers $(h^{2,1}, h^{1,1})$. Mirror symmetry is
currently a powerful tool both for calculations in String Theory and in the study
of Calabi–Yau manifolds and their moduli spaces.
In the mathematical literature, $G_2$ holonomy first appeared in Berger's list of
special holonomy groups in 1955 [9]. In 1966, Bonan showed that manifolds with
$G_2$ holonomy are Ricci-flat. It was known from general theory that having a
holonomy group $G$ is equivalent to having a torsion-free $G$-structure. So it was
natural to study $G_2$ structures on manifolds to get a better understanding of $G_2$
holonomy. The different classes of $G_2$ structures were explored by Fernández
and Gray in their 1982 paper [18]. In particular they showed that a torsion-free
$G_2$ structure is equivalent to the $G_2$-invariant three-form being closed and
co-closed.
It was not known whether the group $G_2$ (or indeed $Spin(7)$ for that matter)
actually appears as a non-symmetric holonomy group until 1987, when Bryant [12]
proved the existence of metrics with $G_2$ and $Spin(7)$ holonomy. In a later paper,
Bryant and Salamon [11] constructed complete metrics with $G_2$ holonomy. However,
the first compact examples of $G_2$ holonomy manifolds were constructed by
Joyce in 1996 [28, 29]. These examples are based on quotients $T^7/\Gamma$ where $\Gamma$ is
a finite group. Such quotient spaces usually exhibit singularities, and Joyce
showed that it is possible to resolve these singularities in such a way as to get a
smooth, compact manifold with $G_2$ holonomy. Since then, a number of other types
of constructions have been found, in particular the construction by Kovalev [35]
where a compact $G_2$ manifold is obtained by gluing together two non-compact
asymptotically cylindrical Riemannian manifolds with holonomy $SU(3)$.
In the $G_2$ holonomy compactification approach to M-theory, the physical content
of the four-dimensional theory is given by the moduli of $G_2$ holonomy manifolds.
A review of the role of $G_2$ manifolds in M-theory is given by Acharya and
Gukov [2] and by Duff [17]. Such a compactification of M-theory is in many ways
analogous to Calabi–Yau compactifications in String Theory, where much progress
has been made through the study of the Calabi–Yau moduli spaces. In particular,
as was shown in [14, 40], the moduli space of complex structures and the
complexified moduli space of Kähler structures are both, in fact, Kähler manifolds.
Moreover, both have a special geometry: that is, both have a line bundle whose first
Chern class coincides with the Kähler class. However, until recently, the structure
of the moduli space of $G_2$ holonomy manifolds has not been studied in that much
detail. Generally, it turns out that the study of $G_2$ manifolds is quite difficult. Unlike
the study of Calabi–Yau manifolds, where the machinery of algebraic geometry has
been used with great success, in the case of $G_2$ manifolds there is no analogue, so
analytical rather than algebraic study is needed.
In this review, we aim to give an overview of what is currently known about
$G_2$ moduli spaces and corresponding deformations of $G_2$ structures. We first give
an introduction to the properties of the group $G_2$: definitions and representations.
Then we look at general properties of $G_2$ structures. Finally we move on to
properties of $G_2$ moduli spaces. Note that here we will only be looking at smooth
compact $G_2$ manifolds. Properties of the non-compact asymptotically cylindrical
$G_2$ manifolds have recently been studied by Kovalev and Nordström [36] and by
Nordström [38], while the properties of $G_2$ manifolds with conical singularities have
been studied by Karigiannis [33].

2. The Group $G_2$
2.1. Automorphisms of octonions
The group $G_2$ is the smallest of the five exceptional Lie groups, the others being
$F_4$, $E_6$, $E_7$ and $E_8$. Surprisingly, all of these Lie groups are related to the octonions,
but $G_2$ is especially close. So let us first give a few facts about the octonions. The
eight-dimensional algebra of octonions, denoted by $\mathbb{O}$, is the largest possible normed
division algebra. The others are the real numbers $\mathbb{R}$, the complex numbers $\mathbb{C}$
and the quaternions $\mathbb{H}$. Following Baez [6], division algebras can
be defined using the notion of triality. Given three real vector spaces $U, V, W$,
a triality is a non-degenerate trilinear map
\[ t : U \times V \times W \to \mathbb{R}. \]
Non-degenerate here means that for any fixed non-zero elements of $U$ and $V$, the
induced functional on $W$ is non-zero. Hence, $t$ also defines a bilinear map $m$
\[ m : U \times V \to W^{*}. \]
For each fixed non-zero element of $U$, this map defines an isomorphism between $V$ and
$W^{*}$, and for each fixed non-zero element of $V$, an isomorphism between $U$ and $W^{*}$.
Hence these three spaces are isomorphic to each other, and if we choose to identify
non-zero elements $e_1 \in U$, $e_2 \in V$, and $e_1 e_2 \in W^{*}$, we can identify the spaces
$U, V, W$ with each other, and we can say that $m$ now defines multiplication on $U$ with
identity element $e = e_1 = e_2 = e_1 e_2$. Note that in particular, the existence of a
non-degenerate trilinear map implies that the original vector spaces $U$, $V$, $W$ are all
of the same dimension.
Due to the non-degeneracy of the original triality, multiplication by a fixed
non-zero element is an isomorphism, so in fact, $U$ is a division algebra. Assuming
further that $U, V, W$ are inner product spaces, if the triality map satisfies
\[ |t(u, v, w)| \le \|u\| \, \|v\| \, \|w\| \]
and is such that for all $u, v$ there exists a non-zero $w$ such that the bound is attained
(and similarly for cyclic permutations of $u, v, w$) then we get a normed division
algebra. The converse is also true: any division algebra defines a triality.
As discussed in detail by Baez [6], on $\mathbb{R}^n$ it is possible to construct bilinear maps
$m_n$ involving the vector and spinor representations of $Spin(n)$
\[ m_n : V_n \times S_n^{\pm} \to S_n^{\mp} \quad \text{for } n \equiv 0, 4 \pmod 8, \tag{2.1a} \]
\[ m_n : V_n \times S_n \to S_n \quad \text{otherwise}, \tag{2.1b} \]
where $V_n$ is the vector representation of $SO(n)$ and $S_n^{(\pm)}$ are the (left- and
right-handed) spinor representations.
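The spinor representations in (2.1) are self-dual, so in principle, by dualizing the
maps in (2.1), we could obtain trilinear maps into $\mathbb{R}$. However, in order to obtain
trialities, these maps have to be non-degenerate, and hence the dimensions of the
relevant representations must agree. This happens only for $n = 1, 2, 4, 8$, and each
of these trialities gives a normed division algebra of the corresponding dimension:
\[
\begin{aligned}
t_1 &: V_1 \times S_1 \times S_1 \to \mathbb{R} &&\Longrightarrow\ \mathbb{R}, \\
t_2 &: V_2 \times S_2 \times S_2 \to \mathbb{R} &&\Longrightarrow\ \mathbb{C}, \\
t_4 &: V_4 \times S_4^{+} \times S_4^{-} \to \mathbb{R} &&\Longrightarrow\ \mathbb{H}, \\
t_8 &: V_8 \times S_8^{+} \times S_8^{-} \to \mathbb{R} &&\Longrightarrow\ \mathbb{O}.
\end{aligned}
\tag{2.2}
\]
This way, via the trialities we obtain all of the normed division algebras.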
In general, suppose we have a triality $t : U_1 \times U_2 \times U_3 \to \mathbb{R}$. Then to define a
normed division algebra from $t$, we fix two vectors in two of the three spaces.
Hence the automorphism group of the division algebra is the subgroup of the
automorphism group of the triality that fixes these two vectors. For $t_8$ the automorphism
group of the triality turns out to be $Spin(8)$, while $G_2$ is defined as the
automorphism group of the corresponding octonion algebra. Thus we have
Definition 2. The group $G_2$ is the automorphism group of the octonion algebra.
Since $G_2$ is the automorphism group of octonions, it is the subgroup of $Spin(8)$
(the automorphism group of the triality $t_8$) that preserves unit vectors in $V_8$ and
$S_8^{+}$. As explained by Baez in [6], the subgroup of $Spin(8)$ that fixes a unit vector in
$V_8$ is $Spin(7)$. Moreover, if the representation $S_8^{+}$ is restricted to $Spin(7)$, we get the
spinor representation $S_7$. Therefore, $G_2$ is the subgroup of $Spin(7)$ that fixes a unit
vector in $S_7$. In this representation, $Spin(7)$ acts transitively on the unit sphere $S^7$,
so we have
\[ Spin(7)/G_2 \cong S^7. \tag{2.3} \]
Hence we have the following result.
Proposition 3. The group $G_2$ has dimension 14.

Proof. From (2.3),
\[ \dim G_2 = \dim Spin(7) - \dim S^7 = 21 - 7 = 14. \]

The automorphism group fixes the identity, so in fact $G_2$ acts non-trivially only on
octonions that are orthogonal to the identity, namely the imaginary octonions, denoted
by $\mathrm{Im}(\mathbb{O})$, and thus we get a natural seven-dimensional representation of $G_2$. A
closer look at this representation reveals another description of $G_2$. Using octonion
multiplication, we can define a cross product $\times$ on $\mathrm{Im}(\mathbb{O})$ by
\[ a \times b = \mathrm{Im}(ab) = \tfrac{1}{2}(ab - ba). \tag{2.4} \]
But $G_2$ preserves octonion multiplication, hence any element of $G_2$ preserves the
seven-dimensional cross product. Alternatively, (2.4) can be written as
\[ a \times b = ab + \langle a, b \rangle \tag{2.5} \]
where $\langle \cdot , \cdot \rangle$ is the octonionic inner product, in general defined by
\[ \langle a, b \rangle = \tfrac{1}{2}(a \bar{b} + b \bar{a}). \]
Also, it can be shown that
\[ \langle a, b \rangle = -\tfrac{1}{6} \mathrm{Tr}\big(a \times (b \times \cdot\,)\big). \tag{2.6} \]
Therefore, from (2.5), multiplication of imaginary octonions can be defined in terms
of the cross product, hence any transformation preserving the cross product
preserves multiplication on $\mathrm{Im}(\mathbb{O})$, and is thus in $G_2$. So, $G_2$ is precisely the group
that preserves the seven-dimensional cross product.
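The properties of the cross product used above can be checked numerically. The following is a minimal sketch in plain Python, assuming the structure constants of the cross product are the components of the three-form $\varphi_0$ in the coordinates of (2.9); the helper names (`perm_sign`, `antisym`, `cross`, `trace_double_cross`) and the sample vectors are illustrative choices and not part of the original text.

```python
from itertools import permutations

# Signed terms of phi_0, assumed to be those of (2.9), with 1-based indices.
PHI_TERMS = [((1, 2, 3), 1), ((1, 4, 5), 1), ((1, 6, 7), 1), ((2, 4, 6), 1),
             ((2, 5, 7), -1), ((3, 4, 7), -1), ((3, 5, 6), -1)]


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def antisym(terms):
    """All components (0-based index tuples) of the antisymmetric tensor."""
    comps = {}
    for idx, coeff in terms:
        for p in permutations(range(len(idx))):
            comps[tuple(idx[k] - 1 for k in p)] = coeff * perm_sign(p)
    return comps


PHI = antisym(PHI_TERMS)


def cross(a, b):
    """Seven-dimensional cross product: (a x b)_k = phi_{ijk} a_i b_j."""
    out = [0] * 7
    for (i, j, k), c in PHI.items():
        out[k] += c * a[i] * b[j]
    return out


def dot(a, b):
    """Euclidean inner product on R^7."""
    return sum(x * y for x, y in zip(a, b))


def trace_double_cross(a, b):
    """Trace of the linear map c -> a x (b x c) on R^7."""
    total = 0
    for k in range(7):
        e_k = [1 if i == k else 0 for i in range(7)]
        total += cross(a, cross(b, e_k))[k]
    return total


# Arbitrary integer test vectors (illustrative only).
a = [1, 2, 0, -1, 3, 0, 2]
b = [0, 1, 1, 4, -2, 5, 1]
print(trace_double_cross(a, b), -6 * dot(a, b))
```

The two printed numbers agree, which is the content of (2.6) with its factor of $-\frac{1}{6}$; the same helpers also confirm that $a \times b$ is orthogonal to both $a$ and $b$.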
Moreover, from the cross product we can form a scalar triple product on
$\mathrm{Im}(\mathbb{O})$ given by
\[ \varphi_0(a, b, c) = \langle a, b \times c \rangle = \langle a, bc \rangle. \tag{2.7} \]
This defines $\varphi_0$ as an anti-symmetric trilinear functional, that is, a three-form
on $\mathbb{R}^7$. Equivalently, for a basis $e_i$ of $\mathrm{Im}(\mathbb{O})$,
\[ e_i \times e_j = \varphi_{0ij}{}^{k} e_k. \tag{2.8} \]
So in this description, the components of $\varphi_0$ are essentially the structure constants
of the algebra of imaginary octonions.
A well known way to encode the multiplication rules for the octonions is the
Fano plane [6]. It is shown in Fig. 3. In the diagram, the vertices $e_1, \ldots, e_7$ are the
seven square roots of $-1$. Multiplication follows along the six straight lines (sides
of the triangle and the altitudes) and along the central circle in the direction of
the arrows. So if $e_i, e_j, e_k$ are in this order on a straight line, then $e_i e_j = e_k$ and
$e_j e_i = -e_k$.
Fig. 3. Fano plane. [Figure: a triangle with vertices $e_1$, $e_3$, $e_4$, edge midpoints
$e_2$, $e_7$, $e_5$, and center $e_6$.]

However, from (2.8) we see that $\varphi_0$ encodes precisely the same information as the
Fano plane. Suppose $x^1, \ldots, x^7$ are coordinates on $\mathbb{R}^7$ and let
$e^{ijk} = dx^i \wedge dx^j \wedge dx^k$; then just reading off from the Fano plane, $\varphi_0$ can be written as

\[ \varphi_0 = e^{123} + e^{145} + e^{167} + e^{246} - e^{257} - e^{347} - e^{356}. \tag{2.9} \]

Note that in order to keep the same convention for $\varphi_0$ as Joyce [30], in the Fano
plane we have a different numbering for the octonions compared to Baez [6].
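To make the statement that $\varphi_0$ encodes the octonion multiplication table concrete, here is a small sketch in plain Python. It assumes the rule $e_i e_j = -\delta_{ij} + \varphi_{0ij}{}^{k} e_k$ implied by (2.5) and (2.8) together with the signs in (2.9); the representation of an octonion as a list of eight numbers and the function names are illustrative assumptions.

```python
from itertools import permutations

# Signed terms of phi_0 as in (2.9), with 1-based indices.
PHI_TERMS = [((1, 2, 3), 1), ((1, 4, 5), 1), ((1, 6, 7), 1), ((2, 4, 6), 1),
             ((2, 5, 7), -1), ((3, 4, 7), -1), ((3, 5, 6), -1)]


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def antisym(terms):
    """All components (0-based index tuples) of the antisymmetric tensor."""
    comps = {}
    for idx, coeff in terms:
        for p in permutations(range(len(idx))):
            comps[tuple(idx[k] - 1 for k in p)] = coeff * perm_sign(p)
    return comps


PHI = antisym(PHI_TERMS)


def omul(x, y):
    """Product of octonions stored as 8-lists [real, e1, ..., e7]."""
    out = [0] * 8
    # Real part: x0*y0 minus the inner product of the imaginary parts.
    out[0] = x[0] * y[0] - sum(x[i] * y[i] for i in range(1, 8))
    # Imaginary part: x0*Im(y) + y0*Im(x) + Im(x) x Im(y).
    for i in range(1, 8):
        out[i] = x[0] * y[i] + y[0] * x[i]
    for (i, j, k), c in PHI.items():
        out[k + 1] += c * x[i + 1] * y[j + 1]
    return out


def unit(i):
    """Basis octonion e_i for i = 1..7; unit(0) is the identity."""
    return [1 if k == i else 0 for k in range(8)]


def neg(x):
    """Negative of an octonion."""
    return [-t for t in x]


def norm2(x):
    """Squared norm of an octonion."""
    return sum(t * t for t in x)


e1, e2, e4 = unit(1), unit(2), unit(4)
left = omul(omul(e1, e2), e4)    # (e1 e2) e4
right = omul(e1, omul(e2, e4))   # e1 (e2 e4)
print(left == neg(right))        # non-associativity shows up as a sign
```

With these conventions $e_1 e_2 = e_3$ and $e_2 e_1 = -e_3$, each $e_i$ squares to $-1$, and the norm is multiplicative, as expected of a normed division algebra.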
With this choice of coordinates, the inner product on $\mathrm{Im}(\mathbb{O}) \cong \mathbb{R}^7$ is given by
the standard Euclidean metric

\[ g_0 = (dx^1)^2 + \cdots + (dx^7)^2. \tag{2.10} \]

As seen from (2.6), $G_2$ preserves the inner product on $\mathrm{Im}(\mathbb{O})$, so it clearly preserves
$g_0$ and is hence a subgroup of $SO(7)$.
Since $\varphi_0$ defines the seven-dimensional cross product, and $G_2$ is the symmetry
group of this cross product, $G_2$ is the stabilizer of $\varphi_0$ in $GL(7, \mathbb{R})$. So we can state:

Theorem 4 ([12]). The subgroup of $GL(7, \mathbb{R})$ that preserves the three-form $\varphi_0$ is
$G_2$.

This is a key property of $G_2$ and as such it is often taken as the definition of
the group $G_2$, in particular in [30]. From the metric $g_0$ we can define the Hodge
star $*_0$ on $\mathbb{R}^7$, and using this, the dual four-form $\psi_0 = *_0 \varphi_0$, which is given by

\[ \psi_0 = e^{4567} + e^{2367} + e^{2345} + e^{1357} - e^{1346} - e^{1256} - e^{1247}. \tag{2.11} \]

As we have seen, $G_2$ preserves both $\varphi_0$ and $g_0$, so it also preserves $\psi_0$. In
particular, $\varphi_0$ and $\psi_0$ give alternate descriptions of the
trivial one-dimensional representation of $G_2$.
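The relation $\psi_0 = *_0 \varphi_0$ can be verified term by term. The sketch below, in plain Python, assumes the Euclidean metric (2.10), the orientation $dx^1 \wedge \cdots \wedge dx^7$ and the component convention $(*\varphi)_{abcd} = \frac{1}{3!} \varphi^{efg} \varepsilon_{efgabcd}$; the dictionaries `PHI0` and `PSI0` simply transcribe (2.9) and (2.11), and the function names are illustrative.

```python
from itertools import combinations

# Components of phi_0 from (2.9) and psi_0 from (2.11), keyed by sorted
# 1-based index tuples.
PHI0 = {(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
        (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1}
PSI0 = {(4, 5, 6, 7): 1, (2, 3, 6, 7): 1, (2, 3, 4, 5): 1, (1, 3, 5, 7): 1,
        (1, 3, 4, 6): -1, (1, 2, 5, 6): -1, (1, 2, 4, 7): -1}


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def hodge_star_3form(phi):
    """Non-zero components (sorted index tuples) of the Hodge dual of phi."""
    psi = {}
    for abcd in combinations(range(1, 8), 4):
        # The complementary sorted triple carries the only contribution.
        efg = tuple(i for i in range(1, 8) if i not in abcd)
        if efg in phi:
            psi[abcd] = phi[efg] * perm_sign(efg + abcd)
    return psi


print(hodge_star_3form(PHI0) == PSI0)
```

Each of the seven terms of $\varphi_0$ produces exactly one term of $\psi_0$, with the sign given by the parity of the permutation $(e, f, g, a, b, c, d)$.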
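It also turns out that $\psi_0$ is closely related to the associator on $\mathrm{Im}(\mathbb{O})$. As the
octonions are non-associative, we can define a non-trivial associator map
\[ [\cdot, \cdot, \cdot] : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathrm{Im}(\mathbb{O}) \]
given by
\[ [a, b, c] = a(bc) - (ab)c. \tag{2.12} \]
Just as $\varphi_0$ is defined as a dualization of the cross product using the inner product
to obtain the map
\[ \varphi_0 : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathbb{R}, \]
so it turns out that up to a constant multiple the map
\[ \psi_0 : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathbb{R} \]
is a dualization of the associator, given by
\[ \psi_0(a, b, c, d) = -\tfrac{1}{2} \langle [a, b, c], d \rangle. \tag{2.13} \]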
It is possible to show that $\varphi_0$ and $\psi_0$ satisfy various contraction identities. In
particular, from [13, 21, 32], we have

Proposition 5. The three-form $\varphi_0$ and the corresponding four-form $\psi_0$ satisfy the
following identities:
\[ \varphi_{0abc} \varphi_{0mn}{}^{c} = g_{0am} g_{0bn} - g_{0an} g_{0bm} + \psi_{0abmn}, \tag{2.14a} \]
\[ \varphi_{0abc} \psi_{0mnp}{}^{c} = -3 \left( g_{0a[m} \varphi_{0np]b} - g_{0b[m} \varphi_{0np]a} \right), \tag{2.14b} \]
\[ \psi_{0abcd} \psi_0{}^{mnpq} = 24\, \delta_a^{[m} \delta_b^{n} \delta_c^{p} \delta_d^{q]} + 72\, \psi_{0[ab}{}^{[mn} \delta_c^{p} \delta_{d]}^{q]} - 16\, \varphi_{0[abc} \varphi_0{}^{[mnp} \delta_{d]}^{q]}, \tag{2.14c} \]
where $[m\,n\,p]$ denotes antisymmetrization of indices and $\delta_a^b$ is the Kronecker delta,
with $\delta_a^b = 1$ if $a = b$ and 0 otherwise.
The above identities can of course be further contracted; the details can be
found in [21, 32]. These identities and their contractions are crucial whenever any
calculations involving $\varphi_0$ and $\psi_0$ have to be done. In particular, these are very
useful when studying $G_2$ manifolds.
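Since all indices run over only seven values, identities of this kind can be checked by brute force. The following plain-Python sketch verifies (2.14a) and two of its standard contractions, $\varphi_{0abc} \varphi_{0d}{}^{bc} = 6 g_{0ad}$ and $\varphi_{0abc} \psi_{0mn}{}^{bc} = 4 \varphi_{0amn}$, assuming the components of $\varphi_0$ and $\psi_0$ from (2.9) and (2.11) and the Euclidean metric; the helper names are illustrative.

```python
from itertools import permutations

PHI_TERMS = [((1, 2, 3), 1), ((1, 4, 5), 1), ((1, 6, 7), 1), ((2, 4, 6), 1),
             ((2, 5, 7), -1), ((3, 4, 7), -1), ((3, 5, 6), -1)]
PSI_TERMS = [((4, 5, 6, 7), 1), ((2, 3, 6, 7), 1), ((2, 3, 4, 5), 1),
             ((1, 3, 5, 7), 1), ((1, 3, 4, 6), -1), ((1, 2, 5, 6), -1),
             ((1, 2, 4, 7), -1)]


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def antisym(terms):
    """All components (0-based index tuples) of the antisymmetric tensor."""
    comps = {}
    for idx, coeff in terms:
        for p in permutations(range(len(idx))):
            comps[tuple(idx[k] - 1 for k in p)] = coeff * perm_sign(p)
    return comps


PHI, PSI = antisym(PHI_TERMS), antisym(PSI_TERMS)
IDX = range(7)


def phi(a, b, c):
    return PHI.get((a, b, c), 0)


def psi(a, b, c, d):
    return PSI.get((a, b, c, d), 0)


def delta(a, b):
    return 1 if a == b else 0


def identity_2_14a_holds():
    """phi_abc phi_mnc = g_am g_bn - g_an g_bm + psi_abmn for all indices."""
    return all(
        sum(phi(a, b, c) * phi(m, n, c) for c in IDX)
        == delta(a, m) * delta(b, n) - delta(a, n) * delta(b, m) + psi(a, b, m, n)
        for a in IDX for b in IDX for m in IDX for n in IDX)


def phi_phi_contraction_holds():
    """phi_abc phi_dbc = 6 g_ad."""
    return all(
        sum(phi(a, b, c) * phi(d, b, c) for b in IDX for c in IDX) == 6 * delta(a, d)
        for a in IDX for d in IDX)


def phi_psi_contraction_holds():
    """phi_abc psi_mnbc = 4 phi_amn."""
    return all(
        sum(phi(a, b, c) * psi(m, n, b, c) for b in IDX for c in IDX)
        == 4 * phi(a, m, n)
        for a in IDX for m in IDX for n in IDX)


print(identity_2_14a_holds(), phi_phi_contraction_holds(),
      phi_psi_contraction_holds())
```

The same pattern extends to (2.14b) and (2.14c) at the cost of longer loops.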

2.2. Representations of $G_2$
As we will see in Sec. 3, a crucial role in the study of $G_2$ structures is played by
the representations of $G_2$. Since $G_2$ is a subgroup of $SO(7)$, it has a fundamental
vector representation on $\mathbb{R}^7$. In the study of $G_2$ manifolds, it is very important
to understand the representations of $G_2$ on $p$-forms. So let us consider first the
representations of $G_2$ on antisymmetric tensors in $\mathbb{R}^7$. For brevity let $V = \mathbb{R}^7$.
Following Bryant [13], we first look at the Lie algebra $\mathfrak{so}(7)$, which is the space
of antisymmetric $7 \times 7$ matrices on $V$. For a vector $\omega \in V$, define the map

\[ \rho_\varphi : V \to \mathfrak{so}(7) \quad \text{given by} \quad \rho_\varphi(\omega) = \omega \lrcorner \varphi_0, \tag{2.15} \]

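which is clearly injective. Conversely, define the map

\[ \tau_\varphi : \mathfrak{so}(7) \to V \quad \text{given by} \quad \tau_\varphi(\alpha_{ab})^{c} = \tfrac{1}{6} \varphi_0{}^{c}{}_{ab}\, \alpha^{ab}. \tag{2.16} \]

From (2.14), we get that

\[ \tau_\varphi(\rho_\varphi(\omega)) = \omega, \]

so that $\tau_\varphi$ is a partial inverse of $\rho_\varphi$. Thus we get a decomposition

\[ \mathfrak{so}(7) = \ker \tau_\varphi \oplus \rho_\varphi(V) \tag{2.17} \]

where $\dim \rho_\varphi(V) = 7$ and $\dim \ker \tau_\varphi = 14$. It turns out that $\ker \tau_\varphi$ is in fact a Lie
algebra with respect to the matrix commutator. Since the commutator is the Lie bracket
on $\mathfrak{so}(7)$, it already satisfies the Jacobi identity, so it is only necessary to show that for
$\alpha, \beta \in \ker \tau_\varphi$ we have $[\alpha, \beta] \in \ker \tau_\varphi$. This is an exercise in applying the
contraction identities for $\varphi_0$. Thus we get a 14-dimensional Lie subalgebra of $\mathfrak{so}(7)$.
However, this is precisely the Lie algebra $\mathfrak{g}_2$ [32], that is

\[ \mathfrak{g}_2 = \ker \tau_\varphi = \{ \alpha \in \mathfrak{so}(7) : \varphi_{0abc}\, \alpha^{bc} = 0 \}. \tag{2.18} \]

This further implies that we get the following decomposition of $\mathfrak{so}(7)$:

\[ \mathfrak{so}(7) = \mathfrak{g}_2 \oplus \rho_\varphi(V). \tag{2.19} \]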
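The dimension count behind this decomposition is a finite linear algebra computation and can be sketched in plain Python. The code below assumes the description of $\mathfrak{g}_2$ in (2.18) as the set of antisymmetric matrices $\alpha$ with $\varphi_{0abc} \alpha^{bc} = 0$, with $\varphi_0$ as in (2.9); the sample elements `alpha` and `beta` and all function names are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations, permutations

PHI_TERMS = [((1, 2, 3), 1), ((1, 4, 5), 1), ((1, 6, 7), 1), ((2, 4, 6), 1),
             ((2, 5, 7), -1), ((3, 4, 7), -1), ((3, 5, 6), -1)]


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def antisym(terms):
    """All components (0-based index tuples) of the antisymmetric tensor."""
    comps = {}
    for idx, coeff in terms:
        for p in permutations(range(len(idx))):
            comps[tuple(idx[k] - 1 for k in p)] = coeff * perm_sign(p)
    return comps


PHI = antisym(PHI_TERMS)


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


# The constraint phi_abc alpha^bc = 0 as a 7 x 21 matrix acting on the
# independent entries alpha^bc with b < c (the factor 2 counts bc and cb).
PAIRS = list(combinations(range(7), 2))
CONSTRAINT = [[2 * PHI.get((a, b, c), 0) for (b, c) in PAIRS] for a in range(7)]
dim_g2 = len(PAIRS) - rank(CONSTRAINT)


def two_form_matrix(entries):
    """Antisymmetric 7x7 matrix from {(b, c): value} with 1-based b < c."""
    m = [[0] * 7 for _ in range(7)]
    for (b, c), v in entries.items():
        m[b - 1][c - 1], m[c - 1][b - 1] = v, -v
    return m


def constraint_of(m):
    """The vector phi_abc m^bc, which vanishes exactly on the kernel."""
    return [sum(PHI.get((a, b, c), 0) * m[b][c]
                for b in range(7) for c in range(7)) for a in range(7)]


def commutator(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    n = range(7)
    return [[sum(x[i][k] * y[k][j] - y[i][k] * x[k][j] for k in n)
             for j in n] for i in n]


alpha = two_form_matrix({(2, 3): 1, (4, 5): -1})   # e^23 - e^45
beta = two_form_matrix({(4, 6): 1, (5, 7): 1})     # e^46 + e^57
print(dim_g2, constraint_of(commutator(alpha, beta)))
```

The constraint has rank 7 on the 21-dimensional space $\mathfrak{so}(7)$, leaving a 14-dimensional kernel, and the commutator of the two sample kernel elements again satisfies the constraint, in line with the closure claimed in the text.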

The group $G_2$ acts via the adjoint representation on the 14-dimensional vector
space $\mathfrak{g}_2$ and via the fundamental vector representation on the seven-dimensional
space $\rho_\varphi(V)$. This is a $G_2$-invariant irreducible decomposition of $\mathfrak{so}(7)$ into the
representations 7 and 14. Hence we get the following result:

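Theorem 6 ([12]). The space $\Lambda^2$ of two-forms on $V$ decomposes as

\[ \Lambda^2 = \Lambda^2_7 \oplus \Lambda^2_{14} \tag{2.20} \]

with the components $\Lambda^2_7$ and $\Lambda^2_{14}$ given by:

\[ \Lambda^2_7 = \{ \omega \lrcorner \varphi_0 : \omega \text{ a vector} \}, \tag{2.21a} \]
\[ \Lambda^2_{14} = \left\{ \alpha = \tfrac{1}{2} \alpha_{ab}\, e^a \wedge e^b : (\alpha_{ab}) \in \mathfrak{g}_2 \right\}. \tag{2.21b} \]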
An alternative, but fully equivalent, description of $\Lambda^2_7$ and $\Lambda^2_{14}$ presents them as
eigenspaces of the operator

\[ T_\psi : \Lambda^2 \to \Lambda^2 \quad \text{given by} \quad T_\psi(\alpha_{ab}) = \psi_{0abcd}\, \alpha^{cd}. \tag{2.22} \]

With this description, we have [32]:

\[ \Lambda^2_7 = \{ \alpha \in \Lambda^2 : T_\psi \alpha = 4\alpha \}, \tag{2.23a} \]
\[ \Lambda^2_{14} = \{ \alpha \in \Lambda^2 : T_\psi \alpha = -2\alpha \}. \tag{2.23b} \]
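The two eigenvalues can be confirmed on explicit examples. The sketch below, in plain Python, applies the operator of (2.22) to one two-form of the type $\omega \lrcorner \varphi_0$ and to one two-form satisfying $\varphi_{0abc} \alpha^{bc} = 0$, assuming the components of $\varphi_0$ and $\psi_0$ from (2.9) and (2.11); the particular sample forms and the function names are illustrative.

```python
from itertools import permutations

PHI_TERMS = [((1, 2, 3), 1), ((1, 4, 5), 1), ((1, 6, 7), 1), ((2, 4, 6), 1),
             ((2, 5, 7), -1), ((3, 4, 7), -1), ((3, 5, 6), -1)]
PSI_TERMS = [((4, 5, 6, 7), 1), ((2, 3, 6, 7), 1), ((2, 3, 4, 5), 1),
             ((1, 3, 5, 7), 1), ((1, 3, 4, 6), -1), ((1, 2, 5, 6), -1),
             ((1, 2, 4, 7), -1)]


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def antisym(terms):
    """All components (0-based index tuples) of the antisymmetric tensor."""
    comps = {}
    for idx, coeff in terms:
        for p in permutations(range(len(idx))):
            comps[tuple(idx[k] - 1 for k in p)] = coeff * perm_sign(p)
    return comps


PHI, PSI = antisym(PHI_TERMS), antisym(PSI_TERMS)
N = range(7)


def apply_T(alpha):
    """T(alpha)_ab = psi_abcd alpha^cd for an antisymmetric 7x7 matrix."""
    return [[sum(PSI.get((a, b, c, d), 0) * alpha[c][d] for c in N for d in N)
             for b in N] for a in N]


def scale(factor, alpha):
    """Multiply every matrix entry by a number."""
    return [[factor * x for x in row] for row in alpha]


# alpha7 = e_1 contracted into phi_0, i.e. (alpha7)_bc = phi_{1bc}
# (index 0 in the code); this equals e^23 + e^45 + e^67.
alpha7 = [[PHI.get((0, b, c), 0) for c in N] for b in N]

# alpha14 = e^23 - e^45, which satisfies phi_abc alpha^bc = 0.
alpha14 = [[0] * 7 for _ in N]
alpha14[1][2], alpha14[2][1] = 1, -1
alpha14[3][4], alpha14[4][3] = -1, 1

print(apply_T(alpha7) == scale(4, alpha7),
      apply_T(alpha14) == scale(-2, alpha14))
```

As a consistency check, the operator is traceless on the 21-dimensional space $\Lambda^2$, which matches $7 \cdot 4 + 14 \cdot (-2) = 0$.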

Correspondingly, the description of the 7 and 14 pieces of $\Lambda^5$ is obtained from
(2.21a) and (2.21b) via Hodge duality.
Let us now look at three-forms in more detail. Consider $\mathrm{Sym}^2(V^*)$, the space
of symmetric two-tensors on $V$, and define a map

\[ i_\varphi : \mathrm{Sym}^2(V^*) \to \Lambda^3 \quad \text{given by} \quad i_\varphi(h)_{abc} = h^{d}{}_{[a} \varphi_{0bc]d}. \tag{2.24} \]

We can decompose $\mathrm{Sym}^2(V^*) = \mathbb{R} g_0 \oplus \mathrm{Sym}^2_0(V^*)$ where $\mathbb{R} g_0$ is the set of symmetric
tensors proportional to the metric $g_0$ and $\mathrm{Sym}^2_0(V^*)$ is the set of traceless symmetric
tensors. This is a $G_2$-invariant irreducible decomposition of $\mathrm{Sym}^2(V^*)$ into
one-dimensional and 27-dimensional representations. We clearly have

\[ i_\varphi(g_0)_{abc} = \varphi_{0abc}, \]

so the map $i_\varphi$ is also $G_2$-invariant and is injective on each summand of this
decomposition. Looking at the first summand, we get that $i_\varphi(\mathbb{R} g_0) = \Lambda^3_1$, the
one-dimensional singlet representation of $G_2$. Now look at the second summand
and consider $i_\varphi(\mathrm{Sym}^2_0(V^*))$. This is 27-dimensional and irreducible, so it gives a
27-dimensional representation of $G_2$ on three-forms:

\[ i_\varphi(\mathrm{Sym}^2_0(V^*)) = \Lambda^3_{27}. \]

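Now, $\Lambda^3$ is 35-dimensional, and we have accounted for $1 + 27 = 28$ dimensions.
Thus we still have seven dimensions left unaccounted for in $\Lambda^3$. So let us extend
the map $i_\varphi$ to $\Lambda^2$, the antisymmetric two-tensors on $\mathbb{R}^7$. Suppose $\alpha \in \Lambda^2_7$. Then
$\alpha = \omega \lrcorner \varphi_0$ for some vector $\omega \in V$, so

\[ i_\varphi(\alpha)_{abc} = \omega^{e} \varphi_0{}^{d}{}_{[a|e|} \varphi_{0bc]d} = \psi_{0abcd}\, \omega^{d} \tag{2.25} \]

where we have used (2.14). This defines a $G_2$-invariant map from $V$ to $\Lambda^3$ and hence
gives $\Lambda^3_7$.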
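The dimension count $1 + 27 + 7 = 35$ can be confirmed by computing ranks. The plain-Python sketch below assumes the map (2.24) in the form $3\, i_\varphi(h)_{abc} = h_{da} \varphi_{0bcd} + h_{db} \varphi_{0cad} + h_{dc} \varphi_{0abd}$ (indices raised with the Euclidean metric), applied both to symmetric $h$ and to two-forms of the type $\omega \lrcorner \varphi_0$; the basis choices and function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations

PHI_TERMS = [((1, 2, 3), 1), ((1, 4, 5), 1), ((1, 6, 7), 1), ((2, 4, 6), 1),
             ((2, 5, 7), -1), ((3, 4, 7), -1), ((3, 5, 6), -1)]
PSI_TERMS = [((4, 5, 6, 7), 1), ((2, 3, 6, 7), 1), ((2, 3, 4, 5), 1),
             ((1, 3, 5, 7), 1), ((1, 3, 4, 6), -1), ((1, 2, 5, 6), -1),
             ((1, 2, 4, 7), -1)]


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def antisym(terms):
    """All components (0-based index tuples) of the antisymmetric tensor."""
    comps = {}
    for idx, coeff in terms:
        for p in permutations(range(len(idx))):
            comps[tuple(idx[k] - 1 for k in p)] = coeff * perm_sign(p)
    return comps


PHI, PSI = antisym(PHI_TERMS), antisym(PSI_TERMS)
N = range(7)
TRIPLES = list(combinations(N, 3))   # 35 independent three-form components


def phi(a, b, c):
    return PHI.get((a, b, c), 0)


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


def three_i_phi(h):
    """3 * i_phi(h) on sorted triples; the factor 3 avoids fractions."""
    return [sum(h[d][a] * phi(b, c, d) + h[d][b] * phi(c, a, d)
                + h[d][c] * phi(a, b, d) for d in N) for (a, b, c) in TRIPLES]


def sym_basis():
    """The 28 elementary symmetric matrices E_ab + E_ba (a <= b)."""
    basis = []
    for a in N:
        for b in range(a, 7):
            h = [[0] * 7 for _ in N]
            h[a][b] = h[b][a] = 1
            basis.append(h)
    return basis


# Two-forms e_k contracted into phi_0, one for each basis vector e_k.
SEVEN = [[[phi(k, d, a) for a in N] for d in N] for k in N]

sym_images = [three_i_phi(h) for h in sym_basis()]
seven_images = [three_i_phi(alpha) for alpha in SEVEN]
print(rank(sym_images), rank(sym_images + seven_images))
```

The first rank shows that the map is injective on all of $\mathrm{Sym}^2(V^*)$, and the second that the images of the symmetric tensors and of $\Lambda^2_7$ together span the whole 35-dimensional space of three-forms.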
So overall we thus have a decomposition of three-forms into irreducible
representations of $G_2$:

Theorem 7 ([13]). The space $\Lambda^3$ of three-forms on $V$ decomposes as

\[ \Lambda^3 = \Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27} \tag{2.26} \]

where

\[ \Lambda^3_1 = \{ \chi \in \Lambda^3 : \chi_{abc} = f \varphi_{0abc} \text{ for scalar } f \}, \tag{2.27a} \]
\[ \Lambda^3_7 = \{ \omega \lrcorner \psi_0 : \omega \text{ a vector} \}, \tag{2.27b} \]
\[ \Lambda^3_{27} = \{ \chi \in \Lambda^3 : \chi_{abc} = h^{d}{}_{[a} \varphi_{0bc]d} \text{ for } h_{ab} \text{ traceless, symmetric} \}. \tag{2.27c} \]

From the identities for contraction of $\varphi_0$ and $\psi_0$, it is possible to see that an
equivalent description of $\Lambda^3_{27}$ is

\[ \Lambda^3_{27} = \{ \chi \in \Lambda^3 : \chi \wedge \varphi_0 = 0 \text{ and } \chi \wedge \psi_0 = 0 \}. \]

A similar decomposition of four-forms is again obtained via Hodge duality.


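Suppose we have $\chi \in \Lambda^3$; then define $\pi_1$, $\pi_7$ and $\pi_{27}$ to be the projections of $\chi$ onto
$\Lambda^3_1$, $\Lambda^3_7$ and $\Lambda^3_{27}$, respectively. Using contraction identities for $\varphi_0$ and $\psi_0$, we get the
following relations [21]:

Proposition 8. Given a three-form $\chi \in \Lambda^3$, the projections of $\chi$ onto the
components (2.26) of $\Lambda^3$ are given by:
\[ \pi_1(\chi) = a \varphi_0 \quad \text{where } a = \tfrac{1}{42} \chi_{abc} \varphi_0{}^{abc} = \tfrac{1}{7} \langle \chi, \varphi_0 \rangle, \text{ with } |\pi_1(\chi)|^2 = 7a^2, \tag{2.28a} \]
\[ \pi_7(\chi) = \omega \lrcorner \psi_0 \quad \text{where } \omega^{a} = -\tfrac{1}{24} \chi_{mnp} \psi_0{}^{mnpa}, \text{ with } |\pi_7(\chi)|^2 = 4|\omega|^2, \tag{2.28b} \]
\[ \pi_{27}(\chi) = i_\varphi(h) \quad \text{where } h_{ab} = \tfrac{3}{4} \chi_{mn\{a} \varphi_{0b\}}{}^{mn}, \text{ with } |\pi_{27}(\chi)|^2 = \tfrac{2}{9} |h|^2. \tag{2.28c} \]
Here $\{a\, b\}$ denotes the traceless symmetric part.

Note that similar projections can be defined for four-forms as well.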

3. $G_2$ Structures
3.1. Definition
As we shall see, the notion of holonomy is closely related to $G$-structures on
manifolds. Let us give the necessary definitions.

Definition 9. Let $X$ be a manifold of dimension $n$. Suppose $TX$ is the tangent
bundle over $X$. Define the manifold $F$ by

\[ F = \{ (x, e_1, \ldots, e_n) : x \in X \text{ and } (e_1, \ldots, e_n) \text{ is a basis for } T_x X \}. \]

This then has a projection $\pi : (x, e_1, \ldots, e_n) \mapsto x$ onto $X$ and a natural left action
by $GL(n, \mathbb{R})$ on the fibers. $F$ is thus a principal bundle over $X$ with fiber $GL(n, \mathbb{R})$,
called the frame bundle of $X$.

Definition 10. Let $X$ be a manifold of dimension $n$. Let $G$ be a Lie subgroup of
$GL(n, \mathbb{R})$. Then a $G$-structure on $X$ is a principal subbundle $P$ of $F$ with fiber $G$.
The framework of $G$-structures is very powerful, and a number of geometrical
structures can be reformulated in this language. In particular, a Riemannian metric
on a manifold is equivalent to an $O(n)$ structure. We are in particular interested
in torsion-free $G$-structures. A $G$-structure is torsion-free if and only if there exists
a compatible torsion-free connection $\nabla$ on $TX$. A connection $\nabla$ on $TX$ is equivalent
to a connection $D$ on the frame bundle $F$, and we say $\nabla$ is compatible with the
$G$-structure $P$ if $D$ reduces to a connection on $P$. For example, given a Riemannian
metric, a unique torsion-free Levi-Civita connection can always be defined, hence all
$O(n)$ structures are torsion-free. On a complex manifold with complex dimension $m$,
an integrable complex structure is equivalent to a torsion-free $GL(m, \mathbb{C})$ structure.
A Kähler structure is then equivalent to a torsion-free $U(m)$-structure. From [30],
we have a key result that relates torsion-free structures and holonomy:
Proposition 11. Let $(X, g)$ be a Riemannian manifold of dimension $n$, with
$O(n)$-structure $P$ corresponding to $g$. Let $G$ be a Lie subgroup of $O(n)$. Then
$\mathrm{Hol}(g) \subseteq G$ if and only if $X$ admits a torsion-free $G$-structure $Q$ that is a
subbundle of $P$.
As Proposition 11 shows, the study of Riemannian holonomy is equivalent to
studying torsion-free $G$-structures. Hence in order to study $G_2$ holonomy manifolds
we will first consider $G_2$ structures.
Now suppose $X$ is a smooth, oriented 7-dimensional manifold. Following
Joyce [30], define a three-form $\varphi$ to be positive if locally we can choose a frame
such that $\varphi$ is written in the form (2.9), that is, for every $p \in X$ there is an
oriented isomorphism $q_p$ between $T_p X$ and $\mathbb{R}^7$ such that $\varphi|_p = q_p^{*}(\varphi_0)$. For each
$p \in X$ define $\mathcal{P}^3_p X$ to be the set of such three-forms. To each positive $\varphi$ we can
associate a metric $g$ and a Hodge dual $*\varphi$, which are identified with $g_0$ and $\psi_0$
under $q_p$, so that the associated metric takes the form (2.10).
Since $\varphi_0$ is preserved by $G_2$ and $GL(7, \mathbb{R})_+$ acts transitively on $\mathcal{P}^3_p X$, it follows
that
\[ \mathcal{P}^3_p X \cong GL(7, \mathbb{R})_+ / G_2 \]
and hence $\dim \mathcal{P}^3_p X = \dim GL(7, \mathbb{R})_+ - \dim G_2 = 49 - 14 = 35$. This is equal to
the dimension of $\Lambda^3 T_p^* X$, hence $\mathcal{P}^3_p X$ is an open subset of $\Lambda^3 T_p^* X$. Moreover if we
consider the bundle $\mathcal{P}^3 X$ over $X$ with fiber $\mathcal{P}^3_p X$, it will be an open subbundle of
$\Lambda^3 T^* X$.
Given a positive three-form $\varphi$ on $X$, consider at each point $p$ the set $Q_p$ of
isomorphisms $q_p$ between $T_p X$ and $\mathbb{R}^7$ such that $\varphi|_p = q_p^{*}(\varphi_0)$. It is then easy to see
that $Q_p \cong G_2$ and that the bundle $Q$ over $X$ with fiber $Q_p$ is in fact a principal
subbundle of the frame bundle $F$. So in fact, $Q$ is a $G_2$ structure. The converse is
also true: given an oriented $G_2$ structure $Q$, we can uniquely define a positive
three-form $\varphi$ and associated metric $g$ and four-form $\psi$ that correspond to $\varphi_0$, $g_0$ and
$\psi_0$, respectively. We thus have a key result:
Theorem 12 ([30]). Let $X$ be an oriented seven-dimensional manifold. There
exists a one-to-one correspondence between positive three-forms on $X$ and oriented
$G_2$-structures $Q$ on $X$. Moreover, to each positive three-form $\varphi$ we can associate a
Riemannian metric $g$ and a corresponding four-form $\psi = *\varphi$ such that for each $p \in X$,
under the isomorphism $q_p : T_p X \to \mathbb{R}^7$, these quantities are identified with $\varphi_0$, $g_0$
and $\psi_0$, respectively.
So given a positive three-form $\varphi$ on $X$, it is possible to define a metric $g$
associated to $\varphi$. This metric then defines the Hodge star, which we denote by $*_\varphi$ to
emphasize the dependence on $\varphi$. Given the Hodge star, we can in turn define the
four-form $\psi = *_\varphi \varphi$. Thus in fact both the metric $g$ and the four-form $\psi$ are
functions of $\varphi$. By definition, at each point $p \in X$ there is an isomorphism that identifies
$\varphi$ with $\varphi_0$, $\psi$ with $\psi_0$ and $g$ with $g_0$. Therefore, properties of $\varphi_0$ and $\psi_0$ such as
the contraction identities (2.14) that we encountered in Sec. 2.1 also hold for the
differential forms $\varphi$ and $\psi$.
In general, any $G$-structure on a manifold $X$ induces a splitting of bundles of $p$-forms into subbundles corresponding to irreducible representations of $G$. The same is of course true for $G_2$ structures. The decomposition of $p$-forms on $\mathbb{R}^7$ carries over to any manifold with a $G_2$ structure, so from the previous section we have the following decomposition of the spaces of $p$-forms $\Lambda^p$:
$$\Lambda^1 = \Lambda^1_7, \tag{3.1a}$$
$$\Lambda^2 = \Lambda^2_7 \oplus \Lambda^2_{14}, \tag{3.1b}$$
$$\Lambda^3 = \Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27}, \tag{3.1c}$$
$$\Lambda^4 = \Lambda^4_1 \oplus \Lambda^4_7 \oplus \Lambda^4_{27}, \tag{3.1d}$$
$$\Lambda^5 = \Lambda^5_7 \oplus \Lambda^5_{14}, \tag{3.1e}$$
$$\Lambda^6 = \Lambda^6_7. \tag{3.1f}$$
Here each $\Lambda^p_k$ corresponds to the $k$-dimensional irreducible representation of $G_2$. Moreover, for each $k$ and $p$, $\Lambda^p_k$ and $\Lambda^{7-p}_k$ are isomorphic to each other via Hodge duality, and also the $\Lambda^p_7$ are isomorphic to each other for $p = 1, 2, \ldots, 6$.
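A quick consistency check of (3.1) is that the dimensions of the components of each $\Lambda^p$ add up to the binomial coefficient $\binom{7}{p}$, and that Hodge-dual degrees carry the same components. The sketch below is only an illustration of this counting; the dictionary and function names are ours, not part of the original text.

```python
from math import comb

# Dimensions of the G2-irreducible pieces of Lambda^p on a 7-manifold, read off (3.1).
DECOMPOSITION = {
    1: [7],
    2: [7, 14],
    3: [1, 7, 27],
    4: [1, 7, 27],
    5: [7, 14],
    6: [7],
}

def dimensions_match(decomposition):
    """True if the pieces of each Lambda^p add up to the binomial coefficient C(7, p)."""
    return all(sum(parts) == comb(7, p) for p, parts in decomposition.items())

def hodge_dual_consistent(decomposition):
    """True if Lambda^p and Lambda^(7-p) contain the same irreducible pieces."""
    return all(sorted(decomposition[p]) == sorted(decomposition[7 - p])
               for p in decomposition)

print(dimensions_match(DECOMPOSITION), hodge_dual_consistent(DECOMPOSITION))
```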
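Define the standard inner product on $\Lambda^p$, so that for $p$-forms $\alpha$ and $\beta$,
$$\langle\alpha,\beta\rangle = \frac{1}{p!}\,\alpha_{a_1\cdots a_p}\beta^{a_1\cdots a_p}. \tag{3.2}$$
This is related to the Hodge star, since
$$\alpha\wedge *\beta = \langle\alpha,\beta\rangle\,\mathrm{vol} \tag{3.3}$$
where vol is the invariant volume form given locally by
$$\mathrm{vol} = \sqrt{\det g}\;dx^1\wedge\cdots\wedge dx^7. \tag{3.4}$$
Then the decompositions (3.1) are orthogonal with respect to (3.2). Note that $\langle\varphi,\varphi\rangle = 7$, so in fact we have
$$V = \frac{1}{7}\int_X \varphi\wedge *\varphi \tag{3.5}$$
where $V$ is the volume of the manifold $X$.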

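We know that the metric $g$ is defined by the three-form $\varphi$ and we can use some of the results from Sec. 2.1 to find a direct relationship between the two quantities.

Proposition 13. Given a positive three-form $\varphi$ on a seven-manifold $X$, the associated metric $g$ is given by
$$g_{ab} = (\det s)^{-\frac{1}{9}}\, s_{ab} \tag{3.6}$$
with
$$s_{ab} = \frac{1}{144}\,\varphi_{amn}\varphi_{bpq}\varphi_{rst}\,\hat\varepsilon^{mnpqrst} \tag{3.7}$$
where $\hat\varepsilon^{mnpqrst}$ is the alternating symbol with $\hat\varepsilon^{12\ldots7} = +1$. Alternatively, for $u$, $v$ vector fields on $X$,
$$\langle u, v\rangle\,\mathrm{vol} = \frac{1}{6}\,(u\lrcorner\varphi)\wedge(v\lrcorner\varphi)\wedge\varphi \tag{3.8}$$
where $\lrcorner$ denotes interior multiplication: $(u\lrcorner\varphi)_{bc} = u^a\varphi_{abc}$.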

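Proof. Consider the quantity $P_{ab}$ given by
$$P_{ab} = \varphi_{amn}\varphi_{bpq}\psi^{mnpq}.$$
Using identities (2.14) to contract $\varphi$ and $\psi$, this gives
$$P_{ab} = 24\,g_{ab}.$$
Expanding $\psi^{mnpq}$ in terms of $\varphi$ and the Levi-Civita tensor $\varepsilon$ we get
$$P_{ab} = \frac{1}{6}\,\varphi_{amn}\varphi_{bpq}\varphi_{rst}\,\varepsilon^{mnpqrst}.$$
If we write $\hat\varepsilon^{mnpqrst}$ for the alternating symbol with $\hat\varepsilon^{12\ldots7} = +1$, so that $\varepsilon^{mnpqrst} = \hat\varepsilon^{mnpqrst}/\sqrt{\det g}$, then we get
$$g_{ab}\sqrt{\det g} = \frac{1}{144}\,\varphi_{amn}\varphi_{bpq}\varphi_{rst}\,\hat\varepsilon^{mnpqrst}. \tag{3.9}$$
Alternatively, let $u$ and $v$ be vector fields on $X$. Then
$$\langle u, v\rangle\sqrt{\det g} = \frac{1}{144}\,(u^a\varphi_{amn})(v^b\varphi_{bpq})\varphi_{rst}\,\hat\varepsilon^{mnpqrst}.$$
Hence we get (3.8). Now define
$$s_{ab} = \frac{1}{144}\,\varphi_{amn}\varphi_{bpq}\varphi_{rst}\,\hat\varepsilon^{mnpqrst}$$
so that $s_{ab} = g_{ab}\sqrt{\det g}$. Taking the determinant of (3.9) gives $\det s = (\det g)^{9/2}$, and substituting $\sqrt{\det g} = (\det s)^{1/9}$ back into (3.9) we get (3.6).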
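Proposition 13 can be checked numerically on the flat model. The sketch below assumes the standard three-form $\varphi_0 = e^{123} + e^{145} + e^{167} + e^{246} - e^{257} - e^{347} - e^{356}$ on $\mathbb{R}^7$ (the Bryant/Joyce sign convention, which is our assumption since Sec. 2.1 is not reproduced here) and evaluates (3.7) by brute force. With this convention one expects $s_{ab} = \delta_{ab}$, so $\det s = 1$ and (3.6) returns the Euclidean metric; the same component table also gives $\langle\varphi_0,\varphi_0\rangle = 7$. All names in the code are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Assumed flat model (Bryant/Joyce sign convention), not quoted from this section:
# phi_0 = e^123 + e^145 + e^167 + e^246 - e^257 - e^347 - e^356.
PHI0_TERMS = {
    (1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
    (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1,
}

def perm_sign(seq):
    """Sign of the permutation that sorts seq (entries assumed distinct)."""
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return -1 if inversions % 2 else 1

def components(terms):
    """Expand coefficients on sorted index triples into all antisymmetric components."""
    return {p: c * perm_sign(p) for idx, c in terms.items() for p in permutations(idx)}

def s_matrix(terms):
    """s_ab = (1/144) phi_amn phi_bpq phi_rst eps^{mnpqrst}, cf. (3.7), with eps^{1..7} = +1."""
    phi = components(terms)
    # rows[a] lists (m, n, phi_amn) for the nonzero components with first index a.
    rows = {a: [(m, n, v) for (i, m, n), v in phi.items() if i == a] for a in range(1, 8)}
    s = []
    for a in range(1, 8):
        row = []
        for b in range(1, 8):
            total = 0
            for m, n, v1 in rows[a]:
                for p, q, v2 in rows[b]:
                    for (r, u, t), v3 in phi.items():
                        idx = (m, n, p, q, r, u, t)
                        if len(set(idx)) == 7:  # eps vanishes on repeated indices
                            total += v1 * v2 * v3 * perm_sign(idx)
            row.append(Fraction(total, 144))
        s.append(row)
    return s

S = s_matrix(PHI0_TERMS)
# <phi, phi> = (1/3!) phi_abc phi^abc with the flat metric.
NORM_SQUARED = Fraction(sum(v * v for v in components(PHI0_TERMS).values()), 6)
print(S[0][0], S[0][1], NORM_SQUARED)
```

The brute-force sum only visits the nonzero components of $\varphi_0$, so it stays small enough to run in well under a second.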

Thus we see that even though given the three-form $\varphi$ we can define the metric $g$, this relationship is rather complicated and nonlinear. In particular, this also shows that $\psi = *\varphi$ depends on $\varphi$ in an even more non-trivial fashion, since the Hodge star itself depends on the metric.
Here we need to say a few words about the notation used for the $G_2$ three-form $\varphi$ and the associated four-form $\psi$. The notation that we use here is due to

Authors Three-form Dual Four-form References


Beasley and Witten;
[7, 22]
Gukov, Yau and Zaslow
1 1
Bryant = eijk
6 ijk
=
24 ijkl
eijkl [12, 13]
Hitchin; Lee and Leung = [26, 37]
Joyce () = [2830]
Karigiannis; Karigiannis and Leung;
= [21, 31, 32, 34]
Grigorian and Yau

Fig. 4. Notation that is used by dierent authors.

Karigiannis, where the Hodge dual of $\varphi$ is denoted by $\psi$; it was first introduced in [31]. In Fig. 4, we summarize the different notations used by other authors, where $e^{ijk} = e^i\wedge e^j\wedge e^k$ and $e^{ijkl} = e^i\wedge e^j\wedge e^k\wedge e^l$ for basis covectors $e^i$.

3.2. Torsion-free structures


The definition of a $G_2$ structure only defines the algebraic properties of $\varphi$, and in general does not address the analytical properties of $\varphi$. Using the associated metric $g$ we can define the Levi-Civita connection $\nabla$ on $X$. Then it is natural to ask what are the properties of $\nabla\varphi$. This quantity is known as the torsion of the $G_2$ structure. Originally the torsion of $G_2$ structures was studied by Fernández and Gray [18], and their analysis revealed that there are in fact a total of 16 torsion classes of $G_2$ structures. Later on, Karigiannis reproduced their results using simple computational arguments [32].
Following [32], consider the three-form $\nabla_X\varphi$ for some vector field $X$. We know that three-forms split as $\Lambda^3_1\oplus\Lambda^3_7\oplus\Lambda^3_{27}$, so consider the projections $\pi_1$, $\pi_7$ and $\pi_{27}$ of $\nabla_X\varphi$ onto these components. Using (2.28), we have
$$\pi_1(\nabla_X\varphi) = a\,\varphi$$
where
$$a = X^a(\nabla_a\varphi_{bcd})\varphi^{bcd} = X^a\nabla_a(\varphi_{bcd}\varphi^{bcd}) - \varphi_{bcd}X^a\nabla_a\varphi^{bcd} = -X^a(\nabla_a\varphi_{bcd})\varphi^{bcd} = 0.$$
Hence we see that the $\Lambda^3_1$ component vanishes. Similarly, for $\Lambda^3_{27}$ we have
$$\pi_{27}(\nabla_X\varphi) = i_\varphi(h)$$
where
$$h_{ab} = \frac{3}{4}(X^c\nabla_c\varphi_{mn\{a})\varphi_{b\}}{}^{mn} = \frac{3}{4}X^c\nabla_c(\varphi_{mn\{a}\varphi_{b\}}{}^{mn}) - \frac{3}{4}\varphi_{mn\{a}X^c\nabla_c\varphi_{b\}}{}^{mn} = -\frac{3}{4}(X^c\nabla_c\varphi_{mn\{a})\varphi_{b\}}{}^{mn} = 0.$$

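Here we have used the fact that $\varphi_{mna}\varphi_b{}^{mn} = 6g_{ab}$, the traceless part of which vanishes. Therefore, the $\Lambda^3_{27}$ part of $\nabla_X\varphi$ also vanishes. Now consider the $\Lambda^3_7$ component. In this case,
$$\pi_7(\nabla_X\varphi) = \omega\lrcorner\psi$$
where
$$\omega^a = \frac{1}{24}X^c(\nabla_c\varphi_{mnp})\psi^{mnpa} = -\frac{1}{24}X^c(\nabla_c\psi^{mnpa})\varphi_{mnp},$$
the second equality following from $\varphi_{mnp}\psi^{mnpa} = 0$. This quantity does not vanish in general, so we can conclude that
$$\nabla_X\varphi \in \Lambda^3_7 \tag{3.10}$$
and thus overall,
$$\nabla\varphi \in W = \Lambda^1_7\otimes\Lambda^3_7. \tag{3.11}$$
Further classification of torsion classes depends on the decomposition of $W$ into components according to irreducible representations of $G_2$. Given (3.11), we can write
$$\nabla_a\varphi_{bcd} = T_a{}^e\psi_{ebcd} \tag{3.12}$$
where $T_{ab}$ is the full torsion tensor. This two-tensor fully defines $\nabla\varphi$ since pointwise, it has 49 components and the space $W$ is also 49-dimensional (pointwise). In general we can split $T_{ab}$ as
$$T = \tau_1 g + \tau_7\lrcorner\varphi + \tau_{14} + \tau_{27} \tag{3.13}$$
where $\tau_1$ is a function, and gives the 1 component of $T$, $\tau_7\lrcorner\varphi \in \Lambda^2_7$ and hence gives the 7 component, $\tau_{14} \in \Lambda^2_{14}$ gives the 14 component and $\tau_{27}$ is traceless symmetric, giving the 27 component. Note that the normalization of these components is different from [32]. Hence we can split $W$ as
$$W = W_1\oplus W_7\oplus W_{14}\oplus W_{27}. \tag{3.14}$$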

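The 16 torsion classes arise as the subsets of $W$ which $\nabla\varphi$ belongs to. Moreover, as shown in [32], the torsion components $\tau_i$ relate directly to the expression for $d\varphi$ and $d\psi$. In fact, in our notation,
$$d\varphi = 4\tau_1\psi + 3\tau_7\wedge\varphi - *\tau_{27}, \tag{3.15a}$$
$$d\psi = 4\tau_7\wedge\psi - 2\,{*\tau_{14}}. \tag{3.15b}$$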
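The counting behind the "16 torsion classes" is simply that each of the four components in (3.14) may or may not be present, and the dimensions of the four components must add up to the 49 components of $T_{ab}$. A minimal sketch of this bookkeeping follows; the labels are ours.

```python
from itertools import combinations

# Dimensions of the four torsion components in W = W1 + W7 + W14 + W27, cf. (3.14).
DIMENSIONS = {"W1": 1, "W7": 7, "W14": 14, "W27": 27}

def torsion_classes(components=tuple(DIMENSIONS)):
    """All subsets of the four components; each subset labels one torsion class."""
    return [subset
            for size in range(len(components) + 1)
            for subset in combinations(components, size)]

CLASSES = torsion_classes()
print(len(CLASSES))              # number of torsion classes
print(sum(DIMENSIONS.values()))  # pointwise dimension of W
```

The empty subset corresponds to the torsion-free case $T = 0$, and the full set to a generic $G_2$ structure.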

Now suppose $d\varphi = d\psi = 0$. Then this means that all four torsion components vanish and hence $T = 0$, and as a consequence $\nabla\varphi = 0$. The converse is trivially true, since $d\varphi$ and $d\psi$ can both be expressed in terms of the covariant derivative. This result is due to Fernández and Gray [18]. If we add the fact that Hol($g$) is a subgroup of $G$ if and only if $X$ admits a torsion-free $G$ structure from Proposition 11, then we get the following important result.
Theorem 14 ([30, Proposition 10.1.3]). Let $X$ be a seven-manifold with a $G_2$ structure defined by the three-form $\varphi$ and equipped with the associated Riemannian metric $g$. Then the following are equivalent:

(1) The $G_2$-structure is torsion-free;
(2) Hol($g$) $\subseteq G_2$ and $\varphi$ is the induced three-form;
(3) $\nabla\varphi = 0$ on $X$ where $\nabla$ is the Levi-Civita connection of $g$;
(4) $d\varphi = d\psi = 0$ where $\psi = *\varphi$ with the Hodge star defined by $g$.

Different torsion classes of the $G_2$ structure also restrict the curvature of the manifold. Consider the curvature tensor $R_{abcd}$. Then for fixed $a$, $b$, we have
$$(R_{ab})_{cd} \in \Lambda^2,$$
so we can decompose it as
$$(R_{ab})_{cd} = (\pi_7R_{ab})_{cd} + (\pi_{14}R_{ab})_{cd}. \tag{3.16}$$
Following Karigiannis [32], consider the operator $T_\psi$ (2.22) acting on $R_{abcd}$. Then we have
$$g^{ad}\,T_\psi R_{abcd} = R_{abef}\psi^{ef}{}_{cd}\,g^{ad} = -(R_{beaf} + R_{eabf})\psi^{ef}{}_{cd}\,g^{ad} = -R_{beaf}\psi^{e}{}_{c}{}^{af} + R_{fbea}\psi^{eaf}{}_{c} = -2g^{ad}\,T_\psi R_{abcd} = 0$$
where we have used the cyclic identity for $R_{abef}$. Hence, from (2.23) we get
$$\mathrm{Ric}_{bd} = 3(\pi_7R_{ab})_{cd}\,g^{ac} = \frac{3}{2}(\pi_{14}R_{ab})_{cd}\,g^{ac} \tag{3.17}$$
where $\mathrm{Ric}_{bd}$ is the Ricci tensor. However, in general, by the Ambrose-Singer holonomy theorem [5], if Hol($g$) $\subseteq G$, then $R_{abcd} \in \mathrm{Sym}^2(\mathfrak{g})$ where $\mathfrak{g}$ is the Lie algebra of $G$. Therefore, in the $G_2$ case, if the $G_2$ structure is torsion-free and hence Hol($g$) $\subseteq G_2$, then $R_{abcd} \in \mathrm{Sym}^2(\mathfrak{g}_2)$. This however implies that in (3.16), the $\pi_7$ component vanishes, and thus from (3.17), we have the following result:
Theorem 15 ([10]). Let $X$ be a Riemannian seven-manifold with metric $g$. If Hol($g$) $\subseteq G_2$, then $X$ is Ricci-flat.
In fact, this result can also be derived without invoking the general Ambrose-Singer theorem. In [32], Karigiannis expressed the $\Lambda^2_7$ component of the curvature tensor in terms of the torsion tensor $T_{ab}$, so that when the torsion vanishes, the curvature tensor is fully contained in $\Lambda^2_{14}$, thus directly confirming the Ambrose-Singer theorem in the $G_2$ case. The original proof of Theorem 15 due to Bonan [10] relied on the fact that the Lie algebra structure of $\mathfrak{g}_2$ imposes strong conditions on the Riemann tensor, and that these force the Ricci tensor to vanish.
Given a compact manifold with a torsion-free $G_2$ structure, the decompositions (3.1) carry over to de Rham cohomology [30], so that we have
$$H^1(X,\mathbb{R}) = H^1_7, \tag{3.18a}$$
$$H^2(X,\mathbb{R}) = H^2_7\oplus H^2_{14}, \tag{3.18b}$$
$$H^3(X,\mathbb{R}) = H^3_1\oplus H^3_7\oplus H^3_{27}, \tag{3.18c}$$
$$H^4(X,\mathbb{R}) = H^4_1\oplus H^4_7\oplus H^4_{27}, \tag{3.18d}$$
$$H^5(X,\mathbb{R}) = H^5_7\oplus H^5_{14}, \tag{3.18e}$$
$$H^6(X,\mathbb{R}) = H^6_7. \tag{3.18f}$$
Define the refined Betti numbers $b^p_k = \dim(H^p_k)$. Clearly, $b^3_1 = b^4_1 = 1$ and we also have $b^1 = b^k_7$ for $k = 1, \ldots, 6$. Moreover, it turns out that if Hol($X$, $g$) $= G_2$ then $b^1 = 0$. Therefore, in this case the $H^k_7$ component vanishes in (3.18). It can be easily shown that on a Ricci-flat manifold, any harmonic one-form must be parallel. However, this happens if and only if Hol($g$) has an invariant one-form, and the only $G_2$-invariant forms are $\varphi$ and $\psi$. Therefore there are no non-trivial harmonic one-forms when Hol($g$) $= G_2$ and thus $b^1 = 0$.
An example of a construction of a manifold with a torsion-free $G_2$ structure is to consider $X = Y\times S^1$ where $Y$ is a Calabi-Yau three-fold. Define the metric and a three-form on $X$ as
$$g_X = d\theta^2\times g_Y, \tag{3.19}$$
$$\varphi = d\theta\wedge\omega + \mathrm{Re}\,\Omega, \tag{3.20}$$
where $\theta$ is the coordinate on $S^1$. This then defines a torsion-free $G_2$ structure, with
$$\psi = \frac{1}{2}\,\omega\wedge\omega - d\theta\wedge\mathrm{Im}\,\Omega. \tag{3.21}$$
However, the holonomy of $X$ in this case is $SU(3) \subset G_2$. From the Künneth formula we get the following relations between the refined Betti numbers of $X$ and the Hodge numbers of $Y$:
$$b^k_7 = 1 \quad\text{for } k = 1, \ldots, 6, \tag{3.22}$$
$$b^k_{14} = h^{1,1} - 1 \quad\text{for } k = 2, 5, \tag{3.23}$$
$$b^k_{27} = h^{1,1} + 2h^{2,1} \quad\text{for } k = 3, 4. \tag{3.24}$$
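The relations (3.22)-(3.24) can be cross-checked against the Künneth formula $b_k(Y\times S^1) = b_k(Y) + b_{k-1}(Y)$. The sketch below assumes the usual Betti numbers of a Calabi-Yau three-fold with full $SU(3)$ holonomy, $(1, 0, h^{1,1}, 2 + 2h^{2,1}, h^{1,1}, 0, 1)$, and uses the quintic three-fold ($h^{1,1} = 1$, $h^{2,1} = 101$) purely as an illustrative input; the function names are ours.

```python
def calabi_yau_betti(h11, h21):
    """Betti numbers b_0..b_6 of a Calabi-Yau three-fold with full SU(3) holonomy."""
    return [1, 0, h11, 2 + 2 * h21, h11, 0, 1]

def product_with_circle(betti):
    """Kunneth formula for Y x S^1: b_k(X) = b_k(Y) + b_{k-1}(Y)."""
    padded = [0] + list(betti) + [0]
    return [padded[k + 1] + padded[k] for k in range(len(betti) + 1)]

def refined_total(h11, h21):
    """Sum of the refined Betti numbers in each degree, using (3.22)-(3.24)
    together with b^3_1 = b^4_1 = 1."""
    b1 = {3: 1, 4: 1}
    b7 = {k: 1 for k in range(1, 7)}
    b14 = {k: h11 - 1 for k in (2, 5)}
    b27 = {k: h11 + 2 * h21 for k in (3, 4)}
    return {k: b1.get(k, 0) + b7[k] + b14.get(k, 0) + b27.get(k, 0)
            for k in range(1, 7)}

kunneth = product_with_circle(calabi_yau_betti(1, 101))
refined = refined_total(1, 101)
print(kunneth)
print(refined)
```

Both routes give the same Betti numbers in degrees 1 to 6, which is the content of (3.22)-(3.24).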
In [28-30], Joyce describes a possible construction of a smooth manifold with holonomy equal to $G_2$ from a Calabi-Yau manifold $Y$. So suppose $Y$ is a Calabi-Yau three-fold as above. Then suppose $\sigma : Y \to Y$ is an antiholomorphic isometric involution on $Y$, that is, $\sigma$ preserves the metric on $Y$ and satisfies
$$\sigma^2 = 1, \tag{3.25a}$$
$$\sigma^*(\omega) = -\omega, \tag{3.25b}$$
$$\sigma^*(\Omega) = \bar\Omega. \tag{3.25c}$$
Such an involution $\sigma$ is known as a real structure on $Y$. Define now a quotient given by
$$Z = (Y\times S^1)/\hat\sigma \tag{3.26}$$
where $\hat\sigma : Y\times S^1 \to Y\times S^1$ is defined by $\hat\sigma(y, \theta) = (\sigma(y), -\theta)$. The three-form $\varphi$ defined on $Y\times S^1$ by (3.20) is invariant under the action of $\hat\sigma$ and hence provides $Z$ with a $G_2$ structure. Similarly, the dual four-form $\psi$ given by (3.21) is also invariant. Generically, the action of $\sigma$ on $Y$ will have a non-empty fixed point set $N$, which is in fact a special Lagrangian submanifold of $Y$ [30]. This gives rise to orbifold singularities on $Z$. The singular set is two copies of $N$. It is conjectured that it is possible to resolve each singular point using an ALE four-manifold with holonomy $SU(2)$ in order to obtain a smooth manifold with holonomy $G_2$; however, the precise details of the resolution of these singularities are not known yet. We will therefore consider only freely acting involutions, that is, those without fixed points. Manifolds defined by (3.26) with a freely acting involution were called barely $G_2$ manifolds by Harvey and Moore in [25]. The cohomology of barely $G_2$ manifolds is expressed in terms of the cohomology of the underlying Calabi-Yau manifold $Y$:
$$H^2(Z) = H^2(Y)^+, \tag{3.27a}$$
$$H^3(Z) = H^2(Y)^-\oplus H^3(Y)^+. \tag{3.27b}$$
Here the superscripts $\pm$ refer to the $\pm$ eigenspaces of $\sigma^*$. Thus $H^2(Y)^+$ refers to two-forms on $Y$ which are invariant under the action of the involution $\sigma$ and correspondingly $H^2(Y)^-$ refers to two-forms which are odd under $\sigma$. Wedging an odd two-form on $Y$ with $d\theta$ gives an invariant three-form on $Y\times S^1$, and hence these forms, together with the invariant three-forms $H^3(Y)^+$ on $Y$, give the three-forms on the quotient space $Z$. Also note that $H^1(Z)$ vanishes, since the one-form $d\theta$ on $S^1$ is odd under $\hat\sigma$. Now, given a three-form on $Y$, its real part will be invariant under $\sigma$, hence $H^3(Y)^+$ is essentially the real part of $H^3(Y)$. Therefore the Betti numbers of $Z$ in terms of Hodge numbers of $Y$ are
$$b^1 = 0, \tag{3.28a}$$
$$b^2 = h^+_{1,1}, \tag{3.28b}$$
$$b^3 = h^-_{1,1} + h_{2,1} + 1. \tag{3.28c}$$
A class of barely $G_2$ manifolds that are constructed from complete intersection Calabi-Yau manifolds has recently been considered in [20], where the Betti numbers of all such manifolds have been calculated explicitly.
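The relations (3.28) are easy to package as a small helper. The sketch below takes the dimensions $h^\pm_{1,1}$ of the even and odd parts of $H^{1,1}(Y)$ and $h_{2,1}$ as inputs, fills in the remaining Betti numbers by Poincaré duality, and checks that the Euler characteristic vanishes as it must for a closed odd-dimensional manifold. The input values in the example are hypothetical and do not refer to a specific Calabi-Yau three-fold.

```python
def barely_g2_betti(h11_plus, h11_minus, h21):
    """Betti numbers b^0..b^7 of Z = (Y x S^1)/sigma-hat from (3.28) and Poincare duality."""
    b2 = h11_plus
    b3 = h11_minus + h21 + 1
    return [1, 0, b2, b3, b3, b2, 0, 1]

def euler_characteristic(betti):
    """Alternating sum of Betti numbers."""
    return sum((-1) ** k * b for k, b in enumerate(betti))

# Hypothetical Hodge data: h^{1,1} = 4 split as 1 even + 3 odd, h^{2,1} = 20.
example = barely_g2_betti(1, 3, 20)
print(example, euler_characteristic(example))
```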

Note that barely $G_2$ manifolds have holonomy $SU(3)\ltimes\mathbb{Z}_2$ while the first Betti number still vanishes. This shows that a vanishing first Betti number is necessary but not sufficient for Hol($g$) $= G_2$. In fact, as shown by Joyce in [28, 29], Hol($g$) $= G_2$ if and only if the fundamental group $\pi_1(X)$ is finite.
Let us briefly describe Joyce's construction of compact torsion-free manifolds with Hol($g$) $= G_2$. Here we follow [30]. On $T^7$ we can define a flat $G_2$ structure $(\varphi_0, g_0)$, similarly as on $\mathbb{R}^7$. Now suppose that $\Gamma$ is a finite group acting on $T^7$ that preserves the $G_2$ structure. Then we can define the orbifold $T^7/\Gamma$. The key to resolving the orbifold singularities is to consider appropriate Quasi Asymptotically Locally Euclidean (QALE) $G_2$ manifolds. These are seven-manifolds with a torsion-free $G_2$ structure that is asymptotic to the $G_2$ structure on $\mathbb{R}^7/G$ where $G$ is a finite subgroup of $G_2$. The orbifold $T^7/\Gamma$ is then resolved to obtain a smooth compact manifold. However, on the resolution the resulting $G_2$-structure is not necessarily torsion-free, so it is shown that it can be deformed to a torsion-free $G_2$ structure $(\tilde\varphi, \tilde g)$. Further, the fundamental group is calculated, and if it is finite, then Hol($\tilde g$) $= G_2$. Using this method, Joyce found 252 topologically distinct $G_2$ holonomy manifolds with unique pairs of Betti numbers $(b^2, b^3)$.

4. Moduli Space
4.1. Deformations of G2 structures
One of the interesting directions in the study of $G_2$ holonomy manifolds is the structure of the moduli space. Essentially, the idea is to consider the space of all torsion-free $G_2$ structures modulo diffeomorphisms on a manifold with fixed topology. The moduli space itself has an interesting geometry that may give further information about $G_2$ manifolds.
Currently, we can only say something about the very local structure of the $G_2$ moduli space. For this, we take a fixed $G_2$ structure and deform it slightly. The space of these deformations is the local moduli space. To study it, we thus need to understand the deformations of $G_2$ structures. Although we are mostly interested in deformations of torsion-free $G_2$ structures, many of the results are valid for any $G_2$ structures.
Our aim is to consider infinitesimal deformations of $\varphi$ of the form
$$\varphi \mapsto \varphi + \varepsilon\chi \tag{4.1}$$
for some three-form $\chi$. As we already know, the $G_2$ structure on $X$ and the corresponding metric $g$ are all determined by the invariant three-form $\varphi$. Hence, deformations of $\varphi$ will induce deformations of the metric. These deformations of the metric will then also affect the deformation of $\psi = *\varphi$. Theoretically, large deformations could also be considered, and in fact, as we shall see below, in some cases closed expressions can be obtained for large deformations. However, in that case it is difficult to determine the resulting torsion class of the new $G_2$ structure [31]. In order for the deformed $\varphi$ to define a new $G_2$ structure, the new $\varphi$ must also be a positive form (as per the definition of a $G_2$ structure). However, it is known [30] that the bundle of positive three-forms on $X$ is an open subbundle of $\Lambda^3T^*X$, so we can always find $\varepsilon$ small enough in order for the deformed $\varphi$ to be positive.
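Using the decomposition of three-forms (3.1c), we can split $\chi$ into $\Lambda^3_1$, $\Lambda^3_7$ and $\Lambda^3_{27}$ parts, and at first let us consider each one separately. As shown by Karigiannis in [31], metric deformations can be made explicit when the three-form deformations are either in $\Lambda^3_1$ or $\Lambda^3_7$. Let us first review some of these results. First suppose
$$\tilde\varphi = f\varphi. \tag{4.2}$$
We will also use the notation $\tilde\psi = \tilde{*}\tilde\varphi$ where $\tilde{*}$ is the Hodge star derived from the metric $\tilde g$ corresponding to $\tilde\varphi$. Then from (3.9) we get
$$\tilde g_{ab}\sqrt{\det\tilde g} = \frac{1}{144}\,\tilde\varphi_{amn}\tilde\varphi_{bpq}\tilde\varphi_{rst}\,\hat\varepsilon^{mnpqrst} = f^3g_{ab}\sqrt{\det g}. \tag{4.3}$$
After taking the determinant on both sides, we obtain
$$\det\tilde g = f^{\frac{14}{3}}\det g. \tag{4.4}$$
Substituting (4.4) into (4.3), we finally get
$$\tilde g_{ab} = f^{\frac{2}{3}}g_{ab}, \tag{4.5}$$
and hence
$$\tilde\psi = f^{\frac{4}{3}}\psi. \tag{4.6}$$
So, a scaling of $\varphi$ gives a conformal transformation of the metric. Hence deformations of $\varphi$ in the direction $\Lambda^3_1$ also give an infinitesimal conformal transformation. Suppose $f = 1 + \varepsilon a$; then to third order in $\varepsilon$, we can write
$$\tilde\psi = \left(1 + \frac{4}{3}a\varepsilon + \frac{2}{9}a^2\varepsilon^2 - \frac{4}{81}a^3\varepsilon^3 + O(\varepsilon^4)\right)\psi. \tag{4.7}$$
Given a torsion-free $G_2$ structure, $d\varphi = d\psi = 0$, so if we want the deformed structure to be also torsion-free, $f$ must be constant.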
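The exponents in (4.4)-(4.6) and the coefficients in (4.7) follow from elementary counting, which the following sketch reproduces with exact rational arithmetic. It assumes the ansatz $\tilde g = f^x g$ in seven dimensions, solves $x + 7x/2 = 3$ from (4.3), and obtains the exponent of $\tilde\psi$ from $\tilde\varphi\wedge\tilde\psi = 7\,\widetilde{\mathrm{vol}}$. The function names are illustrative only.

```python
from fractions import Fraction

def metric_scaling_exponent(dim=7, phi_power=3):
    """Solve x + dim*x/2 = phi_power for the ansatz g~ = f^x g, cf. (4.3)."""
    return Fraction(phi_power) / (1 + Fraction(dim, 2))

def binomial_series(alpha, order):
    """Coefficients c_k of (1 + t)^alpha = sum_k c_k t^k up to the given order."""
    coeffs = [Fraction(1)]
    for k in range(1, order + 1):
        coeffs.append(coeffs[-1] * (alpha - (k - 1)) / k)
    return coeffs

x = metric_scaling_exponent()            # exponent in (4.5)
det_exponent = 7 * x                     # exponent in (4.4)
psi_exponent = Fraction(7, 2) * x - 1    # phi~ ^ psi~ = 7 vol~  gives (4.6)
series = binomial_series(psi_exponent, 3)  # coefficients of (a*eps)^k in (4.7)
print(x, det_exponent, psi_exponent, series)
```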
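Now, suppose in general that $\tilde\varphi = \varphi + \varepsilon\chi$ for some $\chi \in \Lambda^3$. Then using (3.8) for the definition of the metric associated with $\tilde\varphi$, after some manipulations, we get:
$$\widetilde{\langle u, v\rangle}\,\widetilde{\mathrm{vol}} = \frac{1}{6}(u\lrcorner\varphi)\wedge(v\lrcorner\varphi)\wedge\varphi + \frac{1}{2}\varepsilon\left[(u\lrcorner\chi)\wedge{*(v\lrcorner\varphi)} + (v\lrcorner\chi)\wedge{*(u\lrcorner\varphi)}\right] + \frac{1}{2}\varepsilon^2(u\lrcorner\chi)\wedge(v\lrcorner\chi)\wedge\varphi + \frac{1}{6}\varepsilon^3(u\lrcorner\chi)\wedge(v\lrcorner\chi)\wedge\chi. \tag{4.8}$$
Rewriting (4.8) in local coordinates, we get
$$\frac{\sqrt{\det\tilde g}}{\sqrt{\det g}}\,\tilde g_{ab} = g_{ab} + \frac{1}{2}\varepsilon\,\chi_{mn(a}\varphi_{b)}{}^{mn} + \frac{1}{8}\varepsilon^2\chi_{amn}\chi_{bpq}\psi^{mnpq} + \frac{1}{24}\varepsilon^3\chi_{amn}\chi_{bpq}(*\chi)^{mnpq}. \tag{4.9}$$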
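Now suppose the deformation is in the $\Lambda^3_7$ direction. This implies that
$$\chi = \omega\lrcorner\psi \tag{4.10}$$
for some vector field $\omega$. Look at the first order term in (4.9). From (2.28) we see that this is essentially a projection onto $\Lambda^3_1\oplus\Lambda^3_{27}$: the traceless part gives the $\Lambda^3_{27}$ component and the trace gives the $\Lambda^3_1$ component. Hence this term vanishes for $\chi \in \Lambda^3_7$. For the third order term, it is more convenient to study it in (4.8). By looking at
$$\omega\lrcorner\left((u\lrcorner\chi)\wedge(v\lrcorner\chi)\wedge\chi\right) = 0$$
we immediately see that the third order term vanishes. So now we are left with
$$\tilde g_{ab}\sqrt{\det\tilde g} = \left(g_{ab} + \frac{1}{8}\varepsilon^2\omega^c\omega^d\psi_{camn}\psi_{dbpq}\psi^{mnpq}\right)\sqrt{\det g} = \left(g_{ab}(1 + \varepsilon^2|\omega|^2) - \varepsilon^2\omega_a\omega_b\right)\sqrt{\det g} \tag{4.11}$$
where we have used a contraction identity for $\psi$ twice. Taking the determinant of (4.11) gives
$$\sqrt{\det\tilde g} = (1 + \varepsilon^2|\omega|^2)^{\frac{2}{3}}\sqrt{\det g}. \tag{4.12}$$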
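The step from (4.11) to (4.12) rests on the determinant of a rank-one update: in an orthonormal frame the matrix $(1 + \varepsilon^2|\omega|^2)\delta_{ab} - \varepsilon^2\omega_a\omega_b$ has six eigenvalues $1 + \varepsilon^2|\omega|^2$ and one eigenvalue $1$, so its determinant is $(1 + \varepsilon^2|\omega|^2)^6$, and $(\det\tilde g)^{9/2} = (1 + \varepsilon^2|\omega|^2)^6\,(\det g)^{9/2}$ gives (4.12). The sketch below checks the determinant exactly for one sample vector; the flat background metric and the sample values are our assumptions.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def rescaled_metric(omega, eps):
    """Matrix (1 + eps^2 |w|^2) delta_ab - eps^2 w_a w_b from (4.11), flat background."""
    n = len(omega)
    x = eps * eps * sum(w * w for w in omega)
    return [[(1 + x) * (1 if a == b else 0) - eps * eps * omega[a] * omega[b]
             for b in range(n)] for a in range(n)]

OMEGA = [1, 2, 0, -1, 3, 0, 1]   # sample vector, |w|^2 = 16
EPS = Fraction(1, 2)             # so that eps^2 |w|^2 = 4
DET = determinant(rescaled_metric(OMEGA, EPS))
print(DET)
```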

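Eventually we have the following result:

Theorem 16 ([31]). Given a deformation of a $G_2$ structure (4.1) with $\chi = \omega\lrcorner\psi \in \Lambda^3_7$, then the new metric $\tilde g_{ab}$ is given by
$$\tilde g_{ab} = (1 + \varepsilon^2|\omega|^2)^{-\frac{2}{3}}\left(g_{ab}(1 + \varepsilon^2|\omega|^2) - \varepsilon^2\omega_a\omega_b\right) \tag{4.13}$$
and the deformed four-form $\tilde\psi$ is given by
$$\tilde\psi = (1 + \varepsilon^2|\omega|^2)^{-\frac{1}{3}}\left(\psi + \varepsilon\,{*(\omega\lrcorner\psi)} + \varepsilon^2\,\omega\lrcorner{*(\omega\lrcorner\varphi)}\right). \tag{4.14}$$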

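One of the key reasons why it is possible to get these closed form expressions for the modified $g$ and $\psi$ is that, as shown by Karigiannis in [31], the determinant of (4.11) can be calculated in a closed form. Notice that to first order in $\varepsilon$, both $\sqrt{\det g}$ and $g_{ab}$ remain unchanged under this deformation. Now let us examine the last term in (4.14) in more detail. Firstly, we have
$$\omega\lrcorner{*(\omega\lrcorner\varphi)} = *(\omega^\flat\wedge(\omega\lrcorner\varphi))$$
and
$$(\omega^\flat\wedge(\omega\lrcorner\varphi))_{mnp} = 3\,\omega_{[m}\omega^a\varphi_{|a|np]} = 3\,i_\varphi(\omega\circ\omega)_{mnp} \tag{4.15}$$
where $(\omega\circ\omega)_{ab} = \omega_a\omega_b$. Therefore, in (4.14), this term gives $\Lambda^4_1$ and $\Lambda^4_{27}$ components. Since $i_\varphi(g) = \varphi$, the trace part of $\omega\circ\omega$ contributes $\frac{3}{7}|\omega|^2\psi$, so we can write (4.14) as
$$\tilde\psi = (1 + \varepsilon^2|\omega|^2)^{-\frac{1}{3}}\left(\left(1 + \frac{3}{7}\varepsilon^2|\omega|^2\right)\psi + \varepsilon\,{*(\omega\lrcorner\psi)} + 3\varepsilon^2\,{*i_\varphi((\omega\circ\omega)_0)}\right). \tag{4.16}$$
Here $(\omega\circ\omega)_0$ denotes the traceless part of $\omega\circ\omega$, so that $i_\varphi((\omega\circ\omega)_0) \in \Lambda^3_{27}$ and thus, in (4.16), the components in different representations are now explicitly shown.
To first order, we thus have the deformations
$$\tilde\varphi = \varphi + \varepsilon(\omega\lrcorner\psi), \qquad \tilde\psi = \psi + \varepsilon\,{*(\omega\lrcorner\psi)}.$$
If originally $d\varphi = d\psi = 0$, that is, the $G_2$ structure is torsion-free, then for the deformed structure to be torsion-free to first order we need
$$d(\omega\lrcorner\psi) = d\,{*(\omega\lrcorner\psi)} = 0. \tag{4.17}$$
By expanding $d(\omega\lrcorner\psi)$ in terms of the decomposition of $\Lambda^4$, and setting each term individually to 0, we find that the symmetric part of $\nabla_a\omega_b$ and the $\Lambda^2_7$ part of $d\omega^\flat$ must vanish. Furthermore, by expanding $d\,{*(\omega\lrcorner\psi)}$ in terms of the decomposition of $\Lambda^2$ we find that the $\Lambda^2_{14}$ part of $d\omega^\flat$ must also vanish. Hence we get that $\nabla\omega = 0$. If Hol($g$) $= G_2$, then we know that in this case $\omega = 0$, so there are no interesting small $\Lambda^3_7$ deformations of manifolds with holonomy equal to $G_2$.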
As we have seen above, in the cases when the deformations were in $\Lambda^3_1$ or $\Lambda^3_7$ directions, there were some simplifications, which make it possible to write down all results in a closed form. In the case of deformations in $\Lambda^3_{27}$ the only known way to get results for deformations of the metric and the four-form is to consider the deformations order by order in $\varepsilon$. This analysis has been carried out in [21], and here we will review those results. So suppose we have a deformation
$$\tilde\varphi = \varphi + \varepsilon\chi$$
where $\chi \in \Lambda^3_{27}$. Now let us set up some notation. Define
$$\tilde s_{ab} = \frac{1}{144}\frac{1}{\sqrt{\det g}}\,\tilde\varphi_{amn}\tilde\varphi_{bpq}\tilde\varphi_{rst}\,\hat\varepsilon^{mnpqrst} \tag{4.18}$$
$$\phantom{\tilde s_{ab}} = \frac{\sqrt{\det\tilde g}}{\sqrt{\det g}}\,\tilde g_{ab}. \tag{4.19}$$
From (3.9), the untilded $s_{ab}$ is then just equal to $g_{ab}$. We can rewrite (4.19) as
$$\tilde g_{ab} = \frac{\sqrt{\det g}}{\sqrt{\det\tilde g}}\,(g_{ab} + \delta s_{ab}) \tag{4.20}$$
where $\tilde g_{ab}$ is the deformed metric and $\delta s_{ab}$ is the deformation of $s_{ab}$, which from (4.9) is given by
$$\delta s_{ab} = \frac{1}{2}\varepsilon\,\chi_{mn(a}\varphi_{b)}{}^{mn} + \frac{1}{8}\varepsilon^2\chi_{amn}\chi_{bpq}\psi^{mnpq} + \frac{1}{24}\varepsilon^3\chi_{amn}\chi_{bpq}(*\chi)^{mnpq}. \tag{4.21}$$
Also introduce the following short-hand notation
$$s_k = \mathrm{Tr}((\delta s)^k) \tag{4.22}$$
where the trace is taken using the original metric $g$. From (4.21), note that since $\chi \in \Lambda^3_{27}$, when taking the trace the first order term vanishes, and hence $s_1$ is at least second-order in $\varepsilon$. Clearly, for $k > 1$, $s_k$ are at least of order $k$ in $\varepsilon$. Similarly as before, take the determinant of (4.18):
$$\left(\frac{\det\tilde g}{\det g}\right)^{\frac{9}{2}} = \frac{\det(g + \delta s)}{\det(g)}. \tag{4.23}$$

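Unlike in the case of $\Lambda^3_7$ deformations, we cannot compute $\det(g + \delta s)$ in closed form, so we have to calculate it order by order in $\varepsilon$. From the standard expansion of $\det(I + X)$, we find
$$\frac{\det(g + \delta s)}{\det g} = 1 + s_1 + \frac{1}{2}(s_1^2 - s_2) + \frac{1}{6}(s_1^3 - 3s_1s_2 + 2s_3) + O(\varepsilon^4). \tag{4.24}$$
However, as we noted above, $s_1$ is second-order in $\varepsilon$, so this expression actually simplifies:
$$\frac{\det(g + \delta s)}{\det g} = 1 + \left(s_1 - \frac{1}{2}s_2 + \frac{1}{3}s_3\right) + O(\varepsilon^4). \tag{4.25}$$
Raising this to the power of $-\frac{1}{9}$, and expanding again to fourth order in $\varepsilon$, we get
$$\left(\frac{\det\tilde g}{\det g}\right)^{-\frac{1}{2}} = 1 + \left(\frac{1}{18}s_2 - \frac{1}{9}s_1 - \frac{1}{27}s_3\right) + O(\varepsilon^4). \tag{4.26}$$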
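The "standard expansion of $\det(I + X)$" used in (4.24) is the expression of the elementary symmetric polynomials of the eigenvalues in terms of the power sums $\mathrm{Tr}(X^k)$ (Newton's identities). For a $3\times3$ matrix the cubic truncation is exact, which makes it easy to test. The sketch below does this with rational arithmetic; the sample matrix is arbitrary and the helper names are ours.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def trace_expansion(x):
    """1 + s1 + (s1^2 - s2)/2 + (s1^3 - 3 s1 s2 + 2 s3)/6 with s_k = Tr(x^k), cf. (4.24)."""
    x2 = matmul(x, x)
    s1, s2, s3 = trace(x), trace(x2), trace(matmul(x2, x))
    return 1 + s1 + (s1 ** 2 - s2) / 2 + (s1 ** 3 - 3 * s1 * s2 + 2 * s3) / 6

X = [[Fraction(1, 2), Fraction(1, 3), Fraction(0)],
     [Fraction(1, 5), Fraction(-1, 4), Fraction(1, 7)],
     [Fraction(0), Fraction(2, 3), Fraction(1, 9)]]
IDENTITY_PLUS_X = [[X[i][j] + (1 if i == j else 0) for j in range(3)] for i in range(3)]
print(det3(IDENTITY_PLUS_X) == trace_expansion(X))
```

In seven dimensions the same formula is only the truncation at third order, which is all that is needed in (4.24) since $s_k = O(\varepsilon^k)$.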

Using this and (4.20), we can immediately get the deformed metric, but the expressions using the current form of $\delta s_{ab}$ are not very useful. So far, the only property of $\Lambda^3_{27}$ that we have used is that it is orthogonal to $\varphi$; thus in fact, up to this point everything applies to $\Lambda^3_7$ as well. Now however, let $\chi$ be of the form
$$\chi_{abc} = h^d{}_{[a}\varphi_{bc]d} \tag{4.27}$$
where $h_{ab}$ is traceless and symmetric, so that $\chi \in \Lambda^3_{27}$. Let us first introduce some further notation. Let $h_1$, $h_2$, $h_3$, $h_4$ be traceless, symmetric matrices, and introduce the following shorthand notation
$$(h_1\circ h_2)_{mn} = \varphi_{abm}\varphi_{den}\,h_1^{ad}h_2^{be}, \tag{4.28a}$$
$$h_1\circ h_2\circ h_3 = \varphi_{abc}\varphi_{def}\,h_1^{ad}h_2^{be}h_3^{cf}, \tag{4.28b}$$
$$(h_1\circ h_2\circ h_3)_{mn} = \psi_{abcm}\psi_{defn}\,h_1^{ad}h_2^{be}h_3^{cf}, \tag{4.28c}$$
$$h_1\circ h_2\circ h_3\circ h_4 = \psi_{abcm}\psi_{defn}\,h_1^{ad}h_2^{be}h_3^{cf}h_4^{mn}. \tag{4.28d}$$
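It is clear that all of these quantities are symmetric in the $h_i$ and moreover $(h_1\circ h_2)_{mn}$ and $(h_1\circ h_2\circ h_3)_{mn}$ are both symmetric in indices $m$ and $n$. Then, it can be shown that
$$\chi_{(a|mn|}\varphi_{b)}{}^{mn} = \frac{4}{3}h_{ab},$$
$$\chi_{amn}\chi_{bpq}\psi^{mnpq} = -\frac{4}{7}|\chi|^2g_{ab} + \frac{16}{9}(h^2)_{\{ab\}} - \frac{4}{9}(h\circ h)_{\{ab\}},$$
$$\chi_{amn}\chi_{bpq}(*\chi)^{mnpq} = \frac{32}{189}\mathrm{Tr}(h^3)g_{ab} - \frac{8}{9}(h\circ h^2)_{\{ab\}},$$
where as before $\{a\,b\}$ denotes the traceless symmetric part, and $|\chi|^2 = \frac{2}{9}\mathrm{Tr}(h^2)$. Using this and (4.21), we can now express $\delta s_{ab}$ in terms of $h$:
$$\delta s_{ab} = \frac{2}{3}\varepsilon h_{ab} + g_{ab}\left(-\frac{1}{63}\varepsilon^2\mathrm{Tr}(h^2) + \frac{4}{567}\varepsilon^3\mathrm{Tr}(h^3)\right) + \varepsilon^2\left(\frac{2}{9}(h^2)_{\{ab\}} - \frac{1}{18}(h\circ h)_{\{ab\}}\right) - \frac{\varepsilon^3}{27}(h\circ h^2)_{\{ab\}} \tag{4.29}$$
and hence
$$s_1 = \mathrm{Tr}(\delta s) = -\frac{1}{9}\varepsilon^2\mathrm{Tr}(h^2) + \frac{4}{81}\varepsilon^3\mathrm{Tr}(h^3), \tag{4.30a}$$
$$s_2 = \mathrm{Tr}(\delta s^2) = \frac{4}{9}\varepsilon^2\mathrm{Tr}(h^2) + \varepsilon^3\left(\frac{8}{27}\mathrm{Tr}(h^3) - \frac{2}{27}(h\circ h\circ h)\right), \tag{4.30b}$$
$$s_3 = \mathrm{Tr}(\delta s^3) = \frac{8}{27}\varepsilon^3\mathrm{Tr}(h^3). \tag{4.30c}$$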
Substituting these expressions into (4.26) and (4.20), we can get the full expression for the deformed metric (up to third order in $\varepsilon$) and correspondingly the expression for the deformed four-form $\tilde\psi$:
Theorem 17 ([21]). Given a deformation of a $G_2$ structure (4.1) with $\chi_{abc} = h^d{}_{[a}\varphi_{bc]d} \in \Lambda^3_{27}$, then the new metric $\tilde g_{ab}$ is given to third order in $\varepsilon$ by
 
1 1 1 3 2
gab = 1 + 2 Tr(h2 ) + 3 Tr(h3 ) (hhh) gab + hab
18 81 243 3
 
2 2 1 2
+ 2 (h )(ab) (hh)ab + 3 hab Tr(h2 )
9 18 81
3
(hh2 )ab + O(4 ) (4.31)
27

and correspondingly, the deformed four-form $\tilde\psi$ is given by


 
= + 2 1 Tr(h2 ) + 1 i ((hh)0 )

189 6

2 5 1
+ 3 (hhh) Tr(h2 ) + i (h30 )
1701 108 18

1 1
i ((hhh)0 ) + + O(4 ) (4.32)
36 324

where $(h\circ h)_0$, $h^3_0$ and $(h\circ h\circ h)_0$ denote the traceless parts of $(h\circ h)_{ab}$, $(h^3)_{ab}$ and
$(h\circ h\circ h)_{ab}$, respectively, and

a = amnp rst hmr hns hpt . (4.33)

In general if such a deformation is performed on a torsion-free $G_2$ structure, then it is not known what conditions $h$ must satisfy in order for the torsion class to be preserved. If we restrict our analysis only to first order deformations, then it is easier to see these conditions.
Suppose we have $d\varphi = d\psi = 0$ and we apply a deformation (4.1) with $\chi = i_\varphi(h)$ for $h$ traceless and symmetric. Then to first order the conditions for $d\tilde\varphi = d\tilde\psi = 0$ are
$$d\chi = d^*\chi = 0.$$
Hence the deformation must be a form that is closed and co-closed. For a compact manifold this is thus equivalent to $\chi$ being harmonic. We can also find what this condition means in terms of $h$. By decomposing $d\chi$ into $\Lambda^4_1$, $\Lambda^4_7$ and $\Lambda^4_{27}$ components, we find that we must have
$$\nabla_rh^r{}_a = 0, \tag{4.34a}$$
$$\nabla_mh_{a(b}\varphi^{ma}{}_{c)} = 0. \tag{4.34b}$$
Further, if we decompose $d^*\chi$ into $\Lambda^2_7$ and $\Lambda^2_{14}$ components, we again get (4.34a) and moreover get a new constraint
$$\nabla_mh_{a[b}\varphi^{ma}{}_{c]} = 0. \tag{4.35a}$$
Thus overall, for $h$ traceless and symmetric, $\chi = i_\varphi(h)$ being closed and co-closed is equivalent to
$$\nabla_rh^r{}_a = 0 \quad\text{and}\quad \nabla_mh_{ab}\varphi^{ma}{}_c = 0.$$
It also turns out [2] that, if $\chi$ is defined as above, then
$$\Delta\chi = 0 \iff \Delta_Lh = 0$$
where $\Delta_L$ is the Lichnerowicz operator given by
$$\Delta_Lh_{ab} = \nabla^2h_{ab} + 2R_{acbd}h^{cd}. \tag{4.36}$$
Therefore to preserve the torsion-free $G_2$ structure, we have to limit our attention to zero modes of the Lichnerowicz operator. Note that, to linear order, traceless deformations of the metric which preserve the Ricci tensor are also precisely the Lichnerowicz zero modes, and this is consistent with (4.31) where the linear term in the metric deformation is proportional to $h$.
Let us compare what happens here to what happens on Calabi-Yau manifolds [14]. In that case, deformations of the metric $\delta g_{mn}$ split into deformations of mixed type $\delta g_{\mu\bar\nu}$ and deformations of pure type $\delta g_{\mu\nu}$ and $\delta g_{\bar\mu\bar\nu}$. From the mixed type deformations we can define a real (1, 1)-form
$$i\,\delta g_{\mu\bar\nu}\,dx^\mu\wedge dx^{\bar\nu} \tag{4.37}$$
and given the holomorphic 3-form $\Omega$, we can use the pure type deformation to define a (2, 1)-form
$$\Omega_{\kappa\lambda}{}^{\bar\nu}\,\delta g_{\bar\mu\bar\nu}\,dx^\kappa\wedge dx^\lambda\wedge dx^{\bar\mu}. \tag{4.38}$$
In order to preserve the Calabi-Yau structure, the metric deformation must preserve the vanishing Ricci curvature, and hence $\delta g_{mn}$ must satisfy the Lichnerowicz equation:
$$\Delta_L\delta g_{mn} = 0.$$
However, the Lichnerowicz equation for $\delta g_{mn}$ becomes equivalent to both the (1, 1)-form (4.37) and the (2, 1)-form (4.38) being harmonic. Note that the definition (4.38) is very similar to $\chi_{abc} = h^d{}_{[a}\varphi_{bc]d}$ in the $G_2$ case, with $\varphi$ playing the role of $\Omega$ and $h$ the role of $\delta g$.

4.2. Geometry of the moduli space


In the theory of Calabi-Yau moduli spaces, one of the key results is that the local moduli space of complex structure deformations is isomorphic to an open set in $H^{m-1,1}(X)$ where $X$ is a Calabi-Yau $m$-fold. Moreover, as has been shown by Tian and Todorov [42, 43], any infinitesimal deformation can in fact be lifted to a full deformation. For the moduli spaces of $G_2$ manifolds however, we can only replicate the results about the local moduli space. First let us define the moduli space of torsion-free $G_2$ structures. Let $\mathcal{X}$ be the set of positive three-forms $\varphi \in \mathcal{P}^3X$ such that $d\varphi = d\,{*_\varphi\varphi} = 0$. Here we use $*_\varphi$ to emphasize that the Hodge star is defined using the $G_2$ holonomy metric that is defined by $\varphi$ itself. Then $\mathcal{X}$ gives the set of all three-forms that correspond to oriented, torsion-free $G_2$ structures. However, we do not want to distinguish between three-forms that are related by a diffeomorphism. Hence, let $\mathcal{D}$ be the group of all diffeomorphisms of $X$ isotopic to the identity. This group then acts naturally on three-forms. The moduli space of torsion-free $G_2$ structures is then defined as the quotient $\mathcal{M} = \mathcal{X}/\mathcal{D}$. The key result by Joyce is that $\mathcal{M}$ is locally diffeomorphic to an open set of $H^3(X,\mathbb{R})$:
Theorem 18 ([28, 29]). Define a map $\Xi : \mathcal{X} \to H^3(X,\mathbb{R})$ by $\Xi(\varphi) = [\varphi]$. Then $\Xi$ is invariant under the action of $\mathcal{D}$ on $\mathcal{X}$. Moreover, $\Xi$ induces a diffeomorphism between neighborhoods of $\varphi\mathcal{D} \in \mathcal{M}$ and $[\varphi] \in H^3(X,\mathbb{R})$.
Since the dimension of $H^3(X,\mathbb{R})$ is $b^3(X)$, this result implies that $\dim\mathcal{M} = b^3(X)$. The full proof of this result can be found either in [28, 29] or [30]. This result covers the basic local properties of the $G_2$ moduli space, but we do not yet know anything about the global structure of $\mathcal{M}$. So anything we can say about the moduli space only holds in a small neighborhood.
Looking back at the study of Calabi-Yau moduli spaces, we know that the complex structure moduli space admits a Kähler structure, and the Kähler structure moduli space admits a Hessian structure [14]. It turns out that on the $G_2$ moduli space we can also define a Hessian structure. First let us define the notion of a Hessian manifold [39]:

Definition 19. Let $M$ be a smooth manifold and suppose $D$ is a flat, torsion-free connection on $M$. A Riemannian metric $G$ on a flat manifold $(M, D)$ is called Hessian if $G$ can be locally expressed as
$$G = D^2H \tag{4.39}$$
that is,
$$G_{ij} = \frac{\partial^2H}{\partial x^i\partial x^j} \tag{4.40}$$
where $\{x^1, \ldots, x^n\}$ is an affine coordinate system with respect to $D$. Then $H$ is called the Hessian potential.
Note that this is the closest analogue to a Kähler structure that can be defined on a real manifold. In fact, as shown by Shima [39], if we define a complex structure on the manifold $TM$, then the straightforward extension of $G$ onto $TM$ is Kähler if and only if $G$ is a Hessian metric on $(M, D)$. Thus the complexification of a Hessian manifold is Kähler.
In the case of the $G_2$ moduli space $\mathcal{M}$, we know that $\mathcal{M}$ is locally diffeomorphic to an open set in $H^3(X,\mathbb{R})$. Suppose we choose a basis $[\varphi_0], \ldots, [\varphi_n]$ on $H^3(X,\mathbb{R})$ where $n = b^3(X) - 1$. Taking the unique harmonic representatives of the basis elements, we can expand $\varphi \in \mathcal{M}$ as
$$\varphi = \sum_{N=0}^{n}s^N\varphi_N. \tag{4.41}$$
Since $H^3(X,\mathbb{R})$ is a vector space, $s^0, \ldots, s^n$ give an affine coordinate system, which in turn defines a flat connection $D = d$ on $\mathcal{M}$. It is trivial to check that this connection is well-defined [34].

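In order to define a metric on $\mathcal{M}$, we have to choose a Hessian potential function on $\mathcal{M}$. The only natural function on $\mathcal{M}$ is the volume function $V(\varphi)$ given by (3.5):
$$V(\varphi) = \frac{1}{7}\int_X\varphi\wedge\psi.$$
Note that as before, $\psi = *\varphi$ is itself a function of $\varphi$. So we can consider $V$ or some function of $V$ as potential candidates for a Hessian potential. Let us calculate the Hessian of $V$. Note that under a scaling $s^M \to \lambda s^M$, $\varphi$ scales as $\lambda\varphi$ and from (4.6), $\psi$ scales as $\lambda^{\frac{4}{3}}\psi$, and so $V$ scales as
$$V \to \lambda^{\frac{7}{3}}V.$$
So $V$ is homogeneous of order $\frac{7}{3}$ in the $s^M$, and hence
$$s^M\frac{\partial V}{\partial s^M} = \frac{7}{3}V = \frac{1}{3}s^M\int_X\varphi_M\wedge\psi$$
and thus,
$$\frac{\partial V}{\partial s^M} = \frac{1}{3}\int_X\varphi_M\wedge\psi. \tag{4.42}$$
Using our results on deformations of $G_2$ structures from Sec. 4.1, we can deduce that
$$\partial_N(*\varphi) = \frac{4}{3}\,{*\pi_1(\varphi_N)} + {*\pi_7(\varphi_N)} - {*\pi_{27}(\varphi_N)}. \tag{4.43}$$
Hence differentiating (4.42) again, we find that
$$\frac{\partial^2V}{\partial s^M\partial s^N} = \frac{4}{9}\int_X\pi_1(\varphi_M)\wedge{*\pi_1(\varphi_N)} + \frac{1}{3}\int_X\pi_7(\varphi_M)\wedge{*\pi_7(\varphi_N)} - \frac{1}{3}\int_X\pi_{27}(\varphi_M)\wedge{*\pi_{27}(\varphi_N)}. \tag{4.44}$$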
Note that in the case when $b^1(X) = 0$ (which in particular is true when Hol($g$) $= G_2$), since $H^3_7 \cong H^1$, the $H^3_7$ component of $H^3(X,\mathbb{R})$ is trivial. Therefore, the second term in (4.44) vanishes, and we find that the signature of this metric is Lorentzian, $(1, b^3 - 1)$. Up to a constant factor, this definition of the moduli space metric has been used in the mathematical literature, in particular by Hitchin in [26] and Karigiannis and Leung in [34]. However, in the physics literature, in particular by Beasley and Witten in [7] and by Gutowski and Papadopoulos in [23], the potential $K$ given by
$$K = -3\log V \tag{4.45}$$
has been used instead.



The motivation for using this modified potential is two-fold. Firstly, this is more in line with the logarithmic Kähler potentials on Calabi–Yau moduli spaces. Secondly, and perhaps most importantly, the metric that arises from this potential appears as the target space metric of the effective theory in four dimensions when the action for the 11-dimensional supergravity is reduced to four dimensions on a G2 manifold. We will hence define the moduli space metric $G_{MN}$ as
\[ G_{MN} = \frac{\partial^2 K}{\partial s^M \partial s^N}. \]
Using the definition of $K$ and (4.44), we get
\[ \frac{\partial^2 K}{\partial s^M \partial s^N} = \frac{1}{V}\left[\int_X \pi_1(\varphi_M)\wedge *\pi_1(\varphi_N) - \int_X \pi_7(\varphi_M)\wedge *\pi_7(\varphi_N) + \int_X \pi_{27}(\varphi_M)\wedge *\pi_{27}(\varphi_N)\right]. \tag{4.46} \]
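The step from (4.44) and (4.45) to (4.46) can be spelled out as follows. This is only a sketch, assuming $K = -3\log V$ and writing the $\pi_1$ component of each $\varphi_M$ as $\pi_1(\varphi_M) = c_M\varphi$; the constants $c_M$ are notation introduced here for this check and do not appear in the text.

```latex
\begin{align*}
\frac{\partial^2 K}{\partial s^M \partial s^N}
  &= -\frac{3}{V}\,\frac{\partial^2 V}{\partial s^M \partial s^N}
     + \frac{3}{V^2}\,\frac{\partial V}{\partial s^M}\,\frac{\partial V}{\partial s^N}, \\
\frac{\partial V}{\partial s^M}
  &= \frac{1}{3}\int_X \varphi_M \wedge *\varphi = \frac{7}{3}\,c_M V,
  \qquad
  \int_X \pi_1(\varphi_M)\wedge *\pi_1(\varphi_N) = 7\,c_M c_N V, \\
\frac{3}{V^2}\,\frac{\partial V}{\partial s^M}\,\frac{\partial V}{\partial s^N}
  &= \frac{49}{3}\,c_M c_N
   = \frac{7}{3V}\int_X \pi_1(\varphi_M)\wedge *\pi_1(\varphi_N).
\end{align*}
```

Combining the last line with $-3/V$ times (4.44), the coefficient of the $\pi_1$ term becomes $(7/3 - 4/3)/V = 1/V$, while the $\pi_7$ and $\pi_{27}$ terms acquire the coefficients $-1/V$ and $+1/V$, in agreement with (4.46).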

In this case, if $b^1(X) = 0$, we get

\[ G_{MN} = \frac{1}{V}\int_X \varphi_M \wedge *\varphi_N. \tag{4.47} \]

This metric is then in fact Riemannian. In the physics setting, apart from the G2 three-form $\varphi$, there is another three-form $C$ and when the 11-dimensional supergravity action is reduced to four dimensions, the parameters of $\varphi$ and $C$ naturally combine to give a complexification of the G2 moduli space. The extension of the metric $G_{MN}$ to this complex space is then Kähler [7, 21, 23]. However, since the metric on the complexified space does not depend on $C$, there is not much difference in treating the moduli space as a complexified Kähler manifold or a real Hessian manifold. Here we will treat $\mathcal{M}$ as a real Hessian manifold.
Now that we have fixed a metric on $\mathcal{M}$, we can proceed to various other geometrical quantities. For this we will need to use higher derivatives of $*\varphi$. In what follows, we will assume that $b^1(X) = 0$, so that there are no harmonic forms in $H^3_7$.
Let us introduce local special coordinates on M. Let 0 = a and 327 for
= 1, . . . , b327 , so that s0 denes directions parallel to and s dene directions
3
in H27 . Then, from the deformations of in Sec. 4.1, we can extract the higher
derivatives of in these directions:
4 2 8
0 0 = a , 0 0 0 = a3 , (4.48a)
9 27
1 2
0 = a , 0 0 = a2 , (4.48b)
3 9
2 1
= Tr(h h ) + i ((h h )0 ), (4.48c)
189 3
4 2
0 = a Tr(h h ) a i ((h h )0 ), (4.48d)
567 9

5 1
= Tr(h h ) + i ((h h h )0 )
18 3
1 4
i ((h h h )0 ) (h h h ), (4.48e)
6 567
where h , h and h are traceless symmetric matrices corresponding to the three-forms , and , respectively. On a Hessian manifold, there is a natural symmetric three-tensor given by the derivative of the metric, or equivalently the third derivative of the Hessian potential. We will denote this tensor $A_{MNP}$. By analogy with similar quantities on Calabi–Yau moduli spaces, this tensor is called the Yukawa coupling. Using these expressions, following [21] we can now write down all the components of $A_{MNP}$:

A000 = 14a3 , (4.49a)


A00 = 0, (4.49b)

2a
A0 = = 2aG , (4.49c)
V

2
A = (h h h )dV. (4.49d)
27V
The full Riemann curvature on a Hessian manifold is then defined by
1 M
RMNPQ = (A QR AR NP AMPR AR NQ ). (4.50)
4
Note that since the fourth derivative of $K$ is fully symmetric, the fourth derivative terms vanish here. However, we can also define the Hessian curvature tensor by

QKLMN = M N L K K AKMR AR LN . (4.51)

This tensor is the equivalent of the Kähler curvature, and carries more information than the actual Riemann tensor (4.50). The Riemann curvature tensor is obtained from $Q$ by
\[ R_{MNPQ} = \frac{1}{2}\left(Q_{MNPQ} - Q_{NMPQ}\right). \tag{4.52} \]
From (4.48), we can calculate the fourth derivatives of K, and hence get all the
components of Q:

Theorem 20 ([21]). The components of the Hessian curvature tensor $Q$ corresponding to the metric (4.47) on the local moduli space of torsion-free G2 structures are given by:

Q0000 = 14a4 , (4.53a)


Q000 = 0, (4.53b)
Q00 = 2a2 G , (4.53c)

Q0 = A a, (4.53d)
 
1 5
Q = G G + G G G G G A A
3 7
 
1 2 1
+ Tr(h h h h ) + (h h h h )
V 27 27

5
+ Tr(h( h ) Tr(h h) ) vol. (4.53e)
81
Let us look in more detail at the expression for A . If we dene ha = ham dxm ,
then we get

4
A = abc ha hb hc . (4.54)
9V
Expressions for the G2 Yukawa coupling have been derived by different authors, in particular by Lee and Leung [37], de Boer, Naqvi and Shomer [16], and Karigiannis [32]. Similarly, we can rewrite (4.53e) as
 
1 5
Q = G G + G G G G G A A
3 7

81
+ abcd ha hb hc hd
9V

1 1
+ (5 Tr(h( h ) Tr(h h) ) 6 Tr(h h h h ))vol. (4.55)
81 V
As we have mentioned previously, by complexifying the G2 moduli space, it is possible to turn the Hessian structure into a Kähler structure. Similarly, the Hessian curvature $Q$ becomes Kähler curvature. On Calabi–Yau manifolds, the complex structure moduli space is naturally a complex manifold, and admits a Kähler structure, while the Kähler structure moduli space is naturally a Hessian manifold, but can be complexified to become Kähler itself. We compare the various quantities on G2 moduli space and on the Calabi–Yau complex structure moduli space in Fig. 5. We can see that there are a number of similarities. This leads to a speculation that perhaps the G2 moduli space possesses more structure than is currently known. One of the key features of Calabi–Yau moduli spaces is the special geometry, that is, both have a line bundle whose first Chern class coincides with the Kähler class [19, 40]. From the physics point of view, special geometry relates to the effective theory having $N = 2$ supersymmetry. M-theory compactified on G2 manifolds only gives $N = 1$ supersymmetry, so from this point of view it is perhaps unlikely that the (complexified) G2 moduli space would admit precisely this structure. Moreover, it was shown by Alekseevsky and Cortés in [4] that a so-called special real structure on a Hessian manifold corresponds to a special Kähler structure on the tangent bundle. A special real manifold is a Hessian manifold on which the cubic form $DG$ (with $D$ being the flat connection, and $G$ the Hessian metric) is parallel with respect

Fig. 5. Comparison of G2 moduli space and Calabi–Yau complex structure moduli space. The table lists, for each of the two spaces, the defining form, the deformation space, the metric deformation, the form deformation, the Kähler potential, the moduli space metric, the Yukawa coupling and the curvature (for the G2 case, $Q$ as in (4.55)).

to $D$. In our terms, this would mean that the derivative of the Yukawa coupling $A$ vanishes. This is a rather strong condition which is not necessarily fulfilled in our case. So perhaps instead there is some intermediate structure that could be defined on the G2 moduli space or its complexification.

5. Concluding Remarks
In this paper, we have reviewed the developments in the study of G2 moduli spaces. Currently only the local picture of the moduli space is known, so in the future it is natural to try and obtain at least some information on the global structure of the G2 moduli space. On Calabi–Yau manifolds, the extension to the global moduli space was originally done by Tian and Todorov [42, 43]. We have seen that there are a number of similarities in the local structure of Calabi–Yau moduli spaces and G2 moduli spaces, so it is feasible that it could also be possible to derive similar global properties of G2 moduli spaces. However, torsion-free G2 structures are very nonlinear in some aspects; in particular, the metric depends nonlinearly on $\varphi$ and hence the differential equation = 0 for a torsion-free structure is also nonlinear. Therefore, it is not clear how to extend infinitesimal deformations of a G2 structure to large deformations, apart from considering deformations order by order. However, even such expansions quickly get very complicated.
Another possible topic for study would be to further develop approaches to mirror symmetry on G2 holonomy manifolds [22]. One possible direction for further research is to look at G2 manifolds in a slightly different way. Suppose we have type IIA superstrings on a non-compact Calabi–Yau three-fold with a special Lagrangian submanifold which is wrapped by a D6 brane which also fills $M_4$.

Then, as explained in [3], from the M-theory perspective this looks like an $S^1$ bundle over the Calabi–Yau which is degenerate over the special Lagrangian submanifold, but this seven-manifold is still a G2 manifold. The moduli space of this manifold will then be determined by the Calabi–Yau moduli and the special Lagrangian moduli. This possibly could provide more information about mirror symmetry on Calabi–Yau manifolds [41].
One more direction is to look at G2 manifolds with singularities. So far in this work we have considered only smooth G2 manifolds; however, from a physical point of view, G2 manifolds with singularities are even more interesting, as they yield more realistic matter content [1]. Also, the moduli spaces which we studied are for manifolds with fixed topology. By allowing topological transitions through singularities [15], it may be possible to find some relations between the different moduli spaces. Understanding these questions would improve our grasp of both the geometry and physics of G2 moduli spaces and the interplay between them.

References
[1] B. Acharya and E. Witten, Chiral fermions from manifolds of G(2) holonomy, hep-th/0109152.
[2] B. S. Acharya and S. Gukov, M theory and singularities of exceptional holonomy manifolds, Phys. Rept. 392 (2004) 121–189; hep-th/0409191.
[3] M. Aganagic, A. Klemm and C. Vafa, Disk instantons, mirror symmetry and the duality web, Z. Naturforsch. A 57 (2002) 1–28; hep-th/0105045.
[4] D. V. Alekseevsky and V. Cortés, Geometric construction of the r-map: From affine special real to special Kähler manifolds, arXiv:0811.1658.
[5] W. Ambrose and I. M. Singer, A theorem on holonomy, Trans. Amer. Math. Soc. 75 (1953) 428–443.
[6] J. Baez, The octonions, Bull. Amer. Math. Soc. (N.S.) 39 (2002) 145–205.
[7] C. Beasley and E. Witten, A note on fluxes and superpotentials in M-theory compactifications on manifolds of G(2) holonomy, JHEP 07 (2002); hep-th/0203061.
[8] K. Becker, M. Becker and J. H. Schwarz, String Theory and M-Theory: A Modern Introduction (Cambridge University Press, 2007).
[9] M. Berger, Sur les groupes d'holonomie homogène des variétés à connexion affine et des variétés riemanniennes, Bull. Soc. Math. France 83 (1955) 279–330.
[10] E. Bonan, Sur les variétés riemanniennes à groupe d'holonomie g2 ou spin(7), C. R. Acad. Sci. Paris 262 (1966) 127–129.
[11] R. Bryant and S. Salamon, On construction of some complete metrics with exceptional holonomy, Duke Math. J. 58 (1989) 829–850.
[12] R. L. Bryant, Metrics with exceptional holonomy, Ann. of Math. (2) 126(3) (1987) 525–576.
[13] R. L. Bryant, Some remarks on G2-structures, math/0305124.
[14] P. Candelas and X. de la Ossa, Moduli space of Calabi–Yau manifolds, Nucl. Phys. B 355 (1991) 455–481.
[15] M. Cvetič, G. W. Gibbons, H. Lü and C. N. Pope, M-theory conifolds, Phys. Rev. Lett. 88 (2002) 121602, pp. 4; hep-th/0112098.
[16] J. de Boer, A. Naqvi and A. Shomer, The topological G2 string, hep-th/0506211.
[17] M. J. Duff, M-theory on manifolds of G(2) holonomy: The first twenty years, hep-th/0201062.

[18] M. Fernández and A. Gray, Riemannian manifolds with structure group G2, Ann. Mat. Pura Appl. (4) 132 (1982) 19–45.
[19] D. S. Freed, Special Kaehler manifolds, Comm. Math. Phys. 203 (1999) 31–52; hep-th/9712042.
[20] S. Grigorian, Betti numbers of a class of barely G2 manifolds, arXiv:0909.4681.
[21] S. Grigorian and S.-T. Yau, Local geometry of the G2 moduli space, Comm. Math. Phys. 287 (2009) 459–488; arXiv:0802.0723.
[22] S. Gukov, S.-T. Yau and E. Zaslow, Duality and fibrations on G(2) manifolds, hep-th/0203217.
[23] J. Gutowski and G. Papadopoulos, Moduli spaces and brane solitons for M theory compactifications on holonomy G(2) manifolds, Nucl. Phys. B 615 (2001) 237–265; hep-th/0104105.
[24] F. R. Harvey, Spinors and Calibrations (Academic Press, 1990).
[25] J. A. Harvey and G. W. Moore, Superpotentials and membrane instantons, hep-th/9907026.
[26] N. J. Hitchin, The geometry of three-forms in six and seven dimensions, math/0010054.
[27] K. Hori et al., Mirror Symmetry (Amer. Math. Soc., 2003).
[28] D. D. Joyce, Compact Riemannian 7-manifolds with holonomy G2. I, J. Differential Geom. 43 (1996) 291–328.
[29] D. D. Joyce, Compact Riemannian 7-manifolds with holonomy G2. II, J. Differential Geom. 43 (1996) 329–375.
[30] D. D. Joyce, Compact Manifolds with Special Holonomy, Oxford Mathematical Monographs (Oxford University Press, 2000).
[31] S. Karigiannis, Deformations of G2 and Spin(7) structures on manifolds, Canad. J. Math. 57 (2005) 1012–1055; math/0301218.
[32] S. Karigiannis, Geometric flows on manifolds with G2 structure, I, math/0702077.
[33] S. Karigiannis, Desingularization of G2 manifolds with isolated conical singularities, Geom. Topol. 13(3) (2009) 1583–1655; arXiv:0807.3346.
[34] S. Karigiannis and N. C. Leung, Hodge theory for G2-manifolds: Intermediate Jacobians and Abel–Jacobi maps, arXiv:0709.2987.
[35] A. Kovalev, Twisted connected sums and special Riemannian holonomy, math/0012189.
[36] A. Kovalev and J. Nordström, Asymptotically cylindrical 7-manifolds of holonomy G2 with applications to compact irreducible G2-manifolds, arXiv:0907.0497.
[37] J.-H. Lee and N. C. Leung, Geometric structures on G(2) and Spin(7)-manifolds, math/0202045.
[38] J. Nordström, Deformations of asymptotically cylindrical G2-manifolds, Math. Proc. Cambridge Philos. Soc. 145(2) (2008) 311–348; arXiv:0705.4444.
[39] H. Shima, The Geometry of Hessian Structures (World Scientific Publishing, 2007).
[40] A. Strominger, Special geometry, Comm. Math. Phys. 133 (1990) 163–180.
[41] A. Strominger, S.-T. Yau and E. Zaslow, Mirror symmetry is T-duality, Nucl. Phys. B 479 (1996) 243–259; hep-th/9606040.
[42] G. Tian, Smoothness of the universal deformation space of compact Calabi–Yau manifolds and its Petersson–Weil metric, in Mathematical Aspects of String Theory (San Diego, Calif., 1986), Adv. Ser. Math. Phys., Vol. 1 (World Sci. Publishing, 1987), pp. 629–646.
[43] A. Todorov, The Weil–Petersson geometry of the moduli space of SU(n ≥ 3) (Calabi–Yau) manifolds I, Comm. Math. Phys. 126 (1989) 325–346.

[44] P. K. Townsend, The eleven-dimensional supermembrane revisited, Phys. Lett. B 350 (1995) 184–187; hep-th/9501068.
[45] P. C. West, Introduction to Supersymmetry and Supergravity (World Scientific Publishing Singapore, 1990).
[46] E. Witten, String theory dynamics in various dimensions, Nucl. Phys. B 443 (1995) 85–126; hep-th/9503124.
October 12, 2010 10:2 WSPC/S0129-055X 148-RMP
J070-S0129055X10004144

Reviews in Mathematical Physics


Vol. 22, No. 9 (2010) 1099–1121
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004144

A UNIFIED TREATMENT OF CONVEXITY OF RELATIVE
ENTROPY AND RELATED TRACE FUNCTIONS,
WITH CONDITIONS FOR EQUALITY

ANNA JENČOVÁ
Mathematical Institute, Slovak Academy of Sciences,
Štefánikova 49, 814 73 Bratislava, Slovakia
jenca@mat.savba.sk

MARY BETH RUSKAI


Department of Mathematics, Tufts University,
Medford, MA 02155, USA
marybeth.ruskai@tufts.edu

Received 14 August 2009


Revised 19 June 2010

We consider a generalization of relative entropy derived from the Wigner–Yanase–Dyson entropy and give a simple, self-contained proof that it is convex. Moreover, special cases yield the joint convexity of relative entropy, and for $\mathrm{Tr}\,K^*A^pKB^{1-p}$ Lieb's joint concavity in $(A, B)$ for $0 < p < 1$ and Ando's joint convexity for $1 < p \le 2$. This approach allows us to obtain conditions for equality in these cases, as well as conditions for equality in a number of inequalities which follow from them. These include the monotonicity under partial traces, and some Minkowski type matrix inequalities proved by Carlen and Lieb for $\mathrm{Tr}_1(\mathrm{Tr}_2 A_{12}^p)^{1/p}$. In all cases, the equality conditions are independent of $p$; for extensions to three spaces they are identical to the conditions for equality in the strong subadditivity of relative entropy.

Keywords: Relative entropy; convex trace functions; Wigner–Yanase–Dyson entropy.

Mathematics Subject Classification 2010: 47A63, 15A45, 94A17

1. Introduction
1.1. Background
For matrices $A_{12} > 0$ acting on a tensor product of two Hilbert spaces, Carlen and Lieb ([7, 8]) considered the trace function $[\mathrm{Tr}_1(\mathrm{Tr}_2 A_{12}^p)^{q/p}]^{1/q}$ and proved that it is concave when $0 \le p \le q \le 1$ and convex when $1 \le q$ and $1 \le p \le 2$. They showed that this implies that these functions and the norms they generate satisfy Minkowski type inequalities, including a natural generalization to matrices $A_{123}$ acting on a tensor product of three Hilbert spaces. They also raised the question of the conditions for equality in their inequalities. When $q = 1$, we show that this can


be treated using methods developed to treat equality in the strong subadditivity of quantum entropy. Moreover, we obtain conditions for equality in a large class of related convexity inequalities, show that they are independent of $p$ in the range $0 < p < 2$, and show that for inequalities involving $A_{123}$ they are identical to the equality conditions for strong subadditivity (SSA) of quantum entropy given in [13].
These equality conditions are non-trivial and have found many applications in quantum information theory. For example, they play an important role in some recent no broadcasting results; see [19] and references therein. They also play a key role in Devetak and Yard's ([9]) quantum state redistribution protocol, which gives an operational interpretation to the quantum conditional mutual information.
Our approach to proving joint convexity of relative entropy is motivated by Araki's relative modular operator ([5]), introduced to generalize relative entropy to more general situations including type III von Neumann algebras. It was subsequently used by Narnhofer and Thirring ([29]) to give a new proof of SSA. The argument given here is similar to that in [18, 31, 37]; however, the unified treatment for $0 < p < 2$, leading to equality conditions, is new. Moreover, a dual treatment can be given for $-1 < p < 1$, allowing extension to the full range $(-1, 2)$.
Wigner and Yanase ([42, 43]) introduced the notion of skew information of a density matrix $\rho$ with respect to a self-adjoint observable $K$,
\[ -\frac{1}{2}\,\mathrm{Tr}\,[K, \rho^p][K, \rho^{1-p}] \tag{1} \]
for $p = \frac{1}{2}$ and Dyson suggested extending this to $p \in (0, 1)$. Wigner and Yanase [43] proved that (1) is convex in $\rho$ for $p = \frac{1}{2}$ and, in his seminal paper [20] on convex trace functions, Lieb proved joint concavity for $p \in (0, 1)$ for the more general function

\[ (A, B) \mapsto \mathrm{Tr}\,K^*A^pKB^{1-p} \tag{2} \]

for $K$ fixed and $A, B > 0$ positive semi-definite. This implies convexity of (1) and was a key step in the original proof ([23]) of the strong subadditivity (SSA) inequality of quantum entropy. Moreover, it leads to a proof of joint convexity of relative entropy$^a$ as well. It is less well known that Ando ([3, 4]) gave another proof which also showed that for $1 \le p \le 2$, the function (2) is jointly convex in $A, B$. The case $p = 2$ was considered earlier by Lieb and Ruskai ([24]). We modify what one might describe as Lieb's extension of the Wigner–Yanase–Dyson (WYD) entropy to a type of relative entropy in a way that allows a unified treatment of the convexity and concavity of $\mathrm{Tr}\,K^*A^pKB^{1-p}$ in the range $p \in (0, 2]$ and includes the usual relative entropy as a special case. Our modification retains a linear term,

$^a$ In [23], only concavity of the conditional entropy was proved explicitly, but the same argument [36, Sec. V.B] yields joint convexity of the relative entropy. Independently, Lindblad ([26]) observed that this follows directly from (2) by differentiating at $p = 1$.

even for $A \ne B$. Although this might seem unnecessary for convexity and concavity questions, it is crucial to a unified treatment.
Lieb also considered $\mathrm{Tr}\,K^*A^pKB^q$ with $p, q > 0$ and $0 \le p + q \le 1$ and Ando considered $1 < q \le p \le 2$. In Sec. 2.2, we extend our results to this situation. However, we also show that for $q \ne 1 - p$, equality holds only under trivial conditions. Therefore, we concentrate on the case $q = 1 - p$.
Next, we introduce our notation and conventions. In Sec. 2, we first describe our generalization of relative entropy and prove its convexity; then consider the extension to $q \ne 1 - p$ mentioned above; and finally prove monotonicity under partial traces including a generalization of strong subadditivity to $p \ne 1$. In Sec. 3, we consider several formulations of equality conditions. In Sec. 4, we show how to use these results to obtain equality conditions in the results of Lieb and Carlen ([7, 8]). For completeness, we include an appendix which contains the proof of a basic convexity result from [37] that is key to our results.

1.2. Notation and conventions


We introduce two linear maps on the space $M_d$ of $d \times d$ matrices. Left multiplication by $A$ is denoted $L_A$ and defined as $L_A(X) = AX$; right multiplication by $B$ is denoted $R_B$ and defined as $R_B(X) = XB$. These maps are associated with the relative modular operator $\Delta_{AB} = L_A R_B^{-1}$ introduced by Araki ([5]) in a far more general context. They have the following properties:
(a) The operators $L_A$ and $R_B$ commute since
\[ L_A[R_B(X)] = AXB = R_B[L_A(X)] \tag{3} \]
even when $A$ and $B$ do not commute.
(b) $L_A$ and $R_A$ are invertible if and only if $A$ is non-singular, in which case $L_A^{-1} = L_{A^{-1}}$ and $R_A^{-1} = R_{A^{-1}}$.
(c) When $A$ is self-adjoint, $L_A$ and $R_A$ are both self-adjoint with respect to the Hilbert–Schmidt inner product $\langle A, B\rangle = \mathrm{Tr}\,A^*B$.
(d) When $A \ge 0$, the operators $L_A$ and $R_A$ are positive semi-definite, i.e. $\mathrm{Tr}\,X^*L_A(X) = \mathrm{Tr}\,X^*AX \ge 0$ and $\mathrm{Tr}\,X^*R_A(X) = \mathrm{Tr}\,X^*XA = \mathrm{Tr}\,XAX^* \ge 0$.
(e) When $A > 0$, then $(L_A)^p = L_{A^p}$ and $(R_A)^p = R_{A^p}$ for all $p \ge 0$. If $A$ is also non-singular, this extends to all $p \in \mathbb{R}$. More generally, $f(L_A) = L_{f(A)}$ for $f : (0, \infty) \to \mathbb{R}$.
To see why (e) holds, it suffices to observe that $A > 0$ implies $L_A$ and $R_A$ are linear operators for which $f(A)$ can be defined by the spectral theorem for any function $f$ with domain in $(0, \infty)$. It is easy to verify that $A|\phi_j\rangle = \alpha_j|\phi_j\rangle$ implies $L_A|\phi_j\rangle\langle\phi_k| = \alpha_j|\phi_j\rangle\langle\phi_k|$ for $k = 1, \ldots, d$ so that the spectral decomposition of $A$ induces one on $L_A$ with degeneracy $d$ and $f(L_A)|\phi_j\rangle\langle\phi_k| = f(\alpha_j)|\phi_j\rangle\langle\phi_k|$.

For $R_B$ a similar argument goes through starting with left eigenvectors of $B$, i.e. $\langle\phi_j|B = \beta_j\langle\phi_j|$.
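The spectral picture just described gives a direct way to evaluate a function of the relative modular operator: if $A = \sum_j a_j P_j$ and $B = \sum_k b_k Q_k$, then $f(L_A R_B^{-1})(X) = \sum_{j,k} f(a_j/b_k)\,P_j X Q_k$. The following is a minimal numerical sketch of this, restricted for simplicity to real symmetric $2 \times 2$ matrices and written with the Python standard library only; the helper names (`spectral`, `fun`, `fun_of_modular`) and the test matrices are illustrative choices, not part of the text. It checks the special case $f(x) = x^p$, where the double sum should reduce to $A^p X B^{-p}$.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral(A):
    """Eigenvalues and eigenprojectors of a real symmetric 2x2 matrix."""
    a, b, c = A[0][0], A[0][1], A[1][1]
    theta = 0.5 * math.atan2(2 * b, a - c)  # rotation angle that diagonalizes A
    co, si = math.cos(theta), math.sin(theta)
    lam1 = a * co * co + 2 * b * co * si + c * si * si
    lam2 = a * si * si - 2 * b * co * si + c * co * co
    P1 = [[co * co, co * si], [co * si, si * si]]
    P2 = [[si * si, -co * si], [-co * si, co * co]]
    return [(lam1, P1), (lam2, P2)]

def fun(A, f):
    """f(A) via the spectral theorem."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, P in spectral(A):
        for i in range(2):
            for j in range(2):
                out[i][j] += f(lam) * P[i][j]
    return out

def fun_of_modular(A, B, f, X):
    """f(L_A R_B^{-1}) applied to X, i.e. sum_{j,k} f(a_j / b_k) P_j X Q_k."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for a, P in spectral(A):
        for b, Q in spectral(B):
            term = matmul(matmul(P, X), Q)
            for i in range(2):
                for j in range(2):
                    out[i][j] += f(a / b) * term[i][j]
    return out

# Illustrative positive definite A, B and an arbitrary X.
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, -0.3], [-0.3, 1.5]]
X = [[1.0, 2.0], [0.0, 1.0]]
p = 0.3

lhs = fun_of_modular(A, B, lambda x: x ** p, X)
rhs = matmul(matmul(fun(A, lambda x: x ** p), X), fun(B, lambda x: x ** (-p)))
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The same double sum works for any $f$ defined on $(0, \infty)$, which is how expressions of the form $g(L_A R_B^{-1})$ can be evaluated in practice.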
If a function is homogeneous of degree 1, then convexity is equivalent to subadditivity. Thus, if $F(\lambda A) = \lambda F(A)$, then $F$ is convex if and only if $F(A) \le \sum_j F(A_j)$ with $A = \sum_j A_j$. We will use this equivalence without further ado.
For $B$ positive semi-definite, we denote the projection onto $(\ker B)^\perp$ by $P_{(\ker B)^\perp}$. We will encounter expressions involving commuting positive semi-definite matrices $A, D$ with $\ker D \subseteq \ker A$. We will simply write $AD^{-1}$ for

\[ \lim_{\epsilon \to 0} \sqrt{A}\,(D + \epsilon I)^{-1}\sqrt{A} = AD^{-1}P_{(\ker D)^\perp} = AD^{-1}P_{(\ker A)^\perp} \tag{4} \]

with $D^{-1}$ the generalized inverse.

2. WYD Entropy Revisited and Extended


2.1. Generalization of relative entropy
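We now introduce the family of functions
\[ g_p(x) = \begin{cases} \dfrac{1}{p(1-p)}\,(x - x^p) & p \ne 1 \\[1.5ex] x \log x & p = 1, \end{cases} \tag{5} \]
which are well-defined for $x > 0$ and $p \ne 0$. We will consider $p \in (0, 2]$ although it would suffice to consider $p \in [\frac{1}{2}, 2]$. For $A, B$ strictly positive we define
\[ J_p(K, A, B) \equiv \mathrm{Tr}\,\sqrt{B}K^*\,g_p(L_A R_B^{-1})(K\sqrt{B}) \tag{6} \]
\[ = \begin{cases} \dfrac{1}{p(1-p)}\,(\mathrm{Tr}\,K^*AK - \mathrm{Tr}\,K^*A^pKB^{1-p}) & p \in (0, 1) \cup (1, 2), \\[1.5ex] \mathrm{Tr}\,KK^*A\log A - \mathrm{Tr}\,K^*AK\log B & p = 1, \\[1.5ex] -\dfrac{1}{2}\,(\mathrm{Tr}\,K^*AK - \mathrm{Tr}\,AKB^{-1}K^*A) & p = 2. \end{cases} \tag{7} \]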
When $p = 1$ and $K = I$, (6) reduces to the usual relative entropy, i.e.

\[ J_1(I, A, B) = H(A, B) = \mathrm{Tr}\,A(\log A - \log B). \tag{8} \]
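The extension by continuity to $p = 1$ can be checked numerically. The sketch below evaluates the closed form $(\mathrm{Tr}\,A - \mathrm{Tr}\,A^pB^{1-p})/(p(1-p))$ of $J_p(I, A, B)$ for $p$ slightly below and above 1 and compares it with the relative entropy (8). It assumes real symmetric $2 \times 2$ density matrices and uses only the standard library; the function names and the matrices are illustrative assumptions.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def fun(A, f):
    """f(A) for a real symmetric 2x2 matrix, via its spectral decomposition."""
    a, b, c = A[0][0], A[0][1], A[1][1]
    theta = 0.5 * math.atan2(2 * b, a - c)
    co, si = math.cos(theta), math.sin(theta)
    f1 = f(a * co * co + 2 * b * co * si + c * si * si)
    f2 = f(a * si * si - 2 * b * co * si + c * co * co)
    off = (f1 - f2) * co * si
    return [[f1 * co * co + f2 * si * si, off],
            [off, f1 * si * si + f2 * co * co]]

def J(p, A, B):
    """J_p(I, A, B) for p != 1, from the closed form with K = I."""
    Ap = fun(A, lambda x: x ** p)
    Bq = fun(B, lambda x: x ** (1 - p))
    return (trace(A) - trace(matmul(Ap, Bq))) / (p * (1 - p))

def H(A, B):
    """Relative entropy Tr A (log A - log B)."""
    logA, logB = fun(A, math.log), fun(B, math.log)
    return trace(matmul(A, logA)) - trace(matmul(A, logB))

# Illustrative density matrices (trace one, positive definite).
A = [[0.7, 0.2], [0.2, 0.3]]
B = [[0.4, -0.1], [-0.1, 0.6]]

h = H(A, B)
gap_below = abs(J(1 - 1e-6, A, B) - h)
gap_above = abs(J(1 + 1e-6, A, B) - h)
print(h, gap_below, gap_above)
```

Both gaps shrink linearly as $p \to 1$, reflecting the fact that the sign change of $p(1-p)$ is compensated by the sign change of the numerator.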

For $p \ne 1$, the function $J_p(K, A, B)$ differs from that considered by Lieb ([20]) and Ando ([3, 4]) by the seemingly irrelevant linear term $\mathrm{Tr}\,K^*AK$ and the factor $\frac{1}{p(1-p)}$. However, this minor difference allows us to give a unified treatment of $p \in (0, 2]$ because of the extension by continuity to $p = 1$ and the sign change there.
One might expect to associate the exchange $A \leftrightarrow B$ with the symmetry $p \leftrightarrow (1 - p)$ around $p = \frac{1}{2}$. However, there are several subtleties due to the linear term, the exchange $K \leftrightarrow K^*$, and the case $p = 1$. Therefore, we use instead the observation

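that
\begin{align*} J_p(K^*, B, A) &= \mathrm{Tr}\,\sqrt{A}K\,g_p(L_B R_A^{-1})(K^*\sqrt{A}) \\ &= \mathrm{Tr}\,\sqrt{B}K^*\,\tilde{g}_{1-p}(L_A R_B^{-1})(K\sqrt{B}) \\ &= \tilde{J}_{1-p}(K, A, B) \tag{9} \end{align*}
where, for $-1 \le p < 1$, we define
\[ \tilde{g}_p(x) = x\,g_{1-p}(x^{-1}) = \begin{cases} \dfrac{1}{p(1-p)}\,(1 - x^p) & p \ne 0 \\[1.5ex] -\log x & p = 0 \end{cases} \tag{10} \]
and $\tilde{J}_p(K, A, B) = \mathrm{Tr}\,\sqrt{B}K^*\,\tilde{g}_p(L_A R_B^{-1})(K\sqrt{B})$.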

The functions $J_p(K, A, B)$ and $\tilde{J}_p(K, A, B)$ have been considered before, usually with $K = I$, in the context of information geometry ([2, Sec. 7.2] and references therein) and by Petz ([31]) who used the term quasi-entropy. What is novel here is that we present a simple unified proof of joint convexity in $A, B$ that easily yields equality conditions, shows that they are independent of $p$, and can be extended to other functions.
The special case $J_p(I, A, I)$ is equivalent$^b$ to the Tsallis ([40]) entropy. When $K = K^*$, the relation
\[ J_p(K, A, A) = -\frac{1}{2p(1-p)}\,\mathrm{Tr}\,[K, A^p][K, A^{1-p}] \tag{11} \]
yields the original WYD information (up to a constant) and extends it to the range $(0, 2]$. Moreover, $K = K^*$ implies that $J_p(K, A, A) = \tilde{J}_{1-p}(K, A, A)$. Although neither $g_p(w)$ nor $\tilde{g}_{1-p}(w)$ is positive, their average$^c$ $G_p(w) \equiv \frac{1}{2}[g_p(w) + w\,g_p(w^{-1})] \ge 0$ on $(0, \infty)$. Therefore, when $K = K^*$,
\[ J_p(K, A, A) = \mathrm{Tr}\,(K\sqrt{A})^*\,G_p(L_A R_A^{-1})(K\sqrt{A}) \ge 0. \tag{12} \]
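Relation (11) and the non-negativity in (12) can be illustrated numerically. The sketch below compares the closed form $(\mathrm{Tr}\,K^*AK - \mathrm{Tr}\,K^*A^pKA^{1-p})/(p(1-p))$ with the commutator expression for a real symmetric $K$ and a positive definite real symmetric $A$, both $2 \times 2$. The matrices and helper names are assumptions chosen for illustration, and only the standard library is used.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def fun(A, f):
    """f(A) for a real symmetric 2x2 matrix, via its spectral decomposition."""
    a, b, c = A[0][0], A[0][1], A[1][1]
    theta = 0.5 * math.atan2(2 * b, a - c)
    co, si = math.cos(theta), math.sin(theta)
    f1 = f(a * co * co + 2 * b * co * si + c * si * si)
    f2 = f(a * si * si - 2 * b * co * si + c * co * co)
    off = (f1 - f2) * co * si
    return [[f1 * co * co + f2 * si * si, off],
            [off, f1 * si * si + f2 * co * co]]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

def J(p, K, A, B):
    """Closed form of J_p(K, A, B) for real K and p in (0,1) or (1,2)."""
    Kt = [[K[0][0], K[1][0]], [K[0][1], K[1][1]]]  # transpose = adjoint for real K
    Ap = fun(A, lambda x: x ** p)
    Bq = fun(B, lambda x: x ** (1 - p))
    linear = trace(matmul(matmul(Kt, A), K))
    mixed = trace(matmul(matmul(matmul(Kt, Ap), K), Bq))
    return (linear - mixed) / (p * (1 - p))

def wyd(p, K, A):
    """Right-hand side of (11): -Tr [K, A^p][K, A^(1-p)] / (2 p (1-p))."""
    c1 = comm(K, fun(A, lambda x: x ** p))
    c2 = comm(K, fun(A, lambda x: x ** (1 - p)))
    return -trace(matmul(c1, c2)) / (2 * p * (1 - p))

A = [[2.0, 0.5], [0.5, 1.0]]     # positive definite
K = [[1.0, 0.5], [0.5, -2.0]]    # self-adjoint, does not commute with A

results = [(p, J(p, K, A, A), wyd(p, K, A)) for p in (0.3, 0.5, 1.7)]
for p, j_val, w_val in results:
    print(p, j_val, w_val)
```

The two columns agree for values of $p$ on both sides of 1, and the common value is non-negative, as (12) requires.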
The function $J_p(I, A, B)$ is a more appealing generalization of relative entropy than $\mathrm{Tr}\,A^pB^{1-p}$ because of Proposition 1, which one can consider to be a generalization of Klein's inequality ([17]). It allows one to use $J_p(I, A, B)$ as a pseudo-metric, as is commonly done with the relative entropy.

Proposition 1. When $U$ is unitary and $A, B > 0$ with $\mathrm{Tr}\,A = \mathrm{Tr}\,B = 1$, then $J_p(U, A, B) \ge 0$ with equality if and only if $A = UBU^*$.

Proof. When $U$ is unitary,

\[ J_p(U, A, B) = J_p(I, U^*AU, B) = J_p(I, A, UBU^*). \tag{13} \]

$^b$ This was pointed out by Karol Życzkowski.

$^c$ The definition of $\tilde{g}_p$ in (10) differs from that in [18] by the exchange $\tilde{g}_p \leftrightarrow \tilde{g}_{1-p}$ so that in [18] $G(w) = \frac{1}{2}[g(w) + \tilde{g}(w)]$ for any $g$. In the convention used here, $G_p(w) = \frac{1}{2}[g_p(w) + \tilde{g}_{1-p}(w)]$.

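Therefore, it suffices to consider the case $U = I$. For $p \in (0, 1)$ Hölder's inequality implies $\mathrm{Tr}\,A^pB^{1-p} \le (\mathrm{Tr}\,A)^p(\mathrm{Tr}\,B)^{1-p} = 1$ with equality if and only if $A = B$. It immediately follows that

\[ J_p(I, A, B) \ge \frac{1}{p(1-p)}\,(\mathrm{Tr}\,A - 1) = 0 \quad\text{and}\quad J_p(I, A, B) = 0 \Leftrightarrow A = B. \tag{14} \]

For $p = 1$, the result is well-known [38, Sec. 2.5.2] and originally due to Klein ([17]). For $p \in (1, 2)$ we write $p = 1 + r$ and again use Hölder's inequality

\begin{align*} 1 = \mathrm{Tr}\,A &= \mathrm{Tr}\,B^{-\frac{r}{2(r+1)}} A B^{-\frac{r}{2(r+1)}} B^{\frac{r}{r+1}} \\ &\le \left[\mathrm{Tr}\left(B^{-\frac{r}{2(r+1)}} A B^{-\frac{r}{2(r+1)}}\right)^{1+r}\right]^{\frac{1}{1+r}} (\mathrm{Tr}\,B)^{\frac{r}{1+r}} \tag{15} \\ &\le \left[\mathrm{Tr}\,B^{-\frac{r}{2}} A^{1+r} B^{-\frac{r}{2}}\right]^{\frac{1}{1+r}} = \left[\mathrm{Tr}\,A^{1+r}B^{-r}\right]^{\frac{1}{1+r}} \end{align*}

where we used $\mathrm{Tr}\,B = 1$ and the second inequality follows from a classic result of Lieb–Thirring [25, Appendix B, Theorem 9].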

Because the denominator $p(1 - p)$ changes sign at $p = 0$ and $p = 1$, both $g_p$ and $\tilde{g}_p$ are convex. In fact, they satisfy the much stronger condition of operator convexity for $p \in (0, 2]$ and $p \in [-1, 1)$ respectively. Since $g(0) = 0$ and

\[ \frac{g_p(x)}{x} = \begin{cases} \dfrac{1}{p(1-p)}\,(1 - x^{p-1}) & p \ne 1 \\[1.5ex] \log x & p = 1, \end{cases} \tag{16} \]

it follows that $g_p(x)/x$ is operator monotone [3, 10, 27], for $p \in (0, 2]$, i.e. $g_p$ can be analytically continued to the upper half plane, which it maps into itself. By applying Nevanlinna's theorem [1, Sec. 59, Theorem 2] to $g_p(x)/x$, one finds that $g_p(x)$ has an integral representation of the form

\begin{align*} g_p(x) &= ax + \int_0^\infty \frac{x^2 t - x}{x + t}\,d\nu(t) \\ &= ax + \int_0^\infty \left[\frac{x^2}{x+t} - \frac{1}{t} + \frac{1}{x+t}\right] t\,d\nu(t) \tag{17} \end{align*}

with $\nu(t) \ge 0$. Integral representations are not unique, and making a suitable change of variable in the classic formula

\[ \int_0^\infty \frac{x^{p-1}}{x+1}\,dx = \frac{\pi}{\sin p\pi} \equiv \frac{1}{c_p}, \qquad p \in (0, 1) \tag{18} \]

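allows us to give the following explicit representations

\[ g_p(x) = \begin{cases} \dfrac{1}{p(1-p)}\left[x + c_p\displaystyle\int_0^\infty \left(\frac{t}{x+t} - 1\right) t^{p-1}\,dt\right] & p \in (0, 1), \\[2.5ex] \displaystyle\int_0^\infty \left[\frac{x^2}{x+t} - 1 + \frac{t}{x+t}\right]\frac{1}{1+t}\,dt & p = 1, \\[2.5ex] \dfrac{1}{p(1-p)}\left[x - c_{p-1}\displaystyle\int_0^\infty \frac{x^2}{x+t}\,t^{p-2}\,dt\right] & p \in (1, 2), \\[2.5ex] \dfrac{1}{2}\,(-x + x^2) & p = 2. \end{cases} \tag{19} \]

Note that for $p \in (0, 2)$ the integrand is supported on $(0, \infty)$. This plays a key role in the equality conditions; therefore, we will henceforth concentrate on $p \in (0, 2)$.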
Theorem 2. The function $J_p(K, A, B)$ defined in (6) is jointly convex in $A, B$.

Proof. It follows from (17) that

\begin{align*} J_p(K, A, B) = a\,\mathrm{Tr}\,K^*AK + \int_0^\infty \bigg[ &\mathrm{Tr}\,K^*A\,\frac{1}{L_A + tR_B}\,(AK) - \frac{\mathrm{Tr}\,KBK^*}{t} \\ &+ \mathrm{Tr}\,BK^*\,\frac{1}{L_A + tR_B}\,(KB) \bigg]\, t\,\nu(t)\,dt. \tag{20} \end{align*}

The joint convexity then follows immediately from that of the map $(X, A, B) \mapsto \mathrm{Tr}\,X^*\frac{1}{L_A + tR_B}(X)$ which was proved in [37] following the strategy in [24]. The proof is also given in the Appendix.

For other approaches, see [30, 31, 11]. The advantage of the argument used here is that it immediately implies that equality holds in joint convexity if and only if it holds for each term in the integrand.
Corollary 3. The relative entropy $H(A, B) = J_1(I, A, B)$ is jointly convex in $A, B$.
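As a sanity check of Corollary 3, joint convexity can be tested on examples: for $0 < \lambda < 1$ one should find $H(\lambda A_1 + (1-\lambda)A_2, \lambda B_1 + (1-\lambda)B_2) \le \lambda H(A_1, B_1) + (1-\lambda)H(A_2, B_2)$. The following sketch does this for a few values of $\lambda$ with real symmetric $2 \times 2$ density matrices; it illustrates the statement and is of course not a proof. The matrices and helper names are illustrative assumptions, and only the standard library is used.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def fun(A, f):
    """f(A) for a real symmetric 2x2 matrix, via its spectral decomposition."""
    a, b, c = A[0][0], A[0][1], A[1][1]
    theta = 0.5 * math.atan2(2 * b, a - c)
    co, si = math.cos(theta), math.sin(theta)
    f1 = f(a * co * co + 2 * b * co * si + c * si * si)
    f2 = f(a * si * si - 2 * b * co * si + c * co * co)
    off = (f1 - f2) * co * si
    return [[f1 * co * co + f2 * si * si, off],
            [off, f1 * si * si + f2 * co * co]]

def H(A, B):
    """Relative entropy Tr A (log A - log B)."""
    logA, logB = fun(A, math.log), fun(B, math.log)
    return trace(matmul(A, logA)) - trace(matmul(A, logB))

def mix(X, Y, lam):
    """Convex combination lam * X + (1 - lam) * Y."""
    return [[lam * X[i][j] + (1 - lam) * Y[i][j] for j in range(2)]
            for i in range(2)]

# Illustrative positive definite density matrices.
A1 = [[0.7, 0.2], [0.2, 0.3]]
B1 = [[0.4, -0.1], [-0.1, 0.6]]
A2 = [[0.2, 0.1], [0.1, 0.8]]
B2 = [[0.5, 0.3], [0.3, 0.5]]

gaps = []
for lam in (0.25, 0.5, 0.75):
    lhs = H(mix(A1, A2, lam), mix(B1, B2, lam))
    rhs = lam * H(A1, B1) + (1 - lam) * H(A2, B2)
    gaps.append(rhs - lhs)  # should be non-negative by joint convexity
print(gaps)
```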

2.2. Extensions with $r \ne 1 - p$

We now consider extensions of Theorem 2 to situations considered by Ando ([4]) and Lieb ([20]) in which $B^{1-p}$ is replaced by $B^r$ with $r \ne 1 - p$. Our approach uses an idea from Bekjan ([6]) and Effros ([11]). We will also show that equality holds in these extensions only under trivial conditions. For this we first need an elementary lemma, which we prove for the concave case.
Lemma 4. Let $f : [0, \infty) \to \mathbb{R}$ be a nonlinear convex or concave operator function, let $A_1, A_2$ be density matrices and $A = \lambda A_1 + (1 - \lambda)A_2$ with $\lambda \in (0, 1)$. Then $f(A) = \lambda f(A_1) + (1 - \lambda)f(A_2)$ if and only if $A_1 = A_2$.

Proof. Since any operator concave function is analytic, nonlinearity implies that $f$ is strictly concave. If $f(A) = \lambda f(A_1) + (1 - \lambda)f(A_2)$, then

\[ \langle v, f(A)v\rangle = \lambda\langle v, f(A_1)v\rangle + (1 - \lambda)\langle v, f(A_2)v\rangle \tag{21} \]

for any vector $v$. Now choose $v$ to be a normalized eigenvector of $A$. Then inserting this on the left above and applying Jensen's inequality to each term on the right, one finds

\[ f(\langle v, Av\rangle) \le \lambda f(\langle v, A_1v\rangle) + (1 - \lambda)f(\langle v, A_2v\rangle). \tag{22} \]

But this contradicts concavity unless equality holds, which implies that $v$ is also an eigenvector of $A_1$ and $A_2$. But then the strict concavity of $f$ also implies that $\langle v, A_1v\rangle = \langle v, A_2v\rangle$. Since this holds for an orthonormal basis of eigenvectors of $A$, $A_1$ and $A_2$, we must have $A_1 = A_2$.

Corollary 5. The function $(A, B) \mapsto \mathrm{Tr}\,K^*A^pKB^r$ is jointly concave on the set of positive definite matrices when $p, r \ge 0$ and $p + r \le 1$. Moreover, when $p + r < 1$ and $K$ is invertible, the concavity is strict unless $B_1 = B_2$ and $A_1 = A_2$.

Proof. It is an immediate consequence of Theorem 2 that $(A, B) \mapsto \mathrm{Tr}\,K^*A^pKB^{1-p}$ is jointly concave in $A, B$. Now write $\mathrm{Tr}\,K^*A^pKB^r = \mathrm{Tr}\,K^*A^pK(B^s)^{1-p}$ with $s = r/(1 - p)$. First, observe that for $0 < s < 1$ the function $f(x) = x^s$ satisfies the hypotheses of Lemma 4. Therefore,

\[ (\lambda B_1 + (1 - \lambda)B_2)^s > \lambda B_1^s + (1 - \lambda)B_2^s \tag{23} \]

with $0 < \lambda < 1$ and $B_1 \ne B_2$. The operator monotonicity of $x \mapsto x^{1-p}$ for $0 < p < 1$ then implies

\[ (\lambda B_1 + (1 - \lambda)B_2)^r > (\lambda B_1^s + (1 - \lambda)B_2^s)^{1-p}, \tag{24} \]

and the joint concavity of $\mathrm{Tr}\,K^*A^pKB^{1-p}$ implies

\begin{align*} \mathrm{Tr}\,K^*A^pK(B^s)^{1-p} &\ge \mathrm{Tr}\,K^*(\lambda A_1 + (1 - \lambda)A_2)^pK(\lambda B_1^s + (1 - \lambda)B_2^s)^{1-p} \\ &\ge \lambda\,\mathrm{Tr}\,K^*A_1^pKB_1^{s(1-p)} + (1 - \lambda)\,\mathrm{Tr}\,K^*A_2^pKB_2^{s(1-p)} \tag{25} \end{align*}

where $A = \lambda A_1 + (1 - \lambda)A_2$, $B = \lambda B_1 + (1 - \lambda)B_2$, which is precisely the joint concavity of $\mathrm{Tr}\,K^*A^pKB^r$. Moreover, equality in joint concavity implies equality in (25) and, since $K^*A^pK$ is strictly positive, this implies equality in (23). Therefore, equality in (25) gives a contradiction unless $B_1 = B_2$. In that case, the joint concavity reduces to concavity in $A$ for which, by a similar argument, equality holds if and only if $A_1 = A_2$.

Corollary 6. The function $(A, B) \mapsto \mathrm{Tr}\,K^*A^pKB^{1-r}$ is jointly convex on the set of positive definite matrices when $1 < r \le p \le 2$. Moreover, when $r < p$ and $K$ is invertible, the convexity is strict unless $B_1 = B_2$ and $A_1 = A_2$.

Proof. The argument is similar to that for Corollary 5. Write $\mathrm{Tr}\, K^* A^p K B^{1-r} = \mathrm{Tr}\, K^* A^p K (B^s)^{1-p}$
with $s = \frac{1-r}{1-p}$. Since $s \in (0, 1)$ and $1 - p \in (-1, 0)$ when
$1 < r < p < 2$, it follows that $x^s$ is operator concave and $x^{1-p}$ is operator monotone
decreasing.

2.3. Monotonicity under partial traces


Let $X$ and $Z$ denote the generalized Pauli operators whose action on the standard
basis is $X|e_k\rangle = |e_{k+1}\rangle$ (with subscript addition mod $d$) and $Z|e_k\rangle = e^{i2\pi k/d}|e_k\rangle$. It
is well known and easy to verify that $\frac{1}{d}\sum_k Z^k A Z^{-k}$ is the projection of a matrix
onto its diagonal. If $D$ is a diagonal matrix, then $\sum_k X^k D X^{-k} = (\mathrm{Tr}\, D) I$. Now
let $\{W_n\}_{n=1,2,\dots,d^2}$ denote some ordering of the generalized Pauli operators, e.g.,
$W_{j+k(d-1)} = X^j Z^k$ with $j, k = 1, 2, \dots, d$. Then $\frac{1}{d}\sum_n W_n A W_n^* = (\mathrm{Tr}\, A) I$ and

\[ \frac{1}{d} \sum_n (W_n \otimes I_2) A_{12} (W_n \otimes I_2)^* = I_1 \otimes (\mathrm{Tr}_1 A) = I_1 \otimes A_2. \tag{26} \]
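The two averaging identities used to obtain (26) can be checked by hand in the smallest case. The following is only a sketch for $d = 2$, an assumption made purely for illustration; the entries $a, b, c, e$ are hypothetical placeholders for an arbitrary $2 \times 2$ matrix.

```latex
% d = 2, so k = 1, 2 and Z = diag(e^{i\pi}, e^{2\pi i}) = diag(-1, 1), Z^2 = I, X^2 = I.
\[
A = \begin{pmatrix} a & b \\ c & e \end{pmatrix}, \qquad
Z A Z^{-1} = \begin{pmatrix} a & -b \\ -c & e \end{pmatrix}, \qquad
\frac{1}{2} \sum_{k=1}^{2} Z^k A Z^{-k} = \begin{pmatrix} a & 0 \\ 0 & e \end{pmatrix} =: D .
\]
% X exchanges e_1 and e_2, so conjugation by X swaps the diagonal entries.
\[
X D X^{-1} = \begin{pmatrix} e & 0 \\ 0 & a \end{pmatrix}, \qquad
\sum_{k=1}^{2} X^k D X^{-k} = (a + e) I = (\mathrm{Tr}\, A) I .
\]
% Combining both steps over the four operators X^j Z^k:
\[
\frac{1}{2} \sum_{j,k=1}^{2} X^j Z^k A Z^{-k} X^{-j}
= \sum_{j=1}^{2} X^j D X^{-j} = (\mathrm{Tr}\, A) I .
\]
```

The same two-step pattern (first remove the off-diagonal part with powers of $Z$, then average the diagonal with powers of $X$) is what gives the general statement for arbitrary $d$.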

Using the fact that replacing $W_n$ by $U W_n U^*$ with $U$ unitary simply corresponds
to a change of basis, which does not affect (26), and then multiplying both sides by
$U^* \otimes I_2$ on the left and $U \otimes I_2$ on the right gives the equivalent expression

\[ \frac{1}{d} \sum_n (W_n U^* \otimes I_2) A_{12} (W_n U^* \otimes I_2)^* = I_1 \otimes A_2. \tag{27} \]

Combining this with joint convexity yields a slight generalization of the well-known
monotonicity of $J_p(K, A, B)$ under partial traces (MPT), first proved by Lieb in [20]
for the case $K_{12} = I_1 \otimes K_2$ when $p \in (0, 1)$.

Theorem 7. Let $J_p$ be as in (7), $A_{12}, B_{12}$ strictly positive in $M_{d_1} \otimes M_{d_2}$ and
$K_{12} = V_1 \otimes K_2$ with $V_1$ unitary in $M_{d_1}$. Then

\[ J_p(K_2, A_2, B_2) \le J_p(K_{12}, A_{12}, B_{12}). \tag{28} \]

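Proof. Writing $\mathcal{W}_n$ for $W_n \otimes I_2$ and $V$ for $V_1 \otimes I_2$ and using (27) gives

\begin{align*}
J_p(K_2, A_2, B_2) &= \frac{1}{d_1} J_p(I_1 \otimes K_2, I_1 \otimes A_2, I_1 \otimes B_2) \\
&= \frac{1}{d_1} J_p\Big(I_1 \otimes K_2, \frac{1}{d_1} \sum_n \mathcal{W}_n V^* A_{12} V \mathcal{W}_n^*, \frac{1}{d_1} \sum_n \mathcal{W}_n B_{12} \mathcal{W}_n^*\Big) \\
&\le \frac{1}{d_1^2} \sum_n J_p\big(I_1 \otimes K_2, \mathcal{W}_n (V_1 \otimes I_2)^* A_{12} (V_1 \otimes I_2) \mathcal{W}_n^*, \mathcal{W}_n B_{12} \mathcal{W}_n^*\big) \\
&= J_p(V_1 \otimes K_2, A_{12}, B_{12})
\end{align*}

where the final equality follows from the unitary invariance of the trace.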

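Because $\mathrm{Tr}_{12} (V_1 \otimes K_2)^* A_{12} (V_1 \otimes K_2) = \mathrm{Tr}_2 K_2^* A_2 K_2$, (28) is equivalent to

\[ \mathrm{Tr}\, K_2^* A_2^p K_2 B_2^{1-p} - \mathrm{Tr}\, (V_1 \otimes K_2)^* A_{12}^p (V_1 \otimes K_2) B_{12}^{1-p} \;
\begin{cases} \ge 0, & p \in (0, 1), \\ \le 0, & p \in (1, 2). \end{cases} \tag{29} \]

We can obtain a weak reversal of this for $p \in (0, 1)$. The argument in the Appendix
shows that for any $p$ and fixed $A, B \ge 0$ both $\mathrm{Tr}\, K^* A^p K B^{1-p}$ and $\mathrm{Tr}\, K^* A K$ are
convex in $K$. This was observed earlier by Lieb ([20]) and also follows from the
results in [24]. One can then apply the argument above in the special case $A_{12} = I_1 \otimes A_2$,
$B_{12} = I_1 \otimes B_2$ to conclude that

\begin{align}
\mathrm{Tr}\, K_2^* A_2^p K_2 B_2^{1-p} &\le \frac{1}{d_1} \mathrm{Tr}\, K_{12}^* (I_1 \otimes A_2)^p K_{12} (I_1 \otimes B_2)^{1-p} \tag{30} \\
&\le \mathrm{Tr}\, K_{12}^* (I_1 \otimes A_2)^p K_{12} (I_1 \otimes B_2)^{1-p} \tag{31}
\end{align}

independent of whether $p < 1$ or $p > 1$. However, because the term $\mathrm{Tr}\, K^* A K$ is
convex rather than linear in $K$, (30) does not allow us to draw any conclusions
about the monotonicity of $J_p(K_{12}, I_1 \otimes A_2, I_1 \otimes B_2)$.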
To prove Theorem 7 we showed that joint convexity implies monotonicity; the
reverse implication also holds. Let $A_1, \dots, A_m$, $B_1, \dots, B_m$ be positive definite
matrices in $M_d$, $A = \sum_j A_j$, $B = \sum_j B_j$, and put

\[ \widetilde{A}_{12} = \sum_j |e_j\rangle\langle e_j| \otimes A_j, \qquad \widetilde{B}_{12} = \sum_j |e_j\rangle\langle e_j| \otimes B_j, \tag{32} \]

for $e_1, \dots, e_m$ the standard basis of $\mathbb{C}^m$. Then $\widetilde{A}_{12}$ and $\widetilde{B}_{12}$ are block diagonal,
and $\widetilde{A}_2 = \mathrm{Tr}_1 \widetilde{A}_{12} = \sum_k A_k = A$ and similarly for $B$. Then if monotonicity under
partial traces holds, one can conclude that

\[ J_p(K, A, B) = J_p(K, \widetilde{A}_2, \widetilde{B}_2) \le J_p(I_1 \otimes K, \widetilde{A}_{12}, \widetilde{B}_{12}) = \sum_j J_p(K, A_j, B_j). \tag{33} \]

Thus, monotonicity under partial traces also directly implies joint convexity of $J_p$.
Applying (28) in the case $K = I$, and $A_{12} \to A_{123}$ and $B_{12} \to A_{12} \otimes I_3$ gives

\[ J_p(I_{23}, A_{23}, A_2 \otimes I_3) \le J_p(I_{123}, A_{123}, A_{12} \otimes I_3). \tag{34} \]

When $p = 1$, it follows from (7) that

\[ J_1(I_{23}, A_{23}, A_2 \otimes I_3) = H(A_{23}, A_2 \otimes I_3) = -S(A_{23}) + S(A_2) \]

where $S(A) = -\mathrm{Tr}\, A \log A$. Thus, (34) becomes

\[ -S(A_{23}) + S(A_2) \le -S(A_{123}) + S(A_{12}) \]

or, equivalently

\[ S(A_2) + S(A_{123}) \le S(A_{12}) + S(A_{23}) \tag{35} \]

which is the standard form of SSA.
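The step from the relative entropy to the entropy difference can be spelled out as follows. This is a sketch that assumes the usual convention $H(A, B) = \mathrm{Tr}\, A \log A - \mathrm{Tr}\, A \log B$, which is consistent with $S(A) = -\mathrm{Tr}\, A \log A$ as used above but is not restated in this part of the text.

```latex
% Assumed convention: H(A,B) = Tr A log A - Tr A log B.
\begin{align*}
H(A_{23}, A_2 \otimes I_3)
  &= \mathrm{Tr}_{23}\, A_{23} \log A_{23} - \mathrm{Tr}_{23}\, A_{23} \log (A_2 \otimes I_3) \\
  &= -S(A_{23}) - \mathrm{Tr}_{23}\, A_{23} \big( (\log A_2) \otimes I_3 \big) \\
  &= -S(A_{23}) - \mathrm{Tr}_{2}\, (\mathrm{Tr}_3 A_{23}) \log A_2 \\
  &= -S(A_{23}) - \mathrm{Tr}_{2}\, A_2 \log A_2
   = -S(A_{23}) + S(A_2).
\end{align*}
```

The same computation with $A_{123}$ and $A_{12} \otimes I_3$ in place of $A_{23}$ and $A_2 \otimes I_3$ gives the right-hand side $-S(A_{123}) + S(A_{12})$.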

3. Equality for Joint Convexity of $J_p(K, A, B)$

3.1. Origin of necessary and sufficient conditions
Looking back at the proof of Theorem 2, we see that for $p \in (0, 2)$, equality holds in
the joint convexity of $J_p(K, A, B)$ if and only if equality holds in the joint convexity
for each term in the integrand in (17). It should be clear from the argument given
in the Appendix that this requires $M_j = 0$ for all $j$, with $M_j$ given by (70). This is
easily seen to be equivalent to

\[ (L_{A_j} + tR_{B_j})^{-1}(X_j) = (L_A + tR_B)^{-1}(X) \quad \text{for all } j, \tag{36} \]

with $A = \sum_j A_j$, $B = \sum_j B_j$, and $X = \sum_j X_j$ with $X_j = A_j K$ and/or $X_j = K B_j$.
By writing $AK = L_A(K)$ in the former case and $KB = R_B(K)$ in the latter we
obtain the conditions

\begin{align}
(I + t\Delta_{A_j B_j}^{-1})^{-1}(K) &= (I + t\Delta_{AB}^{-1})^{-1}(K) \quad \forall j\ \forall t > 0, \tag{37a} \\
(\Delta_{A_j B_j} + tI)^{-1}(K) &= (\Delta_{AB} + tI)^{-1}(K) \quad \forall j\ \forall t > 0. \tag{37b}
\end{align}

From the integral representations (19), one might expect it to be necessary for
either or both of (37a) and (37b) to hold depending on $p$. In fact, either will suffice
because (37a) holds if and only if (37b) holds. Because $\Delta_{AB}$ is positive definite, by
analytic continuation (37b) extends from $t > 0$ to the entire complex plane, except
points $t$ on the negative real axis for which $-t \in \mathrm{spectrum}(\Delta_{AB})$. Therefore, by
using the Cauchy integral formula, one finds that for any function $G$ analytic on
$\mathbb{C} \setminus (-\infty, 0]$, $G(\Delta_{A_j B_j})(K) = G(\Delta_{AB})(K)$.
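The equivalence of (37a) and (37b) comes down to a resolvent identity. The following sketch writes $\Delta$ for either of the two positive definite operators that appear in (37a) and (37b); this shorthand is introduced here only for the illustration.

```latex
% \Delta > 0 and t > 0, so both \Delta and \Delta + tI are invertible.
\begin{align*}
(I + t\Delta^{-1})^{-1}
  &= \big( (\Delta + tI)\Delta^{-1} \big)^{-1}
   = \Delta (\Delta + tI)^{-1} \\
  &= \big( (\Delta + tI) - tI \big)(\Delta + tI)^{-1}
   = I - t(\Delta + tI)^{-1}.
\end{align*}
% Applying both sides to K:
\[
(I + t\Delta^{-1})^{-1}(K) = K - t(\Delta + tI)^{-1}(K).
\]
```

Since $K$ and $t$ are the same on both sides of (37a), the two sides of (37a) agree for a given $t > 0$ exactly when the two sides of (37b) agree for that $t$.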
 
Theorem 8. For fixed $K$, and $A = \sum_j A_j$, $B = \sum_j B_j$, the following are equivalent

(a) $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ for all $p \in (0, 2)$.
(b) $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ for some $p \in (0, 2)$.
(c) $(\Delta_{A_j B_j} + tI)^{-1}(K) = (\Delta_{AB} + tI)^{-1}(K)$ for all $j$ and for all $t > 0$.
(d) $A_j^{it} K B_j^{-it} = A^{it} K B^{-it}$ for all $j$ and for all $t > 0$.
(e) $(\log A - \log A_j) K = K (\log B - \log B_j)$ for all $j$.

Proof. Clearly (a) $\Rightarrow$ (b). The implications (b) $\Rightarrow$ (c) $\Rightarrow$ (d), as well as (b) $\Rightarrow$ (a),
follow from the discussion above. Differentiation of (d) at $t = 0$ gives (d) $\Rightarrow$ (e),
and it is straightforward to verify that (e) $\Rightarrow$ (b) with $p = 1$. Moreover, (d) implies
$\sum_j \mathrm{Tr}\, K^* A_j^{it} K B_j^{1-it} = \mathrm{Tr}\, K^* A^{it} K B^{1-it}$ for all $t$, which implies (a) by analytic
continuation.
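The differentiation step (d) $\Rightarrow$ (e) can be written out explicitly. This is a sketch for positive definite $A_j$, $B_j$, $A$, $B$, using $\frac{d}{dt} A^{it} = i (\log A) A^{it}$ and taking the derivative at $t = 0$ as a one-sided limit.

```latex
\begin{align*}
\frac{d}{dt}\Big|_{t=0} A_j^{it} K B_j^{-it}
  &= i (\log A_j) K - i K (\log B_j), \\
\frac{d}{dt}\Big|_{t=0} A^{it} K B^{-it}
  &= i (\log A) K - i K (\log B).
\end{align*}
% Equating the two derivatives and dividing by i:
\[
(\log A_j) K - K \log B_j = (\log A) K - K \log B
\quad\Longleftrightarrow\quad
(\log A - \log A_j) K = K (\log B - \log B_j).
\]
```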

3.2. Sufficient subalgebras

When $K = I$, we can obtain a more useful reformulation of the equality conditions
by using results about sufficient subalgebras obtained in [14, 15, 33]. Since the
definition and convexity properties of $J_p(I, A, B)$ extend by continuity to positive

semidefinite matrices, with $\ker B \subseteq \ker A$, we will formulate the conditions in this
more general situation, using the conventions in Sec. 1.2.
Let $N \subseteq M_d$ be a subalgebra, then there is a trace preserving conditional
expectation $E_N$ from $M_d$ onto $N$, such that $\mathrm{Tr}\, AX = \mathrm{Tr}\, E_N(A) X$ for all $X \in N$.
In particular, if $N = M_{d_1} \otimes I \subseteq M_{d_1} \otimes M_{d_2}$, then we have $E_N(A_{12}) = \mathrm{Tr}_2 A_{12} \otimes \frac{1}{d_2} I$.
Let $Q_1, \dots, Q_m \in M_d^+$ and assume that $\ker Q_m \subseteq \ker Q_j$ for all $j$. The subalgebra
$N$ is said to be sufficient for $\{Q_1, \dots, Q_m\}$ if there is a completely positive
trace preserving map $T : N \to M_d$, such that $T(E_N(Q_j)) = Q_j$ for all $j = 1, \dots, m$.
This definition is due to Petz ([33, 32]) and it is a quantum generalization of the
well known notion of sufficiency from classical statistics. In [33], it was shown that
sufficient subalgebras can be characterized by the condition

\[ H(Q_j, Q_m) = H(E_N(Q_j), E_N(Q_m)), \quad \text{for all } j. \]

We combine this with the results of the previous section to obtain other useful
characterizations of sufficiency.

Theorem 9. Let $Q_1, \dots, Q_m \in M_d^+$ be such that $\ker Q_m \subseteq \ker Q_j$ for all $j$. Let
$N \subseteq M_d$ be a subalgebra. The following are equivalent.

(i) $N$ is sufficient for $\{Q_1, \dots, Q_m\}$.
(ii) $E_N(Q_j)^{it} E_N(Q_m)^{-it} P_{(\ker Q_m)^\perp} = Q_j^{it} Q_m^{-it}$, for all $j$, $t \in \mathbb{R}$.
(iii) There exist $Q_{j,0} \in N^+$, and $D \in M_d^+$, such that $\ker D = \ker Q_m$, and $Q_j = Q_{j,0} D$ for $j = 1, \dots, m$.
(iv) $J_p(I, Q_j, Q_m) = J_p(I, E_N(Q_j), E_N(Q_m))$ for all $j$ and some $p \in (0, 1)$.

The proof of the conditions (i)–(iii) can be found in [14], see also [28]. The
condition (iv) was proved in [15].

3.3. Equality conditions with K = I


Theorem 10. Let $A_1, \dots, A_m$ and $B_1, \dots, B_m$ be positive semi-definite matrices
with $\ker B_j \subseteq \ker A_j$, and let $A = \sum_j A_j$, $B = \sum_j B_j$. Then the following are
equivalent.

(a) $J_p(I, A, B) = \sum_j J_p(I, A_j, B_j)$ for all $p \in (0, 2)$.
(b) $J_p(I, A, B) = \sum_j J_p(I, A_j, B_j)$ for some $p \in (0, 2)$.
(c) $A_j^{it} B_j^{-it} = A^{it} B^{-it} P_{(\ker B_j)^\perp}$ for all $j$ and $t \in \mathbb{R}$.
(d) There are positive matrices $D_1, \dots, D_m$, with $\ker D_j = \ker B_j$, such that
$[A_j, D_j] = [B_j, D_j] = 0$, and with $D = \sum_j D_j$,

\[ A_j = A D^{-1} D_j, \qquad B_j = B D^{-1} D_j. \tag{38} \]

Proof. As in Sec. 3.1, (b) implies (36) on $(\ker B_j)^\perp$, with $X_j = B_j$, $X = B$. This
gives

\[ (\Delta_{A_j B_j} + tI)^{-1}(I) = (\Delta_{AB} + tI)^{-1}(I) \quad \text{on } (\ker B_j)^\perp. \tag{39} \]

Then (c) follows from the Cauchy integral formula as in Sec. 3.1.

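To show (c) implies (d), we will use Theorem 9. First let $N = I \otimes M_d \subseteq M_m \otimes M_d$
and let $\widetilde{A}_{12}$, $\widetilde{B}_{12}$ be the block-diagonal matrices in $M_m \otimes M_d$, defined
by (32). Clearly, we have $\ker \widetilde{A}_{12} \supseteq \ker \widetilde{B}_{12} = \sum_j |e_j\rangle\langle e_j| \otimes \ker B_j$ and
$E_N(\widetilde{A}_{12}) = \frac{1}{m} I \otimes A$, $E_N(\widetilde{B}_{12}) = \frac{1}{m} I \otimes B$. Then (c) implies

\[ E_N(\widetilde{A}_{12})^{it} E_N(\widetilde{B}_{12})^{-it} P_{(\ker \widetilde{B}_{12})^\perp} = \widetilde{A}_{12}^{it} \widetilde{B}_{12}^{-it} \]

for all $t$. Then by using Theorem 9 with $Q_1 = \widetilde{A}_{12}$, $Q_m = Q_2 = \widetilde{B}_{12}$, we
can conclude that there are positive matrices $A_0, B_0 \in M_d$ and $D_{12} \in (M_m \otimes M_d)^+$,
such that $\ker D_{12} = \ker \widetilde{B}_{12}$, $[I \otimes A_0, D_{12}] = [I \otimes B_0, D_{12}] = 0$ and

\[ \widetilde{A}_{12} = (I \otimes A_0) D_{12}, \qquad \widetilde{B}_{12} = (I \otimes B_0) D_{12}. \tag{40} \]

Since $\widetilde{A}_{12}$, $\widetilde{B}_{12}$ are block diagonal, $D_{12} = \sum_j |e_j\rangle\langle e_j| \otimes D_j$ must also be block
diagonal with $D_j \in M_d^+$, $\ker D_j = \ker B_j$, $[A_0, D_j] = [B_0, D_j] = 0$ for all $j$ and

\[ A_j = A_0 D_j, \qquad B_j = B_0 D_j. \tag{41} \]

Taking $\mathrm{Tr}_1$ in (40) gives $A = A_0 D$ and $B = B_0 D$. Using this in (41) gives (38)
which proves (d). The implications (d) $\Rightarrow$ (a) $\Rightarrow$ (b) are straightforward.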

We return briefly to the case of arbitrary $K$. Note that if the condition (d) holds
and $[D_j, K] = 0$ for all $j$, then $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ for all $p \in (0, 2)$;
this gives a sufficient, but not necessary, condition for equality if $K \ne I$. The next
result reduces the case of $K$ unitary to $K = I$. Then, we can apply the conditions
of Theorem 10 to $A_j$ and $K B_j K^*$.

Theorem 11. If $K$ is unitary, then $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ if and only if
$J_p(I, A, KBK^*) = \sum_j J_p(I, A_j, K B_j K^*)$.

Proof. When $K$ is unitary, then $K B^p K^* = (KBK^*)^p$ which implies $J_p(K, A, B) = J_p(I, A, KBK^*)$.
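The reduction in this proof can be made explicit at the level of traces. The sketch below assumes that, for $p \ne 1$, $J_p(K, A, B)$ depends on $K$ only through $\mathrm{Tr}\, K^* A K$ and $\mathrm{Tr}\, K^* A^p K B^{1-p}$, which is how it enters (29); the definition (7) itself is not restated here, so this is an assumption of the illustration.

```latex
% K unitary: K K^* = K^* K = I, and K B^{1-p} K^* = (K B K^*)^{1-p}.
\begin{align*}
\mathrm{Tr}\, K^* A^p K B^{1-p}
  &= \mathrm{Tr}\, A^p \, K B^{1-p} K^*
   = \mathrm{Tr}\, A^p (K B K^*)^{1-p}, \\
\mathrm{Tr}\, K^* A K
  &= \mathrm{Tr}\, A \, K K^* = \mathrm{Tr}\, A .
\end{align*}
```

Both traces therefore coincide with the corresponding traces for the triple $(I, A, KBK^*)$, and the same holds term by term for $(K, A_j, B_j)$ and $(I, A_j, K B_j K^*)$.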

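One can try to extend the results of this section to the case $\|K\| \le 1$, and hence
to all $K$, by using the unitary dilation

\[ U = \begin{pmatrix} K & L \\ -L & K \end{pmatrix} \]

where $L = U(1 - |K|^2)^{1/2}$ and $K = U|K|$ is the polar decomposition. Then, with

\[ \mathbf{A} = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}, \qquad \mathbf{B} = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix} \]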

we have $J_p(K, A, B) = J_p(U, \mathbf{A}, \mathbf{B})$, so that we may use Theorem 11 to
get conditions for equality. But note that the conditions of Theorem 10
require that $\ker U \mathbf{B}_j U^* \subseteq \ker \mathbf{A}_j$ and it can be shown that this implies
$P_{(\ker A_j)^\perp} K P_{(\ker B_j)^\perp} K^* = P_{(\ker A_j)^\perp}$, where $P_N$ denotes a projection onto the
subscripted space. In particular, if all $A_j$ and $B_j$ are invertible, this restricts us to
unitary $K$.

3.4. Equality in monotonicity under partial trace


It is easy to see that when $A_{12} = A_1 \otimes A_2$ and $B_{12} = B_1 \otimes B_2$, then $J_p(I, A_{12}, B_{12}) = J_p(I, A_2, B_2)$
if and only if $A_1 = B_1$ with $\mathrm{Tr}\, A_1 = 1$. However, it is not necessary
that $A_{12} = A_1 \otimes A_2$. The equality conditions are given by the following theorem.

Theorem 12. Let $K_{12} = I_{12}$ and $A_{12}, B_{12} \in B(H_1 \otimes H_2)^+$, with $\ker B_{12} \subseteq \ker A_{12}$.
Equality holds in (28) if and only if

(i) $H_2 = \bigoplus_n H_n^L \otimes H_n^R$,
(ii) $A_{12} = \bigoplus_n A_n^L \otimes A_n^R$ with $A_n^L \in B(H_1 \otimes H_n^L)^+$ and $A_n^R \in B(H_n^R)^+$,
(iii) $B_{12} = \bigoplus_n B_n^L \otimes B_n^R$ with $B_n^L \in B(H_1 \otimes H_n^L)^+$ and $B_n^R \in B(H_n^R)^+$,
(iv) $A_n^L = B_n^L$ for all $n$.

Proof. Let us denote $A_j = \frac{1}{d_1} \mathcal{W}_j A_{12} \mathcal{W}_j^*$, $B_j = \frac{1}{d_1} \mathcal{W}_j B_{12} \mathcal{W}_j^*$, with $\mathcal{W}_j$ defined as
in the proof of Theorem 7. Then we get that equality in (28) is equivalent to

\[ J_p\Big(I_{12}, \sum_j A_j, \sum_j B_j\Big) = \sum_j J_p(I_{12}, A_j, B_j). \]

By Theorem 10, equality for some $p$ implies equality for all $p$, so that
$J_p(I_{12}, A_{12}, B_{12}) = J_p(I_2, \mathrm{Tr}_1 A_{12}, \mathrm{Tr}_1 B_{12}) = J_p(I_{12}, E_N(A_{12}), E_N(B_{12}))$ for $p \in (0, 1)$,
where $N$ is the subalgebra $I_1 \otimes B(H_2) \subseteq B(H_1 \otimes H_2)$. Hence $N$ is sufficient
for $\{A_{12}, B_{12}\}$ and, by Theorem 9, there are some $A_R, B_R \in B(H_2)^+$ and
$D \in B(H_1 \otimes H_2)^+$, $\ker D = \ker B_{12}$, such that $[(I_1 \otimes A_R), D] = [(I_1 \otimes B_R), D] = 0$ and

\[ A_{12} = D(I_1 \otimes A_R), \qquad B_{12} = D(I_1 \otimes B_R). \tag{42} \]

Now let $M_1$ be the subalgebra in $B(H_2)$ generated by $A_R$, $B_R$. Then
$D \in (I_1 \otimes M_1)' = B(H_1) \otimes M_1'$ where $M'$ denotes the commutant of $M$. There is a
decomposition $H_2 = \bigoplus_n H_n^L \otimes H_n^R$, such that

\[ M_1' = \bigoplus_n B(H_n^L) \otimes 1_n^R, \qquad M_1 = \bigoplus_n 1_n^L \otimes B(H_n^R) \]

and $D = \bigoplus_n D_n \otimes 1_n^R$, where $D_n \in B(H_1 \otimes H_n^L)$. Since $A_R, B_R \in M_1$, we get the
result, with $A_n^L = B_n^L = D_n$. The converse can be verified directly.

Applying this result in the case $A_{12} \to A_{123}$ and $B_{12} \to A_{12} \otimes I_3$ gives equality
conditions in (34). Since these are independent of $p$, they are identical to the
conditions, first given in [13], for equality in SSA (35) which corresponds to $p = 1$.

Corollary 13. Equality holds in (34) if and only if

(i) $H_2 = \bigoplus_n H_n^L \otimes H_n^R$.
(ii) $A_{123} = \bigoplus_n A_n^L \otimes A_n^R$ with $A_n^L \in B(H_1 \otimes H_n^L)$ and $A_n^R \in B(H_n^R \otimes H_3)$.

Proof. It suffices to let $A_{12} \to A_{123}$ and $B_{12} \to A_{12} \otimes I_3$ in Theorem 12.

To apply these results in Sec. 4, it is useful to observe that condition (ii) in
Corollary 13 above can be written as

\[ A_{123} = (F_L \otimes I_3)(I_1 \otimes F_R) \tag{43} \]

with $F_L \in B(H_1 \otimes H_2)^+$, $F_R \in B(H_2 \otimes H_3)^+$, $[F_L \otimes I_3, I_1 \otimes F_R] = 0$. Combining
this with part (d) of Theorem 10 gives the following useful result, which essentially
allows us to bypass the need to apply Theorem 10 to $J_p(I, A_j, W_n A_j W_n^*)$.

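Corollary 14. Let $A_j \in M_{d_1} \otimes M_{d_2}$, $A = \sum_j A_j$. Then

\[ J_p(I_{12}, A, (\mathrm{Tr}_2 A) \otimes I_2) = \sum_j J_p(I_{12}, A_j, (\mathrm{Tr}_2 A_j) \otimes I_2) \tag{44} \]

if and only if there are $D_j \in M_{d_1}^+$, such that $\ker D_j = \ker \mathrm{Tr}_2 A_j$, $[A_j, D_j \otimes I] = 0$
and $A_j = A(D^{-1} D_j \otimes I)$ with $D = \sum_j D_j$.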

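Proof. Let $\widetilde{A}_{123} = \sum_j |e_j\rangle\langle e_j| \otimes A_j \in M_m \otimes M_{d_1} \otimes M_{d_2}$, then $A = \widetilde{A}_{23} \in M_{d_1} \otimes M_{d_2}$
and (44) can be written as

\[ J_p(I_{23}, \widetilde{A}_{23}, \widetilde{A}_2 \otimes I_3) = J_p(I_{123}, \widetilde{A}_{123}, \widetilde{A}_{12} \otimes I_3). \]

By (43), this is equivalent to the existence of $F_L$ and $F_R$, $[(F_L \otimes I_3), (I_1 \otimes F_R)] = 0$,
such that $\widetilde{A}_{123} = (F_L \otimes I_3)(I_1 \otimes F_R)$. Since $\widetilde{A}_{(1)(23)}$ is block-diagonal, $F_L$ must
be of the form $F_L = \sum_j |e_j\rangle\langle e_j| \otimes D_j$, so that $A_j = F_R(D_j \otimes I)$. Then $\mathrm{Tr}_2 A_j = D_j \mathrm{Tr}_2 F_R$,
which implies that $\ker D_j \subseteq \ker \mathrm{Tr}_2 A_j$. If we let $P_j = P_{(\ker \mathrm{Tr}_2 A_j)^\perp}$, then
$P_j$ commutes with $D_j$ and

\[ A_j = (P_j \otimes I) A_j = (P_j D_j \otimes I) F_R, \]

so that we can assume that $\ker D_j = \ker \mathrm{Tr}_2 A_j$, by taking $P_j D_j$ instead of $D_j$.
Taking $\mathrm{Tr}_1$ of (43) gives $A = (D \otimes I_3) F_R = F_R (D \otimes I_3)$ so that $A_j = A(D^{-1} D_j \otimes I)$.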

4. Equality in Joint Convexity of Carlen–Lieb

Carlen and Lieb [8] obtained several convexity inequalities from those of the map

\[ \Upsilon_{p,q}(K, A) \equiv \mathrm{Tr}\,(K^* A^p K)^{q/p} \tag{45} \]



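using an identity which we write only for $q = 1$ and $p > 1$ in our notation as

\[ \Upsilon_{p,1}(K, A) = (p - 1) \inf\Big\{ J_p(K, A, X) + \frac{1}{p} \mathrm{Tr}\, X + \frac{1}{p(p-1)} \mathrm{Tr}\, K^* A K : X > 0 \Big\}. \tag{46} \]

We introduce the closely related quantity

\begin{align}
\widehat{\Upsilon}_{p,1}(K, A) &= \inf\Big\{ J_p(K, A, X) + \frac{1}{p} \mathrm{Tr}\, X : X > 0 \Big\} \tag{47} \\
&= \frac{1}{p-1} \Big[ \Upsilon_{p,1}(K, A) - \frac{1}{p} \mathrm{Tr}\, K^* A K \Big] \tag{48}
\end{align}

which is well-defined for all $p \in (0, 2)$ and allows us to continue to treat the
cases $p < 1$ and $p > 1$ simultaneously, as well as include the special case $p = 1$
for which

\begin{align}
\widehat{\Upsilon}_{1,1}(K, A) &= -\mathrm{Tr}\, K^* A K \log(K^* A K) + \mathrm{Tr}\, K^* (A \log A) K + \mathrm{Tr}\, K^* A K \notag \\
&= S(K^* A K) + \mathrm{Tr}\, K K^* A \log A + \mathrm{Tr}\, K^* A K. \tag{49}
\end{align}

Since we are dealing with finite dimensional spaces, the infimum in (46) has a
minimizer which satisfies

\[ X_{\min} = (K^* A^p K)^{1/p}. \tag{50} \]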
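The passage from (47) to (48) can be verified by inserting the minimizer (50). The sketch below assumes that, for $p \ne 1$, $J_p(K, A, X) = \frac{1}{p(1-p)} \big[ \mathrm{Tr}\, K^* A K - \mathrm{Tr}\, K^* A^p K X^{1-p} \big]$; this normalization is inferred from (29) and (48) and is an assumption of the illustration, since (7) is not restated here. The symbol $Y$ is a shorthand introduced only for this computation.

```latex
% Shorthand: Y = K^* A^p K, so X_min = Y^{1/p} and
% Tr Y X_min^{1-p} = Tr Y^{1 + (1-p)/p} = Tr Y^{1/p}.
\begin{align*}
J_p(K, A, X_{\min}) + \frac{1}{p} \mathrm{Tr}\, X_{\min}
  &= \frac{1}{p(1-p)} \mathrm{Tr}\, K^* A K
     - \frac{1}{p(1-p)} \mathrm{Tr}\, Y^{1/p}
     + \frac{1}{p} \mathrm{Tr}\, Y^{1/p} \\
  &= \frac{1}{p(1-p)} \mathrm{Tr}\, K^* A K
     - \frac{1}{1-p} \mathrm{Tr}\, Y^{1/p} \\
  &= \frac{1}{p-1} \Big[ \mathrm{Tr}\,(K^* A^p K)^{1/p} - \frac{1}{p} \mathrm{Tr}\, K^* A K \Big].
\end{align*}
```

Multiplying by $(p - 1)$ and adding $\frac{1}{p} \mathrm{Tr}\, K^* A K$ recovers $\mathrm{Tr}\,(K^* A^p K)^{1/p}$, in agreement with (46).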

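For fixed $K$, let $X_j$ denote the minimizer associated with $A_j$. Then

\begin{align}
\widehat{\Upsilon}_{p,1}(K, A_1) + \widehat{\Upsilon}_{p,1}(K, A_2)
  &= J_p(K, A_1, X_1) + \frac{1}{p} \mathrm{Tr}\, X_1 + J_p(K, A_2, X_2) + \frac{1}{p} \mathrm{Tr}\, X_2 \notag \\
  &\ge J_p(K, A_1 + A_2, X_1 + X_2) + \frac{1}{p} \mathrm{Tr}\,(X_1 + X_2) \tag{51} \\
  &\ge \inf\Big\{ J_p(K, A_1 + A_2, X) + \frac{1}{p} \mathrm{Tr}\, X : X > 0 \Big\} \notag \\
  &= \widehat{\Upsilon}_{p,1}(K, A_1 + A_2) \tag{52}
\end{align}

which proves convexity of $\widehat{\Upsilon}_{p,1}$. Note that equality above requires both $X = \sum_j X_j$
and $J_p(K, A, X) = \sum_j J_p(K, A_j, X_j)$, where $X$ is the minimizer associated
with $A$.
Now we introduce some notation following the strategy in the published version
of [8]. Let $|\mathbb{1}\rangle$ denote the vector $(1, 1, \dots, 1)$ with all components 1 and $|e_1\rangle$ the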

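vector $(1, 0, \dots, 0)$. Define

\[ K = I \otimes |\mathbb{1}\rangle\langle e_1| =
\begin{pmatrix} I & 0 & \dots & 0 \\ I & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ I & 0 & \dots & 0 \end{pmatrix} \tag{53} \]

and

\[ A_j = \sum_k A_{jk} \otimes |e_k\rangle\langle e_k| =
\begin{pmatrix} A_{j1} & 0 & 0 & \dots \\ 0 & A_{j2} & 0 & \dots \\ 0 & 0 & A_{j3} & \dots \\ \vdots & \vdots & & \ddots \end{pmatrix}
= \bigoplus_k A_{jk}, \tag{54} \]

and $A = \sum_j A_j = \sum_k A_k \otimes |e_k\rangle\langle e_k| = \bigoplus_k A_k$ with $A_k = \sum_j A_{jk}$. Then

\[ K^* A^p K = \sum_k A_k^p \otimes |e_1\rangle\langle e_1|. \]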
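The block computation behind the last display can be written out in a few lines. This is a sketch that takes $K = I \otimes |\mathbb{1}\rangle\langle e_1|$ without any normalization factor, an assumption chosen so that the result matches the stated formula, and uses $\langle \mathbb{1} | e_k \rangle = 1$ for every $k$.

```latex
% A is block diagonal, so its p-th power acts block by block.
\begin{align*}
A^p &= \sum_k A_k^p \otimes |e_k\rangle\langle e_k|, \qquad
K^* = I \otimes |e_1\rangle\langle \mathbb{1}|, \\
K^* A^p K
  &= \sum_k A_k^p \otimes
     |e_1\rangle \langle \mathbb{1} | e_k \rangle \langle e_k | \mathbb{1} \rangle \langle e_1| \\
  &= \sum_k A_k^p \otimes |e_1\rangle\langle e_1| .
\end{align*}
% Consequently
\[
\mathrm{Tr}\,(K^* A^p K)^{1/p} = \mathrm{Tr}\,\Big( \sum_k A_k^p \Big)^{1/p}.
\]
```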

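With this notation, we make some definitions following Carlen and Lieb but modified
to allow a unified treatment of $p \in (0, 2)$.

\begin{align}
\Phi_{(p,1)}(A) = \Phi_{(p,1)}(A_1, A_2, A_3, \dots) &\equiv \Upsilon_{p,1}(K, A) \notag \\
&= \mathrm{Tr}\,(A_1^p + A_2^p + A_3^p + \cdots)^{1/p}, \tag{55} \\
\widehat{\Phi}_{(p,1)}(A) = \widehat{\Phi}_{(p,1)}(A_1, A_2, A_3, \dots) &\equiv \widehat{\Upsilon}_{p,1}(K, A) \notag \\
&= \frac{1}{p-1} \Big[ \Phi_{(p,1)}(A_1, A_2, A_3, \dots) - \frac{1}{p} \mathrm{Tr} \sum_k A_k \Big]. \tag{56}
\end{align}

The definitions of $\Phi$ and $\widehat{\Phi}$ apply only when $A$ is a block diagonal matrix in
$M_{d_1} \otimes M_{d_2}$. We now extend this to arbitrary matrices $A_{12} \in M_{d_1} \otimes M_{d_2}$.

\begin{align}
\Psi_{(p,1)}(A_{12}) &\equiv \mathrm{Tr}_1 (\mathrm{Tr}_2 A_{12}^p)^{1/p}, \tag{57} \\
\widehat{\Psi}_{(p,1)}(A_{12}) &\equiv \frac{1}{p-1} \Big[ \Psi_{(p,1)}(A_{12}) - \frac{1}{p} \mathrm{Tr}\, A_{12} \Big]. \tag{58}
\end{align}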

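For $p = 1$, the formulas with hats are related to the conditional entropy, from which
they differ by a constant

\begin{align}
\widehat{\Phi}_{(1,1)}(A_1, A_2, A_3, \dots) - \mathrm{Tr}\, A_{12}
  &= -\mathrm{Tr} \Big( \sum_k A_k \Big) \log \Big( \sum_k A_k \Big) + \mathrm{Tr} \sum_k A_k \log A_k \notag \\
  &= S\Big( \sum_k A_k \Big) - S(A_{12}) \notag \\
  &= -J_1(I, A_{12}, \mathrm{Tr}_2 A_{12} \otimes I_2), \tag{59} \\
\widehat{\Psi}_{(1,1)}(A_{12}) - \mathrm{Tr}\, A_{12} &= S(A_1) - S(A_{12}) = -H(A_{12}, A_1 \otimes I_2). \tag{60}
\end{align}

When $A_{12}$ is block diagonal, $\Psi_{(p,1)}(A_{12}) = \Phi_{(p,1)}(A_{12})$ with the understanding
that $\mathrm{Tr}_2 A_{12} = \sum_k A_k$. Now let $W_n$ denote the generalized Pauli matrices as in
Sec. 2.3, $\mathcal{W}_n = I_1 \otimes W_n$ and define

\[ A_{123} = \sum_n \mathcal{W}_n A_{12} \mathcal{W}_n^* \otimes |e_n\rangle\langle e_n| = \bigoplus_n \mathcal{W}_n A_{12} \mathcal{W}_n^* \tag{61} \]

so that $A_{123}$ is block diagonal with blocks $\mathcal{W}_n A_{12} \mathcal{W}_n^*$. Then

\[ d_2^{\frac{1+p}{p}} \Psi_{(p,1)}(A_{12}) = \Phi(A_{(12)(3)}) = \Phi(\mathcal{W}_1 A_{12} \mathcal{W}_1^*, \mathcal{W}_2 A_{12} \mathcal{W}_2^*, \dots). \tag{62} \]

It is straightforward to show that for $p \in (0, 2)$ the functions $\widehat{\Phi}_{(p,1)}(A)$ and $\widehat{\Psi}_{(p,1)}(A)$
are all convex in $A$, inheriting this property from the quantities from which they
are defined. In view of (59) and (60), the conditions for equality in the next two
theorems are not surprising.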
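The power of $d_2$ in (62) can be traced through the averaging identity of Sec. 2.3. The following sketch assumes that the index $n$ runs over all $d_2^2$ generalized Pauli operators on the second factor and that each $\mathcal{W}_n = I_1 \otimes W_n$ is unitary.

```latex
% Unitary conjugation commutes with taking the p-th power.
\begin{align*}
\sum_n (\mathcal{W}_n A_{12} \mathcal{W}_n^*)^p
  &= \sum_n \mathcal{W}_n A_{12}^p \mathcal{W}_n^*
   = d_2 \, (\mathrm{Tr}_2 A_{12}^p) \otimes I_2, \\
\mathrm{Tr}\, \Big( \sum_n (\mathcal{W}_n A_{12} \mathcal{W}_n^*)^p \Big)^{1/p}
  &= d_2^{1/p} \, \mathrm{Tr}_{12} \big[ (\mathrm{Tr}_2 A_{12}^p)^{1/p} \otimes I_2 \big] \\
  &= d_2^{1/p} \cdot d_2 \cdot \mathrm{Tr}_1 (\mathrm{Tr}_2 A_{12}^p)^{1/p}
   = d_2^{\frac{1+p}{p}} \, \mathrm{Tr}_1 (\mathrm{Tr}_2 A_{12}^p)^{1/p} .
\end{align*}
```

One factor $d_2^{1/p}$ comes from the $p$-th root of the sum over the Pauli operators, and the remaining factor $d_2$ from the trace of the identity on the second factor.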

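Theorem 15. The function $\widehat{\Phi}_{(p,1)}(A)$ is convex in $A$ for $p \in (0, 2)$. Moreover, the
following are equivalent:

(i) $J_p(I, A, (\mathrm{Tr}_2 A) \otimes I_2) = \sum_j J_p(I, A_j, (\mathrm{Tr}_2 A_j) \otimes I_2)$,
(ii) There are matrices $D_j > 0$, $D = \sum_j D_j$, such that $[A_{jk}, D_j] = 0$, $\ker D_j = \ker(\sum_k A_{jk})$ and $A_{jk} = A_k D^{-1} D_j$,
(iii) $\widehat{\Phi}_{(p,1)}(A_1, A_2, A_3, \dots) = \sum_j \widehat{\Phi}_{(p,1)}(A_{j1}, A_{j2}, A_{j3}, \dots)$.

Proof. It follows from Corollary 14 and the fact that $A_j$ are block-diagonal that
(i) $\Leftrightarrow$ (ii) and it is straightforward to verify that (ii) $\Rightarrow$ (iii). Moreover, (iii) implies
(i) for $p = 1$, by (59). To show that (iii) implies (ii) for $p \ne 1$, observe that (iii)
implies $\widehat{\Upsilon}_{p,1}(K, A) = \sum_j \widehat{\Upsilon}_{p,1}(K, A_j)$, and this implies

\[ J_p(K, A, \mathcal{X}) = \sum_j J_p(K, A_j, \mathcal{X}_j) \tag{63} \]

where $\mathcal{X}_j = (K^* A_j^p K)^{1/p} = X_j \otimes |e_1\rangle\langle e_1|$ and $\sum_j \mathcal{X}_j = \mathcal{X} = (K^* A^p K)^{1/p} = X \otimes |e_1\rangle\langle e_1|$,
with $X_j = (\sum_k A_{jk}^p)^{1/p}$ and $X = (\sum_k A_k^p)^{1/p}$. Since

\[ K^* A_j^p K \mathcal{X}_j^{1-p} = \sum_k A_{jk}^p X_j^{1-p} \otimes |e_1\rangle\langle e_1|, \]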

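with a similar expression for $K^* A^p K \mathcal{X}^{1-p}$, we find

\[ \sum_k J_p(I, A_k, X) = J_p(K, A, \mathcal{X}) = \sum_j J_p(K, A_j, \mathcal{X}_j) = \sum_{k,j} J_p(I, A_{jk}, X_j). \]

Convexity then implies that we must have

\[ J_p(I, A_k, X) = \sum_j J_p(I, A_{jk}, X_j) \quad \forall k. \tag{64} \]

Since $\ker X_j \subseteq \ker A_{jk}$, Theorem 10 implies that

\[ A_k^{it} X^{-it} P_{(\ker X_j)^\perp} = A_{jk}^{it} X_j^{-it}, \quad \text{for all } k, j, t. \tag{65} \]

After writing $\widetilde{A}_k = \sum_j |e_j\rangle\langle e_j| \otimes A_{jk}$, $\widetilde{X} = \sum_j |e_j\rangle\langle e_j| \otimes X_j$, this reads

\[ \widetilde{A}_k^{it} \widetilde{X}^{-it} = \Big( I \otimes \frac{1}{m} \mathrm{Tr}_1 \widetilde{A}_k \Big)^{it} \Big( I \otimes \frac{1}{m} \mathrm{Tr}_1 \widetilde{X} \Big)^{-it} P_{(\ker \widetilde{X})^\perp}, \]

so that, by Theorem 9, there are elements $B_k \in M_d^+$ and $D \in (M_m \otimes M_d)^+$, such
that $\ker D = \ker \widetilde{X}$, $[(I \otimes B_k), D] = 0$ and $\widetilde{A}_k = (I \otimes B_k) D$. As before, one finds
$D = \sum_j |e_j\rangle\langle e_j| \otimes D_j$ for some $D_j \in M_d^+$ which implies (ii).

Theorem 16. The function $\widehat{\Psi}_{(p,1)}(A_{12})$ is convex in $A_{12}$ for $p \in (0, 2)$. Moreover,
if we let $A_{123}$ denote the block diagonal matrix with blocks $\mathcal{W}_n A \mathcal{W}_n^*$, the following
are equivalent:

(i) $J_p(I, A_{123}, A_1 \otimes I_{23}) = \sum_j J_p(I, (A_{123})_j, (A_1)_j \otimes I_{23})$ with $A_{123}$ defined by (61),
(ii) There are matrices $D_j \in M_{d_1}^+$, $D = \sum_j D_j$, such that $\ker D_j = \ker (A_1)_j$,
$[A_j, D_j \otimes I] = 0$ and $A_j = A(D^{-1} D_j \otimes I)$.
(iii) $\widehat{\Psi}_{(p,1)}(A) = \sum_j \widehat{\Psi}_{(p,1)}(A_j)$.

Proof. It follows from the definition of $A_{123}$ that $d_2^{\frac{1+p}{p}} \Psi_{(p,1)}(A) = \Phi(A_{123})$. The
equivalence (i) $\Leftrightarrow$ (iii) follows immediately from Theorem 15, and (i) $\Leftrightarrow$ (ii) can be
shown to follow from Corollary 14.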

Theorem 17. The following monotonicity inequalities hold,

\begin{align}
\widehat{\Psi}_{(p,1)}(A_{23}) &\le \widehat{\Psi}_{(p,1)}(A_{123}), \quad p \in (0, 2), \tag{66a} \\
\Psi_{(p,1)}(A_{23}) &\ge \Psi_{(p,1)}(A_{123}), \quad p \in (0, 1), \tag{66b} \\
\Psi_{(p,1)}(A_{23}) &\le \Psi_{(p,1)}(A_{123}), \quad p \in [1, 2). \tag{66c}
\end{align}

Moreover, equality holds if and only if the conditions of Corollary 13 are satisfied.

Proof. It suffices to give the proof for $\widehat{\Psi}$ since the other inequalities follow
immediately. The argument is similar to that for Theorem 7. Let $W_n$ denote the
generalized Pauli matrices of Sec. 2.3, but now let $\mathcal{W}_n = W_n \otimes I_{23}$. Then the convexity
of $\widehat{\Psi}_{(p,1)}(A_{23})$ implies

\begin{align*}
\widehat{\Psi}_{(p,1)}(A_{23}) &= \frac{1}{d_1} \widehat{\Psi}_{(p,1)}(I_1 \otimes A_{23}) \\
&= \frac{1}{d_1} \widehat{\Psi}_{(p,1)}\Big( \frac{1}{d_1} \sum_n \mathcal{W}_n A_{123} \mathcal{W}_n^* \Big) \\
&\le \frac{1}{d_1^2} \sum_n \widehat{\Psi}_{(p,1)}(\mathcal{W}_n A_{123} \mathcal{W}_n^*) = \widehat{\Psi}_{(p,1)}(A_{123})
\end{align*}

where we used the invariance of $\widehat{\Psi}$ under unitaries of the form $U_1 \otimes I_{23}$. In the case
$p = 1$, it follows from (60) that $\widehat{\Psi}_{(1,1)}(A_{23}) \le \widehat{\Psi}_{(1,1)}(A_{123})$ becomes

\[ S(A_2) - S(A_{23}) \le S(A_{12}) - S(A_{123}) \tag{67} \]

which is SSA. Because the equality conditions in Theorem 16 are independent of $p$,
they are identical to those for SSA, which are given in Corollary 13.

The Carlen–Lieb triple Minkowski inequality for the case $q = 1$ is an immediate
corollary of Theorem 17. Observe that

\begin{align}
\mathrm{Tr}_3 \mathrm{Tr}_1 (\mathrm{Tr}_2 A_{123}^p)^{1/p} &= \Psi_{(p,1)}(A_{(13),(2)}) \tag{68a} \\
\mathrm{Tr}_3 [\mathrm{Tr}_2 (\mathrm{Tr}_1 A_{123})^p]^{1/p} &= \Psi_{(p,1)}(A_{32}) \tag{68b}
\end{align}

so that it follows immediately from (66c) that

\[ \mathrm{Tr}_3 [\mathrm{Tr}_2 (\mathrm{Tr}_1 A_{123})^p]^{1/p} = \Psi_{(p,1)}(A_{32}) \le \Psi_{(p,1)}(A_{132}) = \mathrm{Tr}_3 \mathrm{Tr}_1 (\mathrm{Tr}_2 A_{123}^p)^{1/p} \tag{69} \]

for $1 < p \le 2$ and from (66b) that the inequality reverses for $0 < p < 1$. Moreover,
the conditions for equality are again independent of $p$ and identical to those for
equality in SSA, given in Corollary 13.

5. Final Remarks
It should be clear that the results in Sec. 2 are not restricted to $J_p(K, A, B)$. The
function $g_p(x)$ given in (6) can be replaced by any operator convex function of the
form $g(x) = x f(x)$ with $f$ operator monotone on $(0, \infty)$. Moreover, if the measure
$\mu(t)$ in (17) is supported on $(0, \infty)$, then the conditions for equality are identical to
those in Sec. 3.
In particular, our results go through with $g_p$ replaced by $\widetilde{g}_p$ and $J_p(I, A, B)$
replaced by $\widetilde{J}_p(I, A, B)$, which is well-defined for $p \in [-1, 1)$ with $\widetilde{J}_0(I, A, B) = H(B, A)$.
Thus our results can be extended to all $p \in (-1, 2)$. The case $p = 2$
reduces to the convexity of $(A, X) \mapsto \mathrm{Tr}\, X^* A^{-1} X$ with $A > 0$ proved in [24]. One
can show that equality holds if and only if $X_j = A_j T$ $\forall j$ with $T = A^{-1} X$. We
recently learned that Kiefer ([16]) proved the $p = 2$ convexity, by a different method,
much earlier and also found these equality conditions.

There have been various attempts, e.g., the Rényi ([35]) and Tsallis ([40])
entropies, to generalize quantum entropy in a way that gives the usual von Neumann
entropy at $p = 1$. In this paper we have considered two extensions of the
conditional entropy involving an exponent $p \in (0, 2)$, namely,

• $J_p(I, A_{12}, A_1)$, which gives $\mathrm{Tr}\, A_{23}^p A_2^{1-p} \ge \mathrm{Tr}\, A_{123}^p A_{12}^{1-p}$ for $p \in (0, 1)$ and
$\mathrm{Tr}\, A_{23}^p A_2^{1-p} \le \mathrm{Tr}\, A_{123}^p A_{12}^{1-p}$ for $p \in (1, 2)$, and can be
thought of as a pseudo-metric; and
• $\widehat{\Psi}_{(p,1)}(A_{12})$, which gives $\mathrm{Tr}_2 (\mathrm{Tr}_3 A_{23}^p)^{1/p} \ge \mathrm{Tr}_{12} (\mathrm{Tr}_3 A_{123}^p)^{1/p}$ for $p \in (0, 1)$ and
$\mathrm{Tr}_2 (\mathrm{Tr}_3 A_{23}^p)^{1/p} \le \mathrm{Tr}_{12} (\mathrm{Tr}_3 A_{123}^p)^{1/p}$ for $p \in (1, 2)$, and can be
thought of as a pseudo-norm.

These expressions are quite different for $p \ne 1$, but arise from quantities with the
same convexity and monotonicity properties, as well as the same equality conditions
which are independent of $p$. Moreover, both yield SSA at $p = 1$ and the equality
conditions for $p \ne 1$ are identical to those for SSA. This independence of non-trivial
equality conditions on the precise form of the function seems remarkable.
If one uses $\widetilde{g}_p$ and $\widetilde{J}_p(I, A, B)$ from (10), then the inequalities above hold with
$p \in (1, 2)$ replaced by $p \in (-1, 0)$ and SSA corresponds to $p = 0$.

Acknowledgments
The first-named author was supported by the grants VEGA 2/0032/09, APVV-0071-06,
Center of Excellence SAS Quantum Technologies and ERDF OP R&D
Project CE QUTE ITMS 26240120009. The second-named author was partially
supported by National Science Foundation under Grant DMS-0604900.

Appendix. Proof of the Key Schwarz Inequality


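For completeness, we include the proof of the joint convexity of $(A, B, X) \mapsto \mathrm{Tr}\, X^* (L_A + tR_B)^{-1}(X)$
when $A, B > 0$ and $t > 0$. Since this function is homogeneous
of degree one, it suffices to prove subadditivity. Now let

\[ M_j = (L_{A_j} + tR_{B_j})^{-1/2}(X_j) - (L_{A_j} + tR_{B_j})^{1/2}(\Lambda). \tag{70} \]

Then one can verify that

\begin{align}
0 \le \sum_j \mathrm{Tr}\, M_j^* M_j &= \sum_j \langle M_j, M_j \rangle \notag \\
&= \sum_j \mathrm{Tr}\, X_j^* (L_{A_j} + tR_{B_j})^{-1}(X_j) - \mathrm{Tr} \sum_j X_j^* \Lambda \notag \\
&\quad - \mathrm{Tr}\, \Lambda^* \sum_j X_j + \mathrm{Tr}\, \Lambda^* \sum_j (L_{A_j} + tR_{B_j})(\Lambda). \tag{71}
\end{align}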

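Next, observe that for any matrix $W$,

\[ \sum_j (L_{A_j} + tR_{B_j})(W) = \sum_j (A_j W + tW B_j) = L_{\sum_j A_j}(W) + tR_{\sum_j B_j}(W). \]

Therefore, inserting the choice $\Lambda = (L_{\sum_j A_j} + tR_{\sum_j B_j})^{-1}(\sum_j X_j)$ in (71) yields

\[ \mathrm{Tr} \Big( \sum_j X_j \Big)^* \big( L_{\sum_j A_j} + tR_{\sum_j B_j} \big)^{-1} \Big( \sum_j X_j \Big) \le \sum_j \mathrm{Tr}\, X_j^* (L_{A_j} + tR_{B_j})^{-1}(X_j) \tag{72} \]

for any $t \ge 0$.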
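The substitution that turns (71) into (72) can be followed term by term. In the sketch below, $A = \sum_j A_j$, $B = \sum_j B_j$ and $X = \sum_j X_j$ are shorthands introduced only for this illustration, and $\Lambda$ denotes the matrix chosen above, so that $(L_A + tR_B)(\Lambda) = X$.

```latex
% The last term of (71) collapses because \sum_j (L_{A_j} + t R_{B_j}) = L_A + t R_B.
\begin{align*}
- \mathrm{Tr}\, X^* \Lambda - \mathrm{Tr}\, \Lambda^* X
  + \mathrm{Tr}\, \Lambda^* (L_A + tR_B)(\Lambda)
  &= - \mathrm{Tr}\, X^* \Lambda - \mathrm{Tr}\, \Lambda^* X + \mathrm{Tr}\, \Lambda^* X \\
  &= - \mathrm{Tr}\, X^* (L_A + tR_B)^{-1}(X).
\end{align*}
% Inserting this into (71):
\[
0 \le \sum_j \mathrm{Tr}\, X_j^* (L_{A_j} + tR_{B_j})^{-1}(X_j)
      - \mathrm{Tr}\, X^* (L_A + tR_B)^{-1}(X),
\]
```

which is the subadditivity statement (72).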

References
[1] N. I. Akheizer and I. M. Glazman, Theory of Operators in Hilbert Space, Vol. II
(Frederik Ungar Publishing, NY, 1963).
[2] A. Amari and H. Nagaoka, Methods of Information Geometry, Translations of
Mathematical Monographs, Vol. 191 (American Mathematical Society and Oxford
University Press, 2000).
[3] T. Ando, Topics on Operator Inequalities, Lecture Notes (Hokkaido University, 1978).
[4] T. Ando, Concavity of certain maps on positive definite matrices and applications to
Hadamard products, Lin. Alg. Appl. 26 (1979) 203–241.
[5] H. Araki, Relative entropy of states of von Neumann algebras, Publ. RIMS Kyoto
Univ. 9 (1976) 809–833.
[6] T. Bekjan, On joint convexity of trace functions, Lin. Alg. Appl. 390 (2004) 321–327.
[7] E. Carlen and E. Lieb, A Minkowski type trace inequality and strong subadditivity of
quantum entropy, Amer. Math. Soc. Trans. 189(2) (1999) 59–62; Reprinted in [21].
[8] E. A. Carlen and E. H. Lieb, A Minkowski type trace inequality and strong
subadditivity of quantum entropy II: Convexity and concavity, Lett. Math. Phys. 83 (2008)
107–126; arXiv:0710.4167.
[9] I. Devetak and J. Yard, Exact cost of redistributing multipartite quantum states,
Phys. Rev. Lett. 100 (2008) 230501, 4 pp.
[10] W. F. Donoghue Jr., Monotone Matrix Functions and Analytic Continuation
(Springer, 1974).
[11] E. G. Effros, A matrix convexity approach to some celebrated quantum inequalities,
Proc. Natl. Acad. Sci. 106 (2009) 1006–1008; arXiv:0802.1234.
[12] H. Epstein, Remarks on two theorems of E. Lieb, Comm. Math. Phys. 31 (1973)
317–325.
[13] P. Hayden, R. Jozsa, D. Petz and A. Winter, Structure of states which satisfy strong
subadditivity of quantum entropy with equality, Comm. Math. Phys. 246 (2004)
359–374; arXiv:quant-ph/0304007.
[14] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference, Comm. Math.
Phys. 263 (2006) 259–276; arXiv:math-ph/0412093.
[15] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference. A survey with
examples, J. Infin. Dimens. Anal. Quantum Prob. Relat. Top. 9 (2006) 331–352;
arXiv:quant-ph/0604091.
[16] J. Kiefer, Optimum experimental designs, J. Roy. Statist. Soc. Ser. B 21 (1959)
272–310.
[17] O. Klein, Zur quantenmechanischen Begründung des zweiten Hauptsatzes der
Wärmelehre, Z. Phys. 72 (1931) 767–775.

[18] A. Lesniewski and M. B. Ruskai, Relative entropy and monotone Riemannian metrics
on non-commutative probability space, J. Math. Phys. 40 (1999) 5702–5724.
[19] S. Luo, N. Li and X. Cao, Relation between no broadcasting for noncommuting
states and no local broadcasting for quantum correlations, Phys. Rev. A 79 (2009)
054305, 3 pp.
[20] E. H. Lieb, Convex trace functions and the Wigner–Yanase–Dyson conjecture, Adv.
Math. 11 (1973) 267–288; Reprinted in [21].
[21] M. Loss and M. B. Ruskai (eds.), Inequalities: Selecta of E. Lieb (Springer, 2002).
[22] E. H. Lieb and M. B. Ruskai, A fundamental property of the quantum-mechanical
entropy, Phys. Rev. Lett. 30 (1973) 434–436; Reprinted in [21].
[23] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum mechanical
entropy, J. Math. Phys. 14 (1973) 1938–1941; Reprinted in [21].
[24] E. H. Lieb and M. B. Ruskai, Some operator inequalities of the Schwarz type, Adv.
Math. 12 (1974) 269–273; Reprinted in [21].
[25] E. H. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the
Schrödinger Hamiltonian and their relation to Sobolev inequalities, in Studies in
Mathematical Physics, eds. E. Lieb, B. Simon and A. Wightman (Princeton University
Press, 1976), pp. 269–303; Reprinted in [21].
[26] G. Lindblad, Expectations and entropy inequalities, Comm. Math. Phys. 39 (1974)
111–119.
[27] K. Löwner, Über monotone Matrix Funktionen, Math. Z. 38 (1934) 177–216.
[28] M. Mosonyi and D. Petz, Structure of sufficient quantum coarse-grainings, Lett. Math.
Phys. 68 (2004) 19–30.
[29] H. Narnhofer and W. Thirring, From relative entropy to entropy, Fizika 17 (1985)
257–265.
[30] M. Ohya and D. Petz, Quantum Entropy and Its Use, 2nd edn. (Springer-Verlag,
2004).
[31] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23 (1986)
57–65.
[32] D. Petz, Sufficiency of channels over von Neumann algebras, Quart. J. Math. 39
(1988) 907–1008.
[33] D. Petz, Sufficient subalgebras and the relative entropy of states of a von Neumann
algebra, Comm. Math. Phys. 105 (1986) 123–131.
[34] D. Petz, Monotone Metrics on Matrix Spaces, Lin. Alg. Appl. 244 (1996) 81–96.
[35] A. Rényi, On measures of entropy and information, in Proc. 4th Berkeley
Sympos. Math. Statist. and Prob., Vol. I (Univ. California Press, Berkeley, 1961),
pp. 547–561.
[36] M. B. Ruskai, Inequalities for quantum entropy: A review with conditions for
equality, J. Math. Phys. 43 (2002) 4358–4375; Erratum ibid., 46 (2005) 019901,
quant-ph/0205064.
[37] M. B. Ruskai, Another short and elementary proof of strong subadditivity of quantum
entropy, Rep. Math. Phys. 60 (2007) 1–12; arXiv:quant-ph/0604206.
[38] D. Ruelle, Statistical Mechanics (Benjamin, 1969).
[39] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton Univ. Press, 1993).
[40] C. Tsallis, Possible generalization of Boltzmann–Gibbs statistics, J. Stat. Phys. 52
(1988) 479–487.
[41] A. Wehrl, General properties of entropy, Rev. Mod. Phys. 50 (1978) 221–260.
[42] E. P. Wigner and M. M. Yanase, Information content of distributions, Proc. Nat.
Acad. Sci. 49 (1963) 910–918.
[43] E. P. Wigner and M. M. Yanase, On the positive semi-definite nature of certain
matrix expressions, Canad. J. Math. 16 (1964) 397–406.
November 16, 2010 15:27 WSPC/S0129-055X 148-RMP
J070-S0129055X1000417X

Reviews in Mathematical Physics
Vol. 22, No. 10 (2010) 1123–1145
© World Scientific Publishing Company
DOI: 10.1142/S0129055X1000417X

ON THE HERMAN–KLUK
SEMICLASSICAL APPROXIMATION

DIDIER ROBERT
Département de Mathématiques,
Laboratoire Jean Leray, CNRS-UMR 6629,
Université de Nantes, 2 rue de la Houssinière,
F-44322 Nantes Cedex 03, France
didier.robert@univ-nantes.fr

Received 19 November 2009

For a subquadratic symbol $H$ on $\mathbb{R}^d \times \mathbb{R}^d = T^*(\mathbb{R}^d)$, the quantum propagator of the time
dependent Schrödinger equation $i\hbar \frac{\partial \psi}{\partial t} = \widehat{H}\psi$ is a Semiclassical Fourier-Integral Operator
when $\widehat{H} = H(x, \hbar D_x)$ ($\hbar$-Weyl quantization of $H$). Its Schwartz kernel is described by
a quadratic phase and an amplitude. At every time $t$, when $\hbar$ is small, it is essentially
supported in a neighborhood of the graph of the classical flow generated by $H$, with a
full uniform asymptotic expansion in $\hbar$ for the amplitude.
In this paper, our goal is to revisit this well-known and fundamental result with
emphasis on the flexibility for the choice of a quadratic complex phase function and on
global $L^2$ estimates when $\hbar$ is small and time $t$ is large. One of the simplest choices of the
phase is known in chemical physics as the Herman–Kluk formula. Moreover, we prove that
the semiclassical expansion for the propagator is valid for $|t| \ll \frac{1}{4\delta} |\log \hbar|$ where $\delta > 0$ is
a stability parameter for the classical system.

Keywords: Coherent states; time dependent Schrödinger equations; Semiclassical Fourier-Integral Operator; Ehrenfest time.

Mathematics Subject Classification 2010: 35Q41, 81Q05, 81S30, 35S30

1. Introduction and Results


Let us consider the time-dependent Schrödinger equation
\[
i\hbar\frac{\partial\psi(t)}{\partial t} = \hat H(t)\psi(t), \qquad \psi(t=t_0) = \psi_0, \tag{1.1}
\]
where $\psi_0$ is an initial state, $\hat H(t)$ is a quantum Hamiltonian defined as a continuous family of self-adjoint operators in the Hilbert space $L^2(\mathbb{R}^d)$, depending on time $t$ and on the Planck constant $\hbar > 0$, which plays the role of a small parameter in the system of units considered in this paper. $\hat H(t)$ is supposed to be the $\hbar$-Weyl-quantization of a classical smooth observable $H(t,X)$, $X = (x,\xi)\in\mathbb{R}^d\times\mathbb{R}^d$ (see [27] for more details concerning semiclassical Weyl quantization).


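Our main results concern subquadratic Hamiltonians $H$; that means here that $H(t,X)$ is continuous in $t\in\mathbb{R}$, $C^\infty$ smooth in $X\in\mathbb{R}^{2d}$ and satisfies, for every $\gamma\in\mathbb{N}^{2d}$, $|\gamma|\ge 2$,
\[
|\partial_X^\gamma H(t,X)| \le C_{T,\gamma}, \qquad \forall t,\ |t-t_0|\le T,\ \forall X\in\mathbb{R}^{2d}, \tag{1.2}
\]
where $\partial_X = \frac{\partial}{\partial X}$ and $C_{T,\gamma} > 0$.
Let us introduce some classes of symbols (classical observables) defined as follows. Let $m, n\in\mathbb{N}$.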

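Definition 1.1. We say that a symbol $s$ is in $\mathcal{O}_m(n)$ if $s$ is a smooth function on the Euclidean space $\mathbb{R}^n$ such that for every $\gamma\in\mathbb{N}^n$, $|\gamma|\ge m$, we have
\[
|s|_{\infty,\gamma} := \sup_{X\in\mathbb{R}^n}|\partial_X^\gamma s(X)| < +\infty. \tag{1.3}
\]
If $s(\lambda)$ depends on a parameter $\lambda\in P$ we say that $s(\lambda)$ is bounded in $\mathcal{O}_m(n)$ if for every such $\gamma$ we have
\[
\sup_{\lambda\in P}|s(\lambda)|_{\infty,\gamma} < +\infty.
\]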

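It is well known that the subquadratic assumption entails that Eq. (1.1) is solved by a unique quantum unitary propagator in $L^2(\mathbb{R}^d)$ such that $\psi_t = U(t,t_0)\psi_0$, $\forall t\in\mathbb{R}$. For the same reason, the classical dynamics is also well defined $\forall t\in\mathbb{R}$. $z_t = (q_t,p_t)$ is the classical path in the phase space $\mathbb{R}^{2d}$ such that $z_{t_0} = z$ and satisfying
\[
\dot q_t = \partial_p H(t,q_t,p_t), \qquad \dot p_t = -\partial_q H(t,q_t,p_t), \qquad q_{t_0} = q,\ p_{t_0} = p. \tag{1.4}
\]
It defines a Hamiltonian flow: $\phi^t(z) = z_t$ ($\phi^{t_0}(z) = z$). Let us introduce the stability Jacobi matrix of this Hamiltonian flow: $F(t) = \partial_z\phi^t(z)$. $F(t)$ is a $2d\times 2d$ symplectic matrix with four $d\times d$ blocks, $F(t) = \begin{pmatrix} A_t & B_t\\ C_t & D_t\end{pmatrix}$, where
\[
A_t = \frac{\partial q_t}{\partial q}, \quad B_t = \frac{\partial q_t}{\partial p}, \quad C_t = \frac{\partial p_t}{\partial q}, \quad D_t = \frac{\partial p_t}{\partial p}. \tag{1.5}
\]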
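The flow (1.4) and its stability matrix (1.5) can be explored numerically. The following is a minimal sketch, assuming an illustrative one-dimensional Hamiltonian $H(q,p) = p^2/2 + \cos q$ (a choice made here, not taken from the paper) and a classical RK4 integrator. It integrates $z_t$ together with the variational equation $\dot F = J\,\partial^2_{X,X}H(t,z_t)\,F$ and measures how well the symplectic relation $F^TJF = J$ is preserved. The function names are hypothetical.

```python
import numpy as np

# Illustrative Hamiltonian (an assumption, not taken from the paper):
# H(q, p) = p^2/2 + cos(q) with d = 1. Its second derivatives are bounded,
# so it satisfies the subquadratic condition (1.2).
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def grad_H(q, p):
    """Return (dH/dq, dH/dp)."""
    return -np.sin(q), p

def hess_H(q, p):
    """Hessian of H in the variables X = (q, p)."""
    return np.array([[-np.cos(q), 0.0], [0.0, 1.0]])

def rhs(state):
    """Hamilton's equations (1.4) coupled with F' = J Hess(H)(z_t) F."""
    q, p = state[0], state[1]
    F = state[2:].reshape(2, 2)
    dHq, dHp = grad_H(q, p)
    dF = J @ hess_H(q, p) @ F
    return np.concatenate(([dHp, -dHq], dF.ravel()))

def flow_with_stability(q0, p0, t, steps=2000):
    """Classical RK4 integration of z_t and F(t), starting from F(0) = I."""
    state = np.concatenate(([q0, p0], np.eye(2).ravel()))
    h = t / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(state + 0.5 * h * k1)
        k3 = rhs(state + 0.5 * h * k2)
        k4 = rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return state[:2], state[2:].reshape(2, 2)

z_t, F_t = flow_with_stability(0.3, 0.7, t=2.0)
# Defect of the symplectic relation F^T J F = J (zero for the exact flow).
symplectic_defect = np.abs(F_t.T @ J @ F_t - J).max()
print(symplectic_defect)
```

The blocks $A_t, B_t, C_t, D_t$ of (1.5) are the four entries of `F_t` in this one-dimensional case. RK4 is not a symplectic integrator, so the defect is small but not exactly zero.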
We also introduce the classical action
\[
S(t,z) = \int_{t_0}^{t}\bigl(p_s\cdot\dot q_s - H(s,z_s)\bigr)\,ds, \tag{1.6}
\]
where $u\cdot v$ denotes the usual scalar product for $u, v\in\mathbb{R}^d$, and the phase function
\[
\Phi(t,z;x,y) = S(t,z) + p_t\cdot(x-q_t) - p\cdot(y-q) + \frac{i}{2}\bigl(|x-q_t|^2 + |y-q|^2\bigr). \tag{1.7}
\]
For applications, it is useful to introduce semi-classical subquadratic symbols. These symbols have an asymptotic expansion in the semiclassical parameter $\hbar > 0$,


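$H^\hbar(t,X) \asymp \sum_{j\ge 0}\hbar^j H_j(t,X)$ such that the following conditions are satisfied.
\[
\forall j\ge 0,\ H_j(t,\cdot)\in\mathcal{O}_{(2-j)_+}(2d) \text{ and are bounded in } \mathcal{O}_{(2-j)_+}(2d) \text{ for } t\in\mathbb{R}, \tag{1.8}
\]
\[
\forall N\ge 1,\ \hbar^{-N-1}\Bigl(H^\hbar(t,X) - \sum_{0\le j\le N}\hbar^j H_j(t,X)\Bigr) \text{ is bounded in } \mathcal{O}_0 \text{ for } t\in\mathbb{R} \text{ and } \hbar\in\,]0,1]. \tag{1.9}
\]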

Let us recall the definition of Weyl quantization. For any symbol $s$ in $\mathcal{O}_m(2d)$ and for any $\psi\in\mathcal{S}(\mathbb{R}^d)$, we have
\[
\mathrm{Op}^w_\hbar[s]\psi(x) = (2\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}(x-y)\cdot\xi}\, s\Bigl(\frac{x+y}{2},\xi\Bigr)\psi(y)\,dy\,d\xi. \tag{1.10}
\]
We shall also use the notation $\hat s = \mathrm{Op}^w_\hbar[s]$.
The Herman–Kluk formula is included in the following asymptotic result, which will be discussed in detail in this paper. This formula was discovered by several authors in the chemical-physics literature in the eighties. We refer to the introductions of [22, 29] for interesting historical expositions. It is rather surprising that until the recent paper [29] and the Ph.D. thesis [33] there was no explicit connection in the mathematical literature between the Herman–Kluk formula and Fourier-Integral Operators with complex phases.

Theorem 1.2. Let $\hat H^\hbar(t)$ be a time dependent semiclassical subquadratic Hamiltonian and $K^\hbar(t;x,y)$ be the Schwartz kernel of its propagator $U^\hbar(t,t_0)$. Then there exists a semi-classical symbol of order 0, $a^\hbar(t;z) = \sum_{0\le j<+\infty} a_j(t;z)\hbar^j$, where $a_j$ is continuous in $t$, such that
\[
K^\hbar(t;x,y) \asymp \int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\, a(\hbar;t;z)\,dz \tag{1.11}
\]
in the $L^2$ uniform norm. More precisely, if we denote
\[
K^{(\hbar,N)}(t;x,y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\Bigl(\sum_{0\le j\le N} a_j(t;z)\hbar^j\Bigr)dz \tag{1.12}
\]
and $U^{(\hbar,N)}(t,t_0)$ the operator, in $L^2(\mathbb{R}^d)$, with the Schwartz kernel $K^{(\hbar,N)}(t;x,y)$, then, for every $T > 0$ and every $N\ge 1$, there exists $C(T,N) > 0$ such that for the $L^2$ operator norm we have
\[
\|U^\hbar(t,t_0) - U^{(\hbar,N)}(t,t_0)\| \le C(T,N)\,\hbar^{N+1}, \qquad \forall t,\ |t-t_0|\le T,\ \forall\hbar\in\,]0,1]. \tag{1.13}
\]
The leading term is
\[
a_0(t;z) = {\det}^{1/2}\bigl(A_t + D_t + i(B_t - C_t)\bigr)\exp\Bigl(-i\int_{t_0}^{t} H_1(z_s)\,ds\Bigr) \tag{1.14}
\]

where the square root is defined by continuity starting from $t = t_0$ ($a_0(t_0;z) = 2^{d/2}$).
Moreover, the amplitudes $a_j$ are smooth functions defined by transport equations (see the proof below) and, for every $T > 0$, they are bounded in $\mathcal{O}_0$ for $|t|\le T$.
In [29], the authors give a rigorous proof of this result with an additional hypothesis: they assume that $H(x,\xi)$ is a polynomial in $\xi$. Here we consider more general subquadratic symbols. In particular our result applies to relativistic Hamiltonians like $\sqrt{1+|\xi|^2} + V(x)$. Using a global diagonalization (see [28, Sec. 3]), the result can be extended to Dirac systems.
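As a small illustration of why the relativistic kinetic symbol $\sqrt{1+|\xi|^2}$ fits the subquadratic condition (1.2), the sketch below evaluates its Hessian from the closed formula $\partial_i\partial_j\sqrt{1+|\xi|^2} = \delta_{ij}/w - \xi_i\xi_j/w^3$, $w = \sqrt{1+|\xi|^2}$, and checks on random samples that its operator norm never exceeds 1. The sampling scale and the helper name are assumptions made for this example only; boundedness of higher derivatives and of the derivatives of $V$ is not tested here.

```python
import numpy as np

def kinetic_hessian(xi):
    """Hessian of xi -> sqrt(1 + |xi|^2), from the closed formula."""
    xi = np.asarray(xi, dtype=float)
    w = np.sqrt(1.0 + xi @ xi)
    return np.eye(xi.size) / w - np.outer(xi, xi) / w**3

rng = np.random.default_rng(0)
# Random momenta in dimension d = 3, including very large ones.
samples = rng.normal(scale=50.0, size=(1000, 3))
max_norm = max(np.linalg.norm(kinetic_hessian(xi), 2) for xi in samples)
norm_at_zero = np.linalg.norm(kinetic_hessian(np.zeros(3)), 2)
print(max_norm, norm_at_zero)
```

The eigenvalues of this Hessian are $1/w$ (on the orthogonal complement of $\xi$) and $1/w^3$ (along $\xi$), so the bound 1 is attained only at $\xi = 0$.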
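Similar results are true with more general quadratic phases and for systems with diagonalizable leading symbols (see [4, 28]). Let us define the quadratic phase
\[
\Phi^{(\Gamma_t,\Theta)}(t,z;x,y) = S(t,z) + p_t\cdot(x-q_t) - p\cdot(y-q) + \frac12\bigl(\Gamma_t(x-q_t)\cdot(x-q_t) - \bar\Theta(y-q)\cdot(y-q)\bigr) \tag{1.15}
\]
where $\Theta$, $\Gamma_t$ are complex symmetric matrices with a definite-positive imaginary part and $\Gamma_t$ is $C^1$ in $t$. $\Theta$ is constant; $\Gamma_t$ may depend smoothly on $t$ and $z$ such that the following conditions are satisfied:
\[
\exists c_T > 0,\quad \Im\Gamma_t v\cdot v \ge \frac{1}{c_T}|v|^2, \qquad \forall t,\ |t|\le T,\ \forall z\in\mathbb{R}^{2d}, \tag{1.16}
\]
\[
\forall\gamma,\ |\gamma|\ge 1,\ \exists C_{T,\gamma},\quad \|\partial_z^\gamma\Gamma_t\| \le C_{T,\gamma}, \qquad \forall z\in\mathbb{R}^{2d},\ |t|\le T. \tag{1.17}
\]
So we have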
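Theorem 1.3. Under the assumptions of Theorem 1.2 and (1.16), (1.17), we have
\[
K(t;x,y) \asymp (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\Gamma_t,\Theta)}(t,z;x,y)}\, f(\hbar;t;z)\,dz \tag{1.18}
\]
where $f(\hbar;t;z) = \sum_{0\le j<+\infty} f_j(t;z)\hbar^j$ with the same meaning as in Theorem 1.2.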
In particular
f0 (t, z) = 2d/2 det1/2 [M (t , )] (1.19)
where
(A + B )).
M (t , ) = i(C + D

There exist several methods to prove this theorem. In [29], the authors prove it as a consequence of a symbolic calculus for FIO with complex quadratic phases. In [5], the authors proved a weaker result for $\Theta = iI$ and $\Gamma_t$ determined by the propagation of Gaussian coherent states: $\Gamma_t = (C + D\Theta)(A + B\Theta)^{-1}$ (see Sec. 2 of this paper). Laptev–Sigal in [23] have also considered a similar formula for the propagator (see Sec. 5 of this paper) but assume that the initial data has a compact support in momenta. Kay ([22]) explains how to compute all the semiclassical corrections $a_j$ but did not give estimates on the error term, so his expansion is not rigorously established. Here we choose another approach, perhaps more explicit and

simpler. We shall prove the general Theorem 1.3 as a consequence of the particular case of Theorem 1.2 by using a real deformation of the phase $\Phi^{(\Gamma_t,\Theta)}$ on the simpler one $\Phi^{(iI,iI)}$. Moreover, we give a direct proof of Theorem 1.2, proving the necessary properties for Fourier integrals with complex quadratic phases. In this way we easily get explicit estimates for the error terms for large times.
Let us assume that the conditions on $H(t)$ are satisfied for $T = +\infty$. Moreover assume that there exists a positive real function $\mu(T)\ge 1$, $T > 0$, such that the classical flow $\phi^{t,t'}$ satisfies, for every multiindex $\gamma$, $|\gamma|\ge 1$, and some $C_\gamma > 0$,
\[
|\partial_z^\gamma\phi^{t,t'}(z)| \le C_\gamma\,\mu(T)^{|\gamma|}, \qquad \text{for } |t| + |t'| \le T,\ z\in\mathbb{R}^{2d}. \tag{1.20}
\]
We have discussed in [5] the condition (1.20). In particular this condition is fulfilled with $\mu(T) = e^{\delta T}$ for $\delta = \sup_{X\in\mathbb{R}^{2d},\,t\in\mathbb{R}}\|J\partial^2_{X,X}H(t,X)\|$.

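Theorem 1.4. Choosing the phase as in Theorem 1.2, for $j\ge 0$ the amplitudes $a_j(t,z)$ satisfy the following estimates: for every multiindex $\gamma$ there exists a constant $C_{j,\gamma}$ such that
\[
|\partial_z^\gamma a_j(t,z)| \le C_{j,\gamma}\,|{\det}^{1/2}M_t|\,\mu(t)^{4j+|\gamma|}, \qquad \forall t\in\mathbb{R},\ \forall z\in\mathbb{R}^{2d}. \tag{1.21}
\]
Hence we have the following Ehrenfest type estimate. For every $N\ge 1$ and every $\varepsilon > 0$ there exists $C_{N,\varepsilon}$ such that we have
\[
\|U^\hbar(t,t_0) - U^{(N)}(t,t_0)\| \le C_{N,\varepsilon}\,\hbar^{\varepsilon(N+1)}, \qquad \forall t,\ |t| \le \frac{1-\varepsilon}{4\delta}|\log\hbar|,\ \forall\hbar\in\,]0,1]. \tag{1.22}
\]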
In previous works, an Ehrenfest time $T_E = c|\log\hbar|$, $c > 0$, was estimated for propagation of Gaussians in [9] and propagation of observables in [6]. For Gaussians we got $c = \frac{1}{6\delta}$, for observables $c = \frac{1}{2\delta}$. In [29], the authors gave an Ehrenfest time without an explicit estimate on $c$.
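To get a feeling for the time scale involved, the short sketch below evaluates the bound $\frac{1}{4\delta}|\log\hbar|$ quoted in the abstract for a few values of $\hbar$. The stability parameter $\delta = 1$ and the sample values of $\hbar$ are purely illustrative assumptions, and the helper name is hypothetical.

```python
import math

def ehrenfest_bound(hbar, delta):
    """Time scale |log(hbar)| / (4 * delta) appearing in the abstract."""
    return abs(math.log(hbar)) / (4.0 * delta)

# Illustrative values only: delta = 1 and a decreasing sequence of hbar.
for hbar in (1e-2, 1e-4, 1e-8):
    print(hbar, ehrenfest_bound(hbar, delta=1.0))
```

The logarithmic dependence means that decreasing $\hbar$ by many orders of magnitude extends the admissible time window only modestly.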

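2. Gaussian Coherent States and Quadratic Hamiltonians

The phase functions $\Phi^{(\Gamma,\Theta)}$ in (1.7) and (1.15) are closely related with Gaussian coherent states. This can be seen by proving a particular case of Theorem 1.2 for quadratic time-dependent Hamiltonians:
\[
H_t(q,p) = \frac12\bigl(G_t q\cdot q + 2L_t q\cdot p + K_t p\cdot p\bigr)
\]
where $q, p\in\mathbb{R}^d$, $K_t, L_t, G_t$ are real $d\times d$ matrices, continuous in time $t\in\mathbb{R}$, and $G_t, K_t$ are symmetric. The classical motion in the phase space is given by the linear differential equation
\[
\begin{pmatrix}\dot q\\ \dot p\end{pmatrix} = J\begin{pmatrix} G_t & L_t^T\\ L_t & K_t\end{pmatrix}\begin{pmatrix} q\\ p\end{pmatrix}, \qquad J = \begin{pmatrix} 0 & I\\ -I & 0\end{pmatrix} \tag{2.1}
\]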

where $L^T$ is the transposed matrix of $L$ and $J$ defines the symplectic form $\sigma(X,X') := JX\cdot X'$, $X = (x,\xi)$, $X' = (x',\xi')$.
This equation defines a linear symplectic transformation, $F_t$, such that $F_0 = I$ (we take here $t_0 = 0$). It can be represented as a $2d\times 2d$ matrix which can be written as four $d\times d$ blocks:
\[
F_t = \begin{pmatrix} A_t & B_t\\ C_t & D_t\end{pmatrix}. \tag{2.2}
\]
The quantum evolution for the Hamiltonian $\hat H(t)$ is denoted by $U(t)$ ($U(0) = I$). We can compute the matrix elements of $U(t)$ on the coherent states basis $\varphi_z$. This has been done in [24, p. 249 (6.36)] and [3, 12, 10]. We follow here the presentation given in [10]. Let us introduce some notations which will be used later. $g$ denotes the Gaussian function $g(x) = \pi^{-d/4}e^{-|x|^2/2}$ and $\Lambda_\hbar$ is the dilation operator $\Lambda_\hbar\psi(x) = \hbar^{-d/4}\psi(\hbar^{-1/2}x)$. So $\varphi_0 = \Lambda_\hbar g$, and the general Gaussian coherent states are defined
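as follows.
\[
\varphi_z^{(\Gamma)} = \hat T(z)\varphi^{(\Gamma)}, \tag{2.3}
\]
where $\hat T(z)$ is the Weyl translation operator, $z = (q,p)$,
\[
\hat T(z) = \exp\Bigl(\frac{i}{\hbar}(p\cdot x - q\cdot\hbar D_x)\Bigr) \tag{2.4}
\]
where $D_x = -i\frac{\partial}{\partial x}$ and $z = (q,p)\in\mathbb{R}^d\times\mathbb{R}^d$. $\varphi^{(\Gamma)}$ is the Gaussian state:
\[
\varphi^{(\Gamma)}(x) = (\pi\hbar)^{-d/4}\,a_\Gamma\exp\Bigl(\frac{i}{2\hbar}\Gamma x\cdot x\Bigr) \tag{2.5}
\]
where $\Gamma$ is a complex symmetric matrix such that $\Im\Gamma$ is definite-positive and $a_\Gamma$ is a normalization constant ($a_\Gamma = {\det}^{1/4}\,\Im\Gamma$).
It is convenient to introduce here the Siegel space $\Sigma^+(d)$ of $d\times d$ complex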
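matrices $\Gamma$ such that $\Im\Gamma$ is definite-positive. (See [13] for properties of $\Sigma^+(d)$.)
Let us define the Fourier–Bargmann transform $F_B^{(\Gamma)}$ as follows: for $\psi\in L^2(\mathbb{R}^d)$,
\[
F_B^{(\Gamma)}[\psi](z) = (2\pi\hbar)^{-d/2}\langle\psi,\varphi_z^{(\Gamma)}\rangle. \tag{2.6}
\]
For $z\in\mathbb{R}^{2d}$, $\varphi_z^{(\Gamma)}$ is the following coherent state living at $z$, $z = (q,p)\in\mathbb{R}^d\times\mathbb{R}^d$, $x\in\mathbb{R}^d$,
\[
\varphi_z^{(\Gamma)}(x) = (\pi\hbar)^{-d/4}\,a_\Gamma\exp\Bigl(\frac{i}{\hbar}\Bigl(p\cdot x - \frac{p\cdot q}{2}\Bigr) + \frac{i\,\Gamma(x-q)\cdot(x-q)}{2\hbar}\Bigr). \tag{2.7}
\]
$F_B^{(\Gamma)}$ is an isometry from $L^2(\mathbb{R}^d)$ into $L^2(\mathbb{R}^{2d})$ (with the Lebesgue measures). If $\Gamma = iI$ we denote $F_B = F_B^{iI}$; its range consists of $F\in L^2(\mathbb{R}^{2d})$ such that $\exp\bigl(\frac{p^2}{2\hbar} + i\frac{q\cdot p}{2\hbar}\bigr)F(q,p)$ is holomorphic in $\mathbb{C}^d$ in the variable $q - ip$. In other words,
\[
F_B\psi(z) = E_\psi(q - ip)\exp\Bigl(-\frac{p^2}{2\hbar} - i\frac{q\cdot p}{2\hbar}\Bigr) \tag{2.8}
\]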

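where $E_\psi$ is entire in $\mathbb{C}^d$ (see [25]). Moreover we have the inversion formula
\[
\psi(x) = \int_{\mathbb{R}^{2d}} F_B^{(\Gamma)}[\psi](z)\,\varphi_z^{(\Gamma)}(x)\,dz, \quad\text{in the } L^2\text{-sense}. \tag{2.9}
\]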

These properties are well known (see [25, 5]). Sometimes we shall use the shorter
()
notation = FB and = .

Let us denote by $\hat R[F_t]$ the quantum propagator for the Hamiltonian $\hat H(t)$ (this is the metaplectic representation of $F_t$) and $K^{(F_t)}$ its Schwartz kernel. We know that $\Lambda_\hbar\hat R[F_t]g$ is the following Gaussian state [10, 13],
\[
\Lambda_\hbar\hat R[F_t]g(x) = (\pi\hbar)^{-d/4}\,a_\Gamma(t)\exp\Bigl(\frac{i}{2\hbar}\Gamma_t x\cdot x\Bigr) \tag{2.10}
\]
where $a_\Gamma(t) = [\det(A_t + B_t\Gamma)]^{-1/2}a_\Gamma$, the complex square root is computed by continuity^a from $t = t_0 = 0$, and
\[
\Gamma_t = (C_t + D_t\Gamma)(A_t + B_t\Gamma)^{-1}, \qquad \Gamma_{t_0} = \Gamma. \tag{2.11}
\]

Proposition 2.1. We have the following exact formula



M (t , ) (,)
K (Ft ) (x, y) = 2d/2 (2)3d/2 det1/2 e (t,z;x,y)
dz (2.12)
i R 2d

where $\Theta, \Gamma_t\in\Sigma^+(d)$, $\Gamma_t$ is $C^1$ in $t$, $M(\Gamma_t,\Theta) = C + D\bar\Theta - \Gamma_t(A + B\bar\Theta)$ and
\[
\Phi^{(\Gamma_t,\Theta)}(t,z;x,y) = \frac12(q_t\cdot p_t - q\cdot p) + p_t\cdot(x-q_t) - p\cdot(y-q) + \frac12\bigl(\Gamma_t(x-q_t)\cdot(x-q_t) - \bar\Theta(y-q)\cdot(y-q)\bigr).
\]
Let us remark that here the action is $S(t,z) = \frac12(q_t\cdot p_t - q\cdot p)$.

First of all let us remark that the integral (2.12) is an oscillating integral and is defined, as usual, by integrations by parts. We shall give two proofs of this formula.

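Proof I. We start with any $\Gamma_0$ in the Siegel space $\Sigma^+(d)$. Using the formula
\[
\psi(x) = (2\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}}\langle\psi,\varphi_z^{\Gamma_0}\rangle\,\varphi_z^{\Gamma_0}(x)\,dz
\]
we get the formula
\[
K^{(F_t)}(x,y) = (2\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}}\overline{\varphi_z^{\Gamma_0}(y)}\,\varphi_{z_t}^{(\Gamma_t)}(x)\,dz. \tag{2.13}
\]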

a This definition of $\det^{1/2}$ is different from the $\det^{1/2}$ function on $\Sigma^+(d)$; this is explained in [10], where it is used to compute the Maslov index.

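So, we get
\[
K^{(F_t)}(x,y) = (2\pi\hbar)^{-3d/2}\,k_0(t)\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\Gamma_t,\Gamma_0)}(t,z;x,y)}\,dz, \tag{2.14}
\]
where
\[
k_0(t) = 2^{d/2}\,\frac{{\det}^{1/2}(\Im\Gamma_0)}{{\det}^{1/2}(A + B\Gamma_0)}.
\]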

Now we shall transform the phase (t ,0 ) into the phase (,0 ) .


Let us introduce (s) = s + (1 s)t , 0 s 1. We have (s) + (d). We
want to nd k(t, s) such that k(t, 0) = k0 (t) and

i (t ,0 )
k(t, s) e (t,z;x,y)
dz = 0, s [0, 1]. (2.15)
s R2d

We have
i (t ,0 ) i i (t ,0 )
e = (t t )(x qt ) (x qt )e  .
s 2
The main trick used here and later in this paper, and also in all the previous papers
on this subject ([23, 22, 29]), is to integrate by parts to convert each factor (x qt )
into , using the following equality
(A + B
p ), = (C + D
(q + ))(x qt ) (2.16)

where A denotes the transposed matrix of A. Let us introduce the matrix


(A + B ).
M = M (, ) = C + D

So we have
i (,)  
p e i , .
M (x qt )e  = q + (2.17)
i
Let us remark that M is invertible. This is a consequence of the following lemma
(see [11, 13] or [28, Appendix A], for proofs).

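Lemma 2.2. For every linear symplectic map $F: T^*(\mathbb{R}^d)\to T^*(\mathbb{R}^d)$, $F = \begin{pmatrix} A & B\\ C & D\end{pmatrix}$, and every $\Gamma\in\Sigma^+(d)$, $(A + B\Gamma)$ and $(C + D\Gamma)$ are invertible in $\mathbb{C}^d$ and $(C + D\Gamma)(A + B\Gamma)^{-1}\in\Sigma^+(d)$.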
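Lemma 2.2 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the sampler `random_symplectic` (a product of two shears and a block-diagonal factor) and the way $\Gamma$ is drawn are choices made for this example, not constructions from the paper. For each sample it forms $(C + D\Gamma)(A + B\Gamma)^{-1}$ and records how far the result is from being symmetric and the smallest eigenvalue of its imaginary part.

```python
import numpy as np

def random_symplectic(d, rng):
    """Illustrative sampler: product of three elementary symplectic matrices."""
    I, Z = np.eye(d), np.zeros((d, d))
    S1 = rng.normal(size=(d, d)); S1 = 0.5 * (S1 + S1.T)   # symmetric shear
    S2 = rng.normal(size=(d, d)); S2 = 0.5 * (S2 + S2.T)   # symmetric shear
    N = rng.normal(size=(d, d))
    G = N @ N.T + I                                        # invertible block
    upper = np.block([[I, S1], [Z, I]])
    lower = np.block([[I, Z], [S2, I]])
    diag = np.block([[G, Z], [Z, np.linalg.inv(G).T]])
    return upper @ diag @ lower

rng = np.random.default_rng(0)
worst_symmetry_defect = 0.0
smallest_eigenvalue = np.inf
for d in (1, 2, 3):
    for _ in range(20):
        F = random_symplectic(d, rng)
        A, B, C, D = F[:d, :d], F[:d, d:], F[d:, :d], F[d:, d:]
        # Gamma = X + iY with X symmetric and Y positive definite.
        X = rng.normal(size=(d, d)); X = 0.5 * (X + X.T)
        N = rng.normal(size=(d, d)); Y = N @ N.T + np.eye(d)
        Gamma = X + 1j * Y
        Gamma_t = (C + D @ Gamma) @ np.linalg.inv(A + B @ Gamma)
        scale = max(1.0, np.abs(Gamma_t).max())
        defect = np.abs(Gamma_t - Gamma_t.T).max() / scale
        worst_symmetry_defect = max(worst_symmetry_defect, defect)
        im = 0.5 * (Gamma_t.imag + Gamma_t.imag.T)
        smallest_eigenvalue = min(smallest_eigenvalue, np.linalg.eigvalsh(im).min())

print(worst_symmetry_defect, smallest_eigenvalue)
```

A symmetry defect at rounding level together with a strictly positive smallest eigenvalue is what membership in $\Sigma^+(d)$ predicts for every sample.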

So we have
+ B) = ((C + D)(A + B)1 )(A
= C + D (A
M + B)1 .

But (C + D)(A + B)1 ) + (d) so is invertible.


Denote M (t, s) = M (s , t ). Let us recall the Liouville formula

\[
\partial_s\det(M(t,s)) = \det(M(t,s))\,\mathrm{Tr}\bigl(\partial_s M(t,s)\,M(t,s)^{-1}\bigr). \tag{2.18}
\]

So, integrating by parts in $(q,p)$ we get
\[
k(t,s) = k(t,0)\,\frac{{\det}^{1/2}M(t,s)}{{\det}^{1/2}M(t,0)}. \tag{2.19}
\]
Now we have to compute $\frac{k(t,0)}{{\det}^{1/2}M(t,0)}$. A simple computation gives $M(t,0) = (D - \Gamma_t B)(\bar\Gamma_0 - \Gamma_0)$. The proof of (2.12) follows from the formula
\[
\det(D - \Gamma_t B) = \det(A + B\Gamma_0)^{-1}. \tag{2.20}
\]
This equality follows from the symplecticity of $F$ ($D^TB = B^TD$). We have $B^T\Gamma_t B - D^TB = -(A + B\Gamma_0)^{-1}B$. So we get (2.20) if $\det B\neq 0$. The general case follows by a density argument. Let us remark that we can exchange the role of $\Gamma$ and $\Theta$ by considering the adjoint $U(t)^*$ of $U(t)$.
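The determinant identity (2.20) is easy to test on a concrete example. The sketch below assumes a hypothetical symplectic matrix in dimension $d = 2$, built as a shear composed with a rotation in each $(q_j, p_j)$ plane, and an arbitrary $\Gamma_0$ with positive definite imaginary part. These particular matrices are illustrative choices, not taken from the paper.

```python
import numpy as np

d = 2
I, Z = np.eye(d), np.zeros((d, d))

# Illustrative symplectic matrix: shear composed with a rotation by angle t.
t = 0.8
c, s = np.cos(t), np.sin(t)
rotation = np.block([[c * I, s * I], [-s * I, c * I]])
S = np.array([[0.5, 0.2], [0.2, -0.3]])           # symmetric, so the shear is symplectic
shear = np.block([[I, Z], [S, I]])
F = shear @ rotation
A, B, C, D = F[:d, :d], F[:d, d:], F[d:, :d], F[d:, d:]

# Sanity check of symplecticity: F^T J F = J.
J = np.block([[Z, I], [-I, Z]])
symplectic_defect = np.abs(F.T @ J @ F - J).max()

# Arbitrary complex symmetric Gamma_0 with positive definite imaginary part.
Gamma0 = np.array([[0.3 + 1.0j, 0.1], [0.1, -0.2 + 2.0j]])
Gamma_t = (C + D @ Gamma0) @ np.linalg.inv(A + B @ Gamma0)

lhs = np.linalg.det(D - Gamma_t @ B)
rhs = 1.0 / np.linalg.det(A + B @ Gamma0)
print(abs(lhs - rhs))
```

Here $B = \sin(t)\,I$ is invertible, which is the case treated directly in the proof; the density argument covers the remaining matrices.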

Proof II. We solve directly the Schrödinger equation
\[
\Bigl(i\hbar\frac{\partial}{\partial t} - \hat H(t)\Bigr)\psi(t,x) = 0 \tag{2.21}
\]
for any initial data $\psi(x) := \psi(0,x)$, $\psi\in\mathcal{S}(\mathbb{R}^d)$, using the ansatz
\[
\psi(t,x) = (2\pi\hbar)^{-3d/2}\,k(t)\int_{\mathbb{R}^{2d}\times\mathbb{R}^d} e^{\frac{i}{\hbar}\Phi^{(\Gamma,\Theta)}(t,z;x,y)}\,\psi(y)\,dz\,dy. \tag{2.22}
\]
We have to compute $k(t)$ such that $k(0) = 2^{d/2}$. Let us remark that if we integrate first in $y$ then the integral (2.22) in $z$ converges because the Fourier–Bargmann transform of $\psi$, $F_B\psi$, is in the Schwartz space $\mathcal{S}(\mathbb{R}^{2d})$.
For simplicity, we assume here that $\Gamma = \Theta = iI$. The general case can be reached by the same method or by using the deformation argument of Proof I, as we shall see later for more general Hamiltonians.

Here the Hamiltonian $\hat H(t)$ is a quadratic form. So using dilations we can assume that $\hbar = 1$. A simple computation, left to the reader, gives the following:

Lemma 2.3.
(g 1 H(t)g)(x)
= Gx x + i(L + L )x x Kx x + Tr(K iL) (2.23)
|x|2
2
where g(x) = e .
So we get

(,)
= (2)3d/2
i
(it H(t))(t)
e (t,z;x,y)
b(t, x, z)(y)dzdy
R2d Rd
(2.24)
where
b(t, z, z) = it k(t) k(t)(E(x qt ) (x qt ) + Tr(K iL)).

As in Proof I, we integrate by parts in the variable z R2d , using


(q ip ) = M (x qt )
with M = C B i(A + D), which is invertible (see below Lemma 3.2). Using the
Hamilton equation of motion we get
M = E(A iB) i(K iL)M. (2.25)
So, we find the following differential equation for $k(t)$,
\[
\dot k = \frac12\,\mathrm{Tr}\bigl(\dot M M^{-1}\bigr)\,k. \tag{2.26}
\]
Using the Liouville formula, we get again (2.12) for this particular phase.

3. Proof of Theorems 1.2 and 1.4

As usual for this kind of problem there are two steps: (1) determine the amplitudes $a_j$ by solving transport differential equations by induction; (2) estimate the error between the approximated propagator and the exact one.

3.1. Transport equations

It is convenient to write
\[
e^{\frac{i}{\hbar}\Phi} = (\pi\hbar)^{d/2}\,\varphi_{z_t}(x)\,\overline{\varphi_z(y)}\,e^{\frac{i}{\hbar}\left(S(t,z) + (p\cdot q - p_t\cdot q_t)/2\right)}. \tag{3.1}
\]
Then we have to compute $\hat H^\hbar(t)\varphi_{z_t}$. It is not difficult to add contributions of the lower order terms of the Hamiltonian, so we shall assume for simplicity that $H^\hbar(t) = H_0(t) := H(t)$.
Lemma 3.1. For every N 2 we have
 ||/2
x qt

H(t)zt (x) = H(t, zt ) zt (x)
! X 
||N

+ (N +1)/2 T (zt ) Opw


1 [RN (t, zt )]g(x) (3.2)
where
 
1
(1 s)N
RN (t, zt , X) = X H(t, zt + s X)X ds (3.3)
0 N!
||=N +1

and is a universal polynomial of degree || which is even or odd according ||


is even or odd.

Proof. Let us recall that z = T(z) g. In this proof we put zt = z. An easy


property of Weyl quantization gives

1
 T (z)H(t)T (z) = Op1 [H(  +z)].
w
(3.4)
So the lemma follows easily from the Taylor formula with integral remainder.

In this first step, we do not take care of remainder estimates; this will be done in the next step. Let us denote $I(a,\Phi)$ the formal operator having the Schwartz kernel
\[
K_a(x,y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\,a(t,z)\,dz. \tag{3.5}
\]

From the Lemma 3.1, we can write


 ||/2
x qt

H(t)I(a, ) I(b, ), where b
X H(t, zt ) a. (3.6)

! 

We have

(x) = h, x . (3.7)

The quadratic part can be computed as for quadratic Hamiltonians and the linear
part disappears with the classical motion. So we have

b H(t, zt )a + (q H(t, zt ) + ip H(t, zt )) (x qt )a



x qt x qt
+ E + Tr(K iL) a (3.8)
 
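where we denote $\partial^2_{X,X}H(t,X)$ the Hessian matrix of $H(t)$. We have
\[
\partial^2_{X,X}H(t,z_t) = \begin{pmatrix} G & L^T\\ L & K\end{pmatrix}, \qquad E = G + 2iL - K, \tag{3.9}
\]
with $G := \partial^2_{q,q}H(t,z_t)$, $L := \partial^2_{q,p}H(t,z_t)$, $K := \partial^2_{p,p}H(t,z_t)$.
Here the stability matrix $F_t = \begin{pmatrix} A_t & B_t\\ C_t & D_t\end{pmatrix}$ satisfies $\dot F_t = J\,\partial^2_{X,X}H(t,z_t)\,F_t$, $F_{t=0} = I$.
As in the quadratic case we want to transform the power of $(x - q_t)$ into power of $\hbar$.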

Lemma 3.2. Let us denote Mt = (Ct Bt ) i(At + Dt ). We have

|det Mt | 2d , and (3.10)


i i
(q ip )e  = iMt (x qt )e  (3.11)

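Proof. For simplicity, let us forget the lower index $t$.
Let us consider the $2d\times 2d$ matrix
\[
I + F + iJ(I - F) = \begin{pmatrix} I + A - iC & B + i(I - D)\\ C - i(I - A) & I + D + iB\end{pmatrix}
= \begin{pmatrix} I + A - iC & -i(D + iB) + iI\\ i(A - iC) - iI & I + D + iB\end{pmatrix}. \tag{3.12}
\]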

Using [13, Lemma 4, Appendix A], we get
\[
\begin{aligned}
\det(I + F + iJ(I - F)) &= \det\bigl((I + A - iC)(I + D + iB) - (A - iC - I)(D + iB - I)\bigr)\\
&= 2^d\det\bigl(A + D + i(B - C)\bigr).
\end{aligned} \tag{3.13}
\]
Using that $F$ is symplectic, we get
\[
(I + F + iJ(I - F))^*(I + F + iJ(I - F)) = (I + F^*)(I + F) + (I - F^*)(I - F) \ge 2I_{2d}, \tag{3.14}
\]
hence (3.10) follows.
Let us recall classical computations for the derivatives of the action
q S = (q qt ) pt p, (3.15)
p S = (p qt ) pt . (3.16)
Then we can compute q , p and we get (3.11).
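The two algebraic facts used in this proof can be checked numerically. The sketch below is an illustration under stated assumptions: `random_symplectic` is a hypothetical sampler (two shears and a block-diagonal factor) introduced only for this example. It compares both sides of (3.13) and records the smallest observed value of $|\det(A + D + i(B - C))|/2^d$. Since the left-hand side of (3.14) equals $2(I + F^*F)$ and the eigenvalues of $F^*F$ come in pairs $\lambda, 1/\lambda$ for symplectic $F$, this ratio should be at least 1.

```python
import numpy as np

def random_symplectic(d, rng):
    """Illustrative sampler: product of three elementary symplectic matrices."""
    I, Z = np.eye(d), np.zeros((d, d))
    S1 = rng.normal(size=(d, d)); S1 = 0.5 * (S1 + S1.T)
    S2 = rng.normal(size=(d, d)); S2 = 0.5 * (S2 + S2.T)
    N = rng.normal(size=(d, d))
    G = N @ N.T + I
    upper = np.block([[I, S1], [Z, I]])
    lower = np.block([[I, Z], [S2, I]])
    diag = np.block([[G, Z], [Z, np.linalg.inv(G).T]])
    return upper @ diag @ lower

rng = np.random.default_rng(0)
worst_identity_error = 0.0
smallest_ratio = np.inf
for d in (1, 2, 3):
    I, Z = np.eye(d), np.zeros((d, d))
    J = np.block([[Z, I], [-I, Z]])
    I2 = np.eye(2 * d)
    for _ in range(20):
        F = random_symplectic(d, rng)
        A, B, C, D = F[:d, :d], F[:d, d:], F[d:, :d], F[d:, d:]
        big_det = np.linalg.det(I2 + F + 1j * (J @ (I2 - F)))   # left side of (3.13)
        small_det = np.linalg.det(A + D + 1j * (B - C))
        target = 2**d * small_det                                # right side of (3.13)
        error = abs(big_det - target) / abs(target)
        worst_identity_error = max(worst_identity_error, error)
        smallest_ratio = min(smallest_ratio, abs(small_det) / 2**d)

print(worst_identity_error, smallest_ratio)
```

The first quantity measures the relative mismatch in (3.13), and the second shows that the determinant entering the leading amplitude stays away from zero on these samples.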

Integrate by parts like in the quadratic case, we get


(it H(t))I(a,
)  I(f, ) (3.17)
where
 ||/2
1 1 x qt
f i t a Tr(M M )a +
H(t, zt ) a. (3.18)
2 ! X 
||3

Hence using the Liouville formula, we get the rst term



a0 (t, z) = 2d/2 det1/2 (iM . (3.19)
We shall obtain the next terms aj by successive integrations by parts. This is solved
more explicitly with the following lemma.

Lemma 3.3. For any symbol b O0 (2d), and every multiindex N2d we have
  
||
i i
(x qt ) e  b(z)dz = f, (t, z)e  z b(z)dz (3.20)
R2d || R2d
2 ||||

where f, (t, z) are symbols of order 0, uniformly bounded in O0 (2d) on bounded


time intervals. They only depend on the classical flow t (z) and its derivatives.
More precisely, let us assume that there exists a positive function (T ) such that
for every N2d we have
sup |z t (z)| C (T )|| . (3.21)
|a|T

Then we have
|z
f, (z)| C,;
(T )||||+|
|. (3.22)

Proof. The lemma is easily obtained by induction on || using Lemma 3.2.

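Now, to determine the transport equation, we solve inductively on $j\ge 0$ the equation
\[
\bigl(i\hbar\partial_t - \hat H(t)\bigr)\,I\Bigl(\sum_{0\le k\le j+1}\hbar^k a_k(t),\,\Phi\Bigr) = O(\hbar^{j+2}). \tag{3.23}
\]
Reasoning by induction on $j\ge 0$, we get the transport equation for $a_{j+1}(t)$ by cancellation of the coefficient of $\hbar^{j+1}$ in (3.23):
\[
\partial_t a_{j+1}(t,z) = \frac12\,\mathrm{Tr}\bigl(\dot M M^{-1}\bigr)\,a_{j+1}(t,z) + b_j(t,z), \qquad a_{j+1}(0,z) = 0, \tag{3.24}
\]
where
\[
b_j(t,z) = \sum_{|\gamma| + 2k\le 2(j+2)} F_{j,k,\gamma}(t,z)\,\partial_z^\gamma a_k(t,z). \tag{3.25}
\]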

Moreover, Fj,k, (t, z) depends only on the classical ow t (z) and its derivatives
and satises
|z Fj,k, (t, z)| Cj,k,, (T )2(jk+2)+|||| (3.26)
where Cj,k,, only depends on sup|t|T |H(t)|, , 2 || j + 2.
So we get, for every $j\ge 0$,
\[
a_{j+1}(t,z) = \int_0^t {\det}^{1/2}\bigl(M(t,z)M(s,z)^{-1}\bigr)\,b_j(s,z)\,ds. \tag{3.27}
\]
Moreover, from (3.25) and (3.26), we get the following estimate, for every $j\ge 0$, $|t|\le T$, $z\in\mathbb{R}^{2d}$,
\[
|\partial_z^\gamma a_j(t,z)| \le C_{j,\gamma}\,|{\det}^{1/2}M(t,z)|\,\mu(T)^{4j+|\gamma|} \tag{3.28}
\]
with the same remark as in (3.26) for the constant $C_{j,\gamma}$.

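3.2. Error estimates

Let us denote
\[
R_N(t) = \bigl(i\hbar\partial_t - \hat H(t)\bigr)\,I\bigl(a^{(N)}(t),\Phi\bigr) \tag{3.29}
\]
where $a^{(N)}(t) = \sum_{0\le k\le N}\hbar^k a_k$. Using the Duhamel formula, we have
\[
\|U^\hbar(t) - U^{N,\hbar}(t)\| \le \hbar^{-1}\int_0^t\|R_N(s)\|\,ds \tag{3.30}
\]
where $t_0 = 0$, $U^\hbar(t) = U^\hbar(t,0)$, $U^{N,\hbar}(t) = I(a^{(N)}(t),\Phi)$.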


So we have to estimate RN (t). Let us denote K (N ) (x, y) the Schwartz kernel
(N ) (X, Y ) the Schwartz kernel of RN (t) in the FourierBargmann
of RN (t) and K

representation:

(N ) (X, Y ) =
K K (N ) (x, y)X (y)Y (x)dxdy. (3.31)
Rd Rd

Let be R N (t) the operator with Schwartz kernel K


(N ) (X, Y ). The following lemma
is well known. Here we forget N and t for simplicity.

Lemma 3.4. We have the L2 norm estimate

RL2 (Rd ) (2)d R


L2 (Rd ) . (3.32)

In particular, we have
   
d
RL2 (Rd ) (2) max sup |K(X, Y )|dX, sup |K(X, Y )|dY .
(3.33)
Y X

Proof. For inequality (3.32) we use that the FourierBargmann transform is an


isometry. Inequality (3.33) is known as Carleman (or Schur) L2 estimate.
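The Carleman (or Schur) estimate has a simple finite-dimensional analogue that can be tested directly. In the sketch below a random complex matrix stands in for the kernel (an assumption made purely for illustration). Its operator norm is compared with the geometric mean of the largest absolute row sum and the largest absolute column sum, which in turn is bounded by the maximum of the two, the form used in (3.33).

```python
import numpy as np

rng = np.random.default_rng(1)
# Random complex matrix playing the role of a discretized kernel K(X, Y).
K = rng.normal(size=(40, 60)) + 1j * rng.normal(size=(40, 60))

operator_norm = np.linalg.norm(K, 2)          # largest singular value
max_row_sum = np.abs(K).sum(axis=1).max()     # analogue of sup_X of the integral in Y
max_col_sum = np.abs(K).sum(axis=0).max()     # analogue of sup_Y of the integral in X
schur_bound = np.sqrt(max_row_sum * max_col_sum)

print(operator_norm, schur_bound, max(max_row_sum, max_col_sum))
```

The bound is usually far from sharp for random matrices; it becomes efficient for kernels concentrated near a graph, which is the situation of interest here.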

Using Lemma 3.1, we get

(N ) (X, Y ) = 23d/2 ()d


K

i

T (zt ) Opw
1 [RN (t)]g, Y
X , z a
(N )
(t, z)e  (t,z) dz
R2d
(3.34)

t qt
where (t, z) = S(t, z) + pqp
2 .
Using Weyl commutation formula, we have

|X z|2 i

X , z = exp + (X, z) , (3.35)
4 2
1 [RN (t)]g, Y =
Op1 [RN (t)]g, g Y

T (zt ) Opw w
zt . (3.36)


We know the Wigner function W0,Z of the pair (g, gZ ), Z R2d ([28])
  2 
 Z 
W0,Z (X) = 22d exp X  i(X, Z) . (3.37)
2

By a well-known property of Weyl quantization ([13]), for any symbol s, we have



d

Opw
1 [s]g, gZ = (2) s(X)W0,Z (X)dX (3.38)
R2d

We shall use the following lemma

Lemma 3.5. Let be f O0 (2d). For every N2d and m > 0 there exists C,m
such that
 
 
 |XZ|2 iJZX
dX 
 2d X f (X)e
R

C,m (1 + |Z|)m sup |Y f (Y )|. (3.39)


||m+||; Y R2d

Proof. It is enough to assume |Z| 1. We integrate m times by parts with the


dierential operator
2(X Z) iJZ X
L= X (3.40)
4|X z|2 + |JZ|2

using that (L )m = ||m lm, X

, with |lm, | Cm, (|Z| + |X Z|)m , where
(X) = |X Z|2 iJZ X.

So using Lemma 3.5 we get the following estimate: for every N ; N  there exists
CN,N  (depending only on semi-norms |H(t)|, , 2 || N + N  , such that for
X, Y R2d and |t| T we have
 N +1
(N ) (X, Y )| CN,N  ((T ))N +N  2 d
|K
 N 
|Xz|2 |Y zt |
e 4 1+ |a(N ) (t, z)|dz. (3.41)
R 2d 
Let us denote t = 0,t = (t )1 . We have the Lipchitz estimate, for |t| T ,
|,t Y z| (T )|Y zt |. (3.42)
So we get
 N   N 
 |Y | |t

 |Xz|
2
z t  Y X|
 e 4 1+ dz  CN  1 + (3.43)
 R2d   (T ) 
and
|K
(N ) (X, Y )|
N 
N +N  N +1 |t Y X|
CN,N  ((T ))  2 1+ sup |a(N ) (t, z)|.
(T )  zR2d ,|t|T

(3.44)

Then using Lemma 3.4 and choosing N > 2d, we get the following uniform L2
estimate for the remainder term, for |t| T ,
RN (t) CN ((T ))N +1 (N +1)/2 sup |a(N ) (t, z)|. (3.45)
zR2d ,|t|T

If $T$ is fixed, pushing the expansion up to $2N$ instead of $N$, we easily get Theorem 1.2 using the Duhamel formula.

Using global estimates on aj (t, z) obtained from the transport equation (3.28)
and pushing the asymptotic expansion up to 2N , we get the proof of Theorem 1.4
using again the Duhamel formula.

4. Varying Phase. Proof of Theorem 1.3


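To avoid technicalities, we fix the time $t$. It would not be difficult to follow a time parameter $t$ if necessary for applications. So in this section, $\phi$ is a symplectic diffeomorphism in $\mathbb{R}^{2d}$, such that $\phi$, $\phi^{-1}$ are Lipschitz continuous and belong to $\mathcal{O}_1(2d)$.
We denote $z = (q,p)\in\mathbb{R}^{2d}$, $\phi(z) = (Q(z),P(z))\in\mathbb{R}^d\times\mathbb{R}^d$ and $S$ an action for $\phi$, i.e. a primitive on $\mathbb{R}^{2d}$ of the closed 1-form $P\cdot dQ - p\cdot dq$. We consider the following phases
\[
\Phi^{(\phi,\Gamma,\Theta)}(z;x,y) = S(z) + P\cdot(x - Q) - p\cdot(y - q) + \frac12\bigl(\Gamma(x - Q)\cdot(x - Q) - \bar\Theta(y - q)\cdot(y - q)\bigr). \tag{4.1}
\]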
This class of Fourier-Integral Operators with complex quadratic phase was already analyzed in [29]. We want to show here how to vary the choice of the matrices $\Gamma$, $\Theta$ for a given canonical transformation $\phi$ of $\mathbb{R}^{2d}$. As in Sec. 3, let us denote $I(a,\Phi)$ the operator with the Schwartz kernel
\[
K_a(x,y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\phi,\Gamma,\Theta)}(z;x,y)}\,a(z)\,dz \tag{4.2}
\]
where $a\in\mathcal{O}_0(2d)$, $\Phi = \Phi^{(\phi,\Gamma,\Theta)}$.


Using a FourierBargmann transform and the following estimate: there exist
C > 0, c > 0 such that for all X R2d , we have

c|X|2
|
, X | C exp , (4.3)


we can estimate the FourierBargmann transform K a (X, Y ) of Ka and prove that


I(a, ) is bounded in L (R ) (see Sec. 3, Lemma 3.5 and Sec. 5 below).
2 d

Our goal in this section is to prove the following result which gives Theorem 1.3
as a particular case.

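Proposition 4.1. Let $\Gamma, \Gamma', \Theta, \Theta'$ be 4 matrices in $\Sigma^+(d)$ and $a\in\mathcal{O}_0(2d)$. $\Gamma$, $\Gamma'$ may be $z$ dependent such that
\[
\exists c > 0,\quad \Im(\Gamma)v\cdot v \ge c|v|^2, \qquad \forall z\in\mathbb{R}^{2d}, \tag{4.4}
\]
\[
\forall\gamma,\ |\gamma|\ge 1,\ \exists C_\gamma,\quad \|\partial_z^\gamma(\Gamma)\| \le C_\gamma, \qquad \forall z\in\mathbb{R}^{2d}. \tag{4.5}
\]
Then there exists a semi-classical symbol $a' \asymp \sum_j\hbar^j a'_j$ of order 0 such that we have, for the $L^2$ operator norm,
\[
I\bigl(a,\Phi^{(\phi,\Gamma,\Theta)}\bigr) = I\bigl(a',\Phi^{(\phi,\Gamma',\Theta')}\bigr) + O(\hbar^\infty). \tag{4.6}
\]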

Moreover we have for the principal symbol $a'_0$ the formula
\[
a'_0(z) = a_0(z)\,\frac{{\det}^{1/2}(M(1))}{{\det}^{1/2}(M(0))} \tag{4.7}
\]
where $M(s) := C + D\bar\Theta - \bigl((1-s)\Gamma + s\Gamma'\bigr)(A + B\bar\Theta)$.

Proof. The method is rather simple and is an extension of what we have already
done for quadratic Hamiltonians (Proof I) except that here we have to solve trans-
port equations in the deformation parameter s to get the lower order correction
terms.
Let us remark that this class of Fourier-Integral Operators is closed under
adjointness:

I(a, (,) ) = I(a , ), (4.8)

where a (Z) = a
(1 Z), Z = (Q, P ), Z = (z) and

(Z; x, y) = S(1 Z) + p (x q) P (y Q)
1
+ ((x q) (x q) (y
Q) (y Q)). (4.9)
2
So by transitivity we can assume that =  . As in the quadratic Hamiltonian
case, let us introduce, s = (1 s) + s , (s) = (s ,) , 0 s 1 and look for
 (s)
a semiclassical symbol a(s) = j j aj such that

(s)
e  (z;x,y) a(s) (z)dz = O( ), s [0, 1].
i
(4.10)
s R2d
However, we have
(s) i
(z; x, y) = ( )(x Q) (x Q) (4.11)
s 
and we have to nd a C 1 family symbol a(s) , 0 s 1 such that

i 
I s a + ( )(x Q) (x Q)a , = O( ).
(s) (s)
(4.12)


The principal term a0 = a(1) is computed as in the quadratic case.


Let us suppose for a moment that ,  are constant. Then as in the quadratic
case we have
(A + B
p )(s) = (C + D
(q + )s )(x Q) (4.13)
A B 
where A = q Q, B = p Q, C = q P , D = p P and F = C D is a symplectic
matrix.
We know that M (s) := C + D s (A + B )
is invertible so we can integrate
by parts as in Sec. 3. and as above we can achieve the proof of Proposition 4.1.

When ,  are z dependent, the integrations by part are more tricky. We have
to use
p )(s) = M (s, z)(x Q) + N (s, z)(x Q, x Q)
(q + (4.14)
where N (s, z)(x, y) is a bilinear application in (x, y) Rd Rd into d d matrices,
with coecients in O0 in z, C 1 in s.
Hence we have
(,)   
i
(x Q)e  p e i (,)
= (M )1 (s, z) q +
i
 (,)
(M ) (s, z)N (s, z)(x Q, x Q)e i . (4.15)
So we apply (4.15) and the following lemmas to proceed like in Sec. 3.

Lemma 4.2. For any symbol b O0 (2d), for every multiindex N2d and every
N ||/2 we have

i (s)
(x Q) e  b(z)dz
R 2d

 
(s)
||
i
= f, (s, z)e  b(z)dz
|| R2d
2 ||N
 
(s)
|| i
+  g, (s, z)(x Q) e  g, b(z)dz (4.16)
||+||=N +1,||1 R2d

where f, (s, z), g, (s, z) are symbols of order 0, uniformly bounded in O0 (2d) for
s [0, 1].

Lemma 4.3. For every b O0 (2d) and Nd we have the crude L2 estimate,
uniform in s [0, 1],

I((x Q) b, (s)  = O(||/2 ). (4.17)



Using these two lemmas we get the full semiclassical symbol a j j aj , where

det1/2 (M (s))
a0 (z) = a0 (4.18)
det1/2 (M (0))
and for j 1, aj is computed by induction as solution for s = 1 of the dierential
equation
 
s aj (s) = Tr M (s)M 1 (s) aj (s) + bj (s), aj (0) = aj . (4.19)
where bj (s) depends on the ak (s), k j 1.

Remark 4.4. Considering the adjoint operator, it is possible to exchange the role
of the matrices and . If the symbol a depends smoothly on some parameter ,
it is not dicult to show that a also depends smoothly in .

Proof of Lemma 4.2. This is done by an induction on N such that N .

Proof of Lemma 4.3. Let us begin by giving a simple proof of (4.3) when is z
dependent satisfying the assumptions (4.4) and (4.5) of Proposition 4.1. We shall
prove the more general estimate, for every Nd there exist C > 0, c > 0 such
that
2
|
x g , gY | Cec|Y | , Y R2d . (4.20)
Let us denote Y = (y, ) Rd Rd . By a direct estimate we get easily,
2
|
x g , gY | Ce2c|y| , (y, ) R2d . (4.21)
Using Fourier transform and Plancherel formula, we exchange y and and we
get (4.20).
Now we can follow the method of Sec. 3 to estimate L2 norm of operators using
a FourierBargmann transformation.

Let be K(X, Y ) the FourierBargmann kernel of I((x Q) b, (s) ). We have

Y ) = 23d/2 ()d ||/2
i

K(X,
T (Z) (x g ), Y
X , z b(z)e  (t,z) dz
R2d
(4.22)
where Z = (Q, P ) = (z) and
|
T (Z) (x g ), Y | = |
x g , g YZ |. (4.23)


So we get

c
|K(X,
Y )| C||/2 exp (|Y (z)| + |X z| dz.
2 2
(4.24)
R2d 
Using that is a Lipchitz canonical transformation, we have, for C0 large enough
and c0 > 0 small enough,

||/2 c0
|K(X, Y )| C0 
exp (|Y (X)| .
2
(4.25)

Hence we get the proof of Lemma 4.3 using Lemma 3.4.

We have proved Proposition 4.1 and Theorem 1.3.

5. Semiclassical Fourier Integral Operators


In [23, 8] and in the recent preprint [30], the authors have considered Fourier-
Integral Operators dened by the following simpler phase
1
(,) (p; x, y) = S(y, p) + P (x Q) + (x Q) (x Q) (5.1)
2
where (Q, P ) = (y, p), is a bilipchitz canonical transformation like above,
+ (d).

In [23, 8] the authors have proved semiclassical expansions for the propagator of the Schrödinger equation for initial data with a compact support. This result is extended in [30] for the Schrödinger Hamiltonian $-\hbar^2\Delta + V$, to general data in $L^2$ with uniform norm estimates. We shall give here some extensions of results of [30] using the same techniques as in Secs. 3 and 4, so we shall not repeat the details.
Let us denote J (a, , ) the operator whose Schwartz kernel is

(,)
K(x, y) = (2)d
i
e (p;x,y)
a(y, p)dp. (5.2)
Rd
A natural question discussed in this section is to compare the Fourier-Integral
Operators I(a, (,,) ) dened with 2d frequency variables and J (a, (,) )
dened with d frequency variables.
A Fourier integral operator in $L^2(\mathbb{R}^d)$ is always a quantization of a canonical
transformation in the cotangent space $T^*(\mathbb{R}^d)$. A nice way to make this
relationship clear is to use a Fourier–Bargmann transform (see [7, 31]). This can be
easily done in the same way for Semiclassical Fourier-Integral Operators, as we shall
see now.
Definition 5.1. A family of operators, depending on a small parameter  ]0, 1],
U  : S(Rd ) S  (Rd ) is a Semiclassical Fourier-Integral Operator of order m R
associated to the canonical bilipchitz transformation : T (Rd ) T (Rd ), if for
every N  we have U  = UN     
 + RN  where UN  : S(R ) S (R ) and RN   =
d d

O(N ) and for every N 0 there exists CN such that
N
 |Y (X)|
|K (X, Y )| CN 
m3d/2
1+ , X, Y R2d ,  ]0, 1],

(5.3)
 
(X, Y ) is the Schwartz kernel of FB U  F .
where K N B

Remark 5.2. (1) In this definition, which coincides with a definition given in [31]
for $\hbar = 1$, a Semiclassical Fourier-Integral Operator has, up to a negligible
operator in $\hbar$, a kernel living in a neighborhood of the graph of a canonical
transformation. But this definition says nothing concerning the asymptotic
expansion of the kernel in a neighborhood of this graph when $\hbar$ is small.
So this definition is certainly too permissive. But for $\hbar$ fixed it is suitable, as
proven in [31].
(2) Using the Carleman–Schur estimate, a Semiclassical Fourier-Integral Operator of
order 0 is uniformly bounded in $L^2(\mathbb{R}^d)$. This is a straightforward consequence
of the definition. The class of Semiclassical Fourier-Integral Operators of order
0 is clearly closed under composition.
(3) In Definition 5.1, it is equivalent to use any Fourier–Bargmann transformation
()
FB , + (d).
(4) There are other definitions of Semiclassical Fourier-Integral Operators using
Lagrangian analysis and real phase functions. For this point of view, see, for
example, [1].

(5) Fourier-Integral Operators with complex phase were used to study propagation
of singularities of P.D.E. Many papers and books have been published on this
subject, among them let us point out [2, 26, 32].
Now we shall see that the operators already considered in this paper are Semiclas-
sical Fourier-Integral Operators.

Proposition 5.3. Let be amplitudes a = a(x, z), a O0 (3d) and u = u(x, y, p),
u O0 (3d) and , + (d), may depend in z or (y, p), such that (1.16), (1.17)
are satisfied. Then I(a, (,,) ) and J (u, , ) are Semiclassical Fourier-Integral
Operators of order 0.

Proof. Concerning I(a, (,,) ), we get the result following Sec. 3.2, esti-
mate (3.44).
The proof for J (u, , ) is almost the same. For simplicity we assume con-
stant. For depending in (y, p) we could proceed as in Sec. 4.
Let us denote X = ( Y = (
x, ), y , ). We want to estimate

d i
K(X, Y ) = (2) e  u(x, y, p)dpdxdy (5.4)
R3d
where

= S(y, p) + P (x Q) + (x Q) (x Q)
2
i i
+ ( x y) + (
x y) ( x y) + (y x) (
y x) + (
y x). (5.5)
2 2
 D B   B
Let us remark that we have: F 1 = C
A if F = A C D . So, because F
1
is
symplectic, we know that D B is invertible. Hence we have

= (C A )(x Q) + ( p) + i(
y x y), (5.6)
= (D B )(x Q),
p (5.7)
= (x Q) + (P ) + i(
x y x). (5.8)
by integrations by parts using
So we get the necessary estimates on K
(A + C )(D B )1 p = ( p) + i(
y x y), (5.9)
(D B )1 p = (P ) + i(
x y x). (5.10)

The following result is a slight generalization of [23, 8, 30].

Theorem 5.4. Under the assumptions of Theorem 1.2 and (1.16), (1.17), we have

i (t ,t )
K(t; x, y)  (2)d e (t,y,p,x)
u(; t, y, p)dp (5.11)
Rd

where u(; t, y, p) = 0j<+ uj (t; y, p)j has the same meaning as in
Theorem 1.2.

In particular
u0 (t, y, p) = det1/2 (D B)). (5.12)

Sketch of Proof. This result can be proved following the same strategy as for
proving Theorem 1.3.
We first prove the theorem for a particular choice of the quadratic form in the
phase (the one equal to $iI$), following the proof of Theorem 1.2. Then we can get
the theorem for any admissible choice by the variation argument as in the proof
of Theorem 1.3. An $L^2$ estimate for the operator norm of Fourier-Integral
Operators is used to control the remainder terms.

Remark 5.5. It is not difficult to adapt the proof of Theorem 1.4 concerning an
Ehrenfest time estimate to the setting of Theorem 5.4.

References
[1] I. Alexandrova, Semi-classical wave front set and Fourier-Integral Operators, Canad.
J. Math. 60 (2008) 241–263.
[2] V. M. Babich and V. S. Buldreyev, Asymptotic Methods in Short Waves Diffraction
Problem (Moscow Nauka, 1972, in Russian); (Springer, 1991, English translation).
[3] V. Bargmann, On the Hilbert space of analytic functions and associated integral
transform, Comm. Pure Appl. Math. 14 (1961) 187–214.
[4] J. M. Bily, Propagation d'états cohérents et applications, Ph.D. thesis, Université de
Nantes (2001).
[5] J. M. Bily and D. Robert, The semi-classical Van Vleck formula. Application to
the Aharonov–Bohm effect, in Long Time Behaviour of Classical and Quantum Systems,
Proceedings of the Bologna APTEX International Conference, Bologna, Italy,
September 13–17, 1999 (World Scientific, 2001), pp. 89–106.
[6] A. Bouzouina and D. Robert, Uniform semiclassical estimates for the propagation of
quantum observables, Duke Math. J. 111(2) (2002) 223–252.
[7] J. M. Bony, Evolution equations and microlocal analysis, in Hyperbolic Problems and
Related Topics, Grad. Ser. Anal. (International Press, 2003), pp. 17–40.
[8] J. Butler, Global h Fourier integral operators with complex-valued phase functions,
Bull. London Math. Soc. 34(4) (2002) 479–489.
[9] M. Combescure and D. Robert, Semiclassical spreading of quantum wave packets
and applications near unstable fixed points of the classical flow, Asymptot. Anal. 14
(1997) 377–404.
[10] M. Combescure and D. Robert, Quadratic quantum Hamiltonians revisited, Cubo
8(1) (2006) 61–86.
[11] A. Córdoba and C. Fefferman, Wave packets and Fourier Integral Operators, Comm.
Partial Differential Equations 3(11) (1978) 979–1005.
[12] B. Fedosov, Deformation Quantization and Index Theory, Mathematical Topics,
Vol. 9 (Akademie Verlag, 1996).
[13] G. B. Folland, Harmonic Analysis in Phase Space, Annals of Mathematics Studies,
Vol. 122 (Princeton University Press, Princeton, NJ, 1989).
[14] D. Fujiwara, A construction of the fundamental solution for the Schrödinger equation,
J. Anal. Math. 35 (1979) 41–96.
[15] G. Hagedorn, Semiclassical quantum mechanics I: The $\hbar \to 0$ limit for coherent states,
Comm. Math. Phys. 71 (1980) 77–93.
[16] G. Hagedorn and A. Joye, Exponentially accurate semi-classical dynamics: Propagation,
localization, Ehrenfest times, scattering and more general states, Ann. Henri
Poincaré 1(5) (2000) 837–883.
[17] E. J. Heller, Time-dependent approach to semiclassical dynamics, J. Chem. Phys.
62(4) (1975) 1544–1555.
[18] E. J. Heller, Frozen Gaussians: A very simple semiclassical approximation, J. Chem.
Phys. 75(6) (1981) 2923–2931.
[19] M. F. Herman and E. Kluk, A semiclassical justification for the use of non-spreading
wavepackets in dynamics calculations, Chem. Phys. 91(1) (1984) 27–34.
[20] L. Hörmander, The Analysis of Linear Partial Differential Operators I (Springer-
Verlag, 1983).
[21] K. Kay, Integral expressions for the semi-classical time-dependent propagator, J.
Chem. Phys. 100(6) (1994) 4377–4392.
[22] K. Kay, The Herman–Kluk approximation: Derivation and semiclassical corrections,
Chem. Phys. 322 (2006) 3–12.
[23] A. Laptev and I. M. Sigal, Global Fourier Integral Operators and semiclassical asymptotics,
Rev. Math. Phys. 12(5) (2000) 749–766.
[24] R. Littlejohn, The semiclassical evolution of wave packets, Phys. Rep. 138(4–5)
(1986) 193–291.
[25] A. Martinez, An Introduction to Semiclassical and Microlocal Analysis, Universitext
(Springer-Verlag, 2002).
[26] J. Ralston, Gaussian beams and propagation of singularities, in Studies in Partial
Differential Equations, MAA Stud. Math., Vol. 23 (Math. Assoc. America, 1982),
pp. 246–248.
[27] D. Robert, Autour de l'approximation Semi-Classique, Progress in Mathematics,
No. 68 (Birkhäuser, 1987).
[28] D. Robert, Propagation of coherent states in quantum mechanics and applications, in
Partial Differential Equations and Applications, Sémin. Congr., Vol. 15 (Soc. Math.
France, 2007), pp. 181–250.
[29] V. Rousse and T. Swart, A mathematical justification for the Herman–Kluk propagator,
Comm. Math. Phys. 286 (2009) 725–750.
[30] V. Rousse, Semiclassical simple initial value representations, Université Paris 12
(2009), arXiv:0904.0387.
[31] D. Tataru, Phase space transforms and microlocal analysis, in Phase Space Analysis
of Partial Differential Equations, Publ. Cent. Ric. Mat. Ennio Giorgi, Vol. II (Scuola
Norm. Sup. Pisa, 2004), pp. 505–524.
[32] J. Sjöstrand, Singularités analytiques microlocales, Astérisque 95 (1982) 1–166.
[33] T. Swart, Initial value representation, Ph.D. thesis, Freie Universität Berlin (2008).
November 16, 2010 15:27 WSPC/S0129-055X 148-RMP
J070-S0129055X10004168

Reviews in Mathematical Physics


Vol. 22, No. 10 (2010) 1147–1179

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004168

ALMOST ADDITIVE THERMODYNAMIC FORMALISM:
SOME RECENT DEVELOPMENTS

LUIS BARREIRA
Departamento de Matemática, Instituto Superior Técnico,
1049-001 Lisboa, Portugal
barreira@math.ist.utl.pt

Received 9 November 2009


Revised 13 July 2010

This is a survey on recent developments concerning a thermodynamic formalism for
almost additive sequences of functions. While the nonadditive thermodynamic formalism
applies to much more general sequences, at the present stage of the theory there
are no general results concerning, for example, a variational principle for the topological
pressure or the existence of equilibrium or Gibbs measures (at least without further
restrictive assumptions). On the other hand, in the case of almost additive sequences, it
is possible to establish a variational principle and to discuss the existence and uniqueness
of equilibrium and Gibbs measures, among several other results. After presenting in a
self-contained manner the foundations of the theory, the survey includes the description
of three applications of the almost additive thermodynamic formalism: a multifractal
analysis of Lyapunov exponents for a class of nonconformal repellers; a conditional vari-
ational principle for limits of almost additive sequences; and the study of dimension
spectra that consider simultaneously limits into the future and into the past.

Keywords: Almost additive sequences; thermodynamic formalism.

Mathematics Subject Classification 2010: 37C45, 37D20, 37D35

1. Introduction
The point of departure for this survey is the nonadditive thermodynamic formal-
ism developed in [1], having in mind certain applications to the dimension theory
of dynamical systems, as detailed below. Our main aim is to survey some recent
developments in the particular case of almost additive sequences of functions.
During the last two decades, the dimension theory of dynamical systems progressively
developed into an independent field of research, roughly speaking with
the objective of measuring the complexity from the dimensional point of view
of the objects that remain invariant under the dynamics, such as the invariant
sets and measures. The first monograph that clearly took this point of view was
Pesin's book ([36]), which describes the state-of-the-art up to 1997. We refer to our
book ([4]) for a detailed description of many of the more recent results in the area.


The nonadditive thermodynamic formalism is a generalization of the classical
thermodynamic formalism, in which the topological pressure $P(\varphi)$ of a continuous
function $\varphi$ (with respect to a given dynamics on a compact metric space) is
replaced by the topological pressure $P(\Phi)$ of a sequence of continuous functions
$\Phi = (\varphi_n)_n$. The classical pressure $P(\varphi)$ was introduced by Ruelle in [39] for
expansive maps (see also his book [40]), and by Walters in [46] in the general case. For
arbitrary sets (not necessarily compact), the nonadditive topological pressure also
generalizes (and imitates) the notion of topological pressure introduced by Pesin
and Pitskel' in [37], which is equivalent to the notion introduced earlier by Bowen
in [13] (see [37]). The nonadditive thermodynamic formalism contains as a particular
case a new formulation of the subadditive thermodynamic formalism introduced
earlier by Falconer in [19].
The main motivation behind the nonadditive thermodynamic formalism is to
allow certain applications to a more general class of invariant sets in the context of
the dimension theory of dynamical systems. We first recall that the unique solution
$s$ of the equation

$$P(s\varphi) = 0, \tag{1}$$

where $\varphi$ is a certain function associated to a given invariant set, is often related to
the Hausdorff dimension of the set. Equation (1) was introduced by Bowen in [15]
(in his study of quasi-circles) and is usually called Bowen's equation. It is also
appropriate to call it Bowen–Ruelle's equation, taking into account the fundamental
role of the thermodynamic formalism developed by Ruelle, and of his article [41].
Virtually all known equations used to compute or to estimate the dimension of
invariant sets are particular cases of Eq. (1) or of an appropriate generalization.
We recommend [42] for a quite detailed and informative related discussion.
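
As a small illustration of Eq. (1), consider the simplest conformal case (an assumption made for this sketch, not an example taken from the text): a repeller of a piecewise linear expanding map with $p$ full branches of slopes $1/r_1, \dots, 1/r_p$. For $\varphi = -\log|f'|$ the pressure is $P(s\varphi) = \log \sum_i r_i^s$, so Bowen's equation reduces to $\sum_i r_i^s = 1$. The function names below are hypothetical.

```python
import math

def pressure(s, ratios):
    """P(s*phi) = log(sum_i r_i**s) for phi = -log|f'| on a full-branch linear repeller."""
    return math.log(sum(r ** s for r in ratios))

def bowen_root(ratios, lo=0.0, hi=1.0, tol=1e-12):
    """Solve P(s*phi) = 0 by bisection; s -> P(s*phi) is strictly decreasing."""
    while pressure(hi, ratios) > 0:   # enlarge the bracket until the pressure is nonpositive
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure(mid, ratios) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Middle-third Cantor set: two branches with contraction ratio 1/3.
s_cantor = bowen_root([1 / 3, 1 / 3])
```

For the middle-third Cantor set the root agrees with the classical value $\log 2/\log 3$ of the Hausdorff dimension.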
On the other hand, in certain applications of dimension theory (we refer to the
examples in [1, 4]), one is naturally led to consider sequences $\Phi = (\varphi_n)_n$ that may
satisfy no additivity between the functions $\varphi_n$. The nonadditive topological pressure
and its associated thermodynamic formalism allow us to consider these generalizations
in a unified framework. In particular, this made it possible to establish in [1] sharp
lower and upper dimension estimates for repellers and hyperbolic sets, including for
a class of nondifferentiable maps, without further effort. The dimension estimates
are obtained as solutions of appropriate generalizations of Eq. (1), now involving
the nonadditive topological pressure.
Given a continuous function $\varphi \colon X \to \mathbb{R}$ in a compact metric space $X$, the
classical topological pressure of $\varphi$, with respect to a continuous map $f \colon X \to X$,
satisfies the variational principle

$$P(\varphi) = \sup_\mu \left( h_\mu(f) + \int_X \varphi \, d\mu \right),$$

where $h_\mu(f)$ is the Kolmogorov–Sinai entropy of $f$ with respect to the measure $\mu$,
and where the supremum is taken over all $f$-invariant probability measures $\mu$ on $X$.
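
The variational principle can be checked by hand in a simple case. The sketch below assumes the full shift on two symbols and a potential $\varphi$ depending only on the first symbol, with values $a$ and $b$ (a toy setting chosen for illustration, not one discussed in the text). In that case $P(\varphi) = \log(e^a + e^b)$, and for the Bernoulli measure with weights $(q, 1-q)$ the quantity $h_\mu(f) + \int \varphi \, d\mu$ equals $-q \log q - (1-q)\log(1-q) + qa + (1-q)b$, which is maximal at $q = e^a/(e^a + e^b)$.

```python
import math

def free_energy(q, a, b):
    """h_mu + integral of phi for the Bernoulli(q, 1-q) measure on the full 2-shift."""
    entropy = -sum(w * math.log(w) for w in (q, 1.0 - q) if w > 0.0)
    return entropy + q * a + (1.0 - q) * b

def pressure(a, b):
    """Topological pressure of a potential taking the values a, b on the two 1-cylinders."""
    return math.log(math.exp(a) + math.exp(b))

a, b = 0.3, -1.2
q_star = math.exp(a) / (math.exp(a) + math.exp(b))   # weights of the equilibrium measure
grid_max = max(free_energy(k / 1000.0, a, b) for k in range(1001))
```

Restricting to Bernoulli measures gives in general only a lower bound for the supremum; here the bound is attained because the equilibrium measure of a potential depending on one symbol is itself Bernoulli.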

The thermodynamic formalism developed in [1] also includes a variational principle
for the topological pressure, although with a restrictive assumption on the
sequence $\Phi$. Namely, if there exists a continuous function $\psi \colon X \to \mathbb{R}$ such that

$$\varphi_{n+1} - \varphi_n \circ f \to \psi \quad \text{uniformly when } n \to \infty, \tag{2}$$

then

$$P(\Phi) = \sup_\mu \left( h_\mu(f) + \int_X \psi \, d\mu \right),$$

again with the supremum taken over all $f$-invariant probability measures $\mu$ on $X$.
Because of the restrictive assumption in (2), until recently there was no available
discussion of equilibrium and Gibbs measures in the general context of the nonadditive
thermodynamic formalism. But it is well known that equilibrium and Gibbs
measures play a prominent role in dimension theory, and in particular in the
multifractal analysis of dynamical systems, in which the spectra are often obtained by
providing equilibrium measures with the appropriate local entropy or the appropriate
pointwise dimension. Equilibrium and Gibbs measures can also be, for example,
measures of full topological entropy or full Hausdorff dimension. It is sometimes
possible to develop the theory without a variational principle for the topological
pressure, and thus without these measures, but the corresponding proofs tend to
be more technical. Clearly, from the points of view of dimension theory and
multifractal analysis, it is desirable to continue using equilibrium and Gibbs measures
even when the classical thermodynamic formalism cannot be used.
The discussion above justifies the interest in looking for more general classes of
sequences of functions, although perhaps not arbitrary sequences, for which it is
still possible to establish a corresponding variational principle for the topological
pressure, and to study the associated equilibrium and Gibbs measures, among several
other results. This is precisely what happens with the so-called almost additive
sequences, for which it is possible not only to establish a variational principle, but
also to discuss the existence and uniqueness of equilibrium and Gibbs measures. We
recall that a sequence $\Phi = (\varphi_n)_n$ is said to be almost additive if there is a constant
$C > 0$ such that

$$-C + \varphi_n + \varphi_m \circ f^n \le \varphi_{n+m} \le C + \varphi_n + \varphi_m \circ f^n \tag{3}$$

for every $n, m \in \mathbb{N}$. Clearly, for any function $\varphi$ the sequence

$$\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$$

is almost additive, since in this case

$$\varphi_{n+m} = \varphi_n + \varphi_m \circ f^n$$

for every $n, m \in \mathbb{N}$. Nontrivial examples of almost additive sequences occur, for
example, in the study of Lyapunov exponents for nonconformal maps by Barreira

and Gelfert in [7] (see Sec. 7). Following [3], we consider in particular repellers
and hyperbolic sets of $C^1$ transformations, and for an almost additive sequence
of continuous functions we describe several results towards the foundations of an
almost additive thermodynamic formalism. This includes the formula

$$P(\Phi) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{x : f^n(x) = x} \exp \varphi_n(x)$$

for the topological pressure, for the class of almost additive sequences with tempered
variation. We also describe a variational principle for the topological pressure
of an almost additive sequence, namely

$$P(\Phi) = \sup_\mu \left( h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_X \varphi_n \, d\mu \right), \tag{4}$$

and we discuss the existence and uniqueness of equilibrium and invariant Gibbs
measures, among several other results, for example concerning characterizations
of unique equilibrium measures. Mummert ([34]) independently established identity (4),
although under an additional assumption on the sequence $\Phi$ that can be
removed by repeating verbatim arguments in [3]. Cao, Feng and Huang more recently
considered in [16] the general class of subadditive sequences, and they also
obtained the variational principle in (4), but they do not discuss the existence of
equilibrium or Gibbs measures. Earlier results in this direction were obtained by
Käenmäki in [30] for a particular class of subadditive sequences, while also discussing
the existence of an equilibrium measure.
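
The periodic-point formula for the pressure can be tested by brute force on a toy example. The sketch below assumes the full shift on $p$ symbols and the additive sequence generated by a potential that depends only on the first symbol (its values are stored in the hypothetical list `values`); neither assumption comes from the text. Points of period $n$ then correspond to words of length $n$, and the sum factorizes, so that $\frac{1}{n}\log\sum \exp\varphi_n$ equals $\log \sum_i e^{v_i}$ for every $n$, where $v_i$ are the entries of `values`.

```python
import itertools
import math

def periodic_point_pressure(values, n):
    """(1/n) log of the sum of exp(phi_n(x)) over the points with sigma^n(x) = x.

    Each such point of the full shift is the periodic repetition of a word
    (i_1, ..., i_n), and phi_n(x) = values[i_1] + ... + values[i_n].
    """
    total = 0.0
    for word in itertools.product(range(len(values)), repeat=n):
        total += math.exp(sum(values[i] for i in word))
    return math.log(total) / n

values = [0.3, -1.2, 0.5]
exact = math.log(sum(math.exp(v) for v in values))
estimates = [periodic_point_pressure(values, n) for n in range(1, 7)]
```

In this additive example the limit is reached already at $n = 1$; for a genuinely almost additive sequence one would expect convergence only as $n \to \infty$.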
After presenting the foundations of the almost additive thermodynamic formalism,
we describe three applications of the formalism.
The first application, following Barreira and Gelfert in [7], considers nonconformal
repellers in $\mathbb{R}^2$ satisfying a cone condition. The main objective is to obtain
a multifractal analysis for the level sets of the Lyapunov exponents. In particular,
we consider certain almost additive sequences related to the Lyapunov exponents
to which one can apply the almost additive thermodynamic formalism. However,
we emphasize that the results in [7] were obtained independently of the theory
described in the survey. We also point out that the proofs of some results in Secs. 4–6
can be considered a distillation of arguments in that paper. We recall that a
differentiable map $f$ is said to be conformal on a given set provided that the differential
$d_x f$ is a multiple of an isometry at every point $x$ of the set. We emphasize that the
dimension theory and the multifractal analysis of dynamical systems are only
completely understood in the case of conformal uniformly hyperbolic dynamics, either
invertible or noninvertible. This includes saddle-type hyperbolic diffeomorphisms
on surfaces, and holomorphic maps in the complex plane with a hyperbolic Julia
set. The study of the dimension of invariant sets of nonconformal transformations
has proven to be much more delicate. The main difficulty is related to the possible
existence of distinct Lyapunov exponents in different directions, which may
change from point to point. Another difficulty is that certain number-theoretical

properties may play an important role. Nevertheless, there exist several notewor-
thy results concerning the dimension theory of certain classes of invariant sets
of nonconformal transformations, namely due to Falconer ([18, 20]), Bothe ([12]),
Simon ([44]), and Simon and Solomyak ([45]). We refer to [4] for a related discussion.
The second application, following Barreira and Doutor in [5], has the objective of
establishing a conditional variational principle for the multifractal spectra obtained
from limits of almost additive sequences. This means that we consider the level sets

$$K_\alpha = \left\{ x \in X : \lim_{n \to \infty} \frac{\varphi_n(x)}{\psi_n(x)} = \alpha \right\},$$

where $(\varphi_n)_n$ and $(\psi_n)_n$ are almost additive sequences, and we give a description
of their topological entropy or Hausdorff dimension in terms of a conditional
variational principle. For example, in the case of the topological entropy the conditional
variational principle takes the form

$$h(f \mid K_\alpha) = \max \left\{ h_\mu(f) : \lim_{n \to \infty} \frac{\int_X \varphi_n \, d\mu}{\int_X \psi_n \, d\mu} = \alpha \right\},$$

where $h(f \mid K_\alpha)$ denotes the topological entropy on $K_\alpha$. It is also shown that the
spectra, such as $\alpha \mapsto h(f \mid K_\alpha)$, are continuous, and that the associated irregular
sets have full dimension. The approach in [5] builds on related arguments in
former work of Barreira et al. in [9], although now for almost additive sequences.
The multifractal analysis of dynamical systems can be considered a subfield of the
dimension theory of dynamical systems, and it studies the complexity of the level
sets of invariant local quantities obtained from a dynamical system. The concept
of multifractal analysis was suggested by Halsey et al. in [27]. The first rigorous
approach is due to Collet, Lebowitz and Porzio in [17] for a class of measures
invariant under 1-dimensional Markov maps. In [32], Lopes considered the measure
of maximal entropy for hyperbolic Julia sets, and in [38], Rand studied Gibbs measures
for a class of repellers. We refer the reader to the books [4, 36] for details and
further references.
The third application, following Barreira and Doutor in [6], is a complete
description of the dimension spectra of limits of almost additive sequences on a
hyperbolic set of a surface diffeomorphism. The main novelty is that we consider
simultaneously limits into the future and into the past. More precisely, the spectra
are obtained by computing the Hausdorff dimension of the level sets of limits of
almost additive sequences both for positive and negative time. We emphasize that
the description of the spectra is not a consequence of the results considering simply
limits into the future (or into the past). The main difficulty is that although the local
product structure provided by the intersection of stable and unstable manifolds is

bi-Lipschitz equivalent to a product, the level sets are never compact (so that
their box dimension is strictly larger than their Hausdorff dimension), and thus
the product of level sets may have a dimension that need not be the sum of the
dimensions of the sets. Instead we explicitly construct noninvariant measures
concentrated on each product of level sets having the appropriate pointwise dimension.
This approach builds on former work of Barreira and Valls in [11] in the additive
case.

2. Nonadditive Topological Pressure


2.1. General theory
We recall in this section the notion of nonadditive topological pressure introduced
by Barreira in [1]. The main idea is to replace each sequence of functions
$\sum_{k=0}^{n-1} \varphi \circ f^k$ in the definition of topological pressure by an arbitrary sequence $\varphi_n$.
Let $f \colon X \to X$ be a continuous transformation of a compact metric space $X$.
Given a finite open cover $\mathcal{U}$ of $X$, we denote by $W_n(\mathcal{U})$ the collection of vectors
$U = (U_0, \dots, U_n)$ with $U_0, \dots, U_n \in \mathcal{U}$. For each $U \in W_n(\mathcal{U})$, we write $m(U) = n$,
and we consider the open set

$$X(U) = \bigcap_{k=0}^{n} f^{-k} U_k.$$

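These sets can be thought of as cylinder sets. Now let $\Phi$ be a sequence of continuous
functions $\varphi_n \colon X \to \mathbb{R}$ for each $n \in \mathbb{N}$. We define

$$\gamma_n(\Phi, \mathcal{U}) = \sup\{ |\varphi_n(x) - \varphi_n(y)| : x, y \in X(U) \text{ for some } U \in W_n(\mathcal{U}) \} \tag{5}$$

for each $n \in \mathbb{N}$, and we always assume that

$$\limsup_{\operatorname{diam} \mathcal{U} \to 0} \, \limsup_{n \to \infty} \frac{\gamma_n(\Phi, \mathcal{U})}{n} = 0. \tag{6}$$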
We observe that condition (6) holds automatically when $\Phi$ is an additive sequence,
that is, when

$$\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k \tag{7}$$

for a given continuous function $\varphi \colon X \to \mathbb{R}$ and each $n \in \mathbb{N}$ (this is an immediate
consequence of the uniform continuity of any continuous function in the compact
metric space $X$). Now we proceed with the construction of the nonadditive topological
pressure. For each $U \in W_n(\mathcal{U})$ we write

$$\varphi(U) = \begin{cases} \sup_{X(U)} \varphi_n & \text{if } X(U) \ne \emptyset, \\ -\infty & \text{if } X(U) = \emptyset. \end{cases} \tag{8}$$

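Given a set $Z \subset X$ and a number $\alpha \in \mathbb{R}$, we define the function

$$M(Z, \alpha, \Phi, \mathcal{U}) = \lim_{n \to \infty} \inf_{\Gamma} \sum_{U \in \Gamma} \exp(-\alpha m(U) + \varphi(U)),$$

where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$
such that $\bigcup_{U \in \Gamma} X(U) \supset Z$ (in other words, such that the cylinder sets $X(U)$ cover
the set $Z$). One can show that the function $\alpha \mapsto M(Z, \alpha, \Phi, \mathcal{U})$ jumps from $+\infty$ to
$0$ at a unique value of $\alpha$, and thus we can define

$$P_Z(\Phi, \mathcal{U}) = \inf\{ \alpha \in \mathbb{R} : M(Z, \alpha, \Phi, \mathcal{U}) = 0 \}.$$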
Theorem 2.1 ([1]). The following properties hold:

(1) The limit

$$P_Z(\Phi) := \lim_{\operatorname{diam} \mathcal{U} \to 0} P_Z(\Phi, \mathcal{U})$$

exists;
(2) If there exist constants $c_1, c_2 < 0$ such that $c_1 n \le \varphi_n \le c_2 n$ for every $n \in \mathbb{N}$,
and the topological entropy $h(f \mid X)$ is finite, then there exists a unique number
$s \in \mathbb{R}$ such that

$$P_Z(s\Phi) = 0.$$

The number $P_Z(\Phi)$ is called the nonadditive topological pressure of the sequence
of functions $\Phi$ (with respect to $f$ on $Z$). We note that the set $Z$ need not be compact
nor $f$-invariant. For simplicity, when there is no danger of confusion, we simply refer
to $P_Z(\Phi)$ as the topological pressure of $\Phi$ (with respect to $f$ on $Z$). We also write
$P(\Phi) = P_X(\Phi)$. One can easily verify that if $\Phi$ is the (additive) sequence of functions
in (7), then $P(\Phi)$ coincides with the classical topological pressure of the function $\varphi$.
The number $h(f \mid Z) = P_Z(0)$ is called the topological entropy of $f$ on $Z$. It
coincides with the notion of topological entropy for noncompact sets introduced
in [37], and is equivalent to the notion of topological entropy introduced earlier by
Bowen in [13]. It can be described as follows. Given a set $Z \subset X$ and a number
$\alpha \in \mathbb{R}$, we define the function

$$N(Z, \alpha, \mathcal{U}) = \lim_{n \to \infty} \inf_{\Gamma} \sum_{U \in \Gamma} \exp(-\alpha m(U)),$$

where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$
such that $\bigcup_{U \in \Gamma} X(U) \supset Z$. Then

$$h(f \mid Z) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \inf\{ \alpha \in \mathbb{R} : N(Z, \alpha, \mathcal{U}) = 0 \}.$$

2.2. Equilibrium measures for subadditive sequences


As described in the introduction, the nonadditive thermodynamic formalism devel-
oped in [1] also includes a variational principle for the topological pressure, although

with a restrictive assumption on the sequence $\Phi$ (see (2)). Nevertheless, it is still
meaningful to consider some particular classes of dynamics and potentials, and to
look for equilibrium and Gibbs measures.
With this in mind we describe in this section results by Käenmäki [30] and
by Feng and Käenmäki [25] concerning the construction of equilibrium measures
for a class of subadditive sequences in the particular case of symbolic dynamics.
These sequences are well adapted to the study of the dimension of a class of limit
sets of iterated function systems (see [30]) and of the multifractal analysis of the
top Lyapunov exponent of products of matrices (see [21, 23, 26]). We refer to the
following sections for related results concerning the existence of equilibrium and
Gibbs measures for other classes of dynamics and potentials.
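We first introduce some notation to consider the particular case of symbolic
dynamics. Given $p \in \mathbb{N}$, we write $\Sigma_n = \{1, \dots, p\}^n$ for each $n \in \mathbb{N}$ and $|\omega| = n$ for
each $\omega \in \Sigma_n$. We also write

$$\Sigma = \{1, \dots, p\}^{\mathbb{N}} \quad \text{and} \quad \Sigma^* = \bigcup_{n \in \mathbb{N}} \Sigma_n,$$

and we consider the shift map $\sigma \colon \Sigma \to \Sigma$ defined by $\sigma(i_1 i_2 \cdots) = (i_2 i_3 \cdots)$. Given $t \ge 0$
and $\omega \in \Sigma^*$, let C be the class of all (parametrized) functions $\psi^t_\omega \colon \Sigma \to \mathbb{R}^+$ with
$\psi^0_\omega = 1$ satisfying the following properties:
(1) there exists $K_t > 0$ such that $\psi^t_\omega(\omega_1) \le K_t \psi^t_\omega(\omega_2)$ for any $\omega_1, \omega_2 \in \Sigma$;
(2) for every $\omega' \in \Sigma$ and $j \in [1, |\omega|] \cap \mathbb{N}$ we have
$$\psi^t_\omega(\omega') \le \psi^t_{\omega|_j}(\sigma^j(\omega)\omega') \, \psi^t_{\sigma^j(\omega)}(\omega'),$$
where $\omega|_j$ are the first $j$ elements of $\omega$, and where $\sigma^j(\omega)\omega'$ denotes the juxtaposition
of the two sequences;
(3) for each $\delta > 0$ there exist $a = a(\delta)$, $b = b(\delta) \in (0, 1)$ depending only on $\delta$, with
$a(\delta) \to 1$ and $b(\delta) \to 1$ when $\delta \to 0$, such that
$$\psi^t_\omega(\omega') a^{|\omega|} \le \psi^{t+\delta}_\omega(\omega') \le \psi^t_\omega(\omega') b^{|\omega|}$$
for every $\omega' \in \Sigma$.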
We note that this class of functions contains as particular examples several classes
earlier considered by Falconer [18, 20] and by Barreira [2], in connection with the
study of the dimension of repellers of nonconformal transformations.
For any function in the class C, using the subadditivity it is shown in [30] that
given $\omega' \in \Sigma$ and a $\sigma$-invariant probability measure $\mu$ in $\Sigma$, the limits

$$p(t) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{\omega \in \Sigma_n} \psi^t_\omega(\omega') \tag{9}$$

and

$$s_\mu(t) = \lim_{n \to \infty} \frac{1}{n} \sum_{\omega \in \Sigma_n} \mu(C_\omega) \log \psi^t_\omega(\omega')$$

exist (where $C_\omega$ is the set of sequences whose first $n$ elements are equal to
those of $\omega$). Moreover, they are independent of $\omega' \in \Sigma$.
To verify that p(t) is indeed a particular case of the nonadditive topological
pressure, given  and n N we dene a sequence n : R by
tn () = sup log t  (  ). (10)
 C

Then the first condition on the class C ensures that (6) holds, and we can show that
$p(t)$ coincides with the nonadditive topological pressure of the sequence $\Phi_t = (\varphi^t_n)_n$
for any $\omega' \in \Sigma$. This follows readily from results in [1] using the second condition on C.
Moreover, by the third condition we can readily apply Theorem 2.1 to conclude
that there exists a unique $t \ge 0$ such that $p(t) = 0$ (the proof of this statement
in [30] follows the same argument). This zero is often related to the dimension
of certain classes of limit sets of iterated function systems and repellers (see for
example [1, 2, 4, 18, 20]).
In addition, the following property holds.
Theorem 2.2 ([30]). We have

$$p(t) \ge h_\mu(\sigma) + s_\mu(t). \tag{11}$$

By Kingman's subadditive ergodic theorem, we have

$$s_\mu(t) = \lim_{n \to \infty} \frac{1}{n} \int \varphi^t_n \, d\mu,$$

and thus, inequality (11) can be written in the form

$$P(\Phi_t) \ge h_\mu(\sigma) + \lim_{n \to \infty} \frac{1}{n} \int \varphi^t_n \, d\mu.$$

This inequality is due to Falconer [19] in the general case of arbitrary subadditive
sequences (and not only for the sequences $\Phi_t$) with a bounded distortion condition
(which in the present context is given by the first condition on C). Assuming a
certain Lipschitz property for the elements of the sequence (more generally for
topological Markov chains), he also obtained the variational principle

$$P(\Phi_t) = \sup_\mu \left( h_\mu(\sigma) + \lim_{n \to \infty} \frac{1}{n} \int \varphi^t_n \, d\mu \right). \tag{12}$$
In an analogous manner to that in the classical additive theory, we say that a
$\sigma$-invariant probability measure $\mu$ in $\Sigma$ is an equilibrium measure for the sequence $\Phi_t$
if it attains the supremum in (12). In the present context the existence of equilibrium
measures was established by Käenmäki.
Theorem 2.3 ([30]). For each $t \ge 0$ there exists an equilibrium measure for the
sequence $\Phi_t$.
The existence of these equilibrium measures is used in [30] to study the dimension
of a class of limit sets of iterated function systems.

Now we consider a particular class of functions in C that are obtained from
products of matrices. Given $p, m \in \mathbb{N}$, let $M_1, \dots, M_p$ be $m \times m$ matrices. For each
$t > 0$, $n \in \mathbb{N}$ and $\omega \in \Sigma_n$, we consider the constant function

$$\psi^t_\omega = \| M_{i_1} \cdots M_{i_n} \|^t,$$

where $\omega = (i_1 \cdots i_n)$, and again we define a sequence $\Phi_t$ as in (10), that is,
tn () = sup log t  (  ) = sup log Mi1 Min t ,
 C  C

where = (i1 in ). One can easily verify that the functions t belong to the


class C, and that p(t) in (9) is given by


1 
p(t) = lim Mi1 Min t .
n n
n

Moreover, given a -invariant probability measure in , we have


1 
s (t) = t lim (C )log Mi1 Min ,
n n
n

and it follows from (12) (see also [16]) that


p(t) = sup(h () + s (t)).

The following result is due to Feng and Käenmäki.
Theorem 2.4 ([25]). If for each $n \in \mathbb{N}$ there exist $i_1, \ldots, i_n \in \{1, \ldots, m\}$ such
that $M_{i_1} \cdots M_{i_n} \ne 0$, then for each $t \ge 0$ there exist at most m ergodic equilibrium
measures for the sequence $\Phi^t$. If in addition the only proper vector space V such
that $M_i V \subset V$ for $i = 1, \ldots, m$ is the origin, then for each $t \ge 0$ there exists a
unique equilibrium measure for the sequence $\Phi^t$.

The irreducibility condition in Theorem 2.4 concerning the subspaces V is used


in [23] to show that there exist c > 0 and k N such that for each ,  there

kj=1 j for which
exists
M  c M M . (13)
It is essentially this property that allows one to establish the existence of a unique
equilibrium measure in [25]. We note that property (13) ensures that the sequence
$\Phi^t$ is almost additive (see (3)), and thus the existence of a unique ergodic measure
in Theorem 2.4 as well as its Gibbs property (also obtained in [25]) follow from
general results in [3] for the class of almost additive sequences (compare with the
results in Secs. 4 and 5).

3. Topological Pressure for Almost Additive Sequences


We introduce in this section the class of almost additive sequences, and we present
formulas for the nonadditive topological pressure. For definiteness we consider only
the case of functions defined on a repeller. We refer to the remaining sections for
further developments.

3.1. Repellers and Markov partitions


We recall in this section the notion of repeller and the notion of Markov partition.
Let $f\colon M \to M$ be a $C^1$ map, and let $\Lambda \subset M$ be a compact f-invariant set (this
means that $f^{-1}\Lambda = \Lambda$). We say that f is expanding on $\Lambda$, and that $\Lambda$ is a repeller
of f if there exist constants $c > 0$ and $\beta > 1$ such that
\[ \|d_x f^n v\| \ge c \beta^n \|v\| \]
for every $x \in \Lambda$, $n \in \mathbb{N}$, and $v \in T_x M$. In addition, we always assume in this
presentation that there is an open set $U \supset \Lambda$ such that $\Lambda = \bigcap_{n \in \mathbb{N}} f^{-n} U$, and that f
is topologically mixing on $\Lambda$.
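We recall that a collection of closed sets $R_1, \ldots, R_p$ is said to be a Markov
partition of the repeller $\Lambda$ if:
(1) $\Lambda = \bigcup_{i=1}^p R_i$, and $\overline{\operatorname{int} R_i} = R_i$ for $i = 1, \ldots, p$;
(2) $\operatorname{int} R_i \cap \operatorname{int} R_j = \emptyset$ whenever $i \ne j$;
(3) $f(R_i) \supset R_j$ whenever $f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset$.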
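We note that here the interior of each set $R_i$ is computed with respect to the
induced topology on $\Lambda$. Any repeller has Markov partitions with arbitrarily small
diameter
\[ \max\{\operatorname{diam} R_i : i = 1, \ldots, p\} \tag{14} \]
(see [41]). Given a Markov partition $R_1, \ldots, R_p$ of $\Lambda$, we define a $p \times p$ matrix
$A = (a_{ij})$ with entries
\[ a_{ij} = \begin{cases} 1 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset, \\ 0 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j = \emptyset, \end{cases} \tag{15} \]
and we consider the corresponding topological Markov chain $\sigma\colon \Sigma_A \to \Sigma_A$ defined
by the shift map $\sigma(i_1 i_2 \cdots) = (i_2 i_3 \cdots)$ in the set
\[ \Sigma_A = \{(i_1 i_2 \cdots) \in \{1, \ldots, p\}^{\mathbb{N}} : a_{i_k i_{k+1}} = 1 \text{ for every } k \in \mathbb{N}\}. \tag{16} \]
We denote by $\Sigma_{A,n}$ the set of n-tuples $(i_1 \cdots i_n)$ for which there is a sequence
$(j_1 j_2 \cdots) \in \Sigma_A$ such that $i_\ell = j_\ell$ for $\ell = 1, \ldots, n$. For each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we
define
\[ \Delta_{i_1 \cdots i_n} = \bigcap_{\ell=0}^{n-1} f^{-\ell} R_{i_{\ell+1}}, \tag{17} \]
and setting
\[ \chi(i_1 i_2 \cdots) = \bigcap_{\ell=0}^{\infty} f^{-\ell} R_{i_{\ell+1}} = \bigcap_{n=1}^{\infty} \Delta_{i_1 \cdots i_n}, \]
we obtain a coding map $\chi\colon \Sigma_A \to \Lambda$ for the repeller.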
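
The symbolic objects above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transition matrix `A` below is a hypothetical example (it is not taken from the text), symbols are numbered 1 to p as in (16), and every row of `A` is assumed to contain a 1, so that each finite admissible word extends to an infinite admissible sequence and the list returned is the set of n-tuples described above.

```python
from itertools import product

def admissible_words(A, n):
    """Return all n-tuples (i1, ..., in), with symbols 1..p, such that
    a_{i_k i_{k+1}} = 1 for every pair of consecutive symbols."""
    p = len(A)
    words = []
    for w in product(range(1, p + 1), repeat=n):
        if all(A[w[k] - 1][w[k + 1] - 1] == 1 for k in range(n - 1)):
            words.append(w)
    return words

def shift(w):
    """Shift map restricted to a finite word: drop the first symbol."""
    return w[1:]

# Hypothetical transition matrix: the symbol 2 cannot be followed by 2.
A = [[1, 1],
     [1, 0]]

for n in range(1, 6):
    print(n, len(admissible_words(A, n)))
```

The brute-force enumeration visits all p^n words, so it is only meant for small n.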



3.2. Formulas for the topological pressure


Now we introduce the class of almost additive sequences, and we describe corre-
sponding formulas for the nonadditive topological pressure both using and avoiding
Markov partitions.
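We say that the sequence of functions $\Phi = (\varphi_n)_n$ with $\varphi_n\colon \Lambda \to \mathbb{R}$ for each
$n \in \mathbb{N}$ is almost additive (with respect to f on $\Lambda$) if there exists a constant $C > 0$
such that for every $n, m \in \mathbb{N}$ and $x \in \Lambda$ we have
\[ -C + \varphi_n(x) + \varphi_m(f^n(x)) \le \varphi_{n+m}(x) \le C + \varphi_n(x) + \varphi_m(f^n(x)). \tag{18} \]
Clearly, any additive sequence of functions $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$ is almost additive.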
Nontrivial examples of almost additive sequences occur naturally for example in
the study of nonconformal repellers (see Sec. 7 for a detailed description).
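
As a rough numerical illustration of the almost additivity condition, the following Python sketch works on the full shift on two symbols with the sequence given by the logarithm of the norm of a product of positive matrices. Everything specific here is an assumption made for the example: the two matrices are hypothetical, the norm is the sum of the entries, and the constant C = log(b d / a) (with a and b the smallest and largest entries and d = 2 the dimension) comes from an elementary estimate for positive matrices, not from the text.

```python
import math
from itertools import product

# Hypothetical positive 2x2 matrices (entries between a = 1 and b = 3).
M = {1: ((2, 1), (1, 1)),
     2: ((1, 1), (1, 3))}

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def phi(word):
    """phi_n(word) = log of the entry-sum norm of M_{i1} ... M_{in}."""
    P = ((1, 0), (0, 1))
    for i in word:
        P = matmul(P, M[i])
    return math.log(sum(sum(row) for row in P))

def max_defect(n, m):
    """Largest |phi_{n+m}(w) - phi_n(w) - phi_m(shift^n w)| over all words w
    of length n + m."""
    worst = 0.0
    for w in product((1, 2), repeat=n + m):
        defect = phi(w) - phi(w[:n]) - phi(w[n:])
        worst = max(worst, abs(defect))
    return worst

C = math.log(3 * 2 / 1)   # log(b * d / a) with a = 1, b = 3, d = 2
print(max(max_defect(n, m) for n in range(1, 6) for m in range(1, 6)), "<=", C)
```

The defect is never zero here, so the sequence is not additive, but it stays below the fixed constant C for all the word lengths tested.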
Now let $\Lambda$ be a repeller of f, and let $\Delta_{i_1 \cdots i_n}$ be the sets in (17) obtained from
a given Markov partition. We write
\[ \gamma_n(\Phi) = \sup\{|\varphi_n(x) - \varphi_n(y)| : x, y \in \Delta_{i_1 \cdots i_n} \text{ and } (i_1 \cdots i_n) \in \Sigma_{A,n}\}. \tag{19} \]
One can easily verify that $\gamma_n(\Phi)$ coincides with $\gamma_n(\Phi, \mathcal{U})$ in (5) for the open cover
$\mathcal{U}$ of $\Lambda$ formed by the elements $R_1, \ldots, R_p$ of the Markov partition (with respect to
the induced topology on $\Lambda$). We say that $\Phi$ has tempered variation if $\gamma_n(\Phi)/n \to 0$
as $n \to \infty$. Clearly, any sequence with tempered variation satisfies condition (6).
The following result provides a formula for the topological pressure of an almost
additive sequence with tempered variation.
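Theorem 3.1 ([7, Proposition 3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi =
(\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered
variation. Then
\[ P_\Lambda(\Phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \varphi_n(x_{i_1 \cdots i_n}) \tag{20} \]
for any points $x_{i_1 \cdots i_n} \in \Delta_{i_1 \cdots i_n}$, for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ and $n \in \mathbb{N}$.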


The statement in Theorem 3.1 was first established by Barreira and Gelfert
in [7], and was then extended by Barreira in [3] to other classes of transformations
(see Secs. 5 and 6). We emphasize that identity (20) ensures not only that the
nonadditive topological pressure of an almost additive sequence is a limit, but also
that the limit is independent of the particular Markov partition used to define it.
For a continuous function $\varphi\colon \Lambda \to \mathbb{R}$, we recall that the (classical) topological
pressure of $\varphi$ (with respect to f on $\Lambda$) is given by
\[ P_\Lambda(\varphi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \max_{x \in \Delta_{i_1 \cdots i_n}} \sum_{k=0}^{n-1} \varphi(f^k(x)), \tag{21} \]
where $\Delta_{i_1 \cdots i_n}$ are the sets in (17) obtained from any given Markov partition. One
can easily verify that the limit in (20) exists (by showing that the first sum defines a
submultiplicative sequence). Furthermore, the limit is independent of the particular
Markov partition used to define it (see [36,47] for details). We note that identity (20)
includes identity (21) (which is often taken as the definition of topological pressure)
as a particular case.
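
To make the sum in (21) concrete, here is a minimal Python sketch for the simplest symbolic situation. It assumes (this is not in the text) that the function depends only on the first symbol of the coding, so that the maximum over each set in (17) is just the sum of the values attached to the symbols of the word. The transition matrix and the values of the function are hypothetical, and the symbols are indexed from 0.

```python
import math

def partition_sum(A, weights, n):
    """Sum over admissible words (i1...in) of exp(phi(i1) + ... + phi(in)),
    where weights[i] = exp(phi(i)) and A is the 0-1 transition matrix."""
    p = len(A)
    # v[j] = total weight of the admissible words of the current length ending in j
    v = list(weights)
    for _ in range(n - 1):
        v = [sum(v[i] * A[i][j] for i in range(p)) * weights[j] for j in range(p)]
    return sum(v)

def pressure_estimate(A, phi, n):
    """Finite-n approximation (1/n) log of the partition sum."""
    weights = [math.exp(t) for t in phi]
    return math.log(partition_sum(A, weights, n)) / n

A = [[1, 1], [1, 0]]
print(pressure_estimate(A, [0.0, 0.0], 200))      # zero function: entropy of the chain
print(math.log((1 + math.sqrt(5)) / 2))           # log of the spectral radius of A
```

For the zero function the partition sum simply counts admissible words, and the two printed numbers agree up to an error of order 1/n.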
We also have the following alternative characterization of the topological pressure.
It has the advantage of avoiding Markov partitions and the associated symbolic
dynamics. Let
\[ \operatorname{Fix}(f) = \{x \in \Lambda : f(x) = x\} \]
be the set of fixed points of f in $\Lambda$.
Theorem 3.2 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an
almost additive sequence of continuous functions on $\Lambda$ with tempered variation.
Then
\[ P_\Lambda(\Phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{x \in \operatorname{Fix}(f^n)} \exp \varphi_n(x). \tag{22} \]
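
The periodic point formula can be tried out on a standard example. The sketch below assumes the doubling map f(x) = 2x mod 1 on the circle (an expanding map chosen for illustration, not an example from the text) and an additive sequence, the Birkhoff sums of a function `phi` supplied by the caller. The fixed points of the n-th iterate are the points k/(2^n - 1), which are iterated with exact integer arithmetic.

```python
import math

def periodic_pressure(phi, n):
    """(1/n) log of the sum over the fixed points of f^n of exp(S_n phi(x)),
    for the doubling map f(x) = 2x mod 1 on the circle."""
    q = 2 ** n - 1                      # fixed points of f^n are k/q, k = 0..q-1
    total = 0.0
    for k in range(q):
        s, j = 0.0, k
        for _ in range(n):
            s += phi(j / q)             # value of phi at the current orbit point
            j = (2 * j) % q             # x -> 2x mod 1, computed on numerators
        total += math.exp(s)
    return math.log(total) / n

print(periodic_pressure(lambda x: 0.0, 12), math.log(2))
```

With the zero function the estimate approaches log 2, the topological entropy of the doubling map; with the constant function -log 2 it approaches 0.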

4. Results for Repellers


We describe in this section several results of the almost additive thermodynamic
formalism, again for definiteness in the particular case of functions defined in a repeller.
In particular, we describe a variational principle for the topological pressure. We
also introduce, for almost additive sequences, the notions of equilibrium measure
and of Gibbs measure, and we consider the problem of existence and uniqueness of
these measures.

4.1. Variational principle for the topological pressure


To formulate the variational principle for the topological pressure, we first recall the
notion of Kolmogorov-Sinai entropy. Given a measurable transformation $f\colon \Lambda \to \Lambda$,
we denote by $\mathcal{M}$ the family of f-invariant probability measures in $\Lambda$. We recall that
a measure $\mu$ in $\Lambda$ is said to be f-invariant if $\mu(f^{-1}A) = \mu(A)$ for every measurable
set $A \subset \Lambda$. Given a measure $\mu \in \mathcal{M}$ and a partition $\xi$ of $\Lambda$ into measurable subsets,
we define
\[ H_\mu(\xi) = -\sum_{C \in \xi} \mu(C) \log \mu(C), \]
with the convention that $0 \log 0 = 0$. The Kolmogorov-Sinai entropy of f with
respect to $\mu$ is given by
\[ h_\mu(f) = \sup\{h_\mu(f, \xi) : H_\mu(\xi) < \infty\}, \]
where
\[ h_\mu(f, \xi) = \inf_{n \in \mathbb{N}} \frac{1}{n} H_\mu(\xi_n), \]
for the partition $\xi_n$ of $\Lambda$ into the sets $\bigcap_{k=0}^{n-1} f^{-k} C_{k+1}$ with $C_1, \ldots, C_n \in \xi$. In the
case of invariant measures in repellers, the entropy can be obtained as follows. Given
a Markov partition of the repeller $\Lambda$, we consider the partition
\[ \xi_n = \{\Delta_{i_1 \cdots i_n} : (i_1 \cdots i_n) \in \Sigma_{A,n}\} \]
of $\Lambda$. Its entropy is given by
\[ H_\mu(\xi_n) = -\sum_{i_1 \cdots i_n} \mu(\Delta_{i_1 \cdots i_n}) \log \mu(\Delta_{i_1 \cdots i_n}), \]
and
\[ h_\mu(f) = \lim_{n\to\infty} \frac{1}{n} H_\mu(\xi_n) = \inf_{n \in \mathbb{N}} \frac{1}{n} H_\mu(\xi_n). \]
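
As a small worked example of these formulas, the following Python sketch computes the entropy of the partition into cylinders of length n for a Bernoulli measure on the full shift. The setting is an assumption made for illustration (the text works with a general invariant measure on a repeller); for a Bernoulli measure the quotient H/n is the same for every n and equals the entropy of the weight vector.

```python
import math
from itertools import product

def cylinder_entropy(p, n):
    """H_mu(xi_n) for the Bernoulli measure with weights p on the full shift:
    the measure of the cylinder (i1...in) is p[i1] * ... * p[in]."""
    H = 0.0
    for w in product(range(len(p)), repeat=n):
        m = math.prod(p[i] for i in w)
        if m > 0:                        # convention 0 log 0 = 0
            H -= m * math.log(m)
    return H

p = [0.3, 0.7]                           # hypothetical weights
h = -sum(q * math.log(q) for q in p)     # entropy of the weight vector
print(cylinder_entropy(p, 8) / 8, h)
```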

The following is a variational principle for the topological pressure.

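Theorem 4.1 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map f, and let $\Phi = (\varphi_n)_n$ be
an almost additive sequence of continuous functions on $\Lambda$ with tempered variation.
Then
\[ P_\Lambda(\Phi) = \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \int_\Lambda \lim_{n\to\infty} \frac{\varphi_n(x)}{n} \, d\mu(x) \right)
 = \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right), \tag{23} \]
including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second
limit.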

In a similar manner to that in the classical theory, it is easier to show that
\[ P_\Lambda(\Phi) \ge \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right) \]
when compared to the reverse inequality. The argument uses the subadditivity of
the sequence $\psi_n = \varphi_n + C$ (see (18)), that is, the property
\[ \psi_{n+m} \le \psi_n + \psi_m \circ f^n, \quad n, m \in \mathbb{N}, \]
together with Kingman's subadditive ergodic theorem. The proof of the reverse
inequality uses analogous arguments to those in the proof of [1, Theorem 1.7],
which in turn were inspired by arguments of Bowen in [14]. The fact that the
supremum can be replaced by a maximum in (23) follows from the upper
semi-continuity of the map
\[ \mathcal{M} \ni \mu \mapsto h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu, \tag{24} \]
since $\mu \mapsto h_\mu(f)$ is upper semi-continuous in this setting, and since the limit in (24)
is continuous in $\mu$.

4.2. Equilibrium and Gibbs measures


We continue to consider a repeller $\Lambda$ of a $C^1$ map f. In an analogous manner to
that in the classical additive theory, we say that a measure $\mu \in \mathcal{M}$ is an equilibrium
measure for the almost additive sequence $\Phi$ (with respect to f on $\Lambda$) if it attains
any of the maxima in (23) (and thus both maxima), that is, if
\[ P_\Lambda(\Phi) = h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu. \]
The existence of equilibrium measures is thus an immediate consequence of
Theorem 4.1.

Theorem 4.2 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map. Then any almost additive
sequence of continuous functions on $\Lambda$ with tempered variation has at least one
equilibrium measure.

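We also say that a probability measure $\mu$ in $\Lambda$ (which need not be f-invariant) is
a Gibbs measure for the sequence $\Phi$ (with respect to f on $\Lambda$, and to a given Markov
partition of $\Lambda$) if there exists a constant $K > 0$ such that
\[ K^{-1} \le \frac{\mu(\Delta_{i_1 \cdots i_n})}{\exp[-n P_\Lambda(\Phi) + \varphi_n(x)]} \le K \]
for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$. It turns out, as in the classical
additive theory, that invariant Gibbs measures are always equilibrium measures.
The argument is simple. We first note that if $\mu$ is an f-invariant Gibbs measure,
then the limit
\[ h_\mu(x) := \lim_{n\to\infty} -\frac{1}{n} \log \mu(\Delta_{i_1 \cdots i_n}) = P_\Lambda(\Phi) - \lim_{n\to\infty} \frac{\varphi_n(x)}{n} \tag{25} \]
exists for $\mu$-almost every $x \in \Lambda$ (by Theorem 4.1 the second limit in (25) exists in
$L^1(\Lambda, \mu)$, and thus it also exists for $\mu$-almost every $x \in \Lambda$). By the Shannon-McMillan-Breiman
theorem we obtain
\[ h_\mu(f) = \int_\Lambda h_\mu(x) \, d\mu(x) = P_\Lambda(\Phi) - \int_\Lambda \lim_{n\to\infty} \frac{\varphi_n(x)}{n} \, d\mu(x), \]
and hence $\mu$ is an equilibrium measure.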
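
The Gibbs inequality can be checked by hand in a symbolic example. The sketch below assumes a two-symbol topological Markov chain with a hypothetical symmetric transition matrix, takes the zero sequence (so the pressure is the logarithm of the spectral radius `lam`), and uses the Parry measure, written in its standard closed form on cylinders. Cylinders play the role of the sets in (17); symbols are indexed from 0.

```python
import math
from itertools import product

lam = (1 + math.sqrt(5)) / 2            # spectral radius of A
A = [[1, 1], [1, 0]]
u = [lam, 1.0]                          # left and right eigenvector (A is symmetric)
norm = lam * lam + 1.0                  # sum of u[i] * u[i]

def parry(word):
    """Parry measure of the cylinder given by a word (0 if not admissible)."""
    if any(A[a][b] == 0 for a, b in zip(word, word[1:])):
        return 0.0
    return u[word[0]] * u[word[-1]] * lam ** (-(len(word) - 1)) / norm

def gibbs_ratios(n):
    """mu(cylinder) / exp(-n P) with P = log(lam), over admissible words of length n."""
    return [parry(w) * lam ** n for w in product((0, 1), repeat=n) if parry(w) > 0]

for n in (1, 5, 10):
    r = gibbs_ratios(n)
    print(n, min(r), max(r))
```

In this example the ratios take only three values, all between 1/sqrt(5) and sqrt(5), so the Gibbs property holds with K = sqrt(5).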


To formulate the following result we need to consider the stronger notion of
bounded variation. We say that the sequence of functions $\Phi = (\varphi_n)_n$ has bounded
variation if $\sup_{n \in \mathbb{N}} \gamma_n(\Phi) < \infty$ (see (19) for the definition of $\gamma_n(\Phi)$). For example,
one can easily verify that if $\Phi$ is the additive sequence $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$ for some
Hölder continuous $\varphi$ in a repeller, then $\Phi$ has bounded variation. Clearly, if $\Phi$ has
bounded variation, then it has tempered variation.
The following statement says in particular that for each almost additive sequence
with bounded variation there exists a unique equilibrium measure.

Theorem 4.3 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost
additive sequence of continuous functions on $\Lambda$ with bounded variation. Then:
(1) there is a unique equilibrium measure for $\Phi$;
(2) there is a unique invariant Gibbs measure for $\Phi$;
(3) the two measures coincide and are mixing.
In particular, the unique equilibrium measure for an almost additive sequence
with bounded variation is an invariant Gibbs measure.
We refer to [34] for some results related to those in this section, although using
a different notion of equilibrium measure.

4.3. Characterizations of unique equilibrium measures


The unique equilibrium measure in Theorem 4.3 can be characterized as follows.
We denote by $\delta_x$ the probability measure with $\delta_x(\{x\}) = 1$.
Theorem 4.4 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an
almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then
the unique equilibrium measure for $\Phi$ is the weak limit of the sequence of invariant
probability measures
\[ \mu_n = \sum_{x \in \operatorname{Fix}(f^n)} e^{\varphi_n(x)} \delta_x \Bigg/ \sum_{x \in \operatorname{Fix}(f^n)} e^{\varphi_n(x)}. \tag{26} \]

Now we present another characterization of the unique equilibrium measures.
Given a sequence of continuous functions $\Phi = (\varphi_n)_n$ with bounded variation, we set
\[ a_{i_1 \cdots i_n} = \max\{\exp \varphi_n(y) : y \in \Delta_{i_1 \cdots i_n}\}, \]
with the convention that $a_{i_1 \cdots i_n} = 0$ if $\Delta_{i_1 \cdots i_n} = \emptyset$. We also set
\[ \alpha_n = \sum_{i_1 \cdots i_n} a_{i_1 \cdots i_n}. \]
We define a probability measure $\nu_n$ in the algebra generated by the sets $\Delta_{i_1 \cdots i_n}$ by
\[ \nu_n(\Delta_{i_1 \cdots i_n}) = a_{i_1 \cdots i_n} / \alpha_n \]
for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and we extend it arbitrarily to the Borel $\sigma$-algebra of $\Lambda$.
Since $\Lambda$ is compact, the family of probability measures in $\Lambda$ is compact in the
weak* topology, and hence, there exists a subsequence $(\nu_{n_k})_k$ converging to some
probability measure $\nu$ in the weak* topology. A priori the accumulation point $\nu$
need not be unique. We denote the set of all accumulation points of the sequence
$(\nu_n)_n$ by $M(\Phi)$. As explained above, $M(\Phi) \ne \emptyset$. The following statement shows
that all accumulation points are Gibbs measures.
Theorem 4.5 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost
additive sequence of continuous functions on $\Lambda$ with bounded variation. Then each
measure in $M(\Phi)$ is an ergodic Gibbs measure for $\Phi$.

Moreover, the following is a characterization of the unique invariant Gibbs
measure.
Theorem 4.6 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost
additive sequence of continuous functions on $\Lambda$ with bounded variation. Then the
unique invariant Gibbs measure for $\Phi$ is the unique invariant measure in $M(\Phi)$.
When $\Phi$ is an almost additive sequence of continuous functions in $\Lambda$ with tempered
variation (but not necessarily with bounded variation), we can still show that
there exist an ergodic probability measure $\nu$ in $\Lambda$, a constant $K > 0$, and a positive
sequence $(\rho_n)_n$ decreasing to 0, such that
\[ K^{-1} e^{-n \rho_n} \le \frac{\nu(\Delta_{i_1 \cdots i_n})}{\exp[-n P_\Lambda(\Phi) + \varphi_n(x)]} \le K e^{n \rho_n} \tag{27} \]
for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$. We emphasize that the measure
$\nu$ need not be invariant. Furthermore, in general it may not be possible to obtain an
invariant measure through an averaging procedure, due to the extra small exponentials
in (27). On the other hand, it is still reasonable to call the measure $\nu$ in (27)
a weak Gibbs measure for $\Phi$, as proposed by Yuri in [48].

5. Results for Hyperbolic Sets


We consider in this section the case of functions defined in a hyperbolic set, and we
formulate corresponding results to those in Sec. 4 for functions defined in a repeller.

5.1. Hyperbolic sets and Markov partitions


Let $f\colon M \to M$ be a diffeomorphism of a smooth manifold M, and let $\Lambda \subset M$ be
a compact f-invariant set. We say that $\Lambda$ is a hyperbolic set for f if for every point
$x \in \Lambda$ there exists a decomposition of the tangent space
\[ T_x M = E^s(x) \oplus E^u(x) \]
such that
\[ d_x f E^s(x) = E^s(f(x)) \quad \text{and} \quad d_x f E^u(x) = E^u(f(x)), \]
and there exist constants $\lambda \in (0, 1)$ and $c > 0$ such that
\[ \|d_x f^n | E^s(x)\| \le c \lambda^n \quad \text{and} \quad \|d_x f^{-n} | E^u(x)\| \le c \lambda^n \]
for every $x \in \Lambda$ and $n \in \mathbb{N}$. In addition, we always assume in this presentation that
there is an open set $U \supset \Lambda$ such that
\[ \Lambda = \bigcap_{n \in \mathbb{Z}} f^n U, \tag{28} \]
and that f is topologically mixing on $\Lambda$. Given $\varepsilon > 0$ sufficiently small, for each
$x \in \Lambda$ the local stable and unstable manifolds (of size $\varepsilon$) are given by
\[ V^s(x) = \{y \in M : d(f^n(y), f^n(x)) < \varepsilon \text{ for every } n \ge 0\} \]
and
\[ V^u(x) = \{y \in M : d(f^n(y), f^n(x)) < \varepsilon \text{ for every } n \le 0\}, \]
where d is the distance on M.
Now we briefly recall the notion of Markov partition for a hyperbolic set. A collection
of closed sets $R_1, \ldots, R_p$ with sufficiently small diameter (given by (14))
is called a Markov partition of $\Lambda$ if:
(1) $\Lambda = \bigcup_{i=1}^p R_i$, and $\overline{\operatorname{int} R_i} = R_i$ for $i = 1, \ldots, p$;
(2) $V^s(x) \cap V^u(y) \subset R_i$ and $\operatorname{card}(V^s(x) \cap V^u(y)) = 1$ for $x, y \in R_i$;
(3) $\operatorname{int} R_i \cap \operatorname{int} R_j = \emptyset$ whenever $i \ne j$;
(4) if $x \in f(\operatorname{int} R_i) \cap \operatorname{int} R_j$, then
\[ f^{-1}(V^u(f(x)) \cap R_j) \subset V^u(x) \cap R_i \]
and
\[ f(V^s(x) \cap R_i) \subset V^s(f(x)) \cap R_j. \]
The interior of each set $R_i$ is computed with respect to the induced topology on $\Lambda$.
Any hyperbolic set satisfying (28) has Markov partitions with arbitrarily small
diameter (see, for example, [14]).
Given a Markov partition $R_1, \ldots, R_p$ of a hyperbolic set $\Lambda$, we define as in the
case of repellers a $p \times p$ matrix $A = (a_{ij})$ with entries given by (15), and we consider
the corresponding two-sided topological Markov chain defined by the shift map on
the set
\[ \Sigma_A = \{(\cdots i_1 i_2 \cdots) \in \{1, \ldots, p\}^{\mathbb{Z}} : a_{i_k i_{k+1}} = 1 \text{ for every } k \in \mathbb{Z}\}. \tag{29} \]
We continue to denote by $\Sigma_{A,n}$ the set of n-tuples $(i_1 \cdots i_n)$ for which there is
a sequence $(\cdots j_0 j_1 j_2 \cdots) \in \Sigma_A$ such that $i_\ell = j_\ell$ for $\ell = 1, \ldots, n$. For each
$(i_1 \cdots i_n) \in \Sigma_{A,n}$ we consider again the sets $\Delta_{i_1 \cdots i_n}$ defined by (17).

5.2. Formulation of the results


Repeating arguments in the proofs of Theorems 3.1 and 3.2 we obtain the following
statement, thus providing formulas for the topological pressure of an almost additive
sequence.
Theorem 5.1 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost
additive sequence of continuous functions on $\Lambda$ with tempered variation. Then
identities (20) and (22) hold for any points $x_{i_1 \cdots i_n} \in \Delta_{i_1 \cdots i_n}$, for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$
and $n \in \mathbb{N}$.
We also formulate corresponding versions of Theorems 4.1 and 4.3.
Theorem 5.2 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost
additive sequence of continuous functions on $\Lambda$ with tempered variation. Then (23)
holds, including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the
second limit.
In particular, this shows that the sequence $\Phi$ has at least one equilibrium
measure.
Theorem 5.3 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost
additive sequence of continuous functions on $\Lambda$ with bounded variation. Then:
(1) there is a unique equilibrium measure for $\Phi$;
(2) there is a unique invariant Gibbs measure for $\Phi$;
(3) the two measures are equal, are mixing, and coincide with the weak limit of the
sequence of invariant probability measures $\mu_n$ in (26).

6. Further Generalizations
Some of the former results for repellers and hyperbolic sets can be generalized to
more general classes of dynamics. We first present a variational principle for the
topological pressure.
Theorem 6.1 ([3]). Let f be a continuous map in a compact metric space $\Lambda$, and
let $\Phi$ be an almost additive sequence of continuous functions in $\Lambda$ satisfying (6).
Then
\[ P_\Lambda(\Phi) = \sup_{\mu \in \mathcal{M}} \left( h_\mu(f) + \int_\Lambda \lim_{n\to\infty} \frac{\varphi_n(x)}{n} \, d\mu(x) \right)
 = \sup_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right), \]
including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second
limit.
We also formulate a criterion for the existence of equilibrium measures.
Theorem 6.2 ([3]). Let f be a continuous map in a compact metric space $\Lambda$ such
that $\mathcal{M} \ni \mu \mapsto h_\mu(f)$ is upper semi-continuous, and let $\Phi$ be an almost additive
sequence of continuous functions on $\Lambda$ satisfying (6). Then there exists an equilibrium
measure for $\Phi$.
For example, if f is an expansive continuous map in $\Lambda$, then the entropy is
upper semi-continuous, and hence each almost additive sequence has an equilibrium
measure. We recall that f is said to be expansive if there exists $\varepsilon > 0$ such
that if
\[ d(f^n(x), f^n(y)) < \varepsilon \quad \text{for every } n \in \mathbb{N}, \]
then $x = y$ (when f is invertible we replace $\mathbb{N}$ by $\mathbb{Z}$). For example, when f is a one-sided
or two-sided topologically mixing topological Markov chain, the entropy is
upper semi-continuous. Incidentally, all these transformations satisfy specification.
On the other hand, there are plenty of transformations not satisfying specification
for which the entropy is still upper semi-continuous. For example, all $\beta$-shifts are
expansive, and thus the entropy is upper semi-continuous (see [31] for details), but
for $\beta$ in a residual set of full Lebesgue measure (although the complement has
full Hausdorff dimension) the corresponding $\beta$-shift does not satisfy specification
(see [43]).
Finally, we describe some regularity properties of the topological pressure. We
denote by $A(\Lambda)$ the family of almost additive sequences of continuous functions
satisfying (6). Let also $E(\Lambda) \subset A(\Lambda)$ be the family of sequences with a unique
equilibrium measure.
Theorem 6.3 ([5]). Let f be a continuous map in a compact metric space $\Lambda$ such
that $\mathcal{M} \ni \mu \mapsto h_\mu(f)$ is upper semi-continuous. Then:

(1) given $\Phi \in A(\Lambda)$, the function $t \mapsto P_\Lambda(\Phi + t\Psi)$ is differentiable at $t = 0$ for every
$\Psi \in A(\Lambda)$ if and only if $\Phi \in E(\Lambda)$; in this case the unique equilibrium measure
$\mu_\Phi$ of $\Phi$ is ergodic, and
\[ \frac{d}{dt} P_\Lambda(\Phi + t\Psi)\Big|_{t=0} = \lim_{n\to\infty} \int_\Lambda \frac{\psi_n}{n} \, d\mu_\Phi; \tag{30} \]
(2) for each open set $U \subset \mathbb{R}$, if $\Phi + t\Psi \in E(\Lambda)$ for every $t \in U$, then the function
$t \mapsto P_\Lambda(\Phi + t\Psi)$ is of class $C^1$ in U.

The proof of Theorem 6.3 partially follows arguments in [31].

7. Application I: Nonconformal Repellers


We describe in this section a class of nonconformal repellers considered by Barreira
and Gelfert in [7] to which one can apply the results in Sec. 4, in connection with
the study of Lyapunov exponents of nonconformal transformations.

7.1. Cone condition and bounded distortion


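To describe the class of repellers under consideration, we first introduce what we
call a cone condition.
Given a number $\gamma \le 1$ and a 1-dimensional subspace $E(x) \subset \mathbb{R}^2$, we consider
the cone
\[ C_\gamma(x) = \{(u, v) \in E(x) \oplus E(x)^\perp : \|v\| \le \gamma \|u\|\}. \]
We say that a differentiable map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ satisfies a cone condition on a set
$\Lambda \subset \mathbb{R}^2$ if there exist $\gamma \le 1$ and for each $x \in \Lambda$ a 1-dimensional subspace $E(x) \subset \mathbb{R}^2$
varying continuously with x such that
\[ (d_x f) C_\gamma(x) \subset \{0\} \cup \operatorname{int} C_\gamma(f x). \tag{31} \]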



Following [7], we present several examples of maps satisfying a cone condition.

Example 7.1. Assume that for each $x \in \Lambda$ the derivative $d_x f$ is represented by
a positive $2 \times 2$ matrix. Then the first quadrant Q is invariant under these linear
transformations, that is, $(d_x f) Q \subset Q$ for each $x \in \Lambda$. Therefore, the map f satisfies
the cone condition in (31) with $\gamma = 1$, taking for E(x) the 1-dimensional subspace
making an angle of $\pi/4$ with the horizontal direction.
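
Example 7.1 can be checked numerically for a single matrix. The following Python sketch assumes a hypothetical positive matrix `M` standing in for a derivative, uses the diagonal direction for E(x), and tests on a grid of vectors that the closed cone of opening 1 (the first and third quadrants) is mapped into the open cone. It is only a sanity check on sample vectors, not a proof.

```python
import math

def cone_coords(w):
    """Coordinates (u, v) of w along E = span{(1, 1)} and its orthogonal complement."""
    x, y = w
    return (x + y) / math.sqrt(2), (y - x) / math.sqrt(2)

def in_cone(w, gamma=1.0):
    """Membership in the closed cone |v| <= gamma |u|."""
    u, v = cone_coords(w)
    return abs(v) <= gamma * abs(u)

def in_open_cone(w, gamma=1.0):
    """Membership in the interior of the cone, |v| < gamma |u|."""
    u, v = cone_coords(w)
    return abs(v) < gamma * abs(u)

def apply(M, w):
    """Image of the vector w under the 2x2 matrix M."""
    return (M[0][0] * w[0] + M[0][1] * w[1], M[1][0] * w[0] + M[1][1] * w[1])

M = ((2.0, 1.0), (1.0, 1.0))            # hypothetical positive derivative matrix

# Nonzero vectors of the closed first quadrant, on a coarse grid.
pts = [(i / 10, j / 10) for i in range(11) for j in range(11) if (i, j) != (0, 0)]
print(all(in_cone(w) for w in pts), all(in_open_cone(apply(M, w)) for w in pts))
```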

This example is related to work in [26] (see also [22]).


Another class of examples corresponds to the existence of a strongly unstable
foliation.

Example 7.2. Let $\Lambda$ be a locally maximal repeller in the sense that in some open
neighborhood U the repeller is the only invariant set. In this case f 1 U = .
Assume that there exists a strongly unstable foliation of the set U, that is, a foliation
by 1-dimensional $C^2$ leaves $V(x)$ such that:

(1) $f(V(x)) \supset V(f x)$ for every $x \in U \cap f^{-1} U$;
(2) there exist constants $c > 0$ and $\lambda \in (0, 1)$ such that
\[ \frac{|\det d_x f^n|}{\|d_x f^n | T_x V(x)\|^2} \le c \lambda^n \quad \text{for all } x \in \bigcap_{i=0}^{n} f^{-i} U \text{ and } n \in \mathbb{N}. \]

It is shown by Hu in [29] that this assumption is equivalent to:

(1) for some choice of subspaces E(x) varying continuously with x, the cone
condition in (31) holds for every $x \in U \cap f^{-1} U$;
(2) there exist 1-dimensional subspaces $F(x) \subset \{0\} \cup \operatorname{int} C_\gamma(x)$ for each $x \in U \cap
f^{-1} U$ such that $d_x f F(x) = F(f x)$.

Thus, repellers with a strongly unstable foliation satisfy a cone condition.

Notice that the cone condition in (31) is weaker than assuming the existence
of a strongly unstable foliation. In particular, (31) does not ensure the existence
of an invariant distribution F (x) as in Example 7.2. On the other hand, when
there exists a strongly unstable foliation, the invariant distribution F (x) is given
by (see [29])
 
F (x) = dy f n C (y).
nN yf n x

It is thus independent of the particular preimages xn f n x, that is,



F (x) = dxn f n C (xn ).
nN

We can also consider repellers with a dominated splitting.



Example 7.3. We say that the repeller $\Lambda$ possesses a dominated splitting if there
exists a decomposition $T_\Lambda \mathbb{R}^2 = E \oplus F$ such that:
(1) $d_x f E(x) = E(f x)$ and $d_x f F(x) = F(f x)$ for every $x \in \Lambda$;
(2) there exist constants $c > 0$ and $\lambda \in (0, 1)$ such that
\[ \|d_x f^n | E\| \cdot \|(d_x f)^{-n} | F\| \le c \lambda^n \quad \text{for all } x \in \Lambda \text{ and } n \in \mathbb{N}. \]
It follows easily from the definition that the subspaces E(x) and F(x) vary
continuously with x. Furthermore, one can verify that when there exists a dominated
splitting of $\Lambda$, the map f satisfies a cone condition on $\Lambda$.
We note that the existence of a strongly unstable foliation does not ensure
the existence of a dominated splitting, due to the requirement of a df-invariant
decomposition $E \oplus F$ (more precisely, the existence of a strongly unstable foliation
only ensures the existence of the invariant distribution F in Example 7.2).
Now we consider certain almost additive sequences of functions obtained from
the singular values of a $2 \times 2$ matrix A, namely
\[ \sigma_1(A) = \|A\| \quad \text{and} \quad \sigma_2(A) = \|A^{-1}\|^{-1} \]
(with respect to the 2-norm in $\mathbb{R}^2$). Given a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$, we define
sequences of functions $\Phi_i = (\varphi_{i,n})_n$ for $i = 1, 2$ by
\[ \varphi_{i,n}(x) = \log \sigma_i(d_x f^n) \tag{32} \]
for each $n \in \mathbb{N}$ and $i = 1, 2$. Clearly, the functions $\varphi_{i,n}$ are continuous. These
sequences are related to the Lyapunov exponents of the map f (see Sec. 7.2). We
first present a criterion for almost additivity.
Proposition 7.4 ([7]). Let $\Lambda$ be a repeller of a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$. If f satisfies
a cone condition on $\Lambda$, then $\Phi_i$ is almost additive for $i = 1, 2$.
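
The functions in (32) are easy to compute for a linear map, where the derivative of the n-th iterate is the n-th power of a fixed matrix at every point. The Python sketch below makes that simplifying assumption; the matrix `A` is a hypothetical positive, symmetric, expanding example, and the closed formula for the singular values of a 2x2 matrix is the standard one in terms of the trace of A^T A and the determinant.

```python
import math

def singular_values(A):
    """Singular values s1 >= s2 of a 2x2 matrix A, so s1 = ||A|| and, for
    invertible A, s2 = 1 / ||A^{-1}|| (2-norm)."""
    (a, b), (c, d) = A
    S = a * a + b * b + c * c + d * d     # trace of A^T A
    D = a * d - b * c                     # determinant of A
    s1 = math.sqrt((S + math.sqrt(max(0, S * S - 4 * D * D))) / 2)
    s2 = abs(D) / s1                      # uses s1 * s2 = |det A|
    return s1, s2

def power(A, n):
    """n-th power of a 2x2 matrix stored as nested tuples."""
    P = ((1, 0), (0, 1))
    for _ in range(n):
        P = tuple(tuple(sum(P[i][k] * A[k][j] for k in range(2)) for j in range(2))
                  for i in range(2))
    return P

def phi(A, n):
    """(phi_{1,n}, phi_{2,n}) for the linear map x -> A x."""
    s1, s2 = singular_values(power(A, n))
    return math.log(s1), math.log(s2)

A = ((3, 1), (1, 2))                      # hypothetical expanding matrix
p1, p2 = phi(A, 20)
print(p1 / 20, p2 / 20)
```

Since this matrix is symmetric, the two quotients coincide with the logarithms of its eigenvalues (5 + sqrt(5))/2 and (5 - sqrt(5))/2, which are the Lyapunov exponents of the linear map.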
For a map f as in Proposition 7.4, we consider a number $\delta > 0$ such that for
every $x \in \Lambda$ the map is invertible on the ball $B(x, \delta)$ (simply take a Lebesgue
number of a cover by balls with the property that f is invertible on each of them).
For each $x \in \Lambda$ and $n \in \mathbb{N}$ we define
\[ B_n(x, \delta) = \bigcap_{\ell=0}^{n-1} f^{-\ell} B(f^\ell x, \delta). \]
We always assume that the diameter of the Markov partition used to define the sets
$\Delta_{i_1 \cdots i_n}$ in (17) is at most $\delta/2$ (we recall that any repeller has Markov partitions of
arbitrarily small diameter). This ensures that
\[ \Delta_{i_1 \cdots i_n} \subset B_n(x, \delta) \]
for every $x = \chi(i_1 i_2 \cdots)$ and $n \in \mathbb{N}$. We say that f has bounded distortion on $\Lambda$
if there exists $\delta > 0$ such that
\[ \sup\{\|d_y f^n (d_z f^n)^{-1}\| : x \in \Lambda \text{ and } y, z \in B_n(x, \delta)\} < \infty. \]
Now we give a condition for bounded distortion in the case of $C^{1+\alpha}$ transformations.
Given $\alpha > 0$, we say that f is $\alpha$-bunched on $\Lambda$ if
\[ \|(d_x f)^{-1}\|^{1+\alpha} \|d_x f\| < 1 \]
for every $x \in \Lambda$ (this notion was introduced in [1] in the context of dimension
theory of nonconformal transformations). The following statement is an immediate
consequence of the proof of [2, Theorem 4].

Proposition 7.5. Let $\Lambda$ be a repeller of a $C^{1+\alpha}$ map $f\colon M \to M$. If f is $\alpha$-bunched
on $\Lambda$, then f has bounded distortion on $\Lambda$.
Now we consider the sequences $\Phi_i$ for $i = 1, 2$ introduced in (32) and we present
a criterion for bounded variation.
Proposition 7.6 ([7, Proposition 1]). Let $\Lambda$ be a repeller of a $C^1$ transformation
$f\colon \mathbb{R}^2 \to \mathbb{R}^2$. If f has bounded distortion on $\Lambda$, then $\Phi_i$ has bounded variation for
$i = 1, 2$.

7.2. Variational principle and Gibbs measures


It follows from Propositions 7.4 and 7.6 that if a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ satisfies a
cone condition on $\Lambda$ and has bounded distortion on $\Lambda$, then $\Phi_i$ is an almost additive
sequence with bounded variation for $i = 1, 2$. This allows us to apply the results in
Sec. 4 to recover the corresponding statements of Barreira and Gelfert in [7].
To explain the relation between the sequences $\Phi_i$ and the theory of Lyapunov
exponents, we first recall some basic notions. Given a differentiable transformation
$f\colon M \to M$ (which is not necessarily invertible), for each $x \in M$ and $v \in T_x M$ we
define the Lyapunov exponent of (x, v) by
\[ \chi(x, v) = \limsup_{n\to+\infty} \frac{1}{n} \log \|d_x f^n v\|, \tag{33} \]
with the convention that $\log 0 = -\infty$. It follows from the abstract theory of
Lyapunov exponents (see [8] for full details) that for each $x \in M$ there exist
a positive integer $s(x) \le \dim M$, numbers $\chi_1(x) < \cdots < \chi_{s(x)}(x)$, and linear
subspaces
\[ \{0\} = E_0(x) \subset E_1(x) \subset \cdots \subset E_{s(x)}(x) = T_x M \]
such that for $i = 1, \ldots, s(x)$ we have
\[ E_i(x) = \{v \in T_x M : \chi(x, v) \le \chi_i(x)\}, \]
and $\chi(x, v) = \chi_i(x)$ whenever $v \in E_i(x) \setminus E_{i-1}(x)$. It follows from Oseledets'
multiplicative ergodic theorem (see, for example, [8]), or more precisely from its version
for noninvertible transformations, that for each finite f-invariant measure $\mu$ in M
there is a set $X \subset M$ of full $\mu$-measure such that if $x \in X$, then
\[ \lim_{n\to+\infty} \frac{1}{n} \log \|d_x f^n v\| = \chi_i(x) \]
for every $v \in E_i(x) \setminus E_{i-1}(x)$ and $i = 1, \ldots, s(x)$, with uniform convergence in v on
each subspace $F \subset E_i(x)$ such that $F \cap E_{i-1}(x) = \{0\}$ (in particular, the lim sup
in (33) is now a limit).
For $M = \mathbb{R}^2$ and each $x \in \mathbb{R}^2$, when $s(x) = 1$ we set
\[ \lambda_1(x) = \chi_1(x) \quad \text{and} \quad \lambda_2(x) = \chi_1(x), \]
and when $s(x) = 2$ we set
\[ \lambda_1(x) = \chi_1(x) \quad \text{and} \quad \lambda_2(x) = \chi_2(x). \]
The numbers $\lambda_1(x)$ and $\lambda_2(x)$ are the values of the Lyapunov exponent $v \mapsto \chi(x, v)$
counted with multiplicities. It follows again from Oseledets' multiplicative ergodic
theorem that for each finite f-invariant measure $\mu$ in $\mathbb{R}^2$ there is a set $X \subset \mathbb{R}^2$ of full
$\mu$-measure such that
\[ \lim_{n\to+\infty} \frac{\varphi_{i,n}(x)}{n} = \lim_{n\to+\infty} \frac{1}{n} \log \sigma_i(d_x f^n) = \lambda_i(x) \]
for each $x \in X$ and $i = 1, 2$ (see (32)). Combining these observations with the
criteria in Propositions 7.4 and 7.6, we readily obtain the following statement of
Barreira and Gelfert by applying the results in Sec. 4.

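Theorem 7.7 ([7]). Let $\Lambda$ be a repeller of a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$. If f satisfies
a cone condition on $\Lambda$, and f has bounded distortion on $\Lambda$, then for $i = 1, 2$ the
following properties hold:

(1) the topological pressure satisfies the variational principle
\[ P_\Lambda(\Phi_i) = \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \int_\Lambda \lambda_i(x) \, d\mu(x) \right)
 = \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \log \sigma_i(d_x f^n) \, d\mu(x) \right); \]
(2) there is a unique equilibrium measure $\mu_i$ for $\Phi_i$, and this is the unique invariant
Gibbs measure for $\Phi_i$;
(3) there is a constant $K > 0$ such that
\[ K^{-1} \le \frac{\mu_i(\Delta_{i_1 \cdots i_n})}{\exp[-n P_\Lambda(\Phi_i)] \, \sigma_i(d_x f^n)} \le K \]
for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$;
(4) the measure $\mu_i$ is mixing, and
\[ \sum_{x \in \operatorname{Fix}(f^n)} \sigma_i(d_x f^n) \delta_x \Bigg/ \sum_{x \in \operatorname{Fix}(f^n)} \sigma_i(d_x f^n) \to \mu_i \quad \text{as } n \to \infty. \]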

8. Application II: Multifractal Analysis


We describe in this section a conditional variational principle for the u-dimension
spectrum established by Barreira and Doutor in [5]. This contains as a particular
case a conditional variational principle for the entropy spectrum (see Theorem 8.3
below). For simplicity of the exposition, we do not consider the multidimensional
case in [5] but only the case of a single ratio of almost additive functions. We
emphasize that this is already a nontrivial result when compared to the existing
results in the classical case of additive sequences.

8.1. Notion of u-dimension


We recall in this section the notion of u-dimension introduced by Barreira and
Schmeling in [10]. Let f : X X be a continuous transformation of a compact
metric space, and let U be a nite open cover of X. Let also u : X R+ be
a continuous function. Given a set Z X and a number R, we dene the
function

N (Z, , u, U) = lim inf exp(u(U )),
n
U

where u(U ) is dened as in (8), and where the inmum is taken over all nite or
 
countable collections kn Wk (U) such that u X(U ) Z. Setting
dimu,U Z = inf{ R : N (Z, , u, U) = 0},
one can show that the limit
dimu Z = lim dimu,U Z
diam U0

exists. The number $\dim_u Z$ is called the u-dimension of the set $Z$ (with respect to $f$).
For example, if $u = 1$, then $\dim_u Z$ is equal to the topological entropy $h(f|Z)$ of $f$
on $Z$ (see Sec. 2).
The following result is an easy consequence of the definitions.

Proposition 8.1. The number $\dim_u Z = \alpha$ is the unique root of the equation
$P_Z(-\alpha U) = 0$, where $U = (u_n)_n$ with $u_n = \sum_{k=0}^{n-1} u \circ f^k$ for each $n \in \mathbb{N}$.
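
As an illustration of Proposition 8.1 in the simplest setting, the following sketch assumes (this is not part of the text) that $Z = X$ is a full shift on finitely many symbols and that $u$ depends only on the first symbol, with values $u_i > 0$. Under that assumption the classical pressure of $-\alpha u$ equals $\log \sum_i e^{-\alpha u_i}$, and the u-dimension is the root of this expression in $\alpha$. The function names are hypothetical.

```python
import math


def pressure_locally_constant(alpha, u_values):
    """Pressure of -alpha*u on a full shift when u depends only on the first symbol."""
    return math.log(sum(math.exp(-alpha * u) for u in u_values))


def u_dimension_full_shift(u_values, tol=1e-12):
    """Root alpha of P(-alpha*u) = 0, found by bisection (requires all u_i > 0)."""
    if any(u <= 0 for u in u_values):
        raise ValueError("all values of u must be positive")
    lo, hi = 0.0, 1.0
    # The pressure is strictly decreasing in alpha; enlarge hi until it is negative.
    while pressure_locally_constant(hi, u_values) > 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure_locally_constant(mid, u_values) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # u = 1 on two symbols: the root is the topological entropy of the shift.
    print(u_dimension_full_shift([1.0, 1.0]))
    # u = log 3 on two symbols: the root is log 2 / log 3.
    print(u_dimension_full_shift([math.log(3.0), math.log(3.0)]))
```

With $u = 1$ the root reduces to the topological entropy, in agreement with the remark preceding the proposition.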

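Furthermore, given a probability measure $\mu$ in $X$, we set
\[
\dim_{u,\mathcal{U}} \mu = \inf\{\dim_{u,\mathcal{U}} Z : \mu(Z) = 1\}.
\]
One can show that the limit
\[
\dim_u \mu = \lim_{\operatorname{diam} \mathcal{U} \to 0} \dim_{u,\mathcal{U}} \mu
\]
exists, and we call it the u-dimension of $\mu$. Moreover, the lower and upper
u-pointwise dimensions of $\mu$ at the point $x \in X$ are defined by
\[
\underline{d}_{\mu,u}(x) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \liminf_{n\to\infty} \inf_{U} -\frac{\log \mu(X(U))}{u(U)}
\]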
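and
\[
\overline{d}_{\mu,u}(x) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \limsup_{n\to\infty} \sup_{U} -\frac{\log \mu(X(U))}{u(U)},
\]
where the infimum and supremum are taken over all vectors $U \in W_n(\mathcal{U})$ such that
$x \in X(U)$. If $\mu \in \mathcal{M}$ is ergodic, then
\[
\dim_u \mu = \underline{d}_{\mu,u}(x) = \overline{d}_{\mu,u}(x) = \frac{h_\mu(f)}{\int_X u \, d\mu}
\]
for $\mu$-almost every $x \in X$ (see [10]).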

8.2. Conditional variational principle


We formulate in this section a conditional variational principle for the u-dimension
of sets dened in terms of ratios of almost additive sequences. This corresponds to a
multifractal analysis of the level sets of limits of ratios of almost additive sequences.
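We continue to consider a continuous map $f\colon X \to X$ of a compact metric
space. Let $\Phi = (\varphi_n)_n$ and $\Psi = (\psi_n)_n$ be almost additive sequences of functions
in $X$. We assume that
\[
\liminf_{m\to\infty} \frac{\psi_m(x)}{m} > 0 \quad\text{and}\quad \psi_n(x) > 0
\]
for every $x \in X$ and $n \in \mathbb{N}$. Given $\alpha \in \mathbb{R}$ we define
\[
K_\alpha = \left\{ x \in X : \lim_{n\to\infty} \frac{\varphi_n(x)}{\psi_n(x)} = \alpha \right\}. \tag{34}
\]
The function $F_u\colon \mathbb{R} \to \mathbb{R}$ defined by
\[
F_u(\alpha) = \dim_u K_\alpha
\]
is called the u-dimension spectrum of the pair $(\Phi, \Psi)$ (with respect to $f$). We also
consider the function $\mathcal{P}\colon \mathcal{M} \to \mathbb{R}$ defined by
\[
\mathcal{P}(\mu) = \lim_{n\to\infty} \frac{\int_X \varphi_n \, d\mu}{\int_X \psi_n \, d\mu}.
\]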

The following is a conditional variational principle for the spectrum $F_u$. We
consider the (additive) sequence of functions $U = (u_n)_n$ with $u_n = \sum_{k=0}^{n-1} u \circ f^k$ for
each $n \in \mathbb{N}$. We recall that $E(X)$ denotes the family of almost additive sequences
satisfying (6) with a unique equilibrium measure.
Theorem 8.2 ([5]). Let $f$ be a continuous map of a compact metric space $X$ such
that $\mu \mapsto h_\mu(f)$ is upper semi-continuous, and assume that
\[
\operatorname{span}\{\Phi, \Psi, U\} \subset E(X).
\]
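If $\alpha \notin \mathcal{P}(\mathcal{M})$, then $K_\alpha = \emptyset$. Otherwise, if $\alpha \in \operatorname{int} \mathcal{P}(\mathcal{M})$, then $K_\alpha \ne \emptyset$, and the
following properties hold:
(1) $F_u$ satisfies the variational principle
\[
F_u(\alpha) = \max\left\{ \frac{h_\mu(f)}{\int_X u \, d\mu} : \mu \in \mathcal{M} \text{ and } \mathcal{P}(\mu) = \alpha \right\};
\]
(2) we have
\[
F_u(\alpha) = \min\{T_u(\alpha, q) : q \in \mathbb{R}\},
\]
where $T_u(\alpha, q)$ is the unique real number satisfying
\[
P(q(\Phi - \alpha\Psi) - T_u(\alpha, q)U) = 0; \tag{35}
\]
(3) there is an ergodic measure $\mu_\alpha \in \mathcal{M}$ such that $\mathcal{P}(\mu_\alpha) = \alpha$, $\mu_\alpha(K_\alpha) = 1$, and
\[
\dim_u \mu_\alpha = \frac{h_{\mu_\alpha}(f)}{\int_X u \, d\mu_\alpha} = F_u(\alpha).
\]

In addition, the spectrum $F_u$ is continuous in $\operatorname{int} \mathcal{P}(\mathcal{M})$.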


The proof of Theorem 8.2 builds on earlier work of Barreira et al. in [9]. We
note that the number $T_u(\alpha, q)$ is defined implicitly by (35). By Theorem 6.3, the
function
\[
(p, \alpha, q) \mapsto P(q(\Phi - \alpha\Psi) - pU)
\]
is of class $C^1$. By the implicit function theorem, we conclude that $(\alpha, q) \mapsto T_u(\alpha, q)$
is also of class $C^1$ in $\mathbb{R}^2$, since by (30),
\[
\frac{\partial}{\partial p} P(q(\Phi - \alpha\Psi) - pU)\Big|_{(p,q) = (T_u(\alpha,q),\,q)} = -\int_X u \, d\mu_q < 0,
\]
where $\mu_q$ is the unique equilibrium measure of $q(\Phi - \alpha\Psi) - T_u(\alpha, q)U$.


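Now we formulate explicitly a particular case of Theorem 8.2. Let $\Phi = (\varphi_n)_n$
be an almost additive sequence of functions $\varphi_n\colon X \to \mathbb{R}$. Given $\alpha \in \mathbb{R}$, we consider
the level set
\[
K_\alpha = \left\{ x \in X : \lim_{n\to\infty} \frac{\varphi_n(x)}{n} = \alpha \right\}.
\]
The entropy spectrum $E\colon \mathbb{R} \to \mathbb{R}$ (of the sequence $\Phi$) is defined by
\[
E(\alpha) = h(f|K_\alpha),
\]
where $h(f|K_\alpha)$ denotes the topological entropy of $f$ on $K_\alpha$ (see Secs. 2 and 8.1).
We also consider the function $\mathcal{P}\colon \mathcal{M} \to \mathbb{R}$ defined by
\[
\mathcal{P}(\mu) = \lim_{n\to\infty} \frac{1}{n} \int_X \varphi_n \, d\mu.
\]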
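The following statement is a conditional variational principle for the entropy
spectrum $E$. It is an immediate consequence of Theorem 8.2.

Theorem 8.3. Let $f$ be a continuous map of a compact metric space $X$ such that
$\mu \mapsto h_\mu(f)$ is upper semi-continuous, and assume that the almost additive sequence
$\Phi$ has a unique equilibrium measure. If $\alpha \notin \mathcal{P}(\mathcal{M})$, then $K_\alpha = \emptyset$. Otherwise, if
$\alpha \in \operatorname{int} \mathcal{P}(\mathcal{M})$, then $K_\alpha \ne \emptyset$, and the following properties hold:

(1) $E$ satisfies the variational principle
\[
E(\alpha) = \max\{h_\mu(f) : \mu \in \mathcal{M} \text{ and } \mathcal{P}(\mu) = \alpha\};
\]
(2) $E(\alpha) = \min\{P(q\Phi) - q\alpha : q \in \mathbb{R}\}$;

(3) there is an ergodic measure $\mu_\alpha \in \mathcal{M}$ such that $\mathcal{P}(\mu_\alpha) = \alpha$, $\mu_\alpha(K_\alpha) = 1$, and
\[
h_{\mu_\alpha}(f) = E(\alpha).
\]

In addition, the spectrum $E$ is continuous in $\operatorname{int} \mathcal{P}(\mathcal{M})$.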
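
The two descriptions of the entropy spectrum in Theorem 8.3 can be compared numerically in a toy case. The sketch below assumes (for illustration only) the full shift on two symbols and the additive sequence generated by a function taking the values $a$ and $b$ on the two first-symbol cylinders, so that $P(q\Phi) = \log(e^{qa} + e^{qb})$. It evaluates the minimum in property (2) and compares it with the entropy of the Bernoulli measure whose mean is $\alpha$, which attains the maximum in property (1) in this special case. All function names are hypothetical.

```python
import math


def pressure(q, a, b):
    """P(q*Phi) = log(e^{qa} + e^{qb}), computed in a numerically stable way."""
    m = max(q * a, q * b)
    return m + math.log(math.exp(q * a - m) + math.exp(q * b - m))


def mean_value(q, a, b):
    """Derivative of the pressure in q, i.e. the mean of the generating function."""
    m = max(q * a, q * b)
    wa, wb = math.exp(q * a - m), math.exp(q * b - m)
    return (a * wa + b * wb) / (wa + wb)


def entropy_spectrum(alpha, a, b, tol=1e-12):
    """E(alpha) = min over q of P(q*Phi) - q*alpha, for alpha strictly between a and b."""
    if not (min(a, b) < alpha < max(a, b)):
        raise ValueError("alpha must lie strictly between a and b")
    lo, hi = -1.0, 1.0
    # The minimizer solves mean_value(q) = alpha; mean_value is increasing in q.
    while mean_value(lo, a, b) > alpha:
        lo *= 2.0
    while mean_value(hi, a, b) < alpha:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_value(mid, a, b) < alpha:
            lo = mid
        else:
            hi = mid
    q = 0.5 * (lo + hi)
    return pressure(q, a, b) - q * alpha


def bernoulli_entropy(p):
    """Entropy of the Bernoulli measure with weights (p, 1 - p)."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


if __name__ == "__main__":
    a, b, alpha = 0.0, 1.0, 0.25
    p = (b - alpha) / (b - a)  # weight of the symbol carrying the value a
    print(entropy_spectrum(alpha, a, b), bernoulli_entropy(p))
```

The bisection is applied to the derivative of the convex function $q \mapsto P(q\Phi) - q\alpha$ rather than to the function itself, which avoids a separate line search.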

Now we consider the associated irregular sets, on which the limits in (34) do not
exist. We consider only the particular case of topological Markov chains. Namely,
let $\Phi$ and $\Psi$ be almost additive sequences in $\Sigma_A$, either as in (16) or as in (29). The
irregular set of the pair $(\Phi, \Psi)$ is defined by
\[
I = \left\{ x \in \Sigma_A : \liminf_{n\to\infty} \frac{\varphi_n(x)}{\psi_n(x)} < \limsup_{n\to\infty} \frac{\varphi_n(x)}{\psi_n(x)} \right\},
\]
and we denote by $m_u$ the equilibrium measure of $u$, when it is unique.

Theorem 8.4 ([5]). Let $\sigma|\Sigma_A$ be a topologically mixing topological Markov
chain. If
\[
\operatorname{span}\{\Phi, \Psi, U\} \subset E(\Sigma_A),
\]
and $\mathcal{P}(m_u) \in \operatorname{int} \mathcal{P}(\mathcal{M})$, then
\[
\dim_u I = \dim_u \Sigma_A.
\]

Theorem 8.4 follows from the application of results in [10] combined with
Theorem 8.2.

9. Application III: Dimension Spectra


Our last application of the almost additive thermodynamic formalism considers
dimension spectra of level sets associated to the limits of ratios of almost addi-
tive sequences. Moreover, we take into account simultaneously limits of ratios of
sequences into the future and into the past.
Let $f\colon M \to M$ be a C 1+ surface diffeomorphism with a hyperbolic set $\Lambda$
satisfying the same hypotheses as in Sec. 5.1. We always assume that
\[
\dim E^s(x) = \dim E^u(x) = 1
\]
for every $x \in \Lambda$. Let $t_s$ and $t_u$ be the unique real numbers such that
\[
P(t_s \log \|df|E^s\|) = P(t_u \log \|df^{-1}|E^u\|) = 0,
\]
where $P$ denotes the (classical) topological pressure with respect to $f$ on $\Lambda$. It was
shown by McCluskey and Manning in [33] that
\[
\dim_H(\Lambda \cap V^s(x)) = t_s \quad\text{and}\quad \dim_H(\Lambda \cap V^u(x)) = t_u
\]
for every $x \in \Lambda$, where $\dim_H$ denotes the Hausdorff dimension. Moreover, it was
shown by Palis and Viana in [35] that
\[
\dim_H(\Lambda \cap V^s(x)) = \overline{\dim}_B(\Lambda \cap V^s(x)), \qquad
\dim_H(\Lambda \cap V^u(x)) = \overline{\dim}_B(\Lambda \cap V^u(x))
\]
for every $x \in \Lambda$, where $\overline{\dim}_B$ denotes the upper box dimension. Since the stable
and unstable distributions have codimension 1, it follows from results of Hasselblatt
in [28] that the maps $x \mapsto E^s(x)$ and $x \mapsto E^u(x)$ are Lipschitz. This implies that
\[
\dim_H \Lambda = \dim_H[(\Lambda \cap V^s(x)) \times (\Lambda \cap V^u(x))]
= \dim_H(\Lambda \cap V^s(x)) + \dim_H(\Lambda \cap V^u(x)) = t_s + t_u. \tag{36}
\]
Indeed, if $\dim_H A = \overline{\dim}_B A$, then for any set $B$ we have
\[
\dim_H(A \times B) = \dim_H A + \dim_H B.
\]
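Now we proceed with the description of the dimension spectra. We denote by $L^+$
(respectively, $L^-$) the family of almost additive sequences of continuous functions
with respect to $f$ (respectively, $f^{-1}$) that have bounded variation with respect to $f$
(respectively, $f^{-1}$). We only consider almost additive sequences
\[
\Phi^+ = (\varphi_n^+)_n, \quad \Phi^- = (\varphi_n^-)_n, \quad \Psi^+ = (\psi_n^+)_n, \quad\text{and}\quad \Psi^- = (\psi_n^-)_n
\]
such that
\[
\liminf_{m\to\infty} \frac{\psi_m^{\pm}(x)}{m} > 0 \quad\text{and}\quad \psi_n^{\pm}(x) > 0
\]
for every $n \in \mathbb{N}$ and $x \in \Lambda$. Given $(\Phi^+, \Psi^+) \in L^+ \times L^+$ and $\alpha \in \mathbb{R}$ we define
\[
K_\alpha^+ = \left\{ x \in \Lambda : \lim_{n\to\infty} \frac{\varphi_n^+(x)}{\psi_n^+(x)} = \alpha \right\},
\]
and given $(\Phi^-, \Psi^-) \in L^- \times L^-$ and $\beta \in \mathbb{R}$ we define
\[
K_\beta^- = \left\{ x \in \Lambda : \lim_{n\to\infty} \frac{\varphi_n^-(x)}{\psi_n^-(x)} = \beta \right\}.
\]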
We also consider the dimension spectrum $D\colon \mathbb{R}^2 \to \mathbb{R}$ defined by
\[
D(\alpha, \beta) = \dim_H(K_\alpha^+ \cap K_\beta^-).
\]
The following is a conditional variational principle for the spectrum $D$.

Theorem 9.1 ([6]). If int P+ (M) and int P (M), then

D(, ) = dimH K+ + dimH K dimH






h (f )
= max  : M and P () =
+



log df | E s d





h (f )

+ max  : M and P () = . (37)



log df | E u d

Moreover, the spectrum D is analytic in int P+ (M) int P (M).

The proof of Theorem 9.1 follows to some extent arguments of Barreira and
Valls ([11]) in the additive case. In particular, it involves constructing a measure
= sitting on the set K+ K , that is, such that

(K+ K ) = 1,

having the right pointwise dimension. This means that

log (B(x, r))


lim inf dimH K+ + dimH K dimH
r0 log r

for -almost every x , and

log (B(x, r))


lim sup dimH K+ + dimH K dimH
r0 log r

for every x K+ K . These properties, together with general results in dimension


theory (see, for example, [4]) readily yield the first identity in (37). The second
identity follows from Theorem 8.2. The measure , although never invariant, is
constructed essentially as a product of (invariant) equilibrium measures along the
stable and unstable directions, for which the results in Sec. 4 are essential. More
precisely, set

U = q + ( ) (dimH K+ ts )Du
and
S = q ( ) (dimH K tu )Ds ,
where Du and Ds are the additive sequences
n1
 n1

log df | E u f k and log df 1 | E s f k ,
k=0 k=0

and where q , q R are such that
+

P (U ) = P (S) = 0.
By the almost additive thermodynamic formalism there exist unique equilibrium
measures u and s respectively of U and S. Roughly speaking, the measure
is given by the product u s at the level of symbolic dynamics. It is also shown
in [6] that
dimH K+ = dimH (K+ V u (x)) + ts
and
dimH K = dimH (K V s (y)) + tu
for every x K+ and y K . Together with (36) and (37), this shows that
D(, ) = dimH (K+ V u (x)) + dimH (K V s (y))
for every x K+ and y K .

Note Added in Proof. Meantime, I became aware of the interesting paper [24]
by Feng and Huang. Their work considers the more general case of asymptotically
subadditive sequences and is a quite substantial advance towards a general theory.

Acknowledgment
The author was partially supported by FCT through CAMGSD, Lisbon.

References
[1] L. Barreira, A non-additive thermodynamic formalism and applications to dimension
theory of hyperbolic dynamical systems, Ergodic Theory Dynam. Systems 16 (1996)
871–927.
[2] L. Barreira, Dimension estimates in nonconformal hyperbolic dynamics, Nonlinearity
16 (2003) 1657–1672.
[3] L. Barreira, Nonadditive thermodynamic formalism: Equilibrium and Gibbs measures,
Discrete Contin. Dyn. Syst. 16 (2006) 279–305.
[4] L. Barreira, Dimension and Recurrence in Hyperbolic Dynamics, Progress in Mathematics,
Vol. 272 (Birkhäuser, 2008).
[5] L. Barreira and P. Doutor, Almost additive multifractal analysis, J. Math. Pures
Appl. 92 (2009) 1–17.
[6] L. Barreira and P. Doutor, Dimension spectra of almost additive sequences,
Nonlinearity 22 (2009) 2761–2773.
[7] L. Barreira and K. Gelfert, Multifractal analysis for Lyapunov exponents on
nonconformal repellers, Comm. Math. Phys. 267 (2006) 393–418.
[8] L. Barreira and Ya. Pesin, Lyapunov Exponents and Smooth Ergodic Theory, Univ.
Lect. Ser., Vol. 23 (Amer. Math. Soc., 2002).
[9] L. Barreira, B. Saussol and J. Schmeling, Higher-dimensional multifractal analysis,
J. Math. Pures Appl. 81 (2002) 67–91.
[10] L. Barreira and J. Schmeling, Sets of non-typical points have full topological
entropy and full Hausdorff dimension, Israel J. Math. 116 (2000) 29–70.
[11] L. Barreira and C. Valls, Multifractal structure of two-dimensional horseshoes,
Comm. Math. Phys. 266 (2006) 455–470.
[12] H. Bothe, The Hausdorff dimension of certain solenoids, Ergodic Theory Dynam.
Systems 15 (1995) 449–474.
[13] R. Bowen, Topological entropy for noncompact sets, Trans. Amer. Math. Soc. 184
(1973) 125–136.
[14] R. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms,
Lect. Notes in Math., Vol. 470 (Springer, 1975).
[15] R. Bowen, Hausdorff dimension of quasi-circles, Inst. Hautes Études Sci. Publ. Math.
50 (1979) 259–273.
[16] Y.-L. Cao, D.-J. Feng and W. Huang, The thermodynamic formalism for sub-additive
potentials, Discrete Contin. Dyn. Syst. 20 (2008) 639–657.
[17] P. Collet, J. Lebowitz and A. Porzio, The dimension spectrum of some dynamical
systems, J. Stat. Phys. 47 (1987) 609–644.
[18] K. Falconer, The Hausdorff dimension of self-affine fractals, Math. Proc. Cambridge
Philos. Soc. 103 (1988) 339–350.
[19] K. Falconer, A subadditive thermodynamic formalism for mixing repellers, J. Phys.
A 21 (1988) 1737–1742.
[20] K. Falconer, Bounded distortion and dimension for non-conformal repellers, Math.
Proc. Cambridge Philos. Soc. 115 (1994) 315–334.
[21] D.-J. Feng, Lyapunov exponents for products of matrices and multifractal analysis.
I. Positive matrices, Israel J. Math. 138 (2003) 353–376.
[22] D.-J. Feng, The variational principle for products of non-negative matrices,
Nonlinearity 17 (2004) 447–457.
[23] D.-J. Feng, Lyapunov exponents for products of matrices and multifractal analysis.
II. General matrices, Israel J. Math. 170 (2009) 355–394.
[24] D.-J. Feng and W. Huang, Lyapunov spectrum of asymptotically sub-additive
potentials, Comm. Math. Phys. 297 (2010) 1–43.
[25] D.-J. Feng and A. Käenmäki, Equilibrium states for the pressure function for products
of matrices, preprint (2009).
[26] D.-J. Feng and K. Lau, The pressure function for products of non-negative matrices,
Math. Res. Lett. 9 (2002) 363–378.
[27] T. Halsey, M. Jensen, L. Kadanoff, I. Procaccia and B. Shraiman, Fractal measures
and their singularities: The characterization of strange sets, Phys. Rev. A 34 (1986)
1141–1151; Errata, ibid. 34 (1986) 1601.
[28] B. Hasselblatt, Regularity of the Anosov splitting and of horospheric foliations,
Ergodic Theory Dynam. Systems 14 (1994) 645–666.
[29] H. Hu, Box dimensions and topological pressure for some expanding maps, Comm.
Math. Phys. 191 (1998) 397–407.
[30] A. Käenmäki, On natural invariant measures on generalised iterated function
systems, Ann. Acad. Sci. Fenn. Math. 29 (2004) 419–458.
[31] G. Keller, Equilibrium States in Ergodic Theory, London Mathematical Society
Student Texts, Vol. 42 (Cambridge University Press, 1998).
[32] A. Lopes, The dimension spectrum of the maximal measure, SIAM J. Math. Anal.
20 (1989) 1243–1254.
[33] H. McCluskey and A. Manning, Hausdorff dimension of horseshoes, Ergodic Theory
Dynam. Systems 3 (1983) 251–260.
[34] A. Mummert, The thermodynamic formalism for almost-additive sequences, Discrete
Contin. Dyn. Syst. 16 (2006) 435–454.
[35] J. Palis and M. Viana, On the continuity of the Hausdorff dimension and limit
capacity for horseshoes, in Dynamical Systems (Valparaiso, 1986), eds. R. Bamón,
R. Labarca and J. Palis, Lect. Notes in Math., Vol. 1331 (Springer, 1988),
pp. 150–160.
[36] Ya. Pesin, Dimension Theory in Dynamical Systems: Contemporary Views and
Applications, Chicago Lectures in Mathematics (Chicago University Press, 1997).
[37] Ya. Pesin and B. Pitskel, Topological pressure and the variational principle for
noncompact sets, Funct. Anal. Appl. 18 (1984) 307–318.
[38] D. Rand, The singularity spectrum $f(\alpha)$ for cookie-cutters, Ergodic Theory Dynam.
Systems 9 (1989) 527–541.
[39] D. Ruelle, Statistical mechanics on a compact set with $\mathbb{Z}^\nu$ action satisfying
expansiveness and specification, Trans. Amer. Math. Soc. 185 (1973) 237–251.
[40] D. Ruelle, Thermodynamic Formalism, Encyclopedia of Mathematics and Its
Applications, Vol. 5 (Addison-Wesley, 1978).
[41] D. Ruelle, Repellers for real analytic maps, Ergodic Theory Dynam. Systems 2 (1982)
99–107.
[42] H. Rugh, On the dimensions of conformal repellers. Randomness and parameter
dependency, Ann. of Math. (2) 168 (2008) 695–748.
[43] J. Schmeling, Symbolic dynamics for $\beta$-shifts and self-normal numbers, Ergodic
Theory Dynam. Systems 17 (1997) 675–694.
[44] K. Simon, The Hausdorff dimension of the Smale–Williams solenoid with different
contraction coefficients, Proc. Amer. Math. Soc. 125 (1997) 1221–1228.
[45] K. Simon and B. Solomyak, Hausdorff dimension for horseshoes in $\mathbb{R}^3$, Ergodic Theory
Dynam. Systems 19 (1999) 1343–1363.
[46] P. Walters, A variational principle for the pressure of continuous transformations,
Amer. J. Math. 97 (1976) 937–971.
[47] P. Walters, An Introduction to Ergodic Theory, Graduate Texts in Mathematics,
Vol. 79 (Springer, 1982).
[48] M. Yuri, Zeta functions for certain non-hyperbolic systems and topological Markov
approximations, Ergodic Theory Dynam. Systems 18 (1998) 1589–1612.
November 16, 2010 15:28 WSPC/S0129-055X 148-RMP
J070-S0129055X10004181

Reviews in Mathematical Physics


Vol. 22, No. 10 (2010) 1181–1208

c World Scientific Publishing Company
DOI: 10.1142/S0129055X10004181

PAULI–FIERZ MODEL WITH KATO-CLASS POTENTIALS AND EXPONENTIAL DECAYS

TAKERU HIDAKA and FUMIO HIROSHIMA


Faculty of Mathematics, Kyushu University,
Fukuoka 819-0385, Japan
hiroshima@math.kyushu-u.ac.jp

Received 29 March 2010


Revised 25 October 2010

The generalized Pauli–Fierz Hamiltonian with Kato-class potential, $K_{\mathrm{PF}}$, in nonrelativistic
quantum electrodynamics is defined and studied by a path measure. $K_{\mathrm{PF}}$ is defined as
the self-adjoint generator of a strongly continuous one-parameter symmetric semigroup
and it is shown that its bound states spatially exponentially decay pointwise and the
ground state is unique.

Keywords: Pauli–Fierz model; exponential decay; ground states; functional integrations.

Mathematics Subject Classification 2010: 81Q10, 46N50

1. Introduction
In this paper, we investigate generalized Pauli–Fierz Hamiltonians with Kato-class
potentials in nonrelativistic quantum electrodynamics by a path measure. It
includes not only Kato-class potentials but also general cutoff functions of quantized
radiation fields. Basic ingredients in this paper are path measures and functional
integral representations of semigroups. It has been shown that functional integral
representations are useful tools to investigate the spectrum of models in quantum
field theory. See, e.g., [4, 9, 15, 18, 20, 22, 23, 28, 29].
The strongly continuous one-parameter semigroup $(e^{-tH_p})_{t \ge 0}$ generated by the
Schrödinger operator, $H_p = \frac{1}{2}(p - a)^2 + V$, on $L^2(\mathbb{R}^d)$ with some external potential
$V$ and vector potential $a = (a_1, \dots, a_d)$ is expressed by a path measure, which is
known as the Feynman–Kac–Itô formula ([25]):
\[
(f, e^{-tH_p} g) = \int dx\, \overline{f(x)}\, E^x\!\left[ e^{-\int_0^t V(B_s)\,ds - i\int_0^t a(B_s)\circ dB_s} g(B_t) \right], \tag{1.1}
\]
where $E^x$ denotes the expectation value with respect to the Wiener measure $P^x$,
$(B_t)_{t \ge 0}$ the $d$-dimensional Brownian motion and $\int_0^t a(B_s)\circ dB_s$ a Stratonovich
integral.

Conversely, since a Kato-class potential $V$ satisfies
\[
\sup_x E^x\!\left[ e^{-\int_0^t V(B_s)\,ds} \right] < \infty, \quad t \ge 0, \tag{1.2}
\]
the family of mappings $S_t$ defined by
\[
S_t g(x) = E^x\!\left[ e^{-\int_0^t V(B_s)\,ds - i\int_0^t a(B_s)\circ dB_s} g(B_t) \right], \quad t \ge 0, \tag{1.3}
\]
turns out to be a strongly continuous one-parameter symmetric semigroup for a
Kato-class potential $V$. The Schrödinger operator with a Kato-class potential $V$
is then defined as the self-adjoint generator of $(S_t)_{t \ge 0}$. See, e.g., [3, 26, 27, 19].
The three-dimensional Kato class includes singular external potentials such as
$V(x) = -|x|^{-a}$, $0 \le a < 2$.
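
Expectations of the kind appearing in (1.2) can be estimated by direct simulation of Brownian paths. The sketch below is only an illustration under stated assumptions: it works in dimension one, sets the vector potential to zero, and uses the harmonic potential $V(x) = x^2/2$ (chosen because the closed form $E^0[e^{-\int_0^t B_s^2/2\,ds}] = (\cosh t)^{-1/2}$ is available for comparison, not because it appears in the text). The function name and all parameters are hypothetical.

```python
import math
import random


def feynman_kac_expectation(potential, x0, t, n_steps=200, n_paths=2000, seed=0):
    """Monte Carlo estimate of E^x[exp(-int_0^t V(B_s) ds)] for 1-d Brownian motion."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        b = x0
        integral = 0.0
        for _ in range(n_steps):
            integral += potential(b) * dt        # left-endpoint Riemann sum of V along the path
            b += sqrt_dt * rng.gauss(0.0, 1.0)   # independent Gaussian increment of variance dt
        total += math.exp(-integral)
    return total / n_paths


if __name__ == "__main__":
    estimate = feynman_kac_expectation(lambda x: 0.5 * x * x, x0=0.0, t=1.0)
    exact = 1.0 / math.sqrt(math.cosh(1.0))
    print(estimate, exact)
```

The estimate carries both a statistical error, which decreases like the inverse square root of the number of paths, and a discretization error from the Riemann sum, so only approximate agreement with the closed form should be expected.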
We extend this to the Pauli–Fierz Hamiltonian. The Pauli–Fierz Hamiltonian
$H_{\mathrm{PF}}$ is a self-adjoint operator defined on the tensor product of Hilbert spaces:
\[
\mathscr{H} = L^2(\mathbb{R}^d) \otimes L^2(Q), \tag{1.4}
\]
where $L^2(Q)$ is an $L^2$-space over a probability space $(Q, \mathscr{B}, \mu)$ with a Gaussian
measure $\mu$, and it describes the Schrödinger representation of the standard Boson
Fock space. The Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ is given by
1
HPF = (p 1 + A )2 + V 1 + 1 Hf (m), (1.5)
2
where 0 is a coupling constant, Hf (m) the free eld Hamiltonian with a eld
mass m 0 and A = (A1 , . . . , Ad ) a quantized radiation eld with a cuto func-
tion. See Sec. 2 for further details of notations. Under some conditions on cuto
functions and V it is proven that (1.5) is self-adjoint and etHPF is then dened by
the spectral resolution. In [14], (F, etHPF G) is also presented by a path measure:

(F, etHPF G) = dx(F (x), (Tt G)(x))L2 (Q) , (1.6)

where Tt is of the form


Rt
Tt f (x) = Ex [e 0
V (Bs )ds i AE (Kt )
J0 e Jt G(Bt )] L2 (Q) (1.7)

for each x Rd . Compare with (1.3) and see (2.47) for details.
Our construction of generalized Pauli–Fierz Hamiltonians is close to the procedure
used to define the Schrödinger operator with Kato-class potentials. We believe,
however, that it is worthwhile extending it to the Pauli–Fierz Hamiltonian from
the mathematical point of view. It will be shown that the family of operators
$T_t\colon \mathscr{H} \to \mathscr{H}$, $t \ge 0$, can also be defined for Kato-class potentials $V$ and general
cutoff functions in $A$, and the generalized Pauli–Fierz Hamiltonian $K_{\mathrm{PF}}$ is defined
as the self-adjoint generator of $(T_t)_{t \ge 0}$. Of course, under some conditions $K_{\mathrm{PF}}$
coincides with $H_{\mathrm{PF}}$, but $K_{\mathrm{PF}}$ permits more singular potentials $V$ and more general
cutoff functions in $A$.
Cuto functions of A (x), = 1, 2, 3, of the standard PauliFierz Hamiltonian


in three dimensions are of the form

eikx e (k, j)(k)/
|k| (1.8)
with some function and polarization vectors e(k, j) = (e1 (k, j), e2 (k, j),
e3 (k, j)), j = 1, 2. In [8], the so-called Nelson model on a pseudo-Riemannian man-
ifold is studied by a path measure. A generalized PauliFierz Hamiltonians include
a mathematical analogue of the Nelson model on a pseudo-Riemannian manifold,
which is unitarily transformed to the PauliFierz Hamiltonian with a variable mass.
The cuto function of the PauliFierz Hamiltonian with a variable mass v is (1.8)
with eikx and e (k, j)(k) replaced by (k, x) and j (k), respectively:

(k, x)j (k)/ |k|. (1.9)

Here j (k) is some function and (k, x), k = 0, is the unique solution of the
LippmanSchwinger equation ([21]):
 i|k||xy|
+ikx 1 e v(y)
(k, x) = e (k, y)dy. (1.10)
4 |x y|
The main results of the present paper are as follows:

(1) we define the generalized Pauli–Fierz Hamiltonian $K_{\mathrm{PF}}$ with Kato-class potentials
and generalized cutoff functions, i.e. we prove that $(T_t)_{t \ge 0}$ is a strongly
continuous one-parameter symmetric semigroup;
(2) $K_{\mathrm{PF}}$ is an extension of $H_{\mathrm{PF}}$;
(3) bound states of $K_{\mathrm{PF}}$ spatially exponentially decay pointwise and the ground state is
unique if it exists.

We explain an outline of (1)–(3) above.


First we dene the strongly continuous one-parameter symmetric semigroup
(Tt )t0 with Kato-class potentials and general cuto functions by functional inte-
gral representations. Then KPF is dened by Tt = etKPF for t 0. We introduce
two assumptions, Assumptions 2.1 and 2.12, on cuto functions of A . The former is
stronger than the latter. One advantage to dene the generalized PauliFierz Hamil-
tonian by a path measure is that we need only a weak condition on cuto functions
(Assumption 2.12) and external potentials. Then for arbitrary R, Kato-class
potential V and cuto function j (x, k) satisfying j (x, k) Cb1 (Rdx ; L2 (Rdk )), we
can dene KPF as a self-adjoint operator.
Secondly, we can show that
1
(p 1 + A )2 +
V+ 1
V 1 + 1 Hf (m) (1.11)
2
is well dened for V such that 0 V+ L1loc (Rd ) and 0 V is relatively form
bounded with respect to p2 /2 with a relative bound strictly smaller than one. It is
shown that KPF = (1.11) under Assumption 2.1 on cuto functions.
Finally it is shown that bound states of KPF spatially exponentially decays


pointwise. To show the spatial exponential decay of bound states is very important
to study the properties of spectrum of PauliFierz type models. In [2, 11, 10] the
spatial exponential decay of bound states is shown but our method is completely
dierent from them. Since b (x) = etE etKPF b for b such that KPF b = Eb ,
exponential decay of b (x) is proven by means of showing supx
b (x)
L2 (Q) <
Rt
and estimating etE Ex [e 0 V (Bs )ds ]. We conclude that


b (x)
L2 (Q) DeC|x|

(1.12)

almost everywhere x Rd , and constants D and C are independent of the eld mass
m. Here the exponent , 1, is determined by the behavior of external potential
V . When lim inf |x| V (x) < E, we can take = 1, and when V (x) = |x|2n ,
= n+1 is obtained. See Theorem 3.1 for the details. Furthermore, from a standard
argument [15] it follows that the transformed operator ei(/2)N Tt ei(/2)N is a
positivity improving semigroup, where N denotes the number operator in L2 (Q).
Then we conclude that the ground state of KPF is unique if it exists.
This paper is organized as follows: Section 2 is devoted to constructing a strongly
continuous symmetric semigroup (Tt )t0 and dening the self-adjoint operator KPF .
In Sec. 3, we show the spatial exponential decay of bound states of KPF pointwise.
And lastly, we have the Appendix.

2. Generalized Pauli–Fierz Hamiltonian


2.1. Definitions
Let us begin by defining a generalized Pauli–Fierz Hamiltonian by a path measure.
We use the notation $E_P$ for the expectation with respect to a probability measure
$P$, i.e. $\int \cdots \, dP = E_P[\cdots]$. Let $\mathscr{S}_{\mathrm{real}} = \mathscr{S}_{\mathrm{real}}(\mathbb{R}^d)$ be the set of real-valued Schwartz
d1
test functions on Rd . We set Q = j=1 Sreal . There exist a -eld B, a probability
measure on a measurable space (Q, B) and a Gaussian random variable A ()
d1
indexed by = (1 , . . . , d1 ) j=1 L2real (Rd ) such that

E [A ()] = 0 (2.1)

and the covariance is given by

1
d1
E [A ()A ()] = (j , j )L2 (Rd ) . (2.2)
2 j=1

Throughout the scalar product on Hilbert space, L is denoted by (F, G)L , where
it is antilinear in F and linear in G. We omit L when no confusion arises. For
d1 2 d
general L (R ), A () is dened by

A () = A ( ) + iA ( ). (2.3)
Thus A () is linear in over C. The Boson Fock space is dened by L2 (Q, d) =


$L^2(Q)$. It is known that the linear hull of
 
 
d1
 2
: A (1 ) A (n ) : j L (R ), j = 1, . . . , n, n 0
d
(2.4)


is dense in $L^2(Q)$, where $:X:$ denotes the Wick product of $X$. See the Appendix
for the definition of the Wick product. Let us define the free field Hamiltonian $H_f(m)$
on L2 (Q). Dene the map (T ): L2 (Q) L2 (Q) by (T )1 = 1 and
(T ) : A (1 ) A (n ) : = : A (T 1 ) A (T n ) : (2.5)
for a contraction operator $T$ on $\oplus^{d-1} L^2(\mathbb{R}^d)$. Then $\Gamma(T)$ is also a contraction on (2.4)
and can be uniquely extended to a contraction operator on the whole space $L^2(Q)$,
which is denoted by the same symbol $\Gamma(T)$. We can check that $\Gamma(T)\Gamma(S) = \Gamma(TS)$.
Then $\{\Gamma(e^{ith})\}_{t \in \mathbb{R}}$ for a self-adjoint operator $h$ defines a strongly continuous
one-parameter unitary group on $L^2(Q)$. The self-adjoint generator of $\{\Gamma(e^{ith})\}_{t \in \mathbb{R}}$
is denoted by $d\Gamma(h)$, i.e.
\[
\Gamma(e^{ith}) = e^{it\, d\Gamma(h)}, \quad t \in \mathbb{R}. \tag{2.6}
\]
Let
\[
h = \oplus^{d-1} \omega(-i\nabla), \tag{2.7}
\]
where
\[
\omega(k) = \sqrt{|k|^2 + m^2}, \quad m \ge 0, \ k \in \mathbb{R}^d. \tag{2.8}
\]
Then we set
\[
H_f(m) = d\Gamma(h) \tag{2.9}
\]
and it is called the free field Hamiltonian on $L^2(Q)$. Let $p = -i\nabla = (-i\partial_{x_1}, \dots,
-i\partial_{x_d})$ be momentum operators in $L^2(\mathbb{R}^d_x)$. We define the Schrödinger operator
$H_p$ by
\[
H_p = \frac{1}{2} p^2 + V, \tag{2.10}
\]
where $V$ denotes a real-valued external potential. The conditions on $V$ will be
required later. The zero coupling Hamiltonian is now given by the self-adjoint
operator
\[
H_p \otimes 1 + 1 \otimes H_f(m) \tag{2.11}
\]
on the Hilbert space
\[
\mathscr{H} = L^2(\mathbb{R}^d_x) \otimes L^2(Q). \tag{2.12}
\]
The PauliFierz Hamiltonian HPF is dened by replacing p 1 in zero cou-

pling Hamiltonian (2.11) with p 1 + A , where 0 is a coupling
constant and

A = A (x)dx (2.13)
Rd

 the2 so-called quantized radiation eld. Here we used the identication H =
is
Rd L (Q)dx. We shall dene A (x) below. Let

j (, x) = (j (, x)/ ), j = 1, . . . , d 1, = 1, . . . , d, (2.14)

(respectively X)
where j is a cuto function and X denotes the (respectively

inverse) Fourier transform of X. Note that (k, x) = j (k)(k, x)/ (k).
j

Examples of cutoff functions are given later. The quantized radiation field is
dened by


d1
A (x) = A j (x), = 1, . . . , d, (2.15)
j=1

for each x Rd . Now we arrive at the denition of the PauliFierz Hamiltonian. It


is dened by
1
HPF = (p 1 + A )2 + V 1 + 1 Hf (m). (2.16)
2
We omit for notational convenience in what follows. Then HPF is expressed as
1
HPF = (p + A )2 + V + Hf (m). (2.17)
2
Assumption 2.1. Suppose that j Cb1 (Rdx ; L2 (Rdk )) and

j , j , j / , x j , x j / L (Rdx ; L2 (Rdk )). (2.18)

Under Assumption 2.1 it follows that


(p A + A p)F
c1
(p2 + Hf (m) + 1)F
, (2.19)

A A F
c2
(Hf (m) + 1)F
. (2.20)

Moreover, HPF is self-adjoint on D(p2 ) D(Hf (m)) under Assumption 2.1. See
[16, 17, 12] for the proof. We give examples of cuto functions j .

Example 2.2 (Standard PauliFierz Hamiltonian). The standard Pauli


Fierz Hamiltonian is dened by HPF with the dimension d = 3, m = 0, and

(k, x) = e+ikx , j (k) = (k)e
(k, j)/ ,

where e(k, j) = (e1 (k, j), e2 (k, j), e3 (k, j)), j = 1, 2, denote polarization vectors,

and is an ultraviolet cuto function. Suppose that , /
, /
L2 (Rd ).
1 2
Then (k, x) Cb (Rx ; L (Rk )) and (2.18) is fullled.
j d d
Example 2.3 (The PauliFierz Hamiltonian with a Variable Mass). The


PauliFierz Hamiltonian with a variable mass v instead of m is studied in [13].
Then d = 3, m = 0, and (k, x) is the unique solution to the LippmanSchwinger
equation ([21]):
 i|k||xy|
+ikx 1 e v(y)
(k, x) = e (k, y)dy. (2.21)
4 |x y|
(k, x) formally satises
(x + v(x))(k, x) = |k|2 (k, x), k = 0.
It is established that the PauliFierz Hamiltonian with a variable mass has a ground
state for arbitrary values of coupling constants when |v(x)| C(1+|x|2 )/2 , > 3,
with some constant C. Then it is also seen that
|(k, x) eikx | C(1 + |x|2 )1/2 . (2.22)
Since
  
1 1
x (k, x) = ik e ikx
i|k|
4 R3 |x y|

(x y )ei|k||xy| v(y)
(k, y)dy, (2.23)
|x y|2
it follows that
sup |x (k, x)| < (2.24)
kD,xRd
x

for any compact set D but D  0. Let supp j D. Then j Cb1 (Rdx ; L2 (Rdk ))
follows from (2.22) and (2.24). In addition to condition supp j D let us suppose

that j / , j , j / L2 (Rdk ), then (2.18) is fullled.

2.2. Feynman–Kac type formulae


Let us prepare the Euclidean version of the quantized radiation eld A () to
construct a functional integral representation of etHPF in the same way as [14].
d1
Let QE = Sreal (Rd+1 ). There exist a probability measure E on a mea-
surable space (QE , BE ) and a Gaussian random variable AE () indexed by
d1 2 d+1
L (R ) such that
EE [AE ()] = 0
and the covariance is given by

1
d1
EE [AE ()AE ()] = (j , j )L2 (Rd+1 ) .
2 j=1
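Both $L^2(Q)$ and $L^2(Q_E)$ are connected through the second quantization of the
family of isometries $\{j_t\}_{t \in \mathbb{R}}$ between $L^2(\mathbb{R}^d)$ and $L^2(\mathbb{R}^{d+1})$:
\[
\widehat{j_t f}(k_0, k) = \frac{e^{-ik_0 t}}{\sqrt{\pi}} \sqrt{\frac{\omega(k)}{\omega(k)^2 + |k_0|^2}}\, \hat{f}(k). \tag{2.25}
\]
Define $J_t = \Gamma(\oplus^{d-1} j_t)\colon L^2(Q) \to L^2(Q_E)$. From the identity $j_t^* j_s = e^{-|t-s|\omega(-i\nabla)}$
it follows that $J_t^* J_s = e^{-|t-s| H_f(m)}$.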
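
The identity for $j_t^* j_s$ rests on a one-dimensional Fourier integral in the variable $k_0$. Assuming the normalization $1/\sqrt{\pi}$ in (2.25), a reconstruction that should be checked against the published formula, the relevant kernel is $\frac{1}{\pi}\int_{\mathbb{R}} e^{-ik_0\tau}\,\frac{\omega}{\omega^2 + k_0^2}\,dk_0 = e^{-\omega|\tau|}$ with $\tau = t - s$. The following sketch checks this numerically with a truncated trapezoidal rule; the function name, cutoff and step size are hypothetical choices.

```python
import math


def time_kernel(tau, omega, cutoff=400.0, step=0.01):
    """Trapezoidal approximation of (1/pi) * int e^{-i k0 tau} * omega / (omega^2 + k0^2) dk0.

    The imaginary part vanishes by symmetry, so only the cosine part is integrated.
    """
    n = int(round(cutoff / step))

    def integrand(k0):
        return math.cos(k0 * tau) * omega / (omega ** 2 + k0 ** 2)

    # The integrand is even in k0: integrate over [0, cutoff] and double the result.
    total = 0.5 * (integrand(0.0) + integrand(n * step))
    for i in range(1, n):
        total += integrand(i * step)
    return 2.0 * step * total / math.pi


if __name__ == "__main__":
    for tau, omega in [(0.5, 1.0), (2.0, 1.5)]:
        print(time_kernel(tau, omega), math.exp(-omega * abs(tau)))
```

The truncation error decays only like the inverse square of the cutoff, so the check is meaningful for $\tau \ne 0$ but becomes slow to converge as $\tau \to 0$.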

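Let $\mathscr{X} = C([0, \infty); \mathbb{R}^d)$ be the set of continuous paths on $[0, \infty)$. Let $(B_t)_{t \ge 0}$
denote the $d$-dimensional Brownian motion starting at $x \in \mathbb{R}^d$ on $(\mathscr{X}, B(\mathscr{X}), P^x)$
with the Wiener measure $P^x$. That is, $P^x(B_0 = x) = 1$. Let $C_b^n(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ be
the set of strongly $n$-times differentiable $L^2(\mathbb{R}^d)$-valued functions on $\mathbb{R}^d$ such that
$\sup_x \|\partial_x^z f(x)\|_{L^2(\mathbb{R}^d)} < \infty$ for $|z| \le n$. For $f_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$, $\mu = 1, \dots, d$, we can
define an $L^2(\mathbb{R}^d)$-valued Stratonovich integral:
\[
\sum_{\mu=1}^d \int_0^t f_\mu(B_s) \circ dB_s^\mu = \int_0^t f(B_s) \cdot dB_s + \frac{1}{2} \int_0^t \nabla \cdot f(B_s)\, ds, \tag{2.26}
\]
where $f(B_s) \cdot dB_s = \sum_{\mu=1}^d f_\mu(B_s)\, dB_s^\mu$ and $\nabla \cdot f(B_s) = \sum_{\mu=1}^d (\partial_{x_\mu} f_\mu)(B_s)$. We also
define an $L^2(\mathbb{R}^{d+1})$-valued Stratonovich integral by
\[
\sum_{\mu=1}^d \int_0^t j_s f_\mu(B_s) \circ dB_s^\mu = \lim_{n\to\infty} \sum_{\mu=1}^d \sum_{j=1}^n \int_{t(j-1)/n}^{tj/n} j_{t(j-1)/n} f_\mu(B_s) \circ dB_s^\mu, \tag{2.27}
\]
where $\lim_{n\to\infty}$ is a strong limit in $L^2(\mathscr{X}; L^2(\mathbb{R}^{d+1}))$. By the Itô isometry we have
the identity for $S \le T$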
 
 T  S
Ex js f (Bs ) dBs , js g(Bs ) dBs
0 0
L2 (Rd+1 )

d 
 S
= Ex [(f (Bs ), g (Bs ))]ds
=1 0

Hence we have the bound


 2
d  
 
Ex  js f (Bs ) dBs 
 
=1

  
t 
d
1 2 2
dsE x
2
f (Bs )
+
f (Bs )
. (2.28)
0 =1
2

The next proposition is fundamental.
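
The correction term relating the Stratonovich and Itô integrals in (2.26) can be seen in a scalar toy case. The sketch below assumes $d = 1$ and a real-valued integrand $f(x) = x$ (an illustrative choice, not the $L^2$-valued setting of the text): the left-endpoint sums approximate the Itô integral $\int_0^t B_s\,dB_s$, the midpoint sums approximate the Stratonovich integral $\int_0^t B_s \circ dB_s = B_t^2/2$, and their difference approximates $\frac{1}{2}\int_0^t f'(B_s)\,ds = t/2$. The function name is hypothetical.

```python
import math
import random


def ito_and_stratonovich(n_steps=20000, t=1.0, seed=1):
    """Discretize int_0^t B dB with left-endpoint (Ito) and midpoint (Stratonovich) sums."""
    rng = random.Random(seed)
    dt = t / n_steps
    b = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n_steps):
        db = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        ito += b * db                  # integrand evaluated at the left endpoint
        strat += (b + 0.5 * db) * db   # integrand averaged over the two endpoints
        b += db
    return ito, strat, b


if __name__ == "__main__":
    ito, strat, b_t = ito_and_stratonovich()
    print("Stratonovich sum minus B_t^2 / 2:", strat - 0.5 * b_t * b_t)
    print("Stratonovich minus Ito (close to t / 2):", strat - ito)
```

The midpoint sums telescope exactly to $B_t^2/2$, while the difference between the two sums is half the discrete quadratic variation, which concentrates around $t/2$ as the mesh is refined.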


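Proposition 2.4. Let $V$ be bounded. Suppose Assumption 2.1. Then

\[
(F, e^{-tH_{\rm PF}}G) = \int dx\,\mathbb{E}^x\left[e^{-\int_0^t V(B_s)ds}\left(J_0F(B_0),\ e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\right)_{L^2(Q_E)}\right], \tag{2.29}
\]

where $K_t$ is the $\oplus^{d-1}L^2(\mathbb{R}^{d+1})$-valued stochastic integral given by

\[
K_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^t j_s\rho_\mu^j(\cdot,B_s)\circ dB_s^\mu. \tag{2.30}
\]

Here

\[
\sum_{\mu=1}^d\int_0^t j_s\rho_\mu^j(\cdot,B_s)\circ dB_s^\mu = \int_0^t j_s\rho^j(\cdot,B_s)\cdot dB_s + \frac12\int_0^t j_s\nabla\cdot\rho^j(\cdot,B_s)\,ds.
\]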

Proof. Suppose that j Cb2 (Rdx ; L2 (Rdk )). Then (2.29) is proven in the same
way as [16, Lemma 4.8]. Next we suppose that j (k, x) Cb1 (Rdx ; L2 (Rdk )). Let
C (Rd ) and C0 (Rd ) be such that


1, |x| < 1,
(x) = <1, 1 |x| 2, 0


0, 2 < |x|,

and (x)dx = 1. Dene N (x) = (x/N ) and n (x) = (x/n)nd/2 . Let

j (k, x)M,n = (n (j (k, )N ()))(x),


j (k, x)M = j (k, x)M (x).

We note that j (k, x)M,n Cb (Rdx ; L2 (Rdk )). Since j (k, x)M,n j (k, x)M in
Lp (Rdx , L2 (Rdk )) for 1 p < as n , there exists a subsequence n such
that j (k, x)M,n j (k, x)M strongly in L2 (Rdk ) for almost everywhere x Rd .
Furthermore, j (k, x)M j (k, x) for each x Rd in L2 (Rdk ). Then

lim lim j (k, x)M,n = j (k, x) (2.31)


M n

strongly in L2 (Rdk ) for almost everywhere x Rd . In the same way as above, we


can also see that

lim lim xz j (k, x)M,n = xz j (k, x) (2.32)


M n

strongly in L2 (Rdk ) for almost everywhere x Rd for |z| 1. Thus (2.29) holds
with j replaced by j (k, x)M,n . HPF with j replaced by j (k, x)M,n is denoted
by HPF (M, n ). Let F C0 D(Hf (m)). Then we can prove directly that

lim lim HPF (M, n )F = HPF F.


M n

Since C0 D(Hf (m)) is a core of HPF (M, n ) and HPF ,



lim lim

etHPF (M,n ) = etHPF (2.33)
M n

strongly. Moreover
 Rt
tHPF (M,n ) AE (Kt (M,n )) V (Bs )
(F, e G) = dxEx [(J0 F (x), ei e 0 Jt G(Bt ))],

(2.34)

where Kt (M, n ) is dened by Kt with j (k, x) replaced by j (k, x)M,n . Operator


N = d(1) is called the number operator in L2 (Q). Let F D(N ). Then the bound


A ()F
2

(N + 1)1/2 F

is known. From (2.34) and



AE (Kt (M,n )) AE (Kt )
|ei ei | |AE (Kt (M, n ) Kt )|

it follows that

|(F, etHPF (M,n ) G) (F, etHPF G)|

Rt
dxEx [(|J0 F (x)|, |AE (Kt (M, n ) Kt )|e 0 V (Bs ) |Jt G(Bt )|)]


C dx
(N + 1)1/2 F (x)
Ex [
Kt (M, n ) Kt

G(Bt )
]


C dx
(N + 1)1/2 F (x)
(Ex [
Kt (M, n ) Kt
2 ])1/2 (Ex [
G(Bt )
2 ])1/2 .

We estimate Ex [
Kt (M, n ) Kt
2 ]. By (2.28), we have
 d 
 t
d1  1
2 2 2
E [
Kt (M, n ) Kt
]
x
E 2
x

(Bs )
+
(Bs )
ds.
j j

j=1 0 =1
2

where f = f fM,n . By (2.31) and (2.32), we see that

lim lim Ex [
Kt (M, n ) Kt
2 ] = 0
M n

for each x Rd . Then by the Lebesgue dominated convergence theorem we have

lim lim r.h.s. (2.34)


M n
 Rt
AE (Kt ) V (Bs )
= dxEx [(J0 F (x), ei e 0 Jt G(Bt ))]. (2.35)

Then (2.29) also holds for j Cb1 (Rdx ; L2 (Rdk )). Thus the proposition follows.

2.3. One-parameter symmetric semigroup and generalized Pauli–Fierz Hamiltonian

We can extend the functional integral representation in Proposition 2.4 to more general external potentials and $\rho_\mu^j$.
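Definition 2.5 (Kato-Class Potentials). An external potential $V:\mathbb{R}^d\to\mathbb{R}$ is called a Kato-class potential if and only if

\[
\begin{cases}
\displaystyle\sup_{x\in\mathbb{R}^d}\int_{B_1(x)}|\lambda(x-y)V(y)|\,dy < \infty, & d = 1,\\[2mm]
\displaystyle\lim_{r\to0}\sup_{x\in\mathbb{R}^d}\int_{B_r(x)}|\lambda(x-y)V(y)|\,dy = 0, & d\ge2
\end{cases}
\tag{2.36}
\]

holds, where $B_r(x)$ denotes the closed ball of radius $r$ centered at $x$, and

\[
\lambda(x) =
\begin{cases}
1, & d = 1,\\
-\log|x|, & d = 2,\\
|x|^{2-d}, & d\ge3.
\end{cases}
\tag{2.37}
\]

We denote the set of Kato-class potentials by $\mathcal{K}_{\rm Kato}$.
An equivalent characterization of the Kato class is as follows: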

Proposition 2.6. A function $V$ is in $\mathcal{K}_{\rm Kato}$ if and only if

\[
\lim_{t\downarrow0}\sup_{x\in\mathbb{R}^d}\mathbb{E}^x\left[\int_0^t|V(B_s)|\,ds\right] = 0. \tag{2.38}
\]

Proof. See, e.g., [1, 6, 26, 27].
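The probabilistic characterization of the Kato class can be checked by hand in a simple case. The following is a minimal sketch for the Coulomb potential $|V(x)| = 1/|x|$ in $d = 3$, assuming that the supremum over $x$ is attained at $x = 0$ (plausible for a radially decreasing potential, but taken here as an assumption). It uses the Gaussian identity $\mathbb{E}^0[1/|B_s|] = \sqrt{2/(\pi s)}$, so that the integral over $[0,t]$ equals $2\sqrt{2t/\pi}$ and tends to zero with $t$. The function names are illustrative and not from the paper.

```python
import math
import random


def kato_integral_coulomb(t):
    """Closed form of int_0^t E^0[1/|B_s|] ds in d = 3."""
    return 2.0 * math.sqrt(2.0 * t / math.pi)


def mc_inverse_distance(s, samples=100000, seed=1):
    """Monte Carlo estimate of E^0[1/|B_s|] for 3D Brownian motion at time s."""
    rng = random.Random(seed)
    sd = math.sqrt(s)
    total = 0.0
    for _ in range(samples):
        x = rng.gauss(0.0, sd)
        y = rng.gauss(0.0, sd)
        z = rng.gauss(0.0, sd)
        total += 1.0 / math.sqrt(x * x + y * y + z * z)
    return total / samples


# Compare the sampled expectation with the closed form at one time,
# then watch the time integral shrink as t decreases.
print(mc_inverse_distance(0.5), math.sqrt(2.0 / (math.pi * 0.5)))
for t in (1.0, 0.1, 0.01, 0.001):
    print(t, kato_integral_coulomb(t))
```

The Monte Carlo step is only a sanity check of the closed-form expectation; the decay in $t$ is read off from the explicit formula.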

Definition 2.7. Let $\mathcal{K}$ be the set of external potentials $V = V_+ - V_-$ such that $0\le V_+\in L^1_{\rm loc}(\mathbb{R}^d)$ and $0\le V_-\in\mathcal{K}_{\rm Kato}$.

Example 2.8. In [1, 26, 27], it is shown that $L^p_u(\mathbb{R}^d)\subset\mathcal{K}_{\rm Kato}$, where

\[
L^p_u(\mathbb{R}^d) = \left\{f\ \Big|\ \sup_x\int_{|x-y|\le1}|f(y)|^p\,dy < \infty\right\}
\]

with

\[
\begin{cases}
p = 1, & d = 1,\\
p > \dfrac d2, & d\ge2.
\end{cases}
\tag{2.39}
\]

In particular, if $V\in L^p(\mathbb{R}^d) + L^\infty(\mathbb{R}^d)$ with $p$ as in (2.39), then $V\in\mathcal{K}_{\rm Kato}$.

Example 2.9. Let $d = 3$ and $V(x) = P(x) - \dfrac{a}{|x|^b}$, where $a\ge0$, $0\le b<2$ and $P(x) = \sum_{j=0}^{2n}a_jx^j$ is a polynomial such that $a_{2n} > 0$. Then $V\in\mathcal{K}$.

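Now we shall see that the random variable $\int_0^t V(B_s)ds$ is almost surely finite with respect to the Wiener measure $P^x$ for $V\in\mathcal{K}$.

Lemma 2.10. Let $0\le V\in L^1_{\rm loc}(\mathbb{R}^d)$. Then $P^x\left(\int_0^t V(B_s)ds < \infty\right) = 1$ for each $x\in\mathbb{R}^d$.

Proof. Since $V\in L^1_{\rm loc}(\mathbb{R}^d)$, we can see that $\mathbb{E}^x\left[\int_0^t 1_N(B_s)V(B_s)ds\right] < \infty$ for the indicator function

\[
1_N(k) =
\begin{cases}
1, & |k|\le N,\\
0, & |k| > N.
\end{cases}
\]

Then there exists a measurable set $\mathcal{N}_N\subset\mathscr{X}$ such that $P^x(\mathcal{N}_N) = 0$ and $\int_0^t 1_N(B_s(\omega))V(B_s(\omega))ds < \infty$ for $\omega\in\mathscr{X}\setminus\mathcal{N}_N$. Set $\mathcal{N} = \bigcup_{N=1}^\infty\mathcal{N}_N$. For $\omega\in\mathscr{X}\setminus\mathcal{N}$ we can see that $\int_0^t 1_N(B_s(\omega))V(B_s(\omega))ds < \infty$ for arbitrary $N\ge1$. Let $\omega\in\mathscr{X}\setminus\mathcal{N}$. There exists $N = N(\omega)\ge1$ such that $\sup_{0\le s\le t}|B_s(\omega)| < N$. Hence

\[
\int_0^t V(B_s(\omega))ds = \int_0^t 1_N(B_s(\omega))V(B_s(\omega))ds < \infty,\quad \omega\in\mathscr{X}\setminus\mathcal{N}.
\]

Thus the lemma follows.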


Rt
When V KKato , it can be seen that the Rexponent e 0 V (Bs )ds is integrable
with respect to P x , and the supremum of Ex [e 0 V (Bs )ds ] in x is nite. We shall
t

check it.

Lemma 2.11. Let V KKato . Then there exists > 0 and > 0 such that
Rt
V (Bs )
sup Ex [e 0 ] < et . (2.40)
x

Furthermore when V Lp (Rd ) with




=1, d = 1,
p

> d , d 2,
2
there exists C such that

C
V
p . (2.41)

Proof. By Proposition 2.6, there exists t > 0 such that


 t
t = sup Ex V (Bs ) < 1
x 0

for all t t , and t 0 as t 0. It is known as Khasminskiis lemma that
Rt 1
V (Bs )
sup Ex [e 0 ]< (2.42)
x 1 t

for all t t . By means of the Markov property of the Brownian motion we have
R 2t R t R t
 2
1
Ex [e 0 V (Bs ) ] = Ex [e 0 V (Bs ) EBt [e 0 V (Bs ) ]] .
1 t
Repeating this procedure, we can see that
Rt
 [t/t ]+1
1
sup Ex [e 0 V (Bs ) ] (2.43)
x 1 t
1
for all t > 0, where [z] = max{w Z | w z}. Set = ( 1 t
) and =

1 1/t
log( 1t ) . Then (2.40) is proven. Next we prove (2.41). Suppose V Lp (Rd ).
In the case of d = 1, we directly see that
 t  t
t = Ex [V (Bs )]ds (2s)1/2 ds
V
1 . (2.44)
0 0

Next, we let d 2 and q be such that + 1q = 1. The following estimates are due
1
p
to [1, Proof of Theorem 4.5]. Let an arbitrary  > 0 be xed. We have
 t
Ex [|V (Bs )|]ds
0
 t  t
= Ex [|V (Bs )||Bs x| ]ds + Ex [|V (Bs )||Bs x|< ]ds
0 0
 
d/2 |y|2 /(2t)
t (2t) e |V (x + y)|dy + e t
Ex [es |V (Bs )||Bs x|< ].
|y| 0

It is easy to see that


  1/q
2 2
t (2t)d/2 e|y| /(2t) |V (x + y)|dy t(2)d/2 eq|y| /2 dy
V
p .
|y|
(2.45)

Let f be the integral kernel of ( 12 p2 + 1)1 . Then we see that


 
x s
dsE [e |V (Bs )||Bs x|< ] f (x y)|V (y)|dy.
0 |xy|<

Since |f (z)| C(z) for |z| 12 with some constant C, we have


 
dsEx [es |V (Bs )||Bs x|< ] C (x y)|V (y)|dy
0 |xy|<

and then
  1/q

s
x
dsE [e |V (Bs )||Bs x|< ] C q
(z) dy
V
p (2.46)
0 |z|<

by the H older inequality. Hence from(2.44)(2.46), there exists Ct () such that
t Ct ()
V
p and limt0 Ct () = C( |z|< (z)q dy)1/q . Then for suciently small
1 1/T
T and  we have ( 1CT () V p ) and then there exists DT such that
DT
V
p . Then (2.41) follows.
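The step from Khasminskii's bound on a short time interval to an exponential bound for all times is pure arithmetic and can be sketched as follows. Here `alpha` stands for the small-time quantity $\sup_x\mathbb{E}^x\int_0^{t^*}|V(B_s)|ds < 1$ and `t_star` for the corresponding time; both names, as well as the labels `gamma` and `beta` for the resulting constants, are illustrative labels chosen for this sketch. Iterating the one-step bound $1/(1-\alpha)$ via the Markov property gives the power $\lfloor t/t^*\rfloor + 1$, which is dominated by $\gamma e^{\beta t}$.

```python
import math


def khasminskii_constants(alpha, t_star):
    """Return (gamma, beta) such that
    (1/(1-alpha))**(floor(t/t_star) + 1) <= gamma * exp(beta * t) for t >= 0."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("need 0 <= alpha < 1")
    gamma = 1.0 / (1.0 - alpha)
    beta = math.log(gamma) / t_star
    return gamma, beta


def iterated_bound(alpha, t_star, t):
    """Bound obtained by applying the one-step estimate floor(t/t_star) + 1 times."""
    return (1.0 / (1.0 - alpha)) ** (math.floor(t / t_star) + 1)


gamma, beta = khasminskii_constants(alpha=0.5, t_star=0.25)
for t in (0.0, 0.3, 1.0, 2.5, 10.0):
    print(t, iterated_bound(0.5, 0.25, t), gamma * math.exp(beta * t))
```

The two columns agree whenever $t$ is an integer multiple of `t_star` and otherwise the exponential bound is strictly larger, since $\lfloor t/t^*\rfloor\le t/t^*$.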

The functional integral representation (2.29) introduced in Proposition 2.4 is well defined not only for bounded external potentials and $\rho_\mu^j$ satisfying (2.18) but also for more general external potentials and $\rho_\mu^j$. We can identify the Hilbert space $\mathcal{H}$ with $L^2(\mathbb{R}^d\times Q)$ with the scalar product $(F,G) = \int dx\,(F(x),G(x))_{L^2(Q)}$. The functional integral representation of $(F, e^{-tH_{\rm PF}}G)$ is also given by

\[
(F, e^{-tH_{\rm PF}}G) = \int dx\left(F(x),\ \mathbb{E}^x\left[e^{-\int_0^tV(B_s)ds}J_0^*e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\right]\right)_{L^2(Q)}.
\]

From this expression we shall define $(T_t)_{t\ge0}$ by (2.47) below.

Assumption 2.12. We suppose that $V\in\mathcal{K}$ and $\rho_\mu^j = \rho_\mu^j(k,x)\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

Note that under Assumption 2.12, $\mathscr{A}_\mu(x)$ is not relatively bounded with respect to $H_f(m)$ in the case of $m = 0$. Under Assumption 2.12, however, we define the family of linear operators $\{T_t\}_{t\ge0}$ on $\mathcal{H}$ by

\[
T_tF(x) = \mathbb{E}^x\left[e^{-\int_0^tV(B_s)ds}J_0^*e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\right] \tag{2.47}
\]

for all $t\ge0$. Note that $K_t$ is well defined since $\rho_\mu^j\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

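Lemma 2.13. Suppose Assumption 2.12. Then $T_t$ is bounded on $\mathcal{H}$ for $t\ge0$.

Proof. By the definition of $T_t$ we have

\[
\|T_tF\|_{\mathcal{H}}^2 \le \int dx\,\mathbb{E}^x\left[e^{-2\int_0^tV(B_s)ds}\right]\mathbb{E}^x\left[\|F(B_t)\|_{L^2(Q)}^2\right].
\]

Since $V\in\mathcal{K}$, $C = \sup_x\mathbb{E}^x\left[e^{-2\int_0^tV(B_s)ds}\right] < \infty$. Thus $\|T_tF\|_{\mathcal{H}}^2\le C\|F\|_{\mathcal{H}}^2$ follows.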

In what follows we shall show that $\{T_t\}_{t\ge0}$ is a strongly continuous one-parameter symmetric semigroup on $\mathcal{H}$. In order to show it we introduce the second quantization of the Euclidean group $\{u_t, r\}$ on $L^2(\mathbb{R}^{d+1})$, where the time shift operator $u_t: L^2(\mathbb{R}^{d+1})\to L^2(\mathbb{R}^{d+1})$ is defined by

\[
u_tf(x_0,x) = f(x_0 - t, x)
\]

and the time reflection $r: L^2(\mathbb{R}^{d+1})\to L^2(\mathbb{R}^{d+1})$ by

\[
rf(x_0,x) = f(-x_0,x)
\]

for $(x_0,x)\in\mathbb{R}\times\mathbb{R}^d$. The second quantizations of $u_t$ and $r$ are denoted by $U_t: L^2(Q_E)\to L^2(Q_E)$ and $R: L^2(Q_E)\to L^2(Q_E)$, respectively. Note that $r^* = r$, $rr^* = r^*r = 1$, $u_t^* = u_{-t}$ and $u_t^*u_t = 1$, and that $U_t$ and $R$ are unitary. The time shift $u_t$, the time reflection $r$ and the isometry $j_t: L^2(\mathbb{R}^d)\to L^2(\mathbb{R}^{d+1})$ satisfy the lemma below.

Lemma 2.14. (1) ut js = js+t and Ut Js = Js+t .


(2) rjs = js r and RUs = Us R.

Proof. By the denition of js we have


 
1 i(k0 (x0 s)+kx) (k)
js f (x) = (d+1)/2
e  f(k)dk0 dk.
(2) (k)2 + |k0 |2
Then ut js = js+t follows, and Ut Js = (ut )(js ) = (ut js ) = (js+t ) = Js+t . (2) is
similarly proven.

Lemma 2.15. Suppose Assumption 2.12. Then it follows that $T_tT_s = T_{t+s}$ for all $t, s\ge0$.

Proof. By the definition of $T_t$, we have


Rs
Ts Tt F (x) = Ex [e V (Br )dr i AE (Ks )
0 J0 e Js EBs
Rt
[e 0 V (Br )dr J0 ei AE (Kt ) Jt F (Bt )]]. (2.48)
Let Es = Js Js , s R, be the family of projections. By the formulae Js J0 =
Js Js Us

= Es Us and Jt = Us Jt+s , (2.48) is expressed as
Rs
Ts Tt F (x) = Ex [e V (Br )dr i AE (Ks )
0 J0 e Es EBs
Rt
[e 0 V (Br )dr Us
i AE (Kt )
e Us Jt+s F (Bt )]].
Since Us is unitary, we have

i AE (Kt ) AE (u
s Kt )
Us e Us = ei (2.49)
as an operator, where the exponent is given by

d1 d  t

us Kt = jr+s j (Br ) dBr .
j=1 =1 0

Let (Ft )t0 be the natural ltration of the Brownian motion (Bt )t0 . By the Markov
property of the projections Et s ([24]), we can neglect Es in (2.49) and we have
Rs
Ts Tt F (x) = Ex [e 0
V (Br )dr i AE (Ks )
J0 e
R
ss+t V (Br )dr i AE (Kss+t )
Ex [e e Jt+s F (Bs+t )|Fs ]],
where Ex [ | Fs ] denotes the conditional expectation with respect to (Ft )t0 and

d1 d  s+t
s+t
Ks = jr j (Br ) dBr .
j=1 =1 s

Hence we obtain that

\[
T_sT_tF(x) = \mathbb{E}^x\left[e^{-\int_0^{s+t}V(B_r)dr}J_0^*e^{-i\sqrt{\alpha}\mathscr{A}_E(K_{s+t})}J_{s+t}F(B_{s+t})\right] = T_{s+t}F(x)
\]

and the lemma is proven.

Next we check the symmetry of $T_t$.

Lemma 2.16. Suppose Assumption 2.12. Then it follows that $T_t^* = T_t$ for $t\ge0$.

Proof. By the functional integral representation and the unitarity of the time-
reection R on L2 (QE ), we have
 Rt
(F, Tt G) = dxEx [e 0 V (Bs )ds (RJ0 F (B0 ), Rei AE (Kt ) RRJt G(Bt ))]
 Rt
= dxEx [e 0
V (Bs )ds
(J0 F (B0 ), ei AE (rKt )
Jt G(Bt ))],
d1 d  t
=1 0 js (Bs ) dBs . By means of the
j
where the exponent is rKt = j=1
time-shift Ut we also have
 Rt
(F, Tt G) = dxEx [e 0 V (Bs )ds (Ut J0 F (B0 ), Ut ei AE (rKt ) Ut Ut Jt G(Bt ))]
 Rt
= dxEx [e 0
V (Bs )ds
(Jt F (B0 ), ei AE (ut rKt )
J0 G(Bt ))],
d1 d  t
where ut rKt = j=1 =1 0 jts j (Bs ) dBs . Finally we set B s = Bts Bt ,
which equals to Bs in law. Then we have
 Rt

(F, Tt G) = dxE0 [e 0 V (x+Bs )ds (Jt F (x), ei AE (ut rK t ) J0 G(x + B
t ))], (2.50)

where
 d 
d1  t 
d1 
n
u
t rK t =
s ) dB
jts j (x + B s = lim j (i)
0 n
j=1 =1 j=1 i=1

and limn is in the strong sense of L2 (X ; L2 (Rd+1 )) and


d 
 ti/n
j (i) = s ) dB
jtt(i1)/n j (x + B .
s
=1 t(i1)/n

Then exchanging dx and E0 in (2.50) we have
 Rt
0
(F, Tt G) = lim E dxe 0 V (x+Bs )ds
n

Ld1 Pn
AE ( j (i))
(Jt F (x), ei j=1 i=1 J0 G(x B
t ))


and changing variable x Bt to x in dx we have
 Rt
0
(F, Tt G) = lim E dxe 0 V (x+Bs )ds
n

L Pn
AE ( d1 i=1 j (i))
(Jt F (x + Bt ), ei j=1 J0 G(x)) ,

where
d 
 ti/n
j (i) =
jtt(i1)/n j (x + Bs ) dBs .
=1 t(i1)/n

and

n d 
 t
lim j (i) =
j (x + Bs ) dBs .
n 0
i=1 =1

We thus can nally see that


 Rt
(F, Tt G) = dxEx [e 0 V (Bs )ds (Jt F (Bt ), ei AE (Kt ) J0 G(B0 ))] = (Tt F, G).

Then the lemma follows.

Lemma 2.17. Suppose Assumption 2.12. Then $T_t$ is strongly continuous in $t\ge0$ on $\mathcal{H}$.

Proof. Since $\|T_t\|$ is uniformly bounded and the semigroup property $T_tT_s = T_{t+s}$ holds, it is enough to show the weak continuity at $t = 0$. By the Lebesgue dominated convergence theorem it suffices to show that

AE (Kt )
Ex [(J0 F (B0 ), ei Jt G(Bt )] Ex [(J0 F (B0 ), J0 G(B0 )]
as t 0 for each x Rd . Let

AE (Kt )
Ex [(J0 F (B0 ), ei Jt G(Bt )] Ex [(J0 F (B0 ), J0 G(B0 )]

= Ex [(J0 F (B0 ), ei AE (Kt ) Jt G(Bt )] Ex [(J0 F (B0 ), ei AE (Kt ) Jt G(B0 )]

+ Ex [(J0 F (B0 ), ei AE (Kt ) Jt G(B0 )] Ex [(J0 F (B0 ), ei AE (Kt ) J0 G(B0 )]

+ Ex [(J0 F (B0 ), ei AE (Kt ) J0 G(B0 )] Ex [(J0 F (B0 ), J0 G(B0 )].
The first and second terms of the right-hand side above converge to zero as $t\to0$, since $B_t$ and $J_t$ are continuous in $t$. We will check that the third term also goes to zero. We have

|Ex [(J0 F (B0 ), ei AE (Kt ) J0 G(B0 )] Ex [(J0 F (B0 ), J0 G(B0 )]|

(Ex [
AE (Kt )J0 F (B0 )
2 ])1/2 (Ex [
G(Bt )
2 ])1/2 .

We have a bound

Ex [
AE (Kt )J0 F (B0 )
2 ]
N + 1F (x)
2 E0 [
Kt (x)
2L2 (Rd+1 ) ],
d1 d t j
where Kt (x) = j=1 =1 0 js (x + Bs ) dBs . We have
 
 t
d1 
d
1
0
E [
Kt (x)
2L2 (Rd+1 ) ] dsE 2 x

j (Bs )
2 2
+
(Bs )ds
.
j
(2.51)
j=1 0 =1
2

Then limt0 Ex [
AE (Kt )J0 F (B0 )
2 ] = 0 follows and the proof is complete.

Theorem 2.18. Suppose Assumption 2.12. Let $V\in\mathcal{K}$. Then $\{T_t\}_{t\ge0}$ is a strongly continuous one-parameter symmetric semigroup. In particular, there exists a self-adjoint operator $K_{\rm PF}$ bounded below such that

\[
e^{-tK_{\rm PF}} = T_t,\quad t\ge0, \tag{2.52}
\]

and

\[
e^{-tK_{\rm PF}}F(x) = \mathbb{E}^x\left[e^{-\int_0^tV(B_s)ds}J_0^*e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\right]. \tag{2.53}
\]

Proof. This follows from Lemmas 2.15–2.17.

Definition 2.19 (Generalized Pauli–Fierz Hamiltonians). Suppose Assumption 2.12. We define a generalized Pauli–Fierz Hamiltonian with an external potential $V\in\mathcal{K}$ by the self-adjoint operator $K_{\rm PF}$ in (2.52).

Corollary 2.20. Suppose Assumption 2.12. Let us identify $\mathcal{H}$ with $L^2(\mathbb{R}^d\times Q)$. Then under this identification $e^{i(\pi/2)N}e^{-tK_{\rm PF}}e^{-i(\pi/2)N}$, $t > 0$, is positivity improving. In particular the ground state of $K_{\rm PF}$ is unique if it exists.

Proof. By (2.53), we can see that

\[
(F, e^{i(\pi/2)N}e^{-tK_{\rm PF}}e^{-i(\pi/2)N}G) = \int dx\,\mathbb{E}^x\left[\left(J_0F(x),\ e^{-\int_0^tV(B_s)ds}e^{i(\pi/2)N}e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-i(\pi/2)N}J_tG(B_t)\right)\right].
\]

Since in [15] it is shown that $e^{i(\pi/2)N}e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-i(\pi/2)N}$ is positivity improving, $(F, e^{i(\pi/2)N}e^{-tK_{\rm PF}}e^{-i(\pi/2)N}G) > 0$ for all $0\le F, G\in\mathcal{H}$ with $F\ne0$ and $G\ne0$. Then the corollary follows.

Let $L^p(\mathbb{R}^d; L^2(Q)) = \{f:\mathbb{R}^d\to L^2(Q)\mid\int\|f(x)\|_{L^2(Q)}^pdx < \infty\}$ and set the $L^p$ norm as $\|F\|_p = \left(\int\|F(x)\|_{L^2(Q)}^pdx\right)^{1/p}$.

Corollary 2.21. Suppose Assumption 2.12. Then $e^{-tK_{\rm PF}}$ can be extended to a bounded operator from $L^p(\mathbb{R}^d; L^2(Q))$ to itself for $1\le p\le\infty$.

Proof. Let $p\ne\infty$, $p\ne1$ and $\frac1p + \frac1q = 1$. Then we have

\[
\|e^{-tK_{\rm PF}}F(x)\|_{L^2(Q)}^p \le \left(\mathbb{E}^x\left[e^{-\int_0^tV(B_s)ds}\|F(B_t)\|\right]\right)^p \le \left(\mathbb{E}^x\left[e^{-q\int_0^tV(B_s)ds}\right]\right)^{p/q}\mathbb{E}^x\left[\|F(B_t)\|_{L^2(Q)}^p\right].
\]

Thus we have

\[
\int\|e^{-tK_{\rm PF}}F(x)\|_{L^2(Q)}^pdx \le C\int\|F(x)\|_{L^2(Q)}^pdx.
\]

In the cases $p = \infty$ and $p = 1$, the proof is similar.

2.4. Quadratic form and $K_{\rm PF}$

By the functional integral representation, we have the so-called diamagnetic inequality

\[
|(F, e^{-tH_{\rm PF}}G)| \le (|F|, e^{-t(H_p + H_f(m))}|G|). \tag{2.54}
\]

By means of the diamagnetic inequality, we can see that when $|V|^{1/2}$ is relatively bounded with respect to $(p^2/2)^{1/2}$ with a relative bound $a\ge0$, it is also relatively bounded with respect to $\left(\frac12(p + \sqrt{\alpha}\mathscr{A})^2 + H_f(m)\right)^{1/2}$ with a relative bound $a$. See [14]. Let $V = V_+ - V_-$ be such that $V_+\in L^1_{\rm loc}(\mathbb{R}^d)$ and $V_-$ is infinitesimally small with respect to $p^2/2$ in the sense of forms. Then under Assumption 2.1 we can define the self-adjoint operator

\[
H_{\rm PF} = \frac12(p + \sqrt{\alpha}\mathscr{A})^2 + H_f(m)\ \dot{+}\ V_+\ \dot{-}\ V_- \tag{2.55}
\]

by the quadratic form sum $\dot{\pm}$.

Theorem 2.22. Let $V\in\mathcal{K}$ and suppose Assumption 2.1. Then $K_{\rm PF} = H_{\rm PF}$, where $H_{\rm PF}$ is defined by (2.55).

Proof. The functional integral representation of $e^{-tH_{\rm PF}}$ for (2.55) can be given by the procedure below [25, 14]. Let

\[
V_{n,m}(x) =
\begin{cases}
n, & V(x)\ge n,\\
V(x), & -m < V(x) < n,\\
-m, & V(x)\le -m.
\end{cases}
\]

Thus $V_{n,m}\in L^\infty(\mathbb{R}^d)$, and then the functional integral representation of $e^{-tH_{\rm PF}}$ with external potential $V_{n,m}$, which is denoted by $e^{-tH_{\rm PF}(n,m)}$, is given by Proposition 2.4. By the monotone convergence theorem for forms, we can see that $\lim_{n\to\infty}\lim_{m\to\infty}e^{-tH_{\rm PF}(n,m)} = e^{-tH_{\rm PF}}$, where $H_{\rm PF}$ is defined by (2.55). On the other hand, the functional integral representation of $I = (F, e^{-tH_{\rm PF}(n,m)}G) = \Re I + i\Im I$ is divided into the positive part and the negative part as

\[
I = (\Re I)_+ - (\Re I)_- + i(\Im I)_+ - i(\Im I)_-,
\]

and each term converges as $n, m\to\infty$ by the monotone convergence theorem for integrals. Then the functional integral representation is given by

\[
\begin{aligned}
(F, e^{-tH_{\rm PF}}G)
&= \lim_{n,m\to\infty}\int dx\,\mathbb{E}^x\left[\left(J_0F(B_0),\ e^{-\int_0^tV_{n,+}(B_s)ds}e^{+\int_0^tV_{m,-}(B_s)ds}e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\right)\right]\\
&= \int dx\,\mathbb{E}^x\left[\left(J_0F(B_0),\ e^{-\int_0^tV(B_s)ds}e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\right)\right].
\end{aligned}
\tag{2.56}
\]

Since $V\in\mathcal{K}$, we see that $V_+\in L^1_{\rm loc}(\mathbb{R}^d)$ and $V_-$ is infinitesimally small with respect to $p^2/2$ in the sense of forms [6, Theorem 1.12]. Moreover $(F, e^{-tK_{\rm PF}}G)$ equals the right-hand side of (2.56). Then we conclude that $e^{-tH_{\rm PF}} = e^{-tK_{\rm PF}}$. Thus the theorem follows.
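The two-sided truncation used in the proof can be sketched in a few lines. The following is a minimal illustration, assuming a scalar potential value `v` and cutoffs `n`, `m > 0`; the function name `truncate` and the sample potential `sample_potential` are hypothetical and only serve to show that the truncated values are bounded, increase with `n`, decrease with `m`, and eventually agree with the untruncated value.

```python
def truncate(v, n, m):
    """Clamp a potential value v to the interval [-m, n], with n, m > 0."""
    if v >= n:
        return float(n)
    if v <= -m:
        return float(-m)
    return float(v)


def sample_potential(x):
    """Hypothetical potential on the line: singular and negative near 0, growing at infinity."""
    return x * x - 1.0 / abs(x)


for x in (0.01, 0.5, 1.0, 3.0):
    v = sample_potential(x)
    # Small cutoffs clip the value; large cutoffs leave it unchanged.
    print(x, v, truncate(v, 2, 2), truncate(v, 1000, 1000))
```

Monotonicity in each cutoff separately is what allows the limits in `n` and `m` to be taken by monotone convergence.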

3. Pointwise Spatial Exponential Decays


In this section, we show the spatial exponential decay of bound states of KPF . Let
b be a bound state of KPF associated with eigenvalue E;
KPF b = Eb . (3.1)

Assumption 3.1. We say that V = W + U E if and only if W L1loc (Rd ),


inf x W (x) > and 0 > U Lp (Rd ) for some

=1, d = 1,
p
> d , d 2.
2
Let W + U E and set W = W+ W , where W 0 is given by W+ (x) =
max{0, W (x)} and W (x) = min{0, W (x)}. Since U Lp (Rd ) KKato , W
L KKato and W+ L1loc (Rd ), we note that E K . We set
W = inf W (x). (3.2)
x

A fundamental estimate to show the spatial exponential decay of bound states is the lemma below.

Lemma 3.2. Let V = W + U E . Suppose that j Cb1 (Rdx ; L2 (Rdk )). Then for
arbitrary t, a > 0 and each 0 < < 1/2, there exist constants D1 , D2 and D3 such
that
a2

b (x)
L2 (Q) D1 eD2 U p t eEt (D3 e 4 t etW + etWa (x) )
b
H , (3.3)
where Wa (x) = inf{W (y)||x y| < a}.

Proof. It is a slight modication of [5]. Since b = etE etKPF b , we have


Rt
b (x) = Ex [J0 e 0
V (Bs ) i AE (Kt )
e Jt b (Bt )]etE . (3.4)

Hence for almost every x it follows that


Rt

b (x)
L2 (Q) etE Ex [e 0
V (Bs )

b (Bt )
L2 (Q) ]. (3.5)

By this, we have
Rt Rt

b (x)
L2 (Q) etE (Ex [e4 0
W (Bs )ds
])1/4 (Ex [e4 0
U(Bs )ds
])1/4
b
H ,

where we used the Schwartz inequality and



2
Ex [
b (Bt )
2L2 (Q) ] = (2t)d/2 e|y| /2t
b (x + y)
2L2 (Q) dy

2
= e|z|
b (x + 2tz)
2L2 (Q) dz


b
2H .

Let A = { X | sup0st |Bs ()| > a}. Then it follows from a martingale inequal-
ity that

r 2 /2 d1 2
E0 [1A ] 2P 0 (|Bt | a) = 2(2)d/2 Sd1 e r dx ea /t
a/ t

with some for each 0 < < 1/2. Thus it follows that
Rt Rt Rt
Ex [e4 0
W (Bs )ds
] = E0 [1A e4 0
W (Bs +x)ds
] + Ex [1Ac e4 0
W (Bs )ds
]
e4tW E0 [1A ] + e4tWa (x)
2
ea /t 4tW
e + e4tWa (x) .
Rt
Next we estimate Ex [e4 0 U(BRs )ds ]. Since U is in Kato-class, there exist constants
D1 and D2 such that Ex [e4 0 U(Bs )ds ] D1 eD2 U p t by Lemma 2.11. Setting
t

D3 = 1/4 , we obtain the lemma by the inequality (a + b)1/4 a1/4 + b1/4 for
a, b 0.

For V = W + U E , we dene

= lim inf V (x). (3.6)


|x|

Since U Lp (Rd ), lim inf |x| U (x) = 0 and hence

= lim inf W (x). (3.7)


|x|

Moreover W holds.

Theorem 3.3. Suppose that V = W + U E and j Cb1 (Rdx ; L2 (Rdk )).

Conning Case 1. Suppose that W (x) |x|2n outside a compact set K for some
n > 0 and some > 0. Let 0 < < 1/2. Then there exists a constant C1 such that
" c #

b (x)
L2 (Q) C1 exp |x|n+1
b
H , (3.8)
16
where c = inf xRd \K W |x| (x)/|x|2n .
2

Conning Case 2. Suppose that lim|x| W (x) = . Then there exist constants
C and such that


b (x)
L2 (Q) C exp(|x|)
b
H . (3.9)

Non-Conning Case. Suppose that > E and > W . Let 0 < < 1. Then
there exists a constant C2 such that
 
( E)

b (x)
L2 (Q) C2 exp |x|
b
H . (3.10)
8 2 W

Proof. Since supx


b (x)
L2 (Q) < , it is enough to show all the statements for
suciently large |x|.

Conning Case 1. Note that W |x| (x) c|x|2n for x Rd \K. Then we have
2
bounds for x Rd \K:

|x|W |x| (x)1/2 c|x|n+1 , (3.11)


2

|x|W |x| (x)1/2 c|x|1n . (3.12)


2

|x|
Inserting t = t(x) = W |x| (x)1/2 |x| and a = a(x) = 2 in (3.3), we have
2

1n

b (x)
e 16 c|x| D1 e(D2 U p +E)c|x|
n+1

1n
|W |
+ e(1 16 )c|x|
n+1
(D3 ec|x| )
b
H (3.13)

for x Rd \K. Then (3.8) follows.

Non-Conning Case. Rewrite formula (3.3) as


a2

b (x)
D1 eD2 U p t (D3 e 4 t et(W E) + et(Wa (x)E) )
b
H . (3.14)

Then altering both = lim inf |x| (W (x)) and > W , it is possible to
choose decomposition V = W + U E such that
U
p ( E)/2, since
lim inf |x| U (x) = 0. Inserting t = t(x) = |x| and a = a(x) = |x|
2 in (3.14),

we have
|x|(W |x| (x)E)

b (x)
D1 e U p |x| (D3 e 16 |x| e|x|(W E) + e

2 )
b
H
( 16

+(W E) 12 (E))|x|
D1 (D3 e
((W |x| (x)E) 12 (E))|x|
+e 2 )
b
H .

/16
Choosing  = , the exponent on the rst term above turns out to be
W

1 1
+ (W E) ( E) = ( E).
16 2 2
Moreover we see that lim inf |x| W |x| (x) = , and obtain
2


b (x)
L2 (Q) C2 e 2 (E)|x|
b
H


for suciently large |x|. Then (3.10) follows.

Confining Case 2. Finally, we prove confining case 2. In this case for arbitrary
c > 0 there exists N such that W |x| (x) c for all |x| > N . Inserting t = t(x) = |x|
2
|x|
and a = a(x) = 2 in (3.3), we obtain that
|x|(W |x| (x)E)

b (x)
D1 e U p |x| (D3 e 16 |x| e|x|(W E) + e

2 )
b
H
( 16  U p +(W E))|x|
+ e|x|(cE U p) )
b
H

D1 (D3 e

for |x| > N . Choosing suciently large c and suciently small  such that


U
p + (W E) > 0,
16
c E
U
p > 0,

we have
b (x)
C e |x| for suciently large |x|. Then (3.9) follows.

We give several remarks on Theorem 3.3.



Independence of Bose Mass m. Suppose that (k) = |k|2 + m2 . Let b be
a normalized ground state of KPF :
b
H = 1, and Em = inf (KPF ). It is shown
that there exist also constants C1 and C2 such that


b (x)
L2 (Q) C1 eC2 |x| ,
n
n 1,

by Theorem 3.3. Since the ground state energy Em is decreasing in m, we can take
C1 and C2 independent of m < M with some M . This fact is nontrivial and useful
to show the existence of ground states of the PauliFierz model with m = 0. This
is used in, e.g., [13].

Condition W < . When inf x V (x) < , it is possible to decompose V =


W + U E such that W < . In fact, for arbitrary  > 0, there exists y Rd
such that
V (y) < inf V (x) + .
x

Suppose that inf x V (x) +  < . Let Oy Rd be a neighborhood of y. Then dene



U (x), x Oy ,
u(x) =
0, y Oy .

Let W = W + u and U = U u. This yields that V = W E and W


+U <
inf x V (x) +  < .
Threshold. The threshold is defined by
= lim inf (F, HPF F ),
R F DR , F =1

where DR = {F D(HPF ) | F (x) = 0, |x| < R}. We note that , and


= = in conning cases.
The bound given in [10] is
e+C||1(,] (HPF )
H < , where C 2 + < .
From this the bound

dx
e+|x| b (x)
2L2 (Q) C
b
H (3.15)

follows, where

< E.
Theorem 3.3, however, gives pointwise bounds:

b (x)
L2 (Q) C1 exp(C2 |x| )
b
H , 1. (3.16)

In particular, the superexponential decay,


b (x)
C1 eC2 |x|
b
H , is shown
n+1

for the case of polynomially increasing potentials (Conning Case 1), while in non-
conning cases, we show that in (3.16), = 1 and
E
C2 < . (3.17)
8 2 E W
We give examples of external potentials.
Example 3.4 (Conning Potentials). Let V = V+ V be such that V+
Lploc (Rd ) and V Lp (Rd ), where

=1, d = 1,
p
> d , d 2.
2
In this case V E .

Example 3.5 (Coulomb Potentials). Suppose Assumption 2.1. Then


HPF = KPF .
Let V = Z/|x| be the Coulomb potential. Then inf (Hp ) = Z/2. We have
( 1, HPF 1)H = (, (Hp + Ve ))L2 (Rd ) for D( 12 p2 ), where

  j
d1 d
Ve (x) = ( (x), j (x))L2 (Rd ) .
2 j=1 ,=1
d1 d
Let V = supx | j=1 ,=1 ( (x), (x))L2 (Rd ) |.
j j
Thus

inf (HPF ) (Z V ).
2
When Z > V , inf (HPF ) < lim|x| V (x) = 0 follows for all values of coupling
constant . Then ground states of HPF decay as C1 eC2 |x| pointwise for all values
of coupling constants.

Acknowledgments
FH acknowledges support of Grant-in-Aid for Science Research (B) 20340032
from JSPS and Grant-in-Aid for Challenging Exploratory Research 22654018 from
JSPS.

Appendix
In this appendix, we show the unitary equivalence between $H_{\rm PF}$ and the Pauli–Fierz Hamiltonian defined on
$L^2(\mathbb{R}^d)\otimes\mathscr{F}$,
 d1 2 d d1 2 d
where F = n=0 ns ( L (R )) is the Boson Fock space over L (R ).
Let = {1, 0, 0, . . .} F be the Fock vacuum. The annihilation operator and
the creation operator in F are denoted by a (f ) and a(f ), respectively, where
d1 2 d
f = (f1 , . . . , fd1 ) L (R ). They satisfy canonical commutation relations:

d1
[a(f ), a (g)] = (fj , gj )L2 (Rd ) ,
j=1

[a (f ), a (g)] = 0 = [a(f ), a(g)].
The eld operator in F is given by

= 1 (a ()
A()

+ a()),
2

where (k)
= (k). The quantized radiation eld is dened by A = Rd A (x)dx
under the identication L2 (Rd ) F
= L2 (Rd ; F ) and A (x) = A(
(x)), where a

d1 j 
cuto function is given by (x) = (k, x) = j=1 (k)(k, x)/ (k). Finally
the free eld Hamiltonian is dened by

 k
i
d() = 1 1. (A.1)
$ %& '
k=0 i=1 k
2
Then the PauliFierz Hamiltonian in L (R ) F is given by
d


H PF = 1 (p 1 + A)2 + V 1 + 1 d(). (A.2)
2
Suppose that V is relatively bounded with respect to 12 p2 with a relative bound
strictly smaller than one, and that j Cb1 (Rdx ; L2 (Rdk )) and

j , j , j / , x j , x j / L (Rdx ; L2 (Rdk )). (A.3)
See Assumption 2.1. Then H PF is self-adjoint on D(p2 1) D(1 d()). Now let
us see the relationship between L2 (Q) and F . Let U : F L2 (Q) be dened by
U = 1,
U : A(1 ) A(n ) : = : A (1 ) A (n ):,
where the Wick product on the left-hand side is defined by moving all the creation operators to the left and annihilation operators to the right without any commutation relations, while the Wick product on the right-hand side is defined recursively by
: A () : = A ()
and
(
n (
n
1
n (
: A () A (j ) : = A () : A (j ) : (fk , f ) : A (j ) : .
2
j=1 j=1 k=1 j =k

The unitary operator U can be extended to the unitary operator from F to L2 (Q),
and it also implements
U d()U 1 = Hf (m).
Then under (A.3) it follows that (1 U ) maps D( 12 p2 1) D(1 d()) to
D( 12 p2 1) D(1 Hf (m)) and
PF (1 U 1 ) = HPF .
(1 U )H (A.4)

References
[1] M. Aizenman and B. Simon, Brownian motion and Harnack's inequality for Schrödinger operators, Comm. Pure Appl. Math. 35 (1982) 209–270.
[2] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207 (1999) 249–290.
[3] K. Broderix, D. Hundertmark and H. Leschke, Continuity properties of Schrödinger semigroups with magnetic fields, Rev. Math. Phys. 12 (2000) 181–225.
[4] V. Betz, F. Hiroshima, J. Lőrinczi, R. A. Minlos and H. Spohn, Ground state properties of the Nelson Hamiltonian: a Gibbs measure-based approach, Rev. Math. Phys. 14 (2002) 173–198.
[5] R. Carmona, Pointwise bounds for Schrödinger operators, Comm. Math. Phys. 62 (1978) 97–106.
[6] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators (Springer-Verlag, 1987).
[7] C. Fefferman, J. Fröhlich and G. M. Graf, Stability of ultraviolet-cutoff quantum electrodynamics with non-relativistic matter, Comm. Math. Phys. 190 (1997) 309–330.
[8] C. Gérard, F. Hiroshima, A. Panati and A. Suzuki, Infrared divergence of a scalar quantum field model on a pseudo Riemannian manifold, Interdiscip. Inform. Sci. 15 (2009) 399–421.
[9] M. Gubinelli, Gibbs measures for self-interacting Wiener paths, Mark. Proc. Rel. Fields 12 (2006) 747–766.
[10] M. Griesemer, Exponential decay and ionization thresholds in non-relativistic quantum electrodynamics, J. Funct. Anal. 210 (2004) 321–340.
[11] M. Griesemer, E. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics, Invent. Math. 145 (2001) 557–595.
[12] D. Hasler and I. Herbst, On the self-adjointness and domain of Pauli–Fierz type Hamiltonians, Rev. Math. Phys. 20 (2008) 787–800.
[13] T. Hidaka, On the existence of ground states for the Pauli–Fierz model with a variable mass, preprint (2010).
[14] F. Hiroshima, Functional integral representation of a model in quantum electrodynamics, Rev. Math. Phys. 9 (1997) 489–530.
[15] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics II, J. Math. Phys. 41 (2000) 661–674.
[16] F. Hiroshima, Essential self-adjointness of translation invariant quantum field models for arbitrary coupling constants, Comm. Math. Phys. 211 (2000) 585–613.
[17] F. Hiroshima, Self-adjointness of the Pauli–Fierz Hamiltonian for arbitrary values of coupling constants, Ann. Henri Poincaré 3 (2002) 171–201.
[18] F. Hiroshima, Fiber Hamiltonians in nonrelativistic quantum electrodynamics, J. Funct. Anal. 252 (2007) 314–355.
[19] F. Hiroshima, T. Ichinose and J. Lőrinczi, Path integral representation for Schrödinger operator with Bernstein function of the Laplacian, preprint (2009).
[20] F. Hiroshima and J. Lőrinczi, Functional integral representations of the Pauli–Fierz model with spin 1/2, J. Funct. Anal. 254 (2008) 2127–2185.
[21] T. Ikebe, Eigenfunction expansion associated with the Schrödinger operators and their applications to scattering theory, Arch. Ration. Mech. Anal. 5 (1960) 1–34.
[22] J. Lőrinczi, R. A. Minlos and H. Spohn, The infrared behaviour in Nelson's model of a quantum particle coupled to a massless scalar field, Ann. Henri Poincaré 3 (2002) 1–28.
[23] E. Nelson, Schrödinger particles interacting with a quantized scalar field, in Proc. Conf. Analysis in Function Space, eds. W. T. Martin and I. Segal (MIT Press, Cambridge 1964), p. 87.
[24] B. Simon, The $P(\phi)_2$ Euclidean (Quantum) Field Theory (Princeton Univ. Press, 1974).
[25] B. Simon, Functional Integral Representation and Quantum Physics (Academic Press, 1979).
[26] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc. 7 (1982) 447–526.
[27] B. Simon, Kato's inequality and the comparison of semigroups, J. Funct. Anal. 32 (1979) 97–101.
[28] H. Spohn, Ground state of quantum particle coupled to a scalar boson field, Lett. Math. Phys. 44 (1998) 9–16.
[29] H. Spohn, Dynamics of Charged Particles and their Radiation Field (Cambridge University Press, 2004).
November 16, 2010 15:28 WSPC/S0129-055X 148-RMP
J070-S0129055X10004193

Reviews in Mathematical Physics


Vol. 22, No. 10 (2010) 12091240

c World Scientic Publishing Company
DOI: 10.1142/S0129055X10004193

EIGENFUNCTION EXPANSIONS AND SPACETIME


ESTIMATES FOR GENERATORS IN DIVERGENCE-FORM

MATANIA BEN-ARTZI
Institute of Mathematics, Hebrew University, Jerusalem 91904, Israel
mbartzi@math.huji.ac.il

Received 28 January 2010


Revised 10 August 2010

Let $H = -\sum_{j,k=1}^{n} \frac{\partial}{\partial x_j} a_{j,k}(x) \frac{\partial}{\partial x_k}$ be a formally self-adjoint (elliptic) operator in
$L^2(\mathbb{R}^n)$, $n \geq 2$. The real coefficients $a_{j,k}(x) = a_{k,j}(x)$ are assumed to be bounded
and to coincide with $-\Delta$ outside of a ball. The paper deals with two topics: (i) An
eigenfunction expansion theorem, proving in particular that $H$ is unitarily equivalent to
$-\Delta$, and (ii) Global spacetime estimates for the associated inhomogeneous wave
equation, proved under suitable ("nontrapping") additional assumptions on the coefficients.
The main tool used here is a Limiting Absorption Principle (LAP) in the framework of
weighted Sobolev spaces, which holds also at the threshold.

Keywords: Divergence-type operator; limiting absorption principle; eigenfunction expansion; spacetime estimates.

Mathematics Subject Classification 2010: 35J15, 35L15, 47F05

1. Introduction

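Let $H = -\sum_{j,k=1}^{n} \partial_j a_{j,k}(x) \partial_k$, where $a_{j,k}(x) = a_{k,j}(x)$, be a formally self-adjoint
operator in $L^2(\mathbb{R}^n)$, $n \geq 2$. The notations $\partial_j = \frac{\partial}{\partial x_j}$ and $\partial_t = \frac{\partial}{\partial t}$ are used throughout
the paper.
We assume that the real measurable matrix function $a(x) = \{a_{j,k}(x)\}_{1 \leq j,k \leq n}$
satisfies, with some positive constants $a_1 > a_0 > 0$, $\Lambda_0 > 0$,
$$ a_0 I \leq a(x) \leq a_1 I, \quad x \in \mathbb{R}^n, \tag{1.1} $$
$$ a(x) = I \quad \text{for } |x| > \Lambda_0. \tag{1.2} $$
In what follows, we shall use the notation $H = -\nabla \cdot a(x) \nabla$.
We retain the notation $H$ for the self-adjoint (Friedrichs) extension associated
with the form $(a(x)\nabla\varphi, \nabla\psi)$, where $(\,,\,)$ is the scalar product in $L^2(\mathbb{R}^n)$. When
$a(x) \equiv I$ we set $H = H_0 = -\Delta$.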
Operators of this type appear in geometry (Laplacian on noncompact Rieman-
nian manifolds) as well as in physics, typically when physical parameters vary in
space (such as the acoustic propagator in a medium with variable speed of sound).


Under our assumptions (1.1) and (1.2), it follows that $\sigma(H)$, the spectrum of
$H$, is the half-axis $[0, \infty)$, and is entirely continuous. In particular, the equality
$(Hu, u) = (a(x)\nabla u, \nabla u)$ shows that $H$ has no eigenvalue at zero. In addition, if
the coefficient matrix $a(x)$ is smooth, the absence of singular continuous spectrum
follows from the classical work of Mourre ([58]). However, it seems that there is no
proof in the literature establishing the absolute continuity of the spectrum in our
case of non-smooth (and even discontinuous) coefficients. This fact is implied by
our Theorem A stated in Sec. 3 below.
The threshold $z = 0$ plays a special role in this setting, as we shall see later.
The mere fact that both $H$ and $H_0$ are spectrally absolutely continuous over
$[0, \infty)$ does not imply that they are "identical", namely, in the functional analytic
setting, that they are unitarily equivalent. Thus one question that arises is:

Question 1. Are the operators $H$ and $H_0$ unitarily equivalent, under the above
assumptions on the coefficients?
We next recall the definition of the wave operators related to $H$, $H_0$ [50,
Chap. X].
Consider the family of unitary operators
$$ W(t) = \exp(itH)\exp(-itH_0), \quad -\infty < t < \infty. $$
The strong limits
$$ W_\pm(H, H_0) = \text{s-}\lim_{t \to \pm\infty} W(t), \tag{1.3} $$
if they exist, are called the wave operators (relating $H$, $H_0$). These operators play
an important role in scattering theory. They are clearly isometries. If the range
of $W_+$ is equal to the absolutely continuous subspace of $H$ (which here is $L^2(\mathbb{R}^n)$
itself), we say that it is complete, with a similar definition for $W_-$. If either one is
complete, then it is unitary (in the case at hand) and provides a unitary equivalence
between $H$ and $H_0$. A second question that arises therefore is:
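
The step from completeness to unitary equivalence can be sketched as follows. This is the standard intertwining argument, written under the assumption that the limits (1.3) exist and with $W(t) = \exp(itH)\exp(-itH_0)$; it is an illustration and not a reproduction of the paper's proofs.

```latex
% For fixed s in R, use the group property W(t+s) = e^{isH} W(t) e^{-isH_0}:
\[
  e^{isH}\, W_\pm
  = \text{s-}\lim_{t\to\pm\infty} e^{i(t+s)H} e^{-i(t+s)H_0}\, e^{isH_0}
  = W_\pm\, e^{isH_0}.
\]
% Differentiating at s = 0 on D(H_0) gives the intertwining relation
\[
  H\, W_\pm = W_\pm\, H_0 .
\]
% If W_\pm maps L^2(R^n) isometrically onto L^2(R^n), it is unitary, hence
\[
  H = W_\pm\, H_0\, W_\pm^{*} .
\]
```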

Question 2. Do the wave operators exist and, if so, are they complete?
As noted above, a positive answer to this question entails a positive answer to
the first question.
Another aspect related to the spectral theory of $H$ is its associated eigenfunction
expansion. When available, it serves as an analytic tool which is sharper than the
abstract spectral theorem. In the case of $H_0$, the Fourier transform
$$ \mathcal{F}g(\xi) = \hat{g}(\xi) = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} g(x) e^{-i\xi x}\,dx, \tag{1.4} $$
serves to express $g(x)$ as
$$ g(x) = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} \hat{g}(\xi) e^{i\xi x}\,d\xi, \tag{1.5} $$

which can be viewed as an expansion of $g$ in terms of the "generalized eigenfunctions"
(or "modes") $\exp(i\xi x)$, associated with the "eigenvalues" $|\xi|^2$. Furthermore, the
operator $\mathcal{F}$ is unitary and $\mathcal{F} H_0 \mathcal{F}^{-1}$ is just multiplication by $|\xi|^2$ in Fourier space.
Such ("diagonalizing") expansions have been used extensively in quantum mechanics
(for example, the Airy transform associated with the Stark Hamiltonian). It is
therefore natural to pose the following question:
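
As a quick check of the diagonalization property for $H_0 = -\Delta$, the following computation (a sketch for Schwartz functions $g$, using the normalization of (1.4); the restriction to Schwartz functions is an assumption made here for simplicity) integrates by parts twice:

```latex
\[
  \widehat{H_0 g}(\xi)
  = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} (-\Delta g)(x)\, e^{-i\xi x}\, dx
  = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} g(x)\, \bigl(-\Delta_x e^{-i\xi x}\bigr)\, dx
  = |\xi|^2\, \hat g(\xi),
\]
% since -\Delta_x e^{-i\xi x} = |\xi|^2 e^{-i\xi x}. Equivalently,
\[
  \mathcal{F} H_0 \mathcal{F}^{-1} = M_{|\xi|^2}.
\]
```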

Question 3. Can one associate a similar eigenfunction expansion with the operator
$H$? More specifically, can one replace the exponentials $\exp(i\xi x)$ by some approximating
generalized eigenfunctions ("distorted plane waves") so that the resulting
transform remains unitary and diagonalizes the operator?

As a final topic in this paper, we turn back to the evolution (unitary) group
$\exp(-itH)u_0$, which solves the Schrödinger equation

$$ i\partial_t u = Hu, \quad u(0) = u_0. $$

The last 30 years have seen very intensive research on the global (spacetime)
properties of these solutions, known as Strichartz and smoothing estimates.
Instead of treating the Schrödinger equation we choose here to address the generalized
wave equation,

$$ \partial_t^2 u = -Hu + f, \tag{1.6} $$

subject to initial conditions $u(0) = u_0$, $\partial_t u(0) = v_0$.


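The conservation of energy for this equation (in the homogeneous case, $f = 0$)
is given by
$$ \int_{\mathbb{R}^n} \left[ |H^{\theta} \partial_t u(x,t)|^2 + |H^{\theta + \frac{1}{2}} u(x,t)|^2 \right] dx = \int_{\mathbb{R}^n} \left[ |H^{\theta} v_0(x)|^2 + |H^{\theta + \frac{1}{2}} u_0(x)|^2 \right] dx, \tag{1.7} $$

for any $\theta \in \mathbb{R}$, and any $t \in \mathbb{R}$.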
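
The identity (1.7) can be verified formally for the basic exponent zero. The sketch below assumes $f = 0$, a sufficiently regular solution of $\partial_t^2 u = -Hu$, and writes $(\,,\,)$ for the $L^2(\mathbb{R}^n)$ scalar product; the general exponent (called $\theta$ here, a label chosen for this illustration) follows by applying the same argument to $H^{\theta}u$, which solves the same equation.

```latex
\begin{align*}
  \frac{d}{dt}\Bigl[ \|\partial_t u\|_0^2 + \|H^{\frac12} u\|_0^2 \Bigr]
  &= 2\,\mathrm{Re}\,(\partial_t^2 u, \partial_t u)
   + 2\,\mathrm{Re}\,(H^{\frac12}\partial_t u, H^{\frac12} u) \\
  &= 2\,\mathrm{Re}\,(\partial_t^2 u + H u, \partial_t u) = 0 ,
\end{align*}
% using the self-adjointness of H^{1/2} and Re(a,b) = Re(b,a).
```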


In this context, the dispersive character of the equation means that the solution
escapes from any bounded set, as $|t| \to \infty$, in some average sense. We would like
to estimate this decay in terms of the initial energy norm, namely, the right-hand
side of (1.7).
We therefore ask:

Question 4. Can one establish global $L^2$ spacetime estimates for solutions of (1.6)
in terms of the initial energy norm?

In this paper, we answer affirmatively the first three questions. As for
Question 4, we provide such estimates by imposing restrictive hypotheses
on the coefficient matrix.
The precise statements, as well as discussions of the relevant bibliography for
each topic, are given in Sec. 3.

The main technical tool used here consists of a close study of the properties of
the resolvent $R(z)$ as $z$ approaches the real axis.
To be more specific, we introduce the general notion of the "continuity up to
the spectrum" of the resolvent.
Definition 1.1. Let $[\alpha, \beta] \subseteq \mathbb{R}$. We say that $H$ satisfies the "Limiting Absorption
Principle" (LAP) in $[\alpha, \beta]$ if $R(z)$, $z \in \mathbb{C}^{\pm}$, can be extended continuously to $\mathrm{Im}\, z = 0$,
$\mathrm{Re}\, z \in [\alpha, \beta]$, in a suitable operator topology. In this case we denote the limiting
values by $R^{\pm}(\lambda)$, $\alpha \leq \lambda \leq \beta$.
The precise specification of the operator topology in the above definition is left
open. Typically, it will be the uniform operator topology associated with weighted-$L^2$
or Sobolev spaces, which are introduced in Sec. 2.
Note that the limiting values $R^{-}(\lambda)$ are, generally speaking, different from
$R^{+}(\lambda)$. In fact, one has (formally) the Stieltjes formula
$$ A(\lambda) = \frac{1}{2\pi i} \left( R^{+}(\lambda) - R^{-}(\lambda) \right) = \frac{d}{d\lambda} E(\lambda), $$
where $E(\lambda)$ is the spectral family associated with $H$.
The operator $A(\lambda)$, $\lambda \in [0, \infty)$, known in the physical literature as the "density
of states" ([28, Chap. XIII]), plays an important role in our study.
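
The formal Stieltjes formula can be motivated by the spectral representation of the resolvent. The following sketch assumes $R(z) = \int (\mu - z)^{-1}\, dE(\mu)$ and that $E$ is differentiable at $\lambda$; the symbols $\lambda$, $\mu$, $\varepsilon$ are labels chosen for this illustration.

```latex
\begin{align*}
  \frac{1}{2\pi i}\bigl( R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon) \bigr)
  &= \frac{1}{2\pi i} \int \Bigl[ \frac{1}{\mu - \lambda - i\varepsilon}
       - \frac{1}{\mu - \lambda + i\varepsilon} \Bigr] dE(\mu) \\
  &= \int \frac{1}{\pi}\, \frac{\varepsilon}{(\mu - \lambda)^2 + \varepsilon^2}\, dE(\mu)
  \;\longrightarrow\; \frac{d}{d\lambda} E(\lambda)
  \quad (\varepsilon \downarrow 0),
\end{align*}
% because the Poisson kernel \pi^{-1}\varepsilon/((\mu-\lambda)^2+\varepsilon^2)
% has unit mass and concentrates at \mu = \lambda as \varepsilon \to 0.
```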
The paper is organized as follows.
Basic functional spaces and notations are introduced in Sec. 2.
Our results are stated as Theorems A–C in Sec. 3. Around each of the three
theorems, we discuss some background material as well as relevant references.
Obviously, the large amount of existing literature excludes any possibility of compiling
an exhaustive bibliography.
Section 4 is devoted to revisiting the LAP as applied to the Laplacian H0 , and
in particular obtaining uniform low energy estimates.
In Sec. 5, we prove Theorem A, the LAP for H.
The eigenfunction expansion theorem, Theorem B, is proved in Sec. 6.
The global spacetime estimates for the generalized wave equation (1.6), as
stated in Theorem C, are proved in Sec. 7.
Some of the results presented here were announced in [9].

2. Functional Spaces and Notation


Throughout this paper we shall make use of the following weighted-$L^2$ and Sobolev
spaces. First, for $s \in \mathbb{R}$ and $m$ a nonnegative integer we define
$$ L^{2,s}(\mathbb{R}^n) := \left\{ u(x) \,/\, \|u\|_{0,s}^2 = \int_{\mathbb{R}^n} (1 + |x|^2)^s |u(x)|^2\,dx < \infty \right\} \tag{2.1} $$
$$ H^{m,s}(\mathbb{R}^n) := \left\{ u(x) \,/\, D^{\alpha} u \in L^{2,s},\ |\alpha| \leq m,\ \|u\|_{m,s}^2 = \sum_{|\alpha| \leq m} \|D^{\alpha} u\|_{0,s}^2 \right\} \tag{2.2} $$
(we write $L^2$ for $L^{2,0}$ and $\|u\|_0 = \|u\|_{0,0}$).

More generally, for any $\sigma \in \mathbb{R}$, let $H^{\sigma} \equiv H^{\sigma,0}$ be the Sobolev space of order $\sigma$,
namely,
$$ H^{\sigma} = \{ \hat{u} \,/\, u \in L^{2,\sigma},\ \|\hat{u}\|_{\sigma,0} = \|u\|_{0,\sigma} \} \tag{2.3} $$
where the Fourier transform is defined as in (1.4).

For negative indices, we denote by $\{H^{-m,s}, \|\cdot\|_{-m,s}\}$ the dual space of $H^{m,-s}$.
In particular, observe that any function $f \in H^{-1,s}$ can be represented (not
uniquely) as
$$ f = f_0 + \sum_{k=1}^{n} i^{-1} \frac{\partial}{\partial x_k} f_k, \quad f_k \in L^{2,s},\ 0 \leq k \leq n. \tag{2.4} $$

In the case $n = 2$ and $s > 1$, we define
$$ L_0^{2,s}(\mathbb{R}^2) = \{ u \in L^{2,s}(\mathbb{R}^2) \,/\, \hat{u}(0) = 0 \}, $$
and set $H_0^{-1,s}(\mathbb{R}^2)$ to be the space of functions $f \in H^{-1,s}(\mathbb{R}^2)$ which have a
representation (2.4) where $f_k \in L_0^{2,s}$, $k = 0, 1, 2$.
For any two normed spaces $X$, $Y$, we denote by $B(X, Y)$ the space of bounded
linear operators from $X$ to $Y$, equipped with the operator-norm $\|\cdot\|_{B(X,Y)}$ topology.

3. Statement of Results and Background


3.1. The limiting absorption principle (LAP)
We note that the operator $H$ can be extended in an obvious way (retaining the
same notation) as a bounded operator $H: H^{1}_{loc} \to H^{-1}_{loc}$. In particular, $H: H^{1,-s} \to H^{-1,-s}$,
for all $s \geq 0$. Furthermore, the graph-norm of $H$ in $H^{-1,-s}$ is equivalent
to the norm of $H^{1,-s}$.
Similarly, we can consider the resolvent $R(z)$ as defined on $L^{2,s}$, $s \geq 0$, where
$L^{2,s}$ is densely and continuously embedded in $H^{-1,s}$.
The basic technical tool used in the present paper is given in the following
theorem. It has its own significance, stating that the resolvent is continuous up to
the spectrum, including the threshold at $\lambda = 0$.
Theorem A. Suppose that $a(x)$ satisfies (1.1), (1.2). Then the operator $H$ satisfies
the LAP in $\mathbb{R}$. More precisely, let $s > 1$ and consider the resolvent $R(z) = (H - z)^{-1}$,
$\mathrm{Im}\, z \neq 0$, as a bounded operator from $L^{2,s}(\mathbb{R}^n)$ to $H^{1,-s}(\mathbb{R}^n)$.
Then:

(a) $R(z)$ is bounded with respect to the $H^{-1,s}(\mathbb{R}^n)$ norm. Using the density of $L^{2,s}$
in $H^{-1,s}$, we can therefore view $R(z)$ as a bounded operator from $H^{-1,s}(\mathbb{R}^n)$
to $H^{1,-s}(\mathbb{R}^n)$.
(b) The operator-valued functions, defined respectively in the lower and upper half-planes,

$$ z \to R(z) \in B(H^{-1,s}(\mathbb{R}^n), H^{1,-s}(\mathbb{R}^n)), \quad s > 1,\ \pm\mathrm{Im}\, z > 0, \tag{3.1} $$

can be extended continuously from $\mathbb{C}^{\pm} = \{z \,/\, \pm\mathrm{Im}\, z > 0\}$ to $\overline{\mathbb{C}^{\pm}} = \mathbb{C}^{\pm} \cup \mathbb{R}$ (with
respect to the operator-norm topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-s}(\mathbb{R}^n))$).
In the case $n = 2$, replace $H^{-1,s}$ by $H_0^{-1,s}$.
Notation. We denote the limiting values of the resolvent on the real axis by
$$ R^{\pm}(\lambda) = \lim_{\varepsilon \downarrow 0} R(\lambda \pm i\varepsilon). $$

The spectrum of $H$ is therefore entirely absolutely continuous. In particular, it
follows that the limiting values $R^{\pm}(\lambda)$ are continuous at $\lambda = 0$ and $H$ has no
resonance there.
The main focus of Theorem A is the LAP for $H$ at low energies, i.e. in intervals
$[\alpha, \beta]$ where $\alpha < 0 < \beta$.
However, to review the existing literature, we consider first the LAP in $(0, \infty)$,
namely, over the interior of the spectrum.
Under assumptions close to ours here (but also assuming that $a(x)$ is continuously
differentiable) a weaker version (roughly, "strong" instead of "uniform"
convergence of the resolvents) was obtained by Eidus ([34, Theorem 4 and Remark 1]).
His approach relied on elliptic (kernel) estimates.
The systematic treatment of the LAP started with the work of Agmon ([1]). He
established it for operators of the type $H_0 + V$, where $V$ is a short-range perturbation.
To obtain the LAP for $H_0$ he considered the action of division by symbols
with simple zeros in weighted Sobolev spaces. We therefore label this approach as
the "Fourier approach" (see [41, Chap. 14]). The short-range potential was treated
by perturbation methods.
Soon thereafter, two other approaches to the LAP were proposed, first the
"Commutator method" (known as "Mourre's method") proposed in the classical
paper [58] and then the "Spectral method", initiated in joint works of the author
with Devinatz ([12, 13]). In its implementation for partial differential operators, this
method relies on estimates of traces of Sobolev functions on "characteristic manifolds",
somewhat in analogy to the division by symbols with simple zeros in the case
of the Fourier method. In fact, it implies the Hölder continuity of the limiting values
$R^{\pm}(\lambda)$ in a suitable operator topology.
All three approaches yielded simple proofs for the LAP associated with $H =
H_0 + V$, where $V$ is short-range, in the interior $(0, \infty)$ of the spectrum.
Using one of the aforementioned approaches, the LAP for $H$ has later been
established, with $V$ being a long-range or Stark-like potential ([5, 45]), a potential
in $L^p(\mathbb{R}^n)$ ([36, 47]), a potential depending only on direction ($x/|x|$) ([38]) or a
perturbation of such a potential ([61, 62]). In these latter cases the condition $\lambda > 0$
is replaced by $\lambda > \limsup_{|x| \to \infty} V(x)$.
The LAP for operators of the type $f(-\Delta) + V$, for a certain class of functions
$f$, was derived in [17], using the spectral method.
A remarkable success of Mourre's method was in its application to the LAP in
the case of the $N$-body Schrödinger operator (outside of thresholds) ([60]).

As mentioned in the Introduction, if the coefficient matrix $a(x)$ is smooth, the
operator $H$ can be viewed as the Laplace–Beltrami operator $\Delta_g$ on noncompact
manifolds, where $g$ is a smooth metric that approaches the Euclidean metric at
infinity. The LAP in this case (in the interior of the spectrum) has already been
established by Mourre. We refer to [65] and references therein for the case of
perturbations of such operators. More recent works that employ the Mourre method
for the derivation of the LAP in the interior of the spectrum, for asymptotically
Euclidean spaces, are [75, Sec. 5] and [19, Theorem 2.2].
We now turn back to our topic here, the LAP in intervals containing the threshold
at the bottom of the spectrum. The study of the resolvent near the threshold
$\lambda = 0$ is sometimes referred to as "low energy estimates". The literature in this case
is considerably more limited. An inspection of the aforementioned works shows that
the methods they employ cannot be extended in a straightforward way to our operator
$H$.
This case has been studied for the Laplacian $H_0$ in [12, Appendix A] and for $H$
in the one-dimensional case ($n = 1$) in [8, 10, 27]. The present paper deals with the
multi-dimensional case $n \geq 2$.
In recent works, Bouclet ([21]) and Bony and Häfner ([20]) have applied the
Mourre method in order to establish the low energy LAP for $\Delta_g$ on noncompact
manifolds of dimension $n \geq 3$, where the metric $g(x)$ is smooth but long-range.
The paper [64] deals with the two-dimensional ($n = 2$) case, but the resolvent
$R(z)$ is restricted to continuous compactly supported functions $f$, thus enabling the
use of pointwise decay estimates of $R(z)f$ at infinity.
Finally we mention that the case of the closely related "acoustic propagator", where
the matrix $a(x) = b(x_1)I$ is scalar and dependent on a single coordinate, has been
extensively studied [10, 22, 29, 31, 48, 49, 53], as well as the anisotropic case where
$b(x_1)$ is a general positive matrix ([11]). The LAP for the periodic case (namely,
$a(x)$ is symmetric and periodic) has recently been established in [59]. Note that in
this case the spectrum is absolutely continuous and consists of a union of intervals
("bands").
The proof of Theorem A, based on the spectral approach, is given in Sec. 5. It
uses an extended version of the LAP for $H_0$, with the resolvent $R_0(z)$ acting on
elements of $H^{-1,s}$, for suitable positive values of $s$ (see Sec. 4).
Since $L^{2,s}$ (respectively $H^{1,-s}$) is densely and continuously embedded in $H^{-1,s}$
(respectively $L^{2,-s}$), we conclude that the resolvents $R_0(z)$, $R(z)$ can be extended
continuously to $\overline{\mathbb{C}^{\pm}}$ in the $B(L^{2,s}(\mathbb{R}^n), L^{2,-s}(\mathbb{R}^n))$ operator topology. An immediate
consequence of this fact is the existence and completeness of the wave
operators.
Using a well-known theorem of Kato and Kuroda ([51]), we have the following
corollary concerning the completeness of the wave operators (see (1.3)
for the definition).

Corollary 3.1. The wave operators $W_{\pm}(H, H_0)$ exist and are complete.

Indeed, all that is needed is that $H$, $H_0$ satisfy the LAP in $\mathbb{R}$, with respect to
the same operator topologies.
We refer to the paper [46] where the existence and completeness of the wave
operators $W_{\pm}(H, H_0)$ is established under suitable smoothness assumptions on $a(x)$
(however, $a(x) - I$ is not assumed to be compactly supported and $H$ can include
also magnetic and electric potentials).

3.2. The eigenfunction expansion theorem


The spectral theorem (for self-adjoint operators) can be viewed as a "generalized
eigenfunction theorem". In fact, using the result of Theorem A one can obtain a
more refined version in this case as follows.
Let $\{E(\lambda), \lambda \in \mathbb{R}\}$ be the spectral family associated with $H$. Let $A(\lambda) = \frac{d}{d\lambda} E(\lambda)$
be its weak derivative. More precisely, we use the well-known formula,

$$ A(\lambda) = \frac{1}{2\pi i} \lim_{\varepsilon \to 0+} (R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)) = \frac{1}{2\pi i} (R^{+}(\lambda) - R^{-}(\lambda)). \tag{3.2} $$

By Theorem A, we know that $A(\lambda) \in B(L^{2,s}(\mathbb{R}^n), L^{2,-s}(\mathbb{R}^n))$. The formal relation
$(H - \lambda)A(\lambda) = 0$ can be given a rigorous meaning if, for example, we can find a
bounded operator $T$ such that $T^{*}A(\lambda)T$ is bounded in $L^2(\mathbb{R}^n)$ and has a complete
set (necessarily at most countable) of eigenvectors. These will serve as "generalized
eigenvectors" for $H$. We refer to [18, Chaps. V and VI] and [23] for a development
of this approach for self-adjoint elliptic operators. Note that by this approach we
have at most a countable number of such generalized eigenvectors for any fixed
$\lambda$. In the case of $H_0 = -\Delta$, they correspond to $|x|^{-\frac{n-3}{2}} J_{\nu_j}(\sqrt{\lambda}|x|)\psi_j(\omega)$, where
$\nu_j = \mu_j + \frac{(n-1)(n-3)}{4}$, $\mu_j$ being the $j$th eigenvalue of the Laplace–Beltrami operator
on the unit sphere $S^{n-1}$, $\psi_j$ the corresponding eigenfunction and $J_{\nu}$ is the Bessel
function of order $\nu$.
On the other hand, the Fourier expansion (1.5) can be viewed as expressing
a function in terms of the "generalized eigenfunctions" $\exp(i\xi x)$ of $H_0$. Observe
that now there is a continuum of such functions corresponding to $\lambda > 0$, namely,
$|\xi|^2 = \lambda$.
From the physical point of view, this expansion in terms of "plane waves"
proves to be more useful for many applications. In particular, replacing $-\Delta$ by the
Schrödinger operator $-\Delta + V(x)$ one can expect, under certain hypotheses on the
potential $V$, a similar expansion in terms of "distorted plane waves". This has been
accomplished, in increasing order of generality (more specifically, decay assumptions
on $V(x)$ as $|x| \to \infty$) in [1, 2, 44, 63, 68]. See also [74] for an eigenfunction
expansion for relativistic Schrödinger operators.
Here we use the LAP result of Theorem A in order to derive a similar expansion
for the operator $H$. In fact, our generalized eigenfunctions are given by the following
definition.

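Definition 3.2. For every $\xi \in \mathbb{R}^n$ let

$$ \phi_{\pm}(x, \xi) = -R^{\mp}(|\xi|^2)\big((H - |\xi|^2)\exp(i\xi x)\big) = R^{\mp}(|\xi|^2)\Big(\sum_{l,j=1}^{n} \partial_l (a_{l,j}(x) - \delta_{l,j}) \partial_j \exp(i\xi x)\Big). \tag{3.3} $$

The generalized eigenfunctions of $H$ are defined by

$$ \psi_{\pm}(x, \xi) = \exp(i\xi x) + \phi_{\pm}(x, \xi). \tag{3.4} $$

We assume $n \geq 3$ in order to simplify the statement of the theorem. As we show
below (see Proposition 6.1) the generalized eigenfunctions are (at least) continuous
in $x$, so that the integral in the statement makes sense.
Theorem B. Suppose that $n \geq 3$ and that $a(x)$ satisfies (1.1) and (1.2). For any
compactly supported $f \in L^2(\mathbb{R}^n)$ define

$$ (\mathbb{F}_{\pm} f)(\xi) = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} f(x) \overline{\psi_{\pm}(x, \xi)}\,dx, \quad \xi \in \mathbb{R}^n. \tag{3.5} $$

Then the transformations $\mathbb{F}_{\pm}$ can be extended as unitary transformations (for which
we retain the same notation) of $L^2(\mathbb{R}^n)$ onto itself. Furthermore, these transformations
"diagonalize" $H$ in the following sense.
$f \in L^2(\mathbb{R}^n)$ is in the domain $D(H)$ if and only if $|\xi|^2 (\mathbb{F}_{\pm} f)(\xi) \in L^2(\mathbb{R}^n)$ and

$$ H = \mathbb{F}_{\pm}^{*} M_{|\xi|^2} \mathbb{F}_{\pm}, \tag{3.6} $$
where $M_{|\xi|^2}$ is the multiplication operator by $|\xi|^2$.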

3.3. Spacetime estimates for a generalized wave equation


The Strichartz estimates ([72]) have become a fundamental ingredient in the study
of nonlinear wave equations. They are $L^p$ spacetime estimates that are derived
for operators whose leading part has constant coefficients. We refer to the books
[4, 70, 71] for detailed accounts and further references.
Here we focus exclusively on spacetime estimates pertinent to the framework
of this paper, namely, weighted $L^2$ estimates. Indeed, once the low energy estimates
of Theorem A are established, the method of proof here follows a standard
methodology.
We recall first some results related to the Cauchy problem for the classical wave
equation
$$ \Box u = \frac{\partial^2 u}{\partial t^2} - \Delta u = 0, \tag{3.7} $$
subject to the initial data
$$ u(x, 0) = u_0(x), \quad \partial_t u(x, 0) = v_0(x), \quad x \in \mathbb{R}^n. \tag{3.8} $$

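The Morawetz estimate [56] yields

$$ \int_{\mathbb{R}} \int_{\mathbb{R}^n} |x|^{-3} |u(x,t)|^2\,dx\,dt \leq C(\|\nabla u_0\|_0^2 + \|v_0\|_0^2), \quad n \geq 4, \tag{3.9} $$

while in [7] we gave the estimate

$$ \int_{\mathbb{R}} \int_{\mathbb{R}^n} |x|^{-2\alpha - 1} |u(x,t)|^2\,dx\,dt \leq C_{\alpha}(\| |\nabla|^{\alpha} u_0\|_0^2 + \| |\nabla|^{\alpha - 1} v_0\|_0^2), \quad n \geq 3, \tag{3.10} $$

for every $\alpha \in (0, 1)$.

Related results were obtained in [55] (allowing also dissipative terms), [42] (with
some gain in regularity), [76] (with short-range potentials) and [39] for spherically
symmetric solutions.
Here we consider the equation
$$ \frac{\partial^2 u}{\partial t^2} + Hu = \frac{\partial^2 u}{\partial t^2} - \sum_{i,j=1}^{n} \partial_i a_{i,j}(x) \partial_j u = f(x, t), \tag{3.11} $$
subject to the initial data (3.8).
We first replace the assumptions (1.1) and (1.2) by stronger ones as follows.

(H1) $$ a(x) = g^{-1}(x) = (g^{i,j}(x))_{1 \leq i,j \leq n} \tag{3.12} $$
where $g(x) = (g_{i,j}(x))_{1 \leq i,j \leq n}$ is a smooth Riemannian metric on $\mathbb{R}^n$ such that
$$ g(x) = I \quad \text{for } |x| > \Lambda_0. \tag{3.13} $$

(H2) The Hamiltonian flow associated with $h(x, \xi) = (g(x)\xi, \xi)$ is nontrapping for
any (positive) value of $h$.

Recall that (H2) means that the flow associated with the Hamiltonian vectorfield
$H = \frac{\partial h}{\partial \xi} \frac{\partial}{\partial x} - \frac{\partial h}{\partial x} \frac{\partial}{\partial \xi}$ leaves any compact set in $\mathbb{R}^n_x$.

Identical hypotheses are imposed in the study of resolvent estimates in semiclassical
theory ([24, 25]).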
In our estimates we use "homogeneous Sobolev spaces" associated with the
operator $H$.
We note that since $H$ has no eigenvalue at zero, the operators $H^{-1}$ and $H^{-\frac{1}{2}}$
are well defined self-adjoint operators. Note that $\|H^{\frac{1}{2}}\varphi\|_0$ is equivalent to the
homogeneous Sobolev norm $\|\nabla\varphi\|_0$.
Theorem C. Suppose that $n \geq 3$ and that $a(x)$ satisfies Hypotheses (H1) and (H2).
Let $s > 1$.
(a) (Local Energy Decay) Let $u_0 \in D(H^{\frac{1}{2}})$ and $v_0 \in L^2(\mathbb{R}^n)$. Then there exists a
constant $C_1 = C_1(s, n) > 0$ such that the solution to (3.11) and (3.8) satisfies,
$$ \int_{\mathbb{R}} \int_{\mathbb{R}^n} (1 + |x|^2)^{-s} \left[ |H^{\frac{1}{2}} u(x,t)|^2 + |u_t(x,t)|^2 \right] dx\,dt \leq C_1 \left\{ \|H^{\frac{1}{2}} u_0\|_0^2 + \|v_0\|_0^2 + \int_{\mathbb{R}} \int_{\mathbb{R}^n} |f(x,t)|^2\,dx\,dt \right\}. \tag{3.14} $$

(b) (Amplitude Decay) Assume $f = 0$. Let $u_0 \in L^2(\mathbb{R}^n)$ and $v_0 \in D(H^{-\frac{1}{2}})$. There
exists a constant $C_2 = C_2(s, n) > 0$ such that the solution to (3.11) and (3.8)
satisfies,
$$ \int_{\mathbb{R}} \int_{\mathbb{R}^n} (1 + |x|^2)^{-s} |u(x,t)|^2\,dx\,dt \leq C_2 \left[ \|u_0\|_0^2 + \|H^{-\frac{1}{2}} v_0\|_0^2 \right]. \tag{3.15} $$

These estimates generalize similar estimates obtained for the classical ($g = I$)
wave equation ([7, 55]).

Remark 3.3. The estimate (3.14) is an energy decay estimate for the wave
equation (3.11). A localized (in space) version of the estimate has served to obtain
global (small amplitude) existence theorems for the corresponding nonlinear equation
([25, 40]).

Remark 3.4. The referee has pointed out to the author the recent preprint [19,
Theorem 1.3], where a more general result is obtained, with the metric being long-range.

The weighted $L^2$-spacetime estimates for the dispersive equation
$$ i^{-1} \frac{\partial}{\partial t} u = Lu, $$
have been extensively treated in recent years. In general, in this case there is also a
gain of derivatives (so called "smoothing") in addition to the energy decay. For the
Schrödinger operator $L = -\Delta + V(x)$, with various assumptions on the potential
$V$, we refer to [3, 6, 7, 15, 16, 42, 52, 67, 69, 77] and references therein. Smoothing
estimates in the presence of magnetic potentials are considered in [30]. The
Schrödinger operator on a Riemannian manifold is considered in [24, 33]. For more
general operators, see [14, 17, 26, 43, 57, 66, 73] and references therein.

4. The Operator $H_0 = -\Delta$
Let $\{E_0(\lambda)\}$ be the spectral family associated with $H_0$, so that
$$ (E_0(\lambda)h, h) = \int_{|\xi|^2 \leq \lambda} |\hat{h}(\xi)|^2\,d\xi, \quad \lambda \geq 0,\ h \in L^2(\mathbb{R}^n). \tag{4.1} $$

Following the methodology of [13, 32], we see that the weak derivative $A_0(\lambda) =
\frac{d}{d\lambda} E_0(\lambda)$ exists in $B(L^{2,s}, L^{2,-s})$ for any $s > \frac{1}{2}$ and $\lambda > 0$. (Here and below we
write $L^{2,s}$ for $L^{2,s}(\mathbb{R}^n)$.) Furthermore,
$$ \langle A_0(\lambda)h, h \rangle = (2\sqrt{\lambda})^{-1} \int_{|\xi|^2 = \lambda} |\hat{h}|^2\,d\tau, \tag{4.2} $$
where $\langle\,,\rangle$ is the $(L^{2,-s}, L^{2,s})$ pairing (conjugate linear with respect to the second
term) and $d\tau$ is the Lebesgue surface measure. Recall that by the standard trace
lemma we have
$$ \int_{|\xi|^2 = \lambda} |\hat{h}|^2\,d\tau \leq C \|\hat{h}\|_{H^s}^2, \quad s > \frac{1}{2}. \tag{4.3} $$
However, we can refine this estimate near $\lambda = 0$ as follows.
Proposition 4.1. Let $\frac{1}{2} < s < \frac{3}{2}$, $h \in L^{2,s}$. For $n = 2$ assume further that $s > 1$
and $h \in L_0^{2,s}$. Then
$$ \int_{|\xi|^2 = \lambda} |\hat{h}|^2\,d\tau \leq C \min(\lambda^{\gamma}, 1) \|\hat{h}\|_{H^s}^2, \tag{4.4} $$
where
$$ 0 < \gamma = s - \frac{1}{2}, \quad n \geq 3; \qquad 0 < \gamma < s - \frac{1}{2}, \quad n = 2, \tag{4.5} $$
and $C = C(s, \gamma, n)$.

Proof. If $n \geq 3$, the proof follows as in [16, Appendix], using the generalized
Hardy inequality due to Herbst [37], namely, that multiplication by $|\xi|^{-s}$
is bounded from $H^s$ into $L^2$ (see also [54, Sec. 9.4]).
If $n = 2$ and $1 < s < \frac{3}{2}$ we have, for $h \in L_0^{2,s}$,
$$ |\hat{h}(\xi)| = |\hat{h}(\xi) - \hat{h}(0)| \leq C_{s,\delta} |\xi|^{\delta} \|\hat{h}\|_{H^s}, $$
for any $0 < \delta < \min(1, s - 1)$. Using this estimate in the integral on the left-hand
side of (4.4), the claim follows also in this case.

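Combining Eqs. (4.2)–(4.4) we conclude that,
$$ |\langle A_0(\lambda)f, g \rangle| \leq \langle A_0(\lambda)f, f \rangle^{\frac{1}{2}} \langle A_0(\lambda)g, g \rangle^{\frac{1}{2}} \leq C \min(\lambda^{-\frac{1}{2}}, \lambda^{\eta}) \|f\|_{0,s} \|g\|_{0,\sigma}, \quad f \in L^{2,s},\ g \in L^{2,\sigma}, \tag{4.6} $$
where either
(i) $n \geq 3$, $\frac{1}{2} < s, \sigma < \frac{3}{2}$, $s + \sigma > 2$ and $0 < 2\eta = s + \sigma - 2$,
or
(ii) $n = 2$, $1 < s < \frac{3}{2}$, $\frac{1}{2} < \sigma < \frac{3}{2}$, $s + \sigma > 2$, $0 < 2\eta < s + \sigma - 2$
and $\hat{f}(0) = 0$. (4.7)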
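
The exponents in (4.6) and (4.7) can be checked by a short bookkeeping computation for $n \geq 3$. The sketch below assumes that the weight indices of $f$ and $g$ are called $s$ and $\sigma$ and the threshold exponent is called $\eta$ (labels adopted for this illustration), and it applies (4.2) and (4.4) once with each index.

```latex
\begin{align*}
  \langle A_0(\lambda) f, f\rangle^{\frac12}\,
  \langle A_0(\lambda) g, g\rangle^{\frac12}
  &\le C\, (2\sqrt{\lambda})^{-1}
     \min\bigl(\lambda^{\frac12(s-\frac12)}, 1\bigr)\,
     \min\bigl(\lambda^{\frac12(\sigma-\frac12)}, 1\bigr)\,
     \|f\|_{0,s}\, \|g\|_{0,\sigma} .
\end{align*}
% For 0 < \lambda \le 1 the \lambda-dependence is
%   \lambda^{-1/2 + (s + \sigma - 1)/2} = \lambda^{(s+\sigma-2)/2} = \lambda^{\eta},
% and for \lambda \ge 1 it is \lambda^{-1/2}. Since \eta > 0 > -1/2,
\[
  \min(\lambda^{-\frac12}, \lambda^{\eta}) =
  \begin{cases}
    \lambda^{\eta}, & 0 < \lambda \le 1, \\
    \lambda^{-\frac12}, & \lambda \ge 1 ,
  \end{cases}
  \qquad 2\eta = s + \sigma - 2 .
\]
```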
In both cases, $A_0(\lambda)$ is Hölder continuous and vanishes at $0$, $\infty$, so as in [13] we
obtain:
Proposition 4.2. The operator-valued function
$$ z \to R_0(z) \in \begin{cases} B(L^{2,s}, L^{2,-\sigma}), & n \geq 3, \\ B(L_0^{2,s}, L^{2,-\sigma}), & n = 2, \end{cases} \tag{4.8} $$
where $s$, $\sigma$ satisfy (4.7), can be extended continuously from $\mathbb{C}^{\pm}$ to $\overline{\mathbb{C}^{\pm}}$, in the
respective uniform operator topologies.

Remark 4.3. We note that the conditions (4.7) yield the continuity of $A_0(\lambda)$
across the threshold $\lambda = 0$ and hence the continuity property of the resolvent as in
Proposition 4.2. However, for the local continuity at any $\lambda_0 > 0$, it suffices to take
$s, \sigma > \frac{1}{2}$, as in [1].
This remark applies equally to the statements below, where the resolvent is
considered in other functional settings.

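We shall now extend this proposition to more general function spaces. Let $g \in
H^{1,\sigma}$, where $s$, $\sigma$ satisfy (4.7). Let $f \in H^{-1,s}$ have a representation of the form
(2.4). Equation (4.2) can be extended to yield an operator (for which we retain the
same notation)

$$ A_0(\lambda) \in B(H^{-1,s}, H^{-1,-\sigma}), $$

defined by (where now $\langle\,,\rangle$ is used for the $(H^{-1,-\sigma}, H^{1,\sigma})$ pairing),

$$ \left\langle A_0(\lambda)\left[ f_0 + i^{-1} \sum_{k=1}^{n} \frac{\partial}{\partial x_k} f_k \right], g \right\rangle = (2\sqrt{\lambda})^{-1} \int_{|\xi|^2 = \lambda} \left[ \hat{f}_0(\xi) + \sum_{k=1}^{n} \xi_k \hat{f}_k(\xi) \right] \overline{\hat{g}(\xi)}\,d\tau, \quad f \in H^{-1,s},\ g \in H^{1,\sigma}, \tag{4.9} $$

(replace $H^{-1,s}$ by $H_0^{-1,s}$ if $n = 2$).

Observe that this definition makes good sense even though the representation
(2.4) is not unique, since

$$ f = f_0 + i^{-1} \sum_{k=1}^{n} \frac{\partial}{\partial x_k} f_k = \tilde{f}_0 + i^{-1} \sum_{k=1}^{n} \frac{\partial}{\partial x_k} \tilde{f}_k, $$

implies

$$ \hat{f}_0(\xi) + \sum_{k=1}^{n} \xi_k \hat{f}_k(\xi) = \hat{\tilde{f}}_0(\xi) + \sum_{k=1}^{n} \xi_k \hat{\tilde{f}}_k(\xi) $$

(as tempered distributions).

To estimate the operator-norm of $A_0(\lambda)$ in this setting we use (4.9) and the
considerations preceding Proposition 4.2, to obtain, instead of (4.6), for $k = 1, 2, \ldots, n$,
$$ \left| \left\langle A_0(\lambda) \frac{\partial}{\partial x_k} f_k, g \right\rangle \right| \leq C \min(\lambda^{-\frac{1}{2}}, \lambda^{\eta}) \|f\|_{-1,s} \|g\|_{1,\sigma}, \quad f \in H^{-1,s},\ g \in H^{1,\sigma}, \tag{4.10} $$

where $s$, $\sigma$ satisfy (4.7) (replace $H^{-1,s}$ by $H_0^{-1,s}$ if $n = 2$).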



We now define the extension of the resolvent operator by
$$ R_0(z) = \int_0^{\infty} \frac{A_0(\lambda)}{\lambda - z}\,d\lambda, \quad \mathrm{Im}\, z \neq 0. \tag{4.11} $$
The convergence of the integral (in operator-norm) follows from the estimate (4.10).
The LAP in this case is given in the following proposition.

Proposition 4.4. The operator-valued function $R_0(z)$ is well-defined (and
analytic) for nonreal $z$ in the following functional setting.
$$ z \to R_0(z) \in \begin{cases} B(H^{-1,s}, H^{1,-\sigma}), & n \geq 3, \\ B(H_0^{-1,s}, H^{1,-\sigma}), & n = 2, \end{cases} \tag{4.12} $$
where $s$, $\sigma$ satisfy (4.7). Furthermore, it can be extended continuously from $\mathbb{C}^{\pm}$ to
$\overline{\mathbb{C}^{\pm}}$, in the respective uniform operator topologies. The limiting values are denoted
by $R_0^{\pm}(\lambda)$.
The extended function satisfies
$$ (H_0 - z)R_0(z)f = f, \quad f \in H^{-1,s},\ z \in \overline{\mathbb{C}^{\pm}}, \tag{4.13} $$
where for $z = \lambda \in \mathbb{R}$, $R_0(z) = R_0^{\pm}(\lambda)$.

Proof. We assume for simplicity $n \geq 3$. By Definition (4.11) and the estimate
(4.10), we get readily $R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$ if $\mathrm{Im}\, z \neq 0$, as well as
the analyticity of the map $z \to R_0(z)$, $\mathrm{Im}\, z \neq 0$. Furthermore, the extension to
$\mathrm{Im}\, z = 0$ is carried out as in [13].
Equation (4.13) is obvious if $\mathrm{Im}\, z \neq 0$ and $f \in L^{2,s}$. By the density of $L^{2,s}$ in
$H^{-1,s}$, the continuity of $R_0(z)$ on $H^{-1,s}$ and the continuity of $H_0 - z$ (in the sense
of distributions), we can extend it to all $f \in H^{-1,s}$.
As $z \to \lambda \pm i \cdot 0$, we have $R_0(z)f \to R_0^{\pm}(\lambda)f$ in $H^{-1,-\sigma}$. Applying the
(constant coefficient) operator $H_0 - z$ yields, in the sense of distributions, $f =
(H_0 - z)R_0(z)f \to (H_0 - \lambda)R_0^{\pm}(\lambda)f$, which establishes (4.13) also for $\mathrm{Im}\, z = 0$.
Finally, the established continuity of $z \to R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$ (up to the
real boundary) and Eq. (4.13) imply the continuity of the map $z \to H_0 R_0(z) \in
B(H^{-1,s}, H^{-1,-\sigma})$.
The stronger continuity claim (4.12) follows since the norm of $H^{1,-\sigma}$ is equivalent
to the graph-norm of $H_0$ as a map of $H^{-1,-\sigma}$ to itself.

Remark 4.5. The main point here is the fact that the limiting values can be
extended continuously to the threshold at $\lambda = 0$.
In the neighborhood of any $\lambda > 0$ this proposition follows from
[68, Theorem 2.3], where a very different proof is used. In fact, using the terminology
there, the limit functions $R_0^{\pm}(\lambda)f$ are the unique (on either side of the
positive real axis) radiative functions and they satisfy a suitable "Sommerfeld radiation
condition". We recall it here for the sake of completeness, since we will need
it in the next section.

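Let $z = k^2 \in \overline{\mathbb{C}^{\pm}} \setminus \{0\}$, $\mathrm{Im}\, k \geq 0$. For $f \in H^{-1,s}$ let $u = R_0(z)f \in H^{1,-\sigma}$ be as
defined above. Then
$$ \mathcal{R}u = \int_{|x| > \Lambda_0} \left| r^{-\frac{n-1}{2}} \frac{\partial}{\partial r} \left( r^{\frac{n-1}{2}} u \right) - iku \right|^2 dx < \infty, \tag{4.14} $$
where $r = |x|$.
We shall refer to $\mathcal{R}u$ as the "radiative norm" of $u$.
Furthermore, we can take $\frac{1}{2} < s, \sigma$, as in Remark 4.3.

5. The Operator H
Fix $[\alpha, \beta] \subseteq \mathbb{R}$ and let

$$ \Omega = \{ z \in \mathbb{C}^{+} \,/\, \alpha < \mathrm{Re}\, z < \beta,\ 0 < \mathrm{Im}\, z < 1 \}. \tag{5.1} $$

Let $z = \mu + i\varepsilon \in \Omega$ and consider the equation

$$ (H - z)u = f \in H^{-1,s}, \quad u \in H^{1,-\sigma}, \quad (f \in H_0^{-1,s} \text{ if } n = 2). \tag{5.2} $$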

(Observe that in the case n = 2 also u L2,


0 .)

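With $\Lambda_0$ as in (1.2), let $\chi(x) \in C^{\infty}(\mathbb{R}^n)$ be such that
$$ \chi(x) = \begin{cases} 0, & |x| < \Lambda_0 + 1, \\ 1, & |x| > \Lambda_0 + 2. \end{cases} \tag{5.3} $$

Equation (5.2) can be written as

$$ (H_0 - z)(\chi u) = \chi f - 2\nabla\chi \cdot \nabla u - u\Delta\chi. \tag{5.4} $$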
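
Equation (5.4) follows from the product rule, since the cutoff is supported where $a(x) = I$. The sketch below writes the cutoff as $\chi$ and the radius in (1.2) as $\Lambda_0$; both symbols are labels assumed for this illustration.

```latex
% On supp(\chi) \subset \{|x| \ge \Lambda_0 + 1\} we have a(x) = I, so H = H_0 = -\Delta there
% and (5.2) reads -\Delta u - z u = f on supp(\chi). Hence
\begin{align*}
  (H_0 - z)(\chi u)
  &= -\Delta(\chi u) - z \chi u \\
  &= \chi\,(-\Delta u - z u) - 2\,\nabla\chi\cdot\nabla u - u\,\Delta\chi \\
  &= \chi f - 2\,\nabla\chi\cdot\nabla u - u\,\Delta\chi .
\end{align*}
% The last two terms are supported in the shell \Lambda_0 + 1 \le |x| \le \Lambda_0 + 2.
```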


Letting $\psi(x) = 1 - \chi(\frac{x}{2}) \in C_0^{\infty}(\mathbb{R}^n)$ and using Proposition 4.4 and standard elliptic
estimates, we obtain from (5.4)

$$ \|u\|_{1,-\sigma} \leq C[\|f\|_{-1,s} + \|\psi u\|_{0,-s}], \tag{5.5} $$

where $s$, $\sigma$ satisfy (4.7), and $C > 0$ depends only on $\Lambda_0$, $\sigma$, $s$, $n$.

We note that since $\psi$ is compactly supported, the term $\|\psi u\|_{0,-s}$ can be replaced
by $\|\psi u\|_{0,-s'}$ for any real $s'$.
In fact, the second term in the right-hand side can be dispensed with, as is
demonstrated in the following proposition.

Proposition 5.1. The solution to (5.2) satisfies,

$$ \|u\|_{1,-\sigma} \leq C \|f\|_{-1,s}, \tag{5.6} $$

where $s$, $\sigma$ satisfy (4.7) and $C > 0$ depends only on $\sigma$, $s$, $n$, $\Lambda_0$.

Proof. In view of (5.5), we only need to show that

$$ \|\psi u\|_{0,-s} \leq C \|f\|_{-1,s}. \tag{5.7} $$



Since L2,s (Rn ) is dense in H 1,s (Rn ) it suces to prove this inequality for f
L2,s (Rn ) H 1,s (Rn ) (using the norm of H 1,s ).
We argue by contradiction. Let

{zk }
k=1 , {fk }
k=1 L
2,s
(Rn ) H 1,s (Rn )

(with fk (0) = 0 if n = 2) and

{uk = R(zk )fk }


k=1 H
1,
(Rn )

be such that,

uk 0,s = 1, fk 1,s k 1 , k = 1, 2, . . .
(5.8)
zk z0
as k .

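By (5.5), $\{u_k\}_{k=1}^\infty$ is bounded in $H^{1,-\sigma}$. Replacing the sequence by a suitable subsequence (without changing notation) and using the Rellich compactness theorem we may assume that there exists a function $u \in L^{2,-\sigma'}$, $\sigma' > \sigma$, such that
\[
u_k \to u \ \text{ in } L^{2,-\sigma'} \ \text{ as } k \to \infty. \tag{5.9}
\]
Furthermore, by weak compactness we actually have (restricting again to a subsequence if needed)
\[
u_k \xrightarrow{w} u \ \text{ in } H^{1,-\sigma} \ \text{ as } k \to \infty. \tag{5.10}
\]
Since $H$ maps continuously $H^{1,-\sigma}$ into $H^{-1,-\sigma}$ we have
\[
Hu_k \xrightarrow{w} Hu \ \text{ in } H^{-1,-\sigma} \ \text{ as } k \to \infty,
\]
so that from $(H - z_k)u_k = f_k$ we infer that
\[
(H - z_0)u = 0. \tag{5.11}
\]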

In view of (5.4) and Remark 4.5, the functions $u_k$ are radiative functions. Since they are uniformly bounded in $H^{1,-\sigma}$ their radiative norms (4.14) are uniformly bounded.
Suppose first that $z_0 \neq 0$. In view of Remark 4.5, we can take $s, \sigma > \frac12$. Then the limit function $u$ is a radiative solution to $(H_0 - z_0)u = 0$ in $|x| > \Lambda_0 + 2$ and hence must vanish there (see [68]). By the unique continuation property of solutions to (5.11) we conclude that $u \equiv 0$. Thus by (5.9) we get $\|u_k\|_{0,-\sigma'} \to 0$ as $k \to \infty$ (and hence, on any fixed compact set, also convergence to zero in every weighted $L^2$ norm), which contradicts (5.8).
We are therefore left with the case $z_0 = 0$. In this case $u \in H^{1,-\sigma}$ satisfies the equation
\[
\nabla \cdot (a(x)\nabla u) = 0. \tag{5.12}
\]

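In particular, $\Delta u = 0$ in $|x| > \Lambda_0$ and
\[
\int_{\Lambda_0}^\infty r^{-2\sigma} \int_{|x|=r} \left( |u|^2 + \left|\frac{\partial u}{\partial r}\right|^2 \right) d\tau\, dr < \infty. \tag{5.13}
\]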

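Consider first the case $n \geq 3$. We may then use the representation of $u$ by spherical harmonics, so that, with $x = r\omega$, $\omega \in S^{n-1}$,
\[
u(x) = r^{-\frac{n-1}{2}} \left\{ \sum_{j=0}^\infty b_j r^{\mu_j} h_j(\omega) + \sum_{j=0}^\infty c_j r^{-\nu_j} h_j(\omega) \right\}, \quad r > \Lambda_0, \tag{5.14}
\]
where
\[
\mu_j(\mu_j - 1) = \nu_j(\nu_j + 1) = \lambda_j + \frac{(n-1)(n-3)}{4}, \tag{5.15}
\]
$0 = \lambda_0 < \lambda_1 \leq \lambda_2 \leq \cdots$ being the eigenvalues of the Laplace–Beltrami operator on $S^{n-1}$, and $h_j(\omega)$ the corresponding spherical harmonics. Since $\lambda_1 = n - 1$, it follows that
\[
\mu_0 = \frac{n-1}{2}, \quad \mu_0 + 1 \leq \mu_1 \leq \mu_2 \leq \cdots, \qquad \frac{n-3}{2} = \nu_0 < \nu_1 \leq \nu_2 \leq \cdots. \tag{5.16}
\]
We now observe that (5.13) forces
\[
b_0 = b_1 = \cdots = 0.
\]
Also, by (5.14)
\[
\int_{|x|=r} \frac{\partial u}{\partial r}\, d\tau = -(n-2)|S^{n-1}|c_0, \quad r > \Lambda_0, \tag{5.17}
\]
($|S^{n-1}|$ is the surface measure of $S^{n-1}$), while integrating (5.12) we get
\[
\int_{|x|=r} \frac{\partial u}{\partial r}\, d\tau = 0, \quad r > \Lambda_0. \tag{5.18}
\]
Thus $c_0 = 0$. It now follows from (5.14) that, for $r > \Lambda_0$,
\[
\int_{|x|=r} \left( |u|^2 + \left|\frac{\partial u}{\partial r}\right|^2 \right) d\tau \leq \left(\frac{r}{\Lambda_0}\right)^{-2\nu_1} \int_{|x|=\Lambda_0} \left( |u|^2 + \left|\frac{\partial u}{\partial r}\right|^2 \right) d\tau. \tag{5.19}
\]
Multiplying (5.12) by $\bar{u}$ and integrating by parts over the ball $|x| \leq r$, we infer from (5.19) that the boundary term vanishes as $r \to \infty$. Thus $u \equiv 0$, in contradiction to (5.8) and (5.9).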
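The exponents in (5.15) and (5.16) can be recovered from the radial equation. The following is a sketch under the assumption (consistent with (5.14)) that $u$ is harmonic outside a ball and is expanded in spherical harmonics $h_j$ with $-\Delta_{S^{n-1}} h_j = \lambda_j h_j$; the names $\mu_j$, $\nu_j$, $\lambda_j$ for the growing exponents, decaying exponents and eigenvalues are our labels for symbols dropped in the extracted text.

```latex
% Write u = r^{-(n-1)/2} v(r) h_j(\omega). Then \Delta u = 0 becomes
\[
  v'' - \frac{1}{r^{2}}\Bigl(\lambda_j + \frac{(n-1)(n-3)}{4}\Bigr) v = 0 ,
\]
% and v = r^{\mu} solves it if and only if
\[
  \mu(\mu-1) = \lambda_j + \frac{(n-1)(n-3)}{4}.
\]
% The two roots sum to 1; call them \mu_j > 0 and -\nu_j, so \nu_j = \mu_j - 1
% and \nu_j(\nu_j+1) equals the same constant.
% Mode j = 0 (\lambda_0 = 0):
\[
  \mu_0 = \frac{n-1}{2}, \qquad \nu_0 = \frac{n-3}{2},
  \qquad\text{so the radial part is}\qquad b_0 + c_0\, r^{-(n-2)} .
\]
% Mode j = 1 (\lambda_1 = n-1):
\[
  \mu(\mu-1) = \frac{(n-1)(n+1)}{4}
  \;\Longrightarrow\;
  \mu_1 = \frac{n+1}{2} = \mu_0 + 1, \qquad \nu_1 = \frac{n-1}{2}.
\]
```

In particular the $j = 0$ mode is a constant plus a multiple of $r^{-(n-2)}$, which is why the flux identities (5.17) and (5.18) single out the coefficient $c_0$.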
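It remains to deal with the case $n = 2$. Instead of (5.14), we now have
\[
u(x) = r^{-\frac12} \left\{ \tilde{b}_0 r^{\frac12} \log r + \sum_{j=0}^\infty b_j r^{\mu_j} h_j(\omega) + \sum_{j=1}^\infty c_j r^{-\nu_j} h_j(\omega) \right\}, \quad r > \Lambda_0, \tag{5.20}
\]
where $\mu_0 = \frac12$, $\mu_1 = \frac32$, $\nu_1 = \frac12$.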

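As in the derivation above, the condition (5.13) yields $b_0 = b_1 = \cdots = 0$. Also, we get $\tilde{b}_0 = 0$ in view of (5.18). It now follows that
\[
\int_{|x|=r} \bar{u}\, \frac{\partial u}{\partial r}\, d\tau = -2\pi \sum_{j=1}^\infty \left(\nu_j + \frac12\right) |c_j|^2 r^{-2\nu_j - 1}, \quad r \geq \Lambda_0, \tag{5.21}
\]
from which, as in the argument following (5.19), we deduce that $u \equiv 0$, again in contradiction to (5.8) and (5.9).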

Proof of Theorem A. Part (a) of the theorem is actually covered by Proposition 5.1. Moreover, the proposition implies that the operator-valued function
\[
z \to R(z) \in B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n)), \quad s > 1, \ z \in \Omega,
\]
is uniformly bounded, where $s, \sigma$ satisfy (4.7). Here and below replace $H^{-1,s}$ by $H_0^{-1,s}$ if $n = 2$.
We next show that the function $z \to R(z)$ can be continuously extended to $\overline{\Omega}$ in the weak topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n))$. To this end, we take $f \in H^{-1,s}(\mathbb{R}^n)$ and $g \in H^{-1,\sigma}(\mathbb{R}^n)$ and consider the function
\[
z \to \langle g, R(z)f \rangle, \quad z \in \Omega,
\]
where $\langle\ ,\ \rangle$ is the $(H^{-1,\sigma}, H^{1,-\sigma})$ pairing. We need to show that it can be extended continuously to $\overline{\Omega}$.
In view of the uniform boundedness established in Proposition 5.1, we can take $f, g$ in dense sets (of the respective spaces). In particular, we can take $f \in L^{2,s}(\mathbb{R}^n)$ and $g \in L^{2,\sigma}(\mathbb{R}^n)$, so that the continuity property in $\Omega$ is obvious.

Consider therefore a sequence $\{z_k\}_{k=1}^\infty \subset \Omega$ such that $z_k \to z_0 \in [\alpha, \beta]$ as $k \to \infty$.
The sequence $\{u_k = R(z_k)f\}_{k=1}^\infty$ is bounded in $H^{1,-\sigma}(\mathbb{R}^n)$. Therefore there exists a subsequence $\{u_{k_j}\}_{j=1}^\infty$ which converges to a function $u \in L^{2,-\sigma'}$, $\sigma' > \sigma$.
We can further assume that $u_{k_j} \xrightarrow{w} u$ in $H^{1,-\sigma}$ as $j \to \infty$. It follows that
\[
\langle g, u_{k_j} \rangle \to \langle g, u \rangle \ \text{ as } j \to \infty.
\]
Passing to the limit in $(H - z_{k_j})u_{k_j} = f$ we see that the limit function satisfies
\[
(H - z_0)u = f.
\]
We now repeat the argument employed in the proof of Proposition 5.1. If $z_0 \neq 0$ we note that the functions $\{u_k\}_{k=1}^\infty$ are radiative functions with uniformly bounded radiative norms (4.14) in $|x| > \Lambda_0 + 2$. The same is therefore true for the limit function $u$.
If $z_0 = 0$ the function $u \in H^{1,-\sigma}$ solves $Hu = f$.

In both cases this function is unique and we get the convergence
\[
\langle g, R(z_k)f \rangle = \langle g, u_k \rangle \to \langle g, u \rangle \ \text{ as } k \to \infty.
\]
We can now define
\[
R^+(z_0)f = u, \tag{5.22}
\]
with an analogous definition for $R^-(z_0)$.

At this point we can readily deduce the following extension of the resolvent $R(z)$ as the inverse of $H - z$:
\[
(H - z)R(z)f = f, \quad f \in H^{-1,s}, \ z \in \overline{\mathbb{C}^{\pm}}, \tag{5.23}
\]
where $R(z) = R^{\pm}(\lambda)$ when $z = \lambda \in \mathbb{R}$.

Indeed, observe that if $\operatorname{Im} z \neq 0$ then $(H - z)R(z)f = f$ for $f \in L^{2,s}(\mathbb{R}^n)$ and $(H - z)R(z) \in B(H^{-1,s}, H^{-1,-\sigma})$, so the assertion follows from the density of $L^{2,s}(\mathbb{R}^n)$ in $H^{-1,s}(\mathbb{R}^n)$. For $z = \lambda \in \mathbb{R}$ we use the (just established) weak continuity of the map $z \to (H - z)R(z)$ from $H^{-1,s}$ into $H^{-1,-\sigma}$ in $\overline{\mathbb{C}^{\pm}}$.
The passage from weak to uniform continuity (in the operator topology) is a classical argument due to Agmon ([1]). In [8], we have applied it in the case $n = 1$. Here we outline the proof in the case $n > 1$.
We establish first the continuity of the operator-valued function $z \to R(z)$, $z \in \overline{\Omega}$, in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), L^{2,-\sigma}(\mathbb{R}^n))$.
Let $\{z_k\}_{k=1}^\infty \subset \overline{\Omega}$ and $\{f_k\}_{k=1}^\infty \subset H^{-1,s}(\mathbb{R}^n)$ be sequences such that $z_k \to z \in \overline{\Omega}$ as $k \to \infty$ and $f_k$ converges weakly to $f$ in $H^{-1,s}(\mathbb{R}^n)$. It suffices to prove that the sequence $u_k = R(z_k)f_k$, which is bounded in $H^{1,-\sigma}(\mathbb{R}^n)$, converges strongly in $L^{2,-\sigma}(\mathbb{R}^n)$. Since this is clear if $\operatorname{Im} z \neq 0$, we can take $z \in [\alpha, \beta]$.
Note first that we can take $\frac12 < \sigma' < \sigma$ so that $s, \sigma'$ satisfy (4.7). Then the sequence $\{u_k\}_{k=1}^\infty$ is bounded in $H^{1,-\sigma'}(\mathbb{R}^n)$ and there exists a subsequence $\{u_{k_j}\}_{j=1}^\infty$ which converges to a function $u \in L^{2,-\sigma}$.
We can further assume that $u_{k_j} \xrightarrow{w} u$ in $H^{1,-\sigma'}$ as $j \to \infty$.
It follows that the limit function satisfies (see Eq. (5.23))
\[
(H - z)u = f.
\]
Once again we consider separately the cases $z \neq 0$ and $z = 0$.
In the first case, in view of (5.23) and Remark 4.5 the functions $u_k$ are radiative functions. Since they are uniformly bounded in $H^{1,-\sigma'}$ their radiative norms (4.14) are uniformly bounded, and we conclude that also $\mathcal{R}u < \infty$.
In the second case, we simply note that $u \in H^{1,-\sigma'}$ solves $Hu = f$.
As in the proof of Proposition 5.1 we conclude that in both cases the limit is unique, so that the whole sequence $\{u_k\}_{k=1}^\infty$ converges to $u$ in $L^{2,-\sigma}(\mathbb{R}^n)$.
Thus, the continuity in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), L^{2,-\sigma}(\mathbb{R}^n))$ is proved.

Finally, we claim that the operator-valued function $z \to R(z)$ is continuous in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n))$. Indeed, if we invoke Eq. (5.23) we get that also $z \to HR(z)$ is continuous in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), H^{-1,-\sigma}(\mathbb{R}^n))$.
Since the domain of $H$ in $H^{-1,-\sigma}(\mathbb{R}^n)$ is $H^{1,-\sigma}(\mathbb{R}^n)$, the claim follows. The conclusion of the theorem follows by taking $\sigma = s$.

Remark 5.2. In view of (5.4) and Remark 4.5 it follows that for $\lambda > 0$ the functions $R^{\pm}(\lambda)f$, $f \in H^{-1,s}$, are radiative, i.e. satisfy a Sommerfeld radiation condition.

6. The Eigenfunction Expansion Theorem


In this section we prove Theorem B stated in Sec. 3. We first collect some basic properties of the generalized eigenfunctions in the following proposition.

Proposition 6.1. The generalized eigenfunctions
\[
\tilde{\psi}_{\pm}(x, \xi) = \exp(i\xi x) + \psi_{\pm}(x, \xi)
\]
(see (3.4)) are in $H^1_{loc}(\mathbb{R}^n)$ for each fixed $\xi \in \mathbb{R}^n$ and satisfy the equation
\[
(H - |\xi|^2)\tilde{\psi}_{\pm}(x, \xi) = 0. \tag{6.1}
\]
In addition, these functions have the following properties:
(i) The map
\[
\mathbb{R}^n \ni \xi \to \psi_{\pm}(\cdot, \xi) \in H^{1,-s}(\mathbb{R}^n), \quad s > 1,
\]
is continuous.
(ii) For any compact $K \subset \mathbb{R}^n$ the family of functions $\{\tilde{\psi}_{\pm}(x, \xi),\ \xi \in K\}$ is uniformly bounded and uniformly Hölder continuous in $x \in \mathbb{R}^n$.

Proof. Since $(H - |\xi|^2)\exp(i\xi x) \in H^{-1,s}$, $s > 1$, Eq. (6.1) follows from the definition (3.3) in view of Eq. (5.23).
Furthermore, the map
\[
\mathbb{R}^n \ni \xi \to (H - |\xi|^2)\exp(i\xi x) \in H^{-1,s}(\mathbb{R}^n), \quad s > 1,
\]
is continuous, so the continuity assertion (i) follows from Theorem A.
For $s > 1$ the set of functions $\{\psi_{\pm}(\cdot, \xi),\ \xi \in K\}$ is uniformly bounded in $H^{1,-s}$. Thus, in view of (6.1), it follows from the De Giorgi–Nash–Moser Theorem [35, Chap. 8] that the set $\{\tilde{\psi}_{\pm}(x, \xi),\ \xi \in K\}$ is uniformly bounded and uniformly Hölder continuous in $\{|x| < R\}$ for every $R > 0$. In particular, we can take $R > \Lambda_0$ (see Eq. (1.2)). In the exterior domain $\{|x| > R\}$ the set $\{\psi_{\pm}(x, \xi),\ \xi \in K\}$ is bounded in $H^{1,-s}$, $s > 1$, and we have $(H_0 - |\xi|^2)\psi_{\pm}(x, \xi) = 0$.
In addition the boundary values $\{\psi_{\pm}(x, \xi),\ |x| = R,\ \xi \in K\}$ are uniformly bounded. From well-known properties of solutions of the Helmholtz equation, we conclude that this set is uniformly bounded and therefore, invoking once again the De Giorgi–Nash–Moser Theorem, uniformly Hölder continuous.

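Proof of Theorem B. We use the LAP proved in Theorem A, adapting the methodology of Agmon's proof ([1]) for the eigenfunction expansion in the case of Schrödinger operators with short-range potentials. To simplify notation, we prove for $F_+$.
Let $u \in H^1$ be compactly supported. For any $z$ such that $\operatorname{Im} z \neq 0$ we can write its Fourier transform as
\[
\hat{u}(\xi) = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} u(x)\exp(-i\xi x)\,dx = \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z} \int_{\mathbb{R}^n} u(x)(H_0 - z)\exp(-i\xi x)\,dx.
\]
Let $\theta \in C_0^\infty(\mathbb{R}^n)$ be a (real) cutoff function such that $\theta(x) = 1$ for $x$ in a neighborhood of the support of $u$.
We can rewrite the above equality as
\[
\hat{u}(\xi) = \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z} \langle (H_0 - z)u(x), \theta(x)\exp(i\xi x) \rangle,
\]
where $\langle\ ,\ \rangle$ is the $(H^{-1,s}, H^{1,-s})$ bilinear pairing (conjugate linear with respect to the second term).
We have therefore, with $f = (H - z)u$,
\begin{align*}
\hat{u}(\xi) &= \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z} \bigl( \langle (H - z)u(x), \theta(x)\exp(i\xi x) \rangle + \langle u(x), (H_0 - H)\exp(i\xi x) \rangle \bigr) \\
&= \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z} \bigl( \langle f(x), \theta(x)\exp(i\xi x) \rangle + \langle f(x), R(\bar{z})(H_0 - H)\exp(i\xi x) \rangle \bigr). \tag{6.2}
\end{align*}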

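Introducing the function
\[
\tilde{f}(\xi, z) = \hat{f}(\xi) + (2\pi)^{-\frac{n}{2}} \langle f(x), R(\bar{z})(H_0 - H)\exp(i\xi x) \rangle,
\]
we have
\[
\hat{u}(\xi) = \widehat{R(z)f}(\xi) = \frac{\tilde{f}(\xi, z)}{|\xi|^2 - z}, \quad \operatorname{Im} z \neq 0. \tag{6.3}
\]
We now claim that this equation is valid for all compactly supported $f \in H^{-1}$.
Indeed, let $u = R(z)f \in H^{1,-s}$, $s > 1$. Let $\varphi(x) = 1 - \chi(x)$, where $\chi(x)$ is defined in (5.3).
We set
\[
u_k(x) = \varphi(k^{-1}x)u(x), \quad f_k(x) = (H - z)(\varphi(k^{-1}x)u(x)), \quad k = 1, 2, 3, \ldots.
\]
The equality (6.3) is satisfied with $u, f$ replaced, respectively, by $u_k, f_k$.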



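Since
\[
\varphi(k^{-1}x)u(x) \to u(x) \ \text{ as } k \to \infty
\]
in $H^{1,-s}$, we have
\[
(H - z)(\varphi(k^{-1}x)u(x)) \to (H - z)u = f(x) \ \text{ as } k \to \infty
\]
in $H^{-1,-s}$, where in the last step we have used Eq. (5.23).
In addition, since $(H_0 - H)\exp(i\xi x)$ is compactly supported,
\begin{align*}
\langle f_k(x), R(\bar{z})(H_0 - H)\exp(i\xi x) \rangle &= \langle R(z)f_k(x), (H_0 - H)\exp(i\xi x) \rangle \\
&\to \langle R(z)f, (H_0 - H)\exp(i\xi x) \rangle = \langle f, R(\bar{z})(H_0 - H)\exp(i\xi x) \rangle \ \text{ as } k \to \infty.
\end{align*}
Combining these considerations with the continuity of the Fourier transform (on tempered distributions) we establish that (6.3) is valid for all compactly supported $f \in H^{-1}$.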
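Let $\{E(\lambda),\ \lambda \in \mathbb{R}\}$ be the spectral family associated with $H$. Let $A(\lambda) = \frac{d}{d\lambda}E(\lambda)$ be its weak derivative. More precisely, we use the well-known formula,
\[
A(\lambda) = \frac{1}{2\pi i} \lim_{\varepsilon \to 0+} (R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)),
\]
to get (using Theorem A), for any $f \in H^{-1,s}$, $s > 1$,
\[
\langle f, A(\lambda)f \rangle = \frac{1}{2\pi i} \langle f, (R^+(\lambda) - R^-(\lambda))f \rangle.
\]
We now take $f \in L^2$ and compactly supported. From the resolvent equation we infer
\[
R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon) = 2i\varepsilon\, R(\lambda + i\varepsilon)R(\lambda - i\varepsilon), \quad \varepsilon > 0,
\]
so that
\[
\langle f, A(\lambda)f \rangle = \lim_{\varepsilon \to 0+} \frac{\varepsilon}{\pi} \|R(\lambda + i\varepsilon)f\|_0^2, \quad \lambda > 0.
\]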
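The passage from the difference of resolvents to a squared norm uses only the first resolvent identity and the self-adjointness of $H$. A sketch, writing $\lambda$ for the spectral parameter, $\varepsilon > 0$ for the imaginary part (both names are our reading of symbols dropped in the extracted text), and $(\ ,\ )$ for the $L^2$ inner product taken linear in its first argument:

```latex
\begin{align*}
R(\lambda+i\varepsilon) - R(\lambda-i\varepsilon)
  &= \bigl[(\lambda+i\varepsilon) - (\lambda-i\varepsilon)\bigr]\,
     R(\lambda+i\varepsilon)R(\lambda-i\varepsilon)
   = 2i\varepsilon\, R(\lambda+i\varepsilon)R(\lambda-i\varepsilon), \\
\frac{1}{2\pi i}\bigl([R(\lambda+i\varepsilon)-R(\lambda-i\varepsilon)]f,\; f\bigr)
  &= \frac{\varepsilon}{\pi}\,
     \bigl(R(\lambda-i\varepsilon)f,\; R(\lambda+i\varepsilon)^{*} f\bigr)
   = \frac{\varepsilon}{\pi}\,\|R(\lambda-i\varepsilon)f\|_0^{2}
   = \frac{\varepsilon}{\pi}\,\|R(\lambda+i\varepsilon)f\|_0^{2}.
\end{align*}
% Here R(\lambda+i\varepsilon)^* = R(\lambda-i\varepsilon) because H is self-adjoint,
% and the last equality holds because R(z) is a normal operator.
```

The right-hand side is manifestly nonnegative, which is what allows Parseval's theorem to be applied to it directly.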

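Using Eq. (6.3) and Parseval's theorem we therefore have,
\[
\langle f, A(\lambda)f \rangle = \lim_{\varepsilon \to 0+} \frac{\varepsilon}{\pi} \|(|\xi|^2 - (\lambda + i\varepsilon))^{-1} \tilde{f}(\xi, \lambda + i\varepsilon)\|_0^2, \quad \lambda > 0. \tag{6.4}
\]
Note that $\tilde{f}(\xi, z)$ can be extended continuously as $z \to \lambda + i \cdot 0$ by
\[
\tilde{f}(\xi, \lambda) = \hat{f}(\xi) + (2\pi)^{-\frac{n}{2}} \langle f(x), R^-(\lambda)(H_0 - H)\exp(i\xi x) \rangle. \tag{6.5}
\]
In order to study properties of $\tilde{f}(\xi, z)$ as a function of $\xi$ we compute
\begin{align*}
\tilde{f}(\xi, z) &= \hat{f}(\xi) + (2\pi)^{-\frac{n}{2}} \sum_{l,j=1}^n \langle R(z)f(x), \partial_l (a_{l,j}(x) - \delta_{l,j})\partial_j \exp(i\xi x) \rangle \\
&= \hat{f}(\xi) + (2\pi)^{-\frac{n}{2}}\, i \sum_{l,j=1}^n \xi_j \int_{\mathbb{R}^n} (a_{l,j}(x) - \delta_{l,j})\,\partial_l (R(z)f(x)) \exp(-i\xi x)\,dx, \tag{6.6}
\end{align*}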

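where in the last step we have used that both $\partial_l(R(z)f(x))$ and $(a_{l,j}(x) - \delta_{l,j})\exp(i\xi x)$ are in $L^2$.
Consider now the integral
\[
g(\xi, z) = \int_{\mathbb{R}^n} (a_{l,j}(x) - \delta_{l,j})\,\partial_l (R(z)f(x)) \exp(-i\xi x)\,dx, \quad z \in \Omega,
\]
where $\Omega$ is as in (5.1).
In view of Theorem A the family $\{\partial_l R(z)f(x)\}_{z \in \Omega}$ is uniformly bounded in $L^{2,-s}$, $s > 1$, so by Parseval's theorem we get
\[
\|g(\cdot, z)\|_0 < C, \quad z \in \Omega,
\]
where $C$ only depends on $f$.
This estimate and (6.6) imply that, if $f \in L^2$ is compactly supported:
(i) The function
\[
\mathbb{R}^n \times \overline{\Omega} \ni (\xi, z) \to \tilde{f}(\xi, z)
\]
is continuous. For real $z$ it is given by (6.5).
(ii)
\[
\lim_{k \to \infty} \int_{|\xi| > k} (|\xi|^2 - z)^{-1} |\tilde{f}(\xi, z)|^2\, d\xi = 0,
\]
uniformly in $z \in \Omega$.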
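As $z \to |\xi|^2 + i \cdot 0$, we have by Theorem A and Eq. (3.4),
\[
\lim_{z \to |\xi|^2 + i0} \tilde{f}(\xi, z) = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} f(x)\,\overline{\tilde{\psi}_+(x, \xi)}\,dx = F_+ f(\xi), \tag{6.7}
\]
so that, taking (i) and (ii) into account we obtain from (6.4), for any compactly supported $f \in L^2$,
\[
\langle f, A(\lambda)f \rangle = \frac{1}{2\sqrt{\lambda}} \int_{|\xi|^2 = \lambda} |F_+ f(\xi)|^2\, d\tau, \quad \lambda > 0, \tag{6.8}
\]
where $d\tau$ is the surface Lebesgue measure.
It follows that for any $[\alpha, \beta] \subset [0, \infty)$,
\[
((E(\beta) - E(\alpha))f, f) = \int_\alpha^\beta \langle f, A(\lambda)f \rangle\, d\lambda = \int_{\alpha \leq |\xi|^2 \leq \beta} |F_+ f(\xi)|^2\, d\xi. \tag{6.9}
\]
Letting $\alpha \to 0$, $\beta \to \infty$, we get
\[
\|f\|_0 = \|F_+ f\|_0. \tag{6.10}
\]
Thus $f \to F_+ f \in L^2(\mathbb{R}^n)$ is an isometry for compactly supported functions, which can be extended by density to all $f \in L^2(\mathbb{R}^n)$.

Furthermore, since the spectrum of $H$ is entirely absolutely continuous, it follows that for every $f \in L^2$, Eq. (6.8) holds for almost all $\lambda > 0$ (with respect to the Lebesgue measure).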
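The weight in front of the surface integral in (6.8) is exactly what polar coordinates in the variable $\lambda = |\xi|^2$ produce, so (6.9) follows by integrating (6.8) in $\lambda$. A sketch, with $d\tau$ denoting surface measure on the sphere and $g$ any integrable function (both names are ours):

```latex
% Polar coordinates \xi = \rho\omega, then substitute \lambda = \rho^2,
% so that d\rho = d\lambda / (2\sqrt{\lambda}):
\begin{align*}
\int_{\alpha \le |\xi|^2 \le \beta} g(\xi)\,d\xi
  &= \int_{\sqrt{\alpha}}^{\sqrt{\beta}} \Bigl(\int_{|\xi| = \rho} g\,d\tau\Bigr)\,d\rho
   = \int_{\alpha}^{\beta} \frac{1}{2\sqrt{\lambda}}
     \Bigl(\int_{|\xi|^2 = \lambda} g\,d\tau\Bigr)\,d\lambda .
\end{align*}
% With g = |F_+ f|^2 the integrand on the right is <f, A(\lambda) f> by (6.8),
% and integrating the weak derivative A(\lambda) over [\alpha, \beta]
% gives ((E(\beta) - E(\alpha)) f, f).
```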

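Let $f \in D(H)$. By the spectral theorem
\[
\langle Hf, A(\lambda)Hf \rangle = \lambda^2 \langle f, A(\lambda)f \rangle = \frac{1}{2\sqrt{\lambda}} \int_{|\xi|^2 = \lambda} \bigl| |\xi|^2 F_+ f(\xi) \bigr|^2\, d\tau, \quad \lambda > 0.
\]
In particular,
\[
\|Hf\|_0^2 = \int_{\mathbb{R}^n} \bigl| |\xi|^2 F_+ f(\xi) \bigr|^2\, d\xi. \tag{6.11}
\]
Conversely, if the right-hand side of (6.11) is finite, then $\int_0^\infty \lambda^2 \langle f, A(\lambda)f \rangle\, d\lambda < \infty$, so $f \in D(H)$.
The adjoint operator $F_+^*$ is a partial isometry (on the range of $F_+$). If $f(x) \in L^2(\mathbb{R}^n)$ is compactly supported and $g(\xi) \in L^2(\mathbb{R}^n)$ is likewise compactly supported then
\begin{align*}
(F_+ f, g) &= (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} f(x)\,\overline{\tilde{\psi}_+(x, \xi)}\,dx \right) \overline{g(\xi)}\,d\xi \\
&= (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} f(x)\, \overline{\left( \int_{\mathbb{R}^n} g(\xi)\,\tilde{\psi}_+(x, \xi)\,d\xi \right)}\,dx,
\end{align*}
where in the change of order of integration Proposition 6.1 was taken into account.
It follows that for a compactly supported $g(\xi) \in L^2(\mathbb{R}^n)$,
\[
(F_+^* g)(x) = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} g(\xi)\,\tilde{\psi}_+(x, \xi)\,d\xi, \tag{6.12}
\]
and the extension to all $g \in L^2(\mathbb{R}^n)$ is obtained by the fact that $F_+^*$ is a partial isometry.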
Now if $f \in D(H)$, $g \in L^2(\mathbb{R}^n)$, we have
\[
(Hf, g) = \int_{\mathbb{R}^n} |\xi|^2 F_+ f(\xi)\,\overline{F_+ g(\xi)}\,d\xi = \int_{\mathbb{R}^n} F_+^*(|\xi|^2 F_+ f(\xi))\,\overline{g(\xi)}\,d\xi,
\]
which is the statement (3.6) of the theorem.
It follows from the spectral theorem that for every interval $J = [\alpha, \beta] \subset [0, \infty)$ and for every $f \in L^2(\mathbb{R}^n)$ we have, with $E_J = E(\beta) - E(\alpha)$ and $\chi_J$ the characteristic function of $J$,
\[
E_J f(x) = F_+^*(\chi_J(|\xi|^2) F_+ f(\xi)),
\]
or
\[
F_+ E_J f(\xi) = \chi_J(|\xi|^2) F_+ f(\xi).
\]
It remains to prove that the isometry $F_+$ is onto (and hence unitary).
So, suppose to the contrary that for some nonzero $g(\xi) \in L^2(\mathbb{R}^n)$
\[
(F_+^* g)(x) = 0.
\]
In particular, for any $f \in L^2(\mathbb{R}^n)$ and any interval $J$ as above,
\[
0 = (E_J f, F_+^* g) = (F_+ E_J f, g) = (\chi_J(|\xi|^2) F_+ f(\xi), g(\xi)) = (F_+ f(\xi), \chi_J(|\xi|^2) g(\xi)),
\]
so that $F_+^*(\chi_J(|\xi|^2) g(\xi)) = 0$.

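By Eq. (6.12), we have, for any $0 \leq \alpha < \beta$,
\[
\int_{\alpha < |\xi|^2 < \beta} g(\xi)\,\tilde{\psi}_+(x, \xi)\,d\xi = 0,
\]
so that, in view of the continuity properties of $\tilde{\psi}_+(x, \xi)$ (see Proposition 6.1), for a.e. $\lambda \in (0, \infty)$,
\[
\int_{|\xi|^2 = \lambda} g(\xi)\,\tilde{\psi}_+(x, \xi)\,d\tau = 0. \tag{6.13}
\]
From the definition (3.4), we get
\[
\int_{|\xi|^2 = \lambda} g(\xi)\exp(i\xi x)\,d\tau - \int_{|\xi|^2 = \lambda} g(\xi)\,R^-(\lambda)((H - \lambda)\exp(i\xi x))\,d\tau = 0. \tag{6.14}
\]
Since $(H - \lambda)\exp(i\xi x)$ is compactly supported (when $|\xi|^2 = \lambda$), the continuity property of $R^-(\lambda)$ enables us to write
\[
\int_{|\xi|^2 = \lambda} g(\xi)\,R^-(\lambda)((H - \lambda)\exp(i\xi x))\,d\tau = R^-(\lambda) \left( \int_{|\xi|^2 = \lambda} g(\xi)(H - \lambda)\exp(i\xi x)\,d\tau \right),
\]
which, by Remark 5.2, satisfies a Sommerfeld radiation condition. We conclude that the function
\[
G(x) = \int_{|\xi|^2 = \lambda} g(\xi)\exp(i\xi x)\,d\tau \in H^{1,-s}, \quad s > \frac12,
\]
is a radiative solution (see Remark 4.5) of $(-\Delta - \lambda)G = 0$, and hence must vanish.
Since this holds for a.e. $\lambda > 0$, we get $g(\xi) = 0$, hence $g = 0$.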

7. Global Spacetime Estimates


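Proof of Theorem C. (a) Define, with $G = H^{\frac12}$,
\[
u_{\pm} = \frac12 (Gu \mp i\partial_t u). \tag{7.1}
\]
Then
\[
\partial_t u_{\pm} = \pm iGu_{\pm} \mp \frac{i}{2} f. \tag{7.2}
\]
Defining
\[
U(t) = \begin{pmatrix} u_+(t) \\ u_-(t) \end{pmatrix} \tag{7.3}
\]
we have
\[
i^{-1}U'(t) = KU + F, \qquad K = \begin{pmatrix} G & 0 \\ 0 & -G \end{pmatrix}, \quad F(t) = \begin{pmatrix} -\frac12 f(\cdot, t) \\ \frac12 f(\cdot, t) \end{pmatrix}. \tag{7.4}
\]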
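A short check of how the first-order system arises from the second-order equation. This sketch assumes the wave equation has the form $\partial_t^2 u + Hu = f$ with $G^2 = H$; the signs below are one self-consistent choice (the one for which the generator is $\mathrm{diag}(G, -G)$), since the signs in the extracted formulas are lost.

```latex
\begin{align*}
u_\pm &:= \tfrac12\bigl(Gu \mp i\,\partial_t u\bigr), \\
\partial_t u_\pm
  &= \tfrac12\bigl(G\,\partial_t u \mp i\,\partial_t^2 u\bigr)
   = \tfrac12\bigl(G\,\partial_t u \mp i f \pm i\,G^2 u\bigr) \\
  &= \pm iG\cdot\tfrac12\bigl(Gu \mp i\,\partial_t u\bigr) \mp \tfrac{i}{2} f
   = \pm iG\,u_\pm \mp \tfrac{i}{2} f .
\end{align*}
% Dividing by i: i^{-1} \partial_t u_\pm = \pm G u_\pm \mp f/2,
% i.e. a diagonal system with generator diag(G, -G) and forcing (-f/2, f/2).
```

With the opposite sign convention for $u_\pm$ the two components simply swap roles; the estimates that follow are insensitive to this choice.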

Note that, as is common when treating evolution equations, we write $U(t), F(t), \ldots$ for $U(x, t), F(x, t), \ldots$ when there is no risk of confusion.
The operator $K$ is a self-adjoint operator on $D = L^2(\mathbb{R}^n) \oplus L^2(\mathbb{R}^n)$. Its spectral family $E_K(\lambda)$ is given by $E_K(\lambda) = E_G(\lambda) \oplus (I - E_G(-\lambda))$, $\lambda \in \mathbb{R}$, where $E_G$ is the spectral family of $G$.
Let $E(\lambda)$ be the spectral family of $H$, and let $A(\lambda) = \frac{d}{d\lambda}E(\lambda)$ be its weak derivative (3.2). By the definition of $G$ we have
\[
E_G(\lambda) = E(\lambda^2),
\]
hence its weak derivative is given by
\[
A_G(\lambda) = \frac{d}{d\lambda} E_G(\lambda) = 2\lambda A(\lambda^2), \quad \lambda > 0. \tag{7.5}
\]
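Relation (7.5) is the chain rule applied to the spectral family. A sketch, assuming $G = H^{1/2} \geq 0$, so that $G \leq \lambda$ exactly when $H \leq \lambda^2$ for $\lambda \geq 0$ (the symbols $\lambda$, $\mu$ are our names for the spectral parameters):

```latex
\begin{align*}
E_G(\lambda) &= E(\lambda^{2}), \qquad \lambda \ge 0, \\
A_G(\lambda) = \frac{d}{d\lambda}E_G(\lambda)
  &= \frac{d}{d\lambda}\,E(\lambda^{2})
   = 2\lambda\,A(\lambda^{2}), \\
\|A_G(\lambda)\| &= 2\mu^{1/2}\,\|A(\mu)\| \qquad\text{with } \mu = \lambda^{2}.
\end{align*}
```

Hence a bound on $\mu^{1/2}\|A(\mu)\|$ for large $\mu$ is precisely what controls $A_G(\lambda)$ as $\lambda \to \infty$, while near $\lambda = 0$ the factor $2\lambda$ means that boundedness of $A(\mu)$ near $\mu = 0$ suffices.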
d
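In view of the LAP (Theorem A), we therefore have that the operator-valued function
\[
A_G(\lambda) \in B(L^{2,s}(\mathbb{R}^n), L^{2,-s}(\mathbb{R}^n)),
\]
is continuous for $\lambda \geq 0$.
Denoting $D_s = L^{2,s}(\mathbb{R}^n) \oplus L^{2,s}(\mathbb{R}^n)$, it follows that
\[
A_K(\lambda) = \frac{d}{d\lambda} E_K(\lambda) = A_G(\lambda) \oplus A_G(-\lambda), \quad \lambda \in \mathbb{R},
\]
is continuous with values in $B(D_s, D_{-s})$ for $s > 1$.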
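Making use of Hypotheses (H1) and (H2), we invoke [65, Theorem 5.1] to conclude that $\limsup_{\mu \to \infty} \mu^{\frac12} \|A(\mu)\|_{B(L^{2,s}, L^{2,-s})} < \infty$, so that by (7.5) there exists a constant $C > 0$, such that
\[
\|A_G(\lambda)\|_{B(L^{2,s}, L^{2,-s})} < C, \quad \lambda \geq 0. \tag{7.6}
\]
It follows that also
\[
\|A_K(\lambda)\|_{B(D_s, D_{-s})} < C, \quad s > 1, \ \lambda \in \mathbb{R}. \tag{7.7}
\]
Let $\langle\ ,\ \rangle$ be the bilinear pairing between $D_{-s}$ and $D_s$ (conjugate linear with respect to the second term).
For any $\varphi, \psi \in D_s$ we have, in view of the fact that $A_K(\lambda)$ is a weak derivative of a spectral measure,
\begin{align*}
\text{(i)}\quad & |\langle A_K(\lambda)\varphi, \psi \rangle|^2 \leq \langle A_K(\lambda)\varphi, \varphi \rangle \cdot \langle A_K(\lambda)\psi, \psi \rangle, \\
\text{(ii)}\quad & \int_{-\infty}^{\infty} \langle A_K(\lambda)\varphi, \varphi \rangle\, d\lambda = \|\varphi\|^2_{L^2(\mathbb{R}^n) \oplus L^2(\mathbb{R}^n)}. \tag{7.8}
\end{align*}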

We first treat the pure Cauchy problem, i.e. $f \equiv 0$.

To estimate $U(x, t) = e^{itK}U(x, 0)$ we use a duality argument. Some of the following computations will be rather formal, but they can easily be justified by a density argument, as in [7, 17]. We shall use $((\ ,\ ))$ for the scalar product in $L^2(\mathbb{R}^{n+1}) \oplus L^2(\mathbb{R}^{n+1})$.
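Take $w(x, t) \in C_0^\infty(\mathbb{R}^{n+1}) \oplus C_0^\infty(\mathbb{R}^{n+1})$. Then,
\begin{align*}
((U, w)) &= \int_{-\infty}^{\infty} \int_{\mathbb{R}^n} e^{itK}U(x, 0) \cdot \overline{w(x, t)}\,dx\,dt \\
&= \int_{-\infty}^{\infty} \left\langle A_K(\lambda)U(x, 0), \int_{-\infty}^{\infty} e^{-it\lambda} w(\cdot, t)\,dt \right\rangle d\lambda \\
&= (2\pi)^{1/2} \int_{-\infty}^{\infty} \langle A_K(\lambda)U(x, 0), \tilde{w}(\cdot, \lambda) \rangle\, d\lambda,
\end{align*}
where
\[
\tilde{w}(x, \lambda) = (2\pi)^{-\frac12} \int_{\mathbb{R}} w(x, t)e^{-it\lambda}\,dt.
\]
Noting (7.8) and (7.7), and using the Cauchy–Schwarz inequality
\begin{align*}
|((U, w))| &\leq (2\pi)^{1/2} \|U(x, 0)\|_0 \left( \int_{-\infty}^{\infty} \langle A_K(\lambda)\tilde{w}(\cdot, \lambda), \tilde{w}(\cdot, \lambda) \rangle\, d\lambda \right)^{\frac12} \\
&\leq C \|U(x, 0)\|_0 \left( \int_{-\infty}^{\infty} \|\tilde{w}(\cdot, \lambda)\|^2_{D_s}\, d\lambda \right)^{\frac12}.
\end{align*}
It follows from the Plancherel theorem that
\[
|((U, w))| \leq C \|U(x, 0)\|_0 \left( \int_{\mathbb{R}} \|w(\cdot, t)\|^2_{D_s}\, dt \right)^{\frac12}.
\]
Let $\Phi(x, t) \in C_0^\infty(\mathbb{R}^{n+1}) \oplus C_0^\infty(\mathbb{R}^{n+1})$, and take $w(x, t) = (1 + |x|^2)^{-\frac{s}{2}} \Phi(x, t)$, so that
\[
|(((1 + |x|^2)^{-\frac{s}{2}} U, \Phi))| \leq C \|U(x, 0)\|_0 \|\Phi\|_{L^2(\mathbb{R}^{n+1})}.
\]
This concludes the proof of the part involving the Cauchy data in (3.14), in view of (7.3).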
To prove the part concerning the inhomogeneous equation, it suffices to take $u_0 = v_0 = 0$.
In this case the Duhamel principle yields, for $t > 0$,
\[
U(t) = i \int_0^t e^{i(t - \tau)K} F(\tau)\, d\tau,
\]
where we have used the form (7.4) of the equation.



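Integrating the inequality
\[
\|U(t)\|_{D_{-s}} \leq \int_0^t \|e^{i(t - \tau)K} F(\tau)\|_{D_{-s}}\, d\tau,
\]
we get
\[
\int_0^\infty \|U(t)\|_{D_{-s}}\, dt \leq \int_0^\infty \int_\tau^\infty \|e^{i(t - \tau)K} F(\tau)\|_{D_{-s}}\, dt\, d\tau.
\]
Invoking the first part of the proof we obtain
\[
\int_0^\infty \|U(t)\|_{D_{-s}}\, dt \leq C \int_0^\infty \|F(\tau)\|_0\, d\tau,
\]
which proves the part related to the inhomogeneous term in (3.14).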


(b) Define
\[
v_{\pm}(x, t) = \exp(\pm itG)\,\phi_{\pm}(x),
\]
where
\[
\phi_{\pm}(x) = \frac12 [u_0(x) \mp iG^{-1}v_0(x)].
\]
Then clearly
\[
u(x, t) = v_+(x, t) + v_-(x, t). \tag{7.9}
\]
We establish the estimate (3.15) for $v_+$.
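The decomposition (7.9) can be checked against the usual cosine/sine representation of the solution. The sketch assumes the homogeneous problem $\partial_t^2 u + Hu = 0$, $u(0) = u_0$, $\partial_t u(0) = v_0$ with $G = H^{1/2}$, and takes the two half-waves to be $\exp(\pm itG)$ applied to data named $\phi_\pm$ (the name and the factor $i$ are our reading; the extracted text drops several symbols at this point).

```latex
\begin{align*}
u(t) &= \cos(tG)\,u_0 + G^{-1}\sin(tG)\,v_0 \\
     &= \tfrac12\bigl(e^{itG}+e^{-itG}\bigr)u_0
        + \tfrac{1}{2i}\,G^{-1}\bigl(e^{itG}-e^{-itG}\bigr)v_0 \\
     &= e^{itG}\,\underbrace{\tfrac12\bigl[u_0 - iG^{-1}v_0\bigr]}_{\phi_+}
      \;+\; e^{-itG}\,\underbrace{\tfrac12\bigl[u_0 + iG^{-1}v_0\bigr]}_{\phi_-}.
\end{align*}
% Check at t = 0: \phi_+ + \phi_- = u_0 and iG(\phi_+ - \phi_-) = v_0.
```

Each half-wave is a unitary group applied to fixed data, so the duality argument used for $e^{itK}$ carries over with $A_G$ in place of $A_K$.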
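Taking $w(x, t) \in C_0^\infty(\mathbb{R}^{n+1})$ we proceed as in the first part of the proof. Let $\langle\ ,\ \rangle$ be the $L^{2,-s}(\mathbb{R}^n)$, $L^{2,s}(\mathbb{R}^n)$ pairing. Then
\begin{align*}
(v_+, w) &= \int_{-\infty}^{\infty} \int_{\mathbb{R}^n} e^{itG}\phi_+(x) \cdot \overline{w(x, t)}\,dx\,dt \\
&= \int_0^\infty \left\langle A_G(\lambda)\phi_+, \int_{-\infty}^{\infty} e^{-it\lambda} w(\cdot, t)\,dt \right\rangle d\lambda \\
&= (2\pi)^{1/2} \int_0^\infty \langle A_G(\lambda)\phi_+, \tilde{w}(\cdot, \lambda) \rangle\, d\lambda,
\end{align*}
where
\[
\tilde{w}(x, \lambda) = (2\pi)^{-\frac12} \int_{\mathbb{R}} w(x, t)e^{-it\lambda}\,dt.
\]
Noting (7.6) as well as the inequalities (7.8) (with $A_G$ replacing $A_K$) and using the Cauchy–Schwarz inequality
\begin{align*}
|(v_+, w)| &\leq (2\pi)^{1/2} \|\phi_+\|_0 \left( \int_0^\infty \langle A_G(\lambda)\tilde{w}(\cdot, \lambda), \tilde{w}(\cdot, \lambda) \rangle\, d\lambda \right)^{1/2} \\
&\leq C \|\phi_+\|_0 \left( \int_0^\infty \|\tilde{w}(\cdot, \lambda)\|^2_{0,s}\, d\lambda \right)^{\frac12}.
\end{align*}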

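The Plancherel theorem yields
\[
|(v_+, w)| \leq C \|\phi_+\|_0 \left( \int_{\mathbb{R}} \|w(\cdot, t)\|^2_{0,s}\, dt \right)^{1/2}.
\]
Let $\Phi \in C_0^\infty(\mathbb{R}^{n+1})$, and take $w(x, t) = (1 + |x|^2)^{-\frac{s}{2}} \Phi(x, t)$, so that
\[
|((1 + |x|^2)^{-\frac{s}{2}} v_+, \Phi)| \leq C \|\phi_+\|_0 \|\Phi\|_{L^2(\mathbb{R}^{n+1})}.
\]
This (with the similar estimate for $v_-$) concludes the proof of the estimate (3.15).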

Remark 7.1 (Optimality of the Requirement $s > 1$). A key point in the proof was the use of the uniform bound (7.6). In view of the relation (7.5), this is reduced to the uniform boundedness of $\lambda A(\lambda^2)$, $\lambda \geq 0$, in $B(L^{2,s}, L^{2,-s})$. By [65, Theorem 5.1] the boundedness at infinity, $\limsup_{\mu \to \infty} \mu^{\frac12} \|A(\mu)\| < \infty$, holds already with $s > \frac12$. Thus the further restriction $s > 1$ is needed in order to ensure the boundedness at $\lambda = 0$ (Theorem A).
Remark 7.2. Clearly we can take [0, T ] as the time interval, instead of R, for any
T > 0.

Acknowledgments
This work was partially done during my visits to the Department of Mathematics at Stanford University (Spring 2004) and the Department of Mathematics of the Université de Provence (Marseille, Spring 2006). I am grateful for the hospitality of both departments with special thanks to Professors Rafe Mazzeo and Yves Dermenjian. In addition, very stimulating discussions with S. Agmon, K. Hidano, Y. Pinchover, M. Ruzhansky, M. Sugimoto and T. Umeda are happily acknowledged.
The author thanks the referee for calling his attention to the works [19–21].

References
[1] S. Agmon, Spectral properties of Schrödinger operators and scattering theory, Ann. Sc. Norm. Super. Pisa 2 (1975) 151–218.
[2] S. Agmon, J. Cruz-Sampedro and I. Herbst, Spectral properties of Schrödinger operators with potentials of order zero, J. Funct. Anal. 167 (1999) 345–369.
[3] Y. Ameur and B. Walther, Smoothing estimates for the Schrödinger equation with an inverse-square potential, preprint (2007).
[4] M. Beals and W. Strauss, $L^p$ estimates for the wave equation with a potential, Comm. Partial Differential Equations 18 (1993) 1365–1397.
[5] M. Ben-Artzi, Unitary equivalence and scattering theory for Stark-like Hamiltonians, J. Math. Phys. 25 (1984) 951–964.
[6] M. Ben-Artzi, Global estimates for the Schrödinger equation, J. Funct. Anal. 107 (1992) 362–368.
[7] M. Ben-Artzi, Regularity and smoothing for some equations of evolution, in Nonlinear Partial Differential Equations and Their Applications; Collège de France Seminar, Longman Scientific, Vol. 11, eds. H. Brezis and J. L. Lions (Longman Sci. Tech. 1994), pp. 1–12.

[8] M. Ben-Artzi, On spectral properties of the acoustic propagator in a layered band, J. Differential Equations 136 (1997) 115–135.
[9] M. Ben-Artzi, Spectral theory for divergence-form operators, in Spectral and Scattering Theory and Related Topics, ed. H. Ito, Vol. 1607 (RIMS Kokyuroku, 2008), pp. 77–84.
[10] M. Ben-Artzi, Y. Dermenjian and J.-C. Guillot, Analyticity properties and estimates of resolvent kernels near thresholds, Comm. Partial Differential Equations 25 (2000) 1753–1770.
[11] M. Ben-Artzi, Y. Dermenjian and A. Monsef, Resolvent kernel estimates near thresholds, Differential Integral Equations 19 (2006) 1–14.
[12] M. Ben-Artzi and A. Devinatz, The limiting absorption principle for a sum of tensor products and applications to the spectral theory of differential operators, J. Anal. Math. 43 (1983/84) 215–250.
[13] M. Ben-Artzi and A. Devinatz, The Limiting Absorption Principle for Partial Differential Operators, Memoirs of the AMS, Vol. 364 (Amer. Math. Soc., 1987).
[14] M. Ben-Artzi and A. Devinatz, Local smoothing and convergence properties for Schrödinger-type equations, J. Funct. Anal. 101 (1991) 231–254.
[15] M. Ben-Artzi and A. Devinatz, Regularity and decay of solutions to the Stark evolution equations, J. Funct. Anal. 154 (1998) 501–512.
[16] M. Ben-Artzi and S. Klainerman, Decay and regularity for the Schrödinger equation, J. Anal. Math. 58 (1992) 25–37.
[17] M. Ben-Artzi and J. Nemirovsky, Remarks on relativistic Schrödinger operators and their extensions, Ann. Inst. H. Poincaré 67 (1997) 29–39.
[18] Ju. M. Berezanskii, Expansion in Eigenfunctions of Selfadjoint Operators, Translations of Mathematical Monographs, Vol. 17 (Amer. Math. Soc., 1968).
[19] J.-F. Bony and D. Häfner, The semilinear wave equation on asymptotically Euclidean manifolds, arXiv:0810.0464.
[20] J.-F. Bony and D. Häfner, Low frequency resolvent estimates for long range perturbations of the Euclidean Laplacian, arXiv:0903.5531.
[21] J.-M. Bouclet, Low frequency estimates for long range perturbations in divergence form, arXiv:0806.3377.
[22] A. Boutet de Monvel-Berthier and D. Manda, Spectral and scattering theory for wave propagation in perturbed stratified media, J. Math. Anal. Appl. 191 (1995) 137–167.
[23] F. E. Browder, The eigenfunction expansion theorem for the general self-adjoint singular elliptic partial differential operator. I. The analytical foundation, Proc. Natl. Acad. Sci. 40 (1954) 454–459.
[24] N. Burq, Semi-classical estimates for the resolvent in nontrapping geometries, Int. Math. Res. Not. 5 (2002) 221–241.
[25] N. Burq, Global Strichartz estimates for nontrapping geometries: About an article by H. Smith and C. Sogge, Comm. Partial Differential Equations 28 (2003) 1675–1683.
[26] H. Chihara, Smoothing effects of dispersive pseudodifferential equations, Comm. Partial Differential Equations 27 (2002) 1953–2005.
[27] A. Cohen and T. Kappeler, Scattering and inverse scattering for steplike potentials in the Schrödinger equation, Indiana Univ. Math. J. 34 (1985) 127–180.
[28] C. Cohen-Tannoudji, B. Diu and F. Laloë, Quantum Mechanics (John Wiley, 1977).
[29] E. Croc and Y. Dermenjian, Analyse spectrale d'une bande acoustique multistratifiée. Partie I: Principe d'absorption limite pour une stratification simple, SIAM J. Math. Anal. 26 (1995) 880–924.
[30] P. D'Ancona and L. Fanelli, Strichartz and smoothing estimates for dispersive equations with magnetic potentials, Comm. Partial Differential Equations 33 (2008) 1082–1112.

[31] S. DeBièvre and W. Pravica, Spectral analysis for optical fibers and stratified fluids I: The limiting absorption principle, J. Funct. Anal. 98 (1991) 404–436.
[32] V. G. Deich, E. L. Korotayev and D. R. Yafaev, Theory of potential scattering, taking into account spatial anisotropy, J. Soviet Math. 34 (1986) 2040–2050.
[33] S.-I. Doi, Smoothing effects of Schrödinger evolution groups on Riemannian manifolds, Duke Math. J. 82 (1996) 679–706.
[34] D. M. Eidus, The principle of limiting absorption, in American Mathematical Society Translations, Series 2, Vol. 47 (Amer. Math. Soc., Providence, 1965), pp. 157–192. (Originally in Russian, Mat. Sb. 57 (1962) 13–44).
[35] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order (Springer-Verlag, 1977).
[36] M. Goldberg and W. Schlag, A limiting absorption principle for the three-dimensional Schrödinger equation with $L^p$ potentials, Int. Math. Res. Not. 75 (2004) 4049–4071.
[37] I. Herbst, Spectral theory of the operator $(p^2 + m^2)^{\frac12} - Ze^2/r$, Comm. Math. Phys. 53 (1977) 285–294.
[38] I. Herbst, Spectral and scattering theory for Schrödinger operators with potentials independent of $|x|$, Amer. J. Math. 113 (1991) 509–565.
[39] K. Hidano, Morawetz–Strichartz estimates for spherically symmetric solutions to wave equations and applications to semilinear Cauchy problems, Differential Integral Equations 20 (2007) 735–754.
[40] K. Hidano, J. Metcalfe, H. F. Smith, C. D. Sogge and Y. Zhou, On abstract Strichartz estimates and the Strauss conjecture for nontrapping obstacles, to appear in Trans. Amer. Math. Soc. (2009); http://front.math.ucdavis.edu/0805.1673.
[41] L. Hörmander, The Analysis of Linear Partial Differential Operators II (Springer-Verlag, 1983).
[42] T. Hoshiro, On weighted $L^2$ estimates of solutions to wave equations, J. Anal. Math. 72 (1997) 127–140.
[43] T. Hoshiro, Decay and regularity for dispersive equations with constant coefficients, J. Anal. Math. 91 (2003) 211–230.
[44] T. Ikebe, Eigenfunction expansions associated with the Schrödinger operators and their application to scattering theory, Arch. Ration. Mech. Anal. 5 (1960) 1–34.
[45] T. Ikebe and Y. Saito, Limiting absorption method and absolute continuity for the Schrödinger operators, J. Math. Kyoto Univ. Ser. A 7 (1972) 513–542.
[46] T. Ikebe and T. Tayoshi, Wave and scattering operators for second-order elliptic operators in $\mathbb{R}^n$, Publ. RIMS Kyoto Univ. Ser. A 4 (1968) 483–496.
[47] A. D. Ionescu and W. Schlag, Agmon–Kato–Kuroda theorems for a large class of perturbations, Duke Math. J. 131 (2006) 397–440.
[48] M. Kadowaki, Low and high energy resolvent estimates for wave propagation in stratified media and their applications, J. Differential Equations 179 (2002) 246–277.
[49] M. Kadowaki, Resolvent estimates and scattering states for dissipative systems, Publ. RIMS Kyoto Univ. Ser. A 38 (2002) 191–209.
[50] T. Kato, Perturbation Theory for Linear Operators (Springer-Verlag, 1966).
[51] T. Kato and S. T. Kuroda, The abstract theory of scattering, Rocky Mountain J. Math. 1 (1971) 127–171.
[52] T. Kato and K. Yajima, Some examples of smooth operators and the associated smoothing effect, Rev. Math. Phys. 1 (1989) 481–496.
[53] K. Kikuchi and H. Tamura, The limiting amplitude principle for acoustic propagators in perturbed stratified fluids, J. Differential Equations 93 (1991) 260–282.
[54] V. G. Maz'ya and T. O. Shaposhnikova, Theory of Sobolev Multipliers (Springer-Verlag, 2008).

[55] K. Mochizuki, Scattering theory for wave equations with dissipative terms, Publ. RIMS Kyoto Univ. Ser. A 12 (1976) 383–390.
[56] C. S. Morawetz, Time decay for the Klein–Gordon equation, Proc. Roy. Soc. Ser. A 306 (1968) 291–296.
[57] K. Morii, Time-global smoothing estimates for a class of dispersive equations with constant coefficients, Ark. Mat. 46 (2008) 363–375.
[58] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 78 (1980/81) 391–408.
[59] M. Murata and T. Tsuchida, Asymptotics of Green functions and the limiting absorption principle for elliptic operators with periodic coefficients, J. Math. Kyoto Univ. 46 (2006) 713–754.
[60] P. Perry, I. M. Sigal and B. Simon, Spectral analysis of $N$-body Schrödinger operators, Ann. Math. 114 (1981) 519–567.
[61] B. Perthame and L. Vega, Morrey–Campanato estimates for Helmholtz equations, J. Funct. Anal. 164 (1999) 340–355.
[62] B. Perthame and L. Vega, Energy decay and Sommerfeld condition for Helmholtz equation with variable index at infinity, preprint (2002).
[63] A. Ja. Povzner, The expansion of arbitrary functions in terms of eigenfunctions of the operator $-\Delta u + cu$, in American Mathematical Society Translations, Series 2, Vol. 60 (Amer. Math. Soc., 1966), pp. 1–49. (Originally in Russian, Math. Sb. 32 (1953) 109–156).
[64] A. G. Ramm, Justification of the limiting absorption principle in $\mathbb{R}^2$, in Operator Theory and Applications, Fields Institute Communications, Vol. 25, eds. A. G. Ramm, P. N. Shivakumar and A. V. Strauss (Amer. Math. Soc., 2000), pp. 433–440.
[65] D. Robert, Asymptotique de la phase de diffusion à haute énergie pour des perturbations du second ordre du laplacien, Ann. Sci. École Norm. Sup. (4) 25 (1992) 107–134.
[66] M. Ruzhansky and M. Sugimoto, Global $L^2$-boundedness theorems for a class of Fourier integral operators, Comm. Partial Differential Equations 31 (2006) 547–569.
[67] M. Ruzhansky and M. Sugimoto, A smoothing property of Schrödinger equations in the critical case, Math. Ann. 335 (2006) 645–673.
[68] Y. Saito, Spectral Representations for Schrödinger Operators with Long-Range Potentials, Lecture Notes in Mathematics, Vol. 727 (Springer-Verlag, 1979).
[69] B. Simon, Best constants in some operator smoothness estimates, J. Funct. Anal. 107 (1992) 66–71.
[70] C. D. Sogge, Lectures on Non-Linear Wave Equations, 2nd edn. (International Press, 2008).
[71] W. A. Strauss, Nonlinear Wave Equations, CBMS Lectures, Vol. 73 (Amer. Math. Soc., 1989).
[72] R. S. Strichartz, Restrictions of Fourier transforms to quadratic surfaces and decay of solutions of wave equations, Duke Math. J. 44 (1977) 705–714.
[73] M. Sugimoto, Global smoothing properties of generalized Schrödinger equations, J. Anal. Math. 76 (1998) 191–204.
[74] T. Umeda, Generalized eigenfunctions of relativistic Schrödinger operators I, Electronic J. Differential Equations 127 (2006) 1–46.
[75] A. Vasy and J. Wunsch, Positive commutators at the bottom of the spectrum, J. Funct. Anal. 259 (2010) 503–523.
[76] G. Vodev, Local energy decay of solutions to the wave equation for short-range potentials, Asymptot. Anal. 37 (2004) 175–187.
[77] B. G. Walther, A sharp weighted $L^2$-estimate for the solution to the time-dependent Schrödinger equation, Ark. Math. 37 (1999) 381–393.
November 16, 2010 15:28
WSPC/S0129-055X 148-RMP
J070-S0129055X10004211

Reviews in Mathematical Physics


Vol. 22, No. 10 (2010) 1241–1243

© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004211

REVIEWS IN MATHEMATICAL PHYSICS


Author Index Volume 22 (2010)

Barreira, L., Almost additive thermodynamic formalism: Some recent developments, 10 (2010) 1147
Bassi, A., Dürr, D. & Kolb, M., On the long time behavior of free stochastic Schrödinger evolutions, 1 (2010) 55
Ben-Artzi, M., Eigenfunction expansions and spacetime estimates for generators in divergence-form, 10 (2010) 1209
Ben Halima, M., Construction of certain fuzzy flag manifolds, 5 (2010) 533
Brain, S. & Landi, G., The 3D spin geometry of the quantum two-sphere, 8 (2010) 963
Bru, J.-B. & de Siqueira Pedra, W., Effect of a locally repulsive interaction on s-wave superconductors, 3 (2010) 233
Chatterjee, S., Lahiri, A. & Sengupta, A. N., Parallel transport over path spaces, 9 (2010) 1033
Daudé, T. & Nicoleau, F., Inverse scattering in de Sitter–Reissner–Nordström black hole spacetimes, 4 (2010) 431
de Oliveira, G., Asymptotics for Fermi curves: Small magnetic potential, 8 (2010) 881
De Roeck, W., Maes, C., Netočný, K. & Rey-Bellet, L., A note on the non-commutative Laplace–Varadhan integral lemma, 7 (2010) 839
de Siqueira Pedra, W., see Bru, J.-B., 3 (2010) 233
Demirel, S. & Harrell, II, E. M., On semiclassical and universal inequalities for eigenvalues of quantum graphs, 3 (2010) 305
Dimassi, M. & Petkov, V., Spectral shift function for operators with crossed magnetic and electric fields, 4 (2010) 355
Dirr, G., see Schulte-Herbrüggen, T., 6 (2010) 597
Dürr, D., see Bassi, A., 1 (2010) 55
Fehér, L. & Pusztai, B. G., Derivations of the trigonometric $BC_n$ Sutherland model by quantum Hamiltonian reduction, 6 (2010) 699
Glaser, S. J., see Schulte-Herbrüggen, T., 6 (2010) 597


Grigorian, S., Moduli spaces of $G_2$ manifolds, 9 (2010) 1061
Guha, P., Euler–Poincaré flows on the loop Bott–Virasoro group and space of tensor densities and (2 + 1)-dimensional integrable systems, 5 (2010) 485
Harrell, II, E. M., see Demirel, S., 3 (2010) 305
Helmke, U., see Schulte-Herbrüggen, T., 6 (2010) 597
Hidaka, T. & Hiroshima, F., Pauli–Fierz model with Kato-class potentials and exponential decays, 10 (2010) 1181
Hiroshima, F., see Hidaka, T., 10 (2010) 1181
Ichinose, W., On the Feynman path integral for nonrelativistic quantum electrodynamics, 5 (2010) 549
Jenčová, A. & Ruskai, M. B., A unified treatment of convexity of relative entropy and related trace functions, with conditions for equality, 9 (2010) 1099
Jensen, A. & Yajima, K., Spatial growth of fundamental solutions for certain perturbations of the harmonic oscillator, 2 (2010) 193
Kolb, M., see Bassi, A., 1 (2010) 55
Kriz, I., Perturbative deformations of conformal field theories revisited, 2 (2010) 117
Kusuoka, S. & Liang, S., A classical mechanical model of Brownian motion with plural particles, 7 (2010) 733
Lahiri, A., see Chatterjee, S., 9 (2010) 1033
Landi, G., see Brain, S., 8 (2010) 963
Liang, S., see Kusuoka, S., 7 (2010) 733
Longo, R., Martinetti, P. & Rehren, K.-H., Geometric modular action for disjoint intervals and boundary conformal field theory, 3 (2010) 331
Maes, C., see De Roeck, W., 7 (2010) 839
Marin, L., Dynamical bounds for Sturmian Schrödinger operators, 8 (2010) 859
Martinetti, P., see Longo, R., 3 (2010) 331
Matte, O. & Stockmeyer, E., Spectral theory of no-pair Hamiltonians, 1 (2010) 1
Morsella, G. & Tomassini, L., From global symmetries to local currents: The free (scalar) case in four dimensions, 1 (2010) 91
Nachtergaele, B., Schlein, B., Sims, R., Starr, S. & Zagrebnov, V., On the existence of the dynamics for anharmonic quantum oscillator systems, 2 (2010) 207
Netočný, K., see De Roeck, W., 7 (2010) 839
Nicoleau, F., see Daudé, T., 4 (2010) 431
Petkov, V., see Dimassi, M., 4 (2010) 355
Porta, M. & Simonella, S., Borel summability of $\varphi^4_4$ planar theory via multiscale analysis, 9 (2010) 995
Pusztai, B. G., see Fehér, L., 6 (2010) 699
Rehren, K.-H., see Longo, R., 3 (2010) 331
Rey-Bellet, L., see De Roeck, W., 7 (2010) 839

Robert, D., On the Herman–Kluk semiclassical approximation, 10 (2010) 1123
Ruskai, M. B., see Jenčová, A., 9 (2010) 1099
Sanders, K., The locally covariant Dirac field, 4 (2010) 381
Sango, M., Density dependent stochastic Navier–Stokes equations with non-Lipschitz random forcing, 6 (2010) 669
Schlein, B., see Nachtergaele, B., 2 (2010) 207
Schulte-Herbrüggen, T., Glaser, S. J., Dirr, G. & Helmke, U., Gradient flows for optimization in quantum information and quantum dynamics: Foundations and applications, 6 (2010) 597
Sengupta, A. N., see Chatterjee, S., 9 (2010) 1033
Simonella, S., see Porta, M., 9 (2010) 995
Sims, R., see Nachtergaele, B., 2 (2010) 207
Starr, S., see Nachtergaele, B., 2 (2010) 207
Stockmeyer, E., see Matte, O., 1 (2010) 1
Tomassini, L., see Morsella, G., 1 (2010) 91
Yajima, K., see Jensen, A., 2 (2010) 193
Zagrebnov, V., see Nachtergaele, B., 2 (2010) 207
Zhang, R. B. & Zhang, X., Projective module description of embedded noncommutative spaces, 5 (2010) 507
Zhang, X., see Zhang, R. B., 5 (2010) 507
