Peter S. Riseborough
February 12, 2014
Contents
1 Introduction
3 Maxwell's Equations
    3.1 Vector and Scalar Potentials
    3.2 Gauge Invariance
4 Relativistic Formulation of Electrodynamics
    4.1 Lorentz Scalars and Vectors
    4.2 Covariant and Contravariant Derivatives
    4.3 Lorentz Transformations
    4.4 Invariant Form of Maxwell's Equations
5 The …
7 The Electromagnetic Lagrangian
    7.1 Conservation Laws for Electromagnetic Fields
    7.2 Massive Spin-One Particles
    7.3 Polarizations of Massive Spin-One Particles
    7.4 The Propagator for Massive Photons
… the Massive Graviton
12 The Dirac Equation
    12.1 Conservation of Probability
    12.2 Covariant Form of the Dirac Equation
    12.3 The Field Free Solution
    12.4 Coupling to Fields
        12.4.1 Mott Scattering
        12.4.2 Maxwell's Equations
        12.4.3 The Gordon Decomposition
    12.5 Lorentz Covariance of the Dirac Equation
        12.5.1 The Space of the Anticommuting Matrices
        12.5.2 Polarization in Mott Scattering
    12.6 The Non-Relativistic Limit
    12.7 Conservation of Angular Momentum
    12.8 Conservation of Parity
    12.9 Bilinear Covariants
    12.10 The Spherically Symmetric Dirac Equation
        12.10.1 The Hydrogen Atom
        12.10.2 Lowest-Order Radial Wavefunctions
        12.10.3 The Relativistic Corrections for Hydrogen
        12.10.4 The Kinematic Correction
        12.10.5 Spin-Orbit Coupling
        12.10.6 The Darwin Term
        12.10.7 The Fine Structure of Hydrogen
        12.10.8 A Particle in a Spherical Square Well
        12.10.9 The MIT Bag Model
        12.10.10 The Temple Meson Model
    12.11 Scattering by a Spherically Symmetric Potential
        12.11.1 Polarization in Coulomb Scattering
        12.11.2 Partial Wave Analysis
    12.12 An Electron in a Uniform Magnetic Field
    12.13 Motion of an Electron in a Classical Electromagnetic Field
    12.14 The Limit of Zero Mass
    12.15 Classical Dirac Field Theory
        12.15.1 Chiral Gauge Symmetry
    12.16 Hole Theory
        12.16.1 Compton Scattering
14 The …
1 Introduction
Non-relativistic mechanics yields a reasonable approximate description of physical phenomena in the range where the particles' kinetic energies are small compared with their rest-mass energies. However, it should be noted that when the relativistic invariant mass of a particle is expressed in terms of its energy E and momentum p via
$$E^2 - p^2 c^2 = m^2 c^4 \tag{1}$$
it implies that its dispersion relation has two branches
$$E = \pm\, c\, \sqrt{p^2 + m^2 c^2} \tag{2}$$
In relativistic classical mechanics, it is assumed expedient to neglect the negative-energy solutions. This assumption is based on the expectation that, since energy can only change continuously, it is impossible for a particle with positive energy to make a transition from the positive-energy to the negative-energy states. However, in quantum mechanics, particles can make discontinuous transitions. Therefore, it is necessary to consider both the positive and negative-energy branches. These considerations naturally lead one to the concept of particles and antiparticles, and also to the realization that one must consider multi-particle quantum mechanics or field theory.

Figure 1: The positive and negative energy branches E(p), plotted against pc, for a relativistic particle with rest mass m. The minimum separation between the positive-energy branch and the negative-energy branch is $2mc^2$.
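The two branches of the dispersion relation, and the statement that they are never closer than $2mc^2$, can be checked numerically. The following is a minimal sketch; the function names and the choice of units (c = 1 by default) are illustrative assumptions and do not appear in the notes.

```python
import math

def energy_branches(p, m, c=1.0):
    """Return (E_plus, E_minus) from eq. (2) for momentum magnitude p and rest mass m."""
    root = c * math.sqrt(p * p + m * m * c * c)
    return root, -root

def branch_gap(p, m, c=1.0):
    """Energy separation between the positive and negative branches at momentum p."""
    e_plus, e_minus = energy_branches(p, m, c)
    return e_plus - e_minus

if __name__ == "__main__":
    m, c = 1.0, 1.0  # illustrative units, not taken from the text
    for p in (0.0, 0.5, 1.0, 2.0):
        e_plus, e_minus = energy_branches(p, m, c)
        print(f"p={p:4.1f}  E+={e_plus:8.5f}  E-={e_minus:8.5f}  gap={branch_gap(p, m, c):8.5f}")
```

The gap is smallest at p = 0, where it equals $2mc^2$, and grows with increasing momentum.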
We shall take a look at the quantum mechanical description of the electromagnetic field, Dirac's relativistic theory of spin one-half fermions (such as leptons and quarks), and then look at the interaction between these particles and the electromagnetic field. The interaction between charged fermions and the electromagnetic field is known as Quantum Electrodynamics. Quantum Electrodynamics contains some surprises, namely that although the interaction
h c
137.0359979
(3)
(r, t)
(5)
(r, t) = (3) (r, t)
(3) (r, t)
This conjecture can be verified by examining the transformational properties of
a vector field under rotations. Under a rotation of the field, the components
of the field are transported in space, and also the direction of the vector field
is rotated. This implies that the components of the transported field have to
be rotated. The rotation of the direction of the field is generated by operators
which turn out to be the intrinsic angular momentum operators. Specifically,
the generators satisfy the commutation relations defining angular momentum,
but also correspond to the subspace with angular momentum one.
2.1
(6)
or equivalently
r) = (r)
0 (R
(7)
The above equation can be used to determine 0 (r) by using the substitution
1 r so
rR
1 r)
0 (r) = (R
(8)
If e is a unit vector along the axis of rotation, the rotation of r through an
e
exr
ex(exr)
r(e.r)
r
(9)
(R1r)
(r)
yaxis
r'=R1r
R
xaxis
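$$\begin{aligned}
\psi'(\mathbf{r}) &= \psi(\mathbf{r} - \delta\varphi\; \hat{e} \times \mathbf{r}) \\
&= \psi(\mathbf{r}) - \delta\varphi\, ( \hat{e} \times \mathbf{r} ) \cdot \nabla \psi(\mathbf{r}) + \dots \\
&= \psi(\mathbf{r}) - \delta\varphi\, ( \mathbf{r} \times \nabla ) \cdot \hat{e}\; \psi(\mathbf{r}) + \dots \\
&= \psi(\mathbf{r}) - \frac{i}{\hbar}\, \delta\varphi\, ( \hat{e} \cdot \hat{\mathbf{L}} )\, \psi(\mathbf{r}) + \dots
\end{aligned} \tag{10}$$
$$= \exp\left[ - \frac{i}{\hbar}\, \delta\varphi\, ( \hat{e} \cdot \hat{\mathbf{L}} ) \right] \psi(\mathbf{r}) \tag{11}$$
Therefore, locally, rotations of the scalar field are generated by the orbital angular momentum operator $\hat{\mathbf{L}}$.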
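Since the operation $\hat{R}$ is a rotation, it also rotates a vector field $\boldsymbol{\psi}(\mathbf{r})$. Not only does the rotation transfer the magnitude of $\boldsymbol{\psi}(\mathbf{r})$ to the new point $\mathbf{r}'$, but it must also rotate the direction of the transferred vector $\boldsymbol{\psi}(\mathbf{r})$. The rotated vector $\boldsymbol{\psi}'(\mathbf{r}')$ is denoted by $\mathsf{R}\, \boldsymbol{\psi}(\mathbf{r})$. Mathematically, the transformation is expressed as
$$\boldsymbol{\psi}'(\mathbf{r}') = \mathsf{R}\, \boldsymbol{\psi}(\mathbf{r}) \tag{12}$$
or equivalently
$$\boldsymbol{\psi}'(\hat{R}\, \mathbf{r}) = \mathsf{R}\, \boldsymbol{\psi}(\mathbf{r}) \tag{13}$$

Figure 4: The effect of a rotation R on a vector field $\boldsymbol{\psi}(\mathbf{r})$. The rotation affects both the magnitude and direction of the vector.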
(14)
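The part of the rotational operator designated by $\mathsf{R}$ does not affect the positional coordinates (r) of the vector field, and so can be found by considering the rotation of the vector field at the origin
$$\mathsf{R} = \hat{I} + \delta\varphi\; \hat{e} \times \tag{15}$$
Hence,
$$\begin{aligned}
\boldsymbol{\psi}'(\mathbf{r}) &= \mathsf{R}\, \boldsymbol{\psi}(\mathbf{r} - \delta\varphi\; \hat{e} \times \mathbf{r}) \\
&= \boldsymbol{\psi}(\mathbf{r} - \delta\varphi\; \hat{e} \times \mathbf{r}) + \delta\varphi\; \hat{e} \times \boldsymbol{\psi}(\mathbf{r} - \delta\varphi\; \hat{e} \times \mathbf{r}) \\
&= \boldsymbol{\psi}(\mathbf{r} - \delta\varphi\; \hat{e} \times \mathbf{r}) + \delta\varphi\; \hat{e} \times \boldsymbol{\psi}(\mathbf{r}) + \dots \\
&= \boldsymbol{\psi}(\mathbf{r}) - \frac{i}{\hbar}\, \delta\varphi\, ( \hat{e} \cdot \hat{\mathbf{L}} )\, \boldsymbol{\psi}(\mathbf{r}) + \delta\varphi\; \hat{e} \times \boldsymbol{\psi}(\mathbf{r}) + \dots \\
&= \boldsymbol{\psi}(\mathbf{r}) - \frac{i}{\hbar}\, \delta\varphi\, ( \hat{e} \cdot \hat{\mathbf{L}} )\, \boldsymbol{\psi}(\mathbf{r}) - \frac{i}{\hbar}\, \delta\varphi\, ( \hat{e} \cdot \hat{\mathbf{S}} )\, \boldsymbol{\psi}(\mathbf{r})
\end{aligned} \tag{16}$$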
where the terms of order $(\delta\varphi)^2$ have been neglected and a vector operator $\hat{\mathbf{S}}$ has been introduced. The operator $\hat{\mathbf{S}}$ only admixes the components of $\boldsymbol{\psi}$, unlike $\hat{\mathbf{L}}$ which only acts on the r dependence of the components. The components of the three-dimensional vector operator $\hat{\mathbf{S}}$ are expressed as 3 × 3 matrices¹, with matrix elements
$$( \hat{S}^{(i)} )_{j,k} = - i \hbar\, \varepsilon^{i,j,k} \tag{18}$$
where $\varepsilon^{i,j,k}$ is the antisymmetric Levi-Civita symbol. The Levi-Civita symbol is defined by $\varepsilon^{i,j,k} = 1$ if the ordered set (i, j, k) is obtained by an even permutation of (1, 2, 3), is $-1$ if it is obtained by an odd permutation, and is zero if two or more indices are repeated. Specifically, the antisymmetric matrices are given by
$$\hat{S}^{(1)} = \hbar \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix} \tag{19}$$
and by
$$\hat{S}^{(2)} = \hbar \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix} \tag{20}$$
and finally by
$$\hat{S}^{(3)} = \hbar \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{21}$$
(22)
(17)
and
$$[\, \hat{S}^{(i)} , \hat{S}^{(j)} \,] = i \hbar\, \varepsilon^{i,j,k}\, \hat{S}^{(k)} \tag{23}$$
where the repeated index (k) is summed over. The above set of operators forms a Lie algebra associated with the corresponding Lie group of continuous rotations. Thus, it is natural to identify these operators, which arise in the analysis of transformations in classical physics, with the angular momentum operators of quantum mechanics. In terms of these operators, the infinitesimal transformation has the form
i
+ S
) (r) + . . .
e . ( L
h
i
+ S
) (r)
0 (r) = exp
e . ( L
h
0 (r) (r)
or
(24)
(25)
(26)
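where
$$\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}} \tag{27}$$
is the total angular momentum. The operator $\hat{\mathbf{S}}$ is the intrinsic angular momentum of the vector field $\boldsymbol{\psi}$. The magnitude of S is found from
$$\hat{S}^2 = ( \hat{S}^{(1)} )^2 + ( \hat{S}^{(2)} )^2 + ( \hat{S}^{(3)} )^2 \tag{28}$$
which is evaluated as
$$\hat{S}^2 = 2 \hbar^2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{29}$$
and is the Casimir operator. It is seen that a vector field has intrinsic angular momentum, with a magnitude given by the eigenvalue of $\hat{S}^2$, which is
$$S ( S + 1 ) \hbar^2 = 2 \hbar^2 \tag{30}$$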
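The algebra of the spin matrices can be verified directly. The sketch below builds the three matrices from the matrix elements $-i\hbar\,\varepsilon^{i,j,k}$, then checks the angular momentum commutation relations and the value $2\hbar^2$ of the Casimir operator. It is only an illustration: the helper names are hypothetical, indices run over 0, 1, 2 instead of 1, 2, 3, and units with $\hbar = 1$ are assumed.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption made for this check)

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def spin_matrix(i):
    """3x3 matrix with elements (S^(i))_{j,k} = -i hbar eps_{i,j,k}."""
    return [[-1j * HBAR * levi_civita(i, j, k) for k in range(3)] for j in range(3)]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(3)) for c in range(3)] for r in range(3)]

def commutator(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]

def is_close(a, b, tol=1e-12):
    """Element-wise comparison of two 3x3 matrices."""
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(3) for c in range(3))

def casimir():
    """Sum of the squares of the three spin matrices."""
    total = [[0j] * 3 for _ in range(3)]
    for i in range(3):
        sq = matmul(spin_matrix(i), spin_matrix(i))
        total = [[total[r][c] + sq[r][c] for c in range(3)] for r in range(3)]
    return total

if __name__ == "__main__":
    S = [spin_matrix(i) for i in range(3)]
    i_hbar_S3 = [[1j * HBAR * x for x in row] for row in S[2]]
    print("[S1, S2] = i hbar S3:", is_close(commutator(S[0], S[1]), i_hbar_S3))
    print("S^2 =", casimir())
```

Since $S(S+1) = 2$ has the non-negative solution S = 1, the check is consistent with a vector field carrying intrinsic angular momentum one.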
2.2
First, we shall try to construct the Schrödinger equation describing a massless uncharged spinless particle. A spinless particle is described by a scalar wave function, and an uncharged particle is described by a real wave function. The derivation is based on the energy-momentum relation for a massless particle
$$E^2 - p^2 c^2 = 0 \tag{31}$$
and the quantization conditions
$$E \rightarrow i \hbar \frac{\partial}{\partial t} , \qquad \mathbf{p} \rightarrow - i \hbar \nabla \tag{32}$$
One finds that the real scalar wave function $\psi(\mathbf{r}, t)$ satisfies the wave equation
$$\left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \psi = 0 \tag{33}$$
since $\hbar$ drops out. This is not a very useful result, since it is a second-order differential equation in time, and the solution of a second-order differential equation can only be determined if two initial conditions are given. Usually, the initial conditions are given by
$$\psi(\mathbf{r}, 0) = f(\mathbf{r}) , \qquad \left. \frac{\partial \psi}{\partial t} \right|_{t=0} = g(\mathbf{r}) \tag{34}$$
2.3
The wave function of an uncharged spinone particle is expected to be represented by a real vector function.
We shall try to factorize the wave equation for the vector E into two first-order differential equations, each of which requires one boundary condition. This requires one to specify six quantities. Therefore, one needs to postulate the existence of two independently measurable fields, E and B. Each of these fields should satisfy the two wave equations
$$\left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \mathbf{E} = 0 \tag{35}$$
and
$$\left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \mathbf{B} = 0 \tag{36}$$
(37)
since the left-hand side is a vector, the right-hand side must also be a vector composed of the operator $\hat{\mathbf{p}}$ and the wave functions. Like Newton's laws, these equations must be invariant under time reversal, $t \rightarrow - t$. The transformation leads to the identification
$$a = d = 0 \tag{38}$$
and
$$b = - e \tag{39}$$
if one also requires that one of the two fields changes sign under time reversal². We shall adopt the convention that the field E retains its sign, so
$$\mathbf{E} \rightarrow \mathbf{E} , \qquad \mathbf{B} \rightarrow - \mathbf{B} \tag{40}$$
under time reversal. On taking the time derivative of the first equation, one obtains
2
2 E
2 2
h
= c b p
p E
(41)
t2
Likewise, the B field is found to satisfy
2B
h
t2
2
= c b p
p B
(42)
(43)
and
2B
h
t2
2
= c b
p B + p
p . B
.
E
=
c
b
E
t2
(44)
(45)
13
and
2B
t2
= c b
B +
.B
(46)
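so $\hbar$ drops out. To reduce these equations to the form of wave equations, one needs to impose the conditions
$$\nabla \cdot \mathbf{E} = 0 \tag{47}$$
and
$$\nabla \cdot \mathbf{B} = 0 \tag{48}$$
On identifying the coefficients with those of the wave equation, one requires that
$$b^2 = 1 \tag{49}$$
Thus, one has arrived at the set of the source-free Maxwell's equations
$$\begin{aligned}
\frac{1}{c} \frac{\partial \mathbf{E}}{\partial t} &= \nabla \times \mathbf{B} \\
- \frac{1}{c} \frac{\partial \mathbf{B}}{\partial t} &= \nabla \times \mathbf{E} \\
\nabla \cdot \mathbf{E} &= 0 \\
\nabla \cdot \mathbf{B} &= 0
\end{aligned} \tag{50}$$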
3 Maxwell's Equations
Classical Field Theories describe systems in which a very large number of particles are present. Measurements on systems containing very large numbers of
particles are expected to result in average values, with only very small deviations. Hence, we expect that the subtleties of quantum measurements should be
completely absent in systems that can be described as quantum fields. Classical
Electromagnetism is an example of such a quantum field, in which an infinitely
large number of photons are present.
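In the presence of a current density j and a charge density $\rho$, Maxwell's equations assume the forms
$$\begin{aligned}
\nabla \times \mathbf{B} - \frac{1}{c} \frac{\partial \mathbf{E}}{\partial t} &= \frac{4 \pi}{c}\, \mathbf{j} \\
\nabla \times \mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{B}}{\partial t} &= 0 \\
\nabla \cdot \mathbf{E} &= 4 \pi \rho \\
\nabla \cdot \mathbf{B} &= 0
\end{aligned} \tag{51}$$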
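The field equations ensure that the sources j and $\rho$ satisfy a continuity equation. Taking the divergence of the first equation and using the time derivative of the third, one obtains
$$\begin{aligned}
\nabla \cdot ( \nabla \times \mathbf{B} ) - \frac{1}{c} \frac{\partial}{\partial t} \nabla \cdot \mathbf{E} &= \frac{4 \pi}{c}\, \nabla \cdot \mathbf{j} \\
- \frac{1}{c} \frac{\partial}{\partial t} \nabla \cdot \mathbf{E} &= \frac{4 \pi}{c}\, \nabla \cdot \mathbf{j} \\
- \frac{4 \pi}{c} \frac{\partial \rho}{\partial t} &= \frac{4 \pi}{c}\, \nabla \cdot \mathbf{j}
\end{aligned} \tag{52}$$
which yields the continuity equation
$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} = 0 \tag{53}$$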
3.1 Vector and Scalar Potentials
1 B
c t
(56)
+
A
= 0
(57)
c t
c t
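which is automatically satisfied since
$$\nabla \times \nabla \phi = 0 \tag{58}$$
and the terms involving A cancel since A is analytic. The remaining source-free Maxwell equation is satisfied, since it has the form
$$\nabla \cdot \mathbf{B} = 0 \tag{59}$$
which reduces to
$$\nabla \cdot ( \nabla \times \mathbf{A} ) = 0 \tag{60}$$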
A
=
.A
2 A
2
.A
c t
(62)
4
j
c
(63)
3.2 Gauge Invariance
The vector and scalar potentials are defined as the solutions of the coupled partial differential equations describing the electric and magnetic fields
$$\mathbf{E} = - \nabla \phi - \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} \tag{64}$$
and
$$\mathbf{B} = \nabla \times \mathbf{A} \tag{65}$$
Hence, one expects that the solutions are only determined up to functions of integration. That is, the vector and scalar potentials are not completely determined, even if the electric and magnetic fields are known precisely. It is possible to transform the vector and scalar potentials in such a way that the E and B fields remain invariant. These transformations are known as gauge transformations of the second kind³.
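In particular, one can perform the transformation
$$\mathbf{A} \rightarrow \mathbf{A}' = \mathbf{A} - \nabla \Lambda , \qquad \phi \rightarrow \phi' = \phi + \frac{1}{c} \frac{\partial \Lambda}{\partial t} \tag{66}$$
The magnetic field is invariant under this transformation, since
$$\mathbf{B}' = \nabla \times \mathbf{A}' = \nabla \times ( \mathbf{A} - \nabla \Lambda ) = \nabla \times \mathbf{A} = \mathbf{B} \tag{67}$$
where the identity
$$\nabla \times \nabla \Lambda = 0 \tag{68}$$
valid for any scalar function $\Lambda$ has been used. The electric field is invariant, since the transformed electric field is given by
$$\begin{aligned}
\mathbf{E}' &= - \nabla \phi' - \frac{1}{c} \frac{\partial \mathbf{A}'}{\partial t} \\
&= - \nabla \left( \phi + \frac{1}{c} \frac{\partial \Lambda}{\partial t} \right) - \frac{1}{c} \frac{\partial}{\partial t} \left( \mathbf{A} - \nabla \Lambda \right) \\
&= - \nabla \phi - \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} \\
&= \mathbf{E}
\end{aligned} \tag{69}$$

³ The transformation $\psi \rightarrow \psi' = \exp[\, \dots \,]\, \psi$, $\hat{\mathbf{p}} \rightarrow \hat{\mathbf{p}}' = \dots$ (the remainder of this footnote is not recoverable from the source)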
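In the above derivation, it has been noted that the order of the derivatives can be interchanged,
$$\nabla \frac{\partial \Lambda}{\partial t} = \frac{\partial}{\partial t} \nabla \Lambda \tag{70}$$
since $\Lambda$ is an analytic scalar function.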
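The invariance of the electric field can also be illustrated numerically. The sketch below works in one space dimension, where only $E = -\partial\phi/\partial x - (1/c)\,\partial A/\partial t$ is defined, and compares the field computed from the original and from the gauge-transformed potentials using central finite differences. The particular potentials, the gauge function `Lam(x, t) = t**2 * sin(x)` (represented through its analytic derivatives), the step size and the units (c = 1) are arbitrary choices made for the illustration.

```python
import math

C = 1.0  # speed of light in illustrative units (assumption)

def phi(x, t):
    """An arbitrary scalar potential chosen for the example."""
    return math.sin(x) * math.cos(t)

def A(x, t):
    """An arbitrary (one-component) vector potential chosen for the example."""
    return math.cos(2.0 * x - t)

def dLam_dt(x, t):
    """Time derivative of the hypothetical gauge function Lam = t^2 sin(x)."""
    return 2.0 * t * math.sin(x)

def dLam_dx(x, t):
    """Space derivative of the hypothetical gauge function Lam = t^2 sin(x)."""
    return t * t * math.cos(x)

def phi_prime(x, t):
    """Transformed scalar potential: phi' = phi + (1/c) dLam/dt."""
    return phi(x, t) + dLam_dt(x, t) / C

def A_prime(x, t):
    """Transformed vector potential: A' = A - dLam/dx."""
    return A(x, t) - dLam_dx(x, t)

def d_dx(f, x, t, h=1e-4):
    """Central-difference approximation to the partial derivative in x."""
    return (f(x + h, t) - f(x - h, t)) / (2.0 * h)

def d_dt(f, x, t, h=1e-4):
    """Central-difference approximation to the partial derivative in t."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)

def electric_field(phi_fn, a_fn, x, t):
    """E = -d(phi)/dx - (1/c) dA/dt evaluated by finite differences."""
    return -d_dx(phi_fn, x, t) - d_dt(a_fn, x, t) / C

if __name__ == "__main__":
    x, t = 0.3, 0.7
    print("E  from (phi , A ):", electric_field(phi, A, x, t))
    print("E' from (phi', A'):", electric_field(phi_prime, A_prime, x, t))
```

The two printed values agree up to the finite-difference error, even though the potentials themselves differ appreciably at the chosen point.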
The gauge invariance allows us the freedom to impose a gauge condition which fixes the gauge. Two gauge conditions which are commonly used are the Lorenz gauge
$$\nabla \cdot \mathbf{A} + \frac{1}{c} \frac{\partial \phi}{\partial t} = 0 \tag{71}$$
and the Coulomb or radiation gauge
$$\nabla \cdot \mathbf{A} = 0 \tag{72}$$
The Lorenz gauge is manifestly Lorentz invariant, whereas the Coulomb gauge is frequently used in cases where the electrostatic interactions are important. It is always possible to impose one or the other of these gauge conditions. If the vector and scalar potentials $(\phi, \mathbf{A})$ do not satisfy the gauge condition, then one can perform a gauge transformation so that the transformed fields $(\phi', \mathbf{A}')$ satisfy the gauge condition.
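For example, if the fields $(\phi, \mathbf{A})$ do not satisfy the Lorenz gauge condition, since
$$\nabla \cdot \mathbf{A} + \frac{1}{c} \frac{\partial \phi}{\partial t} = \chi(\mathbf{r}, t) \tag{73}$$
where $\chi$ is non-zero, then one can perform the gauge transformation to the new fields $(\phi', \mathbf{A}')$
$$\begin{aligned}
\nabla \cdot \mathbf{A}' + \frac{1}{c} \frac{\partial \phi'}{\partial t} &= \nabla \cdot \mathbf{A} - \nabla^2 \Lambda + \frac{1}{c} \frac{\partial \phi}{\partial t} + \frac{1}{c^2} \frac{\partial^2 \Lambda}{\partial t^2} \\
&= \chi - \left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \Lambda
\end{aligned} \tag{74}$$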
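The new fields satisfy the Lorenz condition if one chooses $\Lambda$ to be the solution of the wave equation
$$\left[ \nabla^2 - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \Lambda = \chi(\mathbf{r}, t) \tag{75}$$
This can always be done, since the driven wave equation always has a solution. Hence, one can always insist that the fields satisfy the gauge condition
$$\nabla \cdot \mathbf{A}' + \frac{1}{c} \frac{\partial \phi'}{\partial t} = 0 \tag{76}$$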
(77)
one can use Poisson's equation to show that one can always find a $\Lambda$ such that the Coulomb gauge condition is satisfied⁴.
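In the Lorenz gauge, the equations of motion for the electromagnetic field are given by
$$\begin{aligned}
\left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \mathbf{A} &= \frac{4 \pi}{c}\, \mathbf{j} \\
\left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \phi &= 4 \pi \rho
\end{aligned} \tag{78}$$
Hence, A and $\phi$ both satisfy the wave equation, where j and $\rho$ are the sources. The solutions are waves which travel with velocity c.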
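In the Coulomb gauge, the fields satisfy the equations
$$\begin{aligned}
\left[ - \nabla^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] \mathbf{A} &= \frac{4 \pi}{c}\, \mathbf{j} - \frac{1}{c} \nabla \frac{\partial \phi}{\partial t} \\
- \nabla^2 \phi &= 4 \pi \rho
\end{aligned} \tag{79}$$
The second equation is Poisson's equation and has solutions given by
$$\phi(\mathbf{r}, t) = \int d^3 r' \; \frac{\rho(\mathbf{r}', t)}{| \mathbf{r} - \mathbf{r}' |} \tag{80}$$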
4.1 Lorentz Scalars and Vectors
The space-time four-vector has components given by the time t and the three space coordinates $(x^{(1)}, x^{(2)}, x^{(3)})$ which label an event. The zeroth component of the four-vector $x^{(0)}$ (the time component) is defined to be ct, where c is the velocity of light, in order that all the components have the dimensions of length. In Minkowski space, the four-vector is defined as having contravariant components $x^\mu = (ct, x^{(1)}, x^{(2)}, x^{(3)})$, while the covariant components are denoted by $x_\mu = (ct, -x^{(1)}, -x^{(2)}, -x^{(3)})$. The invariant length is given by the scalar product
$$x^\mu x_\mu = ( c t )^2 - ( x^{(1)} )^2 - ( x^{(2)} )^2 - ( x^{(3)} )^2 \tag{81}$$
where repeated indices are summed over. The invariant length $x^\mu x_\mu$ is related to the proper time $\tau$. This definition can be generalized to the scalar product of two arbitrary four-vectors $A^\mu$ and $B^\mu$ as
$$A^\mu B_\mu = A^{(0)} B^{(0)} - A^{(1)} B^{(1)} - A^{(2)} B^{(2)} - A^{(3)} B^{(3)} \tag{82}$$
(84)
(85)
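$$g_{\mu,\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{86}$$
where $\mu$ labels the rows and $\nu$ labels the columns. If the four-vectors are expressed as column-vectors
$$A^\mu = \begin{pmatrix} A^{(0)} \\ A^{(1)} \\ A^{(2)} \\ A^{(3)} \end{pmatrix} \tag{87}$$
and
$$A_\mu = \begin{pmatrix} A_{(0)} \\ A_{(1)} \\ A_{(2)} \\ A_{(3)} \end{pmatrix} \tag{88}$$
then the covariant components follow from the contravariant components by matrix multiplication,
$$\begin{pmatrix} A_{(0)} \\ A_{(1)} \\ A_{(2)} \\ A_{(3)} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} A^{(0)} \\ A^{(1)} \\ A^{(2)} \\ A^{(3)} \end{pmatrix} \tag{89}$$
The inverse transform is expressed as
$$A^\mu = g^{\mu,\nu} A_\nu \tag{90}$$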
(91)
1
cos
cos
1
(95)
(96)
The covariant components of the vector (X(1) , X(2) ) are found from the contravariant components by the action of the metric tensor
(1)
X(1)
1
cos
X
=
(97)
X(2)
cos
1
X (2)
The covariant components of any vector can be found geometrically by dropping
normals from the tip of the vector to the axes. The intersection of a normal with
its corresponding axis determines the covariant component. For a Cartesian coordinate system, the covariant components are identical to the contravariant
components.
A familiar example of the Lorentz invariant scalar product involves the momentum four-vector with contravariant components $p^\mu \equiv ( E/c , p^{(1)} , p^{(2)} , p^{(3)} )$ where E is the energy. The covariant components of the momentum four-vector are given by $p_\mu \equiv ( E/c , -p^{(1)} , -p^{(2)} , -p^{(3)} )$ and the scalar product defines the invariant mass m via
$$p^\mu p_\mu = \left( \frac{E}{c} \right)^2 - p^2 = m^2 c^2 \tag{98}$$
Another scalar product which is frequently encountered is $p^\mu x_\mu$, which is given by
$$p^\mu x_\mu = E\, t - \mathbf{p} \cdot \mathbf{x} \tag{99}$$
This scalar product is frequently seen in the description of planes of constant phase of waves.
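The scalar products above reduce to a few lines of arithmetic once the metric is written out. The following sketch lowers an index with the diagonal metric (+1, -1, -1, -1) and evaluates the two invariants discussed in the text. The helper names and the sample numbers are illustrative assumptions only.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal elements of the metric tensor

def lower(v):
    """Covariant components obtained from the contravariant components of v."""
    return tuple(g * component for g, component in zip(METRIC, v))

def scalar_product(a, b):
    """Lorentz-invariant product of two four-vectors given by contravariant components."""
    return sum(x * y for x, y in zip(a, lower(b)))

def four_momentum(energy, p, c=1.0):
    """Contravariant momentum four-vector (E/c, p1, p2, p3)."""
    return (energy / c, p[0], p[1], p[2])

def four_position(t, x, c=1.0):
    """Contravariant space-time four-vector (ct, x1, x2, x3)."""
    return (c * t, x[0], x[1], x[2])

if __name__ == "__main__":
    # A particle with m = 3 and |p| = 4 has E = 5 in units with c = 1.
    p_mu = four_momentum(5.0, (0.0, 4.0, 0.0))
    x_mu = four_position(2.0, (1.0, 1.0, 1.0))
    print("p.p =", scalar_product(p_mu, p_mu))  # m^2 c^2
    print("p.x =", scalar_product(p_mu, x_mu))  # E t - p.x
```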
4.2 Covariant and Contravariant Derivatives
(100)
(x ) + . . .
x
(101)
(x )
x
(102)
(x )
x
(103)
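$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left( \frac{1}{c} \frac{\partial}{\partial t} , \frac{\partial}{\partial x^{(1)}} , \frac{\partial}{\partial x^{(2)}} , \frac{\partial}{\partial x^{(3)}} \right) = \left( \frac{1}{c} \frac{\partial}{\partial t} , \nabla \right) \tag{104}$$
$$\partial^\mu \equiv \frac{\partial}{\partial x_\mu} = \left( \frac{1}{c} \frac{\partial}{\partial t} , - \frac{\partial}{\partial x^{(1)}} , - \frac{\partial}{\partial x^{(2)}} , - \frac{\partial}{\partial x^{(3)}} \right) = \left( \frac{1}{c} \frac{\partial}{\partial t} , - \nabla \right) \tag{105}$$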
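If the four-vector potential $A^\mu$ is introduced with contravariant components
$$A^\mu = \left( \phi , A^{(1)} , A^{(2)} , A^{(3)} \right) = \left( \phi , \mathbf{A} \right) \tag{106}$$
then the Lorenz gauge condition can be expressed as
$$\partial_\mu A^\mu = \frac{1}{c} \frac{\partial \phi}{\partial t} + \nabla \cdot \mathbf{A} = 0 \tag{107}$$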
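which is of the form of a Lorentz scalar. Likewise, if one introduces the current density four-vector $j^\mu$ with contravariant components
$$j^\mu = \left( c \rho , j^{(1)} , j^{(2)} , j^{(3)} \right) = \left( c \rho , \mathbf{j} \right) \tag{108}$$
then the condition for conservation of charge can be written as
$$\partial_\mu j^\mu = \frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} = 0 \tag{109}$$
which is a Lorentz scalar. The gauge transformation can also be compactly expressed in terms of a transformation of the contravariant vector potential
$$A^\mu \rightarrow A'^\mu = A^\mu + \partial^\mu \Lambda \tag{110}$$
or, in terms of the covariant components,
$$A_\mu \rightarrow A'_\mu = A_\mu + \partial_\mu \Lambda \tag{111}$$
Similarly, one can use the contravariant notation to express the quantization conditions
$$E \rightarrow i \hbar \frac{\partial}{\partial t} , \qquad \mathbf{p} \rightarrow - i \hbar \nabla \tag{112}$$
in the form
$$p^\mu \rightarrow i \hbar\, \partial^\mu \tag{113}$$
One can also express the wave equation operator in terms of the scalar product of the contravariant and covariant derivative operators
$$\partial_\mu \partial^\mu = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \tag{114}$$
Hence, in the Lorenz gauge, the equations of motion for the four-vector potential $A^\mu$ can be expressed concisely as
$$\partial_\nu \partial^\nu A^\mu = \frac{4 \pi}{c}\, j^\mu \tag{115}$$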
4.3 Lorentz Transformations
(116)
(118)
(119)
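If the scalar product is to be invariant, the transform must satisfy the condition
$$\Lambda^\mu{}_\sigma \; g_{\mu,\nu} \; \Lambda^\nu{}_\tau = g_{\sigma,\tau} \tag{120}$$
When the transformation is written as the matrix
$$\Lambda = \begin{pmatrix}
\Lambda^0{}_0 & \Lambda^0{}_1 & \Lambda^0{}_2 & \Lambda^0{}_3 \\
\Lambda^1{}_0 & \Lambda^1{}_1 & \Lambda^1{}_2 & \Lambda^1{}_3 \\
\Lambda^2{}_0 & \Lambda^2{}_1 & \Lambda^2{}_2 & \Lambda^2{}_3 \\
\Lambda^3{}_0 & \Lambda^3{}_1 & \Lambda^3{}_2 & \Lambda^3{}_3
\end{pmatrix} \tag{121}$$
where $\mu$ labels the rows and $\nu$ labels the columns. In terms of the matrices, the condition that $\Lambda$ is a Lorentz transformation can be written as
$$g = \Lambda^T \, g \, \Lambda \tag{122}$$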
(123)
0
0
1
vc
v
1
0
c
q 0
1
2
v
(124)
= q
0
0
0
1
2
2
c
v
q
1 c2
2
0
0
0
1 vc2
Figure 5: Two inertial frames of reference moving with a constant relative velocity with respect to each other. (Axes shown: $x^{(1)}, x^{(2)}, x^{(3)}$ and $x'^{(1)}, x'^{(2)}, x'^{(3)}$, with primed origin O'.)
which can be seen to satisfy the condition
g, = g,
(125)
Figure 6: Two inertial frames of reference rotated with respect to each other. (Axes shown: $x^{(1)}, x^{(2)}$ and $x'^{(1)}$.)
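Likewise, the rotation through an angle $\varphi$ about the $x^{(3)}$ axis represented by
$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos \varphi & \sin \varphi & 0 \\ 0 & - \sin \varphi & \cos \varphi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{126}$$
is a Lorentz transformation, since it also satisfies the condition eqn(120).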
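The matrix form of the invariance condition, $g = \Lambda^T g \Lambda$, lends itself to a direct numerical check. The sketch below tests a boost and a rotation; the boost is taken along the $x^{(1)}$ axis with the sign convention written in the code, and the rotation is taken about the $x^{(3)}$ axis. Both conventions, like the function names, are assumptions of the example and are not claimed to reproduce the notes' own matrices entry by entry.

```python
import math

# Metric tensor g with signature (+, -, -, -)
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)] for r in range(4)]

def transpose(a):
    """Transpose of a 4x4 matrix."""
    return [list(row) for row in zip(*a)]

def boost_x1(beta):
    """Boost with velocity v = beta * c along the x(1) axis (assumed sign convention)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_x3(angle):
    """Rotation through the given angle about the x(3) axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def is_lorentz(lam, tol=1e-12):
    """True if lam satisfies g = lam^T g lam within the tolerance."""
    lhs = matmul(matmul(transpose(lam), G), lam)
    return all(abs(lhs[r][c] - G[r][c]) < tol for r in range(4) for c in range(4))

if __name__ == "__main__":
    print("boost   :", is_lorentz(boost_x1(0.6)))
    print("rotation:", is_lorentz(rotation_x3(0.3)))
    print("product :", is_lorentz(matmul(boost_x1(0.6), rotation_x3(0.3))))
```

The product of two matrices that satisfy the condition satisfies it as well, which is the group property behind composing boosts and rotations.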
Since the boost velocity v and the angles of rotation are continuous, one
could consider transformations where these quantities are infinitesimal. Such
infinitesimal transformations can be expanded as
= + + . . .
(127)
where is the Kronecker delta function representing the identity transformation5 and is a matrix which is firstorder in the infinitesimal parameter. The
condition on required for to be a Lorentz transform is given by
0 = g, + g,
(129)
or, on using the metric tensor to lower the indices, one has
, = ,
(130)
(131)
which transforms the contravariant components of a vector into covariant components. It follows that, if and are either both spaceindices or are both
timeindices, the components of the finite Lorentz transformation matrix
are antisymmetric on interchanging and . Whereas if the pair of indices
and are mixed space and timeindices, the components of the transformation
matrix are symmetric.
Exercise:
Show that a Lorentz transformation from the unprimed rest frame to the
primed reference frame moving along the x(3) axis with constant velocity v, can
be considered as a rotation through an imaginary angle = i in spacetime,
where i c t plays the role of a spatial coordinate. Find the equation that determines .
4.4 Invariant Form of Maxwell's Equations
In physics, one strives to write the fundamental equations in forms which are
independent of arbitrary choices, such as the coordinate system or the choice of
gauge condition. However, in particular applications it is expedient to choose
the coordinate system and gauge condition in ways that highlight the symmetries and simplify the mathematics.
We shall introduce an antisymmetric field tensor $F^{\mu,\nu}$, which is gauge invariant. That is, the form of $F^{\mu,\nu}$ is independent of the choice of gauge. We
5 The student more adept in indexgymnastics may consider the advantages and disadvantages of replacing the Kronecker delta function by g , since
g = g , g, = ,
(128)
(133)
(134)
(135)
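since
$$\begin{aligned}
F^{1,0} &= \frac{\partial \phi}{\partial x_1} - \frac{1}{c} \frac{\partial A^{(1)}}{\partial t} \\
&= - \left( \frac{1}{c} \frac{\partial A^{(1)}}{\partial t} + \frac{\partial \phi}{\partial x^{(1)}} \right) \\
&= E^{(1)}
\end{aligned} \tag{136}$$
and
$$\begin{aligned}
F^{1,2} &= \frac{\partial A^{(2)}}{\partial x_1} - \frac{\partial A^{(1)}}{\partial x_2} \\
&= - \frac{\partial A^{(2)}}{\partial x^{(1)}} + \frac{\partial A^{(1)}}{\partial x^{(2)}} \\
&= - B^{(3)}
\end{aligned} \tag{137}$$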
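The non-zero components of the field tensor are related to the spatial components (i, j, k) of the electromagnetic field by
$$F^{i,0} = E^{(i)} \tag{138}$$
and
$$F^{i,j} = - \varepsilon^{i,j,k} B^{(k)} \tag{139}$$
where $\varepsilon^{i,j,k}$ is the Levi-Civita symbol. Therefore, the field tensor can be expressed as the matrix
$$F^{\mu,\nu} = \begin{pmatrix}
0 & -E^{(1)} & -E^{(2)} & -E^{(3)} \\
E^{(1)} & 0 & -B^{(3)} & B^{(2)} \\
E^{(2)} & B^{(3)} & 0 & -B^{(1)} \\
E^{(3)} & -B^{(2)} & B^{(1)} & 0
\end{pmatrix} \tag{140}$$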
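The component relations $F^{i,0} = E^{(i)}$ and $F^{i,j} = -\varepsilon^{i,j,k} B^{(k)}$ fix every entry of the field tensor once antisymmetry is imposed. As a small sketch, the function below assembles the 4 × 4 array from given field components; the function names and the sample field values are purely illustrative.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def field_tensor(E, B):
    """Assemble F[mu][nu] from the three components of E and of B."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(1, 4):
        F[i][0] = E[i - 1]    # F^{i,0} = E^(i)
        F[0][i] = -E[i - 1]   # antisymmetry
        for j in range(1, 4):
            # F^{i,j} = - eps_{i,j,k} B^(k), summed over k
            F[i][j] = -sum(levi_civita(i, j, k) * B[k - 1] for k in range(1, 4))
    return F

def is_antisymmetric(F):
    """True if F[mu][nu] = -F[nu][mu] for all index pairs."""
    return all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))

if __name__ == "__main__":
    F = field_tensor((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
    for row in F:
        print(row)
    print("antisymmetric:", is_antisymmetric(F))
```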
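In terms of the field tensor, the field equations with sources read
$$\partial_\nu F^{\nu,\mu} = \frac{4 \pi}{c}\, j^\mu \tag{141}$$
For a space index $\mu = i$, the field equations take the form
$$\begin{aligned}
\frac{1}{c} \frac{\partial}{\partial t} F^{0,i} + \frac{\partial}{\partial x^j} F^{j,i} &= \frac{4 \pi}{c}\, j^{(i)} \\
- \frac{1}{c} \frac{\partial}{\partial t} E^{(i)} + \varepsilon^{i,j,k} \frac{\partial}{\partial x^j} B^{(k)} &= \frac{4 \pi}{c}\, j^{(i)} \\
- \frac{1}{c} \frac{\partial}{\partial t} E^{(i)} + \left( \nabla \times \mathbf{B} \right)^{(i)} &= \frac{4 \pi}{c}\, j^{(i)}
\end{aligned} \tag{142}$$
while for the time index $\mu = 0$ they take the form
$$\frac{\partial}{\partial x^j} F^{j,0} = \frac{\partial}{\partial x^j} E^{(j)} = \nabla \cdot \mathbf{E} = \frac{4 \pi}{c}\, j^{(0)} \tag{143}$$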
since $F^{0,0}$ vanishes. The above field equations are the two Maxwell's equations which involve the sources of the fields. The remaining two sourceless Maxwell equations are expressed in terms of the antisymmetric field tensor as
$$\partial_\lambda F_{\mu,\nu} + \partial_\mu F_{\nu,\lambda} + \partial_\nu F_{\lambda,\mu} = 0 \tag{144}$$
where the indices are permuted cyclically. These internal equations reduce to
$$\nabla \cdot \mathbf{B} = 0 \tag{145}$$
when $\lambda$, $\mu$ and $\nu$ are the space indices (1, 2, 3). When one index taken from the set $(\lambda, \mu, \nu)$ is the time index, and the other two are different space indices, the field equations reduce to
$$\frac{1}{c} \frac{\partial \mathbf{B}}{\partial t} + \nabla \times \mathbf{E} = 0 \tag{146}$$
If two indices are repeated, the above equations are satisfied identically, due to the antisymmetry of the field tensor.
Alternatively, when expressed in terms of the vector potential, the field equations of motion are equivalent to the wave equations
$$\partial_\nu \partial^\nu A^\mu - \partial^\mu \left( \partial_\nu A^\nu \right) = \frac{4 \pi}{c}\, j^\mu \tag{147}$$
= A
= j
(148)
(149)
(150)
This shows that, under a Lorentz transform, the electric and magnetic fields
(E, B) transform into themselves.
Exercise:
Show explicitly, how the components of the electric and magnetic fields
change, when the coordinate system is transformed from the unprimed reference frame to a primed reference frame which is moving along the x(3) axis with
constant velocity v.
Consider a string stretched along the x-axis, which can support motion in the y-direction. We shall consider the string to be composed of mass elements $m_i = \rho\, a$, that have fixed x-coordinates denoted by $x_i$ and are separated by a distance a. The mass elements can be displaced along the y-axis. The y-coordinate of the i-th mass element is denoted by $y_i$. We shall assume that the string satisfies the spatial boundary conditions at each end. We shall assume that the string satisfies periodic boundary conditions, so that $y_0 = y_{N+1}$.
The Lagrangian for the string is a function of the coordinates $y_i$ and the velocities $\frac{dy_i}{dt}$. The Lagrangian is given by
$$L = \sum_{i=1}^{N} \left[ \frac{m_i}{2} \left( \frac{dy_i}{dt} \right)^2 - \frac{\kappa_i}{2} \left( y_i - y_{i-1} \right)^2 \right] \tag{151}$$
The first term represents the kinetic energy of the mass elements, and the second term represents the increase in the elastic potential energy of the section of the string between the i-th and (i-1)-th element as the string is stretched from its equilibrium position. This follows since $s_i$, the length of the section of string between mass elements i and i-1 in a non-equilibrium position, is given by
$$\begin{aligned}
s_i^2 &= ( x_i - x_{i-1} )^2 + ( y_i - y_{i-1} )^2 \\
&= a^2 + ( y_i - y_{i-1} )^2
\end{aligned} \tag{152}$$
since the x-coordinates are fixed. Thus, if one assumes that the spring constant for the stretched string segment is $\kappa_i$, then the potential energy of the segment is given by
$$V_i = \frac{\kappa_i}{2} \left( y_i - y_{i-1} \right)^2 \tag{153}$$
We shall consider the case of a uniform string for which $\kappa_i = \kappa$ for all i.

Figure 7: A string composed of a discrete set of particles of masses $m_i$ separated by a distance a along the x-axis. The particles can be moved from their equilibrium positions by displacements $y_i$ transverse to the x-axis.
The equations of motion are obtained by minimizing the action S which is defined as the integral
$$S = \int_0^T dt \; L \tag{154}$$
(155)
(156)
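where $S_0$ is the action evaluated for the extremal trajectories. The first-order deviation found by varying $y_i$ is given by
$$\delta^1 S = \int_0^T dt \sum_{i=1}^{N} \left[ m_i \frac{d \delta y_i}{dt} \frac{d y_i^{ex}}{dt} - \kappa\, \delta y_i \left( y_i^{ex} - y_{i-1}^{ex} \right) + \kappa\, \delta y_i \left( y_{i+1}^{ex} - y_i^{ex} \right) \right] \tag{157}$$
in which $y_i(t)$ and $\frac{dy_i}{dt}$ are to be evaluated for the extremal trajectory. Since the trajectory which the string follows minimizes the action, the term $\delta^1 S$ must vanish for an arbitrary variation $\delta y_i$. We can eliminate the time derivative of the deviation by integrating by parts with respect to t. This yields
$$\begin{aligned}
\delta^1 S = & \int_0^T dt \sum_{i=1}^{N} \left[ - m_i\, \delta y_i \frac{d}{dt} \frac{d y_i^{ex}}{dt} - \kappa\, \delta y_i \left( y_i^{ex} - y_{i-1}^{ex} \right) + \kappa\, \delta y_i \left( y_{i+1}^{ex} - y_i^{ex} \right) \right] \\
& + \sum_i m_i\, \delta y_i(t) \left. \frac{d y_i^{ex}}{dt} \right|_0^T
\end{aligned} \tag{158}$$
The boundary term vanishes since the initial and final configurations are fixed, so
$$\delta y_i(T) = \delta y_i(0) = 0 \tag{159}$$
Hence the first-order variation of the action reduces to
$$\delta^1 S = - \int_0^T dt \sum_{i=1}^{N} \delta y_i \left[ m_i \frac{d}{dt} \frac{d y_i^{ex}}{dt} + \kappa \left( 2 y_i^{ex} - y_{i-1}^{ex} - y_{i+1}^{ex} \right) \right] \tag{160}$$
The linear variation of the action vanishes for an arbitrary $\delta y_i(t)$, if the term in the square brackets vanishes
$$m_i \frac{d}{dt} \frac{d y_i^{ex}}{dt} + \kappa \left( 2 y_i^{ex} - y_{i-1}^{ex} - y_{i+1}^{ex} \right) = 0 \tag{161}$$
Thus, out of all possible trajectories, the physical trajectory $y_i^{ex}(t)$ is determined by the equation of motion
$$m_i \frac{d}{dt} \frac{d y_i}{dt} = - \kappa \left( 2 y_i - y_{i-1} - y_{i+1} \right) \tag{162}$$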
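The structure of the discrete equation of motion can be explored numerically. The sketch below evaluates the right-hand side, $-\kappa ( 2 y_i - y_{i-1} - y_{i+1} )$, for a sinusoidal displacement on a periodic chain and confirms that the force is proportional to the displacement. The proportionality constant $4 \kappa \sin^2(ka/2)$ is the standard result for a harmonic chain and is quoted here as an assumption; it is not derived in the text above. Function names and parameter values are illustrative.

```python
import math

def restoring_force(y, kappa):
    """Right-hand side of the discrete equation of motion, with periodic wrap-around."""
    n = len(y)
    return [-kappa * (2.0 * y[i] - y[i - 1] - y[(i + 1) % n]) for i in range(n)]

def sinusoidal_profile(n, mode, amplitude=1.0):
    """Displacements y_i = amplitude * sin(k x_i) with k a = 2 pi mode / n."""
    return [amplitude * math.sin(2.0 * math.pi * mode * i / n) for i in range(n)]

def mode_stiffness(n, mode, kappa):
    """Standard harmonic-chain result 4 kappa sin^2(k a / 2), with k a = 2 pi mode / n."""
    return 4.0 * kappa * math.sin(math.pi * mode / n) ** 2

if __name__ == "__main__":
    n, mode, kappa = 16, 1, 2.0  # illustrative values
    y = sinusoidal_profile(n, mode)
    f = restoring_force(y, kappa)
    stiffness = mode_stiffness(n, mode, kappa)
    worst = max(abs(fi + stiffness * yi) for fi, yi in zip(f, y))
    print("largest deviation from f_i = -stiffness * y_i:", worst)
```

For long wavelengths the stiffness approaches $\kappa (ka)^2$, so each sinusoidal mode oscillates with a frequency proportional to k, which is the behaviour expected of waves on a continuous string.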
dyi
dt
(164)
pi
dyi
L
dt
(165)
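The Hamiltonian is only a function of the pairs of canonically conjugate momenta $p_i$ and coordinates $y_i$. This can be seen by considering infinitesimal changes in $y_i$, $\frac{dy_i}{dt}$ and $p_i$. The resulting infinitesimal change in the Hamiltonian dH is expressed as
$$\begin{aligned}
dH &= \sum_{i=1}^{N} \left[ dp_i \frac{dy_i}{dt} + p_i \, d\!\left( \frac{dy_i}{dt} \right) - \frac{\partial L}{\partial y_i} \, dy_i - \frac{\partial L}{\partial ( \frac{dy_i}{dt} )} \, d\!\left( \frac{dy_i}{dt} \right) \right] \\
&= \sum_{i=1}^{N} \left[ \frac{dy_i}{dt} \, dp_i - \frac{\partial L}{\partial y_i} \, dy_i \right]
\end{aligned} \tag{166}$$
since the terms proportional to the infinitesimal change $d( \frac{dy_i}{dt} )$ vanish identically, due to the definition of $p_i$. From this, one finds
$$\frac{\partial H}{\partial p_i} = \frac{dy_i}{dt} \tag{167}$$
and
$$\frac{\partial H}{\partial y_i} = - \frac{\partial L}{\partial y_i} \tag{168}$$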
pi
mi
= i
yi yi1
+ i+1
yi+1 yi
(171)
(172)
yi pi
yi pi
i=1
(173)
pi , pj
=
yi , yj
= 0
(175)
a
2
dyi
dt
2
+
2
yi yi1
(176)
5.1
i=1
1
a
dx
(182)
(183)
dx L
L =
(184)
where
L =
1
2
dy
dt
2
a
y
x
2
(185)
The equations of motion are found from the extrema of the action
$$S = \int dt \int dx \; \mathcal{L} \tag{186}$$
It should be noted that in S time and space are treated on the same footing and that $\mathcal{L}$ is a scalar quantity.
In the continuum limit, the Hamiltonian is given by
$$H = \int_0^L dx \; \mathcal{H} \tag{187}$$
H =
dy
dt
2
+ a
y
x
2
(188)
dy
dt
y
x
(189)
(190)
5.2 Normal Modes
The solutions of the equations of motion are of the form of a (real) superposition of plane waves
$$\phi_k(x) = \frac{1}{\sqrt{L}} \exp\left[ i ( k x - \omega t ) \right] \tag{191}$$
The above expression satisfies the wave equation if the frequency $\omega$ satisfies the dispersion relation
$$\omega_k^2 = v^2 k^2 \tag{192}$$
The above dispersion relation yields both positive and negative frequency solutions. If the plane waves are to satisfy periodic boundary conditions, k must be quantized so that
$$k_n = \frac{2 \pi}{L}\, n \tag{193}$$
for integer n. The positive-frequency solutions shall be written as
$$\phi_k(x) = \frac{1}{\sqrt{L}} \exp\left[ i ( k_n x - \omega_n t ) \right] \tag{194}$$
and the negative-frequency solutions as
$$\phi^*_{-k}(x) = \frac{1}{\sqrt{L}} \exp\left[ i ( k_n x + \omega_n t ) \right] \tag{195}$$
These solutions form an orthonormal set since
$$\int_0^L dx \; \phi^*_{k'}(x) \, \phi_k(x) = \delta_{k',k} \tag{196}$$
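The orthonormality of the quantized plane waves can be illustrated with a simple quadrature. The sketch below evaluates the overlap integral at t = 0 by a Riemann sum over equally spaced points, which happens to be exact (up to rounding) for these periodic functions as long as the difference of the mode numbers is not a multiple of the number of sample points. The function names, the length L = 2 and the number of samples are assumptions of the example.

```python
import cmath
import math

def wavevector(n, length):
    """Quantized wavevector k_n = 2 pi n / L."""
    return 2.0 * math.pi * n / length

def plane_wave(n, x, length):
    """Normalized plane wave exp(i k_n x) / sqrt(L) at t = 0."""
    return cmath.exp(1j * wavevector(n, length) * x) / math.sqrt(length)

def overlap(n_prime, n, length, samples=64):
    """Riemann-sum approximation of the overlap integral over one period."""
    dx = length / samples
    total = 0j
    for j in range(samples):
        x = j * dx
        total += plane_wave(n_prime, x, length).conjugate() * plane_wave(n, x, length)
    return total * dx

if __name__ == "__main__":
    L = 2.0  # illustrative length of the string
    print("overlap(2, 2):", overlap(2, 2, L))
    print("overlap(2, 3):", overlap(2, 3, L))
    print("overlap(-1, 1):", overlap(-1, 1, L))
```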
$$y(x) = \sum_k \left[ c_k(0) \, \phi_k(x) + c^*_{-k}(0) \, \phi^*_{-k}(x) \right] \tag{197}$$
where the $c_k$ are arbitrary complex numbers that depend on k. If the time dependence of the $\phi_k(x)$ is absorbed into the complex functions $c_k$ via
$$c_k(t) = c_k(0) \exp\left[ - i \omega_k t \right] \tag{198}$$
then one has
$$y(x) = \frac{1}{\sqrt{L}} \sum_k \left[ c_k(t) + c^*_{-k}(t) \right] \exp\left[ i k x \right] \tag{199}$$
which is purely real. Thus, the field y(x) is determined by the amplitudes of the normal modes, i.e. by $c_k(t)$. The time-dependent amplitude $c_k(t)$ satisfies the equation of motion
$$\frac{d^2 c_k}{dt^2} = - \omega_k^2 \, c_k \tag{200}$$
and, therefore, behaves like a classical harmonic oscillator. To quantize this classical field theory, one needs to quantize these harmonic oscillators.
The Hamiltonian is expressed as
2
Z L
1
1
y(x)
2
2
H =
dx
p(x) + a
2a 0
a
x
On substituting y(x) in the form
1 X
ck (t) + ck (t)
exp i k x
y(x) =
L k
and
dck
dck
a X
+
exp i k x
p(x) =
dt
dt
L k
(201)
(202)
(203)
then after integrating over x, one finds that the energy has the form
dck
X
dck
dck
dck
H =
+
+
2
dt
dt
dt
dt
k
a X 2
+
k
ck (t) + ck (t)
ck (t) + ck (t)
(204)
2
k
H =
k ck (t) ck (t)
ck (t) ck (t)
2
k
a X 2
+
k
ck (t) + ck (t)
ck (t) + ck (t)
(205)
2
k
(206)
k2
(207)
(208)
(209)
and also
i k 0 a
ck0
ck0
a X
p(xj ) exp
L j
i k xj
(210)
These relations are simply the results of applying the inverse Fourier transform
to y(x) and p(x). One can find the Poisson brackets relations between ck and
ck from
i k 0
ck0 ck0
,
ck + ck
a X
=
p(xi ) , y(xj )
exp i ( k xi + k 0 xj )
L i,j
a X
0
= +
i,j exp i ( k xi + k xj )
L i,j
a X
0
= +
exp i ( k + k ) xi
L i
=
+ k+k0
(211)
Likewise, one can obtain similar expressions for the other commutation relations.
This set of equations can be satisfied by setting
i
ck0 , ck
=
k,k0
(212)
2 k
and
ck0
ck
=
ck0 , ck
= 0
(213)
The above set of Poisson brackets can be recast in a simpler form by defining
ck =
1
ak
2 k
(214)
ak0 , ak
= i k,k0
and
ak0 , ak
=
(215)
ak0 , ak
= 0
(216)
5.3
(218)
[a
k0 , a
k ] = [ a
k0 , a
k ] = 0
(219)
and
To get rid of the annoying \hbar in the commutator, one can set
    \hat{a}^\dagger_k = \sqrt{\hbar} \, \hat{b}^\dagger_k
    \hat{a}_k = \sqrt{\hbar} \, \hat{b}_k    (220)
Whether it was noted or not, \hat{b}^\dagger_k is the Hermitean conjugate of \hat{b}_k. The Hermitean relation can be proved by taking the Hermitean conjugate of y(x_i), and noting that the third rule of quantization states, "Measurable quantities are to be replaced by Hermitean operators." Therefore, the operator
s
1 X
h
bk (t) + b (t)
y(x) =
exp
i
k
x
(221)
k
2 k
L
k
y (x) =
bk (t) + ( bk (t))
exp i k x
(222)
2 k
L k
has to be the same as y(x). On setting k = - k' in the above equation, one has
s
X
1
h
y (x) =
bk0 (t) + ( bk0 (t))
exp + i k x (223)
2 k 0
L k0
For y^\dagger(x) to be equal to y(x), it is necessary that the Hermitean conjugate of the operator \hat{b}^\dagger_{k'} is equal to \hat{b}_{k'}. This shows that the pair of operators are indeed Hermitean conjugates. The quantum field is represented by the operator
s
1 X
h
y(x) =
bk0 (t) + bk0 (t)
exp + i k x
(224)
2 k 0
L 0
k
H =
k ck ck + ck ck
k
X h k
=
bk bk + bk bk
2
(226)
H =
(227)
b k bk + b k bk
2
k
(228)
is evaluated as
P
=
a
2
X
h k
b bk
k
b + bk
k
(229)
where the plane-wave orthogonality properties have been used. This quantity
can be expressed as the sum of two terms
a X
P =
h k bk bk bk bk
2
k
a X
+
h k bk bk bk bk
(230)
2
k
(231)
which obviously is proportional to the sum of the momenta of the quanta. The
quantity a is just the square of the wave velocity v 2 . On noting that the
quanta travel with velocities given by v sign(k) and have energies given by
\hbar \omega_k = \hbar v | k |, one sees that P is expressed as the total energy flux
5.4
(232)
(233)
The eigenvalues n_k are non-negative integers. This can be inferred from the commutation relations
    [ \hat{b}_k , \hat{b}^\dagger_{k'} ] = \delta_{k,k'}    (234)
b 0
k
bk0
(235)
(236)
Therefore, since \hat{b}_k lowers the eigenvalue of the number operator by one unit, one can write
    \hat{b}_k | n_k > = C(n_k) | n_k - 1 >    (237)
where the complex number C(nk ) has to be determined. The normalization
coefficient C(nk ) can be determined by noting that
< nk  bk
(238)
(239)
On taking the norm of the state and its conjugate, one finds the normalization
    < n_k | \hat{b}^\dagger_k \hat{b}_k | n_k > = C^*(n_k) C(n_k) < n_k - 1 | n_k - 1 >
                                               = | C(n_k) |^2    (240)
However, on using the definition of the number operator and the normalization condition, one finds that
    | C(n_k) |^2 = n_k    (241)
so, on choosing the phase factor, one can define
    \hat{b}_k | n_k > = \sqrt{n_k} | n_k - 1 >    (242)
(243)
(244)
    = | C'(n_k) |^2    (245)
which with
    \hat{b}_k \hat{b}^\dagger_k = \hat{n}_k + 1    (246)
yields
    | C'(n_k) |^2 = n_k + 1    (247)
Since the phase factor has already been determined by the Hermitean conjugate equation, one has
    \hat{b}^\dagger_k | n_k > = \sqrt{n_k + 1} | n_k + 1 >    (248)
which raises the eigenvalue of the number operator.
Hence, one sees that the eigenvalues of the number operator are separated by integers. Furthermore, the smallest eigenvalue corresponds to n_k = 0, since for n_k = 0 the equation
    \hat{b}_k | n_k > = \sqrt{n_k} | n_k - 1 >    (249)
reduces to
    \hat{b}_k | 0 > = 0    (250)
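The ladder relations above can be illustrated for a single mode with a small numerical model. This is only a sketch: the Fock space is truncated at a hypothetical maximum occupation N_MAX, states are plain lists of amplitudes, and the function names are illustrative rather than taken from the notes.

```python
import math

N_MAX = 8  # hypothetical truncation of the Fock space to |0>, ..., |N_MAX>


def number_state(n):
    """Amplitude list representing the number eigenstate |n>."""
    state = [0.0] * (N_MAX + 1)
    state[n] = 1.0
    return state


def annihilate(state):
    """Apply b: b|n> = sqrt(n)|n-1>, and b|0> = 0."""
    out = [0.0] * len(state)
    for n in range(1, len(state)):
        out[n - 1] += math.sqrt(n) * state[n]
    return out


def create(state):
    """Apply b^dagger: b^dagger|n> = sqrt(n+1)|n+1> (the top level is cut off)."""
    out = [0.0] * len(state)
    for n in range(len(state) - 1):
        out[n + 1] += math.sqrt(n + 1) * state[n]
    return out


def commutator_on(state):
    """Return (b b^dagger - b^dagger b)|state>."""
    first = annihilate(create(state))
    second = create(annihilate(state))
    return [p - q for p, q in zip(first, second)]


if __name__ == "__main__":
    print(annihilate(number_state(0)))      # the vacuum is annihilated
    print(commutator_on(number_state(2)))   # reproduces |2> up to rounding
```

The commutator check only holds for states below the cut-off, since the truncation removes the level above N_MAX.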
where the sum runs over all possible number eigenstates, and the complex coefficients C({n_k}) are arbitrary except that they must satisfy the normalization condition
    \sum_{\{n_k\}} | C(\{n_k\}) |^2 = 1    (253)
5.5
The classical limit of the quantum field theory can be characterized by the limit in which the field operator can be replaced by a function. This requires that the classical states are not only described as states with large numbers of quanta in the excited normal modes, but also that the state is a linear superposition of states with different numbers of quanta, with reasonably well-defined phases of the complex coefficients. For a quantum state to ideally represent a given classical state, one needs the quantum state to be composed of a coherent superposition of states with different numbers of quanta.
That the eigenstates of the number operators ( | {n_k} > ) cannot represent classical states can be seen by noting that the expectation value of the field operator is zero
    < {n_k} | \hat{y}(x) | {n_k} > = 0    (254)
which follows from the vanishing expectation values of the creation and annihilation operators
    < {n_k} | \hat{a}_k | {n_k} > = 0    (255)
Despite the fact that the average value of the field is zero, the fluctuation in the
field amplitude is infinite since
h
1 X
0
0
< {nk }  bk0 + bk
bk + bk0
 {nk } >
< {nk }  y(x)  {nk } > =
L 0 2 k 0
k
1 X
h
( 1 + 2 nk 0 )
(256)
=
L 0 2 k 0
k
and the zero-point contribution diverges logarithmically at the upper and lower limits of integration.
Hence, the eigenstates of the number operator, or equivalently \hat{H}, do not describe the classical states of the string. Classical states must be expressed as a linear superposition of energy eigenstates.
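The contrast between a number eigenstate and a superposition with well-defined phases can be made concrete for a single mode. The sketch below compares the expectation value of (b + b^dagger) in a number state with that in a Glauber coherent state; the coherent state is used here as one standard example of such a superposition, not necessarily the construction intended in the notes, and the truncation N_MAX and the function names are assumptions.

```python
import math

N_MAX = 60  # hypothetical truncation; ample for alpha = 2


def number_state(n):
    """Amplitude list representing the number eigenstate |n>."""
    state = [0.0] * (N_MAX + 1)
    state[n] = 1.0
    return state


def coherent_state(alpha):
    """Amplitudes exp(-alpha^2/2) alpha^n / sqrt(n!) for real alpha."""
    norm = math.exp(-alpha * alpha / 2)
    return [norm * alpha**n / math.sqrt(math.factorial(n)) for n in range(N_MAX + 1)]


def expectation_b_plus_bdagger(state):
    """<state| (b + b^dagger) |state> for a state with real amplitudes."""
    # <b> = sum_n state[n] * sqrt(n+1) * state[n+1]; <b^dagger> is its conjugate.
    mean_b = sum(state[n] * math.sqrt(n + 1) * state[n + 1] for n in range(N_MAX))
    return 2 * mean_b


if __name__ == "__main__":
    print(expectation_b_plus_bdagger(number_state(5)))     # vanishes identically
    print(expectation_b_plus_bdagger(coherent_state(2.0)))  # close to 2*alpha
```

Only neighbouring occupation numbers contribute to the sum, so a state with a single occupation number always gives zero, whatever that number is.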
0
3
S =
dt
d x L ,
(257)
value
ex and the deviation as
=
ex +
(258)
The space and time derivatives of the arbitrary field can also be expressed as
the derivatives of the sum of the extremal field and the deviation
=
ex +
(259)
S =
dt0
d3 x
L
+
(
)
L
ex
ex
ex
ex
( )
0
(260)
On integrating by parts with respect to x in the last term, and on assuming
appropriate boundary conditions, one finds
Z t
Z
0
3
S =
dt
d x
L ex , ex
L ex , ex
( )
0
(261)
which has to vanish for an arbitrary choice of . Hence, one obtains the
EulerLagrange equations
(262)
ex
ex
ex
ex
( )
This set of equations determine the time dependence of the classical fields
ex (x). That is, out of all possible fields with components , the equations
of motion determine the physical field which has the components
ex . It is
convenient to define the field momentum density 0 (x) conjugate to as
1
(x ) =
L ,
(263)
c (0 )
The Hamiltonian density H is then defined as the Legendre transform
X
H = c
0 (0 ) L
(264)
2
L =
( ) ( )
2
h
(265)
for a real scalar field, determine the Euler-Lagrange equation and the Hamiltonian density \mathcal{H}.
Exercise:
Consider the Lagrangian density
L =
1
2
( ) ( )
mc
h
2
 2
(266)
h2
V (x)
L =
2m
2i
t
t
(267)
(i) Determine the equation of motion, and the Hamiltonian density H.
(ii) Consider the case V (x) 0, then by Fourier transforming with respect to
space and time, determine the form of the general solution for .
6.1
The Hamiltonian formulation reserves a special role for time, and so is not
Lorentz covariant. However, the Hamiltonian formulation is the most convenient formulation for quantizing fields. The Hamilton equations of motion are
determined from the Hamiltonian
Z
H =
d3 x H
(268)
by noting that H is only a functional of 0 and . This can be seen, since as
X
Z
H =
d3 x c
0 (0 ) L
(269)
H =
d x c
(0 ) + (0 )
L
(270)
(271)
where the Euler-Lagrange equations were substituted into the first term. Therefore, the variation in the Hamiltonian is given by
Z
X
3
0
0
H =
d xc
(0 ) (0 )
(272)
which does not involve the time derivative of the fields. This implies that
the Hamiltonian is a function of the fields 0 , and their derivatives. On
calculating the variation of H using the independent variables 0 and , and
integrating by parts, one finds that the Hamiltonian equations of motion are
given by
H
H
c 0 =
(0 )
0
H
H
c 0 0 =
(273)
( )
The structure of these equations is similar to that of the classical mechanics of point particles. As in the classical mechanics of point particles, one can define Poisson Brackets with fields. When quantizing the fields, the Poisson Bracket relations between the fields can be replaced by commutation relations.
6.2
Emmy Noether produced a theorem linking continuous symmetries of a Lagrangian to conservation laws7 .
6.2.1 Conservation Laws
(274)
(275)
7 E. Noether, Nachr. d. Kgl. Gesellsch. d. Wiss. Göttingen, Math.-Phys. Kl. (1918) 235.
(276)
( )
where the field index is to be summed over. However, the generalized momentum density (x) is defined by
L
(277)
(x) =
( )
so
L = ( ) +
(278)
L
= 0
(279)
L
+ ( ) +
L
= +
=
(281)
L =
since the last term in the second line vanishes if the fields satisfy the Euler-Lagrange equations. If the Lagrangian is invariant under the transformation,
then \delta \mathcal{L} = 0, so
= 0
(282)
where the field index is to be summed over. The above equation can be
rewritten as a continuity equation
j = 0
(283)
(x)
( )
(284)
(285)
The conserved charge Q is defined as the integral over all space of the time
component of the current density j (0) . That is, the conserved charge is given by
Z
Q =
d3 x j (0) (x)
(286)
or, more specifically
Z
Q =
(287)
Since is a constant, the total charge Q is constant. Therefore, the total time
derivative of Q vanishes
dQ
= 0
(288)
dt
The spatial components of j form the current density vector.
6.2.2 Noether Charges
(289)
= + i
= i
(291)
where the field and its complex conjugate are regarded as independent fields. The
transformation represents an infinitesimal constant shift of the phase of the
field8. The conserved current is
L
L
j = i
(x)
(x)
(292)
( )
( )
which is the electromagnetic current density fourvector.
Exercise:
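The Lagrangian density for the complex Schrödinger field representing a charged particle is given by
    \mathcal{L} = - \frac{\hbar^2}{2m} \nabla\psi^* \cdot \nabla\psi - V(x) \psi^* \psi - \frac{\hbar}{2i} ( \psi^* \frac{\partial\psi}{\partial t} - \frac{\partial\psi^*}{\partial t} \psi )    (293)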
(i) Determine the conserved Noether charges.
Exercise:
Determine the Noether charges for a complex KleinGordon field theory,
governed by the Lagrangian density
L =
6.2.3
1
2
( ) ( )
mc
h
2
 2
(294)
Noethers Theorem
The basic theorem can be generalized to the case where the Lagrangian density
is not invariant under the infinitesimal transformation, but instead changes by
a combination of total derivatives. That is,
L =
(295)
    \psi'(x) = \exp[ - i \frac{q}{\hbar c} \Lambda(x) ] \psi(x)
A gauge transformation of the second kind is one in which the field changes according to
    A'_\mu = A_\mu + ( \partial_\mu \Lambda )
Since \hat{p}_\mu = i \hbar \partial_\mu, the combination of these transformations keeps the quantity ( \hat{p}_\mu - \frac{q}{c} A_\mu ) invariant.
for some analytic vector function with components . This type of transformation does not change the total action. If the Lagrangian changes by the above
amount for the combined transformation
(x) 0 (x) = (x) + (x)
(296)
L =
( ) + ( ) +
L
= +
=
(297)
one has
=
j =
(298)
(299)
(300)
holds.
6.3
(301)
(302)
In this case, the change in the Lagrangian density is given by the total derivative
L
L
)
+
L =
( )
L
L
=
( )
= L
(303)
where the last line follows since the Lagrangian only depends implicitly on x
through the fields. Hence, the change in the Lagrangian is a total derivative
L =
(304)
(305)
L
=
( )
L
L
=
( )
( )
L
=
(306)
( )
where the EulerLagrange equation has been used in the second line. Thus, the
fields satisfy the continuity conditions
L
0 =
L
(307)
( )
where
=
=
1
0
if =
otherwise
T =
( )
(308)
(309)
which is the energymomentum density T . The energy momentum tensor satisfies the conservation law
T = 0
(310)
The secondrank tensor can be written in contravariant form as
L
,
T , =
g
L
( )
(311)
where the metric tensor has been used to raise the index . The component
with = = 0 is the Hamiltonian density H for the fields
L
0,0
H = T
=
0
L
(312)
(0 )
(313)
(314)
(317)
(318)
,
,
M
=
T
x + T
T
x T ,
= T , T ,
(319)
(320)
expressing the independence of the variables x and x have been used. The
divergence of the thirdrank tensor vanishes if T , is symmetric. Thus, the
angular momentum tensor M ,, is conserved if the energymomentum tensor
is symmetric.
It should be noted that the tensor T^{\mu,\nu} is only symmetric for scalar fields.
This is related to the fact that a vector or tensor field carries a non-zero intrinsic angular momentum. It is possible to incorporate an additional term in the
2
( ) ( )

(321)
L =
2
h
(ii) Find the forms of the energy and momentum density of the field.
(iii) Using the form of the general solution, find expressions for the total energy
and momentum of the field in terms of the Fourier components of the field.
Exercise:
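(i) Determine the energy-momentum tensor for the Lagrangian density for the complex Schrödinger field representing a charged particle given by
    \mathcal{L} = - \frac{\hbar^2}{2m} \nabla\psi^* \cdot \nabla\psi - V(x) \psi^* \psi - \frac{\hbar}{2i} ( \psi^* \frac{\partial\psi}{\partial t} - \frac{\partial\psi^*}{\partial t} \psi )    (322)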
(ii) Find the forms of the energy and momentum density of the field.
(iii) Find the forms of the generalized orbital angular momentum density of the
field.
(iv) Consider the case where V (x) 0. Using the form of the general solution,
find expressions for the total energy and momentum of the field in terms of the
Fourier components of the field.
    \mathcal{L} = - \frac{1}{16\pi} F^{\mu,\nu} g_{\mu,\rho} g_{\nu,\sigma} F^{\rho,\sigma}    (324)
From the above expression, it is seen that the two factors of the antisymmetric second-rank field tensors produce identical variations of the action. Furthermore, in this form the Lagrangian only depends on the contravariant derivatives
(326)
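    F^{\mu,\nu} = \begin{pmatrix} 0 & -E^{(1)} & -E^{(2)} & -E^{(3)} \\ E^{(1)} & 0 & -B^{(3)} & B^{(2)} \\ E^{(2)} & B^{(3)} & 0 & -B^{(1)} \\ E^{(3)} & -B^{(2)} & B^{(1)} & 0 \end{pmatrix}    (327)
and the covariant field tensor can be expressed as the matrix
    F_{\mu,\nu} = \begin{pmatrix} 0 & E^{(1)} & E^{(2)} & E^{(3)} \\ -E^{(1)} & 0 & -B^{(3)} & B^{(2)} \\ -E^{(2)} & B^{(3)} & 0 & -B^{(1)} \\ -E^{(3)} & -B^{(2)} & B^{(1)} & 0 \end{pmatrix}    (328)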
in which the signs of the terms with mixed time and space indices have changed.
Therefore, the Lagrangian density can be expressed in terms of the electromagnetic fields as
    \mathcal{L} = \frac{1}{8\pi} ( E^2 - B^2 )    (329)
Since the Lagrangian density is completely expressed in terms of the electromagnetic fields, it is gauge invariant.
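The reduction of the tensor contraction to the combination of E^2 and B^2 can be checked numerically. The sketch below assumes Gaussian units, a metric of signature (+,-,-,-), and the sign convention F^{0,i} = -E^{(i)}, F^{i,j} = -\epsilon_{ijk} B^{(k)}; these conventions and all function names are assumptions made for illustration.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g, signature (+,-,-,-) (assumed)


def contravariant_tensor(E, B):
    """F^{mu,nu} built from E and B with F^{0,i} = -E^{(i)} (assumed convention)."""
    e1, e2, e3 = E
    b1, b2, b3 = B
    return [
        [0.0, -e1, -e2, -e3],
        [e1, 0.0, -b3, b2],
        [e2, b3, 0.0, -b1],
        [e3, -b2, b1, 0.0],
    ]


def covariant_tensor(F):
    """Lower both indices with the diagonal metric."""
    return [[METRIC[m] * METRIC[n] * F[m][n] for n in range(4)] for m in range(4)]


def lagrangian_density(E, B):
    """-(1/16 pi) F^{mu,nu} F_{mu,nu} for the source-free field."""
    upper = contravariant_tensor(E, B)
    lower = covariant_tensor(upper)
    contraction = sum(upper[m][n] * lower[m][n] for m in range(4) for n in range(4))
    return -contraction / (16 * math.pi)


if __name__ == "__main__":
    E, B = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
    direct = (sum(x * x for x in E) - sum(x * x for x in B)) / (8 * math.pi)
    print(lagrangian_density(E, B), direct)
```

The result does not depend on the overall sign chosen for the tensor, because the tensor enters the contraction quadratically.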
    \mathcal{L} = - \frac{1}{16\pi} F^{\mu,\nu} F_{\mu,\nu} - \frac{1}{c} A_\mu j^\mu    (330)
This interaction term is the only Lorentz scalar that one can form with the four-vector current and the field. It should be noted that the last term is not gauge invariant. This action yields the equation of motion
    \partial_\mu F^{\mu,\nu} = \frac{4\pi}{c} j^\nu    (331)
as expected.
The lack of gauge invariance in the interaction Lagrangian
    \mathcal{L}_{int} = - \frac{1}{c} A_\mu j^\mu    (332)
does not affect the equations of motion. On performing the gauge transformation
    A_\mu \rightarrow A'_\mu = A_\mu + \partial_\mu \Lambda    (333)
one finds that the interaction part of the Lagrangian density is transformed to
    \mathcal{L}'_{int} = - \frac{1}{c} ( A_\mu + \partial_\mu \Lambda ) j^\mu    (334)
Since charge is conserved, the current density must satisfy the continuity equation
    \partial_\mu j^\mu = 0    (335)
The continuity condition can be used to express the interaction as the untransformed Lagrangian density and a perfect derivative
    \mathcal{L}'_{int} = - \frac{1}{c} A_\mu j^\mu - \frac{1}{c} \partial_\mu ( \Lambda j^\mu )    (336)
The perfect derivative term only adds a constant term to the action which does not affect the equations of motion9. Hence, although the Lagrangian density is not gauge invariant in the presence of sources, the Lagrangian equations of motion are gauge invariant.
The momentum density conjugate to A is calculated as
0, =
c
F 0,
4
(337)
9 The change in the form of the interaction Lagrangian density produced by a gauge transformation should be taken as a warning against considering quantities in a field theory as
being localized.
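    \mathcal{H} = - \frac{1}{4\pi} ( \partial_0 A_\nu ) F^{0,\nu} - \mathcal{L}
                = - \frac{1}{4\pi} ( F_{0,\nu} + \partial_\nu A_0 ) F^{0,\nu} - \mathcal{L}
                = - \frac{1}{4\pi} ( F_{0,\nu} + \partial_\nu A_0 ) F^{0,\nu} + \frac{1}{8\pi} ( B^2 - E^2 ) + \frac{1}{c} j^\mu A_\mu
                = \frac{1}{8\pi} ( E^2 + B^2 ) - \frac{1}{4\pi} ( \nabla \cdot E ) A_0 + \frac{1}{c} j^\mu A_\mu
                  + \frac{1}{4\pi} \nabla \cdot ( A^{(0)} E )    (338)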
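The fourth line has been derived by noting that the components of F^{0,\nu} are only non-zero for space-like \nu and are given by
    F^{0,i} = - E^{(i)}    (339)
Hence,
    - \frac{1}{4\pi} F_{0,\nu} F^{0,\nu} = + \frac{1}{4\pi} E^2    (340)
This is to be combined with the term
    - \frac{1}{8\pi} ( E^2 - B^2 )    (341)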
originating from the Lagrangian density. This combination results in the term
    \frac{1}{8\pi} ( E^2 + B^2 )    (342)
which is recognized as the usual expression for the energy density of a free electromagnetic field. On substituting eqn(339) into the second term in the third line, one finds
    + \frac{1}{4\pi} ( \nabla A_0 ) \cdot E    (343)
which can be expressed as
    \frac{1}{4\pi} ( \nabla A_0 ) \cdot E = - \frac{1}{4\pi} A_0 ( \nabla \cdot E ) + \frac{1}{4\pi} \nabla \cdot ( A_0 E )    (344)
This relation has been used in arriving at the fourth line of eqn(338). Since the divergence of the electric field satisfies Gauss's law
    \nabla \cdot E = 4 \pi \rho    (345)
(346)
1
1
( E 2 + B 2 ) A0 +
j A
8
c
1
+
. ( A(0) E )
4
+
(347)
(348)
which originates from the Lagrangian interaction (\mathcal{L}_{int}), one finds that the terms proportional to A_0 in the Hamiltonian density cancel. On neglecting the total derivative term [ + \frac{1}{4\pi} \nabla \cdot ( A^{(0)} E ) ], one finds that the Hamiltonian density reduces to
    \mathcal{H} = \frac{1}{8\pi} ( E^2 + B^2 ) - \frac{1}{c} j \cdot A    (349)
The first term is the energy density of the free electromagnetic field and the second term represents the energy of the interaction between the electromagnetic field and charged particles. It should be noted that the interaction Hamiltonian is expressed entirely in terms of an interaction between the current density and the vector potential, which demonstrates that the Hamiltonian is not invariant under a Lorentz transformation
    \mathcal{H}_{int} = - \frac{1}{c} j \cdot A    (350)
7.1
j A
16
c
(352)
T =
A
L
( A )
L
=
A L
( A )
The derivative of the Lagrangian density is evaluated as
L
1
=
F , F ,
( A )
8
1
=
F ,
4
Therefore, the energymomentum density is found as
1
,
T =
F
A L
4
(353)
(354)
(355)
On raising the index with the metric tensor, one has the contravariant secondrank tensor
1
,
,
T
=
F
A g , L
(356)
4
The energymomentum tensor is not gauge invariant, as it explicitly involves
the fields A . On using the expression for the sourcefree Lagrangian density
1
L =
E2 B2
(357)
8
one finds that the time components of T , are given by
1
1
2
2
0,0
+
. E
T
=
E + B
4
8
(358)
1
F 0, ( (j) A )
4
(359)
but since F , is antisymmetric, only the terms where is a spatial index are
nonzero. Hence, one has
1 X 0,i
T 0,j =
F ( (j) Ai )
4 i
1 X 0,i
1 X 0,i
(j) (i)
(i) (j)
= +
F
A
A
+
F ( (i) A(j) )
4 i
4 i
(360)
where the relation between the spacelike components of the covariant and contravariant fourvector Ai = A(i) has been used. Since the time component
of the field tensor is given by
F 0,i = E (i)
and10
(i)
(j)
(j)
(i)
=
(361)
X
i,j,k B (k)
(362)
E B
E ( (i) A(j) )
4
4 i
i,k
(j)
1
1
E B
+
E . ( A(j) )
(363)
4
4
(364)
0,j
1
=
4
(j)
E B
1
+
.
4
(j)
E
(365)
The components T 0, , apart from the terms involving total derivatives which
integrate out to zero, are related to the total energy and the components of the
total momentum of the electromagnetic field. The components of T , satisfy
the continuity equations
T , = 0
(366)
which represent the conservation of energy and momentum. The other mixed
time and spatial components of the energymomentum tensor are evaluated as
T j,0 =
1
4
(j)
E B
1
4
(j)
1
c t
E (j)
(367)
The components T^{j,0} represent the components of the energy flux.
It should be noted that the energy-momentum tensor T^{\mu,\nu} is not symmetric. This has the consequence that the covariant generalization of the angular momentum to the third-rank tensor
    M^{\mu,\nu,\rho} = T^{\mu,\nu} x^\rho - T^{\mu,\rho} x^\nu    (368)
(369)
g
F, ( A )
T
=
4
4
4
(370)
The first two terms are symmetric w.r.t. and and are gauge invariant. These
two terms will form the basis for , , which will be expressed as
1
1 ,
,
,
,
,
=
g
F, F
+
g
F, F
(371)
4
4
The expression , is symmetric under the interchange of and , as can be
seen by writing
1 ,
1
, =
F F , +
g
F, F ,
4
4
1
1
=
F , F +
g , F, F ,
(372)
4
4
If , and T , are to represent the same set of conserved quantities, the last
term in eqn(370) must be expressible as a total derivative. That this is true can
be seen by examining the asymmetric term
1 ,
g
F, ( A )
4
1
F , ( A )
4
(373)
where the index was raised by using the metric tensor. On combining the
above expression with the source free Maxwell equation
( F , ) = 0
11 J.
(374)
Belinfante, Physica 6, 887 (1939) has shown that the modified tensor
, = T , +
defined by
;,
where ;, is an arbitrary tensor that is antisymmetric under the interchange of the first
pair of indices
;, = ;,
will automatically satisfy the same continuity conditions as T , and leave the total energy
and momentum unaltered.
one obtains
1 ,
g
F, ( A )
4
1
,
,
=
F
( A ) + A ( F )
4
1
,
=
F
A
(375)
4
which is a total covariant derivative of a thirdrank tensor which is antisymmetric under the interchange of and . Furthermore, adding this term does to the
energymomentum tensor does not change the energymomentum conservation
law. This can be seen by observing that the difference between the two forms
of energymomentum conservation law involves the double derivative
1
( , T , ) =
F , A
(376)
4
and F , is antisymmetric. On interchanging the order of the derivatives in the
right hand side, switching the summation labels, and using the antisymmetric
property of F , , one has
1
( , T , ) =
F , A
4
1
F , A
=
4
1
=
F , A
4
1
F , A
(377)
= +
4
On comparing the right hand sides of the first and last line, one finds that they
have opposite signs and, therefore, they are zero. Thus, the difference between
continuity relations vanish
( , T , ) =
(378)
=
g
F, F
+
g
F, F
4
4
is a conserved quantity. The purely temporal component is given by
1
0,0 =
E2 + B2
8
(379)
(380)
1
=
4
(j)
E B
(381)
=
E E
+ B B
(E + B )
(382)
4
2
Noether's theorem is purely classical, but there are generalizations for quantum fields. Quantum generalizations include the Ward-Takahashi and Taylor-Slavnov identities.
Exercise:
Evaluate the components T^{j,0} and T^{i,j} of the (asymmetric) energy-momentum
tensor for a source-free electromagnetic field.
Exercise:
Show that in the presence of sources, the symmetric energy-momentum tensor has components with the form
1
1
2
2
0,0
j.A
E + B
=
c
8
(j)
1
A(j)
(383)
0,j =
E B
4
Verify the form of the conservation laws for energy and momentum.
Exercise:
Show that the extra term added to the tensor T i,j in order that i,j will
be symmetric produces a contribution to the angular momentum density of the
form
(j)
1
0,j
(384)
S
=
E A
4
which is the intrinsic spin density of the electromagnetic field.
7.2
The electromagnetic theory has been unified with the theory of weak interactions. This generalization requires the existence of two new types of spin-one particles in addition to the photon, which together mediate the electroweak interaction. These new particles have non-zero mass. The massive spin-one particle has to satisfy the equation12
    p^\mu p_\mu = m^2 c^2    (385)
(386)
    [ \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 + ( \frac{m c}{\hbar} )^2 ] A^\mu = \frac{4\pi}{c} j^\mu    (387)
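where \hbar no longer drops out. This equation can be derived from the Lagrangian
    \mathcal{L} = - \frac{1}{16\pi} F^{\mu,\nu} F_{\mu,\nu} + \frac{1}{8\pi} ( \frac{m c}{\hbar} )^2 A^\mu A_\mu - \frac{1}{c} j^\mu A_\mu    (388)
which leads to the equation of motion
    \partial_\mu F^{\mu,\nu} + ( \frac{m c}{\hbar} )^2 A^\nu = \frac{4\pi}{c} j^\nu    (389)
Neither the Lagrangian nor the equation of motion is gauge invariant. The appropriate gauge condition can be enforced by imposing conservation of charge13
    \partial_\nu j^\nu = 0    (390)
On taking the four-divergence of the equation of motion, one has
    \partial_\nu \partial_\mu F^{\mu,\nu} + ( \frac{m c}{\hbar} )^2 \partial_\nu A^\nu = \frac{4\pi}{c} \partial_\nu j^\nu    (391)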
The first term on the lefthand side vanishes due to the definition of F , , since
F , = A A
(392)
F , = A A
(393)
one finds
12 A.
therefore
F ,
= A A
= 0
(394)
The term on the right-hand side of eqn(391) also vanishes, because it was chosen to impose charge conservation. Hence, one finds that A^\mu for a massive spin-one particle must satisfy the Lorenz gauge condition
    \partial_\mu A^\mu = 0    (395)
Exercise:
Starting from the expression eqn(379), determine the symmetrized energymomentum tensor for the massive vector field. Hence, find the energy and
momentum densities.
7.3
A (x) =
A (k) exp[ i k x ] + c.c.
(396)
2V k
which results in four components associated with four polarization vectors which
are denoted by e (k). For massive photons, if one assumes charge conservation,
the gauge fields must satisfy the Lorenz gauge condition. The Lorenz gauge condition
    \partial_\mu A^\mu(x) = 0    (397)
results in the Fourier components satisfying the condition
    k_\mu A^\mu(k) = 0    (398)
components (k^{(0)}, 0, 0, k^{(3)}) where k^{(0)} = \sqrt{ ( m c / \hbar )^2 + k^2 }. The Lorenz condition becomes
    k^{(0)} A^{(0)}(k) - k^{(3)} A^{(3)}(k) = 0    (399)
Hence, A^{(0)}(k) = ( k^{(3)} / k^{(0)} ) A^{(3)}(k). This implies that in the photon's rest frame, the four-vector potential only has three spatial components and the scalar potential is zero. In any case, there are only three independent components of the four-vector potential. The four-vector potential can be expressed in terms of the three independent components as
X
k (3)
1
A(x) =
A(3) (k) ( (0) e(0) (k) + e(3) (k) )
k
2V k
(1)
(1)
(2)
(2)
1
A
c t
= A
=
(401)
The longitudinal component of the electric field has a Fourier amplitude given by
    E^{(3)}(k) = i [ k^{(0)} A^{(3)}(k) - k^{(3)} A^{(0)}(k) ]
               = i [ k^{(0)} - \frac{ k^{(3)2} }{ k^{(0)} } ] A^{(3)}(k)
               = i ( \frac{m c}{\hbar} )^2 \frac{ A^{(3)}(k) }{ k^{(0)} }    (402)
and the longitudinal component of the magnetic field B^{(3)}(k) is zero since
    0 = i k \wedge e^{(3)}(k) A^{(3)}(k)    (403)
Thus, the component A^{(3)}(k) contributes to the E field but not the B field.
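The chain of equalities for the longitudinal electric field amplitude can be verified with a few lines of arithmetic. The sketch below works in units where the inverse Compton wavelength mc/\hbar is a plain number (the parameter inverse_compton is a hypothetical name), takes the wave vector along the third axis, and imposes the Lorenz condition of eqn (399).

```python
import math


def longitudinal_amplitudes(k3, inverse_compton, a3=1.0):
    """Return (A0, E3) for a wave vector (k0, 0, 0, k3) and a given A^(3)(k)."""
    k0 = math.sqrt(inverse_compton**2 + k3**2)
    a0 = (k3 / k0) * a3                # Lorenz condition, eqn (399)
    e3 = 1j * (k0 * a3 - k3 * a0)      # first line of eqn (402)
    return a0, e3


def closed_form_e3(k3, inverse_compton, a3=1.0):
    """Last line of eqn (402): i (mc/hbar)^2 A^(3)(k) / k^(0)."""
    k0 = math.sqrt(inverse_compton**2 + k3**2)
    return 1j * inverse_compton**2 * a3 / k0


if __name__ == "__main__":
    print(longitudinal_amplitudes(3.0, 4.0), closed_form_e3(3.0, 4.0))
    print(longitudinal_amplitudes(2.0, 0.0))  # massless case
```

In the massless case the two contributions to the longitudinal field cancel, so the amplitude A^{(3)}(k) drops out of the electric field altogether.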
For massive photons, the unsymmetrized energy-momentum tensor T^{\mu,\nu} is
expressed in terms of contributions from the E, B fields and also the gauge
1
1
( F , ) A = +
( F , ) A
4
4
(406)
This can be accomplished by adding zero in the form of A times the EulerLagrange equation
+ F
mc
h
2
4
j = 0
c
(407)
therefore, completing the divergence at the expense of adding extra mass and
source terms. The divergence of thirdrank antisymmetric tensor can be dropped,
leading to the symmetrized energymomentum tensor given by
1
, =
g , F, F , + 4 F , F
16
2
2
g , m c
1
mc
A A +
A A
8
h
4
h
j A
j A
+ g ,
(408)
c
c
It is seen that the coupling of the electromagnetic fields to the current densities
spoils the symmetry of the energymomentum tensor. Hence, the charge currents act as sources of angular momentum and results in the electromagnetic
fields angular momentum not being conserved. The energydensity H is given
by 0.0 so
H =
1
8
E2 + B2
+
1
8
mc
h
2
A(0)2 + A2
j.A
(409)
c
1
4c
E B
1
4c
mc
h
2
A(0) A
j (0) A
c2
(410)
Thus, the longitudinal photon with components A^{(3)}(k) and A^{(0)}(k) does have physical effects, as do the two transverse photons A^{(1)}(k) and A^{(2)}(k). Therefore, the four-vector potential of a massive electromagnetic field has three physical components.
Exercise:
Determine the contribution to the energy and momentum of the electromagnetic field which originates from the coupled time and space-like (longitudinal) components of the four-vector potential.
For a massless photon m = 0, so one sees that the longitudinal component of the electric field E^{(3)}(k) is zero. Also, since for m = 0 one has k^{(0)} = k^{(3)}, the Lorenz gauge condition requires that the component A^{(3)}(k) is equal to A^{(0)}(k) and, therefore, the longitudinal component is associated with a single polarization vector ( e^{(0)}(k) + e^{(3)}(k) ). Since A^{(0)}(k) = A^{(3)}(k), the norm of the vector (A^{(0)}(k), 0, 0, A^{(3)}(k)) is identically zero. For m = 0, the longitudinal component A^{(3)}(k) contributes neither to the E field nor to the B field. Since the energy and momentum densities are expressed in terms of the E and B fields via
    \mathcal{H} = \frac{1}{8\pi} ( E^2 + B^2 )    (411)
and
    P = \frac{1}{4 \pi c} E \wedge B    (412)
the component A^{(3)}(k) or equivalently A^{(0)}(k) has no physical effect. Hence, it can be deemed unphysical. Therefore, the four-vector potential of a massless electromagnetic field only has two physical components A^{(1)}(k) and A^{(2)}(k), which are the components transverse to the direction of propagation.
7.4
+
mc
h
2
A =
4
j
c
(413)
On Fourier transforming the linear equation with respect to space and time, one obtains
    [ - k^\mu k_\mu + ( \frac{m c}{\hbar} )^2 ] A^\nu(k) + k^\nu k_\mu A^\mu(k) = \frac{4\pi}{c} j^\nu(k)    (414)
k k +
+ k k
A (k) =
j (k)
h
c
(415)
(416)
(417)
h
c
The above matrix equation can be solved by first writing the propagator as
    D^{\mu,\nu}(k) = g^{\mu,\nu} A(k^2) + k^\mu k^\nu B(k^2)    (418)
and then solving for the unknown quantities B(k^2) and A(k^2). This leads to the following expression for the photon propagator
g , (km ck)2
h
4
D, (k) =
(419)
2
c
m c
k
4
A (k) =
j (k)
2
c
m c
k k
h
(420)
Hence, A^\mu(x) can be found from the inverse Fourier transform. It should be noted that, if one imposes current conservation
    k_\mu j^\mu(k) = 0    (421)
the second term in the numerator of the photon propagator has no physical effect. We also note that
    k_\mu A^\mu(k) = \frac{4\pi}{c} ( \frac{\hbar}{m c} )^2 k_\mu j^\mu(k)    (422)
So, once again we see that the assumption of charge conservation enforces the Lorenz gauge condition, even if the massive photon is involved in a virtual process.
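The algebra of the massive propagator can be checked numerically for an arbitrary off-shell wave vector. The sketch below assumes the Fourier-space field equation in the form [ -k.k + mu^2 ] A^nu + k^nu (k.A) = coupling * j^nu, with mu standing for mc/\hbar and coupling for 4 pi / c in units where c = 1; the solution used is the one with numerator g^{mu,nu} - k^mu k^nu / mu^2, and all names are illustrative.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g, signature (+,-,-,-) (assumed)


def dot(a, b):
    """Lorentz scalar product a^mu b_mu of two contravariant four-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def potential_from_current(k, j, mu, coupling=4 * math.pi):
    """A^mu(k) = coupling * [ j^mu - k^mu (k.j) / mu^2 ] / ( mu^2 - k.k )."""
    k_dot_j = dot(k, j)
    denominator = mu**2 - dot(k, k)
    return [coupling * (jc - kc * k_dot_j / mu**2) / denominator for kc, jc in zip(k, j)]


def residual(k, j, mu, coupling=4 * math.pi):
    """Left side minus right side of the Fourier-transformed field equation."""
    a = potential_from_current(k, j, mu, coupling)
    k_dot_a = dot(k, a)
    factor = mu**2 - dot(k, k)
    return [factor * ac + kc * k_dot_a - coupling * jc for ac, kc, jc in zip(a, k, j)]


if __name__ == "__main__":
    k = (1.3, 0.2, -0.4, 0.9)
    j = (0.5, 1.0, -2.0, 0.3)
    print(residual(k, j, 0.7))
    print(dot(k, potential_from_current(k, j, 0.7)), 4 * math.pi * dot(k, j) / 0.7**2)
```

The second printed pair illustrates the relation between k.A and k.j: when the current is conserved, k.j = 0, and the Lorenz condition k.A = 0 follows for any non-zero mass.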
8.1
mc
2 h 0
2
20
2
(423)
(425)
= 1 + i 2
= 1 i 2
(426)
then the Lagrangian can be written as a Lagrangian density involving the two
real scalar fields 1 and 2 . The Lagrangian density has a U (1) symmetry which
corresponds to the rotation of around a circle about the origin in the (1 , 2 )
plane.
We shall assume the field representing the physical ground state corresponds to only one of the infinite number of possible candidates. The physical
state must have a phase, which shall be defined as zero. That is, one starts with
a ground state = 0 , and then consider the small amplitude excitations. A
lowenergy excited state corresponds to the complex field
= 0 +
(427)
v()
Re []
0
Im []
L = ( 1 ) ( 1 ) + ( 2 ) ( 2 )
mc
2 h 0
2
2 0 1 +
21
22
2
(429)
If one only considers infinitesimally small amplitude oscillations, one only needs
consider terms quadratic in the fields. The quadratic Lagrangian density LF ree
describes noninteracting fields. The quadratic Lagrangian density is given by
2
mc
LF ree = ( 1 ) ( 1 )
21 + ( 2 ) ( 2 )
(430)
h
The symmetry breaking has resulted in the complex field breaking up into two
fields: The first field 1 describes massive excitations m and the second field 2
describes massless excitations. The first field 1 has planewave solutions if the
energy and momentum are related via the dispersion relation
2
m c2
2 = c2 k 2 +
(431)
h
and represents excitations which corresponds to a stretching of 0 . It is
massive since this excitation moves the field away from the minimum of the
potential. The second excitation 2 represents which is transverse to 0 in
(432)
which vanishes at k = 0. The Goldstone boson dynamically restores the spontaneously broken U (1) symmetry since, at k = 0, it just corresponds to a change
of the value of the (static and uniform) broken symmetry field from (0 , 0)
to the new direction (0 , 2 ). Therefore, if infinitely many zeroenergy Goldstone bosons are excited in the system, the resulting state should correspond
to a new ground state with a different value of the phase. As noted by Anderson15 prior to Goldstone's work, the Goldstone theorem breaks down when long-ranged interactions are present. Anderson was responsible for the concept of mass generation through symmetry breaking due to the coupling with gauge fields16. This concept was subsequently developed by Peter Higgs17 and Tom Kibble and coworkers18.
8.2
We shall now consider the coupling of a scalar field with charge q to a gauge field A^\mu. The Lagrangian density is related to the sum of the Lagrangian density for the electromagnetic field and the Lagrangian density for the charged scalar particle. The coupling between the fields is found from the minimal coupling assumption
    p^\mu \rightarrow p'^\mu = p^\mu - \frac{q}{c} A^\mu    (433)
which becomes
    i \hbar \partial^\mu \rightarrow i \hbar \partial^\mu - \frac{q}{c} A^\mu    (434)
Therefore, the Lagrangian density for the coupled fields has the form
q
q
1
( i
A ) ( + i
A )
F , F,
hc
c
h
16
2
2
mc
0
(435)
2 h 0
L =
14 J.
18 G.S. Guralnik, C.R. Hagen, and T.W.B. Kibble, Global Conservation Laws and Massless
Particles. Physical Review Letters 13, 585587 (1964): T.W.B. Kibble, Symmetry Breaking
in nonAbelian Gauge Theories, Physical Review, 155, 1554 (1967).
(437)
and the A vanish. Any local gauge transformation leads to a state with the
same energy, therefore, the ground state is infinitely degenerate.
We shall assume that a physical system spontaneously breaks the symmetry
in that it corresponds to a specific constant value of . We shall choose the
local gauge (x) such that the field representing the excited states is purely
real. However, once the gauge has been fixed, no further gauge transformations
can be made.
The small amplitude excitations can be expressed as
= 0 +
(438)
(439)
and on substituting in the Lagrangian and collecting the quadratic terms, one
obtains
2
mc
21
LF ree = ( 1 ) ( 1 )
h
2
1
q 0
F , F, +
A A
(440)
16
h c
Therefore, one finds that the charged boson field has a mass m and the gauge
field has acquired a mass mA given by
m2A
= 8
q 0
c2
2
(441)
of the gauge field. More specifically, the field 2 which initially corresponded
to the Goldstone mode became unphysical when the massless vector field was
introduced as it could be gauged away. This is seen by writing
2
= 0 + 1 + i 2
0 + 1
exp i
(442)
0
so 2 could be removed by a gauge transformation involving the particular
choice of
2
=
(443)
q 0
h
c
Gravitational Interactions
We introduce the field theory of the Gravitational Interaction, in the weak field limit, by analogy with electromagnetism. More specifically, we shall develop the classical field theory of the massive graviton in parallel with Proca's theory of a massive spin-one particle that we discussed previously. This approach has the disadvantage that it does not have the same beautiful geometric basis as Einstein's field equations.
Electromagnetic fields are mediated by spin-one bosons and gravitational fields are mediated by spin-two bosons. Electromagnetic forces between two particles with like charges are repulsive, but gravitational forces are always attractive. These facts are quantified by Coulomb's law
    V(r_1 - r_2) = \frac{ q_1 q_2 }{ | r_1 - r_2 | }    (444)
and Newton's force law
    V(r_1 - r_2) = - \frac{ G m_1 m_2 }{ | r_1 - r_2 | }    (445)
These two equations show the fields have the same type of classical Green's functions associated with massless bosons. Therefore, we expect that the difference is caused by the spin values of the bosons involved in the exchanges. We shall examine the difference of the signs of the electromagnetic and gravitational forces by examining the quantum propagators.
Consider (massive) electromagnetic radiation in the presence of a source.
The Lagrangian density formulated by Proca$^{20}$ is given by
$$ \mathcal{L} = - \frac{1}{16\pi} F^{\mu,\nu} F_{\mu,\nu} + \frac{1}{8\pi} \left( \frac{m c}{\hbar} \right)^2 A^{\mu} A_{\mu} - \frac{1}{c} A_{\mu} j^{\mu} \qquad (446) $$
20 A. Proca, C.R. Acad. Sci., Paris 203, 709-711 (1936); J. de Physique et Radium, 8, 23-28
(1937).
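where
$$ F^{\mu,\nu} = \partial^{\mu} A^{\nu} - \partial^{\nu} A^{\mu} \qquad (447) $$
The mass term spoils the gauge invariance of the source-free Lagrangian. The
action S is given by
$$ S = \int d^4x \; \mathcal{L} \qquad (448) $$
which, on integrating by parts, can be expressed as
$$ S = \int d^4x \left\{ \frac{1}{8\pi} A_{\mu} \left[ \left( \partial_{\rho} \partial^{\rho} + \left( \frac{m c}{\hbar} \right)^2 \right) g^{\mu,\nu} - \partial^{\mu} \partial^{\nu} \right] A_{\nu} - \frac{1}{c} j^{\mu} A_{\mu} \right\} \qquad (449) $$
Extremizing the action with respect to $A_{\mu}$ yields the equation of motion
$$ \left[ \left( \partial_{\rho} \partial^{\rho} + \left( \frac{m c}{\hbar} \right)^2 \right) g^{\mu,\nu} - \partial^{\mu} \partial^{\nu} \right] A_{\nu} = \frac{4\pi}{c} j^{\mu} \qquad (450) $$
If charge is conserved
$$ \partial_{\mu} j^{\mu} = 0 \qquad (451) $$
On operating on the equation of motion with $\partial_{\mu}$, one finds that because of the
finite mass the field $A^{\mu}$ must satisfy the condition
$$ \partial_{\mu} A^{\mu} = 0 \qquad (452) $$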
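Thus, conservation of charge removes the choice of gauge. In the rest frame
of a photon, $k^{\mu} = ( \frac{m c}{\hbar} ) (1, 0, 0, 0)$, so one has $A^{0}(k) = 0$ and so one finds that
the massive photon has three components. This is expected for a particle with
spin S = 1. The electromagnetic propagator can be defined as the kernel of the
integral relation
$$ A_{\mu}(x) = \frac{4\pi}{c} \int d^4x' \; D_{\mu,\nu}(x, x') \, j^{\nu}(x') \qquad (453) $$
Thus, the propagator satisfies the partial differential equation
$$ \left[ \left( \partial_{\rho} \partial^{\rho} + \left( \frac{m c}{\hbar} \right)^2 \right) g^{\mu,\nu} - \partial^{\mu} \partial^{\nu} \right] D_{\nu,\sigma}(x, x') = \delta^{\mu}_{\sigma} \, \delta^4(x - x') \qquad (454) $$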
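which can be solved by Fourier transforming. On performing the Fourier transformation, one finds the algebraic equation
$$ \left[ \left( - k^{\rho} k_{\rho} + \left( \frac{m c}{\hbar} \right)^2 \right) g^{\mu,\nu} + k^{\mu} k^{\nu} \right] D_{\nu,\sigma}(k) = \delta^{\mu}_{\sigma} \qquad (455) $$
which has the solution
$$ D_{\mu,\nu}(k) = \frac{ - g_{\mu,\nu} + \frac{k_{\mu} k_{\nu}}{( m c / \hbar )^2} }{ k^{\rho} k_{\rho} - ( m c / \hbar )^2 } \qquad (456) $$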
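The statement that the propagator inverts the momentum-space Proca operator can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes units in which $\hbar = c = 1$ (so that `mass` stands for $mc/\hbar$), the metric signature $(+,-,-,-)$, the operator $( - k.k + M^2 ) g^{\mu,\nu} + k^{\mu} k^{\nu}$ and the propagator $( - g_{\mu,\nu} + k_{\mu} k_{\nu} / M^2 ) / ( k.k - M^2 )$. The function names and the test momentum are illustrative choices.

```python
# Units with hbar = c = 1, so `mass` stands for m c / hbar (assumption of this sketch).
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g, signature (+, -, -, -)


def lower(k_up):
    """Lower the index of a contravariant four-vector k^mu."""
    return [METRIC[i] * k_up[i] for i in range(4)]


def k_squared(k_up):
    """Lorentz invariant k^rho k_rho."""
    return sum(METRIC[i] * k_up[i] ** 2 for i in range(4))


def proca_operator(k_up, mass):
    """O^{mu nu} = (-k.k + M^2) g^{mu nu} + k^mu k^nu."""
    scale = -k_squared(k_up) + mass ** 2
    return [[scale * (METRIC[m] if m == n else 0.0) + k_up[m] * k_up[n]
             for n in range(4)] for m in range(4)]


def proca_propagator(k_up, mass):
    """D_{nu rho} = (-g_{nu rho} + k_nu k_rho / M^2) / (k.k - M^2)."""
    k_dn = lower(k_up)
    denom = k_squared(k_up) - mass ** 2
    return [[(-(METRIC[n] if n == r else 0.0) + k_dn[n] * k_dn[r] / mass ** 2) / denom
             for r in range(4)] for n in range(4)]


def contract(op, prop):
    """(O D)^mu_rho = sum over nu of O^{mu nu} D_{nu rho}."""
    return [[sum(op[m][n] * prop[n][r] for n in range(4))
             for r in range(4)] for m in range(4)]


k_example = [1.7, 0.3, -0.4, 0.9]  # arbitrary off-shell four-momentum
product = contract(proca_operator(k_example, 0.5), proca_propagator(k_example, 0.5))
max_error = max(abs(product[m][r] - (1.0 if m == r else 0.0))
                for m in range(4) for r in range(4))
print("largest deviation from the unit matrix:", max_error)
```

The product reproduces the unit matrix to rounding accuracy for any momentum off the mass shell; on the mass shell the denominator vanishes and the check does not apply.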
1
c
dk00
(2)
d4 k
j (k 0 , k) A (k0 , k)
( 2 )4 0
(458)
On eliminating the four-vector potential, the interaction portion of the Lagrangian can be expressed as
Lint
1
=
c
dk00
(2)
d4 k
4
j (k00 , k)
4
(2)
c
g
, +
k k
k k
c 2
( mh
)
m c 2
( h )
j (k0 , k)
(459)
(460)
4
c2
d3 k
( 2 )3
(k) (k)
k 2 ( mh c )2
(462)
(463)
Thus, the interaction Lagrangian between two like charges is negative and this
corresponds to a positive (repulsive) interaction potential V.
(464)
g
, +
k k
k k
c 2
( mh
)
m c 2
( h )
(465)
The numerator originates from the product of polarization vectors $\epsilon^{(\lambda)}(k)$ for the
massive spin-one particle. The polarization vectors can be chosen as
$$ \epsilon^{(1)} = (0, 1, 0, 0) , \qquad \epsilon^{(2)} = (0, 0, 1, 0) , \qquad \epsilon^{(3)} = (0, 0, 0, 1) \qquad (466) $$
mc
), 0, 0, 0)
h
(467)
(469)
(k) (k)
(470)
(471)
(472)
mc
h
(473)
2
= 0
(474)
(475)
The constant of proportionality A can be fixed by considering the massive photon in its rest frame, and choosing one spatial index for $\mu$ and $\nu$ in the sum
over polarization components (e.g. $\mu = \nu = 1$). This fixes A = 1. Thus, the
numerator of the photon propagator is given by
$$ G_{\mu,\nu}(k) = - g_{\mu,\nu} + \frac{k_{\mu} k_{\nu}}{( m c / \hbar )^2} \qquad (476) $$
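The rest-frame argument that fixes the constant of proportionality can be sketched numerically. This is only an illustration under stated assumptions: units with $\hbar = c = 1$, signature $(+,-,-,-)$, the three rest-frame polarization vectors $(0,1,0,0)$, $(0,0,1,0)$, $(0,0,0,1)$, and a numerator of the form $- g_{\mu,\nu} + k_{\mu} k_{\nu} / M^2$. The names `propagator_numerator` and `polarization_sum` are hypothetical helpers.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # signature (+, -, -, -)


def lower(v_up):
    """Lower the index of a contravariant four-vector."""
    return [METRIC[i] * v_up[i] for i in range(4)]


def propagator_numerator(k_up, mass):
    """G_{mu nu} = -g_{mu nu} + k_mu k_nu / M^2, with M = m c / hbar."""
    k_dn = lower(k_up)
    return [[-(METRIC[m] if m == n else 0.0) + k_dn[m] * k_dn[n] / mass ** 2
             for n in range(4)] for m in range(4)]


def polarization_sum(polarizations_up):
    """Sum over polarizations of eps_mu eps_nu (covariant components)."""
    lowered = [lower(e) for e in polarizations_up]
    return [[sum(e[m] * e[n] for e in lowered) for n in range(4)] for m in range(4)]


MASS = 0.5
rest_momentum = [MASS, 0.0, 0.0, 0.0]
rest_polarizations = [[0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 1.0]]

numerator = propagator_numerator(rest_momentum, MASS)
pol_sum = polarization_sum(rest_polarizations)
max_difference = max(abs(numerator[m][n] - pol_sum[m][n])
                     for m in range(4) for n in range(4))
print("largest difference:", max_difference)
```

In the rest frame the time-time component of the numerator vanishes and the spatial diagonal components equal one, which is the content of the choice A = 1.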
We have shown that the photon propagator is given by
$$ D_{\mu,\nu}(k) = \frac{G_{\mu,\nu}(k)}{k^{\rho} k_{\rho} - ( m c / \hbar )^2} \qquad (477) $$

9.1 General Relativity
8G
T,
c3
(478)
(480)
The Ricci tensor is obtained from the Riemann tensor via the contraction
R, = R,,
(481)
and the Riemann tensor is given in terms of the Christoffel symbols via
R,,
= , , + , , , ,
(482)
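where, in turn, the Christoffel symbols $\Gamma^{\rho}_{\mu,\nu}$ are given in terms of the metric
tensors by
$$ \Gamma^{\rho}_{\mu,\nu} = \frac{1}{2} g^{\rho,\sigma} \left( \partial_{\mu} g_{\sigma,\nu} + \partial_{\nu} g_{\sigma,\mu} - \partial_{\sigma} g_{\mu,\nu} \right) \qquad (483) $$
In the absence of a source, the Lagrangian density of the source-free gravitational
field is given by
$$ \mathcal{L} = \sqrt{- g} \; R \qquad (484) $$
where $g = \det g_{\mu,\nu}$ is negative.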
9.2 Linearized Gravity
(486)
Since the metric tensor is symmetric, it has 10 independent components. However, under a general coordinate transformation
x x0 = x
(487)
x x
x0 x0
(488)
(489)
Thus, both metrics describe the same physical gravitational field and so gravitational theory has a gauge invariance. The gravitational gauge transformation
is very similar to the electromagnetic gauge transformation
$$ A_{\mu} \rightarrow A'_{\mu} = A_{\mu} + \partial_{\mu} \Lambda \qquad (490) $$
For a massive photon with charge conservation, the electromagnetic field must
satisfy the gauge condition
$$ \partial_{\mu} A^{\mu} = 0 \qquad (491) $$
and, therefore, has three independent components as expected for a spin-one
particle. Likewise, one expects that the massive graviton's field should satisfy
four different Lorentz gauge conditions
$$ \partial^{\mu} h_{\mu,\nu} = 0 \qquad (492) $$
,
h, =
T,
,
T,
(496)
c3
2
The second term in the source projects out unwanted components of the energy-momentum tensor. For the vacuum, where $T_{\mu,\nu} = 0$, the linearized equations
take the form
$$ \partial_{\rho} \partial^{\rho} \, h_{\mu,\nu} = 0 \qquad (497) $$
which is just the relativistic wave equation for a massless spin-two particle. For
a graviton of mass m, the wave equation could be expected to have the form
9.3
mc
h
2
h, = 0
(498)
h, = 0
(499)
mc
h
2
Fierz and W. Pauli, Proc. Roy. Soc. London, 187, 211-232 (1937).
density is given by
L =
1
1
h, h,
, h, , h,
2
2
h, h, + h, , h,
2
1 mc
h, h, , h, , h,
2
h
(500)
up to a constant factor. The form of the mass term is not enforced by any known
symmetry. In fact, like the Proca Lagrangian for the massive photon, the mass
term in the Fierz-Pauli Lagrangian density violates the gauge symmetry. The
principle of extremal action leads to the equations of motion
0
= h, + , , h,
+ h, + h,
, h, , h,
2
mc
h, , , h,
h
(501)
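On operating on the equation with $\partial^{\mu}$ and on noting that only the mass term
remains, one finds the equation
$$ \partial^{\mu} h_{\mu,\nu} - \partial_{\nu} \, \eta^{\rho,\sigma} h_{\rho,\sigma} = 0 \qquad (502) $$
This is the Lorentz gauge condition for linearized (massive) general relativity.
Here this is not a choice since the condition is enforced by the existence of the
mass. Substituting this mandatory condition back into the equations of motion
yields
$$ \partial_{\rho} \partial^{\rho} \, h_{\mu,\nu} - \partial_{\mu} \partial_{\nu} \, \eta^{\rho,\sigma} h_{\rho,\sigma} + \left( \frac{m c}{\hbar} \right)^2 \left( h_{\mu,\nu} - \eta_{\mu,\nu} \, \eta^{\rho,\sigma} h_{\rho,\sigma} \right) = 0 \qquad (503) $$
On taking the trace of the above equation, if $m \neq 0$, one recovers the condition
that $h_{\mu,\nu}$ is traceless
$$ \eta^{\mu,\nu} h_{\mu,\nu} = 0 \qquad (504) $$
as anticipated. Furthermore, this also implies the four gauge conditions
$$ \partial^{\mu} h_{\mu,\nu} = 0 \qquad (505) $$
must also be satisfied. On applying the above five constraints to the equation
of motion, one finds
$$ \left[ \partial_{\rho} \partial^{\rho} + \left( \frac{m c}{\hbar} \right)^2 \right] h_{\mu,\nu} = 0 \qquad (506) $$
Thus, the Fierz-Pauli Lagrangian for $h_{\mu,\nu}$ does indeed describe a massive particle
with S = 2, and we have confirmed the form of the five conditions on the field.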
In the presence of a source, the equation could be expected to have the form
2
16 G
mc
+
h, =
T,
(507)
h
c3
where T, is an appropriate expression linear in the energymomentum tensor
which radiates a tensor field with five independent components. The unwanted
components of T, must be projected out, similar to what we have done for
the linearized theory of general relativity. The propagator takes care of this
projection. The equation of motion is a linear equation with a source, so it can
be solved by introducing a propagator which satisfies the equation
2
1
mc
0
+
D,;, (x, x ) =
+ 4 (xx0 ) (508)
h
2
so that the solution of the inhomogeneous equation can be written as
Z
16 G
d4 x0 D,;, (x, x0 ) T, (x0 )
h, (x) =
c3
(509)
, (k) , (k) = I
(511)
The numerator of the graviton propagator must involve the combination which
transforms as a tensor under a Lorentz transformation. The propagator's form
must be such that the symmetric tensor $h_{\mu,\nu}$ satisfies the four gauge conditions and is traceless. Using the five conditions repeatedly, one can show that
the numerator is proportional to
X
2
(512)
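where
$$ G_{\mu,\nu}(k) = - \eta_{\mu,\nu} + \frac{k_{\mu} k_{\nu}}{( m c / \hbar )^2} \qquad (513) $$
The sign of the propagator is fixed by considering a massive graviton in its rest
frame and setting $\mu = \rho = 1$ and $\nu = \sigma = 2$. This leads to
$$ D_{\mu,\nu;\rho,\sigma}(k) = \frac{1}{2} \; \frac{ G_{\mu,\rho}(k) \, G_{\nu,\sigma}(k) + G_{\mu,\sigma}(k) \, G_{\nu,\rho}(k) - \frac{2}{3} \, G_{\mu,\nu}(k) \, G_{\rho,\sigma}(k) }{ k^{\lambda} k_{\lambda} - ( m c / \hbar )^2 } \qquad (514) $$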
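The role of the factor 2/3 in the numerator can be illustrated numerically. The sketch below is not from the original notes and rests on stated assumptions: units with $\hbar = c = 1$, a flat metric of signature $(+,-,-,-)$, a spin-one factor of the form $G_{\mu,\nu} = - \eta_{\mu,\nu} + k_{\mu} k_{\nu} / M^2$, and a momentum on the mass shell. Under these assumptions the combination $G G + G G - \frac{2}{3} G G$ is traceless and transverse, which is what the five conditions require.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # flat metric, signature (+, -, -, -)


def spin_one_factor(k_up, mass):
    """G_{mu nu} = -eta_{mu nu} + k_mu k_nu / M^2 (assumed form)."""
    k_dn = [METRIC[i] * k_up[i] for i in range(4)]
    return [[-(METRIC[m] if m == n else 0.0) + k_dn[m] * k_dn[n] / mass ** 2
             for n in range(4)] for m in range(4)]


def spin_two_numerator(g):
    """N_{mu nu rho sigma} = G_mr G_ns + G_ms G_nr - (2/3) G_mn G_rs."""
    idx = range(4)
    return [[[[g[m][r] * g[n][s] + g[m][s] * g[n][r] - (2.0 / 3.0) * g[m][n] * g[r][s]
               for s in idx] for r in idx] for n in idx] for m in idx]


MASS, MOMENTUM = 0.5, 1.2
energy = math.sqrt(MOMENTUM ** 2 + MASS ** 2)   # on-shell condition
k_on_shell = [energy, 0.0, 0.0, MOMENTUM]       # k along the z-direction
numerator = spin_two_numerator(spin_one_factor(k_on_shell, MASS))

# Trace over the first index pair: eta^{mu nu} N_{mu nu rho sigma}
max_trace = max(abs(sum(METRIC[m] * numerator[m][m][r][s] for m in range(4)))
                for r in range(4) for s in range(4))
# Transversality: k^mu N_{mu nu rho sigma}
max_divergence = max(abs(sum(k_on_shell[m] * numerator[m][n][r][s] for m in range(4)))
                     for n in range(4) for r in range(4) for s in range(4))

# Rest-frame component with index pairs (1,2) and (1,2)
rest = spin_two_numerator(spin_one_factor([MASS, 0.0, 0.0, 0.0], MASS))
rest_component = rest[1][2][1][2]
print(max_trace, max_divergence, rest_component)
```

With any coefficient other than 2/3 the trace would not vanish, since the on-shell trace of the spin-one factor is −3 in four dimensions.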
(515)
$$ k_{\mu} \, T^{\mu,\nu}(k) = 0 \qquad (516) $$
The above condition eliminates the k-dependent terms in the propagator
when coupled to the energy-momentum tensor, so one can replace the $G_{\mu,\nu}(k)$
by $- \eta_{\mu,\nu}$. It should be noted that the energy density is positive
$$ T^{0,0}(k) > 0 \qquad (517) $$
9.4
h, h
h,
h,
h
c h, T ,
In the presence of a source, the FierzPauli equation of motion becomes
16 G
T , = h, , , h,
c3
(519)
h, h,
+ , h, + , h,
2
mc
,
,
,
+
h
, h
h
(520)
On operating on the equation with and on noting that only the mass term
remains, one finds the equation
16 h2 G
h, , h, =
T ,
(521)
m2 c5
We see that in the absence of a graviton mass, the Lagrangian theory of the
graviton requires that the energy-momentum tensor must be conserved. For the
massive graviton, the conservation of energy and momentum is not automatically ensured, but is an independent assumption. We shall assume that the
energy-momentum tensor remains a conserved quantity.
On assuming conservation of energy and momentum, one can substitute the
condition
$$ \partial^{\mu} h_{\mu,\nu} - \partial_{\nu} \, \eta^{\rho,\sigma} h_{\rho,\sigma} = 0 \qquad (522) $$
into the equation of motion, yielding
2
mc
16 G
,
,
h, , h,
=
T,
h, h, +
h
c3
(523)
Taking the trace, we find that
2
mc
16 G
, T,
(524)
( d 1 ) , h,
=
h
c3
so that h, is not traceless in the presence of a source. On substituting the
trace in the condition, one finds that the four gauge conditions are modified to
become
1
16 h2 G
, T,
(525)
h, =
d 1
m2 c5
Thus, the automatic conservation of energy and momentum is linked to gauge
invariance. When gauge invariance is lost, energy and momentum conservation
has to be inferred from other considerations. Substituting the expression for
the trace into the equations of motion leads to
2
mc
16 G
1
,
T
T
h, +
h, =
+
,
,
,
h
c3
d 1
( mh c )2
(526)
On Fourier transforming, one obtains
16 G
c3
1
k k
,
h, (k) =
T, (k)
, m c 2 T, (k)
d 1
( h )
k k ( mh c )2
(527)
or
h, (k) =
16 G
c3
k k ( mh c )2
, ,
d 1
,
k k
mc 2
( h )
,
T , (k)
(528)
(529)
9.5
h, h, , h, , h,
h
(532)
22 A.I.
(533)
The vanishing of the trace and the mandatory gauge conditions obtained from
the equation of motion can be used to reduce the Lagrangian density to an
effective Lagrangian density
2
c4
mc
Lef f = +
h, h,
h, h,
32 G
h
(534)
Hence, the energy of the field can be expressed as
2
Z
c3
mc
P0 =
d3 r 0 h, 0 h, + h, . h, +
h, h,
32 G
h
(535)
or, on Fourier transforming,
2
Z
mc
c3 2
2
3
2
h, (k) h, (k)
(536)
d k k0 + k +
P0 =
h
4G
so the energy is positive.
The fields are restricted by the four constraints
k h, (k) = 0
(537)
, h, (k) = 0
(538)
(539)
X ki kj
hi,j (k)
k0 k0
i,j
(540)
h0,i (k) =
and
h0,0 (k) =
The trace condition is
(541)
(542)
2
h
(k)
hi,j (k)2
i,j
i,j
0
0
0
k
k
k
i,j
j
i
i,j
(543)
For k directed along the zdirection, one has
h, (k) h
(k)
2
2
k2
k
2
2
2
1
h3,3 (k) 2
1
h3,1 (k) + h3,2 (k)
k02
k02
1
h
(k)
1
h
(k)
+
h
(k)
3,3
3,1
3,2
2 k02
k02
1
+ 2 h1,2 (k)2 +
 h1,1 (k) h2,2 (k) 2
(544)
2
Thus, the massive graviton field has five independent components.
If one assumes that
$$ k_0^2 = k^2 + \left( \frac{m c}{\hbar} \right)^2 \qquad (545) $$
then one sees that the two off-diagonal modes $h_{3,1}(k)$ and $h_{3,2}(k)$ have an energy
proportional to
2
mc
h3,1 (k)2 + h3,2 (k)2
(546)
4
h
(547)
transform as
h3,1 (k)0 i h3,2 (k)0 = exp[ i ]
h3,1 (k) i h3,2 (k)
(548)
under a rotation through an angle $\varphi$ around the z-axis. Thus, these combinations have angular
momenta of $\pm 1$ around the z-axis. The physical effects of the modes $h_{3,1}(k)$
i h1,2 (k)
(550)
)
h, (k) =
, ,
, m c 2 , T , (k)
k k (
h
c3
d 1
( h )
(551)
and noting that, for k along the z-direction, the term in the source which diverges when $m \rightarrow 0$ only couples to $h_{3,3}(k)$. Thus, we have
$$ h_{3,3}(k) \propto \frac{1}{( m c / \hbar )^2} \qquad (552) $$
and, therefore, this mode yields a finite contribution to the energy. It is this
longitudinal mode which gives rise to the discontinuity at m = 0.
10
Following the work of Dirac23 , the energy, momentum and angular momentum
of the electromagnetic field shall be reduced into contributions from a set of
normal modes. A particular normal mode will correspond to a particular wave
vector and a particular polarization of the field. The normal modes can be
described in terms of a set of harmonic oscillators and, when quantized, the
23 P. A. M. Dirac, Proc. Roy. Soc. A 114, 243 (1927).
In this paper Dirac uses two different approaches to quantizing electromagnetism. In one
approach he treated a single photon as satisfying a single-particle Schrödinger equation that
has a similar form to Maxwell's equations. The other approach treated the fields as dynamical
variables and then quantized them. Dirac then showed that these two methods produce
equivalent results. By doing this, Dirac created second quantization.
(554)
(555)
where V is the volume of the system. The inverse Fourier Transform is given
by
$$ A(r, t) = \frac{1}{\sqrt{V}} \sum_{k} \exp \left[ i \, k . r \right] A(k, t) \qquad (556) $$
On Fourier transforming the wave equation with respect to space, one
finds the equation of motion
$$ \left[ k^2 + \frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right] A(k, t) = 0 \qquad (557) $$
and the Coulomb gauge condition becomes
$$ k . A(k, t) = 0 \qquad (558) $$
We shall look for solutions for A(k, t) that have a time dependence given by
linear superpositions of the terms proportional to
$$ \exp \left[ \mp i \, \omega_k t \right] \qquad (559) $$
By substituting the above terms into the wave equation, it is found that linear
superpositions of plane-waves are solutions of Maxwell's equation but only if
the frequency $\omega_k$ and wave vector k are related via the dispersion relation
$$ \omega_k^2 = c^2 k^2 \qquad (560) $$
The gauge condition also requires that the vector potential is oriented perpendicular to the direction of propagation. Therefore, an arbitrary plane-wave
solution can be represented as a linear superposition of two polarized waves
with polarizations described by two mutually orthogonal unit vectors denoted
by $\epsilon_{\alpha}(k)$. The polarization vectors satisfy
$$ k . \epsilon_{\alpha}(k) = 0 , \qquad \epsilon_{\alpha}(k) . \epsilon_{\beta}(k) = \delta_{\alpha,\beta} \qquad (561) $$
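As a concrete illustration of the transversality and orthonormality conditions, the sketch below constructs two unit vectors perpendicular to a given wave vector and checks the conditions numerically. The construction via cross products, the helper-axis rule and the function names are assumptions made for this example; any pair obtained from it by a rotation about k would serve equally well.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(a):
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]


def polarization_vectors(k):
    """Return (e1, e2, k_hat) with e1, e2 transverse to k and e1 x e2 = k_hat."""
    k_hat = normalize(k)
    # Pick a helper axis that is not (nearly) parallel to k.
    helper = [1.0, 0.0, 0.0] if abs(k_hat[0]) < 0.9 else [0.0, 1.0, 0.0]
    e1 = normalize(cross(helper, k_hat))
    e2 = cross(k_hat, e1)  # already of unit length since k_hat and e1 are orthonormal
    return e1, e2, k_hat


wave_vector = [1.0, 2.0, 2.0]
e1, e2, k_hat = polarization_vectors(wave_vector)
handedness = cross(e1, e2)  # should reproduce k_hat
print(dot(wave_vector, e1), dot(wave_vector, e2), dot(e1, e2))
```

The two vectors together with the unit vector along k form a right-handed orthonormal triad.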
Figure 9: The normal modes of the classical electromagnetic field are plane-polarized waves, in which E and B are transverse to the direction of propagation
k, and oscillate in phase.
We shall assume that three vectors k, 1 (k), 2 (k) form a mutually orthogonal
coordinate system. We shall define
1 (k)
2 (k)
= 1 (k)
= 2 (k)
(562)
The algebraic equations for A(k) can be solved trivially. One can express the
vector potential as a linear superposition
1 X
A(r, t) =
(k) exp i k . r (k, t)
(563)
V k,
However, since the vector potential is real
A(r, t) = A (r, t)
(564)
(k, t) = (k, t)
(565)
10.1
1 A
c t
= A
=
A
8 c2
t
(567)
(568)
1 X X
k+k0
8
k,k0 ,
1
(k)
(k 0 )
(k) . (k 0 ) 2
c
t
t
+ ( k (k) ) . ( k 0 (k 0 ) ) (k) (k 0 )
(k)
1 X
1 (k)
2
(k)
(k)
t
t
8
c2
k,
(571)
In the above expression, the summation over k is unrestricted. If the Lagrangian
is to be expressed in terms of the independent components, then the summation
over k must be restricted to half the set of allowed values. With this restriction,
one obtains
(k)
2 X0
1
(k)
2
(k)
(k)
L =
t
t
8
c2
k,
(572)
Figure 10: A possible partition of k-space, which does not contain both k and
its inverse $-k$.
where the prime over the summation denotes the restriction of k to values
in the positive half volume of kspace. Since there are half the number of
independent normal modes, their contributions are twice as big. The Lagrangian
is a function of the six generalized variables (k) and (k) for the independent
k values. The generalized momenta variables are found as
2
(k)
(k) =
8 c2
t
2
(k)
(573)
(k) =
t
8 c2
The Lagrangian equations of motion of the field are given by
1
(k)
k2
=
(k)
2
t 8 c
t
8
or
2 (k)
t2
= k2 (k)
(574)
(575)
where $\omega_k = c \, k$. Thus, each classical field component has a time-dependent amplitude which resembles that of a harmonic oscillator with frequency $\omega_k = c \, k$.
The Hamiltonian can be obtained from the Lagrangian via the Legendre transformation
X
(k)
(k)
0
L
(576)
+ (k)
H =
(k)
t
t
k,
(577)
where the summation over (k, ) runs over the independent normal modes.
Hence, the k summation only runs over the set of points in k space which are
not related via the inversion operator. The Hamiltonian is related to the energy
of the electromagnetic field, as shall be seen below.
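The energy density $\mathcal{H}$ for the electromagnetic field can be expressed as
$$ \mathcal{H} = \frac{1}{8\pi} \left[ E^2 + B^2 \right] \qquad (578) $$
in the Coulomb gauge. The energy density can be written in terms of the vector
potential as
$$ \mathcal{H} = \frac{1}{8\pi} \left[ \frac{1}{c^2} \left( \frac{\partial A}{\partial t} \right)^2 + \left( \nabla \times A \right)^2 \right] \qquad (579) $$
The energy is the integral of the energy density over all space
$$ H = \int d^3r \; \mathcal{H} \qquad (580) $$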
H =
(k) (k) +
k (k) (k)
(581)
4
8
k,
in which the summation over k is unrestricted. Thus, the above expression for
the energy is identical to the Hamiltonian for the electromagnetic field. Furthermore, the Hamiltonian has been expressed in terms of a set of the normal
modes labeled by (k, ).
10.2
The quantized Hamiltonian is obtained from the classical Hamiltonian by replacing the field components and their canonically conjugate momenta
(k) , (k)
(582)
(k)
(k) ,
(583)
by the operators
and their complex conjugates are replaced by the Hermitean conjugate operators. The canonically conjugate coordinates and momenta operators satisfy the
commutation relations
(k 0 ) ] = i h , k,k0
(k) ,
[
(k) ,
(k 0 ) ] = 0
[
(k) ,
(k 0 ) ] = 0
[
(584)
4
8
(585)
k,
(586)
2 h k
8 h k
2
and the Hermitean conjugate operators
s
s
2 k2
8 c2
1
(k)
(k) +
i
a
k, =
8h
k
2 h k
2
(587)
known as creation operators. The commutation relations for the creation and
annihilation operators can be obtained directly from the commutation relations
of the field operators and their conjugate momenta, which are shown in eqn(584). It can
be shown that the creation and annihilation operators satisfy the commutation
relations
$$ [ \hat{a}_{k,\alpha} , \hat{a}^{\dagger}_{k',\beta} ] = \delta_{\alpha,\beta} \, \delta_{k,k'} $$
$$ [ \hat{a}^{\dagger}_{k,\alpha} , \hat{a}^{\dagger}_{k',\beta} ] = 0 $$
$$ [ \hat{a}_{k,\alpha} , \hat{a}_{k',\beta} ] = 0 \qquad (588) $$
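For a single normal mode, the commutation relation between the annihilation and creation operators can be illustrated with matrices in a truncated number basis. This is a sketch under an explicit assumption: the Fock space is cut off at `DIM` states, so the relation $[ \hat{a} , \hat{a}^{\dagger} ] = 1$ holds only on the states below the cutoff; the last diagonal entry is an artifact of the truncation. The helper names are illustrative.

```python
import math


def annihilation_matrix(dim):
    """Matrix elements of a in the number basis: a|n> = sqrt(n) |n-1>."""
    return [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(dim)]
            for m in range(dim)]


def transpose(mat):
    return [list(row) for row in zip(*mat)]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


DIM = 6
a = annihilation_matrix(DIM)
a_dagger = transpose(a)                 # the matrix is real, so dagger = transpose
comm = commutator(a, a_dagger)
number_operator = matmul(a_dagger, a)   # diagonal with entries 0, 1, ..., DIM-1

for n in range(DIM):
    print(n, round(comm[n][n], 12), round(number_operator[n][n], 12))
```

The diagonal of the commutator equals one for every state except the highest retained one, where the truncation gives −(DIM − 1) instead.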
The field operators can be expressed in terms of the creation and annihilation
operators. Starting with
s
s
1
8 c2
2 k2
a
k, =
i
(k) +
(k)
(589)
2 h k
8 h k
2
(k) =
(k) and
(k) =
transforming k k and then by noting that
(590)
(k)
(k) +
i
a
k, =
8 h k
2 h k
2
(k) by adding the expression for the creation operator
One can eliminate
given by eqn(587) and the expression for the annihilation operator with momentum k given by eqn(590). This process yields the expression for the field
(k) in the form
component operators
r
2 h k
+
a
(591)
(k) =
a
k,
k,
k2
(k) =
a
+
a
(592)
k,
k,
k2
(k). Likewise, the canonically conjugate momenta
which is identical to
operators are given by
r
h k
(k) = i
a
k, a
k,
(593)
8 c2
and their Hermitean conjugates are
r
h k
(k) = i
a
k,
k,
8 c2
(594)
as was anticipated.
10.2.1
4
8
(595)
k,
(596)
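If one sets $k \rightarrow - k$ in the second set of terms, then one finds the Hamiltonian
becomes the sum over independent harmonic oscillators for each k value and
polarization
$$ \hat{H} = \sum_{k,\alpha} \frac{\hbar \omega_k}{2} \left( \hat{a}^{\dagger}_{k,\alpha} \hat{a}_{k,\alpha} + \hat{a}_{k,\alpha} \hat{a}^{\dagger}_{k,\alpha} \right) \qquad (597) $$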
(598)
and has integer eigenvalues denoted by $n_{k,\alpha}$. Hence, the energy eigenvalues E
are given by
$$ E = \sum_{k,\alpha} \hbar \omega_k \left( n_{k,\alpha} + \frac{1}{2} \right) \qquad (599) $$
10.2.2
A(r) =
(k)
a
k, + a
k,
exp i k . r
k V
(600)
k,
(602)
A(r,
h
h
or
s
t)
A(r,
X
k,
(k)
2 h c2
k V
a
k,
exp
i k t
+ a
k, exp
i k t
exp
ik.r
(603)
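The above equation was obtained by noting that, in the basis composed of
eigenstates of the number operators $| n_{k,\alpha} \rangle$, one has
$$ \hat{a}_{k,\alpha}(t) \, | n_{k,\alpha} \rangle = \exp \left[ + i \omega_k t \, ( \hat{a}^{\dagger}_{k,\alpha} \hat{a}_{k,\alpha} + 1/2 ) \right] \hat{a}_{k,\alpha}(0) \, | n_{k,\alpha} \rangle \, \exp \left[ - i \omega_k t \, ( n_{k,\alpha} + 1/2 ) \right] $$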
24 H. B. G. Casimir, Proc. Neth. Aka. Wetenschapen, 51, 793 (1948), M. J. Sparnaay,
Physica 24, 761 (1959)
and that the time-dependent creation operator is given by the Hermitean conjugate expression. Thus, the explicit form of time dependence of the vector
potential is a consequence of the explicit time dependence of the creation and
annihilation operators in the Heisenberg representation. Alternatively, one can
find the time dependence of the creation and annihilation operators directly
from the Heisenberg equations of motion without invoking a privileged set of
basis states. The equation of motion for the creation operator is given by
i h
ak,
t
]
= [a
k, , H
(605)
(606)
ak,
t
= h k a
k,
(607)
a
k,
exp
i k t
(608)
ak,
]
= [a
k, , H
t
(609)
and as
k, , k,k0
k0 , a
k0 , ] = + a
[a
k, , a
(610)
ak,
= +h
k a
k,
t
(611)
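The time dependence of the annihilation operator in the Heisenberg representation can be checked for a single mode in the number basis. The sketch below assumes a Hamiltonian $\hbar \omega ( \hat{n} + 1/2 )$ for that mode, a truncation of the number basis at `dim` states, and illustrative values for `OMEGA` and `TIME`. Because the Hamiltonian is diagonal in this basis, the result $\hat{a}(t) = \hat{a}(0) \exp[ - i \omega t ]$ holds exactly even in the truncated space.

```python
import cmath
import math


def annihilation_matrix(dim):
    """a|n> = sqrt(n) |n-1> in the number basis."""
    return [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(dim)]
            for m in range(dim)]


def heisenberg_annihilation(dim, omega, t):
    """a(t) = U^dagger a U with U = exp(-i H t / hbar), H = hbar omega (n + 1/2)."""
    phases = [cmath.exp(-1j * omega * t * (n + 0.5)) for n in range(dim)]
    a0 = annihilation_matrix(dim)
    return [[phases[m].conjugate() * a0[m][n] * phases[n] for n in range(dim)]
            for m in range(dim)]


DIM, OMEGA, TIME = 5, 2.0, 0.37
a_t = heisenberg_annihilation(DIM, OMEGA, TIME)
a_0 = annihilation_matrix(DIM)
expected_phase = cmath.exp(-1j * OMEGA * TIME)
max_deviation = max(abs(a_t[m][n] - a_0[m][n] * expected_phase)
                    for m in range(DIM) for n in range(DIM))
print("largest deviation from a(0) exp(-i omega t):", max_deviation)
```

The creation operator follows by Hermitean conjugation and therefore carries the phase factor $\exp[ + i \omega t ]$.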
10.2.3
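The total momentum operator for the electromagnetic field is given by the
integral over all space of the Poynting vector
$$ \hat{P} = \frac{1}{4 \pi c} \int d^3r \; \hat{E} \times \hat{B} \qquad (613) $$
This will be evaluated by expressing the $\hat{E}$ and $\hat{B}$ field operators in terms of the
vector potential operator $\hat{A}$ via
$$ \hat{E} = - \frac{1}{c} \frac{\partial \hat{A}}{\partial t} , \qquad \hat{B} = \nabla \times \hat{A} \qquad (614) $$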
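The vector potential operator can be written in terms of the creation and annihilation operators for the normal modes as
$$ \hat{A}(r, t) = \sum_{k,\alpha} \sqrt{ \frac{2 \pi \hbar c^2}{\omega_k V} } \; \epsilon_{\alpha}(k) \left( \hat{a}_{k,\alpha}(t) + \hat{a}^{\dagger}_{-k,\alpha}(t) \right) \exp \left[ i \, k . r \right] \qquad (615) $$
Hence, one finds
$$ \hat{E}(r) = i \sum_{k,\alpha} \sqrt{ \frac{2 \pi \hbar \omega_k}{V} } \; \epsilon_{\alpha}(k) \left( \hat{a}_{k,\alpha} - \hat{a}^{\dagger}_{-k,\alpha} \right) \exp \left[ i \, k . r \right] \qquad (616) $$
and
$$ \hat{B}(r) = i \sum_{k,\alpha} \sqrt{ \frac{2 \pi \hbar c^2}{\omega_k V} } \; ( k \times \epsilon_{\alpha}(k) ) \left( \hat{a}_{k,\alpha} + \hat{a}^{\dagger}_{-k,\alpha} \right) \exp \left[ i \, k . r \right] \qquad (617) $$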
For a fixed k, the polarization vectors (k) and k are mutually orthogonal.
Therefore, one has
(k) ( k (k) )
a
k, + a
k,
k, a
k,
P =
(k) ( k (k) ) a
2
k,
h X
(619)
a
k, + a
k,
=
k a
k, a
k,
2
k,
It should be noted that the momentum from each normal mode of the field is
parallel to its direction of propagation. Since the creation operators commute
a
k, a
k, = a
k, a
k,
(620)
(621)
one finds that the part of the momentum represented by the summation over k
given by
X
h
k a
k, a
k, a
k, a
k,
= 0
(622)
k,
vanishes since the summand is odd under inversion symmetry. Thus, the momentum of the electromagnetic field is given by
h X
P =
k a
k, a
k, a
k, a
k,
2
k,
X
1
=
hka
k, a
k, h k a
k, a
k, h k
(623)
2
k,
where the commutation relations for the creation and annihilation operators
were used to obtain the last line. The last term vanishes when summed over k,
due to inversion symmetry. Hence, the momentum of the field is given by the
operator
$$ \hat{P} = \frac{1}{2} \sum_{k,\alpha} \left( \hbar k \, \hat{a}^{\dagger}_{k,\alpha} \hat{a}_{k,\alpha} - \hbar k \, \hat{a}^{\dagger}_{-k,\alpha} \hat{a}_{-k,\alpha} \right) \qquad (624) $$
Finally, on transforming k to $-k$ in the last term of the summand, one finds the
total momentum of the field is carried by the excitations since
$$ \hat{P} = \sum_{k,\alpha} \hbar k \, \hat{a}^{\dagger}_{k,\alpha} \hat{a}_{k,\alpha} \qquad (625) $$
10.2.4
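The total angular momentum operator of the electromagnetic field $\hat{J}_{EM}$ is given
by
$$ \hat{J}_{EM} = \frac{1}{4 \pi c} \int d^3r \; r \times ( \hat{E} \times \hat{B} ) \qquad (627) $$
The ith component is given by
$$ \begin{aligned} \hat{J}^{(i)}_{EM} &= \frac{1}{4 \pi c} \int d^3r \; \epsilon^{i,j,k} \, x^{(j)} \, ( \hat{E} \times \hat{B} )^{(k)} \\ &= \frac{1}{4 \pi c} \int d^3r \; \epsilon^{i,j,k} \, x^{(j)} \, \epsilon^{k,l,m} \, \hat{E}^{(l)} \hat{B}^{(m)} \\ &= \frac{1}{4 \pi c} \int d^3r \; \epsilon^{i,j,k} \, x^{(j)} \, \epsilon^{k,l,m} \, \hat{E}^{(l)} \, \epsilon^{m,n,p} \, \frac{\partial \hat{A}^{(p)}}{\partial x^{(n)}} \end{aligned} \qquad (628) $$
However, due to the identity
$$ \epsilon^{k,l,m} \, \epsilon^{m,n,p} = \delta^{k,n} \, \delta^{l,p} - \delta^{k,p} \, \delta^{l,n} \qquad (629) $$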
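The contraction identity for two antisymmetric Levi-Civita symbols, $\epsilon^{k,l,m} \epsilon^{m,n,p} = \delta^{k,n} \delta^{l,p} - \delta^{k,p} \delta^{l,n}$, can be confirmed by brute force over all index values. The sketch below uses indices 0, 1, 2 in place of 1, 2, 3; the closed-form expression for the symbol and the function names are choices made for this illustration.

```python
from itertools import product


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def contraction_identity_holds():
    """Check sum_m eps[k,l,m] eps[m,n,p] == d[k,n] d[l,p] - d[k,p] d[l,n]."""
    for k, l, n, p in product(range(3), repeat=4):
        lhs = sum(levi_civita(k, l, m) * levi_civita(m, n, p) for m in range(3))
        rhs = delta(k, n) * delta(l, p) - delta(k, p) * delta(l, n)
        if lhs != rhs:
            return False
    return True


print("identity holds for all 81 index combinations:", contraction_identity_holds())
```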
one finds
(i)
JEM
1
4c
d3 r i,j,k
(k)
(l)
(j) (l) A
(l) A
x(j) E
x
E
(630)
x(k)
x(l)
1
(i)
(j) (l)
(l) A
(k)
JEM =
d3 r i,j,k x(j) E
+
x
E
A
4c
x(k)
x(l)
(631)
Since the divergence of the electric field vanishes$^{26}$,
$$ \frac{\partial \hat{E}^{(l)}}{\partial x^{(l)}} = 0 \qquad (634) $$
26 In the presence of a charge density q (r)2 , the angular momentum of the EM field will
contain a term given by
Z
q
which is gaugedependent. This term combines with a corresponding term in the expression
for the orbital angular momentum of the charged particles to yield the gaugeinvariant term
d3 r (r) r ( p
q
A(r) ) (r)
c
(633)
Hence, although the total angular momentum of a system of charged particles in an electromagnetic field is welldefined and gaugeinvariant, there could be some confusion as to how
the angular momentum is distributed between the particles and the field.
100
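and since
$$ \frac{\partial x^{(j)}}{\partial x^{(l)}} = \delta^{j,l} \qquad (635) $$
the total angular momentum can be rewritten as
$$ \hat{J}^{(i)}_{EM} = \frac{1}{4 \pi c} \int d^3r \; \epsilon^{i,j,k} \left[ \hat{E}^{(l)} \, x^{(j)} \frac{\partial \hat{A}^{(l)}}{\partial x^{(k)}} + \hat{E}^{(j)} \hat{A}^{(k)} \right] \qquad (636) $$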
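The first term can be recognized as the orbital angular momentum of the field.
The orbital angular momentum operator $\hat{L}^{(i)}$ is given by
$$ \hat{L}^{(i)} = - i \hbar \, \epsilon^{i,j,k} \, x^{(j)} \frac{\partial}{\partial x^{(k)}} \qquad (637) $$
so the total angular momentum of the field is given by
$$ \begin{aligned} \hat{J}^{(i)}_{EM} &= \frac{i}{4 \pi \hbar c} \int d^3r \left[ \hat{E}^{(l)} \hat{L}^{(i)} \hat{A}^{(l)} - i \hbar \, \epsilon^{i,j,k} \hat{E}^{(j)} \hat{A}^{(k)} \right] \\ &= \frac{i}{4 \pi \hbar c} \int d^3r \left[ \hat{E}^{(l)} \hat{L}^{(i)} \hat{A}^{(l)} + \hat{E}^{(j)} ( \hat{S}^{(i)} )_{j,k} \hat{A}^{(k)} \right] \end{aligned} \qquad (638) $$
where the expression
$$ ( \hat{S}^{(i)} )_{j,k} = - i \hbar \, \epsilon^{i,j,k} \qquad (639) $$
for S, the intrinsic spin operator for the photon, has been used in obtaining the
second line. The total vector angular momentum operator can be expressed as
$$ \hat{J}_{EM} = \frac{i}{4 \pi \hbar c} \int d^3r \; \hat{E}^{(j)} \left[ \hat{L} \, \delta^{j,k} + ( \hat{S} )_{j,k} \right] \hat{A}^{(k)} \qquad (640) $$
which shows that the orbital angular momentum is diagonal with respect to
the field components and the spin angular momentum mixes the different field
components.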
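That the matrices mixing the field components behave as spin-one operators can be verified directly. The following sketch assumes the representation $( S^{(i)} )_{j,k} = - i \hbar \, \epsilon^{i,j,k}$ with $\hbar = 1$ and indices 0, 1, 2 standing for the Cartesian directions; it checks the angular momentum commutation relation $[ S^{(1)} , S^{(2)} ] = i S^{(3)}$ and that $S^2 = S ( S + 1 ) = 2$ times the unit matrix. The helper names are illustrative.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def spin_matrix(i):
    """(S^(i))_{jk} = -i eps_{ijk}, in units with hbar = 1 (assumed representation)."""
    return [[-1j * levi_civita(i, j, k) for k in range(3)] for j in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


spin = [spin_matrix(i) for i in range(3)]
comm_12 = commutator(spin[0], spin[1])

# Total spin squared: sum over the three components of S^(i) S^(i)
squares = [matmul(s, s) for s in spin]
spin_squared = [[sum(sq[j][k] for sq in squares) for k in range(3)] for j in range(3)]

commutation_ok = all(comm_12[j][k] == 1j * spin[2][j][k]
                     for j in range(3) for k in range(3))
print(commutation_ok, spin_squared)
```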
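The total spin component of the angular momentum operator for the electromagnetic field is given by
$$ \begin{aligned} \hat{S}^{(i)}_{EM} &= \frac{i}{4 \pi \hbar c} \int d^3r \; \hat{E}^{(j)} ( \hat{S}^{(i)} )_{j,k} \hat{A}^{(k)} \\ &= \frac{1}{4 \pi c} \int d^3r \; \epsilon^{i,j,k} \, \hat{E}^{(j)} \hat{A}^{(k)} \\ &= \frac{1}{4 \pi c} \int d^3r \; ( \hat{E} \times \hat{A} )^{(i)} \end{aligned} \qquad (641) $$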
This can be expressed in terms of the photon creation and annihilation operators
as
h X
(i)
(j)
SEM = i
(k)
(k) i,j,k (k)
2
k,,
(642)
a
k, + a
k,
k,
a
k, a
The first term in parenthesis is recognized as the ith component of the vector
product
(k) (k)
(643)
and, therefore, it is antisymmetric in the two polarization indices, and the
non-zero contributions are restricted to the case in which the two polarizations differ. Since the creation
and annihilation operators corresponding to different polarizations commute,
the product of the two remaining parentheses can be rearranged as the sum of
two terms
(i)
h X
(i)
(k) (k)
SEM = i
2
k,,
a
k, a
k, a
k, a
k,
+ a
k, a
k, a
k, a
k,
(644)
On transforming the summation variable $k \rightarrow - k$ and commuting the operators, one finds that the first term is symmetric whereas the second term is
antisymmetric under the interchange of the two polarization indices. Hence, on summing over
the polarization indices, the contribution from the first term vanishes, as it is
the product of a symmetric and antisymmetric term. Therefore, the total spin
operator of the electromagnetic field is expressed as
(i)
h X
(i)
SEM = i
(k) (k)
a
k, a
k, a
k, a
k,
2
k,,
(645)
On defining the sense of the polarization vectors relative to k ( e3 (k) the unit
vector in the direction of propagation) via
1 (k) 2 (k)
= k
(646)
so that k corresponds to the zdirection, one finds that
h X
k,1 a
k,2
S EM = i
k a
k,2 a
k,1 a
2
k
k,1 a
k,2
k a
k,2 a
k,1 a
(647)
On setting $k \rightarrow - k$ in the second part of the summation, the spin of the electromagnetic field is found as
$$ \hat{S}_{EM} = i \hbar \sum_{k} \hat{k} \left( \hat{a}^{\dagger}_{k,2} \hat{a}_{k,1} - \hat{a}^{\dagger}_{k,1} \hat{a}_{k,2} \right) \qquad (648) $$
It should be noted that in this expression, the indices (1, 2) refer to directions
in three-dimensional space and do not refer to the z-component of the spin angular momentum. Therefore, the above equation shows that a plane-polarized
photon is not an eigenstate of the single-particle spin operator quantized along
the k-axis$^{27}$.
In our Cartesian component basis, the eigenstates of the component of the
spin operator parallel to the direction of propagation $\hat{S}^{(3)}$, where
$$ \hat{S}^{(3)} = \hbar \begin{pmatrix} 0 & - i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad (649) $$
are given by $e_{m}(k)$, where
m
e (k)
+1
e (k)
e (k)
1
1
i
=
2
0
0
= 0
1
1
1
i
=
2
0
(650)
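The eigenvalue problem for the component of the spin operator along the direction of propagation can be checked explicitly. The sketch below assumes the matrix with rows $(0, -i, 0)$, $(i, 0, 0)$, $(0, 0, 0)$ in units of $\hbar$, and uses the column vectors $(1, i, 0)/\sqrt{2}$, $(0, 0, 1)$ and $(1, -i, 0)/\sqrt{2}$ as candidate eigenvectors. Eigenvectors are only defined up to a phase factor, so the overall signs used here are a choice of this example and need not coincide with the phase convention of the text.

```python
import math

S3 = [[0, -1j, 0],
      [1j, 0, 0],
      [0, 0, 0]]  # spin component along the propagation direction, in units of hbar

ROOT = 1.0 / math.sqrt(2.0)
HELICITY_VECTORS = {
    +1: [ROOT, 1j * ROOT, 0.0],
    0: [0.0, 0.0, 1.0],
    -1: [ROOT, -1j * ROOT, 0.0],
}


def apply(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def inner(u, v):
    """Hermitean inner product, conjugating the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


# Residual of the eigenvalue equation S3 e_m = m e_m for each helicity m
residuals = {m: max(abs(x - m * y) for x, y in zip(apply(S3, e), e))
             for m, e in HELICITY_VECTORS.items()}
# Overlap matrix, which should be the unit matrix
overlaps = {(m, n): inner(HELICITY_VECTORS[m], HELICITY_VECTORS[n])
            for m in HELICITY_VECTORS for n in HELICITY_VECTORS}
print(residuals)
```

The complex conjugation in the inner product is essential: without it the two circularly-polarized vectors would not come out orthogonal.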
(1)
1 i 0
(k)
+1 (k)
0 (k) = 1 0
(651)
0
2 (2) (k)
2
(k)
1
i 0
(3) (k)
1
This relation between the two bases can be expressed in the alternate form
(k) =
m=1
X
em m (k)
(652)
m=1
1
= ( 1 (k) + i e2 (k) )
2
27 Strictly speaking, this quantum number corresponds to the helicity as it is the spin eigenvalue which is quantized along the direction of propagation.
e0
e1
= 3 (k)
1
= ( 1 (k) i 2 (k) )
2
(653)
The circularly-polarized unit vectors are associated with photons which have
definite helicity eigenvalues. It should be noted that these complex unit vectors
are orthogonal, and satisfy
$$ e^{*}_{m'} . e_{m} = \delta_{m,m'} \qquad (654) $$
The above relations allow one to define the circularlypolarized creation and
annihilation operators via their relation to the quantum fields. This procedure
yields
i=3
m=1
X
X
i (k) a
k,i =
em (k) a
k,m
(655)
m=1
i=1
1
k,1 i a
= (a
k,2 )
2
= a
k,3
1
= (a
k,1 + i a
k,2 )
2
(656)
a
k,1
a
k,2
a
k,3
(657)
When expressed in terms of the circularly-polarized unit vectors, the spin operator for the electromagnetic field becomes
$$ \hat{S}_{EM} = \hbar \sum_{k} \hat{k} \left( \hat{a}^{\dagger}_{k,m=+1} \hat{a}_{k,m=+1} - \hat{a}^{\dagger}_{k,m=-1} \hat{a}_{k,m=-1} \right) \qquad (658) $$
which is expressed in terms of photons with definite helicity. Within the manifold of single-photon states with momentum $\hbar k$, the spin operator has eigenvalues of $\pm \hbar$ when measured along the direction $\hat{k}$. It is seen that the photon
has helicity $m = \pm 1$ but does not involve the helicity state with m = 0 since
the electromagnetic field is transverse. The transverse nature of the field is due
to the photon being massless. In general, a massive particle with spin S should
have (2S + 1) helicity states. However, a massless particle can only have the
two helicity states corresponding to $m = \pm S$.

Figure 11: The circularly-polarized normal modes of a classical electromagnetic field are composed of two plane-polarized waves which are out of phase,
are mutually orthogonal, and are transverse to the direction of propagation k.
The resulting electric field spirals along the direction of propagation. The left
circularly-polarized wave shown in the diagram corresponds to a helicity of $+ \hbar$.
The angular momentum of the elementary excitation of the electromagnetic
field was inferred from experiments in which beams of circularly-polarized light
were absorbed by a sensitive torsional pendulum$^{28}$. Quantum electromagnetic
theory shows that the angular momentum density of left circularly-polarized
light is just $\hbar$ times the photon density or, equivalently, is just $1/\omega$ times the
energy density, which is also the case for classical electromagnetism$^{29}$. Hence, the
net increase of angular momentum per unit time can easily be calculated from
the excess of the angular momentum flux flowing into the pendulum over that
flowing out. Beth's experiments verified that the net torque on the pendulum
was consistent with the theoretical prediction. Thus, the quantized electromagnetic field has been shown to be related to a massless particle with spin $\hbar$ and
energy-momentum given by the four-vector $( \hbar \omega_k / c , \hbar k )$. This particle is the
photon. Every quantized field is to be associated with a type of particle.

10.3 Uncertainty Relations
= 1 A
E
c t
(659)
(660)
(661)
since
the fluctuation in the field is given by
.E
 {nk0 , } >
< {nk0 , }  E
=
=
 {nk0 , } > 2
 < {nk0 , }  E
.E
 {nk0 , } >
< {nk0 , }  E
4 X
1
hk ( nk, +
)
V
2
k,
(662)
The fluctuations in the field diverge because the zeropoint energy fluctuations
diverge.
The commutation relations between the xcomponent of the E field and the
B field at the same instant of time are nonzero30 . That is,
X
x (r) , B
y (r0 ) ] = 2
h k (k)x ( k (k) )y exp i k . ( r0 r )
[E
V
k,
2 X
0
= i
c
h 3 0
(r r)
2 2 z
The fact that the two polarizations are transverse to the unit vector $\hat{k}$ has been
used to obtain the third line. Since $\hat{E}$ and $\hat{B}$ do not commute, it follows that E
and B obey an uncertainty relation in that the values of E and B cannot both
be specified to arbitrary accuracy at the same point.
30 P.
(663)
However, if two points in space-time x and x' are not causally related, i.e.
$$ | r' - r | \neq c \, | t' - t | \qquad (664) $$
(665)
Thus, if the two points in spacetime are not connected by the propagation of
light, then the Ex and By fields can both be determined to arbitrary accuracy.
10.4 Coherent States
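We shall focus our attention on one normal mode of the electromagnetic field,
and shall drop the indices $(k, \alpha)$ labelling the normal mode. A coherent state
$| a_{\varphi} \rangle$ is defined as an eigenstate of the annihilation operator
$$ \hat{a} \, | a_{\varphi} \rangle = a_{\varphi} \, | a_{\varphi} \rangle \qquad (666) $$
For example, the vacuum state or ground state is an eigenstate of the annihilation operator, in which case $a_{\varphi} = 0$.
The coherent state$^{31}$ can be found as a linear superposition of eigenstates of
the number operator with eigenvalues n
$$ | a_{\varphi} \rangle = \sum_{n=0}^{\infty} C_n \, | n \rangle \qquad (667) $$
On substituting this form into the eigenvalue equation, one finds
$$ \hat{a} \sum_{n} C_n \, | n \rangle = a_{\varphi} \sum_{n} C_n \, | n \rangle \qquad (668) $$
and, since the annihilation operator lowers the occupation number by one,
$$ \sum_{n} C_n \sqrt{n} \, | n - 1 \rangle = a_{\varphi} \sum_{n} C_n \, | n \rangle \qquad (669) $$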
On taking the matrix elements of this equation with the state $\langle m |$, and using
the orthonormality of the eigenstates of the number operator, one finds
$$ C_{m+1} \sqrt{m + 1} = a_{\varphi} \, C_m \qquad (670) $$
Hence, on iterating downwards, one finds
$$ C_m = \frac{a_{\varphi}^{m}}{\sqrt{m!}} \, C_0 \qquad (671) $$
31 R.
n
X
a
> = C0
n >
n!
n=0
(672)
(673)
1 = C0 C0 exp a a
(674)
(675)
From this, it can be shown that if the number of photons in a coherent state
is measured, the result n will occur with a probability given by
$$ P(n) = \frac{( a^{*}_{\varphi} a_{\varphi} )^{n}}{n!} \exp \left[ - a^{*}_{\varphi} a_{\varphi} \right] \qquad (676) $$
Thus, the photon statistics are governed by a Poisson distribution. Furthermore,
the quantity $a^{*}_{\varphi} a_{\varphi}$ is the average number of photons $\overline{n}$ present in the coherent
state.

Figure 12: The probability of finding n photons P(n) in a normal mode represented by a coherent state.
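The Poisson form of the photon statistics can be made concrete with a short numerical sketch. The probability of finding n photons is taken to be the mean photon number raised to the power n, divided by n!, times the exponential of minus the mean. The mean photon number 4.0 and the cutoff of 80 terms are arbitrary assumptions chosen for illustration; the sketch confirms that the distribution is normalized and that its mean and variance both equal the mean photon number.

```python
import math


def photon_number_probability(n, mean_photons):
    """Poisson probability of finding n photons when the mean number is mean_photons."""
    return mean_photons ** n * math.exp(-mean_photons) / math.factorial(n)


MEAN_PHOTONS = 4.0   # plays the role of the squared magnitude of the eigenvalue
N_MAX = 80           # cutoff; the neglected tail is negligible for this mean

probabilities = [photon_number_probability(n, MEAN_PHOTONS) for n in range(N_MAX)]
total = sum(probabilities)
mean = sum(n * p for n, p in enumerate(probabilities))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(probabilities))

print("normalization:", total)
print("mean:", mean, "variance:", variance)
```

The equality of the variance and the mean implies that the rms fluctuation in the photon number is the square root of the average photon number.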
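The coherent states can be written in a more compact form. Since the state
with occupation number n can be written as
$$ | n \rangle = \frac{( \hat{a}^{\dagger} )^{n}}{\sqrt{n!}} \, | 0 \rangle \qquad (677) $$
the coherent state can be expressed as
$$ | a_{\varphi} \rangle = \exp \left[ - \frac{1}{2} a^{*}_{\varphi} a_{\varphi} \right] \sum_{n=0}^{\infty} \frac{( a_{\varphi} \, \hat{a}^{\dagger} )^{n}}{n!} \, | 0 \rangle \qquad (678) $$
or on summing the series as an exponential
$$ | a_{\varphi} \rangle = \exp \left[ - \frac{1}{2} a^{*}_{\varphi} a_{\varphi} \right] \exp \left[ a_{\varphi} \, \hat{a}^{\dagger} \right] | 0 \rangle \qquad (679) $$
Thus the coherent state is an infinite linear superposition of states with different
occupation numbers; each coefficient in the linear superposition has a specific
phase relation with every other coefficient.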
The above equation represents a transformation between number operator states and the coherent states. The inverse transformation can be found by expressing a as a magnitude $|a|$ and a phase $\varphi$
\[ a = | a | \exp[ \, i \varphi \, ] \tag{680} \]
The number states can be expressed in terms of the coherent states via the inverse transformation
\[ | n \rangle = \frac{\sqrt{n!}}{| a |^n} \exp[ + \tfrac{1}{2} | a |^2 ] \int_0^{2\pi} \frac{d\varphi}{2\pi} \exp[ - i n \varphi ] \, | a \rangle \tag{681} \]
by integrating over the phase $\varphi$ of the coherent state. Since the set of occupation number states is complete, the set of coherent states must also span Hilbert space. In fact, the set of coherent states is overcomplete.
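The inverse transformation of eqn(681) can be illustrated numerically. This is a sketch under stated assumptions: the Fock space is truncated at `n_max`, and the phase integral is replaced by a uniform sum over `n_phase` points, which reproduces the integral of the truncated trigonometric polynomial when `n_phase` exceeds `n_max`. All names are hypothetical.

```python
import cmath
import math

def coherent_coefficients(a, n_max):
    """Fock-space coefficients exp(-|a|^2/2) a^n / sqrt(n!), n = 0..n_max."""
    norm = math.exp(-0.5 * abs(a) ** 2)
    coeffs, term = [], complex(1.0)
    for n in range(n_max + 1):
        coeffs.append(norm * term)
        term = term * a / math.sqrt(n + 1)
    return coeffs

def number_state_from_coherent(n, magnitude, n_max=40, n_phase=64):
    """Apply eqn (681), with the phase integral replaced by a uniform sum."""
    prefactor = (math.sqrt(math.factorial(n)) / magnitude ** n
                 * math.exp(0.5 * magnitude ** 2))
    result = [0j] * (n_max + 1)
    for j in range(n_phase):
        phi = 2.0 * math.pi * j / n_phase
        weight = cmath.exp(-1j * n * phi) / n_phase  # d(phi)/(2 pi) exp(-i n phi)
        state = coherent_coefficients(magnitude * cmath.exp(1j * phi), n_max)
        for m in range(n_max + 1):
            result[m] += prefactor * weight * state[m]
    return result

reconstructed = number_state_from_coherent(n=3, magnitude=1.5)
print(abs(reconstructed[3]))  # weight of the n = 3 number state
```

Only the component with occupation number n should survive the phase average, with unit amplitude.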
The coherent state $| a \rangle$ can be represented by the point a in the Argand plane. The overlap matrix element between two coherent states is calculated as
\[ | \langle a' | a \rangle |^2 = \exp[ - | a - a' |^2 ] \tag{682} \]
Hence, coherent states corresponding to different points are not orthogonal. The coherent states form an overcomplete basis set. The overcompleteness relation can be expressed as
\[ \int \frac{d\Re e \, a \;\; d\Im m \, a}{\pi} \; | a \rangle \langle a | = \hat{I} \tag{683} \]
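The overlap formula of eqn(682) can be checked with a short numerical sketch. It assumes the truncated number-state expansion of the coherent state; the function names, the truncation and the two sample points in the Argand plane are illustrative.

```python
import math

def coherent_coefficients(a, n_max=120):
    """Fock-space coefficients exp(-|a|^2/2) a^n / sqrt(n!), n = 0..n_max."""
    norm = math.exp(-0.5 * abs(a) ** 2)
    coeffs, term = [], complex(1.0)
    for n in range(n_max + 1):
        coeffs.append(norm * term)
        term = term * a / math.sqrt(n + 1)
    return coeffs

def overlap(a_bra, a_ket, n_max=120):
    """Inner product <a_bra | a_ket> evaluated in the number-state basis."""
    bra = coherent_coefficients(a_bra, n_max)
    ket = coherent_coefficients(a_ket, n_max)
    return sum(x.conjugate() * y for x, y in zip(bra, ket))

a1, a2 = 1.0 + 0.5j, -0.3 + 2.0j
numerical = abs(overlap(a1, a2)) ** 2
analytic = math.exp(-abs(a1 - a2) ** 2)
print(numerical, analytic)
```

The two numbers should agree to rounding error, and both fall off rapidly as the separation between the two points grows.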
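[Figure 13: the Argand plane, with axes Re z and Im z.]

The completeness relation can be verified by taking its matrix elements between the number states $\langle n' |$ and $| n \rangle$, and writing $a = | a | \exp[ i \varphi ]$
\[
\begin{aligned}
\int \frac{d\Re e \, a \;\; d\Im m \, a}{\pi} \, \langle n' | a \rangle \langle a | n \rangle
&= \int_0^{\infty} d|a| \, |a| \int_0^{2\pi} \frac{d\varphi}{\pi} \, \frac{a^{n'} \, a^{* n}}{\sqrt{n'! \, n!}} \exp[ - | a |^2 ] \qquad (684) \\
&= \int_0^{\infty} d|a| \, |a| \, \frac{| a |^{n + n'}}{\sqrt{n'! \, n!}} \exp[ - | a |^2 ] \int_0^{2\pi} \frac{d\varphi}{\pi} \exp[ \, i ( n' - n ) \varphi \, ] \\
&= \int_0^{\infty} d|a| \, |a| \, \frac{| a |^{n + n'}}{\sqrt{n'! \, n!}} \exp[ - | a |^2 ] \; 2 \, \delta_{n,n'} \qquad (685)
\end{aligned}
\]
On changing variable to $s = | a |^2$, one proves the completeness relation by noting that
\[ \int_0^{\infty} ds \; s^n \exp[ - s ] = n! \tag{686} \]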
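The effect of the creation operator on a coherent state can be evaluated as
\[
\begin{aligned}
\hat{a}^{\dagger} \, | a \rangle &= \hat{a}^{\dagger} \exp[ - \tfrac{1}{2} a^* a ] \exp[ \, a \, \hat{a}^{\dagger} \, ] \, | 0 \rangle \\
&= \exp[ - \tfrac{1}{2} a^* a ] \; \hat{a}^{\dagger} \exp[ \, a \, \hat{a}^{\dagger} \, ] \, | 0 \rangle \\
&= \exp[ - \tfrac{1}{2} a^* a ] \; \frac{\partial}{\partial a} \exp[ \, a \, \hat{a}^{\dagger} \, ] \, | 0 \rangle \\
&= \exp[ - \tfrac{1}{2} a^* a ] \; \frac{\partial}{\partial a} \exp[ + \tfrac{1}{2} a^* a ] \, | a \rangle
\end{aligned}
\tag{687}
\]
The coherent state is not an eigenstate of the creation operator, since the resulting state does not include the zero-photon state.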
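The expectation value of the field operators between the coherent states yields the classical value, since
\[ \langle a | \, ( \hat{a}^{\dagger} + \hat{a} ) \, | a \rangle = ( a^* + a ) \tag{688} \]
where the eigenvalue equation
\[ \hat{a} \, | a \rangle = a \, | a \rangle \tag{689} \]
has been used in the term involving the annihilation operator and the term originating from the creation operator is evaluated using the Hermitean conjugate equation
\[ \langle a | \, \hat{a}^{\dagger} = \langle a | \, a^* \tag{690} \]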
One also finds that the expectation value of the number operator is given by
\[ \langle a | \, \hat{a}^{\dagger} \hat{a} \, | a \rangle = a^* a \tag{691} \]
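so the magnitude of a is related to the average number of photons $\bar{n}$ in the coherent state. This identification is consistent with the Poisson distribution of eqn(676) which governs the probability of finding n photons in the coherent state. The coherent state is not an eigenstate of the number operator since there are fluctuations in any measurement of the number of photons. The rms fluctuation $\Delta n$ can be evaluated by noting that
\[
\begin{aligned}
\langle a | \, \hat{n}^2 \, | a \rangle &= \langle a | \, \hat{a}^{\dagger} \hat{a} \, \hat{a}^{\dagger} \hat{a} \, | a \rangle \\
&= \langle a | \, \hat{a}^{\dagger} \hat{a}^{\dagger} \hat{a} \, \hat{a} \, | a \rangle + \langle a | \, \hat{a}^{\dagger} \hat{a} \, | a \rangle \\
&= ( a^* )^2 ( a )^2 + a^* a
\end{aligned}
\tag{692}
\]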
where the boson commutation relations have been used in the second line. Thus, the mean squared fluctuation in the number operator is given by
\[ \langle a | \, \Delta \hat{n}^2 \, | a \rangle = a^* a \tag{693} \]
The rms fluctuation of the photon number is only negligible when compared to the average value if a has a large magnitude
\[ a^* a \gg 1 \tag{694} \]
Expectation values in coherent states behave almost completely classically. The deviation from the classical expectation values can be seen by examining
\[ \langle a | \, \hat{a} \, \hat{a}^{\dagger} \, | a \rangle = a^* a + 1 \tag{695} \]
which is evaluated by using the commutation relations. It is seen that the expectation values can be approximated by the classical values, if the magnitude of a is much greater than unity.
Exercise:
Determine the expectation values for the electric and magnetic field operators in a coherent state which represents a plane-polarized electromagnetic wave.
Exercise:
Determine the expectation values for the electric and magnetic field operators in a coherent state which represents a left circularly-polarized electromagnetic wave composed of photons with a helicity of +1.
10.4.1
From the discussion of coherent states, it is seen that the coherent state has a definite phase, but does not have a definite number of quanta. In general, it is impossible to know both the phase of a state and the number of quanta in the state. This is formalized as a phase-number uncertainty relation.
The phase and amplitude of a state are related to the annihilation operator. Since the annihilation operator is non-Hermitean, one can construct the annihilation operator as a function of Hermitean operators. Formally, the amplitude can be related to the square root of the number operator, and the phase to a phase operator$^{32}$. Hence, one can write
\[ \hat{a}_{k,\alpha} = \exp[ + i \, ( \hat{\varphi}_{k,\alpha} - \omega_k t ) ] \; \sqrt{\hat{n}_{k,\alpha}} \tag{696} \]
and the Hermitean conjugate operator, the creation operator, can be expressed as
\[ \hat{a}^{\dagger}_{k,\alpha} = \sqrt{\hat{n}_{k,\alpha}} \; \exp[ - i \, ( \hat{\varphi}_{k,\alpha} - \omega_k t ) ] \tag{697} \]
since it has been required that $\hat{n}$ and $\hat{\varphi}$ are Hermitean. Furthermore, the
(699)
k,k0 ,
exp
(700)
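Thus, for $k = k'$ and $\alpha = \beta$, one has
\[ \exp[ + i \hat{\varphi}_{k,\alpha} ] \; \hat{n}_{k,\alpha} - \hat{n}_{k,\alpha} \; \exp[ + i \hat{\varphi}_{k,\alpha} ] = \exp[ + i \hat{\varphi}_{k,\alpha} ] \tag{701} \]
This relationship is satisfied, if the phase and number operators satisfy the commutation relation
\[ [ \, \hat{n}_{k,\alpha} \, , \, \hat{\varphi}_{k,\alpha} \, ] = i \tag{702} \]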
If one can construct the Hermitean operators that satisfy this commutation relation, then one can show that the rms uncertainties in the phase and number must satisfy the inequality
\[ ( \Delta \varphi_{k,\alpha} )_{rms} \; ( \Delta n_{k,\alpha} )_{rms} \geq 1 \tag{703} \]
It should be noted that only the relative phase can be measured$^{33}$. Thus, if the phase difference of any two components $(k, \alpha)$ and $(k', \alpha')$ is specified precisely, then the occupation number of either component cannot be specified.
Exercise:
Express the vector potential and the electric and magnetic field operators in
terms of the amplitude and phase operators.
10.4.2
The coherent state $| a \rangle$ can be represented by the point a in the Argand plane. The overlap matrix element between two coherent states is calculated as
\[ | \langle a' | a \rangle |^2 = \exp[ - | a - a' |^2 ] \tag{704} \]
33 L.
Hence, coherent states are not orthogonal. In fact, their overlap decreases exponentially with large separations between the points a and a' in the Argand plane. We shall denote $| a |$ by a. Two states separated by distances $\Delta a$ or $a \, \Delta \varphi$ such that $\Delta a \gg 1$ and $a \, \Delta \varphi \gg 1$ are effectively orthogonal or independent. However, states within an area given by $a \, \Delta a \, \Delta \varphi \sim 1$ have significant overlap and so can represent the same state. Therefore, the minimum uncertainty state occupies an area $a \, \Delta a \, \Delta \varphi \sim 1$. We note that $2 \, a \, \Delta a$ can be interpreted as a measure of the uncertainty $\Delta n$ in the particle number for the state, and $\Delta \varphi$ is the uncertainty in the phase of the state. Hence, the phase-number uncertainty relation sets the area of the Argand diagram that can be associated with a single state as
\[ a \; \Delta a \; \Delta \varphi \sim 1 \tag{705} \]
Figure 14: Due to the phase-number uncertainty principle, the minimum area of the Argand diagram needed to represent a minimum uncertainty state has dimensions such that $a \, \Delta a \, \Delta \varphi \sim 1$.
11
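The non-relativistic Hamiltonian for a particle with charge q and mass m interacting with a quantized electromagnetic field can be expressed as
\[ \hat{H} = \frac{\hat{\mathbf{p}}^2}{2 m} - \frac{q}{2 m c} \left( \hat{\mathbf{p}} \cdot \hat{\mathbf{A}}(\mathbf{r}) + \hat{\mathbf{A}}(\mathbf{r}) \cdot \hat{\mathbf{p}} \right) + \frac{q^2}{2 m c^2} \hat{\mathbf{A}}^2(\mathbf{r}) + q \, \phi(\mathbf{r}) + \int d^3 r' \; \frac{\hat{\mathbf{E}}^2(\mathbf{r}') + \hat{\mathbf{B}}^2(\mathbf{r}')}{8 \pi} \tag{706} \]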
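when the vector potential is chosen to satisfy the Coulomb gauge. The second, third and fourth terms are to be evaluated at the location of the charged point particle, r, and the last term is evaluated at all points in space. The Hamiltonian can be expressed as
\[ \hat{H} = \hat{H}_0 + \hat{H}_{rad} + \hat{H}_{int} \tag{707} \]
where the interaction between the charged particle and the radiation field is described by
\[ \hat{H}_{int} = - \frac{q}{2 m c} \left( \hat{\mathbf{p}} \cdot \hat{\mathbf{A}}(\mathbf{r}) + \hat{\mathbf{A}}(\mathbf{r}) \cdot \hat{\mathbf{p}} \right) + \frac{q^2}{2 m c^2} \hat{\mathbf{A}}^2(\mathbf{r}) \]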
The interaction term is composed of a paramagnetic interaction which is linearly
proportional to the vector potential and the diamagnetic interaction which is
proportional to the square of the vector potential. When the electromagnetic
field is quantized, the radiation Hamiltonian has the form
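\[ \hat{H}_{rad} = \sum_{k,\alpha} \frac{\hbar \omega_k}{2} \left( \hat{a}^{\dagger}_{k,\alpha} \hat{a}_{k,\alpha} + \hat{a}_{k,\alpha} \hat{a}^{\dagger}_{k,\alpha} \right) \tag{710} \]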
(712)
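in which the transverse gauge condition $\nabla \cdot \mathbf{A} = 0$ has also been used. The diamagnetic interaction is given by
\[
\begin{aligned}
\hat{H}_{dia} = \frac{q^2}{2 m c^2} \sum_{k,k',\alpha,\beta} & \frac{2 \pi \hbar c^2}{V \sqrt{\omega_k \omega_{k'}}} \; \hat{\epsilon}_{\beta}(\mathbf{k}') \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \; \exp[ \, i ( \mathbf{k} + \mathbf{k}' ) \cdot \mathbf{r} \, ] \\
& \times \left( \hat{a}_{k',\beta} \hat{a}_{k,\alpha} + \hat{a}^{\dagger}_{-k',\beta} \hat{a}_{k,\alpha} + \hat{a}_{k',\beta} \hat{a}^{\dagger}_{-k,\alpha} + \hat{a}^{\dagger}_{-k',\beta} \hat{a}^{\dagger}_{-k,\alpha} \right)
\end{aligned}
\tag{713}
\]

[Figure 15: interaction vertices, with labels p, p' and (k, α).]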
For charged particles with spin one-half, analysis of the non-relativistic Pauli equation shows that there is another interaction term involving the particle's spin. This interaction can be described by the anomalous Zeeman interaction
\[ \hat{H}_{Zeeman} = - \frac{q \hbar}{2 m c} \; \boldsymbol{\sigma} \cdot \mathbf{B} \tag{714} \]
where
\[ \mathbf{B} = \nabla \wedge \mathbf{A}(\mathbf{r}) \tag{715} \]
(716)
h
a
(717)
(718)
(719)
Hence, since the wavelength of light $\lambda$ is larger than the linear dimension of an atom, $\lambda > a$, one finds the inequality between the magnitude of the paramagnetic interaction and the Zeeman interaction
\[ \frac{e \hbar}{m c} \, \frac{1}{a} \, A \; > \; \frac{e \hbar}{m c} \, \frac{1}{\lambda} \, A \tag{720} \]
Both the paramagnetic and Zeeman coupling strengths are proportional to the
magnitude of the vector potential A, hence the ratio of the strengths of the
11.1
11.1.1
(723)
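The transition rate for the electron to make a transition from $(n, l, m)$ to $(n', l', m')$ can be calculated$^{34}$ from the Fermi Golden rule expression
\[ \frac{1}{\tau} = \frac{2 \pi}{\hbar} \sum_{k,\alpha} \left| \langle n' l' m' \, \{ n'_{k',\alpha'} \} | \, \hat{H}_{int} \, | n l m \, \{ n_{k',\alpha'} \} \rangle \right|^2 \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_{k,\alpha} ) \tag{724} \]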
34 P. A. M. Dirac, Proc. Roy. Soc. A 112, 661 (1926); A 114, 243 (1927).
Figure 16: An electron in the initial atomic state with energy $E_{n,l,m}$ makes a transition to the final atomic state with energy $E_{n',l',m'}$, by emitting a photon with energy $\hbar \omega_{k'}$.
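The delta function expresses the conservation of energy. The energy of the initial state is given by
\[ E_{nlm} + \sum_{k',\alpha'} \hbar \omega_{k',\alpha'} \left( n_{k',\alpha'} + \tfrac{1}{2} \right) \tag{725} \]
and the energy of the final state is given by
\[ E_{n'l'm'} + \sum_{k',\alpha'} \hbar \omega_{k',\alpha'} \left( n'_{k',\alpha'} + \tfrac{1}{2} \right) \tag{726} \]
The difference in the energy of the initial state and final state is evaluated as
\[ E_{nlm} - E_{n'l'm'} - \hbar \omega_{k,\alpha} \tag{727} \]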
which is the argument of the delta function and must vanish if energy is conserved. The sum over k can be evaluated by assuming that the radiation field is confined to a volume V. The allowed k values for the normal modes are determined by the boundary conditions. In this case, the sum over k is transformed to an integral over k-space via
\[ \sum_{k} \; \rightarrow \; \frac{V}{( 2 \pi )^3} \int d^3 k \tag{728} \]
ik.r
 {nk0 , } >
(729)
since only the paramagnetic part of the interaction has non-zero matrix elements. For the photon emission process, the matrix element of the creation operator between the initial and final states of the electromagnetic cavity is evaluated as
\[ \langle \{ n'_{k',\alpha'} \} | \, \hat{a}^{\dagger}_{k,\alpha} \, | \{ n_{k',\alpha'} \} \rangle = \sqrt{ n_{k,\alpha} + 1 } \tag{730} \]
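hence, the matrix elements of the interaction are given by
\[ \langle n' l' m' \, \{ n'_{k',\alpha'} \} | \, \hat{H}_{int} \, | n l m \, \{ n_{k',\alpha'} \} \rangle = - \frac{q}{m c} \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \sqrt{ n_{k,\alpha} + 1 } \; \langle n' l' m' | \, \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} \, \exp[ - i \, \mathbf{k} \cdot \mathbf{r} \, ] \, | n l m \rangle \tag{731} \]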
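Therefore, the transition rate for photon emission can be expressed as
\[ \frac{1}{\tau} = \frac{2 \pi}{\hbar} \, \frac{V}{( 2 \pi )^3} \sum_{\alpha} \int d^3 k \left( \frac{q}{m c} \right)^2 \frac{2 \pi \hbar c^2}{V \omega_k} \, ( n_{k,\alpha} + 1 ) \, \left| \langle n' l' m' | \, \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} \, \exp[ - i \, \mathbf{k} \cdot \mathbf{r} \, ] \, | n l m \rangle \right|^2 \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k ) \tag{732} \]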
The above expression shows that the rate for emitting a photon into state $(k, \alpha)$ is proportional to a factor of $n_{k,\alpha} + 1$, which depends on the state of occupation of the normal mode. The term proportional to the photon occupation number describes stimulated emission. However, if there are no photons initially present in this normal mode, one still has a non-zero transition rate corresponding to spontaneous emission. These factors are the result of the rigorous calculations$^{35}$ based on Dirac's quantization of the electromagnetic field, but were previously derived by Einstein$^{36}$ using a different argument. From the above expression, it is seen that the number of photons emitted into state $(k, \alpha)$ increases in proportion to the number of photons present in that normal mode. This stimulated emission increases the number of photons and can lead to amplification of the number of quanta in the normal mode, and leads to the phenomenon of Light Amplification by Stimulated Emission of Radiation (LASER).
35 P.
36 A.
11.1.2
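The wavelength of the emitted light is typically of the order
\[ \lambda = \frac{2 \pi c \, \hbar}{\hbar \omega} \sim 3000 \; \text{\AA} \tag{733} \]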
whereas the typical length scale r for the electronic state is of the order of an Angstrom. Therefore the product $k \, r \sim 10^{-3}$, so the exponential factor in the vector potential can be Taylor expanded as
\[ \exp[ - i \, \mathbf{k} \cdot \mathbf{r} \, ] \approx 1 - i \, \mathbf{k} \cdot \mathbf{r} + \ldots \tag{734} \]
The first term in the expression produces results that are equivalent to the radiation from an oscillating classical electric dipole. If only the first term in the expansion is retained, the resulting approximation is known as the dipole approximation. The second term in the expansion yields results equivalent to the radiation from an electric quadrupole. The dipole approximation, where only the first term in the expansion is retained, is justified for transitions where the successive terms in the expansion are successively smaller by factors of the order of $10^{-3}$. The dipole approximation crudely restricts consideration to the case where the emitted photon can only have zero orbital angular momentum. This follows from the dipole approximation's requirement that the size of the atom is negligible compared with the scale over which the vector potential varies. Then the vector potential in the spatial region where the electron is located only describes photons with zero orbital angular momentum.

Figure 17: A cartoon depicting the relative length-scales assumed in the dipole approximation.
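In the dipole approximation, the transition rate for single photon emission is given by
\[ \frac{1}{\tau} \approx \frac{2 \pi}{\hbar} \, \frac{V}{( 2 \pi )^3} \sum_{\alpha} \int d^3 k \left( \frac{q}{m c} \right)^2 \frac{2 \pi \hbar c^2}{V \omega_k} \, ( n_{k,\alpha} + 1 ) \, \left| \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \langle n' l' m' | \, \hat{\mathbf{p}} \, | n l m \rangle \right|^2 \delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k ) \tag{735} \]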
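The matrix elements of the momentum can be evaluated by noting that the states $| n l m \rangle$ are eigenstates of the unperturbed electronic Hamiltonian, so
\[ \hat{H}_0 \, | n l m \rangle = E_{nlm} \, | n l m \rangle \tag{736} \]
where
\[ \hat{H}_0 = \frac{\hat{\mathbf{p}}^2}{2 m} + V(r) \tag{737} \]
and that the position operator satisfies the commutation relation
\[ [ \, \mathbf{r} \, , \, \hat{H}_0 \, ] = i \, \frac{\hbar}{m} \, \hat{\mathbf{p}} \tag{738} \]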
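On using this relation, the matrix elements of the momentum operator can be written in terms of the matrix elements of the electron's position operator r by
\[
\begin{aligned}
\langle n' l' m' | \, \hat{\mathbf{p}} \, | n l m \rangle &= - i \, \frac{m}{\hbar} \, \langle n' l' m' | \, [ \, \mathbf{r} \, , \, \hat{H}_0 \, ] \, | n l m \rangle \\
&= - i \, \frac{m}{\hbar} \, ( E_{nlm} - E_{n'l'm'} ) \, \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle
\end{aligned}
\tag{739}
\]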
d3 k k
(2)
(741)
It is seen that the volume of the electromagnetic cavity has dropped out of the expression of eqn(740) for the transition rate. We shall assume that the number of photons $n_{k,\alpha}$ in the initial state is zero. The (complex) factor
\[ \mathbf{d}_{nlm,n'l'm'} = q \, \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle \tag{742} \]
is defined as the electric dipole moment, and the electronic energy difference is denoted by the frequency
\[ E_{nlm} - E_{n'l'm'} = \hbar \, \omega_{nl,n'l'} \tag{743} \]
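Thus, the transition rate is given by
\[ \frac{1}{\tau} = \frac{\omega^2_{nl,n'l'}}{2 \pi} \sum_{\alpha} \int d^3 k \; \frac{1}{\omega_k} \left| \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{d}_{nlm,n'l'm'} \right|^2 \delta( \hbar \omega_{nl,n'l'} - \hbar \omega_k ) \tag{744} \]
On writing $d^3 k = k^2 \, dk \, d\Omega_k$ with $\omega_k = c \, k$ and integrating over the magnitude of k, one finds
\[ \frac{1}{\tau} = \frac{\omega^3_{nl,n'l'}}{2 \pi \hbar c^3} \sum_{\alpha} \int d\Omega_k \left| \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{d}_{nlm,n'l'm'} \right|^2 \tag{745} \]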
The above expression yields the rate at which an electron makes a transition
between the initial and final electronic state, in which one photon of any polarization is emitted in any direction.
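If one is only interested in the decay rate of the electronic state via the emission of a photon, one should sum over all polarizations and integrate over all directions of the emitted photon. The direction of the emitted photon k is expressed in terms of polar coordinates defined with respect to an arbitrarily chosen polar axis. The direction of the photon's wave vector k is defined as $\hat{k} = ( \sin\theta_k \cos\varphi_k , \, \sin\theta_k \sin\varphi_k , \, \cos\theta_k )$. The directions of the two transverse polarizations are defined as
\[
\begin{aligned}
\hat{\epsilon}_1(\mathbf{k}) &= ( \cos\theta_k \cos\varphi_k , \; \cos\theta_k \sin\varphi_k , \; - \sin\theta_k ) \\
\hat{\epsilon}_2(\mathbf{k}) &= ( - \sin\varphi_k , \; \cos\varphi_k , \; 0 )
\end{aligned}
\tag{746}
\]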
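The scalar product between the polarization vectors and the dipole moment can be expressed in terms of the Cartesian components via
\[ \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{d}_{nlm,n'l'm'} = \sum_{i} \hat{\epsilon}^{(i)}_{\alpha}(\mathbf{k}) \; ( \mathbf{d}_{nlm,n'l'm'} )^{(i)} \tag{747} \]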
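As neither the polarization nor the direction of the outgoing photon is measured, the transition rate is determined as an integral over all directions
\[ \frac{1}{\tau} = \frac{\omega^3_{nl,n'l'}}{2 \pi \hbar c^3} \sum_{i,j} \sum_{\alpha} \int d\Omega_k \; \hat{\epsilon}^{(j)}_{\alpha}(\mathbf{k}) \; ( \mathbf{d}^*_{nlm,n'l'm'} )^{(j)} \; ( \mathbf{d}_{nlm,n'l'm'} )^{(i)} \; \hat{\epsilon}^{(i)}_{\alpha}(\mathbf{k}) \tag{748} \]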
Figure 18: A photon is emitted with wave vector k with a direction denoted by the polar coordinates $(\theta_k, \varphi_k)$. The polarization vector $\hat{e}_1(\mathbf{k})$ is chosen to be in the plane containing the polar axis and k; therefore, $\hat{e}_2(\mathbf{k})$ is parallel to the x-y plane.
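On using the identity
\[ \frac{1}{2} \sum_{\alpha} \int d\Omega_k \; \hat{\epsilon}^{(j)}_{\alpha}(\mathbf{k}) \; \hat{\epsilon}^{(i)}_{\alpha}(\mathbf{k}) = \frac{4 \pi}{3} \; \delta_{i,j} \tag{749} \]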
one finds that the transition rate is given by the scalar product of complex vectors
\[ \frac{1}{\tau} = \frac{4 \, \omega^3_{nl,n'l'}}{3 \hbar c^3} \; \mathbf{d}^*_{nlm,n'l'm'} \cdot \mathbf{d}_{nlm,n'l'm'} \tag{750} \]
The electric dipole matrix elements can be shown to vanish between most pairs of states. The selection rules determine which matrix elements are non-zero and, therefore, which electric dipole transitions are allowed.
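The angular identity used to pass from eqn(748) to eqn(750) can be checked by direct numerical integration. The sketch below assumes one standard choice of transverse polarization vectors, with the first lying in the plane containing the polar axis and k, and uses a simple midpoint rule; the function names and grid sizes are illustrative.

```python
import math

def polarization_vectors(theta, phi):
    """Two unit vectors transverse to k = (sin t cos p, sin t sin p, cos t)."""
    e1 = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    e2 = (-math.sin(phi), math.cos(phi), 0.0)
    return e1, e2

def polarization_tensor(n_theta=200, n_phi=200):
    """Midpoint-rule estimate of sum_alpha int dOmega e^(i) e^(j)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        weight = math.sin(theta) * d_theta * d_phi   # solid-angle element
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            for e in polarization_vectors(theta, phi):
                for i in range(3):
                    for j in range(3):
                        tensor[i][j] += weight * e[i] * e[j]
    return tensor

tensor = polarization_tensor()
expected_diagonal = 8.0 * math.pi / 3.0   # twice the 4 pi / 3 of eqn (749)
print(tensor[0][0], tensor[1][1], tensor[2][2], expected_diagonal)
```

The summed tensor should be close to $(8\pi/3)\,\delta_{i,j}$, which combined with the prefactor $\omega^3 / ( 2 \pi \hbar c^3 )$ gives the factor $4 \omega^3 / ( 3 \hbar c^3 )$ of eqn(750).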
11.1.3
Electric dipole induced transitions obey the selection rules $\Delta l = \pm 1$ and either $\Delta m = \pm 1$ or 0, where l is the quantum number for the electron's orbital angular momentum and m is its z-component. The dipole selection rules can be derived by writing the wave functions for the one-electron states as
\[ \psi_{n,l,m}(\mathbf{r}) = R_{nl}(r) \; Y^l_m(\theta, \varphi) \tag{751} \]
where $R_{n,l}(r)$ is the radial wave function, and $Y^l_m(\theta, \varphi)$ is the spherical harmonic function quantized along the z-direction. The components of an arbitrarily oriented electric dipole matrix element involve matrix elements of the quantities
\[
\begin{aligned}
x &= r \, \sin\theta \, \cos\varphi \\
y &= r \, \sin\theta \, \sin\varphi \\
z &= r \, \cos\theta
\end{aligned}
\tag{752}
\]
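Since the above expressions are the components of a vector, they can be rewritten as combinations of the spherical harmonics with angular momentum l = 1, via
\[
\begin{aligned}
x &= r \, \sqrt{\frac{8 \pi}{3}} \; \frac{1}{2} \left( Y^1_{-1}(\theta, \varphi) - Y^1_{1}(\theta, \varphi) \right) \\
y &= r \, \sqrt{\frac{8 \pi}{3}} \; \frac{i}{2} \left( Y^1_{-1}(\theta, \varphi) + Y^1_{1}(\theta, \varphi) \right) \\
z &= r \, \sqrt{\frac{4 \pi}{3}} \; Y^1_{0}(\theta, \varphi)
\end{aligned}
\tag{753}
\]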
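Hence, the vector r can be written as
\[ \mathbf{r} = \sqrt{\frac{4 \pi}{3}} \; r \left( \frac{\hat{e}_x + i \, \hat{e}_y}{\sqrt{2}} \, Y^1_{-1} + \hat{e}_z \, Y^1_{0} - \frac{\hat{e}_x - i \, \hat{e}_y}{\sqrt{2}} \, Y^1_{1} \right) \tag{754} \]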
ex i ey
2
= ez
ex + i ey
=
2
(755)
k,m
k,m + em a
r . em a
(756)
(757)
(758)
(where k 0), an electron with angular momentum quantized along the zdirection most naturally couples to circularlypolarized light with the same
quantization axis. The electric dipole matrix elements involve the three factors
Z 2
Z
0
1
d
d sin Yml 0 (, ) Y1
(, ) Yml (, )
0
Z
d
(759)
(760)
m0
m0
(761)
and
= m 1
= m
because one unit of angular momentum is carried away by the photon in the
form of its spin37 .
The m-selection rules for electric dipole transitions.
The selection rules on the z-component of the angular momentum, m, follow directly from the $\varphi$ dependence of the spherical harmonics
\[ Y^l_m(\theta, \varphi) = \Theta^l_m(\theta) \; \frac{1}{\sqrt{2 \pi}} \exp[ \, i \, m \, \varphi \, ] \tag{762} \]
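so, the $\varphi$ integrals for the Cartesian components of the dipole matrix elements involve
\[ \frac{1}{2 \pi} \int_0^{2\pi} d\varphi \; \exp[ \, i \, ( m - m' ) \, \varphi \, ] \begin{pmatrix} \sin\varphi \\ \cos\varphi \end{pmatrix} = \frac{1}{2} \begin{pmatrix} - i \, \delta_{m+1,m'} + i \, \delta_{m-1,m'} \\ \delta_{m+1,m'} + \delta_{m-1,m'} \end{pmatrix} \tag{763} \]
and
\[ \frac{1}{2 \pi} \int_0^{2\pi} d\varphi \; \exp[ \, i \, ( m - m' ) \, \varphi \, ] = \delta_{m,m'} \tag{764} \]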
The above results lead to the selection rules for the z-component of the electron's orbital angular momentum
\[
\begin{aligned}
m' &= m \pm 1 \\
m' &= m
\end{aligned}
\tag{765}
\]
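The azimuthal integrals behind the m-selection rules can be evaluated numerically. This sketch replaces the integral over the azimuthal angle by a uniform sum, which reproduces the integral for low-order trigonometric polynomials; the helper name `phi_integral` and the number of sample points are assumptions made for illustration.

```python
import cmath
import math

def phi_integral(m, m_prime, weight, n_points=64):
    """(1 / 2 pi) * integral over phi of exp[i (m - m') phi] * weight(phi)."""
    total = 0j
    for k in range(n_points):
        phi = 2.0 * math.pi * k / n_points
        total += cmath.exp(1j * (m - m_prime) * phi) * weight(phi)
    return total / n_points

m = 1
for m_prime in range(-1, 4):
    x_like = phi_integral(m, m_prime, math.cos)             # enters x matrix elements
    y_like = phi_integral(m, m_prime, math.sin)             # enters y matrix elements
    z_like = phi_integral(m, m_prime, lambda phi: 1.0)      # enters z matrix elements
    print(m_prime, abs(x_like), abs(y_like), abs(z_like))
```

The x and y factors should be non-zero only for $m' = m \pm 1$ and the z factor only for $m' = m$.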
37 In the dipole approximation, the photon is restricted to have zero orbital angular momentum. Therefore, the angular momentum is completely transformed to the photon's spin. More generally, the spatial (plane-wave) part of the vector potential should be expanded in terms of spherical harmonics to exhibit the photon's orbital angular momentum components.
(766)
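On taking the matrix elements between states with definite z-components of the angular momenta, one finds
\[
\begin{aligned}
\langle n' l' m' | \, [ \, \hat{L}_z , x \, ] \, | n l m \rangle &= i \hbar \, \langle n' l' m' | \, y \, | n l m \rangle \\
\langle n' l' m' | \, [ \, \hat{L}_z , y \, ] \, | n l m \rangle &= - i \hbar \, \langle n' l' m' | \, x \, | n l m \rangle \\
\langle n' l' m' | \, [ \, \hat{L}_z , z \, ] \, | n l m \rangle &= 0
\end{aligned}
\tag{767}
\]
which reduce to
\[
\begin{aligned}
( m' - m ) \, \langle n' l' m' | \, x \, | n l m \rangle &= i \, \langle n' l' m' | \, y \, | n l m \rangle \\
( m' - m ) \, \langle n' l' m' | \, y \, | n l m \rangle &= - i \, \langle n' l' m' | \, x \, | n l m \rangle \\
( m' - m ) \, \langle n' l' m' | \, z \, | n l m \rangle &= 0
\end{aligned}
\]
Combining the first two relations gives $[ \, ( m' - m )^2 - 1 \, ] \, \langle n' l' m' | \, x \, | n l m \rangle = 0$, and likewise for y, so the matrix elements of x and y vanish unless $m' = m \pm 1$, whereas the matrix elements of z vanish unless $m' = m$.
Hence, the m-selection rules for the electric dipole transitions are $\Delta m = \pm 1, 0$.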
The l-selection rules for electric dipole transitions.
The selection rules for the magnitude of the electron's orbital angular momentum can be found by considering the double commutator
\[ [ \, \hat{L}^2 , [ \, \hat{L}^2 , \mathbf{r} \, ] \, ] = 2 \hbar^2 \left( \mathbf{r} \, \hat{L}^2 + \hat{L}^2 \, \mathbf{r} \right) \tag{773} \]
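On taking the matrix elements of this relation between the states $\langle n' l' m' |$ and $| n l m \rangle$, one finds
\[ \left( \left[ \, l' ( l' + 1 ) - l ( l + 1 ) \, \right]^2 - 2 \left[ \, l' ( l' + 1 ) + l ( l + 1 ) \, \right] \right) \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle = 0 \tag{774} \]
Since
\[ \left[ \, l' ( l' + 1 ) - l ( l + 1 ) \, \right]^2 = ( l' + l + 1 )^2 \, ( l' - l )^2 \tag{775} \]
and
\[ 2 \left[ \, l' ( l' + 1 ) + l ( l + 1 ) \, \right] = ( l' + l + 1 )^2 + ( l' - l )^2 - 1 \tag{776} \]
the above equation is satisfied if, either
\[ \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle = 0 \tag{777} \]
or
\[ \left[ \, ( l' + l + 1 )^2 - 1 \, \right] \left[ \, ( l' - l )^2 - 1 \, \right] = 0 \tag{778} \]
The first factor in eqn(778) is always positive when $l' \neq l$, therefore, the electric dipole selection rule becomes $\Delta l = \pm 1$.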
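The algebra leading from eqn(774) to eqn(778) can be confirmed by brute force over small integers. The following sketch uses hypothetical helper names and simply enumerates the pairs (l, l') for which the bracket multiplying the matrix element vanishes.

```python
def double_commutator_factor(l, l_prime):
    """Bracket multiplying the matrix element of r in eqn (774)."""
    difference = l_prime * (l_prime + 1) - l * (l + 1)
    total = l_prime * (l_prime + 1) + l * (l + 1)
    return difference ** 2 - 2 * total

def factorised_form(l, l_prime):
    """Product of the two factors appearing in eqn (778)."""
    return ((l_prime + l + 1) ** 2 - 1) * ((l_prime - l) ** 2 - 1)

pairs = [(l, lp) for l in range(6) for lp in range(6)]
allowed = [(l, lp) for l, lp in pairs if double_commutator_factor(l, lp) == 0]
print(allowed)
```

Apart from the pair l = l' = 0, for which the first factor of eqn(778) vanishes, the bracket is zero only when l' and l differ by one. The l = l' = 0 case is excluded separately, for example by the parity selection rule, since both states have the same parity.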
The actual values of the matrix elements can be found from explicit calculations. The $\theta$ dependence of the matrix elements is governed by the associated Legendre functions through
\[ \Theta^l_m(\theta) = \sqrt{ \frac{( 2 l + 1 )}{2} \, \frac{( l - m )!}{( l + m )!} } \; P^l_m(\cos\theta) \tag{779} \]
which obey the recursion relations
\[ \sin\theta \; P^l_{m-1}(\cos\theta) = \frac{ P^{l+1}_m(\cos\theta) - P^{l-1}_m(\cos\theta) }{2 l + 1} \tag{780} \]
and
\[ \sin\theta \; P^l_{m+1}(\cos\theta) = \frac{ ( l + m ) ( l + m + 1 ) \, P^{l-1}_m(\cos\theta) - ( l - m ) ( l - m + 1 ) \, P^{l+1}_m(\cos\theta) }{2 l + 1} \tag{781} \]
appropriate for the $\Delta m = \pm 1$ transitions and
\[ \cos\theta \; P^l_m(\cos\theta) = \frac{ ( l - m + 1 ) \, P^{l+1}_m(\cos\theta) + ( l + m ) \, P^{l-1}_m(\cos\theta) }{2 l + 1} \tag{782} \]
m+1
(2l 1)(2l + 1)
(783)
for m = 1, while for m = 1 one finds
s
( l + m ) ( l + m 1 ) l1
sin lm () =
m1
(2l + 1)(2l 1)
s
( l + 2 m ) ( l + 1 m ) l+1
m1
(2l + 1)(2l + 3)
(784)
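and for constant m
\[ \cos\theta \; \Theta^l_m(\theta) = \sqrt{ \frac{( l + 1 + m ) ( l + 1 - m )}{( 2 l + 1 ) ( 2 l + 3 )} } \; \Theta^{l+1}_m(\theta) + \sqrt{ \frac{( l + m ) ( l - m )}{( 2 l - 1 ) ( 2 l + 1 )} } \; \Theta^{l-1}_m(\theta) \tag{785} \]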
The coefficients in the above equation have a similar form to the Clebsch-Gordon coefficients. The dipole matrix elements can be evaluated by taking the matrix elements of the above set of relations with $\Theta^{l'}_{m'}(\theta)$ and then using the orthogonality properties. The above three relations give rise to the selection rules for the magnitude of the orbital angular momentum l
\[ l' = l \pm 1 \tag{786} \]
Hence, not only have the selection rules on l been rederived but the angular
integrations have also been evaluated.
What the above mathematics describes is how the spin angular momentum of the emitted photon is combined with the orbital angular momentum of the electron in the final state, so that total angular momentum is conserved. This implies the selection rules, which require the magnitudes of the initial and final electronic angular momenta to satisfy the triangular inequality
\[ l' + 1 \; \geq \; l \; \geq \; | \, l' - 1 \, | \tag{787} \]
m0
m0 = m + 1
l0 = l + 1
2i
12
(lm)(l1m)
(2l1)(2l+1)
m =m
2i
(l+2m)(l+1m)
(2l+1)(2l+3)
i
2
(lm)(l1m)
(2l1)(2l+1)
m0 = m 1
1
2
(l+2m)(l+1m)
(2l+1)(2l+3)
12
(l+2+m)(l+1+m)
(2l+1)(2l+3)
m0 = m + 1
l =l1
(l+2+m)(l+1+m)
(2l+1)(2l+3)
m0 = m
m0 = m 1
1
2
(l+m)(l1+m)
(2l1)(2l+1)
i
2
(l+m)(l1+m)
(2l1)(2l+1)
Then, since the electric dipole carries angular momentum (1, ), the WignerEckart theorem reduces to
1
2
l0
+ 1
(789)
where the first term which represents the angular integration is a ClebschGordon coefficient and the second factor is the reduced matrix element which
does depend on the form of the particular vector, but is independent of any
choice of coordinate system. Furthermore, the WignerEckart theorem yields
the selection rules for the electric dipole transition
l + l0 1  l l0 
Exercise:
(l+m)(lm)
(2l1)(2l+1)
For this example, the irreducible tensor is the vector V with components V
given by
r
(x iy)
4 1
V
=
= r
Y
3 1
2
r
4 1
V0 = z = r
Y
(788)
3 0
(l+1+m)(l+1m)
(2l+1)(2l+3)
(790)
Using the commutation relations for the jth component of a vector $\hat{V}^j$ with the ith component of the orbital angular momentum $\hat{L}_i$,
\[ [ \, \hat{L}_i \, , \, \hat{V}^j \, ] = i \hbar \sum_{k} \epsilon_{i,j,k} \; \hat{V}^k \tag{791} \]
[ L , V ] = i h L V V L
= 2 i h L V i h V
(792)
[L , [L , V ]] = 2h
V L + L V
4 h L L . V
(793)
and that the last term of the above expression is zero if V = r.
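The parity selection rule.
In addition to the electronic orbital angular momentum selection rules, there is a parity selection rule. The parity operation is an inversion through the origin given by $\mathbf{r} \rightarrow - \mathbf{r}$. The parity operator $\hat{P}$ has the effect
\[ \hat{P} \, \psi(\mathbf{r}) = \psi(- \mathbf{r}) \tag{794} \]
The parity operator is its own inverse since for any state $\psi(\mathbf{r})$
\[ \hat{P}^2 \, \psi(\mathbf{r}) = \hat{P} \, \psi(- \mathbf{r}) = \psi(\mathbf{r}) \tag{795} \]
Therefore, the parity operator has eigenvalues $p = \pm 1$ for the eigenstates which are defined by
\[ \hat{P} \, \phi_p(\mathbf{r}) = p \; \phi_p(\mathbf{r}) \tag{796} \]
so
\[ \hat{P}^2 \, \phi_p(\mathbf{r}) = p^2 \, \phi_p(\mathbf{r}) = \phi_p(\mathbf{r}) \tag{797} \]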
(799)
r
r
r
r
P.
 n > = En  n >
H
P  n > = pn  n >
(801)
(802)
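Hence, for any matrix elements of r between any eigenstates of the parity operator, one has
\[
\begin{aligned}
\langle n' | \, \mathbf{r} \, | n \rangle &= - \langle n' | \, \hat{P} \, \mathbf{r} \, \hat{P}^{-1} \, | n \rangle \\
&= - p_{n'} \, p_n \, \langle n' | \, \mathbf{r} \, | n \rangle
\end{aligned}
\tag{803}
\]
since $\hat{P} \, \mathbf{r} \, \hat{P}^{-1} = - \mathbf{r}$. Therefore, the matrix elements vanish unless the two states have opposite parities
\[ p_{n'} \, p_n = - 1 \tag{804} \]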
This is known as the Laporte selection rule for electric dipole transitions39 . The
validity of this selection follows from the fact that inversion commutes with the
38 The
39 O.
(805)
sin exp
il
(806)
11.1.4
We shall assume that the initial state is polarized so that the electron is in an electronic state labelled by m, where the axis of quantization is fixed in space. The decay rate in which a photon of polarization $\alpha$ is emitted into the solid angle $d\Omega_k$ is given by
1
dk
3
X
nl,n
0 l0
d
 (k) . dnlm,n0 l0 m0 2
k
2 h c3
(808)
(809)
=
=
(810)
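Therefore, the scalar products of the transition matrix elements of r with the polarizations are given by
\[
\begin{aligned}
\hat{\epsilon}_1(\mathbf{k}) \cdot \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle = \; & \frac{1}{2} \cos\theta_k \, \exp[ - i \varphi_k ] \, \langle n' l' m' | \, ( x + i y ) \, | n l m \rangle \\
& + \frac{1}{2} \cos\theta_k \, \exp[ + i \varphi_k ] \, \langle n' l' m' | \, ( x - i y ) \, | n l m \rangle \\
& - \sin\theta_k \, \langle n' l' m' | \, z \, | n l m \rangle
\end{aligned}
\tag{811}
\]
and
\[
\begin{aligned}
\hat{\epsilon}_2(\mathbf{k}) \cdot \langle n' l' m' | \, \mathbf{r} \, | n l m \rangle = \; & - \frac{i}{2} \exp[ - i \varphi_k ] \, \langle n' l' m' | \, ( x + i y ) \, | n l m \rangle \\
& + \frac{i}{2} \exp[ + i \varphi_k ] \, \langle n' l' m' | \, ( x - i y ) \, | n l m \rangle
\end{aligned}
\tag{812}
\]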
(813)
(814)
and
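the cross-terms in the square of the matrix elements are zero. Hence, on summing over the polarizations, one finds that the $(\theta_k, \varphi_k)$ dependence of the decay is governed by the dipole matrix elements through
\[
\begin{aligned}
\sum_{\alpha} \left| \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{r}_{nlm,n'l'm'} \right|^2 = \; & \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \left| \langle n' l' m' | \, ( x + i y ) \, | n l m \rangle \right|^2 \\
& + \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \left| \langle n' l' m' | \, ( x - i y ) \, | n l m \rangle \right|^2 \\
& + \sin^2\theta_k \, \left| \langle n' l' m' | \, z \, | n l m \rangle \right|^2
\end{aligned}
\tag{815}
\]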
Ilm
0 =l+1 (k , k ) =
Ilm
0 =l+1 (k , k ) =
The independence of the result on m follows since, in this case, there are no angular correlations and the choice of direction of quantization of m is completely arbitrary. The total decay rate for an electron in a state with fixed m due to an $l' = l + 1$ transition is given by
\[ \left. \frac{1}{\tau} \right|_{l'=l+1} = \frac{4 \, e^2 \, \omega^3_{nl,n'l'}}{3 \hbar c^3} \; \frac{( l + 1 )}{( 2 l + 1 )} \left| \int_0^{\infty} dr \; r^2 \, R_{n'l+1}(r) \; r \; R_{nl}(r) \right|^2 \tag{820} \]
for $l' = l + 1$. This decay rate would be measured in experiments in which neither the final state of the electron nor the final photon state is measured.
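However, if the initial electronic state is unpolarized, then one should statistically average over the initial m. In this case, the emitted radiation becomes isotropic
\[ \frac{1}{2 l + 1} \sum_{m=-l}^{l} \sum_{m'} I^{m'}_{l'=l+1}(\theta_k, \varphi_k) = \frac{2}{3} \, \frac{( l + 1 )}{( 2 l + 1 )} \tag{821} \]
since
\[ \frac{1}{2 l + 1} \sum_{m=-l}^{l} m^2 = \frac{1}{3} \, l \, ( l + 1 ) \tag{822} \]
The corresponding decay rate is
\[ \left. \frac{1}{\tau} \right|_{l'=l+1} = \frac{4 \, e^2 \, \omega^3_{nl,n'l'}}{3 \hbar c^3} \; \frac{( l + 1 )}{( 2 l + 1 )} \left| \int_0^{\infty} dr \; r^2 \, R_{n'l+1}(r) \; r \; R_{nl}(r) \right|^2 \tag{823} \]
for $l' = l + 1$. This is the same result that was previously obtained for the decay rate of a level with a specific m value, when the m' value of the final state and the polarization or direction of the emitted photon are not measured.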
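For the case where $l' = l - 1$, one finds that the decay rate involves the angular factor
\[
\begin{aligned}
I^{m'}_{l'=l-1}(\theta_k, \varphi_k) = \; & \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \frac{( l - m ) ( l - 1 - m )}{( 2 l - 1 ) ( 2 l + 1 )} \; \delta_{m',m-1} \\
& + \frac{1}{4} \left( 1 + \cos^2\theta_k \right) \frac{( l - 1 + m ) ( l + m )}{( 2 l - 1 ) ( 2 l + 1 )} \; \delta_{m',m+1} \\
& + \sin^2\theta_k \; \frac{( l + m ) ( l - m )}{( 2 l - 1 ) ( 2 l + 1 )} \; \delta_{m',m}
\end{aligned}
\tag{824}
\]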
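which on summing over the final values of m' yields the angular dependence of the radiation field
\[ \sum_{m'} I^{m'}_{l'=l-1}(\theta_k, \varphi_k) = \frac{1}{2} \left( 1 + \cos^2\theta_k \right) \frac{l ( l - 1 ) + m^2}{( 2 l - 1 ) ( 2 l + 1 )} + \sin^2\theta_k \; \frac{l^2 - m^2}{( 2 l - 1 ) ( 2 l + 1 )} \tag{825} \]
The anisotropy of the emitted radiation is determined by the factor
\[ \sum_{m'} I^{m'}_{l'=l-1}(\theta_k, \varphi_k) = \frac{l ( l - 1 ) + m^2}{( 2 l - 1 ) ( 2 l + 1 )} + \frac{1}{2} \, \frac{l ( l + 1 ) - 3 m^2}{( 2 l - 1 ) ( 2 l + 1 )} \; \sin^2\theta_k \tag{826} \]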
which shows that for m = 0 the photons are preferentially emitted perpendicular to the direction of the quantization axis, since this maximizes the overlap between the polarization and the dipole matrix element. In the opposite case of larger $m^2$ [ $3 m^2 > l ( l + 1 )$ ], one finds that the photons are preferentially emitted parallel (or anti-parallel) to the axis of quantization.
Again it is noted that if the initial state is unpolarized, so that m has to be averaged over, then the radiation field is isotropic since
\[ \frac{1}{2 l + 1} \sum_{m=-l}^{l} \sum_{m'} I^{m'}_{l'=l-1}(\theta_k, \varphi_k) = \frac{2}{3} \, \frac{l}{( 2 l + 1 )} \tag{827} \]
Therefore, the decay rate in which the photon is emitted in any direction is given by the expression
\[ \left. \frac{1}{\tau} \right|_{l'=l-1} = \frac{4 \, e^2 \, \omega^3_{nl,n'l'}}{3 \hbar c^3} \; \frac{l}{( 2 l + 1 )} \left| \int_0^{\infty} dr \; r^2 \, R_{n'l-1}(r) \; r \; R_{nl}(r) \right|^2 \tag{828} \]
for $l' = l - 1$.
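The steps from eqn(825) to eqn(827) can be verified with exact rational arithmetic. The sketch below encodes the two forms of the angular factor as functions of $\cos^2\theta_k$; the function names are hypothetical, and `cos2` stands for $\cos^2\theta_k$ supplied as a `Fraction`.

```python
from fractions import Fraction

def angular_sum_825(l, m, cos2):
    """Angular factor summed over m', in the form of eqn (825)."""
    denom = (2 * l - 1) * (2 * l + 1)
    sin2 = 1 - cos2
    return (Fraction(1, 2) * (1 + cos2) * Fraction(l * (l - 1) + m * m, denom)
            + sin2 * Fraction(l * l - m * m, denom))

def angular_sum_826(l, m, cos2):
    """The same factor, rearranged as in eqn (826)."""
    denom = (2 * l - 1) * (2 * l + 1)
    sin2 = 1 - cos2
    return (Fraction(l * (l - 1) + m * m, denom)
            + Fraction(l * (l + 1) - 3 * m * m, 2 * denom) * sin2)

def m_averaged(l, cos2):
    """Average of the angular factor over the 2l + 1 initial values of m."""
    total = sum(angular_sum_826(l, m, cos2) for m in range(-l, l + 1))
    return total / (2 * l + 1)

sample_angles = [Fraction(0), Fraction(1, 3), Fraction(1)]
for l in range(1, 4):
    print(l, [m_averaged(l, cos2) for cos2 in sample_angles])
```

For each l the averaged factor should be the same at every angle, namely $2 l / [ 3 ( 2 l + 1 ) ]$, confirming that the unpolarized radiation is isotropic.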
Classical Interpretation.
The quantum mechanical results for the angular distribution of the radiation can be understood in terms of a simple classical model of the atom. In Bohr's model, a single electron orbits a central nucleus to which it is bound by the attractive Coulomb potential. We shall assume that the radius of the orbit is a and that the electron is performing a circular orbit in the x-y plane. Since the direction of the electron's orbital angular momentum is aligned with the z-axis, it corresponds to the case where $m \approx l$ and $l \gg 1$. In this case, the electron has an oscillating dipole moment given by
\[
\begin{aligned}
\mathbf{d}(t) &= q \, a \left( \cos\omega t \; \hat{e}_x + \sin\omega t \; \hat{e}_y \right) \\
&= q \, a \; \Re e \left[ \left( \hat{e}_x - i \, \hat{e}_y \right) \exp[ \, i \omega t \, ] \right]
\end{aligned}
\tag{829}
\]
This rotating dipole moment can be decomposed into two orthogonal linear dipole moments which oscillate out of phase with each other. It should be recalled that a classical oscillating (linear) electric dipole moment radiates power $P(\omega)$ into a solid angle $d\Omega_k$ with a distribution given by
\[ \left( \frac{dP}{d\Omega_k} \right)_{linear} = \frac{c}{8 \pi} \left( \frac{\omega}{c} \right)^4 | \, \mathbf{d} \, |^2 \, \sin^2\theta_{kd} \tag{830} \]

Figure 20: A classical electron orbiting in the x-y plane (m = l) can be considered as producing two perpendicular linearly-oscillating electric-dipole moments.
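where $\theta_{kd}$ is the angle between the detector and the direction of the electric dipole. On considering the radiation from the atom to be generated from two orthogonal linear oscillating dipoles, one finds
\[ \left( \frac{dP}{d\Omega_k} \right)_{dipole} = \frac{c}{8 \pi} \left( \frac{\omega}{c} \right)^4 | \, \mathbf{d} \, |^2 \left( \sin^2\theta_{kx} + \sin^2\theta_{ky} \right) \tag{831} \]
which on using
\[
\begin{aligned}
\cos\theta_{kx} &= \sin\theta_k \, \cos\varphi_k \\
\cos\theta_{ky} &= \sin\theta_k \, \sin\varphi_k
\end{aligned}
\tag{832}
\]
becomes
\[ \left( \frac{dP}{d\Omega_k} \right)_{dipole} = \frac{c}{8 \pi} \left( \frac{\omega}{c} \right)^4 | \, \mathbf{d} \, |^2 \left( 1 + \cos^2\theta_k \right) \tag{833} \]
Since the energy of the emitted photon is given by $\hbar \omega$, one finds the angular dependence of the semi-classical prediction of the decay rate is given by
\[ \frac{1}{\tau} \, d\Omega_k = \frac{1}{8 \pi} \left( \frac{e^2}{\hbar \, a} \right) \left( \frac{\omega \, a}{c} \right)^3 \left( 1 + \cos^2\theta_k \right) d\Omega_k \tag{834} \]

Figure 21: The polarization of the radiated electromagnetic field for an electron orbiting in the x-y plane (m = l) can be comprehended in terms of the classical radiation emanating from two linearly oscillating electric-dipole moments. The angles $\theta_{kx}$ and $\theta_{ky}$, respectively, are the angle between the emitted radiation and the x-axis and the angle subtended by the emitted radiation and the y-axis.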
The polarization vector is parallel to the direction of the electric field, which in turn is given by the direction of the oscillating dipole that produced it. Hence, a detector which is arranged to accept radiation travelling in the direction $\hat{k}$ will detect polarizations that are found by projecting the electron's orbit onto the plane perpendicular to $\hat{k}$. For example, in this case where the electron's orbit is in the x-y plane, radiation along the z-axis will be circularly-polarized, whereas radiation in the x-y plane will be linearly-polarized.
The angular dependence of the decay rate follows directly from the expressions of eqn(817) and eqn(825) by setting $m \approx l \gg 1$, replacing the radial matrix elements of r by a, adding the expressions and inserting them into eqn(808). The analysis shows that quantum mechanics reproduces the classical limit correctly, as is expected from the correspondence principle.

[Figure 22: panels labelled "Circular" and "Linear", showing the polarization e and the wave vector k.]
The decay rate due to dipole transitions includes processes in which photons of all polarizations are emitted in all directions. Accordingly, the decay rate is found by summing over all polarizations and integrating over the directions of the emitted photon. For a spherically symmetric system, the energy will be independent of the z-component of the orbital angular momentum. In this case, one should sum over all values of m' corresponding to the degenerate final states. On summing over all final states corresponding to a specific l' value, that is on summing over m' where $m' = m, \, m \pm 1$, one finds that the transition rate can be expressed as
\[ \frac{1}{\tau} = \frac{4 \, e^2 \, \omega^3_{nl,n'l'}}{3 \hbar c^3} \left\{ \begin{array}{c} \frac{( l + 1 )}{( 2 l + 1 )} \\[4pt] \frac{l}{( 2 l + 1 )} \end{array} \right\} \left| \int_0^{\infty} dr \; r^2 \, R_{n'l'}(r) \; r \; R_{nl}(r) \right|^2 \quad \text{for} \quad l' = \left\{ \begin{array}{c} l + 1 \\ l - 1 \end{array} \right. \tag{835} \]
It should be noted that, for a fixed l', the lifetime of the state $| n l m \rangle$ is independent of the value of m. This is expected since the choice of the quantization direction is completely arbitrary.
There are no selection rules associated with the radial integration in the dipole matrix elements
\[ \int_0^{\infty} dr \; r^2 \, R_{n'l \pm 1} \; r \; R_{nl} \tag{837} \]
Zr
d 2 Rnl Rnl = 1.
a . The functions are normalized so that 0
n = 1
n = 2
l = 0
1
2
l = 0
l = 0
2
3
32
32
2
3
22
22
9
2
27
exp
exp
32
l = 2
l = 1
2 exp
l = 1
n = 3
exp
exp
exp
Table 3: Values of 
R
0
n, l
n0 , l 1
np
1s
28 n7 (n 1)2n5 (n + 1)2n5
2s
3s
2p
3p
3d
nd
nf
The radial part of the dipole matrix element can be expressed in terms of the
hypergeometric function F (a, b, c) via
Z
dr r2 Rn0 l1 r Rnl
0
s
0
0
(n + l)!(n0 + l 1)! (4n0 n)l+1 (n n0 )n +n2l2
a (1)n l
=
4(2l 1)
(n l 1)!(n0 l)!
(n0 + n)n0 +n
0
2
4n0 n
n n
4n0 n
0
0
F (l + 1 n, l n , 2l, 0
)
F (l + 1 n, l n , 0
)
(n n)2
n0 + n
(n n)2
(838)
Simple analytic expressions for the squares of the matrix elements for small values of (n', l) are shown in Table(3).
The radial integrations were evaluated by Schrödinger$^{40}$ using the generating function expansion for the Laguerre polynomials. Eckart$^{41}$ and Gordon$^{42}$ have calculated these dipole matrix elements by other means. In general, the lifetime of the hydrogenic states increases with increasing n, varying roughly as $n^3$ for a fixed value of l. The decrease in the dipole matrix elements with increasing n is simply due to the increasing numbers of nodes in the radial wave functions.
40 E. Schrödinger, Ann. der Phys. 79, 362 (1926).
41 C. Eckart, Phys. Rev. 28, 927 (1926).
42 W. Gordon, Ann. der Phys. 2, 1031 (1929).
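The magnitude of the decay rate is estimated as
\[ \frac{1}{\tau} \sim \frac{c}{a} \left( \frac{\omega \, a}{c} \right)^3 \left( \frac{e^2}{\hbar c} \right) \tag{839} \]
Since the transition energy is of the order of $\hbar \omega \sim e^2 / a$, one has
\[ \frac{\omega \, a}{c} \sim \frac{e^2}{\hbar c} \tag{840} \]
where the length scale a has dropped out. Hence, as
\[ \frac{e^2}{\hbar c} \sim \frac{1}{137.0359979} \tag{841} \]
one finds
\[ \frac{1}{\tau} \sim \frac{c}{a} \left( \frac{e^2}{\hbar c} \right)^4 \tag{842} \]
so the decay time is approximately eight orders of magnitude larger than the time taken for the photon to cross the atom. When averaged over l, the electric dipole decay rate is given by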
X (2l + 1)
9
1
n 2
n
n5
(843)
so, as seen in Table(4), the decay is slower for the higher energy levels.
11.1.6
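Consider the decay of the 2p state (with $m = 0$) to the 1s state in the hydrogen
atom. As can be seen from Table(2), the initial state is described by an electronic
wave function
\[
\psi_{2p}(\mathbf{r}) = \frac{1}{2 \sqrt{6 a^3}} \left( \frac{r}{a} \right) \exp\left[ - \frac{r}{2a} \right] \sqrt{\frac{3}{4\pi}} \cos\theta
\tag{844}
\]
and the final state electronic wave function is given by
\[
\psi_{1s}(\mathbf{r}) = \frac{2}{\sqrt{a^3}} \exp\left[ - \frac{r}{a} \right] \frac{1}{\sqrt{4\pi}}
\tag{845}
\]
where
\[
a = \frac{\hbar^2}{m e^2}
\tag{846}
\]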
Table 4: Electric Dipole Transition Rates for Hydrogen, in units of $10^8$ sec$^{-1}$.

Initial   Final   n = 1   n = 2   n = 3
2p        ns      6.25
3s        np      -       0.063
3p        ns      1.64    0.22
3d        np      -       0.64
4s        np      -       0.025   0.018
4p        ns      0.68    0.095   0.030
4p        nd      -       -       0.003
4d        np      -       0.204   0.070
4f        nd      -       -       0.137
The decay rate in the Fermi-Golden rule, evaluated in the dipole approximation,
is given by
\[
\frac{1}{\tau} = \frac{4 \, \omega_{1,2}^3}{3 \hbar c^3} \; \mathbf{d}_{1s,2p} \cdot \mathbf{d}^*_{1s,2p}
\tag{847}
\]
The frequency is evaluated from
\[
\hbar \omega_{12} = E_{2p} - E_{1s}
= \frac{m e^4}{2 \hbar^2} \left( 1 - \frac{1}{4} \right)
= \frac{3}{8} \frac{e^2}{a}
\tag{848}
\]
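Hence,
\[
\frac{1}{\tau}
= \frac{4}{3} \left( \frac{3}{8} \right)^3 \left( \frac{e^2}{\hbar a} \right)^3 \frac{e^2 a^2}{\hbar c^3} \left| \frac{d_{1s,2p}}{e a} \right|^2
= \frac{9}{128} \left( \frac{e^2}{\hbar c} \right)^4 \frac{c}{a} \left| \frac{d_{1s,2p}}{e a} \right|^2
\tag{849}
\]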
Therefore, the decay rate is determined by the ratio $c/a$ but is also modified
by the fourth power of the dimensionless electromagnetic coupling strength
\[
\frac{e^2}{\hbar c} \approx \frac{1}{137.0359979}
\tag{850}
\]
The smallness of this factor allows us to only consider the Fermi-Golden rule
expression for the decay rate. The dimensionless dipole matrix elements are
expected to be non-zero, since they obey the selection rules. They are non-zero,
as can be directly verified by performing an integration. The only non-zero
dipole matrix element originates from the z-component of the dipole
\[
\mathbf{d}_{1s,2p} = e \int d^3r \; \psi^*_{1s}(\mathbf{r}) \; \mathbf{r} \; \psi_{2p}(\mathbf{r})
\tag{851}
\]
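since only the z-component satisfies the $\Delta m = 0$ selection rule. The angular
integration is evaluated as
\[
\int_0^{2\pi} d\varphi \int_0^{\pi} d\theta \, \sin\theta \; \frac{\sqrt{3}}{4\pi} \cos^2\theta
= \frac{\sqrt{3}}{4\pi} \; 2\pi \; \frac{2}{3}
= \frac{1}{\sqrt{3}}
\tag{852}
\]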
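and the radial integration yields
\[
\int_0^\infty dr \, r^2 \, R^*_{1s}(r) \, r \, R_{2p}(r)
= \frac{1}{\sqrt{6}} \int_0^\infty dr \, \frac{r^4}{a^4} \exp\left[ - \frac{3 r}{2 a} \right]
= \frac{a}{\sqrt{6}} \left( \frac{2}{3} \right)^5 \int_0^\infty dx \, x^4 \exp[-x]
= \frac{4!}{\sqrt{6}} \left( \frac{2}{3} \right)^5 a
= 4 \sqrt{6} \left( \frac{2}{3} \right)^5 a
\tag{853}
\]
so that
\[
\frac{d_{1s,2p}}{e a} = 4 \sqrt{2} \left( \frac{2}{3} \right)^5
\tag{854}
\]
Therefore, the dipole allowed decay rate is given by
\[
\frac{1}{\tau} = \left( \frac{2}{3} \right)^8 \left( \frac{e^2}{\hbar c} \right)^4 \frac{c}{a}
\tag{855}
\]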
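The angular and radial factors quoted above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes units in which the Bohr radius is $a = 1$, uses the standard hydrogen radial functions for the 1s and 2p states, and integrates with a hand-written composite Simpson rule (the helper `simpson` and the cutoff radius of 60 are choices made here for illustration).

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] using an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Hydrogen radial functions in units where the Bohr radius a = 1.
def R_1s(r):
    return 2.0 * math.exp(-r)

def R_2p(r):
    return r * math.exp(-r / 2) / (2 * math.sqrt(6))

# Angular factor: integral over the sphere of Y_00 * cos(theta) * Y_10.
angular = 2 * math.pi * simpson(
    lambda t: math.sin(t) * math.sqrt(3) / (4 * math.pi) * math.cos(t) ** 2,
    0.0, math.pi)

# Radial factor: integral of r^2 R_1s r R_2p, truncated at r = 60 (assumed cutoff).
radial = simpson(lambda r: r ** 3 * R_1s(r) * R_2p(r), 0.0, 60.0, n=20000)

# Dimensionless dipole matrix element d_{1s,2p} / (e a).
dipole = angular * radial
print(angular, radial, dipole)
```

The three printed numbers can be compared with $1/\sqrt{3}$, $4\sqrt{6}\,(2/3)^5$ and $4\sqrt{2}\,(2/3)^5$ respectively.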
Hence, the time scale is of the order of $10^{-10}$ seconds. The exact value of the
decay time is calculated to be $1.6 \times 10^{-9}$ seconds.
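As an illustration, the closed-form rate $(2/3)^8 (e^2/\hbar c)^4 (c/a)$ can be evaluated directly. This is a sketch under stated assumptions: the numerical constants below (fine-structure constant, speed of light, Bohr radius in SI units) are standard reference values supplied here, not taken from the text.

```python
# Assumed reference constants (SI units); not quoted in the original text.
ALPHA = 7.2973525693e-3      # e^2 / (hbar c), dimensionless
C = 2.99792458e8             # speed of light, m/s
A_BOHR = 5.29177210903e-11   # Bohr radius, m

def lyman_alpha_rate():
    """Decay rate of the 2p state to the 1s state: (2/3)^8 alpha^4 c / a."""
    return (2 / 3) ** 8 * ALPHA ** 4 * C / A_BOHR

rate = lyman_alpha_rate()    # in 1/s
lifetime = 1.0 / rate        # in s
print(rate, lifetime)
```

The rate comes out at a few times $10^8$ sec$^{-1}$, consistent with the 2p entry of Table(4) and with the quoted decay time of $1.6 \times 10^{-9}$ seconds.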
11.1.7
Consider decays such as the 3d state (with $m = 0$) to the 1s state in the hydrogen
atom. Since, in this transition, the change in the electron's angular momentum
is two units, the transition is forbidden in the dipole approximation. Therefore,
the transition rate is evaluated by keeping the next order term in the expansion
\[
\exp[ - i \, \mathbf{k} \cdot \mathbf{r} ] \approx 1 - i \, \mathbf{k} \cdot \mathbf{r} + \ldots
\tag{856}
\]
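The second term in the expansion describes electric quadrupole and magnetic
dipole transitions.
The matrix elements that have to be evaluated are of the form
\[
\langle n'l'm' | \, ( \mathbf{k} \cdot \mathbf{r} ) \, ( \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} ) \, | nlm \rangle
\tag{857}
\]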
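This shall be written as the sum of two terms, with different symmetries with respect to interchange of $\mathbf{r}$ and $\hat{\mathbf{p}}$. These two terms will describe electric quadrupole
and magnetic dipole transitions. The matrix elements are written as the sum
of a term symmetric under the interchange of $\mathbf{r}$ and $\hat{\mathbf{p}}$ and a term that is antisymmetric
\[
( \mathbf{k} \cdot \mathbf{r} ) ( \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} )
= \frac{1}{2} \left[ ( \mathbf{k} \cdot \mathbf{r} ) ( \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} ) + ( \mathbf{k} \cdot \hat{\mathbf{p}} ) ( \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{r} ) \right]
+ \frac{1}{2} \left[ ( \mathbf{k} \cdot \mathbf{r} ) ( \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} ) - ( \mathbf{k} \cdot \hat{\mathbf{p}} ) ( \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{r} ) \right]
\tag{858}
\]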
The first term represents the matrix elements for the electric quadrupole transitions$^{43}$, and the second term represents the matrix elements for the magnetic
dipole transitions. The first term can be written as the scalar products of a
symmetric dyadic
\[
\mathbf{k} \cdot \left( \mathbf{r} \, \hat{\mathbf{p}} + \hat{\mathbf{p}} \, \mathbf{r} \right) \cdot \hat{\epsilon}_{\alpha}(\mathbf{k})
\tag{859}
\]
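The scalar products are organized such that the left-most vector outside the
parenthesis forms a scalar product with the left-most vectors within the parenthesis, and likewise with the right-most vectors. The electronic matrix elements
only involve the dyadic operator, as the wave vector and polarization vectors
are properties of the photon. The matrix elements
\[
\langle n'l'm' | \left( \mathbf{r} \, \hat{\mathbf{p}} + \hat{\mathbf{p}} \, \mathbf{r} \right) | nlm \rangle
\tag{860}
\]
are evaluated by first noting that
\[
[ \, \mathbf{r} \, , \, \hat{\mathbf{p}}^2 \, ] = 2 \, i \, \hbar \, \hat{\mathbf{p}}
\tag{861}
\]
so that
\[
\hat{\mathbf{p}} = \frac{i m}{\hbar} \, [ \, \hat{H} \, , \, \mathbf{r} \, ]
\tag{862}
\]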
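Therefore, the matrix elements of the dyadic can be expressed in the form of
the matrix elements of the commutator with the dyadic
\[
\begin{aligned}
\langle n'l'm' | \left( \mathbf{r} \, \hat{\mathbf{p}} + \hat{\mathbf{p}} \, \mathbf{r} \right) | nlm \rangle
&= \frac{i m}{\hbar} \, \langle n'l'm' | \; \mathbf{r} \, [ \hat{H} , \mathbf{r} ] + [ \hat{H} , \mathbf{r} ] \, \mathbf{r} \; | nlm \rangle \\
&= \frac{i m}{\hbar} \, \langle n'l'm' | \, [ \hat{H} , \mathbf{r} \, \mathbf{r} ] \, | nlm \rangle \\
&= i m \, \frac{( E_{n'l'm'} - E_{nlm} )}{\hbar} \, \langle n'l'm' | \, \mathbf{r} \, \mathbf{r} \, | nlm \rangle \\
&= i \, m \, \omega_{n',n} \, \langle n'l'm' | \, \mathbf{r} \, \mathbf{r} \, | nlm \rangle
\end{aligned}
\tag{863}
\]

43 J. A. Gaunt and W. H. McCrea, Proc. Camb. Phil. Soc. 23, 930 (1927).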
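The decay rate in the Fermi-Golden rule, evaluated in the electric quadrupole
approximation, is given by
\[
\begin{aligned}
\frac{1}{\tau}
&= \left( \frac{e}{m c} \right)^2 \int d^3k \; \frac{m^2 c^2 \omega_k^2}{8 \pi \, \omega_k} \sum_{\alpha}
| \, \mathbf{k} \cdot \langle n'l'm' | \, \mathbf{r} \, \mathbf{r} \, | nlm \rangle \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, |^2 \;
\delta( E_{nlm} - E_{n'l'm'} - \hbar \omega_k ) \\
&= \frac{e^2}{8 \pi \hbar} \left( \frac{\omega_{nl,n'l'}}{c} \right)^5 \int d\Omega_k \sum_{\alpha}
| \, \hat{\mathbf{k}} \cdot \langle n'l'm' | \, \mathbf{r} \, \mathbf{r} \, | nlm \rangle \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, |^2
\end{aligned}
\tag{864}
\]
where
\[
k = \frac{\omega_{nl,n'l'}}{c}
\tag{865}
\]
and
\[
\hbar \omega_{nl,n'l'} = E_{nl} - E_{n'l'} \sim \frac{m e^4}{\hbar^2} = \frac{e^2}{a} = m c^2 \left( \frac{e^2}{\hbar c} \right)^2
\tag{866}
\]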
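Hence,
\[
\frac{1}{\tau} \sim \frac{e^2}{8 \pi \hbar} \left( \frac{e^2}{\hbar a c} \right)^5 \int d\Omega_k \sum_{\alpha}
| \, \hat{\mathbf{k}} \cdot \langle n'l'm' | \, \mathbf{r} \, \mathbf{r} \, | nlm \rangle \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, |^2
\sim \left( \frac{e^2}{\hbar c} \right)^6 \frac{c}{a}
\tag{867}
\]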
Therefore, the decay rate is determined by the ratio $c/a$ but is also modified
by the sixth power of the dimensionless electromagnetic coupling strength
\[
\frac{e^2}{\hbar c} \approx \frac{1}{137.0359979}
\tag{868}
\]
The smallness of this factor allows us to only consider the Fermi-Golden rule
expression for the decay rate. The dimensionless quadrupole matrix elements
are expected to be non-zero, since they obey the selection rules which involve
the exchange of two units of angular momentum. They are non-zero, as can
be directly verified by performing an integration. Therefore, the quadrupole
allowed decay rate is given by
\[
\frac{1}{\tau} \sim \left( \frac{e^2}{\hbar c} \right)^6 \frac{c}{a}
\tag{869}
\]
Hence, the time scale is expected to be of the order of $10^{-6}$ seconds$^{44}$.
This type of transition is known as an electric quadrupole transition. Because
of the transversality condition
\[
\mathbf{k} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) = 0
\tag{870}
\]
one can add a diagonal term to the dyadic without affecting the result. A
diagonal term with a magnitude that makes the resulting dyadic traceless is
added to the dyadic, leading to the expression
\[
Q_{i,j} = e \left( x_i \, x_j - \frac{1}{3} \, \delta_{i,j} \, | \mathbf{r} |^2 \right)
\tag{871}
\]
Therefore, the transition rate is governed by the electric quadrupole tensor
\[
\langle n'l'm' | \, Q_{i,j} \, | nlm \rangle
\tag{872}
\]
The symmetric dyadic $Q_{i,j}$ has six inequivalent components, which because of
the restriction that the dyadic is traceless, can be reduced to five independent
components. Due to the transformational properties of the dyadic under rotation, it can be expressed as a linear combination of the spherical harmonics
$Y^2_m(\theta, \varphi)$ and nothing else$^{45}$. This can be seen from rewriting the quadrupole
tensor $\hat{Q}$
\[
\frac{\hat{Q}}{e} =
\left( \begin{array}{ccc}
xx - \frac{r^2}{3} & xy & xz \\
yx & yy - \frac{r^2}{3} & yz \\
zx & zy & zz - \frac{r^2}{3}
\end{array} \right)
\tag{873}
\]
in terms of spherical polar coordinates
1
2
2
1
1
2
(876)
= < n0  P r r P 1  n >
= pn 0 pn < n0  r r  n >
(877)
(878)
(879)
(880)
(881)
is the orbital angular momentum. The orbital angular momentum produces the
orbital magnetic moment given by
\[
\hat{\mu} = \frac{e}{2 m c} \, ( \mathbf{r} \times \hat{\mathbf{p}} )
\tag{882}
\]
The magnetic dipole transition should be extended from orbital angular momentum to include the spin magnetic moment which is of the same order
e h
. ( k (k) )
2mc
e
(883)
( r p ) . ( k (k) )
2mc
(884)
1  m 
(885)
and
also parity does not change.
Terms with higher-order orbital angular momentum that occur in the expansion of the photon's wave function $\exp[ i \, \mathbf{k} \cdot \mathbf{r} ]$ can be found by using the
Rayleigh expansion. The terms with orbital angular momentum $l$ are proportional to the spherical Bessel functions $j_l(kr)$ which vary as $(kr)^l$ when $kr \rightarrow 0$, as is
found from the expansion of the exponential term. The presence of the extra
factors $k^l$ in the matrix element has the result that the electric $2^s$-th multipole
transition rates are found to vary as
\[
\frac{1}{\tau} \propto \left( \frac{\omega_{n,n'}}{c} \right)^{2s+1} a^{2s}
\tag{886}
\]
where $s$ is the magnitude of the change in the electronic orbital angular momentum, which satisfies the inequality
\[
( l + l' ) \geq s \geq | \, l' - l \, |
\tag{887}
\]
The extra factors from the photon's angular momentum result in an overall decrease in the electric multipole transition rate by a factor of $( e^2 / \hbar c )^{2s}$. It should
also be noted that the relative strength of the higher-order electric multipole
transitions increases more rapidly with $Z$ than the electric dipole transitions.
Therefore, it is frequently found that the quadrupole transitions cannot be neglected for the heavy elements. Alternatively, higher-order multipole transitions
do become important in the x-ray region, since in this region the wavelength of
the radiation is comparable to the spatial extent of the charged particle's wave
function.
11.1.8
 3d > . (k) 2
(888)
=
dk
 k . < 1s  Q
8
h c
4
e
=
(889)
c
9 a h c
e2 a4
(890)
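We shall consider the transition from the $m = 0$ state of the 3d level to the 1s
state. As can be easily shown, the matrix elements of the quadrupole tensor for this
transition are diagonal and are given by
\[
\langle 1s | \, \hat{Q} \, | 3d \rangle = \langle 1s |
\left( \begin{array}{ccc}
- \frac{Q_{zz}}{2} & 0 & 0 \\
0 & - \frac{Q_{zz}}{2} & 0 \\
0 & 0 & Q_{zz}
\end{array} \right)
| 3d \rangle
\tag{891}
\]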
c
1
=
8a
2
Z
X  < 1s  Qzz  3d > 2
1
1
=
dk
kz (k)z
kx (k)x
ky (k)y
e2 a4
2
2
(892)
The direction of the emitted photon $\hat{\mathbf{k}}$ is expressed as
\[
\hat{\mathbf{k}} = ( \sin\theta_k \cos\varphi_k \, , \, \sin\theta_k \sin\varphi_k \, , \, \cos\theta_k )
\tag{893}
\]
=
=
(894)
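Thus for the $m = 0$ level, one finds that the integral over the angular distribution
is given by
\[
\begin{aligned}
\int d\Omega_k \sum_{\alpha} \frac{1}{e^2 a^4} \, | \, \hat{\mathbf{k}} \cdot \langle 1s | \, \hat{Q} \, | 3d \rangle \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, |^2
&= \frac{| \langle 1s | \, Q_{zz} \, | 3d \rangle |^2}{e^2 a^4} \int d\Omega_k \left| \frac{3}{2} \sin\theta_k \cos\theta_k \right|^2 \\
&= 4 \pi \, \frac{3}{10} \, \frac{| \langle 1s | \, Q_{zz} \, | 3d \rangle |^2}{e^2 a^4}
\end{aligned}
\tag{895}
\]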
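The decay rate becomes
\[
\frac{1}{\tau} = \frac{3}{20} \, \frac{c}{a} \left( \frac{4}{9} \right)^5 \left( \frac{e^2}{\hbar c} \right)^6 \frac{| \langle 1s | \, Q_{zz} \, | 3d \rangle |^2}{e^2 a^4}
\tag{896}
\]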
dr r2 R1s
(r) 2 R3d (r)
a
0
r
92
1
4
3
=
4
(897)
3
5
4
Finally, one finds the resulting expression for the quadrupole decay rate of the
3d state with $m = 0$
\[
\frac{1}{\tau} = \frac{1}{3600} \, \frac{c}{a} \left( \frac{e^2}{\hbar c} \right)^6
\tag{898}
\]
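which is evaluated as 228 sec$^{-1}$. From the above analysis, it is seen that the angular
distribution for the emitted photon is governed by the factor
\[
\cos^2\theta_k \, \sin^2\theta_k
\tag{899}
\]
and the intensity is largest for the cone with $\theta_k \approx 0.28 \, \pi$ or $0.72 \, \pi$. This angular
distribution for $m = 0$ is given by
\[
\left. \frac{dP}{d\Omega_k} \right|_{quad} = \frac{\omega^6}{288 \, \pi \, c^5} \; Q^2 \cos^2\theta_k \, \sin^2\theta_k
\tag{900}
\]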
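The quoted cone angles can be reproduced numerically. The sketch below rests on an assumption made here, not stated in the text: that the angles are given in units of $\pi$ and that the intensity emitted into the cone at polar angle $\theta_k$ carries the extra solid-angle weight $\sin\theta_k$, so that the quantity being maximized is $\cos^2\theta_k \sin^3\theta_k$. (The factor $\cos^2\theta_k \sin^2\theta_k$ on its own peaks at $0.25 \, \pi$.)

```python
import math

def cone_weight(theta):
    """cos^2 sin^2 angular factor times the assumed solid-angle weight sin."""
    return math.cos(theta) ** 2 * math.sin(theta) ** 3

# Scan the forward hemisphere on a fine grid; the factor is symmetric about pi/2.
N = 100_000
grid = [0.5 * math.pi * i / N for i in range(N + 1)]
theta_max = max(grid, key=cone_weight)

peak_fraction = theta_max / math.pi                       # numerical peak, in units of pi
analytic_fraction = math.atan(math.sqrt(1.5)) / math.pi   # from tan^2(theta) = 3/2
print(peak_fraction, 1.0 - peak_fraction, analytic_fraction)
```

Setting the derivative of $\cos^2\theta \sin^3\theta$ to zero gives $\tan^2\theta = 3/2$, which is the analytic value the grid search is compared against.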
The 3d $\rightarrow$ 1s quadrupole decay rate will be dwarfed by dipole allowed cascade emission processes such as 3d $\rightarrow$ 2p followed by 2p $\rightarrow$ 1s. Therefore, the
intensity of the emitted light corresponding to the 3d $\rightarrow$ 1s process is expected
to be extremely weak. However, the quadrupole line is expected to be much
more readily observed in absorption spectra.
11.1.9
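The 2s state of the hydrogen atom cannot decay via the paramagnetic interaction since, as can be shown, the matrix elements that govern the emission
intensity vanish
\[
\langle 1s | \; \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} \; \exp[ - i \, \mathbf{k} \cdot \mathbf{r} ] \; | 2s \rangle = 0
\tag{902}
\]
First, on integrating by parts, the matrix elements can be written as
\[
- i \hbar \, \langle 1s | \; \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \nabla \; \exp[ - i \, \mathbf{k} \cdot \mathbf{r} ] \; | 2s \rangle
= + i \hbar \, \langle 2s | \; \exp[ + i \, \mathbf{k} \cdot \mathbf{r} ] \; \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \nabla \; | 1s \rangle^*
\tag{903}
\]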
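On utilizing the expression for the 1s wave function
\[
\psi_{1s}(\mathbf{r}) = \frac{1}{\sqrt{\pi a^3}} \exp\left[ - \frac{r}{a} \right]
\tag{904}
\]
one finds
\[
\nabla \psi_{1s}(\mathbf{r}) = - \frac{\mathbf{r}}{r \, a} \; \psi_{1s}(\mathbf{r})
\tag{905}
\]
Hence, the transition matrix element is given by
\[
- \frac{i \hbar}{a} \int d^3r \; \psi_{2s}(\mathbf{r}) \, \exp[ - i \, \mathbf{k} \cdot \mathbf{r} ] \; \frac{\hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{r}}{r} \; \psi_{1s}(\mathbf{r})
\tag{906}
\]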
(907)
the factor $\hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{r}$ only depends on $x$ and $y$ and is antisymmetric with respect
to the transformations $x \rightarrow - x$ and $y \rightarrow - y$. All other factors are even
functions of $x$ and $y$. On integrating over the directions in the $x$-$y$ plane, one
finds that the integral is identically zero.
The above result could have been (partially) anticipated by considering the
selection rules. The electric dipole transition is forbidden by parity. The magnetic dipole transition is zero in this non-relativistic treatment. All magnetic
and electric quadrupole and higher multipole transitions are forbidden by angular momentum conservation.
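The 2s state decays via two-photon emission which is described by the diamagnetic interaction and by the effect of the paramagnetic interaction taken to
second-order in time-dependent perturbation theory. Since only the part of the
paramagnetic interaction that creates a photon is involved, for our purposes the
paramagnetic interaction can be replaced by
\[
\hat{H}_{para} \rightarrow - \frac{q}{m c} \sum_{\mathbf{k},\alpha} \left( \frac{2 \pi \hbar c^2}{V \omega_k} \right)^{\frac{1}{2}}
\hat{\mathbf{p}} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \; a^{\dagger}_{\mathbf{k},\alpha} \, \exp[ - i \, \mathbf{k} \cdot \mathbf{r} ]
\tag{908}
\]
Likewise, the diamagnetic interaction can be replaced by
\[
\hat{H}_{dia} \rightarrow \frac{q^2}{2 m c^2} \sum_{\mathbf{k},\alpha ; \mathbf{k}',\alpha'} \frac{2 \pi \hbar c^2}{V} \;
\frac{\hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\epsilon}_{\alpha'}(\mathbf{k}')}{\sqrt{\omega_k \, \omega_{k'}}} \;
a^{\dagger}_{\mathbf{k},\alpha} \, a^{\dagger}_{\mathbf{k}',\alpha'} \, \exp[ - i \, ( \mathbf{k} + \mathbf{k}' ) \cdot \mathbf{r} ]
\tag{909}
\]
for two-photon emission. The system is assumed to be initially in an eigenstate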
where $C_{n'}(t)$ are time-dependent coefficients. The probability of finding the
system in the final state $| n' \rangle$ at time $t$ is then given by $| C_{n'}(t) |^2$. The rate
at which the transition $n \rightarrow n'$ occurs is then given by the time-derivative of
$| C_{n'}(t) |^2$.
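It shall be assumed that the interaction is slowly turned on when $t \rightarrow - \infty$.
The interaction can be reduced to zero at large negative times by introducing a
multiplicative factor of $\exp[ + \eta t ]$ in the interaction, where $\eta$ is an infinitesimally small positive constant. To first-order in the diamagnetic interaction, one
finds
\[
C^{(1)}_{n'}(t) = - \frac{i}{\hbar} \, \frac{q^2}{2 m c^2} \, \frac{2 \pi \hbar c^2}{\sqrt{\omega_k \, \omega_{k'}} \, V} \;
\langle n' | \exp[ - i \, ( \mathbf{k} + \mathbf{k}' ) \cdot \mathbf{r} ] | n \rangle \;
2 \, \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\epsilon}_{\alpha'}(\mathbf{k}')
\int_{-\infty}^{t} dt' \, \exp\left[ \frac{i}{\hbar} ( \hbar \omega' + \hbar \omega + E_{n'} - E_n ) \, t' \right]
\tag{911}
\]
where $\hbar \omega = \hbar c k$ and $\hbar \omega' = \hbar c k'$ are the energies of the two photons in the final
state. The small quantity $\eta$ has been absorbed as a small imaginary part to the
initial state energy
\[
E_n \rightarrow E_n + i \, \eta \, \hbar
\tag{912}
\]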
The paramagnetic interaction is of order $q$ and the diamagnetic interaction
is of order $q^2$. Thus, to second-order in $q$, one must include the diamagnetic
interaction and the paramagnetic interaction to second-order. There are two
second-order terms which represent:
i
=
h
X
2
exp[
n00
q
mc
2
2 h c2
k k 0 V
Z
t
0
t0
dt00
dt
i
i
(En0 + h 0 En00 ) t0 ] exp[ (En00 + h En ) t00 ]
< n0 l0 m0  0 (k 0 ) . p
 n00 l00 m00 > < n00 l00 m00  (k) . p
 nlm >
X
i
i
+
exp[ (En0 + h En00 ) t0 ] exp[ (En00 + h
0 En ) t00 ]
h
h
00
n
< n0 l0 m0  (k) . p
 n00 l00 m00 > < n00 l00 m00  0 (k 0 ) . p
 nlm >
(913)
The earliest time integration can be evaluated leading to
(2)
Cn0 (t)
=
i
h
q
mc
2
2 h c2
k k 0 V
Z
dt0
X
n00
exp[
i
(
h 0 + En0 En h) t0 ]
h
< n0 l0 m0  0 (k ) . p
 n00 l00 m00 > < n00 l00 m00  (k) . p
 nlm >
( En En00 h )
The coefficients $C^{(1)}_{n'}(t)$ and $C^{(2)}_{n'}(t)$ have the same type of time-dependence.
(914)
exp[
(
h 0 + h
+ En0 En ih) t ]
(
h 0 + h + En0 En ih)
i
h
(915)
(1)
C 0 (t) + C (2)
n0 (t)
n
2
(916)
t
exp
2
exp[ hi (
h 0 +
h + En0 En ih) t ]
= (
0
(
h + h
+ En0 En ih)
h 0 + h + En0 En )2 + h2 2
(917)
Since the momenta and polarizations of the emitted photons are not measured,
the rate is summed over $(\mathbf{k}, \alpha)$ and $(\mathbf{k}', \alpha')$. Therefore, the transition rate is
given by the expression
2 exp 2 t
X
1
2
=
(918)
2 2 M
0+
2 +
0
(
h
+
E
E
)
h
n
n
k,:k0 ,0
where the matrix elements $M$ are due to the combined effect of the diamagnetic
interaction and the paramagnetic interaction taken to second-order. That is,
M
dia  2s >
< 1s k, k 0 0  H
X < 1s k , k 0 0  H
para  n00 l00 m00 k 0 0 > < n00 l00 m00 k 0 0  H
para  2s >
+
E2s En00 l00 m00 h
k 0
00 00
00
n l m
X
n00 l00 m00
para  2s >
para  n00 l00 m00 k > < n00 l00 m00 k  H
< 1s k , k 0 0  H
00
00
00
E2s En l m h
k
(919)
These three terms add coherently, and it should be noted that the intermediate
state is only a virtual state and it can have a higher energy than the 2s state$^{46}$.
In the limit $\eta \rightarrow 0$ the first term in the expression for the transition rate
of eqn(918) reduces to a delta function which expresses conservation of energy

46 Due to the Lamb shift, there is a 2p state with slightly lower energy than the 2s state.
However, due to the small magnitude of the energy difference, the part of the decay process
involving any real 2p transition is negligibly small.
(921)
k,:k0 ,0
The emitted photons have continuous spectra. In the expression for the matrix
elements $M$, the last two terms differ in the time-order that the two photons
are emitted. On inserting the expressions for the interactions into $M$, one can
pull out the common factors leaving a dimensionless matrix element $M'$. This
leads to the expression
\[
M = \frac{q^2}{2 m c^2} \, \frac{2 \pi \hbar c^2}{V} \, \frac{1}{\sqrt{\omega_k \, \omega_{k'}}} \; M'
\tag{922}
\]
where $M'$ is the dimensionless factor given by
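\[
\begin{aligned}
M' = \; & \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\epsilon}_{\alpha'}(\mathbf{k}') \; \langle 1s | \exp[ - i \, ( \mathbf{k} + \mathbf{k}' ) \cdot \mathbf{r} ] | 2s \rangle \\
& + \frac{2}{m} \sum_{n''l''m''} \frac{ \langle 1s | \, \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} \, \exp[ - i \, \mathbf{k} \cdot \mathbf{r} ] \, | n''l''m'' \rangle \langle n''l''m'' | \, \hat{\epsilon}_{\alpha'}(\mathbf{k}') \cdot \hat{\mathbf{p}} \, \exp[ - i \, \mathbf{k}' \cdot \mathbf{r} ] \, | 2s \rangle }{ E_{2s} - E_{n''l''m''} - \hbar \omega_{k'} } \\
& + \frac{2}{m} \sum_{n''l''m''} \frac{ \langle 1s | \, \hat{\epsilon}_{\alpha'}(\mathbf{k}') \cdot \hat{\mathbf{p}} \, \exp[ - i \, \mathbf{k}' \cdot \mathbf{r} ] \, | n''l''m'' \rangle \langle n''l''m'' | \, \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\mathbf{p}} \, \exp[ - i \, \mathbf{k} \cdot \mathbf{r} ] \, | 2s \rangle }{ E_{2s} - E_{n''l''m''} - \hbar \omega_{k} }
\end{aligned}
\tag{923}
\]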
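The first term is negligible, since $1 \gg \mathbf{k} \cdot \mathbf{r}$ and the electronic eigenstates
are orthogonal. The order of magnitude of the second term is given by the
electronic kinetic energy divided by the excitation energy. Hence, the reduced
matrix elements have a magnitude of the order of unity. The transition rate is
given by
\[
\frac{1}{\tau} = \frac{( 2 \pi )^3}{\hbar} \left( \frac{e^2}{2 m c^2} \right)^2 \sum_{\mathbf{k},\alpha ; \mathbf{k}',\alpha'}
\left( \frac{\hbar c^2}{V} \right)^2 \frac{1}{\omega_k \, \omega_{k'}} \; | M' |^2 \; \delta( E_{2s} - E_{1s} - \hbar \omega_k - \hbar \omega_{k'} )
\tag{924}
\]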
One can assume that the dipole matrix elements of the intermediate states
should be randomly oriented in space, since the initial and final electronic states
are isotropic. After summing over the polarizations, the transition rate becomes
isotropic. On setting
\[
\sum_{\alpha,\alpha'} | \, M' \, |^2 \approx 1
\tag{925}
\]
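one finds
\[
\begin{aligned}
\frac{1}{\tau}
&= \frac{1}{\hbar} \left( \frac{e^2}{2 m c^2} \right)^2 \frac{( \hbar c^2 )^2}{( 2 \pi )^3} \int \frac{d^3k}{\omega_k} \int \frac{d^3k'}{\omega_{k'}} \;
\delta( E_{2s} - E_{1s} - \hbar \omega_k - \hbar \omega_{k'} ) \qquad (926) \\
&= \left( \frac{e^2}{m c^2} \right)^2 \frac{\hbar c^2}{2 \pi} \int_0^\infty dk \, k \int_0^\infty dk' \, k' \;
\delta( E_{2s} - E_{1s} - \hbar c k - \hbar c k' ) \\
&= \left( \frac{e^2}{m c^2} \right)^2 \frac{c}{2 \pi} \int_0^{\omega_{12}/c} dk \, k \left( \frac{\omega_{12}}{c} - k \right) \qquad (928)
\end{aligned}
\]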
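where $\omega_{12}$ is related to the energy difference of the 1s and 2s states. An elementary integration yields
\[
\frac{1}{\tau} = \left( \frac{e^2}{m c^2} \right)^2 \frac{c}{12 \pi} \left( \frac{\omega_{12}}{c} \right)^3
\tag{929}
\]
The first factor has dimensions of length squared and can be recognized as the
square of the classical radius of the electron. However, since
\[
\frac{\omega_{12}}{c} = \frac{3}{8} \, \frac{e^2}{\hbar c \, a}
\tag{930}
\]
and
\[
a = \frac{\hbar^2}{m e^2}
\tag{931}
\]
or
\[
\frac{e^2}{m c^2} = a \, \frac{e^4}{\hbar^2 c^2}
\tag{932}
\]
one finds the decay rate is approximated by
\[
\frac{1}{\tau} = \frac{1}{12 \pi} \left( \frac{3}{8} \right)^3 \frac{c}{a} \left( \frac{e^2}{\hbar c} \right)^7
\tag{933}
\]
Thus, the estimated decay rate is 8.75 sec$^{-1}$. The exact value calculated by
Shapiro and Breit$^{47}$ is 8.266 sec$^{-1}$.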
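The order-of-magnitude estimate for the two-photon decay of the 2s state can be evaluated numerically as a check on the arithmetic. This is a minimal sketch: the value of $e^2/\hbar c$ is the one quoted in the text, while the speed of light and the Bohr radius are standard SI reference values assumed here.

```python
import math

ALPHA = 1.0 / 137.0359979    # e^2 / (hbar c), value quoted in the text
C = 2.99792458e8             # speed of light in m/s (assumed reference value)
A_BOHR = 5.29177210903e-11   # Bohr radius in m (assumed reference value)

def two_photon_rate_estimate():
    """Estimated 2s -> 1s two-photon rate: (1/12 pi) (3/8)^3 (c/a) alpha^7."""
    return (3 / 8) ** 3 * ALPHA ** 7 * (C / A_BOHR) / (12 * math.pi)

rate_2s = two_photon_rate_estimate()   # in 1/s
print(rate_2s)
```

With these constants the result lies within about one percent of the estimate quoted above; the residual difference reflects the rounding of the constants used.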
47 J.
11.1.10
= nk, 1
n0k0 ,
= nk0 ,
(934)
2 h c2
q
0 0 0
< n l m  p . (k) exp + i k . r  nlm >
nk,
=
V k
mc
k,
(935)
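The photon absorption rate is found from the Fermi-Golden rule expression
\[
\frac{1}{\tau} = \frac{2 \pi}{\hbar} \left( \frac{q}{m c} \right)^2 \sum_{n'l'm'} \frac{2 \pi \hbar c^2}{V \omega_k} \; n_{\mathbf{k},\alpha} \;
\delta( E_{nlm} + \hbar \omega_k - E_{n'l'm'} ) \;
| \langle n'l'm' | \, \hat{\mathbf{p}} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, \exp[ + i \, \mathbf{k} \cdot \mathbf{r} ] \, | nlm \rangle |^2
\tag{936}
\]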
This is related to the lifetime due to stimulated emission, if the initial and final
states are interchanged.
The scattering cross-section for photon absorption $\sigma_{absorb}(\omega)$ is found by
relating the number of photons absorbed (per second) to the product of the
incident flux and the cross-section. The photon flux is given by the photon
density times the velocity of light
\[
\mathbf{j} = \frac{n_{\mathbf{k},\alpha}}{V} \; c \, \hat{\mathbf{k}}
\tag{937}
\]
2
X
V
2
q
2 h c2
absorb (k ) =
nk,
( Enlm + hk En0 l0 m0 )
h
mc
V k
nk, c
n0 l0 m0
0 0 0
 < n l m  p . (k) exp + i k . r  nlm > 2
(938)
which simplifies to
absorb (k )
4 2 e2
m2 k c
X
( Enlm + hk En0 l0 m0 )
n0 l0 m0
+ ik.r
 nlm > 2
(939)
158
The absorption cross-section is independent of the volume of the electromagnetic cavity and the number of photons in the incident beam. As a function of
frequency, the Born approximation for the cross-section for photon absorption
contains delta function lines corresponding to the atomic excitation energies.
Measured absorption lines do have natural widths $\Gamma_{nl,n'l'}$ and the absorption
spectra can be approximated by the sums of Lorentzian functions. The widths
[Figure: the absorption cross-section $\sigma(\omega)$ as a function of the photon energy $\hbar \omega$ in Rydbergs.]
48 Lines in the absorption spectra with weak intensities can be broad if the final states are
rapidly decaying.
49 V. F. Weisskopf and E. Wigner, Z. Physik, 63, 54 (1930), Zeit. für Physik, 65, 18 (1930).
(941)
0 0
(942)
For an isotropic medium, the electronic states are degenerate with respect to
the z-components of the orbital angular momentum, so the initial state $(n, l, m)$
should be averaged over the different values of $m$
\[
\frac{1}{(2l+1)} \sum_{m=-l}^{l}
\tag{943}
\]
and the values of $m'$ for the final states are summed over all possible values.
This averaging process results in an isotropic absorption rate, and is equivalent
to averaging the polarization vector over all directions in space. Therefore, in
the dipole approximation, the absorption cross-section for an isotropic medium
is given by the expression
\[
\sigma_{absorb}(\omega) = \frac{4 \pi^2}{3} \, \frac{e^2}{\hbar c} \sum_{n'l'm'} \omega_{n'l',nl} \;
| \langle n'l'm' | \, \mathbf{r} \, | nlm \rangle |^2 \; \delta( \omega - \omega_{n'l',nl} )
\tag{944}
\]
The strength of each absorption line can be found by integrating the cross-section over a narrow frequency range centered on the frequency of the absorption line. (More specifically, the width of the interval of integration must be
greater than the natural line-width.) The integrated intensity of the transition
$(nlm) \rightarrow (n'l'm')$ is given by
\[
\int_{\omega_{nl,n'l'} - \Delta}^{\omega_{nl,n'l'} + \Delta} d\omega \; \sigma_{absorb}(\omega)
= \frac{4 \pi^2}{3} \, \frac{e^2}{\hbar c} \; \omega_{n'l',nl} \; | \langle n'l'm' | \, \mathbf{r} \, | nlm \rangle |^2
\tag{945}
\]
The intensity of each line is proportional to the oscillator strength $f_{nl,n'l'}$
defined as
\[
f_{nl,n'l'} = \frac{2 m \, \omega_{n'l',nl}}{\hbar} \; | \langle n'l'm' | \, \mathbf{r} \, | nlm \rangle |^2
\tag{946}
\]
The intensities and the frequencies of all the transitions are related via sum
rules$^{50}$. These sum rules involve quantities of the form
\[
\sum_{n'l'm'} \omega^{p}_{n'l'm',nlm} \; | \langle n'l'm' | \, \mathbf{r} \, | nlm \rangle |^2
\tag{947}
\]
and have values given in the Table(5). The sum rules can be used to provide
checks of experimental data.

[Table 5: values of the sums in eqn(947).]
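As an illustration of how such a sum rule can be used as a check, the sketch below estimates how much of the $p = 1$ sum for the hydrogen 1s state is exhausted by the discrete lines $1s \rightarrow np$. It rests on assumptions introduced here: the squared radial integral is the 1s, np expression $2^8 n^7 (n-1)^{2n-5} / (n+1)^{2n+5}$ in units of $a^2$, the excitation energy is $(e^2/2a)(1 - 1/n^2)$, and the oscillator strength is normalized in the conventional way so that all transitions, including those to the continuum, add up to one (one third of the sum obtained with the vector form of eqn(946)). The function names are hypothetical.

```python
import math

def radial_sq_1s_np(n):
    """Squared 1s-np radial dipole integral in units of a^2 (assumed closed form)."""
    log_value = (8 * math.log(2) + 7 * math.log(n)
                 + (2 * n - 5) * math.log(n - 1)
                 - (2 * n + 5) * math.log(n + 1))
    return math.exp(log_value)

def oscillator_strength_1s_np(n):
    """Conventionally normalized f for 1s -> np: (1/3)(1 - 1/n^2) |R|^2 / a^2."""
    return (1.0 - 1.0 / n ** 2) * radial_sq_1s_np(n) / 3.0

# Sum over the discrete lines; terms fall off as 1/n^3, so the tail is negligible.
f_lyman_alpha = oscillator_strength_1s_np(2)
f_discrete = sum(oscillator_strength_1s_np(n) for n in range(2, 2001))
print(f_lyman_alpha, f_discrete, 1.0 - f_discrete)
```

The remainder, one minus the discrete sum, is the share of the sum rule that must be carried by transitions into the continuum (photoionization).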
Sum Rules for Dipole Radiation
There exists a systematic way of deriving sum rules for the weighted intensities of the dipole allowed transitions. The sum rules are of the form
\[
\sum_{n'l'm'} \omega^{p}_{nl,n'l'} \; | \langle nlm | \, \hat{A} \, | n'l'm' \rangle |^2
\tag{948}
\]
where
\[
\hbar \, \omega_{nl,n'l'} = E_{nl} - E_{n'l'}
\tag{949}
\]
\[
\hat{A}(t) = \exp\left[ + \frac{i}{\hbar} \hat{H}_0 \, t \right] \hat{A} \, \exp\left[ - \frac{i}{\hbar} \hat{H}_0 \, t \right]
\tag{950}
\]

50 W. Thomas, Naturwiss. 11, 527 (1925).
F. Reiche and W. Thomas, Zeit. für Physik, 34, 510 (1925).
W. Kuhn, Zeit. für Physik, 33, 408 (1925).
(951)
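and
\[
\frac{\partial^2 F}{\partial t^2} = \left( \frac{i}{\hbar} \right)^2
\langle nlm | \, [ \, \hat{H}_0 \, , \, [ \, \hat{H}_0 \, , \, \hat{A}(t) \, ] \, ] \; \hat{A}^{\dagger}(0) \, | nlm \rangle
\tag{953}
\]
etc. This process shows that the $p$th derivative is expressed as $p$ nested commutators
\[
\frac{\partial^p F}{\partial t^p} = \left( \frac{i}{\hbar} \right)^p
\langle nlm | \, [ \, \hat{H}_0 \, , \, [ \, \ldots [ \, \hat{H}_0 \, , \, [ \, \hat{H}_0 \, , \, \hat{A}(t) \, ] \, ] \ldots \, ] \, ] \; \hat{A}^{\dagger}(0) \, | nlm \rangle
\tag{954}
\]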
Alternatively, one can insert a complete set of states in the definition for $F(t)$
yielding
\[
F(t) = \sum_{n'l'm'} \langle nlm | \, \hat{A}(t) \, | n'l'm' \rangle \langle n'l'm' | \, \hat{A}^{\dagger}(0) \, | nlm \rangle
\tag{955}
\]
but since the states $| nlm \rangle$ are eigenstates of $\hat{H}_0$, one has
\[
F(t) = \sum_{n'l'm'} \exp[ \, i \, \omega_{nl,n'l'} \, t \, ] \; | \langle nlm | \, \hat{A} \, | n'l'm' \rangle |^2
\tag{956}
\]
so that
\[
\frac{\partial^p F}{\partial t^p} = \sum_{n'l'm'} ( \, i \, \omega_{nl,n'l'} )^p \; \exp[ \, i \, \omega_{nl,n'l'} \, t \, ] \; | \langle nlm | \, \hat{A} \, | n'l'm' \rangle |^2
\tag{957}
\]
and then setting t = 0
2
p
X
0 0 0
Enl En0 l0
< nlm  A  n l m >
n0 l0 m0
0 , [ . . . , [ H
0 , [ H
0 , A ] ] . . . ] ] A  nlm >(958)
= < nlm  [ H
Hence, the $p$th moment of the matrix elements of $\hat{A}$ is related to the expectation value of the product of the $p$th nested commutator of $\hat{H}_0$ and $\hat{A}$ multiplied
by $\hat{A}^{\dagger}$.
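The sum rule can be tested on a system where everything is known in closed form. The sketch below is an illustration chosen here, not an example from the text: a one-dimensional harmonic oscillator with $\hat{A} = x$ in units where $\hbar = m = \omega = 1$, for which the only non-zero matrix elements of $x$ connect neighbouring levels. For $p = 1$ the right-hand side is $\langle n | [ \hat{H}_0 , x ] \, x | n \rangle = - 1/2$, and for $p = 2$ it equals $\langle n | \hat{p}^2 | n \rangle = n + 1/2$.

```python
import math

def x_element(n, k):
    """<n| x |k> for the harmonic oscillator, in units hbar = m = omega = 1."""
    if k == n + 1:
        return math.sqrt((n + 1) / 2)
    if k == n - 1:
        return math.sqrt(n / 2)
    return 0.0

def energy(n):
    """Oscillator energy E_n = n + 1/2 in the same units."""
    return n + 0.5

def moment(n, p, n_max=50):
    """Left-hand side of the sum rule: sum over k of (E_n - E_k)^p |<n|x|k>|^2."""
    return sum((energy(n) - energy(k)) ** p * x_element(n, k) ** 2
               for k in range(n_max))

first_moments = [moment(n, 1) for n in range(5)]    # each should equal -1/2
second_moments = [moment(n, 2) for n in range(5)]   # each should equal n + 1/2
print(first_moments, second_moments)
```

The $p = 1$ result is independent of the state $n$, which is the oscillator analogue of the Thomas-Reiche-Kuhn sum rule.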
The expectation value of $p$ nested commutators of $\hat{A}$ can be expressed as the
expectation value of the product of $p - q$ nested commutators of $\hat{A}$ with $q$ nested
commutators of $\hat{A}^{\dagger}$. This can be demonstrated by noting that the expectation
(960)
(961)
(964)
11.1.11
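\[
\frac{1}{\tau} = \frac{2 \pi}{\hbar} \left( \frac{e}{m c} \right)^2 \frac{2 \pi \hbar c^2}{V \omega_k} \; n_{\mathbf{k},\alpha} \sum_{\mathbf{k}'}
| \langle \mathbf{k}' | \, \hat{\mathbf{p}} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, \exp[ \, i \, \mathbf{k} \cdot \mathbf{r} \, ] \, | 1s \rangle |^2 \;
\delta\left( E_{1s} + \hbar \omega_k - \frac{\hbar^2 k'^2}{2 m} \right)
\tag{965}
\]
\[
\sigma = \frac{4 \pi^2 e^2}{m^2 \omega_k \, c} \sum_{\mathbf{k}'}
| \langle \mathbf{k}' | \, \hat{\mathbf{p}} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, \exp[ \, i \, \mathbf{k} \cdot \mathbf{r} \, ] \, | 1s \rangle |^2 \;
\delta\left( E_{1s} + \hbar \omega_k - \frac{\hbar^2 k'^2}{2 m} \right)
\tag{966}
\]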
where the initial wave function is given by
\[
\psi_{1s}(\mathbf{r}) = \frac{1}{\sqrt{\pi a^3}} \exp\left[ - \frac{r}{a} \right]
\tag{967}
\]
function can be approximated by a planewave
1
k0 (r) = exp i k 0 . r
(968)
V
The sum over final states of the electron can be replaced by an integral over the
magnitude of its momentum and its direction
\[
\sum_{\mathbf{k}'} \rightarrow \frac{V}{( 2 \pi )^3} \int_0^\infty dk' \, k'^2 \int d\Omega_{k'}
\tag{969}
\]
It is seen that the factor of the volume in the density of final states cancels with
the factors from the normalization of the electron's final state. The differential
cross-section corresponds to the part of the cross-section where the outgoing
electron is emitted into the solid angle $d\Omega_{k'}$. Hence,
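\[
\frac{d\sigma}{d\Omega'} = \frac{V e^2}{2 \pi \, m^2 \omega_k \, c} \int_0^\infty dk' \, k'^2 \;
| \langle \mathbf{k}' | \, \hat{\mathbf{p}} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, \exp[ \, i \, \mathbf{k} \cdot \mathbf{r} \, ] \, | 1s \rangle |^2 \;
\delta\left( E_{1s} + \hbar \omega_k - \frac{\hbar^2 k'^2}{2 m} \right)
\tag{970}
\]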
The integration over the magnitude of the electron's final momentum $k'$ can be
performed by using the properties of the energy conserving delta function. The
magnitude of the electron's final momentum is denoted by $k_f$
\[
k_f^2 = \frac{2 m}{\hbar^2} \, ( \hbar \omega_k + E_{1s} )
\tag{971}
\]
Figure 28: The geometry for the photoemission of an electron from an atom.
An electromagnetic wave, with polarization along the z-axis, is incident along
the x-axis. The photoemitted electron propagates along the direction $\mathbf{k}'$.
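\[
\frac{d\sigma}{d\Omega'} = \frac{V e^2 \, k_f}{2 \pi \, m \, \omega_k \, c \, a^2} \;
| \langle \mathbf{k}_f | \, \cos\theta \, \exp[ \, i \, \mathbf{k} \cdot \mathbf{r} \, ] \, | 1s \rangle |^2
\tag{974}
\]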
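where $(\theta, \varphi)$ are the polar coordinates of the vector $\mathbf{r}$. The matrix elements are
evaluated using the dipole approximation for the photon wave function and set
\[
\exp[ \, i \, \mathbf{k} \cdot \mathbf{r} \, ] \approx 1 + i \, \mathbf{k} \cdot \mathbf{r} + \ldots
\tag{975}
\]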
and only keep the first term of the expansion. The factor $\cos\theta$ can be expressed
as a spherical harmonic through
\[
\cos\theta = \sqrt{\frac{4 \pi}{3}} \; Y^1_0(\theta, \varphi)
\tag{976}
\]
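and the final state electronic wave function can be expressed in terms of the
Rayleigh expansion
\[
\exp[ \, i \, k_f \, \hat{\mathbf{k}}' \cdot \mathbf{r} \, ] = 4 \pi \sum_{l,m} i^l \, j_l( k_f r ) \; Y^{l \, *}_m(\theta, \varphi) \; Y^l_m(\theta', \varphi')
\tag{977}
\]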
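where $(\theta', \varphi')$ are the polar coordinates of the electron's final momentum. The
angular integration over the polar coordinates $(\theta, \varphi)$ can be performed by using
the orthogonality relations for the spherical harmonics. The end result is
\[
\langle k_f , \Omega' | \, \cos\theta \, | 1s \rangle = - 4 \pi \, i \, \cos\theta' \; \frac{1}{\sqrt{\pi a^3 V}} \int_0^\infty dr \, r^2 \, j_1( k_f r ) \exp\left[ - \frac{r}{a} \right]
\tag{978}
\]
where the $\cos\theta'$ dependence refers to the direction of the emitted electron's
angular momentum. The radial integral is evaluated to yield
\[
\langle k_f , \Omega' | \, \cos\theta \, | 1s \rangle = - 4 \, i \, \cos\theta' \; \sqrt{\frac{\pi a^3}{V}} \; \frac{2 \, k_f a}{( 1 + k_f^2 a^2 )^2}
\tag{979}
\]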
(980)
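Using
\[
a = \frac{\hbar^2}{m e^2}
\tag{981}
\]
the photoemission cross-section can be rewritten as
\[
\frac{d\sigma}{d\Omega'} = 8 \, a^2 \, \frac{k_f}{k} \left( \frac{e^2}{\hbar c} \right)^2 \cos^2\theta' \left( \frac{2 \, k_f a}{( 1 + k_f^2 a^2 )^2} \right)^2
\tag{982}
\]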
Thus, although the photon is propagating along the x-direction, the electron is
preferentially emitted along the direction of the polarization ($\theta' \approx 0$). This can
be understood as being due to the effect of $c$ being large, so that the photon's
momentum is negligible compared with the energy; therefore (in the dipole
approximation) only the direction of the polarization determines the angular
distribution of the emitted electron. It should be noted that in the relativistic
case, where the momentum of the photon is important, the electrons are predominantly ejected in the direction of the photon$^{51}$. This formula also breaks
down for emitted electrons with low energies. In this case, the correct electronic
wave function for the continuous spectrum of $\hat{H}_0$ should be used$^{52}$. The inclusion of the Coulomb attraction of the ion in the final state has the effect of
reducing the cross-section near the threshold.
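The plane-wave result for the photoemission cross-section is straightforward to evaluate. The following is a sketch under assumptions made here: energies are measured in electron volts with $E_{1s} = - 1$ Rydberg, so that $( k_f a )^2 = \hbar \omega / {\rm Ryd} - 1$ and $k a = ( e^2 / \hbar c ) \, \hbar \omega / ( 2 \, {\rm Ryd} )$; the constants and the function names are supplied for illustration. As noted above, the expression is not reliable close to threshold.

```python
import math

ALPHA = 7.2973525693e-3      # e^2 / (hbar c), assumed reference value
A_BOHR = 5.29177210903e-11   # Bohr radius in m, assumed reference value
RYDBERG_EV = 13.605693       # Rydberg energy in eV, assumed reference value

def kf_times_a(photon_ev):
    """k_f a for the ejected electron, using E_1s = -1 Ryd and hbar^2/(2 m a^2) = 1 Ryd."""
    excess = photon_ev / RYDBERG_EV - 1.0
    if excess <= 0.0:
        raise ValueError("photon energy is below the ionization threshold")
    return math.sqrt(excess)

def k_times_a(photon_ev):
    """Photon wave number times a: alpha * (photon energy) / (2 Ryd)."""
    return ALPHA * photon_ev / (2.0 * RYDBERG_EV)

def dsigma_domega(photon_ev, theta_prime):
    """Differential cross-section in m^2 per steradian, plane-wave final state."""
    kfa = kf_times_a(photon_ev)
    ka = k_times_a(photon_ev)
    shape = (2.0 * kfa / (1.0 + kfa ** 2) ** 2) ** 2
    return 8.0 * A_BOHR ** 2 * (kfa / ka) * ALPHA ** 2 * math.cos(theta_prime) ** 2 * shape

along_polarization = dsigma_domega(100.0, 0.0)
at_sixty_degrees = dsigma_domega(100.0, math.pi / 3)
print(along_polarization, at_sixty_degrees)
```

The ratio of the two printed values is four, reflecting the $\cos^2\theta'$ dependence on the angle between the electron's momentum and the polarization.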
11.1.12
Free electrons are described by the non-interacting Hamiltonian $\hat{H}_0$ where
\[
\hat{H}_0 = \frac{\hat{\mathbf{p}}^2}{2 m}
\tag{983}
\]
i k0 . r
(984)
h2 k 02
2m
(985)
q
2 h c
Sauter, Ann. Phys. 9, 217 (1931), Ann. Phys. 11, 454 (1931).
Stobbe, Ann. Phys. 7, 661 (1930).
which is evaluated as
r
q
2 h c
(987)
This shows that momentum is conserved. Furthermore, for the transition rate
p'
p''
2 k 02
h
2 k 002
h
=
2m
2m
(988)
(989)
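Hence,
\[
( \, p^{\mu} + p'^{\mu} \, ) \, ( \, p_{\mu} + p'_{\mu} \, ) = p''^{\mu} \, p''_{\mu}
\tag{990}
\]
but the electron's momenta form a Lorentz scalar which is related to the rest
mass
\[
p'^{\mu} \, p'_{\mu} = p''^{\mu} \, p''_{\mu} = m^2 c^2
\tag{991}
\]
and the photon has zero mass
\[
p^{\mu} \, p_{\mu} = 0
\tag{992}
\]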
(993)
11.2
Scattering of Light
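Kramers and Heisenberg evaluated the scattering cross-section for light incident
on atomic electrons$^{53}$. The incident photon is denoted by $(\mathbf{k}, \alpha)$ and the scattered photon by $(\mathbf{k}', \alpha')$. The scattering cross-section involves the paramagnetic
interaction to second-order and the diamagnetic interaction to first-order. The
matrix elements of the diamagnetic interaction are given by
\[
\begin{aligned}
\langle n'l'm' \, \mathbf{k}'\alpha' | \, \hat{H}_{dia} \, | nlm \, \mathbf{k}\alpha \rangle
&= \frac{e^2}{2 m c^2} \, \langle n'l'm' \, \mathbf{k}'\alpha' | \, \hat{\mathbf{A}} \cdot \hat{\mathbf{A}} \, | nlm \, \mathbf{k}\alpha \rangle \\
&= \frac{e^2}{2 m c^2} \, \langle n'l'm' \, \mathbf{k}'\alpha' | \, ( \, a_{\mathbf{k},\alpha} \, a^{\dagger}_{\mathbf{k}',\alpha'} + a^{\dagger}_{\mathbf{k}',\alpha'} \, a_{\mathbf{k},\alpha} \, ) \, \exp[ \, i \, ( \mathbf{k} - \mathbf{k}' ) \cdot \mathbf{r} \, ] \, | nlm \, \mathbf{k}\alpha \rangle \\
& \qquad \times \frac{2 \pi \hbar c^2}{\sqrt{\omega_k \, \omega_{k'}} \, V} \; \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\epsilon}_{\alpha'}(\mathbf{k}')
\end{aligned}
\tag{994}
\]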
where it has been assumed that only the initial and final photon are present. On
making use of the long-wavelength approximation $\lambda \gg a$, the matrix elements
simplify to
e2
0 0 0 0 0
h
2 m c2
k k 0 V
Z t
i
0
0
0
0
dt exp
( h + En h En ) t
(997)
h
53 H.
(998)
k''
k''''
n'
n''
n
(k',')
(k,)
(k',')
(k,)
n''
n
n'
n'
mc
k k 0 V
X
i
i
h 0 + En0 En00 ) t0 ] exp[ (En En00 + h) t00 ]
exp[ (
h
00
n
00
n
 nlm >
< n0 l0 m0  (k) . p
 n00 l00 m00 > < n00 l00 m00  0 (k 0 ) . p
(999)
The earliest time integration can be evaluated leading to
(2)
Cn0 (t)
=
i
h
0 0
e
mc
2
2 h c2
k k 0 V
00 00
Z
dt0
00
00 00
exp[
n00
00
i
(
h 0 + h
+ En0 E n ) t0 ]
< n l m  0 (k ) . p
 n l m > < n l m  (k) . p
 nlm >
( En En00 + h )
< n0 l0 m0  (k) . p
 n00 l00 m00 > < n00 l00 m00  0 (k 0 ) . p
 nlm >
0
( En En00 h )
The coefficients $C^{(1)}_{n'}(t)$ and $C^{(2)}_{n'}(t)$ have the same type of time-dependence.
The remaining integration over time yields
Z t
i
i
0
0
0
dt exp
(
h + En0 En h ih) t
h
exp[
(
h 0 + En0 En h ih) t ]
(
h 0 + En0 En h ih)
i
h
(1001)
1
=
(1)
C 0 (t) + C (2)
n0 (t)
n
2
(1002)
k k 0 V
(
h 0 + En0 En
h)2 + h2 2 m c2
(1000)
0
 n00 l00 m00 > < n00 l00 m00  (k) . p
 nlm >
1 X < n0 l0 m0  0 (k ) . p
( En En00 + h )
m 00
n
 nlm >
 n00 l00 m00 > < n00 l00 m00  0 (k 0 ) . p
1 X < n0 l0 m0  (k) . p
+
( En En00 h 0 )
m 00
n
(1005)
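On taking the limit $\eta \rightarrow 0$, the first factor in the decay rate reduces to an
energy conserving delta function. Therefore, one obtains the Fermi-Golden rule
expression
\[
\frac{1}{\tau} = \frac{2 \pi}{\hbar} \left( \frac{e^2}{m c^2} \right)^2 \left( \frac{2 \pi \hbar c^2}{\sqrt{\omega_k \, \omega_{k'}} \, V} \right)^2
| \, M \, |^2 \; \delta( \hbar \omega_{k'} + E_{n'} - E_n - \hbar \omega_k )
\tag{1006}
\]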
The magnitudes of the final state photon quantum numbers $(k')$ must be integrated over, since these are not measured. This integration imparts a physical
meaning to the expression for the rate which contains the Dirac delta function.
We shall assume that the direction of the scattered photon is to be measured
and that the photon is absorbed by a detector which subtends a solid angle $d\Omega'$
to the atom. Therefore, the scattering rate is given by
\[
\frac{1}{\tau} = \frac{2 \pi}{\hbar} \left( \frac{e^2}{m c^2} \right)^2 \frac{V}{( 2 \pi )^3} \; d\Omega' \int_0^\infty dk' \, k'^2
\left( \frac{2 \pi \hbar c^2}{\sqrt{\omega_k \, \omega_{k'}} \, V} \right)^2
| \, M \, |^2 \; \delta( \hbar \omega_{k'} + E_{n'} - E_n - \hbar \omega_k )
\tag{1007}
\]
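Since $\hbar \omega_{k'} = \hbar c k'$, the integration over the delta function can be performed,
yielding
\[
\frac{1}{\tau} = \frac{2 \pi}{\hbar} \left( \frac{e^2}{m c^2} \right)^2 \frac{V \, d\Omega'}{( 2 \pi )^3} \; \frac{\omega'^2}{\hbar c^3}
\left( \frac{2 \pi \hbar c^2}{\sqrt{\omega \, \omega'} \, V} \right)^2 | \, M \, |^2
\tag{1008}
\]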
The scattering cross-section is defined as the transition rate divided by the
photon flux. The photon flux is found by noting that it has been assumed that
there is one photon per volume $V$ so the photon density is $1/V$ and the speed of
light is $c$. Hence, the photon flux is given by $c/V$. Therefore, the cross-section is
determined by the Kramers-Heisenberg formula
\[
\frac{d\sigma}{d\Omega'} = \left( \frac{e^2}{m c^2} \right)^2 \frac{\omega'}{\omega} \; | \, M \, |^2
\tag{1009}
\]
This quantity is often called the classical radius of the electron. The quantity
$r_e$ can be expressed as
\[
r_e = \frac{e^2}{m c^2} = \left( \frac{e^2}{\hbar c} \right) \frac{\hbar}{m c} \approx 2.82 \times 10^{-15} \; {\rm m}
\tag{1011}
\]
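The numerical value of $r_e$ follows from the second form above, the fine-structure constant times the reduced Compton wavelength. A minimal sketch, in which the value of $e^2/\hbar c$ is the one quoted in the text and the SI values of $\hbar$, the electron mass and $c$ are standard reference values assumed here:

```python
ALPHA = 1.0 / 137.0359979   # e^2 / (hbar c), value quoted in the text
HBAR = 1.054571817e-34      # J s (assumed reference value)
M_E = 9.1093837015e-31      # electron mass in kg (assumed reference value)
C = 2.99792458e8            # speed of light in m/s (assumed reference value)

def classical_electron_radius():
    """r_e = (e^2 / hbar c) * hbar / (m c), in metres."""
    return ALPHA * HBAR / (M_E * C)

r_e = classical_electron_radius()
print(r_e)
```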
11.2.1
Rayleigh Scattering
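but one can rewrite the Kronecker delta function in terms of the commutation
relation
\[
[ \, x_i \, , \, \hat{p}_j \, ] = i \, \hbar \, \delta_{i,j}
\tag{1014}
\]
Thus, one can express the scalar product as a commutator
\[
\hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\epsilon}_{\alpha'}(\mathbf{k}')
= \frac{1}{i \hbar} \sum_{i,j} \hat{\epsilon}_{\alpha}(\mathbf{k})_i \, [ \, x_i \, , \, \hat{p}_j \, ] \, \hat{\epsilon}_{\alpha'}(\mathbf{k}')_j
\tag{1015}
\]
\[
= \frac{1}{i \hbar} \, [ \, \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \mathbf{r} \, , \, \hat{\mathbf{p}} \cdot \hat{\epsilon}_{\alpha'}(\mathbf{k}') \, ]
\tag{1016}
\]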
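the initial and final states must be identical if this is non-zero. Hence, the
result is equivalent to the expectation value in the state $| nlm \rangle$. On replacing
the matrix elements by the expectation value and then inserting a complete set of
electronic states, one finds
\[
\begin{aligned}
\langle n'l'm' | nlm \rangle \; \hat{\epsilon}_{\alpha}(\mathbf{k}) \cdot \hat{\epsilon}_{\alpha'}(\mathbf{k}')
= \frac{1}{i \hbar} \sum_{n''l''m''} \Big[ \;
& \langle n'l'm' | \, \mathbf{r} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, | n''l''m'' \rangle \langle n''l''m'' | \, \hat{\epsilon}_{\alpha'}(\mathbf{k}') \cdot \hat{\mathbf{p}} \, | nlm \rangle \\
& - \langle n'l'm' | \, \hat{\epsilon}_{\alpha'}(\mathbf{k}') \cdot \hat{\mathbf{p}} \, | n''l''m'' \rangle \langle n''l''m'' | \, \mathbf{r} \cdot \hat{\epsilon}_{\alpha}(\mathbf{k}) \, | nlm \rangle \; \Big]
\end{aligned}
\tag{1017}
\]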
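via
\[
\begin{aligned}
\langle n'l'm' | \, \hat{\mathbf{p}} \, | n''l''m'' \rangle
&= \frac{1}{2 \, i \, \hbar} \, \langle n'l'm' | \, [ \, \mathbf{r} \, , \, \hat{\mathbf{p}}^2 \, ] \, | n''l''m'' \rangle \\
&= \frac{m}{i \, \hbar} \, \langle n'l'm' | \, [ \, \mathbf{r} \, , \, \hat{H}_0 \, ] \, | n''l''m'' \rangle \\
&= \frac{m}{i \, \hbar} \, ( E_{n''l''m''} - E_{n'l'm'} ) \, \langle n'l'm' | \, \mathbf{r} \, | n''l''m'' \rangle
\end{aligned}
\tag{1018}
\]
so that
\[
\langle n'l'm' | \, \mathbf{r} \, | n''l''m'' \rangle = \frac{i}{m \, \omega_{n'',n'}} \, \langle n'l'm' | \, \hat{\mathbf{p}} \, | n''l''m'' \rangle
\tag{1019}
\]
where
\[
E_{n''l''m''} - E_{n'l'm'} = \hbar \, \omega_{n'',n'}
\tag{1020}
\]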
1
m
X
n00 l00 m00
< n0 l0 m0  p
. 0 (k 0 )  n00 l00 m00 > < n00 l00 m00  p . (k)  nlm >
h nn00
(1021)
1
m
X
n00 l00 m00
< n0 l0 m0  p
. 0 (k 0 )  n00 l00 m00 > < n00 l00 m00  p . (k)  nlm >
En00 En
(1022)
On substituting this back into the expression for the matrix elements M , one
obtains
X < n0 l0 m0  p
. (k)  n00 l00 m00 > < n00 l00 m00  p . 0 (k 0 )  nlm >
1
M =
m 00 00 00
En00 En
n l m
1
m
X
n00 l00 m00
< n0 l0 m0  p
. 0 (k 0 )  n00 l00 m00 > < n00 l00 m00  p . (k)  nlm >
En00 En
1
m
1
m
X
n0 l00 m00
X
n00 l00 m00
(1023)
which simplifies to
M
n00 n ( nn00 )
(1024)
m h
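In the limit of small photon frequencies compared with the electronic energies,
one can expand the denominators of the matrix element as
\[
\frac{1}{\omega_{nn''} \, ( \, \omega_{nn''} \mp \omega \, )} = \frac{1}{\omega_{nn''}^2} \pm \frac{\omega}{\omega_{nn''}^3} + \ldots
\tag{1025}
\]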
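When this low-frequency expansion is substituted into the matrix elements, the leading term vanishes. This can be seen since the leading term is proportional to
$$
\begin{aligned}
\sum_{n''} \frac{1}{\omega_{nn''}^2} \Big[ \; & \langle n'l'm' | \, \hat{\epsilon}_{\alpha'}(k') \cdot p \, | n''l''m'' \rangle \langle n''l''m'' | \, p \cdot \hat{\epsilon}_{\alpha}(k) \, | nlm \rangle \\
- \; & \langle n'l'm' | \, \hat{\epsilon}_{\alpha}(k) \cdot p \, | n''l''m'' \rangle \langle n''l''m'' | \, p \cdot \hat{\epsilon}_{\alpha'}(k') \, | nlm \rangle \Big]
\end{aligned}
\tag{1026}
$$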
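which can be expressed as
$$
\begin{aligned}
m^2 \sum_{n''} \Big[ \; & \langle n'l'm' | \, \hat{\epsilon}_{\alpha'}(k') \cdot r \, | n''l''m'' \rangle \langle n''l''m'' | \, r \cdot \hat{\epsilon}_{\alpha}(k) \, | nlm \rangle \\
- \; & \langle n'l'm' | \, \hat{\epsilon}_{\alpha}(k) \cdot r \, | n''l''m'' \rangle \langle n''l''m'' | \, r \cdot \hat{\epsilon}_{\alpha'}(k') \, | nlm \rangle \Big]
\end{aligned}
\tag{1027}
$$
or, on using the completeness relation, one finds the expectation value of the commutator is given by
$$ m^2 \, \langle n'l'm' | \, [\, \hat{\epsilon}_{\alpha'}(k') \cdot r \, , \, r \cdot \hat{\epsilon}_{\alpha}(k) \,] \, | nlm \rangle = 0 \tag{1028} $$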
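Thus, the leading term of the low-frequency expansion vanishes. Therefore, the scattering rate is expressed as
$$
\begin{aligned}
\frac{d\sigma}{d\Omega'} = \left( \frac{r_e}{m \hbar} \right)^2 \omega^4 \; \Bigg| \sum_{n''l''m''} \frac{1}{\omega_{nn''}^3} \Big( & \langle n'l'm' | \, \hat{\epsilon}_{\alpha'}(k') \cdot p \, | n''l''m'' \rangle \langle n''l''m'' | \, p \cdot \hat{\epsilon}_{\alpha}(k) \, | nlm \rangle \\
+ \; & \langle n'l'm' | \, \hat{\epsilon}_{\alpha}(k) \cdot p \, | n''l''m'' \rangle \langle n''l''m'' | \, p \cdot \hat{\epsilon}_{\alpha'}(k') \, | nlm \rangle \Big) \Bigg|^2
\end{aligned}
\tag{1029}
$$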
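Finally, the scattering rate can be expressed in terms of the dipole matrix elements as
$$
\begin{aligned}
\frac{d\sigma}{d\Omega'} = \left( \frac{r_e \, m}{\hbar} \right)^2 \omega^4 \; \Bigg| \sum_{n''l''m''} \frac{1}{\omega_{nn''}} \Big( & \langle n'l'm' | \, \hat{\epsilon}_{\alpha'}(k') \cdot r \, | n''l''m'' \rangle \langle n''l''m'' | \, r \cdot \hat{\epsilon}_{\alpha}(k) \, | nlm \rangle \\
+ \; & \langle n'l'm' | \, \hat{\epsilon}_{\alpha}(k) \cdot r \, | n''l''m'' \rangle \langle n''l''m'' | \, r \cdot \hat{\epsilon}_{\alpha'}(k') \, | nlm \rangle \Big) \Bigg|^2
\end{aligned}
\tag{1030}
$$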
Hence, at long wavelengths, the scattering cross-section varies as $\omega^4$, as expected from Rayleigh's law. Since the typical electronic frequency $\omega_{nn''}$ is in the ultraviolet spectrum, then
$$ \omega \ll \omega_{nn''} \tag{1031} $$
for all frequencies in the visible optical spectrum. Blue light is therefore scattered more strongly than red light, which leads to the phenomena of blue skies in the day and red sunsets at dusk.
11.2.2 Thomson Scattering
(1032)
so that the photon energy is greater than the atomic binding energy. In this case, the second and third terms in the Kramers-Heisenberg formula can be neglected. This is because
0
1
< n0 l0 m0  0 (k 0 ) . p
 n00 l00 m00 > < n00 l00 m00  p
. (k)  nlm >
m
(1033)
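$$ \frac{d\sigma}{d\Omega'} = r_e^2 \; | \, \hat{\epsilon}_{\alpha}(k) \cdot \hat{\epsilon}_{\alpha'}(k') \, |^2 \tag{1034} $$
which is independent of $\omega$. The above result is dependent on the scattering angle via the polarization vectors.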
In the investigation of the angular dependence of Thomson scattering, it is convenient to introduce a coordinate system which is defined by the direction of propagation of the incident photon and its polarization $\hat{\epsilon}_1(k)$. The coordinate system is composed of the three orthogonal unit vectors $( \hat{\epsilon}_1(k), \hat{\epsilon}_2(k), \hat{k} )$. Thus the direction of the polarization vector $\hat{\epsilon}_1(k)$ defines the x-direction. In this coordinate system, the scattered photon $(k', \alpha')$ is in the direction $\hat{k}'$ with polar coordinates $(\theta_{k'}, \varphi_{k'})$. The polarization of the final photons $\hat{\epsilon}(k')$ must be transverse to $k'$. Two polarization vectors are defined according to
$$ \hat{\epsilon}_1(k') = ( \cos\theta_{k'} \cos\varphi_{k'} , \; \cos\theta_{k'} \sin\varphi_{k'} , \; - \sin\theta_{k'} ) \tag{1035} $$
$$ \hat{\epsilon}_2(k') = ( - \sin\varphi_{k'} , \; \cos\varphi_{k'} , \; 0 ) \tag{1036} $$
Figure 32: The coordinate system and polarization vectors used to describe Thomson scattering.
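vectors, the scattering cross-section for incident radiation that is polarized along the x-direction is given by
$$ \left. \frac{d\sigma}{d\Omega'} \right|_{x\,pol} = r_e^2 \begin{cases} \cos^2\theta_{k'} \cos^2\varphi_{k'} & \text{for } \alpha' = 1 \\ \sin^2\varphi_{k'} & \text{for } \alpha' = 2 \end{cases} \tag{1037} $$
and
$$ \left. \frac{d\sigma}{d\Omega'} \right|_{x\,pol} = r_e^2 \left( \cos^2\theta_{k'} \cos^2\varphi_{k'} + \sin^2\varphi_{k'} \right) \tag{1038} $$
where the polarizations of the final state photon have been summed over.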
If the incident beam of photons is unpolarized, then $\varphi_{k'}$ is undefined since the azimuthal direction of the scattered photon is defined with respect to the assumed polarization $\hat{\epsilon}_1(k)$. In the case of an unpolarized incident beam the expression should be integrated over $\varphi_{k'}$ and divided by $2\pi$. The scattering rate is given by
$$ \left. \frac{d\sigma}{d\Omega'} \right|_{unpol} = \frac{r_e^2}{2} \begin{cases} \cos^2\theta_{k'} & \text{for } \alpha' = 1 \\ 1 & \text{for } \alpha' = 2 \end{cases} \tag{1039} $$
if the polarizations of the final state photons are measured. This result is identical to that obtained by assuming that the initial beam is composed of one half of the number of photons polarized along the x-direction and the other half of the number of photons polarized along the y-direction. That is
$$ \left. \frac{d\sigma}{d\Omega'} \right|_{unpol} = r_e^2 \begin{cases} \frac{1}{2} \cos^2\theta_{k'} \, ( \cos^2\varphi_{k'} + \sin^2\varphi_{k'} ) & \text{for } \alpha' = 1 \\ \frac{1}{2} \, ( \sin^2\varphi_{k'} + \cos^2\varphi_{k'} ) & \text{for } \alpha' = 2 \end{cases} \tag{1040} $$
The cross-section for unpolarized photons with a polarization-insensitive detector is given by
$$ \left. \frac{d\sigma}{d\Omega'} \right|_{unpol} = \frac{r_e^2}{2} \left( 1 + \cos^2\theta_{k'} \right) \tag{1041} $$
where the final polarizations have been summed over.
The total cross-section is obtained by integrating over all directions. The total Thomson scattering cross-section is independent of whether the initial beam was polarized or unpolarized. The final result is
$$ \sigma = \frac{8 \pi}{3} \, r_e^2 \tag{1042} $$
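As a quick numerical cross-check of eqn (1042), the unpolarized differential cross-section of eqn (1041) can be integrated over the full solid angle. The sketch below is only illustrative: the function names and the numerical value used for the classical electron radius $r_e$ (in cm) are assumptions introduced here, not part of the notes.

```python
import math

# Assumed value of the classical electron radius r_e in cm (illustrative constant).
R_E_CM = 2.8179403e-13


def dsigma_domega_unpol(theta, r_e=R_E_CM):
    """Unpolarized Thomson differential cross-section, eqn (1041)."""
    return 0.5 * r_e**2 * (1.0 + math.cos(theta) ** 2)


def total_cross_section(n_steps=20000, r_e=R_E_CM):
    """Midpoint-rule integral of eqn (1041) over the solid angle."""
    h = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * h
        total += dsigma_domega_unpol(theta, r_e) * math.sin(theta) * h
    # The integrand has no azimuthal dependence, so that integral gives 2 pi.
    return 2.0 * math.pi * total


sigma_numeric = total_cross_section()
sigma_exact = 8.0 * math.pi / 3.0 * R_E_CM**2  # eqn (1042)
print(sigma_numeric, sigma_exact)
```

Both numbers should agree closely, and with the assumed value of $r_e$ they come out near $6.65 \times 10^{-25}$ cm$^2$.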
these processes are smaller by factors of $( m / M )^2$. The derivation of the Thomson scattering cross-section breaks down for photons which have energies of the order of the electron's rest energy
$$ \hbar \omega \sim m_e c^2 \tag{1043} $$
For photons with these high energies, one must describe the scattering process relativistically. In this energy region, Compton scattering dominates.
Classical Interpretation
The classical counterparts of Rayleigh and Thomson scattering can be described by a two-step process. In the first step, the incident classical electromagnetic field causes an electron to undergo forced oscillations. In the second step, the oscillating electrons emit electromagnetic radiation.
In the first process, an electron which is bound harmonically to the atom and responds to an electromagnetic field $E_0 \exp[\, i \omega t \,]$ can be described by the equation of motion
$$ \ddot{r} + \omega_0^2 \, r = \frac{q}{m} \, E_0 \; \Re e \, \exp[\, i \omega t \,] \tag{1044} $$
where $\omega_0$ is the frequency of the electron's natural motion. In the steady state, one finds
$$ r = \frac{q}{m} \, \frac{E_0}{\omega_0^2 - \omega^2} \; \Re e \, \exp[\, i \omega t \,] \tag{1045} $$
The acceleration of the charged particle can be described by
$$ \ddot{r} = - \frac{q}{m} \, \frac{\omega^2 E_0}{\omega_0^2 - \omega^2} \; \Re e \, \exp[\, i \omega t \,] \tag{1046} $$
=
=
2 q2 r2 4
3 c3
4
2 q E 20
4
2
2
3
3 m c ( 0 2 )2
(1047)
(1048)
$$ \sigma(\omega) = \frac{8 \pi}{3} \, r_e^2 \, \frac{\omega^4}{( \omega_0^2 - \omega^2 )^2} \tag{1049} $$
This formula has the correct frequency dependence in the limit $\omega \ll \omega_0$, in which case the classical cross-section varies as $\omega^4$, as expected for Rayleigh scattering. On the other hand, in the limit $\omega \gg \omega_0$ the cross-section becomes frequency independent, as is expected for Thomson scattering.
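The two limits of the classical cross-section of eqn (1049) can be checked numerically. The following is a minimal sketch; the value of $r_e$ and the illustrative ultraviolet resonance frequency are assumptions made only for the example.

```python
import math

R_E_CM = 2.8179403e-13  # assumed classical electron radius in cm


def classical_cross_section(omega, omega0, r_e=R_E_CM):
    """Undamped classical cross-section, eqn (1049)."""
    return (8.0 * math.pi / 3.0) * r_e**2 * omega**4 / (omega0**2 - omega**2) ** 2


sigma_thomson = (8.0 * math.pi / 3.0) * R_E_CM**2
omega0 = 2.0e16  # rad/s, an illustrative (assumed) ultraviolet resonance

# Rayleigh limit: doubling the frequency raises the cross-section by about 2**4.
rayleigh_ratio = (classical_cross_section(2.0e-3 * omega0, omega0)
                  / classical_cross_section(1.0e-3 * omega0, omega0))

# Thomson limit: far above resonance the result approaches (8 pi / 3) r_e^2.
thomson_ratio = classical_cross_section(1.0e3 * omega0, omega0) / sigma_thomson

print(rayleigh_ratio, thomson_ratio)
```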
11.2.3 Raman Scattering
For inelastic scattering, one has $\hbar \omega \neq \hbar \omega'$; therefore, the condition of conservation of energy requires that
$$ E_{nlm} + \hbar \omega = E_{n'l'm'} + \hbar \omega' \tag{1050} $$
Since it is most probable that the initial electron is in the ground state, one has
$$ E_{n'l'm'} > E_{nlm} \tag{1051} $$
so that
$$ \hbar \omega > \hbar \omega' \tag{1052} $$
Hence, the final photon has less energy than the initial photon. That is, the electromagnetic field has lost energy and left the electron in an excited state. This inelastic process describes the Stokes line. On the other hand, if the electron is initially in an excited state, then it is possible that the electron loses energy and makes a transition to the ground state. In this case,
$$ E_{n'l'm'} < E_{nlm} \tag{1053} $$
so that
$$ \hbar \omega < \hbar \omega' \tag{1054} $$

Figure 33: The schematic frequency dependence of the observed intensity $I(\omega')$ expected in a Raman scattering experiment. The ratio of intensities of the Stokes and anti-Stokes lines provides a relative measure of the initial occupation of the low-energy state $n$ and the higher-energy excited state $n'$.
11.2.4
In the analysis of photon scattering, it has been assumed that the energy denominators $( E_n - E_{n''} + \hbar \omega )$ do not vanish. If the energy denominator vanishes, the Kramers-Heisenberg formula becomes singular. However, the physically observed scattering cross-section may become large but does not diverge. This is the phenomenon of resonance fluorescence. Using the classical model, one can describe the scattering cross-section if damping is introduced to represent the lifetime of the electronic states. That is, the dynamics of the bound electron is modelled by a damped harmonic oscillator
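$$ \ddot{r} + \Gamma \, \dot{r} + \omega_0^2 \, r = \frac{q}{m} \, E_0 \; \Re e \, \exp[\, i \omega t \,] \tag{1055} $$
which has the solution
$$ r = \frac{q}{m} \, E_0 \; \Re e \left( \frac{1}{\omega_0^2 + i \omega \Gamma - \omega^2} \, \exp[\, i \omega t \,] \right) \tag{1056} $$
Since $\Gamma$ is related to the decay rate and is of the order of $10^8$ sec$^{-1}$, it is usually negligible compared with the frequency of light, which is estimated as $\omega \sim 10^{15}$ sec$^{-1}$. Following our previous arguments, one finds that the scattering cross-section is given by
$$ \sigma(\omega) = \frac{8 \pi}{3} \left( \frac{q^2}{m c^2} \right)^2 \frac{\omega^4}{( \omega^2 - \omega_0^2 )^2 + \omega^2 \Gamma^2} \tag{1057} $$
which no longer diverges when the resonance condition is satisfied, because of the damping of the electronic states.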
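To see how the damping regularizes the resonance, one can evaluate eqn (1057) exactly on resonance, where it reduces to $(8\pi/3) \, r_e^2 \, (\omega_0/\Gamma)^2$. The sketch below uses the order-of-magnitude values quoted above ($\Gamma \sim 10^8$ sec$^{-1}$, $\omega_0 \sim 10^{15}$ sec$^{-1}$); the function name and the value of $r_e = q^2/(m c^2)$ are assumptions made for illustration.

```python
import math

R_E_CM = 2.8179403e-13  # assumed classical electron radius q^2/(m c^2) in cm


def damped_cross_section(omega, omega0, gamma, r_e=R_E_CM):
    """Damped classical cross-section, eqn (1057)."""
    denominator = (omega**2 - omega0**2) ** 2 + omega**2 * gamma**2
    return (8.0 * math.pi / 3.0) * r_e**2 * omega**4 / denominator


omega0 = 1.0e15  # rad/s, order of magnitude quoted in the text
gamma = 1.0e8    # 1/s, order of magnitude quoted in the text

sigma_thomson = (8.0 * math.pi / 3.0) * R_E_CM**2
peak_enhancement = damped_cross_section(omega0, omega0, gamma) / sigma_thomson
print(peak_enhancement)  # finite, of order (omega0 / gamma)**2
```

The resonant cross-section is finite but exceeds the Thomson value by the very large factor $(\omega_0/\Gamma)^2$.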
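The lifetime of a quantum mechanical state which at $t = 0$ is represented by $| \psi_n(0) \rangle$, calculated to second-order in the interaction $\hat{H}_I$, is given by the Fermi-Golden rule expression. The rate can be expressed as the limit $\eta \to 0$ of
$$ \frac{1}{\tau_n} = - \frac{2}{\hbar} \; \Im m \sum_{n'} \frac{| \langle n' | \hat{H}_I | n \rangle |^2}{E_n - E_{n'} + i \eta} \tag{1058} $$
whereas the energy-shift found in second-order (Rayleigh-Schrodinger) perturbation theory is also given by the limit $\eta \to 0$ of
$$ \Delta E_n = \Re e \sum_{n'} \frac{| \langle n' | \hat{H}_I | n \rangle |^2}{E_n - E_{n'} + i \eta} \tag{1059} $$
so
$$ E_n = E_n^{(0)} + \Delta E_n \tag{1060} $$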
Hence, due to the form of the expressions for the shift and the lifetime as the real and imaginary parts of a complex function, it is possible to consider an unstable state as having a complex energy54 given by
$$ E_n - i \, \Gamma_n = E_n^{(0)} + \Delta E_n - i \, \frac{\hbar}{2 \tau_n} \tag{1061} $$
That is, the lifetime can be considered as giving the state an energy-width $\Gamma_n$. This is the natural width of the electronic state. The factor of two in the width can be understood by considering the time-dependence of the state $| \psi_n(t) \rangle$ which is given by
$$ | \psi_n(t) \rangle = \exp\left[ - \frac{i}{\hbar} \, ( E_n - i \, \Gamma_n ) \, t \right] | \psi_n(0) \rangle \tag{1062} $$
Hence, the probability Pn (t) that the state has not decayed at time t is given
by
Pn (t)
=
 M 2
(1066)
0
2
d
mc
54 That is, the perturbation produces a complex shift of the energy, which is related to the self-energy $\Sigma_n(E)$ that is to be discussed later.
55 P. A. M. Dirac, Proc. Roy. Soc. A 114, 710 (1927).
1
m
1
m
X
n00 l00 m00
X
n00 l00 m00
(1067)
2
d
=
d0
m2 c2
( En En00 + h )2 + 2n00
(1068)
This expression can be reexpressed in terms of the product of two factors
2
 n00 l00 m00 > 2 V
e 2
h  < nlm  (k) . p
d
=
d0
m2 V
( En En00 + h )2 + 2n00
c
2
2 e 2 h
V 02
 < n00 l00 m00  0 (k 0 ) . p
 n0 l0 m0 > 2
2
0
h
m V
( 2 )3 h c3
(1069)
which is the probability for absorption from the ground state to the resonant state $| n''l''m'' \rangle$ (divided by the incident flux) times the probability for its decay via emission. On resonance, it appears that the process corresponds to two sequential processes, first absorption and secondly emission.
For energies slightly off-resonance, the resonant scattering is expected to interfere with the non-resonant scattering process. Likewise, if the resonant state is degenerate, the sum over the degeneracy must be performed before the matrix elements are squared, leading to constructive interference.
The difference between a resonant process and a two-step process is determined by the lifetime of the intermediate state $| n'l'm' \rangle$ compared with the frequency width of the photon beam. The frequency width of the photon beam may be limited by the monochromator, or by the time-scale of the experiment if it involves a pulsed light source. If the lifetime of the intermediate state is sufficiently long compared with the time-scale of the experiment, it may be possible to observe the decay long after the incident light has been switched off. In
11.2.5 Natural Line-Widths
The interaction representation will be used to calculate the natural width for the absorption of light $(k, \alpha)$, by introducing the lifetimes of the initial and final state. Strictly speaking, one should not take the exponential decay of a probability $P_n(t)$ of finding an electron in state $n$ too literally. If one considers the approximate exponential decay as being rigorous, this implies that the Hamiltonian should be non-Hermitian, which is strictly forbidden. One should think of the decaying wave function as a wave packet or linear superposition of exact energy eigenstates (with energies denoted by $E$). The Fourier transform of the time-dependent wave function should provide the energy-distribution $\rho_n(E)$ of the exact energy eigenstates in the wave packet $| \psi_n(t) \rangle$
$$ \rho_n(E) = \frac{1}{2 \pi \hbar} \int_{-\infty}^{\infty} dt \; \exp\left[ + \frac{i}{\hbar} E t \right] \langle \psi_n(0) | \psi_n(t) \rangle \tag{1070} $$
On assuming the approximate form of a decaying wave packet
$$ \langle \psi_n(0) | \psi_n(t) \rangle = \exp\left[ - \frac{i}{\hbar} E_n t - \frac{|t|}{2 \tau_n} \right] \tag{1071} $$
where the decay includes transitions to all possible final states, one finds
$$
\begin{aligned}
\rho_n(E) &= \frac{1}{2 \pi i} \left[ \frac{1}{E - E_n - i \frac{\hbar}{2 \tau_n}} - \frac{1}{E - E_n + i \frac{\hbar}{2 \tau_n}} \right] \\
&= \frac{1}{\pi} \, \frac{\frac{\hbar}{2 \tau_n}}{( E - E_n )^2 + ( \frac{\hbar}{2 \tau_n} )^2}
\end{aligned}
\tag{1072}
$$
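The step from the decaying amplitude of eqn (1071) to the Lorentzian of eqn (1072) can be verified numerically. In the sketch below the Fourier integral of eqn (1070) is reduced to a one-sided cosine integral (which assumes the decay is symmetric in $|t|$) and is evaluated with a simple midpoint rule; the units $\hbar = 1$, the parameter values and the function names are assumptions made for illustration.

```python
import math


def lorentzian(E, E_n, tau, hbar=1.0):
    """Closed form of eqn (1072) with half-width hbar / (2 tau)."""
    gamma = hbar / (2.0 * tau)
    return gamma / math.pi / ((E - E_n) ** 2 + gamma**2)


def rho_from_fourier(E, E_n, tau, hbar=1.0, t_max=200.0, n_steps=200000):
    """Eqn (1070) for the amplitude of eqn (1071), by midpoint integration.

    The two-sided integral over exp(-|t| / (2 tau)) equals twice the
    cosine integral over t > 0, which is what is summed here.
    """
    h = t_max / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        total += math.cos((E - E_n) * t / hbar) * math.exp(-t / (2.0 * tau))
    return total * h / (math.pi * hbar)


E_n, tau = 0.0, 1.0  # illustrative values in units with hbar = 1
checks = [(E, rho_from_fourier(E, E_n, tau), lorentzian(E, E_n, tau))
          for E in (0.0, 0.5, 2.0)]
for E, numeric, exact in checks:
    print(E, numeric, exact)
```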
This can only be an approximate form of the energy-distribution since the energy must be bounded from below. The existence of a lower bound to the energy distribution implies that the width of the electronic energy level, $\frac{\hbar}{2 \tau_n} = \Gamma_n$, has to be energy-dependent, as this must become zero below a threshold energy. However, it should be noted that the width of the energy-distribution will determine the approximate exponential decay. Since the perturbations introduce an energy-dependent width to the wave packet, causality requires that the energy-shift $\Delta E_n$ should also be energy-dependent. Hence, the effects of the perturbation (such as the energy-shift and lifetime) should be described in
56 V.
(1073)
The energy-dependent self-energy appears most naturally if one uses Brillouin-Wigner perturbation theory to calculate the correction to an approximate energy $E_n$. From second-order Brillouin-Wigner perturbation theory, one finds that the energy-dependent self-energy, when evaluated just above the real $E$ axis, is given by
$$ \Sigma_n(E + i \eta) = \sum_{n'} \frac{| \langle n' | \hat{H}_I | n \rangle |^2}{E + i \eta - E_{n'}} \tag{1074} $$
This complex self-energy has a real and imaginary part. The imaginary part can be thought of as occurring via amplification of the infinitesimal imaginary term $i \eta$ in the denominator, and can be seen to be non-zero when the energy $E$ of the component in the wave packet falls in the region where the spectral density of the approximate $E_{n'}$ is finite. Hence, since the $E_{n'}$ are bounded from below, then so is the energy-distribution $\rho_n(E)$ since
$$ \rho_n(E) = - \frac{1}{\pi} \, \frac{\Im m \, \Sigma_n(E + i \eta)}{( E - E_n - \Re e \, \Sigma_n(E) )^2 + ( \Im m \, \Sigma_n(E + i \eta) )^2} \tag{1075} $$
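The real part of the self-energy must also be energy-dependent, since it is related to the imaginary part via the Kramers-Kronig relations
$$ \Re e \, \Sigma_n(E) = - \frac{1}{\pi} \, Pr \int_{-\infty}^{+\infty} dz \; \frac{\Im m \, \Sigma_n(z + i \eta)}{E - z} \tag{1076} $$
$$ \Im m \, \Sigma_n(E + i \eta) = + \frac{1}{\pi} \, Pr \int_{-\infty}^{+\infty} dz \; \frac{\Re e \, \Sigma_n(z)}{E - z} \tag{1077} $$
Hence, the real part of the self-energy is also energy-dependent. The Kramers-Kronig relation is an expression of causality.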
Since the electronic states in the expression for the Fermi-Golden rule decay rate
$$ \frac{1}{\tau_{nl \to n'l'}} = \frac{2 \pi}{\hbar} \, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2 \; \delta( E_{n'l'} - E_{nl} - \hbar \omega ) \tag{1078} $$
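conservation for the components of the wave packets. Hence, the decay rate should be written as the convolution
$$
\begin{aligned}
\frac{1}{\tau_{nl \to n'l'}} &= \frac{2 \pi}{\hbar} \, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2 \int dE' \, \rho_{n'l'}(E') \int dE \, \rho_{nl}(E) \; \delta( E' - E - \hbar \omega ) \\
&= \frac{2 \pi}{\hbar} \, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2 \int dE \; \rho_{n'l'}( E + \hbar \omega ) \, \rho_{nl}(E)
\end{aligned}
\tag{1079}
$$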
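We shall use the approximation for the energy distributions suggested by eqn (1072). In this case, the convolution is evaluated by contour integration as
$$ \frac{1}{\tau_{nl \to n'l'}} = \frac{2}{\hbar} \, | \langle n'l'm' | \hat{H}_I | nlm \rangle |^2 \; \frac{\frac{\hbar}{2 \tau_{nl}} + \frac{\hbar}{2 \tau_{n'l'}}}{( \hbar \omega + E_{nl} - E_{n'l'} )^2 + ( \frac{\hbar}{2 \tau_{nl}} + \frac{\hbar}{2 \tau_{n'l'}} )^2} \tag{1080} $$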
since only the terms with poles on the opposite sides of the realaxis yield
nonzero contributions. From this, one can show that the optical absorption
crosssection is given by
absorb () =
4
3
e2
h c
X
n0 l0 ,nl (
1
2 n0 l0
( n0 l0 ,nl )2 + (
(1081)
which was first derived by Weisskopf and Wigner57 . Hence, the natural width
is given by the average of the decay rates for the initial and final electronic
states. This leads to the conclusion that even weak lines can be broad, if the
final electronic state has a short lifetime.
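The statement that the widths of the initial and final states add can be checked by evaluating the convolution in eqn (1079) numerically for two Lorentzian distributions of the form of eqn (1072) and comparing with a single Lorentzian whose half-width is the sum of the two. The sketch below works in arbitrary energy units; the half-widths, the energy offset and the function names are assumptions chosen only for illustration.

```python
import math


def lorentzian(x, gamma):
    """Normalized Lorentzian of half-width gamma, as in eqn (1072)."""
    return gamma / math.pi / (x * x + gamma * gamma)


def convolve_at(shift, gamma_initial, gamma_final,
                half_range=400.0, n_steps=400000):
    """Midpoint estimate of the integral over E of
    rho_final(E + shift) * rho_initial(E), cf. eqn (1079)."""
    h = 2.0 * half_range / n_steps
    total = 0.0
    for i in range(n_steps):
        E = -half_range + (i + 0.5) * h
        total += lorentzian(E + shift, gamma_final) * lorentzian(E, gamma_initial)
    return total * h


gamma_initial, gamma_final = 0.5, 0.3  # assumed half-widths hbar / (2 tau)
shift = 1.0                            # assumed offset from the line centre

numeric = convolve_at(shift, gamma_initial, gamma_final)
exact = lorentzian(shift, gamma_initial + gamma_final)  # widths add, eqn (1080)
print(numeric, exact)
```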
11.3
n0 l0 m0
Renormalization
1
2 nl
1
2 n0 l0
)
1
2 nl
)2
11.3.1
The zero-point energy of the electromagnetic field can lead to measurable effects. In general relativity, the total energy including the zero-point energy of the electromagnetic radiation is the source for the gravitational field. The Casimir effect58 shows that the zero-point energy of the electromagnetic radiation produces a force on the walls of the cavity. We shall consider a cubic volume $V = L^3$ which is enclosed by conducting walls that act as a cavity for the electromagnetic radiation. This volume is divided into two by a metallic partition, which is located at a distance $d$ from one side of the cavity. We shall evaluate the total energy of this configuration and then deduce the form of the interaction between the partition and the walls of the cavity.

Figure 34: The geometry of the partitioned electromagnetic cavity used to consider the Casimir effect.
We shall consider the total energy due to the zero-point fluctuations in the container. Since the zero-point energy is divergent due to the presence of arbitrarily large frequencies, we shall introduce a convergence factor. The convergence factor can be motivated by the observation that, in matter, electromagnetic radiation becomes exponentially damped at large frequencies. Hence, one can write
$$ E = \frac{1}{2} \sum_{k,\alpha} \hbar \, \omega_{k,\alpha} \, \exp\left[ - \lambda \, \frac{\omega_{k,\alpha}}{c} \right] \tag{1082} $$
58 H.
(1083)
for $i = x, y$, where the $n_i$ are positive integers. The boundary condition for the remaining two boundaries leads to the restriction
$$ k_z \, d = n_z \, \pi \tag{1084} $$
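The energy of the radiation in one part of the partition can be expressed as
$$ E_d = \hbar c \, \frac{L^2}{\pi^2} \sum_{n_z=1}^{\infty} \int_0^{\infty} dk_x \int_0^{\infty} dk_y \; \sqrt{k_x^2 + k_y^2 + \left( \frac{n_z \pi}{d} \right)^2} \; \exp\left[ - \lambda \sqrt{k_x^2 + k_y^2 + \left( \frac{n_z \pi}{d} \right)^2} \, \right] \tag{1085} $$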
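where the two polarizations have been summed over. The integration has cylindrical symmetry but only extends over the quadrant with positive $k_x$ and $k_y$; therefore, it shall be re-written as
$$ E_d = \hbar c \, \frac{2 \pi L^2}{4 \pi^2} \sum_{n_z=1}^{\infty} \int_0^{\infty} dk \; k \, \sqrt{k^2 + \left( \frac{n_z \pi}{d} \right)^2} \; \exp\left[ - \lambda \sqrt{k^2 + \left( \frac{n_z \pi}{d} \right)^2} \, \right] \tag{1086} $$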
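or, on changing variable to the dimensionless $\kappa = k^2 \left( \frac{d}{n_z \pi} \right)^2$
$$ E_d = \hbar c \, \frac{L^2}{4 \pi} \sum_{n_z=1}^{\infty} \left( \frac{n_z \pi}{d} \right)^3 \int_0^{\infty} d\kappa \; \sqrt{\kappa + 1} \; \exp\left[ - \lambda \, \frac{n_z \pi}{d} \sqrt{\kappa + 1} \, \right] \tag{1087} $$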
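The factor of $n_z^3$ can be expressed as a third-order derivative of the exponential factor w.r.t. $\lambda$
$$ E_d = - \hbar c \, \frac{L^2}{4 \pi} \sum_{n_z=1}^{\infty} \int_0^{\infty} d\kappa \; \frac{1}{\kappa + 1} \, \frac{\partial^3}{\partial \lambda^3} \exp\left[ - \lambda \, \frac{n_z \pi}{d} \sqrt{\kappa + 1} \, \right] \tag{1088} $$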
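The summation over $n_z$ can be performed, leading to
$$
\begin{aligned}
E_d &= - \frac{L^2}{4 \pi} \, \hbar c \int_0^{\infty} d\kappa \; \frac{1}{\kappa + 1} \, \frac{\partial^3}{\partial \lambda^3} \, \frac{\exp[ - \frac{\lambda \pi}{d} \sqrt{\kappa + 1} \, ]}{1 - \exp[ - \frac{\lambda \pi}{d} \sqrt{\kappa + 1} \, ]} \\
&= - \frac{L^2}{4 \pi} \, \hbar c \int_0^{\infty} d\kappa \; \frac{1}{\kappa + 1} \, \frac{\partial^3}{\partial \lambda^3} \, \frac{1}{\exp[ \frac{\lambda \pi}{d} \sqrt{\kappa + 1} \, ] - 1}
\end{aligned}
\tag{1089}
$$
Let $t = \sqrt{\kappa + 1}$ so
$$ E_d = - \frac{2 L^2}{4 \pi} \, \hbar c \int_1^{\infty} \frac{dt}{t} \; \frac{\partial^3}{\partial \lambda^3} \, \frac{1}{\exp[ \frac{\lambda \pi}{d} \, t \, ] - 1} \tag{1090} $$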
t
] 1
d
(1092)
therefore
Ed
h c L2 2
2 d 2
Z
s0
ds
s2
(1093)
] 1
d
d
Ed =
2 d 2 s0
d
h c L2 2
=
2 d 2 exp[ d ] 1
2
h c L2 2
d
d
=
2 d 2
exp[ d ] 1
(1094)
(1095)
X
x
xn
=
Bn
exp[x] 1
n!
n=0
(1096)
h c L2
n4
2 X Bn (n 2)(n 3)
d3 n=0
n!
d
(1097)
with n = 4 remains finite in the limit 0 and all the higherorder terms
vanish in this limit. Explicitly, one has
2 B1 d
2 B4 2
h c L2 6 B0 d2
Ed =
+
+
+
O()
(1098)
2d
2 4
3
4! d2
The first term in the energy is proportional to $L^2 d$, which is the volume of the cavity, and the second term is proportional to $L^2$, the surface area of the walls. The third term is independent of the cut-off and the higher-order terms vanish in the limit $\lambda \to 0$.
The Casimir force is the force between two planes, which originates from the energy of the field59. This energy can be separated out into a volume part and parts due to the creation of the surfaces and an interaction energy between the surfaces. In order to eliminate both the volume dependence of the energy and the surface energies, we are considering two configurations of the partitions in the cavity. In one configuration the plane divides the volume into two unequal volumes $d \, L^2$ and $(L - d) \, L^2$, and the other configuration is a reference configuration where the cavity is partitioned into two equal volumes $L^3 / 2$. The difference of energies for these configurations is given by
$$ \Delta E = E_d + E_{L-d} - 2 \, E_{L/2} \tag{1099} $$
which reduces to
$$ \lim_{L \gg d, \; \lambda \to 0} \Delta E = - \frac{\pi^2 \, \hbar c \, L^2}{720 \, d^3} \tag{1100} $$
The corresponding force is
$$ F = - \frac{\pi^2}{240} \, \hbar c \, \frac{L^2}{d^4} \tag{1101} $$
The force is proportional to L2 which is the area of the wall of the cavity. The
predicted force was measured by Sparnaay60 . A more recent experiment involving a similar force between a planar surface and a sphere has achieved greater
accuracy61 .
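To get a feeling for the size of the effect, eqns (1100) and (1101) can be evaluated in SI units. The sketch below is only illustrative: the plate separation, the plate area, the numerical constants and the function names are assumptions chosen for the example. It also checks that the force is minus the derivative of the interaction energy with respect to the separation.

```python
import math

HBAR = 1.054571817e-34  # J s (assumed SI value)
C = 2.99792458e8        # m / s


def casimir_energy(d, area):
    """Interaction energy of eqn (1100) for plate area `area` (= L^2)."""
    return -math.pi**2 * HBAR * C * area / (720.0 * d**3)


def casimir_force(d, area):
    """Attractive force of eqn (1101)."""
    return -math.pi**2 * HBAR * C * area / (240.0 * d**4)


d = 1.0e-6     # separation in m (illustrative)
area = 1.0e-4  # plate area in m^2, i.e. 1 cm^2 (illustrative)

force = casimir_force(d, area)

# Central finite difference of the energy: F = - dE/dd.
step = 1.0e-10
force_numeric = -(casimir_energy(d + step, area)
                  - casimir_energy(d - step, area)) / (2.0 * step)

pressure = abs(force) / area  # N / m^2
print(force, force_numeric, pressure)
```

For these assumed values the pressure is of the order of $10^{-3}$ N/m$^2$, which indicates why the force is only measurable at sub-micron to micron separations.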
59 Our considerations only include the part of Fock space that corresponds to having zero
numbers of excited quanta. Hence, the Casimir force is due to the properties of the field, and
is not due to the transmission of real particles (photons) between the planes.
60 M. J. Sparnaay, Physica 24, 751 (1958).
61 S. K. Lamoreaux, Phys. Rev. Lett. 78, 5 (1997).
Figure 35: The separation-dependent force between two closely spaced metallic surfaces due to the modification of the zero-point energy. The lower panel shows the difference between the experimental results and the theoretical prediction for the Casimir Force. [After S. K. Lamoreaux, Phys. Rev. Lett. 78, 5, (1997).]
To summarize, the physical quantity is the force or difference in energies
when one wall is moved. When the change in energy is calculated, the difference between the two divergent energies is finite and independent of the choice
of cutoff62 .
Cut-Off Independence
It is the boundary condition and not the cutoff that plays an important role
in the Casimir effect. For simplicity, one can choose zero boundary conditions.
The zeropoint energy of a cylindrical electromagnetic cavity of radius R and
length d can be expressed as the sum
s
2 s
2
Z
h c R2 X
n
nz
z
2
2
Ed = 2
dk k
k +
F
k +
2 ( 2 ) n =1
d
d
z
(1102)
where $F(z)$ is an arbitrary cut-off function (which may depend on an arbitrary parameter which is ultimately going to be set to zero). The cut-off must not affect the low-energy modes, so one can choose $F(0) = 1$ and all the derivatives of $F(z)$ to be zero for finite values of $z$. These assumptions are all in accord with the ideal case of no cut-off function or $F(z) = 1$. The energy can be written as
$$ E_d = \frac{\hbar c \, R^2}{2} \sum_{n_z=1}^{\infty} f(n_z) \tag{1103} $$
62 The independence of any cut-off procedure can be shown by evaluating the divergent sums by using the Euler-Maclaurin summation formula.
[Figure: plot of the cut-off function F(z) versus z/N.]
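$$ f(n_z) = \int_0^{\infty} dk \; k \, \sqrt{k^2 + \left( \frac{n_z \pi}{d} \right)^2} \; F\left( \sqrt{k^2 + \left( \frac{n_z \pi}{d} \right)^2} \, \right) \tag{1104} $$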
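$$ \int_0^N dx \; f(x) \approx \frac{1}{2} f(0) + \sum_{n=1}^{N-1} f(n) + \frac{1}{2} f(N) \tag{1105} $$
$$
\begin{aligned}
\int_0^N dx \; f(x) = \; & \frac{1}{2} f(0) + \sum_{n=1}^{N-1} f(n) + \frac{1}{2} f(N) \\
& + \frac{B_2}{2!} \left( f^{(1)}(0) - f^{(1)}(N) \right) + \frac{B_4}{4!} \left( f^{(3)}(0) - f^{(3)}(N) \right) + \ldots
\end{aligned}
\tag{1106}
$$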
We shall assume that $f(n)$ and all its derivatives vanish in the limit of large $n$, $\lim_{N \to \infty} f(N) \to 0$, due to the behavior of the cut-off function. The
d
d2
0
k 2 + ( n )2
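since the derivatives of $F(z)$ all vanish for finite $z$. The integration over the variable $k$ is re-expressed in terms of an integration over the variable $z$, defined by
$$ z = \sqrt{k^2 + \left( \frac{n \pi}{d} \right)^2} \tag{1108} $$
so
$$ dz = dk \; \frac{k}{\sqrt{k^2 + \left( \frac{n \pi}{d} \right)^2}} \tag{1109} $$
so that
$$ f^{(1)}(n) = - \frac{\pi^3 n^2}{d^3} \tag{1110} $$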
In deriving the above expression, the condition that the first-order derivative of $F(z)$ vanishes for finite $z$ has been used. It immediately follows that
$$ f^{(2)}(n) = - \frac{2 \pi^3 n}{d^3} \tag{1111} $$
and
$$ f^{(3)}(n) = - \frac{2 \pi^3}{d^3} \tag{1112} $$
and all higher-order derivatives vanish. Hence, one finds that at $z = 0$ all the $m$th order derivatives $f^{(m)}(0)$ vanish, except for $m = 3$ which is given by
$$ f^{(3)}(0) = - \frac{2 \pi^3}{d^3} \tag{1113} $$
Hence, on evaluating the energy of the cylindrical cavity (and using the zero boundary conditions), one finds that the energy is composed of the sum of an integral and a finite number of other terms. The integral part of the expression only depends on the volume of the cavity and is proportional to a divergent integral, and hence drops out when the energy differences are taken. The only terms that yield non-zero contributions to the energy difference originate with $f^{(3)}(0)$ and depend on $d$. It is these terms that give rise to the Casimir force. This approach also shows that any particular choice made for the cut-off is irrelevant.
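The cut-off independence can be illustrated numerically. The sketch below evaluates $\frac{1}{2} f(0) + \sum_{n \geq 1} f(n) - \int_0^\infty dx \, f(x)$ for the function of eqn (1104), rewritten as $f(n) = \int_{n\pi/d}^{\infty} dz \, z^2 F(z)$, using two different cut-off functions. The choice of units $d = \pi$, the two specific cut-offs (exponential and Gaussian), the cut-off parameter value and all function names are assumptions made only for this illustration. According to eqn (1106) with $f^{(1)}(0) = 0$ and $f^{(3)}(0) = -2$ in these units, both choices should approach $-(B_4/4!) \, f^{(3)}(0) = -1/360$ as the cut-off is removed.

```python
import math


def f_exponential(n, lam):
    """Closed form of the integral from n to infinity of z^2 exp(-lam z)."""
    return math.exp(-lam * n) * (n * n / lam + 2.0 * n / lam**2 + 2.0 / lam**3)


def f_gaussian(n, lam):
    """Closed form of the integral from n to infinity of z^2 exp(-(lam z)^2)."""
    return (n * math.exp(-(lam * n) ** 2) / (2.0 * lam**2)
            + math.sqrt(math.pi) * math.erfc(lam * n) / (4.0 * lam**3))


def finite_part(f, integral, lam, n_max=4000):
    """Trapezoidal-type sum minus the integral of f over 0 to infinity."""
    total = math.fsum(f(n, lam) for n in range(1, n_max + 1))
    return 0.5 * f(0, lam) + total - integral


lam = 0.05  # assumed small cut-off parameter (units with d = pi)

# The integral of f over n from 0 to infinity equals the integral of
# z^3 F(z) over z, which is 6 / lam^4 and 1 / (2 lam^4) respectively.
part_exponential = finite_part(f_exponential, 6.0 / lam**4, lam)
part_gaussian = finite_part(f_gaussian, 0.5 / lam**4, lam)

print(part_exponential, part_gaussian, -1.0 / 360.0)
```

Restoring the units, the finite part is $-\pi^3/(360 \, d^3)$, so that eqn (1103) gives an interaction energy of $-\pi^2 \hbar c \, (\pi R^2) / (720 \, d^3)$, in agreement with eqn (1100) when $L^2$ is replaced by the area $\pi R^2$.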
Mathematical Interlude:
The Euler-Maclaurin Summation Formula.
The Euler-Maclaurin formula allows one to accurately evaluate the difference of finite summations and their approximate evaluations in the form of integrals.
The Euler-Maclaurin Formula
If $N$ is an integer and $f(x)$ is a smooth differentiable function defined for all real values of $x$ between 0 and $N$, then the summation
$$ S = \sum_{n=1}^{N-1} f(n) \tag{1114} $$
can be approximated by the integral
$$ I = \int_0^N dx \; f(x) \tag{1115} $$
(1116)
The Euler-Maclaurin formula provides expressions for the difference between the sum and the integral in terms of the higher derivatives $f^{(n)}$ at the end points of the interval, 0 and $N$. For any integer $p$, one has
$$ S + \frac{1}{2} \left( f(0) + f(N) \right) - I = \sum_{n=1}^{p} \frac{B_{2n}}{(2n)!} \left( f^{(2n-1)}(N) - f^{(2n-1)}(0) \right) + R \tag{1117} $$
where $B_1 = -1/2$, $B_2 = 1/6$, $B_3 = 0$, $B_4 = -1/30$, $B_5 = 0$, $B_6 = 1/42$, $B_7 = 0$, $B_8 = -1/30$, ... are the Bernoulli numbers, and $R$ is an error term which is normally small if the series on the right is truncated at a suitable value of $p$.
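Eqn (1117) can be tested on a function whose sum, integral and derivatives are all known in closed form. The sketch below uses $f(x) = \exp[-a x]$ truncated at $p = 3$; the choice of test function, the parameter values and the function name are assumptions made for illustration.

```python
import math
from fractions import Fraction

# Even Bernoulli numbers needed for p = 3 in eqn (1117).
BERNOULLI_EVEN = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}


def euler_maclaurin_check(a=0.1, N=50, p=3):
    """Return both sides of eqn (1117), without R, for f(x) = exp(-a x)."""

    def f(x):
        return math.exp(-a * x)

    def derivative(k, x):
        # k-th derivative of exp(-a x)
        return (-a) ** k * math.exp(-a * x)

    S = math.fsum(f(n) for n in range(1, N))  # sum over n = 1 .. N-1
    I = (1.0 - math.exp(-a * N)) / a          # exact integral from 0 to N
    lhs = S + 0.5 * (f(0) + f(N)) - I
    rhs = sum(
        float(BERNOULLI_EVEN[2 * n]) / math.factorial(2 * n)
        * (derivative(2 * n - 1, N) - derivative(2 * n - 1, 0))
        for n in range(1, p + 1)
    )
    return lhs, rhs


lhs, rhs = euler_maclaurin_check()
print(lhs, rhs, lhs - rhs)  # the difference is the remainder R
```

For this slowly varying test function the remainder is many orders of magnitude smaller than the individual correction terms.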
The Remainder Term
The remainder $R$ when the series is truncated after $p$ terms is given by
$$ R = ( - 1 )^p \int_0^N dx \; f^{(p+1)}(x) \, \frac{P_{p+1}(x)}{(p + 1)!} \tag{1118} $$
where $P_n(x) = B_n( x - [x] )$ are the periodic Bernoulli polynomials. The remainder term can be estimated as
Z N
2
dx  f 2p1 (x) 
(1119)
R
(2)p 0
Derivation by Induction
First we shall examine the properties of the Bernoulli polynomials and the Bernoulli numbers. Then we shall indicate how the Euler-Maclaurin formula can be obtained by induction.
The Bernoulli polynomials $B_n(x)$, for $n = 0, 1, 2, \ldots$ are defined by the generating function expansion
$$ G(z, x) = \frac{z \, e^{z x}}{e^z - 1} = \sum_{n=0}^{\infty} B_n(x) \, \frac{z^n}{n!} \tag{1120} $$
and, for $x = 0$,
$$ \frac{z}{e^z - 1} = \sum_{n=0}^{\infty} B_n \, \frac{z^n}{n!} \tag{1121} $$
where $B_n$ are the Bernoulli constants. Hence, the Bernoulli constants are the Bernoulli polynomials evaluated at $x = 0$, i.e. $B_n(0) = B_n$. Furthermore, on differentiating the generating function w.r.t. $x$, one finds
$$ \frac{\partial G(z, x)}{\partial x} = z \, G(z, x) \tag{1122} $$
$$ \sum_{n=0}^{\infty} \frac{\partial B_n(x)}{\partial x} \, \frac{z^n}{n!} = z \sum_{n=0}^{\infty} B_n(x) \, \frac{z^n}{n!} \tag{1123} $$
On equating the coefficients of $z^n$ in the above equation, one obtains the important relation
$$ \frac{\partial B_n(x)}{\partial x} = n \, B_{n-1}(x) \tag{1124} $$
Therefore, by integration it is easy to show that $B_n(x)$ are polynomials of degree $n$. The first few Bernoulli polynomials can be explicitly constructed from the generating function expansion. The first few polynomials are given by
$$
\begin{aligned}
B_0(x) &= 1 \\
B_1(x) &= x - \frac{1}{2} \\
B_2(x) &= x^2 - x + \frac{1}{6} \\
B_3(x) &= x^3 - \frac{3}{2} x^2 + \frac{1}{2} x \\
B_4(x) &= x^4 - 2 x^3 + x^2 - \frac{1}{30} \\
B_5(x) &= x^5 - \frac{5}{2} x^4 + \frac{5}{3} x^3 - \frac{1}{6} x \\
& \ldots
\end{aligned}
\tag{1125}
$$
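The list in eqn (1125) can be reproduced with exact rational arithmetic. The sketch below generates the Bernoulli numbers from the standard recurrence $\sum_{k=0}^{m} \binom{m+1}{k} B_k = 0$ (a well-known identity that is not derived in these notes, and which yields $B_1 = -1/2$), builds $B_n(x) = \sum_k \binom{n}{k} B_{n-k} \, x^k$, and checks the derivative relation of eqn (1124). The function names are assumptions introduced for the example.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Bernoulli numbers B_0 .. B_n_max (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def bernoulli_poly(n):
    """Coefficients of B_n(x) in ascending powers of x."""
    B = bernoulli_numbers(n)
    return [comb(n, k) * B[n - k] for k in range(n + 1)]


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate a polynomial given by ascending coefficients."""
    return sum(c * x**k for k, c in enumerate(coeffs))


for n in range(6):
    print(n, bernoulli_poly(n))
```

For example, `bernoulli_poly(4)` returns the coefficients of $x^4 - 2x^3 + x^2 - 1/30$ in ascending order.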
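From the generating function expansion, one can show that the Bernoulli polynomials are either even or odd functions of $x - \frac{1}{2}$. The generating function can be expressed as
$$ G(z, x) = e^{z ( x - \frac{1}{2} )} \, \frac{z}{e^{\frac{z}{2}} - e^{- \frac{z}{2}}} = \sum_{n=0}^{\infty} B_n(x) \, \frac{z^n}{n!} \tag{1126} $$
where the second factor is an even function of $z$; thus, the generating function is invariant under the combined transformation $z \to - z$ and $( x - \frac{1}{2} ) \to - ( x - \frac{1}{2} )$. Therefore, one has
$$ \sum_{n=0}^{\infty} B_n\left( \frac{1}{2} + x - \frac{1}{2} \right) \frac{z^n}{n!} = \sum_{n=0}^{\infty} B_n\left( \frac{1}{2} - x + \frac{1}{2} \right) ( - 1 )^n \, \frac{z^n}{n!} \tag{1127} $$
so the polynomials satisfy
$$ B_n(x) = ( - 1 )^n \, B_n( 1 - x ) \tag{1128} $$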
(1129)
The generating function with $x = 0$ can be re-written as the sum of its even and odd parts
$$ G(z, 0) = \frac{\frac{z}{2}}{\tanh \frac{z}{2}} - \frac{z}{2} = \sum_{n=0}^{\infty} B_n(0) \, \frac{z^n}{n!} \tag{1130} $$
The even part has only even terms in its Taylor expansion, and there is only one term in the odd part. Hence, the odd Bernoulli numbers vanish for $n > 1$, i.e. $B_{2n+1}(0) = 0$ for $n > 0$. Therefore, for $n \geq 2$, one has $B_n(0) = B_n(1)$. This equality can be used to evaluate the integrals of the Bernoulli polynomial over the range from 0 to 1. On expressing the integral of $B_n(x)$ in terms of $B_{n+1}(x)$, one has
$$
\begin{aligned}
\int_0^1 dx \; B_n(x) &= \frac{1}{( n + 1 )} \int_0^1 dx \; \frac{\partial B_{n+1}(x)}{\partial x} \\
&= \frac{B_{n+1}(1) - B_{n+1}(0)}{( n + 1 )} \\
&= 0 \quad \text{for } n \geq 1
\end{aligned}
\tag{1131}
$$
Hence, the Bernoulli polynomials may be defined recursively via the relation
$$ \frac{\partial B_n(x)}{\partial x} = n \, B_{n-1}(x) \tag{1132} $$
(1133)
(1134)
where $[x]$ is the integral part of $x$. This definition of $P_n(x)$ reduces to the Bernoulli polynomials on the interval (0, 1) since $[x] = 0$ in this interval. The functions $P_n(x)$ are periodic over an extended range of $x$ with period 1.
The Euler-Maclaurin formula can be obtained by mathematical induction. Consider the integral
$$ \int_n^{n+1} dx \; f(x) = \int_n^{n+1} dx \; u \, \frac{\partial v}{\partial x} \tag{1135} $$
with the identification of
$$ u = f(x) \tag{1136} $$
and
$$ \frac{\partial v}{\partial x} = 1 = P_0(x) \tag{1137} $$
since $P_0(x) = 1$. Therefore, on using the recursion relation involving the derivative of the Bernoulli polynomials, one finds that
$$ v = P_1(x) \tag{1138} $$
n
n
n+1
dx
f (x)
P1 (x)
x
(1139)
1
2
(1140)
it has the value of 1/2 at the limits of integration. Hence, the integration reduces
to
$$ \int_n^{n+1} dx \; f(x) = \frac{f(n + 1) + f(n)}{2} - \int_n^{n+1} dx \; \frac{\partial f(x)}{\partial x} \, P_1(x) \tag{1141} $$
dx f (x) =
Adding
N
X
n=1
f (1) + f (N )
2
f (1) + f (N )
2
Z
f (n) =
+
N
1
X
f (n)
dx
1
n=2
f (x)
P1 (x) (1142)
x
to both sides of the equation and rearranging, one finds
dx f (x) +
f (1) + f (N )
2
Z
+
dx
1
f (x)
P1 (x) (1143)
x
The last two terms, therefore, give the error when the sum is approximated by an integral. The first correction is simply the end-point correction from the trapezoidal rule, and the second correction has to be evaluated to yield the Euler-Maclaurin formula. The last correction is of the form of an integral which can be expressed in terms of the sum of the integrals
$$ \int_n^{n+1} dx \; f'(x) \, P_1(x) \tag{1144} $$
where the prime refers to the derivative of $f(x)$ w.r.t. $x$. The above expression can be evaluated by integrating by parts. The integrand is re-written as
$$ \int_n^{n+1} dx \; f'(x) \, P_1(x) = \int_n^{n+1} dx \; u \, \frac{\partial v}{\partial x} \tag{1145} $$
where one identifies the two factors as
$$ u = f'(x) \tag{1146} $$
$$ \frac{\partial v}{\partial x} = P_1(x) \tag{1147} $$
dx f 00 (x) P2 (x)
2
2 n
n
(1149)
11.3.2
The Lamb shift is a shift between the energy levels of the $2\,^2S_{1/2}$ and the $2\,^2P_{1/2}$ levels of hydrogen from the predictions of the Dirac equation, as they have the same $n$ and $j$ values ($j = l \pm s$). The Dirac equation predicts that these two levels should be degenerate. However, these levels were measured by Lamb and Retherford63 who found that the $2\,^2S_{1/2}$ level is higher than the $2\,^2P_{1/2}$ level by 1058 MHz or 0.035 cm$^{-1}$. Bethe explained this in terms of the interaction between the bound electron and the quantized electromagnetic field64. Similar shifts should also occur between the $n\,^2S_{1/2}$ and the $n\,^2P_{1/2}$ levels, but the magnitude of the shifts should be much smaller, as the magnitude varies as $n^{-3}$.
Qualitatively, the electron interacts with the fluctuating electromagnetic field and with the potential due to the nucleus. The zero-point fluctuations cause the electron to deviate from its quantum orbit by an amount given by $\delta r$, so that it experiences a potential given by
$$ V( r + \delta r ) = V(r) + \delta r \cdot \nabla V(r) + \frac{1}{2!} \, ( \delta r \cdot \nabla )^2 \, V(r) + \ldots $$
63 W.
64 H.
$$ V(r) = - \frac{e^2}{r} \tag{1151} $$
the Laplacian is related to a point charge density at the nucleus
$$ \nabla^2 V(r) = 4 \pi \, e^2 \, \delta^3(r) \tag{1152} $$
(1153)
(1154)
Hence, the shift due to the fluctuations in the electron's potential energy occurs primarily at the origin. The effects of the electromagnetic fluctuations on the kinetic energy are not state specific, and can be considered as a uniform shift of all the energy levels, like the electron's rest mass energy $m c^2$. Thus, the relative energy shift of the levels is solely determined by the potential at the origin. Therefore, the states with non-zero angular momenta do not experience the relative energy-shift since the electronic wave functions vanish at the origin. Thus, only the 2s state experiences a shift but the 2p state is unshifted.
The magnitude of the Lamb shift can be ascertained by expressing $\delta r$ in terms of the zero-point fluctuations in the electromagnetic field65. If it is assumed that the electron is bound to the atom harmonically, $\delta r$ is determined from the equation of motion
$$ \delta \ddot{r} + \omega_0^2 \, \delta r = \frac{q}{m} \, E \tag{1155} $$
where the electric field $E$ has components that are fluctuating with wave vector $k$ or equivalently with frequency $\omega$. This has the result that the position fluctuates
$$ \delta r(t) \approx \frac{q}{m} \, E_\omega \, \frac{1}{\omega_0^2 - \omega^2} \tag{1156} $$
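where $E_\omega$ is the Fourier component of the fluctuating electric field. Hence, the $\omega$ component of the mean squared fluctuation66 in the particle's position is given by
$$ \langle \, | \, \delta r_\omega^2 \, | \, \rangle = \left( \frac{q}{m} \right)^2 \langle \, | \, \frac{E_\omega^2}{( \omega_0^2 - \omega^2 )^2} \, | \, \rangle \tag{1157} $$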
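On approximating the electromagnetic energy associated with the fluctuating electromagnetic field $\langle \, | E_\omega^2 | \, \rangle$ by half the sum of the zero-point energies of the photon modes, one has
$$ \frac{V}{8 \pi} \, \langle \, | \, E_\omega^2 \, | \, \rangle = 2 \times \frac{1}{4} \, \hbar \omega \tag{1158} $$
where the factor 2 represents the two types of polarization of the normal modes. Therefore, on summing over the normal modes, one finds that the mean squared deviation of the electron's trajectory from the classical orbit is proportional to
$$
\begin{aligned}
& \frac{V}{( 2 \pi c )^3} \int d\Omega \int_0^{\infty} d\omega \; \omega^2 \, \langle \, | \, \frac{E_\omega^2}{( \omega_0^2 - \omega^2 )^2} \, | \, \rangle \\
&= \frac{4 \pi \hbar}{V} \, \frac{V}{( 2 \pi c )^3} \int d\Omega \int_0^{\infty} d\omega \; \frac{\omega^3}{( \omega_0^2 - \omega^2 )^2}
\end{aligned}
\tag{1159}
$$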
The integration over $\omega$ can be approximated as
$$ \int_{\omega_0}^{\frac{m c^2}{\hbar}} \frac{d\omega}{\omega} = \ln \frac{m c^2}{\hbar \, \omega_0} \tag{1160} $$
where an upper and lower cutoff have been introduced to prevent the integral
from diverging67 . The expectation value of the second derivative of the potential
for the 2s state is given by
<  2 V  > = 4 e2
1
a3
(1161)
where the second factor represents the 2s electron density at the origin. The corresponding factor for an ns level is expected to vary proportionally to $n^{-3}$. Combining the above expressions, one finds that the 2s level is shifted by an energy given by
E2s
4
=
2
e2
h c
3
m e4
h2
ln
m c2
h 0
(1162)
66 The average squared fluctuation of the electromagnetic field should, in principle, be calculated as an average over a volume in time and space which encompasses the electron's trajectory.
67 The upper limit can be considered as being determined by the spatial dimension of the volume over which the electromagnetic fluctuations are being averaged. The divergence at the lower limit of integration is unphysical and is caused by the neglect of the lifetimes of the electronic states. The inclusion of the lifetimes results in the integrand being finite at the resonance frequency $\omega_0$.
where the frequency of the electron's orbit ω_0 has been chosen as a lower cutoff on the frequency of the electromagnetic fluctuations. Since the logarithmic factor is approximately given by 2 ln( ħ c / e² ), one can see that the above estimate is consistent with the Lamb shift having a magnitude of approximately 4.372 × 10^{-6} eV.
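As a rough numerical check of the logarithmic factor quoted above, the following sketch evaluates 2 ln( ħ c / e² ). It assumes that the lower cutoff energy ħ ω_0 is of the order of m e⁴ / ħ² = α² m c², with α = e² / ( ħ c ); the function name and the rounded value of α are illustrative choices, not part of the notes.

```python
import math

# Fine-structure constant e^2 / (hbar c) in Gaussian units (rounded value).
ALPHA = 1.0 / 137.035999


def welton_log_factor(alpha: float = ALPHA) -> float:
    """Return ln(m c^2 / (hbar omega_0)) for hbar*omega_0 = alpha^2 m c^2.

    With this (assumed) choice of the lower cutoff the logarithm reduces to
    ln(1 / alpha^2) = 2 ln(hbar c / e^2).
    """
    return 2.0 * math.log(1.0 / alpha)


if __name__ == "__main__":
    print(f"2 ln(hbar c / e^2) = {welton_log_factor():.3f}")
```

The factor is of order ten, so the size of the estimate is controlled mainly by the powers of e² / ( ħ c ) in the prefactor rather than by the precise choice of cutoffs.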
11.3.3
The corrections to the energy of a free electron due to its coupling to the electromagnetic field are to be considered^68. It shall be assumed that the electromagnetic field is in the ground state | {0} >, and the energy of an electron in a state with momentum q will be evaluated via perturbation theory.
The lowest-order correction to the electron's energy comes from the diamagnetic interaction. From first-order perturbation theory, one finds the correction
\[ \Delta E_q^{(1)} = \langle q \{0\} | \hat{H}_{dia} | q \{0\} \rangle \tag{1163} \]
Figure 38: The first-order correction to the rest mass of the electron due to the diamagnetic interaction.
On using a plane-wave to represent the electronic wave function
\[ \psi_q(r) = \frac{1}{\sqrt{V}} \exp\left[ i \, q \cdot r \right] \tag{1164} \]
then the firstorder change in the electrons energy due to the coupling to the
field is given by
X
2 h c2 (k) . 0 (k 0 )
e2
(1)
Eq
=
< {0}  ak0 ,0 ak,  {0} >
2 m c2
V
k k 0
0
0
k,,k ,
Z
1
k k 0
2 m c2
V
0
k,,k,
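since the electronic matrix elements give rise to the condition of conservation of momentum. Hence, the correction to the energy is found as
\[ \Delta E_q^{(1)} = \frac{e^2}{2 m c^2} \, \frac{V}{( 2 \pi )^3} \, 2 \int d^3k \, \frac{2 \pi \hbar c}{V} \, \frac{1}{k} \]
\[ = \frac{e^2}{2 m c^2} \, \frac{V}{( 2 \pi )^3} \, 8 \pi \int_0^\infty dk \, k \, \frac{2 \pi \hbar c}{V} \]
\[ = \frac{e^2 \hbar}{\pi m c} \int_0^\infty dk \, k \tag{1166} \]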
which diverges. This contribution is independent of the electron's momentum q, and since k = k' it can be seen that the contribution of the diamagnetic interaction to first order is independent of the quantum state of the electron. This contribution to the electron's energy can be lumped together with the electron's rest energy m c². However, since the corrections are being evaluated for non-relativistic electrons, for which the rest energy is not observable, it is customary to ignore the rest energy and, therefore, this correction shall no longer be considered.
The paramagnetic interaction when taken to secondorder also yields a correction to the electrons selfenergy. This correction can be considered to be due
(k,)
qk
q
q ,k,
(1167)
where  q 0 1k, > is a onephoton intermediate state of the electronphoton
system. We assume that the process does not conserve energy, so that the
(1171)
(1172)
h q 2 ( 1 cos2 )
e2
h
=
dk k
d sin h 2 q k
2
2
2
2m c
cos h k h c k
0
0
m
2 m
(1173)
69 The completeness relation merely expresses the fact that any vector in a threedimensional
space can be expressed in terms of the components along three orthogonal directions ei
A =
3
X
Ai ei
i=1
ei ei
( 1 cos2 )
(2)
Eq
=
dk
d sin
c
2m
hc 0
2 q cos k 2 m
0
h
(1174)
It should be evident that the integral diverges logarithmically at large
divergent part of the integral can be written as
2
Z
Z
h2 q 2
e
2
(2)
2
Eq
d sin ( 1 cos )
2mc
2m
h c 0
h
Z
e2
8
h2 q 2
dk
=
2m
h c 3 2mc
k
h
k. The
dk
k
(1175)
If an upper cutoff 1
+ is introduced, then the correction to the electrons kinetic
energy can be estimated as
2
h2 q 2 8
e
h
(2)
Eq =
ln
(1176)
2 m 3 h c
2 m c +
This shift can be interpreted as a (secondorder) renormalization of the electrons
mass from the unrenormalized mass to the physical mass m
2
1
1
8
e
h
=
1
ln
+
.
.
.
(1177)
m
m
3 h c
2 m c +
It is the renormalized mass m which would be determined by an experiment.
11.3.4
(1178)
On summing over the polarizations using the completeness relation, one obtains
2
Z
Z
X  < n0 l0 m0  p
 nlm > 2 ( 1 cos2 k )
e
hc
(2)
Enlm =
dk
k
d
sin
k
k
m2 c2 ( 2 ) 0
Enlm En0 l0 m0 h k
0
n0 l0 m0
(1179)
where k is the angle subtended between k and the matrix elements of p. The
angular integration can be performed, yielding
2
Z
X  < n0 l0 m0  p
 nlm > 2
2 h c
e
(2)
dk k
Enlm =
2
2
m c
3
Enlm En0 l0 m0 h k
0
n0 l0 m0
(1180)
In the completely nonrelativistic limit, the integration over k can be shown to
be linearly divergent at the upper limit of integration.
Hans Bethe argued70 that, within the same dipole approximation, the correction to the kinetic energy of the electron in the state  nlm > is given by an
expression analogous to that of an electron in a continuum state n
Tn(2) =
2
3
e2
hc
h
mc
2 Z
d
0
X  < n0  p
 n > 2
E n E n0 h
0
(1181)
Since momentum is conserved for continuum states (on average), only the state with n = n' contributes, so the denominator simplifies^71. The expression for the mass renormalization is divergent and is given by
Tn(2)
2
3
4
=
3
e2
hc
e2
hc
h
mc
2 Z
h
m c2
d
0
Z
0
X  < n0  p
 n > 2
h
0
n
d < n  p2  n >
2m
(1182)
where the completeness relation has been used. This expression is valid if n labels either a continuum or a discrete state, since only the mass of the electron is being altered and the expectation value of p̂ is unaltered. Thus, Bethe argued, the kinetic energy of an electron in a bound state which has the physical mass m should be approximated as
< nlm 
p2
 nlm >
2 m
p2
= < nlm 
 nlm >
22m
Z
4
e
h
d < nlm  p2  nlm >
2
3 h c
mc
2m
0
(1183)
p2
+ V (r)
2m
70 H.
(1184)
and the unperturbed energy of the hypothetical state | nlm > is calculated in the non-relativistic Schrödinger theory as
(0)
p2
 nlm > + < nlm  V (r)  nlm >
2m
(1185)
(0)
However, when this is evaluated, the approximate energy Enlm has to be expressed in terms of the observed physical mass m via
(0)
Enlm
p2
= < nlm 
 nlm > + < nlm  V (r)  nlm >
22m
Z
4
e
h
d < nlm  p2  nlm >
+
2
3 h c
mc
2m
0
2
p
= < nlm 
 nlm > + < nlm  V (r)  nlm >
2
2m
Z
e
h
d X  < n0 l0 m0  p  nlm > 2
4
+
3 h c
m c2
2m
0
0 0
0
nlm
(1186)
where the completeness relation was used in obtaining the last line. The second
term in the unperturbed energy is a correction due to the mass renormalization72
which should be combined with the secondorder radiative correction. The total
energy (to secondorder) is given by
Enlm
(0)
(2)
= En,l,m + En,l,m
p2
= < nlm 
 nlm > + < nlm  V (r)  nlm >
2 m
2
2 Z
X  < n0 l0 m0  p  nlm > 2
2
e
h
+
d
3 h c
mc
h
0
n0 l0 m0
2
2 Z
X  < n0 l0 m0  p
 nlm > 2
2
e
h
d
+
3 h c
mc
Enlm En0 l0 m0 h
0
0 0
0
nlm
(1187)
The overall (second-order) shift from the Schrödinger estimate of the energy for the state | nlm > (as calculated with the physical mass) is given by the sum of the last two terms, which is expressed as
2
2 Z
X  < n0 l0 m0  p
 nlm > 2 ( Enlm En0 l0 m0 )
2
e
h
shift =
Enlm
d
3 h c
mc
( Enlm En0 l0 m0 h ) h
0
0 0
0
nlm
(1188)
72 Renormalization is an idea which Bethe attributed to H. A. Kramers. Kramers had
proposed that physical quantities should be expressed in terms of observable quantities, with
all mention of bare quantities removed. Kramers was advocating a classical treatment from
which Bethe created a nonrelativistic quantum treatment.
The integration over is logarithmically divergent, and can be made to converge by introducing an upper cutoff + = c 1 . Therefore, the difference
of the linearly divergent selfenergy of the bound electron and the linearly divergent selfenergy of the free electron is only logarithmically divergent. After
introducing the cutoff, one finds the result
2 X
 < n0 l0 m0  p
 nlm > 2
e
2
shift
Enlm
=
3 h c
m2 c2
n0 l0 m0
hc1 + En0 l0 m0 Enlm
( Enlm En0 l0 m0 ) ln
En0 l0 m0 Enlm
(1189)
If the rest energy of the electron is used as the upper cutoff energy, m c² ≈ 0.5 × 10⁶ eV, and assuming that the averaged logarithm of the electron excitation energy corresponds to an energy of the order of 17.8 Ryd, then the logarithm has a value of about 7.63 and is not sensitive to the precise value of E_{nlm} − E_{n'l'm'} and, therefore, can be taken outside the summation
2
2 h2 c2
e
shift = 2
ln 2 4
Enlm
3
hc
Z e
X  < n0 l0 m0  p
 nlm > 2
( Enlm En0 l0 m0 )
m2 c2
0 0
0
nlm
(1190)
2
0 ]  nlm >(1191)
< nlm  p
 n0 l0 m0 > < n0 l0 m0  [ p
, H
n0 l0 m0
(1192)
On substituting
p = i h
(1193)
and
p2
+ V (r)
2m
into the expression for the matrix elements, one obtains
Z
h2
d3 r nlm (r) . ( V (r) ) nlm (r)
0 =
H
(1194)
(1195)
4 Z e2
3
e2
h c
h2
m2 c2
2 h2 c2
 nlm (0)  ln 2 4 (1199)
Z e
2
Therefore, the Lamb shift only occurs for electrons with l = 0, since electronic wave functions with l ≠ 0 vanish at the origin. The atomic wave function at the position of the nucleus is given by
\[ | \psi_{n00}(0) |^2 = \frac{1}{\pi} \left( \frac{Z}{n a} \right)^3 \tag{1200} \]
This yields Bethe's estimate for the Lamb shift as
2 3 4 4
2 h2 c2
4
e
Z e m
shift
ln 2 4
En00
=
3 n3 h c
Z e
h2
(1201)
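The numbers quoted in this section can be checked with a short calculation. The sketch below assumes that Bethe's estimate has the prefactor ( 4 / ( 3 π n³ ) ) ( e² / ħ c )³ Z⁴ ( m e⁴ / ħ² ), that the logarithm is ln( m c² / ( 17.8 Ryd ) ) with the cutoff m c² ≈ 0.5 × 10⁶ eV used in the text, and that the shift is evaluated for hydrogen ( Z = 1 ); the constants are rounded reference values and the function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.035999        # fine-structure constant e^2 / (hbar c)
RYDBERG_EV = 13.605693          # m e^4 / (2 hbar^2) in eV
RYDBERG_HZ = 3.2898419603e15    # Rydberg energy divided by h, in Hz


def bethe_log(cutoff_ev: float = 0.5e6, excitation_ryd: float = 17.8) -> float:
    """Logarithm of (upper cutoff) / (average excitation energy)."""
    return math.log(cutoff_ev / (excitation_ryd * RYDBERG_EV))


def bethe_shift_mhz(n: int = 2) -> float:
    """Estimated shift of the n s level of hydrogen (Z = 1), in MHz.

    Assumed form: (4 / (3 pi n^3)) alpha^3 (m e^4 / hbar^2) ln(...),
    where m e^4 / hbar^2 equals two Rydbergs.
    """
    prefactor = 4.0 / (3.0 * math.pi * n ** 3)
    hartree_hz = 2.0 * RYDBERG_HZ
    return prefactor * ALPHA ** 3 * hartree_hz * bethe_log() / 1.0e6


if __name__ == "__main__":
    print(f"logarithm      = {bethe_log():.2f}")
    print(f"2s level shift = {bethe_shift_mhz():.0f} MHz")
```

Under these assumptions the logarithm comes out close to 7.63 and the 2s shift close to the 1040 MHz estimate, which suggests that the quoted numbers are mutually consistent.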
The above formula leads to an estimate of 1040 MHz, which is in good agreement with the experimentally determined value^76. The exact relativistic calculation^77 yields the result
2 5
m c2
4
e
shift =
4
2
+ 31 (1202)
En00
Z
m
c
ln
2 h n,n0
3 n3
h c
120
76 W.
77 N.
where the m c² in the logarithm comes from the Dirac theory without invoking any cutoff. The most recent experimentally measured value^78 is 1057.851 MHz, which is in good agreement with the theoretical value of 1057.857 MHz.
11.3.5
Bremsstrahlung
Z e2
r
(1203)
The Hamiltonian of the unperturbed electron is simply the kinetic energy. The incident electron is assumed to have a momentum q and the scattered electron has momentum q', and the cross-section for the scattering process will be calculated via low-order perturbation theory.
Rutherford Scattering
To second order, the scattering cross-section is expressed as Rutherford scattering, which is elastic and, therefore, involves no emission of photons. The
q'
q
qq'
78 G.
4 Z e2
V  q q 0 2
(1205)
V  q q 0 2
0
2
2
V
m
4 Z e2
=
q
d0
h ( 2 )3 h2
V  q q 0 2
(1206)
The denominator in the potential has to be evaluated on the energy shell. On introducing the scattering angle θ' and using the elastic scattering condition
\[ | q - q' |^2 = 2 q^2 ( 1 - \cos\theta' ) = 4 q^2 \sin^2 \frac{\theta'}{2} \tag{1207} \]
one finds
Figure 41: The geometry for Rutherford scattering. For elastic scattering, the magnitude of the initial momentum q is equal to the magnitude of the final momentum q' and the scattering angle is θ'. The momentum transfer has magnitude 2 q sin( θ'/2 ).
d0 Rutherford
2
V
m
q
h ( 2 )3 h2
4 Z e2
V 4 q 2 sin2
2
0
2
d0
(1208)
h q
mV
(1209)
2
0
2
(1210)
Figure 42: The scattering angle dependence of the differential scattering cross-section dσ/dΩ', plotted against θ'/π.
The cross-section diverges at θ' = 0 and is always finite at θ' = π no matter how large q is. The scattering at θ' = π is known as back-scattering, and is caused by the extremely high potential experienced by electrons with very small impact parameters. It was the large cross-section for back-scattering of charged particles from atoms, found by H. Geiger and E. Marsden in 1913^79, that was instrumental in verifying Rutherford's 1911 conjecture^80 that atoms have nuclei which are of very small spatial extent. The divergence in the scattering cross-section at θ' = 0 is due to the long-ranged nature of the Coulomb interaction, which causes electrons to undergo scattering (no matter how slight the scattering is) at arbitrarily large distances from the nucleus.
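The angular behaviour described above can be illustrated numerically. The sketch below assumes only that the Rutherford cross-section is proportional to 1 / | q − q' |⁴ with | q − q' |² = 4 q² sin²( θ'/2 ), and reports the cross-section relative to its back-scattering value; the function names are illustrative.

```python
import math


def momentum_transfer_sq(q: float, theta: float) -> float:
    """|q - q'|^2 for elastic scattering (|q'| = |q|) through the angle theta."""
    return 2.0 * q ** 2 * (1.0 - math.cos(theta))


def rutherford_relative(theta: float) -> float:
    """Differential cross-section divided by its value at theta = pi.

    Because the cross-section scales as 1 / |q - q'|^4, the ratio is
    1 / sin^4(theta / 2), independent of the magnitude of q.
    """
    return 1.0 / math.sin(theta / 2.0) ** 4


if __name__ == "__main__":
    for fraction in (0.05, 0.25, 0.5, 0.75, 1.0):
        theta = fraction * math.pi
        print(f"theta/pi = {fraction:4.2f}   ratio = {rutherford_relative(theta):12.3f}")
```

The ratio grows without bound as θ' → 0 and falls to unity at θ' = π, which is the forward divergence and finite back-scattering discussed in the text.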
Bremsstrahlung
Elastic scattering of electrons by the Coulomb potential is highly unlikely, since it is known from classical electrodynamics that accelerated particles radiate. Hence, it is expected that photons should be emitted in this process. This phenomenon is known as Bremsstrahlung. We shall calculate the Bremsstrahlung scattering cross-section^81 using low-order perturbation theory. The electron is scattered between the free electron eigenstates due to a perturbation which is a linear superposition of the Coulomb interaction with the nucleus and the paramagnetic interaction.
79 H.
The lowest-order probability amplitude describing Bremsstrahlung is a linear superposition of two processes. These are:
(a) Scattering of an electron from the nucleus followed by the emission of a photon. The initial state of the electron is assumed to have momentum q and the final state of the electron has momentum q', while the emitted photon has momentum k. Therefore, from conservation of momentum, the momentum of the electron in the intermediate state is given by q' + k.
(b) Emission of a photon followed by scattering from the nucleus. Conservation of momentum indicates that the intermediate state has momentum given by q − k.
The matrix elements for these second-order processes are given by
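\[ M_a = - \frac{4 \pi Z e^2}{V | q - q' - k |^2} \, \frac{e \hbar}{m c} \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \hat{\epsilon}_\alpha(k) \cdot ( q' + k ) \; \frac{1}{E_q - E_{q'+k} + i \eta} \tag{1211} \]
and
\[ M_b = - \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \, \frac{e \hbar}{m c} \; \hat{\epsilon}_\alpha(k) \cdot ( q - k ) \; \frac{4 \pi Z e^2}{V | q - q' - k |^2} \; \frac{1}{E_q - E_{q-k} - \hbar \omega_k + i \eta} \tag{1212} \]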
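It should be noted that the numerators of the matrix elements simplify because the photons have transverse polarizations
\[ \hat{\epsilon}_\alpha(k) \cdot k = 0 \tag{1213} \]
From the energy-conserving delta function in the expression for the decay rate, one finds
\[ E_q = E_{q'} + \hbar \omega_k \tag{1214} \]
hence the first energy-denominator can be expressed in a similar form to the second
\[ E_q - E_{q'+k} = E_{q'} - E_{q'+k} + \hbar \omega_k \tag{1215} \]
For small k, the energy-denominators can be expanded, yielding
\[ E_{q'} - E_{q'+k} + \hbar \omega_k = \hbar \omega_k - \frac{\hbar^2}{m} \, q' \cdot k - \frac{\hbar^2 k^2}{2 m} \tag{1216} \]
and
\[ E_q - E_{q-k} - \hbar \omega_k = - \hbar \omega_k + \frac{\hbar^2}{m} \, q \cdot k - \frac{\hbar^2 k^2}{2 m} \tag{1217} \]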
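Since the energy of the photon cannot exceed the energy of the initial electron, one must have q > k, so the third term is smaller than the second term. Due to the large magnitude of c compared with the electron velocities ħ q / m, the second and third terms can be neglected. Therefore, the photon energy dominates both the energy-denominators. On substituting the above expressions in the sum of the matrix elements, one finds
\[ M_a + M_b = - \frac{4 \pi Z e^2}{V | q - q' - k |^2} \, \frac{e}{m c} \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \left[ \frac{\hat{\epsilon}_\alpha(k) \cdot q' \, \hbar}{E_{q'} - E_{q'+k} + \hbar \omega_k + i \eta} + \frac{\hat{\epsilon}_\alpha(k) \cdot q \, \hbar}{E_q - E_{q-k} - \hbar \omega_k + i \eta} \right] \]
\[ \approx - \frac{4 \pi Z e^2}{V | q - q' - k |^2} \, \frac{e}{m c} \sqrt{\frac{2 \pi \hbar c^2}{V \omega_k}} \; \frac{\hat{\epsilon}_\alpha(k) \cdot ( q' - q )}{\omega_k} \tag{1218} \]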
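Using this approximation for the matrix elements, the transition rate is given by
\[ \frac{1}{\tau} \approx \frac{2 \pi}{\hbar} \sum_{q'} \sum_{k,\alpha} \left( \frac{e}{m c} \right)^2 \frac{2 \pi \hbar c^2}{V \omega_k} \left( \frac{4 \pi Z e^2}{V | q - q' - k |^2} \right)^2 \left| \frac{\hat{\epsilon}_\alpha(k) \cdot ( q' - q )}{\omega_k} \right|^2 \delta( E_q - E_{q'} - \hbar \omega_k ) \tag{1219} \]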
c
k
h (k) . ( q 0 q ) 2
(1220)
mc
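If the angular distributions of the emitted photon (dΩ_k) and the scattered electron (dΩ') are both measured, the scattering cross-section can be represented as
\[ \left. \frac{d^3\sigma}{d\Omega' \, d\Omega_k \, d\omega_k} \right|_{Brems} = \frac{q'}{q} \left. \frac{d\sigma}{d\Omega'} \right|_{Rutherford} \frac{1}{4 \pi^2 \omega_k} \left( \frac{e^2}{\hbar c} \right) \sum_\alpha \left| \frac{\hbar \, \hat{\epsilon}_\alpha(k) \cdot ( q' - q )}{m c} \right|^2 \tag{1221} \]
where the second factor is the probability of emitting a photon with energy ħ ω_k into solid angle dΩ_k. On summing over the polarization and integrating over the directions of the emitted photon, one obtains
\[ \left. \frac{d^2\sigma}{d\Omega' \, d\omega_k} \right|_{Brems} = \frac{q'}{q} \left( \frac{2 m Z e^2}{\hbar^2 | q - q' |^2} \right)^2 \frac{2}{3 \pi \omega_k} \left( \frac{e^2}{\hbar c} \right) \left| \frac{\hbar ( q' - q )}{m c} \right|^2 \tag{1222} \]
Hence, the scattering rate which includes the emission of a photon of energy ħ ω_k is given by the product of the Rutherford scattering rate with a factor
\[ \frac{q'}{q} \, \frac{2}{3 \pi \omega_k} \left( \frac{e^2}{\hbar c} \right) \left( \frac{2 q \hbar}{m c} \sin\frac{\theta'}{2} \right)^2 \tag{1223} \]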
with finite energy resolution, elastic scattering and very low-energy quasi-elastic scattering processes cannot be distinguished, so it might be expected that the elastic scattering and quasi-elastic scattering divergences should be combined.
The divergences found in the problem of Bremsstrahlung were first considered by Bloch and Nordsieck^84, who showed that the infrared divergences cancel. That is, the infrared divergence does not exist^85. The cancellation was achieved by adding virtual emission processes for Rutherford scattering to the Bremsstrahlung cross-section for the emission of photons of energy less than ħ ω_0, since these processes cannot be distinguished for sufficiently small photon frequencies ω_0. That is, on introducing an infrared cutoff ω_c, one finds that the total inelastic scattering in which a photon with frequency less than ω_0 is emitted is given by
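\[ \left. \frac{d\sigma}{d\Omega} \right|_{Brems} = \left. \frac{d\sigma}{d\Omega} \right|_{Rutherford} \left[ \frac{1}{2 \pi} \left( \frac{e^2}{\hbar c} \right) A \ln \frac{2 \omega_0}{\omega_c} + \ldots \right] + \ldots \tag{1224} \]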
where the factor A depends on the initial and final momentum of the electron. This result is logarithmically divergent as ω_c → 0. On the other hand, to the same order, the elastic scattering cross-section is found as
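\[ \left. \frac{d\sigma}{d\Omega} \right|_{Elastic} = \left. \frac{d\sigma}{d\Omega} \right|_{Rutherford} \left[ 1 + \frac{1}{2 \pi} \left( \frac{e^2}{\hbar c} \right) A \ln \frac{\hbar \omega_c}{m c^2} + \ldots \right] + \ldots \tag{1225} \]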
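Hence, on combining the results, one finds that the quasi-elastic scattering cross-section is given by
\[ \left. \frac{d\sigma}{d\Omega} \right|_{Quasi-Elastic} = \left. \frac{d\sigma}{d\Omega} \right|_{Rutherford} \left[ 1 + \frac{1}{2 \pi} \left( \frac{e^2}{\hbar c} \right) A \ln \frac{2 \hbar \omega_0}{m c^2} + \ldots \right] + \ldots \tag{1226} \]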
so the cutoff ω_c cancels and the scattering cross-section does not diverge logarithmically. With this reasoning, Bloch and Nordsieck found that the appropriate expansion parameter is not e² / ( ħ c ) but instead is given by ( e² / ( ħ c ) ) ln( ħ ω_0 / ( m c² ) ). The higher-order perturbations may also describe processes involving larger numbers of emitted soft photons, and result in a multiplicative exponential factor in the quasi-elastic scattering rate
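\[ \left. \frac{d\sigma}{d\Omega} \right|_{Quasi-Elastic} \approx \left. \frac{d\sigma}{d\Omega} \right|_{Rutherford} \exp\left[ \frac{1}{2 \pi} \left( \frac{e^2}{\hbar c} \right) B \ln \frac{2 \hbar \omega_0}{m c^2} + \ldots \right] \tag{1227} \]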
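The cancellation of the infrared cutoff can be illustrated with a few lines of arithmetic. The sketch below reads the logarithms in the Bremsstrahlung and elastic cross-sections as ln( 2 ω_0 / ω_c ) and ln( ħ ω_c / m c² ) respectively, with a common prefactor ( 1 / 2π ) ( e² / ħ c ); these readings, and the values A = B = 1, are assumptions made purely for illustration, since the notes do not give explicit values for A and B.

```python
import math

ALPHA = 1.0 / 137.035999   # e^2 / (hbar c)
MC2_EV = 0.511e6           # electron rest energy in eV


def brems_log(resolution_ev: float, cutoff_ev: float) -> float:
    """ln(2 omega_0 / omega_c): soft-photon emission below the resolution omega_0."""
    return math.log(2.0 * resolution_ev / cutoff_ev)


def elastic_log(cutoff_ev: float) -> float:
    """ln(hbar omega_c / m c^2): virtual-photon correction to elastic scattering."""
    return math.log(cutoff_ev / MC2_EV)


def quasi_elastic_correction(resolution_ev: float, cutoff_ev: float, a: float = 1.0) -> float:
    """First-order relative correction after adding both contributions."""
    logs = brems_log(resolution_ev, cutoff_ev) + elastic_log(cutoff_ev)
    return ALPHA / (2.0 * math.pi) * a * logs


def exponentiated_factor(resolution_ev: float, b: float = 1.0) -> float:
    """Multiplicative factor exp[(alpha / 2 pi) B ln(2 hbar omega_0 / m c^2)]."""
    return math.exp(ALPHA / (2.0 * math.pi) * b * math.log(2.0 * resolution_ev / MC2_EV))


if __name__ == "__main__":
    for cutoff in (1.0e-3, 1.0e-6, 1.0e-9):
        print(cutoff, quasi_elastic_correction(1.0, cutoff))
    for resolution in (1.0, 1.0e-6, 1.0e-12):
        print(resolution, exponentiated_factor(resolution))
```

The combined correction is the same for every value of the cutoff, while the exponentiated factor decreases, although only very slowly, as the resolution ω_0 is reduced.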
Therefore, the scattering rate from soft photons vanishes in the limit ω_0 → 0. This occurs because perturbation theory causes the normalization of the starting approximate wave function to change, and hence the probabilities of the various processes are changed by including higher-order processes. In other words, since the probability of emitting an arbitrarily large number of soft photons is finite, the probability of emitting either zero or any fixed number of soft photons must be zero. Bloch and Nordsieck's calculation was restricted to the case of emission of sufficiently low-energy photons. Pauli and Fierz^86 also considered Bremsstrahlung in a non-relativistic approximation. They showed that the infrared divergences, discussed above, cancel. They went on to examine the remaining ultraviolet divergences, and showed that portions of the ultraviolet infinities that were found in the calculations of the scattering processes could be associated with mass renormalization. Using a relativistic theory, Ito, Koba and Tomonaga^87 showed that the remaining infinities could be absorbed into a renormalization of the electron charge. Similar conclusions were arrived at by Lewis^88 and by Epstein^89. Dyson^90 showed that all infinities that appear in Quantum Electrodynamics could be cured by renormalization to arbitrarily high orders in perturbation theory.
84 F.
12
\[ i \hbar \frac{\partial \psi}{\partial t} = \hat{H} \psi \tag{1228} \]
Since this equation is only first-order in time, the solution is uniquely specified by the initial condition for ψ. It is essential to require an evolution equation which is only first-order in time. Dirac^91 searched for a set of coupled first-order (in time) equations for a multi-component wave function ψ
\[ \psi = \begin{pmatrix} \psi^{(0)} \\ \psi^{(1)} \\ \vdots \\ \psi^{(N-1)} \end{pmatrix} \tag{1229} \]
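The wave function was assumed to satisfy an equation of the form
\[ i \frac{\hbar}{c} \frac{\partial \psi}{\partial t} - \alpha \cdot \hat{p} \, \psi = \beta \, m c \, \psi \]
\[ i \frac{\hbar}{c} \frac{\partial \psi}{\partial t} + i \hbar \, \alpha \cdot \nabla \psi = \beta \, m c \, \psi \tag{1230} \]
86 W.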
The equations have to be of this form since, if the equation is a first-order partial differential equation in time, then it must also only involve the first-order partial derivatives with respect to the spatial components for the resulting equation to be relativistically covariant. The wave function ψ is an N-component (column) wave function, and the three as yet unknown components of α, together with β, are N × N matrices. Since the Hamiltonian is the generator of time translations, Ĥ should be equivalent to iħ ∂/∂t. Hence, as the Hamiltonian operator Ĥ must be Hermitean, the operators α and β must be Hermitean matrices. This set of equations is required to yield the dispersion relation for a relativistic particle
\[ \left( \frac{E}{c} \right)^2 - p^2 = m^2 c^2 \tag{1231} \]
which, following the ordinary rules of quantization, leads to the Klein-Gordon equation
\[ \left[ - \frac{\hbar^2}{c^2} \frac{\partial^2}{\partial t^2} + \hbar^2 \nabla^2 \right] \psi = m^2 c^2 \, \psi \tag{1232} \]
(which is a second-order partial differential equation in time). The requirement that the Dirac equation is compatible with the Klein-Gordon equation imposes conditions on the form of the matrices. On writing the Dirac equation as
\[ i \frac{\hbar}{c} \frac{\partial \psi}{\partial t} = \left[ \beta \, m c - i \hbar \, \alpha \cdot \nabla \right] \psi \tag{1233} \]
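and iterating, one has
\[ - \frac{\hbar^2}{c^2} \frac{\partial^2 \psi}{\partial t^2} = \left[ \beta \, m c - i \hbar \, \alpha \cdot \nabla \right]^2 \psi \]
\[ = \left[ \beta^2 m^2 c^2 - i \hbar \, m c \, ( \beta \alpha + \alpha \beta ) \cdot \nabla - \hbar^2 ( \alpha \cdot \nabla )^2 \right] \psi \tag{1234} \]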
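When expressed in terms of the individual matrices α^{(j)}, the above equation becomes
\[ - \frac{\hbar^2}{c^2} \frac{\partial^2 \psi}{\partial t^2} = \left[ \beta^2 m^2 c^2 - i \hbar \, m c \sum_j ( \alpha^{(j)} \beta + \beta \alpha^{(j)} ) \nabla_j - \frac{\hbar^2}{2} \sum_{i,j} ( \alpha^{(i)} \alpha^{(j)} + \alpha^{(j)} \alpha^{(i)} ) \nabla_i \nabla_j \right] \psi \tag{1235} \]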
since the derivatives commute. If the above equation is to be equivalent to the Klein-Gordon equation, then the coefficients of the various derivatives must be identical for both equations. Therefore, it is required that the constant terms are equal
\[ \beta^2 = I \tag{1236} \]
It is also required that the first-order derivative terms vanish and that the second-order derivative terms should be equal, hence the matrices must satisfy the anti-commutation relations
\[ \alpha^{(i)} \beta + \beta \alpha^{(i)} = 0 \tag{1237} \]
\[ \alpha^{(i)} \alpha^{(j)} + \alpha^{(j)} \alpha^{(i)} = 2 \, \delta^{i,j} \, I \tag{1238} \]
(1239)
(1241)
since (i) is its own inverse. Apart from the negative sign, the form of the
lefthand side is of the form of an equivalence transformation. By using cyclic
invariance, it can be shown that the trace of a matrix is invariant under equivalence transformations. Therefore, one has
\[ \mathrm{Trace} \, \alpha^{(i)} = - \, \mathrm{Trace} \, \alpha^{(i)} \tag{1242} \]
or
\[ \mathrm{Trace} \, \alpha^{(i)} = 0 \tag{1243} \]
which proves that the matrices are traceless.
= I
= I
(1244)
=
= 2
(1246)
(1247)
This and the condition that the matrices are traceless imply that the set of eigenvalues of each matrix is composed of equal numbers of +1 and −1, and it also confirms the conclusion that the dimension N of the matrices must be even. The smallest value of the dimension for which there is a representation of the matrices is N = 4. The smallest even value of N, N = 2, cannot be used since one can only construct three linearly independent anti-commuting 2 × 2 matrices^92. These three matrices are the Pauli spin matrices σ^{(j)}. Hence, Dirac constructed the relativistic theory with N = 4.
It is useful to find a representation in which the mass term is diagonal, since this represents the largest energy which occurs in the non-relativistic limit. When diagonalized, the matrix β has two eigenvalues of +1 and two eigenvalues of −1 and so can be expressed in 2 × 2 block-diagonal form. We shall express the 4 × 4 matrices in the form of 2 × 2 block matrices. In this case, one can represent the matrix β in the block-diagonal form
\[ \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{1248} \]
If the three matrices α^{(i)} are to anti-commute with β and be Hermitean, they must have the off-diagonal form
\[ \alpha^{(i)} = \begin{pmatrix} 0 & A^{(i)} \\ A^{(i)\dagger} & 0 \end{pmatrix} \tag{1249} \]
92 In d + 1 space-time dimensions, one can form 2^{d+1} matrices from products of the set of d + 1 linearly independent (anti-commuting) Dirac matrices. We shall assume that the product matrices are linearly independent. Since the number of linearly independent N × N matrices is N², the minimum dimension N which will yield a representation of the Dirac matrices is N = 2^{(d+1)/2}.
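where A^{(i)} is an arbitrary 2 × 2 matrix. We shall choose all three A^{(i)} matrices to be Hermitean. Since the three α^{(i)} matrices must anti-commute with each other, the A^{(i)} must also anti-commute with each other. Since the three Pauli matrices are mutually anti-commuting, one can set
\[ \alpha^{(i)} = \begin{pmatrix} 0 & \sigma^{(i)} \\ \sigma^{(i)} & 0 \end{pmatrix} \tag{1250} \]
where the σ^{(i)} and I are, respectively, the 2 × 2 Pauli matrices and the 2 × 2 unit matrix. The Pauli matrices are given by
\[ \sigma^{(1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{1251} \]
\[ \sigma^{(2)} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{1252} \]
and
\[ \sigma^{(3)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{1253} \]
together with
\[ \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{1254} \]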
This set of matrices forms a representation of the Dirac matrices. This can be seen by directly showing that they satisfy the appropriate relations. Many different representations of the Dirac matrices can be found, but they are all related by equivalence transformations and the physical results are independent of which choice is made.
Exercise:
By direct matrix multiplication, show that the above matrices satisfy the relations
\[ ( \alpha^{(j)} )^2 = \beta^2 = I \tag{1255} \]
and the anti-commutation relations
\[ \alpha^{(i)} \beta + \beta \alpha^{(i)} = 0 \]
\[ \alpha^{(i)} \alpha^{(j)} + \alpha^{(j)} \alpha^{(i)} = 2 \, \delta^{i,j} \, I \tag{1256} \]
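The exercise can also be checked by machine. The following sketch builds β and the three α matrices from 2 × 2 blocks in the representation described above, with β = diag( I, −I ) and the Pauli matrices in the off-diagonal blocks of α, and tests the required relations using plain Python lists; the helper names are illustrative.

```python
def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(x, y):
    """Return x y + y x."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] + yx[i][j] for j in range(n)] for i in range(n)]


def close(x, y, tol=1e-12):
    """Element-wise comparison of two square matrices."""
    n = len(x)
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(n) for j in range(n))


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
MINUS_I2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

BETA = block(I2, ZERO2, ZERO2, MINUS_I2)
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]


def dirac_algebra_holds():
    """Check beta^2 = I, {alpha_i, beta} = 0 and {alpha_i, alpha_j} = 2 delta_ij I."""
    ok = close(matmul(BETA, BETA), I4)
    for i, a_i in enumerate(ALPHA):
        ok = ok and close(anticommutator(a_i, BETA), ZERO4)
        for j, a_j in enumerate(ALPHA):
            target = [[2 * v if i == j else 0 for v in row] for row in I4]
            ok = ok and close(anticommutator(a_i, a_j), target)
    return ok


if __name__ == "__main__":
    print("Dirac algebra satisfied:", dirac_algebra_holds())
    print("traces:", [trace(m) for m in ALPHA + [BETA]])
```

The same helpers can be reused to test any other proposed representation, since only the definitions of `BETA` and `ALPHA` depend on the choice of representation.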
12.1
Conservation of Probability
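\[ i \hbar \frac{\partial \psi}{\partial t} = \left[ - i \hbar c \, \alpha \cdot \nabla + \beta \, m c^2 \right] \psi \tag{1257} \]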
\[ \psi^\dagger = \left( \psi^{(0)*} , \; \psi^{(1)*} , \; \psi^{(2)*} , \; \psi^{(3)*} \right) \tag{1258} \]
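one obtains
\[ i \hbar \, \psi^\dagger \frac{\partial \psi}{\partial t} = \psi^\dagger \left[ - i \hbar c \, \alpha \cdot \nabla + \beta \, m c^2 \right] \psi \tag{1259} \]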
(1260)
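Since α and β are Hermitean matrices, the Hermitean conjugate equation simplifies to
\[ - i \hbar \frac{\partial \psi^\dagger}{\partial t} = + i \hbar c \, \nabla \psi^\dagger \cdot \alpha + \psi^\dagger \beta \, m c^2 \tag{1261} \]
Post-multiplying the Hermitean conjugate equation by the column-vector ψ yields
\[ - i \hbar \frac{\partial \psi^\dagger}{\partial t} \psi = \left[ + i \hbar c \, \nabla \psi^\dagger \cdot \alpha + \psi^\dagger \beta \, m c^2 \right] \psi \tag{1262} \]
On subtracting eqn(1262) from eqn(1259) and combining terms, one obtains
\[ i \hbar \frac{\partial}{\partial t} ( \psi^\dagger \psi ) = - i \hbar c \, \nabla \cdot ( \psi^\dagger \alpha \, \psi ) \tag{1263} \]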
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot j = 0 \tag{1264} \]
(1265)
Using the rules of matrix multiplication, the probability density is a real scalar quantity, which is given by the sum of squares
\[ \rho = | \psi^{(0)} |^2 + | \psi^{(1)} |^2 + | \psi^{(2)} |^2 + | \psi^{(3)} |^2 \tag{1266} \]
(1267)
(1268)
is conserved, since
\[ \frac{dQ}{dt} = \int d^3x \, \frac{\partial \rho}{\partial t} = - \int d^3x \, \nabla \cdot j = - \int d^2S \cdot j \tag{1269} \]
where Gauss's theorem has been used to represent the volume integral as a surface integral. For a sufficiently large volume, the current at the boundary vanishes, hence the total probability is conserved
\[ \frac{dQ}{dt} = 0 \tag{1270} \]
12.2
= mc
= mc
(1271)
p = i h
x
(1272)
(1273)
= mc
= mc
(1275)
(1276)
= (0)
= (0)
(1277)
=
=
=
=
=
( (i) )
( ( (i) ) )
( (i) )
( (i) )
(i)
(1278)
since α^{(i)} and β are Hermitean and, in the fourth line, the operators have been anti-commuted. Now, the gamma matrices with spatial indices can be shown to be unitary since
(i) (i)
= (i) (i)
= (i) (i)
= I
(1279)
where, in obtaining the second line, the anti-commutation properties of α^{(i)} and β have been used, and the property
\[ ( \alpha^{(i)} )^2 = \beta^2 = I \tag{1280} \]
was used to obtain the last line. Since it has already been demonstrated that the spatial matrices are anti-Hermitean
\[ ( \gamma^{(i)} )^\dagger = - \gamma^{(i)} \tag{1281} \]
(1282)
= (0)
(1283)
( (0) )2 = I
(1284)
Hence, since
= (0)
(1285)
The continuity equation has the Lorentz covariant form
\[ \frac{\partial j^\mu}{\partial x^\mu} = 0 \tag{1286} \]
(1287)
By using the definition of the Dirac adjoint, the current density can be re-expressed as the four quantities
\[ j^{(0)} = c \, \bar{\psi} \, \gamma^{(0)} \psi \]
\[ j^{(i)} = c \, \bar{\psi} \, \gamma^{(i)} \psi \tag{1288} \]
which represent, respectively, c times the probability density and the contravariant components j^{(i)} of the probability current density.
12.3
In the absence of fields, the Dirac equation can be solved exactly by assuming a solution in the form of plane-waves. This is because the momentum operator p̂ commutes with the Hamiltonian Ĥ since, in the absence of fields, there is no explicit dependence on position. The solution can be expressed as a momentum eigenstate in the form
(0)
u
u(1)
exp i k x
=
(1289)
u(2)
u(3)
where the functions u (k) are to be determined. On substituting this form in
the Dirac equation, it becomes an algebraic equation of the form
mc
(0)
k I k. (
) = 0
(1290)
h
where k is a threevector with components given by the contravariant spatial components of k . In order to write this equation in two by two blockdiagonal form, the fourcomponent spinor can be written in terms of two
twocomponents spinors
A
=
(1291)
B
where the two twocomponent spinors are given by
(0)
u
A
=
u(1)
(2)
u
B
=
u(3)
(1292)
Hence, the Dirac equation can be expressed as the blockdiagonal matrix equation
m c
(0)
A
k
+
I
k
.
= 0
(1293)
B
m c
(0)
I
k.
k h
where the threevector scalar product involves the contravariant components
of the momentum k (i) with the Pauli spin matrices (i) . The above equation
is an eigenvalue equation for k (0) . The eigenvalues are given by the solution of
the secular equation
m c
(0)
I
k.
k + h
(1294)
= 0
k.
k (0) mh c I
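which can be written as
\[ k^{(0)2} - \left( \frac{m c}{\hbar} \right)^2 = ( \sigma \cdot k )^2 \tag{1295} \]
Using Pauli's identity
\[ ( \sigma \cdot A ) ( \sigma \cdot B ) = A \cdot B \; I + i \, \sigma \cdot ( A \times B ) \tag{1296} \]
one finds the energy eigenvalues are given by the doubly-degenerate dispersion relations
\[ k^{(0)} = \pm \sqrt{ \left( \frac{m c}{\hbar} \right)^2 + k^2 } \tag{1297} \]
Thus, the field-free relativistic electron can have positive and negative-energy eigenvalues given by
\[ E = \pm \sqrt{ m^2 c^4 + p^2 c^2 } \tag{1298} \]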
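The double degeneracy of the two energy branches can be checked without diagonalizing anything. The sketch below works in units with ħ = c = 1 (an assumption made for brevity), builds the free Hamiltonian in the block form with m I and −m I on the diagonal and σ · k off the diagonal, and verifies that its square is ( m² + k² ) times the unit matrix while its trace vanishes; together these imply two eigenvalues +E and two eigenvalues −E. The function names are illustrative.

```python
import math


def sigma_dot(k):
    """2x2 matrix sigma . k for a three-vector k = (kx, ky, kz)."""
    kx, ky, kz = k
    return [[kz, kx - 1j * ky], [kx + 1j * ky, -kz]]


def dirac_hamiltonian(k, m):
    """Free Dirac Hamiltonian alpha . k + beta m in units hbar = c = 1."""
    s = sigma_dot(k)
    return [
        [m, 0, s[0][0], s[0][1]],
        [0, m, s[1][0], s[1][1]],
        [s[0][0], s[0][1], -m, 0],
        [s[1][0], s[1][1], 0, -m],
    ]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][l] * y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def energy(k, m):
    """Positive branch of the dispersion relation, E = sqrt(m^2 + k^2)."""
    return math.sqrt(m * m + sum(c * c for c in k))


def squares_to_energy(k, m, tol=1e-12):
    """True if H^2 equals E^2 times the 4x4 unit matrix."""
    h = dirac_hamiltonian(k, m)
    h2 = matmul(h, h)
    e2 = energy(k, m) ** 2
    return all(abs(h2[i][j] - (e2 if i == j else 0.0)) < tol
               for i in range(4) for j in range(4))


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


if __name__ == "__main__":
    k, m = (0.3, -1.2, 2.0), 0.5
    print("H^2 = E^2 I:", squares_to_energy(k, m))
    print("Trace H    :", trace(dirac_hamiltonian(k, m)))
    print("E          :", energy(k, m))
```

Since every eigenvalue of H must square to E² and the four eigenvalues sum to zero, exactly two of them equal +E and two equal −E.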
Since the solutions are degenerate, solutions can be found that are simultaneous
given by
eigenvalues of the Hamiltonian H
m c2 I
i h c .
=
(1299)
H
i h c .
m c2 I
It is convenient to choose the
and another operator that commutes with H.
second operator to be the helicity operator.
corresponds to the projection of the electrons spin
The helicity operator
along the direction of momentum. The (unnormalized) helicity operator is
k = +1
k = 1
Figure 44: A cartoon depicting the two helicity states of a spin onehalf particle.
given by
= i h
.
0
0
.
(1300)
This is the appropriate relativistic generalization of spin valid only for free
particles93 , as the helicity is a conserved quantity since
,
] = 0
[H
(1301)
H(k) =
(1302)
h c . k m c2 I
Likewise, for the source free case, the properly normalized Helicity operator is
found as
. k
0
(1303)
(k) =
0
. k
93 Helicity is not conserved for spherically symmetric potentials. However, if only a timeindependent vector potential is present, the generalized quantity
= . ( p q A )
c
is conserved. This conservation law implies that the spin will always retain its alignment with
the velocity.
0
(k) =
(1304)
0
(3)
and the eigenstates of helicity with eigenvalue +1 are composed of a linear
superposition of the spinup eigenstates. We shall represent the twocomponent
spinors A
+ via
A
+
= u(0) +
1
= u(0)
0
(1305)
and B
+ as
B
+
= u(2) +
1
= u(2)
0
(1306)
u(0) +
u(2) +
exp
i k x
(1307)
= u(1)
0
= u(1)
1
(1308)
and B
+ as
B
= u(3)
0
(3)
= u
1
(x) =
exp
i
k
x
u(3)
(1309)
(1310)
(1311)
= H
t
(1312)
one finds
E
=
m c2
(3)
c h k (3)
(3) c h k (3)
m c2
(1313)
B
Therefore, the complex amplitudes A
and are found to be related by
B
=
(3) c h
k (3) A
E + m c2
(1314)
(3) c h
k (3) B
m c2 E
(1315)
shows that A
Hence, the two
is small for the negativeenergy solutions.
positiveenergy and two negativeenergy (unnormalized) solutions of the Dirac
equation can be written as
+
exp i k x
+ (x) = Ne
(1316)
c h
k(3)
E + m c2 +
for helicity +1 and
(x) = Ne
c h
k(3)
E + m c2
exp
i k x
(1317)
(1318)
= V Ne
= V Ne 2
= V Ne 2
E 2 + 2 E m c2 + m2 c4 + c2 h
2 k2
( E + m c2 )2
2
2 E + 2 E m c2
( E + m c2 )2
2E
E + m c2
(1319)
(1320)
the lower components are the large components. In this case, it is more convenient to express the negativeenergy solutions as
m cc2h k E +
exp i k x
+ (x) = Np
(1321)
+
for helicity +1 and
(x) = Np
c h
k
m c2 E
exp
i k x
(1322)
for helicity 1. Furthermore, in this expression the normalization constant has
the form
r
m c2 E
(1323)
Np =
2EV
Hence, the positive and negativeenergy solutions are symmetric under the interchange E E, if and the upper and lower twocomponent
spinors (A , B ) are interchanged.
General Helicity Eigenstates
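The helicity operator for a particle with a momentum ħ k is given by the Hermitean operator
\[ \Lambda(k) = \frac{1}{k} \begin{pmatrix} k^{(3)} & k^{(1)} - i k^{(2)} \\ k^{(1)} + i k^{(2)} & - k^{(3)} \end{pmatrix} = \begin{pmatrix} \cos\theta_k & \sin\theta_k \exp[ - i \varphi_k ] \\ \sin\theta_k \exp[ + i \varphi_k ] & - \cos\theta_k \end{pmatrix} \tag{1324} \]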
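which, since
\[ \Lambda(k) \, \Lambda(k) = I \tag{1325} \]
has eigenvalues of ±1. The helicity eigenstates^94 are given by the two-component spinors χ_±. The positive helicity state is given by
\[ \chi_+ = \frac{1}{\sqrt{ 2 k ( k - k^{(3)} ) }} \begin{pmatrix} k^{(1)} - i k^{(2)} \\ k - k^{(3)} \end{pmatrix} = \exp\left[ - i \frac{\varphi_k}{2} \right] \begin{pmatrix} \cos\frac{\theta_k}{2} \exp[ - i \frac{\varphi_k}{2} ] \\ \sin\frac{\theta_k}{2} \exp[ + i \frac{\varphi_k}{2} ] \end{pmatrix} \tag{1326} \]
in which ( k, θ_k, φ_k ) are the polar coordinates of k. The negative helicity eigenstate is given by the spinor
\[ \chi_- = \frac{1}{\sqrt{ 2 k ( k - k^{(3)} ) }} \begin{pmatrix} - k + k^{(3)} \\ k^{(1)} + i k^{(2)} \end{pmatrix} = \exp\left[ + i \frac{\varphi_k}{2} \right] \begin{pmatrix} - \sin\frac{\theta_k}{2} \exp[ - i \frac{\varphi_k}{2} ] \\ \cos\frac{\theta_k}{2} \exp[ + i \frac{\varphi_k}{2} ] \end{pmatrix} \tag{1327} \]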
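The two spinors above can be verified numerically. The sketch below takes the helicity matrix to be ( 1 / k ) σ · k and uses the Cartesian forms of the two spinors, with the common normalization 1 / √( 2 k ( k − k^{(3)} ) ); the names `chi` and `helicity_matrix` are illustrative, and the construction is not valid for k along the positive 3-axis, where the normalization factor vanishes.

```python
import math


def helicity_matrix(k):
    """The 2x2 matrix (1/|k|) sigma . k for k = (k1, k2, k3)."""
    k1, k2, k3 = k
    norm = math.sqrt(k1 * k1 + k2 * k2 + k3 * k3)
    return [[k3 / norm, (k1 - 1j * k2) / norm],
            [(k1 + 1j * k2) / norm, -k3 / norm]]


def chi(k, sign):
    """Helicity spinor with eigenvalue +1 (sign > 0) or -1 (sign < 0)."""
    k1, k2, k3 = k
    norm = math.sqrt(k1 * k1 + k2 * k2 + k3 * k3)
    scale = math.sqrt(2.0 * norm * (norm - k3))  # vanishes for k along +3
    if sign > 0:
        return [(k1 - 1j * k2) / scale, (norm - k3) / scale]
    return [(-norm + k3) / scale, (k1 + 1j * k2) / scale]


def apply(m, v):
    """Multiply a 2x2 matrix into a two-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def inner(u, v):
    """Hermitean inner product of two spinors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def is_helicity_eigenstate(k, sign, tol=1e-12):
    v = chi(k, sign)
    w = apply(helicity_matrix(k), v)
    return all(abs(w[i] - sign * v[i]) < tol for i in range(2))


if __name__ == "__main__":
    k = (1.0, 2.0, 2.0)
    print("positive helicity:", is_helicity_eigenstate(k, +1))
    print("negative helicity:", is_helicity_eigenstate(k, -1))
    print("overlap          :", abs(inner(chi(k, +1), chi(k, -1))))
```

For the sample momentum ( 1, 2, 2 ) both spinors are normalized, mutually orthogonal, and satisfy the eigenvalue equations with eigenvalues +1 and −1.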
Therefore, the general helicity eigenstate planewave solutions of the Dirac equation can be written in terms of two twocomponent spinors as
exp i k x
(1328)
(x) = Ne
c h
k
E + m c
In this expression, N_e is a normalization factor
\[ N_e = \sqrt{ \frac{E + m c^2}{2 E V} } \tag{1329} \]
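The normalization factor can be checked against the requirement that the probability integrated over the volume V equals one. The sketch below assumes that the lower two-component spinor is smaller than the upper one by the factor c ħ k / ( E + m c² ), as for the positive-energy helicity eigenstates, and works in units with ħ = c = 1; the function names are illustrative.

```python
import math


def normalization_sq(k: float, m: float, volume: float = 1.0) -> float:
    """|N_e|^2 = (E + m) / (2 E V) with E = sqrt(m^2 + k^2), units hbar = c = 1."""
    e = math.sqrt(m * m + k * k)
    return (e + m) / (2.0 * e * volume)


def total_probability(k: float, m: float, volume: float = 1.0) -> float:
    """Integrated probability V |N_e|^2 [1 + k^2 / (E + m)^2].

    The first term in the bracket comes from the upper spinor and the second
    from the lower spinor, whose amplitude is assumed to be k / (E + m)
    times that of the upper spinor.
    """
    e = math.sqrt(m * m + k * k)
    lower_weight = (k / (e + m)) ** 2
    return volume * normalization_sq(k, m, volume) * (1.0 + lower_weight)


if __name__ == "__main__":
    for k in (0.0, 0.1, 1.0, 10.0):
        print(k, total_probability(k, m=1.0, volume=3.0))
```

The result is unity for every momentum, and in the limit k → 0 the whole weight resides in the upper components, as expected for a positive-energy state at rest.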
12.4
Coupling to Fields
The Dirac equation describes relativistic spin one-half fermions and their anti-particles. It describes all massive leptons such as the electron, muon and tau particle, and can be generalized to describe their interaction with the electromagnetic field, or its generalization, the electro-weak field. In the limit m → 0, the Dirac equation reduces to the Weyl equation^95, which describes neutrinos. The Dirac equation also describes massive quarks, and the interaction can be generalized to quantum chromodynamics.
94 C. G. Darwin, Proc. Roy. Soc. A 118, 654 (1928).
C. G. Darwin, Proc. Roy. Soc. A 120, 631 (1928).
95 H. Weyl, Z. Physik, 56, 330 (1929).
In the absence of interactions, the Dirac equation can be expressed in either of the two forms
\[ \gamma^\mu \hat{p}_\mu \psi = m c \, \psi \]
\[ i \hbar \, \gamma^\mu \partial_\mu \psi = m c \, \psi \tag{1330} \]
q
A
c
(1331)
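and q is the charge of the particle, the Dirac equation in the presence of an electromagnetic field becomes
\[ \gamma^\mu \left( \hat{p}_\mu - \frac{q}{c} A_\mu \right) \psi = m c \, \psi \]
\[ i \hbar \, \gamma^\mu \left( \partial_\mu + i \frac{q}{\hbar c} A_\mu \right) \psi = m c \, \psi \tag{1332} \]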
This process has resulted in the inclusion of the interaction with the electromagnetic field in a gauge-invariant, Lorentz-covariant manner. The appearance of the gauge field together with the derivative results in local gauge invariance. Sometimes it is convenient to define a covariant derivative as the gauge-invariant combination
\[ D_\mu \equiv \partial_\mu + i \frac{q}{\hbar c} A_\mu \tag{1333} \]
The concept of the covariant derivative also appears in the context of other gauge field theories. Using this definition, we can express the Dirac equation in the presence of an electromagnetic field in the compact covariant form
\[ i \hbar \, \gamma^\mu D_\mu \psi = m c \, \psi \tag{1334} \]
The presence of an electromagnetic field does not alter the form of the conserved four-vector current
\[ j^\mu = c \, \bar{\psi} \, \gamma^\mu \psi \tag{1335} \]
which is explicitly gauge invariant.
12.4.1
Mott Scattering
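electrons with momentum $\hbar\underline{k}'$. The initial and final states of the positive-energy
electron can be represented by the Dirac spinors of the form
$$\psi_{k,\sigma}(x) = N_k \begin{pmatrix} \chi_\sigma \\ \frac{c\,\hbar\,\underline{\sigma}\cdot\underline{k}}{E_k + m c^2}\,\chi_\sigma \end{pmatrix} \exp\left[-i\,k_\mu x^\mu\right] \tag{1336}$$
where the normalization constant is chosen as
$$N_k = \sqrt{\frac{E_k + m c^2}{2\,E_k\,V}} \tag{1337}$$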
The interaction Hamiltonian with the electrostatic field of the nucleus is given
by the diagonal matrix
$$\hat{H}_{Int} = -\frac{Z e^2}{r}\begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \tag{1338}$$
The flux of incident electrons is defined by
$$F = \frac{v}{V} \tag{1339}$$
where
$$v = \frac{\partial E}{\partial p} \tag{1340}$$
$$v = \frac{c^2\,\hbar\,k}{E} \tag{1341}$$
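The elastic scattering cross-section in which the final state polarization is unmeasured is given by
$$\frac{d\sigma}{d\Omega'} = \frac{1}{(2\pi)^2}\,\frac{E_k V^2}{\hbar^2 k\,c^2} \int_0^\infty dk'\,k'^2 \sum_{\sigma'} \left| \langle k'\sigma' | \hat{H}_{Int} | k,\sigma \rangle \right|^2 \delta(E_k - E_{k'}) \tag{1342}$$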
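where the delta function ensures conservation of energy. Since the polarization
of the final state electron is unmeasured, the spin $\sigma'$ is summed over. The
integration over $k'$ can be performed, yielding
$$\frac{d\sigma}{d\Omega'} = \left(\frac{E\,V}{2\pi\,\hbar^2 c^2}\right)^2 \sum_{\sigma'} \left| \langle k'\sigma' | \hat{H}_{Int} | k,\sigma \rangle \right|^2 \tag{1343}$$
The matrix elements of the interaction are given by
$$\langle k'\sigma' | \hat{H}_{Int} | k,\sigma \rangle = -\left(\frac{E + m c^2}{2E}\right) \frac{4\pi Z e^2}{V\,|\underline{k}-\underline{k}'|^2}\; \chi^T_{\sigma'} \left[ I + \frac{c^2\hbar^2\,(\underline{\sigma}\cdot\underline{k}')(\underline{\sigma}\cdot\underline{k})}{(E + m c^2)^2} \right] \chi_\sigma \tag{1344}$$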
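where the normalization constants have been combined, since energy is conserved. Likewise, the complex conjugate matrix elements are given by
$$\langle k,\sigma | \hat{H}_{Int} | k'\sigma' \rangle = -\left(\frac{E + m c^2}{2E}\right) \frac{4\pi Z e^2}{V\,|\underline{k}-\underline{k}'|^2}\; \chi^T_{\sigma} \left[ I + \frac{c^2\hbar^2\,(\underline{\sigma}\cdot\underline{k})(\underline{\sigma}\cdot\underline{k}')}{(E + m c^2)^2} \right] \chi_{\sigma'} \tag{1345}$$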
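These expressions for the matrix elements are inserted into the scattering cross-section. Since the final state polarization is not detected, $\sigma'$ must be
summed over. The trace over $\sigma'$ is evaluated by using the completeness relation
$$\sum_{\sigma'} \chi_{\sigma'}\,\chi^T_{\sigma'} = I \tag{1346}$$
which leaves the product of matrices
$$\left[ I + \frac{c^2\hbar^2\,(\underline{\sigma}\cdot\underline{k})(\underline{\sigma}\cdot\underline{k}')}{(E + m c^2)^2} \right] \left[ I + \frac{c^2\hbar^2\,(\underline{\sigma}\cdot\underline{k}')(\underline{\sigma}\cdot\underline{k})}{(E + m c^2)^2} \right] \tag{1347}$$
to be evaluated between $\chi^T_\sigma$ and $\chi_\sigma$.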
The products of matrix elements shown above can be evaluated with the aid of
the Pauli identity. The sum of the cross-terms can be evaluated directly. We note that since the vector products are antisymmetric in
$\underline{k}$ and $\underline{k}'$, the sum of the vector product terms cancels. That is
$$\frac{c^2\hbar^2}{(E + m c^2)^2}\left[ (\underline{\sigma}\cdot\underline{k})(\underline{\sigma}\cdot\underline{k}') + (\underline{\sigma}\cdot\underline{k}')(\underline{\sigma}\cdot\underline{k}) \right] = \frac{c^2\hbar^2}{(E + m c^2)^2}\; 2\,(\underline{k}\cdot\underline{k}')\,I \tag{1348}$$
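The remaining term is evaluated by using the Pauli identity for the inner two
scalar products, and then reusing the identity for the outer two scalar products.
Explicitly, this process yields
$$\frac{c^4\hbar^4}{(E + m c^2)^4}\,(\underline{\sigma}\cdot\underline{k})(\underline{\sigma}\cdot\underline{k}')(\underline{\sigma}\cdot\underline{k}')(\underline{\sigma}\cdot\underline{k}) = \frac{c^4\hbar^4}{(E + m c^2)^4}\; k'^2\,k^2\,I \tag{1349}$$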
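Hence, the cross-section becomes
$$\frac{d\sigma}{d\Omega'} = \left(\frac{Z e^2}{\hbar^2 c^2\,|\underline{k}-\underline{k}'|^2}\right)^2 \left[ (E + m c^2)^2 + 2\,c^2\hbar^2\;\underline{k}\cdot\underline{k}' + \frac{c^4\hbar^4\,k^2 k'^2}{(E + m c^2)^2} \right] \tag{1350}$$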
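It should be noted that the last two terms originated from the combined action
of the Pauli spin operators and involved the lower two-component spinors. The
last term can be simplified by using the elastic scattering condition $k = k'$ and
then using the identity
$$c^4\hbar^4\,k^4 = (E^2 - m^2 c^4)^2 \tag{1351}$$
so that the last term reduces to $(E - m c^2)^2$. With $\underline{k}\cdot\underline{k}' = k^2\cos\theta'$ and $|\underline{k}-\underline{k}'|^2 = 4k^2\sin^2\frac{\theta'}{2}$, where $\theta'$ is the scattering angle, the term in the square brackets becomes $4E^2 - 4c^2\hbar^2k^2\sin^2\frac{\theta'}{2}$ and the cross-section is found as
$$\frac{d\sigma}{d\Omega'} = \left(\frac{2\,Z e^2\,E}{4\,\hbar^2 c^2 k^2 \sin^2\frac{\theta'}{2}}\right)^2 \left[ 1 - \frac{v^2}{c^2}\sin^2\frac{\theta'}{2} \right] \tag{1356}$$
where the expression for the magnitude of the velocity
$$v^2 = \left(\frac{c^2\,\hbar\,k}{E}\right)^2 \tag{1357}$$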
has been introduced. The above result is the Mott scattering cross-section$^{96}$,
which describes the scattering of charged electrons. It differs from the Rutherford scattering cross-section by the multiplicative factor of relativistic origin,
which deviates from unity due to the electron's internal degree of freedom. The
extra contribution to the scattering is interpreted in terms of scattering from the
magnetic moment associated with the electron's spin interacting with the magnetic field of the nuclear charge that the electron experiences in its rest frame.
It should be noted that even if the initial beam of electrons is unpolarized, the
scattered beam will be partially spin-polarized (due to higher-order corrections).
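As a rough numerical illustration of the Mott result, the sketch below evaluates the relativistic factor $1 - (v/c)^2\sin^2(\theta'/2)$ and compares the Mott cross-section with the non-relativistic Rutherford form. The function names, the choice of MeV and fm units, and the value used for $e^2$ are assumptions made for this example only; they do not appear in the notes.

```python
import math

# Assumed unit choice for this sketch: e^2 = alpha * hbar * c in MeV fm.
E2_MEV_FM = 1.43996


def relativistic_factor(beta, theta):
    """Factor 1 - (v/c)^2 sin^2(theta/2) that multiplies the Rutherford form."""
    return 1.0 - beta ** 2 * math.sin(theta / 2.0) ** 2


def mott_cross_section(z, energy, pc, theta):
    """Mott d(sigma)/d(Omega) in fm^2, with energy E and pc = hbar*k*c in MeV."""
    prefactor = z * E2_MEV_FM * energy / (2.0 * pc ** 2 * math.sin(theta / 2.0) ** 2)
    # The velocity is v/c = pc / E, as in the expression for v quoted in the text.
    return prefactor ** 2 * relativistic_factor(pc / energy, theta)


def rutherford_cross_section(z, kinetic, theta):
    """Non-relativistic Rutherford d(sigma)/d(Omega) in fm^2 (kinetic energy in MeV)."""
    return (z * E2_MEV_FM / (4.0 * kinetic * math.sin(theta / 2.0) ** 2)) ** 2
```

For slow electrons the two expressions should agree, while for $v \to c$ the factor suppresses back-scattering ($\theta' = \pi$) completely.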
96 N.
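12.4.2
Maxwell's Equations
The electric and magnetic fields can be combined into the four-component quantity
$$\psi = \begin{pmatrix} 0 \\ B^{(1)} - i E^{(1)} \\ B^{(2)} - i E^{(2)} \\ B^{(3)} - i E^{(3)} \end{pmatrix} \tag{1358}$$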
Maxwell's equations can be written in the form
i =
4
j
c
c
j (1)
j =
j (2)
j (3)
(1359)
(1360)
We shall require that the matrices $\alpha^\mu$ are Hermitean and that they satisfy the
equation
$$(\alpha^\mu)^2 = I \tag{1361}$$
On comparing with the form of Maxwell's equations$^{97}$, one finds that the matrices are given by
1 0 0 0
0 1 0 0
(0) =
(1362)
0 0 1 0
0 0 0 1
0 1 0 0
1 0 0 0
(1) =
(1363)
0
0 0 i
0
0 i 0
0
0 1 0
0
0
0
i
(2) =
(1364)
1 0
0
0
0 i 0 0
0 0 0 1
0 0 i 0
(3) =
(1365)
0 i 0
0
1 0 0
0
97 Since the first element of $\psi$ is zero, the first columns of the matrices are not determined
directly from the comparison. The first rows are determined by demanding that the matrices
are Hermitean.
The matrices corresponding to the spatial indices are traceless and satisfy the
anticommutation relations
$$\alpha^{(i)}\alpha^{(j)} + \alpha^{(j)}\alpha^{(i)} = 2\,\delta^{i,j} \tag{1366}$$
and
(i) (j) = i
i,j,k (k)
(1367)
4
j
c
(1368)
(1369)
one obtains
4
j
(1370)
c
Utilizing the anticommutation of the spatial matrices, the lefthand side simplifies to
1
+ 2
(1371)
c t
= i
On substituting the new form of Maxwells equations in the second term, the
expression reduces to
8
+ i 2
j
(1372)
c t
Thus, the equation becomes
4
= i
c
2
c t
j
(1373)
The interaction of the Dirac particle with the electromagnetic field is described
by the interaction Hamiltonian, which is given by the $4 \times 4$ matrix
$$\hat{H}_I = \frac{q}{c}\; c\,\gamma^{(0)}\gamma^\mu A_\mu \tag{1374}$$
The corresponding interaction Hamiltonian density is
$$\mathcal{H}_I = \frac{q}{c}\; c\,\psi^\dagger\gamma^{(0)}\gamma^\mu\psi\, A_\mu = \frac{q}{c}\; j^\mu A_\mu \tag{1375}$$
where $j^\mu$ is the four-vector probability current density which satisfies the condition for conservation of probability. Because the current
density operator is so prominent in applications of the Dirac equation, since it naturally describes interactions with an electromagnetic field and the conservation laws, the
physical content of the current densities shall be examined next.
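In the presence of an electromagnetic field, the four-vector current density
is given by the expression
$$j^\mu = c\,\bar{\psi}\,\gamma^\mu\,\psi \tag{1376}$$
where
$$\bar{\psi} = \psi^\dagger\gamma^{(0)} \tag{1377}$$
One can rewrite the current density by using the Dirac equation
$$i\hbar\,\gamma^\mu\left(\partial_\mu + i\frac{q}{\hbar c}A_\mu\right)\psi = m c\,\psi \tag{1378}$$
and the Hermitean conjugate equation
$$-i\hbar\left[\left(\partial_\mu - i\frac{q}{\hbar c}A_\mu\right)\bar{\psi}\right]\gamma^\mu = m c\,\bar{\psi} \tag{1379}$$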
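On symmetrizing the current density and then substituting the Dirac equation
in one term and its Hermitean conjugate in the other term, one obtains
$$j^\mu = \frac{c}{2}\left( \psi^\dagger\gamma^{(0)}\gamma^\mu\psi + \psi^\dagger\gamma^{(0)}\gamma^\mu\psi \right)$$
$$= \frac{i\hbar}{2m}\left[ -\left(\left(\partial_\nu - i\frac{q}{\hbar c}A_\nu\right)\psi^\dagger\right)\gamma^{\nu\dagger}\gamma^{(0)}\gamma^\mu\psi + \psi^\dagger\gamma^{(0)}\gamma^\mu\gamma^\nu\left(\partial_\nu + i\frac{q}{\hbar c}A_\nu\right)\psi \right]$$
$$= \frac{i\hbar}{2m}\left[ -\left(\left(\partial_\nu - i\frac{q}{\hbar c}A_\nu\right)\psi^\dagger\right)\gamma^{(0)}\gamma^{\nu}\gamma^{(0)}\gamma^{(0)}\gamma^\mu\psi + \psi^\dagger\gamma^{(0)}\gamma^\mu\gamma^\nu\left(\partial_\nu + i\frac{q}{\hbar c}A_\nu\right)\psi \right] \tag{1380}$$
where the partial derivatives only operate on the wave function immediately to
the right of it. The identity
$$\gamma^{(0)}\gamma^{(0)} = I \tag{1381}$$
can be used to express the current density in terms of $\bar{\psi}$ as
$$j^\mu = \frac{i\hbar}{2m}\left[ -\left(\left(\partial_\nu - i\frac{q}{\hbar c}A_\nu\right)\bar{\psi}\right)\gamma^{\nu}\gamma^\mu\psi + \bar{\psi}\,\gamma^\mu\gamma^\nu\left(\partial_\nu + i\frac{q}{\hbar c}A_\nu\right)\psi \right] \tag{1383}$$
where, once again, the partial derivative only operates on the wave function
immediately to the right of it. Furthermore, if one sets
$$\frac{1}{2}\{\gamma^\mu,\gamma^\nu\}_+ = g^{\mu,\nu}\,I, \qquad \frac{1}{2}[\gamma^\mu,\gamma^\nu] = -i\,\sigma^{\mu,\nu} \tag{1384}$$
then the current density can be expressed as the sum of two contributions
$$j^\mu = j_c^\mu + j_s^\mu = \frac{i\hbar}{2m}\left[ g^{\mu,\nu}\left(\bar{\psi}\,(\partial_\nu\psi) - (\partial_\nu\bar{\psi})\,\psi\right) + 2i\frac{q}{\hbar c}\,g^{\mu,\nu}A_\nu\,\bar{\psi}\psi \right] + \frac{\hbar}{2m}\frac{\partial}{\partial x^\nu}\left(\bar{\psi}\,\sigma^{\mu,\nu}\psi\right) \tag{1385}$$
where
$$j_c^\mu = \frac{i\hbar}{2m}\left[ \left(\bar{\psi}\,(\partial^\mu\psi) - (\partial^\mu\bar{\psi})\,\psi\right) + 2i\frac{q}{\hbar c}A^\mu\,\bar{\psi}\psi \right], \qquad j_s^\mu = \frac{\hbar}{2m}\frac{\partial}{\partial x^\nu}\left(\bar{\psi}\,\sigma^{\mu,\nu}\psi\right) \tag{1386}$$
This is the Gordon decomposition$^{98}$ of the current density between the states $\bar{\psi}$ and $\psi$. As shall be shown, the first contribution in the Gordon decomposition is gauge invariant and dominates the current density in the
non-relativistic limit. The second contribution involves the matrix $\sigma^{\mu,\nu}$, which
is antisymmetric in its indices and has the form of a spin contribution to the
current density.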
Let us examine the first term in the probability current density. If $\psi$ represents an energy eigenstate, then $j_c^{(0)}$ is given by
$$j_c^{(0)} = \frac{E}{m c}\,\bar{\psi}\psi - \frac{q}{m c}\,A^{(0)}\,\bar{\psi}\psi \tag{1387}$$
This contribution obviously yields the main contribution to ($c$ times) the probability density
$$j_c^{(0)} \approx c\,\psi^\dagger\psi \tag{1388}$$
98 W. Gordon, Zeit. für Physik, 50, 630 (1928).
in the non-relativistic limit since the rest mass energy dominates the energy
$E \approx m c^2$. The spatial components of $j_c^{(i)}$ are given by
$$\underline{j}_c = -\frac{i\hbar}{2m}\left( \bar{\psi}\,(\nabla\psi) - (\nabla\bar{\psi})\,\psi \right) - \frac{q}{m c}\,\underline{A}\,\bar{\psi}\psi \tag{1389}$$
where the derivatives have been expressed as derivatives w.r.t. the contravariant
components $x^{(i)}$ of the position vector. This expression coincides with the full
non-relativistic expression for the current density $j^{(i)}$.
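We now examine the second term $j_s$ in the Gordon decomposition. For
future reference, the antisymmetrized products of the Dirac matrices $\sigma^{\mu,\nu}$ will
be expressed in $2 \times 2$ block form. Therefore, since
$$\gamma^{(0)} = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad \gamma^{(i)} = \begin{pmatrix} 0 & \sigma^{(i)} \\ -\sigma^{(i)} & 0 \end{pmatrix} \tag{1390}$$
and
$$\sigma^{\mu,\nu} = \frac{i}{2}\,[\gamma^\mu,\gamma^\nu] \tag{1391}$$
one finds
$$\sigma^{0,j} = i\begin{pmatrix} 0 & \sigma^{(j)} \\ \sigma^{(j)} & 0 \end{pmatrix} \tag{1392}$$
and
$$\sigma^{i,j} = \sum_k \epsilon^{i,j,k}\begin{pmatrix} \sigma^{(k)} & 0 \\ 0 & \sigma^{(k)} \end{pmatrix} \tag{1393}$$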
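The two by two block diagonal matrix of Pauli spin matrices will be denoted
by $\hat{\underline{\sigma}}$. For an energy eigenstate, the time component $j_s^{(0)}$ is identically zero.
Hence, the spatial components of $j_s^{(i)}$ are given by
$$\underline{j}_s = \frac{\hbar}{2m}\,\nabla\times\left(\bar{\psi}\,\hat{\underline{\sigma}}\,\psi\right) \tag{1394}$$
where $\hat{\underline{\sigma}}$ is the $2 \times 2$ block-diagonal Pauli spin matrix
$$\hat{\underline{\sigma}} = \begin{pmatrix} \underline{\sigma} & 0 \\ 0 & \underline{\sigma} \end{pmatrix} \tag{1395}$$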
The additional term in the current density clearly involves the Pauli spin-matrices. To elucidate its meaning, its contribution to the energy shall be
examined. On substituting this term in the interaction Hamiltonian density,
one finds a contribution
$$\mathcal{H}_I^{spin} = -\frac{q}{c}\,\underline{j}_s\cdot\underline{A} = -\frac{q\hbar}{2mc}\,\underline{A}\cdot\left(\nabla\times\left(\bar{\psi}\,\hat{\underline{\sigma}}\,\psi\right)\right) \tag{1396}$$
On integrating over space, the interaction Hamiltonian density gives rise to the
interaction's contribution to the total energy. By integrating by parts, it can
be shown that this energy contribution is equivalent to the energy contribution
caused by an equivalent form of the interaction Hamiltonian density
$$\mathcal{H}_I^{spin} \approx -\frac{q\hbar}{2mc}\left(\bar{\psi}\,\hat{\underline{\sigma}}\,\psi\right)\cdot\left(\nabla\times\underline{A}\right) \approx -\frac{q\hbar}{2mc}\left(\bar{\psi}\,\hat{\underline{\sigma}}\,\psi\right)\cdot\underline{B} \tag{1397}$$
where $\underline{B}$ is the magnetic field. Hence, the interaction energy contains a term
which represents an interaction between the electron's internal degree of freedom and the magnetic field.
12.5
One goal of Physics is to write the laws in a manner which is independent
of any arbitrary choices that are made. Within special relativity, this implies
that the laws of Physics should be written in a way which is independent of
the choice of inertial reference frame. Dirac's theory is Lorentz covariant if the
results are independent of the Lorentz frame used. To this end, it is required
that the Dirac equation in a Lorentz transformed frame of reference has the
same form as the Dirac equation in the original reference frame, and also that
the solutions of these two equations describe the same physical states. That is,
the two solutions must describe the same set of measurable properties in the
different reference frames, and therefore the results are simply related by the
Lorentz transformation.
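The first step of the proof of the Lorentz covariance of the Dirac equation
requires that one should be able to show that under a Lorentz transformation
defined by
$$A^\mu \to A'^\mu = \Lambda^\mu{}_\nu\,A^\nu \tag{1398}$$
then the Dirac equation is transformed from
$$\gamma^\mu\left(p_\mu - \frac{q}{c}A_\mu\right)\psi = m c\,\psi \tag{1399}$$
to
$$\gamma'^\mu\left(p'_\mu - \frac{q}{c}A'_\mu\right)\psi' = m c\,\psi' \tag{1400}$$
Furthermore, the four components of the spinor wave function $\psi'$ are assumed
to be linearly related to the components of $\psi$ by
$$\psi'(x') = \hat{R}(\Lambda)\,\psi(x) \tag{1401}$$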
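Hence, the transformed Dirac equation can be rewritten in terms of the untransformed spinor
$$\gamma'^\mu\left(p'_\mu - \frac{q}{c}A'_\mu\right)\psi' = m c\,\psi'$$
$$\gamma'^\mu\left(p'_\mu - \frac{q}{c}A'_\mu\right)\hat{R}(\Lambda)\,\psi = m c\,\hat{R}(\Lambda)\,\psi \tag{1402}$$
if such an $\hat{R}(\Lambda)$
exists. The $\gamma'^\mu$ matrices must satisfy the same anticommutation
relations as the $\gamma^\mu$ and, therefore, only differ from them by a similarity transformation$^{99}$. The transformation of the $\gamma'^\mu$ just results in the set of the four linear
equations that compose the Dirac equation being combined in different ways. On expressing the transformed four-vectors in terms of the untransformed ones, the equation becomes
$$\gamma'^\mu\left(p'_\mu - \frac{q}{c}A'_\mu\right)\hat{R}(\Lambda)\,\psi = m c\,\hat{R}(\Lambda)\,\psi$$
$$\gamma'^\mu\,\Lambda_\mu{}^\nu\left(p_\nu - \frac{q}{c}A_\nu\right)\hat{R}(\Lambda)\,\psi = m c\,\hat{R}(\Lambda)\,\psi \tag{1403}$$
and, on pre-multiplying by $\hat{R}^{-1}(\Lambda)$, one has
$$\hat{R}^{-1}(\Lambda)\,\gamma'^\mu\,\Lambda_\mu{}^\nu\left(p_\nu - \frac{q}{c}A_\nu\right)\hat{R}(\Lambda)\,\psi = m c\,\psi \tag{1404}$$
$$\hat{R}^{-1}(\Lambda)\,\gamma'^\mu\,\hat{R}(\Lambda)\,\Lambda_\mu{}^\nu\left(p_\nu - \frac{q}{c}A_\nu\right)\psi = m c\,\psi \tag{1405}$$
where the matrices $\Lambda$ and $\hat{R}$ commute, since they act on different spaces$^{100}$. This coincides with the original Dirac equation if
$$\hat{R}^{-1}(\Lambda)\,\gamma'^\mu\,\hat{R}(\Lambda)\,\Lambda_\mu{}^\nu = \gamma^\nu \tag{1406}$$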
The transformed Dirac equation has the same form as the original equation if
the transformed $\gamma'^\mu$ matrices satisfy the same anticommutations and conditions
as the unprimed matrices. This can be achieved by choosing $\gamma'^\mu = \gamma^\mu$. This
choice yields the condition for covariance as
$$\hat{R}^{-1}(\Lambda)\,\gamma^\mu\,\hat{R}(\Lambda)\,\Lambda_\mu{}^\nu = \gamma^\nu \tag{1407}$$
(1408)
99 This is a statement of Pauli's fundamental theorem [W. Pauli, Ann. Inst. Henri Poincaré 6, 109 (1936)]. For a general discussion, see R. H. Good Jr., Rev. Mod. Phys. 27, 187 (1955).
100 It should be noted that the matrices $\Lambda$ and $\hat{R}$ act on totally different spaces. The $\Lambda$ matrices act on the components of the four-vectors $x^\mu$, whereas the $\hat{R}$ matrices act on the components of the four-component Dirac spinor $\psi$.
R
=
(1409)
R()
exists which does satisfy the condition. Instead of following the general
theorem, the solution will be inferred from consideration of infinitesimal Lorentz
transformations.
(1410)
(1411)
where $\omega_{\mu,\nu}$ is a four by four matrix that has yet to be determined. The inverse
matrix can be written as
$$\hat{R}^{-1} = I + \frac{i}{4}\,\omega_{\mu,\nu}\,\epsilon^{\mu,\nu} + \ldots \tag{1412}$$
=
(1413)
4
or on raising and lowering indices
i
= g
4
(1414)
Thus, since $\epsilon^{\mu,\nu}$ is antisymmetric as it represents an infinitesimal Lorentz transformation, the matrix $\omega_{\mu,\nu}$ can be restricted to be antisymmetric in the indices,
because any symmetric part does not contribute to the matrix $\hat{R}$. By making
specific choices for the antisymmetric quantities $\epsilon^{\rho,\sigma}$, which are zero except for
a chosen pair of indices (say $\rho$ and $\sigma$), one finds that the antisymmetric part
of $\omega$ is determined from the equation
$$\frac{i}{2}\,[\,\omega^{\rho,\sigma}, \gamma^\mu\,] = g^{\mu,\rho}\,\gamma^\sigma - g^{\mu,\sigma}\,\gamma^\rho \tag{1415}$$
These sets of equations have to be satisfied even if arbitrary choices are made for
the infinitesimal Lorentz transformations . The infinitesimal unitary matrix
242
(1417)
The set of (as yet unknown) matrices $\omega^{\mu,\nu}$ that solve the above set of equations
are given by
$$\omega^{\mu,\nu} = \sigma^{\mu,\nu} = \frac{i}{2}\,[\gamma^\mu,\gamma^\nu] \tag{1418}$$
which are the six generators of the general infinitesimal Lorentz transformation.
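The claim that $\sigma^{\mu,\nu} = \frac{i}{2}[\gamma^\mu,\gamma^\nu]$ solves the commutator condition can be checked numerically. The sketch below is an illustration only: it assumes the standard representation of the $\gamma$ matrices and tests the relation in the form $[\gamma^\mu, \sigma^{\rho,\sigma}] = 2i\,(g^{\mu,\rho}\gamma^\sigma - g^{\mu,\sigma}\gamma^\rho)$. All helper names are hypothetical and the matrices are stored as plain nested lists.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b, sign=1):
    """Return a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Standard representation: gamma^(0) = diag(I, -I), gamma^(i) = [[0, s_i], [-s_i, 0]].
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu,nu}


def comm(a, b):
    return add(matmul(a, b), matmul(b, a), -1)


def sigma(mu, nu):
    """sigma^{mu,nu} = (i/2) [gamma^mu, gamma^nu]."""
    return scale(0.5j, comm(GAMMA[mu], GAMMA[nu]))


def generator_relation_holds():
    """Check [gamma^mu, sigma^{rho,sig}] = 2i (g^{mu,rho} gamma^sig - g^{mu,sig} gamma^rho)."""
    for mu in range(4):
        for rho in range(4):
            for sig in range(4):
                g_mr = METRIC[mu] if mu == rho else 0
                g_ms = METRIC[mu] if mu == sig else 0
                rhs = scale(2j, add(scale(g_mr, GAMMA[sig]), scale(g_ms, GAMMA[rho]), -1))
                if not close(comm(GAMMA[mu], sigma(rho, sig)), rhs):
                    return False
    return True
```

The same helpers also show that $\sigma^{1,2}$ is the block-diagonal $\sigma^{(3)}$ matrix and that $\sigma^{0,1} = i\,\gamma^{(0)}\gamma^{(1)}$.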
Proof of Solution
It can be shown that the expression for $\sigma^{\mu,\nu}$ given in eqn(1418) satisfies the
requirement of eqn(1417), by evaluating the nested commutator through repeatedly using the anticommutation properties of the $\gamma$ matrices. The commutator
can be expressed as a nested commutator or as the sum of two commutators
[ , ]
=
=
i
[ [ , ] , ]
2
i
i
[ , ]
[ , ]
2
2
(1419)
(1420)
= i [ , ] + i g , [ I , ]
= i [ , ]
(1421)
where the second line follows since the identity matrix commutes with . One
notices that if the s are anticommuted to the center of each product, some
terms will cancel and there may be some simplification. On using the anticommutation relation in the second term of the expression
[ , ] = i
(1422)
243
one finds
[ , ]
= i
+ 2 g ,
(1423)
Likewise, the matrices in the first term can also be anticommuted, leading to
,
,
[ , ] = i 2g
+ 2g
= 2 i g , g ,
(1424)
since the middle pair of terms cancel. Hence, one has proved that
i
,
,
[ , ] =
g
g
(1425)
= c
= c (0)
(1426)
= c 0 (0) 0
(0) R
= c R
244
(1427)
The identity
1 = (0) R
(0)
R
(1428)
(1429)
(0) R
= c (0) (0) R
1 R
= c (0) R
(1430)
R
=
(1431)
= c (0)
= c
= j
(1432)
Hence, the probability current densities $j'^\mu$ and $j^\mu$ found in the two reference
frames are simply related via the Lorentz transformation. Therefore, the Dirac
equation gives consistent results, no matter what inertial frame of reference is
used.
Proof of Identity
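The identity
$$\hat{R}^{-1}(\Lambda) = \gamma^{(0)}\,\hat{R}^\dagger(\Lambda)\,\gamma^{(0)} \tag{1433}$$
can be proved by starting from the expression for $\hat{R}$ appropriate for an infinitesimal transformation given by
$$\hat{R} = I + \frac{1}{8}\,\epsilon_{\mu,\nu}\,[\gamma^\mu,\gamma^\nu] + \ldots \tag{1434}$$
The Hermitean conjugate of this expression is
$$\hat{R}^\dagger = I + \frac{1}{8}\,\epsilon_{\mu,\nu}\,[\gamma^{\nu\dagger},\gamma^{\mu\dagger}] + \ldots \tag{1435}$$
$$= I - \frac{1}{8}\,\epsilon_{\mu,\nu}\,[\gamma^{\mu\dagger},\gamma^{\nu\dagger}] + \ldots \tag{1436}$$
On forming $\gamma^{(0)}\hat{R}^\dagger\gamma^{(0)}$, inserting $\gamma^{(0)}\gamma^{(0)} = I$
between the pairs of four by four matrices in the commutator and noting that
$$\gamma^{(0)}\,\gamma^{\mu\dagger}\,\gamma^{(0)} = \gamma^\mu \tag{1437}$$
one finds
$$\gamma^{(0)}\,\hat{R}^\dagger\,\gamma^{(0)} = I - \frac{1}{8}\,\epsilon_{\mu,\nu}\,[\gamma^\mu,\gamma^\nu] + \ldots = \hat{R}^{-1} \tag{1438}$$
The last line follows from the observation that on combining the expression for
$\hat{R}$ with the expression for $\gamma^{(0)}\hat{R}^\dagger\gamma^{(0)}$, the terms of order $\epsilon$ cancel. Hence to
the order of $\epsilon^2$, the product $\gamma^{(0)}\hat{R}^\dagger\gamma^{(0)}$ coincides with $\hat{R}^{-1}$. This concludes
the discussion of the desired identity.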
Finite Rotations
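Consider a finite rotation of the coordinate system specified by the transformation matrix
$$x'^\mu = \Lambda^\mu{}_\nu\,x^\nu \tag{1439}$$
Specifically, a finite (passive) rotation through an angle $\varphi$ about the $\hat{e}_3$ direction
can be expressed in terms of the transformation matrix
$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\varphi & +\sin\varphi & 0 \\ 0 & -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1440}$$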
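The above transformation represents a rotation of the coordinate system while
the physical system stays put.

[Figure 45: A passive rotation of the coordinate system through an angle $\varphi$ about the $\hat{e}_3$ axis. The axes $x^{(1)}$, $x^{(2)}$ are rotated into $x^{(1)\prime}$, $x^{(2)\prime}$.]

For an infinitesimal rotation through $\delta\varphi$, the transformation matrix reduces to
$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} + \delta\varphi \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} + \ldots \tag{1441}$$
so the only non-zero components of the infinitesimal transformation are $\epsilon^{1}{}_{2} = -\epsilon^{2}{}_{1} = \delta\varphi$.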
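The corresponding transformation of the spinor is found from
$$\hat{R}(\delta\varphi) = I - \frac{i}{4}\,\sigma_{\mu,\nu}\,\epsilon^{\mu,\nu} + \ldots \tag{1444}$$
as
$$\hat{R}(\delta\varphi) = I + \frac{i}{4}\,\delta\varphi\left(\sigma^{1,2} - \sigma^{2,1}\right) + \ldots = I + \frac{i}{2}\,\delta\varphi\;\sigma^{1,2} + \ldots = \exp\left[\,i\,\frac{\delta\varphi}{2}\,\sigma^{1,2}\right] \tag{1445}$$
A finite rotation through $\varphi = N\,\delta\varphi$ is built up from $N$ successive infinitesimal rotations
$$\hat{R}(\varphi) = \left(\hat{R}(\delta\varphi)\right)^N = \exp\left[\,i\,N\,\frac{\delta\varphi}{2}\,\sigma^{1,2}\right] = \exp\left[\,i\,\frac{\varphi}{2}\,\sigma^{1,2}\right] \tag{1446}$$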
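Therefore, for a finite rotation, the transformation matrix is given by
$$\hat{R}(\varphi) = \exp\left[\,i\,\frac{\varphi}{2}\,\sigma^{1,2}\right] \tag{1447}$$
which can be expressed in terms of even and odd powers of $\sigma^{1,2}$ via
$$\hat{R}(\varphi) = \cos\left(\frac{\varphi}{2}\,\sigma^{1,2}\right) + i\,\sin\left(\frac{\varphi}{2}\,\sigma^{1,2}\right) \tag{1448}$$
but since
$$\sigma^{1,2} = \frac{i}{2}\,[\gamma^{(1)},\gamma^{(2)}] = \hat{\sigma}^{(3)} \tag{1449}$$
one has
$$\hat{R}(\varphi) = \cos\left(\frac{\varphi}{2}\,\hat{\sigma}^{(3)}\right) + i\,\sin\left(\frac{\varphi}{2}\,\hat{\sigma}^{(3)}\right) \tag{1450}$$
and, as the even powers of $\hat{\sigma}^{(3)}$ are equal to $I$ while the odd powers are equal to $\hat{\sigma}^{(3)}$,
$$\hat{R}(\varphi) = \cos\frac{\varphi}{2}\;I + i\,\sin\frac{\varphi}{2}\;\hat{\sigma}^{(3)} \tag{1453}$$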
Therefore, under a finite rotation through angle $\varphi$ around the unit vector $\hat{e}$, a
spinor is rotated by the operator
$$\hat{R}(\varphi) = \cos\frac{\varphi}{2}\;I + i\,\sin\frac{\varphi}{2}\;\hat{e}\cdot\hat{\underline{\sigma}} \tag{1454}$$
From the above equation, due to the presence of the half-angle, one notes that
a rotation through $\varphi$ and through $\varphi + 2\pi$ are not equivalent, since
$$\hat{R}(\varphi + 2\pi) = -\hat{R}(\varphi) \tag{1455}$$
which changes the sign of the spinor. For spin one-half electrons, it is necessary
to rotate through $4\pi$ to return to the same state
$$\hat{R}(\varphi + 4\pi) = \hat{R}(\varphi) \tag{1456}$$
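The half-angle behaviour can be made concrete with a few lines of code. The sketch below builds $\hat{R}(\varphi) = \cos\frac{\varphi}{2}\,I + i\sin\frac{\varphi}{2}\,\hat{\sigma}^{(3)}$ for a rotation about $\hat{e}_3$, assuming the standard representation in which the block-diagonal $\hat{\sigma}^{(3)}$ is the diagonal matrix with entries $(1, -1, 1, -1)$. The function names are hypothetical.

```python
import math

# Diagonal of the block-diagonal sigma^(3) in the standard representation (assumed).
SIGMA3_DIAGONAL = [1, -1, 1, -1]


def spinor_rotation(phi):
    """R(phi) = cos(phi/2) I + i sin(phi/2) sigma^(3) as a 4x4 nested list."""
    half = phi / 2.0
    return [[(math.cos(half) + 1j * math.sin(half) * SIGMA3_DIAGONAL[i]) if i == j else 0j
             for j in range(4)] for i in range(4)]


def max_difference(a, b, sign=1):
    """Largest |a_ij - sign * b_ij| over all matrix elements."""
    return max(abs(x - sign * y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Comparing `spinor_rotation(phi + 2 * math.pi)` with `spinor_rotation(phi)` using `sign=-1` shows the sign change, while a shift by `4 * math.pi` reproduces the original matrix.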
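For a Lorentz boost along the $\hat{e}_1$ direction with velocity $v$, the transformation matrix is
$$\Lambda = \begin{pmatrix} \cosh\chi & -\sinh\chi & 0 & 0 \\ -\sinh\chi & \cosh\chi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1457}$$
where the rapidity $\chi$ is given by
$$\tanh\chi = \frac{v}{c} \tag{1458}$$
so
$$\cosh\chi = \frac{1}{\sqrt{1 - (\frac{v}{c})^2}}, \qquad \sinh\chi = \frac{\frac{v}{c}}{\sqrt{1 - (\frac{v}{c})^2}} \tag{1459}$$
For an infinitesimal boost $\delta\chi$, the transformation matrix reduces to
$$\Lambda = \begin{pmatrix} 1 & -\delta\chi & 0 & 0 \\ -\delta\chi & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} + \ldots \tag{1460}$$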
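to first-order in the infinitesimal quantity $\delta\chi$. Therefore, with the infinitesimal
form of the general Lorentz transformation
$$\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \epsilon^\mu{}_\nu + \ldots \tag{1461}$$
the only non-zero components are $\epsilon^{0}{}_{1} = \epsilon^{1}{}_{0} = -\delta\chi$. The transformation of the spinor
$$\hat{R}(\delta\chi) = I - \frac{i}{4}\,\sigma_{\mu,\nu}\,\epsilon^{\mu,\nu} + \ldots \tag{1463}$$
becomes
$$\hat{R}(\delta\chi) = I + \frac{i}{4}\,2\,\delta\chi\;\sigma^{0,1} + \ldots = \exp\left[\,i\,\frac{\delta\chi}{2}\,\sigma^{0,1}\right] \tag{1464}$$
A finite boost with rapidity $\chi = N\,\delta\chi$ is built up from $N$ successive infinitesimal boosts
$$\hat{R}(\chi) = \left(\hat{R}(\delta\chi)\right)^N = \exp\left[\,i\,N\,\frac{\delta\chi}{2}\,\sigma^{0,1}\right] = \exp\left[\,i\,\frac{\chi}{2}\,\sigma^{0,1}\right] \tag{1465}$$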
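Therefore, for a finite boost, the transformation matrix is given by
$$\hat{R}(\chi) = \exp\left[\,i\,\frac{\chi}{2}\,\sigma^{0,1}\right] \tag{1466}$$
which can be expressed in terms of even and odd powers of $\sigma^{0,1}$ via
$$\hat{R}(\chi) = \cosh\left(i\,\frac{\chi}{2}\,\sigma^{0,1}\right) + \sinh\left(i\,\frac{\chi}{2}\,\sigma^{0,1}\right) \tag{1467}$$
but since
$$\sigma^{0,1} = \frac{i}{2}\,[\gamma^{(0)},\gamma^{(1)}] = i\,\alpha^{(1)} \tag{1468}$$
one has
$$\hat{R}(\chi) = \cosh\left(-\frac{\chi}{2}\,\alpha^{(1)}\right) + \sinh\left(-\frac{\chi}{2}\,\alpha^{(1)}\right) \tag{1469}$$
and, as the even powers of $\alpha^{(1)}$ are equal to $I$ while the odd powers are equal to $\alpha^{(1)}$,
$$\hat{R}(\chi) = \cosh\frac{\chi}{2}\;I + \sinh\left(-\frac{\chi}{2}\right)\alpha^{(1)} = \cosh\frac{\chi}{2}\;I - \sinh\frac{\chi}{2}\;\alpha^{(1)} \tag{1472}$$
This can be written as
$$\hat{R}(\chi) = \cosh\frac{\chi}{2}\left[\,I - \tanh\frac{\chi}{2}\;\alpha^{(1)}\right] = \cosh\frac{\chi}{2}\left[\,I - \tanh\frac{\chi}{2}\;\hat{v}\cdot\underline{\alpha}\right] \tag{1473}$$
where the rapidity $\chi$ is determined by
$$\tanh\chi = \frac{v}{c} \tag{1474}$$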
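A quick numerical check of the boost operator is sketched below. It assumes the standard representation of the $\gamma$ matrices and tests the covariance condition in the equivalent form $\hat{R}^{-1}\gamma^\mu\hat{R} = \Lambda^\mu{}_\nu\gamma^\nu$, using $\hat{R}(\chi) = \cosh\frac{\chi}{2}\,I - \sinh\frac{\chi}{2}\,\alpha^{(1)}$ and $\hat{R}^{-1}(\chi) = \hat{R}(-\chi)$. The helper names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-10):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
I4 = block(I2, Z2, Z2, I2)
ALPHA1 = matmul(GAMMA[0], GAMMA[1])  # alpha^(1) = gamma^(0) gamma^(1)


def spinor_boost(chi):
    """R(chi) = cosh(chi/2) I - sinh(chi/2) alpha^(1) for a boost along e1."""
    return add(scale(math.cosh(chi / 2), I4), scale(math.sinh(chi / 2), ALPHA1), -1)


def lorentz_boost(chi):
    """Lambda^mu_nu for the same boost."""
    ch, sh = math.cosh(chi), math.sinh(chi)
    return [[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def is_covariant(chi):
    """Check R^-1 gamma^mu R = Lambda^mu_nu gamma^nu for all mu."""
    r, r_inv, lam = spinor_boost(chi), spinor_boost(-chi), lorentz_boost(chi)
    for mu in range(4):
        rhs = scale(0, I4)
        for nu in range(4):
            rhs = add(rhs, scale(lam[mu][nu], GAMMA[nu]))
        if not close(matmul(matmul(r_inv, GAMMA[mu]), r), rhs):
            return False
    return True
```

Note that $\hat{R}(\chi)$ is Hermitean rather than unitary, which is why the inverse is obtained by reversing the sign of the rapidity and not by Hermitean conjugation.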
Exercise:
Determine the relationship between the rapidities for a combined Lorentz
transformation consisting of two successive Lorentz boosts with parallel velocities v0 and v1 .
Exercise:
Starting from a solution of a free stationary Dirac particle with spin ,
perform a Lorentz boost to determine the solution for a Dirac electron with
momentum p0 .
L'
12.5.1
One can form sixteen matrices $\Gamma_i$ from the product of the four $\gamma$ matrices. Since
the $\gamma$ matrices obey the anticommutation relations
$$\{\gamma^\mu, \gamma^\nu\}_+ = 2\,g^{\mu,\nu}\,I \tag{1475}$$
all other products can be reduced to the above products. The order of the
matrices is irrelevant, since the different $\gamma$ matrices anticommute. Also, since
Table 6: The Set of the Sixteen Matrices $\Gamma_n$ with their Phase Factors ($j > i$)
I
(0)
i (i)
(0) (i)
i (0) (4)
(i) (4)
(1477)
only if
i = j
(1478)
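Also, by anticommuting the factors of $\gamma$ in the products, one can show that
$$\Gamma_i\,\Gamma_j = \pm\,\Gamma_j\,\Gamma_i \tag{1479}$$
Specifically, for a fixed $\Gamma_i$ not equal to the identity, one can always find a
specific $\Gamma_k$ such that
$$\Gamma_i\,\Gamma_k = -\,\Gamma_k\,\Gamma_i \tag{1480}$$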
252
(1481)
Traceless Matrices
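The above facts can be used to show that the $\Gamma_i$ matrices, other than the
identity, are traceless. This can be proved by considering
$$\mathrm{Trace}\,\Gamma_i = \mathrm{Trace}(\Gamma_i\,\Gamma_k\,\Gamma_k) = -\,\mathrm{Trace}(\Gamma_k\,\Gamma_i\,\Gamma_k) = -\,\mathrm{Trace}(\Gamma_i\,\Gamma_k\,\Gamma_k) = -\,\mathrm{Trace}\,\Gamma_i \tag{1482}$$
where the cyclic invariance of the trace has been used. Hence,
$$\mathrm{Trace}\,\Gamma_i = 0 \tag{1483}$$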
Linear Independence
The sixteen $\Gamma_i$ matrices are linearly independent. The linear independence
can be expressed in terms of the absence of any non-trivial solution of the
equation
$$\sum_i C_i\,\Gamma_i = 0 \tag{1484}$$
other than $C_i \equiv 0$ for all $i$. If the $\Gamma_i$ are linearly independent, the only solution
of this equation is
$$C_i \equiv 0 \quad \text{for all } i \tag{1485}$$
This can be proved by multiplying eqn(1484) by any one $\Gamma_j$ in the set which
leads to
$$C_j\,I + \sum_{i \neq j} C_i\,\Gamma_i\,\Gamma_j = 0$$
$$C_j\,I + \sum_{i \neq j} C_i\,a_{i,j}\,\Gamma_k = 0 \tag{1486}$$
On taking the trace, one finds
$$0 = 4\,C_j + \sum_{i \neq j} C_i\,a_{i,j}\,\mathrm{Trace}\,\Gamma_k = 4\,C_j \tag{1487}$$
since the matrices $\Gamma_k$ are traceless. Hence, all the $C_j$ are zero, so the matrices
are linearly independent.
Uniqueness of Expansions
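The existence of sixteen linearly independent matrices requires that the matrices can be represented in a space of $N \times N$ matrices, where $N \geq 4$. Any
matrix $A$ in the space of $4 \times 4$ matrices can be uniquely expressed in terms of
the basis set of the $\Gamma_i$. For example, if
$$A = \sum_i C_i\,\Gamma_i \tag{1488}$$
then on multiplying by $\Gamma_j$ and taking the trace, one finds
$$\mathrm{Trace}(A\,\Gamma_j) = C_j\,\mathrm{Trace}(\Gamma_j\,\Gamma_j) + \sum_{i \neq j} C_i\,\mathrm{Trace}(\Gamma_i\,\Gamma_j) = C_j\,\mathrm{Trace}(I) + \sum_{i \neq j} C_i\,\mathrm{Trace}(a_{i,j}\,\Gamma_k) = 4\,C_j \tag{1489}$$
so the coefficients are uniquely given by
$$C_j = \frac{1}{4}\,\mathrm{Trace}(A\,\Gamma_j) \tag{1490}$$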
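The tracelessness, linear independence and expansion formula can be illustrated numerically. The sketch below is an assumption-laden example: it uses the standard representation, takes $\gamma^{(4)} = i\,\gamma^{(0)}\gamma^{(1)}\gamma^{(2)}\gamma^{(3)}$ as the definition of the fifth matrix, and chooses the phase factors so that every $\Gamma_n$ squares to the identity. The helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
G = [block(I2, Z2, Z2, scale(-1, I2))] + [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
I4 = block(I2, Z2, Z2, I2)
# Assumed definition of gamma^(4): i gamma^(0) gamma^(1) gamma^(2) gamma^(3).
G4 = scale(1j, matmul(matmul(G[0], G[1]), matmul(G[2], G[3])))


def build_basis():
    """Sixteen Gamma matrices, with phases chosen so that each squares to I."""
    basis = [I4, G[0]]
    basis += [scale(1j, G[i]) for i in (1, 2, 3)]
    basis += [matmul(G[0], G[i]) for i in (1, 2, 3)]
    basis += [scale(1j, matmul(G[i], G[j])) for i in (1, 2, 3) for j in (1, 2, 3) if j > i]
    basis += [scale(1j, matmul(G[0], G4))]
    basis += [matmul(G[i], G4) for i in (1, 2, 3)]
    basis += [G4]
    return basis


def expand(a, basis):
    """Coefficients C_j = Trace(A Gamma_j) / 4."""
    return [trace(matmul(a, gj)) / 4 for gj in basis]


def rebuild(coeffs, basis):
    """Sum_j C_j Gamma_j."""
    total = scale(0, I4)
    for c, gj in zip(coeffs, basis):
        total = add(total, scale(c, gj))
    return total
```

Expanding an arbitrary $4 \times 4$ matrix with `expand` and summing the result with `rebuild` should return the original matrix, which is the uniqueness statement in numerical form.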
Schur's Lemma
The uniqueness of the expansion can be used to show that the product of $\Gamma_i$
for fixed $i$ with the set of $\Gamma_j$ leads to a different $\Gamma_k$ for each $j$. This can be
shown by assuming that there exist two different (linearly independent) values
$\Gamma_j$ and $\Gamma_{j'}$ which lead to the same $\Gamma_k$
i j
i j 0
= ai,j k
= ai,j 0 k
(1491)
= ai,j i k
= ai,j 0 i k
(1492)
254
ai,j 0
j
ai,j
(1493)
(1495)
Since it has been assumed that A commutes with all the i , for the specific k
one has
A =
k A k
X
= Ci k i k +
Cj k j k
j6=i
= Ci i +
Cj k j k
(1496)
j6=i
(1497)
Cj ( 1 )j,k j
(1498)
j6=i
(1499)
j6=i
Since the expansion is unique, the coefficients of the $\Gamma_j$ are unique and in particular
$$C_i = -\,C_i \tag{1500}$$
so $C_i = 0$ for any $i$ such that $\Gamma_i \neq I$.
Hence, if $A$ commutes with all the $\Gamma_i$
then $A$ must be proportional to the identity.
(1501)
(1503)
i j i j = a2,j 2k = a2i,j I
(1504)
2k = I
(1505)
since
On premultiplying eqn(1504) by j i , one obtains
j i i j i j
= a2i,j j i
(1506)
but since
j i i j
= I
(1507)
eqn(1506) reduces to
i j
= a2i,j j i
(1508)
However, as
i j = ai,j k
(1509)
= a2i,j j i
(1510)
(1511)
The i matrices are constructed so that they satisfy similar relations to the i .
0
In particular, the i matrices satisfy
i 0 j 0 = ai,j k 0
(1512)
(1513)
(1514)
i 0 j 0 = ai,j k 0
(1515)
and
one finds
i 0 S i =
a4i,j k 0 F k
(1516)
i 0 S i =
k 0 F k
(1517)
However, since $i$ is fixed and $j$ is being summed over, every $k$ appears once
and only once in the product. Therefore, the sum can be performed over $k$
$$\Gamma'_i\,S\,\Gamma_i = \sum_k \Gamma'_k\,F\,\Gamma_k = S \tag{1518}$$
If one can show that the matrix $S$ has an inverse, then on post-multiplying by
$S^{-1}$, one finds
$$\Gamma'_i\,S\,\Gamma_i\,S^{-1} = I \tag{1519}$$
Furthermore, since $\Gamma'_i$ is its own inverse, then on pre-multiplying by $\Gamma'_i$ the
equation reduces to
$$S\,\Gamma_i\,S^{-1} = \Gamma'_i \tag{1520}$$
This is a generalization of the statement of the theorem. As a particular case,
one may choose $\Gamma_i = \gamma^\mu$ in which case the theorem becomes
$$\gamma'^\mu = S\,\gamma^\mu\,S^{-1} \tag{1521}$$
which was the initial statement of Pauli's fundamental theorem made above.
257
(1523)
S0 = i S0 i 0
(1524)
= i
= i
S0 i 0 i 0 S i
S0 S i
(1525)
Hence, by Schur's Lemma one sees that $S'S$ commutes with all the matrices in
the space, therefore it must be a multiple of the identity
S0 S = I
(1526)
(1527)
12.5.2
When evaluated in the Born Approximation, Mott scattering does not result
in the polarization of an unpolarized beam. However, when higher-order corrections are included, Mott scattering produces a partial polarization of the
scattered electrons$^{101}$. If the incident beam is polarized by having a definite
helicity, it is expected that the helicity may change as a result of the scattering.
The probability of helicity non-flip scattering and helicity flip scattering can
be evaluated using the Born approximation. The initial beam will be considered
as having a momentum $\underline{p}$ parallel to the $\hat{e}_3$ axis and as having a helicity of $+1$.
The initial spinor is proportional to
$$\psi_{p,+}(\underline{r}) = \sqrt{\frac{E_p + m c^2}{2\,E_p\,V}} \begin{pmatrix} \chi_+ \\ \frac{c\,p}{E_p + m c^2}\,\chi_+ \end{pmatrix} \exp\left[\,i\,\frac{\underline{p}\cdot\underline{r}}{\hbar}\right] \tag{1528}$$
101 N.
258
[Figure 47: Helicity non-flip and helicity flip Mott scattering of an electron with helicity $+1$. The scattering angle is $\theta_{p'}$.]

The electrons are assumed to be elastically scattered to a state with final momentum $\underline{p}'$. The scattering is defined to occur through an angle $\theta_{p'}$ in the $z$-$x$
plane. The final state is composed of a linear superposition of states with different helicities. Since the final state helicities are specified relative to the final
momentum, the final state helicity eigenstates can be obtained by rotating the
initial state helicity eigenstates through an angle $\theta_{p'}$ around the $\hat{e}_2$ axis
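$$\psi'_{p',\sigma'}(\underline{r}) = \hat{R}(\theta_{p'})\,\psi_{p',\sigma'}(x) = \sqrt{\frac{E_p + m c^2}{2\,E_p\,V}}\;\hat{R}(\theta_{p'}) \begin{pmatrix} \chi_{\sigma'} \\ \frac{\sigma'\,c\,p}{E_p + m c^2}\,\chi_{\sigma'} \end{pmatrix} \exp\left[\,i\,\frac{\underline{p}'\cdot\underline{r}}{\hbar}\right] \tag{1529}$$
where the rotation operator is given by
$$\hat{R}(\theta_{p'}) = \cos\frac{\theta_{p'}}{2}\;I - i\,\sin\frac{\theta_{p'}}{2}\;\hat{\sigma}^{(2)} \tag{1530}$$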
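which does not mix the upper and lower two-component spinors. Therefore, one
finds that the final state two-component spinors representing helicity eigenstates
are given by
$$\chi'_+ = \left(\cos\frac{\theta_{p'}}{2}\;I - i\,\sin\frac{\theta_{p'}}{2}\;\sigma^{(2)}\right)\chi_+ = \begin{pmatrix} \cos\frac{\theta_{p'}}{2} \\ \sin\frac{\theta_{p'}}{2} \end{pmatrix} \tag{1531}$$
and
$$\chi'_- = \left(\cos\frac{\theta_{p'}}{2}\;I - i\,\sin\frac{\theta_{p'}}{2}\;\sigma^{(2)}\right)\chi_- = \begin{pmatrix} -\sin\frac{\theta_{p'}}{2} \\ \cos\frac{\theta_{p'}}{2} \end{pmatrix} \tag{1532}$$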
h
Ep + m c2 0
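The Born approximation scattering cross-section can be expressed in terms of
the modulus squared matrix elements
$$\left| \int d^3r\;\psi'^\dagger_{p',\sigma'}(\underline{r})\left(-\frac{Z e^2}{r}\right)\psi_{p,+}(\underline{r}) \right|^2$$
which is evaluated as
$$\left(\frac{4\pi Z e^2}{V\,|\underline{p}-\underline{p}'|^2}\right)^2 \left(\frac{E_p + m c^2}{2\,E_p}\right)^2 \left(1 + \frac{\sigma'\,c^2 p^2}{(E_p + m c^2)^2}\right)^2 \left|\chi'^T_{\sigma'}\,\chi_+\right|^2 = \left(\frac{4\pi Z e^2}{V\,|\underline{p}-\underline{p}'|^2}\right)^2 \left(\frac{(E_p + m c^2)^2 + \sigma'\,c^2 p^2}{2\,E_p\,(E_p + m c^2)}\right)^2 \left|\chi'^T_{\sigma'}\,\chi_+\right|^2 \tag{1534}$$
Since $c^2 p^2 = E_p^2 - m^2 c^4$, the probability for helicity non-flip scattering is given by
$$\left(\frac{4\pi Z e^2}{V\,|\underline{p}-\underline{p}'|^2}\right)^2 \cos^2\frac{\theta_{p'}}{2} \tag{1536}$$
whereas the probability for helicity flip scattering is given by
$$\left(\frac{4\pi Z e^2}{V\,|\underline{p}-\underline{p}'|^2}\right)^2 \left(\frac{m c^2}{E_p}\right)^2 \sin^2\frac{\theta_{p'}}{2} \tag{1537}$$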
It is seen that the probability for helicity flip scattering vanishes in the ultra-relativistic limit. Also, in the non-relativistic limit, a static charge cannot flip
the spin. Therefore, in the non-relativistic limit, if one expresses the spin eigenstate as a linear superposition of the final helicity eigenstates
$$\chi_+ = \cos\frac{\theta_{p'}}{2}\;\chi'_+ - \sin\frac{\theta_{p'}}{2}\;\chi'_- \tag{1538}$$
one is led to expect that the relative probability of helicity flip to helicity non-flip
will be governed by a factor of
$$\tan^2\frac{\theta_{p'}}{2} \tag{1539}$$
which agrees with the above matrix elements evaluated in the non-relativistic
limit. The cross-section for non-flip scattering is determined as
$$\left(\frac{d\sigma}{d\Omega'}\right)_{+,+} = \left(\frac{2\,Z e^2\,E_p}{4\,c^2 p^2 \sin^2\frac{\theta_{p'}}{2}}\right)^2 \cos^2\frac{\theta_{p'}}{2} \tag{1540}$$
while the cross-section for helicity flip scattering is
$$\left(\frac{d\sigma}{d\Omega'}\right)_{+,-} = \left(\frac{2\,Z e^2\,m c^2}{4\,c^2 p^2 \sin^2\frac{\theta_{p'}}{2}}\right)^2 \sin^2\frac{\theta_{p'}}{2} \tag{1541}$$
p0
2
0
Ep2 cos2 2p
Ep2 cos2
p0
2
0
m2 c4 sin2 2p
m2 c4 sin2
+
(1543)
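The relative weights of helicity flip and non-flip scattering discussed above can be tabulated with a short script. This is a sketch under the stated Born-approximation results: the non-flip weight is taken as $\cos^2(\theta_{p'}/2)$ and the flip weight as $(mc^2/E_p)^2\sin^2(\theta_{p'}/2)$, with the common prefactor dropped. The function names are hypothetical.

```python
import math


def rotated_helicity_spinors(theta):
    """Final-state helicity spinors (chi'_+, chi'_-) rotated through theta about e2."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (c, s), (-s, c)


def helicity_weights(theta, mc2, energy):
    """Relative Born probabilities (non-flip, flip) for an initial helicity of +1."""
    nonflip = math.cos(theta / 2) ** 2
    flip = (mc2 / energy) ** 2 * math.sin(theta / 2) ** 2
    return nonflip, flip


def flip_ratio(theta, mc2, energy):
    """Ratio of helicity flip to helicity non-flip probability."""
    nonflip, flip = helicity_weights(theta, mc2, energy)
    return flip / nonflip
```

Setting `energy` equal to `mc2` gives the non-relativistic ratio $\tan^2(\theta_{p'}/2)$, while a very large `energy` drives the ratio to zero, in line with the two limits described in the text.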
12.6
The non-relativistic limit of the Dirac equation should reduce to the Schrödinger
equation. As shall be seen, the appropriate Schrödinger equation for a particle
with positive-energy is modified due to the existence of spin. The non-relativistic
limit is described by the Pauli equation$^{102}$.
The Dirac equation can be written as
$$\left(i\,\frac{\hbar}{c}\frac{\partial}{\partial t} - \frac{q}{c}A^0\right)\psi = \left[\underline{\alpha}\cdot\left(\underline{p} - \frac{q}{c}\underline{A}\right) + \beta\,m c\right]\psi \tag{1544}$$
The equation can be written in $2 \times 2$ block form, if the wave function
is expressed in the form of two two-component spinors. We shall mainly focus
on the positive-energy solutions and recognize that, in the non-relativistic limit,
the largest component of the wave function is $\phi^A$ and the largest term in the
102 W.
261
energy is the rest mass energy $m c^2$. Therefore, the spinor wave function will
be expressed as
$$\psi = \begin{pmatrix} \phi^A \\ \phi^B \end{pmatrix} \exp\left[-i\,\frac{m c^2}{\hbar}\,t\right] \tag{1545}$$
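The above form explicitly displays the rest-mass energy of the positive-energy
solution of the Dirac equation. Hence, the Dirac equation takes the form
$$\left(i\hbar\frac{\partial}{\partial t} - q A^0\right)\begin{pmatrix} \phi^A \\ \phi^B \end{pmatrix} = \begin{pmatrix} c\,\underline{\sigma}\cdot(\underline{p} - \frac{q}{c}\underline{A})\,\phi^B \\ c\,\underline{\sigma}\cdot(\underline{p} - \frac{q}{c}\underline{A})\,\phi^A \end{pmatrix} - 2\,m c^2\begin{pmatrix} 0 \\ \phi^B \end{pmatrix} \tag{1546}$$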
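where the rest mass has been eliminated from the equation for the large component $\phi^A$ of the positive-energy solution. Since the kinetic energy and the
potential energy are assumed to be smaller than the rest mass energy, the equation for the small component
$$i\hbar\frac{\partial\phi^B}{\partial t} - q A^0\,\phi^B = c\,\underline{\sigma}\cdot\left(\underline{p} - \frac{q}{c}\underline{A}\right)\phi^A - 2\,m c^2\,\phi^B \tag{1547}$$
can be expressed as
$$\phi^B \approx \frac{1}{2\,m c}\;\underline{\sigma}\cdot\left(\underline{p} - \frac{q}{c}\underline{A}\right)\phi^A \tag{1548}$$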
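Substituting the expression for the small component into the equation for the
large component, hence eliminating $\phi^B$, one finds the equation
$$i\hbar\frac{\partial\phi^A}{\partial t} - q A^0\,\phi^A = \frac{1}{2m}\left[\underline{\sigma}\cdot\left(\underline{p} - \frac{q}{c}\underline{A}\right)\right]^2\phi^A \tag{1549}$$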
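which is the Pauli equation. The equation can be simplified by expanding the
terms involving the Pauli spin matrices. The Pauli identity can be used to
obtain
$$\left[\underline{\sigma}\cdot\left(\underline{p} - \frac{q}{c}\underline{A}\right)\right]^2 = I\left(\underline{p} - \frac{q}{c}\underline{A}\right)^2 + i\,\underline{\sigma}\cdot\left[\left(\underline{p} - \frac{q}{c}\underline{A}\right)\times\left(\underline{p} - \frac{q}{c}\underline{A}\right)\right] = I\left(\underline{p} - \frac{q}{c}\underline{A}\right)^2 - \frac{q\hbar}{c}\;\underline{\sigma}\cdot\left(\nabla\times\underline{A}\right) \tag{1550}$$
where the last term originates from the non-commutativity of the components
of $\underline{p}$
and $\underline{A}$. Since the magnetic field $\underline{B}$ is given by
$$\underline{B} = \nabla\times\underline{A} \tag{1551}$$
the Pauli equation takes the form
$$i\hbar\frac{\partial\phi^A}{\partial t} = \left[\frac{1}{2m}\left(\underline{p} - \frac{q}{c}\underline{A}\right)^2 + q A^0 - \frac{q\hbar}{2mc}\;\underline{\sigma}\cdot\underline{B}\right]\phi^A \tag{1552}$$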
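The Pauli identity $(\underline{\sigma}\cdot\underline{a})(\underline{\sigma}\cdot\underline{b}) = (\underline{a}\cdot\underline{b})\,I + i\,\underline{\sigma}\cdot(\underline{a}\times\underline{b})$ used in this step can be verified numerically for ordinary commuting vectors. The sketch below is only an illustration with hypothetical helper names. For commuting vectors with $\underline{a} = \underline{b}$ the cross product vanishes, so the spin term in the Pauli equation arises purely because the components of $\underline{p}$ and $\underline{A}$ do not commute.

```python
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2 = [[1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def sigma_dot(v):
    """The 2x2 matrix sigma . v for a numerical three-vector v."""
    return [[sum(v[n] * PAULI[n][i][j] for n in range(3)) for j in range(2)] for i in range(2)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def pauli_identity_holds(a, b, tol=1e-12):
    """Compare (sigma.a)(sigma.b) with (a.b) I + i sigma.(a x b)."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    spin = sigma_dot(cross(a, b))
    rhs = [[dot(a, b) * I2[i][j] + 1j * spin[i][j] for j in range(2)] for i in range(2)]
    return all(abs(lhs[i][j] - rhs[i][j]) < tol for i in range(2) for j in range(2))
```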
The Pauli equation$^{103}$ is the non-relativistic limit of the Dirac equation. It represents the Schrödinger equation for a charged particle with spin one-half. The
two components of the spinor $\phi^A$ in the Pauli equation represent the internal
spin of the electron. The last term represents the anomalous Zeeman interaction
between the magnetic field and the electron's spin.
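The other contribution to the Zeeman interaction originates with the electron's orbital angular momentum $\underline{L}$. The ordinary Zeeman interaction occurs
between the constant magnetic field $\underline{B}$ and the orbital angular momentum and
originates from the gauge-invariant term in the Hamiltonian
$$\frac{1}{2m}\left(\underline{p} - \frac{q}{c}\underline{A}\right)^2 = \frac{1}{2m}\left(\underline{p} - \frac{q}{2c}\,\underline{B}\times\underline{r}\right)^2 \tag{1553}$$
where the vector potential has been expressed in terms of the uniform magnetic
field via
$$\underline{A} = \frac{1}{2}\,\underline{B}\times\underline{r} \tag{1554}$$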
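The expression for the energy term can be further simplified to
$$\frac{1}{2m}\left(\underline{p} - \frac{q}{c}\underline{A}\right)^2 = \frac{p^2}{2m} - \frac{q}{4mc}\left[\underline{p}\cdot(\underline{B}\times\underline{r}) + (\underline{B}\times\underline{r})\cdot\underline{p}\right] + \frac{q^2}{2mc^2}\,A^2$$
$$= \frac{p^2}{2m} - \frac{q}{2mc}\,(\underline{B}\times\underline{r})\cdot\underline{p} + \frac{q^2}{2mc^2}\,A^2$$
$$= \frac{p^2}{2m} - \frac{q}{2mc}\,\underline{B}\cdot(\underline{r}\times\underline{p}) + \frac{q^2}{2mc^2}\,A^2$$
$$= \frac{p^2}{2m} - \frac{q}{2mc}\,\underline{B}\cdot\underline{L} + \frac{q^2}{2mc^2}\,A^2 \tag{1555}$$
In obtaining the second line, the $i$th component of $\underline{p}$ has been commuted with
the $i$th component of $(\underline{B}\times\underline{r})$. In obtaining the third line, the (cyclic) vector
identity
$$(\underline{A}\times\underline{B})\cdot\underline{C} = (\underline{B}\times\underline{C})\cdot\underline{A} \tag{1556}$$
has been used. The first term in eqn(1555) represents the usual non-relativistic
expression for the kinetic energy of the electrons, the second term represents
the ordinary Zeeman interaction which originates from the paramagnetic interaction. The last term represents the diamagnetic interaction.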
The total Zeeman interaction is the energy of the total magnetic moment $\underline{M}$
in the field $\underline{B}$
$$\hat{H}_{Zeeman} = -\,\underline{M}\cdot\underline{B} \tag{1557}$$
103 W.
263
(1558)
(1559)
It is seen that both the spin angular momentum and the orbital angular momentum of the charged particle interact with the magnetic field; therefore, both
contribute to the magnetic moment. However, it is noted that the magnetic
moment can be written in the form
$$\underline{M} = \frac{q}{2mc}\left(\underline{L} + g\,\underline{S}\right) \tag{1560}$$
where the magnitude of the magnetic moment is determined by the factor $\frac{q\hbar}{2mc}$
which is the Bohr magneton. The Dirac equation shows that the spin angular
momentum couples to the field with a different strength from the orbital angular momentum, and
the relative coupling strength $g$ (the gyromagnetic ratio) is given by $g = 2$.
The existence of spin and the value of 2 for the gyromagnetic ratio were the
first successes of Dirac's theory. Quantum Electrodynamics$^{104}$ yields a small
correction to the gyromagnetic ratio of
$$g = 2\left(1 + \frac{1}{2\pi}\,\frac{e^2}{\hbar c} + \ldots\right) \tag{1561}$$
which has been experimentally verified to incredible precision$^{105}$. Using the features associated with spin, Dirac's theory correctly described the fine structure
of the Hydrogen atom. The second success of the Dirac equation followed Dirac's
physical interpretation of the negative-energy states in terms of antiparticles$^{106}$.
The second round of success came with the discovery of the positron by Anderson$^{107}$.
Exercise:
The Dirac equation can be phenomenologically modified to describe particles
with anomalous magnetic moments. The Dirac equation is modified to
q h
q
,
i
h ( + i
A ) +
m
c
I
= 0 (1562)
,
hc
4 m c2
104 J.
264
Show that the modified equation is Lorentz covariant and that the Hamiltonian
is Hermitean. Also derive the corrections to the magnetic moment due to the
spin by examining the nonrelativistic limit.
12.7
(1563)
=
I
0
0
=
I
0
I
form
0
I
(1566)
(1567)
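where
$\hat{\underline{\sigma}}$ is the $2 \times 2$ block-diagonal Pauli spin matrix. The rate of change of
orbital angular momentum is given by the Heisenberg equation of motion
$$i\hbar\,\frac{\partial\hat{\underline{L}}}{\partial t} = [\,\hat{\underline{L}}, \hat{H}\,] \tag{1568}$$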
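The orbital angular momentum operator commutes with the mass term and
with the spherically symmetric potential $V(r)$. The orbital angular momentum
does not commute with the momentum. Thus,
$$i\hbar\,\frac{\partial\hat{L}^{(i)}}{\partial t} = c\,[\,\hat{L}^{(i)}, \underline{\alpha}\cdot\underline{\hat{p}}\,] \tag{1569}$$
$$= c\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\sum_j \hat{\sigma}^{(j)}\,[\,\hat{L}^{(i)}, \hat{p}^{(j)}\,] \tag{1570}$$
However, the components of the orbital angular momentum $\hat{L}^{(i)}$ and momenta
$\hat{p}^{(j)}$ satisfy the commutation relations
$$[\,\hat{L}^{(i)}, \hat{p}^{(j)}\,] = i\hbar\sum_k \epsilon^{i,j,k}\,\hat{p}^{(k)} \tag{1571}$$
Therefore, one finds
$$i\hbar\,\frac{\partial\hat{\underline{L}}}{\partial t} = i\hbar\,c\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\left(\hat{\underline{\sigma}}\times\underline{\hat{p}}\right) \tag{1572}$$
which shows that orbital angular momentum is not conserved for a relativistic
electron with a central potential.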
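The spin angular momentum is also not conserved. This can be seen by
examining the Heisenberg equation of motion for the Pauli spin operator
$$i\hbar\,\frac{\partial\hat{\underline{\sigma}}}{\partial t} = [\,\hat{\underline{\sigma}}, \hat{H}\,] \tag{1573}$$
The spin operator commutes with $I$ and $\beta$ but does not commute with the
$\alpha$ matrices. Hence,
$$i\hbar\,\frac{\partial\hat{\sigma}^{(i)}}{\partial t} = c\,[\,\hat{\sigma}^{(i)}, \underline{\alpha}\cdot\underline{\hat{p}}\,] = c\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\sum_j [\,\hat{\sigma}^{(i)}, \hat{\sigma}^{(j)}\,]\,\hat{p}^{(j)} \tag{1574}$$
The components of the Pauli spin operators satisfy the commutation relations
$$[\,\sigma^{(i)}, \sigma^{(j)}\,] = 2\,i\sum_k \epsilon^{i,j,k}\,\sigma^{(k)} \tag{1575}$$
which, clearly, have a similar form to the commutation relations for the orbital
angular momentum. Hence, spin angular momentum is not conserved since
$$i\hbar\,\frac{\partial\hat{\underline{\sigma}}}{\partial t} = -\,2\,i\,c\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\left(\hat{\underline{\sigma}}\times\underline{\hat{p}}\right) \tag{1576}$$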
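However, the total angular momentum
$$\hat{\underline{J}} = \hat{\underline{L}} + \hat{\underline{S}} \tag{1577}$$
is conserved, since
$$i\hbar\,\frac{\partial\hat{\underline{J}}}{\partial t} = i\hbar\,\frac{\partial\hat{\underline{L}}}{\partial t} + i\hbar\,\frac{\partial\hat{\underline{S}}}{\partial t} = i\hbar\,c\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\left(\hat{\underline{\sigma}}\times\underline{\hat{p}}\right) - i\hbar\,c\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\left(\hat{\underline{\sigma}}\times\underline{\hat{p}}\right) = 0 \tag{1578}$$
which follows from combining eqn(1572) and eqn(1576). This confirms the interpretation of the quantity $\hat{\underline{S}}$ defined by
$$\hat{\underline{S}} = \frac{\hbar}{2}\,\hat{\underline{\sigma}} \tag{1579}$$
as the spin angular momentum.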
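The matrix part of this cancellation can be checked numerically. The sketch below treats $\underline{p}$ as an ordinary numerical vector and verifies that $[\hat{\sigma}^{(i)}, \underline{\alpha}\cdot\underline{p}] = -2i\,(\underline{\alpha}\times\underline{p})^{(i)}$ in the standard representation, so that $\frac{\hbar}{2}\hat{\underline{\sigma}}$ contributes $-i\hbar c\,(\underline{\alpha}\times\underline{p})$, which is the negative of the orbital contribution. The helper names are hypothetical, and the orbital commutator itself is not modelled here.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]   # alpha^(i) = [[0, s_i], [s_i, 0]]
SIGMA = [block(s, Z2, Z2, s) for s in PAULI]   # block-diagonal Pauli matrices


def alpha_dot(p):
    """alpha . p for a numerical three-vector p."""
    total = scale(0, ALPHA[0])
    for n in range(3):
        total = add(total, scale(p[n], ALPHA[n]))
    return total


def alpha_cross(p, i):
    """Component i of alpha x p."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return add(scale(p[k], ALPHA[j]), scale(p[j], ALPHA[k]), -1)


def spin_commutator_matches(p):
    """Check [sigma^(i), alpha.p] = -2i (alpha x p)^(i) for i = 1, 2, 3."""
    h = alpha_dot(p)
    for i in range(3):
        lhs = add(matmul(SIGMA[i], h), matmul(h, SIGMA[i]), -1)
        if not close(lhs, scale(-2j, alpha_cross(p, i))):
            return False
    return True
```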
12.8
Conservation of Parity
Dirac was very conscious that his book Principles of Quantum Mechanics
never contained any mention of parity. It seems that he had questioned the
requirement of parity invariance108 since biological systems are not parity invariant. Diracs viewpoint was vindicated by the discovery that the weak interaction violates parity.
The parity transform P acting on the coordinates (t, r) has the effect
$$ \hat{\mathcal{P}} : \; (t, \mathbf{r}) \rightarrow (t', \mathbf{r}') = (t, - \mathbf{r}) \tag{1580} $$
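which is an inversion of the spatial coordinates. Thus, the parity operation reverses the
space-like components of vectors, so the effects of the parity operation on the
position and momentum vectors are given by
$$ \hat{\mathcal{P}} \, \mathbf{r} \, \hat{\mathcal{P}}^{-1} = - \mathbf{r} , \qquad \hat{\mathcal{P}} \, \hat{\mathbf{p}} \, \hat{\mathcal{P}}^{-1} = - \hat{\mathbf{p}} \tag{1581} $$
The orbital angular momentum $\hat{\mathbf{L}} = \mathbf{r} \times \hat{\mathbf{p}}$ therefore transforms as
$$ \hat{\mathcal{P}} \, \hat{\mathbf{L}} \, \hat{\mathcal{P}}^{-1} = \hat{\mathbf{L}} \tag{1582} $$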
108 The question of parity conservation in weak interactions was raised subsequently by T. D.
Lee and C. N. Yang [T. D. Lee and C. N. Yang, Phys. Rev. 104, 254 (1956).]
which is unchanged. This implies that spin angular momentum should also be
invariant under the parity transform
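$$ \hat{\mathcal{P}} \, \hat{\boldsymbol{\sigma}} \, \hat{\mathcal{P}}^{-1} = \hat{\boldsymbol{\sigma}} \tag{1583} $$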
(1584)
(1585)
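$$ \hat{\mathcal{P}} \, V(\mathbf{r}) \, \hat{\mathcal{P}}^{-1} = V(\mathbf{r}) \tag{1586} $$
$$ \hat{\mathcal{P}} \, \boldsymbol{\alpha} \, \hat{\mathcal{P}}^{-1} = - \boldsymbol{\alpha} , \qquad \hat{\mathcal{P}} \, \beta \, \hat{\mathcal{P}}^{-1} = \beta \tag{1587} $$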
The condition on the potential is the familiar condition for parity invariance
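in classical mechanics. In the standard representation, in $2 \times 2$ block-diagonal
form, the requirement of parity invariance on the Dirac matrices becomes the
matrix equations
$$ \hat{\mathcal{P}} \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \hat{\mathcal{P}}^{-1} = - \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} , \qquad \hat{\mathcal{P}} \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \hat{\mathcal{P}}^{-1} = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{1588} $$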
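The above equation shows that, in the standard representation, the parity operator can be uniquely factorized as
$$ \hat{\mathcal{P}} = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \hat{P} \tag{1589} $$
where the operator $\hat{P}$ only acts on the coordinates $\mathbf{r}$. The presence of the matrix
in the parity operation on the Dirac spinor should be compared with the effect
of the parity operator on the four-vector potential of Electrodynamics $A^{\mu}(\mathbf{r})$
which is given by the product of spatial inversion and a matrix operation
$$ \hat{\mathcal{P}} A^{\mu}(\mathbf{r}, t) = \Lambda^{\mu}{}_{\nu} \, \hat{P} A^{\nu}(\mathbf{r}, t) = \Lambda^{\mu}{}_{\nu} \, A^{\nu}(-\mathbf{r}, t) \tag{1590} $$
where
$$ \Lambda^{\mu}{}_{\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{1591} $$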
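$$ \hat{\mathcal{P}} \, \psi(t, \mathbf{r}) = \hat{\mathcal{P}} \begin{pmatrix} \psi^{A}(t, \mathbf{r}) \\ \psi^{B}(t, \mathbf{r}) \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \begin{pmatrix} \psi^{A}(t, -\mathbf{r}) \\ \psi^{B}(t, -\mathbf{r}) \end{pmatrix} = \begin{pmatrix} \psi^{A}(t, -\mathbf{r}) \\ - \psi^{B}(t, -\mathbf{r}) \end{pmatrix} \tag{1592} $$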
Hence, in the standard representation, the parity operator changes the relative
sign of the two two-component spinors. Due to the presence of the term $-I$ in
the lower diagonal block of the parity matrix, the lower two-component spinor
$\psi^{B}$ in the Dirac spinor is said to have a negative intrinsic parity.
The parity eigenstates satisfy the eigenvalue equation
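$$ \hat{\mathcal{P}} \, \psi = p \, \psi \tag{1593} $$
In terms of the two-component spinors, this requires
$$ \hat{P} \, \psi^{A}(\mathbf{r}) = p \, \psi^{A}(\mathbf{r}) , \qquad \hat{P} \, \psi^{B}(\mathbf{r}) = - p \, \psi^{B}(\mathbf{r}) \tag{1595} $$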
(1597)
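$$ \sin\theta \rightarrow \sin\theta , \qquad \cos\theta \rightarrow - \cos\theta , \qquad \exp[ i m \varphi ] \rightarrow ( -1 )^{m} \exp[ i m \varphi ] \tag{1598} $$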
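$$ Y^{l}_{l}(\theta, \varphi) = \frac{ ( -1 )^{l} }{ 2^{l} \, l! } \sqrt{ \frac{ ( 2 l + 1 )! }{ 4 \pi } } \; \sin^{l}\theta \, \exp[ i l \varphi ] \tag{1599} $$
are eigenstates of the parity operator and have parity eigenvalues of $(-1)^{l}$. The
lowering operator $\hat{L}^{-}$, defined via
$$ \hat{L}^{-} = \hbar \exp[ - i \varphi ] \left( - \frac{\partial}{\partial \theta} + i \cot\theta \, \frac{\partial}{\partial \varphi} \right) \tag{1600} $$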
P L
(1601)
(1602)
which shows that all states with a definite magnitude of the orbital angular momentum l are eigenstates of the parity operator and have the same eigenvalue.
Exercise:
Show that under a parity transformation the positive-energy solution for the
free Dirac particle $\psi^{+}_{k,\sigma}(x)$ transforms as
$$ \hat{\mathcal{P}} \, \psi^{+}_{k,\sigma}(x) = \psi^{+}_{-k,\sigma}(x) \tag{1603} $$
whereas the negative-energy solution transforms as
$$ \hat{\mathcal{P}} \, \psi^{-}_{k,\sigma}(x) = - \psi^{-}_{-k,\sigma}(x) \tag{1604} $$
Hence, the parity operation reverses the momentum and keeps the spin invariant for both the positive-energy and the negative-energy solutions. The extra
negative sign implies that the negative-energy solution has opposite intrinsic
parity to the positive-energy solution.
Exercise:
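Consider the parity transform as an example of an improper Lorentz transformation $\Lambda$, for which $\det | \Lambda | = -1$. If the Lorentz transform is given
by
$$ x'^{\mu} = \Lambda^{\mu}{}_{\nu} \, x^{\nu} \tag{1605} $$
the spinor wave function transforms via
$$ \psi'(x') = \hat{R}(\Lambda) \, \psi(x) \tag{1606} $$
where $\hat{R}(\Lambda)$ rotates the spinor. The covariant condition for the Dirac equation
is
$$ \hat{R}^{-1}(\Lambda) \, \gamma^{\mu} \, \hat{R}(\Lambda) = \Lambda^{\mu}{}_{\nu} \, \gamma^{\nu} \tag{1607} $$
For a parity transformation, one has
$$ x'^{\mu} = x_{\mu} \tag{1608} $$
since the spatial components of $x$ change sign. Hence, for a parity transformation, the transformation matrix is determined as
$$ \Lambda^{\mu}{}_{\nu} = g^{\mu,\nu} \tag{1609} $$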
(1610)
R
= g,
(1611)
12.9 Bilinear Covariants
(1612)
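$$ \psi'(x') = \hat{R}(\Lambda) \, \psi(x) \tag{1613} $$
and the condition that the Dirac equation is covariant under the orthochronous
Lorentz transformation is
$$ \hat{R}^{-1}(\Lambda) \, \gamma^{\mu} \, \hat{R}(\Lambda) = \Lambda^{\mu}{}_{\nu} \, \gamma^{\nu} \tag{1614} $$
From the transformational properties of the Dirac spinors, together with the
identity
$$ \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} = \gamma^{(0)} \, \hat{R}^{-1}(\Lambda) \tag{1615} $$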
one can find the transformational properties of quantities that are bilinear in
the Dirac spinors.
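$$ \begin{aligned} \overline{\psi}' \, \psi' &= \psi'^{\dagger} \, \gamma^{(0)} \, \psi' \\ &= \psi^{\dagger} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, ( \gamma^{(0)} )^{2} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, \gamma^{(0)} \, \hat{R}^{-1}(\Lambda) \, \hat{R}(\Lambda) \, \psi \end{aligned} \tag{1616} $$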
for the Dirac equation.
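Table 7: The sixteen bilinear covariants $\overline{\psi} \, Q \, \psi$

Quantity               Bilinear Covariant                                  Transformed Matrix $\hat{R}^{-1}(\Lambda) \, Q \, \hat{R}(\Lambda)$                      Number of Matrices
Scalar                 $\overline{\psi} \psi$                              $I$                                                                                      1
Vector                 $\overline{\psi} \gamma^{\mu} \psi$                 $\Lambda^{\mu}{}_{\nu} \gamma^{\nu}$                                                     4
Antisymmetric Tensor   $\overline{\psi} \sigma^{\mu,\nu} \psi$             $\Lambda^{\mu}{}_{\rho} \Lambda^{\nu}{}_{\tau} \sigma^{\rho,\tau}$                       6
Pseudoscalar           $\overline{\psi} \gamma^{(4)} \psi$                 $\det | \Lambda | \; \gamma^{(4)}$                                                       1
Axial-Vector           $\overline{\psi} \gamma^{(4)} \gamma^{\mu} \psi$    $\det | \Lambda | \; \Lambda^{\mu}{}_{\nu} \gamma^{(4)} \gamma^{\nu}$                    4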
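where a factor of $( \gamma^{(0)} )^{2} = I$ has been used in the third line and the identity of eqn(1615)
has been used in the fourth. Thus, one finds that $\overline{\psi} \psi$ transforms like a scalar.
Likewise, one can show that the bilinear quantities $\overline{\psi} \gamma^{\mu} \psi$ transform like
the components of a four-vector. That is
$$ \begin{aligned} \overline{\psi}' \, \gamma^{\mu} \, \psi' &= \psi'^{\dagger} \, \gamma^{(0)} \, \gamma^{\mu} \, \psi' \\ &= \psi^{\dagger} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \gamma^{\mu} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, ( \gamma^{(0)} )^{2} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \gamma^{\mu} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, \gamma^{(0)} \, \hat{R}^{-1}(\Lambda) \, \gamma^{\mu} \, \hat{R}(\Lambda) \, \psi \\ &= \overline{\psi} \, \hat{R}^{-1}(\Lambda) \, \gamma^{\mu} \, \hat{R}(\Lambda) \, \psi \\ &= \Lambda^{\mu}{}_{\nu} \; \overline{\psi} \, \gamma^{\nu} \, \psi \end{aligned} \tag{1617} $$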
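where the covariant condition has been used in obtaining the last line. Since
this relation holds for Lorentz boosts, rotations and spatial inversions, $\overline{\psi} \gamma^{\mu} \psi$
is a four-vector.
The antisymmetric quantity $\sigma^{\mu,\nu}$, defined as
$$ \sigma^{\mu,\nu} = \frac{i}{2} \, [ \gamma^{\mu} , \gamma^{\nu} ] \tag{1618} $$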
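$$ \begin{aligned} \overline{\psi}' \, \sigma^{\mu,\nu} \, \psi' &= \psi'^{\dagger} \, \gamma^{(0)} \, \sigma^{\mu,\nu} \, \psi' \\ &= \psi^{\dagger} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \sigma^{\mu,\nu} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, ( \gamma^{(0)} )^{2} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \sigma^{\mu,\nu} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, \gamma^{(0)} \, \hat{R}^{-1}(\Lambda) \, \sigma^{\mu,\nu} \, \hat{R}(\Lambda) \, \psi \\ &= \overline{\psi} \, \hat{R}^{-1}(\Lambda) \, \sigma^{\mu,\nu} \, \hat{R}(\Lambda) \, \psi \end{aligned} \tag{1619} $$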
(1620)
, 0
1 () R()
= i R
1 () R()
1 () R()
= i R
R
= i
=
(1621)
(1622)
(1623)
=
I 0
(1624)
(4)
has the
(1625)
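The quantity $\gamma^{(4)}$ can be used to construct a bilinear covariant quantity $\overline{\psi} \gamma^{(4)} \psi$.
Under an orthochronous Lorentz transformation, the bilinear quantity transforms according to
$$ \begin{aligned} \overline{\psi}' \, \gamma^{(4)} \, \psi' &= \psi'^{\dagger} \, \gamma^{(0)} \, \gamma^{(4)} \, \psi' \\ &= \psi^{\dagger} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \gamma^{(4)} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, ( \gamma^{(0)} )^{2} \, \hat{R}^{\dagger}(\Lambda) \, \gamma^{(0)} \, \gamma^{(4)} \, \hat{R}(\Lambda) \, \psi \\ &= \psi^{\dagger} \, \gamma^{(0)} \, \hat{R}^{-1}(\Lambda) \, \gamma^{(4)} \, \hat{R}(\Lambda) \, \psi \\ &= \overline{\psi} \, \hat{R}^{-1}(\Lambda) \, \gamma^{(4)} \, \hat{R}(\Lambda) \, \psi \end{aligned} \tag{1626} $$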
R().
Therefore, the bilinear quantity transforms as
(1627)
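which behaves like a scalar under a proper orthochronous Lorentz transformation, for which $\det | \Lambda | = 1$. However, for an inversion, where $\det | \Lambda | = -1$,
$$ \hat{R}^{-1}(\Lambda) \, \gamma^{(4)} \, \hat{R}(\Lambda) = \det | \Lambda | \; \gamma^{(4)} $$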
so one has
(1629)
(1630)
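One can also define the bilinear axial-vector $\overline{\psi} \gamma^{(4)} \gamma^{\mu} \psi$. From considerations similar to those used previously, one can show that these quantities
transform according to
$$ \overline{\psi}' \, \gamma^{(4)} \gamma^{\mu} \, \psi' = \det | \Lambda | \; \Lambda^{\mu}{}_{\nu} \; \overline{\psi} \, \gamma^{(4)} \gamma^{\nu} \, \psi \tag{1631} $$
Hence, the spatial components do not change sign
under an inversion, but the time component does change sign. Therefore, $\overline{\psi} \gamma^{(4)} \gamma^{\mu} \psi$
transforms like an axial-vector.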
Exercise:
Show that a modified Dirac equation described by
q
qh
, (4)
i h ( + i
A ) i
F, m c I = 0 (1632)
hc
4 m c2
12.10
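$$ ( E - V(r) - m c^{2} ) \, \psi^{A}(\mathbf{r}) = c \, ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) \, \psi^{B}(\mathbf{r}) $$
$$ ( E - V(r) + m c^{2} ) \, \psi^{B}(\mathbf{r}) = c \, ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) \, \psi^{A}(\mathbf{r}) $$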
cos exp[+i]
r
i h
0
i
exp[+i]
r sin
(1635)
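$( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} )$ can be expressed as
$$ ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) = - i \hbar \left[ \begin{pmatrix} \cos\theta & \sin\theta \, \exp[ - i \varphi ] \\ \sin\theta \, \exp[ + i \varphi ] & - \cos\theta \end{pmatrix} \frac{\partial}{\partial r} + \frac{1}{r} \begin{pmatrix} - \sin\theta & \cos\theta \, \exp[ - i \varphi ] \\ \cos\theta \, \exp[ + i \varphi ] & \sin\theta \end{pmatrix} \frac{\partial}{\partial \theta} + \frac{1}{r \sin\theta} \begin{pmatrix} 0 & - i \exp[ - i \varphi ] \\ i \exp[ + i \varphi ] & 0 \end{pmatrix} \frac{\partial}{\partial \varphi} \right] \tag{1636} $$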
which has a quite complicated structure. For future reference, it shall be noted
that the matrix part of the coefficient of the partial derivative w.r.t. r is simply
equal to
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \tag{1637} $$
which is independent of the radial coordinate r. The operator ( . p ) can be
cast in a more convenient form through the repeated use of the Pauli identity.
$$ \left( \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \right)^{2} = I \tag{1638} $$
(1639)
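and are their own inverses. Therefore, one can express the operator $( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} )$ as
$$ \begin{aligned} ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) &= \left( \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \right)^{2} ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) \\ &= \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \; ( \mathbf{r} \cdot \boldsymbol{\sigma} ) \, ( \boldsymbol{\sigma} \cdot \hat{\mathbf{p}} ) \\ &= \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( \mathbf{r} \cdot \hat{\mathbf{p}} + i \, \boldsymbol{\sigma} \cdot ( \mathbf{r} \times \hat{\mathbf{p}} ) \right) \\ &= \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( - i \hbar \, r \frac{\partial}{\partial r} + i \, \boldsymbol{\sigma} \cdot \hat{\mathbf{L}} \right) \\ &= \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( - i \hbar \, r \frac{\partial}{\partial r} + \frac{2 i}{\hbar} \, \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \right) \end{aligned} \tag{1640} $$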
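where the Pauli identity has been used in going between the second and third
lines. Therefore, the two-component spinors satisfy the set of coupled equations
$$ ( E - V(r) - m c^{2} ) \, \psi^{A}(\mathbf{r}) = c \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( - i \hbar \, r \frac{\partial}{\partial r} + \frac{2 i}{\hbar} \, \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \right) \psi^{B}(\mathbf{r}) $$
$$ ( E - V(r) + m c^{2} ) \, \psi^{B}(\mathbf{r}) = c \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( - i \hbar \, r \frac{\partial}{\partial r} + \frac{2 i}{\hbar} \, \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \right) \psi^{A}(\mathbf{r}) \tag{1641} $$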
It is seen that, due to the effect of special relativity, the Dirac equation results
in the coupling of the spin and the orbital angular momentum.
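The Pauli identity invoked in the derivation, $( \boldsymbol{\sigma} \cdot \mathbf{a} ) ( \boldsymbol{\sigma} \cdot \mathbf{b} ) = ( \mathbf{a} \cdot \mathbf{b} ) \, I + i \, \boldsymbol{\sigma} \cdot ( \mathbf{a} \times \mathbf{b} )$, can be checked numerically for ordinary (commuting) vectors. The sketch below is only an illustration with hypothetical helper names; it does not address the operator ordering of $\mathbf{r}$ and $\hat{\mathbf{p}}$, which the derivation above treats separately. As a special case it also confirms that the radial spin projection squares to the unit matrix.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a real 3-vector v."""
    return [[sum(v[k] * SIGMA[k][r][c] for k in range(3)) for c in range(2)]
            for r in range(2)]

def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pauli_identity_residual(a, b):
    """Largest element of (sigma.a)(sigma.b) - [(a.b) I + i sigma.(a x b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    dot = sum(x * y for x, y in zip(a, b))
    sc = sigma_dot(cross(a, b))
    return max(abs(lhs[r][c] - (dot * (r == c) + 1j * sc[r][c]))
               for r in range(2) for c in range(2))

def unit_square_residual(theta, phi):
    """Largest element of (sigma . r/r)^2 - I for the direction (theta, phi)."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    sq = matmul(sigma_dot(n), sigma_dot(n))
    return max(abs(sq[r][c] - (r == c)) for r in range(2) for c in range(2))

print(pauli_identity_residual((0.3, -1.2, 2.0), (1.5, 0.4, -0.7)))
print(unit_square_residual(0.7, 2.1))
```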
Two-Component Spinor Spherical Harmonics
The angular dependence of the two-component wave functions $\psi^{A}(\mathbf{r})$ and
$\psi^{B}(\mathbf{r})$ is determined by the eigenvalue equations for the magnitude and the
z-component of the total angular momentum
$$ \hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}} \tag{1642} $$
j =l+
1
2
j =l
1
2
l + jz +
2 l + 1
sz = 12
1
2
l jz +
2 l + 1
1
2
l jz +
2 l + 1
1
2
l +jz + 12
2 l + 1
1
1
j l
2
2
(1643)
1
2
1
0
= l
2
= l +
(1645)
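As shall be seen later, the two-component spinors $\Omega^{l'}_{l' - \frac{1}{2}, j_z}(\theta, \varphi)$ and $\Omega^{l}_{l + \frac{1}{2}, j_z}(\theta, \varphi)$
have opposite parities. In fact, the two-component spinors generated by angular
momentum $l$ and $l' = ( l + 1 )$ are related by the action of the pseudoscalar
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } = \begin{pmatrix} \cos\theta & \sin\theta \, \exp[ - i \varphi ] \\ \sin\theta \, \exp[ + i \varphi ] & - \cos\theta \end{pmatrix} \tag{1647} $$
which changes sign under a parity transformation, $( \theta , \varphi ) \rightarrow ( \pi - \theta , \varphi + \pi )$. The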
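explicit relationship is given by
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{l}_{j,j_z}(\theta, \varphi) = - \Omega^{l+1}_{j,j_z}(\theta, \varphi) \tag{1648} $$
as can be shown by examination of Table(1). Likewise, on using the identity
$$ \left( \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \right)^{2} = I \tag{1649} $$
one finds that the inverse relationship between the two-component spinors is
also given by
$$ \Omega^{l}_{j,j_z}(\theta, \varphi) = - \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{l+1}_{j,j_z}(\theta, \varphi) \tag{1650} $$
Therefore, one concludes that the two angular momentum eigenstates have different properties under the spatial inversion transformation $\mathbf{r} \rightarrow - \mathbf{r}$.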
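Mathematical Interlude:
The Action of the Operator $( \mathbf{r} \cdot \boldsymbol{\sigma} )$ on the Spinor Spherical Harmonics $\Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi)$.
Here, it will be argued that the spinor spherical harmonics satisfy the equations
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = - \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) $$
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) = - \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1651} $$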
(1652)
(1653)
The complete proof of this statement immediately follows from the proof of the
relation for any one component J(i) , since ( r . S ) is spherically symmetric.
Thus, for i = 1, one has
[ J(1) , ( r . S ) ]
(1655)
(1656)
and
one finds that
[ J(1) , ( r . S ) ]
= i
h
=
(1657)
which was to be shown. From repeated use of the above commutation relations
which involve the components J(i) , it immediately follows that
$$ [ \hat{\mathbf{J}}^{2} , ( \mathbf{r} \cdot \hat{\mathbf{S}} ) ] = 0 \tag{1658} $$
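Thus, since $\Omega^{j \pm \frac{1}{2}}_{j,j_z}$ is a simultaneous eigenstate of $\hat{\mathbf{J}}^{2}$ and $\hat{J}^{(3)}$ and because these
operators commute with $( \mathbf{r} \cdot \hat{\mathbf{S}} )$, then $( \mathbf{r} \cdot \hat{\mathbf{S}} ) \, \Omega^{j \pm \frac{1}{2}}_{j,j_z}$ is also a simultaneous
eigenstate with eigenvalues $( j , j_z )$.
Since the states $( \mathbf{r} \cdot \hat{\mathbf{S}} ) \, \Omega^{j \pm \frac{1}{2}}_{j,j_z}$ are simultaneous eigenstates of $\hat{\mathbf{J}}^{2}$ and $\hat{J}^{(3)}$
with eigenvalues $( j , j_z )$, and because this subspace is spanned by the basis composed of the two states $\Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi)$, the transformed states can be decomposed
as
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = C_{++}(j, j_z) \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j, j_z) \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) $$
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) = C_{-+}(j, j_z) \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{--}(j, j_z) \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1659} $$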
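Since the lowering operator $\hat{J}^{-}$ commutes with $( \mathbf{r} \cdot \boldsymbol{\sigma} )$, one has
$$ \hat{J}^{-} \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi) = \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \hat{J}^{-} \, \Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1660} $$
and
$$ \hat{J}^{-} \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = C_{++}(j, j_z) \; \hat{J}^{-} \, \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j, j_z) \; \hat{J}^{-} \, \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) $$
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \hat{J}^{-} \, \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = C_{++}(j, j_z - 1) \; \hat{J}^{-} \, \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j, j_z - 1) \; \hat{J}^{-} \, \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1661} $$
Hence, on comparing the linearly-independent terms on the right-hand sides, one
concludes that
$$ C_{++}(j, j_z - 1) = C_{++}(j, j_z) , \qquad C_{+-}(j, j_z - 1) = C_{+-}(j, j_z) \tag{1662} $$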
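Under the parity transformation, the spinor spherical harmonics transform as
$$ \Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi) \rightarrow ( -1 )^{j \pm \frac{1}{2}} \; \Omega^{j \pm \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1663} $$
which follows from the properties of the spherical harmonics $Y^{l}_{m}(\theta, \varphi)$ under the
parity transformation. Also one has
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \rightarrow - \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \tag{1664} $$
under the parity transform. Thus, after the parity transform, one finds that the
transformed states have the decompositions
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = - C_{++}(j) \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) + C_{+-}(j) \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) $$
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) = C_{-+}(j) \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) - C_{--}(j) \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1665} $$
which by comparison with eqn(1659) leads to the identification
$$ C_{++}(j) = C_{--}(j) = 0 \tag{1666} $$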
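Therefore, recalling that the coefficients are independent of $j_z$, one can express
the effect of the operator on the spinor spherical harmonics as
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) = C_{-+}(j) \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) $$
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = C_{+-}(j) \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1667} $$
Furthermore, since
$$ \left( \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \right)^{2} = I \tag{1668} $$
one has
$$ C_{+-}(j) \; C_{-+}(j) = 1 \tag{1669} $$
Also, the operator $\frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r }$ is Hermitean, which
leads to
$$ C_{+-}^{*}(j) = C_{-+}(j) \tag{1670} $$
The above two equations suggest that $C_{+-}(j)$ and $C_{-+}(j)$ are pure phase factors, such as
$$ C_{+-}(j) = \exp[ + i \delta(j) ] , \qquad C_{-+}(j) = \exp[ - i \delta(j) ] \tag{1671} $$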
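The phase factor can be completely determined by considering the relations
(1667) with specific choices of the values of $( \theta , \varphi )$. As can be seen by examining
the case where $\varphi = 0$, the phase $\delta(j)$ is either zero or $\pi$. For the case $\varphi = 0$,
the operator simplifies to
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & - \cos\theta \end{pmatrix} \tag{1672} $$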
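The spinor spherical harmonics are given by
$$ \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = \begin{pmatrix} - \sqrt{ \frac{ j + 1 - j_z }{ 2 j + 2 } } \; Y^{j + \frac{1}{2}}_{j_z - \frac{1}{2}}(\theta, \varphi) \\ + \sqrt{ \frac{ j + 1 + j_z }{ 2 j + 2 } } \; Y^{j + \frac{1}{2}}_{j_z + \frac{1}{2}}(\theta, \varphi) \end{pmatrix} $$
$$ \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) = \begin{pmatrix} \sqrt{ \frac{ j + j_z }{ 2 j } } \; Y^{j - \frac{1}{2}}_{j_z - \frac{1}{2}}(\theta, \varphi) \\ \sqrt{ \frac{ j - j_z }{ 2 j } } \; Y^{j - \frac{1}{2}}_{j_z + \frac{1}{2}}(\theta, \varphi) \end{pmatrix} \tag{1673} $$
which become real for $\varphi = 0$ since the spherical harmonics become real. Hence,
on inspecting eqn(1667) with $\varphi = 0$, one concludes that the phase factors are
equal and are purely real. That is
$$ C_{+-}(j) = C_{-+}(j) = \pm 1 \tag{1674} $$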
(1675)
r
=
2j + 1 1
jz 12 ,0
4
(1676)
j 1
one finds that, for fixed j, only the four spinor spherical harmonics j,jz2 (0, )
j+ 1
j,jz2 (0, )
2j+1
q 8
2j+1
8
+
q
j
j,jz2 (0, ) = q
2j+1
8
2j+1
8
jz 12 ,0
jz + 12 ,0
jz 12 ,0
(1677)
jz + 12 ,0
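are nonzero. The spinor spherical harmonics with $\theta = 0$ and $j_z = \pm \frac{1}{2}$ are connected via
$$ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \Omega^{j \mp \frac{1}{2}}_{j, j_z}(0, \varphi) = - \Omega^{j \pm \frac{1}{2}}_{j, j_z}(0, \varphi) \tag{1678} $$
Hence, one has determined that
$$ C_{+-}(j) = C_{-+}(j) = -1 \tag{1679} $$
which holds independent of the values of $\theta$ and $j$, so the effect of the operator
on the spinor spherical harmonics is completely specified by
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) = - \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) $$
$$ \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r } \; \Omega^{j - \frac{1}{2}}_{j,j_z}(\theta, \varphi) = - \Omega^{j + \frac{1}{2}}_{j,j_z}(\theta, \varphi) \tag{1680} $$
as was to be shown.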
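The result of the interlude can be spot-checked for the lowest case $j = \frac{1}{2}$, where only $Y^{0}_{0}$ and $Y^{1}_{m}$ are needed. The sketch below assumes the Condon-Shortley phase convention for the spherical harmonics and the coefficients of eqn(1673); the function names are illustrative only. It evaluates the residual of $( \mathbf{r} \cdot \boldsymbol{\sigma} / r ) \, \Omega^{j \mp \frac{1}{2}} + \Omega^{j \pm \frac{1}{2}}$ at an arbitrary direction.

```python
import cmath
import math

def Y(l, m, theta, phi):
    """Spherical harmonic Y^l_m for l = 0 or 1 (Condon-Shortley phase); 0 if |m| > l."""
    if abs(m) > l:
        return 0.0
    if l == 0:
        return 1.0 / math.sqrt(4.0 * math.pi)
    if m == 0:
        return math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta)
    # m = +1 or m = -1
    return (-m * math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta)
            * cmath.exp(1j * m * phi))

def omega(l, two_jz, theta, phi):
    """Spinor spherical harmonic for j = 1/2 with orbital angular momentum l (0 or 1)."""
    j, jz = 0.5, two_jz / 2.0
    m_up, m_dn = (two_jz - 1) // 2, (two_jz + 1) // 2
    if l == 0:   # l = j - 1/2
        up = math.sqrt((j + jz) / (2 * j)) * Y(0, m_up, theta, phi)
        dn = math.sqrt((j - jz) / (2 * j)) * Y(0, m_dn, theta, phi)
    else:        # l = j + 1/2
        up = -math.sqrt((j + 1 - jz) / (2 * j + 2)) * Y(1, m_up, theta, phi)
        dn = math.sqrt((j + 1 + jz) / (2 * j + 2)) * Y(1, m_dn, theta, phi)
    return up, dn

def sigma_r(spinor, theta, phi):
    """Apply the matrix (r . sigma)/r of eqn(1647) to a two-component spinor."""
    up, dn = spinor
    return (math.cos(theta) * up + math.sin(theta) * cmath.exp(-1j * phi) * dn,
            math.sin(theta) * cmath.exp(1j * phi) * up - math.cos(theta) * dn)

def residual(two_jz, theta, phi):
    """Largest component of (r.sigma/r) Omega^(l) + Omega^(l') in both directions."""
    low, high = omega(0, two_jz, theta, phi), omega(1, two_jz, theta, phi)
    a, b = sigma_r(low, theta, phi), sigma_r(high, theta, phi)
    return max(abs(a[0] + high[0]), abs(a[1] + high[1]),
               abs(b[0] + low[0]), abs(b[1] + low[1]))

for two_jz in (1, -1):
    print(two_jz, residual(two_jz, 0.9, 2.3))
```

Both residuals should vanish to rounding accuracy if the sign $C_{+-}(j) = C_{-+}(j) = -1$ is correct for these conventions.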
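The Ansatz
If one only considers the spatial part of the parity operator, $\hat{P}$, the two-component spinor states $\Omega^{l'}_{l' \pm \frac{1}{2}, j_z}(\theta, \varphi)$ have parities $(-1)^{l'}$
$$ \hat{P} \; \Omega^{l'}_{l' \pm \frac{1}{2}, j_z}(\theta, \varphi) = ( -1 )^{l'} \; \Omega^{l'}_{l' \pm \frac{1}{2}, j_z}(\theta, \varphi) \tag{1681} $$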
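Furthermore, as has been seen, the upper and lower two-component spinors of
the four-component Dirac spinor must have opposite intrinsic parity. Therefore,
the desired simultaneous eigenstates for the relativistic electron can be either
$$ \psi^{-}_{l + \frac{1}{2}, j_z}(\mathbf{r}) = \begin{pmatrix} \frac{ f^{-}(r) }{ r } \; \Omega^{l}_{l + \frac{1}{2}, j_z}(\theta, \varphi) \\ i \, \frac{ g^{-}(r) }{ r } \; \Omega^{l+1}_{l + \frac{1}{2}, j_z}(\theta, \varphi) \end{pmatrix} \tag{1682} $$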
or by $\psi^{+}_{j,j_z}(\mathbf{r})$
$$ \psi^{+}_{l + \frac{1}{2}, j_z}(\mathbf{r}) = \begin{pmatrix} \frac{ f^{+}(r) }{ r } \; \Omega^{l+1}_{l + \frac{1}{2}, j_z}(\theta, \varphi) \\ i \, \frac{ g^{+}(r) }{ r } \; \Omega^{l}_{l + \frac{1}{2}, j_z}(\theta, \varphi) \end{pmatrix} \tag{1683} $$
which has parity $(-1)^{l+1}$. In these expressions $f^{\pm}(r)$ and $g^{\pm}(r)$ are scalar
radial functions that have to be determined as solutions of the radial equation.
These states do not correspond to definite values of the orbital angular momentum since the upper and lower two-component spinors correspond to the
different values of either $l$ or $l' = l + 1$ for the orbital angular momentum.
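To condense the notation, the energy eigenstates will be written in the compact form
$$ \psi_{j,j_z}(\mathbf{r}) = \begin{pmatrix} \frac{ f(r) }{ r } \; \Omega^{l_A}_{j,j_z}(\theta, \varphi) \\ i \, \frac{ g(r) }{ r } \; \Omega^{l_B}_{j,j_z}(\theta, \varphi) \end{pmatrix} \tag{1684} $$
where $l_A = j \pm \frac{1}{2}$ and $l_B = j \mp \frac{1}{2}$.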
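We shall find the radial Dirac equation for the solution $\psi_{j,j_z}(\mathbf{r})$. The Dirac
spinor wave functions in eqn(1683) and eqn(1682) are substituted into eqns(1641).
The spin-orbit interaction term can be evaluated by squaring the expression
$$ \hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}} \tag{1685} $$
which yields
$$ 2 \, \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} = \hat{\mathbf{J}}^{2} - \hat{\mathbf{L}}^{2} - \hat{\mathbf{S}}^{2} \tag{1686} $$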
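When this operator acts on the relativistic two-component spinor spherical harmonic $\Omega^{l_A}_{j,j_z}$, one finds
$$ \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \; \Omega^{l_A}_{j,j_z} = \frac{ \hbar^{2} }{ 2 } \left( j ( j + 1 ) - l_A ( l_A + 1 ) - \frac{3}{4} \right) \Omega^{l_A}_{j,j_z} \tag{1687} $$
which for $j = l_A + \frac{1}{2}$ yields
$$ \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \; \Omega^{l_A}_{j,j_z} = \frac{ \hbar^{2} }{ 2 } \left( j - \frac{1}{2} \right) \Omega^{l_A}_{j,j_z} \tag{1688} $$
whereas for $j = l_A - \frac{1}{2}$ it yields
$$ \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \; \Omega^{l_A}_{j,j_z} = - \frac{ \hbar^{2} }{ 2 } \left( j + \frac{3}{2} \right) \Omega^{l_A}_{j,j_z} \tag{1689} $$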
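$$ ( E - V(r) - m c^{2} ) \, \frac{ f(r) }{ r } \; \Omega^{l_A}_{j,j_z} = c \hbar \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( r \frac{\partial}{\partial r} - \frac{2}{\hbar^{2}} \, \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \right) \frac{ g(r) }{ r } \; \Omega^{l_B}_{j,j_z} $$
$$ ( E - V(r) + m c^{2} ) \, \frac{ g(r) }{ r } \; \Omega^{l_B}_{j,j_z} = - c \hbar \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( r \frac{\partial}{\partial r} - \frac{2}{\hbar^{2}} \, \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} \right) \frac{ f(r) }{ r } \; \Omega^{l_A}_{j,j_z} \tag{1690} $$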
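Following Dirac, it is customary to define an integer $\kappa$ in terms of the eigenvalues
of $\hat{\mathbf{S}} \cdot \hat{\mathbf{L}}$ via
$$ ( \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} ) \; \Omega^{l_A}_{j,j_z} = - \frac{ \hbar^{2} }{ 2 } \, ( 1 + \kappa ) \; \Omega^{l_A}_{j,j_z} $$
$$ ( \hat{\mathbf{S}} \cdot \hat{\mathbf{L}} ) \; \Omega^{l_B}_{j,j_z} = - \frac{ \hbar^{2} }{ 2 } \, ( 1 - \kappa ) \; \Omega^{l_B}_{j,j_z} \tag{1691} $$
Therefore, if $\Omega^{l_A}_{j,j_z} = \Omega^{j + \frac{1}{2}}_{j,j_z}$, i.e. $j = l_A - \frac{1}{2}$, then $\kappa = ( j + \frac{1}{2} )$.
Otherwise, if $\Omega^{l_A}_{j,j_z} = \Omega^{j - \frac{1}{2}}_{j,j_z}$, i.e. $j = l_A + \frac{1}{2}$, then $\kappa = - ( j + \frac{1}{2} )$.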
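The bookkeeping relating $j$, $l_A$, $l_B$ and $\kappa$ can be made concrete with exact rational arithmetic. The following sketch (function names are illustrative, and eigenvalues are quoted in units of $\hbar^{2}$) evaluates the eigenvalue of $\hat{\mathbf{S}} \cdot \hat{\mathbf{L}}$ from $\frac{1}{2} [ j ( j + 1 ) - l ( l + 1 ) - \frac{3}{4} ]$ and then extracts $\kappa$ from the two defining relations, which should agree with each other.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def spin_orbit_eigenvalue(j, l):
    """Eigenvalue of S.L in units of hbar^2 for total j and orbital l."""
    return HALF * (j * (j + 1) - l * (l + 1) - Fraction(3, 4))

def kappa_from_upper(j, l_a):
    """kappa from S.L Omega^{l_A} = -(hbar^2/2)(1 + kappa) Omega^{l_A}."""
    return -2 * spin_orbit_eigenvalue(j, l_a) - 1

def kappa_from_lower(j, l_b):
    """kappa from S.L Omega^{l_B} = -(hbar^2/2)(1 - kappa) Omega^{l_B}."""
    return 1 + 2 * spin_orbit_eigenvalue(j, l_b)

for two_j in (1, 3, 5):
    j = Fraction(two_j, 2)
    for l_a, l_b in ((j - HALF, j + HALF), (j + HALF, j - HALF)):
        print(j, l_a, l_b, kappa_from_upper(j, l_a), kappa_from_lower(j, l_b))
```

For example, $j = \frac{3}{2}$ with $l_A = 1$ gives $\kappa = -2$, while $j = \frac{3}{2}$ with $l_A = 2$ gives $\kappa = +2$.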
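On substituting the above expressions in the Dirac energy eigenvalue equation for $\psi_{j,j_z}(\mathbf{r})$ one finds
$$ ( E - V(r) - m c^{2} ) \, \frac{ f(r) }{ r } \; \Omega^{l_A}_{j,j_z} = c \hbar \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( r \frac{\partial}{\partial r} + 1 - \kappa \right) \frac{ g(r) }{ r } \; \Omega^{l_B}_{j,j_z} $$
$$ ( E - V(r) + m c^{2} ) \, \frac{ g(r) }{ r } \; \Omega^{l_B}_{j,j_z} = - c \hbar \, \frac{ \mathbf{r} \cdot \boldsymbol{\sigma} }{ r^{2} } \left( r \frac{\partial}{\partial r} + 1 + \kappa \right) \frac{ f(r) }{ r } \; \Omega^{l_A}_{j,j_z} \tag{1692} $$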
Since the radial spin projection operator is independent of $r$, it can be commuted to the right of the differential operator in the large parenthesis. Then on
using either the relation given in eqn(1648) or in eqn(1649), one finds that the
relativistic spherical harmonics factor out of the equations, leading to
$$ ( E - V(r) - m c^{2} ) \, \frac{ f(r) }{ r } = - c \hbar \left( \frac{\partial}{\partial r} + \frac{ 1 - \kappa }{ r } \right) \frac{ g(r) }{ r } $$
$$ ( E - V(r) + m c^{2} ) \, \frac{ g(r) }{ r } = c \hbar \left( \frac{\partial}{\partial r} + \frac{ 1 + \kappa }{ r } \right) \frac{ f(r) }{ r } \tag{1693} $$
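$\kappa$                          $l_A$                  $l_B$                  Parity
$\kappa = - ( j + \frac{1}{2} )$   $j - \frac{1}{2}$      $j + \frac{1}{2}$      $(-1)^{j - \frac{1}{2}}$
$\kappa = ( j + \frac{1}{2} )$     $j + \frac{1}{2}$      $j - \frac{1}{2}$      $(-1)^{j + \frac{1}{2}}$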
Therefore, the Dirac radial equation consists of the two coupled first-order differential equations for $f(r)$ and $g(r)$. On multiplying by a factor of $r$ and
simplifying the derivatives of $f(r)/r$, one finds the pair of more symmetrical
equations
$$ ( E - V(r) - m c^{2} ) \, f(r) + c \hbar \left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) g(r) = 0 $$
$$ ( E - V(r) + m c^{2} ) \, g(r) - c \hbar \left( \frac{\partial}{\partial r} + \frac{\kappa}{r} \right) f(r) = 0 \tag{1694} $$
The above pair of equations are the central result of this lecture.
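The Probability Density in Spherical Polar Coordinates.
The probability density $P(\mathbf{r})$ that an electron, in an energy eigenstate of a
spherically symmetric potential, is found in the vicinity of the point $( r , \theta , \varphi )$ is
given by
$$ P(\mathbf{r}) = \psi^{\dagger}(\mathbf{r}) \, \psi(\mathbf{r}) = \frac{ | f(r) |^{2} }{ r^{2} } \; \Omega^{l_A \, \dagger}_{j,j_z}(\theta, \varphi) \, \Omega^{l_A}_{j,j_z}(\theta, \varphi) + \frac{ | g(r) |^{2} }{ r^{2} } \; \Omega^{l_B \, \dagger}_{j,j_z}(\theta, \varphi) \, \Omega^{l_B}_{j,j_z}(\theta, \varphi) \tag{1695} $$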
=
(1696)
the probability is independent of the azimuthal angle $\varphi$ and the sign of $j_z$ (just
like in the nonrelativistic case) and has a common angular factor of $A_{j,|j_z|}(\theta)$.
Thus, the probability distribution factorizes into a radial and an angular factor
$$ P(\mathbf{r}) = \left( \frac{ | f(r) |^{2} }{ r^{2} } + \frac{ | g(r) |^{2} }{ r^{2} } \right) A_{j,|j_z|}(\theta) \tag{1697} $$
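Table 10: The relativistic angular factors $A_{j,|j_z|}(\theta)$.

$j$              $|j_z|$          $A_{j,|j_z|}(\theta)$
$\frac{1}{2}$    $\frac{1}{2}$    $\frac{1}{4\pi}$
$\frac{3}{2}$    $\frac{1}{2}$    $\frac{1}{8\pi} \, ( 1 + 3 \cos^{2}\theta )$
$\frac{3}{2}$    $\frac{3}{2}$    $\frac{3}{8\pi} \, \sin^{2}\theta$
$\frac{5}{2}$    $\frac{1}{2}$    $\frac{3}{16\pi} \, ( 1 - 2 \cos^{2}\theta + 5 \cos^{4}\theta )$
$\frac{5}{2}$    $\frac{3}{2}$    $\frac{3}{32\pi} \, \sin^{2}\theta \, ( 1 + 15 \cos^{2}\theta )$
$\frac{5}{2}$    $\frac{5}{2}$    $\frac{15}{32\pi} \, \sin^{4}\theta$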
The angular distribution function for a closed shell is given by the sum over the
angular distribution functions. Due to the identity,
$$ \sum_{j_z = -j}^{j} A_{j,|j_z|}(\theta) = \frac{ 2 j + 1 }{ 4 \pi } \tag{1698} $$
one finds that closed shells are spherically symmetric, as is expected. The first
few angular dependent factors $A_{j,|j_z|}(\theta)$ are given in Table(10) and the corresponding nonrelativistic angular factors are given in Table(11). On comparing the relativistic angular dependent factors with the nonrelativistic factors
$| Y^{l}_{m}(\theta, \varphi) |^{2}$, one finds that they are identical for $| j_z | = j$. Since the relativistic
distribution is the sum of two generally different positive definite forms originally associated with the two spinors $\chi_{+}$ and $\chi_{-}$, it generally does not go to
zero for nonzero values of $\theta$.
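The closed-shell identity can be checked directly against the tabulated angular factors. The sketch below hard-codes the six relativistic factors for $j = \frac{1}{2}, \frac{3}{2}, \frac{5}{2}$ (indexed by the integers $2j$ and $2|j_z|$, a labelling chosen here only for convenience) and compares the sum over $j_z$ with $( 2 j + 1 ) / 4 \pi$ at several angles.

```python
import math

def angular_factor(two_j, two_jz_abs, theta):
    """Relativistic angular factor A_{j,|jz|}(theta), keyed by (2j, 2|jz|)."""
    c2 = math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    table = {
        (1, 1): 1.0 / (4.0 * math.pi),
        (3, 1): (1.0 + 3.0 * c2) / (8.0 * math.pi),
        (3, 3): 3.0 * s2 / (8.0 * math.pi),
        (5, 1): 3.0 * (1.0 - 2.0 * c2 + 5.0 * c2 ** 2) / (16.0 * math.pi),
        (5, 3): 3.0 * s2 * (1.0 + 15.0 * c2) / (32.0 * math.pi),
        (5, 5): 15.0 * s2 ** 2 / (32.0 * math.pi),
    }
    return table[(two_j, two_jz_abs)]

def closed_shell_sum(two_j, theta):
    """Sum of A_{j,|jz|} over jz = -j..j; each |jz| value appears twice."""
    return 2.0 * sum(angular_factor(two_j, m, theta)
                     for m in range(1, two_j + 1, 2))

for two_j in (1, 3, 5):
    for theta in (0.0, 0.4, 1.3, math.pi / 2):
        print(two_j, theta, closed_shell_sum(two_j, theta),
              (two_j + 1) / (4.0 * math.pi))
```

The angular dependence cancels in each sum, which is the statement that a closed shell is spherically symmetric.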
12.10.1
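$$ \left( E + \frac{ Z e^{2} }{ r } - m c^{2} \right) f(r) + c \hbar \left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) g(r) = 0 $$
$$ \left( E + \frac{ Z e^{2} }{ r } + m c^{2} \right) g(r) - c \hbar \left( \frac{\partial}{\partial r} + \frac{\kappa}{r} \right) f(r) = 0 \tag{1699} $$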
Figure 48: The relativistic (left) and nonrelativistic (right) angular distributions $A_{j,|j_z|}(\theta)$ for $j = \frac{1}{2}$ and $j = \frac{3}{2}$.
Figure 49: The relativistic (left) and nonrelativistic (right) angular distributions $A_{j,|j_z|}(\theta)$ for $j = \frac{5}{2}$.
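Table 11: The nonrelativistic angular factors $A_{l,|m|}(\theta) = | Y^{l}_{m}(\theta, \varphi) |^{2}$.

$l$    $|m|$    $A_{l,|m|}(\theta)$
0      0        $\frac{1}{4\pi}$
1      0        $\frac{3}{4\pi} \, \cos^{2}\theta$
1      1        $\frac{3}{8\pi} \, \sin^{2}\theta$
2      0        $\frac{5}{16\pi} \, ( 1 - 3 \cos^{2}\theta )^{2}$
2      1        $\frac{15}{8\pi} \, \sin^{2}\theta \, \cos^{2}\theta$
2      2        $\frac{15}{32\pi} \, \sin^{4}\theta$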
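The above equations will be written in dimensionless units, where the energy is
expressed in terms of the rest mass $m c^{2}$ and lengths are expressed in terms of
the Compton wave length $\frac{\hbar}{m c}$. A dimensionless energy $\epsilon$ is defined as the ratio
of $E$ to the rest mass energy
$$ \epsilon = \frac{ E }{ m c^{2} } \tag{1700} $$
For a bound state, $m c^{2} > E > - m c^{2}$ so the value of the magnitude of $\epsilon$
is expected to be a little less than unity. A dimensionless radial variable $\rho$ is
introduced which governs the asymptotic large $r$ decay of the bound state wave
function. The variable is defined by
$$ \rho = \frac{ r \, m c }{ \hbar } \, \sqrt{ 1 - \epsilon^{2} } \tag{1701} $$
In terms of these dimensionless variables, the Dirac radial equations for the
hydrogen-like atom become
$$ \left( - \sqrt{ \frac{ 1 - \epsilon }{ 1 + \epsilon } } + \frac{\gamma}{\rho} \right) f + \left( \frac{\partial}{\partial \rho} - \frac{\kappa}{\rho} \right) g = 0 $$
$$ \left( \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } + \frac{\gamma}{\rho} \right) g - \left( \frac{\partial}{\partial \rho} + \frac{\kappa}{\rho} \right) f = 0 \tag{1702} $$
where
$$ \gamma = \frac{ Z e^{2} }{ \hbar c } \tag{1703} $$
is a small number.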
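Boundary Conditions
The asymptotic $\rho \rightarrow \infty$ form of the solution can be found from the asymptotic form of the equations
$$ - \sqrt{ \frac{ 1 - \epsilon }{ 1 + \epsilon } } \; f + \frac{\partial g}{\partial \rho} \rightarrow 0 $$
$$ \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } \; g - \frac{\partial f}{\partial \rho} \rightarrow 0 \tag{1704} $$
Hence, on combining these equations, one sees that the asymptotic form of the
equation is given by
$$ \frac{ \partial^{2} f }{ \partial \rho^{2} } = f \tag{1705} $$
Therefore, one has
$$ f \sim A \exp[ - \rho ] + B \exp[ + \rho ] \tag{1706} $$
and, likewise, $g$ has a similar exponential form. If the solution is to be normalizable, then the coefficient $B$ in front of the increasing exponential must be
exactly zero ($B \equiv 0$).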
The asymptotic $\rho \rightarrow 0$ behavior of the solution can be found from
$$ \frac{\gamma}{\rho} \, f + \left( \frac{\partial}{\partial \rho} - \frac{\kappa}{\rho} \right) g = 0 $$
$$ \frac{\gamma}{\rho} \, g - \left( \frac{\partial}{\partial \rho} + \frac{\kappa}{\rho} \right) f = 0 \tag{1707} $$
where it has been noted that both the angular momentum term $\kappa$ and the
Coulomb potential $\gamma$ govern the small $\rho$ variation, while the mass and energy terms are negligible. This is in contrast to the case of the nonrelativistic
Schrödinger equation with the Coulomb potential, where for small $r$ the Coulomb
potential term is negligible in comparison with the centrifugal potential. We
shall make the ansatz for the asymptotic small $\rho$ variation
$$ f \sim A \, \rho^{s} , \qquad g \sim B \, \rho^{s} \tag{1708} $$
where the exponent $s$ is an unknown constant and then substitute the ansatz in
the above equations. This procedure yields the coupled algebraic equations
$$ \gamma \, A + ( s - \kappa ) \, B = 0 $$
$$ \gamma \, B - ( s + \kappa ) \, A = 0 \tag{1709} $$
(1711)
one must choose the positive solution for $s$. Normalizability near the origin
requires that $2 s > -1$. Hence, one may set
$$ s = \sqrt{ \kappa^{2} - \gamma^{2} } \tag{1712} $$
This will be a good solution for $\kappa = \pm 1$ if $Z$ does not exceed a critical value.
For values of $Z$ greater than 172, the point charge can spark the vacuum
and spontaneously generate electron-positron pairs109. The solution with the
negative value of $s$ given by
$$ s = - \sqrt{ \kappa^{2} - \gamma^{2} } \tag{1713} $$
could also possibly exist and be normalizable if $\gamma$ is greater than a critical value
$\gamma_c$ determined as
$$ - \frac{1}{2} = - \sqrt{ 1 - \gamma_c^{2} } \tag{1714} $$
This critical value of $\gamma$ is found from
$$ \gamma_c = \frac{ \sqrt{3} }{ 2 } \tag{1715} $$
which corresponds to $Z_c \approx 118$. The solutions corresponding to negative $s$
are, in fact, unphysical and do not survive if the nucleus is considered to have
a finite spatial extent.
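The numbers quoted for the indicial exponent can be reproduced with a few lines of arithmetic. In the sketch below the value of the fine structure constant is an assumed input (it is not given in the text), and the names `ALPHA_FS`, `indicial_exponent`, `GAMMA_C` and `Z_C` are illustrative only.

```python
import math

ALPHA_FS = 1.0 / 137.035999  # fine structure constant e^2/(hbar c); assumed value

def indicial_exponent(kappa, Z):
    """Positive root s = sqrt(kappa^2 - gamma^2) with gamma = Z * alpha."""
    gamma = Z * ALPHA_FS
    return math.sqrt(kappa ** 2 - gamma ** 2)

# Critical coupling at which the negative root s = -sqrt(1 - gamma^2) reaches -1/2
GAMMA_C = math.sqrt(3.0) / 2.0
Z_C = GAMMA_C / ALPHA_FS

print(indicial_exponent(-1, 1))    # slightly below 1 for hydrogen
print(indicial_exponent(-1, 92))   # noticeably below 1 for a heavy nucleus
print(Z_C)                         # close to 118.7
```

For hydrogen the exponent differs from unity only at order $\gamma^{2}$, which is why the singular behavior at the origin discussed later is so weak.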
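The Frobenius Method
We shall use the Frobenius method to find a solution. The solutions of the
radial equation shall be written in the form
$$ f(r) = \exp[ - \rho ] \, \rho^{s} \, F(\rho) , \qquad g(r) = \exp[ - \rho ] \, \rho^{s} \, G(\rho) \tag{1716} $$
109 H. Backe, L. Handschug, F. Hessberger, E. Kankeleit, L. Richter, F. Weik, R. Willwater,
H. Bokemeyer, P. Vincent, Y. Nakayama, and J. S. Greenberg, Phys. Rev. Lett. 40, 1443
(1978).
$$ \left( - \sqrt{ \frac{ 1 - \epsilon }{ 1 + \epsilon } } + \frac{\gamma}{\rho} \right) F + \left( \frac{\partial}{\partial \rho} - 1 + \frac{ s - \kappa }{ \rho } \right) G = 0 $$
$$ \left( \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } + \frac{\gamma}{\rho} \right) G - \left( \frac{\partial}{\partial \rho} - 1 + \frac{ s + \kappa }{ \rho } \right) F = 0 \tag{1717} $$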
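The functions $F(\rho)$ and $G(\rho)$ can be expressed as an infinite power series in $\rho$
$$ F(\rho) = \sum_{n=0}^{\infty} a_n \, \rho^{n} , \qquad G(\rho) = \sum_{n=0}^{\infty} b_n \, \rho^{n} \tag{1718} $$
where the coefficients $a_n$ and $b_n$ are constants which have still to be determined.
The coefficients are determined by substituting the series in the differential
equation and then equating the coefficients of the same power in $\rho$. Equating
the coefficient of $\rho^{n-1}$ yields the set of relations
$$ - \sqrt{ \frac{ 1 - \epsilon }{ 1 + \epsilon } } \; a_{n-1} + \gamma \, a_n + ( n + s - \kappa ) \, b_n - b_{n-1} = 0 $$
$$ \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } \; b_{n-1} + \gamma \, b_n - ( n + s + \kappa ) \, a_n + a_{n-1} = 0 \tag{1719} $$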
This equation is automatically satisfied for $n = 0$, since by definition $a_{-1} = b_{-1} \equiv 0$ so the equation reduces to the indicial equation for $s$. These relations
yield recursion relations between the coefficients $( a_n , b_n )$ with different values of
$n$. The form of the recursion relation can be made explicit by using a relation
between $a_n$ and $b_n$ valid for any $n$. This relation is found by multiplying the
first relation of eqn(1719) by the factor
$$ \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } \tag{1720} $$
and adding it to the second, whereupon one sees that the coefficients with index $n - 1$
vanish. This process results in the equation
$$ \left( \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } \; \gamma - ( n + s + \kappa ) \right) a_n + \left( \gamma + \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } \; ( n + s - \kappa ) \right) b_n = 0 \tag{1721} $$
valid for any $n$. The above equation can be used to eliminate the coefficients
$b_n$ and yield a recursion relation between $a_n$ and $a_{n-1}$. The ensuing recursion
relation will enable us to explicitly calculate the wave functions $G(\rho)$ and hence
$F(\rho)$.
Truncation of the Series
The behavior of the recursion relation for large values of $n$ can be found by
noting that eqn(1721) yields
$$ n \, a_n \sim \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } \; n \, b_n \tag{1722} $$
which when substituted back into the large $n$ limit of the first relation of
eqn(1719) yields
$$ n \, a_n \sim 2 \, a_{n-1} \tag{1723} $$
Since the large $\rho$ limit of the function is dominated by the highest powers of
$\rho$, it is seen that if the series does not terminate, the functions $F(\rho)$ and $G(\rho)$
would be exponentially growing functions of $\rho$
$$ F(\rho) \sim \exp[ + 2 \rho ] , \qquad G(\rho) \sim \exp[ + 2 \rho ] \tag{1724} $$
Therefore, the set of recursion relations must terminate, since if the series does
not terminate, the large $\rho$ behavior of the functions $F(\rho)$ and $G(\rho)$ would be governed by the growing exponentials. Even when combined with the decaying
exponential terms that appear in the relations
$$ f(r) = F(\rho) \, \rho^{s} \exp[ - \rho ] , \qquad g(r) = G(\rho) \, \rho^{s} \exp[ - \rho ] \tag{1725} $$
the resulting functions $f(r)$ and $g(r)$ would not satisfy the required boundary
conditions at $\rho \rightarrow \infty$. We shall assume that the series truncates after the $n_r$-th
terms. That is, it is possible to set
$$ a_{n_r + 1} = 0 , \qquad b_{n_r + 1} = 0 \tag{1726} $$
Thus, the components of the radial wave function may have $n_r$ nodes. Assuming
that the coefficients with indices $n_r + 1$ vanish and using the first relation in
eqn(1719) with $n = n_r + 1$, one obtains the condition
$$ \sqrt{ \frac{ 1 + \epsilon }{ 1 - \epsilon } } \; b_{n_r} = - a_{n_r} \tag{1727} $$
(1730)
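$$ E = \pm \frac{ m c^{2} }{ \sqrt{ 1 + \frac{ \gamma^{2} }{ ( n_r + s )^{2} } } } \tag{1731} $$
$$ E = \pm \frac{ m c^{2} }{ \sqrt{ 1 + \frac{ \gamma^{2} }{ \left( n_r + \sqrt{ ( j + \frac{1}{2} )^{2} - \gamma^{2} } \right)^{2} } } } \tag{1732} $$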
This expression for the energy eigenvalue is independent of the sign of and,
therefore, it holds for both cases
s =
(j +
= (l + 1)
= l +
1
2
1
2
(1733)
Hence, the energy eigenstates are predicted to be doubly degenerate (in addition
to the $( 2 j + 1 )$ degeneracy associated with $j_3$), since states with the same $j$ but
different values of $l$ have the same energy. If the positive-energy eigenvalue
is expanded in powers of $\gamma$, one obtains
$$ E \approx m c^{2} - \frac{1}{2} \, m c^{2} \, \frac{ \gamma^{2} }{ ( n_r + j + \frac{1}{2} )^{2} } + \ldots \tag{1734} $$
which agrees with the energy eigenvalues found from the nonrelativistic Schrödinger
equation. However, as has been seen, the exact energy eigenvalue depends on
$n_r$ and $( j + \frac{1}{2} )$ separately, as opposed to being a function of the principal quantum number $n$ which is defined as the sum $n = n_r + j + \frac{1}{2}$. Hence, the Dirac
equation lifts the degeneracy between states with different values of the angular
momentum. The energy levels together with their quantum numbers are shown
in Table(12). The energy splitting between states with the same $n$ and different
$j$ values has a magnitude which is governed by the square of the fine structure
constant $Z ( \frac{ e^{2} }{ \hbar c } )$. That is
$$ E \approx m c^{2} \left[ 1 - \frac{ \gamma^{2} }{ 2 ( n_r + j + \frac{1}{2} )^{2} } - \frac{ \gamma^{4} }{ 2 ( n_r + j + \frac{1}{2} )^{3} } \left( \frac{ 1 }{ ( j + \frac{1}{2} ) } - \frac{ 3 }{ 4 ( n_r + j + \frac{1}{2} ) } \right) + \ldots \right] \tag{1735} $$
110 C. G. Darwin, Proc. Roy. Soc. A 118, 654 (1928).
C. G. Darwin, Proc. Roy. Soc. A 120, 621 (1928).
W. Gordon, Zeit. für Physik, 48, 11 (1928).
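The quality of the expansion in powers of $\gamma$ can be gauged by evaluating both expressions. The sketch below works in units of $m c^{2}$, assumes a numerical value for the fine structure constant, and uses the closed form $E / m c^{2} = [ 1 + \gamma^{2} / ( n_r + s )^{2} ]^{-1/2}$ with $s = \sqrt{ ( j + \frac{1}{2} )^{2} - \gamma^{2} }$, which follows from the truncation condition together with eqn(1721); the function names are illustrative.

```python
import math

ALPHA_FS = 1.0 / 137.035999  # fine structure constant; assumed value

def dirac_energy(n_r, j, Z=1):
    """Exact positive-energy eigenvalue E/(m c^2) for radial number n_r and total j."""
    gamma = Z * ALPHA_FS
    s = math.sqrt((j + 0.5) ** 2 - gamma ** 2)
    return 1.0 / math.sqrt(1.0 + gamma ** 2 / (n_r + s) ** 2)

def expanded_energy(n_r, j, Z=1):
    """Expansion of E/(m c^2) up to and including the gamma^4 fine structure term."""
    gamma = Z * ALPHA_FS
    n = n_r + j + 0.5
    return (1.0 - gamma ** 2 / (2.0 * n ** 2)
            - gamma ** 4 / (2.0 * n ** 3) * (1.0 / (j + 0.5) - 3.0 / (4.0 * n)))

# n = 2 levels of hydrogen: (n_r, j) = (1, 1/2) for 2S_1/2 and 2P_1/2, (0, 3/2) for 2P_3/2
for n_r, j in ((0, 0.5), (1, 0.5), (0, 1.5)):
    print(n_r, j, dirac_energy(n_r, j), expanded_energy(n_r, j))

# Fine structure splitting of the n = 2 level in units of m c^2 (close to gamma^4/32 for Z = 1)
print(dirac_energy(0, 1.5) - dirac_energy(1, 0.5))
```

The two $j = \frac{1}{2}$ states of the $n = 2$ level share a single value of $( n_r , j )$ and so are degenerate in this formula, while the $j = \frac{3}{2}$ state lies higher by an amount of order $\gamma^{4}$.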
The fine structure splittings for H-like atoms were first observed by Michelson111
and the theoretical prediction is in agreement with the accurate measurements
of Paschen112. The fine structure splitting is important for atoms with larger $Z$.
This observation has a classical interpretation which reflects the fact that for
large $Z$ the electrons move in orbits with smaller radii and, therefore, the electrons must move faster. Relativistic effects become more important for electrons
which move faster, and this occurs for atoms with larger values of $Z$. Although
the fine structure splitting does remove some degeneracy, the two states with
the same principal quantum number $n$ and the same angular momentum $j$ but
which have different values of $l$ are still predicted to be degenerate. Thus, for
example, the $2S_{j=\frac{1}{2}}$ and the $2P_{j=\frac{1}{2}}$ states of Hydrogen are predicted to be degenerate by the Dirac equation. It has been shown that this degeneracy is
removed by the Lamb shift, which is due to the interaction of an electron with
its own radiation field. The Lamb shift is smaller than the fine structure shifts
discussed above because it involves an extra factor of $\frac{ e^{2} }{ \hbar c }$.
The Ground State Wave Function
The ground state wave function of the hydrogen atom is slightly singular
at the origin. This can be seen by noting that it corresponds to $n_r = 0$ and
$\kappa = -1$. Since the dimensionless energy is given by the expression
$$ \epsilon = \sqrt{ 1 - \gamma^{2} } \tag{1736} $$
one finds that the dimensionless radial distance is simply given by
$$ \rho = \frac{ r \, m c }{ \hbar } \, \sqrt{ 1 - \epsilon^{2} } = \frac{ m c }{ \hbar } \, \gamma \, r = \frac{ Z e^{2} m }{ \hbar^{2} } \, r \tag{1737} $$
111 A.
112 F.
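Table 12: The energy levels of the hydrogen-like atom together with their quantum numbers.

$n_r$   $\kappa = \mp ( j + \frac{1}{2} )$   $nL_j$               Degenerate Partner    $\frac{E}{m c^{2}}$
0       $-1$                                 $1S_{\frac{1}{2}}$   (none)                $\sqrt{ 1 - \gamma^{2} }$
1       $-1$                                 $2S_{\frac{1}{2}}$   $2P_{\frac{1}{2}}$    $\sqrt{ \frac{ 1 + \sqrt{ 1 - \gamma^{2} } }{ 2 } }$
1       $+1$                                 $2P_{\frac{1}{2}}$   $2S_{\frac{1}{2}}$    $\sqrt{ \frac{ 1 + \sqrt{ 1 - \gamma^{2} } }{ 2 } }$
0       $-2$                                 $2P_{\frac{3}{2}}$   (none)                $\sqrt{ 1 - \frac{1}{4} \gamma^{2} }$
2       $-1$                                 $3S_{\frac{1}{2}}$   $3P_{\frac{1}{2}}$    $\frac{ 2 + \sqrt{ 1 - \gamma^{2} } }{ \sqrt{ 5 + 4 \sqrt{ 1 - \gamma^{2} } } }$
2       $+1$                                 $3P_{\frac{1}{2}}$   $3S_{\frac{1}{2}}$    $\frac{ 2 + \sqrt{ 1 - \gamma^{2} } }{ \sqrt{ 5 + 4 \sqrt{ 1 - \gamma^{2} } } }$
1       $-2$                                 $3P_{\frac{3}{2}}$   $3D_{\frac{3}{2}}$    $\frac{ 1 + \sqrt{ 4 - \gamma^{2} } }{ \sqrt{ 5 + 2 \sqrt{ 4 - \gamma^{2} } } }$
1       $+2$                                 $3D_{\frac{3}{2}}$   $3P_{\frac{3}{2}}$    $\frac{ 1 + \sqrt{ 4 - \gamma^{2} } }{ \sqrt{ 5 + 2 \sqrt{ 4 - \gamma^{2} } } }$
0       $-3$                                 $3D_{\frac{5}{2}}$   (none)                $\sqrt{ 1 - \frac{1}{9} \gamma^{2} }$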
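where the characteristic length scale is just the nonrelativistic Bohr radius
divided by $Z$. The wave functions are written as
$$ f(\rho) = \exp[ - \rho ] \, \rho^{s} \, F(\rho) , \qquad g(\rho) = \exp[ - \rho ] \, \rho^{s} \, G(\rho) \tag{1738} $$
Since $n_r = 0$, the recursion relations terminate immediately leading to the
functions $F(\rho)$ and $G(\rho)$ being given, respectively, by constants $a_0$ and $b_0$. Since
$\kappa = -1$ for the ground state, the recursion relations are simply
$$ \gamma \, a_0 + ( s + 1 ) \, b_0 = 0 $$
$$ \gamma \, b_0 - ( s - 1 ) \, a_0 = 0 \tag{1739} $$
These equations have a non-trivial solution if
$$ s = \sqrt{ 1 - \gamma^{2} } \tag{1740} $$
and
$$ \frac{ b_0 }{ a_0 } = - \frac{ \gamma }{ s + 1 } = \frac{ \sqrt{ 1 - \gamma^{2} } - 1 }{ \gamma } \tag{1741} $$
This shows that the lower component is smaller than the upper component by
approximately $\gamma$, which has the magnitude of $\frac{v}{c}$ where $v$ is the velocity in Bohr's
theory. The ratio of $b_0$ to $a_0$ determines the radial functions as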
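$$ \frac{ f(r) }{ r } = a_0 \, \rho^{s - 1} \exp[ - \rho ] $$
$$ \frac{ g(r) }{ r } = a_0 \left( \frac{ \sqrt{ 1 - \gamma^{2} } - 1 }{ \gamma } \right) \rho^{s - 1} \exp[ - \rho ] \tag{1742} $$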
Since Y00 (, ) =
ponents are just
1 ,
4
(1743)
r. 1
r
4
1
cos
sin exp[i]
(1744)
=
sin exp[+i]
cos
4
=
Thus, apart from an over all normalization factor, the fourcomponent spinor
Dirac wave function is given by
2
N
1 1 exp
=
(1745)
r .
i
4
r
Figure 50: The large $f(r)$ and small component $g(r)$ radial wave functions for
the $1S_{\frac{1}{2}}$ ground state of Hydrogen. [Plot of $f(r)$ and $g(r) \times 100$ against $m c r / \hbar$.]
Hence, it is seen that as $\rho$ approaches the origin, at first the wave function is
slowly varying since
$$ \rho^{ \sqrt{ 1 - \gamma^{2} } - 1 } \approx \exp\left[ - \frac{ \gamma^{2} }{ 2 } \ln \rho \right] \approx 1 - \frac{ \gamma^{2} }{ 2 } \ln \rho \tag{1746} $$
but for distances smaller than the characteristic length scale
$$ r_c = \frac{ \hbar }{ m c } \exp\left[ - \frac{ 2 }{ \gamma^{2} } \right] \tag{1747} $$
the wave function exhibits a slight singularity. This length scale is much smaller
than the nuclear radius so, due to the spatial distribution of the nuclear charge,
the singularity is largely irrelevant. This singularity is not present in the nonrelativistic limit, since in this limit one assumes that the inequality $| V(r) | \ll m c^{2}$ always holds, although this assumption is invalid for $r \rightarrow 0$. Therefore, one
concludes that the relativistic theory differs from the nonrelativistic theory at
small distances, which could have been discerned from the use of the Heisenberg
uncertainty principle.
Figure 51: The radial wave functions for the $2S_{\frac{1}{2}}$ and $2P_{\frac{1}{2}}$ states of Hydrogen. [Two panels, $2P_{1/2}$ and $2S_{1/2}$, each showing $f(r)$ and $g(r) \times 100$ against $m c r / \hbar$.]
12.10.2
The first few radial functions for the hydrogen atom can be expressed in the
form
s
s
E
r
r
r
f (r) = N
1 +
exp
2
c
0
1
m c2
a
a
a
s
s
E
r
r
r
g(r) = N
1
exp
d0 2 d1
2
mc
a
a
a
(1748)
where the above form is restricted to the case where the radial quantum number
$n_r$ takes on the values of 0 or 1. The index $s$ is the same as that which occurs
in the Frobenius method and is given by the positive solution
$$ s = \sqrt{ \kappa^{2} - \gamma^{2} } \tag{1749} $$
It is seen that the radial wave functions depend on the dimensionless variable $\rho$
defined by
$$ \rho = \frac{ r }{ a } \tag{1750} $$
where the length scale $a$ is given in terms of the energy $E$ and the Compton
wavelength by
$$ a = \left[ 1 - \left( \frac{ E }{ m c^{2} } \right)^{2} \right]^{ - \frac{1}{2} } \frac{ \hbar }{ m c } \tag{1751} $$
The values of the indices s, energy E, length scale a and normalization N
are given in Table(13). Since the twocomponent spinor spherical harmonics
lj,jz (, ) are normalized to unity, the normalization condition is determined
from the integral
Z
2
2
2
N
dr  f  +  g 
= 1
(1752)
0
involving the radial wave functions. The integral is evaluated with the aid of
the identity
$$ \int_{0}^{\infty} d\rho \; \rho^{a + b} \exp[ - 2 \rho ] = 2^{ - ( a + b + 1 ) } \; \Gamma( a + b + 1 ) \tag{1753} $$
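The integral identity used for the normalization can be confirmed numerically, including for the non-integer powers that arise because $s$ is not an integer. The sketch below compares a composite Simpson's rule estimate on a truncated interval with the closed form; the cutoff `upper` and the number of steps are arbitrary choices made for this illustration, and `p` stands for the combined exponent $a + b$.

```python
import math

def integral_numeric(p, upper=60.0, steps=200000):
    """Composite Simpson estimate of the integral of rho^p exp(-2 rho) on [0, upper]."""
    h = upper / steps

    def integrand(x):
        return x ** p * math.exp(-2.0 * x)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def integral_closed_form(p):
    """Right-hand side of the identity: Gamma(p + 1) / 2^(p + 1)."""
    return math.gamma(p + 1.0) / 2.0 ** (p + 1.0)

for p in (0.0, 2.0, 1.7, 2.0 * math.sqrt(1.0 - (1.0 / 137.036) ** 2)):
    print(p, integral_numeric(p), integral_closed_form(p))
```

The last exponent is of the form $2 s$ for the hydrogen ground state, which is the power that appears when $| f |^{2} + | g |^{2}$ is integrated.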
The coefficients cn and dn in the above expansion of the radial functions differ
Table 13: Parameters specifying the Radial Functions for the Hydrogen atom.
State
= 1
1 2
E
m c2
a
m c
h
1 2
N
1
a
s+ 1
2
2(2s+1)
1S 12
= 1
q
1
1 +
1 2
2
E
m c2
1
2
2 E
1)
m c2
( 2 E2 )
m c
1
a
2 E
+1)
m c2
( 2 E2 )
m c
1
a
s+ 1
2
(2s+1)
2S 32
=1
q
1 2
1 +
1 2
2
E
m c2
1
2
(2s+1)
2P 12
= 2
4 2
2
4
1
a
s+ 1
2
2(2s+1)
2P 32
from the coefficients $a_n$ and $b_n$ that occur in the Frobenius expansion, since the
value of the ratio $c_{n_r} / d_{n_r}$ has been chosen to simplify in the limit of large $n$. In
particular, at the value of $n_r$ (at which the series terminates), the ratio is chosen
to satisfy
$$ \frac{ c_{n_r} }{ d_{n_r} } = 1 \tag{1754} $$
instead of the condition
$$ \frac{ a_{n_r} }{ b_{n_r} } = - \sqrt{ \frac{ 1 + ( \frac{ E }{ m c^{2} } ) }{ 1 - ( \frac{ E }{ m c^{2} } ) } } \tag{1755} $$
The relative negative sign and the square root factors in the coefficients have
been absorbed into the expressions for the upper and lower components f (r) and
g(r). The square root factors are responsible for converting the upper and lower
components, respectively, into the large and small components for positive E,
and vice versa for negative E. The expansion coefficients are given in Table(14).
Since the ratio of the magnitudes of the polynomial factors is generally of the
Table 14: Coefficients for the Polynomial in the Hydrogen atom Radial Wavefunctions.
State
c0
c1
d0
d1
= 1
1S 12
= 1
E
m c2
2 E
m c2
)+1
2 s + 1
2 E
m c2
2
E
m c2
2 E
m c2
)+1
2 s + 1
2 E
m c2
+ 1
2S 32
=1
E
m c2
1
)1
2 s + 1
2
E
m c2
)1
2 s + 1
2P 12
= 2
2P 32
order of unity, the ratio of the magnitudes of the small to large components is
found to be of the order of $\gamma$.
12.10.3
The Dirac equation for Hydrogen will be examined in the nonrelativistic limit,
and the lowestorder relativistic corrections will be retained. The resulting equation will be recast in the form of a Schrodinger equation, in which the Hamiltonian contains additional interaction terms. The resulting interactions, when
treated by firstorder perturbation theory, yield the fine structure. The physical
interpretation of the interactions will be examined. Historically, the following
type of analysis and the ensuing discussion of the Thomas precession played a
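$$ \left( i\hbar\frac{\partial}{\partial t} - V - mc^2 \right)\psi_A = -\,i\hbar c\,(\boldsymbol{\sigma}\cdot\nabla)\,\psi_B $$
$$ \left( i\hbar\frac{\partial}{\partial t} - V + mc^2 \right)\psi_B = -\,i\hbar c\,(\boldsymbol{\sigma}\cdot\nabla)\,\psi_A \tag{1756} $$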
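where $\psi_A$ and $\psi_B$ are, respectively, the upper and lower two-component spinors of the four-component Dirac spinor $\psi$. The energy eigenvalues of these equations are sought, so to this end the explicit time-dependence of the energy eigenstates will be separated out via
$$ \begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix} = \begin{pmatrix} \phi_A \\ \phi_B \end{pmatrix} \exp\left[ -\frac{i}{\hbar} E t \right] \tag{1757} $$
Also the non-relativistic energy $\epsilon$ will be defined as the energy referenced with respect to the rest-mass energy
$$ E = mc^2 + \epsilon $$
The coupled equations reduce to
$$ (\epsilon - V)\,\phi_A = -\,i\hbar c\,(\boldsymbol{\sigma}\cdot\nabla)\,\phi_B \tag{1758} $$
$$ (\epsilon - V + 2mc^2)\,\phi_B = -\,i\hbar c\,(\boldsymbol{\sigma}\cdot\nabla)\,\phi_A \tag{1759} $$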
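The pair of equations will be expanded in powers of $(p/mc)^2$ and only the first-order relativistic corrections will be retained. One can express $\phi_B$ as
$$ \phi_B = \frac{-\,i\hbar c\,(\boldsymbol{\sigma}\cdot\nabla)\,\phi_A}{\epsilon - V + 2mc^2} = \frac{1}{2mc}\left( 1 + \frac{\epsilon - V}{2mc^2} \right)^{-1} (\boldsymbol{\sigma}\cdot\hat{\mathbf p})\,\phi_A \approx \frac{1}{2mc}\left( 1 - \frac{\epsilon - V}{2mc^2} + \ldots \right) (\boldsymbol{\sigma}\cdot\hat{\mathbf p})\,\phi_A \tag{1760} $$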
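$$ \int d^3r\, \psi_S^\dagger \psi_S \approx \int d^3r \left[ \phi_A^\dagger \phi_A + \frac{1}{4m^2c^2} \big( (\boldsymbol{\sigma}\cdot\hat{\mathbf p})\phi_A \big)^\dagger \big( (\boldsymbol{\sigma}\cdot\hat{\mathbf p})\phi_A \big) \right] $$
$$ = \int d^3r \left[ \phi_A^\dagger \phi_A + \frac{1}{4m^2c^2}\, \phi_A^\dagger (\boldsymbol{\sigma}\cdot\hat{\mathbf p})(\boldsymbol{\sigma}\cdot\hat{\mathbf p}) \phi_A \right] $$
$$ = \int d^3r\, \phi_A^\dagger \left( \hat I + \frac{\hat p^2}{4m^2c^2} \right) \phi_A \tag{1763} $$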
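Therefore, the two-component Schrödinger wave function can be identified as
$$ \psi_S = \left( \hat I + \frac{\hat p^2}{8m^2c^2} + \ldots \right) \phi_A \tag{1764} $$
or, on inverting the expansion,
$$ \phi_A \approx \left( \hat I - \frac{\hat p^2}{8m^2c^2} + \ldots \right) \psi_S \tag{1765} $$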
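$$ \phi_B \approx \frac{1}{2mc}\left( 1 - \frac{\epsilon - V}{2mc^2} \right)(\boldsymbol{\sigma}\cdot\hat{\mathbf p})\left( \hat I - \frac{\hat p^2}{8m^2c^2} \right)\psi_S $$
$$ \approx \frac{1}{2mc}\left[ (\boldsymbol{\sigma}\cdot\hat{\mathbf p})\left( \hat I - \frac{\hat p^2}{8m^2c^2} \right) - \frac{\epsilon - V}{2mc^2}\,(\boldsymbol{\sigma}\cdot\hat{\mathbf p}) \right]\psi_S \tag{1766} $$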
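On substituting $\phi_B$ and $\psi_S$ into the equation for $\phi_A$, one finds the (two-component) energy eigenvalue equation
$$ (\epsilon - V)\left( \hat I - \frac{\hat p^2}{8m^2c^2} \right)\psi_S = \frac{(\boldsymbol{\sigma}\cdot\hat{\mathbf p})}{2m}\left[ (\boldsymbol{\sigma}\cdot\hat{\mathbf p})\left( \hat I - \frac{\hat p^2}{8m^2c^2} \right) - \frac{\epsilon - V}{2mc^2}\,(\boldsymbol{\sigma}\cdot\hat{\mathbf p}) \right]\psi_S \tag{1767} $$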
or
p2
p2
V
+
V
S
8 m2 c2
8 m2 c2
2
p
p2
V
=
I
.
p
)
(
.
p
)
S
2m
8 m2 c2
4 m2 c2
(1768)
+ ( . p )
( . p ) S (1769)
16 m3 c2
4 m2 c2
The term proportional to the product of the energy eigenvalue and the kinetic
energy can be rewritten as
p2
S
8 m2 c2
p2
S
8 m2 c2
p2
p2
(
V
+
) S
8 m2 c2
2m
(1770)
+
(
.
p
)
(
.
p
)
S
S
8 m2 c2
4 m2 c2
(1772)
where the relativistic corrections are symmetric in $\hat p^2$ and $V$. This represents the energy eigenvalue equation for a two-component wave function $\psi_S$, similar to the Schrödinger wave function, but the above equation does include relativistic corrections to the Hamiltonian. The first correction term is
$$ \hat H_{Kin} = -\,\frac{\hat p^4}{8m^3c^2} \tag{1773} $$
which is recognized as the relativistic kinematic energy correction that originates from the expansion of the kinetic energy
$$ \sqrt{m^2c^4 + p^2c^2} - mc^2 = \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \ldots \tag{1774} $$
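The expansion of the kinetic energy can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names and the choice of units ($m = c = 1$) are illustrative assumptions.

```python
import math

def kinetic_exact(p, m=1.0, c=1.0):
    """Relativistic kinetic energy sqrt(m^2 c^4 + p^2 c^2) - m c^2."""
    return math.sqrt(m**2 * c**4 + p**2 * c**2) - m * c**2

def kinetic_expanded(p, m=1.0, c=1.0):
    """First two terms of the expansion: p^2/(2m) - p^4/(8 m^3 c^2)."""
    return p**2 / (2.0 * m) - p**4 / (8.0 * m**3 * c**2)

p = 0.05  # momentum in units of m c (illustrative value, p << m c)
exact = kinetic_exact(p)
approx = kinetic_expanded(p)
newtonian = p**2 / 2.0

# The p^4 term removes most of the error of the Newtonian expression.
print("error of p^2/2m alone      :", exact - newtonian)
print("error with the p^4 correction:", exact - approx)
```

For small $p/mc$ the residual error after including the $p^4$ term is of order $p^6$, which is why only this term is retained at first order.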
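The remaining two correction terms
$$ \frac{(\boldsymbol{\sigma}\cdot\hat{\mathbf p})\, V\, (\boldsymbol{\sigma}\cdot\hat{\mathbf p})}{4m^2c^2} - \frac{\hat p^2 V + V \hat p^2}{8m^2c^2} \tag{1775} $$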
will be interpreted as the sum of the spin-orbit interaction and the Darwin term. It should be noted that the sum of these two terms would identically cancel in a purely classical theory. This cancellation can be shown to occur since, in the classical limit, $V$ and $p$ commute, and then the Pauli identity can be used to show that the resulting pairs of terms cancel.
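The factor
$$ 2\,(\boldsymbol{\sigma}\cdot\hat{\mathbf p})\, V\, (\boldsymbol{\sigma}\cdot\hat{\mathbf p}) - \left( \hat p^2 V + V \hat p^2 \right) \tag{1776} $$
can be evaluated as
$$ 2\,\hat{\mathbf p}\cdot V \hat{\mathbf p} - \left( \hat p^2 V + V \hat p^2 \right) + 2i\,\boldsymbol{\sigma}\cdot\left( \hat{\mathbf p}\times V \hat{\mathbf p} \right) \tag{1777} $$
The first two terms can be combined to form a double commutator, yielding
$$ -\,[\,\hat{\mathbf p}\,,[\,\hat{\mathbf p}\,, V\,]\,] + 2i\,\boldsymbol{\sigma}\cdot\left( \hat{\mathbf p}\times V \hat{\mathbf p} \right) \tag{1778} $$
or
$$ +\,\hbar^2\nabla^2 V + 2i\,\boldsymbol{\sigma}\cdot\left( \hat{\mathbf p}\times V \hat{\mathbf p} \right) \tag{1779} $$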
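which reduces to
$$ +\,\hbar^2\nabla^2 V + 2\hbar\,\boldsymbol{\sigma}\cdot\left( \nabla V \times \hat{\mathbf p} \right) \tag{1780} $$
since
$$ \hat{\mathbf p}\times\hat{\mathbf p} \equiv 0 \tag{1781} $$
Hence, the remaining two correction terms can be written as
$$ \hat H_{SO} + \hat H_{Darwin} = \frac{\hbar}{4m^2c^2}\,\boldsymbol{\sigma}\cdot\left( \nabla V \times \hat{\mathbf p} \right) + \frac{\hbar^2}{8m^2c^2}\,\nabla^2 V \tag{1782} $$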
The first term is the spin-orbit interaction term, and the second term is the Darwin term. For central potentials, the Darwin term is only important for electrons with $l = 0$. The evaluation and the physical interpretation of the energy shifts due to the three fine-structure interactions will be discussed separately.
12.10.4
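The kinematic correction
$$ \hat H_{Kin} = -\,\frac{\hat p^4}{8m^3c^2} \tag{1783} $$
originates from the expansion of the relativistic expression for the kinetic energy of a classical particle
$$ \sqrt{m^2c^4 + p^2c^2} - mc^2 = \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \ldots \tag{1784} $$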
+
S
(1785)
2m
2 n2
h c
r
which leads to
p4
S (r)
8 m3 c2
2
2
Z
1
m c 2 Z e2
Z e2
3
=
d
r
(r)
+
S (r)
S
2 m c2
2 n2
h c
r
4
Z
Z e2
1
3
Z 2 e4
2
= mc
8 n4
2 m c2
Z
EKin
d3 r S (r)
Hence, the first-order energy shift due to the kinematic correction is evaluated as
$$ \Delta E_{Kin} = mc^2 \left( \frac{Ze^2}{\hbar c} \right)^4 \left[ \frac{3}{8n^4} - \frac{1}{n^3(2l+1)} \right] \tag{1787} $$
This term is found to lift the degeneracy between states with a fixed principal quantum number $n$ and different values of the angular momentum $l$. The relativistic kinematic correction to the energy is found to be smaller than the non-relativistic energy by a factor of
$$ \left( \frac{Ze^2}{\hbar c} \right)^2 \sim Z^2\, 10^{-4} \tag{1788} $$
12.10.5
Spin-Orbit Coupling
1
1 v E
q
E v
c
c
v2
1 c
(1789)
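$$ \mathbf B' = -\,\frac{1}{mcr}\frac{\partial\phi}{\partial r}\;\mathbf r\times\mathbf p = -\,\frac{1}{mcr}\frac{\partial\phi}{\partial r}\;\mathbf L \tag{1791} $$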
which is caused by the apparent rotation of the charged nucleus. In the electron's rest frame, the electron's spin $\mathbf S$ should interact with the magnetic field through the Zeeman interaction
$$ \hat H^{rest}_{Int} = -\,\frac{q}{2mc}\, g_S\, \mathbf B'\cdot\mathbf S \tag{1792} $$
where $g_S$ is the gyromagnetic ratio for the electron's spin. Dirac's theory predicts that the spin is a relativistic phenomenon and also that $g_S = 2$ for an electron in its rest frame. This interaction with the magnetic field will cause the spin of the electron to precess. The spin precession rate found in the electron's rest frame is calculated as
$$ \boldsymbol{\omega}_{rest} = \frac{e}{2mc}\, g_S\, \mathbf B' \tag{1793} $$
However, the electron is bound to the nucleus and is orbiting with angular momentum $\mathbf L$. Therefore, one has to consider the corrections to the precession rate (and the interaction) caused by the acceleration of the electron's rest frame.
Thomas Precession
Electrons exhibit two different gyromagnetic ratios. The gyromagnetic ratio $g_S = 2$ couples a spin to an external magnetic field, whereas a gyromagnetic ratio of unity applies in the lab frame. This gyromagnetic ratio of unity (in the lab frame) enters the coupling between the spin of an electron in a circular orbit and the magnetic field $\mathbf B'$ experienced in the electron's rest frame$^{113}$. We shall find the gyromagnetic ratio in the lab frame by calculating the rate of precession that is observed in the lab frame and then inferring the (lab frame) interaction which produces the same rate of precession.
In the electron's rest frame, the gyromagnetic ratio due to the orbital magnetic field $\mathbf B'$ (caused by the charged nucleus) is given by $g_S = 2$. This gyromagnetic ratio yields a spin precession rate in the electron's rest frame of
$$ \boldsymbol{\omega}_{rest} = \frac{e}{2mc}\, g_S\, \mathbf B' \tag{1794} $$
The spin precession rate observed in the lab frame will be calculated later. The rate of precession as observed in the electron's rest frame has to be corrected by taking into account the motion of the electron. The correction is due to the non-additivity of velocities in successive Lorentz transformations. First, the transformation properties of Dirac spinors under infinitesimal rotations and boosts will be re-examined. Secondly, infinitesimal transformations will be successively applied to describe the particle's instantaneous rest frame and the Thomas precession.
A Lorentz transform of a spinor field $\psi$ is achieved by the rotation operator $\hat R$ via
$$ \psi'(\mathbf r) = \hat R\, \psi(R^{-1}\mathbf r) \tag{1795} $$
where $\hat R$ shuffles the components of the spinor. For a passive rotation (of the coordinate system) through an infinitesimal angle in the $i - j$ plane, the infinitesimal Lorentz transform has the non-zero elements
i,j = j,i =
(1796)
R()
= exp + i
(1797)
2
113 L. H. Thomas, Nature 117, 514 (1926).
L. H. Thomas, Phil. Mag. 3, 1 (1927).
where
i
[ (i) , (j) ]
2
(k)
X
i,j,k
=
i,j
(k)
(1798)
+ ...
(1799)
R()
= I + i
0
(k)
2
k
which can be expressed in terms of the projection of the (block diagonal) spin
= h
operator S
as
2 on the axis of rotation e
R()
i
= I +
( e .
) + ...
2
i
) + ...
( e . S
= I +
h
(1800)
R() = exp + i
(1802)
2
where the angle is governed by the boost velocity v through
tanh =
v
c
(1803)
However,
i
[ (0) , (k) ] = i (k)
2
(k)
R() = exp
0,k =
so
(1804)
(1805)
Therefore, for a Lorentz boost with an infinitesimal velocity $v$ along the $k$-th direction, one finds
$$ \hat R = \hat I - \frac{v}{2c}\,\alpha^{(k)} + \ldots \tag{1806} $$
Figure 52: A cartoon depicting a rotating charged spin one-half particle, along with the precession of the spin due to the external field in the particle's rest frame and the Thomas precession.
Consider the rotation $\hat R_1$ of a spinor due to an infinitesimal Lorentz transformation with small velocity $\mathbf v$, then
$$ \hat R_1 = \hat I - \frac{1}{2c}\,\boldsymbol{\alpha}\cdot\mathbf v + \ldots \tag{1807} $$
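At a time $\Delta t$ later, the electron has changed its velocity since it is accelerating. The new velocity of the electron's rest frame is given by
$$ \mathbf v' = \mathbf v + \mathbf a\,\Delta t \tag{1808} $$
On performing a second Lorentz transform with the boost $\mathbf a\,\Delta t$, one finds the rotation
$$ \hat R_2 = \hat I - \frac{1}{2c}\,\boldsymbol{\alpha}\cdot\mathbf a\,\Delta t + \ldots \tag{1809} $$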
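The combined Lorentz transform is given by
$$ \hat R = \hat R_2\,\hat R_1 = \left( \hat I - \frac{1}{2c}\,\boldsymbol{\alpha}\cdot\mathbf a\,\Delta t + \ldots \right)\left( \hat I - \frac{1}{2c}\,\boldsymbol{\alpha}\cdot\mathbf v + \ldots \right) $$
$$ = \hat I - \frac{1}{2c}\,\boldsymbol{\alpha}\cdot(\mathbf v + \mathbf a\,\Delta t) + \frac{1}{4c^2}\,(\boldsymbol{\alpha}\cdot\mathbf a)(\boldsymbol{\alpha}\cdot\mathbf v)\,\Delta t + \ldots \tag{1810} $$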
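The Pauli identity can be used to evaluate the last term
$$ (\boldsymbol{\alpha}\cdot\mathbf a)(\boldsymbol{\alpha}\cdot\mathbf v) = (\hat{\boldsymbol{\sigma}}\cdot\mathbf a)(\hat{\boldsymbol{\sigma}}\cdot\mathbf v) = \mathbf a\cdot\mathbf v\;\hat I + i\,\hat{\boldsymbol{\sigma}}\cdot(\mathbf a\times\mathbf v) \tag{1811} $$
where the product of the two $\alpha$ matrices yields a two-by-two block-diagonal form, which involves only the four-by-four matrices $\hat I$ and $\hat{\boldsymbol{\sigma}}$. Hence, the right-hand side acts equally on both the upper and lower two-component spinors. Furthermore, since the orbit is circular, the acceleration is perpendicular to the velocity, therefore
$$ \mathbf a\cdot\mathbf v = 0 \tag{1812} $$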
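Hence, the combined transformation becomes
$$ \hat R = \hat I - \frac{1}{2c}\,\boldsymbol{\alpha}\cdot(\mathbf v + \mathbf a\,\Delta t) + \frac{i}{4c^2}\,\hat{\boldsymbol{\sigma}}\cdot(\mathbf a\times\mathbf v)\,\Delta t + \ldots \tag{1813} $$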
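The Pauli identity used to evaluate the cross term can be verified numerically for the two-by-two Pauli matrices. The sketch below is illustrative only; the helper names and the sample vectors are assumptions, and the check is carried out on one two-by-two block rather than on the full four-by-four matrices.

```python
# Pauli matrices as nested lists of (complex) numbers.
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_dot(v):
    """The 2x2 matrix sigma . v for a 3-vector v."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pauli_identity_residual(a, v):
    """Largest element of (s.a)(s.v) - [a.v I + i s.(a x v)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(v))
    sc = sigma_dot(cross(a, v))
    rhs = [[dot(a, v) * (1 if i == j else 0) + 1j * sc[i][j]
            for j in range(2)] for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# Circular orbit: acceleration perpendicular to velocity, so a.v = 0
a = (0.0, 2.0, 0.0)
v = (3.0, 0.0, 0.0)
residual_circular = pauli_identity_residual(a, v)
residual_generic = pauli_identity_residual((0.3, -1.2, 0.7), (1.1, 0.4, -0.9))
print(residual_circular, residual_generic)
```

For the perpendicular pair only the term $i\,\boldsymbol{\sigma}\cdot(\mathbf a\times\mathbf v)$ survives, which is the term responsible for the rotation of the spinor.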
=
=
1
(a v)
2 c2
q
(E v)
2 m c2
(1815)
e
B0
2mc
(1816)
and its direction is opposite to the precession of the spin in the electron's rest frame.
On combining the two precession frequencies, one finds that in the lab frame the spin's precession rate is given by
$$ \boldsymbol{\omega}_{Lab} = \boldsymbol{\omega}_{rest} - \boldsymbol{\omega}_{T} = \frac{e}{2mc}\,( g_S - 1 )\,\mathbf B' \tag{1817} $$
be given by $( g_S - 1 )$.
The Spin-Orbit Interaction
In the lab frame, the interaction between the magnetic moment of the moving electron's spin $\mathbf S$ and the field is inferred to be
$$ \hat H^{Lab}_{Int} = -\,\frac{q}{2mc}\,( g_S - 1 )\,\mathbf B'\cdot\mathbf S \tag{1819} $$
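where $g_S$ is the gyromagnetic ratio. Since the magnetic induction field is given by
$$ \mathbf B' = -\,\frac{1}{mcr}\frac{\partial\phi}{\partial r}\;\mathbf L \tag{1820} $$
where the electrostatic potential is given by
$$ \phi(r) = \frac{q'}{r} \tag{1821} $$
one finds the spin-orbit interaction
$$ \hat H_{SO} = -\,\frac{q}{2mc}\,( g_S - 1 )\,\frac{q'}{mcr^3}\;\mathbf L\cdot\mathbf S \tag{1822} $$
or, with $q = -e$ and $q' = Ze$,
$$ \hat H_{SO} = \frac{Ze^2}{2m^2c^2r^3}\,( g_S - 1 )\;\mathbf L\cdot\mathbf S \tag{1823} $$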
The spin-orbit coupling is a relativistic coupling which, apart from the Thomas precession factor, indicates that the electron's spin interacts with a magnetic field in its rest frame via the gyromagnetic ratio of 2. The magnitude of the interaction agrees precisely with the interaction found from the perturbative treatment of the Dirac equation.
To first-order in perturbation theory, the spin-orbit coupling interaction yields a shift of the energy levels. Since the total angular momentum $J$ is a good quantum number, one can write
$$ \mathbf L\cdot\mathbf S = \frac{\hbar^2}{2}\left[ j(j+1) - l(l+1) - \frac{3}{4} \right] \tag{1824} $$
but $j$ for a single electron can only take on the values $j = l \pm \frac{1}{2}$, so
$$ \mathbf L\cdot\mathbf S = \frac{\hbar^2}{2}\left[ \pm\left( l + \frac{1}{2} \right) - \frac{1}{2} \right] \tag{1825} $$
The resulting first-order energy shift is
$$ \Delta E_{SO} = mc^2\left( \frac{Ze^2}{\hbar c} \right)^4 \frac{\pm\left( l + \frac{1}{2} \right) - \frac{1}{2}}{4n^3\, l\,( l + \frac{1}{2} )( l + 1 )} \tag{1827} $$
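Therefore, the spin-orbit interaction lifts the degeneracy between states with different $j = l \pm \frac{1}{2}$ values. For $l = 0$, the numerator vanishes since the total angular momentum can only take the value
$$ j = +\frac{1}{2} \tag{1828} $$
The energy shift produced by the spin-orbit coupling is about a factor of the square of the fine structure constant
$$ \left( \frac{e^2}{\hbar c} \right)^2 \sim \left( \frac{1}{137} \right)^2 \sim 10^{-4} \tag{1829} $$
smaller than the non-relativistic energy
$$ -\,\frac{mc^2}{2n^2}\left( \frac{Ze^2}{\hbar c} \right)^2 \tag{1830} $$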
The Darwin term has no obvious classical interpretation. It only has physical consequences for states with zero orbital angular momentum. However, it does play an important role for the $s$ electronic states of hydrogen, and is essential in describing why Dirac's theory makes the $2S_{1/2}$ and $2P_{1/2}$ states of hydrogen degenerate. This degeneracy was an essential ingredient in the discovery of the Lamb shift and the subsequent development of Quantum Electrodynamics.
The Darwin interaction is given by
$$ \hat H_{Darwin} = \frac{\pi Z e^2 \hbar^2}{2m^2c^2}\;\delta^3(\mathbf r) \tag{1831} $$
so the first-order energy shift is
$$ \Delta E_{Darwin} = \frac{\pi Z e^2 \hbar^2}{2m^2c^2}\;\psi_S^\dagger(0)\,\psi_S(0) \tag{1832} $$
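Hence, the shift only occurs for electrons with $l = 0$. Furthermore, since the probability density for finding the electron at the origin is given by
$$ \psi_S^\dagger(0)\,\psi_S(0) = \frac{1}{\pi}\left( \frac{Z}{na} \right)^3 \delta_{l,0} \tag{1833} $$
the shift is found as
$$ \Delta E_{Darwin} = \frac{mc^2}{2n^3}\left( \frac{Ze^2}{\hbar c} \right)^4 \delta_{l,0} \tag{1834} $$
which shifts the energies of $s$ states upwards. The Darwin term reflects the fact that the relativistic corrections are important for small $r$ since the inequality
$$ mc^2 \gg \frac{Ze^2}{r} \tag{1835} $$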
12.10.7
Figure 53: The Grotrian energy level diagram for the n = 2 shell of hydrogen (blue). The diagram shows the magnitude and sign of the various relativistic corrections (kinematic, spin-orbit and Darwin) for the 2P3/2, 2S1/2 and 2P1/2 levels. It should be noted that states with the same j are degenerate.
When the various relativistic corrections are combined, for $l = 0$, the Darwin term exactly compensates for the absence of the spin-orbit interaction. Therefore, the energy shifts combine to yield one formula in which $l$ drops out. This implies that the energy levels only depend on the principal quantum number $n$ and the total angular momentum $j$. States with different orbital angular momenta are degenerate, even though the individual interactions appear to lift the degeneracy. The relativistic corrections inherent in Dirac's theory of hydrogen yield energy shifts and line-splittings which are described as fine structure.
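The energy levels are described by
$$ E \approx mc^2\left[ 1 - \frac{1}{2}\frac{Z^2\alpha^2}{n^2} - \frac{1}{2}\frac{Z^4\alpha^4}{n^3}\left( \frac{1}{j + \frac{1}{2}} - \frac{3}{4n} \right) + \ldots \right] \tag{1836} $$
where
$$ \alpha = \frac{e^2}{\hbar c} \tag{1837} $$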
is the fine structure constant. Generally, states with larger $j$ values have higher energies. The fine structure splittings decrease with increasing $n$ like $n^{-3}$, but increase with increasing $Z$ like $Z^4$. The splittings of the lower energy levels are largest, for example
m c2 4
1
1
E2P 3 E2P 1 =
Figure 54: The Grotrian energy level diagram for the n = 3 shell of hydrogen (blue). The diagram shows the magnitude and sign of the various relativistic corrections (kinematic, spin-orbit and Darwin) for the 3D5/2, 3P3/2, 3D3/2, 3S1/2 and 3P1/2 levels. It should be noted that states with the same j are degenerate.
Lamb and Retherford designed an experiment to accurately measure the fine structure of the hydrogen atom. In the experiment, the time scales were such that the population of all excited states, other than the metastable $2S_{1/2}$ state of hydrogen, radiatively decayed to the ground state. Hence, the number of induced transitions from the $2S_{1/2}$ state could be monitored by simply observing the population of hydrogen atoms not in the ground state.
Figure 55: A schematic of the apparatus used in the Lamb-Retherford experiment. The beam of H molecules is produced in an oven, and the beam is excited by cross-bombardment with an electron beam. The population of the $2S_{1/2}$ state is altered in the microwave resonator, and the population is observed via the current emitted at the tungsten plate.
A beam of hydrogen atoms was produced by dissociating hydrogen molecules
Figure 56: The dependence of the current emitted from the tungsten plate on
the applied magnetic field. The resonance frequency was set to 9487 Megacycles.
[W. E. Lamb Jr. and R. C. Retherford, Phys. Rev. 72, 241 (1947).]
In the resonator, an applied magnetic field Zeeman-split the excited levels of hydrogen and, when the oscillating field was on resonance with the splitting of the energy levels, the hydrogen atom made transitions out of the metastable $2S_{1/2}$ state. At resonance, the frequency of the oscillating electromagnetic field is equal to the energy splitting. Therefore, for fixed frequency, knowledge of the
resonance magnetic field allowed the splitting of the energy levels to be accu
Figure 57: The observed dependence of the resonance frequencies on the applied magnetic field. The solid lines are the predictions of the Dirac theory and the dashed lines are the result of Dirac's theory if the energy of the 2S state is simply shifted. [W. E. Lamb Jr. and R. C. Retherford, Phys. Rev. 72, 241 (1947).]
that at zero field the degeneracy between the $2S_{1/2}$ and $2P_{1/2}$ states was lifted, with the $2S_{1/2}$ state having the higher energy.
12.10.8
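The radial equations for a relativistic spin one-half particle in a spherically symmetric square well potential are given by
$$ ( E - V(r) - mc^2 )\, f(r) + c\hbar\left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) g(r) = 0 $$
$$ ( E - V(r) + mc^2 )\, g(r) - c\hbar\left( \frac{\partial}{\partial r} + \frac{\kappa}{r} \right) f(r) = 0 \tag{1841} $$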
We shall examine the case of an attractive central square well potential $V(r)$ which is defined by
$$ V(r) = \begin{cases} -V_0 & \text{for } r < a \\ 0 & \text{for } r > a \end{cases} \tag{1842} $$
In the region $r < a$ where the potential is finite, the Dirac radial equation becomes
$$ ( E + V_0 - mc^2 )\, f(r) + c\hbar\left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) g(r) = 0 $$
$$ ( E + V_0 + mc^2 )\, g(r) - c\hbar\left( \frac{\partial}{\partial r} + \frac{\kappa}{r} \right) f(r) = 0 \tag{1843} $$
Figure 58: A spherically symmetric potential well, of depth V0 and radius a. [Plot of V(r)/V0 versus r/a.]
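The function $f(r)$ satisfies a second-order differential equation, which can be found by pre-multiplying the second equation by the operator
$$ c\hbar\left( \frac{\partial}{\partial r} - \frac{\kappa}{r} \right) \tag{1844} $$
and then eliminating $g(r)$ by using the first equation. This process yields the equation
$$ -\,c^2\hbar^2\left[ \frac{\partial^2}{\partial r^2} - \frac{\kappa(\kappa+1)}{r^2} \right] f(r) = \left[ ( E + V_0 )^2 - m^2c^4 \right] f(r) \tag{1845} $$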
By using a similar procedure, starting from the second equation, one can find the analogous equation for $g(r)$
$$ -\,c^2\hbar^2\left[ \frac{\partial^2}{\partial r^2} - \frac{\kappa(\kappa-1)}{r^2} \right] g(r) = \left[ ( E + V_0 )^2 - m^2c^4 \right] g(r) \tag{1846} $$
It should be recognized that the term proportional to $\kappa(\kappa+1)$ on the left-hand side of eqn(1845) for the large component, when divided by $2mc^2$, is equivalent to the centrifugal potential in the non-relativistic limit. The small component experiences a different centrifugal potential. Furthermore, the quantity
$$ ( E + V_0 )^2 - m^2c^4 \tag{1847} $$
plays a similar role to the kinetic energy in the non-relativistic Schrödinger equation.
Real Momenta
If the quantity $( E + V_0 )^2 - m^2c^4$ is positive, it can be written as
$$ ( E + V_0 )^2 - m^2c^4 = c^2\hbar^2 k_0^2 > 0 \tag{1848} $$
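and, on introducing the dimensionless variable $\rho = k_0 r$, the equations for $f$ and $g$ become
$$ \rho^2\frac{\partial^2 f}{\partial\rho^2} + \left( \rho^2 - \kappa(\kappa+1) \right) f = 0 $$
$$ \rho^2\frac{\partial^2 g}{\partial\rho^2} + \left( \rho^2 - \kappa(\kappa-1) \right) g = 0 \tag{1849} $$
Since $\kappa$ (apart from the sign) is identified with a form of angular momentum, one sees that the upper and lower components experience different centrifugal potentials. These equations have forms which are closely related to Bessel's equation. If one sets
$$ f = \rho^{\frac{1}{2}}\, X_{|\kappa+\frac{1}{2}|} \tag{1850} $$
and
$$ g = \rho^{\frac{1}{2}}\, Y_{|\kappa-\frac{1}{2}|} $$
the equations reduce to the pair of Bessel's equations
$$ \rho^2\frac{\partial^2 X_{|\kappa+\frac{1}{2}|}}{\partial\rho^2} + \rho\frac{\partial X_{|\kappa+\frac{1}{2}|}}{\partial\rho} + \left( \rho^2 - ( \kappa + \tfrac{1}{2} )^2 \right) X_{|\kappa+\frac{1}{2}|} = 0 \tag{1851} $$
$$ \rho^2\frac{\partial^2 Y_{|\kappa-\frac{1}{2}|}}{\partial\rho^2} + \rho\frac{\partial Y_{|\kappa-\frac{1}{2}|}}{\partial\rho} + \left( \rho^2 - ( \kappa - \tfrac{1}{2} )^2 \right) Y_{|\kappa-\frac{1}{2}|} = 0 \tag{1852} $$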
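$$ j_n(\rho) = \sqrt{\frac{\pi}{2\rho}}\; J_{n+\frac{1}{2}}(\rho) $$
$$ \eta_n(\rho) = \sqrt{\frac{\pi}{2\rho}}\; N_{n+\frac{1}{2}}(\rho) \tag{1853} $$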
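Therefore, the general solutions of each of the radial equations can be expressed as
$$ \frac{f(r)}{r} = A_0\, j_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(k_0 r) + A_1\, \eta_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(k_0 r) \tag{1854} $$
and
$$ \frac{g(r)}{r} = B_0\, j_{|\kappa-\frac{1}{2}|-\frac{1}{2}}(k_0 r) + B_1\, \eta_{|\kappa-\frac{1}{2}|-\frac{1}{2}}(k_0 r) \tag{1855} $$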
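However, since the functions $f(r)$ and $g(r)$ in the upper and lower components are related by the differential equations
$$ \left( \frac{\partial}{\partial\rho} + \frac{1+\kappa}{\rho} \right)\frac{f(r)}{r} = \frac{( E + V_0 + mc^2 )}{c\hbar k_0}\;\frac{g(r)}{r} \tag{1856} $$
and
$$ \left( \frac{\partial}{\partial\rho} + \frac{1-\kappa}{\rho} \right)\frac{g(r)}{r} = -\,\frac{( E + V_0 - mc^2 )}{c\hbar k_0}\;\frac{f(r)}{r} \tag{1857} $$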
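the two sets of coefficients $(A_0, A_1)$ and $(B_0, B_1)$ must also be related. The explicit relations can be found by using the recurrence relations for the spherical Bessel functions $j_n(\rho)$
$$ \frac{\partial}{\partial\rho}\left[ \rho^{n+1} j_n(\rho) \right] = \rho^{n+1} j_{n-1}(\rho) \tag{1858} $$
and
$$ \frac{\partial}{\partial\rho}\left[ \rho^{-n} j_n(\rho) \right] = -\,\rho^{-n} j_{n+1}(\rho) \tag{1859} $$
The spherical Bessel functions $j_n(\rho)$ are normalizable near the origin, whereas the spherical Neumann functions $\eta_n(\rho)$ diverge as $\rho^{-(n+1)}$ as $\rho \to 0$.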
Imaginary Momenta
If the quantity $( E + V_0 )^2 - m^2c^4$ is negative, it can be written as
$$ ( E + V_0 )^2 - m^2c^4 = -\,c^2\hbar^2\kappa_0^2 < 0 \tag{1861} $$
(1865)
so that the wave function in the region r < a where the potential is zero is
exponentially decaying. Since $E^2 - m^2c^4 < 0$, the wave functions in the outer region should also be expressed in terms of the modified spherical Bessel functions. The quantity $\kappa_1$ can be defined as
$$ E^2 - m^2c^4 = -\,\hbar^2c^2\kappa_1^2 \tag{1866} $$
(1867)
In this case, it is more useful to express the solution of the radial Dirac equation in terms of the spherical Hankel functions $h^{\pm}_n(\rho)$. The spherical Hankel functions are defined via
$$ h^{\pm}_n(\rho) = j_n(\rho) \pm i\,\eta_n(\rho) \tag{1868} $$
For asymptotically large $\rho$, these functions are complex conjugates and represent outgoing or incoming spherical waves
$$ \lim_{\rho\to\infty} h^{\pm}_n(\rho) \to \frac{1}{\rho}\exp\left[ \pm\, i\left( \rho - \left( \frac{n}{2} + \frac{1}{2} \right)\pi \right) \right] \tag{1869} $$
The factor of $\rho^{-1}$ reflects the fact that the intensity of an outgoing wave-packet decreases in proportion to $\rho^{-2}$ in order to conserve energy and probability. From the asymptotic variation, it is seen that the spherical Hankel functions $h^{\pm}_n(i\rho)$ with imaginary arguments, respectively, represent exponentially attenuating or growing spherical waves. In the exterior region, the solutions are represented by
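$$ \frac{f(r)}{r} = C_0'\, h^{+}_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(i\kappa_1 r) + C_1'\, h^{-}_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(i\kappa_1 r) \tag{1870} $$
and
$$ \frac{g(r)}{r} = D_0'\, h^{+}_{|\kappa-\frac{1}{2}|-\frac{1}{2}}(i\kappa_1 r) + D_1'\, h^{-}_{|\kappa-\frac{1}{2}|-\frac{1}{2}}(i\kappa_1 r) \tag{1871} $$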
The coefficients of the upper and lower components are related via
E + m c2
C00 =
D00
c h 1
E + m c2
0
C1 =
D10
(1872)
c h 1
as can be seen by substituting the asymptotic form of the Hankel functions given by eqn(1869) in the asymptotic form of the differential equations relating $f(r)$ and $g(r)$ with $V_0 = 0$. If this wave function is to be normalizable as $r \to \infty$, one must set $C_1' = D_1' = 0$.
The solutions for the wave functions have been found in the inner and outer regions of the potential. The solution must also hold at $r = a$. This is achieved by demanding that the upper and lower components of the wave function are continuous at $r = a$. These conditions are demanded due to charge conservation $\nabla\cdot\mathbf j = 0$, since the current $\mathbf j$ only depends on the components of $\psi$ and does not (explicitly) depend on their derivatives.
Since the wave function at the origin must be normalizable, and since the wave function must be exponentially decaying when $r \to \infty$, the matching condition for the upper component becomes
$$ A_0\, j_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(k_0 a) = C_0'\, h^{+}_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(i\kappa_1 a) \tag{1873} $$
(1874)
(1875)
In the above expression, the quantities $k_0$ and $\kappa_1$ are defined by
$$ \hbar^2c^2k_0^2 = ( E + V_0 )^2 - m^2c^4 \tag{1876} $$
and
$$ \hbar^2c^2\kappa_1^2 = m^2c^4 - E^2 \tag{1877} $$
These equations determine the allowed values for the energy. The above set of equations has to be solved numerically to find the energy eigenvalues. We note that for the Dirac particle, the spin effectively results in the formation of a centrifugal barrier (either for the upper or the lower component) even for electrons in $s$ states. As a result, the potential $V_0$ must exceed a critical strength if it is to yield a bound state.
12.10.9
From the point of view of symmetry, a baryon, such as a neutron or proton, is thought of as being composed of three (valence) quarks. For example, the proton is considered to be made of two up quarks and a down quark ($p = (uud)$), while the neutron is considered to be made of one up quark and two down quarks ($n = (udd)$). These valence quarks are assumed to be surrounded by a sea of gluons which bind the quarks together and a sea of virtual quark/anti-quark pairs that are produced by the gluon field. Likewise, mesons are considered to be made of a quark and an anti-quark, but these valence quarks are also surrounded by a sea of gluons and quark/anti-quark pairs. The gluon force has the property that the energy of interaction increases as the separation between the quarks increases. It is this property of the gluon force that results in the quarks being confined, so that no single quark can be found in nature.
The MIT bag model$^{115}$ is a simple, purely phenomenological model for the structure of strongly interacting particles (hadrons). The model is based on the spherically symmetric potential of radius $a$, but it will be assumed that the quark mass can have one or the other of two values. The quark is assumed to have a small mass (approximately zero) if it is located within a sphere of radius $a$, and the mass is assumed to be very large (or infinite) if $r > a$. To be sure, the quark mass is assumed to be a function of $r$ such that
$$ m = 0 \;\; \text{if } r < a \qquad\quad m \to \infty \;\; \text{if } r > a \tag{1878} $$
It is the infinite mass of the quark for $r > a$ that results in the confinement of the quark to within the hadron. That is, in the exterior region, the infinite rest mass energy exceeds the bound state energy so the exterior region is classically forbidden and, therefore, the particle is confined to the interior.
Inside the hadron, where both the potential energy and the mass $m$ are zero, the kinetic energy parameter $k_0$ can be expressed entirely in terms of the energy via $E = \hbar c k_0$. Therefore, the radial components of the Dirac wave function can be expressed as
$$ \frac{f(r)}{r} = A_0\, j_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(k_0 r) $$
$$ \frac{g(r)}{r} = \mathrm{sign}(\kappa)\, A_0\, j_{|\kappa-\frac{1}{2}|-\frac{1}{2}}(k_0 r) \tag{1879} $$
where the amplitudes of the upper and lower components are the same, since the potential and mass are zero for $r < a$.
Outside the hadron, where $r > a$, the energy $E$ is assumed to be much less than the rest mass energy, $mc^2 \gg E$; therefore, the momentum parameter is imaginary and one can set $\hbar c \kappa_1 \approx mc^2$. In the exterior region, the radial functions can be expressed as
f (r)
r
g(r)
r
= C00 h+
(i1 r)
+ 1  1
2
= C00 h+
(i1 r)
 1  1
2
(1880)
(1881)
Due to the asymptotic properties of the spherical Hankel functions, their ratio is unity for large $\kappa_1$. This leads to the energies of the quarks being governed by the simplified matching condition
$$ j_{|\kappa+\frac{1}{2}|-\frac{1}{2}}(k_0 a) = -\,\mathrm{sign}(\kappa)\; j_{|\kappa-\frac{1}{2}|-\frac{1}{2}}(k_0 a) \tag{1882} $$
where
$$ E = c\hbar k_0 \tag{1883} $$
The above equation governs the ground state and excited state energies of the individual quarks inside the hadron. Since the spherical Bessel functions oscillate in sign, the above equation will result in a set of solutions for $k_0$ with fixed $\kappa$. From the structure of the equations, it is seen that the solutions $k_0$ will only depend on the integer $\kappa$ and the value of $a$. Since another boundary condition should also be imposed at the bag's surface, only states with angular momentum $j = \frac{1}{2}$ should be retained. This extra condition restricts the interest to states with $\kappa = \pm 1$.
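We shall examine the lowest-energy bound state which corresponds to the case $\kappa = -1$. The bound state energies are given by the matching condition
$$ j_0(k_0 a) = j_1(k_0 a) \tag{1884} $$
Since
$$ j_0(\rho) = \frac{\sin\rho}{\rho} \tag{1885} $$
and
$$ j_1(\rho) = \frac{\sin\rho - \rho\cos\rho}{\rho^2} \tag{1886} $$
the matching condition reduces, with $\rho = k_0 a$, to
$$ \rho = \frac{1}{1 + \cot\rho} \tag{1887} $$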
Figure 59: The radial dependence of the quark distribution in the ground state of the MIT bag. [Plot of the normalized distribution P/P(0).]
The energy of the lowest-energy quark is given by the formula
$$ E_{\kappa=-1,\,n_r=0} = \frac{2.04\; c\hbar}{a} \tag{1888} $$
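The matching condition $j_0(k_0 a) = j_1(k_0 a)$ for $\kappa = -1$ has to be solved numerically. A minimal sketch using a sign-change scan followed by bisection is shown below; the function names, the step size and the tolerance are illustrative assumptions rather than part of the original notes.

```python
import math

def j0(x):
    """Spherical Bessel function of order 0."""
    return math.sin(x) / x

def j1(x):
    """Spherical Bessel function of order 1."""
    return math.sin(x) / x**2 - math.cos(x) / x

def bag_condition(x):
    """Vanishes when j0(x) = j1(x), with x = k0 * a."""
    return j0(x) - j1(x)

def bisect(func, lo, hi, tol=1e-12):
    """Locate a root of func inside a bracket [lo, hi] with a sign change."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bag_roots(count, step=0.01):
    """Return the first `count` positive solutions of the matching condition."""
    roots = []
    x = step
    while len(roots) < count:
        if (bag_condition(x) < 0) != (bag_condition(x + step) < 0):
            roots.append(bisect(bag_condition, x, x + step))
        x += step
    return roots

roots = bag_roots(4)
for n_r, root in enumerate(roots):
    print("n_r =", n_r, " E a / (c hbar) =", round(root, 2))
```

The successive roots correspond to increasing radial quantum number $n_r$ for fixed $\kappa = -1$, with energies $E = c\hbar k_0$.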
Table 15: The lowest single-particle energies of the MIT Bag Model, given as the dimensionless quantity $E_{\kappa,n_r}\, a / c\hbar$.

n_r      κ = −1    κ = +1    κ = −2    κ = +2
0         2.04      3.81      3.21      5.12
1         5.40      7.00      6.76      8.41
2         8.58     10.17     10.01     11.61
3        11.73     13.31     13.20     14.79

single-particle levels. This could allow one to calculate the excitation energies required to change the hadron's internal structure.
In conclusion, the MIT bag model, when interpreted as being a strictly single-particle picture, predicts that the sets of excitation energies (for the internal structure) of the basic hadrons can be put into a one-to-one correspondence with each other. That is, the families of excitation energies for the hadrons should fall on top of each other, if one scales the energies by multiplying them with the hadron's characteristic length scale $a$. The bag radius is determined by the use of further phenomenological considerations. However, although the model can be used to fit the right size for a nucleon ($\sim 1$ fm), the model predicts that a meson (such as the pions, which are composed of a quark and anti-quark in the combinations of either $(u, \bar d)$, $(d, \bar u)$ or $\frac{1}{\sqrt 2}(d\bar d - u\bar u)$) should have almost the same radius$^{116}$
$$ \frac{a_n}{a_\pi} = \left( \frac{3}{2} \right)^{\frac{1}{4}} \tag{1889} $$
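Hence, the ratio of the nucleon mass $M_n$ to the pion mass $M_\pi$ is expected to be given by the formula
$$ \frac{M_n}{M_\pi} = \frac{3 \times 2.04 / a_n}{2 \times 2.04 / a_\pi} = \frac{3}{2}\left( \frac{2}{3} \right)^{\frac{1}{4}} \tag{1890} $$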
116 It is assumed that the bag energy is given by the sum of a volume term B a3 and the sum
P
a4 =
c
h
3B
X
n
Table 16: The Observed Energy Levels for the charmonium system (cc) in units
of MeV/c2 .
1
S0
2981
3686

S1
3097
3770
4040
4160
P0
3415

P1
3510

P2
3556

which yields a ratio of 1.36. This ratio is far too small for the triplet of $\pi$ mesons since $M_\pi \approx 139$ MeV/c$^2$, and $M_n \approx 938$ MeV/c$^2$. Although it is inadequate for the pseudo-scalar mesons, the MIT bag model is more appropriate for the vector meson $\omega$ which is composed of $\frac{1}{\sqrt 2}(u\bar u + d\bar d)$ and has a mass of $M_\omega \approx 783$ MeV/c$^2$, or the vector meson $\rho$, $\frac{1}{\sqrt 2}(u\bar u + d\bar d)$, with a mass $M_\rho \approx 776$ MeV/c$^2$. Hence, at best, the MIT Bag model produces mixed results. The MIT Bag model is also quite unappealing, since the basic assumptions of the bag model do not follow from Quantum Chromodynamics, and the model is neither renormalizable nor is it Lorentz invariant.
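The size of the discrepancy can be made explicit with a short calculation. The sketch below simply evaluates the bag-model ratio $\frac{3}{2}(2/3)^{1/4}$ and compares it with the ratio of the masses quoted in the text; the variable names are illustrative.

```python
# Bag-model prediction: three quarks in a nucleon bag versus two in a pion bag,
# with the radii related by a_n / a_pi = (3/2)^(1/4).
radius_ratio = 1.5 ** 0.25                 # a_n / a_pi
bag_mass_ratio = (3.0 / 2.0) / radius_ratio  # M_n / M_pi = (3/2) (2/3)^(1/4)

# Ratio of the masses quoted in the text (in MeV/c^2)
nucleon_mass = 938.0
pion_mass = 139.0
observed_mass_ratio = nucleon_mass / pion_mass

print("bag model ratio:", round(bag_mass_ratio, 2))
print("observed ratio :", round(observed_mass_ratio, 2))
```

The predicted ratio is several times smaller than the observed one, which is the sense in which the model fails for the pseudo-scalar mesons.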
12.10.10
A quark and anti-quark pair form bound states. Thus, for example, a charmed quark/anti-quark pair $(c, \bar c)$ can form states with different internal quantum numbers$^{117}$. The experimentally determined energies for the $J/\psi$ system$^{118}$ are given in Table 16. Similarly, the Upsilon particle$^{119}$ $(b\bar b)$ has a similar set of energy levels. The energy levels of the Upsilon system$^{120}$ are tabulated in Table 17. For positronium$^{121}$, like the hydrogen atom, it is the electromagnetic force mediated by vector photons which binds the electron and positron into a bound state. For a quark/anti-quark bound state, it is the color force mediated by massless vector gluons that binds the quark/anti-quark pair together. The color force has the property that it increases with increasing separation of the quark/anti-quark pair, which has the consequence that the quarks are confined. Furthermore, high-energy inelastic scattering experiments on hadrons indicate
117 J. E. Augustin et al. Phys. Rev. Lett. 33, 1406 (1974).
J. J. Aubert Phys. Rev. Lett. 33, 1404 (1974).
118 The data are taken from the Particle Data Group: http://pdg.lbl.gov
119 S. W. Herb, et al. Phys. Rev. Lett. 39, 252 (1977).
W. R. Innes et al. Phys. Rev. Lett. 39, 1240 (1977).
120 The data are taken from the Particle Data Group: http://pdg.lbl.gov
121 M. Deutsch, Phys. Rev. 82, 455 (1951).
Table 17: The Observed Energy Levels for the Upsilon system (bb) in units of
MeV/c2 .
1
S0
9460
10025
10355
10580
P0
9860
10232

P1
9893
10255

P2
9913
10268

that at small separations the quarks only interact weakly. This property is called asymptotic freedom. It was the realization by 't Hooft$^{122}$, Gross and Wilczek$^{123}$ and Politzer$^{124}$ that non-Abelian gauge theories possess the property of asymptotic freedom that led to the acceptance of the theory of Quantum Chromodynamics. The screening of the color force between the quarks at large distances (due to virtual quark/anti-quark pairs) is more than compensated by an anti-screening due to virtual gluon pairs. However, at small distances the color force vanishes.
The rest-mass energy of the quarks and anti-quarks will be modeled by
m(r) c = m0 c i . r
(1891)
which describes an energy similar to that of an elastic string which couples to
the spin125 . The model has two undetermined parameters, the quark mass m0
and the string tension m0 c . The mass m(r), and the Dirac equation, can be
used to determine the energy levels of quarkonium.
Exercise:
Show that the positive energy eigenvalues of the Dirac equation with the
mass m(r) given by
(1892)
m(r) c = m0 c i . r
are determined as
En,j,l = m0 c2
122 G.
tA + 1
(1893)
1
2
1
2 (n + j) + 3 if j = l
2
2 (n j) + 1 if j = l +
(1895)
12.11
12.11.1
The scattering of a relativistic electron by a Coulomb force field results in spin-flip scattering since the electron has a magnetic moment which interacts with the magnetic field produced in the electron's rest frame. Since the Coulomb potential is spherically symmetric, the angular momentum operators $\hat J^2$ and $\hat J^{(3)}$ commute with the Hamiltonian, hence, $(j, j_3)$ are constants of motion. However,
does not commute with H.
c p .
A (r)
E p + m c2
(1898)
pr
in
(r) = NEp
exp
i
cos
(1899)
c p
Ep + m c2
h
From the Rayleigh expansion, one observes that the in-asymptotes are not eigenstates of $(\hat{\mathbf J})^2 = (\hat{\mathbf L} + \hat{\mathbf S})^2$ since they are formed of linear superpositions of many states with differing eigenvalues of $\hat{\mathbf L}^2$ but have a fixed eigenvalue of $\hat{\mathbf S}^2$. However, the in-asymptotes are eigenstates of $\hat J^{(3)} = \hat L^{(3)} + \hat S^{(3)}$ with eigenvalues $\pm\frac{\hbar}{2}$.
Figure 60: The geometry of the asymptotic final state of Mott scattering. At large $r$, the beam separates into an unscattered beam $\psi_{in}$ and a spherical outgoing wave $\psi_{out}$ in the direction $(\theta, \varphi)$.
The corresponding out-asymptotes can be described as spherical outgoing waves. Even though the in-asymptote may have a definite eigenvalue of $\hat S^{(3)}$, the spherically symmetric out-asymptote waves may contain a component with flipped spin, due to the action of the spin-orbit coupling
$$ \hat{\mathbf S}\cdot\hat{\mathbf L} = \hat S^{(3)}\hat L^{(3)} + \frac{1}{2}\left( \hat S^{+}\hat L^{-} + \hat S^{-}\hat L^{+} \right) \tag{1900} $$
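active in the vicinity of the target. In spherical polar coordinates, the orbital angular momentum raising and lowering operators are given by
$$ \hat L^{\pm} = \hbar\exp[\pm i\varphi]\left( \pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi} \right) \tag{1901} $$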
(1902)
c p ( er . )
)
Ep + m c2 ( f () g() exp[ i ] S
pr
1
exp i
= NEp
r
h
(1903)
where $\hat e_r$ is a unit vector in the radial direction. It should be noted that the out-asymptote describes an outgoing spherical wave when $r \to \infty$. Therefore, the operator $(\boldsymbol{\sigma}\cdot\hat{\mathbf p})$ appearing in the asymptote has simplified since
out
(r)
lim ( . p )
lim
r.
r
r
r.
r
2i
ih
+
r
h
i h
r
S.L
r
(1904)
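which reflects that the spin-orbit coupling term is ineffective as $r \to \infty$. Similarly, the effect of the differential operator can be evaluated as
$$ \lim_{r\to\infty}\; -\,i\hbar\frac{\partial}{\partial r}\,\frac{1}{r}\exp\left[ i\frac{pr}{\hbar} \right] \to \frac{p}{r}\exp\left[ i\frac{pr}{\hbar} \right] \tag{1905} $$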
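In light of the comment about the upper two-component spinor, one sees that the scattered wave is determined by
$$\begin{pmatrix} f(\theta) \\ g(\theta)\exp[+i\varphi] \end{pmatrix} \tag{1906}$$
for an incident beam with positive helicity, and by
$$\begin{pmatrix} -\,g(\theta)\exp[-i\varphi] \\ f(\theta) \end{pmatrix} \tag{1907}$$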
if the initial beam has a negative helicity. The quantities $f(\theta)$ and $g(\theta)$ are generalized scattering amplitudes that have the dimensions of length. They depend on $\theta$ but do not depend on $\varphi$, as both the in- and out-asymptotes are eigenstates of $\hat{J}^{(3)}$ with eigenvalues $\pm\frac{\hbar}{2}$. A partial wave analysis can be performed on the Dirac equation to yield expressions for the scattering amplitudes $f(\theta)$ and $g(\theta)$ in terms of phase shifts. A detailed knowledge of the scattering amplitudes is not required for the following analysis.
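If the in-asymptote has the spin quantized along the direction given by $(\sin\theta_s\cos\varphi_s, \sin\theta_s\sin\varphi_s, \cos\theta_s)$, the upper component of the Dirac spinor is determined by the two-component spinor
$$\chi_s = \begin{pmatrix}\cos\frac{\theta_s}{2}\exp[-i\frac{\varphi_s}{2}] \\ \sin\frac{\theta_s}{2}\exp[+i\frac{\varphi_s}{2}]\end{pmatrix} \tag{1908}$$
The out-asymptote is then determined by the two-component spinor
$$\phi^A(\underline{r}) = \frac{1}{r}\exp\left[i\frac{p\,r}{\hbar}\right]\begin{pmatrix} f(\theta)\cos\frac{\theta_s}{2}\exp[-i\frac{\varphi_s}{2}] - g(\theta)\sin\frac{\theta_s}{2}\exp[+i\frac{\varphi_s}{2}]\exp[-i\varphi] \\ g(\theta)\cos\frac{\theta_s}{2}\exp[-i\frac{\varphi_s}{2}]\exp[+i\varphi] + f(\theta)\sin\frac{\theta_s}{2}\exp[+i\frac{\varphi_s}{2}]\end{pmatrix} \tag{1909}$$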
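The probability for scattering is proportional to
$$\begin{aligned} I(\theta,\varphi) &\propto \left| f(\theta)\cos\frac{\theta_s}{2}\exp[-i\frac{\varphi_s}{2}] - g(\theta)\sin\frac{\theta_s}{2}\exp[+i\frac{\varphi_s}{2}]\exp[-i\varphi] \right|^2 \\ &\quad + \left| g(\theta)\cos\frac{\theta_s}{2}\exp[-i\frac{\varphi_s}{2}]\exp[+i\varphi] + f(\theta)\sin\frac{\theta_s}{2}\exp[+i\frac{\varphi_s}{2}] \right|^2 \\ &= |f(\theta)|^2 + |g(\theta)|^2 + \sin\theta_s\,\sin(\varphi - \varphi_s)\; i\left( f^*(\theta)\,g(\theta) - f(\theta)\,g^*(\theta) \right) \end{aligned} \tag{1910}$$
which clearly depends on the azimuthal angle $\varphi$.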
If the initial beam is unpolarized, the direction of the initial spin $(\theta_s, \varphi_s)$ should be averaged over by integrating over the solid angle $d\Omega_s = d\theta_s\, d\varphi_s \sin\theta_s$. This process yields the scattering probability for the unpolarized beam
$$\int\frac{d\Omega_s}{4\pi}\; I(\theta,\varphi) = |f(\theta)|^2 + |g(\theta)|^2 \tag{1911}$$
which is independent of the azimuthal angle $\varphi$. It should be noted that the unpolarized cross-section differs from the polarized cross-section.
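The spin average can be checked numerically. The following is a minimal sketch, not part of the original notes: it builds the angular part of the out-asymptote for an incident spin along $(\theta_s, \varphi_s)$, compares the resulting intensity with the closed form quoted for the polarized beam, and averages over the spin direction with a simple midpoint rule. The function names and the sample values of $f$ and $g$ are illustrative assumptions.

```python
import cmath
import math


def out_spinor(f, g, theta_s, phi_s, phi):
    """Angular part of the out-asymptote for an incident spin along (theta_s, phi_s)."""
    a = math.cos(theta_s / 2) * cmath.exp(-0.5j * phi_s)  # spin-up weight of the incident spinor
    b = math.sin(theta_s / 2) * cmath.exp(+0.5j * phi_s)  # spin-down weight of the incident spinor
    upper = f * a - g * b * cmath.exp(-1j * phi)
    lower = g * a * cmath.exp(+1j * phi) + f * b
    return upper, lower


def intensity(f, g, theta_s, phi_s, phi):
    """Sum of the squared moduli of the two spinor components."""
    upper, lower = out_spinor(f, g, theta_s, phi_s, phi)
    return abs(upper) ** 2 + abs(lower) ** 2


def intensity_closed_form(f, g, theta_s, phi_s, phi):
    """|f|^2 + |g|^2 + sin(theta_s) sin(phi - phi_s) i (f* g - f g*)."""
    interference = (1j * (f.conjugate() * g - f * g.conjugate())).real
    return abs(f) ** 2 + abs(g) ** 2 + math.sin(theta_s) * math.sin(phi - phi_s) * interference


def unpolarized_intensity(f, g, phi, n=100):
    """Average of the intensity over the incident spin direction (midpoint rule)."""
    total = 0.0
    for i in range(n):
        theta_s = math.pi * (i + 0.5) / n
        for j in range(n):
            phi_s = 2 * math.pi * (j + 0.5) / n
            total += intensity(f, g, theta_s, phi_s, phi) * math.sin(theta_s)
    return total * (math.pi / n) * (2 * math.pi / n) / (4 * math.pi)


if __name__ == "__main__":
    f, g = complex(0.8, 0.3), complex(0.2, -0.5)  # hypothetical amplitudes at one angle theta
    print(intensity(f, g, 0.7, 1.1, 2.3), intensity_closed_form(f, g, 0.7, 1.1, 2.3))
    print(unpolarized_intensity(f, g, 2.3), abs(f) ** 2 + abs(g) ** 2)
```

The interference term is odd under $\varphi_s \to \varphi_s + \pi$, which is why it drops out of the average.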
Even if the initial beam is unpolarized, the final beam will be partially polarized. The direction of the net polarization is determined by evaluating the matrix elements of $\hat{\underline{S}}$ and averaging over the direction of the initial spin, $\theta_s$ and $\varphi_s$. The result is proportional to
$$\langle\,\hat{\underline{S}}\,\rangle = \frac{\hbar}{2}\; i\,\frac{f^*(\theta)\,g(\theta) - f(\theta)\,g^*(\theta)}{|f(\theta)|^2 + |g(\theta)|^2}\;(\sin\varphi, -\cos\varphi, 0) \tag{1912}$$
Hence, the polarization is perpendicular to the scattering plane. It should also be noted that the net polarization of the scattered wave is determined by the relative deviation of the scattering cross-section for polarized electrons from the unpolarized scattering cross-section.
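As a check on the direction of the polarization, the sketch below treats the unpolarized beam as an equal incoherent mixture of spin-up and spin-down incident states, evaluates the Pauli-matrix expectation values in the two scattered spinors, and compares with the closed form. It is an illustration under that assumption. The function names are not from the notes, and the result is expressed in units of $\hbar/2$.

```python
import cmath
import math


def scattered_spinors(f, g, phi):
    """Scattered two-component spinors for incident spin up and spin down."""
    up = (f, g * cmath.exp(1j * phi))
    down = (-g * cmath.exp(-1j * phi), f)
    return up, down


def pauli_expectations(spinor):
    """Unnormalized expectation values of (sigma_x, sigma_y, sigma_z)."""
    u, l = spinor
    overlap = u.conjugate() * l
    return 2 * overlap.real, 2 * overlap.imag, abs(u) ** 2 - abs(l) ** 2


def polarization(f, g, phi):
    """Net polarization (units of hbar/2) for an unpolarized incident beam."""
    sums = [0.0, 0.0, 0.0]
    norm = 0.0
    for spinor in scattered_spinors(f, g, phi):
        for i, value in enumerate(pauli_expectations(spinor)):
            sums[i] += value
        norm += abs(spinor[0]) ** 2 + abs(spinor[1]) ** 2
    return tuple(s / norm for s in sums)


def polarization_closed_form(f, g, phi):
    """i (f* g - f g*) / (|f|^2 + |g|^2) times (sin phi, -cos phi, 0)."""
    amplitude = (1j * (f.conjugate() * g - f * g.conjugate())).real / (abs(f) ** 2 + abs(g) ** 2)
    return amplitude * math.sin(phi), -amplitude * math.cos(phi), 0.0


if __name__ == "__main__":
    f, g = complex(0.8, 0.3), complex(0.2, -0.5)  # hypothetical amplitudes
    print(polarization(f, g, 0.6))
    print(polarization_closed_form(f, g, 0.6))
```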
12.11.2
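The Dirac equation with a spherically symmetric potential $V(r)$ has solutions of the form
$$\psi(\underline{r}) = \begin{pmatrix} \frac{f(r)}{r}\;\Omega^{j\mp\frac{1}{2}}_{j,j_z}(\theta,\varphi) \\ i\,\frac{g(r)}{r}\;\Omega^{j\pm\frac{1}{2}}_{j,j_z}(\theta,\varphi) \end{pmatrix} \tag{1913}$$
where the spinor spherical harmonics are given by
$$\Omega^{j\mp\frac{1}{2}}_{j,j_z}(\theta,\varphi) = \begin{pmatrix} \pm\sqrt{\frac{j+\frac{1}{2}\mp\frac{1}{2}\pm j_z}{2j+1\mp 1}}\;\; Y^{j\mp\frac{1}{2}}_{j_z-\frac{1}{2}}(\theta,\varphi) \\ \sqrt{\frac{j+\frac{1}{2}\mp\frac{1}{2}\mp j_z}{2j+1\mp 1}}\;\; Y^{j\mp\frac{1}{2}}_{j_z+\frac{1}{2}}(\theta,\varphi) \end{pmatrix} \tag{1914}$$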
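$$\left( E - V(r) - m c^2 \right) f(r) = c\,\hbar\left( \frac{\kappa}{r} - \frac{\partial}{\partial r} \right) g(r) \tag{1915}$$
$$\left( E - V(r) + m c^2 \right) g(r) = c\,\hbar\left( \frac{\kappa}{r} + \frac{\partial}{\partial r} \right) f(r) \tag{1916}$$
where $\kappa = \mp\,( j + \frac{1}{2} )$.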
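For positive values of $\kappa$, the solutions have the form
$$\frac{f(r)}{r} = A_\kappa\, j_\kappa(kr) + B_\kappa\, \eta_\kappa(kr) \tag{1917}$$
where $j_\kappa(kr)$ and $\eta_\kappa(kr)$ are the spherical Bessel and the spherical Neumann functions. For negative values of $\kappa$, the solutions are given by
$$\frac{f(r)}{r} = A_\kappa\, j_{-\kappa-1}(kr) + B_\kappa\, \eta_{-\kappa-1}(kr) \tag{1918}$$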
The spherical Bessel and spherical Neumann functions have the asymptotic forms
$$j_\kappa(kr) \to \frac{\cos\left(kr - (\kappa+1)\frac{\pi}{2}\right)}{kr}, \qquad \eta_\kappa(kr) \to \frac{\sin\left(kr - (\kappa+1)\frac{\pi}{2}\right)}{kr} \tag{1919}$$
The solutions for a free particle do not involve the spherical Neumann functions, since they are not normalizable at the origin. The amplitudes of the asymptotic solution in the presence of a finite potential $V(r)$ are usually written as
$$\frac{B_\kappa}{A_\kappa} = -\tan\delta_\kappa(k) \tag{1920}$$
where $\delta_\kappa(k)$ are the phase shifts that characterize the potential. The phase shifts depend directly on $\kappa$ (and the energy) and only depend indirectly on $j$ and $l$ through $\kappa$. The phase shifts are defined so that the asymptotic variation of the radial functions is given by
$$\frac{f(r)}{r} \to e^{i\delta_\kappa(k)}\;\frac{\cos\left(kr - (\kappa+1)\frac{\pi}{2} + \delta_\kappa(k)\right)}{r} \tag{1921}$$
and only differs from the asymptotic variation of the free particle solutions through the phase shifts. Furthermore, if this is decomposed in terms of incoming and outgoing spherical waves,
$$\frac{f(r)}{r} \to \frac{\exp\left[ i\left( kr - (\kappa+1)\frac{\pi}{2} + 2\,\delta_\kappa(k) \right)\right]}{2r} + \frac{\exp\left[ -i\left( kr - (\kappa+1)\frac{\pi}{2} \right)\right]}{2r} \tag{1922}$$
their fluxes are equal due to conservation of particles and, as written, the incoming spherical waves are not modified by the phase shifts.
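The decomposition into incoming and outgoing waves is a one-line identity, $e^{i\delta}\cos(x+\delta) = \frac{1}{2}\left(e^{i(x+2\delta)} + e^{-ix}\right)$. The short sketch below, with illustrative function names and sample values that are not from the notes, evaluates both sides and confirms that the two spherical waves have equal moduli.

```python
import cmath
import math


def phase_argument(k, r, kappa):
    """The free-particle phase k r - (kappa + 1) pi / 2."""
    return k * r - (kappa + 1) * math.pi / 2


def shifted_radial_asymptote(k, r, kappa, delta):
    """exp(i delta) cos(k r - (kappa + 1) pi/2 + delta) / r."""
    return cmath.exp(1j * delta) * math.cos(phase_argument(k, r, kappa) + delta) / r


def outgoing_wave(k, r, kappa, delta):
    """Outgoing spherical wave; it carries the full phase shift 2 delta."""
    return cmath.exp(1j * (phase_argument(k, r, kappa) + 2 * delta)) / (2 * r)


def incoming_wave(k, r, kappa):
    """Incoming spherical wave; it is unaffected by the phase shift."""
    return cmath.exp(-1j * phase_argument(k, r, kappa)) / (2 * r)


if __name__ == "__main__":
    k, r, kappa, delta = 1.7, 40.0, 2, 0.35  # hypothetical sample values
    total = outgoing_wave(k, r, kappa, delta) + incoming_wave(k, r, kappa)
    print(total, shifted_radial_asymptote(k, r, kappa, delta))
```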
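The general asymptotic ($r \to \infty$) form of the wave function for the scattering is composed of the unscattered wave and a spherical outgoing wave. The polar axis is chosen to be parallel to the direction of the incident beam, which is also chosen to be the quantization axis for the spin. If the incident beam is polarized with spin-up, the upper two-component spinor has the form
$$\phi^A(\underline{r}) = \exp\left[ i k r \cos\theta \right]\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{\exp\left[ i k r \right]}{r}\begin{pmatrix} f(\theta) \\ g(\theta)\exp[ i\varphi ] \end{pmatrix} \tag{1923}$$
whereas for a down-spin polarized incident beam
$$\phi^A(\underline{r}) = \exp\left[ i k r \cos\theta \right]\begin{pmatrix} 0 \\ 1 \end{pmatrix} + \frac{\exp\left[ i k r \right]}{r}\begin{pmatrix} -\,g(\theta)\exp[ -i\varphi ] \\ f(\theta) \end{pmatrix} \tag{1924}$$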
On recalling the Rayleigh expansion
$$\exp\left[ i k r \cos\theta \right] = \sum_l i^l\,( 2l + 1 )\, j_l(kr)\, P_l(\cos\theta) \tag{1925}$$
one can find the scattered spherical outgoing wave by subtracting the unscattered beam from the total wave function. On using the asymptotic large $r$ variation, one obtains the asymptotic form
$$\exp\left[ i k r \cos\theta \right] \to \sum_l i^l\,( 2l + 1 )\,\frac{\cos\left(kr - (l+1)\frac{\pi}{2}\right)}{kr}\, P_l(\cos\theta) \tag{1926}$$
which has a similar form to the asymptotic form of the total wave function.
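The Rayleigh expansion can be verified numerically with nothing beyond the standard library. The sketch below is an illustration, not the notes' method: it evaluates $j_l(x)$ from its power series (adequate for moderate $x$), generates $P_l$ by the usual three-term recurrence, and compares the truncated sum with the plane wave. The function names and the truncation at $l = 40$ are assumptions chosen for the example.

```python
import cmath


def spherical_bessel_j(l, x, terms=80):
    """Spherical Bessel function j_l(x) from its power series (moderate x only)."""
    term = x ** l
    for odd in range(2 * l + 1, 0, -2):  # divide by (2l+1)!!
        term /= odd
    total = 0.0
    for k in range(terms):
        total += term
        term *= -x * x / (2.0 * (k + 1) * (2 * l + 2 * k + 3))
    return total


def legendre(lmax, c):
    """Legendre polynomials P_0(c) .. P_lmax(c) from the three-term recurrence."""
    p = [1.0, c]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * c * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]


def rayleigh_sum(kr, cos_theta, lmax=40):
    """Truncated partial-wave sum for exp(i k r cos(theta))."""
    p = legendre(lmax, cos_theta)
    return sum((1j ** l) * (2 * l + 1) * spherical_bessel_j(l, kr) * p[l]
               for l in range(lmax + 1))


if __name__ == "__main__":
    kr, cos_theta = 3.0, 0.3
    print(rayleigh_sum(kr, cos_theta), cmath.exp(1j * kr * cos_theta))
```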
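In particular, the spin and orbital angular momentum eigenstates can be decomposed in terms of the spinor spherical harmonics. Thus, for the up-spin polarized incident beam one has the upper two-component spinor
$$P_l(\cos\theta)\,\chi_+ = \sqrt{\frac{4\pi}{2l+1}}\; Y^l_0(\theta,\varphi)\,\chi_+ = \frac{\sqrt{4\pi}}{2l+1}\left( \sqrt{l+1}\;\Omega^l_{l+\frac{1}{2},\frac{1}{2}} - \sqrt{l}\;\Omega^l_{l-\frac{1}{2},\frac{1}{2}} \right) \tag{1927}$$
and for the down-spin beam
$$P_l(\cos\theta)\,\chi_- = \sqrt{\frac{4\pi}{2l+1}}\; Y^l_0(\theta,\varphi)\,\chi_- = \frac{\sqrt{4\pi}}{2l+1}\left( \sqrt{l+1}\;\Omega^l_{l+\frac{1}{2},-\frac{1}{2}} + \sqrt{l}\;\Omega^l_{l-\frac{1}{2},-\frac{1}{2}} \right) \tag{1928}$$
Hence, the unscattered beams have the asymptotic forms
$$\exp\left[ i k r \cos\theta \right]\chi_+ \to \sqrt{4\pi}\,\sum_l i^l\,\frac{\cos\left(kr - (l+1)\frac{\pi}{2}\right)}{kr}\left( \sqrt{l+1}\;\Omega^l_{l+\frac{1}{2},\frac{1}{2}} - \sqrt{l}\;\Omega^l_{l-\frac{1}{2},\frac{1}{2}} \right) \tag{1929}$$
and
$$\exp\left[ i k r \cos\theta \right]\chi_- \to \sqrt{4\pi}\,\sum_l i^l\,\frac{\cos\left(kr - (l+1)\frac{\pi}{2}\right)}{kr}\left( \sqrt{l+1}\;\Omega^l_{l+\frac{1}{2},-\frac{1}{2}} + \sqrt{l}\;\Omega^l_{l-\frac{1}{2},-\frac{1}{2}} \right) \tag{1930}$$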
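Although the coefficients $A_\kappa$ of the exact wave function are as yet unknown, they can be determined by requiring that the scattered spherical wave does not contain terms proportional to
$$\frac{\exp\left[ -i k r \right]}{r} \tag{1931}$$
which would represent an incoming spherical wave. This requirement leads to the outgoing spherical wave having a spin-up component given by
$$\frac{\sqrt{4\pi}}{2ik}\sum_l\left[ \frac{(l+1)}{\sqrt{2l+1}}\left( \exp[ 2i\delta_{-l-1}(k) ] - 1 \right) + \frac{l}{\sqrt{2l+1}}\left( \exp[ 2i\delta_l(k) ] - 1 \right) \right] Y^l_0(\theta,\varphi)\;\frac{\exp\left[ i k r \right]}{r} \tag{1932}$$
and a spin-down component given by
$$\frac{\sqrt{4\pi}}{2ik}\sum_l\sqrt{\frac{l(l+1)}{2l+1}}\left[ \left( \exp[ 2i\delta_{-l-1}(k) ] - 1 \right) - \left( \exp[ 2i\delta_l(k) ] - 1 \right) \right] Y^l_1(\theta,\varphi)\;\frac{\exp\left[ i k r \right]}{r} \tag{1933}$$
In the above expressions, the index on the phase shifts $\delta_\kappa(k)$ refers to the value of $\kappa$. Hence, for a spin-up polarized incident beam, the scattering amplitudes are given in terms of the phase shifts via
$$f(\theta) = \frac{\sqrt{4\pi}}{2ik}\sum_l\left[ \frac{(l+1)}{\sqrt{2l+1}}\left( \exp[ 2i\delta_{-l-1}(k) ] - 1 \right) + \frac{l}{\sqrt{2l+1}}\left( \exp[ 2i\delta_l(k) ] - 1 \right) \right] Y^l_0(\theta,\varphi) \tag{1934}$$
and
$$g(\theta)\exp[ i\varphi ] = \frac{\sqrt{4\pi}}{2ik}\sum_l\sqrt{\frac{l(l+1)}{2l+1}}\left[ \left( \exp[ 2i\delta_{-l-1}(k) ] - 1 \right) - \left( \exp[ 2i\delta_l(k) ] - 1 \right) \right] Y^l_1(\theta,\varphi) \tag{1935}$$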
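To make the partial-wave sums concrete, the sketch below evaluates $f(\theta)$ and $g(\theta)$ from a user-supplied table of phase shifts. It is a hedged illustration: the phase shifts are hypothetical inputs, the function names are not from the notes, and the spherical harmonics are assumed to follow the Condon-Shortley convention, so that $\sqrt{4\pi/(2l+1)}\,Y^l_0 = P_l(\cos\theta)$ and $Y^l_1 = -\sqrt{\frac{2l+1}{4\pi\,l(l+1)}}\,\sin\theta\,P_l'(\cos\theta)\,e^{i\varphi}$.

```python
import cmath
import math


def legendre_and_derivative(lmax, x):
    """Lists P_l(x) and P_l'(x) for l = 0..lmax from standard recurrences."""
    P = [1.0, x]
    dP = [0.0, 1.0]
    for l in range(1, lmax):
        P.append(((2 * l + 1) * x * P[l] - l * P[l - 1]) / (l + 1))
        dP.append(dP[l - 1] + (2 * l + 1) * P[l])
    return P[: lmax + 1], dP[: lmax + 1]


def mott_amplitudes(k, theta, delta, lmax):
    """Return (f, g) at angle theta; delta(kappa) supplies the phase shifts."""
    P, dP = legendre_and_derivative(lmax, math.cos(theta))
    f = 0j
    g = 0j
    for l in range(lmax + 1):
        up = cmath.exp(2j * delta(-l - 1)) - 1                 # kappa = -(l+1), j = l + 1/2
        down = cmath.exp(2j * delta(l)) - 1 if l > 0 else 0j   # kappa = l, j = l - 1/2
        f += ((l + 1) * up + l * down) * P[l]
        g -= (up - down) * math.sin(theta) * dP[l]             # sign from the assumed Y^l_1 convention
    return f / (2j * k), g / (2j * k)


if __name__ == "__main__":
    def delta(kappa):
        # Hypothetical phase shifts with a small spin-orbit splitting.
        l = kappa if kappa > 0 else -kappa - 1
        return 0.4 / (l + 1) + (0.05 if kappa < 0 else -0.05)

    print(mott_amplitudes(1.5, 0.9, delta, lmax=3))
```

When the phase shifts depend only on $l$, the spin-flip amplitude vanishes and $f(\theta)$ reduces to the familiar non-relativistic sum $\frac{1}{k}\sum_l (2l+1)\,e^{i\delta_l}\sin\delta_l\,P_l(\cos\theta)$.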
12.12
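$$\psi(\underline{r}, t) = \begin{pmatrix} \phi^A(\underline{r}) \\ \phi^B(\underline{r}) \end{pmatrix}\exp\left[ -i\,\frac{E\,t}{\hbar} \right] \tag{1937}$$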
(1938)
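$$\begin{aligned} ( E - m c^2 )\,\phi^A(\underline{r}) &= c\;\underline{\sigma}\cdot\left( \hat{\underline{p}} - \frac{q}{c}\,\underline{A} \right)\phi^B(\underline{r}) \\ ( E + m c^2 )\,\phi^B(\underline{r}) &= c\;\underline{\sigma}\cdot\left( \hat{\underline{p}} - \frac{q}{c}\,\underline{A} \right)\phi^A(\underline{r}) \end{aligned} \tag{1939}$$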
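Substituting the expression for $\phi^B$ from the second equation into the first, one obtains the second-order differential equation for $\phi^A$
$$\begin{aligned} ( E^2 - m^2 c^4 )\,\phi^A(\underline{r}) &= c^2\left[ \underline{\sigma}\cdot\left( \hat{\underline{p}} - \frac{q}{c}\,\underline{A} \right) \right]^2 \phi^A(\underline{r}) \\ &= c^2\left[ \left( \hat{\underline{p}} - \frac{q}{c}\,\underline{A} \right)^2 - \frac{q\,\hbar}{c}\;\underline{\sigma}\cdot\underline{B} \right]\phi^A(\underline{r}) \\ &= \left[ \hat{p}^2 c^2 + q^2 B^2 x^2 - 2\,q\,\hat{p}_y\,c\,B\,x - q\,c\,\hbar\,B\,\sigma^{(z)} \right]\phi^A(\underline{r}) \end{aligned} \tag{1940}$$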
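$$\left[ -\hbar^2 c^2\,\frac{\partial^2}{\partial x^2} + ( c\,\hbar\,k_y - q\,B\,x )^2 - q\,c\,\hbar\,B\,\sigma^{(z)} \right]\phi^A(x) = ( E^2 - m^2 c^4 - c^2\hbar^2 k_z^2 )\,\phi^A(x) \tag{1942}$$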
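The equations decouple if the two-component spinor $\phi^A(x)$ can be taken to be an eigenstate of the z-component of the spin operator
$$\phi^A(x) = f_\sigma(x)\,\chi_\sigma \tag{1943}$$
where
$$\sigma^{(z)}\,\chi_\sigma = \sigma\,\chi_\sigma \tag{1944}$$
in which the eigenvalues of $\sigma^{(z)}$ are denoted by $\sigma$. Therefore, the eigenvalue equation can be reduced to
$$\left[ -\hbar^2 c^2\,\frac{\partial^2}{\partial x^2} + ( q B )^2\left( x - \frac{c\,\hbar\,k_y}{q\,B} \right)^2 \right] f_\sigma(x) = ( E^2 - m^2 c^4 - c^2\hbar^2 k_z^2 + q\,c\,\hbar\,B\,\sigma )\, f_\sigma(x) \tag{1945}$$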
which (apart from an overall scale factor) is formally equivalent$^{126}$ to the (non-relativistic) energy eigenvalue equation for a shifted harmonic oscillator, with frequency $\omega_{HO} = 2\,c\,|q|\,B$. The modulus sign was inserted to ensure that the frequency $\omega_{HO}$ is positive. The energy eigenvalues are determined from
$$( E^2 - m^2 c^4 - c^2\hbar^2 k_z^2 + q\,c\,\hbar\,B\,\sigma ) = 2\,|q|\,c\,\hbar\,B\,( n + \frac{1}{2} ) \tag{1946}$$
Hence, for an electron with negative charge $q = -|e|$, one finds that the positive-energy eigenvalue is given by the solution
$$E = c\,\sqrt{ m^2 c^2 + \hbar^2 k_z^2 + ( 2n + 1 + \sigma )\,\frac{|e|\,\hbar}{c}\,B } \tag{1947}$$
This expression has an infinite degeneracy, as it is independent of the continuous variable $k_y$. It also has a discrete (two-fold) degeneracy between the levels with quantum numbers $(n, \sigma = 1)$ and $(n+1, \sigma = -1)$. The two-fold degeneracy can be understood as a consequence of the generalized helicity $\underline{\sigma}\cdot( \hat{\underline{p}} - \frac{q}{c}\,\underline{A} )$ commuting with the Hamiltonian $\hat{H}$. This results in the spin's alignment with the electron's velocity being preserved, as the spin's precession is precisely balanced by the electron's orbital precession. It should be noted that if the $g$ factor deviates from 2, and such an anomaly in the $g$ factor is expected from Quantum Electrodynamics and has been found in experiment, then this degeneracy will be lifted. The calculated $( g - 2 )$ anomaly for an electron is given by
$$\left( \frac{g - 2}{2} \right)_{Theor} = \frac{1}{2}\left( \frac{\alpha}{\pi} \right) - 0.3284986\left( \frac{\alpha}{\pi} \right)^2 + 1.17611\left( \frac{\alpha}{\pi} \right)^3 - 1.434\left( \frac{\alpha}{\pi} \right)^4 + \ldots \tag{1948}$$
where
$$\alpha = \frac{e^2}{\hbar\,c} \tag{1949}$$
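126 The equivalence follows on identifying the oscillator mass as $m_{HO} = \frac{1}{2 c^2}$ and then determining the frequency from $m_{HO}^2\,\omega_{HO}^2 = \left( \frac{q B}{c} \right)^2$.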
and differs from the theoretical value in the last two decimal places$^{127}$. In the non-relativistic limit, the expression for the relativistic energy eigenvalue reproduces the expression for energies of the well-known Landau levels
$$E \approx m c^2 + \frac{\hbar^2 k_z^2}{2m} + \left( n + \frac{1 + \sigma}{2} \right)\frac{|e|\,\hbar\,B}{m\,c} \tag{1951}$$
which are doubly-degenerate.
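The level structure is easy to explore numerically. The sketch below works in dimensionless units that are an assumption of this example: energies in units of $m c^2$, $k_z$ in units of $m c/\hbar$, and the field through $b = |e|\hbar B/(m^2 c^3)$. It evaluates the relativistic levels, exhibits the degeneracy between $(n, \sigma = 1)$ and $(n+1, \sigma = -1)$, and compares with the non-relativistic Landau expression for a weak field.

```python
import math


def landau_energy(n, sigma, kz, b):
    """Relativistic level E / (m c^2) for quantum numbers n and sigma = +1 or -1."""
    return math.sqrt(1.0 + kz * kz + (2 * n + 1 + sigma) * b)


def landau_energy_nonrel(n, sigma, kz, b):
    """Non-relativistic limit, including the rest energy, in the same units."""
    return 1.0 + 0.5 * kz * kz + (n + 0.5 * (1 + sigma)) * b


if __name__ == "__main__":
    b, kz = 1e-4, 1e-2  # hypothetical weak field and small momentum
    for n in range(3):
        for sigma in (-1, +1):
            print(n, sigma, landau_energy(n, sigma, kz, b), landau_energy_nonrel(n, sigma, kz, b))
```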
12.13
(1953)
(1954)
where the prime indicates differentiation with respect to $\phi$. The classical vector potential must satisfy the source-free wave equation
$$\partial_\mu\partial^\mu A^\nu = k_\mu k^\mu\, A^\nu(\phi)'' = 0 \tag{1955}$$
(1956)
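$$\left[ -\hbar^2\,\partial_\mu\partial^\mu - 2\,i\,\hbar\,\frac{q}{c}\,A^\mu\partial_\mu + \frac{q^2}{c^2}\,A^\mu A_\mu - m^2 c^2 - i\,\hbar\,\frac{q}{c}\,\gamma^\mu k_\mu\,\gamma^\nu A_\nu(\phi)' \right]\psi = 0 \tag{1957}$$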
127 This discrepancy could indicate the importance of virtual processes in which heavy particle/antiparticle pairs are created. The $(g - 2)$ anomalies for the muon and its antiparticle have also been measured [G. W. Bennett et al., Phys. Rev. Lett. 92, 161802 (2004)]. These experiments show that particles and antiparticles precess at the same rate. However, the value of the $(g - 2)$ anomaly is inconsistent with the theoretical prediction based on the standard model of particle physics.
where $\psi$ is the four-component Dirac spinor. In deriving this, the Lorenz gauge condition has been used to rewrite
,
A
= A
g
A
(1958)
in the diagonal terms.
Following Volkow$^{128}$, the solution of the second-order differential equation can be found in the form
$$\psi = \exp\left[ -i\,\frac{p_\mu x^\mu}{\hbar} \right] F(\phi) \tag{1959}$$
where $p^\mu$ is a four-vector and $F(\phi)$ is a four-component spinor. This form reduces to the form of a free particle solution when $A^\mu \to 0$, in which case $p^\mu$ becomes the momentum of the free particle. The exponential form is unaltered when the vector potential is non-zero, since arbitrary multiples of the electromagnetic wave vector $k^\mu$ can be added to the momentum of the free particle, in which case $p^\mu$ has a different interpretation. For a transverse polarized vector potential describing an electromagnetic wave travelling in the $\underline{e}_3$ direction, the operators
$$\hat{p}^1 = i\,\hbar\,\partial^1, \qquad \hat{p}^2 = i\,\hbar\,\partial^2 \tag{1960}$$
commute with the time-dependent Dirac Hamiltonian and are constants of motion. Although the particle's energy and momentum operators do not commute with the Hamiltonian, as these quantities are not conserved due to the interaction with the field, the quantity
$$\hat{p}^3 - \hat{p}^0 = i\,\hbar\left( \partial^3 - \partial^0 \right) \tag{1961}$$
does commute with the Hamiltonian and, therefore, is conserved. The conservation of this quantity can be interpreted in terms of the energy absorbed or emitted by the electron due to interaction with the classical electromagnetic field being accompanied by the absorption or emission of a similar amount of momentum$^{129}$. Despite the different interpretation of $p^\mu$ in the presence of the classical field, the four-vector $p^\mu$ shall be chosen to satisfy the condition
$$p^\mu p_\mu = m^2 c^2 \tag{1962}$$
128 D. M. Volkow, Zeit. für Physik, 94, 25 (1935).
129 For the quantized electromagnetic field, the absorption of a photon involves the absorption of the energy and momentum given by the four-vector $\hbar k^\mu$, where $k^\mu = (k, 0, 0, k)$.
130 If the condition on $p^\mu$ is dropped, the function $F(\phi)$ will acquire an overall phase factor that depends linearly on $\phi$ and on the constant value of $p^\mu p_\mu - m^2 c^2$.
(1963)
since $A^\mu$ satisfies the Lorenz gauge condition and $k^\mu$ satisfies the dispersion relation for electromagnetic waves in vacuum. On substituting the ansatz into the second-order equation, using the above two equations and the choice of $p^\mu$ satisfying the free-electron dispersion relation, one finds that the second-order equation reduces to a first-order differential equation for the spinor $F(\phi)$
$$2\,i\,\hbar\,p^\mu k_\mu\, F(\phi)' = \left[ 2\,\frac{q}{c}\,A^\mu p_\mu - \frac{q^2}{c^2}\,A^\mu A_\mu + i\,\hbar\,\frac{q}{c}\,\gamma^\mu k_\mu\,\gamma^\nu A_\nu(\phi)' \right] F(\phi) \tag{1964}$$
which only depends on $\phi$, since the exponential phase factor which depends on $p_\mu x^\mu$ has been factored out. The first-order equation can be integrated w.r.t. $\phi$ to yield
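$$F(\phi) = \exp\left[ -\frac{i\,q}{\hbar\,c\;p^\rho k_\rho}\int_0^\phi d\phi'\left( p^\mu A_\mu(\phi') - \frac{1}{2}\,\frac{q}{c}\,A^\mu(\phi')\,A_\mu(\phi') \right) + \frac{q}{c}\,\frac{\gamma^\mu k_\mu\,\gamma^\nu A_\nu}{2\,p^\rho k_\rho} \right] F(0) \tag{1965}$$
where $F(0)$ is an arbitrary constant four-component spinor. The exponential of the matrix is defined in terms of its series expansion.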
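$$F(\phi) = \exp\left[ -\frac{i\,q}{\hbar\,c\;p^\rho k_\rho}\int_0^\phi d\phi'\left( p^\mu A_\mu(\phi') - \frac{1}{2}\,\frac{q}{c}\,A^\mu(\phi')\,A_\mu(\phi') \right) \right]\exp\left[ \frac{q}{c}\,\frac{\gamma^\mu k_\mu\,\gamma^\nu A_\nu}{2\,p^\rho k_\rho} \right] F(0) \tag{1966}$$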
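The above form can be simplified by expanding the last exponential factor due to the identity
$$\left( \gamma^\mu k_\mu\,\gamma^\nu A_\nu \right)^n = 0 \tag{1967}$$
for all integers $n$ such that $n > 1$. The identity can be proved by
$$\begin{aligned} \gamma^\mu k_\mu\,\gamma^\nu A_\nu\,\gamma^\rho k_\rho\,\gamma^\sigma A_\sigma &= -\,\gamma^\mu k_\mu\,\gamma^\rho k_\rho\,\gamma^\nu A_\nu\,\gamma^\sigma A_\sigma + 2\,g^{\nu,\rho}\,A_\nu k_\rho\;\gamma^\mu k_\mu\,\gamma^\sigma A_\sigma \\ &= -\,\gamma^\mu k_\mu\,\gamma^\rho k_\rho\,\gamma^\nu A_\nu\,\gamma^\sigma A_\sigma \end{aligned} \tag{1968}$$
where the first line follows by using the anti-commutation relations for the $\gamma$ matrices and the second line follows from applying the Lorenz gauge condition. The expression can be further simplified by noting that on anti-commuting the first pair of $\gamma$ matrices, one has
$$\begin{aligned} &\;\gamma^\mu\gamma^\rho\,k_\mu k_\rho\;\gamma^\nu A_\nu\,\gamma^\sigma A_\sigma \\ =&\;-\,\gamma^\rho\gamma^\mu\,k_\mu k_\rho\;\gamma^\nu A_\nu\,\gamma^\sigma A_\sigma + 2\,g^{\mu,\rho}\,k_\mu k_\rho\;\gamma^\nu A_\nu\,\gamma^\sigma A_\sigma \\ =&\;-\,\gamma^\rho\gamma^\mu\,k_\mu k_\rho\;\gamma^\nu A_\nu\,\gamma^\sigma A_\sigma \\ =&\;-\,\gamma^\mu\gamma^\rho\,k_\mu k_\rho\;\gamma^\nu A_\nu\,\gamma^\sigma A_\sigma \end{aligned} \tag{1969}$$
the third line follows from the condition $k^\mu k_\mu = 0$ and the last line follows from interchanging the first two pairs of summation indices. On comparing the first and last lines, one notes that the right-hand side is zero. Therefore, one has proved the identity
$$\gamma^\mu k_\mu\,\gamma^\nu A_\nu\;\gamma^\rho k_\rho\,\gamma^\sigma A_\sigma = 0 \tag{1970}$$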
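$$F(\phi) = \exp\left[ -\frac{i\,q}{\hbar\,c\;p^\rho k_\rho}\int_0^\phi d\phi'\left( p^\mu A_\mu(\phi') - \frac{1}{2}\,\frac{q}{c}\,A^\mu(\phi')\,A_\mu(\phi') \right) \right]\left( I + \frac{q}{c}\,\frac{\gamma^\mu k_\mu\,\gamma^\nu A_\nu}{2\,p^\rho k_\rho} \right) F(0) \tag{1971}$$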
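The nilpotency of $\gamma^\mu k_\mu\,\gamma^\nu A_\nu$, and the resulting truncation of the matrix exponential after the linear term, can be confirmed with explicit matrices. The sketch below is illustrative only: it uses the standard representation of the $\gamma$ matrices with metric signature $(+,-,-,-)$, a light-like $k^\mu = (k, 0, 0, k)$ and a transverse $A^\mu$, all written with plain Python lists. The helper names and the numerical values are assumptions of the example.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def neg(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """4x4 matrix assembled from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Standard representation: gamma^0 = diag(I, -I), gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]


def slash(v):
    """gamma^mu v_mu for contravariant components v = (v0, v1, v2, v3)."""
    signs = (1, -1, -1, -1)
    out = [[0j] * 4 for _ in range(4)]
    for mu in range(4):
        for i in range(4):
            for j in range(4):
                out[i][j] += signs[mu] * v[mu] * GAMMA[mu][i][j]
    return out


def identity4():
    return [[1.0 + 0j if i == j else 0j for j in range(4)] for i in range(4)]


def exp_series(m, terms=12):
    """Matrix exponential from its truncated power series."""
    result = identity4()
    power = identity4()
    for n in range(1, terms):
        power = [[x / n for x in row] for row in matmul(power, m)]
        result = [[result[i][j] + power[i][j] for j in range(4)] for i in range(4)]
    return result


def max_abs(m):
    return max(abs(x) for row in m for x in row)


if __name__ == "__main__":
    k = (1.3, 0.0, 0.0, 1.3)    # light-like wave vector along e3
    A = (0.0, 0.4, -0.7, 0.0)   # transverse vector potential
    X = matmul(slash(k), slash(A))
    print(max_abs(matmul(X, X)))  # vanishes to rounding error
```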
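Hence, the spinor solution of the second-order differential equation can be expressed as
$$\psi(x) = \exp\left[ i\,\frac{S}{\hbar} \right]\left( I + \frac{q}{c}\,\frac{\gamma^\mu k_\mu\,\gamma^\nu A_\nu}{2\,p^\rho k_\rho} \right) F(0) \tag{1972}$$
where $S$ is given by
$$S = -\,p_\mu x^\mu - \frac{q}{c\;p^\rho k_\rho}\int_0^\phi \left( p^\mu A_\mu(\phi') - \frac{1}{2}\,\frac{q}{c}\,A^\mu(\phi')\,A_\mu(\phi') \right) d\phi' \tag{1973}$$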
(1975)
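Therefore, one demands that $F(0)$ satisfies the above supplementary condition, which is the same as for a free particle. Hence, one can set
$$F(0) = N_F\begin{pmatrix} \chi \\ \frac{\underline{\sigma}\cdot\underline{p}}{p^{(0)} + m c}\,\chi \end{pmatrix} \tag{1976}$$
where the normalization constant is given by
$$N_F = \sqrt{\frac{p^{(0)} + m c}{2\,p^{(0)}\,V}} \tag{1977}$$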
(1977)
j = c
Since the Dirac adjoint spinor is given by
q A k
S
= F (0) I +
exp
i
c
2 p k
h
the current density is evaluated as
c
q
q p A
q 2 A A
j = (0)
p
A + k
c
c k p
c2 2 k p
p V
(1978)
(1979)
(1980)
j = (0)
p k
(1981)
c2 2 k p
p V
which shows that the electromagnetic wave does not drop out from timeaveraged
quantities.
12.14
(1982)
where the $\gamma^\mu$ matrices are any set of matrices which satisfy the anti-commutation relations
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\,g^{\mu,\nu}\,I \tag{1983}$$
131 L. V. Keldysh, Zh. Eksp. Teor. Fiz. 47, 1945 (1964). [Sov. Phys. J.E.T.P. 20, 1307
(1965).]
F. H. M. Faisal, J. Phys. B 6, L89 (1973).
H. R. Reiss, Phys. Rev. A 22, 1786 (1980).
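The Dirac equation is independent of the specific representation of the $\gamma$ matrices. We have chosen the representation
$$\gamma^{(0)} = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{1984}$$
and
$$\gamma^{(i)} = \begin{pmatrix} 0 & \sigma^{(i)} \\ -\sigma^{(i)} & 0 \end{pmatrix} \tag{1985}$$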
0 = U
(1987)
= 0 (0)0
(1988)
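$$\gamma^{(0)\prime} = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \tag{1991}$$
and
$$\gamma^{(i)\prime} = \begin{pmatrix} 0 & \sigma^{(i)} \\ -\sigma^{(i)} & 0 \end{pmatrix} \tag{1992}$$
The components of the wave function in the chiral representation $\psi'$ are denoted as
$$\psi' = \begin{pmatrix} \phi_L' \\ \phi_R' \end{pmatrix} \tag{1993}$$
$$\begin{pmatrix} \phi_L' \\ \phi_R' \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} \phi^A - \phi^B \\ \phi^A + \phi^B \end{pmatrix} \tag{1994}$$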
The chiral representation is particularly useful for the description of massless spin one-half particles, such as might be the case for the neutrino. The neutrino masses are extremely small. The masses have evaded direct experimental measurement. However, direct measurements have set upper limits on the masses which decrease with time$^{132}$. In this case, with the limit $m \to 0$, the Dirac equation takes the form
$$\begin{pmatrix} 0 & \frac{\partial}{\partial t} + c\;\underline{\sigma}\cdot\underline{\nabla} \\ \frac{\partial}{\partial t} - c\;\underline{\sigma}\cdot\underline{\nabla} & 0 \end{pmatrix}\begin{pmatrix} \phi_L' \\ \phi_R' \end{pmatrix} = 0 \tag{1995}$$
Hence, the Dirac equation for a massless free particle reduces to two uncoupled equations, each of which is an equation proposed by Weyl$^{133}$
$$\left( \frac{\partial}{\partial t} + c\;\underline{\sigma}\cdot\underline{\nabla} \right)\phi_R' = 0 \tag{1996}$$
and
$$\left( \frac{\partial}{\partial t} - c\;\underline{\sigma}\cdot\underline{\nabla} \right)\phi_L' = 0 \tag{1997}$$
The Weyl equation describes a spin one-half massless particle by a two-component spinor wave function. The Weyl equation violates parity invariance. The Weyl equation was considered to be unphysical until the discovery of the (anti-)neutrino$^{134}$ and the associated violation of parity invariance$^{135}$. After the parity violation of the weak interaction was established, the Weyl equation was adopted to describe the neutrino$^{136}$.
Inexplicably, nature seems to have selected the Weyl equation for $\phi_L$, but not $\phi_R$, to describe neutrinos. The solutions of the Weyl equation for free particles
$$\left( \frac{\partial}{\partial t} - c\;\underline{\sigma}\cdot\underline{\nabla} \right)\phi_L = 0 \tag{1998}$$
can be written as
$$\phi_L = \frac{1}{\sqrt{V}}\begin{pmatrix} u^{(0)} \\ u^{(1)} \end{pmatrix}\exp\left[ -\frac{i}{\hbar}\left( E\,t - \underline{p}\cdot\underline{r} \right) \right] \tag{1999}$$
132 L. Langer and R. Moffat, Phys. Rev. 88, 689 (1952).
V. A. Lyubimov, F. G. Novikov, V. Z. Nozik, F. F. Tretyakov, and V. S. Kosik, Phys. Lett. 94B, 266 (1980).
A. I. Belesev et al., Phys. Lett. 350, 263 (1995).
133 H. Weyl, Zeit. für Physik, 56, 330 (1929).
134 C. L. Cowan Jr., F. Reines, F. B. Harrison, H. W. Kruse and A. D. McGuire, Science 124, 103 (1956).
F. Reines and C. L. Cowan Jr., Phys. Rev. 113, 273 (1959).
135 C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes and R. F. Hudson, Phys. Rev. 108, 1413 (1957).
136 T. D. Lee and C. N. Yang, Phys. Rev. 105, 1671 (1957).
A. Salam, Nuovo Cimento 5, 299 (1957).
L. Landau, Nuclear Phys. 3, 127 (1957).
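Since helicity is conserved, one can choose the direction of $\underline{p}$ as the axis of quantization. The positive-energy solution is given by
$$\phi_L = \frac{1}{\sqrt{V}}\begin{pmatrix} 0 \\ 1 \end{pmatrix}\exp\left[ -\frac{i}{\hbar}\left( E\,t - p\,z \right) \right] \tag{2000}$$
which has negative helicity and has energy given by
$$E = c\,p \tag{2001}$$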
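The negative-energy solution is given by
$$\phi_L = \frac{1}{\sqrt{V}}\begin{pmatrix} 1 \\ 0 \end{pmatrix}\exp\left[ -\frac{i}{\hbar}\left( E\,t - p\,z \right) \right] \tag{2002}$$
which has positive helicity and has energy given by
$$E = -\,c\,p \tag{2003}$$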
This negative-energy solution will describe antiparticles. The Weyl equation for $\phi_R$ has a positive-energy solution with positive helicity, and a negative-energy solution with negative helicity. Since only neutrinos with negative helicity are observed in nature, only $\phi_L$ is needed. The antineutrinos have positive helicity
and are represented by R .
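The link between the sign of the energy and the helicity follows directly from the Weyl equations, which in momentum space read $E\,\phi_L = -c\,\underline{\sigma}\cdot\underline{p}\,\phi_L$ and $E\,\phi_R = +c\,\underline{\sigma}\cdot\underline{p}\,\phi_R$. The sketch below is an illustration with assumed function names and with $c = 1$ by default: it applies these $2 \times 2$ Hamiltonians to helicity eigenspinors for an arbitrary direction of $\underline{p}$.

```python
import cmath
import math


def sigma_dot_unit(theta, phi):
    """Pauli matrix along the unit vector with polar angles (theta, phi)."""
    return [[math.cos(theta), math.sin(theta) * cmath.exp(-1j * phi)],
            [math.sin(theta) * cmath.exp(+1j * phi), -math.cos(theta)]]


def helicity_spinor(theta, phi, helicity):
    """Eigenspinor of sigma.p/|p| with eigenvalue helicity = +1 or -1."""
    half = theta / 2
    if helicity == +1:
        return [math.cos(half) * cmath.exp(-0.5j * phi), math.sin(half) * cmath.exp(+0.5j * phi)]
    return [-math.sin(half) * cmath.exp(-0.5j * phi), math.cos(half) * cmath.exp(+0.5j * phi)]


def weyl_hamiltonian(chirality, p, theta, phi, c=1.0):
    """H = -c sigma.p for chirality 'L' and H = +c sigma.p for chirality 'R'."""
    sign = -1.0 if chirality == "L" else +1.0
    s = sigma_dot_unit(theta, phi)
    return [[sign * c * p * s[i][j] for j in range(2)] for i in range(2)]


def apply(matrix, spinor):
    return [sum(matrix[i][j] * spinor[j] for j in range(2)) for i in range(2)]


def weyl_energy(chirality, helicity, p, c=1.0):
    """Energy eigenvalue of the helicity eigenstate."""
    return (-helicity if chirality == "L" else helicity) * c * p


if __name__ == "__main__":
    print(helicity_spinor(0.0, 0.0, -1))   # the spinor (0, 1) for p along z
    print(weyl_energy("L", -1, p=2.0))     # positive energy pairs with negative helicity
```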
Figure 61: The dispersion relations for $\phi_L$ and $\phi_R$. The elementary excitations are the negative-helicity neutrino $\nu$ and a positive-helicity antineutrino $\bar{\nu}$.
The Neutrino
The neutrino was postulated by Pauli to balance energy and momentum
conservation in beta decay. In beta decay, it had been observed that neutron
(2004)
[Plot of $[N(p)/(F\,p^2)]^{1/2}$ against Energy [keV].]
Figure 62: The Fermi-Kurie plot of the energy distribution of the electrons emitted in the beta decay of tritium, $^3\mathrm{H} \to {}^3\mathrm{He} + e^- + \bar{\nu}_e$. The decay releases 18.1 keV. It is seen that the electrons produced in the decay process have a non-zero probability for carrying off most of the released energy. Hence, one concludes that the antineutrinos are almost massless. The dashed blue curve is the curve expected if the neutrino had a mass of 3 keV.
distribution is based on the phase space available for the emission of the electron and antineutrino$^{138}$. The joint phase-space available for the electron of four-momentum $(E_e/c, \underline{p})$ and the antineutrino of four-momentum $(E_\nu/c, \underline{q})$ is proportional to the factor
$$\begin{aligned} d\Gamma &= \int dp\;p^2\int dq\;q^2\;\delta( E - E_e(p) - E_\nu(q) ) \\ &= \int dE_e(p)\;\frac{p\,E_e(p)}{c^2}\int dE_\nu(q)\;\frac{q\,E_\nu(q)}{c^2}\;\delta( E - E_e(p) - E_\nu(q) ) \\ &= \frac{1}{c^5}\int dE_e(p)\;p\;E_e(p)\left[ \sqrt{E_\nu(q)^2 - m_\nu^2 c^4}\;E_\nu(q) \right]_{E_\nu(q) = E - E_e(p)} \\ &= \frac{1}{c^5}\int dE_e(p)\;p\;E_e(p)\;\sqrt{( E - E_e(p) )^2 - m_\nu^2 c^4}\;( E - E_e(p) ) \end{aligned} \tag{2005}$$
137 L. Langer and R. Moffat, Phys. Rev. 88, 689 (1952).
V. A. Lyubimov, F. G. Novikov, V. Z. Nozik, F. F. Tretyakov, and V. S. Kosik, Phys. Lett. 94B, 266 (1980).
A. I. Belesev et al., Phys. Lett. 350, 263 (1995).
138 E. Fermi, Zeit. für Physik, 88, 161 (1934).
F. N. D. Kurie, J. R. Richardson and H. C. Paxton, Phys. Rev. 48, 167 (1935).
where, since the antineutrino's trajectory is unobservable, its momentum is integrated over. This phase-space factor partially governs the energy distribution of the emitted electrons. The second to last factor in the accessible volume of phase-space contains the dependence on the antineutrino's mass $m_\nu$ and it is this factor which is highlighted by the Fermi-Kurie plot. The plot is designed to exhibit a linear energy variation until the line cuts the E-axis, if the antineutrino is massless. On the other hand, if the antineutrino has a finite mass, the line should curve over and cut the E-axis vertically. In this case, the antineutrino mass would be determined by the difference between the linearly extrapolated intercept and the actual intercept.
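The shape of the Fermi-Kurie plot near the end point can be sketched from the antineutrino factor alone. In the illustration below, which is an assumption-laden sketch rather than a fit to data, the ordinate is taken as the square root of $\sqrt{(E - E_e)^2 - m_\nu^2 c^4}\,(E - E_e)$. This is what remains after the electron factor $p\,E_e$ (and any Coulomb correction) is divided out. Energies are in keV, and the end-point value and the mass are sample inputs.

```python
import math


def antineutrino_factor(E_total, E_e, m_nu_c2=0.0):
    """sqrt((E - E_e)^2 - (m c^2)^2) * (E - E_e), or zero beyond the end point."""
    E_nu = E_total - E_e
    if E_nu < m_nu_c2:
        return 0.0
    return math.sqrt(E_nu ** 2 - m_nu_c2 ** 2) * E_nu


def kurie_ordinate(E_total, E_e, m_nu_c2=0.0):
    """Kurie-plot ordinate: linear in E_e when the antineutrino is massless."""
    return math.sqrt(antineutrino_factor(E_total, E_e, m_nu_c2))


def end_point(E_total, m_nu_c2=0.0):
    """Largest electron energy allowed by energy conservation."""
    return E_total - m_nu_c2


if __name__ == "__main__":
    E_total = 18.1  # keV, the released energy quoted in the figure caption
    for E_e in (10.0, 14.0, 15.0, 15.1, 18.0):
        print(E_e, kurie_ordinate(E_total, E_e), kurie_ordinate(E_total, E_e, m_nu_c2=3.0))
```

For a massless antineutrino the ordinate is exactly $E - E_e$. With a finite mass it lies below that straight line and reaches zero at $E - m_\nu c^2$, where its slope diverges.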
The process of beta decay does not conserve parity. The non-conservation of parity was discovered in the experiments of C. S. Wu et al.$^{139}$. In these experiments, the spin of a $^{60}$Co nucleus was aligned with a magnetic field. The spin $S = 5\hbar$ $^{60}$Co nucleus decayed into a spin $S = 4\hbar$ $^{60}$Ni nucleus by emitting an electron and an antineutrino.
$$^{60}\mathrm{Co} \to {}^{60}\mathrm{Ni} + e^- + \bar{\nu}_e \tag{2006}$$
Since angular momentum is conserved, the spins of the electron and the antineutrino initially must both be aligned with the field. In the experiment, the angular distribution of the emitted electrons was observed. Because the helicity of the electrons is conserved, the angular distribution of the electrons can be used to prove that the electrons all have negative helicity, and hence it is inferred that the antineutrinos should have positive helicity. Since helicity should be reversed under the parity operation, and since only negative helicity electrons are observed, the process is not invariant under parity. Hence, parity is not conserved.
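The electrons that are emitted in beta decay have negative helicities. If the momentum of an emitted electron is given by $(p, \theta_p, \varphi_p)$, then its helicity operator is
$$\underline{\sigma}\cdot\hat{\underline{p}} = \begin{pmatrix} \cos\theta_p & \sin\theta_p\exp[-i\varphi_p] \\ \sin\theta_p\exp[+i\varphi_p] & -\cos\theta_p \end{pmatrix} \tag{2007}$$
The helicity operator has eigenstates given by
$$\underline{\sigma}\cdot\hat{\underline{p}}\;\chi^{\pm}_p = \pm\,\chi^{\pm}_p \tag{2008}$$
$$\chi^{+}_p = \begin{pmatrix} \cos\frac{\theta_p}{2}\exp[-i\frac{\varphi_p}{2}] \\ \sin\frac{\theta_p}{2}\exp[+i\frac{\varphi_p}{2}] \end{pmatrix}, \qquad \chi^{-}_p = \begin{pmatrix} -\sin\frac{\theta_p}{2}\exp[-i\frac{\varphi_p}{2}] \\ \cos\frac{\theta_p}{2}\exp[+i\frac{\varphi_p}{2}] \end{pmatrix} \tag{2009}$$
139 C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes and R. F. Hudson, Phys. Rev. 108, 1413 (1957).
Since angular momentum is conserved and the emitted electrons only have negative helicity, the angular distribution of the emitted electrons is proportional to the square of the overlap of the initial electron spin-up spinor with the negative helicity spinors
$$\left| \chi^{+\dagger}_{\theta=0}\;\chi^{-}_p \right|^2 = \sin^2\frac{\theta_p}{2} = \frac{1}{2}\,( 1 - \cos\theta_p ) \tag{2010}$$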
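The overlap that fixes the angular distribution can be evaluated directly from the helicity spinors. The following minimal sketch uses function names that are assumptions of the example: it projects the negative-helicity spinor for emission angles $(\theta_p, \varphi_p)$ onto the spin-up spinor along the field and compares with $\frac{1}{2}(1 - \cos\theta_p)$.

```python
import cmath
import math


def negative_helicity_spinor(theta_p, phi_p):
    """Eigenspinor of the helicity operator with eigenvalue -1."""
    return (-math.sin(theta_p / 2) * cmath.exp(-0.5j * phi_p),
            math.cos(theta_p / 2) * cmath.exp(+0.5j * phi_p))


def angular_distribution(theta_p, phi_p=0.0):
    """Squared overlap of the spin-up spinor (1, 0) with the negative-helicity spinor."""
    spin_up = (1.0, 0.0)
    chi = negative_helicity_spinor(theta_p, phi_p)
    overlap = spin_up[0] * chi[0] + spin_up[1] * chi[1]
    return abs(overlap) ** 2


if __name__ == "__main__":
    for degrees in (0, 45, 90, 135, 180):
        theta = math.radians(degrees)
        print(degrees, angular_distribution(theta), 0.5 * (1 - math.cos(theta)))
```

The distribution vanishes along the spin direction and is largest for emission opposite to it, which is the asymmetry under discussion.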
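[Figure: schematic of the decay of the Co nucleus ($S = 5\hbar$) into the Ni nucleus ($S = 4\hbar$), with the emitted $e^-$ and $\bar{\nu}_e$ each carrying $S = \hbar/2$.]
Figure 64: The angular distribution $I(\theta)$ of the emitted electron in the beta decay experiment of Wu et al.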
captures an electron from the K-shell and decays to the excited state of a $^{152}$Sm nucleus with angular momentum $J = \hbar$ and emits a neutrino.
$$^{152}\mathrm{Eu} + e^- \to {}^{152}\mathrm{Sm}^* + \nu_e \tag{2011}$$
The $J = \hbar$ excited state of Sm subsequently decays into the $J = 0$ ground state of Sm by emitting a photon.
$$^{152}\mathrm{Sm}^* \to {}^{152}\mathrm{Sm} + \gamma \tag{2012}$$
Goldhaber et al. measured the photons with the full Doppler shift, from which they were able to infer the direction of the recoil of the nucleus. The photons
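[Figure: level scheme for the electron capture by Eu, showing the Sm* excited state ($J = 1$) and the Sm ground state ($J = 0$).]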
oriented along the direction of motion of the emitted neutrino. Since the sum of the angular momentum of the excited state ($J = \hbar$) and the emitted neutrino must equal the spin of the captured electron $\frac{\hbar}{2}$, the neutrino must have its spin oriented anti-parallel to the angular momentum of the Sm* nucleus. Hence, the neutrino has negative helicity.
12.15
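$$\mathcal{L} = \bar{\psi}\left[ i\,\hbar\,c\;\gamma^\mu\left( \partial_\mu + i\,\frac{q}{c\,\hbar}\,A_\mu \right) - m c^2 \right]\psi \tag{2013}$$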
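$$\Pi_{\psi} = \frac{1}{c}\,\frac{\partial\mathcal{L}}{\partial(\partial_0\psi)} = i\,\hbar\;\bar{\psi}\,\gamma^{(0)} = i\,\hbar\;\psi^\dagger \tag{2014}$$
$$\Pi_{\psi^\dagger} = \frac{1}{c}\,\frac{\partial\mathcal{L}}{\partial(\partial_0\psi^\dagger)} = 0 \tag{2015}$$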
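The Lagrangian equation of motion is found from the variational principle which states that the action is extremal with respect to $\psi$ and $\psi^\dagger$. The condition that the action is extremal with respect to variations in $\psi^\dagger$ leads to the Dirac equation
$$i\,\hbar\;\gamma^\mu\left( \partial_\mu + i\,\frac{q}{c\,\hbar}\,A_\mu \right)\psi = m\,c\;\psi \tag{2016}$$
after the resulting equation has been multiplied by a factor of $\gamma^{(0)}$. On making a variation of the action with respect to $\psi$, one finds the Hermitean conjugate equation
$$-\,i\,\hbar\,c\left( \partial_\mu - i\,\frac{q}{c\,\hbar}\,A_\mu \right)\bar{\psi}\,\gamma^\mu - m c^2\;\bar{\psi} = 0 \tag{2017}$$
That this is the Hermitean conjugate of the Dirac equation can be shown by taking its Hermitean conjugate, which results in
$$i\,\hbar\,c\;\gamma^{\mu\dagger}\,\gamma^{(0)}\left( \partial_\mu + i\,\frac{q}{c\,\hbar}\,A_\mu \right)\psi - m c^2\;\gamma^{(0)}\,\psi = 0 \tag{2018}$$
On multiplying by $\gamma^{(0)}$ and using the identities
$$\gamma^{(0)}\,\gamma^{(0)} = I, \qquad \gamma^{(0)}\,\gamma^{\mu\dagger}\,\gamma^{(0)} = \gamma^\mu \tag{2019}$$
one recovers the Dirac equation. Hence, the equation found by varying $\psi$ is just the Hermitean conjugate of the Dirac equation
$$i\,\hbar\;\gamma^\mu\left( \partial_\mu + i\,\frac{q}{c\,\hbar}\,A_\mu \right)\psi = m\,c\;\psi \tag{2020}$$
Furthermore, it is surmised that the starting Lagrangian is appropriate to describe the Dirac field theory.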
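The Hamiltonian density $\mathcal{H}$ is determined from the Lagrangian by the usual Legendre transformation process
$$\begin{aligned} \mathcal{H} &= c\;\Pi_{\psi}\,\partial_0\psi + c\;(\partial_0\psi^\dagger)\,\Pi_{\psi^\dagger} - \mathcal{L} \\ &= i\,\hbar\,c\;\psi^\dagger\,\partial_0\psi - \mathcal{L} \\ &= i\,\hbar\,c\;\bar{\psi}\,\gamma^{(0)}\,\partial_0\psi - \mathcal{L} \\ &= \bar{\psi}\left[ -\,i\,\hbar\,c\;\underline{\gamma}\cdot\left( \underline{\nabla} - i\,\frac{q}{c\,\hbar}\,\underline{A} \right) + \left( m c^2 + q\,\gamma^{(0)} A^{(0)} \right) \right]\psi \end{aligned} \tag{2021}$$
where the relation between the covariant components of the vector potential and the contravariant components, $A_{(i)} = -\,A^{(i)}$, has been used in the last line. The result is identifiable with the Hamiltonian density that appears in the usual expression for the quantum mechanical expectation value for the energy for the Dirac electron.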
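The set of conserved quantities can be obtained from Noether's theorem. The momentum-energy tensor $T^\mu{}_\nu$ is given by
$$T^\mu{}_\nu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\partial_\nu\psi + (\partial_\nu\psi^\dagger)\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^\dagger)} - \delta^\mu{}_\nu\;\mathcal{L} \tag{2022}$$
which is evaluated as
$$\begin{aligned} T^\mu{}_\nu &= i\,\hbar\,c\;\bar{\psi}\,\gamma^\mu\,\partial_\nu\psi - \delta^\mu{}_\nu\;\mathcal{L} \\ &= i\,\hbar\,c\;\bar{\psi}\,\gamma^\mu\,\partial_\nu\psi + \delta^\mu{}_\nu\;\bar{\psi}\left[ -\,i\,\hbar\,c\;\gamma^\rho\left( \partial_\rho + i\,\frac{q}{c\,\hbar}\,A_\rho \right) + m c^2 \right]\psi \end{aligned} \tag{2023}$$
In particular, the time-like diagonal component is
$$\begin{aligned} T^0{}_0 &= \bar{\psi}\left[ -\,i\,\hbar\,c\;\underline{\gamma}\cdot\left( \underline{\nabla} - i\,\frac{q}{c\,\hbar}\,\underline{A} \right) + \left( m c^2 + q\,\gamma^{(0)} A^{(0)} \right) \right]\psi \\ &= \psi^\dagger\left[ -\,i\,\hbar\,c\;\underline{\alpha}\cdot\left( \underline{\nabla} - i\,\frac{q}{c\,\hbar}\,\underline{A} \right) + \left( \beta\,m c^2 + q\,A^{(0)} \right) \right]\psi \end{aligned} \tag{2024}$$
which is the Hamiltonian density $\mathcal{H}$. On integrating over all space, one sees that the energy of the Dirac Field is equal to the expectation value of the Dirac Hamiltonian operator
$$\int d^3r\;T^0{}_0 = \int d^3r\;\psi^\dagger\,\hat{H}\,\psi \tag{2025}$$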
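$$T^0{}_j = i\,\hbar\,c\;\bar{\psi}\,\gamma^{(0)}\,\partial_j\psi = i\,\hbar\,c\;\psi^\dagger\,\partial_j\psi = c\;\psi^\dagger\,\hat{p}_j\,\psi \tag{2026}$$
where the partial derivative has been identified with the covariant momentum operator. Hence, the contravariant component of the momentum is given by
$$T^{0,j} = -\,i\,\hbar\,c\;\psi^\dagger\,\frac{\partial}{\partial x^j}\,\psi = c\;\psi^\dagger\,\hat{p}^{(j)}\,\psi \tag{2027}$$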
xj
(2028)
(2029)
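$$\psi \to \psi' = \exp[\, i\,\alpha\, ]\;\psi \tag{2030}$$
where $\alpha$ is a constant real number. The infinitesimal global gauge transformation produces a variation in the (independent) fields
$$\delta\psi = +\,i\,\delta\alpha\;\psi \tag{2031}$$
$$\delta\psi^\dagger = -\,i\,\delta\alpha\;\psi^\dagger \tag{2032}$$
so we have
$$\begin{aligned} 0 &= \delta\mathcal{L} \\ &= \frac{\partial\mathcal{L}}{\partial\psi}\,\delta\psi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\partial_\mu(\delta\psi) + \delta\psi^\dagger\,\frac{\partial\mathcal{L}}{\partial\psi^\dagger} + \partial_\mu(\delta\psi^\dagger)\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^\dagger)} \end{aligned} \tag{2033}$$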
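After substituting the Euler-Lagrange equations for the derivatives w.r.t. the fields $\psi$ and $\psi^\dagger$, the variation is expressed as
$$0 = \partial_\mu\left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\delta\psi + \delta\psi^\dagger\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^\dagger)} \right] \tag{2034}$$
For an arbitrary gauge transformation through the fixed infinitesimal angle $\delta\alpha$, this condition becomes
$$0 = i\,\delta\alpha\;\partial_\mu\left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\psi - \psi^\dagger\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^\dagger)} \right] \tag{2035}$$
Hence, one finds that there is a current $j^\mu$ which satisfies the continuity equation
$$\partial_\mu j^\mu = 0 \tag{2036}$$
where
$$j^\mu \propto i\left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\,\psi - \psi^\dagger\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^\dagger)} \right] \tag{2037}$$
For the Dirac Lagrangian, the second term is identically zero and the first term is non-zero. Hence, on adopting a conventional normalization, the conserved current is identified as
$$j^\mu = c\;\bar{\psi}\,\gamma^\mu\,\psi \tag{2038}$$
This is the same expression for the conserved current that was previously derived for the one-electron Dirac equation. Hence, the one-particle Dirac equation yields the same expectation values and obeys the same conservation laws as the (classical) Dirac field theory.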
12.15.1
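In the limit of zero mass, the Dirac Lagrangian takes the form
$$\mathcal{L} = i\,\hbar\,c\;\bar{\psi}\,\gamma^\mu\,\partial_\mu\psi \tag{2039}$$
Starting with the standard representation and making the unitary transform
$$\hat{U} = \frac{1}{\sqrt{2}}\begin{pmatrix} I & -I \\ I & I \end{pmatrix} \tag{2040}$$
one finds that in the chiral representation the Dirac Lagrangian reduces to
$$\mathcal{L} = i\,\hbar\,c\left[ \phi_L^\dagger\,\sigma_L^\mu\,\partial_\mu\phi_L + \phi_R^\dagger\,\sigma_R^\mu\,\partial_\mu\phi_R \right] \tag{2041}$$
where $\phi_L$ and $\phi_R$ are two-component Dirac spinors and the two sets of quantities $\sigma_L^\mu$ and $\sigma_R^\mu$ are expressed in terms of the Pauli matrices as
$$\sigma_L^\mu = ( \sigma^0, -\underline{\sigma} ), \qquad \sigma_R^\mu = ( \sigma^0, \underline{\sigma} ) \tag{2042}$$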
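The massless Lagrangian is invariant under the independent global transformations
$$\phi_L \to \exp[\, i\,\alpha_L\, ]\;\phi_L, \qquad \phi_R \to \exp[\, i\,\alpha_R\, ]\;\phi_R \tag{2043}$$
where $\alpha_L$ and $\alpha_R$ are independent angles. The Lagrangian has a $U(1) \times U(1)$ gauge symmetry. The presence of a mass term would couple the two fields and reduce the gauge transformation to one in which $\alpha_R = \alpha_L$.
In the chiral representation, the Hermitean matrix defined by
$$\gamma^{(4)} = i\;\gamma^{(0)}\,\gamma^{(1)}\,\gamma^{(2)}\,\gamma^{(3)} \tag{2044}$$
takes the form
$$\gamma^{(4)} = \begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix} \tag{2045}$$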
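The general gauge transformations for the massless fermion can be expressed as the product of two independent transformations
$$\psi' = \exp\left[ i\left( \frac{\alpha_L + \alpha_R}{2} \right) I \right]\exp\left[ i\left( \frac{\alpha_R - \alpha_L}{2} \right)\gamma^{(4)} \right]\psi \tag{2046}$$
where $\psi$ is a four-component spinor
$$\psi = \begin{pmatrix} \phi_L \\ \phi_R \end{pmatrix} \tag{2047}$$
The first factor represents the usual global gauge transformation for the Dirac Lagrangian with finite mass. This transformation yields the usual conserved four-vector current $j_V^\mu$ defined by
$$j_V^\mu = c\;\bar{\psi}\,\gamma^\mu\,\psi \tag{2048}$$
The second factor is specific to the Dirac Lagrangian with zero mass. It is called the chiral transformation or axial $U(1)$ transformation. Using the anti-commutation relation
$$\{\, \gamma^{(4)}, \gamma^\mu \,\}_+ = 0 \tag{2049}$$
one can show that the exponential factor in the chiral gauge transformation has the property that
$$\gamma^\mu\,\exp\left[ i\left( \frac{\alpha_R - \alpha_L}{2} \right)\gamma^{(4)} \right] = \exp\left[ -\,i\left( \frac{\alpha_R - \alpha_L}{2} \right)\gamma^{(4)} \right]\gamma^\mu \tag{2050}$$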
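This property can be used to show that the Lagrangian is invariant under the chiral transformation because
$$\begin{aligned} \bar{\psi}'\,\gamma^\mu\,\psi' &= \psi^\dagger\,\exp\left[ -\,i\left( \frac{\alpha_R - \alpha_L}{2} \right)\gamma^{(4)} \right]\gamma^{(0)}\,\gamma^\mu\,\exp\left[ i\left( \frac{\alpha_R - \alpha_L}{2} \right)\gamma^{(4)} \right]\psi \\ &= \psi^\dagger\,\gamma^{(0)}\,\gamma^\mu\,\psi \end{aligned} \tag{2051}$$
which involves two commutations. Since the massless Dirac Lagrangian is invariant under the chiral transformation, Noether's theorem shows that the current
$$j_A^\mu = c\;\bar{\psi}\,\gamma^\mu\,\gamma^{(4)}\,\psi \tag{2052}$$
is conserved. Its time-like component is
$$\begin{aligned} j_A^{(0)} &= c\;\bar{\psi}\,\gamma^{(0)}\,\gamma^{(4)}\,\psi \\ &= c\;\psi^\dagger\,\gamma^{(4)}\,\psi \\ &= c\left( -\,\phi_L^\dagger\,\phi_L + \phi_R^\dagger\,\phi_R \right) \end{aligned} \tag{2053}$$
which, up to the factor of $c$, is the difference between the number of particles with positive helicity and the number of particles with negative helicity.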
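The algebra behind the chiral invariance can be checked with explicit $4 \times 4$ matrices. The sketch below is an illustration in one common chiral representation, $\gamma^{(0)} = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$ and $\gamma^{(i)} = \begin{pmatrix} 0 & \sigma^{(i)} \\ -\sigma^{(i)} & 0 \end{pmatrix}$, which is taken here as an assumption. It constructs $\gamma^{(4)}$, verifies the anti-commutation with every $\gamma^\mu$, and confirms that the chiral rotation leaves $\gamma^{(0)}\gamma^\mu$ unchanged while it alters the matrix $\gamma^{(0)}$ that appears in a mass term.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def neg(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """4x4 matrix assembled from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Chiral representation (assumed): gamma^0 off-diagonal, gamma^i as in the standard one.
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in PAULI]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

product = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
GAMMA4 = [[1j * x for x in row] for row in product]   # gamma^(4) = i g0 g1 g2 g3


def chiral_rotation(angle):
    """exp(i angle gamma4) = cos(angle) I + i sin(angle) gamma4, since gamma4^2 = I."""
    return [[math.cos(angle) * IDENTITY[i][j] + 1j * math.sin(angle) * GAMMA4[i][j]
             for j in range(4)] for i in range(4)]


def transformed(matrix, angle):
    """exp(-i angle gamma4) M exp(+i angle gamma4), as it appears between psi-dagger and psi."""
    return matmul(matmul(chiral_rotation(-angle), matrix), chiral_rotation(angle))


if __name__ == "__main__":
    print([GAMMA4[i][i] for i in range(4)])  # diagonal entries -1, -1, +1, +1
    kinetic = matmul(GAMMA[0], GAMMA[1])
    print(max_diff(transformed(kinetic, 0.4), kinetic))    # unchanged
    print(max_diff(transformed(GAMMA[0], 0.4), GAMMA[0]))  # the mass term is not invariant
```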
In the presence of a mass $m$, the Dirac Lagrangian in the chiral representation is
$$\mathcal{L} = i\,\hbar\,c\left[ \phi_L^\dagger\,\sigma_L^\mu\,\partial_\mu\phi_L + \phi_R^\dagger\,\sigma_R^\mu\,\partial_\mu\phi_R \right] - m c^2\left[ \phi_L^\dagger\,\phi_R + \phi_R^\dagger\,\phi_L \right] \tag{2054}$$
and one finds that the axial current is not conserved because the mass term is not invariant and acts like a current source
$$\partial_\mu j_A^\mu = i\,\frac{2\,m c^2}{\hbar}\;\bar{\psi}\,\gamma^{(4)}\,\psi \tag{2055}$$
To summarize, just like the Proca equation yields a zero mass for the photon if one imposes $U(1)$ gauge invariance on the electromagnetic field$^{141}$, the neutrino must have zero mass if one imposes a global $U(1)$ chiral gauge invariance. Furthermore, the existence of conservation of chirality for the massless neutrino implies that the weak interaction must involve a coupling proportional to a factor of either $( I + \gamma^{(4)} )$ or $( I - \gamma^{(4)} )$.
141 Schwinger has noted that the condition of gauge invariance does not necessarily result in the photon being massless. He argued that if the electromagnetic coupling strength were larger, the photon could have finite mass. [J. Schwinger, Gauge Invariance and Mass, Physical Review, 125, 397 (1962).]
Exercise:
By considering an infinitesimal chiral gauge transformation on the Lagrangian for massive Dirac particles, determine $\delta\mathcal{L}$ and show that this leads to the axial current $j_A^\mu$ not being conserved.
12.16
Hole Theory
The negative-energy solutions of the Dirac equation lead to the conclusion that one-particle quantum mechanics is an inadequate description of nature. In classical mechanics, the dispersion relation for a free particle is found to be given by
$$E = \pm\sqrt{ m^2 c^4 + p^2 c^2 } \tag{2056}$$
The negative-energy states found in classical mechanics can be safely ignored. The rationale for ignoring the negative-energy states in classical mechanics is that the dynamics is governed by a set of differential equations which result in the classical variables changing in a continuous fashion. Since the particle's energy can only change in a continuous fashion, there is no mechanism which allows it to connect with the negative branch of the dispersion relation. However, in quantum mechanics, particles can make discontinuous transitions between different energy levels by emitting photons. Hence, if one has a single electron in a positive-energy state where $E > m c^2$, this state would be unstable to the electron making a transition to a negative-energy state, which occurs with the simultaneous emission of photons which carry away an energy greater than $2\,m c^2$. The transition rate for such a process is quite large; therefore, one might conclude that positive-energy particles should not exist in nature. Furthermore, if one does have particles in the negative-energy branch, they might be able to further lower their energies by multiple photon emission processes. Hence, the states of negative-energy particles with finite momenta could be unstable to states in which the momentum has an infinite value.
Dirac noted that if the negative-energy states were all filled, then the Pauli exclusion principle would prevent the decay of positive-energy particles into the negative-energy states. Furthermore, in the absence of any positive-energy particles, the Pauli exclusion principle would cause the set of particles in the negative-energy state to be completely inert. In this picture, the filled sea of negative-energy states would represent the physical vacuum, and would be unobservable in experiments. For example, if charge is measured, it is the non-uniform part of the charge distribution that is measured, but the infinite number of particles in the negative-energy states do produce a uniform charge density. Likewise, when energies are measured, the energy is usually measured with respect to some reference level. For the case of a vacuum in which all the negative-energy states are filled with electrons, the measured energies correspond to energy differences and so the infinite negative energy of the vacuum should cancel. Therefore, Dirac postulated that the vacuum consists of the state in which all the negative-energy states are filled with electrons$^{142}$. Furthermore, physical states correspond to the states where relatively few of the positive-energy states are filled with electrons and a few negative-energy states are unoccupied. In this case, the electrons in the positive-energy states are identified with observable electrons, and the unfilled states or holes in the distribution of negative-energy states are also observable. These holes are known as positrons and are the antiparticles of the electrons. The properties of a positron are found by computing the difference between the property for a state with an absent negative-energy electron and the property of the vacuum state.
Figure 66: A cartoon depicting the vacuum for Dirac's Hole Theory, in which the negative-energy states are filled and the positive-energy states are empty. The vertical axis is $E/mc^2$.
We shall assume that the vacuum consists of $N$ electrons which completely
fill all the $N$ negative-energy states and, for simplicity of discussion, that the
effect of coupling to the electromagnetic field can be ignored. Then the charge
of a positron $q_p$ is the difference between the charge of the vacuum with one
missing electron and the charge of the vacuum
$$ q_p = ( N - 1 ) \, q_e - N \, q_e \tag{2057} $$
Therefore, one finds that the positron has the opposite charge to that of an
electron
$$ q_p = - q_e \tag{2058} $$
142 P.
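Hence, the positron has a positive charge. Likewise, the energy of the vacuum
in which the electrons occupy all the negative-energy states is denoted by
$E_0$. The positron energy will be denoted as $E_p(\mathbf{p}_p)$. The positron
corresponds to all states with negative energy being filled except for the state
with the energy
$$ E_e(\mathbf{p}_e) = - \sqrt{ m^2 c^4 + \mathbf{p}_e^2 c^2 } \tag{2059} $$
which is unfilled. The positron energy is defined as the energy difference
$$ E_p(\mathbf{p}_p) = \left( E_0 - E_e(\mathbf{p}_e) \right) - E_0 = - E_e(\mathbf{p}_e) = \sqrt{ m^2 c^4 + \mathbf{p}_e^2 c^2 } \tag{2060} $$
Likewise, if $\mathbf{P}_0$ denotes the momentum of the vacuum, the momentum of
the positron is given by the difference
$$ \mathbf{p}_p = \left( \mathbf{P}_0 - \mathbf{p}_e \right) - \mathbf{P}_0 = - \mathbf{p}_e \tag{2061} $$
Hence, the momentum of the positron is the negative of the momentum of the
missing electron
$$ \mathbf{p}_p = - \mathbf{p}_e \tag{2062} $$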
Likewise, the spin of the positron is opposite to the spin of the missing electron,
etc. The velocity of an electron is defined as the group velocity of a wave packet
of momentum $\mathbf{p}_e$. Hence, one finds the velocity of the negative-energy
electron from
$$ \mathbf{v}_e = \frac{\partial E_e(\mathbf{p}_e)}{\partial \mathbf{p}_e} = - \frac{ \mathbf{p}_e c^2 }{ \sqrt{ m^2 c^4 + \mathbf{p}_e^2 c^2 } } \tag{2063} $$
while the velocity of the positron is given by
$$ \mathbf{v}_p = \frac{\partial E_p(\mathbf{p}_p)}{\partial \mathbf{p}_p} = \frac{ \mathbf{p}_p c^2 }{ \sqrt{ m^2 c^4 + \mathbf{p}_p^2 c^2 } } = - \frac{ \mathbf{p}_e c^2 }{ \sqrt{ m^2 c^4 + \mathbf{p}_e^2 c^2 } } = \mathbf{v}_e \tag{2064} $$
Therefore, the positron and the negative-energy electron states have the same
velocities.

Table 18: The relation between properties of Negative Energy Electron and
Positron States.

Particle   Charge   Energy   Momentum   Spin     Helicity   Velocity
Electron   -e       -E       +p         +sigma   sigma.p    v
Positron   +e       +E       -p         -sigma   sigma.p    v
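The bookkeeping of eqns (2058) to (2064) can be checked numerically. The following is a minimal sketch in one spatial dimension, assuming units with m = c = 1 and an electron charge of -1; the function names are illustrative and do not come from the text.

```python
import math

def electron_energy(p, m=1.0, c=1.0, branch=-1):
    """Energy on one branch of E^2 = m^2 c^4 + p^2 c^2 (branch=-1: negative-energy)."""
    return branch * math.sqrt(m**2 * c**4 + p**2 * c**2)

def group_velocity(p, m=1.0, c=1.0, branch=-1):
    """Group velocity dE/dp on the chosen branch."""
    return branch * p * c**2 / math.sqrt(m**2 * c**4 + p**2 * c**2)

def hole_properties(p_e, q_e=-1.0, m=1.0, c=1.0):
    """Positron properties inferred from a missing negative-energy electron."""
    p_p = -p_e                                      # eqn (2062)
    e_p = -electron_energy(p_e, m, c, branch=-1)    # eqn (2060)
    q_p = -q_e                                      # eqn (2058)
    v_p = group_velocity(p_p, m, c, branch=+1)      # positron lies on the positive branch
    return {"charge": q_p, "energy": e_p, "momentum": p_p, "velocity": v_p}

props = hole_properties(p_e=0.75)
v_e = group_velocity(0.75, branch=-1)
print(props, v_e)
```

The positron energy comes out positive, and its velocity coincides with that of the missing negative-energy electron even though the momenta are opposite.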
Hole theory provides a simple description of the relation between negative-energy
states and antiparticle states. Mathematically, this relation is expressed
in terms of the charge conjugation transformation. A unique signature of the
hole theory is that a positive-energy electron can make a transition to an unfilled
negative-energy state by emitting radiation, which corresponds to the process in
which an electron-positron pair annihilates^{143}
$$ e^- + e^+ \to 2 \gamma \tag{2065} $$
In this process, it is necessary that the excess energy be carried off by two
photons if the energy-momentum conservation laws are to be satisfied. Likewise,
by supplying an energy greater than the threshold energy of $2 m c^2$, it should be
possible to promote an electron from a negative-energy state, thereby creating
an electron-positron pair. Since it is unlikely that more than one photon can
be absorbed simultaneously, electron-positron pair creation only occurs in the
vicinity of a charged nucleus which can carry off any excess momentum
$$ \gamma \to e^- + e^+ \tag{2066} $$
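Compton Scattering
$$ \frac{d\sigma}{d\Omega_{k'}} = \left( \frac{ V \omega_{k'} }{ 2 \pi \hbar c^2 } \right)^2 | M |^2 \tag{2068} $$
where
$$ M = \sum_{q''} \left[ \frac{ \langle q', k', \alpha' | \hat{H}_{Int} | q'' \rangle \langle q'' | \hat{H}_{Int} | q, k, \alpha \rangle }{ E_q - E_{q''} + \hbar \omega_k } + \frac{ \langle q', k', \alpha' | \hat{H}_{Int} | q'', k, \alpha, k', \alpha' \rangle \langle q'', k, \alpha, k', \alpha' | \hat{H}_{Int} | q, k, \alpha \rangle }{ E_q - E_{q''} - \hbar \omega_{k'} } \right] \tag{2069} $$
and where $q$ indicates all the quantum numbers of a positive-energy free electron
state. The sum over $q''$ represents a sum over all possible intermediate states
of the electron, no matter whether they are positive or negative-energy states.
The matrix element $M$ is composed of a coherent superposition of matrix elements
for virtual processes which represent the absorption of a photon $(k, \alpha)$
followed by the subsequent emission of a photon $(k', \alpha')$, and the process
where the emission of light precedes the absorption process.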
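[Figure: the two second-order processes contributing to Compton scattering, labeled by the photon lines $(k, \alpha)$ and $(k', \alpha')$ and the electron lines $q$, $q''$ and $q'$.]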
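In the first process, where the absorption precedes the emission, conservation of
momentum yields
$$ \mathbf{k} + \mathbf{q} = \mathbf{q}'' , \qquad \mathbf{q}'' = \mathbf{k}' + \mathbf{q}' \tag{2070} $$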
which leads to the identification of the momentum of the intermediate and final
states as
$$ \mathbf{q}'' = \mathbf{q} + \mathbf{k} , \qquad \mathbf{q}' = \mathbf{q} + \mathbf{k} - \mathbf{k}' \tag{2071} $$
In the second process, where the emission process precedes the absorption,
conservation of momentum yields
$$ \mathbf{k} + \mathbf{q} = \mathbf{k} + \mathbf{k}' + \mathbf{q}'' , \qquad \mathbf{k} + \mathbf{k}' + \mathbf{q}'' = \mathbf{k}' + \mathbf{q}' \tag{2072} $$
which yields
$$ \mathbf{q}'' = \mathbf{q} - \mathbf{k}' , \qquad \mathbf{q}' = \mathbf{q} + \mathbf{k} - \mathbf{k}' \tag{2073} $$
The limit in which the initial electron is at rest, $\mathbf{q} = 0$, shall be
considered. The momenta of the incident and scattered photons will be assumed to be
sufficiently low that the momentum of the electron in the intermediate state can
be neglected, since $\mathbf{q}'' \approx 0$. That is, the Compton scattering
process will be considered in the limit $\mathbf{k} \to 0$ and $\mathbf{k}' \to 0$.
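If the initial (positive-energy) electron is stationary and has spin $\sigma$, its
wave function can be represented by the Dirac spinor
$$ \psi_{\sigma,q}(\mathbf{r}) = \frac{1}{\sqrt{V}} \begin{pmatrix} \chi_\sigma \\ 0 \end{pmatrix} \tag{2074} $$
Because the interaction Hamiltonian has the form of an off-diagonal $2 \times 2$ block
matrix
$$ \hat{H}_{Int} = - q \begin{pmatrix} 0 & \boldsymbol{\sigma} \cdot \mathbf{A} \\ \boldsymbol{\sigma} \cdot \mathbf{A} & 0 \end{pmatrix} \tag{2075} $$
the only non-zero matrix elements are those which connect the upper two-component
spinor to the lower two-component spinor of the virtual state. Also,
momentum conservation requires that the virtual state also be one of almost
zero momentum. Hence, the electron in the virtual state must have the form of
a negative-energy eigenstate
$$ \psi_{\sigma'',q''}(\mathbf{r}) \approx \frac{1}{\sqrt{V}} \begin{pmatrix} 0 \\ \chi_{\sigma''} \end{pmatrix} \tag{2076} $$
since the contribution from a positive-energy state with small momentum is
negligibly small. Therefore, the electronic part of the matrix elements involving
the initial electron simply reduces to the expression
$$ \langle \psi_{\sigma'',q''} | \hat{H}_{Int} | \psi_{\sigma,q} \rangle = | e | \, \chi^\dagger_{\sigma''} \, \boldsymbol{\sigma} \cdot \mathbf{A} \, \chi_\sigma \tag{2077} $$
Likewise, the matrix elements which involve the final (positive-energy) electron
are evaluated as
$$ \langle \psi_{\sigma',q'} | \hat{H}_{Int} | \psi_{\sigma'',q''} \rangle = | e | \, \chi^\dagger_{\sigma'} \, \boldsymbol{\sigma} \cdot \mathbf{A} \, \chi_{\sigma''} \tag{2078} $$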
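From these one finds that, to second-order, the matrix elements that appear in
the transition rate are given by
$$ M = e^2 \, \frac{ 2 \pi \hbar c^2 }{ V \sqrt{ \omega_k \omega_{k'} } } \sum_{\sigma''} \left[ \frac{ ( \chi^\dagger_{\sigma'} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha'}(k') \, \chi_{\sigma''} ) ( \chi^\dagger_{\sigma''} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha}(k) \, \chi_{\sigma} ) }{ E_q - E_{q''} + \hbar \omega_k } + \frac{ ( \chi^\dagger_{\sigma'} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha}(k) \, \chi_{\sigma''} ) ( \chi^\dagger_{\sigma''} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha'}(k') \, \chi_{\sigma} ) }{ E_q - E_{q''} - \hbar \omega_{k'} } \right] $$
$$ \approx \frac{e^2}{2 m c^2} \, \frac{ 2 \pi \hbar c^2 }{ V \sqrt{ \omega_k \omega_{k'} } } \sum_{\sigma''} \left[ ( \chi^\dagger_{\sigma'} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha'}(k') \, \chi_{\sigma''} ) ( \chi^\dagger_{\sigma''} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha}(k) \, \chi_{\sigma} ) + ( \chi^\dagger_{\sigma'} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha}(k) \, \chi_{\sigma''} ) ( \chi^\dagger_{\sigma''} \, \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha'}(k') \, \chi_{\sigma} ) \right] \tag{2079} $$
where one has set
$$ E_q - E_{q''} \approx 2 m c^2 \tag{2080} $$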
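After performing the sum over the intermediate spin states with the aid of the
completeness relation $\sum_{\sigma''} \chi_{\sigma''} \chi^\dagger_{\sigma''} = \hat{I}$,
the matrix elements become
$$ M \approx \frac{e^2}{2 m c^2} \, \frac{ 2 \pi \hbar c^2 }{ V \sqrt{ \omega_k \omega_{k'} } } \, \chi^\dagger_{\sigma'} \left[ ( \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha'}(k') ) ( \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha}(k) ) + ( \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha}(k) ) ( \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha'}(k') ) \right] \chi_{\sigma} \tag{2082} $$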
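The products in the above expression can be evaluated with the aid of the Pauli
identity. The result is
$$ ( \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha}(k) ) ( \boldsymbol{\sigma} \cdot \boldsymbol{\epsilon}_{\alpha'}(k') ) = ( \boldsymbol{\epsilon}_{\alpha}(k) \cdot \boldsymbol{\epsilon}_{\alpha'}(k') ) + i \, \boldsymbol{\sigma} \cdot ( \boldsymbol{\epsilon}_{\alpha}(k) \times \boldsymbol{\epsilon}_{\alpha'}(k') ) \tag{2083} $$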
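The Pauli identity, and the cancellation of the vector-product terms in the symmetrized product, can be verified numerically. The sketch below uses plain nested lists for the 2x2 matrices; the two polarization vectors are arbitrary illustrative choices rather than values taken from the text.

```python
# Pauli matrices as nested lists of (complex) numbers.
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a real 3-vector v."""
    return [[sum(v[a] * SIGMA[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pauli_identity_residual(a, b):
    """Largest entry of (sigma.a)(sigma.b) - [a.b + i sigma.(a x b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    s_cross = sigma_dot(cross(a, b))
    rhs = [[dot(a, b) * (1 if i == j else 0) + 1j * s_cross[i][j]
            for j in range(2)] for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

def symmetrized_product(a, b):
    """(sigma.a)(sigma.b) + (sigma.b)(sigma.a); the cross-product terms cancel."""
    ab = matmul(sigma_dot(a), sigma_dot(b))
    ba = matmul(sigma_dot(b), sigma_dot(a))
    return [[ab[i][j] + ba[i][j] for j in range(2)] for i in range(2)]

eps = (1.0, 0.0, 0.0)          # illustrative polarization vectors
eps_prime = (0.6, 0.0, 0.8)
residual = pauli_identity_residual(eps, eps_prime)
sym = symmetrized_product(eps, eps_prime)
print(residual, sym)
```

The symmetrized product is twice the scalar product of the two vectors times the unit matrix, which is the origin of the factor of 2 in the reduced matrix element.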
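Therefore, after combining both terms and noting that the pair of vector product
terms cancel since the vector product is antisymmetric under the interchange
$(k, \alpha) \leftrightarrow (k', \alpha')$, one finds that the matrix elements reduce to
$$ M \approx \frac{e^2}{2 m c^2} \, \frac{ 2 \pi \hbar c^2 }{ V \sqrt{ \omega_k \omega_{k'} } } \, \chi^\dagger_{\sigma'} \chi_{\sigma} \; 2 \, \boldsymbol{\epsilon}_{\alpha}(k) \cdot \boldsymbol{\epsilon}_{\alpha'}(k') $$
$$ = \frac{e^2}{2 m c^2} \, \frac{ 2 \pi \hbar c^2 }{ V \sqrt{ \omega_k \omega_{k'} } } \, \delta_{\sigma,\sigma'} \; 2 \, \boldsymbol{\epsilon}_{\alpha}(k) \cdot \boldsymbol{\epsilon}_{\alpha'}(k') \tag{2084} $$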
Hence the matrix elements are diagonal in the spin indices. The above matrix
elements are identical to the matrix elements that occur in the non-relativistic
quantum theory of Thomson scattering. On substituting this result into eqn(2068),
one recovers the non-relativistic expression for the differential scattering
cross-section
$$ \frac{d\sigma}{d\Omega_{k'}} = \frac{\omega_{k'}}{\omega_k} \left( \frac{e^2}{m c^2} \right)^2 \delta_{\sigma,\sigma'} \, | \boldsymbol{\epsilon}_{\alpha}(k) \cdot \boldsymbol{\epsilon}_{\alpha'}(k') |^2 = \frac{\omega_{k'}}{\omega_k} \left( \frac{e^2}{m c^2} \right)^2 \delta_{\sigma,\sigma'} \, \cos^2 \Theta \tag{2085} $$
where
$$ \cos \Theta = \boldsymbol{\epsilon}_{\alpha}(k) \cdot \boldsymbol{\epsilon}_{\alpha'}(k') \tag{2086} $$
and $\Theta$ is the angle subtended by the initial and final polarization vectors.
Hence, one concludes that the negative-energy states do play an important role in
light scattering processes which involve low-energy electrons. The result, although
correct, does need reinterpretation, since the states of negative energy are
assumed to be filled with electrons in the vacuum and, therefore, the electron is
forbidden to occupy these levels in the intermediate states.
Electron-Positron Interpretation
The first contribution to the matrix elements, which was described above,
has to be reinterpreted as representing a process in which an electron that
initially occupies the negative-energy state $q''$ makes a transition to the
positive-energy state $q'$ while emitting the photon $(k', \alpha')$. This transition
is subsequently followed by the positive-energy electron $q$ absorbing the photon
$(k, \alpha)$ and falling into the empty negative-energy state $q''$.

[Figure: the two contributions redrawn in terms of electrons ($e^-$) and positrons ($e^+$), with the photon lines $(k, \alpha)$ and $(k', \alpha')$ and the electron labels $q''$ and $q'$.]
(2087)
the contributions to the matrix element of these two descriptions are identical
(apart from an overall negative sign).
The second contribution to the matrix elements can be viewed as originating
from an electron which initially occupies a negative-energy state $q''$ that absorbs
the photon $(k, \alpha)$ and makes a transition to the positive-energy state $q'$. This
is followed by the electron in the positive-energy state $q$ emitting the photon
$(k', \alpha')$ and then falling into the empty negative-energy state $q''$. Again, on
reordering the matrix elements and noting that
$$ E_q - \hbar \omega_{k'} = E_{q'} - \hbar \omega_k \tag{2088} $$
one finds an identical expression (and the multiplicative factor of minus one).
Hence, Dirac hole-theory does lead to the correct classical result.
The above description is quite cumbersome, but can be made more concise by
adopting an antiparticle description of the unoccupied negative-energy states.
The first contribution to $M$ first involves the creation of a virtual
electron-positron pair with the emission of the photon $(k', \alpha')$. The electron
which has just been created in the momentum eigenstate $(q', \sigma')$ remains
unchanged in the final state. Subsequently, the positron annihilates with the
initial electron $(q, \sigma)$ while absorbing the photon $(k, \alpha)$. Since the
intermediate state is a virtual state, energy does not have to be conserved. The
second contribution to $M$ involves the creation of a virtual electron-positron pair
with the absorption of the photon $(k, \alpha)$. The created electron $(q', \sigma')$
remains in the final state while the positron subsequently annihilates with the
initial electron $(q, \sigma)$ and emits the photon $(k', \alpha')$. This process is
also a virtual process if the energy of the incident light $\hbar \omega_k$ is less
than $2 m c^2$.
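The perturbative expression for the Compton scattering cross-section can
be evaluated exactly, without recourse to non-relativistic approximations. The
exact result is
$$ \frac{d\sigma}{d\Omega} = \frac{1}{4} \, r_e^2 \left( \frac{\omega'}{\omega} \right)^2 \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} - 2 + 4 \cos^2 \Theta \right] \tag{2089} $$
where $r_e = e^2 / ( m c^2 )$ is the classical electron radius and $\Theta$ is the
angle between the polarization vectors. This result was first
derived by Klein and Nishina^{148} in 1928.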
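A short numerical sketch shows how eqn (2089) reduces to the Thomson form when the scattered frequency approaches the incident one. The frequency ratio is supplied here through the standard Compton relation, which is an assumption brought in from outside this section, and the function names are illustrative only.

```python
def compton_ratio(x, cos_scatter):
    """omega'/omega for x = hbar*omega/(m c^2) and photon scattering angle theta.

    Uses the standard Compton relation (assumed; it is not derived in this section).
    """
    return 1.0 / (1.0 + x * (1.0 - cos_scatter))

def klein_nishina(ratio, cos_pol, r_e=1.0):
    """Polarized cross-section of eqn (2089); ratio = omega'/omega, cos_pol = cos(Theta)."""
    return 0.25 * r_e**2 * ratio**2 * (ratio + 1.0 / ratio - 2.0 + 4.0 * cos_pol**2)

def thomson_limit(cos_pol, r_e=1.0):
    """Low-energy limit: r_e^2 cos^2(Theta)."""
    return r_e**2 * cos_pol**2

low_energy = klein_nishina(compton_ratio(1e-8, -1.0), 0.5)
back_scatter_ratio = compton_ratio(1.0, -1.0)
print(low_energy, thomson_limit(0.5), back_scatter_ratio)
```

For photon energies far below $m c^2$ the ratio tends to one and only the $4 \cos^2 \Theta$ term survives, reproducing the non-relativistic result.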
12.16.2 Charge Conjugation
(2092)
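We shall multiply the complex conjugate of the Dirac equation by $\gamma^{(2)}$,
anticommute $\gamma^{(2)}$ with the real $\gamma^{(\mu) *}$ matrices and commute
$\gamma^{(2)}$ with the $\gamma^{(2) *}$ matrix. This procedure changes the sign in
front of the term originating from the differential momentum operator relative to
the sign of the mass term. This procedure yields
$$ \gamma^{(2)} \left[ \gamma^{\mu *} \left( - i \hbar \partial_\mu - \frac{q}{c} A_\mu \right) - m c \right] \psi^* = 0 $$
$$ \left[ \gamma^{\mu} \left( i \hbar \partial_\mu + \frac{q}{c} A_\mu \right) - m c \right] \gamma^{(2)} \psi^* = 0 \tag{2093} $$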
148 O.
Hence, one sees that $\gamma^{(2)} \psi^*$ describes a Dirac particle with mass $m$
and a charge of $-q$ moving in the presence of a vector potential $A_\mu$. The fact
that the operation of charge conjugation (in any representation) involves complex
conjugation is related to gauge invariance. Charge conjugation is a new type of
symmetry for particles that have complex wave functions, which relates particles
to particles with opposite charges. The charge conjugate field $\psi^c$ is defined as
$$ \psi^c = \hat{C} \, \psi^* \tag{2094} $$
which is the result of complex conjugation followed by the action of a linear
operator $\hat{C}$. The joint operation can be represented as an antiunitary operator.
The charge conjugation operator $\hat{C}$ is defined as the unitary and Hermitean
operator
$$ \hat{C} = - i \, \gamma^{(2)} \tag{2095} $$
The charge conjugation operator is Hermitean as
$$ \hat{C}^\dagger = + i \, \gamma^{(2) \dagger} = - i \, \gamma^{(2)} = \hat{C} \tag{2096} $$
and is unitary since
$$ \hat{C}^\dagger \hat{C} = - \gamma^{(2)} \gamma^{(2)} = \hat{I} \tag{2097} $$
where the anticommutation relations of the $\gamma$ matrices have been used. It was
through this type of logic that Kramers^{149} discovered the form of the charge
conjugation transformation which turns a particle into an antiparticle.
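The stated properties of the charge conjugation matrix can be confirmed with explicit matrices. This sketch assumes the Dirac (standard) representation of $\gamma^{(2)}$, which is consistent with the explicit real matrix quoted for $\hat{C}$ in these notes; the helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitean conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

# gamma^(2) = [[0, sigma2], [-sigma2, 0]] in the Dirac representation (assumed).
GAMMA2 = [
    [0, 0, 0, -1j],
    [0, 0, 1j, 0],
    [0, 1j, 0, 0],
    [-1j, 0, 0, 0],
]
C = [[-1j * GAMMA2[i][j] for j in range(4)] for i in range(4)]   # C = -i gamma^(2)

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
EXPECTED_C = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

is_real = all(abs(complex(C[i][j]).imag) < 1e-12 for i in range(4) for j in range(4))
is_hermitean = max_abs_diff(C, dagger(C)) < 1e-12
is_unitary = max_abs_diff(matmul(dagger(C), C), IDENTITY) < 1e-12
matches_explicit_form = max_abs_diff(C, EXPECTED_C) < 1e-12
print(is_real, is_hermitean, is_unitary, matches_explicit_form)
```

In this representation the matrix is real and symmetric, so Hermiticity and unitarity together imply that it squares to the unit matrix.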
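The expectation values of an operator $\hat{A}$ in a general charge conjugated state
are related to the expectation values in a general state via
$$ \langle \psi^c | \hat{A} | \psi^c \rangle = - \langle \psi | \gamma^{(2)} \hat{A}^* \gamma^{(2)} | \psi \rangle^* \tag{2098} $$
This follows since
$$ \int d^3r \; \psi^{c \dagger}(\mathbf{r}) \, \hat{A} \, \psi^c(\mathbf{r}) = \int d^3r \; \psi^T(\mathbf{r}) \, \hat{C}^\dagger \hat{A} \, \hat{C} \, \psi^*(\mathbf{r}) $$
$$ = \left( \int d^3r \; \psi^\dagger(\mathbf{r}) \, \hat{C}^T \hat{A}^* \hat{C}^* \, \psi(\mathbf{r}) \right)^* \tag{2099} $$
where we have used the identity $z = (z^*)^*$ in the second line. However, since
$\hat{C}$ is real, one finds
$$ \int d^3r \; \psi^{c \dagger}(\mathbf{r}) \, \hat{A} \, \psi^c(\mathbf{r}) = \left( \int d^3r \; \psi^\dagger(\mathbf{r}) \, \hat{C} \hat{A}^* \hat{C} \, \psi(\mathbf{r}) \right)^* $$
$$ = - \left( \int d^3r \; \psi^\dagger(\mathbf{r}) \, \gamma^{(2)} \hat{A}^* \gamma^{(2)} \, \psi(\mathbf{r}) \right)^* \tag{2100} $$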
149 H.
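For example, a positive-energy plane-wave solution of the free Dirac equation has
the form
$$ \psi_{\sigma,k}(x) = \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \begin{pmatrix} \chi_\sigma \\ \frac{ c \hbar \, \boldsymbol{\sigma} \cdot \mathbf{k} }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp \left[ - i k_\mu x^\mu \right] \tag{2101} $$
The charge conjugate wave function is given by
$$ \psi^c_{\sigma,k}(x) = \hat{C} \, \psi^*_{\sigma,k}(x) = \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \; \hat{C} \begin{pmatrix} \chi^*_\sigma \\ \frac{ c \hbar \, \boldsymbol{\sigma}^* \cdot \mathbf{k} }{ E + m c^2 } \chi^*_\sigma \end{pmatrix} \exp \left[ + i k_\mu x^\mu \right] \tag{2102} $$
where
$$ \hat{C} = - i \gamma^{(2)} = \begin{pmatrix} 0 & - i \sigma^{(2)} \\ i \sigma^{(2)} & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{2103} $$
Therefore, the charge conjugated wave function is
$$ \psi^c_{\sigma,k}(x) = - i \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \begin{pmatrix} \sigma^{(2)} \frac{ c \hbar \, \boldsymbol{\sigma}^* \cdot \mathbf{k} }{ E + m c^2 } \chi^*_\sigma \\ - \sigma^{(2)} \chi^*_\sigma \end{pmatrix} \exp \left[ + i k_\mu x^\mu \right] \tag{2104} $$
which has the form of a plane-wave solution with negative energy $E \to - E$,
and momentum $\hbar \mathbf{k} \to - \hbar \mathbf{k}$. Furthermore, the spin of the
charge conjugated wave function has been reversed^{150}, $\sigma \to - \sigma$, since
when $- i \sigma^{(2)}$ acts on the complex conjugated positive-eigenvalue eigenstate
of the spin projected on an arbitrary direction
$$ \chi^*_+(\theta, \varphi) = \begin{pmatrix} \cos \frac{\theta}{2} \, \exp[ + i \frac{\varphi}{2} ] \\ \sin \frac{\theta}{2} \, \exp[ - i \frac{\varphi}{2} ] \end{pmatrix} \tag{2105} $$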
150 Note
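it yields the eigenstate with the negative eigenvalue
$$ - i \sigma^{(2)} \chi^*_+(\theta, \varphi) = \begin{pmatrix} - \sin \frac{\theta}{2} \, \exp[ - i \frac{\varphi}{2} ] \\ \cos \frac{\theta}{2} \, \exp[ + i \frac{\varphi}{2} ] \end{pmatrix} = \chi_-(\theta, \varphi) \tag{2106} $$
Likewise, for the component that involves the momentum, one has
$$ - i \sigma^{(2)} ( \boldsymbol{\sigma}^* \cdot \mathbf{k} ) \, \chi^*_+(\theta, \varphi) = - ( \boldsymbol{\sigma} \cdot \mathbf{k} ) ( - i \sigma^{(2)} ) \, \chi^*_+(\theta, \varphi) $$
$$ = - ( \boldsymbol{\sigma} \cdot \mathbf{k} ) \, \chi_-(\theta, \varphi) $$
$$ = ( \boldsymbol{\sigma} \cdot ( - \mathbf{k} ) ) \, \chi_-(\theta, \varphi) \tag{2108} $$
The end result is that the charge conjugated single-particle wave function has
the form
$$ \psi^c_{\sigma,k}(x) = \sqrt{ \frac{ E + m c^2 }{ 2 E V } } \begin{pmatrix} \frac{ c \hbar \, ( \boldsymbol{\sigma} \cdot ( - \mathbf{k} ) ) }{ E + m c^2 } \chi_{-\sigma} \\ - \chi_{-\sigma} \end{pmatrix} \exp \left[ + i k_\mu x^\mu \right] \tag{2109} $$
The properties described above are the properties of a state of a relativistic free
particle with a negative energy eigenvalue $- E$, momentum $- \hbar \mathbf{k}$ and
spin $- \sigma$. The absence of an electron in the charge conjugated state describes
a positron, with positive energy $E$, momentum $\hbar \mathbf{k}$ and spin $\sigma$.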
More generally, even when an electromagnetic field is present, the charge
conjugated wave function of a positive-energy particle corresponds to the wave
function of a state with reversed energy $E \to - E$, reversed spin
$\sigma \to - \sigma$ and reversed charge $q \to - q$. Therefore, the charge
conjugated state corresponds to the (negative-energy) state which, when
unoccupied, is described as an antiparticle.
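The spin reversal produced by complex conjugation followed by $- i \sigma^{(2)}$ can be checked for an arbitrary quantization direction. In this sketch the spin-down spinor is written with one common phase convention, which is an assumption; the angles are arbitrary test values.

```python
import cmath
import math

def chi_plus(theta, phi):
    """Spin-up eigenstate of sigma.n for the direction (theta, phi)."""
    return [math.cos(theta / 2) * cmath.exp(-1j * phi / 2),
            math.sin(theta / 2) * cmath.exp(+1j * phi / 2)]

def chi_minus(theta, phi):
    """Spin-down eigenstate of sigma.n (phase convention assumed)."""
    return [-math.sin(theta / 2) * cmath.exp(-1j * phi / 2),
            math.cos(theta / 2) * cmath.exp(+1j * phi / 2)]

def conjugate_and_rotate(chi):
    """Apply -i sigma^(2) = [[0, -1], [1, 0]] to the complex conjugate of chi."""
    a, b = chi[0].conjugate(), chi[1].conjugate()
    return [-b, a]

def spin_projection(chi, theta, phi):
    """Expectation value of sigma.n in the (normalized) state chi."""
    n_dot_sigma = [[math.cos(theta), math.sin(theta) * cmath.exp(-1j * phi)],
                   [math.sin(theta) * cmath.exp(+1j * phi), -math.cos(theta)]]
    out = [sum(n_dot_sigma[i][j] * chi[j] for j in range(2)) for i in range(2)]
    return sum(chi[i].conjugate() * out[i] for i in range(2)).real

theta, phi = 0.7, 1.9
flipped = conjugate_and_rotate(chi_plus(theta, phi))
target = chi_minus(theta, phi)
difference = max(abs(flipped[i] - target[i]) for i in range(2))
print(difference, spin_projection(flipped, theta, phi))
```

The transformed spinor has projection $-1$ along the original direction, in line with the statement that charge conjugation reverses the spin.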
Exercise:
Consider massless Dirac particles, $m \to 0$. (i) Show that the energy-helicity
eigenstates coincide with the eigenstates of $\gamma^{(4)}$. (ii) Hence, show that
the operators $\frac{1}{2} ( \hat{I} \mp \gamma^{(4)} )$ project onto helicity
eigenstates. These projection operators map the four-component Dirac spinors onto
the independent two-component Weyl spinors $\psi_L$ and $\psi_R$. (iii) Show that
charge conjugation transforms $\psi_L$ into $\psi_R$.
Exercise:
Prove the completeness relation for the set of solutions for the Dirac equation
for a free particle
$$ \sum_{\alpha} \left( \psi_{\alpha,\mu}(\mathbf{r}) \, \psi^\dagger_{\alpha,\nu}(\mathbf{r}') + \psi^c_{\alpha,\mu}(\mathbf{r}) \, \psi^{c \dagger}_{\alpha,\nu}(\mathbf{r}') \right) = \delta^3( \mathbf{r} - \mathbf{r}' ) \, \delta_{\mu,\nu} \tag{2110} $$
13
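$$ \mathcal{L}_{QED} = - \frac{1}{16 \pi} F^{\mu,\nu} F_{\mu,\nu} + c \, \bar{\psi} \left[ \gamma^\mu \left( \hat{p}_\mu - \frac{q}{c} A_\mu \right) - m c \right] \psi \tag{2111} $$
The role of the source term for the electromagnetic field is played by the term
in the Dirac Lagrangian which represents the coupling to the vector potential.
The expression for the QED action
$$ S_{QED} = \int d^4x \; \mathcal{L}_{QED} \tag{2112} $$
is invariant under an infinite number of infinitesimal gauge transformations with
the form
$$ A_\mu \to A'_\mu = A_\mu + \partial_\mu \Lambda $$
$$ \psi \to \psi' = \psi - \frac{i q}{\hbar c} \Lambda \psi $$
$$ \psi^\dagger \to \psi'^\dagger = \psi^\dagger + \frac{i q}{\hbar c} \Lambda \psi^\dagger \tag{2113} $$
151 Frequently, the relativistic free electron states are given a manifestly covariant
normalization, in order to facilitate covariant perturbation theory. The use of
different normalization conventions changes the form of the completeness relation.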
j =
(2114)
( A )
( )
h c
which satisfy the continuity conditions
$$ \partial_\mu j^\mu = 0 \tag{2115} $$
j =
F , + q
(2116)
4
On substituting the EulerLagrange equation for the electromagnetic field
4
q c
(2117)
c
into the second term of the fourvector current density, one finds that the continuous current has the form
1
j =
F ,
(2118)
4
F , =
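$$ Q_{\Lambda = {\rm const.}} = \frac{\Lambda}{4 \pi c} \int d^3r \; ( \nabla \cdot \mathbf{E} ) \tag{2120} $$
On using Gauss's law
$$ \nabla \cdot \mathbf{E} = 4 \pi \rho \tag{2121} $$
one finds that global gauge invariance ensures conservation of electrical charge
$$ Q_{\Lambda = {\rm const.}} = \frac{\Lambda}{c} \int d^3r \; \rho(\mathbf{r}) \tag{2122} $$
However, for any of the infinite number of spatially varying $\Lambda$ which vanish
on the boundaries
$$ Q_\Lambda = \frac{1}{4 \pi c} \int d^3r \; \nabla \cdot ( \Lambda \, \mathbf{E} ) = \frac{1}{4 \pi c} \oint d^2\mathbf{S} \cdot \mathbf{E} \, \Lambda = 0 \tag{2123} $$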
14
14.1
(2124)
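The fermion creation and annihilation operators, $c^\dagger_\alpha$ and $c_\alpha$,
satisfy the anticommutation relations
$$ \{ c^\dagger_\alpha , c^\dagger_\beta \}_+ = 0 , \qquad \{ c_\alpha , c_\beta \}_+ = 0 \tag{2125} $$
and
$$ \{ c_\alpha , c^\dagger_\beta \}_+ = \delta_{\alpha,\beta} \tag{2126} $$
where the quantum numbers $\alpha$ and $\beta$ describe a complete set of
single-particle states.
The anticommutation relation
$$ c^\dagger_\alpha c^\dagger_\beta = - c^\dagger_\beta c^\dagger_\alpha \tag{2127} $$
is merely a restatement of the antisymmetric nature of a fermionic many-particle
wave function under the permutation of a pair of particles, as is the
Hermitean conjugate relation
$$ c_\beta c_\alpha = - c_\alpha c_\beta \tag{2128} $$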
(2129)
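The choice of anticommutation relations results in the eigenvalues of the number
operator $\hat{n}_\alpha = c^\dagger_\alpha c_\alpha$ being restricted to either
$n_\alpha = 1$ or $n_\alpha = 0$. This can be seen by examining the identity
$$ \hat{n}_\alpha \hat{n}_\alpha = \hat{n}_\alpha \tag{2130} $$
which follows from
$$ \hat{n}_\alpha \hat{n}_\alpha = c^\dagger_\alpha c_\alpha c^\dagger_\alpha c_\alpha $$
$$ = c^\dagger_\alpha c_\alpha - c^\dagger_\alpha c^\dagger_\alpha c_\alpha c_\alpha $$
$$ = c^\dagger_\alpha c_\alpha + c^\dagger_\alpha c^\dagger_\alpha c_\alpha c_\alpha \tag{2131} $$
where we have used the anticommutation relation for the creation and annihilation
operator to obtain the second line and used the anticommutation relation
for two annihilation operators to obtain the last line. On comparing the second
and third lines, one recognizes that
$$ c^\dagger_\alpha c^\dagger_\alpha c_\alpha c_\alpha = 0 \tag{2132} $$
Hence, we have
$$ \hat{n}_\alpha \hat{n}_\alpha = c^\dagger_\alpha c_\alpha = \hat{n}_\alpha \tag{2133} $$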
(2134)
(2136)
(2137)
The state $| 0_\alpha \rangle$ is also an eigenstate of the number operator with
eigenvalue zero since
$$ \hat{n}_\alpha \, | 0_\alpha \rangle = c^\dagger_\alpha c_\alpha \, | 0_\alpha \rangle = 0 \tag{2138} $$
(2140)
If this operator equation acts on the state $| 0_\alpha \rangle$ in which the quantum
state $\alpha$ is unoccupied, and using the condition
$$ \hat{n}_\alpha \, | 0_\alpha \rangle = 0 \tag{2141} $$
(2142)
$$ | \{ n_\alpha \} \rangle = \prod_{\alpha = 1} \frac{ ( c^\dagger_\alpha )^{n_\alpha} }{ \sqrt{ n_\alpha ! } } \; | 0 , 0 , 0 , \ldots , 0 \rangle \tag{2143} $$
where the allowed values of the set of occupation numbers $n_\alpha$ are either unity
or zero. The sequencing or ordering of the creation operators in this expression
is crucial, since the interchange of the positions of the operators may result in a
change in sign of the state. For example, the action of a creation operator on
a general number eigenstate has the effect
$$ c^\dagger_\alpha \, | n_1 \, n_2 \ldots n_\alpha \ldots \rangle = ( - 1 )^{ \sum_{i=1}^{\alpha - 1} n_i } \; | n_1 \, n_2 \ldots n_\alpha + 1 \ldots \rangle \tag{2144} $$
where the sign occurs since this involves anticommuting $c^\dagger_\alpha$ with
$\sum_{i=1}^{\alpha - 1} n_i$ other creation operators to bring it into the
$\alpha$-th position.
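The anticommutation relations, the identity $\hat{n}_\alpha \hat{n}_\alpha = \hat{n}_\alpha$ and the sign rule of eqn (2144) can be made concrete with explicit matrices for two modes. The construction below, which attaches a string factor $(-1)^{n_1}$ to the second mode, is one possible matrix representation chosen for illustration; it is not taken from the text.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(len(a))] for i in range(len(a))]

def apply(op, state):
    return [sum(op[i][j] * state[j] for j in range(len(state))) for i in range(len(op))]

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]      # (-1)^n for a single mode
A = [[0, 1], [0, 0]]       # single-mode annihilation operator in the basis (|0>, |1>)

c1 = kron(A, I2)           # annihilates a particle in mode 1
c2 = kron(Z, A)            # annihilates a particle in mode 2, with sign (-1)^{n_1}
c1_dag, c2_dag = transpose(c1), transpose(c2)   # real matrices: dagger = transpose

IDENTITY = kron(I2, I2)
ZERO = [[0] * 4 for _ in range(4)]
n1 = matmul(c1_dag, c1)

state_10 = [0, 0, 1, 0]    # |n_1 = 1, n_2 = 0>, basis index = 2*n_1 + n_2
result = apply(c2_dag, state_10)
print(result)
```

Creating a particle in the second mode when the first is occupied returns the state $| 1, 1 \rangle$ with a factor of $-1$, as expected from $( - 1 )^{n_1}$.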
14.2
The quantization of the Dirac field proceeds exactly the same way as for
non-relativistic electrons^{153}. However, the negative-energy states will be described
with a different notation from the positive-energy states. The change of notation
is to reflect the intent of describing the (quasi-particle) excitations of
the system and not to describe the many-particle ground state, which is
unobservable. The wave functions $\psi_\alpha(\mathbf{r})$ describing the
positive-energy states of the non-interacting electrons are indexed by the set of
quantum numbers $\alpha \equiv (\mathbf{k}, \sigma)$. The negative-energy states are
described as the charge conjugates of the positive-energy states. Therefore, the
negative-energy states are described by the same set of indices and the
corresponding wave functions are denoted by $\psi^c_\alpha(\mathbf{r})$.
The annihilation operator for electrons in the positive-energy state $\alpha$ is
denoted by $c_\alpha$. However, the operator which removes an electron from the
(negative-energy) charge conjugated state $\psi^c_\alpha(\mathbf{r})$ is denoted by a
creation operator $b^\dagger_\alpha$. The change from annihilation operator to
creation operator merely represents that creating a positron with quantum numbers
$\alpha$ is equivalent to creating a hole in the negative-energy state^{154}.

153 W. Heisenberg and W. Pauli, Zeit. für Physik, 56, 1 (1929).
W. Heisenberg and W. Pauli, Zeit. für Physik, 59, 168 (1930).

The effect of the annihilation operators on Dirac's vacuum $| 0 \rangle$, in which
all the negative-energy states are fully occupied, is
$$ c_\alpha \, | 0 \rangle = 0 , \qquad b_\alpha \, | 0 \rangle = 0 \tag{2145} $$
where the first expression follows from the assumed absence of electrons in the
positive-energy states, and the second expression follows from the assumption
that all the negative-energy states are completely filled, so adding an extra
electron to the state $\psi^c_\alpha$ is forbidden by the Pauli exclusion principle.
More concisely, the above relations state that the vacuum contains neither
(positive-energy) electrons nor positrons. It is seen that the form of the
anticommutation relations is unchanged by this simple change of notation. The
anticommutation relations become
$$ \{ c^\dagger_\alpha , c^\dagger_\beta \}_+ = \{ c_\alpha , c_\beta \}_+ = 0 , \qquad \{ c_\alpha , c^\dagger_\beta \}_+ = \delta_{\alpha,\beta} \tag{2146} $$
$$ \{ b^\dagger_\alpha , b^\dagger_\beta \}_+ = \{ b_\alpha , b_\beta \}_+ = 0 , \qquad \{ b_\alpha , b^\dagger_\beta \}_+ = \delta_{\alpha,\beta} \tag{2147} $$
(2148)
The mixed electron/positron anticommutation relations are all zero, since the
operators describe electrons in different singleparticle energy eigenstates. In
154 W.
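$$ \hat{\psi}(\mathbf{r}) = \sum_\alpha \left( \psi_\alpha(\mathbf{r}) \, c_\alpha + \psi^c_\alpha(\mathbf{r}) \, b^\dagger_\alpha \right) \tag{2149} $$
and
$$ \hat{\psi}^\dagger(\mathbf{r}) = \sum_\alpha \left( \psi^\dagger_\alpha(\mathbf{r}) \, c^\dagger_\alpha + \psi^{c \dagger}_\alpha(\mathbf{r}) \, b_\alpha \right) \tag{2150} $$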
(2151)
(r)
=
= i h
c (0 )
(2152)
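Hence, one expects that the field operators $\hat{\psi}(\mathbf{r})$ and
$\hat{\psi}^\dagger(\mathbf{r})$ are canonically conjugate and, therefore, satisfy the
equal-time anticommutation relations
$$ \{ \hat{\psi}_\mu(\mathbf{r}) , \hat{\psi}^\dagger_\nu(\mathbf{r}') \}_+ = \delta^3( \mathbf{r} - \mathbf{r}' ) \, \delta_{\mu,\nu} \tag{2153} $$
where $\mu$ and $\nu$ label the components of the Dirac spinor. The anticommutation
relations for the field operators can be verified by noting that
$$ \{ \hat{\psi}_\mu(\mathbf{r}) , \hat{\psi}^\dagger_\nu(\mathbf{r}') \}_+ = \sum_{\alpha,\beta} \left( \{ c_\alpha , c^\dagger_\beta \}_+ \, \psi_{\alpha,\mu}(\mathbf{r}) \psi^\dagger_{\beta,\nu}(\mathbf{r}') + \{ c_\alpha , b_\beta \}_+ \, \psi_{\alpha,\mu}(\mathbf{r}) \psi^{c \dagger}_{\beta,\nu}(\mathbf{r}') + \{ b^\dagger_\alpha , c^\dagger_\beta \}_+ \, \psi^c_{\alpha,\mu}(\mathbf{r}) \psi^\dagger_{\beta,\nu}(\mathbf{r}') + \{ b^\dagger_\alpha , b_\beta \}_+ \, \psi^c_{\alpha,\mu}(\mathbf{r}) \psi^{c \dagger}_{\beta,\nu}(\mathbf{r}') \right) $$
$$ = \sum_{\alpha,\beta} \delta_{\alpha,\beta} \left( \psi_{\alpha,\mu}(\mathbf{r}) \psi^\dagger_{\beta,\nu}(\mathbf{r}') + \psi^c_{\alpha,\mu}(\mathbf{r}) \psi^{c \dagger}_{\beta,\nu}(\mathbf{r}') \right) $$
$$ = \sum_{\alpha} \left( \psi_{\alpha,\mu}(\mathbf{r}) \psi^\dagger_{\alpha,\nu}(\mathbf{r}') + \psi^c_{\alpha,\mu}(\mathbf{r}) \psi^{c \dagger}_{\alpha,\nu}(\mathbf{r}') \right) $$
$$ = \delta^3( \mathbf{r} - \mathbf{r}' ) \, \delta_{\mu,\nu} \tag{2154} $$
where the fermion anticommutation relations have been used in arriving at the
second line. The positive-energy states and their charge conjugated states form
a complete set of basis states for the single-particle Dirac equation, so their
completeness condition has been used in going from the third to the fourth line.

155 W. Heisenberg and W. Pauli, Zeit. für Physik, 56, 1 (1929).
W. Heisenberg and W. Pauli, Zeit. für Physik, 59, 168 (1930).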
The equal-time field anticommutation relations can be generalized to field
anticommutators at space-time points with a general type of separation. In the case
where the two field points $x$ and $x'$ have a space-like separation
$$ ( x - x' )^\mu ( x - x' )_\mu < 0 $$
causality dictates that the anticommutators are zero
$$ \{ \hat{\psi}(x) , \hat{\psi}^\dagger(x') \}_+ = 0 $$
That is, for space-like separations, there is no causal connection^{156}, so a
measurement of a local field at $x'$ cannot affect a measurement at $x$. N. Bohr and
L. Rosenfeld^{157} have put forward general arguments that the commutation relations
also place limitations on the measurement of fields at time-like separations.

Figure 70: Due to causality, the anticommutator of the field operator should
vanish for space-like separations. The anticommutators can be non-zero inside
or on the light cone. (Axes: $ct$ and $r$; the regions $x^2 > 0$ and $x^2 < 0$ are marked.)
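The Hamiltonian density for the (non-interacting) quantized Dirac field theory
can be expressed as the operator
$$ \hat{\mathcal{H}} = \hat{\psi}^\dagger \gamma^{(0)} c \left( - i \hbar \, \boldsymbol{\gamma} \cdot \nabla + m c \right) \hat{\psi} \tag{2155} $$
and the Hamiltonian operator is given by
$$ \hat{H} = \int d^3r \; \hat{\mathcal{H}} \tag{2156} $$
When the expansion of the quantized field in terms of single-particle wave
functions is substituted into the Hamiltonian, one finds
$$ \hat{H} = \sum_\alpha \left( E_\alpha \, c^\dagger_\alpha c_\alpha + E^c_\alpha \, b_\alpha b^\dagger_\alpha \right) $$
$$ = \sum_\alpha \left( E_\alpha \, c^\dagger_\alpha c_\alpha - E_\alpha \, b_\alpha b^\dagger_\alpha \right) \tag{2157} $$
where the expression for the energy of the charge conjugated state
$$ E^c_\alpha = - E_\alpha \tag{2158} $$
has been used.
156 Outside
On anticommuting the positron creation and annihilation operators, one finds
$$ \hat{H} = \sum_\alpha E_\alpha \left( c^\dagger_\alpha c_\alpha + b^\dagger_\alpha b_\alpha - 1 \right) \tag{2159} $$
The last term, when summed over $\alpha$, yields the infinitely negative energy of
Dirac's vacuum in which all the negative-energy states are filled. The vacuum
energy shall be used as the reference energy, so the Hamiltonian becomes
$$ \hat{H} = \sum_\alpha E_\alpha \left( c^\dagger_\alpha c_\alpha + b^\dagger_\alpha b_\alpha \right) \tag{2160} $$
which describes the energy of the excited state as the sum of the energies of the
excited electrons and the excited positrons. The energies of the positrons and
electrons are given by positive numbers.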
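The reordering that leads from eqn (2157) to eqn (2159) can be spelled out for a single mode $\alpha$. The following is only a sketch of the operator algebra, assuming the positron anticommutation relation $\{ b_\alpha , b^\dagger_\alpha \}_+ = 1$ and the relation $E^c_\alpha = - E_\alpha$ quoted above.

```latex
\begin{align}
E^c_\alpha \, b_\alpha b^\dagger_\alpha
  &= - E_\alpha \, b_\alpha b^\dagger_\alpha \\
  &= - E_\alpha \left( 1 - b^\dagger_\alpha b_\alpha \right) \\
  &= E_\alpha \, b^\dagger_\alpha b_\alpha \; - \; E_\alpha
\end{align}
```

The operator term counts positrons with a positive energy $E_\alpha$, while the constant $- E_\alpha$, summed over all modes, is the vacuum energy that is subsequently discarded.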
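The momentum operator defined by Noether's theorem is found as
$$ \hat{\mathbf{P}} = \sum_{\mathbf{k},\sigma} \hbar \mathbf{k} \left( c^\dagger_{k,\sigma} c_{k,\sigma} + b^\dagger_{k,\sigma} b_{k,\sigma} \right) \tag{2161} $$
which is just the sum of the momenta of the (positive-energy) electrons and the
positrons. The spin operator is defined as
$$ \hat{\mathbf{S}} = \frac{\hbar}{2} \int d^3r \; \hat{\psi}^\dagger \, \boldsymbol{\sigma} \, \hat{\psi} \tag{2162} $$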
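This is evaluated by substituting the expression for the field operators in terms
of the single-particle wave functions and the particle creation and annihilation
operators. The expectation value of the spin operator in the charge conjugated
state $\psi^c$ is given by
$$ \int d^3r \; \psi^{c \dagger}(\mathbf{r}) \, \boldsymbol{\sigma} \, \psi^c(\mathbf{r}) = \int d^3r \; \psi^T(\mathbf{r}) \, \gamma^{(2) \dagger} \, \boldsymbol{\sigma} \, \gamma^{(2)} \, \psi^*(\mathbf{r}) $$
$$ = - \int d^3r \; \psi^T(\mathbf{r}) \, \gamma^{(2)} \, \boldsymbol{\sigma} \, \gamma^{(2)} \, \psi^*(\mathbf{r}) $$
$$ = - \int d^3r \; \psi^T(\mathbf{r}) \, \boldsymbol{\sigma}^* \, \psi^*(\mathbf{r}) $$
$$ = - \int d^3r \; \psi^\dagger(\mathbf{r}) \, \boldsymbol{\sigma}^\dagger \, \psi(\mathbf{r}) \tag{2163} $$
$$ = - \int d^3r \; \psi^\dagger(\mathbf{r}) \, \boldsymbol{\sigma} \, \psi(\mathbf{r}) \tag{2164} $$
The last line follows since $\boldsymbol{\sigma}$ is Hermitean. Hence, the spin
operator is evaluated as
$$ \hat{\mathbf{S}} = \frac{\hbar}{2} \sum_{\mathbf{k}; \sigma', \sigma''} \chi^\dagger_{\sigma''} \, \boldsymbol{\sigma} \, \chi_{\sigma'} \left( c^\dagger_{k,\sigma''} c_{k,\sigma'} + b^\dagger_{k,\sigma''} b_{k,\sigma'} \right) \tag{2165} $$
which is just the sum of the spins of the electrons and positrons.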
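Finally, the conserved Noether charge corresponding to the global gauge
invariance is given by
$$ \hat{Q} = \int d^3r \; \hat{\psi}^\dagger(\mathbf{r}) \, \hat{\psi}(\mathbf{r}) $$
$$ = \sum_\alpha \left( c^\dagger_\alpha c_\alpha + b_\alpha b^\dagger_\alpha \right) $$
$$ = \sum_\alpha \left( c^\dagger_\alpha c_\alpha - b^\dagger_\alpha b_\alpha + 1 \right) \tag{2166} $$
The last term in the parenthesis, when summed over all states $\alpha$, yields the
total charge of the vacuum which is to be discarded. Hence, the observable charge is
defined as
$$ \hat{Q} = \sum_\alpha \left( c^\dagger_\alpha c_\alpha - b^\dagger_\alpha b_\alpha \right) \tag{2167} $$
which shows that the total electrical charge, defined as the difference between
the number of electrons and the number of positrons, is conserved.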
14.3
The Lagrangian density may possess continuous symmetries and it may also
possess discrete symmetries. Some of the discrete symmetries are examined below.
14.3.1
Parity
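The parity eigenvalue equation for a multi-particle state with parity $\eta_\psi$
can be expressed as
$$ \hat{P} \, | \psi \rangle = \eta_\psi \, | \psi \rangle \tag{2168} $$
Since the action of the parity operator on states is described by a unitary
operator, operators transform under parity according to the general form of a
unitary transformation
$$ \hat{\psi}'(\mathbf{r}') = \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger \tag{2169} $$
Since the field operator is expanded as
$$ \hat{\psi}(\mathbf{r}) = \sum_\alpha \left( c_\alpha \, \psi_\alpha(\mathbf{r}) + b^\dagger_\alpha \, \psi^c_\alpha(\mathbf{r}) \right) \tag{2170} $$
one has
$$ \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger = \sum_\alpha \left( c_\alpha \, \hat{P} \psi_\alpha(\mathbf{r}) \hat{P}^\dagger + b^\dagger_\alpha \, \hat{P} \psi^c_\alpha(\mathbf{r}) \hat{P}^\dagger \right) \tag{2171} $$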
(2172)
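where $\eta^P_\alpha$ is a phase factor which represents the intrinsic parity of the
state. Furthermore, since $\hat{P}^2 = \hat{I}$, the intrinsic parities
$\eta^P_\alpha$ and $\eta^{P c}_\alpha$ have to satisfy the conditions
$$ ( \eta^P_\alpha )^2 = 1 , \qquad ( \eta^{P c}_\alpha )^2 = 1 \tag{2173} $$
So the intrinsic parities are $\pm 1$. The intrinsic parity of a state
$\psi_\alpha(\mathbf{r})$ and its charge conjugated state $\psi^c_\alpha(\mathbf{r})$
are related by
$$ \eta^{P c}_\alpha = - \eta^P_\alpha \tag{2174} $$
This follows since charge conjugation flips the upper and lower two-component
spinors and these two-component spinors have opposite intrinsic parity. Therefore,
the state $\psi_\alpha(\mathbf{r})$ and the charge conjugated state
$\psi^c_\alpha(\mathbf{r})$ have opposite parities. It follows that the field
operator transforms as
$$ \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger = \sum_\alpha \eta^P_\alpha \left( c_\alpha \, \psi_{P\alpha}(\mathbf{r}) - b^\dagger_\alpha \, \psi^c_{P\alpha}(\mathbf{r}) \right) \tag{2175} $$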
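The relations between parity reversed states and parity reversed charge
conjugated states can be verified by examining the free particle solutions of the
Dirac equation and noting that the parity operator consists of the product of
$\gamma^{(0)}$ and spatial inversion $\mathbf{r} \to - \mathbf{r}$. Under this
operation, a wave function with momentum $\mathbf{k}$ and spin $\sigma$ becomes a
wave function with momentum $- \mathbf{k}$ and spin $\sigma$, up to a constant of
proportionality. A free particle momentum eigenstate is given by
$$ \psi_{\sigma,k}(x) = N \begin{pmatrix} \chi_\sigma \\ \frac{ c \hbar \, \mathbf{k} \cdot \boldsymbol{\sigma} }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp \left[ - i ( k_0 x^{(0)} - \mathbf{k} \cdot \mathbf{r} ) \right] \tag{2176} $$
The application of the parity operator to the above wave function yields
$$ \hat{P} \, \psi_{\sigma,k}(x) = N \, \gamma^{(0)} \begin{pmatrix} \chi_\sigma \\ \frac{ c \hbar \, \mathbf{k} \cdot \boldsymbol{\sigma} }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp \left[ - i ( k_0 x^{(0)} + \mathbf{k} \cdot \mathbf{r} ) \right] $$
$$ = N \begin{pmatrix} \chi_\sigma \\ \frac{ c \hbar \, ( - \mathbf{k} ) \cdot \boldsymbol{\sigma} }{ E + m c^2 } \chi_\sigma \end{pmatrix} \exp \left[ - i ( k_0 x^{(0)} + \mathbf{k} \cdot \mathbf{r} ) \right] $$
$$ = \psi_{\sigma,-k}(x) \tag{2177} $$
The charge conjugated wave function is
$$ \psi^c_{\sigma,k}(x) = \hat{C} \, \psi^*_{\sigma,k}(x) = N \, \hat{C} \begin{pmatrix} \chi^*_\sigma \\ \frac{ c \hbar \, \mathbf{k} \cdot \boldsymbol{\sigma}^* }{ E + m c^2 } \chi^*_\sigma \end{pmatrix} \exp \left[ + i k_\mu x^\mu \right] \tag{2178} $$
where
$$ \hat{C} = - i \gamma^{(2)} = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{2179} $$
so that
$$ \psi^c_{\sigma,k}(x) = - i N \begin{pmatrix} \sigma^{(2)} \frac{ c \hbar \, \mathbf{k} \cdot \boldsymbol{\sigma}^* }{ E + m c^2 } \chi^*_\sigma \\ - \sigma^{(2)} \chi^*_\sigma \end{pmatrix} \exp \left[ + i k_\mu x^\mu \right] \tag{2180} $$
The application of the parity operator to the charge conjugated wave function yields
$$ \hat{P} \, \psi^c_{\sigma,k}(x) = - i N \begin{pmatrix} \sigma^{(2)} \frac{ c \hbar \, \mathbf{k} \cdot \boldsymbol{\sigma}^* }{ E + m c^2 } \chi^*_\sigma \\ + \sigma^{(2)} \chi^*_\sigma \end{pmatrix} \exp \left[ + i ( k_0 x^{(0)} + \mathbf{k} \cdot \mathbf{r} ) \right] $$
$$ = - i N \begin{pmatrix} - \sigma^{(2)} \frac{ c \hbar \, ( - \mathbf{k} ) \cdot \boldsymbol{\sigma}^* }{ E + m c^2 } \chi^*_\sigma \\ + \sigma^{(2)} \chi^*_\sigma \end{pmatrix} \exp \left[ + i ( k_0 x^{(0)} + \mathbf{k} \cdot \mathbf{r} ) \right] $$
$$ = - \psi^c_{\sigma,-k}(x) \tag{2181} $$
where in the first line the parity operator has sent $\mathbf{r} \to - \mathbf{r}$
and the factor of $\gamma^{(0)}$ has flipped the sign of the lower components. In the
second line we have rewritten $\mathbf{k}$ as $- ( - \mathbf{k} )$ in the
two-component spinor, in anticipation of the comparison with eqn(2178) which allows
us to identify the factor of $- \psi^c_{\sigma,-k}(x)$. This example shows that a
state and its charge conjugate have opposite intrinsic parities.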
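The opposite intrinsic parities can be traced to the fact that $\gamma^{(0)}$ anticommutes with the charge conjugation matrix. The sketch below checks this with explicit matrices, assuming the Dirac representation in which $\gamma^{(0)}$ is diagonal, and applies it to a zero-momentum spinor chosen for illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(op, state):
    return [sum(op[i][j] * state[j] for j in range(len(state))) for i in range(len(op))]

def inner(a, b):
    """Inner product of two real column vectors."""
    return sum(x * y for x, y in zip(a, b))

GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
C = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]   # C = -i gamma^(2)

g0_c = matmul(GAMMA0, C)
c_g0 = matmul(C, GAMMA0)
anticommutes = all(g0_c[i][j] == -c_g0[i][j] for i in range(4) for j in range(4))

rest_spinor = [1, 0, 0, 0]                  # positive-energy spinor at k = 0 (real)
conjugate_spinor = apply(C, rest_spinor)    # complex conjugation is trivial here

parity_of_state = inner(rest_spinor, apply(GAMMA0, rest_spinor))
parity_of_conjugate = inner(conjugate_spinor, apply(GAMMA0, conjugate_spinor))
print(anticommutes, parity_of_state, parity_of_conjugate)
```

The rest-frame spinor is an eigenstate of $\gamma^{(0)}$ with eigenvalue $+1$, whereas its charge conjugate has eigenvalue $-1$.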
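From the general form of the parity transformation on Dirac spinors, one
infers that the parity transform of the field operator is given by
$$ \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger = \sum_\alpha \left( c_\alpha \, \eta^P_\alpha \, \psi_{P\alpha}(\mathbf{r}) - b^\dagger_\alpha \, \eta^P_\alpha \, \psi^c_{P\alpha}(\mathbf{r}) \right) \tag{2182} $$
where $P\alpha$ denotes the parity reversed quantum numbers. On relabeling the
summation index, this can be written as
$$ \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger = \sum_{P\alpha'} \left( \eta^P_{\alpha'} \, c_{P\alpha'} \, \psi_{\alpha'}(\mathbf{r}) - \eta^P_{\alpha'} \, b^\dagger_{P\alpha'} \, \psi^c_{\alpha'}(\mathbf{r}) \right) \tag{2183} $$
or
$$ \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger = \sum_{\alpha} \left( \eta^P_{\alpha} \, c_{P\alpha} \, \psi_{\alpha}(\mathbf{r}) - \eta^P_{\alpha} \, b^\dagger_{P\alpha} \, \psi^c_{\alpha}(\mathbf{r}) \right) \tag{2184} $$
Thus, the parity operation can also be interpreted as only affecting the particle
creation and annihilation operators, and not the wave functions. Quantum
mechanically, this interpretation corresponds to viewing the particles as
being transferred into their parity reversed states
$$ \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger = \sum_\alpha \left( \hat{P} c_\alpha \hat{P}^\dagger \, \psi_\alpha(\mathbf{r}) + \hat{P} b^\dagger_\alpha \hat{P}^\dagger \, \psi^c_\alpha(\mathbf{r}) \right) \tag{2185} $$
In this new interpretation, the effects of parity on the fermion operators are
found by identifying the operators multiplying the single-particle wave functions
in the previous two equations. The resulting operator equations are
$$ \hat{P} \, c_\alpha \, \hat{P}^\dagger = \eta^P_\alpha \, c_{P\alpha} \tag{2186} $$
and
$$ \hat{P} \, b^\dagger_\alpha \, \hat{P}^\dagger = - \eta^P_\alpha \, b^\dagger_{P\alpha} \tag{2187} $$
which shows that fermion particles and antiparticles have opposite intrinsic
parities. Therefore, we conclude that, irrespective of which interpretation is
used, the field operator transforms as
$$ \hat{P} \, \hat{\psi}(\mathbf{r}) \, \hat{P}^\dagger = \sum_\alpha \eta^P_\alpha \left( c_\alpha \, \psi_{P\alpha}(\mathbf{r}) - b^\dagger_\alpha \, \psi^c_{P\alpha}(\mathbf{r}) \right) \tag{2188} $$
which shows that the quantum field operator transforms in a similar fashion
to the classical field.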
14.3.2 Charge Conjugation
(2189)
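(up to an arbitrary phase) since this is how the single-particle wave functions
transform. Classically, the (antilinear) charge conjugation operator $\hat{C}$ is the
product of complex conjugation and the unitary matrix operator $- i \gamma^{(2)}$.
If the classical field is expressed as a linear superposition of energy
eigenfunctions, the amplitudes of the eigenfunctions are represented by complex
numbers. In the charge conjugated state, these amplitudes are replaced by the
complex conjugates. In the quantum field, the amplitudes must be replaced by
particle creation and annihilation operators. If an amplitude is associated with an
annihilation operator, then the complex conjugate of the amplitude is usually
associated with a creation operator. Hence, we should expect that charge
conjugation will result in the creation and annihilation operators being switched.
Since the quantum field operator is expressed as
$$ \hat{\psi}(\mathbf{r}) = \sum_\alpha \left( c_\alpha \, \psi_\alpha(\mathbf{r}) + b^\dagger_\alpha \, \psi^c_\alpha(\mathbf{r}) \right) \tag{2190} $$
one has
$$ \hat{\psi}^c(\mathbf{r}) = \hat{C} \, \hat{\psi}(\mathbf{r}) \, \hat{C}^\dagger = \sum_\alpha \left( c^\dagger_\alpha \, \hat{C} \psi_\alpha(\mathbf{r}) \hat{C}^\dagger + b_\alpha \, \hat{C} \psi^c_\alpha(\mathbf{r}) \hat{C}^\dagger \right) \tag{2191} $$
where, in accord with the earlier comment about the relation between the quantum
and classical fields, the single-particle operators have been replaced by their
Hermitean conjugates. However, under charge conjugation general Dirac spinors
satisfy
$$ \hat{C} \, \psi_\alpha(\mathbf{r}) \, \hat{C}^\dagger = \eta_c \, \psi^c_\alpha(\mathbf{r}) , \qquad \hat{C} \, \psi^c_\alpha(\mathbf{r}) \, \hat{C}^\dagger = \eta_c \, \psi_\alpha(\mathbf{r}) \tag{2192} $$
therefore,
$$ \hat{\psi}^c(\mathbf{r}) = \hat{C} \, \hat{\psi}(\mathbf{r}) \, \hat{C}^\dagger = \eta_c \sum_\alpha \left( c^\dagger_\alpha \, \psi^c_\alpha(\mathbf{r}) + b_\alpha \, \psi_\alpha(\mathbf{r}) \right) \tag{2193} $$
Alternatively, if charge conjugation is viewed as acting only on the creation and
annihilation operators, one has
$$ \hat{\psi}^c(\mathbf{r}) = \sum_\alpha \left( \hat{C} c_\alpha \hat{C}^\dagger \, \psi_\alpha(\mathbf{r}) + \hat{C} b^\dagger_\alpha \hat{C}^\dagger \, \psi^c_\alpha(\mathbf{r}) \right) \tag{2194} $$
For consistency, the two expressions for $\hat{\psi}^c(\mathbf{r})$ must be
equivalent. Hence, the operator coefficients of $\psi_\alpha(\mathbf{r})$ and
$\psi^c_\alpha(\mathbf{r})$ in the two expressions should be identical. Therefore,
one requires that
$$ \hat{C} \, c_\alpha \, \hat{C}^\dagger = \eta_c \, b_\alpha , \qquad \hat{C} \, b^\dagger_\alpha \, \hat{C}^\dagger = \eta_c \, c^\dagger_\alpha \tag{2195} $$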
(2196)
where $\hat{\psi}^\dagger$ is the Hermitean conjugate (column) field operator. Apart
from the replacement of the complex amplitudes with the Hermitean conjugates of the
creation and annihilation operators, the above expression is identical to the
expression for charge conjugation on the classical field.
The charge conjugation operator has the effect of reversing the current density
operator
$$ \hat{C} \, \hat{j}^\mu \, \hat{C}^\dagger = - \hat{j}^\mu \tag{2197} $$
which is understood as resulting from the change in sign of the charge.
14.3.3 Time Reversal
The time-reversal operation interchanges the past with the future. Time reversal
transforms the space-time coordinates via
$$ \hat{T} \, (ct, \mathbf{r}) = ( - ct, \mathbf{r}) \tag{2198} $$
Thus, under time reversal, the time and spatial components of the position
four-vector have different transformational properties. Furthermore, the
energy-momentum four-vector transforms as
$$ \hat{T} \, (p^{(0)}, \mathbf{p}) = (p^{(0)}, - \mathbf{p}) \tag{2199} $$
Hence, the position four-vector and momentum four-vector have different
transformational properties. Due to the above properties, angular momentum
(including spin) transforms as
$$ \hat{T} \, \mathbf{J} = - \mathbf{J} \tag{2200} $$
Therefore, we find that time reversal reverses momenta and flips spins.
According to the Wigner theorem^{158}, time reversal can only be implemented
by an antilinear antiunitary transformation. Since the time reversal operator
158 E. P. Wigner, Göttinger Nachr. 31, 546 (1932).
(2201)
(2203)
C  > =
C T  >
(2204)
(2205)
satisfies the Dirac equation with $t \to - t$. For example, the plane wave
solutions of the Dirac equation can be shown to transform as
$$ \hat{T} \, \psi_{\sigma,k}(\mathbf{r}, t) = \gamma^{(1)} \gamma^{(3)} \, \psi^*_{\sigma,k}(\mathbf{r}, - t) = \psi_{-\sigma,-k}(\mathbf{r}, t) \tag{2206} $$
which flips the momentum and the spin angular momentum. It should be noted
that the matrix operator $\gamma^{(1)} \gamma^{(3)}$ does not couple the upper and
lower two-component spinors, but nevertheless is closely related to the operator
$- i \gamma^{(2)}$ which occurs in the charge conjugation operator.
Also, if the Dirac field operator is required to satisfy
r) T = (1) (3) (t, r)
T (t,
(2207)
= cT
= bT
(2208)
Charge Conjugation
Parity
Time Reversal
CPT
= C P T (x)
(2)
= i
P T (x)
= + i (2) (0) (1) (3) (x)
= i (2) (0) (1) (3) (x)
= i (0) (1) (2) (3) (x)
= (4) (x)
159 J.
160 J.
(2209)
14.3.4 The CPT Theorem
The CPT theorem states that any local161 quantum field theory with a Hermitean, Lorentz-invariant Lagrangian which satisfies the spin-statistics theorem
is invariant under the compound operation $C P T$, where the operators can be
placed in any order.
The proof of the theorem relies on the fact that any Lorentz-invariant quantity must be created by contracting the indices of bilinear covariants (quantities such as the current density $j^{\mu}$ which involve products of the $\gamma^{\mu}$) with the
indices of contravariant derivatives $\partial^{\mu}$. Since the joint operation $P T$ results in
each of the contravariant derivatives in the product changing sign, the theorem ensures that the bilinear covariants with which the derivatives are contracted must undergo an equivalent number of sign changes
under the compound operation $C P T$. The theorem only assumes invariance
under proper orthochronous Lorentz transformations and makes no assumptions
about reflection. The improper transformations are treated as analytic continuations of the Lorentz transformation into complex space-time. The theorem was
first discussed by Lüders162 and Pauli163, and then by Lee, Oehme and Yang164.
The theorem has several consequences, such as the equality of the masses of
particles and their antiparticles. This follows since the mass $mc$ is an eigenvalue
of $\hat{p}^{(0)}$ in the particle's rest frame and since one can find simultaneous eigenstates
of the commuting operators $\hat{p}^{\mu}$ and the product $C P T$. If one denotes the
compound operator as
= C P T
(2210)
then
 > = < 
1
< H
1
= < 
H
1
 >
 >
H
(2211)
(2212)
(2213)
161 A Local Field Theory is one expressible in terms of a local Lagrangian density in which
interactions can be expressed in terms of products of fields at the same point in spacetime.
It would be truly remarkable if this concept were to continue to work at arbitrarily small
distances!
162 G. Lüders, Dan. Mat. Fys. Medd. 28, 5 (1954).
G. Lüders, Ann. Phys. 2, 1 (1957).
163 W. Pauli in Niels Bohr and the Development of Physics, Pergamon Press, London (1955).
164 T. D. Lee, R. Oehme and C. N. Yang, Phys. Rev. 106, 340 (1957).
(2214)
1
c
=
0 >
(2215)
(2216)
one finds that the energy of a particle is equal to the energy of an antiparticle
with a reversed spin. However, as the rest mass cannot depend on the angular
momentum, the mass of a particle is equal to the mass of its antiparticle. For
unstable particles, the equality of the mass of the particle and antiparticle is
ensured by the invariance of the S-matrix under $C P T$.
Likewise, one can use the CPT theorem to show that the total decay rate
of a particle into products is equal to the total decay rate of the antiparticle
into its products165. It should be noted that the partial decay rates into specific
final states need not be equal; only the sums over all final states are equal.
14.4 The Spin-Statistics Theorem
The above result for the energy operator of the Dirac field illustrates the Spin-Statistics
Theorem proposed by Pauli166. The theorem states that particles
with half-odd-integer spins obey Fermi-Dirac Statistics and particles with integer
spins obey Bose-Einstein Statistics. The Dirac spinor describes spin one-half
particles, and if these particles are chosen to satisfy anticommutation relations,
then the energy of the excited states is given by
$$\hat{H}_{Dirac} = \sum_{\sigma,k} E_{k} \left( \hat{c}^{\dagger}_{\sigma,k}\, \hat{c}_{\sigma,k} + \hat{b}^{\dagger}_{\sigma,k}\, \hat{b}_{\sigma,k} \right) \tag{2217}$$
which only has positive excitation energies. Hence, if the wave function changes
sign under the interchange of a pair of spin one-half particles, the energy is
bounded from below. If the field operators had been chosen to obey commutation relations, then the wave function would have been symmetric under the
165 T.
166 W.
interchange of particles. If this were the case, there would be a negative sign in
front of the positron energies so that the energy would have been unbounded
from below. This would have implied that the vacuum is not stable, so that
the theory would be untenable. This can be taken as implying that spin one-half particles must obey Fermi-Dirac Statistics. The other part of the theorem compels
integer spin particles to be bosons. Therefore, since photons have spin one, the
expression for the energy of the electromagnetic field is considered to be given
by
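$$\hat{H}_{Photon} = \sum_{k,\alpha} \frac{\hbar \omega_{k}}{2} \left( \hat{a}^{\dagger}_{k,\alpha}\, \hat{a}_{k,\alpha} + \hat{a}_{k,\alpha}\, \hat{a}^{\dagger}_{k,\alpha} \right) \tag{2218}$$
$$\hat{H}_{Photon} = \sum_{k,\alpha} \frac{\hbar \omega_{k}}{2} \left( 2\, \hat{a}^{\dagger}_{k,\alpha}\, \hat{a}_{k,\alpha} + 1 \right) \tag{2219}$$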
which is the sum of the vacuum energy (the zero-point energies) and the energies of each excited photon. The excitation energies are positive. If it had been
assumed that the photon wave functions were antisymmetric under the interchange of particles, then one would have found that the photon energies would
have been identically equal to zero. Furthermore, the excited photons would
have carried zero momentum and, therefore, be completely void of any physical consequence. Hence, one concludes that spin-one photons must obey Bose-Einstein Statistics. The generalized theorem167 is an assertion that a non-trivial
integer spin field cannot have an anticommutator that vanishes for space-like separations and a non-trivial odd half-integer spin field cannot have a commutator
that vanishes for space-like separations.
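The contrast between the two choices of statistics for the symmetrized single-mode energy $\frac{\hbar\omega}{2} ( a^{\dagger} a + a\, a^{\dagger} )$ can be illustrated with small matrices. The sketch below is an illustration only: it uses a two-state fermionic mode and a bosonic mode truncated at a hypothetical cut-off `N`, and the function name `symmetrized_levels` is an assumption made for this example.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def symmetrized_levels(lowering):
    """Diagonal of (a^dagger a + a a^dagger)/2, i.e. energies in units of hbar*omega."""
    raising = transpose(lowering)  # real matrices: transpose = Hermitean conjugate
    number = matmul(raising, lowering)
    reverse = matmul(lowering, raising)
    return [(number[i][i] + reverse[i][i]) / 2 for i in range(len(lowering))]

# Fermionic mode with two states (empty, occupied): c|1> = |0>
fermion = [[0.0, 1.0], [0.0, 0.0]]

# Bosonic mode truncated to N levels (hypothetical cut-off): a|n> = sqrt(n)|n-1>
N = 6
boson = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(N)]
         for i in range(N)]

fermion_levels = symmetrized_levels(fermion)
boson_levels = symmetrized_levels(boson)[:-1]  # top level is spoiled by the cut-off

print(fermion_levels)
print(boson_levels)
```

With anticommuting operators both states have the same energy, so the excitation energy vanishes, whereas the commuting operators reproduce the ladder $n + \frac{1}{2}$ of (2219).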
15
$$\begin{aligned} \phi_1 &= \Re e\, \phi_1 + i\, \Im m\, \phi_1 \\ \phi_2 &= \Re e\, \phi_2 + i\, \Im m\, \phi_2 \end{aligned} \tag{2221}$$
167 R. Streater and A. S. Wightman, PCT, Spin and Statistics, and All That, Princeton
Univ. Press (2000).
168 C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
This is equivalent to assuming four independent real fields. The inner product
is defined as
$$\Phi^{\dagger} \Phi = \phi_1^{*} \phi_1 + \phi_2^{*} \phi_2 \tag{2222}$$
15.1
U
= U
=
(2224)
(2225)
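$$\tau^{(1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \tau^{(2)} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \tau^{(3)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{2226}$$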
where these matrices generate a Lie algebra. That is, the algebra of the commutation relations is closed, since
$$[\, \tau^{(i)} , \tau^{(j)} \,] = 2\, i\, \epsilon^{i,j,k}\, \tau^{(k)} \tag{2227}$$
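The closure of the algebra in (2227) can be verified directly for all nine pairs of indices. The following is a minimal sketch in plain Python; the names `TAU`, `levi_civita` and `commutator` are chosen here for illustration, and the repeated index k on the right-hand side is summed over.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2

# The three Pauli matrices of (2226)
TAU = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def right_hand_side(i, j):
    """2 i sum_k epsilon^{i,j,k} tau^(k)."""
    return [[sum(2j * levi_civita(i, j, k) * TAU[k][r][c] for k in (1, 2, 3))
             for c in range(2)] for r in range(2)]

algebra_closes = all(
    commutator(TAU[i], TAU[j]) == right_hand_side(i, j)
    for i in (1, 2, 3) for j in (1, 2, 3)
)
print(algebra_closes)
```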
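$$\hat{U} = \exp\left[\, i \sum_{k} \alpha_{k}\, \tau^{(k)} \right] \tag{2228}$$
where the $\alpha_{k}$ are three real quantities. This represents an arbitrary rotation in
isospin space169. The U(1) gauge transformation can also be represented in the
same way. Namely, the U(1) transformation can be expressed as
$$\hat{U}_{0} = \exp\left[\, i\, \alpha^{(0)}\, \tau^{(0)} \right] \tag{2229}$$
where $\tau^{(0)}$ is the unit matrix
$$\tau^{(0)} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{2230}$$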
We should note that since $\tau^{(0)}$ commutes with all isospin operators, the U(1)
symmetry is decoupled from the SU(2) symmetry and hence, when a coupling
to gauge fields is introduced, the U(1) gauge field may have a coupling constant
which is different from the coupling constant for the three SU(2) gauge fields.
15.2
We shall start with a Lagrangian density $\mathcal{L}_{scalar}$ describing the field-free two-component
scalar field, given by
Lscalar = V ( )
(2231)
i g A
igA
=
i g A0
i g A0
(2233)
169 We shall not stop and contemplate the question of what requires our measurements to
be quantized along the isospin z-direction, and shall not ponder why there is a super-selection
rule at work.
Since
0
= U
(2234)
we require that
i g A0 0
= U
i g A
(2235)
A0 = U
g
(2236)
are
where the derivative only acts on the unitary transformation. Since the U
(k)
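$$\sum_{k=1}^{3} A^{\mu,k}\, \tau^{(k)} = \begin{pmatrix} A^{\mu,(3)} & A^{\mu,(1)} - i\, A^{\mu,(2)} \\ A^{\mu,(1)} + i\, A^{\mu,(2)} & - A^{\mu,(3)} \end{pmatrix} \tag{2237}$$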
15.3
We have four real four-vector fields $A^{\mu,k}$. These are the gauge fields. The free
gauge fields exist in the absence of the particles, and have a free Lagrangian. The
field strength tensors $F^{\mu,\nu}$ are given by the SU(2) generalized form of the EM
field tensor
$$F^{\mu,\nu} = D^{\mu} A^{\nu} - D^{\nu} A^{\mu} \tag{2239}$$
170 This can be related to the contravariant derivative familiar in the context of general
relativity, if one follows the logic adopted by Weyl and considers GR as a gauge field theory.
where $D^{\mu}$ is the covariant derivative only involving the SU(2) triplet of gauge
fields. It should be noted that since the gauge fields do not commute, this
involves terms which are second-order in the field amplitudes. That is
$$F^{\mu,\nu} = \partial^{\mu} A^{\nu} - \partial^{\nu} A^{\mu} - i\, g \left( A^{\mu} A^{\nu} - A^{\nu} A^{\mu} \right) \tag{2240}$$
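The quadratic terms can be evaluated by using the commutation relations of
the isospin operators $\tau^{(k)}$. The k-th component of the field tensor for the SU(2) triplet of gauge
fields is given by
$$\begin{aligned} F^{\mu,\nu}_{k} &= \partial^{\mu} A^{\nu}_{k} - \partial^{\nu} A^{\mu}_{k} + g \sum_{\{i,j\}=1}^{3} \epsilon^{i,j,k} \left( A^{\mu}_{i} A^{\nu}_{j} - A^{\nu}_{i} A^{\mu}_{j} \right) \\ &= \partial^{\mu} A^{\nu}_{k} - \partial^{\nu} A^{\mu}_{k} + 2\, g \sum_{\{i>j\}=1}^{3} \epsilon^{i,j,k} \left( A^{\mu}_{i} A^{\nu}_{j} - A^{\nu}_{i} A^{\mu}_{j} \right) \end{aligned} \tag{2241}$$
where the indices i and j are summed over and $\epsilon^{i,j,k}$ is the Levi-Civita symbol.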
In arriving at the above expression, we have used the identity
$$\tau^{(i)} \tau^{(j)} = \delta^{i,j}\, \tau^{(0)} + i \sum_{k=1}^{3} \epsilon^{i,j,k}\, \tau^{(k)} \tag{2242}$$
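The step from the matrix commutator in (2240) to the component form (2241) can be checked numerically. The sketch below first verifies the product identity (2242) and then compares $- i\, g\, [ A^{\mu} , A^{\nu} ]$ with $2 g$ times the isospin cross product of the component vectors. It assumes that the covariant derivative carries the sign $\partial^{\mu} - i g A^{\mu}$; the coupling `g` and the field components `a` and `b` are arbitrary illustrative numbers, not values from the text.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2

TAU = {
    0: [[1, 0], [0, 1]],
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def identity_holds(i, j):
    """Check tau^(i) tau^(j) = delta^{i,j} tau^(0) + i sum_k epsilon^{i,j,k} tau^(k)."""
    lhs = matmul(TAU[i], TAU[j])
    rhs = [[(i == j) * TAU[0][r][c]
            + sum(1j * levi_civita(i, j, k) * TAU[k][r][c] for k in (1, 2, 3))
            for c in range(2)] for r in range(2)]
    return lhs == rhs

product_identity = all(identity_holds(i, j) for i in (1, 2, 3) for j in (1, 2, 3))

def field_matrix(components):
    """Isospin matrix sum_k A_k tau^(k) for the three real components A_k."""
    return [[sum(components[k - 1] * TAU[k][r][c] for k in (1, 2, 3))
             for c in range(2)] for r in range(2)]

g = 0.7                      # illustrative coupling constant
a = [0.3, -1.2, 0.5]         # illustrative components A^mu_k
b = [1.1, 0.4, -0.8]         # illustrative components A^nu_k

ab = matmul(field_matrix(a), field_matrix(b))
ba = matmul(field_matrix(b), field_matrix(a))
quadratic_matrix = [[-1j * g * (ab[r][c] - ba[r][c]) for c in range(2)]
                    for r in range(2)]

cross = [a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0]]
quadratic_components = field_matrix([2 * g * x for x in cross])

max_deviation = max(abs(quadratic_matrix[r][c] - quadratic_components[r][c])
                    for r in range(2) for c in range(2))
print(product_identity, max_deviation)
```

The agreement of the two matrices is the statement that the quadratic term contributes $2 g\, \epsilon^{i,j,k} A^{\mu}_{i} A^{\nu}_{j}$ to the k-th component of the field tensor.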
(2243)
as expected for an electromagnetic field. Since the SU(2) gauge fields do not
commute, the field theory is a non-Abelian gauge field theory. Under an SU(2)
transformation, the field tensors transform according to
$$F^{\mu,\nu} \to F^{\mu,\nu\,\prime} = \hat{U}\, F^{\mu,\nu}\, \hat{U}^{\dagger} \tag{2244}$$
which is just a local unitary transform in isospin space. The Lagrangian density
for all the free gauge fields can be expressed as
$$\mathcal{L}_{gauge} = - \frac{1}{32 \pi}\, \mathrm{Trace} \left( F^{\mu,\nu} F_{\mu,\nu} \right) \tag{2245}$$
where the Trace is evaluated in isospin space and takes into account that there
are a total of four fields. The Lagrangian density can be expressed directly in
terms of the contributions from the four components of the field. The result can be
expressed as
$$\mathcal{L}_{gauge} = - \frac{1}{16 \pi} \sum_{k=0}^{3} F^{k}_{\mu,\nu}\, F^{\mu,\nu,k} \tag{2246}$$
(2247)
evaluated the product of the Pauli spin matrices and used the fact that the
Pauli spin matrices $\tau^{(k)}$ for $k \neq 0$ are traceless.
One can consider the k-components of the vector potential $A^{\mu}$ (i.e. the three
real components $A^{\mu}_{k}$ for fixed $\mu$) as forming three-vectors $\mathbf{A}^{\mu}$ in isospin space.
These quantities transform as three-vectors under transformations in isospin
space, and also the $A^{\mu}$ transform as four-vectors under Lorentz transformations
in Minkowski space-time. The three-vector fields are spin-one bosons with
isospin one. Hence, we might expect that the isospin triplet should contain two
oppositely charged particles and one uncharged particle. These particles are
supplemented by the particle corresponding to the single uncharged field $A^{(0)}$.
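In terms of this set of isospin vectors, the free gauge field Lagrangian density
can be written in the form of a sum of a scalar product in isospin space and an
isospin scalar
$$\begin{aligned} \mathcal{L}_{gauge} = &- \frac{1}{16 \pi} \left[ ( \partial_{\mu} \mathbf{A}_{\nu} - \partial_{\nu} \mathbf{A}_{\mu} ) + 2 g\, \mathbf{A}_{\mu} \times \mathbf{A}_{\nu} \right] \cdot \left[ ( \partial^{\mu} \mathbf{A}^{\nu} - \partial^{\nu} \mathbf{A}^{\mu} ) + 2 g\, \mathbf{A}^{\mu} \times \mathbf{A}^{\nu} \right] \\ &- \frac{1}{16 \pi} \left( \partial_{\mu} A^{(0)}_{\nu} - \partial_{\nu} A^{(0)}_{\mu} \right) \left( \partial^{\mu} A^{\nu,(0)} - \partial^{\nu} A^{\mu,(0)} \right) \end{aligned} \tag{2248}$$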
It should be noted that the Lagrangian reduces to the sum of four non-interacting
electromagnetic Lagrangians in the limit $g \to 0$. However, at finite values of
g, the Lagrangian density contains cubic and quartic interactions with coupling
strengths that are fixed by gauge invariance in terms of the single gauge parameter g.
Figure 71: The interaction vertices representing the interaction of three and
four isospin triplet gauge field bosons.
Exercise:
Determine the equations of motion for the vector gauge fields, in the presence
of a source term
1
Lint =
Trace ( A . j )
(2249)
c
where the current source j has also been decomposed in terms of Pauli spin
matrices.
It is convenient to introduce the two combinations
$$A^{\pm}_{\mu} = \frac{1}{\sqrt{2}} \left( A^{(1)}_{\mu} \mp i\, A^{(2)}_{\mu} \right) \tag{2250}$$
which appear in the isospin matrix form of $A_{\mu}$. These combinations are mutually
complex conjugate. Likewise, one can introduce the combinations of the field
tensors
$$F^{\pm}_{\mu,\nu} = \frac{1}{\sqrt{2}} \left( F^{(1)}_{\mu,\nu} \mp i\, F^{(2)}_{\mu,\nu} \right) \tag{2251}$$
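which are evaluated as
$$F^{\pm}_{\mu,\nu} = \left( \partial_{\mu} \mp 2\, i\, g\, A^{(3)}_{\mu} \right) A^{\pm}_{\nu} - \left( \partial_{\nu} \mp 2\, i\, g\, A^{(3)}_{\nu} \right) A^{\pm}_{\mu} \tag{2252}$$
$$F^{(3)}_{\mu,\nu} = \left( \partial_{\mu} A^{(3)}_{\nu} - \partial_{\nu} A^{(3)}_{\mu} \right) + 2\, i\, g \left( A^{-}_{\mu} A^{+}_{\nu} - A^{-}_{\nu} A^{+}_{\mu} \right) \tag{2253}$$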
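The algebra behind the charged combinations can be checked with a few numbers. The following sketch assumes the convention $A^{\pm} = ( A^{(1)} \mp i A^{(2)} ) / \sqrt{2}$ and tests two statements: that the quadratic term $2 g ( A^{(1)}_{\mu} A^{(2)}_{\nu} - A^{(2)}_{\mu} A^{(1)}_{\nu} )$ of the third component equals $2 i g ( A^{-}_{\mu} A^{+}_{\nu} - A^{-}_{\nu} A^{+}_{\mu} )$, and that the sum of squares of the first two components equals twice the product of the two charged combinations. All numerical values and the helper `charged` are hypothetical.

```python
import math

def charged(first, second, sign):
    """Return (first - sign * i * second) / sqrt(2); sign=+1 gives the '+' combination."""
    return (first - sign * 1j * second) / math.sqrt(2)

g = 0.65                                  # illustrative coupling constant
a1_mu, a1_nu = 0.4, -1.3                  # illustrative values of A^(1)_mu, A^(1)_nu
a2_mu, a2_nu = 0.9, 0.2                   # illustrative values of A^(2)_mu, A^(2)_nu

a_plus_mu, a_plus_nu = charged(a1_mu, a2_mu, +1), charged(a1_nu, a2_nu, +1)
a_minus_mu, a_minus_nu = charged(a1_mu, a2_mu, -1), charged(a1_nu, a2_nu, -1)

# Quadratic part of the third field-tensor component, in the two bases
cartesian_form = 2 * g * (a1_mu * a2_nu - a2_mu * a1_nu)
charged_form = 2j * g * (a_minus_mu * a_plus_nu - a_minus_nu * a_plus_mu)
quadratic_deviation = abs(cartesian_form - charged_form)

# Sum of squares of components 1 and 2 versus twice the product of the +/- combinations
f1, f2 = 0.75, -0.31                      # illustrative tensor components
f_plus, f_minus = charged(f1, f2, +1), charged(f1, f2, -1)
square_deviation = abs((f1 ** 2 + f2 ** 2) - 2 * f_minus * f_plus)

print(quadratic_deviation, square_deviation)
```

The second identity accounts for the relative factor of two between the coefficients $1/(16\pi)$ and $1/(8\pi)$ when the Lagrangian is rewritten in terms of the charged field tensors.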
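In terms of these new combinations, the free Lagrangian for the gauge fields
becomes
$$\mathcal{L}_{gauge} = - \frac{1}{16 \pi} F^{(0)}_{\mu,\nu} F^{\mu,\nu,(0)} - \frac{1}{16 \pi} F^{(3)}_{\mu,\nu} F^{\mu,\nu,(3)} - \frac{1}{8 \pi} F^{-}_{\mu,\nu} F^{\mu,\nu,+} \tag{2254}$$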
where the first two terms are recognized as being similar to the Lagrangian density for the electromagnetic field. It was first hypothesized by Sheldon Glashow
that the electroweak interaction is produced by the massless vector bosons described by the above Lagrangian171 . Masses for the gauge bosons should not
be added by hand, since the resulting theory would not be renormalizable. To
retain renormalizability of the theory, and to have massive vector bosons, we
need to break the symmetry.
15.4
We shall assume that our massive charged scalar boson field has broken symmetry172 . The small amplitude excitations of the field with broken symmetry will
171 S.
be modified, as will be the excitations of the gauge fields. Due to the symmetry-breaking of the scalar field, the U(1) vector gauge field will become coupled to
the triplet of SU(2) gauge fields. When the symmetry is broken, the elementary
excitations of the coupled system of fields change and these new excitations will
represent the observable particles.
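The potential for the two-component scalar field is chosen to be given by
$$V ( \Phi^{\dagger} \Phi ) = \left( \frac{m c}{2 \hbar \phi_0} \right)^{2} \left( \Phi^{\dagger} \Phi - \phi_0^{2} \right)^{2} \tag{2255}$$
where $\phi_0$ is a fixed real constant. The lowest-energy state described by this
potential is given by
$$\Phi^{\dagger} \Phi = \phi_0^{2} \tag{2256}$$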
This state is degenerate with respect to global rotations in the four-dimensional
space of $\Re e\, \phi_1$, $\Im m\, \phi_1$, $\Re e\, \phi_2$, $\Im m\, \phi_2$ which keep the magnitude of $\Phi^{\dagger} \Phi$
constant and uniform over space.
The symmetry is broken by assuming that the physical ground state corresponds to one specific choice of the uniform field $\Phi$. Given the specific ground
state which the system chooses spontaneously, one can make use of the global
gauge invariance to describe the ground state $\Phi_0$ as a field which has one non-zero component which is real. That is, the transformation parameters can be chosen so that
$$\Phi_0 = \begin{pmatrix} \Re e\, \phi_1 \\ 0 \end{pmatrix} = \begin{pmatrix} \phi_0 \\ 0 \end{pmatrix} \tag{2257}$$
The excited states can be expressed as
$$\Phi = \begin{pmatrix} \phi_0 + \phi_1 \\ 0 \end{pmatrix} \tag{2258}$$
where the local gauge degrees of freedom have been used to make $\phi_1$ real. This
excited field is invariant under the transformation
EM
0 = U
(2259)
EM is restricted to have the form
where U
1
0
EM =
U
(2260)
0 exp i
This is a transformation in which the U (1) transformation is combined with a
specific SU (2) transformation
0
exp + i 2
EM = exp i
(2261)
U
2
0
exp i 2
and will turn out to represent the residual U (1) gauge invariance of the electromagnetic field.
The Lagrangian density for the isospin doublet of scalar fields and their
couplings can be evaluated for the excited state as
$$\mathcal{L}_{scalar} = ( D_{\mu} \Phi )^{\dagger}\, D^{\mu} \Phi - \left( \frac{m c}{\hbar} \right)^{2} \phi_1^{2} \tag{2262}$$
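where the covariant derivative of the doublet of scalar fields is given by
$$D^{\mu} \Phi = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \partial^{\mu} \phi_1 - i\, g_0 \begin{pmatrix} A^{\mu,(0)} ( \phi_0 + \phi_1 ) \\ 0 \end{pmatrix} - i\, g \begin{pmatrix} A^{\mu,(3)} ( \phi_0 + \phi_1 ) \\ \sqrt{2}\, A^{\mu,-} ( \phi_0 + \phi_1 ) \end{pmatrix} \tag{2263}$$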
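A new interaction strength $\gamma$ can be defined as
$$\gamma = \sqrt{ g_0^{2} + g^{2} } \tag{2264}$$
and on defining an angle $\theta$ via
$$\tan \theta = \frac{g}{g_0} \tag{2265}$$
one has
$$\frac{g_0}{\gamma} = \cos \theta , \qquad \frac{g}{\gamma} = \sin \theta \tag{2266}$$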
Thus, the covariant derivative has the connection with the field
$$A^{\mu}_{Z} = \cos \theta\, A^{\mu,(0)} + \sin \theta\, A^{\mu,(3)} \tag{2267}$$
The field $A^{\mu}_{Z}$ will turn out to be the field that describes the neutral Z particle.
The field orthogonal to the Z field is defined as
$$A^{\mu}_{EM} = - \sin \theta\, A^{\mu,(0)} + \cos \theta\, A^{\mu,(3)} \tag{2268}$$
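The change of variables from $( A^{(0)} , A^{(3)} )$ to $( A_{Z} , A_{EM} )$ is a rotation through the angle $\theta$, which can be illustrated numerically. The sketch below uses arbitrary illustrative values for the couplings `g0`, `g` and for one component of each field; it checks that the combination $g_0 A^{(0)} + g A^{(3)}$ which multiplies the scalar field equals $\sqrt{g_0^{2} + g^{2}}$ times $A_{Z}$, and that rotating back through $-\theta$ recovers the original fields.

```python
import math

g0, g = 0.35, 0.65            # illustrative U(1) and SU(2) coupling constants
strength = math.hypot(g0, g)  # sqrt(g0^2 + g^2), the combined interaction strength
theta = math.atan2(g, g0)     # angle defined by tan(theta) = g / g0

a0, a3 = 1.7, -0.4            # illustrative values of one component of A^(0) and A^(3)

# Rotation to the Z and electromagnetic fields
a_z = math.cos(theta) * a0 + math.sin(theta) * a3
a_em = -math.sin(theta) * a0 + math.cos(theta) * a3

# Inverse rotation back to the original fields
a0_back = math.cos(theta) * a_z - math.sin(theta) * a_em
a3_back = math.sin(theta) * a_z + math.cos(theta) * a_em

# Only the Z combination appears in the neutral coupling to the scalar field
coupling_deviation = abs((g0 * a0 + g * a3) - strength * a_z)
inverse_deviation = max(abs(a0_back - a0), abs(a3_back - a3))

print(coupling_deviation, inverse_deviation)
```

Since the rotation is orthogonal, the field $A_{EM}$ drops out of the neutral coupling to the scalar doublet altogether, which is why it remains massless once the symmetry is broken.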
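When expressed in terms of the transformed fields and constants, the covariant
derivative terms become
$$D^{\mu} \Phi = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \partial^{\mu} \phi_1 - i\, ( \phi_0 + \phi_1 ) \begin{pmatrix} \gamma\, A^{\mu}_{Z} \\ g \sqrt{2}\, A^{\mu,-} \end{pmatrix} \tag{2269}$$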
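The lowest-order terms in the Lagrangian density of the non-uniform scalar field
and all its couplings to the gauge fields are expressed as
$$\mathcal{L}_{scalar} = \partial_{\mu} \phi_1\, \partial^{\mu} \phi_1 - \left( \frac{m c}{\hbar} \right)^{2} \phi_1^{2} + \gamma^{2} \phi_0^{2}\, A^{Z}_{\mu} A^{\mu,Z} + 2\, g^{2} \phi_0^{2}\, A^{+}_{\mu} A^{\mu,-} \tag{2270}$$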
The higher-order terms, which have been neglected, describe the self-interactions
of the scalar field and the residual interactions between the scalar field
and the gauge fields.
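The Lagrangian density for the free gauge field
$$\mathcal{L}_{gauge} = - \frac{1}{16 \pi} F^{(0)}_{\mu,\nu} F^{\mu,\nu,(0)} - \frac{1}{16 \pi} F^{(3)}_{\mu,\nu} F^{\mu,\nu,(3)} - \frac{1}{8 \pi} F^{-}_{\mu,\nu} F^{\mu,\nu,+} \tag{2271}$$
has to be expressed in terms of the new fields $A_{Z}$ and $A_{EM}$. The inverse
transform is given by
$$\begin{aligned} A^{\mu,(0)} &= \cos \theta\, A^{\mu}_{Z} - \sin \theta\, A^{\mu}_{EM} \\ A^{\mu,(3)} &= \sin \theta\, A^{\mu}_{Z} + \cos \theta\, A^{\mu}_{EM} \end{aligned} \tag{2272}$$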
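If one defines $F^{\mu,\nu}_{Z}$ and $F^{\mu,\nu}_{EM}$ as the transformed field tensors evaluated to lowest-order in the fields
$$\begin{aligned} F^{\mu,\nu}_{Z} &= \partial^{\mu} A^{\nu}_{Z} - \partial^{\nu} A^{\mu}_{Z} \\ F^{\mu,\nu}_{EM} &= \partial^{\mu} A^{\nu}_{EM} - \partial^{\nu} A^{\mu}_{EM} \end{aligned} \tag{2273}$$
then the field tensors are given by
$$\begin{aligned} F^{\mu,\nu}_{(0)} &= \cos \theta\, F^{\mu,\nu}_{Z} - \sin \theta\, F^{\mu,\nu}_{EM} \\ F^{\mu,\nu}_{(3)} &= \sin \theta\, F^{\mu,\nu}_{Z} + \cos \theta\, F^{\mu,\nu}_{EM} + 2\, i\, g \left( A^{\mu,-} A^{\nu,+} - A^{\nu,-} A^{\mu,+} \right) \end{aligned} \tag{2274}$$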
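The Lagrangian density describing the small amplitude excitations of the scalar
field and the gauge fields can be written as
$$\begin{aligned} \mathcal{L}_{Free} = &\; \partial_{\mu} \phi_1\, \partial^{\mu} \phi_1 - \left( \frac{m c}{\hbar} \right)^{2} \phi_1^{2} \\ &- \frac{1}{16 \pi} F_{\mu,\nu,Z}\, F^{\mu,\nu}_{Z} + \gamma^{2} \phi_0^{2}\, A^{Z}_{\mu} A^{\mu,Z} \\ &- \frac{1}{16 \pi} F_{\mu,\nu,EM}\, F^{\mu,\nu}_{EM} \\ &- \frac{1}{8 \pi} F^{-}_{\mu,\nu} F^{\mu,\nu,+} + 2\, g^{2} \phi_0^{2}\, A^{+}_{\mu} A^{\mu,-} \end{aligned} \tag{2275}$$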
In electroweak theory, the first term represents the free uncharged scalar
boson. The second term describes an uncharged vector particle with mass $M_{Z}$
proportional to $\gamma\, \phi_0$. The third term describes the uncharged massless vector
particle known as the photon. From the equations of motion for $A^{\mu,\pm}$, the
remaining term can be shown to describe a pair of charged particles with masses
$M_{W}$ proportional to $g\, \phi_0 = \gamma \sin \theta\, \phi_0$. These particles are known as the $W^{+}$
and $W^{-}$ particles. The $W^{+}$ and $W^{-}$ particles are charged and the observed
charges are $\pm e$. The interaction mediated by the massive vector bosons is found
to have a finite range ($\sim 10^{-18}$ m), and is responsible for the weak interaction.
The experimentally determined masses173 are $M_{W} c^{2} \approx 80.33$ GeV and $M_{Z} c^{2} \approx 91.187$ GeV.
Nearly all the parameters of this theory have been determined
through experiment; the only exception is the mass m of the scalar particle
which remains to be discovered. The ratio of the masses determines the angle $\theta$
via174
$$\frac{M_{W}}{M_{Z}} = \sin \theta \tag{2276}$$
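which yields $\sin \theta \approx 0.8809$. The $W^{\pm}$ particles carry electrical charges $\pm e$ since
they couple to the electromagnetic field. This can be seen by examining the
$W^{\pm}$ field tensor
$$F^{\mu,\nu}_{\pm} = \left( \partial^{\mu} \mp 2\, i\, g\, A^{\mu,(3)} \right) A^{\nu,\pm} - \left( \partial^{\nu} \mp 2\, i\, g\, A^{\nu,(3)} \right) A^{\mu,\pm} \tag{2277}$$
where
$$A^{\mu,(3)} = \sin \theta\, A^{\mu}_{Z} + \cos \theta\, A^{\mu}_{EM} \tag{2278}$$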
Therefore, the covariant derivatives of the fields $D^{\mu} A^{\nu,\pm}$ couple them to the electromagnetic field $A_{EM}$ with either a positive or negative coupling constant of
magnitude $2\, g \cos \theta$. Since only electrically charged particles couple to the
electromagnetic field, one can make the identification
$$\frac{e}{\hbar c} = 2\, g \cos \theta = \gamma \sin 2 \theta \tag{2279}$$
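The numerical value of the mixing angle follows from the quoted masses by simple arithmetic. The short sketch below evaluates the ratio in (2276), the angle $\theta$, and the conventional Weinberg angle $\theta_{W}$ defined through $\cos \theta_{W} = M_{W} / M_{Z}$; the variable names are illustrative and the masses are the values quoted in the text.

```python
import math

M_W = 80.33     # W boson rest energy in GeV, as quoted in the text
M_Z = 91.187    # Z boson rest energy in GeV, as quoted in the text

sin_theta = M_W / M_Z                 # mass ratio of (2276)
theta = math.asin(sin_theta)          # mixing angle used in this section
theta_w = math.acos(M_W / M_Z)        # conventional Weinberg angle
sin2_theta_w = math.sin(theta_w) ** 2

print(f"sin(theta)      = {sin_theta:.4f}")
print(f"theta           = {math.degrees(theta):.2f} degrees")
print(f"theta_W         = {math.degrees(theta_w):.2f} degrees")
print(f"sin^2(theta_W)  = {sin2_theta_w:.4f}")
```

The two angles are complementary, $\theta + \theta_{W} = \pi / 2$, so that the $\cos \theta$ which enters the electromagnetic coupling of the W particles is the same quantity as $\sin \theta_{W}$ in the conventional notation.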
173 G. Arnison, A. Astbury, B. Aubert, et al., Phys. Lett. B 122, 103-116 (1983).
G. Arnison, O. C. Allkofer, A. Astbury, et al., Phys. Lett. B 147, 493-508 (1984).
174 It is customary to define the Weinberg angle $\theta_{W}$ via $\cos \theta_{W} = M_{W} / M_{Z}$.
175 Physics Letters B 716, 1 (2012); Physics Letters B 716, 30 (2012).
176 G. 't Hooft, Nuclear Physics B 35, 167 (1971); G. 't Hooft and M. Veltman, Nuclear
Physics B 44, 189 (1972).