Physics 73 Lecture Notes on Thermodynamics, Relativity and Quantum Physics

Physics 73 Lecture Notes (Thermodynamics, Special Relativity,
and Quantum Physics)

Reginald Christian S. Bernardo
Theoretical Physics Group, National Institute of Physics, University of the Philippines, Diliman,
Quezon City 1101, Philippines
email: rbernardo@nip.upd.edu.ph
1 Thermodynamics
Definitions:
Thermodynamics - phenomenological description of properties of macroscopic systems in thermal
equilibrium.
phenomenological - empirically based, experiment based
macroscopic - involving a large number of components, constituents (degrees of freedom)
thermal equilibrium - thermal properties of an object are uniform throughout its body and not
changing in time
1.1 Zeroth Law and Temperature Scales

Zeroth Law of Thermodynamics
Figure 1: Thermodynamic systems A and B are connected thermally to a thermodynamic system C through a
thermal conductor. Systems A and B are separated by an insulator. The whole system is assumed to be inside
an insulating box.
A, B, C - thermodynamic systems
insulator - does not allow the transfer of energy
conductor - allows the transfer of energy
If two thermodynamic systems A and B are separately in thermal equilibrium with a third system C,
then the two systems A and B are in thermal equilibrium with each other.
Consequences of the zeroth law:
1. allows for the definition of the temperature T (something must be the same for systems in thermal
equilibrium)
2. allows the creation of thermometers (system C can be a thermometer)
The condition for thermal equilibrium between two systems 1 and 2 can then be written as
$T_1 = T_2.$ (1.1)
This is in addition to the conditions $P_1 = P_2$ and $\mu_1 = \mu_2$ for equilibrium between two systems.
Note: A thermometer actually measures its own temperature. When a thermometer is used to measure the
temperature of a body, it is assumed that the thermometer and the body are in thermal equilibrium.
Different objects react to temperature changes differently, which gives rise to different kinds of
thermometers: liquid-in-glass, bimetallic-strip, gas-pressure, and resistance thermometers.
Figure 2: liquid-in-glass thermometer
Figure 3: bimetallic strip-based thermometer
Figure 4: gas-pressure thermometer
Temperature scales can be defined using two thermodynamic points. Conveniently chosen points are
the freezing pt. and boiling pt. of pure water. In the Celsius temperature scale, the freezing pt.
of pure water is set to 0 °C while the boiling pt. of pure water is set to 100 °C. Using a liquid-in-glass
thermometer, or any thermometer based on a property that changes linearly with the temperature change
$\Delta T$, the Celsius temperature scale can be created by making 100 equally spaced divisions between the
freezing pt. and boiling pt. marks. In the Fahrenheit temperature scale, the freezing pt. is set to 32 °F
while the boiling pt. is set to 212 °F. To convert from the Celsius temperature $T_C$ to the Fahrenheit
temperature $T_F$, we can use the following equation:
$T_F = \frac{9}{5} T_C + 32.$ (1.2)
Can you derive this equation?
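One way to see where Eq. 1.2 comes from is to treat both scales as linear and match them at the two
fixed points. The sketch below does this numerically; the function names are illustrative choices
and not part of the notes.

```python
def two_point_scale(t, src_low, src_high, dst_low, dst_high):
    """Map a reading t between two linear scales that agree at two fixed points."""
    slope = (dst_high - dst_low) / (src_high - src_low)
    return dst_low + slope * (t - src_low)


def celsius_to_fahrenheit(t_c):
    # Fixed points of pure water: freezing (0 C, 32 F) and boiling (100 C, 212 F).
    # The slope is (212 - 32) / (100 - 0) = 9/5 and the offset is 32, as in Eq. 1.2.
    return two_point_scale(t_c, 0.0, 100.0, 32.0, 212.0)


for t_c in (0.0, 37.0, 100.0):
    print(t_c, "C ->", round(celsius_to_fahrenheit(t_c), 1), "F")
```

The same helper works for any pair of linear scales once their two fixed points are known.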
Kelvin Scale
In principle, we can calibrate different thermometers so that their readings agree at the
thermodynamic points corresponding to temperatures $T_1$ and $T_2$. For example, we can set the readings
to agree at 0 °C and 100 °C for the freezing and boiling pts. of water. But this does not necessarily mean
that the intermediate temperatures for the two thermometers would agree. Why?
In connection with this, it would be convenient to have a temperature scale that is the same for all
objects. This leads us to the Kelvin temperature scale. The results of the experiment (Figure 5) show that
the hypothetical temperature $T_0$ at which the pressure of the gas goes to zero is independent of the
amount of gas and the type of gas.
Figure 5: Pressure-temperature graph for gases of different amounts and types.
This hypothetical temperature, which is −273.15 °C, can be used to define the Kelvin scale. The
conversion from the Celsius scale $T_C$ to the Kelvin scale $T_K$ can be done using
$T_K = T_C + 273.15.$ (1.3)
The Kelvin scale (or absolute temperature scale) can be defined using only a single reference temperature
using the known properties of gases. The triple pt. of water (where solid, liquid, and gas phases coexist)
is conveniently used to define the Kelvin scale:
$T = \frac{T_{\mathrm{triple}}}{P_{\mathrm{triple}}} P$ (1.4)
where $T_{\mathrm{triple}}$ and $P_{\mathrm{triple}}$ are the temperature and pressure, respectively, at the triple pt. of water.
1.2 Thermal Expansion

The sizes of objects increase or decrease in response to temperature changes. For solids and liquids, it
turns out that the change in size is proportional to the temperature change $\Delta T$. For linear dimensions,
the change $\Delta L$ in the linear dimension for a temperature change $\Delta T$ is given by
$\Delta L = \alpha L \Delta T$ (1.5)
where $L$ is the initial length and $\alpha$ is the coefficient of linear expansion. In the way Eq. 1.5 is written,
the coefficient of thermal expansion is an average over the temperature interval $\Delta T$:
$\alpha = \frac{1}{L} \frac{\Delta L}{\Delta T}.$ (1.6)

In general, the coefficient of thermal expansion would change within finite temperature intervals $\Delta T$.
This invites us to define the coefficient of linear expansion at a definite temperature $T$ as
$\alpha = \lim_{\Delta T \to 0} \frac{1}{L} \frac{\Delta L}{\Delta T} = \frac{1}{L} \frac{dL}{dT}.$ (1.7)
In problems where the coefficient of thermal expansion changes significantly, we can start with the
infinitesimal form of Eq. 1.5, which is given by
$dL = \alpha L \, dT.$ (1.8)
To calculate the change in length over a finite temperature interval $\Delta T = T_2 - T_1$ we then integrate:
$\Delta L = \int_{T_1}^{T_2} \alpha(T) L \, dT.$ (1.9)
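As a rough numerical illustration of Eq. 1.9, the sketch below integrates a temperature-dependent
coefficient with the trapezoidal rule. The linear form of alpha(T) and its numbers are hypothetical,
chosen only for illustration. Two versions are compared: holding L at its initial value inside the
integral, and solving dL = alpha L dT exactly, which gives L2 = L1 exp(integral of alpha dT).

```python
import math


def trapezoid(f, lo, hi, n=1000):
    """Composite trapezoidal rule for the integral of f from lo to hi."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def delta_length_first_order(length0, alpha, t1, t2):
    """Eq. 1.9 with L held at its initial value inside the integral."""
    return length0 * trapezoid(alpha, t1, t2)


def delta_length_exact(length0, alpha, t1, t2):
    """Exact solution of dL = alpha(T) L dT (Eq. 1.8)."""
    return length0 * (math.exp(trapezoid(alpha, t1, t2)) - 1.0)


def alpha(t):
    # Hypothetical coefficient in 1/K: 11e-6 at 0 C, growing by 0.1 % per degree.
    return 11e-6 * (1.0 + 1.0e-3 * t)


print(delta_length_first_order(1000.0, alpha, 0.0, 30.0))
print(delta_length_exact(1000.0, alpha, 0.0, 30.0))
```

For typical solids the product of alpha and the temperature change is tiny, so the two results differ
only in the fourth significant figure or beyond.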
Volume Expansion
All the dimensions of an object respond to a temperature change. If all the linear dimensions expand
according to Eq. 1.5, we can show that the change in the volume $\Delta V$ is
$\Delta V = \beta V \Delta T$ (1.10)
where $V$ is the initial volume and $\beta$ is the coefficient of volume expansion. As an exercise, try to show
that the coefficient of volume expansion is related to the coefficient of linear expansion through
$\beta = 3\alpha.$ (1.11)
You can start by differentiating the volume of a rectangular block with sides $L_1$, $L_2$, and $L_3$. As in
linear expansion, the coefficient of volume expansion might, for some objects, change significantly over
given temperature intervals. Similar infinitesimal relations hold for volume expansion as in linear
expansion. What are these?
Exercises (Tipler)
1. A steel bridge is 1000 m long. By how much does it expand when the temperature rises from 0 °C to
30 °C? (Given $\alpha = 11 \times 10^{-6}\ \mathrm{K^{-1}}$.) Ans. 0.33 m = 33 cm.
2. A 1-L glass flask is filled to the brim with alcohol at 10 °C. If the temperature is raised to 30 °C, how
much alcohol spills out of the glass flask? (Given $\beta_{\mathrm{alcohol}} = 1.1 \times 10^{-3}\ \mathrm{K^{-1}}$ and $\alpha_{\mathrm{glass}} = 9 \times 10^{-6}\ \mathrm{K^{-1}}$.)
Ans. 21.5 mL.
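The two exercises above can be checked with a few lines of arithmetic. The sketch below assumes that
the glass coefficient quoted in Exercise 2 is a linear coefficient, so that the flask's volume
coefficient is three times that value (Eq. 1.11); the helper names are illustrative.

```python
def linear_expansion(length, alpha, delta_t):
    """Change in length from Eq. 1.5."""
    return alpha * length * delta_t


def volume_expansion(volume, beta, delta_t):
    """Change in volume from Eq. 1.10."""
    return beta * volume * delta_t


# Exercise 1: steel bridge, 1000 m, heated by 30 K.
bridge_m = linear_expansion(1000.0, 11e-6, 30.0)

# Exercise 2: alcohol expands more than the flask; the difference spills out.
beta_glass = 3 * 9e-6  # assumes the quoted glass value is the linear coefficient
spill_litres = volume_expansion(1.0, 1.1e-3, 20.0) - volume_expansion(1.0, beta_glass, 20.0)

print(f"bridge: {bridge_m:.2f} m, spill: {spill_litres * 1000:.1f} mL")
```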
Thermal Stress
Heat up an object and restrain its expansion or contraction (say, using clamps). Young's
modulus is defined to be
$Y = \frac{\text{stress}}{\text{strain}} = \frac{F/A}{\Delta L/L}.$ (1.12)
The stress applied by the restraints must be equal to the thermal stress to keep the size of the object
the same. In this case, it follows from Eq. 1.12 that
$\frac{F}{A} = Y \frac{\Delta L}{L} = Y \alpha \Delta T.$ (1.13)
Example (Tipler)
3. A copper bar is heated to 300 °C and is then clamped rigidly between two fixed points so that it can
neither expand nor contract. If the breaking stress of copper is $230\ \mathrm{MN/m^2}$, at what temperature will the
bar break as it cools? (Given $Y = 100\ \mathrm{GN/m^2}$.) Ans. 177 °C.

Figure 6: Joule's experiment: work is done on the water by stirring.
1.3 Heat, Specific Heat and Heat Capacity, Calorimetry

Heat
Heat - energy transferred from one system to another due to temperature difference.
Results of Joule's experiment:

1. Temp. of system is raised by doing work on the system
2. Temp. of system is raised by adding heat
We define the calorie as the energy needed to raise the temperature of 1 g of water by 1 °C. Here are useful
conversion factors:
$4.184\ \mathrm{J} = 1\ \mathrm{cal}$ (1.14)

$1\ \mathrm{Btu} = 252\ \mathrm{cal}$ (1.15)
Heat Capacity and Specific Heat

The amount of heat needed to raise the temperature of the system is proportional to the temperature
change and the mass of the system. This is a conclusion that can be obtained for water using Joule's
experiment. It turns out that this result also extends to many thermodynamic systems. We call the
proportionality constant the specific heat. In symbols, the heat $Q$ released or absorbed by a system of
mass $m$ undergoing a temperature change $\Delta T$ is
$Q = mc\Delta T$ (1.16)
where $c$ is the specific heat. The specific heat can, in general, depend on the temperature, and Eq. 1.16
gives us a way to calculate average values of $c$ over a temperature change $\Delta T$. If we want to measure
the specific heat at a particular temperature, we take the limit
$c = \lim_{\Delta T \to 0} \frac{1}{m} \frac{Q}{\Delta T} = \frac{1}{m} \frac{dQ}{dT}.$ (1.17)
The heat capacity C is defined as follows:
heat capacity = C = heat needed to raise the temp. of substance by 1 K. (1.18)
From Eq. 1.16, we can write down $Q = mc\Delta T = (mc)\Delta T$ and identify the heat capacity as
$C = mc.$ (1.19)
In many instances, it is also useful to express quantities in moles. It is convenient to define the molar
heat capacity or molar specific heat. Note that the mass $m$ of a substance can be expressed as
$m = nM$ (1.20)
where $n$ is the number of moles and $M$ is the molar mass. Using Eq. 1.20 we can write down Eq. 1.16
as $Q = mc\Delta T = n(Mc)\Delta T$ and identify the molar heat capacity:
$C' = Mc.$ (1.21)

Figure 7: Basic steps in calorimetry.
In terms of the molar heat capacity, we can express the heat corresponding to a temperature change $\Delta T$
as
$Q = nC'\Delta T.$ (1.22)
Calorimetry
calorimetry - procedure to measure the specific heat of an object
In calorimetry, we assume that the system is isolated from the surroundings. In this way, all of the
energy transfer takes place between the calorimeter and the objects inside it. Because the object is
heated beforehand, it is at an initial temperature $T_i$ that is higher than the final temperature $T_f$. It
therefore releases heat ($Q < 0$) while the calorimeter and the water absorb this released heat:
$Q_{\mathrm{out}} = mc(T_f - T_i) < 0$ (1.23)
$Q_{\mathrm{in}} = m_W c_W (T_f - T_{i,W}) + m_C c_C (T_f - T_{i,W})$ (1.24)
where subscripts $W$ and $C$ refer to the water and the calorimeter, respectively, both of which start at
the temperature $T_{i,W}$. Being isolated from the surroundings, we have
$Q_{\mathrm{out}} + Q_{\mathrm{in}} = 0.$ (1.25)
Try: Express the specific heat c of the object using the equations above.
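A minimal sketch of the energy balance in Eqs. 1.23 to 1.25, solved for the unknown specific heat,
might look as follows. The function name and argument layout are illustrative. The numbers are those
of the lead-shot exercise later in this section, with the specific heat of water taken as
4.18 kJ/(kg K), which is an assumed value not quoted in the notes.

```python
def specific_heat_from_calorimetry(m_obj, t_obj, t_final, absorbers, t_initial):
    """Solve Q_out + Q_in = 0 for the specific heat of the hot object.

    absorbers is a list of (mass, specific heat) pairs for everything that
    starts at t_initial (here: the water and the calorimeter).
    """
    q_in = sum(m * c * (t_final - t_initial) for m, c in absorbers)
    # Q_out = m_obj * c * (t_final - t_obj), so c = Q_in / (m_obj * (t_obj - t_final)).
    return q_in / (m_obj * (t_obj - t_final))


c_lead = specific_heat_from_calorimetry(
    m_obj=0.600,            # kg of lead shot
    t_obj=100.0,            # initial temperature of the lead, in C
    t_final=20.0,           # final temperature of the mixture, in C
    absorbers=[(0.500, 4.18),    # water: kg, kJ/(kg K) (assumed value)
               (0.200, 0.900)],  # aluminum calorimeter: kg, kJ/(kg K)
    t_initial=17.3,
)
print(f"c_lead = {c_lead:.3f} kJ/(kg K)")
```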
Phase Changes and Latent Heat

We define the phase as a specific state of matter, e.g. solid, liquid, gas. Transitions from one phase to
another occur frequently and cover one of the most exciting and most complicated areas of interest. We
define the following terms:
freezing (solidification) - liquid to solid
melting (fusion) - solid to liquid
vaporization - liquid to gas
condensation - gas to liquid
sublimation - solid to gas
deposition - gas to solid
Large amounts of energy are needed to go from one phase of matter to another. At the transition
point, the energy supplied is used to break the molecular bonds of the system, thus changing the phase.
To change the phase of a substance of mass m from solid to liquid, we need the amount of heat
Qf = mLf (1.26)
where Lf is called the latent heat of fusion. To change the phase of a substance of mass m from liquid
to gas, we need the amount of heat
Qv = mLv (1.27)
where Lv is called the latent heat of vaporization.

Figure 8: Temperature vs. time for a specimen of water initially in the solid phase.
Exercises
1. (Tipler) To measure the specific heat of lead, you heat 600 g of lead shot to 100 °C and place it
in an aluminum calorimeter of mass 200 g that contains 500 g of water initially at 17.3 °C. If the
final temperature of the mixture is 20.0 °C, what is the specific heat of lead? (The specific heat of
the aluminum container is 0.900 kJ/(kg·K).) Ans. 0.128 kJ/(kg·K).
2. (Young and Freedman) A heavy copper pot of mass 2.0 kg (including the copper lid) is at a
temperature of 150 °C. You pour 0.10 kg of water at 25 °C into the pot and quickly close the
lid of the pot so that no steam can escape. Find the final temperature of the pot and its contents
and determine the phase of the water. Assume that no heat is lost to the surroundings. Ans.
$T_f$ = 100 °C, partly liquid, partly gas.
1.4 Mechanisms of Heat Transfer

There are three ways for transferring thermal energy. These are conduction, convection, and radiation.
Conduction - thermal energy is transferred by interaction of atoms or molecules but may or may not
involve transfer of atoms or molecules
Figure 9: Thermal conduction by transfer of energy between consecutive atoms
Heat Current - amount of heat $dQ$ transferred per unit time, $H = dQ/dt$.

Experimental results for conduction:
1. Heat current is proportional to the temperature gradient
2. Heat current is proportional to the cross-sectional area
The proportionality constant is called the thermal conductivity, which we denote as $k$. In symbols, the
heat current for conduction is given by
$H = kA \frac{\Delta T}{\Delta x}$ (1.28)
where $H$ is the heat current, $A$ is the cross-sectional area, $\Delta T/\Delta x$ is the temperature gradient, and $k$
is the thermal conductivity. We can establish the connection with $I = V/R$ for electrical conduction by
writing down Eq. 1.28 as
$H = \frac{\Delta T}{\frac{1}{k} \frac{\Delta x}{A}}.$ (1.29)
We identify the thermal resistance as
$R = \frac{1}{k} \frac{\Delta x}{A}.$ (1.30)
Consider a series connection of thermal conductors as shown in Figure 10. Because the connection is in
series, the current flowing through each of the conductors is the same.
Figure 10: Series connection of two thermal conductors
We thus have
$\frac{T_2 - T_1}{R_1} = \frac{T_3 - T_2}{R_2} = H.$ (1.31)
The above relation can be used, for example, to calculate the temperature at the junction between two
conductors. The current flowing through the whole system can be written as
$H = \frac{T_3 - T_1}{R_{\mathrm{eq}}}.$ (1.32)
We can easily identify that
$R_{\mathrm{eq}} = R_1 + R_2.$ (1.33)
In general, the arguments can be extended to calculate the equivalent thermal resistance for a series of
$N$ conductors with resistances $R_1$, $R_2$, etc. The result is
$R_{\mathrm{eq}} = R_1 + R_2 + \dots + R_N.$ (1.34)
Consider a parallel connection of thermal conductors as shown in Figure 11. The temperature difference
across both conductors is the same while the currents differ.
Figure 11: Parallel thermal conduction. The conductors are subject to the same temperature change.
Thus,
$\Delta T = H_1 R_1 = H_2 R_2.$ (1.35)
Thus, the total current is
$H_{\mathrm{total}} = \frac{\Delta T}{R_{\mathrm{eq}}} = \frac{\Delta T}{R_1} + \frac{\Delta T}{R_2}.$ (1.36)
It follows that the equivalent resistance is given by
$\frac{1}{R_{\mathrm{eq}}} = \frac{1}{R_1} + \frac{1}{R_2}.$ (1.37)
It is left as an exercise to show that for $N$ conductors connected in parallel, the equivalent resistance is
given by
$\frac{1}{R_{\mathrm{eq}}} = \frac{1}{R_1} + \frac{1}{R_2} + \dots + \frac{1}{R_N}.$ (1.38)
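The series and parallel rules (Eqs. 1.34 and 1.38) together with the thermal resistance of Eq. 1.30
can be sketched in a few helper functions. The names are illustrative. The numbers are taken from the
lead and silver bar exercise at the end of this section, under the assumption that the bars are joined
in series with the lead bar on the hot side; the parallel case is shown only for comparison.

```python
def thermal_resistance(k, length, area):
    """Eq. 1.30: R = (1/k) * (length / area), in K/W."""
    return length / (k * area)


def series(*resistances):
    """Eq. 1.34: resistances in series add."""
    return sum(resistances)


def parallel(*resistances):
    """Eq. 1.38: reciprocals of resistances in parallel add."""
    return 1.0 / sum(1.0 / r for r in resistances)


area = 0.02 * 0.03                                 # m^2
r_pb = thermal_resistance(353.0, 0.05, area)       # lead bar
r_ag = thermal_resistance(429.0, 0.05, area)       # silver bar

t_hot, t_cold = 100.0, 0.0
h_series = (t_hot - t_cold) / series(r_pb, r_ag)   # same current through both bars
t_junction = t_hot - h_series * r_pb               # temperature drop across the lead bar
h_parallel = (t_hot - t_cold) / parallel(r_pb, r_ag)

print(f"series: H = {h_series:.1f} W, junction at {t_junction:.1f} C")
print(f"parallel: H = {h_parallel:.1f} W")
```

With these inputs the series current comes out within a fraction of a watt of the answer quoted in
the exercise, and the junction temperature agrees with it to the quoted precision.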
Convection is the transfer of thermal energy by the mass motion of a fluid. There are two types:
1. forced convection - e.g. blood pumping by the heart
2. natural/free convection - e.g. hot air rising due to density differences
Convection is a very complex process and it is not yet completely understood. Here are a few experimental
facts:
1. heat current is proportional to the surface area
2. viscosity of fluids slows natural convection
3. heat current is proportional to $(T_S - T_f)^{5/4}$, where $T_S$ is the temperature of the surface and $T_f$ is
the temperature of the main body of the fluid
Radiation - transfer of thermal energy through vacuum (no medium) by means of electromagnetic
waves.
Experimental evidence:
1. rate of energy transfer $\propto T^4$
2. rate of energy transfer $\propto A$, where $A$ = surface area
3. rate of energy transfer depends on the nature of the surface
Collecting this information leads to
$H = \epsilon \sigma A T^4$ (1.39)
where $A$ is the surface area, $T$ is the absolute temperature, $\epsilon$ is the emissivity (depends on the radiating
object), and $\sigma$ is the Stefan-Boltzmann constant ($5.67 \times 10^{-8}\ \mathrm{W/(m^2 K^4)}$). When the surroundings are at a
temperature $T_S$, the net heat current due to radiation is
$H = \epsilon \sigma A (T^4 - T_S^4).$ (1.40)
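Eq. 1.40 is straightforward to evaluate numerically. In the sketch below the function name and the
example inputs (an ideal emitter of area 1.2 m^2 at 303 K in surroundings at 293 K) are hypothetical
values chosen only to show the orders of magnitude involved.

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def net_radiated_power(emissivity, area, t_body, t_surroundings):
    """Net heat current of Eq. 1.40; temperatures must be absolute (kelvin)."""
    return emissivity * SIGMA * area * (t_body**4 - t_surroundings**4)


power = net_radiated_power(1.0, 1.2, 303.0, 293.0)
print(f"net radiated power: {power:.1f} W")
```

Because of the fourth power, the result is sensitive to using kelvin rather than degrees Celsius.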
Exercises
1. (Young and Freedman) Related to Physics 73.1 Conduction Experiment. Two metal bars, each
of length 5 cm and rectangular cross-section with sides 2 cm and 3 cm, are wedged between two walls,
one held at 100 °C and the other at 0 °C as shown in Figure 12.
Figure 12: Thermal conductivity exercise
The bars are lead and silver with thermal conductivities $k_{\mathrm{Pb}} = 353\ \mathrm{W/(m \cdot K)}$ and $k_{\mathrm{Ag}} = 429\ \mathrm{W/(m \cdot K)}$. Find (a) the
total thermal current through the bars (Ans. 232.6 W) and (b) the temperature at the interface/junction
(Ans. 45.1 °C).
2. (Young and Freedman) The rate at which solar energy from the Sun reaches the Earth's atmosphere
is about $1.50\ \mathrm{kW/m^2}$. The distance from the Earth to the Sun is $1.50 \times 10^{11}$ m and the radius of the Sun is
$6.96 \times 10^{8}$ m. (a) What is the rate of radiation of energy per unit area from the Sun's surface? (b) If the
Sun radiates as an ideal blackbody, what is the temperature of its surface?
1.5 Ideal Gas Equation and Introduction to Molecular Properties of Matter

We begin with the equation of state. The properties of matter are dictated by what we measure: pressure,
volume, temperature, electrical resistance, length, etc. We refer to these properties as state variables.
For a particular choice of state variables, we refer to matter being in that particular state. But these
properties are not independent of each other. The relationship connecting these properties is described
by the equation of state.
Example: Approximate equation of state for a solid
$\frac{V}{V_0} = 1 + \beta (T - T_0) - \kappa (P - P_0)$ (1.41)
The constant $\beta$ is what we know as the coefficient of volume expansion while $\kappa$ is called the compressibility
of the solid. The term corresponding to the compressibility is negative because the volume of solids
decreases with an increase in the pressure.
Ideal Gas Equation

We are going to cheat and reveal the equation of state for the ideal gas and discuss its consequences
without resorting to a more fundamental description. This more fundamental description is postponed
to the discussion of the kinetic theory of gases.
So, at this point, what do we mean by an ideal gas?

An ideal gas is a gas at low pressure and high temperature that obeys the equation of state given by
$PV = nRT$ (1.42)
where $P$ is the pressure, $V$ is the volume, $T$ is the absolute temperature, and $n$ is the number of moles of gas.
The ideal gas equation (Eq. 1.42) is a summary of several observations (Boyle, Charles, etc.):
1. $V \propto n$. Doubling $n$ while keeping $P$ and $T$ constant doubles $V$.
2. $V \propto 1/P$. Doubling the pressure while holding $T$ and $n$ constant reduces the volume to half.
3. $P \propto T$. Doubling $T$ while keeping $V$ and $n$ constant doubles the pressure $P$.
4. $V \propto T$. Doubling the temperature while keeping $P$ and $n$ constant doubles $V$.
It is a (surprising!) result that the constant $R$ is the same for all gases. The value is given by
$R = 8.3145\ \mathrm{J/(mol \cdot K)} = 0.08206\ \mathrm{L \cdot atm/(mol \cdot K)}.$ (1.43)
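A short sketch of how Eq. 1.42 is used in practice is given below. The helper names are illustrative.
The first helper finds the number of moles from P, V and T in SI units; the second uses the fact that
PV/T is constant for a fixed amount of gas, with the numbers of the engine-compression exercise that
follows.

```python
R = 8.3145  # J/(mol K)


def moles(pressure_pa, volume_m3, temperature_k):
    """n = PV / (RT) from Eq. 1.42, all quantities in SI units."""
    return pressure_pa * volume_m3 / (R * temperature_k)


def final_temperature(p1, v1, t1, p2, v2):
    """For fixed n, P V / T is constant; t1 must be in kelvin.

    Pressures and volumes may be in any units as long as both states use the same ones.
    """
    return t1 * (p2 * v2) / (p1 * v1)


# About one mole occupies 22.4 L at 1 atm and 0 C.
print(moles(101325.0, 22.4e-3, 273.15))

# Compression ratio 9:1, from 1.00 atm and 27 C to 21.7 atm.
t2 = final_temperature(1.00, 9.00, 27.0 + 273.15, 21.7, 1.00)
print(f"T2 = {t2:.0f} K")
```

The result is within a kelvin of the quoted answer; the small difference comes from rounding 27 °C to
300 K.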
Exercises
1. (Young and Freedman) In an automobile engine, a mixture of air and gasoline is compressed in
the cylinders before being ignited. A typical engine has a compression ratio of 9.00 to 1.00; this
means that the gas in the cylinder is compressed to 1/9 of its original volume. The initial pressure
is 1.00 atm and the initial temperature is 27 °C. If the pressure after compression is 21.7 atm, find
the temperature of the compressed gas. Ans. 723 K or 450 °C
2. (Young and Freedman) Mass of air in scuba tank
A typical tank used for scuba diving has a volume of 11.0 L and a gauge pressure, when full, of
$2.10 \times 10^{7}$ Pa. The empty tank contains 11.0 L of air at 21.0 °C and 1 atm. When the tank is filled
with hot air from a compressor, the temperature is 42 °C and the gauge pressure is $2.10 \times 10^{7}$ Pa.
What mass of air was added? (Air is a mixture of gases, i.e. 78% nitrogen, 21% oxygen, and 1%
miscellaneous. The average molecular mass of air is 28.8 g/mol.) Ans. 2.54 kg
3. TRY (Young and Freedman) From fluid dynamics, the variation of the pressure $P$ with the elevation $y$
in terms of the density $\rho$ is
$\frac{dP}{dy} = -\rho g.$ (1.44)
Show that if the atmosphere is at constant temperature $T$, then
$P = P_0 e^{-mgy/kT}.$ (1.45)
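One possible route to Eq. 1.45 is sketched below. It assumes that $m$ is the mass of one molecule,
that $k = R/N_A$ is the Boltzmann constant (so that $nRT = NkT$ for $N$ molecules), and that $g$ does
not vary with elevation.

```latex
\begin{align*}
PV &= nRT = NkT
  \quad\Rightarrow\quad
  \rho = \frac{Nm}{V} = \frac{mP}{kT}, \\
\frac{dP}{dy} &= -\rho g = -\frac{mg}{kT}\,P, \\
\int_{P_0}^{P} \frac{dP'}{P'} &= -\frac{mg}{kT} \int_{0}^{y} dy'
  \quad\Rightarrow\quad
  \ln\frac{P}{P_0} = -\frac{mgy}{kT}, \\
P &= P_0\, e^{-mgy/kT}.
\end{align*}
```

The first line uses the ideal gas equation to eliminate the density, which is what turns Eq. 1.44
into a separable equation for $P$ alone.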
Van der Waals Gas
Real gases obey the ideal gas equation only under certain conditions. The Van der Waals equation is an
equation of state that is a next-level approximation to real gases compared to the ideal gas equation.
The Van der Waals equation takes into account the volume of the molecules and the intermolecular forces
of attraction. The Van der Waals gas obeys the equation of state
$\left( P + a \frac{n^2}{V^2} \right) (V - nb) = nRT.$ (1.46)
Eq. 1.46 reduces to the ideal gas equation in the limit of very small densities $n/V$.
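To get a feel for when the correction terms in Eq. 1.46 matter, the sketch below compares the two
equations of state for one mole of gas at 300 K in a large and in a small volume. The constants a and
b are approximate textbook values for carbon dioxide and are assumptions introduced here for
illustration; they do not appear in the notes.

```python
R = 8.3145  # J/(mol K)

# Approximate Van der Waals constants for CO2 (assumed, for illustration only).
A_CO2 = 0.364    # J m^3 / mol^2
B_CO2 = 4.27e-5  # m^3 / mol


def pressure_ideal(n, volume, temperature):
    """Eq. 1.42 solved for P."""
    return n * R * temperature / volume


def pressure_van_der_waals(n, volume, temperature, a, b):
    """Eq. 1.46 solved for P."""
    return n * R * temperature / (volume - n * b) - a * n**2 / volume**2


for volume in (1.0, 1.0e-3):  # m^3
    p_id = pressure_ideal(1.0, volume, 300.0)
    p_vdw = pressure_van_der_waals(1.0, volume, 300.0, A_CO2, B_CO2)
    print(f"V = {volume:g} m^3: ideal {p_id:.4g} Pa, Van der Waals {p_vdw:.4g} Pa")
```

At the low density the two pressures agree to about one part in ten thousand, while at the high
density they differ by roughly ten percent.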
PV diagrams
- simple 2D representations to help specify/visualize the evolution of the state of a system.
An example of PV diagram for the ideal gas is shown below.
Figure 13: PV diagram for an ideal gas showing isotherms (curves of constant temperature)
As stated, PV diagrams show the evolution of the state of a system and therefore can also show
phase transitions. What do we know about phase transitions? Experimentally, at the phase transition
the pressure and temperature are constant and a huge amount of energy is involved. After the transition
from one phase to another, the volume of the object is also drastically changed. An example of a PV
diagram showing a phase transition is given by Figure 14.
Figure 14: PV diagram showing the phase transition from liquid to gas
Molecular Properties of Matter

Feynman asks: If the world is about to come to an end and you can only pass along a single sentence to
the next inhabitants of earth, what would that sentence be?
What do we know about matter? Recall that two point charges or point masses interact with a force
that varies as $1/r^2$, where $r$ is the separation. Gravitational forces are typically very small at atomic
scales (sizes of about 1 Å = $1 \times 10^{-10}$ m). The interaction can be studied by plotting the potential
energy vs. the separation distance.
We follow the same approach to study matter. A typical plot of the potential energy vs. the separation
for two molecules is shown below.
Figure 15: Typical potential energy vs separation for molecules
Phases of Matter
1. solid - crystal lattice structure: everywhere the same and periodic, long-range order
2. liquid - short-range order
3. gas - no attractive forces effectively holding the molecules
It is convenient to define the mole in analyzing the properties of matter. A mole is the amount of
substance that contains as many elementary entities as there are atoms in 0.012 kg of carbon-12. This
leads to Avogadro's number, $N_A = 6.022 \times 10^{23}$ molecules/mol.
1.6 The Kinetic Theory of Gases
We now begin our analysis of the simplest phase of matter. Compared to solids and liquids, the interaction
of the molecules in a gas is relatively small. To a good approximation, we can even neglect it for many
of the gases that we are dealing with.
The description of a gas can be made macroscopically and microscopically:
macroscopic viewpoint - phase of matter, has pressure, temperature, volume, etc.
microscopic viewpoint - large number of molecules colliding with one another and the walls of the
container.
Consider a gas in a box with a movable piston as shown in Figure 16.
Figure 16: Gas molecules in a box with movable piston.
The piston is going to pick up momentum from collisions with the molecules. Thus, if it is to be
stationary, it has to be held by some external agent. If the piston would move by a distance $dx$ due to
the collisions and we want to keep it still, then we push the piston with a force $F$ through a
displacement $-dx$. The work that we do on the gas is then
$dW = F(-dx) = -P \, dV.$ (1.47)
The total force exerted by the gas molecules on the piston must be equal to the total force that we
apply to keep the piston still. This way we can measure the pressure exerted by the gas molecules.
Ideal Gas Model: Microscopic standpoint

Assumptions:
1. molecules in the gas collide elastically with the walls of the container
2. molecules are essentially point particles; the dimensions of the molecules are much smaller than
the dimensions of the container; collisions between molecules are infrequent
3. molecules do not interact with each other; gravity is negligible. This implies that there is no
preferred direction for motion.
Under these assumptions we will have a uniform distribution of molecules in the container. If we have
N molecules and the volume of the container is V , then the number density of molecules is N/V .
Let us now calculate the total momentum imparted by the gas molecules on the movable piston. Note
that by the mechanism of doing work that we just outlined, we can measure this as a pressure. Consider
a short time $\Delta t$. The total momentum that would be imparted to the piston in a time $\Delta t$ is
(total momentum imparted to piston in $\Delta t$) = (# of molecules that reach piston in $\Delta t$)
× (momentum imparted by single molecule). (1.48)
Consider a single molecule with velocity components $v_x$, $v_y$, and $v_z$. Upon hitting the piston (which
we allow to move along $x$), the momentum imparted to the piston is $mv_x - (-mv_x) = 2mv_x$.
Now how many molecules are going to hit the piston in a time $\Delta t$? This is simple: number density ×
volume = $\frac{N}{V} A v_x \Delta t$. The total momentum imparted on the piston is therefore
$p_{\mathrm{total}} = \frac{N}{V} A v_x \Delta t \cdot 2mv_x = 2 \frac{N}{V} A \Delta t \, m v_x^2.$ (1.49)
Have we been missing anything? On average, only half of the molecules are directed towards the
piston! So we divide the number of molecules hitting the piston in time $\Delta t$ by 2. The total momentum
imparted on the piston is now
$p_{\mathrm{total}} = \frac{N}{V} A \Delta t \, m v_x^2.$ (1.50)
Since this momentum is what accounts for the pressure that we measure,
$P = \frac{1}{A} \frac{p_{\mathrm{total}}}{\Delta t},$ (1.51)
we obtain
$PV = N m v_x^2.$ (1.52)
Of course, not all the molecules have a velocity of $v_x$ in the $x$ direction. We must therefore take the
above expression as an average and write it down as
$PV = N m \langle v_x^2 \rangle.$ (1.53)
Now recall that there is no preferred direction. This implies that the average squared velocity
components along the three translational directions are the same. In symbols, we have
$\langle v^2 \rangle = \langle v_x^2 \rangle + \langle v_y^2 \rangle + \langle v_z^2 \rangle = 3 \langle v_x^2 \rangle$. Thus we
can write down the last result in terms of the speed $v$ as
$PV = \frac{2}{3} N \left( \frac{1}{2} m \langle v^2 \rangle \right).$ (1.54)
Because the gas molecules are not interacting, the sum of their kinetic energies is the total energy of the
gas. We call this the internal energy of the gas, which we denote as $U$:
$U = N \frac{1}{2} m \langle v^2 \rangle.$ (1.55)
Finally, we have a relation between the pressure, volume, and internal energy of the ideal gas:
$PV = \frac{2}{3} U.$ (1.56)
By comparing this with what we know experimentally, $PV = nRT$, we identify the internal energy in
terms of temperature:
$U = N \frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} nRT.$ (1.57)
The temperature is thus identified with the average kinetic energy of the molecules of a gas. Gas molecules
at the same temperature therefore have the same average kinetic energy. By writing down $n = N/N_A$,
where $N_A$ is Avogadro's number, we identify the average kinetic energy of a molecule as
$\frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} \frac{R}{N_A} T.$ (1.58)
We call the constant $R/N_A$ the Boltzmann constant, which we denote as $k$. Its numerical value is
$k = 1.38 \times 10^{-23}\ \mathrm{m^2\,kg/(s^2\,K)}.$ (1.59)
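Eq. 1.58 can be turned around to estimate how fast gas molecules move at a given temperature. The
sketch below computes the root-mean-square speed, the square root of the mean squared speed, for two
gases at 300 K. The molar masses (28.0 g/mol for nitrogen, 4.00 g/mol for helium) are standard values
supplied here as assumptions, and the function names are illustrative.

```python
import math

K_B = 1.38e-23   # Boltzmann constant, J/K
N_A = 6.022e23   # Avogadro's number, 1/mol


def mean_kinetic_energy(temperature):
    """Average translational kinetic energy per molecule, (3/2) k T."""
    return 1.5 * K_B * temperature


def rms_speed(molar_mass_kg, temperature):
    """Square root of <v^2>, from (1/2) m <v^2> = (3/2) k T."""
    m = molar_mass_kg / N_A  # mass of one molecule
    return math.sqrt(2.0 * mean_kinetic_energy(temperature) / m)


for name, molar_mass in (("N2", 28.0e-3), ("He", 4.00e-3)):
    print(f"{name}: {rms_speed(molar_mass, 300.0):.0f} m/s")
```

Both gases have the same average kinetic energy per molecule at the same temperature, so the lighter
molecules must move faster.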

Equipartition theorem
What have we found? The average energy of a monoatomic gas molecule is $\frac{3}{2}kT$. For each translational
direction we identify a contribution of $\frac{1}{2}kT$. The equipartition theorem is deeply rooted in classical
statistical mechanics and states that
When a substance is in equilibrium, there is an average energy of $\frac{1}{2}kT$ per molecule or $\frac{1}{2}RT$ per mole
associated with each degree of freedom.
Translational motion in one direction is an example of a degree of freedom. Other degrees of freedom are
associated with rotational and vibrational motion.
1.6.1 Distribution of molecular speeds

The molecules of a gas do not all have the same speed. It turns out that the fraction
of molecules having speeds between $v$ and $v + dv$ is given by
$\frac{dN}{N} = f(v) \, dv$ (1.60)
where $f(v)$ is known as the Maxwell-Boltzmann distribution:
$f(v) = \frac{4}{\sqrt{\pi}} \left( \frac{m}{2kT} \right)^{3/2} v^2 \exp\left( -\frac{mv^2}{2kT} \right).$ (1.61)
It is easy to show that this distribution function leads to the following values: $v_{\mathrm{MAX}} = \sqrt{2kT/m}$
(the speed at which $f(v)$ is maximum) and $v_{\mathrm{RMS}} = \sqrt{3kT/m}$. What is $v_{\mathrm{AVE}}$?
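The characteristic speeds can also be checked numerically. The sketch below integrates Eq. 1.61 with
the trapezoidal rule in units where m = 1 and kT = 1, a choice made here only for convenience. It
verifies the normalization and the RMS speed, and compares the numerical average speed with the
standard closed form $\sqrt{8kT/(\pi m)}$, which is one way to answer the question above.

```python
import math


def maxwell_boltzmann(v, m, kT):
    """Maxwell-Boltzmann speed distribution f(v) of Eq. 1.61."""
    a = m / (2.0 * kT)
    return 4.0 / math.sqrt(math.pi) * a**1.5 * v**2 * math.exp(-a * v**2)


def trapezoid(f, lo, hi, n=20000):
    """Composite trapezoidal rule for the integral of f from lo to hi."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


m, kT = 1.0, 1.0                    # convenient units
v_cut = 20.0 * math.sqrt(kT / m)    # f(v) is negligible beyond this speed

norm = trapezoid(lambda v: maxwell_boltzmann(v, m, kT), 0.0, v_cut)
v_ave = trapezoid(lambda v: v * maxwell_boltzmann(v, m, kT), 0.0, v_cut)
v_rms = math.sqrt(trapezoid(lambda v: v**2 * maxwell_boltzmann(v, m, kT), 0.0, v_cut))

print("normalization:", norm)
print("v_AVE:", v_ave, "closed form:", math.sqrt(8.0 * kT / (math.pi * m)))
print("v_RMS:", v_rms, "closed form:", math.sqrt(3.0 * kT / m))
```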
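Since $E = \frac{1}{2}mv^2$, it follows that the fraction of molecules having energies between $E$ and $E + dE$ is
given by
$\frac{dN}{N} = F(E) \, dE$ (1.62)
where
$F(E) = \frac{2}{\sqrt{\pi}} \left( \frac{1}{kT} \right)^{3/2} E^{1/2} \exp\left( -\frac{E}{kT} \right).$ (1.63)
The fraction of molecules having speeds below a given value of $v/v_{\mathrm{RMS}}$ is shown below.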
v/vRMS fraction
0.20 0.011
0.40 0.077
0.60 0.218
0.80 0.411
1.00 0.608
1.20 0.771
1.40 0.882
1.60 0.947
1.80 0.979
2.00 0.993
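The tabulated fractions can be reproduced from Eq. 1.61. Integrating $f(v)$ from 0 to $v$ and writing
$x = v/v_{\mathrm{MAX}}$ gives the closed form $\mathrm{erf}(x) - \frac{2}{\sqrt{\pi}} x e^{-x^2}$, where
erf is the error function. The sketch below evaluates this expression; the function name is
illustrative.

```python
import math


def fraction_below(ratio_to_rms):
    """Fraction of molecules with speed below ratio_to_rms * v_RMS."""
    # v_RMS / v_MAX = sqrt(3/2), so x = v / v_MAX = ratio * sqrt(3/2).
    x = ratio_to_rms * math.sqrt(1.5)
    return math.erf(x) - 2.0 / math.sqrt(math.pi) * x * math.exp(-x * x)


for i in range(1, 11):
    ratio = 0.2 * i
    print(f"{ratio:.2f}  {fraction_below(ratio):.3f}")
```

The fraction of molecules between two speeds is the difference of two such values, which is how the
table can be used for the last exercise below.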
Exercises
1. Fifteen students took a 25 point quiz. Their scores are 25, 22, 22, 20, 20, 20, 18, 18, 18, 18, 18, 15,
15, 15, 10. Find the average score and the RMS score. Ans. AVE = 18.3, RMS = 18.6.
2. Use the Maxwell-Boltzmann distribution to show that the average kinetic energy per molecule of
the gas is $\frac{3}{2}kT$.
3. A vessel contains some amount of ideal monoatomic gas at temperature $T$. The mass of one molecule
of this gas is $m$. What fraction of the molecules of this gas have speeds between
$v_1 = \frac{1}{5}\sqrt{12kT/m}$ and $v_2 = \frac{1}{5}\sqrt{48kT/m}$?
1.7 Heat Capacities of Gases and Solids
We have already seen that the average energy associated with the translational motion of a molecule of a
monoatomic gas is $\frac{3}{2}kT$. By thinking of this as a total of contributions from the $x$, $y$, and $z$
directions, we can say that there is a contribution of $\frac{1}{2}kT$ to the total energy for each translational
direction.
The heat capacity at constant volume $C_V$ is formally defined through
$Q_V = C_V \Delta T$ (1.64)
where $Q_V$ is an amount of heat supplied at constant volume. Since the volume is constant, no work is
done on the gas and this heat goes into changing the internal energy. Thus we have
$\Delta U = C_V \Delta T$ (1.65)
and we can define the heat capacity at constant volume as
$C_V = \frac{\Delta U}{\Delta T}.$ (1.66)
If we want to be more elegant, we can take the limit $\Delta T \to 0$ and the above expression reduces to a
derivative. There is another way of defining the heat capacity, and that is by considering adding heat at
constant pressure. Can you get $C_P$ in terms of $C_V$? Hint: Knowledge of the first law of thermodynamics
is required.
According to the kinetic theory, the internal energy per mole of an ideal monoatomic gas is $\frac{3}{2}RT$,
so its molar heat capacity at constant volume is $\frac{3}{2}R$. What does experiment have to say about this?
Table 1: Heat Capacity of Gases (J/(mol·K)) at 25 °C (source: Tipler)

monoatomic   C_V      diatomic   C_V      polyatomic   C_V
He           12.52    N2         20.80    CO2          28.17
Ne           12.68    H2         20.44    N2O          28.39
Ar           12.45    O2         20.98    H2S          27.36
Kr           12.45    CO         20.74
The table tells us that the naive simple kinetic model of an ideal monoatomic gas leads to results
consistent with experiments. What about the diatomic gas? A diatomic molecule can be visualized as two
atoms connected to each other by a nonextensible rod. Thus, other than the three translational degrees of
freedom, there are two rotational degrees of freedom that contribute to the total energy of the system.
The total energy per mole for a diatomic gas is then $5 \times \frac{1}{2}RT$ or $\frac{5}{2}RT$. This leads to the
heat capacity for a diatomic gas, which is $\frac{5}{2}R$, and we see that this is consistent with experiment.
There is no simple picture that we can use to analyze polyatomic gases. But it turns out that by
considering two more degrees of freedom in addition to the translational and rotational degrees of freedom
of diatomic molecules, we get the heat capacity at constant volume of polyatomic molecules to be
$\frac{7}{2}R$. This turns out to be consistent with some polyatomic gases but further analysis is needed.
The model that we have for a solid is that of a lattice in which each lattice point is occupied by
an atom of the solid. The atoms are connected to adjacent atoms by springs. Thus, there are six
degrees of freedom: three translational degrees of freedom and three vibrational degrees of freedom. The
equipartition theorem leads us to the result $6 \times \frac{1}{2}RT = 3RT$ for the total energy per mole and thus
$3R$ for the molar heat capacity of a solid. This prediction turns out to be in remarkable agreement with
measurements for many elemental solids and is called the Dulong-Petit rule.
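The equipartition predictions in this section all follow from counting degrees of freedom, so they
are easy to tabulate. The sketch below computes the molar heat capacity for a given number of degrees
of freedom and prints it next to a few of the measured values from Table 1; the function and variable
names are illustrative.

```python
R = 8.3145  # J/(mol K)


def molar_cv(degrees_of_freedom):
    """Equipartition estimate: each degree of freedom contributes R/2 per mole."""
    return 0.5 * degrees_of_freedom * R


# (measured C_V from Table 1, degrees of freedom assumed in the text)
measured = {"He": (12.52, 3), "N2": (20.80, 5), "CO2": (28.17, 7)}

for gas, (cv, dof) in measured.items():
    print(f"{gas}: measured {cv:.2f}, equipartition {molar_cv(dof):.2f} J/(mol K)")

print(f"Dulong-Petit value for solids: {molar_cv(6):.2f} J/(mol K)")
```

The monoatomic and diatomic predictions land within about one percent of the measurements, while the
polyatomic estimate is off by a few percent.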
Failure of Equipartition Theorem

In general, the heat capacities of gases and solids can depend on the temperature and this is observed
experimentally. This cannot be explained for the equipartition theorem and the reason is simple. The
equipartition theorem was derived by combining the laws of probability with classical physics. But it
is now known that classical physics is not the correct law of nature but quantum mechanics. Quantum
mechanics tells us that the energy of a system cannot actually have any value. The energy levels are
quantized and only a set of energies are allowed! More on this for 3rd LE coverage.

1.8 Work and PV diagrams
At this stage, we are ready to establish the relationship between the work done by the system on its
surroundings and PV diagrams. We restrict our attention to the simple case where the working substance
(the system that is doing the work) is an ideal gas.
Recall that as a gas displaces a piston of area A by an amount dx, it does an amount of work equal to
$dW = F\,dx = P\,(A\,dx)$, or simply
$$dW = P\,dV \tag{1.67}$$
where P is the pressure of the gas and dV is the change in the volume. Eq. 1.67 is valid, in general,
for infinitesimal volume changes. For a finite volume change, the dependence of the pressure P on the
volume V is needed:
$$W = \int_{V_i}^{V_f} P(V)\,dV. \tag{1.68}$$
This is the area under the curve in the PV diagram! What changes from case to case is then the dependence
of the pressure P on V. Here are the few cases that we are going to cover:
1. Isochoric - constant volume process
In this case dV = 0 for all parts of the process. Eq. 1.68 is then trivial:
$$W_{\text{isochoric}} = 0. \tag{1.69}$$
2. Isobaric - constant pressure process
At constant pressure, we identify the area under the curve to be simply a rectangle. In this case,
$$W_{\text{isobaric}} = P\,\Delta V. \tag{1.70}$$
3. Isothermal - constant temperature process
At constant temperature, the process of changing the pressure and volume obeys what we know
as Boyle's law, $PV = P_i V_i$. In this case,
$$W_{\text{isothermal}} = \int_{V_i}^{V_f} P(V)\,dV = P_i V_i \int_{V_i}^{V_f} \frac{1}{V}\,dV = P_i V_i \ln\frac{V_f}{V_i}. \tag{1.71}$$
4. Adiabatic - no heat exchange
In an adiabatic process, work is done in such a way that all of the energy used goes into changing
the internal energy of the system, i.e. there is no heat. In an adiabatic process, the pressure of a
gas depends on the volume V as
$$P V^{\gamma} = \text{constant}, \tag{1.72}$$
where $\gamma$ is the heat capacity ratio of the gas (Eq. 1.79). In this case, the work done is
$$W_{\text{adiabatic}} = \int_{V_i}^{V_f} P(V)\,dV = P_i V_i^{\gamma} \int_{V_i}^{V_f} \frac{1}{V^{\gamma}}\,dV = \frac{P_f V_f - P_i V_i}{1 - \gamma}. \tag{1.73}$$
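The four cases can be checked numerically. The sketch below evaluates the closed-form results and compares two of them with a direct midpoint-rule estimate of the area under P(V). The function names, the initial state, and the choice of a monoatomic gas are assumptions made only for illustration.

```python
import math

def work_isochoric():
    """Constant volume: no area under the curve."""
    return 0.0

def work_isobaric(P, Vi, Vf):
    """Constant pressure: rectangle of height P and width Vf - Vi."""
    return P * (Vf - Vi)

def work_isothermal(Pi, Vi, Vf):
    """Constant temperature: P = Pi*Vi/V along the isotherm."""
    return Pi * Vi * math.log(Vf / Vi)

def work_adiabatic(Pi, Vi, Vf, gamma):
    """No heat exchange: P = Pi*(Vi/V)**gamma along the adiabat."""
    Pf = Pi * (Vi / Vf) ** gamma
    return (Pf * Vf - Pi * Vi) / (1.0 - gamma)

def work_numeric(pressure, Vi, Vf, steps=20000):
    """Midpoint-rule estimate of the integral of P(V) dV from Vi to Vf."""
    dV = (Vf - Vi) / steps
    return sum(pressure(Vi + (k + 0.5) * dV) for k in range(steps)) * dV

# Assumed illustrative state: 1 L of monoatomic gas at 1.00e5 Pa doubling its volume.
Pi, Vi, Vf, gamma = 1.00e5, 1.00e-3, 2.00e-3, 5.0 / 3.0

W_isobaric = work_isobaric(Pi, Vi, Vf)
W_isothermal = work_isothermal(Pi, Vi, Vf)
W_adiabatic = work_adiabatic(Pi, Vi, Vf, gamma)
W_isothermal_num = work_numeric(lambda V: Pi * Vi / V, Vi, Vf)
W_adiabatic_num = work_numeric(lambda V: Pi * (Vi / V) ** gamma, Vi, Vf)

print(f"isobaric   : {W_isobaric:.2f} J")
print(f"isothermal : {W_isothermal:.2f} J (numeric {W_isothermal_num:.2f} J)")
print(f"adiabatic  : {W_adiabatic:.2f} J (numeric {W_adiabatic_num:.2f} J)")
```

For an expansion from the same initial state, the work decreases from the isobaric to the isothermal to the adiabatic case, which matches the ordering of the areas under the three curves.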
The First Law of Thermodynamics

Earlier (in the first lecture), we defined heat as energy in transit between two thermodynamic systems due
to a temperature difference. At this point, we are ready to introduce another definition of heat. When we
do work on a system, some part of this work goes into changing the internal energy of the system. The
energy that goes away is what we call heat. In symbols, we have
$$dU = dQ + dW \tag{1.74}$$
where dU is the change in internal energy, dW is the work done ON THE SYSTEM, and dQ is the heat added to
the system. If dW instead denotes the work done BY the system, its sign is opposite, and in some references
we will find instead
$$dU = dQ - dW. \tag{1.75}$$
This is the first law of thermodynamics, which is another manifestation of the conservation of
total energy.

Heat Capacity at Constant Pressure
Using the first law of thermodynamics, we can derive an expression for the heat capacity at constant
pressure P. We define this quantity $C_P$ by
$$dQ_P = C_P\,dT. \tag{1.76}$$
At constant volume V, all the heat goes into changing the internal energy of the system. At constant
pressure, some part goes into work. Thus, we have $dQ_P = dU + dW = C_P\,dT$. Using an ideal gas as a
working substance, we have $dU = C_V\,dT$ and $dW = P\,dV = nR\,dT$. Therefore,
$$C_P\,dT = (C_V + nR)\,dT. \tag{1.77}$$
From this we can extract the relation
$$C_P = C_V + nR. \tag{1.78}$$
Thus, if $\frac{3}{2}R$ is the heat capacity per mole of an ideal monoatomic gas at constant volume,
$\frac{5}{2}R$ is its heat capacity per mole at constant pressure. The heat capacity ratio is another
important quantity, defined by
$$\gamma = \frac{C_P}{C_V}. \tag{1.79}$$
For a monoatomic ideal gas, this value is 5/3. Using $\gamma$, the relation between the pressure, volume, and
internal energy (and hence temperature) of an ideal gas can in fact be written down as
$$PV = (\gamma - 1)\,U. \tag{1.80}$$
Adiabatic Process
In an adiabatic process, all of the work goes into changing the internal energy, as there is no heat flow
into or out of the system. To find the relationship between P and V, we take the differential of Eq.
1.80:
$$V\,dP + P\,dV = (\gamma - 1)\,dU. \tag{1.81}$$
Now, since all the work goes into changing the internal energy, $dU = -P\,dV$ (the gas loses internal energy
when it does work), and we have
$$V\,dP + P\,dV = -(\gamma - 1)\,P\,dV. \tag{1.82}$$
By transposing all of the terms to the left-hand side and dividing by PV, we obtain
$$\frac{dP}{P} + \gamma\,\frac{dV}{V} = 0. \tag{1.83}$$
The solution to this differential equation is exactly Eq. 1.72.
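The integration step can be written out explicitly. Starting from $dP/P + \gamma\,dV/V = 0$, where $\gamma = C_P/C_V$ is the heat capacity ratio, and writing C for an arbitrary integration constant (a symbol introduced here only for this sketch):

```latex
\begin{align*}
\int \frac{dP}{P} + \gamma \int \frac{dV}{V} &= C \\
\ln P + \gamma \ln V &= C \\
\ln\left(P V^{\gamma}\right) &= C \\
P V^{\gamma} &= e^{C} = \text{constant}.
\end{align*}
```

Combining this with the ideal gas equation $PV = nRT$ also gives $T V^{\gamma - 1} = \text{constant}$ along an adiabat.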
Exercises
1. Adiabatic compression in a diesel engine (YF)
The compression ratio of a diesel engine is 15 to 1; this means that air in the cylinders is compressed
to 1/15 of its initial volume. If the initial pressure is $1.01 \times 10^5$ Pa and the initial temperature is
27 °C, find the final pressure and the temperature after compression. Air is mostly a mixture of
diatomic oxygen and nitrogen; treat it as a gas with $\gamma = 1.4$. Ans. $P_f$ = 44 atm, $T_f$ = 613 °C.
2. Work done in adiabatic process
In the previous exercise (Adiabatic compression in a diesel engine), how much work is done in the
adiabatic process if the initial volume of the cylinder is 1.00 L? Try to solve in two ways. Ans.
494 J (work done on the gas).
3. Photon gas
What is the value of the heat capacity ratio for a photon gas? You can derive this by
showing that the equation of state for a photon gas is given by $PV = \frac{1}{3}U$. Use kinetic theory and
the advance knowledge that the total energy of a photon is E = pc, where p is the momentum and
c is the speed of light.
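The quoted answers to the two diesel-engine exercises can be reproduced with a few lines. This is a sketch under the assumptions of the exercise (ideal gas with heat capacity ratio 1.4, reversible adiabatic compression); the variable names and the conversion 1 atm = 1.013e5 Pa are choices made here.

```python
gamma = 1.4            # heat capacity ratio of air, as given in the exercise
compression = 15.0     # Vi / Vf
Pi = 1.01e5            # initial pressure in Pa
Ti = 27.0 + 273.15     # initial temperature in K
Vi = 1.00e-3           # initial volume in m^3 (1.00 L)
ATM = 1.013e5          # Pa per atm (assumed conversion)

Vf = Vi / compression
Pf = Pi * compression ** gamma            # from P V^gamma = constant
Tf = Ti * compression ** (gamma - 1.0)    # from T V^(gamma - 1) = constant

# Way 1: area under the adiabat, W = (Pf Vf - Pi Vi) / (1 - gamma)
W_by_gas = (Pf * Vf - Pi * Vi) / (1.0 - gamma)

# Way 2: no heat flows, so W_by_gas = -(change in U) = -n C_V (Tf - Ti),
# with n C_V = n R / (gamma - 1) = (Pi Vi / Ti) / (gamma - 1)
n_Cv = (Pi * Vi / Ti) / (gamma - 1.0)
W_by_gas_alt = -n_Cv * (Tf - Ti)

print(f"Pf = {Pf / ATM:.1f} atm")
print(f"Tf = {Tf - 273.15:.0f} C")
print(f"W by gas = {W_by_gas:.0f} J (way 1), {W_by_gas_alt:.0f} J (way 2)")
```

The work done by the gas comes out negative because the gas is being compressed; its magnitude is the work done on the gas.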

1.9 Internal Energy and the First Law
We have already introduced the first law of thermodynamics:
$$dU = dQ - dW. \tag{1.84}$$
This basically says that when heat dQ enters the system while the system does an amount of work dW,
the change in the internal energy of the system is the difference $dQ - dW$. Also, we have learned
that the work done to change the state depends on the path taken (isochoric, isobaric, etc.). By the
definition of the adiabatic process, we also have $dQ_{\text{adiabatic}} = 0$. So we also have an idea that
the heat depends on the path. We are now about to learn that, in contrast to the work and heat, the change
in the internal energy is independent of the path taken.
Recall that we can write down the internal energy of an ideal gas as
$$U = nC_V T \tag{1.85}$$
where n is the number of moles and $C_V$ is the molar heat capacity of the ideal gas, e.g. $C_V = \frac{5}{2}R$
for a diatomic gas and $C_V = \frac{3}{2}R$ for a monoatomic gas. This expression for the internal energy turns
out to be fairly general for gases at high temperatures and low pressures. This means that when we change the
state of the system, the change in the internal energy is given by
$$\Delta U = nC_V\,\Delta T. \tag{1.86}$$
The internal energy of a gas depends only on the temperature. Therefore, the change in the internal
energy depends only on the initial and final temperatures. What does experiment say about this result?
(Read about the free expansion experiment of Joule!)
Some of the consequences of the above result are:
1. For isothermal processes, $dU_{\text{isothermal}} = 0$. Since the temperature change in an isothermal
process is zero, the change in the internal energy must be zero.
2. For isochoric, isobaric, and adiabatic processes, $dU = nC_V\,dT$. But this can still take different
forms when the ideal gas equation is used. For example, in an isobaric process $P\,dV = nR\,dT$, so
the change in the internal energy can also be written as $dU = \frac{C_V}{R}\,P\,dV$.
3. For a cyclic process, $\Delta U = 0$.
The heat absorbed or released in a process can now be calculated using the First Law.
At constant volume or at constant pressure, the heat absorbed or released can be calculated without
resorting to the first law. At constant volume, recall that the heat absorbed is
dQ = nCV dT (1.87)
where CV is the molar heat capacity at constant volume. This can be used to calculate the heat in an
isochoric process. At constant pressure, we instead have
dQ = nCP dT = n (CV + R) dT (1.88)
where CP is molar heat capacity at constant pressure. This can be used to calculate the heat in an
isobaric process.
Exercises
1. (Tipler) You do 25 kJ of work on a system consisting of 3 kg of water by stirring it with a paddle
wheel. During this time, 15 kcal of heat is removed. What is the change in the internal energy of
the system? (4.18 J = 1 cal) Ans. $-37.7$ kJ.
2. In a cyclic process, what is the relation between the total work done and the total heat?
3. Reconcile the apparent issue: We have said that the internal energy of a gas can be expressed
only in terms of the temperature by $U = nC_V T$. But doesn't the internal energy also depend on
pressure, since $PV = (\gamma - 1)\,U$? So does the internal energy depend on the volume now? Does the
internal energy also depend on the pressure?
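The sign bookkeeping in the first exercise can be made explicit with a small script. This sketch uses the convention dU = dQ + dW, with dW the work done on the system; the function name `delta_u` is a hypothetical helper introduced here.

```python
CAL_TO_J = 4.18  # conversion given in the exercise

def delta_u(heat_added, work_on_system):
    """First law with work done ON the system: dU = dQ + dW."""
    return heat_added + work_on_system

work_on_water = 25.0e3             # J, done by the paddle wheel on the water
heat_added = -15.0e3 * CAL_TO_J    # J, negative because 15 kcal of heat is removed

dU = delta_u(heat_added, work_on_water)
print(f"Change in internal energy: {dU / 1e3:.1f} kJ")
```

More heat leaves the water than work is put in, so the internal energy decreases by about 37.7 kJ.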

1.10 The Second Law of Thermodynamics
Energy is always conserved. We got that from the first law. So why do we have to conserve energy? And
doesn't the statement "conserve energy" lose its meaning if energy is always conserved?
It turns out that not all forms of energy are equally useful, e.g. it is easy to convert mechanical energy
completely into thermal energy BUT not the other way around. This leads us to one of the statements of the
2nd law:
It is impossible to remove thermal energy from a system and convert it completely into mechanical
energy without other changes.
The second law has been expressed in many ways. We have the Clausius statement:
There can be no process whose only final result is to transfer thermal energy from a cooler object to a
hotter one.
We also have the Kelvin statement of the second law:
It is impossible to remove thermal energy from a system at a single temperature and convert it to
mechanical energy/work without changing the system and/or the surroundings in some other way.
A heat engine is a device that allows us to convert as much heat as possible into useful work. We limit
our discussion to cyclic heat engines. A heat engine uses a working substance that absorbs heat, does
work, and gives off heat on return to its initial state (working substance: water, air/gas, etc.)
Hot/cold reservoir - very large heat capacity that allows it to absorb or give off thermal energy with
no appreciable change in the temperature.
Basic Heat Engine
The heat engine is represented by the ERM (Energy-Reservoir Model) diagram shown below.
Figure 17: Energy-reservoir model for heat engine
Basically, a heat $Q_H$ is given off by the hot reservoir and a part of this energy is converted into useful
work W by the heat engine, while the rest of the heat, $Q_C$, is thrown away to a cold reservoir. Since we
restrict our attention to cyclic processes, $\Delta U = 0$ and we have Q = W. In terms of the variables shown
in the ERM diagram, we have
$$|Q_H| - |Q_C| = |W|. \tag{1.89}$$
The efficiency of the heat engine is defined as the ratio of the work done per cycle to the heat input.
In symbols, we have
$$\epsilon = \frac{|W|}{|Q_H|} = \frac{|Q_H| - |Q_C|}{|Q_H|} = 1 - \frac{|Q_C|}{|Q_H|}. \tag{1.90}$$
We now state the heat engine form of the second law:
It is impossible for a heat engine working in a cycle to produce no other effect than that of extracting
thermal energy from a reservoir and performing an equivalent amount of work.

Example: Otto Engine
In the Otto engine, a working substance (usually air) undergoes a cyclic process represented by the PV
diagram shown below.
Figure 18: PV diagram for the Otto cycle
Here are the parts and their descriptions:
part    description
a→b     compression stroke (adiabatic)
b→c     ignite fuel (isochoric heating)
c→d     power stroke (adiabatic expansion)
d→a     reject heat to environment (isochoric cooling)
What is important to us is to calculate the efficiency of the Otto engine. We ask: will an Otto engine
with a compression ratio $r = r_1$ have a greater efficiency than an Otto engine with a compression ratio
$r = r_2 > r_1$? Let us find out the answer to this question.
We begin by calculating $Q_H$ and $Q_C$. Since heat enters and leaves the system at constant volume, we
have
$$Q_H = nC_V\,(T_c - T_b) \tag{1.91}$$
$$Q_C = nC_V\,(T_a - T_d). \tag{1.92}$$
Using these, we can express the efficiency of the Otto engine in terms of the temperatures:
$$\epsilon = 1 - \frac{|Q_C|}{|Q_H|} = 1 - \frac{|T_a - T_d|}{|T_c - T_b|}. \tag{1.93}$$
Now note that point c is connected to d adiabatically. Also, a is connected to b adiabatically. Since
$T V^{\gamma - 1}$ is constant along an adiabat, with V the volume at b and c and rV the volume at a and d, we
also have
$$T_c V^{\gamma - 1} = T_d\,(rV)^{\gamma - 1} \tag{1.94}$$
$$T_b V^{\gamma - 1} = T_a\,(rV)^{\gamma - 1}. \tag{1.95}$$
Cancelling out the common factor $V^{\gamma - 1}$ and subtracting the resulting equations gives
$$\frac{T_d - T_a}{T_c - T_b} = \frac{1}{r^{\gamma - 1}}. \tag{1.96}$$
Finally, we arrive at the desired expression for the efficiency of the Otto engine:
$$\epsilon = 1 - \frac{1}{r^{\gamma - 1}}. \tag{1.97}$$
This shows that a larger compression ratio leads to a larger efficiency! What can you say about the
dependence of the efficiency on the property of the working substance?
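The dependence of the Otto efficiency on the compression ratio r and on the heat capacity ratio of the working substance can be explored numerically. The sketch below also cross-checks the closed form against the temperature expression, using assumed illustrative temperatures at points a and c; the function names and all numerical inputs are choices made here.

```python
def otto_efficiency(r, gamma=1.4):
    """Closed form: efficiency = 1 - 1 / r**(gamma - 1)."""
    return 1.0 - 1.0 / r ** (gamma - 1.0)

def otto_efficiency_from_temperatures(Ta, Tc, r, gamma=1.4):
    """Efficiency from the four corner temperatures of the cycle."""
    Tb = Ta * r ** (gamma - 1.0)   # adiabatic compression a -> b
    Td = Tc / r ** (gamma - 1.0)   # adiabatic expansion c -> d
    return 1.0 - abs(Ta - Td) / abs(Tc - Tb)

# Compare a few compression ratios for air (gamma = 1.4).
for r in (6.0, 8.0, 10.0):
    print(f"r = {r:4.1f}: efficiency = {otto_efficiency(r):.3f}")

# Cross-check with assumed temperatures Ta = 300 K and Tc = 1800 K.
e_closed = otto_efficiency(8.0)
e_temps = otto_efficiency_from_temperatures(300.0, 1800.0, 8.0)
print(f"closed form {e_closed:.4f}, from temperatures {e_temps:.4f}")

# A monoatomic working substance (gamma = 5/3) at the same r.
print(f"gamma = 5/3, r = 8: efficiency = {otto_efficiency(8.0, 5.0 / 3.0):.3f}")
```

The result does not depend on the particular temperatures chosen, only on r and the heat capacity ratio, in line with the closed-form expression.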

1.11 Refrigerator, Heat Engine, and the Second Law
What is a refrigerator?
Answer: A refrigerator is basically a heat engine that is run backwards. Simply reverse the direction
of the arrows in the ERM diagram.
Figure 19: ERM diagram for a refrigerator
The story is: An amount of heat $Q_C$ is taken away from the cold reservoir by using work W (input), and
the heat $Q_H$ is thrown away to a hot reservoir. The direction of the energy flow shows that the following
relation still holds:
$$|Q_H| = |Q_C| + |W|. \tag{1.98}$$
The ideal refrigerator would allow us to remove heat from the cold reservoir and throw it away into
the hot reservoir using only a minimum amount of work. The relevant number that tells us the
quality of the refrigerator is the coefficient of performance. This number is given by
$$K = \frac{|Q_C|}{|W|}. \tag{1.99}$$
In contrast with the heat engine's efficiency, note that this number can be greater than 1. The greater
the coefficient of performance, the better the refrigerator.
We now state the refrigerator form of the second law:
It is impossible for a refrigerator working in a cycle to produce no other effect than to transfer thermal
energy from a cold object to a hot object.
Equivalence of the Heat Engine and Refrigerator statements

To prove the equivalence, we show that if either statement is assumed to be false, the other must be false.
To illustrate, assume that the heat engine statement is not true. As shown, this leads to the violation of
the refrigerator statement. As an exercise, show that a violation of the refrigerator statement leads to a
violation of the heat engine statement.

Exercises
1. A refrigerator with a coefficient of performance of 2.40 removes 6.80 MJ of heat from its interior per
cycle. If it is run in reverse as a heat engine, what is its thermal efficiency? Ans. 29.4 %.
2. A heat engine with an efficiency of 20 % takes in 35 kJ of heat from the hot reservoir to provide work
input to a refrigerator. If the refrigerator discards 28 kJ of heat to the hot reservoir, what is its
coefficient of performance? Ans. 3.
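Both answers follow from the energy balance |Q_H| = |Q_C| + |W| together with the definitions of efficiency and coefficient of performance. A minimal sketch, assuming the reversed cycle exchanges the same amounts of heat and work; the function names are hypothetical.

```python
def efficiency_from_cop(K):
    """Same cycle run as an engine: W/|Q_H| = W/(|Q_C| + W) = 1/(1 + K)."""
    return 1.0 / (1.0 + K)

def cop_from_hot_heat_and_work(Q_hot, W):
    """K = |Q_C|/|W| with |Q_C| = |Q_H| - |W|."""
    return (Q_hot - W) / W

# Exercise 1: refrigerator with K = 2.40 run in reverse as a heat engine.
e1 = efficiency_from_cop(2.40)
print(f"Exercise 1: efficiency = {100 * e1:.1f} %")

# Exercise 2: the engine delivers W = 0.20 * 35 kJ to the refrigerator,
# which discards 28 kJ to the hot reservoir.
W = 0.20 * 35.0e3
K2 = cop_from_hot_heat_and_work(28.0e3, W)
print(f"Exercise 2: W = {W / 1e3:.1f} kJ, K = {K2:.2f}")
```

Note that the 6.80 MJ quoted in the first exercise is not needed for the efficiency, which depends only on K.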
Reversible vs. Irreversible processes

The conversion of mechanical energy into thermal energy is irreversible, as we know that we cannot have
a perfect heat engine. The heat engine allows us to only partially reverse this process. To build the most
efficient heat engine, we must therefore avoid all irreversible processes. But let us first write
down some of the conditions for having a reversible process (Tipler):
1. No work must be done by friction, viscous forces, or other dissipative forces that produce heat.
2. Heat conduction can only occur isothermally.
3. The process must be quasistatic so that the system is always in an equilibrium state (or infinitesimally
close to an equilibrium state).
Carnot's Theorem
The most efficient engine can be built using only reversible steps.
This theorem leads us to the Carnot cycle, which is illustrated in the PV diagram below.
Figure 20: Carnot cycle in PV diagram
The efficiency of the Carnot engine is given by
$$\epsilon = 1 - \frac{T_C}{T_H}. \tag{1.100}$$
Before deriving this equation, let us discuss some of its properties. We build the Carnot engine out of
the question: What is the most efficient engine working between the same hot and cold reservoirs? The
efficiency of the most efficient engine should not depend on the properties of the working substance.
This is exactly what we observe in Eq. 1.100. Contrast this with the efficiency of the Otto engine, which
depends on $\gamma$ (working substance) and r (compression ratio). Also, the only variables appearing in Eq.
1.100 are $T_C$ and $T_H$. Why? Because only the temperatures $T_H$ and $T_C$ characterize the hot and cold
reservoirs, respectively. Since the most efficient engine must not depend on the properties of the working
substance, the efficiency of this most efficient engine can depend only on $T_C$ and $T_H$. Lastly, since
$T_C < T_H$, the value given by Eq. 1.100 is less than one, thus allowing it to be interpreted as an efficiency.
Exercise: What is the efficiency of the most efficient heat engine working between reservoirs with
temperatures of 473 K and 273 K, respectively? Is it possible to have a heat engine with an efficiency of
43 % working between the same heat reservoirs? Ans. 42 %, No.
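The exercise amounts to comparing a claimed efficiency with the Carnot limit for the given reservoirs. A minimal sketch; `carnot_limit` and `is_possible` are hypothetical helper names introduced here.

```python
def carnot_limit(T_cold, T_hot):
    """Maximum efficiency between two reservoirs (temperatures in kelvin)."""
    return 1.0 - T_cold / T_hot

def is_possible(claimed_efficiency, T_cold, T_hot):
    """A claimed efficiency is allowed only if it does not exceed the Carnot limit."""
    return claimed_efficiency <= carnot_limit(T_cold, T_hot)

limit = carnot_limit(273.0, 473.0)
print(f"Carnot limit: {100 * limit:.1f} %")
print("43 % engine possible?", is_possible(0.43, 273.0, 473.0))
print("40 % engine possible?", is_possible(0.40, 273.0, 473.0))
```

The limit is only slightly above 42 %, so a 43 % engine between these reservoirs would violate Carnot's theorem while a 40 % engine would not.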

Derivation of the Carnot efficiency
The parts of the Carnot cycle are summarized in the table below.
part    description
1→2     isothermal expansion at $T_H$
2→3     adiabatic expansion, temperature decreases
3→4     isothermal compression at $T_C$
4→1     adiabatic compression, temperature increases
We perform the derivation of the Carnot efficiency using an ideal gas as a working substance. But as
discussed, the resulting equation must not depend on the properties of the gas or the working substance.
The result is more general and can actually be derived without resorting to an ideal gas as working
substance. We restrict our attention to the simplest case using an ideal gas as a working substance.
Since two of the parts are adiabatic, the heat input and output correspond only to the isothermal
parts. Let the isotherm through points 1 and 2 correspond to the temperature $T_H$ and the isotherm through
points 3 and 4 correspond to the temperature $T_C$. In this case, we have
$$Q_H = P_1 V_1 \ln\frac{V_2}{V_1} \tag{1.101}$$
$$Q_C = P_3 V_3 \ln\frac{V_4}{V_3}. \tag{1.102}$$
Therefore, using $PV = nRT$ on each isotherm, we have
$$\frac{Q_C}{Q_H} = \frac{P_3 V_3 \ln(V_4/V_3)}{P_1 V_1 \ln(V_2/V_1)} = \frac{T_C \ln(V_4/V_3)}{T_H \ln(V_2/V_1)}. \tag{1.103}$$
Now note that points 2 and 3 as well as points 1 and 4 are connected adiabatically. Thus,
$$T_H V_2^{\gamma - 1} = T_C V_3^{\gamma - 1} \tag{1.104}$$
$$T_H V_1^{\gamma - 1} = T_C V_4^{\gamma - 1}. \tag{1.105}$$
Dividing these last two expressions leads to
$$\frac{V_1}{V_2} = \frac{V_4}{V_3}, \tag{1.106}$$
so that $\ln(V_4/V_3) = -\ln(V_2/V_1)$. Finally, we have
$$\frac{Q_C}{Q_H} = -\frac{T_C}{T_H}. \tag{1.107}$$
The minus sign is present only because $Q_C < 0$ for the heat engine. Thus, we arrive at Eq. 1.100. As
expected, all dependence on the properties of the working substance cancelled out.
Carnot refrigerator
Since each step in a Carnot cycle is reversible, the entire cycle for the Carnot heat engine can be reversed,
leading to the Carnot refrigerator. The coefficient of performance of this most efficient refrigerator is
$$K = \frac{1}{\epsilon} - 1 = \frac{T_C/T_H}{1 - T_C/T_H}. \tag{1.108}$$
Exercises
1. A refrigerator has a coefficient of performance of 5.5. How much work is needed for this refrigerator
to make ice cubes from 1 L of water at 10 °C?
2. A steam engine works between a hot reservoir at 100 °C and a cold reservoir at 0 °C. What is the
maximum possible efficiency of this engine? If the engine is run backwards, what is the coefficient
of performance? Ans. 26.8 %, 2.73
3. A Carnot engine with an efficiency of 63 % performs $3.50 \times 10^4$ J of work in each cycle. If it
exhausts heat to a 298 K reservoir, what is the temperature of its heat source?
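The quoted answers to the steam engine exercise can be reproduced directly from the Carnot expressions. This sketch assumes 0 °C = 273.15 K; the function names are hypothetical.

```python
def carnot_efficiency(T_cold, T_hot):
    """Carnot efficiency: 1 - T_C / T_H, temperatures in kelvin."""
    return 1.0 - T_cold / T_hot

def carnot_cop(T_cold, T_hot):
    """Carnot coefficient of performance: (T_C/T_H) / (1 - T_C/T_H) = T_C / (T_H - T_C)."""
    return T_cold / (T_hot - T_cold)

T_hot = 100.0 + 273.15   # K
T_cold = 0.0 + 273.15    # K

e = carnot_efficiency(T_cold, T_hot)
K = carnot_cop(T_cold, T_hot)
print(f"maximum efficiency = {100 * e:.1f} %")
print(f"coefficient of performance when run backwards = {K:.2f}")
```

The two numbers are tied together by K = 1/e - 1, so a reservoir pair that gives a poor engine gives a good refrigerator and vice versa.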

1.12 Entropy and the Second Law of Thermodynamics
We have just found that, for a cyclic heat engine working using reversible processes only,
$\frac{T_C}{T_H} = \frac{|Q_C|}{|Q_H|}$. We can recast this into the form $|Q_C|/T_C = |Q_H|/T_H$. In this way,
we can tell that in a reversible engine the quantity $\frac{Q}{T}$ is not changing. We call this quantity,
which has units of heat divided by temperature (J/K), the entropy. In a reversible engine, we say that the
net change in the entropy is zero. In general, the second law of thermodynamics can be stated as
$$\Delta S \geq 0 \tag{1.109}$$
where $\Delta S$ is the change in the entropy.

Before going to a more quantitative discussion, let us further give meaning to this quantity called
entropy. This is the same entropy that we talk about in chemistry. It is basically a measure of
disorder. What Eq. 1.109 tells us is that systems always tend to go to a more disordered state. And
this is what we observe. Consider a two-compartment box where one of the compartments is occupied
by a gas and the other one is empty. If we remove the divider, the gas leaks out and occupies the whole
container. Why? Because that is the more disordered state: each of the molecules can now
occupy either of the compartments instead of only one.
Entropy is a statement of impossibility. It is a state variable like the internal energy and must
depend only on the initial and the final states. What is surprising about entropy is that the heat dQ,
as we know it, depends on the path. But when we define the quantity $\frac{dQ}{T}$, we get a quantity that
depends only on the initial and the final states. (It can be shown using the ideal gas equation and the
first law of thermodynamics that the quantity $\frac{dQ}{T}$ is a total differential. Can you do that?) So,
like the internal energy, it is the changes in the entropy that are important.
We define the change in the entropy dS for a system that goes from an initial state to a final state
(that is only infinitesimally away) by
$$dS = \frac{dQ_{\text{rev}}}{T} \tag{1.110}$$
where $dQ_{\text{rev}}$ is the heat that must be added to the system in a reversible process that brings the
system from its initial state to its final state.
Entropy of an ideal gas

Consider an arbitrary reversible quasistatic process involving a quantity of heat dQ. The first law of
thermodynamics tells us that
$$dQ = dU + dW = nC_V\,dT + \frac{nRT}{V}\,dV. \tag{1.111}$$
Dividing by the temperature T, we have
$$dS = nC_V\,\frac{dT}{T} + nR\,\frac{dV}{V}. \tag{1.112}$$
When the initial and final points are finitely separated, we easily see that
$$\Delta S = nC_V \ln(T_f/T_i) + nR \ln(V_f/V_i). \tag{1.113}$$
As a special case, we see that for an isothermal expansion
$$\Delta S = nR \ln\frac{V_f}{V_i}. \tag{1.114}$$
In the isothermal expansion, $\Delta U = 0$. The amount of heat that leaves the reservoir is converted into
work (Q = W). The entropy change of the gas is $\frac{|Q|}{T}$. Since the same amount of heat leaves the
reservoir, the change in the entropy of the reservoir is $-\frac{|Q|}{T}$. The entropy change of the universe
is zero! This is actually a more general result. Reversible processes are actually defined such that the net
change in the entropy of the universe is zero:
$$dS_{\text{universe}} = 0, \quad \text{reversible processes}. \tag{1.115}$$

In the free expansion of a gas, no work is done and no heat is transferred. Experimentally, it is found
that the temperature of the gas does not change. This process is definitely not reversible, as we get more
uncertainty in the positions of each of the gas molecules as the whole container is occupied. So how do we
quantify the entropy here? Note that the entropy depends only on the initial and the final states. Thus,
we can make the initial and the final states of the free expansion the same as the initial and final states
of an isothermal expansion. In this way, we can say that the change in the entropy in free expansion is
$$\Delta S = nR \ln\frac{V_f}{V_i}. \tag{1.116}$$
But since there is no reservoir involved in the free expansion, the entropy change of the universe is the
entropy change of the gas. Thus, $\Delta S > 0$ for the free expansion. In irreversible processes, the entropy
change of the universe is positive!
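The contrast between the reversible isothermal expansion and the free expansion can be made concrete by computing the entropy change of the gas and of the reservoir in each case. The sketch below assumes one mole of a monoatomic ideal gas doubling its volume at 300 K; these numbers and the function name `delta_s_ideal_gas` are illustrative choices made here.

```python
import math

R = 8.314  # molar gas constant in J/(mol K)

def delta_s_ideal_gas(n, Cv, Ti, Tf, Vi, Vf):
    """Entropy change of an ideal gas: n Cv ln(Tf/Ti) + n R ln(Vf/Vi)."""
    return n * Cv * math.log(Tf / Ti) + n * R * math.log(Vf / Vi)

n, Cv, T = 1.0, 1.5 * R, 300.0   # assumed: 1 mol of monoatomic gas at 300 K
Vi, Vf = 1.0e-3, 2.0e-3          # assumed: the volume doubles

# The gas goes between the same two states in both processes.
dS_gas = delta_s_ideal_gas(n, Cv, T, T, Vi, Vf)

# Reversible isothermal expansion: heat Q = W = n R T ln(Vf/Vi) leaves the reservoir.
Q = n * R * T * math.log(Vf / Vi)
dS_reservoir_reversible = -Q / T
dS_universe_reversible = dS_gas + dS_reservoir_reversible

# Free expansion: no reservoir is involved, so the universe gains what the gas gains.
dS_universe_free = dS_gas

print(f"gas: {dS_gas:.3f} J/K")
print(f"universe, reversible isothermal: {dS_universe_reversible:.3e} J/K")
print(f"universe, free expansion: {dS_universe_free:.3f} J/K")
```

The gas has the same entropy change in both cases because entropy is a state variable; only the bookkeeping for the surroundings differs.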
Microscopic Interpretation of Entropy

Being related to disorder, the entropy of a system can actually be written down as
$$S = k \ln \Omega \tag{1.117}$$
where k is, as usual, the Boltzmann constant and $\Omega$ is the number of possible ways of arranging the
parts of the system that lead to the same state. To go further, we need to define what is macroscopic and
microscopic. This is left as a reading assignment.
We give examples in terms of a very special gas composed of only three molecules. We place
this gas in a two-compartment box whose compartments are separated by a valve. The gas molecules can be in
either compartment, and we assume that we can track the position of each gas molecule.
macroscopic                               microscopic
3 molecules in the left compartment       (L, L, L)
2 molecules in the left compartment       (L, L, R), (L, R, L), (R, L, L)
1 molecule in the left compartment        (L, R, R), (R, L, R), (R, R, L)
0 molecules in the left compartment       (R, R, R)
A microstate corresponds to one possible arrangement that realizes a given macrostate. In statistical
mechanics, thermodynamics is derived by counting all of the possible microstates. We shall not go over this
subject and only give you an example of the use of Eq. 1.117.
Example: Microscopic Entropy

Consider a gas in a ten-compartment box. Initially, the gas is confined in one of the compartments. Then
we open all the valves so that the whole box can be occupied by the gas. We ask: what is the change in the
entropy of the system?
We can get the answer we want without resorting to the microscopic interpretation of entropy. We
know that the entropy change corresponding to a free expansion is given by Eq. 1.116. When we
open all compartments for the gas molecules, we get the following relationship for the volumes of the
initial and the final states: $V_f/V_i = 10$. Therefore, the change in the entropy is $\Delta S = nR \ln 10$.
We now seek to get this answer using the microscopic definition of entropy.
Initially, each of the gas molecules can occupy only one of the compartments. When all compartments
are available for each of the molecules, the ratio of the number of microstates in the final state to the
number of microstates in the initial state for a single molecule is 10. Thus, for all of the molecules, this
ratio is $10^N$, where N is the number of molecules in the gas. Using Eq. 1.117 we get
$$\Delta S = S_f - S_i = k \ln\frac{\Omega_f}{\Omega_i} = k \ln 10^N = N k \ln 10. \tag{1.118}$$
But we know that $N k = (n N_A) k = n (N_A k) = nR$. Thus we arrive at the same expression for the change
in the entropy without resorting to any knowledge of heat or the first law of thermodynamics. We simply
counted the number of microstates available to the system initially and finally.
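The counting argument can be reproduced by brute force for the three-molecule gas, and the ten-compartment result can be compared with the thermodynamic expression nR ln 10. This is a sketch: the variable names are choices made here, and the constants are the exact SI values of the Boltzmann and Avogadro constants.

```python
import math
from collections import Counter
from itertools import product

# Three distinguishable molecules, each in the left (L) or right (R) compartment.
microstates = list(product("LR", repeat=3))
macrostates = Counter(state.count("L") for state in microstates)
for n_left in (3, 2, 1, 0):
    print(f"{n_left} molecules on the left: {macrostates[n_left]} microstate(s)")

k = 1.380649e-23      # Boltzmann constant in J/K
N_A = 6.02214076e23   # Avogadro constant in 1/mol

def delta_s_microscopic(n_moles, compartments):
    """k ln(compartments**N), evaluated as N k ln(compartments) to avoid the huge power."""
    N = n_moles * N_A
    return N * k * math.log(compartments)

dS_micro = delta_s_microscopic(1.0, 10)
dS_thermo = 1.0 * (N_A * k) * math.log(10.0)   # n R ln 10 with R = N_A k
print(f"microscopic: {dS_micro:.3f} J/K, thermodynamic: {dS_thermo:.3f} J/K")
```

Taking the logarithm analytically matters in practice: the number of microstates for a mole of gas is far too large to represent directly.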

2 Special theory of relativity
We now begin our lectures on the special theory of relativity or as we will be calling it many times
special relativity or SR. These notes are not meant (yet) to be a standalone reference for the subject
but a supplement to what is discussed in Spacetime Physics by Taylor and Wheeler.
2.1 Natural Units, Inertial Reference Frames, and the Principle of Relativity
We begin the subject by introducing natural units, inertial reference frames, and the principle of relativity.
The parable of the surveyor invites us to use the same units for both the space and time dimensions.
The conversion factor is given by the speed of light:
$$c = 2.998 \times 10^8 \text{ m/s}. \tag{2.1}$$
Conventionally, we say that 1 second is the time it takes for light to travel a distance of
$3 \times 10^8$ m. In this subject we start to measure time in units of length. We can now define what we
will be calling a clock (see Figure 21). We construct mirror A in such a way that the clock ticks each time
the pulse hits it.
Figure 21: A very short pulse of light is trapped between two mirrors A and B separated by half a meter.
In our construction, we call the time between ticks 1 meter of light travel time. Now we
can say that we are measuring time in meters. What are the consequences of that?
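Converting between seconds and meters of light travel time is a single multiplication or division by c. A minimal sketch; the function names are hypothetical.

```python
C = 2.998e8  # speed of light in m/s, used here as a conversion factor

def seconds_to_meters(t_seconds):
    """Time expressed as the distance light travels in that time."""
    return C * t_seconds

def meters_to_seconds(t_meters):
    """Meters of light travel time converted back to seconds."""
    return t_meters / C

print(f"1 s = {seconds_to_meters(1.0):.3e} m of light travel time")
print(f"1 m of light travel time = {meters_to_seconds(1.0):.3e} s")
```

One tick of the mirror clock, 1 meter of light travel time, is therefore a few nanoseconds in conventional units.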
Inertial Reference Frames

Our task is to simplify the description of physical phenomena. It turns out that physics is simple for
a particular class of frames called inertial reference frames. So what is this? An inertial
reference frame is basically a frame of reference where Newton's first law of motion holds. Thus, in an
inertial reference frame, test particles at rest and test particles in motion continue in their current
state of motion unless acted upon by an external force. By an external force, we mean every force besides
the gravitational force. Why are we excluding the gravitational force?
Gravity affects everything. We know that. Every object has mass and energy, and anything with mass
and energy is affected by gravity. Because of this, we would not actually observe
any test particles moving in a straight line or staying at rest, since these test particles would be acted
upon by gravity. Because of gravity, test particles would deviate from their natural straight-line motions.
Considering this, an inertial reference frame must therefore also be a gravitation-free reference frame.
Also, we define a test object as an object that does not produce any significant gravitational force.
Example: Falling train

Consider a train that is oriented horizontally, as shown, with two test particles A and B. This train is
an inertial reference frame if, when the train has travelled the distance h, the change in the separation
of the test objects, $\Delta x$, is undetectable. Now suppose that the train is oriented vertically as shown.
The train is an inertial reference frame if, when the train falls by the distance h, the change in separation
between the test objects, $\Delta y$, is undetectable by current technology.
Figure 22: Train oriented horizontally and vertically and test objects acted upon by the gravitational field of the
earth.
The Principle of Relativity

Without further ado, the principle of relativity states that
The laws of physics are the same in every inertial reference frame.
What this means is that the form of the equations is the same in every inertial reference frame. Thus,
when we derive the physical laws in, say, the lab frame, we do not have to derive the physical laws in
every other frame. The laws in all frames take the same form. For example, in the lab frame, the ideal
gas equation and the wave equation are given by
$$PV = NkT \tag{2.2}$$
$$\frac{\partial^2 \psi}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 \psi}{\partial t^2}. \tag{2.3}$$
In another inertial reference frame, say a rocket frame that is moving with constant velocity with respect
to the lab, the ideal gas equation and the wave equation take the form
$$P'V' = N'kT' \tag{2.4}$$
$$\frac{\partial^2 \psi'}{\partial x'^2} = \frac{1}{c^2}\frac{\partial^2 \psi'}{\partial t'^2}. \tag{2.5}$$
We say that the form of the equations is preserved. Also, the physical constants, e.g. the
speed of light c and the Boltzmann constant k, have the same value. What is not necessarily the same for
the two frames are the values of their measurements of, say, the coordinates, the electric and magnetic
fields, the pressure, the volume, etc. These different values are related by what
we call the Lorentz transformations, but we postpone that discussion for a while.
2.2 Events, Spacetime Interval, and the Proper Time

In the parable of the surveyor, the job of the surveyor (its use for the people) is to record the location
of the gates of the town. In analogy, the job of the relativist is to record the location of events in
spacetime. An event is, for example, the emission of particles, a flash of light, the reflection or
absorption of particles or light, a collision, etc. The location of an event in spacetime is labelled by
four numbers (t, x, y, z). The first number t represents the time at which the event happened, while the
remaining three numbers (x, y, z) represent the position of the event (where the event happened). Now we ask,
how do we determine the location of an event in spacetime?

Figure 23: A latticework of clocks and meter sticks.
We build a latticework of meter sticks and clocks as shown. In this latticework, the time $t$ of an
event (with respect to this reference frame) is recorded by the clock that is nearest to that event.
The spatial coordinates $(x, y, z)$ of the event are given by the location of that clock in the
lattice. What we call the observer is actually the collection of all of these clocks. Through this
latticework, we can test whether a given reference frame is inertial: we simply check the motion of
test particles.
But before going very far, we first have to calibrate the clocks. We now focus on an inertial reference
frame. To calibrate, we choose a particular clock that we shall call the reference clock, and we call its
location the origin of the inertial reference frame. What we want is that when this reference clock reads
5 meters of time, the clocks located 5 meters away from it also read 5 meters of time.
To do this, we suppose that we have a large number of assistants, one positioned at each of the clocks in
the lattice. Each assistant presets the assigned clock to the time corresponding to the distance of that
clock from the reference clock. For example, the assistants at the clocks (5, 0, 0), (0, 5, 0), and
(0, 0, 5) preset their clocks to 5 meters of light-travel time and hold them at that reading.
Now we, positioned at the reference clock, start the reference clock at zero and at the same moment
release a flash of light that spreads out in all directions. The job of each assistant is to start the
assigned clock as soon as the flash reaches its location. In this way, all of the clocks in the lattice
are synchronized.
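The synchronization procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `preset_reading` and `reading_at` are hypothetical, the reference clock sits at the origin and emits the flash at time zero, and natural units are used so that distances and times are both in meters.

```python
import math

def preset_reading(position):
    """Time (meters of light-travel time) preset on the clock at `position`."""
    x, y, z = position
    # The flash from the reference clock at the origin needs this long to arrive.
    return math.sqrt(x**2 + y**2 + z**2)

def reading_at(position, reference_time):
    """Reading of the clock at `position` when the reference clock shows `reference_time`."""
    preset = preset_reading(position)
    if reference_time < preset:
        return preset  # flash has not arrived yet: clock is still held at its preset value
    return reference_time  # started on arrival, so it now agrees with the reference clock

# Sample lattice positions (illustrative values only)
for clock in [(5, 0, 0), (0, 5, 0), (0, 0, 5), (3, 4, 0)]:
    print(clock, preset_reading(clock), reading_at(clock, 7.0))
```

Every clock at distance 5 m is preset to 5 m of time, and once the flash has passed, all clocks show the same reading as the reference clock.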
Invariance of the Interval
Consider two inertial reference frames with overlapping regions of spacetime as shown. Because there
is a common region of spacetime, there are a number of events that can be described by both inertial
reference frames. Without loss of generality, we suppose that the relative motion of the two frames is
along the $x$ direction (the $x$ axes coincide). We also take the $y$ and $z$ axes of the two frames to be
oriented in the same way (not rotated) as viewed along either the $x$ axis of frame A or the $x$ axis of
frame B.
Figure 24: Two overlapping inertial reference frames. To distinguish the two frames, the clocks in frame A are
shaped as circles while the clocks in frame B are shaped as rectangles.

Now consider light that is emitted (event E) in the $y$ direction at some location $x$ along the lattice of
frame B. This light hits a mirror near the clock located $l_B$ meters away in the lattice. The light
then bounces back and is received by a detector (event R) located at the same location $x$. See Figure 25.
In frame B, the temporal and spatial separations of the events are
$$\Delta t_B = 2 l_B \tag{2.6}$$
$$\Delta x_B = 0. \tag{2.7}$$
Figure 25: Observation of events E and R by inertial frame B.
Now, inertial frame A sees something different. See Figure 26. According to an observer in inertial
frame A, the two events are separated by a distance of $\delta_A$ meters along the $x$ direction. The time
taken by light to travel along the path shown is (in natural units) $2\sqrt{l_A^2 + (\delta_A/2)^2}$.
According to this observer, the temporal and spatial separations of the events are
$$\Delta t_A = 2\sqrt{l_A^2 + \left(\frac{\delta_A}{2}\right)^2} \tag{2.8}$$
$$\Delta x_A = \delta_A. \tag{2.9}$$
Figure 26: Observation of events E and R by inertial frame A.
Using the principle of relativity, we can actually show that the coordinates perpendicular to the relative
direction of motion of the two frames remain the same (see argument by Taylor and Wheeler). This
means that we can set lA = lB = l. At this point, we should already notice that the time and space
coordinates of the events are different in the two frames. Since the coordinates perpendicular to the
relative direction of motion of the two inertial frames remain the same according to the principle of
relativity, then it follows that
$$\Delta t_A^2 - \Delta x_A^2 = \Delta t_B^2 - \Delta x_B^2. \tag{2.10}$$
Can you show this?
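A quick numerical check of this invariance, assuming the light-clock geometry described in the text: the light travels two diagonal legs, each of length $\sqrt{l^2 + (\Delta x/2)^2}$, and the perpendicular length $l$ is the same in every frame. The function names and sample values are illustrative, not part of the original notes.

```python
import math

def light_clock_separations(l, dx):
    """Return (dt, dx) between emission and reception for the light clock.

    l  : mirror distance perpendicular to the relative motion (meters)
    dx : spatial separation of the two events in this frame (meters)
    Natural units are assumed, so dt is in meters of light-travel time.
    """
    dt = 2.0 * math.sqrt(l**2 + (dx / 2.0)**2)  # two diagonal legs of the light path
    return dt, dx

def interval_squared(dt, dx):
    """Spacetime interval squared, dt^2 - dx^2."""
    return dt**2 - dx**2

l = 3.0  # hypothetical mirror distance
for dx in (0.0, 1.0, 4.0, 10.0):  # dx = 0 plays the role of frame B
    dt, _ = light_clock_separations(l, dx)
    print(dx, dt, interval_squared(dt, dx))
```

Whatever spatial separation a frame assigns to the two events, the printed interval squared stays at $4l^2$ (up to rounding), which is the content of Eq. 2.10.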
There is actually something special about our choice of reference frame B. That is the frame where
the two events E and R occur at the same place. To remove this special treatment, we consider another
inertial reference frame C (with a common region of spacetime with frames A and B). Let this inertial
frame travel at a constant speed with respect to frame A that is larger than the speed of frame B with
respect to frame A. An observer in frame C will see events E and R as shown in Figure 27. According to an
observer here, the emission and reception events have the temporal and spatial separations
$$\Delta t_C = 2\sqrt{l_C^2 + \left(\frac{\delta_C}{2}\right)^2} \tag{2.11}$$
$$\Delta x_C = \delta_C, \tag{2.12}$$
where $\delta_C$ is the distance in meters between the two events along the $x$ direction as measured in frame C.
Figure 27: Observation of events E and R by inertial frame C. The clocks are shaped as triangles in this frame.
Again, it would follow from the invariance of perpendicular lengths that
$$\Delta t_A^2 - \Delta x_A^2 = \Delta t_B^2 - \Delta x_B^2 = \Delta t_C^2 - \Delta x_C^2. \tag{2.13}$$
This result is part of a more general result called the invariance of the spacetime interval.
The result is so general that it defines the geometry of special relativity, and hence special
relativity itself. The spacetime interval $ds$ between two events is defined by
$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2. \tag{2.14}$$
Proper Time
The proper time is the time measured by an observer who is at rest in his own inertial reference
frame, that is, the wristwatch time. For a particular choice of two events, this is the time measured by
the observer in the frame where the events occur at the same place. In symbols, the proper time $d\tau$
between two (infinitesimally separated) events is given by
$$d\tau^2 = dt^2 - dx^2 - dy^2 - dz^2. \tag{2.15}$$
When the events have finite separation this becomes
$$\Delta\tau^2 = \Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2. \tag{2.16}$$
Exercises
1. Two firecrackers situated 400 m apart explode 500 m of time apart according to a lab frame.
A rocket frame observes the explosions to occur at the same place. Find the spatial separation of
these two events in each frame. What is the time between the explosions according to the rocket frame?
What is the speed of the rocket with respect to the laboratory? Ans. 400 m and 0 m, 300 m, 4/5
2. A proton moving at 3/4 of light speed (with respect to the laboratory) passes through two detectors
2 m apart. Events 1 and 2 are the transits through the two detectors. Find the time and space
separations between the two events according to (i) the lab frame and (ii) the proton frame.
Ans. (2.67 m and 2 m) and (1.76 m and 0 m)
3. Twins have their clocks initially synchronized. Observer A stays at rest in the lab frame while
Observer B travels from (0, 0) to (5 ly, 13/3 ly) as observed by A. Calculate the proper time of B
as well as its speed with respect to A. Ans. 2.5 ly, 13/15
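The exercises above all use the same two ingredients: the proper time from the invariant interval and the speed as the ratio of spatial to temporal separation. A minimal Python sketch follows; the function names are hypothetical, natural units are assumed, and motion is taken along a single spatial direction.

```python
import math

def proper_time(dt, dx):
    """Proper time between two timelike separated events (same units as dt and dx)."""
    if dt**2 < dx**2:
        raise ValueError("events are spacelike separated: no proper time")
    return math.sqrt(dt**2 - dx**2)

def frame_speed(dt, dx):
    """Speed (fraction of light speed) of the frame in which both events occur at the same place."""
    return dx / dt

# Exercise 1: firecrackers, (dt, dx) = (500 m, 400 m) in the lab
print(proper_time(500.0, 400.0), frame_speed(500.0, 400.0))  # 300 m and 4/5

# Exercise 2: proton at 3/4 light speed crossing detectors 2 m apart
dt_lab = 2.0 / 0.75                      # about 2.67 m of time
print(dt_lab, proper_time(dt_lab, 2.0))  # about 2.67 m and 1.76 m

# Exercise 3: traveller going from (0, 0) to (5 ly, 13/3 ly)
print(proper_time(5.0, 13.0 / 3.0), frame_speed(5.0, 13.0 / 3.0))  # about 2.49 and 13/15
```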

2.2.1 Spacetime Diagram, Lightcone, and Proper Time
The daytime and nighttime surveyors plot the coordinates of the town's gates on their own spatial diagrams
(the $x$-$y$ plane). It would be convenient to have something like that in relativity, and this leads us to
spacetime diagrams. These diagrams allow us to visualize events as points on a plane. See Figure 28.
The diagram shows us that events A and C occur at the same place. Also, events A and B occur at the
same time (simultaneous). In general, events that lie along a line parallel to the time axis occur
at the same place. Events that lie along a line parallel to the position axis occur at the same time.
Figure 28: Spacetime diagram. The vertical axis is the time axis while the horizontal axis is the position axis.
The points in this diagram correspond to events in spacetime.
Now consider a reference frame that we call the lab frame. In this lab frame, event O occurs at the origin
at time $t = 0$ and event B occurs also at the origin but at a later time $t = t_L$. When these events are
viewed from a rocket frame (whose origin coincides with the lab origin at $t = 0$) moving to the right with
respect to the lab frame, event O occurs at the origin at $t' = 0$ but event B occurs at some $x' < 0$ at
some time $t' \neq t_L$. Viewed from a rocket frame moving to the left with respect to the lab frame, event
O occurs at the origin at $t'' = 0$ but event B occurs at some $x'' > 0$ at some time $t'' \neq t_L$. Spacetime
diagrams drawn by observers in these frames are shown below. Since event O occurs at the origin in
these frames we have $\Delta x = x$ and $\Delta t = t$. The invariance of the interval immediately reveals
the following relationship:
$$t^2 - x^2 = t'^2 - x'^2 = t''^2 - x''^2 = \text{constant}. \tag{2.17}$$
This shows that even though the spacetime coordinates of event B differ between the frames,
event B lies on the hyperbola with equation $t^2 - x^2 = \text{constant}$ in all of the spacetime diagrams.
We call this hyperbola the invariant hyperbola.

We can characterize the separation of events in spacetime using their spacetime intervals.
Consider any pair of events. If $\Delta t^2 > \Delta x^2$, we say that the events are timelike separated.
If $\Delta t^2 < \Delta x^2$, we say that the events are spacelike separated. When
$\Delta t^2 = \Delta x^2$, the events are lightlike separated. In Figure 28, can you see which events are
timelike separated, spacelike separated, and lightlike separated?
Light Cones
Drawing light cones is a good way to determine whether events are causally related. A light cone is the
surface generated by light that is emitted at a point in spacetime. See Figure 29. All events that
are in the future light cone of A are causally related to A. By causally related, we mean that whatever
happens at event A might affect what happens at event B. In contrast, event C is outside the light cone
of A. Thus, whatever happens at A cannot affect what happens at C. This is so because nothing can travel
faster than light. We have not yet proven this but introduce the idea now. The proof will come later
when we talk about energy. We will eventually show that accelerating a particle to the speed of
light requires an infinite amount of energy. Are events A and D causally related?
Figure 29: Light cone. Light rays emerge from point A along trajectories with slopes of 45° from the horizontal.
By rotating this drawing about the vertical axis passing through point A we generate a cone.
Proper Time and Proper Distance

When two events are timelike separated, we can assign a proper time to these two events. Recall that
the proper time is the time between two events measured in the inertial reference frame where the two
events happen at the same position. The proper time can be calculated in any inertial reference frame
using the formula
$$\Delta\tau = \sqrt{\Delta t^2 - \Delta x^2}. \tag{2.18}$$
When the two events are spacelike separated we do not assign a proper time. Why? Because if two events
are spacelike separated, we cannot find a physical frame (travelling at less than the speed of light) in
which the two events happen at the same place. Instead, when two events are spacelike separated, we
define the proper distance:
$$\sigma = \sqrt{\Delta x^2 - \Delta t^2}. \tag{2.19}$$
The proper distance for any pair of spacelike separated events is the spatial separation of the events in
the frame where the two events are simultaneous.
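The classification of a pair of events and the choice between proper time and proper distance can be sketched as a single Python function. The name `classify_separation` and the sample separations are illustrative assumptions; natural units and one spatial dimension are assumed.

```python
import math

def classify_separation(dt, dx):
    """Return (kind, invariant) for two events separated by dt in time and dx in space.

    kind is 'timelike', 'spacelike' or 'lightlike'; invariant is the proper time
    for timelike pairs, the proper distance for spacelike pairs, and 0 for lightlike pairs.
    """
    s2 = dt**2 - dx**2  # interval squared
    if s2 > 0:
        return "timelike", math.sqrt(s2)    # proper time
    if s2 < 0:
        return "spacelike", math.sqrt(-s2)  # proper distance
    return "lightlike", 0.0

print(classify_separation(13.0, 5.0))   # timelike pair, proper time 12
print(classify_separation(5.0, 13.0))   # spacelike pair, proper distance 12
print(classify_separation(2.0, 2.0))    # lightlike pair
```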
Exercise
An event occurs at $(t, x) = (5\ \text{m}, 3\ \text{m})$ in some lab frame. At what speed of the rocket frame
will this event have the smallest time separation from the common origin of the inertial frames? Is the
event timelike separated or spacelike separated from the common origin? What is the proper
time/proper distance of the event with respect to the common origin?

2.3 Lorentz Transformation
It is well known that in the limit of small relative velocities between two inertial reference
frames (that is, in Newtonian mechanics) the correct transformation rule for the coordinate
systems is given by the Galilean transformation. According to this, given the coordinates $(t', x')$ in
a rocket frame moving with speed $\beta$ relative to a lab frame, the transformation rule is
$$t = t' \tag{2.20}$$
$$x = x' + \beta t' \tag{2.21}$$
$$y = y' \tag{2.22}$$
$$z = z'. \tag{2.23}$$
To go from the coordinates of the lab observer to those of the rocket frame, one only switches the sign of
the relative velocity $\beta$:
$$t' = t \tag{2.24}$$
$$x' = x - \beta t \tag{2.25}$$
$$y' = y \tag{2.26}$$
$$z' = z. \tag{2.27}$$
From the Galilean transformation, we can immediately show the Galilean velocity addition rule:
$$\beta_{CA} = \beta_{BA} + \beta_{CB} \tag{2.28}$$
which states that the velocity of a particle C relative to frame A is the sum of the velocity of
frame B relative to A and the velocity of particle C relative to B. We can think of frame A as a lab
and frame B as a rocket moving at speed $\beta_{BA}$ relative to frame A, with a particle inside the rocket
moving at speed $\beta_{CB}$. But isn't there something fundamentally wrong with this addition rule?
Consider a light pulse travelling at the speed of light in a frame B that moves with speed $\beta$ relative
to A. According to the Galilean velocity addition rule, the speed of the light pulse measured by frame
A is $\beta + 1 \neq 1$. This violates the principle of relativity! Moreover, the derivation of the
Galilean transformation assumes that the time measured in one frame is exactly the same as the
time measured in another reference frame. But we know that to be incorrect as well. So, we agree
that there is room for improvement. Our goal now is to derive the so-called Lorentz transformation,
which is the relativistic generalization of the Galilean transformation. This does not mean that the
Galilean transformation is totally incorrect or that it is useless from this point on. On the contrary,
we are going to use it as a guide in our derivation.
With this goal in mind, we assume that the transformation between the two inertial reference frames
is linear. Can you present an argument for this assumption? For simplicity, we consider two frames
whose relative motion is along the $x$ direction. Without loss of generality, we take the origins of both
frames to coincide at time zero in both frames. The principle of relativity tells us that $y = y'$ and
$z = z'$. Now, for the time and position along the $x$-axis we have
$$t = a t' + b x' \tag{2.29}$$
$$x = c t' + d x', \tag{2.30}$$
where $a$, $b$, $c$, and $d$ are constants (here $c$ is a coefficient, not the speed of light).
The principle of relativity tells us that light travels at the same speed, equal to 1 in natural units,
in any inertial reference frame. This means that a light beam sent out in the positive $x$ direction
travels at speed 1 to the right in both frames. Also, a light beam sent out in the negative $x$ direction
travels at speed 1 to the left in both frames. From Eqs. 2.29 and 2.30 we obtain
$\Delta t = a\,\Delta t' + b\,\Delta x'$ and $\Delta x = c\,\Delta t' + d\,\Delta x'$, from which we obtain
$$\frac{\Delta x}{\Delta t} = \frac{c + d\,\frac{\Delta x'}{\Delta t'}}{a + b\,\frac{\Delta x'}{\Delta t'}}. \tag{2.31}$$
That the pulse travels at the same speed in both directions in both frames gives us
$$1 = \frac{c + d}{a + b} \tag{2.32}$$
$$-1 = \frac{c - d}{a - b}. \tag{2.33}$$

We can divide the equations by each other and do some rearrangement. This leads us to conclude
that
$$\frac{d}{a} = \frac{c}{b}. \tag{2.34}$$
This last equation constrains the solution we are looking for so that it preserves the speed of light in
both frames. Now, suppose the rocket is moving with relative velocity $\beta$ with respect to the lab. This
means that points fixed on the $x'$-axis of the rocket frame ($\Delta x' = 0$) are moving with speed $\beta$
with respect to the lab observer. In symbols, Eq. 2.31 tells us that
$$\beta = \frac{c}{a}. \tag{2.35}$$
Now we use the principle that the coordinates lie along the same hyperbola in the two frames. Using
Eqs. 2.29 and 2.30 we obtain
$$t^2 - x^2 = (a^2 - c^2)\,t'^2 - (d^2 - b^2)\,x'^2 + 2 t' x' (ab - cd). \tag{2.36}$$
It follows from the invariant hyperbola (or from the invariance of the interval) that
$$a^2 - c^2 = 1 \tag{2.37}$$
$$d^2 - b^2 = 1 \tag{2.38}$$
$$ab - cd = 0. \tag{2.39}$$
It is left as an exercise to solve the remaining system of equations. The result is the Lorentz
transformation:
$$t = \gamma\,(t' + \beta x') \tag{2.40}$$
$$x = \gamma\,(\beta t' + x') \tag{2.41}$$
$$y = y' \tag{2.42}$$
$$z = z' \tag{2.43}$$
where $\gamma$ is the so-called Lorentz factor that depends on the speed $\beta$ as
$$\gamma = \frac{1}{\sqrt{1 - \beta^2}}. \tag{2.44}$$
If this is going to be the correct transformation, then this must reduce to the Galilean transformation
for small velocities. And indeed it does. Can you see it? Go back to the ordinary units to see clearly.
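One way to see the Galilean limit is to restore ordinary units. The sketch below assumes the Lorentz transformation in the natural-unit form $t = \gamma(t' + \beta x')$, $x = \gamma(\beta t' + x')$ and introduces two symbols that are not in the natural-unit equations: the ordinary-unit speed $v$ of the rocket and the speed of light $c$ (not to be confused with the coefficient $c$ used in the derivation), with $\beta = v/c$ and natural-unit time equal to $c$ times ordinary time.

```latex
% Restore ordinary units: (natural time) = c * (ordinary time), beta = v / c
\begin{align*}
  t &= \gamma\left(t' + \frac{v}{c^{2}}\,x'\right), \qquad
  x = \gamma\left(x' + v\,t'\right), \qquad
  \gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}} \\[4pt]
  % For v << c the Lorentz factor tends to 1 and the x'-term in t becomes negligible
  \gamma &\approx 1 + \frac{v^{2}}{2c^{2}} \longrightarrow 1, \qquad
  \frac{v}{c^{2}}\,x' \longrightarrow 0 \qquad (v \ll c) \\[4pt]
  t &\approx t', \qquad x \approx x' + v\,t'
\end{align*}
```

The last line is the Galilean transformation, Eqs. 2.20 and 2.21, written in ordinary units. The term $v x'/c^2$ is negligible only when $x'$ is not so large that it compensates for the small factor $v/c^2$.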
Exercises
1. Consider a rocket frame moving along the positive $x$ axis with speed 0.75. Suppose an event A has
coordinates $t_A = 3$ yr and $x_A = 5$ yr in the lab frame. What are the coordinates of the same event
in the rocket frame? Ans. $(t', x') = (-1.13\ \text{yr}, 4.16\ \text{yr})$.
2. Your friend recorded the location of a certain explosion at (10 m, 25 m). If he is aboard a bus
moving at a velocity $-0.75$ with respect to your reference frame, what are the coordinates of the
event in your reference frame? Ans. $(-13.2\ \text{m}, 26.5\ \text{m})$.
The form of the Lorentz transformation given by Eqs. 2.40, 2.41, 2.42, and 2.43 is
actually not the most convenient way of writing it. It turns out that one can write the Lorentz
transformation as
$$t = t' \cosh\theta + x' \sinh\theta \tag{2.45}$$
$$x = t' \sinh\theta + x' \cosh\theta \tag{2.46}$$
$$y = y' \tag{2.47}$$
$$z = z' \tag{2.48}$$
where
$$\beta = \tanh\theta. \tag{2.49}$$
The quantity $\theta$ is known as the rapidity and we will give more meaning to it later in the course. In
the meantime, the latter form is quicker to enter on a calculator when performing the Lorentz
transformation to find the coordinates of an event in another inertial reference frame. Use it for the
last two exercises and show that you arrive at the same answers.
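Both forms of the Lorentz transformation can be compared numerically. The sketch below is a minimal illustration, not a definitive implementation: the function names are hypothetical, the inputs are rocket-frame coordinates $(t', x')$ with the rocket moving at speed `beta` relative to the lab, and the lab-to-rocket direction is obtained by flipping the sign of `beta`.

```python
import math

def lorentz_gamma_form(t_p, x_p, beta):
    """Rocket coordinates (t', x') -> lab coordinates (t, x), using the Lorentz factor."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (t_p + beta * x_p), gamma * (beta * t_p + x_p)

def lorentz_rapidity_form(t_p, x_p, beta):
    """Same transformation written with the rapidity theta, where beta = tanh(theta)."""
    theta = math.atanh(beta)
    t = t_p * math.cosh(theta) + x_p * math.sinh(theta)
    x = t_p * math.sinh(theta) + x_p * math.cosh(theta)
    return t, x

# Exercise 1: lab coordinates (3 yr, 5 yr); rocket moves at +0.75, so lab -> rocket uses -0.75
print(lorentz_gamma_form(3.0, 5.0, -0.75))
print(lorentz_rapidity_form(3.0, 5.0, -0.75))

# Exercise 2: bus coordinates (10 m, 25 m); bus moves at -0.75 relative to you
print(lorentz_gamma_form(10.0, 25.0, -0.75))
print(lorentz_rapidity_form(10.0, 25.0, -0.75))
```

The two functions agree to rounding error, and the signs of the time coordinates show that both events occur before the common origin event in the target frame.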

2.4 Relativity of Simultaneity, Time Dilation, and Length Contraction
At this point, we know that the coordinates of events have different values in different
inertial reference frames moving at constant speed with respect to one another. So, it should not come
as a surprise that two events which are simultaneous (same time) in one frame are not simultaneous in
another. The question we ask now is: given the coordinates of two events A and B in a lab frame, can
we find a rocket frame where the two events are simultaneous? Where event A happens before B?
Where event B happens before A? We will focus on answering these questions in this lecture and the
next. To start, we work through the following problems.
Exercises
1. Consider two events A (0 yr, $-10$ yr) and B (0 yr, 10 yr) as measured in the lab frame. According
to the lab frame, which of the two events happened first? Ans. Both happened at the same time.
2. Consider two events A (0 yr, $-10$ yr) and B (0 yr, 10 yr) as measured in the lab frame. In a rocket
frame moving to the right with speed 0.75, which of these two events will happen first? Ans. Event
B occurred first.
3. Consider two events A (0 yr, $-10$ yr) and B (0 yr, 10 yr) as measured in the lab frame. In a rocket
frame moving to the left with speed 0.75, which of these two events will happen first? Ans. Event
A occurred first.
So, how can we tell if two events can happen at the same time? Say we have two events A and B
whose coordinates we know in the lab frame. In the rocket frame where these two events happen at the
same time we have $\Delta t' = 0$. Using the invariance of the interval we have
$$\Delta x'^2 = \Delta x^2 - \Delta t^2. \tag{2.50}$$
Since the left-hand side cannot be negative, this tells us that if there is a rocket frame in which events
A and B are simultaneous, then the interval between events A and B must be spacelike.
How can we tell if two events can happen at the same place? Again, we have two events A and B
whose coordinates we know in the lab frame. In the rocket frame where these two events happen at the
same place we have $\Delta x' = 0$. Using the invariance of the interval we have
$$\Delta t'^2 = \Delta t^2 - \Delta x^2. \tag{2.51}$$
This tells us that if there is a rocket frame in which events A and B happen at the same place, then the
interval between events A and B must be timelike.
Given the spacetime diagram in Figure 30 (according to some lab frame observer) can you tell which
events can happen at the same time/place in some rocket frame? If you are extra clever, can you tell the
speed of the rocket frame where this pair of events can happen at the same time/place? To guide you,
events B and D can happen at the same time in a rocket frame moving with speed 2/6 = 1/3 the speed
of light. Discover if something like this holds for other pairs of events.
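The speed quoted for events B and D suggests a general rule, which can be sketched in Python. Setting $\Delta t' = \gamma(\Delta t - \beta\,\Delta x) = 0$ gives $\beta = \Delta t/\Delta x$ for a spacelike pair, and setting $\Delta x' = \gamma(\Delta x - \beta\,\Delta t) = 0$ gives $\beta = \Delta x/\Delta t$ for a timelike pair. The function names are hypothetical, and the separation (2, 6) used below is only an assumed example consistent with the ratio 2/6 quoted in the text, since the coordinates in Figure 30 are not reproduced here.

```python
import math

def boost_separation(dt, dx, beta):
    """Lab separations (dt, dx) -> separations in a rocket frame moving at speed beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (dt - beta * dx), gamma * (dx - beta * dt)

def speed_for_same_time(dt, dx):
    """Rocket speed at which a spacelike pair of events is simultaneous."""
    if abs(dt) >= abs(dx):
        raise ValueError("events are not spacelike separated")
    return dt / dx  # solves dt - beta * dx = 0

def speed_for_same_place(dt, dx):
    """Rocket speed at which a timelike pair of events occurs at the same place."""
    if abs(dx) >= abs(dt):
        raise ValueError("events are not timelike separated")
    return dx / dt  # solves dx - beta * dt = 0

beta = speed_for_same_time(2.0, 6.0)         # assumed separation: dt = 2, dx = 6
print(beta, boost_separation(2.0, 6.0, beta))  # speed 1/3; time separation vanishes
```

In the rocket frame found this way, the remaining spatial separation equals the proper distance $\sqrt{\Delta x^2 - \Delta t^2}$ of the pair.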
Time Dilation
Consider two events A and B. Say a muon is created (event A), propagates in spacetime at uniform
speed, and decays (event B). We observe the two events in the lab and conclude that the
temporal separation of the two events is $\Delta t$. According to the muon, the creation and decay happen
at the same place, $\Delta x' = 0$. In the muon frame, its rest frame, the temporal separation between the two
events is equal to the proper time $\Delta\tau$. We ask, what is the relationship between $\Delta t$ and
$\Delta\tau$? Using the Lorentz transformation, setting $\Delta x' = 0$, we immediately obtain
$$\Delta t = \Delta\tau \cosh\theta. \tag{2.52}$$
Since $\cosh\theta \geq 1$ we see that $\Delta t \geq \Delta\tau$. Since the proper time is what we read as the
wristwatch time (of the muon), this is where the saying "moving clocks run slower" comes from. It turns
out that we (observing the two events in the lab) actually age more than the muon between the two events.

Figure 30: Spacetime diagram according to some frame that observes events A, B, C, and D.
Exercises
1. A muon decays into other particles with a mean lifetime of $2.20 \times 10^{-6}$ s as measured in a
reference frame in which it is at rest. If a muon is moving at 0.990 relative to a lab frame, what will an
observer in this lab measure its mean lifetime to be? Ans. $15.6 \times 10^{-6}$ s.
2. An airplane flies over a distance of $4.80 \times 10^{6}$ m at a steady speed of 300 m/s
($1.00 \times 10^{-6}$ in natural units) with respect to a lab frame. How much time does the trip take as
measured by an observer in the lab frame? By an observer in the plane? Ans. $1.6 \times 10^{4}$ s and
$1.6 \times 10^{4}$ s $\times\,(1 - 0.5 \times 10^{-12})$.
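The time dilation relation $\Delta t = \Delta\tau\cosh\theta$ with $\beta = \tanh\theta$ is easy to evaluate numerically. The sketch below applies it to the muon of Exercise 1; the function name `dilated_interval` is a hypothetical choice.

```python
import math

def dilated_interval(proper_time, beta):
    """Lab-frame time between two events that are separated by `proper_time`
    in the rest frame of a clock moving at speed `beta` (fraction of light speed)."""
    theta = math.atanh(beta)  # rapidity, from beta = tanh(theta)
    return proper_time * math.cosh(theta)

lab_lifetime = dilated_interval(2.20e-6, 0.990)
print(lab_lifetime)  # close to 1.56e-5 s, in line with the quoted 15.6 x 10^-6 s
```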
Length Contraction
As with time dilation, it should not come as a surprise that the length of a stick as measured in two
frames moving at some relative speed with respect to each other is different. To give more insight, we
ask: how is the length of a stick measured in an inertial reference frame? The length of a stick is
measured by reading out the spatial coordinates of both ends of the stick at the same time. This involves
two events. Two events which are simultaneous in, say, a rocket frame will not be simultaneous in
other reference frames. We know this now and this explains why the lengths are different. Now we ask:
how are the lengths related?
To answer this last question, consider a stick at rest in a rocket frame and two simultaneous events
corresponding to reading the spatial coordinates of both ends of the stick in that frame. We know that
the value that comes out of this is the proper distance between the events. We can call this the proper
length $\sigma$. Because the stick is at rest in the rocket, its ends have fixed rocket coordinates, so
$\Delta x' = \sigma$ for any pair of events located at the two ends. What does an observer in the lab frame
measure? The separation $\Delta x$ between the two ends, read at the same lab time. The Lorentz
transformation relates the two by $\Delta x' = \sigma = -\Delta t \sinh\theta + \Delta x \cosh\theta$. Since
the lab observer reads both ends simultaneously in the lab frame ($\Delta t = 0$), we have
$$\Delta x = \sigma / \cosh\theta. \tag{2.53}$$
Because $\cosh\theta \geq 1$ we see that $\Delta x \leq \sigma$ and we have contraction.
Exercises
1. A cylindrical rocket has radius 5.00 m and length 20.0 m in its rest frame (the rocket frame). This
rocket was observed in the lab frame to be moving along the direction of its length, at constant
velocity -0.700. What is the radius and length of this rocket according to the lab frame? Ans.
Radius = 5.00 m, Length = 14.3 m
2. Megaman X boards a spaceship and then zips past Zero (at rest in a lab frame) at a relative speed
of 0.600. Zero starts to blink just as X flies past him, and X measures that the blink takes 0.400 s
from beginning to end. According to Zero, what is the duration of his blink?
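Length contraction, $\Delta x = \sigma/\cosh\theta$ with $\beta = \tanh\theta$, can be checked against the rocket of Exercise 1. This is a minimal sketch; the function name `contracted_length` is hypothetical, and only the dimension along the direction of motion is contracted.

```python
import math

def contracted_length(proper_length, beta):
    """Length measured in a frame where the object moves at speed `beta` along its length."""
    theta = math.atanh(beta)  # rapidity; cosh is even, so the sign of beta does not matter
    return proper_length / math.cosh(theta)

radius = 5.00                             # perpendicular to the motion: unchanged
length = contracted_length(20.0, -0.700)  # along the motion: contracted
print(radius, length)                     # length is close to 14.3 m
```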

2.5 Two-observer Spacetime Diagram
In the previous exercises, we have used the Lorentz transformation to identify the chronological order of
events as viewed in inertial reference frames moving at a constant speed relative to each other. It would
be convenient to have, for example, a way in which we can determine this chronological order without
actually performing a calculation (in the sense of involving a calculator). It turns out that this is possible
with the help of a two-observer spacetime diagram. Moreover, the two-observer spacetime diagram
allows us to get more insight into simultaneity, time dilation, and length contraction.
Let us consider two inertial reference frames whose relative motion is along the $x$-direction. In this
case, we know that the coordinates $(t', x')$ of one frame are related to the coordinates $(t, x)$ of the
other frame by the Lorentz transformation, Eqs. 2.45 and 2.46. The time axis is the worldline of the
origin of the inertial reference frame. The position axis is the locus of events that are simultaneous
with the temporal origin of the inertial reference frame. This is what we ask: in the spacetime diagram of
a lab frame, say, with coordinates $(t, x)$, what do the time axis and the position axis of a rocket frame
with coordinates $(t', x')$ look like? Let us simplify the answer to this question by considering inertial
reference frames whose origins coincide at the zero of time.
The time axis of the rocket frame is the worldline of its origin $x' = 0$. To draw this
trajectory, we set $x' = 0$ in the Lorentz transformation equations to obtain $t = t' \cosh\theta$ and
$x = t' \sinh\theta$. The orientation of the time axis of the rocket frame is then given by
$x/t = \tanh\theta = \beta$. Thus, it is a line tilted with respect to the time axis of the lab frame, with a
slope (measured from the lab time axis) given by the speed of the rocket frame relative to the lab. What
about the $x'$-axis? Since this is the locus of events that are simultaneous with $t' = 0$ in the rocket
frame, we set $t' = 0$ in the Lorentz transformation. This gives $t = x' \sinh\theta$ and
$x = x' \cosh\theta$. The orientation of the $x'$-axis of the rocket frame as viewed in the lab is given by
$t/x = \tanh\theta = \beta$. The two-observer spacetime diagram for a lab frame and a rocket frame moving
with velocity 0.25 with respect to the lab is shown in Figure 31.
Figure 31: Two-observer spacetime diagram for a lab frame and a rocket frame moving with speed 0.25 to the
right relative to the lab frame.
Now, how does the two-observer spacetime diagram serve our purpose of determining the chronological
order of events as viewed in different inertial reference frames? Simple. Events that are simultaneous in
an inertial reference frame lie along a line that is parallel to the position axis of that inertial
reference frame. Similarly, events that occur at the same place in an inertial reference frame lie along a
line that is parallel to the time axis of that inertial reference frame.
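To draw the rocket axes on a lab diagram, it is enough to compute one point on each axis. The sketch below does this from the rapidity form of the Lorentz transformation; the function name `rocket_axis_points` and the choice of a unit tick are illustrative assumptions.

```python
import math

def rocket_axis_points(beta, tick=1.0):
    """Lab coordinates (t, x) of the rocket-frame tick marks t' = tick and x' = tick.

    The first point lies on the rocket time axis (x' = 0), the second on the
    rocket position axis (t' = 0).
    """
    theta = math.atanh(beta)
    time_axis_point = (tick * math.cosh(theta), tick * math.sinh(theta))      # x' = 0
    position_axis_point = (tick * math.sinh(theta), tick * math.cosh(theta))  # t' = 0
    return time_axis_point, position_axis_point

(t1, x1), (t2, x2) = rocket_axis_points(0.25)
print(x1 / t1, t2 / x2)             # both ratios equal the rocket speed 0.25
print(t1**2 - x1**2, x2**2 - t2**2)  # both tick marks lie on unit invariant hyperbolas
```

Joining each point to the origin gives the tilted $t'$ and $x'$ axes. The tick marks sit farther from the origin than the lab tick marks because they lie on the invariant hyperbolas rather than on a circle.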
Exercise
1. Draw the two-observer spacetime diagram for a lab frame and a rocket frame moving with a speed
0.25 to the left with respect to the lab.

2. Figure 31 is the two-observer spacetime diagram drawn from the point-of-view of the lab observer.
Draw the same two-observer spacetime diagram from the point-of-view of the rocket frame observer.
3. Consider the events A (0 yr, $-10$ yr) and B (0 yr, 10 yr) as measured by an observer in a frame
that we shall call the lab frame. By drawing a two-observer spacetime diagram, determine the
chronological order of the events as viewed in a rocket frame that is moving to the right with speed 3/4
according to the lab frame. Also, using the two-observer spacetime diagram technique, determine
the chronological order of the events in a rocket frame that is moving to the left with speed 3/4 relative
to the lab frame.
4. Explore time dilation and length contraction using the two-observer spacetime diagram technique.
Hint: Calibrate the axes by drawing the invariant hyperbolas. For example, the
1 m of time on the time axes of both frames corresponds to the intersection points of the hyperbola
$1\ \text{m}^2 = t^2 - x^2$ with the $t$ and $t'$ axes. Similarly, the 1 m of position on the position axes
of both frames corresponds to the intersection points of the hyperbola $1\ \text{m}^2 = x^2 - t^2$ with
the $x$ and $x'$ axes.
2.6 Velocity Transformation and Doppler Effect

In writing down the Lorentz transformation, we introduced the variable $\theta$ through the substitution
$\beta = \tanh\theta$ without giving any physical significance to it. At best, we have only shown that using
$\theta$ we can write the Lorentz transformation in a way that looks like a rotation in the plane of the
time and position axes. We now give more meaning to $\theta$ in this section by showing that its addition
rule is analogous to the Galilean velocity addition rule. What does the Galilean velocity
addition rule state? Consider an object X that is moving with velocity $\beta_{XR}$ with respect to a rocket
frame observer. If the rocket frame moves with a velocity $\beta_{RL}$ as measured by a lab frame observer,
then the Galilean velocity addition rule states that the velocity of the object X as measured in the lab
frame is given by
$$\beta_{XL} = \beta_{RL} + \beta_{XR}. \tag{2.54}$$
Of course, it is now established that the Galilean velocity addition rule applies only
when the speeds are much less than the speed of light. This is unfortunate because the Galilean
rule is simple: velocities combine by ordinary addition of numbers. But it
turns out that the addition rule in terms of $\theta$ is just as simple as the Galilean velocity addition.
The variable $\theta$ has the special name velocity parameter or rapidity. These two terms mean the same
thing and may be used interchangeably. If $\theta_{XR} = \tanh^{-1}(\beta_{XR})$ is the velocity parameter of
the object X relative to the rocket frame and $\theta_{RL} = \tanh^{-1}(\beta_{RL})$ is the velocity parameter
of the rocket frame relative to the lab frame, then the velocity parameter of the object X as measured by
the lab frame is given by
$$\theta_{XL} = \theta_{RL} + \theta_{XR}. \tag{2.55}$$
Let us now take a few steps back and derive the velocity parameter addition rule, Eq. 2.55.
Start with the Lorentz transformation for temporal and spatial separations:
$$\Delta t = \Delta t' \cosh\theta + \Delta x' \sinh\theta \tag{2.56}$$
$$\Delta x = \Delta t' \sinh\theta + \Delta x' \cosh\theta. \tag{2.57}$$
Let the two events correspond to the passage of the object X through lattice clocks in the lab and rocket
frames. Therefore, we have $\Delta x/\Delta t = \beta_{XL}$ and $\Delta x'/\Delta t' = \beta_{XR}$ for the
velocity of object X as measured in the lab and rocket frames, respectively. The velocity of the rocket
frame with respect to the lab frame is of course given by $\beta_{RL} = \tanh\theta$. Dividing the above
equations for the coordinate separations gives
$$\frac{\Delta x}{\Delta t} = \frac{\Delta t' \sinh\theta + \Delta x' \cosh\theta}{\Delta t' \cosh\theta + \Delta x' \sinh\theta} = \frac{\tanh\theta + \frac{\Delta x'}{\Delta t'}}{1 + \frac{\Delta x'}{\Delta t'}\tanh\theta}. \tag{2.58}$$
We can write the last result as
$$\beta_{XL} = \frac{\beta_{RL} + \beta_{XR}}{1 + \beta_{XR}\,\beta_{RL}}. \tag{2.59}$$

This equation is the relativistic velocity addition rule and is the generalization of the Galilean
velocity addition rule. But as we have noted, it does not look as simple as the Galilean velocity addition
rule, and this is an undesirable feature even though its description of nature is correct. But now let us
define the velocity parameters $\theta_{XR} = \tanh^{-1}(\beta_{XR})$ and $\theta_{XL} = \tanh^{-1}(\beta_{XL})$,
together with $\theta_{RL} = \tanh^{-1}(\beta_{RL})$. In this way, Eq. 2.59 can be written as
$$\tanh(\theta_{XL}) = \frac{\tanh(\theta_{RL}) + \tanh(\theta_{XR})}{1 + \tanh(\theta_{XR})\tanh(\theta_{RL})}. \tag{2.60}$$
But the right-hand side is equal to $\tanh(\theta_{RL} + \theta_{XR})$ by the addition identity for the
hyperbolic tangent. This means that $\theta_{XL} = \theta_{RL} + \theta_{XR}$, which is the velocity
parameter/rapidity addition rule.
Exercise
1. A spaceship is moving at a speed 0.70 with respect to a rocket moving at a speed 0.50 as measured
by the lab frame. What is the speed of the spaceship as measured by the lab frame? Solve the
problem in two ways: (i) direct use of the relativistic velocity addition rule (ii) use of the rapidity.
2. A spaceship is moving at a speed -0.70 with respect to a rocket moving at a speed 0.50 as measured
by the lab frame. What is the speed of the spaceship as measured by the lab frame?
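The two routes suggested in Exercise 1 can be compared numerically. The sketch below assumes collinear motion and speeds given as fractions of the speed of light; the function names are hypothetical.

```python
import math

def add_velocities(beta_rl, beta_xr):
    """Relativistic velocity addition, used directly."""
    return (beta_rl + beta_xr) / (1.0 + beta_xr * beta_rl)

def add_via_rapidity(beta_rl, beta_xr):
    """Same result obtained by adding rapidities and converting back to a speed."""
    theta_xl = math.atanh(beta_rl) + math.atanh(beta_xr)
    return math.tanh(theta_xl)

for beta_xr in (0.70, -0.70):  # spaceship relative to the rocket
    direct = add_velocities(0.50, beta_xr)
    via_theta = add_via_rapidity(0.50, beta_xr)
    print(beta_xr, direct, via_theta)
```

Both routes give the same speed, and the result never exceeds 1 in magnitude even though the Galilean sum 0.50 + 0.70 would.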
2.6.1 Doppler Effect

Consider a light source that is emitting photons at a constant time interval. The frequency of the light
source as measured by an observer is simply the reciprocal of the measurement of the observer of this
time interval. Since the time intervals are not the same in different inertial reference frames, it comes as
no surprise that the frequencies measured by observers in different inertial reference frames are not the
same. The relationship between the frequencies (or wavelengths) is given by the relativistic expression
for the Doppler effect. Let us derive this.
Consider a light source (say, a lamp) that emits photons at a constant time interval $\tau$ as measured
in the light source's rest frame. Let this light source be carried by a rocket frame that is moving with
a speed $\beta = \tanh\theta$ relative to a lab frame. In the rocket frame, the light source is at rest.
Therefore, the rocket frame is the rest frame of the source and the observers in the rocket frame would
say that the frequency of the source is $f_S = 1/\tau$. The question we would like to ask is: what is the
frequency of the light emitted by the source as measured by observers in the lab frame? To get the
answer, we look closely at two adjacent photon emission events $E_1$ and $E_2$ shown in the two-observer
spacetime diagram in Figure 32. In the rocket frame, the coordinates of events $E_1$ and $E_2$ are
$(t', 0)$ and $(t' + \tau, 0)$, respectively. In the lab, the coordinates of $E_1$ and $E_2$ are given by
$(t' \cosh\theta, t' \sinh\theta)$ and $((t' + \tau)\cosh\theta, (t' + \tau)\sinh\theta)$, respectively.
Since the photons travel along diagonal lines oriented at 45° relative to the time and position axes, the
light-travel time from each emission event to a receiver at rest at the lab origin equals the position of
that emission event. The temporal coordinates of the reception events $R_1$ and $R_2$ in the lab are then
$t_1 = t' \cosh\theta + t' \sinh\theta$ and $t_2 = (t' + \tau)\cosh\theta + (t' + \tau)\sinh\theta$,
respectively. The observers in the lab would then define the frequency through the reciprocal of the time
difference, $t_2 - t_1 = 1/f_R$. It straightforwardly follows that
$$\frac{1}{f_R} = \frac{1}{f_S}\,(\cosh\theta + \sinh\theta) = \frac{1}{f_S}\sqrt{\frac{1 + \beta}{1 - \beta}}. \tag{2.61}$$
Therefore, the frequency $f_R$ that is detected by the receiver is related to the frequency of the source
$f_S$ by
$$f_R = f_S \sqrt{\frac{1 - \beta}{1 + \beta}}. \tag{2.62}$$
This is the relativistic expression for the Doppler effect. To get the relationship for wavelengths,
simply take the reciprocal of the above relation.
Figure 32: Relativistic Doppler effect: the time intervals between the emission and reception of photons in
the two inertial reference frames are different.
Eq. 2.62 tells us that the frequency detected by a receiver is less than the frequency of the
source if the source is moving away ($\beta > 0$) from the receiver. We say that the light is red-shifted
(recall that in terms of frequency red is at the lower end of the visible spectrum). On the contrary, if the
source is moving towards the receiver ($\beta < 0$) the frequency detected by the receiver is greater
than the frequency of the source. In this case, we say that the light is blue-shifted. Can you guess
why? One of the hints that led to the prediction of the expansion of the universe is that light
received on Earth from all directions is always red-shifted. Can you explain why?
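As a numerical illustration of Eq. 2.62, here is a minimal Python sketch. The function names are hypothetical (they do not appear in these notes), and the sign convention follows the text: $\beta > 0$ for a receding source, in natural units with $|\beta| < 1$.

```python
import math

def received_frequency(f_source, beta):
    """Eq. 2.62: frequency seen by the receiver; beta > 0 means the source recedes."""
    return f_source * math.sqrt((1.0 - beta) / (1.0 + beta))

def source_speed(f_source, f_received):
    """Invert Eq. 2.62 for beta; a negative result means the source approaches."""
    ratio_sq = (f_received / f_source) ** 2
    return (1.0 - ratio_sq) / (1.0 + ratio_sq)

# A source receding at beta = 0.6 is received at about half its rest-frame frequency
# (red shift); the same source approaching is received at about twice that frequency.
print(received_frequency(1.0, 0.6))
print(received_frequency(1.0, -0.6))
```

The inverse function follows from squaring Eq. 2.62 and solving for $\beta$, which is the step needed whenever a speed is to be found from a measured frequency or wavelength ratio.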
Exercises
1. How fast must you be approaching a red traffic light ($\lambda = 675$ nm) for it to appear yellow ($\lambda =
575$ nm)? Ans. 0.159 or $4.77 \times 10^7$ m/s
2. A source of electromagnetic radiation is moving in radial direction relative to you. The frequency
you measure is 1.25 times the frequency measured in the rest frame of the source. Is the source
moving toward you or away from you? What is the speed of the source relative to you?
2.7 Relativistic Energy and Momentum

Since we started our study of SR, we have only been talking about events, that is, points in spacetime,
and how these are viewed in one inertial reference frame and another. The underlying idea behind many
of our results is the invariance of the spacetime interval. For particles that are moving in spacetime we
find that we can actually parametrize the trajectories using the proper time $\tau$:
$$\tau^2 = t^2 - x^2. \tag{2.63}$$
This proper time has the same value when calculated in different inertial reference frames. We can say that the
proper time is invariant under Lorentz transformation or, simply, that it is Lorentz invariant.
We are now ready to tackle the concepts of energy and momentum in special relativity. Although
these are different from the Newtonian values of the energy and momentum, we see that we are guided
by the Newtonian theory in building a more satisfactory theory of nature.
What are some of the difficulties encountered in applying the Newtonian theory of mechanics to relativistic
particles? First, it turns out that the quantity $m\beta$ for a particle of mass $m$ and velocity $\beta$, which
is called the Newtonian momentum, is not conserved in high energy collisions. There are two ways to proceed
from this point. We can keep this form of momentum in special relativity and accept that momentum
is not going to be conserved in collisions; that is, the sums of the momentum vectors before and after a
collision are not equal. The other way is to drop the Newtonian expression of momentum
$m\beta$ completely and look for a new one that is going to be conserved in a collision. The conservation of momentum
and energy has guided and simplified the analysis of many problems not only in physics but in science
as a whole. We prefer and choose the latter solution: to find the conserved quantity that we shall
call the momentum. Second, Newtonian mechanics cannot explain the production of particle-antiparticle
pairs even in terms of its conservation laws, yet these phenomena are routinely observed today. Try
explaining the production of an electron and a positron via the collision of two photons using Newtonian
theory. Of all things, photons are relativistic, and Newtonian mechanics will not even get close to
explaining the behaviour of photons. There are other reasons which suggest that we have to modify the
Newtonian theory.
Relativistic Momentum
So where do we begin? We start by finding the expressions for the relativistic momentum and energy
that are conserved before and after a collision. We do this by considering a so-called glancing collision as
shown in the figure below.
Figure 33: Glancing collision of identical masses A and B as viewed in three inertial reference frames.
Momentum is a vector. This means that the first thing we must establish
is the direction in which it points. The only unique direction we can actually choose is
the direction of the motion of the particle. If the momentum vector were oriented at an angle relative to
the direction of motion, then there would be an infinite set of possible choices, one for
each direction making the same angle with the motion. The isotropy of space does not allow us to prefer
any one of these vectors. Thus, we choose the momentum to be parallel to the direction of motion. As
guided by symmetry we say that
The direction of the momentum of a particle is parallel to the direction of motion of the particle.
Let us now calculate the magnitude, guided by the Newtonian expression for momentum. Let
us (in a lab frame) arrange the encounter such that mass A is relativistic while mass
B is non-relativistic. This means that we can use the non-relativistic expression of momentum for the
relativistic momentum of mass B. We can always look at the collision in the frame where the velocities
of the masses point in opposite directions. In this velocity-symmetric frame, we can easily argue
that the total momentum before the collision is zero. We then require that the total momentum after the
collision be zero as well. This frame where the total momentum is zero is shown in Figure 33. In some
rocket frame that is moving to the left, the initial and final horizontal momentum of particle
B is zero. We call this frame $S'$. In another rocket frame that is moving to the right at
a sufficient speed, the horizontal component of the momentum of particle A is zero. We
call this frame $S''$. We choose the encounter such that we can use the Newtonian expression $m\,\Delta y/\Delta t$, to a
good approximation, for mass B. We analyze the collision in frame $S'$. Note that for mass B the proper
time $\Delta\tau$ from before the collision to the point of impact, and from immediately after impact to the final position, is
almost the same as the coordinate time $\Delta t$. Looking at frame $S'$, the conservation of momentum along
the vertical direction leads us to
$$\text{vertical component of } p_A = m\frac{\Delta y}{\Delta\tau}. \tag{2.64}$$
By the symmetry of frames $S'$ and $S''$, we are further led to conclude that the proper times of A and B
are the same. The relativistic expression for the momentum of mass A can be obtained by using similar-triangle
identities for the displacement and momentum diagrams (Figure 34).
Figure 34: Displacement and momentum diagram for the glancing collision of identical masses in frame $S'$.
The resulting expression is given by
$$p = m\frac{dr}{d\tau}. \tag{2.65}$$
Using the expression for the proper time we can write this down as
$$p = m\frac{dr}{\sqrt{dt^2 - dr^2}} = m\frac{dr/dt}{\sqrt{1 - (dr/dt)^2}} = \frac{m\beta}{\sqrt{1 - \beta^2}} \tag{2.66}$$
or simply
$$p = m\sinh\theta. \tag{2.67}$$
Eq. 2.67 is the relativistic expression for momentum which is conserved in particle interactions. Because
its derivation is guided by the Newtonian expression for momentum, it naturally reduces to $m\beta$ for very
small velocities $\beta \ll 1$.
Relativistic Energy
We took one step forward when we started talking about momentum. We take another
step here by considering how to generalize the energy of a particle. In line with
this goal we introduce the notion of the 4-vector.
A 4-vector is labelled by four numbers just as the vectors that we know in three dimensions are
labelled by three numbers. We label the 4-vector as (time component, 3 spatial components). An example
of a 4-vector is the displacement vector $(dt, dx, dy, dz)$. What we know about the displacement
4-vector is that its components transform from one frame to another by a Lorentz transformation. In
general, 4-vectors are defined such that their components transform under a Lorentz transformation when
we go from one frame to another. To construct 4-vectors we can multiply known 4-vectors by scalar
quantities. Our goal now is to construct a 4-vector whose components correspond to the energy and the
momentum. Why? It is known from the theory of tensors that when three components of a 4-vector are
conserved in all inertial frames then the remaining component is also conserved. Since we defined the momentum to be a
conserved quantity, if we can construct a 4-vector whose three spatial components correspond to momentum then
the fourth component will be conserved. This is one reason to identify the fourth component with the energy.
The proper time is a scalar (invariant). We can divide the displacement vector by the proper time $d\tau$
to construct the so-called unit tangent vector: $(dt/d\tau, dx/d\tau, dy/d\tau, dz/d\tau)$. This result is a 4-vector since
we have essentially multiplied a 4-vector (whose components transform under LT) by a scalar (which does
not change under LT). The components of the resulting quantity then transform under LT. We then multiply
this tangent vector by the mass $m$ of a particle. The resulting 4-vector is
$$p = \left(m\frac{dt}{d\tau},\; m\frac{dx}{d\tau},\; m\frac{dy}{d\tau},\; m\frac{dz}{d\tau}\right). \tag{2.68}$$
Notice that the last three components of this vector $p$ are the momentum that we have derived. This
momentum is conserved in all inertial frames of reference because this is how we have constructed it. Because
the momentum is conserved in all inertial reference frames, the remaining component $m\,dt/d\tau$
is conserved in all inertial reference frames.
Before calling $m\,dt/d\tau$ the energy let us give some arguments. First, we want the energy to be
conserved. We have always been guided by the conservation of energy (in geology, chemistry, biology,
etc.). Second, this component has the same units as the momentum and we know that energy (in natural
units) has the same units as momentum. Third, we show that this contains the Newtonian expression
for the kinetic energy $\frac{1}{2}m\beta^2$ for very small velocities:
$$m\frac{dt}{d\tau} = m\frac{dt}{\sqrt{dt^2 - dx^2}} = \frac{m}{\sqrt{1 - \beta^2}}. \tag{2.69}$$
For $\beta \ll 1$ the expression above reduces to
$$m\frac{dt}{d\tau} = m + \frac{1}{2}m\beta^2 + \text{higher-order terms}. \tag{2.70}$$
The second term is what we know to be the Newtonian expression for the kinetic energy. These arguments
give us enough reason to call the remaining component of $p$ the energy. Thus, we write down the energy
$E$ of a particle of mass $m$ as
$$E = m\frac{dt}{d\tau} = \frac{m}{\sqrt{1 - \beta^2}} \tag{2.71}$$
or equivalently
$$E = m\cosh\theta. \tag{2.72}$$
Nature turns out to combine the momentum and energy into a single energy-momentum 4-vector.
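The two results, Eqs. 2.67 and 2.72, can be evaluated together. The following is a minimal Python sketch in natural units; the function name `energy_momentum` is hypothetical and chosen only for illustration. The sample values are those of an object of mass 300 MeV moving at $\beta = 0.8$.

```python
import math

def energy_momentum(mass, beta):
    """Return (E, p) in natural units using E = m cosh(theta), p = m sinh(theta)."""
    theta = math.atanh(beta)   # rapidity, from beta = tanh(theta)
    return mass * math.cosh(theta), mass * math.sinh(theta)

E, p = energy_momentum(300.0, 0.8)   # mass in MeV
print(E, p)                          # close to 500 MeV and 400 MeV
print(p / E)                         # the ratio p/E recovers beta
```

Working with the rapidity keeps the code close to the notation of the notes; the same numbers follow from $E = m/\sqrt{1-\beta^2}$ and $p = m\beta/\sqrt{1-\beta^2}$.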
Kinetic Energy
By expanding the expression for the energy, Eq. 2.72, in terms of the rapidity or, equivalently, in terms
of the speed we find that
$$\begin{aligned} E &= m\{1 + \text{terms that depend on } \theta \text{ or } \beta\} \\ &= m + \{\text{motion-dependent terms}\}. \end{aligned} \tag{2.73}$$
The relativistic expression for the energy thus suggests that the total energy of a particle
can be divided into a term that does not depend on the state of motion of the particle and a term that
depends on the state of motion. The motion-independent term is $m$, and we call it the rest energy (in natural
units it equals the rest mass). In conventional units this is $mc^2$. The first of the motion-dependent terms is $\frac{1}{2}m\beta^2$, which is the
expression for the Newtonian kinetic energy. We borrow the Newtonian idea of kinetic energy as that
part of the energy which depends on the state of motion of the particle. Thus the sum of all the motion-dependent
terms in the last expression is what we call the kinetic energy. The energy of a particle can then be
written as
$$E = m + T \tag{2.74}$$
where T is the kinetic energy.
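To see how quickly the relativistic kinetic energy departs from the Newtonian value, one can compare $T = E - m$ with $\frac{1}{2}m\beta^2$ numerically. This is an illustrative Python sketch in natural units; the function names are hypothetical.

```python
import math

def kinetic_energy(mass, beta):
    """Relativistic kinetic energy T = E - m (Eq. 2.74), natural units."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return mass * (gamma - 1.0)

def newtonian_kinetic_energy(mass, beta):
    """Leading motion-dependent term, (1/2) m beta^2."""
    return 0.5 * mass * beta * beta

for beta in (0.01, 0.1, 0.5, 0.9):
    T = kinetic_energy(1.0, beta)
    T_newton = newtonian_kinetic_energy(1.0, beta)
    print(f"beta={beta}: T={T:.6f}, Newtonian={T_newton:.6f}, ratio={T / T_newton:.4f}")
```

The ratio stays close to 1 for small $\beta$ and grows without bound as $\beta \to 1$, which is the content of the "higher-order terms" in Eq. 2.70.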
Exercises
1. According to a lab frame, Albert Einstein throws a particle which then moves with velocity 0.900.
What is the particle energy and momentum according to the lab if the particle has a mass of 10
kg? Ans. 22.9 kg and 20.6 kg
2. An object of mass 300 MeV moves 8 m along the x-direction in 10 m of time as measured in the
lab. What is its energy and momentum? Ans. E = 500 MeV and p = 400 MeV
Mass, Momentum, and Energy

Now that we have our relativistic expressions for the energy and momentum of a particle of mass $m$ we
derive a very powerful relationship that exists between the mass, momentum, and energy. But first let
us notice that our expression for the energy suggests that a particle at rest has energy
$$E = m. \tag{2.75}$$
In conventional units this is the famous $E = mc^2$, which suggests that the existence of a particle itself
contributes to the total energy.
Now, take the squares of the energy and the momentum and subtract the resulting expressions:
$$E^2 - p^2 = m^2\cosh^2\theta - m^2\sinh^2\theta = m^2\left(\cosh^2\theta - \sinh^2\theta\right). \tag{2.76}$$
But we know that for any value of $\theta$ we have $\cosh^2\theta - \sinh^2\theta = 1$. Therefore, we arrive at the following
relationship between the energy, the momentum, and the mass:
$$E^2 - p^2 = m^2. \tag{2.77}$$
Notice that this is very similar to the expression for the invariant interval ($t^2 - x^2 = \tau^2$). In fact,
we can show that the mass (just as the interval is) is invariant under Lorentz transformations.
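Eq. 2.77 can be checked numerically: whatever the rapidity, the combination $E^2 - p^2$ returns the same mass. Below is a short Python sketch in natural units; `invariant_mass` is a hypothetical helper name.

```python
import math

def invariant_mass(energy, momentum):
    """Mass from Eq. 2.77, m^2 = E^2 - p^2 (natural units)."""
    return math.sqrt(energy**2 - momentum**2)

print(invariant_mass(500.0, 400.0))   # 300.0 (MeV)

# The same particle viewed at several rapidities always yields the same mass.
m = 300.0
for theta in (0.0, 0.5, 1.0, 2.0):
    E, p = m * math.cosh(theta), m * math.sinh(theta)
    print(theta, invariant_mass(E, p))
```

For a photon, $E = p$ and the function returns zero, consistent with a massless particle.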
Exercise
A particle with mass 1 kg moves in the positive x-direction in the lab with kinetic energy equal to three
times its rest energy. What are the particle's energy, velocity and momentum?
2.8 Lorentz Transformation of Energy and Momentum

We have already mentioned that the energy and the momentum of a particle are actually components of
its energy-momentum 4-vector. Since 4-vectors are defined to have transformation properties consistent
with the Lorentz transformation, it should not come as a surprise that the energy and the momentum
have different values in different coordinate frames. Before we go on to describe this quantitatively let
us first talk about why the energy and momentum would have different values when viewed in different
inertial frames of reference. Think of a particle that is at rest in front of you. When you take a
measurement of the energy and momentum of that particle you will get $E = m$ and $p = 0$. The particle
is at rest and therefore its energy comes only from its mass and its momentum is zero. Now, suppose
that (insert your best friend's name) zooms past you to your right at a constant speed $\beta$. According to your
friend this particle is moving with a speed $\beta$ to the left. If your friend takes a measurement of the energy and the
momentum of the particle in front of you, the result is (using Eqs. 2.72 and 2.67) an energy $E = m\cosh\theta$
and a momentum of magnitude $p = m\sinh\theta$ directed to the left. This simple example illustrates that the values of the energy
and the momentum can vary in different frames. Is there anything that does not vary? When you make
a measurement of the mass of the particle you calculate $\sqrt{E^2 - p^2}$ and get $m$. When your friend makes a
measurement of the mass he/she also calculates $\sqrt{E^2 - p^2}$ and gets $m$. Thus, even when $E$ and $p$ vary, it
turns out that the mass of the particle remains constant (the mass is an invariant).
So what is the Lorentz transformation for the energy and the momentum? By our construction we
know that $p = (E, p_x, p_y, p_z)$ is given by Eq. 2.68. Of all the variables appearing in this equation the
displacement 4-vector $dx = (dt, dx, dy, dz)$ is what transforms under a Lorentz transformation. The
mass of the particle is invariant and we know that the proper time is also invariant. Therefore, the
energy and the momentum transform in the same way as the displacement 4-vector. For relative
motion along $x$, with primed quantities measured in the rocket frame and unprimed quantities measured in the lab,
$$E = E'\cosh\theta + p'_x\sinh\theta \tag{2.78}$$
$$p_x = E'\sinh\theta + p'_x\cosh\theta \tag{2.79}$$
$$p_y = p'_y \tag{2.80}$$
$$p_z = p'_z. \tag{2.81}$$
Using these we can confirm the results of the example that we have considered earlier.
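A minimal Python sketch of Eqs. 2.78 and 2.79 makes this confirmation concrete. It assumes the primed values belong to the rocket frame, which moves with velocity $+\beta$ along $x$ relative to the lab; the function name `to_lab` is hypothetical.

```python
import math

def to_lab(E_rocket, px_rocket, beta):
    """Eqs. 2.78-2.79: rocket-frame (E', p'_x) to lab-frame (E, p_x), natural units."""
    theta = math.atanh(beta)   # rapidity of the rocket relative to the lab
    E = E_rocket * math.cosh(theta) + px_rocket * math.sinh(theta)
    px = E_rocket * math.sinh(theta) + px_rocket * math.cosh(theta)
    return E, px

m = 1.0
E, px = to_lab(m, 0.0, 0.6)       # particle at rest in the rocket frame
print(E, px)                      # close to 1.25 and 0.75
print(E**2 - px**2)               # close to m**2, the invariant

# Seen from a frame relative to which the particle's rest frame moves at -0.6,
# the particle moves to the left: same energy, negative momentum.
print(to_lab(m, 0.0, -0.6))
```

The second call mirrors the example of the friend zooming past: the energy is $m\cosh\theta$ and the momentum has magnitude $m\sinh\theta$ but points to the left.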
Exercises
1. According to the lab frame, a 1-kg particle moves with a constant velocity of $0.5\,\hat{\jmath}$. What are
the energy and the momentum along the y-axis of the particle as seen by a rocket frame moving at a
velocity $-0.25\,\hat{\jmath}$ with respect to the lab frame? Ans. 1.34 kg and 0.89 kg.
2. A particle is at rest relative to a rocket frame and has mass 8.25 eV. Relative to a stationary
observer in a lab frame, this rocket frame is moving with speed 0.255 along the +x-axis. What are
the particle's energy and momentum relative to the lab frame? Ans. E = 8.47 eV and p = 1.91
eV.
3. A particle is observed by the lab frame to have some energy $E$ and momentum $p$. A rocket frame
measures the same particle to have zero momentum. What is the velocity of the rocket frame
with respect to the lab?
4. Particles A and B (both have mass 2 MeV) are moving toward each other with speeds 0.600 and
0.450 respectively. What are the total energy and momentum of the system wrt the lab? What are
the total energy and momentum of the system according to a rocket moving with velocity 0.750
wrt the lab? Ans. (4.74 MeV, 0.492 MeV) and (6.60 MeV, 4.63 MeV).
5. Doppler Shift. Suppose a photon is emitted in a lab frame at an angle $\phi$ with respect to its x-axis
and with frequency $f$. Show that the frequency $f'$ and angle $\phi'$ that are measured by a rocket frame that
is moving with speed $\beta = \tanh\theta$ along the x-axis are given by
$$f' = f\cosh\theta\,(1 - \tanh\theta\cos\phi) \tag{2.82}$$
and
$$\cos\phi' = \frac{\cos\phi - \tanh\theta}{1 - \tanh\theta\cos\phi}. \tag{2.83}$$
Exercise
A lab frame emits a photon with frequency 3 MHz at an angle of $30^\circ$ with respect to its x-axis. What
are the measured frequency and angle of this photon in a rocket frame moving at a velocity of 0.50 along
the x-axis with respect to the lab frame? Ans. $f' = 1.96$ MHz, $\phi' = 49.8^\circ$.
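The quoted answer can be reproduced with a few lines of Python that evaluate Eqs. 2.82 and 2.83 directly. This is only a sketch; the function name `photon_in_rocket` is hypothetical, and $\cosh\theta$ is computed as $1/\sqrt{1-\beta^2}$ with $\tanh\theta = \beta$.

```python
import math

def photon_in_rocket(f_lab, angle_lab_deg, beta):
    """Eqs. 2.82-2.83: rocket-frame frequency and angle (degrees) of a lab-frame photon."""
    cos_angle = math.cos(math.radians(angle_lab_deg))
    cosh_theta = 1.0 / math.sqrt(1.0 - beta * beta)
    f_rocket = f_lab * cosh_theta * (1.0 - beta * cos_angle)
    cos_angle_rocket = (cos_angle - beta) / (1.0 - beta * cos_angle)
    return f_rocket, math.degrees(math.acos(cos_angle_rocket))

f, angle = photon_in_rocket(3.0, 30.0, 0.50)   # frequency in MHz
print(round(f, 2), round(angle, 1))            # 1.96 49.8
```

Setting the angle to zero reduces the frequency formula to Eq. 2.62 for a receding source, which is a useful consistency check.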
This completes our discussion of the special theory of relativity in this course. In the
beginning we talked about events, how these are measured, and how different observers see these
events. We talked about Lorentz transformations and derived them from the invariance of the
interval. Finally, we discussed the relativistic generalizations of the concepts of energy and momentum.
The principle of relativity is what allows us to arrive at all these.
We have achieved enough for an introductory course. To end, we give an example and leave problems
that can be analyzed now that we have learned about the concepts of energy, momentum, and mass
and how these three are intertwined with each other.
Threshold Energy for $\gamma + \gamma \to e^- + e^+$

We consider the collision of two photons. The photons annihilate and produce an electron-positron
pair. What we want to calculate is the minimum energy that is needed for this process to occur. This
minimum energy is the threshold energy. Why is there a minimum energy? We have learned that the
total energy is a sum of the rest energy and the kinetic energy. Thus to produce even a particle at rest
would require an incident energy equal to the rest energy of the particle.
Let each of the photons have an energy $E_\gamma$. The threshold energy can be analyzed in the frame where
the two photons have equal and opposite momenta. In this frame, the total energy of the system before the
collision is
$$E_{\text{total}} = 2E_\gamma \tag{2.84}$$
while the total momentum is
$$p_{\text{total}} = 0. \tag{2.85}$$
Because the total momentum is zero and the momentum is conserved, the electron and the positron must
emerge in opposite directions with equal magnitudes of momentum. Let $m_e$ be the mass of the electron
and of the positron. If the electron and the positron emerge with nonzero momenta $p_e$ then each would
have an energy $E_e = \sqrt{p_e^2 + m_e^2}$. It can be seen that the smallest value of this is equal to the rest energy
$m_e$ of an electron. The minimum energy needed to produce the electron-positron pair then corresponds
to the case where the electron and the positron are at rest when created. The total energy of the system after the
collision is
$$E_{\text{total}} = 2m_e \tag{2.86}$$
while the total momentum is
$$p_{\text{total}} = 0. \tag{2.87}$$
Because the total energy and momentum do not change in a collision, it follows that the
minimum energy per photon needed to create an electron-positron pair in a two-photon collision is
$$E_{\text{threshold}} = m_e. \tag{2.88}$$
Thus each photon must carry at least the rest energy of an electron, so that together the two photons
supply the rest energy of the electron-positron pair. As an order of magnitude estimate note that $m_e \approx 0.5$ MeV. A photon with this
energy has a wavelength given by $E = h\nu = hc/\lambda$. This leads to $\lambda \sim 10^{-12}$ m $= 10^{-3}$ nm. To what part of the
electromagnetic spectrum does this wavelength correspond? It seems that we cannot use visible light
photons to produce electron-positron pairs. We need more energy. This is true for all particle production
processes. By going to higher and higher beam energies the production of new particles can be observed.
This is why particle accelerators (LHC, SLAC, Fermilab) are upgraded to higher and higher energies.
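The order of magnitude estimate above can be checked with a short Python calculation. The constants are rounded values supplied here for illustration ($hc \approx 1239.84$ eV nm and an electron rest energy of 0.511 MeV); the function name is hypothetical.

```python
H_C_EV_NM = 1239.84                  # h*c in eV nm (rounded)
ELECTRON_REST_ENERGY_EV = 0.511e6    # electron rest energy in eV (rounded)

def photon_wavelength_nm(energy_ev):
    """Wavelength of a photon of the given energy, from E = h c / lambda."""
    return H_C_EV_NM / energy_ev

threshold_wavelength_nm = photon_wavelength_nm(ELECTRON_REST_ENERGY_EV)
print(f"{threshold_wavelength_nm:.2e} nm")   # about 2.4e-03 nm

# For comparison, visible light (roughly 400 nm to 700 nm) carries only a few eV per photon.
print(H_C_EV_NM / 400.0, H_C_EV_NM / 700.0)
```

The threshold wavelength is about five orders of magnitude shorter than visible wavelengths, which is why visible photons cannot produce electron-positron pairs.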
Although the analysis of particle production processes is more complicated (involving the laws of
probability and quantum mechanics and its own set of conservation laws) we have made a step forward.
At the very least, we can collide particles and know the minimum energy that we need in order to produce a certain
set of particles after the collision.
Exercises: Application of the Conservation Laws

1. A positive pion at rest decays into a positive muon and a neutrino. What is the sum of the kinetic
energies of the muon and the neutrino? $m_\pi = 139.6$ MeV, $m_\mu = 105.7$ MeV, $m_\nu = 0$. Ans. 33.9
MeV
2. Particle A (mass 2 kg) is moving with kinetic energy 3 kg. Particle A is in a collision course
with Particle B (mass 3 kg) with kinetic energy 5 kg. After collision, they formed a new particle,
Particle C. What is the mass of Particle C? Ans. 12.7 kg
3. A stationary particle with mass 100 MeV decays into two identical particles, A and B, each with
mass 20 MeV, travelling in opposite directions. What should be the magnitude of the momentum
of A and B? 45.83 MeV
4. An electron at rest collides with a photon with energy equal to twice the mass of the electron. The
electron is kicked forward with some speed while the photon is scattered backwards, i.e., in the
opposite direction of its initial velocity. What is the speed of the electron and the energy of the
scattered photon?
5. An electron (mass m) at rest collides with a photon. After the collision, the photon ceased to exist,
and two newly created particles (electron and positron) have gone off in company with the original
electron at some speed. What is minimum energy E required for this creation process? What is
the speed at which the three particles move after the collision?
6. A proton with mass $m$ collides with another proton initially at rest. After the collision, one proton
and one anti-proton are created and move together with the original protons. What must be the
minimum kinetic energy (threshold energy) of the proton for this creation to occur? At what speed
do the four particles move after the collision?
7. A $\pi$-meson decays to two photons with energies $E_1$ and $E_2$. What is the mass of the $\pi$-meson if
the trajectories of the two photons have an angular separation $\phi$? Ans. $2\sqrt{E_1 E_2}\,\sin(\phi/2)$.
8. A moving radioactive nucleus of known mass M emits a gamma ray (photon) in the forward
direction and drops to its stable nonradioactive state of known mass m. Find the kinetic energy
of the incoming nucleus (BEFORE) such that the resulting mass m nucleus (AFTER) is at rest.
9. A proton (rest energy 938 MeV) with kinetic energy T collides with a proton at rest. Both protons
survive the collision, and a neutral pion (rest energy 135 MeV) is produced. What is the threshold
energy for this process?
10. Consider the head-on collision of two protons. Calculate the minimum energy that each of the
protons must have to produce a Higgs boson particle. Enjoy. Compare your result with the
center-of-mass energy of the LHC at the present (15 TeV).
3 Quantum Physics
We begin our discussion of quantum phenomena (which is the last part of the course). As always, the
best that can be promised is to give a glimpse of the subject because quantum mechanics has developed
into a vast and complex field that even up to this point contains some unresolved issues. Nonetheless, the
subject of quantum mechanics provides us with a sufficient description of atoms and molecules, allows
us to build the laser and the transistor (which are at the core of modern electrical devices), and, in the
future, promises to give us even better technology (quantum computers).
3.1 Photoelectric Effect

Reading assignment: Historical background on the stability of the hydrogen atom, blackbody radiation, and the
photoelectric effect (Hertz's discovery and Einstein's explanation).
Figure 35: Monochromatic light is incident on a cathode (made up of some metal). The galvanometer reveals
that a current (we call this the photocurrent) flows when the frequency of the incident light is greater than
some cutoff frequency.
The main conclusion from the photoelectric effect experiment (Hallwachs and Lenard, 1886-1900) is as
follows:
No electrons were emitted unless the frequency of light is greater than some minimum value.
So what? Can this not be explained by the classical wave viewpoint of the electromagnetic wave? Let us
give this wave viewpoint an attempt. The energy carried by an electromagnetic wave is proportional to
the magnitude of the Poynting vector, $|\vec{S}| \propto E^2$, where $E$ is the magnitude of the electric field. If this were
the whole story, then we would expect the magnitude of the current to be independent of the frequency of the light
source. Thus, if we increase the intensity of the light (whatever its frequency is) we are going to increase
the number of photoelectrons emitted by the cathode. This leads to an increase in the current
detected by the galvanometer. This explanation, however, is readily debunked by the experimental results:
no current flows below the minimum frequency, however intense the light, and there is a cutoff voltage that
depends on the frequency of the light source.
We call the minimum frequency of light needed to produce a photocurrent the threshold frequency,
while the ejected electrons are called photoelectrons. The explanation of the photoelectric effect was
offered by Einstein by borrowing the Planck postulate that light is made up of quanta (photons) each of
which carries an energy
$$E = hf \tag{3.1}$$
where $f$ is the frequency of the light. At the time Planck made the postulate the value of the constant $h$
remained undetermined. It was the photoelectric effect experiment that provided the experimental value
of the constant $h$:
$$h = 6.626 \times 10^{-34}\ \text{J s}. \tag{3.2}$$
Figure 36: For different intensities of the monochromatic light the value of the cutoff voltage is the same (left).
For the same intensity but different wavelengths of monochromatic light the cutoff voltage is different (right).
To determine the maximum kinetic energy of the emitted electrons we make the potential of the anode
relative to the cathode negative enough so that the current stops. In this case we get the value of the
cutoff voltage experimentally as
$$V_{AC} = -V_0. \tag{3.3}$$
Through the work-energy theorem we obtain the following relationship between the maximum kinetic
energy of the emitted electrons and the stopping voltage $V_0$:
$$KE_{\max} = \frac{1}{2}mv_{\max}^2 = eV_0. \tag{3.4}$$
Using Planck's hypothesis, the maximum kinetic energy of the emitted electrons can be written as the
difference between the energy of the photon ($hf$) that strikes the metal surface and the work function $\phi$
characterizing the strength of the binding of the electrons to the cathode. Thus,
$$eV_0 = hf - \phi. \tag{3.5}$$
The story behind Eq. 3.5 is simple (see Figure 37). The electrons are bound to the metal with a binding
energy that we call the work function.
Figure 37: Photoelectric effect: An electron that is bound to the metal is targeted by a photon. The electron can
only get ejected if the energy of the photon is greater than the binding energy of the electron to the metal.
The cutoff frequency for a particular cathode material can be
determined by setting the maximum kinetic energy of the emitted electrons to zero. This leads to
$$f_{\text{cutoff}} = \frac{c}{\lambda_{\text{cutoff}}} = \frac{\phi}{h} \tag{3.6}$$
where $\lambda_{\text{cutoff}}$ is the corresponding cutoff wavelength.
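Eqs. 3.5 and 3.6 can be combined into a small calculator. The Python sketch below works in electron volts, where the stopping potential in volts is numerically equal to $KE_{\max}$ in eV. The function names and the 2.3 eV work function are hypothetical, chosen only for illustration, and $hc \approx 1239.84$ eV nm is a rounded constant.

```python
H_C_EV_NM = 1239.84   # h*c in eV nm (rounded)

def stopping_potential_v(wavelength_nm, work_function_ev):
    """V0 in volts from Eq. 3.5; 0.0 when the photon energy is below the work function."""
    photon_energy_ev = H_C_EV_NM / wavelength_nm
    return max(photon_energy_ev - work_function_ev, 0.0)

def cutoff_wavelength_nm(work_function_ev):
    """Eq. 3.6 rewritten as lambda_cutoff = h c / phi."""
    return H_C_EV_NM / work_function_ev

work_function_ev = 2.3   # hypothetical cathode material
print(cutoff_wavelength_nm(work_function_ev))          # about 539 nm
print(stopping_potential_v(400.0, work_function_ev))   # about 0.80 V
print(stopping_potential_v(600.0, work_function_ev))   # 0.0, below threshold
```

Light with a wavelength longer than the cutoff wavelength gives no photocurrent at any intensity, which is the behaviour the wave picture could not explain.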

Exercises
1. While conducting a photoelectric effect experiment with light of a certain frequency, you find that a
reverse potential of 1.25 V is required to reduce the photocurrent to zero. Find (a) the maximum
kinetic energy and (b) the maximum speed of the emitted electrons. Ans. $KE_{\max} = 1.25$ eV and $v_{\max} =
6.63 \times 10^5$ m/s
2. For a certain cathode material in a photoelectric effect experiment you require a stopping potential
of 1.0 V for light of wavelength 600 nm, 2.0 V for 400 nm, and 3.0 V for 300 nm. Determine the
work function for this material and the value of Planck's constant. Ans. 1.0 eV and $6.4 \times 10^{-34}$ J s
3.2 Bremsstrahlung and Compton Scattering

Bremsstrahlung is the production of x-rays through the very violent acceleration of thermally excited
electrons that hit a metal plate. One can think of this process as the reverse of the photoelectric
effect, in which photons strike the metal plate and electrons get ejected. In this case the electrons strike
a metal plate and photons are ejected.
Figure 38: Electrons are ejected from the cathode through thermal excitation (heating) and strike the anode,
producing x-rays.
Let the potential difference between the anode and the cathode
be $V_0$. When the temperature is very high, the thermal excitation energy $kT$ (coming from random
molecular motion) can provide sufficient energy for the electrons to get ejected and to travel, accelerating,
through the region between the cathode and the anode. When the ejected electron hits the anode it
suffers a very violent acceleration and emits a photon. In principle we must also take into account the
binding energy of the electrons to the cathode and the anode. But in Bremsstrahlung this binding energy
is very small compared to the thermal excitation energy. Since this thermal excitation energy provides
the energy of the released electrons we have $kT \sim KE$. The electrons which are accelerated between the
cathode and the anode gain a kinetic energy $eV_0$ (using the work-energy theorem). When the ejected
electrons hit the anode then
$$eV_0 = hf - \phi \approx hf = hc/\lambda. \tag{3.7}$$
The frequency and wavelength of the emitted photons are (to a good approximation) given by
$$eV_0/h = f = c/\lambda. \tag{3.8}$$
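As a quick numerical illustration of Eq. 3.8, the Python sketch below converts an accelerating voltage into the wavelength of a photon that carries away all of the electron's kinetic energy $eV_0$. The function name is hypothetical and $hc \approx 1239.84$ eV nm is a rounded constant.

```python
H_C_EV_NM = 1239.84   # h*c in eV nm (rounded)

def xray_wavelength_nm(accelerating_voltage_v):
    """Eq. 3.8 solved for the wavelength: lambda = h c / (e V0)."""
    electron_energy_ev = accelerating_voltage_v   # e*V0 in eV equals V0 in volts
    return H_C_EV_NM / electron_energy_ev

print(xray_wavelength_nm(4500.0))    # about 0.276 nm
print(xray_wavelength_nm(30000.0))   # about 0.0413 nm
```

Because a photon cannot carry more energy than $eV_0$, this wavelength is the shortest one that can appear in the emitted spectrum for a given voltage.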
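Compton scattering refers to the detection of a wavelength shift of an electromagnetic wave that is
scattered by target electrons. According to the classical viewpoint in which light is a wave, the
wavelength of the scattered light should be the same as that of the incident light. Compton scattering
can instead be explained by viewing the effect as a collision between
a photon and an electron.
Figure 39: Compton scattering: collision between a photon and an electron (at rest).
The conservation laws for energy and momentum lead to the following
equations (in natural units):
$$E + m = E' + E_e \tag{3.9}$$
$$E = E'\cos\theta + p_e\cos\phi \tag{3.10}$$
$$0 = E'\sin\theta - p_e\sin\phi. \tag{3.11}$$
Here $E$ and $E'$ are the energies of the incident and scattered photon, $m$ is the mass of the electron, $E_e$ and $p_e$
are the energy and momentum of the recoiling electron, and $\theta$ and $\phi$ are the angles that the scattered photon
and the recoiling electron make with the direction of the incident photon.
A long but workable calculation using these equations leads to the Compton scattering equation
$$\lambda' - \lambda = \frac{h}{m}\left(1 - \cos\theta\right). \tag{3.12}$$
In SI units this leads to the wavelength shift:
$$\Delta\lambda = \lambda' - \lambda = \frac{h}{mc}\left(1 - \cos\theta\right). \tag{3.13}$$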
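The size of the Compton shift in Eq. 3.13 is set by $h/(mc)$, which for an electron is about $2.426 \times 10^{-12}$ m (a rounded value supplied here). The Python sketch below evaluates the shift for a given photon scattering angle; the function name is hypothetical.

```python
import math

COMPTON_WAVELENGTH_M = 2.426e-12   # h / (m_e c) for the electron, rounded

def compton_shift_m(scattering_angle_deg):
    """Wavelength shift from Eq. 3.13 for a photon scattered through the given angle."""
    return COMPTON_WAVELENGTH_M * (1.0 - math.cos(math.radians(scattering_angle_deg)))

print(compton_shift_m(0.0))     # 0.0, no shift for forward scattering
print(compton_shift_m(180.0))   # largest shift, twice h/(m_e c)

# Two successive scatterings through 30 and 20 degrees: the shifts simply add.
total_shift = compton_shift_m(30.0) + compton_shift_m(20.0)
print(f"{total_shift:.2e} m")   # 4.71e-13 m
```

The shift does not depend on the incident wavelength, so it is a sizeable fraction of the wavelength only for x-rays and gamma rays.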
Exercises
1. Derive the Compton scattering equation Eq. 3.13.
2. A photon with wavelength of 400 nm hits a stationary electron and is scattered at an angle of $30^\circ$.
The scattered photon hits another stationary electron and is scattered by an angle of $20^\circ$. What is
the total shift in the wavelength of the photon? Ans. $4.7 \times 10^{-13}$ m
3. An electron with kinetic energy 4.5 keV hits a metal surface. What is the minimum wavelength of
the photon emitted by the metal? Ans. $2.76 \times 10^{-10}$ m
4. An electron was accelerated from rest using a potential difference of 30 kV and hits a metal plate.
If the x-rays emitted have wavelength $5.0 \times 10^{-11}$ m, how much energy was absorbed by the metal
plate? Ans. 5.2 keV
5. A photon with wavelength $4.75 \times 10^{-11}$ m is scattered by an electron at rest. If the scattering angle
of the photon is $60^\circ$, what is the kinetic energy of the electron after the collision?
3.3 The Bohr Model

We now discuss the Bohr model, which is the first widely accepted model of the atom. Even though it
is not the most accurate model that we have for the atom, its simplicity makes it too useful to
throw away. Also, despite its lack of accuracy in explaining some quantum aspects, it surely contains
hints of truth given that it can reproduce with sufficient accuracy the spectrum of the hydrogen atom.
De Broglie hypothesis
Before we go on to the details of the Bohr model we have to introduce the hypothesis that de Broglie
made concerning the wave nature of a particle. In the previous lectures we have been talking about
light as if it were made up of particles, which we call photons. We now treat light as a wave or as a
particle depending on the situation. As a wave, light allows us to analyze the two-slit interference experiment.
As a particle, light allows us to analyze the photoelectric effect, Bremsstrahlung, and Compton scattering.
What may come as a surprise at this point is that particles (such as electrons, protons, and
neutrons) also act as waves, i.e. they exhibit interference patterns when sent through the double-slit
experiment. See the two-slit interference experiment with electrons and the Davisson-Germer experiment on
electron diffraction (reading assignment). Given this, what wavelength are we to associate with
electrons and other elementary particles? It was de Broglie who took the big step and wrote down
the wavelength for matter waves as
$$\lambda = \frac{h}{p} \tag{3.14}$$
where $h$ is Planck's constant and $p$ is the momentum, say, of the electron. The inspiration for writing
down the wavelength for matter in this form comes from the expression for the energy and momentum
of a photon, $pc = E = hc/\lambda$. Since photons behave as a wave and as a particle, maybe electrons also
behave as a wave and as a particle, and we can associate the same relationship between the wavelength
and the momentum. Given that Eq. 3.14 was inspired by a relativistic expression, the momentum $p$
appearing in Eq. 3.14 is also relativistic. For example, an electron with mass $m$ and rapidity $\eta$ has the momentum
$p = m \sinh\eta$ (in units with $c = 1$), which is the same as $p = \gamma m v$; the energy is $E = m\cosh\eta$.
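To make the role of the relativistic momentum in Eq. 3.14 concrete, here is a minimal sketch that computes the de Broglie wavelength using $p = \gamma m v$ in SI units. The function name and the sample speeds are illustrative assumptions, not values taken from the notes.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
M_E = 9.1093837015e-31   # electron mass, kg


def de_broglie_wavelength(mass, speed):
    """de Broglie wavelength h/p with the relativistic momentum p = gamma*m*v."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return H / (gamma * mass * speed)


# Slow electron: the relativistic and non-relativistic results nearly coincide
slow = de_broglie_wavelength(M_E, 1.0e5)
slow_nonrel = H / (M_E * 1.0e5)

# Fast electron (v = 0.6c, gamma = 1.25): the wavelength is shorter than h/(mv)
fast = de_broglie_wavelength(M_E, 0.6 * C)

print(f"v = 1e5 m/s : {slow:.4e} m (non-relativistic estimate {slow_nonrel:.4e} m)")
print(f"v = 0.6 c   : {fast:.4e} m")
```

For speeds much smaller than $c$ the factor $\gamma$ is essentially one, which is why the non-relativistic expression $p = mv$ is used later for the electron in the Bohr model.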
Exercises
1. What is the de Broglie wavelength of a proton that is moving with the speed of sound, 340 m/s?
2. A $1.0 \times 10^{-30}$ kg relativistic particle moves at a speed half the speed of light. What is its de Broglie
wavelength?
3. A relativistic particle with mass $1.0 \times 10^{-30}$ kg has total energy $1.5 \times 10^{-13}$ J. What is the de
Broglie wavelength of the particle?
The Bohr Model of the Hydrogen Atom

Here are the experimental facts about the hydrogen atom:
• It is more stable than predicted by Maxwell's electromagnetism.
• Its spectrum consists of only specific wavelengths.
• There is a positively charged core (proton). See the Rutherford scattering experiment.
It is known in classical electromagnetism that an accelerating charged particle radiates its energy away
in the form of electromagnetic waves. In a classical orbital picture of the atom, the electron orbits around the nucleus (proton)
and therefore radiates electromagnetic waves. Through this mechanism the electron will collapse into the
proton. This does not pose any problem as long as the lifetime (the time before the electron collapses
into the nucleus) is large. For instance, the earth, which orbits around the sun, radiates gravitational waves.
But according to the general theory of relativity it will take far more than a million years before the earth
collapses into the sun. There is no issue. However, for the hydrogen atom the predicted
lifetime is of the order of $10^{-11}$ s. This means that we would all vanish before any of us could do anything.
Obviously there is a mechanism other than that given by the classical theory of electromagnetism that
dominates the hydrogen atom. The classical theory of electromagnetism also predicts that the
hydrogen atom spectrum would be continuous, i.e. contain all colors. What is observed consists of only
a few colors in the visible spectrum (red, blue-green, blue-violet, and violet lines).
The Davisson-Germer experiment provides us with hints that electrons also behave as waves. Guided by
this intuition, we might think that the electron propagates around the nucleus as a wave. This leads
us to the Bohr model. We then write down
$$2\pi r = n\lambda \tag{3.15}$$

Figure 40: Bohr model: the electron propagates around the nucleus. The length of the circumference must be
an integer number of wavelengths $\lambda$.
where $n$ is a positive integer. Given this starting point, what wavelength are we to associate with the
electron? Fortunately, we are guided by de Broglie's hypothesis that the wavelength of the electron is
given by Eq. 3.14. We make the further assumption that the electron moves non-relativistically
around the nucleus. Thus, the magnitude of the momentum is given by $p = mv$ where $m$ is the electron
mass and $v$ is the orbital speed. Combining this with Eq. 3.15 leads to
$$mvr = n\,\frac{h}{2\pi}. \tag{3.16}$$
The constant on the right hand side (despite being merely proportional to Planck's constant) is very special in
quantum mechanics and is given its own symbol, $\hbar = h/2\pi$. We thus write the last result as
$$mvr = n\hbar. \tag{3.17}$$
This ends the hypothesis and we now continue along known lines of work.
The total energy of the hydrogen atom is given by
$$E = \frac{1}{2}mv^2 - \frac{1}{4\pi\epsilon_0}\frac{e^2}{r}. \tag{3.18}$$
Since the electron is orbiting around the nucleus it undergoes centripetal acceleration due to the
electromagnetic force:
$$m\frac{v^2}{r} = \frac{1}{4\pi\epsilon_0}\frac{e^2}{r^2}. \tag{3.19}$$
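It is an easy task (good as an exercise) to show that the last three equations lead to the following values
of radius, energy, kinetic energy, and potential energy of the hydrogen atom:
$$r = \frac{4\pi\epsilon_0\hbar^2}{me^2}\,n^2 = a_0 n^2 \tag{3.20}$$
$$E = -\frac{1}{2}mc^2\left(\frac{e^2}{4\pi\epsilon_0\hbar c}\right)^2\frac{1}{n^2} = -\frac{1}{2}mc^2\alpha^2\frac{1}{n^2} \tag{3.21}$$
$$KE = \frac{1}{2}mc^2\alpha^2\frac{1}{n^2} \tag{3.22}$$
$$PE = -mc^2\alpha^2\frac{1}{n^2}. \tag{3.23}$$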
We introduce two very special numbers/constants in quantum mechanics. The first one is $a_0$, which is
called the Bohr radius:
$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{me^2} \approx 5.29 \times 10^{-11}\ \text{m}. \tag{3.24}$$
The expression for the radius given by Eq. 3.20 means that the electron in the hydrogen atom can only
occupy specific values of radius, the smallest possible radius being the Bohr radius. The second
constant that we wish to introduce is the so-called fine structure constant:
$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137}. \tag{3.25}$$

The fine structure constant plays a very special role in quantum electrodynamics. Knowing the rest
energy of the electron $mc^2$, we can estimate the hydrogen atom energy levels to be
$$E = -\frac{13.6\ \text{eV}}{n^2}, \qquad n = 1, 2, 3, \ldots \tag{3.26}$$
The integer $n$ is called the principal quantum number and represents the state of the internal energy of
the hydrogen atom.
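The numbers quoted in Eqs. 3.24 to 3.26 can be checked directly from the fundamental constants. The sketch below is only an illustration: the constant values are CODATA figures that we assume, and the function names are our own.

```python
import math

# SI constants (assumed CODATA values)
HBAR = 1.054571817e-34     # reduced Planck constant, J s
M_E = 9.1093837015e-31     # electron mass, kg
C = 2.99792458e8           # speed of light, m/s
E_CHARGE = 1.602176634e-19 # elementary charge, C
EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m

# Fine structure constant, Eq. 3.25
alpha = E_CHARGE**2 / (4 * math.pi * EPS0 * HBAR * C)

# Bohr radius, Eq. 3.24
a0 = 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)


def bohr_energy_ev(n):
    """Energy of level n from Eq. 3.21, converted from joules to eV."""
    return -0.5 * M_E * C**2 * alpha**2 / n**2 / E_CHARGE


def bohr_orbit_radius(n):
    """Orbit radius of level n from Eq. 3.20, in metres."""
    return a0 * n**2


print(f"1/alpha = {1 / alpha:.3f}")
print(f"a0      = {a0:.4e} m")
for n in (1, 2, 3):
    print(f"n = {n}: E = {bohr_energy_ev(n):8.3f} eV, r = {bohr_orbit_radius(n):.3e} m")
```

The ground state energy comes out close to $-13.6$ eV, in agreement with Eq. 3.26.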
Now we have the main results. Let us now go back to the experimental facts and explain them using
the Bohr model. Given the assumption that the electron behaves as a wave with a wavelength given
by the de Broglie wavelength, we arrive at the conclusion $r = a_0 n^2$ given by Eq. 3.20. These are
the so-called stable orbits. An electron occupying these orbits would not radiate away its energy by
electromagnetic radiation. Besides the radius, the spectrum of hydrogen can now also be explained.
Consider a hydrogen atom that goes from an initial state (quantum number $n_i$) to a final state with
quantum number $n_f$ at a lower energy. The energy difference corresponding to this change is
$$\Delta E = E_f - E_i = -\frac{1}{2}mc^2\alpha^2\frac{1}{n_f^2} + \frac{1}{2}mc^2\alpha^2\frac{1}{n_i^2} = \frac{1}{2}mc^2\alpha^2\left(\frac{1}{n_i^2} - \frac{1}{n_f^2}\right). \tag{3.27}$$
Because the state went to a lower energy, $n_f < n_i$ and this energy change $\Delta E$ is negative. Energy is released
by the system. Where does this energy go? Nature turns out to really favour conservation of energy. The
released energy is carried away by a photon and this is what is detected as a spectral line. The energy
carried away by the photon is then $|\Delta E| = hc/\lambda$. With this quantum picture of electron and photon we
obtain a relationship between the energy levels of the hydrogen atom and the wavelength of the photon:
$$\frac{1}{\lambda} = \frac{1}{2}\,\frac{mc}{h}\,\alpha^2\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right). \tag{3.28}$$
Note that the constant $h/mc$ appeared before in our analysis of Compton scattering. This constant $h/mc$
is known as the Compton wavelength (do not confuse it with the de Broglie wavelength; it is only a
special name). Eq. 3.28 gives the spectrum of the hydrogen atom according to the Bohr model.
What may come as a surprise is that this model reproduces the experimental hydrogen atom
spectrum with remarkable accuracy given that it is a very crude quantum picture. Experimentally, the
spectrum of the hydrogen atom is fitted by the Rydberg equation:
$$\frac{1}{\lambda} = R\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right). \tag{3.29}$$
The experimental value of the Rydberg constant $R$ remarkably coincides with the value
$\frac{1}{2}\frac{mc}{h}\alpha^2 = 1.09737 \times 10^{7}$ /m.
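The few visible colors mentioned earlier (red, blue-green, blue-violet, violet) can be recovered from the Rydberg equation by taking $n_f = 2$ (the Balmer series). The sketch below assumes CODATA values for the constants, in particular a numerical value for $\alpha$ that is not quoted in the notes; the function name is our own.

```python
H = 6.62607015e-34        # Planck constant, J s
M_E = 9.1093837015e-31    # electron mass, kg
C = 2.99792458e8          # speed of light, m/s
ALPHA = 7.2973525693e-3   # fine structure constant (assumed CODATA value, about 1/137)

# Rydberg constant predicted by the Bohr model: R = (1/2)(mc/h) alpha^2
R = 0.5 * (M_E * C / H) * ALPHA**2


def emitted_wavelength_nm(n_i, n_f):
    """Wavelength in nm of the photon emitted in the transition n_i -> n_f (n_f < n_i)."""
    inverse_wavelength = R * (1.0 / n_f**2 - 1.0 / n_i**2)  # in 1/m
    return 1.0e9 / inverse_wavelength


# Balmer series: transitions ending at n_f = 2
balmer = [emitted_wavelength_nm(n_i, 2) for n_i in (3, 4, 5, 6)]

print(f"R = {R:.5e} /m")
for n_i, lam in zip((3, 4, 5, 6), balmer):
    print(f"{n_i} -> 2 : {lam:.1f} nm")
```

The four wavelengths fall at roughly 656, 486, 434, and 410 nm, which are the red, blue-green, blue-violet, and violet lines.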
Beyond the Hydrogen Atom

The simplicity of the Bohr model and its remarkable agreement with the hydrogen atom spectrum make
us think that maybe we can also represent atoms with more than one electron by a similar model.
Unfortunately this is not true. Even the next-simplest atom after hydrogen, the helium atom, presents
us with a very difficult analysis. To highlight this point, think about this. The helium atom contains two
electrons, two protons and two neutrons. We can maybe model the electromagnetic interaction between the
electrons and the protons and between the electrons themselves using Coulomb's law. But there are
mechanisms that seem to complicate everything. Electrons obey the Pauli exclusion principle. This means
that other than the repulsion of the two electrons there is also repulsion coming from the Pauli exclusion
principle that cannot be modelled by a picture similar to the Bohr atom. Moreover, there are protons
and neutrons and these interact via the nuclear force. These, on top of the considerable difficulty
of the mathematical analysis, are among the issues that we will encounter when we analyze helium.
So should we hate nature? Why is nature so brutal that we have to go through a hell of mathematics
only to predict the energy levels of its constituents? No! Nature is so good that it actually tells us
already all that we need to know about its constituents. All we have to do is to look at the spectra of
atoms. It is we who want to calculate the spectra of all the atoms for pride. Well, this would be a noble

cause and a good dream. If there is anything that we need to learn from the Bohr model, it is that
Nature prefers that its building blocks (e.g. atoms, molecules) exist only with its own chosen specific
values of internal energy.
We do not have to calculate the spectra of helium. All we need to do to obtain information about its
energy levels is to look at its spectra.
Exercises
1. Hydrogen-like atoms. Hydrogen-like atoms contain a positively charged core with $Z$ protons and
a single valence electron that orbits this core. Consider a picture that resembles that of the Bohr
model and calculate the energy levels of the system. Show that the quantized energy, radius, etc.
for hydrogen-like atoms can be obtained by simply replacing each $e^2$ by $Ze^2$ in the corresponding
expressions for the Bohr model.
2. The orbital speed of an electron in the hydrogen atom is $7.3 \times 10^{5}$ m/s. What is its energy?
3. Consider an electron in a hydrogen atom with energy $-0.378$ eV. What is the orbital radius of the
electron? What is the orbital speed of the electron?
3.4 Wave-particle Duality

Light exhibits interference patterns and diffraction. Then it is a wave. But it also presents itself as if it
consists of bundles in the photoelectric effect, bremsstrahlung, and Compton scattering. Then it is also
a particle. Today we consider light both as a wave and as a particle. What about elementary particles
such as electrons? We have always thought about electrons as particles and this allows us to analyze
electrical conduction and particle collisions. But electrons also exhibit interference effects as seen in
the two-slit interference experiment for electrons. It was also found that electrons exhibit diffraction, as
observed by Davisson and Germer. Thus, an electron is both a wave and a particle. Now, we think of
matter as both wave and particle. What we see depends on how we look at it.
How are we now to describe this dual nature of matter? We do so by assigning a number to a photon
or an electron. We call this number the wave function $\Psi$, which represents the state of the quantum particle.
This is a fancy word for a number (a function) that satisfies a wave equation. An example of a wave
function is given by the electric and magnetic field vectors in a free electromagnetic field. In this case,
we know that
$$\vec{E}, \vec{B} \sim e^{i(\vec{k}\cdot\vec{x} - \omega t)} \tag{3.30}$$
where $\vec{k}$ gives the direction of wave propagation and the wavelength through $k = 2\pi/\lambda$, and $\omega$ gives the
(angular) frequency of the wave. Because we now picture the electromagnetic wave as consisting of photons, we can consider
this same function to also represent the photon. For a photon we know that $pc = E = hf = hc/\lambda$, so that
$\vec{p} = \hbar\vec{k}$ and $E = \hbar\omega$. In
this case, we can write down the wave function for the photon in terms of its momentum and energy:
$$\Psi(\vec{x}, t) \sim e^{i(\vec{p}\cdot\vec{x} - Et)/\hbar}. \tag{3.31}$$
We take a step from here and consider that, in general, the state of a free particle (be it a photon
or a massive particle such as an electron) is represented by the wave function
$$\Psi(\vec{x}, t) = A\,e^{i(\vec{p}\cdot\vec{x} - Et)/\hbar} \tag{3.32}$$
where $A$ is a constant. What is the meaning of the wave function? It contains information about the
position and momentum of the particle. To give more emphasis to this statement let us go back to the
case of a free photon. A free photon has a definite wavelength and therefore a definite momentum. But
it is spread to an infinite extent. What the free particle wave function tells us is that $|\Psi|^2$ is a constant.
This means that the probability of finding a photon near any point in space is the same. A photon with
definite wavelength cannot be localized. This is the same for any other free particle. A free particle will
have a definite momentum and hence a definite wavelength given by the de Broglie wavelength. The
payment for knowing the momentum of the free particle is knowledge about its position. As we go on in
our journey of the quantum realm we shall find that the payment for knowing the position of a particle is
knowledge about its momentum. This leads to the uncertainty principle, which is at the heart of all quantum
phenomena.
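The statement that $|\Psi|^2$ is constant for the free particle wave function of Eq. 3.32 can be checked numerically. The following minimal sketch evaluates the plane wave in one dimension at a few points; the wavenumber, the amplitude, and the choice of a non-relativistic electron for the energy are hypothetical values chosen only for illustration.

```python
import cmath

HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_E = 9.1093837015e-31    # electron mass, kg


def free_particle_psi(x, t, p, energy, amplitude=1.0):
    """One-dimensional version of Eq. 3.32: A exp[i (p x - E t) / hbar]."""
    return amplitude * cmath.exp(1j * (p * x - energy * t) / HBAR)


# Hypothetical free electron with wavenumber k (non-relativistic energy assumed)
k = 1.0e7                   # 1/m
p = HBAR * k                # momentum, kg m/s
energy = p**2 / (2 * M_E)   # kinetic energy, J
A = 2.0                     # arbitrary amplitude

# Probability density |Psi|^2 sampled at several positions and times
densities = [
    abs(free_particle_psi(x, t, p, energy, A)) ** 2
    for x in (0.0, 1.0e-7, 5.0e-7)
    for t in (0.0, 1.0e-12, 1.0e-9)
]
print(densities)  # every entry equals |A|^2 up to rounding
```

The phase factor has unit modulus everywhere, so the density carries no information about where the particle is.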

Exercises
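1. What is the energy of a free particle with the wave function
$$\Psi = A \exp\left\{ i\left[ (1.35 \times 10^{7}\ \text{m}^{-1})\,x - (4.31 \times 10^{11}\ \text{s}^{-1})\,t \right] \right\}? \tag{3.33}$$
2. A free particle has momentum $9.11 \times 10^{-27}$ kg m/s and energy $2.84 \times 10^{-4}$ eV. What is the wave
function of the free particle?
3. What is the mass of a non-relativistic particle with wave function given by
$$\Psi = A \exp\left\{ i\left[ (1.28 \times 10^{7}\ \text{m}^{-1})\,x - (4.52 \times 10^{10}\ \text{s}^{-1})\,t \right] \right\}? \tag{3.34}$$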
3.5 Heisenberg Uncertainty Principle

At this point we have analyzed a sufficient number of experiments related to the quantum nature that we
can introduce what is at the heart of all quantum phenomena, the Heisenberg uncertainty principle.
In symbols, the Heisenberg uncertainty principle is given by
$$\Delta x\,\Delta p \geq \hbar/2 \tag{3.35}$$
where $\Delta x$ and $\Delta p$ are the uncertainties in the position and momentum of a particle. A derivation of Eq. 3.35 from first
principles can be obtained only when equipped with the necessary mathematics (i.e. operator algebra).
Nonetheless, we can motivate Eq. 3.35 by a simple thought experiment with photons and electrons. Say we
want to measure the position of an electron. With this goal, we aim a photon with momentum $p$ at it. When
the photon scatters off the electron, knowledge about the position of the electron, with uncertainty $\Delta x$, is gained. But because
of the conservation of energy and momentum, the scattered electron also gains some momentum $\Delta p$ coming
from the momentum of the photon. Thus, $\Delta x \sim 1/\Delta p$. The more we increase the momentum
of the photon, the more information we get about the position of the electron (smaller $\Delta x$). But in exchange
we lose knowledge about the momentum of the electron since there is a stronger recoil. In quantum
mechanics, the systems are small in the sense that the act of observation actually alters the state of the
system. Equivalently, the Heisenberg uncertainty principle states that the more knowledge we get about the
momentum of a particle (smaller $\Delta p$), the more we lose information about its position (larger $\Delta x$).
A similar uncertainty principle also holds for the time and energy of a particle. This is called the
time-energy uncertainty principle and in symbols is written down as
$$\Delta t\,\Delta E \geq \hbar/2. \tag{3.36}$$
Consider as an example a particle with an uncertainty in the energy of $\Delta E$. What the time-energy
uncertainty principle says is that the lifetime $\Delta t$ of this particle would be of the order $\Delta t \sim 1/\Delta E$. The
larger the uncertainty in the energy, the smaller the lifetime of the particle. This offers us an explanation
of why massive elementary particles such as the Higgs boson have a very small lifetime compared to the
lifetime of the electron.
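To get a feeling for the sizes involved in Eqs. 3.35 and 3.36, here is a minimal sketch that evaluates the minimum uncertainties. The function names and the sample numbers (an electron localized to within about one atomic diameter, a state that lives for one femtosecond) are illustrative assumptions and not taken from the notes.

```python
HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # J per eV


def min_velocity_uncertainty(dx, mass):
    """Smallest velocity uncertainty allowed by Eq. 3.35, using dp = m dv."""
    return HBAR / (2.0 * mass * dx)


def min_energy_uncertainty_ev(dt):
    """Smallest energy uncertainty (in eV) allowed by Eq. 3.36 for a lifetime dt."""
    return HBAR / (2.0 * dt) / E_CHARGE


def is_allowed(dx, dp):
    """True if the pair of uncertainties is consistent with Eq. 3.35."""
    return dx * dp >= HBAR / 2.0


dv = min_velocity_uncertainty(1.0e-10, M_E)   # electron confined to about 1 angstrom
dE = min_energy_uncertainty_ev(1.0e-15)       # state with a 1 fs lifetime

print(f"minimum dv = {dv:.3e} m/s")
print(f"minimum dE = {dE:.3f} eV")
print(is_allowed(1.0e-10, 1.0e-25), is_allowed(1.0e-9, 1.0e-25))
```

The first pair in the last line violates Eq. 3.35 because its product is below $\hbar/2 \approx 5.27 \times 10^{-35}$ J s, while the second pair does not.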
Exercises
1. Particle Resonance. The lifetime of the baryon resonance particle $\Delta^{++}(1232)$ is $5.63 \times 10^{-24}$ s
before it decays to a proton and a pion. What is the minimum uncertainty in the measured
energy of the particle?
2. Getting dizzy... getting confused... The position uncertainty of a $5.00 \times 10^{-28}$ kg quantum particle is
estimated to be 1.00 mm. What is the minimum uncertainty of its velocity assuming that the
particle moves in 1D motion?
3. Velocity Uncertainty. The uncertainty of the location of an electron along the x-direction was
determined to be 1 cm. What is the minimum uncertainty of the velocity along the x-direction of
the electron?
4. Impossible Estimate. Three experimenters estimated the uncertainties in location and momentum
of a particle in the x, y, and z-directions. Which experimenter/s made an impossible estimate?
(a) $\Delta x = 3.74 \times 10^{-3}$ m and $\Delta p_x = 7.55 \times 10^{-15}$ kg m/s
(b) $\Delta y = 1.85 \times 10^{-9}$ m and $\Delta p_y = 2.86 \times 10^{-26}$ kg m/s
(c) $\Delta z = 4.32 \times 10^{-6}$ m and $\Delta p_z = 6.10 \times 10^{-30}$ kg m/s

3.6 Schrödinger Equation
We have already introduced the concept of the wave function when we talked about the free particle. Now
we talk about this notion, in general, for all quantum particles. Being a wave function implies that
it satisfies some sort of wave equation. In quantum mechanics (at least in its nonrelativistic version)
the wave equation that is obeyed is called the Schrödinger equation:
$$-\frac{\hbar^2}{2m}\nabla^2\Psi(\vec{x}, t) + U(\vec{x})\,\Psi(\vec{x}, t) = i\hbar\frac{\partial}{\partial t}\Psi(\vec{x}, t). \tag{3.37}$$
In this elementary course we will consider in most cases its one-dimensional version, which is given by
$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\Psi(x, t) + U(x)\,\Psi(x, t) = i\hbar\frac{\partial}{\partial t}\Psi(x, t). \tag{3.38}$$
Eq. 3.37 is called the time-dependent Schrödinger equation and governs the behaviour of a particle
of mass $m$ moving around in a potential that is given by $U(\vec{x})$. And as we have stated for the free particle,
what has meaning is the quantity $|\Psi|^2$:
$$|\Psi(\vec{x}, t)|^2 = \text{probability density}. \tag{3.39}$$
Note that this means taking the complex conjugate $\Psi^*$ of the quantity $\Psi$ and multiplying it with $\Psi$. By
the probabilistic interpretation that follows Eq. 3.39, $|\Psi(x, y, z, t)|^2\,dx\,dy\,dz$ is the probability of
finding the particle within a volume $dx\,dy\,dz$ centered around the point $(x, y, z)$. In one dimension $|\Psi|^2 dx$
is the probability of finding the particle within the length $dx$ centered around $x$.
Given this interpretation, it is reasonable to require that the probability of finding the particle
anywhere is equal to one. This leads us to the normalization integral
$$\int_{\text{all space}} \Psi^*(\vec{x}, t)\,\Psi(\vec{x}, t)\,d^3x = 1. \tag{3.40}$$
In one dimension we have
$$\int_{-\infty}^{+\infty} \Psi^*(x, t)\,\Psi(x, t)\,dx = 1. \tag{3.41}$$
Stationary States
States with definite energy are called stationary states. It also turns out that these states do not evolve
in time, in the sense that the probability densities of these states do not depend on the time variable $t$.
Stationary states are represented by the separable wave function
$$\Psi_E(\vec{x}, t) = \psi_E(\vec{x})\,e^{-iEt/\hbar}. \tag{3.42}$$
One can easily verify that
$$|\Psi_E(\vec{x}, t)|^2 = |\psi_E(\vec{x})|^2 \tag{3.43}$$
showing that the probability density does not have any dependence on the time. By substituting Eq.
3.42 into Eq. 3.37 we obtain the time-independent Schrödinger equation:
$$-\frac{\hbar^2}{2m}\nabla^2\psi_E(\vec{x}) + U(\vec{x})\,\psi_E(\vec{x}) = E\,\psi_E(\vec{x}). \tag{3.44}$$
The time-independent Schrödinger equation describes the behaviour of a stationary state $\psi_E$ with definite
energy $E$. In many cases, we shall drop the subscript $E$ for stationary states for simplicity.
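The substitution that leads from the time-dependent to the time-independent Schrödinger equation can be spelled out as follows. This is only a sketch of the intermediate steps, writing the stationary state as $\Psi_E = \psi_E\,e^{-iEt/\hbar}$ (the symbol $\psi_E$ for the spatial part is our notational assumption):

```latex
\begin{align*}
i\hbar\,\frac{\partial}{\partial t}\left[\psi_E(\vec{x})\,e^{-iEt/\hbar}\right]
  &= i\hbar\left(-\frac{iE}{\hbar}\right)\psi_E(\vec{x})\,e^{-iEt/\hbar}
   = E\,\psi_E(\vec{x})\,e^{-iEt/\hbar}, \\
-\frac{\hbar^2}{2m}\nabla^2\left[\psi_E(\vec{x})\,e^{-iEt/\hbar}\right]
  + U(\vec{x})\,\psi_E(\vec{x})\,e^{-iEt/\hbar}
  &= \left[-\frac{\hbar^2}{2m}\nabla^2\psi_E(\vec{x}) + U(\vec{x})\,\psi_E(\vec{x})\right]e^{-iEt/\hbar}.
\end{align*}
```

Equating the two lines and cancelling the common factor $e^{-iEt/\hbar}$, which never vanishes, leaves an equation for the spatial part alone. The same phase factor cancels against its complex conjugate in the probability density, which is why the density is independent of time.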
Schrödinger Equation as Energy Statement
At this stage we cannot derive the Schrödinger equation. The derivation requires a minimum
mathematical knowledge of infinite dimensional vector spaces and operator algebra. But at least we can easily
motivate the Schrödinger equation. Eq. 3.37 is a statement of the nonrelativistic expression for the energy
as a sum of kinetic and potential energies. We know that
$$KE + PE = H. \tag{3.45}$$
Equivalently, we can write this down as
$$\frac{\vec{p}^{\,2}}{2m} + U(\vec{x}) = H. \tag{3.46}$$
If we make the following replacements for the momentum and energy
$$\vec{p} \rightarrow -i\hbar\vec{\nabla} \tag{3.47}$$
and
$$H \rightarrow i\hbar\frac{\partial}{\partial t} \tag{3.48}$$
and act with both sides on a function $\Psi(\vec{x}, t)$, then we obtain the time-dependent Schrödinger equation.
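As a consistency check of these replacements, one can apply them to the free particle wave function $\Psi = A\,e^{i(\vec{p}\cdot\vec{x} - Et)/\hbar}$ with $U = 0$. The short computation below is a sketch of that check, not a derivation of the Schrödinger equation:

```latex
\begin{align*}
-i\hbar\vec{\nabla}\,\Psi &= -i\hbar\left(\frac{i\vec{p}}{\hbar}\right)\Psi = \vec{p}\,\Psi, \\
-\frac{\hbar^2}{2m}\nabla^2\Psi &= \frac{\vec{p}^{\,2}}{2m}\,\Psi, \\
i\hbar\,\frac{\partial}{\partial t}\Psi &= i\hbar\left(-\frac{iE}{\hbar}\right)\Psi = E\,\Psi .
\end{align*}
```

The free Schrödinger equation is therefore satisfied by the plane wave exactly when $E = \vec{p}^{\,2}/2m$, which is the nonrelativistic energy of a free particle.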
3.7 Probability Theory and Complex Algebra

It is now apparent that we shall be needing more mathematical technology than what has been required
from us in thermodynamics and relativity. The probabilistic interpretation of quantum mechanics
definitely invites us to review classical probability theory, while the existence of the imaginary number $i$
in the Schrödinger equation invites us to review complex algebra.
Probability Theory
In probability theory, the questions that we ask are: what is the probability of occurrence of an event?
What is the mean? What is the variance? By event we mean, for example, heads or tails. Say, what is the
probability of obtaining three heads in a game where three unbiased coins are tossed at the same time?
These questions can be answered by constructing the so-called probability density for the random
variable. In a coin toss the random variable is the number of heads or tails.
Let us simplify the discussion by considering the specific game in which three (unbiased) coins are
tossed at the same time. We'll find (by drawing a tree diagram or by doing an experiment) that 1/8
of the time we get three heads, 3/8 of the time we get two heads, 3/8 of the time we get one head, and 1/8 of the time we
get zero heads. We summarize this result by writing down $P(3) = 1/8$, $P(2) = 3/8$, $P(1) = 3/8$, and
$P(0) = 1/8$. The quantity $P(N)$ is what we call the probability density for finding $N$ heads in a game
of three coin tosses. What are some of its properties? Note that if we add the probability density for
all results, we obtain one, $\sum_N P(N) = 1$. If we want to find the mean number of heads that
appears, we multiply each $P(N)$ by $N$ and add all possibilities.
Now let us begin to be more elegant. In probability theory, we are usually given the probability
distribution $P(x)$ for some random variable $x$. It is our task to extract all that is measurable from this
probability distribution. The probability distribution is normalized:
$$\sum_{\text{all } x} P(x) = 1. \tag{3.49}$$
The mean and the variance can be calculated using
$$\langle x \rangle = \sum_{\text{all } x} x\,P(x) \tag{3.50}$$
and
$$\langle x^2 \rangle = \sum_{\text{all } x} x^2 P(x), \tag{3.51}$$
with the variance given by $\langle x^2 \rangle - \langle x \rangle^2$.
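The three-coin game can be worked out by brute force, which makes the definitions in Eqs. 3.49 to 3.51 concrete. The sketch below enumerates the eight equally likely outcomes (the tree diagram in code form); the variable names are our own, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from itertools import product

# All 2^3 equally likely outcomes of tossing three unbiased coins
outcomes = list(product("HT", repeat=3))

# P(N): probability of obtaining N heads
prob = {}
for outcome in outcomes:
    n_heads = outcome.count("H")
    prob[n_heads] = prob.get(n_heads, 0) + Fraction(1, len(outcomes))

total = sum(prob.values())                           # normalization, Eq. 3.49
mean = sum(n * p for n, p in prob.items())           # <N>, Eq. 3.50
mean_sq = sum(n**2 * p for n, p in prob.items())     # <N^2>, Eq. 3.51
variance = mean_sq - mean**2

for n in sorted(prob, reverse=True):
    print(f"P({n}) = {prob[n]}")
print(f"sum = {total}, mean = {mean}, variance = {variance}")
```

The probabilities reproduce $P(3) = 1/8$, $P(2) = 3/8$, $P(1) = 3/8$, $P(0) = 1/8$, with a mean of $3/2$ heads and a variance of $3/4$.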
In application to coin toss problems or dice problems the formalism described above would be sufficient.
However, in many physical applications one also considers the probability of occurrence of a variable
that is continuous (in kinetic theory the Maxwell-Boltzmann distribution gives the probability that a
gas molecule will have a certain speed). In this case we consider the probability distribution $\rho(x)$. This
is normalized according to
$$\int_{-\infty}^{+\infty} dx\,\rho(x) = 1. \tag{3.52}$$
The mean and the variance can be computed using
$$\langle x \rangle = \int_{-\infty}^{+\infty} dx\,x\,\rho(x) \tag{3.53}$$
and
$$\langle x^2 \rangle = \int_{-\infty}^{+\infty} dx\,x^2\rho(x). \tag{3.54}$$
In quantum mechanics, what we are going to use as $\rho(x)$ is the absolute square of the wave function
$|\Psi|^2$. But let us first take a step towards complex algebra.
Complex Algebra

The introduction of the imaginary number $i = \sqrt{-1}$ allows us to make a whole new number space on
which we can construct our physical models. This is the complex number space and its elements are complex
numbers satisfying some algebraic properties. An arbitrary complex number $A$ can be written as
$$A = a + ib \tag{3.55}$$
where $a$ and $b$ are real numbers. The constants $a$ and $b$ are called the real and imaginary parts of $A$.
Alternatively, a complex number $A$ can be written in the form
$$A = \rho\,e^{i\theta} \tag{3.56}$$
where $\rho$ and $\theta$ are real numbers. This latter form is called the polar form of a complex number, where
$\rho$ is called the modulus and $\theta$ is called the phase. We can relate the polar form to the standard form
by expanding the exponential using Euler's formula:
$$e^{i\theta} = \cos\theta + i\sin\theta. \tag{3.57}$$
Using this we can immediately show that
$$a = \rho\cos\theta \tag{3.58}$$
and
$$b = \rho\sin\theta. \tag{3.59}$$
The inverse relationship can easily be obtained:
$$\rho = \sqrt{a^2 + b^2} \tag{3.60}$$
and
$$\theta = \arctan\frac{b}{a}. \tag{3.61}$$
The last four equations are used to transform from the polar form to the standard (rectangular) form of
a complex number and vice versa.
The complex conjugate of a complex number can be obtained by replacing every $i$ appearing in
the complex number with $-i$. For example, the complex conjugate of $A$, which is labelled $A^*$, is given by
$$A^* = a - ib = \rho\,e^{-i\theta}. \tag{3.62}$$
The absolute square or modulus squared of a complex number is the product of the complex number
and its own conjugate. For example, the absolute square of $A$ is given by
$$|A|^2 = A^*A = (\rho\,e^{-i\theta})(\rho\,e^{i\theta}) = \rho^2 = a^2 + b^2. \tag{3.63}$$
The polar form is often useful in calculating the absolute squares of quantities because the phase angle
easily cancels out of the calculation.
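The conversions between the rectangular and polar forms, and the absolute square, can be tried out with the standard `cmath` module of Python. The sample number $3 + 4i$ is an arbitrary choice for illustration, and the names `rho` and `theta` for the modulus and phase are ours.

```python
import cmath
import math

A = 3 + 4j                       # rectangular form a + ib with a = 3, b = 4

# Rectangular -> polar: modulus and phase
rho, theta = cmath.polar(A)      # rho = sqrt(a^2 + b^2), theta = atan2(b, a)

# Polar -> rectangular: rho * exp(i theta)
A_back = cmath.rect(rho, theta)

# Complex conjugate and absolute square A* A
A_conj = A.conjugate()
abs_sq = (A_conj * A).real

print(f"modulus = {rho}, phase = {theta:.4f} rad")
print(f"back to rectangular: {A_back}")
print(f"|A|^2 = {abs_sq}")
```

One practical caveat: `cmath.polar` uses the two-argument arctangent, which picks the correct quadrant. The formula $\arctan(b/a)$ by itself cannot distinguish, for example, $1 + i$ from $-1 - i$, so the signs of $a$ and $b$ must be checked separately when it is used.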
Exercises
1. Consider two complex numbers A = a + ib and B = c + id where a, b, c, and d are real numbers.
Calculate the following quantities (a) A A (b) A B (c) AB (d) AB and (e) A B .
2. Calculate the modulus of the complex number $\frac{1+2i}{1-2i}$.
q
1+2i i4 3+4i
3. Calculate the absolute square of the complex number 12i e 2i .
4. If you toss a die 1000 times, what is the most probable score that you will obtain?

3.8 Expectation Values/Averages in Quantum Mechanics
We can straightforwardly apply the concepts that we have learned in probability theory to quantum
mechanics. Given that the absolute square of the wave function $|\Psi|^2$ is the probability density,
$|\Psi(\vec{x}, t)|^2 dV$ is the probability of finding the particle in the infinitesimal box of volume $dV$ centered at
the location $\vec{x}$ at time $t$. The probability of finding the particle within a certain volume $V$ is obtained
simply by adding up the probabilities of finding the particle in each infinitesimal box making up this
volume, i.e. we have to integrate,
$$\int_{\text{over } V \text{ only}} \Psi^*(\vec{x}, t)\,\Psi(\vec{x}, t)\,d^3x. \tag{3.64}$$
The probability of finding the particle over all space must of course be unity, as implied by the
normalization condition Eq. 3.40. In one dimension the probability of finding the particle within a certain interval
$(a, b)$ is given by
$$P(a, b) = \int_a^b \Psi^*(x, t)\,\Psi(x, t)\,dx. \tag{3.65}$$
Following the probabilistic interpretation of quantum mechanics, we also have the average position
$$\langle \vec{x} \rangle = \int_{\text{all space}} \Psi^*(\vec{x}, t)\,\vec{x}\,\Psi(\vec{x}, t)\,d^3x. \tag{3.66}$$
The average momentum is given by
$$\langle \vec{p} \rangle = \int_{\text{all space}} \Psi^*(\vec{x}, t)\left(-i\hbar\vec{\nabla}\right)\Psi(\vec{x}, t)\,d^3x \tag{3.67}$$
and the average kinetic energy is
$$\langle KE \rangle = \int_{\text{all space}} \Psi^*(\vec{x}, t)\left(-\frac{\hbar^2}{2m}\nabla^2\right)\Psi(\vec{x}, t)\,d^3x. \tag{3.68}$$
In general, the average of any measurement that is represented by the operator $O$ is given by
$$\langle O \rangle = \int_{\text{all space}} \Psi^*(\vec{x}, t)\,O\,\Psi(\vec{x}, t)\,d^3x. \tag{3.69}$$
Note that the position of the operator $O$ in the integrand matters, e.g. when $O$ is a differential operator
then $\Psi^* O \Psi \neq O|\Psi|^2 \neq |\Psi|^2 O$. The operator acts on the function to its right! The one-dimensional
versions of these equations for averages should be obvious.
Example: Particle confined in one-dimension

As an example, consider the one-dimensional wave function which is zero everywhere except within $x \in (0, L)$,
where it is given by
$$\Psi(x, t) = \sqrt{\frac{2}{L}}\,\sin\left(\frac{n\pi x}{L}\right)\exp\left(-i\frac{E_n t}{\hbar}\right) \tag{3.70}$$
where
$$E_n = \frac{\pi^2\hbar^2}{2mL^2}\,n^2 \tag{3.71}$$
and $n$ is a positive integer. This wave function represents the possible quantum states of an electron that is
confined in a one-dimensional box of length $L$ and plays an important role in the theory of semiconductors
and applications to optoelectronics. We wish to do the following with this wave function: (i) show that
the wave function is normalized, (ii) calculate the average position, and (iii) calculate the average momentum.
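To show that the wave function is normalized we multiply the wave function with its complex conjugate
and integrate over all space:
$$\begin{aligned}
\int_{-\infty}^{+\infty} \Psi^*\Psi\,dx &= \int_0^L \left[\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right)\exp\left(i\frac{E_n t}{\hbar}\right)\right]\left[\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right)\exp\left(-i\frac{E_n t}{\hbar}\right)\right]dx \\
&= \frac{2}{L}\int_0^L \sin^2\left(\frac{n\pi x}{L}\right)dx \\
&= \frac{2}{L}\int_0^L \frac{1}{2}\left[1 - \cos\left(\frac{2n\pi x}{L}\right)\right]dx.
\end{aligned} \tag{3.72}$$
The reader is expected to be able to complete these elementary integrals as a minimum
mathematical requirement: the cosine term integrates to zero over a whole number of periods, leaving
$(2/L)(L/2) = 1$. It follows that the wave function is normalized; that is, the
integration leads to
$$\int_{-\infty}^{+\infty} \Psi^*\Psi\,dx = 1. \tag{3.73}$$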
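The average position is given by
$$\begin{aligned}
\langle x \rangle &= \int_{-\infty}^{+\infty} \Psi^* x\,\Psi\,dx \\
&= \frac{2}{L}\int_0^L x\,\sin^2\left(\frac{n\pi x}{L}\right)dx \\
&= \frac{2}{L}\int_0^L x\,\frac{1}{2}\left[1 - \cos\left(\frac{2n\pi x}{L}\right)\right]dx \\
&= \frac{2}{L}\left[\int_0^L \frac{x}{2}\,dx - \int_0^L \frac{x}{2}\cos\left(\frac{2n\pi x}{L}\right)dx\right] \\
&= \frac{L}{2} - \frac{1}{L}\int_0^L x\cos\left(\frac{2n\pi x}{L}\right)dx.
\end{aligned} \tag{3.74}$$
By performing integration by parts on the second term, i.e. letting $u = x$ and $dv = \cos\left(\frac{2n\pi x}{L}\right)dx$, it can be
shown that the second term reduces to zero. The average position of the particle is then located at the
center of the interval $(0, L)$:
$$\langle x \rangle = \frac{L}{2}. \tag{3.75}$$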
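Finally, the average momentum is given by
$$\begin{aligned}
\langle p \rangle &= \int_{-\infty}^{+\infty} \Psi^*\left(-i\hbar\frac{d}{dx}\right)\Psi\,dx \\
&= \int_0^L \left[\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right)\exp\left(i\frac{E_n t}{\hbar}\right)\right]\left(-i\hbar\frac{d}{dx}\right)\left[\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right)\exp\left(-i\frac{E_n t}{\hbar}\right)\right]dx \\
&= -i\hbar\,\frac{2}{L}\,\frac{n\pi}{L}\int_0^L \sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi x}{L}\right)dx \\
&= -i\hbar\,\frac{2}{L}\,\frac{n\pi}{L}\int_0^L \frac{1}{2}\sin\left(\frac{2n\pi x}{L}\right)dx.
\end{aligned} \tag{3.76}$$
Since the sine is integrated over a whole number of periods, the average momentum is zero:
$$\langle p \rangle = 0. \tag{3.77}$$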
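The three results of this example can also be verified by numerical integration. The sketch below is a rough check under stated assumptions: units with $\hbar = 1$ and $L = 1$, the quantum number $n = 3$ chosen arbitrarily, and a simple midpoint rule written by hand. Because the time-dependent phases cancel in every integrand, only the real spatial part of the wave function is needed.

```python
import math

HBAR = 1.0   # assumption: units in which hbar = 1
L = 1.0      # assumption: box length set to 1
N = 3        # quantum number n (arbitrary choice)


def psi(x, n=N):
    """Spatial part of Eq. 3.70: sqrt(2/L) sin(n pi x / L)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def dpsi_dx(x, n=N):
    """Derivative of the spatial part with respect to x."""
    return math.sqrt(2.0 / L) * (n * math.pi / L) * math.cos(n * math.pi * x / L)


def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


norm = integrate(lambda x: psi(x) ** 2, 0.0, L)                         # Eq. 3.73
mean_x = integrate(lambda x: x * psi(x) ** 2, 0.0, L)                   # Eq. 3.75
mean_p = -1j * HBAR * integrate(lambda x: psi(x) * dpsi_dx(x), 0.0, L)  # Eq. 3.77

print(f"norm = {norm:.6f}, <x> = {mean_x:.6f}, |<p>| = {abs(mean_p):.2e}")
```

Changing `N` to another positive integer leaves all three results unchanged, as the analytic calculation predicts.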
Exercise: ground state of the harmonic oscillator

Consider the quantum state described by the wave function at $t = 0$:
$$\Psi(x, t = 0) = A\exp\left(-bx^2\right) \tag{3.78}$$
where $b$ is a constant. This wave function represents the ground state of a particle in a harmonic
oscillator potential and plays an important role in the low temperature description of solids. (i) Express
the normalization constant $A$ in terms of the given variable $b$. (ii) Calculate the average position of the
particle. (iii) Calculate the average momentum of the particle. Hint: You can differentiate the following
integral with respect to $\alpha$ to perform the required integrations,
$$\int_{-\infty}^{+\infty} \exp\left(-\alpha x^2\right)dx = \sqrt{\frac{\pi}{\alpha}}. \tag{3.79}$$
(iv) Continue the integrations and calculate the expectation value of the square of the position $\langle x^2 \rangle$ and
the uncertainty $\Delta x$. (v) Calculate $\langle p^2 \rangle$ and $\Delta p$. (vi) What must be the value of $b$ so that the limiting
case of the Heisenberg uncertainty principle, $\Delta x\,\Delta p = \hbar/2$, is satisfied?

3.9 Eigenvalues, Eigenfunctions, and the Superposition Principle
We have already shown that possible solutions of the time-dependent Schrödinger equation are the
stationary states given by
$$\Psi_E(x, t) = \psi_E(x)\exp\left(-i\frac{Et}{\hbar}\right) \tag{3.80}$$
where $E$ represents the energy of the state $\psi_E(x)$ satisfying the time-independent Schrödinger equation.
The values $E$ are called the eigenvalues while the $\psi_E(x)$ are the corresponding eigenfunctions.
Mathematically, we are guided to construct the most general wave function as
$$\Psi(x, t) = \sum_{\text{all possible } E} a_E\,\psi_E(x)\exp\left(-i\frac{Et}{\hbar}\right) \tag{3.81}$$
because the eigenfunctions turn out to be linearly independent of each other. In Eq. 3.81, the $a_E$ are
constants whose interpretation we shall now discuss.
A valid wave function in quantum mechanics is one that is normalizable. We subject Eq. 3.81 to this
constraint:
$$\begin{aligned}
\int_{-\infty}^{+\infty} |\Psi|^2 dx &= \int_{-\infty}^{+\infty} \left[\sum_{\text{all possible } E} a_E^*\,\psi_E^*(x)\exp\left(i\frac{Et}{\hbar}\right)\right]\left[\sum_{\text{all possible } E'} a_{E'}\,\psi_{E'}(x)\exp\left(-i\frac{E't}{\hbar}\right)\right]dx \\
&= \sum_{E, E'} a_E^*\,a_{E'}\exp\left(i\frac{E - E'}{\hbar}t\right)\int_{-\infty}^{+\infty} \psi_E^*(x)\,\psi_{E'}(x)\,dx.
\end{aligned} \tag{3.82}$$
To continue from this we note one special property of eigenfunctions. Eigenfunctions satisfy the condition
(known as orthogonality)
$$\int_{-\infty}^{+\infty} \psi_E^*(x)\,\psi_{E'}(x)\,dx = \delta_{EE'} \tag{3.83}$$
where $\delta_{EE'}$ is zero whenever $E \neq E'$ and one when $E = E'$. This limits the sum to terms with $E = E'$ and
we obtain
$$\int_{-\infty}^{+\infty} |\Psi|^2 dx = \sum_E |a_E|^2. \tag{3.84}$$
Because the wave function is normalized we obtain the following condition on the constants $a_E$:
$$\sum_E |a_E|^2 = 1. \tag{3.85}$$
This resembles the condition that the sum of all probabilities of having different energies $E$ is
equal to one. This suggests the interpretation that $|a_E|^2$ gives the probability that the quantum state represented
by $\Psi(x, t)$ has a certain energy $E$. This is the currently accepted interpretation. To strengthen this
interpretation further, we can actually calculate the average energy corresponding to the state Eq. 3.81.
The result is
$$\langle E \rangle = \sum_E E\,|a_E|^2. \tag{3.86}$$
This resembles the equation for calculating the average energy with probability $|a_E|^2$ for finding the
particular energy $E$.
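A small numerical example may help fix the interpretation of the coefficients $a_E$. The sketch below takes a hypothetical superposition of the first three particle-in-a-box states, with energies $E_n \propto n^2$ as in Eq. 3.71 measured in units of $\pi^2\hbar^2/2mL^2$; the coefficient values are invented for illustration.

```python
import math


def box_energy(n):
    """Energy of level n in units of pi^2 hbar^2 / (2 m L^2), following Eq. 3.71."""
    return n**2


# Hypothetical complex coefficients a_E, labelled by the quantum number n
coeffs = {1: 1 / math.sqrt(2), 2: 0.5j, 3: 0.5}

# Probability of measuring each energy: |a_E|^2
probs = {n: abs(a) ** 2 for n, a in coeffs.items()}

total_prob = sum(probs.values())                          # must equal 1, Eq. 3.85
mean_energy = sum(box_energy(n) * p for n, p in probs.items())   # Eq. 3.86

for n, p in probs.items():
    print(f"E_{n} = {box_energy(n)} : probability {p:.2f}")
print(f"total probability = {total_prob:.2f}, <E> = {mean_energy:.2f}")
```

Note that the phase of a coefficient (here the factor $i$ in the second one) does not affect the probability of measuring the corresponding energy, only its modulus does.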
That the most general quantum state can be written as a sum of stationary solutions (states whose
probability densities do not evolve in time) is known as the superposition principle. Though this follows
from the mathematics of vector spaces, it can also be interpreted physically. In quantum mechanics, a
system is allowed to possess only a specific set of energies, namely the eigenvalues of the
time-independent Schrödinger equation. A measurement of the energy of the state yields one of these
possible values. Another measurement on an identically prepared system can yield another of the possible
values. When we make a sufficiently large number of measurements we can experimentally build up the
probability distribution $|a_E|^2$ of finding the system with the particular energy $E$. When we average
all of the results, we also find that the average energy is consistent with what probability theory tells
us: we add $E|a_E|^2$ over all possible values. This allows us to motivate the general quantum state in the
form of Eq. 3.81 on physical grounds. It is important to understand that this interpretation applies in
quantum mechanics because of the probabilistic nature of quantum systems.
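
The bookkeeping behind Eqs. 3.85 and 3.86 can be sketched in a few lines of Python. This is only an illustration: the three energies and the complex coefficients below are made-up values, and the function names are hypothetical helpers, not part of the notes.

```python
# Illustrative sketch: the energies and amplitudes are assumed values,
# chosen only so that the state is normalized.

def probabilities(amplitudes):
    """Return |a_E|^2 for each expansion coefficient a_E."""
    return [abs(a) ** 2 for a in amplitudes]


def average_energy(energies, amplitudes):
    """Return <E> = sum_E E |a_E|^2, as in Eq. 3.86."""
    return sum(E * p for E, p in zip(energies, probabilities(amplitudes)))


energies_eV = [1.0, 4.0, 9.0]          # assumed energy eigenvalues (eV)
amplitudes = [0.5 + 0.5j, 0.5j, 0.5]   # assumed complex coefficients a_E

probs = probabilities(amplitudes)
total_probability = sum(probs)         # equals 1 for a normalized state (Eq. 3.85)
mean_energy_eV = average_energy(energies_eV, amplitudes)

print("probabilities:", probs)
print("sum of probabilities:", total_probability)
print("average energy (eV):", mean_energy_eV)
```

Note that only the magnitudes of the coefficients enter; the complex phases of the $a_E$ do not affect the energy probabilities.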

Eigenvalues and Eigenfunctions
We have mentioned eigenvalues and eigenfunctions without giving a formal definition of these terms. That
$E$ and $\psi_E$ are an eigenvalue and eigenfunction means that they satisfy the time-independent Schrödinger
equation:
\[ \left[ -\frac{\hbar^2}{2m}\nabla^2 + V(x) \right] \psi_E = E\,\psi_E. \tag{3.87} \]
We can think of the quantity in the brackets as an operator acting on the function $\psi_E$. In general, an
eigenfunction $\psi_n$ of an operator $L$ is such that when the operator acts on the eigenfunction, the result
is a constant $\lambda_n$ multiplied by the eigenfunction itself, i.e.
\[ L\psi_n = \lambda_n \psi_n. \tag{3.88} \]
In quantum mechanics, the observables (position, momentum, energy, spin) are represented by operators.
The act of measuring an observable yields one of the eigenvalues of the operator representing the
observable.
As an example, the eigenvalues $\vec{p}$ and eigenfunctions $\psi_p$ of the momentum operator satisfy
\[ (-i\hbar\nabla)\,\psi_p = \vec{p}\,\psi_p. \tag{3.89} \]
It is straightforward to show that these eigenfunctions $\psi_p$ are given by plane waves
\[ \psi_p(\vec{x}) = \exp\left( i\,\frac{\vec{p}\cdot\vec{x}}{\hbar} \right). \tag{3.90} \]
In one dimension, the eigenfunctions of the momentum operator $-i\hbar\,\frac{d}{dx}$ are given by
\[ \psi_p = \exp\left( i\,\frac{p}{\hbar}\,x \right). \tag{3.91} \]
We then see that free particle solutions $\Psi = A \exp(i(kx - \omega t))$ are eigenfunctions of the momentum
with eigenvalue $\hbar k$. In contrast, the function $\psi(x) = \sin(kx)$ is not an eigenfunction of the
momentum operator because the action of the momentum operator on this function yields a different function
that is linearly independent from the one acted on (a cosine).
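
A quick numerical check of this statement, as a sketch only: we apply $-i\hbar\,d/dx$ by a central finite difference (an approximation introduced here, not a method from the notes) in units where $\hbar = 1$, with an assumed wave number $k = 2$. For an eigenfunction the ratio $(\hat{p}\psi)/\psi$ is the same constant at every point; for $\sin(kx)$ it changes from point to point.

```python
import cmath
import math

HBAR = 1.0   # assumption: natural units with hbar = 1
K = 2.0      # assumed wave number, for illustration only


def momentum_op(f, x, h=1e-5):
    """Approximate (-i*hbar d/dx) f at x with a central difference."""
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2.0 * h)


def plane_wave(x):
    return cmath.exp(1j * K * x)


def standing_wave(x):
    return math.sin(K * x)


def ratios(f, points):
    """Return (p f)(x) / f(x) at each sample point."""
    return [momentum_op(f, x) / f(x) for x in points]


points = [0.3, 0.7, 1.1]
plane_ratios = ratios(plane_wave, points)        # each close to hbar*k
standing_ratios = ratios(standing_wave, points)  # varies with x

print(plane_ratios)
print(standing_ratios)
```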
Exercises
1. Normalization. Consider a particle described by the wave function $\psi(x) = Ae^{-bx}$ ($A$ and $b$ are real
and positive constants) in the region $0 \le x < \infty$ and $\psi(x) = 0$ elsewhere. What is the normalization
constant $A$?
2. Expectation. Consider the wave function $\psi(x) = \sqrt{5/(2L^5)}\,x^2$ for $|x| < L$, where $L$ is a positive
constant. What is the expectation value of the position?
3. Consider the following wave function of a particle in a one-dimensional quantum system:
\[ \Psi(x,t) = C\,\psi_1(x)\,e^{-i(0.38\,\text{eV})t/\hbar} + \sqrt{\frac{1}{3}}\,\psi_2(x)\,e^{-i(1.52\,\text{eV})t/\hbar} + \frac{i}{2}\,\psi_3(x)\,e^{-i(3.42\,\text{eV})t/\hbar} \tag{3.92} \]
where $C$ is a complex number and $\psi_1(x)$, $\psi_2(x)$, and $\psi_3(x)$ are normalized spatial wave functions
that correspond to the ground, first-excited, and second-excited states, respectively. (i) What is
the probability that the particle will be in the ground state? (ii) What is the average energy? (iii)
What is the minimum uncertainty in the particle's lifetime according to the energy-time uncertainty
principle?
4. Consider a particle having the following probability distribution in momentum $p$:
P(p = 1.51 kg m/s) = 0.70 (3.93)
P(p = 2.51 kg m/s) = 0.25 (3.94)
P(p = 7.21 kg m/s) = 0.05 (3.95)
(i) What is the average momentum of the particle? (ii) What is the uncertainty in the momentum of
the particle? (iii) What is the minimum uncertainty in the position of the particle? (iv) Write down
the wave function for the particle, letting $\psi_i$ correspond to the states with the given momenta
and $E_i$ be the energies of these states.

3.10 Particle in a Box
We discuss our first specific quantum system: an electron confined between two very high,
impenetrable potential barriers. The potential defining the problem is given by
\[ V(x) = \begin{cases} 0, & \text{in the box} \\ \infty, & \text{elsewhere.} \end{cases} \tag{3.96} \]
This implies that the particle is essentially free inside the box. The infinite potential outside the box
is interpreted as removing any possibility that the particle can be found in those regions. This is a
physical constraint that we impose on the solution to the Schrödinger equation. We write down
\[ \psi(x) = 0, \quad \text{outside the box.} \tag{3.97} \]
Other than this physical constraint we also want the wave function to be a smooth function of the
position, i.e. the wave function and its derivative should be continuous:
\[ \psi = \text{continuous} \tag{3.98} \]
\[ \frac{d\psi}{dx} = \text{continuous.} \tag{3.99} \]
These two conditions allow the kinetic energy term, $-\frac{\hbar^2}{2m}\nabla^2\psi$, in the Schrödinger
equation to have a finite value. (At an infinitely high wall only the continuity of $\psi$ itself can be
imposed, which is why only Eq. 3.98 is used at the boundaries below.)
Inside the box, the potential is zero and the Schrödinger equation becomes
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi. \tag{3.100} \]
The general solution to this is a linear combination of the oscillatory functions sin and cos:
\[ \psi(x) = A\sin(kx) + B\cos(kx) \tag{3.101} \]
where $k$ is given by
\[ k^2 = \frac{2mE}{\hbar^2}. \tag{3.102} \]
Without loss of generality we place the boundaries of the box at $x = 0$ and $x = L$. At the left boundary
$x = 0$ the continuity condition of the wave function immediately leads to the condition
\[ B = 0. \tag{3.103} \]
The solution at this point is
\[ \psi(x) = A\sin(kx). \tag{3.104} \]
The continuity condition at $x = L$ leads to
\[ A\sin(kL) = 0. \tag{3.105} \]
This can be satisfied when the wave number is quantized:
\[ k = \frac{\pi}{L}\,n, \quad n = 1, 2, ... \tag{3.106} \]
By taking the square of this and using Eq. 3.102 we obtain the possible energies for the particle in a
box:
\[ E_n = \frac{\pi^2\hbar^2}{2mL^2}\,n^2, \quad n = 1, 2, ... \tag{3.107} \]
When we make a measurement of the energy of the particle we get one of the possible values given in
Eq. 3.107. It is left as an exercise to show that the normalization condition of the wave function gives
$A = \sqrt{2/L}$. The solution to the Schrödinger equation is then given by
\[ \psi_n(x) = \sqrt{\frac{2}{L}}\,\sin\left(\frac{n\pi x}{L}\right), \quad n = 1, 2, ... \tag{3.108} \]

The particle can be in any one of the states described by the solution given by Eq. 3.108. The probability
density for such a state is then given by
\[ |\Psi(x,t)|^2 = \frac{2}{L}\,\sin^2\left(\frac{n\pi x}{L}\right). \tag{3.109} \]
Because a quantum system may not have a definite energy we write down its wave function as a sum
of possible solutions corresponding to definite energies (superposition principle). The general quantum
state of a particle in a box is then given by
\[ \Psi(x,t) = \sum_{n=1}^{\infty} a_n \sqrt{\frac{2}{L}}\,\sin\left(\frac{n\pi x}{L}\right) e^{-iE_n t/\hbar} \tag{3.110} \]
where $|a_n|^2$ gives the probability of finding the system in the state with energy $E_n$.
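
As a rough numerical illustration of Eqs. 3.107 and 3.108, the sketch below evaluates the first few energy levels of an electron and checks the normalization of the stationary states with a midpoint-rule integral. The box width of 1 nm is an assumed value, and the helper names are hypothetical.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_E = 9.1093837015e-31    # electron mass (kg)
EV = 1.602176634e-19      # joules per electron volt


def box_energy(n, length, mass=M_E):
    """Energy E_n of Eq. 3.107, in joules."""
    return (math.pi * HBAR * n) ** 2 / (2.0 * mass * length ** 2)


def box_wavefunction(n, x, length):
    """Normalized stationary state of Eq. 3.108 (zero outside the box)."""
    if x < 0.0 or x > length:
        return 0.0
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)


def norm_integral(n, length, steps=2000):
    """Midpoint-rule estimate of the integral of |psi_n|^2 over the box."""
    dx = length / steps
    total = sum(box_wavefunction(n, (i + 0.5) * dx, length) ** 2 for i in range(steps))
    return total * dx


L_BOX = 1.0e-9   # assumed box width: 1 nm
levels_eV = [box_energy(n, L_BOX) / EV for n in (1, 2, 3)]
norms = [norm_integral(n, L_BOX) for n in (1, 2, 3)]

print("E_1, E_2, E_3 (eV):", levels_eV)
print("normalization integrals:", norms)
```

The levels scale as $n^2$, so for this assumed width the spacing between neighbouring levels grows with $n$.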
Exercises
1. An electron subjected to an infinite square well potential was prepared in the following state:
$\Psi(x, 0) = A\sin\left(\frac{4\pi x}{\text{nm}}\right)$. What is the wave function of the particle after time $t$?
2. The Longer The Better. An electron in a 10 nm box emits a photon as it relaxes toward the
$n = 2$ state. What is the longest wavelength of the photon that the electron can emit? Ans.
$6.60 \times 10^{-5}$ m
3. Analyze the problem of the particle in a box carefully. What changes if we shift the location of the
box boundaries to $x = -L/2$ and $x = L/2$?
3.11 Particle in a Finite Square Well

The particle in a box is sufficient to describe the state of a system whose lowest energies are much smaller
than the height of the barrier. But when the energies calculated using Eq. 3.107 are close to
the height of the potential barrier the particle in a box model loses its predictive power. We improve
on this model by introducing the particle in a finite square well. In this case, instead of having infinite
potential barriers at the boundaries of the box we introduce finite potential barriers that the particle
can penetrate by quantum mechanical effects. The potential describing the problem is
\[ V(x) = \begin{cases} 0, & \text{in the well} \\ V_0, & \text{elsewhere.} \end{cases} \tag{3.111} \]
Because of the translational symmetry of the problem (the well can be anywhere in space) we have the
option to use the well location that best simplifies the problem. In this case, we choose the boundaries
to be at $x = -L/2$ and $x = L/2$. We are now ready to solve the Schrödinger equation and subject the
solution to the normalization condition and Eqs. 3.98 and 3.99, continuity and smoothness.
The choice of well location puts a symmetry on the system that allows us to discuss even and odd
solutions to the Schrödinger equation separately. For energies $E < V_0$ the general solution is given by
\[ \psi(x) = \begin{cases} A\cos(kx) + B\sin(kx), & \text{in the well} \\ Ce^{\kappa x} + De^{-\kappa x}, & \text{in the barriers} \end{cases} \tag{3.112} \]
where
\[ k^2 = \frac{2mE}{\hbar^2} \tag{3.113} \]
and
\[ \kappa^2 = \frac{2m(V_0 - E)}{\hbar^2}. \tag{3.114} \]
The previous statement on symmetry allows us to write down the solutions for even and odd functions
of $x$ separately. Furthermore, while the finite barrier allows the particle to be found inside the barriers
we do not expect this probability to be large. Physically, we can argue that the probability of finding
the particle farther and farther from the well becomes very small. This must be reflected in the wave
function. For solutions with even dependence on the position we have
\[ \psi(x) = \begin{cases} Ce^{\kappa x}, & x \le -L/2 \\ A\cos(kx), & -L/2 < x < L/2 \\ Ce^{-\kappa x}, & x \ge L/2. \end{cases} \tag{3.115} \]
Clearly the probability density decreases exponentially as one goes farther and farther away from the well.
Since we have already imposed the symmetry in $x$ we do not have to impose the boundary conditions
at both $x = -L/2$ and $x = L/2$ since these would give exactly the same condition. The continuity and
smoothness conditions at $x = L/2$ give
\[ A\cos\left(\frac{kL}{2}\right) = Ce^{-\kappa L/2} \tag{3.116} \]
and
\[ kA\sin\left(\frac{kL}{2}\right) = \kappa Ce^{-\kappa L/2}. \tag{3.117} \]
We can divide Eq. 3.117 by Eq. 3.116 to cancel out the constants $A$ and $C$. After multiplying by $L/2$ this gives
\[ \frac{kL}{2}\tan\left(\frac{kL}{2}\right) = \frac{\kappa L}{2}. \tag{3.118} \]
Adding Eqs. 3.113 and 3.114 and multiplying by $L^2/4$ we obtain
\[ \left(\frac{kL}{2}\right)^2 + \left(\frac{\kappa L}{2}\right)^2 = \frac{mV_0L^2}{2\hbar^2}. \tag{3.119} \]
Eqs. 3.118 and 3.119 are two equations which can be solved numerically for the energies of the system.
A closed-form analytical expression for the solution cannot be obtained for the finite well problem.
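
One possible way to carry out this numerical solution is sketched below, using bisection on the dimensionless variable $\xi = kL/2$. Eliminating $\kappa L/2$ between Eqs. 3.118 and 3.119 gives $\xi\tan\xi = \sqrt{R^2 - \xi^2}$ with $R^2 = mV_0L^2/(2\hbar^2)$; the odd solutions obey the analogous standard condition $-\xi\cot\xi = \sqrt{R^2 - \xi^2}$. The choice of bisection, the helper names, and the sample parameters (mass 0.067 electron masses, width 100 Å, depth 240 meV) are assumptions made for illustration.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_E = 9.1093837015e-31    # electron mass (kg)
EV = 1.602176634e-19      # joules per electron volt


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def lowest_even_and_odd_levels(mass, width, depth):
    """Return the lowest even and lowest odd bound-state energies (J)."""
    radius_sq = mass * depth * width ** 2 / (2.0 * HBAR ** 2)   # R^2 in Eq. 3.119
    radius = math.sqrt(radius_sq)
    if radius <= math.pi / 2 + 1e-6:
        raise ValueError("well too shallow to hold an odd bound state")
    eps = 1e-9

    def even(xi):   # xi tan(xi) - sqrt(R^2 - xi^2), from Eqs. 3.118 and 3.119
        return xi * math.tan(xi) - math.sqrt(radius_sq - xi * xi)

    def odd(xi):    # -xi cot(xi) - sqrt(R^2 - xi^2), the odd-parity analogue
        return -xi / math.tan(xi) - math.sqrt(radius_sq - xi * xi)

    xi_even = bisect(even, eps, math.pi / 2 - eps)
    xi_odd = bisect(odd, math.pi / 2 + eps, min(math.pi, radius) - eps)
    to_energy = 2.0 * HBAR ** 2 / (mass * width ** 2)   # E = 2 hbar^2 xi^2 / (m L^2)
    return to_energy * xi_even ** 2, to_energy * xi_odd ** 2


# Assumed sample parameters: 0.067 electron masses, 100 angstrom wide, 240 meV deep.
e_even, e_odd = lowest_even_and_odd_levels(0.067 * M_E, 100e-10, 0.240 * EV)
ground_meV = 1e3 * e_even / EV
first_excited_meV = 1e3 * e_odd / EV
print(ground_meV, first_excited_meV)
```

The small differences one may see relative to hand or spreadsheet values come from the precision of the physical constants used.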
Exercises
1. The solution to the Schrödinger equation with odd dependence on $x$ is
\[ \psi(x) = \begin{cases} -Ce^{\kappa x}, & x \le -L/2 \\ A\sin(kx), & -L/2 < x < L/2 \\ Ce^{-\kappa x}, & x \ge L/2. \end{cases} \tag{3.120} \]
Derive the quantization condition similar to Eq. 3.118 for this solution. Which has the greater
energy, the lowest solution with even dependence on $x$ or the lowest solution with odd dependence on $x$?
2. Write down the solution to the Schrödinger equation for the particle in a well described by an
asymmetric potential well:
\[ V(x) = \begin{cases} V_0, & x < -L/2 \\ 0, & -L/2 < x < L/2 \\ \infty, & x > L/2. \end{cases} \tag{3.121} \]
3. Numerical solution to square well problem. Using the plotting capability of spreadsheet programs
one can calculate the energy levels of a particle in a finite square well numerically. This can be
done by defining the dimensionless variables
\[ \xi = kL/2 \tag{3.122} \]
\[ \eta = \kappa L/2. \tag{3.123} \]
In this case, for example, the quantization condition for solutions with even dependence on position
becomes $\eta = \xi\tan\xi$ and $\eta = \sqrt{\frac{mV_0L^2}{2\hbar^2} - \xi^2}$. The intersections of these
two plots lead to the possible energy levels. Say there is an intersection at $\xi = \xi_0$. Then one
possible energy level is given by $\xi_0 = kL/2$ or simply $E = \frac{2\hbar^2}{mL^2}\,\xi_0^2$. For a particle
with a mass of 0.067 times the electron mass that is confined in a finite square well of width 100 Å and
height 240 meV, show that this numerical routine leads to the energy level 32.47 meV. This is the lowest
energy that the system can occupy. Repeat the procedure for the solution with odd dependence on position
and show that the first excited state energy level is given by 124.44 meV.

3.12 Quantum Tunnelling
In the past two cases we have been talking about a particle which can be found only
close to a region of space because of a potential, i.e. a potential well traps it so that the particle
stays close by. We call these cases bound states. We now give an example of a scattering problem in
quantum mechanics. In scattering, a particle travels with constant velocity from far away and then
encounters a scattering region (represented by a potential). Once it hits the scattering region its state
becomes a superposition of a scattered wave function and the original wave function. In one dimension,
when the particle hits the scattering region (a potential barrier) the wave is partially reflected and
partially transmitted. When we make a measurement of the position of the particle after it hits the
scattering region we find that it is sometimes reflected and sometimes transmitted. The probabilities of
transmission and reflection are governed by the laws of quantum mechanics.
Consider the case of a particle that is incident on the potential barrier from the left ($x < 0$) as shown
in the figure below. The potential function is given in symbols as
\[ V(x) = \begin{cases} U_0, & \text{in the barrier region } (0 < x < L) \\ 0, & \text{elsewhere.} \end{cases} \tag{3.124} \]
Figure 41: One-dimensional picture of scattering through a rectangular potential barrier.
We consider the case that the energy $E$ of the particle is less than the height of the potential barrier.
The incident particle is represented by the free particle wave function with positive momentum, i.e.
$\psi(x) = Ae^{ipx/\hbar} = Ae^{ikx}$ where $k = \sqrt{2mE}/\hbar$. When it hits the potential barrier it is
reflected and transmitted at the same time. Thus, an additional solution $Be^{-ikx}$ is introduced at
$x < 0$ and a solution $Ce^{ikx}$ is introduced for $x > L$. In region II, inside the barrier, the solution
is a linear combination of exponential functions. The solution to the Schrödinger equation is given by
\[ \psi(x) = \begin{cases} Ae^{ikx} + Be^{-ikx}, & \text{region I} \\ \alpha e^{\kappa x} + \beta e^{-\kappa x}, & \text{region II} \\ Ce^{ikx}, & \text{region III} \end{cases} \tag{3.125} \]
where
\[ k^2 = \frac{2mE}{\hbar^2} \tag{3.126} \]
and
\[ \kappa^2 = \frac{2m(U_0 - E)}{\hbar^2}. \tag{3.127} \]
What are we interested in? The tunneling probability. This is the quantity that we can measure as a
current in a tunnel diode or a scanning tunneling microscope. To identify what this quantity is in terms
of our given variables we use the quantum mechanical interpretation. The constant $A$ corresponds to the
amplitude of the incident/original wave that is scattered. Once scattered, it is divided into a reflected
wave with constant $B$ and a transmitted wave with constant $C$. The ratio of the constants $C/A$ is the
amplitude of the transmitted wave relative to the original wave moving to the right. The probability
of transmission is then the absolute square of this ratio. Similarly, the probability of reflection is the
absolute square of $B/A$. Upon imposing the continuity and smoothness conditions at the boundaries, we
obtain the following result for the transmission probability:
\[ T = |C/A|^2 = \frac{1}{1 + \frac{1}{4}\left(\frac{\kappa}{k} + \frac{k}{\kappa}\right)^2 \sinh^2(\kappa L)}. \tag{3.128} \]
In terms of $U_0$ and $E$, the barrier height and the particle energy, we have
\[ T = \frac{1}{1 + \frac{1}{4}\,\frac{U_0^2}{U_0E - E^2}\,\sinh^2(\kappa L)}. \tag{3.129} \]
For very tall barriers or very wide barriers or both we take the limit $\kappa L \gg 1$ and this becomes
\[ T \approx 16\left(\frac{E}{U_0} - \frac{E^2}{U_0^2}\right)\exp(-2\kappa L). \tag{3.130} \]
The important result from this expression is that the transmission probability depends on the length $L$
of the barrier as
\[ T \propto e^{-2\kappa L}. \tag{3.131} \]
This makes sense because the probability must decrease to zero as the width of the barrier $L$ becomes
larger and larger.
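
To get a feeling for the numbers, the sketch below evaluates the exact expression of Eq. 3.129 and the thick-barrier approximation of Eq. 3.130 for an electron. The energy (1 eV), barrier height (3 eV) and widths are assumed sample values, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
M_E = 9.1093837015e-31    # electron mass (kg)
EV = 1.602176634e-19      # joules per electron volt


def decay_constant(energy, barrier, mass=M_E):
    """kappa of Eq. 3.127 (1/m); requires energy < barrier."""
    return math.sqrt(2.0 * mass * (barrier - energy)) / HBAR


def transmission_exact(energy, barrier, width, mass=M_E):
    """Transmission probability of Eq. 3.129."""
    s = math.sinh(decay_constant(energy, barrier, mass) * width)
    return 1.0 / (1.0 + barrier ** 2 * s ** 2 / (4.0 * energy * (barrier - energy)))


def transmission_approx(energy, barrier, width, mass=M_E):
    """Thick-barrier limit of Eq. 3.130, valid for kappa*L >> 1."""
    ratio = energy / barrier
    kappa = decay_constant(energy, barrier, mass)
    return 16.0 * ratio * (1.0 - ratio) * math.exp(-2.0 * kappa * width)


E_SAMPLE = 1.0 * EV     # assumed particle energy
U_SAMPLE = 3.0 * EV     # assumed barrier height
t_exact = transmission_exact(E_SAMPLE, U_SAMPLE, 1.0e-9)
t_approx = transmission_approx(E_SAMPLE, U_SAMPLE, 1.0e-9)
t_wider = transmission_exact(E_SAMPLE, U_SAMPLE, 1.5e-9)

print("exact T (1.0 nm):", t_exact)
print("approx T (1.0 nm):", t_approx)
print("exact T (1.5 nm):", t_wider)
```

For these assumed values $\kappa L$ is about 7, so the two expressions agree closely, and widening the barrier by half a nanometre reduces $T$ by several orders of magnitude.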
Exercises
1. Harang. A particle has an energy $E$, and encounters a barrier with height $U_0$ and length $L$.
Which of the following can reduce the transmission probability of the particle as it interacts with the
barrier? Increasing $L$? Decreasing $U_0$? Increasing $E$?
2. Break Away. A 2.0 eV electron encounters a barrier with height 5.0 eV and width 1.50 nm. By
what factor would the probability of tunneling increase if the width is decreased to 1.00 nm?
3.13 Harmonic Oscillator

We discuss the quantum analog of the motion of a particle that is connected to a spring. This is another
bound state problem whose energies can take on only specific values. Consider a particle that interacts
through the potential given by
\[ V(x) = \frac{1}{2}m\omega^2x^2. \tag{3.132} \]
Classically, a particle that is acted upon by this potential oscillates back and forth. This
particle can have any value of energy.
In quantum mechanics, we solve the Schrödinger equation for the energy levels of the system:
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}m\omega^2x^2\,\psi = E\psi. \tag{3.133} \]
This differential equation can be solved exactly but the level of mathematics that is needed is beyond
Math 55. We therefore only quote the solution that leads to the quantized energies. The solution to the
harmonic oscillator problem in quantum mechanics is given by
\[ \psi_n(y) = \left(\frac{\alpha}{\pi}\right)^{1/4}\frac{1}{\sqrt{2^n\,n!}}\,H_n(y)\,e^{-y^2/2} \tag{3.134} \]
where
\[ y = \sqrt{\alpha}\,x \tag{3.135} \]
and
\[ \alpha = \frac{m\omega}{\hbar}. \tag{3.136} \]

The functions $H_n(y)$ are known as Hermite polynomials. The energy levels corresponding to the solutions
$\psi_n(y)$ are given by
\[ E_n = \hbar\omega\left(n + \frac{1}{2}\right), \quad n = 0, 1, 2, 3, ... \tag{3.137} \]
For $n = 0$ we have $H_0(y) = 1$ and the ground state solution to the harmonic oscillator is given by
\[ \psi_0(x) = \left(\frac{\alpha}{\pi}\right)^{1/4} e^{-y^2/2}. \tag{3.138} \]
One can verify that this is indeed a solution by substituting it into the
Schrödinger equation. This gives the ground state energy for a particle in the harmonic oscillator:
\[ E_0 = \frac{\hbar\omega}{2}. \tag{3.139} \]
As always, a particle in quantum mechanics can occupy any of the states with definite energies. This
gives the superposition principle and for the harmonic oscillator problem this takes the form
\[ \Psi(y,t) = \sum_{n=0}^{\infty} a_n \left(\frac{\alpha}{\pi}\right)^{1/4}\frac{1}{\sqrt{2^n\,n!}}\,H_n(y)\,e^{-y^2/2}\exp\left(-iE_nt/\hbar\right). \tag{3.140} \]
For example, the wave function of a particle can be
\[ \Psi(x,t) = \sqrt{\frac{3}{4}}\,\psi_0(x)\,e^{-iE_0t/\hbar} + \sqrt{\frac{1}{4}}\,\psi_1(x)\,e^{-iE_1t/\hbar}. \tag{3.141} \]
This tells us that when we make a measurement, 75% of the time we find the particle in the ground state
with energy $E_0$ and 25% of the time we find the particle in the first excited state with energy $E_1$.
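
The substitution check mentioned above can also be done numerically. The sketch below works in dimensionless units where $\hbar = m = \omega = 1$ (so $\alpha = 1$ and $y = x$, an assumption made here for convenience), in which Eq. 3.133 reads $-\frac{1}{2}\psi'' + \frac{1}{2}y^2\psi = E\psi$. It estimates $\psi''$ with a finite difference and confirms that $(H\psi)/\psi$ is the same constant at several points: $1/2$ for $\psi_0$ and $3/2$ for $\psi_1$ (using $H_1(y) = 2y$).

```python
import math


def psi0(y):
    """Ground state of Eq. 3.138 with alpha = 1."""
    return math.pi ** -0.25 * math.exp(-0.5 * y * y)


def psi1(y):
    """First excited state from Eq. 3.134 with n = 1 and H_1(y) = 2y."""
    return math.sqrt(2.0) * math.pi ** -0.25 * y * math.exp(-0.5 * y * y)


def local_energy(psi, y, h=1e-3):
    """Return (H psi)(y) / psi(y) for H = -(1/2) d^2/dy^2 + (1/2) y^2."""
    second_derivative = (psi(y + h) - 2.0 * psi(y) + psi(y - h)) / h ** 2
    return (-0.5 * second_derivative + 0.5 * y * y * psi(y)) / psi(y)


sample_points = [0.5, 1.0, 2.0]   # avoid y = 0, where psi1 vanishes
e0_values = [local_energy(psi0, y) for y in sample_points]
e1_values = [local_energy(psi1, y) for y in sample_points]

print("E_0 / (hbar*omega):", e0_values)
print("E_1 / (hbar*omega):", e1_values)
```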
The Ehrenfest Theorem

It turns out that when we take the averages (expectation values) of the position and momentum operators
for the quantum harmonic oscillator, they obey the same equations of motion as the corresponding
classical quantities. This is known as the Ehrenfest theorem and is expressed in terms of the position
and momentum as
\[ m\,\frac{d\langle x\rangle}{dt} = \langle p\rangle \tag{3.142} \]
and
\[ \frac{d\langle p\rangle}{dt} = -m\omega^2\langle x\rangle. \tag{3.143} \]
These are simply what we know to be the equations of motion in classical mechanics, i.e. Newton's laws
of motion for the harmonic oscillator.
Exercises
1. Promoted. A harmonic oscillator is initially in the first-excited state. It absorbs a photon,
raising its state by two levels. If the ground-state energy of such an oscillator is 1.20 eV, what is
the wavelength of the absorbed photon? Ans. 259 nm
2. Final. A harmonic oscillator, initially in the first-excited state, absorbs a photon with a wavelength
of 550 nm. What is the final state $n_f$ of the oscillator if its angular frequency is $3.44 \times 10^{15}$ rad/s?
3. Uncoupled. Three harmonic oscillators A, B, and C have ground state energies of $E_A$, $E_B > E_A$,
and $E_C > E_B$, respectively. Which of these oscillators would absorb a photon with the longest
wavelength to transition to the first-excited state?
4. Recoil. A photon was absorbed by a quantum harmonic oscillator that was prepared in the ground
state with angular frequency $\omega$. The excited harmonic oscillator emits a photon with the same
angular frequency $\omega$, then relaxes into the second-excited state. What is the angular frequency of
the original incident photon that was absorbed by the quantum harmonic oscillator? Ans. $3\omega$
5. Pewpew. The ground-state energy of a harmonic oscillator is 5.00 eV. If the oscillator undergoes
a transition from its $n = 3$ to $n = 2$ level by emitting a photon, what is the wavelength of the
emitted photon?

3.14 Quantum mechanics in 3D
We have so far been talking only about systems that have spatial dependence in one direction.
While the analysis that we have done for the particle in a square well and the harmonic oscillator works
for systems that are uniform in the other two directions, we have not yet addressed the more general
problem of quantum mechanics in three dimensions. This is the topic of this section.
Fast Forward. When we quantize the system we expect that the energy of the system
can take on only specific values. These are the eigenvalues that we obtain from
the Schrödinger equation. But in contrast with one dimension, we find that the energies/eigenvalues are
labelled by three quantum numbers instead of only one. Another new feature that we are going
to find in three dimensions is degeneracy. This occurs when two or more quantum states (different sets
of quantum numbers) have the same energy.
Particle in a Box
In 3D, the problem that we intend to solve is given by
\[ -\frac{\hbar^2}{2m}\nabla^2\psi + V(x, y, z)\,\psi = E\psi \tag{3.144} \]
where the potential energy function is defined by
\[ V(x, y, z) = \begin{cases} 0, & 0 < x < L_x,\ 0 < y < L_y,\ 0 < z < L_z \\ \infty, & \text{elsewhere.} \end{cases} \tag{3.145} \]
We remember that the operator $\nabla^2$ corresponds to
\[ \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{3.146} \]
Once again, as in the time-dependent Schrödinger equation given by Eq. 3.37, we have a partial
differential equation. The simplest way to solve a partial differential equation is to assume that there
is a separable solution of the form
\[ \psi(x, y, z) = A(x)B(y)C(z). \tag{3.147} \]
When we plug this assumption into the original partial differential equation we instead obtain three
ordinary differential equations for the variables $x$, $y$, and $z$. When we subject these solutions to the
boundary conditions we obtain the allowable energies of the system.
Let us now go back to the problem at hand. The infinite potential barrier outside the box implies
that the particle would not be able to penetrate through the walls. This means that $\psi = 0$ outside the
box. Thus, the requirement of continuity of the solution to the Schrödinger equation imposes
\[ \psi(x = 0, y, z) = \psi(x = L_x, y, z) = 0 \tag{3.148} \]
\[ \psi(x, y = 0, z) = \psi(x, y = L_y, z) = 0 \tag{3.149} \]
\[ \psi(x, y, z = 0) = \psi(x, y, z = L_z) = 0. \tag{3.150} \]
Inside the box, the Schrödinger equation becomes
\[ -\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) = E\psi. \tag{3.151} \]
Substituting the separable solution given by Eq. 3.147 leads to
\[ -\frac{\hbar^2}{2m}\left(BC\,\frac{d^2A}{dx^2} + AC\,\frac{d^2B}{dy^2} + AB\,\frac{d^2C}{dz^2}\right) = E\,ABC. \tag{3.152} \]
When we divide the equation above by $\psi = ABC$ we obtain
\[ -\frac{\hbar^2}{2m}\frac{1}{A}\frac{d^2A}{dx^2} - \frac{\hbar^2}{2m}\frac{1}{B}\frac{d^2B}{dy^2} - \frac{\hbar^2}{2m}\frac{1}{C}\frac{d^2C}{dz^2} = E. \tag{3.153} \]

The first term on the left hand side of this equation depends only on $x$; the
second term, only on $y$; the third term, only on $z$. The only way that this equation can be satisfied
for all $x$, $y$, and $z$ is if each term is equal to a constant, with the constants adding up to the
total energy $E$. Thus, we obtain
\[ -\frac{\hbar^2}{2m}\frac{d^2A}{dx^2} = E_xA \tag{3.154} \]
\[ -\frac{\hbar^2}{2m}\frac{d^2B}{dy^2} = E_yB \tag{3.155} \]
\[ -\frac{\hbar^2}{2m}\frac{d^2C}{dz^2} = E_zC \tag{3.156} \]
where
\[ E_x + E_y + E_z = E. \tag{3.157} \]
Look at what we have achieved at this point: we have reduced the three-dimensional Schrödinger equation,
which is a partial differential equation, into three ordinary differential equations (each depending on
only one independent variable). Furthermore, given the continuity requirement that has been mentioned
earlier we notice that we actually obtain three one-dimensional particle in a box problems. Since we know
the solution to the particle in a box problem we already know the solution of the problem at hand. The
normalized solutions for $A$, $B$, and $C$ are given by
\[ A(x) = \sqrt{\frac{2}{L_x}}\,\sin\left(\frac{n_x\pi x}{L_x}\right) \tag{3.158} \]
\[ B(y) = \sqrt{\frac{2}{L_y}}\,\sin\left(\frac{n_y\pi y}{L_y}\right) \tag{3.159} \]
\[ C(z) = \sqrt{\frac{2}{L_z}}\,\sin\left(\frac{n_z\pi z}{L_z}\right) \tag{3.160} \]
and the energy contributions coming from the $x$, $y$, and $z$ differential equations are
\[ E_x = \frac{\pi^2\hbar^2}{2m}\frac{n_x^2}{L_x^2}, \quad n_x = 1, 2, 3, ... \tag{3.161} \]
\[ E_y = \frac{\pi^2\hbar^2}{2m}\frac{n_y^2}{L_y^2}, \quad n_y = 1, 2, 3, ... \tag{3.162} \]
and
\[ E_z = \frac{\pi^2\hbar^2}{2m}\frac{n_z^2}{L_z^2}, \quad n_z = 1, 2, 3, ... \tag{3.163} \]
The full solution to the 3D Schrödinger equation, which is the product $ABC$, is given by
\[ \psi(x, y, z) = \sqrt{\frac{8}{L_xL_yL_z}}\,\sin\left(\frac{n_x\pi x}{L_x}\right)\sin\left(\frac{n_y\pi y}{L_y}\right)\sin\left(\frac{n_z\pi z}{L_z}\right) \tag{3.164} \]
while the full energy is given by
\[ E = \frac{\pi^2\hbar^2}{2m}\left(\frac{n_x^2}{L_x^2} + \frac{n_y^2}{L_y^2} + \frac{n_z^2}{L_z^2}\right), \quad n_x, n_y, n_z = 1, 2, ... \tag{3.165} \]
Particle in a Cubic Box

Consider the special case of $L = L_x = L_y = L_z$ corresponding to the problem of a particle that is
confined in a cubical volume in space. The possible values of the energy in this case reduce to
\[ E = \frac{\pi^2\hbar^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right). \tag{3.166} \]
Calling $E_1 = \frac{\pi^2\hbar^2}{2mL^2}$ we see that the lowest possible energy belongs to the quantum state
with quantum numbers (1, 1, 1) with energy $3E_1$. The first excited state energy level belongs to state
(2, 1, 1) with energy $6E_1$. This is where degeneracy enters. Note that the states (1, 2, 1) and (1, 1, 2)
also have the energy $6E_1$. Thus we say that the first excited state is triply-degenerate or it has a
degeneracy of three. An important point to make is that the degeneracy arises because of the symmetry of
the cube.
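
The counting of degenerate states can be automated. The following sketch enumerates the triples $(n_x, n_y, n_z)$ and groups them by $n_x^2 + n_y^2 + n_z^2$, i.e. by the energy in units of $E_1$ from Eq. 3.166. The cutoff `n_max` and the function name are choices made here for illustration; with `n_max = 6` every level below $51E_1$ is listed completely.

```python
from collections import defaultdict
from itertools import product


def cubic_box_levels(n_max=6):
    """Group states (nx, ny, nz) by nx^2 + ny^2 + nz^2 = E / E_1.

    Returns a list of (energy_in_units_of_E1, [states]) sorted by energy.
    """
    levels = defaultdict(list)
    for state in product(range(1, n_max + 1), repeat=3):
        levels[sum(q * q for q in state)].append(state)
    return sorted(levels.items())


levels = cubic_box_levels()
for energy, states in levels[:5]:
    print(f"E = {energy} E_1, degeneracy {len(states)}: {states}")
```

The first two lines of output reproduce the ground state (1, 1, 1) at $3E_1$ and the triply-degenerate first excited level at $6E_1$ discussed above.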

Harmonic Oscillator
The harmonic oscillator problem in 3D is defined by the potential
\[ V(x, y, z) = \frac{1}{2}m\omega_x^2x^2 + \frac{1}{2}m\omega_y^2y^2 + \frac{1}{2}m\omega_z^2z^2 \tag{3.167} \]
where $m$ is the mass of the particle. We can easily repeat the steps done for the 3D particle in a box
to reduce the 3D Schrödinger equation into three 1D Schrödinger equations, one for each coordinate
direction. The full eigenfunction is then a product of three harmonic oscillator eigenfunctions in one
dimension and the full energy is a sum of the energies in each direction:
\[ E = \hbar\omega_x\left(n_x + \frac{1}{2}\right) + \hbar\omega_y\left(n_y + \frac{1}{2}\right) + \hbar\omega_z\left(n_z + \frac{1}{2}\right) = \hbar\left(\omega_xn_x + \omega_yn_y + \omega_zn_z + \frac{\omega_x + \omega_y + \omega_z}{2}\right), \quad n_x, n_y, n_z = 0, 1, 2, ... \tag{3.168} \]
We consider now the case of the isotropic harmonic oscillator where $\omega = \omega_x = \omega_y = \omega_z$. In this
special case, the possible energies of the system become
\[ E = \hbar\omega\left(n_x + n_y + n_z + \frac{3}{2}\right). \tag{3.169} \]
Since the quantum numbers $n_x, n_y, n_z = 0, 1, 2, ...$ the ground state of the system is labelled by (0, 0, 0)
with energy $\frac{3}{2}\hbar\omega$. The first excited state is (1, 0, 0) with energy $\frac{5}{2}\hbar\omega$. But we
can easily see that (0, 1, 0) and (0, 0, 1) also have energy $\frac{5}{2}\hbar\omega$. So, the first excited state
is again triply-degenerate. The point that we want to highlight is that degeneracy comes up again because
of symmetry.
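
The same kind of counting works for the isotropic oscillator, where by Eq. 3.169 the energy depends only on $N = n_x + n_y + n_z$. The sketch below counts the states by brute force and compares the result with the closed form $(N+1)(N+2)/2$, a standard combinatorial result that is quoted here rather than derived in these notes. The function name is hypothetical.

```python
from itertools import product


def iho_degeneracy(total):
    """Count states (nx, ny, nz) with nx + ny + nz = total, each n >= 0."""
    return sum(1 for state in product(range(total + 1), repeat=3) if sum(state) == total)


counted = [iho_degeneracy(N) for N in range(5)]
closed_form = [(N + 1) * (N + 2) // 2 for N in range(5)]

for N, g in enumerate(counted):
    print(f"N = {N}: E = ({N} + 3/2) hbar*omega, degeneracy {g}")
```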
Exercises
1. S.E.S. A particle is in a box with dimensions $L_x = L_y = L_z$. What are the sets of quantum
numbers corresponding to the second excited state?
2. Maldita sea IHO! A 3D isotropic harmonic oscillator transitions from the ground state to the
second excited state. What is the ratio of the ground state energy to the energy of the second
excited state, $E_0/E_2$?
3. In Transit. A 3D isotropic harmonic oscillator makes a transition from the first excited state to
the ground state. If the angular frequency of the oscillator is $1.80 \times 10^{13}$ rad/s, what is the angular
frequency of the emitted photon? Ans. $1.80 \times 10^{13}$ rad/s
4. For the next two numbers, consider an electron in a three-dimensional cubic box of length equal to
2.50 nm. (a) What is the degeneracy of the fourth-excited state? Neglect the effect of spin. (b)
What is the energy in the third-excited state? Ans. 1 and 0.54 eV
5. 3DHO. A three-dimensional isotropic harmonic oscillator (3DHO), initially in the third excited
energy level, transitions to the second-excited energy level, emitting a photon. What is the wavelength
of the emitted photon if the 3DHO oscillates at a frequency of $5.00 \times 10^{15}$ Hz? Ans. 60 nm
3.15 Hydrogen Atom: Revisit with the Schrödinger equation
To end our study of quantum mechanics we revisit the picture of the hydrogen atom as viewed from
the Schrödinger equation point of view. Let us recall what the Bohr model of the atom tells us. In the
Bohr model, the electron orbits the nucleus in stable orbits where the atom does not radiate energy.
This leads to the experimentally correct spectrum of hydrogen. But while the Bohr model gives
the correct spectrum/energy levels we have to note that its picture of the atom is inconsistent with the
Heisenberg uncertainty principle. Being at a certain stable radius means that the uncertainty in the
radial position of the electron is zero, $\Delta r = 0$. Since the electron also orbits the nucleus in a circle
the radial component of its momentum is completely determined and we have $\Delta p_r = 0$. Clearly, this
picture is incompatible with the Heisenberg uncertainty principle (which is at the heart of quantum
phenomena). But since the Bohr model can reproduce the spectrum of hydrogen it surely contains
hints of truth. In the Schrödinger equation viewpoint we shall find that $r_n = a_0n^2$ are the most probable
radii of the probability density. The probabilistic picture that enters is consistent with the Heisenberg
uncertainty principle.
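Let us see how far we can go in solving the Schrödinger equation for the hydrogen atom. The
problem is defined by the Coulomb potential given by
\[ V(r) = -\frac{1}{4\pi\epsilon_0}\frac{e^2}{r} \tag{3.170} \]
where $e$ is the electronic charge. The Schrödinger equation for the hydrogen atom is then given by
\[ -\frac{\hbar^2}{2m}\nabla^2\psi - \frac{1}{4\pi\epsilon_0}\frac{e^2}{r}\,\psi = E\psi. \tag{3.171} \]
In contrast to the 3D particle in a box and 3D harmonic oscillator problems we cannot use the separable
solution $\psi(x, y, z) = A(x)B(y)C(z)$ since the potential energy function depends on the radial variable
$r = \sqrt{x^2 + y^2 + z^2}$. What we can do instead is consider the separable solution
$\psi(r, \theta, \phi) = R(r)\Theta(\theta)\Phi(\phi)$. But in this line of attack we have to express the operator
$\nabla^2$ in spherical polar coordinates $(r, \theta, \phi)$. Fortunately, these expressions can be found in
mathematics and physics texts. This leads us to the Schrödinger equation in spherical polar coordinates
for the hydrogen atom:
\[ -\frac{\hbar^2}{2m}\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}\right] - \frac{1}{4\pi\epsilon_0}\frac{e^2}{r}\,\psi = E\psi. \tag{3.172} \]
The separable solution $\psi(r, \theta, \phi) = R(r)\Theta(\theta)\Phi(\phi)$ in spherical polar coordinates then
reduces this equation into three ordinary differential equations:
\[ \frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} = -m^2 \tag{3.173} \]
\[ \frac{1}{\Theta}\sin\theta\,\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + l(l+1)\sin^2\theta = m^2 \tag{3.174} \]
\[ \frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2mr^2}{\hbar^2}\left(E + \frac{1}{4\pi\epsilon_0}\frac{e^2}{r}\right)R = l(l+1)R. \tag{3.175} \]
Here $m$ in Eqs. 3.173 and 3.174 is a separation constant (a quantum number), not the mass that appears in
Eq. 3.175. The solution to the differential equation in $\phi$ is an exponential, as can easily be verified:
\[ \Phi(\phi) = e^{im\phi}. \tag{3.176} \]
The azimuthal symmetry of the system, i.e. that the wave function does not change under the translation
$\phi \to \phi + 2\pi$, constrains $m$ to positive and negative integers and zero: $m = 0, \pm 1, \pm 2, ...$. The
solution to the differential equation in $\theta$ is well-known in mathematics as
\[ \Theta(\theta) = P_l^m(\cos\theta) \tag{3.177} \]
where $P_l^m(\cos\theta)$ are called associated Legendre functions. The possible values of $l$ are $l = 0, 1, 2, 3, ...$.
It should be noted that the product of the $\theta$ and $\phi$ solutions is another special function that we call
spherical harmonics:
\[ Y_l^m(\theta, \phi) = \sqrt{\frac{2l+1}{4\pi}\frac{(l-m)!}{(l+m)!}}\;P_l^m(\cos\theta)\,e^{im\phi}. \tag{3.178} \]
Finally, the solution to the remaining radial differential equation is another well-known function in
mathematics:
\[ R(r) = e^{-r/(na_0)}\left(\frac{2r}{na_0}\right)^l L_{n-l-1}^{2l+1}\left(\frac{2r}{na_0}\right) \tag{3.179} \]
where the functions $L_{n-l-1}^{2l+1}(x)$ are called associated Laguerre polynomials. The full normalized solution
to the Schrödinger equation for the hydrogen atom is given by
\[ \psi_{nlm}(r, \theta, \phi) = \left[\left(\frac{2}{na_0}\right)^3\frac{(n-l-1)!}{2n\,(n+l)!}\right]^{1/2} e^{-r/(na_0)}\left(\frac{2r}{na_0}\right)^l L_{n-l-1}^{2l+1}\left(\frac{2r}{na_0}\right) Y_l^m(\theta, \phi). \tag{3.180} \]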

The quantum numbers are given by $n = 1, 2, 3, ...$, $l = 0, 1, ..., n-1$, and $m = 0, \pm 1, ..., \pm l$ and the
energy turns out to be dependent only on the quantum number $n$:
\[ E_n = -\frac{13.6\ \text{eV}}{n^2}. \tag{3.181} \]
As we can see the energy eigenvalues are the same as those in the Bohr model and would therefore agree
with the experimentally observed spectra of hydrogen.
We label the quantum states of the hydrogen atom by $(n, l, m)$. In contrast with the Bohr model
we can easily observe the degeneracy arising from the spherical symmetry of the problem. For example,
consider the quantum states with the energy $-13.6\ \text{eV}/2^2$. A possible quantum state with this energy is
(2, 0, 0). But the quantum states (2, 1, 1), (2, 1, 0), and (2, 1, -1) also have the same energy. This energy
level, which is in fact the first excited level, has a degeneracy of four. Thus, we see that different
states/different probability distributions lead to the same energy. In addition to the energies, the
Schrödinger equation also allows us to calculate the magnitude of the total angular momentum
\[ L = \sqrt{l(l+1)}\,\hbar \tag{3.182} \]
and the z-component of the angular momentum
\[ L_z = m\hbar. \tag{3.183} \]
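
The labelling rules above are easy to enumerate. The sketch below lists the states $(n, l, m)$ for a given $n$, evaluates Eq. 3.181, and computes the angular momentum values of Eqs. 3.182 and 3.183 in units of $\hbar$. The function names are hypothetical, and spin is ignored, as in the text.

```python
import math

RYDBERG_EV = 13.6   # magnitude of the ground-state energy used in Eq. 3.181 (eV)


def hydrogen_states(n):
    """All (n, l, m) with l = 0..n-1 and m = -l..l (spin neglected)."""
    return [(n, l, m) for l in range(n) for m in range(-l, l + 1)]


def energy_eV(n):
    """Energy of Eq. 3.181 in eV."""
    return -RYDBERG_EV / n ** 2


def angular_momentum_over_hbar(l):
    """Magnitude L / hbar from Eq. 3.182."""
    return math.sqrt(l * (l + 1))


for n in (1, 2, 3):
    states = hydrogen_states(n)
    print(f"n = {n}: E = {energy_eV(n):.2f} eV, degeneracy {len(states)}")

first_excited = hydrogen_states(2)
l_values = sorted({angular_momentum_over_hbar(l) for (_, l, _) in first_excited})
print("states with n = 2:", first_excited)
print("possible L / hbar for n = 2:", l_values)
```

The number of states for each $n$ comes out as $n^2$, which reproduces the degeneracy of four quoted above for the first excited level.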
Exercises
1. Probable. A hydrogen atom is prepared in the state $\psi(x) = A_{100}\psi_{100}(x) + A_{211}\psi_{211}(x) +
A_{21-1}\psi_{21-1}(x)$. What is the probability that the state is observed to have an orbital angular
momentum of magnitude $\sqrt{2}\hbar$?
2. Lonesome Electron. The orbital angular momentum of an electron has a magnitude of $4.716 \times
10^{-34}$ kg m$^2$/s. What is the angular-momentum quantum number $l$ for this electron?
3. Pinakamahirap! What is the magnitude of the total angular momentum of an electron in the
s-state ($l = 0$) of the hydrogen atom?
4. What is the energy of the 4th excited state of the hydrogen atom and what is its degeneracy?
Radial Probability Distribution

We know that the probability of finding the particle within a given infinitesimal volume $dx\,dy\,dz$ is given
by $|\psi|^2\,dx\,dy\,dz$. For radially symmetric solutions (those without any angular dependence) to the hydrogen
atom, corresponding to $l = 0$ or s-states, the probability of finding the electron within an infinitesimally
thin spherical shell of radius $r$ and thickness $dr$ is given by $|\psi|^2\,4\pi r^2\,dr$. The product $4\pi r^2|\psi|^2$ is then the probability
per unit radial distance of finding the electron between $r$ and $r + dr$. This is known as the radial probability
distribution.
Through this formulation of radial probability distributions, we can actually show that the most
probable radius of the electron for some quantum states with quantum number $n$ is given by $a_0 n^2$, which
coincides with Bohr's result. But now the probabilistic interpretation is consistent with the Heisenberg
uncertainty principle. As said, there are hints of truth in a model that can yield experimentally correct
values.
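A rough numerical check of the $a_0 n^2$ claim is sketched below. It assumes the standard hydrogen radial functions for the states with $l = n - 1$, $R(r) \propto r^{n-1} e^{-r/(n a_0)}$, which are not derived in these notes, and it measures $r$ in units of $a_0$. The function names and grid parameters are illustrative choices only.

```python
import math


def radial_probability(r, n):
    """Unnormalized P(r) = r^2 |R(r)|^2 for the l = n - 1 state, r in units of a0.

    Assumes R(r) proportional to r^(n-1) * exp(-r / n), the standard
    hydrogen radial function for l = n - 1 (not derived in these notes).
    """
    return r ** (2 * n) * math.exp(-2.0 * r / n)


def most_probable_radius(n, r_max=40.0, steps=40000):
    """Locate the peak of P(r) by a simple grid search; result in units of a0."""
    best_r, best_p = 0.0, 0.0
    for i in range(1, steps + 1):
        r = r_max * i / steps
        p = radial_probability(r, n)
        if p > best_p:
            best_r, best_p = r, p
    return best_r


for n in (1, 2, 3):
    print(n, most_probable_radius(n))  # compare with n**2
```

A grid search is used instead of calculus so that the sketch needs nothing beyond the standard library; the peak is only located to within the grid spacing.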
Exercise: Ground State of H-atom

The ground state wave function for the hydrogen atom is given by
$$\psi_{100} = A e^{-r/a_0} \tag{3.184}$$
where $a_0$ is the Bohr radius.

1. What is the normalization constant $A$? You can use the result $\int_0^\infty x^2 e^{-2x/a}\,dx = a^3/4$.
2. The radial probability distribution for the ground state of the hydrogen atom is $P(r) = 4\pi r^2 |\psi_{100}|^2$.
What is the most probable radius of the electron?

3.16 Zeeman Effect and the Spin
The verification of the energy levels of the hydrogen atom can be done by observing the spectra of
hydrogen. It turns out that the experimental result (the spectra) agrees closely with what can
be predicted from quantum mechanics. So we confirm the quantization of energy. But what about
the quantization of angular momentum? The solution to the Schrödinger equation tells us that the
magnitude of the angular momentum $L$ and the z-component of the angular momentum $L_z$ are quantized:
$L = \sqrt{l(l+1)}\,\hbar$ and $L_z = m_l\hbar$. We ask the question: how can we realize the quantization of angular
momentum?
To answer this question we review some of the concepts that we know from electrodynamics. A
charged particle that is orbiting around some point has a magnetic moment $\vec{\mu} = I\vec{A}$ where $I$ is the
current and $\vec{A}$ is the area vector for the loop. When the system is placed in a magnetic field, the
magnetic moment of the charged particle interacts with the magnetic field through the potential energy
$U = -\vec{\mu} \cdot \vec{B}$. Having this potential energy means that there is a force on the system given by the negative gradient
of this potential energy. We now show that the magnetic moment is related to the angular momentum of
the charged particle through a very simple relation. Let us consider the simple case where the charged
particle is an electron that is orbiting the positively charged nucleus as shown:
Figure 42: Electron orbiting a positively charged nucleus. The direction of motion of the electron is opposite to
the direction of the conventional current (flow of positive charge).
We can immediately figure out that $\vec{\mu}$ is anti-parallel to the direction of the angular momentum $\vec{L}$ since the direction of motion of the
electron is opposite the direction of the current. Now, we relate the magnitudes. We do so by first
writing down the magnetic moment of the electron in the way that we know: $\mu = IA$. By the definition
of the current as the amount of charge passing through a point in a given time, we get $I = e/T$ where $e$ is
the magnitude of the charge of the electron and $T$ is the orbital period. Given that the orbital period is the circumference
$2\pi r$ divided by the orbital speed $v$ we obtain $I = ev/2\pi r$. And noting that the area enclosed is simply
$\pi r^2$ we get $\mu = \frac{ev}{2\pi r}\,\pi r^2\,\frac{m}{m} = \frac{1}{2}\frac{e}{m}L$ where $L = mvr$ is the magnitude of the angular momentum. Taking
into account the direction we write down
$$\vec{\mu} = -\frac{1}{2}\frac{e}{m}\vec{L}. \tag{3.185}$$
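The interaction energy between the magnetic moment and the magnetic field is then $U = -\vec{\mu} \cdot \vec{B} =
\frac{1}{2}\frac{e}{m}\vec{L} \cdot \vec{B}$. This is conventionally written down as
$$U = \frac{\mu_B}{\hbar}\vec{L} \cdot \vec{B} \tag{3.186}$$
where (using electronic parameters)
$$\mu_B = \frac{e\hbar}{2m} = 5.788 \times 10^{-5}\ \text{eV/T} \tag{3.187}$$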
is known as the Bohr magneton. It is evident from here that a quantization of angular momentum will
lead to quantized values for potential energy. This reduces the symmetry of the system and effectively
lifts the degeneracy for states with the same l quantum number.
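To make the lifting of the degeneracy concrete, take the magnetic field along the z axis, so that $\vec{L} \cdot \vec{B} = L_z B = m_l \hbar B$ and Eq. 3.186 gives $U = m_l \mu_B B$. The sketch below tabulates these shifts. The choice of field direction, the field strength of 1 T, and the function name are assumptions made for illustration.

```python
MU_B_EV_PER_T = 5.788e-5  # Bohr magneton in eV/T, Eq. 3.187


def zeeman_shifts(l, b_field_tesla):
    """Return {m_l: U} in eV for a field along z, using U = m_l * mu_B * B."""
    return {m: m * MU_B_EV_PER_T * b_field_tesla for m in range(-l, l + 1)}


# Illustrative field strength of 1 T (an assumed value, not from the notes).
for m_l, shift in zeeman_shifts(1, 1.0).items():
    print(m_l, shift)
```

For $l = 1$ the single level splits into three equally spaced levels, one for each value of $m_l$.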
Note that we have only proved Eq. 3.185 for a charged particle that is orbiting a fixed center in
a circle. It comes as a surprise that this relationship actually holds true in general, i.e. it also holds
true for a spinning charged sphere, charged disk, etc. Thus, anything that spins that has charge has a
magnetic moment-angular momentum relationship given by Eq. 3.185.
In the Schrödinger equation picture recall that instead of having an electron that goes around in a
fixed trajectory around the nucleus, we instead have an electron which has a probability of being here
or there given its quantum state $(n, l, m_l)$. For $l = 0$ states, we get back spherical symmetry. These
are states in which the probability of finding the electron around the nucleus does not have any angular
dependence. States with $l \neq 0$ instead have angular momentum. When we expose this hydrogen atom
to a magnetic field we observe quantization of angular momentum as a quantization of a force. Note
that the magnetic field must be non-uniform so that there will be a non-zero force on the atom. The basic
setup in observing this quantization of angular momentum is shown below.
Figure 43: A beam of hydrogen atoms prepared in a state with $l \neq 0$ is passed through a region with non-uniform
magnetic field. The non-uniform nature of the magnetic field gives a force and splits the beam. The quantization
of the angular momentum manifests itself in the number of beams that is observed after passing through the
magnetic field.
And indeed when we subject the hydrogen atom prepared in the $l = 1$ state to this field we observe three beams corresponding to $m_l = -1, 0, 1$
as expected.
Spin
Reading assignment: Stern-Gerlach experiment.
It all turns out to be great. We can now observe quantization of angular momentum and it is
consistent with experiment. What comes as a surprise is that when we look closely the lines are split
into two. And what is worse? Even l = 0 states get split into two!
This is where physicists speculated the existence of an intrinsic quantum number that does not enter
into the Schrödinger equation. This is called the spin quantum number $m_s$. We say that the electron
has spin. Since it manifests itself through the splitting of beams when subject to the magnetic field, the simplest
way to model the spin is to relate it to an intrinsic magnetic moment of the electron in the same way as
the angular momentum is related to the magnetic moment in Eq. 3.185. Thus, we write down
$$\vec{\mu}_s = -g_e\frac{\mu_B}{\hbar}\vec{S} \tag{3.188}$$
where $g_e$ is called the gyromagnetic factor and $\vec{S}$ is a vector representing the spin state of the electron.
This gyromagnetic factor turns out to be close to $g_e \approx 2$ for the electron. The experimental result that
even the $l = 0$ state splits into two shows that this spin has two possible quantum numbers. This is what
we refer to as spin up and spin down in chemistry and physics.
Guided by our intuition for the quantization of angular momentum we say that a quantum particle
with spin $s$ has a magnitude of spin given by
$$S = \sqrt{s(s+1)}\,\hbar \tag{3.189}$$
and spin along the z axis given by
$$S_z/\hbar = -s, -s+1, \ldots, s-1, s. \tag{3.190}$$
Since there are only two lines obtained for the splitting of the beam, it is evident that the
quantum number $s$ must be a half-integer. Evidently, we should have $s = 1/2$ so that there are two
possibilities for $S_z$ with $m_s = -1/2, +1/2$. Now we say that the electron is spin 1/2.
Note: The spin does not enter automatically in the Schrödinger equation because it is actually
of relativistic nature. It naturally arises in a theory which reconciles special relativity with quantum
mechanics. In this theory, we can even derive the gyromagnetic factor of the electron to be exactly 2. In
quantum field theory, a more accurate (albeit more difficult) theory which takes into account even the
quantum nature of the vacuum, corrections to the gyromagnetic factor can be found. It is also in quantum
field theory that we can prove the Pauli exclusion principle for half-integer spin particles such as
electrons, protons, neutrons, neutrinos, etc.
Total Angular Momentum

Having related the spin to an angular momentum, we can consider what is called the total angular
momentum. Since the spin $\vec{S}$ is quantized and the orbital angular momentum $\vec{L}$ is quantized, the total
angular momentum $\vec{J}$ given by
$$\vec{J} = \vec{L} + \vec{S} \tag{3.191}$$
would also be quantized. The total angular momentum quantum number $j$ can be calculated depending
on whether the spin and the orbital angular momentum are parallel or anti-parallel. When the spin and
orbital angular momentum are parallel, the total angular momentum quantum number is $j = |l + s|$.
When the spin and the orbital angular momentum are anti-parallel the total angular momentum quantum
number is $j = |l - s|$. Given the total angular momentum quantum number we can calculate the
magnitude of the total angular momentum through
$$J = \sqrt{j(j+1)}\,\hbar. \tag{3.192}$$
The possible values for the z-component of the total angular momentum are
$$J_z = m_j\hbar \tag{3.193}$$
where $m_j = -j, -j+1, \ldots, j-1, j$.
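The rules for $j$, $J$ and $m_j$ can be tried out numerically with a minimal sketch like the one below. It uses exact fractions for the half-integer quantum numbers and reports magnitudes in units of $\hbar$. The function names are illustrative, and the default $s = 1/2$ assumes a single electron.

```python
import math
from fractions import Fraction


def total_j_values(l, s=Fraction(1, 2)):
    """Return (j_parallel, j_antiparallel) = (l + s, |l - s|)."""
    return l + s, abs(l - s)


def magnitude_in_hbar(j):
    """|J| / hbar = sqrt(j (j + 1)), Eq. 3.192."""
    return math.sqrt(j * (j + 1))


def mj_values(j):
    """Allowed m_j = -j, -j + 1, ..., j."""
    count = int(2 * j) + 1
    return [-j + k for k in range(count)]


j_par, j_anti = total_j_values(1)
print(j_par, magnitude_in_hbar(j_par))    # parallel case for l = 1
print(j_anti, magnitude_in_hbar(j_anti))  # anti-parallel case for l = 1
print(mj_values(j_par))                   # allowed m_j for the parallel case
```

Using `Fraction` keeps the half-integer values exact, so the list of $m_j$ values has exactly $2j + 1$ entries with no rounding.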
Exercises
1. Pinakamahirap! What is the magnitude of the total angular momentum of an electron in the
s-state (l = 0) of the hydrogen atom?
2. Stern-Gerlach. Which of the following is/are the result/s of the Stern-Gerlach experiment?
(a) The orbital angular momentum of a valence electron is quantized.

(b) The electron has an internal degree of freedom called spin.
(c) The screen detected even numbers of spectral lines.
3. Anti-parallel. What is the magnitude of the total angular momentum for the $l = 3$ state when
the orbital and spin angular momenta are anti-parallel? Ans. $\sqrt{35}\hbar/2$
4. Total. A hydrogen atom is in the $l = 1$ state. Compute the magnitude of the total angular
momentum for the cases when $\vec{L}$ is parallel to $\vec{S}$ as well as when they are anti-parallel. Ans.
$\sqrt{15}\hbar/2$ and $\sqrt{3}\hbar/2$

3.17 Many-Electron Systems and the Pauli Exclusion Principle
To end our discussion of quantum mechanics we tackle the problem of a quantum system which contains
more than a single particle. We shall find that there are two kinds of particle in nature. Those which
have symmetric wave functions are called bosons while those which have anti-symmetric wave functions are
called fermions.
Consider a two-particle wave function given by
$$\psi(x_1, n_1 | x_2, n_2) = \text{two-particle wave function} \tag{3.194}$$
where $x_1$ and $n_1$ label the position and quantum state of particle 1 and $x_2$ and $n_2$ label the position
and quantum state of particle 2. For simplicity, let us say that this describes the state of two electrons
in some system. This obeys the Schrödinger equation for the two particles. But it turns out that solving the
Schrödinger equation for even the next simplest atom, the helium atom, is a formidable task. Solving the
Schrödinger equation exactly is practically impossible with more than two particles and approximation schemes
need to be developed. With this in mind let us see how far we can proceed without actually
solving the Schrödinger equation. What matters in quantum mechanics is the probability density. Given
Eq. 3.194 then $|\psi(x_1, n_1 | x_2, n_2)|^2$ is the probability density for finding particle 1 at $x_1$ in state $n_1$ and
finding particle 2 at $x_2$ in state $n_2$. However, the task of labelling two particles in quantum mechanics is
not possible. We cannot really distinguish two electrons from one another and call one red and the other
blue. When we know one electron we know all other electrons. So this means that even if we exchange
the particle labels in the probability density, we should get the same thing. In symbols, we write down
$$|\psi(x_1, n_1 | x_2, n_2)|^2 = |\psi(x_2, n_2 | x_1, n_1)|^2. \tag{3.195}$$
The probability of finding particle 1 at $x_1$ in state $n_1$ and particle 2 at $x_2$ in state $n_2$ is the same as the
probability of finding particle 1 at $x_2$ in state $n_2$ and particle 2 at $x_1$ in state $n_1$. Eq. 3.195 allows two
kinds of wave function:
$$\psi(x_1, n_1 | x_2, n_2) = \psi(x_2, n_2 | x_1, n_1), \quad \text{symmetric} \tag{3.196}$$
$$\psi(x_1, n_1 | x_2, n_2) = -\psi(x_2, n_2 | x_1, n_1), \quad \text{anti-symmetric.} \tag{3.197}$$
Bosons
The symmetric wave function is obeyed by bosons. Examples of bosons are the photon, the (hypothesized) graviton, and
the Higgs boson. In contrast with the other kind of particle to be introduced in a few minutes, bosons
can occupy the same quantum state. To see this, consider the case where the two bosons described by
Eq. 3.196 are located at the same position $x_1 = x_2 = a$. In this case we have
$$\psi(a, n_1 | a, n_2) = \psi(a, n_2 | a, n_1). \tag{3.198}$$
Other than occupying the same location in space we can see that there is no contradiction if the two
bosons also occupy the same quantum state $n_1 = n_2 = n$. That bosons obey this property is important
in the understanding of Bose-Einstein condensation, where near absolute zero temperature bosons tend to
occupy the same quantum state. In this case a quantum phenomenon becomes observable macroscopically,
such as in liquid helium.
Fermions
The anti-symmetric wave function is obeyed by particles called fermions. Examples of fermions are the
electron, proton, and the neutron, which basically make up most of the matter that we deal with in our
daily lives. In contrast with bosons we shall find that: No two fermions can occupy the same quantum
state. This statement is best known as the Pauli exclusion principle. Let us see how this comes
about in quantum mechanics using Eq. 3.197. Again let us consider the case in which the two electrons
are located at the same position in space $x_1 = x_2 = a$. This might mean for example that they are
localized in the same atom or in the same quantum well in a semiconductor. In symbols we have
$$\psi(a, n_1 | a, n_2) = -\psi(a, n_2 | a, n_1). \tag{3.199}$$

Now note how the minus sign makes all the difference. Consider the case when the two electrons also
occupy the same state $n_1 = n_2 = n$. Say, two spin-down electrons occupy the same ground state in a
quantum well. In this case we obtain
$$\psi(a, n | a, n) = -\psi(a, n | a, n). \tag{3.200}$$
Look closely. We have a number (the wave function is only a number) that is equal to its own negative.
This can be satisfied only when the wave function itself is zero:
$$\psi(a, n | a, n) = 0. \tag{3.201}$$
The probability density of measuring the two particles in the same position in the same state is then
zero. This is how nature shows us the Pauli exclusion principle for fermions.
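The vanishing of the fermion wave function can also be seen numerically. The sketch below is only an illustration under stated assumptions: it uses particle-in-a-box states as the single-particle wave functions and builds the two-particle function as the usual symmetric or anti-symmetric combination of products, written as a function of the two positions for given states $n_1$ and $n_2$. The function names, box width, and sample positions are made up for the example.

```python
import math


def box_state(n, x, width=1.0):
    """Particle-in-a-box eigenfunction sqrt(2/L) sin(n pi x / L) for 0 <= x <= L."""
    return math.sqrt(2.0 / width) * math.sin(n * math.pi * x / width)


def two_particle(x1, x2, n1, n2, sign):
    """Unnormalized two-particle wave function built from states n1 and n2.

    sign = +1 gives the symmetric (boson) combination,
    sign = -1 gives the anti-symmetric (fermion) combination.
    """
    return (box_state(n1, x1) * box_state(n2, x2)
            + sign * box_state(n2, x1) * box_state(n1, x2))


a = 0.3  # common position, an arbitrary illustrative value
print(two_particle(a, a, 1, 1, sign=-1))  # fermions: same state, same position
print(two_particle(a, a, 1, 1, sign=+1))  # bosons: same state, same position
# Exchanging the two positions flips the sign of the fermion wave function.
print(two_particle(0.2, 0.7, 1, 2, sign=-1), two_particle(0.7, 0.2, 1, 2, sign=-1))
```

The fermion combination vanishes whenever the two states coincide, while the boson combination does not, in line with Eqs. 3.198 and 3.201.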
Note that we did not really prove why electrons are fermions and why the photon is a boson. This is
a task for another day when we learn the techniques of quantum field theory.
Many-electron systems
Electrons are fermions and must therefore obey the Pauli exclusion principle. But other than the Pauli
exclusion principle electrons also interact with one another. Say, the two electrons in the helium atom
repel each other since both have the same sign of charge. We might think that we cannot go
any further with our formalism of quantum mechanics for many-electron systems. What turns out to be
surprising is that we can go much further in the analysis of such systems under the simple assumption that
the electrons do not interact with one another. In semiconductors what is usually done is to account for all
interactions through the so-called effective mass. In atomic systems, the use of the electron configuration,
in which we put the electrons in all quantum states starting from the ground state and up, assures us that
our crude approximation still holds some truth. This approach goes a long way in the analysis of metals
and even astrophysical systems such as white dwarfs.
Helium atom
Having said that electrons are fermions and that these obey the Pauli exclusion principle let us try to
analyze the next simplest quantum system which is helium. The Schrödinger equation for the helium
atom is given by
$$\begin{aligned}
&-\frac{\hbar^2}{2m}\nabla_1^2\,\psi(\vec{x}_1, \vec{x}_2, t) - \frac{\hbar^2}{2m}\nabla_2^2\,\psi(\vec{x}_1, \vec{x}_2, t) \\
&+ \left[\frac{e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_1 - \vec{x}_2|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_1|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_2|}\right]\psi(\vec{x}_1, \vec{x}_2, t) = i\hbar\frac{\partial}{\partial t}\psi(\vec{x}_1, \vec{x}_2, t).
\end{aligned} \tag{3.202}$$
In the potential energy, the first term represents the repulsive interaction of the electrons while the
second and third terms represent the attractive interaction of the electrons with the positively charged
nucleus. This equation has not yet been solved exactly.
Let us attempt to move forward in this way. Assume that to a good approximation the interaction
between the two electrons is negligible compared to their interaction with the nucleus. Under this
assumption we write down the two-particle wave function in terms of single electron wave functions
$\psi_n(\vec{x})$ of the hydrogen atom:
$$\psi(\vec{x}_1, \vec{x}_2, t) = \psi^{(1)}_{n_1}(\vec{x}_1)\,\psi^{(2)}_{n_2}(\vec{x}_2)\,e^{-iEt/\hbar}. \tag{3.203}$$
The superscript $(j)$ labels the two electrons. Using the ground state wave function of the hydrogen atom
problem (with nuclear charge $2e$) this becomes
$$\psi(r_1, r_2, t) = \frac{8}{\pi a_0^3}\,e^{-2(r_1 + r_2)/a_0}\,e^{-iEt/\hbar} \tag{3.204}$$
where $a_0$ is the Bohr radius. Using the ways that we know we can calculate the energy corresponding
to the state. The result is $8E_1 = -109$ eV, where $E_1 = -13.6$ eV is the hydrogen ground state energy. This is not close to the experimental value $-78.975$ eV but
it gives us a starting point considering that we have simplified the problem greatly.
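The factor of 8 comes from the hydrogen-like scaling of the energy with the nuclear charge: each electron bound to a nucleus of charge $Ze$ has ground state energy $Z^2 E_1$, and helium has two electrons with $Z = 2$. A minimal sketch of this arithmetic, with illustrative names and with the electron-electron repulsion ignored entirely:

```python
E1_EV = -13.6              # hydrogen ground state energy in eV
E_EXPERIMENT_EV = -78.975  # experimental helium ground state energy quoted in the notes


def independent_electron_energy(z, n_electrons=2):
    """Ground state energy when each electron sees only a nucleus of charge z*e.

    Uses the hydrogen-like scaling E = z^2 * E1 per electron and ignores
    the electron-electron repulsion entirely.
    """
    return n_electrons * z**2 * E1_EV


estimate = independent_electron_energy(2)
print(estimate)                    # 8 * E1 in eV
print(estimate - E_EXPERIMENT_EV)  # gap between the crude estimate and experiment
```

The estimate lies below the experimental value, as expected when a repulsive (positive) contribution to the energy has been left out.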
But Eq. 3.203 is not the correct way of writing the wave function for a two-electron system. As
we have noted electrons obey the Pauli exclusion principle and the two-particle wave function must be
anti-symmetric. With this goal of writing down an anti-symmetric wave function we obtain

$$\psi(\vec{x}_1, \vec{x}_2, t) = A\left[\psi^{(1)}_{n_1}(\vec{x}_1)\,\psi^{(2)}_{n_2}(\vec{x}_2) - \psi^{(1)}_{n_2}(\vec{x}_1)\,\psi^{(2)}_{n_1}(\vec{x}_2)\right]e^{-iEt/\hbar}. \tag{3.205}$$

We can normalize this anti-symmetric wave function to determine $A$. Then we can calculate the average
energy using the total energy operator:
$$\hat{H} = -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{\hbar^2}{2m}\nabla_2^2 + \frac{e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_1 - \vec{x}_2|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_1|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_2|}. \tag{3.206}$$
We analyze the ground state by writing this down in terms of the hydrogen atom ground state given by
$$\psi(r) = \frac{1}{\sqrt{\pi a_0^3}}\,e^{-r/a_0} \tag{3.207}$$
noting that $a_0$ is the Bohr radius. It is left as an exercise to calculate the average ground state energy
that would come up following these lines. Note that the experimental ground state energy of helium is
$-78.975$ eV. This is where we end.
Exercises
1. Totoo ba? Which of the following is/are FALSE about a many electron atom?
(a) No two electrons in a many electron atom can have the same quantum state.
(b) The wavefunction of a many electron atom is symmetric.
(c) Two electrons in a many electron atom can have the same quantum numbers if they are both
spin-up.
2. PEP. Which of the following statements is/are NOT part of Pauli's exclusion principle?
(a) The electrons have unique values of quantum numbers.

(b) The probability for two electrons to be in the same quantum state is zero.
(c) The electrons have spin-orbital interaction to avoid being at the same state.
3. Metals. Consider an N electron system in the ground state, neglecting first the spin quantum
state. The electrons are essentially free to move in a confined region of space. For a first analysis
also consider that the electrons only move in one dimension so that the problem is equivalent to
a system of N particles in a box. Calculate the ground state energy of the system taking note of the
Pauli exclusion principle. The energy of the highest occupied state in this filling is called the Fermi energy. Hint:
Fill in the quantum states of a particle in a box for, say, ten particles only starting from the ground
state and up. Then attempt a generalization.
4. Repeat the same analysis but now take into account the spin.
5. You done! According to Pauli's exclusion principle, which of the following systems CANNOT
exist in nature?
(a) Two electrons (one spin up and one spin down) both with principal quantum number n = 1
in a Coulomb potential.
(b) Two electrons (both spin down) in the n = 1 and n = 2 states in an infinite square well,
respectively.
(c) Two electrons with the same spin both in the n = 0 state in a harmonic oscillator potential.
(d) Two electrons (one spin up and one spin down) in the hydrogen atom occupying the state
nlm = 110.
6. Consider two non-interacting particles in a one-dimensional harmonic oscillator potential with
angular frequency $\omega$. Neglect spin. What is the total ground state energy if the particles are
bosons? What is the total ground state energy if the particles are fermions? Write down the wave function for
the system when the particles are bosons and when the particles are fermions.
