1 Thermodynamics
Definitions:
thermal equilibrium - thermal properties of an object are uniform throughout its body and not
changing in time
Figure 1: Thermodynamic systems A and B are connected thermally to a thermodynamic system C through a
thermal conductor. Systems A and B are separated by an insulator. The whole system is assumed to be inside
an insulating box.
A, B, C - thermodynamic systems
If two thermodynamic systems A and B are separately in thermal equilibrium with a third system C,
then the two systems A and B are in thermal equilibrium with each other.
1. allows for the definition of the temperature T (something must be the same for systems in thermal
equilibrium):
$$T_1 = T_2. \tag{1.1}$$
This is in addition to the conditions $P_1 = P_2$ and $\mu_1 = \mu_2$ for equilibrium between two systems.
Note: A thermometer actually measures its own temperature. When a thermometer is used to measure the
temperature of a body, it is assumed that the thermometer and the body are in thermal equilibrium.
Different objects react to temperature changes differently, which gives rise to different kinds of
thermometers: liquid-in-glass thermometer, bimetallic strip-based thermometer, gas-pressure thermometer,
resistance thermometer.
Temperature scales can be defined using two thermodynamic points. Conveniently chosen points are
the freezing pt. and boiling pt. of pure water. In the Celsius temperature scale, the freezing pt.
of pure water is set to 0 °C while the boiling pt. of pure water is set to 100 °C. Using a liquid-in-glass
thermometer, or any thermometer based on a property that changes linearly with the temperature change
$\Delta T$, the Celsius temperature scale can be created by making 100 equally spaced divisions between the
freezing pt. and boiling pt. marks. In the Fahrenheit temperature scale, the freezing pt. is set to 32 °F.
The two scales are related by
$$T_F = \frac{9}{5} T_C + 32. \tag{1.2}$$
Can you derive this equation?
Kelvin Scale
In principle, we can calibrate different thermometers so that the temperatures would agree at the
thermodynamic points corresponding to temperatures $T_1$ and $T_2$. For example, we can set the temperatures
to agree at 0 °C and 100 °C for the freezing and boiling pts. of water. But this does not necessarily mean
that the intermediate temperatures for the two thermometers would agree. Why?
In connection with this, it would be convenient to have a temperature scale that is the same for all
objects. This leads us to the Kelvin temperature scale. The results of the experiment show that the
hypothetical temperature $T_0$ at which the pressure of the gas goes to zero is independent of the amount
of gas and the type of gas. This hypothetical temperature, which is −273.15 °C, can be used to define the
Kelvin scale. The conversion from the Celsius scale $T_C$ to the Kelvin scale $T_K$ can be done using
$$T_K = T_C + 273.15. \tag{1.3}$$
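As a quick illustration of Eqs. 1.2 and 1.3, the two conversions can be sketched in Python. The function names are hypothetical and chosen only for this example.

```python
def celsius_to_fahrenheit(t_c):
    """Eq. 1.2: T_F = (9/5) T_C + 32."""
    return 9.0 / 5.0 * t_c + 32.0


def celsius_to_kelvin(t_c):
    """Eq. 1.3: T_K = T_C + 273.15."""
    return t_c + 273.15


# Freezing and boiling points of pure water on the three scales.
for t_c in (0.0, 100.0):
    print(t_c, celsius_to_fahrenheit(t_c), celsius_to_kelvin(t_c))
```

Checking the two fixed points (0 °C and 100 °C) is also a useful hint for deriving Eq. 1.2: a linear relation is fixed by two points.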
The Kelvin scale (or absolute temperature scale) can be defined using only a single reference temperature
using the known properties of gases. The triple pt. of water (where solid, liquid, and gas phases coexist)
is conveniently used to define the Kelvin scale:
$$T = \frac{T_{\text{triple}}}{P_{\text{triple}}} P \tag{1.4}$$
where Ttriple and Ptriple are the temperature and pressure, respectively, at the triple pt. of water.
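$$\Delta L = \alpha L \Delta T \tag{1.5}$$
where L is the initial length and $\alpha$ is the coefficient of linear expansion. In the way Eq. 1.5 is written,
the coefficient of thermal expansion is an average only within the temperature interval $\Delta T$:
$$\alpha = \frac{1}{L} \frac{\Delta L}{\Delta T}. \tag{1.6}$$
In the limit of a vanishingly small temperature interval, this becomes
$$\alpha = \lim_{\Delta T \to 0} \frac{1}{L} \frac{\Delta L}{\Delta T} = \frac{1}{L} \frac{dL}{dT}. \tag{1.7}$$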
In problems where the coefficient of thermal expansion changes significantly, we can start with the
infinitesimal form of Eq. 1.5, which is given by
$$dL = \alpha L\, dT. \tag{1.8}$$
To calculate the change in length over a finite temperature interval $\Delta T = T_2 - T_1$ we then integrate:
$$\Delta L = \int_{T_1}^{T_2} \alpha(T) L\, dT. \tag{1.9}$$
Volume Expansion
All the dimensions of an object respond to a temperature change. If all the linear dimensions expand
according to Eq. 1.5, we can show that the change in the volume $\Delta V$ is
$$\Delta V = \beta V \Delta T \tag{1.10}$$
where V is the initial volume and $\beta$ is the coefficient of volume expansion. As an exercise, try to show
that the coefficient of volume expansion is related to the coefficient of linear expansion through
$$\beta = 3\alpha. \tag{1.11}$$
You can start by differentiating the volume of a cube with sides L1 , L2 , and L3 . As in linear expansion,
the coefficient of volume expansion might change, for some objects, significantly over given temperature
intervals. Similar infinitesimal relations hold for volume expansion as in linear expansion. What are
these?
Exercises (Tipler)
1. A steel bridge is 1000 m long. By how much does it expand when the temperature rises from 0 °C to
30 °C? (Given $\alpha = 11 \times 10^{-6}$ K$^{-1}$.) Ans. 0.33 m = 33 cm.
2. A 1-L glass flask is filled to the brim with alcohol at 10 °C. If the temperature is raised to 30 °C, how
much alcohol spills out of the glass flask? (Given $\beta_{\text{alcohol}} = 1.1 \times 10^{-3}$ K$^{-1}$ and $\alpha_{\text{glass}} = 9 \times 10^{-6}$ K$^{-1}$.)
Ans. 21.5 mL.
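The two exercises above can be checked numerically with a short sketch of the linear and volume expansion relations (Eqs. 1.5 and 1.10). It assumes that the value quoted for glass is a linear coefficient, so that its volume coefficient is three times larger (Eq. 1.11); the function and variable names are illustrative only.

```python
def linear_expansion(length, alpha, delta_t):
    """Eq. 1.5: change in length for a constant coefficient alpha."""
    return alpha * length * delta_t


def volume_expansion(volume, beta, delta_t):
    """Eq. 1.10: change in volume for a constant coefficient beta."""
    return beta * volume * delta_t


# Exercise 1: steel bridge, 1000 m long, heated from 0 to 30 degrees Celsius.
bridge_growth_m = linear_expansion(1000.0, 11e-6, 30.0)

# Exercise 2: 1 L (1000 mL) flask of alcohol, heated from 10 to 30 degrees Celsius.
beta_alcohol = 1.1e-3        # volume coefficient of alcohol, 1/K
beta_glass = 3.0 * 9e-6      # assumption: 9e-6 1/K is the linear coefficient of glass
spill_ml = (volume_expansion(1000.0, beta_alcohol, 20.0)
            - volume_expansion(1000.0, beta_glass, 20.0))

print(bridge_growth_m, spill_ml)
```

The spilled volume is the difference between the expansion of the alcohol and the expansion of the flask that holds it.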
Thermal Stress
Heat up an object and restrain its expansion or compression response (say, using clamps). Young's
modulus is defined to be
$$Y = \frac{\text{stress}}{\text{strain}} = \frac{F/A}{\Delta L/L}. \tag{1.12}$$
The stress applied by the restraints must be equal to the thermal stress to keep the size of the object
the same. In this case, it follows from Eq. 1.12 that
$$\frac{F}{A} = Y \frac{\Delta L}{L} = Y \alpha \Delta T. \tag{1.13}$$
Example (Tipler)
3. A copper bar is heated to 300 °C and is then clamped rigidly between two fixed points so that it can
neither expand nor contract. If the breaking stress of copper is 230 MN/m², at what temperature will the
bar break as it cools? (Given Y = 100 GN/m².) Ans. 177 °C.
From Eq. 1.16, we can write down $Q = mc\Delta T = (mc)\Delta T$ and identify the heat capacity as
C = mc. (1.19)
In many instances, it is also useful to express quantities in moles. It is convenient to define the molar
heat capacity or molar specific heat. Note that the mass m of a substance can be expressed as
m = nM (1.20)
where n is the number of moles and M is the molar mass. Using Eq. 1.20 we can write down Eq. 1.16
as $Q = mc\Delta T = n(Mc)\Delta T$ and identify the molar heat capacity:
$$C' = Mc. \tag{1.21}$$
In terms of the molar heat capacity, we can express the heat corresponding to a temperature change $\Delta T$
as
$$Q = nC'\Delta T. \tag{1.22}$$
Calorimetry
calorimetry - procedure to measure the specific heat of an object
In calorimetry, we assume that the system is isolated from the surroundings. In this way, all of the
energy transfer takes place between the calorimeter and the objects inside it. Because the object is
heated, it is at an initial temperature that is higher than the final temperature. It therefore releases heat
Q < 0 while the calorimeter and the water absorb this released heat:
$$Q_{\text{out}} = mc(T_f - T_i) < 0 \tag{1.23}$$
$$Q_{\text{in}} = m_W c_W (T_f - T_{i,W}) + m_C c_C (T_f - T_{i,W}) \tag{1.24}$$
where subscripts W and C refer to the water and the calorimeter, respectively. Being isolated from the
surroundings, we have
Qreleased + Qabsorbed = 0. (1.25)
Try: Express the specific heat c of the object using the equations above.
Exercises
1. (Tipler) To measure the specific heat of lead, you heat 600 g of lead shot to 100 °C and place it
in an aluminum calorimeter of mass 200 g that contains 500 g of water initially at 17.3 °C. If the
final temperature of the mixture is 20.0 °C, what is the specific heat of lead? (The specific heat of
the aluminum container is 0.900 kJ/(kg·K).) Ans. 0.128 kJ/(kg·K).
2. (Young and Freedman) A heavy copper pot of mass 2.0 kg (including the copper lid) is at a
temperature of 150 °C. You pour 0.10 kg of water at 25 °C into the pot and quickly close the
lid of the pot so that no steam can escape. Find the final temperature of the pot and its contents
and determine the phase of the water. Assume that no heat is lost to the surroundings. Ans.
$T_f$ = 100 °C, partly liquid, partly gas.
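For the lead-shot exercise, the heat balance of Eqs. 1.23 to 1.25 can be solved for the unknown specific heat. The sketch below is one way to do this; the function name and the value used for the specific heat of water (4.18 kJ/(kg·K), i.e. 1 cal/(g·K)) are assumptions made for this example.

```python
def specific_heat_from_calorimetry(m, t_i, t_f, m_w, c_w, m_c, c_c, t_i_w):
    """Solve Q_out + Q_in = 0 for the specific heat c of the hot object.

    Q_out = m c (t_f - t_i)                                  (Eq. 1.23)
    Q_in  = m_w c_w (t_f - t_i_w) + m_c c_c (t_f - t_i_w)    (Eq. 1.24)
    """
    q_in = m_w * c_w * (t_f - t_i_w) + m_c * c_c * (t_f - t_i_w)
    return -q_in / (m * (t_f - t_i))


C_WATER = 4.18  # kJ/(kg K); assumed value, not given in the exercise

# Masses in kg, temperatures in degrees Celsius, specific heats in kJ/(kg K).
c_lead = specific_heat_from_calorimetry(
    m=0.600, t_i=100.0, t_f=20.0,
    m_w=0.500, c_w=C_WATER,
    m_c=0.200, c_c=0.900,
    t_i_w=17.3,
)
print(round(c_lead, 3))
```

Only temperature differences enter, so Celsius and Kelvin values can be used interchangeably here.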
The proportionality constant is called the thermal conductivity that we denote as k. In symbols, the
heat current for conduction is given by
$$H = kA \frac{\Delta T}{\Delta x} \tag{1.28}$$
where H is the heat current, A is the cross-sectional area, $\Delta T/\Delta x$ is the temperature gradient, and k is the
thermal conductivity. We can establish the connection with $I = V/R$ for electrical conduction. By writing
current flowing through each of the conductors is the same. We thus have
$$\frac{T_2 - T_1}{R_1} = \frac{T_3 - T_2}{R_2} = H. \tag{1.31}$$
The above relation can be used, for example, to calculate the temperature at the junction between two
conductors. The current flowing through the whole system can be written as
$$H = \frac{T_3 - T_1}{R_{\text{eq}}}. \tag{1.32}$$
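As an illustration of how Eq. 1.31 gives the junction temperature, the sketch below solves it for $T_2$ and cross-checks the heat current against Eq. 1.32 with $R_{\text{eq}} = R_1 + R_2$ (which follows from combining the two equations). The numerical resistances and end temperatures are hypothetical values chosen only for the example.

```python
def junction_temperature(t1, t3, r1, r2):
    """Solve (T2 - T1)/R1 = (T3 - T2)/R2 (Eq. 1.31) for T2."""
    return (t1 * r2 + t3 * r1) / (r1 + r2)


def heat_current(t_low, t_high, resistance):
    """Heat current through a thermal resistance, H = dT / R."""
    return (t_high - t_low) / resistance


# Hypothetical values: thermal resistances in K/W, end temperatures in deg C.
R1, R2 = 1.0, 3.0
T1, T3 = 0.0, 100.0

T2 = junction_temperature(T1, T3, R1, R2)
H_first = heat_current(T1, T2, R1)          # through conductor 1
H_second = heat_current(T2, T3, R2)         # through conductor 2
H_total = heat_current(T1, T3, R1 + R2)     # Eq. 1.32 with R_eq = R1 + R2

print(T2, H_first, H_second, H_total)
```

The three heat currents agree, which is the statement that the same current flows through each conductor in series.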
Consider a parallel connection of thermal conductors as shown. The temperature change for both
Figure 11: Parallel thermal conduction. The conductors are subject to the same temperature change.
$$\Delta T = H_1 R_1 = H_2 R_2. \tag{1.35}$$
Exercises
1. (Young and Freedman) Related to Physics 73.1 Conduction Experiment. Two metal bars, each
of length 5 cm and rectangular cross-section with sides 2 cm and 3 cm, are wedged between two walls,
one held at 100 °C and the other at 0 °C, as shown in Figure 12.
P V = nRT (1.42)
where P is the pressure, V is the volume, T is the temperature, and n is the number of moles in the gas.
The ideal gas equation (Eq. 1.42) is a summary of several observations (Boyle, Charles, etc.):
1. $V \propto n$. Doubling n while keeping P and T constant doubles V.
2. $V \propto 1/P$. Doubling the pressure while holding T and n constant reduces the volume to half.
3. $P \propto T$. Doubling T while keeping V and n constant doubles the pressure P.
4. $V \propto T$. Doubling the temperature while keeping P and n constant doubles V.
It is a (surprising!) result that the constant R is the same for all gases. The value is given by
Exercises
1. (Young and Freedman) In an automobile engine, a mixture of air and gasoline is compressed in
the cylinders before being ignited. A typical engine has a compression ratio of 9.00 to 1.00; this
means that the gas in the cylinder is compressed to 1/9 of its original volume. The initial pressure
is 1.00 atm and the initial temperature is 27 °C. If the pressure after compression is 21.7 atm, find
the temperature of the compressed gas. Ans. 723 K or 450 °C
2. (Young and Freedman) Mass of air in scuba tank
A typical tank used for scuba diving has a volume of 11.0 L and a gauge pressure, when full, of
$2.10 \times 10^7$ Pa. The empty tank contains 11.0 L of air at 21.0 °C and 1 atm. When the tank is filled
3. TRY (Young and Freedman) From fluid dynamics, the variation of the pressure with the elevation
as a function of density is
$$\frac{dP}{dy} = -\rho g. \tag{1.44}$$
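For the automobile-engine exercise above, the ideal gas equation (Eq. 1.42) with a fixed amount of gas gives $P_1 V_1 / T_1 = P_2 V_2 / T_2$. A minimal sketch of that calculation follows; the function name is illustrative, and only ratios of pressure and volume are needed, so any consistent units work.

```python
def final_temperature(t1_kelvin, p1, v1, p2, v2):
    """Fixed amount of ideal gas: P1 V1 / T1 = P2 V2 / T2, solved for T2."""
    return t1_kelvin * (p2 * v2) / (p1 * v1)


# Compression ratio 9.00 to 1.00, from 1.00 atm and 27 deg C to 21.7 atm.
t1 = 27.0 + 273.15                  # initial temperature in kelvin (Eq. 1.3)
t2 = final_temperature(t1, p1=1.00, v1=9.00, p2=21.7, v2=1.00)

print(round(t2), round(t2 - 273.15))
```

The temperature must be in kelvin for this proportionality to hold; using 27 directly would give a wrong answer.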
Real gases obey the ideal gas equation only under certain conditions. The Van der Waals equation is an
equation of state that is a next-level approximation to real gases as compared to the ideal gas equation.
The Van der Waals equation takes into account the volume of the molecules and the intermolecular forces
of attraction. The Van der Waals gas obeys the equation of state
$$\left(P + a\frac{n^2}{V^2}\right)(V - nb) = nRT. \tag{1.46}$$
Eq. 1.46 reduces to the ideal gas equation for very small densities $n/V \ll 1$.
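To see the low-density limit numerically, the sketch below solves Eq. 1.46 for the pressure and compares it with the ideal gas value. The constants a and b are illustrative values of the order commonly tabulated for carbon dioxide, and R = 8.314 J/(mol·K) is the usual SI value; all three are assumptions for this example rather than numbers taken from these notes.

```python
R_GAS = 8.314        # J/(mol K), assumed SI value of the gas constant
A_VDW = 0.364        # J m^3 / mol^2, illustrative (roughly CO2)
B_VDW = 4.27e-5      # m^3 / mol, illustrative (roughly CO2)


def ideal_pressure(n, volume, temperature):
    """Eq. 1.42 solved for P."""
    return n * R_GAS * temperature / volume


def van_der_waals_pressure(n, volume, temperature, a=A_VDW, b=B_VDW):
    """Eq. 1.46 solved for P: P = nRT / (V - nb) - a n^2 / V^2."""
    return n * R_GAS * temperature / (volume - n * b) - a * n**2 / volume**2


# One mole at 300 K, at a dense and at a dilute volume (in m^3).
for volume in (1.0e-3, 10.0):
    ratio = (van_der_waals_pressure(1.0, volume, 300.0)
             / ideal_pressure(1.0, volume, 300.0))
    print(volume, ratio)
```

The ratio approaches 1 as the volume per mole grows, which is the statement that the correction terms become negligible at small $n/V$.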
PV diagrams
Figure 13: PV diagram for an ideal gas showing isotherms (curves of constant temperature)
As stated, the PV diagrams show the evolution of a state of a system and therefore can also show
phase transitions. What do we know about phase transitions? Experimentally, at the phase transition
we have constant pressure and temperature and a huge amount of energy is involved. After transition
from one phase to another the volume of the object is also drastically changed. An example of PV
diagram showing phase transition is given by Figure ??.
What do we know about matter? Recall that two point charges or point masses interact with a force
that varies as $1/r^2$, where r is the separation. Gravitational forces are typically very small at atomic scales
(sizes of about 1 Å = $1 \times 10^{-10}$ m). The interaction can be studied by plotting the potential energy vs. the
separation distance.
We follow the same approach to study matter. A typical plot of the potential energy vs. the separation
for two molecules is shown below.
Phases of Matter
1. solid - crystal lattice structure: everywhere the same and periodic, long-range order
It is convenient to define the mole in analyzing the properties of matter. A mole is the amount of
substance that contains as many elementary entities as there are atoms in 0.012 kg of Carbon-12. This
leads to Avogadro's number, $6.022 \times 10^{23}$ molecules/mol.
microscopic viewpoint - large number of molecules colliding with one another and the walls of the
container.
Figure 16: Gas molecules in a box with movable piston. The piston is going to pick up momentum from collisions
with the molecules. Thus, if it is to be stationary, it has to be held by some external agent.
The piston is going to pick up momentum from collisions with the molecules. If the piston moves by
a distance dx due to collision and we want to keep it still, then we push the piston with a force F with
displacement dx. The work that we do on the gas is then
The total force exerted by the gas molecules on the piston must be equal to the total force that we
apply to keep the piston still. This way we can measure the pressure exerted by the gas molecules.
1. molecules in the gas collide elastically with the walls of the container
2. molecules are essentially point particles; the dimensions of the molecules are much smaller than
the dimensions of the container; collisions between molecules are infrequent
3. molecules do not interact with each other; gravity is negligible. This implies that there is no
preferred direction for motion.
Under these assumptions we will have a uniform distribution of molecules in the container. If we have
N molecules and the volume of the container is V , then the number density of molecules is N/V .
Let us now calculate the total momentum imparted by the gas molecules on the movable piston. Note
Consider a single molecule with velocity components $v_x$, $v_y$, and $v_z$. Upon hitting the piston (which
we allow to move along x), the momentum imparted to the piston is $mv_x - (-mv_x) = 2mv_x$.
Now how many molecules are going to hit the piston in a time t? This is simple: density × volume =
$\frac{N}{V} A v_x t$. The total momentum imparted on the piston is therefore
$$p_{\text{total}} = \frac{N}{V} A v_x t \cdot 2mv_x = 2\frac{N}{V} A t\, m v_x^2. \tag{1.49}$$
Have we been missing anything? Upon averaging, we would actually find that only half of the molecules
are directed towards the piston! So we divide the number that we have for molecules hitting the piston
in time t by 2. The total momentum imparted on the piston is now
$$p_{\text{total}} = \frac{N}{V} A t\, m v_x^2. \tag{1.50}$$
Since this momentum is what accounts for the pressure that we measure,
$$P = \frac{1}{A} \frac{p_{\text{total}}}{t}, \tag{1.51}$$
we obtain
$$PV = N m v_x^2. \tag{1.52}$$
Of course, not all the molecules have a velocity of $v_x$ in the x direction. We must therefore take the
above expression as an average and write it down as
$$PV = N m \langle v_x^2 \rangle. \tag{1.53}$$
Now recall that there is no preferred direction. This implies that the average squared speeds along the three
translational directions are the same. In symbols, we have $\langle v^2 \rangle = \langle v_x^2 \rangle + \langle v_y^2 \rangle + \langle v_z^2 \rangle = 3\langle v_x^2 \rangle$. Thus we
can write down the last result in terms of the speed v as
$$PV = \frac{2}{3} N \left( \frac{1}{2} m \langle v^2 \rangle \right). \tag{1.54}$$
Because the gas molecules are not interacting, the sum of their kinetic energies is the total energy of the
gas. We call this the internal energy of the gas, which we denote as U:
$$U = N \frac{1}{2} m \langle v^2 \rangle. \tag{1.55}$$
Finally, we have an expression relating the pressure, volume, and the internal energy for the ideal gas:
$$PV = \frac{2}{3} U. \tag{1.56}$$
By comparing this with what we know experimentally, $PV = nRT$, we identify the internal energy in
terms of temperature:
$$U = N \frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} nRT. \tag{1.57}$$
The temperature is thus identified with the average kinetic energy of the molecules of a gas. Gas molecules
at the same temperature therefore have the same average kinetic energy. By writing down $n = N/N_A$,
where $N_A$ is Avogadro's number, we identify the average kinetic energy of a molecule as
$$\frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} \frac{R}{N_A} T. \tag{1.58}$$
We call the constant $R/N_A$ the Boltzmann constant, which we denote as k. Its numerical value is
$$k = 1.38 \times 10^{-23}\ \text{m}^2\,\text{kg}/(\text{s}^2\,\text{K}). \tag{1.59}$$
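As a numerical check of Eqs. 1.58 and 1.59, the sketch below computes $k = R/N_A$ and the root-mean-square speed $v_{\text{RMS}} = \sqrt{3kT/m}$ that follows from Eq. 1.58. The gas constant R = 8.314 J/(mol·K) and the choice of nitrogen (molar mass 0.028 kg/mol) at 300 K are assumptions made for the example.

```python
import math

R_GAS = 8.314          # J/(mol K), assumed SI value of the gas constant
N_AVOGADRO = 6.022e23  # molecules/mol

K_BOLTZMANN = R_GAS / N_AVOGADRO  # Eq. 1.59: k = R / N_A


def rms_speed(temperature, molar_mass):
    """v_RMS = sqrt(3 k T / m), from (1/2) m <v^2> = (3/2) k T (Eq. 1.58)."""
    molecule_mass = molar_mass / N_AVOGADRO
    return math.sqrt(3.0 * K_BOLTZMANN * temperature / molecule_mass)


# Illustrative case: nitrogen (0.028 kg/mol) at 300 K.
v_rms_n2 = rms_speed(300.0, 0.028)
print(K_BOLTZMANN, v_rms_n2)
```

The result is a few hundred meters per second, comparable to the speed of sound in air.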
Since $E = \frac{1}{2}mv^2$, it follows that the fraction of molecules having energies between E and E + dE is
given by
$$\frac{dN}{N} = F(E)\,dE \tag{1.62}$$
where
$$F(E) = \frac{2}{\sqrt{\pi}} \left(\frac{1}{kT}\right)^{3/2} E^{1/2} \exp\left(-\frac{E}{kT}\right). \tag{1.63}$$
The fraction of molecules having speeds below a given value of $v/v_{\text{RMS}}$ is shown below.
v/vRMS fraction
0.20 0.011
0.40 0.077
0.60 0.218
0.80 0.411
1.00 0.608
1.20 0.771
1.40 0.882
1.60 0.947
1.80 0.979
2.00 0.993
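The table can be reproduced from the closed-form cumulative Maxwell speed distribution, $\operatorname{erf}(x) - \frac{2}{\sqrt{\pi}} x e^{-x^2}$ with $x = v/\sqrt{2kT/m} = \sqrt{3/2}\,(v/v_{\text{RMS}})$. The sketch below uses only the standard library; the function name is hypothetical.

```python
import math


def fraction_below(ratio):
    """Fraction of molecules with speed below ratio * v_RMS.

    Uses the cumulative Maxwell speed distribution written in terms of
    x = v / v_p, where v_p = sqrt(2kT/m) and v_RMS = sqrt(3kT/m).
    """
    x = math.sqrt(1.5) * ratio
    return math.erf(x) - 2.0 / math.sqrt(math.pi) * x * math.exp(-x * x)


for tenth in range(2, 21, 2):
    ratio = tenth / 10.0
    print(f"{ratio:.2f}  {fraction_below(ratio):.3f}")
```

Differences of table entries give the fraction in a speed window, for example between $0.4\,v_{\text{RMS}}$ and $0.8\,v_{\text{RMS}}$.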
Exercises
1. Fifteen students took a 25-point quiz. Their scores are 25, 22, 22, 20, 20, 20, 18, 18, 18, 18, 18, 15,
15, 15, 10. Find the average score and the RMS score. Ans. AVE = 18.3, RMS = 18.6.
2. Use the Maxwell-Boltzmann distribution to show that the average kinetic energy per molecule of
the gas is $\frac{3}{2}kT$.
3. A vessel contains some amount of ideal monoatomic gas at temperature T. The mass of one molecule
of this gas is m. What fraction of this gas has molecules having speeds between $v_1 = \frac{1}{5}\sqrt{\frac{12kT}{m}}$ and
$v_2 = \frac{1}{5}\sqrt{\frac{48kT}{m}}$?
According to the kinetic theory, the internal energy per mole of an ideal monoatomic gas is $\frac{3}{2}RT$, so its
molar heat capacity at constant volume is $\frac{3}{2}R$. What does experiment have to say about this? It tells us
that the simple kinetic model of an ideal monoatomic gas leads to results consistent with experiments.
What about the diatomic gas? The diatomic gas can be visualized as two atoms connected to each other
by an inextensible rod. Thus, in addition to the three translational degrees of freedom, there are two
rotational degrees of freedom that contribute to the total energy of the system. The total energy per mole
for a diatomic gas is then $5 \times \frac{1}{2}RT$ or $\frac{5}{2}RT$. This leads to a heat capacity for a diatomic gas of $\frac{5}{2}R$,
and we see that this is consistent with experiment. There is no simple picture that we can use to analyze
polyatomic gases. But it turns out that by considering two more degrees of freedom in addition to the five
degrees of freedom of diatomic molecules, we get the heat capacity at constant volume of polyatomic molecules
to be $\frac{7}{2}R$. This turns out to be consistent with some polyatomic gases, but further analysis is needed.
The model that we have for a solid is that of a lattice in which each lattice point is occupied by
an atom of the solid. The atoms are connected to adjacent atoms by springs. Thus, there are six
degrees of freedom: three translational degrees of freedom and three vibrational degrees of freedom. The
equipartition theorem leads us to the result $6 \times \frac{1}{2}RT$ for the total energy per mole and thus 3R for the
heat capacity of a solid. This prediction turns out to be in remarkable agreement with many elemental
solids and is called the Dulong-Petit rule.
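The counting used above assigns $\frac{1}{2}RT$ of energy per mole to each degree of freedom, so the molar heat capacity at constant volume is $\frac{f}{2}R$ for f degrees of freedom. The following sketch tabulates the cases mentioned in the text; R = 8.314 J/(mol·K) is an assumed SI value and the dictionary labels are illustrative.

```python
R_GAS = 8.314  # J/(mol K), assumed SI value of the gas constant


def molar_heat_capacity(degrees_of_freedom):
    """Equipartition: each degree of freedom contributes R/2 to C_V."""
    return 0.5 * degrees_of_freedom * R_GAS


cases = {
    "monoatomic gas": 3,
    "diatomic gas": 5,
    "polyatomic gas (model in the text)": 7,
    "elemental solid (Dulong-Petit)": 6,
}
for name, dof in cases.items():
    print(f"{name}: C_V = {molar_heat_capacity(dof):.2f} J/(mol K)")
```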
This is the area under the curve in the PV diagram! What changes from case to case is then the dependence
of the pressure P on V. Here are a few of the cases that we are going to cover:
1. Isochoric - constant volume process
In this case dV = 0 for all of the parts of the process. Eq. 1.68 is then trivial:
Wisochoric = 0. (1.69)
$$W_{\text{isobaric}} = P\Delta V \tag{1.70}$$
$$dU = dQ + dW \tag{1.74}$$
where dU is the change in internal energy, dW is the work done ON THE SYSTEM, and dQ is the heat. In
terms of the work done BY the system, the sign of dW is opposite, and in some references we will find instead
$$dU = dQ - dW. \tag{1.75}$$
This is the first law of thermodynamics, which is another manifestation of the conservation of
total energy.
At constant volume V , all the heat goes into changing the internal energy of the system. At constant
pressure, some part goes into work. Thus, we have dQP = dU + dW = CP dT . Using an ideal gas as a
working substance we have dU = CV dT and dW = P dV = nRdT . Therefore,
$$PV = (\gamma - 1)U. \tag{1.80}$$
Adiabatic Process
In an adiabatic process, all of the work goes into changing the internal energy as there is no heat flow
into or out of the system. To find the relationship between P and V we take the differential of Eq.
1.80:
$$dP\,V + P\,dV = (\gamma - 1)\,dU. \tag{1.81}$$
Now, since all the work goes into changing the internal energy, $dU = -P\,dV$ and we have
$$dP\,V + P\,dV = -(\gamma - 1)P\,dV. \tag{1.82}$$
By transposing all of the terms to the left-hand side and dividing by PV we obtain
$$\frac{dP}{P} + \gamma\frac{dV}{V} = 0. \tag{1.83}$$
The solution to this differential equation is exactly Eq. 1.72.
Exercises
1. Adiabatic compression in a diesel engine (YF)
The compression ratio of a diesel engine is 15 to 1; this means that air in the cylinders is compressed
to 1/15 of its initial volume. If the initial pressure is $1.01 \times 10^5$ Pa and the initial temperature is
27 °C, find the final pressure and the temperature after compression. Air is mostly a mixture of
diatomic oxygen and nitrogen; treat it as a gas with $\gamma = 1.4$. Ans. $P_f$ = 44 atm, $T_f$ = 613 °C.
3. Photon gas. What is the value of the heat capacity ratio $\gamma$ for a photon gas? You can derive this by
showing that the equation of state for a photon gas is given by $PV = \frac{1}{3}U$. Use kinetic theory and
an advanced knowledge that the total energy of a photon is E = pc, where p is the momentum and
c is the speed of light.
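The diesel-engine exercise can be checked with the adiabatic relations obtained by integrating Eq. 1.83, $PV^{\gamma} = \text{const}$, together with the ideal gas equation, which gives $TV^{\gamma - 1} = \text{const}$. In the sketch below the function names are illustrative and the conversion 1 atm = $1.013 \times 10^5$ Pa is an assumed standard value.

```python
ATM_IN_PA = 1.013e5  # assumed conversion factor, Pa per atm


def adiabatic_final_pressure(p_i, compression_ratio, gamma):
    """P V^gamma = const with V_i / V_f = compression_ratio."""
    return p_i * compression_ratio**gamma


def adiabatic_final_temperature(t_i_kelvin, compression_ratio, gamma):
    """T V^(gamma - 1) = const with V_i / V_f = compression_ratio."""
    return t_i_kelvin * compression_ratio**(gamma - 1.0)


p_f = adiabatic_final_pressure(1.01e5, 15.0, 1.4)            # Pa
t_f = adiabatic_final_temperature(27.0 + 273.15, 15.0, 1.4)  # K

p_f_atm = p_f / ATM_IN_PA
t_f_celsius = t_f - 273.15
print(round(p_f_atm, 1), round(t_f_celsius))
```

The large temperature rise from compression alone is what ignites the fuel in a diesel engine without a spark.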
$$dU = dQ - dW. \tag{1.84}$$
This basically says that when heat dQ enters the system while the system does an amount of work dW,
the change in the internal energy of the system is the difference dQ − dW. Also, we have learned
that the work done to change the state is dependent on the path taken (isochoric, isobaric, etc.). By the
definition of the adiabatic process we also have $dQ_{\text{adiabatic}} = 0$. So we also have an idea that the heat is
dependent on the path. We are now about to learn that, in contrast to the work and heat, the change
in the internal energy is independent of the path taken.
Recall that we can write down the internal energy of an ideal gas as
$$U = nC_V T \tag{1.85}$$
where n is the number of moles and $C_V$ is the molar heat capacity of the ideal gas, e.g. $C_V = \frac{5}{2}R$ for a
diatomic gas and $C_V = \frac{3}{2}R$ for a monoatomic gas. This expression for the internal energy turns out to be
fairly general for gases at high temperatures and low pressures. This means that when we change the
state of the system, the change in the internal energy is given by
$$\Delta U = nC_V \Delta T. \tag{1.86}$$
The internal energy of a gas depends only on the temperature. Therefore, the change in the internal
energy depends only on the initial and final temperatures. What does experiment say about this result?
(Read about the free expansion experiment of Joule!)
Some of the consequences of the above result are:
1. For isothermal processes, $dU_{\text{isothermal}} = 0$. Since the temperature change in an isothermal process is
zero, the change in the internal energy must be zero.
2. For isochoric, isobaric, and adiabatic processes, $dU = nC_V\,dT$. But this can still take different
forms when the ideal gas equation is used. For example, in an isobaric process $P\,dV = nR\,dT$, so
the change in the internal energy can also be written as $dU = \frac{C_V}{R} P\,dV$.
3. For a cyclic process, $\Delta U = 0$.
The heat absorbed or released in a process can now be calculated using the First Law.
At constant volume or at constant pressure, the heat absorbed or released can be calculated without
resorting to the first law. At constant volume, recall that the heat absorbed is
$$dQ = nC_V\,dT \tag{1.87}$$
where $C_V$ is the molar heat capacity at constant volume. This can be used to calculate the heat in an
isochoric process. At constant pressure, we instead have
$$dQ = nC_P\,dT \tag{1.88}$$
where $C_P$ is the molar heat capacity at constant pressure. This can be used to calculate the heat in an
isobaric process.
Exercises
1. (Tipler) You do 25 kJ of work on a system consisting of 3 kg of water by stirring it with a paddle
wheel. During this time, 15 kcal of heat is removed. What is the change in the internal energy of
the system? (4.18 J = 1 cal) Ans. −37.7 kJ.
2. In a cyclic process, what is the relation between the total work done and the total heat?
3. Reconcile the apparent issue: We have said that the internal energy of a gas can be expressed
only in terms of the temperature by $U = nC_V T$. But doesn't the internal energy also depend on
pressure since $PV = (\gamma - 1)U$? So does the internal energy depend on the volume now? Does the
internal energy also depend on the pressure?
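The sign bookkeeping in the paddle-wheel exercise is easy to get wrong, so a small sketch may help. It uses the convention of Eq. 1.74, where dW is the work done on the system and heat removed from the system counts as negative; the variable names are illustrative.

```python
KJ_PER_KCAL = 4.18  # from the exercise: 4.18 J = 1 cal


def internal_energy_change(heat_in_kj, work_on_system_kj):
    """First law in the form of Eq. 1.74: dU = dQ + dW (work done ON the system)."""
    return heat_in_kj + work_on_system_kj


heat_kj = -15.0 * KJ_PER_KCAL   # 15 kcal removed, so Q is negative
work_on_kj = 25.0               # 25 kJ of stirring work done on the water

delta_u_kj = internal_energy_change(heat_kj, work_on_kj)
print(round(delta_u_kj, 1))
```

More heat leaves than work enters, so the internal energy of the water decreases.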
The heat engine is represented by the ERM (Energy-Reservoir Model) diagram shown below. Basically,
a heat QH is given off by the hot reservoir and a part of this energy is converted into useful work W by
the heat engine while the rest of the heat $Q_C$ is thrown away to a cold reservoir. Since we restrict our
attention to cyclic processes, $\Delta U = 0$ and we have Q = W. In terms of the variables shown in the ERM
diagram we have
$$|Q_H| - |Q_C| = |W|. \tag{1.89}$$
The efficiency of the heat engine is defined as the ratio of the work done per cycle to the heat input.
In symbols, we have
$$\eta = \frac{|W|}{|Q_H|} = \frac{|Q_H| - |Q_C|}{|Q_H|} = 1 - \frac{|Q_C|}{|Q_H|}. \tag{1.90}$$
We now state the heat engine form of the second law:
It is impossible for a heat engine working in a cycle to produce no other effect than that of extracting
thermal energy from a reservoir and performing an equivalent amount of work.
Here are the parts and their descriptions:
part    description
a→b     compression stroke (adiabatic)
b→c     ignite fuel (isochoric heating)
c→d     power stroke (adiabatic expansion)
d→a     reject heat to environment (isochoric cooling)
What is important to us is to calculate the efficiency of the Otto engine. We ask, will an Otto engine
with a compression ratio $r = r_1$ have a greater efficiency compared to an Otto engine with a compression
ratio $r = r_2 > r_1$? Let us find out the answer to this question.
We begin by calculating QH and QC . Since heat enters and leaves the system at constant volume, we
have
Using these, we can express the efficiency of the Otto engine in terms of the temperatures:
$$\eta = 1 - \frac{|Q_C|}{|Q_H|} = 1 - \frac{|T_a - T_d|}{|T_c - T_b|}. \tag{1.93}$$
Now note that point c is connected to d adiabatically. Also, a is connected to b adiabatically. Thus, we
also have
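$$T_c V^{\gamma - 1} = T_d (rV)^{\gamma - 1} \tag{1.94}$$
$$T_b V^{\gamma - 1} = T_a (rV)^{\gamma - 1}. \tag{1.95}$$
Cancelling out the common factor $V^{\gamma - 1}$ and subtracting the resulting equations gives
$$\frac{T_d - T_a}{T_c - T_b} = \frac{1}{r^{\gamma - 1}}. \tag{1.96}$$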
Finally, we arrive at the desired expression for the efficiency of the Otto engine:
$$\eta = 1 - \frac{1}{r^{\gamma - 1}}. \tag{1.97}$$
This shows that a larger compression ratio leads to a larger efficiency! What can you say about the
dependence of the efficiency on the property of the working substance?
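To make the dependence on the compression ratio concrete, the sketch below evaluates the Otto efficiency, one minus $1/r^{\gamma - 1}$, for a few values of r. The default $\gamma = 1.4$ (air treated as a diatomic gas) and the sample compression ratios are assumptions chosen for illustration.

```python
def otto_efficiency(compression_ratio, gamma=1.4):
    """Otto-cycle efficiency: 1 - 1 / r^(gamma - 1) (Eq. 1.97)."""
    return 1.0 - 1.0 / compression_ratio**(gamma - 1.0)


# Larger compression ratios give larger efficiencies.
for r in (6.0, 8.0, 10.0):
    print(r, round(otto_efficiency(r), 3))

# The working substance enters only through gamma, e.g. a monoatomic gas.
print(round(otto_efficiency(8.0, gamma=5.0 / 3.0), 3))
```

Comparing the last line with the r = 8 entry shows that a working substance with a larger $\gamma$ gives a larger efficiency at the same compression ratio.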
The story is: An amount of heat QC is taken away from the cold reservoir by using work W (input) to
throw away the heat QH to a hot reservoir. The direction of the energy flow shows that the following
relation still holds:
|QH | = |QC | + |W |. (1.98)
The ideal refrigerator will allow us to remove heat from the cold reservoir and throw it away into
the hot reservoir using only a minimum amount of work. The relevant number that allows us to tell the
quality of the refrigerator is the coefficient of performance. This number is given by
$$K = \frac{|Q_C|}{|W|}. \tag{1.99}$$
In contrast with the heat engine's efficiency, note that this number can be greater than 1. The greater
the coefficient of performance is, the better the refrigerator.
We now state the refrigerator form of the second law:
It is impossible for a refrigerator working in a cycle to produce no other effect than to transfer thermal
energy from a cold object to a hot object.
Carnot's Theorem
The most efficient engine can be built using only reversible steps.
This theorem leads us to the Carnot cycle which is illustrated in the PV diagram below. The efficiency
Exercise: What is the efficiency of the most efficient heat engine working between reservoirs with
temperatures of 473 K and 273 K, respectively? Is it possible to have a heat engine with an efficiency of
43 % working between the same heat reservoirs? Ans. 42 %, No.
part    description
1→2     isothermal expansion at $T_H$
2→3     adiabatic expansion, gas expands
3→4     isothermal compression at $T_C$
4→1     adiabatic compression, temperature increases
Carnot efficiency using an ideal gas as a working substance. But as discussed, the resulting equation
must not depend on the properties of the gas or the working substance. The result is more general and
can actually be derived without resorting to an ideal gas as working substance. We restrict our attention
to the simplest case using an ideal gas as a working substance.
Since two of the parts are adiabatic the heat input and output corresponds only to the isothermal
parts. Let the isotherms at points 1 and 2 correspond to the temperature TH and the isotherm at points
3 and 4 correspond to the temperature TC . In this case, we have
$$Q_H = P_1 V_1 \ln\frac{V_2}{V_1} \tag{1.101}$$
$$Q_C = P_3 V_3 \ln\frac{V_4}{V_3}. \tag{1.102}$$
Therefore, we have
$$\frac{Q_C}{Q_H} = \frac{P_3 V_3 \ln(V_4/V_3)}{P_1 V_1 \ln(V_2/V_1)} = \frac{T_C \ln(V_4/V_3)}{T_H \ln(V_2/V_1)}. \tag{1.103}$$
Now note that points 2 and 3 as well as points 1 and 4 are connected adiabatically. Thus,
$$T_H V_2^{\gamma - 1} = T_C V_3^{\gamma - 1} \tag{1.104}$$
$$T_H V_1^{\gamma - 1} = T_C V_4^{\gamma - 1}. \tag{1.105}$$
Dividing these last two expressions leads to
$$\frac{V_1}{V_2} = \frac{V_4}{V_3}. \tag{1.106}$$
Finally, we have
$$\frac{Q_C}{Q_H} = -\frac{T_C}{T_H}. \tag{1.107}$$
The minus sign is present only because QC < 0 for the heat engine. Thus, we arrive at Eq. 1.100. As
expected, all dependence on the properties of the working substance cancelled out.
Carnot refrigerator
Since each step in a Carnot cycle is reversible, the entire cycle for the Carnot heat engine can be reversed
leading to the Carnot refrigerator. The coefficient of performance of this most efficient refrigerator is
$$K = \frac{1}{\eta} - 1 = \frac{T_C/T_H}{1 - T_C/T_H}. \tag{1.108}$$
Exercises
1. A refrigerator has a coefficient of performance of 5.5. How much work is needed for this refrigerator
to make ice cubes from 1 L of water at 10 °C?
2. A steam engine works between the hot reservoir at 100 °C and cold reservoir at 0 °C. What is the
maximum possible efficiency of this engine? If the engine is run backwards, what is the coefficient
of performance? Ans. 26.8 %, 2.73
3. A Carnot engine with an efficiency of 63 % performs $3.50 \times 10^4$ J of work in each cycle. If it exhausts
heat at a 298 K reservoir, what is the temperature of its heat source?
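Exercises 2 and 3 follow directly from the Carnot results: combining Eq. 1.90 with Eq. 1.107 gives an efficiency of one minus $T_C/T_H$, and Eq. 1.108 gives the coefficient of performance of the reversed cycle. The sketch below applies these relations; the function names are illustrative, and temperatures must be in kelvin.

```python
def carnot_efficiency(t_cold, t_hot):
    """Carnot efficiency 1 - T_C / T_H, temperatures in kelvin."""
    return 1.0 - t_cold / t_hot


def carnot_cop(t_cold, t_hot):
    """Eq. 1.108: K = 1/efficiency - 1 = (T_C/T_H) / (1 - T_C/T_H)."""
    return 1.0 / carnot_efficiency(t_cold, t_hot) - 1.0


def hot_reservoir_temperature(t_cold, efficiency):
    """Invert the Carnot efficiency for T_H."""
    return t_cold / (1.0 - efficiency)


# Exercise 2: steam engine between 100 and 0 degrees Celsius.
eff_steam = carnot_efficiency(273.15, 373.15)
cop_steam = carnot_cop(273.15, 373.15)

# Exercise 3: Carnot engine with 63 % efficiency exhausting at 298 K.
t_source = hot_reservoir_temperature(298.0, 0.63)

print(round(100 * eff_steam, 1), round(cop_steam, 2), round(t_source))
```

Exercise 1 is not included because it needs the specific heat and latent heat of fusion of water, which are not quoted in this section.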
$$\Delta S \geq 0 \tag{1.109}$$
$$dS = \frac{dQ_{\text{rev}}}{T} \tag{1.110}$$
where $dQ_{\text{rev}}$ is the heat that must be added to the system in a reversible process that brings the system
from its initial state to its final state.
$$dQ = dU + dW = nC_V\,dT + \frac{nRT}{V}\,dV. \tag{1.111}$$
Dividing by the temperature T we have
$$dS = nC_V \frac{dT}{T} + nR \frac{dV}{V}. \tag{1.112}$$
When the initial and final points are finitely separated, we easily see that
$$S = k \ln \Omega \tag{1.117}$$
where k is, as usual, the Boltzmann constant and $\Omega$ is the number of possible ways of arranging the parts
of the system that lead to the same state. To go further, we need to define what is macroscopic and
microscopic. This is left as a reading assignment.
We give examples in terms of a very special gas composed of only three molecules. We place
this gas in a two-compartment box that is separated by a valve. The gas molecules can be in either
compartment and we assume that we can track down the position of each gas molecule. A microstate
macroscopic                             microscopic
3 molecules in the left compartment     (L, L, L)
2 molecules in the left compartment     (L, L, R), (L, R, L), (R, L, L)
1 molecule in the left compartment      (L, R, R), (R, L, R), (R, R, L)
0 molecules in the left compartment     (R, R, R)
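The table can be generated by brute-force enumeration, which also shows how the number of arrangements enters the entropy formula S = k ln Ω. The sketch below is a minimal illustration; the function name is hypothetical and k is taken as $1.38 \times 10^{-23}$ J/K.

```python
import math
from collections import Counter
from itertools import product

K_BOLTZMANN = 1.38e-23  # J/K


def multiplicities(n_molecules):
    """Count microstates for each macrostate (number of molecules on the left)."""
    counts = Counter()
    for microstate in product("LR", repeat=n_molecules):
        counts[microstate.count("L")] += 1
    return dict(counts)


omega = multiplicities(3)
for n_left in sorted(omega, reverse=True):
    entropy = K_BOLTZMANN * math.log(omega[n_left])  # S = k ln(Omega)
    print(n_left, omega[n_left], entropy)
```

The macrostates with the molecules split between the compartments have the most microstates and therefore the largest entropy; the effect becomes overwhelming as the number of molecules grows.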
2.1 Natural Units, Inertial Reference Frames, and the Principle of Relativity
We begin the subject by introducing natural units, inertial reference frames, and the principle of relativity.
The parable of the surveyor invites us to use the same units for both the space and time dimensions.
The conversion factor is given by the speed of light:
Conventionally, we say that 1 second is the time it takes for light to travel a distance of $3 \times 10^8$ m. In
this subject we start to measure time in units of length. We can now define what we will be calling a
clock (see Figure 21). We create mirror A in such a way that it ticks each time the pulse hits it. In our
construction, we call the time between ticks 1 meter of light-travel time. Now we
can say that we are measuring time in meters. What are the consequences of that?
Figure 21: A very short pulse of light is trapped between two mirrors A and B separated by half a meter.
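As a small illustration of measuring time in meters, the sketch below converts between seconds and meters of light-travel time using the speed of light as the conversion factor. It uses the rounded value $3 \times 10^8$ m/s quoted in the text, and the function names are hypothetical.

```python
C_LIGHT = 3.0e8  # m/s, rounded value used in the text


def seconds_to_meters(t_seconds):
    """Time in meters of light-travel time: t_m = c * t."""
    return C_LIGHT * t_seconds


def meters_to_seconds(t_meters):
    """Conventional time corresponding to a light-travel time in meters."""
    return t_meters / C_LIGHT


one_second_in_meters = seconds_to_meters(1.0)
one_meter_in_seconds = meters_to_seconds(1.0)
print(one_second_in_meters, one_meter_in_seconds)
```

One tick of the mirror clock, 1 meter of light-travel time, corresponds to only a few nanoseconds.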
Gravity affects everything. We know that. Every object has mass and energy and anything with mass
and energy is affected by gravity. Gravity acts on everything. In this case, we would not actually observe
any test particles moving in a straight line or at rest since these test particles would be acted upon by
gravity. Because of gravity, test particles would deviate from their natural straight line motions. In
considering this, an inertial reference frame must therefore also be a gravitation-free reference frame.
Also, we define a test object as an object that does not produce any significant gravitational force.
train is an inertial reference frame if, when the train falls by the distance h, the change in separation
between the test objects $\Delta y$ is undetectable by current technology.
What this means is that the form of the equations is the same in every inertial reference frame. Thus,
when we derive the physical laws in, say, the lab frame, we do not have to derive the physical laws in
every other frame. The laws in all frames take the same form. For example, in the lab frame, the ideal
gas equation and the wave equation are given by
$$ PV = N k T \tag{2.2} $$
$$ \frac{\partial^2 \psi}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2}. \tag{2.3} $$
In another inertial reference frame, say a rocket frame that is moving with constant speed with respect
to the lab, the form of the ideal gas equation and the wave equation are
$$ P'V' = N' k T' \tag{2.4} $$
$$ \frac{\partial^2 \psi'}{\partial x'^2} = \frac{1}{c^2} \frac{\partial^2 \psi'}{\partial t'^2}. \tag{2.5} $$
We say that the form of the equations is preserved. Also, the physical constants, e.g. the
speed of light c and the Boltzmann constant k, have the same value. What is not necessarily the same for
the two frames are the values of their measurements, say, of the coordinates, the electric and magnetic
fields, the pressure, the volume, etc. These different values are related by what
we call the Lorentz transformations, but we postpone that discussion for a while.
We build a latticework of meter sticks and clocks as shown. In this latticework of clocks and meter
sticks, the time t of an event (with respect to this reference frame) is recorded by the clock that is nearest
to that event. The spatial coordinates (x, y, z) of the event are given by the location of that clock in the
lattice. What we call the observer is actually the collection of all of these clocks. Through this latticework,
we can test whether a given reference frame is inertial: we simply check the motion of test particles.
But before going very far, we first have to calibrate the clocks. We now focus on an inertial reference
frame. We choose a particular clock that we shall call the reference clock, and we call its location
the origin of the inertial reference frame. What we want is that when this reference clock reads 5
meters of time, the clocks located 5 meters away from it also read 5 meters of time.
To do this, we suppose that we have a large number of assistants that we position at each of the clocks in
the lattice. Each assistant sets the assigned clock to the time corresponding to the distance of that
clock from the reference clock. For example, the assistants at the clocks (5, 0, 0), (0, 5, 0), and
(0, 0, 5) set their clocks to 5 meters of light travel time. The assistants hold their clocks
at these times. Now we, positioned at the reference clock, release a flash of light that spreads out
in all directions. The job of each assistant is to start the clock as soon as the flash reaches its location.
In this way, all of the clocks in the lattice are synchronized.
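The synchronization procedure can be sketched in a few lines. This is an illustrative sketch only: it assumes the flash is released when the reference clock reads zero (the text does not state this explicitly), and the function names are hypothetical.

```python
import math


def preset_time(clock_position):
    """Time (in meters of light travel time) at which a lattice clock is held
    before the flash arrives: its distance from the reference clock."""
    x, y, z = clock_position
    return math.sqrt(x * x + y * y + z * z)


def clock_reading(clock_position, reference_time):
    """Reading of a lattice clock when the reference clock shows reference_time.

    Assumption: the flash leaves the origin at reference time 0. Until the flash
    arrives the clock is frozen at its preset; afterwards it runs and agrees
    with the reference clock.
    """
    preset = preset_time(clock_position)
    return preset if reference_time < preset else reference_time


if __name__ == "__main__":
    for position in [(5, 0, 0), (0, 5, 0), (0, 0, 5), (3, 4, 0)]:
        print(position, "is preset to", preset_time(position), "m of time")
```

Once the flash has passed a clock, `clock_reading` returns the same value for every position, which is what synchronization means here.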
Consider two inertial reference frames with overlapping regions of spacetime as shown. Because there
is a common region of spacetime, there are a number of events that can be described by both inertial
reference frames. Without loss of generality, we suppose that the relative motion of the two frames is
along the x axis (the x axes coincide). We also take the y and z axes of the two frames to be oriented in
the same way (not rotated) as viewed along either the x axis of frame A or the x axis of frame B.
Figure 24: Two overlapping inertial reference frames. To distinguish the two frames, the clocks in frame A are
shaped as circles while the clocks in frame B are shaped as rectangles.
$$ \Delta t_B = 2 l_B \tag{2.6} $$
$$ \Delta x_B = 0. \tag{2.7} $$
Now, inertial frame A sees something different. See next figure. According to an observer in inertial
frame A, the two events are separated by $\delta_A$ meters. The time taken by light to travel along the path
shown is (in natural units) $2\sqrt{l_A^2 + (\delta_A/2)^2}$. According to this observer, the temporal and
spatial separation of the events are
$$ \Delta t_A = 2\sqrt{l_A^2 + \left(\frac{\delta_A}{2}\right)^2} \tag{2.8} $$
$$ \Delta x_A = \delta_A. \tag{2.9} $$
Using the principle of relativity, we can actually show that the coordinates perpendicular to the relative
direction of motion of the two frames remain the same (see argument by Taylor and Wheeler). This
means that we can set lA = lB = l. At this point, we should already notice that the time and space
coordinates of the events are different in the two frames. Since the coordinates perpendicular to the
relative direction of motion of the two inertial frames remain the same according to the principle of
relativity, then it follows that
$$ \Delta t_A^2 - \Delta x_A^2 = \Delta t_B^2 - \Delta x_B^2. \tag{2.10} $$
Can you show this?
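A numerical spot check can accompany the algebra. The sketch below assumes the light-clock geometry described above: in frame B the separations are $\Delta t_B = 2l$, $\Delta x_B = 0$, and in frame A the light travels a path of length $2\sqrt{l^2 + (\Delta x_A/2)^2}$. The function names are chosen here for illustration.

```python
import math


def separations_in_frame_A(l, dx_A):
    """Temporal and spatial separation of emission and reception in frame A,
    where the light path is the two slanted legs of an isosceles triangle."""
    dt_A = 2.0 * math.sqrt(l ** 2 + (dx_A / 2.0) ** 2)
    return dt_A, dx_A


def interval_squared(dt, dx):
    """Squared spacetime interval in natural units."""
    return dt ** 2 - dx ** 2


if __name__ == "__main__":
    l = 0.5  # mirror separation in meters, so frame B measures dt_B = 1 m
    for dx_A in [0.0, 1.0, 3.0, 10.0]:
        dt_A, _ = separations_in_frame_A(l, dx_A)
        print(dx_A, dt_A, interval_squared(dt_A, dx_A))
```

Whatever value is chosen for the spatial separation in frame A, the combination $\Delta t^2 - \Delta x^2$ stays equal to $(2l)^2$.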
There is actually something special about our choice of reference frame B. That is the frame where
the two events E and R occur at the same place. To remove this special treatment, we consider another
inertial reference frame C (with a common region of spacetime with frames A and B). Let this inertial
frame travel at constant speed with respect to frame A that is larger than the speed of frame B with
respect to frame A. An observer in frame C will see events E and R as shown. According to an observer
here, the emission and reception events have the temporal and spatial separations of
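$$ \Delta t_C = 2\sqrt{l_C^2 + \left(\frac{\delta_C}{2}\right)^2} \tag{2.11} $$
$$ \Delta x_C = \delta_C, \tag{2.12} $$
where $\delta_C$ denotes the spatial separation of the two events in frame C.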
This result is part of a more general result that is called the invariance of the spacetime interval.
The result is in fact so general that it defines the geometry of special relativity, and special
relativity itself. The spacetime interval ds between two events is defined by
Proper Time
The proper time is that time measured by an observer that is at rest in his own inertial reference
frame, that is, the wristwatch time. For a particular choice of two events, this is the time measured by
the observer in the frame where the events occur at the same place. In symbols, the proper time $d\tau$
between two events (infinitesimally separated) is given by
$$ d\tau^2 = dt^2 - dx^2 - dy^2 - dz^2. \tag{2.16} $$
Exercises
1. Two firecrackers situated 400 m apart exploded 500 m of time one after the other according to
a lab frame. A rocket frame observes the explosions to occur at the same place. Find the spatial
separation of these two events. What is the time between explosions according to this rocket frame?
What is the speed of the rocket with respect to the laboratory? Ans. 400 m and 0 m, 300 m, 4/5
2. A proton moving 3/4 light speed (wrt laboratory) enters two detectors 2 m apart. Events 1 and
2 are the transits through the two detectors. Find the time and space separation between the two
events according to the (i) lab frame and (ii) proton frame. Ans (2.67 m and 2 m) and (1.76 m and
0 m)
3. Two twins have their clocks initially synchronized. Observer A stays at rest in the lab frame while
Observer B travels from (0, 0) to (5 ly, 13/3 ly) as observed by A. Calculate the proper time of B
as well as its speed with respect to A. Ans. 2.5 ly, 13/15
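The quoted answers can be checked with the invariant interval alone. The following is a small sketch in natural units; the helper names are illustrative and not from the original notes.

```python
import math


def proper_time(dt, dx):
    """Proper time between two timelike-separated events: sqrt(dt^2 - dx^2)."""
    return math.sqrt(dt ** 2 - dx ** 2)


def same_place_frame_speed(dt, dx):
    """Speed (fraction of light speed) of the frame in which both events
    occur at the same place."""
    return dx / dt


if __name__ == "__main__":
    # Exercise 1: firecrackers, lab separations of 500 m of time and 400 m.
    print(proper_time(500.0, 400.0), same_place_frame_speed(500.0, 400.0))
    # Exercise 2: proton at 3/4 light speed crossing detectors 2 m apart.
    lab_dt = 2.0 / 0.75
    print(lab_dt, proper_time(lab_dt, 2.0))
    # Exercise 3: traveller from (0, 0) to (5 ly, 13/3 ly).
    print(proper_time(5.0, 13.0 / 3.0), same_place_frame_speed(5.0, 13.0 / 3.0))
```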
Figure 28: Spacetime diagram. The vertical axis is the time axis while the horizontal axis is the position axis.
The points in this diagram correspond to events in spacetime.
same time (simultaneous). In general, events that lie along a line that is parallel to the time axis occur
at the same place. Events that lie along a line that is parallel to the position axis occur at the same time.
Now consider a reference frame that we call the lab frame. In this lab frame, event O occurs at the origin
at time $t = 0$ and event B occurs also at the origin but at a later time $t = t_L$. When these events are
viewed from a rocket frame (whose origin coincides with the lab origin at $t = 0$) moving to the right with
respect to the lab frame, event O occurs at the origin at $t' = 0$ but event B occurs at some $x' < 0$ at
some time $t' \neq t_L$. Viewed from a rocket frame moving to the left with respect to the lab frame, event
O occurs at the origin at $t'' = 0$ but event B occurs at some $x'' > 0$ at some time $t'' \neq t_L$. Spacetime
diagrams drawn by observers in these frames are shown below. Since the event O occurs at the origin in
these frames we have $\Delta x = x$ and $\Delta t = t$. The invariance of the interval immediately reveals to us the
following relationship:
$$ t^2 - x^2 = t'^2 - x'^2 = t''^2 - x''^2 = \text{constant}. \tag{2.17} $$
This shows that even when the spacetime coordinates of event B are different according to the different
frames, event B lies on the hyperbola with equation $t^2 - x^2 = \text{constant}$ in all of the spacetime diagrams.
We call this hyperbola the invariant hyperbola.
Light Cones
Drawing light cones is a good way to determine if events are causally related. A light cone is the
surface generated by light that is emitted at a point in spacetime. See the figure below. All events that
are in the future light cone of A are causally related to A. By causally related, we mean that whatever
happens at event A might affect what happens at event B. In contrast, event C is outside the light cone
of A. Thus, whatever happens at A cannot affect what happens at C. This is so because nothing can travel
faster than light. We have not yet proven this but introduce the idea now. The proof will come later
when we talk about energy. We will eventually show that accelerating a particle to the speed of
light would require an infinite amount of energy. Are events A and D causally related?
Figure 29: Light cone. Light rays emerge from point A along trajectories with slopes of 45° from the horizontal. By
rotating this drawing about the vertical axis passing through point A we generate a cone.
Exercise
An event occurs at (t, x) = (5 m, 3 m) in some lab frame. At what speed of the rocket frame will this
event have the largest time separation with respect to the common origin of the inertial frames? Is the
event timelike separated or spacelike separated with respect to the common origin? What is the proper
time/proper distance for the event with respect to the common origin?
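The last two questions depend only on the sign and size of the squared interval. The sketch below is one way to organize that check; the function name and return convention are chosen here for illustration, and it deliberately leaves the question about the rocket speed to the reader.

```python
import math


def classify_separation(dt, dx):
    """Classify the separation of an event from the origin and return the
    invariant magnitude: proper time if timelike, proper distance if spacelike."""
    s2 = dt ** 2 - dx ** 2
    if s2 > 0:
        return "timelike", math.sqrt(s2)
    if s2 < 0:
        return "spacelike", math.sqrt(-s2)
    return "lightlike", 0.0


if __name__ == "__main__":
    # Event at (t, x) = (5 m, 3 m) relative to the common origin.
    print(classify_separation(5.0, 3.0))
```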
Consider a light pulse travelling at the speed of light in a frame B that moves with speed $\beta$ relative
to A. According to the Galilean velocity addition rule, the speed of the light pulse measured by frame
A is given by $\beta + 1 \neq 1$. This violates the principle of relativity! Moreover, in the derivation of the
Galilean transformation, one assumes that the time measured in one frame is exactly the same as the
time that is measured in another reference frame. But we know that to be incorrect as well. So, we agree
that there is room for improvement. Our goal now is to derive the so-called Lorentz transformation,
which is the relativistic generalization of the Galilean transformation. This does not mean that the Galilean
transformation is totally incorrect or that it is useless from this point on. On the contrary, we are going
to use it as a guide in our derivation.
With this goal in mind, we assume that the transformation between the two inertial reference frames
is linear. Can you present an argument for this assumption? For simplicity, we consider two frames
whose relative motion is along the x direction. Without loss of generality, we take the origin of both
frames to coincide at time zero in both frames. The principle of relativity tells us that $y = y'$ and $z = z'$.
Now, for the time and position along the x-axis we have
$$ t = a t' + b x' \tag{2.29} $$
$$ x = c t' + d x', \tag{2.30} $$
where $a$, $b$, $c$, and $d$ are constant coefficients (here $c$ is a coefficient, not the speed of light).
The principle of relativity tells us that light travels at the same speed $\beta = 1$ in any inertial reference
frame. This means that a light beam sent out in the positive x direction travels at the same speed $\beta = 1$
to the right in both frames. Also, a light beam sent out in the negative x direction travels at the same
speed $\beta = 1$ to the left in both frames. From Eqs. 2.29 and 2.30 we obtain $\Delta t = a\,\Delta t' + b\,\Delta x'$ and
$\Delta x = c\,\Delta t' + d\,\Delta x'$, from which we obtain
$$ \frac{\Delta x}{\Delta t} = \frac{c + d\,\frac{\Delta x'}{\Delta t'}}{a + b\,\frac{\Delta x'}{\Delta t'}}. \tag{2.31} $$
That the pulse travels at the same speed in both directions in both frames gives us
$$ 1 = \frac{c+d}{a+b} \tag{2.32} $$
$$ -1 = \frac{c-d}{a-b}. \tag{2.33} $$
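As a sketch of the step that follows from these two light-speed conditions, $1 = (c+d)/(a+b)$ and $-1 = (c-d)/(a-b)$, one can clear the denominators and then add and subtract the results. Here $a$, $b$, $c$, $d$ are the coefficients of the linear transformation, so $c$ is not the speed of light:

```latex
\begin{align*}
c + d &= a + b, \\
c - d &= -(a - b) = b - a, \\
\text{sum:}\quad 2c &= 2b \;\Rightarrow\; c = b, \\
\text{difference:}\quad 2d &= 2a \;\Rightarrow\; d = a .
\end{align*}
```

Under these two conditions the transformation takes the form $t = a t' + b x'$, $x = b t' + a x'$, leaving two coefficients still to be fixed by further arguments.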
Exercises
1. Consider a rocket frame moving along the positive x axis with speed 0.75. Suppose an event A has
coordinates $t_A = 3$ yr and $x_A = 5$ yr in the lab frame. What are the coordinates of the same event
in the rocket frame? Ans. $(t', x') = (-1.13\ \text{yr}, 4.16\ \text{yr})$.
2. Your friend recorded the location of a certain explosion at (10 m, 25 m). If he is aboard a bus
moving at a velocity -0.75 with respect to your reference frame, what are the coordinates of the
event in your reference frame? Ans. (-13.2 m, 26.5 m).
The Lorentz transformation written down in the form given by Eqs. 2.40, 2.41, 2.42, and 2.43 is
actually not the best way of writing it. It turns out that one can write down the Lorentz transformation
as
$$ t = t' \cosh\theta + x' \sinh\theta \tag{2.45} $$
$$ x = t' \sinh\theta + x' \cosh\theta \tag{2.46} $$
$$ y = y' \tag{2.47} $$
$$ z = z' \tag{2.48} $$
where
$$ \beta = \tanh\theta. \tag{2.49} $$
The quantity $\theta$ is known as the rapidity and we will give more meaning to this later in the course. In
the meantime, performing the Lorentz transformation to find the coordinates of an event in another
inertial reference frame is quicker to input on a calculator using the latter form. Use this for the last
two exercises and show that you arrive at the same answers.
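The hyperbolic form translates directly into code. The sketch below assumes the convention of Eqs. 2.45 and 2.46 (primed rocket coordinates in, lab coordinates out, with the rocket's velocity `beta` measured in the lab) and obtains the inverse by reversing the sign of the velocity. The function names and the variable `theta` for the rapidity are choices made here for illustration.

```python
import math


def rocket_to_lab(t_prime, x_prime, beta):
    """Eqs. 2.45-2.46: lab coordinates of an event given its rocket coordinates.
    beta is the velocity of the rocket frame as measured in the lab."""
    theta = math.atanh(beta)  # rapidity, beta = tanh(theta)
    t = t_prime * math.cosh(theta) + x_prime * math.sinh(theta)
    x = t_prime * math.sinh(theta) + x_prime * math.cosh(theta)
    return t, x


def lab_to_rocket(t, x, beta):
    """Inverse transformation: the same form with the velocity reversed."""
    return rocket_to_lab(t, x, -beta)


if __name__ == "__main__":
    # Exercise 1: lab event (3 yr, 5 yr), rocket moving at +0.75.
    print(lab_to_rocket(3.0, 5.0, 0.75))
    # Exercise 2: bus event (10 m, 25 m), bus moving at -0.75 relative to you.
    print(rocket_to_lab(10.0, 25.0, -0.75))
```

Only one `atanh` and a pair of hyperbolic functions are needed per transformation, which is the calculator convenience the text refers to.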
Exercises
1. Consider two events A (0 yr, -10 yr) and B (0 yr, 10 yr) as measured in the lab frame. According
to the lab frame, which of the two events happened first? Ans. Both happened at the same time.
2. Consider two events A (0 yr, -10 yr) and B (0 yr, 10 yr) as measured in the lab frame. In a rocket
frame moving to the right with speed 0.75, which of these two events will happen first? Ans. Event
B occurred first.
3. Consider two events A (0 yr, -10 yr) and B (0 yr, 10 yr) as measured in the lab frame. In a rocket
frame moving to the left with speed 0.75, which of these two events will happen first? Ans. Event
A occurred first.
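These answers can be checked by computing the rocket-frame time of each event. The sketch below makes one loud assumption: the signs of the positions are not legible in the source, so event A is placed at x = -10 yr and event B at x = +10 yr, which is the assignment consistent with the stated answers. The function names are illustrative.

```python
import math


def rocket_time(t, x, beta):
    """Time of a lab event (t, x) in a rocket frame moving with velocity beta,
    using the inverse of the hyperbolic Lorentz transformation."""
    theta = math.atanh(beta)
    return t * math.cosh(theta) - x * math.sinh(theta)


def first_event(events, beta):
    """Name of the event with the earliest rocket-frame time."""
    return min(events, key=lambda name: rocket_time(*events[name], beta))


# ASSUMPTION: signs of the positions chosen to match the stated answers.
EVENTS = {"A": (0.0, -10.0), "B": (0.0, 10.0)}

if __name__ == "__main__":
    for beta in (0.75, -0.75):
        times = {name: rocket_time(*tx, beta) for name, tx in EVENTS.items()}
        print(beta, times, "first:", first_event(EVENTS, beta))
```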
So, how can we tell if two events can happen at the same time? Say, we have two events A and B
whose coordinates we know in the lab frame. In the rocket frame where these two events happen at the
same time we have $\Delta t' = 0$. Using the invariance of the interval we have
$$ \Delta x'^2 = \Delta x^2 - \Delta t^2. \tag{2.50} $$
This tells us that if there is a rocket frame that can observe events A and B simultaneously, then the
interval squared for events A and B must be spacelike.
How can we tell if two events can happen at the same place? Again, we have two events A and B
whose coordinates we know in the lab frame. In the rocket frame where these two events happen at the
same place we have $\Delta x' = 0$. Using the invariance of the interval we have
$$ \Delta t'^2 = \Delta t^2 - \Delta x^2. \tag{2.51} $$
This tells us that if there is a rocket frame that can observe events A and B at the same place, then the
interval squared for events A and B must be timelike.
Given the spacetime diagram in Figure 30 (according to some lab frame observer) can you tell which
events can happen at the same time/place in some rocket frame? If you are extra clever, can you tell the
speed of the rocket frame where this pair of events can happen at the same time/place? To guide you,
events B and D can happen at the same time in a rocket frame moving with speed 2/6 = 1/3 the speed
of light. Discover if something like this holds for other pairs of events.
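One way to explore this is to compute the required rocket speed from the lab separations. Setting the rocket-frame time separation to zero in the Lorentz transformation gives a speed of (time separation)/(space separation), and setting the space separation to zero gives the reciprocal ratio. The sketch below uses hypothetical function names, and the separations (2, 6) for events B and D are inferred from the quoted ratio 2/6, since Figure 30 is not reproduced here.

```python
def simultaneity_speed(dt, dx):
    """Speed of the rocket frame in which two events are simultaneous.
    Only defined for spacelike separation (|dx| > |dt|)."""
    if abs(dx) <= abs(dt):
        raise ValueError("events are not spacelike separated")
    return dt / dx


def same_place_speed(dt, dx):
    """Speed of the rocket frame in which two events occur at the same place.
    Only defined for timelike separation (|dt| > |dx|)."""
    if abs(dt) <= abs(dx):
        raise ValueError("events are not timelike separated")
    return dx / dt


if __name__ == "__main__":
    # ASSUMPTION: events B and D are separated by 2 units of time and 6 of space.
    print(simultaneity_speed(2.0, 6.0))
```

Both functions return a speed smaller than 1 whenever they return at all, which mirrors the spacelike and timelike conditions stated above.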
Time Dilation
Consider two events A and B. Say, a muon is created (event A), propagates in spacetime by travelling
at uniform speed, and decays (event B). We observe the two events in the lab and conclude that the
temporal separation for the two events is $\Delta t$. According to the muon, the creation and decay happened
at the same place, $\Delta x' = 0$. In the muon frame, its rest frame, the temporal separation between the two
events is equal to the proper time $\Delta\tau$. We ask, what is the relationship between $\Delta t$ and $\Delta\tau$? Using the
Lorentz transformation, setting $\Delta x' = 0$, we immediately obtain
$$ \Delta t = \Delta\tau \cosh\theta, \tag{2.52} $$
where $\theta$ is the rapidity of the muon frame ($\beta = \tanh\theta$).
Since $\cosh\theta \geq 1$ we see that $\Delta t \geq \Delta\tau$. Since the proper time is what we read as the wristwatch time
(of the muon), this is where the saying "moving clocks run slower" comes from. It turns out that we
(observing the two events in the lab) actually age more compared to the muon.
Exercises
1. A muon decays into other particles with a mean lifetime of $2.20 \times 10^{-6}$ s as measured in a reference
frame in which it is at rest. If a muon is moving at 0.990 relative to a lab frame, what will an
observer in this lab measure its mean lifetime to be? Ans. $15.6 \times 10^{-6}$ s.
2. An airplane flies over a distance of $4.80 \times 10^6$ m at a steady speed of 300 m/s ($1.00 \times 10^{-6}$ in natural
units) with respect to a lab frame. How much time does the trip take as measured by an observer
in the lab frame? By an observer in the plane? Ans. $1.6 \times 10^4$ s and $1.6 \times 10^4$ s $\times\ (1 - 0.5 \times 10^{-12})$.
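Both exercises reduce to evaluating the hyperbolic cosine of the rapidity. A minimal sketch, with function names chosen here for illustration:

```python
import math


def lab_time(proper_time, beta):
    """Time dilation: lab time = proper time * cosh(theta), beta = tanh(theta)."""
    return proper_time * math.cosh(math.atanh(beta))


def fractional_lag(beta):
    """Fraction by which the moving clock's elapsed time falls short of the
    lab time: 1 - 1/cosh(theta), approximately beta^2 / 2 for small beta."""
    return 1.0 - 1.0 / math.cosh(math.atanh(beta))


if __name__ == "__main__":
    # Exercise 1: muon with proper mean lifetime 2.20e-6 s moving at 0.990.
    print(lab_time(2.20e-6, 0.990))
    # Exercise 2: airplane at 1.00e-6 of light speed; lab time is distance / speed.
    print(4.80e6 / 300.0, fractional_lag(1.00e-6))
```

For the airplane the lag is so small that it sits close to the limit of double-precision arithmetic, so the printed value should be read as an order-of-magnitude check.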
Length Contraction
As with time dilation, it should not come as a surprise that the length of a stick as measured in two
frames moving at some relative speed with respect to each other is different. To give more insight, we
ask, how is the length of a stick to be measured in an inertial reference frame? The length of a stick is
measured by reading out the spatial coordinates of both ends of the stick at the same time. This implies
two events. These two events which are simultaneous in, say, a rocket frame, will not be simultaneous in
other reference frames. We know this now and this explains why the lengths are different. Now we ask:
how are the lengths of the stick in the two frames related?
To get the answer to this last question we consider two simultaneous events corresponding to reading
the spatial coordinates of both ends of the stick in a rocket frame in which the stick is at rest. We know
that the value that comes out of this is the proper distance between the events. We can call this the
proper length $\ell$. What does an observer in the lab frame measure? $\Delta x$. According to this lab observer,
$\Delta x' = \ell = -\Delta t \sinh\theta + \Delta x \cosh\theta$, where $\theta$ is the rapidity of the rocket ($\beta = \tanh\theta$). Since the lab
observer reads both ends of the stick at the same lab time, $\Delta t = 0$ and we have
$$ \Delta x = \ell / \cosh\theta. \tag{2.53} $$
Because $\cosh\theta > 1$ for any nonzero relative speed, we see that $\Delta x < \ell$ and we have contraction.
Exercises
1. A cylindrical rocket has radius 5.00 m and length 20.0 m in its rest frame (the rocket frame). This
rocket was observed in the lab frame to be moving along the direction of its length, at constant
velocity -0.700. What is the radius and length of this rocket according to the lab frame? Ans.
Radius = 5.00 m, Length = 14.3 m
2. Megaman X boards a spaceship and then zips past Zero (at rest in a lab frame) at a relative speed
of 0.600. Zero starts to blink just as X flies past him, and X measures that the blink takes 0.400 s
from beginning to end. According to Zero, what is the duration of his blink?
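The first exercise can be checked numerically. This sketch applies the contraction factor only along the direction of motion, consistent with the earlier statement that coordinates perpendicular to the relative motion are unchanged; the function name is illustrative.

```python
import math


def contracted_length(proper_length, beta):
    """Length along the direction of motion as measured in the lab:
    proper length divided by cosh(theta), with beta = tanh(theta)."""
    return proper_length / math.cosh(math.atanh(beta))


if __name__ == "__main__":
    # Exercise 1: rocket of proper length 20.0 m and radius 5.00 m moving at -0.700.
    radius = 5.00  # perpendicular to the motion, so unchanged
    print(radius, contracted_length(20.0, -0.700))
```

The sign of the velocity does not matter because the hyperbolic cosine is an even function.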
Let us consider two inertial reference frames whose relative motion is along the x-direction. In this
case, we know that the coordinates $(t', x')$ of one frame are related to the coordinates $(t, x)$ of the other
frame by the Lorentz transformation, Eqs. 2.45 and 2.46. The time axis is the worldline corresponding to the
origin of the inertial reference frame. The position axis is the locus of events that occur simultaneously
with the temporal origin of the inertial reference frame. This is what we ask: In the spacetime diagram of
a lab frame, say, with coordinates $(t, x)$, what do the time axis and the position axis of a rocket frame
with coordinates $(t', x')$ look like? Let us simplify the answer to this question by considering inertial
reference frames whose origins coincide at the zero of time.
The time axis of the rocket frame is basically the worldline of its origin $x' = 0$. To draw this
trajectory, we set $x' = 0$ in the Lorentz transformation equations to obtain $t = t' \cosh\theta$ and $x = t' \sinh\theta$.
The orientation of the time axis of the rocket frame is then given by $x/t = \tanh\theta = \beta$. Thus, it is a line
tilted with respect to the time axis of the lab frame with a slope given by the speed of the rocket frame
relative to the lab. What about the $x'$-axis? Since this is the locus of events that occur simultaneously
with $t' = 0$ in the rocket frame, we set $t' = 0$ in the Lorentz transformation. This gives $t = x' \sinh\theta$
and $x = x' \cosh\theta$. The orientation of the $x'$-axis of the rocket frame as viewed in the lab is given by
$t/x = \tanh\theta = \beta$. The two-observer spacetime diagram for a lab frame and a rocket frame moving
with velocity 0.25 with respect to the lab is shown below.
Figure 31: Two-observer spacetime diagram for a lab frame and a rocket frame moving with speed 0.25 to the
right relative to the lab frame.
Now, how does the two-observer spacetime
diagram serve our purpose of determining the chronological order of events as viewed in different inertial
reference frames? Simple. Events that are simultaneous in an inertial reference frame lie along a line
that is parallel to the position axis of that inertial reference frame. Similarly, events that occur in the
same place in an inertial reference frame lie along a line that is parallel to the time axis of that inertial
reference frame.
Exercise
1. Draw the two-observer spacetime diagram for a lab frame and a rocket frame moving with a speed
0.25 to the left with respect to the lab.
3. Consider the events A (0 yr, -10 yr) and B (0 yr, 10 yr) as measured by an observer in a frame
that we shall call the lab frame. By drawing a two-observer spacetime diagram, determine the
chronological order of the events as viewed in a rocket frame that is moving to the right with speed 3/4
according to the lab frame. Also, using the two-observer spacetime diagram technique, determine
the chronological order of events in a rocket frame that is moving to the left with speed 3/4 relative
to the lab frame.
4. Explore time dilation and length contraction using the two-observer spacetime diagram technique.
Hint: Calibrate the temporal measurements by drawing the invariant hyperbolas. For example, the
1 m of time on the time axis of both frames corresponds to the intersection points of the hyperbola
$1\ \text{m}^2 = t^2 - x^2$ with the $t$ and $t'$ axes. Similarly, the 1 m of position along the position axis of both
frames corresponds to the intersection points of the hyperbola $1\ \text{m}^2 = x^2 - t^2$ with the $x$ and $x'$ axes.
$$ \theta_{XL} = \theta_{RL} + \theta_{XR}. \tag{2.55} $$
Let us now make a few steps backward and derive the velocity parameter addition rule Eq. 2.55.
Start with the Lorentz transformation for temporal and spatial separations:
Let the two events correspond to the passage of the object X through lattice clocks in the lab and rocket
frames. Therefore, we have $\Delta x/\Delta t = \beta_{XL}$ and $\Delta x'/\Delta t' = \beta_{XR}$ for the velocity of object X as measured
in the lab and rocket frames, respectively. The velocity of the rocket frame with respect to the lab frame
is of course given by $\beta_{RL} = \tanh\theta$. Dividing out the above equations for the coordinate separations gives
$$ \frac{\Delta x}{\Delta t} = \frac{\Delta t' \sinh\theta + \Delta x' \cosh\theta}{\Delta t' \cosh\theta + \Delta x' \sinh\theta} = \frac{\tanh\theta + \frac{\Delta x'}{\Delta t'}}{1 + \frac{\Delta x'}{\Delta t'} \tanh\theta}. \tag{2.58} $$
Exercise
1. A spaceship is moving at a speed 0.70 with respect to a rocket moving at a speed 0.50 as measured
by the lab frame. What is the speed of the spaceship as measured by the lab frame? Solve the
problem in two ways: (i) direct use of the relativistic velocity addition rule (ii) use of the rapidity.
2. A spaceship is moving at a speed -0.70 with respect to a rocket moving at a speed 0.50 as measured
by the lab frame. What is the speed of the spaceship as measured by the lab frame?
Consider a light source (say, a lamp) that emits photons at a constant time interval of $\tau$ as measured
in the light source's rest frame. Let this light source be carried by a rocket frame that is moving with
a speed $\beta = \tanh\theta$ relative to a lab frame. In the rocket frame, the light source is at rest. Therefore,
the rocket frame is the rest frame of the source and the observers in the rocket frame would say that
the frequency of the source is $f_S = 1/\tau$. The question we would like to ask is: What is the frequency
of the light that is emitted by the source as measured by observers in the lab frame? To get the
answer, we look closely at two adjacent photon emission events $E_1$ and $E_2$ shown in the two-observer
spacetime diagram below.
Figure 32: Relativistic Doppler effect: the time intervals between the emission and reception of photons in the two
inertial reference frames are different.
In the rocket frame, the coordinates of events $E_1$ and $E_2$ are $(t', 0)$ and
$(t' + \tau, 0)$, respectively. In the lab, the coordinates of $E_1$ and $E_2$ are given by $(t' \cosh\theta, t' \sinh\theta)$ and
$((t' + \tau) \cosh\theta, (t' + \tau) \sinh\theta)$, respectively. Since the photons travel through diagonal lines oriented
Therefore, the frequency $f_R$ that is detected by the receiver is related to the frequency of the source $f_S$
by
$$ f_R = f_S \sqrt{\frac{1 - \beta}{1 + \beta}}. \tag{2.62} $$
This is the relativistic expression for the Doppler effect. To get the relationship for wavelengths simply
take the reciprocal of the above relation.
Eq. 2.62 tells us that the frequency that is detected by a receiver is less than the frequency of the
source if the source is moving away ($\beta > 0$) from the receiver. We say that the light is red-shifted
(recall that in terms of frequency red is at the lower end of the visible spectrum). On the contrary, if the
source is moving towards the receiver ($\beta < 0$) the frequency that is detected by the receiver is greater
than the frequency of the source. In this case, we say that the light is blue-shifted. Can you guess
why? One of the hints that led to the prediction of the expansion of the universe is that light
received on Earth from all directions is always red-shifted. Can you explain why?
Exercises
1. How fast must you be approaching a red traffic light ($\lambda = 675$ nm) for it to appear yellow ($\lambda =
575$ nm)? Ans. 0.159 or $4.77 \times 10^7$ m/s
2. A source of electromagnetic radiation is moving in radial direction relative to you. The frequency
you measure is 1.25 times the frequency measured in the rest frame of the source. Is the source
moving toward you or away from you? What is the speed of the source relative to you?
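The first exercise can be checked by inverting the Doppler relation $f_R = f_S \sqrt{(1-\beta)/(1+\beta)}$ for the velocity. The sketch below follows the sign convention of the text (positive velocity means the source recedes) and uses function names chosen here for illustration.

```python
import math


def received_frequency(f_source, beta):
    """Relativistic Doppler formula; beta > 0 means the source is moving away."""
    return f_source * math.sqrt((1.0 - beta) / (1.0 + beta))


def source_velocity(frequency_ratio):
    """Invert the Doppler formula: velocity of the source given f_R / f_S.
    A negative result means the source is approaching."""
    r2 = frequency_ratio ** 2
    return (1.0 - r2) / (1.0 + r2)


if __name__ == "__main__":
    # Exercise 1: red (675 nm) must appear yellow (575 nm); f_R / f_S = 675 / 575.
    beta = source_velocity(675.0 / 575.0)
    print(beta, abs(beta) * 3.0e8, "m/s")
```

Since frequency and wavelength are reciprocals of each other, the frequency ratio is the inverse of the wavelength ratio.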
$$ \Delta\tau^2 = \Delta t^2 - \Delta x^2. \tag{2.63} $$
This proper time is the same as calculated in different inertial reference frames. We can say that the
proper time is invariant under Lorentz transformation or, simply, it is Lorentz invariant.
We are now ready to tackle the concepts of energy and momentum in special relativity. Although
these are different from the Newtonian values of the energy and momentum, we see that we are guided
by the Newtonian theory in building a more satisfactory theory of nature.
What are some of the difficulties encountered in applying the Newtonian theory of mechanics to
relativistic particles? First, it turns out that the quantity $m\beta$ for a particle of mass $m$ and velocity $\beta$, which
is called the Newtonian momentum, is not conserved in high energy collisions. There are two ways to proceed
from this point. We can keep this form of momentum in special relativity but accept that momentum
is not going to be conserved in collisions; that is, the sums of the momentum vectors before and after a
collision are not equal. The other way is to drop the Newtonian expression of momentum
$m\beta$ completely and look for a new one that is going to be conserved in a collision. The conservation of momentum
and energy has guided and simplified the analysis of many problems not only in physics but in science
as a whole. We prefer and choose the latter solution: to find the conserved quantity that we shall
call the momentum. Second, Newtonian mechanics cannot explain the production of particle-antiparticle
pairs even in terms of its conservation laws. But these phenomena are very common these days. Try
Relativistic Momentum
So where do we begin? We start by finding the expression for the relativistic momentum and energy
that is conserved before and after a collision. We do this by considering a so-called glancing collision as
shown in the figure below.
Figure 33: Glancing collision of identical masses A and B as viewed in three inertial reference frames.
Momentum is a vector. This means that the first thing we must establish
is the direction in which it points. The only unique direction we can actually choose is
the direction of the motion of the particle. If the momentum vector were oriented at an angle relative to
the direction of motion, then there would be an infinite set of possible choices corresponding to
each possible direction oriented at the same angle. The isotropy of space does not allow us to prefer
any one of these vectors. Thus, we choose the momentum to be parallel to the direction of motion.
Guided by symmetry we say that
The direction of the momentum of a particle is parallel to the direction of motion of the particle.
Let us now calculate the magnitude, guided by the Newtonian expression for momentum. Let
us (in a lab frame) arrange the encounter such that mass A is relativistic while mass
B is non-relativistic. This means that we can use the non-relativistic expression of momentum for the
relativistic momentum of mass B. We can always look at the collision in the frame where the velocities
of the masses point in opposite directions. In this velocity-symmetric frame, we can easily argue
that the total momentum before the collision is zero. We then require that the total momentum after
the collision be zero as well. This frame where the total momentum is zero is shown in Figure 33. In some
rocket frame that is moving to the left, the initial and final momentum of particle B along the horizontal
is zero. We call this frame $S_B$. In another rocket frame that is moving to the right at
a sufficient speed, the horizontal component of the momentum of particle A is zero. We
call this frame $S_A$. We choose the encounter such that we can use the Newtonian expression $m\,\Delta y/\Delta t$, to a
good approximation, for mass B. We analyze the collision in frame $S_B$. Note that for mass B the proper
time $\Delta\tau$ from before the collision to the point of impact, and from immediately after impact to the final
position, is almost the same as the coordinate time $\Delta t$. Looking at frame $S_B$, the conservation of momentum along
the vertical direction leads us to
$$ \text{vertical component of } p_A = m \frac{\Delta y}{\Delta\tau}. \tag{2.64} $$
By the symmetry of frames $S_B$ and $S_A$, we are further led to conclude that the proper times of A and B
are the same. The relativistic expression for the momentum of mass A can be obtained by using similar
triangle identities for the displacement and momentum diagrams (Figure ). The resulting expression is
given by
$$ p = m \frac{dr}{d\tau}. \tag{2.65} $$
Using the expression for the proper time we can write this down as
$$ p = m \frac{dr}{\sqrt{dt^2 - dr^2}} = m \frac{dr/dt}{\sqrt{1 - (dr/dt)^2}} = \frac{m\beta}{\sqrt{1 - \beta^2}} \tag{2.66} $$
or simply
$$ p = m \sinh\theta, \tag{2.67} $$
where $\theta$ is the rapidity, $\beta = \tanh\theta$.
Eq. 2.67 is the relativistic expression for momentum which is conserved in particle interactions. Because
its derivation is guided by the Newtonian expression for momentum it naturally reduces to $m\beta$ for very
small velocities $\beta \ll 1$.
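The two forms of the relativistic momentum and its Newtonian limit can be compared numerically. A minimal sketch in natural units, with function names chosen here for illustration:

```python
import math


def momentum(mass, beta):
    """Relativistic momentum m * beta / sqrt(1 - beta^2) in natural units."""
    return mass * beta / math.sqrt(1.0 - beta ** 2)


def momentum_from_rapidity(mass, theta):
    """Equivalent form m * sinh(theta), with beta = tanh(theta)."""
    return mass * math.sinh(theta)


if __name__ == "__main__":
    m = 1.0
    for beta in (1e-4, 0.1, 0.6, 0.99):
        newtonian = m * beta
        print(beta, newtonian, momentum(m, beta),
              momentum_from_rapidity(m, math.atanh(beta)))
```

At small speeds the relativistic and Newtonian columns coincide, while near light speed the relativistic momentum grows without bound.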
Relativistic Energy
We took one step forward when we started talking about momentum and energy. We take another
step here by considering how to generalize the energy of a particle. In line with
this goal we introduce the notion of the 4-vector.
A 4-vector is labelled by four numbers just as the vectors that we know in three dimensions are
labelled by three numbers. We label the 4-vector as (time component, 3 spatial components). An
example of a 4-vector is the displacement vector $(dt, dx, dy, dz)$. What we know about the displacement
4-vector is that its components transform from one frame to another by the Lorentz transformation. In
general, 4-vectors are defined such that their components transform under the Lorentz transformation when
we go from one frame to another. To construct 4-vectors we can multiply known 4-vectors by scalar
quantities. Our goal now is to construct a 4-vector whose components correspond to the energy and the
momentum. Why? It is known from the theory of tensors that when three components of a 4-vector are
conserved then the remaining component is also conserved. Since we defined the momentum to be a
conserved quantity, if we can construct a 4-vector whose three spatial components correspond to the momentum then
the fourth component will be conserved. This is one reason to identify the fourth component with the energy.
The proper time is a scalar (invariant). We can divide the displacement vector by the proper time $d\tau$
to construct the so-called unit tangent vector: $(dt/d\tau, dx/d\tau, dy/d\tau, dz/d\tau)$. This result is a 4-vector since
we have essentially multiplied a 4-vector (whose components transform under the Lorentz transformation, LT)
with a scalar (which does not change under LT). The resulting quantity then changes components under LT.
We then multiply this tangent vector with the mass $m$ of a particle. The resulting 4-vector is
$$ p = \left( m\frac{dt}{d\tau},\ m\frac{dx}{d\tau},\ m\frac{dy}{d\tau},\ m\frac{dz}{d\tau} \right). \tag{2.68} $$
Notice that the last three components of this vector $p$ are the momentum that we have derived. This
momentum is conserved in all inertial frames of reference because this is how we have constructed it.
Because the momentum is conserved in all inertial reference frames, the remaining component $m\,dt/d\tau$
must be conserved as well.
Before calling $m\,dt/d\tau$ the energy let us give some arguments. First, we want the energy to be
conserved. We have always been guided by the conservation of energy (in geology, chemistry, biology,
etc.). Second, this component has the same units as the momentum and we know that energy (in natural
units) has the same units as momentum. Third, we show that it reproduces the Newtonian expression
for the kinetic energy, $\tfrac{1}{2} m\beta^2$, at very small velocities:
\[ m\frac{dt}{d\tau} = m\,\frac{dt}{\sqrt{dt^2 - dx^2}} = \frac{m}{\sqrt{1 - \beta^2}}. \qquad (2.69) \]
For $\beta \ll 1$ the expression above reduces to
\[ m\frac{dt}{d\tau} = m + \frac{1}{2} m\beta^2 + \text{higher-order terms}. \qquad (2.70) \]
The second term is what we know to be the Newtonian expression for the kinetic energy. These arguments
give us enough reason to call the remaining component of p the energy. Thus, we write down the energy
E of a particle of mass m as
\[ E = m\frac{dt}{d\tau} = \frac{m}{\sqrt{1 - \beta^2}} \qquad (2.71) \]
or equivalently
\[ E = m \cosh\theta. \qquad (2.72) \]
Nature turns out to combine the momentum and energy into a single energy-momentum 4-vector.
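The low-velocity limit in Eq. 2.70 can be checked numerically. This is a minimal sketch in natural units (c = 1); the helper names are illustrative and not part of the notes.

```python
import math

def energy(m, beta):
    """Relativistic energy E = m/sqrt(1 - beta^2), natural units (c = 1)."""
    return m / math.sqrt(1.0 - beta**2)

def newtonian_estimate(m, beta):
    """Rest term plus Newtonian kinetic energy: m + (1/2) m beta^2."""
    return m + 0.5 * m * beta**2

for beta in (0.01, 0.1, 0.5):
    exact = energy(1.0, beta)
    approx = newtonian_estimate(1.0, beta)
    # The difference is the size of the neglected higher-order terms
    print(beta, exact, approx, exact - approx)
```

The difference grows roughly as $\beta^4$, which is why the Newtonian expression is adequate only for $\beta \ll 1$.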
Kinetic Energy
By expanding the expression for the energy, Eq. 2.72, in terms of the rapidity $\theta$ or, equivalently, in
terms of the speed $\beta$, we find that
\[ E = m\{1 + \text{terms that depend on } \theta \text{ or } \beta\} = m + \{\text{motion-dependent terms}\}. \qquad (2.73) \]
The relativistic expression for the energy suggests that the total energy of a system
can be divided into a term that does not depend on the state of motion of the particle and a term that
depends on the state of motion. The motion-independent term is m and we call it the rest energy (in
natural units it equals the rest mass). In conventional units this is $mc^2$. The first of the
motion-dependent terms is $\tfrac{1}{2} m\beta^2$, which is the expression for the Newtonian kinetic energy.
We borrow the Newtonian idea of kinetic energy as that part of the energy which depends on the state of
motion of the particle. Thus all the motion-dependent terms in the last expression together are what we
call the kinetic energy. The energy of a particle can then be written as
E =m+T (2.74)
where T is the kinetic energy.
Exercises
1. According to a lab frame, Albert Einstein throws a particle which then moves with velocity 0.900.
What are the particle's energy and momentum according to the lab if the particle has a mass of 10
kg? Ans. 22.9 kg and 20.6 kg
2. An object of mass 300 MeV moves 8 m along the x-direction in 10 m of time as measured in the
lab. What are its energy and momentum? Ans. E = 500 MeV and p = 400 MeV
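The quoted answers can be reproduced with Eqs. 2.66 and 2.71. The sketch below works in natural units, so the speed is the dimensionless $\beta$ and energy and momentum carry the units of the mass; the function name is illustrative.

```python
import math

def energy_momentum(m, beta):
    """Return (E, p) for mass m and speed beta in natural units (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * m, gamma * m * beta

# Exercise 1: m = 10 kg, beta = 0.900
E1, p1 = energy_momentum(10.0, 0.900)
# Exercise 2: m = 300 MeV, 8 m of distance covered in 10 m of time, so beta = 0.8
E2, p2 = energy_momentum(300.0, 8.0 / 10.0)

print(round(E1, 1), round(p1, 1))
print(round(E2), round(p2))
```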
E = m. (2.75)
Now, take the square of the energy and the momentum and subtract the resulting expressions:
\[ E^2 - p^2 = m^2\cosh^2\theta - m^2\sinh^2\theta = m^2\left(\cosh^2\theta - \sinh^2\theta\right). \qquad (2.76) \]
But we know that for any value of $\theta$ we have $\cosh^2\theta - \sinh^2\theta = 1$. Therefore, we arrive at the
following relationship between the energy, the momentum, and the mass:
\[ E^2 - p^2 = m^2. \qquad (2.77) \]
Notice that this is very similar to the expression for the invariant interval ($t^2 - x^2 = \tau^2$). In fact,
we can show that the mass (just like the interval) is invariant under Lorentz transformations.
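The invariance of the mass can be illustrated numerically. The sketch below assumes that (E, p) transforms like (t, x) under a boost of rapidity $\theta$ along x, which is the defining property of a 4-vector stated earlier; the sign convention of the boost and the function names are assumptions made for illustration.

```python
import math

def boost(E, p, theta):
    """Transform (E, p) to a frame moving along x with rapidity theta (c = 1)."""
    E_new = E * math.cosh(theta) - p * math.sinh(theta)
    p_new = p * math.cosh(theta) - E * math.sinh(theta)
    return E_new, p_new

def invariant_mass(E, p):
    """Mass from Eq. 2.77: m = sqrt(E^2 - p^2)."""
    return math.sqrt(E**2 - p**2)

E, p = 500.0, 400.0   # MeV, the values from the exercise with m = 300 MeV
for theta in (0.0, 0.5, 2.0):
    E_b, p_b = boost(E, p, theta)
    # E and p change from frame to frame, the mass does not
    print(theta, E_b, p_b, invariant_mass(E_b, p_b))
```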
Exercise
A particle with mass 1 kg moves in the positive x-direction in the lab with kinetic energy equal to three
times its rest energy. What are the particle's energy, velocity and momentum?
Using these we can confirm the results of the example that we have considered earlier.
Exercises
1. According to the lab frame, a 1-kg particle moves with a constant velocity of $0.5\,\hat{j}$. What are
the energy and the momentum along the y-axis of the particle as seen by a rocket frame moving at a
velocity $-0.25\,\hat{j}$ with respect to the lab frame? Ans. 1.34 kg and 0.89 kg.
and
\[ \cos\phi' = \frac{\cos\phi - \tanh\theta}{1 - \tanh\theta\,\cos\phi}. \qquad (2.83) \]
Exercise
A lab frame emits a photon with frequency 3 MHz at an angle of 30° with respect to its x-axis. What
are the measured frequency and angle of this photon in a rocket frame moving at a velocity of 0.50 along
the x-axis with respect to the lab frame? Ans. $f' = 1.96$ MHz, $\phi' = 49.8°$.
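The quoted answer can be reproduced with a short calculation. The sketch below uses Eq. 2.83 for the angle together with the standard relativistic Doppler formula $f' = \gamma f (1 - \beta\cos\phi)$, which is assumed here to be the companion equation to Eq. 2.83; the angle symbol $\phi$ and the function name are illustrative choices.

```python
import math

def transform_photon(f, phi_deg, beta):
    """Frequency and direction of a photon seen from a frame moving at beta along x."""
    phi = math.radians(phi_deg)
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # Assumed Doppler formula (standard result, not quoted in this section)
    f_prime = gamma * f * (1.0 - beta * math.cos(phi))
    # Eq. 2.83 with tanh(theta) = beta
    cos_phi_prime = (math.cos(phi) - beta) / (1.0 - beta * math.cos(phi))
    return f_prime, math.degrees(math.acos(cos_phi_prime))

f_prime, phi_prime = transform_photon(3.0, 30.0, 0.50)   # MHz, degrees
print(round(f_prime, 2), round(phi_prime, 1))
```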
This completes our discussion of the special theory of relativity in this course. In the
beginning we talked about events, how these are measured, and how different observers see these
events. We talked about Lorentz transformations and derived them from the invariance of the
interval. Finally, we discussed the relativistic generalizations of the concepts of energy and momentum.
The principle of relativity is what allows us to arrive at all these.
We have achieved enough for an introductory course. To end, we give an example and leave problems
that can be analyzed now that we have learned about the concepts of energy, momentum and mass
and how these three are intertwined with each other.
Let each of the photons have an energy E . The threshold energy can be analyzed in the frame where
the two photons have equal and opposite momenta. In this frame, the total energy of the system before
collision is
Etotal = 2E (2.84)
while the total momentum is
ptotal = 0. (2.85)
Because the total momentum is zero and the momentum is conserved, the electron-positron pair must
emerge in opposite directions with equal magnitudes of momentum. Let $m_e$ be the mass of the electron
and the positron. If the electron-positron pair emerges with nonzero momenta $p_e$ then each would
have an energy $E_e = \sqrt{p_e^2 + m_e^2}$. It can be seen that the smallest value of this is equal to the rest energy
$m_e$ of an electron. The minimum energy needed to produce the electron-positron pair then corresponds
to the case where the electron-positron pair is at rest when created. The total energy of the system after
collision is
Etotal = 2me (2.86)
Although the analysis of particle production processes is more complicated (involving the laws of
probability and quantum mechanics and its own set of conservation laws) we have made a step forward.
At the very least, we can collide particles and know the minimum energy that we need to supply to
produce a certain set of particles after the collision.
Figure 35: Monochromatic light is incident on a cathode (made up of some metal). The galvanometer reveals
that a current (we call this the photocurrent) flows when the frequency of the incident light is greater than
some cutoff frequency.
The main conclusion from the photoelectric effect experiment (Hallwachs and Lenard, 1886-1900) is as
follows:
No electrons were emitted unless the frequency of light is greater than some minimum value.
So what? Can this not be explained by the classical wave viewpoint of the electromagnetic wave? Let us
give this wave viewpoint an attempt. The energy carried by an electromagnetic wave is proportional to
the magnitude of the Poynting vector, $|\vec{S}| \propto E^2$, where E is the magnitude of the electric field. If this were
the whole story, then we would expect the magnitude of the current to be independent of the frequency of the light
source. Thus, if we increase the intensity of the light (whatever its frequency is) we are going to increase
the number of photoelectrons emitted by the cathode. This leads to an increase in the current
detected by the galvanometer. This explanation however is readily refuted by the experimental result
that there is a cutoff frequency below which no current flows, and a cutoff voltage that depends on the
frequency of the light source.
We call the minimum frequency of light needed to produce a photocurrent the threshold frequency
while the ejected electrons are called photoelectrons. The explanation to the photoelectric effect was
offered by Einstein by borrowing the Planck postulate that light is made up of quanta (photons) each of
which carries an energy
E = hf (3.1)
where f is the frequency of the light. At the time Planck made the postulate the value of the constant h
remained undetermined. It was the photoelectric effect experiment that provided the experimental value
of the constant h:
\[ h = 6.626 \times 10^{-34}\ \text{J s} \qquad (3.2) \]
To determine the maximum kinetic energy of the emitted electrons we make the potential of the anode
relative to the cathode negative enough so that the current stops. In this case we get the value of the
cutoff voltage experimentally as
\[ V_{AC} = -V_0. \qquad (3.3) \]
Through the work-energy theorem we obtain the following relationship between the maximum kinetic
energy of the emitted electrons and the stopping voltage $V_0$:
\[ KE_{max} = \frac{1}{2} m v_{max}^2 = eV_0. \qquad (3.4) \]
Using Planck's hypothesis the maximum kinetic energy of the emitted electrons can be written as the
difference between the energy of the photon (hf) that strikes the metal surface and the work function
$\phi$ characterizing the strength of the binding of the electrons to the cathode. Thus,
\[ eV_0 = hf - \phi. \qquad (3.5) \]
The story behind Eq. 3.5 is simple (see figure). The electrons are bound to the metal with a binding
energy that we call the work function. The cutoff frequency for a particular cathode material can be
determined by setting the maximum kinetic energy of the emitted electrons to zero. This leads to
\[ f_{cutoff} = \frac{c}{\lambda_{cutoff}} = \frac{\phi}{h}. \qquad (3.6) \]
Figure 37: Photoelectric effect: Electron that is bound to the metal is targeted by a photon. The electron can
only get ejected if the energy of the photon is greater than the binding energy of the electron to the metal.
2. For a certain cathode material in a photoelectric effect experiment you require a stopping potential
of 1.0 V for light of wavelength 600 nm, 2.0 V for 400 nm, and 3.0 V for 300 nm. Determine the
work function for this material and the value of Planck's constant. Ans. 1.0 eV and $6.4 \times 10^{-34}$ J s
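One way to approach this kind of problem is to note that Eq. 3.5 makes the stopping potential a straight line in the frequency, $V_0 = (h/e) f - \phi/e$. The sketch below fits that line by ordinary least squares; the rounded constants and the helper name `linear_fit` are illustrative assumptions.

```python
C = 2.998e8            # speed of light in m/s (rounded)
E_CHARGE = 1.602e-19   # elementary charge in C (rounded)

wavelengths_nm = [600.0, 400.0, 300.0]
stopping_volts = [1.0, 2.0, 3.0]

# Convert wavelengths to frequencies, f = c / lambda
freqs = [C / (w * 1e-9) for w in wavelengths_nm]

def linear_fit(xs, ys):
    """Least-squares slope and intercept of y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

slope, intercept = linear_fit(freqs, stopping_volts)
h_est = slope * E_CHARGE   # slope is h/e, so h = slope * e (in J s)
phi_eV = -intercept        # intercept is -phi/e, so phi in eV is -intercept
print(h_est, phi_eV)
```

The fitted slope and intercept reproduce the quoted answers to the stated precision.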
Figure 38: Electrons are ejected from the cathode through thermal excitation (heating) and strike the anode,
producing x-rays.
by $V_0$. When the temperature is very large, the thermal excitation energy kT (coming from random
molecular motion) can provide sufficient energy for the electrons to be ejected and to accelerate
through the region between the cathode and the anode. When the ejected electron hits the anode it
suffers a very violent acceleration and emits a photon. In principle we must also take into account the
binding energy of the electrons to the cathode and the anode. But in Bremsstrahlung this binding energy
is very small compared to the thermal excitation energy. Since this thermal excitation energy provides
the energy of the released electrons we have $kT \sim KE$. The electrons which are accelerated between the
cathode and the anode gain a kinetic energy $eV_0$ (using the work-energy theorem). When the ejected
electrons hit the anode then
\[ eV_0 = hf = hc/\lambda. \qquad (3.7) \]
The frequency and wavelength of the emitted photons are (to a good approximation) given by
Compton scattering refers to the shift in wavelength of an electromagnetic wave that is
scattered by target electrons. According to the classical viewpoint, in which light is a wave, the
wavelength of the scattered light should be the same as that of the incident light. Compton scattering
can instead be explained by viewing the effect as a collision between
a photon and an electron. The conservation laws for energy and momentum lead to the following:
\[ E + m = E' + E_e \qquad (3.9) \]
\[ E = E'\cos\theta + p_e\cos\phi \qquad (3.10) \]
\[ 0 = E'\sin\theta - p_e\sin\phi, \qquad (3.11) \]
where $\theta$ is the scattering angle of the photon and $\phi$ is the recoil angle of the electron.
A long but workable calculation using these equations leads to the Compton scattering equation
\[ \lambda' - \lambda = \frac{h}{m}\left(1 - \cos\theta\right). \qquad (3.12) \]
In conventional units,
\[ \Delta\lambda = \lambda' - \lambda = \frac{h}{mc}\left(1 - \cos\theta\right). \qquad (3.13) \]
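Eq. 3.13 is simple to evaluate numerically. The sketch below uses rounded SI constants and an illustrative function name; it computes the shift for a single scattering and adds the shifts of two successive scatterings, since each shift depends only on the scattering angle and not on the incoming wavelength.

```python
import math

H = 6.626e-34      # Planck's constant in J s (rounded)
M_E = 9.109e-31    # electron mass in kg (rounded)
C = 2.998e8        # speed of light in m/s (rounded)

def compton_shift(angle_deg):
    """Wavelength shift (in m) for a photon scattered through angle_deg, Eq. 3.13."""
    return (H / (M_E * C)) * (1.0 - math.cos(math.radians(angle_deg)))

# Two successive scatterings, through 30 and then 20 degrees
total = compton_shift(30.0) + compton_shift(20.0)
print(total)
```

The prefactor $h/mc \approx 2.43 \times 10^{-12}$ m sets the scale of the effect, so the shift is only noticeable for x-ray and gamma-ray wavelengths.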
Exercises
2. A photon with wavelength of 400 nm hits a stationary electron and is scattered at an angle of 30°.
The scattered photon hits another stationary electron and is scattered by an angle of 20°. What is
the total shift in the wavelength of the photon? Ans. $4.7 \times 10^{-13}$ m
3. An electron with kinetic energy 4.5 keV hits a metal surface. What is the minimum wavelength of
the photon emitted by the metal? Ans. $2.76 \times 10^{-10}$ m
4. An electron was accelerated from rest using a potential difference of 30 kV and hits a metal plate.
If the x-rays emitted have wavelength $5.0 \times 10^{-11}$ m how much energy was absorbed by the metal
plate? Ans. 5.2 keV
5. A photon with wavelength $4.75 \times 10^{-11}$ m is scattered by an electron at rest. If the scattering angle
of the photon is 60°, what is the kinetic energy of the electron after the collision?
What may come as a surprise at this point is that particles (such as electrons, protons, and
neutrons) also act as waves, i.e. they exhibit interference patterns when sent through a double-slit
experiment. See the two-slit interference experiment with electrons and the Davisson-Germer experiment on
electron diffraction (reading assignment). Given this, what wavelength are we to associate with
electrons and other elementary particles? It was de Broglie who took the big step and wrote down
the wavelength for matter waves as
\[ \lambda = \frac{h}{p} \qquad (3.14) \]
where h is Planck's constant and p is the momentum, say, of the electron. The inspiration for writing
down the wavelength for matter in this form comes from the expression for the energy and momentum
of a photon, $pc = E = hc/\lambda$. Since photons behave as a wave and as a particle, maybe electrons also
behave as a wave and as a particle, and we can assume the same relationship between the wavelength
and the momentum. Given that Eq. 3.14 was inspired by a relativistic expression, the momentum p
appearing in Eq. 3.14 is also relativistic. For example, an electron with mass m and rapidity $\theta$ will
have the momentum $p = m \sinh\theta$ (Eq. 2.67).
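As a concrete illustration, the de Broglie wavelength can be evaluated with the relativistic momentum written in SI units, $p = mv/\sqrt{1 - v^2/c^2}$. The constants below are rounded and the function name is illustrative.

```python
import math

H = 6.626e-34     # Planck's constant in J s (rounded)
C = 2.998e8       # speed of light in m/s (rounded)
M_P = 1.673e-27   # proton mass in kg (rounded)

def de_broglie_wavelength(mass_kg, speed):
    """lambda = h / p with the relativistic momentum p = m v / sqrt(1 - v^2/c^2)."""
    beta = speed / C
    p = mass_kg * speed / math.sqrt(1.0 - beta**2)
    return H / p

# A proton moving at the speed of sound: the relativistic correction is negligible
print(de_broglie_wavelength(M_P, 340.0))
# A light particle at half the speed of light: the correction shortens the wavelength
print(de_broglie_wavelength(1.0e-30, 0.5 * C), H / (1.0e-30 * 0.5 * C))
```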
Exercises
1. What is the de Broglie wavelength of a proton that is moving with the speed of sound 340 m/s?
2. A $1.0 \times 10^{-30}$ kg relativistic particle moves at a speed half the speed of light. What is its de Broglie
wavelength?
3. A relativistic particle with mass $1.0 \times 10^{-30}$ kg has total energy $1.5 \times 10^{-13}$ J. What is the de
Broglie wavelength of the particle?
The Davisson-Germer experiment provides us hints that electrons also behave as waves. Guided by
this intuition we might think that the electron propagates around the nucleus as a wave. This leads
us to the Bohr model. Requiring that a whole number of wavelengths fits around a circular orbit of
radius r, we write down
\[ 2\pi r = n\lambda \qquad (3.15) \]
where n is a positive integer. Given this starting point, what wavelength are we to associate with the
electron? Fortunately, we are guided by de Broglie's hypothesis that the wavelength of the electron is
given by Eq. 3.14. We make the further assumption that the electron moves non-relativistically
around the nucleus. Thus, the magnitude of the momentum is given by p = mv where m is the electron
mass and v is the orbital speed. Combining this with Eq. 3.15 leads to
\[ mvr = n\,\frac{h}{2\pi}. \qquad (3.16) \]
The constant on the right hand side (despite being merely proportional to Planck's constant) is very special in
quantum mechanics and it is given a special symbol, $\hbar = h/2\pi$. We thus write down the last result as
\[ mvr = n\hbar. \qquad (3.17) \]
This ends the hypothesis and we now continue with known lines of work.
Note that the constant h/mc appeared before in our analysis of Compton scattering. This constant h/mc
is actually known as the Compton wavelength (Do not confuse with de Broglie wavelength! It is only a
special name). Eq. 3.28 gives the spectrum of the hydrogen atom according to the Bohr model.
What is going to come as a surprise is that this model reproduces the experimental hydrogen atom
spectrum with remarkable accuracy given that it is a very crude quantum picture. Experimentally, the
spectrum of the hydrogen atom is fitted by the Rydberg equation:
\[ \frac{1}{\lambda} = R\left( \frac{1}{n_i^2} - \frac{1}{n_f^2} \right). \qquad (3.29) \]
The experimental value of the Rydberg constant remarkably coincides with the value
$\frac{1}{2}\frac{mc}{h}\alpha^2 = 1.09737 \times 10^{7}$ /m.
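The numerical coincidence can be verified directly. The sketch below evaluates $\frac{1}{2}(mc/h)\alpha^2$ with rounded constants, taking $\alpha \approx 1/137.036$ for the fine-structure constant, and then uses the result for one spectral line. Because the sign convention for the two level labels depends on which is the initial state, the helper takes an absolute value; that choice and the function name are illustrative assumptions.

```python
H = 6.62607e-34       # Planck's constant in J s (rounded)
M_E = 9.10938e-31     # electron mass in kg (rounded)
C = 2.99792e8         # speed of light in m/s (rounded)
ALPHA = 1.0 / 137.036 # fine-structure constant (rounded)

# Rydberg constant from the Bohr model, in 1/m
R = 0.5 * (M_E * C / H) * ALPHA**2

def line_wavelength(n_a, n_b):
    """Wavelength (in m) of the transition between levels n_a and n_b."""
    inverse = R * abs(1.0 / n_a**2 - 1.0 / n_b**2)
    return 1.0 / inverse

print(R)
print(line_wavelength(3, 2))   # first Balmer line, in the red part of the spectrum
```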
So should we hate nature? Why is nature so brutal that we have to go through a hell of mathematics
only to predict the energy levels of its constituents? No! Nature is so good that it actually tells us
already all that we need to know about its constituents. All we have to do is look at the spectra of
atoms. It is we who want to calculate the spectra of all the atoms for pride. Well, this would be a noble
Exercises
1. Hydrogen-like atoms. Hydrogen-like atoms contain a positively charged core with Z protons and
a single valence electron that orbits this core. Consider a picture that resembles that of the Bohr
model and calculate the energy levels of the system. Show that the quantized energy, radius, etc.
for hydrogen-like atoms can be obtained by simply replacing each $e^2$ by $Ze^2$ in the corresponding
expressions for the Bohr model.
2. The orbital speed of an electron in the hydrogen atom is $7.3 \times 10^{5}$ m/s. What is its energy?
3. Consider an electron in a hydrogen atom with energy $-0.378$ eV. What is the orbital radius of the
electron? What is the orbital speed of the electron?
How are we now to describe this dual nature of matter? We do so by assigning a number to a photon
or an electron. We call this number the wave function, and it represents the state of the quantum particle.
"Wave function" is a fancy name for a number (a function) that satisfies the wave equation. An example of a wave
function is given by the electric and magnetic field vectors in a free electromagnetic field. In this case,
we know that
\[ \vec{E}, \vec{B} \propto e^{i(\vec{k}\cdot\vec{x} - \omega t)} \qquad (3.30) \]
where $\vec{k}$ gives the direction of wave propagation and the wavelength through $k = 2\pi/\lambda$, and
$\omega$ gives the (angular) frequency of the wave. Because we now picture the electromagnetic wave as
consisting of photons, we can consider this same function to also represent the photon. For a photon we
know that $pc = E = hf = hc/\lambda$. In
this case, we can write down the wave function for the photon in terms of its momentum and energy:
We take a step from here and consider that, in general, the state of a free particle (may it be a photon
or a massive particle such as an electron) be represented by the wave function
where A is a constant. What is the meaning of the wave function? It contains information about the
position and momentum of the particle. To give more emphasis to this statement let us go back to the
case of a free photon. A free photon has a definite wavelength and therefore a definite momentum. But
it is spread to an infinite extent. What the free particle wave function tells us is that $|\Psi|^2$ is a constant.
This means that the probability of finding a photon near any point in space is the same. A photon with
definite wavelength cannot be localized. This is the same for any other free particle. A free particle will
have a definite momentum and hence a definite wavelength given by the de Broglie wavelength. The
price of knowing the momentum of the free particle is knowledge about its position. As we go on in
our journey through the quantum realm we shall find that the price of knowing the position of a particle is
knowledge about its momentum. This leads to the uncertainty principle, which is at the heart of all quantum
phenomena.
2. A free particle has momentum $9.11 \times 10^{-27}$ kg m/s and energy $2.84 \times 10^{-4}$ eV. What is the wave
function of the free particle?
3. What is the mass of a non-relativistic particle with wavefunction given by
\[ \Psi = A \exp\left\{ i\left[ (1.28 \times 10^{7}\ \text{m}^{-1})\,x - (4.52 \times 10^{10}\ \text{s}^{-1})\,t \right] \right\}? \qquad (3.34) \]
Exercises
1. Particle Resonance. The lifetime of the baryon resonance particle $\Delta^{++}(1232)$ is $5.63 \times 10^{-24}$ s
before it decays to a proton and a pion. What is the minimum uncertainty in the measured
energy of the particle?
2. Dizzy... confused... The position uncertainty of a $5.00 \times 10^{-28}$ kg quantum particle is
estimated to be 1.00 mm. What is the minimum uncertainty of its velocity assuming that the
particle moves in 1D motion?
3. Velocity Uncertainty. The uncertainty of the location of an electron along the x-direction was
determined to be 1 cm. What is the minimum uncertainty of the velocity along the x-direction of
the electron?
4. Impossible Estimate. Three experimenters estimated the uncertainties in location and momentum
of a particle in the x-, y-, and z-directions. Which experimenter/s made an impossible estimate?
(a) $\Delta x = 3.74 \times 10^{-3}$ m and $\Delta p_x = 7.55 \times 10^{-15}$ kg m/s
(b) $\Delta y = 1.85 \times 10^{-9}$ m and $\Delta p_y = 2.86 \times 10^{-26}$ kg m/s
(c) $\Delta z = 4.32 \times 10^{-6}$ m and $\Delta p_z = 6.10 \times 10^{-30}$ kg m/s
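A check of this kind is easy to automate. The sketch below tests each pair of uncertainties against the position-momentum uncertainty relation $\Delta x\,\Delta p \ge \hbar/2$. It assumes that the powers of ten in the three estimates are negative (the signs of the exponents are not legible in this copy of the notes), so the numbers should be treated as an illustration rather than as the official data of the exercise.

```python
HBAR = 1.0546e-34   # reduced Planck constant in J s (rounded)

# (uncertainty in position in m, uncertainty in momentum in kg m/s)
# ASSUMPTION: negative exponents, see the note above
estimates = {
    "a": (3.74e-3, 7.55e-15),
    "b": (1.85e-9, 2.86e-26),
    "c": (4.32e-6, 6.10e-30),
}

def is_allowed(dx, dp):
    """True if the pair satisfies dx * dp >= hbar / 2."""
    return dx * dp >= HBAR / 2.0

for label, (dx, dp) in estimates.items():
    print(label, dx * dp, is_allowed(dx, dp))
```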
\[ -\frac{\hbar^2}{2m}\nabla^2\Psi(\vec{x}, t) + U(\vec{x})\,\Psi(\vec{x}, t) = i\hbar\frac{\partial}{\partial t}\Psi(\vec{x}, t). \qquad (3.37) \]
In this elementary course we will consider in most cases its one-dimensional version that is given by
\[ -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\Psi(x, t) + U(x)\,\Psi(x, t) = i\hbar\frac{\partial}{\partial t}\Psi(x, t). \qquad (3.38) \]
Eq. 3.37 is called the time-dependent Schrödinger equation and governs the behaviour of a particle
of mass m moving around in a potential that is given by $U(\vec{x})$. And as we have stated for the free particle
what has meaning is the quantity $|\Psi|^2$:
Note that this means taking the complex conjugate of the quantity $\Psi$ and multiplying it by $\Psi$. By
the probabilistic interpretation that follows Eq. 3.39, $|\Psi(x, y, z, t)|^2\,dx\,dy\,dz$ is the probability of
finding the particle within a volume dx dy dz centered around the point (x, y, z). In one dimension $|\Psi|^2 dx$
is the probability of finding the particle within the length dx centered around x.
Given this interpretation, it is reasonable to assume that the probability of finding the particle
everywhere is equal to one. This leads us to the normalization integral
\[ \int_{\text{all space}} \Psi^*(\vec{x}, t)\,\Psi(\vec{x}, t)\,d^3x = 1. \qquad (3.40) \]
Stationary States
States with definite energy are called stationary states. It also turns out that these states do not evolve
in time. In symbols, the probability densities of these states do not depend on the time variable t.
Stationary states are represented by the separable wave function
\[ \Psi_E(\vec{x}, t) = \psi_E(\vec{x})\,e^{-iEt/\hbar}. \qquad (3.42) \]
\[ -\frac{\hbar^2}{2m}\nabla^2\psi_E(\vec{x}) + U(\vec{x})\,\psi_E(\vec{x}) = E\,\psi_E(\vec{x}). \qquad (3.44) \]
The time-independent Schrödinger equation describes the behaviour of a stationary state $\psi_E$ with definite
energy E. In many cases, we shall be dropping the subscript E for stationary states for simplicity.
Schrödinger Equation as Energy Statement
At this stage it is impossible to derive the Schrödinger equation. The derivation requires a minimum
math knowledge of infinite dimensional vector spaces and operator algebra. But at least we can easily
motivate the Schrödinger equation. Eq. 3.37 is a statement of the nonrelativistic expression for energy
as a sum of kinetic and potential energies. We know that
\[ KE + PE = H. \qquad (3.45) \]
and
\[ H \to i\hbar\frac{\partial}{\partial t} \qquad (3.48) \]
and act it on a function $\Psi(\vec{x}, t)$ then we obtain the time-dependent Schrödinger equation.
Probability Theory
In probability theory, the questions that we ask are: what is the probability of occurrence of an event?
What is the mean? What is the variance? By event we mean, for example, heads or tails. Say, what is the
probability of obtaining three heads in a game where three unbiased coins are tossed at the same time?
These questions can be answered by constructing the so-called probability density for the random
variable. In a coin toss the random variable is the number of heads or tails.
Let us simplify the discussion by considering the specific game in which three coins (unbiased) are
tossed at the same time. We'll find (by drawing a tree diagram or doing an experiment) that 1/8
of the time we get three heads, 3/8 of the time we get two heads, 3/8 of the time we get one head, and
1/8 of the time we get zero heads. We summarize this result by writing down P(3) = 1/8, P(2) = 3/8,
P(1) = 3/8, and P(0) = 1/8. The quantity P(N) is what we call the probability density for finding N heads
in a game of three coin tosses. What are some of its properties? Note that if we add the probability density for
all results, we obtain the number one, $\sum_N P(N) = 1$. If we want to find the mean number of heads that
appears we multiply each P(N) by N and add all possibilities.
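The tree diagram for the three-coin game can be replaced by a brute-force enumeration. The sketch below lists all eight equally likely outcomes, builds P(N) with exact fractions, and computes the mean and the variance of the number of heads; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 equally likely outcomes of tossing three unbiased coins
outcomes = list(product("HT", repeat=3))

# Count how many outcomes contain N heads, then convert counts to probabilities
counts = Counter(outcome.count("H") for outcome in outcomes)
P = {n: Fraction(c, len(outcomes)) for n, c in counts.items()}

total = sum(P.values())                              # normalization
mean = sum(n * p for n, p in P.items())              # <N>
mean_square = sum(n**2 * p for n, p in P.items())    # <N^2>
variance = mean_square - mean**2

for n in sorted(P):
    print(n, P[n])
print(total, mean, variance)
```

Using `Fraction` keeps the probabilities exact, so the normalization sums to exactly one.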
Now let us begin to be more elegant. In probability theory, we are usually given the probability
distribution P (x) for some random variable x. It is our task to extract all that is measurable from this
probability distribution. The probability distribution is normalized:
\[ \sum_{\text{all } x} P(x) = 1. \qquad (3.49) \]
and
\[ \langle x^2 \rangle = \sum_{\text{all } x} x^2 P(x). \qquad (3.51) \]
In application to coin toss problems or dice problems the formalism described above would be sufficient.
However, in many physical applications one also considers the probability of occurrence of a variable
that is continuous (in kinetic theory the Maxwell-Boltzmann distribution gives the probability that a
gas molecule will have a certain speed). In this case we consider the probability distribution $\rho(x)$. This
is normalized according to
\[ \int_{-\infty}^{+\infty} dx\,\rho(x) = 1. \qquad (3.52) \]
The mean and the variance can be computed using
\[ \langle x \rangle = \int_{-\infty}^{+\infty} dx\,x\,\rho(x) \qquad (3.53) \]
In quantum mechanics, what we are going to use as $\rho(x)$ is the absolute square of the wave function
$|\Psi|^2$. But let us first take a step towards complex algebra.
Complex Algebra
The introduction of the imaginary number $i = \sqrt{-1}$ allows us to make a whole new number space on
which we can construct our physical models. This is the complex number space and its elements are complex
numbers satisfying some algebraic properties. An arbitrary complex number A can be written as
\[ A = a + ib \qquad (3.55) \]
where a and b are real numbers. The constants a and b are called the real and imaginary parts of A.
Alternatively, a complex number A can be written in the form
\[ A = |A|\,e^{i\phi} \qquad (3.56) \]
where $|A|$ and $\phi$ are real numbers. This latter form is called the polar form for a complex number where
$|A|$ is called the modulus and $\phi$ is called the phase. We can relate the polar form to the standard form
by expanding the exponential using Euler's formula:
and
\[ \phi = \arctan\frac{b}{a} \qquad (3.61) \]
The last four equations are used to transform from the polar form to the standard (rectangular) form of
a complex number and vice versa.
The complex conjugate of a complex number can be obtained by replacing every i appearing in
the complex number with $-i$. For example, the complex conjugate of A, which is labelled $A^*$, is given by
\[ A^* = a - ib = |A|\,e^{-i\phi}. \qquad (3.62) \]
The absolute square or modulus squared of a complex number is the product of the complex number
and its own conjugate. For example, the absolute square of A is given by
The polar form is often useful in calculating the absolute squares of quantities because the phase angle
easily cancels out of the calculation.
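These operations can be tried out with Python's built-in complex numbers and the standard `cmath` module. The sketch below converts a sample number between rectangular and polar form, takes its complex conjugate, and forms the absolute square; the sample value 3 + 4i is an arbitrary choice.

```python
import cmath

A = 3 + 4j                           # rectangular form a + ib

modulus, phase = cmath.polar(A)      # polar form: modulus and phase (radians)
rebuilt = cmath.rect(modulus, phase) # back to rectangular form

A_conj = A.conjugate()               # replaces i by -i
abs_square = (A_conj * A).real       # product with its own conjugate is real

print(modulus, phase)
print(rebuilt)
print(A_conj, abs_square)
# The conjugate has the same modulus and the opposite phase
print(cmath.polar(A_conj))
```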
Exercises
1. Consider two complex numbers A = a + ib and B = c + id where a, b, c, and d are real numbers.
Calculate the following quantities: (a) $A^*A$ (b) $A^*B$ (c) $AB$ (d) $AB^*$ and (e) $A^*B^*$.
2. Calculate the modulus of the complex number $\frac{1+2i}{1-2i}$.
q
1+2i i4 3+4i
3. Calculate the absolute square of the complex number 12i e 2i .
4. If you toss a dice 1000 times, what is the most probable score that you will obtain?
The probability of finding the particle over all space must of course be unity as implied by the
normalization condition Eq. 3.40. In one dimension the probability of finding the particle within a certain interval
(a, b) is given by
\[ P(a, b) = \int_a^b \Psi^*(x, t)\,\Psi(x, t)\,dx. \qquad (3.65) \]
Following the probabilistic interpretation of quantum mechanics, we also have the average position
\[ \langle \vec{x} \rangle = \int_{\text{all space}} \Psi^*(\vec{x}, t)\,\vec{x}\,\Psi(\vec{x}, t)\,d^3x. \qquad (3.66) \]
Note that the position of the operator O in the integration matters, e.g. when O is a differential operator
then $\Psi^* O \Psi \neq O|\Psi|^2 \neq |\Psi|^2 O$. The operator acts on the function to its right! The
one-dimensional versions of these equations for averages should be obvious.
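By performing integration by parts on the second term, i.e. let u = x and $dv = \cos\frac{2n\pi x}{L}\,dx$, it can be
shown that the second term reduces to zero. The average position of the particle is then located at the
center of the interval (0, L):
\[ \langle x \rangle = \frac{L}{2}. \qquad (3.75) \]
Finally, the average momentum is given by
\[
\begin{aligned}
\langle p \rangle &= \int_{-\infty}^{+\infty} \Psi^* \left( -i\hbar\frac{d}{dx} \right) \Psi\,dx \\
&= \int_0^L \left[ \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L} \exp\left( i\frac{E_n t}{\hbar} \right) \right] \left( -i\hbar\frac{d}{dx} \right) \left[ \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L} \exp\left( -i\frac{E_n t}{\hbar} \right) \right] dx \\
&= -i\hbar\,\frac{2}{L}\,\frac{n\pi}{L} \int_0^L \sin\frac{n\pi x}{L} \cos\frac{n\pi x}{L}\,dx \\
&= -i\hbar\,\frac{2}{L}\,\frac{n\pi}{L} \int_0^L \frac{1}{2} \sin\frac{2n\pi x}{L}\,dx.
\end{aligned}
\qquad (3.76)
\]
It should be obvious from this point that the average momentum is zero:
\[ \langle p \rangle = 0. \qquad (3.77) \]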
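Both averages can be confirmed by numerical integration. The sketch below evaluates the integrals with a simple midpoint rule for the stationary state $\sqrt{2/L}\,\sin(n\pi x/L)$ (the time-dependent phases cancel, as in Eq. 3.76). For the momentum it returns only the real integral $\int \psi\,\psi'\,dx$ that multiplies $-i\hbar$; the function name, the box length L = 1 and the number of steps are illustrative choices.

```python
import math

def box_expectations(n, L=1.0, steps=20000):
    """Midpoint-rule estimates of <x> and of the integral of psi * dpsi/dx over (0, L)."""
    dx = L / steps
    mean_x = 0.0
    psi_dpsi = 0.0   # <p> equals -i*hbar times this integral
    for i in range(steps):
        x = (i + 0.5) * dx
        psi = math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)
        dpsi = math.sqrt(2.0 / L) * (n * math.pi / L) * math.cos(n * math.pi * x / L)
        mean_x += psi * x * psi * dx
        psi_dpsi += psi * dpsi * dx
    return mean_x, psi_dpsi

for n in (1, 2, 3):
    print(n, box_expectations(n))
```

For every n the first number is L/2 and the second is zero to within rounding error, in agreement with Eqs. 3.75 and 3.77.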
where b is a constant. This wave function represents the ground state of a particle in a harmonic
oscillator potential and plays an important role in the low temperature description of solids. (i) Express
the normalization constant A in terms of the given variable b. (ii) Calculate the average position of the
particle. (iii) Calculate the average momentum of the particle. Hint: You can differentiate the following
integral with respect to $\alpha$ to perform the required integrations,
\[ \int_{-\infty}^{+\infty} \exp\left( -\alpha x^2 \right) dx = \sqrt{\frac{\pi}{\alpha}}. \qquad (3.79) \]
(iv) Continue the integrations and calculate the expectation value of the square of the position $\langle x^2 \rangle$ and
the uncertainty $\Delta x$. (v) Calculate $\langle p^2 \rangle$ and $\Delta p$. (vi) What must be the value of b so that the lower
limit of the Heisenberg uncertainty principle, $\Delta x\,\Delta p = \hbar/2$, is satisfied?
because the eigenfunctions turn out to be linearly independent of each other. In Eq. 3.81, $a_E$ are
constants whose interpretation we shall now discuss.
A valid wave function in quantum mechanics is one that is normalizable. We subject Eq. 3.81 to this
constraint:
\[
\begin{aligned}
\int_{-\infty}^{+\infty} |\Psi|^2 dx &= \int_{-\infty}^{+\infty} \left[ \sum_{\text{all possible } E} a_E^*\,\psi_E^*(x) \exp\left( i\frac{Et}{\hbar} \right) \right] \left[ \sum_{\text{all possible } E'} a_{E'}\,\psi_{E'}(x) \exp\left( -i\frac{E't}{\hbar} \right) \right] dx \\
&= \sum_{E, E'} a_E^*\,a_{E'} \exp\left( i\frac{E - E'}{\hbar} t \right) \int_{-\infty}^{+\infty} \psi_E^*(x)\,\psi_{E'}(x)\,dx.
\end{aligned}
\qquad (3.82)
\]
To continue from this we note one special property of eigenfunctions. Eigenfunctions satisfy the condition
(known as orthogonality)
\[ \int_{-\infty}^{+\infty} \psi_E^*(x)\,\psi_{E'}(x)\,dx = \delta_{EE'} \qquad (3.83) \]
where $\delta_{EE'}$ is zero whenever $E \neq E'$ and one when $E = E'$. This limits the sum to terms with $E = E'$ and
we obtain
\[ \int_{-\infty}^{+\infty} |\Psi|^2 dx = \sum_E |a_E|^2. \qquad (3.84) \]
Because the wave function is normalized we obtain the following condition on the constants $a_E$:
\[ \sum_E |a_E|^2 = 1. \qquad (3.85) \]
This resembles the condition that the sum of the probabilities of having the different energies E is
equal to one. This hints that $|a_E|^2$ gives the probability that the quantum state represented
by $\Psi(x, t)$ has a certain energy E. This is the currently accepted interpretation. To strengthen this
interpretation further, we can actually calculate the average energy corresponding to the state Eq. 3.81.
The result is
\[ \langle E \rangle = \sum_E E\,|a_E|^2. \qquad (3.86) \]
This resembles the equation for calculating the average energy with probability $|a_E|^2$ for finding the
particular energy E.
That the most general quantum state can be written as a sum of stationary solutions (states which do
not evolve in time) is known as the superposition principle. Though this follows from the mathematics
of vector spaces, it can also be interpreted physically. In quantum mechanics, a system is allowed
to possess only a particular/specific set of energies allowed by the Heisenberg uncertainty principle.
The system, at any given time, can be in any one of these quantum states with allowed energies. A
measurement of the energy of the state would yield any one of these possible values. Another measurement
can yield another of the possible values. When we make a sufficiently large number of measurements
we can experimentally construct the probability density $|a_E|^2$ of finding the system with the particular energy
E. When we average all of the results, we also find that the average energy is consistent with what
probability theory tells us: that we add $E|a_E|^2$ for all possible values. This allows us to write down the
general quantum state in the form Eq. 3.81 even without the help of mathematics. It is important to
understand that this interpretation applies in quantum mechanics because of the probabilistic nature of
quantum systems.
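The bookkeeping behind Eqs. 3.85 and 3.86 can be illustrated with a small calculation. The amplitudes and energies below are made-up values for a two-state superposition (they are not taken from the notes); the sketch normalizes the amplitudes, converts them to probabilities, and forms the average energy.

```python
import math

# Hypothetical complex amplitudes a_E and the corresponding energies in eV
amplitudes = [0.6, 0.8j]
energies_eV = [1.0, 4.0]

# Enforce the normalization condition: the sum of |a_E|^2 must equal one
norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
amplitudes = [a / norm for a in amplitudes]

# |a_E|^2 is the probability of measuring the energy E
probabilities = [abs(a) ** 2 for a in amplitudes]

# Average energy: sum of E * |a_E|^2
mean_energy = sum(p * E for p, E in zip(probabilities, energies_eV))

print(probabilities, sum(probabilities))
print(mean_energy)
```

Note that the phase of each amplitude (here the factor i in the second one) drops out of the probabilities, since only $|a_E|^2$ enters.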
Exercises
1. Normalization. Consider a particle described by the wave function $\psi(x) = Ae^{-bx}$ (A and b are real and
positive constants) in the region $0 \le x < \infty$ and $\psi(x) = 0$ elsewhere. What is the normalization
constant A?
2. Expectation. Consider the wave function $\psi(x) = \sqrt{5/(2L^5)}\,x^2$ for $|x| < L$, where L is a positive
constant. What is the expectation value of the position?
3. Consider the following wave function of a particle in a one-dimensional quantum system:
\[ \Psi(x, t) = C\,\psi_1(x)\,e^{-i(0.38\,\text{eV})t/\hbar} + \sqrt{\frac{1}{3}}\,\psi_2(x)\,e^{-i(1.52\,\text{eV})t/\hbar} + \frac{i}{2}\,\psi_3(x)\,e^{-i(3.42\,\text{eV})t/\hbar} \qquad (3.92) \]
where C is a complex number and $\psi_1(x)$, $\psi_2(x)$, and $\psi_3(x)$ are normalized spatial wave functions
that correspond to the ground, first-excited, and second-excited states, respectively. (i) What is
the probability that the particle will be in the ground state? (ii) What is the average energy? (iii)
What is the minimum uncertainty in the particle's lifetime according to the energy-time uncertainty
principle?
4. Consider a particle having the following probability distribution in momentum p:
P (p = 1.51 kg m/s) = 0.70 (3.93)
P (p = 2.51 kg m/s) = 0.25 (3.94)
P (p = 7.21 kg m/s) = 0.05 (3.95)
(i) What is the average momentum of the particle? (ii) What is the uncertainty in the momentum of
the particle? (iii) What is the minimum uncertainty in the position of the particle? (iv) Write down
the wave function for the particle, letting $\psi_i$ correspond to the states with the given momentum
and $E_i$ be the energy of these states.
This implies that the particle is essentially free inside the box. The infinite potential outside the box
is interpreted to rule out any possibility that the particle can be found in those regions. This is a physical
constraint that we impose on the solution to the Schrödinger equation. We write down $\psi(x) = 0$ outside the box.
Other than this physical constraint we also want the wave function to be a smooth function of the
position, i.e. the wave function and its derivative should be continuous:
$$\psi = \text{continuous} \tag{3.98}$$
$$\frac{d\psi}{dx} = \text{continuous}. \tag{3.99}$$
These two conditions allow the kinetic energy term, $-\frac{\hbar^2}{2m}\nabla^2\psi$, in the Schrödinger equation to have a
finite value.
Inside the box, the potential is zero and the Schrödinger equation becomes
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi. \tag{3.100}$$
The general solution to this is a linear combination of the oscillatory functions sin and cos:
$$\psi(x) = A\sin(kx) + B\cos(kx)$$
where $k$ is given by
$$k^2 = \frac{2mE}{\hbar^2}. \tag{3.102}$$
Without loss of generality we place the boundaries of the box at x = 0 and x = L. At the left boundary
x = 0 the continuity condition of the wave function immediately leads to the condition
B = 0. (3.103)
A sin(kL) = 0. (3.105)
Because a quantum system may not have a definite energy, we write down its wave function as a sum
of possible solutions corresponding to definite energy (superposition principle). The general quantum
state of a particle in a box is then given by
$$\Psi(x,t) = \sum_{n=1}^{\infty} a_n \sqrt{\frac{2}{L}}\,\sin\left(\frac{n\pi x}{L}\right) e^{-iE_n t/\hbar} \tag{3.110}$$
where $|a_n|^2$ gives the probability of finding the system in the state with energy $E_n$.
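As a quick numerical illustration of the quantized energies $E_n$, the sketch below assumes the standard result $E_n = n^2 h^2/(8mL^2)$, which follows from the condition $\sin(kL) = 0$ together with Eq. 3.102. The function names and the choice of a 10 nm box with a $3 \to 2$ transition are assumptions made for the example.

```python
H = 6.62607015e-34        # Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
M_E = 9.1093837015e-31    # electron mass, kg


def box_energy(n, length, mass=M_E):
    """Energy in joules of level n in a 1D infinite well of width `length` (m)."""
    return n ** 2 * H ** 2 / (8.0 * mass * length ** 2)


def photon_wavelength(n_initial, n_final, length, mass=M_E):
    """Wavelength in metres of the photon emitted in the n_initial -> n_final transition."""
    delta_e = box_energy(n_initial, length, mass) - box_energy(n_final, length, mass)
    return H * C / delta_e


# Smallest energy gap ending at n = 2 (hence the longest wavelength): 3 -> 2.
wavelength = photon_wavelength(3, 2, 10e-9)
```

For an electron in a 10 nm box this gives a wavelength of a few times $10^{-5}$ m, in the far infrared.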
Exercises
1. An electron subjected to an infinite square well potential was prepared in the following state:
$\Psi(x, 0) = A\sin\left(\frac{4\pi x}{\text{nm}}\right)$. What is the wave function of the particle after time $t$?
2. The Longer The Better. An electron in a 10 nm box emits a photon as it relaxes toward the
$n = 2$ state. What is the longest wavelength of the photon that the electron can emit? Ans.
$6.60 \times 10^{-5}$ m
3. Analyze the problem of the particle in a box carefully. What changes if we shift the location of the
box boundaries to $x = -L/2$ and $x = L/2$?
Because of the translational symmetry of the problem (the well can be anywhere in space), we have the
option to use the box location that best simplifies the problem. In this case, we choose the boundaries
to be at $x = -L/2$ and $x = L/2$. We are now ready to solve the Schrödinger equation and subject the
solution to the normalization condition and to Eqs. 3.98 and 3.99, continuity and smoothness.
The choice of box location puts a symmetry on the system that allows us to discuss even and odd
solutions to the Schrödinger equation separately. The general solution is given by
$$\psi(x) = \begin{cases} A\cos(kx) + B\sin(kx), & \text{in the well} \\ Ce^{\kappa x} + De^{-\kappa x}, & \text{in the barriers} \end{cases} \tag{3.112}$$
where
$$k^2 = \frac{2mE}{\hbar^2} \tag{3.113}$$
and
$$\kappa^2 = \frac{2m(V_0 - E)}{\hbar^2}. \tag{3.114}$$
The previous statement on symmetry allows us to write down the solutions for even and odd functions
of x separately. Furthermore, while the finite barrier allows the particle to be found inside the barriers
we do not expect this probability to be large. Physically, we can argue that the probability of finding
Exercises
1. The solution to the Schrödinger equation with odd dependence on $x$ is
$$\psi(x) = \begin{cases} -Ce^{\kappa x}, & x \le -L/2 \\ A\sin(kx), & -L/2 < x < L/2 \\ Ce^{-\kappa x}, & x \ge L/2. \end{cases} \tag{3.120}$$
Derive the quantization condition similar to Eq. 3.118 for this solution. Which has the greater
energy, the solution with even dependence on $x$ or the solution with odd dependence on $x$?
2. Write down the solution to the Schrödinger equation for the particle in a well described by an
asymmetric potential well:
$$V(x) = \begin{cases} V_0, & x < -L/2 \\ 0, & -L/2 < x < L/2 \\ \infty, & x > L/2. \end{cases} \tag{3.121}$$
3. Numerical solution to square well problem. Equipped with the capacity of making plots
using spreadsheet programs, one can actually calculate the energy levels for a particle in a
finite square well. This can be done by defining the dimensionless variables
$$\xi = kL/2 \tag{3.122}$$
$$\eta = \kappa L/2. \tag{3.123}$$
In this case, for example, the quantization condition for solutions with even dependence on position
becomes $\eta = \xi\tan\xi$ and $\eta = \sqrt{\frac{mV_0L^2}{2\hbar^2} - \xi^2}$. Taking the intersections of these two plots leads to the
possible energy levels. Say there is an intersection at $\xi = \xi_0$. Then one possible energy level is
given by $\xi_0 = kL/2$ or simply $E = \frac{2\hbar^2}{mL^2}\xi_0^2$. For a particle with a mass of 0.067 times the
electron mass that is confined in a finite square well of width 100 Å and height 240 meV, show
that this numerical routine leads to the energy level 32.47 meV. This is the lowest energy that the
system can occupy. Repeat the procedure for the solution with odd dependence on position and
show that the first excited state energy level is given by 124.44 meV.
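The graphical procedure in the last exercise can also be carried out by root finding. The sketch below is one possible approach, not the method prescribed in these notes: it uses plain bisection instead of a spreadsheet plot, it assumes the odd-parity condition $\eta = -\xi\cot\xi$ (the standard counterpart of the even condition, which the exercise asks the reader to derive), and it takes $\hbar^2/2m_e \approx 0.0381$ eV nm$^2$. All function names are hypothetical.

```python
import math

HBAR2_OVER_2ME = 0.0380998  # hbar^2 / (2 m_e) in eV nm^2


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def well_levels(mass_ratio, width_nm, depth_eV):
    """Lowest even and lowest odd energy level (eV) of a finite square well.

    mass_ratio is the particle mass in units of the electron mass.
    """
    scale = HBAR2_OVER_2ME / mass_ratio / (width_nm / 2.0) ** 2   # E = scale * xi^2
    radius = math.sqrt(depth_eV / scale)                          # xi^2 + eta^2 = radius^2
    if radius <= math.pi / 2:
        raise ValueError("well too shallow to hold an odd bound state")

    def eta(xi):
        return math.sqrt(max(radius ** 2 - xi ** 2, 0.0))

    def even(xi):        # eta = xi tan(xi)
        return xi * math.tan(xi) - eta(xi)

    def odd(xi):         # eta = -xi cot(xi), assumed odd-parity condition
        return -xi / math.tan(xi) - eta(xi)

    eps = 1e-9
    xi_even = bisect(even, eps, math.pi / 2 - eps)
    xi_odd = bisect(odd, math.pi / 2 + eps, min(radius, math.pi) - eps)
    return scale * xi_even ** 2, scale * xi_odd ** 2


# Parameters from the exercise: m = 0.067 m_e, width 100 angstrom = 10 nm, height 240 meV.
e_even, e_odd = well_levels(0.067, 10.0, 0.240)
```

With these inputs the two levels come out close to the values quoted in the exercise; small differences in the last digits depend on the values used for the physical constants.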
We consider the case in which the energy $E$ of the particle is less than the height of the potential barrier.
The incoming particle is represented by the free particle wave function with positive momentum, i.e.
$\psi(x) = Ae^{ipx/\hbar} = Ae^{ikx}$ where $k = p/\hbar = \sqrt{2mE}/\hbar$. When it hits the potential barrier it becomes
reflected and transmitted at the same time. Thus, an additional solution $Be^{-ikx}$ is introduced at $x < 0$
and a solution $Ce^{ikx}$ is introduced for $x > L$. In region II, inside the barrier, the solution is a linear
combination of exponential functions. The solution to the Schrödinger equation is given by
$$\psi(x) = \begin{cases} Ae^{ikx} + Be^{-ikx}, & \text{region I} \\ \alpha e^{\kappa x} + \beta e^{-\kappa x}, & \text{region II} \\ Ce^{ikx}, & \text{region III} \end{cases} \tag{3.125}$$
where
$$k^2 = \frac{2mE}{\hbar^2} \tag{3.126}$$
and
$$\kappa^2 = \frac{2m(U_0 - E)}{\hbar^2}. \tag{3.127}$$
What are we interested in? The tunneling probability. This is the quantity that we can measure as a
current in a tunnel diode or a scanning tunneling microscope. To identify what this quantity is in terms
of our given variables we use the quantum mechanical interpretation. The constant $A$ corresponds to the
amplitude of the incident wave that is scattered. Once scattered, it is divided into a reflected
wave with constant $B$ and a transmitted wave with constant $C$. The ratio of the constants $C/A$ then
determines the transmission probability:
$$T = |C/A|^2 = \frac{1}{1 + \frac{1}{4}\left(\frac{\kappa}{k} + \frac{k}{\kappa}\right)^2 \sinh^2(\kappa L)}. \tag{3.128}$$
In terms of $U_0$ and $E$, the barrier height and the particle energy, we have
$$T = \frac{1}{1 + \frac{1}{4}\frac{U_0^2}{U_0 E - E^2}\sinh^2(\kappa L)}. \tag{3.129}$$
For very tall barriers or very long barriers or both we take the limit $\kappa L \gg 1$ and this becomes
$$T \approx 16\left(\frac{E}{U_0} - \frac{E^2}{U_0^2}\right)\exp(-2\kappa L). \tag{3.130}$$
The important result from this expression is that the transmission probability depends on the length $L$
of the barrier as
$$T \sim e^{-2\kappa L}. \tag{3.131}$$
This makes sense because the probability must decrease to zero as the width of the barrier $L$ becomes larger
and larger.
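To see how good the thick-barrier limit is, the exact expression of Eq. 3.129 and the approximation of Eq. 3.130 can be compared numerically. The following is a minimal sketch for an electron, assuming $\hbar^2/2m_e \approx 0.0381$ eV nm$^2$; the function names and the sample values (a 1 eV electron, a 2 eV barrier) are hypothetical choices for illustration.

```python
import math

HBAR2_OVER_2ME = 0.0380998  # hbar^2 / (2 m_e) in eV nm^2


def kappa(energy_eV, barrier_eV):
    """Decay constant (1/nm) inside the barrier for an electron with E < U0 (Eq. 3.127)."""
    return math.sqrt((barrier_eV - energy_eV) / HBAR2_OVER_2ME)


def transmission_exact(energy_eV, barrier_eV, width_nm):
    """Transmission probability from Eq. 3.129."""
    s = math.sinh(kappa(energy_eV, barrier_eV) * width_nm)
    ratio = barrier_eV ** 2 / (4.0 * energy_eV * (barrier_eV - energy_eV))
    return 1.0 / (1.0 + ratio * s ** 2)


def transmission_thick(energy_eV, barrier_eV, width_nm):
    """Thick-barrier approximation from Eq. 3.130, valid for kappa * L >> 1."""
    x = energy_eV / barrier_eV
    return 16.0 * x * (1.0 - x) * math.exp(-2.0 * kappa(energy_eV, barrier_eV) * width_nm)


# Illustrative values: 1 eV electron, 2 eV barrier, two different widths.
t_thin = transmission_exact(1.0, 2.0, 0.5)
t_wide = transmission_exact(1.0, 2.0, 1.0)
t_wide_approx = transmission_thick(1.0, 2.0, 1.0)
```

For these values $\kappa L$ is already about 5 at a width of 1 nm, so the two expressions agree closely there, while the transmission drops by orders of magnitude when the width is doubled.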
Exercises
1. Barrier. A particle has an energy $E$, and encounters a barrier with height $U_0$ and length $L$.
Which of the following can reduce the transmission probability of the particle as it interacts with the
barrier? Increasing $L$? Decreasing $U_0$? Increasing $E$?
2. Break Away. A 2.0 eV electron encounters a barrier with height 5.0 eV and width 1.50 nm. By
what factor would the probability of tunneling increase if the width is decreased to 1.00 nm?
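$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}m\omega^2x^2\psi = E\psi. \tag{3.133}$$
This differential equation can be solved exactly but the level of math that is needed is beyond Math
55. In this case we only state the solution that leads to the quantized energies. The solution to the
harmonic oscillator problem in quantum mechanics is given by
$$\psi_n(y) = \left(\frac{\alpha}{\pi}\right)^{1/4}\frac{1}{\sqrt{2^n n!}}H_n(y)e^{-y^2/2} \tag{3.134}$$
where $H_n(y)$ are the Hermite polynomials,
$$y = \sqrt{\alpha}\,x \tag{3.135}$$
and
$$\alpha = \frac{m\omega}{\hbar}. \tag{3.136}$$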
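The prefactor in Eq. 3.134 can be checked numerically. The sketch below builds the Hermite polynomials from the standard recurrence $H_{k+1}(y) = 2yH_k(y) - 2kH_{k-1}(y)$ and integrates products of the wave functions over $x$ with the trapezoidal rule. The function names, the integration window, and the default $\alpha = 1$ are assumptions made for the illustration.

```python
import math


def hermite(n, y):
    """Physicists' Hermite polynomial H_n(y) from H_{k+1} = 2 y H_k - 2 k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * y          # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
    return h


def psi(n, x, alpha=1.0):
    """Harmonic oscillator wave function of Eq. 3.134 evaluated at position x."""
    y = math.sqrt(alpha) * x
    norm = (alpha / math.pi) ** 0.25 / math.sqrt(2.0 ** n * math.factorial(n))
    return norm * hermite(n, y) * math.exp(-0.5 * y * y)


def overlap(m, n, alpha=1.0, half_width=10.0, steps=4000):
    """Trapezoidal estimate of the integral of psi_m(x) psi_n(x) over x."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * psi(m, x, alpha) * psi(n, x, alpha)
    return total * dx
```

Calling `overlap(n, n)` should return a value very close to 1 for each `n`, and `overlap(m, n)` with `m != n` a value very close to 0, which is the statement that the states of Eq. 3.134 are normalized and mutually orthogonal.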
Exercises
1. Promoted. A harmonic oscillator is initially in the first-excited state. It absorbs a photon,
raising its state by two levels. If the ground-state energy of such an oscillator is 1.20 eV, what is
the wavelength of the absorbed photon? Ans. 259 nm
2. final. A harmonic oscillator, initially in the first-excited state, absorbs a photon with a wavelength
of 550 nm. What is the final state $n_f$ of the oscillator if its angular frequency is $3.44 \times 10^{15}$ rad/s?
3. Uncoupled. Three harmonic oscillators A, B, and C have ground state energies of EA , EB > EA ,
and EC > EB , respectively. Which of these oscillators would absorb a photon with the longest
wavelength to transition to the first-excited state?
4. Recoil. A photon was absorbed by a quantum harmonic oscillator that was prepared at ground
state with angular frequency $\omega$. The excited harmonic oscillator emits a photon with the same
angular frequency $\omega$, then relaxes into the second-excited state. What is the angular frequency of
the original incident photon that was absorbed by the quantum harmonic oscillator? Ans. $3\omega$
5. Pewpew. The ground-state energy of a harmonic oscillator is 5.00 eV. If the oscillator undergoes
a transition from its n = 3 to n = 2 level by emitting a photon, what is the wavelength of the
emitted photon?
Particle in a Box
In 3D, the problem that we intend to solve is given by
$$-\frac{\hbar^2}{2m}\nabla^2\psi + V(x, y, z)\psi = E\psi \tag{3.144}$$
where the potential energy function is defined by
$$V(x, y, z) = \begin{cases} 0, & 0 < x < L_x,\ 0 < y < L_y,\ 0 < z < L_z \\ \infty, & \text{elsewhere.} \end{cases} \tag{3.145}$$
Once again, as in the time-dependent Schrödinger equation given by Eq. 3.37, we have a partial
differential equation. The simplest way to solve a partial differential equation is to assume that there is a
separable solution of the form
$$\psi(x, y, z) = A(x)B(y)C(z). \tag{3.147}$$
When we plug in this assumption into the original partial differential equation we instead obtain three
ordinary differential equations for the variables x, y, and z. When we subject these solutions to the
boundary conditions we obtain the allowable energies of the system.
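Let us now go back to the problem at hand. The infinite potential barrier outside the box implies
that the particle would not be able to penetrate through the walls. This means that $\psi = 0$ outside the
box. Thus, the requirement of continuity of the solution to the Schrödinger equation imposes
$$\psi(x = 0, y, z) = \psi(x = L_x, y, z) = 0 \tag{3.148}$$
$$\psi(x, y = 0, z) = \psi(x, y = L_y, z) = 0 \tag{3.149}$$
$$\psi(x, y, z = 0) = \psi(x, y, z = L_z) = 0. \tag{3.150}$$
$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\psi = E\psi. \tag{3.151}$$
$$-\frac{\hbar^2}{2m}\left(BC\frac{d^2A}{dx^2} + AC\frac{d^2B}{dy^2} + AB\frac{d^2C}{dz^2}\right) = EABC. \tag{3.152}$$
$$-\frac{\hbar^2}{2m}\frac{1}{A}\frac{d^2A}{dx^2} - \frac{\hbar^2}{2m}\frac{1}{B}\frac{d^2B}{dy^2} - \frac{\hbar^2}{2m}\frac{1}{C}\frac{d^2C}{dz^2} = E. \tag{3.153}$$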
numbers (1, 1, 1) with energy 3E1 . The first excited state energy level belongs to state (2, 1, 1) with
energy 6E1 . This is where degeneracy enters. Note that the states (1, 2, 1) and (1, 1, 2) also have the
energy 6E1 . Thus we say that the first excited state is triply-degenerate or it has degeneracy of three.
An important point to make is that the degeneracy arises because of the symmetry of the cube.
We consider now the case of the isotropic harmonic oscillator where $\omega = \omega_x = \omega_y = \omega_z$. In this special
case, the possible energies of the system become
$$E = \hbar\omega\left(n_x + n_y + n_z + \frac{3}{2}\right). \tag{3.169}$$
Since the quantum numbers $n_x, n_y, n_z = 0, 1, 2, \ldots$, the ground state of the system is labelled by (0, 0, 0)
with energy $\frac{3}{2}\hbar\omega$. The first excited state is (1, 0, 0) with energy $\frac{5}{2}\hbar\omega$. But we can easily see that (0, 1, 0)
and (0, 0, 1) also have energy $\frac{5}{2}\hbar\omega$. So, the first excited state is again triply-degenerate. The point that
we want to highlight is that degeneracy comes up again because of symmetry.
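The counting of degenerate states described above is easy to automate. The sketch below enumerates quantum numbers by brute force for both the cubic box, where the energy is $(n_x^2 + n_y^2 + n_z^2)E_1$, and the isotropic oscillator of Eq. 3.169. The function names and the cutoff `n_max` are assumptions for the illustration; levels that could involve quantum numbers beyond the cutoff are discarded so that the reported degeneracies are complete.

```python
from collections import Counter
from itertools import product


def cubic_box_levels(n_max=6):
    """Sorted (energy / E_1, degeneracy) pairs for a particle in a cubic box."""
    counts = Counter(nx * nx + ny * ny + nz * nz
                     for nx, ny, nz in product(range(1, n_max + 1), repeat=3))
    # Levels at or above (n_max + 1)^2 + 2 may need n > n_max, so drop them.
    cutoff = (n_max + 1) ** 2 + 2
    return sorted((level, g) for level, g in counts.items() if level < cutoff)


def isotropic_oscillator_levels(n_max=6):
    """Sorted (N, degeneracy) pairs, N = n_x + n_y + n_z, E = hbar*omega*(N + 3/2)."""
    counts = Counter(nx + ny + nz
                     for nx, ny, nz in product(range(n_max + 1), repeat=3))
    # Only totals up to n_max are guaranteed to be fully enumerated.
    return sorted((total, g) for total, g in counts.items() if total <= n_max)


box = cubic_box_levels()
oscillator = isotropic_oscillator_levels()
```

The first two entries of `box` reproduce the statements in the text: the ground state (1, 1, 1) at $3E_1$ is non-degenerate and the first excited level at $6E_1$ is triply degenerate.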
Exercises
1. S.E.S. A particle is in a box with dimensions $L_x = L_y = L_z$. What is the set of quantum
numbers corresponding to the second excited state?
2. Damn it, IHO! A 3D isotropic harmonic oscillator transitions from the ground state to the
second excited state. What is the ratio of the ground state energy to the energy of the second
excited state, $E_0/E_2$?
3. In Transit. A 3D isotropic harmonic oscillator makes a transition from the first excited state to
the ground state. If the angular frequency of the oscillator is $1.80 \times 10^{13}$ rad/s, what is the frequency
of the emitted photon? Ans. $1.80 \times 10^{13}$ rad/s
4. For the next two numbers, consider an electron in a three-dimensional cubic box of length equal to
2.50 nm. (a) What is the degeneracy of the fourth-excited state? Neglect the effect of spin. (b)
What is the energy in the third-excited state? Ans. 1 and 0.54 eV
5. 3DHO. A three-dimensional isotropic harmonic oscillator (3DHO), initially in the third excited
energy level, transitions to the second-excited energy level, emitting a photon. What is the
wavelength of the emitted photon if the 3DHO oscillates at a frequency of $5.00 \times 10^{15}$ Hz? Ans. 60 nm
$$V(r) = -\frac{1}{4\pi\epsilon_0}\frac{e^2}{r} \tag{3.170}$$
where $e$ is the electronic charge. The Schrödinger equation for the hydrogen atom is then given by
$$-\frac{\hbar^2}{2m}\nabla^2\psi - \frac{1}{4\pi\epsilon_0}\frac{e^2}{r}\psi = E\psi. \tag{3.171}$$
In contrast to the 3D particle in a box and 3D harmonic oscillator problems, we cannot use the separable
solution $\psi(x, y, z) = A(x)B(y)C(z)$ since the potential energy function depends on the radial variable
$r = \sqrt{x^2 + y^2 + z^2}$. What we can do instead is consider the separable solution $\psi(r, \theta, \phi) = R(r)\Theta(\theta)\Phi(\phi)$.
But in this line of attack we have to express the operator $\nabla^2$ in spherical polar coordinates $(r, \theta, \phi)$.
Fortunately, these expressions can be found in mathematics and physics texts. This leads us to the
Schrödinger equation in spherical polar coordinates for the hydrogen atom:
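$$-\frac{\hbar^2}{2m}\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}\right] - \frac{1}{4\pi\epsilon_0}\frac{e^2}{r}\psi = E\psi. \tag{3.172}$$
The separable solution $\psi(r, \theta, \phi) = R(r)\Theta(\theta)\Phi(\phi)$ in spherical polar coordinates then reduces this equation
into three ordinary differential equations:
$$\frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} = -m^2 \tag{3.173}$$
$$\frac{1}{\Theta}\sin\theta\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + l(l + 1)\sin^2\theta = m^2 \tag{3.174}$$
$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2mr^2}{\hbar^2}\left(E + \frac{1}{4\pi\epsilon_0}\frac{e^2}{r}\right)R = l(l + 1)R. \tag{3.175}$$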
The solution to the differential equation in $\phi$ is an exponential, as can easily be verified:
$$\Phi(\phi) = e^{im\phi}. \tag{3.176}$$
The azimuthal symmetry of the system, i.e. that the system does not change with the translation
$\phi \to \phi + 2\pi$, constrains $m$ to positive and negative integers and zero: $m = 0, \pm 1, \pm 2, \ldots$. The solution
to the differential equation in $\theta$ is well-known in mathematics as
$$\Theta(\theta) \propto P_l^m(\cos\theta)$$
where $P_l^m(\cos\theta)$ are called associated Legendre functions. The possible values of $l$ are $l = 0, 1, 2, 3, \ldots$.
It should be noted that the product of the $\theta$ and $\phi$ solutions is another special function that we call
spherical harmonics:
$$Y_l^m(\theta, \phi) = \sqrt{\frac{2l + 1}{4\pi}\frac{(l - m)!}{(l + m)!}}\,P_l^m(\cos\theta)e^{im\phi}. \tag{3.178}$$
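Finally, the solution to the remaining radial differential equation is another well-known function in
mathematics:
$$R(r) = \left(\frac{2r}{na_0}\right)^l e^{-r/na_0}\,L^{2l+1}_{n-l-1}\left(\frac{2r}{na_0}\right) \tag{3.179}$$
where $L^{2l+1}_{n-l-1}(x)$ are called associated Laguerre polynomials. The full normalized solution to the Schrödinger
equation for the hydrogen atom is given by
$$\psi_{nlm}(r, \theta, \phi) = \left[\left(\frac{2}{na_0}\right)^3\frac{(n - l - 1)!}{2n(n + l)!}\right]^{1/2} e^{-r/na_0}\left(\frac{2r}{na_0}\right)^l L^{2l+1}_{n-l-1}\left(\frac{2r}{na_0}\right)Y_l^m(\theta, \phi). \tag{3.180}$$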
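The normalization constant in Eq. 3.180 can be verified numerically for the radial part. The sketch below assumes the convention in which the associated Laguerre polynomial is the explicit sum $L_p^k(x) = \sum_{i=0}^{p} (-1)^i \binom{p+k}{p-i} x^i/i!$; other texts use a different convention, which changes the factorial in the prefactor. The function names, the choice of units with $a_0 = 1$, and the integration grid are assumptions made for the example.

```python
import math


def assoc_laguerre(p, k, x):
    """Associated Laguerre polynomial L_p^k(x) = sum_i (-1)^i C(p+k, p-i) x^i / i!."""
    return sum((-1) ** i * math.comb(p + k, p - i) * x ** i / math.factorial(i)
               for i in range(p + 1))


def radial(n, l, r, a0=1.0):
    """Radial factor of Eq. 3.180 (everything except the spherical harmonic)."""
    rho = 2.0 * r / (n * a0)
    norm = math.sqrt((2.0 / (n * a0)) ** 3 * math.factorial(n - l - 1)
                     / (2.0 * n * math.factorial(n + l)))
    return norm * math.exp(-rho / 2.0) * rho ** l * assoc_laguerre(n - l - 1, 2 * l + 1, rho)


def radial_norm(n, l, a0=1.0, r_max=80.0, steps=40000):
    """Trapezoidal estimate of the integral of |R_nl(r)|^2 r^2 dr from 0 to r_max * a0."""
    dr = r_max * a0 / steps
    total = 0.0
    for i in range(1, steps + 1):          # the integrand vanishes at r = 0
        r = i * dr
        weight = 0.5 if i == steps else 1.0
        total += weight * radial(n, l, r, a0) ** 2 * r * r
    return total * dr
```

Since the spherical harmonics are separately normalized over the angles, `radial_norm(n, l)` returning a value very close to 1 for allowed pairs such as (1, 0), (2, 1), or (3, 1) indicates that the prefactor in Eq. 3.180 is consistent with this Laguerre convention.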
$$L_z = m\hbar. \tag{3.183}$$
Exercises
1. Probable. A hydrogen atom is prepared in the state $\psi(x) = A_{100}\psi_{100}(x) + A_{211}\psi_{211}(x) +
A_{21-1}\psi_{21-1}(x)$. What is the probability that the state is observed to have an orbital angular
momentum of magnitude $\sqrt{2}\hbar$?
2. Lonesome Electron. The orbital angular momentum of an electron has a magnitude of $4.716 \times
10^{-34}$ kg m$^2$/s. What is the angular-momentum quantum number $l$ for this electron?
3. The hardest! What is the magnitude of the total angular momentum of an electron in the
s-state ($l = 0$) of the hydrogen atom?
4. What is the energy of the 4th excited state of the hydrogen atom and what is its degeneracy?
2. The radial probability distribution for the ground state of the hydrogen atom is $P(r) = 4\pi r^2|\psi_{100}|^2$.
What is the most probable radius of the electron?
Figure 42: Electron orbiting a positively charged nucleus. The direction of motion of the electron is opposite to
the direction of the conventional current (flow of positive charge).
The interaction energy between the magnetic moment and the magnetic field is then
$U = -\vec{\mu}\cdot\vec{B} = \frac{1}{2}\frac{e}{m}\vec{L}\cdot\vec{B}$. This is conventionally written down as
$$U = \frac{\mu_B}{\hbar}\vec{L}\cdot\vec{B} \tag{3.186}$$
Figure 43: A beam of hydrogen atoms prepared in a state with $l \neq 0$ is passed through a region with non-uniform
magnetic field. The non-uniform nature of the magnetic field gives a force and splits the beam. The quantization
of the angular momentum manifests itself in the number of beams that are observed after passing through the
magnetic field.
the hydrogen atom prepared in the $l = 1$ state we observe three beams corresponding to $m_l = -1, 0, 1$
as expected.
Spin
Reading assignment: Stern-Gerlach experiment.
It all turns out to be great. We can now observe quantization of angular momentum and it is
consistent with experiment. What comes as a surprise is that when we look closely the lines are split
into two. And what is worse? Even l = 0 states get split into two!
This is where physicists speculated the existence of an intrinsic quantum number that does not enter
into the Schrödinger equation. This is called the spin quantum number $m_s$. We say that the electron
has spin. Since it manifests itself through the splitting of beams subject to a magnetic field, the simplest
way to model the spin is to relate it to an intrinsic magnetic moment of the electron in the same way as
the angular momentum is related to the magnetic moment in Eq. 3.185. Thus, we write down
B ~ ~
~s = ge SB (3.188)
h
where $g_e$ is called the gyromagnetic factor and $\vec{S}$ is a vector representing the spin state of the electron.
This gyromagnetic factor turns out to be close to $g_e \approx 2$ for the electron. The experimental result that
Since there are only two lines obtained for the splitting of the beam, it is evident that the possible
quantum number $s$ must be a half-integer. Evidently, we should have $s = 1/2$ so that there are two
possibilities for $S_z$ with $m_s = -1/2, +1/2$. Now we say that the electron is spin 1/2.
Note: The spin does not enter automatically in the Schrödinger equation because it is actually
of relativistic nature. It naturally arises in a theory which reconciles special relativity with quantum
mechanics. In this theory, we can even derive the gyromagnetic factor of the electron to be exactly 2. In
the more accurate (albeit more difficult) framework of quantum field theory, which takes into account even the
quantum nature of the vacuum, corrections to the gyromagnetic factor can be found. It is also in quantum
field theory that we can prove the Pauli exclusion principle for half-integer spin particles such as
electrons, protons, neutrons, neutrinos, etc.
The possible values for the z-component of the total angular momentum are
$$J_z = m_j\hbar \tag{3.193}$$
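The bookkeeping for the total angular momentum can be sketched as follows. The code assumes the standard addition rules, which are not spelled out in this excerpt: the quantum number $j$ runs from $|l - s|$ to $l + s$ in integer steps, the magnitude is $\sqrt{j(j+1)}\hbar$, and $m_j$ runs from $-j$ to $j$. The function names are hypothetical.

```python
import math
from fractions import Fraction


def total_angular_momentum(l, s=Fraction(1, 2)):
    """List of (j, |J| / hbar) for orbital quantum number l and spin s."""
    l = Fraction(l)
    j = abs(l - s)
    results = []
    while j <= l + s:
        results.append((j, math.sqrt(float(j * (j + 1)))))
        j += 1
    return results


def m_j_values(j):
    """Allowed m_j for a given j, so that J_z = m_j * hbar (Eq. 3.193)."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


# Electron (s = 1/2) in an l = 1 state: j = 1/2 (anti-parallel) and j = 3/2 (parallel).
p_state = total_angular_momentum(1)
```

For $l = 1$ the two magnitudes are $\sqrt{3}\hbar/2$ and $\sqrt{15}\hbar/2$, and for $l = 0$ only $j = 1/2$ survives, so the total angular momentum of an s-state electron comes entirely from its spin.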
Exercises
1. The hardest! What is the magnitude of the total angular momentum of an electron in the
s-state ($l = 0$) of the hydrogen atom?
2. Stern-Gerlach. Which of the following is/are the result/s of the Stern-Gerlach experiment?
4. total. A hydrogen atom is in the $l = 1$ state. Compute the magnitude of the total angular
momentum for the cases when $\vec{L}$ is parallel to $\vec{S}$ as well as when they are anti-parallel. Ans.
$\sqrt{15}\hbar/2$ and $\sqrt{3}\hbar/2$
where $x_1$ and $n_1$ label the position and quantum state of particle 1 and $x_2$ and $n_2$ label the position
and quantum state of particle 2. For simplicity, let us say that this describes the state of two electrons
in some system. This obeys the Schrödinger equation for the two particles. But it seems that solving the
Schrödinger equation for even the next simplest atom, the helium atom, is a formidable task. Solving the
Schrödinger equation exactly is not feasible with more than two particles and approximation schemes
need to be developed. With this in mind let us see how far we can proceed without actually
solving the Schrödinger equation. What matters in quantum mechanics is the probability density. Given
Eq. 3.194, $|\psi(x_1, n_1|x_2, n_2)|^2$ is the probability density for finding particle 1 at $x_1$ in state $n_1$ and
finding particle 2 at $x_2$ in state $n_2$. However, the task of labelling two particles in quantum mechanics is
not possible. We cannot really distinguish two electrons from one another and call one red and the other
blue. When we know one electron we know all other electrons. So this means that even if we exchange
the particle labels in the probability density, we should get the same thing. In symbols, we write down
$$|\psi(x_1, n_1|x_2, n_2)|^2 = |\psi(x_2, n_2|x_1, n_1)|^2. \tag{3.195}$$
The probability of finding particle 1 at $x_1$ in state $n_1$ and particle 2 at $x_2$ in state $n_2$ is the same as the
probability of finding particle 1 at $x_2$ in state $n_2$ and particle 2 at $x_1$ in state $n_1$. Eq. 3.195 predicts two
kinds of wave function, one symmetric and one anti-symmetric under exchange:
$$\psi(x_1, n_1|x_2, n_2) = +\psi(x_2, n_2|x_1, n_1) \tag{3.196}$$
$$\psi(x_1, n_1|x_2, n_2) = -\psi(x_2, n_2|x_1, n_1). \tag{3.197}$$
These two kinds of wave function correspond to the two kinds of particle appearing in nature.
Bosons
The symmetric wave function is obeyed by bosons. Examples of bosons are the photon, the (hypothetical) graviton, and
the Higgs boson. In contrast with the other kind of particle, to be introduced shortly, bosons
can occupy the same quantum state. To see this, consider the case where the two bosons described by
Eq. 3.196 are located at the same position $x_1 = x_2 = a$. In this case we have
Other than occupying the same location in space we can see that there is no contradiction if the two
bosons also occupy the same quantum state $n_1 = n_2 = n$. That bosons obey this property is important
in the understanding of Bose-Einstein condensation, where near absolute zero temperature bosons tend to
occupy the same quantum state. In this case a quantum phenomenon becomes observable macroscopically,
such as in liquid helium.
Fermions
The anti-symmetric wave function is obeyed by particles called fermions. Examples of fermions are the
electron, proton, and the neutron, which make up most of the matter that we deal with in our
daily lives. In contrast with bosons we shall find that: No two fermions can occupy the same quantum
state. This statement is known as the Pauli exclusion principle. Let us see how this comes
about in quantum mechanics using Eq. 3.197. Again let us consider the case in which the two electrons
are located at the same position in space $x_1 = x_2 = a$. This might mean, for example, that they are
localized in the same atom or in the same quantum well in a semiconductor. In symbols we have
Many-electron systems
Electrons are fermions and must therefore obey the Pauli exclusion principle. But other than the Pauli
exclusion principle, electrons also interact with one another. For example, the two electrons in the helium atom
repel each other since both have the same sign of charge. We might think that we cannot go
any further with our formalism of quantum mechanics for many-electron systems. What turns out to be
surprising is that we can go much further in the analysis of such systems under the simple assumption that
the electrons do not interact with one another. In semiconductors what is usually done is to account for all
interactions through the so-called effective mass. In atomic systems, the use of the electron configuration,
in which we put the electrons in all quantum states starting from the ground state and up, assures us that
our crude approximation still holds some truth. This approach extends to the analysis of metals
and even astrophysical systems such as white dwarfs.
Helium atom
Having said that electrons are fermions and that these obey the Pauli exclusion principle, let us try to
analyze the next simplest quantum system, which is helium. The Schrödinger equation for the helium
atom is given by
$$-\frac{\hbar^2}{2m}\nabla_1^2\Psi(\vec{x}_1, \vec{x}_2, t) - \frac{\hbar^2}{2m}\nabla_2^2\Psi(\vec{x}_1, \vec{x}_2, t) + \left[\frac{1}{4\pi\epsilon_0}\frac{e^2}{|\vec{x}_1 - \vec{x}_2|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_1|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_2|}\right]\Psi(\vec{x}_1, \vec{x}_2, t) = i\hbar\frac{\partial}{\partial t}\Psi(\vec{x}_1, \vec{x}_2, t). \tag{3.202}$$
In the potential energy, the first term represents the repulsive interaction of the electrons while the
second and third terms represent the attractive interaction of the electrons with the positively charged
nucleus. This equation has not yet been solved exactly.
Let us attempt to move forward in this way. Assume that to a good approximation the interaction
between the two electrons is negligible compared to their interaction with the nucleus. Under this
assumption we write down the two-particle wave function in terms of single electron wave functions
$\psi_n(\vec{x})$ of the hydrogen atom:
$$\Psi(\vec{x}_1, \vec{x}_2, t) = \psi^{(1)}_{n_1}(\vec{x}_1)\,\psi^{(2)}_{n_2}(\vec{x}_2)\,e^{-iEt/\hbar}. \tag{3.203}$$
The superscript $(j)$ labels the two electrons. Using the ground state wave function of the hydrogen atom
problem this becomes
$$\Psi(r_1, r_2, t) = \frac{8}{\pi a_0^3}e^{-2(r_1 + r_2)/a_0} \tag{3.204}$$
where $a_0$ is the Bohr radius. Using the ways that we know we can calculate the energy corresponding
to the state. The result is $8E_1 = -109$ eV. This is not close to the experimental value $-78.975$ eV but
it gives us a starting point considering that we have simplified the problem greatly.
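The factor of 8 quoted above can be traced with a short estimate. The lines below assume the hydrogen-like energy formula $E_n = Z^2E_1/n^2$ with $E_1 = -13.6$ eV, a standard result that is not derived in this section, applied to two non-interacting electrons in the $n = 1$ level of a nucleus with $Z = 2$:

```latex
E \approx 2 \times Z^2 E_1
  = 2 \times 2^2 \times (-13.6\ \text{eV})
  = 8E_1
  \approx -109\ \text{eV}
```

The gap of roughly 30 eV with respect to the experimental value is the price of neglecting the repulsion between the two electrons, which raises the energy.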
But Eq. 3.203 is not the correct way of writing the wave function for a two-electron system. As
we have noted, electrons obey the Pauli exclusion principle and the two-particle wave function must be
anti-symmetric. With this goal of writing down an anti-symmetric wave function we obtain
$$\Psi(\vec{x}_1, \vec{x}_2, t) = A\left[\psi^{(1)}_{n_1}(\vec{x}_1)\,\psi^{(2)}_{n_2}(\vec{x}_2) - \psi^{(1)}_{n_2}(\vec{x}_1)\,\psi^{(2)}_{n_1}(\vec{x}_2)\right]e^{-iEt/\hbar}. \tag{3.205}$$
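The consequences of symmetrizing or anti-symmetrizing a product of single-particle states can be explored numerically. In the sketch below, one-dimensional particle-in-a-box states stand in for the single-electron wave functions purely for illustration; the function names, the unit box length, the sample positions, and the omission of the normalization constant $A$ and of the time factor are all assumptions of this example.

```python
import math


def box_state(n, x, length=1.0):
    """Normalized particle-in-a-box state, used here as a stand-in single-particle state."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)


def two_particle(n1, n2, x1, x2, sign):
    """Unnormalized two-particle spatial wave function.

    sign = +1 gives the symmetric (boson) combination,
    sign = -1 gives the anti-symmetric (fermion) combination.
    """
    return (box_state(n1, x1) * box_state(n2, x2)
            + sign * box_state(n2, x1) * box_state(n1, x2))


# Anti-symmetric combination with both particles in the same state: vanishes identically.
same_state = two_particle(1, 1, 0.3, 0.7, -1)
# Anti-symmetric combination with both particles at the same position: also vanishes.
same_position = two_particle(1, 2, 0.4, 0.4, -1)
# Symmetric combination with the same state and position: allowed and nonzero.
bosons_together = two_particle(1, 1, 0.4, 0.4, +1)
```

The first two results are zero, which is the Pauli exclusion principle at work, while the symmetric combination has no such restriction.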
$$\hat{H} = -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{\hbar^2}{2m}\nabla_2^2 + \frac{1}{4\pi\epsilon_0}\frac{e^2}{|\vec{x}_1 - \vec{x}_2|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_1|} - \frac{2e^2}{4\pi\epsilon_0}\frac{1}{|\vec{x}_2|}. \tag{3.206}$$
We analyze the ground state by writing this down in terms of the hydrogen atom ground state given by
$$\psi(r) = \frac{1}{\sqrt{\pi a_0^3}}e^{-r/a_0} \tag{3.207}$$
noting that $a_0$ is the Bohr radius. It is left as an exercise to calculate the average ground state energy
that would come up following these lines. Note that the experimental ground state energy of helium is
$-78.975$ eV. This is where we end.
Exercises
1. Is it true? Which of the following is/are FALSE about a many-electron atom?
(a) No two electrons in a many-electron atom can have the same quantum state.
(b) The wave function of a many-electron atom is symmetric.
(c) Two electrons in a many-electron atom can have the same quantum numbers if they are both
spin-up.
2. PEP. Which of the following statements is/are NOT part of Pauli's exclusion principle?
4. Repeat the same analysis but now take into account the spin.
5. You done! According to Pauli's exclusion principle, which of the following systems CANNOT
exist in nature?
(a) Two electrons (one spin up and one spin down) both with principal quantum number n = 1
in a Coulomb potential.
(b) Two electrons (both spin down) in the n = 1 and n = 2 states in an infinite square well,
respectively.
(c) Two electrons with the same spin both in the n = 0 state in a harmonic oscillator potential.
(d) Two electrons (one spin up and one spin down) in the hydrogen atom occupying the state
nlm = 110.
6. Consider two non-interacting particles in a one-dimensional harmonic oscillator potential with
angular frequency $\omega$. Neglect spin. What is the total ground state energy if the particles are
bosons? What is the total energy if the particles are fermions? Write down the wave function for
the system when the particles are bosons and when the particles are fermions.