
Lecture Notes - Week 1 (Lectures 1-3)

Statistical Mechanics - Course Summary


Statistical Mechanics is the mathematical theory that underpins our understanding of systems of large ($\sim 10^{23}$) numbers of particles (e.g. gases, liquids and solids). Such large systems are often referred to as macroscopic. The particles or basic entities which make up these systems are referred to as microscopic. Our goal in this course is the study of such large systems, which we can approach in two ways.

1. We can describe the system by a small number of measurable quantities (macroscopic variables, e.g. for a gas the pressure p, volume V and temperature T). This approach is known as thermodynamics. Thermodynamics makes no assumptions about the microscopic nature of these systems. Rather it is based on general laws which have been obtained empirically (from repeated experiments). Such an approach might seem to lack much explanatory power, but this is not the case. These laws provide a coherent logical structure and allow us to classify a wide variety of observations in a way which makes them much easier to understand, and even give us the ability to make predictions. In fact, Albert Einstein said in 1949: 'A theory is the more impressive the greater the simplicity of its premises, the more varied the kinds of things that it relates and the more extended the area of its applicability. Therefore classical thermodynamics has made a deep impression on me. It is the only physical theory of universal content which I am convinced, within the areas of the applicability of its basic concepts, will never be overthrown.'

2. Alternatively we can describe the system microscopically, with detailed assumptions about its microscopic structure (e.g. the masses, positions and velocities of all the molecules which make up a gas). It should be evident that it is impossible to control experimentally (or keep track of theoretically using mathematics) such details of the system. For example, keeping track of the velocities and positions of the molecules of 1 litre of a gas at atmospheric pressure and room temperature requires that we solve of the order of $10^{23}$ equations of motion, which is far, far beyond the capabilities of even the best computers available today. But we are not interested in all this redundant information. Rather we are interested in average values of functions of the microscopic quantities, and we would like to use this information to predict the behaviour of macroscopic variables. This is the approach known as statistical mechanics.

More recently, with the advent of fast computers, it has become possible to study systems of many particles in the naive way discussed earlier by solving directly the equations of motion (i.e. Newton's 2nd law $f_i = m a_i$) for each of the N particles in a system. This is a deterministic approach to studying the problem, called molecular dynamics simulation, and any randomness is due only to the variation in initial conditions. It is time-consuming and computationally expensive and limited by the numerical accuracy of the
¹ © University of Bristol 2017. This material is copyright of the University unless explicitly stated otherwise. It is provided exclusively for educational purposes at the University and is to be downloaded or copied for your private study only. These lecture notes are NOT a substitute for wide reading from a variety of textbooks.

computer. The very largest systems that can currently be studied have $N \sim 10^9$ particles ($\sim 10^{23}$ are required for macroscopic systems); however it gives some important insights into the problem. Moreover it is the only way to precisely study medium-sized systems of, say, 1000 particles, where both thermodynamics and statistical mechanics become inaccurate (e.g. the study of the small-scale dynamics of small biological molecules).

Syllabus

1. Thermodynamics (3-4 weeks) - macroscopic systems, state variables, laws of thermodynamics, heat engines and refrigerators, phase equilibria

2. Equilibrium Classical Statistical Mechanics (2-3 weeks) - ensembles, derivation of thermodynamic quantities, entropy of mixing, ideal gas

3. Dynamical Foundations and aspects of non-equilibrium (1 week) - Hamiltonian mechanics, Liouville equation, Poincaré recurrence, Boltzmann equation and H-theorem

4. Equilibrium Quantum Statistical Mechanics (3 weeks) - quantum statistics, density matrix, boson and fermion systems, Bose-Einstein condensation

Books

1. Statistical mechanics, R.K. Pathria, Elsevier 2005, 529 pages.

2. Equilibrium Thermodynamics, C.J. Adkins, Cambridge 1983, 285 pages.

3. Introduction to modern statistical mechanics, D. Chandler, Oxford 1987, 274 pages.

4. An introduction to chaos in non-equilibrium statistical mechanics, J.R. Dorfman, Cambridge 1999, 287 pages.

5. Pauli lectures on physics, vol. 3: Thermodynamics and the kinetic theory of gases, W. Pauli, Dover 2003, 160 pages.

6. Statistical physics of particles, M. Kardar, Cambridge 2007, 330 pages.

7. Equilibrium and non-equilibrium statistical thermodynamics, M. Le Bellac, F. Mortessagne and G. Batrouni, Cambridge 2004, 616 pages.

A more detailed discussion of the books is on Blackboard.

Mathematical preliminaries
If you are not already familiar with the following topics, reviewing them may be helpful for mastering the material in the course.

0.1 Some identities involving partial derivatives


In thermodynamics we often have to express functions of one particular set of variables in terms of another set of variables. This often requires us to relate partial derivatives with respect to different sets of variables while keeping other sets of variables constant.
Suppose we have three variables x, y and z which are not independent in that F(x, y, z) = 0. If we express x in terms of y and z,
$$dx = \left(\frac{\partial x}{\partial y}\right)_z dy + \left(\frac{\partial x}{\partial z}\right)_y dz\,.$$
Alternatively, if we express y in terms of x and z,²
$$dy = \left(\frac{\partial y}{\partial x}\right)_z dx + \left(\frac{\partial y}{\partial z}\right)_x dz\,.$$
Substituting the second equation into the first one gives:
$$dx = \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z dx + \left[\left(\frac{\partial x}{\partial z}\right)_y + \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x\right] dz\,.$$

Since x and z can be varied independently we have

$$\left(\frac{\partial x}{\partial y}\right)_z = \left(\frac{\partial y}{\partial x}\right)_z^{-1}\,, \qquad (0.1)$$
and
$$-1 = \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y\,. \qquad (0.2)$$
In the first relation (0.1), z is constant on both sides of the equation and one obtains the
usual relation between the derivatives of a function and its inverse. The second relation
(0.2) is less familiar. Note that it can also be obtained from the equation for dy by
imposing y to be constant, implying dy = 0, and solving for dz/dx.
A further useful identity is obtained if we replace z by a new variable w and consider x as a function of y and w such that
$$dx = \left(\frac{\partial x}{\partial y}\right)_w dy + \left(\frac{\partial x}{\partial w}\right)_y dw\,.$$
Dividing by dy and imposing a constant z leads to the following:
$$\left(\frac{\partial x}{\partial y}\right)_z = \left(\frac{\partial x}{\partial y}\right)_w + \left(\frac{\partial x}{\partial w}\right)_y \left(\frac{\partial w}{\partial y}\right)_z\,. \qquad (0.3)$$
² The differentials dx etc. can be thought of as small (infinitesimal) changes in the respective variables whose higher powers are negligible. This can be justified rigorously, e.g. using methods of differential geometry.

Example: Let us take $z = x^2 + y$ and $w = x^2 - y$. We can differentiate to get $dz = 2x\,dx + dy$ and $dw = 2x\,dx - dy$. We can also eliminate x to find $z - w = 2y$ and hence $dz - dw = 2\,dy$. Now, setting dz = 0 we find
$$\left(\frac{\partial x}{\partial y}\right)_z = -\frac{1}{2x}$$
and similarly
$$\left(\frac{\partial y}{\partial x}\right)_z = -2x\,, \qquad \left(\frac{\partial y}{\partial z}\right)_x = 1\,, \qquad \left(\frac{\partial z}{\partial x}\right)_y = 2x\,,$$
$$\left(\frac{\partial x}{\partial w}\right)_y = \frac{1}{2x}\,, \qquad \left(\frac{\partial x}{\partial y}\right)_w = \frac{1}{2x}\,, \qquad \left(\frac{\partial w}{\partial y}\right)_z = -2\,.$$

Thus equations (0.1), (0.2), (0.3) become, respectively,

$$-\frac{1}{2x} = (-2x)^{-1}\,, \qquad -1 = \left(-\frac{1}{2x}\right)(1)(2x)\,, \qquad -\frac{1}{2x} = \frac{1}{2x} + \frac{1}{2x}\,(-2)\,.$$
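These identities can also be verified symbolically. A minimal Python sketch (ours, using sympy; not part of the notes) checks (0.1) and (0.2) for the example above:

```python
import sympy as sp

x, y = sp.symbols('x y')

# The example from the notes: z = x^2 + y
z = x**2 + y

# (dx/dy)_z by implicit differentiation of z(x, y) = const:
# from dz = 2x dx + dy = 0 we expect -1/(2x)
dxdy_z = -sp.diff(z, y) / sp.diff(z, x)
print(dxdy_z)                                  # -1/(2*x)

# Reciprocal rule (0.1): (dx/dy)_z = 1 / (dy/dx)_z
dydx_z = -sp.diff(z, x) / sp.diff(z, y)
print(sp.simplify(dxdy_z - 1 / dydx_z))        # 0

# Triple product rule (0.2): (dx/dy)_z (dy/dz)_x (dz/dx)_y = -1
dydz_x = 1 / sp.diff(z, y)   # at fixed x, dy = dz
dzdx_y = sp.diff(z, x)       # at fixed y, dz = 2x dx
print(sp.simplify(dxdy_z * dydz_x * dzdx_y))   # -1
```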

0.2 Exact differentials


An infinitesimal quantity of the form
$$f(x, y)\, dx + g(x, y)\, dy \qquad (0.4)$$
is an exact differential if it is the differential of a function of x and y, say h(x, y),
$$f\, dx + g\, dy = dh = h_x\, dx + h_y\, dy\,,$$
where the subindices denote partial differentiation. This implies $f = h_x$ and $g = h_y$. A necessary and sufficient condition³ for (0.4) to be an exact differential is
$$f_y = g_x\,.$$
For example, $y^2\, dx + 2xy\, dy$ is an exact differential because
$$\frac{\partial}{\partial y}\, y^2 = 2y = \frac{\partial}{\partial x}\,(2xy)\,.$$
It can easily be integrated to obtain the function h(x, y):
$$h_x = y^2 \;\Rightarrow\; h = y^2 x + k(y) \;\Rightarrow\; h_y = 2xy + k'(y) \overset{!}{=} 2xy\,,$$
from which it follows that k(y) = const. and $h(x, y) = x y^2 + \text{const}$.

If a differential is exact then the line integral in the xy-plane from some initial point $(x_0, y_0)$ to some final point $(x_1, y_1)$ depends only on the end points and not on the path in between them, and it is given by
$$\int_{(x_0, y_0)}^{(x_1, y_1)} (f\, dx + g\, dy) = \int_{(x_0, y_0)}^{(x_1, y_1)} dh = h(x_1, y_1) - h(x_0, y_0)\,.$$

If the differential is not exact then the line integral depends on the path.
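The exactness test and the integration above are easy to reproduce symbolically. A minimal Python sketch (ours, using sympy):

```python
import sympy as sp

x, y = sp.symbols('x y')

# The example from the notes: f dx + g dy with f = y^2, g = 2xy
f = y**2
g = 2*x*y

# Exactness test: f_y = g_x
print(sp.simplify(sp.diff(f, y) - sp.diff(g, x)))   # 0, so the differential is exact

# Recover h: integrate f w.r.t. x, then check that h_y already equals g,
# so the integration 'constant' k(y) is a genuine constant
h = sp.integrate(f, x)                              # x*y**2
print(sp.simplify(sp.diff(h, y) - g))               # 0
```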
³ We ignore special cases that are not relevant for this unit.

0.3 Laplace's method⁴
In our analysis of statistical mechanics we will often have to integrate functions that have a very pronounced maximum and decrease rapidly away from the maximum. Laplace's method can be used to approximate this type of integral. We consider
$$\int_a^b dx\, f(x) = \int_a^b dx\, e^{\ln f(x)} = \int_a^b dx\, e^{g(x)}\,. \qquad (0.5)$$

Note that we expressed the integral in terms of the logarithm of f(x) because it varies less rapidly than the function f(x) itself. The integral will be dominated by the parts of the domain close to the maximum $x_0$ of g(x), which is determined by
$$g'(x_0) = 0\,, \qquad g''(x_0) < 0\,.$$

Expanding g(x) in a Taylor series around the maximum $x_0$ gives
$$g(x) = g(x_0) + \frac{1}{2}(x - x_0)^2\, g''(x_0) + \ldots \qquad (0.6)$$
such that one can use Gaussian integration to approximate
$$\int_a^b dx\, e^{g(x)} \approx e^{g(x_0)} \int_{-\infty}^{+\infty} dx\, \exp\left(-\frac{1}{2}(x - x_0)^2\, |g''(x_0)|\right) = e^{g(x_0)}\, \sqrt{\frac{2\pi}{|g''(x_0)|}}\,,$$
where we have used the fact that $g''(x_0) < 0$, and the fact that the integrand is dominated by the region close to the maximum to extend the limits to $\pm\infty$.
Note that this approximation can be put on a more rigorous basis if there is a large parameter λ in the problem, as is often the case, and one integrates over a function of the form $\exp(\lambda g(x))$. Then one can show that (for one maximum at $x_0$)
$$\int_a^b dx\, e^{\lambda g(x)} \approx e^{\lambda g(x_0)} \int_{-\infty}^{+\infty} dx\, \exp\left(-\frac{\lambda}{2}(x - x_0)^2\, |g''(x_0)|\right) = e^{\lambda g(x_0)}\, \sqrt{\frac{2\pi}{\lambda\, |g''(x_0)|}}$$
is the leading-order approximation for $\lambda \to \infty$ (this is covered in the unit Asymptotics).
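As a numerical illustration (our sketch, not part of the notes), we can test the method on $\int_0^\infty x^N e^{-x}\, dx = N!$, writing the integrand as $e^{g(x)}$ with $g(x) = N \ln x - x$, which has its maximum at $x_0 = N$:

```python
import math
from scipy.integrate import quad

N = 20

# Integrand x^N e^{-x} = e^{g(x)} with g(x) = N ln x - x; the maximum is at
# x0 = N, with g''(x0) = -N/x0^2 = -1/N.
def integrand(x):
    return math.exp(N * math.log(x) - x) if x > 0 else 0.0

exact, _ = quad(integrand, 0, 200)   # upper limit 200 is effectively infinity here

# Laplace approximation: e^{g(x0)} sqrt(2*pi/|g''(x0)|) = N^N e^{-N} sqrt(2*pi*N)
laplace = math.exp(N * math.log(N) - N) * math.sqrt(2 * math.pi * N)

print(exact, laplace, math.factorial(N))
# exact ~ 2.4329e18 = 20!; laplace ~ 2.4228e18, about 0.4% below
```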

1 Thermodynamics

1.1 Introduction
Thermodynamics is the description of macroscopic systems where heat and temperature play a role.⁵ Equilibrium thermodynamics is the study of such macrosystems in thermodynamic equilibrium. This course will help us to understand the meaning of those technical terms in the previous sentences. Let us set the stage by introducing some of the terms and notions that will be important in the following.
⁴ Sometimes termed 'steepest descents', from the more general complex-variable version.
⁵ We will also consider a few processes, such as mixing of different substances, that need not specifically involve heat.

A thermodynamic system is, quite generally, a bounded macroscopic quantity of
matter (or radiation). Macroscopic here means that the size of the system is comparable
to our human scale. The system is separated from the environment by its boundary.
This boundary can be naturally given, like the surface of a drop of a liquid, or it can be
specified by walls. An example that we often use is a gas that is contained in a cylinder
with a movable piston.
We are interested in interactions with these systems that are of two main types. There
are work-like processes that involve work done on or by a system, and there are
thermal processes that involve heat flow into or out of a system. A system is said to
be isolated if it does not interact with its environment.
We can specify the state of a thermodynamic system by so-called state variables. These are macroscopic quantities like volume V, pressure p, number of particles N, temperature T or chemical potential μ (in contrast to microscopic quantities like the masses, positions and momenta of the particles).
Some state variables are proportional to the size (the amount of matter) of the system and are called extensive variables (e.g. volume V and particle number N). Other state variables are independent of the size of the system and are called intensive variables (e.g. pressure p, chemical potential μ and temperature T). Notice that the extensive variables naturally include any quantities which are (for an isolated system) conserved.
Many of the state variables form intensive-extensive conjugate pairs such that their
product has dimensions of energy. The intensive member of each pair is a force-like
variable while the extensive member of the pair has the properties of a displacement.
Typically we refer to a generalized force and a generalized displacement.
Under a generalized force f the work done in moving a generalized displacement dx is
$$đW = \mathbf{f} \cdot d\mathbf{x}\,,$$
where the dot product indicates a sum over components if there is more than one generalised force acting.

System                 Generalized Force, f        Generalized Displacement, x
Gas                    pressure, -p                volume, V
Elastic filament       tension, Γ                  length, L
Liquid film            surface tension, γ          area, A
Magnet                 magnetic field, B           mag. dipole moment, m
Dielectric material    electric field, E           el. dipole moment, p
Chemical species       chemical potential, μ       number of molecules, N

What is equilibrium?
A system is in thermodynamic equilibrium when its properties do not change appreciably during the time when it is under observation. Clearly this definition is subjective, as it depends on how long we observe it.⁶
⁶ E.g. diamond is slightly less energetically stable than graphite, decaying at high temperatures, and presumably over extremely long times, so not quite forever.

A fundamental assumption of thermodynamics is the idea that any isolated system settles after sufficient time into a stationary state, called the equilibrium state, in which its properties don't change any more. The system is then said to be in thermodynamic equilibrium. This is a very reasonable assumption (which has been obtained empirically from observation) and we have plenty of experience which makes such an assumption quite intuitively obvious.
We can characterise equilibrium states by the state variables. Note that many of the
state variables are well defined only when the system is in equilibrium (e.g. pressure).
The phrase 'sufficient time' implies the existence of a typical timescale (the relaxation time). When observing a system we would have to wait longer than the relaxation time to be sure that the system is in equilibrium.
Given the subjective nature of our definition, a suitably (scientifically) pragmatic definition of a system in equilibrium would then be when all the state variables being used to describe the system are not changing. If each state variable has its own specific relaxation time, then this would require us to wait for timescales longer than all the relaxation times of the particular state variables being used to describe the system. That is, we would have to wait for longer than the longest relaxation time in the system (the relaxation time of the slowest state variable).⁷
We also use thermodynamic equilibrium to describe the state of two bodies/systems
in contact, again if there are no changes to the state variables. A special type of thermo-
dynamic equilibrium occurs if we bring two bodies/systems into contact and consider
thermal but not work-like interaction. The thermal contact will in general lead to changes
in the values of their state variables. When all the state variables are no longer changing
the two systems are said to be in thermal equilibrium.

What are temperature and heat? - The zeroth and first laws

To continue our study of macroscopic systems we need to understand what is meant by 'temperature' and 'heat'. We all have an intuitive idea of these concepts but we will need to develop a more precise idea of these notions.⁸

1.2 Temperature and the zeroth law of thermodynamics


1.2.1 The zeroth law

We can ask ourselves the following question. Is it possible to predict if two bodies will
be in thermal equilibrium with each other when brought into thermal contact? Upon
reflection, the answer is yes. We would need a third body (such as a thermometer) to
⁷ This also leaves open the possibility for us to study systems where certain macroscopic quantities are at equilibrium while others are not - the field of non-equilibrium thermodynamics, an exciting area of active research with many fundamental open questions, unfortunately outside the scope of this course.
⁸ There is an interesting discussion of the history of the development of the concept of heat in the BBC Radio 4 programme 'In Our Time' with Melvyn Bragg on December 4, 2008. The episode was called 'Heat: A History - from fire to thermodynamics'. You can download a podcast at www.bbc.co.uk/radio4/history/inourtime/inourtime_science.shtml.

verify if that was so. The zeroth law encapsulates this fact.⁹ It states:

If two macroscopic bodies are in thermal equilibrium with a third, then they are also
in thermal equilibrium with each other.

Let us consider the implications of this.

Figure 1: The Zeroth Law: Two systems A, B which are both separately in thermal equilibrium with a third system C must be in equilibrium with each other.

Consider two systems A, B which are both separately in thermal equilibrium with a third
system C. (Note that we consider thermal equilibrium and we do not consider the
case where they do work on each other). The zeroth law then implies that A and B must
be in equilibrium with each other.
The zeroth law leads to the definition of a state variable, the temperature. A quick argument is as follows: thermal equilibrium is a transitive relation (A ∼ C and C ∼ B ⟹ A ∼ B), and hence an equivalence relation (because it is also reflexive, A ∼ A, and symmetric, A ∼ B ⟹ B ∼ A), and so partitions equilibrium systems into equivalence classes, labelled by a quantity we can understand as temperature. This gives us very little understanding about how we might define this practically, however, so we continue with a more detailed argument:
The equilibrium states of A, B, C are described by their state variables $(a_1, a_2, \ldots)$, $(b_1, b_2, \ldots)$, $(c_1, c_2, \ldots)$ respectively.
If A is in equilibrium with C, then this equilibrium can be expressed as a constraint on the variables of A and C, i.e. if a change is made to one of the state variables of A then some change must occur in the variables of C to keep the two systems in equilibrium. This constraint can be expressed as:
$$f_{AC}(a_1, a_2, \ldots; c_1, c_2, \ldots) = 0 \;\Leftrightarrow\; c_1 = F_{AC}(a_1, a_2, \ldots; c_2, c_3, \ldots) \qquad (1.1)$$
The equilibrium of B and C implies a similar constraint:
$$f_{BC}(b_1, b_2, \ldots; c_1, c_2, \ldots) = 0 \;\Leftrightarrow\; c_1 = F_{BC}(b_1, b_2, \ldots; c_2, c_3, \ldots) \qquad (1.2)$$


⁹ This law was only recognised as a logical necessity for thermometry in the 1930s, long after the other laws were stated.

Both these equations imply that
$$F_{AC}(a_1, a_2, \ldots; c_2, c_3, \ldots) = F_{BC}(b_1, b_2, \ldots; c_2, c_3, \ldots)\,, \qquad (1.3)$$
which implies further that
$$a_1 = f_{ABC}(b_1, b_2, \ldots; a_2, a_3, \ldots; c_2, c_3, \ldots)\,. \qquad (1.4)$$
However the zeroth law requires that the following is true:
$$f_{AB}(a_1, a_2, \ldots; b_1, b_2, \ldots) = 0 \;\Leftrightarrow\; a_1 = F_{AB}(b_1, b_2, \ldots; a_2, a_3, \ldots)\,, \qquad (1.5)$$

independent of the set of variables {ci }.


Now both equations (1.4) and (1.5) give an expression for a1 in terms of the other state
variables. Equation (1.5) shows that a1 depends only on the set of variables {bi } and not
on the set {ci }. This implies that all the ci variables must drop out of equation (1.4).
As a result it must also be possible to cancel them from equation (1.3). This means there
must exist a function only of the state variables of A which is equal to a function of only
the state variables of B when A and B are in thermal equilibrium with each other:

$$\theta_A(a_1, a_2, \ldots) = \theta_B(b_1, b_2, \ldots)\,. \qquad (1.6)$$

We conclude that the thermal equilibrium of two systems i and j is characterised by functions $\theta_i$ and $\theta_j$ of their respective state variables having the same value.¹⁰
These functions are state functions, since they are functions of state variables. They are therefore also constant at equilibrium and could also be used as a variable in an experiment.
We denote the value at equilibrium by θ:
$$\theta_A(a_1, a_2, \ldots) = \theta\,. \qquad (1.7)$$
From what we have developed to this point, we do not have a clear idea of what θ is, but what we intuitively know as the temperature is a perfect candidate for this variable. Indeed, θ is denoted the empirical temperature. Equation (1.7), which links θ to the other state variables of system A, is called the equation of state. Furthermore, all equilibrium states that correspond to the same temperature θ in (1.7) form a manifold in the space of the state variables that is called an isotherm.

Example: We can get an illustration of the zeroth law by the following: Consider a container of gas in a big room. If it can thermally interact with its environment then after some time it will be in thermal equilibrium with the room. (It will adjust to the room temperature. If the room is big then the properties of the air in the room will not change noticeably through the thermal interaction with the gas and can be considered to be constant.) If one then changes the volume V of the gas slowly, so that it can always adjust to thermal equilibrium with the room, then the pressure p of the gas changes together
¹⁰ Note the functional form of $\theta_i$ may be different for the two systems. These systems may consist of completely different substances.

with the volume. If the gas is dilute then one finds that volume and pressure change according to
$$p\, V = \text{const.} \qquad (1.8)$$
This is the isotherm of an ideal gas, which describes any gas in the limit where it is sufficiently dilute. The constant in (1.8) depends on the temperature of the gas.
If we introduce a second container of gas in the room, then it will also approach thermal
equilibrium with its environment and adjust to the room temperature. Its volume and
pressure will then lie on the isotherm of the second gas.
The zeroth law tells us that if the two containers of gas are both in thermal equilibrium
with the room then they will be in thermal equilibrium with each other when brought
into contact, and their state variables will not change. This happens when volume and
pressure of the two gases lie on any point of their isotherms that correspond to the room
temperature.

1.2.2 The ideal gas and thermodynamic temperature

The zeroth law implies the existence of a temperature but it does not assign a value to
it. To define a temperature scale one can choose a suitable material whose properties
change with temperature. For example, one could use the height of mercury in a thin
tube to define temperature (in a certain range).
The system that is used to define the thermodynamic temperature is the ideal gas. This temperature is defined in terms of the constant of the isotherm (1.8) as
$$p V = R T\,. \qquad (1.9)$$
One reason for this choice is that this definition is material-independent, because all gases follow the same law in the low-pressure limit. We will see later that the temperature so defined also arises naturally in thermodynamic theory (up to a scale factor).
The law (1.9) when extrapolated to low temperatures implies that there is a zero tem-
perature at which the pressure of the gas is zero. This temperature is called absolute
zero. (In reality gases condense before they reach this limit.)
The law (1.9) still contains a constant R that needs to be defined. R is an extensive quantity, because the volume V is extensive, and it is expressed as $R = k_B N$, where N is the number of particles (molecules) in the gas. The remaining constant $k_B$ is fixed by choosing one unique temperature. This is the triple point of water, where the three phases of water (ice, liquid water and steam) co-exist in equilibrium. It occurs at a unique pressure and temperature. The value of this temperature was defined in 1954 to be 273.16, and the unit so defined is called the Kelvin (K) after Lord Kelvin. The reason for the choice of 273.16 K is that this results in a 100 K difference between the ice and steam points of water at atmospheric pressure. They then occur at 273.15 K and 373.15 K respectively. As a consequence, the Kelvin scale differs from the Celsius scale by a mere shift of 273.15 degrees. (Absolute zero is at -273.15 degrees Celsius.)
The ideal gas law (1.9) then has the form
$$p\, V = N k_B T\,. \qquad (1.10)$$

This is the equation of state of an ideal gas. There are deviations from this law for
real gases, but they become smaller and smaller in the limit of low pressure.
The constant $k_B$ is called the Boltzmann constant and has the value $k_B = 1.38 \times 10^{-23}$ J/K (Joules per Kelvin). To get an estimate for the order of magnitude of the number of particles in a gas one can consider one mole. One mole of a substance is defined as the amount of a substance that contains as many elementary entities as there are carbon atoms in 12 g of Carbon-12 (¹²C). For one mole of a gas, the number of particles is given by $N_A = 6.02 \times 10^{23}$, called Avogadro's number.
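As a quick sanity check of the orders of magnitude quoted here (a minimal Python sketch of ours, not from the notes), the equation of state (1.10) gives the number of molecules in one litre of gas at atmospheric pressure and room temperature:

```python
k_B = 1.38e-23      # Boltzmann constant, J/K
p   = 1.013e5       # atmospheric pressure, Pa
V   = 1.0e-3        # one litre, m^3
T   = 293.0         # room temperature, K

# Equation of state pV = N k_B T, solved for the number of molecules
N = p * V / (k_B * T)
print(f"N = {N:.2e}")   # ~2.5e22, consistent with the ~10^23 quoted in week 1
```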

Appendix: Not examinable

Lagrange multipliers
Lagrange multipliers are very useful in minimisation or maximisation problems where the
given system must satisfy certain constraints.
Imagine that you are asked to determine the minimum of the function $f(x, y) = x^2 + y^2$ with the condition that x and y satisfy the following relationship: $g(x, y) = y - 2x - 1 = 0$.

It is easily solved by expressing y in terms of x in f(x, y):
$$F(x) \equiv f(x, y(x)) = f(x, 2x + 1) = 5x^2 + 4x + 1\,.$$
Then the minimum follows as
$$0 = \left.\frac{dF}{dx}\right|_{x_0} = 10 x_0 + 4 \;\Rightarrow\; x_0 = -\frac{2}{5}\,, \quad y_0 = \frac{1}{5}\,, \quad f(x_0, y_0) = \frac{1}{5}\,.$$

There is another way to solve this problem. We define a function that depends on three parameters x, y and λ:
$$h(x, y, \lambda) \equiv f(x, y) - \lambda\, g(x, y) = x^2 + y^2 - \lambda\,(y - 2x - 1)\,. \qquad (1.11)$$
A minimum of h with respect to all three parameters is then determined:
$$0 = \frac{\partial h}{\partial x} = \frac{\partial f}{\partial x} - \lambda \frac{\partial g}{\partial x} = 2x + 2\lambda\,,$$
$$0 = \frac{\partial h}{\partial y} = \frac{\partial f}{\partial y} - \lambda \frac{\partial g}{\partial y} = 2y - \lambda\,,$$
$$0 = \frac{\partial h}{\partial \lambda} = -g(x, y) = 2x + 1 - y\,. \qquad (1.12)$$
The solution of this system is $y_0 = 1/5$, $x_0 = -2/5$, $\lambda_0 = 2/5$, with the same values as before. Note that the minimisation w.r.t. λ returns the constraint condition $2x + 1 - y = 0$.
Why does this yield the same solution as in the first case above? We don't provide a formal proof but give the following argument. If you plot the contour lines f(x, y) = const. of the function f and the constraint condition g(x, y) = 0, then you can easily convince yourself that the extremum has to be at a point where the curve g(x, y) = 0 is tangent to a contour line f(x, y) = c. (Consider a point where g = 0 crosses a contour line of f. Why can this not be an extremum?) But then the gradient vectors of f and g have to be parallel, because they are orthogonal to the contour lines. Hence one obtains
$$\nabla f(x, y) = \lambda\, \nabla g(x, y) \quad \text{in addition to} \quad g(x, y) = 0\,, \qquad (1.13)$$
where λ is the proportionality factor. These conditions are identical to the ones in (1.12).

The method can be extended to higher dimensions and more constraints. We state here only the results. In n dimensions
$$\nabla\big(f(\mathbf{x}) - \lambda\, g(\mathbf{x})\big) = 0\,, \qquad g(\mathbf{x}) = 0\,, \qquad (1.14)$$
where x is an n-dimensional vector. In the case of two constraints one has
$$\nabla\big(f(\mathbf{x}) - \lambda_1 g_1(\mathbf{x}) - \lambda_2 g_2(\mathbf{x})\big) = 0\,, \qquad g_1(\mathbf{x}) = 0\,, \qquad g_2(\mathbf{x}) = 0\,. \qquad (1.15)$$
This generalises to more than two constraints in the obvious way.
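As a numerical cross-check of the worked example (our own sketch; scipy's constrained minimiser plays the role of the multiplier method):

```python
from scipy.optimize import minimize

# Minimise f(x, y) = x^2 + y^2 subject to g(x, y) = y - 2x - 1 = 0
f = lambda v: v[0]**2 + v[1]**2
g = lambda v: v[1] - 2*v[0] - 1

res = minimize(f, x0=[0.0, 0.0], constraints=[{'type': 'eq', 'fun': g}])
print(res.x, res.fun)   # [-0.4, 0.2] and 0.2, i.e. (x0, y0) = (-2/5, 1/5), f = 1/5
```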

Lecture Notes - Week 2 (Lectures 4-6)
1.2.3 Thermodynamic reversibility

We want to describe thermodynamic processes in the space of the state variables. We are particularly interested in processes that can be reversed after they have taken place - returning a system to its initial state. We define a process to be reversible if its direction can be reversed by an infinitesimal change in the conditions. Thermodynamic reversibility requires that the process is quasistatic:

Quasistatic processes These are processes which are carried out so slowly that every
state through which the system goes can be considered an equilibrium state (i.e. the
process should be slow compared to the relaxation time of the system). Fast changes lead
to different parts of the system not being in equilibrium with each other. For example,
consider moving a piston of a gas cylinder to compress a gas inside it. If this is done
fast, sound/shock waves will be set up creating regions of different pressure/temperature.
Clearly this is irreversible since moving the piston out does not reverse the sound waves.
Quasistatic processes can be described using continuous curves in state space. Non-quasistatic processes cannot be so represented, because they involve states that are not equilibrium states and hence cannot be represented by a point in state space.
Since all points on a quasistatic process are at equilibrium we can stop at any point in
the process and the system does not change. A quasistatic process is reversible if it can
be reversed by infinitesimal changes of the parameters of a system. Not all quasistatic
processes are reversible. A common source of irreversibility is friction. Another example
of irreversible systems are those with hysteresis. If a process in these systems is reversed
it does not retrace its previous path (e.g. magnetisation vs. magnetic field).
To proceed further, we need to introduce and make precise the concept of heat.

1.3 Heat and the first law of thermodynamics


1.3.1 An historical overview

The industrial revolution and resulting mechanisation in Europe/America during the 18th/19th century meant that machines such as the steam engine were invented, which involved heating fuel and generating mechanical power. It was due to the need for precise ideas of what heating the material and temperature increase meant that thermodynamics was developed. By the late 18th century there were two competing notions of heat. One was the caloric theory, which proposed that heat was an indestructible fluid permeating matter, and the other was the molecular motion theory, which proposed vibrations of molecules as the cause of heat. Some highlights of the experimental progression were:

1761 - Joseph Black studied ice melting and noted ice-cold water warmed up much more quickly than ice+water. The caloric point of view was that water contained more caloric than ice.
1799 - Humphry Davy showed wax and ice could be made to melt by rubbing two pieces together. The caloric point of view was that rubbing squeezed out caloric.
1799 - Count Rumford showed unlimited heat could be obtained by drilling with a blunt tool (as drilling went on, the tool got hotter and hotter), which was difficult for the caloric sympathisers to explain as it implied an infinite amount of caloric.
1840s - James Joule finally put the nail in the caloric coffin by showing that heat and work were equivalent. He produced heating in a variety of thermally isolated systems by performing work on them. He compared the amount of work required to produce a given amount of heat (T rise for fixed mass) and showed that they were in fixed proportion to each other. Thus caloric is not an independent entity, but can be considered a type of a more general conserved quantity, energy.
1900s - Thermodynamics fully established

1.3.2 The first law

In his experiments Joule described heat and work in terms of changes in the state variables of the system (heat was specified in terms of the temperature rise of a unit of water, and work was defined in terms of generalised forces and displacements). In order to make precise the idea of heat as a form of energy, however, it is convenient to introduce the equivalence of heat and work in terms of changes in energy of a system.
This definition of heat is incomplete without a precise way to control it. We define a thermally insulated or adiabatic system as one where changes in energy are due only to work done (i.e. of mechanical origin). We also say that a system that cannot exchange heat with its environment is enclosed by adiabatic walls (or is adiabatically insulated).
Now we can make a first formal statement of the first law :
If the state of a thermally isolated (i.e. adiabatic) system is changed by the performance
of work, the amount of work needed depends only on the change effected (i.e. initial
and final states) and not on the means by which the work was performed nor on the
intermediate stages through which the system passes between its initial and final states.

1.3.3 Internal energy

The first law as stated above leads to a natural assumption: that a macroscopic system in thermodynamic equilibrium has an energy. Changes in its state variables will in general lead to changes in the total energy of the system. Conservation of energy implies that if we have a thermally isolated system which goes from state 1 to state 2:
$$\Delta E_{12} = E(2) - E(1) = W_{12}\,,$$
and the 1st law requires that this does not depend on the path, so the total energy of the system should be a state function which we call the internal energy E.

$W_{12}$ refers to the work done on the system. If $W_{12} > 0$, work is done on the system and if $W_{12} < 0$, work is done by the system. Work can be done by or on the system if we change the value of its extensive state variables (e.g. the volume V of a gas or the length L of an elastic band). Hence, if all the external parameters of the system which we can control are held fixed, then no work is done by or on the system.
However, we know empirically that for non-adiabatic processes $\Delta E_{12} \neq W_{12}$. We also know empirically that when we relax the adiabatic constraint the work done on a system depends on the path from 1 to 2. To generalise the first law to non-adiabatic processes it is assumed that the internal energy E is a state function for all systems (both adiabatic and non-adiabatic ones), and
$$\Delta E_{12} = Q_{12} + W_{12} \;\Leftrightarrow\; Q_{12} = \Delta E_{12} - W_{12}\,, \qquad (1.11)$$

and Q12 is defined as the heat absorbed by the system from its environment. The first
law in this form generalises the concept of conservation of energy to include processes
where heat flow occurs.
Heat, then, is the spontaneous flow of energy from one system to another. Heat is also the amount of energy transferred between a macroscopic body and its environment when no work is done, either by the system on its environment or by the environment on the system, and no matter is exchanged between them.
The energy difference $\Delta E = Q + W$ is independent of the path between two states (because E is a state function) while in general Q, W are not (Q, W are NOT state functions/variables). In other words, given an initial and final state, we can determine the change in internal energy but not the heat flow or work done. To determine Q and W we need information about the path from initial to final state.

1.3.4 General statement of the First Law

We can restate the first law of thermodynamics:²

The internal energy of a macroscopic system is an extensive state function of the thermodynamic state variables, $E = E(\{a_i\})$. The change in the internal energy of the system is the sum of the heat added to it and the work done on it,
$$\Delta E = Q + W\,,$$
where Q is the heat added and W is the work done on the system.
For infinitesimal changes, we can write in differential form
$$dE = đQ + đW\,, \qquad (1.12)$$
where đQ is the differential heat added and đW is the differential work done on the system. The use of đ instead of d indicates that đQ and đW are not exact differentials.
There are a number of points that we need to appreciate about this formula:
² Extensivity here is due to the microscopic interactions being short-ranged, so that the total energy of a composite system is very close to the sum of the energies of its parts; if this is not the case (e.g. gravity), we need to be more careful.

1. By convention we take Q as positive for heat added to the system and negative for heat flowing out of the system, while W is positive for work done on the system and negative for work done by the system.

2. For quasistatic processes the differential work đW has the form
$$đW = \mathbf{f} \cdot d\mathbf{x}\,, \qquad (1.13)$$
where f is a generalized force and x is its conjugate generalized displacement. For example,
$$\text{(gas)} \quad đW = -p\, dV\,, \quad p = \text{pressure}\,, \ V = \text{volume}\,,$$
$$\text{(rubber band)} \quad đW = \Gamma\, dL\,, \quad \Gamma = \text{tension}\,, \ L = \text{length}\,.$$

3. Dimensional analysis is a powerful consistency tool in mathematical descriptions of the physical world. If we use an equation to describe a physical phenomenon, then both sides of the equation must have the same dimensions. We typically write the dimensions of a quantity y as [y]. The dimensions of all physical quantities can be built up from the dimensions of basic or fundamental physical dimensions (usually mass m, length l, time t, electric charge q, and temperature T). Units of measurement must also satisfy dimensional analysis.
We note, of course, that the first law implies that work and heat have dimensions of energy. The dimensions of (a component of) force are $[f_i] = m\, l\, t^{-2}$: in SI units, forces are measured in Newtons (N), i.e. 1 kg m s⁻². The dimensions of (a component of) displacement are $[x_i] = l$: in SI units, distances are measured in metres (m). Hence the dimensions of energy, work and heat are $[E] = m\, l^2\, t^{-2}$: in SI units, energy (and work and heat) is measured in N m. The product N m is also called the Joule (J).³ You should convince yourself that the products of all the given examples of generalized forces and generalized displacements have dimensions of energy (use Google if you have to)!

4. The first law can also be seen as a definition for heat (and is totally consistent with our discussion of what heat is earlier), $đQ = dE - đW$. Heat is the change in energy of a system on which no work is done, or that part of the change in energy of a system which is not due to work done on or by the system.

5. Since heat and work are not state functions, there are no functions Q, W(p, V, T) to which we can associate total differentials dQ, dW; hence we use đQ, đW to represent the differential forms corresponding to them.

6. Since E is a state function, this means that E is a function of the state variables, $E(\{Y_i\})$, and we can define an infinitesimal change in energy in terms of partial derivatives w.r.t. the state variables,
$$dE = \sum_i \left(\frac{\partial E}{\partial Y_i}\right)_{Y_{j \neq i}} dY_i\,,$$
where the $Y_i$ are the state variables.


³ SI units refer to a globally agreed standard system of units of measurement for science, abbreviated from the French: Système International d'unités.

7. Since dE is an exact differential, this means that the energy difference between any two states A, B, $E_{BA} = E_A - E_B = \int_B^A dE$, depends only on the state variables at A, B and not on the path between A and B.

Example: If the internal energy of a volume of gas went from 25 kJ to 16 kJ and the work done by the gas was 5 kJ, how much heat did the gas get in the process?
$\Delta E = -9$ kJ, $W = -5$ kJ, so $Q = \Delta E - W = -4$ kJ. The system loses 4 kJ in heat.
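The same bookkeeping as a one-line Python check (our sketch; the sign convention is the one stated above, with W the work done on the system):

```python
def heat_absorbed(dE_kJ, W_on_system_kJ):
    """First law: Q = dE - W, with W the work done ON the system."""
    return dE_kJ - W_on_system_kJ

# Internal energy drops from 25 kJ to 16 kJ; work done BY the gas is 5 kJ,
# so the work done ON the gas is -5 kJ
print(heat_absorbed(16 - 25, -5))   # -4: the gas loses 4 kJ as heat
```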

1.3.5 Caloric Equation

Recall that last week we discussed the ideal gas equation
$$p V = N k_B T\,,$$
which was an example of an equation of state.

As a consequence of the equation of state we can specify the state of an ideal gas by two state variables (if we keep the number of particles N fixed), for example p and V. Other state functions/variables like the temperature T are then determined.
The internal energy of a thermodynamic system is specified by the caloric equation. For an ideal gas it has a particularly simple form (which will be derived in the statistical mechanics part of the course). If the ideal gas is monatomic then
$$E = \frac{3}{2} N k_B T\,, \qquad (1.14)$$
i.e. it can be written as a function of a single state variable T. In general the expression for the caloric equation of an ideal gas is
$$E = \frac{f}{2} N k_B T\,, \qquad (1.15)$$
where e.g. f = 3 for a monatomic gas and f = 5 for a diatomic gas.⁴
Examples of monatomic gases are argon (Ar) and helium (He); diatomic gases are oxygen (O₂), nitrogen (N₂) or carbon monoxide (CO); a triatomic gas is carbon dioxide (CO₂), ...

Examples of thermodynamic processes: We calculate for some quasistatic thermodynamic processes the work $W = -\int_{V_i}^{V_f} p\, dV$ done on an ideal gas during the process from an initial state to a final state. Note that the energy difference ΔE can be calculated from the caloric equation, and the heat flow Q then from the first law.

1. Isobaric: constant pressure, p = const., $W = -p\,\Delta V = -p\,(V_f - V_i)$.

2. Isochoric: constant volume, V = const., W = 0.

3. Isothermal: constant temperature, T = const. Using the equation of state we obtain $W = -\int_{V_i}^{V_f} N k_B T\, V^{-1}\, dV = -N k_B T \ln(V_f / V_i)$.
⁴ We will see later that this is related to the equipartition theorem in the statistical mechanics part of the course.

4. Adiabatic: no heat transfer, đQ = 0, so $dE = đW = -p\, dV$. On the other hand, the caloric equation for a monatomic ideal gas states that
$$E = \frac{3}{2} N k_B T = \frac{3}{2} p V \;\Rightarrow\; dE = \frac{3}{2}(V\, dp + p\, dV)\,,$$
where we used the equation of state. If we combine this with $dE = -p\, dV$ we obtain
$$\frac{3}{2} V\, dp + \frac{5}{2} p\, dV = 0 \;\Rightarrow\; \frac{dp}{p} + \frac{5}{3}\frac{dV}{V} = 0 \;\Rightarrow\; p\, V^{5/3} = \text{constant}\,.$$
Hence $W = -\int_{V_i}^{V_f} c\, V^{-5/3}\, dV$, where c is a constant.
What happens for a diatomic gas?

Figure 2: p-V plot showing adiabatic ($p \sim V^{-5/3}$) and isothermal ($p \sim 1/V$) expansion of a monatomic ideal gas.
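The work integrals above are easy to cross-check numerically. A minimal Python sketch (ours, not from the notes; arbitrary units, with the gas compressed to half its volume) compares the closed-form isothermal result with direct numerical integration, and evaluates the adiabatic work:

```python
import math
from scipy.integrate import quad

N_kB_T = 1.0         # N k_B T_i in arbitrary units
Vi, Vf = 1.0, 0.5    # the gas is compressed to half its volume
gamma = 5.0 / 3.0    # monatomic ideal gas

# Isothermal: p = N k_B T / V, so W = -N k_B T ln(Vf/Vi)
W_iso_exact = -N_kB_T * math.log(Vf / Vi)
W_iso_num, _ = quad(lambda V: -N_kB_T / V, Vi, Vf)

# Adiabatic: p V^gamma = c, with c fixed by the initial state
c = (N_kB_T / Vi) * Vi**gamma
W_ad_num, _ = quad(lambda V: -c * V**(-gamma), Vi, Vf)

print(W_iso_exact, W_iso_num)   # both ~0.693
print(W_ad_num)                 # ~0.881: the adiabat is steeper, so more work is needed
```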

1.3.6 Refining Thermodynamic Equilibrium

Consider two bodies which are brought into contact with each other. Their properties change until the systems reach thermodynamic equilibrium. In general this involves work-like and thermal processes. Sometimes one of the work-like processes is treated separately. This is chemical work. It consists of adding particles to a system:
$$đW_{\text{chem}} = \mu\, dN\,,$$
where μ is the chemical potential. So adding particles to a system requires work.


Systems reach a chemical equilibrium if no particles are exchanged anymore. Systems
reach a mechanical equilibrium if no work is done anymore. Systems reach a thermal
equilibrium if no heat is exchanged anymore.
A macroscopic system is in thermodynamic equilibrium if it is simultaneously in chem-
ical, mechanical and thermal equilibrium.

Example: Consider the chemical reaction
$$A + B \rightleftharpoons C\,.$$
The chemical work done in this reaction is $đW_{\text{chem}} = \mu_A\, dN_A + \mu_B\, dN_B + \mu_C\, dN_C$. However $dN_C = -dN_A = -dN_B$, so
$$đW_{\text{chem}} = dN_A\,(\mu_A + \mu_B - \mu_C)\,.$$
At chemical equilibrium the system gains no energy by performing the chemical reaction in either direction, hence $đW_{\text{chem}} = 0$. So at chemical equilibrium $\mu_A + \mu_B = \mu_C$.

1.3.7 Heat capacity

The heat capacity of a macroscopic object is the amount of heat required to raise its temperature by one unit (e.g. degrees Celsius, °C, or Kelvin, K):
$$C = \frac{đQ}{dT}\,. \qquad (1.16)$$
Since heat is not a state function we also need to specify the path by which it is supplied:
$$C = \frac{đQ}{dT} = \frac{dE - đW}{dT} = \frac{dE + p\, dV}{dT}\,.$$
For example, consider supplying heat quasistatically to a gas at constant volume or constant pressure:
$$C_V = \left(\frac{\partial E}{\partial T}\right)_V \qquad (1.17)$$
$$C_p = \left(\frac{\partial E}{\partial T}\right)_p + p\left(\frac{\partial V}{\partial T}\right)_p \qquad (1.18)$$
Since heat capacity is extensive, we will often be using the specific heat capacity, which is the heat capacity per unit mass.
The heat capacity at constant pressure is conveniently expressed in terms of a new state function called the enthalpy,
$$H = E + pV\,, \qquad (1.19)$$
in terms of which
$$C_p = \left(\frac{\partial H}{\partial T}\right)_p\,.$$
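For the monatomic ideal gas of (1.14), a short symbolic check (our sketch, using sympy; not part of the notes) of (1.17) and (1.18) gives $C_V = \frac{3}{2} N k_B$ and $C_p = \frac{5}{2} N k_B$, so $C_p - C_V = N k_B$:

```python
import sympy as sp

N, kB, T, p = sp.symbols('N k_B T p', positive=True)

E = sp.Rational(3, 2) * N * kB * T   # caloric equation (1.14), monatomic ideal gas
V = N * kB * T / p                   # equation of state (1.10), solved for V

C_V = sp.diff(E, T)                          # (1.17): E depends on T only
C_p = sp.diff(E, T) + p * sp.diff(V, T)      # (1.18)
print(C_V, C_p, sp.simplify(C_p - C_V))      # 3*N*k_B/2, 5*N*k_B/2, N*k_B
```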

We can sometimes put heat into a system without changing the temperature. Two common situations are

1. at a phase transformation, e.g. water ⇌ ice, water ⇌ steam;

2. in an isothermal process at constant T (for which C is not defined).

1.4 The Second Law of Thermodynamics


The 0th and 1st laws of thermodynamics helped us introduce temperature and heat.
The 2nd law of thermodynamics will allow us to introduce a further thermodynamic
variable, entropy.
We know from daily experience that many macroscopic processes occur only in one
direction. Consider for example

1. A hot and a cold body are brought into contact. Eventually they have the same T, i.e. heat flows from the hotter to the colder body.

2. A piece of rubber sliding across a desk eventually comes to rest due to friction. During this process the kinetic energy of the rubber is transformed into internal energy of the rubber (and desk), which can be noticed by a small temperature increase.

Each of these processes satisfies the first law and we know from experience that they are
typical. The reverse processes would also satisfy the first law, but they never happen.
Nature at the macroscopic scale is inherently irreversible and we need to take this into
account when describing macroscopic systems. We need an additional principle to do
this. Providing this principle is the first function of the second law of thermodynamics.
The second function of the second law is to express the inherent limit of efficiency with
which heat can be converted into work.

1.4.1 Heat Engines and the 2nd Law

Engines were the great advance of the industrial revolution and there was great impetus in the 19th century to better understand and to improve engines. Apart from irreversibility, the other motivation for the second law was to understand the limits of efficiency with which heat can be converted into work. In order to do this we will need a conceptual idea of a machine that does that.
A heat engine is a device that absorbs heat and converts part of it to work. The engine
should not change in doing this, i.e. after completing a prescribed set of processes which
do this conversion it must return to its initial state. The set of these processes is called
a cycle.
Figure 3: A heat engine: the working substance (system) absorbs heat $Q_h$ from the hot reservoir, does work W, and rejects heat $Q_c$ to the cold reservoir.

The principle of a heat engine is illustrated in Figure 3. It consists of a working substance (system) that absorbs some heat $Q_h$ from a heat source, converts some portion into work W, and rejects the rest, $Q_c$, by transferring it to a reservoir.⁵ We can define
⁵ Note that the work W refers here to the work done by the system (working substance). By convention, we have $W, Q_h, Q_c > 0$, i.e. for the engine we have W > 0 for work done by the system, $Q_h > 0$ if the system absorbs heat $Q_h$ from the hot reservoir and $Q_c > 0$ if the system dumps heat $Q_c$ into the cold reservoir.

the efficiency $\eta_e$ of the heat engine as the fraction of the heat converted into work:
$$\eta_e = \frac{W}{Q_h}\,. \qquad (1.20)$$
Since the system is unchanged after its cycle, its internal energy remains constant, so
$$Q_h = Q_c + W \;\Rightarrow\; W = Q_h - Q_c\,. \qquad (1.21)$$
Therefore the efficiency is given by
$$\eta_e = 1 - \frac{Q_c}{Q_h} \leq 1\,. \qquad (1.22)$$
Clearly, $\eta_e$ cannot be greater than 1 (and is 1 only if $Q_c = 0$). In fact, the Kelvin statement of the second law says that $\eta_e = 1$ is impossible.
A refrigerator is a heat engine running in reverse. By doing mechanical work, it extracts
heat from a cold reservoir and transfers it to a hot one.
Figure 4: A refrigerator: work W is done on the working substance (system), which absorbs heat $Q_c$ from the cold reservoir and dumps heat $Q_h$ into the hot reservoir.

We can similarly define the efficiency of a refrigerator,
$$\eta_r = \frac{Q_c}{W} = \frac{Q_c}{Q_h - Q_c}\,. \qquad (1.23)$$
We see that, unlike the heat engine, the efficiency of a refrigerator can be greater than one; however we also see that it rapidly goes down as the amount of rejected heat goes up (i.e. as $Q_h \to \infty$).⁶
There are several ways to formulate the second law of thermodynamics. Two famous ones are:
The Clausius statement: No process is possible whose sole result is the transfer
of heat from a colder to a hotter body.

The Kelvin statement: No process is possible whose sole result is the complete
conversion of heat into work.
⁶ Note that the work W refers here to the work done on the system (working substance). Again by convention, we have $W, Q_h, Q_c > 0$, i.e. for the refrigerator we have W > 0 for work done on the system, $Q_c > 0$ if the system absorbs heat $Q_c$ from the cold reservoir and $Q_h > 0$ if the system dumps heat $Q_h$ into the hot reservoir.

Exercise: Prove that these are equivalent by showing that if one is violated then so is
the other. Hint: do this by combining a refrigerator with a heat engine into a combined
engine.

1.4.2 The Carnot Engine

There is a particularly simple cyclic process which has played an important role in the
development of thermodynamics and the understanding of engines in general.
The Carnot engine (or Carnot cycle) is an engine which works by following the cycle below. The working substance:

1. expands isothermally and reversibly at $T_h$, absorbing heat $Q_h$ (from the hot reservoir)

2. expands adiabatically and reversibly, with the temperature changing from $T_h$ to $T_c$

3. is compressed isothermally and reversibly at $T_c$, giving up heat $Q_c$ (to the cold reservoir)

4. is compressed adiabatically and reversibly from $T_c$ back to the initial state at $T_h$.

The cycle is bounded by two adiabats and two isotherms, and we note that heat is absorbed and rejected at two fixed temperatures, $T_h$ and $T_c$.

Figure 5: A Carnot cycle for a monatomic ideal gas: isothermal expansion a → b at $T_h$ (absorbing $Q_h$), adiabatic expansion b → c, isothermal compression c → d at $T_c$ (rejecting $Q_c$), and adiabatic compression d → a.

1.4.3 Ideal Gas Carnot Engine

Let us consider the Carnot engine with a monatomic ideal gas as working substance.
We can calculate the work done on/by and heat flow into/out of the system in each of
the four steps.

(i) In step a → b the work done by the gas is
$$W_{ab} = \int_a^b p\, dV = N k_B T_h \ln\!\left(\frac{V_b}{V_a}\right).$$
The heat absorbed in step a → b is obtained using the fact that $\Delta E_{ab} = 0$ from the caloric equation:
$$Q_{ab} = N k_B T_h \ln\!\left(\frac{V_b}{V_a}\right).$$

(ii) Step b → c is adiabatic (no heat transfer), so
$$Q_{bc} = 0\,.$$
For step b → c we have $p V^{\gamma} = \text{constant} \equiv D$ with $\gamma = 5/3$, hence
$$W_{bc} = \int_b^c p\, dV = D \int_b^c V^{-\gamma}\, dV = D \left[\frac{V^{1-\gamma}}{1-\gamma}\right]_b^c = \left[\frac{p V}{1-\gamma}\right]_b^c\,,$$
where we used $D V^{-\gamma} = p$. Furthermore, applying the equation of state we obtain
$$W_{bc} = \left[\frac{N k_B T}{1-\gamma}\right]_b^c = \frac{N k_B (T_h - T_c)}{\gamma - 1}\,.$$
We could also have used the fact that $W_{bc} = -\Delta E_{bc}$ to obtain this result.

(iii) The work done and heat absorbed in step c → d are calculated in a similar fashion to a → b, and are given by
$$W_{cd} = N k_B T_c \ln\!\left(\frac{V_d}{V_c}\right) = Q_{cd}\,.$$
Note that $Q_{cd} < 0$, implying heat is dumped into the cold reservoir ($Q_c = -Q_{cd}$).

(iv) Finally, in step d → a, we obtain the work done and heat absorbed respectively as
$$W_{da} = \frac{N k_B (T_c - T_h)}{\gamma - 1}\,; \qquad Q_{da} = 0\,.$$

The efficiency of the engine is then given by
$$\eta_e = \frac{W_{\text{cycle}}}{Q_h} = \frac{W_{ab} + W_{bc} + W_{cd} + W_{da}}{Q_{ab}} = \frac{T_h \ln(V_b/V_a) + T_c \ln(V_d/V_c)}{T_h \ln(V_b/V_a)}\,. \qquad (1.24)$$
Note that this is also equal to $\eta_e = 1 - Q_c/Q_h$. Finally, we can obtain a relationship between $V_b/V_a$ and $V_d/V_c$ using the fact that steps b → c and d → a are adiabatic, so
$$T_h V_b^{\gamma-1} = T_c V_c^{\gamma-1}\,, \qquad T_h V_a^{\gamma-1} = T_c V_d^{\gamma-1} \;\Rightarrow\; \frac{V_d}{V_c} = \frac{V_a}{V_b} = \left(\frac{V_b}{V_a}\right)^{-1}$$
and hence the efficiency of the ideal gas Carnot engine is
$$\eta_e = 1 - \frac{Q_c}{Q_h} = 1 - \frac{T_c}{T_h}\,. \qquad (1.25)$$

We note that all steps are reversible and that we can reverse the cycle to extract heat $Q_c$ from the cold reservoir by doing work $W_{\text{cycle}}$. In this way we obtain an ideal gas Carnot refrigerator. It has the efficiency
$$\eta_r = \frac{Q_c}{Q_h - Q_c} = \frac{T_c}{T_h - T_c}\,. \qquad (1.26)$$
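A numerical cross-check of steps (i)-(iv) (a minimal Python sketch of ours; the parameter values are arbitrary) confirms that the cycle efficiency equals $1 - T_c/T_h$:

```python
import math

N_kB = 1.0              # N k_B in arbitrary units
Th, Tc = 400.0, 300.0   # reservoir temperatures
Va, Vb = 1.0, 2.0       # isothermal expansion a -> b
gamma = 5.0 / 3.0       # monatomic ideal gas

# Volumes at c and d follow from the adiabats: T V^(gamma-1) = const
Vc = Vb * (Th / Tc) ** (1.0 / (gamma - 1.0))
Vd = Va * (Th / Tc) ** (1.0 / (gamma - 1.0))

W_ab = N_kB * Th * math.log(Vb / Va)     # = Q_h, heat absorbed at Th
W_bc = N_kB * (Th - Tc) / (gamma - 1.0)  # adiabatic expansion
W_cd = N_kB * Tc * math.log(Vd / Vc)     # = Q_cd < 0, heat rejected at Tc
W_da = N_kB * (Tc - Th) / (gamma - 1.0)  # adiabatic compression

eta = (W_ab + W_bc + W_cd + W_da) / W_ab
print(eta, 1.0 - Tc / Th)                # both 0.25
```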

Note that the efficiencies are independent of the other state variables of the system. This is a very useful result because of the following theorem:

1.4.4 First Carnot Theorem

All Carnot engines operating between two reservoirs at constant temperatures $T_h$ and $T_c$ have the same universal efficiency $\eta = 1 - T_c/T_h$.

Proof: Suppose we have two Carnot engines with different efficiencies η and η′ operating between $T_h$ and $T_c$. Let us choose for definiteness η > η′.
Since all of the steps of the Carnot engine are reversible, we can run one of the Carnot engines backwards as a refrigerator. Let us run the two engines in parallel as in Figure 6, using Carnot-η as a heat engine and Carnot-η′ as a refrigerator. Let us adjust the parameters of the two engines in such a way that the work W done by the Carnot-η engine is equal to the work done on the Carnot-η′ engine. Combining the two engines, this means that there is no work done on or by the combined engine.

Figure 6: Two Carnot cycles with different efficiencies η, η′, one run as an engine and the other as a refrigerator, between the same reservoirs at $T_h$ and $T_c$.

Due to our assumption that η > η′ we have
$$Q_h = \frac{W}{\eta} < \frac{W}{\eta'} = Q_h'\,,$$
and
$$Q_c = Q_h - W < Q_h' - W = Q_c'\,.$$
This implies that the combined engine absorbs heat $Q_c' - Q_c > 0$ from the cold reservoir and transfers it to the hot reservoir. This violates the Clausius statement of the second law, and we conclude that η > η′ is not possible.
We get a similar conclusion if we choose η′ > η and run the Carnot-η as a refrigerator and the Carnot-η′ as an engine.
Therefore the only possible solution is
$$\eta' = \eta\,. \qquad (1.27)$$

An obvious corollary is that the efficiency of a Carnot engine is independent of the working substance.

1.4.5 Second Carnot Theorem

No heat engine operating between two reservoirs (at constant temperatures $T_h$ and $T_c$) is more efficient than a Carnot engine operating between the same two reservoirs, i.e.
$$\eta_e = 1 - \frac{Q_c}{Q_h} \leq 1 - \frac{T_c}{T_h} \quad \Leftrightarrow \quad \frac{Q_c}{Q_h} \geq \frac{T_c}{T_h}\,. \qquad (1.28)$$

Proof: Consider a (possibly) irreversible engine with an efficiency greater than that of a Carnot engine, $\eta_{\text{irr}} > \eta$.
Let us combine this engine with efficiency $\eta_{\text{irr}}$ again with a Carnot engine with efficiency η running backwards as a refrigerator (see Figure 6).⁷
The combined effect of both machines (irreversible engine + Carnot refrigerator) is solely to transfer heat from the cold reservoir to the hot one, violating the Clausius statement of the 2nd law.⁸ Therefore we conclude
$$\eta_{\text{irr}} \leq \eta\,. \qquad (1.29)$$
For a reversible engine the roles of the two engines can be interchanged and we obtain $\eta_{\text{rev}} = \eta$, but this cannot be concluded for an irreversible engine.
Remarks: Using the 2nd law, we found a limit on the efficiency of engines working between large reservoirs (which may reasonably be assumed to have constant temperature). However, while the Carnot engine is maximally efficient, it is also impractical. Since all the steps are reversible they are also quasistatic, and so heat flows very slowly and it takes a long time to get a significant amount of work done. A real engine must make a compromise between maximising either the efficiency of doing work or the
⁷ Note that we can run the Carnot engine backwards as a fridge (by reversing inputs/outputs) because it is reversible, while for an irreversible engine this cannot in general be done.
⁸ Note that if $\eta_{\text{irr}} < \eta$ then we have transferred heat from the hot to the cold reservoir, which of course is absolutely allowed and expected.

rate of doing work. Real engines operate irreversibly and hence at lower efficiency than Carnot engines, but also at a faster rate.
Previously, we defined thermodynamic temperature using the ideal gas equation of state. The disadvantage of this is that the ideal gas is not a real physical substance, but a limit of low-density real gases. We see now that thermodynamic temperature can be defined more fundamentally using the Carnot theorems.
Finally we point out that (1.28) implies that
$$\frac{Q_h}{T_h} - \frac{Q_c}{T_c} \leq 0 \qquad \text{or} \qquad \sum \frac{Q}{T} \leq 0\,,$$
where Q denotes the heat absorbed by the system at temperature T. This relation will be examined in more detail in the next section.

Lecture Notes - Week 3 (Lectures 7-9)
Example

A power plant produces 1 GW of electricity with an efficiency of 40% (typical of coal-fired plants). At what rate does it expel heat into the environment? Let the cold reservoir be a river with a volume flow rate of $\dot V = 100\ \mathrm{m^3/s}$. How much does the temperature of the river rise? (The specific heat of water is $c_v = 4.2\ \mathrm{kJ\, K^{-1}\, kg^{-1}}$, the density of water is $\rho = 1000\ \mathrm{kg\, m^{-3}}$.)
The efficiency is given by $\eta = W/Q_h$ and, assuming a constant rate of both work and heat production, the efficiency is also given by $\eta = \dot W / \dot Q_h$, and the rate of production of waste heat is
$$\dot Q_c = \dot Q_h - \dot W = \dot W \left(\frac{1}{\eta} - 1\right) = (5/2 - 1)\, \dot W = 1.5\ \text{GW}\,.$$
The river absorbs all the heat from the plant and its temperature rises by ΔT. This temperature rise is related to the heat that is absorbed by the river per time interval Δt,
$$Q_{\text{river}} = \dot Q_c\, \Delta t = M c_v\, \Delta T\,,$$
where M is the mass of the water flowing by during the time interval Δt. It is given by $M = \rho \dot V \Delta t$. The resulting temperature rise ΔT is
$$\Delta T = \frac{\dot Q_c}{\rho \dot V c_v} \approx 3.6\ \text{K}\,.$$
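The same arithmetic as a short Python sketch (ours; values from the problem statement):

```python
eta   = 0.40     # plant efficiency
W_dot = 1.0e9    # electrical power output, W
rho   = 1000.0   # density of water, kg/m^3
V_dot = 100.0    # river volume flow rate, m^3/s
c_v   = 4200.0   # specific heat of water, J/(kg K)

Q_c_dot = W_dot * (1.0 / eta - 1.0)   # waste heat rate: 1.5e9 W
dT = Q_c_dot / (rho * V_dot * c_v)    # temperature rise of the river
print(Q_c_dot, dT)                    # 1.5 GW and ~3.57 K
```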

1.4.6 Clausius Theorem

The link between the 2nd law as proposed as a limit on the efficiency of engines and as a statement of the fundamental irreversibility of nature is provided by Clausius Theorem, which states:

Given any quasistatic thermodynamic system, for any closed cycle
$$\oint \frac{đQ}{T} \leq 0\,,$$
where đQ is the differential heat supplied to the system at temperature T. The equality necessarily holds for a reversible cycle.

Remark: The system needs to be quasistatic so that the system can be coupled to a Carnot engine in the proof below. For non-quasistatic systems it is not clear how to define many state variables, including the temperature T.

Proof: For the proof we combine the system that undergoes the cycle with a tiny Carnot engine that is coupled to a heat reservoir at $T_0$. The purpose of the Carnot engine is to provide the large system at every time step with the heat đQ. The Carnot engine is assumed to have a much faster cycle than the system, so that at each step of the system's cycle during which the heat đQ is received, the Carnot engine goes through a full cycle with efficiency
$$\eta = 1 - \frac{đQ}{đQ_h} = 1 - \frac{T}{T_0} \;\Rightarrow\; đQ_h = T_0\, \frac{đQ}{T}\,, \qquad (1.30)$$
where $đQ_h$ is the heat absorbed by the Carnot engine from the heat reservoir, see Fig. 7. During this step the system does the work đW and the Carnot engine does the work $đW_{CE}$.
Figure 7: System at a point in its cycle where it is absorbing heat $\delta Q$ and is at temperature $T$. It acts as one reservoir to the Carnot Engine. The other reservoir is at $T_0$.

Note that as the system goes through its cycle the quantities $T$, $\delta Q$ and $\delta W$ change, and consequently also $\delta Q_h$ and $\delta W_{CE}$ (but $T_0$ stays constant).
Integrating over a cycle of the system, the amount of heat absorbed by the Carnot engine from the heat reservoir in one such cycle is
$$Q_h = \oint \delta Q_h = T_0 \oint \frac{\delta Q}{T} \qquad (1.31)$$

By conservation of energy the total work done by the Carnot engine plus the system in one such system cycle is
$$W_T \equiv \oint (\delta W + \delta W_{CE}) = Q_h \qquad (1.32)$$

So the composite system either

1. absorbs heat from the reservoir and transforms it into work, or

2. transforms work into heat that is rejected at the reservoir.

From the Kelvin statement (1) is impossible, so it must convert work to heat, which implies that
$$Q_h \le 0 \quad\Rightarrow\quad \oint \frac{\delta Q}{T} \le 0\,. \qquad (1.33)$$

For a reversible system we can run the system backwards, which implies both $\oint \frac{\delta Q}{T} \le 0$ and $\oint \frac{\delta Q}{T} \ge 0$, and hence
$$\oint \frac{\delta Q_{rev}}{T} = 0\,. \qquad (1.34)$$
This concludes the proof.

1.4.7 Entropy

Clausius' theorem allows us to introduce a new state variable. Let us define
$$dS = \frac{\delta Q_{rev}}{T}$$
for an infinitesimal reversible change.

Figure 8: A reversible cycle ACBDA.

From Clausius' theorem it follows that this is an exact differential, because any integral over $dS$ in the space of the state variables is independent of the path. To see this, consider a loop in state-space. From Clausius' theorem
$$\oint \frac{\delta Q_{rev}}{T} = 0 \;\Rightarrow\; \int_{ACB} \frac{\delta Q_{rev}}{T} + \int_{BDA} \frac{\delta Q_{rev}}{T} = 0 \;\Rightarrow\; \int_{ADB} dS = \int_{ACB} dS = S_B - S_A\,. \qquad (1.35)$$

This is true for any loop, so $S_B - S_A$ is independent of the path from A to B. (From another point of view, $1/T$ is an integrating factor that transforms $\delta Q$ into an exact differential.) Hence $S$ is a state variable which we call entropy. It is an extensive variable because heat is extensive.

1.4.8 Irreversible processes

Consider a quasistatic irreversible process from $A \to B$. Let us make a cycle by combining it with a reversible process $B \to R \to A$, see Fig. 9.
From Clausius' theorem we obtain
$$\int_{A\to B} \frac{\delta Q_{irr}}{T} + \int_{B\to R\to A} \frac{\delta Q_{rev}}{T} \le 0 \;\Rightarrow\; \int_{A\to B} \frac{\delta Q_{irr}}{T} \le \int_{A\to R\to B} dS \qquad (1.36)$$

Figure 9: An irreversible cycle ABRA.

which gives
$$S_B - S_A \ge \int_{A\to B} \frac{\delta Q_{irr}}{T} \qquad (1.37)$$

For infinitesimal processes we have
$$dS \ge \frac{\delta Q_{irr}}{T}\,.$$
In an isolated system $\delta Q_{irr} = 0$ and we conclude that $dS \ge 0$.
This means that entropy can only increase in isolated systems. The increase of the entropy is an important observation whose consequences we will explore in the following.

1.4.9 Mathematical statement of the 2nd law

Applying the definition of entropy to the first law, we have for reversible processes that
$$dE = T\,dS + f\,dx \quad\Rightarrow\quad dS = \frac{dE}{T} - \frac{f}{T}\,dx \qquad (1.38)$$
Since this is an equation linking state variables it is true also for irreversible quasistatic changes, and we will call it the mathematical statement of the 2nd law. (The mathematical statement of the first law is $dE = \delta Q + \delta W$.)
From this follow the definitions of the partial derivatives of $S(E, x)$ as
$$\left(\frac{\partial S}{\partial E}\right)_x = \frac{1}{T}\,; \qquad \left(\frac{\partial S}{\partial x}\right)_E = -\frac{f}{T} \qquad (1.39)$$

Example

Find the change in entropy of an ideal gas when it undergoes a reversible isothermal expansion.
$$\Delta S = \int_1^2 \frac{\delta Q}{T} = \frac{1}{T} \int_1^2 \delta Q\,.$$
Now $dE = \delta Q - p\,dV$ and the caloric equation gives the internal energy of an ideal gas as a function of $T$ only, so for an isothermal (constant $T$) expansion $dE = 0 \Rightarrow \delta Q = p\,dV$. Hence, using the ideal gas equation of state $pV = N k_B T$,
$$\Delta S = \frac{1}{T}\int_{V_1}^{V_2} p\,dV = N k_B \int_{V_1}^{V_2} \frac{dV}{V} = N k_B \ln(V_2/V_1)$$
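To put a number on this, consider one mole of gas doubling its volume (our illustrative choice), for which $\Delta S = R \ln 2$ (a minimal sketch):

    import math

    R = 8.314          # gas constant, N*kB for one mole [J/(K mol)]
    V1, V2 = 1.0, 2.0  # doubling the volume; only the ratio matters

    dS = R * math.log(V2 / V1)
    print(f"Delta S = {dS:.2f} J/K per mole")   # ~ 5.76 J/K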

1.4.10 Variational form of the 2nd law

As a consequence of Clausius' theorem we concluded that the entropy in an isolated system can only increase. We also know that an isolated system will approach a thermodynamic equilibrium if one waits long enough. These two facts seem to imply that the
entropy of the equilibrium state should in some sense be maximal. The difficulty with
making this statement more formal is that entropy is a state variable and is only well
defined for equilibrium states.
In order to be able to compare the entropy of the equilibrium state to the entropy of
near-by states it is useful to introduce systems with constraints. These are systems
that are divided into two or more subsystems, for example by dividing walls. These walls
can, for example, be immovable so that the subsystems cannot do work on each other
(and, as we will see, can have different pressures). The walls can also be adiabatic so
that the subsystems cannot exchange heat (and can have different temperatures). The
walls can also be impermeable so that the subsystems cannot exchange particles (and
can have different chemical potentials). The entropy Sc of the system with constraints
is the sum of the entropy of its parts, because of the extensive property of the entropy.
For example, for a division into two subsystems

Sc (E, x) = S1 (E1 , x1 ) + S2 (E2 , x2 ) ,

where E = E1 + E2 and x = x1 + x2 .
Figure 10: Applying an internal constraint: a wall divides the system into volumes $V_1$ and $V_2$.

If we now remove the constraints then the system will adjust to a new equilibrium. The
approach to this new equilibrium in general occurs through irreversible processes and
Clausius theorem tells us that the entropy of the new equilibrium has to be larger or
equal to the entropy before the removal of the constraints, $S \ge S_c$. Note that the total values of the extensive variables $E$ and $x$ do not change during this process, since the system is isolated.
This observation leads to the variational formulation of the 2nd law.
The equilibrium state is the state at which Sc (E, x) has its global maximum. This
is true for any internal constraint.
By equilibrium state we mean here that of the system without constraints.
In non-quasistatic systems we may have difficulty defining the entropy at all times,
however if the system moves from a constrained system to the unconstrained equilibrium
state, we now know it must have higher entropy than previously.
Example: Find the change in entropy of an ideal gas when it undergoes free expansion.
A partition is removed, and the gas expands to a larger volume, with no heat or work from the environment. From the caloric equation $E = \frac{f}{2} N k_B T$ and the fact there is no heat or work done, the temperature remains the same. The beginning and end states of the process are the same as for the reversible isothermal expansion in the previous example. Since entropy is a state variable, we must again have
$$\Delta S = N k_B \ln \frac{V_2}{V_1}$$
Notice that this is positive, since $V_2 > V_1$: free compression is impossible. Also, this entropy gain is definitely greater than the integral of $\delta Q/T$, since the latter is identically zero (and in any case this process is not quasistatic).

1.4.11 Conditions at thermodynamic equilibrium

An important application of this variational statement of the 2nd law is a derivation of conditions at equilibrium.
Consider two isolated bodies, characterised by the variables E1 , x1 and E2 , x2 , which
are brought into contact and are allowed to exchange matter, energy and do work on
each other. The combined system will approach a thermodynamic equilibrium that is
characterised by the extensive variables E, x defined by E = E1 + E2 and x = x1 + x2 .
Let us ask how the temperatures T1 and T2 are related when the total system is at
equilibrium.

Figure 11: Two macroscopic bodies in contact, with energies $E_1, E_2$, entropies $S_1, S_2$ and temperatures $T_1, T_2$.

For this purpose, imagine applying an internal constraint which displaces the system slightly away from equilibrium such that $E_1 \to E_1 + \delta E_1$ and $E_2 \to E_2 + \delta E_2$ with $\delta E_1 + \delta E_2 = 0$, while $x_1, x_2$ do not change. From the variational statement of the 2nd law we conclude that
$$(\delta S)_{E,x} = (S_c - S)_{E,x} \le 0\,,$$
where $S$ is the entropy of the system without constraints. Since $S$ is extensive, $S = S_1 + S_2$,
$$\delta S = \left(\frac{\partial S_1}{\partial E_1}\right)_{x_1} \delta E_1 + \left(\frac{\partial S_2}{\partial E_2}\right)_{x_2} \delta E_2\,,$$
and hence
$$\delta S = \left(\frac{1}{T_1} - \frac{1}{T_2}\right) \delta E_1\,, \qquad (1.40)$$
and since we must have $\delta S \le 0$ both for $\delta E_1 < 0$ and $\delta E_1 > 0$, this requires that
$$T_1 = T_2\,. \qquad (1.41)$$

We hence re-derived the condition of equal temperatures at equilibrium that we obtained earlier from the zeroth law of thermodynamics.
Further conditions can be obtained by considering other types of constraints. For example, consider a constraint that changes the two volumes slightly from their values at equilibrium such that $V_1 \to V_1 + \delta V_1$ and $V_2 \to V_2 + \delta V_2$ with $\delta V_1 + \delta V_2 = 0$, while the other extensive variables do not change. We then conclude that
$$0 \ge \delta S = \left(\frac{\partial S_1}{\partial V_1}\right)_{E_1, x_1'} \delta V_1 + \left(\frac{\partial S_2}{\partial V_2}\right)_{E_2, x_2'} \delta V_2 = \left(\frac{p_1}{T_1} - \frac{p_2}{T_2}\right) \delta V_1\,,$$
where $x'$ denotes the other extensive variables besides $V$. This relation has to hold for $\delta V_1 > 0$ and $\delta V_1 < 0$, and hence the pressures also have to be equal at thermodynamic equilibrium, $p_1 = p_2$.
Finally, consider a constraint that changes the particle numbers slightly from their values at equilibrium such that $N_1 \to N_1 + \delta N_1$ and $N_2 \to N_2 + \delta N_2$ with $\delta N_1 + \delta N_2 = 0$, while the other extensive variables do not change. We then conclude that
$$0 \ge \delta S = \left(\frac{\partial S_1}{\partial N_1}\right)_{E_1, x_1'} \delta N_1 + \left(\frac{\partial S_2}{\partial N_2}\right)_{E_2, x_2'} \delta N_2 = -\left(\frac{\mu_1}{T_1} - \frac{\mu_2}{T_2}\right) \delta N_1\,,$$
where $x'$ here denotes the other extensive variables besides $N$. This relation has to hold for $\delta N_1 > 0$ and $\delta N_1 < 0$, and hence the chemical potentials also have to be equal at equilibrium, $\mu_1 = \mu_2$.

We find that for bodies in thermodynamic equilibrium $T$, $p$ and $\mu$ are equal.

Before we continue, let us use the variational statement to show that heat flows from the hotter to the colder body. For this purpose, let us change our perspective slightly and consider bringing two bodies with $T_1 \ne T_2$ together (i.e. not at equilibrium). Let us assume that they only interact by exchanging heat. They will evolve towards equilibrium with their temperatures eventually equal. The 2nd law tells us that in this case, since we start out of equilibrium and go towards the equilibrium state of maximum entropy,
$$\delta S > 0 \;\Rightarrow\; \delta S_1 + \delta S_2 = \left(\frac{1}{T_1} - \frac{1}{T_2}\right) \delta E_1 > 0$$

For the inequality to be satisfied, if $T_1 > T_2$ then $\delta E_1 < 0$, and if $T_1 < T_2$ then $\delta E_1 > 0$. Therefore

The flow of energy is from the hotter (higher $T$) to the colder (lower $T$) body.

1.5 Auxiliary Functions and Legendre Transforms

We obtained from the 2nd law the relation
$$dE = T\,dS - p\,dV + \mu\,dN\,. \qquad (1.42)$$

This shows that the natural parameters for the energy are the extensive variables $S$, $V$ and $N$ (because changes of $E$ are expressed in terms of changes of $S$, $V$ and $N$). If we know the function $E(S, V, N)$ then the intensive variables can be determined, because
$$T = \left(\frac{\partial E}{\partial S}\right)_{V,N}\,, \qquad p = -\left(\frac{\partial E}{\partial V}\right)_{S,N}\,, \qquad \mu = \left(\frac{\partial E}{\partial N}\right)_{S,V}\,.$$

The general approach that characterises macroscopic systems mostly by their extensive
variables is not always the most convenient one. It is often easier to do experiments
varying a mixture of intensive and extensive variables. For example an experimenter
might want to study a gas as a function of T, N, V or T, N, p.
There is a way to modify our thermodynamic description to take account of such sce-
narios, namely the Legendre transform.
Suppose $f = f(x_1, \ldots, x_n)$ is a function of $x_1, \ldots, x_n$ and hence
$$df = \sum_{i=1}^n u_i\,dx_i \qquad\text{where}\qquad u_i = \left(\frac{\partial f}{\partial x_i}\right)_{x_{j \ne i}}\,. \qquad (1.43)$$

If we define
$$g = f - \sum_{i=1}^r u_i x_i\,,$$

then clearly
$$dg = df - \sum_{i=1}^r (u_i\,dx_i + x_i\,du_i) = \sum_{i=1}^r (-x_i)\,du_i + \sum_{i=r+1}^n u_i\,dx_i\,, \qquad (1.44)$$

and $g = g(u_1, \ldots, u_r, x_{r+1}, \ldots, x_n)$ is a function of $u_1, \ldots, u_r, x_{r+1}, \ldots, x_n$.
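The machinery is easiest to see on a toy example. The following minimal sketch (our choice of $f = x^2 + y^2$, transformed with respect to $x$; the sympy library is assumed available) reproduces the structure of (1.44):

    import sympy as sp

    x, y, u = sp.symbols('x y u')

    f = x**2 + y**2                             # toy function f(x, y)
    u_of_x = sp.diff(f, x)                      # u = (df/dx)_y = 2x
    x_of_u = sp.solve(sp.Eq(u, u_of_x), x)[0]   # invert: x = u/2

    g = (f - u_of_x * x).subs(x, x_of_u)        # g = f - u*x in the variables (u, y)
    print(sp.simplify(g))                       # -> -u**2/4 + y**2

    # consistency with (1.44): the coefficient of du in dg is -x = -u/2
    print(sp.diff(g, u))                        # -> -u/2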


As you can probably imagine, it is much easier to do experiments by controlling $T, x$ rather than $S, x$, as it is much easier to tell what the temperature of a body is than what its entropy is. Therefore the most common Legendre transform is used to make the transformation from $S \to T$.
Let us define $F = E - TS$; $F = F(T, x)$.
$F$ is called the Helmholtz free energy and
$$dF = -S\,dT + f\,dx$$

For a gas, $F = F(T, V, N)$ and
$$dF = -S\,dT - p\,dV + \mu\,dN \qquad (1.45)$$

It is important to note that

1. we can only exchange conjugate pairs

2. there is no more information in the new function (it is just presented in a more
easily usable form)

Another common transform is
$$G = E - TS - (-p)V = F + pV\,.$$

$G = G(T, p, N)$ is called the Gibbs free energy and it has the differential
$$dG = -S\,dT + V\,dp + \mu\,dN\,. \qquad (1.46)$$

For a generic system with extensive variables $x$, the Gibbs free energy is defined via a Legendre transform w.r.t. all extensive variables except $N$.
We have already seen the enthalpy $H = H(S, p, N)$:
$$H = E - (-p)V = E + pV$$

with the differential
$$dH = T\,dS + V\,dp + \mu\,dN\,. \qquad (1.47)$$
These four functions (internal energy, Helmholtz free energy, Gibbs free energy, enthalpy)
are sometimes called auxiliary functions or thermodynamic potentials.

Maxwell Relations
Now we have defined the functions of the groups of thermodynamic variables, we can obtain many relationships between them because of the properties of exact differentials.
If we have an exact differential, $df = a(x, y)\,dx + b(x, y)\,dy$, then
$$\left(\frac{\partial a}{\partial y}\right)_x = \left(\frac{\partial b}{\partial x}\right)_y\,,$$
since $a = (\partial f/\partial x)_y$ and $b = (\partial f/\partial y)_x$ and
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$$

For example, since $dF = -S\,dT - p\,dV + \mu\,dN$, then
$$\left(\frac{\partial S}{\partial V}\right)_{T,N} = \left(\frac{\partial p}{\partial T}\right)_{V,N} \qquad (1.48)$$
$$-\left(\frac{\partial S}{\partial N}\right)_{V,T} = \left(\frac{\partial \mu}{\partial T}\right)_{V,N} \qquad (1.49)$$

These are called Maxwell Relations. There are many of them and they can all be derived from the differentials of the relevant auxiliary function (thermodynamic potential). The four most common ones are those involving $T$, $S$, $p$ and $V$:
$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V$$
$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V$$
$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p$$
$$\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p \qquad (1.50)$$

$N$ is constant for all these derivatives; we do not indicate this explicitly.


Exercise: obtain the Maxwell relations (1.50) from the four thermodynamic potentials.

Note: It is not difficult to remember the relations (1.50). Conjugate variables appear
always in diagonal positions, and there is a minus sign if p and T are on opposite sides
of the equation.

Examples

1. Using the Maxwell relations we can obtain an important relationship between the heat capacities at constant volume and pressure, $C_V$ and $C_p$, respectively. We consider $N$ to be constant throughout this section.
The heat capacity at constant volume is $C_V = \left(\frac{\delta Q}{dT}\right)_V = T\left(\frac{\partial S}{\partial T}\right)_V$.
The heat capacity at constant pressure is $C_p = \left(\frac{\delta Q}{dT}\right)_p = T\left(\frac{\partial S}{\partial T}\right)_p$.
Now, let us express the entropy $S$ as a function of $T, V$:
$$dS = \left(\frac{\partial S}{\partial T}\right)_V dT + \left(\frac{\partial S}{\partial V}\right)_T dV\,.$$
We divide by $dT$ and impose constant $p$ to obtain
$$\left(\frac{\partial S}{\partial T}\right)_p = \left(\frac{\partial S}{\partial T}\right)_V + \left(\frac{\partial S}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p\,.$$

Then using the Maxwell Relation from equation (1.48),
$$\frac{C_p}{T} = \frac{C_V}{T} + \left(\frac{\partial p}{\partial T}\right)_V \left(\frac{\partial V}{\partial T}\right)_p\,. \qquad (1.51)$$

We now use a relation (0.4) that we derived in the first lecture. If we consider one of the variables $p$, $V$ and $T$ as a function of the other two we obtain
$$-1 = \left(\frac{\partial p}{\partial T}\right)_V \left(\frac{\partial T}{\partial V}\right)_p \left(\frac{\partial V}{\partial p}\right)_T\,,$$

and hence
$$\left(\frac{\partial p}{\partial T}\right)_V = -\left(\frac{\partial V}{\partial T}\right)_p \left(\frac{\partial p}{\partial V}\right)_T\,.$$

Using this relation we get
$$C_p - C_V = -T \left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p^2 = \frac{T V \alpha^2}{\kappa_T}\,. \qquad (1.52)$$

This is a famous expression which links the heat capacities to the isothermal compressibility $\kappa_T$ and the coefficient of thermal expansion $\alpha$,
$$\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T\,, \qquad \alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p\,.$$

For an ideal gas, using the equation of state $pV = N k_B T$, we get $C_p - C_V = N k_B$.


 
2. Obtain an expression for $\left(\frac{\partial C_V}{\partial V}\right)_T$ in terms of $p$, $T$.
$$\left(\frac{\partial C_V}{\partial V}\right)_T = T \frac{\partial}{\partial V}\left(\frac{\partial S}{\partial T}\right)_V = T \frac{\partial}{\partial T}\left(\frac{\partial S}{\partial V}\right)_T\,,$$
and using the Maxwell Relation from equation (1.48),
$$\left(\frac{\partial C_V}{\partial V}\right)_T = T \frac{\partial}{\partial T}\left(\frac{\partial p}{\partial T}\right)_V = T \left(\frac{\partial^2 p}{\partial T^2}\right)_V\,. \qquad (1.53)$$
Both results are easy to check symbolically for the ideal gas; a sketch follows below.

1.6 Extensive Functions and the Gibbs-Duhem Equation


Earlier we defined extensive functions which scale linearly with the system size. An example is the internal energy $E(S, x)$, which is extensive, as are $S, x$. This implies
$$E(\lambda S, \lambda x) = \lambda E(S, x)$$

This signifies that $E$ is a first-order homogeneous function of $S$ and $x$. A homogeneous function of order $n$ is one such that $f(\lambda x) = \lambda^n f(x)$.
First-order homogeneous functions have nice properties which we will illustrate below.

Suppose $f(x_1, \ldots, x_n)$ is a 1st-order homogeneous function of $x_1, \ldots, x_n$. Let $y_i = \lambda x_i$, which means that
$$\lambda f(x_1, \ldots, x_n) = f(y_1, \ldots, y_n)\,.$$
Differentiate both sides of this equation with respect to $\lambda$:
$$f(x_1, \ldots, x_n) = \sum_{i=1}^n \frac{\partial f}{\partial y_i} \frac{\partial y_i}{\partial \lambda} = \sum_{i=1}^n \frac{\partial f}{\partial y_i}\,x_i\,.$$

Setting $\lambda = 1$ we obtain the expression
$$f(x_1, \ldots, x_n) = \sum_{i=1}^n x_i \frac{\partial f}{\partial x_i}\,. \qquad (1.54)$$

This is called Euler's theorem for 1st-order homogeneous functions.
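A quick symbolic check of Euler's theorem on a toy first-order homogeneous function of our own choosing (sympy assumed available):

    import sympy as sp

    x1, x2, lam = sp.symbols('x1 x2 lambda', positive=True)

    f = x1**2 / x2 + 3 * x2   # homogeneous of order 1: f(l*x1, l*x2) = l*f(x1, x2)
    assert sp.simplify(f.subs({x1: lam * x1, x2: lam * x2}) - lam * f) == 0

    euler = x1 * sp.diff(f, x1) + x2 * sp.diff(f, x2)
    print(sp.simplify(euler - f))   # -> 0, confirming Euler's theorem (1.54)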


Since $E(S, x)$ is such a function, this implies that
$$E(S, x) = \left(\frac{\partial E}{\partial S}\right)_x S + \left(\frac{\partial E}{\partial x}\right)_S x$$
$$\Rightarrow\quad E = TS + f\,x \qquad (1.55)$$

Let us consider a gas with $x = (N, V)$ and $f = (\mu, -p)$, so we obtain the important relation
$$E = TS - pV + \mu N\,. \qquad (1.56)$$

The differential of (1.56) is
$$dE = T\,dS + S\,dT - p\,dV - V\,dp + \mu\,dN + N\,d\mu\,.$$

On the other hand we know that
$$dE = T\,dS - p\,dV + \mu\,dN\,.$$

These two expressions for $dE$ are only compatible if
$$0 = S\,dT - V\,dp + N\,d\mu\,. \qquad (1.57)$$

This is known as the Gibbs-Duhem equation. It expresses the fact that the three intensive variables $T$, $p$ and $\mu$ are not independent of each other.

Appendix: Not examinable


Quasistatic vs reversible processes

Let us discuss a subtle point concerning quasistatic and reversible processes. The statement of the first law of thermodynamics
$$dE = \delta Q + \delta W$$
is an expression of the conservation of energy and is true for all quasistatic processes. These processes do not have to be reversible.
The statement of
$$dE = T\,dS - p\,dV$$
is a relation between state variables and is also true for all quasistatic processes, which do not necessarily have to be reversible.
On the other hand we know that $\delta Q = T\,dS$ is true in reversible processes, whereas in general quasistatic processes $\delta Q < T\,dS$.
The three previous statements are only compatible if we conclude that $\delta W = -p\,dV$ is true in reversible processes, whereas in general quasistatic processes $\delta W > -p\,dV$. This is indeed reasonable. An example of an irreversible quasistatic process is one where the movement of a piston in a cylinder of gas involves friction. In this case we have $\delta W > -p\,dV$ since work is done not only by moving against the pressure of the gas, but also by moving against the friction.

Lecture Notes - Week 4 (Lectures 10-12)

1.7 Conditions for Equilibrium and Stability


We have seen for systems with internal constraints that the equilibrium is an extremum
of the entropy (i.e. the equilibrium state without constraints). Let us examine this
in more detail by considering an isolated macroscopic body made up of two equal
sub-systems at equilibrium and ask what happens if we perturb it slightly away from
equilibrium. We have already seen that the stationarity of the entropy at equilibrium in
an isolated system implies that the intensive variables of the two subsystems are equal,
$T_1 = T_2$ and $f_1 = f_2$ (we showed this for $p$ and $\mu$, but it holds for other generalised forces
too). Let us now investigate the consequences of the fact that the stationary point is in
fact a maximum.
The two equal subsystems are characterised by their extensive variables E1 = E2 , S1 = S2
and x1 = x2 , and at equilibrium the intensive variables are equal too, f1 = f2 and T1 = T2 .

Figure 12: Macroscopic body made up of 2 sub-systems ($E_1, S_1, x_1, T_1, f_1$ and $E_2, S_2, x_2, T_2, f_2$) separated by a conducting and permeable wall.

We now consider small changes in the extensive variables $E_i$ and $x_i$ while keeping the full system isolated so that the combined $E_1 + E_2$ and $x_1 + x_2$ stay fixed.
We hence apply an internal constraint of the form
$$x_1 \to x_1' = x_1 + \delta x_1\,, \qquad E_1 \to E_1' = E_1 + \delta E_1\,,$$
$$x_2 \to x_2' = x_2 - \delta x_1\,, \qquad E_2 \to E_2' = E_2 - \delta E_1\,.$$

For simplicity we combine the energy and the variables $x$ into a new variable $y = (E, x)$:
$$y_1 \to y_1' = y_1 + \delta y_1\,, \qquad y_2 \to y_2' = y_2 + \delta y_2\,, \qquad \delta y_1 = -\delta y_2\,.$$

Using the extensive property of the entropy we can write the entropy $S_c$ of the system with the given constraints as

$$S_c = S_1(y_1 + \delta y_1) + S_2(y_2 + \delta y_2)$$
$$= S_1(y_1) + S_2(y_2) + \frac{\partial S_1}{\partial y_1}\,\delta y_1 + \frac{\partial S_2}{\partial y_2}\,\delta y_2 + \frac{1}{2}\sum_{i,j} \frac{\partial^2 S_1}{\partial y_{1,i}\,\partial y_{1,j}}\,\delta y_{1,i}\,\delta y_{1,j} + \frac{1}{2}\sum_{i,j} \frac{\partial^2 S_2}{\partial y_{2,i}\,\partial y_{2,j}}\,\delta y_{2,i}\,\delta y_{2,j} + \ldots \qquad (1.58)$$

where we performed a Taylor expansion around the values at equilibrium, and the sums run over the components of the vectors $y_1$ and $y_2$. The linear terms in (1.58) vanish because the entropy is stationary at equilibrium,
$$\frac{\partial S_1}{\partial y_1} = \frac{\partial S_2}{\partial y_2}\,,$$
and $\delta y_1 = -\delta y_2$. The quadratic terms are equal, because we divided the system in two equal halves, and the terms are quadratic in the changes $\delta y_1$ and $\delta y_2$. To simplify notation, we denote the values of the half-systems at equilibrium by
$$E_1 = E_2 = E\,, \qquad x_1 = x_2 = x\,, \qquad y_1 = y_2 = y\,.$$

We now use the fact that the entropy is maximal at equilibrium to conclude that the quadratic terms have to be non-positive. With our new notation this can be written as
$$0 \ge \sum_{i,j} \frac{\partial^2 S}{\partial y_i\,\partial y_j}\,\delta y_i\,\delta y_j = \sum_j \delta\!\left(\frac{\partial S}{\partial y_j}\right)\delta y_j\,, \qquad (1.59)$$

where the variation of $\partial S/\partial y_j$ is defined as
$$\delta\!\left(\frac{\partial S}{\partial y_j}\right) = \sum_i \frac{\partial^2 S}{\partial y_i\,\partial y_j}\,\delta y_i\,.$$

Equation (1.59) is the main result of this section. In order to better understand its meaning we write it in terms of the original variables $E$ and $x$:
$$0 \ge \delta\!\left(\frac{\partial S}{\partial E}\right)_x \delta E + \delta\!\left(\frac{\partial S}{\partial x}\right)_E \delta x = \delta\!\left(\frac{1}{T}\right)\delta E - \delta\!\left(\frac{f}{T}\right)\delta x\,.$$
We can rewrite this equation using
$$\delta E = T\,\delta S + f\,\delta x\,, \qquad \delta\!\left(\frac{1}{T}\right) = -\frac{1}{T^2}\,\delta T\,, \qquad \delta\!\left(\frac{f}{T}\right) = f\,\delta\!\left(\frac{1}{T}\right) + \frac{1}{T}\,\delta f\,,$$
and we finally obtain the general condition for stable equilibrium:

$$\delta T\,\delta S + \delta f\,\delta x \ge 0\,. \qquad (1.60)$$

We derived this relation from the condition that the entropy is maximal at equilibrium.
It is hence also a condition for the equilibrium to be stable because we know that the
entropy cannot decrease in an isolated system.
Let us consider two applications of this formula.

Examples:

The heat capacity of a gas at constant volume is given by
$$C_V = T\left(\frac{\partial S}{\partial T}\right)_V\,.$$
The generalised displacements of a gas typically are $x = (V, N)$. For constant volume processes $\delta V = 0$, and for processes with constant particle number $\delta N = 0$, hence we have $\delta x = 0$. The stability condition then states that $\delta T\,\delta S \ge 0$. This implies that changes of $T$ and $S$ have the same sign, and consequently $C_V \ge 0$.
This is reasonable. If we bring two bodies with different temperatures in contact, then heat will flow from the hotter to the colder body, as we have shown earlier. To approach equilibrium the temperature of the body that receives heat has to increase, and the temperature of the body that rejects heat has to decrease. This requirement agrees with the condition $C_V \ge 0$ (remember that $\delta Q = C_V\,dT$).
We see now the physical meaning of the stability criterion. If it is satisfied then processes induced by deviations from equilibrium tend to take the system back towards equilibrium.

For a gas with a constant number of particles the stability condition is $\delta S\,\delta T - \delta p\,\delta V \ge 0$. For constant temperature processes this implies $-\delta p\,\delta V \ge 0$ and hence
$$\left(\frac{\partial p}{\partial V}\right)_T \le 0\,. \qquad (1.61)$$
This states that for processes at constant temperature a decrease in volume leads to an increase in pressure, as we would expect. You can check that this condition is satisfied for an ideal gas by using the equation of state (a one-line symbolic check follows below).
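For the ideal gas the check takes one line symbolically (a minimal sketch, sympy assumed available):

    import sympy as sp

    N, kB, T, V = sp.symbols('N k_B T V', positive=True)
    p = N * kB * T / V                 # ideal gas equation of state

    print(sp.diff(p, V))               # -> -N*k_B*T/V**2, which is < 0,
                                       # so (dp/dV)_T <= 0 is satisfied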

1.8 Phase Equilibria and Phase Transformations


Interesting applications of the concepts that we have just studied appear in the equilib-
rium between phases of materials such as the transition from a liquid to a gas or from a
solid to a liquid.

1.8.1 Clausius-Clapeyron equation

From observation/experiment we find that for each temperature $T$ there is a unique pressure $p = p_E(T)$ at which a phase transition occurs. We can use this to draw a curve $p_E(T)$ in the $p$-$T$ plane which describes the boundary between the two phases. This is an example of a phase diagram, and the curve $p_E(T)$ is an example of a phase boundary.
For example, see Fig. 13, which is (a cartoon of) that of water and shows three phases. It contains a triple point, denoted by A, at which all three phases can coexist in equilibrium. It occurs at a unique temperature and pressure. (For water $p_A \approx 0.006\,$atm and $T_A = 273.16\,$K by definition. As we discussed, this point defines the Kelvin temperature scale.) There is also a critical point, denoted by C, at which the liquid-gas equilibrium line ends.

Figure 13: Phase diagram of water showing boundaries between the solid, liquid and gas phases, with the triple point A and the critical point C.

(For water $T_C \approx 647\,$K and $p_C \approx 218\,$atm.) At this point the volume difference between liquid and gas phases has approached zero. Beyond this point the very dense gas has become indistinguishable from the liquid.
In the following we will derive an equation that describes the phase boundaries between different phases such as those in Fig. 13.
The condition for the phase transition between two different phases at temperature $T$ and pressure $p$ is that they are at equilibrium; in particular, the chemical potentials of the two phases have to be equal:
$$\mu_1(T, p) = \mu_2(T, p)\,. \qquad (1.62)$$

Equivalently we can also require that the Gibbs free energies per molecule are identical, because $G = E - TS + pV$, and since $E(S, V, N)$ is a 1st order homogeneous function we have further $E = TS - pV + \mu N$, which implies that $G = \mu N$.
From the Gibbs-Duhem equation,
$$N\,d\mu = -S\,dT + V\,dp \quad\Rightarrow\quad d\mu = -s\,dT + v\,dp\,,$$
where $s = S/N$ is the entropy per particle and $v = V/N$ is the volume per particle. Hence
$$\left(\frac{\partial \mu}{\partial T}\right)_p = -s\,; \qquad \left(\frac{\partial \mu}{\partial p}\right)_T = v\,.$$

Along a boundary line between two phases we have $\mu_1 = \mu_2$ and hence $d\mu_1 = d\mu_2$, and this leads to the condition
$$-s_1\,dT + v_1\,dp = -s_2\,dT + v_2\,dp\,.$$

So on the co-existence line
$$\frac{dp}{dT} = \frac{\Delta s}{\Delta v}\,, \qquad (1.63)$$
where $\Delta v = v_1(T, p) - v_2(T, p)$ is the volume difference per molecule between the two phases, and $\Delta s = s_1(T, p) - s_2(T, p)$ is the entropy difference per molecule between the two phases. Equation (1.63) is called the Clausius-Clapeyron equation.

Phase transitions usually require the input of heat (the latent heat) as particles move from one phase to the other. The entropy difference is related to the latent heat $\ell$ per particle by
$$\ell = T\,\Delta s\,,$$
(remember $\delta Q = T\,dS$). For example, at atmospheric pressure the specific (per mass) latent heats of water are $334\,\mathrm{kJ\,kg^{-1}}$ for melting (latent heat of fusion), and $2265\,\mathrm{kJ\,kg^{-1}}$ for boiling (latent heat of vaporization).
Example: Note that one of the many special properties of water is that upon freezing it expands. So upon melting, ice contracts, $\Delta v < 0$, while it gains entropy (absorbs heat), $\Delta s > 0$, which implies from Clausius-Clapeyron that $dp/dT < 0$. Therefore increasing the pressure lowers the freezing temperature: water remains liquid at lower temperatures when under pressure. This is one reason why ice skating works.
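A rough numerical estimate of the slope of the ice-water boundary (a sketch: the latent heat is the value quoted above; the ice density of about 917 kg m$^{-3}$ is an assumed textbook value, not given in these notes):

    T_melt = 273.15          # melting temperature of ice [K]
    l_fus = 334.0e3          # specific latent heat of fusion [J/kg], quoted above
    v_water = 1.0 / 1000.0   # specific volume of liquid water [m^3/kg]
    v_ice = 1.0 / 917.0      # specific volume of ice [m^3/kg] (assumed density)

    dv = v_water - v_ice                 # negative: ice contracts on melting
    dp_dT = l_fus / (T_melt * dv)        # Clausius-Clapeyron, per unit mass

    print(f"dp/dT = {dp_dT / 1e6:.1f} MPa/K")   # about -13.5 MPa/K (~ -130 atm per K)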

1.8.2 Van der Waals gas and the liquid-gas transition

We now give a specific model of the liquid-gas phase transition. The ideal gas law describes gases in the limit of low densities, but for real gases there are deviations. Van der Waals introduced a model as part of his PhD thesis from Leiden University (under the supervision of Pieter Rijke - defended in 1873) that includes two correction terms. The equation of state of the van der Waals model has the form
$$\left(p + a\frac{N^2}{V^2}\right)(V - bN) = N k_B T\,.$$
The correction of the volume term takes account of the fact that molecules occupy some volume, and this diminishes the volume that is available to the other molecules. The pressure term is corrected due to existing attractive forces between molecules.

Figure 14: Isotherms of the Van der Waals gas for $a, b > 0$, shown for $t = 16/9 > t_c$, $t = 8/9 = t_c$ and $t = 7/9 < t_c$, in the scaled variables $b^2 p/a \equiv p/27$, $v \equiv V/(3Nb)$ and $t \equiv 3 b k_B T/a$.

A discussion of the isotherms of the van der Waals gas, using suitably scaled parameters, was included in problem 10 of the first homework sheet.
There is a critical temperature $T_c$ above which the isotherms look similar to those of an ideal gas. They decrease monotonically and for every pressure $p$ there is a unique volume $V$. However, isotherms for temperatures below $T_c$ develop a hump and show non-monotonic behaviour with a minimum and a maximum. Note also that there is a critical pressure $p_c$ at which the critical isotherm $T_c$ has a point of inflection, and the maxima of the isotherms for $T < T_c$ lie below $p_c$.
Recall that thermodynamic stability requires for a gas that
$$\left(\frac{\partial p}{\partial V}\right)_T \le 0\,.$$

Obviously the non-monotonic curves have a region that is thermodynamically unstable between their maximum and minimum. So what is going on here?
Let us consider a temperature $T < T_c$ where the isotherm has a maximum $p_{max} < p_c$ and a minimum $p_{min} > 0$, see Fig. 15.

1. For $p \ge p_{max}$ there is only one volume $V$ for each value of $p$. The volume value is small (high density at fixed $N$) and the curve has a large slope (low compressibility), properties that are usually associated with a liquid. It is tempting to identify this region with a liquid phase.

2. For $p \le p_{min}$ there is again only one volume value of $V$ for each value of $p$. The volume value is high (low density at fixed $N$) and the curve has a small slope (high compressibility), properties that are usually associated with a gas. It is tempting to identify this region with a gas phase.

3. For $p_{min} < p < p_{max}$, there are three values of $V$ for each value of $p$ (third order polynomial in $V$). Two of these values lie on a stable part of the $p$-$V$ curve and one value on an unstable part. One of the stable solutions has low density (gas) while the other has high density (liquid). Of the two stable possibilities, which, if any, is the right one?

Figure 15: An isotherm of the Van der Waals gas at $T < T_c$, shown as (a) a $p$-$V$ curve and (b) a $V$-$p$ curve.

To continue we make a basic assumption that is in agreement with our daily experience of the liquid-gas transition. We assume that for each temperature there exists a unique pressure $p_E(T)$ at which the transition between liquid state and gas state occurs. If $p > p_E(T)$ then the system is in the liquid phase, and the low volume solution is the correct one. If $p < p_E(T)$ then the system is in the gas phase, and the high volume solution is the correct one. At $p = p_E(T)$ both phases can coexist: they are in thermal equilibrium. At $p = p_E(T)$ we replace the van der Waals curve by a horizontal line (the line connecting 1 and 2 in Fig. 15). The phase transition occurs along this line as more and more molecules go from the liquid phase to the gas phase and the volume of the system increases.
How does one find the pressure $p_E(T)$ at which the transition occurs? We already mentioned that both phases have to be at thermodynamic equilibrium at the phase transition, thus
$$\mu_1(T, p_E) = \mu_2(T, p_E)\,. \qquad (1.64)$$
To calculate differences in the chemical potential we can either use the differential form for $G = \mu N$ (see the last section) or the Gibbs-Duhem relation. Both give
$$d\mu = -\frac{S}{N}\,dT + \frac{V}{N}\,dp\,.$$
At constant temperature $dT = 0$, and we find that the difference of the chemical potentials between two states is given by
$$\Delta\mu = \frac{1}{N}\int V\,dp\,. \qquad (1.65)$$
Now we can answer the question at which pressure the transition between liquid and gas phases occurs. Condition (1.64) requires that the chemical potentials are equal at the transition, and we can calculate the difference between the chemical potentials of the liquid and gas phases by evaluating the integral (1.65) between points 1 and 2 in Fig. 15. We hence obtain the condition
$$\mu_2 - \mu_1 = \frac{1}{N}\int_{1\to 2} V\,dp = 0\,.$$
We can evaluate the integral by splitting it into four sections (see Figure 15):
$$\mu_2 - \mu_1 = \underbrace{\int_{p_E}^{p_{min}} dp\,v(p, T) + \int_{p_{min}}^{p_E} dp\,v(p, T)}_{\text{Area 1}} + \underbrace{\int_{p_E}^{p_{max}} dp\,v(p, T) + \int_{p_{max}}^{p_E} dp\,v(p, T)}_{\text{Area 2}}$$

So the chemical potentials are equal, $\mu_1 = \mu_2$, if the two areas in the diagram (Figure 15) are equal. This technique is called the equal areas or Maxwell construction. The unique $p_E$ that satisfies this equal area constraint will give us the required pressure at which the phase transition occurs.
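The equal-area condition is straightforward to implement numerically. The following is a minimal sketch (scipy assumed available) in the common reduced units $p/p_c$, $v/v_c$, $t = T/T_c$ (note these differ from the scaling used in Fig. 14), in which the isotherm reads $p = 8t/(3v - 1) - 3/v^2$:

    from scipy.integrate import quad
    from scipy.optimize import brentq

    def p_vdw(v, t):
        """Reduced van der Waals isotherm p = 8t/(3v-1) - 3/v^2 (units of p_c, v_c, T_c)."""
        return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

    def maxwell_pressure(t):
        """Coexistence pressure p_E(t) from the equal-area rule, for t somewhat below 1."""
        dp = lambda v: -24.0 * t / (3.0 * v - 1.0)**2 + 6.0 / v**3   # dp/dv
        v_min = brentq(dp, 0.34, 1.0)        # local minimum of the isotherm
        v_max = brentq(dp, 1.0, 50.0)        # local maximum of the isotherm
        p_lo, p_hi = p_vdw(v_min, t), p_vdw(v_max, t)

        def mismatch(pE):
            # outermost intersections of the isotherm with the horizontal line p = pE
            v1 = brentq(lambda v: p_vdw(v, t) - pE, 1.0 / 3.0 + 1e-9, v_min)  # liquid
            v2 = brentq(lambda v: p_vdw(v, t) - pE, v_max, 1.0e4)             # gas
            area, _ = quad(p_vdw, v1, v2, args=(t,))
            return area - pE * (v2 - v1)     # zero when Area 1 = Area 2

        return brentq(mismatch, 1.001 * p_lo, 0.999 * p_hi)

    print(f"p_E(t = 0.9) = {maxwell_pressure(0.9):.3f}")   # roughly 0.65 p_c

The design choice here is to solve the equivalent condition $\int_{v_1}^{v_2} (p(v) - p_E)\,dv = 0$, obtained from (1.65) by integration by parts, since integrating along $v$ avoids dealing with the multivalued function $v(p)$.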
We argued that the system is in the liquid phase if $p > p_E$. Then Area 1 > Area 2 and $\mu_2 > \mu_1$. The system is in the gas phase if $p < p_E$. Then Area 1 < Area 2 and $\mu_1 > \mu_2$. This can be understood in the following way. It can be shown that the principle of maximal entropy, that holds in isolated systems, implies that the Gibbs free energy is minimal in systems that are kept at constant temperature and pressure (we won't show this in this course). Hence the system goes into the phase of smaller chemical potential.
We close this discussion with a word of warning - something funny is going on at pE
because we have been integrating along a thermodynamically unstable curve. This in-
volves questions about what are called metastable states and nucleation theory which
is still one of the big open questions in statistical mechanics today.

Figure 16: Isotherm and coexistence curves of the Van der Waals gas.

1.9 The Third Law of Thermodynamics

The third law of thermodynamics describes the behaviour of thermodynamic systems near absolute zero. The Nernst statement of the 3rd law:

The entropy of a system approaches a universal constant at absolute zero that is independent of all system parameters and can be taken to be zero:
$$\lim_{T \to 0} S(T, x) = 0\,, \qquad \forall x$$

Unlike the other laws, the third law is not believed to be universally valid for all substances, though we will assume it unless otherwise stated.

1.9.1 Implications

1. We have
$$S(T, x) - S(0, x) = \int_0^T \frac{C_x(T')}{T'}\,dT'\,,$$
where $C_x$ is the heat capacity at constant $x$. (You can consider also other processes where other parameters are kept fixed). The third law tells us that the entropy stays finite at $T = 0$. Hence we conclude that $C_x(T) \to 0$ as $T \to 0$ at least as fast as a positive power of $T$, because otherwise the integral would diverge as $T' \to 0$.
Note that this means that the ideal gas does not satisfy the third law.

2. Further relations describing properties as $T \to 0$ follow from
$$\lim_{T \to 0} \left(\frac{\partial S}{\partial x}\right)_T = 0\,,$$
plus possibly using Maxwell's relations.

3. It is not possible to reach absolute zero using a finite sequence of steps.

2 Statistical Mechanics
Q: What is equilibrium statistical mechanics ?
A: It is a probabilistic approach to derive the equilibrium macroscopic properties of
systems based on the mechanics of their constituents. It is useful for systems with a
large number of microscopic degrees of freedom.
Historically, its foundations were motivated in the mid 1800s after the basic laws of
thermodynamics had been put on a sound and secure basis. Initially it was driven by
kinetic theory of gases which was based on the then controversial molecular (atomistic)
view of nature. This culminated in the monumental work of Boltzmann in the 1870s
when he produced his celebrated H-theorem making a direct link between entropy (a
macroscopic thermodynamic quantity) and (microscopic) molecular dynamics. However
there was still plenty of resistance to these ideas, mainly centred on the approximations by which macroscopic irreversibility was obtained from microscopic dynamics that was reversible. Many of these shortcomings were circumvented by the emergence of the ensemble theory, culminating in the work of Gibbs around 1900. In ensemble theory one considers an ensemble, or many copies of the system, from which, based on the mechanics of their components, statistical statements can be made about macroscopic thermodynamic quantities which become more and more precise the larger the system is. This is essentially still the way in which statistical mechanics is done today.
While the ensemble approach to equilibrium statistical mechanics (at least for systems
with short range interactions) is generally agreed to be on a very sound basis, the foun-
dations of non-equilibrium statistical mechanics are still not very well understood and
remain a still much studied area at the boundary of physics/chemistry/mathematics.

2.0 Preliminaries
Before we proceed let us make sure that we are all on the same page regarding the
microscopic dynamics. We consider first systems whose microscopic dynamics is classical,
and we start by briefly reviewing some parts of classical mechanics that will be relevant
in the following. A more detailed description can be found in any number of classical
mechanics text books, for example

Classical Mechanics by R. Douglas Gregory (Cambridge University Press, 2006).

Classical Mechanics by H. Goldstein, C.P. Poole and J.L. Safko (Pearson, 2001).

Hamiltonian mechanics

Hamiltonian mechanics is a formulation of classical mechanics that is alternative to


Newtonian mechanics. It provides a deeper understanding of the structure of classical mechanics² and is another, more powerful, way of implementing the same physical
principles. It also makes the link between classical and quantum mechanics much more
explicit. It is the natural framework in which to study microscopic degrees of freedom
required for statistical mechanics.
2
Its symplectic structure (see Mathematical Methods of Classical Mechanics by V.I. Arnold).

Let us start by considering Newton's equation of motion for a particle in 3 dimensions,
$$m\,\ddot{\mathbf{q}} = \mathbf{F}(\mathbf{q})\,, \qquad (2.0.1)$$
where $m$ is the mass of the particle and $\mathbf{F}$ is the force. Written component-wise,
$$m\,\ddot q_1 = F_1(q_1, q_2, q_3)\,,$$
$$m\,\ddot q_2 = F_2(q_1, q_2, q_3)\,,$$
$$m\,\ddot q_3 = F_3(q_1, q_2, q_3)\,. \qquad (2.0.2)$$
This is a system of coupled second order ODEs and we need six initial conditions to specify its solution uniquely, for example by providing the components of $\mathbf{q}(t_0)$ and $\dot{\mathbf{q}}(t_0)$ at some time $t_0$.
In cases where the force is conservative (this is the only case we consider) it can be derived from a potential function $V(\mathbf{q})$ that has the interpretation of a potential energy:
$$\mathbf{F}(\mathbf{q}) = -\nabla V(\mathbf{q}) \quad\Rightarrow\quad m\,\ddot{\mathbf{q}} = -\nabla V(\mathbf{q})\,. \qquad (2.0.3)$$

In a first step we transform the system of three 2nd order equations into a system of six first order equations. This is done by introducing the momentum of a particle,
$$\mathbf{p} = m\,\dot{\mathbf{q}}\,, \qquad (2.0.4)$$
in terms of which
$$\dot{\mathbf{q}} = \frac{\mathbf{p}}{m}\,, \qquad \dot{\mathbf{p}} = -\nabla V(\mathbf{q})\,. \qquad (2.0.5)$$
The solution is specified uniquely by providing $\mathbf{q}(t_0)$ and $\mathbf{p}(t_0)$ at some time $t_0$.
Equations (2.0.5) can be rewritten in terms of the kinetic energy
$$T(\mathbf{p}) = \frac{m}{2}\sum_{i=1}^3 \dot q_i^2 = \frac{1}{2m}\sum_{i=1}^3 p_i^2 \quad\Rightarrow\quad \frac{\partial T}{\partial p_j} = \frac{p_j}{m} = \dot q_j\,.$$

In the final step, we introduce the total energy
$$H(\mathbf{q}, \mathbf{p}) = T(\mathbf{p}) + V(\mathbf{q}) = \sum_{i=1}^3 \frac{p_i^2}{2m} + V(\mathbf{q})\,, \qquad (2.0.6)$$
in terms of which we obtain Hamilton's equations of motion:
$$\dot q_j = \frac{\partial H}{\partial p_j}(\mathbf{q}, \mathbf{p})\,, \qquad j = 1, 2, 3,$$
$$\dot p_j = -\frac{\partial H}{\partial q_j}(\mathbf{q}, \mathbf{p})\,, \qquad j = 1, 2, 3. \qquad (2.0.7)$$
The function $H(\mathbf{q}, \mathbf{p})$ is Hamilton's function, or the Hamiltonian, and its value is equal to the total energy $E$ of the particle.

As an example, consider the Hamiltonian of a harmonic oscillator in one dimension. It is the sum of kinetic and potential energy and has the form
$$H(q, p) = \frac{p^2}{2m} + \frac{m}{2}\,\omega^2 q^2\,. \qquad (2.0.8)$$
The equations of motion are
$$\dot q = \frac{\partial H}{\partial p} = \frac{p}{m}\,, \qquad \dot p = -\frac{\partial H}{\partial q} = -m\,\omega^2 q \quad\Rightarrow\quad \ddot q = -\omega^2 q\,, \qquad (2.0.9)$$
and the solutions are given by
$$q = A\cos(\omega t + \phi)\,, \qquad p = -m\,\omega A\sin(\omega t + \phi)\,, \qquad (2.0.10)$$
where $A$ and $\phi$ are determined by initial conditions. One can illustrate the solutions in phase space. This is the space of the $q$ and $p$ variables. In phase space the solutions form ellipses. This can be seen from $H(q, p) = E$ and (2.0.8).
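One can verify this phase-space picture numerically. The sketch below (the parameter values are our choices) integrates the equations of motion (2.0.9) with the symplectic Euler method, a standard integrator for Hamiltonian flows, and checks that the trajectory stays close to the ellipse $H(q, p) = E$:

    m, omega = 1.0, 2.0        # mass and angular frequency (our choices)
    q, p = 1.0, 0.0            # initial condition
    dt, n_steps = 1.0e-3, 10000

    def H(q, p):
        """Total energy of the oscillator, equation (2.0.8)."""
        return p * p / (2 * m) + 0.5 * m * omega**2 * q * q

    E0 = H(q, p)
    for _ in range(n_steps):
        # symplectic Euler: update p using the old q, then q using the new p
        p -= m * omega**2 * q * dt   # dp/dt = -dH/dq = -m*omega^2*q
        q += (p / m) * dt            # dq/dt = +dH/dp = p/m

    # the trajectory stays (very nearly) on the ellipse H(q, p) = E
    print(f"relative energy drift after {n_steps} steps: {abs(H(q, p) - E0) / E0:.1e}")

Symplectic methods are preferred here because they respect the volume-preserving character of Hamiltonian flow, in keeping with the Liouville theorem mentioned in the remarks below.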
Hamilton's equations of motion can be generalised to $N$ particles. Then one has $N$ coordinates $\mathbf{q}_j$ and $N$ momenta $\mathbf{p}_j$, altogether $6N$ parameters if the particles move in three dimensions. In Cartesian coordinates, the kinetic energy typically has the form
$$T(\mathbf{p}_1, \ldots, \mathbf{p}_N) = \sum_{j=1}^N \frac{1}{2m_j}\,\mathbf{p}_j \cdot \mathbf{p}_j\,, \qquad (2.0.11)$$
but in general it can depend also on the coordinates $\mathbf{q}_j$ (for example, in the presence of a magnetic field or in non-Cartesian coordinates). A typical form of the potential is
$$V(\mathbf{q}_1, \ldots, \mathbf{q}_N) = \sum_{i<j} U_{int}(\mathbf{q}_i - \mathbf{q}_j) + \sum_i U_{ext}(\mathbf{q}_i)\,, \qquad (2.0.12)$$
where $U_{int}$ describes interactions between particles, and $U_{ext}$ the potential of external fields.
It is often convenient to combine all coordinates into one vector $q$ with $3N$ components and all momenta into a vector $p$ with $3N$ components. In this compact notation Hamilton's equations of motion are
$$\dot q_j = \frac{\partial H}{\partial p_j}(q, p)\,, \qquad \dot p_j = -\frac{\partial H}{\partial q_j}(q, p)\,, \qquad j = 1, \ldots, 3N, \qquad (2.0.13)$$
where $H = T + V$.

Remarks:

The $q_j$'s and $p_j$'s are called conjugate variables (different from the conjugate extensive-intensive pairs in Thermodynamics).

The q and p form a 6N -dimensional phase space. The motion of the whole system
can be imagined by a trajectory in this high-dimensional phase space.

Hamilton's equations of motion are first order. This implies that one point in phase space uniquely determines the trajectory passing through it $\Rightarrow$ trajectories in phase space do not intersect.

The classical flow in phase space is volume preserving (this will be shown later, see
Liouville theorem).

Hamilton's formalism is convenient for the transition to quantum mechanics.

The Hamilton function H(q, p) is the total energy of the system and contains all
the information about the dynamics of a system. We assume that it does not
depend explicitly on time.

Energy conservation follows from the equations of motion:
$$\frac{d}{dt} H(q(t), p(t)) = \sum_{j=1}^{3N} \left(\frac{\partial H}{\partial q_j}\,\dot q_j + \frac{\partial H}{\partial p_j}\,\dot p_j\right) = \sum_{j=1}^{3N} \left(-\dot p_j\,\dot q_j + \dot q_j\,\dot p_j\right) = 0\,. \qquad (2.0.14)$$

The time evolution of any function on phase space is given by
$$\frac{d}{dt} A(q(t), p(t)) = \sum_{j=1}^{3N} \left(\frac{\partial A}{\partial q_j}\,\dot q_j + \frac{\partial A}{\partial p_j}\,\dot p_j\right) = \sum_{j=1}^{3N} \left(\frac{\partial A}{\partial q_j}\frac{\partial H}{\partial p_j} - \frac{\partial A}{\partial p_j}\frac{\partial H}{\partial q_j}\right) =: \{A, H\}\,, \qquad (2.0.15)$$
where the last equality defines the Poisson bracket $\{A, H\}$.

A large class of coordinate transformations (canonical transformations, not discussed in this unit) leaves the form of Hamilton's equations in (2.0.13) invariant. This is one of the main advantages of Hamilton's formalism. $q$ and $p$ in (2.0.13) in general refer to generalised coordinates and momenta that can be different from what we physically associate with coordinates and momenta.

Example: two-dimensional motion in a central field $V(\mathbf{r}) = V(|\mathbf{r}|) = V(r)$. Let us compare Newton's and Hamilton's approach. This problem is conveniently described in polar coordinates
$$x = r\cos\phi\,, \qquad y = r\sin\phi\,. \qquad (2.0.16)$$

The gradient in polar coordinates has the form
$$\nabla = \hat{\mathbf{r}}\,\frac{\partial}{\partial r} + \hat{\boldsymbol\phi}\,\frac{1}{r}\frac{\partial}{\partial \phi}\,, \qquad\text{where}\qquad \hat{\mathbf{r}} = \begin{pmatrix}\cos\phi\\ \sin\phi\end{pmatrix} \quad\text{and}\quad \hat{\boldsymbol\phi} = \begin{pmatrix}-\sin\phi\\ \cos\phi\end{pmatrix}\,. \qquad (2.0.17)$$

Newton's equations of motion are
$$m\,\ddot{\mathbf{r}} = \mathbf{F}(\mathbf{r}) = -\nabla V(r) = -\hat{\mathbf{r}}\,\frac{\partial}{\partial r} V(r) =: f(r)\,\hat{\mathbf{r}}\,. \qquad (2.0.18)$$

To continue we have to differentiate $\mathbf{r} = r\,\hat{\mathbf{r}}$. This can be done using the relations
$$\dot{\hat{\mathbf{r}}} = \dot\phi\,\hat{\boldsymbol\phi} \qquad\text{and}\qquad \dot{\hat{\boldsymbol\phi}} = -\dot\phi\,\hat{\mathbf{r}} \qquad (2.0.19)$$
that are obtained from (2.0.17). Inserting into (2.0.18) yields
$$m\,\ddot{\mathbf{r}} = m\,\hat{\mathbf{r}}\,(\ddot r - r\dot\phi^2) + m\,\hat{\boldsymbol\phi}\,(2\dot r\dot\phi + r\ddot\phi) = f(r)\,\hat{\mathbf{r}}\,, \qquad (2.0.20)$$
and hence
$$m(\ddot r - r\dot\phi^2) = f(r)\,, \qquad\text{and}\qquad 2m\dot r\dot\phi + mr\ddot\phi = \frac{1}{r}\frac{d}{dt}\left(mr^2\dot\phi\right) = 0\,. \qquad (2.0.21)$$
The second of these two equations expresses the conservation of angular momentum.
If one performs the transformation to polar coordinates in Hamilton's formalism then one obtains the conjugate momenta $p_r = m\dot r$ and $p_\phi = mr^2\dot\phi$ (how to obtain the conjugate momenta is not examinable - see below). Note that the conjugate momentum of the angular coordinate is the angular momentum. One then proceeds by expressing kinetic and potential energy in polar coordinates:

$$H(r, \phi, p_r, p_\phi) = \frac{p_r^2}{2m} + \frac{p_\phi^2}{2mr^2} + V(r)\,. \qquad (2.0.22)$$
Hamilton's equations of motion are then
$$\dot r = \frac{\partial H}{\partial p_r} = \frac{p_r}{m}\,, \qquad \dot\phi = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{mr^2}\,,$$
$$\dot p_r = -\frac{\partial H}{\partial r} = f(r) + \frac{p_\phi^2}{mr^3}\,, \qquad \dot p_\phi = -\frac{\partial H}{\partial \phi} = 0\,. \qquad (2.0.23)$$

These equations are equivalent to (2.0.21).

Appendix: Not examinable

Background to Hamiltonian mechanics

The idea is that free particles move in straight lines in space-time, corresponding to minimal length; can curved particle trajectories due to external forces be described as minimising something more complicated than length? The answer (you guessed) is yes.
In Lagrangian mechanics introduce arbitrary coordinates $q_i$ describing the state of the system (for one particle moving in 3 dimensions, $i = 1..3$). Fix the initial and final states of the system, $q^{(i)}, q^{(f)}$, then find a stationary point of the action, defined by
$$S = \int_{t_i}^{t_f} L(q, \dot q, t)\,dt$$
where the Lagrangian function $L$ is chosen so as to give the desired dynamics, the dot indicates a time derivative, and the stationary point is calculated over the space of all functions $q(t)$ satisfying the initial and final conditions.

Suppose we vary the trajectory by adding a small perturbation $\delta q$ vanishing at the endpoints. Then the new action will be
$$S + \delta S = \int_{t_i}^{t_f} L(q + \delta q, \dot q + \delta\dot q, t)\,dt$$
$$\delta S = \int_{t_i}^{t_f} \sum_i \left(\frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial \dot q_i}\,\delta\dot q_i\right) dt$$

Integrating the second term by parts, and noting that the boundary terms vanish by assumption, we find
$$\delta S = \int_{t_i}^{t_f} \sum_i \delta q_i \left(\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_i}\right)$$

This will vanish for any $\delta q_i$ if
$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} = 0$$
These equations (one for every $i$) are known as the Euler-Lagrange equations.
Example: Suppose that $q = (x, y, z)$ and
$$L(q, \dot q, t) = \frac{m}{2}(\dot x^2 + \dot y^2 + \dot z^2) - V(x, y, z)$$
where $m$ is a constant and $V$ is an arbitrary function. Then Lagrange's equations read
$$-\frac{\partial V}{\partial x} - \frac{d}{dt}(m\dot x) = 0$$
$$-\frac{\partial V}{\partial y} - \frac{d}{dt}(m\dot y) = 0$$
$$-\frac{\partial V}{\partial z} - \frac{d}{dt}(m\dot z) = 0$$
which is equivalent to Newton's equations with a potential energy function $V(x, y, z)$. Note that the Lagrangian function is kinetic minus potential energy.
The statement that the action is stationary under small perturbations makes no mention of any coordinates; therefore, kinetic minus potential energy in any coordinates gives the correct equations of motion.
Example: Kepler problem in two dimensions. We use polar coordinates:
$$x = r\cos\phi$$
$$y = r\sin\phi$$
and so
$$\dot x = \dot r\cos\phi - r\dot\phi\sin\phi$$
$$\dot y = \dot r\sin\phi + r\dot\phi\cos\phi$$
hence
$$L = \frac{m}{2}(\dot r^2 + r^2\dot\phi^2) + \frac{mM}{r}$$
(we have chosen units to set Newton's gravitational constant $G = 1$). Lagrange's equations are
$$mr\dot\phi^2 - \frac{mM}{r^2} - \frac{d}{dt}(m\dot r) = 0$$
$$\frac{d}{dt}(mr^2\dot\phi) = 0$$
The second equation states that a certain quantity, the angular momentum, is conserved.
We note that in general, if $L$ does not depend on $q_i$, $\partial L/\partial \dot q_i$ is conserved. This allows us to reduce the number of equations to be solved, a very important principle for simplifying the solution to dynamical problems. We give this quantity a new symbol:
$$p_i = \partial L/\partial \dot q_i$$
and call it the canonical momentum conjugate to $q_i$. Thus Lagrange's equations can be written
$$\dot p_i = \frac{\partial L}{\partial q_i}$$

Note that the canonical momentum is equal to the usual (mechanical) momentum in Cartesian coordinates, but not otherwise. It also differs from the mechanical momentum in the case of a magnetic field.
We can further capitalise on the canonical momentum by using it instead of the velocity $\dot q_i$, as follows: solve to obtain $\dot q_i = u_i(q_i, p_i)$ and construct another function, called the Hamiltonian, using a Legendre transform
$$H(q_i, p_i, t) = \sum_i p_i\,u_i(q_i, p_i) - L(q_i, u_i(q_i, p_i), t)$$

then we have Hamilton's equations,
$$\partial H/\partial q_i = -\partial L/\partial q_i = -\dot p_i$$
$$\partial H/\partial p_i = \dot q_i$$
$$dH/dt = -\partial L/\partial t$$
thus the Hamiltonian is conserved if $L$ is independent of $t$. Recall that $L$ is kinetic minus potential energy; the Hamiltonian is usually kinetic plus potential energy, i.e. total energy.
Example: the Hamiltonian of the 2D Kepler problem. We find
$$p_r = \frac{\partial L}{\partial \dot r} = m\dot r$$
$$p_\phi = \frac{\partial L}{\partial \dot\phi} = mr^2\dot\phi$$
which we invert (trivially)
$$\dot r = p_r/m$$
$$\dot\phi = p_\phi/(mr^2)$$
and write
$$H = \sum_i p_i\,\dot q_i - L = p_r^2/m + p_\phi^2/(mr^2) - \frac{1}{2m}\left(p_r^2 + p_\phi^2/r^2\right) - \frac{mM}{r} = \frac{1}{2m}\left(p_r^2 + p_\phi^2/r^2\right) - \frac{mM}{r}$$

Lecture Notes - Week 5 (Lectures 13-15)

2.1 Ensembles
2.1.1 Macroscopic and microscopic states

The laws of thermodynamics are a powerful phenomenological description of equilibrium


properties of macroscopic bodies. However, they have been obtained empirically, and
they have parameters that are not derived from a microscopic theory but whose values
can only be set by comparison with experiment. Statistical Mechanics attempts to
address this shortcoming.
We can describe a macroscopic system in equilibrium by a macrostate $M$ that is characterised by a small number of thermodynamic parameters, for example $E, V, N$ for a gas. A macroscopic body is made up of many, $N \sim 10^{23}$, particles, which means that the consideration of such a system at a microscopic level implies necessarily that there are a huge number of different ways of realising a given macrostate microscopically. Each of these may be called a microstate of the system. For example, for a gas a microstate can be specified by the set of positions and momenta of its molecules $\{(\mathbf{q}_i, \mathbf{p}_i),\ i = 1, \ldots, N\}$.
As found by the early kinetic theorists such as Maxwell and Boltzmann, a dynamical description of such microstates is far too complicated. It is more productive to consider very many copies of the system (ensembles of microstates) which correspond to the same macrostate. The ensemble can be defined by specifying a probability density $p_M(\mu)$ of finding a system in a microstate $\mu$ given a macrostate $M$. This will enable us to determine what fraction of the ensemble would be in that microstate. The probability density can then be used, by averaging over the ensemble, to calculate macroscopic properties which can be tested against thermodynamics and experiment.

2.1.2 Phase space

The microstate of a system with $N$ microscopic particles can be characterised by generalized coordinates $\mathbf{q}_i$ and generalized momenta $\mathbf{p}_i$, where $i = 1, \ldots, N$. It can be represented by a point in phase space, $\mu = (q, p) = (\mathbf{q}_1, \ldots, \mathbf{q}_N, \mathbf{p}_1, \ldots, \mathbf{p}_N)$. For particles moving in three dimensions phase space has $6N$ dimensions. Each representative point in phase space is a possible microstate of the system.
The microscopic dynamics of the system is governed by Hamilton's equations of motion
$$\frac{\partial q_i}{\partial t} = \frac{\partial H(\mu)}{\partial p_i}\,, \qquad \frac{\partial p_i}{\partial t} = -\frac{\partial H(\mu)}{\partial q_i}\,, \qquad i = 1, \ldots, 3N\,, \qquad (2.1.1)$$
which conserve total energy so that $H(\mu(t)) = E$.
For a system in a macrostate $M$ one defines an ensemble of microsystems by specifying the probability density $p_M(\mu)$. When multiplied by a volume element in phase space $d\Gamma = \prod_{i=1}^{3N} dq_i\,dp_i$, it specifies the probability $p_M(\mu)\,d\Gamma$ that the state of a microsystem
is in a volume element $d\Gamma$ around the point $\mu$. That is, if one considers $\mathcal{N}$ copies of the microsystem, then the fraction of systems whose state is in a volume element $d\Gamma$ around $\mu$ is given in the limit $\mathcal{N} \to \infty$ by
$$p_M(\mu)\,d\Gamma = \lim_{\mathcal{N}\to\infty} \frac{d\mathcal{N}(\mu)}{\mathcal{N}}\,, \qquad (2.1.2)$$
where $d\mathcal{N}(\mu)$ is the number of microsystems, out of the $\mathcal{N}$ systems, whose state is in $d\Gamma$ around $\mu$.
Often we would like to be able to count classical states in phase space. As we will see later, in quantum mechanics microstates form a discrete set and are countable. In order to count classical microstates as well, one can introduce a discretisation of classical phase space. One divides phase space into small cells and considers all phase space points in a cell to represent the same classical state. The size of the cells is specified by $\Delta q_i$ and $\Delta p_i$ where
$$\Delta q_i\,\Delta p_i = h_0\,, \qquad i = 1, \ldots, 3N\,. \qquad (2.1.3)$$
This is reminiscent of Heisenberg's uncertainty relation in quantum mechanics and suggests that choosing $h_0$ of the order of Planck's constant $h = 6.626 \times 10^{-34}\,$Js might be a reasonable choice. We will see that quantum-classical correspondence leads to $h_0 = h$. From a purely classical point of view the macroscopic results that we derive from our theory will not depend on the exact value of $h_0$. We will usually assume the classical-quantum correspondence and just write $h$ instead of $h_0$.

2.1.3 Ergodic hypothesis

This was first introduced by Boltzmann in 1871. A system is ergodic if a typical trajectory
in phase space explores all the available phase space uniformly. By available phase space
we mean the region in phase space that is compatible with a given macrostate, specified
for example by energy E and generalised displacements x, and typical means that any
exceptions are of measure zero. If a system is ergodic then averages of phase space
functions along a trajectory over time are equivalent, as time goes to infinity, to averages
over the available region in phase space.
Since it is expected that an experimental measurement of a particular physical property
at equilibrium would measure the long time average of that physical property, then the
ergodic hypothesis implies that the ensemble average of any physical quantity is identical
to the value one would expect to obtain if one made a measurement of that quantity
for any given system at equilibrium. It is often assumed that thermodynamic systems
are ergodic (or at least ergodic for all practical purposes) because of the complicated
interactions between huge numbers of particles. In some cases ergodicity can be proven exactly, for example for hard billiard-ball collisions by Simanyi in 2013.²

2.2 Microcanonical Ensemble (MCE)


Like in thermodynamics, we begin our discussion of statistical mechanics with an isolated
system with fixed internal energy E. The macrostate is given by the internal energy E
2
N. Simanyi, Nonlinearity 26, 1703-1717 (2013); not for the faint hearted! A more tractable treatment of ergodicity may be found in the unit Ergodic Theory and Dynamical Systems.
Figure 17: An isolated system picked from a microcanonical ensemble; its microstates lie on the surface $H(\mu) = E$.

and generalised displacements $x$. For example, for a gas $x = (V, N)$. The corresponding set of microstates is called the microcanonical ensemble (MCE).³ Since the total energy of the system has a fixed value $E$, all the microstates in this ensemble will be confined to the surface $H(\mu) = E$. However, it is convenient to allow for a small range of energies in an interval $[E, E + \Delta]$ with a very small $\Delta$.⁴

The central postulate of statistical mechanics is Boltzmann's assumption of equal a priori equilibrium probabilities, which states that for an isolated system all microstates with the same energy are equally likely to be occupied.
This implies that for the MCE, the probability density of being in a particular microstate $\mu$ is
$$p_{E,x}(\mu) = \begin{cases} \dfrac{1}{\Gamma(E, x)} & \text{for } E \le H(\mu) \le E + \Delta\,, \\[2mm] 0 & \text{otherwise}\,, \end{cases} \qquad (2.2.1)$$
where $\Gamma(E, x)$ is the volume of phase space in the region $E \le H(\mu) \le E + \Delta$. For small enough $\Delta$ it is given by the area of the surface of constant energy $H(\mu) = E$ in phase space times $\Delta$.
As will be shown in the following, the central quantity for the microcanonical ensemble is the entropy. It allows us to connect the microscopic theory to macroscopic thermodynamics. In the following section we will identify what the entropy is in the microscopic theory.

2.2.1 Entropy

Entropy is often associated with disorder. Consider the two systems in Figure 18. One can argue that the left system shows much more disorder than the right system, where the particles occupy only half the system. The entropy of the left system is indeed much higher than the entropy of the right system: we have from the examples in week 3 that the difference in entropy for an ideal gas of $N$ particles at fixed temperature $T$ is
$$\Delta S = N k_B \ln\frac{V_2}{V_1} = N k_B \ln 2$$
3
Sometimes µCE in the literature.
4
In classical mechanics it is easier to deal with phase space volumes, and in quantum mechanics the energy spectrum of microstates is discrete, so for most energies E there is no corresponding microstate.

Figure 18: Two cylinders of gas. The particles in system A are uniformly distributed over the whole volume, whereas they are concentrated in one half of the volume in system B.

From another point of view, there are many more microstates that are compatible with the state of the left system than with those of the right system; in fact the relation between the numbers of microstates is
$$\frac{n_B}{n_A} = \left(\frac{1}{2}\right)^N\,, \qquad (2.2.2)$$
because the particles in the right system have only half the volume available to them.
Let us assume that there is indeed a relation between the entropy SA of a macrostate
and the number of microstates nA that are compatible with the macrostate, and that
this relation is given by SA = f (nA ). We will use the extensive property of the entropy
to find the form of the function f .
Let us assume that we have a system that consists of two independent parts, labelled
by 1 and 2. The entropy is extensive and hence S1+2 = S1 + S2 . On the other hand,
the number of microstates of the combined system is the product of the number of
microstates of the partial systems, n1+2 = n1 n2 . We hence have

f (n1 n2 ) = f (n1 ) + f (n2 ) . (2.2.3)

Differentiating this equation with respect to both $n_1$ and $n_2$ yields
$$f'(n_1 n_2) + n_1 n_2\,f''(n_1 n_2) = 0\,.$$

Denoting $n_1 n_2 = n$ we obtain further
$$\frac{f''(n)}{f'(n)} = -\frac{1}{n} \;\Rightarrow\; \ln f'(n) = -\ln n + \text{const.} \;\Rightarrow\; f'(n) = \frac{k}{n}\,,$$
where $k$ is a constant. One further integration yields
$$f(n) = k\ln n\,, \qquad (2.2.4)$$

where we dropped another integration constant because it would not satisfy the extensive
property (2.2.3).
Back to our microcanonical ensemble, we use these considerations to define the entropy of a macrostate with parameters $E$ and $x$ by
$$S(E, x) = k_B \ln \Omega(E, x)\,. \qquad (2.2.5)$$

Here $\Omega(E, x)$ is the number of microstates that are compatible with the macrostate. For the constant $k$ we choose the Boltzmann constant $k_B$. It will be seen later in the example

of the ideal gas that this is indeed the correct identification of the constant k. Expression
(2.2.5) is due to Boltzmann.
The remaining step is to specify the number of microstates $\Omega(E, x)$. It is given by
$$\Omega(E, x) = \frac{\Gamma(E, x)}{h^{3N}\,N!}\,. \qquad (2.2.6)$$
The expression $\Gamma(E, x)/h^{3N}$ follows from our discussion of the number of classical states,
but there is an additional factor of $N!$ that has to be included. It will be seen later in the example of an ideal gas that this factor is needed, because otherwise the entropy is not extensive (Gibbs paradox). Its interpretation is that identical microscopic particles are indistinguishable, and that microstates that differ only by the interchange of particles should be considered identical. Hence the division by the number $N!$ of possible permutations of $N$ particles. This is an indication that classical mechanics is not the basic microscopic theory, because particles are indistinguishable in quantum mechanics whereas they are usually considered to be distinguishable in classical mechanics.

Information and entropy
We show here an alternative way to introduce entropy. We considered the idea of an
ensemble of microstates from which we can randomly pick a system. This random
choice is intimately linked to information.
To see this let us consider an object hidden with equal probability in one of x
identical boxes. The fact that we cannot tell which particular box the object is in
reflects our lack of information. Let us denote the amount of missing information as S.
It clearly has a number of properties:

1. It must be a function of x: S = S(x).

2. The larger x is, the larger our lack of information: S(x) > S(y) if x > y.

3. If there is only one box, x = 1, there is no missing information: S(1) = 0.

4. One can ask how the missing information is distributed among subsystems. Sup-
pose we split a system of 2x boxes into 2 sets of x boxes. We can then divide
the missing information of the entire system S(2x) into two parts. First there is the miss-
ing information S(2) for the choice of one of the two sets, and then the missing
information S(x) for the choice of one of the x boxes in a set.

S(2x) = S(2) + S(x) .

More generally we see that

S(xy) = S(x) + S(y) .

Note that S(1 · y) = S(1) + S(y) = S(y).

Let us generalize to continuous variables x ≥ 1 so that S(x) is a continuous function
for x ≥ 1. Then the properties 1 to 4 above uniquely determine the functional form of
S(x). To see this let x = e^{m/n}, hence x^n = e^m and

S(x^n) = S(e^m) ⟹ n S(x) = m S(e) ⟹ S(x) = (m/n) S(e) = S(e) ln x .
Hence, the amount of missing information in having to make a random pick of one
out of x equally likely choices, S(x), called the information entropy, is given by

S(x) = k ln x ,

where k = S(e) is a positive number defining the units. This agrees with (2.2.4).
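As a concrete illustration (a sketch added here, not part of the derivation): choosing k = 1/ln 2 measures S(x) in bits, i.e. the number of yes/no questions needed to locate the object, and a binary search over the boxes realises exactly this count.

import math

def questions_needed(x, target):
    """Locate an object in boxes 0..x-1 by halving; count the yes/no questions."""
    lo, hi, count = 0, x, 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        count += 1                 # one yes/no question per halving
        if target < mid:
            hi = mid
        else:
            lo = mid
    return count

x = 1024
print(questions_needed(x, 137))    # 10 questions
print(math.log2(x))                # S(x)/k with k = 1/ln 2: 10 bits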

2.2.2 Zeroth law

Using the MCE we can derive many of our thermodynamical concepts.


The zeroth law of thermodynamics describes systems at thermal equilibrium. Consider
two systems that are brought into thermal contact. They can exchange energy in the

form of heat. The other parameters, x1 and x2 , of the systems are kept fixed. The
combined system is isolated so that the total energy E = E1 + E2 is constant, and we
can apply the microcanonical ensemble to the combined system.

Figure 19: Zeroth law: two isolated systems, with energies E1 and E2 = E − E1, brought
into contact.

Due to the thermal contact of the subsystems the combined system has a much larger
number of microstates available than before the contact, because E1 and E2 are now free
to change as long as the total energy E = E1 + E2 is constant.
What is the probability density P (E1 ) that system 1 has energy E1 ? The probability
density P (E1 ) is proportional to the total number of microstates that are available to the
combined system if system 1 has energy E1 . This number is the product of the number
of microstates of system 1 times the number of microstates of system 2.

P(E1) ∝ Ω1(E1, x1) Ω2(E − E1, x2) . (2.2.7)

The function Ω(E, x) is an enormously rapidly increasing function of the energy (for an
ideal gas Ω ∝ E^{3N/2}, see next section). Hence P(E1) is the product of a very rapidly
increasing function and a very rapidly decreasing function and has a very pronounced
maximum, as in Figure 20.

Figure 20: The probability that system 1, in thermal contact with system 2, has energy
E1 satisfies P(E1) ∝ Ω1(E1, x1) Ω2(E − E1, x2) and has a very pronounced maximum at E1*.

The distribution (2.2.7) is dominated by its maximum. To find the position E1* of its
maximum we can consider its logarithm ln P(E1) ∝ S1(E1, x1) + S2(E − E1, x2) (which
depends more smoothly on energy) and obtain the condition

0 = (d/dE1) ln P(E1) ⟹ ∂S1/∂E1 − ∂S2/∂E2 = 0 , (2.2.8)

where E2 = E − E1. All microstates are equally likely; however, there is an exponentially
larger number of them in the vicinity of E1*, E2*, where E2* = E − E1*, than at any
other point, so that in general even though the system started at some (arbitrary) point
E1, E2, it eventually evolves to the state with internal energies E1*, E2*, defining thermal
equilibrium.
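The sharpness of this maximum is easy to check numerically. The sketch below (an added illustration, using the ideal-gas behaviour Ω ∝ E^{3N/2} quoted above, so that ln P(E1) = (3N/2)[ln E1 + ln(E − E1)] + const) computes the relative width of the peak of P(E1):

import numpy as np

def relative_width(N, E=1.0, npts=100_001):
    """Full width at half maximum of P(E1), relative to the total energy E."""
    E1 = np.linspace(1e-6, E - 1e-6, npts)
    lnP = 1.5 * N * (np.log(E1) + np.log(E - E1))   # work with logs to avoid overflow
    P = np.exp(lnP - lnP.max())                     # peak normalised to 1
    inside = E1[P > 0.5]
    return (inside[-1] - inside[0]) / E

for N in (10, 100, 10_000):
    print(N, relative_width(N))    # the relative width shrinks like 1/sqrt(N)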
As in the section on the zeroth law we find again from equation (2.2.8) that thermal
equilibrium is characterised by a state variable of the systems being equal. We define

(∂S/∂E)_x = 1/T (2.2.9)

as before, and hence the temperatures of the two systems are equal at equilibrium. Equa-
tion (2.2.9) defines temperature for the MCE. We will see later in the example on the ideal
gas that it agrees with the temperature scale that was introduced in thermodynamics.

2.2.3 First and second law

In (2.2.9) we found how the entropy changes with energy. One can also study varia-
tions of the entropy with generalised displacements x by considering how the number of
microstates is affected by reversible changes dx, and this can be related to an average
generalised force f. In this way one obtains a microscopic derivation of the relation^5

(∂S/∂x)_E = f/T . (2.2.10)
The differential of the entropy is hence given by

dS = (∂S/∂E)_x dE + (∂S/∂x)_E dx = (1/T) dE + (f/T) dx ⟹ T dS = dE + f dx . (2.2.11)

The work done on the system is given by δW = −f dx, and identifying the heat again as
δQ = T dS we obtain the first law

dE = δQ + δW . (2.2.12)

Note that (2.2.11) agrees with what we previously described as the mathematical
formulation of the second law.
Let us return to our composite system made by bringing two systems into contact to
discuss the variational formulation of the second law. Our statistical definition of equilib-
rium depends on the fact that N ≫ 1. We are exponentially unlikely to find the combined
system with energies of the subsystems different from E1*, E2*, because the equilibrium
point has an exponentially larger number of accessible states than any starting point.
Therefore, for any starting state E1, E2,

Ω1(E1*) Ω2(E2*) ≥ Ω1(E1) Ω2(E2) , (2.2.13)

and the system evolves to a much more likely (because much bigger) region of the joint
phase space. Taking the logarithm of (2.2.13) we find that the entropy increases:

ΔS = S1(E1*) + S2(E2*) − S1(E1) − S2(E2) ≥ 0 . (2.2.14)


5 We cite here only the result; see e.g. F. Reif, Fundamentals of Statistical and Thermal Physics.

Finally, when the two bodies are first brought into contact, their temperatures are in
general not the same. For initial energies close to the new equilibrium energies, and
using E2* − E2 = −(E1* − E1) (the total energy is fixed),

ΔS ≃ (∂S1/∂E1)_{x1} (E1* − E1) + (∂S2/∂E2)_{x2} (E2* − E2)
   = [(∂S1/∂E1)_{x1} − (∂S2/∂E2)_{x2}] (E1* − E1) = (1/T1 − 1/T2) (E1* − E1) ≥ 0 , (2.2.15)

which implies that heat flows from the hotter to the colder body.

2.2.4 Entropy of a monatomic ideal gas of N particles in a box of volume V

To calculate the entropy of a monatomic ideal gas we need to determine the number of
microstates in the phase space region E ≤ H ≤ E + Δ. The volume of this region is
given by Γ(E, x) = Γ(E, V, N). A point in 6N-dimensional phase space is denoted by
γ = (q, p) and the Hamiltonian has the form

H(γ) = Σ_{i=1}^{3N} p_i^2 / (2m) , (2.2.16)

where the position coordinates of the N particles are restricted to the box with volume V.
The equation H(γ) = E hence describes the surface of a hypersphere in 3N-dimensional
momentum space with radius √(2mE). We thus find that (for small Δ)

Γ(E, V, N) = Δ V^N S_{3N}(√(2mE)) , (2.2.17)

where S_d(R) denotes the surface area of a hypersphere of radius R in d-dimensional
space. The term V^N arises from the integration over the position coordinates. The
surface area has been calculated on homework sheet 3:

S_d(R) = 2 π^{d/2} R^{d−1} / (d/2 − 1)! . (2.2.18)
We find from equation (2.2.6) that the number of microstates is given by

Ω(E, V, N) = Γ(E, V, N) / (h^{3N} N!) = [Δ V^N / (h^{3N} N!)] · 2 π^{3N/2} (2mE)^{(3N−1)/2} / (3N/2 − 1)! . (2.2.19)
The factorial function can be approximated by Stirling's formula (homework sheet 1):

n! ≈ √(2πn) n^n / e^n as n → ∞ . (2.2.20)

When taking the logarithm of (2.2.19) to obtain the entropy we can approximate the
leading term by ln n! ≃ n ln n − n for n ≫ 1, and we obtain

S(E, V, N) = kB ln Ω(E, V, N) = N kB ln[ (V e/N) (4πmeE/(3N h^2))^{3/2} ] , (2.2.21)

where a term of kB ln 2 and other subleading (non-extensive) contributions have been
neglected in comparison to the other terms. Equation (2.2.21) is an explicit expression
for the entropy of an ideal gas. One can easily check

that the energy and volume dependence agree with the result from Thermodynamics
(homework sheet 3).
Before we apply formula (2.2.21) let us check that it satisfies the extensive property. One
sees indeed that
S(λE, λV, λN) = λ S(E, V, N) . (2.2.22)
Note, however, that the factor e/N in (2.2.21) is necessary for this
property. The origin of this term is the factor N! in (2.2.19) from the indistinguishability
of particles. Without this term the entropy would not be extensive and one would have
the Gibbs paradox.
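As a numerical sanity check (an added sketch; the helium atomic mass and the molar volume at room conditions are the only assumed inputs), one can evaluate (2.2.21) directly and confirm the extensivity (2.2.22):

import numpy as np

kB = 1.380649e-23     # J/K
h = 6.62607015e-34    # J s
m = 6.646e-27         # mass of a helium atom, kg

def S(E, V, N):
    """Entropy of a monatomic ideal gas, eq. (2.2.21)."""
    arg = (V * np.e / N) * (4 * np.pi * m * np.e * E / (3 * N * h**2)) ** 1.5
    return N * kB * np.log(arg)

N = 6.022e23                       # one mole
T = 300.0                          # K
E = 1.5 * N * kB * T               # caloric equation, derived below as (2.2.23)
V = 0.0246                         # m^3, one mole at 300 K and about 1 atm

print(S(E, V, N))                  # ~ 126 J/K, close to tabulated values for helium
print(S(2*E, 2*V, 2*N) / S(E, V, N))   # exactly 2: the entropy is extensive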
We now apply equation (2.2.21) to derive the caloric equation and the equation of state
of an ideal gas. We have from (2.2.9)

1/T = (∂S/∂E)_{V,N} = 3N kB/(2E) ⟹ E = (3/2) N kB T , (2.2.23)

from which we obtain further the heat capacity at constant volume

CV = (∂E/∂T)_{V,N} = (3/2) N kB . (2.2.24)

The equation of state follows from (2.2.10):

p/T = (∂S/∂V)_{E,N} = N kB/V ⟹ pV = N kB T . (2.2.25)
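Both results can also be verified symbolically from (2.2.21); the following sketch (an added check, assuming the sympy library is available) differentiates the entropy exactly as in (2.2.9) and (2.2.10):

import sympy as sp

En, V, N, kB, m, h, T = sp.symbols('E V N k_B m h T', positive=True)

# Entropy (2.2.21); sp.E is Euler's number e
S = N * kB * sp.log((V * sp.E / N)
                    * (4 * sp.pi * m * sp.E * En / (3 * N * h**2)) ** sp.Rational(3, 2))

# Caloric equation (2.2.23): 1/T = (dS/dE)_{V,N}
invT = sp.simplify(sp.diff(S, En))
print(invT)                               # 3*N*k_B/(2*E)
print(sp.solve(sp.Eq(invT, 1 / T), En))   # [3*N*T*k_B/2], i.e. E = (3/2) N kB T

# Equation of state (2.2.25): p/T = (dS/dV)_{E,N}
print(sp.simplify(V * sp.diff(S, V)))     # N*k_B, i.e. pV = N kB T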

Exercise: what is the chemical potential of an ideal gas?

2.2.5 Maxwell-Boltzmann distribution

We can apply the calculation in the previous section to obtain the probability density for
a particle in an ideal gas to have momentum p. Using the definition of a microcanonical
ensemble this probability density is given by

p(p) = (1/Γ(E, V, N)) ∫_{E′ ≤ H′ ≤ E′+Δ} d^{3N}q d^{3(N−1)}p . (2.2.26)

The integral goes only over the momenta of N − 1 particles, because the N-th momentum
is fixed by p. H′ is the Hamiltonian of the remaining N − 1 particles and E′ = E −
|p|^2/2m. We can express (2.2.26) as

p(p) = V Γ(E − |p|^2/2m, V, N − 1) / Γ(E, V, N) , (2.2.27)

and inserting the result for Γ(E, V, N) from the previous section we obtain

p(p) = [ (1 − |p|^2/(2mE))^{(3N−4)/2} / (π^{3/2} (2mE)^{3/2}) ] · (3N/2 − 1)! / (3(N−1)/2 − 1)! . (2.2.28)

For large N this expression can be simplified. We have

(1 − |p|^2/(2mE))^{(3N−4)/2} = exp[ ((3N−4)/2) ln(1 − |p|^2/(2mE)) ] ≈ exp( −3N |p|^2/(4mE) ) , (2.2.29)

where only terms that do not vanish in the limit N → ∞ have been kept (assuming that
|p|^2/2mE is of order 1/N).
Furthermore

(3N/2 − 1)! / (3(N−1)/2 − 1)! = exp[ ln(3N/2 − 1)! − ln(3(N−1)/2 − 1)! ] ≈ (3N/2)^{3/2} . (2.2.30)

The last equality has been obtained by inserting Stirling's approximation for the factorial
function and expanding in 1/N in the exponent (slightly lengthy calculation). The final
result is

p(p) ≃ (3N/(4πmE))^{3/2} exp( −3N |p|^2/(4mE) ) . (2.2.31)
One can check that this expression is normalised: integration over d^3 p yields one. In
a final step we use the caloric equation (2.2.23) to replace the energy by temperature:

p(p) ≃ (1/(2πm kB T)^{3/2}) exp( −|p|^2/(2m kB T) ) .

This is the famous Maxwell-Boltzmann distribution.
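A quick way to see this distribution in action (an added sketch; helium at room temperature is an assumed example) is to sample the three Gaussian momentum components it predicts and compare a simple moment with the known analytic value:

import numpy as np

rng = np.random.default_rng(0)

m, kB, T = 6.646e-27, 1.380649e-23, 300.0    # helium atom, SI units

# Each Cartesian momentum component is Gaussian with variance m kB T.
p = rng.normal(0.0, np.sqrt(m * kB * T), size=(100_000, 3))
speeds = np.linalg.norm(p, axis=1) / m       # v = |p|/m

# Mean speed: sampled value against the analytic sqrt(8 kB T / (pi m)).
print(speeds.mean())
print(np.sqrt(8 * kB * T / (np.pi * m)))     # ~ 1.26 km/s for helium at 300 K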

2.3 Canonical Ensemble (CE)


The microcanonical ensemble is not always the most convenient ensemble. On the one
hand one has to calculate phase space volumes which can be complicated, on the other
hand one often has the situation that a system is not isolated but in thermal contact with
its environment, for example when experiments are performed at room temperature in
a laboratory. In this case T and x are the appropriate thermodynamic variables instead
of E and x. The corresponding statistical ensemble is the canonical ensemble.
To derive its properties we consider the system that we are interested in to be in contact
with a much larger heat reservoir at temperature T, as in Figure 21. The combined
system of system S and heat reservoir R has constant energy EC = E + ER and can
be described by the microcanonical ensemble. We would like to know the probability
density for system S to be found in a microstate γ. Similar to our earlier discussion, this
probability density is proportional to the number of microstates of the combined system
that are compatible with system S being in state γ. Because the state of the system S is
fixed, this number is equal to the number ΩR(EC − E, xR) of microstates of the reservoir
at energy EC − E, where E = H(γ) and H is the Hamiltonian of system S.
We hence have

pT,x(γ) ∝ ΩR(EC − H(γ)) = exp[ SR(EC − H(γ)) / kB ] , (2.3.1)

where SR is the entropy of the reservoir and we did not explicitly write the dependence
on xR. The energy E = H(γ) is much smaller than EC and we obtain further

SR(EC − H(γ)) ≈ SR(EC) − (∂SR/∂ER) H(γ) = SR(EC) − H(γ)/T . (2.3.2)

Figure 21: A system S in contact with a much bigger heat reservoir R. The reservoir
is at temperature T and the energies of system and reservoir are denoted by E and ER,
respectively.

The term SR(EC) is constant and can be absorbed in the normalisation factor. We finally
obtain the probability density for the canonical ensemble

pT,x(γ) = (1/Z(T, x)) e^{−βH(γ)} , where β = 1/(kB T) , (2.3.3)

and Z(T, x) is the normalisation factor. It is given by a sum over all microstates γ of
the system:

Z(T, x) = Σ_γ e^{−βH(γ)} . (2.3.4)

The function Z(T, x) is called the partition function, and the probability distribution
pT,x(γ) is the Boltzmann distribution.
Note that we use again the convention about counting classical states that we discussed
earlier, so that for N particles with classical Hamiltonian H(q, p) we have

Z(T, x) = (1/(h^{3N} N!)) ∫ d^{3N}q d^{3N}p exp(−βH(q, p)) , where β = 1/(kB T) . (2.3.5)
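For a system with a discrete spectrum the sum (2.3.4) can be evaluated directly. Here is a minimal sketch (an added illustration with a hypothetical three-level system, in units where kB = 1 so that β = 1/T):

import numpy as np

energies = np.array([0.0, 1.0, 2.0])   # hypothetical three-level spectrum
T = 1.5                                # temperature in units with kB = 1
beta = 1.0 / T

weights = np.exp(-beta * energies)
Z = weights.sum()                      # partition function, eq. (2.3.4)
p = weights / Z                        # Boltzmann probabilities, eq. (2.3.3)

print(p)                               # lower-energy states are more probable
print(np.dot(p, energies))             # average energy <H> in the CE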

Information Entropy Revisited


In the CE it is clear that the probability of being in a particular state is not constant
and depends on the properties of the state in question, i.e. its energy. But our earlier
discussion of missing information was based on equal probabilities of events. How can
we define missing information when this is not the case?
Let us return to our problem of choosing from n different configurations. However, now
the normalised probability of finding the system in configuration i is given by

pi ≥ 0 , i = 1, . . . , n ; Σ_{i=1}^{n} pi = 1 . (2.3.6)

Let us further consider an ensemble of N such systems with N → ∞. Then we know
that to a very good approximation, the number of systems in state i is given by Ni = N pi
for each configuration i, and that Σ_{i=1}^{n} Ni = N.

Now we can ask about the amount of missing information in this ensemble. This ensemble
has N1 systems in state 1, N2 in state 2, and so on. The missing information is the
lack of knowledge of which systems are in state 1, which are in state 2, and so on.
This amount of missing information depends on the total number of different ways in which,
given N systems, we can find N1 in state 1, N2 in state 2, and so on. We could think about
these as being the possible and equally likely states of the ensemble. The probability of
the ensemble being in any one of these states is equal and we do not know which one
of them it is in! So we are back to the problem of the missing information for choosing
from identical boxes.
The total number of different ways we can arrange N systems such that N1 are in state
1, N2 in state 2, . . . , Nn in state n, is given by

Ω_N = N! / (N1! N2! ··· Nn!) . (2.3.7)

The missing information of the whole ensemble is then

S_N = kB ln Ω_N , (2.3.8)

and we can define the missing information per system (member of the ensemble) as
simply

S = lim_{N→∞} (1/N) S_N = lim_{N→∞} (kB/N) ln Ω_N . (2.3.9)

In the large N limit we can use Stirling's approximation:

S = lim_{N→∞} (kB/N) [ N ln N − N − Σ_{i=1}^{n} (Ni ln Ni − Ni) ]
  = lim_{N→∞} (kB/N) [ N ln N − Σ_{i=1}^{n} (pi N) ln(pi N) ] = −kB Σ_{i=1}^{n} pi ln pi .
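This limit can be confirmed numerically; the sketch below (an added check, with the arbitrary choice p = (1/2, 1/3, 1/6)) evaluates (1/N) ln Ω_N from (2.3.7) exactly via log-factorials and watches it approach −Σ pi ln pi:

import math

p = [1/2, 1/3, 1/6]                     # example probabilities
target = -sum(q * math.log(q) for q in p)

for N in (60, 600, 6000):
    Ns = [round(q * N) for q in p]      # occupation numbers Ni = pi N
    # ln Omega_N = ln N! - sum_i ln Ni!, using lgamma(n + 1) = ln n!
    lnOmega = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in Ns)
    print(N, lnOmega / N, target)       # the second column approaches the third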

Hence finally we obtain the missing information (or information entropy) for a system
whose states are occupied with varying probabilities as

S = −kB Σ_{i=1}^{n} pi ln pi = −kB ⟨ln pi⟩ , (2.3.10)

where ⟨ln pi⟩ denotes the average of ln pi. Note that when the probabilities are equal,
pi = 1/n, then we obtain S = kB ln n as before.
This result suggests that we can calculate the entropy for the CE as

S = −kB ⟨ln pT,x(γ)⟩ = (1/T) ⟨H(γ)⟩ + kB ln Z = (1/T) (E + kB T ln Z) , (2.3.11)

where we used (2.3.3). Recalling that the Helmholtz free energy is F = E − TS we can
then make the identification

F(T, x) = −kB T ln Z(T, x) . (2.3.12)
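The identifications (2.3.11) and (2.3.12) are easy to verify numerically for the same hypothetical three-level system used above (an added sketch, again in units with kB = 1):

import numpy as np

energies = np.array([0.0, 1.0, 2.0])
T = 1.5
beta = 1.0 / T

weights = np.exp(-beta * energies)
Z = weights.sum()
p = weights / Z

S_gibbs = -(p * np.log(p)).sum()        # S = -kB sum_i pi ln pi, eq. (2.3.10)
E_avg = (p * energies).sum()            # E = <H>
F = -T * np.log(Z)                      # F = -kB T ln Z, eq. (2.3.12)

print(S_gibbs)
print((E_avg - F) / T)                  # equals S_gibbs, as (2.3.11) requires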

Appendix: Not examinable
Why entropy is related to probability and information

In our derivation of entropy in the MCE we considered a container of gas, filling either all or
half of the container. Clearly, if the gas fills half the container, we would expect it to move to
fill the whole container, thus attaining the higher entropy equilibrium state. However, since all
states are equally likely, it is possible that it will return near the original state.^6 We would then
conclude that the entropy had decreased by a significant amount again. In this sense the second
law is probabilistic in statistical mechanics: it is violated with a probability that is exponentially
small in the amount of violation, and hence violations are not observed in macroscopic systems. However
it remains an absolute law in the sense that it is not possible to design a cyclic process to
continually violate the Clausius or Kelvin statements. This is true irrespective of the number
of particles, and in fact, irreversibility has been observed with a single particle.^7
Could we violate the second law by deliberately manipulating the system? Place a partition
in the middle of the container, with a small hole that we can make appear and disappear at
will. If we let particles move only in one direction, we end up with all the particles on one side.
Alternatively we could let fast moving particles move to the right and slow moving particles to
the left, and end up with a temperature difference. In each case we have decreased the entropy
of the system without doing any work on it. This scenario was first proposed by Maxwell, and
the device is called Maxwell's demon. We can recover the second law of thermodynamics only
by postulating that the information we hold about the system (the microstate of the particles)
that enables us to control it in this way must be taken into account in the calculation of the
entropy. This led to many discussions and calculations (search for Landauer's principle)
and is currently important for quantum computing.
There has also been a recent connection made between entropy and information via an
axiomatic formulation of thermodynamics; skim read the paper M. Weilenmann, L. Kraemer,
P. Faist and R. Renner, Phys. Rev. Lett. 117, 260601 (2016) without trying to understand
all the details.

6 Actually, this can be proved from the Hamiltonian dynamics and is called the Poincaré recurrence
theorem; we may discuss this in the kinetic theory section.
7 M. Gavrilov and J. Bechhoefer, EPL 114, 50002 (2016).

