3 Statistical Mechanics
“Ludwig Boltzmann, who spent much of his life studying statistical mechanics, died in 1906
by his own hand. Paul Ehrenfest, carrying on the work, died similarly in 1933. Now it is
your turn to study statistical mechanics.”
David Goodstein
Today, I want to tell you about entropy and the Second Law of thermodynamics. This will
address deep questions about the world we live in, such as why we remember the past and not
the future. It will also have relations to information theory, computing and even the physics
of black holes. Along the way, we will encounter one of the most important formulas in all of
science:
S = k log W .
3.1 More is Different
Suppose you’ve got theoretical physics cracked—i.e. you know all the fundamental laws of Na-
ture, the properties of the elementary particles and the forces at play between them. How can
you turn this knowledge into an understanding of the world around us?
Consider this glass of water:
It contains about N = 10^24 atoms. In fact, any macroscopic object contains such a stupendously
large number of particles. How do we describe such systems?
An approach that certainly won't work, is to write down the equations of motion for all 10^24
particles and solve them. Even if we could handle such computations, what would we do with
the result? The positions of individual particles are of little interest to anyone. We want answers
to much more basic questions about macroscopic objects. Is it wet? Is it cold? What colour is
it? What happens if we heat it up? How can we answer these kinds of questions starting from
the fundamental laws of physics?
Statistical mechanics is the art of turning the microscopic laws of physics into a description
of the macroscopic everyday world. Interesting things happen when you throw 10^24 particles
together. More is different: there are key concepts that are not visible in the underlying laws of
physics but emerge only when we consider a large collection of particles. A simple example is
temperature. This is clearly not a fundamental concept: it doesn’t make sense to talk about the
temperature of a single electron. But it would be impossible to talk about the world around us
without mention of temperature. Another example is time. What distinguishes the past from
the future? We will start there.
3.2 The Distinction of Past and Future
Nature is full of irreversible phenomena: things that easily happen but could not possibly happen
in reverse order. You drop a cup and it breaks. But you can wait a long time for the pieces to
come back together spontaneously. Similarly, if you watch the waves breaking at the sea, you
aren’t likely to witness the great moment when the foam collects together, rises up out of the
sea and falls back further out from the shore. Finally, if you watch a movie of an explosion in
reverse, you know very well that it’s fake. As a rule, things go one way and not the other. We
remember the past, we don’t remember the future.
Where does this irreversibility and the arrow of time come from? Is it built into the microscopic
laws of physics? Do the microscopic laws distinguish the past from the future? Things are not
so simple. The fundamental laws of physics are, in fact, completely reversible.1 Let us take
¹ This isn't quite true for processes involving the weak nuclear force, but this isn't relevant for the present discussion.
the law of gravity as an example. Take a movie of a planet going around a star. Now run
the movie in reverse. Does it look strange? Not at all. Any solution of Newton’s equations
can be run backward and it is still a solution. Whether the planet goes around the star one
way or the opposite way is just a matter of its initial velocity. The law of gravitation is time
reversible. Similarly, the laws of electricity and magnetism are time reversible. And so are all
other fundamental laws that are relevant for creating our everyday experiences.
So what distinguishes the past from the future? How do reversible microscopic laws give rise
to apparently irreversible macroscopic behaviour? To understand this, we must introduce the
concept of entropy.
3.3 Entropy and The Second Law
“If someone points out to you that your pet theory of the universe is in disagreement with
Maxwell’s equations—then so much the worse for Maxwell’s equations. If it is found to be
contradicted by observation—well these experimentalists can bungle things sometimes. But
if your theory is found to be against the Second Law of Thermodynamics I can give you no
hope; there is nothing for it but to collapse in deepest humiliation.”
Sir Arthur Eddington
3.3.1 Things Always Get Worse
We start with a vague definition of entropy as the amount of disorder of a system. Roughly, we
mean by “order” a state of purposeful arrangement, while “disorder” is a state of randomness.
For example, consider dropping ice cubes in a glass of water. This creates a highly ordered, or
low entropy, configuration: ice in one corner, water in the rest. Left on its own, the ice will melt
and the ice molecules will mix with the water.
[Figure: ice cubes in a glass of water (low entropy) and the melted, fully mixed state (high entropy).]
The final mixed state is less ordered, or high entropy.
Similarly, the natural tendency of coffee and milk is to mix, but not to unmix:
[Figure: coffee and milk, unmixed (low entropy) and mixed (high entropy).]
These basic facts of life are summarized in the Second Law of Thermodynamics:2
The entropy of an isolated system always increases.
To the physicists of the late 19th century the Second Law was a serious paradox. They knew
that the microscopic laws of physics are time reversible. So if entropy can increase, the laws
of physics say it must be able to decrease. Yet, experience says otherwise. Entropy always
increases.
3.3.2 Probability
This is where Ludwig Boltzmann's genius came in. He realized that the Second Law is not a
law in the same sense as Newton’s law of gravity or Faraday’s law of induction. It’s a probabilistic
law that has the same status as the following obvious claim: if you flip a coin a million times you
will not get a million heads. It simply won’t happen. But is it possible? Yes, it is—it violates
no law of physics. Is it likely? Not at all. Boltzmann’s formulation of the Second Law is just
that. Instead of saying entropy does not decrease, he said that
entropy probably doesn’t decrease.
This is where the difference between 1 (or a few) and 10^24 is important again. It is much more
likely for a handful of particles to spontaneously do crazy things than for 10^24 particles. The
Second Law emerges for a large number of particles.
This also implies that if you wait around long enough, you will eventually see entropy decrease:
• By accident, particles and dust will come together and form a perfectly assembled bomb.
But, how long does it take for that to happen? A very long time. A lot longer than the
time to flip a million heads in a row, and even a lot longer than the age of the universe.
• Imagine I drop a bit of black ink into a glass of water. The ink spreads out and eventually
makes the water grey. Will a glass of grey water ever clear up and produce a small drop
of ink? Not impossible, but very unlikely.
• The air in this room is uniformly distributed. Is it possible that all air molecules spontaneously collect in one corner of the room, leaving the rest a vacuum? Not impossible, but very unlikely (a rough estimate of just how unlikely is sketched below).
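To put a number on “very unlikely”, here is a minimal back-of-the-envelope sketch (not part of the original notes), assuming roughly 10^24 independent molecules, each equally likely to be found in either half of the room:

```python
from math import log10

N = 1e24  # assumed number of air molecules in the room (order of magnitude only)

# Each molecule sits in the chosen half with probability 1/2, so the chance that
# *all* of them are there at once is (1/2)^N. Only its logarithm is a sensible number:
log10_p = -N * log10(2)
print(f"log10(probability) ~ {log10_p:.2e}")  # about -3e23
```

The probability is of order 10^(-3×10^23): not strictly zero, but zero for all practical purposes.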
² The First Law is the conservation of energy.
3.3.3 Counting
Let us now get a bit more precise and less philosophical. First, we want to give definitions of
three related concepts: microstates, macrostates and statistical entropy.
As a concrete example, consider a collection of N particles in a box. If a particle is in the
left half of the box, we say that it is in state L. If it is in the right half, we call its state R. We
specify a microstate of the system by making a list of the states of each particle, whether it is
left (L) or right (R). For instance, for N = 10 particles, a few possible microstates are
LLLLLLLLLL
RLLLLLLLLL
LRLLRRLLLL
···
The total number of possible microstates is 2^N (two possibilities for each particle). For N = 10^24
particles, this is a ridiculously large number, 2^(10^24). Luckily, we never need a list of all possible
microstates. All macroscopic properties only depend on the relative number of left and right
particles and not on the detail of which are left and right.
We can collect all microstates with the same numbers of L and R into a single macrostate,
labelled by one number
n ≡ N_L − N_R . (3.3.1)
How many microstates are in a given macrostate? Look at N = 10. For n = 10 (all left) and
n = −10 (all right), we only have 1 unique microstate each. For n = 8 (one right), we get 10
possible microstates since for 10 particles there are 10 ways of putting one particle on the right.
For n = 0 (equal number on the left and the right), we get 252 microstates. The complete
distribution of the number of microstates per macrostate is summarized in the following figure:
[Figure: bar chart of the number of microstates per macrostate for N = 10. The counts are 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1 for n = −10, −8, ..., 8, 10, peaking at n = 0.]
It is easy to generalize this: let W(n) be the number of ways to have N particles with N_L
particles on the left and N_R particles on the right. The answer is
W(n) = N! / (N_L! N_R!) = N! / [ ((N−n)/2)! ((N+n)/2)! ] . (3.3.2)
For very large N , your calculator won’t like evaluating the factorials. At this point, a normal
distribution is a very good approximation
W(n) ≈ 2^N e^(−n²/2N) . (3.3.3)
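As a quick sanity check on this counting (a minimal sketch, not part of the original notes), the exact formula (3.3.2) and the Gaussian approximation (3.3.3) are easy to compare numerically. The snippet below reproduces the N = 10 histogram above and shows that for large N the two expressions agree up to a correction of order log N, which is negligible compared to N log 2 (we work with logarithms, since the numbers themselves are astronomically large):

```python
from math import comb, log

def log_W_exact(N, n):
    """log of the exact count, eq. (3.3.2): W(n) = N! / (((N-n)/2)! ((N+n)/2)!)."""
    return log(comb(N, (N + n) // 2))

def log_W_gauss(N, n):
    """log of the Gaussian approximation, eq. (3.3.3): W(n) ~ 2^N exp(-n^2 / 2N)."""
    return N * log(2) - n**2 / (2 * N)

# Reproduce the N = 10 histogram: 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1
print([comb(10, (10 + n) // 2) for n in range(-10, 11, 2)])

# For large N the two logarithms differ only by a term of order log N.
N = 10_000
for n in (0, 100, 500):
    print(n, round(log_W_exact(N, n), 1), round(log_W_gauss(N, n), 1))
```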
So far this was just elementary combinatorics. To this we now add the fundamental assump-
tion of statistical physics:
each (accessible) microstate is equally likely.
Macrostates consisting of more microstates are therefore more likely.
Boltzmann then defined the entropy of a certain macrostate as the logarithm3 of the number
of microstates
S = k log W , (3.3.5)
where k = 1.38 × 10⁻²³ J K⁻¹ is Boltzmann's constant. The role of Boltzmann's constant is
simply to get the units right. Eq. (3.3.5) is without a doubt one of the most important equations
in all of science, on a par with Newton’s F = ma and Einstein’s E = mc2 . It provides the link
between the microscopic world of atoms (W ) and the macroscopic world we observe (S). In
other words, it is precisely what we were looking for.
Given the fundamental assumption of statistical mechanics (that each accessible microstate
is equally likely), we expect systems to naturally evolve towards macrostates corresponding to
a larger number of microstates and hence larger entropy
dS/dt ≥ 0 . (3.3.6)
We are getting closer to a microscopic understanding of the Second Law.
3.3.4 Arrow of Time
With this more highbrow perspective, we now return to the question of how macroscopic features
of a system made of many particles evolve as a consequence of the motion of the individual
particles.
Let our box be divided in two by a wall with a hole in it. Gas molecules can bounce around on
one side of the box and will usually bounce right off the central wall, but every once in a while
they will sneak through to the other side. We might imagine, for example, that the molecules
bounce off the central wall 995 times out of 1,000, but 5 times they find the hole and move to
the other side. So, every second, each molecule on the left side of the box has a 99.5 percent
chance of staying on that side, and a 0.5 percent chance of moving to the other side—likewise
for the molecules on the right side of the box. This rule is perfectly time-reversal invariant—if
you made a movie of the motion of just one particle obeying this rule, you couldn’t tell whether
³ Taking the logarithm of W to define entropy has the following important consequences: i) It makes the stupendously large numbers, like W = 2^(10^24), less stupendously large; ii) More importantly, it makes entropy additive—i.e. if we combine two systems 1 and 2, the numbers of microstates multiply, W_tot = W_1 W_2, which means that the entropies add:
S_tot = k log W_tot = k log(W_1 W_2) = k log W_1 + k log W_2 = S_1 + S_2 . (3.3.4)
it was being run forward or backward in time. At the level of individual particles, we can’t
distinguish the past from the future.
However, let’s look at the evolution from a more macroscopic perspective:
[Figure: three snapshots of the divided box at t = 1, t = 50 and t = 200, with time running upward; the molecules spread out more evenly between the two halves as time increases.]
The box has N = 2,000 molecules in it, and starts at time t = 1 with 1,600 molecules on the
left and only 400 on the right. It’s not very surprising what happens: because there are more
molecules on the left, the total number of molecules that shift from left to right will usually
be larger than the number that shift from right to left. So after 50 seconds we see that the
numbers are beginning to equal out, and after 200 seconds the distribution is essentially equal.
This box clearly displays an arrow of time. Even if I hadn't labelled the different distributions
in the figure with the specific times to which they correspond, you wouldn’t have any trouble
guessing that the bottom box came first and the top box came last. We’re not surprised when
the air molecules even themselves out, but we would be very surprised if they spontaneously
congregated all on one side of the box. The past is the direction of time in which things were
more segregated, while the future is the direction in which they have smoothed themselves out.
It’s exactly the same thing that happens when an ice cube melts or milk spreads out into a cup
of co↵ee.
It is easy to see that this is consistent with the Second Law. Using eqs. (3.3.2) and (3.3.5), we
can associate an entropy with the system at every moment in time. A plot of the time evolution
of the entropy looks as follows
[Figure: entropy of the box as a function of time; it increases monotonically and levels off at its maximum value once the molecules have evened out.]
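The figures above are easy to reproduce with a few lines of code. The following is a minimal simulation sketch (not part of the original notes), implementing the hopping rule described in the text, where each second every molecule independently has a 0.5 percent chance of slipping through the hole, and computing the entropy at each step from eqs. (3.3.2) and (3.3.5):

```python
import random
from math import comb, log

N = 2000          # total number of molecules in the box
n_left = 1600     # number on the left at t = 1, as in the figure
p_hop = 0.005     # per-second chance for each molecule to find the hole

def S_over_k(n_left):
    """Entropy in units of k: S/k = log W, with W from eq. (3.3.2)."""
    return log(comb(N, n_left))

random.seed(0)
for t in range(1, 301):
    if t in (1, 50, 200, 300):
        print(f"t = {t:3d}: n_left = {n_left:4d}, S/k = {S_over_k(n_left):7.1f}")
    # each molecule independently hops to the other side with probability p_hop
    left_to_right = sum(random.random() < p_hop for _ in range(n_left))
    right_to_left = sum(random.random() < p_hop for _ in range(N - n_left))
    n_left += right_to_left - left_to_right
```

Typical output shows n_left drifting from 1600 towards 1000 while S/k rises towards its maximum value log C(2000, 1000) ≈ 1382, in line with the plot above.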
3.4 Entropy and Information
“You should call it entropy, for two reasons. In the first place, your uncertainty function
has been used in statistical mechanics under that name, so it already has a name. In the
second place, and more important, no one knows what entropy really is, so in a debate you
will always have the advantage.”
John von Neumann to Claude Shannon.
3.4.1 Maxwell’s Demon
In 1871, Maxwell introduced a famous thought experiment that challenged the Second Law.
The setup is the same as before: a box of gas divided in two by a wall with a hole. However,
this time the hole comes with a tiny door that can be opened and closed without exerting a
noticeable amount of energy. Each side of the box contains an equal number of molecules with
the same average speed (i.e. same temperature). We can divide the molecules into two classes:
those that move faster than the average speed—let’s call them red molecules—and those that
move slower than average—called blue. At the beginning the gas is perfectly mixed (equal
numbers of red and blue on both sides, i.e. maximum entropy). At the door sits a demon, who
watches the molecules coming from the left. Whenever he sees a red (fast-moving) molecule
approaching the hole, he opens the door. When the molecule is blue, he keeps the door shut. In
this way the demon ‘unmixes’ the red and blue molecules: the left side of the box gets colder
and the right side hotter. We could use this temperature difference to drive an engine without
putting any energy in: a perpetual motion machine. Clearly this looks like it violates the Second
Law. What’s going on?
Maxwell’s demon and its threat to the Second Law have been debated for more than a century.
To save the Second Law there has to be a compensating increase in entropy somewhere. There
is only one place the entropy could go: into the demon. So does the demon generate entropy in
carrying out his demonic task? The answer is yes, but the way that this works is quite subtle
and was understood only recently (by Szilard, Landauer and Bennett).4 The resolution relies
on a fascinating connection between statistical mechanics and information theory.5
3.4.2 Szilard’s Engine
In 1929, Leo Szilard launched the demon into the information age.6 In particular, he showed
that
information is physical.
Possessing information allows us to extract useful work from a system in ways that would have
otherwise been impossible. Szilard arrived at these insights through a clever new version of
Maxwell’s demon: this time there is only a single molecule in the box. Two walls of the box are
replaced by movable pistons. A partition (now without a hole) is placed in the middle. The
molecule is on one side and the other side is empty. The demon measures and records on which
side of the partition the gas molecule is, gaining one bit of information. He then pushes in the
piston that closes off the empty half of the box.
In the absence of friction, this process doesn’t require any energy. Note the crucial role played
by information in this setup. If the demon didn’t know which half of the box the molecule was
in, he wouldn’t know which piston to push in. After removing the partition, the molecule will
push against the piston and the one-molecule gas “expands”.
⁴ One of the giants in this story wrote a nice article about it: Bennett, “Demons, Engines and the Second Law”, Scientific American 257 (5): 108-116 (1987).
⁵ William Bialek recently gave a nice lecture on the relationship between entropy and information. The video can be found here: http://media.scgp.stonybrook.edu/video/video.php?f=20120419 1 qtp.mp4
⁶ Szilard, “On the Decrease of Entropy in a Thermodynamic System by the Intervention of Intelligent Beings.”
In this way we can use the system to do useful work (e.g. by driving an engine). Where did the
energy come from? From the heat Q of the surroundings (with temperature T ).
The work done when the gas expands from Vi = V to Vf = 2V is given by a standard formula
in thermodynamics:
W = kT log(V_f/V_i) = kT log 2 . (3.4.7)
Recall that dW = F dx = p dV , where p is the pressure of the gas. The integrated work done is
therefore
W = ∫_{V_i}^{V_f} p dV .
Using the ideal gas law for the one-molecule gas, pV = kT , we can write this as
W = ∫_{V_i}^{V_f} (kT/V) dV = kT log(V_f/V_i) .
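To get a sense of scale (a small numerical aside, not in the original notes), at room temperature the work extracted per cycle is tiny, a few times 10⁻²¹ joules:

```python
from math import log

k = 1.380649e-23   # Boltzmann's constant in J/K
T = 300.0          # an assumed room temperature in kelvin

W = k * T * log(2)  # work extracted per cycle, eq. (3.4.7)
print(f"W = kT log 2 = {W:.2e} J")  # about 2.9e-21 J
```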
The system then returns to its initial state.
This completes one cycle of operation. The whole process is repeatable. Each cycle would allow
extraction and conversion of heat from the surroundings into useful work in a cyclic process. The
demon seems to have created a perpetual motion machine of the second kind.7 In particular,
in each stage of the cycle the entropy decreases by ΔS = −Q/T (another classic formula of
thermodynamics). Using Q = W, we find
ΔS = −k log 2 . (3.4.8)
Szilard’s demon again seems to have violated the Second Law.
3.4.3 Saving the Second Law
In 1982, Charles Bennett observed that Szilard’s engine is not quite a closed cycle.8 While after
each cycle the box has returned to its initial state, the mind of the demon has not! He has
⁷ It is of the ‘second kind’ because it violates the ‘second’ law of thermodynamics. A perpetual motion machine of the first kind violates the ‘first’ law—the conservation of energy.
⁸ Bennett, “The Thermodynamics of Computation”.
gained one bit of recorded information. The demon needs to erase the information stored in his
mind in order for the process to be truly cyclic. However, Rolf Landauer had shown9 in 1961
that the erasure of information is necessarily an irreversible process.10 In particular, destroying
one bit of information increases the entropy of the world by at least
ΔS ≥ k log 2 . (3.4.9)
Therein lies the resolution of Maxwell’s demon: the demon must collect and store information
about the molecule. If the demon has a finite memory capacity, he cannot continue to cool
the gas indefinitely; eventually, information must be erased. At that point, he finally pays the
entropy bill for the cooling he achieved. (If the demon does not erase his record, or if we want
to do the thermodynamic accounting before the erasure, then we should associate some entropy
with the recorded information.)
3.5 Entropy and Black Holes*
I want to end this lecture with a few comments about entropy, black holes, and quantum gravity.
3.5.1 Information Loss?
Every black hole is characterized by just three numbers: its mass, its spin and its electric charge.
It doesn’t matter what created the black hole; in the end all information is reduced to just these
three numbers. This is summarized in the statement that
Black holes have no hair.
This means that if we throw a book into a black hole, it changes the mass (and maybe the
spin and charge) of the black hole, but all information about the content of the book seems lost
forever. Do black holes really destroy information? Do they destroy entropy? Do they violate
the Second Law?
3.5.2 Black Hole Thermodynamics
The Second Law could be saved if black holes themselves carried entropy and if this entropy
increased as an object falls into a black hole. In 1973, Jacob Bekenstein, then a graduate student
at Princeton, thought that this was indeed the solution. In fact, there were tantalizing analogies
between the evolution of black holes and the laws of thermodynamics. The Second Law of
thermodynamics states that entropy never decreases. Similarly, the mass of a black hole (or
equivalently the area of its event horizon, A = 4πR² ∝ M²) never decreases. Throw an object
into a black hole and the black hole gets bigger. Bekenstein thought that this was more than
just a cheap analogy. He conjectured that black holes, in fact, carry entropy proportional to
their size,11
S_BH ∝ A . (3.5.10)
⁹ Landauer, “Irreversibility and Heat Generation in the Computing Process”.
¹⁰ In other words, you can’t erase information if you are part of a closed system operating under reversible laws. If you were able to erase information entirely, how would you ever be able to reverse the evolution of the system? If erasure is possible, either the fundamental laws are irreversible—in which case it is not surprising that you can lower the entropy—or you’re not really in a closed system. The act of erasing information necessarily transfers entropy to the outside world.
¹¹ Bekenstein, “Black Holes and the Second Law”.
3.5.3 Hawking Radiation
Stephen Hawking thought that this was crazy! If black holes had entropy, they would also have a
temperature, and one could then show that they had to give off radiation. But everyone knows
that black holes are black! Hawking therefore set out to prove Bekenstein wrong. But he failed!
What he found12 instead is quite remarkable: black holes aren’t black! They do give off radiation
and do carry huge amounts of entropy.
The key to understanding this is quantum mechanics: In quantum mechanics, the vacuum is
an interesting place. According to Heisenberg’s uncertainty relation, nothing can be completely
empty. Instead particle-antiparticle pairs can spontaneously appear in the vacuum. However,
they are only virtual particles, living only for a short time, before annihilating each other
[Figure: a particle–antiparticle pair spontaneously appears in the vacuum and annihilates shortly afterwards.]
Most pop science explanations of this effect are completely wrong (try googling it!), so it is worth giving you a rough outline of the correct argument. We start with the following version of Heisenberg's uncertainty principle:
ΔE Δt ≥ ħ/2 .
This means the following:
To measure the energy of a system with accuracy ΔE, one needs a time Δt ≥ ħ/(2ΔE).
In other words, to decrease the error in the measurement, we perform an average over a longer time period. But this increases the uncertainty in the time to which this energy applies.
Now consider a particle-antiparticle pair with total energy ΔE spontaneously appearing out of nothing:
[Figure: energy as a function of time; the vacuum briefly jumps to an excited state of energy ΔE before returning to the vacuum.]
If the lifetime τ of the excited state is less than ħ/(2ΔE), then we don't have enough time to measure the energy with an accuracy smaller than ΔE. Hence, we can't distinguish the excited state from the zero-energy vacuum. The Heisenberg uncertainty principle allows non-conservation of energy by an amount ΔE for a time Δt ≤ ħ/(2ΔE).
The story changes if the particle-antiparticle pair happens to be created close to the event
horizon13 of a black hole. In that case, one member of the pair may fall into the black hole and
¹² Hawking, “Black Hole Explosions?”
¹³ The event horizon is the point of no return. Nothing, not even light, can escape from inside the event horizon: see Lecture 6.
disappear forever. Missing its annihilation partner, the second particle becomes real:
[Figure: one member of the pair falls behind the event horizon while its partner escapes to infinity as Hawking radiation.]
An observer outside the black hole will detect these particles as Hawking radiation.
Analyzing this process, Hawking was able to confirm Bekenstein’s guess (3.5.10). In fact, he
did much more than that. He derived an exact expression for the black hole entropy
S_BH = (1/4) A/ℓ_p² , (3.5.11)
where ℓ_p is the Planck length, the scale at which the effects of quantum mechanics and gravity
become equally important (see Lecture 6). In terms of the fundamental constants of quantum
mechanics (ħ), relativity (c) and gravity (G), the Planck length is
ℓ_p = √(ħG/c³) ≈ 1.6 × 10⁻³⁵ m . (3.5.12)
Eq. (3.5.11) is a remarkable formula: it links entropy and thermodynamics (l.h.s.) to quantum
gravity (r.h.s.). It is therefore the single most important clue we have about the reconciliation
of gravity with quantum mechanics.
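To appreciate just how large black hole entropies are, here is a small back-of-the-envelope estimate (a sketch, not part of the notes) for a black hole with the mass of the Sun, using the Schwarzschild radius R = 2GM/c² for the size of the horizon:

```python
from math import pi, sqrt

# Fundamental constants (SI units)
hbar = 1.054571817e-34   # J s
G    = 6.67430e-11       # m^3 kg^-1 s^-2
c    = 2.99792458e8      # m / s

M_sun = 1.989e30         # kg, mass of the Sun

# Schwarzschild radius and horizon area for a solar-mass black hole
R = 2 * G * M_sun / c**2
A = 4 * pi * R**2

# Planck length, eq. (3.5.12), and the Bekenstein-Hawking entropy, eq. (3.5.11)
l_p = sqrt(hbar * G / c**3)
S_over_k = A / (4 * l_p**2)

print(f"R   = {R:.2e} m")       # about 3 km
print(f"l_p = {l_p:.2e} m")     # about 1.6e-35 m
print(f"S/k = {S_over_k:.2e}")  # of order 1e77
```

The result, S/k of order 10^77, is vastly larger than the entropy of any ordinary star.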
3.5.4 Black Holes in String Theory
The great triumph of Boltzmann’s theory of entropy was that he was able to explain an ob-
servable macroscopic quantity—the entropy—in terms of a counting of microscopic components.
Hawking’s formula for the entropy of a black hole seems to be telling us that there are a very
large number of microstates corresponding to any particular macroscopic black hole
S_BH = k log W_BH . (3.5.13)
What are those microstates? They are not apparent in classical gravity (where a black hole has
no hair). Ultimately, they must be states of quantum gravity. Some progress has been made on
this in string theory, our best candidate theory of quantum gravity. In 1996, Andy Strominger
and Cumrun Vafa derived the black hole entropy from a microscopic counting of the degrees
of freedom of string theory (which are strings and higher-dimensional membranes). They got
eq. (3.5.11) on the nose, including the all-important factor of 1/4.