Title: Two topics will be discussed
1 INTRODUCTION
The aim of computational structural biology is to
understand (and predict) the structure and
function of biological macromolecules, such as
proteins and nucleic acids, based on their
microscopic (atomic) interactions. These are
thermodynamic systems, which are also affected
by environmental factors such as temperature,
pressure, and solvent conditions. Classical
thermodynamics is a general methodology that
enables one to derive relations among
macroscopic properties such as volume (V),
pressure (P), temperature (T), energy (E) and
entropy (S) without the need to consider the
atomic properties of the system, e.g., the
equation of state,
$PV = NRT$.
2 In statistical mechanics, on the other hand, the
macroscopic thermodynamic behavior is obtained
from the specific microscopic description of a
system (atomic interactions, atomic masses,
etc.). Therefore, statistical mechanics also
provides information that is beyond the reach of
classical thermodynamics, e.g., what is the native
structure of a protein? It is impossible to
know the exact dynamic behavior of a large system
(i.e., the atomic coordinates at time t).
The appearance of a 3D configuration is known only
with a certain probability. Thus, statistical
mechanics is a probabilistic theory, and
computational methods such as Monte Carlo and
molecular dynamics are based on these
probabilistic properties. Part of the course will
be devoted to basic probability theory.
3 This is a short course. Therefore, the theory of
statistical mechanics will not be derived
rigorously. The emphasis will be on solving
problems in statistical mechanics within the
framework of the canonical ensemble, treating
polymer systems and proteins. The probabilistic
nature of the theory will be emphasized as
reflected in computer simulation techniques.
4 Mechanical Systems are Deterministic
Refreshing some basic physics. Examples of forces F ($F = |\mathbf{F}|$):
1) Stretching a spring by a distance x: $F = -kx$ (Hooke's law); k is the spring constant.
2) Gravitational force: $F = kMm/r^2$; m and M are masses at distance r; k is a constant. On earth (R, M large), $g = kM/R^2$ and $F = mg$.
3) Coulomb's law: $F = kq_1q_2/r^2$; $q_1$, $q_2$ are charges.
5 Newton's second law: $F = ma$; a is the acceleration.
Mechanical work W: if a constant force is applied
along a distance d, $W = Fd$ ($F = |\mathbf{F}|$). More
generally, $W = \int \mathbf{F} \cdot d\mathbf{x}$.
Potential energy: if a mass m is raised to a height h, negative work is done,
$W = -mgh$, and the mass gains potential
energy, $E_p = -W = mgh$, the ability to do
mechanical work: when m falls down, $E_p$ is
converted into kinetic energy, $E_k = mv^2/2$,
where $v^2/2 = gh$ (at the floor).
A spring stretched by d: $E_p = -W = k\int_0^d x\,dx = kd^2/2$.
In a closed system the total energy, $E_t = E_p + E_k$, is constant but
the ratio $E_p/E_k$ can change, e.g., in the oscillation of a mass
hung on a spring and displaced from its
equilibrium position.
6 The dynamics of a macroscopic mechanical system
is in principle deterministic, in the sense that
if the forces are known, and the positions and
velocities of the masses at time $t_0$ are known as
well, their values at time t can in principle be
determined by solving Newton's equations.
Simple examples: a harmonic oscillator (a
spring), the trajectory of a projectile, the movement
of a spaceship, etc. In some cases the solution is
difficult and requires powerful computers.
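The determinism described above can be illustrated with a minimal sketch, assuming a one-dimensional harmonic oscillator in reduced units (k = m = 1) and the velocity Verlet integration scheme. The function names and the step size are illustrative choices, not part of the course material.

```python
import math

def simulate_oscillator(x0, v0, k=1.0, m=1.0, dt=1e-3, steps=10000):
    """Integrate Newton's equation m*a = -k*x with velocity Verlet."""
    x, v = x0, v0
    a = -k * x / m
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt      # position update
        a_new = -k * x / m                   # force at the new position
        v += 0.5 * (a + a_new) * dt          # velocity update
        a = a_new
    return x, v

def total_energy(x, v, k=1.0, m=1.0):
    """E_t = E_p + E_k for the spring."""
    return 0.5 * k * x * x + 0.5 * m * v * v

x0, v0 = 1.0, 0.0
x, v = simulate_oscillator(x0, v0)           # state at t = steps * dt = 10
print("numerical x(t):", x, " analytical x(t):", x0 * math.cos(10.0))
print("energy at t0:", total_energy(x0, v0), " energy at t:", total_energy(x, v))
```

Given the same initial position and velocity, the routine always returns the same state at time t, and the total energy stays (nearly) constant while $E_p$ and $E_k$ exchange.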
7 Stability: a system of masses interacting (by
some forces) tends to arrange itself in the
lowest potential energy structure (which might be
degenerate), also called the ground state. The
system will stay in the ground state if the
kinetic energy is very small; this situation
defines maximum order. The larger the kinetic
energy, the larger the disorder, in the sense
that at each moment a different arrangement of
the masses will occur (no ground state any more).
Still, in principle, the trajectories of the
masses can be calculated.
Example of a microscopic system: two argon atoms at rest, positioned at the lowest
energy distance and interacting through a
Lennard-Jones potential.
[Figure: two argon atoms separated by the lowest-energy distance.]
8 Thermodynamic systems and Statistical Mechanics
A typical system is described below, examined from a
microscopic point of view (non-rigorous treatment).
[Figure: a system C at temperature $T_C$ embedded in a reservoir R at temperature $T_R$.]
A system C of N molecules in a constant volume V
is in thermal contact with a large reservoir (R)
(also called a heat bath) with a well-defined
temperature $T_R$. At equilibrium (after a long
time) energy is still exchanged between R and C,
but the average kinetic (and potential) energy of
a molecule of C is constant, leading to $T_C = T_R$.
9 However, in contrast to a macroscopic mechanical
system, there is no way to know the trajectories
of the particles, which change constantly due
to the energy exchange between C and R and to quantum
mechanical limitations. Relating kinetic energy
to temperature: at low T, $E_k$ is low and the effect
of the molecular forces is significant; the system
arranges itself in a low potential energy state
(relatively high order). At high T, $E_k$ is high and
dominant and $E_p$ is high (high disorder that
includes the uncertainty related to trajectories).
Therefore, a thermodynamic system at
equilibrium cannot be characterized by the
positions and velocities of its ~$10^{23}$ particles but
only by the average values of several macroscopic
parameters such as P, T, E (internal energy) and
entropy, S.
10 For example, in measuring T the thermometer feels
the average effect of many molecular
configurations of the tested system, and for a long
measurement all the microscopic states are
realized and affect the result for T. Hence, to
obtain the average values of macroscopic
parameters from microscopic considerations, a
probability density $P(x^N, v^N)$ should be assigned
to each system state $(x^N, v^N)$, where
$(x^N, v^N) = (x_1, y_1, z_1, x_2, y_2, z_2, \ldots, x_N, y_N, z_N, v_{x1}, v_{y1}, v_{z1}, \ldots, v_{xN}, v_{yN}, v_{zN})$,
thus assuming that all states
contribute.
11 Then a macroscopic parameter M is a statistical
average,
$\langle M \rangle = \int P(x^N, v^N)\, M(x^N, v^N)\, d(x^N v^N)$.
The entropy for a discrete and a
continuous system is ($k_B$ is the Boltzmann
constant)
$S = \langle S \rangle = -k_B \sum_i P_i \ln P_i$
and
$S = -k_B \int P(x^N, v^N) \ln P(x^N, v^N)\, d(x^N v^N) + \ln\{\text{const.}[\text{dimension}\ (x^N v^N)]\}$.
Notice: P is a
probability density with dimension $1/(x^N v^N)$.
The constant is added to make S independent of
the dimension of $(x^N v^N)$.
12 The problem: how to determine P. In
thermodynamics an N,V,T system is described by
the Helmholtz free energy A,
$A(T,V,N) = E - TS$, which, from the
second law of thermodynamics, should be a minimum
for a given set of constraints. We shall
determine P by minimizing the statistical free
energy with respect to P.
13 A can be expressed as the average
$A = \langle A(P) \rangle = \int P(X)\,[E(X) + k_BT \ln P(X)]\, dX + \text{const.}$, where $X = (x^N, v^N)$.
We differentiate A with respect to
P and equate to zero (the const. is omitted):
$A' = \int [E(X) + k_BT \ln P(X) + k_BT\, P(X)/P(X)]\, dX = 0$
$\Rightarrow E(X) + k_BT[\ln P(X) + 1] = 0$
$\Rightarrow \ln P(X) = -[E(X) + k_BT]/k_BT = -E(X)/k_BT - 1$
$\Rightarrow P(X) = \text{const.} \cdot \exp[-E(X)/k_BT]$
The normalization is $Q = \int \exp[-E(X)/k_BT]\, dX$,
14 i.e., $P^B(X) = \exp[-E(X)/k_BT]/Q$.
$P^B$: the Boltzmann probability (density).
Q: the canonical partition function.
The integrand defining A is
$E(X) + k_BT \ln P^B(X)$. Substituting $P^B$ and
taking the ln gives
$E(X) + k_BT[-E(X)/k_BT - \ln Q] = -k_BT \ln Q$.
$-k_BT \ln Q$ is
constant for any X and can be taken out of the
integral of A. Thus,
$A = E - TS = -k_BT \ln Q$
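The minimization above can be checked numerically. The following is a minimal sketch for a discrete system, assuming four hypothetical energy levels and reduced units ($k_B = 1$); it compares the free energy functional evaluated at the Boltzmann probabilities with its value at randomly chosen normalized distributions.

```python
import math
import random

def free_energy(p, energies, kT):
    """Statistical free energy A[P] = sum_i P_i * (E_i + kT * ln P_i)."""
    return sum(pi * (ei + kT * math.log(pi))
               for pi, ei in zip(p, energies) if pi > 0)

def boltzmann(energies, kT):
    """Return the Boltzmann probabilities and the partition function Q."""
    weights = [math.exp(-e / kT) for e in energies]
    q = sum(weights)
    return [w / q for w in weights], q

def random_distribution(n, rng):
    """An arbitrary normalized trial distribution over n states."""
    w = [rng.random() + 1e-12 for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

energies = [0.0, 0.5, 1.0, 2.0]   # hypothetical levels, reduced units
kT = 1.0
p_b, q = boltzmann(energies, kT)
a_min = free_energy(p_b, energies, kT)

rng = random.Random(0)
trial_values = [free_energy(random_distribution(len(energies), rng), energies, kT)
                for _ in range(1000)]
print("A[P_B]        =", a_min)
print("-kT ln Q      =", -kT * math.log(q))
print("min A[trial]  =", min(trial_values))
```

No trial distribution gives a lower value than the Boltzmann one, and the minimum coincides with $-k_BT \ln Q$.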
15 The relation $A = -k_BT \ln Q$ is very important. It
enables calculating A from Q, which is based on the
details of the system. In classical
thermodynamics all the quantities are obtained as
derivatives of A. Hence, with statistical
mechanics all these quantities can be obtained
by taking derivatives of ln Q. Also, the
probabilistic character of statistical mechanics
enables calculating averages even without
calculating Q. These averages can even be
geometrical properties that are beyond the reach
of thermodynamics, such as the end-to-end
distance of a polymer, the radius of gyration of
a protein and many other geometrical or dynamic
properties. This is particularly useful in
computer simulation.
16 (1) Probability and Statistics, M.R. Spiegel, Schaum's Outline Series, McGraw-Hill. ISBN 0070602204.
(2) An Introduction to Statistical Thermodynamics, T.L. Hill, Dover. ISBN 0486652424. (Also Addison-Wesley.)
(3) Statistical Mechanics, R. Kubo, North Holland (ISBN 0720400902) and Elsevier (ISBN 0444106375).
(4) Statistical Mechanics of Chain Molecules, P.J. Flory, Hanser (ISBN 3446152059) and Oxford (ISBN 0195207564).
(5) Phase Transitions and Critical Phenomena, H.E. Stanley, Oxford.
(6) Introduction to Modern Statistical Mechanics, David Chandler, Oxford. ISBN 0195042778.
17 Lecture 2: Calculation of the partition function Q
Systems at equilibrium
N particles with velocities $v^N$ and coordinates $x^N$
are moving in a container in contact with a
reservoir of temperature T. We have seen last
time that the Helmholtz free energy A is
$A = E - TS = -k_BT \ln Q$,
where
$Q = \int \exp[-E(x^N, v^N)/k_BT]\, d(x^N v^N)$
18 $E(x^N, v^N) = E_k(v^N) + E_p(x^N)$
$E(x^N, v^N)$ is the Hamiltonian of the
system. If the forces do not depend on the
velocities (most cases), $E_k$ is independent of $E_p$
and the integrations can be separated. Also, the
integrations over the velocities of different
particles are independent. Moreover, the
integrations over the components $v_x$, $v_y$, and $v_z$
are independent. Therefore, we treat first the
integration over $v_x$ (denoted v) of a single
particle. To recall: the linear momentum vector is
$\mathbf{p} = m\mathbf{v}$; therefore, for one component,
$E_k = mv^2/2 = p^2/2m$
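19 A useful integral (from a table):
$\int_0^\infty \exp(-ax^2)\, dx = \frac{1}{2}\sqrt{\pi/a}$
Therefore, our integral is
$\int_{-\infty}^{\infty} \exp[-p^2/(2mk_BT)]\, dp = \sqrt{2\pi m k_BT}$
because $a = 1/(2mk_BT)$ and the integrand is even.
20 The integral over the 3N components of the
momentum (velocities) is the following product:
$(\sqrt{2\pi m k_BT})^{3N} = (2\pi m k_BT)^{3N/2}$
Q is (h is the Planck constant, see Hill p. 74),
$Q = \dfrac{(2\pi m k_BT)^{3N/2}}{N!\, h^{3N}} \int \exp[-E(x^N)/k_BT]\, dx^N$
The problem is to calculate the configurational
integral $\int \exp[-E(x^N)/k_BT]\, dx^N$ (equal to $V^N$ when $E = 0$).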
21 The origin of the division by N factorial, $N! = 1 \cdot 2 \cdot 3 \cdots N$,
is that each configuration of the N
particles in positions $x_1, x_2, x_3, \ldots, x_N$ can be
obtained N! times by exchanging the particles in
these positions (introduced by Gibbs). For
example, for 3 particles there are 3! = 6 possible
permutations.
Arrangements of particles 1, 2, 3 on three sites:
(1,2,3)  (1,3,2)  (2,1,3)  (2,3,1)  (3,1,2)  (3,2,1)
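22 Stirling's approximate formula: $\ln N! \approx N \ln(N/e)$.
The Helmholtz free energy A(N,V,T) is
$A = -k_BT \ln Q = -k_BT N \ln\left[\left(\dfrac{2\pi m k_BT}{h^2}\right)^{3/2} \dfrac{e}{N}\right] - k_BT \ln \int \exp[-E(x^N)/k_BT]\, dx^N$
The velocities (momenta) part is completely
solved; it contains m and h, which are beyond classical
thermodynamics! The problem of statistical
mechanics is thus to solve the integral. For an
ideal gas, E = 0 (no interactions); hence the integral equals $V^N$,
which is trivial.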
23 Thermodynamic derivatives of A for an ideal gas
(properties proportional to N or V are called extensive)
Pressure: $P = -(\partial A/\partial V)_{T,N} = Nk_BT/V$. Intensive variable, depends on N/V.
Internal energy: $E = -T^2 \left(\partial (A/T)/\partial T\right)_{V,N} = \frac{3}{2}Nk_BT$. Extensive variable, proportional to N.
E is the average kinetic energy, proportional to
T and independent of V. Each degree of freedom
contributes $(1/2)k_BT$. If the forces
(interactions) do not depend on the velocities, T
is determined by the kinetic energy only (see
first lecture).
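The two derivatives can be verified numerically. This is a minimal sketch, assuming the ideal-gas free energy with Stirling's approximation, $A = -Nk_BT\{\ln[(2\pi m k_BT/h^2)^{3/2} V/N] + 1\}$, reduced units ($k_B = m = h = 1$), and central finite differences; the function names are illustrative.

```python
import math

def helmholtz_ideal_gas(n, v, t, m=1.0, h=1.0, kb=1.0):
    """A(N,V,T) of an ideal gas, using ln N! ~ N ln(N/e)."""
    lam = (2.0 * math.pi * m * kb * t / h ** 2) ** 1.5
    return -n * kb * t * (math.log(lam * v / n) + 1.0)

def pressure(n, v, t, dv=1e-4):
    """P = -(dA/dV) at constant T and N, by central difference."""
    return -(helmholtz_ideal_gas(n, v + dv, t)
             - helmholtz_ideal_gas(n, v - dv, t)) / (2.0 * dv)

def internal_energy(n, v, t, dt=1e-4):
    """E = -T^2 d(A/T)/dT at constant V and N, by central difference."""
    a_over_t = lambda temp: helmholtz_ideal_gas(n, v, temp) / temp
    return -t * t * (a_over_t(t + dt) - a_over_t(t - dt)) / (2.0 * dt)

n, v, t = 1000, 50.0, 2.0
print("P numerical:", pressure(n, v, t), " N kB T / V:", n * t / v)
print("E numerical:", internal_energy(n, v, t), " (3/2) N kB T:", 1.5 * n * t)
```

The numerical derivatives reproduce $P = Nk_BT/V$ and $E = \frac{3}{2}Nk_BT$; changing V leaves E unchanged.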
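24 Specific heat: $C_V = (\partial E/\partial T)_V = \frac{3}{2}Nk_B$; $C_V$ is independent of T and V.
The entropy is
$S = -(\partial A/\partial T)_{V,N} = \dfrac{E - A}{T} = Nk_B \ln\left[\left(\dfrac{2\pi m k_BT}{h^2}\right)^{3/2} \dfrac{V e^{5/2}}{N}\right]$
S increases with increasing T, V and N; it is an
extensive variable. S is not defined at T = 0
(it should be 0 according to the third law of
thermodynamics); the ideal gas picture holds only
for high T. Both S and E increase with T.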
25 In the case of a real gas $E(x^N) \neq 0$ and the problem
is to calculate the configurational partition
function, denoted Z, where the momentum part is
ignored,
$Z = \int \exp[-E(x^N)/k_BT]\, dx^N$
Notice: while Q is
dimensionless, Z has the dimension of $x^N$. Also,
$Z = \int \nu(E) \exp[-E/k_BT]\, dE$
$\nu(E)$: the density of states around E; $\nu(E)dE$:
the volume in configurational space with energy
between E and E + dE. For a discrete system ($n(E_i)$
is the degeneracy of $E_i$)
$Z = \sum_{\text{states}} \exp[-E_i/k_BT] = \sum_i n(E_i) \exp[-E_i/k_BT]$
26 The contribution of Z to the thermodynamic
functions is obtained from derivatives of the
configurational Helmholtz free energy, F,
$F = -k_BT \ln Z$
Calculating Z for
realistic models of fluids, polymers, proteins,
etc. by analytical techniques is unfeasible.
Powerful numerical methods such as Monte Carlo
and molecular dynamics are very successful.
A simple example: N classical harmonic oscillators
are at equilibrium with a heat bath of
temperature T. This is a good model for a crystal at
high temperature (the Einstein model), where each
atom feels the forces exerted by its neighbor
atoms and can approximately be treated as an
independent oscillator that does not interact
with the neighboring ones.
27 Therefore, $Q_N = q^N$, where q is the partition
function of a single oscillator. Moreover, the
components (x, y, z) are independent as well;
therefore, one can calculate $q_x$ and obtain
$Q_N = q_x^{3N}$. The energy of a macroscopic
oscillator (e.g., a mass hung on a spring) is
determined by its amplitude (the stretching
distance). The amplitude of a microscopic
oscillator is caused by the energy provided by
the heat bath. This energy changes all the time
and the amplitude changes as well, but it has an
average value that increases as T increases.
Unlike a macroscopic mechanical oscillator, the
position of the mass is unknown as a function of
t. We only know $P^B(x)$.
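28 The kinetic and potential energy (Hamiltonian) of
an oscillator are
$p^2/2m + fx^2/2$, where f is the
force constant, and $q = q_k q_p$, where $q_k$ was
calculated before:
$q = \dfrac{1}{h}\sqrt{2\pi m k_BT}\, \sqrt{2\pi k_BT/f} = \dfrac{k_BT}{h\nu}$, $\nu = \dfrac{1}{2\pi}\sqrt{f/m}$
$\nu$ is the frequency of the
oscillator.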
29 $E = k_BT^2\, (\partial \ln q/\partial T) = k_BT \Rightarrow C_V = k_B$
The average energy of one component (e.g., the x
direction) of an oscillator is twice that of
an ideal gas, an effect of the interaction. For N 3D
oscillators, $E = 3Nk_BT$, which is extensive (proportional to N). The entropy
is (also extensive)
$S = E/T - A/T = 3Nk_B(1 + \ln[k_BT/h\nu])$
E and S increase with T. In mechanics the total
energy of an oscillator is constant, $fd^2/2$, where
d is the amplitude of the motion, and at time t
the position of the mass is known exactly.
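The result $E = k_BT$ per oscillator component can also be illustrated by direct sampling. This is a minimal sketch, assuming reduced units ($k_B = 1$) and using the fact that for the Hamiltonian $p^2/2m + fx^2/2$ the Boltzmann density factorizes into two Gaussians, with variance $k_BT/f$ for x and $mk_BT$ for p; the function name, sample size and seed are illustrative.

```python
import math
import random

def sample_oscillator_energy(kT, f=1.0, m=1.0, n_samples=100000, seed=1):
    """Average energy of a 1D classical oscillator sampled from P_B(x, p)."""
    rng = random.Random(seed)
    sigma_x = math.sqrt(kT / f)      # width of the Boltzmann density in x
    sigma_p = math.sqrt(m * kT)      # width of the Boltzmann density in p
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma_x)
        p = rng.gauss(0.0, sigma_p)
        total += p * p / (2.0 * m) + f * x * x / 2.0
    return total / n_samples

for kT in (0.5, 1.0, 2.0):
    print("kT =", kT, " <E> =", sample_oscillator_energy(kT))
```

The sampled average energy tracks $k_BT$ (half kinetic, half potential), while the individual positions are only known through their probability.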
30 In statistical mechanics a classical oscillator
changes its position due to the random energy
delivered by the heat bath. The amplitude is not
constant, but the average energy is proportional
to T. The positions of the mass are only known
with their Boltzmann probability. When T
increases the energy increases, meaning that the
average amplitude increases and the position of
the mass is less defined; therefore, the entropy
is enhanced. Notice: a classical oscillator is
a valid system only at high T. At low T one has
to use the quantum mechanical oscillator.
31 Lecture 3
Thus far we have obtained the
macroscopic thermodynamic quantities from a
microscopic picture by calculating the partition
function Q and taking (thermodynamic) derivatives
of the free energy $-k_BT \ln Q$. We have discussed
two simple examples, the ideal gas and classical
oscillators. We have not yet discussed the
probabilistic significance of statistical
mechanics. To do that, we shall first devote
two lectures to basic probability theory.
32 Experimental probability: rolling a die n times.
What is the chance to get an odd number?
n:              10    50    100   400    1000   10,000  ...
m:              7     29    46    207    504    5,036   ...
f(n) = m/n:     0.7   0.58  0.46  0.517  0.504  0.5036  ...
Relative frequency: $f(n) = m/n$; $f(n) \to P = 0.5$ as n grows. P is the experimental
probability.
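The experiment in the table can be reproduced with a pseudo-random number generator. This is a minimal sketch using Python's standard `random` module; the function name and seed are illustrative, and the counts will differ from those in the table.

```python
import random

def relative_frequency_odd(n, seed=0):
    """Roll a fair die n times and return f(n) = m/n for odd outcomes."""
    rng = random.Random(seed)
    m = sum(1 for _ in range(n) if rng.randint(1, 6) % 2 == 1)
    return m / n

for n in (10, 50, 100, 400, 1000, 10000, 100000):
    print(n, relative_frequency_odd(n))
```

For small n the relative frequency scatters widely; for large n it settles near 0.5.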
33 In many problems there is an interest in P and
other properties. To predict them, it is useful
to define for each problem an idealized model:
a probability space.
Sample space: the set of elementary events.
Elementary event: tossing a coin, A happened or B
happened; rolling a die, 1, 2, 3, 4, 5, or 6
happened.
Event: any combination of elementary
events, e.g., an even number happened (2,4,6); a number
larger than 3 happened (4,5,6).
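34 The empty event (impossible event) $\emptyset$: (2 < a
number < 3). The certain event $\Omega$: (coin: A or
B happened).
Complementary event: $\bar{A} = \Omega - A$, e.g.,
(1,2,3,4,5,6) - (2,4,6) = (1,3,5).
Union: (1,2,3) $\cup$ (2,3,5) = (1,2,3,5).
Intersection: (1,2,4) $\cap$ (2,5,6) = (2).
35 [Figure: Venn diagrams of two events A and B: disjoint events ($A \cap B = \emptyset$); the intersection $A \cap B$ (A and B), shown as the overlap of the red and green regions; the union $A \cup B$, shown as the whole region.]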
36 Elementary probability space
The sample space consists of a finite number n of points
(elementary events) B. Every subset
is an event. A probability P(A) is
defined for each event A. The probability
of an elementary event B is $P(B) = 1/n$.
$P(A) = m/n$; m is the number of points in event A.
37 Properties of P: $0 \le P(A) \le 1$; $P(A \cup B) \le P(A) + P(B)$;
$\sum_i P(A_i) = 1$ ($A_i$, elementary
events). Examples: a symmetric coin, an exact
die. However, in the experimental world a die
is not exact and its rolling is not random; thus,
the probabilities of the elementary events are
not equal. On the other hand, the probability
space constitutes an ideal model with equal
P's. Comment: in general, elementary events can
have different probabilities (e.g., a
non-symmetric coin).
38 Example: A box contains 20 marbles, 9 white and 11
black. A marble is drawn at random. What is the
probability that it is white? Elementary event
(EE): selection of one of the 20
marbles. Probability of an EE: 1/20. The event A, "a
white marble was chosen", contains 9 EE, so
P = 9/20. This consideration involves the ideal
probability space; in the real world P
is the result of many experiments, $P = f(n)$, $n \to \infty$.
39 More complicated examples: What is the number
of ways to arrange r different balls in n
cells? Every cell can contain any number of
balls. Each ball can be put in n cells $\Rightarrow$
# of ways $= n \cdot n \cdots n = n^r$.
$n^r$ is also the number of words of length r (with
repetitions) based on n letters. A, B, C $\to$ AA, BB,
AB, BA, CC, CA, AC, BC, CB:
$3^2 = 9$
40 Permutations: # of samples of r objects out of n
objects without repetitions (the order is
considered) (0! = 1):
$(n)_r = n(n-1)\cdots(n-r+1) = \dfrac{1 \cdot 2 \cdots (n-r)(n-r+1)\cdots n}{1 \cdot 2 \cdots (n-r)} = \dfrac{n!}{(n-r)!}$
$(3)_2 \to$ (1,2), (1,3), (2,1), (2,3), (3,1), (3,2);
(1,3) $\neq$ (3,1). # of r (2) letter words from n
(3) letters: AB, BA, CA, AC, BC, CB
= 6
41 Problem: Find the probability that r people ($r \le 365$)
selected at random will have different
birthdays. Sample space: all arrangements of r
objects (people) in 365 cells (days); their
number is $365^r \Rightarrow p(EE) = 1/365^r$. Event A: no
two birthdays fall on the same day. # of points
in A $= 365 \cdot 364 \cdots (365-r+1) = (n)_r$
$P(A) = \dfrac{(n)_r}{365^r} = \dfrac{365!}{(365-r)!\, 365^r}$
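The formula for P(A) is easy to evaluate as a running product, which avoids the large factorials. This is a minimal sketch; the function name is illustrative.

```python
def prob_all_different(r, days=365):
    """P(A) = (days)_r / days**r: all r birthdays fall on different days."""
    p = 1.0
    for i in range(r):
        p *= (days - i) / days
    return p

for r in (5, 10, 23, 50):
    print(r, prob_all_different(r))
```

Already for r = 23 the probability that all birthdays differ drops just below 1/2.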
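42 Combinations: # of
ways to select r out of n objects if the order is
not considered. # of combinations = # of
permutations / r!:
$\binom{n}{r} = \dfrac{(n)_r}{r!} = \dfrac{n!}{(n-r)!\, r!}$
Example, n = 3, r = 2: the permutations pair up as [(1,2), (2,1)], [(1,3),
(3,1)], [(2,3), (3,2)], giving 6/2 = 3 combinations.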
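43 Problem: In how many ways can n objects be
divided into k groups of $r_1, r_2, \ldots, r_k$ objects, $\sum_k r_k = n$,
without considering the order in each group but
considering the order between the groups?
$\dfrac{n!}{r_1!\, r_2! \cdots r_k!}$
Problem: How many events are defined in a sample
space of n elementary events?
$\sum_{r=0}^{n} \binom{n}{r} = (1+1)^n = 2^n$, where $\binom{n}{r}$ is the
binomial coefficient.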
44 Problem: 23 chess players are divided
into 3 groups of 8, 8, and 7 players. What
is the probability that players A, B, and C
are in the same group (event A)? EE: an
arrangement of the players in 3 groups. #EE =
23!/(8!8!7!). If A, B, and C are in the first group,
the # of arrangements is 20!/(5!8!7!), etc. $\Rightarrow$
$P(A) = \dfrac{20!/(5!8!7!) + 20!/(8!5!7!) + 20!/(8!8!4!)}{23!/(8!8!7!)} = \dfrac{21}{253} \approx 0.083$
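45 Problem: What is the number
of ways to arrange n objects
when $r_1, r_2, r_3, \ldots, r_k$ of them are identical,
$\sum_i r_i = n$? A permutation of the n objects can be
obtained $r_1!\, r_2! \cdots r_k!$ times $\Rightarrow$ the n!
permutations should be divided by this factor:
$\dfrac{n!}{r_1!\, r_2! \cdots r_k!}$
Example: 5551112222 and 5115251222 are two arrangements of n = 10 objects with
$r_1 = 3$, $r_2 = 3$, $r_3 = 4$; the 10! permutations are divided by $3!\,3!\,4! = 6 \cdot 6 \cdot 24$.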
46 Problem: 52 cards are divided among 4 players.
What is the probability that
every player will have a king? EE: a possible
division of the cards into 4 groups of 13. #EE =
$52!/(13!)^4$. If every player has a king, the 4 kings can be
handed to the 4 players in 4! ways and the remaining 48
cards are distributed into 4 groups of
12 $\Rightarrow$ # of EE(A) $= 4! \cdot 48!/(12!)^4$
$P(A) = \dfrac{4! \cdot 48!/(12!)^4}{52!/(13!)^4} \approx 0.105$
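The counting argument can be evaluated exactly with integer arithmetic. This is a minimal sketch using `math.factorial` and `fractions.Fraction`; the helper name `deals` is illustrative, and the count of favorable divisions includes the 4! ways of assigning the four kings to the four players.

```python
from fractions import Fraction
from math import factorial

def deals(n_cards, hand_size):
    """Number of ways to divide n_cards into 4 ordered hands of hand_size."""
    return factorial(n_cards) // factorial(hand_size) ** 4

# Favorable: assign the kings (4! ways), then divide the other 48 cards.
p_every_player_has_king = Fraction(factorial(4) * deals(48, 12), deals(52, 13))
print(p_every_player_has_king, float(p_every_player_has_king))
```

The exact fraction evaluates to about 0.105, so roughly one deal in ten gives every player a king.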
47 Product Spaces
So far the probability space modeled a
single experiment: tossing a coin, rolling a die,
etc. In the case of n experiments we define a
product space.
Coin: two EE, 0 and 1, with P = 1/2. For experiments 1, 2, ..., n the EE
are vectors such as (1,0,0,1,...,1), (0,1,1,1,...,0), (...);
$\#(EE) = 2^n$. If the experiments are independent,
$P(EE) = (1/2)^n$.
48 Die: the EE of a single experiment are 1, 2, 3, 4, 5, 6. For experiments
1, ..., n the EE are vectors such as (1,5,2,4,...,3,6),
(2,3,2,5,...,1,1), (4,...); $\#(EE) = 6^n$. In the
case of independent experiments, $P(EE) = (1/6)^n$.
49 Problem: 15 dice are rolled. Find the probability
to obtain three times each of the numbers 1, 2, and 3 and
twice each of 4, 5, and 6. EE: all possible outcomes of
15 experiments, $\#(EE) = 6^{15}$. #(A), according to the
formula on p. 45, is $15!/[(3!)^3 (2!)^3] \Rightarrow P(A) = \dfrac{15!}{(3!)^3 (2!)^3\, 6^{15}}$
Also: 1 can be chosen in $(15 \cdot 14 \cdot 13)/3!$ ways,
2 in $(12 \cdot 11 \cdot 10)/3!$ ways, etc.
50 Dependent and independent events
Event A is independent of B if P(A) is
not affected by the occurrence of B:
$P(A/B) = P(A)$
P(A/B) is the conditional probability. For example, for a
die:
Independent: P(even) = 1/2 and
P(even/square) = 1/2, i.e., P[(2,4,6)/(1,4)] = 1/2.
Dependent: P(2) = 1/6 but
P(2/even) = 1/3; P(even/odd) = 0, while
P(even) = 1/2 (disjoint events).
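51 Bayes Formula
$P(A/B) = P(A \cap B)/P(B)$; $P(B/A) = P(A \cap B)/P(A)$; $P(A) > 0$, $P(B) > 0$
A = (2), B = even = (2,4,6) $\Rightarrow$ P(A/B) = 1/3. Using the
formula: P(A) = 1/6, P(B) = 1/2, $P(A \cap B) = 1/6 \Rightarrow P(A/B) = (1/6)/(1/2) = 1/3$.
Independence: $P(A \cap B) = P(A)P(B)$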
52 If an event A must result in one of the mutually exclusive
events $A_1, \ldots, A_n$, i.e., $A = A \cap A_1 + A \cap A_2 + \cdots + A \cap A_n$,
then
$P(A) = P(A_1)P(A/A_1) + \cdots + P(A_n)P(A/A_n)$
Problem: Two cards are drawn successively from a
deck. What is the probability
that both are red? EE: product space (r,r),
(r,no), (no,r), (no,no). A: the first card is red,
(r,r) or (r,no). B: the second card is red, (no,r) or
(r,r). $P(A \cap B) = P(r,r) = P(A)P(B/A) = (1/2)(25/51)$
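The die examples and the two-card problem can be checked by direct enumeration. This is a minimal sketch with exact fractions; the names `prob` and `cond_prob` are illustrative helpers for an elementary probability space with equally likely outcomes.

```python
from fractions import Fraction

DIE = frozenset(range(1, 7))            # sample space of one die roll

def prob(event):
    """P(A) = m/n for equally likely elementary events."""
    return Fraction(len(set(event) & DIE), len(DIE))

def cond_prob(a, b):
    """Conditional probability P(A/B) = P(A and B) / P(B)."""
    return prob(set(a) & set(b)) / prob(b)

even, odd, square, two = {2, 4, 6}, {1, 3, 5}, {1, 4}, {2}
print("P(even/square) =", cond_prob(even, square))   # equals P(even): independent
print("P(2/even)      =", cond_prob(two, even))      # differs from P(2): dependent
print("P(even/odd)    =", cond_prob(even, odd))      # disjoint events

# Two cards drawn successively without replacement, both red:
p_two_red = Fraction(26, 52) * Fraction(25, 51)      # P(A) * P(B/A)
print("P(r,r)         =", p_two_red)
```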
53 Problems
1) Calculate the energy E of an oscillator by a free
energy derivative, where $q = k_BT/h\nu$.
2) Prove the equation on p. 45.
3) A die is rolled four times. What is the
probability to obtain 6 exactly one time?
4) A box contains 6 red balls, 4 white balls,
and 5 blue balls. 3 balls are drawn successively. Find the
probability that they are drawn in the order red, white,
and blue if the ball is (a) replaced, (b) not replaced.
54 What is the expectation value of m in the random
variable of Poisson:
$P(X = m) = \lambda^m \exp(-\lambda)/m!$ (m = 0, 1, 2, ...)?
55 Summary: We have defined experimental probability
as a limit of relative frequency, and then
defined an elementary probability space, where P
is known exactly. This space enables addressing
and solving complicated problems without the need
to carry out experiments. We have defined
permutations, combinations, product spaces, and
conditional probability and described a
systematic approach for solving problems.
However, in many cases solving problems
analytically in the framework of a probability
space is not possible and one has to use
experiments or computer experiments, such as
Monte Carlo methods.
56 Random variables: For a given probability space
with a set of elementary events $\{\omega\} = \Omega$, a random
variable is a function $X = X(\omega)$ on the real line,
$-\infty < X(\omega) < \infty$. Examples: Coin: p = head, q = tail.
One can define X(p) = 1, X(q) = 0. However, any
other definition is acceptable, e.g., X(p) = 15, X(q) =
2, etc., where the choice is dictated by the
problem. Tossing a coin n times, the sample
space is vectors (1,0,0,1,...) with P(1,0,0,1,...). One
can define X = m, where m is the number of
successes (heads).
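57 Distribution Function (DF) or Cumulative DF: For
a random variable X ($-\infty < X(\omega) < \infty$),
$F(x) = P[X(\omega) \le x]$
F(x) is a monotonically non-decreasing function.
Die: for x < 1, F(x) = 0; for $x \ge 6$, F(x) = 1.
[Figure: staircase plot of F(x) for a die, rising in steps of 1/6 (1/6, 2/6, ..., 5/6, 1) at x = 1, 2, ..., 6.]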
58 Random variable of Poisson:
$P(X = m) = \lambda^m \exp(-\lambda)/m!$, m = 0, 1, 2, 3, ...
So far we have discussed discrete random
variables. Continuous random variable: F(x) is
continuous and its derivative $f(x) = F'(x)$ is
also continuous. f(x) is called the probability
density function; f(x)dx is the probability
between x and x + dx.
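59 The normal random variable:
$f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp[-x^2/(2\sigma^2)]$
The uniform random variable:
f(x) is constant between a and b,
$f(x) = 1/(b-a)$ for $a \le x \le b$, and 0 otherwise.
[Figure: f(x) of the uniform random variable, a rectangle of height 1/(b-a) between a and b.]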
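60 Expectation Value
X is a discrete random variable with n
values, $x_1, x_2, \ldots, x_n$, and probabilities $P(x_1), P(x_2), \ldots, P(x_n)$;
the expectation value E(X) is
$E(X) = \sum_{i=1}^{n} x_i P(x_i)$
Other names are mean and statistical
average. Coin: X takes the values 1 and 0 with P and
1-P. $E(X) = P \cdot 1 + (1-P) \cdot 0 = P$. Die:
X takes the values 1, 2, 3, 4, 5, 6 with P = 1/6 for all.
$E(X) = (1/6)(1+2+3+4+5+6) = 21/6 = 3.5$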
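61 Continuous random variable with f(x):
$E(X) = \int_{-\infty}^{\infty} x f(x)\, dx$,
provided that the integral converges. Uniform
random variable:
$E(X) = \int_a^b \dfrac{x}{b-a}\, dx = \dfrac{b^2 - a^2}{2(b-a)} = \dfrac{a+b}{2}$
Properties of E(X): If X and Y are defined on the
same space,
$E(X+Y) = E(X) + E(Y)$; $E(CX) = CE(X)$,
C = const., because
$\sum_\omega [X(\omega) + Y(\omega)]P(\omega) = \sum_\omega X(\omega)P(\omega) + \sum_\omega Y(\omega)P(\omega)$, $\omega \in \Omega$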
62 Arithmetic average: $X_1, X_2, \ldots, X_n$ are n
random variables defined on the same sample space
with the same expectation value $\mu = E(X_i)$; then the
arithmetic average
$\bar{X} = (X_1 + X_2 + \cdots + X_n)/n$
is also a random variable with
$E(\bar{X}) = \mu$.
Notice: $\bar{X}$ is defined over the product space
$(X_1, X_2, \ldots, X_n)$ with $P(X_1, X_2, \ldots, X_n)$. $\bar{X}$ is
important in simulations.
63 Variance:
$V(X) = \sigma^2 = E[(X - \mu)^2] = E(X^2) - E^2(X)$
Standard deviation: $\sigma = \sqrt{V(X)}$
64 Example: a random variable with an expectation
value but without variance.
Independence: random variables X and Y defined on
the same space are called independent if
$P(X,Y) = P(X)P(Y)$
Tossing a coin twice: (1,1), (0,1), (0,0),
(1,0); the probability of the second toss is
independent of the first. In a product space,
$P(X_1, X_2, \ldots, X_n) = P(X_1)P(X_2) \cdots P(X_n)$
$P(X_1, X_2, \ldots, X_n)$ is the joint probability.
65 Uncorrelated random variables: If X and Y are
independent random variables defined on the same
sample space $\Omega$, they are uncorrelated, i.e.,
$E(XY) = E(X)E(Y)$
Proof: $E(XY) = \sum_{ij} x_i y_j P(x_i, y_j) = \sum_{ij} x_i y_j P(x_i)P(y_j) = \sum_i x_i P(x_i) \cdot \sum_j y_j P(y_j) = E(X)E(Y)$.
X, Y independent $\Rightarrow$ X, Y
uncorrelated. The opposite is not always true.
If X, Y are uncorrelated and defined on the same sample space,
then
$V(X+Y) = V(X) + V(Y)$
$V(CX) = E(C^2X^2) - E^2(CX) = C^2E(X^2) - [CE(X)]^2 = C^2V(X)$ $\Rightarrow$ V is not a linear operator.
66 Variance of the arithmetic average: $X_1, X_2, \ldots, X_n$
are uncorrelated random variables with the
same $\mu$ and $\sigma^2$ $\Rightarrow$
$V(\bar{X}) = \dfrac{1}{n^2} \sum_{i=1}^{n} V(X_i) = \dfrac{\sigma^2}{n}$
While the expectation value of the arithmetic
average is also $\mu$, the variance decreases with
increasing n. The above result is
extremely important, playing a central role in the
statistical analysis of simulation data.
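The decrease of the variance of the arithmetic average with n can be observed in a small computer experiment. This is a minimal sketch, assuming independent tosses of a coin with p = 0.5 (so $\sigma^2 = p(1-p) = 0.25$); the function name, number of repeats and seed are illustrative.

```python
import random

def variance_of_mean(n, p=0.5, repeats=20000, seed=2):
    """Estimate V(X_bar) from many independent averages of n coin tosses."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        heads = sum(1 for _ in range(n) if rng.random() < p)
        means.append(heads / n)
    mu = sum(means) / repeats
    return sum((x - mu) ** 2 for x in means) / repeats

sigma2 = 0.5 * (1.0 - 0.5)
for n in (1, 4, 16, 64):
    print(n, variance_of_mean(n), sigma2 / n)
```

Each fourfold increase in n lowers the estimated variance by about a factor of four, in line with $\sigma^2/n$.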
67 Sampling
So far we have dealt with probability spaces (ideal
world), where the probability of an elementary
event is known exactly and probabilities of
events A could be calculated. Then, we defined
the notion of a random variable (X), which is a
function from the objects of a sample space to
the real line, where the function can be defined
according to the problem of interest.
The cumulative distribution function and the
probability density function (for a continuous
random variable) were defined. This enables, at
least in principle, calculating expectation
values E(X) and variances V(X).
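68 It is important to calculate E(X) ($\mu$) and V(X) of the
normal distribution (also called the Gaussian
distribution), see p. 61,
$f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp[-x^2/(2\sigma^2)]$
It can be shown (see the integral on p. 19) that f(x)
is normalized,
$\int_{-\infty}^{\infty} f(x)\, dx = 1$,
and its expectation value is 0, because x f(x) is an
odd function,
$E(X) = \int_{-\infty}^{\infty} x f(x)\, dx = 0$
69 The variance is therefore
$V(X) = \int_{-\infty}^{\infty} x^2 f(x)\, dx = \sigma^2$
Here we used the integral
$\int_0^\infty x^2 \exp(-ax^2)\, dx = \dfrac{1}{4a}\sqrt{\pi/a}$
Thus, a Gaussian is defined by only two
parameters, E(X) and V(X); in the above case, 0
and $\sigma^2$, respectively. In the general case, $f(x) \propto
\exp[-(x-\mu)^2/(2\sigma^2)]$. $\sigma$ defines the width of
the distribution.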
70 [Figure: the Gaussian f(x) versus x, centered at $E = \mu = 0$ with width $\sigma = \sqrt{V}$.]
The integral of f(x) over $\mu - \sigma \le x \le \mu + \sigma$
provides ~68% of the total probability (= 1); the
integral over $\mu - 2\sigma \le x \le \mu + 2\sigma$ covers ~95% of
the total area. Unlike the ideal case (i.e.,
known probabilities), sometimes a distribution is
known to be Gaussian but $\mu$ and $\sigma$ are unknown. To
estimate $\mu$ one can sample an x value from the
distribution; the smaller $\sigma$ is, the larger the
chance that x is close to $\mu$.
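The 68% and 95% figures follow from the error function: for a Gaussian, the probability of finding x within k standard deviations of $\mu$ is $\mathrm{erf}(k/\sqrt{2})$. A minimal sketch using `math.erf` from the standard library (the function name is illustrative):

```python
import math

def prob_within(k):
    """Probability that a Gaussian variable lies within mu +/- k*sigma."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, prob_within(k))
```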
71 We shall discuss later how to sample from a
distribution without knowing its values. One
example is a coin with unknown p (for 1) and 1-p (for 0);
tossing this coin will produce a sample of
relative frequencies $f(1) \to p$, $f(0) \to 1-p$. Also,
E(X) = p and V(X) = p(1-p) are unknown a priori. If
$p \to 0$ or $p \to 1$, $V(X) \to 0$ and even a single tossing
experiment would lead (with high chance) to 0 or
1, the values of E(X), respectively. To estimate
an unknown E(X) in a rigorous and systematic way
it is useful to use the structure of probability
spaces developed thus far, in particular the
product spaces and the properties of the
arithmetic average, $\bar{X} = (X_1 + X_2 + \cdots + X_n)/n$.
72 Thus, if $X_1, X_2, \ldots, X_n$ are random variables
with the same $\mu$ and $\sigma^2$ ($E(X_i) = \mu$, $V(X_i) = \sigma^2$)
and if these random variables are also
uncorrelated, then
$E(\bar{X}) = \mu$ and $V(\bar{X}) = \sigma^2/n$
For example, one can toss a coin, p (for 1), 1-p (for 0),
p unknown, n times independently. The result of
this experiment is one term (vector) denoted $(x_1,
x_2, \ldots, x_n)$, e.g., (1,0,0,1,...,1,0), out of the
$2^n$ vectors in the product space. Estimation of $\mu$
by $\bar{x} = (x_1 + x_2 + \cdots + x_n)/n$ is improved in the
statistical sense as the variance $V(\bar{X}) = \sigma^2/n$ is
decreased by increasing n, the number of
experiments; for $n \to \infty$ the estimation becomes
exact because $V(\bar{X}) \to 0$.
73 Thus, to estimate $\mu$ (and other properties) one
has to move to the experimental world using the
theoretical structure of the probability spaces.
Notice again that while the value of P or f(x)
is unknown, one should be able to sample
according to P (see the above example for the
coin). This is the basic theory of sampling that
is used in Monte Carlo techniques and molecular
dynamics. However, notice that with these methods
the random variables in most cases are
correlated; therefore, to use the equation
$V(\bar{X}) = \sigma^2/n$, the # of samples generated, n',
should be larger, sometimes significantly larger,
than n, the number of uncorrelated samples used
in the above equation. This topic will be
discussed in more detail later.
74 The probabilistic character of statistical
mechanics
Thus far, the thermodynamic properties
were obtained from the relation between the free
energy, A, and the partition function Q, $A = -k_BT \ln Q$,
using known thermodynamic derivatives. However,
the theory is based on the assumption that each
configuration in phase space has a certain
probability (or probability density) to occur,
the Boltzmann probability,
$P^B(X) = \exp[-E(X)/k_BT]/Q$, where $X \equiv x^N v^N$ is a
6N vector of coordinates and velocities.
Therefore, any thermodynamic property such as
the energy is an expectation value defined with
$P^B(X)$.
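75 For example, the statistical average (denoted by
$\langle\ \rangle$) of the energy is
$\langle E \rangle = \int P^B(x^N, v^N)\, E(x^N, v^N)\, d(x^N v^N)$, where $E(x^N, v^N)$ is a
random variable. $\langle E \rangle$ is equal to E calculated by
a thermodynamic derivative of the free energy, A.
For an ideal gas (pp. 19-20) $\langle E \rangle$ is obtained by
$\langle E \rangle = \dfrac{\int \left(\sum_i p_i^2/2m\right) \exp[-\sum_i p_i^2/(2mk_BT)]\, dp^{3N}}{\int \exp[-\sum_i p_i^2/(2mk_BT)]\, dp^{3N}} = \frac{3}{2}Nk_BT$
76 This is the same result obtained from
thermodynamics (p. 23). For two degrees of
freedom the integral is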
77 The entropy can also be expressed as a
statistical average. For a discrete system,
$S = \langle S \rangle = -k_B \sum_i P_i \ln P_i$
If the
system populates a single state k, $P_k = 1$ and S = 0 $\Rightarrow$
there is no uncertainty. This never occurs at a
finite temperature. It occurs only for a quantum
system at T = 0 K. On the other hand, if all
states have the same probability, $P_i = 1/\Omega$, where $\Omega$
is the total number of states, the uncertainty
about the state of the system is maximal (any
state can be populated) and the entropy is
maximal as well,
$S = k_B \ln \Omega$
This occurs at very high
temperatures, where the kinetic energy is large
and the majority of the system's configurations
can be visited with the same random probability.
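The two limiting cases can be checked with a few lines of code. This is a minimal sketch of the discrete entropy in units of $k_B$; the function name and the example distributions are illustrative.

```python
import math

def entropy(p):
    """S / k_B = -sum_i P_i ln P_i (terms with P_i = 0 contribute nothing)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

n_states = 4
single_state = [1.0, 0.0, 0.0, 0.0]          # system frozen in one state
uniform = [1.0 / n_states] * n_states        # all states equally probable
intermediate = [0.7, 0.1, 0.1, 0.1]

print("single state:", entropy(single_state))
print("intermediate:", entropy(intermediate))
print("uniform     :", entropy(uniform), " ln(Omega) =", math.log(n_states))
```

Any non-uniform distribution gives an entropy between 0 and $\ln \Omega$.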
78 It has already been pointed out that the
velocities (momenta) part of the partition
function is completely solved (pp. 19-26). On the
other hand, unlike an ideal gas, in practical
cases the potential energy $E(x^N) \neq 0$ and the
problem of statistical mechanics is to calculate
the configurational partition function, denoted Z,
where the momentum part is ignored (see p. 20),
$Z = \int \exp[-E(x^N)/k_BT]\, dx^N$
where Z has the dimension of $x^N$. Also,
$Z = \int \nu(E) \exp[-E/k_BT]\, dE$
$\nu(E)$: the density of states; $\nu(E)dE$: the volume in
configurational space with energy between E and
E + dE. For a discrete system $n(E_i)$ is the
degeneracy and Z is
$Z = \sum_{\text{states}} \exp[-E_i/k_BT] = \sum_i n(E_i) \exp[-E_i/k_BT]$
79 The thermodynamic functions can be obtained from
derivatives of the configurational Helmholtz free
energy, F,
$F = -k_BT \ln Z$.
From now on we ignore the velocities part and
mainly treat Z. Thus, the configurational space is
viewed as a 3N dimensional sample space $\Omega$, where
to each point $x^N$ (random variable) corresponds
the Boltzmann probability density,
$P^B(x^N) = \exp[-E(x^N)/k_BT]/Z$
where $P^B(x^N)dx^N$ is the probability to find the
system between $x^N$ and $x^N + dx^N$. The potential
energy $E(x^N)$ defined for $x^N$ is also a random
variable with an expectation value $\langle E \rangle$,
80 $\langle E \rangle = \int_\Omega E(x^N)\, P^B(x^N)\, dx^N$
While $\langle E \rangle$ is identical to the energy obtained by
differentiating F/T with respect to T (see p. 23),
calculation of the latter is significantly more
difficult than calculating $\langle E \rangle$ because of the
difficulty of evaluating Z, hence F. In simulations
calculation of $\langle E \rangle$ is straightforward. Again,
notice the difference between $E(x^N)$, the
potential energy of the system in configuration
$x^N$, and $\langle E \rangle$, the average potential energy of all
configurations weighted by the Boltzmann
probability density.
81 The power of the probabilistic approach is that
it enables calculating not only macroscopic
thermodynamic properties such as the average
energy, pressure, etc. of the whole system, but
also microscopic quantities, such as the average
end-to-end distance of a polymer. This approach
is extremely useful in computer simulation, where
every part of the system can be treated, hence
almost any microscopic average can be calculated
(distances between the atoms of a protein, its
radius of gyration, etc.). The entropy can also
be viewed as an expectation value, where $\ln
P^B(x^N)$ is a random variable,
$S = \langle S \rangle = -k_B \int_\Omega P^B(x^N) \ln P^B(x^N)\, dx^N$
82Likewise, the free energy F can formally be
expressed as an average of the random variable,
E(xN ) kBT ln PB(xN), F  kBT lnZ !?
PB(xN)E(xN ) kBT ln PB(xN) dxN
Fluctuations (variances)
The variance (fluctuation) of the energy (see p. 63) is
$$\sigma^2(E) = \int_\Omega P^B(x^N)\,[E(x^N) - \langle E\rangle]^2\,dx^N = \langle E(x^N)^2\rangle - \langle E(x^N)\rangle^2$$
Notice that the expectation value is denoted by $\langle\;\rangle$ and E is the energy. It can be shown (Hill, p. 35) that the specific heat at constant volume is
$$C_v = (dE/dT)_V = \sigma^2(E)/k_BT^2$$
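The fluctuation formula for the specific heat can be verified numerically on a small discrete system. This sketch assumes four hypothetical energy levels and $k_B = 1$; it compares $\sigma^2(E)/k_BT^2$ with a finite-difference derivative of $\langle E\rangle$ with respect to T.

```python
import math

def energy_moments(energies, T, kB=1.0):
    """Return (<E>, variance of E) under Boltzmann weights."""
    weights = [math.exp(-E / (kB * T)) for E in energies]
    Z = sum(weights)
    mean = sum(w * E for w, E in zip(weights, energies)) / Z
    mean_sq = sum(w * E * E for w, E in zip(weights, energies)) / Z
    return mean, mean_sq - mean * mean

energies = [0.0, 0.5, 1.0, 2.0]  # hypothetical levels, arbitrary units
T, kB, dT = 1.2, 1.0, 1e-5

_, var_E = energy_moments(energies, T, kB)
Cv_fluctuation = var_E / (kB * T * T)            # sigma^2(E) / (kB T^2)

E_plus, _ = energy_moments(energies, T + dT, kB)
E_minus, _ = energy_moments(energies, T - dT, kB)
Cv_derivative = (E_plus - E_minus) / (2.0 * dT)  # d<E>/dT

print(Cv_fluctuation, Cv_derivative)
```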
83 Under regular conditions $C_v = \sigma^2(E)/k_BT^2$ is an extensive variable, i.e., it increases ~N as the number of particles N increases; therefore, $\sigma(E) \sim N^{1/2}$ and the relative fluctuation of E decreases with increasing N,
$$\frac{\sigma(E)}{\langle E\rangle} \sim \frac{N^{1/2}}{N} = \frac{1}{N^{1/2}}$$
Thus, in macroscopic systems ($N \sim 10^{23}$) the fluctuation (i.e., the standard deviation) of E can be ignored because it is ~$10^{11}$ times smaller than the energy itself ⇒ these fluctuations are not observed in macroscopic objects. Like Z, $\langle E\rangle$ can be expressed as
$$\langle E\rangle = \int E\,\nu(E)\,P^B(E)\,dE$$
where $\nu(E)$ is the density of states and $P^B(E)$ is the Boltzmann probability of a configuration with energy E.
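The $N^{-1/2}$ decay of the relative fluctuation can be illustrated with a system whose degeneracies are known exactly. The sketch below assumes N independent two-level units with energies 0 and `eps`, so that the level $E_k = k\,\epsilon$ has degeneracy $N!/[k!(N-k)!]$; the model and the parameter names are illustrative assumptions. Quadrupling N should halve the ratio $\sigma(E)/\langle E\rangle$.

```python
import math

def relative_energy_fluctuation(N, eps=1.0, T=1.0, kB=1.0):
    """sigma(E)/<E> for N independent two-level units (energies 0 and eps)."""
    # ln[n(E_k) exp(-E_k/kB T)] with n(E_k) = N!/(k!(N-k)!), via log-gamma
    log_w = [
        math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        - k * eps / (kB * T)
        for k in range(N + 1)
    ]
    shift = max(log_w)                       # avoid overflow in exp
    w = [math.exp(lw - shift) for lw in log_w]
    Z = sum(w)
    mean = sum(wk * k * eps for k, wk in enumerate(w)) / Z
    mean_sq = sum(wk * (k * eps) ** 2 for k, wk in enumerate(w)) / Z
    return math.sqrt(mean_sq - mean * mean) / mean

ratios = {N: relative_energy_fluctuation(N) for N in (100, 400, 1600)}
for N, r in ratios.items():
    print(N, r, r * math.sqrt(N))  # the last column should stay constant
```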
84 The fact that $\sigma(E)$ is so small relative to the energy means that the contribution to $\langle E\rangle$ comes from a very narrow range of energies around a typical energy $E^*(T)$ that depends on the temperature.
[Figure: $P^B(E)$ plotted against E.]
Therefore, the partition function can be approximated by taking into account only the contribution related to $E^*(T)$:
$$Z = \int \nu(E)\exp[-E/k_BT]\,dE \approx f_T(E^*) = \nu(E^*)\exp[-E^*/k_BT]$$
$$\Rightarrow\; F \approx E^*(T) - k_BT\ln\nu(E^*) = E^*(T) - TS$$
85 The entropy, $k_B\ln\nu(E^*)$, is the logarithm of the degeneracy of the most probable energy. For a discrete system,
$$Z = \sum_i n(E_i)\exp[-E_i/k_BT]$$
where $E_i$ are the set of energies of the system and $n(E_i)$ their degeneracies. For a macroscopic system the number of different energies is large (~N for a discrete system), while only the maximal term
$$n(E^*)\exp[-E^*/k_BT]$$
contributes. This product consists of two exponentials. At very low T the product is maximal for the ground state energy, where most of the contribution comes from $\exp[-E_{GS}/k_BT]$, while $n(E_{GS}) \sim 1$ (S = 0). At very high T the product is maximal for a high energy, where $n(E^*)$ is maximal (maximum degeneracy ⇒ maximum entropy) but the exponential of the energy is small. For intermediate T:
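86 [Figure: $n(E)$, $e^{-E/k_BT}$, and their product $f_T(E) = n(E)\,e^{-E/k_BT}$ plotted against E, with $E^*(T)$ marked on the E axis.]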
87 The fact that the contribution to the integrals comes from an extremely narrow region of energies makes it very difficult to estimate $\langle E\rangle$, S and other quantities by numerical integration. This is because the 3N-dimensional configurational space is huge and the small region that contributes to the integrals is unknown a priori. Therefore, dividing the space into small regions (a grid) would be impractical, and even if done, the corresponding integration would contribute zero because the small important region would be missed, clearly a waste of computer time. The success of Monte Carlo methods lies in their ability to find the contributing region very efficiently, leading to precise estimation of various averages, such as $\langle E\rangle$.
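As an illustration of how a Monte Carlo method samples the contributing region without a grid, here is a minimal Metropolis sketch. It assumes a toy energy $E(x) = \tfrac{1}{2}\sum_i x_i^2$ in 10 dimensions with $k_B = 1$, for which the exact average is $\langle E\rangle = 10\,k_BT/2$; the potential, step size, run length, and seed are all illustrative assumptions, and this is not presented as the specific algorithm used in the course.

```python
import math
import random

def metropolis_average_energy(dim=10, T=1.0, kB=1.0, n_steps=200_000,
                              burn_in=20_000, delta=1.5, seed=12345):
    """Estimate <E> for E(x) = 0.5*sum(x_i^2) with single-coordinate Metropolis moves."""
    rng = random.Random(seed)
    x = [0.0] * dim
    energy = 0.0
    total, count = 0.0, 0
    for step in range(n_steps):
        i = rng.randrange(dim)                      # pick one coordinate
        new_xi = x[i] + rng.uniform(-delta, delta)  # symmetric trial move
        dE = 0.5 * (new_xi * new_xi - x[i] * x[i])
        # accept with probability min(1, exp(-dE/kB T))
        if dE <= 0.0 or rng.random() < math.exp(-dE / (kB * T)):
            x[i] = new_xi
            energy += dE
        if step >= burn_in:                         # skip equilibration steps
            total += energy
            count += 1
    return total / count

E_mc = metropolis_average_energy()
E_exact = 10 * 1.0 / 2.0   # equipartition result for this toy potential
print(E_mc, E_exact)
```

Z never appears in the simulation: only energy differences enter the acceptance test, which is why $\langle E\rangle$ is easy to obtain while F is not.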
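88 Numerical integration
[Figure: f(x) plotted against x, with the grid points $x_1, x_2, x_3, \ldots, x_n$ marked on the x axis.]
$$\int f(x)\,dx \approx \sum_i f(x_i)\,\Delta x_i, \qquad \Delta x_i = x_i - x_{i-1}$$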
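The grid sum above is easy to apply in one dimension. The sketch below assumes a Gaussian integrand and a uniform grid (illustrative choices) and also counts how many grid points the same resolution would need in a 30-dimensional space, which is the reason the text calls the grid approach impractical.

```python
import math

def riemann_sum(f, grid):
    """Sum of f(x_i) * (x_i - x_{i-1}) over a sorted grid, i = 1..n-1."""
    return sum(f(grid[i]) * (grid[i] - grid[i - 1]) for i in range(1, len(grid)))

n = 2001
grid = [-10.0 + 20.0 * i / (n - 1) for i in range(n)]
approx = riemann_sum(lambda x: math.exp(-0.5 * x * x), grid)
exact = math.sqrt(2.0 * math.pi)
print(approx, exact)

# Cost of a grid in many dimensions: even 10 points per axis for
# 10 atoms (3N = 30 coordinates) requires 10**30 function evaluations.
points_per_axis = 10
grid_points_30d = points_per_axis ** 30
print(grid_points_30d)
```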
89 What is the probability to find the system at a certain energy (not at $x^N$)?
$$P^B(E) = n(E)\exp[-E/k_BT]/Z$$
So, this probability depends not only on the energy but also on the degeneracy n(E). The relative population of two energies is therefore
$$\frac{P^B(E_1)}{P^B(E_2)} = \frac{n(E_1)}{n(E_2)}\exp[-(E_1 - E_2)/k_BT]$$
90 Problems
1. Show that the number of ways n objects can be divided into k groups of $r_1, r_2, \ldots, r_k$ ($\sum_k r_k = n$), without considering the order in each group but considering the order between the groups, is $n!/(r_1!\,r_2!\cdots r_k!)$.
2. Two random variables X and Y are uncorrelated if E(XY) = E(X)E(Y). Show that in this case V(X+Y) = V(X) + V(Y), where V is the variance.
3. The configurational partition function of a one-dimensional oscillator, $q_p$, is defined on p. 28. Calculate the average potential energy $\langle E\rangle$. Use the integral on p. 69.
91 Solving problems in statistical mechanics
The first step is to identify the states of the system and the corresponding energies (e.g., the configurations of a fluid and their energies). Then, three options are available:
1) The thermodynamic approach: calculate the partition function Z ⇒ the free energy $F = -k_BT\ln Z$, and obtain the properties of interest as suitable derivatives of F.
2) Calculate statistical averages of the properties of interest.
3) Calculate the most probable term of Z and the most dominant contributions of the other properties.
92 Problem: N independent spins interact with a magnetic field H. The interaction energy (potential) of a spin is $-\mu H$ or $+\mu H$, depending on whether $\mu$, the magnetic moment, is positive or negative. Positive $\mu$ leads to energy $-\mu H$. Calculate the various thermodynamic functions (E, F, S, etc.) at a given temperature T.
There are $2^N$ states of the system because each spin is either +1 (↑) or -1 (↓). The potential energy of spin configuration i is
$$E_i = -N_+\mu H + N_-\mu H = -(N_+ - N_-)\mu H$$
where $N_+$ and $N_-$ are the numbers of +1 and -1 spins. The magnetization of i is
$$M = \mu(N_+ - N_-)$$
No kinetic energy is defined for this model.
93 Option 1: Thermodynamic approach
We have to calculate $Z = \sum_i \exp[-E_i/k_BT]$, where i runs over all the $2^N$ different states of the system! This summation can be calculated by a trick. The spins are independent, i.e., they do not interact with each other ⇒ changing a spin does not affect the other spins. Therefore, the summation over the states of N spins can be expressed as the product $Z = (z_1)^N$, where $z_1$ is the partition function of a single spin:
$$z_1 = \exp(-\mu H/k_BT) + \exp(+\mu H/k_BT) = 2\cosh(\mu H/k_BT), \qquad \cosh(x) = [\exp(x) + \exp(-x)]/2$$
$$Z = [2\cosh(\mu H/k_BT)]^N$$
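The factorization can be checked by brute force for a small number of spins. The sketch below enumerates all $2^N$ configurations for N = 10 and compares the sum with the product form; the parameter values and $k_B = 1$ are illustrative assumptions.

```python
import itertools
import math

def Z_brute_force(N, mu, H, T, kB=1.0):
    """Sum exp(-E_i/kB T) over all 2**N spin configurations."""
    Z = 0.0
    for spins in itertools.product((+1, -1), repeat=N):
        E = -mu * H * sum(spins)          # E_i = -(N_plus - N_minus) * mu * H
        Z += math.exp(-E / (kB * T))
    return Z

def Z_product(N, mu, H, T, kB=1.0):
    """Z = z1**N with z1 = 2*cosh(mu*H/(kB*T))."""
    return (2.0 * math.cosh(mu * H / (kB * T))) ** N

N, mu, H, T = 10, 1.0, 0.7, 1.5
Z_sum = Z_brute_force(N, mu, H, T)
Z_fact = Z_product(N, mu, H, T)
print(Z_sum, Z_fact)
```

The enumeration cost doubles with every added spin, whereas the product form costs the same for any N.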
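94 $$F = -k_BT\ln Z = -k_BTN\ln[2\cosh(\mu H/k_BT)]$$
Entropy:
$$S = -\left(\frac{\partial F}{\partial T}\right)_H = Nk_B\left\{\ln[2\cosh(\mu H/k_BT)] - \frac{\mu H}{k_BT}\tanh(\mu H/k_BT)\right\}$$
$T \to \infty$: $S/N = k_B\ln 2$; $T \to 0$: $S = 0$.
Energy:
$$E = -T^2\,\frac{\partial(F/T)}{\partial T} = -N\mu H\tanh(\mu H/k_BT)$$
95 Magnetization, $M = \mu(N_+ - N_-)$:
$$M = -\left(\frac{\partial F}{\partial H}\right)_T = N\mu\tanh(\mu H/k_BT)$$
Specific heat:
$$C = \left(\frac{\partial E}{\partial T}\right)_H = Nk_B\left(\frac{\mu H}{k_BT}\right)^2\frac{1}{\cosh^2(\mu H/k_BT)}$$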
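The limits quoted for the entropy can be checked numerically. The sketch below assumes the closed forms that follow from $Z = [2\cosh(\mu H/k_BT)]^N$, namely $F = -Nk_BT\ln[2\cosh(\mu H/k_BT)]$, $E = -N\mu H\tanh(\mu H/k_BT)$, $M = N\mu\tanh(\mu H/k_BT)$, with $S = (E - F)/T$ and the specific heat as $dE/dT$; the function name, the returned dictionary, and the parameter values are illustrative.

```python
import math

def spin_thermodynamics(N, mu, H, T, kB=1.0):
    """Thermodynamic functions of N independent spins in a field H."""
    x = mu * H / (kB * T)
    F = -N * kB * T * math.log(2.0 * math.cosh(x))
    E = -N * mu * H * math.tanh(x)
    S = (E - F) / T                        # from F = E - TS
    M = N * mu * math.tanh(x)
    C = N * kB * x * x / math.cosh(x) ** 2 # dE/dT at constant H
    return {"F": F, "E": E, "S": S, "M": M, "C": C}

N = 100
hot = spin_thermodynamics(N, mu=1.0, H=1.0, T=1.0e4)   # T -> infinity
cold = spin_thermodynamics(N, mu=1.0, H=1.0, T=0.02)   # T -> 0
print(hot["S"] / N, math.log(2.0))  # entropy per spin approaches kB ln 2
print(cold["S"], cold["E"])         # S -> 0 and E -> -N*mu*H
```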
96 Option 2: Statistical approach
Again we can treat first a single spin and calculate its average energy. $z_1$ for a single spin is
$$z_1 = \exp(-\mu H/k_BT) + \exp(+\mu H/k_BT) = 2\cosh(\mu H/k_BT)$$
The Boltzmann probability for a ± spin is $\exp[\pm\mu H/k_BT]/z_1$. The average energy is
$$\langle E\rangle_1 = \frac{-\mu H\exp(+\mu H/k_BT) + \mu H\exp(-\mu H/k_BT)}{z_1} = -\mu H\,\frac{2\sinh(\mu H/k_BT)}{2\cosh(\mu H/k_BT)} = -\mu H\tanh(\mu H/k_BT)$$
$$\langle E\rangle = -\mu HN\tanh(\mu H/k_BT)$$
97 The entropy $s_1$ of a single spin is $s_1 = -k_B[P_+\ln P_+ + P_-\ln P_-]$, where P is the Boltzmann probability. Writing $x = \mu H/k_BT$:
$$s_1 = -k_B\{e^{x}[x - \ln z_1] + e^{-x}[-x - \ln z_1]\}/z_1$$
$$= -k_B\{x[e^{x} - e^{-x}]/z_1 - \ln z_1\,[e^{x} + e^{-x}]/z_1\}$$
$$= -k_B\{x\tanh x - \ln z_1\}$$
The same result as on p. 94.
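98 Option 3: Using the most probable term
$$E = -N_+\mu H + N_-\mu H \;\Rightarrow\; E' = E/\mu H = -N_+ + N_-, \qquad N = N_+ + N_-$$
$$N_- = (N + E')/2, \qquad N_+ = (N - E')/2$$
$$M = \mu(N_+ - N_-), \qquad E = -MH$$
The number of spin configurations with E is $W(E) = N!/(N_+!\,N_-!)$. The terms of the partition function have the form
$$f_T(E') = \frac{N!}{[(N + E')/2]!\,[(N - E')/2]!}\exp[-E'\mu H/k_BT]$$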
99 For a given N, T, and H we seek the maximal term. We take $\ln f_T(E)$, differentiate it with respect to E, and equate the result to 0, using the Stirling formula, $\ln N! \approx N\ln N$.
The maximum or minimum of a function f(x) with respect to x is obtained at the value $x^*$ where $(df/dx)|_{x^*} = f'(x^*) = 0$ and $(d^2f/dx^2)|_{x^*} < 0$ or $> 0$, respectively.
100 The most probable energy, $E^*$, for given T and H is
$$E^* = -N\mu H\tanh(\mu H/k_BT)$$
and
$$M = N\mu\tanh(\mu H/k_BT)$$
as obtained before.
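The most probable term can also be located by a direct search instead of differentiation. The sketch below scans all values of $N_-$, evaluates $\ln[W(E)\exp(-E/k_BT)]$ with exact log-factorials, and compares the maximizing energy with $-N\mu H\tanh(\mu H/k_BT)$ and the maximal term with $\ln Z$; N = 10000, $k_B = 1$, and the function name are illustrative assumptions.

```python
import math

def most_probable_energy(N, mu, H, T, kB=1.0):
    """Return (E*, ln f_T(E*)), where f_T(E) = W(E) * exp(-E/kB T)."""
    best_E, best_log_f = None, -math.inf
    for n_minus in range(N + 1):
        n_plus = N - n_minus
        E = -(n_plus - n_minus) * mu * H
        log_W = (math.lgamma(N + 1) - math.lgamma(n_plus + 1)
                 - math.lgamma(n_minus + 1))       # ln[N!/(N+! N-!)]
        log_f = log_W - E / (kB * T)
        if log_f > best_log_f:
            best_E, best_log_f = E, log_f
    return best_E, best_log_f

N, mu, H, T = 10000, 1.0, 1.0, 2.0
E_star, log_f_max = most_probable_energy(N, mu, H, T)
E_tanh = -N * mu * H * math.tanh(mu * H / T)
log_Z = N * math.log(2.0 * math.cosh(mu * H / T))
print(E_star, E_tanh)     # agree to within one spin flip (2*mu*H)
print(log_f_max, log_Z)   # the single largest term nearly exhausts ln Z
```

Because energies change in steps of $2\mu H$, the discrete maximum can differ from the continuous tanh result by at most one spin flip.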
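101 States of the spin system (↑ is the direction of H):
↓ spins | degeneracy | energy | typical T | spin configurations
0 | 1 (min., S = 0) | $-N\mu H$ (min.) | T = 0 | all spins ↑
1 | N | $-(N-1)\mu H + \mu H = -N\mu H + 2\mu H$ | very low (T ≈ 0) | one spin ↓
2 | $N!/[(N-2)!\,2!] = N(N-1)/2$ | $-N\mu H + 4\mu H$ | $T_1 >$ T ≈ 0 | two spins ↓
...
k | $N!/[(N-k)!\,k!]$ | $-N\mu H + 2k\mu H$ | $T_k > T_{k-1}$ | k spins ↓
...
N/2 | $N!/[(N/2)!\,(N/2)!]$ (max. degeneracy, $S/N = k_B\ln 2$) | $-\mu H(N/2 - N/2) = 0$ | high T | N/2 spins ↑, N/2 spins ↓
$N_- > N_+$: not physical, negative temperature! There S increases as E decreases.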
103 Several points
The entropy can be defined in two ways:
1) as a statistical average, $S = -k_B\sum_i P_i\ln P_i$ ($P_i$ Boltzmann), and
2) as $S \approx k_B\ln n(E^*)$, where $n(E^*)$ is the degeneracy of the most probable energy.
For large systems the two definitions are identical. As a mechanical system the spins would like to stay in the ground state (all spins are up; lowest potential energy ⇒ most stable state), where no uncertainty exists (maximum order) ⇒ the entropy is 0.
104 However, the spins interact with a heat bath at a finite T, where random energy flows into and out of the spin system. Thus, spins parallel to H (spin up) might absorb energy and jump to their higher energy level (spin down); then some of them will release their energy back to the bath by returning to the lower energy state (↑), and vice versa. For a given T the average number of excited spins (↓) is constant, and this number increases (i.e., the average energy increases) as T is increased. The statistical mechanics treatment of this model describes this physical picture. As T increases, the average energy (~$E^*(T)$, the most probable energy) increases correspondingly.
105 As $E^*(T)$ increases, the number of states $n(E^*)$ with energy $E^*(T)$ increases as well, i.e., the system can populate more states with the same probability ⇒ the uncertainty about its location increases ⇒ S increases. So, the increased energy and its randomness provided by the heat bath as T increases are expressed in the spin system by higher $E^*(T)$ and enhanced disorder, i.e., larger $\ln n(E^*)$ ⇒ larger S(T). The stability of a thermodynamic system is a compromise between two opposing tendencies: to be in the lowest potential energy and to be in maximal disorder. At T = 0 the potential energy wins; it is minimal ⇒ complete order, S = 0 (minimal). At T = ∞ the disorder wins; S and $E^*$ are both maximal.
106 At finite T the stability becomes a compromise between the tendencies for order and disorder; it is determined by finding the most probable (maximal) term of the partition function at $E^*(T)$,
$$n(E^*)\exp[-E^*/k_BT]$$
or, equivalently, the minimal term of the free energy,
$$E^* - k_BT\ln n(E^*) = E^* - TS$$
Notice that while the (macroscopic) energy is known very accurately due to the small fluctuations, the configuration (state) is unknown. We only know that the system can be located with equal probability in any of the $n(E^*)$ states.
107 Simplest polymer model: the ideal chain
This model only satisfies the connectivity of the chain's monomers; the excluded volume interaction is neglected, i.e., two monomers can occupy the same place in space. In spite of this unrealistic feature the model is important, as discussed later. We study this model on a d-dimensional lattice; the chain starts from the origin; there is no kinetic or potential energy.
[Figure: a lattice chain of N = 8 bonds (steps), or N + 1 = 9 monomers, with bond vectors $a_1$, $a_2$, ..., $a_4$ labeled and the end-to-end distance marked.]
Probabilistic approach: the sample space is the ensemble of all chain configurations for a given N. No interactions, E = 0, $\exp[-E/k_BT] = 1$ for all chains ⇒ they are equally probable.
108 This ensemble can be built step by step. For a square lattice the first bond has 2d = 4 possible directions, and the same holds for the rest ⇒ the partition function Z is the total number of chain configurations:
$$Z = \sum_i 1 = 4^N = (2d)^N, \qquad P^B(i) = 1/Z = (1/4)^N = (1/2d)^N$$
$$S = -k_B\sum_i P^B(i)\ln P^B(i) = Nk_B\ln 4 = Nk_B\ln 2d$$
There is interest in global geometrical properties of a polymer, such as the root mean square end-to-end distance (ETED) R. Denoting by V the ETED vector,
$$V = \sum_i a_i \quad\text{and}\quad R^2 = \langle V\cdot V\rangle = \sum_{i,j}\langle a_i\cdot a_j\rangle = \sum_i\langle a_i\cdot a_i\rangle = Na^2, \qquad a = |a_i|$$
$$R = N^{1/2}a$$
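109 This is an important result: even though the chain can intersect itself and go back on itself many times, the global dimension of the chain increases as $N^{1/2}$ as N is increased; this means that the number of open structures is larger than that of the compact ones, and they thereby dominate the statistical averages.
About the calculation: for any direction of bond i, the four equally probable directions of bond j (labeled 1 to 4) give, in units of $a^2$,
$$a_i\cdot a_1 = 1, \quad a_i\cdot a_2 = 0, \quad a_i\cdot a_3 = -1, \quad a_i\cdot a_4 = 0$$
⇒ if i ≠ j, $\langle a_i\cdot a_j\rangle = 0$. A similar proof applies to a continuum chain.
[Figure: bond i and the four possible directions (1, 2, 3, 4) of bond j on the square lattice.]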
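For short chains the ideal-chain results can be confirmed by exact enumeration. The sketch below generates every configuration of a 6-bond chain on the square lattice (with bond length a = 1, an illustrative choice), counts them, and averages the squared end-to-end distance; the count should equal $4^N$ and the average should equal $Na^2$.

```python
import itertools

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # the 2d = 4 bond directions

def ideal_chain_statistics(n_bonds, a=1.0):
    """Return (number of chains, <R^2>) by enumerating all ideal chains."""
    count = 0
    sum_r2 = 0.0
    for walk in itertools.product(STEPS, repeat=n_bonds):
        x = a * sum(step[0] for step in walk)
        y = a * sum(step[1] for step in walk)
        sum_r2 += x * x + y * y   # squared ETED of this chain
        count += 1
    return count, sum_r2 / count  # all chains are equally probable

n_bonds = 6
count, mean_r2 = ideal_chain_statistics(n_bonds)
print(count, 4 ** n_bonds)  # Z = (2d)^N
print(mean_r2, n_bonds)     # <R^2> = N a^2
```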
110 Problem: A one-dimensional (d = 1) ideal chain of n bonds, each of length a, starts from the origin; the distance between the chain ends is denoted by x (ETED). Find the entropy as a function of x and the relation between the temperature and the force required to hold the chain ends at x.
Comments: 1) The force is defined as $\tau = (\partial F/\partial x)_T$, parallel to the definition of the pressure of a gas, $P = -(\partial F/\partial V)_T$. 2) We have defined the entropy in terms of the degeneracy of the most probable energy. Here it is defined in terms of the degeneracy of the most probable x. Thus, the entropy is maximal for x = 0 and minimal when the chain is completely stretched, x = na (without force, $\langle x^2\rangle^{1/2} = an^{1/2}$).
111 We shall use the most probable term method.
x can be defined by $n_+$ bonds in the + direction and $n_-$ bonds in the - direction:
$$x = (n_+ - n_-)a, \qquad n = n_+ + n_-$$
$$\Rightarrow\; n_+ = (na + x)/2a = (n + x/a)/2, \qquad n_- = (na - x)/2a = (n - x/a)/2$$
The number of chain combinations for given $n_+$ and $n_-$ (i.e., a given x) is
$$w(x) = n!/(n_+!\,n_-!)$$
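112 Using Stirling's formula,
$$S(x) = k_B\ln w(x) = k_B(n\ln n - n_+\ln n_+ - n_-\ln n_-)$$
$$F = -TS \;\Rightarrow\; \tau = (\partial F/\partial x)_T = -T(\partial S/\partial x)$$
$$\tau = \frac{k_BT}{2a}\ln\frac{n_+}{n_-} = \frac{k_BT}{2a}\ln\frac{1 + x/na}{1 - x/na}$$
Using $1/(1 - q) = 1 + q + q^2 + q^3 + \cdots$ for $|q| < 1$, with $q = x/na \ll 1$, i.e., x is much smaller than the stretched chain.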
113 Because $x/na \ll 1$ it is justified to take only the leading terms in the expansion, 1 and 2x/na:
$$\tau \approx \frac{k_BT}{2a}\ln\left(1 + \frac{2x}{na}\right) \approx \frac{k_BT}{na^2}\,x$$
We also used here the expansion $\ln(1 + y) \approx y$ for $y \ll 1$. Therefore, for $x \ll na$ we obtain Hooke's law, where $k_BT/na^2$ is the force constant. This derivation applies to rubber elasticity, which is mainly caused by entropic effects.
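The range of validity of the Hooke's-law limit can be examined numerically. The sketch below computes the force as $-T\,\Delta S/\Delta x$ from the exact $S(x) = k_B\ln w(x)$, using a central difference over one bond flip in each direction, and compares it with $k_BTx/na^2$; the values n = 10000 and $a = T = k_B = 1$, and the function names, are illustrative assumptions.

```python
import math

def log_w(n, n_plus):
    """ln w(x) = ln[n!/(n_plus! n_minus!)] with n_minus = n - n_plus."""
    return (math.lgamma(n + 1) - math.lgamma(n_plus + 1)
            - math.lgamma(n - n_plus + 1))

def entropic_force(n, x, a=1.0, T=1.0, kB=1.0):
    """tau = -T dS/dx by a central difference; one bond flip changes x by 2a."""
    n_plus = (n + x / a) / 2.0
    dS = kB * (log_w(n, n_plus + 1) - log_w(n, n_plus - 1))  # S(x+2a) - S(x-2a)
    return -T * dS / (4.0 * a)

def hooke_force(n, x, a=1.0, T=1.0, kB=1.0):
    """Small-extension limit: tau = kB*T*x/(n*a**2)."""
    return kB * T * x / (n * a * a)

n = 10000
small = (entropic_force(n, 100.0), hooke_force(n, 100.0))    # x/na = 0.01
large = (entropic_force(n, 8000.0), hooke_force(n, 8000.0))  # x/na = 0.8
print(small)  # nearly identical
print(large)  # the exact force exceeds the linear law near full extension
```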
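114 Solution of problem set 1
1) Calculate the energy E of an oscillator by a free energy derivative, where $q = k_BT/h\nu$.
$$F = -k_BT\ln q = -k_BT\ln(k_BT/h\nu)$$
$$E = -T^2\,\frac{\partial(F/T)}{\partial T} = k_BT^2\,\frac{\partial\ln(k_BT/h\nu)}{\partial T} = k_BT$$
For N three-dimensional oscillators:
$$E = 3Nk_BT$$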
115 2) Prove the equation on p. 45.
3) A die is rolled four times. What is the probability to obtain 6 exactly one time? This experiment is defined in the product space, where the elementary events (EE) are vectors (i, j, k, l), 1 ≤ i, j, k, l ≤ 6. The total number of EE is $6^4$ (P = 1/6). The event "6 occurred exactly one time" consists of the following events: A) i = 6; j, k, l remain between 1 and 5 ⇒ $5^3$ EE. B) j = 6; the rest are between 1 and 5 ⇒ $5^3$ EE, etc. ⇒
$$P = 4\cdot 5^3/6^4 = 0.3858$$
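The counting argument can be checked by listing the product space explicitly. This short sketch enumerates all $6^4$ elementary events and counts those containing exactly one 6; the variable names are illustrative.

```python
import itertools
from fractions import Fraction

# All elementary events (i, j, k, l) with each entry between 1 and 6
outcomes = list(itertools.product(range(1, 7), repeat=4))
favorable = sum(1 for event in outcomes if event.count(6) == 1)
p_exactly_one_six = Fraction(favorable, len(outcomes))

print(len(outcomes), favorable)
print(p_exactly_one_six, float(p_exactly_one_six))
```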
116 4) A box contains 6 red balls, 4 white balls, and 5 blue balls. 3 balls are drawn successively. Find the probability that they are drawn in the order red, white, and blue if each ball is (a) replaced, (b) not replaced.
(a) The problem is defined in the product space (15)×(15)×(15), where the experiments are independent: P(red) = 6/15, P(white) = 4/15, P(blue) = 5/15.
$$P(\text{red, white, blue}) = (6\cdot 4\cdot 5)/15^3 = 120/3375 = 0.0356$$
(b)
$$P(\text{red, white, blue}) = (6/15)\cdot(4/14)\cdot(5/13) = 0.0440$$
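Both cases can be reproduced with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the variable names are illustrative.

```python
from fractions import Fraction

red, white, blue = 6, 4, 5
total = red + white + blue

# (a) with replacement: three independent draws from 15 balls
p_replaced = Fraction(red, total) * Fraction(white, total) * Fraction(blue, total)

# (b) without replacement: the box shrinks by one ball after each draw
p_not_replaced = (Fraction(red, total) * Fraction(white, total - 1)
                  * Fraction(blue, total - 2))

print(p_replaced, float(p_replaced))
print(p_not_replaced, float(p_not_replaced))
```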
117 About force and entropy
In mechanical systems the force is obtained from a derivative of the potential energy: in gravitation, E = mgh ⇒ force = -dE/dh = -mg. For a spring, E = kx²/2 ⇒ force = -dE/dx = -kx. If the spring is stretched and released the mass will oscillate. The potential energy is converted into kinetic energy and vice versa. TS has the dimension of energy. Can a force be obtained from pure entropy? Answer: yes. To show that, we examine the 1d ideal chain studied recently.
118 We have shown on p. 108 that the entropy of a 1d ideal chain (i.e., ln of the total number of chains) is $S = Nk_B\ln 2d = Nk_B\ln 2$.
On p. 113 we obtained $\tau/T = -dS/dx = \text{const.}\cdot\ln(1 + 2x/na + \ldots)$ for $x \ll na$. Equating this derivative to 0 and solving for x leads to x = 0, which is the ETED value for which S is maximal (the second derivative of S with respect to x is negative at x = 0).
For x = 0, $n_+ = n_- = n/2$ ⇒ the number of chain configurations for x = 0 is
$$w(x = 0) = n!/[(n/2)!\,(n/2)!]$$
and
$$\ln w = n\ln n - n\ln(n/2) = n\ln 2 = S/k_B \text{ of this system!}$$
⇒ using the Stirling approximation, the number of configurations at x = 0 equals the total number of configurations.
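How well the single term at x = 0 represents the whole ensemble can be checked with exact factorials. The sketch below compares $\ln w(x = 0)$ with $n\ln 2$ for increasing n, using the log-gamma function; the chosen chain lengths are illustrative. The ratio approaches 1 as n grows, although for any finite n the x = 0 term is strictly smaller than the total.

```python
import math

def log_w_at_zero(n):
    """ln of n!/[(n/2)! (n/2)!], the number of chains with x = 0 (n even)."""
    return math.lgamma(n + 1) - 2.0 * math.lgamma(n / 2 + 1)

ratios = {n: log_w_at_zero(n) / (n * math.log(2.0)) for n in (10, 1000, 100000)}
for n, r in ratios.items():
    print(n, r)  # ratio of ln w(x=0) to ln(total number of chains)
```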
119 x = 0 is the most probable ETED, i.e., it has the maximal entropy; thus, to hold the chain ends at a distance x > 0 one has to apply a force against the "will" of the chain to remain in its most probable state (x = 0). In other words, the force is required for decreasing the chain entropy.
This is an example of what is known as the potential of mean force (PMF), which is the contribution to the free energy of configurations that satisfy a certain geometrical restriction. In our case this restriction is a certain value of the ETED, x.
Differentiating the PMF (free energy) with respect to the corresponding restriction (e.g., x) leads to the average force required to hold the system in the restricted geometry.
120 Rubber elasticity stems from such entropic effects. Rubber consists of polymer chains that are connected by cross-links. When the rubber is stretched, bond angles and lengths are not changed (or broken). The main effect is that the polymers become more ordered, i.e., their entropy is decreased. When the rubber returns to its normal length the energy invested is released as heat, which can be felt. Thus, this thermodynamic spring differs from the mechanical one. Calculation of PMF values and the corresponding forces is also carried out for proteins. An example is the huge muscle protein titin (33,000 residues), which is stretched by atomic force microscopy, and the forces are compared to those obtained from derivatives of the PMF calculated by MD (see Lu & Schulten, Biophysical Journal, vol. 79, 51-65, 2000).
121 For $x/na \ll 1$ we have obtained Hooke's law for the force $\tau$:
$$\tau = \frac{k_BT}{na^2}\,x$$
Thus, the force required to hold the chain at x increases linearly with increasing T, because the free energy (-TS) at x = 0 decreases, i.e., the system becomes more stable at higher T and it is more difficult to stretch it to a larger x. Also, the force is proportional to $1/(na^2)$, meaning that the longer the chain, the easier it is to stretch.
122 Phase transitions
A phase transition occurs when a solid becomes a liquid and the latter a gas. A phase transition also occurs when a magnet loses its magnetization or a liquid crystal becomes disordered. While phase transitions are very common, they involve a discontinuity in thermodynamic properties, and it was not clear until 1944 whether this phenomenon can be described within the framework of equilibrium statistical mechanics. In 1944 Onsager solved exactly the two-dimensional Ising model for magnetism, where the properties of a phase transition appeared in the solution. This field was developed considerably during the last 40 years. It was also established that polymers correspond to
123 magnetic systems and show phase transition behavior.
A first-order phase transition is a discontinuity in a first derivative of the free energy F. Examples of first derivatives are the energy, $\langle E\rangle = -T^2[\partial(F/T)/\partial T]_{N,V}$, and the magnetization M of a spin system in a magnetic field, $\langle M\rangle = -(\partial F/\partial H)_T$.
A known example of a system undergoing a first-order transition is a nematic liquid crystal. The molecules are elliptic or elongated. At low T they are ordered in some direction in space and $\langle E\rangle$ is low. A random arrangement (and high $\langle E\rangle$) occurs at high T. At the critical temperature $T_c$ the system can coexist in two states, random and ordered.
There is a finite difference in the energy, $\Delta E$ (latent heat), and in the entropy, $\Delta S$, of the two phases, where $\Delta E = T\Delta S$ ⇒ $F_1 = F_2$.
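124 In other words, the two states are equally stable at $T_c$.
$T < T_c$: ordered state, low energy and entropy. $T > T_c$: disordered state, high energy and entropy.
[Figure: $\langle E\rangle$ plotted against T, showing the ordered and disordered branches, the energy difference $\Delta E$, and $F_{ordered} = F_{disordered}$.]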
125For a regular system the function
fT(E)n(E