Topics in Statistical Mechanics: The Foundations of Molecular Simulation
Organizational details
Location: WE 76 (Wetmore Hall, New College)
Dates and Time: Wednesdays, 3:00 - 5:00 pm
Instructors
Prof. Jeremy Schofield
Office: Lash Miller 420E
Telephone: 978-4376
Email: jmschofi@chem.utoronto.ca
Office hours: Tuesdays, 2:00 pm - 3:00 pm
Dr. Ramses van Zon
Office: Lash Miller 423
Telephone: 946-7044
Email: rzon@chem.utoronto.ca
Office hours: Fridays, 3:00 pm - 4:00 pm
Grading
1. Four problem sets: 70%
2. Remaining assessment: 30%
Contents

1 Review
1.1 Classical Mechanics
1.2 Ensembles and Observables
1.3 Liouville Equation for Hamiltonian Systems
1.3.1 Equilibrium (stationary) solutions of the Liouville equation
1.3.2 Time-dependent Correlation Functions

2 Numerical integration and importance sampling
2.1 Quadrature
2.2 Importance sampling
2.3 Monte Carlo sampling
2.3.1 Multidimensional transformations
2.3.2 Markov Chains
2.3.3 The Metropolis method
2.4 Statistical Uncertainties

3 Applications of the Monte Carlo method
3.1 Quasi-ergodic sampling
3.2 Umbrella sampling
3.3 Overcoming quasi-ergodicity
3.3.1 High-temperature sampling
3.3.2 Weighted mixture distributions
3.3.3 Replica exchange

4 Molecular dynamics
4.1 Basic integration schemes
4.1.1 General concepts
4.1.2 Ingredients of a molecular dynamics simulation
4.1.3 Desirable qualities for a molecular dynamics integrator
4.1.4 Verlet scheme
4.1.5 Leap Frog scheme
4.1.6 Momentum (velocity) Verlet scheme
4.2 Symplectic integrators and splitting schemes
4.3 The shadow Hamiltonian
4.4 Instabilities at large time steps
4.5 Higher-order schemes
4.5.1 Optimized schemes
4.5.2 Composition schemes
4.5.3 Gradient schemes
4.5.4 Multiple time-step schemes

5 Advanced topics
5.1 Hybrid Monte Carlo
5.1.1 The Method
5.1.2 Application of Hybrid Monte-Carlo
5.2 Time-dependent correlations
5.3 Event-driven simulations
5.3.1 Implementation of event-driven dynamics
5.3.2 Generalization: Energy discretization
5.4 Constraints and Constrained Dynamics
5.4.1 Constrained Averages
5.4.2 Constrained Dynamics
5.5 Statistical Mechanics of Non-Hamiltonian Systems
5.5.1 Non-Hamiltonian Dynamics and the Canonical Ensemble
5.5.2 Volume-preserving integrators for non-Hamiltonian systems

A Math Appendices
A.1 Taylor expansion
A.2 Series expansions
A.3 Probability theory
A.3.1 Discrete systems
A.3.2 Continuous Systems
A.3.3 Gaussian distributions
A.4 Fourier and Laplace Transforms
A.5 Calculus
A.5.1 Integration by parts
A.5.2 Change of Variable and Jacobians

B Problem sets
B.1 Problem Set 1
B.2 Problem Set 2
B.3 Problem Set 3
B.4 Problem Set 4
1 Review

1.1 Classical Mechanics

Consider a single particle of mass $m$ moving in one dimension in a potential $V(x)$. The momentum is $p = m\dot{x}$ and the acceleration is $a(t) = \ddot{x}(t)$, so Newton's equation of motion reads
$$\dot{p}(t) = m\ddot{x}(t) = F(t) = -\frac{dV}{dx}.$$
The Hamiltonian is
$$H = \frac{p^2}{2m} + V(x).$$
Introduce terminology: $\frac{p^2}{2m}$ = kinetic energy, $V(x)$ = potential energy. Hamilton's equations of motion are
$$\dot{x} = \frac{\partial H}{\partial p} = \frac{p}{m}, \qquad \dot{p} = -\frac{\partial H}{\partial x} = -\frac{dV}{dx}.$$
These are coupled ordinary differential equations whose solution is uniquely specified by two conditions, such as $x_0 = x(0)$ and $p_0 = p(0)$ at some reference time $t_0 = 0$.
3-dimensional system of 1 particle

Notation: $\mathbf{r} = (x, y, z)$ and $\mathbf{p} = (p_x, p_y, p_z)$. Also, $\mathbf{p}\cdot\mathbf{p} = p_x^2 + p_y^2 + p_z^2$.

The Hamiltonian is
$$H = \frac{\mathbf{p}\cdot\mathbf{p}}{2m} + V(\mathbf{r}).$$
Hamilton's equations become
$$\dot{\mathbf{r}} = \frac{\partial H}{\partial \mathbf{p}} = \frac{\mathbf{p}}{m}, \qquad \text{i.e.}\quad \begin{pmatrix}\dot{r}_x\\ \dot{r}_y\\ \dot{r}_z\end{pmatrix} = \frac{1}{m}\begin{pmatrix}p_x\\ p_y\\ p_z\end{pmatrix},$$
and
$$\dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{r}} = -\frac{\partial V}{\partial \mathbf{r}},$$
where $\partial/\partial\mathbf{r}$ is shorthand for the gradient $(\partial/\partial x, \partial/\partial y, \partial/\partial z)$.
2 particles in 3 dimensions

Hamiltonian: $H = \frac{\mathbf{p}_1\cdot\mathbf{p}_1}{2m_1} + \frac{\mathbf{p}_2\cdot\mathbf{p}_2}{2m_2} + V(\mathbf{r}_1, \mathbf{r}_2)$, with equations of motion
$$\dot{\mathbf{r}}_1 = \frac{\partial H}{\partial \mathbf{p}_1} = \frac{\mathbf{p}_1}{m_1}, \qquad \dot{\mathbf{r}}_2 = \frac{\partial H}{\partial \mathbf{p}_2} = \frac{\mathbf{p}_2}{m_2}, \qquad \dot{\mathbf{p}}_1 = -\frac{\partial H}{\partial \mathbf{r}_1}, \qquad \dot{\mathbf{p}}_2 = -\frac{\partial H}{\partial \mathbf{r}_2}.$$
Writing $r^{(2)} = (\mathbf{r}_1, \mathbf{r}_2)$ and $p^{(2)} = (\mathbf{p}_1, \mathbf{p}_2)$, these can be written compactly as
$$\dot{r}^{(2)} = \frac{\partial H}{\partial p^{(2)}}, \qquad \dot{p}^{(2)} = -\frac{\partial H}{\partial r^{(2)}}.$$

N particles in 3 dimensions

The same compact form holds for $N$ particles:
$$\dot{r}^{(N)} = \frac{\partial H}{\partial p^{(N)}}, \qquad \dot{p}^{(N)} = -\frac{\partial H}{\partial r^{(N)}}.$$
A total of $6N$ equations!

At each point in time, the system is specified by the $6N$ coordinates $(r^{(N)}(t), p^{(N)}(t)) \equiv x^{(N)}(t)$, called the phase point. The set of all phase points is called phase space. Classical dynamics describes a path through the $6N$-dimensional phase space.
The time evolution of any function $B(x^{(N)})$ of the phase point follows from the chain rule:
$$\frac{dB}{dt} = \dot{r}^{(N)}\cdot\frac{\partial B}{\partial r^{(N)}} + \dot{p}^{(N)}\cdot\frac{\partial B}{\partial p^{(N)}}.$$
We can define the Liouville operator $L$ to be
$$L = \frac{\partial H}{\partial p^{(N)}}\cdot\frac{\partial}{\partial r^{(N)}} - \frac{\partial H}{\partial r^{(N)}}\cdot\frac{\partial}{\partial p^{(N)}},$$
so that
$$LB = \frac{\partial H}{\partial p^{(N)}}\cdot\frac{\partial B}{\partial r^{(N)}} - \frac{\partial H}{\partial r^{(N)}}\cdot\frac{\partial B}{\partial p^{(N)}} = \frac{dB}{dt}.$$
In particular,
$$x^{(N)}(t) = e^{Lt}\, x^{(N)}(0).$$
Note that $LH = 0$.

Can also define the Poisson bracket via
$$\{A, B\} = \frac{\partial A}{\partial r^{(N)}}\cdot\frac{\partial B}{\partial p^{(N)}} - \frac{\partial A}{\partial p^{(N)}}\cdot\frac{\partial B}{\partial r^{(N)}},$$
so that for a dynamical variable $G(x^{(N)}, t)$,
$$\frac{dG(x^{(N)}, t)}{dt} = \frac{\partial G(x^{(N)}, t)}{\partial t} + \{G(x^{(N)}, t), H(x^{(N)})\}.$$
Important property:
$$e^{Lt}\left[A(x^{(N)})\, B(x^{(N)})\right] = \left[e^{Lt} A(x^{(N)})\right]\left[e^{Lt} B(x^{(N)})\right] = A(x^{(N)}(t))\, B(x^{(N)}(t)).$$
1.2 Ensembles and Observables

Consider some arbitrary dynamical variable $G(r^{(N)}, p^{(N)}) = G(x^{(N)})$ (a function of the phase-space coordinates and hence possibly evolving in time). An experimental measurement of the quantity corresponds to a time average over some (possibly short) sampling interval $\tau$:
$$G_{\rm obs}(t) = \overline{G}(t) = \frac{1}{\tau}\int_0^\tau d\tau'\; G\!\left(r^{(N)}(t+\tau'),\, p^{(N)}(t+\tau')\right).$$
Ensemble average therefore corresponds to a weighted average over different configurations of the system.

Define a probability density for phase space (often loosely called the distribution function):
$$f(r^{(N)}, p^{(N)}, t) = \text{distribution function},$$
so that $f(r^{(N)}, p^{(N)}, t)\, dr^{(N)} dp^{(N)}$ is the probability of finding the system in the volume element $dr^{(N)} dp^{(N)}$ around $(r^{(N)}, p^{(N)})$ at time $t$.

Microcanonical ensemble: all systems in the ensemble have the same total energy. All dynamical trajectories with the same energy compose the set of states in the microcanonical ensemble. Technically, all conserved quantities should also be the same.

What is the connection between the ensemble average and the experimental observation (time average)?

Quasi-ergodic hypothesis: as $t \to \infty$, a dynamical trajectory will pass arbitrarily close to each point in the constant-energy hypersurface of phase space (if energy is the only conserved quantity); the dynamics is metrically transitive. Another statement: for all initial states except a set of zero measure, the phase space is connected through the dynamics, and the hypersurface of phase space is covered by the trajectory.
So in some sense, as $\tau \to \infty$, we expect
$$G_{\rm obs}(t) = \lim_{\tau\to\infty}\frac{1}{\tau}\int_0^\tau d\tau'\; G\!\left(r^{(N)}(t+\tau'),\, p^{(N)}(t+\tau')\right) = \frac{1}{\Omega}\int dr^{(N)}\, dp^{(N)}\; G(r^{(N)}, p^{(N)}),$$
where the phase-space integral is restricted to the energy shell of volume
$$\Omega = \int_{E < H(x^{(N)}) < E + \Delta E} dr^{(N)}\, dp^{(N)},$$
so that the time average coincides with the microcanonical ensemble average.

In the canonical ensemble (fixed $N$, $V$, $T$), the equilibrium phase-space density is
$$f(x^{(N)}) = \frac{1}{N!\, h^{3N}}\, \exp\{\beta(A - H(x^{(N)}))\},$$
where $A$ is the Helmholtz free energy and $\beta = 1/(kT)$.
Normalization of $f$ gives
$$\frac{1}{N!\, h^{3N}}\, \exp\{\beta(A - H(x^{(N)}))\} = \frac{1}{N!\, h^{3N}}\, \frac{\exp\{-\beta H(x^{(N)})\}}{Q_N(T, V)},$$
where the canonical partition function is
$$Q_N(T, V) = \frac{1}{N!\, h^{3N}}\int dx^{(N)}\, \exp\{-\beta H(x^{(N)})\},$$
so that $e^{-\beta A} = Q_N$. The average energy follows from
$$E = \langle H \rangle = -\frac{1}{Q_N}\frac{\partial Q_N}{\partial \beta} = -\frac{\partial \ln Q_N}{\partial \beta}.$$
The canonical partition function can be written as an integral over energies, $Q_N = \int dE\, \Omega_N(E)\, e^{-\beta E}$, where
$$\Omega_N(E) = \frac{1}{N!\, h^{3N}}\int dx^{(N)}\, \delta(E - H(x^{(N)}))$$
= density of states at energy $E$ (the microcanonical partition function).
The energy fluctuations are obtained from a second derivative:
$$\frac{\partial^2 \ln Q_N}{\partial \beta^2} = \langle \delta E^2 \rangle = \langle H(x^{(N)})^2 \rangle - \langle H(x^{(N)}) \rangle^2 = -\frac{\partial E}{\partial \beta} = kT^2 C_v.$$
By Chebyshev's inequality, the probability of observing an energy deviation larger than $\delta E$ satisfies
$$\Pr\left(\left|H(x^{(N)}) - E\right| \geq \delta E\right) \leq \frac{kT^2 C_v}{\delta E^2}.$$
For an ideal gas system, $E = \tfrac{3}{2} N kT$ and hence $C_v = \tfrac{3}{2} N k$. Typically, $E \sim N$ and $C_v \sim N$, so for a relative deviation $\delta E = \epsilon E$,
$$\Pr\left(\left|H(x^{(N)}) - E\right| \geq \epsilon E\right) \leq \frac{kT^2 C_v}{\epsilon^2 E^2} \sim \frac{1}{N\epsilon^2}.$$
As $N$ increases, it becomes less and less likely to observe a system with energy very different from $E$. Consequently, canonical and microcanonical averages agree for large $N$:
$$\langle B(x^{(N)}) \rangle_{\rm canon} = \int dE\; P(E)\, \langle B(x^{(N)}) \rangle_{\text{micro at } E} = \langle B(x^{(N)}) \rangle_{\text{micro at } \bar{E}}\left(1 + O(1/N)\right),$$
since the probability density of the energy is sharply peaked,
$$P(E) \approx \left(\frac{1}{2\pi\sigma_E^2}\right)^{1/2} \exp\left(-\frac{(E - \bar{E})^2}{2 kT^2 C_v}\right), \qquad \sigma_E^2 = kT^2 C_v.$$
1.3 Liouville Equation for Hamiltonian Systems

Consider the probability of finding the system in a small volume $V_0$ of phase space at time $0$:
$$P(V_0) = \int_{V_0} dX_0^N\; f(X_0^N, 0).$$
Under the dynamics, phase points $X_0^N$ in $V_0$ evolve in a time step $\Delta t$ into points $X_t^N$ in a volume $V_t$. Since probability is conserved along trajectories,
$$P(V_0) = \int_{V_0} dX_0^N\; f(X_0^N, 0) = \int_{V_t} dX_t^N\; J(X_0^N; X_t^N)\, f(X_t^N - \Delta X^N,\, t - \Delta t) = P_t(V_t),$$
where $J$ is the Jacobian of the transformation between $X_0^N$ and $X_t^N$. Recall that $X_t^N - \Delta X^N = X_0^N$.
Evaluation of the Jacobian is a bit complicated, but gives
$$J(X_0^N; X_t^N) = \left|\frac{\partial X_0^N}{\partial X_t^N}\right| = 1 - \frac{\partial}{\partial X^N}\cdot\Delta X^N + \cdots$$
for small $\Delta X^N$. So
$$P_t(V_t) = P(V_0) = \int_{V_t} dX_t^N\left(1 - \frac{\partial}{\partial X^N}\cdot\Delta X^N\right) f\!\left(X_t^N - \Delta X^N,\, t - \Delta t\right).$$
What is $\Delta X^N$? For Hamiltonian systems $X_t^N \simeq X_0^N + \dot{X}_0^N\,\Delta t$, or $\Delta X^N = \dot{X}^N\Delta t$. Expanding the integrand for small $\Delta t$ and $\Delta X^N$,
$$\left(1 - \frac{\partial}{\partial X^N}\cdot\Delta X^N\right) f(X_t^N - \Delta X^N,\, t - \Delta t) = f - \Delta t\,\frac{\partial f}{\partial t} - \frac{\partial}{\partial X^N}\cdot\left(\Delta X^N f\right) + \frac{1}{2}\frac{\partial^2}{\partial X^N\,\partial X^N} : \left(f\,\Delta X^N\Delta X^N\right) + \cdots$$
Inserting this in the previous equation for $P_t(V_t) = P(V_0)$, we get
$$\int_{V_t} dX_t^N\left[-\Delta t\,\frac{\partial f}{\partial t} - \frac{\partial}{\partial X^N}\cdot\left(\Delta X^N f\right) + \frac{1}{2}\frac{\partial^2}{\partial X^N\,\partial X^N} : \left(f\,\Delta X^N\Delta X^N\right)\right] = 0.$$
Since this holds for an arbitrary volume $V_t$, the integrand must vanish:
$$\frac{\partial f}{\partial t}\,\Delta t = -\frac{\partial}{\partial X^N}\cdot\left(\Delta X^N f\right) + \frac{1}{2}\frac{\partial^2}{\partial X^N\,\partial X^N} : \left(f\,\Delta X^N\Delta X^N\right) + \cdots$$
Now, let us evaluate this for $\Delta X^N = \dot{X}^N\Delta t$. To linear order in $\Delta t$,
$$\frac{\partial}{\partial X^N}\cdot\left(\dot{X}^N f\right)\Delta t = \left[\dot{X}^N\cdot\frac{\partial f}{\partial X^N} + f\,\frac{\partial}{\partial X^N}\cdot\dot{X}^N\right]\Delta t,$$
but
$$\frac{\partial}{\partial X^N}\cdot\dot{X}^N = \frac{\partial}{\partial R^N}\cdot\frac{\partial H}{\partial P^N} - \frac{\partial}{\partial P^N}\cdot\frac{\partial H}{\partial R^N} = 0!$$
The phase-space flow of a Hamiltonian system is incompressible, and we obtain the Liouville equation
$$\frac{\partial f}{\partial t} = -\dot{X}^N\cdot\frac{\partial f}{\partial X^N} = \{H, f\} = -Lf.$$
Note that the incompressibility also implies that the volume element does not change with normal Hamiltonian propagation, $dX_t^N = dX_0^N$, so for any phase function $G$ the integral $\int dX^N\, G$ may be taken over either set of variables.
1.3.1 Equilibrium (stationary) solutions of the Liouville equation

Since $dH/dt = 0$, any distribution that depends on the phase point only through the Hamiltonian, $f = f(H)$, is stationary:
$$\frac{\partial f}{\partial t} = \{H, f(H)\} = \frac{\partial H}{\partial R^N}\cdot\frac{\partial f}{\partial P^N} - \frac{\partial H}{\partial P^N}\cdot\frac{\partial f}{\partial R^N},$$
but
$$\frac{\partial f}{\partial P^N} = \frac{\partial f}{\partial H}\frac{\partial H}{\partial P^N}, \qquad \frac{\partial f}{\partial R^N} = \frac{\partial f}{\partial H}\frac{\partial H}{\partial R^N},$$
so
$$\frac{\partial f}{\partial t} = \frac{\partial f}{\partial H}\left[\frac{\partial H}{\partial R^N}\cdot\frac{\partial H}{\partial P^N} - \frac{\partial H}{\partial P^N}\cdot\frac{\partial H}{\partial R^N}\right] = 0.$$
1.3.2 Time-dependent Correlation Functions

Consider the time-dependent correlation function $C_{AB}(t)$ in the canonical ensemble:
$$\left\langle A(x^{(N)}, t)\, B(x^{(N)}, 0)\right\rangle = \int dx^{(N)}\; A(x^{(N)}, t)\, B(x^{(N)}, 0)\, f(x^{(N)}).$$
From the form of the Liouville operator, for arbitrary functions $A$ and $B$ of the phase-space coordinates,
$$A(x^{(N)}, t)\, B(x^{(N)}, t) = \left[e^{Lt} A(x^{(N)}, 0)\right]\left[e^{Lt} B(x^{(N)}, 0)\right] = e^{Lt}\left[A(x^{(N)}, 0)\, B(x^{(N)}, 0)\right].$$
It can be shown by integrating by parts that
$$\left\langle \left[L A(x^{(N)})\right] B(x^{(N)}) \right\rangle = -\left\langle A(x^{(N)})\left[L B(x^{(N)})\right]\right\rangle,$$
and hence
$$\left\langle A(x^{(N)}, t)\, B(x^{(N)}, 0)\right\rangle = \left\langle A(x^{(N)}, 0)\, B(x^{(N)}, -t)\right\rangle.$$
Equivalently, the propagator may be moved onto the distribution function:
$$\int dx^{(N)}\left[e^{Lt} A(x^{(N)}, 0)\right] f(x^{(N)}, 0) = \int dx^{(N)}\; A(x^{(N)}, 0)\, e^{-Lt} f(x^{(N)}, 0) = \int dx^{(N)}\; A(x^{(N)}, 0)\, f(x^{(N)}, t).$$
2 Numerical integration and importance sampling

2.1 Quadrature

Consider the one-dimensional definite integral
$$I(a, b) = \int_a^b dx\, f(x).$$
Rectangle rule: on a small interval, construct an interpolating polynomial and integrate it over the interval. For a polynomial of degree 0 through the mid-point of an interval of width $h$,
$$\int_{ah}^{(a+1)h} dx\, f(x) \approx h\, f\!\left(\left(a + \tfrac{1}{2}\right) h\right).$$
Polynomial of degree 1 passing through points $(a_1, f(a_1))$ and $(a_2, f(a_2))$: Trapezoidal rule. The interpolant is
$$f(x) \approx f(a_1) + \frac{x - a_1}{a_2 - a_1}\left(f(a_2) - f(a_1)\right),$$
which integrates to
$$\int_{a_1}^{a_2} dx\, f(x) \approx \frac{a_2 - a_1}{2}\left(f(a_1) + f(a_2)\right).$$
Composing the trapezoidal rule $n$ times on the interval $(a, b)$ with even sub-intervals $[a + kh,\, a + (k+1)h]$, where $k = 0, \ldots, n-1$ and $h = (b - a)/n$, gives the estimate
$$\int_a^b dx\, f(x) \approx \frac{b - a}{n}\left(\frac{f(a) + f(b)}{2} + \sum_{k=1}^{n-1} f(a + kh)\right).$$
Simpson's rule (quadratic interpolation over pairs of sub-intervals, $n$ even):
$$\int_a^b dx\, f(x) \approx \frac{b - a}{3n}\left[f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) + 2f(a + 4h) + \cdots + f(b)\right].$$
Error bounded by
$$\frac{h^4}{180}(b - a)\left|f^{(4)}(\xi)\right|,$$
where $\xi \in (a, b)$ is the point in the domain where the magnitude of the 4th derivative is largest.
Rules based on uneven divisions of the domain of integration use a weight function $w(x)$:
$$I(a, b) \approx \sum_{i=1}^n w_i f_i,$$
where $w_i = w(x_i)$ and $f_i = f(x_i)$, with the choice of $x_i$ and $w_i$ based on exact integration of polynomials of higher order. An example is Gaussian quadrature with $n$ points, exact for polynomials of degree $2n - 1$: there is a well-defined procedure to find the $\{x_i\}$ and $\{w_i\}$ (see Numerical Recipes). The error bound for $n$-point Gaussian quadrature is
$$\frac{(b - a)^{2n+1}(n!)^4}{(2n + 1)!\,[(2n)!]^3}\, f^{(2n)}(\xi) \qquad \text{for } \xi \in (a, b).$$
For multi-dimensional integrals, one must place $n_i$ grid points along each dimension $i$, so the number of points in the hyper-grid grows exponentially with dimension. Quadrature is therefore unsuitable for high-dimensional integrals.
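As a concrete illustration of the composite rules above, the following short script compares the trapezoidal and Simpson estimates on a simple test integral. It is a minimal sketch; the test integrand and grid sizes are illustrative choices, not taken from these notes.

    import numpy as np

    def trapezoid(f, a, b, n):
        """Composite trapezoidal rule with n sub-intervals."""
        x = np.linspace(a, b, n + 1)
        y = f(x)
        h = (b - a) / n
        return h * (0.5 * (y[0] + y[-1]) + y[1:-1].sum())

    def simpson(f, a, b, n):
        """Composite Simpson rule; n must be even."""
        if n % 2:
            raise ValueError("n must be even")
        x = np.linspace(a, b, n + 1)
        y = f(x)
        h = (b - a) / n
        return (h / 3) * (y[0] + y[-1] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum())

    # Test on I = int_0^1 exp(x) dx = e - 1; errors scale as h^2 and h^4.
    exact = np.e - 1
    for n in (8, 16, 32):
        print(n, trapezoid(np.exp, 0, 1, n) - exact, simpson(np.exp, 0, 1, n) - exact)

Halving $h$ reduces the trapezoidal error by a factor of 4 and the Simpson error by a factor of 16, as the error bounds above predict.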
2.2 Importance sampling

Suppose the integrand $f(x)$ depends on a multi-dimensional point $x$ and that we need the integral over a hypervolume $V$,
$$I = \int_V dx\, f(x).$$
A Monte Carlo estimate is obtained by drawing $N$ points $x_i$ from a normalized probability density $w(x)$ on $V$ and forming
$$\bar{I} = \frac{1}{N}\sum_{i=1}^N \frac{f(x_i)}{w(x_i)} = \frac{1}{N}\sum_{i=1}^N I_i.$$
As $N \to \infty$, $\bar{I} \to I$.

How does the choice of $w$ improve the rate of convergence of the calculation of $I$? We will see that the statistical uncertainty is related to the variance $\sigma_I^2$ of the estimate of $I$, namely
$$\sigma_{\bar{I}}^2 = \frac{1}{N^2}\sum_i \left\langle \delta I_i\, \delta I_i \right\rangle = \frac{1}{N}\left\langle \delta I^2\right\rangle,$$
where
$$\delta I_i = \frac{f(x_i)}{w(x_i)} - I,$$
and we have assumed that the random variables $I_i$ are statistically independent. Here $\langle\cdot\rangle$ represents the average over the true distribution of $f/w$ that is obtained in the limit $N \to \infty$.
Vastly different values of the ratio $f(x_i)/w(x_i)$ lead to large uncertainty; the error is minimized by minimizing $\sigma_I^2$. If $w(x_i) = f(x_i)/\lambda$ for a constant $\lambda$, then $f(x_i)/w(x_i) = \lambda$ and
$$\left\langle \frac{f(x_i)}{w(x_i)}\right\rangle = I = \lambda, \qquad \left\langle \left(\frac{f(x_i)}{w(x_i)}\right)^{\!2}\right\rangle = \lambda^2,$$
so $\sigma_I^2 = 0$. Note that writing $w(x_i) = f(x_i)/\lambda$ requires knowing $\lambda = I$, the very problem we are trying to solve. Generally one desires all $f(x_i)/w(x_i)$ to be roughly the same for all sampled points $x_i$ to minimize $\sigma_I^2$.

Example in one dimension:
$$I = \int_a^b dx\, f(x) = \int_a^b dx\, \frac{f(x)}{w(x)}\, w(x).$$
For uniform sampling, $w(x) = 1/(b - a)$ and the estimator reduces to $\bar{I} = \frac{b-a}{N}\sum_{i=1}^N f(x_i)$; in general,
$$\bar{I} = \frac{1}{N}\sum_{i=1}^N \frac{f(x_i)}{w(x_i)} = \frac{1}{N}\sum_{k=1}^N I_k.$$
To sample from a general $w(x)$, define the cumulative transformation
$$y(x) = \int_a^x dz\, w(z).$$
Note that $y(a) = 0$ and $y(b) = 1$ if $w(x)$ is normalized over $(a, b)$.

How are the $x$ distributed? Since the $y$ are distributed uniformly over $(0, 1)$, the probability of finding a value of $y$ in the interval $dy$ is
$$dy(x) = w(x)\, dx,$$
so the $x$ are distributed with weight $w(x)$. It then follows that the one-dimensional integral can be written in the transformed variables $y$ as
$$I = \int_a^b dx\, w(x)\, \frac{f(x)}{w(x)} = \int_{y(a)}^{y(b)} dy\, \frac{f(x(y))}{w(x(y))} = \int_0^1 dy\, \frac{f(x(y))}{w(x(y))}.$$
The integral is easily evaluated by selecting uniform points $y$ and then solving for $x(y)$. One must be able to solve for $x(y)$ for this to be useful.
Procedure:
1. Select $N$ points $y_i$ uniformly on $(0, 1)$ using $y_i = \text{rand}$.
2. Compute $x_i = x(y_i)$ by inverting $y(x) = \int_a^x dz\, w(z)$.
3. Compute the estimator for the integral, $\bar{I} = \frac{1}{N}\sum_{i=1}^N f(x_i)/w(x_i)$.
This procedure is easy to do using simple forms of $w(x)$. Suppose the integrand is strongly peaked around $x = x_0$. One good choice of $w(x)$ might be a Gaussian weight,
$$w(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - x_0)^2}{2\sigma^2}}.$$
The cumulative transformation is then
$$y(x) = \int_{-\infty}^x dz\, \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(z - x_0)^2}{2\sigma^2}} = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x - x_0}{\sqrt{2}\,\sigma}\right)\right].$$
Inverting gives $x_i = x_0 + \sqrt{2}\,\sigma\,\mathrm{ierf}(2y_i - 1)$, where $\mathrm{ierf}$ is the inverse error function, with series representation
$$\mathrm{ierf}(z) = \frac{\sqrt{\pi}}{2}\left(z + \frac{\pi}{12} z^3 + \frac{7\pi^2}{480} z^5 + \ldots\right).$$
The estimator of the integral is therefore
$$\bar{I} = \frac{1}{N}\sum_{i=1}^N \sqrt{2\pi\sigma^2}\, f(x_i)\, e^{\frac{(x_i - x_0)^2}{2\sigma^2}}.$$
This estimator has reduced variance $\sigma_I^2$ if $w(x_i)$ and $f(x_i)$ resemble one another.
Another way to draw from a Gaussian is to work with a pair of variables distributed as
$$\frac{1}{2\pi}\, e^{-x_1^2/2}\, e^{-x_2^2/2}$$
and transform to polar coordinates (the Box-Muller method).
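A minimal sketch of the Gaussian importance-sampling estimator above; the integrand, $x_0$, and $\sigma$ used here are illustrative assumptions, not values from the notes. It uses the inverse-transform method via scipy's erfinv and, for comparison, the polar (Box-Muller) trick to draw Gaussian variates.

    import numpy as np
    from scipy.special import erfinv

    rng = np.random.default_rng(1)

    def gaussian_importance(f, x0, sigma, n):
        """Estimate I = int f(x) dx with a Gaussian weight w centered at x0."""
        y = rng.uniform(size=n)
        x = x0 + np.sqrt(2.0) * sigma * erfinv(2.0 * y - 1.0)   # x ~ w(x)
        w = np.exp(-(x - x0)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
        return np.mean(f(x) / w)

    def box_muller(n):
        """Draw n standard normal variates from two uniforms (polar method)."""
        u1, u2 = rng.uniform(size=n), rng.uniform(size=n)
        return np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)

    # Peaked test integrand; the exact integral is sqrt(pi/2).
    f = lambda x: np.exp(-2.0 * (x - 1.0)**2)
    print(gaussian_importance(f, x0=1.0, sigma=0.5, n=100_000), np.sqrt(np.pi / 2))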
2.3 Monte Carlo sampling

2.3.1 Multidimensional transformations

In more than one dimension, the transformation method generalizes via the Jacobian: uniform points $(y_1, y_2)$ map to points $(x_1, x_2)$ distributed as
$$\left|\frac{\partial(y_1, y_2)}{\partial(x_1, x_2)}\right|\, dx_1\, dx_2 = w(x_1, x_2)\, dx_1\, dx_2.$$
Generally, we cannot find a simple way of generating the $\{x_i\}$ according to a known $w(x_i)$ for high-dimensional systems by transforming from a uniform distribution:
- The analytical integrals for the coordinate transformation may be invertible only for separable coordinates.
- Many degrees of freedom are coupled, so the joint probability is complicated and not the product of single-variable probabilities.
Typical integrals are ensemble averages of the form
$$\langle A \rangle = \frac{1}{Z}\int dr^{(N)}\, e^{-\beta U(r^{(N)})}\, A(r^{(N)}), \qquad Z = \int dr^{(N)}\, e^{-\beta U(r^{(N)})}.$$
2.3.2 Markov Chains
We assume the stochastic dynamics of the random variable is determined by a time-independent transition function $K(\mathbf{Y} \to \mathbf{X})$, which defines the probability density of going from state $\mathbf{Y}$ to state $\mathbf{X}$ in a time step. Since $K$ is a probability density, it must satisfy
$$\int d\mathbf{X}\, K(\mathbf{Y} \to \mathbf{X}) = 1, \qquad K(\mathbf{Y} \to \mathbf{X}) \geq 0.$$
The state $\mathbf{X}_t$ at time $t$ is assumed to be obtained from the state $\mathbf{X}_{t-1}$ by a realization of the dynamics, where the probability of all transitions is determined by $K$. At the second step of the random walk, the new state $\mathbf{X}_2$ is chosen from $P_2(\mathbf{X}|\mathbf{X}_1) = K(\mathbf{X}_1 \to \mathbf{X})$. At the third step, the state $\mathbf{X}_3$ is chosen from $P_3(\mathbf{X}|\mathbf{X}_2) = K(\mathbf{X}_2 \to \mathbf{X})$, and so on, generating the random walk sequence $\{\mathbf{X}_1, \ldots, \mathbf{X}_n\}$.
If infinitely many realizations of the dynamics are carried out, we find that the distribution of $\mathbf{X}_i$ at each step $i$ is
$$\begin{aligned} &P_1(\mathbf{X}) && \text{for } \mathbf{X}_1,\\ &P_2(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P_1(\mathbf{Y}) && \text{for } \mathbf{X}_2,\\ &P_3(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P_2(\mathbf{Y}) && \text{for } \mathbf{X}_3,\\ &\qquad\vdots\\ &P_t(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P_{t-1}(\mathbf{Y}) && \text{for } \mathbf{X}_t. \end{aligned}$$
The transition function $K$ is ergodic if any state $\mathbf{X}$ can be reached from any state $\mathbf{Y}$ in a finite number of steps in a non-periodic fashion. If $K$ is ergodic, then the Perron-Frobenius theorem guarantees the existence of a unique stationary distribution $P(\mathbf{X})$ that satisfies
$$P(\mathbf{X}) = \int d\mathbf{Y}\, K(\mathbf{Y} \to \mathbf{X})\, P(\mathbf{Y})$$
and
$$\lim_{t\to\infty} P_t(\mathbf{X}) = P(\mathbf{X}).$$
Implications:
1. From any initial distribution of states $P_1(\mathbf{X})$, an ergodic $K$ guarantees that the states $\mathbf{X}_t$ will be distributed according to the unique stationary distribution $P(\mathbf{X})$ for large $t$ (many steps of the random walk).
2. $P(\mathbf{X})$ is like an eigenvector of the matrix $K$ with eigenvalue $\lambda = 1$.
3. The goal is then to design an ergodic transition function $K$ so that the stationary or limit distribution is the Boltzmann weight $w(\mathbf{X})$.
For a discrete state space, $K$ is a stochastic matrix whose columns sum to one,
$$\sum_{i=1}^m K_{ij} = 1,$$
and the distributions $P_t$ are vectors updated by matrix multiplication, $P_{t+1} = K P_t$. To see why $K^n$ converges, bring $K$ to Jordan normal form,
$$K = S \begin{pmatrix} J_1 & & 0\\ & \ddots & \\ 0 & & J_l \end{pmatrix} S^{-1}, \qquad J_i = \begin{pmatrix} \lambda_i & 1 & & 0\\ & \lambda_i & \ddots & \\ & & \ddots & 1\\ 0 & & & \lambda_i \end{pmatrix}.$$
The dimension of each square block $J_i$ depends on the original matrix $K$: for any given eigenvalue $\lambda$, the sum of the dimensions of the $J_i$ for which $\lambda = \lambda_i$ is equal to the algebraic multiplicity of the eigenvalue. It is easily shown that the $n$th power of a block $J_i$ of size $k$ has elements bounded by $n^k |\lambda_i|^{n-k}$, which goes to zero exponentially as $n$ goes to infinity whenever $|\lambda_i| < 1$. For an ergodic stochastic matrix, $\lambda = 1$ is the unique eigenvalue of largest magnitude, so $K^n$ converges to
$$K^n \to S \begin{pmatrix} 1 & & 0\\ & 0 & \\ 0 & & \ddots \end{pmatrix} S^{-1},$$
which is the projector onto the stationary eigenvector. Thus it follows that as $n$ goes to infinity, $K^n P_1 \to P_\infty$, the stationary distribution, for any normalized initial distribution $P_1$.
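To make the convergence concrete, here is a small numerical check on a 3-state chain. The particular transition matrix below is a hypothetical example chosen only so that its columns sum to one; it is not the matrix worked out in the original notes.

    import numpy as np

    # Hypothetical column-stochastic transition matrix: K[i, j] = Pr(j -> i).
    K = np.array([[0.0, 1/2, 1/3],
                  [1/2, 0.0, 2/3],
                  [1/2, 1/2, 0.0]])
    assert np.allclose(K.sum(axis=0), 1.0)

    # Repeated application of K drives any initial distribution
    # to the stationary eigenvector with eigenvalue 1.
    P = np.array([1.0, 0.0, 0.0])
    for _ in range(50):
        P = K @ P

    evals, evecs = np.linalg.eig(K)
    stat = np.real(evecs[:, np.argmax(np.real(evals))])
    stat /= stat.sum()
    print(P, stat)   # the two agree to machine precision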
2.3.3 The Metropolis method

We wish to devise a procedure so that the limiting (stationary) distribution of the random walk is the Boltzmann distribution $P_{\rm eq}(\mathbf{x})$. Break up the transition matrix into two parts: generation of trial states $T(\mathbf{y} \to \mathbf{x})$ and acceptance probability of the trial state $A(\mathbf{y} \to \mathbf{x})$:
$$K(\mathbf{y} \to \mathbf{x}) = T(\mathbf{y} \to \mathbf{x})\, A(\mathbf{y} \to \mathbf{x}).$$
We will consider the definition of the acceptance probability given a specified procedure for generating trial states with conditional probability $T(\mathbf{y} \to \mathbf{x})$. Write the dynamics in terms of probabilities at time $t$:
$$\text{probability of starting in } \mathbf{y} \text{ and ending in } \mathbf{x} = \int d\mathbf{y}\, P_t(\mathbf{y})\, K(\mathbf{y} \to \mathbf{x}),$$
$$\text{probability of starting in } \mathbf{x} \text{ and remaining in } \mathbf{x} = P_t(\mathbf{x})\left[1 - \int d\mathbf{y}\, K(\mathbf{x} \to \mathbf{y})\right],$$
so
$$P_{t+1}(\mathbf{x}) = \int d\mathbf{y}\, P_t(\mathbf{y})\, K(\mathbf{y} \to \mathbf{x}) + P_t(\mathbf{x})\left[1 - \int d\mathbf{y}\, K(\mathbf{x} \to \mathbf{y})\right] = P_t(\mathbf{x}) + \int d\mathbf{y}\left[P_t(\mathbf{y})\, K(\mathbf{y} \to \mathbf{x}) - P_t(\mathbf{x})\, K(\mathbf{x} \to \mathbf{y})\right].$$
Stationarity is therefore guaranteed by the detailed balance condition
$$\int d\mathbf{y}\left[P_t(\mathbf{y})\, K(\mathbf{y} \to \mathbf{x}) - P_t(\mathbf{x})\, K(\mathbf{x} \to \mathbf{y})\right] = 0.$$
Simple example

Suppose we want to evaluate the equilibrium average $\langle A \rangle$ at inverse temperature $\beta$ of some dynamical variable $A(x)$ for a one-dimensional harmonic oscillator system with potential energy $U(x) = kx^2/2$:
$$\langle A(x) \rangle = \int dx\, P_{\rm eq}(x)\, A(x), \qquad P_{\rm eq}(x) = \frac{1}{Z}\, e^{-\beta k x^2/2}.$$
Monte-Carlo procedure:
1. Suppose the current state of the system is $x_0$. We define the proposal probability of a configuration $y$ to be
$$T(x_0 \to y) = \begin{cases} \dfrac{1}{2\Delta x} & \text{if } y \in [x_0 - \Delta x,\, x_0 + \Delta x]\\[4pt] 0 & \text{otherwise,} \end{cases}$$
where $\Delta x$ is fixed to some value representing the maximum displacement of the trial coordinate. The trial $y$ is chosen uniformly around the current value $x_0$. Note that if $y$ is selected, then $T(x_0 \to y) = 1/(2\Delta x) = T(y \to x_0)$, since $x_0$ and $y$ are within a distance $\Delta x$ of each other.

2. Accept the trial $y$ with probability
$$A(x_0 \to y) = \min\left(1,\, \frac{T(y \to x_0)\, P_{\rm eq}(y)}{T(x_0 \to y)\, P_{\rm eq}(x_0)}\right) = \min\left(1,\, e^{-\beta\Delta U}\right),$$
where $\Delta U = U(y) - U(x_0)$.
- Note that if $\Delta U \leq 0$, so the proposal has lower potential energy than the current state, $A(x_0 \to y) = 1$ and the trial $y$ is always accepted as the next state, $x_1 = y$.
- If $\Delta U > 0$, then $A(x_0 \to y) = e^{-\beta\Delta U} = q$, and we must accept the trial $y$ with probability $0 < q < 1$. This can be accomplished by picking a random number $r$ uniformly on $(0, 1)$ and then: (a) if $r \leq q$, accept the configuration, $x_1 = y$; (b) if $r > q$, reject the configuration and set $x_1 = x_0$ (keep the state as is).

3. Repeat steps 1 and 2, and record the states $\{x_i\}$.

A Markov chain of states $\{x_i\}$ is generated, with each $x_i$ appearing in the sequence with probability $P_{\rm eq}(x_i)$ (after some number of equilibration steps). After collecting $N$ total configurations, the equilibrium average $\langle A \rangle$ can be estimated from
$$\langle A \rangle \approx \frac{1}{N}\sum_{i=1}^N A(x_i).$$
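The procedure above translates almost line for line into code. The following is a minimal sketch for the harmonic oscillator with $k = \beta = 1$ and the observable $A(x) = x^2$; these parameter values and the choice of observable are illustrative assumptions. The exact answer for comparison is $\langle x^2 \rangle = 1/(\beta k)$.

    import numpy as np

    rng = np.random.default_rng(0)
    beta, k, dx = 1.0, 1.0, 1.0    # inverse temperature, spring constant, max displacement

    U = lambda x: 0.5 * k * x * x

    x = 0.0
    samples = []
    for step in range(200_000):
        y = x + dx * (2.0 * rng.uniform() - 1.0)   # uniform trial move in [x-dx, x+dx]
        dU = U(y) - U(x)
        if dU <= 0.0 or rng.uniform() < np.exp(-beta * dU):
            x = y                                   # accept; otherwise keep x (counted again)
        samples.append(x)

    samples = np.array(samples[10_000:])            # discard equilibration steps
    print(np.mean(samples**2), 1.0 / (beta * k))    # <x^2> vs the exact value

Note that a rejected move still contributes the current state to the average; skipping rejected states would bias the estimator.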
2.4 Statistical Uncertainties

We would like some measure of the reliability of averages computed from a finite set of sampled data. Consider the average $\bar{x}$ of a quantity $x$ constructed from a finite set of $N$ measurements $\{x_1, \ldots, x_N\}$, where the $x_i$ are random variables drawn from a density $\rho(x)$:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^N x_i.$$
Suppose the random variables $x_i$ are drawn independently, so that the joint probability satisfies
$$\rho(x_1, \ldots, x_N) = \rho(x_1)\,\rho(x_2)\cdots\rho(x_N).$$
Suppose that the intrinsic mean of the random variable is $\langle x \rangle = \int dx\, x\,\rho(x)$ and that the intrinsic variance is $\sigma^2$, where
$$\sigma^2 = \int dx\left(x - \langle x \rangle\right)^2 \rho(x).$$
A measure of reliability is the standard deviation of $\bar{x}$, also known as the standard error $\sigma_E = \sqrt{\sigma_E^2}$, where $\sigma_E^2$ is the variance of the finite average $\bar{x}$ around $\langle x \rangle$ over realizations of the finite set $\{x_1, \ldots, x_N\}$:
$$\sigma_E^2 = \left\langle \left(\bar{x} - \langle x \rangle\right)^2\right\rangle.$$
We expect:
1. The variance (error) decreases with increasing $N$.
2. The error depends on the intrinsic variance $\sigma^2$ of the density $\rho$; a larger variance should mean slower convergence of $\bar{x}$ to $\langle x \rangle$.

Suppose all the intrinsic moments $\langle x^n \rangle$ of $\rho(x)$ exist. If we define the dimensionless variable
$$z = \frac{\bar{x} - \langle x \rangle}{\sigma_E},$$
its probability density can be written as a Fourier transform,
$$P(z) = \frac{1}{2\pi}\int dt\; e^{-itz}\,\phi_N(t), \tag{2.1}$$
where $\phi_N(t)$ is the characteristic function of $P(z)$. Using the fact that $\sigma_E = \sigma/\sqrt{N}$, as will be shown shortly, $z$ can be expressed as
$$z = \frac{1}{\sigma_E N}\sum_{i=1}^N \left(x_i - \langle x \rangle\right) = \frac{\sqrt{N}}{\sigma N}\sum_{i=1}^N \left(x_i - \langle x \rangle\right),$$
and hence
$$\phi_N(t) = \left\langle e^{\frac{it}{\sigma\sqrt{N}}\sum_{i=1}^N(x_i - \langle x \rangle)}\right\rangle = \left\langle e^{\frac{it}{\sigma\sqrt{N}}(x - \langle x \rangle)}\right\rangle^{N} = \phi\!\left(t/\sqrt{N}\right)^N,$$
where
$$\phi(u) = \left\langle e^{iu(x - \langle x \rangle)/\sigma}\right\rangle,$$
which, for small arguments $u$, can be expanded as $\phi(u) = 1 - u^2/2 + iu^3\kappa_3/(6\sigma^3) + \ldots$, where $\kappa_3$ is the third cumulant of the density $\rho(x)$. Using this expansion, we find $\ln \phi_N(t) = -t^2/2 + O(N^{-1/2})$, and hence $\phi_N(t) = e^{-t^2/2}$ for large $N$. Inserting this form into Eq. (2.1) gives
$$P(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2},$$
and hence we find the central limit theorem result that the averages $\bar{x}$ are distributed around the intrinsic mean according to the normal distribution
$$N(\bar{x}; \langle x \rangle, \sigma_E^2) = \frac{1}{\sqrt{2\pi}\,\sigma_E}\, e^{-\frac{(\bar{x} - \langle x \rangle)^2}{2\sigma_E^2}},$$
provided the number of samples $N$ is large and all intrinsic moments (and hence cumulants) of $\rho(x)$ are finite.
If the central limit theorem holds, so that the finite averages are normally distributed as above, then the standard error can be used to define confidence intervals, since the probability $p$ that the deviations in the average, $\bar{x} - \langle x \rangle$, lie within a factor of $c$ times the standard error is given by
$$p = \int_{-c\sigma_E}^{c\sigma_E} d(\bar{x} - \langle x \rangle)\; N(\bar{x}; \langle x \rangle, \sigma_E^2).$$
If $p = 0.95$, defining the 95% confidence interval, then we find that $c = 1.96$, and hence the probability that the intrinsic mean $\langle x \rangle$ lies in the interval $(\bar{x} - 1.96\sigma_E,\, \bar{x} + 1.96\sigma_E)$ is $P(\bar{x} - 1.96\sigma_E \leq \langle x \rangle \leq \bar{x} + 1.96\sigma_E) = 0.95$.

If the data are not normally distributed, then higher cumulants of the probability density may be needed (skewness, kurtosis, ...), and the confidence intervals are then defined in terms of integrals of the probability density.
How do we calculate $\sigma_E$ or $\sigma_E^2$? We defined the variance as
$$\sigma_E^2 = \left\langle \left(\bar{x} - \langle x \rangle\right)^2\right\rangle, \qquad \bar{x} = \frac{1}{N}\sum_i x_i.$$
Expanding the square and using independence,
$$\left\langle \bar{x}^2\right\rangle = \frac{1}{N^2}\left(\sum_i \langle x_i^2\rangle + \sum_i\sum_{j\neq i}\langle x_i\rangle\langle x_j\rangle\right) = \frac{1}{N^2}\left[N\left(\sigma^2 + \langle x\rangle^2\right) + N(N-1)\langle x\rangle^2\right] = \frac{\sigma^2}{N} + \langle x\rangle^2,$$
hence $\sigma_E^2 = \langle\bar{x}^2\rangle - \langle x\rangle^2 = \sigma^2/N$.

However, we don't really know what $\sigma^2$ is, so we must estimate this quantity. One way to do this is to define the estimator $\hat{\sigma}^2$ of the variance:
$$\hat{\sigma}^2 = \frac{1}{N}\sum_i\left(x_i - \bar{x}\right)^2.$$
Using $\langle x_i\bar{x}\rangle = \sigma^2/N + \langle x\rangle^2$ and $\langle\bar{x}^2\rangle = \sigma^2/N + \langle x\rangle^2$, we get
$$\langle\hat{\sigma}^2\rangle = \frac{N-1}{N}\,\sigma^2, \qquad \text{hence} \qquad \sigma^2 = \frac{N}{N-1}\,\langle\hat{\sigma}^2\rangle.$$
The standard error can therefore be estimated by
$$\sigma_E = \sqrt{\frac{\hat{\sigma}^2}{N-1}}.$$

An alternative is the jackknife procedure. Define the set of $N$ partial averages obtained by leaving out one data point at a time,
$$\bar{x}^{(j)} = \frac{1}{N-1}\sum_{i\neq j} x_i.$$
Since $\bar{x}^{(j)} - \bar{x} = (\bar{x} - x_j)/(N-1)$, the spread of the jackknife averages is related to $\hat{\sigma}^2$ by
$$\sum_{j=1}^N\left(\bar{x}^{(j)} - \bar{x}\right)^2 = \frac{N\hat{\sigma}^2}{(N-1)^2}.$$
The standard error can therefore be estimated using the jackknifed set as
$$\sigma_E = \sqrt{\frac{N-1}{N}\sum_{j=1}^N\left(\bar{x}^{(j)} - \bar{x}\right)^2}.$$
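A minimal numerical sketch of the two error estimates above; the data here are synthetic Gaussian draws, an illustrative assumption. For independent data and this simple estimator, the direct and jackknife estimates agree.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.normal(loc=2.0, scale=0.5, size=1000)   # synthetic independent samples
    N = len(x)
    xbar = x.mean()

    # Direct estimate: sigma_E = sqrt(sigma_hat^2 / (N - 1)).
    sigma_hat2 = np.mean((x - xbar)**2)
    err_direct = np.sqrt(sigma_hat2 / (N - 1))

    # Jackknife: leave-one-out averages xbar_j = (sum(x) - x_j) / (N - 1).
    xbar_j = (x.sum() - x) / (N - 1)
    err_jack = np.sqrt((N - 1) / N * np.sum((xbar_j - xbar)**2))

    print(err_direct, err_jack)

The jackknife becomes genuinely useful for nonlinear estimators (ratios, fits), where no simple closed-form error propagation exists.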
3 Applications of the Monte Carlo method

3.1 Quasi-ergodic sampling

The detailed balance condition and an ergodic transition matrix imply that the random-walk Monte-Carlo method correctly generates the distribution of configurations. This says nothing about the rate of convergence, which depends on the implementation (the eigenvalue spectrum of the transition matrix).

Consider a one-dimensional system with a quartic potential $U(x) = x^2(x^2 - 2)$. For $\beta = 10$, the probability density $P(x) \propto e^{-\beta U(x)}$ is bimodal, with small relative probability to be found near $x = 0$.

[Figure: the double-well potential $U(x)$ and the bimodal density $P(x)$, plotted over roughly $-1.5 \leq x \leq 1.5$.]
With the uniform proposal
$$T(x \to y) = \begin{cases} \dfrac{1}{2\Delta x} & \text{if } y \in [x - \Delta x,\, x + \Delta x]\\[4pt] 0 & \text{otherwise,} \end{cases}$$
a random walker started in one well remains trapped there for long times when $\beta\Delta U \gg 1$, so averages converge very slowly: the sampling is quasi-ergodic.
3.2 Umbrella sampling

Idea: sample from a biased distribution $\pi(x) \propto e^{-\beta[U(x) + U_b(x)]}$, where the umbrella potential $U_b(x)$ is chosen to enhance movement through the low-probability regions between modes.
Generate a Markov chain of states $\{x_1, \ldots, x_N\}$ according to $\pi(x)$, so that an estimator of the average is
$$\langle A(x)\rangle = \frac{1}{N}\sum_{i=1}^N A(x_i)\,\frac{P(x_i)}{\pi(x_i)} = \frac{1}{N}\sum_{i=1}^N A(x_i)\, w(x_i) = \frac{1}{N}\sum_{i=1}^N A(x_i)\, e^{\beta U_b(x_i)}.$$
The weight factor $w(x_i) = e^{\beta U_b(x_i)}$ accounts for the bias introduced by the umbrella potential. In this case, it will assign greater weight to the regions around the modes at $x = \pm 1$, since the biased random walk attaches less significance to these regions. The parameter $k$ in the umbrella potential can be adjusted to minimize the statistical uncertainties.

Disadvantage of the umbrella approach: one must know a way to enhance movement between all modes of the system in order to define an effective umbrella potential.
3.3 Overcoming quasi-ergodicity

3.3.1 High-temperature sampling

At high temperatures ($\beta_h \ll 1$), the equilibrium distribution $P_h(x)$ is only weakly bimodal. The transition rate between modes depends exponentially on $\beta\Delta U$, where $\Delta U$ is the barrier height of the potential separating different modes. If $\beta_h$ is small, so that $\beta_h\Delta U \lesssim 1$, then the system moves easily between modes.

One can use the high-temperature distribution $P_h(x)$ as an importance function $\pi(x)$, resulting in a weight function $w(x_i)$ given by
$$w(x_i) = \frac{P(x_i)}{\pi(x_i)} = e^{-(\beta - \beta_h)U(x_i)} = e^{-\Delta\beta\, U(x_i)}.$$
3.3.2 Weighted mixture distributions

One can also sample from a weighted mixture of distributions at a set of parameter values $\{\lambda_i\}$ with adjustable weights $W_i$, giving an estimator of the form
$$\langle A(r^{(N)})\rangle = \frac{1}{n}\sum_{k=1}^n A(r_k^{(N)})\, W_{i_k}^{-1}\, e^{-(1 - \lambda_{i_k})\beta U(r_k^{(N)})},$$
where configuration $k$ was sampled at parameter value $\lambda_{i_k}$ with weight $W_{i_k}$.

Drawbacks:
- One must specify the parameter set $\{\lambda_i\}$ properly to ensure proper movement between modes.
- One must know how to choose the weights $W_k$ for a given set of $\{\lambda_k\}$. This can be done iteratively, but requires a fair amount of computational effort.
3.3.3 Replica exchange

Use an extended state space composed of $m$ replicas of the system to define a Markov chain on $X = (r_1, \ldots, r_m)$, where each $r_i$ is a complete configuration of the system. Design a transition matrix so that the limiting distribution is
$$P(X) = \pi_1(r_1)\cdots\pi_m(r_m).$$
The (statistically independent) individual components $r_i$ of the extended state-space vector can be assigned any weight $\pi_i$. One choice is to use a Boltzmann distribution $\pi_i(r_i) = e^{-\beta_i U(r_i)}$ with inverse temperature $\beta_i$.

The Monte-Carlo process on the extended state space can be carried out as follows:

1. Carry out a fixed number of updates on all replicas, each with a transition matrix $K_i$ that has limit distribution $\pi_i$.

2. Attempt a swap move, in which different components of the extended state-space vector (replicas) are swapped. For example, any pair of components, possibly adjacent to one another, can be selected from the set of all possible pairs with uniform probability. Suppose one picks components 2 and 3, so that the original configuration $X_i$ and proposed configuration $Y_i$ are
$$X_i = \begin{pmatrix} r_1\\ r_2\\ r_3\\ \vdots\\ r_m \end{pmatrix}, \qquad Y_i = \begin{pmatrix} r_1\\ r_3\\ r_2\\ \vdots\\ r_m \end{pmatrix}.$$
The proposed configuration $Y_i$ should be accepted with probability $A(X_i \to Y_i)$ given by
$$A(X_i \to Y_i) = \min\left(1,\, \frac{P(Y_i)}{P(X_i)}\right) = \min\left(1,\, \frac{\pi_2(r_3)\,\pi_3(r_2)}{\pi_2(r_2)\,\pi_3(r_3)}\right).$$
Note that no adjustable weight factors $W_i$ are needed. If $\pi_i = e^{-\beta_i U(r_i)}$, then
$$\frac{\pi_2(r_3)\,\pi_3(r_2)}{\pi_2(r_2)\,\pi_3(r_3)} = \frac{e^{-\beta_2 U(r_3)}\, e^{-\beta_3 U(r_2)}}{e^{-\beta_2 U(r_2)}\, e^{-\beta_3 U(r_3)}} = e^{\Delta\beta\,\Delta U},$$
where $\Delta\beta = \beta_3 - \beta_2$ and $\Delta U = U(r_3) - U(r_2)$.
Each component of $X$ in the Markov chain of extended states $\{X^{(1)}, \ldots, X^{(N)}\}$ is distributed with weight $\pi_i$, so averages over the distribution $\pi_i$ can be estimated from
$$\langle A\rangle_{\pi_i} = \frac{1}{N}\sum_{k=1}^N A(X_i^{(k)}),$$
where $X_i^{(k)}$ is the $i$th component of the extended configuration $X^{(k)}$ (the $k$th configuration in the Markov chain).
Advantages: Quasi-ergodic sampling is mitigated by using a range of parameters such as $\beta_i$. The parameters should be defined such that at their extreme values (such as the highest temperature) the system demonstrates no trapping in single modes.

Disadvantages: There must be some overlap in adjacent densities $\pi_i$ and $\pi_{i+1}$ if a swap move is to be accepted with significant probability. Ideally, the parameters $\beta_i$ should be chosen so that any given labelled configuration spends an equal amount of time at all parameter values. This sometimes requires many replicas, which means that it takes many Monte-Carlo exchange moves for a given configuration to cycle through the parameters.

One of the major advantages of the replica exchange method is the ease with which one can parallelize the algorithm:
- A normal single-chain algorithm cannot be parallelized efficiently because of the sequential nature of the random walk procedure.
- One can parallelize the generation of many trial configurations and use an asymmetric proposal procedure to select one preferentially.
- One can parallelize the computation of the energy, if this is a rate-determining step.
- The exchange frequency between replicas can be optimized to suit the computer architecture.
- Each processor can deal with a single replica, or with sub-sets of replicas.

A short sketch of the replica-exchange loop follows below.
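A minimal replica-exchange sketch for the one-dimensional double well of Section 3.1; the temperature ladder, step sizes, and sweep counts are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(7)
    U = lambda x: x**2 * (x**2 - 2.0)          # double-well potential from Sec. 3.1
    betas = np.array([10.0, 5.0, 2.0, 1.0])    # hypothetical temperature ladder
    x = np.zeros(len(betas))                   # one coordinate per replica

    def metropolis_sweep(x, betas, dx=0.5, nsteps=10):
        for i, beta in enumerate(betas):
            for _ in range(nsteps):
                y = x[i] + dx * (2 * rng.uniform() - 1)
                dU = U(y) - U(x[i])
                if dU <= 0 or rng.uniform() < np.exp(-beta * dU):
                    x[i] = y

    samples = []
    for sweep in range(20_000):
        metropolis_sweep(x, betas)
        j = rng.integers(len(betas) - 1)       # attempt a swap of adjacent replicas j, j+1
        dbeta, dU = betas[j + 1] - betas[j], U(x[j + 1]) - U(x[j])
        if dbeta * dU >= 0 or rng.uniform() < np.exp(dbeta * dU):   # acceptance e^{db dU}
            x[j], x[j + 1] = x[j + 1], x[j]
        samples.append(x[0])                   # record the cold (beta = 10) replica

    print(np.mean(np.array(samples) > 0))      # both wells visited: fraction near 0.5

Without the swap moves, the cold replica would stay in whichever well it started in; with them, it samples both modes.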
4 Molecular dynamics

4.1 Basic integration schemes

4.1.1 General concepts

In molecular dynamics (MD), one solves the classical equations of motion generated by the Hamiltonian
$$H = \frac{P^N\cdot P^N}{2m} + U(R^N),$$
where the potential is typically a sum of pair interactions,
$$U(R^N) = \sum_{(i,j)}\phi(r_{ij}) = \sum_{i=1}^N\sum_{j=1}^{i-1}\phi(r_{ij}),$$
with $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$.
Examples of quantities of interest:

1. Radial distribution function (structural equilibrium property):
$$g(r) = \frac{2V}{N(N-1)}\left\langle \sum_{(i,j)}\delta(\mathbf{r}_i - \mathbf{r}_j - \mathbf{r})\right\rangle,$$
where
$$\langle A\rangle = \frac{\int dX^N\, A(X^N)\, e^{-\beta H(X^N)}}{\int dX^N\, e^{-\beta H(X^N)}} \quad \text{(canonical ensemble)}$$
or
$$\langle A\rangle = \frac{\int dX^N\, A(X^N)\,\delta(E - H(X^N))}{\int dX^N\,\delta(E - H(X^N))} \quad \text{(microcanonical ensemble)}.$$

2. Pressure, via the virial theorem:
$$PV = NkT + \frac{1}{3}\left\langle\sum_{(i,j)}\mathbf{F}_{ij}\cdot\mathbf{r}_{ij}\right\rangle.$$

3. Mean squared displacement (transport property):
$$\left\langle|\mathbf{r}(t) - \mathbf{r}(0)|^2\right\rangle \approx 6Dt \quad \text{for long times } t,$$
where $D$ is the self-diffusion coefficient.

4. Time correlation function (relaxation properties):
$$C(t) = \langle\mathbf{v}(t)\cdot\mathbf{v}(0)\rangle,$$
which is related to $D$ as well:
$$D = \frac{1}{3}\lim_{t\to\infty}\lim_{N,V\to\infty}\int_0^t C(\tau)\, d\tau.$$

If the system is ergodic, then the time average equals the microcanonical average:
$$\lim_{t_{\rm final}\to\infty}\frac{1}{t_{\rm final}}\int_0^{t_{\rm final}} dt\; A(X^N(t)) = \frac{\int dX^N\, A(X^N)\,\delta(E - H(X^N))}{\int dX^N\,\delta(E - H(X^N))}.$$
For large $N$, microcanonical and canonical averages are equal for many quantities $A$. Long times $t_{\rm final}$ are needed!
Consider a general first-order differential equation,
$$\dot{x}(t) = f(x(t), t), \tag{4.1}$$
where we want to solve for the trajectory $x(t)$ numerically, given the initial point $x(0)$ at time $t = 0$. Similar to the case of numerical integration, we restrict ourselves to a discrete set of points, separated by a small time step $\Delta t$:
$$t_n = n\Delta t, \qquad x_n = x(t_n),$$
where $n = 0, 1, 2, 3, \ldots$. To transform equation (4.1) into a closed set of equations for the $x_n$, we need to express the time derivative $\dot{x}$ in terms of the $x_n$. This can only be done approximately. Using that $\Delta t$ is small:
$$\dot{x}(t_n) \approx \frac{x(t_n + \Delta t) - x(t_n)}{\Delta t} = \frac{x_{n+1} - x_n}{\Delta t}. \qquad \text{(Euler scheme)} \tag{4.2}$$
This formula allows one to generate a time series of points which are an approximation to the real trajectory. A simple MD algorithm in pseudo-code could look like this.¹

¹Pseudo-code is an informal description of an algorithm using common control elements found in most programming languages and natural language; it has no exact definition but is intended to make implementation in a high-level programming language straightforward.
EULER ALGORITHM
SET x to the initial value x(0)
SET t to the initial time
WHILE t < tfinal
COMPUTE f(x,t)
UPDATE x to x+f(x,t)*dt
UPDATE t to t+dt
END WHILE
DO NOT USE THIS ALGORITHM!
It is easy to show that the error in the Euler scheme is of order $\Delta t^2$, since
$$x(t + \Delta t) = x(t) + f(x(t), t)\Delta t + \frac{1}{2}\ddot{x}(t)\Delta t^2 + \ldots,$$
so that
$$x_{n+1} = x_n + f(x_n, t_n)\Delta t + \underbrace{O(\Delta t^2)}_{\text{local error}}. \tag{4.3}$$
The strict meaning of the big $O$ notation is that if $A = O(\Delta t^k)$ then $\lim_{\Delta t\to 0} A/\Delta t^k$ is finite and nonzero. For small enough $\Delta t$, a term $O(\Delta t^{k+1})$ becomes smaller than a term $O(\Delta t^k)$, but the big $O$ notation cannot tell us what magnitude of $\Delta t$ is small enough.
A numerical prescription such as (4.3) is called an integration algorithm, integration scheme, or integrator. Equation (4.3) expresses the error after one time step; this is called the local truncation error. What is more relevant is the global error that results after a given physical time $t_f$ of order one. This time requires $M = t_f/\Delta t$ MD steps to be taken. Denoting $f_k = f(x_k, t_k)$, we can track the errors of subsequent time steps as follows:
$$\begin{aligned} x_1 &= x_0 + f_0\Delta t + O(\Delta t^2)\\ x_2 &= \left[x_0 + f_0\Delta t + O(\Delta t^2)\right] + f_1\Delta t + O(\Delta t^2)\\ &= x_0 + (f_0 + f_1)\Delta t + O(\Delta t^2) + O(\Delta t^2)\\ &\;\;\vdots\\ x_M &= x_0 + \sum_{k=1}^M f_{k-1}\Delta t + \sum_{k=1}^M O(\Delta t^2); \end{aligned}$$
but as $M = t_f/\Delta t$:
$$x(t_f) = x_M = x_0 + \sum_{k=1}^{t_f/\Delta t} f_{k-1}\Delta t + \underbrace{\sum_{k=1}^{t_f/\Delta t} O(\Delta t^2)}_{\text{global error } = O(\Delta t)}. \tag{4.4}$$
As a concrete example, consider exponential decay, $\dot{x} = -\lambda x$, with exact solution
$$x(t) = e^{-\lambda t}\, x(0). \tag{4.5}$$
The Euler scheme gives the recursion
$$x_{n+1} = x_n - \lambda x_n\Delta t = (1 - \lambda\Delta t)\, x_n. \tag{4.6}$$
Note that the true relation is $x(t + \Delta t) = e^{-\lambda\Delta t} x(t) = x(t) - \lambda\Delta t\, x(t) + O(\Delta t^2)$, i.e., the local error is of order $\Delta t^2$. Equation (4.6) is solved by
$$x_n = (1 - \lambda\Delta t)^n x_0 = (1 - \lambda\Delta t)^{t/\Delta t} x_0 = e^{[\ln(1 - \lambda\Delta t)/\Delta t]\, t}\, x_0. \tag{4.7}$$
By comparing equations (4.5) and (4.7), we see that the behaviour of the numerical solution is similar to that of the real solution but with the rate $\lambda$ replaced by $\lambda' = -\ln(1 - \lambda\Delta t)/\Delta t$. For small $\Delta t$, one gets for the numerical rate $\lambda' = \lambda + \lambda^2\Delta t/2 + \cdots = \lambda + O(\Delta t)$; thus the global error is seen to be $O(\Delta t)$, which demonstrates that the Euler scheme is a first-order integrator. Note that the numerical rate diverges at $\Delta t = 1/\lambda$, which is an example of a numerical instability.
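A quick numerical check of the first-order global error; the decay rate and step sizes are illustrative choices. Halving $\Delta t$ halves the error at fixed final time.

    import numpy as np

    lam, tf, x0 = 1.0, 1.0, 1.0
    for dt in (0.1, 0.05, 0.025):
        x = x0
        for _ in range(int(round(tf / dt))):
            x += -lam * x * dt                         # Euler step for xdot = -lam x
        print(dt, abs(x - x0 * np.exp(-lam * tf)))     # error ~ O(dt)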
4.1.2 Ingredients of a molecular dynamics simulation

1. Boundary conditions
- We can only simulate finite systems.
- A wall potential would give finite-size effects and destroy translational invariance.
- More benign are periodic boundary conditions: let all particles lie in a simulation box with coordinates between $-L/2$ and $L/2$. A particle which exits the simulation box is put back in at the other end.
- Infinite checkerboard picture (easiest to visualize in two dimensions): the simulation box is surrounded by copies of itself.
- Conversely, for any particle at position $\mathbf{r}'$ not in the simulation box, there is a particle in the simulation box at
$$\mathbf{r} = \begin{pmatrix} \left(x' + \frac{L}{2}\right) \bmod L - \frac{L}{2}\\[4pt] \left(y' + \frac{L}{2}\right) \bmod L - \frac{L}{2}\\[4pt] \left(z' + \frac{L}{2}\right) \bmod L - \frac{L}{2} \end{pmatrix}. \tag{4.8}$$
- Yet another way to view this is to say that the system lives on a torus.
2. Forces
- Usually based on pair potentials. A common pair potential is the Lennard-Jones potential,
$$\phi(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right].$$
- With periodic boundary conditions, each pair interacts in principle with all periodic images:
$$U(R^N) = \sum_{n=1}^N\sum_{m=1}^{n-1}\sum_{(i,j,k)}\phi\!\left(|\mathbf{r}_n - \mathbf{r}_m + iL\hat{\mathbf{x}} + jL\hat{\mathbf{y}} + kL\hat{\mathbf{z}}|\right). \tag{4.9}$$
While this converges for most potentials $\phi$, it is very impractical to have to compute an infinite sum to get the potential and forces.
- To fix this, one can modify the potential such that it becomes zero beyond a certain cut-off distance $r_c$:
$$\phi'(r) = \begin{cases}\phi(r) - \phi(r_c) & \text{if } r < r_c\\ 0 & \text{if } r \geq r_c,\end{cases}$$
where the subtraction of $\phi(r_c)$ avoids discontinuities in the potential, which would cause violations of energy conservation.
- To also avoid discontinuities in derivatives, one can use a scheme such as
$$\phi''(r) = \phi(r)\,\theta(r), \tag{4.10}$$
where
$$\theta(r) = \begin{cases} 1 & r < r_c'\\[4pt] \dfrac{(r_c - r)^2\,(r_c - 3r_c' + 2r)}{(r_c - r_c')^3} & r_c' \leq r \leq r_c\\[4pt] 0 & r > r_c. \end{cases} \tag{4.11}$$

[Figure: the Lennard-Jones potential $\phi$ and its smoothly cut-off version, plotted over roughly $1 \leq r \leq 2.6$.]
- Once the potential is zero beyond a point, the sums over $i$, $j$, and $k$ in equation (4.9) become finite. In fact, if $r_c < L/2$, the sum contains at most one non-zero contribution for each pair of particles $(i, j)$: this pair is either in the same box, or one of the particles is in an adjacent box. For any pair, the correct distance vector can be found from the original distance vector $\mathbf{r} = \mathbf{r}_i - \mathbf{r}_j$ using equation (4.8) (the minimum-image convention); a code sketch follows below.
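A minimal sketch of the minimum-image distance and a truncated-and-shifted Lennard-Jones interaction; the reduced units $\epsilon = \sigma = 1$ and the box size are illustrative assumptions.

    import numpy as np

    L, rc = 10.0, 2.5                    # box size and cutoff (rc < L/2), reduced units

    def minimum_image(dr, L):
        """Map a distance vector to its nearest periodic image, cf. eq. (4.8)."""
        return dr - L * np.round(dr / L)

    def lj_pair(dr):
        """Truncated-and-shifted LJ energy and force on particle i (phi' of the text)."""
        r2 = np.dot(dr, dr)
        if r2 >= rc * rc:
            return 0.0, np.zeros(3)
        inv6 = 1.0 / r2**3
        shift = 4.0 * (rc**-12 - rc**-6)
        u = 4.0 * (inv6 * inv6 - inv6) - shift
        fmag = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2   # (-dphi/dr)/r
        return u, fmag * dr

    ri, rj = np.array([0.5, 0.0, 0.0]), np.array([9.8, 0.0, 0.0])
    u, f = lj_pair(minimum_image(ri - rj, L))
    print(u, f)   # the pair interacts through the periodic boundary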
3. Initial conditions
- The initial conditions are to some extent unimportant if the system naturally tends to equilibrium (ergodicity). Nonetheless, one would not want too extreme an initial configuration.
- Starting the system with the particles on a lattice and drawing initial momenta from a uniform or Gaussian distribution is typically a valid starting point.
- One often makes sure that the kinetic energy has the target value $\frac{3}{2}NkT$, while the total momentum is set to zero to avoid the system moving as a whole.
4. Integration scheme
- Needed to solve the dynamics as given by the equations of motion. Below, we will discuss in detail how to construct or choose an appropriate integration scheme.

5. Equilibration/Burn-in
- Since we do not start the system from an equilibrium state, a certain number of time steps are to be taken until the system has reached equilibrium.
- One can check for equilibrium by seeing whether quantities like the potential energy are no longer changing in any systematic fashion and are just fluctuating around a mean value.
- The equilibrium state is microcanonical at a given total energy $E_{\rm tot} = E_{\rm pot} + E_{\rm kin}$. Since $E_{\rm kin} > 0$, the lattice initialization procedure outlined above cannot reach all possible values of the energy, i.e., $E_{\rm tot} > E_{\rm pot}(\text{lattice})$.
- To reach lower energies, one can periodically rescale the momenta (a rudimentary form of a so-called thermostat).
- Another way to reach equilibrium is to generate initial conditions using the Monte Carlo method.

6. Measurements
- Construct estimators for physical quantities of interest.
- Since there are correlations along the simulated trajectory, one needs to take sample points that are far apart in time. When one is interested in dynamical quantities, however, all points should be used; the correlations should then be taken into account in the statistical analysis.
Given these ingredients, the outline of an MD simulation could look like this:
OUTLINE MD PROGRAM
SETUP INITIAL CONDITIONS
PERFORM EQUILIBRATION by INTEGRATING over a burn-in time B
SET time t to 0
PERFORM first MEASUREMENT
WHILE t < tfinal
INTEGRATE over the measurement interval
PERFORM MEASUREMENT
END WHILE
in which the integration step from t1 to t1+T looks as follows (assuming t=t1 at the start):
OUTLINE INTEGRATION
WHILE t < t1+T
COMPUTE FORCES on all particles
COMPUTE new positions and momenta according to INTEGRATION SCHEME
APPLY PERIODIC BOUNDARY CONDITIONS
UPDATE t to t+dt
END WHILE
Note: we will encounter integration schemes in which the forces need to be computed in
intermediate steps, in which case the separation between force computation and integration
is not as strict as this outline suggests.
4.1.3 Desirable qualities for a molecular dynamics integrator

Accuracy: the trajectory should obey the equations of motion to good approximation. This is a general demand that one would also impose on integrators for general differential equations. The accuracy in principle improves by decreasing the time step $\Delta t$, but because of the exponential separation of nearby trajectories in phase space (Lyapunov instability), this is of limited help. Furthermore, one cannot decrease $\Delta t$ too far in many-particle systems for reasons of

Efficiency: it is typically quite expensive to compute the inter-particle forces $F^N$, and taking smaller time steps $\Delta t$ requires more force evaluations per unit of physical time.
Stability: all integration schemes become numerically unstable for large time steps $\Delta t$, even if they are stable at smaller time steps. A large step may carry the system to a region of large potential energy. With infinite precision, this would just cause a large restoring force that pushes the system back into the low-energy region; but with finite precision, the low energies cannot be resolved anymore, and the system remains in the high-energy state.
The dominance of the force evaluations means that the efficiency of a simulation can be greatly improved by streamlining the evaluation of the forces, using:

1. Cell divisions:
- Divide the simulation box into cells larger than the cutoff $r_c$.
- Make a list of all particles in each cell.
- In the sum over pairs in the force computation, only sum pairs of particles in the same cell or in adjacent cells.
- When "adjacent" is properly defined, this procedure automatically picks out the right periodic image.
- Drawbacks: (1) one needs at least three cells in any direction for this to be of use; (2) one is still summing many pairs that do not interact (corners).

2. Neighbour lists (also called Verlet lists or Verlet neighbour lists):
- Make a list of pairs of particles that are closer than $r_c + \delta r$: these are neighbours.
- Sum over the list of pairs to compute the forces.
- The neighbour list is to be used in subsequent force calculations as long as the list is still valid. Invalidation criterion: a particle has moved more than $\delta r/2$.
- Therefore, before a new force computation, check if any particle has moved more than $\delta r/2$ since the last list-building. If so, rebuild the Verlet list; otherwise use the old one.
- Notes: (1) $\delta r$ needs to be chosen to balance the cost of rebuilding the list against that of considering non-interacting particles; (2) the building of the list may be sped up by using cell divisions.

For large systems, these methods of computing the interaction forces scale as $N$ instead of as $N^2$, as the naive implementation of summing over all pairs would give.
Example: apply the Euler scheme to the one-dimensional harmonic oscillator (mass and frequency set to one),
$$\dot{r} = p, \qquad \dot{p} = -r, \tag{4.12}$$
whose exact solution is a rotation in phase space,
$$\begin{pmatrix} r(t)\\ p(t)\end{pmatrix} = \begin{pmatrix} \cos t & \sin t\\ -\sin t & \cos t\end{pmatrix}\begin{pmatrix} r(0)\\ p(0)\end{pmatrix}. \tag{4.13}$$
The Euler scheme gives
$$\begin{pmatrix} r_{n+1}\\ p_{n+1}\end{pmatrix} = \begin{pmatrix} 1 & \Delta t\\ -\Delta t & 1\end{pmatrix}\begin{pmatrix} r_n\\ p_n\end{pmatrix}. \tag{4.14}$$
The eigenvalues of the matrix on the right-hand side of equation (4.14) are given by $\lambda_\pm = 1 \pm i\Delta t$, and the solution of equation (4.14) can be expressed as
$$\begin{pmatrix} r_n\\ p_n\end{pmatrix} = \begin{pmatrix} 1 & \Delta t\\ -\Delta t & 1\end{pmatrix}^{\!n}\begin{pmatrix} r_0\\ p_0\end{pmatrix} \tag{4.15}$$
$$= \begin{pmatrix} \frac{\lambda_+^n + \lambda_-^n}{2} & \frac{\lambda_+^n - \lambda_-^n}{2i}\\[4pt] -\frac{\lambda_+^n - \lambda_-^n}{2i} & \frac{\lambda_+^n + \lambda_-^n}{2}\end{pmatrix}\begin{pmatrix} r_0\\ p_0\end{pmatrix} \tag{4.16}$$
$$= (\lambda_+\lambda_-)^{n/2}\begin{pmatrix} \frac{e^{i\omega'\Delta t\, n} + e^{-i\omega'\Delta t\, n}}{2} & \frac{e^{i\omega'\Delta t\, n} - e^{-i\omega'\Delta t\, n}}{2i}\\[4pt] -\frac{e^{i\omega'\Delta t\, n} - e^{-i\omega'\Delta t\, n}}{2i} & \frac{e^{i\omega'\Delta t\, n} + e^{-i\omega'\Delta t\, n}}{2}\end{pmatrix}\begin{pmatrix} r_0\\ p_0\end{pmatrix} \tag{4.17}$$
$$= (1 + \Delta t^2)^{n/2}\begin{pmatrix} \cos(\omega' t_n) & \sin(\omega' t_n)\\ -\sin(\omega' t_n) & \cos(\omega' t_n)\end{pmatrix}\begin{pmatrix} r_0\\ p_0\end{pmatrix}, \tag{4.18}$$
where $e^{i\omega'\Delta t} = (\lambda_+/\lambda_-)^{1/2}$ and $t_n = n\Delta t$. By comparing equations (4.13) and (4.18), we see that the behaviour of the numerical solution is similar to that of the real solution but with a different frequency, and with a prefactor which is larger than one and grows with time. Rather than performing a periodic, circular motion in phase space, the Euler scheme produces an outward spiral.

- Accuracy: $\omega' = \omega + O(\Delta t)$, so this is only a first-order integration scheme.
- Time reversal invariant? No.
- Conserves energy? No.
- Conserves angular momentum? No.
- Conserves phase-space volume? No.
- Stable? No.
4.1.4 Verlet scheme

If the equations of motion have the standard form
$$\dot{R}^N = \frac{P^N}{m}, \qquad \dot{P}^N = F^N,$$
then one can exploit the form of these equations to construct (better) integrators. The Verlet scheme is one of these schemes. It can be derived by Taylor expansion:
$$R_{n-1}^N = R^N(t - \Delta t) = R_n^N - P_n^N\frac{\Delta t}{m} + F_n^N\frac{\Delta t^2}{2m} - \dddot{R}^N(t)\frac{\Delta t^3}{6} + O(\Delta t^4),$$
$$R_{n+1}^N = R^N(t + \Delta t) = R_n^N + P_n^N\frac{\Delta t}{m} + F_n^N\frac{\Delta t^2}{2m} + \dddot{R}^N(t)\frac{\Delta t^3}{6} + O(\Delta t^4).$$
Adding these two expansions, the odd-order terms cancel:
$$R_{n-1}^N + R_{n+1}^N = 2R_n^N + F_n^N\frac{\Delta t^2}{m} + O(\Delta t^4),$$
or
$$R_{n+1}^N = 2R_n^N - R_{n-1}^N + F_n^N\frac{\Delta t^2}{m} + O(\Delta t^4). \tag{4.19}$$
- No momenta!
- Requires positions at the previous step!

Simple algorithm (r(i) and rprev(i) are particle i's current and previous position):
VERLET ALGORITHM
SET time t to 0
WHILE t < tfinal
  COMPUTE the forces F(i) on all particles
  FOR each particle i
    COMPUTE new position rnew = 2*r(i)-rprev(i)+F(i)*dt*dt/m
    UPDATE rprev(i) to r(i)
    UPDATE r(i) to rnew
  END FOR
  UPDATE t to t+dt
END WHILE
4.1.5 Leap Frog scheme

The Verlet scheme can be rewritten in terms of half-step momenta, defined as
$$P_{n+1/2}^N = m\,\frac{R_{n+1}^N - R_n^N}{\Delta t}. \tag{4.20}$$
The position update then reads
$$R_{n+1}^N = R_n^N + P_{n+1/2}^N\frac{\Delta t}{m} \quad \text{(which follows from (4.20))},$$
while the momentum update follows from the Verlet relation (4.19):
$$P_{n+1/2}^N = m\,\frac{R_{n+1}^N - R_n^N}{\Delta t} = m\,\frac{R_{n-1}^N + R_{n+1}^N - 2R_n^N}{\Delta t} + m\,\frac{R_n^N - R_{n-1}^N}{\Delta t} = P_{n-1/2}^N + F_n^N\Delta t \quad \text{(as follows from (4.19))}.$$
Summarizing, the Leap Frog scheme is
$$P_{n+1/2}^N = P_{n-1/2}^N + F_n^N\Delta t, \qquad R_{n+1}^N = R_n^N + P_{n+1/2}^N\frac{\Delta t}{m}. \tag{4.21}$$
Since the Leap Frog algorithm is derived from the Verlet scheme, it is equally stable.
The Leap Frog algorithm has the appearance of a first order Taylor expansion, but
because of the half-step momenta, it is third order in positions and second order in
momenta.
Since momenta are defined at different time points than positions, conservation laws
(energy, momentum,. . . ) can still not be checked.
The Leap Frog scheme is easy to implement:
LEAP-FROG ALGORITHM
SET time t to 0
WHILE t < tfinal
COMPUTE the forces F(i) on all particles
FOR each particle i
UPDATE momentum p(i) to p(i)+F(i)*dt
UPDATE position r(i) to r(i)+p(i)*dt/m
END FOR
UPDATE t to t+dt
END WHILE
4.1.6 Momentum (velocity) Verlet scheme

This scheme integrates positions and momenta (or velocities) at the same time points, while keeping the position equivalence with the original Verlet scheme. We define the momenta at time $t = n\Delta t$ as
$$P_n^N = \frac{1}{2}\left(P_{n+1/2}^N + P_{n-1/2}^N\right).$$
Using that the half-step momenta are correct to $O(\Delta t^2)$, we see that this definition is also correct to that order, since
$$\frac{1}{2}\left[P^N\!\left(t + \frac{\Delta t}{2}\right) + P^N\!\left(t - \frac{\Delta t}{2}\right)\right] = \frac{1}{2}\left[P^N(t) + F_n^N\frac{\Delta t}{2} + P^N(t) - F_n^N\frac{\Delta t}{2} + O(\Delta t^2)\right] = P^N(t) + O(\Delta t^2).$$
Using the momentum rule of the Leap Frog algorithm,
$$P_{n+1/2}^N = P_{n-1/2}^N + F_n^N\Delta t,$$
one finds
$$P_{n+1/2}^N = P_n^N + F_n^N\frac{\Delta t}{2}. \tag{4.22}$$
Substituting Eq. (4.22) into the position rule of the Leap Frog gives the position transformation in the momentum-Verlet scheme:
$$R_{n+1}^N = R_n^N + P_n^N\frac{\Delta t}{m} + F_n^N\frac{\Delta t^2}{2m}.$$
The corresponding momentum rule is found using the definition of the momentum and the momentum rule of the Leap Frog:
$$\begin{aligned} P_{n+1}^N &= \frac{1}{2}\left[P_{n+3/2}^N + P_{n+1/2}^N\right]\\ &= \frac{1}{2}\left[P_{n+1/2}^N + F_{n+1}^N\Delta t + P_{n+1/2}^N\right]\\ &= P_{n+1/2}^N + F_{n+1}^N\frac{\Delta t}{2}\\ &= P_n^N + \frac{F_{n+1}^N + F_n^N}{2}\,\Delta t. \end{aligned} \tag{4.23}$$
Summarizing, the momentum Verlet scheme is
$$R_{n+1}^N = R_n^N + \frac{P_n^N}{m}\Delta t + \frac{F_n^N}{2m}\Delta t^2, \qquad P_{n+1}^N = P_n^N + \frac{F_{n+1}^N + F_n^N}{2}\,\Delta t.$$
The momentum rule appears to pose a problem, since $F_{n+1}^N$ is required. But to compute $F_{n+1}^N$, we need only $R_{n+1}^N$, which is computed in the integration step as well. That is, given that the forces are known at step $n$, the next step can be taken by updating the positions first, recomputing the forces, and then updating the momenta. The extra storage step can be avoided by reintroducing the half-step momenta as intermediates. From Eqs. (4.21) (first line), (4.23) and (4.22), one finds
$$\begin{aligned} P_{n+1/2}^N &= P_n^N + \frac{1}{2}F_n^N\Delta t\\ R_{n+1}^N &= R_n^N + P_{n+1/2}^N\frac{\Delta t}{m}\\ P_{n+1}^N &= P_{n+1/2}^N + \frac{1}{2}F_{n+1}^N\Delta t \end{aligned} \tag{4.24}$$
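A minimal implementation of the half-step form (4.24) for a single particle in a one-dimensional potential; the harmonic test force and parameters are illustrative assumptions. The energy stays bounded with no drift, in contrast to the Euler scheme.

    import numpy as np

    def velocity_verlet(force, r, p, m, dt, nsteps):
        """Momentum (velocity) Verlet, half-step form of eq. (4.24)."""
        f = force(r)
        traj = []
        for _ in range(nsteps):
            p += 0.5 * f * dt          # half kick
            r += p * dt / m            # drift
            f = force(r)               # recompute the force at the new position
            p += 0.5 * f * dt          # half kick
            traj.append((r, p))
        return np.array(traj)

    # Harmonic oscillator test: force = -r, m = 1; the exact period is 2*pi.
    traj = velocity_verlet(lambda r: -r, r=1.0, p=0.0, m=1.0, dt=0.05, nsteps=2000)
    energy = 0.5 * traj[:, 1]**2 + 0.5 * traj[:, 0]**2
    print(energy.min(), energy.max())   # stays near 0.5, no drift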
4.2 Symplectic integrators and splitting schemes

The formal solution of the equations of motion can be written as
$$X^N(t) = e^{Lt}\, X^N(0), \tag{4.25}$$
where the Liouville operator $L$ acting on a phase-space function $A$ was defined in terms of the Poisson bracket as
$$LA = \{A, H\} = \frac{\partial A}{\partial R^N}\cdot\frac{\partial H}{\partial P^N} - \frac{\partial H}{\partial R^N}\cdot\frac{\partial A}{\partial P^N} = \left(J\,\frac{\partial H}{\partial X^N}\right)^{\!T}\frac{\partial A}{\partial X^N}, \tag{4.26}$$
where $J$ is the antisymmetric symplectic matrix
$$J = \begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix}$$
in the $(R^N, P^N)$ block notation, so that the equations of motion read
$$\dot{X}^N = J\,\frac{\partial H}{\partial X^N}, \tag{4.27}$$
i.e., $\dot{R}^N = \partial H/\partial P^N$ and $\dot{P}^N = F^N = -\partial U/\partial R^N$.
Two key properties follow from the antisymmetry of $J$. First, the phase-space flow is divergence-free:
$$\frac{\partial}{\partial X^N}\cdot\dot{X}^N = \frac{\partial}{\partial X^N}\cdot\left(J\,\frac{\partial H}{\partial X^N}\right) = J : \frac{\partial^2 H}{\partial X^N\,\partial X^N} = 0,$$
since $J$ is an antisymmetric matrix and $\frac{\partial^2 H}{\partial X^N\partial X^N}$ is symmetric. Second, the energy is conserved:
$$\frac{dH}{dt} = \frac{\partial H}{\partial X^N}\cdot\dot{X}^N = \left(\frac{\partial H}{\partial X^N}\right)^{\!T} J\,\frac{\partial H}{\partial X^N} = 0,$$
again using the antisymmetry of $J$.
Time reversal is implemented by the matrix
$$T = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix},$$
which leaves positions unchanged and flips momenta, $T X^N = (R^N, -P^N)$, and acts on derivatives as
$$\frac{\partial}{\partial X^N}\, A(T X^N) = T\left.\frac{\partial A}{\partial X^N}\right|_{T X^N}.$$
One easily shows that the $T$ matrix and the symplectic matrix anti-commute:
$$TJ = -JT.$$
This property, when combined with a $T$-invariant Hamiltonian, implies time-reversal, since
$$TLT = T\left(J\,\frac{\partial H}{\partial X^N}\right)^{\!T}\frac{\partial}{\partial X^N}\, T = \left(TJT\,\frac{\partial H}{\partial X^N}\right)^{\!T}\frac{\partial}{\partial X^N} = \left(-J\,\frac{\partial H}{\partial X^N}\right)^{\!T}\frac{\partial}{\partial X^N} = -L, \qquad \text{q.e.d.}$$
(using $T^T T = 1$ and the $T$-invariance of $H$).
The idea is now to construct symplectic integrators such that, by construction, they conserve phase-space volume, conserve angular and linear momentum when applicable, and are time-reversible.

Remember that the formal solution of the equations of motion in equation (4.25) is
$$X^N(t) = e^{Lt}\, X^N(0),$$
but this exponent can only be evaluated explicitly for exceptional forms of $L$, such as for
- free particles (e.g., an ideal gas),
- systems with harmonic forces (e.g., particles harmonically bound on a lattice), and
- free rigid bodies.

Thus, for a Hamiltonian without the potential in (4.26), which is just the kinetic energy
$$K = \frac{P^N\cdot P^N}{2m},$$
the corresponding Liouville operator is
$$L_K = \left(J\,\frac{\partial K}{\partial X^N}\right)^{\!T}\frac{\partial}{\partial X^N} = \frac{P^N}{m}\cdot\frac{\partial}{\partial R^N},$$
whose exponential can be evaluated explicitly:
$$e^{L_K t}\, A(R^N, P^N) = A\!\left(R^N + \frac{P^N}{m}t,\; P^N\right).$$
We will call this a free propagation over a time $t$. Similarly, for the potential part,
$$L_U = \left(J\,\frac{\partial U}{\partial X^N}\right)^{\!T}\frac{\partial}{\partial X^N} = F^N\cdot\frac{\partial}{\partial P^N},$$
one can also evaluate the exponential:
$$e^{L_U t}\, A(R^N, P^N) = A\!\left(R^N,\; P^N + F^N t\right).$$
We will call this a force propagation over a time $t$.
Although we can solve the exponentials of the operators $L_K$ and $L_U$ separately, we cannot simply exponentiate their sum, because
$$e^{X+Y} \neq e^X e^Y$$
when $X$ and $Y$ are non-commuting operators. The operators $L_K$ and $L_U$ do not commute, since
$$[L_K, L_U] = L_K L_U - L_U L_K = \frac{P^N}{m}\cdot\left(\frac{\partial F^N}{\partial R^N}\cdot\frac{\partial}{\partial P^N}\right) - \frac{F^N}{m}\cdot\frac{\partial}{\partial R^N} \neq 0.$$
We can, however, use the Baker-Campbell-Hausdorff (BCH) formula,
$$e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] + \frac{1}{12}[Y,[Y,X]] + \cdots}, \tag{4.28}$$
after splitting the total propagation into short steps,
$$e^{Lt} = \left[e^{L\Delta t}\right]^{t/\Delta t}. \tag{4.29}$$
Now let $X = L_U\Delta t$ and $Y = L_K\Delta t$; then $[X, Y] = O(\Delta t^2)$, $[X, [X, Y]] = O(\Delta t^3)$, etc. Denoting all repeated commutators by "$\ldots$", the BCH formula gives
$$e^{L_U\Delta t}\, e^{L_K\Delta t} = e^{(L_U + L_K)\Delta t + O(\Delta t^2)}. \tag{4.30}$$
Using this in Eq. (4.29) would give a scheme in which alternating force and free propagation is performed over small time steps $\Delta t$. The accumulated error of this scheme is $O(\Delta t)$, but the Verlet scheme was $O(\Delta t^2)$.
The missing ingredient is time reversal. While the true propagation satisfies time-reversal invariance,
$$T\, e^{L_U t + L_K t}\, T = \left[e^{L_U t + L_K t}\right]^{-1},$$
the approximate evolution has, due to the time-reversal property of the $L_U\Delta t$ and $L_K\Delta t$ operators separately,
$$T\, e^{L_U\Delta t} e^{L_K\Delta t}\, T = T e^{L_U\Delta t} T\; T e^{L_K\Delta t} T = e^{-L_U\Delta t}\, e^{-L_K\Delta t},$$
which is not equal to the inverse
$$\left[e^{L_U\Delta t}\, e^{L_K\Delta t}\right]^{-1} = e^{-L_K\Delta t}\, e^{-L_U\Delta t}.$$
Time reversal can be restored by taking the symmetric splitting
$$e^{L_U\Delta t + L_K\Delta t} = e^{L_U\Delta t/2}\, e^{L_K\Delta t}\, e^{L_U\Delta t/2} + \ldots, \tag{4.31}$$
since
$$T\, e^{L_U\Delta t/2} e^{L_K\Delta t} e^{L_U\Delta t/2}\, T = e^{-L_U\Delta t/2}\, e^{-L_K\Delta t}\, e^{-L_U\Delta t/2} = \left[e^{L_U\Delta t/2}\, e^{L_K\Delta t}\, e^{L_U\Delta t/2}\right]^{-1}.$$
Expanding with the BCH formula, the symmetric product differs from $e^{(L_U + L_K)\Delta t}$ only by repeated commutators,
$$e^{L_U\Delta t/2}\, e^{L_K\Delta t}\, e^{L_U\Delta t/2} = e^{(L_U + L_K)\Delta t + \ldots},$$
and the "$\ldots$" terms are $O(\Delta t^3)$, so this scheme has a third-order local error and consequently a second-order global error. The scheme in equation (4.31) is the momentum Verlet scheme that we had derived before; see equation (4.24). It first performs a half force propagation, then a whole free propagation, and then another half force propagation.
The splitting method is more general than what we have just derived. One does not have to split the Hamiltonian into the kinetic and potential part; all that is required is that the separate sub-Hamiltonians can be explicitly exponentiated. The general construction of a splitting scheme is as follows:

1. Split the Hamiltonian $H$ up into $n$ partial Hamiltonians $H_1, H_2, \ldots, H_n$:
$$H = \sum_{j=1}^n H_j.$$
In more advanced splitting schemes, one may in addition define auxiliary Hamiltonians $H_{j>n}$ which do not enter in $H$.

2. Associate with each partial Hamiltonian $H_j$ a partial Liouville operator
$$L_j = L_{H_j} = \left(J\,\frac{\partial H_j}{\partial X^N}\right)^{\!T}\frac{\partial}{\partial X^N}, \qquad L = \sum_{j=1}^n L_j.$$
3. Split up the exponentiated Liouvillean into $S$ factors:
$$e^{Lt} = e^{\sum_{j=1}^n L_j t} \approx \prod_{s=1}^S e^{L_{j_s} t_s}, \tag{4.32}$$
where the product is taken left-to-right, such that the first factor on the left is $s = 1$ while the last on the right is $s = S$.

4. Since each Liouvillean is multiplied by a total time interval $t$ in the original exponent, the separate time intervals for each Liouvillean $L_{j'}$ have to add up to $t$ as well, at least up to the order $k$ of the scheme:
$$\sum_{\substack{s=1\\ j_s = j'}}^S t_s = t + O(\Delta t^{k+1}) \qquad \text{for each } j' = 1 \ldots n.$$
For auxiliary Hamiltonians $H_{j>n}$, their combined time steps should be zero up to the order of the scheme:
$$\sum_{\substack{s=1\\ j_s = j'}}^S t_s = 0 + O(\Delta t^{k+1}) \qquad \text{for } j' > n.$$
5. Take $t_{S+1-s} = t_s$ and $j_{S+1-s} = j_s$ in order to have time-reversal invariance (we will assume that the $H_j$ are invariant under time reversal).

6. Use the BCH formula to adjust the $t_s$ further, such that
$$e^{Lt} = \prod_{s=1}^S e^{L_{j_s} t_s} + O(\Delta t^{k+1}). \tag{4.33}$$

With the time-symmetric choice of step 5, the scheme is manifestly reversible, since
$$T\left[\prod_{s=1}^S e^{L_{j_s} t_s}\right] T = \prod_{s=1}^S e^{-L_{j_s} t_s} = \left[\prod_{s=S}^1 e^{L_{j_s} t_s}\right]^{-1} = \left[\prod_{s=1}^S e^{L_{j_s} t_s}\right]^{-1}.$$
Two general properties follow:
- The scheme is of even order. Reason: the scheme is time-reversible, so each error term found by applying the BCH formula must also be time-reversible, i.e., each term $X$ in the exponent satisfies $TXT = -X$. Thus, the error terms are odd in the partial Liouvilleans. Since each Liouvillean comes with a factor of $\Delta t$, the local error terms are odd in $\Delta t$, and the resulting global error is then even in $\Delta t$.
- If the full Hamiltonian conserves a quantity $Q$, i.e., $\{H, Q\} = 0$, and if each partial Hamiltonian $H_j$ also satisfies $\{H_j, Q\} = 0$, then the quantity $Q$ is conserved in each step in equation (4.32), and thus exactly conserved in the integration scheme.
4.3 The shadow Hamiltonian
Consider now the Verlet splitting scheme, writing for brevity $X = L_U\Delta t/2$ and $Y = L_K\Delta t$, and using the BCH formula to one order further than before:
$$e^X e^Y e^X = e^{2X + Y + \frac{1}{6}[Y,[Y,X]] - \frac{1}{6}[X,[X,Y]] + \ldots},$$
which, in terms of the Liouville operators, reads
$$e^{L_U\Delta t/2}\, e^{L_K\Delta t}\, e^{L_U\Delta t/2} = e^{L_H\Delta t - \frac{1}{24}[L_U,[L_U,L_K]]\Delta t^3 + \frac{1}{12}[L_K,[L_K,L_U]]\Delta t^3 + O(\Delta t^5)} \tag{4.34}$$
$$= e^{\left\{L_H - \frac{1}{24}[L_U,[L_U,L_K]]\Delta t^2 + \frac{1}{12}[L_K,[L_K,L_U]]\Delta t^2 + O(\Delta t^4)\right\}\Delta t}.$$
This is the evolution belonging to the following operator:
$$\begin{aligned} L_{\rm shadow} &= L_H - \frac{1}{24}[L_U, [L_U, L_K]]\Delta t^2 + \frac{1}{12}[L_K, [L_K, L_U]]\Delta t^2 + O(\Delta t^4)\\ &= L_H - \frac{1}{24}[L_U, L_{\{K,U\}}]\Delta t^2 + \frac{1}{12}[L_K, L_{\{U,K\}}]\Delta t^2 + O(\Delta t^4)\\ &= L_H - \frac{1}{24}L_{\{\{K,U\},U\}}\Delta t^2 + \frac{1}{12}L_{\{\{U,K\},K\}}\Delta t^2 + O(\Delta t^4)\\ &= L_{H_{\rm pseudo}}, \end{aligned}$$
where the pseudo-Hamiltonian or shadow Hamiltonian is
$$H_{\rm pseudo} = H - \frac{1}{24}\{\{K, U\}, U\}\Delta t^2 + \frac{1}{12}\{\{U, K\}, K\}\Delta t^2 + O(\Delta t^4).$$
If $H$ is of the form $|P^N|^2/(2m) + U(R^N)$, one has
$$\{\{K, U\}, U\} = \frac{\partial^2 K}{\partial P^N\partial P^N} : \frac{\partial U}{\partial R^N}\frac{\partial U}{\partial R^N} = \frac{1}{m}\frac{\partial U}{\partial R^N}\cdot\frac{\partial U}{\partial R^N} = \frac{1}{m}\left|F^N\right|^2, \tag{4.35}$$
$$\{\{U, K\}, K\} = \frac{\partial K}{\partial P^N}\cdot\frac{\partial^2 U}{\partial R^N\partial R^N}\cdot\frac{\partial K}{\partial P^N} = \frac{1}{m^2}\, P^N\cdot\frac{\partial^2 U}{\partial R^N\partial R^N}\cdot P^N, \tag{4.36}$$
so that
$$H_{\rm pseudo} = H - \frac{\Delta t^2}{24m}\frac{\partial U}{\partial R^N}\cdot\frac{\partial U}{\partial R^N} + \frac{\Delta t^2}{12m^2}\, P^N\cdot\frac{\partial^2 U}{\partial R^N\partial R^N}\cdot P^N + O(\Delta t^4). \tag{4.37}$$
The leading correction to the scheme is of Hamiltonian form. In fact, the full shadow Hamiltonian is a series in powers of the time step,
$$H_{\rm pseudo} = \sum_{h=0}^\infty H_h\,\Delta t^h, \tag{4.38}$$
with $H_0 = H$ and, for time-reversible schemes, vanishing odd-order terms.
4.4 Instabilities at large time steps

Despite the theoretical prediction that splitting schemes should be stable, one sees in practice that large time steps still lead to instabilities. To understand this, we will apply the splitting method to a harmonic oscillator, with Hamiltonian
$$H = \frac{1}{2}p^2 + \frac{1}{2}r^2.$$
Note: mass and frequency have been set to 1. Split the Hamiltonian into a kinetic part $K = \frac{1}{2}p^2$ and a potential part $U = \frac{1}{2}r^2$, and use the Verlet scheme
$$e^{L_H\Delta t} = e^{L_U\Delta t/2}\, e^{L_K\Delta t}\, e^{L_U\Delta t/2} + O(\Delta t^3).$$
Since $L_K = (J\,\partial K/\partial x)^T\,\partial/\partial x = p\,\partial/\partial r$, one has
$$e^{L_K\Delta t}\, x = \begin{pmatrix} r + p\Delta t\\ p\end{pmatrix} = \begin{pmatrix} 1 & \Delta t\\ 0 & 1\end{pmatrix} x, \qquad x = \begin{pmatrix} r\\ p\end{pmatrix},$$
so the operator $e^{L_K\Delta t}$ acts as a linear operator. Similarly, $L_U = (J\,\partial U/\partial x)^T\,\partial/\partial x = -r\,\partial/\partial p$, and
$$e^{L_U\Delta t/2}\, x = \begin{pmatrix} r\\ p - \frac{1}{2}r\Delta t\end{pmatrix} = \begin{pmatrix} 1 & 0\\ -\frac{1}{2}\Delta t & 1\end{pmatrix} x.$$
The full Verlet step is therefore the linear map
$$e^{L_U\Delta t/2}\, e^{L_K\Delta t}\, e^{L_U\Delta t/2}\, x = \begin{pmatrix} 1 - \frac{\Delta t^2}{2} & \Delta t\\[4pt] -\Delta t + \frac{\Delta t^3}{4} & 1 - \frac{\Delta t^2}{2}\end{pmatrix} x. \tag{4.39}$$
We saw above that a splitting method conserves a pseudo-Hamiltonian, which is composed of the original Hamiltonian plus repeated Poisson brackets of the partial Hamiltonians. To leading order, we had
$$H_{\rm pseudo} = H - \frac{1}{24}\{\{K, U\}, U\}\Delta t^2 + \frac{1}{12}\{\{U, K\}, K\}\Delta t^2 + \ldots,$$
which for the harmonic oscillator again takes the form of a harmonic oscillator, with renormalized mass and frequency:
$$H_{\rm pseudo} = \frac{p^2}{2\tilde{m}} + \frac{1}{2}\tilde{m}\tilde{\omega}^2 r^2 = x^T\mathcal{H}\, x, \qquad \mathcal{H} = \begin{pmatrix} \frac{1}{2}\tilde{m}\tilde{\omega}^2 & 0\\ 0 & \frac{1}{2\tilde{m}}\end{pmatrix}.$$
The corresponding Liouville operator is again linear in $x$:
$$L_{H_{\rm pseudo}}\, x = J\,\frac{\partial H_{\rm pseudo}}{\partial x} = \underbrace{2J\mathcal{H}}_{\mathcal{L}}\, x.$$
78
4. MOLECULAR DYNAMICS
The solutions of the equations of motion are determined by eLt , which is found to be
cos(t)
m sin(t)
Lt
e
=
1
m
sin(t)
cos(t)
This is not a surprising result, since it is just the solution of the classical harmonic
oscillator.
This result should coincide with the form on the right-hand side of equation (4.39).
We can therefore identify the renormalized frequency and mass in the splitting scheme:
1
1
arccos(1 t2 )
t
2
t
m=
sin(t)
=
(4.40)
(4.41)
[Figure 4.1: The way the instability limit is approached when using the Verlet splitting scheme on the harmonic oscillator. Plotted are the renormalized frequency $\tilde{\omega}$ and mass $\tilde{m}$ as functions of $\Delta t$; the unrenormalized values were 1.]

For the harmonic oscillator, the radius of convergence is clearly $\Delta t = 2$, where $\arccos(1 - \Delta t^2/2)$ ceases to be real. Also for a general system, one expects a radius of convergence, i.e., a time step beyond which the pseudo-Hamiltonian becomes unbounded and the simulation becomes unstable.
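The renormalization in eq. (4.40) is easy to verify numerically: diagonalize the one-step Verlet map and compare the phase of its eigenvalues with $\arccos(1 - \Delta t^2/2)/\Delta t$. A minimal sketch; the step sizes are illustrative choices.

    import numpy as np

    for dt in (0.1, 0.5, 1.0, 1.9):
        # One Verlet step for H = p^2/2 + r^2/2 as the linear map of eq. (4.39).
        M = np.array([[1 - dt**2 / 2, dt],
                      [-dt + dt**3 / 4, 1 - dt**2 / 2]])
        # Eigenvalues are exp(+-i w dt); their phase gives the renormalized frequency.
        w_measured = abs(np.angle(np.linalg.eigvals(M)[0])) / dt
        w_formula = np.arccos(1 - dt**2 / 2) / dt
        print(dt, w_measured, w_formula)   # agree, and diverge as dt -> 2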
4.5 Higher-order schemes

Higher-order schemes give better accuracy, but at the price of performing more force computations. They are very important in contexts such as astrophysics, where accurate trajectories matter. For MD, depending on the level of accuracy required, higher-order schemes can also be advantageous. Higher-order schemes may be devised by
- taking time points which are not evenly spaced,
- incorporating leading-order correction terms in the pseudo-Hamiltonian,
- incorporating more time points,
- or a combination of the above.

All but the first option lead to more force computations per unit of physical time, which decreases the efficiency. We will restrict ourselves to symmetric splitting schemes, to ensure time-reversibility. To facilitate many of the derivations, we restate the symmetrized BCH formula of equation (4.34) in the form
$$e^X e^Y e^X = e^{2X + Y + \frac{1}{6}[Y,[Y,X]] - \frac{1}{6}[X,[X,Y]] + \ldots}, \tag{4.42}$$
where "$\ldots$" denotes higher-order repeated commutators.
4.5.1 Optimized schemes

Consider the five-fold symmetric splitting with a free parameter $\zeta$,
$$e^{L_H\Delta t} \approx e^{\zeta L_U\Delta t}\, e^{L_K\Delta t/2}\, e^{(1-2\zeta)L_U\Delta t}\, e^{L_K\Delta t/2}\, e^{\zeta L_U\Delta t}. \tag{4.43}$$
Applying the symmetrized BCH formula (4.42) twice gives
$$e^{\zeta L_U\Delta t}\, e^{L_K\Delta t/2}\, e^{(1-2\zeta)L_U\Delta t}\, e^{L_K\Delta t/2}\, e^{\zeta L_U\Delta t} = e^{L_H\Delta t + \Delta t^3\left(\epsilon_1 [L_K,[L_K,L_U]] + \epsilon_2 [L_U,[L_U,L_K]]\right) + O(\Delta t^5)},$$
where
$$\epsilon_1 = \frac{6\zeta - 1}{24}, \qquad \epsilon_2 = \frac{1 - 6\zeta + 6\zeta^2}{12}.$$
If both $\epsilon_1$ and $\epsilon_2$ were zero, this would give a fourth-order scheme. However, this is not possible here: we have one parameter and two prefactors to set to zero. Alternatively, one could make the error as small as possible, e.g., by minimizing $\epsilon_1^2 + \epsilon_2^2$. This gives
$$\zeta = 0.1931833275037836\ldots \tag{4.44}$$
The scheme in equation (4.43) that uses this value of $\zeta$ is called the Higher-order Optimized Advanced scheme of second order, or HOA2 for short. In general, optimized schemes are based on minimizing the formal error, but this cannot guarantee that the actual error is small: numerical tests are necessary.
The HOA2 scheme requires two force computations per time step,
$$e^{\zeta L_U\Delta t}\, e^{L_K\Delta t/2}\, e^{(1-2\zeta)L_U\Delta t}\, e^{L_K\Delta t/2}\, e^{\zeta L_U\Delta t}, \tag{4.45}$$
since the final force propagation of one step can be merged with the first force propagation of the next, leaving only two factors per step that require a new force evaluation. For more accurate simulations, the HOA2 scheme is more efficient than the Verlet scheme, by about 50%, until the HOA2 scheme becomes unstable. The higher instability of HOA2 at large time steps is due to the uneven time steps that are taken: the average time step determines the computational cost, but the largest of the time steps determines the stability. Using uneven time steps instead of even time steps (at fixed computational cost) therefore increases some of the time intervals and decreases others, so schemes with uneven time steps become unstable at smaller time steps than even-time-step variants.
4.5.2 Composition schemes

While using the HOA2 scheme is beneficial, this could only be known after numerical tests. True higher-order schemes are a bit better in that respect: one knows that, at least for small enough $\Delta t$, they are more efficient. The simplest way to get a higher-order scheme is to concatenate lower-order schemes. To understand this, note that if we have a $k$-th order scheme $S_k(\Delta t)$ approximating $S(\Delta t) = e^{L\Delta t}$ up to $O(\Delta t^{k+1})$, i.e., if
$$S_k(\Delta t) = S(\Delta t) + \Delta t^{k+1} S' + O(\Delta t^{k+3}), \tag{4.46}$$
then
$$S_k(s)\, S_k(\Delta t - 2s)\, S_k(s) = S(\Delta t) + \left[2s^{k+1} + (\Delta t - 2s)^{k+1}\right] S' + O(\Delta t^{k+3}). \tag{4.47}$$
The leading error vanishes if
$$2s^{k+1} + (\Delta t - 2s)^{k+1} = 0, \tag{4.48}$$
which is solved by
$$s = \frac{\Delta t}{2 - 2^{1/(k+1)}}. \tag{4.49}$$
When $S_k = S_2$ is given by the second-order Verlet scheme, the corresponding fourth-order scheme becomes
$$e^{L\Delta t} \approx e^{L_U s/2} e^{L_K s} e^{L_U s/2}\; e^{L_U(\Delta t/2 - s)} e^{L_K(\Delta t - 2s)} e^{L_U(\Delta t/2 - s)}\; e^{L_U s/2} e^{L_K s} e^{L_U s/2}$$
$$= e^{L_U s/2}\, e^{L_K s}\, e^{L_U(\Delta t - s)/2}\, e^{L_K(\Delta t - 2s)}\, e^{L_U(\Delta t - s)/2}\, e^{L_K s}\, e^{L_U s/2}. \tag{4.50}$$
This is called the fourth-order Forest-Ruth integration scheme (FR4). Note that it has seven parts and requires three force evaluations per time step. Note also that $s > \Delta t/2$, so that $\Delta t - 2s < 0$: the scheme requires taking a negative time step, which tends to lead to instabilities. One can prove that order-$k$ splitting schemes using a two-operator split-up (such as $L_U$ and $L_K$) must have at least one negative time step if $k > 2$, so the negative steps seem unavoidable. One can minimize them, however, by allowing more general splitting schemes than those constructed from equation (4.46), i.e., using the general form in equation (4.32). This gives more parameters, allowing one to combine the higher-order nature with optimization, i.e., minimizing the leading error terms (cf. the HOA2 scheme). A good fourth-order scheme of this type is called EFRL4 (Extended Forest-Ruth-Like fourth-order scheme) and looks like this:
$$e^{L\Delta t} = e^{\xi L_U\Delta t}\, e^{(\frac{1}{2}-\lambda)L_K\Delta t}\, e^{\chi L_U\Delta t}\, e^{\lambda L_K\Delta t}\, e^{(1-2\chi-2\xi)L_U\Delta t}\, e^{\lambda L_K\Delta t}\, e^{\chi L_U\Delta t}\, e^{(\frac{1}{2}-\lambda)L_K\Delta t}\, e^{\xi L_U\Delta t} + O(\Delta t^5), \tag{4.51}$$
where
$$\xi = 0.3281827559886160, \qquad \lambda = 0.6563655119772320, \qquad \chi = 0.09372690852966102.$$
Even though this requires four force evaluations for each time step, it is more efficient than the FR4 scheme due to a much smaller leading-order error.
84
4. MOLECULAR DYNAMICS
4.5.3
Gradients are, in this context, derivatives and higher-order derivatives of the potential.
Using gradients as auxiliary Hamiltonians, one can reduce the order of a scheme
The simplest can be derived by considering once more the five-fold splitting scheme,
before optimization:
3 (
5
1 [LK ,[LK ,LU ]]+2 [LU ,[LU ,LK ]])+O(t )
3
5
= eLH t+t (1 L{{U,K},K} +2 L{{K,U },U } )+O(t )
where
6 1
24
1 6 + 6 2
2 =
12
1 =
and let K and U be of the usual form such that (cf. equations (4.35) and (4.36))
2
U
2K
U
1 U
1
{{K, U }, U } =
=
= |F N |2
N
N
N
N
N
R
P P
R
m R
m
2
2
U
K
1 N
U
K
=
P
PN.
{{U, K}, K} =
P N RN RN P N
m2
RN RN
The former depends only on RN , but the latter is more complicated and depends on
both RN and P N .
We can eliminate the more complicated term by setting 1 = 0 = 1/6, leaving us
with
1
1
L N 2 +O(t5 )
LH t+t3 72m
|F |
e 6 LU t e 2 LK t e 3 LU t e 2 LK t e 6 LU t = e
LK +L
=e
t2 |F N |2
U + 72m
t+O(t5 )
(4.52)
85
giving
e
1
L t
6 U
1
L t
2 K
2
L t
3 U
1
L t
2 K
1
L t
6 U
LK +L
t2 |F N |2
U + 72m
=e
t+O(t5 )
5)
= e 6 LU t e 2 LK t e 3 [LU + 2 U ]t e 2 LK t e 6 LU t
1
= e 6 LU t e 2 LK t e 3 [LU ]t e 2 LK t e 6 LU t
(4.53)
(4.54)
4.5.4
Decompose the potential and the corresponding Liouville operator into two parts: one
for fast varying forces and the other for the slow varying forces:
U = Uf + Us
(4.55)
The fast motion could represent e.g. , intermolecular vibrations while the slowly varying
forces might be intermolecular forces.
86
4. MOLECULAR DYNAMICS
The simplest multiple time-step algorithm is then
1
M 1
1
1
3
e 2 LUs t e 2 LUf t/M eLK t/M e 2 LUf t/M
e 2 LUs t = eLt+O(t ) .
(4.56)
While this is of second order, like the Verlet scheme, the time step for the fast part of
the motion if M times smaller than that of the slow motion.
5
Advanced topics
5.1
5.1.1
One drawback of traditional Monte-Carlo simulation methods is that typically the method of
generating trial configurations based on a probability T(x y) results in trial configurations
y that are highly correlated with the initial configuration x, with only small differences
between the configurations.
As an example, consider a dense liquid system where one could generate trial configurations by randomly selecting one of the particles in the fluid and giving the particle a
random displacement.
Very inefficient since particles are close to one another on average in a dense fluid,
so the trial configuration has a high probability of either overlapping with another
particle (hard spheres) or being in the strongly repulsive region of the interaction
potential of another particle, resulting in a configuration of high potential energy.
Molecular dynamics, in contrast, moves all particles in a cooperative fashion, so
that the potential energy undergoes only small fluctuations.
It is possible to try to move a group or cluster of particles cooperatively using
tricks, but this requires some cleverness.
Can we devise a dynamical procedure of generating a trial configuration in which a large
number of degrees of freedom are moved cooperatively in a energetically reasonable
fashion?
Suppose we use a dynamical procedure to change a set of coordinates for use as a trial
configuration in a Monte-Carlo procedure. See S. Duane, A.D. Kennedy, B.J. Pendleton and
D. Roweth, Phys. Lett. B 45, 216 (1987).
87
88
5. ADVANCED TOPICS
We require a momentum coordinate p conjugate to each spatial degree of freedom x
we wish to evolve dynamically.
Suppose the evolution is a one-to-one mapping (i.e. deterministic evolution) of the
initial phase point (x0 , p0 ) to another phase point (xt , pt )
g t (x0 , p0 ) = (x0 (t), p0 (t)) = (xt , pt ).
(5.1)
89
Claim: The transition probability Eq. (5.3) satisfies the stationarity condition Eq. (5.2)
provided
1. The dynamics is symplectic, so that the mapping is volume-preserving and timereversible.
2. The probability density for the conjugate momenta satisfies m (p) = m (T p) =
m (p), where T is the momentum inversion operator T f (p) = f (p).
Proof. To show Eq. (5.2) is satisfied, we must compute K(xt x0 ) under the dynamical
procedure. Suppose the dynamical mapping is symplectic and hence reversible. The
mapping g t therefore satisfies g t = T g t T and hence if g t (x0 , p0 ) = (x0 (t), p0 (t)) =
(xt , pt ), we have T g t T (xt , pt ) = (x0 , p0 ) and hence g t (xt , T pt ) = (x0 , T p0 ). From the
definition of the transition probability, we have
Z
K(xt x0 ) =
(5.4)
whereas
Z
90
5. ADVANCED TOPICS
Changing the variables of integration from (xt , pt ) to (x0 , p0 ) = T g t (xt , T pt ) gives
Z
Z
dxt P (xt ) K(xt x0 ) =
Z
=
Z
=
dx0 dp0 J((xt , T pt ); (x0 , p0 )) min (g t (x0 , T p0 )), (x0 , T p0 )
dx0 dp0 min (T g t (x0 , p0 )), (x0 , p0 ),
dx0 dp0 min (g t (x0 , p0 )), (x0 , p0 )
since the Jacobian J((xt , T pt ); (x0 , p0 )) of the transformation is unity due to the
volume-preserving property of the dynamical evolution. Since Eq. (5.5) and Eq. (5.7)
are equal, the equilibrium distribution is stationary under the transition matrix K.
5.1.2
(N )
) =
N X
N
X
i=1 j=i+4
Vnb (|ri rj |) +
N
X
Utor (i ) +
i=4
N
X
Ubend (i ) +
i=3
N
X
i=2
(5.7)
91
The configuration of the polymer can be specified by either the Cartesian coordinates of
all the monomers, (r1 , . . . , rN ), or by a set of generalized coordinates (r1 , q2 , . . . , qN ) where
qi = (li , i , i ).
The coordinates 2 , 3 and 4 can be defined with respect to fictitious fixed Cartesian
coordinates r0 and r1 that do not change.
Cartesian positions can be calculated from the generalized coordinates via a set of
locally-defined spherical polar reference frames in which the monomer i is placed along
the x-axis at a distance li from the origin, chosen to be the Cartesian position of bead
i 1. Then the Cartesian position of bead i can be written in the lab frame (defined
to be the frame of bead 1) as
ri = r1 + T2 l2 + T2 T3 l3 + T2 T3 T4 l4 + + T2 T3 Ti li ,
(5.8)
where li = (li , 0, 0) is the vector position of bead i in the local reference frame for
bead i with monomer i 1 as an origin. In Eq. (5.8), the matrix Ti (i , i ) is the
transformation (a rotation) matrix between the reference frames of bead i and bead
i 1, and is given by
cos i
sin i
0
Ti = sin i cos i cos i cos i sin i .
sin i sin i cos i sin i cos i
Note that Tlab
k = T2 Tk transforms between the reference frame of bead k and the
fixed lab frame and Tlab
k lk is the lab frame representation of the relative vector rk rk1
connecting monomers k and k 1.
Ensemble averages hAi of a variable A can be written as integrals over Cartesian or
generalized coordinates using
Z
hAi =
92
5. ADVANCED TOPICS
where J(q2 , . . . , qN ) is the Jacobian of the transformation between Cartesian and generalized coordinates. It can be shown that
J(q2 , . . . , qN ) =
N
Y
li2 cos i .
i=2
2mi
m (i , i ) e
2
P
2mi
P
i = i
m
U
Pi =
.
i
93
i
The stationary distribution for the phase point in the Monte-Carlo process is
proportional to eHe , due to the form of the densities of the momenta.
If the current configuration at the start of the dynamical update is X, accept the
trial configuration Y generated by the trajectory with probability
J(Y) He
min 1,
e
,
J(X)
where He = He (Y) He (X).
Note that if the dynamics was integrated perfectly, He = 0 since the effective
Hamiltonian is conserved by the dynamics.
Comments:
1. The smaller the time step, the smaller the average He and the greater the
acceptance probability.
2. The smaller the time step, the smaller the overall change in configuration of the
system for a fixed number of molecular dynamics steps.
3. The dynamics is fictitious, as the equations of motion are not the Hamiltonian
equations for the generalized coordinates. In particular, the kinetic energy in the
Hamiltonian has a different form.
4. The Jacobian factor is important in the acceptance probability. However, if the
li and i are held fixed and only the torsional angles evolve, the Jacobian factor
is constant.
5. Any subset of coordinates can be selected for updating for any trial move. It may
be advantageous to select different groups of coordinates for updating at different
times.
94
5. ADVANCED TOPICS
5.2
Time-dependent correlations
We have considered primarily averages of static properties that are constructed out ensemble
averages involving a single phase point at a time. We have seen that these averages may be
computed by either:
1. A Monte-Carlo algorithm that generates ensemble averages by sampling phase points
from the phase space density for the ensemble. Different ensemble averages can be
computed by altering the transition matrix. Within this class of algorithms, we include
the hybrid Monte-Carlo scheme, which uses a dynamical procedure to generate trial
configurational coordinates.
2. A molecular dynamics procedure that uses the real Hamiltonian dynamics of the system
to compute the time average of a dynamical variable. By hypothesis, the time average
is equal to the micro-canonical ensemble average. In the dynamical evolution, the
micro-canonical phase space density and phase space volume are conserved, which
reflects the conservation of probability.
In many physical situations, we are interested in computing quantities that obey some physical law, and which may involve the real dynamics of the system. For example, suppose one
is interested in computing how an initially nonuniform concentration profile (say a drop of
dye in water) is smoothed in the absence of flow or stirring. Such a process is described
phenomenologically by the law of diffusion (Ficks law), which states that the flux j of the
diffusing species is proportional to the negative gradient in the concentration of the species:
c(r, t)
,
j = D
r
where c(r, t) is the local concentration of the species and D is a constant known as the
diffusion coefficient. Under Ficks law, the concentration obeys the equation
c(r, t)
= j(r, t)
t
r
c(r, t)
= D2 c(r, t).
(5.9)
t
If the initial dye is concentrated in a small region c(r, 0) = (r), then the concentration
profile is
1
r 2 /(4Dt)
c(r, t) =
e
.
(4Dt)3/2
R
Note that the concentration profile is normalized, dr c(r, t) = 1. This is of the form of a
Gaussian distribution with a time-dependent width 2Dt. Hence the diffusion coefficient is
related to the second moment of c(r, t):
Z
2
r (t) = dr r2 c(r, t).
95
The second moment can be interpreted to mean the average distance squared that a particle
moves away from the origin in a time interval t. To see how one might compute the coefficient
D, we multiply Eq. (5.9) by r2 to compute the second moment to get
Z
Z
hr2 (t)i
2 2
= D dr r c(r, t) = 6D dr c(r, t) = 6D,
t
after the angular integrals have been carried out using the spherical symmetry of c(r, t).
This equation suggests that hr2 (t)i = 6Dt, so that a plot of the average squared distance
versus time should be linear with slope 6D. To compute diffusion coefficient from the timedependent correlation function hr2 (t)i, we must:
1. Draw an initial configuration according to the correct phase space density (i.e. microcanonical, canonical, etc..).
2. Propagate the system using the correct dynamics and measure the displacement vector
ri (t) = ri (t) ri (0) of each particle for a set of different times t.
3. If we ignore any cross-correlation between the motion of particles, we can measure the
self-diffusion coefficient Ds by calculating
N
hr2 (t)i
1 X ri (t) ri (t)
Ds =
=
.
6t
N i=1
6t
Rt
From the equations of motion, ri (t) = 0 d vi ( ), and hence the average can be
written as
Z t
N Z
1 X t
2
d 0 vi ( ) vi ( 0 ).
d
hr (t)i =
N i=1 0
0
Note that the procedure consists of two different steps, the drawing of the initial phase
points and then the propagation of the system.
General time-dependent correlation functions of the form
Z
hAB(t)i = dx(N ) f (x(N ) )A(x(N ) )B(x(N ) (t))
can be computed analogously.
If the ensemble is not microcanonical, the initial points must be drawn using
a Monte-Carlo algorithm (or with a special dynamical procedure) that generates
phase points according to f (x(N ) ). Then the system must be evolved using the real
dynamics of the system, preferably with a symplectic integrator. This evolution
therefore evolves the system with constant energy.
96
5. ADVANCED TOPICS
5.3
Event-driven simulations
Consider a system of N impenetrable hard spheres that interact via the potential
r
U (r) =
,
0 r>
where r is the distance between the centres of two hard spheres and is the diameter of the
spheres.
The probability of finding the system in a configuration in which spheres overlap is
zero.
The energy of the system is given by the Hamiltonian
N
X
p2i
1X
K if no overlap
H=
+
U (rij ) = K + U =
.
otherwise
2m
2
i=1
i,j
How can dynamics of this system be performed on a computer?
Particles coming together should bounce off one another, but preserve the total energy
H.
To examine the complications of systems with discontinuous potentials, consider a system
interacting with the short-ranged potential
10
U (r, 100)
U (r, 50)
U (r, ) = e(r)
f (r, ) = 2 e(r)
1.02
1.04
1.06
1.08
1.1
1.12
1.14
1.16
1.18
1.2
r/
97
t2 X
2U
t2 X
p
Fi Fi + O(t4 ).
i
j
2
12m i,j
ri rj
24m i
From form of the potential, we see that the correction term is proportional to (2 t)2 ,
and hence the radius of convergence of the shadow Hamiltonian shrinks to zero as
.
The integrator is unstable for any choice of t.
In impulsive limit , force acts discontinuously at a time tc where rij (tc ) = .
How can we integrate equations of motion with impulsive forces that act at only one
time? Consider the integral form of the equation of motion for the momentum
Z
t+t
pi (t + t) = pi (t t) +
d Fi ( ).
tt
Srij if tc [t t, t + t]
0
otherwise.
98
5. ADVANCED TOPICS
Up to the moment of collision, the pair of spheres i and j move freely with constant
momentum, so
pi
tc
mi
pj
tc
rj (tc ) = rj (0) +
mj
ri (tc ) = ri (0) +
t2c +
if vij2 6= 0, where vr = rij vij is the projection of the relative along the relative
2
vector rij and 2ij = (rij
2 )vij2 .
Real solutions exist if vij2 6= 0 and vr2 2ij .
2
> 2 and hence 2ij > 0.
If no initial overlap, then rij
1/2
.
If 2ij > 0, then |vr | > vr2 2ij
1. If vr > 0, then tc < 0. Particles are moving away from one another and will
not collide (in a non-periodic system).
2. If vr < 0, the particles moving towards one another and 2 positive roots are
found:
q
vr vr2 2ij
tc =
= time of initial contact
(5.10)
v j2
qi
vr + vr2 2ij
= time particles pass through one another
tleave =
vij2
1
2
v12
1
tc
tleave
99
(5.11)
where p0i is the momentum of particle i after the collision with particle j.
Conservation of energy implies that the post-collisional energy H 0 is equal to the precollisional energy H, or
pj pj
p0i p0i p0j p0j
p p
+
= i i+
,
2mi
2mj
2mi
2mj
which, using Eq. (5.11), gives a condition on the impulse S
S2
Svr = 0
2
S = 2vr ,
5.3.1
(5.12)
For the hard sphere system, the dynamics can be executed by:
1. Initially calculate all collision events for system using Eq. (5.10).
Boundary conditions must be properly included: hard walls, periodic system, ...
2. Find first collision event and evolve system (with free evolution) up to that time.
3. For colliding pair, compute the consequences of the collision using Eq. (5.12).
100
5. ADVANCED TOPICS
4. Compute the new first collision time for system after the momentum adjustments and
repeat steps 2 to 4.
Tricks of the Trade: A number of standard techniques have been developed to improve
the efficiency of event-driven simulations. These include:
1. Cell division and crossing events
If cubic cells of length are used to partition the system, collisions of a given
particle in a cell can occur only with particles in the same or neighboring cells
before the particle moves out of a cell.
If a particle moves out of a cell (i.e. a cell-crossing event), the new neighboring
cells must be check for a collision.
Cell-crossing time is analytically computable.
Can treat cell-crossing as an event to be processed like a collision, with the processing of the event defined to mean the computation of new collision times with
particles in the new neighboring cells.
2. Local clocks
The position of any particle not involved in an event does not need to be updated
since it will continue to move freely until it is specifically involved in a collision.
Time of last event for each particle can be stored and used to compute interaction
times or updates of positions.
Global updates of the positions of all particles must be performed before any
measurement of the system takes place.
3. Elimination of redundant calculation of event times
101
If a collision occurs between a pair of particles i j, new event times result only
for particles that have an collision event involving particle i or j as a partner.
Most event times unaffected in large system, and need not be recomputed.
Can devise routines that only compute new possible events following a collision.
4. Usage of data structures to manage event times
Search for first event time in system can be facilitated by use of binary tree data
structures, using the event time as an ordering key.
Functions required to insert new events in tree and search for earliest times.
Information for whether an event in the tree has been invalidated by an earlier
event can be stored in each node of tree.
Small hybrid tree structures that only insert valid events in the near future are an
efficient means of managing events in the system. These data structures typically
use linked lists of events in specific time intervals that are periodically used to
populate the binary tree.
5.3.2
To mimic systems interacting by a continuous potential U (r), one can construct a discontinuous potential V (r) with discrete energies [see van Zon and Schofield, J. Chem. Phys. 128,
154119 (2008)]:
V (r) = Umin +
K
X
k=1
(U (r) Uk )Vk ,
102
5. ADVANCED TOPICS
where is the Heaviside function and Vk is the change in potential when U (r) = Uk .
If pair of particles has a potential energy U (r) under the continuous potential, where
U
Pkk < U (r) < Uk+1 , then the interaction is assigned potential energy V (r) = Umin +
i=1 Vi .
Collision times at which U (r) = Uk ca be solved analytically for simple potentials.
By examining representations of the Heaviside function, one can derive that the impulse
on body i due to an interaction with another body j can be written as
p0i = pi + Fij (tc )
Fij (t) = SFij (tc )(t tc ),
where Fij (t) is the force on i due to j arising from the continuous potential U (rij ).
At discontinuity at Uk , the impulse S satisfies
S2
Fij Fij
+ Svij Fij + Vk = 0.
m
103
If roots are complex, the collision is a reflection (bounce back) due to inadequate
total energy to overcome the discontinuity. In this case, S = mvij Fij /Fij2 .
High energy
Low energy
104
5.4
5. ADVANCED TOPICS
Typical time scales of most intramolecular motions are 10 to 50 times shorter than the
translational time scale of a molecule.
Time step of an integrator determined by shortest relevant time scale.
Multiple time step methods can be used to deal with differing time scales in dynamical
simulations.
Many observables are independent or insensitive to intramolecular motion.
Molecular conformation is usually only weakly dependent on bond lengths. Bond
vibrations are typically restricted to small motions on rapid time scales.
Some bond angles remain relatively constant, aside from high frequency oscillations.
5.4.1
Constrained Averages
Suppose a set of ` variables, such as bond lengths, are effectively constant during a
simulation.
We will assume that the constraints are only functions of the positions, and independent
of momenta. Such constraints are called holonomic.
Can specify these constraints by
1 (r(N ) ) = 0
such as
2
1 (r12 ) = r12
d2
2 (r(N ) ) = 0
..
.
We write the condition that the set of all constraints = (1 , . . . , ` ) are satisfied in
the compact notation
`
Y
(i (r(N ) )) = ().
i=1
105
R
1
r m r V (r)
2
1
H(r, p) = p m1 p + V (r),
2
1
where mij = mi i,j and m1
ij = mi i,j .
u V (u) = u G u V (u)
mi u
2 i
u u
2
X
ri ri
=
mi
u u
i
=
L(u, u)
G
where we have used a notation that repeated Greek indices are summed over. The
conjugate momenta pu to the generalize coordinates are therefore
pu =
L
= G u
u
so
u = G1 pu ,
106
5. ADVANCED TOPICS
Note from the definition of the matrix G, we have
X 1 u u
G1
=
mi ri ri
i
Note that the equations of motion in the Cartesian phase space coordinates =
(r, p) and in the generalized phase space coordinates = (u, pu ) are in symplectic
form
= J
H()
= J
.
=
= M
M =
From the symplectic form of the equation of motion for , we see that
H
= M J
.
so
H
H
= MT
.
Thus we find the equation of motion for the transformed phase space coordinates obeys
H
= M J MT
.
This equation maintains symplectic form if M J M = J . Now consider the transformation of the volume element d = | det M|d.
If the transformed coordinates preserve
T
the symplectic form, then det M J M = det (J ) = det2 (M) det (J ), and hence
det (M) = 1, and so d = d. If the phase space probability is P ()d, it therefore
follows that P ()d = P (())d.
107
V (u)
= c det G e
du = (u)du,
where c is a normalization constant.
From our transformation where u = (q, ), the conditional density is therefore
Note that the factor det G is related to the Jacobian of the transform from
Cartesian spatial coordinates r to generalized spatial coordinates u.
Conditional averages can be computed by either
1. Devising a Monte-Carlo procedure that works in the q generalized coordinates.
Note trial moves must not violate the constraints = 0, which is easy to implement if generalized spatial coordinates are used. The transition matrix should
have limit density of c , and therefore the Jacobian factor must be used in the
final acceptance criterion of the Monte-Carlo procedure.
2. Carrying out constrained dynamics, which effectively correspond to Hamiltonian
dynamics in a lower-dimensional sub-space of the full phase space (r, p) or (u, pu ).
5.4.2
Constrained Dynamics
To construct the equations of motion for a constrained system, consider the Lagrangian
written in the generalized coordinates u while the ` constraints = 0 are maintained:
1
=
L(u, u)
u G u V (u)
2
1
=
q A q V (q, = 0),
2
since = 0 under the constraint. In this equation, the matrix A is a sub-matrix of G given
by
X
ri ri
A =
mi
,
q q
i
which is of dimension (N `) (N `). From the Lagrangian, we construct the Hamiltonian
in generalized coordinates
1
L
Hc = pq A1 pq + V (q, = 0)
pq =
= A q.
2
q
108
5. ADVANCED TOPICS
det A
con (q, = 0).
(q) = c
det G
Note that the probability density associated with the Hamiltonian dynamics of the
constrained system is (q), while the targeted constrained density is c (q, = 0).
Each configuration generated
by constrained dynamics must be weighted by ratio
p
of Jacobian factors det G/ det A.
How can this weight factor be evaluated?
We write G and its inverse G1 in block form:
!
X ri ri
X ri
X ri ri
A B
B
=
m
=
mi
A
=
m
G=
i
i
q q
q
BT
i
i
i
!
X 1 q q
X 1 q
X 1
E
1
E
=
Z
=
=
G =
mi ri ri
mi ri ri
mi ri
ET Z
i
i
i
We now define a matrix X so that det X = det A:
A 0
,
X =
T
B
I
where I is the identity matrix.
Writing X = G G1 X, we get
T
A+EB
E
X = G
T
T
E A+ZB
Z
G1 G =
A+EB
T
E A+ZB
B+E
T
E B+Z
ri
ri
109
X 1
m
r
ri
i
i
i
2
d
Z =
m
3
3
X
1 1 X 1 2
2
ri ri i=1 ri ri
1
4r
2r
r
12
23
12
i=1
=
,
2
3
3
X
4r23
2 1 X 2 2 m 2r12 r23
ri ri i=1 ri ri
i=1
8d4
8d4
1 1/4 (r12 r23 )2 =
1 cos2 /4 .
m
m
Procedure
Generate dynamics and then calculate constrained averages properly weighted by factor
det Z1/2 .
Constrained dynamics may be done in two different ways:
110
5. ADVANCED TOPICS
1. Re-writing Hamiltonian in generalized coordinates and using symplectic integrator
with Hamiltonian system.
Typically, Hamiltonian has complicated form as A1 is not diagonal.
2. Lagrange multiplier approach: Define Lagrangian with constraints
X
L0 = L
L=
mi /2ri2 V (r(N ) )
i
miri =
,
ri
ri
provide the constraints are holonomic (depend only on r and not on p).
The dynamics in the full phase X = (r, p) is generated by the linear operator L0
L0
X
=
= X
X
i
pi
+ Fi
mi ri
pi
ri
pi
dA(X(t))
= L0 A(X(t))
dt
with formal solution A(X(t)) = exp {L0 t}A(X(0)).
The effective forces have be re-defined to include a constraint force Gi = /ri .
The Lagrange multipliers typically depend on both r and p, and so the system
is non-Hamiltonian, as the generalized forces depend on p.
The operator L0 is not obtained from the Poisson bracket of a Hamiltonian.
Even though the dynamics is not Hamiltonian in the full phase space X, we saw
earlier that it is Hamiltonian in a sub-space of the phase space X.
To solve for the Lagrange multipliers, note that the time-derivatives of the constraint
condition = 0 must vanish, so
= 0
so
X
ri
+ r i
= 0.
ri
ri
i
= 0
=0
ri
so
r i
111
The equation of motion miri = Fi /ri therefore gives a linear equation for
the Lagrange multipliers
X Fi
1
X
2
+
r i
r j = 0
m
m
r
r
r
r
i
i
i
i
i
j
i
i,j
or
= Z1
(F + T )
F =
X Fi
m
ri
i
i
T =
Xp
2 pj
i
.
m
r
r
m
i
i
j
j
i,j
pi
ri
mi
=
L0 X =
1
p i
Fi
Z (F + T )
ri
is difficult to solve in the full phase space. One can construct a symplectic integrator in a lower dimensional phase space, or resort to an iterative solution method
known as SHAKE.
Example:
Consider the example of a particle moving on a sphere of radius d centered at the origin.
The constraint condition on the position vector r can be written as = (r2 d2 )/2 = 0. For
this system, the constraint force G = r and Z = r2 /m and hence Z1 = m/r2 . We also
find
F=
Fr
F
=
m r
m
T = r
2
r = r 2 .
r2
leading to
m
= 2
r
Fr
2
+ r .
m
112
5. ADVANCED TOPICS
r(t),
m
= r (t + t) =
ru2 (t
2
+ t) r(t) ru (t + t) +
m
2
r(t) ,
m
whose solution is
q
m
=
rp (t + t) rp2 (t + t) (ru2 (t + t) d2 ) ,
d
where rp (t + t) = r(t) ru (t + t) is the projection of the new unconstrained position
ru (t + t) along the direction r(t).
Cannot always solve for the exact that satisfies the constraint condition analytically
if multiple constraints are applied.
Iterative and efficient numerical solutions of the Lagrange multiplier exist, called
SHAKE.
113
u
ri (t + t) = ri (t + t)
,
mi
ri r(t)
where rui (t + t) is the solution of the unconstrained position of component i at time
t + t.
2. We require that all constraints are satisfied at the new time step, (t+t) = (r(t+
t)) = 0. To enforce this, we use an iterative procedure starting with a guess of
(0)
(0)
= 0 and taking the positions xi (t + t) = rui (t + t). We then Taylor expand
(0)
(0)
the constraint condition at the estimated coordinates xi around the difference i =
(0)
ri (t + t) xi (t + t),
X
(0)
(0)
0 = (t + t) = (xi (t + t)) +
i ,
(0)
ri x
i
i
+ O(t4 ),
mi
ri ri (t)
and hence we need
(0)
(xi )
X t2
i
mi
(1)
ri
(0)
xi
ri
(0) (1)
= t2 Z .
ri (t)
Solving this equation for the new estimate of the Lagrange multipliers (1) corresponds
to the matrix form
(0)1
(0)
t2 (1)
= Z (xi (t + t)).
3. Armed with this Lagrange multiplier, we form a new estimate of the constrained position at time t + t using
(1)
t2
(1)
(0)
xi (t + t) = xi (t + t)
mi
ri ri (t)
114
5. ADVANCED TOPICS
(n)
4. We repeat the previous steps to get (n+1) by forming Z(n+1)1 and (xi ) until the
(n)
(n)
xi (t+t) converge, at which point ri (t+t) = xi (t+t) satisfy all the constraints
to some specified precision.
It is possible to avoid the matrix inversion step by modifying the algorithm to evaluate
the sequentially. In practice, this is carried out by taking only the diagonal terms
of the matrix Z1 . This effectively amounts to using the iteration
=
t2 (n+1)
(x(n) (t + t))
(n)
5.5
In the previous section, we saw that the constrained dynamics generated configurational
states with a probability density which differs from the constrained probability density
con . What ensemble does the constrained dynamics generate?
Consider the unconstrained Hamiltonian written in the generalized coordinates (u, pu ) =
(q, , pq , p ),
1
H(u, pu ) = puT G1 pu V (u),
2
where we recall the block form of the matrices G and G1
E
A B
,
G1 =
G=
T
T
E
Z
B
we see
and from pu = G u,
pq = A q + B
p = BT q +
T q = B
T A
1 pq = p
,
p = B
= A(q, = 0) and B
= B(q, = 0).
where A
115
pq
p
,
T pq + Z
p , where A
indicates a matrix A evaluated when = 0.
and hence = E
1 gives
Multiplying this equality by the matrix Z
1 (r) (r)
1 E
T pq .
Z
= p + Z
From the block forms of G and G1 , we have AE+BZ = 0, and hence ET AT +ZT BT =
0. From the definitions of the symmetric matrices A and Z, we see that AT = A and
ZT = Z, so ET A = Z BT , which implies Z1 ET = BT A1 . Finally, when = 0,
we have
1 = p B
A
1 pq = p p
,
Z
1 ).
1 (r)(r,
) = (Z
Note that Z
and so (p p
p) are easily expressed in the
original phase space coordinates (r, p).
116
5. ADVANCED TOPICS
)ddp
c (Hc ) dqdp ()(p p
= 0
dt
d
det Z drdp
c (H(r, p)) () ()
= 0.
dt
so
The constrained dynamics conserves H(r, p) = Hc (q, pq ) and satisfies the constraint
conditions = 0 and = 0 at all times. Hence, for the full phase space X = (r, p),
d
det Z dX
= 0
dt
det Z(r(t)) dX(t) = det Z(r(0)) dX(0).
Once can therefore interpret d(X) = det Z dX as the invariant measure for the phase
space flow.
To see the connection between the dynamics of the system and the det Z factor, consider
the flow of the standard volume element dX0 under the dynamics to a time t at
which the volume is dXt = det J(Xt ; X0 ) dX0 , where the matrix J has elements Jij =
Xi (t)/Xj (0). To find the evolution of the Jacobian determinant, we use the fact that
det J = eTr ln J , which follows from the general property of a square matrix, det A = eTr A ,
with J = eA .
Proof. To establish this fact, note that any square matrix can be written in Jordan
normal form as A = P1 D P, where D is in Jordan form. Recall that the Jordan form
consists of an upper-triangular matrix with the eigenvalues of A along the diagonal, and
that the determinant of a triangular matrix is the product of the diagonal elements.
The exponential of the matrix A can be written as eA = P1 eD P, and the determinant
of this exponential is det eA = det P1 det eD det P. Since det P1 = 1/ det P, we see
117
that det eA = det eD . Since D is in Jordan form, eD is a triangular matrix with diagonal
elements 1 + i + 2i /2 + = ei , where i = Dii is an eigenvalue of A, where we
have used the fact that Dnij = nj inj for j i. Since the determinant of eD is the
P
Q
productPof its diagaonal elements i ei = e i i , we see that det eA = eTr A since
Tr A = i i .
From this property, we have
d det J
dJ 1
Tr ln J d
= e
(Tr ln J) = det J Tr
J
dt
dt
dt
!
X X
X X
i (t) X(0)
i (t)
= det J
= det J
X(0) Xi (t)
Xi (0)
i
i
= det J
X(t)
= det J (t),
X(t)
d ln det Z
= .
dt
118
5. ADVANCED TOPICS
from the equations
This relation can be verified explicitly by evaluating = X X
of motion
X (X)
d
=
= X X
= ln det Z.
pi
ri
dt
i
For the holonomically constrained system, we note that is the total time derivative of aR function of the phase space variable X, ln det Z = (X). Thus the
t
integral 0 d ( ) = (X(t)) (X(0)) and
det J(t)e(X(t)) = det J(0)e(X(0)) = e(X(0)) .
Noting that dX(t) = det J(X(t); X(0)) dX(0), we see that the invariant volume
element can be written as e(X(t)) dX(t) = e(X(0)) dX(0), and e(X(0)) =
det Z(X(0)).
5.5.1
Consider a system with phase space coordinate x that obeys the evolution equation
x0 ) =
x(t;
dx
= x(t; x0 )
dt
and suppose the solution of this equation is x(t; x0 ) subject to the boundary condition
x(0; x0 ) = x0 .
Z
d A(x(t + )) = hAit =
Z
dxdx0 (x x(t; x0 )) (x0 )A(x) =
Z
=
dx (x, t)A(x),
119
The equation of motion for (x, t), the Liouville equation, follows from
(x, t)
d
x0 ) x(t) (x x(t; x0 )) 0
=
h(x x(t; x0 ))i0 = x(t;
t
dt
x0 )(x x(t; x0 ))i0 = x h(x(t; x0 ))(x x(t; x0 ))i0
= x hx(t;
= x (x)h(x x(t; x0 ))i0 = x (x)(x, t) .
We have seen that many dynamics have non-zero phase space compressibilities =
x (x), so that the invariant measure, if it exists, assumes the general form d(x) =
(x)dx. We therefore define a density with respect to this measure f (x) by the relation
(x) = (x)f (x), where (x) is a positive function of the phase point x.
Inserting this definition in the Liouville equation, we find
f (x, t)
+ (x) x f (x, t) = (x, t)f (x, t)
t
1
(x, t)
+ x ((x)(x, t)) .
(x, t) =
(x, t)
t
If we choose (x, t) to satisfy (x, t) = 0, then we find it must satisfy Liouvilles
equation,
+ x () = 0
t
+ x = x =
t
then
d ln (x, t)
=
dt
df (x, t)
f (x, t)
f (x, t)
(5.15)
=
+ (x) x f (x, t) =
+ L0 f (x, t) = 0.
dt
t
t
If a function w(x) exists that satisfies
(x) =
dw(x)
= (x) x w(x),
dt
then ln((x, t)/(x, 0)) = w(x(t; x0 )) w(x0 ), and one can define the invariant
measure to be
d(x) = ew(x) dx.
120
5. ADVANCED TOPICS
If (x) cannot be written as the total time derivative, then
(x(t; x0 )) = (x0 )e
Rt
0
d (x(,x0 ))
which is equivalent to solving the full equation for the density (x, t). There is
no clear way to define an invariant measure for this type of dynamics.
Note that if we have periodic orbits in the dynamics (only possible in nonHamiltonian systems) so that x(T ) = x(0), then we must have
RT
w(x(T )) = w(x(0)) or e 0 d ( ) = 1.
If the periodic orbit is net contracting or expanding, then we either have that
cannot be written as a total time derivative of w(x) or that w(x) is not
well-behaved (singular).
The formal solution of Eq. (5.15) is
f (X, t) = eL0 t f (X, 0),
as in the normal situation.
For the special case of a constrained system, since the equations of motion have H(X),
and as constants of motion, we can write the equilibrium phase space distribution
function, which is a stationary solution of Eq. (5.15), as
Z
d(X) B(X)f (X, t) =
121
dXB(X)L0 A(X) =
dX L0 + X X B(X) A(X)
Z
= dX (L0 + ) B(X) A(X)
Z
Z
d(X) B(X)L0 A(X) =
=
=
=
=
L0 t
d(X)B(X)e
Z
A(X) =
L0 t
d(X) e
B(X) A(X) =
Z
d(X) B(X(t))A(X).
Note that this result also applies for A(X) = f (X), so that the operator L0 is
self-adjoint when the invariant measure d(X) is used to define the inner-product.
This is not the case when the inner product does not include the factor of (X)
in the measure.
For Hamiltonian systems, we have (X) = 1, and the Liouville operator L0 is
self-adjoint with respect to the simple measure dX.
Note that for the special case of holonomically-constrained systems, we have
(X) = det Z(X) as the term in the invariant measure.
122
5. ADVANCED TOPICS
Canonical Dynamics
Now consider a system with 3N coordinates q and with corresponding momenta p
and p whose evolution is governed by the equations
p
= p
m
p = F(q) pp KT 0 (q)
q =
p =
pp
3N kT (q),
m
(5.16)
p2
pp
(5.17)
q i +
p i +
= x (x) =
+
= 3N p = 3N
qi
pi
p
i=1
"
#
d EH
=
dt kT (q)
In this case, the compressibility is equal to a total time derivative, and hence the
invariant measure is
(q, , p, p ) = (E H(q, , p, p )) eE/(kT (q)) ep /(2kT (q)) eH0 (q,p)/(kT (q)) /Z,
where H0 = p p/(2m) + (q) and Z is a normalization constant
123
1
2
dqdp ddp
(E/(3N kT (q)) )eE/(kT (q)) ep /(2kT (q)) eH0 (q,p)/(kT (q))
3N kT (q)
s
Z
1
2 H0 /(kT (q))
=
dqdp
e
.
3N kT (q)
=
1
eH0 (q,p)/(kT (q)) /Zr
r (q, p) = p
kT (q)
Z
1
Zr =
dqdp p
eH0 (q,p)/(kT (q)) .
kT (q)
Note that when T = T (q) is uniform, the r (q, p) is the canonical density for an
ensemble with temperature T .
If the system is ergodic, then the dynamics will generate the canonical density, so that
a time average of a quantity will correspond to the canonical ensemble average.
This approach can be extended to include a system with better mixing properties by
using a chain of thermostat variables i .
The isobaric-isothermal ensemble can also be sampled using molecular dynamics schemes
using the volume V as a dynamical coordinate with conjugate moment PV .
Systems with no invariant measure
A different type of system can be defined by the set of equations
q =
p
m
p = F (q) pp
p =
p2
kT (q),
m
y = y
>>0
124
5. ADVANCED TOPICS
System has a fixed point at the origin (0, 0) that is net attractive.
Compressibility is = ( ) = d/dt (ln |x| + ln |y|), and hence invariant measure = 1/(|x||y|) is singular along the x-axis, y-axis and at the origin.
5.5.2
Wed like to develop good integration schemes for non-Hamiltonian systems along the lines
used for Hamiltonian systems. Using splitting methods, it turns out that we can sometimes
develop integrators that
1. are time reversible,
2. preserve the invariant volume (should it exist).
In particular, the preservation of phase space volume is an important property in determining
the stability of an integrator.
P
= L x.
Liouville operator L = x x into parts L = L so that x()
We have seen that if it exists, the invariant measure d(x) = (x)dx satisfies the
divergence equation
= 0,
x ((x)x)
(5.18)
p
m
= p
p = F (q) pp
p =
p2
kT,
m
L=
+ F (q) pp
+
kT
.
m q
p
p
m
p
P
where L x = x().
125
= F (q)
p
= kT
p
p
L2 =
+p
m q
p
L4 = p p +
.
p
(5.19)
(e F (q)) = 0, and hence
(e p/m)+ p (e p2 /m) =
(e kT ) = 0.
(e pp ) +
(e p ) = p e + p e = 0.
and x (e x(4))
= p
Note that each of the effect of each of the propagators eL t on the phase
point x is exactly computable since
q + mp t
q
p + F t
eL2 t x =
eL1 t x =
p2
p
p + m t
q
q
p t
p
eL4 t x = pe
eL3 t x =
+ p t ,
p kT t
p
where we have used the fact that exp{cxd/dx}x = xec .
This P
splitting can then be used in a symmetric sequence of exp{Li tk }
with k tk = t.
2. The choice:
p
L1 = F (q)
L2 =
p
m q
2
p
L3 =
kT
L4 = p p +
.
m
p
p
The difference from the first scheme is in the combination of operators in
the definitions of L2 and L3 . This are trivial re-arrangements that still
126
5. ADVANCED TOPICS
satisfy the conservation of the invariant phase space volume, but now
q
q + mp t
L2 t
L3 t
e
x=
e
x=
.
2
p
p pm kT t
3. The choice
p2
p
+
L2 = F (q) kT
m q m p
p
p
L4 = p p +
.
p
Note that this choice is superior since it involves fewer Liouville operators.
L1 =
Note that all these schemes use the same L4 , since this form is the one which
preserves the volume when the compressibility is non-zero.
General splitting procedure
Suppose we can write the dynamics in a form that is a generalization of the symplectic
form:
x i = Aij (x)
H
= i (x),
xj
(5.20)
where H is the Hamiltonian function that is conserved in the dynamics and Aij (x) is
an anti-symmetric matrix, satisfying Aij (x) = Aji (x), that generalizes the symplectic
matrix J.
Since A is anti-symmetric, we can write Aij = 12 (Aij Aji ), which implies
H
H
H
H
H =
= i
=
Aij
x
xi
xi
xj
H
H
H
1 H
Aij
Aji
= 0,
=
2 xi
xj
xi
xj
where we have used the convention that repeated indices are summed over.
If d(x) = (x)dx is the invariant volume element under the dynamics, then (x)
obeys
=0=
x ( x)
( x i ) =
xi
xi
H
Aij
xj
.
127
H
Bij
= 0
xi
xj
Bij
2H
Bij
=
+ Bij
=
,
xi
xi xj
xi
since
Bij
2H
=0
xi xj
as 2 H/xi xj = 2 H/xj xi .
P
Separating the Hamiltonian H P
into several different terms, H =
H(), we can
write the flow equation as = (), where i () = Aij H()/xj and define the
Liouville operators
L = i ()
H()
= Aij
.
xi
xj xi
The action of each of the Liouville operators on the invariant measure is therefore
xi
H()
(x)Aij
xj
=
Bij H()
2 H()
+ Bij
= 0,
xi xj
xi xj
p
m
= p
p = F (q) pp
p =
p2
kT,
m
and the Hamiltonian H, we will deduce the form of the asymmetric matrix A as follows:
128
5. ADVANCED TOPICS
Note that
H
= 0 (q)
x1
H
p
=
x2
m
H
= kT
x3
H
= p .
x4
0
1
A=
0
0
1
0
0
p
0
0
0
1
0
p
.
1
0
From this matrix, one can easily verify that Bij = e Aij satisfies Bij /i = 0 for all j.
For example, consider j = 4:
Bi4
=
(e p) +
(e ) = 0.
xi
p
H(2) =
p2
2m
H(3) = kT
H(4) =
p2
2
Appendix A
Math Appendices
A.1
Taylor expansion
1
f (x + a) = f (x) + f 0 (x)a + f 00 (x)a2 +
2
X
aj dj
f (x).
=
j
j!
dx
j=0
Since
= exp (x) =
xj
j=0
f (x + a) = exp a
j
,
j!
d
f (x).
dx
129
130
A.2
Series expansions
A.3
Probability theory:
A.3.1
Discrete systems
Properties:
1. 0 Pi 1
Pn
2.
i=1 Pi = 1
Averages:
E =
n
X
Ei Pi
i=1
E2 =
n
X
Ei2 Pi
i=1
H(E) =
n
X
i=1
H(Ei )Pi
Ni
P (Ei )
N
131
Variance of E:
2
E2 E 2 E
2
= Ei E
E2 measures the dispersion of the probability distribution: how spread out values are.
In general, E2 6= 0 unless Pi = ij for some j. This notation means:
1 if i = j
Pi =
which implies E = Ei .
0 otherwise
Tchebycheff Inequality:
2
P rob E E E E 2 .
2 E
Joint probability: Suppose N measurements of two properties E and G.
nij = number of measurements of Ei and Gj
nij
Pij = lim
P (Ei , Gj ) joint probability.
N N
Properties:
P
1.
i.j P (Ei , Gj ) = 1.
P
2.
i P (Ei , Gj ) = P (Gj ).
P
3.
j P (Ei , Gj ) = P (Ei ).
4. If Ei and Gj are independent, then P (Ei , Gj ) = P (Ei )P (Gj ).
A.3.2
Continuous Systems
x =
x2
dx xp(x)
f (x) =
dx f (x)p(x)
Z
2
2
= x x =
dx x2 x2 p(x)
132
A.3.3
Gaussian distributions
1
2 2
1/2
(xhxi)2
2 2
hxi = average of x
h(x hxi)2 i = 2
2. Important properties (assuming hxi = 0): hx2n+1 i = 0 and hx2n i = f ( 2 ).
1/2
n 2
o
1/2
x1
x2n
1
1
exp
3. If P (x1 , . . . , xn ) = 2hx
... 2hx
,
...
+
2i
2
2i
2hx2 i
2hx i
n
Then
hxi i = 0
hxi xj i = i2 i,j
What happens when x2 0? Infinitely narrow distribution, called a dirac delta
function. Probability density has all the weight on one value.
There are other representations of the dirac delta function: basically defined in such a
way that one value receives all the weight.
Delta functions: defined in a limiting sense.
()
(x) =
1
2 x
0
|x| > 2
()
2
()
/2
dx (x) =
/2
1
dx = 1.
if 1.
1 |x|/
e
2
1
x2 +2
2 2
1
ex /
1 sin x/
1
(x)
|c|
3. [g(x)] =
(xxj )
j |g 0 (xj )|
A.4
Fourier Transform:
Z
eikx f (x) dx
f (k) =
Z
1
f (x) =
eikx f(k) dk
2
Properties:
Z
eix(kk0 ) dx
2(k k0 ).
Z
Z
1
f (x y)g(y) dy =
eikx f(k)
g (k) dk.
2
1/2
2 1/2
1
x2 /2 2
2 k2 /2
(k) =
f (x) =
e
f
e
2 2
2
ex
4
f (x) =
f(k) = 2
x
k + 2
1
4
f (x) =
f(k) = 2
x
k
133
134
Laplace Transform:
f(z) =
ezt f (t) dt
1
z+a
tn1
f(z) =
.
(n 1)!
A.5
1
tn
Calculus
A.5.1
Integration by parts
Z
udv = uv
vdu
Example:
Z
a
A.5.2
b Z b
dx f (x)g(x) = f (x)g(x)
dx f (x)g 0 (x)
0
R
Let I = Rxy dxdy f (x, y) be the integral over a connected region Rx,y . Change variables to
u, v via the transform g(u, v) = x and h(u, v) = y. It follows that:
Z
(g, h)
.
I=
dudv f (g(u, v), h(u, v))
(u, v)
Ruv
where the Jacobian
(g, h)
(u, v)
(g,h)
(u,v)
g
u
g
v
h
u
h
v
is
A.5. CALCULUS
135
Example: Suppose
Z
I=
= 1
Thus,
Z
dudv f (u)g(v)| 1| =
I=
dvg(v).
duf (u)
136
Appendix B
Problem sets
B.1
Problem Set 1
Notes:
This set contains 3 problems.
This first set is due in one week, October 8, 2008.
1. Suppose a Markov
0 1
1 0
K=
0 0
0
0 .
1
(a) What are the eigenvalues of K? What independent eigenvectors have eigenvalue
= 1?
(b) If a random walk starts in state 1, what is the probability of finding state 1 after
2 steps? After 3 steps?
(c) Is the transition matrix regular? Does this random walk have a unique limit
distribution?
2. Consider a one-dimensional, time-homogeneous, discrete time, Markovian random walk
process in which a walker always moves to its neighbor to the right. Suppose there are
4 sites, and a walker at site number 4 always cycles back to site 1:
111
000
000
111
000
111
000
111
000
111
000
111
111
000
000
111
000
111
000
111
000
111
000
111
137
111
000
000
111
000
111
000
111
000
111
000
111
111
000
000
111
000
111
000
111
000
111
000
111
138
(b) Show that the right eigenvectors of K(0) are ek = 1/2 (eik/2 , eik , ei3k/2 , ei2k )
(0)
and that the left eigenvectors are the complex conjugates ek .
(c) Now suppose that there is a small but finite probability that the walker remains
on site 1 and probability 1 to step to site 2. What is the form of K = K(0) + K
now?
(d) The effect of the perturbation K on the eigenvalues can be evaluated using
perturbation theory using
(0)
k = k + k
(0)
(0)
B.2
139
Problem Set 2
Notes:
This set contains 3 problems.
This second set is due in two weeks, October 22, 2008.
1. Consider a three state process in which the probability vector p(t) satisfies the master
equation
p(t + 1) = K p(t),
with a transition matrix given by
3 1 1
4
K = 18
1
8
32
13
16
5
32
16
1
.
8
13
16
(B.1)
(a) What are the eigenvalues of the transition matrix K? Give the stationary vector
p
(i.e. that which satisfies K p
=p
). Make sure it is normalized such that the
sum of its components is one.
(b) Show that detailed balance is not obeyed by this process.
Recall from linear algebra that any symmetric matrix S can brought to diagonal form
D by a similarity transformation P, in that
1
/0
,
2
D = P1 S P =
(B.2)
/0
3
where k are the eigenvalues and the columns of the matrix P are the (right) eigenveck of S. Yet K is not symmetric and has degenerate eigenvalues. For such cases,
tors e
linear algebra tells us that the matrix can be brought to Jordan normal block form:
1 1
/0
...
/
0
..
.
1
/0
2 1
/0
1
,
(B.3)
J=P KP=
..
.
2
..
. 1
/
0
.
.
.
/0
140
1 0
0
Jt = 0 t2 t2t1 .
0 0
t2
(f) Determine the matrix
K = lim Kt ,
t
and show that it transforms any initial probability vector into the stationary
probability vector.
2. Consider now a slightly different process with transition matrix
3 1 1
4
K = 18
1
8
16
13
16
1
8
16
1
.
8
13
16
p1 0
p2 0 .
Q= 0
0
0
p3
(d) What does this imply for the Jordan normal form of K?
(B.4)
141
(e) Show that for any n-state process which satisfies detailed balance with stationary
, the transformed matrix Q1 K Q is symmetric when Q =
probability vector p
diag( p1 , p2 , . . .).
3. Suppose one defines a random walk procedure to generate a Markov chains of states
= {x1 , . . . , xN }, where each configuration of the system xi appears with canonical ensemble probability (xi ) = eU (xi ) /Z. Consider the case in which it is very
computationally demanding to compute the potential energy U (xi ), and much easier
to compute an approximate potential energy Uf (xi ). For example, the true potential energy U (xi ) could be computed by ab-initio electronic structure methods, and
Uf (xi ) could be the potential energy computed by a molecular-mechanical or low-level
ab-initio method.
Updates using Kf
u1 = x i
1
0
0
1
u2
1
0
0
1
um1
1
0
0
1
y i1
= um
0
0
1
Accepted ?
yes
no
x i+1 = yi
x1
x
00 i
11
00
11
x i+1= x
111111
000000
1
0
0
1
(a) Suppose one develops a transition matrix Kf (ui uj ) that satisfies detailed
balance in the canonical ensemble for the system with potential energy Uf ,
pf (ui )Kf (ui uj ) = pf (uj )Kf (uj ui ),
where pf (ui ) = eUf (ui ) /Zf . At a given time step i of the Markov chain , suppose the configuration is xi . If one defines u1 = xi and then generates an auxiliary
sequence of states {u1 , . . . , um } according to a random walk with probabilities determined by Kf , and then selects the trial state for the i + 1 step of the Markov
chain to be yi = um , what is the probability T (xi yi ) of generating the configuration yi starting from xi via the sequence {u1 , . . . , um }? What is the reverse
probability T (yi xi ) via the path {um , . . . , u1 }?
(b) How should the acceptance probability A(xi yi ) of the trial configuration be
defined to ensure that the chain of states has limiting distribution ? How
142
B.3
143
Problem Set 3
Notes:
This set contains 3 problems.
This third set is due in two weeks, November 5, 2008.
1. Consider a Hamiltonian of the form
|PN |2
+ U (RN ),
2m
P pi pi
where |PN |2 = N
i=1 2m . This Hamiltonian is split into a kinetic and potential part
H=
K=
|PN |2
and U = U (RN ).
2m
RN
, i.e.
PN
(B.5)
144
2. The force between two Lennard-Jones particles with a smooth cutoff is given by
12
1
r < rc0
6
2
0
r12
r
0
r > rc
where r = |ri rj | is the (scalar) distance between two particles at (three-dimensional)
positions ri and rj .
(a) Show that the potential is everywhere differentiable.
(b) Derive the force that the two particles feel. What is the sum of these forces?
Consider an N particle system interacting through the smooth Lennard-Jones interaction.
(c) How do the momenta and positions of the particles change in a single Verlet step?
(d) Prove that momentum and angular momentum are conserved in a single Verlet
step (neglect boundary conditions).
(e) For large enough time steps, the Verlet scheme applied to the N particle LennardJones system starts to exhibit energy drift. Why does this drift go towards higher
energies?
(f) The potential and forces can be computed using cell divisions or a Verlet list.
Assuming most of the numerical cost is in computing the potential and forces,
which one is more efficient and by how much? Why do these techniques only work
for larger systems?
3. Write a small program to simulate a Lennard-Jones system of 343 particles (cf. Eq. (B.6))
in a cubic periodic simulation box, using the Verlet scheme (B.5). Set the mass m,
interaction range and interaction strength to 1 (Lennard-Jones units), and set the
density at 1. Choose cutoff parameters rc0 = 2.5 and rc = 3. (Note: why is there no
reason to use cells?) As initial conditions, put the particles on a cubic lattice and draw
their initial momenta from a Gaussian with a standard deviation of 0.1. For a range
of time steps, determine
the appropriate length of the burn-in time,
the average kinetic temperature Tkin = K/( 32 N ) in equilibrium,
the total energy in equilibrium, and
the fluctuations of the total energy in the simulation.
Demonstrate that the fluctuations of the total energy scale as t2 and determine the
time step at which numerical instability sets in.
B.4
145
Problem Set 4
Notes:
This set contains 3 problems.
This fourth and final set is due in two weeks, December 3, 2008.
1. For a harmonic oscillator system with Hamiltonian H = 12 (p2 + q 2 ), compare the
critical points of instability tc of the Verlet, HOA2, FR4, and 4th-order order Suziki
(gradient corrected) integrators.
2. Consider an an-harmonic oscillator system in three dimensions with Hamiltonian
H=
p1 p1 p2 p2
+
+ U (r12 )
2m1
2m2
U (r12 ) =
k 2
4
r12 + r12
,
2
4
PP pp
+
+ U (r) = KP + Kp + U,
2M
2
where M is the total mass of the system, is the relative mass, and P and p are
the momenta conjugate to R and r, respectively.
i. What are the solutions of the equations of motion for R and P?
ii. Construct the Liouville operators LP , Lp and LU and evaluate the commutators of these three operators.
(b) Devise a second-order symplectic integrator of this system and find the shadow
Hamiltonian to order t2 .
(c) Suppose one writes the potential U = Uh + U in terms of a harmonic reference
potential Uh (r) = kr2 /2 and a difference potential U (r) = r4 /4. Use a splitting
scheme to construct a second-order integrator that uses the exact solution of
the evolution of a harmonic oscillator. To order t2 , what does the shadow
Hamiltonian look like for this integrator? Is it better, or worse, than the integrator
in part b?
(d) Suppose the actual potential energy UA (r) and forces are computed using a costly
electronic structure method. Using the potential U (r) as a reference potential,
where UA (r) U (r), devise an integrator using splitting methods that minimizes
the number of electronic structure calculations over a fixed time interval. How
might multiple time step methods be used profitably in this integrator?
146
N
X
h|ri (t + t0 ) ri (t)|i
i=1
6N t0
where ri (t) is the position of particle i at time t. Note that when using this form,
one must pay careful attention to how the distance grows in a periodic system.
Modify the code you wrote for a Lennard-Jones system (or modify the sample code
provided on the class web site) to compute the self-diffusion coefficient D in the two
ways mentioned above. The system conditions can be taken to be as described in
problem 3 on problem set 3.