Lecture 5 of Introduction to Quantum Chemical Simulation, a graduate course taught at MIT in Fall 2014 by Heather Kulik. The course covers wavefunction theory, density functional theory, force fields, molecular dynamics, and sampling.
2. MIT
10.637
Lecture 5
Why molecular dynamics?
Protein folding: how proteins fold and misfold (Prof. Vijay Pande)
Voelz, Bowman, Beauchamp, Pande. JACS (2010).
3. Molecular dynamics
Classical particles can be simulated by solving Newton’s second law:
$F = m a$
The force is the negative gradient of the potential energy at position r:
$F(r) = -\nabla V(r)$
r is a vector containing the Cartesian coordinates of all particles (i.e. of length 3N_atom).
The potential energy function V (for now) comes from our force field parameters.
4. Structure of an MD code
1. Initialize positions and velocities,
temperature, density, etc.
2. Compute forces
3. Integrate equations of motion
4. Move atoms
5. Repeat 2-4 until equilibrated (desired
properties are stable, potential energy and
kinetic energy are stable).
6. Continue 2-4 as production run -> collecting
data to average over.
5. Initialization
• Avoid random initialization. Don’t want energy divergence.
• Initial positions generated from a structure – avoid overly
short distances between molecules/inside molecules.
• Velocities – start out small or zero. Can slowly heat up the
system, giving more and more temperature (velocity) to the
particles.
• Randomizing initial velocities – equipartition theorem relates
temperature to the velocity. Choose a random number from a
uniform distribution, make sure the net velocity results in a
total momentum of zero, scale velocities until we get a kinetic
energy that matches the initial temperature.
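The velocity recipe in the last bullet can be sketched in a few lines. This is a minimal illustration, not the course's code: the function name `init_velocities`, the reduced units (`K_B = 1`), and the choice of 3N - 3 degrees of freedom after removing the net momentum are all assumptions.

```python
import random

K_B = 1.0  # Boltzmann constant in reduced units (assumption)

def init_velocities(masses, temperature, seed=0):
    """Uniform random velocities, zero net momentum, rescaled to the target temperature."""
    rng = random.Random(seed)
    n = len(masses)
    # 1. Draw each Cartesian component from a uniform distribution on [-0.5, 0.5).
    v = [[rng.random() - 0.5 for _ in range(3)] for _ in range(n)]
    # 2. Subtract the centre-of-mass velocity so the total momentum is zero.
    total_mass = sum(masses)
    for k in range(3):
        v_com = sum(m * vi[k] for m, vi in zip(masses, v)) / total_mass
        for vi in v:
            vi[k] -= v_com
    # 3. Rescale so the kinetic energy equals (N_dof / 2) k_B T.
    kinetic = 0.5 * sum(m * sum(c * c for c in vi) for m, vi in zip(masses, v))
    n_dof = 3 * n - 3  # three degrees of freedom removed with the net momentum
    scale = (0.5 * n_dof * K_B * temperature / kinetic) ** 0.5
    return [[c * scale for c in vi] for vi in v]
```

Rescaling after the momentum is removed keeps the total momentum at zero, because a uniform scale factor multiplies every velocity equally.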
6. Statistical mechanics: ensembles
Ways in which a fixed volume can be described with statistical mechanics:
• Microcanonical ensemble: Fixed number of particles (N), fixed energy (E)
- NVE. Equal probability for each possible state with that
energy/composition.
• Canonical ensemble: Fixed composition (N), in thermal equilibrium with a
heat bath of a given temperature (T). Energy can vary but same number of
particles – probability of a state depends on its energy (origin of the
Boltzmann distribution).
• Grand canonical ensemble (μVT): Variable composition - thermal and
chemical equilibrium with a reservoir. Fixed temperature reservoir with a
chemical potential for each particle. States can vary energy and number of
particles.
• Macroscopic properties of these ensembles can be calculated as weighted
averages – based on the partition function.
7. Ergodic hypothesis
• We assume the average obtained by following a
small number of particles over a long time is the
same as averaging over a large number of
particles for a short time.
• Time-averaging is equivalent to ensemble-
averaging.
• Or alternatively: no matter where a system is started, it can eventually reach any other accessible point in phase space.
8. Choosing an ensemble
Ensemble menu (choose one from each row):
Particle number N | Chemical potential μ
Volume V | Pressure P
Energy E | Temperature T
Most common combinations:
Microcanonical ensemble (NVE): Conserves the total energy. The entropy S is at a maximum in the equilibrium state.
Canonical ensemble (NVT): Also called constant-temperature molecular dynamics. Requires a thermostat for exchanging energy. The Helmholtz free energy A is at a minimum at equilibrium.
Isothermal-isobaric ensemble (NPT): Requires both a thermostat and a barostat; corresponds most closely to “laboratory” conditions. The Gibbs free energy G is at a minimum at equilibrium.
10. Integration
• Integration algorithms need to be fast,
require little memory.
• Should allow us to choose a long timestep.
• Stay close to the exactly integrated
trajectory.
• Conserve momentum and energy
• Be time-reversible.
• Be straightforward to implement.
12. Choosing a time step
Too short - computation needlessly slow
Too long - errors result from approximations
Just right - errors acceptable, maximum speed
13. Euler method
Taylor expansion for particle position and velocity at time t+Δt, truncated after the first-order term:
$r(t+\Delta t) = r(t) + v(t)\,\Delta t$
$v(t+\Delta t) = v(t) + a(t)\,\Delta t$
Recall that a comes from the forces: $a(t) = F(t)/m$.
14. Euler method
Problems persist with this method:
• First-order method: the local (per-step) error scales with the square of the timestep, and the accumulated global error scales linearly with the timestep.
• Not time-reversible.
• Sensitive, easy to make unstable.
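The instability can be seen on a 1D harmonic oscillator. The sketch below assumes the first-order Euler update (position and velocity both advanced with values at time t) and a hypothetical helper `euler_trajectory`; the spring constant, mass, and timestep are illustrative values only.

```python
def euler_trajectory(x0, v0, dt, n_steps, k=1.0, m=1.0):
    """Forward Euler for a 1D harmonic oscillator; returns the total energy at every step."""
    x, v = x0, v0
    energies = [0.5 * m * v * v + 0.5 * k * x * x]
    for _ in range(n_steps):
        a = -k * x / m                     # acceleration from the force F = -k x
        x, v = x + v * dt, v + a * dt      # both updates use values at time t
        energies.append(0.5 * m * v * v + 0.5 * k * x * x)
    return energies

energies = euler_trajectory(x0=1.0, v0=0.0, dt=0.01, n_steps=1000)
print(energies[0], energies[-1])
```

For this system each Euler step multiplies the total energy by (1 + (k/m) Δt²), so the energy grows steadily instead of being conserved.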
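15. Leap-frog method
This method reduces some of the error present in the Euler method by calculating velocities at ½-timestep offsets, which makes it a second-order method.
Step 1: Solve for acceleration/forces: $a(t) = F(r(t))/m$
Step 2: Update velocities: $v(t+\tfrac{1}{2}\Delta t) = v(t-\tfrac{1}{2}\Delta t) + a(t)\,\Delta t$
Step 3: Update positions: $r(t+\Delta t) = r(t) + v(t+\tfrac{1}{2}\Delta t)\,\Delta t$
Repeat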
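The three steps can be sketched for a 1D harmonic oscillator as follows. The function name `leapfrog` and the start-up rule (stepping the initial velocity back by half a timestep) are assumptions made for this illustration.

```python
def leapfrog(x0, v0, dt, n_steps, k=1.0, m=1.0):
    """Leap-frog for a 1D harmonic oscillator.
    Returns positions at whole steps and velocities at half steps."""
    x = x0
    # Start-up (assumed choice): estimate v(-dt/2) from v(0) and a(0).
    v_half = v0 - 0.5 * dt * (-k * x0 / m)
    xs, v_halves = [x], []
    for _ in range(n_steps):
        a = -k * x / m              # Step 1: acceleration from the force at time t
        v_half = v_half + a * dt    # Step 2: v(t - dt/2) -> v(t + dt/2)
        x = x + v_half * dt         # Step 3: r(t) -> r(t + dt)
        xs.append(x)
        v_halves.append(v_half)
    return xs, v_halves
```

Positions and velocities are never known at the same instant, which is why an on-step velocity has to be estimated, for example by averaging the two neighbouring half-step values.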
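16. The Verlet algorithm
Taylor expansion for particle position at time t+Δt:
$r(t+\Delta t) = r(t) + v(t)\,\Delta t + \tfrac{1}{2} a(t)\,\Delta t^2 + \tfrac{1}{6} b(t)\,\Delta t^3 + O(\Delta t^4)$
Taylor expansion for particle position at time t-Δt:
$r(t-\Delta t) = r(t) - v(t)\,\Delta t + \tfrac{1}{2} a(t)\,\Delta t^2 - \tfrac{1}{6} b(t)\,\Delta t^3 + O(\Delta t^4)$
Add expressions:
$r(t+\Delta t) + r(t-\Delta t) = 2 r(t) + a(t)\,\Delta t^2 + O(\Delta t^4)$
Here v, a, and b (or a’) are the first, second, and third time derivatives of r.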
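17. The Verlet algorithm
Positions evaluated:
$r(t+\Delta t) = 2 r(t) - r(t-\Delta t) + a(t)\,\Delta t^2$
Approximation for the first timestep:
$r(\Delta t) \approx r(0) + v(0)\,\Delta t + \tfrac{1}{2} a(0)\,\Delta t^2$
Acceleration is from potential:
$a(t) = -\frac{1}{m} \nabla V(r(t))$
Advantages: Simple to program, good energy conservation, time-reversible.
Disadvantages: The update takes a small difference between large numbers, which can lead to finite-precision issues. Velocities are not part of the update and must be estimated from the difference between positions at t+Δt and t-Δt, $v(t) = [r(t+\Delta t) - r(t-\Delta t)] / (2\Delta t)$, so the instantaneous velocity (and temperature) at time t is only known after the new positions have been computed.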
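A short sketch of position Verlet on a 1D harmonic oscillator makes the velocity issue concrete. The function name `verlet`, the Taylor-expansion start-up for the first step, and the central-difference velocity estimate are assumptions for this illustration.

```python
def verlet(x0, v0, dt, n_steps, k=1.0, m=1.0):
    """Position Verlet for a 1D harmonic oscillator.
    Returns positions at steps 0..n_steps and velocities at steps 0..n_steps-1."""
    def accel(x):
        return -k * x / m

    x_prev = x0
    x = x0 + v0 * dt + 0.5 * accel(x0) * dt * dt    # first step from a Taylor expansion
    xs, vs = [x_prev, x], [v0]
    for _ in range(n_steps - 1):
        x_next = 2.0 * x - x_prev + accel(x) * dt * dt
        vs.append((x_next - x_prev) / (2.0 * dt))   # v(t) needs r(t+dt) first
        x_prev, x = x, x_next
        xs.append(x)
    return xs, vs
```

The velocity list is always one entry shorter than the position list, since the velocity at the latest time cannot be formed until the next position exists.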
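18. The Velocity Verlet algorithm
Regular Verlet has no explicit dependence on velocities, only on acceleration; it would be better to have the velocities appear explicitly. The Velocity Verlet algorithm solves this.
Taylor expand position, velocity:
$r(t+\Delta t) = r(t) + v(t)\,\Delta t + \tfrac{1}{2} a(t)\,\Delta t^2$
$v(t+\Delta t) = v(t) + a(t)\,\Delta t + \tfrac{1}{2} \ddot{v}(t)\,\Delta t^2$
Taylor expand acceleration, then rearrange and multiply by Δt/2:
$a(t+\Delta t) = a(t) + \ddot{v}(t)\,\Delta t$
$\tfrac{1}{2} \ddot{v}(t)\,\Delta t^2 = \tfrac{\Delta t}{2} \left[ a(t+\Delta t) - a(t) \right]$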
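19. The Velocity Verlet algorithm
Substitute in expression for second derivative of velocity:
$v(t+\Delta t) = v(t) + a(t)\,\Delta t + \tfrac{\Delta t}{2} \left[ a(t+\Delta t) - a(t) \right]$
We get this expression, then simplify:
$v(t+\Delta t) = v(t) + \tfrac{\Delta t}{2} \left[ a(t) + a(t+\Delta t) \right]$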
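20. The Velocity Verlet procedure
Step 1: Evaluate new positions: $r(t+\Delta t) = r(t) + v(t)\,\Delta t + \tfrac{1}{2} a(t)\,\Delta t^2$
Step 2: Evaluate forces (acceleration) at t+Δt: $a(t+\Delta t) = F(r(t+\Delta t))/m$
Step 3: Evaluate new velocities: $v(t+\Delta t) = v(t) + \tfrac{\Delta t}{2} \left[ a(t) + a(t+\Delta t) \right]$
Repeat procedure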
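The procedure maps directly onto a loop. In this sketch `velocity_verlet` and `harmonic_force` are hypothetical names, coordinates are stored as flat lists of floats, and the force routine is supplied by the caller; a real MD code would take forces from the force field.

```python
def velocity_verlet(r, v, masses, force, dt, n_steps):
    """Advance coordinates r and velocities v (flat lists of floats) by n_steps.
    `force(r)` must return one force component per coordinate."""
    a = [f / m for f, m in zip(force(r), masses)]
    for _ in range(n_steps):
        # Step 1: new positions from r(t), v(t), a(t)
        r = [ri + vi * dt + 0.5 * ai * dt * dt for ri, vi, ai in zip(r, v, a)]
        # Step 2: forces (accelerations) at t + dt
        a_new = [f / m for f, m in zip(force(r), masses)]
        # Step 3: new velocities from the average of old and new accelerations
        v = [vi + 0.5 * (ai + an) * dt for vi, ai, an in zip(v, a, a_new)]
        a = a_new
    return r, v

def harmonic_force(r, k=1.0):
    """Illustrative force: independent harmonic springs, F = -k r."""
    return [-k * ri for ri in r]

r, v = velocity_verlet([1.0], [0.0], [1.0], harmonic_force, dt=0.01, n_steps=10000)
print(0.5 * v[0] ** 2 + 0.5 * r[0] ** 2)  # total energy, initially 0.5
```

Only one force evaluation is needed per step because the acceleration computed in Step 2 is reused as a(t) in the next iteration.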
21. Predictor-corrector approach
1. Predict r, v, and a at time t+Δt using second-order Taylor expansions.
2. Calculate forces (and accelerations) from the new positions r(t+Δt).
3. Calculate the difference between the actual (calculated) and predicted accelerations:
$\Delta a(t+\Delta t) = a^{c}(t+\Delta t) - a^{p}(t+\Delta t)$
4. Correct positions, velocities, and accelerations using the acceleration difference Δa(t+Δt).
5. Repeat
23. Pros and cons of predictor-corrector
Pros
• Positions and velocities are corrected to order Δt⁴
• Very accurate for small Δt
Cons
• Not time reversible
• Not symplectic (area/energy preserving)
• Takes more time – two force evaluations per step.
• High memory requirements (15N instead of 9N).
24. Use of constraints to increase the integration step
The SHAKE algorithm fixes X-H bond lengths and allows the timestep to increase from 1 fs to 2 fs.
Also, hydrogen mass repartitioning: take mass from neighboring atoms and increase the mass of hydrogen to ~4 amu, allowing timesteps of ~4 fs.
[Figure: SHAKE applied to a bond of length d in three panels: unconstrained update; project out forces along the bond; correct for rotational lengthening.]
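For a single bond, the SHAKE correction can be sketched as an iteration that moves both atoms along the bond direction from the previous step until the bond length is restored. The function name `shake_bond`, the tolerance, and the iteration cap are assumptions; production codes handle many coupled constraints at once.

```python
def shake_bond(r_i, r_j, r_i_old, r_j_old, m_i, m_j, d, tol=1e-10, max_iter=100):
    """Restore |r_i - r_j| = d after an unconstrained update.
    Corrections are applied along the bond vector of the previous (constrained) step."""
    old = [a - b for a, b in zip(r_i_old, r_j_old)]   # reference bond direction
    r_i, r_j = list(r_i), list(r_j)
    for _ in range(max_iter):
        s = [a - b for a, b in zip(r_i, r_j)]         # current bond vector
        diff = sum(c * c for c in s) - d * d          # violation of the constraint
        if abs(diff) < tol:
            break
        # First-order estimate of the Lagrange multiplier for this constraint
        dot = sum(a * b for a, b in zip(s, old))
        g = diff / (2.0 * (1.0 / m_i + 1.0 / m_j) * dot)
        # Mass-weighted displacements keep the centre of mass fixed
        r_i = [a - g * o / m_i for a, o in zip(r_i, old)]
        r_j = [a + g * o / m_j for a, o in zip(r_j, old)]
    return r_i, r_j
```

Because the displacements are weighted by inverse mass, the light hydrogen moves much more than the heavy atom it is bonded to.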
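25. Lyapunov instability
Trajectories are sensitive to initial conditions!
Position of Nth particle at time t depends on the initial positions and momenta plus elapsed time:
$r(t) = f\left[ r^N(0), p^N(0); t \right]$
Perturbing the initial momenta by a small amount ε:
$r'(t) = f\left[ r^N(0), p^N(0) + \epsilon; t \right]$
The difference $\Delta r(t) = r(t) - r'(t)$ diverges exponentially, $|\Delta r(t)| \sim \epsilon \exp(\lambda t)$, where λ is the Lyapunov exponent.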
26. Lyapunov instability
Example: two particles out of 1000 in a Lennard-Jones simulation have the x-components of their velocities changed by +10^-10 and -10^-10.
Monitor the sum of the squares of the differences in positions of all particles:
$\sum_{i=1}^{N} |r_i(t) - r'_i(t)|^2$
Gets very large very quickly! (After only about 1000 steps.)
27. Periodic boundary conditions
• Can simulate the condensed phase with
a limited number of particles if we use
periodic boundary conditions.
• Needed to eliminate surface effects
• Particle interacts with “closest” images of
other molecules.
• A number of options in AMBER: cubic
box, truncated octahedron, spherical cap.
r_cut < L/2 (the cutoff radius must be less than half the box length L)
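The "closest image" rule is usually implemented as the minimum image convention. The sketch below assumes an orthorhombic box; the helper names `minimum_image` and `wrap` are hypothetical, and truncated octahedra need a different treatment.

```python
def minimum_image(r_i, r_j, box):
    """Displacement from particle j to particle i using the nearest periodic image.
    `box` holds the edge lengths of an orthorhombic cell (assumed cell shape)."""
    d = []
    for a, b, length in zip(r_i, r_j, box):
        delta = a - b
        delta -= length * round(delta / length)   # shift by a whole number of box lengths
        d.append(delta)
    return d

def wrap(r, box):
    """Map a position back into the primary cell [0, L) along each axis."""
    return [x % length for x, length in zip(r, box)]
```

With a cutoff below half the box length, at most one image of each neighbour can fall inside the cutoff sphere, so this single nearest image is the only one that needs to be considered.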
28. Periodic boundary conditions
• van der Waals interactions are usually
treated with a finite distance cutoff.
• Ewald summation treats long-range electrostatics accurately and efficiently by splitting the sum into a real-space (short-range) part and a reciprocal-space part (long-range in real space but short-range in inverse space), each of which converges quickly. Particle Mesh Ewald uses FFTs and scales as O(N log N).
• Choose a large enough simulation cell to
avoid contact between periodic images –
e.g. protein-protein interactions.
• Need cutoffs of interactions to be no more
than half the shortest box dimension.
• Need to neutralize the simulation cell with
counter-ions.
Cutoff approaches (better
than abrupt truncation):
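The figure for this slide is not reproduced, so the specific schemes it showed are unknown. Two commonly used alternatives to abrupt truncation are a shifted potential and a switching function; the sketch below illustrates both for a Lennard-Jones pair potential. The function names, the cubic switching polynomial, and the cutoff values are assumptions.

```python
def lj(r, epsilon=1.0, sigma=1.0):
    """Plain Lennard-Jones pair energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_truncated_shifted(r, r_cut=2.5, epsilon=1.0, sigma=1.0):
    """Shift the potential so that it reaches zero continuously at the cutoff."""
    if r >= r_cut:
        return 0.0
    return lj(r, epsilon, sigma) - lj(r_cut, epsilon, sigma)

def lj_switched(r, r_on=2.0, r_cut=2.5, epsilon=1.0, sigma=1.0):
    """Scale the potential by a smooth function that falls from 1 at r_on to 0 at r_cut."""
    if r >= r_cut:
        return 0.0
    if r <= r_on:
        return lj(r, epsilon, sigma)
    x = (r - r_on) / (r_cut - r_on)
    switch = 1.0 - x * x * (3.0 - 2.0 * x)   # cubic smoothstep, one possible choice
    return switch * lj(r, epsilon, sigma)
```

Abrupt truncation leaves a jump of size lj(r_cut) in the energy at the cutoff, which degrades energy conservation; both variants remove that jump.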
29. Speeding up MD calculations
• Lookup tables: pre-compute interaction energies at various distances and
interpolate to get value.
• Neighbor lists: lists of atoms to calculate interactions for, then only update
the list when atoms move a certain distance (about every 10-20 timesteps
for liquids, infrequent for solids). Storage issues for very large systems.
• Cell-index method: discretize simulation cell into sub-cells. Search only the
sub-cells within a certain distance (e.g. nearest neighbors).
• Multiple timestep dynamics (e.g. Berne’s RESPA method): evaluate and update forces due to different interactions on different timescales. Long-range interactions like electrostatics are updated least often; bonded terms are updated most often.
• Rigid bonds/mass repartitioning (covered earlier).
30. Temperature in MD
The equipartition theorem relates temperature to the average kinetic energy of the system.
Instantaneous temperature is:
$T(t) = \frac{1}{N_f k_B} \sum_{i} m_i v_i(t)^2$
where N_f is the number of degrees of freedom.
Thermostats may be used to control temperature (e.g. in NPT and NVT ensembles).
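As a sketch, the instantaneous temperature follows from equipartition as twice the kinetic energy divided by the number of degrees of freedom times k_B. The function name, the reduced units, and the `n_constraints` argument (for removed degrees of freedom such as fixed bonds or net momentum) are assumptions.

```python
K_B = 1.0  # Boltzmann constant in reduced units (assumption)

def instantaneous_temperature(masses, velocities, n_constraints=0):
    """T = (sum_i m_i |v_i|^2) / (N_dof * k_B), with N_dof = 3N - n_constraints."""
    n_dof = 3 * len(masses) - n_constraints
    twice_ke = sum(m * sum(c * c for c in v) for m, v in zip(masses, velocities))
    return twice_ke / (n_dof * K_B)

print(instantaneous_temperature([1.0, 1.0], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
```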
31. Berendsen thermostat
A form of velocity rescaling with weak coupling to an external bath.
Velocities get multiplied by a proportionality factor (λ) to move the temperature (T) closer to the set point (T0).
Proportionality factor:
$\lambda = \sqrt{1 + \frac{\Delta t}{\tau} \left( \frac{T_0}{T} - 1 \right)}$
Revised equations of motion:
$m_i \dot{v}_i = F_i + \frac{m_i}{2\tau} \left( \frac{T_0}{T} - 1 \right) v_i$
If τ is the same as the timestep, this is simply velocity rescaling. Typically τ = 0.1-0.4 ps.
Suppresses fluctuations in kinetic energy, so it does not truly produce a canonical ensemble.
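A sketch of the rescaling step, assuming the commonly quoted Berendsen factor λ² = 1 + (Δt/τ)(T0/T - 1); the function names are hypothetical.

```python
def berendsen_lambda(t_inst, t_target, dt, tau):
    """Velocity scaling factor for weak coupling to a bath at t_target."""
    return (1.0 + (dt / tau) * (t_target / t_inst - 1.0)) ** 0.5

def berendsen_rescale(velocities, t_inst, t_target, dt, tau):
    """Return velocities multiplied by the Berendsen factor."""
    lam = berendsen_lambda(t_inst, t_target, dt, tau)
    return [[lam * c for c in v] for v in velocities]

# With tau equal to the timestep the temperature jumps straight to the set point;
# a longer tau (time units must match dt) gives a gentler correction.
print(berendsen_lambda(250.0, 300.0, dt=0.002, tau=0.002) ** 2)
print(berendsen_lambda(250.0, 300.0, dt=0.002, tau=0.2) ** 2)
```

Since temperature scales with the square of the velocities, λ² is the factor by which the instantaneous temperature changes in one step.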
32. Andersen thermostat
Correctly samples NVT. Cannot be used to sample time-dependent properties, e.g. diffusion, hydrogen bond lifetimes.
Each atom at each integration step is subject to a small, random probability of collision with a heat bath. This is a stochastic process.
Probability of a collision event (distribution of times between collisions, with collision frequency ν):
$P(t) = \nu e^{-\nu t}$
For small timesteps, the collision probability per step is $\nu \Delta t$, and each particle is assigned a random number between 0 and 1. If that number is smaller than $\nu \Delta t$, then the momentum of the particle is reset.
The new momentum is drawn from a Gaussian (Maxwell-Boltzmann) distribution at the set point temperature.
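The collision step can be sketched as below, assuming a per-step collision probability of νΔt and velocity components redrawn from a Gaussian with variance k_B T / m. The function name, reduced units, and the returned collision count are illustrative choices.

```python
import random

K_B = 1.0  # Boltzmann constant in reduced units (assumption)

def andersen_collisions(masses, velocities, temperature, nu, dt, rng):
    """Give each particle a chance nu*dt of having its velocity redrawn from the heat bath."""
    p_collide = nu * dt
    new_velocities = []
    n_hits = 0
    for m, v in zip(masses, velocities):
        if rng.random() < p_collide:
            sigma = (K_B * temperature / m) ** 0.5   # width of the Maxwell-Boltzmann components
            v = [rng.gauss(0.0, sigma) for _ in range(3)]
            n_hits += 1
        new_velocities.append(list(v))
    return new_velocities, n_hits

rng = random.Random(0)
v, hits = andersen_collisions([1.0] * 1000, [[0.0, 0.0, 0.0]] * 1000, 1.0, nu=10.0, dt=0.01, rng=rng)
print(hits)  # on average about nu*dt*N collisions per call
```

This function would be called once per integration step, after the deterministic update of positions and velocities.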
33. Langevin dynamics
In Langevin dynamics, all particles experience a random force from particles “outside” the simulation as well as a friction force that lowers velocities. The friction force and random force are related in a way that guarantees NVT statistics.
$m_i \ddot{r}_i = -\nabla_i V(r) - \gamma m_i \dot{r}_i + \sqrt{2 \gamma k_B T m_i}\, R_i(t)$
The three terms are the standard force, the friction force with coefficient γ, and the random force, which contains the random number R(t) and is related to the friction force through γ.
Recommended values for γ are around 2-5 ps^-1. Langevin is susceptible to synchronization artifacts, so it’s important to use a random (not fixed) seed when initializing velocities. In some cases, Langevin can allow for longer time steps.
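One way to discretize Langevin dynamics is the BAOAB splitting (half kick, half drift, friction plus noise, half drift, half kick). The sketch below applies it to a single 1D particle; it is one possible integrator, not necessarily what any particular MD package uses, and the function name, reduced units, and returned kinetic temperature are assumptions.

```python
import math
import random

K_B = 1.0  # Boltzmann constant in reduced units (assumption)

def langevin_baoab(x, v, force, m, dt, gamma, temperature, n_steps, rng):
    """Integrate one 1D particle with friction gamma and matched random kicks.
    Returns the final x, final v, and the time-averaged kinetic temperature."""
    c = math.exp(-gamma * dt)                                   # velocity damping per step
    kick = math.sqrt((1.0 - c * c) * K_B * temperature / m)     # noise amplitude tied to gamma
    a = force(x) / m
    v_squared_sum = 0.0
    for _ in range(n_steps):
        v += 0.5 * dt * a                        # half kick from the standard force
        x += 0.5 * dt * v                        # half drift
        v = c * v + kick * rng.gauss(0.0, 1.0)   # friction force plus random force
        x += 0.5 * dt * v                        # half drift
        a = force(x) / m
        v += 0.5 * dt * a                        # half kick from the standard force
        v_squared_sum += v * v
    return x, v, m * v_squared_sum / (n_steps * K_B)

rng = random.Random(42)
x, v, t_kin = langevin_baoab(0.0, 0.0, lambda q: -q, 1.0, 0.05, 2.0, 1.5, 200000, rng)
print(t_kin)  # should fluctuate around the target temperature of 1.5
```

The damping factor and noise amplitude both depend on γ, which is the link between friction and random force that drives the system toward the target temperature.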
34. Nose-Hoover thermostat
Extended system method: introduce an additional artificial degree of freedom (s) with an artificial mass.
The extended Lagrangian contains a stretched timescale, an artificial mass, and kinetic and potential energy terms for the heat bath degree of freedom (s).
Sample the microcanonical ensemble in the extended system variables; fluctuations of s result in heat transfer between system and bath, so the canonical ensemble is sampled in the real system.
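The slide's equation is not reproduced here. For reference, the extended Lagrangian is commonly written in Nosé's form below; the symbols Q (artificial mass), T_0 (target temperature), and g (number of degrees of freedom, 3N + 1 in the original virtual-time formulation) are the conventional ones and are assumptions about what the slide showed.

```latex
\mathcal{L} = \sum_{i=1}^{N} \frac{m_i}{2}\, s^2 \dot{r}_i^{\,2} \;-\; V(r)
            \;+\; \frac{Q}{2}\, \dot{s}^2 \;-\; g\, k_B T_0 \ln s
```

The factor s² in the first term is the stretched timescale, Q \dot{s}²/2 is the kinetic energy of the heat bath variable, and g k_B T_0 ln s is its potential energy.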
35. Thermostat review
Thermostat | Description | True NVT? | Stochastic?
Velocity rescaling/Berendsen | KE (velocities) revised to produce desired T | No | No
Nose-Hoover | Extra degrees of freedom act as thermal reservoir | Yes | No
Langevin | Noise and friction give correct T | Yes | Yes
Andersen | Momenta re-randomized occasionally | Yes | Yes
36. Pressure in MD
The Clausius virial equation is used to obtain pressure from a molecular dynamics system:
$P = \frac{N k_B T}{V} + \frac{1}{3V} \left\langle \sum_{i=1}^{N} r_i \cdot F_i \right\rangle$
where r_i is the position of particle i and F_i is the force on it.
Barostats may be used to control pressure (e.g. in NPT ensemble).
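A sketch of the instantaneous virial pressure, assuming the form P = [N k_B T + (1/3) Σ r_i·F_i] / V in reduced units. The function name is hypothetical, and with periodic boundary conditions the virial should be accumulated pairwise from minimum-image separations rather than from absolute positions as done here.

```python
K_B = 1.0  # Boltzmann constant in reduced units (assumption)

def virial_pressure(positions, forces, temperature, volume):
    """Instantaneous pressure from the ideal-gas term plus the virial sum(r_i . F_i) / 3."""
    n = len(positions)
    virial = sum(
        sum(r_c * f_c for r_c, f_c in zip(r, f))
        for r, f in zip(positions, forces)
    )
    return (n * K_B * temperature + virial / 3.0) / volume

# Two particles pushing each other apart raise the pressure above the ideal-gas value.
p = virial_pressure([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                    [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                    temperature=1.0, volume=1.0)
print(p)
```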
37. Berendsen barostat
Used in the Amber code for NPT dynamics. Does not strictly sample from the NPT ensemble.
Positions and volume are rescaled:
$r_i \rightarrow \mu r_i, \quad V \rightarrow \mu^3 V$
Scaling factor:
$\mu = \left[ 1 - \frac{\beta \Delta t}{\tau} (P_0 - P) \right]^{1/3}$
where P0 is the target pressure and P is the instantaneous pressure, τ is the pressure coupling time (typically 1-5 ps), and β is the isothermal compressibility (44.6 × 10^-6 bar^-1 for water).
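A sketch of the rescaling, assuming the commonly quoted Berendsen factor μ = [1 - (βΔt/τ)(P0 - P)]^(1/3) applied to coordinates and box edges. The function names and the example pressures are illustrative.

```python
def berendsen_mu(p_inst, p_target, dt, tau_p, beta):
    """Length scaling factor for weak coupling to a pressure bath."""
    return (1.0 - (beta * dt / tau_p) * (p_target - p_inst)) ** (1.0 / 3.0)

def rescale(positions, box, mu):
    """Scale all coordinates and box edge lengths by mu (volume scales by mu**3)."""
    new_positions = [[mu * c for c in r] for r in positions]
    new_box = [mu * length for length in box]
    return new_positions, new_box

# Water compressibility from the text; pressures in bar, times in ps.
mu = berendsen_mu(p_inst=101.0, p_target=1.0, dt=0.002, tau_p=1.0, beta=44.6e-6)
print(mu)  # slightly above 1: the box expands to relieve the excess pressure
```

Because βΔt/τ is tiny for typical parameters, each step changes the box by only a few parts per million, which is what makes the coupling weak.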
38. Properties from MD runs
Autocorrelation functions:
Autocorrelation functions (ACFs) can be defined and calculated for any particle quantity (e.g. v_i) or any system quantity (e.g. U, T, P, ρ). A normalized ACF starts at 1 and usually decays exponentially with time.
Diffusion coefficient:
[Figure: velocity autocorrelation function <v_i(t)·v_i(0)> versus t (ps), with curves for a solid and a liquid.]
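The slide's diffusion equation is not reproduced, so the sketch below assumes the Green-Kubo relation D = (1/3) ∫ <v_i(t)·v_i(0)> dt, which matches the velocity autocorrelation plot. The function names, the nested-list trajectory layout, and the trapezoidal integration are assumptions.

```python
def vacf(velocities, max_lag):
    """Unnormalized velocity autocorrelation <v_i(0).v_i(t)> for lags 0..max_lag.
    velocities[t][i] is the 3-vector of particle i at frame t; the average runs
    over all particles and all available time origins."""
    n_frames = len(velocities)
    n_atoms = len(velocities[0])
    result = []
    for lag in range(max_lag + 1):
        total, count = 0.0, 0
        for t0 in range(n_frames - lag):
            for i in range(n_atoms):
                v0, vt = velocities[t0][i], velocities[t0 + lag][i]
                total += sum(a * b for a, b in zip(v0, vt))
                count += 1
        result.append(total / count)
    return result

def diffusion_green_kubo(c, dt):
    """D = (1/3) * time integral of the unnormalized VACF (trapezoidal rule)."""
    integral = sum(0.5 * (c[k] + c[k + 1]) * dt for k in range(len(c) - 1))
    return integral / 3.0
```

Dividing every entry of the result by its first entry gives the normalized ACF that starts at 1. In a solid the VACF oscillates about zero so the integral is small, while in a liquid it decays with a net positive area, giving a finite D.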
40. Summary
• Well-equilibrated molecular dynamics gives us access to
thermodynamic properties
• We need to choose the right ensemble, thermostat/barostat,
simulation cell, timestep, cutoffs, force fields for the job.
• Direct, unbiased molecular dynamics is limited to sampling the regions of the potential energy surface that the system has enough energy to reach, and to the timescales accessible with the selected timestep.
• Hydrogens (flexible or rigid) are the limiting factor in describing
molecular dynamics of organic systems.
• Adaptive sampling approaches are required to efficiently sample
rare events – higher energy portions of the potential energy surface,
slower processes.