2. Outline
Introduction
• Physical chemistry; the role of theory and computation
• An overview of computer simulations of molecules
• Molecular dynamics within the classical approximation
Methods
• Force fields
• Integrating the equations of motion
• Sampling from thermodynamic ensembles
Applications
• Protein folding structure and mechanism
• Conformational change and binding free energies
• Properties of condensed phase matter (water)
3. Introduction: Physical Chemistry
Physics helps us to understand chemical processes.
Chemical bonding and reactivity
• Synthesis of new chemical compounds (from nylon to Kevlar)
• Accelerating chemical reactions via catalysis
• Creating fuels and reducing pollution
Molecular interactions with:
• Electric potentials (electrochemistry): Photosystem II splits water in green plants
• Light (photochemistry): OD-PABA, found in sunscreen
• Other molecules (noncovalent interactions): anticancer drug “Gleevec” docking into tyrosine kinase
4. Introduction: Theory and Experiment
Theoretical and computational chemistry can offer explanations and predictions.
Experimental-Theoretical Collaboration (diagram)
• Theory and Computation to Experiment: insight into atomic scale processes; predictions of experimental results
• Experiment to Theory and Computation: observations needing explanation; verification of approximations
5. Example of an important process
Understanding the mechanism of photosynthesis requires both experimental and theoretical insights.
Photosystem II: Light harvesting and water splitting protein in green plants
2H2O + 4h+ → O2 + 4H+
4H+ + 2Q + 4e- → 2QH2
Energy stored: 3.60 eV
Figure: Light harvesting complexes in green plants (top) and purple bacteria (bottom)
Figure: Oxygen evolving center, from Umena et al., Nature, 2011.
6. Introduction: Physical Theories
The main tools of theoretical chemistry are quantum mechanics and statistical mechanics.
QUANTUM MECHANICS
$\hat{H}\Psi = E\Psi$
• The electronic structure of a molecule is completely described by solving Schrödinger’s equation
• For most systems, exact solution not available; approximations are computationally intensive
STATISTICAL MECHANICS
$Z = \int d^{3N}r \, e^{-E(\{r_i\}) / k_B T}$
• Given a potential energy surface, the partition function provides probabilities of states
• High-dimensional integral is evaluated by sampling, requiring computer power and efficient algorithms
7. The Schrödinger equation
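All ground-state quantum chemistry is based on the time-independent Schrödinger equation.
$\hat{H}\Psi = E\Psi$
$(\hat{T} + \hat{V})\Psi = E\Psi$
$\left( -\sum_i \frac{\nabla_i^2}{2} - \sum_{i,I} \frac{Z_I}{|\mathbf{R}_I - \mathbf{r}_i|} + \sum_{i<j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \right) \Psi(\mathbf{r}_i; \mathbf{R}_I) = E(\mathbf{R}_I) \, \Psi(\mathbf{r}_i; \mathbf{R}_I)$
• First term: kinetic energy operator
• Second term: electron-nuclear attraction
• Third term: electron-electron repulsion
• $\Psi(\mathbf{r}_i; \mathbf{R}_I)$: electronic wavefunction
• $E(\mathbf{R}_I)$: electronic energy
Born-Oppenheimer approximation: when solving for the electronic wavefunction, treat nuclei as static external potentials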
8. Useful results
Quantum mechanical calculations can provide much useful data.
• Electronic structure: Molecular orbitals can provide a detailed explanation of reaction mechanisms
• Spectra: Computing spectra helps us compare with experiment and study molecular interactions with light (may require time-dependent calculations for electrons)
• Potential energy surface: Evaluating the potential energy and its gradients allows us to simulate how molecules move
Figure labels: HOMO, LUMO
Figure credits: Lenhardt, Martinez and coworkers, Science, 2010; Wang, Wu and Van Voorhis, Inorg. Chem., 2011; Yost, Wang and Van Voorhis, J. Phys. Chem. B, 2011
9. Approximate solutions to Schrödinger equation
Schrödinger’s equation is nearly impossible to solve, so approximate methods are used.
$\left( -\sum_i \frac{\nabla_i^2}{2} - \sum_{i,I} \frac{Z_I}{|\mathbf{R}_I - \mathbf{r}_i|} + \sum_{i<j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \right) \Psi(\mathbf{r}_i) = E \, \Psi(\mathbf{r}_i)$
• Electron repulsion makes this problem difficult
• Approach 1: Use approximate wavefunction forms (Quantum chem.)
Slater determinant: Use wavefunction of noninteracting electrons for interacting system
$\Psi_{HF} = \begin{vmatrix} \chi_1(1) & \chi_1(2) & \cdots & \chi_1(N) \\ \chi_2(1) & \chi_2(2) & \cdots & \chi_2(N) \\ \vdots & \vdots & & \vdots \\ \chi_N(1) & \chi_N(2) & \cdots & \chi_N(N) \end{vmatrix}$
• Approach 2: Use the electron density as the main variable
Density functional theory: The ground-state electron density contains all information in the ground-state wavefunction
$\rho_0(\mathbf{r}) \leftrightarrow \Psi_0(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_N) \rightarrow E$
• Approach 3: Approximate the energy using empirical functions
Molecular mechanics: Use empirical functions and parameters to describe the energy (e.g. harmonic oscillator for chemical bond)
$E(\mathbf{R}_I) \approx k_{IJ} \left( R_{IJ} - R_{IJ}^0 \right)^2$
10. A wide scope of computer simulations
• Computer simulations of molecules span a wide range of resolutions
• More detailed theories can describe complex phenomena and offer higher accuracy
• Less detailed theories allow for simulation of larger systems / longer timescales
• Empirical force fields are the method of choice in the simulation of biomolecules
Scales shown in the figure:
One calculation for 2-3 atoms
100 fs, 30 atoms: photochemistry
10 ps, 100 atoms: fast chemical reactions
10 µs, 10k atoms: protein dynamics, drug binding
1 ms+, 1 million atoms: folding of large proteins, virus capsids
11. Introduction: Molecular dynamics simulation
The fundamental equation of classical molecular dynamics is Newton’s second law.
The force is given by the negative gradient of the potential energy.
$F = -\nabla V(r)$
Knowing the force allows us to accelerate the atoms in the direction of the force.
$F = ma$
Key approximations:
• Born-Oppenheimer approximation; no transitions between electronic states
• Approximate potential energy surface, either from quantum chemistry or from the force field
• Classical mechanics! Nuclei are quantum particles in reality, but this is ignored.
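The two equations on this slide can be sketched numerically. The example below is only an illustration under assumed conditions: a one-dimensional harmonic potential with made-up parameters `k` and `x0`, and a central finite difference standing in for the analytic gradient. The function names are hypothetical.

```python
def harmonic_potential(x, k=2.0, x0=1.0):
    """Illustrative 1D potential V(x) = 0.5 * k * (x - x0)^2 (parameters are assumptions)."""
    return 0.5 * k * (x - x0) ** 2


def numerical_force(potential, x, h=1e-5):
    """F = -dV/dx, estimated with a central finite difference of step h."""
    return -(potential(x + h) - potential(x - h)) / (2.0 * h)


def acceleration(force, mass):
    """Newton's second law rearranged: a = F / m."""
    return force / mass


if __name__ == "__main__":
    f = numerical_force(harmonic_potential, 1.5)
    print("force:", f, "acceleration for m = 2:", acceleration(f, 2.0))
```

In production codes the gradient is evaluated analytically rather than by finite differences; the finite difference here is just the shortest way to show that the force points downhill on the potential energy surface.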
13. Force Fields
• Force fields are built
from functional forms
and empirical parameters
• Interactions include
bonded pairwise, 3-
body, and 4-body
interactions…
• … as well as non-
bonded pairwise
interactions
• Simulation accuracy
depends critically on
choice of parameters
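As a rough illustration of "functional forms and empirical parameters", the sketch below sums a harmonic bond term for bonded pairs and a 12-6 Lennard-Jones term for non-bonded pairs. All names, parameter values and units are assumptions for illustration, not taken from any real force field; 3-body (angle), 4-body (torsion) and electrostatic terms are left out to keep it short.

```python
import math


def harmonic_bond(r, k, r0):
    """Bonded pairwise term: 0.5 * k * (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2


def lennard_jones(r, epsilon, sigma):
    """Non-bonded pairwise term: 4 * eps * ((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_energy(coords, bonds, epsilon, sigma):
    """Sum bond terms over bonded pairs and Lennard-Jones over all other pairs.

    coords: list of (x, y, z) tuples
    bonds:  dict mapping (i, j) with i < j to (k, r0)
    """
    energy = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(coords[i], coords[j])
            if (i, j) in bonds:
                k, r0 = bonds[(i, j)]
                energy += harmonic_bond(r, k, r0)
            else:
                energy += lennard_jones(r, epsilon, sigma)
    return energy
```

Changing `k`, `r0`, `epsilon` or `sigma` changes the energy surface without touching the functional form, which is why parameter choice matters so much for accuracy.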
14. Force Field Parameterization
Force fields are parameterized to compensate for
their simplified description of reality.
Most models have incomplete physics:
• Fixed point charges (no electronic polarization)
• Classical mechanics (no isotope effects)
• Fixed bond topology (no chemistry)
However, much can be recovered through parameterization:
• Increase the partial charges to recover polarization effects
• Tune vdW parameters to recover the experimental density
• In many cases, force fields exceed the accuracy of quantum methods!
15. Electrostatic functional forms
How detailed does the function need to be
in order to describe reality?
AMBER fixed-charge force field:
• Point charge on each atom
$\sum_{i<j} \frac{q_i q_j}{r_{ij}}$
AMOEBA polarizable force field:
• Point charge, dipole, and quadrupole on each atom
• Polarizable point dipole on each atom with short-range damping
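The fixed-charge pairwise sum can be written out directly. This is a minimal sketch, assuming unit-free charges and distances and a hypothetical prefactor `ke` (set to 1 here); it ignores exclusions, cutoffs and periodic boundaries, and it does not attempt the multipole or polarization terms of AMOEBA.

```python
import math


def coulomb_energy(charges, coords, ke=1.0):
    """Sum ke * q_i * q_j / r_ij over all unique pairs i < j."""
    energy = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(i + 1, n):
            r_ij = math.dist(coords[i], coords[j])
            energy += ke * charges[i] * charges[j] / r_ij
    return energy


if __name__ == "__main__":
    # Opposite unit charges two length units apart.
    print(coulomb_energy([1.0, -1.0], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```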
16. A careful balance of solute and solvent
The interactions in a force field must be carefully balanced to ensure correct behavior.
One must be careful to avoid force field bias in the simulation results…
• Solvent-protein interactions strong, protein-protein interactions weak: tendency to unfold / form random coil
• In between: intermediate structure
• Solvent-protein interactions weak, protein-protein interactions strong: tendency to fold or collapse
17. Integrating the equations of motion
MD simulation requires accurate numerical integration of Newton’s equations.
Euler method (unstable)
$x(t + \Delta t) = x(t) + v(t) \, \Delta t + O(\Delta t^2)$
$v(t + \Delta t) = v(t) + a(t) \, \Delta t + O(\Delta t^2)$
Velocity Verlet method
$x(t + \Delta t) = x(t) + v(t) \, \Delta t + \frac{1}{2} a(t) \, \Delta t^2 + O(\Delta t^4)$
$v(t + \Delta t) = v(t) + \frac{1}{2} \left( a(t) + a(t + \Delta t) \right) \Delta t + O(\Delta t^2)$
Beeman predictor-corrector method
$x(t + \Delta t) = x(t) + v(t) \, \Delta t + \frac{1}{6} \left( 4a(t) - a(t - \Delta t) \right) \Delta t^2 + O(\Delta t^4)$
$v(t + \Delta t) = v(t) + \frac{1}{6} \left( 2a(t + \Delta t) + 5a(t) - a(t - \Delta t) \right) \Delta t + O(\Delta t^3)$
• Integrators are algorithms that accelerate the atoms in the direction of the force.
• More sophisticated algorithms include higher order terms for better accuracy
• Time step is limited by fast degrees of freedom (bond vibrations)
• MD simulations are chaotic; small differences in the initial conditions quickly lead to very different trajectories
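To make the contrast between the Euler and velocity Verlet updates concrete, here is a small sketch that integrates a one-dimensional harmonic oscillator with both and reports the final total energy. The oscillator, its parameters (`k = m = 1`), the time step and the function names are assumptions chosen for illustration.

```python
def harmonic_accel(x, k=1.0, m=1.0):
    """Acceleration a = F / m for the force F = -k * x."""
    return -k * x / m


def euler_step(x, v, accel, dt):
    """Euler update: both x and v advance using values at time t."""
    a = accel(x)
    return x + v * dt, v + a * dt


def velocity_verlet_step(x, v, accel, dt):
    """Velocity Verlet update: v uses the average of old and new accelerations."""
    a = accel(x)
    x_new = x + v * dt + 0.5 * a * dt * dt
    a_new = accel(x_new)
    v_new = v + 0.5 * (a + a_new) * dt
    return x_new, v_new


def total_energy(x, v, k=1.0, m=1.0):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x


def final_energy(step, n_steps=1000, dt=0.05):
    """Start at x = 1, v = 0 (energy 0.5) and return the energy after n_steps."""
    x, v = 1.0, 0.0
    for _ in range(n_steps):
        x, v = step(x, v, harmonic_accel, dt)
    return total_energy(x, v)


if __name__ == "__main__":
    print("Euler:          ", final_energy(euler_step))
    print("Velocity Verlet:", final_energy(velocity_verlet_step))
```

For this system the Euler energy grows steadily with each step, while the velocity Verlet energy stays close to its initial value of 0.5, which is the behavior the "(unstable)" label refers to.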
18. A brief distraction…
Example of collisional trajectory
for Mars and the Earth.
• Numerical algorithms for
integrating Newton’s equations
are applied across many fields,
including gravitational
simulations in astrophysics
• Chaos exists for simple systems
with only a few interacting
bodies
• A small change in the initial
conditions (0.15 mm) results in
Venus and/or Mars eventually
colliding with the Earth in 3
billion years
J Laskar and M Gastineau, Nature 2009
19. Boundary conditions
Boundary conditions allow us to simulate condensed
phase systems with a finite number of particles.
• van der Waals interactions are typically
treated using finite distance cutoffs.
• Ewald summation treats long range
electrostatics accurately and efficiently
using real space and reciprocal space
summations
Important considerations!
• Choose large enough simulation cell to
avoid close contact between periodic
images
• Make sure the simulation cell is
neutralized using counter-ions (otherwise
Ewald is unphysical)
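One common way to combine periodic images with a finite distance cutoff is the minimum image convention, which this slide does not name explicitly; the sketch below assumes it, along with an orthorhombic box given as three edge lengths. Function names are hypothetical.

```python
import math


def minimum_image_distance(a, b, box):
    """Distance between a and b using the nearest periodic image along each axis."""
    d2 = 0.0
    for ai, bi, length in zip(a, b, box):
        d = ai - bi
        d -= length * round(d / length)  # shift by whole box lengths to the nearest image
        d2 += d * d
    return math.sqrt(d2)


def within_cutoff(a, b, box, cutoff):
    """True if the nearest image of b lies within the cutoff of a.

    The cutoff is assumed to be no larger than half the shortest box edge,
    so that at most one image of each particle can fall inside it.
    """
    return minimum_image_distance(a, b, box) <= cutoff


if __name__ == "__main__":
    box = (10.0, 10.0, 10.0)
    # Two particles near opposite faces of the box are close through the boundary.
    print(minimum_image_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), box))
```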
20. Statistical mechanical ensembles
Statistical mechanical ensembles allow our simulation to exchange energy with an external environment.
Ensemble menu: Choose one from each row
Particle number N      Chemical potential µ
Volume V               Pressure P
Energy E               Temperature T
$P_{NVT}(r) \propto e^{-E(r) / k_B T}$
Probability of a microstate in the canonical (NVT) ensemble.
• An ensemble represents all of the microstates (i.e. geometries) that are accessible to the simulation, and provides the probability of each microstate.
• An ideal MD simulation conserves the total energy and entropy, and samples the microcanonical (NVE) ensemble.
• More realistic systems may exchange energy, volume or particles with external reservoirs
• However, this could make the algorithms more difficult
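For a handful of discrete microstates, the canonical probability can be normalized explicitly. The sketch below assumes energies in kcal/mol and uses the corresponding Boltzmann constant; the example energies are made up, and a real simulation samples these probabilities rather than enumerating states.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_probabilities(energies, temperature, kb=K_B):
    """Return P_i = exp(-E_i / kT) / Z for a list of microstate energies."""
    beta = 1.0 / (kb * temperature)
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)  # partition function, up to the constant factor exp(-beta * e_min)
    return [w / z for w in weights]


if __name__ == "__main__":
    # Three hypothetical conformations at 0, 1 and 2 kcal/mol, at 300 K.
    for p in boltzmann_probabilities([0.0, 1.0, 2.0], 300.0):
        print(round(p, 3))
```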
21. Temperature and pressure control
Thermostats and barostats allow MD simulations to sample different thermodynamic ensembles.
Kinetic energy / temperature relationship
$T = \frac{2}{3 N k_B} \, KE$
Berendsen thermostat (uses velocity scaling)
$\frac{dT}{dt} = \frac{T_0 - T}{\tau}$
Andersen thermostat (velocity randomization)
$P(p) = \left( \frac{1}{2 \pi m k_B T} \right)^{3/2} e^{-p^2 / (2 m k_B T)}$
• Thermostat algorithms ensure that the temperature of our system (derived from kinetic energy) fluctuates around a target temperature that we set.
• The Berendsen thermostat achieves the target temperature but does not produce the correct canonical ensemble
• The Andersen thermostat produces the canonical ensemble by randomly resetting the momenta of particles (imitating random collisions)
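A short sketch of the kinetic energy / temperature relationship and of Berendsen-style velocity scaling follows. It assumes reduced units with `kb = 1`, no constraints (3N degrees of freedom), and the commonly used scaling factor sqrt(1 + (dt/tau)(T0/T - 1)), which relaxes the temperature toward the target T0 with time constant tau. Function names are hypothetical.

```python
import math


def kinetic_energy(masses, velocities):
    """KE = sum over particles of 0.5 * m * |v|^2."""
    return sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities))


def instantaneous_temperature(masses, velocities, kb=1.0):
    """T = 2 * KE / (3 * N * kb), assuming 3N unconstrained degrees of freedom."""
    n = len(masses)
    return 2.0 * kinetic_energy(masses, velocities) / (3.0 * n * kb)


def berendsen_scale(masses, velocities, t_target, dt, tau, kb=1.0):
    """Rescale all velocities so that T moves toward t_target by a fraction dt / tau."""
    t_current = instantaneous_temperature(masses, velocities, kb)
    lam = math.sqrt(1.0 + (dt / tau) * (t_target / t_current - 1.0))
    return [tuple(lam * c for c in v) for v in velocities]


if __name__ == "__main__":
    masses = [1.0, 1.0]
    velocities = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print("T before:", instantaneous_temperature(masses, velocities))
    scaled = berendsen_scale(masses, velocities, t_target=1.0, dt=0.1, tau=1.0)
    print("T after: ", instantaneous_temperature(masses, scaled))
```

Because every velocity is multiplied by the same factor, the scaling suppresses natural kinetic energy fluctuations, which is one way to see why this scheme does not generate the canonical ensemble.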
22. Other integrators and algorithms
Langevin dynamics represents a different physical process; Monte Carlo is a pure sampling strategy.
Langevin equation
$F = ma = -\nabla V(r) - \gamma m v + \sqrt{2 \gamma m k_B T} \, R(t)$
• Langevin dynamics (a.k.a. stochastic dynamics): Particles experience a friction force as well as a random force from particles “outside” the simulation
Metropolis criterion for canonical ensemble
$P(r_n \rightarrow r_{n+1}) = \begin{cases} \exp\left( \frac{E_n - E_{n+1}}{k_B T} \right), & \text{when } E_{n+1} > E_n \\ 1, & \text{otherwise} \end{cases}$
• Metropolis Monte Carlo method (a.k.a. Metropolis-Hastings): Generate any trial move you like, and accept / reject the move with a probability given by the Metropolis criterion.
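The Metropolis criterion translates almost line for line into code. The sketch below applies it to a one-dimensional harmonic potential with symmetric uniform trial moves; the potential, step size, seed and function names are illustrative assumptions.

```python
import math
import random


def acceptance_probability(e_old, e_new, kT):
    """Metropolis criterion: exp((E_n - E_n+1) / kT) for uphill moves, otherwise 1."""
    if e_new > e_old:
        return math.exp((e_old - e_new) / kT)
    return 1.0


def harmonic_energy(x, k=1.0):
    """Illustrative 1D potential E(x) = 0.5 * k * x^2."""
    return 0.5 * k * x * x


def metropolis_sample(energy, x0, n_steps, kT=1.0, step_size=1.0, seed=0):
    """Generate samples with symmetric uniform trial moves of width step_size."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(x_trial)
        if rng.random() < acceptance_probability(e, e_trial, kT):
            x, e = x_trial, e_trial
        samples.append(x)  # a rejected move repeats the current state
    return samples


if __name__ == "__main__":
    samples = metropolis_sample(harmonic_energy, 0.0, 50000)
    mean_x2 = sum(x * x for x in samples) / len(samples)
    print("<x^2> =", mean_x2, "(canonical value is kT / k)")
```

Note that a rejected move still contributes the current state to the sample list; dropping those repeats would bias the averages.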
23. Methodological challenges
The main challenges in molecular dynamics methods are simulation timescale and accuracy.
Timescale figure (seconds, log scale): 10^-15 (femto), 10^-12 (pico), 10^-9 (nano), 10^-6 (micro), 10^-3 (milli), 10^0 (seconds)
Processes, fastest to slowest: bond vibration, bond isomerization, water dynamics, helix forms, fast conformational change, slow conformational change
Simulation markers, left to right: MD step, long MD run, where we need to be, where we’d love to be
Statistical Mechanics: Efficiently sample the correct thermodynamic ensemble
Algorithms and Computer Science: Make our simulations run faster by
designing faster algorithms and taking maximum advantage of current hardware
Force Fields: Achieve higher accuracy without unduly increasing the complexity
of the calculation, using better parameterization methods (my work!)
Data Analysis: Draw scientific conclusions from huge volumes of data
Image courtesy V. S. Pande
25. Protein Folding: Structure Prediction
• Can we accurately reproduce / predict the structures of experimentally crystallized proteins?
• In many cases, protein folding simulations work amazingly well
• Folding studies help us understand intrinsically disordered proteins, which may be relevant in neurodegenerative disease (e.g. Alzheimer’s, Parkinson’s, Huntington’s)
Figure: A collection of fast-folding proteins. Lindorff-Larsen et al., Science 2011
Figure: Amyloid beta peptide (intrinsically disordered); structures labeled 0.17%, 0.16%, 0.16%, 0.13%, 0.13%. Y-S. Lin and V. S. Pande, Biophys. J. 2012
26. Protein Folding Mechanism and Kinetics
What are the pathways of protein folding? Is there a single preferred pathway, or are there many pathways?
• DE Shaw approach (above): Very long simulation trajectories (> 10 ms) using special Anton supercomputer
• Pande Group approach (right): Synthesize many short simulations (e.g. from F@H) into a model with discrete states and rates (MSM)
Lindorff-Larsen et al., Science 2011
Beauchamp et al., PNAS 2012
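The core bookkeeping behind "a model with discrete states and rates" can be sketched as counting transitions between states at a fixed lag time and normalizing each row. This is only a toy illustration with made-up trajectories and hypothetical function names; it assumes the frames have already been assigned to discrete states and skips the clustering, lag-time validation and reversible estimation that real MSM construction involves.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Count observed transitions i -> j separated by `lag` frames, over all trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for t in range(len(traj) - lag):
            counts[traj[t]][traj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts into transition probabilities; unvisited states get zero rows."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([0.0] * len(row))
        else:
            matrix.append([c / total for c in row])
    return matrix


if __name__ == "__main__":
    # Two short, independent trajectories over two states (0 and 1).
    trajs = [[0, 0, 1, 1], [1, 0, 0]]
    for row in transition_matrix(count_transitions(trajs, n_states=2)):
        print(row)
```

Because transitions are counted within each trajectory separately, many short independent runs can be pooled into one model, which is the point of the approach described above.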
27. G-Protein Coupled Receptor (GPCR)
G-protein coupled receptors (GPCRs) are targeted by roughly 50% of all drugs currently on the market
1. Signaling molecule (green) binds GPCR on extracellular side (blue), GPCR changes geometry
2. G protein (orange/yellow) binds to GPCR on intracellular side
3. Nucleotide (USA color) is released and initiates signaling cascade…
Nature special issue cover image, 2012.
28. G-Protein Coupled Receptor (GPCR)
2012 Nobel Prize in Chemistry!
First Crystal Structure of a GPCR
• For some GPCRs (e.g. β2AR)
both the agonist-bound active
and inactive structures have been
crystallized
• The conformational change of
the receptor is very subtle!
Kobilka and coworkers, Nature, 2011.
29. GPCR Conformational change
• What is the activation
mechanism of GPCRs?
• Investigate using large-scale
MD simulations and Markov
state models for analysis
D. Shukla, M. Lawrenz and V. S. Pande
30. Protein-Ligand Binding Free Energies
Trypsin with benzamidine
• 187/495 100 ns simulations achieved binding pose within 2 Å of the crystal pose
• Compute binding free energy within 1 kcal/mol of experiment
• How important is the role of dynamics in the process of protein / ligand binding?
• Can MD simulations improve the drug discovery process?
• Simulations involve the ligand molecule repeatedly visiting the binding pocket
• Lots of sampling required! Much lower throughput than “docking” but incorporates many more physical effects
Buch, Giorgino, and De Fabritiis. PNAS 2011
31. Force field development for water
• My project in the Pande group: An inexpensive polarizable water force field for biomolecular simulations, largely based on AMOEBA (2003)
• Model is fitted to energies and forces from QM calculations as well as experimental data
• Simpler and faster to calculate than AMOEBA, and also more accurate.
Figure: Simulated density of water vs. temperature. Our model is shown in light green
Right: Comparison of our model to high-level QM (MP2/aug-cc-pVTZ) energy and force calculations.
32. Properties of Condensed Phase Matter (water)
• By running NPT simulations we can
estimate the melting point of different
phases of ice
Left: Ice II (density 1190 kg m-3)
Right: Liquid water
240 K, 2500 atm
• Hopefully in a few weeks: Calculate
the phase diagram and compare our
results with experiment
33. Some classical molecular dynamics programs
Program     Best feature                                               Free?
AMBER       Large community; great modeling tools                      No
CHARMM      Force fields for nucleic acids, carbohydrates and lipids   No
GROMACS     Fast, also humorous                                        Yes
TINKER      Easy to edit, AMOEBA force field                           Yes
OpenMM      GPU acceleration, uses Python scripting language           Yes
DL-POLY     ?
NAMD        ?
LAMMPS      ?
34. Recommended reading
S. Adcock and A. McCammon. “Molecular Dynamics: Survey of Methods for
Simulating the Activity of Proteins.” Chem. Rev. 2006, 106, 1589.
S. Rasmussen and B. Kobilka. “Crystal structure of the human β2 adrenergic G-protein-coupled receptor.” Nature 2007, 450, 383.
B. Cooke and S. Schmidler. “Preserving the Boltzmann ensemble in replica-exchange
molecular dynamics.” J. Chem. Phys. 2008, 129, 164112. (Contains good introductions
to integrators, thermostats etc.)
C. Vega and J. L. F. Abascal. “Simulating water with rigid non-polarizable models: a
general perspective.” Phys. Chem. Chem. Phys. 2011, 13, 19663. (All about water
models.)
Two classic texts:
D. Frenkel and B. Smit, Understanding Molecular Simulation.
M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids.
For the ambitious:
L. D. Landau and E. M. Lifshitz, Classical Mechanics 3rd ed.