Numerical models for complex molecular systems

Konrad HINSEN
Centre de Biophysique Moléculaire, Orléans, France
and
Synchrotron SOLEIL, Saint Aubin, France
6 September 2018

Topics covered
1 Models in physics
2 Models in molecular simulation
3 My research interests in this area

Models in physics
Theories and models
Dominant model type: differential equations
Most models are “plugins” for “frameworks” called theories
Exception: ad-hoc models for emergent phenomena in complex systems
Some widely used theories and their models
Classical mechanics: Lagrangian function
Quantum mechanics: Hamilton operator
Thermodynamics: thermodynamic potential function
Statistical mechanics: partition function

Physical models and computation
Comparison with observation requires numbers, and thus computation
Models/equations are specifications, not algorithms
Finding solution algorithms is not trivial
Traditional path: analytical solution → numerical evaluation
Recent alternative: direct numerical solution
Both paths rely heavily on approximations.
Constructing and evaluating approximations to models is a big part of the everyday work of a physicist.

A simple example: simulating celestial mechanics
Given: past positions of the planets of the solar system
Goal: predict the future positions of these planets
K. Hinsen, Comp. Sci. Eng. 17(4), 2015

Phase 1: physics
1 Approximation: There is nothing but the solar system.
We neglect the influence of the rest of the universe.
2 Approximation: The Sun and the planets are point masses.
We neglect the influence of their sizes and shapes.
3 Approximation: Newton’s laws of motion and gravity
We neglect relativistic and quantum effects.

The physics model
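1. Law of motion (the theory):
$\frac{d}{dt}\,\mathbf{r}_i(t) = \mathbf{v}_i(t)$ (v: velocity)
$\frac{d}{dt}\,\mathbf{v}_i(t) = \mathbf{a}_i(t)$ (a: acceleration)
$\mathbf{F}_i(t) = m_i\,\mathbf{a}_i(t)$ (F: force, m: mass)
2. Law of universal gravitation:
$\mathbf{F}_i = \sum_{j=1,\, j \neq i}^{N} \mathbf{F}_{ij}$
$\mathbf{F}_{ij} = -G\,\frac{m_i m_j}{|\mathbf{r}_i - \mathbf{r}_j|^2}\,\frac{\mathbf{r}_i - \mathbf{r}_j}{|\mathbf{r}_i - \mathbf{r}_j|}$
3. Two observations $\mathbf{r}_i(t_1)$ and $\mathbf{r}_i(t_2)$
This model defines $\mathbf{r}_i(t)$ for all $t$, past and future.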

Phase 2a (idealist): computable analysis
Goal: construct an algorithm that, given $t$ and an error bound $\epsilon$, computes $\mathbf{r}_i^{(\epsilon)}(t)$ such that
$\left|\mathbf{r}_i(t) - \mathbf{r}_i^{(\epsilon)}(t)\right| < \epsilon$
possible in principle (existence proof)
hasn’t been done for Newton’s equations (as far as I know)
impractical in terms of CPU time and memory requirements
Marian B. Pour-El and J. Ian Richards, Computability in Analysis and Physics, Springer, Berlin, 1989

Phase 2b (realist): numerical analysis
1 Approximation: differentials → finite differences
Accept discretization error in return for solvable equations.
2 Approximation: real numbers → floating-point numbers
Accept round-off error in return for efficiency.
Choices to be made:
finite-difference scheme
integration step size
floating-point precision
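
To make the first approximation concrete, here is a sketch of how one finite-difference scheme can be obtained from the law of motion. It assumes the central-difference choice, which is only one of several possible schemes; $\Delta$ denotes the integration step size.

```latex
\begin{align}
\mathbf{r}_i(t \pm \Delta) &= \mathbf{r}_i(t) \pm \Delta\,\mathbf{v}_i(t)
  + \frac{\Delta^2}{2}\,\mathbf{a}_i(t)
  \pm \frac{\Delta^3}{6}\,\dddot{\mathbf{r}}_i(t) + O(\Delta^4) \\
\mathbf{r}_i(t+\Delta) + \mathbf{r}_i(t-\Delta) &= 2\,\mathbf{r}_i(t)
  + \Delta^2\,\mathbf{a}_i(t) + O(\Delta^4) \\
\mathbf{r}_i(t+\Delta) &= 2\,\mathbf{r}_i(t) - \mathbf{r}_i(t-\Delta)
  + \frac{\Delta^2}{m_i}\,\mathbf{F}_i(t) + O(\Delta^4)
\end{align}
```

Dropping the $O(\Delta^4)$ term is the discretization error accepted at each step. The resulting recurrence is the Störmer-Verlet scheme.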

The numerical model
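Störmer-Verlet integrator:
$\mathbf{r}_i^{(n+1)} = 2\,\mathbf{r}_i^{(n)} - \mathbf{r}_i^{(n-1)} + \frac{\Delta^2}{m_i}\,\mathbf{F}_i^{(n)}$
Gravitation:
$\mathbf{F}_i^{(n)} = \sum_{j=1,\, j \neq i}^{N} \mathbf{F}_{ij}^{(n)}$
$\mathbf{F}_{ij}^{(n)} = -G\,\frac{m_i m_j}{|\mathbf{r}_i^{(n)} - \mathbf{r}_j^{(n)}|^2}\,\frac{\mathbf{r}_i^{(n)} - \mathbf{r}_j^{(n)}}{|\mathbf{r}_i^{(n)} - \mathbf{r}_j^{(n)}|}$
$\mathbf{r}_i$: vector of three floats for each $i$
$\Delta$: integration step size (float)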
floating-point precision: IEEE-754 single/double, arbitrary via MPFR
Floating-point requires an explicit choice of the order of operations,
but then we have specified the results to the last bit!
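
The numerical model above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the cited article: the function names `forces` and `verlet_step`, the list-of-lists data layout, and the two-body test values are assumptions made for this example. Python floats are IEEE-754 double precision, so the precision choice is fixed here. Note that the code itself fixes one particular order of floating-point operations (for example, dividing by the cube of the distance), which is exactly the kind of choice the slide refers to.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (rounded)

def forces(positions, masses):
    """Gravitational force on each body, summed over all other bodies."""
    n = len(positions)
    result = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            diff = [positions[i][k] - positions[j][k] for k in range(3)]
            dist = math.sqrt(sum(c * c for c in diff))
            # -G m_i m_j / |r|^2 times the unit vector diff / |r|
            scale = -G * masses[i] * masses[j] / dist**3
            for k in range(3):
                result[i][k] += scale * diff[k]
    return result

def verlet_step(r_prev, r_curr, masses, dt):
    """One Störmer-Verlet step: r(n+1) = 2 r(n) - r(n-1) + dt^2/m * F(n)."""
    f = forces(r_curr, masses)
    return [
        [2.0 * r_curr[i][k] - r_prev[i][k] + dt * dt / masses[i] * f[i][k]
         for k in range(3)]
        for i in range(len(r_curr))
    ]

# Illustrative two-body setup (made-up values): equal masses, initially at rest.
masses = [1.0e10, 1.0e10]
r0 = [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
r1 = verlet_step(r0, r0, masses, dt=1.0)
```

In this toy setup the two bodies start at rest and move towards each other symmetrically, which gives a simple sanity check on the sign of the force.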

Phase 3: software
1 Approximation: algorithmic changes during code optimization
Accept modified results in return for speed.
2 Approximation: the compiler re-orders floating-point operations
Accept modified results in return for speed.
Verification/validation practically impossible:
approximations not documented for software users
users cannot opt out
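
Why re-ordering changes results can be seen in a three-number sum. The sketch below is a generic illustration of IEEE-754 double-precision behaviour in Python, not an example taken from any particular compiler or simulation code.

```python
# Floating-point addition is not associative: grouping changes the rounding.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # sum evaluated left to right
right = a + (b + c)  # same mathematical sum, different grouping

# The two results differ in the last bit.
reordering_changes_result = left != right
difference = abs(left - right)
```

A compiler that is allowed to regroup such sums for speed therefore produces a slightly different numerical model than the one written in the source code.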

The invasion of complexity
Scientific computation in the 1960s
Long but simple computations.
Check by hand for small N.
Small N → big N, re-run.
Slowly but surely...
Ever more complex objects of study.
Ever more complex models.
Ever more complex computational protocols.
Ever more complex software.
It’s becoming impossible to keep track of all approximations. Scientists don’t know which model they are actually using.

Complexity
Same equations, but a lot more points and parameters
More severe approximations required for efficiency
Software source code becomes very difficult to read...
... but we have no other precise notation for the models.

Complexity in Molecular Dynamics simulations
Principle
Follow the motions of the atomic nuclei.
Essential input: a model for the interactions between atoms
1964: liquid argon
A single atom type: argon
Lennard-Jones interactions: two parameters
1994: lysozyme (a small protein)
1 960 atoms of 26 distinct types (AMBER94 force field)
74 759 energy parameters
Parameter assignment requires non-trivial graph traversal algorithms
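
For contrast, the entire interaction model of the 1964 liquid argon case fits in one function. The sketch below uses the standard 12-6 form of the Lennard-Jones potential; the function name and the use of reduced units (epsilon = sigma = 1) are choices made for this illustration, and no argon-specific parameter values are implied.

```python
def lennard_jones(r, epsilon, sigma):
    """Lennard-Jones pair energy: 4*epsilon*((sigma/r)**12 - (sigma/r)**6).

    epsilon: depth of the potential well
    sigma:   distance at which the pair energy is zero
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# In reduced units the energy minimum lies at r = 2**(1/6) with value -epsilon.
r_min = 2.0 ** (1.0 / 6.0)
energy_at_minimum = lennard_jones(r_min, epsilon=1.0, sigma=1.0)
```

Two numbers, epsilon and sigma, specify the whole model here, against tens of thousands of parameters for the protein.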

Uncertainty through obscurity: a recent case
A. Smart, Physics Today, 22 August 2018
My view: not a coding error, but a badly chosen approximation that is documented nowhere but in unpublished source code.

Research agenda for taking better care of models
Goals
Make models readable by scientists (source code → paper)
Make all approximations explicit and exposed to peer review
K. Hinsen, F1000 Research 3, 101 (2014)
Approaches
Digital scientific notations
Model-Driven Engineering?
K. Hinsen, The Self-Journal of Science (2016)
K. Hinsen, PeerJ CompSci 4, e158 (2018)

Digital scientific notations
Notations for models.
Formal languages.
Specification, not implementation.
Human-readable, embedded in plain text.

Leibniz: a digital scientific notation
An algebraic specification language inspired by Maude
Based on equational logic and term rewriting
Main novelty: embedded into plain text (“literate specification”)
Application domains:
Development focus: physics, chemistry
More generally: models based on continuous mathematics
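
For readers unfamiliar with term rewriting, the following Python sketch shows the general idea on a toy scale. It is not Leibniz code and does not reflect how Leibniz is implemented: the tuple representation of terms, the function names, and the two simplification rules are all assumptions made for illustration.

```python
# Terms are nested tuples ("op", arg1, arg2); leaves are strings or numbers.

def add_zero(term):
    """Rule: x + 0 -> x. Returns None when the rule does not apply."""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "+" and term[2] == 0:
        return term[1]
    return None

def mul_one(term):
    """Rule: x * 1 -> x. Returns None when the rule does not apply."""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "*" and term[2] == 1:
        return term[1]
    return None

def rewrite_once(term, rules):
    """Apply the first rule that matches at the root of the term."""
    for rule in rules:
        result = rule(term)
        if result is not None:
            return result
    return None

def normalize(term, rules):
    """Rewrite arguments first, then the root, until no rule applies."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalize(arg, rules) for arg in term[1:])
    result = rewrite_once(term, rules)
    return term if result is None else normalize(result, rules)

RULES = [add_zero, mul_one]
simplified = normalize(("+", ("*", "x", 1), 0), RULES)  # (x * 1) + 0
```

Each rule is an equation read from left to right; computing means applying rules until none matches.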

Unusual features
Doing research ≠ constructing software tools
No namespaces, no scopes, but explicit renaming.
Minimal built-in functionality: numbers, strings, and booleans.
No “standard library”.
Adapt published libraries rather than re-use without modification.
Discourage the creation of black-box code libraries.
Understandability takes priority over modularity and reusability.
Think of it as executable mathematics, not software.

Example: source code
#lang leibniz
@import["mechanics" "mechanics.xml"]
@import["quantities" "quantities.xml"]
@title{Motion of a mass on a spring}
@author{Konrad Hinsen}
@context["mass-on-a-spring"
#:use "mechanics/dynamics"
#:use "quantities/angular-frequency"]{
We consider a point-like object of mass @op{m : M} attached to a
spring whose mass we assume to be negligible. The other end of the
spring is attached to a wall. When the particle is at position
@op{x : T→L}, the force @op{F : T→F} acting on it is proportional
to the displacement @op{d : T→L} of @term{x} relative to the
spring's equilibrium length @op{l : L}:
@inset{
@equation[def-d]{d = x - l} @linebreak[]
@equation[force]{F = -(k × d)}
}
where @op{k : force-constant} characterizes the elastic properties
of the spring.
Newton's equation of motion for the position @term{x} of the mass
takes the form
@inset{
@equation[newton-x]{m × 𝒟(𝒟(x)) = -(k × (x - l))}
}
This is a second-order ordinary differential equation, which can be
rewritten in terms of the displacement @term{d}, yielding

Example: rendered view
Motion of a mass on a spring
We consider a point-like object of mass m:M attached to a spring whose mass we
assume to be negligible. The other end of the spring is attached to a wall. When the
particle is at position x:T→L, the force F:T→F acting on it is proportional to the
displacement d:T→L of x relative to the spring’s equilibrium length l:L:
def-d: d = x - l
force: F = -(k × d)
where k:force-constant characterizes the elastic properties of the spring.
Newton’s equation of motion for the position x of the mass takes the form
newton-x: m × 𝒟(𝒟(x)) = -(k × (x - l))
This is a second-order ordinary differential equation, which can be rewritten in terms
of the displacement d, yielding
newton-d: 𝒟(𝒟(d)) = -((k ÷ m) × d).
Introducing ω:angular-frequency defined by ω = √(k ÷ m), the solution can be written
as
solution: d[t] = A × cos((ω × t) + δ)
∀ t : T,
where cos(angle):ℝ is the cosine function. The amplitude A:L and the phase δ:angle
can take arbitrary values.
Additional arithmetic definitions for this context:
by Konrad Hinsen
Context mass-on-a-spring
uses mechanics/dynamics
uses quantities/angular-frequency
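
The claim that d[t] = A × cos((ω × t) + δ) solves newton-d can be checked numerically. The Python sketch below is an independent illustration, not output of Leibniz: the values chosen for k, m, A and δ are arbitrary, and the second derivative is approximated by a central finite difference, so the residual is small but not exactly zero.

```python
import math

def displacement(t, amplitude, omega, phase):
    """Candidate solution d(t) = A * cos(omega * t + delta)."""
    return amplitude * math.cos(omega * t + phase)

def second_derivative(f, t, h=1e-4):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

# Arbitrary illustrative values (not from the slides).
k, m = 4.0, 1.0
omega = math.sqrt(k / m)

def d(t):
    return displacement(t, amplitude=0.5, omega=omega, phase=0.3)

# newton-d states d'' = -(k/m) * d, so this residual should be close to zero.
residual = second_derivative(d, 1.2) + (k / m) * d(1.2)
```

The residual shrinks with the step h until round-off error takes over, a small-scale instance of the two approximations discussed under numerical analysis.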

Example: machine-readable XML file
<leibniz-document>
<library>
<document-ref id="mechanics">mechanics.xml</document-ref>
<document-ref id="quantities">quantities.xml</document-ref>
</library>
<context id="mass-on-a-spring">
<includes>
<use>mechanics/dynamics</use>
<use>quantities/angular-frequency</use>
</includes>
<sorts>
<sort id="ℝ" />
<sort id="angle" />
<sort id="T→A" />
<sort id="angular-frequency" />
<sort id="T→L" />
<sort id="angular-frequency-squared" />
<sort id="T→F" />
<sort id="force-constant" />
<sort id="L" />
<sort id="M" />
</sorts>
<subsorts />
<vars />
<ops>
<op id="m">
<arity />
<sort id="M" />
</op>
<op id="√">
<arity>
</arity>
</op>
<op id="ω">
<arity />
</op>
<op id="_÷">
<arity>
<sort id="M" />
</arity>
</op>
<op id="k">
<arity />
</op>
<op id="F">
<arity />
<sort id="T→F" />
</op>
<op id="x">
<arity />
<sort id="T→L" />
</op>
<op id="_×">
<arity>
<sort id="T→L" />
</arity>
<sort id="T→A" />
</op>
<op id="cos">
<arity>
<sort id="angle" />
</arity>
<sort id="ℝ" />
</op>
<op id="l">
<arity />
<sort id="L" />
</op>
<op id="_×">
<arity>
<sort id="T→L" />
</arity>
<sort id="T→F" />
</op>
<op id="δ">
<arity />
<sort id="angle" />
</op>
<op id="A">
<arity />
<sort id="L" />
</op>
<op id="d">
<arity />
<sort id="T→L" />
</op>
</ops>
<rules />
<assets>
<asset id="newton-x">
<equation>
<vars />
<left>
<term op="_×">
<term-or-var name="m" />
<term op="𝒟">
<term op="𝒟">
<term-or-var name="x" />
</term>
</term>
</term>
</left>
<condition />
<right>
<term op="-">
<term op="_×">
<term-or-var name="k" />
<term op="_-">
<term-or-var name="l" />
</term>
</term>
</term>
</right>
</equation>
</asset>
<asset id="force">
<equation>
<vars />
<left>
<term-or-var name="F" />
</left>
<right>
<term op="-">
<term op="_×">
<term-or-var name="d" />
</term>
</term>
</right>
</equation>
</asset>
<asset id="newton-d">
<equation>
<vars />
<left>
<term op="𝒟">
<term op="𝒟">
</term>
</term>
</left>
<condition />
<right>
<term op="-">
<term op="_×">
<term op="_÷">
<term-or-var name="m" />
</term>
</term>
</term>
</right>
</equation>
</asset>
<asset id="def-d">
<equation>
<vars />
<left>
</left>
<condition />
<right>
<term op="_-">
<term-or-var name="l" />
</term>
</right>
</equation>
</asset>
<asset id="solution">
<equation>
<vars>
<var id="t" sort="T" />
</vars>
<left>
<term op="[]">
<term-or-var name="t" />
</term>
</left>
<condition />
<right>
<term op="_×">
<term-or-var name="A" />
<term op="cos">
<term op="_+">
<term op="_×">
<term-or-var name="ω" />
<term-or-var name="t" />

Play with it yourself
The code is on GitHub.
Warning: research code!
Look at the growing collection of examples.
Written in Racket, which provides excellent support for this kind of project:
the Scribble language for writing documentation, which Leibniz extends.
the DrRacket programming environment, which is Leibniz’ authoring environment.
