WCSB 2012
1. Bayesian inference for the chemical master equation using approximate models
Colin Gillespie
Joint work with Andrew Golightly
2. Modelling
Represent a biochemical network with a set of (pseudo-)biochemical
reactions
Specify the rate laws and rate constants of the reactions
Run some stochastic or deterministic computer simulator of the
system dynamics
Forward simulation is straightforward using the Gillespie algorithm. The
reverse problem is trickier: given time-course data and a set of
reactions, can we recover the rate constants?
3. Modelling
Generically, $k$ species and $r$ reactions, with a typical reaction
$$R_i : u_{i1} Y_1 + \cdots + u_{ik} Y_k \xrightarrow{c_i} v_{i1} Y_1 + \cdots + v_{ik} Y_k$$
Stochastic rate constant: $c_i$
Hazard / instantaneous rate: $h_i(Y(t), c_i)$, where
$Y(t) = (Y_1(t), \ldots, Y_k(t))$ is the current state of the system and
$$h_i(Y(t), c_i) = c_i \prod_{j=1}^{k} \binom{Y_j(t)}{u_{ij}}$$
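As an illustrative sketch (not from the talk), the mass-action hazard above can be computed with Python's `math.comb`, which gives the binomial coefficient $\binom{Y_j(t)}{u_{ij}}$:

```python
from math import comb

def hazard(c_i, u_i, y):
    """Mass-action hazard h_i(Y(t), c_i) = c_i * prod_j C(Y_j(t), u_ij).

    u_i holds the reactant stoichiometries u_ij and y the current
    molecule counts Y_j(t); comb(y_j, u_ij) is 0 when y_j < u_ij,
    so infeasible reactions get zero hazard automatically.
    """
    h = c_i
    for u_ij, y_j in zip(u_i, y):
        h *= comb(y_j, u_ij)
    return h

# Dimerisation 2 Y1 -> Y2 with c = 0.5 and 10 molecules of Y1:
# h = 0.5 * C(10, 2) = 22.5
print(hazard(0.5, [2, 0], [10, 0]))
```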
4. Modelling
Some remarks:
This setup describes a Markov jump process (MJP)
The effect of reaction Ri is to change the value of each Yj by vij − uij
It can be shown that the time to the next reaction is
$$\tau \sim \text{Exp}\{h_0(Y(t), c)\} \quad \text{where} \quad h_0(Y(t), c) = \sum_{i=1}^{r} h_i(Y(t), c_i)$$
and the reaction is of type $i$ with probability $h_i(Y(t), c_i)/h_0(Y(t), c)$
Hence, the process is easily simulated (and this technique is known
as the Gillespie algorithm)
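A minimal Python sketch of this direct method (the immigration-death system in the usage line is a hypothetical toy example, not one from the talk):

```python
import random
from math import comb, prod

def gillespie(y0, c, pre, post, t_max, rng=None):
    """Direct-method Gillespie simulation.

    pre[i][j] = u_ij and post[i][j] = v_ij, so reaction R_i changes
    each Y_j by post[i][j] - pre[i][j].
    """
    rng = rng or random.Random(1)
    y, t = list(y0), 0.0
    path = [(t, tuple(y))]
    while True:
        # mass-action hazards: h_i(Y(t), c_i) = c_i * prod_j C(Y_j(t), u_ij)
        h = [c[i] * prod(comb(y[j], pre[i][j]) for j in range(len(y)))
             for i in range(len(c))]
        h0 = sum(h)
        if h0 == 0:
            break  # no reaction can fire
        t += rng.expovariate(h0)  # time to next reaction: tau ~ Exp{h0}
        if t > t_max:
            break
        # reaction type i is chosen with probability h_i / h0
        u, i = rng.random() * h0, 0
        while u > h[i]:
            u -= h[i]
            i += 1
        y = [y[j] + post[i][j] - pre[i][j] for j in range(len(y))]
        path.append((t, tuple(y)))
    return path

# Hypothetical toy system: immigration 0 -> Y1 (c1 = 1.0), death Y1 -> 0 (c2 = 0.1)
path = gillespie([0], [1.0, 0.1], [[0], [1]], [[1], [0]], t_max=10.0)
```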
5. Inference for the exact model
Aim: infer the ci given time-course biochemical data. Following Boys et al.
(2008) and Wilkinson (2006):
Suppose we observe the entire process Y over [0, T ]
The i th unit interval contains ni reactions with times and types
(tij , kij ), j = 1, 2, . . . , ni
Hence, the likelihood for c is
$$\pi(Y|c) = \prod_{i=0}^{T-1} \prod_{j=1}^{n_i} h_{k_{ij}}\!\left(Y(t_{i,j-1}), c_{k_{ij}}\right) \exp\left\{-\int_0^T h_0(Y(t), c)\, dt\right\}$$
7. Inference for the exact model
Problem: it is not feasible to observe all reaction times and types
Assume data are observed on a regular grid, with
$$Y_{0:T} = \{Y(t) = (Y_1(t), Y_2(t), \ldots, Y_k(t)) : t = 0, 1, 2, \ldots, T\}$$
Idea: use a Gibbs sampler to alternate between draws of
1. times and types of reactions in (0, T ] conditional on c and the
observations,
2. each ci conditional on the augmented data
Note that step 1 can be performed for each interval $(i, i+1]$ in turn, due to
the factorisation of $\pi(Y|Y_{0:T}, c)$
11. Difficulties
These techniques do not scale well to problems of realistic size and
complexity...
True process is discrete and stochastic — stochasticity is vital — what
about discreteness?
Treating molecule numbers as continuous and performing exact
inference for the resulting approximate model appears to be
promising...
12. Bayesian filtering
Suppose we have noisy observations $X_{0:(i-1)} = \{X(t) : t = 0, 1, \ldots, i-1\}$
where, for example,
$$X(t) = Y(t) + \epsilon, \qquad \epsilon \sim N(0, \Sigma)$$
Goal: generate a sample from $\pi[c, Y(i)|X_{0:i}]$ given a new datum $X(i)$
$$\pi[c, Y(i)|X_{0:i}] \propto \int \pi[c, Y(i-1)|X_{0:i-1}]\, \pi_h[Y_{(i-1,i]}|c]\, \pi[Y(i)|X(i)]\, dY_{[i-1,i)}$$
$$\propto \text{prior} \times \text{likelihood}$$
where $Y_{(i-1,i]} = \{Y(t) : t \in (i-1, i]\}$ is the latent path in $(i-1, i]$ and $\pi_h[\,\cdot\,|c]$
is the associated density under the simulator
Idea: if we can sample π [c , Y (i − 1)|X0:i −1 ] then we can use MCMC to sample
the target π [c , Y (i )|X0:i ]
15. Idealised MCMC scheme
1. Propose $(c^*, Y(i-1)^*) \sim \pi[\,\cdot\,|X_{0:i-1}]$
2. Draw $Y^*_{(i-1,i]} \sim \pi_h[\,\cdot\,|c^*]$ by forward simulation
3. Accept/reject with probability
$$\min\left\{1, \frac{\pi[Y(i)^*|X(i)]}{\pi[Y(i)|X(i)]}\right\}$$
Since the simulator is used as a proposal process, we don’t need to
evaluate its associated density.
Similar to Approximate Bayesian Computation (ABC)
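One iteration of the idealised scheme can be sketched as follows; `simulate` and `obs_dens` are hypothetical stand-ins for the forward simulator and the observation density $\pi[Y(i)|X(i)]$, and are not names from the talk:

```python
import random

def mh_step(particles, y_curr, x_i, simulate, obs_dens, rng=None):
    """One accept/reject step of the idealised MCMC scheme.

    particles: draws of (c, Y(i-1)) approximating pi[. | X_{0:i-1}].
    """
    rng = rng or random.Random(0)
    c_star, y_prev_star = rng.choice(particles)  # 1. propose from pi[.|X_{0:i-1}]
    y_star = simulate(y_prev_star, c_star, rng)  # 2. forward simulate over (i-1, i]
    # 3. accept with probability min{1, pi[Y(i)*|X(i)] / pi[Y(i)|X(i)]};
    #    the simulator's density is never evaluated, only sampled from
    if rng.random() < min(1.0, obs_dens(y_star, x_i) / obs_dens(y_curr, x_i)):
        return c_star, y_star
    return None  # rejection: keep the current state
```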
16. A particle approach
Since $\pi[c, Y(i-1)|X_{0:i-1}]$ typically has no analytic form,
we approximate this density with an equally weighted cloud of points,
or particles, hence the term particle filter
The scheme is initialised with a sample from the prior, and the posterior
sample calculated at time $i-1$ is used as the prior at time $i$
To avoid sample impoverishment, in step 1 of the scheme we sample from
the kernel density estimate (KDE) of $\pi[c^*, Y(i-1)^*|X_{0:i-1}]$ by
adding zero-mean Normal noise to each proposed particle
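Sampling from the KDE amounts to picking a particle uniformly and jittering it with zero-mean Normal noise. A minimal sketch, with the bandwidth as an assumed tuning constant:

```python
import random

def sample_kde(particles, bandwidth=0.05, rng=None):
    """Draw from a Gaussian KDE of the particle cloud.

    Pick a particle uniformly, then add zero-mean Normal noise to each
    component; repeated particles are thereby jittered apart, which is
    what avoids sample impoverishment.
    """
    rng = rng or random.Random(0)
    particle = rng.choice(particles)
    return tuple(p + rng.gauss(0.0, bandwidth) for p in particle)
```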
17. Three types of approximation
The distribution $\pi[c, Y(i-1)|X_{0:i-1}]$ is represented as a discrete
distribution; in theory, as the number of particles $N$ increases, the
approximation improves
We perturb particles; as $N$ increases, the perturbation decreases
Simulator
Exact: Gillespie
Approximate: SDE, but can perform poorly for low molecule numbers
Approximate: ODE, but ignores stochasticity
20. Case study
Wish to compare the hybrid scheme with inferences made assuming the
“true” discrete, stochastic model
To allow such inferences, consider a toy example:
Autoregulatory network
Immigration: $R_1 : \emptyset \xrightarrow{c_1} Y_1$, $R_2 : \emptyset \xrightarrow{c_2} Y_2$
Death: $R_3 : Y_1 \xrightarrow{c_3} \emptyset$, $R_4 : Y_2 \xrightarrow{c_4} \emptyset$
Interaction: $R_5 : Y_1 + Y_2 \xrightarrow{c_5} 2Y_2$
Corrupt the data by constructing
$$X_i(t)\,|\,Y_i(t) \sim \begin{cases} \text{Poisson}(Y_i(t)) & \text{if } Y_i(t) > 0, \\ \text{Bernoulli}(0.1) & \text{if } Y_i(t) = 0, \end{cases}$$
for each $i = 1, 2$, so that $Y(t)$ is never observed without error
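The corruption mechanism can be sketched as follows, using only the standard library; the inversion-style Poisson sampler is an assumed implementation detail, adequate for the small counts in this toy model:

```python
import math
import random

def corrupt(y, rng=None):
    """Case-study observation model: X_i(t) ~ Poisson(Y_i(t)) when
    Y_i(t) > 0, and Bernoulli(0.1) when Y_i(t) = 0."""
    rng = rng or random.Random(0)
    x = []
    for y_i in y:
        if y_i > 0:
            # Poisson draw: multiply uniforms until the product drops
            # below exp(-lambda); the count of factors needed is the draw
            limit, k, p = math.exp(-y_i), 0, 1.0
            while True:
                p *= rng.random()
                if p <= limit:
                    break
                k += 1
            x.append(k)
        else:
            x.append(1 if rng.random() < 0.1 else 0)
    return x
```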
25. Summary
Inferring rate constants that govern discrete stochastic kinetic models
is computationally challenging
It appears promising to consider an approximation of the model and
perform exact inference using the approximate model
In this case study, the SDE and ODE models performed surprisingly well,
although the ODE model had poor MCMC mixing properties
26. Future
A hybrid forwards simulator (or indeed any forwards simulator) can be
used as a proposal process inside a particle filter
However, for this model the Salis & Kaznessis hybrid simulator is
slower than the standard Gillespie algorithm
A particle MCMC approach provides an exact inference procedure
(see the recent Andrieu et al. JRSS B read paper)
A preprint using this PMCMC approach is available; it also considers the
hybrid simulator and the linear noise approximation
27. Boys, R. J., Wilkinson, D. J. and Kirkwood, T. B. L. (2008). Bayesian inference for a discretely
observed stochastic kinetic model. Statistics and Computing 18.
Gillespie, C. S. and Golightly, A. (2010). Bayesian inference for generalized stochastic
population growth models with application to aphids. JRSS Series C 59(2).
Heron, E. A., Finkenstädt, B. and Rand, D. A. (2007). Bayesian inference for dynamic
transcriptional regulation; the Hes1 system as a case study. Bioinformatics 23.
Salis, H. and Kaznessis, Y. (2005). Accurate hybrid simulation of a system of coupled
biochemical reactions. Journal of Chemical Physics 122.
Wilkinson, D. J. (2006). Stochastic Modelling for Systems Biology. Chapman & Hall/CRC Press.
Contact details...
email: colin.gillespie@ncl.ac.uk
www: http://www.mas.ncl.ac.uk/~ncsg3/