- The document describes a talk given by Umberto Picchini on Bayesian inference for a mixed-effects stochastic differential equation (SDE) model of tumor growth.
- The talk introduces a state-space model for tumor growth in mice with dynamics driven by an SDE, and formulates a mixed-effects SDE model to estimate population parameters.
- The talk aims to show how to perform approximate Bayesian inference for the mixed-effects SDE model using synthetic likelihoods.
Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of Tumor Growth
1. Inference via Bayesian Synthetic Likelihoods for
a Mixed-Effects SDE Model of Tumor Growth
Umberto Picchini
Centre for Mathematical Sciences,
Lund University
European Meeting of Statisticians
Helsinki 24-28 July 2017
Umberto Picchini (umberto@maths.lth.se)
2. This is joint ongoing work with Julie Lyng Forman (Biostatistics unit,
University of Copenhagen).
This presentation is based on the working paper:
P. and Forman (2017). Stochastic differential equation mixed effects
models for tumor growth and response to treatment,
arXiv:1607.02633.
3. In this talk we have three main goals:
Introduce a state-space model for tumor growth in mice, with
dynamics driven by a stochastic differential equation (SDE).
Formulate a mixed-effects SDE model for population estimation.
Show how to perform approximate Bayesian inference for our
mixed-effects SDE model using synthetic likelihoods.
Should we decide to make our model more complex, we can
seriously consider the synthetic likelihood approach for
non-state-space models with intractable likelihoods.
5. Nowadays there are several ways to deal with “intractable likelihoods”.
“Plug-and-play” methods: the only requirement is the ability to simulate
from the data-generating model.
1. particle marginal methods (PMMH, PMCMC) based on SMC filters
[Andrieu and Roberts 2009, Andrieu et al. 2010];
2. (improved) iterated filtering [Ionides et al. 2015];
3. approximate Bayesian computation (ABC) [Marin et al. 2012];
4. synthetic likelihoods [Wood 2010].
(1)-(2) easily accommodate models of state-space type (Markovian
dynamics, conditionally independent measurements).
(3)-(4) do not impose any structure on the model: you only need to be able
to generate realizations from the model.
In the following I focus on Synthetic Likelihoods.
6. Our experiment: a tumor xenography study
a skin tumor is grown in each mouse in the study.
3 groups of mice: 2 groups receiving an experimental treatment; 1
control group (no treatment).
the experimental groups are treated with chemotherapy or radiation therapy.
we wish to assess the effect of the treatments on tumor growth,
that is, estimate model parameters.
Only 5–8 mice per group. Data are sparse.
group 1: chemotherapy;
group 3: combined chemo-radio therapy;
group 5: no treatment.
7. Three experimental groups
Figure: Data of log-volumes (mm³) vs. days for the three groups (panels: group 1, group 3, group 5).
9. Mixed-effects modelling
Repeated measurements taken on a series of individuals/animals play
an important role in biomedical research.
Say we have measurements on M subjects (mice).
Assume responses follow the same model form for all subjects.
Each subject has its own individual parameters φ_i:
φ_i ∼ p(φ|θ), i = 1, ..., M
θ are fixed (yet unknown) population parameters.
It may be desirable to allow random variation in the individual
process dynamics (⇒ stochastic differential equations).
10. We formulate a state-space model accounting for:
intra-individual variation: explained via an SDE;
between-individuals variation: modelled by assuming “mixed
effects” φ_i ∼ p(φ|θ). Interest is on estimating θ.
residual variation.
Our data represent the size of the total volume V_i(t) at time t for
subject i = 1, ..., M.
For subject i, a fraction α_i of the tumor volume has cells killed by the
treatment, 0 < α_i < 1.
V_i(t) = V_i^surv(t) + V_i^kill(t)   (dynamics are in the following slides)
V_i^kill(0) = α_i v_{i,0}   (fraction of killed tumor volume)
V_i^surv(0) = (1 − α_i) v_{i,0}   (fraction of surviving tumor volume)
v_{i,0} = 100 [mm³]   (known starting tumor volume)
13. SDE mixed effects model
For subject i we take n_i measurements:
Y_ij = log(V_i(t_j)) + ε_ij,  i = 1, ..., M; j = 1, ..., n_i
V_i(t) = V_i^surv(t) + V_i^kill(t),
dV_i^surv(t) = (β_i + γ²/2) V_i^surv(t) dt + γ V_i^surv(t) dB_i(t),  V_i^surv(0) = (1 − α_i) v_{i,0}
dV_i^kill(t) = (−δ_i + τ²/2) V_i^kill(t) dt + τ V_i^kill(t) dW_i(t),  V_i^kill(0) = α_i v_{i,0}.
We assume Gaussian random effects, one realization per individual:
β_i ∼ N(β̄, σ²_β);  δ_i ∼ N(δ̄, σ²_δ);  α_i ∼ N_(0,1)(ᾱ, σ²_α)  (Gaussian truncated to (0,1))
hence
φ_i = (β_i, δ_i, α_i).
And Gaussian residual variation (independent of everything else):
ε_ij ∼ iid N(0, σ²_ε)
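The two SDEs above are geometric Brownian motions, so each subject's trajectory can be simulated exactly on the measurement grid (the +γ²/2 and +τ²/2 drift terms cancel the Itô correction on the log scale). A minimal sketch; the function and variable names are mine, not from the slides:

```python
import numpy as np

def simulate_subject(beta, delta, alpha, gamma, tau, sigma_eps,
                     t, v0=100.0, rng=None):
    """Simulate one subject's log-volume measurements Y_ij at times t.

    Exact simulation: V_surv(t) = V_surv(0) exp(beta*t + gamma*B(t)) and
    V_kill(t) = V_kill(0) exp(-delta*t + tau*W(t)), since both SDEs are
    geometric Brownian motions.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(t, dtype=float)
    dt = np.diff(t, prepend=0.0)  # time increments, starting from t=0
    # Brownian paths evaluated at the measurement times
    B = np.cumsum(np.sqrt(dt) * rng.standard_normal(t.size))
    W = np.cumsum(np.sqrt(dt) * rng.standard_normal(t.size))
    v_surv = (1 - alpha) * v0 * np.exp(beta * t + gamma * B)
    v_kill = alpha * v0 * np.exp(-delta * t + tau * W)
    # observed log-volume with Gaussian measurement error
    return np.log(v_surv + v_kill) + sigma_eps * rng.standard_normal(t.size)
```

Drawing φ_i = (β_i, δ_i, α_i) from its population distribution before each call gives one mixed-effects realization.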
14. Data Y_ij | V_i(t_j) are conditionally independent.
The latent state {V_i(t)} is Markovian, conditionally on the random effects.
The model is of state-space type.
We wish to fit the model to the entire pool of data for the M subjects.
Notice that data are very sparse, which makes inference challenging.
We estimate all population parameters and the residual variation:
θ = ( β̄, δ̄, ᾱ [means of random effects], σ²_β, σ²_δ, σ²_α [variances of random effects], γ, τ [intra-subject variation], σ²_ε [residual variance] )
15. For random effect φ_i = (β_i, δ_i, α_i) and data y_i = {y_ij} for subject i, the
intractable likelihood for subject i is:
p(y_i|θ) = ∫ p(y_i|φ_i; θ) p(φ_i|θ) dφ_i
= ∫ [ ∫ p(y_i|x_i; θ) p(x_i|φ_i; θ) dx_i ] p(φ_i|θ) dφ_i
= ∫ [ ∫ ∏_{j=1}^{n_i} p(y_ij|x_ij, φ_i, θ) p(x_{i,j}|x_{i,j−1}, φ_i; θ) p(x_{i0}|φ_i, θ) dx_i ] p(φ_i|θ) dφ_i
and the full likelihood for all subjects y = (y_1, ..., y_M) is
p(y|θ) = ∏_{i=1}^{M} p(y_i|θ)
16. The previous intractable likelihood is manageable via particle filters
(sequential Monte Carlo).
What if the model is not of state-space type? Then the likelihood
would be even more intractable!
17. Synthetic Likelihoods (Wood, 2010)
Regardless of the specific application, assume the following:
y: observed data, from static or dynamic models;
s(y): (vector of) summary statistics of the data, e.g. mean,
autocorrelations, marginal quantiles, etc.;
assume
s(y) ∼ N(µ_θ, Σ_θ),
an assumption justifiable via a second-order Taylor expansion
(as in Laplace approximations).
µ_θ and Σ_θ are unknown: estimate them via simulations.
The approach is justifiable for very noisy models, where summary
statistics retain the essential features of the data. Also useful for
near-chaotic models (very irregular likelihood).
19. For fixed θ we simulate N artificial datasets y*_1, ..., y*_N and compute
corresponding (possibly vector-valued) summaries s*_1, ..., s*_N.
compute
µ̂_θ = (1/N) ∑_{i=1}^{N} s*_i,   Σ̂_θ = (1/(N−1)) ∑_{i=1}^{N} (s*_i − µ̂_θ)(s*_i − µ̂_θ)ᵀ
compute the statistics s_obs for the observed data y.
evaluate a multivariate Gaussian likelihood at s_obs:
L_N(s_obs|θ) := exp(l_N(s_obs|θ)) ∝ |Σ̂_θ|^{−1/2} exp( −(s_obs − µ̂_θ)ᵀ Σ̂_θ^{−1} (s_obs − µ̂_θ)/2 )
This synthetic likelihood can be maximized w.r.t. θ or plugged into a
(marginal) MCMC algorithm for Bayesian inference:
π_N(θ|s_obs) ∝ L_N(s_obs|θ) π(θ)
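These steps translate directly into code. A minimal sketch, where `simulate(theta, rng)` and `summarize(y)` are hypothetical user-supplied callables (their interfaces are my assumption, not from the slides):

```python
import numpy as np

def log_synthetic_likelihood(s_obs, simulate, summarize, theta, N=500, rng=None):
    """Wood (2010) log synthetic likelihood l_N(s_obs | theta).

    simulate(theta, rng) -> one artificial dataset;
    summarize(y) -> 1-d summary vector (dimension d >= 2 assumed here).
    """
    rng = np.random.default_rng() if rng is None else rng
    # N simulated summary vectors, stacked into an N x d matrix
    S = np.stack([summarize(simulate(theta, rng)) for _ in range(N)])
    mu = S.mean(axis=0)
    Sigma = np.cov(S, rowvar=False)          # divides by N-1, as on the slide
    diff = np.asarray(s_obs) - mu
    _, logdet = np.linalg.slogdet(Sigma)
    quad = diff @ np.linalg.solve(Sigma, diff)
    d = mu.size
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)
```

This returns the full Gaussian log-density rather than the unnormalized version on the slide, which is harmless since the missing constant does not depend on θ.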
20. Bayesian synthetic likelihoods
Actually we follow Price et al. 2017.¹
Construct an unbiased estimator L̃_N of a Gaussian likelihood
(Ghurye and Olkin, 1969); this implies that for any statistic s
E(L̃_N(s|θ)) = L(s|θ) = N(s; µ_θ, Σ_θ)
Plug L̃_N(s_obs|θ) into an MCMC algorithm for inference on θ.
The resulting draws have stationary distribution π(θ|s_obs), not
π_N(θ|s_obs), whenever s_obs is Gaussian.
The above is true regardless of the value of N.
The latter follows from Beaumont 2003, Andrieu and Roberts 2009.
¹ Price, Drovandi, Lee and Nott. Bayesian synthetic likelihood. 2017, JCGS.
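The surrounding MCMC loop can be sketched as a random-walk Metropolis sampler. Here `log_synth_lik` is a hypothetical callable that re-simulates and returns a fresh likelihood estimate at each call; following the pseudo-marginal logic, the estimate at the current θ is reused rather than recomputed:

```python
import numpy as np

def bsl_mcmc(theta0, log_synth_lik, log_prior, R=1000, step=0.1, rng=None):
    """Random-walk Metropolis with a synthetic-likelihood estimate.

    log_synth_lik(theta) -> estimated log likelihood (re-simulated each call);
    log_prior(theta) -> log prior density. Interfaces are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    ll = log_synth_lik(theta)                 # estimate at the initial point
    chain = [theta.copy()]
    for _ in range(R):
        prop = theta + step * rng.standard_normal(theta.size)
        ll_prop = log_synth_lik(prop)         # fresh estimate at the proposal
        log_alpha = ll_prop + log_prior(prop) - ll - log_prior(theta)
        if np.log(rng.uniform()) < log_alpha:
            theta, ll = prop, ll_prop         # keep estimate, do not refresh
        chain.append(theta.copy())
    return np.array(chain)
```

Keeping (not refreshing) the current-state estimate is what makes the chain target the correct stationary distribution under the unbiasedness argument above.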
21. Recall we have not one but M subjects to fit simultaneously.
Data are y = (y_1, ..., y_M).
We construct the following vector of statistics:
s = (s_1^indiv, ..., s_M^indiv, s^between)
For subject i the individual summaries s_i^indiv contain:
the mean absolute deviation for subject i;
the slope of the line segment connecting the first and last
observations, (y_i(t_{n_i}) − y_i(t_1))/(t_{n_i} − t_1);
the first two measurement values y_i(t_1), y_i(t_2);
the estimated slope β̂_{i1} from the autoregression
E(y_ij) = β_{i0} + β_{i1} y_{i,j−1}
22. Inter-individual summaries s^between include:
MAD{y_i1}_{i=1:M}, the mean absolute deviation between subjects at
the first time point;
the same as above but for the second time point.
These are useful to capture the “width” of the variability between
trajectories. Remember:
Figure: log-volumes (mm³) vs. days for group 3.
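The individual and between-subject summaries can be sketched as follows (function names are mine; the autoregressive slope is obtained by least squares on consecutive observations):

```python
import numpy as np

def subject_summaries(t, y):
    """Individual summary vector s_i^indiv for one subject (arrays t, y)."""
    mad = np.mean(np.abs(y - np.mean(y)))       # mean absolute deviation
    slope = (y[-1] - y[0]) / (t[-1] - t[0])     # first-to-last slope
    b1 = np.polyfit(y[:-1], y[1:], 1)[0]        # AR slope beta_i1
    return np.array([mad, slope, y[0], y[1], b1])

def pooled_summaries(ts, ys):
    """Full statistic s = (s_1^indiv, ..., s_M^indiv, s^between)."""
    s_indiv = np.concatenate([subject_summaries(t, y) for t, y in zip(ts, ys)])
    first = np.array([y[0] for y in ys])
    second = np.array([y[1] for y in ys])
    s_between = np.array([np.mean(np.abs(first - np.mean(first))),
                          np.mean(np.abs(second - np.mean(second)))])
    return np.concatenate([s_indiv, s_between])
```

The same function is applied to the observed data (giving s_obs) and to every simulated pool of M trajectories.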
23. Therefore, to run a single iteration of an MCMC algorithm using
synthetic likelihoods we must:
simulate M independent realizations of the random effects, and the
corresponding M subject trajectories;
do the above N times (can be done in parallel);
compute summary statistics for the M × N trajectories.
24. A particle marginal algorithm for exact Bayesian inference
for i = 1, ..., M do
  draw φ_i^l ∼ p(φ_i|θ), l = 1, ..., L
  if j = 1 then
    Sample x_{i1}^l ∼ p(x_{i1}|x_0, φ_i^l; θ).
    Compute w_{i1}^l = p(y_{i1}|x_{i1}^l) and p̂(y_{i1}) = ∑_{l=1}^{L} w_{i1}^l / L.
    Normalization: w̃_{i1}^l := w_{i1}^l / ∑_{l=1}^{L} w_{i1}^l.
    Resampling: sample L times with replacement from {x_{i1}^l, w̃_{i1}^l}. Denote the
    sampled particles by x̃_{i1}^l.
  end if
  for j = 2, ..., n_i do
    Forward propagation: sample x_{ij}^l ∼ p(x_{ij}|x̃_{i,j−1}^l, φ_i^l; θ).
    Compute w_{ij}^l = p(y_{ij}|x_{ij}^l) and normalise w̃_{ij}^l := w_{ij}^l / ∑_{l=1}^{L} w_{ij}^l.
    Compute p̂(y_{ij}|y_{i,1:j−1}) = ∑_{l=1}^{L} w_{ij}^l / L.
    Resample L times with replacement from {x_{ij}^l, w̃_{ij}^l}. Sampled particles are x̃_{ij}^l.
  end for
end for
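The inner loop (the bootstrap filter for one subject) can be sketched as follows, under assumed interfaces for the transition sampler and observation density, and simplified to a scalar latent state (names are illustrative):

```python
import numpy as np

def bootstrap_filter_loglik(y, propagate, loglik_obs, x0, L=500, rng=None):
    """Bootstrap particle filter estimate of log p(y_{1:n}) for one subject.

    propagate(x, rng) -> array of L particles moved one step forward;
    loglik_obs(yj, x) -> per-particle array of log p(y_j | x).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.full(L, float(x0))              # all particles start at known x_0
    loglik = 0.0
    for yj in y:
        x = propagate(x, rng)              # x^l ~ p(x_j | x~^l_{j-1})
        logw = loglik_obs(yj, x)           # unnormalized log weights
        m = logw.max()                     # log-sum-exp for stability
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())     # log p^(y_j | y_{1:j-1})
        x = rng.choice(x, size=L, p=w / w.sum())   # multinomial resampling
    return loglik
```

Summing this over i = 1, ..., M gives the log of the unbiased likelihood estimate used on the next slide.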
25. Each iteration of the previous for loop gives an unbiased p̂(y_i|θ).
Since E[p̂(y_i|θ)] = p(y_i|θ),
and since the p̂(y_i|θ) are independent of one another,
then E[∏_{i=1}^{M} p̂(y_i|θ)] = ∏_{i=1}^{M} p(y_i|θ).
The above means that the overall likelihood for our mixed-effects
model can be estimated unbiasedly.
Therefore exact Bayesian inference can be obtained using
pseudo-marginal arguments (e.g. Andrieu and Roberts 2009²):
we can sample exactly from π(θ|y) via Metropolis-Hastings.
² Andrieu and Roberts 2009. The pseudo-marginal approach for efficient Monte
Carlo computations. The Annals of Statistics: 697-725.
28. Identifying treatment efficacy with more subjects
The previous poor-data scenario showed difficulties in identifying
between-subjects variability with (too) few subjects.
We now perform simulation studies with M = 17 subjects.
D1: a simulated dataset with 17 subjects assigned to a low-efficacy
treatment, α = 0.37.
D2: a simulated dataset with 17 subjects assigned to a high-efficacy
treatment, α = 0.75.
We use Bayesian synthetic likelihoods: N = 6,000 simulated
summaries, R = 15,000 MCMC iterations.
29. Dashed curves: from low efficacy treatment (D1), α = 0.37.
Solid curves: from high efficacy treatment (D2), α = 0.75.
Figure: curves for (j) log β̄ and (k) ᾱ (dashed: D1; solid: D2).
A larger number of subjects enables identification of treatment
efficacy.
30. Summary
Bayesian synthetic likelihoods work well for SDE mixed-effects
models, provided the number of subjects is not too small.
This gives us confidence that inference is also feasible for
non-state-space mixed-effects models.
Reference:
P. and Forman (2017). Stochastic differential equation mixed effects
models for tumor growth and response to treatment,
arXiv:1607.02633.
32. Unbiased Gaussian likelihood estimate
Price et al. 2017 note that plugging the estimates µ̂(θ) and Σ̂(θ) into the
Gaussian likelihood p(s|θ) results in a biased estimate, while one could
instead use an unbiased estimator of a Gaussian likelihood (Ghurye and
Olkin, 1969) given by
p̂(s|θ) = (2π)^{−d/2} · [ c(d, N−2) / ( c(d, N−1) (1 − 1/N)^{d/2} ) ] · |(N−1) Σ̂_N(θ)|^{−(N−d−2)/2}
× ψ( (N−1) Σ̂_N(θ) − (s − µ̂_N(θ))(s − µ̂_N(θ))ᵀ / (1 − 1/N) )^{(N−d−3)/2}
where d = dim(s), π denotes the mathematical constant, N > d + 3, and for
a square matrix A the function ψ(A) is defined as ψ(A) = |A| if A is positive
definite and ψ(A) = 0 otherwise. Finally
c(k, v) = 2^{−kv/2} π^{−k(k−1)/4} / ∏_{i=1}^{k} Γ((v − i + 1)/2).
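A direct transcription of this estimator, worked on the log scale for numerical stability (function names are mine, not from the slides):

```python
import math
import numpy as np

def log_c(k, v):
    """log c(k, v) = log[ 2^{-kv/2} pi^{-k(k-1)/4} / prod_i Gamma((v-i+1)/2) ]."""
    return (-k * v / 2) * math.log(2) - (k * (k - 1) / 4) * math.log(math.pi) \
           - sum(math.lgamma((v - i + 1) / 2) for i in range(1, k + 1))

def log_unbiased_gaussian_lik(s, mu_hat, Sigma_hat, N):
    """Log of the Ghurye-Olkin unbiased estimator of N(s; mu_theta, Sigma_theta),
    given the sample mean and covariance of N simulated summaries.
    Returns -inf when the psi(.) argument is not positive definite."""
    d = s.size
    assert N > d + 3
    M = (N - 1) * Sigma_hat
    diff = (s - mu_hat)[:, None]
    A = M - diff @ diff.T / (1 - 1 / N)
    if np.linalg.eigvalsh(A).min() <= 0:   # psi(A) = 0 => likelihood 0
        return -np.inf
    logdet_M = np.linalg.slogdet(M)[1]
    logdet_A = np.linalg.slogdet(A)[1]
    return (-d / 2 * math.log(2 * math.pi)
            + log_c(d, N - 2) - log_c(d, N - 1) - d / 2 * math.log(1 - 1 / N)
            - (N - d - 2) / 2 * logdet_M
            + (N - d - 3) / 2 * logdet_A)
```

Substituting this for the plug-in Gaussian density inside the MCMC loop yields the Bayesian synthetic likelihood sampler of Price et al. (2017).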