Estimation of the score vector and observed
information matrix in intractable models
Arnaud Doucet (University of Oxford)
Pierre E. Jacob (University of Oxford)
Sylvain Rubenthaler (Université Nice Sophia Antipolis)
November 28th, 2014
Outline
1 Context
2 General results and connections
3 Posterior concentration when the prior concentrates
4 Hidden Markov models
Motivation
Derivatives of the likelihood help with optimization and sampling.
For many models they are not available.
One can resort to approximation techniques.
Using derivatives in sampling algorithms
Metropolis-adjusted Langevin algorithm (MALA)
At step t, given a point θt, do:
propose
\[
\theta^\star \sim q(\mathrm{d}\theta \mid \theta_t) \equiv \mathcal{N}\!\left(\theta_t + \frac{\sigma^2}{2}\,\nabla_\theta \log \pi(\theta_t),\ \sigma^2\right),
\]
then with probability
\[
1 \wedge \frac{\pi(\theta^\star)\, q(\theta_t \mid \theta^\star)}{\pi(\theta_t)\, q(\theta^\star \mid \theta_t)}
\]
set θt+1 = θ⋆, otherwise set θt+1 = θt.
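To make the proposal mechanism concrete, here is a minimal sketch of one MALA step in Python, for an assumed standard normal target (the target π, its gradient, and the step size σ² are illustrative placeholders, not part of the talk):

```python
import numpy as np

def log_pi(theta):
    # log-density of the assumed target, N(0, 1), up to a constant
    return -0.5 * theta**2

def grad_log_pi(theta):
    # gradient of the log target density
    return -theta

def mala_step(theta_t, sigma2, rng):
    # Langevin proposal: N(theta_t + (sigma2 / 2) grad log pi(theta_t), sigma2)
    mean_fwd = theta_t + 0.5 * sigma2 * grad_log_pi(theta_t)
    theta_star = mean_fwd + np.sqrt(sigma2) * rng.standard_normal()
    # reverse proposal mean, needed for the acceptance ratio
    mean_bwd = theta_star + 0.5 * sigma2 * grad_log_pi(theta_star)
    log_q_fwd = -0.5 * (theta_star - mean_fwd)**2 / sigma2
    log_q_bwd = -0.5 * (theta_t - mean_bwd)**2 / sigma2
    log_alpha = log_pi(theta_star) + log_q_bwd - log_pi(theta_t) - log_q_fwd
    # accept with probability 1 ∧ exp(log_alpha)
    return theta_star if np.log(rng.uniform()) < log_alpha else theta_t

rng = np.random.default_rng(1)
theta = 0.0
for _ in range(1000):
    theta = mala_step(theta, sigma2=0.5, rng=rng)
```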
Using derivatives in sampling algorithms
Figure : Proposal mechanism for random walk Metropolis–Hastings.
Using derivatives in sampling algorithms
Figure : Proposal mechanism for MALA.
Using derivatives in sampling algorithms
In what sense is MALA better than MH?
Scaling with the dimension of the state space
For Metropolis–Hastings, optimal scaling leads to
\[
\sigma^2 = O(d^{-1}),
\]
while for MALA, optimal scaling leads to
\[
\sigma^2 = O(d^{-1/3}).
\]
Roberts & Rosenthal, Optimal Scaling for Various
Metropolis-Hastings Algorithms, 2001.
Hidden Markov models
Figure : Graph representation of a general hidden Markov model (hidden states X0, . . . , XT , observations y1, . . . , yT , parameter θ).
Hidden process: initial distribution µθ, transition fθ.
Observations conditional upon the hidden process, from gθ.
Assumptions
Input:
Parameter θ : unknown, prior distribution p.
Initial condition µθ(dx0) : can be sampled from.
Transition fθ(dxt|xt−1) : can be sampled from.
Measurement gθ(yt|xt) : can be evaluated point-wise.
Observations y1:T = (y1, . . . , yT ).
Goals:
score: ∇θ log L(θ; y1:T ) for any θ,
observed information matrix: −∇²θ log L(θ; y1:T ) for any θ.
Note: throughout the talk, the observations, and thus the
likelihood, are fixed.
Why is it an intractable model?
The likelihood function does not admit a closed form expression:
\[
L(\theta; y_1, \ldots, y_T) = \int_{\mathsf{X}^{T+1}} p(y_1, \ldots, y_T \mid x_0, \ldots, x_T, \theta)\, p(\mathrm{d}x_0, \ldots, \mathrm{d}x_T \mid \theta)
= \int_{\mathsf{X}^{T+1}} \prod_{t=1}^{T} g_\theta(y_t \mid x_t)\, \mu_\theta(\mathrm{d}x_0) \prod_{t=1}^{T} f_\theta(\mathrm{d}x_t \mid x_{t-1}).
\]
Hence the likelihood can only be estimated, e.g. by standard
Monte Carlo, or by particle filters.
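For instance, a minimal bootstrap particle filter returning an estimate of the log-likelihood can be sketched as follows; the linear Gaussian transition and measurement below are illustrative stand-ins for any μθ and fθ one can sample from and any gθ one can evaluate point-wise:

```python
import numpy as np

def pf_loglik(y, rho, V, eta, W, N=512, rng=None):
    # bootstrap particle filter estimate of log L(theta; y_{1:T}) for
    # X_0 ~ N(0,1), X_t = rho X_{t-1} + N(0,V), Y_t = eta X_t + N(0,W)
    rng = rng or np.random.default_rng()
    x = rng.normal(0.0, 1.0, size=N)                        # sample from mu_theta
    loglik = 0.0
    for yt in y:
        x = rho * x + rng.normal(0.0, np.sqrt(V), size=N)   # sample from f_theta
        logw = -0.5 * np.log(2 * np.pi * W) - 0.5 * (yt - eta * x)**2 / W
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())                      # log of average weight
        x = x[rng.choice(N, size=N, p=w / w.sum())]         # multinomial resampling
    return loglik
```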
What about the derivatives of the likelihood?
Fisher and Louis’ identities
Write the score as:
\[
\nabla \ell(\theta) = \int \nabla \log p(x_{0:T}, y_{1:T} \mid \theta)\, p(\mathrm{d}x_{0:T} \mid y_{1:T}, \theta),
\]
which is an integral, with respect to the smoothing distribution p(dx0:T | y1:T , θ), of
\[
\nabla \log p(x_{0:T}, y_{1:T} \mid \theta) = \nabla \log \mu_\theta(x_0) + \sum_{t=1}^{T} \nabla \log f_\theta(x_t \mid x_{t-1}) + \sum_{t=1}^{T} \nabla \log g_\theta(y_t \mid x_t).
\]
However pointwise evaluations of ∇ log µθ(x0) and
∇ log fθ(xt | xt−1) are not always available.
New kid on the block: Iterated Filtering
Perturbed model
Hidden states X̃t = (θ̃t, Xt):
\[
\begin{cases} \tilde{\theta}_0 \sim \mathcal{N}(\theta_0, \tau^2 \Sigma) \\ X_0 \sim \mu_{\tilde{\theta}_0}(\cdot) \end{cases}
\quad \text{and} \quad
\begin{cases} \tilde{\theta}_t \sim \mathcal{N}(\tilde{\theta}_{t-1}, \sigma^2 \Sigma) \\ X_t \sim f_{\tilde{\theta}_t}(\cdot \mid X_{t-1} = x_{t-1}) \end{cases}
\]
Observations Ỹt ∼ g_{θ̃t}(· | Xt).
Score estimate
Consider VP,t = Cov[θ̃t | y1:t−1] and θ̃F,t = E[θ̃t | y1:t]. Then
\[
\sum_{t=1}^{T} V_{P,t}^{-1} \left(\tilde{\theta}_{F,t} - \tilde{\theta}_{F,t-1}\right) \approx \nabla \ell(\theta_0)
\]
when τ → 0 and σ/τ → 0. Ionides, Breto, King, PNAS, 2006.
Iterated Filtering: the mystery
Why is it valid?
Is it related to known techniques?
Can it be extended to estimate the second derivatives (i.e.
the Hessian, i.e. the observed information matrix)?
How does it compare to other methods such as finite
difference?
Outline
1 Context
2 General results and connections
3 Posterior concentration when the prior concentrates
4 Hidden Markov models
Iterated Filtering
Given a log-likelihood ℓ and a point θ0, consider a prior θ ∼ N(θ0, σ²).
Posterior expectation when the prior variance goes to zero
First-order moments give first-order derivatives:
\[
\left|\sigma^{-2}\left(\mathbb{E}[\theta \mid Y] - \theta_0\right) - \nabla \ell(\theta_0)\right| \leq C\sigma^2.
\]
Phrased simply,
\[
\frac{\text{posterior mean} - \text{prior mean}}{\text{prior variance}} \approx \text{score}.
\]
Result from Ionides, Bhadra, Atchadé, King, Iterated filtering, 2011.
Extension of Iterated Filtering
Posterior variance when the prior variance goes to zero
Second-order moments give second-order derivatives:
\[
\left|\sigma^{-4}\left(\mathrm{Cov}[\theta \mid Y] - \sigma^2\right) - \nabla^2 \ell(\theta_0)\right| \leq C\sigma^2.
\]
Phrased simply,
\[
\frac{\text{posterior variance} - \text{prior variance}}{\text{prior variance}^2} \approx \text{Hessian}.
\]
Result from Doucet, Jacob, Rubenthaler on arXiv, 2013.
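Both identities are easy to check numerically. Below is a quadrature sketch on a toy log-likelihood ℓ(θ) = −(θ − 1)², chosen so that the score at θ0 is −2(θ0 − 1) and the second derivative is −2; the value of σ and the grid width are illustrative:

```python
import numpy as np

theta0, sigma = 0.3, 0.05
ell = lambda th: -(th - 1.0)**2                # score at 0.3 is 1.4, hessian -2
grid = np.linspace(theta0 - 8 * sigma, theta0 + 8 * sigma, 20001)
log_post = -0.5 * (grid - theta0)**2 / sigma**2 + ell(grid)  # prior x likelihood
post = np.exp(log_post - log_post.max())
post /= post.sum()                             # normalize on the uniform grid
mean = (grid * post).sum()
var = ((grid - mean)**2 * post).sum()
print((mean - theta0) / sigma**2)              # about 1.39, close to the score 1.4
print((var - sigma**2) / sigma**4)             # about -1.99, close to -2
```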
Proximity mapping
Given a real function f and a point θ0, consider for any σ² > 0
\[
\theta \mapsto f(\theta) \exp\left\{-\frac{1}{2\sigma^2}(\theta - \theta_0)^2\right\}.
\]
Figure : Example for f : θ ↦ exp(−|θ|) and three values of σ².
Proximity mapping
Proximity mapping
The σ²-proximity mapping is defined by
\[
\mathrm{prox}_f : \theta_0 \mapsto \operatorname*{argmax}_{\theta \in \mathbb{R}}\, f(\theta) \exp\left\{-\frac{1}{2\sigma^2}(\theta - \theta_0)^2\right\}.
\]
Moreau approximation
The σ²-Moreau approximation is defined by
\[
f_{\sigma^2} : \theta_0 \mapsto C \sup_{\theta \in \mathbb{R}}\, f(\theta) \exp\left\{-\frac{1}{2\sigma^2}(\theta - \theta_0)^2\right\},
\]
where C is a normalizing constant.
Proximity mapping
Figure : θ ↦ f(θ) and θ ↦ f_{σ²}(θ) for three values of σ².
Proximity mapping
Property
Those objects are such that
\[
\frac{\mathrm{prox}_f(\theta_0) - \theta_0}{\sigma^2} = \nabla \log f_{\sigma^2}(\theta_0) \xrightarrow[\sigma^2 \to 0]{} \nabla \log f(\theta_0).
\]
Moreau (1962), Fonctions convexes duales et points proximaux dans un espace Hilbertien.
Pereyra (2013), Proximal Markov chain Monte Carlo algorithms.
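This limit can be observed directly with a one-line optimizer; the sketch below uses the illustrative choice f(θ) = exp(−θ²), for which ∇ log f(θ0) = −2θ0:

```python
import numpy as np
from scipy.optimize import minimize_scalar

theta0 = 0.7
log_f = lambda th: -th**2                     # so grad log f(theta0) = -1.4
for sigma2 in [1.0, 0.1, 0.01]:
    # prox_f(theta0): maximize log f(theta) - (theta - theta0)^2 / (2 sigma2)
    obj = lambda th: -(log_f(th) - 0.5 * (th - theta0)**2 / sigma2)
    prox = minimize_scalar(obj).x
    print(sigma2, (prox - theta0) / sigma2)   # tends to -2 * theta0 = -1.4
```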
Proximity mapping
Bayesian interpretation
If f is seen as a likelihood function, then
\[
\theta \mapsto f(\theta) \exp\left\{-\frac{1}{2\sigma^2}(\theta - \theta_0)^2\right\}
\]
is an unnormalized posterior density function based on a Normal prior with mean θ0 and variance σ². Hence
\[
\frac{\mathrm{prox}_f(\theta_0) - \theta_0}{\sigma^2} \xrightarrow[\sigma^2 \to 0]{} \nabla \log f(\theta_0)
\]
can be read
\[
\frac{\text{posterior mode} - \text{prior mode}}{\text{prior variance}} \approx \text{score}.
\]
Stein’s lemma
Stein’s lemma states that θ ∼ N(θ0, σ²) if and only if, for any function g such that E[|∇g(θ)|] < ∞,
\[
\mathbb{E}\left[(\theta - \theta_0)\, g(\theta)\right] = \sigma^2\, \mathbb{E}\left[\nabla g(\theta)\right].
\]
If we choose the function g : θ ↦ exp ℓ(θ)/Z with Z = E[exp ℓ(θ)] and apply Stein’s lemma, we obtain
\[
\frac{1}{Z}\,\mathbb{E}\left[\theta \exp \ell(\theta)\right] - \theta_0 = \frac{\sigma^2}{Z}\,\mathbb{E}\left[\nabla \ell(\theta) \exp\left(\ell(\theta)\right)\right]
\;\Leftrightarrow\;
\sigma^{-2}\left(\mathbb{E}\left[\theta \mid Y\right] - \theta_0\right) = \mathbb{E}\left[\nabla \ell(\theta) \mid Y\right].
\]
Notation: E[φ(θ) | Y ] := E[φ(θ) exp ℓ(θ)]/Z.
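A quick Monte Carlo sanity check of the lemma, with the illustrative choice g(θ) = sin(θ), so that ∇g(θ) = cos(θ):

```python
import numpy as np

rng = np.random.default_rng(0)
theta0, sigma = 0.5, 0.8
theta = rng.normal(theta0, sigma, size=10**6)
lhs = np.mean((theta - theta0) * np.sin(theta))   # E[(theta - theta0) g(theta)]
rhs = sigma**2 * np.mean(np.cos(theta))           # sigma^2 E[grad g(theta)]
print(lhs, rhs)                                   # the two nearly agree
```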
Stein’s lemma
For the second derivative, we consider h : θ ↦ (θ − θ0) exp ℓ(θ)/Z. Then
\[
\mathbb{E}\left[(\theta - \theta_0)^2 \mid Y\right] = \sigma^2 + \sigma^4\, \mathbb{E}\left[\nabla^2 \ell(\theta) + \nabla \ell(\theta)^2 \mid Y\right].
\]
Adding and subtracting terms also yields
\[
\sigma^{-4}\left(\mathbb{V}\left[\theta \mid Y\right] - \sigma^2\right) = \mathbb{E}\left[\nabla^2 \ell(\theta) \mid Y\right] + \left\{\mathbb{E}\left[\nabla \ell(\theta)^2 \mid Y\right] - \left(\mathbb{E}\left[\nabla \ell(\theta) \mid Y\right]\right)^2\right\}.
\]
. . . but what we really want is ∇ℓ(θ0) and ∇²ℓ(θ0), not E[∇ℓ(θ) | Y ] and E[∇²ℓ(θ) | Y ].
Outline
1 Context
2 General results and connections
3 Posterior concentration when the prior concentrates
4 Hidden Markov models
Core Idea
The prior is essentially a normal distribution N(θ0, σ²), but in general has a density denoted by κ.
Posterior concentration induced by the prior
Under some assumptions, when σ → 0:
the posterior looks more and more like the prior,
the shift in posterior moments is O(σ²).
Our arXived proof suffers from an overdose of Taylor expansions.
Details
Introduce a test function h such that |h(u)| < c|u|^α for some c, α.
We start by writing
\[
\mathbb{E}\left\{h(\theta - \theta_0) \mid y\right\} = \frac{\int h(\sigma u) \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u}{\int \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u},
\]
using u = (θ − θ0)/σ, and then focus on the numerator
\[
\int h(\sigma u) \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u,
\]
since the denominator is a particular instance of this expression with h : u ↦ 1.
Details
For the numerator
\[
\int h(\sigma u) \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u
\]
we use a Taylor expansion of ℓ around θ0 and a Taylor expansion of exp around 0, and then take the integral with respect to κ.
Notation:
\[
\ell^{(k)}(\theta).u^{\otimes k} = \sum_{1 \leq i_1, \ldots, i_k \leq d} \frac{\partial^k \ell(\theta)}{\partial \theta_{i_1} \cdots \partial \theta_{i_k}}\, u_{i_1} \cdots u_{i_k},
\]
which in one dimension becomes
\[
\ell^{(k)}(\theta).u^{\otimes k} = \frac{\mathrm{d}^k \ell(\theta)}{\mathrm{d}\theta^k}\, u^k.
\]
Details
Main expansion:
\[
\begin{aligned}
\int h(\sigma u) \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u
&= \int h(\sigma u) \kappa(u)\, \mathrm{d}u + \sigma \int h(\sigma u)\, \ell^{(1)}(\theta_0).u\; \kappa(u)\, \mathrm{d}u \\
&\quad + \sigma^2 \int h(\sigma u) \left\{\tfrac{1}{2}\, \ell^{(2)}(\theta_0).u^{\otimes 2} + \tfrac{1}{2} (\ell^{(1)}(\theta_0).u)^2\right\} \kappa(u)\, \mathrm{d}u \\
&\quad + \sigma^3 \int h(\sigma u) \left\{\tfrac{1}{3!} (\ell^{(1)}(\theta_0).u)^3 + \tfrac{1}{2} (\ell^{(1)}(\theta_0).u)(\ell^{(2)}(\theta_0).u^{\otimes 2}) + \tfrac{1}{3!}\, \ell^{(3)}(\theta_0).u^{\otimes 3}\right\} \kappa(u)\, \mathrm{d}u \\
&\quad + O(\sigma^{4+\alpha}).
\end{aligned}
\]
In general, assumptions on the tails of the prior and the likelihood are used to control the remainder terms and to ensure they are O(σ^{4+α}).
Details
We cut the integral into two bits:
\[
\begin{aligned}
\int h(\sigma u) \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u
&= \int_{\sigma|u| \leq \rho} h(\sigma u) \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u \\
&\quad + \int_{\sigma|u| > \rho} h(\sigma u) \exp\left\{\ell(\theta_0 + \sigma u) - \ell(\theta_0)\right\} \kappa(u)\, \mathrm{d}u.
\end{aligned}
\]
The expansion stems from the first term, where σ|u| is small.
The second term ends up in the remainder in O(σ^{4+α}) using the assumptions.
Classic technique in Bayesian asymptotic theory.
Details
To get the score from the expansion, choose h : u ↦ u.
To get the observed information matrix from the expansion, choose h : u ↦ u², and surprisingly (?) further assume that κ is mesokurtic, i.e.
\[
\int u^4 \kappa(u)\, \mathrm{d}u = 3 \left(\int u^2 \kappa(u)\, \mathrm{d}u\right)^2
\]
⇒ choose a Gaussian prior to obtain the Hessian.
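As a sanity check of how the score emerges, here is the one-dimensional computation for h : u ↦ u, under the assumptions that κ is symmetric with unit variance (so its odd moments vanish) and that the remainders are controlled as above:
\[
\mathbb{E}[\theta \mid Y] - \theta_0 = \sigma\, \mathbb{E}[u \mid Y]
= \sigma \cdot \frac{\sigma\, \ell^{(1)}(\theta_0) \int u^2 \kappa(u)\, \mathrm{d}u + O(\sigma^3)}{1 + O(\sigma^2)}
= \sigma^2\, \ell^{(1)}(\theta_0) + O(\sigma^4),
\]
which, after dividing by σ², is exactly the first-order identity of the previous section.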
Outline
1 Context
2 General results and connections
3 Posterior concentration when the prior concentrates
4 Hidden Markov models
Hidden Markov models
Figure : Graph representation of a general hidden Markov model.
Hidden Markov models
Direct application of the previous results
1 Prior distribution N(θ0, σ²) on the parameter θ.
2 The derivative approximations involve E[θ|Y ] and Cov[θ|Y ].
3 Posterior moments for HMMs can be estimated by particle MCMC, SMC², ABC, or your favourite method (a sketch follows below).
Ionides et al. proposed another approach.
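A minimal sketch of the direct approach above: self-normalized importance sampling from the N(θ0, σ²) prior, weighting each draw by its likelihood. The Gaussian log-likelihood below is an illustrative stand-in; for an HMM one would plug in an unbiased particle filter estimate instead, in the spirit of pseudo-marginal methods:

```python
import numpy as np

def loglik(theta):
    # stand-in for log L(theta; y_{1:T}); score at theta is -(theta - 2)
    return -0.5 * (theta - 2.0)**2

def score_and_hessian(theta0, sigma, M=10**6, rng=None):
    rng = rng or np.random.default_rng(0)
    thetas = rng.normal(theta0, sigma, size=M)      # draws from the prior
    logw = loglik(thetas)
    w = np.exp(logw - logw.max())
    w /= w.sum()                                    # self-normalized weights
    mean = np.sum(w * thetas)                       # estimate of E[theta | Y]
    var = np.sum(w * (thetas - mean)**2)            # estimate of Cov[theta | Y]
    return (mean - theta0) / sigma**2, (var - sigma**2) / sigma**4

print(score_and_hessian(0.5, 0.1))  # roughly (1.49, -0.99)
```

The σ⁻⁴ scaling amplifies Monte Carlo noise, so the second output is markedly noisier than the first.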
Iterated Filtering
Modification of the model: θ is time-varying. The associated log-likelihood is
\[
\bar{\ell}(\theta_{1:T}) = \log p(y_{1:T}; \theta_{1:T}) = \log \int_{\mathsf{X}^{T+1}} \prod_{t=1}^{T} g(y_t \mid x_t, \theta_t)\; \mu(\mathrm{d}x_1 \mid \theta_1) \prod_{t=2}^{T} f(\mathrm{d}x_t \mid x_{t-1}, \theta_t).
\]
Introducing θ ↦ (θ, θ, . . . , θ) := θ^{[T]} ∈ ℝ^T, we have
\[
\bar{\ell}(\theta^{[T]}) = \ell(\theta)
\]
and the chain rule yields
\[
\frac{\mathrm{d}\ell(\theta)}{\mathrm{d}\theta} = \sum_{t=1}^{T} \frac{\partial \bar{\ell}(\theta^{[T]})}{\partial \theta_t}.
\]
Iterated Filtering
Choice of prior on θ1:T :
\[
\theta_1 = \theta_0 + V_1, \qquad V_1 \sim \tau^{-1} \kappa\{\tau^{-1}(\cdot)\},
\]
\[
\theta_{t+1} - \theta_0 = \rho\,(\theta_t - \theta_0) + V_{t+1}, \qquad V_{t+1} \sim \sigma^{-1} \kappa\{\sigma^{-1}(\cdot)\}.
\]
Choose σ² such that τ² = σ²/(1 − ρ²). Covariance of the prior on θ1:T :
\[
\Sigma_T = \tau^2
\begin{pmatrix}
1 & \rho & \cdots & \cdots & \cdots & \rho^{T-1} \\
\rho & 1 & \rho & \cdots & \cdots & \rho^{T-2} \\
\rho^2 & \rho & 1 & \ddots & & \rho^{T-3} \\
\vdots & & \ddots & \ddots & \ddots & \vdots \\
\rho^{T-2} & & & \ddots & 1 & \rho \\
\rho^{T-1} & \cdots & \cdots & \cdots & \rho & 1
\end{pmatrix},
\]
i.e. the entry (s, t) of Σ_T is τ²ρ^{|s−t|}.
Iterated Filtering
Applying the general results for this prior yields, with $|x| = \sum_{t=1}^{T} |x_t|$,
\[
\left|\nabla \bar{\ell}(\theta_0^{[T]}) - \Sigma_T^{-1}\left(\mathbb{E}\left[\theta_{1:T} \mid Y\right] - \theta_0^{[T]}\right)\right| \leq C\tau^2.
\]
Moreover we have
\[
\left|\sum_{t=1}^{T} \frac{\partial \bar{\ell}(\theta^{[T]})}{\partial \theta_t} - \sum_{t=1}^{T} \left\{\Sigma_T^{-1}\left(\mathbb{E}\left[\theta_{1:T} \mid Y\right] - \theta_0^{[T]}\right)\right\}_t\right|
\leq \sum_{t=1}^{T} \left|\frac{\partial \bar{\ell}(\theta^{[T]})}{\partial \theta_t} - \left\{\Sigma_T^{-1}\left(\mathbb{E}\left[\theta_{1:T} \mid Y\right] - \theta_0^{[T]}\right)\right\}_t\right|
\]
and
\[
\frac{\mathrm{d}\ell(\theta)}{\mathrm{d}\theta} = \sum_{t=1}^{T} \frac{\partial \bar{\ell}(\theta^{[T]})}{\partial \theta_t}.
\]
Iterated Filtering
The estimator of the score is thus given by
\[
\sum_{t=1}^{T} \left\{\Sigma_T^{-1}\left(\mathbb{E}\left[\theta_{1:T} \mid Y\right] - \theta_0^{[T]}\right)\right\}_t,
\]
which can be reduced to
\[
S_{\tau,\rho,T}(\theta_0) = \frac{\tau^{-2}}{1+\rho} \left[(1-\rho) \left\{\sum_{t=2}^{T-1} \mathbb{E}\left(\theta_t \mid Y\right)\right\} - \left\{(1-\rho)\,T + 2\rho\right\} \theta_0 + \mathbb{E}\left(\theta_1 \mid Y\right) + \mathbb{E}\left(\theta_T \mid Y\right)\right],
\]
given the form of Σ_T^{-1}. Note that in the quantities E(θt | Y ), Y = Y1:T is the complete dataset, thus those expectations are with respect to the smoothing distribution.
Iterated Filtering
If ρ = 1, then the parameters follow a random walk:
\[
\theta_1 = \theta_0 + \mathcal{N}(0, \tau^2) \quad \text{and} \quad \theta_{t+1} = \theta_t + \mathcal{N}(0, \sigma^2).
\]
In this case Ionides et al. proposed the estimator
\[
S_{\tau,\sigma,T} = \tau^{-2}\left(\mathbb{E}\left(\theta_T \mid Y\right) - \theta_0\right)
\]
as well as
\[
S_{\tau,\sigma,T}^{(bis)} = \sum_{t=1}^{T} V_{P,t}^{-1}\left(\tilde{\theta}_{F,t} - \tilde{\theta}_{F,t-1}\right)
\]
with VP,t = Cov[θ̃t | y1:t−1] and θ̃F,t = E[θ̃t | y1:t].
Those expressions only involve expectations with respect to filtering distributions.
Iterated Filtering
If ρ = 0, then the parameters are i.i.d.:
\[
\theta_1 = \theta_0 + \mathcal{N}(0, \tau^2) \quad \text{and} \quad \theta_{t+1} = \theta_0 + \mathcal{N}(0, \tau^2).
\]
In this case the expression of the score estimator reduces to
\[
S_{\tau,T} = \tau^{-2} \sum_{t=1}^{T} \left(\mathbb{E}\left(\theta_t \mid Y\right) - \theta_0\right),
\]
which involves smoothing distributions.
There’s only one parameter τ² to choose for the prior.
However smoothing for general hidden Markov models is difficult, and typically resorts to “fixed-lag approximations”.
Iterated Smoothing
Only for the case ρ = 0 are we able to obtain simple expressions for the observed information matrix. We propose the following estimator:
\[
I_{\tau,T}(\theta_0) = -\tau^{-4} \left\{\sum_{s=1}^{T} \sum_{t=1}^{T} \mathrm{Cov}\left(\theta_s, \theta_t \mid Y\right) - \tau^2\, T\right\},
\]
for which we can show that
\[
\left|I_{\tau,T} - \left(-\nabla^2 \ell(\theta_0)\right)\right| \leq C\tau^2.
\]
Numerical results
Linear Gaussian state space model where the ground truth is
available through the Kalman filter.
\[
X_0 \sim \mathcal{N}(0, 1), \qquad X_t = \rho X_{t-1} + \mathcal{N}(0, V), \qquad Y_t = \eta X_t + \mathcal{N}(0, W).
\]
Generate T = 100 observations and set ρ = 0.9, V = 0.7, η = 0.9 and W = 0.1, 0.2, 0.4, 0.9.
240 independent runs, matching the computational costs
between methods in terms of number of calls to the transition
kernel.
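For reference, a sketch of how the ground truth can be computed here: a Kalman filter gives the exact log-likelihood of this model, which can then be differentiated by central finite differences (shown for ρ only, with W = 0.2 and an illustrative step size h):

```python
import numpy as np

def kalman_loglik(y, rho, V, eta, W):
    # exact log-likelihood of X_0 ~ N(0,1), X_t = rho X_{t-1} + N(0,V),
    # Y_t = eta X_t + N(0,W), computed by the Kalman filter
    m, P = 0.0, 1.0
    loglik = 0.0
    for yt in y:
        m, P = rho * m, rho**2 * P + V            # predict X_t
        S = eta**2 * P + W                        # innovation variance
        loglik += -0.5 * (np.log(2 * np.pi * S) + (yt - eta * m)**2 / S)
        K = P * eta / S                           # Kalman gain
        m, P = m + K * (yt - eta * m), (1.0 - K * eta) * P
    return loglik

rng = np.random.default_rng(42)
rho, V, eta, W, T = 0.9, 0.7, 0.9, 0.2, 100
x, y = rng.normal(0.0, 1.0), []
for _ in range(T):
    x = rho * x + rng.normal(0.0, np.sqrt(V))
    y.append(eta * x + rng.normal(0.0, np.sqrt(W)))

h = 1e-5
score_rho = (kalman_loglik(y, rho + h, V, eta, W)
             - kalman_loglik(y, rho - h, V, eta, W)) / (2 * h)
print(score_rho)   # exact score in rho, up to the finite-difference error
```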
Numerical results
Figure : 240 runs for Iterated Smoothing and Finite Difference (two panels: RMSE, log scale from 10 to 10000, for parameters 1–4, against the step size h for Finite Difference and against τ for Iterated Smoothing).
Numerical results
Figure : 240 runs for Iterated Smoothing and Iterated Filtering (three panels: RMSE, log scale from 10 to 10000, for parameters 1–4, against τ for Iterated Smoothing, Iterated Filtering 1, and Iterated Filtering 2).
Bibliography
Main references:
Inference for nonlinear dynamical systems, Ionides, Breto, King, PNAS, 2006.
Iterated filtering, Ionides, Bhadra, Atchadé, King, Annals of Statistics, 2011.
Efficient iterated filtering, Lindström, Ionides, Frydendall, Madsen, 16th IFAC Symposium on System Identification.
Derivative-Free Estimation of the Score Vector and Observed Information Matrix, Doucet, Jacob, Rubenthaler, 2013 (on arXiv).

More Related Content

What's hot

Numerical Fourier transform based on hyperfunction theory
Numerical Fourier transform based on hyperfunction theoryNumerical Fourier transform based on hyperfunction theory
Numerical Fourier transform based on hyperfunction theory
HidenoriOgata
 
Numerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theoryNumerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theory
HidenoriOgata
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...
tuxette
 
Tensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationTensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantification
Alexander Litvinenko
 
Nonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares MethodNonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares Method
Tasuku Soma
 
Low rank tensor approximation of probability density and characteristic funct...
Low rank tensor approximation of probability density and characteristic funct...Low rank tensor approximation of probability density and characteristic funct...
Low rank tensor approximation of probability density and characteristic funct...
Alexander Litvinenko
 
Signal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier TransformsSignal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier TransformsArvind Devaraj
 
Ada boost brown boost performance with noisy data
Ada boost brown boost performance with noisy dataAda boost brown boost performance with noisy data
Ada boost brown boost performance with noisy data
Shadhin Rahman
 
Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...
HidenoriOgata
 
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
The Statistical and Applied Mathematical Sciences Institute
 
Congrès SMAI 2019
Congrès SMAI 2019Congrès SMAI 2019
Congrès SMAI 2019
Hamed Zakerzadeh
 
Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014
Nikita V. Artamonov
 
Bregman divergences from comparative convexity
Bregman divergences from comparative convexityBregman divergences from comparative convexity
Bregman divergences from comparative convexity
Frank Nielsen
 
Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)
Umberto Picchini
 
Random Matrix Theory and Machine Learning - Part 1
Random Matrix Theory and Machine Learning - Part 1Random Matrix Theory and Machine Learning - Part 1
Random Matrix Theory and Machine Learning - Part 1
Fabian Pedregosa
 
On the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansOn the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract means
Frank Nielsen
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
 Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli... Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
The Statistical and Applied Mathematical Sciences Institute
 

What's hot (20)

Numerical Fourier transform based on hyperfunction theory
Numerical Fourier transform based on hyperfunction theoryNumerical Fourier transform based on hyperfunction theory
Numerical Fourier transform based on hyperfunction theory
 
Numerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theoryNumerical integration based on the hyperfunction theory
Numerical integration based on the hyperfunction theory
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...
 
Tensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationTensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantification
 
Nonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares MethodNonconvex Compressed Sensing with the Sum-of-Squares Method
Nonconvex Compressed Sensing with the Sum-of-Squares Method
 
Low rank tensor approximation of probability density and characteristic funct...
Low rank tensor approximation of probability density and characteristic funct...Low rank tensor approximation of probability density and characteristic funct...
Low rank tensor approximation of probability density and characteristic funct...
 
Signal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier TransformsSignal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier Transforms
 
Ada boost brown boost performance with noisy data
Ada boost brown boost performance with noisy dataAda boost brown boost performance with noisy data
Ada boost brown boost performance with noisy data
 
Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...Hyperfunction method for numerical integration and Fredholm integral equation...
Hyperfunction method for numerical integration and Fredholm integral equation...
 
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
 
Congrès SMAI 2019
Congrès SMAI 2019Congrès SMAI 2019
Congrès SMAI 2019
 
Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014Levitan Centenary Conference Talk, June 27 2014
Levitan Centenary Conference Talk, June 27 2014
 
Bregman divergences from comparative convexity
Bregman divergences from comparative convexityBregman divergences from comparative convexity
Bregman divergences from comparative convexity
 
Rousseau
RousseauRousseau
Rousseau
 
Chapter 4 (maths 3)
Chapter 4 (maths 3)Chapter 4 (maths 3)
Chapter 4 (maths 3)
 
Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)
 
Random Matrix Theory and Machine Learning - Part 1
Random Matrix Theory and Machine Learning - Part 1Random Matrix Theory and Machine Learning - Part 1
Random Matrix Theory and Machine Learning - Part 1
 
Ss 2013 midterm
Ss 2013 midtermSs 2013 midterm
Ss 2013 midterm
 
On the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansOn the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract means
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
 Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli... Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Appli...
 

Similar to Estimation of the score vector and observed information matrix in intractable models

Linear response theory and TDDFT
Linear response theory and TDDFT Linear response theory and TDDFT
Linear response theory and TDDFT
Claudio Attaccalite
 
On Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular IntegralsOn Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular Integrals
VjekoslavKovac1
 
Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...
Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...
Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...
Alexander Litvinenko
 
Lecture_9.pdf
Lecture_9.pdfLecture_9.pdf
Lecture_9.pdf
BrofessorPaulNguyen
 
Litvinenko_RWTH_UQ_Seminar_talk.pdf
Litvinenko_RWTH_UQ_Seminar_talk.pdfLitvinenko_RWTH_UQ_Seminar_talk.pdf
Litvinenko_RWTH_UQ_Seminar_talk.pdf
Alexander Litvinenko
 
Meta-learning and the ELBO
Meta-learning and the ELBOMeta-learning and the ELBO
Meta-learning and the ELBO
Yoonho Lee
 
Tales on two commuting transformations or flows
Tales on two commuting transformations or flowsTales on two commuting transformations or flows
Tales on two commuting transformations or flows
VjekoslavKovac1
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixtures
Christian Robert
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
Charles Deledalle
 
Unbiased Hamiltonian Monte Carlo
Unbiased Hamiltonian Monte CarloUnbiased Hamiltonian Monte Carlo
Unbiased Hamiltonian Monte Carlo
JeremyHeng10
 
Diffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modelingDiffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modeling
JeremyHeng10
 
Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...
Alexander Litvinenko
 
Bayesian Inference and Uncertainty Quantification for Inverse Problems
Bayesian Inference and Uncertainty Quantification for Inverse ProblemsBayesian Inference and Uncertainty Quantification for Inverse Problems
Bayesian Inference and Uncertainty Quantification for Inverse Problems
Matt Moores
 
A Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scaleA Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scale
OctavianPostavaru
 
Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9
Daisuke Yoneoka
 
Spectral sum rules for conformal field theories
Spectral sum rules for conformal field theoriesSpectral sum rules for conformal field theories
Spectral sum rules for conformal field theories
Subham Dutta Chowdhury
 
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
The Statistical and Applied Mathematical Sciences Institute
 
Approximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsApproximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-Likelihoods
Stefano Cabras
 
Diffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modelingDiffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modeling
JeremyHeng10
 

Similar to Estimation of the score vector and observed information matrix in intractable models (20)

Linear response theory and TDDFT
Linear response theory and TDDFT Linear response theory and TDDFT
Linear response theory and TDDFT
 
On Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular IntegralsOn Twisted Paraproducts and some other Multilinear Singular Integrals
On Twisted Paraproducts and some other Multilinear Singular Integrals
 
Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...
Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...
Computing f-Divergences and Distances of\\ High-Dimensional Probability Densi...
 
Lecture_9.pdf
Lecture_9.pdfLecture_9.pdf
Lecture_9.pdf
 
Litvinenko_RWTH_UQ_Seminar_talk.pdf
Litvinenko_RWTH_UQ_Seminar_talk.pdfLitvinenko_RWTH_UQ_Seminar_talk.pdf
Litvinenko_RWTH_UQ_Seminar_talk.pdf
 
Meta-learning and the ELBO
Meta-learning and the ELBOMeta-learning and the ELBO
Meta-learning and the ELBO
 
Tales on two commuting transformations or flows
Tales on two commuting transformations or flowsTales on two commuting transformations or flows
Tales on two commuting transformations or flows
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixtures
 
Muchtadi
MuchtadiMuchtadi
Muchtadi
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
 
Unbiased Hamiltonian Monte Carlo
Unbiased Hamiltonian Monte CarloUnbiased Hamiltonian Monte Carlo
Unbiased Hamiltonian Monte Carlo
 
Diffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modelingDiffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modeling
 
Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...
 
Bayesian Inference and Uncertainty Quantification for Inverse Problems
Bayesian Inference and Uncertainty Quantification for Inverse ProblemsBayesian Inference and Uncertainty Quantification for Inverse Problems
Bayesian Inference and Uncertainty Quantification for Inverse Problems
 
A Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scaleA Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scale
 
Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9
 
Spectral sum rules for conformal field theories
Spectral sum rules for conformal field theoriesSpectral sum rules for conformal field theories
Spectral sum rules for conformal field theories
 
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
 
Approximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsApproximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-Likelihoods
 
Diffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modelingDiffusion Schrödinger bridges for score-based generative modeling
Diffusion Schrödinger bridges for score-based generative modeling
 

More from Pierre Jacob

Talk at CIRM on Poisson equation and debiasing techniques
Talk at CIRM on Poisson equation and debiasing techniquesTalk at CIRM on Poisson equation and debiasing techniques
Talk at CIRM on Poisson equation and debiasing techniques
Pierre Jacob
 
ISBA 2022 Susie Bayarri lecture
ISBA 2022 Susie Bayarri lectureISBA 2022 Susie Bayarri lecture
ISBA 2022 Susie Bayarri lecture
Pierre Jacob
 
Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation
Pierre Jacob
 
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsMonte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Pierre Jacob
 
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsMonte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Pierre Jacob
 
Markov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing themMarkov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing them
Pierre Jacob
 
Unbiased MCMC with couplings
Unbiased MCMC with couplingsUnbiased MCMC with couplings
Unbiased MCMC with couplings
Pierre Jacob
 
Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods
Pierre Jacob
 
Recent developments on unbiased MCMC
Recent developments on unbiased MCMCRecent developments on unbiased MCMC
Recent developments on unbiased MCMC
Pierre Jacob
 
Current limitations of sequential inference in general hidden Markov models
Current limitations of sequential inference in general hidden Markov modelsCurrent limitations of sequential inference in general hidden Markov models
Current limitations of sequential inference in general hidden Markov models
Pierre Jacob
 
On non-negative unbiased estimators
On non-negative unbiased estimatorsOn non-negative unbiased estimators
On non-negative unbiased estimators
Pierre Jacob
 
Path storage in the particle filter
Path storage in the particle filterPath storage in the particle filter
Path storage in the particle filter
Pierre Jacob
 
Density exploration methods
Density exploration methodsDensity exploration methods
Density exploration methods
Pierre Jacob
 
SMC^2: an algorithm for sequential analysis of state-space models
SMC^2: an algorithm for sequential analysis of state-space modelsSMC^2: an algorithm for sequential analysis of state-space models
SMC^2: an algorithm for sequential analysis of state-space models
Pierre Jacob
 
PAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ Warwick
Pierre Jacob
 
Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7
Pierre Jacob
 
Presentation MCB seminar 09032011
Presentation MCB seminar 09032011Presentation MCB seminar 09032011
Presentation MCB seminar 09032011
Pierre Jacob
 

More from Pierre Jacob (17)

Talk at CIRM on Poisson equation and debiasing techniques
Talk at CIRM on Poisson equation and debiasing techniquesTalk at CIRM on Poisson equation and debiasing techniques
Talk at CIRM on Poisson equation and debiasing techniques
 
ISBA 2022 Susie Bayarri lecture
ISBA 2022 Susie Bayarri lectureISBA 2022 Susie Bayarri lecture
ISBA 2022 Susie Bayarri lecture
 
Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation
 
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsMonte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problems
 
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsMonte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problems
 
Markov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing themMarkov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing them
 
Unbiased MCMC with couplings
Unbiased MCMC with couplingsUnbiased MCMC with couplings
Unbiased MCMC with couplings
 
Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods
 
Recent developments on unbiased MCMC
Recent developments on unbiased MCMCRecent developments on unbiased MCMC
Recent developments on unbiased MCMC
 
Current limitations of sequential inference in general hidden Markov models
Current limitations of sequential inference in general hidden Markov modelsCurrent limitations of sequential inference in general hidden Markov models
Current limitations of sequential inference in general hidden Markov models
 
On non-negative unbiased estimators
On non-negative unbiased estimatorsOn non-negative unbiased estimators
On non-negative unbiased estimators
 
Path storage in the particle filter
Path storage in the particle filterPath storage in the particle filter
Path storage in the particle filter
 
Density exploration methods
Density exploration methodsDensity exploration methods
Density exploration methods
 
SMC^2: an algorithm for sequential analysis of state-space models
SMC^2: an algorithm for sequential analysis of state-space modelsSMC^2: an algorithm for sequential analysis of state-space models
SMC^2: an algorithm for sequential analysis of state-space models
 
PAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ Warwick
 
Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7Presentation of SMC^2 at BISP7
Presentation of SMC^2 at BISP7
 
Presentation MCB seminar 09032011
Presentation MCB seminar 09032011Presentation MCB seminar 09032011
Presentation MCB seminar 09032011
 

Recently uploaded

In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
AlaminAfendy1
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
AADYARAJPANDEY1
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
anitaento25
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
muralinath2
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
Richard Gill
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
Richard Gill
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
yusufzako14
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
Nistarini College, Purulia (W.B) India
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
muralinath2
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
subedisuryaofficial
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SELF-EXPLANATORY
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
Health Advances
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
moosaasad1975
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
muralinath2
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
sonaliswain16
 

Recently uploaded (20)

In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCINGRNA INTERFERENCE: UNRAVELING GENETIC SILENCING
RNA INTERFERENCE: UNRAVELING GENETIC SILENCING
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 

Estimation of the score vector and observed information matrix in intractable models

  • 1. Estimation of the score vector and observed information matrix in intractable models Arnaud Doucet (University of Oxford) Pierre E. Jacob (University of Oxford) Sylvain Rubenthaler (Universit´e Nice Sophia Antipolis) November 28th, 2014 Pierre Jacob Derivative estimation 1/ 39
  • 2. Outline 1 Context 2 General results and connections 3 Posterior concentration when the prior concentrates 4 Hidden Markov models Pierre Jacob Derivative estimation 1/ 39
  • 3. Outline 1 Context 2 General results and connections 3 Posterior concentration when the prior concentrates 4 Hidden Markov models Pierre Jacob Derivative estimation 2/ 39
  • 4. Motivation Derivatives of the likelihood help optimizing / sampling. For many models they are not available. One can resort to approximation techniques. Pierre Jacob Derivative estimation 2/ 39
  • 5. Using derivatives in sampling algorithms Modified Adjusted Langevin Algorithm At step t, given a point θt, do: propose θ⋆ ∼ q(dθ | θt) ≡ N(θt + σ2 2 ∇θ log π(θt), σ2 ), with probability 1 ∧ π(θ⋆)q(θt | θ⋆) π(θt)q(θ⋆ | θt) set θt+1 = θ⋆, otherwise set θt+1 = θt. Pierre Jacob Derivative estimation 3/ 39
  • 6. Using derivatives in sampling algorithms Figure : Proposal mechanism for random walk Metropolis–Hastings. Pierre Jacob Derivative estimation 4/ 39
  • 7. Using derivatives in sampling algorithms Figure : Proposal mechanism for MALA. Pierre Jacob Derivative estimation 5/ 39
  • 8. Using derivatives in sampling algorithms In what sense is MALA better than MH? Scaling with the dimension of the state space For Metropolis–Hastings, optimal scaling leads to σ2 = O(d−1 ), For MALA, optimal scaling leads to σ2 = O(d−1/3 ). Roberts & Rosenthal, Optimal Scaling for Various Metropolis-Hastings Algorithms, 2001. Pierre Jacob Derivative estimation 6/ 39
  • 9. Hidden Markov models y2 X2X0 y1 X1 ... ... yT XT θ Figure : Graph representation of a general hidden Markov model. Hidden process: initial distribution µθ, transition fθ. Observations conditional upon the hidden process, from gθ. Pierre Jacob Derivative estimation 7/ 39
  • 10. Assumptions Input: Parameter θ : unknown, prior distribution p. Initial condition µθ(dx0) : can be sampled from. Transition fθ(dxt|xt−1) : can be sampled from. Measurement gθ(yt|xt) : can be evaluated point-wise. Observations y1:T = (y1, . . . , yT ). Goals: score: ∇θ log L(θ; y1:T ) for any θ, observed information matrix: −∇2 θ log L(θ; y1:T ) for any θ. Note: throughout the talk, the observations, and thus the likelihood, are fixed. Pierre Jacob Derivative estimation 8/ 39
  • 11. Why is it an intractable model? The likelihood function does not admit a closed form expression: L(θ; y1, . . . , yT ) = ∫ XT+1 p(y1, . . . , yT | x0, . . . xT , θ)p(dx0, . . . dxT | θ) = ∫ XT+1 T∏ t=1 gθ(yt | xt) µθ(dx0) T∏ t=1 fθ(dxt | xt−1). Hence the likelihood can only be estimated, e.g. by standard Monte Carlo, or by particle filters. What about the derivatives of the likelihood? Pierre Jacob Derivative estimation 9/ 39
  • 12. Fisher and Louis’ identities Write the score as: ∇ℓ(θ) = ∫ ∇ log p(x0:T , y1:T | θ)p(dx0:T | y1:T , θ). which is an integral, with respect to the smoothing distribution p(dx0:T | y1:T , θ), of ∇ log p(x0:T , y1:T | θ) = ∇ log µθ(x0) + T∑ t=1 ∇ log fθ(xt | xt−1) + T∑ t=1 ∇ log gθ(yt | xt). However pointwise evaluations of ∇ log µθ(x0) and ∇ log fθ(xt | xt−1) are not always available. Pierre Jacob Derivative estimation 10/ 39
  • 13. New kid on the block: Iterated Filtering Perturbed model Hidden states ˜Xt = (˜θt, Xt). { ˜θ0 ∼ N(θ0, τ2Σ) X0 ∼ µ˜θ0 (·) and { ˜θt ∼ N(˜θt−1, σ2Σ) Xt ∼ f˜θt (· | Xt−1 = xt−1) Observations ˜Yt ∼ g˜θt (· | Xt). Score estimate Consider VP,t = Cov[˜θt | y1:t−1] and ˜θF,t = E[˜θt | y1:t]. T∑ t=1 VP,t −1 ( ˜θF,t − ˜θF,t−1 ) ≈ ∇ℓ(θ0) when τ → 0 and σ/τ → 0. Ionides, Breto, King, PNAS, 2006. Pierre Jacob Derivative estimation 11/ 39
  • 14. Iterated Filtering: the mystery Why is it valid? Is it related to known techniques? Can it be extended to estimate the second derivatives (i.e. the Hessian, i.e. the observed information matrix)? How does it compare to other methods such as finite difference? Pierre Jacob Derivative estimation 12/ 39
  • 15. Outline 1 Context 2 General results and connections 3 Posterior concentration when the prior concentrates 4 Hidden Markov models Pierre Jacob Derivative estimation 13/ 39
  • 16. Iterated Filtering Given a log likelihood ℓ and a given point, consider a prior θ ∼ N(θ0, σ2 ). Posterior expectation when the prior variance goes to zero First-order moments give first-order derivatives: |σ−2 (E[θ|Y ] − θ0) − ∇ℓ(θ0)| ≤ Cσ2 . Phrased simply, posterior mean − prior mean prior variance ≈ score. Result from Ionides, Bhadra, Atchad´e, King, Iterated filtering, 2011. Pierre Jacob Derivative estimation 13/ 39
  • 17. Extension of Iterated Filtering Posterior variance when the prior variance goes to zero Second-order moments give second-order derivatives: |σ−4 ( Cov[θ|Y ] − σ2 ) − ∇2 ℓ(θ0)| ≤ Cσ2 . Phrased simply, posterior variance − prior variance prior variance2 ≈ hessian. Result from Doucet, Jacob, Rubenthaler on arXiv, 2013. Pierre Jacob Derivative estimation 14/ 39
  • 18. Proximity mapping Given a real function f and a point θ0, consider for any σ2 > 0 θ → f (θ) exp { − 1 2σ2 (θ − θ0)2 } θ θ0 Figure : Example for f : θ → exp(−|θ|) and three values of σ2 . Pierre Jacob Derivative estimation 15/ 39
  • 19. Proximity mapping Proximity mapping The σ2-proximity mapping is defined by proxf : θ0 → argmaxθ∈R f (θ) exp { − 1 2σ2 (θ − θ0)2 } . Moreau approximation The σ2-Moreau approximation is defined by fσ2 : θ0 → C supθ∈R f (θ) exp { − 1 2σ2 (θ − θ0)2 } where C is a normalizing constant. Pierre Jacob Derivative estimation 16/ 39
  • 20. Proximity mapping θ Figure : θ → f (θ) and θ → fσ2 (θ) for three values of σ2 . Pierre Jacob Derivative estimation 16/ 39
  • 21. Proximity mapping Property Those objects are such that proxf (θ0) − θ0 σ2 = ∇ log fσ2 (θ0) −−−→ σ2→0 ∇ log f (θ0) Moreau (1962), Fonctions convexes duales et points proximaux dans un espace Hilbertien. Pereyra (2013), Proximal Markov chain Monte Carlo algorithms. Pierre Jacob Derivative estimation 17/ 39
  • 22. Proximity mapping Bayesian interpretation If f is a seen as a likelihood function then θ → f (θ) exp { − 1 2σ2 (θ − θ0)2 } is an unnormalized posterior density function based on a Normal prior with mean θ0 and variance σ2. Hence proxf (θ0) − θ0 σ2 −−−→ σ2→0 ∇ log f (θ0) can be read posterior mode − prior mode prior variance ≈ score. Pierre Jacob Derivative estimation 18/ 39
  • 23. Stein’s lemma Stein’s lemma states that θ ∼ N(θ0, σ2 ) if and only if for any function g such that E [|∇g(θ)|] < ∞, E [(θ − θ0) g (θ)] = σ2 E [∇g (θ)] . If we choose the function g : θ → exp ℓ (θ) /Z with Z = E [exp ℓ (θ)] and apply Stein’s lemma we obtain 1 Z E [θ exp ℓ(θ)] − θ0 = σ2 Z E [∇ℓ (θ) exp (ℓ (θ))] ⇔ σ−2 (E [θ | Y ] − θ0) = E [∇ℓ (θ) | Y ] . Notation: E[φ(θ) | Y ] := E[φ(θ) exp ℓ(θ)]/Z. Pierre Jacob Derivative estimation 19/ 39
  • 24. Stein’s lemma For the second derivative, we consider h : θ → (θ − θ0) exp ℓ (θ) /Z. Then E [ (θ − θ0)2 | Y ] = σ2 + σ4 E [ ∇2 ℓ(θ) + ∇ℓ(θ)2 | Y ] . Adding and subtracting terms also yields σ−4 ( V [θ | Y ] − σ2 ) = E [ ∇2 ℓ(θ) | Y ] + { E [ ∇ℓ(θ)2 | Y ] − (E [∇ℓ(θ) | Y ])2 } . . . . but what we really want is ∇ℓ(θ0), ∇ℓ2 (θ0) and not E [∇ℓ(θ) | Y ] , E [ ∇ℓ2 (θ) | Y ] . Pierre Jacob Derivative estimation 20/ 39
  • 25. Outline 1 Context 2 General results and connections 3 Posterior concentration when the prior concentrates 4 Hidden Markov models Pierre Jacob Derivative estimation 21/ 39
  • 26. Core Idea The prior is a essentially a normal distribution N(θ0, σ2), but in general has a density denoted by κ. Posterior concentration induced by the prior Under some assumptions, when σ → 0: the posterior looks more and more like the prior, the shift in posterior moments is in O(σ2). Our arXived proof suffers from an overdose of Taylor expansions. Pierre Jacob Derivative estimation 21/ 39
• 27. Details
Introduce a test function h such that |h(u)| < c|u|^α for some c, α. We start by writing
E{h(θ − θ0) | y} = ∫ h(σu) exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du / ∫ exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du
using the change of variables u = (θ − θ0)/σ, and then focus on the numerator
∫ h(σu) exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du
since the denominator is a particular instance of this expression with h : u → 1.
• 28. Details
For the numerator
∫ h(σu) exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du
we use a Taylor expansion of ℓ around θ0 and a Taylor expansion of exp around 0, and then take the integral with respect to κ.
Notation:
ℓ⁽ᵏ⁾(θ).u⊗ᵏ = Σ_{1≤i1,...,ik≤d} ∂ᵏℓ(θ) / (∂θ_{i1} · · · ∂θ_{ik}) u_{i1} · · · u_{ik},
which in one dimension becomes
ℓ⁽ᵏ⁾(θ).u⊗ᵏ = (dᵏℓ(θ)/dθᵏ) uᵏ.
• 29. Details
Main expansion:
∫ h(σu) exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du
= ∫ h(σu) κ(u) du
+ σ ∫ h(σu) ℓ⁽¹⁾(θ0).u κ(u) du
+ σ² ∫ h(σu) { ½ ℓ⁽²⁾(θ0).u⊗² + ½ (ℓ⁽¹⁾(θ0).u)² } κ(u) du
+ σ³ ∫ h(σu) { (1/3!) (ℓ⁽¹⁾(θ0).u)³ + ½ (ℓ⁽¹⁾(θ0).u)(ℓ⁽²⁾(θ0).u⊗²) + (1/3!) ℓ⁽³⁾(θ0).u⊗³ } κ(u) du
+ O(σ^{4+α}).
In general, assumptions on the tails of the prior and the likelihood are used to control the remainder terms and to ensure they are O(σ^{4+α}).
• 30. Details
We cut the integral into two parts:
∫ h(σu) exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du
= ∫_{σ|u|≤ρ} h(σu) exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du
+ ∫_{σ|u|>ρ} h(σu) exp{ℓ(θ0 + σu) − ℓ(θ0)} κ(u) du.
The expansion stems from the first term, where σ|u| is small. The second term ends up in the remainder in O(σ^{4+α}) using the assumptions.
This is a classic technique in Bayesian asymptotics theory.
• 31. Details
To get the score from the expansion, choose h : u → u.
To get the observed information matrix from the expansion, choose h : u → u², and surprisingly (?) further assume that κ is mesokurtic, i.e.
∫ u⁴ κ(u) du = 3 (∫ u² κ(u) du)²
⇒ choose a Gaussian prior to obtain the hessian.
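For instance (a two-line sanity check, not from the slides), a centred Gaussian κ satisfies the condition since its fourth moment equals three times its squared variance:

```python
import numpy as np

u = np.random.default_rng(2).normal(0.0, 1.5, size=10**7)   # κ = N(0, 1.5²)
print(np.mean(u**4), 3 * np.mean(u**2) ** 2)                # both ≈ 3 · 1.5⁴ ≈ 15.19
```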
• 32. Outline
1 Context
2 General results and connections
3 Posterior concentration when the prior concentrates
4 Hidden Markov models
• 33. Hidden Markov models
Figure: Graph representation of a general hidden Markov model.
• 34. Hidden Markov models
Direct application of the previous results
1 Prior distribution N(θ0, σ²) on the parameter θ.
2 The derivative approximations involve E[θ | Y] and Cov[θ | Y].
3 Posterior moments for HMMs can be estimated by particle MCMC, SMC², ABC or your favourite method.
Ionides et al. proposed another approach.
• 35. Iterated Filtering
Modification of the model: θ is time-varying. The associated log-likelihood is
ℓ̄(θ1:T) = log p(y1:T; θ1:T) = log ∫_{Xᵀ} ∏_{t=1}^T g(yt | xt, θt) μ(dx1 | θ1) ∏_{t=2}^T f(dxt | xt−1, θt).
Introducing θ → (θ, θ, . . . , θ) := θ[T] ∈ ℝᵀ, we have ℓ̄(θ[T]) = ℓ(θ) and the chain rule yields
dℓ(θ)/dθ = Σ_{t=1}^T ∂ℓ̄(θ[T]) / ∂θt.
• 36. Iterated Filtering
Choice of prior on θ1:T:
θ1 = θ0 + V1, V1 ∼ τ⁻¹ κ{τ⁻¹(·)},
θt+1 − θ0 = ρ (θt − θ0) + Vt+1, Vt+1 ∼ σ⁻¹ κ{σ⁻¹(·)}.
Choose σ² such that τ² = σ²/(1 − ρ²).
Covariance of the prior on θ1:T:
Σ_T = τ² [ρ^{|s−t|}]_{1≤s,t≤T},
i.e. the Toeplitz matrix with first row (1, ρ, ρ², . . . , ρ^{T−1}).
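A sketch checking this covariance by simulation (Gaussian κ and all variable names are assumptions): simulate AR(1) paths started from the stationary distribution N(θ0, τ²) and compare the empirical covariance with τ²ρ^{|s−t|}.

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(3)
T, rho, tau2, theta0 = 5, 0.8, 0.5, 0.0
sigma2 = tau2 * (1 - rho**2)                     # so that τ² = σ²/(1 − ρ²)

n = 10**6
theta = np.empty((n, T))
theta[:, 0] = theta0 + rng.normal(0.0, np.sqrt(tau2), n)
for t in range(1, T):                            # θ_{t+1} − θ0 = ρ(θ_t − θ0) + V_{t+1}
    theta[:, t] = theta0 + rho * (theta[:, t - 1] - theta0) \
        + rng.normal(0.0, np.sqrt(sigma2), n)

Sigma_T = tau2 * toeplitz(rho ** np.arange(T))   # τ² ρ^{|s−t|}
print(np.max(np.abs(np.cov(theta.T) - Sigma_T))) # small, up to Monte Carlo error
```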
• 37. Iterated Filtering
Applying the general results for this prior yields, with |x| = Σ_{t=1}^T |xt|:
|∇ℓ̄(θ0[T]) − Σ_T⁻¹ (E[θ1:T | Y] − θ0[T])| ≤ Cτ².
Moreover we have
|Σ_{t=1}^T ∂ℓ̄(θ[T])/∂θt − Σ_{t=1}^T {Σ_T⁻¹ (E[θ1:T | Y] − θ0[T])}_t| ≤ Σ_{t=1}^T |∂ℓ̄(θ[T])/∂θt − {Σ_T⁻¹ (E[θ1:T | Y] − θ0[T])}_t|
and
dℓ(θ)/dθ = Σ_{t=1}^T ∂ℓ̄(θ[T])/∂θt.
• 38. Iterated Filtering
The estimator of the score is thus given by
Σ_{t=1}^T {Σ_T⁻¹ (E[θ1:T | Y] − θ0[T])}_t,
which, given the form of Σ_T⁻¹, reduces to
S_{τ,ρ,T}(θ0) = τ⁻²/(1 + ρ) [ (1 − ρ) Σ_{t=2}^{T−1} E(θt | Y) − {(1 − ρ)T + 2ρ} θ0 + E(θ1 | Y) + E(θT | Y) ].
Note that in the quantities E(θt | Y), Y = Y1:T is the complete dataset, so these expectations are with respect to the smoothing distribution.
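The reduction can be checked numerically (a sketch; the vector m below is random and merely stands in for smoothing estimates of E(θt | Y)):

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(4)
T, rho, tau2, theta0 = 50, 0.8, 0.5, 0.3
Sigma_T = tau2 * toeplitz(rho ** np.arange(T))   # prior covariance of θ_{1:T}
m = theta0 + rng.normal(0.0, 0.1, T)             # placeholder for E(θ_t | Y)

# Generic form: sum of the coordinates of Σ_T⁻¹ (m − θ0)
s_generic = np.sum(np.linalg.solve(Sigma_T, m - theta0))

# Reduced closed form from the slide
s_reduced = (1 / tau2) / (1 + rho) * (
    (1 - rho) * np.sum(m[1:-1]) - ((1 - rho) * T + 2 * rho) * theta0
    + m[0] + m[-1])
print(s_generic, s_reduced)                      # identical up to rounding
```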
• 39. Iterated Filtering
If ρ = 1, then the parameters follow a random walk:
θ1 = θ0 + N(0, τ²) and θt+1 = θt + N(0, σ²).
In this case Ionides et al. proposed the estimator
S_{τ,σ,T} = τ⁻² (E(θT | Y) − θ0)
as well as
S⁽ᵇⁱˢ⁾_{τ,σ,T} = Σ_{t=1}^T V_{P,t}⁻¹ (θ̃_{F,t} − θ̃_{F,t−1}),
with V_{P,t} = Cov[θ̃t | y1:t−1] and θ̃_{F,t} = E[θt | y1:t].
These expressions only involve expectations with respect to filtering distributions.
• 40. Iterated Filtering
If ρ = 0, then the parameters are i.i.d.:
θ1 = θ0 + N(0, τ²) and θt+1 = θ0 + N(0, τ²).
In this case the expression of the score estimator reduces to
S_{τ,T} = τ⁻² Σ_{t=1}^T (E(θt | Y) − θ0),
which involves smoothing distributions. There is only one parameter τ² to choose for the prior.
However, smoothing for general hidden Markov models is difficult, and typically resorts to “fixed-lag approximations”.
• 41. Iterated Smoothing
Only for the case ρ = 0 are we able to obtain simple expressions for the observed information matrix. We propose the following estimator:
I_{τ,T}(θ0) = −τ⁻⁴ { Σ_{s=1}^T Σ_{t=1}^T Cov(θs, θt | Y) − τ²T },
for which we can show that
|I_{τ,T} − (−∇²ℓ(θ0))| ≤ Cτ².
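Given smoothing moments, both ρ = 0 estimators are one-liners (a sketch; post_mean and post_cov stand for E[θ1:T | Y] and Cov[θ1:T | Y], which in practice would come from a particle smoother):

```python
import numpy as np

def iterated_smoothing_estimators(post_mean, post_cov, theta0, tau2):
    """Score S_{tau,T} and observed information I_{tau,T} for the rho = 0 prior."""
    T = len(post_mean)
    score = np.sum(post_mean - theta0) / tau2          # τ⁻² Σ_t (E(θ_t | Y) − θ0)
    info = -(np.sum(post_cov) - tau2 * T) / tau2**2    # −τ⁻⁴ (ΣΣ Cov(θs, θt | Y) − τ²T)
    return score, info
```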
• 42. Numerical results
Linear Gaussian state space model, where the ground truth is available through the Kalman filter:
X0 ∼ N(0, 1) and Xt = ρXt−1 + N(0, V),
Yt = ηXt + N(0, W).
Generate T = 100 observations and set ρ = 0.9, V = 0.7, η = 0.9 and W = 0.1, 0.2, 0.4, 0.9.
240 independent runs, matching the computational costs between methods in terms of number of calls to the transition kernel.
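A sketch of how such a ground truth can be computed (my reconstruction, not the talk’s code): a scalar Kalman filter evaluates the exact log-likelihood, and central finite differences, here in V, give the corresponding score entry.

```python
import numpy as np

def kalman_loglik(y, rho, V, eta, W):
    """Exact log-likelihood of the linear Gaussian model via the Kalman filter."""
    m, P, ll = 0.0, 1.0, 0.0                        # X0 ~ N(0, 1)
    for yt in y:
        m_pred, P_pred = rho * m, rho**2 * P + V    # predict X_t | y_{1:t-1}
        S = eta**2 * P_pred + W                     # innovation variance
        ll += -0.5 * (np.log(2 * np.pi * S) + (yt - eta * m_pred) ** 2 / S)
        K = P_pred * eta / S                        # Kalman gain
        m = m_pred + K * (yt - eta * m_pred)        # update X_t | y_{1:t}
        P = (1.0 - K * eta) * P_pred
    return ll

rng = np.random.default_rng(5)
rho, V, eta, W, T = 0.9, 0.7, 0.9, 0.2, 100
x, y = rng.normal(0.0, 1.0), np.empty(T)            # simulate T observations
for t in range(T):
    x = rho * x + rng.normal(0.0, np.sqrt(V))
    y[t] = eta * x + rng.normal(0.0, np.sqrt(W))

h = 1e-5                                            # central finite difference in V
score_V = (kalman_loglik(y, rho, V + h, eta, W)
           - kalman_loglik(y, rho, V - h, eta, W)) / (2 * h)
print(score_V)
```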
• 43. Numerical results
[Plot: RMSE of the score estimates for parameters 1–4, as a function of h (Finite Difference, left panel) and of τ (Iterated Smoothing, right panel).]
Figure: 240 runs for Iterated Smoothing and Finite Difference.
• 44. Numerical results
[Plot: RMSE of the score estimates for parameters 1–4 as a function of τ, for Iterated Smoothing, Iterated Filtering 1 and Iterated Filtering 2.]
Figure: 240 runs for Iterated Smoothing and Iterated Filtering.
• 45. Bibliography
Main references:
Inference for nonlinear dynamical systems, Ionides, Bretó, King, PNAS, 2006.
Iterated filtering, Ionides, Bhadra, Atchadé, King, Annals of Statistics, 2011.
Efficient iterated filtering, Lindström, Ionides, Frydendall, Madsen, 16th IFAC Symposium on System Identification.
Derivative-Free Estimation of the Score Vector and Observed Information Matrix, Doucet, Jacob, Rubenthaler, 2013 (on arXiv).