Data fusion approaches
(for Earth-systems data)
Veronica J. Berrocal
University of Michigan
Department of Biostatistics
SAMSI course
Fall 2017
Veronica J. Berrocal Data fusion
Outline
• Introduction
• Data assimilation
• Optimal Interpolation
• Variational methods
• Kalman filter
• Further approaches
• Data fusion in the statistical literature
• Spatial data
• Example: Wikle and Berliner, 2005
• Example: Fuentes and Raftery, 2005 (Bayesian melding)
• Space-time data
• Example: Wikle et al., 2001
• Example: Choi et al., 2009
• Example: McMillan et al., 2009
• Example: Berrocal et al., 2010 and 2012
• Example: Sahu et al., 2009
Definitions
• Data fusion refers to the statistical technique used to combine
data from different sources
• If one of the sources is the output of a computer model
→ Data assimilation
• Data assimilation: term coined in the atmospheric science
community
• Several definitions
• “approach for fusing data (observations) with prior knowledge
(e.g. mathematical representations of physical laws; model
output) to obtain an estimate of the distribution of the true
state of the process” (Wikle and Berliner, 2006)
Why data fusion?
• The evolution in time of many geophysical processes (e.g.
atmosphere, etc.) can be described by systems of partial
differential equations
• As an example, numerical weather forecasts are obtained by
running forward in time computer models that simulate the
evolution of the atmosphere in time
• The equations are solved numerically, by discretizing both
space and time
• It is necessary to specify initial conditions, and, at times,
boundary conditions
• Forecasts are highly sensitive to the initial conditions
Data fusion: an old problem
• Often, observations of the initial state are not available: this
was recognized by mathematicians and astronomers, among
them Euler, Lagrange and Laplace.
• In particular, Gauss elaborated how available observations of
the physical system were not easily translatable into initial
conditions and stated
“[..] since all our observations and measurements are nothing more
than approximations to the truth, the same must be true of all
calculations resting on them, and the highest aim of all computations
made concerning concrete phenomena must be to approximate, as
nearly as practicable, to the truth. But this can be accomplished in
no other way than by suitable combination of more observations
than the number absolutely requisite for the determination of
unknown quantities.” (Theory of Motion of Heavenly Bodies)
Global atmospheric models
Numerical weather prediction models
• $x_t$: state of the atmosphere at time $t$; $M_t$: numerical weather
prediction model at time $t$
• $M_t$ consists of a set of partial differential equations
[Figure: the model $M_t$ maps the gridded state $x_t$ (longitude × latitude × height) to $x_{t+1}$]
\[ x_{t+1} = M_t(x_t) \qquad \text{(state-space model)} \]
It’s an initial-value problem
• In order to obtain a skillful forecast, it is necessary that:
• Mt is a realistic representation of the atmosphere
• the vector xt of state space variables is known accurately
• We will assume that the atmospheric model Mt approximates
well the evolution in time of the atmosphere
• We will focus on how to determine xt
Determining the initial conditions
[Figure: three-dimensional model grid (longitude × latitude × height)]
• At each time $t$, $x_t$ is a vector of order $n = 10^7$.
• For any time window around $t$, $(t - \delta t,\, t + \delta t)$, there are
typically $p = 10^5$ observations $y_t$ of the atmosphere.
• The observations might not refer to the same variables as the
state-space variables.
• Data assimilation integrates the two sources of information: a
short-range forecast (background or first guess), $x_t^{(b)}$, with
the observations, $y_t$.
Data assimilation
[E. Kalnay (2003)]
• Background or first guess: $x_t^{(b)}$.
• Global analysis: data assimilation
of the background, $x_t^{(b)}$, with the
observations, $y_t$.
Data assimilation approaches
• There are several methods for data assimilation. They differ
mainly in whether the observations are integrated
sequentially or not, and in whether the model is assumed perfect
or stochastic:
• Optimal Interpolation
• Variational methods: 3D-Var and 4D-Var
• Kalman filtering: Kalman filter and Ensemble Kalman filter
• They all hypothesize that at time $t$, there are:
1 a true unknown state of the atmosphere: $x_t$
2 a background field: $x_t^{(b)}$
3 a vector of observations: $y_t$
• Goal: combine $x_t^{(b)}$ and $y_t$ to determine the best
approximation, or analysis, $x_t^{(a)}$, to $x_t$
Data assimilation: assumptions
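• $x_t$: true state of the atmosphere at time $t$
• $x_t^{(b)}$: background field at time $t$
\[ x_t^{(b)} = x_t + \varepsilon_t^{(b)}, \qquad \varepsilon_t^{(b)} \sim N_n(0, P^{(b)}) \]
• $y_t$: observations at time $t$
\[ y_t = \mathcal{H}(x_t) + \varepsilon_t^{(o)} \approx H_t x_t + \varepsilon_t^{(o)}, \qquad \varepsilon_t^{(o)} \sim N_p(0, R) \]
$\mathcal{H}$: observation operator, assumed to be linear (or
approximated by a linear operator) and represented at time $t$ by the matrix
$H_t$
• $x_t^{(a)}$: analysis at time $t$
\[ x_t^{(a)} = x_t + \varepsilon_t^{(a)}, \qquad \varepsilon_t^{(a)} \sim N_n(0, P^{(a)}) \]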
Optimal Interpolation
• We want to express $x_t^{(a)}$ as a linear combination of $x_t^{(b)}$ and $y_t$:
\[ x_t^{(a)} = a_1 x_t^{(b)} + a_2 y_t \]
so that $x_t^{(a)}$ is unbiased and $a_1$ and $a_2$ minimize the mean
squared error
• Using the same approach as in least squares, we assume that
$x_t^{(a)}$ is given by
\[ x_t^{(a)} = x_t^{(b)} + W (y_t - H_t x_t^{(b)}) \]
• Goal: determine the matrix $W$ so that the analysis error, $\varepsilon_t^{(a)}$,
has minimal expected sum of squares
Optimal Interpolation
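• $x_t^{(a)} = x_t^{(b)} + W (y_t - H_t x_t^{(b)})$
• Goal: determine $W$ so that
\[ E\big( \varepsilon_t^{(a)\top} \varepsilon_t^{(a)} \big) = E\Big( \big[ W (y_t - H_t x_t^{(b)}) + \varepsilon_t^{(b)} \big]^\top \big[ W (y_t - H_t x_t^{(b)}) + \varepsilon_t^{(b)} \big] \Big) \]
is minimized
• Then: $W = P^{(b)} H_t^\top (R + H_t P^{(b)} H_t^\top)^{-1}$
• The optimal weight matrix $W$ is also called the gain matrix
• The covariance matrix, $P^{(a)}$, of the analysis error, $\varepsilon_t^{(a)}$, is:
\[ P^{(a)} = (I_n - W H_t) P^{(b)} \]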
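The update above can be sketched numerically. The following is a minimal NumPy illustration, not code from the course: the function name `oi_analysis` and the toy values for the background, covariances and observations are assumptions made only for this example.

```python
import numpy as np

def oi_analysis(x_b, P_b, y, H, R):
    """One optimal-interpolation update for a linear observation matrix H."""
    innovation = y - H @ x_b                 # y_t - H_t x_t^(b)
    S = R + H @ P_b @ H.T                    # total error covariance (observation space)
    # W = P_b H^T S^{-1}; S and P_b are symmetric, so solve instead of inverting
    W = np.linalg.solve(S, H @ P_b).T
    x_a = x_b + W @ innovation               # analysis
    P_a = (np.eye(x_b.size) - W @ H) @ P_b   # analysis error covariance
    return x_a, P_a, W

# Toy example (hypothetical numbers): three state variables, first and last observed
x_b = np.array([1.0, 2.0, 3.0])
idx = np.arange(3)
P_b = np.exp(-np.abs(idx[:, None] - idx[None, :]))   # exponential correlation
H = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
R = 0.25 * np.eye(2)
y = np.array([1.5, 2.0])
x_a, P_a, W = oi_analysis(x_b, P_b, y, H, R)
```

Because the background errors are correlated, the unobserved middle component is also updated, and every diagonal entry of `P_a` is smaller than the corresponding entry of `P_b`.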
Optimal Interpolation
• The analysis is obtained by adding to the first guess, $x_t^{(b)}$, the
product of the optimal weight matrix and the innovation,
that is, $y_t - H_t x_t^{(b)}$
• The optimal weight matrix, $W$, is given by the covariance of
the forecast error in the observation space ($P^{(b)} H_t^\top$) divided
by the total error covariance ($R + H_t P^{(b)} H_t^\top$)
• If the observation operator, Ht, is a linear operator (or an
interpolator), then
Optimal Interpolation = Kriging
Optimal Interpolation
• Observation operator $\mathcal{H}$ is a linear operator represented at
time $t$ by the matrix $H_t$
• Suppose that we assumed the following:
• Prior distribution: $x_t \sim N_n(x_t^{(b)}, P^{(b)})$
• Likelihood: $y_t \mid x_t \sim N_p(H_t x_t, R)$
• Posterior distribution: $x_t \mid y_t \sim N_n(E(x_t \mid y_t), \mathrm{Var}(x_t \mid y_t))$
\[ E(x_t \mid y_t) = x_t^{(b)} + W (y_t - H_t x_t^{(b)}) \]
\[ \mathrm{Var}(x_t \mid y_t) = (I_n - W H_t) P^{(b)} \]
with $W$ as in Optimal Interpolation.
Variational methods: 3D-Var
• The estimate of the true state of the atmosphere, $x_t$, is found by minimizing
a scalar cost function $J(x_t)$:
\[ J(x_t) = \tfrac{1}{2} (y_t - H_t x_t)^\top R^{-1} (y_t - H_t x_t) + \tfrac{1}{2} (x_t - x_t^{(b)})^\top (P^{(b)})^{-1} (x_t - x_t^{(b)}) \]
• $R$: observation error covariance matrix
• $P^{(b)}$: forecast error covariance matrix
• Formally the solution to the 3D-Var minimization problem is the
same as the solution to the Optimal Interpolation problem
• The solution to a 3D-Var is the posterior mean in the case of a
Gaussian prior for xt and a Gaussian likelihood with a linear
observation operator Ht.
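The equivalence between 3D-Var and Optimal Interpolation can be checked numerically on a small problem. This is a sketch under assumed toy values (the dimensions, the random matrices and the seed are all hypothetical): setting the gradient of $J$ to zero gives a linear system whose solution should match the Optimal Interpolation analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4, 2                                # toy sizes, not the 10^7 of a real model
x_b = rng.normal(size=n)                   # background
A = rng.normal(size=(n, n))
P_b = A @ A.T + n * np.eye(n)              # symmetric positive definite P^(b)
H = rng.normal(size=(p, n))                # linear observation matrix
R = 0.5 * np.eye(p)                        # observation error covariance
y = rng.normal(size=p)                     # observations

def cost(x):
    """3D-Var cost function J(x)."""
    r_obs = y - H @ x
    r_bg = x - x_b
    return 0.5 * r_obs @ np.linalg.solve(R, r_obs) + 0.5 * r_bg @ np.linalg.solve(P_b, r_bg)

# Gradient of J equal to zero:
# (P_b^{-1} + H^T R^{-1} H) x = P_b^{-1} x_b + H^T R^{-1} y
P_inv = np.linalg.inv(P_b)
R_inv = np.linalg.inv(R)
x_var = np.linalg.solve(P_inv + H.T @ R_inv @ H, P_inv @ x_b + H.T @ R_inv @ y)

# Optimal Interpolation analysis with the gain matrix W
W = P_b @ H.T @ np.linalg.inv(R + H @ P_b @ H.T)
x_oi = x_b + W @ (y - H @ x_b)
```

In operational settings the minimization is done iteratively because the matrices are far too large to invert; the closed form here is only feasible for a toy state vector.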
Variational methods: 4D-Var
• The estimate of the true state of the atmosphere is found by minimizing
a scalar cost function that allows for observations to be
distributed within a time interval $(t_0, t_N)$:
\[ J(x_{t_0}) = \tfrac{1}{2} \sum_{i=0}^{N} (y_{t_i} - H_{t_i} x_{t_i})^\top R_i^{-1} (y_{t_i} - H_{t_i} x_{t_i}) + \tfrac{1}{2} (x_{t_0} - x_{t_0}^{(b)})^\top (P_0^{(b)})^{-1} (x_{t_0} - x_{t_0}^{(b)}) \]
• $R_i$: observation error covariance matrix at time $t_i$
• $P_0^{(b)}$: forecast error covariance matrix at the start of the period
• The cost function $J(x_{t_0})$ is minimized with respect to the initial
state of the atmosphere, $x_{t_0}$; the later states $x_{t_i}$ are obtained by running the model forward from $x_{t_0}$
Assimilation via Kalman Filter
• The numerical model is imperfect:
\[ x_{t_i} = M_{t_{i-1}}(x_{t_{i-1}}) + \eta_{t_i}, \qquad i = 1, \ldots, N, \qquad \eta_{t_i} \sim N(0, Q_i) \]
• The observations are used sequentially in the time interval
$(t_1, t_N)$.
• At each time ti two operations are performed sequentially:
1 Forecast step
2 Analysis/assimilation step
Assimilation via Kalman Filter
• Forecast step:
1 Derive the forecast, or background, at time $t_i$: $x_{t_i}^{(b)} = M_{t_{i-1}}(x_{t_{i-1}}^{(a)})$.
2 Assuming that the model $M_t$ can be linearized and represented by the
matrix $\mathbf{M}_t$, compute the covariance matrix of the background error at
time $t_i$: $P_i^{(b)} = \mathbf{M}_{t_{i-1}} P_{i-1}^{(a)} \mathbf{M}_{t_{i-1}}^\top + Q_i$.
• Analysis step:
1 Compute the Kalman gain matrix at time $t_i$:
$K_i = P_i^{(b)} H_i^\top (R_i + H_i P_i^{(b)} H_i^\top)^{-1}$.
2 Derive the analysis at time $t_i$:
$x_{t_i}^{(a)} = x_{t_i}^{(b)} + K_i (y_{t_i} - H_i x_{t_i}^{(b)})$.
3 Compute the covariance matrix of the analysis error at time $t_i$:
$P_i^{(a)} = (I_n - K_i H_i) P_i^{(b)}$.
If $M_{t_i}$ and $\mathcal{H}_{t_i}$ are not linear → Extended Kalman Filter
Kalman Filter
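• Suppose that for each $i = 1, \ldots, N$:
• measurement equation:
\[ y_{t_i} = H_i x_{t_i} + \varepsilon_{t_i}, \qquad \varepsilon_{t_i} \sim N(0, R_i) \]
• process/transition equation:
\[ x_{t_i} = M_{t_i} x_{t_{i-1}} + \eta_{t_i}, \qquad \eta_{t_i} \sim N(0, Q_{t_i}) \]
with $x_{t_0} \sim N(x_{t_0}^{(b)}, P_{t_0}^{(b)})$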
Kalman Filter
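• Let $x_{t_0:t_i} \equiv \{x_{t_0}, x_{t_1}, \ldots, x_{t_i}\}$ and $y_{t_1:t_i} \equiv \{y_{t_1}, y_{t_2}, \ldots, y_{t_i}\}$.
• Let $x_{t_i}^{(a)}$ be the analysis
• Let $x_{t_i}^{(f)}$ denote the forecast
• For $i = 1, \ldots, N$:
• Forecast step
1 $x_{t_i}^{(f)} = E(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} x_{t_{i-1}}^{(a)}$
2 $P_{t_i}^{(f)} = \mathrm{Var}(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} P_{t_{i-1}}^{(a)} M_{t_i}^\top + Q_{t_i}$
• Analysis step
1 $K_{t_i} = P_{t_i}^{(f)} H_i^\top (R_i + H_i P_{t_i}^{(f)} H_i^\top)^{-1}$
2 $x_{t_i}^{(a)} = x_{t_i}^{(f)} + K_{t_i} (y_{t_i} - H_i x_{t_i}^{(f)})$
3 $P_{t_i}^{(a)} = (I_n - K_{t_i} H_i) P_{t_i}^{(f)}$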
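The forecast and analysis recursions can be written as a short loop. This is a minimal sketch that assumes, for simplicity, that $M$, $H$, $Q$ and $R$ do not change with time; the function name `kalman_filter` and the scalar example are illustrative and not part of the original material.

```python
import numpy as np

def kalman_filter(ys, M, H, Q, R, x0, P0):
    """Run the Kalman filter over a sequence of observations.

    Assumes time-invariant M, H, Q, R (a simplification of the slides,
    where these matrices may depend on t_i).
    Returns a list of (analysis mean, analysis covariance) pairs.
    """
    x_a, P_a = x0, P0
    results = []
    for y in ys:
        # Forecast step
        x_f = M @ x_a
        P_f = M @ P_a @ M.T + Q
        # Analysis step
        K = P_f @ H.T @ np.linalg.inv(R + H @ P_f @ H.T)   # Kalman gain
        x_a = x_f + K @ (y - H @ x_f)
        P_a = (np.eye(x0.size) - K @ H) @ P_f
        results.append((x_a, P_a))
    return results

# Scalar example (hypothetical numbers): static state observed twice with unit noise
one = np.array([[1.0]])
out = kalman_filter(
    ys=[np.array([2.0]), np.array([2.0])],
    M=one, H=one, Q=np.array([[0.0]]), R=one,
    x0=np.array([0.0]), P0=one,
)
```

In the scalar example the analysis mean moves from 0 to 1 after the first observation and to 4/3 after the second, while the analysis variance shrinks from 1 to 1/2 to 1/3, as expected when averaging a unit-variance prior with unit-variance observations.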
Kalman Filter/Extended Kalman Filter
• In the case of a linear state-space model $M_t$ and a linear
observation operator $\mathcal{H}$, the Kalman filter can be interpreted
within a Bayesian framework.
• If, at time $t_i$, we assume:
• $x_{t_i} \sim N(x_{t_i}^{(b)}, P_i^{(b)})$
• $y_{t_i} \mid x_{t_i} \sim N(H_i x_{t_i}, R_i)$
• Then the analysis $x_{t_i}^{(a)}$ is the posterior mean, $E(x_{t_i} \mid y_{t_i})$, and
the analysis covariance matrix $P_{t_i}^{(a)}$ is the posterior variance,
$\mathrm{Var}(x_{t_i} \mid y_{t_i})$
• On the other hand, the forecast step consists of deriving
\[ p(x_{t_{i+1}} \mid y_{t_i}) = \int p(x_{t_{i+1}} \mid x_{t_i}) \, p(x_{t_i} \mid y_{t_i}) \, dx_{t_i} \]
Kalman Filter/Extended Kalman Filter
• It is the “gold standard” of data assimilation
• Even with a poor initial guess of the state of the atmosphere,
it should provide the best linear unbiased estimate of the state
of the atmosphere
• Problems if the system is unstable
• Computationally expensive! The matrix operations to
compute $P_i^{(b)}$ and $P_i^{(a)}$ involve matrices of order $n \approx 10^7$
• Nonlinear dynamics: if $M_{t_i}$ is non-linear, the linear approximation
does not perform well
Ensemble Kalman Filter
• Main idea: Use an ensemble of system states as a discrete
approximation to the distribution of $x_{t_i}$
• Each ensemble member is propagated forward in time using
$M_{t_i}$
• The mean and covariance matrix of the new ensemble are
used to approximate the forecast distribution
• Similar to a particle filter, with the ensemble members being the
“particles”
• The same set of observations is assimilated into each ensemble
member
Ensemble Kalman Filter
• Let $x_{t_0,j}^{(b)}$, $j = 1, \ldots, M$, be $M$ ensemble members
• Forecast step:
1 Derive the forecast ensemble members at time $t_i$:
$x_{t_i,j}^{(b)} = M_{t_{i-1}}(x_{t_{i-1},j}^{(a)})$, $j = 1, \ldots, M$
2 Compute the sample covariance matrix of the background error at
time $t_i$: $\hat{P}_i^{(b)}$
• Analysis step:
1 Compute the Kalman gain matrix at time $t_i$:
$K_i = \hat{P}_i^{(b)} H_i^\top (R_i + H_i \hat{P}_i^{(b)} H_i^\top)^{-1}$
2 Derive the analysis ensemble members at time $t_i$:
$x_{t_i,j}^{(a)} = x_{t_i,j}^{(b)} + K_i (\tilde{y}_{t_i,j} - H_i x_{t_i,j}^{(b)})$
where $\tilde{y}_{t_i,j} = y_{t_i} + \varepsilon_j$, $\varepsilon_j \sim N(0, R_i)$
3 Compute the sample covariance matrix of the analysis error at
time $t_i$: $\hat{P}_i^{(a)}$
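One cycle of the Ensemble Kalman filter with perturbed observations can be sketched as follows. The function names, the damping "model" `0.9 * x`, the ensemble size and the numerical values are all assumptions chosen for illustration; the ensemble is stored with one member per column.

```python
import numpy as np

def enkf_forecast(X_a, model):
    """Propagate each ensemble member (a column of X_a) through the model."""
    return np.column_stack([model(X_a[:, j]) for j in range(X_a.shape[1])])

def enkf_analysis(X_b, y, H, R, rng):
    """Perturbed-observation EnKF analysis step for an n x m ensemble X_b."""
    n, m = X_b.shape
    anomalies = X_b - X_b.mean(axis=1, keepdims=True)
    P_hat = anomalies @ anomalies.T / (m - 1)              # sample background covariance
    K = P_hat @ H.T @ np.linalg.inv(R + H @ P_hat @ H.T)   # Kalman gain
    # Each member assimilates its own perturbed copy of the observations
    eps = rng.multivariate_normal(np.zeros(y.size), R, size=m).T
    Y_pert = y[:, None] + eps
    return X_b + K @ (Y_pert - H @ X_b)

rng = np.random.default_rng(1)
X_a0 = rng.normal(size=(2, 500))                  # initial ensemble, 500 members
X_b = enkf_forecast(X_a0, lambda x: 0.9 * x)      # hypothetical linear "model"
H = np.array([[1.0, 0.0]])                        # only the first variable is observed
R = np.array([[0.01]])
y = np.array([3.0])
X_a = enkf_analysis(X_b, y, H, R, rng)
```

Since the observation is much more precise than the background, the analysis ensemble for the observed variable concentrates near the observed value, and its spread is far smaller than that of the background ensemble.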
Further approaches
• Different strategies to perform the analysis step in the
Ensemble Kalman filter
• Sampling variability in Ensemble Kalman filter, especially if
the ensemble size is small
→ filter divergence; decrease in contribution of the
observations
1 Localization of the ensemble covariance matrix (e.g.
covariance tapering, etc.)
2 Inflation of the ensemble spread
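The two remedies listed above can be sketched in a few lines. The taper used here is a Wendland-type compactly supported correlation function, chosen only as one convenient example; other tapers are common in practice, and the cut-off distance `c` and the inflation `factor` are tuning parameters assumed for illustration.

```python
import numpy as np

def wendland_taper(dist, c):
    """Compactly supported correlation: equals 1 at distance 0 and 0 beyond c."""
    r = np.clip(dist / c, 0.0, 1.0)
    return (1.0 - r) ** 4 * (4.0 * r + 1.0)

def localize(P_hat, dist, c):
    """Covariance tapering: element-wise (Schur) product with the taper."""
    return P_hat * wendland_taper(dist, c)

def inflate(X, factor):
    """Multiplicative inflation: stretch members away from the ensemble mean."""
    mean = X.mean(axis=1, keepdims=True)
    return mean + factor * (X - mean)

# Toy example: 6 grid points on a line, 10 ensemble members (hypothetical sizes)
rng = np.random.default_rng(2)
X = rng.normal(size=(6, 10))
grid = np.arange(6.0)
dist = np.abs(grid[:, None] - grid[None, :])
P_hat = np.cov(X)                          # noisy sample covariance
P_loc = localize(P_hat, dist, c=3.0)       # spurious long-range entries set to 0
X_infl = inflate(X, factor=1.1)            # spread increased by 10%
```

Tapering removes sample covariances between grid points farther apart than `c`, which are dominated by sampling noise when the ensemble is small, while inflation counteracts the tendency of a small ensemble to underestimate its own spread.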
Data fusion: spatial data
Wikle and Berliner, Technometrics, 2005
• Two sources of wind data: daily wind satellite data and
computer model output from a weather center for the period
15 September 1996-29 June 1997
• Data with different resolution → Change of support problem
• Satellite-based wind estimates from NASA Scatterometer
(NSCAT) at 0.5 degree resolution and not on a regular grid
• National Centers for Environmental Prediction (NCEP) analysis
of wind direction at 2.5 degree resolution and on a regular grid
• Goal: Predict surface streamfunction at a resolution of 1.0
degree
Data fusion: spatial data
Wind data from satellite and from an analysis for December 26, 1996
Data fusion: spatial data
• Z measurement data from the two sources
• Y true underlying process
• Adopt the modeling approach:
[Data | Process]
[Process | Parameters]
[Parameters]
• Goal: Infer upon the process Y
• Problem: The data have different spatial supports
Data fusion: spatial data
• Let:
1 $A_i$, $i = 1, \ldots, n_a$
2 $B_j$, $j = 1, \ldots, n_b$
3 $C_k$, $k = 1, \ldots, n_c$
be non-overlapping sets such that
$0 \le |A_i| < |B_j| < |C_k| < \infty$ for all $i, j, k$
• $Z_A \equiv (Z(A_1), \ldots, Z(A_{n_a}))^\top$: observations on the subgrid
• $Z_C \equiv (Z(C_1), \ldots, Z(C_{n_c}))^\top$: observations on the supergrid
Data fusion: spatial data
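• $Y = \{Y(s) : s \in D \subset \mathbb{R}\}$: spatial process
• For a set $S$:
\[ Y(S) = \begin{cases} \dfrac{1}{|S|} \displaystyle\int_S Y(s)\, ds & |S| > 0 \\ \mathrm{avg}\{Y(s) : s \in S\} & |S| = 0 \end{cases} \]
• $Y_A \equiv (Y(A_1), \ldots, Y(A_{n_a}))^\top$: subgrid process
• $Y_C \equiv (Y(C_1), \ldots, Y(C_{n_c}))^\top$: supergrid process
• $Y_B \equiv (Y(B_1), \ldots, Y(B_{n_b}))^\top$: process on the prediction grid
Then:
• $Z_A$: observations of $Y_A$
• $Z_C$: observations of $Y_C$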
Data fusion: spatial data
Data model
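• Model for $[Z_A, Z_C \mid Y_A, Y_C, Y_B, \theta_m]$
• Measurement error
\[ Z_A = Y_A + \varepsilon_A, \qquad \varepsilon_A \sim N(0, \sigma_a^2 I_{n_a}) \]
\[ Z_C = Y_C + \varepsilon_C, \qquad \varepsilon_C \sim N(0, \sigma_c^2 I_{n_c}) \]
• Jointly:
\[ \begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \Big| \, Y_A, Y_C, \sigma_a^2, \sigma_c^2 \sim N\left( \begin{pmatrix} Y_A \\ Y_C \end{pmatrix}, \; \Sigma_m = \begin{pmatrix} \sigma_a^2 I_{n_a} & 0 \\ 0 & \sigma_c^2 I_{n_c} \end{pmatrix} \right) \]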
Data fusion: spatial data
Process model
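• For all $s \in B_j$:
\[ Y(s) = Y(B_j) + \gamma(s) \]
with $E(\gamma(s)) = 0$ and $\mathrm{Cov}(\gamma(s), \gamma(r)) = C(s, r; \phi)$
• Then:
1 for all $A_i$: $Y(A_i) = g_A^{(i)\top} Y_B + \frac{1}{|A_i|} \int_{A_i} \gamma(s)\, ds$
2 for all $C_k$: $Y(C_k) = g_C^{(k)\top} Y_B + \frac{1}{|C_k|} \int_{C_k} \gamma(s)\, ds$
• In matrix form:
\[ \begin{pmatrix} Y_A \\ Y_C \end{pmatrix} \Big| \, Y_B, \sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} Y_B, \; \Sigma(\phi) \right) \]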
Data fusion: spatial data
Complete model
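• Data model:
\[ \begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \Big| \, Y_B, \Sigma_m, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} Y_B, \; \Sigma_m + \Sigma(\phi) \right) \]
• Process model: $Y_B \sim N(\theta_B, \Sigma_B(\phi))$
• Parameters: $[\sigma_a^2, \sigma_c^2, \theta_B, \phi]$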
Data fusion: spatial data
Example: streamfunction
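• Data:
1 NSCAT satellite data: $U_A$, $V_A$ ($n_a = 369$)
2 NCEP (numerical model output): $U_C$, $V_C$ ($n_c = 15$)
• Process: $u_B$, $v_B$
• Data model:
1 \[ \begin{pmatrix} U_A \\ U_C \end{pmatrix} \Big| \, u_B, \sigma_u, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} u_B, \; \Sigma_u + \Sigma_m \right) \]
2 \[ \begin{pmatrix} V_A \\ V_C \end{pmatrix} \Big| \, v_B, \sigma_v, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} v_B, \; \Sigma_v + \Sigma_m \right) \]
• Process model:
\[ \begin{pmatrix} u_B \\ v_B \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_u \mathbf{1} \\ \mu_v \mathbf{1} \end{pmatrix}, \; \Sigma_{uv} \right) \]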
Data fusion: spatial data
• Interest in predicting the streamfunction ψ.
• Deterministic Poisson equation to determine the streamfunction $\psi$
from the winds:
\[ \nabla^2 \psi = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} \]
$u$: east-west wind component; $v$: north-south wind
component
• Discretizing to a regular grid:
1 $\psi_I \mid \psi_{bc}, u, v \sim N(L^{-1}[D_x v - D_y u + L_{bc} \psi_{bc}], \Sigma_I)$
2 $\psi_{bc} \sim N(\mu_{bc}, \Sigma_{bc})$
• $\psi_I$: streamfunction at the interior grid locations
• $\psi_{bc}$: streamfunction at the boundary grid locations
Data fusion: spatial data
Wind data (top row); posterior mean and realization from the posterior
distribution of the streamfunction for December 26, 1996 (bottom row)
Data fusion: spatial data
Fuentes and Raftery, Biometrics, 2005
• Two sources of weekly average SO2 concentration data:
monitoring data and computer model output
• Data with different resolution → Change of support problem
• Monitoring data from CASTNet sites
• Output of a numerical model, Models-3, given as average
concentration over 36 km × 36 km grid cells
• Goal: Estimate true weekly average concentration of SO2
Data fusion: spatial data
Fuentes and Raftery, Biometrics, 2005
Average SO2 concentration for the week of July 11, 1995
Data fusion: spatial data
• Process: $Z(s)$, true underlying process
• Data:
1 $\hat{Z}(s)$: measurement from the monitoring network (CASTNet)
2 $\tilde{Z}(B)$: numerical model output (Models-3)
• Goal: Infer upon the process $Z(s)$
• Problem: The data have different spatial supports
Data fusion: spatial data
Data model
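• Model for $[\hat{Z}(s), \tilde{Z}(B) \mid Z(s), \theta_m]$
• Measurement error
\[ \hat{Z}(s) = Z(s) + e(s), \qquad e(s) \sim N(0, \sigma_e^2) \]
\[ \tilde{Z}(B) = \frac{1}{|B|} \int_B \tilde{Z}(s)\, ds \]
\[ \tilde{Z}(s) = a(s) + b(s) Z(s) + \delta(s), \qquad \delta(s) \sim N(0, \sigma_\delta^2) \]
where
1 $a(s)$ is a polynomial in $s$
2 $b(s) \equiv b$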
Data fusion: spatial data
Process model
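• $Z(s) = \mu(s) + \varepsilon(s)$ with
1 $E(\varepsilon(s)) = 0$ and $\mathrm{Cov}(\varepsilon(s), \varepsilon(r)) = \sigma(s, r; \phi)$
2 $\mu(s)$ polynomial in $s$ with coefficients $\beta$
→ $Z(s) \sim GP(\mu(s), \Sigma)$
• Goal: Infer on $Z$ given $\hat{Z}$, $\tilde{Z}$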
Data fusion: spatial data
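• $\hat{Z} = (\hat{Z}(s_1), \ldots, \hat{Z}(s_n))^\top$
• $\tilde{Z} = (\tilde{Z}(B_1), \ldots, \tilde{Z}(B_M))^\top$
\[ \begin{pmatrix} \hat{Z} \\ \tilde{Z} \end{pmatrix} \sim N\left( \begin{pmatrix} \hat{\mu} \\ \tilde{a} + b \tilde{\mu} \end{pmatrix}, \begin{pmatrix} \Sigma_C & \Sigma_{CM} \\ \Sigma_{CM}^\top & \Sigma_M \end{pmatrix} \right) \]
where
1 $\hat{\mu} = (\mu(s_1), \ldots, \mu(s_n))^\top$
2 $\tilde{a} = \left( \frac{1}{|B_1|} \int_{B_1} a(s)\, ds, \ldots, \frac{1}{|B_M|} \int_{B_M} a(s)\, ds \right)^\top$
3 $\tilde{\mu} = \left( \frac{1}{|B_1|} \int_{B_1} \mu(s)\, ds, \ldots, \frac{1}{|B_M|} \int_{B_M} \mu(s)\, ds \right)^\top$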
Data fusion: spatial data
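\[ \begin{pmatrix} \hat{Z} \\ \tilde{Z} \end{pmatrix} \sim N\left( \begin{pmatrix} \hat{\mu} \\ \tilde{a} + b \tilde{\mu} \end{pmatrix}, \begin{pmatrix} \Sigma_C & \Sigma_{CM} \\ \Sigma_{CM}^\top & \Sigma_M \end{pmatrix} \right) \]
where
1 $\Sigma_C$, $n \times n$ matrix: $(\Sigma_C)_{ij} = \sigma(s_i, s_j; \phi) + 1\{s_i \equiv s_j\}\, \sigma_e^2$
2 $\Sigma_{CM}$, $n \times M$ matrix: $(\Sigma_{CM})_{ik} = b \cdot \frac{1}{|B_k|} \int_{B_k} \sigma(s_i, v; \phi)\, dv$
3 $\Sigma_M$, $M \times M$ matrix:
\[ (\Sigma_M)_{kl} = b^2 \cdot \frac{1}{|B_k| \cdot |B_l|} \int_{B_k} \int_{B_l} \sigma(u, v; \phi)\, du\, dv + 1\{B_k \equiv B_l\}\, \sigma_\delta^2 \]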
Data fusion: spatial data
Example: air pollution
Data:
1 Weekly average of SO2 concentration at n = 50 CASTNet
sites for the week of July 11, 1995
2 Weekly average of SO2 concentration on M = 81 × 87 grid cells
of size 36 km × 36 km, output of Models-3 for the week of July 11, 1995
Other modeling details
• Stochastic integrals approximated by taking a systematic sample
of 4 points within each grid cell
• Degree of polynomials defining the mean trend µ(s) of Z(s)
and of the additive bias a(s) of ˜Z(s) determined via RJMCMC
• Non-stationary covariance function for the underlying true
process Z(s)
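The approximation of the block integrals by a systematic sample of 4 points per grid cell can be sketched as follows. This is only an illustration: it uses a stationary exponential covariance as a stand-in for $\sigma(\cdot, \cdot; \phi)$ (the model described above is non-stationary), it omits the $\sigma_\delta^2$ term, and all function names and numerical values are assumptions.

```python
import numpy as np

def systematic_points(x0, y0, width, m=2):
    """m x m regular sample (m=2 gives 4 points) inside a square grid cell
    with lower-left corner (x0, y0)."""
    offsets = (np.arange(m) + 0.5) * width / m
    gx, gy = np.meshgrid(x0 + offsets, y0 + offsets)
    return np.column_stack([gx.ravel(), gy.ravel()])

def exp_cov(u, v, sigma2, phi):
    """Stand-in stationary exponential covariance between point sets u and v."""
    d = np.linalg.norm(u[:, None, :] - v[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def point_to_block_cov(s, cell_pts, b, sigma2, phi):
    """Approximates b * (1/|B_k|) * integral over B_k of sigma(s, v) dv."""
    return b * exp_cov(s[None, :], cell_pts, sigma2, phi).mean()

def block_to_block_cov(pts_k, pts_l, b, sigma2, phi):
    """Approximates the double-integral term of (Sigma_M)_kl (no nugget)."""
    return b ** 2 * exp_cov(pts_k, pts_l, sigma2, phi).mean()

# One 36 km x 36 km cell and its neighbour to the east (hypothetical parameters)
cell_1 = systematic_points(0.0, 0.0, 36.0)
cell_2 = systematic_points(36.0, 0.0, 36.0)
site = np.array([18.0, 18.0])              # monitoring site at the centre of cell 1
c_site_block = point_to_block_cov(site, cell_1, b=0.8, sigma2=2.0, phi=100.0)
c_block_block = block_to_block_cov(cell_1, cell_2, b=0.8, sigma2=2.0, phi=100.0)
```

Each integral is replaced by an average of covariance evaluations over the sampled points, so one point-to-block entry costs 4 evaluations and one block-to-block entry costs 16.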
Data fusion: spatial data
Posterior predictive mean and posterior predictive SD for Z(s) for
the week of July 11, 1995
Data fusion: space-time data
Wikle et al., JASA 2001
• Extend the modeling idea of Wikle and Berliner (2005) to
account for time
• Daily wind data from two sources: satellite data (at higher
resolution) and computer model output (at a lower resolution)
• Goal: Predict winds at an intermediate resolution over a period
of 54 six-hour increments
• Accounted for the temporal dependence in the data by using
dynamic coefficients in the specification of the process driving
the observed data
• Avoided computing stochastic integrals!
Data fusion: space-time data
• Data:
1 NSCAT satellite data: $U_{A,t}$, $V_{A,t}$ at time $t$
2 NCEP (numerical model output): $U_{C,t}$, $V_{C,t}$ at time $t$
→ $U_t = (U_{C,t}, U_{A,t})$ and $V_t = (V_{C,t}, V_{A,t})$: observed data at
time $t$
→ $\{U_t\}_1^T = (U_1, \ldots, U_T)$, $\{V_t\}_1^T = (V_1, \ldots, V_T)$
• Process:
• $u_t$, $v_t$ at time $t$ at the $n_B$ prediction grid cells.
• Similar definition for $\{u_t\}_{t=1}^T$ and $\{v_t\}_{t=1}^T$
Data fusion: space-time data
Data model
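\[ \big[ \{V_t\}_{t=1}^T, \{U_t\}_{t=1}^T \mid \{v_t\}_{t=1}^T, \{u_t\}_{t=1}^T, \theta \big] = \prod_{t=1}^{T} [V_t \mid v_t, \theta] \cdot [U_t \mid u_t, \theta] \]
• $V_t \mid v_t, \Sigma_t \sim N(K_t v_t, \Sigma_t)$
• $U_t \mid u_t, \Sigma_t \sim N(K_t u_t, \Sigma_t)$
1 $\Sigma_t$: diagonal matrix with entries equal to either $\sigma^2$ (satellite
observations), $\sigma_b^2$ (NCEP boundary grid cells) or $\sigma_I^2$ (NCEP interior
cells)
2 $K_t$: design matrix that maps the prediction grid cells to the
observation grid cells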
Data fusion: space-time data
Process model
\[ u_t = \mu_u + u_t^E + \tilde{u}_t \]
\[ v_t = \mu_v + v_t^E + \tilde{v}_t \]
1 $\mu_u$: spatial mean for the $u$ wind component: $\mu_u = P_u \gamma_u$
(resp. for $\mu_v$) → $P_u$ design matrix (resp. $P_v$)
2 $u_t^E$: thin fluid approximation of the $u$ wind component:
$u_t^E = \Phi a_t^u$ (resp. for $v_t^E$) → $\Phi$ basis functions
3 $\tilde{u}_t$: small-scale motions of the $u$ wind component: $\tilde{u}_t = \Psi b_t^u$
(resp. for $\tilde{v}_t$) → $\Psi$ wavelet basis functions
Data fusion: space-time data
Parameters
• The $2n \times 1$ random vectors $a_t^u$, $a_t^v$ are modeled as dynamically
evolving in time but are independent between prediction grid
cells
• The $n \times 1$ random vectors $b_t^u$ and $b_t^v$ are modeled as
dynamically evolving in time and are independent between
prediction grid cells
• No need to compute stochastic integrals!
• Only temporal dependence is explicitly modeled
• Computationally feasible
Data fusion: space-time data
Choi et al., Comp. Stat. and Data Analysis 2009
• Extend the modeling idea of Fuentes and Raftery (2005) to
account for time
• Daily average PM2.5 concentration from two sources:
monitoring data and computer model output
1 $\hat{Z}(s,t)$: observation from monitoring site $s$ at time $t$
2 $\tilde{Z}(B,t)$: model output at grid cell $B$ at time $t$
• Goal: Predict true daily average PM2.5 concentration
aggregated over counties at time t for health analysis
• Included the temporal dependence in the mean structure of
the underlying process
Data fusion: space-time data
Data model
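• Model for $[\hat{Z}(s,t), \tilde{Z}(B,t) \mid Z(s,t), \theta_m]$
\[ \hat{Z}(s,t) = Z(s,t) + e(s,t), \qquad e(s,t) \sim N(0, \sigma_e^2) \]
\[ \tilde{Z}(B,t) = \frac{1}{|B|} \int_B \tilde{Z}(s,t)\, ds \]
\[ \tilde{Z}(s,t) = a(s) + Z(s,t) + \delta(s,t), \qquad \delta(s,t) \sim N(0, \sigma_\delta^2) \]
Process model
\[ Z(s,t) = M(s,t)\, \xi + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2) \]
• $M(s,t)$: vector of meteorological variables at site $s$ at time $t$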
Data fusion: space-time data
McMillan et al., Environmetrics, 2009
• Propose a spatio-temporal model to combine monitoring data
and numerical model output
1 Daily average PM2.5 concentration from monitoring sites
during year 2001
2 Daily average PM2.5 concentration, output of the CMAQ model
run at 12 km grid cell resolution (M = 213×188)
• Goal: Combine the two sources of data and predict true daily
average PM2.5 concentration for each day in 2001
Data fusion: space-time data
• Process: $W_i$, true underlying process
• Data:
1 $X_{i,k}$: monitoring data
2 $Y_{i,k}$: CMAQ output
• $W_i$ defined on space-time grid cells: $i \in \{1, \ldots, N\}$, where
$N = N_T \times N_P$, $N_T$ is the number of time points and $N_P$ the number of grid
cells
• $X_{i,k}$: observed monitoring data for the $k$-th monitor
observation in cell $i$
• $Y_{i,k}$: CMAQ output in cell $i$ ($k = 1$)
Data fusion: space-time data
Data model
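• Model for $[X_{i,k}, Y_{i,k} \mid W_i, \theta]$
Measurement error
\[ X_{i,k} = W_i + \varepsilon_{i,k}, \qquad \varepsilon_{i,k} \sim N(0, \tau_X^2) \]
\[ Y_{i,k} = D_i \beta + W_i + \delta_{i,k}, \qquad \delta_{i,k} \sim N(0, \tau_Y^2) \]
• $D_i$: vector of uniform B-splines over a regular 3-dimensional
lattice of $N_D$ knots
⟹ CMAQ bias for grid cell $i$: $D_i \beta = \sum_{j=1}^{N_D} D_{ij} \beta_j$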
Data fusion: space-time data
Process model
\[ W_i = \mu_{t(i)} + Z_i \]
• $t(i)$: temporal index of grid cell $i$
• $\mu_{t(i)}$ constant across space: $\mu_{t(i)} \sim N(0, \tau_\mu^2)$
• $Z$: space-time multivariate normal with a separable covariance
structure: autoregressive in time and conditionally
autoregressive (CAR) in space
⟹ $Z \mid \tau_Z^2, \rho \sim N\big(0, \tau_Z^2 \cdot (\Lambda_T(\rho) \otimes \Lambda_P)^{-1}\big)$
Data fusion: space-time data
Daily mean levels for predicted surface, monitoring data and
CMAQ over Eastern US
Data fusion: space-time data
Posterior predictive mean for (a) 4 July 2001 and (b) 24 December 2001
Data fusion: space-time data
Berrocal et al., JABES, 2010
• Propose a spatio-temporal model to combine monitoring data
and numerical model output
1 Daily 8-hr max ozone concentration from monitoring sites
during summer of 2001
2 Daily 8-hr max ozone concentration, output of the CMAQ model
run at 12 km grid cell resolution (M = 213×188)
• Goal: Combine the two sources of data and “downscale”
numerical model output at point level
• Does not assume a “true” underlying process
Data fusion: space-time data
• $Y(s,t)$: observation at site $s$ at time $t$
• $x(B,t)$: CMAQ output at grid cell $B$ at time $t$
For $s \in B$:
\[ Y(s,t) = \tilde{\beta}_0(s,t) + \tilde{\beta}_1(s,t)\, x(B,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \sigma^2) \]
with $\tilde{\beta}_i(s,t) = \beta_{it} + \beta_i(s,t)$, for $i = 0, 1$.
• Temporal dependence in $\beta_{0t}$ and $\beta_{1t}$, either:
(i) nested within time, or
(ii) dynamic in time
• $\beta_0(s,t)$ and $\beta_1(s,t)$: correlated Gaussian processes that are either
(i) nested within time or (ii) dynamic in time
Data fusion: space-time data
• Possible spatio-temporal models to combine the two data sources
Model     β0t, β1t                   β0(s,t), β1(s,t)
Model 1   Independent across time    Constant in time
Model 2   Dynamic                    Constant in time
Model 3   Independent across time    Independent across time
Model 4   Dynamic                    Dynamic
Data fusion: space-time data
[Figure: Ozone monitoring sites, 2001. Test sites (black), validation sites (red). Axes: longitude, latitude.]
• Daily maximum 8-hour ozone
concentration (ppb): observations
(n=803) and CMAQ model output
• Model output on 12-km grid cells
(M=40,440)
• Fit models for May 1 - October 15, 2001
• 436 sites used to fit the model,
367 sites for validation
Data fusion: space-time
• National Ambient Air Quality Standard (NAAQS) for ozone is that
the 3-year rolling average of the annual fourth highest daily 8-hour
maximum ozone concentration be less than a given threshold
• Maps of the probability that the fourth highest ozone concentration
during the period May 1 - October 15, 2001 exceeds:
[Figure: two probability maps, (a) 80 ppb (1997 standard) and (b) 75 ppb (2008 standard). Axes: longitude, latitude; color scale from 0.0 to 1.0.]
Data fusion: space-time
Berrocal et al., Environmetrics, 2012
• Extended the 2010 downscaler model to allow for potential
spatial misalignment in the computer model output
1 Seasonal average temperature at 17 synoptic stations in
Sweden for the period December 1962-November 2007
2 Regional climate model output on a 12.5 km × 12.5 km grid for
the same period
• Goal: Assess the performance of the regional climate model.
RCM data
[Figure: RCM output: DJF 2002, with Stockholm, Borlänge and Göteborg marked]
• Output of the Swedish
Meteorological and Hydrological
Institute (SMHI) Rossby
Centre Atmospheric (RCA)
regional climate model (RCM)
• Daily output for 2-m
temperature from
December 1, 1962 to
November 30, 2007, then
aggregated to quarterly
averages (DJF, MAM, JJA,
SON)
• Output at
12.5 km × 12.5 km grid
boxes
RCM data
[Figure: RCM output: MAM 2002, with Stockholm, Borlänge and Göteborg marked]
Observational data
[Figure: Observation data: DJF 2002, showing the stations, with Stockholm, Borlänge and Göteborg marked]
• Observed daily average
temperature from 17 stations
in the SMHI network of
synoptic stations
• Period: December 1, 1962 to
November 30, 2007
• Daily data aggregated to
quarterly scale
• Three stations, Göteborg,
Stockholm and Borlänge,
held out for validation
Downscaling model
• Some notation:
• $B_1, \ldots, B_g$: RCM grid boxes with centroids $r_1, \ldots, r_g$
• $x(B_1,t), x(B_2,t), \ldots, x(B_g,t)$: RCM output of quarterly
average temperature for quarter $t = 1, \ldots, T$ at grid boxes
$B_1, B_2, \ldots, B_g$
• $Y(s,t)$: observed quarterly average temperature at station $s$
for quarter $t = 1, \ldots, T$
• The 2010 downscaler applied to these data would be: for $s$ in
$B$ and $t = 1, \ldots, T$
\[ Y(s,t) = \tilde{\beta}_0(s,t) + \tilde{\beta}_1(s,t)\, x(B,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \overset{iid}{\sim} N(0, \tau^2) \]
Downscaling model
• The 2012 model starts from the observation that we could
write: for $t = 1, \ldots, T$
\[ Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1 \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2) \]
with
• $\tilde{x}(s,t)$: spatio-temporal weighted average of the RCM output:
\[ \tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\, x(B_k,t) \]
• $\tilde{\beta}_1(s,t)$ replaced by $\beta_1$ for identifiability reasons
• The weights $w_k(s,t)$ should be:
• positive and sum to 1
• spatially correlated within sites and across sites
Downscaling model
• If $r_1, \ldots, r_g$ are the centroids of the RCM grid boxes, we can
take the weights $w_k(s,t)$ to be
\[ w_k(s,t) = \frac{K(|s - r_k|; \lambda)}{\sum_{l=1}^{g} K(|s - r_l|; \lambda)} \]
• $K(\cdot; \lambda)$: kernel function with bandwidth $\lambda$.
For example: $K(|s - r_k|; \lambda) = \exp(-|s - r_k| / \lambda)$.
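The kernel weights and the resulting weighted average of the RCM output can be computed directly. The sketch below uses the exponential kernel from the example above; the centroid coordinates, the RCM values and the bandwidth are hypothetical, and distances are taken as plain Euclidean distances in the coordinate units.

```python
import numpy as np

def kernel_weights(s, centroids, lam):
    """Exponential-kernel weights w_k(s) that are positive and sum to 1."""
    dist = np.linalg.norm(centroids - s, axis=1)   # |s - r_k| for every grid box
    kern = np.exp(-dist / lam)
    return kern / kern.sum()

def weighted_rcm_output(s, centroids, x_B, lam):
    """x_tilde(s) = sum_k w_k(s) x(B_k) for a single time point."""
    return kernel_weights(s, centroids, lam) @ x_B

# Four grid-box centroids and one station location (hypothetical values)
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_B = np.array([-2.0, -1.0, 0.5, 1.0])     # RCM output at the four grid boxes
s = np.array([0.2, 0.1])
w = kernel_weights(s, centroids, lam=0.5)
x_tilde = weighted_rcm_output(s, centroids, x_B, lam=0.5)
```

A small bandwidth concentrates almost all the weight on the nearest grid box, which approaches the 2010 downscaler that uses only the grid box containing the site; a large bandwidth approaches a plain average over all grid boxes.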
Downscaling model
We consider RCM output and observational data:
[Figure: two maps for DJF 2002, RCM output and observation data, with Stockholm, Borlänge and Göteborg marked]
We establish the spatial linear model: for $s \in B$ and $t = 1, \ldots, T$
\[ Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1 \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2) \]
Downscaling model
For t = 1,...,T, the weight wk(s,t) is:
[Figure: maps of the weights $w_k(s,t)$ at two spatial scales, with Stockholm, Borlänge and Göteborg marked; color scale from 0.0 to 0.5]
Downscaling model
• To allow for the weights $w_k(s,t)$ to be directional, we modify
the expression
\[ w_k(s,t) = \frac{K(|s - r_k|; \lambda)}{\sum_{l=1}^{g} K(|s - r_l|; \lambda)} \]
to
\[ w_k(s,t) = \frac{K(|s - r_k|; \lambda) \cdot \exp(Q(r_k, t))}{\sum_{l=1}^{g} K(|s - r_l|; \lambda) \cdot \exp(Q(r_l, t))} \]
where for $t = 1, \ldots, T$, $Q(r,t)$ is a latent stationary mean-zero
spatial Gaussian process with variance 1 and exponential
correlation function.
• For $t = 1, \ldots, T$, the range $\phi$ of the latent spatial process
$Q(r,t)$ influences the directionality of the weights.
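The effect of the latent process on the weights can be illustrated by simulating $Q$ at the grid-box centroids for one time point. This is a sketch with assumed values: the 5 × 5 grid of centroids, the bandwidth `lam`, the range `phi` and the seed are hypothetical, and $Q$ is drawn once rather than estimated from data.

```python
import numpy as np

def simulate_Q(centroids, phi, rng):
    """Draw a mean-zero, unit-variance Gaussian process at the centroids,
    with exponential correlation exp(-distance / phi)."""
    D = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    corr = np.exp(-D / phi)
    return np.linalg.cholesky(corr) @ rng.standard_normal(len(centroids))

def directional_weights(s, centroids, lam, Q):
    """Weights proportional to K(|s - r_k|; lam) * exp(Q(r_k))."""
    dist = np.linalg.norm(centroids - s, axis=1)
    kern = np.exp(-dist / lam) * np.exp(Q)
    return kern / kern.sum()

# 5 x 5 grid of centroids and one station (hypothetical values)
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
centroids = np.column_stack([gx.ravel(), gy.ravel()])
s = np.array([2.0, 2.0])
rng = np.random.default_rng(3)
Q = simulate_Q(centroids, phi=2.0, rng=rng)
w_dir = directional_weights(s, centroids, lam=1.0, Q=Q)
w_iso = directional_weights(s, centroids, lam=1.0, Q=np.zeros(len(centroids)))
```

With $Q \equiv 0$ the weights reduce to the isotropic kernel weights; a non-zero $Q$ raises the weights where $Q$ is large and lowers them where it is small, so grid boxes at the same distance from the station no longer receive the same weight.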
Downscaling model
• Finally, the downscaling model is: for each site $s$ and $t = 1, \ldots, T$:
\[ Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1 \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2) \]
• $\tilde{\beta}_0(s,t) = \beta_{0,t} + \beta_0(s,t)$, with $\beta_0(s,t)$ a stationary mean-zero
Gaussian spatial process with time-varying range parameter.
• $\tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\, x(B_k,t)$
• $w_k(s,t) = \dfrac{K(|s - r_k|; \lambda) \cdot \exp(Q(r_k, t))}{\sum_{l=1}^{g} K(|s - r_l|; \lambda) \cdot \exp(Q(r_l, t))}$
• $Q(r,t)$ is a latent stationary mean-zero spatial Gaussian
process with variance 1 and exponential correlation function
with range parameter $\phi$.
• For $t = 1, \ldots, T$, the calibration parameters $\beta_{0,t}$, $\beta_0(s,t)$ are
assumed to be independent in time, and so is the latent
process $Q(r,t)$.
Predictions at point level
• We predicted quarterly average temperature at three reserved
stations and compared them with:
1 observed data
2 the quarterly average temperature, output of the RCM at the
grid box containing the station.
[Figure: map of the 17 stations (numbered 1 to 17), with Stockholm, Borlänge and Göteborg marked. Axes: longitude, latitude.]
Veronica J. Berrocal Data fusion
Predictions at Borlänge
Black line: observed data
Blue line: downscaling model prediction
Red line: RCM output
Magenta line: upscaling model prediction
[Figure: time series at Borlänge, 1970 to 2000, one panel per season (Winter, Spring, Summer, Autumn); x-axis: Year]
Predictions at Stockholm
Black line: observed data
Blue line: downscaling model prediction
Red line: RCM output
Magenta line: upscaling model prediction
[Figure: time series at Stockholm, 1970 to 2000, one panel per season (Winter, Spring, Summer, Autumn); x-axis: Year]
Spatial differences
[Figure: eight maps (longitude 12 to 20, latitude 57 to 63) with locations numbered 1 to 17, titled "Downscaling Climate" and "Upscaling Climate" for Winter, Spring, Summer and Autumn 2002]
Data fusion: space-time
Sahu et al., JRSS Series C, 2009
• Propose a spatio-temporal model to combine monitoring data
and numerical model output to predict wet chemical
deposition
1 Weekly nitrate (resp. sulfate) deposition for year 2001 at
monitoring sites (n = 152)
2 Weekly precipitation data for year 2001 at monitoring sites
3 Weekly nitrate (resp. sulfate) deposition, output of the CMAQ
model run at 12 km grid cell resolution (M = 33,390)
• Goal: Combine the sources of data and predict weekly, annual
and seasonal wet deposition in the Eastern US for 2001
Data fusion: space-time
Data:
1 P(s,t) observed precipitation at s at time t
2 Z(s,t) observed deposition at s at time t
3 Q(B,t) CMAQ model output at grid cell B at time t
Data model:
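\[ P(s,t) = \begin{cases} \exp(U(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases} \]
\[ Z(s,t) = \begin{cases} \exp(Y(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases} \]
\[ Q(B,t) = \begin{cases} \exp(X(B,t)) & \text{if } \tilde{V}(B,t) > 0 \\ 0 & \text{otherwise} \end{cases} \]
\[ X(B,t) = \gamma_0 + \gamma_1 \tilde{V}(B,t) + \psi(B,t), \qquad \psi(B,t) \sim N(0,\sigma^2_{\psi}) \]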
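The data model is a censoring construction: precipitation and deposition are positive only where the latent process V is positive, and log-normal there. The sketch below shows that mechanism on simulated values. The draws for V, U, Y and Ṽ are simple stand-ins rather than the process model, and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sites, n_cells = 8, 3

# Stand-in draws for the latent quantities at one time t
v = rng.normal(0.0, 1.0, size=n_sites)    # V(s, t), latent atmospheric process
u = rng.normal(0.5, 0.4, size=n_sites)    # U(s, t), drives precipitation
y = rng.normal(-1.0, 0.4, size=n_sites)   # Y(s, t), drives deposition

# P and Z share the same wet/dry indicator V(s, t) > 0
wet = v > 0
p = np.where(wet, np.exp(u), 0.0)         # observed precipitation P(s, t)
z = np.where(wet, np.exp(y), 0.0)         # observed deposition Z(s, t)

# CMAQ output on grid cells, governed by V_tilde(B, t)
v_tilde = rng.normal(0.0, 1.0, size=n_cells)
gamma0, gamma1, sigma_psi = -1.0, 0.5, 0.2  # illustrative values
x = gamma0 + gamma1 * v_tilde + rng.normal(0.0, sigma_psi, size=n_cells)
q_cmaq = np.where(v_tilde > 0, np.exp(x), 0.0)  # Q(B, t)
```

Since P and Z are censored by the same V(s,t), a site with zero precipitation necessarily has zero wet deposition, which is the physical constraint the construction encodes.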
Data fusion: space-time
Process model:
1 U(s,t) process driving precipitation at s at time t
2 Y (s,t) process driving deposition at s at time t
3 V (s,t) latent atmospheric process
4 ˜V (B,t) process driving the log-CMAQ output at B at time t
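\[ U(s,t) = \alpha_0 + \alpha_1 V(s,t) + \delta(s,t), \qquad \delta_t \sim GP(0,\Sigma_{\delta}) \]
\[ Y(s,t) = \beta_0 + \beta_1 U(s,t) + \beta_2 V(s,t) + [b_0 + b_1(s) X(B,t)] + \eta(s,t) + \varepsilon(s,t) \]
\[ V(s,t) = \tilde{V}(B,t) + \nu(s,t), \qquad \nu(s,t) \sim N(0,\sigma^2_{\psi}) \]
\[ \tilde{V}(B,t) = \rho\, \tilde{V}(B,t-1) + \zeta(B,t), \qquad \zeta(B,t) \sim \text{CAR} \]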
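The last two equations of the process model can be simulated directly: Ṽ evolves as a first-order autoregression on the grid with spatially structured innovations, and V at a site adds independent noise to the value of its grid cell. The sketch below is one way to do this under stated assumptions. The slides only say ζ follows a CAR model, so the proper CAR form with precision (D − αW)/τ², the rook-neighbour lattice, the initial condition, the random assignment of sites to cells, and all numerical values are choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def lattice_adjacency(nrow, ncol):
    """Rook-neighbour adjacency matrix W for an nrow x ncol lattice."""
    m = nrow * ncol
    w = np.zeros((m, m))
    for i in range(nrow):
        for j in range(ncol):
            k = i * ncol + j
            if j + 1 < ncol:
                w[k, k + 1] = w[k + 1, k] = 1.0
            if i + 1 < nrow:
                w[k, k + ncol] = w[k + ncol, k] = 1.0
    return w

def proper_car_draw(w, alpha, tau2, rng):
    """Draw from N(0, P^{-1}) with P = (D - alpha W) / tau2 and |alpha| < 1."""
    prec = (np.diag(w.sum(axis=1)) - alpha * w) / tau2
    chol = np.linalg.cholesky(prec)          # P = L L^T
    z = rng.standard_normal(w.shape[0])
    return np.linalg.solve(chol.T, z)        # covariance of result is P^{-1}

T, nrow, ncol = 10, 4, 5
m = nrow * ncol
w_adj = lattice_adjacency(nrow, ncol)
rho, alpha, tau2, sigma_nu = 0.7, 0.9, 1.0, 0.3  # illustrative values

# V_tilde(B, t) = rho * V_tilde(B, t-1) + zeta(B, t)
zeta = np.array([proper_car_draw(w_adj, alpha, tau2, rng) for _ in range(T)])
v_tilde = np.zeros((T, m))
v_tilde[0] = zeta[0]                         # assumed initial condition
for t in range(1, T):
    v_tilde[t] = rho * v_tilde[t - 1] + zeta[t]

# V(s, t) = V_tilde(B, t) + nu(s, t), where B is the cell containing s
# (the slide writes the variance of nu as sigma^2_psi; sigma_nu is a local name)
n_sites = 6
cell_of_site = rng.integers(0, m, size=n_sites)  # hypothetical site-to-cell map
v_site = v_tilde[:, cell_of_site] + rng.normal(0.0, sigma_nu, size=(T, n_sites))
```

The proper CAR form is used here only because it has a valid covariance matrix and can be sampled with a single Cholesky factorization; an intrinsic CAR prior would require a sum-to-zero constraint instead.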
Data fusion: spatial data
[Figure: monitoring data and validation sites]
[Figure: annual total precipitation in 2001]
Data fusion: space-time data
[Figure: posterior predictive mean for b(s) for (a) sulfate and (b) nitrate]
[Figure: (a) posterior predictive annual mean for nitrate and (b) length of predictive interval]

CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data Fusion - Veronica Berrocal, Sep 26, 2017

Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 

CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data Fusion - Veronica Berrocal, Sep 26, 2017

  • 1. Data fusion approaches (for Earth-systems data) Veronica J. Berrocal University of Michigan Department of Biostatistics SAMSI course Fall 2017 Veronica J. Berrocal Data fusion
  • 2. Outline • Introduction • Data assimilation • Optimal Interpolation • Variational methods • Kalman filter • Further approaches • Data fusion in the statistical literature • Spatial data • Example: Wikle and Berliner, 2005 • Example: Fuentes and Raftery, 2005 (Bayesian melding) • Space-time data • Example: Wikle et al., 2001 • Example: Choi et al., 2009 • Example: McMillan et al., 2009 • Example: Berrocal et al., 2010 and 2012 • Example: Sahu et al., 2009
  • 3. Definitions • Data fusion refers to the statistical technique used to combine data from different sources • If one of the sources is the output of a computer model → Data assimilation • Data assimilation: term coined in the atmospheric science community • Several definitions • “approach for fusing data (observations) with prior knowledge (e.g. mathematical representations of physical laws; model output) to obtain an estimate of the distribution of the true state of the process” (Wikle and Berliner, 2006)
  • 4. Why data fusion? • The evolution in time of many geophysical processes (e.g. the atmosphere) can be described by systems of partial differential equations • As an example, numerical weather forecasts are obtained by running forward in time computer models that simulate the evolution of the atmosphere • The equations are solved numerically, by discretizing both space and time • It is necessary to specify initial conditions and, at times, boundary conditions • Forecasts are highly sensitive to the initial conditions
  • 5. Data fusion: an old problem • Often, observations of the initial states are not available: this was recognized by mathematicians and astronomers, among them Euler, Lagrange and Laplace. • In particular, Gauss elaborated on how available observations of the physical system were not easily translatable into initial conditions and stated “[..] since all our observations and measurements are nothing more than approximations to the truth, the same must be true of all calculations resting on them, and the highest aim of all computations made concerning concrete phenomena must be to approximate, as nearly as practicable, to the truth. But this can be accomplished in no other way than by suitable combination of more observations than the number absolutely requisite for the determination of unknown quantities.” (Theory of Motion of Heavenly Bodies)
  • 6. Global atmospheric models
  • 7. Numerical weather prediction models • $x_t$: state of the atmosphere at time $t$; $M_t$: numerical weather prediction model at time $t$ • $M_t$ consists of a set of partial differential equations • [Figure: the state $x_t$ on a three-dimensional grid with axes longitude, latitude and height]
  • 8. Numerical weather prediction models • $x_t$: state of the atmosphere at time $t$; $M_t$: numerical weather prediction model at time $t$ • $M_t$ consists of a set of partial differential equations • [Figure: $M_t$ maps the state $x_t$ on the longitude-latitude-height grid to the state $x_{t+1}$] • $x_{t+1} = M_t(x_t)$ (state-space model)
  • 9. It’s an initial-value problem • In order to obtain a skillful forecast, it is necessary that: • $M_t$ is a realistic representation of the atmosphere • the vector $x_t$ of state-space variables is known accurately • We will assume that the atmospheric model $M_t$ approximates well the evolution in time of the atmosphere • We will focus on how to determine $x_t$
  • 10. Determining the initial conditions • [Figure: grid with axes longitude, latitude and height] • At each time $t$, $x_t$ is a vector of order $n = 10^7$.
  • 11. Determining the initial conditions • [Figure: grid with axes longitude, latitude and height] • At each time $t$, $x_t$ is a vector of order $n = 10^7$. • For any time window around $t$, $(t - \delta t, t + \delta t)$, there are typically $p = 10^5$ observations $y_t$ of the atmosphere. • The observations might not refer to the same variables as the state-space variables. • Data assimilation integrates the two sources of information: a short-range forecast (background or first guess), $x_t^{(b)}$, with the observations, $y_t$.
  • 12. Data assimilation [E. Kalnay (2003)] • Background or first guess: $x_t^{(b)}$. • Global analysis: data assimilation of the background, $x_t^{(b)}$, with the observations, $y_t$.
  • 13. Data assimilation approaches • There are several methods for data assimilation. The main differences are whether the observations are integrated sequentially or not, and whether the model is assumed perfect or stochastic: • Optimal Interpolation • Variational methods: 3D-Var and 4D-Var • Kalman filtering: Kalman filter and Ensemble Kalman filter • They all hypothesize that at time $t$ there are: 1 a true unknown state of the atmosphere: $x_t$ 2 a background field: $x_t^{(b)}$ 3 a vector of observations: $y_t$ • Goal: combine $x_t^{(b)}$ and $y_t$ to determine the best approximation or analysis, $x_t^{(a)}$, to $x_t$
  • 14. Data assimilation: assumptions • $x_t$: true state of the atmosphere at time $t$ • $x_t^{(b)}$: background field at time $t$: $x_t^{(b)} = x_t + \varepsilon_t^{(b)}$, $\varepsilon_t^{(b)} \sim N_n(0, P^{(b)})$ • $y_t$: observations at time $t$: $y_t = \mathcal{H}(x_t) + \varepsilon_t^{(o)} \approx H_t x_t + \varepsilon_t^{(o)}$, $\varepsilon_t^{(o)} \sim N_p(0, R)$, where $\mathcal{H}$ is the observation operator, assumed to be linear (or approximated by a linear operator) and represented at time $t$ by the matrix $H_t$ • $x_t^{(a)}$: analysis at time $t$: $x_t^{(a)} = x_t + \varepsilon_t^{(a)}$, $\varepsilon_t^{(a)} \sim N_n(0, P^{(a)})$
  • 15. Optimal Interpolation • We want to express $x_t^{(a)}$ as a linear combination of $x_t^{(b)}$ and $y_t$: $x_t^{(a)} = a_1 x_t^{(b)} + a_2 y_t$, so that $x_t^{(a)}$ is unbiased and $a_1$ and $a_2$ minimize the mean squared error • Using the same approach as in least squares, we assume that $x_t^{(a)}$ is given by $x_t^{(a)} = x_t^{(b)} + W(y_t - H_t x_t^{(b)})$ • Goal: determine the matrix $W$ so that the analysis error, $\varepsilon_t^{(a)}$, minimizes the expected sum of squares
  • 16. Optimal Interpolation • $x_t^{(a)} = x_t^{(b)} + W(y_t - H_t x_t^{(b)})$ • Goal: determine $W$ so that $E(\varepsilon_t^{(a)\top} \varepsilon_t^{(a)}) = E\{[\varepsilon_t^{(b)} + W(y_t - H_t x_t^{(b)})]^\top [\varepsilon_t^{(b)} + W(y_t - H_t x_t^{(b)})]\}$ is minimized • Then: $W = P^{(b)} H_t^\top (R + H_t P^{(b)} H_t^\top)^{-1}$ • The optimal weight matrix $W$ is also called the gain matrix • The covariance matrix, $P^{(a)}$, of the analysis error, $\varepsilon_t^{(a)}$, is: $P^{(a)} = (I_n - W H_t) P^{(b)}$
  • 17. Optimal Interpolation • The analysis is obtained by adding to the first guess, $x_t^{(b)}$, the product of the optimal weight matrix and the innovation, $y_t - H_t x_t^{(b)}$ • The optimal weight matrix, $W$, is given by the covariance of the forecast error in the observation space ($P^{(b)} H_t^\top$) divided by the total error covariance ($R + H_t P^{(b)} H_t^\top$) • If the observation operator, $H_t$, is a linear operator (or an interpolator), then Optimal Interpolation = Kriging
  • 18. Optimal Interpolation • The observation operator $\mathcal{H}$ is a linear operator represented at time $t$ by the matrix $H_t$ • Suppose that we assume the following: • Prior distribution: $x_t \sim N_n(x_t^{(b)}, P^{(b)})$ • Likelihood: $y_t \mid x_t \sim N_p(H_t x_t, R)$ • Then the posterior distribution is $x_t \mid y_t \sim N_n(E(x_t \mid y_t), \mathrm{Var}(x_t \mid y_t))$, with $E(x_t \mid y_t) = x_t^{(b)} + W(y_t - H_t x_t^{(b)})$ and $\mathrm{Var}(x_t \mid y_t) = (I_n - W H_t) P^{(b)}$, with $W$ as in Optimal Interpolation.
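The Optimal Interpolation update and its Bayesian reading can be checked numerically. The sketch below is only an illustration: the function name `oi_analysis` and all numbers (a 3-variable state with 2 observed components) are assumptions, not taken from the slides. It computes the gain $W$, the analysis and $P^{(a)}$, and compares $P^{(a)}$ with the Gaussian posterior covariance written in precision form.

```python
import numpy as np

def oi_analysis(x_b, P_b, y, H, R):
    """Optimal Interpolation update; returns the analysis, its covariance and the gain."""
    S = R + H @ P_b @ H.T                    # total error covariance in observation space
    W = np.linalg.solve(S, H @ P_b).T        # W = P_b H^T S^{-1} (S and P_b are symmetric)
    x_a = x_b + W @ (y - H @ x_b)            # background plus weighted innovation
    P_a = (np.eye(len(x_b)) - W @ H) @ P_b   # analysis error covariance
    return x_a, P_a, W

# Toy setup (illustrative assumptions): 3 state variables, components 1 and 3 observed
x_b = np.array([1.0, 2.0, 3.0])
P_b = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.5],
                [0.2, 0.5, 1.0]])
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
R = 0.5 * np.eye(2)
y = np.array([1.5, 2.0])

x_a, P_a, W = oi_analysis(x_b, P_b, y, H, R)

# Posterior covariance of the Gaussian prior/likelihood pair, in precision form
posterior_cov = np.linalg.inv(np.linalg.inv(P_b) + H.T @ np.linalg.inv(R) @ H)
print(x_a)
print(np.allclose(P_a, posterior_cov))
```

Note that the unobserved second component is also updated, because the background covariance $P^{(b)}$ spreads the innovation to correlated variables.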
  • 19. Variational methods: 3D-Var • The estimate of the true state of the atmosphere, $x_t$, is found by minimizing a scalar cost function $J(x_t)$: $J(x_t) = \frac{1}{2}(y_t - H_t x_t)^\top R^{-1} (y_t - H_t x_t) + \frac{1}{2}(x_t - x_t^{(b)})^\top (P^{(b)})^{-1} (x_t - x_t^{(b)})$ • $R$: observation error covariance matrix • $P^{(b)}$: forecast error covariance matrix • Formally, the solution to the 3D-Var minimization problem is the same as the solution to the Optimal Interpolation problem • The solution to 3D-Var is the posterior mean in the case of a Gaussian prior for $x_t$ and a Gaussian likelihood with a linear observation operator $H_t$.
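The equivalence between 3D-Var and Optimal Interpolation can be illustrated with a small numerical sketch. The helper names `cost_3dvar` and `minimize_3dvar` and the toy numbers are assumptions made for illustration. Because the cost function is quadratic when $H_t$ is linear, the minimizer is obtained here by setting the gradient to zero and solving the resulting linear system; operational 3D-Var systems use iterative minimization instead.

```python
import numpy as np

def cost_3dvar(x, x_b, P_b, y, H, R):
    """3D-Var cost: observation misfit plus background misfit."""
    d_o = y - H @ x
    d_b = x - x_b
    return 0.5 * d_o @ np.linalg.solve(R, d_o) + 0.5 * d_b @ np.linalg.solve(P_b, d_b)

def minimize_3dvar(x_b, P_b, y, H, R):
    """Zero-gradient condition: (P_b^{-1} + H^T R^{-1} H) x = P_b^{-1} x_b + H^T R^{-1} y."""
    Pb_inv = np.linalg.inv(P_b)
    Rinv_H = np.linalg.solve(R, H)           # R^{-1} H
    hessian = Pb_inv + H.T @ Rinv_H
    rhs = Pb_inv @ x_b + Rinv_H.T @ y        # (R^{-1} H)^T y = H^T R^{-1} y for symmetric R
    return np.linalg.solve(hessian, rhs)

# Toy setup (illustrative assumptions)
x_b = np.array([1.0, 2.0, 3.0])
P_b = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.5],
                [0.2, 0.5, 1.0]])
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
R = 0.5 * np.eye(2)
y = np.array([1.5, 2.0])

x_var = minimize_3dvar(x_b, P_b, y, H, R)

# Optimal Interpolation analysis for comparison
W = P_b @ H.T @ np.linalg.inv(R + H @ P_b @ H.T)
x_oi = x_b + W @ (y - H @ x_b)
print(np.allclose(x_var, x_oi))
```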
  • 20. Variational methods: 4D-Var • The estimate of the true state of the atmosphere is found by minimizing a scalar cost function that allows for observations to be distributed within a time interval $(t_0, t_N)$: $J(x_{t_0}) = \frac{1}{2}\sum_{i=0}^{N} (y_{t_i} - H_{t_i} x_{t_i})^\top R_i^{-1} (y_{t_i} - H_{t_i} x_{t_i}) + \frac{1}{2}(x_{t_0} - x_{t_0}^{(b)})^\top (P_0^{(b)})^{-1} (x_{t_0} - x_{t_0}^{(b)})$ • $R_i$: observation error covariance matrix at time $t_i$ • $P_0^{(b)}$: forecast error covariance matrix at the start of the period • The cost function $J(x_{t_0})$ is minimized with respect to the initial state of the atmosphere, $x_{t_0}$; the later states $x_{t_i}$ are obtained by running the model forward from $x_{t_0}$
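A minimal 4D-Var sketch follows, under simplifying assumptions that are not in the slides: the model is a fixed linear map `M`, the same `H` and `R` apply at every time, and the observations are noise-free. In that case $x_{t_i} = M^i x_{t_0}$, the cost is quadratic in $x_{t_0}$, and the minimizer solves a linear system. All names and numbers are hypothetical.

```python
import numpy as np

def cost_4dvar(x0, x_b, P0, ys, M, H, R):
    """4D-Var cost as a function of the initial state x0 (linear model M)."""
    J = 0.5 * (x0 - x_b) @ np.linalg.solve(P0, x0 - x_b)
    x = x0.copy()
    for y in ys:                      # ys[i] is the observation at time t_i
        d = y - H @ x
        J += 0.5 * d @ np.linalg.solve(R, d)
        x = M @ x                     # propagate the state to the next time
    return J

def fourdvar_linear(x_b, P0, ys, M, H, R):
    """Solve the zero-gradient (normal) equations of the quadratic cost."""
    P0_inv = np.linalg.inv(P0)
    R_inv = np.linalg.inv(R)
    A = P0_inv.copy()
    b = P0_inv @ x_b
    G = H.copy()                      # G_i = H M^i maps x0 to observation space at t_i
    for y in ys:
        A += G.T @ R_inv @ G
        b += G.T @ R_inv @ y
        G = G @ M
    return np.linalg.solve(A, b)

# Toy setup (illustrative assumptions): 2 state variables, only the first is observed
M = np.array([[1.0, 0.1],
              [-0.1, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.1]])
P0 = np.eye(2)
x_b = np.zeros(2)
x_true = np.array([1.0, 0.5])

ys, x = [], x_true.copy()
for _ in range(6):                    # noise-free observations at t_0, ..., t_5
    ys.append(H @ x)
    x = M @ x

x0_hat = fourdvar_linear(x_b, P0, ys, M, H, R)
print(x0_hat)
```

Although only the first component is observed, the model dynamics couple the two components over the window, so the second component of $x_{t_0}$ is also constrained by the data.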
  • 21. Assimilation via Kalman Filter • The numerical model is imperfect: $x_{t_i} = M_{t_{i-1}}(x_{t_{i-1}}) + \eta_{t_i}$, $i = 1, \ldots, N$, with $\eta_{t_i} \sim N(0, Q_i)$ • The observations are used sequentially in the time interval $(t_1, t_N)$. • At each time $t_i$ two operations are performed sequentially: 1 Forecast step 2 Analysis/assimilation step
  • 22. Assimilation via Kalman Filter • Forecast step: 1 Derive the forecast or background at time $t_i$: $x_{t_i}^{(b)} = M_{t_{i-1}}(x_{t_{i-1}}^{(a)})$. 2 Assuming that $M_t$ can be linearized and represented by the matrix $\mathbf{M}_t$, compute the covariance matrix of the background error at time $t_i$: $P_i^{(b)} = \mathbf{M}_{t_{i-1}} P_{i-1}^{(a)} \mathbf{M}_{t_{i-1}}^\top + Q_i$. • Analysis step: 1 Compute the Kalman gain matrix at time $t_i$: $K_i = P_i^{(b)} H_i^\top (R_i + H_i P_i^{(b)} H_i^\top)^{-1}$. 2 Derive the analysis at time $t_i$, $x_{t_i}^{(a)}$: $x_{t_i}^{(a)} = x_{t_i}^{(b)} + K_i (y_{t_i} - H_i x_{t_i}^{(b)})$. 3 Compute the covariance matrix of the analysis error at time $t_i$: $P_i^{(a)} = (I_n - K_i H_i) P_i^{(b)}$. • If $M_{t_i}$ and $H_{t_i}$ are not linear → Extended Kalman Filter
  • 23. Kalman Filter • Suppose that for each $i = 1, \ldots, N$: • measurement equation: $y_{t_i} = H_i x_{t_i} + \varepsilon_{t_i}$, $\varepsilon_{t_i} \sim N(0, R_i)$ • process/transition equation: $x_{t_i} = M_{t_i} x_{t_{i-1}} + \eta_{t_i}$, $\eta_{t_i} \sim N(0, Q_{t_i})$ • with $x_{t_0} \sim N(x_{t_0}^{(b)}, P_{t_0}^{(b)})$
  • 24. Kalman Filter • Let $x_{t_0:t_i} \equiv \{x_{t_0}, x_{t_1}, \ldots, x_{t_i}\}$ and $y_{t_1:t_i} \equiv \{y_{t_1}, y_{t_2}, \ldots, y_{t_i}\}$. • Let $x_{t_i}^{(a)}$ be the analysis • Let $x_{t_i}^{(f)}$ denote the forecast • For $i = 1, \ldots, N$: • Forecast step: 1 $x_{t_i}^{(f)} = E(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} x_{t_{i-1}}^{(a)}$ 2 $P_{t_i}^{(f)} = \mathrm{Var}(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} P_{t_{i-1}}^{(a)} M_{t_i}^\top + Q_{t_i}$ • Analysis step: 1 $K_{t_i} = P_{t_i}^{(f)} H_i^\top (R_i + H_i P_{t_i}^{(f)} H_i^\top)^{-1}$ 2 $x_{t_i}^{(a)} = x_{t_i}^{(f)} + K_{t_i} (y_{t_i} - H_i x_{t_i}^{(f)})$ 3 $P_{t_i}^{(a)} = (I_n - K_{t_i} H_i) P_{t_i}^{(f)}$
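The forecast and analysis steps above can be sketched as a single function. The example is a toy, with assumed values: a scalar random walk observed with noise, a deliberately poor initial guess, and a hypothetical function name `kalman_step`. It shows that the analysis variance settles to a steady state that does not depend on the initial uncertainty.

```python
import numpy as np

def kalman_step(x_a, P_a, y, M, Q, H, R):
    """One forecast step followed by one analysis step (linear M and H)."""
    # Forecast step
    x_f = M @ x_a
    P_f = M @ P_a @ M.T + Q
    # Analysis step
    K = np.linalg.solve(R + H @ P_f @ H.T, H @ P_f).T   # Kalman gain P_f H^T S^{-1}
    x_new = x_f + K @ (y - H @ x_f)
    P_new = (np.eye(len(x_f)) - K @ H) @ P_f
    return x_new, P_new

# Toy scalar random walk (illustrative assumptions)
rng = np.random.default_rng(0)
M = np.array([[1.0]])
Q = np.array([[0.1]])
H = np.array([[1.0]])
R = np.array([[1.0]])

x_true = np.array([0.0])
x_a, P_a = np.array([5.0]), np.array([[10.0]])   # poor initial guess, large uncertainty

for _ in range(50):
    x_true = M @ x_true + rng.normal(0.0, np.sqrt(Q[0, 0]), size=1)
    y = H @ x_true + rng.normal(0.0, np.sqrt(R[0, 0]), size=1)
    x_a, P_a = kalman_step(x_a, P_a, y, M, Q, H, R)

print(x_a, P_a)
```

In this linear Gaussian setting the covariance recursion does not depend on the observed values, which is why the analysis variance converges regardless of the random draws.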
  • 25. Kalman Filter/Extended Kalman Filter • In the case of a linear state-space model $M_t$ and a linear observation operator $\mathcal{H}$, the Kalman filter can be interpreted within a Bayesian framework. • If, at time $t_i$, we assume: • $x_{t_i} \sim N(x_{t_i}^{(b)}, P_i^{(b)})$ • $y_{t_i} \mid x_{t_i} \sim N(H_i x_{t_i}, R_i)$ • Then the analysis $x_{t_i}^{(a)}$ is the posterior mean, $E(x_{t_i} \mid y_{t_i})$, and the analysis covariance matrix $P_{t_i}^{(a)}$ is the posterior variance, $\mathrm{Var}(x_{t_i} \mid y_{t_i})$ • On the other hand, the forecast step consists of deriving $p(x_{t_{i+1}} \mid y_{t_i}) = \int p(x_{t_{i+1}} \mid x_{t_i}) \, p(x_{t_i} \mid y_{t_i}) \, dx_{t_i}$
  • 26. Kalman Filter/Extended Kalman Filter • It is the “gold standard” of data assimilation • Even with a poor initial guess of the state of the atmosphere, it should provide the best linear unbiased estimate of the state of the atmosphere • Problems if the system is unstable • Computationally expensive! The matrix operations to compute $P_i^{(b)}$ and $P_i^{(a)}$ involve matrices of order $n \approx 10^7$ • Nonlinear dynamics: if $M_{t_i}$ is non-linear, the linear approximation may not perform well
  • 27. Ensemble Kalman Filter • Main idea: Use an ensemble of system states as a discrete approximation to the distribution of $x_{t_i}$ • Each ensemble member is propagated forward in time using $M_{t_i}$ • The mean and covariance matrix of the new ensemble are used to approximate the forecast distribution • Similar to a particle filter, with the ensemble members being “particles” • The same set of observations is assimilated into each ensemble member
  • 28. Ensemble Kalman Filter • Let $x_{t_0,j}^{(b)}$, $j = 1, \ldots, M$, be $M$ ensemble members • Forecast step: 1 Derive the forecast ensemble members at time $t_i$: $x_{t_i,j}^{(b)} = M_{t_{i-1}}(x_{t_{i-1},j}^{(a)})$, $j = 1, \ldots, M$ 2 Compute the sample covariance matrix of the background error at time $t_i$: $\hat{P}_i^{(b)}$ • Analysis step: 1 Compute the Kalman gain matrix at time $t_i$: $K_i = \hat{P}_i^{(b)} H_i^\top (R_i + H_i \hat{P}_i^{(b)} H_i^\top)^{-1}$ 2 Derive the analysis ensemble members at time $t_i$: $x_{t_i,j}^{(a)} = x_{t_i,j}^{(b)} + K_i (\tilde{y}_{t_i,j} - H_i x_{t_i,j}^{(b)})$, where $\tilde{y}_{t_i,j} = y_{t_i} + \varepsilon_j$, $\varepsilon_j \sim N(0, R_i)$ 3 Compute the sample covariance matrix of the analysis error at time $t_i$: $\hat{P}_i^{(a)}$
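The analysis step with perturbed observations can be sketched as follows. The function name `enkf_analysis`, the ensemble size and all numbers are assumptions for illustration; the forecast step would simply apply the model to each column of the ensemble matrix.

```python
import numpy as np

def enkf_analysis(X_b, y, H, R, rng):
    """Stochastic EnKF analysis. X_b is n x m: each column is one ensemble member."""
    n, m = X_b.shape
    A = X_b - X_b.mean(axis=1, keepdims=True)        # ensemble anomalies
    P_hat = A @ A.T / (m - 1)                         # sample background covariance
    K = np.linalg.solve(R + H @ P_hat @ H.T, H @ P_hat).T
    # One perturbed copy of the observations per member: y + eps_j, eps_j ~ N(0, R)
    eps = rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
    Y_pert = y[:, None] + eps
    return X_b + K @ (Y_pert - H @ X_b)

# Toy setup (illustrative assumptions): 2 state variables, 500 members, first variable observed
rng = np.random.default_rng(1)
X_b = rng.multivariate_normal(np.zeros(2), np.array([[1.0, 0.6], [0.6, 1.0]]), size=500).T
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
y = np.array([2.0])

X_a = enkf_analysis(X_b, y, H, R, rng)
print(X_b.mean(axis=1), X_a.mean(axis=1))
print(X_b.var(axis=1, ddof=1), X_a.var(axis=1, ddof=1))
```

The analysis ensemble mean moves toward the observation and the ensemble spread shrinks, including in the unobserved variable through the sample cross-covariance.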
  • 29. Further approaches • Different strategies to perform the analysis step in the Ensemble Kalman filter • Sampling variability in the Ensemble Kalman filter, especially if the ensemble size is small → filter divergence; decrease in the contribution of the observations • Remedies: 1 Localization of the ensemble covariance matrix (e.g. covariance tapering, etc.) 2 Inflation of the ensemble spread
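Both remedies are simple matrix operations, sketched below under stated assumptions. The taper used here is a Wendland function, chosen only because it is compactly supported and positive definite; the slides do not name a specific taper, and the function names, the one-dimensional grid and the inflation factor are all hypothetical.

```python
import numpy as np

def wendland_taper(dist, support):
    """Compactly supported correlation (Wendland), zero beyond `support`."""
    r = np.abs(dist) / support
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def localize(P_hat, coords, support):
    """Covariance tapering: elementwise (Schur) product with a taper matrix."""
    dist = np.abs(coords[:, None] - coords[None, :])
    return P_hat * wendland_taper(dist, support)

def inflate(X, factor):
    """Multiplicative inflation: stretch each member away from the ensemble mean."""
    mean = X.mean(axis=1, keepdims=True)
    return mean + factor * (X - mean)

# Toy setup (illustrative assumptions): 20 grid points on a line, only 10 members
rng = np.random.default_rng(2)
coords = np.arange(20.0)
X = rng.normal(size=(20, 10))
A = X - X.mean(axis=1, keepdims=True)
P_hat = A @ A.T / (X.shape[1] - 1)        # noisy, rank-deficient sample covariance

P_loc = localize(P_hat, coords, support=5.0)
X_infl = inflate(X, factor=1.1)
print(P_hat[0, -1], P_loc[0, -1])          # spurious long-range covariance is removed
```

Tapering leaves the variances on the diagonal unchanged, and inflation by a factor $\lambda$ multiplies every sample variance by $\lambda^2$.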
  • 30. Data fusion: spatial data • Wikle and Berliner, Technometrics, 2005 • Two sources of wind data: daily satellite wind data and computer model output from a weather center for the period 15 September 1996-29 June 1997 • Data with different resolution → Change of support problem • Satellite-based wind estimates from the NASA Scatterometer (NSCAT) at 0.5 degree resolution and not on a regular grid • National Centers for Environmental Prediction (NCEP) analysis of wind direction at 2.5 degree resolution and on a regular grid • Goal: Predict surface streamfunction at a resolution of 1.0 degree
  • 31. Data fusion: spatial data • [Figure: Wind data from satellite and from an analysis for December 26, 1996]
  • 32. Data fusion: spatial data • $Z$: measurement data from the two sources • $Y$: true underlying process • Adopt the modeling approach: [Data | Process] [Process | Parameters] [Parameters] • Goal: Make inference on the process $Y$ • Problem: The data have different spatial supports
  • 33. Data fusion: spatial data • Let: 1 $A_i$, $i = 1, \ldots, n_a$ 2 $B_j$, $j = 1, \ldots, n_b$ 3 $C_k$, $k = 1, \ldots, n_c$ be non-overlapping sets such that $0 \le |A_i| < |B_j| < |C_k| < \infty$ for all $i, j, k$ • $Z_A \equiv (Z(A_1), \ldots, Z(A_{n_a}))^\top$: observations on the subgrid • $Z_C \equiv (Z(C_1), \ldots, Z(C_{n_c}))^\top$: observations on the supergrid
  • 34. Data fusion: spatial data • $Y = \{Y(s) : s \in D \subset \mathbb{R}\}$: spatial process • $Y(S) = \frac{1}{|S|}\int_S Y(s)\,ds$ if $|S| > 0$, and $Y(S) = \mathrm{avg}\{Y(s) : s \in S\}$ if $|S| = 0$ • $Y_A \equiv (Y(A_1), \ldots, Y(A_{n_a}))^\top$: subgrid process • $Y_C \equiv (Y(C_1), \ldots, Y(C_{n_c}))^\top$: supergrid process • $Y_B \equiv (Y(B_1), \ldots, Y(B_{n_b}))^\top$: process on the prediction grid • Then: • $Z_A$: observations of $Y_A$ • $Z_C$: observations of $Y_C$
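  • 35. Data fusion: spatial data • Data model • Model for $[Z_A, Z_C \mid Y_A, Y_C, Y_B, \theta_m]$ • Measurement error: $Z_A = Y_A + \varepsilon_A$, $\varepsilon_A \sim N(0, \sigma_a^2 I_{n_a})$; $Z_C = Y_C + \varepsilon_C$, $\varepsilon_C \sim N(0, \sigma_c^2 I_{n_c})$ • $\begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \mid Y_A, Y_C, \sigma_a^2, \sigma_c^2 \sim N\left(\begin{pmatrix} Y_A \\ Y_C \end{pmatrix}, \Sigma_m\right)$, with $\Sigma_m = \begin{pmatrix} \sigma_a^2 I_{n_a} & 0 \\ 0 & \sigma_c^2 I_{n_c} \end{pmatrix}$
  • 36. Data fusion: spatial data • Process model • For all $s \in B_j$: $Y(s) = Y(B_j) + \gamma(s)$, with $E(\gamma(s)) = 0$ and $\mathrm{Cov}(\gamma(s), \gamma(r)) = C(s, r; \phi)$ • Then: 1 for all $A_i$, $Y(A_i) = g_A^{(i)\top} Y_B + \frac{1}{|A_i|}\int_{A_i} \gamma(s)\,ds$ 2 for all $C_k$, $Y(C_k) = g_C^{(k)\top} Y_B + \frac{1}{|C_k|}\int_{C_k} \gamma(s)\,ds$ • $\begin{pmatrix} Y_A \\ Y_C \end{pmatrix} \mid Y_B, \sigma \sim N\left(\begin{pmatrix} G_A \\ G_C \end{pmatrix} Y_B, \Sigma(\phi)\right)$
  • 37. Data fusion: spatial data • Complete model • Data model: $\begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \mid Y_B, \Sigma_m, \Sigma \sim N\left(\begin{pmatrix} G_A \\ G_C \end{pmatrix} Y_B, \Sigma_m + \Sigma(\phi)\right)$ • Process model: $Y_B \sim N(\theta_B, \Sigma_B(\phi))$ • Parameters: $[\sigma_a^2, \sigma_c^2, \theta_B, \phi]$
  • 38. Data fusion: spatial data • Example: streamfunction • Data: 1 NSCAT satellite data: $U_A$, $V_A$ ($n_a = 369$) 2 NCEP (numerical model output): $U_C$, $V_C$ ($n_c = 15$) • Process: $u_B$, $v_B$ • Data model: 1 $\begin{pmatrix} U_A \\ U_C \end{pmatrix} \mid u_B, \sigma_u, \Sigma \sim N\left(\begin{pmatrix} G_A \\ G_C \end{pmatrix} u_B, \Sigma_u + \Sigma_m\right)$ 2 $\begin{pmatrix} V_A \\ V_C \end{pmatrix} \mid v_B, \sigma_v, \Sigma \sim N\left(\begin{pmatrix} G_A \\ G_C \end{pmatrix} v_B, \Sigma_v + \Sigma_m\right)$ • Process model: $\begin{pmatrix} u_B \\ v_B \end{pmatrix} \sim N\left(\begin{pmatrix} \mu_u 1 \\ \mu_v 1 \end{pmatrix}, \Sigma_{uv}\right)$
  • 39. Data fusion: spatial data • Interest in predicting the streamfunction $\psi$. • Deterministic Poisson equation to determine the streamfunction $\psi$ from winds: $\nabla^2 \psi = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}$, where $u$ is the east-west wind component and $v$ is the north-south wind component • Discretizing to a regular grid: 1 $\psi_I \mid \psi_{bc}, u, v \sim N(L^{-1}[D_x v - D_y u + L_{bc} \psi_{bc}], \Sigma_I)$ 2 $\psi_{bc} \sim N(\mu_{bc}, \Sigma_{bc})$ • $\psi_I$: streamfunction at the interior grid locations • $\psi_{bc}$: streamfunction at the boundary grid locations
  • 40. Data fusion: spatial data • [Figure: Wind data (top row); posterior mean and a realization from the posterior distribution of the streamfunction for December 26, 1996 (bottom row)]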
  • 41. Data fusion: spatial data • Fuentes and Raftery, Biometrics, 2005 • Two sources of weekly average SO2 concentration data: monitoring data and computer model output • Data with different resolution → Change of support problem • Monitoring data from CASTNet sites • Output of a numerical model, Models-3, given as average concentration over 36 km × 36 km grid cells • Goal: Estimate the true weekly average concentration of SO2
  • 42. Data fusion: spatial data • Fuentes and Raftery, Biometrics, 2005 • [Figure: Average SO2 concentration for the week of July 11, 1995]
  • 43. Data fusion: spatial data • Process: $Z(s)$ true underlying process • Data: 1 $\hat{Z}(s)$: measurement from the monitoring network (CASTNet) 2 $\tilde{Z}(B)$: numerical model output (Models-3) • Goal: Make inference on the process $Z(s)$ • Problem: The data have different spatial supports
  • 44. Data fusion: spatial data
  • 45. Data fusion: spatial data • Data model • Model for $[\hat{Z}(s), \tilde{Z}(B) \mid Z(s), \theta_m]$ • Measurement error: $\hat{Z}(s) = Z(s) + e(s)$, $e(s) \sim N(0, \sigma_e^2)$; $\tilde{Z}(B) = \frac{1}{|B|}\int_B \tilde{Z}(s)\,ds$; $\tilde{Z}(s) = a(s) + b(s) Z(s) + \delta(s)$, $\delta(s) \sim N(0, \sigma_\delta^2)$, where 1 $a(s)$ is a polynomial in $s$ 2 $b(s) \equiv b$
  • 46. Data fusion: spatial data • Process model • $Z(s) = \mu(s) + \varepsilon(s)$ with 1 $E(\varepsilon(s)) = 0$ and $\mathrm{Cov}(\varepsilon(s), \varepsilon(r)) = \sigma(s, r; \phi)$ 2 $\mu(s)$ polynomial in $s$ with coefficients $\beta$ → $Z(s) \sim GP(\mu(s), \Sigma)$ • Goal: Make inference on $Z$ given $\hat{Z}$, $\tilde{Z}$
  • 47. Data fusion: spatial data • $\hat{Z} = (\hat{Z}(s_1), \ldots, \hat{Z}(s_n))^\top$ • $\tilde{Z} = (\tilde{Z}(B_1), \ldots, \tilde{Z}(B_M))^\top$ • $\begin{pmatrix} \hat{Z} \\ \tilde{Z} \end{pmatrix} \sim N\left(\begin{pmatrix} \hat{\mu} \\ \tilde{a} + b\tilde{\mu} \end{pmatrix}, \begin{pmatrix} \Sigma_C & \Sigma_{CM} \\ \Sigma_{CM}^\top & \Sigma_M \end{pmatrix}\right)$ where 1 $\hat{\mu} = (\mu(s_1), \ldots, \mu(s_n))^\top$ 2 $\tilde{a} = \left(\frac{1}{|B_1|}\int_{B_1} a(s)\,ds, \ldots, \frac{1}{|B_M|}\int_{B_M} a(s)\,ds\right)^\top$ 3 $\tilde{\mu} = \left(\frac{1}{|B_1|}\int_{B_1} \mu(s)\,ds, \ldots, \frac{1}{|B_M|}\int_{B_M} \mu(s)\,ds\right)^\top$
  • 48. Data fusion: spatial data • $\begin{pmatrix} \hat{Z} \\ \tilde{Z} \end{pmatrix} \sim N\left(\begin{pmatrix} \hat{\mu} \\ \tilde{a} + b\tilde{\mu} \end{pmatrix}, \begin{pmatrix} \Sigma_C & \Sigma_{CM} \\ \Sigma_{CM}^\top & \Sigma_M \end{pmatrix}\right)$ where 1 $\Sigma_C$ is an $n \times n$ matrix: $(\Sigma_C)_{ij} = \sigma(s_i, s_j; \phi) + \mathbf{1}\{s_i \equiv s_j\}\,\sigma_e^2$ 2 $\Sigma_{CM}$ is an $n \times M$ matrix: $(\Sigma_{CM})_{ik} = b \cdot \frac{1}{|B_k|}\int_{B_k} \sigma(s_i, v; \phi)\,dv$ 3 $\Sigma_M$ is an $M \times M$ matrix: $(\Sigma_M)_{kl} = b^2 \cdot \frac{1}{|B_k| \cdot |B_l|}\int_{B_k}\int_{B_l} \sigma(u, v; \phi)\,du\,dv + \mathbf{1}\{B_k \equiv B_l\}\,\sigma_\delta^2$
  • 49. Data fusion: spatial data • Example: air pollution • Data: 1 Weekly average of SO2 concentration at $n = 50$ CASTNet sites for the week of July 11, 1995 2 Weekly average of SO2 concentration at $M = 81 \times 87$ grid cells of size 36 km × 36 km, output of Models-3 for the week of July 11, 1995 • Other modeling details • Stochastic integrals approximated by taking a systematic sample of 4 points within each grid cell • Degree of the polynomials defining the mean trend $\mu(s)$ of $Z(s)$ and the additive bias $a(s)$ of $\tilde{Z}(s)$ determined via RJMCMC • Non-stationary covariance function for the underlying true process $Z(s)$
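The integrals in $\Sigma_{CM}$ and in the first term of $\Sigma_M$ can be approximated by averaging the point covariance over a few sample points per grid cell, as described above. The sketch below makes several assumptions that differ from the paper: it uses a stationary exponential covariance (the slides state that a non-stationary one was used), places the 4 points on a regular 2 × 2 sub-grid, and uses hypothetical names and parameter values throughout.

```python
import numpy as np

def exp_cov(u, v, sigma2=1.0, phi=50.0):
    """Hypothetical stationary exponential covariance between two sets of points (km)."""
    d = np.linalg.norm(u[:, None, :] - v[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def cell_sample_points(lower_left, size=36.0, k=2):
    """k x k systematic sample inside a square grid cell (4 points for k = 2)."""
    offsets = (np.arange(k) + 0.5) * size / k
    gx, gy = np.meshgrid(offsets, offsets)
    return np.asarray(lower_left) + np.column_stack([gx.ravel(), gy.ravel()])

def point_block_cov(s, pts_k, b=1.0):
    """Approximates b * (1/|B_k|) * integral over B_k of sigma(s, v) dv."""
    return b * exp_cov(s[None, :], pts_k).mean()

def block_block_cov(pts_k, pts_l, b=1.0):
    """Approximates b^2 * (1/(|B_k||B_l|)) * double integral of sigma(u, v)."""
    return b ** 2 * exp_cov(pts_k, pts_l).mean()

# Two adjacent 36 km cells and one monitoring site (illustrative coordinates)
B1 = cell_sample_points([0.0, 0.0])
B2 = cell_sample_points([36.0, 0.0])
site = np.array([10.0, 10.0])

print(point_block_cov(site, B1), point_block_cov(site, B2))
print(block_block_cov(B1, B1), block_block_cov(B1, B2))
```

With a point variance of 1, the within-cell average `block_block_cov(B1, B1)` falls below 1, which reflects the variance reduction from spatial averaging that the change-of-support model has to account for.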
  • 50. Data fusion: spatial data • [Figure: Posterior predictive mean and posterior predictive SD for $Z(s)$ for the week of July 11, 1995]
  • 51. Data fusion: space-time data • Wikle et al., JASA 2001 • Extend the modeling idea of Wikle and Berliner (2005) to account for time • Daily wind data from two sources: satellite data (at higher resolution) and computer model output (at a lower resolution) • Goal: Predict winds at an intermediate resolution over a period of 54 six-hour increments • Accounted for the temporal dependence in the data by using dynamic coefficients in the specification of the process driving the observed data • Avoided computing stochastic integrals!
  • 52. Data fusion: space-time data • Data: 1 NSCAT satellite data: $U_{A,t}$, $V_{A,t}$ at time $t$ 2 NCEP (numerical model output): $U_{C,t}$, $V_{C,t}$ at time $t$ → $U_t = (U_{C,t}, U_{A,t})^\top$ and $V_t = (V_{C,t}, V_{A,t})^\top$ observed data at time $t$ → $\{U_t\}_{t=1}^T = (U_1, \ldots, U_T)$, $\{V_t\}_{t=1}^T = (V_1, \ldots, V_T)$ • Process: • $u_t$, $v_t$ at time $t$ at $n_B$ prediction grid cells. • Similar definition for $\{u_t\}_{t=1}^T$ and $\{v_t\}_{t=1}^T$
  • 53. Data fusion: space-time data • Data model: $[\{V_t\}_{t=1}^T, \{U_t\}_{t=1}^T \mid \{v_t\}_{t=1}^T, \{u_t\}_{t=1}^T, \theta] = \prod_{t=1}^T [V_t \mid v_t, \theta] \cdot [U_t \mid u_t, \theta]$ • $V_t \mid v_t, \Sigma_t \sim N(K_t v_t, \Sigma_t)$ • $U_t \mid u_t, \Sigma_t \sim N(K_t u_t, \Sigma_t)$ 1 $\Sigma_t$: diagonal matrix with entries equal to either $\sigma^2$ (satellite obs), $\sigma_b^2$ (NCEP boundary grid cells) or $\sigma_I^2$ (NCEP interior cells) 2 $K_t$: design matrices that map the prediction grid cells to the observation grid cells
  • 54. Data fusion: space-time data • Process model: $u_t = \mu_u + u_t^E + \tilde{u}_t$, $v_t = \mu_v + v_t^E + \tilde{v}_t$ 1 $\mu_u$: spatial mean for the $u$ wind component: $\mu_u = P_u \gamma_u$ (resp. for $\mu_v$) → $P_u$ design matrix (resp. $P_v$) 2 $u_t^E$: thin fluid approximation of the $u$ wind component: $u_t^E = \Phi a_t^u$ (resp. for $v_t^E$) → $\Phi$ basis functions • $\tilde{u}_t$: small-scale motions of the $u$ wind component: $\tilde{u}_t = \Psi b_t^u$ (resp. for $\tilde{v}_t$) → $\Psi$ wavelet basis functions
  • 55. Data fusion: space-time data • Parameters • The $2n \times 1$ random vectors $a_t^u$, $a_t^v$ are modeled as dynamically evolving in time but are independent between prediction grid cells • The $n \times 1$ random vectors $b_t^u$ and $b_t^v$ are modeled as dynamically evolving in time and are independent between prediction grid cells • No need to compute stochastic integrals! • Only temporal dependence is explicitly modeled • Computationally feasible
  • 56. Data fusion: space-time data • Choi et al., Comp. Stat. and Data Analysis 2009 • Extend the modeling idea of Fuentes and Raftery (2005) to account for time • Daily average PM2.5 concentration from two sources: monitoring data and computer model output 1 $\hat{Z}(s,t)$: observation from monitoring site $s$ at time $t$ 2 $\tilde{Z}(B,t)$: model output at grid cell $B$ at time $t$ • Goal: Predict the true daily average PM2.5 concentration aggregated over counties at time $t$ for health analysis • Included the temporal dependence in the mean structure of the underlying process
  • 57. Data fusion: space-time data • Data model • Model for $[\hat{Z}(s,t), \tilde{Z}(B,t) \mid Z(s,t), \theta_m]$: $\hat{Z}(s,t) = Z(s,t) + e(s,t)$, $e(s,t) \sim N(0, \sigma_e^2)$; $\tilde{Z}(B,t) = \frac{1}{|B|}\int_B \tilde{Z}(s,t)\,ds$; $\tilde{Z}(s,t) = a(s) + Z(s,t) + \delta(s,t)$, $\delta(s,t) \sim N(0, \sigma_\delta^2)$ • Process model: $Z(s,t) = M(s,t)\,\xi + \varepsilon(s,t)$, $\varepsilon(s,t) \sim N(0, \tau^2)$ • $M(s,t)$: vector of meteorological variables at site $s$ at time $t$
  • 58. Data fusion: space-time data • McMillan et al., Environmetrics, 2009 • Propose a spatio-temporal model to combine monitoring data and numerical model output 1 Daily average PM2.5 concentration from monitoring sites during year 2001 2 Daily average PM2.5 concentration, output of the CMAQ model run at 12 km grid cell resolution ($M = 213 \times 188$) • Goal: Combine the two sources of data and predict the true daily average PM2.5 concentration for each day in 2001
  • 59. Data fusion: space-time data • Process: $W_i$ true underlying process • Data: 1 $X_{i,k}$ monitoring data 2 $Y_{i,k}$ CMAQ output • $W_i$ defined on space-time grid cells: $i \in \{1, \ldots, N\}$, where $N = N_T \times N_P$, $N_T$ number of time points, $N_P$ number of grid cells • $X_{i,k}$: observed monitoring data for the $k$-th monitor observation in cell $i$ • $Y_{i,k}$: CMAQ output in cell $i$ ($k = 1$)
  • 60. Data fusion: space-time data
  • 61. Data fusion: space-time data • Data model • Model for $[X_{i,k}, Y_{i,k} \mid W_i, \theta]$ • Measurement error: $X_{i,k} = W_i + \varepsilon_{i,k}$, $\varepsilon_{i,k} \sim N(0, \tau_X^2)$; $Y_{i,k} = D_i \beta + W_i + \delta_{i,k}$, $\delta_{i,k} \sim N(0, \tau_Y^2)$ • $D_i$: vector of uniform B-splines over a regular 3-dimensional lattice of $N_D$ knots ⟹ CMAQ bias for grid cell $i$: $D_i \beta = \sum_{j=1}^{N_D} D_{ij} \beta_j$
  • 62. Data fusion: space-time data • Process model: $W_i = \mu_{t(i)} + Z_i$ • $t(i)$: temporal index of grid cell $i$ • $\mu_{t(i)}$ constant across space: $\mu_{t(i)} \sim N(0, \tau_\mu^2)$ • $Z$: space-time multivariate normal with a separable covariance structure: autoregressive in time and conditionally autoregressive (CAR) in space ⟹ $Z \mid \tau_Z^2, \rho \sim N(0, \tau_Z^2 \cdot (\Lambda_T(\rho) \otimes \Lambda_P)^{-1})$
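The separable structure means the space-time precision matrix is a Kronecker product of a small temporal precision and a small spatial precision, so the full covariance never has to be inverted directly. The sketch below is an illustration under assumptions that are not taken from the paper: $\Lambda_T(\rho)$ is the precision of a stationary AR(1) process with unit innovation variance, and $\Lambda_P$ is a proper CAR precision $D - \alpha W$ with a hypothetical $\alpha < 1$ so that it is invertible.

```python
import numpy as np

def ar1_precision(n_t, rho):
    """Tridiagonal precision of a stationary AR(1) process, unit innovation variance."""
    Q = np.zeros((n_t, n_t))
    idx = np.arange(n_t)
    Q[idx, idx] = 1.0 + rho ** 2
    Q[0, 0] = Q[-1, -1] = 1.0
    Q[idx[:-1], idx[1:]] = -rho
    Q[idx[1:], idx[:-1]] = -rho
    return Q

def car_precision(W, alpha):
    """Proper CAR precision D - alpha * W, with D the diagonal matrix of neighbor counts."""
    D = np.diag(W.sum(axis=1))
    return D - alpha * W

# Toy setup (illustrative assumptions): 4 time points, 3 grid cells on a line
rho, alpha, tau2_Z = 0.6, 0.9, 2.0
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
Lambda_T = ar1_precision(4, rho)
Lambda_P = car_precision(W, alpha)

precision = np.kron(Lambda_T, Lambda_P) / tau2_Z            # 12 x 12, sparse structure
cov_direct = np.linalg.inv(precision)                        # brute-force inverse
cov_kron = tau2_Z * np.kron(np.linalg.inv(Lambda_T), np.linalg.inv(Lambda_P))
print(np.allclose(cov_direct, cov_kron))
```

With `np.kron(Lambda_T, Lambda_P)` the elements of $Z$ are ordered time-major: all cells at the first time point, then all cells at the second, and so on.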
  • 63. Data fusion: space-time data Daily mean levels for predicted surface, monitoring data and CMAQ over Eastern US Veronica J. Berrocal Data fusion
  • 64. Data fusion: space-time data Posterior predictive mean for (a) 4 July 2001 and (b) 24 December 2001 (a) (b) Veronica J. Berrocal Data fusion
  • 65. Data fusion: space-time data Berrocal et al., JABES, 2010 • Propose a spatio-temporal model to combine monitoring data and numerical model output 1 Daily 8-hr max ozone concentration from monitoring sites during summer of 2001 2 Daily 8-hr max ozone concentration, output of CMAQ model ran at 12 km grid cell resolution (M = 213×188) • Goal: Combine the two sources of data and “downscale” numerical model output at point level • Does not assume a “true” underlying process Veronica J. Berrocal Data fusion
  • 66. Data fusion: space-time data • Y (s,t): observation at site s at time t • x(B,t): CMAQ output at grid cell B at time t For s ∈ B: Y (s,t) = ˜β0(s,t)+ ˜β1(s,t)x(B,t)+ε(s,t) ε(s,t) ∼ N(0,σ2 ) with ˜βi (s,t) = βit +βi (s,t), for i = 0,1. • Temporal dependence in β0t and β1t: (i) β0t,β1t Nested within time (ii) β0t,β1t Dynamic in time • β0(s,t) and β1(s,t) correlated Gaussian processes that are either: (i) Nested within time OR (i) Dynamic in time Veronica J. Berrocal Data fusion
  • 67. Data fusion: space-time data • Possible spatio-temporal models to combine the two data β0t β0(s,t) Model β1t β1(s,t) Model 1 Independent across time Constant in time Model 2 Dynamic Constant in time Model 3 Independent across time Independent across time Model 4 Dynamic Dynamic Veronica J. Berrocal Data fusion
  • 68. Data fusion: space-time data −100 −95 −90 −85 −80 −75 −70 30354045 Longitude Latitude Ozone monitoring sites, 2001 Test sites (black), validation sites (red) • Daily maximum 8-hour ozone concentration (ppb): observations (n=803) and CMAQ model output • Model output on 12-km grid cells (M=40,440) • Fit models for May 1 - October 15, 2001 • 436 sites used to fit the model, 367 sites for validation Veronica J. Berrocal Data fusion
  • 69. Data fusion: space-time • The National Ambient Air Quality Standard (NAAQS) for ozone requires that the 3-year rolling average of the annual fourth highest daily 8-hour maximum ozone concentration be less than a given threshold • Maps of the probability that the fourth highest ozone concentration during the period May 1 - October 15, 2001 exceeds (a) 80 ppb (1997 standard) and (b) 75 ppb (2008 standard) [Figure: two probability maps, panels (a) and (b) (axes: longitude, latitude; color scale from 0.0 to 1.0)]
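One natural way to estimate such an exceedance probability at a single location is by Monte Carlo, assuming posterior predictive draws of the daily 8-hour maximum are available for every day of the season. The sketch below is illustrative only; the function name and the array layout (`draws` with one row per posterior draw and one column per day) are assumptions, not taken from the paper.

```python
import numpy as np

def prob_fourth_highest_exceeds(draws, threshold):
    """Estimate P(fourth highest daily value > threshold) at one location.

    draws : array of shape (n_draws, n_days), one simulated season per row
            (hypothetical layout of posterior predictive samples).
    """
    draws = np.asarray(draws, dtype=float)
    # Sort each simulated season in ascending order; the fourth highest
    # daily value is then the fourth entry from the end.
    fourth_highest = np.sort(draws, axis=1)[:, -4]
    # The fraction of draws whose fourth highest value exceeds the threshold.
    return float(np.mean(fourth_highest > threshold))

# Two toy "seasons" of five days each; fourth highest values are 81 and 79.
toy = [[90, 85, 82, 81, 70],
       [90, 85, 82, 79, 70]]
p80 = prob_fourth_highest_exceeds(toy, 80.0)
```

Repeating the calculation at every prediction location would give a map of the kind shown on the slide.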
  • 70. Data fusion: space-time Berrocal et al., Environmetrics, 2012 • Extend the 2010 downscaler model to allow for potential spatial misalignment in the computer model output, using: (1) seasonal average temperature at 17 synoptic stations in Sweden for the period December 1962 - November 2007; (2) regional climate model output on a 12.5 km × 12.5 km grid for the same period • Goal: assess the performance of the regional climate model
  • 71. RCM data [Figure: RCM output, DJF 2002; marked stations: Stockholm, Borlänge, Göteborg] • Output of the Swedish Meteorological and Hydrological Institute (SMHI) Rossby Centre Atmospheric (RCA) regional climate model • Daily output for 2-m temperature from December 1, 1962 to November 30, 2007, then aggregated to quarterly averages (DJF, MAM, JJA, SON) • Output at 12.5 km × 12.5 km grid boxes
  • 72. RCM data [Figure: RCM output, MAM 2002; marked stations: Stockholm, Borlänge, Göteborg]
  • 73. Observational data [Figure: observation data, DJF 2002, at the 17 stations; marked stations: Stockholm, Borlänge, Göteborg] • Observed daily average temperature from 17 stations in the SMHI network of synoptic stations • Period: December 1, 1962 to November 30, 2007 • Daily data aggregated to quarterly scale • Three stations, Göteborg, Stockholm and Borlänge, held out for validation
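The aggregation of daily values to quarterly (DJF, MAM, JJA, SON) averages can be sketched as follows. This is an illustration using only the standard library; the function names are hypothetical, and labeling each winter by the year of its January and February (so that DJF 2002 runs from December 2001 to February 2002) is an assumed convention that the slides do not state explicitly.

```python
from collections import defaultdict
from datetime import date, timedelta

SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF",
                   3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA",
                   9: "SON", 10: "SON", 11: "SON"}

def season_key(day):
    """Return (year, season); December is assigned to the following year's DJF."""
    year = day.year + 1 if day.month == 12 else day.year
    return (year, SEASON_OF_MONTH[day.month])

def quarterly_means(days, values):
    """Average daily values within each (year, season) quarter."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in zip(days, values):
        key = season_key(day)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Daily dates covering the study period, December 1, 1962 to November 30, 2007.
start, end = date(1962, 12, 1), date(2007, 11, 30)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
means = quarterly_means(days, [1.0] * len(days))  # constant toy series
```

Under this convention the study period contains 45 complete years of four quarters each.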
  • 74. Downscaling model • Some notation: • $B_1,\ldots,B_g$: RCM grid boxes with centroids $r_1,\ldots,r_g$ • $x(B_1,t), x(B_2,t), \ldots, x(B_g,t)$: RCM output of quarterly average temperature for quarter $t = 1,\ldots,T$ at grid boxes $B_1, B_2, \ldots, B_g$ • $Y(s,t)$: observed quarterly average temperature at station $s$ for quarter $t = 1,\ldots,T$ • The 2010 downscaling model applied to these data would be: for $s \in B$ and $t = 1,\ldots,T$, $Y(s,t) = \tilde{\beta}_0(s,t) + \tilde{\beta}_1(s,t)\,x(B,t) + \varepsilon(s,t)$, $\varepsilon(s,t) \overset{iid}{\sim} N(0,\tau^2)$
  • 75. Downscaling model • The 2012 model starts from the observation that we could write, for $t = 1,\ldots,T$: $Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1\,\tilde{x}(s,t) + \varepsilon(s,t)$, $\varepsilon(s,t) \sim N(0,\tau^2)$, with • $\tilde{x}(s,t)$: spatio-temporal weighted average of the RCM output, $\tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\,x(B_k,t)$ • $\tilde{\beta}_1(s,t)$ replaced by $\beta_1$ for identifiability reasons • The weights $w_k(s,t)$ should be: • positive and sum to 1 • spatially correlated within sites and across sites
  • 76. Downscaling model • If $r_1,\ldots,r_g$ are the centroids of the RCM grid boxes, we can take the weights $w_k(s,t)$ to be $w_k(s,t) = \dfrac{K(|s-r_k|;\lambda)}{\sum_{l=1}^{g} K(|s-r_l|;\lambda)}$ • $K(\cdot;\lambda)$ is a kernel function with bandwidth $\lambda$. For example: $K(|s-r_k|;\lambda) = \exp\left(-\dfrac{|s-r_k|}{\lambda}\right)$
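The kernel weights and the resulting weighted average of the RCM output can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and treating $|s - r_k|$ as Euclidean distance between coordinate pairs is an assumption.

```python
import numpy as np

def kernel_weights(s, centroids, lam):
    """Normalized exponential-kernel weights w_k = K_k / sum_l K_l,
    with K_k = exp(-|s - r_k| / lam)."""
    s = np.asarray(s, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    dist = np.linalg.norm(centroids - s, axis=1)  # |s - r_k|, Euclidean (assumed)
    k = np.exp(-dist / lam)
    return k / k.sum()

def weighted_rcm(s, centroids, x, lam):
    """Weighted average of grid-box output: sum_k w_k * x(B_k)."""
    return float(kernel_weights(s, centroids, lam) @ np.asarray(x, dtype=float))

centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
site = np.array([0.0, 0.0])
w = kernel_weights(site, centroids, lam=1.0)
x_tilde = weighted_rcm(site, centroids, x=[2.0, 4.0, 6.0], lam=1.0)
```

A small bandwidth concentrates the weight on the nearest grid box, while a large bandwidth approaches a plain average over all grid boxes.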
  • 77. Downscaling model We consider RCM output and observational data: [Figure: two maps, "RCM output: DJF 2002" and "Observation data: DJF 2002"; marked stations: Stockholm, Borlänge, Göteborg]
  • 78. Downscaling model We consider RCM output and observational data: [Figure: two maps, "RCM output: DJF 2002" and "Observation data: DJF 2002", with a single observation site highlighted] We establish the spatial linear model: for $s \in B$ and $t = 1,\ldots,T$, $Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1\,\tilde{x}(s,t) + \varepsilon(s,t)$, $\varepsilon(s,t) \sim N(0,\tau^2)$
  • 79. Downscaling model For $t = 1,\ldots,T$, the weight $w_k(s,t)$ is: [Figure: map of the weights $w_k(s,t)$ over the full domain (color scale from 0.0 to 0.5), with a second panel covering a smaller region around one site]
  • 80. Downscaling model • To allow for the weights $w_k(s,t)$ to be directional, we modify the expression $w_k(s,t) = \dfrac{K(|s-r_k|;\lambda)}{\sum_{l=1}^{g} K(|s-r_l|;\lambda)}$ to $w_k(s,t) = \dfrac{K(|s-r_k|;\lambda)\,\exp(Q(r_k,t))}{\sum_{l=1}^{g} K(|s-r_l|;\lambda)\,\exp(Q(r_l,t))}$, where for $t = 1,\ldots,T$, $Q(r,t)$ is a latent stationary mean-zero spatial Gaussian process with variance 1 and exponential correlation function • For $t = 1,\ldots,T$, the range $\phi$ of the latent spatial process $Q(r,t)$ influences the directionality of the weights
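The directional weights can be illustrated by drawing the latent process $Q$ at the grid-box centroids and reweighting the kernel. The sketch below makes several assumptions that are not in the slides: the function names, Euclidean distances, the parameterization $\exp(-d/\phi)$ of the exponential correlation, and the small diagonal jitter added for numerical stability of the Cholesky factorization.

```python
import numpy as np

def draw_latent_q(centroids, phi, rng, jitter=1e-10):
    """Draw Q at the centroids: mean zero, variance 1,
    exponential correlation exp(-distance / phi) (assumed parameterization)."""
    centroids = np.asarray(centroids, dtype=float)
    diff = centroids[:, None, :] - centroids[None, :, :]
    corr = np.exp(-np.linalg.norm(diff, axis=-1) / phi)
    # Jitter on the diagonal is a numerical safeguard, not part of the model.
    chol = np.linalg.cholesky(corr + jitter * np.eye(len(corr)))
    return chol @ rng.standard_normal(len(corr))

def directional_weights(s, centroids, lam, q):
    """w_k proportional to exp(-|s - r_k| / lam) * exp(q_k), normalized to sum to 1."""
    s = np.asarray(s, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    dist = np.linalg.norm(centroids - s, axis=1)
    log_w = -dist / lam + np.asarray(q, dtype=float)
    log_w -= log_w.max()  # subtract the maximum to avoid overflow in exp
    w = np.exp(log_w)
    return w / w.sum()

rng = np.random.default_rng(1)
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
site = np.array([0.0, 0.0])
q = draw_latent_q(centroids, phi=2.0, rng=rng)
w_dir = directional_weights(site, centroids, lam=1.0, q=q)
w_iso = directional_weights(site, centroids, lam=1.0, q=np.zeros(3))
```

With $Q \equiv 0$ the weights reduce to the isotropic kernel weights; a large value of $Q$ at a centroid shifts weight toward that grid box regardless of its distance from the site.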
  • 81. Downscaling model • Finally, the downscaling model is: for site $s$ and $t = 1,\ldots,T$, $Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1\,\tilde{x}(s,t) + \varepsilon(s,t)$, $\varepsilon(s,t) \sim N(0,\tau^2)$ • $\tilde{\beta}_0(s,t) = \beta_{0,t} + \beta_0(s,t)$, with $\beta_0(s,t)$ a stationary mean-zero Gaussian spatial process with time-varying range parameter • $\tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\,x(B_k,t)$ • $w_k(s,t) = \dfrac{K(|s-r_k|;\lambda)\,\exp(Q(r_k,t))}{\sum_{l=1}^{g} K(|s-r_l|;\lambda)\,\exp(Q(r_l,t))}$ • $Q(r,t)$ is a latent stationary mean-zero spatial Gaussian process with variance 1 and exponential correlation function with range parameter $\phi$ • For $t = 1,\ldots,T$, the calibration parameters $\beta_{0,t}$ and $\beta_0(s,t)$ are assumed to be independent in time, and so is the latent process $Q(r,t)$
  • 82. Predictions at point level • We predicted quarterly average temperature at the three reserved stations and compared the predictions with: (1) observed data; (2) the quarterly average temperature output by the RCM at the grid box containing the station [Figure: map of the 17 numbered stations (axes: longitude, latitude), with Stockholm, Borlänge and Göteborg marked]
  • 83. Predictions at Borlänge • Black line: observed data • Blue line: downscaling model prediction • Red line: RCM output • Magenta line: upscaling model prediction [Figure: time series at Borlänge by year, in four panels: Winter, Spring, Summer, Autumn]
  • 84. Predictions at Stockholm • Black line: observed data • Blue line: downscaling model prediction • Red line: RCM output • Magenta line: upscaling model prediction [Figure: time series at Stockholm by year, in four panels: Winter, Spring, Summer, Autumn]
  • 85. Spatial differences [Figure: eight maps for 2002 with the 17 stations numbered, one "Downscaling Climate" and one "Upscaling Climate" panel for each of Winter, Spring, Summer and Autumn]
  • 86. Data fusion: space-time Sahu et al., JRSS Series C, 2009 • Propose a spatio-temporal model to combine monitoring data and numerical model output to predict wet chemical deposition, using: (1) weekly nitrate (resp. sulfate) deposition for year 2001 at monitoring sites (n = 152); (2) weekly precipitation data for year 2001 at monitoring sites; (3) weekly nitrate (resp. sulfate) deposition, output of the CMAQ model run at 12 km grid cell resolution (M = 33,390) • Goal: combine the sources of data and predict weekly, seasonal and annual wet deposition in the Eastern US for 2001
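  • 87. Data fusion: space-time Data: (1) $P(s,t)$: observed precipitation at $s$ at time $t$; (2) $Z(s,t)$: observed deposition at $s$ at time $t$; (3) $Q(B,t)$: CMAQ model output at grid cell $B$ at time $t$. Data model: • $P(s,t) = \begin{cases} \exp(U(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases}$ • $Z(s,t) = \begin{cases} \exp(Y(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases}$ • $Q(B,t) = \begin{cases} \exp(X(B,t)) & \text{if } \tilde{V}(B,t) > 0 \\ 0 & \text{otherwise} \end{cases}$ • $X(B,t) = \gamma_0 + \gamma_1 \tilde{V}(B,t) + \psi(B,t)$, $\psi(B,t) \sim N(0,\sigma^2_\psi)$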
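The data model ties the zeros of precipitation and deposition to the sign of the same latent process $V(s,t)$, so the two observed quantities are zero at the same sites and times. A minimal sketch of that mapping, with a hypothetical function name and toy values chosen purely for illustration:

```python
import numpy as np

def observed_from_latent(log_value, v):
    """Return exp(log_value) where the latent process v is positive, and 0 otherwise."""
    log_value = np.asarray(log_value, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.where(v > 0, np.exp(log_value), 0.0)

# Toy latent values at four site-weeks (made up for illustration).
v = np.array([0.8, -0.3, 1.2, -1.0])   # latent atmospheric process V(s,t)
u = np.array([0.0, 0.5, 1.0, 2.0])     # log-precipitation process U(s,t)
y = np.array([-1.0, 0.2, 0.0, 1.5])    # log-deposition process Y(s,t)

precip = observed_from_latent(u, v)      # P(s,t)
deposition = observed_from_latent(y, v)  # Z(s,t)
```

Because both calls use the same `v`, dry weeks yield zero precipitation and zero wet deposition together, which is the point of sharing the latent process.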
  • 88. Data fusion: space-time Process model: (1) $U(s,t)$: process driving precipitation at $s$ at time $t$; (2) $Y(s,t)$: process driving deposition at $s$ at time $t$; (3) $V(s,t)$: latent atmospheric process; (4) $\tilde{V}(B,t)$: process driving the log-CMAQ output at $B$ at time $t$ • $U(s,t) = \alpha_0 + \alpha_1 V(s,t) + \delta(s,t)$, $\delta_t \sim GP(0,\Sigma_\delta)$ • $Y(s,t) = \beta_0 + \beta_1 U(s,t) + \beta_2 V(s,t) + [b_0 + b_1(s)\,X(B,t)] + \eta(s,t) + \varepsilon(s,t)$ • $V(s,t) = \tilde{V}(B,t) + \nu(s,t)$, $\nu(s,t) \sim N(0,\sigma^2_\psi)$ • $\tilde{V}(B,t) = \rho\,\tilde{V}(B,t-1) + \zeta(B,t)$, $\zeta(B,t) \sim \text{CAR}$
  • 89. Data fusion: spatial data
  • 90. Data fusion: spatial data [Figure: monitoring data and validation sites]
  • 91. Data fusion: spatial data [Figure: annual total precipitation in 2001]
  • 92. Data fusion: space-time data [Figure: posterior predictive mean for b(s) for (a) sulfate and (b) nitrate]
  • 93. Data fusion: space-time data [Figure: (a) posterior predictive annual mean for nitrate and (b) length of predictive interval]