CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data Fusion - Veronica Berrocal, Sep 26, 2017

Data fusion approaches
(for Earth-systems data)
Veronica J. Berrocal
University of Michigan
Department of Biostatistics
SAMSI course
Fall 2017

Outline
• Introduction
• Data assimilation
  • Optimal Interpolation
  • Variational methods
  • Kalman filter
  • Further approaches
• Data fusion in the statistical literature
  • Spatial data
    • Example: Wikle and Berliner, 2005
    • Example: Fuentes and Raftery, 2005 (Bayesian melding)
  • Space-time data
    • Example: Wikle et al., 2001
    • Example: Choi et al., 2009
    • Example: McMillan et al., 2009
    • Example: Berrocal et al., 2010 and 2012
    • Example: Sahu et al., 2009

Definitions
• Data fusion refers to the statistical technique used to combine
data from different sources
• If one of the sources is the output of a computer model
→ Data assimilation
• Data assimilation: term coined in the atmospheric science
community
• Several definitions
• "approach for fusing data (observations) with prior knowledge
(e.g. mathematical representations of physical laws; model
output) to obtain an estimate of the distribution of the true
state of the process" (Wikle and Berliner, 2006)

Why data fusion?
• The evolution in time of many geophysical processes (e.g. the
atmosphere) can be described by systems of partial
differential equations
• As an example, numerical weather forecasts are obtained by
running computer models forward in time to simulate the
evolution of the atmosphere
• The equations are solved numerically, by discretizing both
space and time
• It is necessary to specify initial conditions, and, at times,
boundary conditions
• Forecasts are highly sensitive to the initial conditions

Data fusion: an old problem
• Often, observations of the initial states are not available: this
was recognized by mathematicians and astronomers, among
them Euler, Lagrange and Laplace.
• In particular, Gauss elaborated how available observations of
the physical system were not easily translatable into initial
conditions and stated
"[..] since all our observations and measurements are nothing more
than approximations to the truth, the same must be true of all
calculations resting on them, and the highest of all computations
made concerning concrete phenomena must be to approximate, as
nearly as practicable, to the truth. But this can be accomplished in
no other way than by suitable combination of more observations
than the number absolutely requisite for the determination of
unknown quantities." (Theory of Motion of Heavenly Bodies)

Global atmospheric models

Numerical weather prediction models
• $x_t$: state of the atmosphere at time $t$; $M_t$: numerical weather
prediction model at time $t$
• $M_t$ consists of a set of partial differential equations
[Figure: the state $x_t$ on a longitude × latitude × height grid is mapped by $M_t$ to the state $x_{t+1}$ on the same grid]
$$x_{t+1} = M_t(x_t)$$
(state-space model)

It’s an initial-value problem
• In order to obtain a skillful forecast, it is necessary that:
• Mt is a realistic representation of the atmosphere
• the vector xt of state space variables is known accurately
• We will assume that the atmospheric model Mt approximates
well the evolution in time of the atmosphere
• We will focus on how to determine xt

Determining the initial conditions
[Figure: longitude × latitude × height grid]
• At each time $t$, $x_t$ is a vector of dimension $n = 10^7$.
• For any time window around $t$, $(t - \delta t,\, t + \delta t)$, there are
typically $p = 10^5$ observations $y_t$ of the atmosphere.
• The observations might not refer to the same variables as the
state-space variables.
• Data assimilation integrates the two sources of information: a
short-range forecast (background or first guess), $x_t^{(b)}$, with
the observations, $y_t$.

Data assimilation
[E. Kalnay (2003)]
• Background or first guess: $x_t^{(b)}$.
• Global analysis: data assimilation of the background, $x_t^{(b)}$, with the
observations, $y_t$.

Data assimilation approaches
• There are several methods for data assimilation. They differ mainly
in whether the observations are integrated sequentially or not, and
in whether the model is assumed perfect or stochastic:
• Optimal Interpolation
• Variational methods: 3D-Var and 4D-Var
• Kalman filtering: Kalman filter and Ensemble Kalman filter
• They all hypothesize that at time $t$, there are:
1 a true unknown state of the atmosphere: $x_t$
2 a background field: $x_t^{(b)}$
3 a vector of observations: $y_t$
• Goal: combine $x_t^{(b)}$ and $y_t$ to determine the best
approximation or analysis, $x_t^{(a)}$, to $x_t$

Data assimilation: assumptions
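• $x_t$: true state of the atmosphere at time $t$
• $x_t^{(b)}$: background field at time $t$
$$x_t^{(b)} = x_t + \varepsilon_t^{(b)}, \qquad \varepsilon_t^{(b)} \sim N_n(0, P^{(b)})$$
• $y_t$: observations at time $t$
$$y_t = \mathcal{H}(x_t) + \varepsilon_t^{(o)} \approx H_t\, x_t + \varepsilon_t^{(o)}, \qquad \varepsilon_t^{(o)} \sim N_p(0, R)$$
$\mathcal{H}$: observation operator, assumed to be linear (or
approximated by a linear operator) and represented at time $t$ by the matrix
$H_t$
• $x_t^{(a)}$: analysis at time $t$
$$x_t^{(a)} = x_t + \varepsilon_t^{(a)}, \qquad \varepsilon_t^{(a)} \sim N_n(0, P^{(a)})$$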

Optimal Interpolation
• We want to express $x_t^{(a)}$ as a linear combination of $x_t^{(b)}$ and $y_t$:
$$x_t^{(a)} = a_1 x_t^{(b)} + a_2 y_t$$
so that $x_t^{(a)}$ is unbiased and $a_1$ and $a_2$ minimize the mean
squared error
• Using the same approach as in least squares, we assume that
$x_t^{(a)}$ is given by
$$x_t^{(a)} = x_t^{(b)} + W\,(y_t - H_t x_t^{(b)})$$
• Goal: determine the matrix $W$ so that the expected sum of squares of
the analysis error, $\varepsilon_t^{(a)}$, is minimized

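• $x_t^{(a)} = x_t^{(b)} + W\,(y_t - H_t x_t^{(b)})$
• Goal: determine $W$ so that, with $\varepsilon_t^{(a)} = \varepsilon_t^{(b)} + W\,(y_t - H_t x_t^{(b)})$,
$$E\big(\varepsilon_t^{(a)\top} \varepsilon_t^{(a)}\big) = E\Big\{ \big[\varepsilon_t^{(b)} + W\,(y_t - H_t x_t^{(b)})\big]^\top \big[\varepsilon_t^{(b)} + W\,(y_t - H_t x_t^{(b)})\big] \Big\}$$
is minimized
• Then: $W = P^{(b)} H_t^\top \big(R + H_t P^{(b)} H_t^\top\big)^{-1}$
• The optimal weight matrix $W$ is also called the gain matrix
• The covariance matrix, $P^{(a)}$, of the analysis error, $\varepsilon_t^{(a)}$, is:
$$P^{(a)} = (I_n - W H_t)\, P^{(b)}$$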

• The analysis is obtained by adding to the first guess, $x_t^{(b)}$, the
product of the optimal weight matrix and the innovation,
that is, $y_t - H_t x_t^{(b)}$
• The optimal weight matrix, $W$, is given by the covariance of
the forecast error in the observation space ($P^{(b)} H_t^\top$) "divided"
by (that is, multiplied by the inverse of) the total error covariance,
$R + H_t P^{(b)} H_t^\top$
• If the observation operator, $H_t$, is a linear operator (or an
interpolator), then
Optimal Interpolation = Kriging

• Observation operator $\mathcal{H}$ is a linear operator represented at
time $t$ by the matrix $H_t$
• Suppose that we assumed the following:
• Prior distribution: $x_t \sim N_n(x_t^{(b)}, P^{(b)})$
• Likelihood: $y_t \mid x_t \sim N_p(H_t x_t, R)$
• Posterior distribution: $x_t \mid y_t \sim N_n\big(E(x_t \mid y_t), \mathrm{Var}(x_t \mid y_t)\big)$
$$E(x_t \mid y_t) = x_t^{(b)} + W\,(y_t - H_t x_t^{(b)})$$
$$\mathrm{Var}(x_t \mid y_t) = (I_n - W H_t)\, P^{(b)}$$
with $W$ as in Optimal Interpolation.
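
The update above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `oi_analysis` and all numbers in the toy example are made up for illustration and do not come from the lecture.

```python
import numpy as np

def oi_analysis(x_b, P_b, y, H, R):
    """One Optimal Interpolation update: analysis, its covariance, and the gain."""
    S = R + H @ P_b @ H.T                  # total error covariance in observation space
    W = np.linalg.solve(S, H @ P_b).T      # W = P_b H^T S^{-1} (S and P_b are symmetric)
    innovation = y - H @ x_b               # y_t - H_t x_t^(b)
    x_a = x_b + W @ innovation
    P_a = (np.eye(len(x_b)) - W @ H) @ P_b
    return x_a, P_a, W

# Toy example (made-up numbers): 3 state variables, the 1st and 3rd observed directly.
x_b = np.array([1.0, 2.0, 3.0])
P_b = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.5],
                [0.2, 0.5, 1.0]])
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
R = 0.25 * np.eye(2)
y = np.array([1.5, 2.0])

x_a, P_a, W = oi_analysis(x_b, P_b, y, H, R)
print("analysis:", x_a)
print("background variances:", np.diag(P_b))
print("analysis variances:  ", np.diag(P_a))
```

Because the background errors are correlated, the unobserved second variable is also updated, and every analysis variance is smaller than the corresponding background variance.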

Variational methods: 3D-Var
• The estimate of the true state of the atmosphere, $x_t$, is found by minimizing
a scalar cost function $J(x_t)$:
$$J(x_t) = \frac{1}{2}\,(y_t - H_t x_t)^\top R^{-1} (y_t - H_t x_t) + \frac{1}{2}\,(x_t - x_t^{(b)})^\top (P^{(b)})^{-1} (x_t - x_t^{(b)})$$
• $R$: observation error covariance matrix
• $P^{(b)}$: forecast error covariance matrix
• Formally the solution to the 3D-Var minimization problem is the
same as the solution to the Optimal Interpolation problem
• The solution to 3D-Var is the posterior mean in the case of a
Gaussian prior for $x_t$ and a Gaussian likelihood with a linear
observation operator $H_t$.
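
As a numerical check of the equivalence stated above, the sketch below minimizes $J$ by solving the linear system obtained from setting its gradient to zero, and compares the result with the Optimal Interpolation analysis. The function names and the toy numbers are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def cost_3dvar(x, x_b, P_b, y, H, R):
    """3D-Var cost J(x): observation misfit plus background misfit."""
    d_o = y - H @ x
    d_b = x - x_b
    return 0.5 * d_o @ np.linalg.solve(R, d_o) + 0.5 * d_b @ np.linalg.solve(P_b, d_b)

def minimize_3dvar(x_b, P_b, y, H, R):
    """Solve grad J = 0: (P_b^{-1} + H^T R^{-1} H) x = P_b^{-1} x_b + H^T R^{-1} y."""
    A = np.linalg.inv(P_b) + H.T @ np.linalg.solve(R, H)
    rhs = np.linalg.solve(P_b, x_b) + H.T @ np.linalg.solve(R, y)
    return np.linalg.solve(A, rhs)

# Toy example (made-up numbers): two state variables, one observation of their sum.
x_b = np.array([1.0, 2.0])
P_b = np.array([[1.0, 0.3],
                [0.3, 2.0]])
H = np.array([[1.0, 1.0]])
R = np.array([[0.5]])
y = np.array([4.0])

x_var = minimize_3dvar(x_b, P_b, y, H, R)

# Optimal Interpolation analysis with the gain matrix W
W = P_b @ H.T @ np.linalg.inv(R + H @ P_b @ H.T)
x_oi = x_b + W @ (y - H @ x_b)

print("3D-Var minimizer:", x_var)
print("OI analysis:     ", x_oi)
```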

Variational methods: 4D-Var
• The estimate of the true state of the atmosphere is found by minimizing
a scalar cost function that allows for observations to be
distributed within a time interval $(t_0, t_N)$:
$$J(x_{t_0}) = \frac{1}{2} \sum_{i=0}^{N} (y_{t_i} - H_{t_i} x_{t_i})^\top R_i^{-1} (y_{t_i} - H_{t_i} x_{t_i}) + \frac{1}{2}\,(x_{t_0} - x_{t_0}^{(b)})^\top (P_0^{(b)})^{-1} (x_{t_0} - x_{t_0}^{(b)})$$
• $R_i$: observation error covariance matrix at time $t_i$
• $P_0^{(b)}$: forecast error covariance matrix at the start of the period
• The cost function $J(x_{t_0})$ is minimized with respect to the initial true
state of the atmosphere $x_{t_0}$
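
A minimal sketch of how this cost function can be evaluated, assuming (as is standard in 4D-Var, though not written on the slide) that the later states $x_{t_i}$ are obtained by propagating $x_{t_0}$ forward with the model. The rotation model, the function names and all numbers are hypothetical.

```python
import numpy as np

def cost_4dvar(x0, x0_b, P0_b, ys, Hs, Rs, step):
    """J(x0): background misfit at t_0 plus observation misfits at t_0, ..., t_N.

    step(x) advances the state by one time step (the model is assumed perfect).
    """
    d_b = x0 - x0_b
    J = 0.5 * d_b @ np.linalg.solve(P0_b, d_b)
    x = x0
    for i, (y, H, R) in enumerate(zip(ys, Hs, Rs)):
        if i > 0:
            x = step(x)                    # x_{t_i} = M(x_{t_{i-1}})
        d_o = y - H @ x
        J += 0.5 * d_o @ np.linalg.solve(R, d_o)
    return J

# Hypothetical linear model: a rotation of the 2-dimensional state.
theta = 0.3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def step(x):
    return A @ x

H = np.array([[1.0, 0.0]])                 # only the first component is observed
R = np.array([[0.1]])
x_true = np.array([1.0, 0.0])

# Noise-free synthetic observations at 4 times
ys, x = [], x_true
for i in range(4):
    if i > 0:
        x = step(x)
    ys.append(H @ x)
Hs, Rs = [H] * 4, [R] * 4

x0_b = np.array([0.8, 0.2])
P0_b = np.eye(2)
J_truth = cost_4dvar(x_true, x0_b, P0_b, ys, Hs, Rs, step)
J_background = cost_4dvar(x0_b, x0_b, P0_b, ys, Hs, Rs, step)
print("J at the true initial state:", J_truth)
print("J at the background:        ", J_background)
```

In practice $J$ is minimized iteratively, with the gradient usually supplied by an adjoint model; the sketch only evaluates the cost.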

Assimilation via Kalman Filter
• The numerical model is imperfect:
$$x_{t_i} = M_{t_{i-1}}(x_{t_{i-1}}) + \eta_{t_i}, \qquad i = 1,\dots,N$$
with $\eta_{t_i} \sim N(0, Q_i)$
• The observations are used sequentially in the time interval
$(t_1, t_N)$.
• At each time $t_i$ two operations are performed sequentially:
1 Forecast step
2 Analysis/assimilation step

Assimilation via Kalman Filter
• Forecast step:
1 Derive forecast or background at time $t_i$: $x_{t_i}^{(b)} = M_{t_{i-1}}(x_{t_{i-1}}^{(a)})$.
2 Assuming that the model $M_t$ can be linearized and represented by the
matrix $\mathbf{M}_t$, compute covariance matrix of background error at
time $t_i$: $P_i^{(b)} = \mathbf{M}_{t_{i-1}} P_{i-1}^{(a)} \mathbf{M}_{t_{i-1}}^\top + Q_i$.
• Analysis step:
1 Compute the Kalman gain matrix at time $t_i$,
$K_i = P_i^{(b)} H_i^\top \big(R_i + H_i P_i^{(b)} H_i^\top\big)^{-1}$.
2 Derive the analysis at time $t_i$, $x_{t_i}^{(a)}$:
$$x_{t_i}^{(a)} = x_{t_i}^{(b)} + K_i\,(y_{t_i} - H_i x_{t_i}^{(b)})$$
3 Compute the covariance matrix of analysis error at time $t_i$:
$$P_i^{(a)} = (I_n - K_i H_i)\, P_i^{(b)}$$
If $M_{t_i}$ and $H_{t_i}$ are not linear, then → Extended Kalman Filter
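
The forecast and analysis steps above can be sketched as two small functions for the linear case. The function names and the scalar random-walk example (all numbers included) are illustrative assumptions.

```python
import numpy as np

def kf_forecast(x_a, P_a, M, Q):
    """Forecast step: propagate the analysis and its error covariance."""
    x_b = M @ x_a
    P_b = M @ P_a @ M.T + Q
    return x_b, P_b

def kf_analysis(x_b, P_b, y, H, R):
    """Analysis step: Kalman gain, updated state, updated covariance."""
    S = R + H @ P_b @ H.T
    K = np.linalg.solve(S, H @ P_b).T      # K = P_b H^T S^{-1}
    x_a = x_b + K @ (y - H @ x_b)
    P_a = (np.eye(len(x_b)) - K @ H) @ P_b
    return x_a, P_a

# Hypothetical scalar random walk observed with noise.
rng = np.random.default_rng(0)
M = np.array([[1.0]])
Q = np.array([[0.1]])
H = np.array([[1.0]])
R = np.array([[1.0]])

x_a, P_a = np.array([0.0]), np.array([[1.0]])
truth = 0.0
for i in range(20):
    truth += rng.normal(scale=np.sqrt(Q[0, 0]))           # true state evolves
    y = np.array([truth + rng.normal(scale=np.sqrt(R[0, 0]))])
    x_b, P_b = kf_forecast(x_a, P_a, M, Q)
    x_a, P_a = kf_analysis(x_b, P_b, y, H, R)

print("final analysis:", x_a, "analysis variance:", P_a)
```

For a state of dimension $n \approx 10^7$ the covariance matrices cannot be stored or propagated this way, which is what motivates the ensemble approach.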

Kalman Filter
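• Suppose that for each $i = 1,\dots,N$:
• measurement equation:
$$y_{t_i} = H_i x_{t_i} + \varepsilon_{t_i}, \qquad \varepsilon_{t_i} \sim N(0, R_i)$$
• process/transition equation:
$$x_{t_i} = M_{t_i} x_{t_{i-1}} + \eta_{t_i}, \qquad \eta_{t_i} \sim N(0, Q_{t_i})$$
with $x_{t_0} \sim N\big(x_{t_0}^{(b)}, P_{t_0}^{(b)}\big)$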

Kalman Filter
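• Let $x_{t_0:t_i} \equiv \{x_{t_0}, x_{t_1}, \dots, x_{t_i}\}$ and $y_{t_1:t_i} \equiv \{y_{t_1}, y_{t_2}, \dots, y_{t_i}\}$.
• Let $x_{t_i}^{(a)}$ be the analysis
• Let $x_{t_i}^{(f)}$ denote the forecast
• For $i = 1,\dots,N$:
• Forecast step
1 $x_{t_i}^{(f)} = E(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} x_{t_{i-1}}^{(a)}$
2 $P_{t_i}^{(f)} = \mathrm{Var}(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} P_{t_{i-1}}^{(a)} M_{t_i}^\top + Q_{t_i}$
• Analysis step
1 $K_{t_i} = P_{t_i}^{(f)} H_i^\top \big(R_i + H_i P_{t_i}^{(f)} H_i^\top\big)^{-1}$
2 $x_{t_i}^{(a)} = x_{t_i}^{(f)} + K_{t_i}\,(y_{t_i} - H_i x_{t_i}^{(f)})$
3 $P_{t_i}^{(a)} = (I_n - K_{t_i} H_i)\, P_{t_i}^{(f)}$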

Kalman Filter/Extended Kalman Filter
• In the case of a linear state-space model $M_t$ and a linear
observation operator $\mathcal{H}$, the Kalman filter can be interpreted
within a Bayesian framework.
• If, at time $t_i$, we assume:
• $x_{t_i} \sim N\big(x_{t_i}^{(b)}, P_i^{(b)}\big)$
• $y_{t_i} \mid x_{t_i} \sim N(H_i x_{t_i}, R_i)$
• Then, the analysis $x_{t_i}^{(a)}$ is the posterior mean, $E(x_{t_i} \mid y_{t_i})$, and
the analysis covariance matrix $P_{t_i}^{(a)}$ is the posterior variance,
$\mathrm{Var}(x_{t_i} \mid y_{t_i})$
• On the other hand, the forecast step consists of deriving
$$p(x_{t_{i+1}} \mid y_{t_i}) = \int p(x_{t_{i+1}} \mid x_{t_i}) \cdot p(x_{t_i} \mid y_{t_i})\, dx_{t_i}$$

Kalman Filter/Extended Kalman Filter
• It is the "gold standard" of data assimilation
• Even with a poor initial guess of the state of the atmosphere,
it should provide the best linear unbiased estimate of the state
of the atmosphere
• Problems if the system is unstable
• Computationally expensive! The matrix operations to
compute $P_i^{(b)}$ and $P_i^{(a)}$ involve matrices of order $n \approx 10^7$
• Nonlinear dynamics: if $M_{t_i}$ is non-linear, the linear approximation
does not perform well

Ensemble Kalman Filter
• Main idea: Use an ensemble of system states as a discrete
approximation to the distribution of $x_{t_i}$
• Each ensemble member is propagated forward in time using
$M_{t_i}$
• The mean and covariance matrix of the new ensemble are
used to approximate the forecast distribution
• Similar to a particle filter, with the ensemble members being
"particles"
• The same set of observations is assimilated into each ensemble
member

Ensemble Kalman Filter
• Let $x_{t_0,j}^{(b)}$, $j = 1,\dots,M$ be $M$ ensemble members
• Forecast step:
1 Derive forecast ensemble members at time $t_i$:
$$x_{t_i,j}^{(b)} = M_{t_{i-1}}\big(x_{t_{i-1},j}^{(a)}\big), \qquad j = 1,\dots,M$$
2 Compute sample covariance matrix of background error at
time $t_i$: $\hat{P}_i^{(b)}$
• Analysis step:
1 Compute the Kalman gain matrix at time $t_i$,
$K_i = \hat{P}_i^{(b)} H_i^\top \big(R_i + H_i \hat{P}_i^{(b)} H_i^\top\big)^{-1}$
2 Derive the analysis ensemble members at time $t_i$:
$$x_{t_i,j}^{(a)} = x_{t_i,j}^{(b)} + K_i\,\big(\tilde{y}_{t_i,j} - H_i x_{t_i,j}^{(b)}\big)$$
where $\tilde{y}_{t_i,j} = y_{t_i} + \varepsilon_j$, $\varepsilon_j \sim N(0, R)$
3 Compute the sample covariance matrix of the analysis error at
time $t_i$, $\hat{P}_i^{(a)}$
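
One cycle of the algorithm above, with perturbed observations, can be sketched as follows. The function name `enkf_step`, the linear toy model and the ensemble size are assumptions made for illustration only.

```python
import numpy as np

def enkf_step(X_a, step, y, H, R, rng):
    """One EnKF cycle. X_a has shape (n, M): one column per ensemble member.

    Returns the forecast ensemble and the analysis ensemble at the new time.
    """
    n_members = X_a.shape[1]
    # Forecast step: propagate every member with the model
    X_b = np.column_stack([step(X_a[:, j]) for j in range(n_members)])
    P_b = np.atleast_2d(np.cov(X_b))               # sample background covariance
    # Analysis step: gain from the sample covariance
    S = R + H @ P_b @ H.T
    K = np.linalg.solve(S, H @ P_b).T
    # Perturbed observations: y + eps_j, eps_j ~ N(0, R), one per member
    eps = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members).T
    Y_tilde = y[:, None] + eps
    X_new = X_b + K @ (Y_tilde - H @ X_b)
    return X_b, X_new

# Hypothetical 2-variable linear model and a single observation of the first variable.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])

def step(x):
    return A @ x

rng = np.random.default_rng(1)
X0 = rng.standard_normal((2, 50))                  # 50 initial ensemble members
H = np.array([[1.0, 0.0]])
R = np.array([[0.1]])
y = np.array([3.0])

X_b, X_a = enkf_step(X0, step, y, H, R, rng)
print("forecast mean:", X_b.mean(axis=1))
print("analysis mean:", X_a.mean(axis=1))
print("analysis sample covariance:\n", np.cov(X_a))
```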

Further approaches
• Different strategies to perform the analysis step in the
Ensemble Kalman filter
• Sampling variability in the Ensemble Kalman filter, especially if
the ensemble size is small
→ filter divergence; decrease in the contribution of the
observations
• Remedies:
1 Localization of the ensemble covariance matrix (e.g.
covariance tapering, etc.)
2 Inflation of the ensemble spread
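
Both remedies are simple matrix operations, sketched below under stated assumptions: the taper is a compactly supported Wendland function (one common choice; the slides do not name a specific taper), the state lives on a hypothetical one-dimensional grid, and the inflation is multiplicative.

```python
import numpy as np

def wendland_taper(dist, c):
    """Compactly supported correlation: equals 1 at distance 0 and 0 beyond c."""
    r = np.clip(dist / c, 0.0, 1.0)
    return (1.0 - r) ** 4 * (4.0 * r + 1.0)

def localize(P_hat, coords, c):
    """Covariance tapering: elementwise (Schur) product with the taper matrix."""
    dist = np.abs(coords[:, None] - coords[None, :])
    return P_hat * wendland_taper(dist, c)

def inflate(X, factor):
    """Multiplicative inflation: scale deviations from the ensemble mean."""
    mean = X.mean(axis=1, keepdims=True)
    return mean + factor * (X - mean)

# Toy setting: 10 grid points on a line, only 5 ensemble members.
coords = np.arange(10.0)
rng = np.random.default_rng(2)
X = rng.standard_normal((10, 5))
P_hat = np.cov(X)                      # noisy sample covariance

P_loc = localize(P_hat, coords, c=3.0) # spurious long-range covariances removed
X_infl = inflate(X, 1.1)               # ensemble variances grow by a factor 1.1**2

print("covariance between points 0 and 9 before/after:", P_hat[0, 9], P_loc[0, 9])
```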

Data fusion: spatial data
Wikle and Berliner, Technometrics, 2005
• Two sources of wind data: daily wind satellite data and
computer model output from a weather center for the period
15 September 1996-29 June 1997
• Data with different resolution → Change of support problem
• Satellite-based wind estimates from NASA Scatterometer
(NSCAT) at 0.5 degree resolution and not on a regular grid
• National Centers for Environmental Prediction (NCEP) analysis
of wind direction at 2.5 degree resolution and on a regular grid
• Goal: Predict surface streamfunction at a resolution of 1.0
degree

Wind data from satellite and from an analysis for December 26, 1996

• $Z$: measurement data from the two sources
• $Y$: true underlying process
• Adopt the modeling approach:
[Data | Process]
[Process | Parameters]
[Parameters]
• Goal: Infer upon the process $Y$
• Problem: The data have different spatial support

• Let:
1 $A_i$, $i = 1,\dots,n_a$
2 $B_j$, $j = 1,\dots,n_b$
3 $C_k$, $k = 1,\dots,n_c$
be non-overlapping sets such that:
$0 \le |A_i| < |B_j| < |C_k| < \infty$ for all $i, j, k$
• $Z_A \equiv (Z(A_1),\dots,Z(A_{n_a}))^\top$, observations on the subgrid
• $Z_C \equiv (Z(C_1),\dots,Z(C_{n_c}))^\top$, observations on the supergrid

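• $Y = \{Y(s) : s \in D \subset \mathbb{R}\}$: spatial process
• $$Y(S) = \begin{cases} \dfrac{1}{|S|} \displaystyle\int_S Y(s)\, ds & |S| > 0 \\[1ex] \mathrm{avg}\,\{Y(s) : s \in S\} & |S| = 0 \end{cases}$$
• $Y_A \equiv (Y(A_1),\dots,Y(A_{n_a}))^\top$, subgrid process
• $Y_C \equiv (Y(C_1),\dots,Y(C_{n_c}))^\top$, supergrid process
• $Y_B \equiv (Y(B_1),\dots,Y(B_{n_b}))^\top$, process on the prediction grid
Then:
• $Z_A$: observations of $Y_A$
• $Z_C$: observations of $Y_C$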

Data model
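• Model for $[Z_A, Z_C \mid Y_A, Y_C, Y_B, \theta_m]$
• Measurement error
$$Z_A = Y_A + \varepsilon_A, \qquad \varepsilon_A \sim N(0, \sigma_a^2 I_{n_a})$$
$$Z_C = Y_C + \varepsilon_C, \qquad \varepsilon_C \sim N(0, \sigma_c^2 I_{n_c})$$
• $$\begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \Big|\; Y_A, Y_C, \sigma_a^2, \sigma_c^2 \sim N\left( \begin{pmatrix} Y_A \\ Y_C \end{pmatrix},\; \Sigma_m = \begin{pmatrix} \sigma_a^2 I_{n_a} & 0 \\ 0 & \sigma_c^2 I_{n_c} \end{pmatrix} \right)$$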

Complete model
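• Data model:
$$\begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \Big|\; Y_B, \Sigma_m, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} Y_B,\; \Sigma_m + \Sigma(\phi) \right)$$
• Process model: $Y_B \sim N(\theta_B, \Sigma_B(\phi))$
• Parameters: $[\sigma_a^2, \sigma_c^2, \theta_B, \phi]$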
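
The slides do not spell out the matrices $G_A$ and $G_C$. As a hedged illustration only, the sketch below assumes a one-dimensional setting in which each supergrid cell is the union of two prediction-grid cells (so $G_C$ averages them) and each subgrid support is point-like and inherits the value of the prediction-grid cell that contains it (so $G_A$ is an incidence matrix). All locations and values are made up.

```python
import numpy as np

# Prediction grid B: 4 cells of unit width covering [0, 4)  (hypothetical)
n_b = 4

# Supergrid C: 2 cells, each the union of two B cells -> block averages
G_C = np.array([[0.5, 0.5, 0.0, 0.0],
                [0.0, 0.0, 0.5, 0.5]])

# Subgrid A: point-like supports at these coordinates (hypothetical)
a_locations = np.array([0.2, 0.7, 1.5, 3.9])
G_A = np.zeros((len(a_locations), n_b))
containing_cell = np.floor(a_locations).astype(int)
G_A[np.arange(len(a_locations)), containing_cell] = 1.0

# A process on the prediction grid and the implied means on the two supports
Y_B = np.array([1.0, 3.0, 5.0, 7.0])
mean_A = G_A @ Y_B      # mean of Z_A given Y_B
mean_C = G_C @ Y_B      # mean of Z_C given Y_B

print("mean on subgrid:  ", mean_A)
print("mean on supergrid:", mean_C)
```

Under this assumption the variability of the process below the resolution of the prediction grid is not carried by $G_A Y_B$; in the complete model it is absorbed by the extra covariance term $\Sigma(\phi)$.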

Example: streamfunction
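• Data:
1 NSCAT satellite data: $U_A$, $V_A$ ($n_a = 369$)
2 NCEP (numerical model output): $U_C$, $V_C$ ($n_c = 15$)
• Process: $u_B$, $v_B$
• Data model:
1 $$\begin{pmatrix} U_A \\ U_C \end{pmatrix} \Big|\; u_B, \sigma_u, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} u_B,\; \Sigma_u + \Sigma_m \right)$$
2 $$\begin{pmatrix} V_A \\ V_C \end{pmatrix} \Big|\; v_B, \sigma_v, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} v_B,\; \Sigma_v + \Sigma_m \right)$$
• Process model:
1 $$\begin{pmatrix} u_B \\ v_B \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_u \mathbf{1} \\ \mu_v \mathbf{1} \end{pmatrix},\; \Sigma_{uv} \right)$$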

• Interest in predicting the streamfunction $\psi$.
• Deterministic Poisson equation to determine streamfunction $\psi$
from winds:
$$\nabla^2 \psi = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}$$
$u$: east-west wind component, $v$: north-south wind
component
• Discretizing to a regular grid:
1 $\psi_I \mid \psi_{bc}, u, v \sim N\big(L^{-1}[D_x v - D_y u + L_{bc} \psi_{bc}],\, \Sigma_I\big)$
2 $\psi_{bc} \sim N(\mu_{bc}, \Sigma_{bc})$
• $\psi_I$: streamfunction at the interior grid locations
• $\psi_{bc}$: streamfunction at the boundary grid locations
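
To make the Poisson relation concrete, the sketch below checks it numerically for a hypothetical solid-body rotation, $u = -y$, $v = x$, whose streamfunction is $\psi = (x^2 + y^2)/2$ (so that $u = -\partial\psi/\partial y$ and $v = \partial\psi/\partial x$). The grid and the wind field are made up; the finite differences only play the role of the discrete operators $D_x$, $D_y$ and $L$ and are not the paper's discretization.

```python
import numpy as np

# Regular grid on [-1, 1] x [-1, 1]
x = np.linspace(-1.0, 1.0, 21)
y = np.linspace(-1.0, 1.0, 21)
X, Y = np.meshgrid(x, y)          # default 'xy' indexing: axis 0 is y, axis 1 is x

# Solid-body rotation winds
u = -Y                            # east-west component
v = X                             # north-south component

# Right-hand side of the Poisson equation: dv/dx - du/dy
rhs = np.gradient(v, x, axis=1) - np.gradient(u, y, axis=0)

# Left-hand side: 5-point discrete Laplacian of psi at the interior grid points
psi = 0.5 * (X**2 + Y**2)
h = x[1] - x[0]
laplacian = (psi[1:-1, 2:] + psi[1:-1, :-2] + psi[2:, 1:-1] + psi[:-2, 1:-1]
             - 4.0 * psi[1:-1, 1:-1]) / h**2

print("dv/dx - du/dy (constant):", rhs[0, 0])
print("Laplacian of psi (interior, constant):", laplacian[0, 0])
```

The Laplacian is only defined at interior points, which is why the boundary values $\psi_{bc}$ have to be modeled separately.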

Wind data (top row); posterior mean and realization from the posterior
distribution of the streamfunction for December 26, 1996 (bottom row)

Fuentes and Raftery, Biometrics, 2005
• Two sources of weekly average SO2 concentration data:
monitoring data and computer model output
• Data with different resolution → Change of support problem
• Monitoring data from CASTNet sites
• Output of a numerical model, Models-3, given as average
concentration over 36 km × 36 km grid cells
• Goal: Estimate true weekly average concentration of SO2

Fuentes and Raftery, Biometrics, 2005
Average SO2 concentration for the week of July 11, 1995

• Process: $Z(s)$ true underlying process
• Data:
1 $\hat{Z}(s)$: measurement from monitoring network (CASTNet)
2 $\tilde{Z}(B)$: numerical model output (Models-3)
• Goal: Infer upon the process $Z(s)$
• Problem: The data have different spatial support

Data model
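• Model for $\hat{Z}(s), \tilde{Z}(B) \mid Z(s), \theta_m$
• Measurement error
$$\hat{Z}(s) = Z(s) + e(s), \qquad e(s) \sim N(0, \sigma_e^2)$$
$$\tilde{Z}(B) = \frac{1}{|B|} \int_B \tilde{Z}(s)\, ds$$
$$\tilde{Z}(s) = a(s) + b(s) Z(s) + \delta(s), \qquad \delta(s) \sim N(0, \sigma_\delta^2)$$
where
1 $a(s)$: polynomial in $s$
2 $b(s) \equiv b$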

Process model
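• $Z(s) = \mu(s) + \varepsilon(s)$ with
1 $E(\varepsilon(s)) = 0$ and $\mathrm{Cov}(\varepsilon(s), \varepsilon(r)) = \sigma(s, r; \phi)$
2 $\mu(s)$: polynomial in $s$ with coefficients $\beta$
→ $Z(s) \sim GP(\mu(s), \Sigma)$
• Goal: Infer on $Z$ given $\hat{Z}$, $\tilde{Z}$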

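• $\hat{Z} = (\hat{Z}(s_1), \dots, \hat{Z}(s_n))^\top$
• $\tilde{Z} = (\tilde{Z}(B_1), \dots, \tilde{Z}(B_M))^\top$
$$\begin{pmatrix} \hat{Z} \\ \tilde{Z} \end{pmatrix} \sim N\left( \begin{pmatrix} \hat{\mu} \\ \tilde{a} + b \tilde{\mu} \end{pmatrix},\; \begin{pmatrix} \Sigma_C & \Sigma_{CM} \\ \Sigma_{CM}^\top & \Sigma_M \end{pmatrix} \right)$$
where the mean has components
1 $\hat{\mu} = (\mu(s_1), \dots, \mu(s_n))^\top$
2 $\tilde{a} = \Big( \frac{1}{|B_1|} \int_{B_1} a(s)\, ds, \dots, \frac{1}{|B_M|} \int_{B_M} a(s)\, ds \Big)^\top$
3 $\tilde{\mu} = \Big( \frac{1}{|B_1|} \int_{B_1} \mu(s)\, ds, \dots, \frac{1}{|B_M|} \int_{B_M} \mu(s)\, ds \Big)^\top$
and the covariance has blocks
1 $\Sigma_C$, $n \times n$ matrix: $(\Sigma_C)_{ij} = \sigma(s_i, s_j; \phi) + 1\{s_i \equiv s_j\}\, \sigma_e^2$
2 $\Sigma_{CM}$, $n \times M$ matrix: $(\Sigma_{CM})_{ik} = b \cdot \frac{1}{|B_k|} \int_{B_k} \sigma(s_i, v; \phi)\, dv$
3 $\Sigma_M$, $M \times M$ matrix:
$$(\Sigma_M)_{kl} = b^2 \cdot \frac{1}{|B_k| \cdot |B_l|} \int_{B_k} \int_{B_l} \sigma(u, v; \phi)\, du\, dv + 1\{B_k \equiv B_l\}\, \sigma_\delta^2$$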
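
The integrals in the cross-covariance block have no closed form in general and are approximated numerically. The sketch below approximates $(\Sigma_{CM})_{ik}$ with a systematic sample of 4 points per grid cell. The stationary exponential covariance, the function names, and the site and cell coordinates are illustrative assumptions (the paper uses a non-stationary covariance).

```python
import numpy as np

def exp_cov(p, q, sigma2, phi):
    """Exponential covariance between point sets p (n, 2) and q (m, 2)."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def cell_sample_points(lower_left, width):
    """Systematic sample of 4 points (a 2 x 2 pattern) inside a square cell."""
    offsets = np.array([0.25, 0.75]) * width
    return np.array([[lower_left[0] + ox, lower_left[1] + oy]
                     for ox in offsets for oy in offsets])

def point_to_block_cov(sites, cells, width, b, sigma2, phi):
    """Approximate b * (1/|B_k|) * integral over B_k of sigma(s_i, v) dv."""
    out = np.empty((len(sites), len(cells)))
    for k, lower_left in enumerate(cells):
        pts = cell_sample_points(lower_left, width)
        out[:, k] = b * exp_cov(sites, pts, sigma2, phi).mean(axis=1)
    return out

# Hypothetical layout: 2 monitoring sites and 3 unit grid cells in a row.
sites = np.array([[0.5, 0.5],
                  [2.5, 0.5]])
cells = np.array([[0.0, 0.0],
                  [1.0, 0.0],
                  [2.0, 0.0]])       # lower-left corners

Sigma_CM = point_to_block_cov(sites, cells, width=1.0, b=0.8, sigma2=2.0, phi=1.5)
print(Sigma_CM)
```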

Example: air pollution
Data:
1 Weekly average of SO2 concentration at n = 50 CASTNet
sites for the week of July 11, 1995
2 Weekly average of SO2 concentration at M = 81 × 87 grid cells
of 36 km × 36 km, output of Models-3 for the week of July 11, 1995
Other modeling details
• Stochastic integrals approximated by taking a systematic sample
of 4 points within each grid cell
• Degree of polynomials defining the mean trend $\mu(s)$ of $Z(s)$
and of the additive bias $a(s)$ of $\tilde{Z}(s)$ determined via RJMCMC
• Non-stationary covariance function for the underlying true
process Z(s)

Posterior predictive mean and posterior predictive SD for Z(s) for
the week of July 11, 1995

Data fusion: space-time data
Wikle et al., JASA 2001
• Extend the modeling idea of Wikle and Berliner (2005) to
account for time
• Daily wind data from two sources: satellite data (at higher
resolution) and computer model output (at a lower resolution)
• Goal: Predict winds at an intermediate resolution over a period
of 54 six-hour increments
• Accounted for the temporal dependence in the data by using
dynamic coefficients in the specification of the process driving
the observed data
• Avoided computing stochastic integrals!

• Data:
1 NSCAT satellite data: $U_{A,t}$, $V_{A,t}$ at time $t$
2 NCEP (numerical model output): $U_{C,t}$, $V_{C,t}$ at time $t$
→ $U_t = (U_{C,t}, U_{A,t})$ and $V_t = (V_{C,t}, V_{A,t})$: observed data at
time $t$
→ $\{U_t\}_{t=1}^{T} = (U_1, \dots, U_T)$, $\{V_t\}_{t=1}^{T} = (V_1, \dots, V_T)$
• Process:
• $u_t$, $v_t$ at time $t$ at $n_B$ prediction grid cells.
• Similar definition for $\{u_t\}_{t=1}^{T}$ and $\{v_t\}_{t=1}^{T}$

Data model
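$$\big[\{V_t\}_{t=1}^{T}, \{U_t\}_{t=1}^{T} \mid \{v_t\}_{t=1}^{T}, \{u_t\}_{t=1}^{T}, \theta\big] = \prod_{t=1}^{T} [V_t \mid v_t, \theta] \cdot [U_t \mid u_t, \theta]$$
• $V_t \mid v_t, \Sigma_t \sim N(K_t v_t, \Sigma_t)$
• $U_t \mid u_t, \Sigma_t \sim N(K_t u_t, \Sigma_t)$
1 $\Sigma_t$: diagonal matrix with entries equal to either $\sigma^2$ (satellite
obs), $\sigma_b^2$ (NCEP boundary grid cells) or $\sigma_I^2$ (NCEP interior
cells)
2 $K_t$: design matrix that maps the prediction grid cells to the
observation grid cells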

Process model
$$u_t = \mu_u + u_t^{E} + \tilde{u}_t$$
$$v_t = \mu_v + v_t^{E} + \tilde{v}_t$$
1 $\mu_u$: spatial mean for the $u$ wind component: $\mu_u = P_u \gamma_u$
(resp. for $\mu_v$) → $P_u$ design matrix (resp. $P_v$)
2 $u_t^{E}$: thin fluid approximation of the $u$ wind component:
$u_t^{E} = \Phi a_t^{u}$ (resp. for $v_t^{E}$) → $\Phi$ basis function
3 $\tilde{u}_t$: small scale motions of the $u$ wind component: $\tilde{u}_t = \Psi b_t^{u}$
(resp. for $\tilde{v}_t$) → $\Psi$ wavelet basis function

Parameters
• The $2n \times 1$ random vectors $a_t^{u}$, $a_t^{v}$ are modeled as dynamically
evolving in time but are independent between prediction grid
cells
• The $n \times 1$ random vectors $b_t^{u}$ and $b_t^{v}$ are modeled as
dynamically evolving in time and are independent between
prediction grid cells
• No need to compute stochastic integrals!
• Only temporal dependence is explicitly modeled
• Computationally feasible

Choi et al., Comp. Stat. and Data Analysis 2009
• Extend the modeling idea of Fuentes and Raftery (2005) to
account for time
• Daily average PM2.5 concentration from two sources:
monitoring data and computer model output
1 $\hat{Z}(s,t)$: observation from monitoring site $s$ at time $t$
2 $\tilde{Z}(B,t)$: model output at grid cell $B$ at time $t$
• Goal: Predict true daily average PM2.5 concentration
aggregated over counties at time t for health analysis
• Included the temporal dependence in the mean structure of
the underlying process

Data model
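• Model for $\hat{Z}(s,t), \tilde{Z}(B,t) \mid Z(s,t), \theta_m$
$$\hat{Z}(s,t) = Z(s,t) + e(s,t), \qquad e(s,t) \sim N(0, \sigma_e^2)$$
$$\tilde{Z}(B,t) = \frac{1}{|B|} \int_B \tilde{Z}(s,t)\, ds$$
$$\tilde{Z}(s,t) = a(s) + Z(s,t) + \delta(s,t), \qquad \delta(s,t) \sim N(0, \sigma_\delta^2)$$
Process model
$$Z(s,t) = M(s,t)\,\xi + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2)$$
• $M(s,t)$: vector of meteorological variables at site $s$ at time $t$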
McMillan et al., Environmetrics, 2009
• Propose a spatio-temporal model to combine monitoring data
and numerical model output
1 Daily average PM2.5 concentration from monitoring sites
during year 2001
2 Daily average PM2.5 concentration, output of CMAQ model
run at 12 km grid cell resolution (M = 213 × 188)
• Goal: Combine the two sources of data and predict true daily
average PM2.5 concentration for each day in 2001

• Process: $W_i$ true underlying process
• Data:
1 $X_{i,k}$: monitoring data
2 $Y_{i,k}$: CMAQ output
• $W_i$ defined on space-time grid cells: $i \in \{1,\dots,N\}$, where
$N = N_T \times N_P$, $N_T$ number of time points, $N_P$ number of grid
cells
• $X_{i,k}$: observed monitoring data for the $k$-th monitor
observation in cell $i$
• $Y_{i,k}$: CMAQ output in cell $i$ ($k = 1$)

Data model
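• Model for $[X_{i,k}, Y_{i,k} \mid W_i, \theta]$
Measurement error
$$X_{i,k} = W_i + \varepsilon_{i,k}, \qquad \varepsilon_{i,k} \sim N(0, \tau_X^2)$$
$$Y_{i,k} = D_i \beta + W_i + \delta_{i,k}, \qquad \delta_{i,k} \sim N(0, \tau_Y^2)$$
• $D_i$: vector of uniform B-splines over a regular 3-dimensional
lattice of $N_D$ knots
⟹ CMAQ bias for grid cell $i$: $D_i \beta = \sum_{j=1}^{N_D} D_{ij} \beta_j$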

Process model
$$W_i = \mu_{t(i)} + Z_i$$
• $t(i)$: temporal index of grid cell $i$
• $\mu_{t(i)}$ constant across space: $\mu_{t(i)} \sim N(0, \tau_\mu^2)$
• $Z$: space-time multivariate normal with a separable covariance
structure: autoregressive in time and conditionally
autoregressive (CAR) in space
⟹ $Z \mid \tau_Z^2, \rho \sim N\big(0,\; \tau_Z^2 \cdot (\Lambda_T(\rho) \otimes \Lambda_P)^{-1}\big)$
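
The practical benefit of the separable structure is that the large space-time precision matrix is a Kronecker product of two small matrices. The slide does not give the exact forms of $\Lambda_T(\rho)$ and $\Lambda_P$; the sketch below assumes a stationary AR(1) precision in time and a proper CAR precision $D - \alpha W$ on a small hypothetical neighborhood graph, purely for illustration.

```python
import numpy as np

def ar1_precision(n_t, rho):
    """Precision matrix of a stationary AR(1) process with unit innovation variance."""
    Lam = np.zeros((n_t, n_t))
    idx = np.arange(n_t)
    Lam[idx, idx] = 1.0 + rho**2
    Lam[0, 0] = Lam[-1, -1] = 1.0
    Lam[idx[:-1], idx[1:]] = -rho
    Lam[idx[1:], idx[:-1]] = -rho
    return Lam

def car_precision(W, alpha):
    """Proper CAR precision D - alpha * W, with D the diagonal matrix of neighbor counts."""
    D = np.diag(W.sum(axis=1))
    return D - alpha * W

# Hypothetical sizes: 3 time points, 4 grid cells arranged on a line.
n_t, rho, tau2 = 3, 0.6, 2.0
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

Lam_T = ar1_precision(n_t, rho)
Lam_P = car_precision(W, alpha=0.9)

# Space-time precision and covariance of Z; time is the slow index in the ordering
precision = np.kron(Lam_T, Lam_P) / tau2
covariance = tau2 * np.kron(np.linalg.inv(Lam_T), np.linalg.inv(Lam_P))

print("dimension of Z:", precision.shape[0])
```

Only the small factors are ever inverted: the inverse of a Kronecker product is the Kronecker product of the inverses.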

Daily mean levels for predicted surface, monitoring data and
CMAQ over Eastern US

Posterior predictive mean for (a) 4 July 2001 and (b) 24 December 2001

Berrocal et al., JABES, 2010
• Combines monitoring data and numerical model output
1 Daily 8-hr max ozone concentration from monitoring sites
during summer of 2001
2 Daily 8-hr max ozone concentration, output of CMAQ model
run at 12 km grid cell resolution (M = 213 × 188)
• Goal: Combine the two sources of data and "downscale"
numerical model output at point level
• Does not assume a "true" underlying process

• $Y(s,t)$: observation at site $s$ at time $t$
• $x(B,t)$: CMAQ output at grid cell $B$ at time $t$
For $s \in B$:
$$Y(s,t) = \tilde{\beta}_0(s,t) + \tilde{\beta}_1(s,t)\, x(B,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \sigma^2)$$
with $\tilde{\beta}_i(s,t) = \beta_{it} + \beta_i(s,t)$, for $i = 0, 1$.
• Temporal dependence in $\beta_{0t}$ and $\beta_{1t}$:
(i) $\beta_{0t}, \beta_{1t}$ nested within time
(ii) $\beta_{0t}, \beta_{1t}$ dynamic in time
• $\beta_0(s,t)$ and $\beta_1(s,t)$: correlated Gaussian processes that are either:
(i) nested within time OR (ii) dynamic in time

• Possible spatio-temporal models to combine the two data sources:

Model     $\beta_{0t}$, $\beta_{1t}$       $\beta_0(s,t)$, $\beta_1(s,t)$
Model 1   Independent across time    Constant in time
Model 2   Dynamic                    Constant in time
Model 3   Independent across time    Independent across time
Model 4   Dynamic                    Dynamic

[Figure: Ozone monitoring sites, 2001; test sites (black), validation sites (red); axes: longitude, latitude]
• Daily maximum 8-hour ozone
concentration (ppb): observations
(n = 803) and CMAQ model output
• Model output on 12-km grid cells
(M = 40,440)
• Fit models for May 1 - October 15, 2001
• 436 sites used to fit the model,
367 sites for validation

Data fusion: space-time
• National Ambient Air Quality Standard (NAAQS) for ozone is that
the 3-year rolling average of the annual fourth highest daily 8-hour
maximum ozone concentration be less than a given threshold
• Maps of the probability that the fourth highest ozone concentration
during the period May 1 - October 15, 2001 exceeds:
(a) 80 ppb (1997 standard)
(b) 75 ppb (2008 standard)
[Figure: two probability maps, panels (a) and (b); axes: longitude, latitude; color scale from 0.0 to 1.0]

Berrocal et al., Environmetrics, 2012
• Extended the 2010 downscaler model to allow for potential
spatial misalignment in the computer model output
1 Seasonal average temperature at 17 synoptic stations in
Sweden for the period December 1962-November 2007
2 Regional climate model output on a 12.5km × 12.5km grid for
the same period
• Goal: Assess the performance of the regional climate model.

RCM data
[Figure: maps of RCM output for DJF 2002 and MAM 2002, with Stockholm, Borlänge and Göteborg marked]
• Output of the Swedish
Meteorological and Hydrological
Institute (SMHI) Rossby
Centre Atmospheric (RCA)
RCM model
• Daily output for 2-m
temperature from
December 1, 1962 to
November 30, 2007, then
aggregated to quarterly
averages (DJF, MAM, JJA,
SON)
• Output at
12.5 km × 12.5 km grid
boxes

Observational data
[Figure: map of observation data for DJF 2002 at the 17 stations, with Stockholm, Borlänge and Göteborg marked]
• Observed daily average
temperature from 17 stations
in the SMHI network of
synoptic stations
• Period: December 1, 1962 to
November 30, 2007
• Daily data aggregated to
quarterly scale
• Three stations, Göteborg,
Stockholm and Borlänge,
held out for validation

Downscaling model
• Some notation:
• $B_1, \dots, B_g$: RCM model grid boxes with centroids $r_1, \dots, r_g$
• $x(B_1,t), x(B_2,t), \dots, x(B_g,t)$: RCM output of quarterly
average temperature for quarter $t = 1,\dots,T$ at grid boxes
$B_1, B_2, \dots, B_g$
• $Y(s,t)$: observed quarterly average temperature at station $s$
for quarter $t = 1,\dots,T$
• The 2010 downscaling model applied to these data would be: for $s$ in
$B$ and $t = 1,\dots,T$
$$Y(s,t) = \tilde{\beta}_{0,t}(s,t) + \tilde{\beta}_{1,t}(s,t)\, x(B,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \overset{iid}{\sim} N(0, \tau^2)$$

Downscaling model
• The 2012 model starts from the observation that we could
write: for $t = 1,\dots,T$
$$Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1\, \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2)$$
with
• $\tilde{x}(s,t)$: spatio-temporal weighted average of the RCM output:
$$\tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\, x(B_k,t)$$
• $\tilde{\beta}_1(s,t)$ replaced by $\beta_1$ for identifiability reasons
• The weights $w_k(s,t)$ should be:
• positive and sum up to 1
• spatially correlated within sites and across sites

Downscaling model
• If $r_1, \dots, r_g$ are the centroids of the RCM grid boxes, we can
take the weights $w_k(s,t)$ to be
$$w_k(s,t) = \frac{K(|s - r_k|; \lambda)}{\sum_{l=1}^{g} K(|s - r_l|; \lambda)}$$
• $K(\cdot\,; \lambda)$: kernel function with bandwidth $\lambda$.
For example: $K(|s - r_k|; \lambda) = \exp\big(-\frac{|s - r_k|}{\lambda}\big)$.
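
A minimal sketch of these weights with the exponential kernel given above. The 3 × 3 layout of centroids, the site location and the RCM values are made-up numbers for illustration.

```python
import numpy as np

def kernel_weights(s, centroids, lam):
    """Normalized exponential-kernel weights of all grid-box centroids for site s."""
    d = np.linalg.norm(centroids - s, axis=1)   # |s - r_k|
    k = np.exp(-d / lam)                        # K(|s - r_k|; lambda)
    return k / k.sum()

# Hypothetical 3 x 3 lattice of grid-box centroids and one station
centroids = np.array([[x, y] for x in range(3) for y in range(3)], dtype=float)
s = np.array([0.2, 0.1])

w = kernel_weights(s, centroids, lam=0.5)

# Weighted average of (made-up) RCM output at the station
x_B = np.arange(9.0)
x_tilde = w @ x_B

print("weights:", np.round(w, 3))
print("weighted RCM output at s:", x_tilde)
```

A small bandwidth concentrates the weight on the grid box nearest to the station; a large bandwidth approaches a plain average over all grid boxes.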

Downscaling model
We consider RCM output and observational data:
[Figure: map of RCM output next to map of observational data at the stations, with Stockholm, Borlänge and Göteborg marked]
We establish the spatial linear model for $s \in B$ and $t = 1,\dots,T$ [equation missing in source]

Downscaling model
For $t = 1,\dots,T$, the weight $w_k(s,t)$ is:
[Figure: map of the weights $w_k(s,t)$ over the domain, with Stockholm, Borlänge and Göteborg marked, and a zoomed-in view around one station; color scale from 0.0 to 0.5]

Downscaling model
• To allow for the weights $w_k(s,t)$ to be directional, we modify
the expression
$$w_k(s,t) = \frac{K(|s - r_k|; \lambda)}{\sum_{l=1}^{g} K(|s - r_l|; \lambda)}$$
to
$$w_k(s,t) = \frac{K(|s - r_k|; \lambda) \cdot \exp(Q(r_k,t))}{\sum_{l=1}^{g} K(|s - r_l|; \lambda) \cdot \exp(Q(r_l,t))}$$
where for $t = 1,\dots,T$, $Q(r,t)$ is a latent stationary mean-zero
spatial Gaussian process with variance 1 and exponential
correlation function.
• For $t = 1,\dots,T$, the range $\phi$ of the latent spatial process
$Q(r,t)$ influences the directionality of the weights.
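
The sketch below draws one realization of the latent process at the centroids and forms the directional weights for a single time point. It assumes the exponential kernel from the earlier slide and an exponential correlation $\exp(-d/\phi)$; the function names, the lattice of centroids and the parameter values are hypothetical.

```python
import numpy as np

def sample_latent_Q(centroids, phi, rng):
    """Draw Q(r_1), ..., Q(r_g): mean-zero, unit-variance GP, exponential correlation."""
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    C = np.exp(-d / phi)
    L = np.linalg.cholesky(C + 1e-10 * np.eye(len(centroids)))  # small jitter for stability
    return L @ rng.standard_normal(len(centroids))

def directional_weights(s, centroids, lam, Q):
    """Kernel weights tilted by exp(Q(r_k)), then normalized to sum to 1."""
    d = np.linalg.norm(centroids - s, axis=1)
    k = np.exp(-d / lam) * np.exp(Q)
    return k / k.sum()

# Hypothetical 3 x 3 lattice of centroids and one station
centroids = np.array([[x, y] for x in range(3) for y in range(3)], dtype=float)
s = np.array([1.0, 1.0])
rng = np.random.default_rng(3)

Q = sample_latent_Q(centroids, phi=2.0, rng=rng)
w_dir = directional_weights(s, centroids, lam=0.5, Q=Q)
w_iso = directional_weights(s, centroids, lam=0.5, Q=np.zeros(len(centroids)))

print("isotropic weights:  ", np.round(w_iso, 3))
print("directional weights:", np.round(w_dir, 3))
```

With $Q \equiv 0$ the weights reduce to the purely distance-based ones; a spatially correlated $Q$ raises the weights of whole neighborhoods of grid boxes, which is what makes the weighting directional.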

Downscaling model
• Finally, the downscaling model is: for $s$ and $t = 1,\dots,T$:
$$Y(s,t) = \tilde{\beta}_{0,t}(s) + \beta_1\, \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2)$$
• $\tilde{\beta}_{0,t}(s) = \beta_{0,t} + \beta_0(s,t)$ with $\beta_0(s,t)$ stationary mean-zero
Gaussian spatial process with time-varying range parameter.
• $\tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\, x(B_k,t)$
• $w_k(s,t) = \dfrac{K(|s - r_k|; \lambda) \cdot \exp(Q(r_k,t))}{\sum_{l=1}^{g} K(|s - r_l|; \lambda) \cdot \exp(Q(r_l,t))}$
• $Q(r,t)$ is a latent stationary mean-zero spatial Gaussian
process with variance 1 and exponential correlation function
with range parameter $\phi$.
• For $t = 1,\dots,T$, the calibration parameters, $\beta_{0,t}$, $\beta_0(s,t)$, are
assumed to be independent in time, and so is the latent
process $Q(r,t)$.

Predictions at point level
• We predicted quarterly average temperature at three reserved
stations and compared them with:
1 observed data
2 the quarterly average temperature, output of the RCM at the
grid box containing the station.
[Figure: map of the 17 numbered stations, with Stockholm, Borlänge and Göteborg marked; axes: longitude, latitude]

Predictions at Borlänge
Black line: observed data
Blue line: downscaling model prediction
Red line: RCM output
Magenta line: upscaling model prediction
[Figure: time series at Borlänge, 1970-2000, in four panels: Winter, Spring, Summer, Autumn; horizontal axis: year]

Predictions at Stockholm
Black line: observed data
Blue line: downscaling model prediction
Red line: RCM output
Magenta line: upscaling model prediction
[Figure: time series at Stockholm, 1970-2000, in four panels: Winter, Spring, Summer, Autumn; horizontal axis: year]

Spatial differences
[Figure: eight maps with the 17 numbered stations, in pairs "Downscaling Climate" and "Upscaling Climate" for Winter, Spring, Summer and Autumn 2002]

Sahu et al., JRSS Series C, 2009
• Combines monitoring data and numerical model output to predict
wet chemical deposition
1 Weekly nitrate (resp. sulfate) deposition for year 2001 at
monitoring sites (n = 152)
2 Weekly precipitation data for year 2001 at monitoring sites
3 Weekly nitrate (resp. sulfate) deposition, output of CMAQ
model run at 12 km grid cell resolution (M = 33,390)
• Goal: Combine the sources of data and predict weekly, annual
and seasonal wet deposition in the Eastern US for 2001

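Data:
1 $P(s,t)$: observed precipitation at $s$ at time $t$
2 $Z(s,t)$: observed deposition at $s$ at time $t$
3 $Q(B,t)$: CMAQ model output at grid cell $B$ at time $t$
Data model:
$$P(s,t) = \begin{cases} \exp(U(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$Z(s,t) = \begin{cases} \exp(Y(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$Q(B,t) = \begin{cases} \exp(X(B,t)) & \text{if } \tilde{V}(B,t) > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$X(B,t) = \gamma_0 + \gamma_1 \tilde{V}(B,t) + \psi(B,t), \qquad \psi(B,t) \sim N(0, \sigma_\psi^2)$$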

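Process model:
1 $U(s,t)$: process driving precipitation at $s$ at time $t$
2 $Y(s,t)$: process driving deposition at $s$ at time $t$
3 $V(s,t)$: latent atmospheric process
4 $\tilde{V}(B,t)$: process driving the log-CMAQ output at $B$ at time $t$
$$U(s,t) = \alpha_0 + \alpha_1 V(s,t) + \delta(s,t), \qquad \delta_t \sim GP(0, \Sigma_\delta)$$
$$Y(s,t) = \beta_0 + \beta_1 U(s,t) + \beta_2 V(s,t) + [b_0 + b_1(s) X(B,t)] + \eta(s,t) + \varepsilon(s,t)$$
$$V(s,t) = \tilde{V}(B,t) + \nu(s,t), \qquad \nu(s,t) \sim N(0, \sigma_\psi^2)$$
$$\tilde{V}(B,t) = \rho\, \tilde{V}(B,t-1) + \zeta(B,t), \qquad \zeta(B,t) \sim \mathrm{CAR}$$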

Monitoring data and validation sites

Annual total precipitation in 2001

Posterior predictive mean for b(s) for (a) sulfate (b) nitrate

(a) Posterior predictive annual mean for nitrate and (b) length of
predictive interval
