CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data Fusion - Veronica Berrocal, Sep 26, 2017
1. Data fusion approaches
(for Earth-systems data)
Veronica J. Berrocal
University of Michigan
Department of Biostatistics
SAMSI course
Fall 2017
Veronica J. Berrocal Data fusion
2. Outline
• Introduction
• Data assimilation
• Optimal Interpolation
• Variational methods
• Kalman filter
• Further approaches
• Data fusion in the statistical literature
• Spatial data
• Example: Wikle and Berliner, 2005
• Example: Fuentes and Raftery, 2005 (Bayesian melding)
• Space-time data
• Example: Wikle et al., 2001
• Example: Choi et al., 2009
• Example: McMillan et al., 2009
• Example: Berrocal et al., 2010 and 2012
• Example: Sahu et al., 2009
3. Definitions
• Data fusion refers to the statistical technique used to combine
data from different sources
• If one of the sources is the output of a computer model
→ Data assimilation
• Data assimilation: term coined in the atmospheric science
community
• Several definitions
• "approach for fusing data (observations) with prior knowledge
(e.g. mathematical representations of physical laws; model
output) to obtain an estimate of the distribution of the true
state of the process" (Wikle and Berliner, 2006)
4. Why data fusion?
• The evolution in time of many geophysical processes (e.g.
atmosphere, etc.) can be described by systems of partial
differential equations
• As an example, numerical weather forecasts are obtained by
running forward in time computer models that simulate the
evolution of the atmosphere in time
• The equations are solved numerically, by discretizing both
space and time
• It is necessary to specify initial conditions, and, at times,
boundary conditions
• Forecasts are highly sensitive to the initial conditions
5. Data fusion: an old problem
• Often, observations of the initial state are not available: this
was recognized by mathematicians and astronomers, among
them Euler, Lagrange and Laplace.
• In particular, Gauss elaborated on how available observations of
the physical system were not easily translatable into initial
conditions and stated
"[...] since all our observations and measurements are nothing more
than approximations to the truth, the same must be true of all
calculations resting on them, and the highest aim of all computations
made concerning concrete phenomena must be to approximate, as
nearly as practicable, to the truth. But this can be accomplished in
no other way than by suitable combination of more observations
than the number absolutely requisite for the determination of
unknown quantities." (Theory of Motion of Heavenly Bodies)
7. Numerical weather prediction models
• $x_t$: state of the atmosphere at time $t$; $M_t$: numerical weather
prediction model at time $t$
• $M_t$ consists of a set of partial differential equations
[Figure: three-dimensional grid (longitude, latitude, height) representing the state $x_t$]
8. Numerical weather prediction models
• $x_t$: state of the atmosphere at time $t$; $M_t$: numerical weather
prediction model at time $t$
• $M_t$ consists of a set of partial differential equations
[Figure: the model $M_t$ maps the gridded state $x_t$ (longitude, latitude, height) to the state $x_{t+1}$]
$$x_{t+1} = M_t(x_t)$$
(state-space model)
9. It’s an initial-value problem
• In order to obtain a skillful forecast, it is necessary that:
• $M_t$ is a realistic representation of the atmosphere
• the vector $x_t$ of state-space variables is known accurately
• We will assume that the atmospheric model $M_t$ approximates
well the evolution in time of the atmosphere
• We will focus on how to determine $x_t$
10. Determining the initial conditions
[Figure: three-dimensional grid (longitude, latitude, height)]
• At each time $t$, $x_t$ is a vector of order $n = 10^7$.
11. Determining the initial conditions
[Figure: three-dimensional grid (longitude, latitude, height)]
• At each time $t$, $x_t$ is a vector of order $n = 10^7$.
• For any time window around $t$, $(t - \delta t, t + \delta t)$, there are
typically $p = 10^5$ observations $y_t$ of the atmosphere.
• The observations might not refer to the same variables as the
state-space variables.
• Data assimilation integrates the two sources of information: a
short-range forecast (background or first guess), $x_t^{(b)}$, with
the observations, $y_t$.
12. Data assimilation
[E. Kalnay (2003)]
• Background or first guess: $x_t^{(b)}$.
• Global analysis: data assimilation of the background, $x_t^{(b)}$, with the
observations, $y_t$.
13. Data assimilation approaches
• There are several methods for data assimilation. They differ
mainly in whether the observations are integrated
sequentially or not, and in whether the model is assumed to be
perfect or stochastic:
• Optimal Interpolation
• Variational methods: 3D-Var and 4D-Var
• Kalman filtering: Kalman filter and Ensemble Kalman filter
• They all hypothesize that at time $t$ there are:
1 a true unknown state of the atmosphere: $x_t$
2 a background field: $x_t^{(b)}$
3 a vector of observations: $y_t$
• Goal: combine $x_t^{(b)}$ and $y_t$ to determine the best
approximation, or analysis, $x_t^{(a)}$, to $x_t$
14. Data assimilation: assumptions
• $x_t$: true state of the atmosphere at time $t$
• $x_t^{(b)}$: background field at time $t$
$$x_t^{(b)} = x_t + \varepsilon_t^{(b)}, \qquad \varepsilon_t^{(b)} \sim N_n(0, P^{(b)})$$
• $y_t$: observations at time $t$
$$y_t = \mathcal{H}(x_t) + \varepsilon_t^{(o)} \approx H_t x_t + \varepsilon_t^{(o)}, \qquad \varepsilon_t^{(o)} \sim N_p(0, R)$$
$\mathcal{H}$: observation operator, assumed to be linear (or
approximated by a linear operator) and represented at time $t$ by the matrix
$H_t$
• $x_t^{(a)}$: analysis at time $t$
$$x_t^{(a)} = x_t + \varepsilon_t^{(a)}, \qquad \varepsilon_t^{(a)} \sim N_n(0, P^{(a)})$$
15. Optimal Interpolation
• We want to express $x_t^{(a)}$ as a linear combination of $x_t^{(b)}$ and $y_t$:
$$x_t^{(a)} = a_1 x_t^{(b)} + a_2 y_t$$
so that $x_t^{(a)}$ is unbiased and $a_1$ and $a_2$ minimize the mean
squared error
• Using the same approach as in least squares, we assume that
$x_t^{(a)}$ is given by
$$x_t^{(a)} = x_t^{(b)} + W (y_t - H_t x_t^{(b)})$$
• Goal: determine the matrix $W$ that minimizes the expected sum of
squares of the analysis error, $\varepsilon_t^{(a)}$
16. Optimal Interpolation
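• $x_t^{(a)} = x_t^{(b)} + W (y_t - H_t x_t^{(b)})$
• Goal: determine $W$ so that, with $\varepsilon_t^{(a)} = \varepsilon_t^{(b)} + W (y_t - H_t x_t^{(b)})$,
$$E\left(\varepsilon_t^{(a)\top} \varepsilon_t^{(a)}\right) = E\left\{ \left[ W (y_t - H_t x_t^{(b)}) + \varepsilon_t^{(b)} \right]^\top \left[ W (y_t - H_t x_t^{(b)}) + \varepsilon_t^{(b)} \right] \right\}$$
is minimized
• Then: $W = P^{(b)} H_t^\top (R + H_t P^{(b)} H_t^\top)^{-1}$
• The optimal weight matrix $W$ is also called the gain matrix
• The covariance matrix, $P^{(a)}$, of the analysis error, $\varepsilon_t^{(a)}$, is:
$$P^{(a)} = (I_n - W H_t) P^{(b)}$$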
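The Optimal Interpolation update above can be sketched numerically. This is a minimal NumPy illustration, not an operational scheme: the function name `oi_analysis` and the toy numbers (three state variables, two of them observed directly, identity covariances) are assumptions made only for this example.

```python
import numpy as np

def oi_analysis(x_b, P_b, y, H, R):
    """Optimal Interpolation update: returns analysis, gain matrix, analysis covariance."""
    S = R + H @ P_b @ H.T                # total error covariance in observation space
    # W = P_b H^T S^{-1}; solve instead of inverting (valid because S and P_b are symmetric)
    W = np.linalg.solve(S, H @ P_b).T
    x_a = x_b + W @ (y - H @ x_b)        # first guess plus gain times innovation
    P_a = (np.eye(len(x_b)) - W @ H) @ P_b
    return x_a, W, P_a

# Toy problem (illustrative numbers only): 3 state variables, 2 observed directly.
x_b = np.array([1.0, 2.0, 3.0])          # background
P_b = np.eye(3)                          # background error covariance
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])          # observe components 1 and 3
R = np.eye(2)                            # observation error covariance
y = np.array([2.0, 2.0])                 # observations

x_a, W, P_a = oi_analysis(x_b, P_b, y, H, R)
print(x_a)            # analysis
print(np.diag(P_a))   # analysis error variances
```

With equal background and observation variances and no background cross-covariances, the analysis lands halfway between first guess and observation for the observed components and leaves the unobserved component unchanged.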
17. Optimal Interpolation
• The analysis is obtained by adding to the first guess, $x_t^{(b)}$, the
product of the optimal weight matrix and the innovation,
$y_t - H_t x_t^{(b)}$
• The optimal weight matrix, $W$, is given by the forecast error
covariance in the observation space ($P^{(b)} H_t^\top$) multiplied
by the inverse of the total error covariance ($R + H_t P^{(b)} H_t^\top$)
• If the observation operator, $H_t$, is a linear operator (or an
interpolator), then
Optimal Interpolation = Kriging
18. Optimal Interpolation
• The observation operator $\mathcal{H}$ is a linear operator represented at
time $t$ by the matrix $H_t$
• Suppose that we assume the following:
• Prior distribution: $x_t \sim N_n(x_t^{(b)}, P^{(b)})$
• Likelihood: $y_t \mid x_t \sim N_p(H_t x_t, R)$
• Posterior distribution: $x_t \mid y_t \sim N_n(E(x_t \mid y_t), \mathrm{Var}(x_t \mid y_t))$
$$E(x_t \mid y_t) = x_t^{(b)} + W (y_t - H_t x_t^{(b)})$$
$$\mathrm{Var}(x_t \mid y_t) = (I_n - W H_t) P^{(b)}$$
with $W$ as in Optimal Interpolation.
19. Variational methods: 3D-Var
• The true state of the atmosphere, $x_t$, is estimated by minimizing
a scalar cost function $J(x_t)$:
$$J(x_t) = \frac{1}{2} (y_t - H_t x_t)^\top R^{-1} (y_t - H_t x_t) + \frac{1}{2} (x_t - x_t^{(b)})^\top (P^{(b)})^{-1} (x_t - x_t^{(b)})$$
• $R$: observation error covariance matrix
• $P^{(b)}$: forecast error covariance matrix
• Formally, the solution to the 3D-Var minimization problem is the
same as the solution to the Optimal Interpolation problem
• The solution to 3D-Var is the posterior mean in the case of a
Gaussian prior for $x_t$ and a Gaussian likelihood with a linear
observation operator $H_t$.
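The equivalence between 3D-Var and Optimal Interpolation can be checked numerically. The sketch below is only an illustration under assumed, randomly generated toy matrices: setting the gradient of $J$ to zero gives the linear system $(H_t^\top R^{-1} H_t + (P^{(b)})^{-1}) x = H_t^\top R^{-1} y_t + (P^{(b)})^{-1} x_t^{(b)}$, whose solution is compared with the gain form. The names `cost` and `three_d_var` are hypothetical.

```python
import numpy as np

def cost(x, x_b, P_b, y, H, R):
    """3D-Var cost function J(x)."""
    r_o = y - H @ x
    r_b = x - x_b
    return 0.5 * r_o @ np.linalg.solve(R, r_o) + 0.5 * r_b @ np.linalg.solve(P_b, r_b)

def three_d_var(x_b, P_b, y, H, R):
    """Minimizer of J, obtained by setting the gradient of J to zero."""
    P_inv = np.linalg.inv(P_b)
    R_inv = np.linalg.inv(R)
    A = H.T @ R_inv @ H + P_inv
    rhs = H.T @ R_inv @ y + P_inv @ x_b
    return np.linalg.solve(A, rhs)

# Toy matrices (assumptions for illustration only).
rng = np.random.default_rng(0)
n, p = 4, 2
A0 = rng.normal(size=(n, n))
P_b = A0 @ A0.T + n * np.eye(n)      # symmetric positive definite
R = 0.5 * np.eye(p)
H = rng.normal(size=(p, n))
x_b = rng.normal(size=n)
y = rng.normal(size=p)

x_var = three_d_var(x_b, P_b, y, H, R)

# Optimal Interpolation form of the same estimate.
W = P_b @ H.T @ np.linalg.inv(R + H @ P_b @ H.T)
x_oi = x_b + W @ (y - H @ x_b)

print(np.allclose(x_var, x_oi))
```

In realistic problems neither form is computed with dense inverses; 3D-Var is usually minimized iteratively, which this small example does not attempt to show.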
20. Variational methods: 4D-Var
• The true state of the atmosphere is estimated by minimizing
a scalar cost function that allows for observations to be
distributed within a time interval $(t_0, t_N)$:
$$J(x_{t_0}) = \frac{1}{2} \sum_{i=0}^{N} (y_{t_i} - H_{t_i} x_{t_i})^\top R_i^{-1} (y_{t_i} - H_{t_i} x_{t_i}) + \frac{1}{2} (x_{t_0} - x_{t_0}^{(b)})^\top (P_0^{(b)})^{-1} (x_{t_0} - x_{t_0}^{(b)})$$
• $R_i$: observation error covariance matrix at time $t_i$
• $P_0^{(b)}$: forecast error covariance matrix at the start of the period
• The cost function $J(x_{t_0})$ is minimized with respect to the initial
state of the atmosphere, $x_{t_0}$; each later state $x_{t_i}$ is obtained by
propagating $x_{t_0}$ forward with the model
21. Assimilation via Kalman Filter
• The numerical model is imperfect:
$$x_{t_i} = M_{t_{i-1}}(x_{t_{i-1}}) + \eta_{t_i}, \qquad i = 1,\dots,N$$
with $\eta_{t_i} \sim N(0, Q_i)$
• The observations are used sequentially in the time interval
$(t_1, t_N)$.
• At each time $t_i$ two operations are performed sequentially:
1 Forecast step
2 Analysis/assimilation step
22. Assimilation via Kalman Filter
• Forecast step:
1 Derive the forecast or background at time $t_i$: $x_{t_i}^{(b)} = M_{t_{i-1}}(x_{t_{i-1}}^{(a)})$.
2 Assuming that $M_t$ can be linearized and represented by the
matrix $\mathbf{M}_t$, compute the covariance matrix of the background error at
time $t_i$: $P_i^{(b)} = \mathbf{M}_{t_{i-1}} P_{i-1}^{(a)} \mathbf{M}_{t_{i-1}}^\top + Q_i$.
• Analysis step:
1 Compute the Kalman gain matrix at time $t_i$:
$$K_i = P_i^{(b)} H_i^\top (R_i + H_i P_i^{(b)} H_i^\top)^{-1}$$
2 Derive the analysis at time $t_i$, $x_{t_i}^{(a)}$:
$$x_{t_i}^{(a)} = x_{t_i}^{(b)} + K_i (y_{t_i} - H_i x_{t_i}^{(b)})$$
3 Compute the covariance matrix of the analysis error at time $t_i$:
$$P_i^{(a)} = (I_n - K_i H_i) P_i^{(b)}$$
If $M_{t_i}$ and $H_{t_i}$ are not linear → Extended Kalman Filter
23. Kalman Filter
• Suppose that for each $i = 1,\dots,N$:
• measurement equation:
$$y_{t_i} = H_i x_{t_i} + \varepsilon_{t_i}, \qquad \varepsilon_{t_i} \sim N(0, R_i)$$
• process/transition equation:
$$x_{t_i} = M_{t_i} x_{t_{i-1}} + \eta_{t_i}, \qquad \eta_{t_i} \sim N(0, Q_{t_i})$$
with $x_{t_0} \sim N(x_{t_0}^{(b)}, P_{t_0}^{(b)})$
24. Kalman Filter
• Let $x_{t_0:t_i} \equiv \{x_{t_0}, x_{t_1}, \dots, x_{t_i}\}$ and $y_{t_1:t_i} \equiv \{y_{t_1}, y_{t_2}, \dots, y_{t_i}\}$.
• Let $x_{t_i}^{(a)}$ be the analysis
• Let $x_{t_i}^{(f)}$ denote the forecast
• For $i = 1,\dots,N$:
• Forecast step
1 $x_{t_i}^{(f)} = E(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} x_{t_{i-1}}^{(a)}$
2 $P_{t_i}^{(f)} = \mathrm{Var}(x_{t_i} \mid y_{t_1:t_{i-1}}) = M_{t_i} P_{t_{i-1}}^{(a)} M_{t_i}^\top + Q_{t_i}$
• Analysis step
1 $K_{t_i} = P_{t_i}^{(f)} H_i^\top (R_i + H_i P_{t_i}^{(f)} H_i^\top)^{-1}$
2 $x_{t_i}^{(a)} = x_{t_i}^{(f)} + K_{t_i} (y_{t_i} - H_i x_{t_i}^{(f)})$
3 $P_{t_i}^{(a)} = (I_n - K_{t_i} H_i) P_{t_i}^{(f)}$
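The forecast and analysis recursions above can be written as a short loop. The following NumPy sketch assumes, for simplicity, that $M$, $H$, $Q$ and $R$ do not change with time; the function name `kalman_filter` and the scalar toy example are illustrative assumptions.

```python
import numpy as np

def kalman_filter(ys, M, H, Q, R, x0, P0):
    """Run the forecast/analysis recursion; returns all analyses and the last P_a."""
    x_a, P_a = x0, P0
    analyses = []
    for y in ys:
        # Forecast step
        x_f = M @ x_a
        P_f = M @ P_a @ M.T + Q
        # Analysis step
        K = P_f @ H.T @ np.linalg.inv(R + H @ P_f @ H.T)   # Kalman gain
        x_a = x_f + K @ (y - H @ x_f)
        P_a = (np.eye(len(x_a)) - K @ H) @ P_f
        analyses.append(x_a)
    return np.array(analyses), P_a

# Scalar toy example (assumed values): a static state observed twice with unit noise.
M = np.array([[1.0]])
H = np.array([[1.0]])
Q = np.array([[0.0]])
R = np.array([[1.0]])
x0 = np.array([0.0])
P0 = np.array([[1.0]])
ys = np.array([[1.0], [1.0]])

analyses, P_last = kalman_filter(ys, M, H, Q, R, x0, P0)
print(analyses.ravel(), P_last)
```

In this toy case the gain is 1/2 at the first step and 1/3 at the second, so each new observation moves the analysis less as the analysis variance shrinks.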
25. Kalman Filter/Extended Kalman Filter
• In the case of a linear state-space model $M_t$ and a linear
observation operator $\mathcal{H}$, the Kalman filter can be interpreted
within a Bayesian framework.
• If, at time $t_i$, we assume:
• $x_{t_i} \sim N(x_{t_i}^{(b)}, P_i^{(b)})$
• $y_{t_i} \mid x_{t_i} \sim N(H_i x_{t_i}, R_i)$
• Then the analysis $x_{t_i}^{(a)}$ is the posterior mean, $E(x_{t_i} \mid y_{t_i})$, and
the analysis covariance matrix $P_{t_i}^{(a)}$ is the posterior variance,
$\mathrm{Var}(x_{t_i} \mid y_{t_i})$
• On the other hand, the forecast step consists of deriving
$$p(x_{t_{i+1}} \mid y_{t_i}) = \int p(x_{t_{i+1}} \mid x_{t_i}) \, p(x_{t_i} \mid y_{t_i}) \, dx_{t_i}$$
26. Kalman Filter/Extended Kalman Filter
• It is the "gold standard" of data assimilation
• Even with a poor initial guess of the state of the atmosphere,
it should provide the best linear unbiased estimate of the state
of the atmosphere
• Problems if the system is unstable
• Computationally expensive! The matrix operations to
compute $P_i^{(b)}$ and $P_i^{(a)}$ involve matrices of order $n \approx 10^7$
• Nonlinear dynamics: if $M_{t_i}$ is non-linear, the linear approximation
may not perform well
27. Ensemble Kalman Filter
• Main idea: Use an ensemble of system states as a discrete
approximation to the distribution of $x_{t_i}$
• Each ensemble member is propagated forward in time using
$M_{t_i}$
• The mean and covariance matrix of the new ensemble are
used to approximate the forecast distribution
• Similar to a particle filter, with the ensemble members being the
"particles"
• The same set of observations is assimilated into each ensemble
member
28. Ensemble Kalman Filter
• Let $x_{t_0,j}^{(b)}$, $j = 1,\dots,M$, be $M$ ensemble members
• Forecast step:
1 Derive the forecast ensemble members at time $t_i$:
$$x_{t_i,j}^{(b)} = M_{t_{i-1}}(x_{t_{i-1},j}^{(a)}), \qquad j = 1,\dots,M$$
2 Compute the sample covariance matrix of the background error at
time $t_i$: $\hat{P}_i^{(b)}$
• Analysis step:
1 Compute the Kalman gain matrix at time $t_i$:
$$K_i = \hat{P}_i^{(b)} H_i^\top (R_i + H_i \hat{P}_i^{(b)} H_i^\top)^{-1}$$
2 Derive the analysis ensemble members at time $t_i$:
$$x_{t_i,j}^{(a)} = x_{t_i,j}^{(b)} + K_i (\tilde{y}_{t_i,j} - H_i x_{t_i,j}^{(b)})$$
where $\tilde{y}_{t_i,j} = y_{t_i} + \varepsilon_j$, $\varepsilon_j \sim N(0, R_i)$
3 Compute the sample covariance matrix of the analysis error at
time $t_i$, $\hat{P}_i^{(a)}$
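One cycle of the Ensemble Kalman filter with perturbed observations, as listed above, might look as follows in NumPy. This is a hedged sketch: `enkf_step`, the identity dynamics, the ensemble size of 200 and all numerical values are assumptions chosen only to make the example small and reproducible.

```python
import numpy as np

def enkf_step(X_prev, y, model, H, R, rng):
    """One forecast/analysis cycle. X_prev has shape (ensemble size, n)."""
    # Forecast step: propagate every member with the (possibly nonlinear) model.
    X_b = np.array([model(x) for x in X_prev])
    P_b = np.atleast_2d(np.cov(X_b, rowvar=False))        # sample background covariance
    # Analysis step: gain from the sample covariance, then update each member
    # against its own perturbed copy of the observations.
    K = P_b @ H.T @ np.linalg.inv(R + H @ P_b @ H.T)
    eps = rng.multivariate_normal(np.zeros(len(y)), R, size=len(X_b))
    Y_pert = y + eps
    X_a = X_b + (Y_pert - X_b @ H.T) @ K.T
    return X_a, P_b

# Toy setup (assumed): 2 state variables, both observed, identity dynamics.
rng = np.random.default_rng(42)
X_prev = rng.normal(size=(200, 2))      # initial ensemble
H = np.eye(2)
R = 0.01 * np.eye(2)                    # accurate observations
y = np.array([1.0, -1.0])

X_new, P_hat_b = enkf_step(X_prev, y, lambda x: x, H, R, rng)
P_hat_a = np.cov(X_new, rowvar=False)   # sample analysis covariance
print(X_new.mean(axis=0))
print(np.trace(P_hat_b), np.trace(P_hat_a))
```

Because the observations here are much more accurate than the background ensemble, the analysis ensemble mean moves close to the observations and its spread contracts.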
29. Further approaches
• Different strategies to perform the analysis step in the
Ensemble Kalman filter
• Sampling variability in Ensemble Kalman filter, especially if
the ensemble size is small
→ filter divergence; decrease in contribution of the
observations
1 Localization of the ensemble covariance matrix (e.g.
covariance tapering, etc.)
2 Inflation of the ensemble spread
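The two remedies named above, localization and inflation, can be illustrated with a small NumPy sketch. The Gaussian taper, the one-dimensional coordinates and the inflation factor 1.1 are assumptions for illustration only; operational systems typically use compactly supported tapers, which this example does not implement.

```python
import numpy as np

def inflate(X, factor):
    """Multiplicative inflation: scale ensemble deviations about the ensemble mean."""
    mean = X.mean(axis=0)
    return mean + factor * (X - mean)

def taper_covariance(P, coords, length_scale):
    """Localization by an elementwise (Schur) product with a distance-based taper."""
    d = np.abs(coords[:, None] - coords[None, :])
    taper = np.exp(-(d / length_scale) ** 2)   # assumed Gaussian taper, equal to 1 on the diagonal
    return P * taper

# Small ensemble: 10 members, 5 state variables on a 1-D line (assumed layout).
rng = np.random.default_rng(3)
X = rng.normal(size=(10, 5))
coords = np.arange(5, dtype=float)

P_hat = np.cov(X, rowvar=False)
P_loc = taper_covariance(P_hat, coords, length_scale=1.5)
X_infl = inflate(X, factor=1.1)

print(np.round(P_hat[0], 2))   # noisy long-range sample covariances
print(np.round(P_loc[0], 2))   # damped towards zero with distance
```

Tapering leaves the variances untouched and shrinks spurious long-range covariances, while inflation keeps the ensemble mean and multiplies the sample covariance by the squared factor.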
30. Data fusion: spatial data
Wikle and Berliner, Technometrics, 2005
• Two sources of wind data: daily wind satellite data and
computer model output from a weather center for the period
15 September 1996-29 June 1997
• Data with different resolutions → change of support problem
• Satellite-based wind estimates from the NASA Scatterometer
(NSCAT) at 0.5 degree resolution, not on a regular grid
• National Centers for Environmental Prediction (NCEP) analysis
of wind direction at 2.5 degree resolution, on a regular grid
• Goal: Predict the surface streamfunction at a resolution of 1.0
degree
31. Data fusion: spatial data
Wind data from satellite and from an analysis for December 26, 1996
32. Data fusion: spatial data
• $Z$: measurement data from the two sources
• $Y$: true underlying process
• Adopt the hierarchical modeling approach:
[Data | Process]
[Process | Parameters]
[Parameters]
• Goal: Infer upon the process $Y$
• Problem: The data have different spatial supports
33. Data fusion: spatial data
• Let:
1 $A_i$, $i = 1,\dots,n_a$
2 $B_j$, $j = 1,\dots,n_b$
3 $C_k$, $k = 1,\dots,n_c$
be non-overlapping sets such that
$0 \le |A_i| < |B_j| < |C_k| < \infty$ for all $i, j, k$
• $Z_A \equiv (Z(A_1),\dots,Z(A_{n_a}))^\top$: observations on the subgrid
• $Z_C \equiv (Z(C_1),\dots,Z(C_{n_c}))^\top$: observations on the supergrid
34. Data fusion: spatial data
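• $Y = \{Y(s) : s \in D \subset \mathbb{R}\}$: spatial process
• Block average of the process over a set $S$:
$$Y(S) = \begin{cases} \dfrac{1}{|S|} \displaystyle\int_S Y(s)\,ds & |S| > 0 \\ \operatorname{avg}\{Y(s) : s \in S\} & |S| = 0 \end{cases}$$
• $Y_A \equiv (Y(A_1),\dots,Y(A_{n_a}))^\top$: subgrid process
• $Y_C \equiv (Y(C_1),\dots,Y(C_{n_c}))^\top$: supergrid process
• $Y_B \equiv (Y(B_1),\dots,Y(B_{n_b}))^\top$: process on the prediction grid
Then:
• $Z_A$: observations of $Y_A$
• $Z_C$: observations of $Y_C$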
35. Data fusion: spatial data
Data model
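• Model for $[Z_A, Z_C \mid Y_A, Y_C, Y_B, \theta_m]$
• Measurement error
$$Z_A = Y_A + \varepsilon_A, \qquad \varepsilon_A \sim N(0, \sigma_a^2 I_{n_a})$$
$$Z_C = Y_C + \varepsilon_C, \qquad \varepsilon_C \sim N(0, \sigma_c^2 I_{n_c})$$
• Jointly:
$$\begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \Bigm| Y_A, Y_C, \sigma_a^2, \sigma_c^2 \sim N\left( \begin{pmatrix} Y_A \\ Y_C \end{pmatrix}, \; \Sigma_m = \begin{pmatrix} \sigma_a^2 I_{n_a} & 0 \\ 0 & \sigma_c^2 I_{n_c} \end{pmatrix} \right)$$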
36. Data fusion: spatial data
Process model
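• For all $s \in B_j$:
$$Y(s) = Y(B_j) + \gamma(s)$$
with $E(\gamma(s)) = 0$ and $\mathrm{Cov}(\gamma(s), \gamma(r)) = C(s, r; \phi)$
• Then:
1 for all $A_i$, $Y(A_i) = g_A^{(i)\top} Y_B + \dfrac{1}{|A_i|} \displaystyle\int_{A_i} \gamma(s)\,ds$
2 for all $C_k$, $Y(C_k) = g_C^{(k)\top} Y_B + \dfrac{1}{|C_k|} \displaystyle\int_{C_k} \gamma(s)\,ds$
• Jointly:
$$\begin{pmatrix} Y_A \\ Y_C \end{pmatrix} \Bigm| Y_B, \sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} Y_B, \; \Sigma(\phi) \right)$$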
37. Data fusion: spatial data
Complete model
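• Data model:
$$\begin{pmatrix} Z_A \\ Z_C \end{pmatrix} \Bigm| Y_B, \Sigma_m, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} Y_B, \; \Sigma_m + \Sigma(\phi) \right)$$
• Process model: $Y_B \sim N(\theta_B, \Sigma_B(\phi))$
• Parameters: $[\sigma_a^2, \sigma_c^2, \theta_B, \phi]$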
38. Data fusion: spatial data
Example: streamfunction
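• Data:
1 NSCAT satellite data: $U_A$, $V_A$ ($n_a = 369$)
2 NCEP (numerical model output): $U_C$, $V_C$ ($n_c = 15$)
• Process: $u_B$, $v_B$
• Data model:
1 $\begin{pmatrix} U_A \\ U_C \end{pmatrix} \Bigm| u_B, \sigma_u, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} u_B, \; \Sigma_u + \Sigma_m \right)$
2 $\begin{pmatrix} V_A \\ V_C \end{pmatrix} \Bigm| v_B, \sigma_v, \Sigma \sim N\left( \begin{pmatrix} G_A \\ G_C \end{pmatrix} v_B, \; \Sigma_v + \Sigma_m \right)$
• Process model:
1 $\begin{pmatrix} u_B \\ v_B \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_u \mathbf{1} \\ \mu_v \mathbf{1} \end{pmatrix}, \; \Sigma_{uv} \right)$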
39. Data fusion: spatial data
• Interest in predicting the streamfunction $\psi$.
• Deterministic Poisson equation to determine the streamfunction $\psi$
from winds:
$$\nabla^2 \psi = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}$$
$u$: east-west wind component, $v$: north-south wind
component
• Discretizing to a regular grid:
1 $\psi_I \mid \psi_{bc}, u, v \sim N(L^{-1} [D_x v - D_y u + L_{bc} \psi_{bc}], \Sigma_I)$
2 $\psi_{bc} \sim N(\mu_{bc}, \Sigma_{bc})$
• $\psi_I$: streamfunction at the interior grid locations
• $\psi_{bc}$: streamfunction at the boundary grid locations
40. Data fusion: spatial data
Wind data (top row); posterior mean and realization from the posterior
distribution of the streamfunction for December 26, 1996 (bottom row)
41. Data fusion: spatial data
Fuentes and Raftery, Biometrics, 2005
• Two sources of weekly average SO2 concentration data:
monitoring data and computer model output
• Data with different resolutions → change of support problem
• Monitoring data from CASTNet sites
• Output of a numerical model, Models-3, given as average
concentration over 36 km × 36 km grid cells
• Goal: Estimate the true weekly average concentration of SO2
42. Data fusion: spatial data
Fuentes and Raftery, Biometrics, 2005
Average SO2 concentration for the week of July 11, 1995
43. Data fusion: spatial data
• Process: $Z(s)$, true underlying process
• Data:
1 $\hat{Z}(s)$: measurement from the monitoring network (CASTNet)
2 $\tilde{Z}(B)$: numerical model output (Models-3)
• Goal: Infer upon the process $Z(s)$
• Problem: The data have different spatial supports
45. Data fusion: spatial data
Data model
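• Model for $[\hat{Z}(s), \tilde{Z}(B) \mid Z(s), \theta_m]$
• Measurement error
$$\hat{Z}(s) = Z(s) + e(s), \qquad e(s) \sim N(0, \sigma_e^2)$$
$$\tilde{Z}(B) = \frac{1}{|B|} \int_B \tilde{Z}(s)\,ds$$
$$\tilde{Z}(s) = a(s) + b(s) Z(s) + \delta(s), \qquad \delta(s) \sim N(0, \sigma_\delta^2)$$
where
1 $a(s)$ is a polynomial in $s$
2 $b(s) \equiv b$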
46. Data fusion: spatial data
Process model
• $Z(s) = \mu(s) + \varepsilon(s)$ with
1 $E(\varepsilon(s)) = 0$ and $\mathrm{Cov}(\varepsilon(s), \varepsilon(r)) = \sigma(s, r; \phi)$
2 $\mu(s)$ polynomial in $s$ with coefficients $\beta$
→ $Z(s) \sim GP(\mu(s), \Sigma)$
• Goal: Infer on $Z$ given $\hat{Z}$, $\tilde{Z}$
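47. Data fusion: spatial data
• $\hat{Z} = (\hat{Z}(s_1),\dots,\hat{Z}(s_n))^\top$
• $\tilde{Z} = (\tilde{Z}(B_1),\dots,\tilde{Z}(B_M))^\top$
$$\begin{pmatrix} \hat{Z} \\ \tilde{Z} \end{pmatrix} \sim N\left( \begin{pmatrix} \hat{\mu} \\ \tilde{a} + b \tilde{\mu} \end{pmatrix}, \begin{pmatrix} \Sigma_C & \Sigma_{CM} \\ \Sigma_{CM}^\top & \Sigma_M \end{pmatrix} \right)$$
where
1 $\hat{\mu} = (\mu(s_1),\dots,\mu(s_n))^\top$
2 $\tilde{a} = \left( \frac{1}{|B_1|} \int_{B_1} a(s)\,ds, \dots, \frac{1}{|B_M|} \int_{B_M} a(s)\,ds \right)^\top$
3 $\tilde{\mu} = \left( \frac{1}{|B_1|} \int_{B_1} \mu(s)\,ds, \dots, \frac{1}{|B_M|} \int_{B_M} \mu(s)\,ds \right)^\top$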
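48. Data fusion: spatial data
$$\begin{pmatrix} \hat{Z} \\ \tilde{Z} \end{pmatrix} \sim N\left( \begin{pmatrix} \hat{\mu} \\ \tilde{a} + b \tilde{\mu} \end{pmatrix}, \begin{pmatrix} \Sigma_C & \Sigma_{CM} \\ \Sigma_{CM}^\top & \Sigma_M \end{pmatrix} \right)$$
where
1 $\Sigma_C$ is an $n \times n$ matrix: $(\Sigma_C)_{ij} = \sigma(s_i, s_j; \phi) + 1_{\{s_i \equiv s_j\}} \sigma_e^2$
2 $\Sigma_{CM}$ is an $n \times M$ matrix: $(\Sigma_{CM})_{ik} = b \cdot \frac{1}{|B_k|} \int_{B_k} \sigma(s_i, v; \phi)\,dv$
3 $\Sigma_M$ is an $M \times M$ matrix:
$$(\Sigma_M)_{kl} = b^2 \cdot \frac{1}{|B_k| \cdot |B_l|} \int_{B_k} \int_{B_l} \sigma(u, v; \phi)\,du\,dv + 1_{\{B_k \equiv B_l\}} \sigma_\delta^2$$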
49. Data fusion: spatial data
Example: air pollution
Data:
1 Weekly average of SO2 concentration at $n = 50$ CASTNet
sites for the week of July 11, 1995
2 Weekly average of SO2 concentration at $M = 81 \times 87$ grid cells
of size 36 km × 36 km, output of Models-3 for the week of July 11, 1995
Other modeling details
• Stochastic integrals approximated by taking a systematic sample
of 4 points within each grid cell
• Degree of the polynomials defining the mean trend $\mu(s)$ of $Z(s)$
and the additive bias $a(s)$ of $\tilde{Z}(s)$ determined via RJMCMC
• Non-stationary covariance function for the underlying true
process $Z(s)$
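The approximation of the integrals in the point-to-block and block-to-block covariances by a systematic sample of 4 points per grid cell can be sketched as follows. This is an illustration only: it assumes a stationary exponential covariance (whereas the example above uses a non-stationary one), omits the $\sigma_\delta^2$ term, and all function names and parameter values are hypothetical.

```python
import numpy as np

def exp_cov(s, v, sigma2=1.0, phi=50.0):
    """Assumed stationary exponential covariance between locations (distances in km)."""
    return sigma2 * np.exp(-np.linalg.norm(s - v, axis=-1) / phi)

def cell_points(lower_left, side, m=2):
    """Systematic m x m sample (4 points for m = 2) at sub-cell centres of a square cell."""
    offsets = (np.arange(m) + 0.5) * side / m
    gx, gy = np.meshgrid(offsets, offsets)
    return lower_left + np.column_stack([gx.ravel(), gy.ravel()])

def point_block_cov(s_i, lower_left, side, b=1.0):
    """Approximates b * (1/|B_k|) * integral over B_k of sigma(s_i, v) dv."""
    pts = cell_points(lower_left, side)
    return b * exp_cov(s_i, pts).mean()

def block_block_cov(ll_k, ll_l, side, b=1.0):
    """Approximates b^2 / (|B_k||B_l|) * double integral of sigma(u, v) du dv."""
    pk = cell_points(ll_k, side)
    pl = cell_points(ll_l, side)
    return b ** 2 * exp_cov(pk[:, None, :], pl[None, :, :]).mean()

# One 36 km x 36 km cell with lower-left corner at the origin, and a site at its centre.
pts = cell_points(np.array([0.0, 0.0]), 36.0)
c_point_block = point_block_cov(np.array([18.0, 18.0]), np.array([0.0, 0.0]), 36.0)
c_neighbours = block_block_cov(np.array([0.0, 0.0]), np.array([36.0, 0.0]), 36.0)
print(pts)
print(c_point_block, c_neighbours)
```

Averaging the covariance over sample points replaces each integral with a finite sum, which is what makes the joint covariance matrix of point and block data computable.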
50. Data fusion: spatial data
Posterior predictive mean and posterior predictive SD for Z(s) for
the week of July 11, 1995
51. Data fusion: space-time data
Wikle et al., JASA 2001
• Extend the modeling idea of Wikle and Berliner (2005) to
account for time
• Daily wind data from two sources: satellite data (at higher
resolution) and computer model output (at a lower resolution)
• Goal: Predict winds at an intermediate resolution over a period of 54
six-hour increments
• Accounted for the temporal dependence in the data by using
dynamic coefficients in the specification of the process driving
the observed data
• Avoided computing stochastic integrals!
52. Data fusion: space-time data
• Data:
1 NSCAT satellite data: $U_{A,t}$, $V_{A,t}$ at time $t$
2 NCEP (numerical model output): $U_{C,t}$, $V_{C,t}$ at time $t$
→ $U_t = (U_{C,t}^\top, U_{A,t}^\top)^\top$ and $V_t = (V_{C,t}^\top, V_{A,t}^\top)^\top$: observed data at
time $t$
→ $\{U_t\}_{t=1}^T = (U_1,\dots,U_T)$, $\{V_t\}_{t=1}^T = (V_1,\dots,V_T)$
• Process:
• $u_t$, $v_t$ at time $t$ at the $n_B$ prediction grid cells.
• Similar definition for $\{u_t\}_{t=1}^T$ and $\{v_t\}_{t=1}^T$
53. Data fusion: space-time data
Data model
$$\left[ \{V_t\}_{t=1}^T, \{U_t\}_{t=1}^T \mid \{v_t\}_{t=1}^T, \{u_t\}_{t=1}^T, \theta \right] = \prod_{t=1}^{T} [V_t \mid v_t, \theta] \cdot [U_t \mid u_t, \theta]$$
• $V_t \mid v_t, \Sigma_t \sim N(K_t v_t, \Sigma_t)$
• $U_t \mid u_t, \Sigma_t \sim N(K_t u_t, \Sigma_t)$
1 $\Sigma_t$: diagonal matrix with entries equal to either $\sigma^2$ (satellite
observations), $\sigma_b^2$ (NCEP boundary grid cells) or $\sigma_I^2$ (NCEP interior
cells)
2 $K_t$: design matrices that map the prediction grid cells to the
observation grid cells
54. Data fusion: space-time data
Process model
$$u_t = \mu_u + u_t^E + \tilde{u}_t$$
$$v_t = \mu_v + v_t^E + \tilde{v}_t$$
1 $\mu_u$: spatial mean for the $u$ wind component: $\mu_u = P_u \gamma_u$
(resp. for $\mu_v$) → $P_u$ design matrix (resp. $P_v$)
2 $u_t^E$: thin fluid approximation of the $u$ wind component:
$u_t^E = \Phi a_t^u$ (resp. for $v_t^E$) → $\Phi$ basis function
3 $\tilde{u}_t$: small-scale motions of the $u$ wind component: $\tilde{u}_t = \Psi b_t^u$
(resp. for $\tilde{v}_t$) → $\Psi$ wavelet basis function
55. Data fusion: space-time data
Parameters
• The $2n \times 1$ random vectors $a_t^u$, $a_t^v$ are modeled as dynamically
evolving in time but are independent between prediction grid
cells
• The $n \times 1$ random vectors $b_t^u$ and $b_t^v$ are modeled as
dynamically evolving in time and are independent between
prediction grid cells
• No need to compute stochastic integrals!
• Only temporal dependence is explicitly modeled
• Computationally feasible
56. Data fusion: space-time data
Choi et al., Comp. Stat. and Data Analysis 2009
• Extend the modeling idea of Fuentes and Raftery (2005) to
account for time
• Daily average PM2.5 concentration from two sources:
monitoring data and computer model output
1 $\hat{Z}(s,t)$: observation from monitoring site $s$ at time $t$
2 $\tilde{Z}(B,t)$: model output at grid cell $B$ at time $t$
• Goal: Predict true daily average PM2.5 concentration
aggregated over counties at time t for health analysis
• Included the temporal dependence in the mean structure of
the underlying process
57. Data fusion: space-time data
Data model
• Model for $[\hat{Z}(s,t), \tilde{Z}(B,t) \mid Z(s,t), \theta_m]$
$$\hat{Z}(s,t) = Z(s,t) + e(s,t), \qquad e(s,t) \sim N(0, \sigma_e^2)$$
$$\tilde{Z}(B,t) = \frac{1}{|B|} \int_B \tilde{Z}(s,t)\,ds$$
$$\tilde{Z}(s,t) = a(s) + Z(s,t) + \delta(s,t), \qquad \delta(s,t) \sim N(0, \sigma_\delta^2)$$
Process model
$$Z(s,t) = M(s,t)\xi + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2)$$
• $M(s,t)$: vector of meteorological variables at site $s$ at time $t$
58. Data fusion: space-time data
McMillan et al., Environmetrics, 2009
• Propose a spatio-temporal model to combine monitoring data
and numerical model output
1 Daily average PM2.5 concentration from monitoring sites
during year 2001
2 Daily average PM2.5 concentration, output of the CMAQ model
run at 12 km grid cell resolution ($M = 213 \times 188$)
• Goal: Combine the two sources of data and predict the true daily
average PM2.5 concentration for each day in 2001
59. Data fusion: space-time data
• Process: $W_i$, true underlying process
• Data:
1 $X_{i,k}$: monitoring data
2 $Y_{i,k}$: CMAQ output
• $W_i$ is defined on space-time grid cells: $i \in \{1,\dots,N\}$, where
$N = N_T \times N_P$, $N_T$ is the number of time points and $N_P$ the number of grid
cells
• $X_{i,k}$: observed monitoring data for the $k$-th monitor
observation in cell $i$
• $Y_{i,k}$: CMAQ output in cell $i$ ($k = 1$)
61. Data fusion: space-time data
Data model
• Model for $[X_{i,k}, Y_{i,k} \mid W_i, \theta]$
Measurement error
$$X_{i,k} = W_i + \varepsilon_{i,k}, \qquad \varepsilon_{i,k} \sim N(0, \tau_X^2)$$
$$Y_{i,k} = D_i \beta + W_i + \delta_{i,k}, \qquad \delta_{i,k} \sim N(0, \tau_Y^2)$$
• $D_i$: vector of uniform B-splines over a regular 3-dimensional
lattice of $N_D$ knots
⟹ CMAQ bias for grid cell $i$: $D_i \beta = \sum_{j=1}^{N_D} D_{ij} \beta_j$
62. Data fusion: space-time data
Process model
$$W_i = \mu_{t(i)} + Z_i$$
• $t(i)$: temporal index of grid cell $i$
• $\mu_{t(i)}$: constant across space, $\mu_{t(i)} \sim N(0, \tau_\mu^2)$
• $Z$: space-time multivariate normal with a separable covariance
structure: autoregressive in time and conditionally
autoregressive (CAR) in space
⟹ $Z \mid \tau_Z^2, \rho \sim N\left(0, \tau_Z^2 \cdot (\Lambda_T(\rho) \otimes \Lambda_P)^{-1}\right)$
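The separable structure above means the space-time precision matrix is a Kronecker product of a temporal and a spatial precision matrix. The sketch below shows one way this could be assembled; the specific forms are assumptions not given on the slide: a stationary AR(1) precision with unit innovation variance for $\Lambda_T(\rho)$, and a proper CAR precision $D - \alpha A$ (with an assumed $\alpha = 0.9$ and a toy adjacency matrix $A$) for $\Lambda_P$.

```python
import numpy as np

def ar1_precision(n_t, rho):
    """Tridiagonal precision of a stationary AR(1) with unit innovation variance (n_t >= 2)."""
    Q = np.zeros((n_t, n_t))
    idx = np.arange(n_t)
    Q[idx, idx] = 1.0 + rho ** 2
    Q[0, 0] = Q[-1, -1] = 1.0
    Q[idx[:-1], idx[1:]] = -rho
    Q[idx[1:], idx[:-1]] = -rho
    return Q

def car_precision(adjacency, alpha=0.9):
    """Assumed proper CAR precision: D - alpha * A, with D the diagonal of neighbour counts."""
    D = np.diag(adjacency.sum(axis=1))
    return D - alpha * adjacency

def separable_precision(n_t, rho, adjacency, tau2_z=1.0):
    """Precision of Z: (Lambda_T kron Lambda_P) / tau2_z, time as the outer index."""
    return np.kron(ar1_precision(n_t, rho), car_precision(adjacency)) / tau2_z

# Toy example: 2 time points and 3 grid cells arranged on a line.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
Q_Z = separable_precision(2, 0.5, A)
cov_Z = np.linalg.inv(Q_Z)

# Separability: the covariance is itself a Kronecker product of the two inverses.
cov_kron = np.kron(np.linalg.inv(ar1_precision(2, 0.5)), np.linalg.inv(car_precision(A)))
print(np.allclose(cov_Z, cov_kron))
```

The practical appeal is that only the small temporal and spatial matrices need to be factorized, never the full $N \times N$ matrix.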
63. Data fusion: space-time data
Daily mean levels for predicted surface, monitoring data and
CMAQ over Eastern US
64. Data fusion: space-time data
Posterior predictive mean for (a) 4 July 2001 and (b) 24 December 2001
65. Data fusion: space-time data
Berrocal et al., JABES, 2010
• Propose a spatio-temporal model to combine monitoring data
and numerical model output
1 Daily 8-hr max ozone concentration from monitoring sites
during summer of 2001
2 Daily 8-hr max ozone concentration, output of the CMAQ model
run at 12 km grid cell resolution ($M = 213 \times 188$)
• Goal: Combine the two sources of data and "downscale"
numerical model output to the point level
• Does not assume a "true" underlying process
66. Data fusion: space-time data
• $Y(s,t)$: observation at site $s$ at time $t$
• $x(B,t)$: CMAQ output at grid cell $B$ at time $t$
For $s \in B$:
$$Y(s,t) = \tilde{\beta}_0(s,t) + \tilde{\beta}_1(s,t)\, x(B,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \sigma^2)$$
with $\tilde{\beta}_i(s,t) = \beta_{it} + \beta_i(s,t)$, for $i = 0, 1$.
• Temporal dependence in $\beta_{0t}$ and $\beta_{1t}$:
(i) $\beta_{0t}, \beta_{1t}$ nested within time
(ii) $\beta_{0t}, \beta_{1t}$ dynamic in time
• $\beta_0(s,t)$ and $\beta_1(s,t)$: correlated Gaussian processes that are either
(i) nested within time or (ii) dynamic in time
67. Data fusion: space-time data
• Possible spatio-temporal models to combine the two data sources
Model     $\beta_{0t}$, $\beta_{1t}$          $\beta_0(s,t)$, $\beta_1(s,t)$
Model 1   Independent across time   Constant in time
Model 2   Dynamic                   Constant in time
Model 3   Independent across time   Independent across time
Model 4   Dynamic                   Dynamic
68. Data fusion: space-time data
[Figure: map (longitude vs. latitude) of the ozone monitoring sites, 2001; test sites in black, validation sites in red]
• Daily maximum 8-hour ozone concentration (ppb): observations
($n = 803$) and CMAQ model output
• Model output on 12-km grid cells ($M = 40{,}440$)
• Models fit for May 1 - October 15, 2001
• 436 sites used to fit the model, 367 sites for validation
69. Data fusion: space-time
• National Ambient Air Quality Standard (NAAQS) for ozone is that
the 3-year rolling average of the annual fourth highest daily 8-hour
maximum ozone concentration be less than a given threshold
• Maps of the probability that the fourth highest ozone concentration
during the period May 1 - October 15, 2001 exceeds:
[Figure: two maps (longitude vs. latitude, probability scale from 0 to 1): (a) 80 ppb (1997 standard); (b) 75 ppb (2008 standard)]
70. Data fusion: space-time
Berrocal et al., Environmetrics, 2012
• Extended the 2010 downscaler model to allow for potential
spatial misalignment in the computer model output
1 Seasonal average temperature at 17 synoptic stations in
Sweden for the period December 1962-November 2007
2 Regional climate model output on a 12.5 km × 12.5 km grid for
the same period
• Goal: Assess the performance of the regional climate model.
71. RCM data
[Figure: map of RCM output, DJF 2002 (colour scale from −10 to 4), with Stockholm, Borlänge and Göteborg marked]
• Output of the Swedish Meteorological and Hydrological
Institute (SMHI) Rossby Centre Atmospheric (RCA) regional climate model
• Daily output for 2-m temperature from December 1, 1962 to
November 30, 2007, then aggregated to quarterly
averages (DJF, MAM, JJA, SON)
• Output at 12.5 km × 12.5 km grid boxes
72. RCM data
[Figure: map of RCM output, MAM 2002 (colour scale from −2 to 8), with Stockholm, Borlänge and Göteborg marked]
73. Observational data
[Figure: map of observation data, DJF 2002 (colour scale from −10 to 4), showing the station locations, with Stockholm, Borlänge and Göteborg marked]
• Observed daily average temperature from 17 stations
in the SMHI network of synoptic stations
• Period: December 1, 1962 to November 30, 2007
• Daily data aggregated to quarterly scale
• Three stations, Göteborg, Stockholm and Borlänge,
held out for validation
74. Downscaling model
• Some notation:
• $B_1,\dots,B_g$: RCM model grid boxes with centroids $r_1,\dots,r_g$
• $x(B_1,t), x(B_2,t),\dots,x(B_g,t)$: RCM output of quarterly
average temperature for quarter $t = 1,\dots,T$ at grid boxes
$B_1, B_2,\dots,B_g$
• $Y(s,t)$: observed quarterly average temperature at station $s$
for quarter $t = 1,\dots,T$
• The 2010 downscaler applied to these data would be: for $s$ in
$B$ and $t = 1,\dots,T$
$$Y(s,t) = \tilde{\beta}_0(s,t) + \tilde{\beta}_1(s,t)\, x(B,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \overset{iid}{\sim} N(0, \tau^2)$$
75. Downscaling model
• The 2012 model starts from the observation that we could
write: for $t = 1,\dots,T$
$$Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1 \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2)$$
with
• $\tilde{x}(s,t)$: spatio-temporal weighted average of the RCM output:
$$\tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\, x(B_k,t)$$
• $\tilde{\beta}_1(s,t)$ replaced by $\beta_1$ for identifiability reasons
• The weights $w_k(s,t)$ should be:
• positive and sum to 1
• spatially correlated within sites and across sites
76. Downscaling model
• If $r_1,\dots,r_g$ are the centroids of the RCM grid boxes, we can
take the weights $w_k(s,t)$ to be
$$w_k(s,t) = \frac{K(|s - r_k|; \lambda)}{\sum_{l=1}^{g} K(|s - r_l|; \lambda)}$$
• $K(\cdot; \lambda)$: kernel function with bandwidth $\lambda$.
For example: $K(|s - r_k|; \lambda) = \exp\left(-\frac{|s - r_k|}{\lambda}\right)$.
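The normalized kernel weights and the resulting weighted average of the RCM output can be sketched in a few lines. The centroid coordinates, bandwidth and RCM values below are made-up numbers used only for illustration, and `kernel_weights` is a hypothetical helper name.

```python
import numpy as np

def kernel_weights(s, centroids, lam):
    """Normalized exponential-kernel weights w_k(s) over grid-box centroids r_k."""
    d = np.linalg.norm(centroids - s, axis=1)   # |s - r_k|
    k = np.exp(-d / lam)
    return k / k.sum()

# Made-up example: three grid-box centroids and a station at the first centroid.
centroids = np.array([[0.0, 0.0],
                      [1.0, 0.0],
                      [0.0, 1.0]])
s = np.array([0.0, 0.0])
w = kernel_weights(s, centroids, lam=1.0)

x_rcm = np.array([2.0, 4.0, 6.0])   # hypothetical RCM output at the three grid boxes
x_tilde = w @ x_rcm                 # weighted average entering the downscaler
print(w, x_tilde)
```

The weights are positive and sum to one by construction, and a smaller bandwidth concentrates the weight on the grid boxes closest to the station.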
78. Downscaling model
We consider RCM output and observational data:
[Figure: side-by-side maps of RCM output and observation data for DJF 2002, with Stockholm, Borlänge and Göteborg marked]
We establish the spatial linear model: for $s \in B$ and $t = 1,\dots,T$
$$Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1 \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2)$$
79. Downscaling model
For $t = 1,\dots,T$, the weight $w_k(s,t)$ is:
[Figure: maps of the weights (scale from 0 to 0.5) over the full domain, with Stockholm, Borlänge and Göteborg marked, and zoomed in around one site]
80. Downscaling model
• To allow the weights $w_k(s,t)$ to be directional, we modify
the expression
$$w_k(s,t) = \frac{K(|s - r_k|; \lambda)}{\sum_{l=1}^{g} K(|s - r_l|; \lambda)}$$
to
$$w_k(s,t) = \frac{K(|s - r_k|; \lambda) \cdot \exp(Q(r_k,t))}{\sum_{l=1}^{g} K(|s - r_l|; \lambda) \cdot \exp(Q(r_l,t))}$$
where, for $t = 1,\dots,T$, $Q(r,t)$ is a latent stationary mean-zero
spatial Gaussian process with variance 1 and exponential
correlation function.
• For $t = 1,\dots,T$, the range $\phi$ of the latent spatial process
$Q(r,t)$ influences the directionality of the weights.
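The directional weights can be illustrated by drawing one realization of the latent process $Q$ at the grid-box centroids and plugging it into the modified expression. This is a sketch under assumptions: the centroid layout, the values of $\lambda$ and $\phi$, and the small jitter added before the Cholesky factorization are all choices made for this example, and the helper names are hypothetical.

```python
import numpy as np

def sample_latent_q(centroids, phi, rng):
    """One draw of a mean-zero, unit-variance Gaussian process with exponential correlation."""
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    C = np.exp(-d / phi)
    L = np.linalg.cholesky(C + 1e-10 * np.eye(len(centroids)))  # jitter for numerical stability
    return L @ rng.standard_normal(len(centroids))

def directional_weights(s, centroids, lam, q):
    """Kernel weights tilted by exp(Q(r_k)), then normalized to sum to 1."""
    d = np.linalg.norm(centroids - s, axis=1)
    k = np.exp(-d / lam) * np.exp(q)
    return k / k.sum()

# Made-up 3 x 3 lattice of centroids and a station near the middle.
gx, gy = np.meshgrid(np.arange(3.0), np.arange(3.0))
centroids = np.column_stack([gx.ravel(), gy.ravel()])
s = np.array([1.1, 0.9])

rng = np.random.default_rng(7)
q = sample_latent_q(centroids, phi=2.0, rng=rng)
w_dir = directional_weights(s, centroids, lam=1.0, q=q)
w_iso = directional_weights(s, centroids, lam=1.0, q=np.zeros(len(centroids)))
print(np.round(w_iso, 3))
print(np.round(w_dir, 3))
```

Setting $Q \equiv 0$ recovers the isotropic kernel weights; a non-zero, spatially correlated $Q$ shifts weight towards whole neighbourhoods of grid boxes rather than individual ones.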
81. Downscaling model
• Finally, the downscaling model is: for each site $s$ and $t = 1,\dots,T$:
$$Y(s,t) = \tilde{\beta}_0(s,t) + \beta_1 \tilde{x}(s,t) + \varepsilon(s,t), \qquad \varepsilon(s,t) \sim N(0, \tau^2)$$
• $\tilde{\beta}_0(s,t) = \beta_{0,t} + \beta_0(s,t)$, with $\beta_0(s,t)$ a stationary mean-zero
Gaussian spatial process with time-varying range parameter.
• $\tilde{x}(s,t) = \sum_{k=1}^{g} w_k(s,t)\, x(B_k,t)$
• $w_k(s,t) = \dfrac{K(|s - r_k|; \lambda) \cdot \exp(Q(r_k,t))}{\sum_{l=1}^{g} K(|s - r_l|; \lambda) \cdot \exp(Q(r_l,t))}$
• $Q(r,t)$ is a latent stationary mean-zero spatial Gaussian
process with variance 1 and exponential correlation function
with range parameter $\phi$.
• For $t = 1,\dots,T$, the calibration parameters $\beta_{0,t}$, $\beta_0(s,t)$ are
assumed to be independent in time, and so is the latent
process $Q(r,t)$.
82. Predictions at point level
• We predicted quarterly average temperature at three reserved
stations and compared them with:
1 observed data
2 the quarterly average temperature, output of the RCM at the
grid box containing the station.
[Figure: map (longitude vs. latitude) of the 17 stations, numbered 1 to 17, with Stockholm, Borlänge and Göteborg marked]
83. Predictions at Borlänge
Black line: observed data
Blue line: downscaling model prediction
Red line: RCM output
Magenta line: upscaling model prediction
[Figure: time series for Borlänge, 1970-2000, in four panels: Winter, Spring, Summer, Autumn]
84. Predictions at Stockholm
Black line: observed data
Blue line: downscaling model prediction
Red line: RCM output
Magenta line: upscaling model prediction
[Figure: time series for Stockholm, 1970-2000, in four panels: Winter, Spring, Summer, Autumn]
86. Data fusion: space-time
Sahu et al., JRSS Series C, 2009
• Propose a spatio-temporal model to combine monitoring data
and numerical model output to predict wet chemical
deposition
1 Weekly nitrate (resp. sulfate) deposition for year 2001 at
monitoring sites (n = 152)
2 Weekly precipitation data for year 2001 at monitoring sites
3 Weekly nitrate (resp. sulfate) deposition, output of the CMAQ
model run at 12 km grid cell resolution ($M = 33{,}390$)
• Goal: Combine the sources of data and predict weekly, annual
and seasonal wet deposition in the Eastern US for 2001
87. Data fusion: space-time
Data:
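1 $P(s,t)$: observed precipitation at $s$ at time $t$
2 $Z(s,t)$: observed deposition at $s$ at time $t$
3 $Q(B,t)$: CMAQ model output at grid cell $B$ at time $t$
Data model:
$$P(s,t) = \begin{cases} \exp(U(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$Z(s,t) = \begin{cases} \exp(Y(s,t)) & \text{if } V(s,t) > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$Q(B,t) = \begin{cases} \exp(X(B,t)) & \text{if } \tilde{V}(B,t) > 0 \\ 0 & \text{otherwise} \end{cases}$$
$$X(B,t) = \gamma_0 + \gamma_1 \tilde{V}(B,t) + \psi(B,t), \qquad \psi(B,t) \sim N(0, \sigma_\psi^2)$$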
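The data model above produces exact zeros whenever the latent process is non-positive and log-normal values otherwise. A minimal simulation of the CMAQ component illustrates this; the parameter values and the standard normal draw standing in for $\tilde{V}(B,t)$ are assumptions for this example, not estimates from the paper.

```python
import numpy as np

def zero_inflated(log_value, latent):
    """exp(log_value) where the latent process is positive, exactly 0 otherwise."""
    return np.where(latent > 0, np.exp(log_value), 0.0)

rng = np.random.default_rng(1)
n_cells = 8

# Stand-in for the latent process V~(B, t) at one time point (assumed standard normal here).
V_tilde = rng.normal(size=n_cells)

# X(B, t) = gamma0 + gamma1 * V~(B, t) + psi(B, t), with hypothetical parameter values.
gamma0, gamma1, sigma_psi = 0.5, 1.0, 0.3
X = gamma0 + gamma1 * V_tilde + rng.normal(scale=sigma_psi, size=n_cells)

Q = zero_inflated(X, V_tilde)   # simulated CMAQ output with exact zeros
print(np.round(V_tilde, 2))
print(np.round(Q, 2))
```

The same helper applies unchanged to $P(s,t)$ and $Z(s,t)$, with $U(s,t)$ or $Y(s,t)$ as the log-scale value and $V(s,t)$ as the latent process.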
88. Data fusion: space-time
Process model:
1 $U(s,t)$: process driving precipitation at $s$ at time $t$
2 $Y(s,t)$: process driving deposition at $s$ at time $t$
3 $V(s,t)$: latent atmospheric process
4 $\tilde{V}(B,t)$: process driving the log-CMAQ output at $B$ at time $t$
$$U(s,t) = \alpha_0 + \alpha_1 V(s,t) + \delta(s,t), \qquad \delta_t \sim GP(0, \Sigma_\delta)$$
$$Y(s,t) = \beta_0 + \beta_1 U(s,t) + \beta_2 V(s,t) + [b_0 + b_1(s) X(B,t)] + \eta(s,t) + \varepsilon(s,t)$$
$$V(s,t) = \tilde{V}(B,t) + \nu(s,t), \qquad \nu(s,t) \sim N(0, \sigma_\psi^2)$$
$$\tilde{V}(B,t) = \rho \tilde{V}(B,t-1) + \zeta(B,t), \qquad \zeta(B,t) \sim \mathrm{CAR}$$
90. Data fusion: spatial data
Monitoring data and validation sites
91. Data fusion: spatial data
Annual total precipitation in 2001
92. Data fusion: space-time data
Posterior predictive mean for b(s) for (a) sulfate (b) nitrate
93. Data fusion: space-time data
(a) Posterior predictive annual mean for nitrate and (b) length of
predictive interval