This is joint work with colleagues from TU Braunschweig. Prof. H. G. Matthies had the idea to develop a Bayesian surrogate formula that updates not probability densities (as in the classical Bayesian formula) but the PCE coefficients of the given random variable. Bojana Rosić implemented the linear case. I implemented the non-linear case with help from Elmar Zander, who later significantly simplified the algorithm.
Linear Bayesian update surrogate for updating PCE coefficients
1. Bayesian Update in low-rank tensor format
A. Litvinenko, B. V. Rosić, E. Zander, O. Pajonk, H. G. Matthies,
Institute for Scientific Computing, TU Braunschweig, Germany
July 13, 2011
Bayesian Update in low-rank tensor format — July 13, 2011 1/40
2. Outline
1 Introduction
2 Direct General Bayesian Approach
3 Discretisation
4 Numerical Examples
5 Conclusion
3. Introduction
Inverse problem: find the parameter q given measurement data z.
[Diagram: forward map u = S(q, f) with observation y = Y(q, u); the inverse problem goes from the data z back to q.]
Ill-posed problem: issues of existence, uniqueness and stability.
4. Bayesian Regularization
- Additional information beyond the data z: q_f (a priori information, forecast)
What is q_f?
- classical Bayesian approach: q_f := π_f, the a priori pdf
π_a(q|z) = const · π_f(q) π(z|q) = const · π_f(q) L(q)
  Markov chain Monte Carlo methods (MCMC) [Gamerman 2006]
  spectral stochastic FEM + MCMC [Kučerová et al. 2010, Marzouk 2009]
  collocation methods [Christen & Fox 2005]
- drawback: requires a complete statistical description of the problem
5. Direct General Bayesian Approach
- Probability space (Ω, B, P)
- the space of RVs with finite variance S := L²(Ω) (stochastic space)
- the Hilbert space Q (deterministic space)
- Q-valued RVs form the tensor space Q := Q ⊗ S
True measurement
- the linear measurement y̌ = Y(q, u) ∈ Y is polluted by noise ε:
z = y̌ + ε,  ε ∼ N(0, C_ε)  ⇒  z ∈ Y₀ ⊆ Y := Y ⊗ S
A priori information
q_f : Ω → Q,  q_f ∈ Q_f ⊂ Q
6. Direct General Bayesian Approach
- already defined: z ∈ Y₀, q_f ∈ Q_f
- given the linear mapping H : Q → Y, predict the observation
y = H q_f,  with Q₀ := H*(Y₀)
Theorem
In the setting just described, the random variable q_a ∈ Q — "a" stands for "assimilated" or "analysis" — is the orthogonal (minimum-variance) projection of q onto the subspace Q_f + Q₀:
q_a(ω) = q_f(ω) + K(z(ω) − y(ω)),  K := C_{q_f y} (C_y + C_ε)^{−1},
with q_f the orthogonal projection onto Q_f and K the "Kalman gain" operator [Luenberger 1969, Rosić et al. 2011, Pajonk et al. 2011].
- does not assume Gaussian statistics; in the linear Gaussian case it reduces to the Kalman filter [Evensen 2009]
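As a sanity check of the update formula in the theorem, here is a minimal scalar Gaussian sketch (all numbers are illustrative choices, not values from the talk): for a direct observation (H = I), the sample-based update q_a = q_f + K(z − y) reproduces the conjugate Gaussian posterior.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

q_true = 2.5                         # hypothetical "virtual truth"
sigma_f, sigma_eps = 0.5, 0.1        # prior and noise std (illustrative)

q_f = rng.normal(2.0, sigma_f, n)    # prior / forecast samples q_f(omega)
eps = rng.normal(0.0, sigma_eps, n)  # noise samples eps(omega)
z = q_true + eps                     # perturbed measurement z(omega)
y = q_f                              # predicted measurement, H = I

C_qy = np.cov(q_f, y)[0, 1]          # Cov(q_f, y)
C_y = np.var(y)                      # Cov(y, y)
K = C_qy / (C_y + sigma_eps**2)      # scalar Kalman gain

q_a = q_f + K * (z - y)              # the RV-level update q_a = q_f + K(z - y)

# mean/variance of q_a match the conjugate Gaussian posterior:
# mean ≈ 2 + K * 0.5 ≈ 2.48, variance ≈ sigma_f^2 sigma_eps^2 / (sigma_f^2 + sigma_eps^2)
```

Note that the whole ensemble of samples is updated at once, not only its mean and variance.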
8. Example
- Darcy law:
−div(κ(x, ω) ∇u(x, ω)) = f(x, ω) in G,
u(x, ω) = 0 on ∂G.
- The conductivity is for simplicity assumed to be a scalar field with a priori distribution (via the maximum entropy principle)
κ_f(x) := exp(q_f(x)),  q_f(x) ∼ N(μ_{q_f}, σ²_{q_f})
- Covariance function:
Cov_{q_f}(x, y) = σ²_{q_f} exp(−|x − y|/l_c)
- The following conditions hold:
κ_f(x, ω) > 0,  ‖κ_f‖_{L∞(G×Ω)} < ∞,  ‖1/κ_f‖_{L∞(G×Ω)} < ∞.
9. Variational Formulation
- The solution space:
U := U ⊗ S,  U := H¹₀(G) = {u ∈ H¹(G) | u = 0 on ∂G}
- Equilibrium equation:
a(v, u) := E(a(ω)(v(·, ω), u(·, ω))) = E(⟨ℓ(ω), v(·, ω)⟩) =: ⟨⟨ℓ, v⟩⟩,
a(ω)(v, u) := ∫_G ∇v(x) · (κ_f(x, ω) ∇u(x)) dx,
⟨ℓ(ω), v⟩ := ∫_G v(x) f(x, ω) dx,  ∀v ∈ U.
- Well-posedness follows via the Lax–Milgram theorem.
10. Discretisation
- Finite element discretisation: u(x, ω) = Σ_{n=1}^N u_n(ω) φ_n(x),
A(ω)[u(ω)] = f(ω),
(A(ω))_{m,n} := a(ω)(φ_m, φ_n) with the bilinear form a(ω),
(f(ω))_m := ⟨ℓ(ω), φ_m⟩,
u(ω) = [u_1(ω), ..., u_N(ω)]^T.
11. PCE and KLE
- Wiener's polynomial chaos expansion:
u_n(θ) = Σ_{α∈J} u_n^α H_α(θ(ω)),  α = (α_1, ..., α_j, ...) ∈ N₀^(N),  (1)
- Galerkin conditions:
∀β ∈ J :  E([f(θ) − A(θ)u(θ)] H_β(θ)) = 0,  (2)
with f^β := E(f(θ)H_β(θ)) and A_{β,α} := E(H_β(θ)A(θ)H_α(θ)):
∀β ∈ J :  Σ_{α∈J} A_{β,α} u^α = f^β,  (3)
which is a linear, symmetric and positive definite system of N · R equations.
12. - The Karhunen–Loève expansion (KLE) of the stiffness and the rhs yields
A u := (Σ_{j=0}^∞ A_j ⊗ Δ_j)(Σ_{α∈J} u^α ⊗ e_α) = Σ_{α∈J} f^α ⊗ e_α =: f,
where (Δ_j)_{α,β} = E(H_α ξ_j H_β),  κ_f = Σ_{j=1}^M κ_f^j ξ_j  and |J| = R.
- The sparse tensor Galerkin methods [Zander, Matthies 2010]
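The block structure A = Σ_j A_j ⊗ Δ_j can be made concrete with a small numpy sketch (sizes and matrices are hypothetical toy data; a real solver would exploit the tensor structure instead of forming the Kronecker products explicitly):

```python
import numpy as np

rng = np.random.default_rng(1)
N, R, M = 5, 4, 2                  # hypothetical: spatial dofs, PCE terms, KLE modes

# deterministic KLE stiffness contributions A_j (A_0 dominant and SPD)
A = [2.0 * np.eye(N)]
for _ in range(M):
    S = 0.05 * rng.standard_normal((N, N))
    A.append(S + S.T)              # small symmetric fluctuation matrices

# stochastic Galerkin matrices Delta_j = E(H_a xi_j H_b); Delta_0 = identity
Delta = [np.eye(R)]
for _ in range(M):
    D = np.diag(rng.random(R - 1), 1)
    Delta.append(D + D.T)          # tridiagonal-like coupling of the PCE terms

# assemble the Kronecker-sum operator and solve A u = f
Abig = sum(np.kron(Dj, Aj) for Aj, Dj in zip(A, Delta))
f = rng.standard_normal(N * R)
u = np.linalg.solve(Abig, f)       # vector of all PCE coefficients of u
```

The dense assembly is only for illustration; the sparse tensor Galerkin methods cited above apply the sum of Kronecker products matrix-free.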
13. Simulation of Measurements
- Measure some functional of the solution u in finitely many patches L:
Ĝ := {x_1, ..., x_L} ⊂ G,  L := |Ĝ|.
- The average hydraulic head:
y(u, ω) := [..., y(x_j), ...] ∈ R^L,  y(x_j) = ∫_{G_j} u(x, ω) dx,
y̌ = [y(x_1, ω̌), ..., y(x_L, ω̌)]^T
- Observation:
z := y̌ + ε,  ε ∼ N(0, C_ε)
14. Inverse Problem
- κ_f lies in a cone in the vector space of RVs (not a subspace)
- project: κ_f = Σ_{α∈J} κ_f^(α) H_α(θ(ω)) (similarly for z and y)
q_f(x, ω) = log κ_f = Σ_{α∈J} q_f^(α)(x) H_α(θ(ω)) = Q_f H,  Q_f ∈ R^{N×R},  H ∈ R^R
Let Q_a = [..., q_a^β, ...], Z = [..., z^β, ...] and Y = [..., y^β, ...]; then the
- matrix form of the update formula is
Q_a = Q_f + K(Z − Y),  K ∈ R^{N×L};  Z, Y ∈ R^{L×R}
- map back:
κ_a = exp(q_a(x, ω))
15. Bayesian update procedure
Input: a priori information q_f(ω) and measurements z.
1. approximate q_f(ω) and the input z(ω) by PCE
2. set Q_f = [..., q_f^β, ...], Z = [..., z^β, ...]
3. solve u(ω) = S(q_f(ω); f(ω))
4. forecast the measurement:
   y(ω) = Y(q_f(ω); u(ω)) = Y(q_f(ω); S(q_f(ω); f(ω)))
5. PCE representation of y(ω): Y = [..., y^β, ...]
6. compute the covariance C_d = C_y + C_ε = Ỹ Δ₀ Ỹ^T + C_ε
7. compute G = C_d^{−1}(Z − Y)
8. compute the covariance C_{q_f y} = Q̃_f Δ₀ Ỹ^T
9. compute Q_a = Q_f + C_{q_f y} G
Output: assimilated data Q_a = [..., q_a^β, ...].
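Steps 6–9 of the procedure can be sketched as a small numpy routine (a sketch under the assumption of an orthogonal PCE basis with diagonal Gram matrix Δ₀; all names and sizes are hypothetical):

```python
import numpy as np

def linear_bayes_update(Qf, Y, Z, Ceps, gram):
    """Qa = Qf + C_{qf,y} (C_y + C_eps)^{-1} (Z - Y) on PCE coefficient matrices.

    Qf : (N, R) prior coefficients; Y, Z : (L, R) predicted / measured
    coefficients; Ceps : (L, L) noise covariance; gram : (R,) diagonal of
    Delta_0 = E(H_b^2), with column 0 holding the mean term.
    """
    Qt, Yt = Qf[:, 1:], Y[:, 1:]            # drop the mean (beta = 0) column
    Dt = np.diag(gram[1:])
    C_qy = Qt @ Dt @ Yt.T                   # step 8: C_{qf,y}
    C_y = Yt @ Dt @ Yt.T                    # step 6 (up to the noise term)
    G = np.linalg.solve(C_y + Ceps, Z - Y)  # step 7
    return Qf + C_qy @ G                    # step 9

# hypothetical sizes: N spatial dofs, R PCE terms, L measurement patches
rng = np.random.default_rng(0)
N, R, L = 20, 6, 4
Qf = rng.standard_normal((N, R))
Hobs = rng.standard_normal((L, N)) / N      # made-up linear observation map
Y = Hobs @ Qf                               # predicted measurement coefficients
Z = Y.copy()                                # measurement identical to prediction
Qa = linear_bayes_update(Qf, Y, Z, 0.01 * np.eye(L), np.ones(R))
# with Z == Y the update leaves the prior unchanged
```

The consistency check at the end reflects the structure of the formula: if the forecast already explains the data, the gain term vanishes.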
16. Kalman Filter
- the variance:
C_{q_a} = E(q̃_a(·) ⊗ q̃_a(·)) = Σ_{γ,β>0} q_a^γ ⊗ q_a^β E(H_γ H_β) = Σ_{γ>0} q_a^γ ⊗ q_a^γ γ!,
C_{q_a} = Q̃_a Δ₀ Q̃_a^T,  where Q̃_a is Q_a without the mean (γ = 0) column
- Kalman formula:
C_{q_a} = C_{q_f} + C_{q_f y}(C_y + C_ε)^{−1} C_{q_f y}^T − 2 C_{q_f y}(C_y + C_ε)^{−1} C_{q_f y}^T
        = C_{q_f} − C_{q_f y}(C_y + C_ε)^{−1} C_{q_f y}^T
17. Low-rank data format
Aim: compute the update equation in a low-rank tensor format:
q_a(ω) = q_f(ω) + K(z(ω) − y(ω)),  (4)
with
K = C_{q_f y}(C_y + C_ε)^{−1},  (5)
where C_{q_f y} = Cov(q_f, y) = E((q_f − E(q_f))(y − E(y))^T), C_y = Cov(y, y), C_ε = Cov(ε, ε). These covariances can be approximated in the H-matrix or in low-rank tensor formats [Litvinenko et al. 2008].
18. Compression of PCE coefficients
Let the RF q(x, θ), θ = (θ_1, ..., θ_M, ...), be approximated by
q(x, θ) = Σ_{β∈J} H_β(θ) q^β(x),  (6)
q^β(x) = (1/β!) ∫_Θ H_β(θ) q(x, θ) P(dθ) ≈ (1/β!) Σ_{i=1}^{n_q} H_β(θ_i) q(x, θ_i) w_i,  (7)
where n_q is the number of quadrature points. Using a low-rank format, obtain
q^β(x) = (1/β!) [q(x, θ_1), ..., q(x, θ_{n_q})] · [H_β(θ_1)w_1, ..., H_β(θ_{n_q})w_{n_q}]^T  (8)
19. Denote
c_β := (1/β!) [H_β(θ_1)w_1, ..., H_β(θ_{n_q})w_{n_q}] ∈ R^{n_q}  (9)
and approximate the set of realisations in a low-rank format:
[q(x, θ_1), ..., q(x, θ_{n_q})] ≈ A B^T.
The matrix of all PCE coefficients is then
[... q^β(x) ...] ≈ A B^T [... c_β^T ...] ∈ R^{N×|J|},  β ∈ J.  (10)
A later compression via H_β(θ) = Π_{j=1}^M h_{β_j}(θ_j), where the h_{β_j}(θ_j) are 1D Hermite polynomials, is possible.
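Formulas (7)–(10) can be sketched with numpy for a single stochastic dimension (the toy field, grid sizes and the rank are all hypothetical choices):

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

Nx, nq, p = 50, 9, 3                       # grid size, quadrature points, PCE order
nodes, w = hermegauss(nq)                  # probabilists' Gauss-Hermite rule
w = w / w.sum()                            # weights of the standard Gaussian measure

x = np.linspace(0.0, 1.0, Nx)
snap = np.exp(0.3 * np.outer(np.sin(np.pi * x), nodes))   # snapshots q(x, theta_i)

# rows are c_beta = (1/beta!) [H_beta(theta_1) w_1, ..., H_beta(theta_nq) w_nq]
C = np.stack([hermeval(nodes, np.eye(p + 1)[b]) * w / math.factorial(b)
              for b in range(p + 1)])

# low-rank compression of the realisations: snap ≈ A @ B.T via truncated SVD
U, s, Vt = np.linalg.svd(snap, full_matrices=False)
r = 3
A, B = U[:, :r] * s[:r], Vt[:r].T

Q = (A @ B.T) @ C.T                        # PCE coefficient matrix, (Nx, p+1), Eq. (10)
```

Here the truncated SVD stands in for whatever low-rank factorisation A B^T is used; only the factors and the small matrix of c_β vectors need to be stored.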
20. Response surface in low-rank format
Putting it all together, obtain a low-rank representation of the RS:
q(x, θ) = Σ_{β∈J} H_β(θ) q^β(x) = H q(x)^T,  (11)
where H = (..., H_β(θ), ...) and q(x) = (..., q^β(x), ...). Using Eq. (10), obtain
q(x, θ) = H q(x)^T = H A B^T [... c_β^T ...],  (12)
where the vector c_β is defined in Eq. (9).
The matrices A, B^T and [... c_β^T ...] are given. Fixing the random parameter θ = θ*, compute the vector H and then a realisation q(x, θ*) of the RF.
21. Application of the response surface
Now, having the RS
q(x, θ) = H A B^T [... c_β^T ...],  (13)
we generate the RV θ, compute the vector H, multiply by A, multiply the resulting vector by B^T and then by the matrix [... c_β^T ...]. We repeat this, e.g., 10⁶ times and then use the obtained sample to compute (in each point x)
- error bars (command errorbar in Matlab),
- quantiles (command quantile in Matlab),
- the probability density function (command ksdensity in Matlab).
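In Python the same sampling procedure looks like this (the low-rank factors are random placeholders, not the factors from the talk; hermevander evaluates all H_β at once, so the repetition vectorises):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(1)
Nx, r, p, n = 40, 3, 3, 100_000            # hypothetical sizes and sample count

# placeholder low-rank factors of the PCE coefficient matrix Q ≈ A @ B.T, (Nx, p+1)
A = rng.standard_normal((Nx, r))
B = rng.standard_normal((p + 1, r)) * 0.2

theta = rng.standard_normal(n)             # generate the RV theta
H = hermevander(theta, p)                  # (n, p+1): H_0(theta), ..., H_p(theta)

samples = (H @ B) @ A.T                    # q(x, theta) for all samples, (n, Nx)

q05, q50, q95 = np.quantile(samples, [0.05, 0.5, 0.95], axis=0)
mean, std = samples.mean(axis=0), samples.std(axis=0)   # error bars per point x
```

Multiplying through the factors left to right keeps the cost linear in the rank; the full coefficient matrix is never formed.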
22. Relative errors and memory of the rank-k approximation
rank k | press. | density | tke    | ev     | xv     | memory, MB
10     | 1.9e-2 | 1.9e-2  | 4.0e-3 | 1.4e-3 | 1.1e-2 | 21
20     | 1.4e-2 | 1.3e-2  | 5.9e-3 | 4.1e-4 | 9.7e-3 | 42
50     | 5.3e-3 | 5.1e-3  | 1.5e-4 | 7.7e-5 | 3.4e-3 | 104
Table: Matrices ∈ R^{260000×600}. The dense matrix format costs 1.25 GB.
23. Numerical examples of tensor approximations
The Gaussian kernel exp(−h²) has Kronecker rank 1.
The exponential kernel exp(−h) can be approximated by a tensor with low Kronecker rank r.
Approximation of C ∈ R^{N×N}, N = 41² = 1681, in the Kronecker tensor (KT) format:
r                   | 1    | 2    | 3   | 4    | 5     | 6     | 10
‖C − C_r‖_∞ / ‖C‖_∞ | 11.5 | 1.7  | 0.4 | 0.14 | 0.035 | 0.007 | 2.8e-8
‖C − C_r‖_2 / ‖C‖_2 | 6.7  | 0.52 | 0.1 | 0.03 | 0.008 | 0.001 | 5.3e-9
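A similar rapid error decay can be observed in 1-D with an ordinary truncated SVD of the exponential kernel (a simple proxy for the Kronecker-format experiment above, not the same computation; the grid is a hypothetical choice):

```python
import numpy as np

n = 200
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]))   # exponential kernel exp(-h)

U, s, Vt = np.linalg.svd(C)
errs = {}
for r in (1, 2, 5):
    Cr = (U[:, :r] * s[:r]) @ Vt[:r]           # best rank-r approximation
    errs[r] = np.linalg.norm(C - Cr, 2) / np.linalg.norm(C, 2)
# the relative spectral-norm error drops quickly with the rank r
```

The SVD gives the optimal low-rank approximation in the spectral norm, so the decay of `errs` mirrors the decay of the kernel's eigenvalues.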
25. Measurement points
[Figure: position of the measurement points (FEM nodes) used in the experiments, on the domain [−1, 1]²: a) 447, b) 239, c) 120, d) 10 measurement patches]
26. Given Data
- Right-hand side: f = f₀ sin((2π/λ) x^T d + ϕ),
d = [cos α, sin α],  α ∈ [−π/2, π/2],  ϕ ∈ [0, 2π]
- The 'virtual truth' is taken as
a) κ = 2
b) κ = 2 + 0.3 · (x + y)
c) κ = 2.2 − 0.1 · (x² + y²)
- A priori information:
E(κ) = 2.4,  σ_κ = 0.4,
order of PCE p = 3 and number of KLE modes M ≤ 50
28. Relative Error
[Figure: "Linear truth", experiment 1 (L = 447): convergence behaviour of the relative error ε_a (log scale, 10⁻² to 10⁰) with respect to the number of sequential updates (0–4), for 447, 239, 120, 60 and 10 measurement points]
29. Relative Error
[Figure: "Constant truth", experiment 1 (L = 447) after the 4th update: a) relative error ε̄_a [%] (the mean of the posterior compared to the mean of the truth), b) relative error ε_a [%] (the posterior compared to the truth), c) improvement I [%] (the posterior compared to the prior)]
30. PDF
[Figure: "Constant truth", experiment 3 (L = 120): posterior probability density function of κ_a compared to the prior κ_f at a single point of the domain]
31. Update
Figure: "Linear truth", experiment 1 (L = 447) after the 1st update: a) mean of the prior κ̄_f, b) truth κ, c) mean of the posterior κ̄_a
32. Update
[Figure: "Quadratic truth", experiment 1 (L = 447) after the 4th update: a) mean of the prior κ̄_f, b) true κ, c) mean of the posterior κ̄_a]
33. Example: The Lorenz-84 Model
Described by the system
dx/dt = −ax − y² − z² + aF₁,
dy/dt = −y + xy − bxz + F₂,  (14)
dz/dt = −z − xz + bxy,
where F₁ and F₂ represent known thermal forcings, and a and b are fixed constants.
34. The Lorenz-84 model shows chaotic behaviour and is very sensitive to the initial conditions. For this reason we model them as independent Gaussian RVs:
x₀(ω) ∼ N(x̄₀, σ₁),
y₀(ω) ∼ N(ȳ₀, σ₂),  (15)
z₀(ω) ∼ N(z̄₀, σ₃).
Due to the appearance of RVs, the deterministic model turns into a system of SDEs.
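A plain Monte Carlo sketch of this setup (the parameter values a = 0.25, b = 4, the forcings and the initial-condition statistics are assumed, commonly used choices, not values from the talk):

```python
import numpy as np

# Lorenz-84 right-hand side; a, b, F1, F2 are assumed standard values
a, b, F1, F2 = 0.25, 4.0, 8.0, 1.0

def rhs(s):
    x, y, z = s
    return np.array([-a * x - y**2 - z**2 + a * F1,
                     -y + x * y - b * x * z + F2,
                     -z - x * z + b * x * y])

def rk4(s, dt, steps):
    """Classical 4th-order Runge-Kutta time stepping."""
    for _ in range(steps):
        k1 = rhs(s); k2 = rhs(s + dt / 2 * k1)
        k3 = rhs(s + dt / 2 * k2); k4 = rhs(s + dt * k3)
        s = s + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return s

# Gaussian uncertain initial conditions -> Monte Carlo ensemble
rng = np.random.default_rng(4)
n = 200
ens0 = rng.normal([1.0, 0.0, 0.0], [0.1, 0.1, 0.1], size=(n, 3))
ens_T = np.array([rk4(s, 0.01, 500) for s in ens0])   # ensemble states at t = 5
```

The spread of `ens_T` illustrates the sensitivity to the initial conditions; in the talk this ensemble role is played by the PCE representation rather than Monte Carlo.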
35. Figure: Bi-modal identification experiment after 1 update. Shown are the results for different numbers of measurements used to determine the PCE coefficients: 10 measurements (a), 100 (b) and 1000 (c). Each plot contains the truth, the prior and the posterior, as well as the last used measurement as an example.
36. Figure: Bi-modal identification experiment after 10 updates
37. Figure: Bi-modal identification experiment after 100 updates
38. Conclusion
- The ill-posed problem is regularised by the introduction of a priori information.
- The update of the prior is a projection of the minimum-variance estimator from linear Bayesian updating onto the polynomial chaos basis.
- For the mean and the variance the estimation is of Kalman type. The estimation is purely deterministic, without the need for any kind of sampling procedure.
- The presented linear Bayesian update does not require any linearity in the forward model, and it can readily update non-Gaussian uncertainties.
39. Any Questions?
Thank you for your attention!
LiBerty — LInear BayEsian diRecT polYnomial chaos update
40. References
1. Gamerman, D. and Lopes, H. F., Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, Chapman and Hall, 2006
2. Kučerová, A. and Matthies, H. G., Uncertainty Updating in the Description of Heterogeneous Materials, Technische Mechanik, Vol. 30, pp. 211–225, 2010
3. Marzouk, Y. M. and Najm, H. N., Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems, J. Comput. Phys., Vol. 228, 2009
4. Christen, J. A. and Fox, C., MCMC using an approximation, J. Comput. Graph. Stat., Vol. 14, pp. 795–810, 2005
5. Luenberger, D. G., Optimization by Vector Space Methods, John Wiley and Sons, Inc., New York, 1969
6. Rosić, B. V., Litvinenko, A., Pajonk, O. and Matthies, H. G., Direct Bayesian update of polynomial chaos representations, J. Comput. Phys., 2011, submitted
7. Pajonk, O., Rosić, B. V., Litvinenko, A. and Matthies, H. G., A Deterministic Filter for non-Gaussian Bayesian Estimation, Physica D: Nonlinear Phenomena, 2011, submitted