This document discusses emulators (surrogates), which approximate expensive models by a linear combination of basis functions fitted to available data. Two classes of emulators are discussed: linear approximations, where the basis functions are fixed beforehand and the coefficients depend linearly on the data, and nonlinear approximations, where the coefficients or the basis functions themselves depend nonlinearly on the data. Experimental design, the choice of where to collect data, is critical for building an accurate emulator.
MUMS Opening Workshop - Emulators for models and complexity reduction - Akil Narayan, August 21, 2018
1. Emulators for models and complexity reduction
Akil Narayan¹
¹ Department of Mathematics, and Scientific Computing and Imaging (SCI) Institute, University of Utah
August 2018
SAMSI MUMS opening workshop
2. Models and emulators
y = u(x) + ε,    x ∈ D ⊆ R^d,    y ∈ R^P
The parameters/factors x govern the bulk behavior of the response u.
The noise or error ε can account for model discrepancy.
The observable y can be deterministic or stochastic.
3. Models and emulators
Available data: noisy measurements y, abstractly treated as samples at specific values of x.
Emulators are generally built to be consistent with data. Their purpose can be to
  extrapolate/interpolate data
  accelerate queries of the model
  analyze for variances, screening, sensitivity, etc.
4. Models and emulators
I will primarily discuss emulator constructions from applied mathematics/scientific computing.
We are interested in things like stability, accuracy, consistency, etc.
Take-home point: experimental design is critical in building good emulators.
5. Building emulators
Many mathematical emulator models have the form
u(x) ≈ u_N(x) := ∑_{n=1}^{N} c_n φ_n(x).
Information about y: sample data (x_m, y_m), m = 1, …, M.
Two general types of approximations:
linear approximations: u_N is linear in the data.
  The φ_n(·) are prescribed a priori, and the map {y_m} → {c_n} is linear.
nonlinear approximations: u_N depends nonlinearly on the data.
  Computation of the c_n may be nonlinear; identification of the φ_n may depend on the data.
6. Building emulators
The form of φ_n does not by itself determine whether the approximation is linear or nonlinear.
Some linear approximations:
  interpolation
  quadrature
  least-squares
Some nonlinear approximations:
  radial basis/kernel approximations
  non-quadratic regularized approximation
  proper orthogonal decomposition
7. Building emulators
Example: If M ≥ N, the coefficients c_j are computable via least-squares:
y = (y_1, y_2, …, y_M)^T ≈ A c,    where A ∈ R^{M×N} has entries A_{mn} = φ_n(x_m) and c = (c_1, c_2, …, c_N)^T.
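To make this concrete, here is a minimal Python/NumPy sketch of the least-squares construction above. The target function u(x) = exp(x), the Legendre basis, and the sizes M, N are placeholder choices for illustration, not from the talk.

import numpy as np

# Hypothetical setup: approximate u(x) = exp(x) on [-1, 1] using N Legendre
# polynomials, from M >= N noiseless samples. All choices are illustrative.
M, N = 40, 10
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=M)              # sample locations x_m
y = np.exp(x)                                   # data y_m = u(x_m)

# A_{mn} = phi_n(x_m): Legendre polynomials of degree 0, ..., N-1
A = np.polynomial.legendre.legvander(x, N - 1)

# c* = argmin_c ||A c - y||_2
c, *_ = np.linalg.lstsq(A, y, rcond=None)

# Evaluate the emulator u_N(x) = sum_n c_n phi_n(x) on a test grid
xt = np.linspace(-1.0, 1.0, 201)
uN = np.polynomial.legendre.legvander(xt, N - 1) @ c
print("max pointwise error:", np.abs(uN - np.exp(xt)).max())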
8. Emulators as model reduction
Emulators are built in the hope that x → u(x) is a map of low complexity.
If true, and an efficient model to capture this complexity is discoverable, then
u(x) ≈ u_N(x) = ∑_{n=1}^{N} c_n φ_n(x),    V := span{φ_1, …, φ_N}
can be achieved with “small” N.
9. Emulators as model reduction
Identify V.
Efficiently construct u_N from V.
Neither of these is particularly easy in general.
Scientific models are complex; is this even feasible with a reasonable N?
10. An explicit example
Example: Consider the solution u(z; x) to the parameterized PDE:
−∇_z · (a(z; x) ∇_z u(z; x)) = f(z),    (z, x) ∈ Ω × D,
u(z; x) = 0,    (z, x) ∈ ∂Ω × D.
For each x, u(·; x) ∈ H = H¹(Ω). Let the diffusion coefficient be given by
a(z; x) = ∑_{j=1}^{∞} x_j ψ_j(z).
11. An explicit example
If x = (x_1, x_2, …) ∈ D = [−1, 1]^∞, and there is some p ≤ 1 such that
∑_{j=1}^{∞} ‖ψ_j‖_{L∞(Ω)}^p < ∞,
then an emulator u_N can be constructed such that
‖u − u_N‖_{L²(D,H)} ≲ N^{−r},    r = 1/p − 1/2.
[Cohen, DeVore, Schwab 2010]
12. Adapted vs linear
An approximation to u:
u ≈ u_N(z; x) = ∑_{n=1}^{N} c_n(z) φ_n(x),    V := span{φ_1, …, φ_N}
Non-adapted approximation: With V chosen, construct u_N so that
‖u − u_N‖_{L²(D,R^P)} ≲ inf_{v∈V} ‖u − v‖_{L²(D,R^P)}.
The main task is to compute u_N from a given V.
13. Adapted vs linear
Adapted approximation: Find V and u_N so that
‖u(x) − u_N(x)‖_{R^P} is “small” for all x ∈ D.
14. Adapted vs linear
Adapted approximations are always nonlinear.
Non-adapted approximations can be linear.
15. Emulators and sampling/experimental design
y = u + ε,    u ≈ u_N = ∑_{n=1}^{N} c_n φ_n(x) ∈ V,    {(x_m, y_m)}_{m=1}^{M} → {c_n}_{n=1}^{N}
Desiderata:
  ‖u − u_N‖_B small for a normed vector space B
  M of “reasonable” size
Accuracy, both in identification of V and in computation of u_N, depends largely on sample design, i.e., the choice of x_1, …, x_M.
16. Emulators and sampling/experimental design
Good sample design can minimize required data size M
Intelligent sampling enables efficient emulator construction
17. Summary of methods
We’ll see how sampling design affects approximation statements for three strategies:
  Discrete least-squares: linear approximation, M ≥ N
  Compressive sampling: nonlinear approximation, M ≪ N
  Reduced order modeling: nonlinear approximation, N ∼ M = O(1)
I’ll discuss optimal mathematical statements one can make, taking the form
‖u − u_N‖_B ≲ K_N × (best approximation error) + ε,    provided M ≥ K_M.
I will focus on the role that sampling plays in these techniques.
18. Summary of methods
Warning: There are entire sub-fields of applied math and statistics
concerning sampling that I will ignore.
(Because they’re not directly relevant to the message.)
20. Part I: Linear approximation
Discrete least squares
Non-adapted basis functions, linear approximation construction procedure
21. An aside – polynomials and PCE
u(x) + ε = y ≈ u_N(x) = ∑_{n=1}^{N} c_n φ_n(x),    x ∈ D ⊆ R^d,  y ∈ R,  c ∈ R^N
Non-adapted approximation: the φ_n are chosen a priori.
We often choose polynomials. (Cf. PCE)
Why?
22. An aside – polynomials and PCE
Polynomials are easy to compute with/evaluate.
Polynomial expansions are (reasonably) easy to manipulate, multiply, differentiate, etc.
Polynomials provide best approximation numbers that behave optimally: with P^d_k the space of d-variate polynomials of total degree k and N = dim P = dim P^d_k,
inf_{P ⊂ H^s, dim P = N}  sup_{f ∈ H^s, ‖f‖_{H^s} = 1}  inf_{p ∈ P} ‖f − p‖_{H^s} ∼ N^{−s/d},
sup_{f ∈ H^s, ‖f‖_{H^s} = 1}  inf_{p ∈ P^d_k} ‖f − p‖_{H^s} ≲ N^{−s/d}.
That is, polynomial spaces achieve (up to constants) the optimal Kolmogorov N-width rate for Sobolev classes. [Pinkus 1985]
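For reference, a small sketch (illustrative only, not from the talk) of how the total-degree polynomial space P^d_k can be realized numerically: build the multi-index set {α : |α| ≤ k} and evaluate the corresponding tensor-product Legendre basis. Its dimension is N = dim P^d_k = C(k + d, d).

import itertools
import numpy as np

def total_degree_indices(d, k):
    # All multi-indices alpha in N_0^d with alpha_1 + ... + alpha_d <= k
    return [a for a in itertools.product(range(k + 1), repeat=d) if sum(a) <= k]

def legendre_basis(X, indices):
    # X: (M, d) points in [-1, 1]^d; returns the (M, N) matrix of values
    # phi_alpha(x_m) = prod_j P_{alpha_j}(x_m^{(j)}).
    M, d = X.shape
    kmax = max(max(a) for a in indices)
    P = [np.polynomial.legendre.legvander(X[:, j], kmax) for j in range(d)]
    A = np.ones((M, len(indices)))
    for n, alpha in enumerate(indices):
        for j, aj in enumerate(alpha):
            A[:, n] *= P[j][:, aj]
    return A

d, k = 2, 3
idx = total_degree_indices(d, k)
print("N = dim P^d_k =", len(idx))      # = C(k + d, d) = 10 for d = 2, k = 3
X = np.random.default_rng(1).uniform(-1, 1, size=(5, d))
print(legendre_basis(X, idx).shape)     # (5, 10)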
23. Mathematical preliminaries
u(x) + ε = y ≈ u_N(x) = ∑_{n=1}^{N} c_n φ_n(x),    x ∈ D ⊆ R^d,  y ∈ R,  c ∈ R^N
The least-squares problem is an approximation of the form
c* = (c*_1, …, c*_N)^T = argmin_{c ∈ R^N} ∑_{m=1}^{M} [u_N(x_m) − y_m]².
24. Mathematical preliminaries
Let V := span{φ_1, …, φ_N}. Least-squares is, equivalently,
v* = argmin_{v ∈ V} ∑_{m=1}^{M} (v(x_m) − y_m)².
V is an a priori space of functions.
What is the “best” approximation we can hope for?
25. Mathematical preliminaries
Given a probability measure µ on D, approximation will take place in an L² space:
⟨g, h⟩_µ := ∫_D g(x) h(x) dµ(x),    L²_µ(D) := { g : D → R : ‖g‖_µ < ∞ }.
The best approximation error to u from the subspace V is
σ_V(u) := inf_{v ∈ V} ‖u − v‖_µ.
26. Mathematical preliminaries
Randomized sampling: x_m sampled iid from µ, and no noise, y_m = u(x_m):
u_N = argmin_{v ∈ V} ∑_{m=1}^{M} (v(x_m) − y_m)².
Law of large numbers: M ↑ ∞ ⇒ ‖u_N − u‖_µ → σ_V(u).
27. “Standard” Monte Carlo
Approximate a function
u(x) = exp −ω x −
1
π
2
, x ∈ [−1, 1], = 0,
with µ uniform on [−1, 1], from the space of potential surrogates
V = span 1, . . . , xN−1
Data xm sampled iid from µ
Convergence observed, but slow
Why does this happen, and can we fix it?
50 100 150 200 250 300
10−5
10−3
10−1
101
103
105
107
M
Mean-squareerror
D = [−1, 1], N = 50
Optimal error
MC
A. Narayan (U. Utah) Emulators and surrogates
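One way to see this numerically: for an L²_µ-orthonormal basis, the scaled Gram matrix (1/M) AᵀA should approach the identity, but with x_m drawn iid from µ it remains badly conditioned unless M greatly exceeds N. A minimal sketch (not the talk's exact experiment), using an orthonormal Legendre basis for the same space V and placeholder sample sizes:

import numpy as np

rng = np.random.default_rng(0)
N = 50                                     # dim V, as in the slide's example
norm = np.sqrt(2 * np.arange(N) + 1)       # sqrt(2n+1) P_n is orthonormal w.r.t. uniform mu on [-1, 1]

for M in (60, 120, 300, 1200):
    x = rng.uniform(-1.0, 1.0, size=M)     # x_m iid from mu
    A = np.polynomial.legendre.legvander(x, N - 1) * norm
    G = (A.T @ A) / M                      # should approximate the identity matrix
    print(M, np.linalg.cond(G))            # conditioning improves only slowly with M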
28. “Standard” Monte Carlo
Sampling from a standard distribution is frequently suboptimal
29. Convergence results
Proximity to the optimal solution is guaranteed with enough samples.
Define
K_µ(V) := sup_{x ∈ D}  sup_{v ∈ V \ {0}}  |v(x)|² / ‖v‖²_µ.
If x_1, …, x_M are sampled iid from µ, then
M / log M ≥ [(2 + 2r) / log(e/2)] K_µ(V)
guarantees that, with probability ≥ 1 − 2M^{−r},
E ‖u − u_N‖²_µ ≤ (1 + 2 / [(1 − log 2)(1 + r) log M]) σ_V(u)² + 8U² M^{−r},
where U = sup_{x ∈ D} |u(x)|, and
u_N = argmin_{v ∈ V} ∑_{m=1}^{M} (v(x_m) − y_m)².
[Cohen, Davenport, Leviatan 2013]
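If the φ_n form an L²_µ-orthonormal basis for V (as they will later), then K_µ(V) = sup_{x∈D} ∑_n φ_n(x)², so it can be estimated by evaluating the basis on a fine grid. A minimal sketch for the Legendre example on [−1, 1] (grid resolution and r are placeholder choices); the computed value matches the K_µ(V) ∼ N² behavior noted for polynomial spaces two slides later:

import numpy as np

N, r = 50, 1
xx = np.linspace(-1.0, 1.0, 20001)                         # dense grid on D = [-1, 1]
norm = np.sqrt(2 * np.arange(N) + 1)
Phi = np.polynomial.legendre.legvander(xx, N - 1) * norm   # orthonormal basis values
K = np.sum(Phi**2, axis=1).max()                           # K_mu(V) = sup_x sum_n phi_n(x)^2
print("K_mu(V) ~", K, "(equals N^2 =", N**2, "for this basis)")

# Roughly the smallest M satisfying M / log M >= (2 + 2r)/log(e/2) * K_mu(V)
target = (2 + 2 * r) / np.log(np.e / 2) * K
M = int(target)
while M / np.log(M) < target:
    M = int(M * 1.01) + 1
print("sample count suggested by the bound:", M)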
30. Randomized sampling – Monte Carlo
M / log M ≥ [(2 + 2r) / log(e/2)] K_µ(V),    K_µ(V) = sup_{x ∈ D} sup_{v ∈ V \ {0}} |v(x)|² / ‖v‖²_µ.
The smallest (best) value of K_µ(V) is N.
31. Randomized sampling – Monte Carlo
Example: Linear models, N = d + 1:
φ_1(x) = 1,    φ_{j+1}(x) = x_j,    j = 1, …, d.
Let µ be the standard Gaussian measure over D = R^d. Then K_µ(V) = ∞.
Analysis suggests this is a pretty bad sampling design, but in practice it’s fine.
32. Randomized sampling – Monte Carlo
In the earlier polynomial example on [−1, 1], K_µ(V) ∼ N².
In practice, K_µ(V) depends exponentially on d.
The ideal case: K_µ(V) ∼ N. To accomplish this, use biased sampling.
33. Randomized sampling – weighted methods
Lesson: sampling x_m ∼ µ is usually not optimal, and sometimes terrible.
Standard least-squares:
argmin_c ‖Ac − y‖_2.
Weighted least-squares:
argmin_c ‖Ac − y‖_{2,w} = argmin_c ‖√W Ac − √W y‖_2,
where W = diag(w_1, …, w_M) contains positive weights w_j.
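A minimal sketch of the weighted solve via row rescaling (placeholder random data; the weights below are uniform only to check the reduction to ordinary least squares — the next slides choose them as dµ/dµ_V):

import numpy as np

def weighted_lstsq(A, y, w):
    # Solve argmin_c ||A c - y||_{2,w} = argmin_c ||sqrt(W) A c - sqrt(W) y||_2
    # by rescaling each row m with sqrt(w_m).
    sw = np.sqrt(np.asarray(w, dtype=float))
    c, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return c

# Placeholder check: with uniform weights this reduces to ordinary least squares.
rng = np.random.default_rng(0)
A, y = rng.normal(size=(30, 5)), rng.normal(size=30)
print(np.allclose(weighted_lstsq(A, y, np.ones(30)),
                  np.linalg.lstsq(A, y, rcond=None)[0]))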
34. Randomized sampling – optimality
We can entirely circumvent the K_µ(V) problem by changing sampling measures.
Assume φ_1, …, φ_N is an L²_µ-orthonormal basis for V. Generate x_1, …, x_M iid from µ_V, where
dµ_V(x) = (1/N) ∑_{n=1}^{N} φ_n²(x) dµ(x).
Use weights
w_m = (dµ/dµ_V)(x_m) = N / ∑_{n=1}^{N} φ_n²(x_m).
Our weighted least-squares estimator is defined by
c* = argmin_c ‖Ac − y‖_{2,w}.
The measure µ_V is called the induced distribution for V.
35. Randomized sampling – optimality
Let x_1, …, x_M ∼ µ_V, with u_N(x) = ∑_{n=1}^{N} c*_n φ_n(x) computed via
c* = argmin_c ‖Ac − y‖_{2,w}.
Then
M / log M ≥ [(2 + 2r) / log(e/2)] N
guarantees that, with probability ≥ 1 − 2M^{−r},
E ‖u − u_N‖²_µ ≤ (1 + 2 / [(1 − log 2)(1 + r) log M]) σ_V(u)² + 8U² M^{−r}.
[Cohen, Migliorati 2017]
Note: This M/N dependence is essentially optimal.
36. The induced distribution
The induced distribution µV can be substantially different from µ.
x ∈ D = R²,  dµ(x) ∝ exp(−‖x‖²_2),  x = (x^(1), x^(2)),
V = span{ (x^(1))^{α₁} (x^(2))^{α₂} : (α₁ + 1)(α₂ + 1) ≤ 26 }.
[Figure: scatter plots of samples from µ (left) and from µ_V (right) over [−6, 6]².]
Under certain conditions, one can sample from this distribution very efficiently, in particular with linear complexity in d. [AN 2017]
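A grid-based sketch of sampling from µ_V in one dimension, for µ uniform on [−1, 1] and a Legendre basis: µ_V is the uniform mixture of the densities φ_n² dµ, so draw a mixture component n and then invert its numerically tabulated CDF. This is only illustrative; [AN 2017] gives an efficient, exact sampler. The basis size and grid resolution are placeholder choices.

import numpy as np

rng = np.random.default_rng(0)
N = 20
xx = np.linspace(-1.0, 1.0, 100001)                         # dense grid on [-1, 1]
norm = np.sqrt(2 * np.arange(N) + 1)
Phi = np.polynomial.legendre.legvander(xx, N - 1) * norm    # L^2_mu-orthonormal basis on the grid

def sample_induced(M):
    # mu_V is the uniform mixture over n of the densities phi_n(x)^2 dmu(x):
    # draw a mixture component n, then invert its numerically tabulated CDF.
    out = np.empty(M)
    for i in range(M):
        n = rng.integers(N)
        cdf = np.cumsum(Phi[:, n] ** 2)
        cdf /= cdf[-1]
        out[i] = np.interp(rng.uniform(), cdf, xx)
    return out

x = sample_induced(500)
Phix = np.polynomial.legendre.legvander(x, N - 1) * norm
w = N / np.sum(Phix**2, axis=1)                             # w_m = (dmu/dmu_V)(x_m)
print(x[:4], w[:4])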
37. Randomized sampling – examples
This analysis tends to give accurate estimates
[Figure: L²_µ error vs. M for N = 5 (left) and N = 45 (right), comparing sampling from µ and from µ_V; the optimal error is ‖ε‖_µ.]
38. Randomized sampling – examples
Moral of the story:
randomized sampling according to µ is generally bad
randomized sampling according to µV is generally good
Intelligent sampling allows efficient, near-optimal computation of emulators.
39. Odds and ends
Robust and accurate least-squares emulators for linear approximations can be built with biased sampling.
Estimates are optimal: M ≳ N implies ‖u − u_N‖_µ ≲ σ_V(u).
Estimates are d-independent.
Sampling is efficient if both µ and the φ_n are tensor-product.
Convergence results are robust to noise ε > 0.
No significant changes if y is vector-/function-valued.
40. Part II: Nonlinear approximation
Sparse approximation
Non-adapted basis functions, nonlinear approximation construction procedure
41. Limited measurements
u(x) + ε = y ≈ u_N(x) = ∑_{n=1}^{N} c_n φ_n(x),    x ∈ D ⊆ R^d,  y ∈ R,  c ∈ R^N,
V = span{φ_1, …, φ_N},    {(x_m, y_m)}_{m=1}^{M} → {c_n}_{n=1}^{N}.
When d > 1, it is common for an a priori N = dim V to be very large.
Least-squares: collecting M ∼ dim V measurements can be infeasible.
42. Limited measurements
What happens when M < N? The system
Ac ≈ y
is now underdetermined. Unique solutions can be guaranteed if functional structure is imposed.
43. Sparse approximation
u_N(x) = ∑_{n=1}^{N} c_n φ_n(x)
If M < N measurements are available, can we recover the largest M coefficients from the vector c?
Assume
y(x) = ∑_{n=1}^{N} c_n φ_n(x) + ε(x),    |ε| < η.
The compressibility of y is measured by
σ_{V,s}(c) = inf_{d ∈ R^N, ‖d‖_0 ≤ s} ‖c − d‖_1,    ‖d‖_0 := #{ j ∈ {1, …, N} : d_j ≠ 0 }.
44. Sparse approximation
Measurements: y_m = y(x_m), leading to the system Ac ≈ y.
y is assumed to be compressible (i.e., c is assumed compressible).
With a limited number, M, of measurements, we seek to approximate the best s-term approximation of c.
Ideally, s ∼ M.
This is not possible if the sampling points are arbitrarily chosen.
45. Compressed sensing
It is possible to recover the best s-term approximation with high probability.
Assume the x_m are sampled iid from µ, that the φ_n are L²_µ-orthonormal, and that
M ≳ C K_µ s log³(s) log N.
For any c ∈ R^N, let y_m = y(x_m) = ∑_{n=1}^{N} c_n φ_n(x_m) + ε(x_m), and assume |ε| ≤ η.
Then, with probability exceeding 1 − N^{−γ log³(s)}, the solution c* to the optimization problem
min ‖d‖_1  such that  ‖Ad − y‖_2 ≤ η √M,
satisfies
‖c − c*‖_1 ≤ C₁ σ_{V,s}(c) + C₂ √s η.
Above,
K_µ = max_{n=1,…,N} ‖φ_n‖_{L∞(D)}.
[Rauhut 2010], [Rauhut, Ward 2010]
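As an illustration of the ℓ¹ approach (not the talk's experiments), here is a sketch that attempts to recover a sparse Legendre expansion from M < N noiseless samples drawn from µ. Rather than the constrained program above, it solves the closely related LASSO relaxation with a FISTA loop; the sizes, sparsity, and regularization parameter are placeholder choices.

import numpy as np

def fista_lasso(A, y, lam, iters=5000):
    # Accelerated proximal gradient (FISTA) for
    #   argmin_d 0.5*||A d - y||_2^2 + lam*||d||_1
    L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the smooth part
    d = np.zeros(A.shape[1]); z = d.copy(); t = 1.0
    for _ in range(iters):
        g = z - A.T @ (A @ z - y) / L
        d_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = d_new + ((t - 1.0) / t_new) * (d_new - d)
        d, t = d_new, t_new
    return d

rng = np.random.default_rng(0)
M, N, s = 60, 100, 4                           # placeholder sizes: M < N, s-sparse truth
x = rng.uniform(-1.0, 1.0, size=M)             # x_m iid from mu (uniform on [-1, 1])
norm = np.sqrt(2 * np.arange(N) + 1)
A = np.polynomial.legendre.legvander(x, N - 1) * norm   # L^2_mu-orthonormal Legendre basis
c_true = np.zeros(N)
c_true[rng.choice(N, s, replace=False)] = rng.normal(size=s)
y = A @ c_true                                 # noiseless data (eta = 0)
c_hat = fista_lasso(A, y, lam=1e-3)
print("relative l1 error:", np.linalg.norm(c_hat - c_true, 1) / np.linalg.norm(c_true, 1))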
46. Recovery of models with sparse representations
[Figure: transition plots for uniform random variables, d = 2, comparing recovery under sampling from the random variable’s distribution, the CSA method, and asymptotic sampling.]
We observe poor recovery since K_µ in the sample requirement is poorly behaved:
M ≳ C K_µ s log³(s) log N,    K_µ = max_{n=1,…,N} ‖φ_n‖_{L∞(D)}.
This requirement is heavily dependent on µ.
47. Better sampling
Again, choosing a better sampling strategy ameliorates this issue.
Sample x_m ∼ µ_V, and solve
min ‖d‖_1  such that  ‖Ad − y‖_{2,w} ≤ η √M,
where w are weights that make the discrete sampling unbiased.
[Figure: ℓ² error vs. number of samples M for a parameterized diffusion equation, comparing the weighted/CSA sampling approach with standard MC and asymptotic sampling; high-degree polynomials in up to 20 dimensions.]
[Jakeman, AN 2017], [Guo, Zhou, Chen, AN 2016]
48. Part III: Nonlinear approximation
Dimension reduction/reduced modeling
Adapted basis functions, nonlinear approximation construction procedure
49. Dimension reduction
u(x) + ε = y ≈ u_N(x) = ∑_{n=1}^{N} c_n φ_n(x),    x ∈ D ⊆ R^d,  y ∈ R^P,  c_n ∈ R^P.
A “sample” y_m is a vector, possibly of large size, P ≫ 1.
In scientific models, P is also an indicator of the effort to obtain y_m.
Construct V and φ_1, …, φ_N by analyzing
{(x_m, y_m)}_{m=1}^{M},    (x_m, y_m) ∈ R^d × R^P.
The φ_n are adapted to the data.
Though the φ_n have no explicit form, evaluating such functions can be much cheaper than gathering more data.
50. Reduced basis methods
Gather (x_m, y_m) from a scientific model.
The reduced basis method (RBM) for nonlinear, adapted approximation constructs the emulator
u_N(x) = ∑_{n=1}^{N} c_n φ_n(x) = ∑_{n=1}^{N} y_n ℓ_n(x).
Here:
  We need at least N = M data samples y_m.
  The ℓ_n are cardinal Lagrange functions, satisfying ℓ_n(x_m) = δ_{n,m}. They have no explicit form.
  The ℓ_n are defined implicitly from the scientific model. (Via a Galerkin procedure.)
  This is not POD.
The space V = span{φ_n}_{n=1}^{N} is constructed/defined from the data and the model.
There is no reason to believe this is a good idea unless the x_m are chosen well!
51. Reduced basis methods
End goal: evaluation of the surrogate u_N should cost less than acquiring more data. Costs:
  Evaluating the Lagrange functions ℓ_n is the hard part – complexity usually scales like N³.
  The full model y_m is queried only at x_m, and nowhere else.
  Details of the computational efficiency of the surrogate u_N depend on the particular problem.
In practice, N ∼ O(10).
52. Lagrange functions
[Figure: the N = 10 cardinal Lagrange functions over µ ∈ [−1, 1], the error indicator ∆̃₁₀(µ), the error ‖u^N(µ) − u^N₁₀(µ)‖_X compared against the indicator, and the decay of the error with the number of bases N.]
53. RBM accuracy
Does u_N computed via RBM provide a good emulator for u? Depends on the sampling.
Let u(x_m) ∈ H. Suppose we choose
x_{n+1} = argmax_{x ∈ D} ‖u_n(x) − u(x)‖_H.
(This can be approximated without knowing u!)
Then,
‖u − u_N‖_{L∞(D,H)} ≲ σ_N(U),
where
U := { u(x) : x ∈ D } ⊂ H,
σ_N(U) := inf_{dim V = N}  sup_{v ∈ U}  inf_{v_N ∈ V} ‖v − v_N‖_H.
[DeVore et al 2013], [Binev et al 2013]
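A toy sketch of the greedy selection rule on a finite training set of snapshots. In a genuine RBM the true error ‖u_n(x) − u(x)‖_H is not available and is replaced by a computable a posteriori error indicator; here the true projection error is used purely to illustrate the argmax selection, and the parameterized model u is a hypothetical placeholder.

import numpy as np

def greedy_basis(U, N):
    # U: (P, K) array whose columns are snapshots u(x_k) for K candidate parameters.
    # Repeatedly pick the snapshot worst-approximated by the current basis and add it.
    idx, Q = [], np.zeros((U.shape[0], 0))
    for _ in range(N):
        resid = U - Q @ (Q.T @ U)                 # projection error onto current span
        k = int(np.argmax(np.linalg.norm(resid, axis=0)))
        idx.append(k)
        Q = np.column_stack([Q, resid[:, k] / np.linalg.norm(resid[:, k])])
    return idx, Q

# Hypothetical model: u(x)(z) = 1 / (1 + x z) on a grid in z, parameter x in [0, 1].
z = np.linspace(0.0, 1.0, 400)
xs = np.linspace(0.0, 1.0, 200)
U = 1.0 / (1.0 + np.outer(z, xs))
idx, Q = greedy_basis(U, N=8)
print("selected parameters:", np.round(np.array(xs)[idx], 3))
print("max error over training set:", np.linalg.norm(U - Q @ (Q.T @ U), axis=0).max())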
54. RBM accuracy
Surrogates for nontrivial problems can be constructed.
(−∆)^s u(z; x) = f(z; ν),    (z, x) ∈ Ω × D,
u(z; x) = 0,    (z, x) ∈ ∂Ω × D.
Parameters/variables are x = (s, ν).
[Figure: RBM error decay with N, and the cumulative computation time for M queries of the full-order model versus the RBM surrogate; on the left the case s ∈ D₁ with N = 7, on the right s ∈ D₂.]
[Antil, Chen, AN 2018]
55. Building emulators
Surrogate models can be enormously useful.
Linear approximations with non-adapted basis functions:
  “Easiest” to construct, with the weakest accuracy guarantees.
  Querying the surrogate is generally very fast.
  Useful for analyzing large datasets.
Nonlinear approximations with non-adapted basis functions:
  Harder to construct, but more general accuracy guarantees.
  Querying the surrogate is still very fast.
  Useful when data is limited.
Nonlinear approximations with adapted basis functions:
  Generally very hard to construct.
  Very attractive accuracy bounds, when it is possible to certify them.
  Depend heavily on the data, the model, and the transparency of the model.
56. Building emulators
Challenges:
high dimensionality (d, P, or N)
adaptivity and hierarchical constructions
57. mathematics of reduced order models
algorithms for approximation and complexity reduction
computational statistics and data-driven techniques
https://icerm.brown.edu/programs/sp-s20/
58. References
Chkifa, Cohen, Migliorati, Nobile, & Tempone, "Discrete least squares polynomial approximation with random evaluations – application to parametric and stochastic elliptic PDEs", ESAIM: Mathematical Modelling and Numerical Analysis, 49:3 (2015).
Cohen, Davenport, & Leviatan, "On the Stability and Accuracy of Least Squares Approximations", Foundations of Computational Mathematics, 13:5 (2013).
Cohen & Migliorati, "Optimal weighted least-squares methods", arXiv:1608.00512 [math, stat].
Jakeman, Narayan, & Zhou, "A Christoffel function weighted least squares algorithm for collocation approximations", Mathematics of Computation, 86:306 (2017).
Narayan, "Computation of Induced Orthogonal Polynomial Distributions", arXiv:1704.08465 [math] (2017).
Narayan & Zhou, "Stochastic Collocation on Unstructured Multivariate Meshes", Communications in Computational Physics, 18:1 (2015).