Small Sample Analysis and Algorithms
for Multivariate Functions
Mac Hyman
Tulane University
Joint work with Lin Li, Jeremy Dewar, and Mu Tian (SUNY),
SAMSI WG5, May 7, 2018
The Problem: Accurate integration of multivariate functions
Goal: To estimate the integral I = ∫_Ω f(x) dx,
where f(x) : Ω → R, Ω ⊂ R^d
We are focused on situations where:
There are few samples (n < a few thousand) and the effective dimension is relatively small, x ∈ R^d (d < 50);
Function evaluations (samples) f(x) are (very) expensive, such as a large-scale simulation, and additional samples may not be obtainable;
Little a priori information about f(x) is available; and
We might not have control over the sample locations, which can be far from a desired distribution (e.g., missing data).
Identify new sample locations to minimize MSE based on existing
information.
The Problem: Accurate integration of multivariate functions
Goal: To estimate the integral I = ∫_Ω f(x) dx,
where f(x) : Ω → R, Ω ⊂ R^d
Four approaches that work pretty well in practice.
How well do they work in theory?
1. Detrending using covariates
2. Voronoi Weighted Quadrature
3. Surrogate Model Quadrature
4. Adaptive Sampling Based on Kriging SE Estimates
Detrending before Integrating
1. Detrending using covariates
Detrending first approximates the underlying function with an easily
integrated surrogate model (covariate).
The integral is then estimated by the exact integral of the surrogate +
an approximation of the residual.
For example, f(x) can be approximated by a linear combination of
simple basis functions, such as Legendre polynomials,

p(x) = Σ_{i=1}^t β_i ψ_i(x),

which can be integrated exactly, and define

I(f) = ∫_Ω f(x) dx                                      (1)
     = ∫_Ω p(x) dx + ∫_Ω [f(x) − p(x)] dx.              (2)

The goal is to pick p(x) to minimize the residual ∫_Ω [f(x) − p(x)] dx.
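A minimal Python sketch of this construction (the monomial dictionary, sample sizes, and trig test integrand are illustrative assumptions; the slides use a Legendre basis):

```python
import numpy as np
from itertools import combinations_with_replacement

# Sketch of polynomial detrending on [0,1]^d; a monomial dictionary
# stands in for the Legendre basis used in the slides.
rng = np.random.default_rng(0)
d, n, degree = 3, 500, 3

def f(x):  # test integrand: prod_i cos(i * x_i)
    return np.prod(np.cos(np.arange(1, d + 1) * x), axis=-1)

# All multi-indices of total degree <= `degree` (constant term included).
exps = [np.bincount(np.array(c, dtype=int), minlength=d)
        for k in range(degree + 1)
        for c in combinations_with_replacement(range(d), k)]

X = rng.random((n, d))                       # the given samples
A = np.column_stack([np.prod(X ** e, axis=1) for e in exps])
beta, *_ = np.linalg.lstsq(A, f(X), rcond=None)

# Each monomial integrates exactly over [0,1]^d: prod_i 1/(e_i + 1).
I_trend = beta @ np.array([np.prod(1.0 / (e + 1.0)) for e in exps])
I_hat = I_trend + np.mean(f(X) - A @ beta)   # trend integral + mean residual
I_exact = np.prod(np.sin(np.arange(1, d + 1)) / np.arange(1, d + 1))
print(I_hat - I_exact, f(X).mean() - I_exact)  # detrended vs. plain MC error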
Detrending before Integrating
The error of the detrended integral is proportional to the
standard deviation of the residual p(x) − f(x), not of f(x)
The residual errors are the only errors in the integration approximation
I(f) = ∫_Ω f(x) dx                                           (3)
≈ Î(f) = ∫_Ω p(x) dx + (1/n) Σ_i [f(x_i) − p(x_i)]           (4)

PMC error bound: ||e_n|| = O(σ(f − p)/√n), and
QMC error bound: ||e_n|| ≤ O(V[f − p] (log n)^{d−1}/n)
1. The error bounds are based on σ(f − p) and V[f − p] instead of σ(f)
and V[f]. The least squares fit reduces these quantities.
2. The convergence rates are the same; the constants are reduced.
Quintic detrending reduces MC and QMC errors by a factor of 100
Error Î(f) − I(f) distributions for I(f) = ∫_{[0,1]^6} Π_i cos(i x_i) dx
Error distributions (6D, 600 points) for PMC (top) and LDS QMC (bottom),
detrended with cubic and quintic (K = 3, 5) polynomials.
The x-axis bounds are 10 times smaller for the LDS/QMC samples.
Detrending reduces the error constant, not the convergence rates
Detrending doesn't change the convergence rates of PMC
(O(n^{−1/2})) and QMC (O(n^{−1})) for ∫_{[0,1]^6} Π_i cos(i x_i) dx
Errors for PMC (upper lines −−) and QMC (lower lines − · −) for
constant K = 0 (left), cubic K = 3 (center), and quintic K = 5 (right).
Detrending reduces the error constant, not the convergence rates
Mean errors for ∫_{[0,1]^5} Π_i cos(i x_i) dx with detrending
Detrending errors: degrees K = 0, 1, 2, 3, 4, 5 for 500–4000 samples.
Convergence rates don't change, but the constant is reduced by a factor of 0.001.
Curse of Dimensionality for polynomial detrending
High degree polynomials in high dimensions are quickly
constrained by the Curse of Dimensionality
For the least squares coefficients to be identifiable, the number of samples
must be ≥ the number of coefficients in the detrending function.
Degree \ Dimension |  1 |  2 |   3 |     4 |     5 |      10 |         20
                 0 |  1 |  1 |   1 |     1 |     1 |       1 |          1
                 1 |  2 |  3 |   4 |     5 |     6 |      11 |         21
                 2 |  3 |  6 |  10 |    15 |    21 |      66 |        231
                 3 |  4 | 10 |  20 |    35 |    56 |     286 |      1,771
                 4 |  5 | 15 |  35 |    70 |   126 |   1,001 |     10,626
                 5 |  6 | 21 |  56 |   126 |   252 |   3,003 |     53,130
                10 | 11 | 66 | 286 | 1,001 | 3,003 | 184,756 | 30,045,015
The mixed variable terms in multivariate polynomials create an explosion
in the number of terms as a function of the degree and dimension.
For example, a 5th degree polynomial in 20 dimensions has 53,130 terms.
The complexity of this approach grows linearly in the number of
basis functions, and as O(n^3) as the number of samples increases.
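The table entries are binomial coefficients: a total-degree-K polynomial in d variables has C(d + K, K) coefficients. A one-line check of the table:

```python
from math import comb

# Number of coefficients in a total-degree-K polynomial in d variables.
def n_terms(K, d):
    return comb(d + K, K)

print(n_terms(5, 20))   # 53130, matching the table
print(n_terms(10, 20))  # 30045015
```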
Sparse model selection
The Lp sparse penalty regularization method
p(x) = Σ_{i=1}^t β_i ψ_i(x)                                  (5)

β = argmin { (1/2) ||Aβ − f||_2^2 + (λ/p) ||β||_p }          (6)
This system can be solved using a cyclic coordinate descent algorithm, or
a factored iteratively reweighted least-squares (IRLS) method that solves a
linear system of n (= number of samples) equations on each iteration.
If the function f (x) varies along some directions more than others, then
sparse subset selection extracts the appropriate basis functions based on
the effective dimension of the active subspaces.
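For p = 1, problem (6) is the lasso, and cyclic coordinate descent is exactly what scikit-learn implements. A sketch on a synthetic sparse problem (the matrix A, noise level, and sparsity pattern are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso

# L1-penalized detrending coefficients via cyclic coordinate descent.
# (sklearn scales the squared loss by 1/(2n), so alpha plays the role of lambda.)
rng = np.random.default_rng(1)
A = rng.random((500, 200))                   # dictionary evaluated at the samples
beta_true = np.zeros(200)
beta_true[:5] = [3.0, -2.0, 1.5, 1.0, -0.5]  # only a few active directions
fvals = A @ beta_true + 0.01 * rng.standard_normal(500)

model = Lasso(alpha=0.01, fit_intercept=False, max_iter=10_000).fit(A, fvals)
print(np.count_nonzero(model.coef_), "of", model.coef_.size, "terms kept")
```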
Sparse model selection
Sparse subset detrending allows high degree polynomial
dictionaries for sparse sample distributions.
∫_{[0,1]^6} Π_i cos(i x_i) dx; errors for PMC (top) and QMC (bottom), degrees
K = 0, 3, 5. The K = 5 fits keep 35% (PMC) or 29% (QMC) of the terms.
Least squares detrending = a weighted quadrature rule
Least squares detrending is equivalent to using a weighted
quadrature rule
The integral of a least squares detrending fit through the data points can
be represented as a weighted quadrature rule:
I(f) = ∫_Ω f(x) dx = Σ_i ∫_{Ω_i} f(x) dx = Σ_i w_i f̄_i ≈ Σ_i ŵ_i f(x_i)

where f̄_i is the mean of f over the Voronoi cell Ω_i (with volume w_i) around
x_i, and ŵ_i ≈ w_i. The error depends on (f̄_i − f(x_i)) and (w_i − ŵ_i).
When the sample points have low discrepancy, then w_i ≈ ŵ_i = 1/n is a
good approximation.
Can this be improved if we replace the weights with a better approximation
of the Voronoi volume?
Voronoi Weighted Quadrature
2. Voronoi Weighted Quadrature
The Voronoi weighted quadrature rule is defined as

I_n(f) = Σ_{i=1}^n w_i f(x_i)

where w_i is the volume of the Voronoi cell of x_i, the region of Ω that is
closer to x_i than to any other sample point.
• The Voronoi weighted quadrature rule, I_n(f), is exact if f(x) is
piecewise constant over each Voronoi volume.
• Solving for the exact Voronoi volumes is expensive in high dimensions
and suffers from the curse of dimensionality.
• Solution: Use LDS to approximate these volumes.
Voronoi Weighted Quadrature
The weights for the Voronoi quadrature rule

I_n(f) = Σ_i w_i f(x_i)

can be approximated using the nearest neighbors of a dense reference LDS.
Step 1: Generate a dense LDS, {x̂_j}, with N_LDS points.
Step 2: Compute the distance from each LDS point to the original sample set.
Step 3: Define W_i as the number of LDS points closest to x_i.
Step 4: Rescale these counts to define the weights w_i = W_i / N_LDS
(and normalize by the domain volume, if needed).
The weights w_i converge to the Voronoi volumes as O(1/N_LDS)
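A minimal sketch of Steps 1-4 on the unit cube, using a scrambled Sobol sequence as the reference LDS and a k-d tree for the nearest-neighbor counts (the test integrand and sample sizes are illustrative assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import qmc

rng = np.random.default_rng(2)
d, n, n_lds = 3, 100, 2**14
X = rng.random((n, d))                                     # the given samples

lds = qmc.Sobol(d, scramble=True, seed=0).random(n_lds)    # Step 1: dense LDS
_, nearest = cKDTree(X).query(lds)                         # Steps 2-3: nearest sample
w = np.bincount(nearest, minlength=n) / n_lds              # Step 4: w_i = W_i / N_LDS

def f(x):
    return np.prod(np.cos(np.arange(1, d + 1) * x), axis=-1)

I_n = w @ f(X)                                             # Voronoi weighted quadrature
I_exact = np.prod(np.sin(np.arange(1, d + 1)) / np.arange(1, d + 1))
print(I_n, I_exact)
```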
Voronoi Weighted Quadrature
Voronoi Volumes Estimated by Low Discrepancy Sample
The fraction of LDS samples nearest to each sample is used to estimate
the Voronoi volume for the sample as a fraction of the domain volume.
Works for samples living in a blob.
Voronoi Weighted Quadrature
Simple 3D example
[Figure: trig integration error in 3D, log10(mean error) vs. log10(number of samples). Left panel: MC error and MC Voronoi error. Right panel: LDS error and LDS Voronoi error.]
In 3D, the Voronoi weights are much more effective in reducing the errors
when the original sample is iid MC than when it is an LDS QMC sample.
I_n(f) = Σ_i ŵ_i f(x_i), where ŵ_i is an estimate of the Voronoi volume of x_i.
Voronoi Weighted Quadrature
Simple 6D example
[Figure: trig integration error in 6D, log10(mean error) vs. log10(number of samples). Left panel: MC error and MC Voronoi error. Right panel: LDS error and LDS Voronoi error.]
In 6D, the Voronoi weighted quadrature reduces the errors for iid MC
samples. The approach is not effective for LDS in higher dimensions.
We are looking for ideas to explain why the Voronoi weighted quadrature
approach is less effective for LDS in higher dimensions.
Surrogate Model Quadrature
3. Surrogate Model Quadrature
Interpolate samples to a dense LDS, and use standard QMC quadrature
on the surrogate points.
Step 1: Generate a dense LDS sample, {x̂_j}, with N_LDS points.
Step 2: Use kriging to approximate f̂(x̂_j) at the LDS points.
Step 3: Estimate the integral by

I(f) ≈ (1/N_LDS) Σ_j f̂(x̂_j)
We use the DACE kriging package with a quadratic polynomial basis
based on distance-weighted least-squares with radial Gaussian weights.
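A sketch of the three steps; scikit-learn's Gaussian process regressor stands in here for the MATLAB DACE toolbox used in the slides, so the kernel and settings are illustrative assumptions:

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)
d, n, n_lds = 3, 100, 2**13
X = rng.random((n, d))                                     # the given samples

def f(x):
    return np.prod(np.cos(np.arange(1, d + 1) * x), axis=-1)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
gp.fit(X, f(X))                                            # Step 2: krige the samples

lds = qmc.Sobol(d, scramble=True, seed=0).random(n_lds)    # Step 1: dense LDS
I_hat = gp.predict(lds).mean()                             # Step 3: QMC mean on surrogate
print(I_hat, np.prod(np.sin(np.arange(1, d + 1)) / np.arange(1, d + 1)))
```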
Surrogate Model Quadrature
Simple 3D example
[Figure: trig integration error in 3D, log10(mean error) vs. log10(number of samples). Left panel: MC error and MC SLDS error. Right panel: LDS error and LDS SLDS error.]
In 3D, the surrogate data points are effective in reducing the errors for
both iid MC and LDS samples.
Surrogate Model Quadrature
Simple 6D example
[Figure: trig integration error in 6D, log10(mean error) vs. log10(number of samples). Left panel: MC error and MC SLDS error. Right panel: LDS error and LDS SLDS error.]
In 6D, the surrogate data points are effective in reducing the errors for
both iid MC and LDS samples.
Comparing Voronoi and Surrogate Model Quadrature
The surrogate quadrature is consistently better than the
Voronoi quadrature
[Figure: trig integration error in 3D, log10(mean error) vs. log10(number of samples). Left panel: MC, MC Voronoi, and MC SLDS errors. Right panel: LDS, LDS Voronoi, and LDS SLDS errors.]
Both methods reduce the errors in this 3D problem.
Comparing Voronoi and Surrogate Model Quadrature
The surrogate quadrature is consistently better than the
Voronoi quadrature
[Figure: trig integration error in 6D, log10(mean error) vs. log10(number of samples). Left panel: MC, MC Voronoi, and MC SLDS errors. Right panel: LDS, LDS Voronoi, and LDS SLDS errors.]
• In this 6D problem, both methods reduce the error when the original
sample is not an LDS.
• When the original sample is an LDS, the Voronoi quadrature doesn't improve
the accuracy, while the surrogate model continues to be effective.
Adaptive Sampling Quadrature
4. Adaptive Sampling Based on Kriging SE Estimates
Instead of adding new samples to ’fill in the holes’ of the existing
distribution, use kriging error estimates to guide future samples.
Iterate until converged, or max number of function values is reached:
Step 1: Generate a dense LDS sample, {x̂_j}, with N_LDS points.
Step 2: Use kriging to approximate the function, f̂(x̂_j), and estimate
standard errors, SE_j, at the LDS points.
Step 3: If max{SE_j} > tolerance, evaluate the function at the LDS point with
the largest SE_j, and return to Step 2.
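A sketch of the loop in 2D, again substituting scikit-learn's Gaussian process (with its predictive standard errors) for DACE; the tolerance, budget, and kernel are illustrative assumptions:

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(4)
d, n0, budget, tol = 2, 10, 40, 1e-3

def f(x):
    return np.prod(np.cos(np.arange(1, d + 1) * x), axis=-1)

X = rng.random((n0, d))                                   # initial random sample
y = f(X)
cand = qmc.Sobol(d, scramble=True, seed=0).random(2**10)  # Step 1: dense LDS

while True:
    gp = GaussianProcessRegressor(kernel=RBF(0.3), normalize_y=True).fit(X, y)
    _, se = gp.predict(cand, return_std=True)             # Step 2: SE_j at LDS points
    j = int(np.argmax(se))
    if se[j] <= tol or len(X) >= budget:                  # Step 3: stop test
        break
    X = np.vstack([X, cand[j]])                           # sample at largest SE
    y = np.append(y, f(cand[j]))

print(len(X), "samples; surrogate integral:", gp.predict(cand).mean())
```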
Adaptive Sampling Quadrature
Initial Random Sample
[Figure: two panels over the unit square [0,1]^2.]
• = current samples.
Large standard errors are shown as red circles; smaller errors are blue.
The next sample will be evaluated at the point with the largest SE.
Adaptive Sampling Quadrature
First and second adaptive samples are in the corners
[Figure: two panels over the unit square [0,1]^2.]
Large standard errors are shown as red circles; smaller errors are blue.
• = current samples; • = largest SE and next sample.
Adaptive Sampling Quadrature
The new samples fill in the holes to reduce the uncertainty
[Figure: two panels over the unit square [0,1]^2.]
Large standard errors are shown as red circles; smaller errors are blue.
• = current samples; • = largest SE and next sample.
Future Research Questions
Continue exploring surrogate models for guiding adaptive sampling.
Develop theory for how many LDS surrogate samples are needed for the
Voronoi weights and surrogate quadrature methods.
Use the surrogate approach to interpolate to a sparse grid, instead of the
LDS, and use higher order quadrature rules.
Combine the surrogate LDS methods with the detrending approaches.
Develop kriging methods that preserve local positivity, monotonicity, and
convexity of the data for both design of experiment surrogate models and
surrogate quadrature methods.