This paper investigates error bounds for reduced order modeling (ROM) by examining the effect of the probability distribution function used to sample errors. Earlier work found bounds could be overly conservative using a normal distribution. Different distributions were tested to find the smallest multiplier needed to ensure 90% of sampled errors were below the bound. The binomial distribution resulted in the smallest multiplier, providing a more realistic error bound closer to actual reduction errors compared to other distributions like the normal or uniform. Using the binomial distribution allows generating error bounds for ROM predictions that are less conservative than previous approaches.
ANS MC2015 - Joint International Conference on Mathematics and Computation (M&C), Supercomputing in Nuclear Applications (SNA) and the Monte
Carlo (MC) Method • Nashville, TN • April 19-23, 2015, on CD-ROM, American Nuclear Society, LaGrange Park, IL (2015)
Further Investigation of Error Bounds for Reduced Order Modeling
Mohammad Abdo and Hany S. Abdel-Khalik.
School of Nuclear Engineering, Purdue University
Corresponding Address
abdo@purdue.edu; abdelkhalik@purdue.edu
ABSTRACT
This manuscript investigates the level of conservatism of the bounds developed in earlier work to
capture the errors resulting from reduced order modeling. Reduced order modeling is premised on
the fact that large areas of the input and/or output spaces can be safely discarded from the analysis
without affecting the quality of predictions for the quantities of interest. For this premise to be
credible, ROM models must be equipped with theoretical bounds that can guarantee the quality of
the ROM model predictions. Earlier work has devised an approach in which a small number of
oversamples are used to predict such a bound. Results indicated that the bound may sometimes be
too conservative, which would negatively impact the size and hence the efficiency of the ROM
model.
Key Words: Error bounds, reduced order modeling.
INTRODUCTION
Reduced order modeling (ROM) denotes any process by which the complexity of the model can
be reduced to render its repeated execution computationally practical. Almost all engineering
practitioners rely on ROM in some form. To warrant the use of the ROM model in lieu of the
original model, the quality of its predictions must be assessed against the original model
predictions. This is in general a difficult problem and it largely depends on how the ROM model
is constructed. In our earlier work, we have relied on the use of randomized techniques to
identify the so-called active subspace that can be used to reduce the effective dimensionality of
the models [Abdel-Khalik, et al., 2012]. In doing so, the same physics model is employed;
however, its input parameters and output responses are constrained to their respective active
subspaces. This approach for building the ROM models has many advantages, among which is the
ability to construct an upper-bound on the error resulting from the reduction. This bound
provides a warranty to the user that future predictions of the ROM model will be within the
bound when compared to the original model predictions.
Earlier work has shown that the construction of the bound depends on the probability distribution
function (PDF) from which the oversamples are drawn. This was inspired by the analytic work
by Dixon [1983], where he uses a normal distribution to estimate the 2-norm of a matrix. We
have noticed that the use of this PDF results in an overly conservative bound, implying that the
actual reduction errors can be one order of magnitude less than the nominal bound. This
manuscript investigates the effect of the PDF choice on the level of conservatism of the bound,
and whether there exist other distributions that render a more realistic bound estimate.
DESCRIPTION OF PROPOSED APPROACH
Consider a model of reactor physics simulation of the form:
$$ y = f(x) \tag{1} $$
where $x \in \mathbb{R}^{n}$ are $n$ reactor physics parameters, e.g., cross-sections, and $y \in \mathbb{R}^{m}$ are $m$ reactor
responses of interest, e.g., eigenvalue, peak clad temperature, etc. The ROM approach aims to
approximate the original model by $\tilde{f}$, which replaces it in any computationally expensive analysis,
such as uncertainty characterization. To certify the robustness of the ROM, the process must be
equipped with an error criterion:
$$ \| f(x) - \tilde{f}(x) \| \le \epsilon_{b} \le \epsilon_{\text{user}} \quad \forall x \in S \tag{2} $$
where $\epsilon_{b}$ is an error upper-bound to be determined based on the reduction spaces, $\epsilon_{\text{user}}$
is a user-defined tolerance, and $S$ is the parameter active space extracted by any ROM technique.
Using our ROM, the approximation $\tilde{f}$ will be:
$$ \tilde{f}(x) = N f(K x) $$
where both $N$ and $K$ are rank-deficient transformation matrices such that:
$$ N = Q_y Q_y^{T} \in \mathbb{R}^{m \times m}, \quad \dim(R(N)) = r_y, \quad r_y \le \min(m, n), $$
$$ K = Q_x Q_x^{T} \in \mathbb{R}^{n \times n}, \quad \dim(R(K)) = r_x, \quad r_x \le \min(m, n). $$
Now let’s look at the error operator as an unknown black box which can be sampled and aggregated
in a matrix $E$, where the $ij$-th element of $E$ represents the error in the $i$-th response of the $j$-th sample,
which can be written as:
$$ [E]_{ij} = \frac{f_i(x_j) - Q_y(i,:)\, Q_y^{T} f(Q_x Q_x^{T} x_j)}{f_i(x_j)} \tag{3} $$
The matrix E calculates the component of the function f that is truncated, i.e., discarded, by the
ROM application. If the norm of the matrix E can be estimated, an upper-bound on the resulting
ROM errors in the function evaluation can be computed. Note that each row of the matrix E
represents a response, implying that if one treats each row as a matrix, it is possible to calculate
a different error bound for each response. This is important since each response is expected to have
its own reduction error.
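The construction of the error matrix can be illustrated with a minimal sketch. It assumes the reduced model takes the form of a projection of the inputs onto the parameter subspace followed by a projection of the outputs onto the response subspace, and it measures the relative difference between the original and reduced responses. The toy linear model, the function name `rom_error_matrix`, and the dimensions are hypothetical choices made only for illustration; they are not taken from the authors' implementation.

```python
import numpy as np

def rom_error_matrix(f, Qx, Qy, samples):
    """Relative reduction error: row i is a response, column j is a sample."""
    m = Qy.shape[0]
    E = np.empty((m, samples.shape[1]))
    for j in range(samples.shape[1]):
        x = samples[:, j]
        exact = f(x)
        # Reduced model: project inputs with Qx Qx^T, outputs with Qy Qy^T.
        reduced = Qy @ (Qy.T @ f(Qx @ (Qx.T @ x)))
        E[:, j] = (exact - reduced) / exact
    return E

rng = np.random.default_rng(0)
n, m, r = 8, 6, 3
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
A = U @ V.T                      # toy rank-r linear model (assumption): f(x) = A x
f = lambda x: A @ x

Qy, _ = np.linalg.qr(U)          # orthonormal basis of the response subspace
Qx, _ = np.linalg.qr(V)          # orthonormal basis of the parameter subspace
samples = rng.standard_normal((n, 10))

E_full = rom_error_matrix(f, Qx, Qy, samples)                 # full subspaces kept
E_trunc = rom_error_matrix(f, Qx[:, :2], Qy[:, :2], samples)  # one direction dropped
row_norms = np.linalg.norm(E_trunc, axis=1)                   # one error norm per response
```

With the full subspaces retained, the toy model is reproduced to rounding error; dropping a direction produces a non-zero error matrix whose rows can be treated separately, one per response.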
Earlier work has shown that a matrix norm can be estimated using randomized inner products,
an approach due to [Dixon, 1983]. This approach may be described as follows. Consider a matrix
$E \in \mathbb{R}^{m \times N}$ and a random vector $x \in \mathbb{R}^{N}$ such that $x_i \sim \mathcal{D}$, where $\mathcal{D}$ is a known distribution. Then
$x$ can be used to estimate the 2-norm of $E$ via:
$$ P\left( \|E\|_2 \le \eta \, \|E x\|_2 \right) \ge p \tag{4} $$
where $\eta$ is a scalar multiplier. It is intuitive that $p$ depends on both $\eta$ and the probability distribution
function of $x$. In this work we chose to determine this dependence numerically, without constraining the
distribution to the normal or even the uniform distribution. In fact, the main goal of this work is
to inspect the sensitivity of the error statement to the distribution used. To do this, we fix $p$ at
0.9 and find the $\eta$ that makes 90% of the test cases satisfy the bound. This is repeated for many
distributions, and the distribution with the smallest multiplier $\eta$ is selected.
The analytic version of this statement was first introduced by Dixon (1983), who used a
normal distribution to estimate the 2-norm of a matrix. Dixon has shown that, in the case of the normal
distribution, the multiplier $\alpha \sqrt{2/\pi}$ corresponds to a probability of $1 - \alpha^{-1}$, which means that
with an $\alpha$ value of 10, the multiplier will be approximately 8 with a probability of success of 0.9. As we
estimate the rank of a ROM model, the magnitudes of the singular values typically become
closer to each other, implying that $\|E x\|_2$ is expected to be close to $\|E\|_2$, thereby rendering a
multiplier of 8 impractical. To overcome this situation, we have inspected other distributions and
numerically tested the multiplier in each case to determine the distribution with the smallest
multiplier.
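The arithmetic behind the multiplier of 8 can be checked with a short sketch. It assumes the normal-distribution multiplier has the form α·sqrt(2/π) with success probability 1 − 1/α; this form is inferred from the quoted numbers (a parameter value of 10, a multiplier of about 8, a probability of 0.9). The function names are hypothetical.

```python
import math

def dixon_multiplier(alpha: float) -> float:
    """Assumed normal-distribution multiplier: alpha * sqrt(2 / pi)."""
    return alpha * math.sqrt(2.0 / math.pi)

def dixon_probability(alpha: float) -> float:
    """Assumed probability of success: 1 - 1 / alpha."""
    return 1.0 - 1.0 / alpha

alpha = 10.0
print(f"{dixon_multiplier(alpha):.2f}")   # 7.98
print(f"{dixon_probability(alpha):.1f}")  # 0.9
```

Under this assumption the value 7.98 agrees with the Gaussian entry of Table I.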
CALCULATIONAL PROCEDURE AND RESULTS
To find the multiplier corresponding to each distribution, the following procedure is deployed:
1. Generate a random matrix $E$ and a random vector $x$ sampled from the distribution under
inspection.
2. Compute $\|E\|_2$ and $\|E x\|_2$.
3. Repeat steps 1 and 2 one million times.
4. Compute the multiplier $\eta$ such that 90% of the cases satisfy $\|E\|_2 \le \eta \, \|E x\|_2$
(i.e., $P(\|E\|_2 \le \eta \, \|E x\|_2) \ge 0.9$).
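The steps above can be sketched as a small Monte Carlo experiment. This is a minimal illustration under stated assumptions, not the authors' code: the random matrix is filled with standard normal entries, the sampled vector is not normalized, the binomial sampler uses the vector length as its number of trials, and the trial count is far below one million to keep the run short. The source does not specify these choices, so the sketch should not be expected to reproduce the tabulated multipliers. The names `SAMPLERS`, `norm_ratios`, and `multiplier` are hypothetical.

```python
import numpy as np

# Hypothetical samplers: each returns a random vector x of length n.
SAMPLERS = {
    "gaussian": lambda rng, n: rng.standard_normal(n),
    "uniform": lambda rng, n: rng.uniform(-1.0, 1.0, n),
    "binomial": lambda rng, n: rng.binomial(n, 0.9, n).astype(float),
}

def norm_ratios(sampler, m=5, n=20, trials=2000, seed=0):
    """Sample the ratio ||E||_2 / ||E x||_2 over independent (E, x) pairs."""
    rng = np.random.default_rng(seed)
    ratios = np.empty(trials)
    for t in range(trials):
        E = rng.standard_normal((m, n))   # assumption: Gaussian random matrix
        x = sampler(rng, n)
        ratios[t] = np.linalg.norm(E, 2) / np.linalg.norm(E @ x)
    return ratios

def multiplier(ratios, p=0.9):
    """Multiplier for which a fraction p of cases satisfy ||E|| <= eta * ||E x||."""
    return float(np.quantile(ratios, p))

for name, sampler in SAMPLERS.items():
    eta = multiplier(norm_ratios(sampler, seed=0))
    fresh = norm_ratios(sampler, seed=1)            # a different sample set
    success = float(np.mean(fresh <= eta))          # fraction of cases within the bound
    print(name, round(eta, 3), round(success, 3))
```

Taking the 0.9 quantile of the sampled ratios is one way to realize step 4; the second sample set gives an independent check that roughly 90% of new cases fall within the bound.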
The previous procedure is repeated for different distributions: for each one, the multiplier is
computed and then tested with a different sample set. Eq. (4) is verified using a scatter
plot and the number of failures is reported and compared to the probability predicted by the
theory. The multipliers are shown in the following table followed by the scatter plots of the
distributions that gave the largest and the smallest multipliers.
Table I. Multipliers of Different Distributions
Distribution              Multiplier
Gaussian (0,1)            7.98
Uniform (-1,1)            13.2
Binomial (N,0.9)          1.02
Poisson                   1.67
Exponential               1.49
Chi-square                1.31
Log-normal                1.50
Beta (0.5, (N-1)/2)       1.65
Beta (1,10)               1.44
From the previous table we conclude that the binomial distribution, $\binom{n}{x} p^{x} (1-p)^{n-x}$, has the
smallest multiplier, which provides a realistic bound.
Figure 1. Gaussian distribution
Figure 2. Uniform distribution
The x-axis in the top right two figures show the calculated bound using the Gaussian PDF and
the actual error, with the red points indicating failure, where the actual error exceeds the
estimated upper-bound. The trend shows that the bound could be much bigger than the actual
error, therefore justifying our current investigation of better distributions. Fig. 3 presents results
similar to Fig. 2, but now calculated using the binomial distribution, which yields more
realistic bounds.
Figure 3. Binomial distribution
CONCLUSIONS
The ROM error bound estimation requires sampling of the actual error using a user-defined PDF.
Several oversamples are drawn from such a PDF to calculate the corresponding exact errors of the
ROM model, and the maximum of these errors is multiplied by a scalar to establish an upper-
bound on the ROM error. Earlier work noticed that the scalar multiplier could result in an
unnecessarily conservative bound, and identified the cause as the use of a Gaussian
PDF to select the oversamples. This work employed numerical experiments to show that one
may use a binomial distribution to calculate a more realistic bound, i.e., one that is closer to the
actual reduction errors.
ACKNOWLEDGEMENTS
The first author would like to acknowledge the support received from the department of nuclear
engineering at North Carolina State University to complete this work in support of his PhD.
REFERENCES
1. Mohammad G. Abdo and Hany S. Abdel-Khalik, Propagation of Error Bounds due to Active
Subspace Reduction, Transactions of American Nuclear Society, Summer 2014.
2. S. S. Wilks, Mathematical statistics, John Wiley, New York, 1st ed. 1962.
3. John D. Dixon, Estimating extremal eigenvalues and condition numbers of matrices, SIAM
1983; 20(2): 812–814.
4. Abdel-Khalik, H., et al., “Overview of Hybrid Subspace Methods for Uncertainty
Quantification and Sensitivity Analysis,” Annals of Nuclear Energy, 52, pp. 28-46 (2013). A
Tutorial on Applications of dimensionality reduction and function approximation.