Surrogate-based design is an effective approach for modeling computationally expensive system behavior. In such application, it is often challenging to characterize the expected accuracy of the surrogate. In addition to global and local error measures, regional error measures can be used to understand and interpret the surrogate accuracy in the regions of interest. This paper develops the Regional Error Estimation of Surrogate (REES) method to quantify the level of the error in any given subspace (or region) of the entire domain, when all the available training points have been invested to build the surrogate. In this approach, the accuracy of the surrogate in each subspace is estimated by modeling the variations of the mean and the maximum error in that subspace with increasing number of training points (in an iterative process). A regression model is used for this purpose. At each iteration, the intermediate surrogate is constructed using a subset of the entire training data, and tested over the remaining points. The evaluated errors at the intermediate test points at each iteration are used for training the regression model that represents the error variation with sample points. The effectiveness of the proposed method is illustrated using standard test problems. To this end, the predicted regional errors of the surrogate constructed using all the training points are compared with the regional errors estimated over a large set of test points.
1. Regional Error Estimation of Surrogates (REES)
Ali Mehmani*, Souma Chowdhury #, Jie Zhang# and Achille Messac*
* Syracuse University, Department of Mechanical and Aerospace Engineering
# Rensselaer Polytechnic Institute, Department of Mechanical, Aerospace, and Nuclear Engineering
14th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference
17 - 19 Sep 2012
Indianapolis, Indiana
2. Surrogate Modeling - Overview
• The use of surrogates (or metamodels) in design space
exploration, sensitivity analysis, and optimization has
become popular to provide a tractable and inexpensive
approximation of the actual system behavior.
• Evaluating the accuracy of a surrogate to ensure a
reasonable accuracy is challenging when using the surrogate
for analysis and optimization.
• Error measurement methods offer a solution to this
challenge by assessing the accuracy of the surrogate
estimation.
2
3. Research Motivation
The knowledge of the local and global accuracy of a
surrogate model can be used for:
• Domain exploration,
• Improvement of the surrogate using direct or sequential
3
sampling*
• Assessing the reliability of the optimal design obtained through
surrogate based optimization, and
• Quantifying the uncertainty (and user confidence) associated
with the surrogate.
*Adaptive Sequential Sampling ** Efficient Global Optimization
4. Research Objective
• To develop an effective methodology for quantifying the level
of surrogate error in any subspaces of the design domain,
which can:
Be applicable to a majority of interpolative/regression surrogate
4
models (model independent); and
Be computationally inexpensive (it uses the available training
data set to predict the errors in different regions without any
additional system evaluations).
5. Presentation Outline
• Review of Surrogate Model Error Measurement Methods
• Regional Error Estimation of Surrogate (REES) Method
• Numerical examples: procedure, results and discussion
• Concluding remarks
5
6. Surrogate Model Error Measurement Methods
Error quantification methods can be classified,
based on their computational expense, into methods that require
additional data, and methods that use existing data.
based on the region of interest, into:
• Global error measure to evaluate the overall accuracy of a
surrogate (e.g., split sample, cross-validation, and bootstrapping).
Unfortunately, these measures may not adequately represent the
regional accuracy of a complex system.
• Local or point-wise error measure to evaluate the accuracy of
the surrogates in different locations. (e.g., the mean squared errors
for Kriging and the linear reference model (LRM).)
These measures are either model dependent or required additional system
evaluation.
6
7. Regional Error Estimation of Surrogate (REES)
7
2nd Surrogate
#{푿푻푹}
푵ퟏ
푵ퟐ
푵ퟑ
푵ퟒ
Regional
Error
푬ퟏ
푬ퟐ
푬ퟑ
푬푷?풓풆
Total number of training points
1st Surrogate
3rd Surrogate
Final Surrogate
1st Int.
Training Pts.
1st Int.
Test Pts.
2nd Int.
Training Pts.
2nd Int.
Test Pts.
3rd Int.
Training Pts.
3rd Int.
Test Pts.
Total number of training points
8. Regional Error Estimation of Surrogate (REES)
8
The REES method estimates the error of the surrogate in each
subspace, by predicting the error variation in that subspace with
increasing number of training points.
Unique features of the REES method include:
It is not limited to a specific kind of surrogate model (model
independent),
It uses the available training data set to predict the error without
any additional system evaluations,
It provides useful information about the accuracy of the
surrogate in different regions (local/global), and
It predicts the accuracy of the final surrogate.
9. 9
Principle of the REES Method
Surrogate
Model
The entire set of sample points
#{X}32
REES method
Surrogate Model
Accuracy Prediction
10. Principle of the REES Method
Training Points
Iteration =1 Test Points
1. Select a subset of the training points to
construct an intermediate surrogate
2. #{푿푻푹
10
풕=ퟏ} = ퟖ, and#{푿푻푬
풕=ퟏ} = ퟐퟒ
3. Segregate the domain into subspaces
4. Estimate the maximum and the mean
errors in each subspace
The entire set of sample points
#{X}32
11. Principle of the REES Method
Iteration =2 Iteration =3
1. Select a subset of the training points to
construct an intermediate surrogate,
2. #{푿풕=ퟐ푻푹
} = ퟏퟔ, and#{푿풕=ퟐ푻푬
} = ퟏퟔ
3. Segregate the domain into subspaces
4. Estimate the maximum and the mean
error in each subspace.
Training Points
Test Points
1. Select a subset of the training points to
construct an intermediate surrogate,
2. #{푿풕=ퟑ푻푹
} = ퟐퟒ, and#{푿풕=ퟑ푻푬
} = ퟖ
3. Segregation the domain into subspaces
4. Estimate the maximum and the mean
error in each subspace.
11
12. 12
Principle of the REES Method
Iteration =4 Training Points
• The errors (EMean and EMax) in each
subspace estimated during the iterative
process are used to build the Variation
of Error with Sample Points (VESP)
regression model.
• The VESP regression model is used to
predict the expected mean and the
maximum errors of the final surrogate in
each subspace.
14. 14
Step 1 – Generation of Training Data
• A set of experimental designs are chosen to cover the entire design
space as uniformly as possible.
• The system (simulation or experiment) is evaluated over selected data
points.
• The total set of sample points is represented by {X}.
In this paper:
• The Optimal Latin Hypercube developed by Audze-Eglais is adopted
to determine the locations of the full set of sample points ({X}).
15. Step 2 – Estimation of the Variation of the Error in Different Regions
• The sample points are
divided into an intermediate
training data set and an
intermediate test data set.
푡 } ⊂ {푋
and in the subsequent iteration
15
푡 } = {푋푇푅
푋푇푅
푡−1} + {푋푡
푇푅퐴
푡 } = {푋} − {푋푇푅
푋푇퐸
푡
where in the first iteration
푡 } ⊂ {푋푇퐸
푋푇푅퐴
푡−1
푋푇푅퐴
16. Step 2 – Estimation of the Variation of the Error in Different Regions
• The intermediate surrogate
at each iteration is
constructed using the
intermediate training points,
and is evaluated over the
intermediate test points.
In this paper:
• The surrogates are developed
16
using Kriging method.
• The Relative Accuracy Error (RAE) is used to evaluate the error at each test point
17. Step 2 – Estimation of the Variation of the Error in Different Regions
17
• The boundary between subspaces in the REES method can be:
- determined using error-based pattern classification methods, or
- predefined by the designer based on the knowledge regarding the design problem.
18. Step 2 – Estimation of the Variation of the Error in Different Regions
• The entire design space is
manually divided into a
specific number of
subspaces.
18
• At each iteration, the mean and maximum errors are estimated in each subspace.
eij is the error of the jth
test point in the ith
subspace at iteration t.
19. Step 2 – Estimation of the Variation of the Error in Different Regions
19
20. 퐸푀푡
푡
퐸푀푎푥
• The errors (3333 and 3333 3) in each subspace
estimated during the iterative process are used to build
the Variation of Error with Sample Point(VESP)
regression model.
20
Step 3 –Error Prediction in Different Regions
• The final surrogate model is constructed using the full
set of training data.
퐹 (푛, 푎) = 푎0푒 푎1푛
• The VESP regression models are used to predict the
mean and the maximum regional errors (퐸푀 and
퐸푀푎푥 ) for the final/actual surrogate.
21. Numerical Examples
• The proposed regional error estimation approach is evaluated using
21
the Dixon & Price and the Branin Function.
• Numerical settings of two problems are given as
Function
No. of
variables
Total No. of
sample points
(nX)
Iteration
(t)
No. of training
points at each
itr. ()
No. of test
points at each
itr. ()
Dixon
& Price
2 32
1 16 16
2 20 12
3 24 8
4 28 4
Branin 2 32
1 16 16
2 20 12
3 24 8
4 28 4
22. REES results
22
Dixon & Price function
• The trend of VESP regression model in different subspaces
• The VESP coefficients in
each subspace
퐹 (푛, 푎) = 푎0푒 푎1푛
Subspace 1 Subspace 2
Subspace 3 Subspace 4
23. REES results
23
Dixon & Price function
• The trend of VESP regression model in different subspaces
Subspace 1 Subspace 2
Subspace 3 Subspace 4
24. Level of Error
24
• The mean and the maximum error levels are specified as
푹풆품풊풐풏풂풍 푬풓풓풐풓
푴풆풂풏 푬풓풓풐풓
Level 1 <50%
Level 2 50-100%
Level 3 100-150%
Level 4 >150%
× ퟏퟎퟎ%
25. REES results
25
Dixon & Price function
• The predicted accuracy of the surrogate in different subspaces
퐸푀푒푎푛 in each subspace 퐸푀푎푥 in each subspace
푷풓풆풅풊풄풕풆풅
푹풆풔풖풍풕풔
• The error levels of the surrogate in different subspaces on 10,000 Test Points
푨풄풕풖풂풍
푹풆풔풖풍풕풔
26. REES results
26
Branin function
• The trend of VESP regression model in different subspaces
• The VESP coefficients in
each subspace
퐹 (푛, 푎) = 푎0푒 푎1푛
Subspace 1 Subspace 2
Subspace 3 Subspace 4
27. REES results
27
Branin function
• The predicted accuracy of the surrogate in different subspaces
퐸푀푒푎푛 in each subspace 퐸푀푎푥 in each subspace
푷풓풆풅풊풄풕풆풅
푹풆풔풖풍풕풔
• The error levels of the surrogate in different subspaces on 10,000 Test Points
푨풄풕풖풂풍
푹풆풔풖풍풕풔
28. Concluding remarks
• We developed the Regional Error Estimation of Surrogate (REES) to
quantify the accuracy of the final surrogate in each subspace of
the design domain.
• The REES method predicts the mean and the maximum errors of the
surrogate in each subspace by modeling the variation of the error in
that subspace with increasing number of training points.
• This method can be applicable ta a majority of surrogate modeling
28
techniques and does not require additional system evaluations.
• The preliminary results indicate that the REES method can predict
the regional accuracy of a surrogate correctly.
29. Acknowledgement
• I would like to acknowledge my research adviser Prof.
Achille Messac, for his immense help and support in this
research.
• I would also like to thank my friends and colleagues Jie
Zhang and Souma Chowdhury for their valuable
contributions to this paper.
29
30. Selected References
1. Zhang, J., Chowdhury, S., and Messac, A., “An Adaptive Hybrid Surrogate Model,” Structural and Multidisciplinary
Optimization, 2012, doi: 10.1007/s00158-012-0764-x.
2. Mehmani, A., Zhang, J., Chowdhury, S., and Messac, A., “Surrogate-based Design Optimization with Adaptive Sequential
Sampling,” 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference, Hawaii, USA,
April 2012.
3. Zhang, J., Chowdhury, S., and Messac, A., “Domain Segmentation based on Uncertainty in the Surrogate (DSUS),” 53rd
AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference, Hawaii, USA, April 2012.
4. Jones, D., Schonlau, M., and Welch, W., “Efficient Global Optimization of Expensive Black-box Functions,” Journal of Global
Optimization, Vol. 13, No. 4, 1998, pp. 455–492.
5. Goel, T., Haftka, R., and Shyy, W., “Ensemble of Surrogates,” Structural and Multidisciplinary Optimization, Vol. 38, 2009,
pp. 429442.
6. Viana, F., Haftka, R., and Steffen, V., “Multiple Surrogates: How Cross-validation Errors Can Help Us to Obtain the Best
Predictor,” Structural and Multidisciplinary Optimization, Vol. 39, No. 4, 2009, pp. 439–457.
30