Robustness metrics: How are they calculated and when should they be used?

C. McPhail, H.R. Maier, J.H. Kwakkel, M. Giuliani, A. Castelletti and S. Westra

How do we plan for an uncertain future?
[Figure: system state plotted from today into the future, with the estimated distribution of future states shown for Scenario #1 and for Scenario #2.]

How do we quantify system performance under uncertainty?
[Figure: system state plotted from today into the future, with the system performance shown for each scenario and "Robustness" spanning those system performance values. The relative likelihood of occurrence of the scenarios is unknown.]

Which metric should we use for calculating robustness?
- Maximin
- Maximax
- Hurwicz optimism-pessimism rule
- Laplace’s principle of insufficient reason
- Minimax regret
- 90th percentile minimax regret
- Mean-variance
- Undesirable deviations
- Percentile-based skewness
- Percentile-based peakedness
- Starr’s domain criterion

Contributions of the research
1. A unified framework for the calculation of a wide range of robustness metrics.
Enabling a comparison of robustness metrics.
2. A taxonomy of robustness metrics.
Providing guidance to decision-makers.

[Figure: the robustness value depends on four components.]
- Performance metric (e.g. cost, reliability)
- Decision alternatives (e.g. policy options, plans, solutions)
- Plausible future conditions (scenarios)
- Robustness metric

[Figure: the unified framework, shown as plots of performance against scenario number (1, 2, 3, 4, …, n) at each stage.]
1. Start with the performance over all scenarios.
2. T1: Performance value transformation. Result: transformed performance values over all scenarios.
3. T2: Scenario subset selection. Result: transformed performance values over selected scenarios (e.g. scenarios 1, 2, 4, …).
4. T3: Robustness metric calculation. Result: the robustness value (e.g. the mean of the distribution of transformed performance in the selected scenarios).
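The three transformations above can be read as a function composition. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the function names, the use of plain lists and the performance values are all made up for illustration, and performance is assumed to be maximised.

```python
from statistics import mean


def robustness(performance, t1, t2, t3):
    """Compose T1, T2 and T3 for one decision alternative.

    performance: performance values f(x_i, s_j), one per scenario.
    t1: transforms the performance values (T1).
    t2: selects the subset of transformed values to use (T2).
    t3: reduces the selected values to one robustness value (T3).
    """
    transformed = t1(performance)  # T1: performance value transformation
    selected = t2(transformed)     # T2: scenario subset selection
    return t3(selected)            # T3: robustness metric calculation


# Hypothetical building blocks (illustrative names, maximisation assumed).
def keep_values(values):
    return list(values)            # identity transformation


def worst_case(values):
    return [min(values)]           # single worst scenario


def all_scenarios(values):
    return list(values)            # every scenario


def single_value(values):
    return values[0]               # identity calculation on one value


# Made-up performance of one decision alternative in five scenarios.
f = [0.9, 0.6, 0.8, 0.4, 0.7]

maximin = robustness(f, keep_values, worst_case, single_value)
laplace = robustness(f, keep_values, all_scenarios, mean)
print(maximin, laplace)
```

Swapping any one of the three arguments gives a different metric, which is the sense in which the framework allows metrics to be compared.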

T1: Performance value transformation
[Figure: performance values plotted against scenario number (1, 2, 3, 4, …, n) are transformed in one of three ways.]
- Performance (identity)
- Cost of making wrong decision (regret)
- Pass/fail (satisficing), judged against a threshold

Indication of whether system performance is satisfactory or not
- (MORE)
- (POMORE)
- (Decision Scaling)
- Starr’s domain criterion
- (Info Gap)

Indication of actual system performance

Robustness calculation based on relative performance values
- Minimax regret
- 90th percentile minimax regret
- Undesirable deviations

Robustness calculation based on absolute performance values
- Maximin (minimax)
- Maximax
- Hurwicz’s optimism-pessimism rule
- Laplace’s principle of insufficient reason
- Mean-variance
- Percentile-based skewness
- Percentile-based peakedness

T2: Scenario subset selection
[Figure: from the transformed performance values plotted against scenario number (1, 2, 3, 4, …, n), one subset of scenarios is selected. The options are ordered from more risk averse to less risk averse.]
- Worst-case scenario
- Worst half of scenarios
- Best-case scenario
T3: Robustness metric calculation
[Figure: the transformed performance values in the selected scenarios are summarised by one statistic of their distribution.]
- Expected value (e.g. mean)
- Variance
- Skew

Robustness metric: robustness metric calculation (none, sum, mean, weighted mean, variance, skew or kurtosis)
- Maximin: none
- Maximax: none
- Hurwicz optimism-pessimism rule: weighted mean
- Laplace’s principle of insufficient reason: mean
- Minimax regret: none
- 90th percentile minimax regret: none
- Mean-variance: mean and variance
- Undesirable deviations: sum
- Percentile-based skewness: skew
- Percentile-based peakedness: kurtosis
- Starr’s domain criterion: mean

Metric | T1: Performance value transformation | T2: Scenario subset selection | T3: Robustness metric calculation
Maximin | Identity | Worst-case | Identity
Maximax | Identity | Best-case | Identity
Hurwicz optimism-pessimism rule | Identity | Worst- and best-cases | Weighted mean
Laplace’s principle of insufficient reason | Identity | All | Mean
Minimax regret | Regret from best decision alternative | Worst-case | Identity
90th percentile minimax regret | Regret from best decision alternative | 90th percentile | Identity
Mean-variance | Identity | All | Mean-variance
Undesirable deviations | Regret from median performance | Worst-half | Sum
Percentile-based skewness | Identity | 10th, 50th and 90th percentiles | Skew
Percentile-based peakedness | Identity | 10th, 25th, 75th and 90th percentiles | Kurtosis
Starr’s domain criterion | Satisfaction of constraints | All | Mean


Future work
A conceptual framework for understanding when robustness metrics agree or
disagree.
Paper under revision
C. McPhail, H.R. Maier, J.H. Kwakkel, M. Giuliani, A. Castelletti and S. Westra
(under revision), Robustness metrics: How are they calculated, when should
they be used and why do they give different results?, Earth’s Future.
Contact
Cameron McPhail
University of Adelaide
cameron.mcphail@adelaide.edu.au

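Description: Equation

Identity transformation:
$$f'(x_i, s_j) = f(x_i, s_j)$$

Regret from best decision alternative:
$$f'(x_i, s_j) = \begin{cases} \max_{x} f(x, s_j) - f(x_i, s_j), & \text{maximisation} \\ f(x_i, s_j) - \min_{x} f(x, s_j), & \text{minimisation} \end{cases}$$

Regret from median:
$$f'(x_i, s_j) = \begin{cases} q_{50} - f(x_i, s_j), & \text{maximisation} \\ f(x_i, s_j) - q_{50}, & \text{minimisation} \end{cases}$$
where $q_{50}$ is the median performance for decision alternative $x_i$, i.e. $P\left(f(x_i, S) \le q_{50}\right) = \frac{1}{2}$

Satisfaction of constraints:
$$f'(x_i, s_j) = \begin{cases} 1 & \text{if } f(x_i, s_j) \ge c \\ 0 & \text{if } f(x_i, s_j) < c \end{cases}, \quad \text{maximisation}$$
$$f'(x_i, s_j) = \begin{cases} 1 & \text{if } f(x_i, s_j) \le c \\ 0 & \text{if } f(x_i, s_j) > c \end{cases}, \quad \text{minimisation}$$
where $c$ is a constraint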
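The four performance value transformations can be sketched in Python as follows. This is an illustrative sketch only: the function names, the `maximise` flag and the layout of `f` (one row per decision alternative, one column per scenario) are assumptions, and the numbers are made up.

```python
from statistics import median


def identity_transform(f, i):
    """f'(x_i, s_j) = f(x_i, s_j)."""
    return list(f[i])


def regret_from_best(f, i, maximise=True):
    """Regret of alternative i relative to the best alternative in each scenario."""
    n_scenarios = len(f[i])
    if maximise:
        return [max(row[j] for row in f) - f[i][j] for j in range(n_scenarios)]
    return [f[i][j] - min(row[j] for row in f) for j in range(n_scenarios)]


def regret_from_median(f, i, maximise=True):
    """Deviation of alternative i from its own median performance q50."""
    q50 = median(f[i])
    return [(q50 - v) if maximise else (v - q50) for v in f[i]]


def satisfies_constraint(f, i, c, maximise=True):
    """1 where the constraint c is met, 0 where it is not."""
    return [int(v >= c) if maximise else int(v <= c) for v in f[i]]


# Made-up performance: 2 decision alternatives (rows) x 3 scenarios (columns).
f = [[5, 3, 8],
     [6, 4, 2]]

print(regret_from_best(f, 0))         # [1, 1, 0]
print(regret_from_median(f, 0))       # [0, 2, -3]
print(satisfies_constraint(f, 0, 4))  # [1, 0, 1]
```

Note that regret from the best decision alternative needs the performance of every alternative in a scenario, whereas the other three transformations only need the alternative being evaluated.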

Worst-case:
$$S' = \begin{cases} \arg\min_{s} f'(x_i, s), & \text{maximisation} \\ \arg\max_{s} f'(x_i, s), & \text{minimisation} \end{cases}$$

Best-case:
$$S' = \begin{cases} \arg\max_{s} f'(x_i, s), & \text{maximisation} \\ \arg\min_{s} f'(x_i, s), & \text{minimisation} \end{cases}$$

Worst- and best-cases:
$$S' = \left\{ \arg\max_{s} f'(x_i, s),\ \arg\min_{s} f'(x_i, s) \right\}$$

All:
$$S' = S$$

Worst-half:
$$S' = \begin{cases} \{ s \in S : f'(x_i, s) \le q_{50} \}, & \text{maximisation} \\ \{ s \in S : f'(x_i, s) \ge q_{50} \}, & \text{minimisation} \end{cases}$$
where $q_{50}$ is the 50th percentile (median) value of $f'(x_i, S)$

Percentile:
$$S' = \{ s \in S : f'(x_i, s) = q_k \}$$
where $q_k$ is the kth percentile value of $f'(x_i, S)$
Note that the scenario $s$ that produces the value of $f'(x_i, s)$ closest to $q_k$ is the scenario that is used.
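A minimal Python sketch of these scenario subset selections is given below. The function names and the `maximise` flag are illustrative assumptions, each function returns the selected transformed values rather than scenario labels, and the percentile estimator (linear interpolation between sorted values) is an assumption because the slides do not specify one.

```python
from statistics import median


def worst_case(values, maximise=True):
    """Value in the single worst scenario."""
    return [min(values)] if maximise else [max(values)]


def best_case(values, maximise=True):
    """Value in the single best scenario."""
    return [max(values)] if maximise else [min(values)]


def worst_and_best(values):
    """Values in the two extreme scenarios."""
    return [max(values), min(values)]


def worst_half(values, maximise=True):
    """Values on the worse side of the median q50 (inclusive)."""
    q50 = median(values)
    if maximise:
        return [v for v in values if v <= q50]
    return [v for v in values if v >= q50]


def percentile_scenario(values, k):
    """Value of the scenario closest to the kth percentile q_k.

    q_k is estimated by linear interpolation between the sorted values,
    which is an assumption of this sketch.
    """
    ordered = sorted(values)
    position = (len(ordered) - 1) * k / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    q_k = ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)
    return [min(values, key=lambda v: abs(v - q_k))]


# Made-up transformed performance values in five scenarios.
values = [4, 9, 1, 7, 6]

print(worst_case(values))               # [1]
print(worst_half(values))               # [4, 1, 6]
print(percentile_scenario(values, 90))  # [9]
```

For regret values, which are to be minimised, the worst case is the largest value, so `maximise=False` would be passed.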

Identity transformation:
$$R(x_i, S) = f'(x_i, S')$$

Mean:
$$R(x_i, S) = \frac{1}{n'} \sum_{j=1}^{n'} f'(x_i, s_j)$$
where $n'$ is the number of scenarios in $S'$

Sum:
$$R(x_i, S) = \sum_{j=1}^{n'} f'(x_i, s_j)$$

Weighted mean (two scenarios):
$$R(x_i, S) = \alpha f'(x_i, s_a) + (1 - \alpha) f'(x_i, s_b)$$
where $s_a$ and $s_b$ are two scenarios, $\alpha$ is the preference of the decision maker towards using $s_a$, and $0 < \alpha < 1$
(Also see next slide…)
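The simpler robustness metric calculations can be sketched in Python as below. The function names and the example values are hypothetical. Taking the best case as $s_a$ and the worst case as $s_b$ in the Hurwicz example is an assumption of this sketch; the slides only state that $\alpha$ expresses the preference towards $s_a$.

```python
def single_value(selected):
    """Identity: return the one value in a single-scenario subset."""
    (value,) = selected
    return value


def mean(selected):
    """Arithmetic mean over the n' selected scenarios."""
    return sum(selected) / len(selected)


def total(selected):
    """Sum over the n' selected scenarios."""
    return sum(selected)


def weighted_mean(value_a, value_b, alpha):
    """alpha * f'(x_i, s_a) + (1 - alpha) * f'(x_i, s_b), with 0 < alpha < 1."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha * value_a + (1 - alpha) * value_b


# Made-up transformed performance values in five scenarios (maximisation).
values = [4, 9, 1, 7, 6]

maximin = single_value([min(values)])
laplace = mean(values)
# Assumption: s_a is the best case and s_b is the worst case.
hurwicz = weighted_mean(max(values), min(values), alpha=0.5)
print(maximin, laplace, hurwicz)
```

With `alpha` close to 1 the Hurwicz value approaches maximax, and with `alpha` close to 0 it approaches maximin.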

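Variance-based (i.e. the standard deviation):
$$\sigma = \sqrt{\frac{1}{n' - 1} \sum_{j=1}^{n'} \left( f'(x_i, s_j) - \mu \right)^2}$$
where $\mu$ is the mean (see the equation earlier in this table)

Mean-variance:
$$R(x_i, S) = \begin{cases} \dfrac{\mu + 1}{\sigma + 1}, & \text{maximisation} \\ -(\mu + 1)(\sigma + 1), & \text{minimisation} \end{cases}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation (given by equations above)

Skew:
$$R(x_i, S) = \begin{cases} \dfrac{\left( f'(x_i, s_{90}) + f'(x_i, s_{10}) \right)/2 - f'(x_i, s_{50})}{\left( f'(x_i, s_{90}) - f'(x_i, s_{10}) \right)/2}, & \text{maximisation} \\ -\dfrac{\left( f'(x_i, s_{90}) + f'(x_i, s_{10}) \right)/2 - f'(x_i, s_{50})}{\left( f'(x_i, s_{90}) - f'(x_i, s_{10}) \right)/2}, & \text{minimisation} \end{cases}$$
where $s_{10}$, $s_{50}$ and $s_{90}$ are scenarios that represent the 10th, 50th and 90th percentiles for $f'(x_i, S)$

Kurtosis:
$$R(x_i, S) = \frac{f'(x_i, s_{90}) - f'(x_i, s_{10})}{f'(x_i, s_{75}) - f'(x_i, s_{25})}$$
where $s_{10}$, $s_{25}$, $s_{75}$ and $s_{90}$ are scenarios that represent the 10th, 25th, 75th and 90th percentiles for $f'(x_i, S)$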
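The higher-order calculations can be sketched in Python as follows. The function names are hypothetical and the inputs are made up. The sketch assumes that the maximisation form of mean-variance is the ratio (μ + 1)/(σ + 1) and that σ is the sample standard deviation with an n′ − 1 denominator; the percentile functions take the already selected percentile values as arguments.

```python
from statistics import mean, stdev


def mean_variance(selected, maximise=True):
    """Combine mean and standard deviation into one robustness value.

    Assumption: the maximisation form is (mu + 1) / (sigma + 1).
    statistics.stdev uses the n' - 1 denominator.
    """
    mu, sigma = mean(selected), stdev(selected)
    if maximise:
        return (mu + 1) / (sigma + 1)
    return -(mu + 1) * (sigma + 1)


def percentile_skew(q10, q50, q90, maximise=True):
    """Percentile-based skewness from the 10th, 50th and 90th percentile values."""
    skew = ((q90 + q10) / 2 - q50) / ((q90 - q10) / 2)
    return skew if maximise else -skew


def percentile_kurtosis(q10, q25, q75, q90):
    """Percentile-based peakedness from the 10th, 25th, 75th and 90th percentile values."""
    return (q90 - q10) / (q75 - q25)


print(mean_variance([1, 2, 3]))                  # 1.5
print(mean_variance([1, 2, 3], maximise=False))  # -6.0
print(percentile_skew(2, 4, 10))                 # 0.5
print(percentile_kurtosis(2, 3, 7, 10))          # 2.0
```

Both percentile functions divide by a spread between percentiles, so they are undefined when the selected percentile values coincide.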
