White.p.johnson.k

Dr. K. Preston White, University of Virginia
Kenneth L. Johnson, NASA/ NESC, LaRC

Prob Req Verification for PM Challenge 2009 ver. 1/16/2009

Overview

Motivation – why care?
Probabilistic requirements
SR&QA vs. engineering performance based
How to write them
Decision matrix: consumer’s vs. producer’s risk
The “simplest” verification case: pass/fail
Number of simulation trials needed to verify
Reliability Test Planner (RTP)
Tweaking the verification sampling plan
More sophisticated verifications

Motivation

CxCEF (Constellation Chief Engineer’s Forum)
asked for expert input on how to clarify and
standardize the process of writing probabilistic
requirements
Address previous Review Item Discrepancies (RIDs) and avoid
new RIDs on this topic
Verify the hardware/ process/ whatever using modeling
and simulation (M&S)
Make sure you don’t have a problem
The hardware will perform as required
See how close an intermediate product is to verification

Constellation
Program (Cx)
NASA’s next manned space program,
scheduled to make its first flights
early in the next decade.

Ares I and Ares V rockets

Orion spacecraft

The vision is to send human explorers
back to the moon and then onward to
Mars and other destinations in the solar
system . . .
. . . SAFELY

Deterministic Engineering
Design
Traditional approach
Ignore uncertainties initially
Use deterministic values to stand in for uncertain
model parameters
Apply factors of safety and similar constructs
to outputs to account for uncertainty


Probabilistic Engineering
Design
Rapidly gaining acceptance
Confront uncertainties directly—use estimated
probability distributions for model parameters
Estimate probability distributions for outputs from
test and/ or via Monte Carlo sim and/ or other
methods
Capture, describe and leverage uncertainty to help design
robust, reliable systems
Advantages
Better understanding of impacts of uncertainties (lower
risks)
Design “closer to the margin” (lower costs)


Monte Carlo Approach
[Figure: Random inputs with known probability distributions, plus fixed parameters and controlled inputs with known values, feed the Simulation Model. Observations are drawn from the input distributions and the model calculates observations, giving sample distributions of model outputs (output distributions) and supporting sensitivity analysis.]


Staggering range of applications: computational mathematics, science, social
science, economics and finance, computer science, all branches of engineering
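
The Monte Carlo approach described above can be sketched in a few lines. This is a minimal illustration only: the model `simulate_margin`, its input distributions, and every numeric value are hypothetical stand-ins, not taken from any Constellation analysis.

```python
import random

def simulate_margin(rng):
    """One trial of a hypothetical model: margin = capability - demand - overhead."""
    capability = rng.gauss(100.0, 5.0)        # random input, assumed normal
    demand = rng.lognormvariate(4.0, 0.1)     # random input, assumed lognormal
    overhead = 25.0                           # fixed parameter with a known value
    return capability - demand - overhead

def run_monte_carlo(n_trials, seed=2009):
    """Draw inputs and calculate one output observation per trial."""
    rng = random.Random(seed)                 # seeded so the run is repeatable
    return [simulate_margin(rng) for _ in range(n_trials)]

margins = run_monte_carlo(10_000)
failures = sum(1 for m in margins if m < 0.0)  # a trial fails if margin is negative
print(f"trials: {len(margins)}, failures: {failures}, "
      f"observed failure fraction: {failures / len(margins):.4f}")
```

The list `margins` is the sample distribution of the model output; counting the trials that miss the threshold turns it into the pass/fail data used in the verification discussion.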


Writing Probabilistic
Requirements
Requirements which involve M&S run under
uncertainty need particular treatment
Wording needs to be clear
Make sure to address the “goodness” of the M&S as well as whether
the output specifically meets a number
Need to address uncertainty inherent in M&S along with
uncertainty within the model and assumptions themselves
Recommendations to CxCEF by Probabilistic
Requirements Verification Team
JSC document EA4‐07‐005 dated 5/14/2007 with
attachments
This is a probabilistic technology (PT)

M&S Design Aids and Checks

Part of a well‐written requirement
Goals include
Make sure simulation model isn’t designed in a vacuum
Make sure the simulation model is appropriate to the
questions at hand
Make sure uncertainties are correctly and fully
addressed
Peer review
Methods (neither exhaustive nor mutually exclusive)
Six Steps (Suren Singhal, MSFC; attachment to EA4‐07‐005)
NASA‐STD‐7009


Two Major Types of
Probabilistic Requirements
Safety, Reliability and Quality Assurance
(SR&QA)‐type, aka probabilistic risk assessment
(PRA)‐type
Failure generally results in loss of crew and/ or loss of
mission (LOC/LOM)
Engineering performance‐based (aka physics‐
based)
Next block in failure scenario is generally non‐
catastrophic
Project has ultimate decision of which type
Talk to a statistician, PRA expert and/or requirements
expert if not clear which to apply

SR&QA Type Requirements

Requirement: [CAxxxx‐PO] The XXX system shall limit its
contribution to the risk of loss of crew (LOC) for a Xsssss
mission to no greater than 1 in 200 (TBR‐xxx‐xxx).
Rationale: The 1 in 200 (TBR‐xxx‐xxx) means a .005 (or .5%)
probability of LOC due to the XXX during any Xsssss
mission. The baseline numbers were derived from a
preliminary PRA within NASA‐TM‐2005‐214062, NASA's
Exploration Systems Architecture Study. This requirement
is driven by CxP 70003‐XXXXxx, Constellation Program
Plan, Annex 1: Need, Goals, and Objectives (NGO), Safety
Goal CxP‐Xxx: Provide a substantial increase in safety, crew
survival and reliability of the overall system over legacy
systems.


SR&QA Type Verification
Statement
[CA0501V‐PO] Xsssss LOC due to XXX shall be verified by
analysis. The simulation tools and analysis methodology,
and the assumed non‐ideal model behavior and design data
which is used in the analysis shall be developed and peer
reviewed to ensure the potential causes for off‐nominal
behavior are adequately identified and their probabilities
properly quantified [ see note 1]. The requirement shall be
considered satisfied when analysis results show there is at
most a 0.5% (TBR‐xxx‐xxx) probability of LOC with the
probability taken as a mean probability [ see note 2 ].
Notes: see backup


Physics‐Based Probabilistic
Requirements
[CAyyyy‐PO] The Vehicle shall perform Ysssss action under
Yrrrrrr conditions
Rationale: Establishes the Vehicle as the launch vehicle to
perform Yssss action with sufficient remaining propellant to
execute further necessary actions. The architecture design
solution of using the Ynnnnn approach was a result of
NASA‐TM‐yyyyyyyy.


Physics‐Based Verification
Statement
[CAxxxxV‐PO] The performance of the Vehicle to perform Ysssss
action under Yrrrrrr conditions shall be verified by analysis. The
simulation tools and analysis methodology, and the assumed
non‐ideal model behavior and design data which is used in the analysis
shall be developed and peer reviewed to ensure the potential causes
for off‐nominal behavior are adequately identified and their
probabilities properly quantified [ see note 1 ]. The requirement
shall be considered satisfied when analysis results show there is at
least a 99.73% (TBR‐yyy‐yyy) probability of successfully achieving
success criteria with a “consumer’s risk” of 10% [ see note 2 ].


The Notes

Note 1: See “6‐Step Process” for one satisfactory approach.
Note that the "peer review" may generally be performed by
the SIG's, Panels, and Engineering Review Boards
responsible for the engineering effort related to the
requirements being verified.
Note 2: “Consumer’s risk” is defined in the glossary. A 10%
maximum is suggested, and consumer’s risk is specified
because of the criticality of meeting this constraint to
mission success. The term “β‐confidence” could be used if
preferred where the value specified would then be 90%.


How To

Break down components of requirement
Desired performance
Desired probability/proportion of achieving
desired performance
Acceptable risk
Sampling error
Consumer’s (β, Type II) and producer’s (α, Type I)
risks


Decision Matrix
Verification Procedure Determines that the Design:

The Actual Design:  | Meets the Standard              | Fails the Standard
Meets the Standard  | Correct Determination           | Producer’s Risk, Type I Error
                    | (probability 1-α)               | (probability α)
Fails the Standard  | Consumer’s Risk, Type II Error  | Correct Determination
                    | (probability β)                 | (probability 1-β)


The Courtroom Analogy

Threshold for a criminal trial
Assume innocence/ prove beyond a reasonable doubt
Focuses on α risk: want to make sure that given you
found evidence of wrongdoing, you really are sure of
the evidence
Type I error: wrongful conviction
Type II error: letting a guilty person go free
American courts try not to convict based on finding a
possibility that the defendant is guilty
(Don’t look too closely: the analogy isn’t quite right; see
web.bsu.edu/cob/econ/research/papers/bsuecwp200601liu.pdf)


Bare‐Bones Physics‐Based
Verification Statement

The system will attain the success threshold
99.73% of the time with a “consumer’s risk”
of 10%.
99.73% is a coverage probability, aka a
percentile of a distribution, aka a reliability of
the system
Means you expect to be ρ = 99.73% reliable, and
can deal with failure 27 times out of 10,000
Generally flowed down from parent requirements


Bare‐Bones Physics‐Based
Verification Statement

The system will attain the success threshold 99.73%
of the time with a “consumer’s risk” of β = 10%.
10% is an expression of consumer’s risk
Means the Program expects to be ρ = 99.73% reliable, but
if the system is actually ρ = 99.729% reliable (just below the
requirement), the Program can deal with accepting that
condition β = ~10% of the time
Suggest this is a programmatic decision, but probably needs
thoughtful input from designers and systems analysts
Often depends on economics
β risk is for figuring out whether you’ve met the
requirement and is not necessarily used for flowing down
requirements to the next level


Proof of Verification
How do I prove I have met my requirement
based on coverage and consumer’s risk?


The Pass/Fail Case
(Binomial)
A typical case:
n simulation trials are run
The number of trials in which the simulated result
failed requirement is counted
How do we determine n?
How many of the n trials can exceed the
requirement before we know we’ve failed?
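
The two questions above can be written down as one formula. The sketch below assumes independent trials with a constant true reliability ρ; the symbol P_a (probability of acceptance) is notation introduced here for illustration, while n, c, ρ, and β are the quantities used in the slides.

```latex
% Probability that an (n, c) plan verifies a system with true reliability \rho:
P_a(\rho; n, c) = \sum_{k=0}^{c} \binom{n}{k} (1-\rho)^{k} \rho^{\,n-k}

% Consumer's risk condition at the required reliability \rho_{req}:
P_a(\rho_{req}; n, c) \le \beta

% Zero-failure special case (c = 0):
\rho_{req}^{\,n} \le \beta
\quad\Longrightarrow\quad
n \ge \frac{\ln \beta}{\ln \rho_{req}}
```

With c = 0, a single failed trial rejects the plan, so n follows directly from ρ and β.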


Number of Trials Required

General idea
Want to run enough trials to be able to claim, “We’ve
looked exhaustively”
“Exhaustively” is determined by the Project:
consumer’s risk (the probability we say it’s good when
it’s actually bad)
This is a well‐characterized statistical case
Acceptance sampling
Addressed in MIL‐HDBK‐781
Watch it: many QC acceptance standards focus on
producer’s risk (ANSI/ASQ Z1.4)


Acceptance Sampling
Qualities of acceptance sampling as a best
practice
Strengths
• Accepted national and international standard
• Easy to implement—good software and “cookbook”
to apply
• Sampling plan can be determined a priori before any
simulation runs are made
Weakness
• For high reliability (large ρ ) with low risk (small β)
requires thousands of runs
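
The weakness noted above is easy to see numerically. The sketch below assumes the zero-failure binomial plan (accept only if no trial fails); the function name `zero_failure_trials` and the grid of reliabilities and risks are illustrative choices, not values from the presentation except for ρ = 0.9973 and β = 0.10.

```python
import math

def zero_failure_trials(reliability, consumer_risk):
    """Smallest n such that reliability**n <= consumer_risk (accept on 0 failures)."""
    return math.ceil(math.log(consumer_risk) / math.log(reliability))

risks = (0.10, 0.05, 0.01)
print("reliability  " + "  ".join(f"beta={b:.2f}" for b in risks))
for reliability in (0.99, 0.9973, 0.999, 0.9999):
    counts = [zero_failure_trials(reliability, b) for b in risks]
    print(f"{reliability:<11}  " + "  ".join(f"{n:>9}" for n in counts))
```

Each added "nine" of reliability multiplies the required trial count by roughly ten, which is why high-reliability, low-risk requirements run into thousands of simulation runs.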

Why Do Math?
An easy‐to‐run (pretty much) sample size
calculator is available for free online
Gary Pryor for the US Army TRADOC, Ft. Leonard
Wood
Handles many cases accurately
Reliability Test Planner (RTP)
http://www.wood.army.mil/msbl/Reliability%20Test%20Planner/Reliability_Test_Planner.htm


Steps
1. Start up the program, then click the “Solver” tab
2. Select the “Binomial Dist” radio button
3. Choose whether you want to find a first‐cut plan by
specifying number of trials or number of “acceptable”
failures
4. Enter desired reliability (coverage) in the “reliability” box
5. Enter one minus the consumer’s risk in the “confidence”
box (this is β confidence or power)
6. Enter either desired number of trials or acceptable failures
7. Press “Solve”
8. Read out the (n,c) sampling plan and its (within rounding)
exact reliability ρ and β confidence (power) in the window


Determining n: RTP Solver

Sampling plan indicated: (852,0)
RTP doesn’t work for high reliabilities and other large
numbers
See a statistician if you need more
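
As a cross-check on the (852,0) plan, the consumer’s risk it achieves can be computed directly. This is a sketch under the assumption of independent pass/fail trials; it is not a description of how RTP computes its result, and the variable names are illustrative.

```python
required_reliability = 0.9973   # rho from the requirement
n_trials = 852                  # plan length reported by RTP
target_risk = 0.10              # specified consumer's risk (beta)

# With zero allowed failures, a system that is exactly at the required
# reliability passes all n trials with probability rho**n.
achieved_beta = required_reliability ** n_trials
beta_one_fewer = required_reliability ** (n_trials - 1)

print(f"achieved consumer's risk with {n_trials} trials: {achieved_beta:.6f}")
print(f"with {n_trials - 1} trials: {beta_one_fewer:.6f}")
print("852 is the smallest zero-failure plan:",
      achieved_beta <= target_risk < beta_one_fewer)
```

The achieved risk comes out just under 10%, close to the Consumers Risk (Beta) value shown in the RTP detail output, and dropping a single trial pushes it just over 10%.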

Now, Press “Implement”
This creates the “power
curve” for your sampling
plan
Also known as an
operating characteristic
(OC) curve
X axis: true system
reliability, an unknown
value
Y axis: probability that,
given a true system
reliability of X and your
sampling plan, the system
will be accepted


Power Curve Shows Your Plan
This plan says to run 852
sim trials and call the
requirement verified if
there are zero failures
A single failure means the
system is not verified to the
requirement
Given a true reliability of
99.73%, the power curve
shows a 10% probability
that you will verify (accept)
using this sampling plan
[Figure annotation: 10% consumer’s risk, 99.73% reliability]
This is the way we specified
the plan


Power Curve Details Your
Plan
Given a true reliability
of ~99.65%, we have a
~5% chance we’ll verify
This should be rejected
Can you live with
accepting verification 5%
of the time?
[Figure annotation: 5% consumer’s risk, 99.65% reliability]


Power Curve Details Your
Plan
Given a true reliability
of ~99.92%, we have a
~50% chance we’ll
verify
. . . And a ~50% chance
we’ll reject
[Figure annotation: 50% consumer’s risk, 99.92% reliability]
This should be accepted
50% of the time, a
perfectly good system
will be rejected using
this plan
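
The three points read off the power curve above can be reproduced for the (852,0) plan. This is a minimal sketch assuming independent trials and zero allowed failures; the function names are hypothetical.

```python
def acceptance_probability_zero_failures(n, true_reliability):
    """Probability that all n trials pass, so the (n, 0) plan verifies."""
    return true_reliability ** n

def reliability_at_acceptance(n, accept_probability):
    """True reliability at which the (n, 0) plan verifies with the given probability."""
    return accept_probability ** (1.0 / n)

N_TRIALS = 852

# Forward direction: true reliability (x axis) -> probability of acceptance (y axis)
for true_reliability in (0.9965, 0.9973, 0.9992):
    p_accept = acceptance_probability_zero_failures(N_TRIALS, true_reliability)
    print(f"true reliability {true_reliability:.4f} -> accept with probability {p_accept:.3f}")

# Reverse direction: probability of acceptance -> true reliability
for target in (0.05, 0.10, 0.50):
    rho = reliability_at_acceptance(N_TRIALS, target)
    print(f"accept with probability {target:.2f} -> true reliability {rho:.5f}")
```

Sweeping `true_reliability` over a fine grid instead of three values traces the full operating characteristic curve.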


Can the Plan Be Improved?
Definition of improvement:
discrimination
Want to be able to
discriminate between good
and bad systems as perfectly
as possible
Graphically: want the
power curve to be vertical
All systems <99.73% reliable
are rejected 100% of the time
All systems >99.73% reliable
are accepted 100% of the
time
[Figure: power curve of sampling plan with perfect discrimination]


Can I Get Closer?
Goal: steeper curve,
still going through the
point (0.9973, 0.10)
[Figure annotation: Bring this point . . . back this way . . .]
One way: brute force
More trials (aka more
samples)
Conceptually: specify a
second point you want
your curve to go
through


Can I Get Closer?
RTP will generate related
plans with constant
consumer’s risk
Press “Add Plans” in menu
bar
Select “Constant Consumer
Risk” and type in the desired
β risk
Type in the number of plans
you want to examine
“Above” and “Below” the
current curve and the
increment of number of
failures between each plan
Press “OK” to get the
sampling plans and curves


More Plans
Adding a lot more trials
and accepting more
failures gets us closer
to a vertical power
curve
Best plan here is the
(12114, 25) plan
You can get better by
running more trials, but
clearly, there are
diminishing returns
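
The trade described above, more trials and more allowed failures at constant consumer’s risk, can be sketched with an exact binomial calculation. Assumptions: independent trials; `ASSUMED_GOOD` is a hypothetical "good" reliability chosen only to illustrate producer’s risk and is not the value RTP uses. Plan lengths computed this way may differ by a few trials from RTP’s listed lengths, depending on the approximation the tool applies.

```python
def acceptance_probability(n, c, reliability):
    """P(at most c failures in n independent pass/fail trials)."""
    p_fail = 1.0 - reliability
    term = reliability ** n                 # probability of exactly 0 failures
    total = term
    for k in range(c):                      # step from k failures to k + 1 failures
        term *= (n - k) / (k + 1) * p_fail / reliability
        total += term
    return total

def smallest_n(c, reliability, consumer_risk):
    """Smallest n for which the (n, c) plan holds the consumer's risk."""
    lo, hi = c + 1, c + 2
    while acceptance_probability(hi, c, reliability) > consumer_risk:
        lo, hi = hi, 2 * hi                 # grow until the risk target is met
    while lo < hi:                          # acceptance probability falls as n grows
        mid = (lo + hi) // 2
        if acceptance_probability(mid, c, reliability) <= consumer_risk:
            hi = mid
        else:
            lo = mid + 1
    return lo

REQUIRED = 0.9973       # reliability from the requirement
BETA = 0.10             # consumer's risk
ASSUMED_GOOD = 0.9990   # hypothetical true reliability of a "good" system

plans = []
for c in (0, 5, 10, 15, 20, 25):
    n = smallest_n(c, REQUIRED, BETA)
    alpha = 1.0 - acceptance_probability(n, c, ASSUMED_GOOD)
    plans.append((n, c, alpha))
    print(f"plan ({n}, {c}): producer's risk at {ASSUMED_GOOD} is {alpha:.3f}")
```

Every plan holds β at 10% for a system at the required reliability; what the extra trials buy is a lower chance of rejecting a system that is actually good.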



Plan comparisons
10 sampling plans with fixed ρ=0.995 and β=0.05

[Figure: Number of Trials (n), left axis from 0 to 3500, and Producer's Risk (α), right axis from 0 to 0.5, plotted against Acceptance Number (c) from 0 to 9]


Test Plan Description
To get a better test plan
description, select the
“Detail” tab and press
the square button on the
tab
Note that the “length”
column contains decimal
trials: should be rounded
up
E.g. Plan no. 7 should be
(12115, 25)

RTP detail output:
Plan Description: Original Fixed Length Test Plan
Plan Type: Fixed Length
Source: Original
Distribution: Binomial
Lower Test Value: 0.997
Consumers Risk (Beta): 9.990989E‐02
Upper Test Value: 0.999
Producers Risk (Alpha): 0.6836846
Primary Plan Length: 852 trials & 0 failures

Plan #   Length     Fail   Beta   Alpha
1        852.0      0      0.1    0.684
2        850.3      0      0.1    0.683
3        3,435.0    5      0.1    0.321
4        5,702.3    10     0.1    0.155
5        7,881.4    15     0.1    0.075
6        10,012.3   20     0.1    0.035
7        12,114.8   25     0.1    0.017


What If I Can’t Do That
Many?
It may not be possible to run the number of
trials required for verification
Each sim trial takes a very long time
Resources not available to run a lot of trials
Team doesn’t want to run that many trials


Here’s What You Can Do
Get a faster computer
Use a simpler simulation model
Go with the number of trials you can run
Examine risk with RTP or other correct statistical calculation
ACCEPT THAT RISK
May need to write a waiver
Use acceptance sampling by variables technique
Use a technique that searches the response space efficiently
Response surface methodology?
Probabilistic methods?
Calculate an answer
Be sure calculations are correct or conservative
“Worst‐on‐worst”?
Make sure it really is worst‐on‐worst
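
For the "go with the number of trials you can run" option above, the risk being accepted can be quantified before asking for a waiver. The sketch assumes a zero-failure binomial plan; the budget of 300 trials and the function names are hypothetical.

```python
def consumer_risk_zero_failures(n, required_reliability):
    """Chance that a system exactly at the required reliability passes all n trials."""
    return required_reliability ** n

def demonstrable_reliability(n, consumer_risk):
    """Reliability that n failure-free trials demonstrate at the given consumer's risk."""
    return consumer_risk ** (1.0 / n)

AFFORDABLE_TRIALS = 300   # hypothetical budget, well short of the 852 needed

risk = consumer_risk_zero_failures(AFFORDABLE_TRIALS, 0.9973)
shown = demonstrable_reliability(AFFORDABLE_TRIALS, 0.10)
print(f"{AFFORDABLE_TRIALS} failure-free trials leave a consumer's risk of {risk:.1%} "
      f"against the 99.73% requirement")
print(f"at 10% consumer's risk they demonstrate only {shown:.2%} reliability")
```

Either number, the inflated risk at the required reliability or the reduced reliability at the required risk, states what the shortened plan does and does not show.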


But . . . I Know I’m OK!
“But I know I’m OK because all my trials’ output values are a
long way from my requirement limit!”
This is not the binomial (pass/fail) case, but it may be possible to
deal with it another way
NOT PASS/FAIL
Continuous variable output characterization generally
requires far fewer samples than binomial case
Sometimes a small fraction of the number of trials needed for
pass/fail
Depends on distribution
You still need to verify statistically or somehow convince
the verification panel using engineering judgment that risk
is sufficiently low


Variables Acceptance Plans
Basic idea: compare
the mean of your sim’s
output distribution to
the requirement limit
Add a factor which
accounts for sampling
error
[Figure: Gamma (3-Parameter) Distribution density of the sim output, legend “Shape,Scale,Threshold 400,16,10000”, plotted against Required Characteristic from 10010 to 10035, with the requirement limit marked]


However
Important caveat: you have to conservatively
describe the distribution
Therefore, you may need enough sim trials to
produce enough data points to make sure you
have the distribution you think you have
Alternatives (may require a waiver)
History
Engineering judgment
Other Bayesian methods
DON’T just assume a normal distribution!
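
The basic idea of a variables plan (mean plus a sampling-error factor, compared with the requirement limit) has a simple closed form in only one case, the normal distribution, which per the caveat above must be justified rather than assumed. The sketch below is for illustration under that assumption: it uses a widely published approximation for the one-sided tolerance factor k (exact values come from the noncentral t distribution), and the sample values and limit are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def one_sided_k_factor(n, coverage, confidence):
    """Approximate one-sided normal tolerance factor (assumes normal output)."""
    z_p = NormalDist().inv_cdf(coverage)     # quantile for required coverage
    z_g = NormalDist().inv_cdf(confidence)   # quantile for 1 - consumer's risk
    a = 1.0 - z_g ** 2 / (2.0 * (n - 1))
    b = z_p ** 2 - z_g ** 2 / n
    return (z_p + math.sqrt(z_p ** 2 - a * b)) / a

# Hypothetical sim outputs and requirement limit (upper limit, smaller is better)
sim_output = [10018.2, 10021.5, 10019.9, 10022.4, 10017.6, 10020.8,
              10023.1, 10019.0, 10021.2, 10020.3, 10018.8, 10022.0]
REQUIREMENT_LIMIT = 10035.0

k = one_sided_k_factor(len(sim_output), coverage=0.9973, confidence=0.90)
upper_bound = mean(sim_output) + k * stdev(sim_output)
verified = upper_bound <= REQUIREMENT_LIMIT
print(f"k = {k:.3f}, upper bound = {upper_bound:.2f}, verified: {verified}")
```

Here 12 trials are enough because the outputs sit far from the limit; the same 12 trials could never satisfy the pass/fail plan. For any other distribution the factor changes, which is where the statistician comes in.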


Methods: See a Statistician

Talk to a statistician if this is the route you want
to take
Conservative distribution fitting
Correct sample size determination
Dr. White has assembled Excel sample size
calculators for a significant number of
distributions
Need beta (sic) testers to exercise the methods in real
situations using real data
Characterization of β risk given your data may be
possible post hoc

Summary

Allow correct risk decisions and avoid RIDs by using
correct probabilistic verification language and
methods
Language recommendations available
Use good M&S design and peer review methods
Physics‐based probabilistic requirements are all
about diligence in searching for problems
Consumer’s risk
Calculators are available for many pass/ fail
verifications
Other methods are available which allow for
potentially less resource‐intensive verifications
Verification by variables requires more rigor in verification

Backup


The Search Analogy
In a test or experiment, the engineer wants to be sure the results
prove the hypothesis
Make sure the results aren’t due to chance
Producer’s (α) risk: probability that what you found is really due to
chance
e.g. “95% confidence” means only α = 5% chance results could have
occurred by chance and were not due to the controlled inputs
In accepting a lot of material, ideally you want to minimize your
risk of accepting a bad lot
Want to search diligently for bad material
Consumer’s (β) risk : probability that you accepted bad material
E.g. “10% consumer’s risk” means that there is a 10% probability that you
would accept a lot of barely rejectable quality given your acceptance
sampling plan
1 – β is called the “power” of the sampling plan
α and β risks are not equivalent


Six Steps
1. Document the requirement, consequences of not meeting the
requirements, and causes leading to the consequences.
2. Logic diagram such as a fault tree showing potential causes for not
meeting the requirement.
3. Document the rationale supporting the methods (including analysis,
software, testing, and inspection) selected to compute probability of
failure at various gates of the logic diagram.
4. Compute probabilities at various gates of the logic diagram and
results of the completed logic diagram analysis leading to verification
of the requirement with specified probability and its associated risk
(confidence level).
5. Peer review of the four steps stated above. An independent review
shall be performed focusing on critical failure modes and events on the
critical path
6. A verification report should be submitted to the organization
responsible for the requirement
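
Step 4 above, computing probabilities at the gates of the logic diagram, can be sketched for a very small fault tree. Assumptions: basic events are independent, and the tree structure, event names, and probabilities are all hypothetical; only the 1 in 200 limit comes from the example SR&QA requirement.

```python
from math import prod

def and_gate(probabilities):
    """Output event occurs only if every (independent) input event occurs."""
    return prod(probabilities)

def or_gate(probabilities):
    """Output event occurs if at least one (independent) input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical basic-event probabilities per mission
p_engine_failure = 0.003
p_primary_valve_stuck = 0.02
p_backup_valve_stuck = 0.05

# Hypothetical tree: loss occurs if the engine fails OR both valves stick
p_both_valves = and_gate([p_primary_valve_stuck, p_backup_valve_stuck])
p_top = or_gate([p_engine_failure, p_both_valves])

REQUIREMENT = 1 / 200   # limit from the example SR&QA requirement
print(f"top-event probability: {p_top:.6f}")
print(f"within the 1 in 200 limit: {p_top <= REQUIREMENT}")
```

A real analysis would also carry the uncertainty in each basic-event probability through the gates, since the SR&QA-type requirement is verified against a mean probability.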


NASA‐STD‐7009

M&S Standard development started in May 2005
in response to Diaz Action #4
The permanent NASA M&S Standard was issued
by the NASA Chief Engineer on July 11, 2008 as
NASA‐STD‐7009
Goal: credibility
Contains requirements for various parts of M&S to
achieve that goal
Construct a credible M&S
Obtain credible output
Peer assessment of credibility of M&S as a whole
Ensure that the credibility of the results from M&S is clearly
and properly conveyed to those making critical decisions

The Notes for SR&QA Reqs
Note 1: See “6‐Step Process” for one satisfactory approach. Note that
the "peer review" may generally be performed by the SIG's, Panels, and
Engineering Review Boards responsible for the engineering effort
related to the requirements being verified. If analysis is performed
using Probabilistic Risk Assessment methodology, the analysis shall be
performed in accordance with CxP 70017, Constellation Program
Probabilistic Risk Assessment (PRA) Methodology Document. The
team recommends, however, that this Methodology Document should
be reviewed and modified to incorporate the salient features of the “6‐
Step Process” if the Probabilistic Risk Assessment Methodology is used
for this example and the resulting information be discussed with the
design community.
Note 2: LOC and LOM conditions are defined in the glossary. The use
of mean probabilities without confidence levels for these classes of
requirements is specified to allow for allocation of requirements to
subsystems.

