2. Overview
Motivation – why care?
Probabilistic requirements
SR&QA vs. engineering performance based
How to write them
Decision matrix: consumer’s vs. producer’s risk
The “simplest” verification case: pass/fail
Number of simulation trials needed to verify
Reliability Test Planner (RTP)
Tweaking the verification sampling plan
More sophisticated verifications
Prob Req Verification for PM Challenge 2009 ver. 1/16/2009 2
5. Constellation Program (Cx)
NASA’s next manned space program,
scheduled to make its first flights
early in the next decade.
Ares I and Ares V rockets
Orion spacecraft
The vision is to send human explorers
back to the moon and then onward to
Mars and other destinations in the solar
system . . .
. . . SAFELY
7. Probabilistic Engineering Design
Rapidly gaining acceptance
Confront uncertainties directly—use estimated
probability distributions for model parameters
Estimate probability distributions for outputs from
test and/or via Monte Carlo sim and/or other methods
Capture, describe and leverage uncertainty to help design
robust, reliable systems
Advantages
Better understanding of impacts of uncertainties (lower
risks)
Design “closer to the margin” (lower costs)
8. Monte Carlo Approach
Draw observations: random inputs with known probability distributions
Fixed parameters and controlled inputs with known values
Simulation model
Calculate observations: sample distributions of model outputs
Sensitivity analysis
[Figure: input distributions feed the simulation model, which produces output distributions]
Staggering range of applications: computational mathematics, science, social
science, economics and finance, computer science, all branches of engineering
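The flow on this slide can be sketched in a few lines of Python. This is only an illustration: the model function, the input distributions, and every parameter value below are hypothetical stand-ins, not anything taken from a Constellation simulation.

```python
import random

def simulation_model(thrust_scale, drag_scale, fixed_mass=1000.0):
    """Hypothetical stand-in for a real simulation model.

    fixed_mass plays the role of a fixed parameter with a known value.
    """
    return fixed_mass * thrust_scale / drag_scale

def run_monte_carlo(n_trials, seed=0):
    """Draw random inputs, run the model, and collect the outputs."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    outputs = []
    for _ in range(n_trials):
        thrust_scale = rng.gauss(1.0, 0.02)     # assumed input distribution
        drag_scale = rng.uniform(0.95, 1.05)    # assumed input distribution
        outputs.append(simulation_model(thrust_scale, drag_scale))
    return outputs

outputs = sorted(run_monte_carlo(10_000))
mean_output = sum(outputs) / len(outputs)
p05 = outputs[int(0.05 * len(outputs))]   # empirical 5th percentile
p95 = outputs[int(0.95 * len(outputs))]   # empirical 95th percentile
print(f"mean={mean_output:.1f}  5th pct={p05:.1f}  95th pct={p95:.1f}")
```

The sorted outputs are the sample distribution of the model output; percentiles read from it are estimates and carry sampling error, which is what the verification slides address.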
9. Writing Probabilistic Requirements
Requirements which involve M&S run under
uncertainty need particular treatment
Wording needs to be clear
Make sure to address “goodness” of M&S as well as whether
output specifically meets a number
Need to address uncertainty inherent in M&S along with
uncertainty within the model and assumptions themselves
Recommendations to CxCEF by Probabilistic
Requirements Verification Team
JSC document EA4‐07‐005 dated 5/14/2007 with
attachments
This is a probabilistic technology (PT)
10. M&S Design Aids and Checks
Part of a well‐written requirement
Goals include
Make sure simulation model isn’t designed in a vacuum
Make sure the simulation model is appropriate to the
questions at hand
Make sure uncertainties are correctly and fully
addressed
Peer review
Methods (neither exhaustive nor mutually exclusive)
Six Steps (Suren Singhal, MSFC; attachment to EA4‐07‐005)
NASA‐STD‐7009
12. SR&QA Type Requirements
Requirement: [CAxxxx‐PO] The XXX system shall limit its
contribution to the risk of loss of crew (LOC) for a Xsssss
mission to no greater than 1 in 200 (TBR‐xxx‐xxx).
Rationale: The 1 in 200 (TBR‐xxx‐xxx) means a .005 (or .5%)
probability of LOC due to the XXX during any Xsssss
mission. The baseline numbers were derived from a
preliminary PRA within NASA‐TM‐2005‐214062, NASA's
Exploration Systems Architecture Study. This requirement
is driven by CxP 70003‐XXXXxx, Constellation Program
Plan, Annex 1: Need, Goals, and Objectives (NGO), Safety
Goal CxP‐Xxx: Provide a substantial increase in safety, crew
survival and reliability of the overall system over legacy
systems.
13. SR&QA Type Verification Statement
[CA0501V‐PO] Xsssss LOC due to XXX shall be verified by
analysis. The simulation tools and analysis methodology,
and the assumed non‐ideal model behavior and design data
which is used in the analysis shall be developed and peer
reviewed to ensure the potential causes for off‐nominal
behavior are adequately identified and their probabilities
properly quantified [see note 1]. The requirement shall be
considered satisfied when analysis results show there is at
most a 0.5% (TBR‐xxx‐xxx) probability of LOC with the
probability taken as a mean probability [see note 2].
Notes: see backup
16. The Notes
Note 1: See “6‐Step Process” for one satisfactory approach.
Note that the "peer review" may generally be performed by
the SIG's, Panels, and Engineering Review Boards
responsible for the engineering effort related to the
requirements being verified.
Note 2: “Consumer’s risk” is defined in the glossary. A 10%
maximum is suggested, and consumer’s risk is specified
because of the criticality of meeting this constraint to
mission success. The term “β‐confidence” could be used if
preferred where the value specified would then be 90%.
17. How To
Break down components of requirement
Desired performance
Desired probability/proportion of achieving
desired performance
Acceptable risk
Sampling error
Consumer’s (β, Type II) and producer’s (α, Type I)
risks
18. Decision Matrix
Actual design         Verification procedure says: meets the Standard    Verification procedure says: fails the Standard
Meets the Standard    Correct determination (probability 1-α)            Producer’s risk, Type I error (probability α)
Fails the Standard    Consumer’s risk, Type II error (probability β)     Correct determination (probability 1-β)
19. The Courtroom Analogy
Threshold for a criminal trial
Assume innocence; prove guilt beyond a reasonable doubt
Focuses on α risk: if you conclude there was wrongdoing,
you want to be really sure of the evidence
Type I error: wrongful conviction
Type II error: letting a guilty person go free
American courts try not to convict based on finding a
possibility that the defendant is guilty
(Don’t look too closely: the analogy isn’t quite right; see
web.bsu.edu/cob/econ/research/papers/bsuecwp200601liu.pdf)
20. Bare‐Bones Physics‐Based Verification Statement
The system will attain the success threshold
99.73% of the time with a “consumer’s risk”
of 10%.
99.73% is a coverage probability, aka a
percentile of a distribution, aka a reliability of
the system
Means you expect to be ρ = 99.73% reliable, and
can deal with failure 27 times out of 10,000
Generally flowed down from parent requirements
21. Bare‐Bones Physics‐Based Verification Statement
The system will attain the success threshold 99.73%
of the time with a “consumer’s risk” of β = 10%.
10% is an expression of consumer’s risk
Means the Program expects to be ρ = 99.73% reliable, but
if the system is in actual reality ρ = 99.729% reliable, the
Program can deal with accepting that condition β = ~10%
of the time
Suggest this is a programmatic decision, but probably needs
thoughtful input from designers and systems analysts
Often depends on economics
β risk is for figuring out whether you’ve met the
requirement and is not necessarily used for flowing down
requirements to the next level
23. The Pass/Fail Case (Binomial)
A typical case:
n simulation trials are run
The number of trials in which the simulated result
failed the requirement is counted
How do we determine n?
How many of the n trials can exceed the
requirement before we know we’ve failed?
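As a minimal sketch of the bookkeeping, the pass/fail check reduces to counting failures against an (n, c) sampling plan, where c is the number of acceptable failures. The function names, the sample outputs, and the limit below are all hypothetical; choosing n and c is the subject of the following slides.

```python
def count_failures(outputs, upper_limit):
    """Count trials whose simulated result exceeds the requirement limit."""
    return sum(1 for value in outputs if value > upper_limit)

def verify_pass_fail(outputs, upper_limit, n_required, c_allowed):
    """Apply an (n, c) plan: verified only if failures <= c in n trials."""
    if len(outputs) < n_required:
        raise ValueError("not enough trials to apply the sampling plan")
    failures = count_failures(outputs[:n_required], upper_limit)
    return failures <= c_allowed, failures

# Hypothetical outputs from five trials, checked against a limit of 10.0
sample_outputs = [9.1, 9.4, 10.2, 8.8, 9.9]
verified, failures = verify_pass_fail(sample_outputs, 10.0, n_required=5, c_allowed=0)
print(f"failures={failures}, verified={verified}")
```

Fixing n and c before any runs are made keeps the decision rule from being tuned to the results.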
24. Number of Trials Required
General idea
Want to run enough trials to be able to claim, “We’ve
looked exhaustively”
“Exhaustively” is determined by the Project:
consumer’s risk (the probability we say it’s good when
it’s actually bad)
This is a well‐characterized statistical case
Acceptance sampling
Addressed in MIL‐HDBK‐781
Watch it: many QC acceptance standards focus on
producer’s risk (ANSI/ASQ Z1.4)
25. Acceptance Sampling
Qualities of acceptance sampling as a best
practice
Strengths
• Accepted national and international standard
• Easy to implement—good software and “cookbook”
to apply
• Sampling plan can be determined a priori before any
simulation runs are made
Weakness
• High reliability (large ρ) with low risk (small β)
requires thousands of runs
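To see why the run count grows, consider the zero-failure case. A system with true reliability ρ passes n independent trials with no failures with probability ρ^n, so holding consumer's risk to β requires ρ^n ≤ β, or n ≥ ln β / ln ρ. The sketch below evaluates that bound; the function name is made up for this illustration, and the third parameter set is a hypothetical tighter requirement.

```python
import math

def zero_failure_trials(reliability, consumers_risk):
    """Smallest n with reliability**n <= consumers_risk (zero failures allowed)."""
    return math.ceil(math.log(consumers_risk) / math.log(reliability))

for rho, beta in [(0.9973, 0.10), (0.995, 0.05), (0.9999, 0.01)]:
    n = zero_failure_trials(rho, beta)
    print(f"rho={rho}, beta={beta}: at least {n} trials with zero failures")
```

Plans that tolerate one or more failures need more trials still, since the acceptance probability then includes additional binomial terms.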
27. Steps
1. Start up the program, then click the “Solver” tab
2. Select the “Binomial Dist” radio button
3. Choose whether you want to find a first‐cut plan by
specifying number of trials or number of “acceptable”
failures
4. Enter desired reliability (coverage) in the “reliability” box
5. Enter one minus the consumer’s risk in the “confidence”
box (this is β confidence or power)
6. Enter either desired number of trials or acceptable failures
7. Press “Solve”
8. Read out the (n,c) sampling plan and its (within rounding)
exact reliability ρ and β confidence (power) in the window
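For readers without the tool, the same kind of first-cut plan can be approximated directly from the binomial distribution. The sketch below is not RTP's algorithm, and its answers may differ slightly from what RTP reports; the function names are invented for this example. It searches for the smallest n given acceptable failures c, or the largest c given n, such that a system at exactly the stated reliability is accepted with probability no greater than one minus the confidence.

```python
import math

def accept_probability(n, c, reliability):
    """P(at most c failures in n independent trials), summed in log space."""
    q = 1.0 - reliability  # per-trial failure probability, assumed 0 < q < 1
    total = 0.0
    for k in range(min(c, n) + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(q) + (n - k) * math.log1p(-q))
        total += math.exp(log_term)
    return min(total, 1.0)

def smallest_n(reliability, confidence, c):
    """Smallest number of trials for which an (n, c) plan meets the confidence."""
    beta = 1.0 - confidence
    lo, hi = c, c + 1          # with n = c trials the plan always accepts
    while accept_probability(hi, c, reliability) > beta:
        lo, hi = hi, hi * 2    # grow until the risk target is met
    while hi - lo > 1:         # bisect: acceptance probability falls as n grows
        mid = (lo + hi) // 2
        if accept_probability(mid, c, reliability) > beta:
            lo = mid
        else:
            hi = mid
    return hi

def largest_c(reliability, confidence, n):
    """Largest acceptable failure count for n trials, or None if none works."""
    beta = 1.0 - confidence
    best = None
    for c in range(n + 1):
        if accept_probability(n, c, reliability) > beta:
            break
        best = c
    return best

print(smallest_n(0.9973, 0.90, 0))   # trials needed when no failures are allowed
print(largest_c(0.9973, 0.90, 852))  # failures allowed when 852 trials are run
```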
29. Now, Press “Implement”
This creates the “power
curve” for your sampling
plan
Also known as an
operating characteristic
(OC) curve
X axis: true system
reliability, an unknown
value
Y axis: probability that,
given a true system
reliability of X and your
sampling plan, the system
will be accepted
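The points on such a curve can be tabulated without the tool. The sketch below assumes independent trials and uses the binomial distribution directly; the helper names and the grid of reliabilities are chosen for illustration only, and the example plan is the 852-trial, zero-failure plan discussed on the next slide.

```python
import math

def accept_probability(n, c, reliability):
    """P(at most c failures in n independent trials), summed in log space."""
    q = 1.0 - reliability  # per-trial failure probability, assumed 0 < q < 1
    total = 0.0
    for k in range(min(c, n) + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(q) + (n - k) * math.log1p(-q))
        total += math.exp(log_term)
    return min(total, 1.0)

def oc_curve(n, c, reliabilities):
    """Return (true reliability, probability of acceptance) pairs for a plan."""
    return [(r, accept_probability(n, c, r)) for r in reliabilities]

curve = oc_curve(852, 0, [0.9960, 0.9973, 0.9985, 0.9992, 0.9999])
for r, p in curve:
    print(f"true reliability {r:.4f} -> P(accept) = {p:.3f}")
```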
30. Power Curve Shows Your Plan
This plan says to run 852
sim trials and call the
requirement verified if
there are zero failures
A single failure means the
system is not verified to the
requirement
Given a true reliability of
99.73%, the power curve
shows a 10% probability
that you will verify (accept)
using this sampling plan
[Figure annotation: 99.73% reliability, 10% consumer’s risk]
This is the way we specified
the plan
32. Power Curve Details Your Plan
Given a true reliability
of ~99.92%, we have a
~50% chance we’ll
verify
. . . And a ~50% chance
we’ll reject
[Figure annotation: 99.92% reliability, 50% consumer’s risk]
A system this reliable should be accepted,
yet 50% of the time a
perfectly good system
will be rejected using
this plan
33. Can the Plan Be Improved?
Definition of improvement:
discrimination
Want to be able to
discriminate between good
and bad systems as perfectly
as possible
Graphically: want the
power curve to be vertical
All systems <99.73% reliable
are rejected 100% of the time
All systems >99.73% reliable
are accepted 100% of the
time
[Figure: power curve of sampling plan with perfect discrimination]
34. Can I Get Closer?
Goal: steeper curve,
still going through the
point (0.9973, 0.10)
[Figure annotation: Bring this point . . . back this way]
One way: brute force
More trials (aka more
samples)
Conceptually: specify a
second point you want
your curve to go
through
35. Can I Get Closer?
RTP will generate related
plans with constant
consumer’s risk
Press “Add Plans” in menu
bar
Select “Constant Consumer
Risk” and type in the desired
β risk
Type in the number of plans
you want to examine
“Above” and “Below” the
current curve and the
increment of number of
failures between each plan
Press “OK” to get the
sampling plans and curves
36. More Plans
Adding a lot more trials
and accepting more
failures gets us closer
to a vertical power
curve
Best plan here is the
(12114, 25) plan
You can get better by
running more trials, but
clearly, there are
diminishing returns
37. Plan comparisons
10 sampling plans with fixed ρ=0.995 and β=0.05
[Figure: number of trials (n), left axis, 0 to 3500, and producer's risk (α), right axis, 0 to 0.5, plotted against acceptance number (c), 0 to 9]
38. Test Plan Description
To get a better test plan
description, select the
“Detail” tab and press
the square button on the
tab
Note that the “length”
column contains decimal
trials: should be rounded
up
E.g. Plan no. 7 should be
(12115, 25)
Plan Description: Original Fixed Length Test Plan
Plan Type: Fixed Length
Source: Original
Distribution: Binomial
Lower Test Value: 0.997
Consumers Risk (Beta): 9.990989E‐02
Upper Test Value: 0.999
Producers Risk (Alpha): 0.6836846
Primary Plan Length: 852 trials & 0 failures
Plan #   Length      Fail   Beta   Alpha
1        852.0       0      0.1    0.684
2        850.3       0      0.1    0.683
3        3,435.0     5      0.1    0.321
4        5,702.3     10     0.1    0.155
5        7,881.4     15     0.1    0.075
6        10,012.3    20     0.1    0.035
7        12,114.8    25     0.1    0.017
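The listed plans can be cross-checked with a direct binomial calculation. This is a sketch under stated assumptions, not a reproduction of RTP's method: the lower test value is taken as the 0.9973 requirement (displayed as 0.997), and the upper test value is assumed to be 0.99865, a guess at the unrounded figure behind the displayed 0.999 that the slide itself does not state. Lengths are rounded up as the slide recommends.

```python
import math

def accept_probability(n, c, reliability):
    """P(at most c failures in n independent trials), summed in log space."""
    q = 1.0 - reliability  # per-trial failure probability, assumed 0 < q < 1
    total = 0.0
    for k in range(min(c, n) + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(q) + (n - k) * math.log1p(-q))
        total += math.exp(log_term)
    return min(total, 1.0)

LOWER_TEST_VALUE = 0.9973           # requirement; displayed as 0.997
ASSUMED_UPPER_TEST_VALUE = 0.99865  # assumption; displayed as 0.999

# (length, acceptable failures) exactly as displayed in the plan listing
displayed_plans = [(852.0, 0), (850.3, 0), (3435.0, 5), (5702.3, 10),
                   (7881.4, 15), (10012.3, 20), (12114.8, 25)]

def summarize(plans, lower, upper):
    """Round each length up and compute consumer's and producer's risk."""
    rows = []
    for length, c in plans:
        n = math.ceil(length)
        beta = accept_probability(n, c, lower)           # accept a bad system
        alpha = 1.0 - accept_probability(n, c, upper)    # reject a good system
        rows.append((n, c, beta, alpha))
    return rows

rows = summarize(displayed_plans, LOWER_TEST_VALUE, ASSUMED_UPPER_TEST_VALUE)
for n, c, beta, alpha in rows:
    print(f"({n}, {c}): beta={beta:.3f}  alpha={alpha:.3f}")
```

If the assumed upper test value is wrong, the alpha column will not match the tool's listing, but the trend should still hold: at constant consumer's risk, larger plans lower producer's risk.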
40. Here’s What You Can Do
Get a faster computer
Use a simpler simulation model
Go with the number of trials you can run
Examine risk with RTP or other correct statistical calculation
ACCEPT THAT RISK
May need to write a waiver
Use acceptance sampling by variables technique
Use a technique that searches the response space efficiently
Response surface methodology?
Probabilistic methods?
Calculate an answer
Be sure calculations are correct or conservative
“Worst‐on‐worst”?
Make sure it really is worst‐on‐worst
41. But . . . I Know I’m OK!
“But I know I’m OK because all my trials’ output values are a
long way from my requirement limit!”
This is not the binomial (pass/fail) case, but may be able to be
dealt with another way
NOT PASS/FAIL
Continuous variable output characterization generally
requires far fewer samples than binomial case
Sometimes a small fraction of the number of trials needed for
pass/fail
Depends on distribution
You still need to verify statistically or somehow convince
the verification panel using engineering judgment that risk
is sufficiently low
42. Variables Acceptance Plans
Basic idea: compare
the mean of your sim’s
output distribution to
the requirement limit
Add a factor which
accounts for sampling
error
[Figure: Gamma (3-Parameter) Distribution, Shape, Scale, Threshold = 400, 16, 10000, Probability = 0.0. X axis: Required Characteristic, 10010 to 10035; Y axis: density, 0 to 0.4. The sim output distribution is shown relative to the requirement limit.]
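One common way to express "mean plus a factor for sampling error" is a one-sided tolerance bound: accept if the sample mean plus k sample standard deviations stays under the limit. The sketch below assumes normally distributed output purely for illustration, which may well not hold for a real simulation, and uses a standard closed-form approximation for k rather than exact tables. The function names, the data, and the limit are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def k_factor(n, coverage, confidence):
    """Approximate one-sided normal tolerance factor for sample size n."""
    z_p = NormalDist().inv_cdf(coverage)     # quantile for the coverage
    z_g = NormalDist().inv_cdf(confidence)   # quantile for 1 - consumer's risk
    a = 1.0 - z_g ** 2 / (2.0 * (n - 1))
    b = z_p ** 2 - z_g ** 2 / n
    return (z_p + math.sqrt(z_p ** 2 - a * b)) / a

def verify_by_variables(outputs, upper_limit, coverage, confidence):
    """Accept if mean + k * s is at or below the requirement limit."""
    k = k_factor(len(outputs), coverage, confidence)
    bound = mean(outputs) + k * stdev(outputs)
    return bound <= upper_limit, bound

# Hypothetical sim output: 100 trials, checked against a limit of 10035
rng = random.Random(1)
sim_outputs = [rng.gauss(10025.0, 1.25) for _ in range(100)]
accepted, upper_bound = verify_by_variables(sim_outputs, 10035.0,
                                            coverage=0.9973, confidence=0.90)
print(f"upper tolerance bound = {upper_bound:.2f}, accepted = {accepted}")
```

Here 100 trials are enough to reach a decision, where the pass/fail plan for the same coverage and risk needs 852; that saving is only legitimate if the distributional assumption can be defended.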
43. However
Important caveat: you have to conservatively
describe the distribution
Therefore, you may need enough sim trials to
produce enough data points to make sure you
have the distribution you think you have
Alternatives (may require a waiver)
History
Engineering judgment
Other Bayesian methods
DON’T just assume a normal distribution!
44. Methods: See a Statistician
Talk to a statistician if this is the route you want
to take
Conservative distribution fitting
Correct sample size determination
Dr. White has assembled Excel sample size
calculators for a significant number of
distributions
Need beta (sic) testers to exercise the methods in real
situations using real data
Characterization of β risk given your data may be
possible post hoc
45. Summary
Allow correct risk decisions and avoid RIDS by using
correct probabilistic verification language and
methods
Language recommendations available
Use good M&S design and peer review methods
Physics‐based probabilistic requirements are all
about diligence in searching for problems
Consumer’s risk
Calculators are available for many pass/fail
verifications
Other methods are available which allow for
potentially less resource‐intensive verifications
Verification by variables requires more rigor in verification
47. The Search Analogy
In a test or experiment, the engineer wants to be sure the results
prove the hypothesis
Make sure the results aren’t due to chance
Producer’s (α) risk: probability that what you found is really due to
chance
e.g. “95% confidence” means only α = 5% chance results could have
occurred by chance and were not due to the controlled inputs
In accepting a lot of material, ideally you want to minimize your
risk of accepting a bad lot
Want to search diligently for bad material
Consumer’s (β) risk : probability that you accepted bad material
E.g. “10% consumer’s risk” means that there is a 10% probability that you
would accept a lot of barely rejectable quality given your acceptance
sampling plan
1 – β is called the “power” of the sampling plan
α and β risks are not equivalent
48. Six Steps
1. Document the requirement, consequences of not meeting the
requirements, and causes leading to the consequences.
2. Logic diagram such as a fault tree showing potential causes for not
meeting the requirement.
3. Document the rationale supporting the methods (including analysis,
software, testing, and inspection) selected to compute probability of
failure at various gates of the logic diagram.
4. Compute probabilities at various gates of the logic diagram and
results of the completed logic diagram analysis leading to verification
of the requirement with specified probability and its associated risk
(confidence level).
5. Peer review of the four steps stated above. An independent review
shall be performed focusing on critical failure modes and events on the
critical path
6. A verification report should be submitted to the organization
responsible for the requirement
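Step 4 can be illustrated with a toy logic diagram. The sketch below assumes the basic events are statistically independent, which a real analysis would have to justify, and every event name and probability is invented for the example. It computes only point probabilities at each gate and does not address the associated risk (confidence level) that step 4 also calls for.

```python
import math

def and_gate(probabilities):
    """Output event occurs only if all independent input events occur."""
    return math.prod(probabilities)

def or_gate(probabilities):
    """Output event occurs if at least one independent input event occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)

# Hypothetical fault tree for a loss-of-crew top event
engine_failure = or_gate([1.0e-3, 5.0e-4])      # two assumed engine failure causes
abort_system_fails = 0.1                         # assumed, given an engine failure
loss_via_engine = and_gate([engine_failure, abort_system_fails])
structural_failure = 2.0e-3                      # assumed
top_event = or_gate([loss_via_engine, structural_failure])

requirement = 1.0 / 200.0   # "no greater than 1 in 200"
print(f"P(top event) = {top_event:.6f}, meets requirement: {top_event <= requirement}")
```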
49. NASA‐STD‐7009
M&S Standard development started in May 2005
in response to Diaz Action #4
The permanent NASA M&S Standard was issued
by the NASA Chief Engineer on July 11, 2008 as
NASA‐STD‐7009
Goal: credibility
Contains requirements for various parts of M&S to
achieve that goal
Construct a credible M&S
Obtain credible output
Peer assessment of credibility of M&S as a whole
Ensure that the credibility of the results from M&S is clearly
and properly conveyed to those making critical decisions
50. The Notes for SR&QA Reqs
Note 1: See “6‐Step Process” for one satisfactory approach. Note that
the "peer review" may generally be performed by the SIG's, Panels, and
Engineering Review Boards responsible for the engineering effort
related to the requirements being verified. If analysis is performed
using Probabilistic Risk Assessment methodology, the analysis shall be
performed in accordance with CxP 70017, Constellation Program
Probabilistic Risk Assessment (PRA) Methodology Document. The
team recommends, however, that this Methodology Document should
be reviewed and modified to incorporate the salient features of the “6‐
Step Process” if the Probabilistic Risk Assessment Methodology is used
for this example and the resulting information be discussed with the
design community.
Note 2: LOC and LOM conditions are defined in the glossary. The use
of mean probabilities without confidence levels for these classes of
requirements is specified to allow for allocation of requirements to
subsystems.