2.
ASQ Reliability Division
English Webinar Series
One of the monthly webinars
on topics of interest to
reliability engineers.
To view recorded webinars (available to ASQ Reliability
Division members only), visit asq.org/reliability.
To sign up for the live webinars, which are free and open to
anyone, visit reliabilitycalendar.org and select English
Webinars to find links to register for upcoming events:
http://reliabilitycalendar.org/webinars/
3.
Cost-Optimized Reliability Test Planning and
Decision-Making Through Bayesian Methods
and Leveraging Prior Knowledge
ASQ Reliability Division Webinar Program
Jun 6th 2013
Charles H. Recchia, MBA, PhD
4.
COST-OPTIMIZED RELIABILITY TEST PLANNING AND DECISION-MAKING
THROUGH BAYESIAN METHODS AND LEVERAGING PRIOR KNOWLEDGE
When planning for and interpreting reliability datasets, proper application of
Bayesian statistics leads to improved decision-making and resource utilization, and
allows for rigorous treatment of prior knowledge to optimize overall reliability
program costs and increase return on investment. In this webinar, we build
upon the foundation established in our previous intro-level presentation and
provide specific examples of reduced sample sizes enabled by Bayesian
methods. We also describe real-world scenarios of improved decision-making
during comparative reliability analyses using proper statistical perspectives on
relative failure rates between systems.
Charles H Recchia, MBA, PhD has more than twenty-five years of product development, engineering
management, and fundamental research experience with a special focus on reliability statistics of
complex systems. He earned his doctorate in Condensed Matter Physics from The Ohio State
University, and a Master of Business Administration degree from Babson College. Dr. Recchia acquired
in-depth reliability engineering expertise at Intel’s Portland Technology Development, MKS
Instruments and Saint-Gobain Innovative Materials R&D, has served as visiting professor of physics at
Wittenberg University, and is author of numerous peer-reviewed technical papers and patents across
multiple fields. Charles provided statistics & advanced lean six sigma consultancy for A123 Systems via
the Andover-based Quality Support Group Inc, and has contracted under Coleman Research Group
vetting CASIS-ISS US National Lab research proposals. A senior member of ASQ and the American
Physical Society, Charles currently works at Raytheon Integrated Defense Systems and serves on the
Advisory Committee for the Boston Chapter of the IEEE Reliability Society.
6/6/2013 ASQ RD Webinar 4
5.
References and Further Reading
• NIST/SEMATECH e-Handbook of Statistical Methods,
http://www.itl.nist.gov/div898/handbook/, April (2012)
• Statistical Methods for Reliability Data, WQ Meeker and LA Escobar
(1998)
• Applied Reliability, 2nd edition, PA Tobias and DC Trindade (1995)
• Bayesian Reliability, MS Hamada, AG Wilson, CS Reese, and HF
Martz, Springer Series in Statistics (2008)
• Bayesian Reliability Analysis, HF Martz and RA Waller (1982)
• Methods for Statistical Analysis of Reliability and Life Data, NR
Mann, RE Schafer, and ND Singpurwalla (1974)
• Bayes is for the Birds, RA Evans, IEEE Transactions on Reliability
R-38, 401 (1989).
6.
Agenda
• Brief Review of Bayesian Method
• Examples of Reduced Test Sample Sizes
• Comparative Reliability Decision Making
• Question and Answer
10.
Bayesian Core Idea
• “Prior”: what you knew before (WYKB). $g(\lambda)$ is the probability of λ before new data comes in.
• New data: $t_i$, a new set of failure times. $L(t_i \mid \lambda)$ is the likelihood of obtaining $t_i$ given parameter λ.
• “Posterior”: the best possible update of WYKB, adjusted by the new data. $g(\lambda \mid t_i)$ is the probability of parameter λ given $t_i$.
$$g(\lambda \mid t_i) \propto L(t_i \mid \lambda)\, g(\lambda)$$
$$L(t_i \mid \lambda) = \prod_{\text{uncensored}} f(t_j \mid \lambda) \prod_{\text{censored}} \left[ 1 - F(t_k \mid \lambda) \right]$$
11.
When reliability follows the exponential TTF model (e.g., the flat
constant failure rate portion of the bathtub curve):
Classical Framework
– The MTBF is one fixed unknown value - there is no “probability”
associated with it
– Failure data from a test or observation period allows you to make
inferences about the value of the true unknown MTBF
– No other data are used and no “judgment” - the procedure is objective
and based solely on the test data and the assumed HPP model
Bayesian Framework
– The MTBF is a random quantity with a probability distribution
– Prior to running the test, you already have some idea of what the
MTBF probability distribution looks like based on prior test data or a
consensus engineering judgment
– Upon collecting failure data you incorporate the knowledge to refine
the distribution of the possible values for MTBF
12.
Conjugate Prior
• When the posterior has the same functional form as
the prior, with only its parameters updated by the
likelihood, the prior is known as a “conjugate prior”
for that likelihood
• Similar in concept to an eigenfunction: $H\psi = E\psi$
• Conjugate priors are convenient to use due to
tractability and interpretation.
13.
$$g(\lambda; a, b) = \frac{b^{a}}{\Gamma(a)}\, \lambda^{a-1} e^{-b\lambda}$$
b has units of time
a is dimensionless
Mean $\lambda_{ave} = a/b$
Variance $\sigma^2 = a/b^2$
In Excel
=GAMMA.DIST(λ, a, 1/b, FALSE)
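For readers working outside Excel, the gamma density can be sketched in Python with only the standard library. The function names and the example values a = 2, b = 1000 hr are illustrative assumptions, not numbers from the webinar.

```python
import math

def gamma_pdf(lam, a, b):
    """Gamma density g(lam; a, b) with shape a (dimensionless) and rate b (units of time)."""
    if lam <= 0:
        return 0.0
    # Evaluate in log space so large a or b do not overflow.
    log_g = a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(lam) - b * lam
    return math.exp(log_g)

def gamma_mean_var(a, b):
    """Mean a/b and variance a/b^2 of the failure rate."""
    return a / b, a / b**2

# Illustrative prior values (assumed for this sketch only).
a, b = 2.0, 1000.0                 # b in hours
mean, var = gamma_mean_var(a, b)
print(mean, var)                   # mean in 1/hr, variance in 1/hr^2
print(gamma_pdf(mean, a, b))       # density evaluated at the mean failure rate
```

Note that Excel's GAMMA.DIST takes a scale parameter, which is why the slide passes 1/b, while this sketch takes the rate b directly.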
14.
$$G(\lambda; a, b) = \int_0^{\lambda} g(\lambda'; a, b)\, d\lambda'$$
G(λ) is the probability that the failure rate is less than or equal to λ.
$$G(\lambda; a, b) = \frac{1}{\Gamma(a)}\, \gamma(a, b\lambda)$$
where γ(x, y) is the lower incomplete gamma function.
Note that
$$G(\lambda; a, b) = G(b\lambda; a, 1)$$
In Excel
=GAMMA.DIST(λ, a, 1/b, TRUE)
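The gamma CDF can also be evaluated without Excel. The sketch below is one possible approach, not the presenter's spreadsheet: it computes the regularized lower incomplete gamma function from its power series (a standard expansion, chosen here for simplicity) and applies the scaling property G(λ; a, b) = G(bλ; a, 1). Function names are hypothetical.

```python
import math

def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma, gamma(a, x) / Gamma(a), via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0
    total = 1.0
    n = 0
    # Sum x^n / ((a+1)(a+2)...(a+n)); all terms are positive, so the sum is stable.
    while term > 1e-16 * total and n < 100000:
        n += 1
        term *= x / (a + n)
        total += term
    prefactor = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    return min(1.0, total * prefactor)

def gamma_cdf(lam, a, b):
    """G(lam; a, b): probability that the failure rate is <= lam."""
    return reg_lower_gamma(a, b * lam)   # uses G(lam; a, b) = G(b*lam; a, 1)

# Check against the closed form for a = 1, where G = 1 - exp(-b*lam).
print(gamma_cdf(0.001, 1.0, 1000.0), 1.0 - math.exp(-1.0))

# With the 50/95 example parameters quoted later in the slides (a = 2.863, b = 1522.46 hr),
# these should come out close to 0.50 and 0.95 respectively.
print(gamma_cdf(1 / 600, 2.863, 1522.46))
print(gamma_cdf(1 / 250, 2.863, 1522.46))
```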
15.
Bayesian assumptions for the gamma
exponential system model
1. Failure times for the system under investigation can be adequately
modeled by the exponential distribution with constant failure rate.
2. The MTBF for the system can be regarded as chosen from a prior
distribution model that is an analytic representation of our previous
information or judgments about the system's reliability. The form of
this prior model is the gamma distribution (the conjugate prior for
the exponential model).
The prior model is actually defined for λ = 1/MTBF.
3. Our prior knowledge is used to choose the gamma parameters
a and b for the prior distribution model for λ. There are a number
of ways to convert prior knowledge to gamma parameters.
16.
New data is collected …
New information is combined with the gamma prior model to
produce a gamma posterior distribution.
After a new test is run with
T additional system operating hours, and
r new failures,
The resultant posterior distribution for failure rate λ remains
gamma (since conjugate), with new parameters
a' = a + r
b' = b + T
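The update rule is simple enough to express as a one-line function. The sketch below uses made-up numbers (a prior of a = 2, b = 1000 hr, then one failure in 800 hr) purely for illustration.

```python
def update_gamma_prior(a, b, r, T):
    """Return the posterior gamma parameters (a', b') after r failures in T operating hours."""
    return a + r, b + T

# Hypothetical values, assumed for this example only.
a_prior, b_prior = 2.0, 1000.0    # prior shape and rate (hours)
r, T = 1, 800.0                   # new test: 1 failure in 800 hours

a_post, b_post = update_gamma_prior(a_prior, b_prior, r, T)
print(a_post, b_post)             # 3.0 1800.0
print(a_post / b_post)            # posterior mean failure rate a'/b', in 1/hr
```

Because the posterior is again a gamma distribution, the same function can be applied repeatedly as further test data arrive.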
17.
Reliability estimation with Bayesian gamma prior model
18.
Gamma Prior Method 1: Previous Test Data
1. Actual data from previous testing done on the system (or a
system believed to have the same reliability as the one under
investigation) is the most credible prior knowledge, and the
easiest to use. Simply set
a = total number of failures from all the previous data, and
b = total of all the previous test hours.
19.
Gamma prior method 2: “50/95”
2. A consensus method for determining a and b that works well is the following:
Assemble a group of engineers who know the system and its sub-components well
from a reliability viewpoint.
A. Have the group reach agreement on a reasonable MTBF they expect the system to have.
They could each pick a number they would be willing to bet even money that the system
would either meet or miss, and the average or median of these numbers would be their
50% best guess for the MTBF. Or they could just discuss even-money MTBF candidates
until a consensus is reached.
B. Repeat the process again, this time reaching agreement on a low MTBF they expect the
system to exceed. A "5%" value that they are "95% confident" the system will exceed
(i.e., they would give 19 to 1 odds) is a good choice. Or a "10%" value might be chosen
(i.e., they would give 9 to 1 odds the actual MTBF exceeds the low MTBF). Use
whichever percentile choice the group prefers.
C. Call the reasonable MTBF MTBF50 and the low MTBF you are 95% confident the system
will exceed MTBF05. These two numbers uniquely determine gamma
parameters a and b whose percentile values fall at the right locations.
This is called the 50/95 method (or the 50/90 method if one uses MTBF10, etc.).
20.
Gamma prior method 3: weak prior a = 1
3. Obtain consensus on a reasonable expected
MTBF, called MTBF50. Next, however, the group
decides they want a weak prior that will change
rapidly, based on new test data. If the prior parameter
"a" is set to 1, the gamma has a standard deviation
equal to its mean, which makes it spread out, or
"weak".
To set the 50th percentile we must choose
b = ln 2 × MTBF50
Note: During planning of Bayesian tests, this weak prior is
actually a very friendly prior in terms of saving test time.
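The expression for b follows from the a = 1 case, where the gamma prior reduces to an exponential distribution in λ. A short derivation, using only the gamma pdf and CDF definitions (the symbol $\lambda_{50}$ for the median failure rate is introduced here for convenience):

```latex
\begin{align*}
g(\lambda; 1, b) &= b\,e^{-b\lambda}, \qquad G(\lambda; 1, b) = 1 - e^{-b\lambda} \\
G(\lambda_{50}; 1, b) = \tfrac{1}{2} \;&\Longrightarrow\; \lambda_{50} = \frac{\ln 2}{b} \\
\lambda_{50} = \frac{1}{\mathrm{MTBF}_{50}} \;&\Longrightarrow\; b = \ln 2 \times \mathrm{MTBF}_{50}
\end{align*}
```

With a = 1 the mean a/b and the standard deviation $\sqrt{a}/b$ are both 1/b, which is the "standard deviation equal to its mean" property noted above.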
21.
Special Case: a = 1 (The "Weak" Prior)
When the prior is a weak prior with a = 1, the Bayesian test is always shorter
than the classical test.
There is a very simple way to calculate the required Bayesian test time when
the prior is a weak prior with a = 1.
First calculate the classical/frequentist test time. Call this Tc.
The Bayesian test time is T = Tc - b.
If the b parameter was set equal to (ln 2) × MTBF50 (where MTBF50 is the
consensus choice for an "even money" MTBF), then T = Tc - (ln 2) × MTBF50.
When a weak prior is used, the Bayesian test time is always less than the
corresponding classical test time. That is why this prior is also known as
a friendly prior.
This prior essentially sets the order of magnitude for the MTTF
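The shortcut T = Tc - b can be checked against the general Bayesian test-time expression $T = M\,G^{-1}(1-\alpha;\, a+r,\, 1) - b$. The sketch below assumes the classical time-terminated exponential test time is given by the usual chi-square result, which the slides do not write out, and uses the fact that half of a chi-square variable with 2k degrees of freedom follows a unit-rate gamma distribution with shape k:

```latex
\begin{align*}
T_c &= M \cdot \tfrac{1}{2}\,\chi^2_{1-\alpha;\,2(r+1)} = M\,G^{-1}(1-\alpha;\; r+1,\; 1) \\
T\big|_{a=1} &= M\,G^{-1}(1-\alpha;\; 1+r,\; 1) - b = T_c - b
\end{align*}
```

Since b > 0, the Bayesian test time with a = 1 is shorter than the classical one for the same allowed number of failures r.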
22.
Comments
Many variations are possible, based on the above three
methods. For example, you might have prior data from
sources with various levels of applicability or suitability
relative to the system under investigation.
Thus, you may decide to "weight" the prior data by 0.5, to
"weaken" it. This can be implemented by setting a = 0.5 ×
the number of failures in the prior data and b = 0.5 × the
number of test hours. That spreads out the prior
distribution more, and lets it be influenced more quickly
by freshly accumulated test data.
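The effect of the weighting can be made explicit with the gamma mean and variance formulas. Here $r_0$ and $T_0$ denote the prior failures and prior test hours, and w is the weight (symbols introduced for this illustration):

```latex
\begin{align*}
a = w\,r_0, \quad b = w\,T_0
\;&\Longrightarrow\;
\text{mean} = \frac{a}{b} = \frac{r_0}{T_0} \quad \text{(unchanged)} \\
&\Longrightarrow\;
\text{variance} = \frac{a}{b^2} = \frac{1}{w}\,\frac{r_0}{T_0^{2}}
\end{align*}
```

With w = 0.5 the prior keeps the same mean failure rate but its variance doubles, so new test hours T and failures r carry more weight in a' = a + r and b' = b + T.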
23.
Weibull Example
[Figure: posterior $g(\lambda, k \mid t_i)$ over the Weibull parameters k and λ, with TTF pdf and TTF CDF plotted against x]
26.
Example of 50/95 Method Prior
A group of engineers, discussing the reliability of a new piece of
equipment, decide to use the 50/95 method to convert their knowledge
into a Bayesian gamma prior. Consensus is reached on:
likely MTBF50 value of 600 hrs, and
a low MTBF05 value of 250 hrs
Corresponding parameters solved
a = 2.863
b = 1522.46 hrs
These prior parameters “pre-load” the failure rate distribution
50% probability that λ < 1/600 = 1.67e-3 hr⁻¹
95% probability that λ < 1/250 = 4.00e-3 hr⁻¹
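One way to solve for a and b numerically is sketched below. It is an illustrative approach, not necessarily how the slide values were obtained. It relies on the scaling property G(λ; a, b) = G(bλ; a, 1), so the ratio of the 95th to the 50th percentile of a unit-rate gamma must equal MTBF50/MTBF05. That ratio is assumed to decrease monotonically with a, and a is assumed to lie between 0.05 and 100. All function names are hypothetical.

```python
import math

def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma via its power series."""
    if x <= 0:
        return 0.0
    term, total, n = 1.0, 1.0, 0
    while term > 1e-16 * total and n < 100000:
        n += 1
        term *= x / (a + n)
        total += term
    return min(1.0, total * math.exp(a * math.log(x) - x - math.lgamma(a + 1.0)))

def gamma_quantile(p, a):
    """Quantile of a unit-rate gamma with shape a, found by bisection on the CDF."""
    lo, hi = 0.0, max(1.0, a)
    while reg_lower_gamma(a, hi) < p:      # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def solve_gamma_prior(mtbf50, mtbf_low, p_low=0.95):
    """Find (a, b) with P(lam < 1/mtbf50) = 0.5 and P(lam < 1/mtbf_low) = p_low."""
    target = mtbf50 / mtbf_low             # required ratio of the two unit-rate quantiles
    lo, hi = 0.05, 100.0                   # assumed search range for the shape a
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        ratio = gamma_quantile(p_low, mid) / gamma_quantile(0.5, mid)
        if ratio > target:                 # spread too wide, so a must be larger
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    b = gamma_quantile(0.5, a) * mtbf50    # median condition: b / mtbf50 = q50(a)
    return a, b

a, b = solve_gamma_prior(600.0, 250.0)
print(round(a, 3), round(b, 2))            # expected to land near a = 2.863, b = 1522.46 hr
```

Passing p_low=0.90 with an MTBF10 value would give the 50/90 variant of the method.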
27.
Bayesian Test Planning
Gamma prior parameters a and b have already been determined.
Assume we have a given MTBF objective.
We want to confirm the system will have an MTBF of at least M at the 100×(1-α) % confidence
level. Pick a number of failures, r, that we can allow during the test.
We need a test time T such that we can observe up to r failures and still "pass" the test.
The posterior gamma distribution will have (worst case - assuming exactly r failures) new
parameters of a' = a + r, b' = b + T, and passing the test means that the failure rate $\lambda_{1-\alpha}$, the
upper 100×(1-α) %-tile for the posterior gamma, has to equal the target failure rate 1/M.
By definition, this is $G^{-1}(1-\alpha;\, a', b')$, where $G^{-1}$ is the inverse of the gamma CDF.
Because $G(\lambda; a, b) = G(b\lambda; a, 1)$, we have $G^{-1}(1-\alpha;\, a', b') = G^{-1}(1-\alpha;\, a', 1)/b'$, so the required test time is:
$$T = M\, G^{-1}(1-\alpha;\; a + r,\; 1) - b$$
28.
Bayesian Test Time Example
New equipment has an MTBF requirement of 500 hrs at 80 % confidence.
The engineers decide to leverage their collective experience, use the 50/95 method, and arrive at prior
parameters a = 2.863 and b = 1522.46 hrs.
Now they want to determine an appropriate test time so that they can confirm an MTBF of 500 hrs
with at least 80 % confidence, provided they have no more than two failures (r = 2).
They obtain a test time of 1756 hours using 500 × $G^{-1}(1-0.2;\, 2.863+2,\, 1)$ - 1522.46.
If the test then runs for 1756 hours, with no more than two failures, an MTBF of at least 500
hours has been confirmed at 80 % confidence.
The classical (non-Bayesian) test time required is 2140 hours.
The Bayesian test saves about 384 hours, an 18 % savings.
If, instead, the engineers had used a weak prior with the same 600 hr MTBF50, the required test time would have
been 2140 - 600 × ln 2 = 1724 hours, a savings of roughly 416 hrs, or 19 % versus non-Bayesian.
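The numbers in this example can be reproduced with a short script. The sketch below is an independent check rather than the webinar's spreadsheet: it inverts the unit-rate gamma CDF by bisection and assumes the classical test time equals M × G⁻¹(1-α; r+1, 1), the gamma form of the usual chi-square result. Function names are hypothetical.

```python
import math

def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma via its power series."""
    if x <= 0:
        return 0.0
    term, total, n = 1.0, 1.0, 0
    while term > 1e-16 * total and n < 100000:
        n += 1
        term *= x / (a + n)
        total += term
    return min(1.0, total * math.exp(a * math.log(x) - x - math.lgamma(a + 1.0)))

def gamma_quantile(p, a):
    """Inverse CDF of a unit-rate gamma with shape a, by bisection."""
    lo, hi = 0.0, max(1.0, a)
    while reg_lower_gamma(a, hi) < p:      # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bayes_test_time(M, conf, a, b, r):
    """T = M * Ginv(conf; a + r, 1) - b, allowing up to r failures."""
    return M * gamma_quantile(conf, a + r) - b

def classical_test_time(M, conf, r):
    """Assumed classical time-terminated result, written in gamma form."""
    return M * gamma_quantile(conf, r + 1)

M, conf, r = 500.0, 0.80, 2
T_bayes = bayes_test_time(M, conf, 2.863, 1522.46, r)   # 50/95 prior from the slides
T_class = classical_test_time(M, conf, r)
T_weak = T_class - math.log(2) * 600.0                  # weak prior with MTBF50 = 600 hr

# These should come out close to the slide values of 1756, 2140 and 1724 hours.
print(round(T_bayes), round(T_class), round(T_weak))
print(round(100 * (T_class - T_bayes) / T_class), "% saved with the 50/95 prior")
```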
32.
EXCEL SPREADSHEET EXAMPLES
Yes, the Excel spreadsheet will be
available along with webinar slides.