Common Mistakes with MTBF


MTBF is widely used to describe the reliability of a component or system. It is
also often misunderstood and used incorrectly. In some sense, the very name
“mean time between failures” contributes to the misunderstanding. The objective
of this paper is to explore the nature of the MTBF misunderstandings and the
impact on decision-making and program costs.

Mean-Time-Between-Failure (MTBF) as defined by MIL-STD-721C Definition of
Terms for Reliability and Maintainability, 12 June 1981, is

      A basic measure of reliability for repairable items: The mean number of life
units during which all parts of the item perform within their specified limits, during
a particular measurement interval under stated conditions.

The related measure, Mean-Time-To-Failure (MTTF), is defined as

      A basic measure of reliability for non-repairable items: The total number of
life units of an item divided by the total number of failures within that population,
during a particular measurement interval under stated conditions.

These definitions are very similar. The subtle difference is important, yet the
confusion is compounded when attempting to quantify MTBF or MTTF. In
both cases we often use the calculation described within the MTTF definition.
This is what we would do for any group of values for which we wanted to
estimate the mean (average): tally the number of hours all units have operated
and divide by the number of failures. This provides an unbiased (statistically
speaking) estimate of the population mean.
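
The calculation described above can be sketched in a few lines of Python. The operating hours and failure count below are made-up illustration values, not data from this paper, and the function name `estimate_mttf` is hypothetical.

```python
def estimate_mttf(unit_hours, failures):
    """Point estimate of the mean: total operating hours divided by failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return sum(unit_hours) / failures


# Hypothetical example: five units observed, two of which failed.
hours = [1200.0, 3400.0, 5000.0, 5000.0, 5000.0]
print(estimate_mttf(hours, failures=2))
```

Note that the hours of units that have not failed still count toward the total operating time.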

Keep in mind that time-to-failure data is often not normally distributed. The
underlying distribution for life data starts at time zero and increases. The
exponential family of distributions tends to describe life data well and is commonly
used. The unbiased estimate for the mean value of an exponential distribution is
as described for the MTTF definition above.

When working with data from a repairable system, one should use the
Nonhomogeneous Poisson Process (NHPP), which is a generalization of the
Poisson process. The estimate for the failure intensity can have various
models, yet it is often assumed to be the exponential model. This results in the
common estimate of MTBF of

$$\mathrm{MTBF} = \frac{T(k)}{k}$$

where T(k) is the total operating time of one or more systems and k is the
cumulative number of failures. [1]

This introduces the first source of confusion when considering MTBF, failure
rates, or hazard rates. Because we intuitively use the simple calculation to estimate
the mean value, many do not then apply that estimate with the reliability
function of the appropriate distribution.

For example, if a vendor states the product has an MTTF of 16,000 hours, and
we wanted to know how many out of 100 units will fail in 8,000 hours, the
appropriate calculation is

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

$$R(8{,}000) = e^{-\left(\frac{8{,}000}{16{,}000}\right)} = 0.61$$

such that we expect 61 of the 100 units (61%) to operate for
the full 8,000 hours, and about 39 to fail.
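
As a minimal sketch, the same calculation in Python, assuming exponentially distributed times to failure as in the text (the function name is illustrative):

```python
import math


def exponential_reliability(t, theta):
    """Probability of surviving to time t when the mean time to failure is theta."""
    return math.exp(-t / theta)


r = exponential_reliability(8_000, 16_000)
print(f"R(8,000) = {r:.2f}")
print(f"expected failures per 100 units: {100 * (1 - r):.0f}")
```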

This is assuming an exponential distribution and non-repairable units. Given only
an MTTF value, the most likely distribution to use without additional information is
the exponential.

Extending this same example to determine the reliability at 16,000 hours, we find
that only about 1/3 of the units would be expected to still be operating. If
someone holds the common misunderstanding of the failure rate value that
MTBF represents, it can lead to a significant loss of resources or mission
readiness.

For example, a radar detection OEM received a contract to design and
manufacture a specific system with 5,000 hours MTBF. The specification
included functionality, mission duration and expected equipment duty cycle,
along with minor variations to the airborne inhabited environment. The contract
specified 5,000 hours MTBF as the sole reliability requirement. The design
team designed, built and tested the system, and achieved a better than 5,000 hour
MTBF.

The Air Force found the unit to be the leading cause of aborted missions
(equipment related) and complained to the OEM. A careful analysis of the field
data proved the units actually achieved almost 6,000 hour MTBF, thus exceeding
the specification. Of course, this didn't change the data on aborted missions. In
part the OEM's equipment just happened to be the least reliable equipment on
the aircraft.

A short discussion with the team found some misunderstanding and that “errors
had been made”. The Air Force procurement team and the prime contractor
personnel mistakenly thought the term “5,000 hours MTBF” meant at least 5,000
failure-free operating hours. In reality the term, in this case, meant that
approximately two-thirds of the units are expected to have at least one failure
over a period of 5,000 operating hours. And, in fact, the product performed about
20% better than the specification.
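
The "two-thirds" figure follows from evaluating the exponential model at t equal to the MTBF. A short sketch, assuming a constant failure rate (the function name is illustrative):

```python
import math


def prob_failed_by(t, mtbf):
    """Probability of at least one failure by time t under an exponential model."""
    return 1.0 - math.exp(-t / mtbf)


# At t = MTBF the result is 1 - 1/e, whatever the MTBF value is.
print(round(prob_failed_by(5_000, 5_000), 3))
# The field-observed MTBF of almost 6,000 hours lowers this only modestly.
print(round(prob_failed_by(5_000, 6_000), 3))
```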

The problem was exacerbated by the mission requiring the use of three of the
OEM's units during the mission. Reliability-wise the equipment was in series,
meaning that if any one of the three units failed, the crew had to abort the
mission. Therefore, the probability of successfully completing 1,000 hours of
operation where all three units have to work is

$$R_{sys}(t) = R_1(t) \times R_2(t) \times R_3(t)$$

$$R_{sys}(1{,}000) = e^{-\left(\frac{1{,}000}{5{,}000}\right)} \times e^{-\left(\frac{1{,}000}{5{,}000}\right)} \times e^{-\left(\frac{1{,}000}{5{,}000}\right)} = 0.55$$

Even though each of the individual units has about an 82% reliability (or
probability of surviving 1,000 hours), the three in series have only a 55%
reliability, or probability that all three will operate for 1,000 hours.
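
A minimal sketch of the series calculation, assuming independent units with exponential times to failure (names are illustrative):

```python
import math


def series_reliability(unit_reliabilities):
    """A series system works only if every unit works: multiply the reliabilities."""
    return math.prod(unit_reliabilities)


unit = math.exp(-1_000 / 5_000)          # one unit over 1,000 hours
system = series_reliability([unit] * 3)  # three identical units in series
print(f"unit: {unit:.2f}, system: {system:.2f}")
```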

Whether the cause was a specification error or a misunderstanding of the metric,
the team still had the issue of aborted missions. Simply changing the
reliability requirements would not change the design of the equipment without a
significant re-design. Further discussion found that installing a warm standby
unit permitted the rapid replacement of a failed unit during the mission, thus
significantly reducing mission aborts. The reliability of a 3-out-of-4
system is

$$R_{sys}(t) = 1 - \sum_{i=0}^{m-1} \binom{n}{i} R^{i}(t) \left(1 - R(t)\right)^{n-i}$$


where m is the number of units, out of n total, that have to be operating for the
overall system to be operating. [2] In the example above, m=3 and n=4, and
a single unit has a reliability of about 82%. For three in series the
system reliability drops to about 55%, whereas the calculation for the 3-out-of-4
system results in 85%. Suffice it to say the reliability
is significantly improved.
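
The 3-out-of-4 figure can be checked with a short sketch. It assumes identical, independent units and sums the probabilities of having at least the required number working, which is equivalent to one minus the sum in the formula above. The function name is illustrative.

```python
import math


def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n identical, independent units are working."""
    return sum(math.comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


r_unit = math.exp(-1_000 / 5_000)  # about 82% over 1,000 hours
print(f"3 in series: {k_out_of_n_reliability(3, 3, r_unit):.2f}")
print(f"3 out of 4:  {k_out_of_n_reliability(3, 4, r_unit):.2f}")
```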


Note that using reliability in the above function does not require the use of MTBF.
The reliability term can come from any distribution.

Calculating or using only the MTBF value to represent a product's reliability can
lead to more than misunderstanding. If the product performs better or worse than
expected you may have unnecessary spares expenses or not enough spares to
continue effectively. Another issue that may arise is the unexpected increase in
failure rate after a few years of a very low failure rate. Using the single
parameter, MTBF, does not provide information about the changing nature of
failure rates over time.

The following graph is a plot of the percentage of the population that has failed over
time, or cumulative distribution function (CDF) plot. The red line is the plot of the fitted
exponential distribution. The data and the line fitted to them show a failure rate
that is declining over time. The total number of failures continues to
rise, yet the slope is less than the slope for the exponential distribution.




This is actual data; the time scale and title have been removed to protect the
source. The theta of the exponential distribution is 49,093 hours, whereas the
fitted Weibull distribution has a beta of 0.5823 and an eta of 31,344 hours.

On this plot, the exponential distribution has a slope of 1. The fitted Weibull
distribution slope is less than one. Keep in mind that the exponential and Weibull
distributions are members of the exponential family of distributions. The formula for
the reliability function of the 2-parameter Weibull distribution is


$$R(t) = e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

where the beta is the slope and eta is the characteristic life. Setting beta to 1
reduces the formula to the reliability function for the exponential distribution.

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

where theta is the characteristic life and is also the inverse of the failure rate and
commonly theta is called MTTF or MTBF.
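
The reduction from Weibull to exponential when beta equals 1 can be checked numerically. This is a minimal sketch with illustrative function names and arbitrary test times.

```python
import math


def weibull_reliability(t, beta, eta):
    """Two-parameter Weibull reliability: exp(-(t/eta)**beta)."""
    return math.exp(-((t / eta) ** beta))


def exponential_reliability(t, theta):
    """Exponential reliability: exp(-t/theta)."""
    return math.exp(-t / theta)


# With beta = 1 and eta = theta the two functions agree at every time.
for t in (100.0, 1_000.0, 10_000.0):
    print(t, weibull_reliability(t, 1.0, 5_000.0), exponential_reliability(t, 5_000.0))
```

At t equal to eta the Weibull reliability is 1/e, about 37%, for any beta, which is why eta is called the characteristic life.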

The plot of the CDF is related to the reliability function. Reliability is the
percentage of units surviving over a specific duration. And the CDF plots the
percentage of units failed over a specific duration. The CDF is represented by
F(t) and the CDF for the Weibull distribution is


$$F(t) = 1 - e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

therefore,


R(t) = 1- F(t)

In effect, the vertical axis of the above plot is reversed: for the CDF it rises
from 0 to 100%, whereas for the reliability function it runs from 100% down to
0%.

Consider the above CDF plot again. If the underlying data is represented by only
one value, say MTBF, we are in effect representing the data with the ill-fitted red
line. Only at one point in time does the distribution actually represent the data:
the point in time where the two lines cross. Thus, if we need to make a decision prior
to that point based on the expected reliability of the system, we would use the
exponential distribution. For example, at time 100 hours we find the MTBF-based
reliability to be

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

$$R(100) = e^{-\left(\frac{100}{49{,}093}\right)} = 0.9980$$



We get a number and can make a decision if the system meets our reliability
requirements. Whereas, using the fitted reliability distribution, we have a
description of the data using two parameters. Calculating the reliability at the
same point of time using the Weibull distribution we find

$$R(t) = e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

$$R(100) = e^{-\left(\frac{100}{31{,}344}\right)^{0.5823}} = 0.965$$

The difference in estimates may or may not make a difference in the decision, yet
we often attempt to use the best available data when making important decisions.
The estimate provided by the exponential distribution is potentially misleading
and in the above example overstates the system's reliability. This error varies
and gets worse when examining a shorter period of time.
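
The comparison at 100 hours can be reproduced with the fitted parameters quoted for this data set. This is a sketch only; the constant and variable names are illustrative.

```python
import math

THETA = 49_093.0               # fitted exponential mean, hours
BETA, ETA = 0.5823, 31_344.0   # fitted Weibull shape and characteristic life

t = 100.0
r_exp = math.exp(-t / THETA)
r_weib = math.exp(-((t / ETA) ** BETA))

print(f"exponential: {r_exp:.4f}")
print(f"Weibull:     {r_weib:.4f}")
print(f"expected failures per 1,000 units: {1000 * (1 - r_exp):.0f} vs {1000 * (1 - r_weib):.0f}")
```

Expressing the gap as expected failures per 1,000 units makes the consequence for spares planning easier to see than the two reliability values alone.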

This error may lead to accepting a system that actually does not meet
the requirements. Or, it may cause the understocking of needed spare parts for
failures that are likely to occur, leading to reduced mission readiness.



The following CDF plot shows a different situation. Here the data tends to
increase in failure rate over time and has a slope greater than one. Again the
exponential (MTBF) estimate does not reflect the actual data very well, except at
one point.
Again, the title and vertical axis have been removed from this plot of actual
data. The theta for the exponential distribution is 20,860 hours, and the fitted
parameters for the Weibull distribution are a beta of 1.897 and an eta of 23,507
hours.

Performing the reliability calculations for the two distributions at 100 hours results
in the following two results

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

$$R(100) = e^{-\left(\frac{100}{20{,}860}\right)} = 0.9952$$

is for the exponential distribution, and for the Weibull distribution

$$R(t) = e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

$$R(100) = e^{-\left(\frac{100}{23{,}507}\right)^{1.897}} = 0.999968$$


And while this difference may or may not change the decision based on the
system reliability, using the exponential distribution may lead to costly mistakes.
In this case, the system reliability estimate may be mistakenly represented as
being too low. This may lead to a cancellation of the program, or the overstocking
of spare parts.
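
The same check for this second data set, again as a sketch with illustrative names, shows the exponential estimate now falling below the Weibull estimate at 100 hours.

```python
import math

THETA = 20_860.0              # fitted exponential mean, hours
BETA, ETA = 1.897, 23_507.0   # fitted Weibull shape and characteristic life

t = 100.0
r_exp = math.exp(-t / THETA)
r_weib = math.exp(-((t / ETA) ** BETA))

print(f"exponential: {r_exp:.4f}")
print(f"Weibull:     {r_weib:.6f}")
```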

Of course, in both examples, the difference between the two fitted curves depends
on which time point is selected. And if the duration of
interest is beyond the intersection of the two fitted lines, then the mistakes lead to
different results.
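
The intersection of the two fitted curves can be located directly. Setting t/theta equal to (t/eta)^beta and solving for t gives the closed form used in this sketch. The derivation and the function name are additions for illustration and do not come from the original analysis.

```python
def crossover_time(theta, beta, eta):
    """Time t > 0 at which exp(-t/theta) equals exp(-(t/eta)**beta).

    Solving t/theta = (t/eta)**beta gives t = (eta**beta / theta)**(1 / (beta - 1)).
    """
    if beta == 1.0:
        raise ValueError("with beta = 1 the curves coincide or never cross")
    return (eta**beta / theta) ** (1.0 / (beta - 1.0))


# Fitted parameters quoted for the two data sets in the text.
print(f"first data set:  {crossover_time(49_093.0, 0.5823, 31_344.0):,.0f} hours")
print(f"second data set: {crossover_time(20_860.0, 1.897, 23_507.0):,.0f} hours")
```

Before the crossover the exponential fit errs in one direction and after it in the other, so the sign of the mistake depends on the duration of interest.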




Another area of misleading use of MTBF is the lack of reliability apportionment.
The confusion comes from the notion of the weakest link limiting the reliability of
a system, as in the excerpt from the poem by Oliver Wendell Holmes, “The
Deacon's Masterpiece, or, the Wonderful One-Hoss Shay: A Logical
Story” [3], where the chaise was built with every part as sturdy and strong as
all the other parts. Then,

      --What do you think the parson found,
      When he got up and stared around?
      The poor old chaise in a heap or mound,
      As if it had been to the mill and ground!
      You see, of course, if you're not a dunce,
      How it went to pieces all at once,--
      All at once, and nothing first,--
      Just as bubbles do when they burst.


In practice, products do not fail all at once and completely. In more complex
systems, while many possible components may be the first to fail, it may be
unclear exactly which component will fail first. The replacement of that
component generally does not improve the probability of failure of the other
components, thus a different component may cause the next failure.

Back to the weakest link idea. In a series system, reliability-wise, if any one
element of a system fails, then the system fails. Given technical and design
limitations, there is one element that is inherently weaker than the rest of the
system. Suppose we know the compressor is the weakest link in a product
and it has an MTBF of 5,000 hours. Well, then no other component needs to be
any better than 5,000 hours MTBF. Right? And one might say that for a system
that has no field replaceable units, the unit has to be totally replaced upon the
first failure anyway. Basically, the thought is that since the compressor limits the
life of the product (the weakest link), no other component needs to be better than
5,000 hours MTBF.

Given a system goal of 5,000 hours MTBF and using the logic from above and
from the One-Hoss Shay, we create a complex product with each subsystem
designed and tested to the same goal, 5,000 hours MTBF. Let's assume the product
has a display, circuit board, and power supply, in addition to the compressor
mentioned above.

For the sake of argument, let's assume each of the four subsystems does actually
have an exponential distribution for expected time to failure. This means that
each subsystem has a 1/5,000 chance of failure every hour of operation and it
stays constant over time. Inverting the MTBF to find the failure rate per hour, we
find 1/5,000 = 0.0002 failures per hour. And let's say that over a two-year period
the systems are expected to operate 2,500 hours.

“No problem, everything meets at least 5,000 hours MTBF”, one might say. Let's
do the math.

$$R_{sys}(t) = R_1(t) \times R_2(t) \times R_3(t) \times R_4(t)$$

$$R_{sys}(2{,}500) = e^{-\left(\frac{2{,}500}{5{,}000}\right)} \times e^{-\left(\frac{2{,}500}{5{,}000}\right)} \times e^{-\left(\frac{2{,}500}{5{,}000}\right)} \times e^{-\left(\frac{2{,}500}{5{,}000}\right)} = 0.135$$

The more subsystems and components designed and selected to just meet the
5,000-hour MTBF, the worse the actual result. The system reliability of 13.5%
over 2,500 hours assumes that each subsystem achieves only 5,000 hours MTBF. In
practice each will achieve some other number, yet the point is that, in design and
practice, if each subsystem achieves only the system goal, the resulting system
reliability will be surprisingly low.
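
For exponential subsystems in series the failure rates add, so the system-level MTBF can be computed directly. The sketch below assumes independent subsystems and uses an illustrative function name. The system MTBF it returns is a consequence of the exponential assumption rather than a figure quoted in the text.

```python
import math


def series_exponential(mtbfs, t):
    """Return (system reliability at t, system MTBF) for exponential units in series."""
    system_rate = sum(1.0 / m for m in mtbfs)  # failure rates add in series
    return math.exp(-system_rate * t), 1.0 / system_rate


reliability, system_mtbf = series_exponential([5_000.0] * 4, 2_500.0)
print(f"R_sys(2,500) = {reliability:.3f}")
print(f"system MTBF  = {system_mtbf:,.0f} hours")
```

Four subsystems that each just meet 5,000 hours give a system MTBF of only a quarter of the goal.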

Another assumption in the above example is the use of exponential distributions
to describe each subsystem. This is often not true, and using a Weibull or
lognormal distribution may be more appropriate. For example, the compressor most
likely has a wear-out type of failure mechanism. Suppose we are able to find a set of
data that, with analysis, provides a good fit to a Weibull distribution. The Weibull
parameters for the compressor are a beta of 2 and an eta of 5,642 hours (note: this would be
estimated as a theta of 5,000 hours for a fitted exponential distribution).
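
The note about theta can be verified from the mean of a Weibull distribution, which is eta times Gamma(1 + 1/beta). This is a minimal sketch and the function name is illustrative.

```python
import math


def weibull_mean(beta, eta):
    """Mean of a two-parameter Weibull distribution: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


print(f"{weibull_mean(2.0, 5_642.0):,.0f} hours")
```

Two quite different life distributions can therefore share the same mean, which is all an MTBF value reports.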

Using the new information with the same example as above, we have

$$R_1(2{,}500) = e^{-\left(\frac{2{,}500}{5{,}642}\right)^{2}} = 0.82$$

$$R_{sys}(t) = R_1(t) \times R_2(t) \times R_3(t) \times R_4(t)$$

$$R_{sys}(2{,}500) = (0.82)^{4} = 0.45$$



The result is better because, in the early portion of the life distribution, the failure rate is
relatively low. It is only later, after about 5,000 hours, that the failure rate climbs
above that of the estimated exponential distribution. At 2,500 hours, the exponential
assumption is understating the reliability.
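
A sketch of the calculation above, following the text in applying the same Weibull parameters to all four subsystems, with the exponential result alongside for comparison (names are illustrative):

```python
import math

BETA, ETA = 2.0, 5_642.0   # Weibull parameters quoted for the compressor
THETA = 5_000.0            # the exponential fit to the same mean
T = 2_500.0                # hours of operation over two years

r_weib = math.exp(-((T / ETA) ** BETA))
r_exp = math.exp(-T / THETA)

print(f"one subsystem:   Weibull {r_weib:.2f}, exponential {r_exp:.2f}")
print(f"four in series:  Weibull {r_weib**4:.2f}, exponential {r_exp**4:.3f}")
```

Carrying the unrounded subsystem value through the product gives a system figure slightly above the one obtained from the rounded 0.82.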


Conclusion

We have the math tools and understanding to use the appropriate distributions to
describe the expected failures or reliability functions. Using MTBF for
convenience, by convention, or “because the customer expects that metric” tends
to lead to poor estimates and misunderstandings. Avoiding the use of the MTBF
simplifications can only improve the description of the underlying predictions, test
or field data results.

Using the best available data to make decisions implies that we use the best
available tools to represent the data. Doing so can save you and your
organization from costly errors within your program.
Endnotes

[1] Paul A. Tobias and David C. Trindade. 1998. Applied Reliability. 2nd ed.
       Chapman Hall/CRC Press, page 367.
[2] Patrick D. T. O'Connor, with David Newton and Richard Bromley. 2002.
       Practical Reliability Engineering. 4th ed. Chichester: Wiley, page 166.
[3] Oliver Wendell Holmes, “The Deacon's Masterpiece, or, the Wonderful
       One-Hoss Shay: A Logical Story”, Atlantic Monthly, September 1858.

 
13-RCM Reduces Maintenance.pdf
13-RCM Reduces Maintenance.pdf13-RCM Reduces Maintenance.pdf
13-RCM Reduces Maintenance.pdf
 
11-RCM is like a diet.pdf
11-RCM is like a diet.pdf11-RCM is like a diet.pdf
11-RCM is like a diet.pdf
 
09-Myth RCM only product is maintenance.pdf
09-Myth RCM only product is maintenance.pdf09-Myth RCM only product is maintenance.pdf
09-Myth RCM only product is maintenance.pdf
 
10-RCM has serious weaknesses industrial environment.pdf
10-RCM has serious weaknesses industrial environment.pdf10-RCM has serious weaknesses industrial environment.pdf
10-RCM has serious weaknesses industrial environment.pdf
 
08-Master the basics carousel.pdf
08-Master the basics carousel.pdf08-Master the basics carousel.pdf
08-Master the basics carousel.pdf
 
07-Manufacturer Recommended Maintenance.pdf
07-Manufacturer Recommended Maintenance.pdf07-Manufacturer Recommended Maintenance.pdf
07-Manufacturer Recommended Maintenance.pdf
 
06-Is a Criticality Analysis Required.pdf
06-Is a Criticality Analysis Required.pdf06-Is a Criticality Analysis Required.pdf
06-Is a Criticality Analysis Required.pdf
 
05-Failure Modes Right Detail.pdf
05-Failure Modes Right Detail.pdf05-Failure Modes Right Detail.pdf
05-Failure Modes Right Detail.pdf
 
03-3 Ways to Do RCM.pdf
03-3 Ways to Do RCM.pdf03-3 Ways to Do RCM.pdf
03-3 Ways to Do RCM.pdf
 
04-Equipment Experts Couldn't believe response.pdf
04-Equipment Experts Couldn't believe response.pdf04-Equipment Experts Couldn't believe response.pdf
04-Equipment Experts Couldn't believe response.pdf
 
02-5 RCM Myths Carousel.pdf
02-5 RCM Myths Carousel.pdf02-5 RCM Myths Carousel.pdf
02-5 RCM Myths Carousel.pdf
 
01-5 CBM Facts.pdf
01-5 CBM Facts.pdf01-5 CBM Facts.pdf
01-5 CBM Facts.pdf
 
Lean Manufacturing
Lean ManufacturingLean Manufacturing
Lean Manufacturing
 
Reliability Engineering Management course flyer
Reliability Engineering Management course flyerReliability Engineering Management course flyer
Reliability Engineering Management course flyer
 
How to Create an Accelerated Life Test
How to Create an Accelerated Life TestHow to Create an Accelerated Life Test
How to Create an Accelerated Life Test
 
Reliability Programs
Reliability ProgramsReliability Programs
Reliability Programs
 

Recently uploaded

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Common Mistakes with MTBF Explained

  • 1. Common Mistakes with MTBF

MTBF is widely used to describe the reliability of a component or system. It is also often misunderstood and used incorrectly. In some sense, the very name “mean time between failures” contributes to the misunderstanding. The objective of this paper is to explore the nature of the MTBF misunderstandings and their impact on decision-making and program costs.

Mean-Time-Between-Failure (MTBF), as defined by MIL-STD-721C, Definition of Terms for Reliability and Maintainability, 12 June 1981, is

      A basic measure of reliability for repairable items: The mean number of life units during which all parts of the item perform within their specified limits, during a particular measurement interval under stated conditions.

The related measure, Mean-Time-To-Failure (MTTF), is defined as

      A basic measure of reliability for non-repairable items: The total number of life units of an item divided by the total number of failures within that population, during a particular measurement interval under stated conditions.

These definitions are very similar. The subtle difference is important, yet the confusion is further complicated when attempting to quantify MTBF or MTTF. In both cases we often use the calculation described within the MTTF definition. This is what we would do for any group of values for which we wanted to estimate the mean (average): tally the number of hours all units have operated and divide by the number of failures. This provides an unbiased (statistically speaking) estimate of the population mean.

Keep in mind that time-to-failure data is often not normally distributed. The underlying distribution for life data starts at time zero and increases. The exponential family of distributions tends to describe life data well and is commonly used. The unbiased estimate for the mean value of an exponential distribution is as described for the MTTF definition above.
When working with data from a repairable system, one should use the Nonhomogeneous Poisson Process (NHPP), which is a generalization of the Poisson process. The failure intensity can follow various models, yet it is often assumed to be the exponential model. This results in the common estimate of MTBF

$$\mathrm{MTBF} = \frac{T(k)}{k}$$

where T(k) is the total operating time of one or more systems and k is the cumulative number of failures. [1]
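As a minimal sketch of the T(k)/k point estimate above, the following Python pools operating hours across systems and divides by the cumulative failure count. The function name `mtbf_point_estimate` and the fleet figures are hypothetical and serve only as an illustration; they do not come from the paper.

```python
def mtbf_point_estimate(operating_hours, failure_count):
    """Return T(k)/k: pooled operating hours divided by cumulative failures."""
    if failure_count <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return sum(operating_hours) / failure_count


# Hypothetical fleet: accumulated operating hours for three repairable systems.
fleet_hours = [4200.0, 3900.0, 3900.0]
fleet_failures = 4  # cumulative failures observed across all three systems

mtbf_hours = mtbf_point_estimate(fleet_hours, fleet_failures)
print(f"MTBF point estimate: {mtbf_hours:.0f} hours")
```

The estimate is a single number; by itself it says nothing about how the failures were spread over the observation period.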
  • 2. This simple calculation introduces the first source of confusion when considering MTBF, failure rates, or hazard rates. Since we intuitively use the simple calculation to estimate the mean value, many do not then apply that estimate with the reliability function of the appropriate distribution. For example, if a vendor states the product has an MTTF of 16,000 hours, and we want to know how many out of 100 units will fail in 8,000 hours, the appropriate calculation is

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

$$R(8{,}000) = e^{-\left(\frac{8{,}000}{16{,}000}\right)} = 0.61$$

such that we expect 61 out of the 100 units, or 61%, to operate for the full 8,000 hours, and about 39 to fail. This assumes an exponential distribution and non-repairable units. Given only an MTTF value, the most likely distribution to use without additional information is the exponential. Extending this same example to determine the reliability at 16,000 hours, we find that only about 1/3 of the units (37%) would be expected to still be operating.

If someone has this common misunderstanding of the failure rate value that MTBF represents, it can lead to significant loss of resources or mission readiness. For example, a radar detection OEM received a contract to design and manufacture a specific system with 5,000 hours MTBF. The specification included functionality, mission duration, and expected equipment duty cycle, along with minor variations to the airborne inhabited environment. The contract specified 5,000 hours MTBF as the sole reliability requirement. The design team designed, built, and tested the system and achieved better than 5,000 hours MTBF.

The Air Force found the unit to be the leading cause of aborted missions (equipment related) and complained to the OEM. A careful analysis of the field data proved the units actually achieved almost 6,000 hours MTBF, thus exceeding the specification. Of course, this didn't change the data on aborted missions.
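The vendor example above (16,000 hour MTTF evaluated at 8,000 and 16,000 hours) can be reproduced with a short Python sketch. It assumes, as the text does, an exponential distribution and non-repairable units; the function name `exponential_reliability` is chosen here for illustration.

```python
import math


def exponential_reliability(t_hours, theta_hours):
    """R(t) = exp(-t / theta) for a constant failure rate of 1 / theta."""
    return math.exp(-t_hours / theta_hours)


mttf = 16_000.0  # vendor-stated MTTF from the example

r_8000 = exponential_reliability(8_000.0, mttf)
r_16000 = exponential_reliability(16_000.0, mttf)

# Expected survivors and failures out of 100 units at 8,000 hours.
survivors_8000 = 100 * r_8000
failures_8000 = 100 - survivors_8000

print(f"R(8,000)  = {r_8000:.2f}")
print(f"R(16,000) = {r_16000:.2f}")
print(f"Expected failures out of 100 by 8,000 hours: {failures_8000:.0f}")
```

Note that at t equal to the MTTF the reliability is exp(-1), so well under half of the units are expected to survive to the stated mean.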
In part, the OEM's equipment just happened to be the least reliable equipment on the aircraft. A short discussion with the team found some misunderstanding and that “errors had been made”. The Air Force procurement team and the prime contractor personnel mistakenly thought the term “5,000 hours MTBF” meant at least 5,000
  • 3. failure free operating hours. In reality the term, in this case, meant that approximately two-thirds of the units are expected to have at least one failure over a period of 5,000 operating hours. And, in fact, the product performed about 20% better than the specification.

The problem was exacerbated by the mission requiring the use of three of the OEM's units during the mission. In reliability terms the equipment was in series, meaning that if any one of the three units failed, the crew had to abort the mission. Therefore, the probability of successfully completing 1,000 hours of operation where all three units have to work is

$$R_{sys}(t) = R_1(t) \times R_2(t) \times R_3(t)$$

$$R_{sys}(1{,}000) = e^{-\left(\frac{1{,}000}{5{,}000}\right)} \times e^{-\left(\frac{1{,}000}{5{,}000}\right)} \times e^{-\left(\frac{1{,}000}{5{,}000}\right)} = 0.55$$

Even though each of the individual units has about an 82% reliability (or probability of surviving 1,000 hours), the three in series have only a 55% reliability, or probability that all three will operate for 1,000 hours.

Even after acknowledging either a specification error or a misunderstanding of the metric, the team still had the issue of aborted missions. Simply changing the reliability requirements would not change the design of the equipment without a significant re-design. Further discussion found that installing a warm standby unit permitted the rapid replacement of a failed unit during the mission, thus significantly reducing mission aborts. The reliability of a 3-out-of-4 system is

$$R_{sys}(t) = 1 - \sum_{i=0}^{m-1} \binom{n}{i} R(t)^{i} \left(1 - R(t)\right)^{n-i}$$

where m is the number of units, out of n total, that have to be operating for the overall system to be operating.[2] In the example above, m=3 and n=4, and a single unit has a reliability of about 82%. For three in series the system reliability drops to about 55%. The calculation for the 3-out-of-4 parallel system results in 85%.
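The series and 3-out-of-4 figures above can be checked with the following Python sketch. It assumes identical, independent units with exponential lifetimes, as in the example; the function and parameter names (`series_reliability`, `k_out_of_n_reliability`, `required`, `total`) are illustrative choices, not notation from the cited reference.

```python
import math


def series_reliability(unit_reliabilities):
    """All units must work: the system reliability is the product."""
    return math.prod(unit_reliabilities)


def k_out_of_n_reliability(required, total, r_unit):
    """Probability that at least `required` of `total` identical,
    independent units are working (binomial tail sum)."""
    return sum(
        math.comb(total, i) * r_unit**i * (1 - r_unit) ** (total - i)
        for i in range(required, total + 1)
    )


# Single unit: 5,000 hours MTBF evaluated at 1,000 hours.
r_unit = math.exp(-1_000 / 5_000)

r_series = series_reliability([r_unit] * 3)
r_3_of_4 = k_out_of_n_reliability(3, 4, r_unit)

print(f"single unit: {r_unit:.2f}")
print(f"3 in series: {r_series:.2f}")
print(f"3-out-of-4:  {r_3_of_4:.2f}")
```

Summing the binomial terms from `required` up to `total` is equivalent to one minus the sum of the terms below `required`, which is the form the formula in the text uses.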
Suffice it to say the reliability is significantly improved. Note that using reliability in the above function does not require the use of MTBF. The reliability term can come from any distribution.

Calculating or using only the MTBF value to represent a product's reliability can lead to more than misunderstanding. If the product performs better or worse than expected, you may have unnecessary spares expenses or not enough spares to continue effectively. Another issue that may arise is the unexpected increase in
  • 4. failure rate after a few years of a very low failure rate. Using the single parameter, MTBF, does not provide information about the changing nature of failure rates over time.

The following graph is a plot of the percentage of the population that has failed over time, or cumulative distribution function (CDF) plot. The red line is the plot of the fitted exponential distribution. The data and their fitted line represent a failure rate trend that is declining over time. Over time the total number of failures continues to rise, yet the slope is less than the slope for the exponential distribution.

This is actual data; the time scale and title have been removed to protect the source. The theta of the exponential distribution is 49,093 hours, whereas the Weibull distribution has a beta of 0.5823 and an eta of 31,344 hours. On this plot, the exponential distribution has a slope of 1. The fitted Weibull distribution slope is less than one.

Keep in mind that the exponential and Weibull distributions are members of the exponential family of distributions. The formula for the reliability function of the 2-parameter Weibull distribution is

$$R(t) = e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$
  • 5. where beta is the slope and eta is the characteristic life. Setting beta to 1 reduces the formula to the reliability function for the exponential distribution,

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

where theta is the characteristic life and is also the inverse of the failure rate; theta is commonly called MTTF or MTBF.

The plot of the CDF is related to the reliability function. Reliability is the percentage of units surviving over a specific duration, and the CDF plots the percentage of units failed over a specific duration. The CDF is represented by F(t), and the CDF for the Weibull distribution is

$$F(t) = 1 - e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

therefore,

$$R(t) = 1 - F(t)$$

Essentially the vertical axis on the above plot is reversed: for the CDF it runs from 0 to 100%, while for the reliability function it runs from 100% down to 0%.

Consider the above CDF plot again. If the underlying data is represented by only one value, say MTBF, we are in effect representing the data with the ill-fitted red line. The exponential distribution actually represents the data at only one point in time, the point where the two lines cross. Thus, if we need to make a decision about a time before that point based only on MTBF, we are in effect using the exponential distribution. For example, at 100 hours we find the MTBF-based reliability to be

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

$$R(100) = e^{-\left(\frac{100}{49{,}093}\right)} = 0.9980$$

We get a number and can make a decision on whether the system meets our reliability requirements. Whereas, using the fitted reliability distribution, we have a description of the data using two parameters. Calculating the reliability at the same point in time using the Weibull distribution we find
  • 6.

$$R(t) = e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

$$R(100) = e^{-\left(\frac{100}{31{,}344}\right)^{0.5823}} = 0.965$$

The difference in estimates may or may not make a difference in the decision, yet we often attempt to use the best available data when making important decisions. The estimate provided by the exponential distribution is potentially misleading and, in the above example, overstates the system's reliability. This error varies and gets worse when examining a shorter period of time. It may cause the mistake of accepting a system that actually does not meet the requirements. Or, it may cause the understocking of needed spare parts for failures that are likely to occur, leading to reduced mission readiness.

The following CDF plot shows a different situation. Here the data tends to increase in failure rate over time and has a slope greater than one. Again the exponential (MTBF) estimate does not reflect the actual data very well, except at one point.
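To make the comparison at 100 hours easy to repeat, here is a small Python sketch that evaluates both fitted models from the first data set (theta of 49,093 hours; beta of 0.5823 and eta of 31,344 hours). The function names are illustrative; the exponential case is written as a Weibull with beta fixed at 1, as the text describes.

```python
import math


def weibull_reliability(t_hours, beta, eta_hours):
    """Two-parameter Weibull reliability: R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t_hours / eta_hours) ** beta))


def exponential_reliability(t_hours, theta_hours):
    """Exponential reliability, i.e. a Weibull with beta fixed at 1."""
    return weibull_reliability(t_hours, 1.0, theta_hours)


t = 100.0  # hours

r_exponential = exponential_reliability(t, 49_093.0)
r_weibull = weibull_reliability(t, 0.5823, 31_344.0)

print(f"exponential R({t:.0f}) = {r_exponential:.4f}")
print(f"Weibull     R({t:.0f}) = {r_weibull:.4f}")
print(f"difference          = {r_exponential - r_weibull:.4f}")
```

Evaluating the same two functions over a range of times shows where the fitted curves cross and how the sign of the error changes on either side of that point.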
  • 7. Again, the title and vertical axis have been removed from this plot of actual data. The theta for the exponential distribution is 20,860 hours. The fitted parameters for the Weibull distribution are a beta of 1.897 and an eta of 23,507 hours. Performing the reliability calculations for the two distributions at 100 hours gives

$$R(t) = e^{-\left(\frac{t}{\theta}\right)}$$

$$R(100) = e^{-\left(\frac{100}{20{,}860}\right)} = 0.9952$$

for the exponential distribution, and

$$R(t) = e^{-\left(\frac{t}{\eta}\right)^{\beta}}$$

$$R(100) = e^{-\left(\frac{100}{23{,}507}\right)^{1.897}} = 0.999968$$

for the Weibull distribution.

While this difference may or may not change the decision based on the system reliability, using the exponential distribution may lead to costly mistakes. In this case, the system reliability estimate may be mistakenly represented as being too low. This may lead to a cancelation of the program, or the overstocking of spare parts.

Of course, in both examples, the difference between the two fitted curves depends on which time point is selected. If the duration of interest is beyond the intersection of the two fitted lines, then the mistakes lead to different results.

Another area of misleading use of MTBF is the lack of reliability apportionment. The confusion comes from the notion of the weakest link limiting the reliability of a system, as in the excerpt from the poem by Oliver Wendell Holmes, “The Deacon's Masterpiece, or, the Wonderful One-Hoss Shay: A Logical Story,” [3] where the chaise was built with every part as sturdy and strong as all the other parts. Then,

 --What do you think the parson found,
 When he got up and stared around?
 The poor old chaise in a heap or mound,

  • 8. As if it had been to the mill and ground!
 You see, of course, if you're not a dunce,
 How it went to pieces all at once, --
 All at once, and nothing first, --
 Just as bubbles do when they burst.

In practice, products do not fail all at once and completely. In more complex systems, many possible components may be the first to fail, and it may be unclear exactly which component will fail first. The replacement of that component generally does not reduce the probability of failure of the other components, thus a different component may cause the next failure.

Back to the weakest link idea. In a series system, in reliability terms, if any one element of a system fails, then the system fails. Given technical and design limitations, there is one element that is inherently weaker than the rest of the system. Suppose we know the compressor is the weakest link in a product and it has an MTBF of 5,000 hours. Well, then no other component needs to be any better than 5,000 hours MTBF. Right? And one might say that for a system that has no field-replaceable units, the unit has to be totally replaced upon the first failure anyway. Basically, the thought is that since the compressor limits the life of the product (the weakest link), no other component needs to be better than 5,000 hours MTBF.

Given a system goal of 5,000 hours MTBF, and using the logic from above and from the One-Hoss Shay, we create a complex product with each subsystem designed and tested to the same goal, 5,000 hours MTBF. Let's assume the product has a display, circuit board, and power supply, in addition to the compressor mentioned above. For the sake of argument, let's assume each of the four subsystems actually has an exponential distribution for expected time to failure. This means that each subsystem has a 1/5,000 chance of failure every hour of operation and that this chance stays constant over time. Inverting the MTBF to find the failure rate per hour, we find 1/5,000 = 0.0002 failures per hour. And let's say that over a two-year period the systems are expected to operate 2,500 hours. “No problem, everything meets at least 5,000 hours MTBF”, one might say.
Let's do the math.

$$R_{sys}(t) = R_1(t) \times R_2(t) \times R_3(t) \times R_4(t)$$

$$R_{sys}(2{,}500) = e^{-\left(\frac{2{,}500}{5{,}000}\right)} \times e^{-\left(\frac{2{,}500}{5{,}000}\right)} \times e^{-\left(\frac{2{,}500}{5{,}000}\right)} \times e^{-\left(\frac{2{,}500}{5{,}000}\right)} = 0.135$$
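The same arithmetic can be sketched in Python to show how the system figure falls as subsystems are added. The sketch assumes every subsystem is exponential with exactly 5,000 hours MTBF, as in the example; the function name is hypothetical. The equivalent system MTBF follows from adding the constant failure rates of the series elements.

```python
import math


def series_reliability_equal_units(n_units, t_hours, mtbf_hours):
    """n identical exponential units in series:
    exp(-t / MTBF) ** n, which equals exp(-n * t / MTBF)."""
    return math.exp(-n_units * t_hours / mtbf_hours)


mission_hours = 2_500.0
subsystem_mtbf = 5_000.0

for n in range(1, 5):
    r_sys = series_reliability_equal_units(n, mission_hours, subsystem_mtbf)
    print(f"{n} subsystem(s) in series: R = {r_sys:.3f}")

# Constant failure rates add in series, so four subsystems at
# 1/5,000 failures per hour behave like one unit at 4/5,000 per hour.
system_mtbf = subsystem_mtbf / 4
print(f"equivalent system MTBF: {system_mtbf:.0f} hours")
```

Under these assumptions, a system built from four subsystems that each just meet the 5,000 hour goal has an equivalent MTBF of only a quarter of that goal.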
  • 9. The more subsystems and components designed and selected to just meet the 5,000 hours MTBF, the worse the actual result. The system reliability of 13.5% over 2,500 hours assumes that each subsystem achieves only 5,000 hours MTBF. In practice each will achieve some other number, yet the point is that, in design and practice, if each subsystem just achieves the system goal, the result will be surprisingly low.

Another assumption in the above example is the use of exponential distributions to describe each subsystem. This is often not true, and using a Weibull or lognormal distribution may be appropriate. For example, the compressor most likely has a wearout type of failure mechanism, and we are able to find a set of data that with analysis provides a good fit to a Weibull distribution. The Weibull parameters for the compressor are a beta of 2 and an eta of 5,642 hours (note: this would be estimated as a theta of 5,000 hours for a fitted exponential distribution). Using the new information with the same example as above, and treating each of the four subsystems the same way, we have

$$R_1(2{,}500) = e^{-\left(\frac{2{,}500}{5{,}642}\right)^{2}} = 0.82$$

$$R_{sys}(t) = R_1(t) \times R_2(t) \times R_3(t) \times R_4(t)$$

$$R_{sys}(2{,}500) = (0.82)^{4} = 0.45$$

The result is better because, in the early portion of the life distribution, the failure rate is relatively low. It is only later, after about 3,200 hours (where the Weibull failure rate $2t/\eta^{2}$ reaches 1/5,000 per hour), that the failure rate climbs above that of the estimated exponential distribution. At 2,500 hours, the exponential assumption understates the reliability.

Conclusion

We have the math tools and understanding to use the appropriate distributions to describe the expected failures or reliability functions. Using MTBF for convenience, for convention, or “because the customer expects that metric” tends to lead to poor estimates and misunderstandings. Avoiding the use of the MTBF simplifications can only improve the description of the underlying predictions, test results, or field data. Using the best available data to make decisions implies that we use the best available tools to represent the data.
Doing so can save you and your organization from costly errors within your program.
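As a supplementary check on the compressor example (Weibull with a beta of 2 and an eta of 5,642 hours), the following Python sketch confirms that the Weibull mean is about 5,000 hours and evaluates the single-unit reliability at 2,500 hours under both models. The mean uses the standard Weibull result, eta times Gamma(1 + 1/beta); the variable names are illustrative.

```python
import math

beta = 2.0     # Weibull slope for the compressor
eta = 5_642.0  # Weibull characteristic life, hours
t = 2_500.0    # operating hours of interest

# Mean life of a two-parameter Weibull: eta * Gamma(1 + 1/beta).
weibull_mean = eta * math.gamma(1 + 1 / beta)

# Single-unit reliability at t under each model.
r_weibull = math.exp(-((t / eta) ** beta))
r_exponential = math.exp(-t / 5_000.0)

print(f"Weibull mean life:      {weibull_mean:.0f} hours")
print(f"Weibull R(2,500):       {r_weibull:.2f}")
print(f"exponential R(2,500):   {r_exponential:.2f}")
```

Both models share essentially the same mean life, yet they give noticeably different reliability values at 2,500 hours, which is the paper's point about relying on a single MTBF number.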
  • 10. Endnotes

[1] Paul A. Tobias and David C. Trindade. 1998. Applied Reliability. 2nd ed. Chapman & Hall/CRC Press, page 367.

[2] Patrick D. T. O'Connor, with David Newton and Richard Bromley. 2002. Practical Reliability Engineering. 4th ed. Chichester: Wiley, page 166.

[3] Oliver Wendell Holmes, “The Deacon's Masterpiece, or, the Wonderful One-Hoss Shay: A Logical Story,” Atlantic Monthly, September 1858.