The Seven Deadly Sins in Measuring Asset Reliability


Most companies don’t measure mean time between failures (MTBF), even though it’s the most basic measurement that quantifies reliability. MTBF is the average time an asset functions before it fails. So, why don’t they measure MTBF? Let’s define reliability first before we go any further.

Reliability: The ability of an item to perform a required function under stated conditions for a stated period of time

So why don’t we measure mean time between failures? This article discusses the issue.


The Seven Deadly Sins in Measuring Asset Reliability

By Ricky Smith, CMRP

“The problem with management is they’re measuring the wrong things.” - Peter Drucker

Sin #1: Equipment reliability is not measured.

Most companies don’t measure mean time between failures (MTBF), even though it’s the most basic measurement that quantifies reliability. MTBF is the average time an asset functions before it fails. So, why don’t they measure MTBF? Let’s define reliability first before we go any further.

Reliability: The ability of an item to perform a required function under stated conditions for a stated period of time.

Sin #2: “We don’t really have failures.”

“We have a Condition Monitoring / Predictive Maintenance program that detects failures before a component or asset catastrophically fails.” In this case the definition of the term “failure” must be re-examined. More often than not, when a failure mode is detected by condition monitoring technologies, it requires some form of intrusive maintenance to rectify the problem. Even if your CM/PdM program gives you enough lead time to prevent catastrophic failure, it is a failure nevertheless, because the asset has passed point “P” on the P-F curve. As with all intrusive maintenance, the repair also increases the risk of maintenance-induced failure, or “infant mortality.” Treat all EM/PdM work orders as failures.

Figure: The P-F Curve
Example: If condition monitoring detects a failure mode on an asset every 6 months, even with proper planning and scheduling, the asset becomes unavailable twice a year. By ignoring work orders raised by condition monitoring, you are merely treating the symptoms rather than going after the root cause. What if we apply root cause analysis and reduce the failure rate from twice a year to once every two years? Just a thought!

Insanity: “doing the same thing over and over again expecting a different result”

Sin #2: “We can’t measure MTBF in the same way for all of our assets because not all of our plant is a continuous operation.”

Some assets run only on the day shift (8 hours) while others run 24 hours a day. There should be no misunderstanding about how MTBF is measured. The same calculation is used. For example, if machine “x” runs for 8 hours a day and fails 3 times a year, and machine “y” runs for 24 hours a day and fails 9 times a year, the MTBF is the same for both assets. It is simply the number of hours in operation divided by the number of failures. Assuming both machines run every day of the year, that is 2,920 hours / 3 failures and 8,760 hours / 9 failures, or about 973 hours in each case. It’s that simple. Most of the time, identifying the exact measurement is not as important as knowing you have a problem. Just a thought!

Sin #3: “Work orders don’t capture all emergency work.”

Many companies have rules such as, “A work order will be written only if the equipment is down for more than one hour.” This rule doesn’t make sense. Let’s say, for example, a circuit overload on a piece of equipment trips 100 times in a month. If each trip is reset in under an hour, none of them is ever recorded. Many times, small problems lead to major asset failure. Don’t wait until a small problem becomes a big one. Start tracking MTBF and you’ll be on the road to reliability. Eventually, you’ll learn to manage your assets proactively according to their health. Then, you’ll see your MTBF improve dramatically.

Sin #4: “Not every asset is loaded into the CMMS/EAM.”

This is a problem that makes writing an emergency work order impossible. If you’re not tracking every asset down to the component level, you can’t possibly identify any true reliability issue. Think about it this way: if 20% of your assets eat up 80% of your resources, wouldn’t you want to identify that 20%, the bad actors? Put all of your assets in your CMMS/EAM, track the MTBF, and the bad actors will become obvious. Validating your equipment hierarchy is the first step.

Recommendation: Read ISO 14224.
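The MTBF arithmetic described above (hours in operation divided by number of failures) and the idea of ranking assets to expose the bad actors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` record, the `mtbf_hours` and `worst_actors` helpers, the 365-day measurement window, and the asset “z” are all hypothetical and do not come from any particular CMMS/EAM.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    hours_per_day: float  # scheduled operating hours per day
    days_operated: int    # days in the measurement window (assumed: 365)
    failures: int         # work orders counted as failures in the window


def mtbf_hours(asset: Asset) -> float:
    """MTBF = hours in operation / number of failures."""
    operating_hours = asset.hours_per_day * asset.days_operated
    if asset.failures == 0:
        # No failures recorded in the window, so MTBF is unbounded here.
        return float("inf")
    return operating_hours / asset.failures


def worst_actors(assets, top_n=3):
    """Return up to top_n assets, lowest MTBF first."""
    return sorted(assets, key=mtbf_hours)[:top_n]


# The two machines from the text, assuming both run every day of the year.
machine_x = Asset("x", hours_per_day=8, days_operated=365, failures=3)
machine_y = Asset("y", hours_per_day=24, days_operated=365, failures=9)
# A made-up third asset, included only to show the ranking.
asset_z = Asset("z", hours_per_day=24, days_operated=365, failures=40)

for asset in (machine_x, machine_y, asset_z):
    print(asset.name, round(mtbf_hours(asset), 1))

print([a.name for a in worst_actors([machine_x, machine_y, asset_z], top_n=1)])
```

Machines “x” and “y” come out with the same MTBF even though one runs a single shift and the other runs around the clock, which is the point of the example. In practice the failure counts would come from work order history rather than being typed in by hand.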
Sin #5: “It isn’t important to measure MTBF because other metrics provide equivalent value.”

Yes, you can get asset reliability from other metrics, but keep it simple by using MTBF. Count the number of breakdowns (the number of emergency work orders) for an asset during a given time interval (for example, by week). That’s all it takes to learn how long the equipment runs, on average, before it fails.

Sin #6: “The maintenance organization is in such a reactive mode that there’s no time to generate any metrics.”

They’re constantly scrambling merely to react to the latest crisis. But taking a small step in the right direction, tracking just one measure of reliability, will reveal the 20% of the assets that are burning 80% of the resources. If you start with the worst actor, you’ll be surprised at how quickly you can rise out of the reactivity quagmire.

For example, a plant manager who recently measured the MTBF for what he called his “Top 10 Critical Assets” was shocked at the results. He expected the combined MTBF for these assets to be around eight to nine hours. In the first month of this initiative, he found that the actual MTBF was 0.7 hours. You may find yourself in the same situation. You’ll never know the true reliability status on your plant floor until you begin measuring it. Remember: the data is the data, whether one likes it or not.

Sin #7: “There are too many other problems to worry about right now without being pressured to measure reliability, too.”

I’ve heard this many times, and what it tells me is that the organization is in total reactive mode. This organization deals only with the problem of the hour. If 20% of your assets are taking 80% of your resources, dig yourself out of the problem by attacking the assets that cause the most pain: the “high-payoff assets” that will respond to a reliability improvement initiative.

We’ve got to stop fighting fires. The characteristics of adept firefighters include:

  • High turnover of personnel (mostly in production).
  • Maintenance costs continue to rise.
  • Maintenance costs are capped before the month ends (“Don’t spend any more money this month. We’re over budget.”)
  • Every day is a new day of problems and chaos.
  • Maintenance is blamed for missing the production goals.

It isn’t easy to fight fires and initiate reliability improvement at the same time, but it can be done. Start measuring MTBF and attack the high-payoff assets.

Admit it: you cannot change a company’s culture from reactive to proactive overnight. However, you can eliminate reliability problems on one major system at a time.
That’s where you’ll find a rapid return on investment. Change people’s activities and behaviors slowly and you’ll transition to a proactive culture.

Measuring asset reliability is the key to keeping a company profitable, increasing its capacity, and reducing its maintenance cost. In a future column, we’ll present some reliability improvement ideas. Check out the results of measuring MTBF by one company. They measured only the MTBF of 900 electric motors for three years while applying a couple of known best practices.

If you want to measure MTBF effectively, based on my experience, begin measuring MTBF at the Section or System level (see the Equipment Taxonomy from ISO 14224). Once you have identified which Section or System has the lowest reliability, you then begin measuring the Components or Maintainable Items in that specific Section or System.

If you would like a copy of the MTBF Users Guide or would like more information, please contact Ricky through my website at: