Reliability-centered Maintenance is a maintenance philosophy that includes a systematic approach to determining how to maintain equipment safely and economically. RCM is an invaluable business solution for companies.
In situations where equipment failure is inevitable, the structured RCM process will ensure a maintenance strategy that will minimize or eliminate the consequences.
The central problem addressed by the RCM process is how to determine which scheduled maintenance tasks, if any, should be assigned to equipment, and how frequently.
1. Minimizing the Integrity Failures of Aging Plants and Equipment Through Reliability Centered Maintenance
Bhavesh Shukla
2. Introduction to Minimizing Integrity Failure through RCM
© 2015 Bhavesh Shukla
All rights reserved. No part of this publication may be reproduced or distributed by any means without the prior written permission of the publisher.
3. Agenda
• Failure Case study
• Types of Maintenance
• Risk Based Maintenance Strategy
• Process Steps
• Exercise
5. Agitator Failure
Cost Analysis:
i. Additional cost to business due to failure = $1.3 million
ii. Cost to prevent failure = $10,000
Causal Factor:
i. Aging equipment
ii. Lack of predictive maintenance
iii. The torque limiting setting was set too low during gearbox replacement, which caused premature slipping of the clutch top limiter; the slipping generated heat and caused the high-speed coupling to fail
Preventive Action:
i. Temperature monitoring of Gearboxes and motor
ii. Vibration check and analysis program
iii. Lubrication test and analysis program
7. Repair
• Repair (run to failure) is suitable for small items which are non-critical, inconsequential, unlikely to fail and redundant.
• For critical equipment and parts it is not an option.
• When the stakes are high, repair is not an option.
10. Preventive Maintenance
• Preventive maintenance is time-based maintenance.
• We change parts or service equipment before they fail, and sometimes before they need it.
14. Predictive Maintenance
• Is based on predictive testing, monitoring and inspection.
• It is suitable for critical equipment which:
i. has random failures
ii. is not subject to wear
iii. is prone to PM-induced failure
19. Reliability Centered Maintenance
• Reliability Centered Maintenance (RCM) is a maintenance strategy that is implemented to optimize the maintenance program of a company or facility using cost-effective maintenance techniques.
• There are four principles that are critical for an RCM program:
i. The primary objective is to preserve system function
ii. Identify failure modes that can affect the system function
iii. Prioritize the failure modes
iv. Select applicable and effective tasks to control the failure modes
20. RCM Process Steps
• System Risk Number: Safety Issues, Regulatory Compliance, Product Quality Issues, Process Throughput, Operational Cost
• Operational Risk Number: In series and will stop the system (24 hr lost), Important (8 hr lost), Can be bypassed, Little effect to system, No effect to system
• Asset Risk Number
• Failure Probability Factor: Reactive maintenance performed daily, weekly, monthly, quarterly, annually
• Maintenance Priority Number
21. System Risk Ranking Process
Area: Reactor. Each factor is scored 1-10.
SF = Safety Failures, RC = Regulatory Compliance, QF = Quality Failures, PT = Process Throughput, OC = Operational Cost, SRN = System Risk Number

| Process system name | SF | RC | QF | PT | OC | SRN |
|---|---|---|---|---|---|---|
| Agitator TLS plate | 8 | 2 | 10 | 10 | 10 | 8 |
| Cooling control valve | 6 | 8 | 9 | 8 | 5 | 7.35 |
| Heating control valve | 5 | 6 | 6 | 8 | 8 | 6.71 |
| Cooling insulation | 3 | 2 | 3 | 4 | 3 | 3.07 |
22. Maintenance Priority Ranking Process
ORN = Operational Risk Number, FPF = Failure Probability Factor, SRN = System Risk Number, ARN = Asset Risk Number, MPN = Maintenance Priority Number

| Risk Number | ORN | FPF |
|---|---|---|
| 10 | 24 hours lost | Daily |
| 8 | 8 hours lost | Weekly |
| 6 | 2 hours lost | Monthly |
| 4 | <1 hour lost | Quarterly |
| 2 | No loss | Annually |

| Process system name (Reactor) | SRN (1-10) | ORN (1-10) | ARN (1-100) | FPF (1-10) | MPN (1-1000) |
|---|---|---|---|---|---|
| Agitator TLS plate | 8 | 10 | 80 | 3 | 240 |
| Cooling control valve | 7.35 | 4 | 29.40 | 2 | 59 |
| Heating control valve | 6.71 | 3 | 20.13 | 5 | 101 |
| Cooling insulation | 3.07 | 2 | 6.14 | 1 | 6 |
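The ARN and MPN columns of the ranking table can be reproduced from the SRN, ORN and FPF columns. The sketch below is illustrative only: the function and variable names are made up for this example, and the rounding (ARN to two decimals, MPN to a whole number) is an assumption chosen to match the figures shown on the slide.

```python
# (component, SRN, ORN, FPF) exactly as listed on the ranking slide.
RANKING = [
    ("Agitator TLS plate", 8.0, 10, 3),
    ("Cooling control valve", 7.35, 4, 2),
    ("Heating control valve", 6.71, 3, 5),
    ("Cooling insulation", 3.07, 2, 1),
]


def asset_risk_number(srn, orn):
    """ARN = SRN x ORN."""
    return srn * orn


def maintenance_priority_number(srn, orn, fpf):
    """MPN = ARN x FPF."""
    return asset_risk_number(srn, orn) * fpf


def ranked(rows):
    """Return (component, ARN, MPN) tuples, highest priority first."""
    scored = [
        (
            name,
            round(asset_risk_number(srn, orn), 2),   # assumed rounding
            round(maintenance_priority_number(srn, orn, fpf)),  # assumed rounding
        )
        for name, srn, orn, fpf in rows
    ]
    return sorted(scored, key=lambda row: row[2], reverse=True)


for name, arn, mpn in ranked(RANKING):
    print(f"{name:22s} ARN={arn:6.2f} MPN={mpn}")
```

Sorting by MPN puts the agitator TLS plate first and the cooling insulation last, which is the order in which maintenance attention would be allocated.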
23. Maintenance Strategy
The Maintenance Priority Number is used to determine which type of maintenance is most effective:
• Repair (run to failure)
• Preventive Maintenance
• Predictive Maintenance
I have a question for you: would you get on an airplane that is up and working only 80% of the time, or even 99% of the time (in other words, one plane crash out of every 100 flights)? Or would you get on that airplane only if you knew that it would run no matter what?
And how do the airlines do that? Their overall maintenance safety is unbelievable, and the same maintenance strategy can be used for our plant equipment.
So the question is how you can do that, and it is not very complicated. With a risk-based maintenance program you can keep your plant up and running without machines breaking down and stopping your production, because the most costly problems and the most serious quality problems come when a machine breaks down and production stops.
To explain my point, let's look at a case study from my own experience.
The agitator for the reactor, which had been in operation for 29 years, failed during a critical process step. Because of the failure, the product in the reactor gelled into a hard, solid rock.
The torque limiting slip (TLS) plate failed, resulting in loss of agitation.
The TLS plate is designed to protect the motor against mechanical overload. It is set to a predetermined torque setting. Over time, due to deterioration of the agitator gearbox bearing, the torque level went higher than the set value, which caused the TLS plate to fail, or rather to work as designed and decouple the motor from the agitator. The motor was still running, so there was no alarm for loss of agitation. On top of that, the TLS plate was not identified as a critical part, and it took maintenance more than 4 hours to get the agitator mechanically ready, but by then the product was too viscous for the agitator to handle.
The failure resulted in an increased business cost of more than $1 million, which includes the business continuity expense of flying in product from our other sites, removing the gelled material, lost production, and an increased number of high-risk activities.
Causal factor: an unidentified critical part of aging equipment under the maintenance program.
The preventive actions implemented included regular vibration monitoring and analysis, and temperature and amp monitoring. These cost less than $10,000 and detect the condition of the equipment for proper maintenance planning.
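To put the two figures side by side, here is a minimal sketch using the numbers from the case-study slide ($1.3 million additional cost, $10,000 to prevent). The function name and the idea of a "cost avoidance ratio" are introduced here for illustration and are not part of the original talk.

```python
FAILURE_COST_USD = 1_300_000   # additional business cost, from the case-study slide
PREVENTION_COST_USD = 10_000   # cost of the monitoring programs, from the same slide


def cost_avoidance_ratio(failure_cost, prevention_cost):
    """How many times larger the failure cost is than the cost of preventing it."""
    if prevention_cost <= 0:
        raise ValueError("prevention_cost must be positive")
    return failure_cost / prevention_cost


ratio = cost_avoidance_ratio(FAILURE_COST_USD, PREVENTION_COST_USD)
print(f"Failure cost is {ratio:.0f}x the prevention cost")
```

On these figures, a single avoided failure covers the monitoring cost 130 times over.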
In the case study there were three types of maintenance involved: some kind of preventive maintenance before the failure, repair at the time of the agitator breakdown, and predictive maintenance as part of the corrective action. We will see later why predictive maintenance was selected to prevent agitator failure.
Each of the three types of maintenance has its own advantages and disadvantages, and there is no one-size-fits-all solution.
Almost one third of all maintenance costs are wasted as the result of unnecessary or improper maintenance activities, which blindly and indiscriminately involve almost all types of components with little or no consideration of the equipment's lifetime, outage statistics, and economic value.
Let's look at each type of maintenance and its pros and cons; then we will see how to select the right maintenance type for each piece of equipment.
Repair, or run to failure, is suitable for small items which are non-critical, inconsequential, unlikely to fail, and redundant.
For critical equipment and parts it is not an option.
Imagine a person with his wife and baby stranded on a deserted road because of a poorly maintained car! He will do anything in this desperate situation to get his car running, even if it causes a serious accident. Do you think he will do a risk assessment?
Preventive maintenance, which is time based, is the most widely used and popular maintenance type in most industries. We change parts or service equipment before they fail, and sometimes before they need it.
For example, a mechanical seal in X-chemical service is determined to fail every 24 months, so the PM frequency is set to replace the seal every 20 months.
The advantage is clear: it minimizes downtime, critical failures, and unplanned maintenance.
The disadvantage, however, is that we might replace parts much too soon and not use their full life. Sometimes, because of a change in usage, chemical, or product, a part could have worked for more than 5 years, but we set the frequency at 2 years.
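The trade-off in the two examples above can be expressed as the share of a part's life that is actually used before it is replaced. This is only a sketch: `life_utilization` is a hypothetical helper, not a metric named in the talk, and it assumes the observed life is known.

```python
def life_utilization(replacement_interval_months, observed_life_months):
    """Fraction of a part's observed life used before scheduled replacement."""
    if observed_life_months <= 0:
        raise ValueError("observed_life_months must be positive")
    # A part cannot be used for longer than it lasts, so cap at 100%.
    return min(replacement_interval_months / observed_life_months, 1.0)


seal = life_utilization(20, 24)        # seal replaced at 20 months, fails at 24
long_lived = life_utilization(24, 60)  # 2-year interval, part good for 5 years

print(f"Seal example: {seal:.0%} of life used")
print(f"2-year interval on a 5-year part: {long_lived:.0%} of life used")
```

In the seal example most of the life is used, while in the second case more than half of the part's useful life is thrown away.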
Preventive maintenance is suitable for equipment that is subject to wear-out, is consumable, or has a known failure pattern.
Examples: replacing lube oil, tightening nuts and bolts at a regular frequency, replacing parts, and overhauling equipment.
Predictive maintenance is based on predictive testing, monitoring, and inspection.
It is suitable for critical equipment that has random failures, is not subject to wear, or suffers PM-induced failures. (Maintenance has a dark side: it sometimes breaks equipment instead of fixing it.)
So if the time-based approach of preventive maintenance requires an overhaul of the equipment, it increases the risk of infant mortality, which can be illustrated by the bathtub curve.
Advantage: it is the most cost-effective type of maintenance for equipment that is critical to the process. We replace or repair only what is failing, and we reduce infant mortality failures.
Disadvantage: it requires detailed and often expensive monitoring, which may not be cost effective for non-consequential equipment and parts.
Examples: IR thermography, vibration analysis, ultrasound monitoring.
Studies conducted by the Japanese Institute of Plant Maintenance and companies like DuPont and Tennessee Eastman Chemical Company have shown that 3 major physical conditions make up some 80% of the variation. These physical conditions are looseness, contamination, and lubrication.
The BMW maintenance system, Condition Based Service (CBS), permanently monitors oil levels and the degree of wear and tear of individual components. It also checks the time/km recommendations for service intervals. By analyzing this data, the Info Display automatically gives you four weeks' notice of when and which service is next due. The iDrive Control Display informs you at any time, in detail, when a service for a specific component is due, so you can plan your service appointments well in advance and avoid any unnecessary maintenance work.
As we saw, no one solution fits all when applying maintenance techniques. The question is how to determine which maintenance type to choose for a particular piece of equipment and its components. The answer is to implement a risk-based maintenance program.
Risk-based maintenance, or reliability centered maintenance, is a strategy to optimize the maintenance program of a company using cost-effective maintenance techniques.
The RCM process has five simple steps. The first step is to calculate the System Risk Number, which takes into account safety issues, regulatory compliance, product quality, process throughput, and operational cost for the identified component of the process area.
The second step is to determine the Operational Risk Number, based on analyzing what the consequence will be if this part of the process fails. Is it in series, so that a failure will stop the process for more than 24 hours? Is it an important part whose failure will cause 8 hours lost? Can it be bypassed without negative consequence if it fails? Or will its failure have no effect on the overall system? We will rank it from 0 to 10, where 0 means no consequence and 10 means it will cause more than 24 hours of loss.
The third step is simple: determine the Asset Risk Number by multiplying the System Risk Number by the Operational Risk Number.
The fourth step is to determine the Failure Probability Factor, based on how often reactive maintenance is performed: does the system need maintenance daily, weekly, monthly, quarterly, or annually? The more frequent the reactive maintenance, the higher the factor; annual reactive maintenance poses a lower risk than quarterly.
The fifth step is to calculate the Maintenance Priority Number by multiplying the Asset Risk Number by the Failure Probability Factor.
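Written as formulas, the five steps reduce to three calculations. The notation below uses the abbreviations from the slides (SF, RC, QF, PT, OC for the five risk scores) and takes the System Risk Number as the plain average of those scores, which is how the talk describes it; treat this as a summary sketch rather than a formal definition.

```latex
\begin{align*}
\mathrm{SRN} &= \frac{\mathrm{SF} + \mathrm{RC} + \mathrm{QF} + \mathrm{PT} + \mathrm{OC}}{5} \\
\mathrm{ARN} &= \mathrm{SRN} \times \mathrm{ORN} \\
\mathrm{MPN} &= \mathrm{ARN} \times \mathrm{FPF} = \mathrm{SRN} \times \mathrm{ORN} \times \mathrm{FPF}
\end{align*}
```

With every input on a scale of at most 10, the ARN can reach at most 100 and the MPN at most 1000, which matches the ranges shown on the ranking slide.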
It will be very easy to understand with an example.
The first step of the RCM process is to determine the System Risk Number.
As we have seen, the reactor is my most critical process, and we want to identify the system components on which to implement RCM. Let's analyze the risk of a part failure in terms of safety, regulatory compliance, product quality, process throughput, and operational cost.
We will rank them from 1 to 10: 1 for little or no negative impact and 10 for high negative impact.
The TLS plate of the agitator is critical to safety, because loss of agitation can cause an uncontrolled exothermic reaction and a potential runaway reaction, so we rank it 8 on the 1 to 10 scale.
The regulatory compliance risk of the failure is relatively low, as it would not lead to a release of vapor, a permit deviation, and so forth, so we rank it 2.
Quality failure is clearly high risk and scores 10/10, as do process throughput and operational cost.
The System Risk Number is the average of the five risk numbers, and it is 8, so fairly high.
As another example, let's look at a low-risk component.
A cooling water insulation failure scores low because it has little negative impact on safety, regulatory compliance, process throughput, and operational cost, so the System Risk Number for this part of the process is low: 3.07.
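The averaging step can be checked against the scores on the System Risk Ranking slide. The sketch below computes both the plain average described in the talk and, as an assumption on my part, a root-mean-square average, because the SRN values shown on the slide for three of the four components line up with the latter. The function names are illustrative.

```python
import math

# Scores in the order SF, RC, QF, PT, OC, followed by the SRN shown on the slide.
ROWS = {
    "Agitator TLS plate": ((8, 2, 10, 10, 10), 8.0),
    "Cooling control valve": ((6, 8, 9, 8, 5), 7.35),
    "Heating control valve": ((5, 6, 6, 8, 8), 6.71),
    "Cooling insulation": ((3, 2, 3, 4, 3), 3.07),
}


def mean_srn(scores):
    """Plain arithmetic average of the five risk scores."""
    return sum(scores) / len(scores)


def rms_srn(scores):
    """Root-mean-square average; weights high scores more heavily."""
    return math.sqrt(sum(s * s for s in scores) / len(scores))


for name, (scores, slide_srn) in ROWS.items():
    print(
        f"{name:22s} mean={mean_srn(scores):.2f} "
        f"rms={rms_srn(scores):.2f} slide={slide_srn}"
    )
```

The TLS plate's 8 is the plain average, while 7.35, 6.71 and 3.07 match the root-mean-square figure. Whichever convention is chosen, it should be applied to every component so that the rankings stay comparable.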
Once the System Risk Number is calculated, Step 2 is to determine the Operational Risk Number.
The ORN is determined by considering what the consequence will be if this part of the process fails. Is it in series, so that a failure will stop the process for more than 24 hours? Is it an important part whose failure will cause 8 hours lost? Can it be bypassed without negative consequence if it fails? Or will its failure have no effect on the overall system? We will rank it from 0 to 10, where 0 means no consequence and 10 means it will cause more than 24 hours of loss.
In Step 3, the Asset Risk Number is simply the System Risk Number multiplied by the Operational Risk Number. So for the TLS plate, the ARN is 8 multiplied by 9, which equals 72.
Step 4 is to determine the Failure Probability Factor, which is based on the reactive maintenance performed. The TLS plate is part of the agitator system, whose maintenance frequency is six-monthly, so the Failure Probability Factor for the TLS plate is 3.
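Steps 2 and 4 both amount to reading a score off the scale on the Maintenance Priority Ranking slide. A minimal lookup sketch follows. The slide does not say how to score losses that fall between its bands, so treating each band as a lower bound is an assumption, as are the function names; the six-monthly value of 3 is the one quoted above for the TLS plate.

```python
# ORN bands from the ranking slide: (minimum hours lost, ORN).
ORN_BANDS = [(24, 10), (8, 8), (2, 6)]

# FPF by how often reactive maintenance is performed (ranking slide),
# plus the six-monthly value of 3 quoted in the talk.
FPF_BY_FREQUENCY = {
    "daily": 10,
    "weekly": 8,
    "monthly": 6,
    "quarterly": 4,
    "six-monthly": 3,
    "annually": 2,
}


def operational_risk_number(hours_lost):
    """Map production hours lost on failure to an ORN."""
    if hours_lost < 0:
        raise ValueError("hours_lost must be non-negative")
    for min_hours, orn in ORN_BANDS:
        if hours_lost >= min_hours:
            return orn
    # Assumption: any loss below 2 hours is scored like the "<1 hour lost" band.
    return 4 if hours_lost > 0 else 2


def failure_probability_factor(frequency):
    """Map reactive-maintenance frequency to an FPF."""
    return FPF_BY_FREQUENCY[frequency.strip().lower()]


print(operational_risk_number(8))                  # important part, 8 hours lost
print(failure_probability_factor("Six-monthly"))   # TLS plate frequency
```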
Multiplying the Asset Risk Number by the Failure Probability Factor gives us the Maintenance Priority Number. So the TLS plate's Maintenance Priority Number is 72 multiplied by 3, which equals 216.
Once we have the Maintenance Priority Number, we can choose the type of maintenance that will be most effective. In our example, the TLS plate is a critical part that calls for predictive maintenance, and the best way to identify any developing problem is by monitoring and analyzing the vibration of the gearbox.
At the other end, when the MPN is low, replacing or servicing at a regular frequency will not be cost effective, and we can choose some kind of predictive maintenance, such as an annual visual inspection, and/or repair when it fails.
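The selection logic can be sketched as a simple threshold rule. The talk does not give cut-off values, so the thresholds of 200 and 50 below are hypothetical placeholders, and assigning the middle band to preventive maintenance is also an assumption; in practice the choice also depends on the failure pattern (wear-out versus random failure) discussed earlier.

```python
def select_strategy(mpn, high=200, low=50):
    """Suggest a maintenance type from the Maintenance Priority Number.

    The `high` and `low` thresholds are illustrative placeholders,
    not values taken from the talk.
    """
    if mpn < 0:
        raise ValueError("mpn must be non-negative")
    if mpn >= high:
        return "predictive maintenance"
    if mpn >= low:
        return "preventive maintenance"  # assumed choice for the middle band
    return "run to failure with periodic visual inspection"


print(select_strategy(216))  # TLS plate from the worked example
print(select_strategy(6))    # cooling insulation from the ranking slide
```

A site would set the two thresholds from its own ranking results, for example so that only the handful of highest-MPN items receive condition monitoring.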
To conclude:
Reliability-centered Maintenance is a maintenance philosophy that includes a systematic approach to determining how to maintain equipment safely and economically. RCM is an invaluable business solution for companies.
In situations where equipment failure is inevitable, the structured RCM process will ensure a maintenance strategy that will minimize or eliminate the consequences.
The central problem addressed by the RCM process is how to determine which scheduled maintenance tasks, if any, should be assigned to equipment, and how frequently.