This document discusses reliability and its relationship to safety in heavy industry processes. It argues that reliability is the "missing leg" of the safety stool and that reliability and safety are interdependent. When equipment is more reliable through proactive maintenance and reliability programs, workers are exposed to hazards less frequently, improving safety. The document advocates for adopting a reliability-centered approach and management program that utilizes tools like predictive maintenance, root cause analysis, and reliability metrics to continuously improve equipment reliability and thereby also improve safety.
Workplace injuries continue to cost companies and their workers’ compensation insurers billions of dollars each year. Yet the combination of a new generation of industrial wearables, big data, and smart algorithms promises to change this picture for the better. Workers’ comp insurers and industrial organizations can benefit from partnering to drive these solutions into the workplace—in the process, decreasing claims, keeping industrial workers protected and productive, and bringing down costs across the board.
uction safety has been achieved, the industry still continues to lag behind most other industries with regard to safety. The construction safety of any organization consists of employee’s attitudes towards and perceptions of, health and safety
behaviour. Construction workers attitudes towards safety are influenced by their perceptions of risk, management, safety rules
and procedures. A measure of safety could be used to identify those areas of safety that need more attention and improvement.
The aim of the study was to identify factors in the safety management that any lead project success. these factors influence
construction safety. In this project questionnaire is framed to find safety in major organisations. data is collected on the basis
of questionnaire. Employees of various construction firm are interviewed. Collected data is analysed statistically. this analysis
is show the safety environment among organisation. it also gives suggestion to improve safety at construction site.
SBIC Report : Transforming Information Security: Future-Proofing ProcessesEMC
This report from the Security for Business Innovation Council (SBIC), sponsored by RSA, contends that keeping pace with cyber threats requires an overhaul of information-security processes and provides actionable guidance for change.
THE EVALUATION OF FACTORS INFLUENCING SAFETY PERFORMANCE: A CASE IN AN INDUST...IJDKP
Safety has become a very important element in firms and organisations especially in Ghana. The impact of safety factors on a firm’s 3E’s (Employee, Environment and Equipment) can improve or deteriorate firm’s public image. This paper identified the key safety indicators and also provided a set of core factors that contribute meaningful in promoting safety performance in an Industrial Gas producer in Ghana using the Analytic Hierarchy Process. Organisational, Human, Technical and Environmental factors were identified as the safety indicators in relation to the study area. The studies revealed that organisational factor is the most important factor or criterion that could facilitate a better safety performance of the Industrial Gas
Company. In addition, employees was identified the best safety alternative, whilst environment and equipment followed sequentially.
Workplace injuries continue to cost companies and their workers’ compensation insurers billions of dollars each year. Yet the combination of a new generation of industrial wearables, big data, and smart algorithms promises to change this picture for the better. Workers’ comp insurers and industrial organizations can benefit from partnering to drive these solutions into the workplace—in the process, decreasing claims, keeping industrial workers protected and productive, and bringing down costs across the board.
uction safety has been achieved, the industry still continues to lag behind most other industries with regard to safety. The construction safety of any organization consists of employee’s attitudes towards and perceptions of, health and safety
behaviour. Construction workers attitudes towards safety are influenced by their perceptions of risk, management, safety rules
and procedures. A measure of safety could be used to identify those areas of safety that need more attention and improvement.
The aim of the study was to identify factors in the safety management that any lead project success. these factors influence
construction safety. In this project questionnaire is framed to find safety in major organisations. data is collected on the basis
of questionnaire. Employees of various construction firm are interviewed. Collected data is analysed statistically. this analysis
is show the safety environment among organisation. it also gives suggestion to improve safety at construction site.
SBIC Report : Transforming Information Security: Future-Proofing ProcessesEMC
This report from the Security for Business Innovation Council (SBIC), sponsored by RSA, contends that keeping pace with cyber threats requires an overhaul of information-security processes and provides actionable guidance for change.
THE EVALUATION OF FACTORS INFLUENCING SAFETY PERFORMANCE: A CASE IN AN INDUST...IJDKP
Safety has become a very important element in firms and organisations especially in Ghana. The impact of safety factors on a firm’s 3E’s (Employee, Environment and Equipment) can improve or deteriorate firm’s public image. This paper identified the key safety indicators and also provided a set of core factors that contribute meaningful in promoting safety performance in an Industrial Gas producer in Ghana using the Analytic Hierarchy Process. Organisational, Human, Technical and Environmental factors were identified as the safety indicators in relation to the study area. The studies revealed that organisational factor is the most important factor or criterion that could facilitate a better safety performance of the Industrial Gas
Company. In addition, employees was identified the best safety alternative, whilst environment and equipment followed sequentially.
Human Factors (HF) covers a variety of issues that relate primarily to the individual and workforce, their behavior and attributes. Human error is still poorly understood by many stakeholders and so the risk assessments of operations or process often fall short in their capture of potential failures. There is little consideration of human factors in the engineering design of equipment, operating systems and the overall process, procedures and specific work tasks. Operational human factor issues are often treated on an ad-hoc basis in response to individual situations rather than as part of an overarching and comprehensive safety management strategy. The role that human factors play in the rate of incidents, equipment failure and hydrocarbon releases is poorly understood and underdeveloped.
Caught in Numbers, Lost in Focus: What it Means to Manage Safety in Global Sh...Nippin Anand
Have you ever wondered about how safety gets measured and how it became one with the ‘science’ of management (granted there is such a term as management science). What is the relationship between occupational health and safety and technical safety? How reliable is compliance as a measure of safety and how could we possibly distinguish between quality and safety?
Determination of the most important General Failure Types based on Tripod-DELTAIJERA Editor
Prior research on the occupational incidents show that job-related accidents are majorly caused by two types of failures: a) active failures (technical and human errors), they directly affect the occurrences and totally happen in the beginning of work, b) latent failures, though hold unseen and steady for a long while in the system, they are rooted in the decisions of top-ranking officials. Tripod-DELTA is a tool that exposes latent failures. It classifies all possible latent failures in to 11 GFTs. By analyzing an operation using the GFTs, DELTA assesses where the most problematic areas of an operation are. The process of action itemizing DELTA facilitates the removal of latent failures.
Safety management and organizational performance of selected manufacturing fi...AJHSSR Journal
The health and safety (H&S) of employees is a very significant issue to consider with relation to
the attainment of organizational goals. The broad objective of the study is to examine the level of relationship
between safety management system and organizational performance of two plastic industries in Awka
metropolis. Three research questions and hypotheses were formulated in line with the specific objectives. The
study is anchored on Heinrich Theory. In pursuance of the objective of the study, the descriptive survey design
was adopted. The study worked with the population of eighty. Pilot study was conducted using a test retest
method to establish the reliability of the research instrument. The validity of the instrument was also tested. Chi
square was used for data analysis and Z test was also used to test the Chi square at 0.05 level of significance.
The findings revealed that safety management has a positive influence on firm’s profitability, that there is a
relationship between safety management and customer satisfaction, that safety management has influence on
employee commitment and that safety management reduces cost for organization. The study recommends that
Safety should therefore be afforded the highest priority, taking precedence over commercial, operational,
environmental or social pressures, in that staff must be given responsibility for their own actions, and managers
held responsible for the safety performance of their organisations.
A small section of the course ECP-901, Business Continuity & Resiliency Management, by the Institute for Business Continuity Training, https://www.ibct.com
UL DQS India News Letter - iSeeek jun_2014DQS India
Our Bi-Monthly Newsletter with updates and news on the Certifications and Assessments. Hope you will find it interesting and we look forward to receiving your inputs and feedback.
Application of Lean Tools in the Oil Field Safety ManagementIJERA Editor
Current safety management in oil fields is in low efficiency and data from DOE indicated that the injury rate in the oil and gas field was greater than those for all the other U.S. industries. The paperintroduced lean concepts and tools to the safety management in oil fields. In theresearch, a new safety management methodology has been set up. The study also compared the current safety management and the new safety management which was built up by lean concepts. In addition, several lean tools have been modified to make them fit and work better in oil fields
http://www.aiche.org/CCPS
The Business Case for Process Safety is an industry-wide study that identifies and explains the ways business benefit from implementing a robust process safety program.
Learn more about chemical process safety at http://www.aiche.org/ccps
Human Factors (HF) covers a variety of issues that relate primarily to the individual and workforce, their behavior and attributes. Human error is still poorly understood by many stakeholders and so the risk assessments of operations or process often fall short in their capture of potential failures. There is little consideration of human factors in the engineering design of equipment, operating systems and the overall process, procedures and specific work tasks. Operational human factor issues are often treated on an ad-hoc basis in response to individual situations rather than as part of an overarching and comprehensive safety management strategy. The role that human factors play in the rate of incidents, equipment failure and hydrocarbon releases is poorly understood and underdeveloped.
Caught in Numbers, Lost in Focus: What it Means to Manage Safety in Global Sh...Nippin Anand
Have you ever wondered about how safety gets measured and how it became one with the ‘science’ of management (granted there is such a term as management science). What is the relationship between occupational health and safety and technical safety? How reliable is compliance as a measure of safety and how could we possibly distinguish between quality and safety?
Determination of the most important General Failure Types based on Tripod-DELTAIJERA Editor
Prior research on the occupational incidents show that job-related accidents are majorly caused by two types of failures: a) active failures (technical and human errors), they directly affect the occurrences and totally happen in the beginning of work, b) latent failures, though hold unseen and steady for a long while in the system, they are rooted in the decisions of top-ranking officials. Tripod-DELTA is a tool that exposes latent failures. It classifies all possible latent failures in to 11 GFTs. By analyzing an operation using the GFTs, DELTA assesses where the most problematic areas of an operation are. The process of action itemizing DELTA facilitates the removal of latent failures.
Safety management and organizational performance of selected manufacturing fi...AJHSSR Journal
The health and safety (H&S) of employees is a very significant issue to consider with relation to
the attainment of organizational goals. The broad objective of the study is to examine the level of relationship
between safety management system and organizational performance of two plastic industries in Awka
metropolis. Three research questions and hypotheses were formulated in line with the specific objectives. The
study is anchored on Heinrich Theory. In pursuance of the objective of the study, the descriptive survey design
was adopted. The study worked with the population of eighty. Pilot study was conducted using a test retest
method to establish the reliability of the research instrument. The validity of the instrument was also tested. Chi
square was used for data analysis and Z test was also used to test the Chi square at 0.05 level of significance.
The findings revealed that safety management has a positive influence on firm’s profitability, that there is a
relationship between safety management and customer satisfaction, that safety management has influence on
employee commitment and that safety management reduces cost for organization. The study recommends that
Safety should therefore be afforded the highest priority, taking precedence over commercial, operational,
environmental or social pressures, in that staff must be given responsibility for their own actions, and managers
held responsible for the safety performance of their organisations.
A small section of the course ECP-901, Business Continuity & Resiliency Management, by the Institute for Business Continuity Training, https://www.ibct.com
UL DQS India News Letter - iSeeek jun_2014DQS India
Our Bi-Monthly Newsletter with updates and news on the Certifications and Assessments. Hope you will find it interesting and we look forward to receiving your inputs and feedback.
Application of Lean Tools in the Oil Field Safety ManagementIJERA Editor
Current safety management in oil fields is in low efficiency and data from DOE indicated that the injury rate in the oil and gas field was greater than those for all the other U.S. industries. The paperintroduced lean concepts and tools to the safety management in oil fields. In theresearch, a new safety management methodology has been set up. The study also compared the current safety management and the new safety management which was built up by lean concepts. In addition, several lean tools have been modified to make them fit and work better in oil fields
http://www.aiche.org/CCPS
The Business Case for Process Safety is an industry-wide study that identifies and explains the ways business benefit from implementing a robust process safety program.
Learn more about chemical process safety at http://www.aiche.org/ccps
INTERNATIONAL JOURNAL OF INFORMATION SECURITY SCIENCE Walid.docxMargenePurnell14
INTERNATIONAL JOURNAL OF INFORMATION SECURITY SCIENCE
Walid Al-Ahmad, Bassil Mohammed, Vol. 2, No. 2
28
Addressing Information Security Risks by Adopting
Standards
Walid Al-Ahmad*‡, Bassil Mohammad**
*Computer Science Department, Faculty of Arts and Science, Gulf University for Science & Technology, Kuwait
**Ernst & Young, Amman, Jordan
‡
P.O.Box 7207 Hawally, 32093 Kuwait, Tel: +96525307321, Fax: +965 25307030, e-mail: [email protected]
Abstract- Modern society depends on information technology in nearly every facet of human activity including, finance,
transportation, education, government, and defense. Organizations are exposed to various and increasing kinds of risks,
including information technology risks. Several standards, best practices, and frameworks have been created to help
organizations manage these risks. The purpose of this research work is to highlight the challenges facing enterprises in their
efforts to properly manage information security risks when adopting international standards and frameworks. To assist in
selecting the best framework to use in risk management, the article presents an overview of the most popular and widely used
standards and identifies selection criteria. It suggests an approach to proper implementation as well. A set of recommendations
is put forward with further research opportunities on the subject.
Keywords- Information security; risk management; security frameworks; security standards; security management.
1. Introduction
The use of technology is increasingly covering
most aspects of our daily life. Businesses which
are heavily dependent on this technology use
information systems which were designed and
implemented with concentration on functionality,
costs reduction and ease of use. Information
security was not incorporated early enough into
systems and only recently has it started to get the
warranted attention. Accordingly, there is a need to
identify and manage these hidden weaknesses,
referred to as systems vulnerabilities, and to limit
their damaging impact on the information systems
integrity, confidentiality, and availability.
Vulnerabilities are exploited by attacks which are
becoming more targeted and sophisticated.
Attacking techniques and methods are virtually
countless and are evolving tremendously [1, 2].
In any enterprise, information security risks
must be identified, evaluated, analyzed, treated and
properly reported. Businesses that fail in
identifying the risks associated with the
technology they use, the people they employ, or
the environment where they operate usually
subject their business to unforeseen consequences
that might result in severe damage to the business
[3]. Therefore, it is critical to establish reliable
information security risk assessment and treatment
frameworks to guide organizations during the risk
management process.
Because risks cannot be complete.
INTERNATIONAL JOURNAL OF INFORMATION SECURITY SCIENCE Walid.docxbagotjesusa
INTERNATIONAL JOURNAL OF INFORMATION SECURITY SCIENCE
Walid Al-Ahmad, Bassil Mohammed, Vol. 2, No. 2
28
Addressing Information Security Risks by Adopting
Standards
Walid Al-Ahmad*‡, Bassil Mohammad**
*Computer Science Department, Faculty of Arts and Science, Gulf University for Science & Technology, Kuwait
**Ernst & Young, Amman, Jordan
‡
P.O.Box 7207 Hawally, 32093 Kuwait, Tel: +96525307321, Fax: +965 25307030, e-mail: [email protected]
Abstract- Modern society depends on information technology in nearly every facet of human activity including, finance,
transportation, education, government, and defense. Organizations are exposed to various and increasing kinds of risks,
including information technology risks. Several standards, best practices, and frameworks have been created to help
organizations manage these risks. The purpose of this research work is to highlight the challenges facing enterprises in their
efforts to properly manage information security risks when adopting international standards and frameworks. To assist in
selecting the best framework to use in risk management, the article presents an overview of the most popular and widely used
standards and identifies selection criteria. It suggests an approach to proper implementation as well. A set of recommendations
is put forward with further research opportunities on the subject.
Keywords- Information security; risk management; security frameworks; security standards; security management.
1. Introduction
The use of technology is increasingly covering
most aspects of our daily life. Businesses which
are heavily dependent on this technology use
information systems which were designed and
implemented with concentration on functionality,
costs reduction and ease of use. Information
security was not incorporated early enough into
systems and only recently has it started to get the
warranted attention. Accordingly, there is a need to
identify and manage these hidden weaknesses,
referred to as systems vulnerabilities, and to limit
their damaging impact on the information systems
integrity, confidentiality, and availability.
Vulnerabilities are exploited by attacks which are
becoming more targeted and sophisticated.
Attacking techniques and methods are virtually
countless and are evolving tremendously [1, 2].
In any enterprise, information security risks
must be identified, evaluated, analyzed, treated and
properly reported. Businesses that fail in
identifying the risks associated with the
technology they use, the people they employ, or
the environment where they operate usually
subject their business to unforeseen consequences
that might result in severe damage to the business
[3]. Therefore, it is critical to establish reliable
information security risk assessment and treatment
frameworks to guide organizations during the risk
management process.
Because risks cannot be complete.
Reduction of Un-safe Work Practices by Enhancing Shop floor Safety– A case studyIJERA Editor
Industrial safety is of utmost important in the present industrial scenario in order to protect employees, plant and
environment. The present study is carried out in a machine tool manufacturing company. The initial study
revealed several problems with respect to industrial safety and productivity. Keeping these problems in view,
the aim of the present study was to analyse the existing layout and designing the new layout to improve the
productivity by ensuring safety in the shop floor according to the standards.The existing problems were
analysed systematically and solved by adopting andimplementing DuPont Safety Model. The implementation
resulted in increasing the safety and productivity in the organization.
Efficient Safety Culture as Sustainable Development in Construction IndustryIJERA Editor
The paper focuses on precaution necessary to prevent avoidable accidents in the construction industries, important water development, building and roads construction in Nigeria. Moreover, appreciate the need for a safe working environment but also precaution necessary for hitch-free operation.
Developing Accident Avoidance Program for Occupational Safety and Healththeijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
STUDY ON ENHANCEMENT OF HUMAN SAFETY PROTECTION FACTORS IN CONSTRUCTION INDUSTRYIAEME Publication
Objectives:To show that effective implementation of safety protection factors will lead to zeroaccident project. To prove that there must be noticeable enhancement of safetyrequired in the construction industry. Methods/Analysis:Understanding the problems associated with the existing safety performancein the construction industry. Identifying the organization which implementingeffective safety performance in the site and taken it as case study. Extracting the personnelsafety factors from the above case studies where the maximum potentialpossibilities of accidents happen. Progressed with the questionnaire survey onextracted factors. Analysis of the data and extracting results. Findings: From the analysis of 40responses to questionnaire it shows thathuman safety protection factors is implementing in every constructionorganization but not in at the most effectiveway so that accidents are taking place in construction industry. So there isgreat need to enhance safety protection factors in construction industry. Applications: Regular use of humansafety protection factors is must for any organization that leads to betterenhancement on the safety in site. Projective knowledge on safety to engineersleads to zero accident as per the following case study which is most usefulfactor for an organization for safety performance of the project. Recommendationsmentioned in paper is applicable to any construction organization those areinfluencing with safety problems.
46 ProfessionalSafety FEBRUARY 2017 www.asse.org.docxalinainglis
46 ProfessionalSafety FEBRUARY 2017 www.asse.org
Program Management
Peer-reviewed
P
art 1 of this article (PS, January 2017,
pp. 36-45) discussed the three key elements
of a modern occupational safety program:
engineering and technical standards and controls,
management and operation systems, and human
factors. Each element plays an important role, yet
many organizations continue to stress one at the
expense of the others, which creates an unbalanced
and ineffective OSH program. The human factor is
present in most every incident, yet often the focus
is too narrowly trained on blaming at-risk behav-
iors or unsafe acts rather than on identifying and
addressing the conditions, systems and norms that
enable or cause those errors.
Part 2 of this article examines how employers
can better incorporate engineering and system
elements into worker-oriented initiatives to cre-
ate a more comprehensive approach to OSH and
thereby better understand incident causes, reduce
incident rates, confirm regulatory compliance, and
prevent serious injuries and fatalities.
Proving due diligence
While some allege that companies may use
behavior-based safety (BBS) as due-diligence
or reasonable-care proof in potential litigation
(United Steelworkers Local 343, 2000), in the au-
thor’s opinion, BBS observation documentation
does not appear to be strong in that regard, as it
is typically based on basic observations of work-
ers’ behaviors by nonprofessionals, and often
has nothing to do with recognizing and control-
ling occupational hazards. Traditional regulatory
compliance-based safety systems should be ex-
pected to provide due diligence.
only applicable to “best in class”?
BBS programs are often recommended for best-
in-class companies that already have engineering
controls and systems in place and an excellent
safety culture. Implementation in less-advanced
safety systems may be less ideal.
For example, “Practical Guide for Behavioral
Change in the Oil and Gas Industry,” states:
During the past 10 years, large improvements
in safety have been achieved through improved
hardware and design, and through improved
safety management systems and procedures.
However, the industry’s safety performance has
leveled out with little significant change being
achieved during the past few years. A different
approach is required to encourage further im-
provement. This next step involves taking action
to ensure that the behaviors of people at all lev-
els within the organization are consistent with an
improving safety culture. (Step Change in Safety,
2001)
The potential effect of behavior modifications on
safety performance (incident rates) is illustrated in
Figure 1 (p. 48). The conclusion suggested by Fig-
ure 1 is that significant incident reduction can be
achieved through engineering and systems con-
trols. When those two are addressed, an organi-
zation can then work on behavioral modifications
for further imp.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
2. 23
Unfortunately, industry continues to see significant inci-
dents with loss of life, even with this enhanced focus on
safety and operations excellence.
A three-legged stool is inherently stable, but a stool
with fewer than three legs cannot stand. The missing leg
of this stool is reliability, which has been a focus area for
only a few end users. In the companies that view reli-
ability (all failures are preventable) on par with safety (all
incidents are preventable), equipment failure is addressed
proactively to prevent the catastrophic failures that have
led to significant incidents in facilities. This article explores
methods for developing a world-class, reliability-centric
organization.
ProcessSafety
In 1990, the U.S. Congress passed the Clean Air Act
Amendments, which included a three-pronged approach
to improve chemical accident prevention and increase
public accountability for safety at companies and work-
sites involved in chemical production, processing, han-
dling, and storage in the United States. The amendments
to the Clean Air Act gave new responsibilities to both the
U.S. Environmental Protection Agency and the U.S. Occu-
pational Safety and Health Administration. It also cre-
ated a separate agency, the U.S. Chemical Safety Board, to
conduct independent investigations of major chemical
accidents and publicly report the facts, conditions, circum-
stances, and cause or probable cause of each accident. In
addition, the industry founded the Center for Chemical
Process Safety as a response to the methyl isocyanate
release in Bhopal, India, in 1984 that killed more than
2,000 people and injured tens of thousands.
In 1999, a news article [1] was entered into the Con-
gressional Record (S 8235) that described 14 major chemi-
cal accidents in the United States that occurred between
1987 and 1991, which together resulted in 79 fatalities,
nearly 1,000 injuries, and more than US$2 billion in dam-
ages. [2] The incidents occurred at refining and chemical
companies [1],[3].
The 2005 vapor cloud explosion in Texas City [4] and
other industrial incidents have made it apparent that sig-
nificant accidents continue to occur and that safety and
process safety are still issues today. The Texas City refin-
ery explosion occurred on 23 March 2005 when a hydro-
carbon vapor cloud was ignited and violently exploded at
the isomerization process unit at a large refinery, killing
15 workers, injuring more than 180 others, and severely
damaging the refinery. Many within industry as well
as safety professionals had hoped that process-safety
management (PSM) and safety-based regulations would
reduce significant process events by more than 80%, but
this has proven to be an unfulfilled aspiration. The Baker
Panel [5], which convened following the incident in Texas
City, developed recommendations around leadership,
incentives, safety culture, maintenance, reliability, and
more effective implementation of PSM systems.
During several of the significant incident investigations,
there were several lessons for the corporate management
of process operations. Visscher [2] stated, “One is that most
large companies monitor the performance of individual
units, facilities, operations, and managers by statistical
measures. These numerical indicators are used not only to
monitor performance, but also to improve performance,
by basing pay and bonuses, budgets, or other types of
recognition” on these key performance indicators (KPIs).
Therefore, as the saying goes, “What gets measured gets
managed,” and what is reported to management is what
the front-line leaders focus their attention on and empha-
size to their teams, especially when it impacts pay and
performance incentives. As one moves up the management
ladder, this becomes magnified: bonus and stock options
might begin at a sum equal to 50% or more of an individu-
al’s base pay and can be up to multiple times the base pay
for executives.
Most companies’ performance-measurement and
incentive systems for safety are focused almost exclusively
on injury rates, and they do not include evaluation of
process-safety performance metrics. As a result, safety
programs focus on personal safety initiatives, and com-
pany management receives reports on trends in safety
performance based on injury rates, even as overall pro-
cess safety, asset integrity, maintenance, and reliability
deteriorate and, consequently, contribute to a higher risk
of major accidents. According to Visscher [2], “A principal
lesson coming out of the Texas City refinery disaster is
the importance of these additional process-safety indica-
tors, in addition to personal safety measures, to monitor
performance at any facility at which hazardous chemicals
are used, processed or handled.”
Engineers must learn from previous incidents to reduce
the likelihood of them recurring; if they do not, the like-
lihood of accidents continuing to occur is high. As the
incidents discussed demonstrate, small mistakes have disas-
trous consequences.
Accident CausalModel
Incidents rarely have only one cause or failure mode.
Ness [6] stated,
For many years, safety experts have used the Swiss
cheese model [7] to help managers and workers in
the process industries understand the events, failures,
and decisions that can lead to a catastrophic incident
or near miss. According to this model (Fig. 1), each
layer of protection is depicted as a slice of Swiss
cheese, and the holes in the cheese represent poten-
tial failures in the protection layers, such as:
● human errors
● single-point equipment failures or malfunctions
● asset integrity
● maintenance practice
● reliability
● knowledge deficiencies
JULY/AUGUST 2020 IEEEIndustryApplicationsMagazine
3. 24 IEEEIndustryApplicationsMagazineJULY/AUGUST 2020
● management decisions
● management system inadequacies
● failure to perform hazard analyses
● failure to recognize and manage changes
● inadequate follow-up on previously
experienced incidents warning signs.
As depicted in Figure 1, incidents are the result of more
than one failure to effectively address hazards, which are
represented by the holes aligning in consecutive slices. Typi-
cally, the result of these hazards aligning is a catastrophic
incident. End-user companies employ various types of man-
agement systems that include physical devices and planned
activities, such as preventive maintenance (PM) or predictive
maintenance (PdM), to protect and guard against equipment
failure. Therefore, organizations that have effective PSM,
asset integrity, and maintenance and reliability systems will,
according to Ness [6], reduce “the number of holes and the
sizes of the holes in each of the system’s layers, thereby
reducing the likelihood that the holes will align.”
Reliability EqualsSafety
As discussed previously, management performance is
measured by KPIs; unfortunately, one of these numerical
indicators tends to be the budget, which includes funding
for maintenance, engineering, and reliability programs.
The author of this article has experienced pressure to cut
maintenance budgets over consecutive years at more than
one employer to meet overall financial goals.
Organizations invoke budget cuts by arbitrary per-
centages across the board, without fully considering
the effects of these cuts or reflecting on the real impact
of continuously cutting costs over time as an organiza-
tion adds assets. Management never makes a request to
reduce the amount spent on safety. If an organization is
not willing to cut its safety budget, then, likewise, it
should not lower its maintenance, engineering, or reli-
ability program budgets.
It is unfortunate that not every-
one grasps the concept that reliabil-
ity and safety are dependent on each
other. Most accidents don’t occur
when things are running smoothly
and the equipment has a high level
of reliability. Accidents happen in
the realm of reactive maintenance,
where assets fail unexpectedly, and
there is a rush to quickly get the pro-
cess operational.
Manufacturing facilities are an
asset-intensive environment, where
an integral part of safety is reliability;
in simple terms, as reliability increas-
es, so does safety. The author has
had the opportunity to work for a
global chemical company that imple-
mented a best-in-class reliability pro-
gram. During those years, the author was able to see the
results of the reliability initiatives and their positive effect
on equipment, people, and processes. In addition, the
safety KPIs also improved, with several periods exceeding
300 safe workdays.
One might ask, “Why does this happen, and which
comes first, safety or reliability?” Another way to relate
reliability to safety is that, as assets become more reli-
able, end users are servicing them less often; thus, the
workforce is exposed to hazards less frequently. When
we look at the continued occurrence of significant indus-
trial incidents, it is evident that organizations without a
reliability-focused approach will continue to have occur-
rences, which impact safety. When working in a reliable
and safe environment, people are positive and are not
stressed in their respective roles because the processes
are under reliable control; safety is one of the positive
side effects of reliability.
There are three maintenance methods that asset-inten-
sive organizations can use. The first is reactive mainte-
nance, where all behavior is reactive. Reliability is low,
and there are no planned maintenance or failure-mode-
prevention activities; thus, all maintenance activities are
reactive, or breakdown initiated. In this environment,
according to Isleifsson [8],
Safety is instinctive; that is to say, when a problem
arises, instincts are what drive the situation more
than anything else. Instincts say that one can react to
the problem in a particular way, unfortunately, often
ignoring or not noticing the dangers that arepresent.
In the second method, planned maintenance, all
behavior is guided through the planning and scheduling
process. Safety procedures are embedded in the mainte-
nance plan and, thus, enforced. Reliability is intention-
ally planned, and scheduled maintenance activities can
support reliability initiatives. Focusing maintenance
activities on expected failure modes, thereby preventing
Organizational
Influences
Latent Failures
Latent Failures
Latent Failures
Active FailuresUnsafe Acts
Unsafe
Supervision
Preconditions
for Unsafe
Acts
Failed or
Absent Defenses
Mishap
FIGURE1.TheReasonor “SwissCheese”model[7].
4. 25
failures, obviously contributes to increased reliability.
However, individuals must be aware that excessive main-
tenance can backfire. Focusing on the wrong mainte-
nance activities or merely creating a time-based system
with no focus on process criticality, equipment reliabil-
ity, or actual maintenance needs can result in a system
that is not focused on the failure modes and can over-
extend the two critical resources of an organization (i.e.,
workforce and budget).
Finally, there is the proactive maintenance, where
behaviors are based on a self-disciplined focus on con-
tinuous reliability improvement,
with embedded safety in all aspects
of the maintenance and production
environment. There is a continuous
improvement approach to reliability.
Maintenance activities are not only
performed but implemented in a
way that focuses on how they can
be improved. There is a strong focus
on monitoring conditions and inter-
vening when only predictive failure-
mode measurements are trending up.
In a proactive environment, safety is
embedded, and safety conscience is
second nature: responsibility for safety
is everyone’s role and is viewed as an
added value to the business.
Safety should be a critical value to any company.
An excellent safety record preserves the organization’s
reputation and worker morale; minimizes insurance,
workers’ compensation, and legal costs; and reduces
process downtime. To some, it might appear unrea-
sonable to suggest that putting resources into reliabil-
ity could be an investment in safety. A study by Okoh
and Haugen, “Maintenance-Related Major Accidents:
Classification of Causes and Case Study” [9], found
the following for the hydrocarbon and chemical pro-
cess industries:
● In 30–40% of major industrial incidents, it was deter-
mined that maintenance was a factor.
● These incidents occurred during both the preparation
for and performance of maintenance.
● The study also reports incidents caused by a lack of
proper maintenance.
● Atotal of 76% of the incidents related to maintenance
performance occurred during the maintenance activity.
● Another 24% occurred during site preparation or tran-
sition to or from production activities.
● Maintenance activities expose workers to higher risks
because of changes in work activities and locations.
As one can see, reliability and safety have a strong cor-
relation. Focusing on increased reliability will promote
correct behaviors and result in better safety KPIs. In addi-
tion, it will also have a positive impact on productivity
and, therefore, profitability.
Reliability-Management Program
A reliability-management program (RMP) can minimize
hazards by improving a company’s control over asset life
and maintenance or repair cycles. However, to achieve
best-in-class results that will significantly reduce the
hazards associated with shorter maintenance cycles, a
reliability program must move beyond merely keeping
the equipment running. Although PM includes repairing
or replacing worn parts before they fail, this does not
address the failure mode; it only restores the equipment
for the short term. The failure mode must be resolved
to prevent parts from continuing to
fail. An RMP will extend the time
between maintenance or repair cycles
by reducing or removing the sources
of equipment defects.
Organizations with a mature safety
culture will find that the same basic
principles used in targeting safety
incidents and exploring methods to
reduce them are utilized to focus and
examine reliability incidents. A reli-
ability program uses risk assessments
or failure modes and effects analy-
sis (FMEA) as well as the statistical
analysis of equipment and system fail-
ures to determine which areas pose
the highest risk of failure, associated
costs in downtime, and repair/replace efforts. An FMEA
can include personnel risk assessments; thus, safety is
built into an RMP.
PdM is one of the most valuable components of an
RMP; unfortunately, most view it as a tool for merely
determining the optimal time to perform maintenance.
Although PdM data are often used for condition moni-
toring and analysis, PdM methods can be applied to
assess the quality of a repair; this can result in improved
repair techniques, thus leading to longer mean time
between failures (MTBF) and, subsequently, less exposure
to employees. Reliability and safety have the same goal:
all reliability and safety incidents should be prevented.
Reliability-Centered Maintenance
Reliability-centered maintenance (RCM) is a corporate-lev-
el maintenance strategy that is implemented and support-
ed to optimize the maintenance program of a company or
facility. The final result of an RCM program is the imple-
mentation of a specific maintenance strategy on each of
the assets of the facility.
Often, when end-user companies utilize RCM, root
cause analysis (RCA), and reliability measures tech-
niques, they are addressed or implemented as separate
and distinct topics instead of being incorporated into an
overall RMP. There are those individuals who believe
that one tool is all that is required—apply it to every-
thing, and all of the failures will be solved. One must
AnRMPwillextend
thetimebetween
maintenanceor
repaircyclesby
reducingorremoving
thesourcesof
equipmentdefects.
JULY/AUGUST 2020 IEEEIndustryApplicationsMagazine
5. 26 IEEEIndustryApplicationsMagazineJULY/AUGUST 2020
understand that each method is a separate and distinct
tool with its own benefits.
Following the merger of two large chemical companies
in the late 1990s, the author saw firsthand the effects of
applying RCM to every piece of equipment in a plant were
seen. Each company’s chemical complex was located on
opposite sides of a road. One site had just begun the pro-
cess of implementing RMP and had limited PM, while the
complex on the other side of the road had experienced a
catastrophic incident with significant loss of life and had
performed PM on every asset in the facility. One would
expect the reliability of the facility that implemented con-
sistent PM on all assets to be higher than the other site
with limited PM, but that proved to be incorrect. Both
sites had comparable facility reliability metrics but, in the
end, did not achieve equal reliability. Why? The facility
with PM on all assets did not have enough manpower
and administratively closed the ones not completed each
month. This process closed PM that was not completed
during the assigned period of time.
Therefore, the blind utilization of these techniques
(i.e., blanket implementation of PM) will not garner the
results that end users are seeking. However, each method
provides end users with tools that will help advance the
reliability efforts.
RCM is a reliability tool intended to be used to develop
a complete maintenance strategy for a facility, plant, pro-
cess, or individual piece of equipment. Although an RCM
analysis can be performed on any asset, this process and
its implementation are best suited for focusing on the
highest return on investment due to the fact that a com-
pany’s primary limiting factors are workforce and capital.
Some end users are well versed in implementing an RCM
analysis, but most are not; it requires a trained facilitator
and a cross-functional team of subject matter experts,
which should consist of engineers, mechanics, techni-
cians, and operators. Accomplishing a typical analysis
of 100–200 failure modes will take an experienced team
approximately one week.
RCM is a structured process that consists of the
following:
1) analysis preparation work
2) analysis
3) analysis implementation.
All three components require dedicated resources for the
process to function properly. Without the appropriate
amount of preparation, the analysis team will be delayed,
resulting in frustration and, potentially, failure. With no
analysis, there can be no analysis implementation; there-
fore, there will be no change in maintenance performance
or results. This will result in a lack of support for the RCM
effort and, eventually, failure.
Analysis preparation consists of the following: select-
ing a process, determining meeting schedule and location,
agreeing to the expected outputs of the study, deciding
who will implement the analysis tasks, forming and then
training the team, gathering information to conduct the
analysis, and reaching consensus on these steps. Too
often the analysis preparation step is skipped, which can
lead to less-than-optimal results. Process flow diagrams,
piping and instrument drawings, original equipment man-
ufacturer manuals, policies, procedures, and operational
history data should also be collected. The operational
history data should provide sufficient operational data to
determine the current operating conditions.
Once all of the analysis preparation steps have been
agreed to and communicated, everyone involved is aware
of the objectives and should be committed to meeting
them. The next step in the RCM process is to perform an
RCM analysis, which also requires structure and discipline.
By meeting daily, one can complete the analysis quickly,
which helps the team stay focused and eliminates wasted
time that results from an extended meetingschedule.
An RCM analysis of an asset requires the completion of
nine steps in a specific order to develop a complete main-
tenance strategy.
1) List process functions.
2) Enumerate the functional failures.
3) Itemize the failure modes.
4) Determine the probability that each failure will occur.
5) Describe the effects of the failure.
6) Characterize the consequences of the failure.
7) Run the failure mode through the RCM decision process.
8) Develop a maintenance task, redesign, or consequence-
reduction task.
9) Run the failed part through the RCMspare-parts deci-
sion process.
The RCM facilitator should note components desig-
nated as having a high failure rate and a medium to high
consequence rating. These components are candidates for
conducting RCA, which will focus on the possible compo-
nent failures, including those that are physical, human, and
latent. By completing these steps in order, the maintenance
strategy for the asset analyzed will be complete. This
includes the creation of several types of maintenance tasks:
● on-condition maintenance tasks (vibration as well as ther-
mographic and partial discharge analyses, and so on)
● PM tasks (scheduled rework, discard, and inspection)
● failure-finding tasks
● recommended redesigns
● consequence-reduction tasks (for components where
run-to-failure is the maintenance strategy).
The RCManalysis will find high failure rates and medi-
um to high consequence assets. When conducting RCA to
determine component failures modes, there will be
instances where redesign is the only maintenance strat-
egy to eliminate a failure. As an example, the author lead
an RCA on a hydrogen compressor that had an MTBF of
fewer than 50 days and required several days to rebuild,
working around the clock. The failure mode identified
was rings and rider bands. A design change was imple-
mented that extended the MTBF to allow the compressor
6. 27
to operate without failure during the normal operating
period between turnarounds. This is an example of utiliz-
ing redesign to enhance reliability and improve safety by
elimination of repetitive shutdown maintenance.
All procedures for the tasks must be clearly written
and specific in content. When an undesired condition
exists, the procedure should state clearly what the individ-
ual performing the task should observe, measure, record,
and do.
Implementation of the RCM analysis is the final step in
the process. This is where the rubber meets the road, and
by implementing the tasks, the RCM process generates
benefits. Implementation consists of four steps:
1) Prioritize the tasks based on probability and consequence.
2) Designate a specific person responsible for each task.
3) Assign a due date for task implementation.
4)Track and report the progress of implementation.
Following the completion of the implementation step,
the expected results from performing the RCM analysis
are as follows:
● improved asset reliability and availability
● lower maintenance costs
● less expensive unit cost of the product
● a reduction in health, safety, and environmental incidents
● an increased quality
● reduced emergency maintenance
● decreased number of spare parts
● a shorter turnaround time.
Once completed and implemented, this process can
easily save millions of dollars. As an example, the hydro-
gen compressor previously discussed had annual mainte-
nance costs of approximately one million dollars per year,
which was eliminated, and did not take into consider-
ation the repeated safety exposure. The key is to utilize
these reliability tools comprehensively to maximize the
benefits of RCM.
Criticality
The purpose of determining the critical equipment is
simple at the end of the day; the limiting factor in all
companies is workforce and funding. By defining what
is critical to operations, these limited resources can be
focused on the critical assets. This is how to establish a
methodology by which equipment can be classified for
the criticality classification to be used to set priorities for
PM and PdM programs, spare parts, reporting, and other
reliability efforts. This section describes the method for
determining equipment criticality classification and how
critical equipment lists are developed and maintained.
Critical equipment can be classified as equipment that
contains, controls, or processes hazardous substances,
which, in the event of a failure, might lead to a danger of
injury to persons both within and outside the workplace.
Some companies also classify equipment as critical if, as a
result of the equipment’s unavailability or malfunction,
goals are not attained for the environment, production,
business, or quality. Permanently installed equipment
may be defined as critical if used to mitigate the hazards
of loss of containment or control.
Before discussing the criticality classification exam-
ples, some context associated with process hazard analy-
sis (PHA) and assessment of the potential hazards should
be examined. A PHA is a set of organized and systematic
assessments of the potential hazards associated with an
industrial process. The assessment of the potential haz-
ards utilizing the application of risk assessment meth-
odology, including consequence categorization, is the
process used by many end-user companies to mitigate
risk. A consequence is the ultimate result of a deviation
or multiple deviations. Table 1 provides examples of con-
sequence categorization levels that are utilized to mitigate
risk, from category I (least severe) to category V (most
severe). The following sections are an example set of
criticality classification.
SafetyCritical
This aspect involves any electrical, instrumentation, or
mechanical device that is a component of an active safety
system, or a process case pressure-relief device, that is
designed to mitigate an action-level risk with a category
IV or V personnel, community, or environmental conse-
quence and provides an independent protection layer.
EnvironmentalCritical
This includes any electrical, instrumentation, or mechani-
cal device that is a component of active process control, or
a monitoring or safety system, that is designed to mitigate
an action-level risk with a category IV or V environmen-
tal consequence and provides an independent protection
layer. In addition, any equipment identified by facility
operating permits as required to maintain compliance with
environmental regulations and or permits can be added to
the environmental critical list.
Environmentally critical equipment includes, but is not
limited to, the following:
● emissions control, abatement or treatment equipment
such as flares and vent scrubbers; waste collection
devices, and wastewater treatment equipment
● devices or systems that monitor, measure, or record emis-
sions information required todemonstrate compliance.
Accounting(Inventory) Critical
Equipment is classified as being accounting critical when
failure will result in an inability to meet the conditions of a
contract (for example, custody transfer meters per the
corporate volumetric control policy or compliance require-
ments, such as a foreign-tradezone).
QualityCritical
Equipment is determined to be quality critical when the
site is solely dependent on it to maintain product qual- ity
and, upon failure, will produce an out-of-specification
JULY/AUGUST 2020 IEEEIndustryApplicationsMagazine
7. 28 IEEEIndustryApplicationsMagazineJULY/AUGUST 2020
product, without any indication of nonconformance (as
defined by a quality-certification system).
HealthSafetyandEnvironmentalCritical
Equipment is health safety and environmental (HSE) criti-
cal when it is intended to reduce the size of an HSE con-
sequence after it occurs. This includes the following:
● fire suppression system (foam, sprinkler, deluge, fire
monitors and firewater distribution, mobile fire equip-
ment, fireproofing, and fixed halon systems)
● emergency alarm system
● fire and gas detection systems
● safety equipment (respirators, self-contained breathing
apparatuses, breathing air equipment, safety showers,
and fire extinguishers and hoses)
● health surveillance equipment.
as an olefin plant’s crack gas compressor, a propylene
oxide manufacturer’s recycle compressor, a nonredun-
dant electrical distribution system, or a polymer produc-
er’s hypercompressor.
Criticality identification is a precise estimation and eval-
uation of consequence of failure that should be performed
using an appropriate risk-based methodology. Equipment
or functional locations may be categorized using one or
more classifications, although maintenance work, PM/
PdM (PPM) programs, and other reliability efforts will be
assigned to equipment based on the highest consequence
of failure and or classification of this equipment.
Once the criticality is determined, all of the equipment
with a criticality-classification value (safety, environmen-
tal, quality, accounting/inventory, HSE, and production)
should receive heightened attention in the areas of PPM
programs and spare-parts-availability requirements. Each
operating facility’s management system must clearly define
this process. Equipment history and actual data must be
incorporated into the criticality assessment to determine
PPM tasks and frequencies better as well as assess the
availability requirements for spare parts.
ProductionCritical
This aspect includes any electrical, instrumentation, or
mechanical device that is a component of active process
control, or a monitoring or safety system, that is designed to
mitigate an action-level risk with a category IV or V facility
impact and provides an independent protection layer. A reli-
ability-based study (i.e., FMEA) can be used to identify pro- PM
duction-critical equipment that, upon failure, will result in a
significant reduction or a loss of production. Also, this failure
could cause significant equipment damage and repair cost.
It is the plant’s responsibility to define what constitutes
a significant reduction for its site. Production-critical equip-
ment should make up a small percentage (approximately
5%) of the total equipment population. This includes, for
example, nonspared or single-train equipment, such
History
PM is any maintenance activity that is performed to pre-
vent a piece of equipment from failing. PM activities may
take several forms, but all of the forms of PM are intended
to do one of two things: either prevent the next occur-
rence of a failure or at least detect the presence of an
impending failure.
Consequence Consequence
Category I Category II
Personnel,community, Personnel:minoror
and/orenvironmental noinjury,nolosttime
impacts:negligible
Consequence Consequence
Category IV Category V
Personnel:oneor Personnel:fatality or
moresevereinjuries permanentlydisablinginjury
Community:oneor
moreminorinjuries
Community:oneor more
severeinjuries
Community: no injury,
hazard, or annoyance
to thepublic
Environmental:
recordable,with no
agencynotification or
permitviolation
Consequence
CategoryIII
Personnel: single injury,
not severe, possible lost
time
Community:odoror
noiseannoyance
complaintfromthepublic
Environmental:release
that resultsin agency
notificationor permit
violation
Environmental:
significantrelease
with aseriousoffsite
impact
Facilityimpacts:
minimalequipment
damageat an
estimatedcost*of
fewerthanUS$10,000
Facilityimpacts:
minimalequipment
damageat an
estimatedcost*of
US$10,000–100,000
Facility impacts: some
equipment damage at
an estimated cost* of
US$100,000–1million
Environmental:significant
releasewith aseriousoffsite
impactthat ismorelikely
than not to causeimmediate
or long-term healtheffects
Facilityimpacts:major Facilityimpacts:majoror
equipmentdamage total destruction to process
at anestimatedcost* area(s),estimatedat acost*
ofUS$1million–10 greaterthan US$10million
million
*Facility costs include equipment replacement and loss of production (enterprise basis) based on the projected average margin for the remaining life of the facility.
Table1.Someexamplesofriskconsequencecategories
8. 29
Performing PM at a fixed interval is another defin-
ing factor, and many refer to it as being calendar driven.
Most end users base the interval on operating time (days,
weeks, months, or years); however, process throughput
(barrels of product produced, gallons of fuel burned,
hours of run time, and so on) can be the basis. In both
methods, PM activities occur at fixed intervals. Therefore,
any activity one performs on demand is not a PM activity.
Some companies still subscribe to one of the original
styles of maintenance, where technicians or mechanics are
assigned to continuously survey the assets to detect any
issues that may need repair. This method has a significant
drawback in that the defect must be very late in its failure
progression to be detected. In the late 1960s, this style of
maintenance was found to be too costly and produce too
much nonproductive downtime due to not knowing what
was going to break next or how long it would take to fix.
In the early 1970s, this approach gave way to a dif-
ferent PM execution model: the concept of time-based
replacement. There was a belief that all failures were a
function of time or throughput and were, therefore, pre-
dictable. Organizations thought they could organize their
PM strategy to replace parts or components at fixed inter-
vals. This method was not the panacea it was perceived to
be, as maintenance costs increased, and system reliability
was unchanged. Merely tracking hours to failure produces
a lagging indicator and will rarely point to the failure
mode; therefore, a system implementing only time-based
replacement is not the answer.
Another type of PM utilized was a hybrid of the previ-
ous styles. This method involved scheduling PM for an
asset, disassembling it, and fixing whatever the technician
determined required repair. This PM style is utilized by
many end users today, and it has a high level of variability
based on the knowledge and skill level of the technician.
The source of the problem is a lack of understanding of
the failure modes driven by time or cycles and the ones
that are random.
Ground-BreakingPMDevelopment
In 1974, United Airlines was commissioned by the U.S.
Department of Defense to write a report on the processes
used in the civil aviation industry to develop maintenance
programs for aircraft. This report [10], “Reliability-Cen-
tered Maintenance,” written by Stan Nowlan and Howard
Heap and published in 1978, has become the basis of all
subsequent RCM approaches. According to Nowlan and
Heap, in the RCM system, all maintenance tasks are
driven by a specific failure mode and have a specific strat-
egy based on the impact of the failure and type of failure
mode, which may be random or wear out with respect to
time (see Figure 2).
Nowlan and Heap’s research [10] also found these
key factors: random failure modes require inspections,
and the corrective work is performed based on the con-
dition of the defect at the time of the inspection. Why
are infant failures considered random? Answering that
question requires us to understand the definition of ran-
dom: “having no specific pattern.” Occasionally, infant
mortality failures happen after maintenance on an asset,
but this does not occur after every initial startup or after
every maintenance activity. These failures sometimes
occur after start-up or maintenance, and sometimes they
do not; therefore, they are considered random failures.
Wear-out failure modes do not require frequent inspec-
tions, because this failure propagation is predictable, as
it is time or use related. This RCM report, which created
a new thought process on failure modes, defects, and
strategies, is perfect for PM systems even after 30-plus
years and remains the standard for reliability-system
design for maintenance.
Determining whether or not a maintenance task is to
remain in an equipment-maintenance plan is based on the
RCM system, which requires reviewing these questions
for each task:
● Does the task prevent a failure mode?
● Will the task detect the presence of a failure mode?
●Can the task address regulatory or statutory requirements?
The purpose of these tests is to help us determine whether
the task adds value and should remain in the maintenance
strategy. Frequently, non-value-adding tasks creep into the
PM program over time, which tends to inflate costs and per-
sonnel loading and, over time, are not effective. The typical
scenario that creates PM creep is similar to scope creep in a
large project or turnaround. Companies that experience a
failure believe they can apply PM toward reliability, thus
adding more tasks to the PM, increasing the frequency, and
sometimes adding people. Of course, this does not address
the problem. As mentioned earlier, the author has seen this
methodology implemented at one of his past employers in
response to a catastrophic incident. The fallacy of just add-
ing PM items is that the nature of most failures is random,
Pattern A = 4%
Pattern D = 7%
Pattern B = 2%
Pattern F = 68%Pattern C = 5%
(a) (b)
Pattern E = 14%
(c) (d)
Time Time
Time Time
Time Time
(e) (f)
Age Related = 11% Random = 89%
FIGURE2.TheRCMfailurecurves:(a)bathtub,(b)initial break-inperiod,
(c)wearout,(d)random,(e)fatigue,and(f)infantmortality[10].
JULY/AUGUST 2020 IEEEIndustryApplicationsMagazine
9. 30 IEEEIndustryApplicationsMagazineJULY/AUGUST 2020
and a time-based replacement strategy is not at all effective
in dealing with randomproblems.
To understand this phenomenon, a few terms must
be defined. A failure mode is the local effect of a failure
mechanism, according to ASTM International. An example
might be just “bent,” “broken,” or “leaking.” For the reli-
ability engineer, this is not descriptive enough to identify
the problem and solve it. Therefore, the definition of fail-
ure mode is modified to describe it as the part, problem,
and reason (e.g., bearing, fatigued, and misalignment,
respectively). This failure mode, then, would be as fol-
lows: the bearing was fatigued due to misalignment. This
description of the failure mode provides the information
necessary to remedy future failures effectively.
Once the failure modes are understood, one of the
six failure curves found in Figure 2 can be assigned. The
first three curves [Figure 2(a), (c), and (e)] denote interval-
based failures and are best suited for an interval-based
replacement strategy. The last three curves [Figure 2(b),
(d), and (f)] are random failures, and an interval-based
strategy will not work. In fact, it will cost significantly
more money and result in no higheravailability.
The answer to random failures is an inspection-based
strategy, where the component is inspected at some regu-
lar frequency, and the repair is implemented based on the
condition of the component, regardless of time. The P–F
curve [10] graphically encompasses all of the concepts of
reliability in a concise, articulate manner (Figure 3).
As defects propagate, the indications they give off
change as each defect passes through its severity pro-
gression cycle and are different early versus later in a
defect’s life. These indicators can inform the inspector
about the condition of the defect, which makes planning
and scheduling to eliminate it straightforward and spe-
cific. If an organization can mitigate the defect as close
to point P as possible, it creates many advantages: lower
repair costs, longer lead time for planning and schedul-
ing, lower probability of failure, and a lower necessity
for keeping spare parts in stock as opposed to ordering
them as needed. The concept of reliability is expressed
by these advantages; therefore, the development of the
P–F curve is the single most important concept in the
field of reliability.
P–FCurve
The P–F curve is a simplistic concept: the P stands for the
point of a defect in the asset, as before this point the
asset is healthy or defect free, and the F stands for the
point of functional failure or loss of function of the asset.
Therefore, the P–F curve has the following implications
for PM and PdM programs. Inspections, whether PM or
PdM, must be performed close to point P to identify the
defect, which increases the advantages to the end user.
Thus, using various PdM technologies (i.e., ultrasound,
vibration, partial discharge, and so on) to find early
defects provides the organization with an extended win-
dow (depending on the asset, typically an average of
90 days) to respond to the problem.
The P–F curve also gives us an excellent indicator
of the frequency with which PM and PdM inspections
should occur. For random failures, the inspection interval
is less than one half of the P–F interval, which ensures
that there is sufficient time to find the defect before fail-
ure. This rule applies to PM as well as PdM programs; yet
when the defects are no longer random and there is a
sufficiently strong wear-out mechanism producing the
defect, this rule no longer applies. End users must decide
what degree of risk they are willing to take with an inter-
val-based replacement strategy. For nonrandom failures,
the value of the PM program becomes significant, as it
is the only failure-preventing task
employed against that failure mode.
It is essential to understand
that criticality is a methodology
that allows focus on the organiza-
tion’s resources and level of scrutiny
employed, but it has no bearing what-
soever on the inspection frequency.
This is a common error for end users
that is ingrained in poor inspec-
tion techniques and the organiza-
tion’s inability to effectively identify
defects. The basis of the inspection
interval is the estimated P–F inter-
val. The criticality of the piece of
equipment determines the number
of inspection methods utilized for a
given failure mode. In some instanc-
es, an end user may choose to run
assets to failure, which is an accept-
able maintenance strategy.
Broken
EquipmentCondition
Point Where
Failure Starts Early
Early
to Occur Signal 1
Signal2 Early
Signal 3
Audible
Noise
Hot to
Touch
Equipment Fails Here
PdM Detects
ProblemsEarly
Cost to Repair
Time
FIGURE3.TheP–Fcurve.
10. 31
It is essential that end users understand that, often,
nothing is found when performing PM. Once end users
achieve a higher level of reliability, they must resist mak-
ing a business decision to save money by cutting the
frequency of inspections. In fact, if the reliability effort is
doing its job, then the inspections are not going to return
any defects. The purpose of conducting PM on a set fre-
quency is to mitigate risk. Risk must remain a part of the
decision matrix. As the number of inspections increases
without yielding any defect, then the nature of the inspec-
tions and the inspection criteria should be evaluated to
ensure accuracy. This should be the only caveat: leave the
inspection interval alone. If the end user has conducted
six inspections without finding a defect, it means that the
I–P interval (where I is the point of installation, so the I–P
interval is the failure-free period) has lengthened (see Fig-
ure 4); it does not mean that the P–F interval has changed.
It is the P–F interval that determines inspection frequency,
not the length of the I–Pinterval.
Conclusions
As stated earlier, a three-legged stool is inherently stable;
an organization that focuses on safety, operations excel-
lence, and reliability will exhibit similar stability. Compa-
nies that view reliability as on par with safety, proactively
address equipment failure to prevent the catastrophic fail-
ures that lead to significant incidents in facilities.
The ultimate objective of any industrial asset-intensive
enterprise is to maximize and control safety and opera-
tional profitability in real time. As the speed of industrial
business continues to increase, it is becoming even more
critically important to focus on these objectives. It turns
out that what is needed isn’t more advanced technology.
This age-old issue needs to be addressed, and that begins
with how asset reliability is measured in the firstplace.
Today, the continuous advancements in technology
(e.g., the Industrial Internet of Things, data science, and
the proliferation of condition and process measurements)
in industrial operations are making the direct real-time
measurement, utilizing PdM, of asset reliability cost-
effective. By empowering today’s industrial workforce
with real-time operational profitability data, along with
process control and real-time reliability risk information,
asset owners will be able to enhance operations and
business performance.
Operators will be able to make operational changes
and see the impact of these adjustments, not only in the
process but also on the profitability and reliability of
assets. They can then apply this feedback to make oper-
ating and business decisions that maximize operational
profitability without significantly increasing reliability
risk. Likewise, plant maintenance personnel can deter-
mine their maintenance activities from the profitability
and reliability risk information provided by the various
systems and adjust their actions and responses accord-
ingly. Since the common objective of both operators and
maintenance personnel is to improve operational profit-
ability, they will work much more collaboratively.
It is clear that a profitable reliability approach that com-
bines real-time control of reliability risk and operational
profitability with a higher level of reliability management
will go a long way toward helping end users meet their
short- and long-term operations and business objectives.
Profitable reliability control is the next evolution of main-
tenance technology solutions. The result will be higher
levels of operational profitability, safety, and reliability.
Author Information
Donald G. Dunn (donald.dunn@ieee.org) is a senior con-
sultant with Waldemar S. Nelson and Company, Inc., Hous-
ton, Texas. He is a Senior Member of the IEEE. This article
first appeared as “Reliability: The Missing Leg of the Stool”
at the 2018 IEEE IAS Petroleum and Chemical Industry
Conference. It was reviewed by the IAS Petroleum and
Chemical Industry Committee.
References
1 K. Schneider, “Petrochemical disasters raise alarm in industry,” NY
Times, section A, p. 22, June 9, 1999.
2G. Visscher, “Some observations about major chemical accidents from
recent CSB investigations,” in Proc. Institution of Chemical Engineers,
Symp. Series No. 154, Hazards XX Process Safety and Environmental
Protection – Harnessing Knowledge – Challenging Complacency, 2008,
pp. 34–48.
3“List of completed investigations and current investigations,” Chemical
Safety Board, Washington, D.C. [Online]. Available: www.csb.gov
4“Investigation report: Refinery explosion and fire – BP Texas City, TX,
March 23, 2005,” U.S. Chemical Safety and Hazard Investigation Board,
Washington, D.C., Rep. 2005-04-I-TX, 2007. [Online]. Available: https://
www.csb.gov/bp-america-refinery-explosion/
5J. A. Baker, “The report of the BP U.S. refineries independent safety
review panel,” Jan. 2007. [Online]. Available: http://sunnyday.mit.edu/
Baker-panel-report.pdf
6A. Ness, “Lessons learned from recent process safety incidents,” Chem.
Eng. Prog., vol. 111, no. 3, pp. 23–29, Mar. 2015.
7J. Reason, “The contribution of latent human failures to the breakdown of
complex systems,” Philos. Trans. R. Soc. Lond. B, Biol. Sci., vol. 327, no.
1241, pp. 475–484, Apr. 12, 1990. doi: 10.1098/rstb.1990.0090.
8B. E. Isleifsson, “Reliability or safety first?” Reliability & Mainte-
nance, Oct. 8, 2012. [Online]. Available: https://bjarniis.wordpress
.com/2012/10/08/reliability-or-safety-first/
9P. Okoh and S. Haugen, “Maintenance-related major accidents: Classi-
fication of causes and case study,” J. Loss Prev. Process Ind., vol. 26, no. 6,
pp. 1060–1070, 2013. doi: 10.1016/j.jlp.2013.04.002.
10F. S. Nowlan and H. F. Heap, “Reliability-centered maintenance,”
United Airlines for Office of Assistant Secretary of Defense, Washington,
D.C., AD-A066-579, Dec. 29, 1978.
I P
F
Equipment Fails Here
Equipment Installed Here
FIGURE4.TheI–P–Finterval.
JULY/AUGUST 2020 IEEEIndustryApplicationsMagazine