Safety Management Systems (SMS) Fundamentals: Safety Risk Management Component



The second component of an SMS is Safety Risk Management. We’ve already seen the five major elements of SRM; now let’s see how they work in detail.
The objective of an SMS is to provide a structured management system to enable us to make decisions on controlling risk in our operations.
Once hazards are identified and their related risks analyzed, an organization can focus its resources on eliminating or mitigating those hazards that pose the greatest risk.

This is what SRM helps us to do.

Published in: Education, Technology, Business

  • As previously mentioned, a Safety Management System controls risk and assures that the controls are working.
    Safety Risk Management is how we do the first part, controlling the risk.
    Safety Risk Management:
    Starts with the identification of hazards, which are then analyzed to determine the risk to the company.
    The information is assessed to determine if the risk is acceptable or not.
    And finally, unacceptable risks are mitigated through implementation of risk controls.
    Each organization must define its own acceptable levels of risk, and which levels of management are authorized to accept various degrees of risk.
    In Safety Assurance, the whole operation is then monitored for the effectiveness of the controls and for the appearance of new hazards.
  • Risk management is applied at two primary levels: The first is at the Process level and the second is at the Operational level.
    Process Risk Management has three main areas to consider:
    Policy: “what needs to be done”
    Many risk decisions should be part of company policy, the result of careful deliberation about a broad range of situations. Policies convey what employees should do in various situations. For example, many operators prohibit circling approaches in instrument weather conditions.
    Procedure: “how to do it”
    Employees should be given procedures that have been developed in ways that take risk into account. The selection and sequence of items on checklists for operations, maintenance, and cabin safety help control human error.
    Controls, such as Required Inspection Items, make sure the first two get done and defend against possible errors that could occur in maintenance operations.
    Operational Risk Management comprises Operational Control and Crew or Team Decision Making:
    The Operational Control aspects involve how risk management is employed on a mission-by-mission basis. Oversight by dispatchers, supervisors, and flight followers helps to assure that potentially hazardous conditions are identified and controlled during specific operations.
    Crew or Team Decision Making starts with front-line personnel, such as pilots, mechanics, cabin crews and controllers. They must have tools and procedures that help them make decisions. Although the pilot-in-command is the “final authority,” several levels of risk control should still be employed.
    Organizations need to pay attention to all levels of risk management.
  • System Description, also called system description and task analysis, is the first step in the SRM process. This step is often skipped or done hastily, yet it is one of the most valuable activities an organization can do. It allows the company to understand what it does, who does it, and in which environment it achieves company goals.
    Pilot Project participants have shared that they found enormous value, both financially and in efficiency, from taking the necessary time to complete this step.
    What is System Description & Task Analysis?
    It is a system design function involving the whole organization.
    It is a predictive method of hazard identification: when the dots do not connect, there is probably a hazard lurking there, so design it out.
    It is the foundation for sound safety analysis: if you do not know who is on the team and what position they play, how can you expect to accomplish the goal?
    When is it used?
    It is used during the implementation phases of SMS, as you design your SMS.
    And it is used in conjunction with all operational changes, again, as you design your new system, process or procedure.
    Who uses System & Task Analysis?
    You need an appropriately diverse team:
    Stakeholders, and
    Subject Matter Experts
    Remember, these analyses need only be as complex as required to design procedures, develop training, identify hazards, and prepare documentation for employees and other users. The level of detail required will vary depending on the complexity and safety criticality of the processes involved. Detail beyond this point is not necessary.
  • In past training or discussions, you have probably discovered the importance of distinguishing facts from inferences and judgments. That importance was probably explained to you by your boss, immediately after a less than sterling performance.
    System and task analyses must concentrate on being factual, so you can accurately consider any process’s potential for causing a problem.
    Here we break those processes down into Activities (the things people do) and Workplace Conditions.
    Why are these processes, activities and workplace conditions important?
    First let’s look at workplace conditions and then at the human activities.
  • Here are several typical workplace conditions. These conditions exist for all processes. Unsatisfactory conditions can become hazards that exist before an accident.
    Understanding these factors and their effects on safety performance will help us to control them, which is the ultimate goal in managing risk.
    A poorly written or incomplete procedure can be rewritten, the marginal training class can be improved, broken, missing or worn-out tools or equipment can be replaced or serviced, and the poorly lighted hangar can be upgraded. These factors, and many more workplace conditions, have been contributing or causal factors in many accidents. If an organization can manage these, it will likely be managing its risk.
    And this leads us to the Safety System Attributes.
  • Process Attributes (also known as System or Safety Attributes)
    As noted in the earlier Safety Policy portion of the presentation, one of the primary responsibilities under a system safety approach is to make sure safety is built into our systems.
    What do we look for in the design of a system?
    A primary component of any well-designed functional system is the adequacy of certain attributes. These attributes affect both safety and quality management systems and have been documented in both national and international publications.
    Responsibility – who is accountable for safety and quality of process activities?
    Authority – who has the power to establish and modify process procedures?
    Procedures – are there clear instructions for members of the organization?
    Controls – are the administrative, process or supervisory controls adequate to ensure activities produce the correct results?
    Process Measures – is there a way to determine whether a process is being performed according to the established procedures and achieving the desired results?
    Interfaces – Are there strong interrelationships between processes? Does the left hand talk to the right hand?
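One way to make the six attributes concrete is to treat them as a checklist recorded for each process. The sketch below is only an illustration: the `ProcessAttributes` class, its field names, and the example deicing entries are hypothetical and do not come from any FAA or ICAO tool.

```python
from dataclasses import dataclass, fields


@dataclass
class ProcessAttributes:
    """Hypothetical record of the six process attributes for one process."""
    responsibility: str = ""    # who is accountable for the process output
    authority: str = ""         # who may establish or modify the procedures
    procedures: str = ""        # where the instructions are documented
    controls: str = ""          # supervisory or administrative checks
    process_measures: str = ""  # how performance against procedures is measured
    interfaces: str = ""        # which other processes this one exchanges with


def missing_attributes(attrs: ProcessAttributes) -> list[str]:
    """Return the names of attributes that have not been filled in."""
    return [f.name for f in fields(attrs) if not getattr(attrs, f.name).strip()]


# Illustrative entries only; a real analysis would use the organization's own facts.
deicing = ProcessAttributes(
    responsibility="Station manager",
    authority="Director of ground operations",
    procedures="Deicing manual",
    controls="Post-deicing check by lead agent",
)
print(missing_attributes(deicing))
```

Any attribute left blank shows up in the returned list, which is the kind of gap a system description is meant to surface before it becomes a hazard.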
  • Now for a “process flow” that can be applied to any process in any organization. We will illustrate two things: first, how the process attributes fit into a process flow, and second, how the flow applies to a real-world workplace.
    First the process itself…
    …a set of interrelated or interacting activities, things that people do, that transform inputs into outputs.
    We add Inputs, which can be external (a decision by the company to open a new station) or internal (the output from a previous process; for example, the flight crew calling for deicing triggers the station deicing crew to start their process to deice the aircraft). Notice that inputs are an Interface process attribute.
    Next we ask “Who’s in charge”…
    …and who can make changes to the process and who can accept risk on behalf of the company.
    Then we need procedures to direct or guide the “things that people do.” “A procedure is a specified way to carry out an activity or a process.” Another process attribute.
    Controls (procedures, steps or practices) are designed to keep processes on track so they will achieve their intended results.
    And now we get to outputs, the product or end result of a process, which can be recorded, monitored, measured, and analyzed. Outputs may be the input for the next process area in succession, so it is important to get them right. Outputs contain the partner to the input Interface attribute and the final attribute, Performance Measures.
    This model is a simple, input-activity-output flow showing where the process attributes affect a process.
  • Now, we will demonstrate how the Process Flow might look in the real world.
    In addition to the internal and external inputs we see customer requirements…
    …and materials, especially if your output is a product.
    On the controls side we see the obvious controls: laws, regulations and standards.
    Training is also a control…
    …as are the books, aids and software we use.
    How about the people: the folks doing the job and those in charge…
    …plus their facilities, equipment and tools…
    …and finally the environment in which everything has to work.
    Combine all these and you generate the output of a product or service.
    We’ve described system analysis, workplace conditions and the attributes that should be considered. Now let’s look at the human side of the equation and some conditions that have the potential to cause harm.
  • Because humans, operating in the system, account for the majority of active failures, either contributing to or being blamed for most accidents, we will examine conditions relating to human error, as we continue with SRM, and attempt to manage the things we can manage.
    Here are some examples of conditions that contribute to errors, which can result in active failures.
    These conditions are typical findings, when reviewing incidents or accidents; they also exist in a predominant number of daily errors that haven’t yet resulted in an undesired event.
    You can think of examples of each of these that you’ve experienced or seen.
  • These are a few more examples of conditions that lead to errors:
    Many of these conditions are subjective and not factual, therefore making them difficult to address directly.
    Adjectives such as “poor,” “inadequate,” and “lack of” at best may indicate a trend; however, service providers must learn to manage the risk associated with these conditions.
    Typically, these conditions are findings when reviewing incidents or accidents.
    They also exist in many daily errors that have not yet resulted in an undesired event.
  • This example shows how Activities and Workplace Conditions could be depicted for deicing an aircraft. Human activities are on the left and workplace conditions on the right. Human activities and workplace conditions are independent lists.
    While not an exhaustive list, this type of thought process and progression is helpful in an organization’s development of an SMS. Breaking down the process steps into manageable pieces is an effective way to capture and track the information.
  • The next element of SRM, called Hazard Identification, is used to determine the aspects of systems and environments that present hazardous conditions.
    Consider these points:
    Hazards often are erroneously identified as consequences.
    Hazards are not events. As such, hazards do not occur but exist in the environment.
    Any workplace condition or set of conditions may or may not be a hazard.
    Workplace conditions become hazards when they, singly or in combination, present the potential for harm.
    For conditions to be hazards, there must be some type of exposure to aviation operations. As an example, not all power lines or telephone wires are hazards; however, the wires in the photo are a hazard because they are in a place where aviation operations come close enough for airplanes to become tangled up in them.
  • We just said workplace conditions become hazards when they, singly or in combination, present the potential for harm.
    This diagram shows how our knowledge of workplace conditions flows into the hazard identification process, “Deficient conditions impacting activities”.
    We interpret the knowledge we have gained about workplace conditions and apply it to identify hazards that could cause harm.
    Then we add our knowledge of the human activities to our knowledge of the hazards to infer what could cause active failures.
    System and task analyses must concentrate on being factual so that accurate inferences about their potential for causing system failures can be made.
  • Now we progress from hazard to risk analysis.
    A hazard is a condition that may have an adverse consequence. Risk is a hazard that has been analyzed for severity and likelihood of the consequence.
    Analysis of risk must consider both likelihood and severity. That is, workplace conditions and active human failures along with potential consequences must be analyzed to determine relative likelihood and severity. The more adverse the workplace conditions, the greater the likelihood of active failures and a resulting accident.
  • This graphic demonstrates the progression from hazard to risk.
    After identifying the presence of a safety hazard, analysis is required to identify active failures that have the potential for adverse consequences. The likelihood that an adverse consequence will occur increases with exposure to the hazard. Likelihood is typically a byproduct of an active failure and, further, of a set of underlying adverse workplace conditions. While we may have data to establish the likelihood of failures based on past experience, usually this will be based…
    …on the judgment of those involved, another reason for the need for operational experience in the process.
    Active failures are often cited in probable cause statements or audit reports – “the probable cause of this accident was the…[pilot’s, crew’s, mechanic’s, controller’s, etc.]…failure to….”
    Be sure to consider all possible adverse consequences that could result from the active failure. One hazard can have multiple consequences, each with different likelihood or severity levels.
    Also, remember that severity is driven by the consequence. For instance, if the possible outcome of errors during an instrument approach were Controlled Flight Into Terrain, the consequences would almost certainly be catastrophic.
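The point that one hazard can carry several consequences, each with its own likelihood and severity, can be sketched as a small data structure. Everything here is illustrative: the class names are hypothetical, the 1 to 5 likelihood scale and A to E severity letters are borrowed from the risk matrix slide, and the ratings given to the deicing consequences are made-up values, not assessed ones.

```python
from dataclasses import dataclass, field


@dataclass
class Consequence:
    description: str
    likelihood: int  # 1 (extremely improbable) to 5 (frequent)
    severity: str    # "A" (catastrophic) to "E" (negligible)

    @property
    def risk_index(self) -> str:
        """Combine likelihood and severity into a matrix cell such as '3D'."""
        return f"{self.likelihood}{self.severity}"


@dataclass
class Hazard:
    condition: str
    active_failure: str
    consequences: list[Consequence] = field(default_factory=list)


# Hypothetical condition and ratings, for illustration only.
hazard = Hazard(
    condition="Fluid concentration not verified",
    active_failure="Wrong fluid concentration",
    consequences=[
        Consequence("Aircraft required to be deiced again", 3, "D"),
        Consequence("Take-off accident due to ice", 1, "A"),
    ],
)

for c in hazard.consequences:
    print(c.risk_index, c.description)
```

Keeping each consequence as its own record means the rarer but more severe outcome is assessed on its own terms instead of being averaged away by the more common, minor one.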
  • On the left are listed some possible active failures that can be a direct result of human activity.
    These failures should be related to the process and/or one of its underlying activities.
    We have also postulated potential consequences that could result, as an example tool that can be helpful for an organization going through this process.
  • Risk Assessment is a decision process based on human knowledge, experience and judgment, but guided by a structured process.
    It determines the level of risk to assist in bottom-line decision making.
    A risk matrix is a tool for looking at the combined effects of likelihood and severity in order to prioritize resources and aid in decision making. It is important to remember that a risk matrix is not risk assessment; it is only a tool to help determine which risk to address in what sequence. The risk assessment is the human activity of applying knowledge, experience and judgment to the likelihood and severity values.
    Color codes will help to determine the acceptability or “tolerability” of risk. Is the risk acceptable? If so, the system can be put into operation. If not, there is more work to be done.
    Typically, “Tolerability” and “Intolerability” are indicated by using the color codes of:
    Green = Acceptable Risk.
    Yellow = Acceptable with mitigation.
    Red = Unacceptable Risk.
    The FAA does not require a certain number of severity or likelihood levels and leaves the construction of the matrix up to the organization.
    At this point, risk assessment has two possible outcomes: the risk is either acceptable or not acceptable. If acceptable, the SRM process is complete and the risk moves on to the Safety Assurance (SA) process for monitoring. If the risk is not acceptable, then risk controls are required to mitigate or reduce the risk.
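A matrix lookup of this kind can be sketched in a few lines. The likelihood and severity labels follow the matrix on the slide, but the assignment of cells to green, yellow and red below is purely a hypothetical example: as noted above, each organization constructs its own matrix and defines its own acceptable levels. The function names are also assumptions made for this sketch.

```python
LIKELIHOOD = {5: "Frequent", 4: "Occasional", 3: "Remote",
              2: "Improbable", 1: "Extremely improbable"}
SEVERITY = {"A": "Catastrophic", "B": "Hazardous", "C": "Major",
            "D": "Minor", "E": "Negligible"}

# Hypothetical zones; an organization would substitute its own definitions.
RED = {"5A", "5B", "5C", "4A", "4B", "3A"}
GREEN = {"3E", "2D", "2E", "1B", "1C", "1D", "1E"}


def risk_color(likelihood: int, severity: str) -> str:
    """Look up the color zone for one likelihood/severity pair."""
    if likelihood not in LIKELIHOOD or severity not in SEVERITY:
        raise ValueError(f"unknown matrix cell: {likelihood}{severity}")
    cell = f"{likelihood}{severity}"
    if cell in RED:
        return "red"
    if cell in GREEN:
        return "green"
    return "yellow"  # every remaining cell: acceptable with mitigation


def next_step(color: str) -> str:
    """Green moves on to Safety Assurance; anything else needs risk controls."""
    return "Safety Assurance monitoring" if color == "green" else "develop risk controls"


for cell in [(5, "A"), (3, "C"), (1, "E")]:
    color = risk_color(*cell)
    print(cell, color, "->", next_step(color))
```

The lookup only sorts risks into zones. Choosing the likelihood and severity values that go in, and deciding what to do with the result, remains the human judgment the notes describe.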
  • Risk controls are sometimes called “defenses” or “barriers.”
    While we may be able to reduce severity, for example with protective equipment, in most risk situations we’ll only be able to reduce the likelihood or probability of an accident.
    We’ll do this by reducing the likelihood of an active failure, mostly human errors or equipment malfunctions.
    As an example, the use of anti-skid tape on a maintenance stand reduces the likelihood that someone will slip on the platform. Installing railings reduces the severity if someone does slip and fall while working on the stand. If the environment includes working in the rain, wind and at night, what will these conditions do to our likelihood and severity?
    Likelihood will increase, but severity may not.
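The maintenance stand example can be traced step by step. This is a minimal sketch under stated assumptions: the starting rating of 3C, the size of each adjustment, and the `apply_adjustment` helper are all hypothetical, and the 1 to 5 and A to E scales are taken from the risk matrix slide.

```python
SEVERITY_ORDER = "ABCDE"  # A = catastrophic ... E = negligible


def apply_adjustment(likelihood, severity, likelihood_change=0, severity_steps=0):
    """Return (likelihood, severity) after a control or condition changes them.

    likelihood_change: negative lowers likelihood, positive raises it.
    severity_steps: positive moves toward "E" (less severe).
    Results are clamped to the 1-5 and A-E scales.
    """
    new_likelihood = min(5, max(1, likelihood + likelihood_change))
    index = SEVERITY_ORDER.index(severity) + severity_steps
    new_severity = SEVERITY_ORDER[min(4, max(0, index))]
    return new_likelihood, new_severity


risk = (3, "C")                                        # assumed starting point
risk = apply_adjustment(*risk, likelihood_change=-1)   # anti-skid tape
risk = apply_adjustment(*risk, severity_steps=1)       # railings
risk = apply_adjustment(*risk, likelihood_change=+1)   # rain, wind, night
print(risk)
```

With these made-up numbers the adverse environment cancels the benefit of the tape but leaves the severity reduction from the railings in place, which matches the answer in the notes: likelihood will increase, but severity may not.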
  • Risk controls are usually applied to the working conditions, the things that can be managed.
    The purpose of risk controls is to develop barriers or defenses to mitigate safety risk down to an acceptable level. Once a risk control is developed, it should be run back through the SRM Process, to ensure it will function in the system, doesn’t introduce other hazards, and accomplishes what it was designed to do.
    When risk has been found to be undesirable or unacceptable, control measures must be introduced – the higher the risk, the greater the urgency.
    The level of risk can be lowered by reducing the severity or likelihood of each possible consequence.
    There is no such thing as absolute safety. Risks have to be managed to a level “as low as reasonably practicable,” which means that the risk must be balanced against the time, cost and difficulty of taking measures to reduce or eliminate the risk.
  • Safety Management Systems (SMS) Fundamentals: Safety Risk Management Component

    1. SMS Details: Safety Risk Management Component • Policy (Structure) • Risk Management • Safety Assurance • Safety Promotion. Federal Aviation Administration SL-1
    2. Definitions • Safety management systems provide a systematic way to control risk and to provide assurance that those risk controls are effective. • Safety Risk Management performs the process of controlling risk through hazard identification, analysis, risk assessment and developing risk controls.
    3. Levels of Risk Management • Process Risk Management – Policy (What) – Procedure (How) – Controls • Operational Risk Management – Operational Control (Flight/Task/Mission) – Crew/Team (Real time decision making)
    4. System Description. What is System Description & Task Analysis? • It is a system design function. • It is a predictive method of hazard identification. • It is the foundation for sound safety analysis. When is it used? • Used during implementation phases of SMS. • Used in the development of operational processes. • Used when a new Hazard is identified. • Used in conjunction with all operational changes. Who uses System Description & Task Analysis? • Personnel within the organization who form an appropriately diverse team: – Stakeholders – Subject Matter Experts. Sidebar: System Description, Hazard Identification, Risk Analysis, Risk Assessment, Risk Control. ICAO Doc. 9859
    5. SRM System Description • Facts • Processes • Activities • Workplace Conditions: e.g., System Factors & Attributes • Variable Human Performance • Equipment • Information (Procedures) • Facilities • Phys. Envir. • Other Proc. (Interfaces) • Training • Supv./Mgmt. (Controls) • ….
    6. Typical Workplace Conditions • Equipment: Human-machine interface, facilities, tools… • Information: Procedures, guidance… • Environment: Physical, cultural… • Training: Formal, OJT, recurrent… • Company/regulator factors: The RULES… Barry Strauch (2004). Investigating Human Error
    7. Process (System or Safety) Attributes • Responsibility • Authority • Procedures • Controls • Process Measures • Interfaces
    8. Processes • Inputs – Interface (I): External (Decision by Organization), Internal (Previous Process) • Process Activities (Things People do) • Responsibility (R): Accountable for process output • Authority (A): Empowered to make key decisions, alter process • Procedures (P) • Controls (C): Procedural, Supervision, Assurance Processes • Outputs: Destination – Interface (I), Deliverable – Performance Measures (PM)
    9. Process Activities • Customer Requirements • Materials • Laws, Regulations, Standards • Training • Knowledgeware: Manuals, Job Aids, Software • People: Employees, Contractors, Organization • Facilities, Equipment, Tools • Environment: Physical, Operational, Cultural • Products, Services, Systems
    10. Conditions (Hazards) Related to Human Error • Time pressure • Poor procedures and documentation • Teamwork (Too much, too little) • Shift turnovers/crew briefings • Group norms (Values, culture) • Fatigue management (shifts/circadian problems). Alan Hobbs, ATSB (2008)
    11. Conditions (Hazards) Related to Human Error (Cont.) • Lack of System Knowledge • Poor, worn out, or missing equipment/facilities • Human-machine interface (e.g. design for maintainability)
    12. Activities and Conditions: Deicing. Activities/Tasks (Things people do): Prepare truck / equipment • Verify type of fluid • Position at aircraft • Communicate with crew • Apply fluid • Communicate with crew • Depart ramp area. Workplace Conditions (System and Environment): Day/Night • Weather – precip / cold • Fluid temp / concentration • Protective clothing • Equipment condition • Shift change • Employee demographics
    13. Hazard Identification. A hazard is any real or potential condition that can result in injury, illness, or death to people; damage to, or loss of, a system (hardware or software), equipment, or property; and/or damage to the operating environment. ICAO Doc. 9859
    14. SRM Hazard Identification from Workplace Conditions • Processes • Activities • Workplace Conditions: e.g. System Factors & Attributes • Equipment • Information (Procedures) • Facilities • Phys. Envir. • Other Proc. (Interfaces) • Training • Supv./Mgmt. (Controls) • …. Deficient Conditions impacting activities = Hazards. Inference: Variable Performance, Causing… Active Failures, Resulting in… Consequences
    15. Risk Analysis • Important to distinguish between: Hazard – a condition; Consequence – result; Risk – likelihood & severity of the consequence • Analyzing risk involves the consideration of both the likelihood and the severity of any adverse consequences. ICAO Doc. 9859
    16. SRM From Hazard to Risk • Deficient Conditions impacting activities = Hazards • Variable Performance, Causing… Active Failures, Resulting in… Consequences • Likelihood • Severity • Risk Judgment • Risk
    17. Failures and Consequences • Active failures (Direct results of human activity) • Potential Consequences (Accident/incident severity) • Incorrect fluid type • Wrong fluid concentration • Fluid sprayed into pitot-static ports • Incomplete deicing • Aircraft required to be deiced again • Take-off delay • Maintenance action required • Take-off accident due to ice • Hold-over time too long
    18. Risk Assessment • Risk assessment determines the level of risk to use in making a bottom line decision. • A risk matrix is a tool used for risk assessment. It can vary in form yet it accomplishes the same purpose.
        Risk Likelihood             Risk Severity
                                    Catastrophic (A)  Hazardous (B)  Major (C)  Minor (D)  Negligible (E)
        Frequent (5)                5A                5B             5C         5D         5E
        Occasional (4)              4A                4B             4C         4D         4E
        Remote (3)                  3A                3B             3C         3D         3E
        Improbable (2)              2A                2B             2C         2D         2E
        Extremely improbable (1)    1A                1B             1C         1D         1E
    19. Risk Control = Risk Mitigation. A major component of any safety system is the defenses (controls) put in place to protect people, property or the environment. These defenses are used to reduce the likelihood or severity of the consequences associated with any given hazard or condition. ICAO Doc. 9859
    20. SRM Risk Control/Mitigation • Processes • Activities • Workplace Conditions: e.g. System Factors & Attributes • Variable Human Performance • Equipment • Information (Procedures) • Facilities • Phys. Envir. • Other Proc. (Interfaces) • Training • Supv./Mgmt. (Controls) • …. Risk Controls