• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Dezfuli.h
 

Dezfuli.h

on

  • 10,268 views

 

Statistics

Views

Total Views
10,268
Views on SlideShare
10,268
Embed Views
0

Actions

Likes
0
Downloads
14
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Dezfuli.h Dezfuli.h Presentation Transcript

    • A ―Systems/Case-based‖ Approach to System SafetyPresented at NASA Project Management Challenge 2012 February 22-23, 2012 Homayoon Dezfuli, Ph.D. NASA Technical Fellow for System Safety Office of Safety and Mission assurance (OSMA) NASA Headquarters
    • Introduction • We have developed a System Safety Framework under which system safety activities are conducted and communicated • The three elements of the framework are: – Safety objectives – System safety activities – Risk-Informed Safety Case (RISC) • Guidance on the System Safety Framework is contained in the NASA System Safety Handbook – Volume 1: System Safety Framework and Concepts for Implementation (NASA/SP-2010- 580) • Volume 1 will be followed by Volume 2 on methodsPresented by Homayoon Dezfuli 2
    • Motivation • Development of the System Safety Framework is motivated by a desire to: – Foster a systems view of safety (i.e., a holistic, systems engineering view of safety) – Improve integration and effectiveness of system safety activities – Establish a process for defining ―adequate safety‖ – Establish a means for presenting a coherent case for the safety of the system to decision makers – Establish a process that is compatible with the growing trend toward insight/oversight relationships with commercial providersPresented by Homayoon Dezfuli 3
    • Safety Objectives 4
    • What is Safety? “Safety is freedom from those conditions that can cause death, injury, occupational illness, damage to or loss of equipment or property, or damage to the environment” NPR 8715.3 • The specific scope of safety is application-specific, and must be clearly defined by the stakeholders in terms of the entities to which it applies and the consequences against which it is assessed • The degree of safety that is considered acceptable is also application-specific – We strive to attain a degree of safety that fulfills obligations to the at-risk communities and addresses agency priorities – We do not expect to attain absolute safety (nor consider it possible to do so)Presented by Homayoon Dezfuli 5
    • Adequate Safety • Achieving an adequately safe system requires adherence to the following fundamental safety principles: – The system meets or exceeds a minimum tolerable level of safety. Below this level the system is considered unsafe – The system is as safe as reasonably practicable (ASARP) Achieve an adequately safe system Achieve a system that Achieve a system that is meets or exceeds the as safe as reasonably minimum tolerable level practicable (ASARP) of safety Operate the system Design the system to Build the system to Design the system to Build the system to Operate the system to continuously meet meet or exceed the meet or exceed the be as safe as be as safe as to continuously be as or exceed the minimum tolerable minimum tolerable reasonably reasonably safe as reasonably minimum tolerable level of safety level of safety practicable practicable practicable level of safety • The minimum tolerable level of safety is not necessarily static, and may evolve over the course of the system life cycle • The principles of adequate safety must be maintained throughout all phases of the system life cyclePresented by Homayoon Dezfuli 6
    • NASA Safety Thresholds & Goals • NASA‘s minimum level of tolerable safety for human spaceflight missions is articulated in NASA‘s agency-level safety goals and thresholds for crew transportation system missions to the ISS • They reflect a tolerance for an initial safety performance that is acceptable initially but below long-term expectationsPresented by Homayoon Dezfuli 7
    • NASA Safety Thresholds and Goals: Accounting for the Unknowns “ There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we dont know we dont know. ” — Former United States Secretary of Defense Donald Rumsfeld, 2002 • Meeting quantitative safety requirements means more than simply showing that the known-known risks do not exceed the applicable goal or threshold • We must also be able to show that: – The known-unknowns (risks that have been identified but are not quantifiable) and the unknown-unknowns (risks that exist but have not been identified) are bounded – The bounds of the unknowns do not threaten the quantitative safety requirements • Methods for doing this include: – Reliability growth analyses of US vehicles and other countries‘ vehicles – Analyses of historical precursors and anomaliesPresented by Homayoon Dezfuli 8
    • As Safe As Reasonably Practicable (ASARP) “ASARP entails weighing the safety performance of a system against the sacrifice needed to further improve it. A system is ASARP if an incremental improvement in safety would require a disproportionate deterioration of system performance in other areas.” From SS Handbook • The ASARP concept is closely related to the ―as low as reasonably achievable‖ (ALARA) and ―as low as reasonably practicable‖ (ALARP) concepts that are found in U.S. nuclear applications and U.K. Health and Safety law • ASARP implies that: – A comprehensive spectrum of alternative means for achieving operational objectives has been identified – The performance of each alternative has been analyzed to determine the relative gains and losses in performance (operational effectiveness, safety, cost, and schedule) that would result from selecting one alternative over another – Safety performance is given priority in the selection of an alternative, insofar as the selection is within operational constraintsPresented by Homayoon Dezfuli 9
    • ASARP (Cont.) • The ASARP region contains those alternatives whose safety performance is as high as can be achieved without resulting in intolerable performance in one or more of the other mission execution domains – ASARP is a region of the trade space and can contain more than one specific alternative – The ASARP concept makes no explicit reference to the absolute value of a system‘s safety performance – Improvements to cost, schedule, or technical performance beyond minimum tolerable levels are not justifiable if they come at the expense of safety performancePresented by Homayoon Dezfuli 10
    • Deriving Operational Safety Objectives • The fundamental safety principles set the stage for the further development of safety objectives, negotiated on an application-specific basis • Safety objectives are developed using an objectives hierarchy down to a level where they can be clearly addressed by systems safety activities, thereby creating a link that: – Assures that system safety activities are directed towards accomplishing defined safety objectives – Enables the system safety activities to be assessed in terms of the degree to which their target safety objectives have been met • The safety objectives at the bottom level of the objectives hierarchy represent the operational definition of safety for the system under consideration, and are referred to as operational safety objectivesPresented by Homayoon Dezfuli 11
    • System Safety Objectives HierarchyPresented by Homayoon Dezfuli 12
    • System Safety Activities 13
    • System Safety Activities as a Part of the System Safety Framework • System safety activities are conducted as part of the overall systems engineering technical process activities • System safety activities are designed to promote the development of safe systems and to provide evidence to help demonstrate through the Risk-Informed Safety Case (discussed later) that the stated system safety objectives have been achieved System Safety Objectives (define safety) RISC Evaluation Risk-Informed Safety Case (demonstrate safety) (confirm safety) System Safety Activities (achieve safety)Presented by Homayoon Dezfuli 14
    • System Safety Activities – Early Design1. Initial constraints focus on applicable safety requirements, design alternatives, operational constraints, and risk tolerances2. The RIDM process provides models and results to evaluate trade-offs in the search for a final design that is ASARP3. ISAs (integration of hazard analysis, physical response 7. Inform analysis, and probabilistic 8. Allocate 6. Initialize analysis) provide input needed to demonstrate the system meets quantitative safety requirements 2. Conduct RIDM4. Under the ASARP objective, trade 1. Set Initial Constraints 5. Select Design studies are performed to examine how variations (e.g., in design) affect not only safety but also the other mission execution domains 3. ISAs 4. Trades Note: Interfaces are shown in some cases by nesting rather than by arrows. The nesting format automatically implies an arrow from the smaller activities within the nest to the larger activity surrounding it. Presented by Homayoon Dezfuli 15
    • System Safety Activities – Early Design (Cont.)5. The process of down-selecting from the design alternatives to one particular design concept is conducted through a risk- informed deliberation by the decision makers6. The initialization role of CRM is to complete the risk modeling started during RIDM to include all hazards and associated scenarios that affect the risks7. Informed compliance with requirements that have been 8. Allocate 7. Inform 6. Initialize developed historically and are recognized as best practices in their engineering disciplines tend to provide protection against 2. Conduct RIDM 1. Set Initial Constraints 5. Select Design known unknowns and unknown unknowns8. The process for determining lower level performance requirements involves a risk- 3. ISAs 4. Trades informed allocation of requirements from system to sub- system level Presented by Homayoon Dezfuli 16
    • System Safety Activities – Detailed Design Design a safe system during detailed design Safety Detailed Design1. During detailed design, the role of Objectives CRM evolves to Design the system to Design the system to be as safe as reasonably include the meet or exceed the practicable minimum tolerable development and level of safety implementation of new controls when Maintain Minimize the Be responsive Comply with needed to allocation of Risk-inform to new introduction of levied requirements design potentially counteract any new information requirements consistent with solution adverse during system that affect achievable safety decisions conditions during or changed risks performance design system design safety2. Program controls and commitments include RISC Evaluation RISC management Confirms Safety activities to (within Systems Engineering) promote an System Safety Activities environment within 1. Conduct CRM (analytic deliberative process) Conduct CRM (analytic deliberative process) 2. Program control & commitments Program control and commitments Also conduct RIDM ififmajor re-planningis needed Also conduct RIDM major re-planning is needed which design Implement opportunities for Maintain risk analysis of system communication performance Management Conduct improving safety Conduct Control proactively protocols, verification research configuration without incurring Maintain Maintain other and identified seeks net- management, and validation mission exe- individual beneficial that safety unreasonable cost, integrated testing design best cution domain risks safety requirements safety programs practices, schedule, and analysis performance improvements lessons are being met models technical impacts learned, etc. are sought out and implemented Presented by Homayoon Dezfuli 17
    • Risk-Informed Safety Case 18
    • Risk-Informed Safety Case (RISC) • The risk-informed safety case (RISC) is the means by which the satisfaction of the system‘s safety objectives is demonstrated and communicated to decision makers at major milestones such as Key Decision Points (KDPs) • The RISC presents decision makers with a coherent case for safety, rather than presenting them with a set of individual safety analysis and safety management productsPresented by Homayoon Dezfuli 19
    • Risk-Informed Safety Case (RISC) (cont.) “A risk-informed safety case (RISC) is a structured argument, supported by a body of evidence, that provides a compelling, comprehensible and valid case that a system is or will be adequately safe for a given application in a given environment. This is accomplished by addressing each of the operational safety objectives that have been negotiated for the system, including articulation of the roadmap for the achievement of safety objectives that are applicable to later phases of the system life cycle.” From NASA/SP-2010-580 (SS Handbook) • The term ‗risk-informed‘ is used to emphasize that adequate safety is the result of a deliberative decision making process that involves an assessment of risks, and strives for a proper balance between safety performance and performance in other mission execution domainsPresented by Homayoon Dezfuli 20
    • Risk-Informed Safety Case (RISC) (cont.) • The elements of the RISC are: – An explicit set of safety claims about the system(s), for example, the probability of an accident or a group of accidents is lower than a specified value and/or as low as reasonably practicable – Supporting evidence for the claim, for example, representative operating history, redundancy in design, or results of analysis – Structured safety arguments that link claims to evidence and that use logically valid rules of inference • RISCs produced by lower-level organizational units (e.g., sub- system-level units) can be used as sub-claims of the RISC at the next higher level of the NASA hierarchyPresented by Homayoon Dezfuli 21
    • RISC Life Cycle Considerations • The RISC addresses the full system life cycle, regardless of the particular point in the life cycle at which the RISC is developed. This results in two types of safety claims: – Claims related to the safety objectives of the current or previous phases argue that the objectives have been met – Claims related to the safety objectives of future phases argue that necessary planning and preparation have been conducted, and that commitments are in place to satisfy the objectives at the appropriate timePresented by Homayoon Dezfuli 22
    • Example RISC Safety Claims Derived from Safety Objectives • The claims made (and defended) by the RISC dovetail with the safety objectives negotiated at the outset of system formulation • RISC Design Claims Derived from Design Objectives: The system design is adequately safe The system design The system design is meets or exceeds as safe as the minimum reasonably tolerable level of practicable (ASARP) safety Appropriate historically-informed defenses against Requirements have Design solution been allocated unknown and un- decisions have been consistent with quantified safety achievable safety risk informed hazards have been performance incorporated into the designPresented by Homayoon Dezfuli 23
    • Example RISC Structure The system design meets or exceeds the minimum tolerable level of safety • Claim: The system design meets or exceeds the An ISA has been properly The ISA shows that the conducted design solution meets the minimum tolerable level of allocated safety goal/ threshold requirements. safety The design solution has The ISA methods used are Unknown and un- been sufficiently well appropriate to the level of quantified safety hazards developed to support the design solution definition do not significantly impact ISA and the decision context safety performance Design solution elements:: ISA methods: The design is robust The design minimizes the ConOps Identify hazards against identified but un- potential for vulnerability to DRMs comprehensively quantified hazards unknown hazards Operating Characterize initiating environments events and system System schematics control responses Design drawings probabilistically The design incorporates: The design incorporates: ... Quantify events Historically-informed Minimal complexity consistent with margins against Appropriate TRL physics and available comparable stresses items data Appropriate Proven solutions to ... redundancies the extent possible Appropriate materials Appropriate for intended use inspection and Appropriate maintenance The ISA analysts are fully inspection and accesses qualified to conduct the maintenance ... ISA accesses ... Adjusted/waived requirements, standards, best practices do not significantly increase vulnerabilities to unknown/ unquantified hazardsPresented by Homayoon Dezfuli 24
    • Example RISC Structure (cont.) • Claim: Design solution Design solution decisions are risk informed decisions are risk informed RIDM has been conducted The tailored set of to select the design that requirements, standards, maximizes safety without and best practices to excessive performance which the design complies penalties in other mission supports a design solution execution domains that is as safe as reasonably practicable Stakeholder objectives are The RIDM methods used understood and are appropriate to the life The set of applicable There is an appropriate requirements (or imposed cycle phase and the requirements, standards, analytical basis for all constraints) have been decision context and best practices was adjustments/waivers to allocated from the level comprehensively identified requirements, standards, above and best practices RIDM methods: Identify alternatives Analyze the risks associated with each Adjusted/waived alternative requirements, standards, Support the risk- best practices: informed, deliberative Improve the balance selection of a design between analyzed alternative performance measures Preserve safety performance as a priority The RIDM analysts are Do not significantly fully qualified to conduct increase RIDM vulnerabilities to unknown/ unquantified hazardsPresented by Homayoon Dezfuli 25
    • Example RISC Structure (cont.) • Claim: Appropriate historically- informed defenses against Appropriate historically-informed defenses against unknown and unknown and un-quantified safety un-quantified safety hazards are incorporated into the design hazards are incorporated into the design The design is robust The design minimizes the • against identified but un- potential for vulnerability to Claim: Requirements are allocated quantified hazards unknown hazards consistent with achievable safety The design incorporates: The design incorporates: performance Historically-informed Minimal complexity margins against Appropriate TRL comparable stresses items Appropriate Proven solutions to redundancies the extent possible Appropriate materials Appropriate for intended use inspection and Allocated requirements Appropriate maintenance are consistent with inspection and accesses achievable safety maintenance ... performance accesses ... Performance requirements Allocated requirements are consistent with the have been negotiated Adjusted/waived performance commitments between the requirements requirements, standards, developed during RIDM owner and the best practices do not organization responsible significantly increase for meeting the vulnerabilities to unknown/ requirements unquantified hazardsPresented by Homayoon Dezfuli 26
    • Independent Evaluation of the RISC • It is good practice for an evaluator to have one or more checklists for determining whether the evidence is sufficient to support a claim • The checklist should be organized independently from the RISC and should tend to be generically applicable rather than application specific EVALUATION BY ANALYSIS TYPE ANALYSIS ATTRIBUTE Physical Hazards Individual Aggregate Risk Risk Responses Risks Risks Drivers Allocations Important issues are identified and evaluated Grade: Grade: Grade: Grade: Grade: Grade: Comment: Comment: Comment: Comment: Comment: Comment: Models are graded according to the importance of the issue Grade: Grade: Grade: Grade: Grade: Grade: Comment: Comment: Comment: Comment: Comment: Comment: Tests support models and analysis of important issues Grade: Grade: Grade: Grade: Grade: Grade: Comment: Comment: Comment: Comment: Comment: Comment: Best available models are used for all risk significant issues Grade: Grade: Grade: Grade: Grade: Grade: Comment: Comment: Comment: Comment: Comment: Comment: Etc. PROGRAMMATIC CONTROL EVALUATION Plans related to programmatic controls are comprehensively and clearly documented. Grade: Comment: Management will actively promote an environment within which design opportunities for improving safety Grade: without incurring unreasonable cost, schedule, and technical impacts are sought out and implemented Comment: during each phase. Protocols are in place that will promote effective and timely communication among design teams from Grade: different organizations working on different parts of the system. Comment: Etc.Presented by Homayoon Dezfuli 27
    • Putting It All TogetherPresented by Homayoon Dezfuli 28
    • Challenges Ahead • Organizational challenges – Integrating system safety personnel/activities more closely with systems engineering, operations management, and risk management • Analytical challenges – Integrating/refining existing analysis activities to support the development of an integrated safety analysis (ISA) – Meaningful accounting for unknown and under-evaluated risks in determining whether safety thresholds and goals have been achieved • Procedural and regulatory challenges – Development of standards and practices for formulating and evaluating risk informed safety cases (RISCs) – Development of guidelines for excising unnecessary requirements while maintaining safety beneficial requirementsPresented by Homayoon Dezfuli 29
    • Backup Slides 30
    • Independent Evaluation of the RISC • A flowdown checklist for evaluating the RISC has the advantage of explicitly showing how arguments based on evidence support claims. 1.0 TOP-LEVEL CLAIM Safety Performance Measures This flow-down checklist examines ―how safe‖ the system is (or will be),* how well it is demonstrated, and what is being done to make sure Safety Performance Requirements that the top-level safety claim is true (or remains true).* This is the technical basis for the claim: (including Goal and Threshold) Evidence, including operating experience, testing, associated engineering analysis, and a comprehensive, integrated design and safety Engineering Requirements analysis (IDSA), including scenario modeling using Probabilistic Safety Analysis (PSA) Process Requirements A credible set of performance commitments, deterministic requirements, and implementation measures. * The nature and specificity of the claim, and the character of the underlying evidence, depend on the life cycle phase at which the safety case is being applied. The results of analysis have been clearly presented, conditional on an The design intent is characterized in terms of It has been successfully demonstrated The implementation aspects needed to explicitly characterized baseline allocation of levels of performance, design reference missions, CONOPS, and that no further improvements to the achieve the level of safety claimed is risk-informed requirements, and operating experience. An effective deterministic requirements to be satisfied. The design or operations are currently net- correctly understood, and the process for identifying departures from this baseline and/or design itself is characterized at a level of detail beneficial (as safe as reasonably necessary measures have been addressing future emergent issues that are not addressed by this appropriate to the current life cycle phase. practicable). committed to. baseline has been developed. 1.1 1.2 1.3 1.4 An effective process for The design for the current life Analyses performed provide the An effective process has been addressing unresolved and The design and mission intent cycle phase (including following results: carried out to identify significant It has been confirmed that allocated non-quantified safety issues safety improvements, but no is well charctterized.* requirements and controls) is Aggregate risk results (issues invalidating the performance is feasible well specified.* Dominant accident scenarios candidate measures have been baseline case) has been identified 1.1.1 1.1.2 Comparison with threshold/ formuulated. 1.4.1 1.2.2 1.3.1 goal Established baseline for An effective process has been A reasonable defense developed for monitoring and precursor analysis It has been demonstrated that further against unknown safety assuring ongoing satisfaction of ….. issues is included in the improvements in safety would allocated performance levels, and design and controls unacceptably affect schedule 1.2.1 there are commitments to implement Concept of Operation What is credited is reasonable these measures Design Reference and justifiable 1.2.2.1 1.3.2 1.4.2 Missions Operation Environments 1.1.2.1 Historically Informed In addition to reviewing existing information sources and A reasonable attempt has been It has been demonstrated that further Elements operating experience, the best processes known for identifying improvements in safety would incur made to identify and prioritize all The nominal performance and previously unrecognized safety hazards has been applied. significant risks in the risk 1.1.1.1 dynamic responses in design excessive performance penalties management program reference phases are well 1.2.1.1 1.3.3 1.4.3 understood and justified 1.1.2.2 The limits of the safety models are recognized, the caliber of evidence used in the models has been evaluated, and uncertainty An effective process has been It has been demonstrated that further developed for evaluating flight and The performance tailoring and and sensitivity analyses have been performed. improvements in safety would incur test experience for the presence of allocation are well understood Completeness issue excessive cost accident precursors and justified Understanding of key phenomenology and assumptions 1.3.4 1.4.4 1.1.2.3 1.2.1.3 Hazard controls, crew survival methods (if applicable), deterministic requirements, and fault protection approaches have been formulated effectively in a risk-informed manner 1.1.2.4 1.2.1.2Presented by Homayoon Dezfuli 31