1. Taking Program Risk Management
To The Next Level
on NASA’s Constellation Program
John V. Turner, PhD
Constellation Program Risk Manager
CxIRMA
2. Agenda
• CxP Overview
• Pre-Historic Risk Management
• Risk Informed Decision Making
– CRM Process and Tools
– Risk Informed Design
– Integration with Systems Safety
– Risk Informed Test Program
– Knowledge Management
• CxP RIDM Status – Where are we Really on this?
• Areas for Improvement
Page 2 NASA CxP John V. Turner, PMC 2009
3. CxP Lunar Mission Overview
[Figure: lunar mission profile from Earth to the Moon and back. Labels: Earth; Low Earth Orbit; Orion; EDS, Altair; EDS expended; Altair performs LOI; 100 km Low Lunar Orbit; Moon; ascent stage expended; service module expended; direct entry; land landing.]
4. Constellation Systems
Altair Lander
Orion Capsule
Ares I and Ares V Rockets
6. CxP Risk Management
• The complexity of the CxP, the ambitious nature of our
mission, and the significant constraints placed on our
program make effective RM essential
• We have to identify and manage our risks more proactively
than previous human spaceflight programs
7. Early Risk Management
Continuous Risk
Management (CRM)
A meeting….
IRMA
A scorecard…..
A database…….
Hierarchical risk roll-up
8. Risk Informed Decision Making (RIDM)
• NASA NPR 8000.4A Agency Risk Management Procedural
Requirements
• Integration of RIDM and CRM into a coherent framework:
– to foster proactive risk management,
– to better inform decision making through better use of risk
information,
– and then to more effectively manage implementation risks using
the CRM process, which is focused on the baseline
performance requirements emerging from the RIDM process.
• Within an RIDM process, decisions are made with regard to outcomes
of the decision alternatives, taking into account applicable risks and
uncertainties;
• As part of the implementation process, CRM is used to manage those
risks in order to achieve the performance levels that drove the
selection of a particular alternative
• Proactive risk management applies to programs, projects, and
institutional or mission support offices.
9. RIDM
What Kind of Decisions?
• Acquisition Strategy Selection
• Mission Concept Definition
• Requirements Definition
• Design Trades
• Establish Controls / Ops Safety Baseline
• Mgt of Change
• Budget Scrubs
• Design Risk Acceptance
• Operational Risk Acceptance
Where Are They Made?
• Boards and Panels
• Tiger Teams
• ATP Milestones
• Safety Review Panels
• Flight/Test Readiness Reviews
• Source Boards
10. Risk Informed Decision Making (RIDM)
[Figure: Dynamic Information Linkages. Elements shown, with their items:]
• Knowledge Management
– KBRs
– PAL
– Knowledge Capture
• Systems Engineering
– Requirements and TPM Achievability
– Analysis priorities
– Iterative Design and Analysis
• ATP / MMRs
– Risks Reviewed at Authority to Proceed
– C/S/T Baseline
• Ops/Test
– Test Objectives
– Readiness Reviews
– Real Time Decisions
• Boards/Panels
– Managing risk through change
• Systems Safety
– Systematic Analysis
– Formal Risk Acceptance
– Establish Operational Safety Baseline (OSB)
• Continuous Risk Management (CRM) / IRMA
– Standards for risk characterization
– CLAS for risks
– Risk Communication and Reporting Process
– Prioritization of risk mitigation proposals
• Probabilistic Design and Analysis
– Standards of Practice
– LOC LOM Reqts
– Integrated Campaign, Architecture, System, Element Analysis
11. CRM
• The CxP follows the NASA Continuous Risk Management
Paradigm
12. CRM
• The CxP has established Risk Management offices at the
Directorate level, program level and project level
– In some cases, Level IV (element) organizations have an RM office as well
• RM policy is flowed from the agency to directorate, to
program to project level, and in some cases to elements
13. CRM
• A Risk Management Working Group (~bi-weekly) has been
established to ensure common practice and guide the
development of RM policies, practices, and tools
– Including the CxP RM database application – IRMA
• A program risk scorecard has been put in place to help
establish consistency in risk priorities
14. CRM
• A Top Risk Review Process is used to escalate the most
significant risks to higher levels for communication and
action
– Occurs ~ bi-monthly
– Top Project risks are discussed
– Risks requiring higher level awareness or action are escalated to the
directorate risk review
• The CxP Risk Team provides training to all program elements
in order to promote awareness, consistent practice,
improvement
– Several hundred personnel trained
15. CRM - Risk Review Process
[Figure: risk review hierarchy. ESMD at the top, then CxP, then the Ares, Orion, Altair, EVA, Ground Ops, and Mission Ops Projects and Lunar Surface Systems. Also shown: SEI, PPC, SRQA, OTI.]
16. CRM - CxP Cost Threat Process
• The PP&C organizations at all program levels are responsible for
ensuring that the impact of risks on program reserves is identified
• This effort involves the Cx program and projects identifying and
quantifying new cost impacts related to risk mitigation planning
• A threat is money required to mitigate a risk that is not currently in the
Program or project budget
• Cost threats are documented and tracked in CxIRMA
• During the risk review, management considers risks with technical
performance, operations, safety, cost and schedule impacts
– Balances requests for new mitigation funding identified in threats
– What is the best portfolio of risk mitigation options that can be funded
based on threat profile and reserves?
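The portfolio question above can be sketched as a small selection problem: choose the set of cost threats whose total cost fits within reserves and whose combined benefit is highest. This is only an illustration under stated assumptions. The threat names, costs, and the "benefit" score are hypothetical, and the slides do not say how the program actually scores or selects mitigation options.

```python
from itertools import combinations

def best_portfolio(threats, reserves):
    """Return (total_benefit, chosen_names) for the best affordable subset.

    threats: list of (name, cost, benefit) tuples. 'benefit' is a hypothetical
    risk-reduction score; the source does not define how it is measured.
    Exhaustive search, so only suitable for a short list of threats.
    """
    best = (0.0, ())
    for size in range(1, len(threats) + 1):
        for subset in combinations(threats, size):
            cost = sum(t[1] for t in subset)
            benefit = sum(t[2] for t in subset)
            if cost <= reserves and benefit > best[0]:
                best = (benefit, tuple(t[0] for t in subset))
    return best

# Hypothetical numbers for illustration only (cost in $M, benefit unitless).
threats = [("T1", 4.0, 5.0), ("T2", 3.0, 4.0), ("T3", 2.0, 3.0), ("T4", 5.0, 5.5)]
benefit, chosen = best_portfolio(threats, reserves=7.0)
print(chosen, benefit)
```

In practice the benefit term would come from the risk review itself (technical, safety, cost, and schedule impacts), which is why it is left as a free input here.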
17. Fully Characterize the Risk
[Figure: example risk (Risk 2564) traced through the phases Identification, Assessment, Handling, and Communication (stakeholders). Identification sources: Team Brainstorming, Project Control Data, PRA Risk Drivers, Acc Risk Hazards (FMEA), Requirements Risks (CARD), Integrated Analysis (TDS), Problem Reports (PRACA). Assessment and handling inputs: Integrated Analysis (TDS), PRA, IMS. Stakeholders shown: L2 SEI, SRQA, Ares, OTI, Orion.]
18. CxIRMA
• The CxP uses the IRMA risk database application to document,
track, and communicate CxP risks
• A CxIRMA user's guide and training are available in the tool
• IRMA is used in the ISS and Shuttle Programs and has been
modified to complement the Cx risk process
• The CxIRMA database is accessed in the CxP through the ICE
environment
• Users are assigned a role and to a Cx organization, and can be
assigned to multiple organizations
– Permissions are set by user type:
• All risks are visible in CxIRMA regardless of organization affiliation
• Candidates are only visible to those users assigned to the owning
organization
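The visibility rule above (risks visible to everyone, candidates visible only to users assigned to the owning organization) can be sketched as a simple filter. The `Record` type, field names, and sample data are hypothetical and are not taken from CxIRMA itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    record_id: str
    owning_org: str
    is_candidate: bool  # True for a candidate, False for a risk

def visible_records(records, user_orgs):
    """Risks are visible to all users; candidates only to the owning org's users."""
    return [r for r in records if not r.is_candidate or r.owning_org in user_orgs]

# Hypothetical records and user assignment for illustration only.
records = [
    Record("R-1", "Orion", False),
    Record("C-2", "Orion", True),
    Record("C-3", "Ares", True),
]
seen = [r.record_id for r in visible_records(records, user_orgs={"Ares", "SE&I"})]
print(seen)
```

Because a user can belong to several organizations, `user_orgs` is a set rather than a single value.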
19. CxIRMA
• CxIRMA is based on a “homepage” concept
– Each org has its own risk list or homepage
• Risk they own, risks for which they are stakeholder, escalated risks
– Captures risk relationships
– Easy to generate reports
20. CxIRMA
• Significant updates in work
– Update CxIRMA software technology
• Database, middleware, interface
– New user friendly interface
– Data relationships with other data systems
• Requirements
• Critical Analyses (TDS)
• Schedule (IMS)
• Hazards
• PRACA
– Embedded in Program Control Data System
– Improved mitigation planning capability
• MS Project type interface
– Improved graphical reports
• Mitigation Gantt or “Waterfall” charts
21. Risk Informed Design (RID)
• Risk Informed Design means that the design of the CxP architecture will
consider risk as a critical design commodity so that the designs
produced most effectively balance risk against performance and cost.
– The ESAS used risk analysis to prioritize various architecture approaches
based on risk
– The establishment and allocation of LOC and LOM requirements applies
design pressure on architecture development at all levels
– Various risk analysis methods are used to identify risk drivers and identify
the most beneficial use of design commodities (mass, power, budget, etc)
to better meet LOC and LOM
• Hazard Analysis
• FMEA
• PRA
• Physics models and simulations
– Risks associated with Cost, Schedule, and other design commodities are
also considered
– The Iterative Design Analysis Process provides regular integration forums
where design insights can be made
22. Risk Informed Design (RID)
• RID uses LOC and LOM requirements to provide top down allocations of risk
based on generic design reference mission configurations
– LOC and LOM were initially defined at the generic DRM level per the ESAS and
architecture changes made after CxP startup
• These mission risk requirements were allocated to the system and
subsystem level
• PRA, simulation, and physics modeling methodologies were used
to evaluate the adequacy of current designs and operational plans in meeting
these requirements
• LOC and LOM analysis addresses hardware, software, environments,
human reliability, external events, phenomenological events, etc.
• LOC and LOM analysis is part of the IDAC process
– LOC and LOM are incorporated in diverse assessments and trade studies such as
integrated abort system design, launch order, land vs water landing, etc.
• The program is developing a campaign analysis capability that will
allow us to evaluate the integrated effect of current designs and
plans over a campaign of missions
– Could result in a re-assessment of mission allocations and their allocations to
the subsystem level
– Could result in new requirements to drive more specific design issues
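To illustrate how system-level estimates roll up to a mission-level LOC figure, and how that figure compounds over a campaign, here is a minimal sketch. It assumes independent contributors and identical, independent missions, which is a simplification and not the program's PRA methodology. All names and numbers are hypothetical.

```python
import math

def mission_loc(contributions):
    """Probability of LOC on one mission, assuming independent contributors."""
    return 1.0 - math.prod(1.0 - p for p in contributions.values())

def campaign_loc(p_mission, n_missions):
    """Probability of at least one LOC over n identical, independent missions."""
    return 1.0 - (1.0 - p_mission) ** n_missions

# Hypothetical system-level LOC estimates (not program data).
estimates = {"launch vehicle": 0.002, "crew vehicle": 0.001, "lander": 0.003}
requirement = 0.01  # hypothetical mission-level LOC requirement

p_mission = mission_loc(estimates)
p_campaign = campaign_loc(p_mission, n_missions=10)
print(f"mission LOC  = {p_mission:.6f} (meets requirement: {p_mission <= requirement})")
print(f"campaign LOC = {p_campaign:.6f} over 10 missions")
```

A mission that meets its own requirement can still produce a campaign-level risk several times larger, which is one reason a campaign view could prompt a re-assessment of the mission allocations.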
23. Risk Informed Design (RID)
• The program is using PRA to provide more robust risk
characterization during the hazard analysis process
– Significant hazards will be quantified and incorporated into the PRA
mission models
– Functional Hazard Analysis is performed to provide a top down, mission based
review of hazards, which gives a basis for IHA and system HA allocations and a
starting point for mission PRA models
– Mission PRA models and hazards will have a common basis
• Integration of PRA and HA through FHA, and the quantification of
significant hazards, promotes better understanding and intelligent
management of the operational safety risk baseline
• FMEA, Hazard Analysis, PRA
• Controls, Verifications
24. Development of Mission Concepts
and Architectures
[Figure: Mars Mission Architecture Risk Assessment. Bar chart of Risk FOM (axis 0.00 to 3.00) for reference mission Architectures 1 through 10, broken down by contributor: Systems Reliability, Entry / Landing, Mars Orbit Insertion, Launch / Integration, Trans Mars Injection, Mars Ascent, Trans Earth Injection, Other Hazards.]
Example Only – Not Real Data
25. Development of Mission Concepts
and Architectures
Cut No.  % Cumul.  % Cut Set  Prob./Frequency  Cut Sets
1        29        29         1.60E-04         Loss of crew due to common cause failure of parachutes during landing
2        50        22         1.20E-04         Loss of crew due to MMOD impact
3        67        16         9.08E-05         Loss of crew due to Capsule software failure
4        78        11         6.16E-05         Loss of crew due to LV Upper Stage Engine Catastrophic Failure
5        85        8          4.31E-05         Loss of crew due to Abort System separation jettison motor fails to function
6        89        4          2.16E-05         Loss of crew due to ground operations induced malfunction
7        95        5          3.02E-05         SRM case burst
Example Only – Not Real Data
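The percentage columns in a cut set listing like this one can be derived from the cut set probabilities. The sketch below ranks cut sets and computes each one's share and the running cumulative share. It assumes the total is the simple sum of the listed cut sets (a rare-event approximation) unless a total is supplied, so its percentages will differ slightly from the slide, whose total appears to include cut sets not listed. The function name and short labels are hypothetical, and the probabilities are the slide's example values, not real data.

```python
def cut_set_contributions(cut_sets, total=None):
    """Rank cut sets and compute percent and cumulative percent of the total.

    cut_sets: list of (description, probability) pairs.
    total: overall probability; if None, the sum of the listed cut sets is
           used (rare-event approximation, an assumption of this sketch).
    Returns a list of (description, probability, pct, cumulative_pct).
    """
    if total is None:
        total = sum(p for _, p in cut_sets)
    rows, cumulative = [], 0.0
    for desc, p in sorted(cut_sets, key=lambda c: c[1], reverse=True):
        pct = 100.0 * p / total
        cumulative += pct
        rows.append((desc, p, pct, cumulative))
    return rows

# Example values copied from the slide (example only, not real data).
cut_sets = [
    ("Parachute common cause failure", 1.60e-04),
    ("MMOD impact", 1.20e-04),
    ("Capsule software failure", 9.08e-05),
    ("LV upper stage engine catastrophic failure", 6.16e-05),
    ("Abort system jettison motor fails", 4.31e-05),
    ("Ground operations induced malfunction", 2.16e-05),
    ("SRM case burst", 3.02e-05),
]
rows = cut_set_contributions(cut_sets)
for desc, p, pct, cum in rows:
    print(f"{p:.2e}  {pct:5.1f}%  {cum:5.1f}%  {desc}")
```

Sorting by probability before accumulating is what makes the cumulative column useful for spotting the few cut sets that dominate the result.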
26. Systems Safety and Risk Management
• The CxP Risk Management program differentiates between risk
acceptance decisions made during early design and operations, and
longer term acceptance decisions
– The Safety Review Process considers residual risk hazards and makes the initial
acceptance decision
– These risks are captured in the program CRM process to decide if longer term
mitigation is needed
– Periodic reviews are made of acceptance rationale to determine if further risk
mitigation is warranted based on new information, new capabilities, evolving
risk vulnerabilities, changes to designs and operating plans, or new funding
27. The Life of a Safety/Mission Risk
[Figure: timeline across Development and Operations showing two parallel processes.
Systems Safety Process: HA, FMEA, PRA; define and characterize risk; implement controls; maintain controls; Hazard Acceptance at CSERP, then Hazard Acceptance at each Ops MS.
CRM Process: “Top” residual hazards are entered in the CRM process (defined by place on matrix); CRM Risk Review; Residual Risk Acceptance; Implement Strategic Mitigation; Cease Mitigation?]
28. Integrated Risk Management:
CRM is the Glue
[Figure: timeline from DDTE through Acceptance to Operations, showing how each process feeds Continuous Risk Management.]
• Systems Safety
– Define Risks and Controls
– Residual Risk Acceptance
– Establish Operational Safety Baseline (OSB)
– Continuous Risk Management: Capture most significant AR hazards as IRMA risks. Continue to mitigate accepted risk hazards as appropriate
• Boards/Panels
– Evaluate risks associated with proposed changes
– Conscious risk acceptance associated with change
– Continuous Risk Management: Document risks associated with decisions in CR and mitigate
• ATP milestones
– Define risks as part of ATP prep and consider these in decision
– Conscious risk acceptance
– Identify new risks
– Continuous Risk Management: Document risks identified as part of MMR process and mitigate
29. Apollo Test Program
[Figure: two aligned timelines comparing the Constellation and Apollo test programs.
Constellation, 2004 to 2012: Vision Speech, ESAS Roll-Out, LAS 1, DFT 1, LAS 2, LAS 3, RRF 1, RRF 2, RRF 3, ISS 1.
Apollo, 1957 to 1965 (by quarter): Sputnik; Kennedy Speech 5/25 (“…before this decade is out…”); Saturn I ATP.
Apollo LES tests: PA-1 11/7, A-001 5/13, A-002 12/8, A-003 5/19, PA-2 6/29.
Saturn I flights marked SO: SA-1 10/27, SA-2 4/25, SA-3 11/16, SA-4 3/28, SA-5 1/29.
Saturn I flights: SA-6 5/28, SA-7 9/18, SA-9 2/16, SA-8 5/25, SA-10 7/30.]
• Saturn I flew 4 times before adding an upper stage
• Saturn I flew 6 times with S-IV before moving to S-IVB
• Saturn IB flew 4 times before first manned flight
• Saturn V flew 2 times before first manned flight
31. Risk Informed Test Planning
• Goals of Test Program
– Validate requirements
– Validate models
– Enhance reliability growth
– Better support Risk Acceptance
• Methodology
1) Identify Hazards Early using Functional Hazard Analysis (FHA)
– High level functional hazards vs Cause level
2) Evaluate likelihood of occurrence using available knowledge and historical analogs
3) Determine the capability of analysis, ground test, flight test to characterize risks and
reduce uncertainty
4) Recommend analysis and test activities needed to balance uncertainty reduction and
achieve reliability growth
5) As hazard analysis and PRA mature, re-assess
Pilot Project
• Examine 10-12 hazards and evaluate the adequacy of currently planned
activities
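One way to picture how test activity reduces uncertainty in a hazard's likelihood (steps 2 through 4 above) is a Beta-Binomial update: a prior built from historical analogs is updated with test outcomes, and the spread of the estimate shrinks as tests accumulate. The slides do not name a statistical model, so this choice is an assumption, and the prior and test counts are hypothetical.

```python
def update(prior, failures, successes):
    """Update a Beta(a, b) belief about failure probability with test results."""
    a, b = prior
    return (a + failures, b + successes)

def mean(dist):
    """Mean of Beta(a, b): the point estimate of failure probability."""
    a, b = dist
    return a / (a + b)

def variance(dist):
    """Variance of Beta(a, b): a simple measure of remaining uncertainty."""
    a, b = dist
    return (a * b) / ((a + b) ** 2 * (a + b + 1))

# Hypothetical prior from historical analogs: Beta(1, 9), mean 0.1.
prior = (1.0, 9.0)
# Hypothetical test campaign: 20 tests, no failures observed.
posterior = update(prior, failures=0, successes=20)

print(f"prior:     mean={mean(prior):.4f}  variance={variance(prior):.6f}")
print(f"posterior: mean={mean(posterior):.4f}  variance={variance(posterior):.6f}")
```

Comparing the variance reduction that different candidate tests would buy is one possible way to weigh analysis, ground test, and flight test against each other.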
32. RM and KM Integration
• In pursuit of becoming a learning organization, CxP risk management will
include the integration of knowledge management and risk management
processes into the program/project life cycle
• Designing a complex architecture of hardware, software, ground and
space-based assets to return to the Moon and then on to Mars will require
an effective strategy to generate, capture and distribute knowledge
• Premise: Risk Managers, who already use lessons-learned as a source of
information for risk identification, are in a unique position within the
organization to effectively perform these functions
• Strategies
– Knowledge-Based Risks
– Pause and Learn (PAL) Events
– Knowledge Capture/Integration
33. Cx Knowledge-Based Risks
• NASA’s Cx Program plans to create KBRs from pre-existing program risks
(housed inside of CxIRMA) as well as incorporate KBRs into new program risks
as they are identified.
• As the Cx Program evolves, KBRs will be integrated into the existing
continuous risk management (CRM) process.
– Similar to CRM, the Cx KBR process includes Identification, Disposition,
Documentation, and Distribution. KBR identification will become synonymous
with risk identification.
– The process also interacts with all levels and members of the Cx Program
including: Cx Orgs, Cx Risk Management Working Group (RMWG), KBR Owners
(similar to risk owners), ESMD, and SE&I.
• If the Cx Program decides a KBR is “significant,” it has identified a need to
explore further how the KBR relates to other NASA programs and projects (for
example, by interviewing subject matter experts on the topic and collecting
related documentation). ESMD is responsible for significant
KBR development.
• Once the KBR implementation process has been tested successfully within the
Cx Program, other programs will have the ability to participate in the process,
creating a continuous KBR operation across the agency.
34. CxP RM Status
• The CxP RM program is very strong
– Established Program Risk Management plan, risk review process, RM
tool, RM working group, and RM training (over 500 trained)
– All Cx Projects are actively identifying and mitigating risks and
participating in the top risk reporting process
– Integration of RM process & tools between levels I, II, and III going well
– Risk Management is integrated with project control and ATP Milestone
processes
– Overall, level of detail and fidelity of mitigation planning is excellent for this
stage of the program’s life and improves monthly
– Risk identification processes such as Reqts Design Compliance, HA,
FHA, Independent Cost Analysis, and PRA are in place to provide legs to
the RM process
– Integration of Technical Requirements, TPMs, TDSs, Cost Threats,
Safety Analysis, Cost and Schedule under way
– CxIRMA continues to develop improved capability to support new risk
integration initiatives and ease of use
35. CxP RM Status
• Results are Evident
– Risk is driving the design of Ares, Orion, and Altair to obtain a more optimal
balance of risk across the architecture and mission timeline
– Significant decisions are informed by risk analysis, including technical,
safety, cost, schedule, and mission success factors
– RM practice is present at all levels and in all decision making forums in the
CxP
– The CxP has created a RIDM culture
• Having said that, there are areas where we can improve on this
practice
– Policy / Practice
• Streamline and focus risk reviews. Continue to improve the
quality of our risks. Integrate risks with other critical data
elements
– Tools
• Risk Informed Test Planning Methodology. IRMA
Enhancements. Knowledge Based Risks
– Training
• Case based training
38. RIDM relies on being able to both: 1) compare risks to resolve
design trades, and 2) aggregate risks to understand risk posture
at the mission and campaign level
• The Risk Informed Design paradigm has been adopted by Ares, Orion,
Lander, and CxAT to establish a more optimal use of design commodities
to balance risk
– Adaptation of NESC recommended methodology (RP-06-108: Design, Development,
Test, and Evaluation (DDT&E) Considerations for Safe and Reliable Human Rated
Spacecraft Systems)
– Define Needs, Objectives, Constraints
– Define Minimum Functionality
– Make it Work
– Make it Safe
– Make it Reliable
– Make it Affordable
39. Technical Risk Scenario
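[Figure: event sequence diagram. An Initiating Event is followed by Conditional Event 1, Conditional Event 2, and Conditional Event 3, leading to an Outcome Event. Plot of Desirability of Outcome versus Time, with Mitigation Events branching from the Initiating Event toward outcomes labeled Nominal (NOM), Minor Damage, LOM, and Catastrophic LOC.]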
• Paradigm works well for safety risk scenarios where discrete probabilities
can be assigned to specific events in an accident sequence
• Each sequence of events, or risk trajectory, has a unique probability
derived from the combination of conditional event probabilities
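The second bullet can be made concrete: the probability of one sequence is the initiating event probability multiplied by the conditional probability of each later event in that sequence. The sketch below uses a hypothetical two-level tree. The branch probabilities and the pairing of branches with outcomes are made up for illustration.

```python
import math

def sequence_probability(p_initiating, conditional_probs):
    """Probability of one accident sequence (risk trajectory).

    Each conditional probability is the chance of that event given
    everything earlier in the same sequence.
    """
    return p_initiating * math.prod(conditional_probs)

# Hypothetical numbers for illustration only.
p_init = 1e-3            # initiating event
p_mitigation_fails = 0.1 # first mitigation event fails
p_second_fails = 0.5     # second mitigation fails, given the first failed

sequences = {
    "Minor damage": sequence_probability(p_init, [1.0 - p_mitigation_fails]),
    "LOM": sequence_probability(p_init, [p_mitigation_fails, 1.0 - p_second_fails]),
    "LOC": sequence_probability(p_init, [p_mitigation_fails, p_second_fails]),
}
for outcome, p in sequences.items():
    print(f"{outcome}: {p:.2e}")
```

Because the branches at each node are exhaustive, the sequence probabilities add back up to the initiating event probability, which is a useful sanity check on any event tree.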
40. Mission Success Depends Upon a
Combination of Many Variables
Launch Strategy:
• Two launch
• Single Launch
Launch:
• Time increment between launches
• Launch Availability
• Launch Probability
• Order of Launches
Vehicle Reliability:
• LOM/LOC
Target Characteristics:
• Redundant Landing Sites
• Multiple opportunities to access a select landing site
• Lighting constraints at target
LEO Loiter:
• LEO Loiter Duration
• Ascent Rendezvous Opportunities
• TLI Windows
Vehicle Performance:
• Orbital Mechanics Variation Tolerance
• Additional Propulsive Capability
• Vehicle Life
• Launch Mass Constraints
41. Example – Functional Risk Timeline
Example Only – Not Real Data
42. Saturn / Apollo Development Testing
• Saturn “Block 1” Sub-Orbital Flights
– First Stage Ascent Tests with Inert Upper Stages (no
separation)
– Validation of ascent performance, structural loads,
functionality of gimbaled nozzles on the outboard
engines for S&C.
– SA-4 flight included intentional “engine-out”
checkout.
• Saturn Block II Flights
– Functional S-IV Upper Stage
– SA-6 through SA-10 flights carried prototype Crew
Modules
– Test of nominal LES jettison on SA-6 and SA-7.
• Un-Crewed Saturn IB Flights
– Functional S-IVB upper stage powered by J-2
Engine.
– CM separated and returned to Earth.
• Launch Escape System Testing
– Abort Test Booster to test the LES at transonic,
maximum dynamic pressure, low altitude, and
power-on tumbling abort conditions.
43. Mars First?
[Figure: destinations shown: Earth, ISS, Moon, Mars.]
• Exploration Campaign Analysis: Identify the activities and architectures
required to optimally produce mission success and crew safety within
cost and schedule constraints
• The high risk associated with manned Mars exploration makes risk
informed design essential
• ISS and Lunar missions are also essential to accomplishing this goal
– Technology demonstration
– Reliability growth
– Operational experience