Resiliency
John Mymryk
December 15, 2014
Contents
1) Types of Business Impacts (Outages) & Their Costs
2) Three Foundational Pillars of Business Continuity
3) Problem Statement
4) High Availability & Sustained Resiliency
5) Our Methodology
6) Value Proposition
7) Appendix
Types of Business Impacts (Outages) & Their Costs
* Sources: www.symantec.com; www.informationweek.com; www.businesscomputingworld.co.uk; www.evolven.com; www.quorum.net
Corporations implement a Business
Continuity Program (BCP) to address these
types of outages as they directly impact the
bottom line.
* Lost Labor: $46,000,000
(Per 10,000 person company @ 1.6 hr/wk)
* Lost Revenue: $26,500,000,000
(Survey across 200 companies)
* Brand Failure: $ ??
RIM (Blackberry)
Revenue lost for a single outage can be in the millions of dollars. Outages
may also start a brand failure (e.g., the Blackberry RIM outage, ~$100M).
Important! Most outages are either
Hardware Failure or Human Error – a very
small percentage of the overall outages are
Natural Disasters (5%).
What are these Outages? (Yearly Combined)
77%
What are the Costs? Here are some Data Points:
So, What are the key elements of a BCP Program?
* Losses are estimated at $1.2 trillion ($1,200,000,000,000) annually
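The lost-labor data point above can be reproduced with simple arithmetic. The sketch below assumes the inputs cited in the editor's notes (10,000 employees, 1.6 hours of downtime per week, $56 per hour including benefits). The function name and the 52-week year are illustrative assumptions, not part of the original deck.

```python
def annual_lost_labor(employees: int, downtime_hours_per_week: float,
                      hourly_cost: float, weeks_per_year: int = 52) -> float:
    """Yearly labor cost of downtime, assuming every employee is idle during it."""
    weekly = employees * downtime_hours_per_week * hourly_cost
    return weekly * weeks_per_year


# Inputs taken from the editor's notes; weeks_per_year=1 gives the weekly figure.
weekly_cost = annual_lost_labor(10_000, 1.6, 56, weeks_per_year=1)
yearly_cost = annual_lost_labor(10_000, 1.6, 56)

print(f"Weekly lost labor: ${weekly_cost:,.0f}")
print(f"Yearly lost labor: ${yearly_cost:,.0f}")
```

Under these assumptions the weekly figure is $896,000 and the yearly figure is roughly $46.6 million, in line with the "more than $46 million" quoted on the slide.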
Three Foundational Pillars of Business Continuity
Resiliency Recovery Contingency
Resiliency is a destination
A state where critical business
functions and the supporting
infrastructure are unaffected
by most outages.
Resiliency is the ability of a
corporation to move its
capability and capacity
seamlessly around its
environment.
Recovery is a journey
Also known as “Disaster
Recovery” or DR, it is complex
to maintain and difficult to
implement.
DR accepts data loss to a point in
time and production downtime
as acceptable outcomes.
Contingency is a last resort
Establish a generalized
capability and readiness to
cope with major incidents and
disasters. Not all are known.
Contingencies involve data
loss and production downtime
as an acceptable outcome.
Increasing Resiliency efforts will naturally reduce efforts in Recovery & Contingency.
Business
Continuity
Resiliency
Recovery
Contingency
Corporations leave a hole in their overall
Business Continuity programs by spending
considerable dollars on Recovery & Contingency
(covering only ~5% of outages) with diminishing
returns. This leaves a BCP with built-in
downtime and data loss as acceptable outcomes.
Resize
Recovery &
Contingency
Efforts
Appropriately
Improve &
Increase
Resiliency
Efforts
Resiliency is proactive in dealing with significant
business survival events (hardware failure, human
error, power outages, pandemic, natural disaster,
social unrest, etc.).
Problem Statement
Goal: Reset the BCP Balance
Resiliency is a superset of Recovery and
Contingency, and it leverages the established processes
and procedures used in Recovery and Contingency.
1
Tier 1 - Business
Application
High Availability & Sustained Resiliency
Tier 0 - Load
Balancer
Infrastructure
Tier 0 –
Physical Server
Infrastructure
Tier 0 - DB Servers
& DB Infrastructure
Tier 0 -
Directories
(LDAP, AD, ...)
Tier 0 - Storage
Infrastructure
Tier 0 –
Virtualization
Infrastructure
Identified
HA/SR Gaps
Identifying &
Resolving
HA/SR Gaps
Reduces
Infrastructure
Failures
Unplanned
Outages
Tangible
Loss
Intangible
Loss
Provider Confidence
Regulatory Fines
A compounding and/or cascading
failure can occur when many
HA/SR gaps are concentrated.
The value lies in finding those
HA/SR gaps and addressing them.
High Availability (HA)
Component availability, which can
be Inter-site or Intra-site.
Sustained Resiliency (SR)
Moving capacity & capability
seamlessly around the physical
environment
Resilience: Critical business functions and the supporting infrastructure are designed and engineered in such a way that
they are materially unaffected by most disruptions, for example through the use of redundancy and spare capacity. There
are two (2) methods to do this:
Benefits of HA/SR
2
Our Resiliency Methodology
Develop Test
Schedule Test
Submit Events
HA/SR Testing
Feedback &
Improvement
Validate
Gap Exposure Risk,
Value Assessment
Application Testing
Capability
Gap Remediation
Investigate
Applications
Submitted for
Assessment or Review
Perform Assessment
and Onboarding
Develop Test
Requirements and
Objectives
Assess
5
1
2
3
4
6
7
8
9
10
11
All applications (infrastructure, services, applications or utilities) that execute all 11 steps of the
Resiliency Methodology would be considered mature in their Resiliency profile and, by extension,
would be able to endure business-impacting outage events.
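As a rough illustration of the maturity rule above, the sketch below treats an application as mature only when all 11 numbered steps have been executed. The step count comes from the slide. The function names and the use of a set of step numbers are hypothetical choices made for this example.

```python
TOTAL_STEPS = 11  # the methodology above has 11 numbered steps


def missing_steps(completed_steps: set[int]) -> list[int]:
    """Return the methodology steps (1..11) that have not been executed yet."""
    return sorted(set(range(1, TOTAL_STEPS + 1)) - completed_steps)


def is_resiliency_mature(completed_steps: set[int]) -> bool:
    """An application counts as mature only when no step is missing."""
    return not missing_steps(completed_steps)


# Hypothetical application that has finished steps 1 through 8 only.
partial = set(range(1, 9))
print(is_resiliency_mature(partial), missing_steps(partial))
```

Listing the missing steps, rather than returning only a yes/no answer, makes it easier to see how far an application is from a mature Resiliency profile.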
The Value – A Proven Resiliency Program
Lowered IT
Effort
Meet SLAs
(SR)
Lower
Outages (HA)
Contingency
Planning
Disaster
Recovery
Resiliency
Most Corporations’
Current State
Implement Resiliency
Methodology
Meets most audit requirements.
Has great planning, but limited impact on improving the
production environment’s ability to sustain outages.
Putting more effort into Disaster Recovery &
Contingency Planning yields diminishing returns.
By focusing more on Resiliency, IT teams can reduce
efforts and costs, as DR and CP goals are met through
Resiliency implementation.
End result is a resilient application infrastructure.
Appendix
• Anatomy of a Recovery Event
Area of Potential
Data Loss
An Application’s Data
RTO, RPO, RTC, TTTR & BTTR (Visually) – Anatomy of a Recovery Event
Timeline of Incident/Outage
RPO
No Data
Business Decision
Window
Business
Resumption
Take
Action
Infra
Ready
Good
Data
Inconsistent
Data
Rebuilding
Data
Good
Data
Data
Available
Fixed
(if applies)
Business Time To Resume (BTTR)
Recovery Time Objective (RTO)
Technology Time To Recover (TTTR)
Return To Capacity (RTC)
Application Recovery
Business Recovery
Infrastructure Recovery
People / Facilities Recovery
Time To Fix
All Hands Fix
Incident
Start
Fix It Outage Time
Recovery Outage Time

2015-01-13 Resiliency (v04)


Editor's Notes

  1. According to Dun & Bradstreet, 59% of Fortune 500 companies experience a minimum of 1.6 hours of downtime per week. Take an average Fortune 500 company with at least 10,000 employees, each paid an average of $56 per hour including benefits: the labor part of downtime costs for an organization this size would be $896,000 weekly, translating into more than $46 million per year. (Assessing The Financial Impact Of Downtime). For 2,000 corporations, lost labor costs would be $46M x 2,000 = $92,000,000,000 ($92 billion). There are more than 2,000 companies with more than 10,000 employees.

     CA Technologies is the latest to attempt to calculate IT downtime, with a survey of 200 companies across North America and Europe intended to calculate the losses incurred from an IT outage. It found that more than $26.5 billion in revenue is lost each year from IT downtime, which it translates to roughly $150,000 lost annually for each business.

     In September 2010, Virgin Blue's check-in and online booking systems went down. The airline suffered a hardware failure on September 26 and a subsequent outage of its internet booking, reservations, check-in and boarding systems. The outage severely interrupted the Virgin Blue business for a period of 11 days, affecting around 50,000 passengers and 400 flights, and service was restored to normal on October 6. (Virgin Blue IT outage hit profit by up to $20M) - See more at: http://www.evolven.com/blog/downtime-outages-and-failures-understanding-their-true-costs.html#sthash.efet2jHX.dpuf

     On average, the businesses in the CA Technologies survey said they suffered 14 hours of IT downtime per year. Half of those said IT outages damage their reputation, and 18% described the impact on their reputation as 'very damaging.' Headlines about IT failures certainly don't help. (IT Downtime Costs $26.5 Billion In Lost Revenue)

     In a study by the Ponemon Institute, the minimum, median, mean and maximum cost per minute of unplanned outages was computed based on input from 41 data centers. The most expensive unplanned outage cost over $11,000 per minute. On average, an unplanned outage is likely to cost more than $5,000 per minute. (Understanding the Cost of Data Center Downtime: An Analysis of the Financial Impact of Infrastructure Vulnerability)

     When an outage occurs, it's a race against time to handle it before it spirals out of control. According to the IT Process Institute, resolution time per outage is around 200 minutes. That is a lot of time spent resolving outages, considering what is happening to the customer experience and company reputation in the meantime. The average reported incident length was 90 minutes, resulting in an average cost per incident of approximately $505,500. (Unplanned IT Outages Cost More than $5,000 per Minute)
  2. Resiliency is a destination where critical business functions and the supporting infrastructure are designed and engineered in such a way that they are materially unaffected by most disruptions, for example through the use of redundancy and spare capacity. Data loss is minimal or strictly meets the RTO. Deploying an effective Resiliency Program reduces a corporation’s reliance on Recovery and Contingency.

     Recovery is a journey where arrangements are made to recover or restore critical and less critical business functions that fail for some reason. It is known as “Disaster Recovery”. Recovery may involve data loss to a point in time and production downtime as acceptable outcomes. Organizations understand the value of having a good DR plan, but recognize that it is complex and difficult to validate.

     Contingency is a last resort where the organization establishes a generalized capability and readiness to cope effectively with whatever major incidents and disasters occur, including unforeseen ones. Contingency preparations are a last-resort response if resilience and recovery arrangements should prove inadequate in practice. Contingencies involve data loss and production downtime as acceptable outcomes.
  3. Corporations do not address Resiliency effectively, leaving a hole in their overall Business Continuity programs. They continue to spend considerable sums in areas with little to no return, and those areas are not 100% effective during significant business-disrupting events. Because Resiliency is not well understood and not effectively implemented, the result is over-compensation on Recovery and Contingency. That over-compensation does not give a great return on investment, and Recovery and Contingency arrangements do not always work during survival events. The approach is reactive at best.
Resiliency is a superset of Recovery and Contingency. A mature corporation that has Resiliency relies less on Recovery and Contingency. Resiliency ensures that a corporation can move its capability and capacity seamlessly around the environment, without significant impact to production. Resiliency is proactive in dealing with significant business survival events (power outages, failures, pandemic, natural disaster, social unrest, etc.).
  4. Disaster Recovery (DR) and Contingency Planning (CP) are necessary for audit and BCP planning, but their value is not realized during actual disasters or business-disrupting events. The current IT spend and effort on DR and CP do not return significant value in improving IT Infrastructure/Application environments. DR and CP continue to be necessary for audit and BCP planning; Resiliency, however, is practiced in production and under load, proving the agility of the Infrastructure/Application environment as it adapts to actual disasters or business-disrupting events. By focusing more on Resiliency (a superset of DR and CP), IT teams can reduce effort and cost, because DR and CP goals are met through the Resiliency implementation.
  5. RTO - The Recovery Time Objective is the maximum time a business can be out of service. A Business Impact Analysis is used by the Business to determine its RTO. By extension, any application or infrastructure required by that business to meet its RTO must be recovered within the same timeframe.
RPO - The Recovery Point Objective is the point in time to which systems and data are expected to be recovered after recovery from an outage has been completed (e.g. end of day, end of previous day, or within minutes of the outage), to limit the loss of data to within tolerable levels as determined by the LOB. Restoring applications to the agreed RPO must form part of the RTO.
RTC - The Return To Capacity is the measurement of the capability to recover people (business) or technology (application and infrastructure). The RTC is measured from the point of invocation/declaration and does not include the assessment before the declaration. It combines infrastructure recovery time, application recovery time and any activities that need to be performed prior to active usage of the application, such as business verification and reconciliation time. This includes the time required to restore to the defined Recovery Point Objective.
Key components that occur in an actual event are measured to calculate the RTC as follows:
People/Business Recovery - any or all of the following may be components of the recovery and should be assessed as part of the RTC:
* Travel time (to recovery site, to home, etc.)
* Site set-up time, including clearing of existing users, removal/installation of equipment, etc.
* Closedown of non-critical activities for work transfer solutions
* Activation as required of voice diverts, voice recording, printer diverts, etc.
* Where needed, failover and/or of infrastructure such as shared data
Technology Recovery - the following components should be measured as part of the RTC:
* If applicable, time for support staff to commute
* Tape shipment (Iron Mountain, etc.) and/or transfer of any vital records necessary for recovery, if applicable
* Infrastructure recovery time (usually performed by GTI, e.g., Network failover, storage failover, DNS pushes)
* Server recovery time (may be performed by GTI or more often by Business Aligned Infrastructure, e.g., SA recovery, DB recovery)
* Application recovery time (usually performed by AD teams to prepare, finalize and validate the application)
* Business verification/reconciliation time (any activities that need to be performed by the business prior to active usage of the application)
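The RTC calculation described above can be sketched as a sum of measured component times, which is then compared against the RTO. This is a minimal Python illustration, not a prescribed method: the class and function names are hypothetical, every duration is an invented placeholder, and the steps are assumed to run one after another (overlapping steps would need a different calculation).

```python
from dataclasses import dataclass


@dataclass
class RecoveryStep:
    """One measured component of a recovery (hypothetical structure)."""
    name: str
    minutes: float


def return_to_capacity(steps):
    """RTC in minutes, measured from declaration.

    Assumes the steps run sequentially, so the durations simply add up.
    Assessment time before the declaration is deliberately not included.
    """
    return sum(step.minutes for step in steps)


def meets_rto(rtc_minutes, rto_minutes):
    """True when the measured RTC fits within the agreed RTO."""
    return rtc_minutes <= rto_minutes


# Placeholder durations for a technology recovery; not measured data.
technology_steps = [
    RecoveryStep("Support staff commute", 30),
    RecoveryStep("Infrastructure recovery", 45),
    RecoveryStep("Server recovery", 60),
    RecoveryStep("Application recovery", 40),
    RecoveryStep("Business verification/reconciliation", 35),
]

RTO_MINUTES = 240  # hypothetical RTO taken from a Business Impact Analysis

rtc = return_to_capacity(technology_steps)

if __name__ == "__main__":
    for step in technology_steps:
        print(f"{step.name}: {step.minutes} min")
    print(f"RTC: {rtc} min, RTO: {RTO_MINUTES} min, "
          f"within RTO: {meets_rto(rtc, RTO_MINUTES)}")
```

Recording each component separately, rather than only the total, shows which step consumes most of the RTO when a test or an actual event runs long.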