2015-01-13 Resiliency (v04)

Resiliency
John Mymryk
December 15, 2014

Contents
1) Types of Business Impacts (Outages) & Their Costs
2) Three Foundational Pillars of Business Continuity
3) Problem Statement
4) High Availability & Sustained Resiliency
5) Our Methodology
6) Value Proposition
7) Appendix

Types of Business Impacts (Outages) & Their Costs
* Sources: www.symantec.com; www.informationweek.com; www.businesscomputingworld.co.uk; www.evolven.com; www.quorum.net
Corporations implement a Business
Continuity Program (BCP) to address these
types of outages as they directly impact the
bottom line.
* Lost Labor: $46,000,000
(Per 10,000 person company @ 1.6 hr/wk)
* Lost Revenue: $26,500,000,000
(Survey across 200 companies)
* Brand Failure: $ ??
RIM (Blackberry)
Revenue lost in a single outage can run into the millions of dollars. An outage
may also trigger a brand failure (e.g. the BlackBerry (RIM) outage, ~$100M).
Important! Most outages are caused by hardware failure or human error; only a
very small percentage of all outages (5%) are natural disasters.
What are these Outages? (Yearly Combined)
77%
What are the Costs? Here are some Data Points:
So, What are the key elements of a BCP Program?
* Losses are estimated at $1,200,000,000,000 (1.2 trillion dollars) annually
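The lost-labor data point above can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes the $46,000,000 figure is an annual total and that the 1.6 hr/wk applies to all 10,000 employees over 52 weeks. Neither assumption is stated on the slide, and the function names are hypothetical.

```python
# Assumption: the slide's lost-labor figure is annual and spans 52 working weeks.
WEEKS_PER_YEAR = 52


def annual_lost_hours(employees: int, hours_per_week: float) -> float:
    """Total hours lost per year across the whole company."""
    return employees * hours_per_week * WEEKS_PER_YEAR


def implied_hourly_cost(annual_loss: float, employees: int, hours_per_week: float) -> float:
    """Loaded hourly labor cost implied by a stated annual loss."""
    return annual_loss / annual_lost_hours(employees, hours_per_week)


# Figures taken from the slide: 10,000 people, 1.6 hr/wk, $46,000,000.
hours = annual_lost_hours(10_000, 1.6)
rate = implied_hourly_cost(46_000_000, 10_000, 1.6)

print(f"Hours lost per year: {hours:,.0f}")   # roughly 832,000
print(f"Implied cost per hour: ${rate:,.2f}")  # roughly $55
```

Under these assumptions the slide's figure corresponds to a loaded labor cost of about $55 per hour, which is a plausible order of magnitude.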

Three Foundational Pillars of Business Continuity

Resiliency is a destination
* A state where critical business functions and the supporting infrastructure are unaffected by most outages.
* Resiliency is the ability of a corporation to move its capability and capacity seamlessly around its environment.

Recovery is a journey
* Also known as "Disaster Recovery" or DR, it is complex to maintain and difficult to implement.
* DR treats data loss back to a point in time, and production downtime, as acceptable outcomes.

Contingency is a last resort
* Establish a generalized capability and readiness to cope with major incidents and disasters. Not all are known.
* Contingencies treat data loss and production downtime as acceptable outcomes.

Increasing Resiliency efforts will naturally reduce the effort needed in Recovery & Contingency.

Problem Statement

Goal: Reset the BCP Balance

Business Continuity: Resiliency, Recovery, Contingency

Corporations leave a hole in their overall Business Continuity programs by spending considerable dollars on Recovery & Contingency (covering only ~5% of outages) with diminishing returns. This leaves a BCP with built-in downtime and data loss as acceptable outcomes.

* Resize Recovery & Contingency efforts appropriately
* Improve & increase Resiliency efforts

Resiliency is proactive in dealing with significant business survival events (hardware failure, human error, power outages, pandemic, natural disaster, social unrest, etc.).

Resiliency is a superset of Recovery and Contingency: it leverages the established processes and procedures used in Recovery and Contingency.

High Availability & Sustained Resiliency

Resilience: Critical business functions and the supporting infrastructure are designed and engineered in such a way that they are materially unaffected by most disruptions, for example through the use of redundancy and spare capacity. There are two (2) methods to do this:
1. High Availability (HA): Component availability, which can be inter-site or intra-site.
2. Sustained Resiliency (SR): Moving capacity & capability seamlessly around the physical environment.

Tier 1 - Business Application
* Tier 0 - Load Balancer Infrastructure
* Tier 0 - Physical Server Infrastructure
* Tier 0 - DB Servers & DB Infrastructure
* Tier 0 - Directories (LDAP, AD, ...)
* Tier 0 - Storage Infrastructure
* Tier 0 - Virtualization Infrastructure
Identified HA/SR Gaps

A compounding and/or cascading failure can occur when many HA/SR gaps are concentrated. The value is to find those HA/SR gaps and address them.

Benefits of HA/SR
Identifying & resolving HA/SR gaps reduces:
* Infrastructure Failures
* Unplanned Outages
* Tangible Loss
* Intangible Loss
Provider Confidence
Regulatory Fines

Our Resiliency Methodology

Phases: Investigate, Assess, Validate

Steps (numbered 1 to 11 on the slide diagram):
* Applications Submitted for Assessment or Review
* Perform Assessment and Onboarding
* Develop Test Requirements and Objectives
* Develop Test
* Schedule Test
* Submit Events
* HA/SR Testing
* Feedback & Improvement
* Gap Exposure Risk, Value Assessment
* Application Testing Capability
* Gap Remediation

Any application (infrastructure, service, application or utility) that executes all 11 steps of the Resiliency Methodology would be considered mature in its Resiliency profile and, by extension, able to endure business-impacting (outage) events.

The Value – A Proven Resiliency Program

Contingency Planning, Disaster Recovery, Resiliency

Most Corporations' Current State
* Meets most audit requirements.
* Has great planning, but limited impact on improving the production environment's ability to sustain outages.
* Putting more effort into Disaster Recovery & Contingency Planning yields diminishing returns.

Implement Resiliency Methodology
* By focusing more on Resiliency, IT teams can reduce effort and cost, because DR and CP goals are met through the Resiliency implementation.
* The end result is a resilient application infrastructure.

Benefits: Lowered IT Effort, Meet SLAs (SR), Lower Outages (HA)

Appendix
• Anatomy of a Recovery Event

RTO, RPO, RTC, TTTR & BTTR (Visually) – Anatomy of a Recovery Event

[Figure: Timeline of Incident/Outage for an application's data.
Intervals: RPO, Recovery Time Objective (RTO), Technology Time To Recover (TTTR), Business Time To Resume (BTTR), Return To Capacity (RTC), Time To Fix, Fix It Outage Time, Recovery Outage Time.
Milestones: Incident Start, All Hands Fix, Business Decision Window, Take Action, Infra Ready, Data Available, Fixed (if applies), Business Resumption.
Data states: Good Data, Inconsistent Data, No Data, Rebuilding Data, Good Data; Area of Potential Data Loss.
Recovery tracks: Infrastructure Recovery, Application Recovery, Business Recovery, People / Facilities Recovery.]
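The intervals in the timeline can be illustrated with a small calculation. This is a minimal sketch under stated assumptions, not the author's definitions: the slide shows the intervals only visually. Here the data-loss window is measured from the last good data to the incident start, TTTR from incident start to data available, and BTTR from incident start to business resumption, with the RTO target compared against BTTR. The class, field names and timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryEvent:
    """Hypothetical milestone timestamps for one recovery event."""
    last_good_data: datetime       # last consistent copy before the incident
    incident_start: datetime
    data_available: datetime       # assumed end of Technology Time To Recover
    business_resumption: datetime  # assumed end of Business Time To Resume


def summarize(event: RecoveryEvent, rpo: timedelta, rto: timedelta) -> dict:
    """Compute the observed intervals and compare them with the targets."""
    data_loss = event.incident_start - event.last_good_data
    tttr = event.data_available - event.incident_start
    bttr = event.business_resumption - event.incident_start
    return {
        "data_loss_window": data_loss,
        "tttr": tttr,
        "bttr": bttr,
        "rpo_met": data_loss <= rpo,  # was the data loss within the RPO target?
        "rto_met": bttr <= rto,       # assumption: RTO is judged at business resumption
    }


# Made-up example timestamps, for illustration only.
event = RecoveryEvent(
    last_good_data=datetime(2015, 1, 13, 2, 0),
    incident_start=datetime(2015, 1, 13, 8, 0),
    data_available=datetime(2015, 1, 13, 12, 30),
    business_resumption=datetime(2015, 1, 13, 14, 0),
)
summary = summarize(event, rpo=timedelta(hours=4), rto=timedelta(hours=8))
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up example the six-hour data-loss window misses the four-hour RPO target, while the six hours to business resumption fall within the eight-hour RTO target.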
