Cloudifying High Availability
The Case for Elastic Disaster Recovery
Ali Hodroj
Senior Solutions Architect

Cloudifying High Availability: The Case for Elastic Disaster Recovery


Elastic DR: a solution architecture that aims to optimally balance cost and recovery time via three core principles that are germane to the cloud world:

On-Demand: The disaster recovery cloud can be provisioned on any availability zone, region, or public/private cloud through Cloudify's cloud-agnostic bootstrapping mechanism.

Elastic: The ability to automatically provision resources in the recovery cloud in case of disaster while eliminating the need for idle resources in normal scenarios, thereby fully profiting from the pay-per-use pricing model of clouds.

Flexible RTO/RPO: The architecture can be easily extended from a warm DR to a hot DR pattern by enabling or disabling application recipes. This allows us to exploit the economies of scale that the cloud provides by matching the number of recipes/tiers to provision (in the recovery cloud) against the recovery time/point objective for our disaster recovery strategy.
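As a rough illustration of the recipe toggle described above, the following Python sketch models a DR profile as the set of recipes kept running in the recovery cloud. The recipe names follow the four tiers in the case study slides; `PROFILES` and `start_on_failover` are hypothetical names for this sketch and are not part of Cloudify.

```python
# The four application tiers (recipes) from the case study architecture.
RECIPES = ("mod_cluster", "jboss", "postgresql", "cassandra")

# Hypothetical DR profiles: which recipes are already running in the
# recovery cloud before any disaster happens.
PROFILES = {
    "warm": {"postgresql"},   # only the PostgreSQL slave is pre-provisioned
    "hot": set(RECIPES),      # every tier is pre-provisioned
}


def start_on_failover(profile: str) -> list[str]:
    """Return the recipes that still have to be provisioned at failover time."""
    running = PROFILES[profile]
    return [recipe for recipe in RECIPES if recipe not in running]


warm_todo = start_on_failover("warm")
hot_todo = start_on_failover("hot")
print(warm_todo)
print(hot_todo)
```

The fewer recipes left to start at failover, the lower the recovery time and the higher the standing cost, which is the trade-off the profile choice controls.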

Published in: Technology, Business
  • The tolerance for RTO and RPO varies from industry to industry. Financial institutions, for example, require services back online in minutes rather than hours. Even more critically, healthcare providers require emergency response immediately. Other industries can afford to be down for 24 hours without access to IT. Organizations that cannot afford to lose more than a single minute’s worth of transactional data must have strategies that include clustering or high availability, where online data is captured in real time in both the production and backup environments. Other organizations might find that tape backup programs supply ample data protection.
  • Servers store a wide variety of data types from different applications. Server data can be classified by its impact on business operations:
      - Mission critical: producing revenue or customer-facing
      - Business critical: supporting cross-organization functions
      - Operationally critical: important to individual departments
  • Cloudifying High Availability: The Case for Elastic Disaster Recovery

    1. Cloudifying High Availability: The Case for Elastic Disaster Recovery
       Ali Hodroj, Senior Solutions Architect
    2. DISASTER RECOVERY, THE CONTEXT
       Disaster Recovery Key Drivers
       • RTO (Recovery Time Objective)
       • RPO (Recovery Point Objective)
       • Cost
       Disaster Recovery Constraints
       • Accomplishing high levels of redundancy is expensive
       • The hard cold reality for most businesses:
         - The cost of losing 24 hours of data is less than the cost of maintaining another active data center.
       • Determining an appropriate RPO and RTO is ultimately a financial calculation:
         - At what point does the cost of data loss and downtime exceed the cost of a backup strategy that will prevent that level of data loss and downtime?
       Copyright 2013 Gigaspaces. All Rights Reserved
    3. COST OF AN RTO = 1 HOUR
       Average cost per hour of downtime by industry

       Industry                                   Cost/Hour
       Finance (Brokerage Operations)             $5.15 million
       Finance (Credit Card Authorizations)       $3.10 million
       Telecom                                    $2.00 million
       Manufacturing                              $1.60 million
       Online Retail                              $613,000
       Communications (ISP)                       $90,000
       Media (Ticket Sales)                       $90,000
       Transportation                             $89,000
       Transportation (Packaging and Shipping)    $28,000

       Source: “The Meta Group & Contingency Planning Research”
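The "financial calculation" behind choosing an RTO can be sketched in a few lines of Python. This is a minimal sketch, not the presenter's method: the hourly figures are copied from slide 3, while the $100,000 DR budget and the 24-hour versus 1-hour recovery times are hypothetical inputs chosen only for illustration.

```python
# Cost per hour of downtime in USD, copied from the slide 3 figures.
COST_PER_HOUR = {
    "Finance (Brokerage Operations)": 5_150_000,
    "Online Retail": 613_000,
    "Transportation (Packaging and Shipping)": 28_000,
}


def downtime_cost(cost_per_hour: float, outage_hours: float) -> float:
    """Cost of one outage that lasts outage_hours."""
    return cost_per_hour * outage_hours


def breakeven_outages(dr_cost: float, cost_per_hour: float,
                      hours_without_dr: float, hours_with_dr: float) -> float:
    """Outages per budget period at which the DR spend pays for itself."""
    saved_per_outage = downtime_cost(cost_per_hour,
                                     hours_without_dr - hours_with_dr)
    return dr_cost / saved_per_outage


# Hypothetical inputs: a $100,000 DR budget that cuts recovery from 24 h to 1 h.
retail = COST_PER_HOUR["Online Retail"]
one_hour_outage = downtime_cost(retail, 1)
breakeven = breakeven_outages(dr_cost=100_000, cost_per_hour=retail,
                              hours_without_dr=24, hours_with_dr=1)
print(f"One hour of downtime: ${one_hour_outage:,}")
print(f"Break-even outages per period: {breakeven:.4f}")
```

Under these assumed inputs the DR spend is justified if the expected number of outages per budget period exceeds the break-even value; a business with a much lower hourly cost reaches the opposite conclusion with the same formula.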
    4. COST OF TESTING DR PROCEDURES
       In cloud computing, there’s even a third Time Objective metric:
       TTO (Testing Time Objective): the time required to test recovery plans to ensure a successful failover in case of disaster
    5. WHAT IS ELASTIC ON-DEMAND DISASTER RECOVERY?
       On-Demand
       • Deployable on any cloud any time
         - Public, Private, Bare-metal
         - Multi-zone, Multi-region out of the box
       • Fail over and provision applications
         - Automatically (polling master cloud)
         - Ad-hoc: Shell command, REST API
       • TTO
         - Easily provision a test environment for failover on a micro cloud (laptop)
         - Test failure frequently, and test often, ensuring highest resiliency (similar to Netflix)
       Elastic
       • RTO / RPO
         - Easily configurable recipes to increase/decrease RTO and RPO
       • Cost
         - Pay only for failures without compromising RTO/RPO
         - Leverage cloud economics (any cloud)
    6. High Availability
    7. CLOUD HIGH AVAILABILITY: MATURITY MODEL
       Single Points of Failure
       • Single server instance, same data center
       • Same geographical region
       • Same operational procedures, provider
    8. CLOUD HIGH AVAILABILITY: THE REALITY
       • Consistent deployment
       • Cross zone configuration
       • Machine images, security groups, keys
       • Different API, zone/region hierarchies
       Accidental Complexity: The higher we move in the HA scale, the less manageable the deployments become
    9. CLOUD HIGH AVAILABILITY: THROUGH CLOUDIFY
       Availability
       • Across clouds (AWS, Rackspace, Azure, etc.): 1 application + overrides, several cloud drivers
       • Across AWS regions: 1 application + overrides, 1 cloud driver
       • Across AWS zones: 1 application + overrides, 1 cloud driver
       Same application and service recipe
       Single recipe, deployable on-demand on any data center, zone, region, or cloud
    10. ON-DEMAND, ON ANY CLOUD
    11. Disaster Recovery
    12. ELASTIC ON-DEMAND DISASTER RECOVERY
        • Problem
          - Can we eliminate the RTO vs. Cost trade-off in the cloud?
        • Solution (Elastic DR)
          - A hybrid between Hot and Warm DR
          - Switch to the Active site in a matter of seconds through cloud-agnostic lifecycle automation recipes
    13. ELASTIC ON-DEMAND DISASTER RECOVERY: CONTEXT
        Applying the cloud principle of Elasticity to Disaster Recovery
        • Cold/Warm Disaster Recovery: High RTO, Low Cost
        • Elastic DR
        • Hot Disaster Recovery: Low RTO, High Cost
        Recovery time objective (RTO): The duration of time and the service level to which a business process must be restored after a disaster (or disruption) to avoid unacceptable consequences associated with a break in business continuity.
    14. ELASTICITY VS CRITICALITY CONTINUUM
        • Cold/Warm Disaster Recovery: High RTO, Low Cost
        • Elastic DR
        • Hot Disaster Recovery: Low RTO, High Cost
        Criticality: Operationally Critical, Business Critical, Mission Critical
        XAP WAN Gateway
    15. Case Study
    16. CASE STUDY: CLOUDIFY CUSTOMER
        Problem
        • Technology-based concrete process control and information service
        • Deployments across North America
        • Bi-directional messaging and data transfer from web UI, mobile devices
        • NoSQL and Relational data stores for reporting/analytics
        • Lacking disaster recovery and high availability aspects
        Solution
        • High Availability: auto scaling, self healing; cross-zone, region, and cloud redundancy
        • Data Replication: automated lifecycle management of PostgreSQL + Cassandra replication
        • Disaster Recovery: Elastic Disaster Recovery pattern
    17. SAMPLE (INITIAL) ARCHITECTURE
        Availability region (US-West: Oregon), 4 recipes
        • Traffic enters from the Internet
        • EC2 instances: mod_cluster, JBoss, PostgreSQL, Cassandra
        • Two data volumes
    18. EXTENDED ARCHITECTURE: CLOUDIFY DR SCENARIO
        Initial deployment
        • Cloud #1, Region (US-West Oregon): App Servers, PostgreSQL
        • Cloud #2, Region (US-East Virginia): PostgreSQL
        • Liveness poll
        Upon initial deployment, the primary deployment of the application is bootstrapped onto cloud #1. Another, slightly modified application recipe is bootstrapped as cloud #2, which polls cloud #1 for failure and acts as a PostgreSQL database slave.
        Region failure occurs
        • Cloud #2, Region (US-East Virginia): turn the Postgres slave into master, start app server instances*
        • Cloud #3, Region (US-West California): PostgreSQL. Bootstrap another cloud in a different region using the same application recipe used to bootstrap cloud #2 above*
        • Liveness poll
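The liveness-poll failover in this scenario can be sketched as a small polling loop. This is a minimal Python sketch under stated assumptions, not Cloudify's implementation: `watch_primary`, the three-miss threshold, and the step names are hypothetical, and the probe is simulated so the example runs without any cloud access.

```python
import time
from typing import Callable


def watch_primary(is_alive: Callable[[], bool],
                  on_failover: Callable[[], None],
                  max_misses: int = 3,
                  interval_s: float = 5.0,
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll the primary cloud; after max_misses consecutive failed polls,
    run the failover callback. Returns the number of polls performed."""
    misses = 0
    polls = 0
    while misses < max_misses:
        polls += 1
        if is_alive():
            misses = 0          # any successful poll resets the counter
        else:
            misses += 1
        if misses < max_misses:
            sleep(interval_s)   # wait before the next poll
    on_failover()
    return polls


# Simulated probe: cloud #1 answers twice, then the region goes down.
responses = iter([True, True, False, False, False])
steps: list[str] = []


def fail_over() -> None:
    # Hypothetical step names mirroring the slide's failover sequence.
    steps.append("promote_postgres_slave_to_master")
    steps.append("start_app_server_instances")
    steps.append("bootstrap_standby_cloud_in_another_region")


polls = watch_primary(lambda: next(responses), fail_over,
                      sleep=lambda seconds: None)
print(polls, steps)
```

Requiring several consecutive misses before failing over is a design choice in this sketch to avoid promoting the standby on a single dropped poll; the slides do not state how Cloudify decides that the primary is down.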
    19. DR and Cloud Economics
    20. ELASTIC ON-DEMAND DR: COSTS*

        Site                      Cost
        Main Site (US-West)       $82,068
        Warm DR Site (US-East)    $12,625
        Hot DR Site               $82,068

        • Main Site: 1 load balancer, 2 JBoss instances, 1 PostgreSQL master, 3 Cassandra
        • DR Site: 1 PostgreSQL slave; all other instances start on demand upon failover
        What if we deploy on different clouds?
        *Costs calculated using http://planforcloud.com
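The arithmetic behind the warm versus hot comparison on this slide can be worked through directly. The three dollar figures come from the slide; the slide does not state the billing period they cover, so the sketch below treats them as costs over the same unspecified period.

```python
# Figures from the cost slide (USD, same unspecified billing period).
MAIN_SITE = 82_068
WARM_DR_SITE = 12_625
HOT_DR_SITE = 82_068

warm_total = MAIN_SITE + WARM_DR_SITE   # main site plus warm standby
hot_total = MAIN_SITE + HOT_DR_SITE     # main site plus full hot replica
savings = hot_total - warm_total        # what the warm pattern avoids paying
warm_overhead = WARM_DR_SITE / MAIN_SITE  # DR cost as a fraction of the main site

print(f"Warm total: ${warm_total:,}")
print(f"Hot total:  ${hot_total:,}")
print(f"Savings:    ${savings:,}")
print(f"Warm DR overhead: {warm_overhead:.1%}")
```

With these figures the warm DR site adds roughly 15% to the main site cost, whereas the hot DR site doubles it.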
    21. ELASTIC DR: WARM DR COST, CLOUD PORTABILITY
        DR Site (4 recipes): $12k
        Same recipe: $14k, $6k, $5k, $9k
    22. ELASTIC DR: HOT DR COST
        DR Site (4 recipes): $82k
        Same recipe: $79k, $115k, $68k, $91k
    23. Demo: AWS DR across regions
        http://www.youtube.com/watch?v=U-PdZe1g_yw
