Disaster Recovery Site on AWS - Minimal Cost Maximum Efficiency (STG305) | AWS re:Invent 2013
 


Implementation of a disaster recovery (DR) site is crucial for the business continuity of any enterprise. Due to the fundamental nature of features like elasticity, scalability, and geographic distribution, DR implementation on AWS can be done at 10-50% of the conventional cost. In this session, we do a deep dive into proven DR architectures on AWS and the best practices, tools and techniques to get the most out of them.

Presentation Transcript

  • Disaster Recovery Site on AWS: Minimal Cost Maximum Efficiency Abdul Sathar Sait, Vikram Garlapati, and Kamal Arora (AWS) November 15, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • What you will learn
      • Why AWS for disaster recovery?
      • Common DR architectures
          – Pilot light architecture
              • Demo
              • Code walkthrough
          – Backup and restore
      • Customer case studies
      • Where to go next
  • Conventional disaster recovery sites
      • High cost
      • Low ROI
      • Implemented only for the most critical systems
      • Usually scaled down to 50% of production
      • Systems in a remote region are challenging
      • Costly software licenses based on hardware usage
  • Disaster recovery site on AWS
      • Unprecedented capabilities to implement DR sites
      • Easily set up DR sites in different geographic regions
      • Cut down DR site cost by up to 70%
      • Substantial savings on software licenses
  • Global reach from your desktop
  • Common DR architectures
      • Backup and restore
      • Pilot light
      • Warm standby
      • Hot standby
  • Pilot light architecture
  • Pilot light architecture Create instances from AMIs
  • Pilot light architecture: Build resources around replicated dataset
      • Keep ‘pilot light’ on by replicating core databases
      • Build AWS resources around dataset and leave in stopped state
  • Pilot light architecture
      • Build resources around replicated dataset
          – Keep ‘pilot light’ on by replicating core databases
          – Build AWS resources around dataset and leave in stopped state
      • Scale resources in AWS in response to a DR event
          – Start up pool of resources in AWS when events dictate
          – Scale up the database instance to handle production capacity
  • Pilot light architecture: Switchover to AWS
      • Make necessary DNS changes to redirect traffic to the DR site on AWS
  • Pilot Light DEMO
  • Simple DR solution – awsdrdemo.com
    [Diagram: Amazon Route 53 in front of two sites. Active, US East (N. Virginia): Elastic Load Balancing, web/app servers, Oracle master DB with data volume. Passive, scaled-down standby, US West (N. California): web/app server AMI (copied from the primary), Auto Scaling group, Oracle slave DB with data volume. Data replication is set up from the master DB to the slave DB.]
  • Simple DR solution – awsdrdemo.com: DNS failover
    [Diagram: the active site in US East (N. Virginia) is gone. Amazon Route 53 fails over to US West (N. California), which becomes active: Elastic Load Balancing, web/app servers scaled out by the Auto Scaling group, and the DB scaled up.]
  • Architecture
    [Diagram: awsdrdemo.com and failover.awsdrdemo.com, served through Amazon Route 53]
      • Active site, US East (N. Virginia)
          – ELB: DRDemoPrimaryELB52152634.us-east-1.elb.amazonaws.com
          – Web servers: i-36af5751
          – VPC ID: vpc-5f9ef53e
          – Subnet IDs: subnet-440c786c, subnet-289ef549, subnet-2c9ef54d
          – Primary database server: i-026aad65, private IP 174.168.1.11
          – Primary data volume
      • AMI copy (ami-996634f0) of the web/app server, from the active site to the DR site
      • Mirroring / replication from the primary data volume to the secondary DB data volume
      • Passive site (scaled-down standby), US West (N. California)
          – DR ELB: created on failover
          – Web servers: created on failover
          – Failover app instance: i-55cfde0e, Elastic IP 54.215.157.25
          – Webserver failover AMI, app AMI
          – VPC ID: vpc-a4f2efcc
          – Subnet IDs: subnet-bbf2efd3, subnet-884b01ce, subnet-bef2efd6
          – Secondary database server: i-3b266960, private IP 174.168.1.11
          – Secondary DB data volume
  • Demo – AWS Resources console.aws.amazon.com
  • Demo – Application awsdrdemo.com
  • Demo – Failover Kickoff failover.awsdrdemo.com
  • Demo – Failover Status Updates status.awsdrdemo.com/dr
  • Failover Steps
      • Launch Failover Application
      • Route 53 DNS Updates
      • Resize Target Database Instance
      • Go Live
      • AWS CloudFormation – Launch ELB
      • AWS CloudFormation – Launch web servers
  • Failover Application Architecture
    [Diagram, within an AWS Region: admin users, failover app, webserver AMI, CLI, SNS HTTP notification]
      (1) Trigger DR procedure
      (2) Invoke shell script
      (3) Launch CloudFormation
      (4) Script updates
      (5) CF updates
      (6) Real-time feed from SNS
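The flow above (run each failover step, push a status update after each one) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_failover`, the step names, and the `notify` callback are hypothetical stand-ins, and a real failover app would publish to Amazon SNS instead of appending to a list or printing.

```python
from typing import Callable, List, Tuple

# A step is a label plus a zero-argument callable that performs the work.
Step = Tuple[str, Callable[[], None]]


def run_failover(steps: List[Step], notify: Callable[[str], None]) -> bool:
    """Run the steps in order, reporting progress; stop at the first failure."""
    for name, action in steps:
        notify(f"STARTED: {name}")
        try:
            action()
        except Exception as exc:
            # Report the failure and abort so that later steps do not run.
            notify(f"FAILED: {name}: {exc}")
            return False
        notify(f"DONE: {name}")
    notify("Failover complete")
    return True
```

A caller might pass `print` as `notify` while testing, with steps such as "resize database", "launch stack" and "update DNS" wired to the corresponding scripts. That ordering is an assumption; the slide does not spell out the sequence.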
  • Metadata Requests
      // Sample code for a metadata request using .NET
      using System.IO;
      using System.Net;
      using System.Text;

      string uri = "http://169.254.169.254/latest/meta-data/placement/availability-zone";
      // Create the web request
      HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(uri);
      HttpWebResponse webresponse = (HttpWebResponse)webrequest.GetResponse();
      Encoding enc = Encoding.GetEncoding(1252);
      StreamReader loResponseStream = new StreamReader(webresponse.GetResponseStream(), enc);
      // Get the availability zone value
      string availzone = loResponseStream.ReadToEnd();
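For comparison, the same metadata lookup can be sketched with the Python standard library. The function names are hypothetical. `fetch_availability_zone` only works on an EC2 instance, and instances that enforce IMDSv2 would first need a session token, which this sketch omits. `region_from_zone` assumes a standard zone name such as `us-west-1a`.

```python
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/placement/availability-zone"


def fetch_availability_zone(timeout: float = 2.0) -> str:
    """Return the availability zone reported by the instance metadata service."""
    with urllib.request.urlopen(METADATA_URL, timeout=timeout) as response:
        return response.read().decode("ascii").strip()


def region_from_zone(zone: str) -> str:
    """Derive the region from a standard zone name by dropping the zone letter."""
    if not zone or not zone[-1].isalpha():
        raise ValueError(f"unexpected availability zone name: {zone!r}")
    return zone[:-1]
```

A failover script could use the derived region to decide which `--region` value to pass to later CLI calls.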
  • Amazon Route 53 Updates
    http://vrg.s3.amazonaws.com/downloads/route53.json
      # Retrieve existing ELB details from the Route 53 hosted zone
      domainname=www.awsdrdemo.com
      hostedzoneid="ZXXXXXXXXXXXXR"

      # Retrieve ELB alias zone-id from existing Route 53 zone
      zoneid=$(aws --region us-west-1 --output text route53 list-resource-record-sets --hosted-zone-id $hostedzoneid --start-record-name $domainname --start-record-type A --max-items 1 | grep ALIASTARGET | awk '{print $2}')
      dns=$(aws --region us-west-1 --output text route53 list-resource-record-sets --hosted-zone-id $hostedzoneid --start-record-name $domainname --start-record-type A --max-items 1 | grep ALIASTARGET | awk '{print $4}')

      # Apply the record changes described in route53.json
      aws --region us-west-1 route53 change-resource-record-sets --hosted-zone-id $hostedzoneid --change-batch file:///usr/local/bin/route53.json
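The contents of `route53.json` are not shown on the slide. As a rough sketch of what a change batch for `change-resource-record-sets` can look like, the Python below builds one that points an alias A record at a load balancer. The function name and the example values are hypothetical, and this is not necessarily the file the presenters used.

```python
import json


def build_alias_change_batch(record_name, elb_dns_name, elb_zone_id,
                             comment="DR failover"):
    """Build a Route 53 change batch that upserts an alias A record."""
    return {
        "Comment": comment,
        "Changes": [{
            "Action": "UPSERT",  # create the record, or replace it if present
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": elb_zone_id,  # the ELB's zone, not yours
                    "DNSName": elb_dns_name,
                    "EvaluateTargetHealth": False,
                },
            },
        }],
    }


# Hypothetical values, for illustration only.
batch = build_alias_change_batch(
    "www.awsdrdemo.com.", "dr-elb-123.us-west-1.elb.amazonaws.com", "ZEXAMPLE")
change_batch_json = json.dumps(batch, indent=2)
```

Writing `change_batch_json` to a file yields something that could be passed with `--change-batch file://...`.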
  • Resize Database Instance
      # Stop the DB instance for resizing
      aws --region us-west-1 ec2 stop-instances --instance-ids $dbInstanceId
      # Wait until the instance has stopped; the instance type can only be changed then
      aws --region us-west-1 ec2 wait instance-stopped --instance-ids $dbInstanceId
      # Publish Amazon SNS messages for actions
      aws --region us-west-1 sns publish --topic-arn $snsarn --message "Resizing the stopped instance"
      # Resize the DB instance
      aws --region us-west-1 ec2 modify-instance-attribute --instance-id $dbInstanceId --instance-type '{"Value": "m1.small"}'
      # Start the resized DB instance
      aws --region us-west-1 ec2 start-instances --instance-ids $dbInstanceId
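Because `stop-instances` returns before the instance has actually stopped, the order of the resize commands matters. A minimal Python sketch, assuming a hypothetical helper `resize_commands`, builds the sequence as argument lists. Passing lists to `subprocess.run` rather than one shell string sidesteps quoting of the JSON `--instance-type` value.

```python
import json
from typing import List


def resize_commands(region: str, instance_id: str,
                    instance_type: str) -> List[List[str]]:
    """Return the AWS CLI calls, in order, that resize a stopped instance."""
    base = ["aws", "--region", region, "ec2"]
    return [
        base + ["stop-instances", "--instance-ids", instance_id],
        # The type can only be changed once the instance has fully stopped.
        base + ["wait", "instance-stopped", "--instance-ids", instance_id],
        base + ["modify-instance-attribute", "--instance-id", instance_id,
                "--instance-type", json.dumps({"Value": instance_type})],
        base + ["start-instances", "--instance-ids", instance_id],
    ]
```

Each list could then be executed with `subprocess.run(cmd, check=True)`, so that a failed step stops the sequence.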
  • AWS CloudFormation Stack Launch
      # Launch DR stack using AWS CloudFormation script
      launchedstackid=$(aws --region us-west-1 --output text cloudformation create-stack --stack-name $stackname --template-body file:///usr/local/bin/ELBWithEC2Instances.template --notification-arns $snsarn --parameters ParameterKey="HostedZoneId",ParameterValue="$hostedzoneid")
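`create-stack` returns the new stack's ID, which is an ARN. As a small sketch (the function name and the sample ARN are hypothetical), the value captured in `launchedstackid` could be split into its parts like this, for example to log which region and stack a failover run touched:

```python
def parse_stack_id(stack_id: str) -> dict:
    """Split a CloudFormation stack ARN into region, account, name and UUID."""
    parts = stack_id.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "cloudformation":
        raise ValueError(f"not a CloudFormation ARN: {stack_id!r}")
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "stack":
        raise ValueError(f"not a stack resource: {parts[5]!r}")
    return {
        "region": parts[3],
        "account": parts[4],
        "name": resource[1],
        "uuid": resource[2],
    }
```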
  • AWS CloudFormation Template
    http://vrg.s3.amazonaws.com/downloads/ELBWithEC2Instances.template
    (Slide callouts label the template sections: HEADERS, PARAMETERS, MAPPINGS, RESOURCES, OUTPUTS. The transcript ends at the OUTPUTS callout.)
      {
        "AWSTemplateFormatVersion" : "2010-09-09",
        "Description" : "AWS CloudFormation Template ELBWithEC2Instances: Create a load balanced, Auto Scaled sample website where the instances are locked down to only accept traffic from the load balancer. This script creates an Auto Scaling group behind a load balancer with a simple health check. The web site is available on port 80, however, the instances can be configured to listen on any port (8888 by default).",
        "Parameters" : {
          "KeyPairName" : {
            "Description" : "Name of an existing Amazon EC2 key pair for SSH access",
            "Type" : "String",
            "Default" : "kamalkeydr"
          },
          "InstanceType" : {
            "Description" : "WebServer EC2 instance type",
            "Type" : "String",
            "Default" : "m1.small",
            "AllowedValues" : [ "t1.micro", "m1.small", "m1.medium", "m1.large", "m1.xlarge", "m2.xlarge", "m2.2xlarge", "m2.4xlarge", "c1.medium", "c1.xlarge", "cc1.4xlarge", "cc2.8xlarge", "cg1.4xlarge" ],
            "ConstraintDescription" : "must be a valid EC2 instance type."
          },
          "WebServerPort" : {
            "Description" : "TCP/IP port of the web server",
            "Type" : "String",
            "Default" : "80"
          },
          "HostedZoneId" : {
            "Type" : "String",
            "Description" : "The Record Set's Hosted Zone Id for the existing hosted zone",
            "Default" : "Z1M58G0W56PQJA"
          }
        },
        "Mappings" : {
          "AWSInstanceType2Arch" : {
            "t1.micro" : { "Arch" : "64" },
            "m1.small" : { "Arch" : "64" },
            "m1.medium" : { "Arch" : "64" },
            "m1.large" : { "Arch" : "64" },
            "m1.xlarge" : { "Arch" : "64" },
            "m2.xlarge" : { "Arch" : "64" },
            "m2.2xlarge" : { "Arch" : "64" },
            "m2.4xlarge" : { "Arch" : "64" },
            "c1.medium" : { "Arch" : "64" },
            "c1.xlarge" : { "Arch" : "64" }
          },
          "AWSRegionArch2AMI" : {
            "us-west-1" : { "32" : "ami-5e41761b", "64" : "ami-5e41761b" }
          }
        },
        "Resources" : {
          "WebServerGroup" : {
            "Type" : "AWS::AutoScaling::AutoScalingGroup",
            "Properties" : {
              "AvailabilityZones" : [ "us-west-1a" ],
              "LaunchConfigurationName" : { "Ref" : "LaunchConfig" },
              "MinSize" : "2",
              "MaxSize" : "2",
              "LoadBalancerNames" : [ { "Ref" : "ElasticLoadBalancer" } ],
              "VPCZoneIdentifier" : [ "subnet-bbf2efd3" ]
            }
          },
          "LaunchConfig" : {
            "Type" : "AWS::AutoScaling::LaunchConfiguration",
            "Properties" : {
              "ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" }, { "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" }, "Arch" ] } ] },
              "UserData" : { "Fn::Base64" : { "Ref" : "WebServerPort" } },
              "SecurityGroups" : [ { "Ref" : "InstanceSecurityGroup" } ],
              "InstanceType" : { "Ref" : "InstanceType" },
              "KeyName" : { "Ref" : "KeyPairName" },
              "AssociatePublicIpAddress" : "true"
            }
          },
          "ElasticLoadBalancer" : {
            "Type" : "AWS::ElasticLoadBalancing::LoadBalancer",
            "Properties" : {
              "SecurityGroups" : [ { "Ref" : "LoadBalancerSecurityGroup" } ],
              "Subnets" : [ "subnet-bbf2efd3" ],
              "Listeners" : [ { "LoadBalancerPort" : "80", "InstancePort" : { "Ref" : "WebServerPort" }, "Protocol" : "HTTP" } ],
              "HealthCheck" : {
                "Target" : { "Fn::Join" : [ "", [ "HTTP:", { "Ref" : "WebServerPort" }, "/" ] ] },
                "HealthyThreshold" : "2",
                "UnhealthyThreshold" : "10",
                "Interval" : "10",
                "Timeout" : "3"
              }
            }
          },
          "LoadBalancerSecurityGroup" : {
            "Type" : "AWS::EC2::SecurityGroup",
            "Properties" : {
              "GroupDescription" : "Enable HTTP access on port 80",
              "VpcId" : "vpc-a4f2efcc",
              "SecurityGroupIngress" : [ { "IpProtocol" : "tcp", "FromPort" : "80", "ToPort" : "80", "CidrIp" : "0.0.0.0/0" } ],
              "SecurityGroupEgress" : [ { "IpProtocol" : "tcp", "FromPort" : { "Ref" : "WebServerPort" }, "ToPort" : { "Ref" : "WebServerPort" }, "CidrIp" : "0.0.0.0/0" } ]
            }
          },
          "myDNS" : {
            "Type" : "AWS::Route53::RecordSetGroup",
            "Properties" : {
              "HostedZoneName" : "awsdrdemo.com.",
              "Comment" : "Zone apex alias targeted to myELB LoadBalancer.",
              "RecordSets" : [ {
                "Name" : "www.awsdrdemo.com.",
                "Type" : "A",
                "AliasTarget" : {
                  "HostedZoneId" : { "Fn::GetAtt" : [ "ElasticLoadBalancer", "CanonicalHostedZoneNameID" ] },
                  "DNSName" : { "Fn::GetAtt" : [ "ElasticLoadBalancer", "CanonicalHostedZoneName" ] }
                }
              } ]
            }
          },
          "InstanceSecurityGroup" : {
            "Type" : "AWS::EC2::SecurityGroup",
            "Properties" : {
              "GroupDescription" : "Enable SSH access and HTTP access on the inbound port",
              "VpcId" : "vpc-a4f2efcc",
              "SecurityGroupIngress" : [ { "IpProtocol" : "tcp", "FromPort" : { "Ref" : "WebServerPort" }, "ToPort" : { "Ref" : "WebServerPort" }, "CidrIp" : "0.0.0.0/0" } ]
            }
          }
        },
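To see what the template's `Ref` and `Fn::Join` expressions evaluate to, here is a deliberately tiny Python sketch. It is not how CloudFormation itself is implemented. The `resolve` function is hypothetical and only handles `Ref` to a parameter and `Fn::Join`; references to resources or pseudo parameters such as `AWS::Region` would raise `KeyError` unless supplied in `parameters`.

```python
def resolve(node, parameters):
    """Resolve Ref (parameters only) and Fn::Join in a template fragment."""
    if isinstance(node, dict) and len(node) == 1:
        if "Ref" in node:
            return parameters[node["Ref"]]
        if "Fn::Join" in node:
            delimiter, items = node["Fn::Join"]
            return delimiter.join(str(resolve(item, parameters)) for item in items)
    if isinstance(node, dict):
        return {key: resolve(value, parameters) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, parameters) for item in node]
    return node


# The health check target from the template, with the default port of 80.
health_target = resolve(
    {"Fn::Join": ["", ["HTTP:", {"Ref": "WebServerPort"}, "/"]]},
    {"WebServerPort": "80"},
)
```

With the template's default `WebServerPort` of `80`, `health_target` is `HTTP:80/`, so the load balancer probes the site root over HTTP.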
  • Parameters
      "Parameters" : {
        "KeyPairName" : {
          "Description" : "Name of an existing Amazon EC2 key pair for SSH access",
          "Type" : "String"
        },
        "InstanceType" : {
          "Description" : "WebServer EC2 instance type",
          "Type" : "String",
          "Default" : "m1.small",
          "AllowedValues" : [ "t1.micro", "m1.small", "m1.medium", "m1.large", "m1.xlarge", "m2.xlarge", "m2.2xlarge", "m2.4xlarge", "c1.medium", "c1.xlarge", "cc1.4xlarge", "cc2.8xlarge", "cg1.4xlarge" ],
          "ConstraintDescription" : "must be a valid EC2 instance type."
        },
        "HostedZoneId" : {
          "Type" : "String",
          "Description" : "The Record Set's Hosted Zone Id for the existing hosted zone"
        }
      }
  • Resources – Web Servers
      "WebServerGroup" : {
        "Type" : "AWS::AutoScaling::AutoScalingGroup",
        "Properties" : {
          "AvailabilityZones" : [ "us-west-1a" ],
          "LaunchConfigurationName" : { "Ref" : "LaunchConfig" },
          "MinSize" : "2",
          "MaxSize" : "2",
          "LoadBalancerNames" : [ { "Ref" : "ElasticLoadBalancer" } ],
          "VPCZoneIdentifier" : [ "subnet-bbf2efd3" ]
        }
      },
      "LaunchConfig" : {
        "Type" : "AWS::AutoScaling::LaunchConfiguration",
        "Properties" : {
          "ImageId" : { "Fn::FindInMap" : [ "AWSRegionArch2AMI", { "Ref" : "AWS::Region" }, { "Fn::FindInMap" : [ "AWSInstanceType2Arch", { "Ref" : "InstanceType" }, "Arch" ] } ] },
          "UserData" : { "Fn::Base64" : { "Ref" : "WebServerPort" } },
          "SecurityGroups" : [ { "Ref" : "InstanceSecurityGroup" } ],
          "KeyName" : { "Ref" : "KeyPairName" }
        }
      }
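The Parameters excerpt above omits `WebServerPort`, yet the web server resources reference it. It is declared in the full template, so only the slide excerpts are out of step. A mismatch like this is easy to catch mechanically. The sketch below, with hypothetical function names, lists every `Ref` in a template's resources that matches neither a declared parameter nor a resource, ignoring pseudo parameters such as `AWS::Region`.

```python
def find_refs(node):
    """Yield every name used in a {"Ref": name} expression."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Ref" and isinstance(value, str):
                yield value
            else:
                yield from find_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_refs(item)


def undefined_refs(template):
    """Return the referenced names that are neither parameters nor resources."""
    declared = set(template.get("Parameters", {})) | set(template.get("Resources", {}))
    used = set(find_refs(template.get("Resources", {})))
    # Pseudo parameters such as AWS::Region are always available.
    return sorted(name for name in used - declared if not name.startswith("AWS::"))
```

Running a check like this before `create-stack` is cheap, although CloudFormation's own template validation remains the authority.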
  • Demo – Failover Status Updates status.awsdrdemo.com/dr
  • Disaster recovery site on AWS can be for
      • Primary site on customer data center
      • Primary on AWS itself
  • Primary and DR sites on AWS
  • Backup & Restore pattern
      • Simple to get started
          – Easy starting point for exploring the AWS cloud
          – Low technical barrier to entry
          – Focus on incorporating cloud into your DR strategy, not on complex technical issues related to hot-hot systems
      • Cost-effective
          – Very high levels of data durability at low price
          – Cost of storing snapshots in Amazon S3
          – Archiving possibilities beyond tape using Amazon Glacier
  • Backup and restore
  • Backup and restore
      • Create instances from AMIs
      • Restore data from backups
  • Many ways to backup
  • Customer case study
  • We are sincerely eager to hear your feedback on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.