AWS Office Hours: Disaster Recovery

Architectural insight on DR with AWS

Published in: Technology, Business
  • Disaster Recovery and Business Continuity Planning now an AWS Cloud Solution Provider

    KingsBridge Systems is now an AWS Partner providing disaster recovery and business continuity planning services that run on AWS EC2 instances. Please check out the Partner link at

    http://aws.amazon.com/solutions/solution-providers/disasterrecovery/

    You can also find out more about the AWS hosted SaaS solution at

    http://www.disasterrecovery.com/products/overview.foundation.html

    The first step in effective disaster recovery and business continuity is planning. Integrating your planning tool with physical cloud solutions can then help you reduce your business continuity recovery time objectives.

    I hope you will check out http://www.disasterrecovery.com upon reading this and call and ask us about our Amazon Web Services based solution.

    We are also looking to expand the integration of our solution with other AWS solution providers. If, upon reviewing our solution, you think there is a potential win-win-win situation, please contact us.

AWS Office Hours: Disaster Recovery

  1. 1. Disaster Recovery Office Hours<br />Attila Narin, June 30, 2011<br />
  2. 2. Introduction<br />Attila Narin, Sr. Manager, Solutions Architecture, EMEA<br />Based in Luxembourg<br />At Amazon for almost 7 years<br />About 4.5 years at AWS<br />Was member of EC2 Team before moving to Solutions Architecture<br />
  3. 3. Office Hours IS<br />Simply put, Office Hours is a program that enables a technical audience to interact with AWS technical experts.<br />We look to improve this program by soliciting feedback from Office Hours attendees. Please let us know what you would like to see.<br />
  4. 4. Office Hours is NOT<br />Support<br />If you have a problem with your current deployment please visit our forums or our support website http://aws.amazon.com/premiumsupport/<br />A place to find out about upcoming services<br />We do not typically disclose information about our services until they are available.<br />
  5. 5. Agenda<br />Disaster Recovery (DR) Concepts<br />AWS Features for DR<br />Example Architectural Patterns for DR<br />Solutions Providers for Backup and DR<br />Question and Answer <br />Please begin submitting questions now<br />
  6. 6. Disaster Recovery Overview<br />
  7. 7. Disaster Recovery Overview<br />What is Disaster Recovery (DR)?<br />Ability to recover from a disaster like fire, theft, physical destruction, large-scale events, etc.<br />The process of planning, preparing, rehearsing, testing, documenting, training, and updating the process itself<br />Goal: minimize business impact after disaster<br />Part of Business Continuity Planning (BCP)<br />
  8. 8. DR Objectives – Common Terms<br />RTO: Recovery Time Objective<br />Duration of time and service level within which a business process needs to be restored after a disaster in order to avoid unacceptable consequences<br />Example: 4 hours<br />RPO: Recovery Point Objective<br />Acceptable amount of data loss measured in time<br />Example: 2 minutes<br />
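
The two objectives on this slide can be checked mechanically against the timeline of an incident or a DR exercise. The following is a minimal sketch, not part of the original presentation; the function names and the example timestamps are hypothetical, and the 4 hour and 2 minute figures are simply the examples from the slide.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point: datetime, disaster: datetime) -> timedelta:
    """Time between the last recoverable copy of the data and the disaster."""
    return disaster - last_recovery_point


def downtime(disaster: datetime, restored: datetime) -> timedelta:
    """Time between the disaster and the moment service was restored."""
    return restored - disaster


def meets_objectives(last_recovery_point: datetime, disaster: datetime,
                     restored: datetime, rpo: timedelta, rto: timedelta) -> bool:
    """True when data loss stays within the RPO and downtime within the RTO."""
    return (data_loss_window(last_recovery_point, disaster) <= rpo
            and downtime(disaster, restored) <= rto)


# Hypothetical timeline using the slide's example objectives.
RPO = timedelta(minutes=2)
RTO = timedelta(hours=4)
last_copy = datetime(2011, 6, 30, 12, 0, 0)
disaster_at = datetime(2011, 6, 30, 12, 1, 30)
restored_at = datetime(2011, 6, 30, 15, 0, 0)
print(meets_objectives(last_copy, disaster_at, restored_at, RPO, RTO))
```

Running the same check after each DR exercise gives a simple pass or fail signal for whether the chosen pattern still satisfies what the business asked for.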
  9. 9. DR Planning<br />Business guides RTO/RPO<br />Based on financial impact<br />Based on continuity impact<br />etc.<br />IT seeks cost effective solutions to RTO and RPO<br />Tradeoff: Cost vs. RTO/RPO<br />
  10. 10. DR with AWS: Advantages<br />Infrastructure available when you need it<br />Multiple locations world wide<br />Various building blocks and services available<br />Fine control over cost vs. RTO/RPO<br />Ability to scale up when needed; automatable<br />No headache of provisioning physical infrastructure<br />Ability to effectively exercise your DR plan<br />Pay only for what you use<br />Several options available that don’t require provisioning of duplicate infrastructure<br />
  11. 11. AWS Features for Disaster Recovery<br />
  12. 12. AWS Features for DR<br />Amazon Simple Storage Service (S3)<br />Amazon Import/Export<br />Amazon Elastic Compute Cloud (EC2)<br />Amazon Machine Images (AMI)<br />Reserved Instances<br />Elastic IP Addresses<br />VM Import<br />Amazon Elastic Block Store (EBS) and Snapshots<br />Amazon CloudWatch<br />
  13. 13. AWS Features for DR<br />Multiple Regions and Availability Zones<br />Amazon Route 53<br />Amazon Virtual Private Cloud (VPC)<br />AWS CloudFormation<br />Amazon CloudWatch<br />APIs and various SDKs for automation<br />
  14. 14. Architectural Patterns for Disaster Recovery<br />
  15. 15. Architectural Patterns Overview<br />Variety of approaches exist<br />Tradeoff between RTO/RPO vs. cost and complexity<br />Example Architectural Patterns<br />(sorted by increasingly optimal RTO/RPO)<br />Backup and Restore<br />“Pilot Light” for Quick Recovery<br />Fully Working Low Capacity Standby<br />Multi-Site Hot Standby<br />Virtual Workstations<br />Best Practices for Being Prepared<br />
  16. 16. Backup and Restore<br />Advantages<br />Simple to get started<br />Extremely cost effective (mostly backup storage)<br />Preparation Phase<br />Take backups of current systems<br />Store backups in S3<br />Describe procedure to restore from backup on AWS<br />Know which AMI to use, build your own as needed<br />Know how to restore system from backups<br />Know how to switch to new system<br />Know how to configure the deployment<br />FREE Inbound Data Transfer<br />starting July 1st, 2011<br />
  17. 17. Backup to S3<br />www.example.com<br />Amazon Route 53<br />Customer Infrastructure<br />Data copied to S3<br />Traditional server<br />Bucket <br />with Objects<br />AWS Import/Export<br />
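
The "Data copied to S3" step could be scripted along the following lines. This is a sketch under stated assumptions, not the presenter's method: it assumes a boto3-style S3 client is passed in (its `upload_file(Filename, Bucket, Key)` method is the only call used), and the key layout, bucket name, and function names are hypothetical.

```python
from datetime import datetime


def backup_key(hostname: str, taken_at: datetime) -> str:
    """Build a date-ordered object key (hypothetical layout) for one backup."""
    return f"backups/{hostname}/{taken_at:%Y/%m/%d/%H%M%S}.tar.gz"


def upload_backup(s3_client, archive_path: str, bucket: str,
                  hostname: str, taken_at: datetime) -> str:
    """Copy a local backup archive into an S3 bucket and return its key.

    s3_client is assumed to behave like boto3.client("s3").
    """
    key = backup_key(hostname, taken_at)
    s3_client.upload_file(archive_path, bucket, key)
    return key


# Example key for a backup taken on the day of the presentation.
print(backup_key("web01", datetime(2011, 6, 30, 14, 5, 9)))
```

Keeping the timestamp in the key makes "time since last backup", which is the RPO for this pattern, easy to read off a bucket listing.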
  18. 18. Backup and Restore<br />In Case of Disaster<br />Retrieve backups from S3<br />Bring up required infrastructure<br />EC2 instances with prepared AMIs, Load Balancing, etc.<br />Restore system from backup<br />Switch over to the new system<br />Adjust DNS records to point to AWS<br />Objectives<br />RTO: as long as it takes to bring up infrastructure and restore system from backups<br />RPO: time since last backup<br />
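
The "Adjust DNS records to point to AWS" step can be automated through the Route 53 API. The sketch below assumes a boto3-style Route 53 client (only `change_resource_record_sets` is called); the hosted zone ID, IP address, TTL, and function names are placeholders chosen for illustration.

```python
def failover_change_batch(record_name: str, new_ip: str, ttl: int = 60) -> dict:
    """Build a Route 53 change batch that repoints an A record at new_ip."""
    return {
        "Comment": "DR switch-over to AWS",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": new_ip}],
                },
            }
        ],
    }


def switch_dns(route53_client, hosted_zone_id: str, record_name: str, new_ip: str):
    """Submit the change; route53_client is assumed to be boto3.client("route53")."""
    return route53_client.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch=failover_change_batch(record_name, new_ip),
    )


batch = failover_change_batch("www.example.com.", "203.0.113.10")
print(batch["Changes"][0]["ResourceRecordSet"]["ResourceRecords"])
```

A short TTL on the record before any disaster occurs keeps the switch-over from being delayed by cached DNS answers, which otherwise adds to the RTO.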
  19. 19. Restore from S3 into AWS<br />www.example.com<br />Amazon Route 53<br />Data copied from objects in S3<br />Availability Zone<br />Amazon Elastic Compute Cloud (EC2)<br />EC2 quickly provisioned from AMI<br />Pre-bundled with OS and applications<br />Bucket <br />with Objects<br />AMI<br />
  20. 20. “Pilot Light” for Quick Recovery<br />Advantages<br />Reduced RTO and RPO<br />Very cost effective (very few 24/7 resources)<br />Preparation Phase<br />Enable replication of all critical data to AWS<br />Standby DB, replica, mirror, etc.<br />Reduced infrastructure that runs 24/7 in AWS<br />Prepare all required resources for automatic start<br />AMIs, Network Settings, Load Balancing, etc.<br />Only runs when used for DR<br />Reserved Instances<br />
  21. 21. “Pilot Light” in Non-DR Phase<br />Reverse Proxy / Caching Server<br />www.example.com<br />Application<br />Server<br />Reverse Proxy / Caching Server<br />Not Running<br />Application<br />Server<br />Database<br />Server<br />Database<br />Server<br />Smaller Instance<br />DataVolume<br />Data Mirroring / Replication<br />DataVolume<br />
  22. 22. “Pilot Light” for Quick Recovery<br />In Case of Disaster<br />Automatically bring up resources around the replicated core data set<br />Scale the system as needed to handle current production traffic<br />Switch over to the new system<br />Adjust DNS records to point to AWS<br />Objectives<br />RTO: as long as it takes to detect need for DR and automatically scale up replacement system<br />RPO: depends on replication type<br />
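
Bringing up resources around the replicated data and scaling them for production traffic might look roughly like this. It is a sketch only: it assumes a boto3-style EC2 client (using its `run_instances` call), and the AMI ID, instance type, and per-instance throughput figures are invented for illustration.

```python
import math


def instances_needed(peak_requests_per_sec: float,
                     requests_per_sec_per_instance: float) -> int:
    """Number of instances required to carry the given load (at least one)."""
    return max(1, math.ceil(peak_requests_per_sec / requests_per_sec_per_instance))


def launch_tier(ec2_client, ami_id: str, instance_type: str, count: int):
    """Start `count` instances from a prepared AMI.

    ec2_client is assumed to behave like boto3.client("ec2").
    """
    return ec2_client.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        MinCount=count,
        MaxCount=count,
    )


# Hypothetical sizing: 950 req/s of production traffic, 200 req/s per server.
print(instances_needed(950, 200))
```

Because the AMIs and network settings were prepared in advance, the recovery script only has to decide how many servers to start, which is what keeps the RTO short in this pattern.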
  23. 23. “Pilot Light” in Disaster Phase<br />Reverse Proxy / Caching Server<br />Reverse Proxy / Caching Server<br />www.example.com<br />Application<br />Server<br />Application<br />Server<br />Not Running<br />Database<br />Server<br />Database<br />Server<br />Smaller Instance<br />DataVolume<br />DataVolume<br />
  24. 24. “Pilot Light” in Recovered Phase<br />Reverse Proxy / Caching Server<br />www.example.com<br />Application<br />Server<br />Reverse Proxy / Caching Server<br />Start in Minutes<br />Application<br />Server<br />Database<br />Server<br />Database<br />Server<br />Resize Instance to Prod Capacity<br />DataVolume<br />DataVolume<br />
  25. 25. Fully Working Low Capacity Standby<br />Advantages<br />Can take some production traffic at any time<br />Cost savings (IT footprint smaller than full DR)<br />Preparation<br />Similar to “Pilot Light”<br />All necessary components running 24/7, but not scaled for production traffic<br />
  26. 26. Low Capacity Standby in Non-DR Phase<br />Reverse Proxy / Caching Server<br />www.example.com<br />Amazon Route 53<br />Not Active for Production Traffic<br />Active<br />Elastic Load<br />Balancer<br />Application<br />Server<br />On site<br />Reverse Proxy / Caching Server<br />Scaled down<br />Standby<br />Master Database<br />Server<br />Application<br />Server<br />Application Data Source Cut Over<br />Slave<br />Database<br />Server<br />DataVolume<br />Mirroring / Replication<br />DataVolume<br />
  27. 27. Fully Working Low Capacity Standby<br />In Case of Disaster<br />Immediately fail over most critical production load<br />Adjust DNS records to point to AWS<br />Scale the system further to handle all production load<br />Objectives<br />RTO: for critical load: as long as it takes to fail over; for all other load, as long as it takes to scale further<br />RPO: depends on replication type<br />
  28. 28. Standby Scaled Up in DR Phase<br />Reverse Proxy / Caching Server<br />www.example.com<br />Amazon Route 53<br />Active<br />Active<br />Application<br />Server<br />Elastic Load<br />Balancer<br />Reverse Proxy / Caching Server<br />Scaled up for<br />Production Load<br />Database<br />Server<br />Application<br />Server<br />DataVolume<br />Master<br />Database<br />Server<br />DataVolume<br />
  29. 29. Multi-Site Hot Standby<br />Advantages<br />At any moment can take all production load<br />Preparation<br />Similar to Low Capacity Standby<br />Fully scaling in/out with production load<br />In Case of Disaster<br />Immediately fail over all production load<br />Adjust DNS records to point to AWS<br />Objectives<br />RTO: as long as it takes to fail over<br />RPO: depends on replication type<br />
  30. 30. Multi-Site Hot Standby in Non-DR Phase<br />Reverse Proxy / Caching Server<br />www.example.com<br />Amazon Route 53<br />Active<br />Active<br />Elastic Load<br />Balancer<br />On site<br />Application<br />Server<br />Reverse Proxy / Caching Server<br />Master Database<br />Server<br />Application<br />Server<br />Application Data Source Cut Over<br />Slave<br />Database<br />Server<br />DataVolume<br />Mirroring / Replication<br />DataVolume<br />
  31. 31. Multi-Site Hot Standby in DR Phase<br />Reverse Proxy / Caching Server<br />www.example.com<br />Amazon Route 53<br />Active<br />Active<br />Elastic Load<br />Balancer<br />Application<br />Server<br />Reverse Proxy / Caching Server<br />Database<br />Server<br />Application<br />Server<br />Master<br />Database<br />Server<br />DataVolume<br />DataVolume<br />
  32. 32. Multi AZ HA Deployment<br />Reverse Proxy / Caching Server<br />Reverse Proxy / Caching Server<br />www.example.com<br />Amazon Route 53<br />Application<br />Server<br />Application<br />Server<br />Health Check Keeps working systems in service<br />Availability Zone A<br />Availability Zone B<br />Slave<br />Database<br />Server<br />Master Database<br />Server<br />Application Data Source Cut Over<br />DataVolume<br />DataVolume<br />Mirroring / Replication<br />
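
The "Health Check keeps working systems in service" idea on this diagram can be illustrated with a small piece of bookkeeping. This is a simplified sketch and not how any particular AWS service implements health checks; the threshold of two consecutive failures and the endpoint names are assumptions.

```python
def update_service_pool(failure_counts: dict, probe_results: dict,
                        unhealthy_threshold: int = 2) -> list:
    """Return the endpoints that should keep receiving traffic.

    failure_counts is updated in place with consecutive failures per endpoint.
    probe_results maps each endpoint to True (healthy) or False (failed probe).
    An endpoint leaves service once it fails unhealthy_threshold probes in a row.
    """
    in_service = []
    for endpoint, healthy in probe_results.items():
        if healthy:
            failure_counts[endpoint] = 0
        else:
            failure_counts[endpoint] = failure_counts.get(endpoint, 0) + 1
        if failure_counts[endpoint] < unhealthy_threshold:
            in_service.append(endpoint)
    return in_service


counts = {}
print(update_service_pool(counts, {"az-a": True, "az-b": False}))
print(update_service_pool(counts, {"az-a": True, "az-b": False}))
```

Requiring more than one failed probe avoids taking an Availability Zone out of rotation because of a single dropped request, at the cost of slightly slower detection.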
  33. 33. Hosted Workstations<br />Advantages<br />Replacement of workstations in case of disaster<br />Pay only when used for DR<br />Preparation<br />Set up AMIs with appropriate working environment<br />In Case of Disaster<br />Launch desktop AMI and resume work<br />Objectives<br />RTO: as long as it takes to launch AMI and restore work environment on virtual desktop<br />RPO: depends on state of AMI<br />
  34. 34. Best Practices for Being Prepared<br />Start simple and work your way up<br />Backups in AWS as a first step<br />Improve RTO/RPO as a continuous effort<br />Exercise your DR Solution<br />Game Day<br />Ensure backups, snapshots, AMIs, etc. are working<br />Monitor your monitoring system<br />Check into Licensing<br />
  35. 35. Solutions Providers for Disaster Recovery<br />
  36. 36. http://aws.amazon.com/solutions/solution-providers/<br />http://aws.amazon.com/solutions/case-studies/<br />Solutions Providers<br />
  37. 37. http://aws.amazon.com/solutions/solution-providers/<br />http://aws.amazon.com/solutions/case-studies/<br />Managed Services Providers<br />
  38. 38. Conclusion – Advantages of DR with AWS<br />Various building blocks available<br />Fine control over cost vs. RTO/RPO<br />Ability to scale up when needed<br />Pay only for what you use and/or in case of DR<br />Ability to effectively exercise DR plan<br />Availability of multiple locations world wide<br />Hosted workstations possible<br />Variety of Solutions Providers<br />
  39. 39. Thank You!<br />…and special thanks to Ianni Vamvadelis and Glen Robinson for their help preparing this presentation!<br />
  40. 40. Question & Answer<br />Visit http://aws.amazon.com/officehours to watch recorded sessions and to sign up for upcoming sessions.<br />
