AWS and Disaster Recovery - Bixler

Published in: Technology

  1. 1. Disaster Recovery with Amazon Web Services
     Tim Bixler | tbixler@amazon.com
     Federal Solutions Architects Manager and Principal Solutions Architect
  2. 2. Let’s start by defining what we mean
     • Archiving is the process of moving data that is no longer actively used to a separate data storage device for long-term retention. Data archives typically are indexed and have search capabilities so that files and parts of files can be easily located and retrieved.
     • A backup, or the process of backing up, is making copies of data that may be used to restore the original after a data loss event. The primary purpose is to recover data after its loss, be it by data deletion or corruption. The secondary purpose of backups is to recover data from an earlier time.
     • Disaster recovery (DR) is the process, policies, and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.
  3. 3. Disaster Recovery became a hot topic for enterprises in 2011
     • Flooding and cyclone in Australia
     • Mudslides in Rio de Janeiro
     • Earthquake in New Zealand
     • Tsunami and flooding in Japan
     • Tornados and flooding in USA
     • Flooding in Taiwan
     The first six months saw $265 billion in economic losses [worldwide], well above the previous record of $220 billion (adjusted for inflation) set for all of 2005 (the year Hurricane Katrina struck)
     -- Munich Re, a multinational that insures insurance companies
  4. 4. But physical infrastructure for DR and archiving is expensive
     • Physical storage demands exploding
         – Network Attached Storage (NAS)
         – Storage Area Network (SAN)
     • DR physical second site is expensive
     • Geographic distribution is challenging
     • Reliable storage and retrieval is hard
  5. 5. We have storage solutions that match customer use cases
     [Diagram labels: Customer Data Center; Storage Use Cases (Block, File, Archive, Backup, Disaster Recovery); Internet; AWS Direct Connect; Web Services API over HTTP(S); AWS Cloud]
  6. 6. Symantec Support for Tapeless Cloud Backup
     [Diagram labels: Physical; Virtual; Symantec NetBackup 7.5]
     AWS Regions shown:
     • EU West Region (Ireland)
     • S. America Region (Sao Paulo)
     • APAC Region (Singapore)
     • Japan Region (Tokyo)
     • US West Region (N. California)
     • US West Region (Oregon)
     • AWS GovCloud Region (US)
     • US East Region (N. Virginia)
  7. 7. It’s easy to back up Oracle DBs to Amazon S3
     [Diagram labels: Corporate data center; Oracle RMAN; Oracle Secure Backup S3 Module]
     Read the Amazon.com case study
  8. 8. AWS widely used for backup, DR, and archive
     • Estimated cost savings of $70,000 on a single storage project
     • Estimated cost savings of $1M by eliminating 2nd site
     • Sonian closed USDA deal in 45 days
  9. 9. Benefits to using AWS DR and archive solutions
     [Graphic; label text as extracted: Durable Reduced Pay Only for Infrastructure What You Use Recover Easy Distribution Easily Scale Up Security and Spin Down Deploy]
  10. 10. Large Solution Provider Partner Ecosystem
  11. 11. About the Related AWS Services
  12. 12. Our infrastructure makes it easy to have remote DR and archive sites
      AWS Operates Across the Globe
      • 4 CONUS Regions
      • 4 OCONUS Regions
      • 36 Edge Locations
      Continental United States (CONUS); Outside the Continental United States (OCONUS)
  13. 13. AWS Regions and Availability Zones
      Customer Decides Where Applications and Data Reside
      Note: Conceptual drawing only. The number of Availability Zones may vary.
  14. 14. AWS is built for enterprise security standards
      Certifications
      • SOC 1 Type 2 (formerly SAS-70)
      • ISO 27001
      • PCI DSS for EC2, S3, EBS, VPC, RDS, ELB, IAM
      • FISMA Moderate Compliant Controls
      • HIPAA & ITAR Compliant Architecture
      • DIACAP Compliant Controls
      Physical Security
      • Datacenters in nondescript facilities
      • Physical access strictly controlled
      • Must pass two-factor authentication at least twice for floor access
      • Physical access logged and audited
      HW, SW, Network
      • Systematic change management
      • Phased updates deployment
      • Safe storage decommission
      • Automated monitoring and self-audit
      • Advanced network protection
  15. 15. Robust Networking & Security
      • AWS Direct Connect: Dedicated connection between your datacenter and AWS
      • Amazon Virtual Private Cloud (VPC): Private VPN connection to your AWS resources
      • Dedicated Instances (Single Tenant Compute Instance): Amazon EC2 resources running on private hardware
  16. 16. We have the services needed for DR and archive
      • Amazon Simple Storage Service (Amazon S3)
      • AWS Import/Export
      • Amazon Elastic Compute Cloud (Amazon EC2)
      • Amazon Glacier
      • AWS CloudFormation
      • AWS Storage Gateway
      • Amazon Route 53
      • AWS Elastic Beanstalk
  17. 17. Perspectives on Scaling: S3 Scales…
      Total Number of Objects Stored in Amazon S3
      • Q4 2006: 2.9 Billion
      • Q4 2007: 14 Billion
      • Q4 2008: 40 Billion
      • Q4 2009: 102 Billion
      • Q4 2010: 262 Billion
      • Q4 2011: 762 Billion
      • June 2012: 1 Trillion
      Peak Requests: 750,000+ per second
  18. 18. Data on our infrastructure is durable
      Our customers don’t have to duplicate their S3 or Glacier data like they do with tape
      [Diagram labels: Object and Object Copies; Availability Zones; Region]
  19. 19. AWS storage is ideal for DR and Archive
      • Amazon Simple Storage Service (Amazon S3)
          – Highly durable blob storage
          – Excellent for backup and archive
      • Amazon Elastic Block Store (Amazon EBS) and EBS snapshots
          – Persistent data volumes for Amazon EC2 instances
          – Redundant within a single Availability Zone
          – Snapshot backups provide long-term durability, and volume sharing / cloning capability within a Region
      • AWS Storage Gateway
          – Securely connect an on-premises software appliance with cloud-based storage
          – Back up point-in-time snapshots of your on-premises data to Amazon S3 for future recovery
          – Mirror your on-premises data to Amazon EC2 instances for Disaster Recovery
      • Amazon Glacier
          – Extremely low-cost storage
          – Secure, durable storage for data archiving and backup
          – Optimized for data that is infrequently accessed
  20. 20. You have several networking alternatives
      • Using AWS Direct Connect
      • Over the Internet
      • AWS Import/Export
      [Diagram labels: Corporate data center; On-site infrastructure; 10G; Amazon S3 Bucket; Amazon Elastic Compute Cloud (EC2); Availability Zone; AWS Region]
  21. 21. AWS Storage Gateway Service – Backup & DR http://aws.amazon.com/storagegateway
  22. 22. Amazon Glacier – Durable Archive http://aws.amazon.com/glacier
  23. 23. More about Disaster Recovery with AWS
  24. 24. You might be able to:
      • Slash parts of your DR budget by 50%
      • Eliminate 30%+ of your on-premises IT footprint
      • Eliminate your need for physical secondary site(s)
      • Eliminate tape for backup and archive
  25. 25. Let’s start with where DR fits into your plans
      [Continuum graphic labels: High Availability; Backup Storage; Disaster Recovery]
      • It’s part of a business continuity continuum
      • It’s not an all or nothing proposition
      • In the face of internal or external events, how do you…
          – Keep your applications running 24x7
          – Make sure your data is safe
          – Get an application back up after a major disaster
  26. 26. Disaster recovery on the continuum
      • Recover from any event
      • Recovery Time Objective (RTO)
          – Acceptable time period within which normal operation (or degraded operation) needs to be restored after event
      • Recovery Point Objective (RPO)
          – Acceptable data loss measured in time
      • Traditional IT model has DR in a second physical site
          – Low end DR: off-site backups
          – High end DR: hot site active-active architecture
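The RTO and RPO definitions above can be sketched as a small check on a past incident. This is only an illustration under stated assumptions: the `RecoveryObjectives` class, the `evaluate_recovery` function, and the timestamps are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rto: timedelta  # acceptable time to restore normal (or degraded) operation
    rpo: timedelta  # acceptable data loss, measured in time


def evaluate_recovery(objectives, last_good_copy, disaster_at, restored_at):
    """Return (recovery_time, data_loss, met_rto, met_rpo) for one incident."""
    recovery_time = restored_at - disaster_at  # compared against the RTO
    data_loss = disaster_at - last_good_copy   # compared against the RPO
    return (
        recovery_time,
        data_loss,
        recovery_time <= objectives.rto,
        data_loss <= objectives.rpo,
    )


# Hypothetical incident: last backup at 02:00, outage at 14:30, restored at 17:00.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=24))
result = evaluate_recovery(
    objectives,
    last_good_copy=datetime(2012, 9, 1, 2, 0),
    disaster_at=datetime(2012, 9, 1, 14, 30),
    restored_at=datetime(2012, 9, 1, 17, 0),
)
print(result)
```

In this made-up case the recovery took 2.5 hours and lost 12.5 hours of data, so both objectives are met; tightening the RPO to one hour would fail the second check.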
  27. 27. Here’s why you save with AWS as the 2nd site
      [Comparison graphic with three columns: Site 1 (data center), Site 2 (data center), Site 2 (AWS). Each column lists the same stack.]
      • Routers
      • Firewalls
      • IP Network
      • Application Licenses
      • Operating Systems
      • Hypervisor
      • Servers
      • Storage Network
      • Primary Storage (one column shows Snapshot Storage instead)
      • Backup SW
      • Backup Tapes
      • Tape Silos
      • Archive SW
      • Archive Storage
  28. 28. Common DR architecture patterns
  29. 29. There are two main approaches to recovery
      Starting from AWS
      • Implement a high availability architecture
      • Implement a disaster recovery strategy
      • Rapidly restore and fail over within AWS
      Starting from your Data Center
      • Implement a high availability architecture
      • Implement a disaster recovery strategy
      • Use backed-up data to run analytics in AWS
      • Rapidly restore from AWS to on-premises
      • Rapidly fail over to AWS while restoring on-premises
  30. 30. Common DR Scenarios on AWS
      • Backup and Restore
      • Pilot light for quick recovery into AWS
      • Fully-Working Low Capacity Standby
      • Multi-Site Hot Standby
  31. 31. Now let’s look at the Backup and Restore architecture
  32. 32. Backup and Restore
      • Advantages
          – Simple to get started
          – Extremely cost effective (mostly backup storage)
      • Preparation phase
          – Take backups of current systems
          – Store backups in Amazon S3
          – Describe procedure to restore from backup on AWS
  33. 33. Backup and Restore
      [Diagram labels: On-premises Infrastructure; Traditional server; Data copied to S3; AWS Import/Export; S3 Bucket with Objects; Amazon Route 53]
  34. 34. Backup and Restore – S3 and EC2
      • Amazon EC2 Instance: quickly provisioned from AMI
      • AMI: pre-bundled with OS and applications
      • Data Volume: data copied from objects in S3
      [Diagram labels: Amazon S3 Bucket; Availability Zone; AWS Region]
  35. 35. Backup and Restore – Storage Gateway
  36. 36. Backup and Restore Review
      • In Case of Disaster
          – Retrieve backups from Amazon S3 or Storage Gateway
          – Bring up required infrastructure
              • EC2 instances with prepared AMIs, Load Balancing, etc. via CloudFormation
          – Restore system from backup
              • Update latest application deployment with Elastic Beanstalk
          – Switch over to the new system
              • Adjust DNS records to point to AWS
      • Objectives
          – RTO: as long as it takes to bring up infrastructure and restore system from backups
          – RPO: time since last backup
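As a rough planning aid for the objectives above, the RTO for Backup and Restore can be estimated as the sum of the sequential recovery steps, and the worst-case RPO as the interval between backups. The function name and every duration below are hypothetical placeholders, not measurements.

```python
from datetime import timedelta


def backup_restore_objectives(backup_interval, restore_steps):
    """Estimate (rto, worst_case_rpo) for a Backup and Restore setup.

    RTO: the recovery steps run one after another, so their durations add up.
    RPO: in the worst case the disaster hits just before the next backup,
    so up to one full backup interval of data is lost.
    """
    rto = sum(restore_steps.values(), timedelta())
    return rto, backup_interval


# Step names follow the review slide; the durations are invented.
steps = {
    "retrieve backups from Amazon S3": timedelta(minutes=45),
    "bring up required infrastructure": timedelta(minutes=20),
    "restore system from backup": timedelta(minutes=90),
    "adjust DNS records to point to AWS": timedelta(minutes=5),
}
rto, worst_case_rpo = backup_restore_objectives(timedelta(hours=24), steps)
print("Estimated RTO:", rto)
print("Worst-case RPO:", worst_case_rpo)
```

Replacing the placeholder durations with timings from a real restore drill turns the same sum into a measured RTO.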
  37. 37. Now let’s look at the Pilot Light architecture
  38. 38. Pilot Light
      • Advantages
          – Very cost effective (fewer 24/7 resources)
      • Preparation Phase
          – Enable replication of all critical data to AWS
          – Prepare all required resources for automatic start
              • AMIs, Network Settings, Load Balancing, etc.
              • Configure CloudFormation and Elastic Beanstalk for automation
          – Reserved Instances
  39. 39. Pilot Light in Non-DR Phase
      [Diagram: a user or system reaches the application through Amazon Route 53. Two stacks are shown, each with a Web Server, Application Server, Database Server, and Data Volume. Labels: Not Running; Smaller Instance; Data Mirroring/Replication]
  40. 40. Pilot Light in Disaster Phase
      [Diagram: a user or system reaches the application through Amazon Route 53. Two stacks are shown, each with a Web Server, Application Server, Database Server, and Data Volume. Labels: Not Running; Smaller Instance]
  41. 41. Pilot Light in Recovered Phase
      [Diagram: a user or system reaches the application through Amazon Route 53. Two stacks are shown, each with a Web Server, Application Server, Database Server, and Data Volume. Labels: Start in Minutes; Resize Instance to production capacity]
  42. 42. Pilot Light Review
      • In Case of Disaster
          – Automatically bring up resources around the replicated core data set
              • Use CloudFormation and Elastic Beanstalk for easy automation
              • Scale the system as needed to handle current production traffic
          – Switch over to the new system
              • Adjust DNS records to point to AWS
      • Objectives
          – RTO: as long as it takes to detect need for DR and automatically scale up replacement system
          – RPO: depends on replication type
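The ordering of the Pilot Light recovery steps can be sketched as a toy state model. This makes no AWS API calls; the `PilotLightSite` class, its fields, and the instance counts are hypothetical stand-ins for whatever automation (for example CloudFormation templates) actually performs each step.

```python
from dataclasses import dataclass, field


@dataclass
class PilotLightSite:
    """Toy model of the AWS side of a pilot light setup."""
    web_servers: int = 0               # not running until a disaster
    app_servers: int = 0               # not running until a disaster
    database_instance: str = "small"   # replicated core, always on
    dns_points_to: str = "on-premises"
    steps: list = field(default_factory=list)

    def recover(self, web_servers, app_servers, database_instance):
        # 1. Bring up resources around the replicated core data set.
        self.web_servers = web_servers
        self.app_servers = app_servers
        self.steps.append("start web and application servers")
        # 2. Scale the system to handle current production traffic.
        self.database_instance = database_instance
        self.steps.append("resize database instance")
        # 3. Switch over last, so DNS only points at a site that is ready.
        self.dns_points_to = "aws"
        self.steps.append("adjust DNS records")
        return self


site = PilotLightSite().recover(web_servers=4, app_servers=4, database_instance="large")
print(site.dns_points_to, site.steps)
```

Keeping the DNS change as the final step is the design point: traffic moves only after the replacement system has been started and scaled.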
  43. 43. Now let’s look at the Fully-Working Low Capacity Standby architecture
  44. 44. Fully-Working Low Capacity Standby
      • Advantages
          – Can take some production traffic at any time
          – Cost savings (IT footprint smaller than full DR)
      • Preparation Phase
          – Similar to Pilot Light
          – All necessary components running 24/7, but not scaled for production traffic
          – Best practice: continuous testing
              • “Trickle” a statistical subset of production traffic to DR site
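The "trickle" idea can be illustrated with a weighted random choice between sites. This is a toy simulation, not a DNS configuration; the site names, the 95/5 split, and the `pick_site` helper are assumptions made for the example. In practice a weighted DNS routing policy, such as the one Amazon Route 53 offers, would play this role.

```python
import random


def pick_site(weights, rng):
    """Pick a site name with probability proportional to its weight."""
    sites = list(weights)
    return rng.choices(sites, weights=[weights[s] for s in sites], k=1)[0]


rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Normal operation: send a small statistical subset of requests to the standby.
normal = {"primary": 95, "aws_standby": 5}
sample = [pick_site(normal, rng) for _ in range(10_000)]
share = sample.count("aws_standby") / len(sample)
print(f"Share of requests served by the standby: {share:.1%}")

# Disaster: shift all weight to the standby site.
failover = {"primary": 0, "aws_standby": 100}
after = {pick_site(failover, rng) for _ in range(100)}
print("Sites serving traffic after failover:", after)
```

Because the standby already serves a share of real traffic, failing over is a change of weights rather than a first-time cutover.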
  45. 45. Fully-Working Low Capacity Standby in Non-DR Phase
      [Diagram: a user or system reaches the application through Amazon Route 53. Two stacks are shown, each with a Web Server, Application Server, Database Server, and Data Volume. Labels: Low Capacity; Data Mirroring/Replication]
  46. 46. Fully-Working Low Capacity Standby in Disaster Phase
      [Diagram: a user or system reaches the application through Amazon Route 53. Two stacks are shown, each with a Web Server, Application Server, Database Server, and Data Volume. Labels: Low Capacity; Data Mirroring/Replication]
  47. 47. Fully-Working Low Capacity Standby in Recovery Phase
      [Diagram: a user or system reaches the application through Amazon Route 53. Additional Web, Application, and Database Servers are shown alongside the Data Volumes. Labels: Grow Capacity; Data Mirroring/Replication]
  48. 48. Fully-Working Low Capacity Standby in Recovered Phase
      [Diagram: a user or system reaches the application through Amazon Route 53. Additional Web, Application, and Database Servers are shown alongside the Data Volumes. Labels: Grow Capacity; Data Mirroring/Replication]
  49. 49. Fully-Working Low Capacity Standby Review
      • In Case of Disaster
          – Scale up resources to meet full production load
              • Utilize CloudFormation template to trigger via single API call
          – Immediately fail over most critical production load
              • Adjust DNS records to point to AWS
          – (Auto) Scale the system further to handle all production load
      • Objectives
          – RTO: for critical load, as long as it takes to fail over; for all other load, as long as it takes to scale further
          – RPO: depends on replication type
  50. 50. Now let’s look at the Multi-Site Hot Standby architecture
  51. 51. Multi-Site Hot Standby
      • Advantages
          – At any moment can take all production load
      • Preparation Phase
          – Similar to Low-Capacity Standby
          – Fully scaling in/out with production load
  52. 52. Multi-Site Hot Standby
      [Diagram: a user or system reaches the application through Amazon Route 53. Both sites show multiple Web, Application, and Database Servers with Data Volumes. Labels: Full Capacity; Data Mirroring/Replication]
  53. 53. Multi-Site Hot Standby Review
      • In Case of Disaster
          – Immediately fail over all production load
              • Adjust DNS records to point to AWS
          – (Auto) Scale the system further to handle further production load
      • Objectives
          – RTO: as long as it takes to fail over
          – RPO: depends on replication type
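For side-by-side reference, the four patterns reviewed above can be held in a small lookup structure, ordered as the slides present them, from the least to the most always-on capacity. The RTO and RPO strings paraphrase the review slides; the `DR_PATTERNS` list and the `next_step` helper are hypothetical names introduced only for this sketch.

```python
# Patterns in slide order: each one keeps more resources running 24/7
# than the one before it, in exchange for a shorter recovery time.
DR_PATTERNS = [
    {"name": "Backup and Restore",
     "rto": "time to bring up infrastructure and restore from backups",
     "rpo": "time since last backup"},
    {"name": "Pilot Light",
     "rto": "time to detect the need for DR and scale up the replacement system",
     "rpo": "depends on replication type"},
    {"name": "Fully-Working Low Capacity Standby",
     "rto": "time to fail over critical load, then time to scale further",
     "rpo": "depends on replication type"},
    {"name": "Multi-Site Hot Standby",
     "rto": "time to fail over",
     "rpo": "depends on replication type"},
]


def next_step(current_name):
    """Return the next pattern up the continuum, or None at the top."""
    names = [pattern["name"] for pattern in DR_PATTERNS]
    index = names.index(current_name)
    return names[index + 1] if index + 1 < len(names) else None


for pattern in DR_PATTERNS:
    print(f"{pattern['name']}: RTO = {pattern['rto']}; RPO = {pattern['rpo']}")
print("After Backup and Restore, consider:", next_step("Backup and Restore"))
```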
  54. 54. Best Practices for Being Prepared
      • Start simple and work your way up
          – Backups in AWS as a first step
          – Incrementally improve RTO/RPO as a continuous effort
              • Utilize CloudFormation and Elastic Beanstalk for consistent automation
              • Implement Pilot Light or Fully-Working Low Capacity Standby as a next step
      • Check for any software licensing issues
      • Exercise your DR Solution
          – Ensure backups, snapshots, AMIs, etc. are working
          – Train personnel on your DR procedures (SOPs, TTPs)
          – Perform Continuous Testing of your DR procedures
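One simple way to check that a backup is working is to restore it and compare checksums against the source. The sketch below does this with local temporary files only; in a real drill the restored copy would come from wherever the backups live, and the file names and helper functions here are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original, restored):
    """True when the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "source.db"
    restored = Path(workdir) / "restored.db"

    source.write_bytes(b"critical data" * 1000)
    restored.write_bytes(source.read_bytes())   # simulate a good restore
    intact = restore_matches(source, restored)

    restored.write_bytes(b"truncated")          # simulate a damaged restore
    corrupted = restore_matches(source, restored)

print("Good restore verified:", intact)
print("Damaged restore accepted:", corrupted)
```

Running a check like this on a schedule is one way to make "exercise your DR solution" a routine rather than a one-off.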
  55. 55. Call to action
      • Learn more about our DR resources
      • Evaluate using AWS for a DR project
      • Start testing: first steps are simple
      • Give us feedback
  56. 56. Here’s where you can get more information
      • AWS Disaster Recovery: http://aws.amazon.com/disaster-recovery
      • White papers: http://aws.amazon.com/whitepapers
      • Webinars: http://aws.amazon.com/resources/webinars
      • Videos: http://www.youtube.com/user/AmazonWebServices
      • Slides: http://www.slideshare.net/AmazonWebServices
      • Website: http://aws.amazon.com
      • Blog: http://aws.typepad.com
      • Partners: https://aws.amazon.com/solution-providers
      • Contact AWS: http://aws.amazon.com/contact-us/
  57. 57. Thank You!
      Tim Bixler | tbixler@amazon.com
      Federal Solutions Architecture Manager & Principal Solutions Architect
