AWS Summit 2011: Designing Fault-Tolerant Applications

  1. Designing Fault-Tolerant Applications
     Miles Ward, Enterprise Solutions Architect
  2. Building Fault-Tolerant Applications on AWS
     • White paper published last year
     • Sharing best practices
     • We'd like to hear your best practices as well
     http://media.amazonwebservices.com/AWS_Building_Fault_Tolerant_Applications.pdf
     Copyright © 2011 Amazon Web Services
  3. AWS Fault-Tolerant Building Blocks
     Two approaches:
     1) AWS services that are inherently fault-tolerant and highly available:
        • Amazon Simple Storage Service (S3)
        • Amazon SimpleDB
        • Amazon SQS, SNS, SES, CloudWatch, CloudFront, and more
     2) AWS services that offer tools and features to design fault-tolerant and highly available systems:
        • Amazon Elastic Compute Cloud (EC2)
          - Availability Zones, Elastic IPs, EBS, etc.
          - Flexible to trade off budget vs. time to recovery
        • Amazon Relational Database Service (RDS)
          - Multi-AZ Deployments
          - Backup/Restore
  4. Amazon EC2 Architecture
     [Diagram: an Amazon Region containing an Availability Zone. Labeled components: Amazon Machine Image (AMI), EC2 Instance, Ephemeral Storage, Elastic Block Storage, EBS Snapshots, Amazon S3, Security Group(s), Elastic IP Address, CloudWatch, Auto Scaling, Load Balancing]
  5. EC2 Features
     AMI
     • Packaged, reusable functionality
     On-Instance Storage
     • Lifetime tied to instance lifetime
     • AFR like standard hard disk (around 5%)
     EBS Volumes
     • Lifetime independent of any particular EC2 instance
     • Redundant within an AZ
     • AFR is 0.1% to 0.5%
     • Incorporate volume mappings into your architecture
     • Use EBS snapshot backups
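
To put the AFR (annual failure rate) figures above in perspective, here is a rough sketch of how per-volume rates compound across a fleet. It assumes failures are independent and that the rate is constant over the year; neither assumption comes from the slides, and the fleet size of 20 is arbitrary.

```python
def p_any_failure(afr: float, volumes: int) -> float:
    """Probability that at least one of `volumes` fails within a year,
    assuming independent failures at the same annual failure rate."""
    return 1.0 - (1.0 - afr) ** volumes


# Rates taken from the slide; the labels and fleet size are illustrative.
rates = [
    ("on-instance storage (~5%)", 0.05),
    ("EBS, low end (0.1%)", 0.001),
    ("EBS, high end (0.5%)", 0.005),
]

for label, afr in rates:
    chance = p_any_failure(afr, volumes=20)
    print(f"{label}: {chance:.1%} chance of at least one failure across 20 volumes")
```

Even at the lower EBS rates the chance of some failure grows with fleet size, which is why the slide pairs EBS with snapshot backups.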
  6. EC2 Features
     Elastic IP Addresses
     • Map to any EC2 instance within a given Region
     • Detach from failed instance; map to replacement
     Auto Scaling
     • Two ways to use it:
       - Respond to changing conditions by adding or terminating EC2 instances (attach to CloudWatch metrics)
       - Maintain a fixed number of instances running, replacing them if they fail or become unhealthy
     Reserved Instances
     • Guarantees capacity for when it's needed
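
The second Auto Scaling mode (keep a fixed number of instances, replacing failed ones) can be pictured as a reconciliation loop. The sketch below is a plain-Python simulation, not the Auto Scaling API: `Instance`, `reconcile`, and `launch_instance` are hypothetical names introduced only for illustration.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    healthy: bool


_ids = itertools.count(1)


def launch_instance() -> Instance:
    """Hypothetical stand-in for launching a replacement instance."""
    return Instance(f"i-new{next(_ids)}", healthy=True)


def reconcile(fleet, desired, launch=launch_instance):
    """Drop unhealthy instances, then launch or trim to reach `desired`."""
    survivors = [inst for inst in fleet if inst.healthy]
    while len(survivors) < desired:
        survivors.append(launch())
    return survivors[:desired]


fleet = [Instance("i-a", True), Instance("i-b", False), Instance("i-c", True)]
fleet = reconcile(fleet, desired=3)
print([inst.instance_id for inst in fleet])
```

Running `reconcile` on a schedule keeps the fleet at the desired size without anyone deciding by hand which instance to replace.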
  7. EC2 Features
     CloudWatch Alarms
  8. EC2 Features
     Elastic Load Balancing
     • Distributes incoming traffic across multiple instances
     • Sends traffic only to healthy instances
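
The two behaviours listed above (spread traffic, skip unhealthy instances) can be sketched as a round-robin selector that consults a health table. This is a toy model under assumed names (`LoadBalancer`, `mark`, `route`); the slides do not say which algorithm Elastic Load Balancing actually uses.

```python
class LoadBalancer:
    """Toy round-robin balancer that routes only to healthy backends."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.health = {backend: True for backend in self.backends}
        self._position = 0

    def mark(self, backend, healthy):
        """Record the result of a health check for one backend."""
        self.health[backend] = healthy

    def route(self):
        """Return the next healthy backend, trying each at most once."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._position % len(self.backends)]
            self._position += 1
            if self.health[backend]:
                return backend
        raise RuntimeError("no healthy backends available")


lb = LoadBalancer(["web-1", "web-2", "web-3"])
lb.mark("web-2", healthy=False)  # simulate a failed health check
print([lb.route() for _ in range(4)])
```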
  9. Amazon EC2 Regions and Availability Zones
     [Diagram: two example Regions, US East (Northern Virginia) and EU (Dublin), each divided into Availability Zones; one is drawn with Zones A and B, the other with Zones A, B, C, and D]
     Amazon EC2 Regions: US East (Northern Virginia) / US West (Northern California) / EU (Ireland) / Asia Pacific (Singapore) / Asia Pacific (Tokyo)
  10. Availability Zone Characteristics and Advice
     • Distinct physical locations
     • Low-latency network connections between AZs
     • Independent power, cooling, network, security
     • Always partition app stacks across 2 or more AZs
     • Elastic Load Balance across instances in multiple AZs
  11. Proper Use of Multiple Availability Zones
     [Diagram: incoming requests reach an Elastic Load Balancer, which sends requests and health checks to two Availability Zones (A and B). Each zone runs a Web Server, an App Server, and a Database Server or RDS DB Instance. Centralized Services (S3 Backups, SimpleDB, etc.) sit outside the zones.]
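
A back-of-the-envelope calculation shows why running the same stack in two or more AZs helps. It assumes each zone's stack fails independently and that the application stays up as long as any one zone is up. The 99% per-zone figure is purely illustrative and is not an AWS number.

```python
def combined_availability(per_zone: float, zones: int) -> float:
    """Availability of a service that is up if at least one zone is up,
    assuming zones fail independently with the same availability."""
    return 1.0 - (1.0 - per_zone) ** zones


for zones in (1, 2, 3):
    value = combined_availability(0.99, zones)  # 0.99 is an assumed figure
    print(f"{zones} zone(s): {value:.6f}")
```

Real zones are not perfectly independent, so treat the result as an upper bound rather than a prediction.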
  12. Region Characteristics and Advice
     Regions are:
     • Functionally separate
     • Composed of 2 or more AZs
     • Connected via the public internet
     Use regions to:
     • Have functionality geographically close to customers
     • Comply with national laws and practices
     • Implement a DR strategy
  13. RDS Fault-Tolerant Features
     Multi-AZ Deployments
     • Synchronous replication across AZs
     • Automatic fail-over to standby replica
     Automated Backups
     • Enables point-in-time recovery of the DB instance
     • Retention period configurable
     Snapshots
     • User-initiated full backup of DB
     • New DB can be created from snapshots
  14. AWS Architectural Guidance
  15. Design for Failure: Basic Principles
     • Avoid single points of failure
     • Assume everything fails, and design backwards
     • Goal: Applications should continue to function even if the underlying physical hardware fails or is removed or replaced
     • Design your recovery process
     • Trade off business needs vs. cost of high availability
  16. Design for Failure: Use AWS Building Blocks
     • Use Elastic IP addresses for consistent and re-mappable routes
     • Use multiple Amazon EC2 Availability Zones (AZs)
     • Replicate data across multiple AZs
       - Example: Amazon RDS Multi-AZ mode
     • Use real-time monitoring (Amazon CloudWatch)
     • Use Amazon Elastic Block Store (EBS) for persistent file systems
     • Take EBS Snapshots and use S3 for backups
  17. Build Loosely Coupled Systems
     • Use independent components
     • Design everything as a Black Box
     • Load-balance and scale clusters
     • Think about graceful degradation
     • Amazon SQS as buffers
     [Diagram: "Tight Coupling" shows Controllers A, B, and C connected directly; "Loose Coupling using Queues" shows Controllers A, B, and C connected through queues (Q)]
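
The loose-coupling diagram can be approximated in a few lines with an in-process queue standing in for Amazon SQS. This is only a sketch: `controller_a`, `controller_b`, and the job names are made up, and a real SQS consumer would also handle visibility timeouts and message deletion, which are omitted here.

```python
import queue
import threading


def controller_a(out_q, jobs):
    """Producer: hands work to the queue and never calls the consumer directly."""
    for job in jobs:
        out_q.put(job)
    out_q.put(None)  # sentinel telling the consumer to stop


def controller_b(in_q, results):
    """Consumer: pulls work at its own pace until it sees the sentinel."""
    while True:
        job = in_q.get()
        if job is None:
            break
        results.append(job.upper())  # placeholder for real processing


buffer_q = queue.Queue()  # stands in for an SQS queue
results = []

consumer = threading.Thread(target=controller_b, args=(buffer_q, results))
consumer.start()
controller_a(buffer_q, ["resize", "encode", "publish"])
consumer.join()
print(results)
```

Because the producer only knows about the queue, the consumer can be restarted, replaced, or scaled out without changing the producer.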
  18. Implement Elasticity
     • Don't assume health or fixed location of components
     • Use designs that are resilient to reboot and re-launch
     • Bootstrap your instances
       - "Who am I and what is my role?"
     • Enable dynamic configuration
     • Use configurations in SimpleDB for bootstrapping
     • Use Auto Scaling
     • Use Elastic Load Balancing on each tier
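
The bootstrap question ("Who am I and what is my role?") amounts to a lookup against an external configuration store at launch. In the sketch below a dictionary stands in for SimpleDB. The instance IDs, roles, and package names are invented for illustration, and `bootstrap` is a hypothetical helper rather than any AWS API.

```python
# Stand-in for configuration held in SimpleDB; all values are illustrative.
CONFIG_STORE = {
    "i-0001": {"role": "web", "packages": ["nginx"]},
    "i-0002": {"role": "app", "packages": ["python3"]},
}

# Fallback used when an instance has no entry yet.
DEFAULT_CONFIG = {"role": "worker", "packages": []}


def bootstrap(instance_id, store=CONFIG_STORE):
    """Answer 'who am I and what is my role?' from external configuration."""
    config = store.get(instance_id, DEFAULT_CONFIG)
    return {"instance_id": instance_id, **config}


print(bootstrap("i-0001"))
print(bootstrap("i-9999"))  # unknown instance falls back to the default role
```

Keeping the role outside the machine image means the same image can be relaunched into any role after a failure.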
  19. Implementing Elasticity: Elastic Load Balancing, CloudWatch, and Auto Scaling
     [Diagram: Elastic Load Balancing, Auto Scaling, and CloudWatch, with the labels "Utilization" and "Metrics"]
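
One way to read the diagram is that utilization metrics feed a scaling decision. The function below sketches that decision in isolation. The thresholds (70% and 30%), the one-instance step, and the fleet bounds are assumptions chosen for the example, not values from the talk or from any AWS default.

```python
def scaling_decision(cpu_samples, current, minimum=2, maximum=10,
                     high=70.0, low=30.0):
    """Return the new desired instance count given recent CPU samples (%)."""
    if not cpu_samples:
        return current  # no data: leave the fleet unchanged
    average = sum(cpu_samples) / len(cpu_samples)
    if average > high:
        return min(current + 1, maximum)  # scale out, capped at maximum
    if average < low:
        return max(current - 1, minimum)  # scale in, floored at minimum
    return current


print(scaling_decision([82.0, 91.5, 77.0], current=3))
print(scaling_decision([12.0, 18.5], current=3))
```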
  20. Use a Chaos Monkey
     From the Netflix blog:
     Simple monkey:
     • Kill any instance in the account
     Complex monkey:
     • Kill instances with specific tags
     • Introduce other faults (e.g. connectivity via Security Group)
     Human monkey:
     • Kill instances from the AWS Management Console
     http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
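
The simple and tag-based monkeys differ only in how they pick a victim. The sketch below shows that selection step against an in-memory fleet. It is not Netflix's implementation and terminates nothing; `simple_monkey`, `tagged_monkey`, and the fleet contents are hypothetical.

```python
import random


def simple_monkey(instances, rng):
    """Pick any instance ID in the account, or None if there are none."""
    return rng.choice(list(instances)) if instances else None


def tagged_monkey(instances, tag, value, rng):
    """Pick an instance ID whose tags include tag=value, or None."""
    candidates = [iid for iid, tags in instances.items() if tags.get(tag) == value]
    return rng.choice(candidates) if candidates else None


# Illustrative fleet: instance ID -> tags.
fleet = {
    "i-web1": {"tier": "web"},
    "i-web2": {"tier": "web"},
    "i-db1": {"tier": "db"},
}

rng = random.Random()  # pass random.Random(seed) for repeatable drills
print("simple monkey would kill:", simple_monkey(fleet, rng))
print("tagged monkey would kill:", tagged_monkey(fleet, "tier", "web", rng))
```

Separating victim selection from the kill action makes it easy to dry-run a drill before pointing it at real instances.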
  21. AWS Architecture Center
     aws.amazon.com/architecture
     White papers:
     • Cloud architectures
     • Building fault-tolerant applications
     • Web hosting best practices
     • Leveraging different storage options
     • AWS security best practices
  22. Thank You!
