Building Fault Tolerant Applications in the cloud - AWS Summit 2012 - NYC
 

  • Cloud computing is a better way to run your business. The cloud helps companies of all sizes become more agile. Instead of running your applications yourself, you can run them on the cloud, where IT infrastructure is offered as a service, like a utility. With the cloud, your company saves money: there are no up-front capital expenses because you don’t have to buy hardware for your projects. The massive scale and fast pace of innovation of the cloud drive costs down for you. In the cloud, you pay only for what you use, just like electricity. The cloud can also help your company save time and improve agility. It’s faster to get started: you can build new environments in minutes because you don’t need to wait for new servers to arrive. The elastic nature of the cloud makes it easy to scale up and down as needed. At the end of the day, you have more resources left for innovation, which lets you focus on projects that can really impact your business, like building and deploying more applications. “With the high growth nature of our business, we were looking for a cloud solution to enable us to scale fast. Think twice before buying your next server. Cloud computing is the way forward.” - Sami Lababidi, CTO, Playfish
  • AWS is useful for everything from low-end traditional DR to high-end HA, but it also encourages a rethinking of traditional DR / HA practices. Everything in the cloud is “off-site” and (potentially) “multi-site”. Using multiple sites (multiple AZs) comes largely for free. Using multiple geographically distributed sites (multiple Regions) is significantly cheaper and easier. This tends to move the default design point away from “cold” Disaster Recovery toward “hot” High Availability, and makes it easier to stack multiple mechanisms, e.g., basic HA within one Region plus a DR site in a second Region.
  • Fault Separation: Amazon EC2 gives customers the flexibility to place instances in multiple geographic regions as well as across multiple Availability Zones. Each Availability Zone is designed with fault separation: Availability Zones are physically separated within a typical metropolitan region, on different flood plains, in seismically stable areas. In addition to discrete uninterruptible power supplies (UPS) and on-site backup generation facilities, they are each fed via different grids from independent utilities to further reduce single points of failure. They are all redundantly connected to multiple tier-1 transit providers. Note that although traffic flowing across the private networks between Availability Zones in a single region stays on AWS-controlled infrastructure, all communication between regions crosses public Internet infrastructure, so appropriate encryption methods should be used to protect sensitive data. Data is not replicated between regions unless the customer does so proactively.
  • Distinct physical locations. Low-latency network connections between AZs. Independent power, cooling, network, and security. Always partition app stacks across 2 or more AZs. Elastic Load Balance across instances in multiple AZs. Don’t confuse AZs with Regions!
  • Note: the question is not “do you need to automate your deployment?” or “should I use automation when I’m using the cloud?” The answer to both is YES! The question is: if you’re using a fully standard PHP or Java stack, why manage it yourself? Elastic Beanstalk does that well, with zero lock-in. If what you need is more complex, consider CloudFormation (note, you can do BOTH!).
  • A three-tier web app has been “fork-lifted” to the cloud. Everything is in a single Availability Zone. It is load balanced at the Web tier and App tier using software load balancers. There is a master and a standby database. An Elastic IP is used on the front-end load balancer only. S3 is used for DB backup instead of tape. How can you use AWS features to make this app more highly available?
  • Class exercise: use AWS features to make this web app more highly available. Use two Availability Zones for failover. Enable CloudWatch for monitoring and alarms. Use Auto Scaling at the Web and App tiers (across two zones). Take regular EBS snapshots, and save configured EC2 instances as AMIs. Replace the front-end load balancer with ELB. Use a load balancer on EC2 between the Web and App tiers; replace it when ELB offers internal load balancing. Use Elastic IP addresses for the load balancer and the database. Push all static content to S3 and/or CloudFront; less popular content should be served from S3 directly. Use Route53 to control public DNS entries for dynamic and static content, and to get Zone Apex support for ELB. Push logs to S3. Put a DB replica in the second zone for failover. Consider using RDS with a Multi-AZ deployment.
  • Avoid single points of failure. Assume everything fails, and design backwards. Goal: applications should continue to function even if the underlying physical hardware fails or is removed or replaced. Design your recovery process. Trade off business needs against the cost of high availability.
  • Multiple DNS targets. Load balanced across Availability Zones. Auto-scaled web-cache servers with health checks. Auto-scaled web servers with health checks. Comprehensive config, data, and AMI backup. Monitoring, alarming, and logging.
  • Mid-tier load balancing or queueing. Spans Availability Zones. Auto-scaled app servers with health checks. Comprehensive config, data, and AMI backup. Monitoring, alarming, and logging.
  • DB-tier load balancing or queueing. Auto-scaled database cache servers with health checks. Redundant relational database systems: mirrored, log-shipped, async or sync replicated; designed to scale horizontally (sharding). Durable NoSQL or KV-store data systems: no-SPOF design; supports automatic re-balancing, replication, and fault recovery. Monitoring, alarming, and logging.
  • Multi-AZ Deployments: synchronous replication across AZs; automatic fail-over to the standby replica. Automated Backups: enable point-in-time recovery of the DB instance; the retention period is configurable. Snapshots: user-initiated full backup of the DB; a new DB can be created from snapshots.
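
The notes above repeatedly recommend partitioning each tier across two or more Availability Zones. As a minimal sketch of that idea (not an AWS API call), the Python below spreads a number of instances round-robin over a list of zones and checks whether the loss of any single zone still leaves enough capacity. The function names, the zone names, and the `required` threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instance_count, zones):
    """Assign instances to zones round-robin, so zone sizes differ by at most one."""
    if len(zones) < 2:
        raise ValueError("use at least two Availability Zones")
    placement = Counter({zone: 0 for zone in zones})
    for _, zone in zip(range(instance_count), cycle(zones)):
        placement[zone] += 1
    return dict(placement)


def surviving_capacity(placement, failed_zone):
    """Count the instances still running if one whole zone is lost."""
    return sum(count for zone, count in placement.items() if zone != failed_zone)


def tolerates_zone_loss(placement, required):
    """True if losing any single zone still leaves `required` instances."""
    return all(surviving_capacity(placement, zone) >= required for zone in placement)


# Illustrative zone names; any list of two or more zone labels works.
placement = spread_across_zones(6, ["us-east-1a", "us-east-1b", "us-east-1c"])
print(placement)
print(tolerates_zone_loss(placement, required=4))
```

With six instances over three zones, each zone holds two, so any single zone failure leaves four running. The same six instances over only two zones would leave three, which is one way to see why capacity planning and zone count have to be considered together.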

Presentation Transcript

  • Building Fault-Tolerant Applications in the Cloud. Advanced Solutions Architecture. Miles Ward
  • Faults? Facilities; Hardware; Networking; Code; People
  • What is “Fault-Tolerant”? Degrees of risk mitigation - not binary; Automated; Tested!
  • Agenda: The AWS Approach; Building Blocks; Success Example: Design Patterns
  • Old School Fault-Tolerance: Build Two
  • Cloud Computing Benefits: No Up-Front Capital Expense; Low Cost; Pay Only for What You Use; Self-Service Infrastructure; Easily Scale Up and Down; Improve Agility & Time-to-Market; Deploy
  • Cloud Computing Fault-Tolerance Benefits: No Up-Front HA Capital Expense; Low Cost Backups; Pay for DR Only When You Use it; Self-Service DR Infrastructure; Easily Deliver Fault-Tolerant Applications; Improve Agility & Time-to-Recovery; Deploy
  • AWS Cloud allows Overcast Redundancy: have the shadow duplicate of your infrastructure ready to go when you need it… but only pay for what you actually use
  • Old Barriers to HA are now Surmountable: Cost; Complexity; Expertise
  • AWS Building Blocks: Two Strategies. Inherently fault-tolerant services: S3, SimpleDB, DynamoDB, CloudFront, SWF, SQS, SNS, SES, Route53, Elastic Load Balancer, Elastic Beanstalk, ElastiCache, Elastic MapReduce, IAM. Services that are fault-tolerant with the right architecture: Amazon EC2, VPC, EBS, RDS
  • The Stack: Resources; Deployment; Management; Configuration; Networking; Facilities; Geographies
  • The Stack: EC2 Instances; Amazon Machine Images; CW Alarms - AutoScaling; CloudFormation - Beanstalk; Route53 - ElasticIP - ELB; Availability Zones; Regions
  • Regional Diversity. Use Regions for: Latency (Customers, Data Vendors, Staff); Compliance; Disaster Recovery; … and Fault Tolerance!
  • Proper Use of Multiple Availability Zones
  • Network Fault-Tolerance Tools. 107.22.18.45 isn’t fault-tolerant but 50.17.200.146 is: EIP. Elastic Load Balancing. Automated DNS: Route53. New! Latency-Based Routing
  • CloudFormation - Elastic Beanstalk. Q: Is your stack unique?
  • CloudWatch - Alarms - AutoScaling
  • AMIs: Maintenance is critical; Alternatives: Chef, Puppet, cfn-init, etc.; New! When in doubt: 64-bit; Replicate for DR
  • EC2 Instances: Consistent, reliable building block; 100% API controlled; Reserved Instances; EBS; Immense Fleet Scale
  • New EC2 VPC feature: Elastic Network Interface. Up to 2 Addresses; Span Subnets; Attach/Detach; Public or Private
  • Example: a “fork-lifted” app
  • Example: Fault-Tolerant
  • Why mess with all of that?
  • Design For Failure: SPOF
  • Build Loosely Coupled Systems: Tight Coupling; Loose Coupling using Queues (Copyright © 2011 Amazon Web Services)
  • Use the right approach for each tier
  • Fault-Tolerant Front-end Systems. Addressing: Route53, EIP. Distribution: Multi-AZ, ELB, CloudFront. Redundancy: Auto-Scaling. Monitoring: CloudWatch. Platform: Elastic Beanstalk. (Icon labels: Auto Scaling, Amazon CloudFront, Amazon CloudWatch, Amazon Route 53, Elastic Load Balancer, Elastic IP, AWS Elastic Beanstalk)
  • Fault-Tolerant Data-Tier Systems: Tuned; Patched; Cached; Sharded; Replicated; Backed Up; Archived; Monitored
  • Fault-Tolerant Data-Tier Systems: Tuned; Patched; Cached; Sharded; Replicated; Backed Up; Archived; Monitored. LOTS OF WORK
  • AWS Fault-Tolerant Data-Tier Services: S3; SimpleDB; EMR; New! DynamoDB; RDS. (Icon labels: Amazon Relational Database Service (RDS), Amazon Elastic MapReduce, Amazon Simple Storage Service (S3), Amazon SimpleDB, Amazon DynamoDB, Amazon ElastiCache)
  • RDS Fault-Tolerant Features: Multi-AZ Deployments; Read Replicas; Automated Backups; Snapshots. (Diagram labels: RDS DB Instance, RDS DB Instance Multi-AZ Standby)
  • New! Storage Gateway. (Diagram labels: Your Datacenter: Clients, Application Servers, On-premises Host, AWS Storage Gateway VM, Direct Attached or Storage Area Network Disks. Link: SSL over Internet or Direct Connect. AWS: AWS Storage Gateway Service, Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), Amazon Elastic Block Store (EBS))
  • Test! Use a Chaos Monkey! Prudent; Conservative; Professional; …and all the cool kids are doing it: http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
  • Thank You! @milesward
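
The “Test! Use a Chaos Monkey!” slide above can be illustrated with a small simulation. This is only a sketch of the idea, not Netflix's tool and not an AWS API: `Fleet` is a hypothetical in-memory stand-in for an auto-scaled group, `chaos_round` is a made-up helper, and the zone names and the "at least one instance remains" health rule are assumptions chosen to keep the example short.

```python
import random


class Fleet:
    """Toy in-memory stand-in for an auto-scaled group of instances (hypothetical)."""

    def __init__(self, desired, zones):
        self.desired = desired
        self.zones = zones
        self.instances = {}  # instance id -> zone
        self._next_id = 0
        self.heal()

    def heal(self):
        """Launch replacements until the fleet is back at the desired size."""
        while len(self.instances) < self.desired:
            zone = self.zones[self._next_id % len(self.zones)]
            self.instances[f"i-{self._next_id:04d}"] = zone
            self._next_id += 1

    def terminate(self, instance_id):
        del self.instances[instance_id]

    def is_serving(self):
        """Assumed health rule: the service is up while any instance remains."""
        return len(self.instances) > 0


def chaos_round(fleet, rng):
    """Kill one random instance, record whether the service stayed up, then heal."""
    victim = rng.choice(sorted(fleet.instances))
    fleet.terminate(victim)
    stayed_up = fleet.is_serving()
    fleet.heal()
    return victim, stayed_up


rng = random.Random(2012)  # fixed seed so runs are repeatable
fleet = Fleet(desired=4, zones=["zone-a", "zone-b"])
results = [chaos_round(fleet, rng) for _ in range(10)]
print(all(stayed_up for _, stayed_up in results), len(fleet.instances))
```

Running the same rounds against `Fleet(desired=1, zones=["zone-a"])` reports an outage on every round, which is the single point of failure the “Design For Failure” slide warns about. In a real environment the termination and health check would go through the provider's APIs and monitoring rather than an in-memory dictionary.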