How to Design for High Availability & Scale with AWS

Agenda
• Introduction
  • High Availability
  • Scalability
  • Fault Tolerance
• AWS Global Infrastructure
• Key Design Concepts
  • Design for Failure
  • Scaling
  • Self Healing / Fault Tolerant
  • Multiple AZ Architecture
  • Loose Coupling
• Sample Architectures
Blazeclan

Cloud IT Better

Introduction

How Often Do You See This?

Cost of Downtime
A report published in 2010 on the top 412 eCommerce sites says:
• The median length of downtime was 840 minutes
• On average, each of them saw 3291 minutes of downtime

Lost Revenue
• On average, each of them lost $800,099 in revenue due to downtime
• The total amount of revenue lost due to downtime across all 412 companies was $329,640,928!

Online Business & Downtime Facts
The Average Hourly Loss because of Data Center Downtime in 2012

Source: http://www.techrepublic.com/blog/data-center/infographic-the-outrageous-costs-of-data-center-downtime

How to Build a HIGHLY AVAILABLE, SCALABLE, DURABLE AND RESILIENT Web Application

High Availability
99.999% uptime

• Up Time of an Application
• Planned or Unplanned Outage or Downtime
• Offline, Unreachable, or Partially Available
• Slow to Use
• Goal
  • No Downtime
  • Always Available
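
To make an uptime target such as 99.999% concrete, it helps to convert it into a downtime budget. The sketch below does that arithmetic; the function name and the 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes, for an uptime target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.95, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% uptime allows {budget:.2f} minutes of downtime per year")
```

Under these assumptions, "five nines" leaves a budget of roughly five minutes per year, which is why planned outages and slow responses count against it as much as hard failures do.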

Scalability
• Ability of an application to accommodate a change in traffic without architectural changes
• Availability may be impacted if the application cannot scale
• Scalability doesn’t guarantee availability

[Chart: Resources vs. Demand over Time]

Fault Tolerance
• Built-in redundancy so applications can continue functioning when components fail
• Fault tolerance is crucial to High Availability

Image courtesy: Gigamone.com

AWS Global Infrastructure

AWS democratizes High Availability
• Multiple Servers
• Isolated Redundant Data Centers
• Regions across the Globe
• Availability Zones within Regions

Source: http://aws.amazon.com/about-aws/globalinfrastructure/#reglink-sa

AWS Capacity

Source: http://www.slideshare.net/AmazonWebServices/aws-webinar-scaling-on-aws-for-the-first-10-million-users

AWS Platform

Source: http://www.slideshare.net/AmazonWebServices/aws-webinar-scaling-on-aws-for-the-first-10-million-users

AWS Building Blocks
Inherently Highly Available and Fault Tolerant Services (Span Across AZs)
• Amazon S3
• Amazon DynamoDB
• Amazon SNS
• Amazon CloudFront
• Amazon SES
• Amazon Route 53
• Amazon SQS
• Amazon SWF
• Elastic Load Balancer
• …

Highly Available with Right Architecture (Architect Across AZs)
• Amazon EC2
• Amazon EBS
• Amazon RDS
• Amazon VPC

Design for Failure

"Everything fails, all the time" – Werner Vogels, CTO, Amazon

• Avoid single points of failure
• Assume everything fails, and work backwards
• Application should continue to function
• Avoid impact on business

Image caption: Obama’s prized limo after it broke down during his Israel visit!

Ask Questions for Right Architecture

• What kind of scenarios do I have to plan for?
• What are my single points of failure?
• If there are master and slaves in your architecture, what if the master node fails?
• If a load balancer is sitting in front of an array of application servers, what if that load balancer fails?
• What happens if a node in your system fails?

Lots of Questions
• How do you recognize that failure?
• How do I replace that node?
• What if the cache keys grow beyond the memory limit of an instance?
• How does the failover occur, and how is a new slave instantiated and brought into sync with the master?
• What if a downstream service times out or returns an exception?
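
One common answer to the downstream-timeout question is to retry with exponential backoff and then degrade to a fallback. The sketch below illustrates that pattern only; `call_with_retry`, its parameters, and the choice of `TimeoutError` and `ConnectionError` as retryable errors are assumptions, not something the slides prescribe.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `operation` up to `attempts` times, then degrade to `fallback`."""
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            # Back off exponentially between tries: 0.1s, 0.2s, 0.4s, ...
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed: serve a degraded answer instead of an error.
    return fallback()
```

A caller might pass a function that queries the downstream service as `operation` and one that returns a cached or default value as `fallback`, so the application keeps functioning while the dependency is unavailable.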

Build Mechanisms to Handle Failure
• Build process threads that resume on reboot
• Allow the state of the system to re-sync by reloading messages from queues
• Keep pre-configured and pre-optimized virtual images to support the points above on launch/boot
• Avoid in-memory sessions or stateful user context; move that to data stores
• Have a coherent backup and restore strategy for your data, and automate it

Image courtesy: http://www.outsmarthormones.com/wp-content/uploads/2011/06/Fix.jpg
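
A process that resumes on reboot needs to record its progress somewhere that survives the restart. The following is a minimal sketch of that idea using a checkpoint file; the file format and the helper names (`load_checkpoint`, `save_checkpoint`, `run_worker`) are hypothetical, and a real system might keep the checkpoint in a database or derive it from a queue instead.

```python
import json
import os
import tempfile
from typing import Callable, Sequence


def load_checkpoint(path: str) -> int:
    """Return the index of the next unprocessed item, or 0 on first run."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_index"]


def save_checkpoint(path: str, next_index: int) -> None:
    """Write to a temp file, then rename, so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)


def run_worker(items: Sequence[str], handle: Callable[[str], None], path: str) -> None:
    """Process items in order, resuming from the last saved checkpoint."""
    for index in range(load_checkpoint(path), len(items)):
        handle(items[index])
        save_checkpoint(path, index + 1)
```

If the process dies between `handle` and `save_checkpoint`, the same item is handled again after restart, so `handle` should be safe to repeat.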

Design for Failure

Source: http://media.amazonwebservices.com/architecturecenter/AWS_ac_ra_ftha_04.pdf

Scaling

Auto Scaling
• Enables you to automatically scale Amazon EC2 capacity up or down
• Enables you to terminate server instances at will
• Enables you to add more instances in response to increasing load
• Enables launch of a replacement instance immediately in case of a failure
• Enables the application to transition seamlessly in case the primary server fails

Image courtesy: http://www.knovelblogs.com/wp-content/uploads
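
The scaling behaviour described above can be pictured as a small decision function: grow or shrink the fleet in proportion to load, within fixed bounds. This is a simplified illustration and not the algorithm Auto Scaling itself uses; the function name, the metric, and the bounds are all assumptions.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     minimum: int = 1, maximum: int = 10) -> int:
    """Scale the fleet in proportion to load, clamped to [minimum, maximum]."""
    if target <= 0:
        raise ValueError("target must be positive")
    # Example: 4 instances at 90% CPU with a 60% target need ceil(4 * 90 / 60) = 6.
    wanted = math.ceil(current * metric / target)
    return max(minimum, min(maximum, wanted))
```

The `minimum` bound is what gives the replacement behaviour: if the only running instance fails and `current` drops to 0, the function still asks for one instance.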

Elastic Load Balancing (ELB)
• Distributes incoming traffic to an application across several Amazon EC2 instances
• ELB is given a DNS host name, and requests sent to this host name are delegated to a pool of Amazon EC2 instances
• ELB detects unhealthy instances within its pool of Amazon EC2 instances and automatically reroutes traffic to healthy instances until the unhealthy instances have been restored
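
The rerouting described in the last bullet can be sketched as a toy balancer that only hands requests to instances whose latest health check passed. The `LoadBalancer` class below is a hypothetical in-memory model, not the ELB API, and round-robin is an assumed routing policy chosen for simplicity.

```python
from itertools import count
from typing import Dict, List


class LoadBalancer:
    """Round-robin over instances that passed their most recent health check."""

    def __init__(self, instances: List[str]) -> None:
        self.healthy: Dict[str, bool] = {name: True for name in instances}
        self._counter = count()

    def report_health(self, instance: str, ok: bool) -> None:
        """Record a health-check result; unhealthy instances stop receiving traffic."""
        self.healthy[instance] = ok

    def route(self) -> str:
        """Return the instance that should serve the next request."""
        pool = [name for name, ok in self.healthy.items() if ok]
        if not pool:
            raise RuntimeError("no healthy instances available")
        return pool[next(self._counter) % len(pool)]
```

Calling `report_health("i-b", False)` removes that instance from rotation, and a later `report_health("i-b", True)` restores it, which mirrors the "until the unhealthy instances have been restored" behaviour.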

ELB & Auto Scaling
• Auto Scaling & ELB are an ideal combination
• ELB gives a single DNS name for addressing
• Auto Scaling ensures there is always the right number of healthy Amazon EC2 instances to accept requests

Fault Tolerant

Fault Tolerance
• In order to build fault-tolerant applications on Amazon EC2, it’s important to follow best practices such as:
  • Being able to quickly commission replacement instances
  • Using Amazon EBS for persistent storage
  • Using multiple Availability Zones and Elastic IP addresses

Multi-AZ Architecture

Multi-AZ Design Considerations
• Achieve greater fault tolerance by distributing your application geographically
• The Amazon EC2 service level agreement commitment is 99.95% availability for each Amazon EC2 Region
• Deploy an application that spans multiple Availability Zones
• Redundant instances for each tier of an application could be placed in distinct Availability Zones
• ELB can automatically balance traffic across multiple instances and multiple Availability Zones

Image courtesy: http://chriscampcommunications.blogspot.in
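
The benefit of placing redundant instances in distinct Availability Zones can be estimated with a standard reliability formula: a tier that stays up as long as any one zone is up has availability 1 - (1 - a)^n. The sketch below assumes zone failures are independent, which is an idealization, and the 99% per-zone figure in the comment is purely illustrative and not an AWS number.

```python
def combined_availability(per_zone: float, zones: int) -> float:
    """Availability of a tier that stays up as long as any one zone is up.

    Assumes zone failures are independent of each other.
    """
    if not 0.0 <= per_zone <= 1.0 or zones < 1:
        raise ValueError("per_zone must be in [0, 1] and zones >= 1")
    # Example: two zones at 0.99 each give 1 - 0.01 ** 2 = 0.9999.
    return 1.0 - (1.0 - per_zone) ** zones


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} zone(s): {combined_availability(0.99, n):.6f}")
```

Correlated failures, such as a bad deployment pushed to every zone at once, are not captured by this model, so the result is best read as an upper bound.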

Multi-AZ Architecture

Loose Coupling

Loosely Coupled Systems

• Loosely coupled systems are more fault tolerant and can achieve a bigger scale
• Loosely coupled systems on AWS
  • De-coupling systems allows for hybrid models (in-cloud + in-physical data center)
  • Balancing between clusters enables easier scaling
  • Using queues (Amazon SQS) buffers against failures
• Design for a jumble of black boxes
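
To see how a queue buffers against failures, consider a consumer that crashes while handling a message: the message should stay queued so that a later attempt can pick it up. The `BufferQueue` class below is a hypothetical in-memory stand-in used only to show the pattern; it is not the Amazon SQS API, where a similar effect comes from deleting a message only after it has been processed successfully.

```python
from collections import deque
from typing import Callable, Deque


class BufferQueue:
    """In-memory stand-in for a message queue such as Amazon SQS."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def send(self, body: str) -> None:
        """Producer side: enqueue and return without waiting for a consumer."""
        self._messages.append(body)

    def __len__(self) -> int:
        return len(self._messages)

    def process_next(self, handler: Callable[[str], None]) -> bool:
        """Give the oldest message to `handler`; remove it only on success."""
        if not self._messages:
            return False
        body = self._messages[0]
        try:
            handler(body)
        except Exception:
            return False  # message stays queued and can be retried later
        self._messages.popleft()
        return True
```

The producer never calls the consumer directly, so it keeps accepting work while the consumer is down, and the backlog drains once a healthy consumer returns.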

Decoupling using SQS

Loose Coupling - Best Practices on AWS
• Use Amazon SQS to isolate components
• Use Amazon SQS as buffers between components
• Design every component such that it exposes a service interface, is responsible for its own scalability, and interacts with other components asynchronously
• Bundle the logical construct of a component into an Amazon Machine Image so that it can be deployed more often
• Make your applications as stateless as possible. Store session state outside of the component (in Amazon SimpleDB, if appropriate)
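
The last practice, storing session state outside the component, can be sketched as follows. `SessionStore` is a hypothetical in-memory stand-in for an external data store, and `WebServer` is an illustrative class rather than any real framework; the point is only that any server can answer for a session created by another.

```python
import uuid
from typing import Dict, Optional


class SessionStore:
    """Stand-in for an external data store shared by all web servers."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, str]] = {}

    def create(self, values: Dict[str, str]) -> str:
        session_id = uuid.uuid4().hex
        self._data[session_id] = dict(values)
        return session_id

    def get(self, session_id: str) -> Optional[Dict[str, str]]:
        return self._data.get(session_id)


class WebServer:
    """Keeps no per-user state of its own; every lookup goes to the shared store."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def login(self, user: str) -> str:
        return self.store.create({"user": user})

    def whoami(self, session_id: str) -> Optional[str]:
        session = self.store.get(session_id)
        return session["user"] if session else None
```

Because the servers hold nothing themselves, one can be terminated or replaced between a user's requests without logging that user out.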

Sample Architectures

High Availability Architecture in RDS

Web Hosting on AWS

Scalable Reader Farm

Design for High Availability & Scale
Don’t let this happen to your Business

Our AWS Expert Solution Architects can help you review your Architecture.

Avail our 2-hour free consultancy!
For any assistance, please contact us at info@blazeclan.com

Upcoming Webinars
Check out Our Upcoming Webinars
www.blazeclan.com/webinars

Thank you
info@blazeclan.com
Our Blog: http://blog.blazeclan.com/
