This section describes methods for achieving fault tolerance on the web by making the website architecture fault tolerant. The different architectures are explained, and the best approach is picked at the end, with reasons why it is a good one.
Not everything on the cloud is fault tolerant!
You have to design it to be fault tolerant.
AWS offers dynamic fault tolerance.
Around 40% of users on AWS do not deploy any redundancy in their architectures.
The price of cloud resources has fallen roughly 25-fold (quoted as 2500%) in 7 years.
The AWS service warranty claims 99.95% availability; that is around 4 hours of downtime in a year.
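The downtime figure above follows directly from the availability percentage; a minimal sketch (the function name is illustrative):

```python
# Rough downtime budget implied by an availability percentage.
# 99.95% is the AWS SLA figure cited above.

HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours of allowed downtime per year at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

print(round(downtime_hours_per_year(99.95), 2))  # ~4.38 hours
```

So 99.95% works out to roughly 4.4 hours per year, consistent with the "around 4 hours" quoted above.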
Inherent Fault tolerant components
Amazon Simple storage (S3)
Amazon Elastic Load Balancing(ELB)
Amazon Elastic Compute Cloud(EC2)
Amazon Elastic Block Store (EBS)
"The above inherently fault tolerant components provide features such as AZs, Elastic IPs, and snapshots that a fault tolerant HA system must take advantage of and use correctly."
Simply put, AWS gives you the resources to make your system HA/FT.
Amazon EC2 (Amazon Elastic Compute Cloud): a web service that provides computing resources, i.e. server instances, to host your software.
AMI (Amazon Machine Image): a template containing the software and hardware configuration applied to an instance type.
EBS (Elastic Block Store): block-level storage volumes for EC2 instances, not tied to the lifetime of an instance. The AFR (annual failure rate) is around 0.1% to 0.5%.
Amazon AZs (Availability Zones) are zones within the same region.
Engineered to be insulated from failures in other AZs.
Independent power, cooling, network & security.
Elastic IP Addresses
Public IP addresses that can be mapped to any EC2 instance within a particular EC2 region.
Addresses are associated with the AWS account, not the instance.
In case of failure of an EC2 component, detach the Elastic IP from the failed component and map it to a reserve instance.
Remapping downtime is around 1-2 minutes.
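The failover pattern above can be sketched as a toy model: the address belongs to the account, not the instance, so recovery is just a re-association. (In practice this is done through the EC2 API, e.g. `aws ec2 associate-address`; the class and instance names here are illustrative.)

```python
# Toy model of Elastic IP failover: on failure, re-associate the
# account-owned address with a standby instance.

class ElasticIP:
    def __init__(self, address: str):
        self.address = address
        self.instance = None  # an association, not ownership

    def associate(self, instance_id: str):
        self.instance = instance_id

    def failover(self, standby_id: str):
        # Detach from the failed instance and map to the reserve one.
        self.instance = standby_id

eip = ElasticIP("203.0.113.10")
eip.associate("i-primary")
eip.failover("i-standby")   # primary failed; takes ~1-2 minutes for real
assert eip.instance == "i-standby"
```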
Auto Scaling enables you to automatically scale the number of EC2 instances up or down.
You define your own rules to achieve this, e.g. when the number of running EC2 instances < X, launch Y EC2 instances.
Use metrics from Amazon CloudWatch to launch/terminate EC2 instances, e.g. when resource utilization goes above a certain threshold.
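The two kinds of rules just described can be sketched as simple predicates (function names and the 70% threshold are illustrative, not AWS defaults):

```python
# Sketch of the Auto Scaling rules above:
# 1. "when the number of running EC2 instances < X, launch Y instances"
# 2. CloudWatch-style: scale out when a metric crosses a threshold.

def instances_to_launch(running: int, minimum_x: int, launch_y: int) -> int:
    """Return how many instances the count-based rule would launch."""
    return launch_y if running < minimum_x else 0

def should_scale_out(cpu_utilization: float, threshold: float = 70.0) -> bool:
    """Metric-based rule: scale out above a utilization threshold."""
    return cpu_utilization > threshold

assert instances_to_launch(running=1, minimum_x=2, launch_y=1) == 1
assert instances_to_launch(running=2, minimum_x=2, launch_y=1) == 0
assert should_scale_out(85.0) is True
```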
An example of Auto Scaling & ELB follows.
Elastic Load Balancing
The Elastic Load Balancer distributes incoming traffic across available EC2 instances.
Monitors EC2 instances and removes failed ones from rotation.
Works in parallel with Auto Scaling to implement N+1 redundancy.
N+1 Redundancy with Auto Scaling & ELB
Let's say N=1.
Define rule X: 2 instances of the defined AMI are always available.
ELB distributes the load between the 2 servers, and each server has enough capacity to handle the entire traffic, i.e. N=1.
Server 1 goes down.
Server 2 can process the entire traffic.
Auto Scaling identifies the failure and launches a healthy EC2 instance from the AMI to fulfill rule X.
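The N+1 walk-through above can be replayed as a small simulation (all identifiers are illustrative; real Auto Scaling does this via launch configurations and health checks):

```python
# Rule X keeps 2 instances alive; each instance alone can carry the
# full traffic (N=1), so losing one server never drops requests.

def enforce_rule_x(healthy: set, desired: int = 2) -> set:
    """Auto Scaling: launch replacements until `desired` instances run."""
    replacements = 0
    while len(healthy) < desired:
        replacements += 1
        healthy = healthy | {f"i-replacement-{replacements}"}
    return healthy

fleet = {"i-server-1", "i-server-2"}
fleet.discard("i-server-1")      # server 1 goes down
assert len(fleet) == 1           # server 2 still serves all traffic
fleet = enforce_rule_x(fleet)    # Auto Scaling restores rule X
assert len(fleet) == 2
```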
Fault Tolerant Web Design
Architecting High Availability in AWS
High Availability in the Web/App Layer
High Availability in the Load Balancing Layer
High Availability in the Database Layer
It is common practice to launch the Web/App layer on more than one EC2 instance to avoid a SPOF (single point of failure).
How would user session information be shared between the instances?
It is hence necessary to synchronize session data among the EC2 instances.
Not every application can work with a stateless server configuration.
Option 1: JGroups
A toolkit for reliable messaging.
Can be used by Java-based servers.
Suited for a maximum of around 5-10 EC2 instances; not suited for larger architectures.
Option 3: RDBMS
Many use it, but it is considered poor design.
The master will be overwhelmed by session traffic.
An m1 RDS MySQL master has a maximum of 600 connections; 400 online users will generate session requests, leaving only 200 connections to serve transaction/user queries.
This can cause intermittent web service downtime for the reason above.
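The connection-budget arithmetic behind that warning is simple; a sketch using the figures from above:

```python
# With a 600-connection cap on the RDS master, session traffic from
# 400 online users eats two thirds of the budget before any real
# transaction queries are served.

MAX_CONNECTIONS = 600
session_connections = 400  # one session connection per online user

remaining = MAX_CONNECTIONS - session_connections
print(remaining)  # 200 connections left for transaction/user queries
```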
Option 2: Memcached
Highly used; supports multiple platforms and languages.
Save user session data on multiple nodes to avoid a SPOF (trading off the latency of writing to multiple nodes).
Depending on requirements, create high-memory EC2 instances for the cache nodes.
Can scale up to tens of thousands of users.
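The multi-node write described above can be sketched as follows; plain dicts stand in for real Memcached clients, and the class, hashing scheme, and replica count are all illustrative assumptions:

```python
# Session store that writes every key to `replicas` nodes (the latency
# trade-off) so a single node failure never loses session data.

class ReplicatedSessionStore:
    def __init__(self, node_count: int, replicas: int = 2):
        self.nodes = [dict() for _ in range(node_count)]
        self.replicas = replicas

    def _targets(self, key: str):
        # Pick `replicas` consecutive nodes starting from a hash slot.
        start = hash(key) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key: str, value):
        for i in self._targets(key):  # extra writes = extra latency
            self.nodes[i][key] = value

    def get(self, key: str, failed=frozenset()):
        # Fall back to the next replica if a node has failed.
        for i in self._targets(key):
            if i not in failed and key in self.nodes[i]:
                return self.nodes[i][key]
        return None

store = ReplicatedSessionStore(node_count=4)
store.put("session:42", {"user": "alice"})
# Even with the key's primary node down, the replica still serves it.
primary = store._targets("session:42")[0]
assert store.get("session:42", failed={primary}) == {"user": "alice"}
```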
Load Balancing Layer
It balances the load among the available EC2 instances.
A SPOF in the load balancer can bring down the entire site during an outage.
Making it redundant is equally as important as replicating servers, databases, etc.
There are many ways to build a highly available load balancing tier.
Load Balancing Tier
Option 1: Elastic Load Balancer
Inherently fault tolerant.
Automatically distributes incoming traffic among EC2 instances.
Automatically creates more ELB EC2 instances when load increases, to avoid a bottleneck.
Detects the health of EC2 instances and routes traffic only to healthy instances.
ELB Implementation Architecture
ELB with Auto Scaling (inside an AZ)
Web/App EC2 instances are configured with Auto Scaling to scale out/down.
Amazon ELB can direct the load seamlessly to the EC2 instances configured with Auto Scaling.
ELB with Auto Scaling across AZs
EC2 instances can be configured with Amazon Auto Scaling to scale out/down across AZs.
Highly recommended: this offers the highest availability among all ELB configurations.
Issues with ELB
Supports only round-robin & sticky-session algorithms; weighted routing only as of 2013.
Designed to handle incremental traffic. Sudden flash traffic can lead to non-availability until scaling up occurs.
The ELB needs to be "pre-warmed" to handle sudden traffic; this is currently not configurable from the AWS console.
Known to be "non round-robin" when requests are generated from a single IP or a specific range of IPs, such as multiple requests from within a company operating on a specific IP range.
3rd Party Load Balancers
Nginx & HAProxy can work as load balancers on EC2.
Use your own scripts to scale EC2 instances up, since Auto Scaling works best with ELB.
Load Balancing Algorithms
Random: sends connection requests to servers randomly (simple approach).
Round Robin: passes each new connection request to the next server in line, eventually distributing connections evenly.
Weighted Round Robin: assigns a weight to each machine based on its capacity; the number of connections each machine receives depends on its weight.
More algorithms exist, such as Least Connections, Fastest, etc.
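Minimal versions of round robin and weighted round robin can be sketched with `itertools.cycle`; the server names and weights are illustrative:

```python
import itertools

def round_robin(servers):
    """Plain round robin: cycle through servers in order."""
    return itertools.cycle(servers)

def weighted_round_robin(weighted_servers):
    """weighted_servers: list of (server, weight) pairs.
    Each server appears in the rotation in proportion to its weight."""
    expanded = [s for s, w in weighted_servers for _ in range(w)]
    return itertools.cycle(expanded)

rr = round_robin(["a", "b"])
assert [next(rr) for _ in range(4)] == ["a", "b", "a", "b"]

wrr = weighted_round_robin([("big", 2), ("small", 1)])
assert [next(wrr) for _ in range(3)] == ["big", "big", "small"]
```

A real implementation would interleave weighted picks (smooth weighted round robin) rather than send consecutive requests to the same server, but the proportions are the same.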
A load balancing algorithm that adapts its strategy for allocating web requests dynamically.
Prober: gathers status info from the web servers every 50 ms:
CPU load on the server
Server's response rate
Number of requests served
Allocator: based on the prober's updates, the allocator updates the server weights.
The proposed algorithm differs by considering both local and global information at each web server to choose the best server to serve a request.
Load Balancing on Real-Time Server Stats
Deciding factors used in the algorithm:
Weighted metric of cache hits on different servers
CPU load of the web server
Server response rate
Number of client requests being handled
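The allocator step can be sketched by turning the prober's per-server stats into normalized weights. The scoring formula below is an illustrative assumption, not the paper's exact metric; it only demonstrates the idea of favoring servers with low load and good throughput:

```python
# Sketch of an allocator: combine the deciding factors listed above
# (cache hits, CPU load, response rate, active requests) into weights.

def allocator_weights(stats):
    """stats: {server: {"cpu": %, "resp_rate": req/s,
                        "active": n, "cache_hits": n}}"""
    scores = {}
    for server, s in stats.items():
        # Favor high response rate and cache hits;
        # penalize high CPU load and many active requests.
        scores[server] = (s["resp_rate"] + s["cache_hits"]) / (
            1 + s["cpu"] + s["active"]
        )
    total = sum(scores.values())
    return {srv: sc / total for srv, sc in scores.items()}

stats = {
    "web1": {"cpu": 20, "resp_rate": 100, "active": 5, "cache_hits": 50},
    "web2": {"cpu": 80, "resp_rate": 40, "active": 30, "cache_hits": 10},
}
w = allocator_weights(stats)
assert w["web1"] > w["web2"]  # the less-loaded server gets more traffic
```

With weights refreshed on each 50 ms prober update, request allocation adapts as server conditions change.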