The Art of Infrastructure Elasticity
April 28th, 2012
Cloud Developer Conference 2012, Bangalore

Harish Ganesan
CTO and Co-Founder, 8KMiles
Harish11g.AWS@gmail.com

Agenda
• Problem
• Challenges
• Requirements
• Solution Architecture
• Q&A

What is the problem scenario?

Big Sales Promotion every quarter by the Enterprise

• Massive online Concurrent Visitors

• Limited processing capacity of the Booking Engine (~3k requests/sec)

• Unhappy Visitors

• More booking opportunities lost

Solution (Step 1):
• Create a Queuing App in front of the Booking Engine
• Efficiently queue the concurrent visitors

Solution (Step 2):
• Moderate and move the visitors waiting in the Queuing App to the Booking Engine

What are the challenges?

Concurrency
• HTTP/AJAX/REST requests
 • Total: 500+ million requests in 6 hours
 • Average: 23k+ requests/sec
 • Peak: 80k+ requests/sec
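
The figures on this slide can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the deck (total requests, the 6-hour window, the peak rate and the ~3k requests/sec booking engine limit); the variable names are illustrative.

```python
# Figures quoted on the slides; everything else is derived arithmetic.
TOTAL_REQUESTS = 500_000_000   # "500+ million requests"
WINDOW_HOURS = 6               # "in 6 hours"
PEAK_RPS = 80_000              # "Peak: 80k+ requests/sec"
BOOKING_ENGINE_RPS = 3_000     # "~3k requests/sec" booking engine limit

window_seconds = WINDOW_HOURS * 3600
average_rps = TOTAL_REQUESTS / window_seconds
peak_to_average = PEAK_RPS / average_rps
peak_to_engine = PEAK_RPS / BOOKING_ENGINE_RPS

print(f"average: {average_rps:,.0f} requests/sec")
print(f"peak is {peak_to_average:.1f}x the average")
print(f"peak is {peak_to_engine:.1f}x the booking engine capacity")
```

The average works out to roughly 23k requests/sec, matching the slide, and the peak is more than 25 times what the booking engine can absorb directly.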




Queue efficiency
• Allot unique queue numbers to visitors
• Allot queue numbers on a fair basis (as far as possible)
• Reduce the wait time in the queue number allotment process
• Reduce the overall queue wait time for the visitor
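
One way to picture unique, fair allotment is a single increasing counter that gives each visitor a number in arrival order. This is a minimal in-process sketch, not the deck's implementation; `QueueNumberAllocator` and the visitor IDs are hypothetical names.

```python
import itertools
import threading


class QueueNumberAllocator:
    """Hands out unique, strictly increasing queue numbers (hypothetical helper)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()
        self._by_visitor = {}

    def allot(self, visitor_id):
        # A visitor who asks twice keeps the first number:
        # no duplicates and no queue jumping.
        with self._lock:
            if visitor_id not in self._by_visitor:
                self._by_visitor[visitor_id] = next(self._counter)
            return self._by_visitor[visitor_id]


allocator = QueueNumberAllocator()
numbers = [allocator.allot(v) for v in ["alice", "bob", "alice", "carol"]]
print(numbers)  # [1, 2, 1, 3]
```

Arrival order decides the number, which is the simplest reading of "fair basis"; a distributed deployment would need a shared counter rather than a process-local one.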
Load Volatility
[Chart: compute utilization over a year. Peak utilization during promos; wasted capacity and complete under-utilization of infra at other times.]
• Massive utilization and under-utilization pattern
IP Whitelisting
[Diagram: EC2 instances in the public cloud call 3rd-party services; the IP addresses of the source EC2 instances need to be whitelisted in the 3rd-party services gateway.]
• Booking engine needs EC2 IP whitelisting for security
• Consecutive IP range needed
Variety of OS / Software
• RedHat OS for Load Balancer, NoSQL and Queue Layer
• Apache Tomcat for the Java Web/App Layer
• CentOS for Processing Programs
• MySQL for Result storage
• Hadoop for Analytics
What are the requirements from the enterprise?

Requirements
• Elastic Infrastructure
 • Create the Infrastructure 2 hrs before the
   promo
 • Tear down infrastructure 2 hrs after the promo
 • Elastically expand the infra during the promo


• Highly Scalable and Available

• Log Analytics
• Complete Infrastructure Automation
Solution Architecture




Solution Architecture
Option 1: Single Queue (initial thought)
[Diagram: Concurrent visitors -> Queuing Application -> Booking Engine]
Solution Architecture
Option 2: Parallel Queue (recommended)
[Diagram: Concurrent visitors -> Queuing Application -> Booking Engine]
Request types
• A customer visit is an HTTP request to the Queuing Application
• The visitor's current queue position is fetched with an AJAX call to the Queuing Application every X seconds
 • More Wait ~ More Calls
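
"More Wait ~ More Calls" follows directly from the polling model: each waiting visitor makes one call per interval X. The sketch below makes that relationship concrete. The deck does not give X, the wait time or the queue length, so the numbers here are hypothetical.

```python
def polls_per_visitor(wait_seconds, poll_interval_seconds):
    """Number of queue-position AJAX calls one visitor makes while waiting."""
    return wait_seconds // poll_interval_seconds


def polling_rps(waiting_visitors, poll_interval_seconds):
    """Steady-state polling requests/sec generated by everyone in the queue."""
    return waiting_visitors / poll_interval_seconds


# Hypothetical numbers: X = 10 s, a 15-minute wait, 200,000 visitors queued.
print(polls_per_visitor(15 * 60, 10))   # 90
print(polling_rps(200_000, 10))         # 20000.0
```

Under these assumed values the polling traffic alone is of the same order as the average load on the Concurrency slide, which is why the position lookup has to be cheap.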


Solution Step 1: The Cloud?
• Amazon Web Services
• We had 4+ years of architecture experience in AWS
• It satisfied many customer requirements and challenges in this use case
Solution Step 2: R53/NW
[Diagram: Users -> Amazon Route 53 -> Amazon Virtual Private Cloud, with EC2 instances on AWS in VPC Subnet 1 (Availability Zone 1) and VPC Subnet 2 (Availability Zone 2)]
• Amazon VPC with Multi-AZ subnet configurations (HA)
• Amazon Route 53 for Managed DNS
• DNS RR algorithm at Route 53
Solution Step 3: Load Balancing
[Diagram: Users -> Amazon Route 53 -> HAProxy EC2 Instance 1 and HAProxy EC2 Instance 2 in VPC Subnet 1 (Availability Zone 1) of the Amazon Virtual Private Cloud; each HAProxy instance is an M1.large with EBS volumes and an Elastic IP, and uses the Round Robin algorithm]
Solution Step 3: Load Balancing
• HAProxy vs Amazon ELB
• Custom programs to Auto Scale HAProxy
• HAProxy Elastic -> Attach / Detach from Route 53
• HAProxy IP whitelisting in 3rd party Gateway
• 16 HAProxy Instances, 2 AZs, 2 Subnets
• RR Load Balancing algorithm
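
The round-robin rotation and the attach / detach behaviour can be sketched in a few lines. This is an in-memory illustration only: it does not call Route 53 or HAProxy, and `RoundRobinPool` and the documentation-range IP addresses are hypothetical.

```python
class RoundRobinPool:
    """Rotates over the attached endpoints; endpoints can join or leave."""

    def __init__(self):
        self._endpoints = []
        self._next = 0

    def attach(self, endpoint):
        if endpoint not in self._endpoints:
            self._endpoints.append(endpoint)

    def detach(self, endpoint):
        if endpoint in self._endpoints:
            self._endpoints.remove(endpoint)
            self._next = 0  # restart the rotation on the smaller pool

    def pick(self):
        if not self._endpoints:
            raise RuntimeError("no endpoints attached")
        endpoint = self._endpoints[self._next % len(self._endpoints)]
        self._next = (self._next + 1) % len(self._endpoints)
        return endpoint


pool = RoundRobinPool()
for ip in ["203.0.113.10", "203.0.113.11", "203.0.113.12"]:
    pool.attach(ip)
first_round = [pool.pick() for _ in range(4)]
pool.detach("203.0.113.11")
second_round = [pool.pick() for _ in range(3)]
print(first_round)
print(second_round)
```

The same rotation idea applies at both levels in the deck: DNS round robin across HAProxy instances, and HAProxy round robin across the Web/App instances behind it.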
Solution Step 4: Web/App Servers
[Diagram: Users -> Amazon Route 53 -> HAProxy EC2 Instance 1 -> (Round Robin algorithm) -> Web/App EC2 Instances 1, 2 and 3 in VPC Subnet 1 (Availability Zone 1) of the Amazon Virtual Private Cloud; the Web/App instance shown is a C1.Xlarge with EBS volumes and an Elastic IP]
Solution Step 4: Web/App Servers
• 3 Web/App instances under every HAProxy
• C1.Xlarge Instance Type for Web/App Instances
• Custom programs to Auto Scale C1.Xlarge
• Automatic Attach / Detach from HAProxy
• Every Web/App Instance with EIP for IP whitelisting
• 48 Web/App EC2 Instances spread across 2 AZs
Solution Step 5: Queue Servers
[Diagram: Users -> Amazon Route 53 -> HAProxy EC2 Instance 1 -> (Round Robin algorithm) -> Web/App 1, 2 and 3 -> RabbitMQ (m1.large with EBS volumes), all in VPC Subnet 1 (Availability Zone 1) of the Amazon Virtual Private Cloud]
Solution Step 5: Queue Servers
• RabbitMQ vs Amazon SQS
• FIFO / Concurrency / No Duplicate messages
• 1 RabbitMQ instance for queuing in every sector
• M1.large Instance Type
• 16 RabbitMQ Instances overall
Solution Step 6: Processors/Redis
[Diagram: single sector view, with steps numbered 1 to 7. Amazon Route 53 -> HAProxy (Round Robin algorithm) -> Web/App 1, 2 and 3 -> RabbitMQ -> Processors -> Redis Master / Redis Slave -> Processors -> Booking Engine]
Components of a Single Sector
1. One HAProxy
2. Three Web/App
3. One RabbitMQ
4. One BG Processor Node
5. Two Redis
Sector is not an AWS term; it is the 8KMiles term for logical EC2 instance groups for this use case
Solution Step 6: Redis
• Redis vs Amazon DynamoDB
• Redis: NoSQL KV Data store
• Visitors are shown their current queue position every X seconds from Redis
• 1 Redis Master-Slave pair for every sector
• M1.large Instance Type for Redis
• 32 Redis Instances overall
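
A key-value layout makes the queue-position lookup a couple of reads. In the sketch below a plain dict stands in for Redis, and the key names (`queue:last_admitted`, `visitor:<id>`) are assumptions for illustration, not the deck's actual schema.

```python
# A plain dict stands in for the Redis key-value store in this sketch.
store = {
    "queue:last_admitted": 1200,   # highest queue number already sent to booking
    "visitor:alice": 1250,         # queue number allotted to each visitor
    "visitor:bob": 1190,
}


def queue_position(store, visitor_id):
    """Place in line for this visitor; 0 means already admitted."""
    number = store[f"visitor:{visitor_id}"]
    return max(0, number - store["queue:last_admitted"])


print(queue_position(store, "alice"))  # 50
print(queue_position(store, "bob"))    # 0
```

Because every poll is a pair of constant-time reads, the lookup cost stays flat however long the queue grows.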
Solution Step 6: Processors
• BG Processors: Java programs to
  • RabbitMQ -> Redis: Allot queue numbers to visitor requests and insert them into Redis
  • Redis -> Booking Engine: Moderate the movement of queued visitors from Redis to the Booking Engine
  • Process the Response Status / Booking Status / Inactive Visitors / Timeouts
• 2 BG Processor nodes per sector
• CPU intensive: C1.Xlarge Instance Type
• 32 BG Processor Instances overall
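
"Moderating" the movement of visitors amounts to admitting a bounded batch per time slice so the booking engine is never handed more than it can process. The deck's processors are Java programs; the Python sketch below only illustrates the idea, and `drain` and `admit_per_tick` are hypothetical names.

```python
from collections import deque


def drain(waiting, admit_per_tick):
    """Move at most admit_per_tick visitors, in FIFO order, out of the queue."""
    admitted = []
    while waiting and len(admitted) < admit_per_tick:
        admitted.append(waiting.popleft())
    return admitted


waiting = deque(range(1, 11))   # queue numbers 1..10 are waiting
ticks = []
while waiting:
    ticks.append(drain(waiting, admit_per_tick=4))
print(ticks)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```

With one tick per second and the batch size set from the booking engine's capacity (about 3k requests/sec across all sectors, per the problem slide), the engine sees a steady flow instead of the raw peak.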
Overall Solution Architecture
[Diagram: Amazon Route 53 in front of Sectors 1 to 16; each sector stacks HAProxy, Web/App, RabbitMQ, Redis and BG Programs; all sectors feed the Booking Engine]
Sector is not an AWS term; it is the 8KMiles term for logical EC2 instance groups for this use case
Scalability
[Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud spanning AZ-1 (VPC Subnet 1, Availability Zone 1) and AZ-2 (VPC Subnet 2, Availability Zone 2); Sectors 1 to 4 are shown as groups of EC2 instances]
Scalability
• New sectors containing the LB, Web, Queue, NoSQL and BG stack are created automatically depending on the load
• Same AZ or multi-AZ can be specified for the creation
• CloudWatch custom parameters used
• Automated Java programs were used for the sector creation
• No manual intervention needed
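
The scaling decision can be reduced to "how many sectors does the current load need?". The deck does not state a per-sector capacity or the scaling rule, so the 5,000 requests/sec figure below is an assumption (the 80k peak divided across 16 sectors), and the function names are hypothetical.

```python
import math


def sectors_needed(load_rps, per_sector_rps, min_sectors=1):
    """Smallest number of sectors whose combined capacity covers the load."""
    return max(min_sectors, math.ceil(load_rps / per_sector_rps))


def sectors_to_add(current_sectors, load_rps, per_sector_rps):
    """How many new sectors to create; never negative during a promo."""
    return max(0, sectors_needed(load_rps, per_sector_rps) - current_sectors)


PER_SECTOR_RPS = 5_000  # assumed capacity, not a figure from the deck

print(sectors_needed(23_000, PER_SECTOR_RPS))      # 5
print(sectors_needed(80_000, PER_SECTOR_RPS))      # 16
print(sectors_to_add(5, 80_000, PER_SECTOR_RPS))   # 11
```

In practice the load figure would come from the custom CloudWatch metrics mentioned above, and the result would drive the automated sector creation.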
High Availability @ Instance level
[Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud; groups of EC2 instances in VPC Subnet 1 (Availability Zone 1, AZ-1) and VPC Subnet 2 (Availability Zone 2, AZ-2)]
High Availability @ Instance level
• HA built @ Web/App, Redis and BG processor instances
• Any failed / non-responsive EC2 instances are automatically detected and replaced by Java programs
• No manual intervention needed
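
The detect-and-replace loop can be sketched independently of any cloud API by passing the health probe and the launch step in as functions. The deck's programs are written in Java; this Python version is only an illustration, and `replace_unhealthy`, the instance names and both callbacks are hypothetical.

```python
def replace_unhealthy(instances, is_healthy, launch_replacement):
    """Return the fleet with every failed instance swapped for a fresh one."""
    fleet = []
    replaced = []
    for instance in instances:
        if is_healthy(instance):
            fleet.append(instance)
        else:
            fleet.append(launch_replacement(instance))
            replaced.append(instance)
    return fleet, replaced


down = {"web-2"}  # pretend this instance stopped responding
fleet, replaced = replace_unhealthy(
    ["web-1", "web-2", "web-3"],
    is_healthy=lambda name: name not in down,
    launch_replacement=lambda name: f"{name}-replacement",
)
print(fleet)     # ['web-1', 'web-2-replacement', 'web-3']
print(replaced)  # ['web-2']
```

Run on a schedule, a loop like this keeps the fleet size constant without manual intervention.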



High Availability @ Sector level
[Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud spanning AZ-1 (VPC Subnet 1, Availability Zone 1) and AZ-2 (VPC Subnet 2, Availability Zone 2); Sectors 1 to 6 are shown as groups of EC2 instances spread across the two AZs]
High Availability @ Sector level
• Any failed / non-responsive instances inside Sectors are automatically detected and replaced by Java programs
• If Sector-3 fails, the other sectors remain active and can take requests




High Availability @ AZ level
[Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud; groups of EC2 instances in VPC Subnet 1 (Availability Zone 1, AZ-1) and VPC Subnet 2 (Availability Zone 2, AZ-2)]
High Availability @ AZ level
• If the entire AZ-2 fails, the load is balanced to instances in AZ-1
• Automated programs create new sectors inside AZ-1 to handle the load




Log Analytics
[Diagram, steps 1 to 3: EC2 Instances -> S3 Bucket with logs -> Elastic MapReduce Jobs (HDFS Cluster) -> RDS MySQL]
• Redis, Web/App, HAProxy and RabbitMQ logs synced to S3
• Elastic MapReduce jobs to process / analyze the logs
• Processed results moved to RDS MySQL for reports / visualizations
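
The map and reduce steps of such a log job can be shown in miniature. This sketch runs in plain Python rather than on Elastic MapReduce, and the log line format and sample lines are invented for illustration; the deck does not describe the real log layout.

```python
from collections import Counter

# Hypothetical log lines: sector, tier, method, path, HTTP status.
SAMPLE_LOGS = [
    "sector-1 web GET /queue/position 200",
    "sector-1 web GET /queue/position 200",
    "sector-2 web GET /queue/position 503",
    "sector-2 web POST /queue/join 200",
]


def map_line(line):
    """Map step: emit ((sector, status), 1) for one log line."""
    sector, tier, method, path, status = line.split()
    return (sector, status), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each (sector, status) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


report = reduce_counts(map_line(line) for line in SAMPLE_LOGS)
for (sector, status), count in sorted(report.items()):
    print(sector, status, count)
```

The resulting per-sector, per-status counts are the kind of small aggregate that fits a MySQL table for reporting.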
Monitoring
• Nagios + Puppet (combined) for auto-scaled monitoring infra and deployment
• CloudWatch custom metrics / Tomcat Valve / automated Java programs for EC2

Backup
• No backups -> only syncs to S3
• Golden AMI snapshots to S3
• Periodic sync of data between EC2 and S3
• Periodic log sync from Web/App to S3

Infrastructure
• Amazon Route 53
• Amazon VPC – Public, Private subnet
• 150+ EC2 instances, 2 AZs, 1 Region
• 70+ Elastic IPs
• 200+ EBS volumes
• S3 buckets
• Suite of monitoring tools
• 1 Puppet Server
• Amazon CloudWatch
• Amazon CloudFront
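
The instance count can be reconciled with the per-tier totals quoted on the earlier slides. The sketch below multiplies the per-sector composition by 16 sectors; it uses 2 BG Processor nodes per sector, as stated on the Processors slide (32 overall), and the names are illustrative.

```python
SECTORS = 16

# Instances per sector, as quoted on the solution slides.
PER_SECTOR = {
    "HAProxy": 1,
    "Web/App": 3,
    "RabbitMQ": 1,
    "Redis": 2,          # master + slave
    "BG Processor": 2,   # "2 BG Processor nodes per sector"
}

totals = {tier: count * SECTORS for tier, count in PER_SECTOR.items()}
sector_instances = sum(totals.values())

for tier, total in totals.items():
    print(f"{tier:<13}{total:>4}")
print(f"{'All sectors':<13}{sector_instances:>4}")
```

The sectors account for 144 instances; the remainder of the "150+" would be supporting servers such as the Puppet server and the monitoring tools listed above.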
Infrastructure Elasticity
• Entire Infra created 2 hrs before promo
• Tear down infra 2 hrs after promo
• ~30 Mins to launch the infra in AWS
• ~45 Mins to tear down
• Automated Failure detection/rectification
• Automated Programs for Infra creation
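
The timings on this slide translate into a simple schedule. The sketch assumes that creation starts 2 hours before the promo and teardown starts 2 hours after it ends, and it uses a hypothetical promo window; only the 2-hour offsets and the ~30 / ~45 minute durations come from the deck.

```python
from datetime import datetime, timedelta


def infra_schedule(promo_start, promo_end):
    """Key timestamps for one promo, using the offsets quoted on the slide."""
    create_at = promo_start - timedelta(hours=2)      # start creating infra
    ready_by = create_at + timedelta(minutes=30)      # ~30 mins to launch
    teardown_at = promo_end + timedelta(hours=2)      # start tearing down
    gone_by = teardown_at + timedelta(minutes=45)     # ~45 mins to tear down
    return {
        "create_at": create_at,
        "ready_by": ready_by,
        "teardown_at": teardown_at,
        "gone_by": gone_by,
        "lifetime": gone_by - create_at,
    }


# Hypothetical 6-hour promo window.
schedule = infra_schedule(datetime(2012, 1, 1, 10, 0), datetime(2012, 1, 1, 16, 0))
for name, value in schedule.items():
    print(f"{name:<12}{value}")
```

Under these assumptions the whole fleet exists for under 11 hours per promo, which is the basis of the cost figure on the next slide.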


Infrastructure Cost
• ~10K USD per promo
• Not inclusive of Data charges

• Unthinkable Savings
• Visitor experience was good
• More Bookings per Promo

The power of elasticity is simply priceless
AWS is “AWSome”
Need help architecting highly elastic solutions on AWS?
Leave it to the experts; we will handle it



Cloud Architecture Consulting
Cloud Application Development
Cloud Migration & Implementation
Cloud Adoption Strategy


                                   “Let's get the job done”
Q&A
Harish11g.aws@gmail.com
http://in.linkedin.com/in/harishganesan
www.twitter.com/harish11g
http://harish11g.blogspot.com



Amazon Web Services
aws.amazon.com
aws.amazon.com/contact-us/aws-sales

More Related Content

What's hot

Overview of Amazon Web Services
Overview of Amazon Web ServicesOverview of Amazon Web Services
Overview of Amazon Web ServicesHarish Ganesan
 
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...Amazon Web Services
 
Auto scaling websites in the cloud
Auto scaling websites in the cloudAuto scaling websites in the cloud
Auto scaling websites in the cloudDavid Veksler
 
Amazon Ec2 Application Design
Amazon Ec2 Application DesignAmazon Ec2 Application Design
Amazon Ec2 Application Designguestd0b61e
 
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)Amazon Web Services
 
AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...
AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...
AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...Amazon Web Services
 
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...Amazon Web Services
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsAmazon Web Services
 
AWS Webcast - Design for Availability
AWS Webcast - Design for AvailabilityAWS Webcast - Design for Availability
AWS Webcast - Design for AvailabilityAmazon Web Services
 
Intro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute ServicesIntro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute ServicesAmazon Web Services
 
Get the Most Bang for Your Buck with #EC2 #WINNING
Get the Most Bang for Your Buck with #EC2 #WINNINGGet the Most Bang for Your Buck with #EC2 #WINNING
Get the Most Bang for Your Buck with #EC2 #WINNINGAmazon Web Services
 
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)Amazon Web Services
 
(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million UsersAmazon Web Services
 

What's hot (20)

Overview of Amazon Web Services
Overview of Amazon Web ServicesOverview of Amazon Web Services
Overview of Amazon Web Services
 
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
Optimizing Application Performance and Costs with Auto Scaling - AWS Online T...
 
Auto Scaling Groups
Auto Scaling GroupsAuto Scaling Groups
Auto Scaling Groups
 
Auto scaling websites in the cloud
Auto scaling websites in the cloudAuto scaling websites in the cloud
Auto scaling websites in the cloud
 
Amazon EC2 & VPC HOL
Amazon EC2 & VPC HOLAmazon EC2 & VPC HOL
Amazon EC2 & VPC HOL
 
CMS on AWS Deep Dive
CMS on AWS Deep DiveCMS on AWS Deep Dive
CMS on AWS Deep Dive
 
Amazon Ec2 Application Design
Amazon Ec2 Application DesignAmazon Ec2 Application Design
Amazon Ec2 Application Design
 
Amazon EC2 Masterclass
Amazon EC2 MasterclassAmazon EC2 Masterclass
Amazon EC2 Masterclass
 
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
 
AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...
AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...
AWS October Webinar Series - Using Spot Instances to Save up to 90% off Your ...
 
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
Automating Management of Amazon EC2 Instances with Auto Scaling - March 2017 ...
 
Your First Week with Amazon EC2
Your First Week with Amazon EC2Your First Week with Amazon EC2
Your First Week with Amazon EC2
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on aws
 
AWS Webcast - Design for Availability
AWS Webcast - Design for AvailabilityAWS Webcast - Design for Availability
AWS Webcast - Design for Availability
 
Intro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute ServicesIntro to AWS: EC2 & Compute Services
Intro to AWS: EC2 & Compute Services
 
AWS Black Belt Tips
AWS Black Belt TipsAWS Black Belt Tips
AWS Black Belt Tips
 
Get the Most Bang for Your Buck with #EC2 #WINNING
Get the Most Bang for Your Buck with #EC2 #WINNINGGet the Most Bang for Your Buck with #EC2 #WINNING
Get the Most Bang for Your Buck with #EC2 #WINNING
 
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
AWS Summit London 2014 | From One to Many - Evolving VPC Design (400)
 
(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 

Similar to The art of infrastructure elasticity

Getting Started with Serverless Architectures | AWS Public Sector Summit 2016
Getting Started with Serverless Architectures | AWS Public Sector Summit 2016Getting Started with Serverless Architectures | AWS Public Sector Summit 2016
Getting Started with Serverless Architectures | AWS Public Sector Summit 2016Amazon Web Services
 
Vault Digital Transformation
Vault Digital TransformationVault Digital Transformation
Vault Digital TransformationStenio Ferreira
 
Building a multi-tenanted Cloud-native AppServer
Building a multi-tenanted Cloud-native AppServerBuilding a multi-tenanted Cloud-native AppServer
Building a multi-tenanted Cloud-native AppServerAfkham Azeez
 
Production Ready Serverless Java Applications in 3 Weeks AWS UG Cologne Febru...
Production Ready Serverless Java Applications in 3 Weeks AWS UG Cologne Febru...Production Ready Serverless Java Applications in 3 Weeks AWS UG Cologne Febru...
Production Ready Serverless Java Applications in 3 Weeks AWS UG Cologne Febru...Vadym Kazulkin
 
VMware vFabric - CIO Webinar - Al Sargent
VMware vFabric - CIO Webinar - Al SargentVMware vFabric - CIO Webinar - Al Sargent
VMware vFabric - CIO Webinar - Al SargentVMware vFabric
 
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...Majid Hajibaba
 
2016-06 - Design your api management strategy - AWS - Microservices on AWS
2016-06 - Design your api management strategy - AWS - Microservices on AWS2016-06 - Design your api management strategy - AWS - Microservices on AWS
2016-06 - Design your api management strategy - AWS - Microservices on AWSSmartWave
 
What's New in AWS Serverless and Containers
What's New in AWS Serverless and ContainersWhat's New in AWS Serverless and Containers
What's New in AWS Serverless and ContainersAmazon Web Services
 
Breaking the Monolith Road to Containers
Breaking the Monolith Road to ContainersBreaking the Monolith Road to Containers
Breaking the Monolith Road to ContainersAmazon Web Services
 
Getting Started with AWS Lambda and the Serverless Cloud
Getting Started with AWS Lambda and the Serverless CloudGetting Started with AWS Lambda and the Serverless Cloud
Getting Started with AWS Lambda and the Serverless CloudAmazon Web Services
 
Cloud Computing from an Entrpreneur's Viewpoint
Cloud Computing from an Entrpreneur's ViewpointCloud Computing from an Entrpreneur's Viewpoint
Cloud Computing from an Entrpreneur's ViewpointJ Singh
 
(CMP404) Cloud Rendering at Walt Disney Animation Studios
(CMP404) Cloud Rendering at Walt Disney Animation Studios(CMP404) Cloud Rendering at Walt Disney Animation Studios
(CMP404) Cloud Rendering at Walt Disney Animation StudiosAmazon Web Services
 
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSArquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWS
Arquitetura Hibrida - Integrando seu Data Center com a Nuvem da AWSAmazon Web Services LATAM
 
Building high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftBuilding high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftRX-M Enterprises LLC
 
Introduction to First Commercial Memcached Service for Cloud
Introduction to First Commercial Memcached Service for CloudIntroduction to First Commercial Memcached Service for Cloud
Introduction to First Commercial Memcached Service for CloudGear6
 
Deep Dive on AWS Lambda - January 2017 AWS Online Tech Talks
Deep Dive on AWS Lambda - January 2017 AWS Online Tech TalksDeep Dive on AWS Lambda - January 2017 AWS Online Tech Talks
Deep Dive on AWS Lambda - January 2017 AWS Online Tech TalksAmazon Web Services
 
ServerlessPresentation
ServerlessPresentationServerlessPresentation
ServerlessPresentationRohit Kumar
 
Docker & aPaaS: Enterprise Innovation and Trends for 2015
Docker & aPaaS: Enterprise Innovation and Trends for 2015Docker & aPaaS: Enterprise Innovation and Trends for 2015
Docker & aPaaS: Enterprise Innovation and Trends for 2015WaveMaker, Inc.
 
AWS re:Invent 2016: Beeswax: Building a Real-Time Streaming Data Platform on ...
AWS re:Invent 2016: Beeswax: Building a Real-Time Streaming Data Platform on ...AWS re:Invent 2016: Beeswax: Building a Real-Time Streaming Data Platform on ...
AWS re:Invent 2016: Beeswax: Building a Real-Time Streaming Data Platform on ...Amazon Web Services
 

The art of infrastructure elasticity

  • 1. The Art of Infrastructure Elasticity • April 28th, 2012 • Cloud Developer Conference 2012, Bangalore • Harish Ganesan, CTO and Co-Founder, 8KMiles • Harish11g.AWS@gmail.com
  • 2. Agenda • Problem • Challenges • Requirements • Solution Architecture • Q&A
  • 3. What is the problem scenario?
  • 4. Big Sales Promotion every quarter by the Enterprise
  • 5. • Massive online Concurrent Visitors • Limited processing capacity of the Booking Engine (~3k requests/sec)
  • 6. • Unhappy Visitors • More booking opportunities lost
  • 7. Solution (Step 1): • Create a Queuing App in front of the Booking Engine • Efficiently queue the concurrent visitors
  • 8. Solution (Step 2): Moderate and move the visitors waiting in the Queuing App to the Booking Engine
  • 9. What are the Challenges?
  • 10. Concurrency • HTTP/AJAX/REST requests • Total: 500+ million requests in 6 hours • Average: 23k+ requests/sec • Peak: 80k+ requests/sec
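The average on slide 10 follows from the total and the window. Below is a rough check, treating the "500+", "80k+" and "~3k" figures as round numbers (an assumption, since the slides only give lower bounds and approximations):

```python
# Back-of-the-envelope check of the concurrency figures on slides 5 and 10.
# All inputs are the rounded numbers from the slides, not measured values.
TOTAL_REQUESTS = 500_000_000   # "500+ million requests"
WINDOW_SECONDS = 6 * 60 * 60   # "in 6 hours"
PEAK_RPS = 80_000              # "Peak: 80k+ requests/sec"
BOOKING_ENGINE_RPS = 3_000     # "~3k requests/sec" (slide 5)

average_rps = TOTAL_REQUESTS / WINDOW_SECONDS
peak_to_average = PEAK_RPS / average_rps
peak_to_engine = PEAK_RPS / BOOKING_ENGINE_RPS

print(f"average: {average_rps:,.0f} requests/sec")
print(f"peak is {peak_to_average:.1f}x the average")
print(f"peak is {peak_to_engine:.1f}x the Booking Engine capacity")
```

On these numbers the peak is several times the average and more than an order of magnitude above what the Booking Engine can absorb, which is the gap the Queuing App has to bridge.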
  • 11. Queue efficiency • Allot unique Queue Numbers for visitors • Queue Number allotment on a fair basis (as much as possible) • Reduce the wait time in the Queue Number allotment process • Reduce overall Queue wait time for the visitor
  • 12. Load Volatility • [Chart: compute over a year; peak utilization during promos, wasted capacity and complete underutilization of infra at other times] • Massive utilization and underutilization pattern
  • 13. IP Whitelisting • [Diagram: EC2 instances in the public cloud calling 3rd party services; the IP address of the source EC2 instances needs to be whitelisted in the 3rd party services gateway] • Booking engine needs EC2 IP whitelisting for security • Consecutive IP range needed
  • 14. Variety of OS / Software • RedHat OS for Load Balancer, NoSQL and Queue Layer • Apache Tomcat Java Web/App Layer • CentOS for Processing Programs • MySQL for Result storage • Hadoop for Analytics
  • 15. What are the requirements from the enterprise?
  • 16. Requirements • Elastic Infrastructure • Create the infrastructure 2 hrs before the promo • Tear down the infrastructure 2 hrs after the promo • Elastically expand the infra during the promo • Highly Scalable and Available • Log Analytics • Complete Infrastructure Automation
  • 18. Solution Architecture Option 1: Single Queue (initial thought) • [Diagram: Concurrent visitors, Queuing Application, Booking Engine]
  • 19. Solution Architecture Option 2: Parallel Queue (recommended) • [Diagram: Concurrent visitors, Queuing Application, Booking Engine]
  • 20. Request types • A customer visit is an HTTP request to the Queuing Application • The visitor's current queue position is an AJAX call every X seconds to the Queuing Application • More Wait ~ More Calls
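"More Wait ~ More Calls" can be made concrete with a small sketch. The function names and the sample figures (100,000 waiting visitors, a 5-second value for X) are hypothetical; the slides do not state either number:

```python
def polling_rate(waiting_visitors: int, poll_interval_s: float) -> float:
    """AJAX position checks per second generated by visitors waiting in the queue."""
    if poll_interval_s <= 0:
        raise ValueError("poll interval must be positive")
    return waiting_visitors / poll_interval_s


def calls_per_visitor(wait_s: float, poll_interval_s: float) -> int:
    """Position checks a single visitor makes while waiting wait_s seconds."""
    if poll_interval_s <= 0:
        raise ValueError("poll interval must be positive")
    return int(wait_s // poll_interval_s)


# Hypothetical inputs, for illustration only.
print(polling_rate(100_000, 5))    # checks/sec from 100,000 waiting visitors
print(calls_per_visitor(60, 5))    # a 1-minute wait
print(calls_per_visitor(600, 5))   # a 10-minute wait makes 10x the calls
```

The polling load grows with both the number of waiting visitors and how long each one waits, so shortening the queue wait reduces request volume as well as visitor frustration.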
  • 21. Solution Step 1: The Cloud? • Amazon Web Services • We had 4+ years of architecture experience in AWS • It satisfied many customer requirements and challenges in this use case
  • 22. Solution Step 2: R53/NW • [Diagram: Users -> Amazon Route 53 -> EC2 instances on AWS inside an Amazon Virtual Private Cloud, with VPC Subnet 1 in Availability Zone 1 and VPC Subnet 2 in Availability Zone 2] • Amazon VPC with Multi-AZ subnet configurations (HA) • Amazon Route 53 for Managed DNS • DNS RR algorithm at Route 53
  • 23. Solution Step 3: Load Balancing • [Diagram: Users -> Amazon Route 53 -> HAProxy EC2 Instance 1 and HAProxy EC2 Instance 2 inside the Amazon Virtual Private Cloud, each M1.large with an Elastic IP, EBS volumes and the Round Robin algorithm; VPC Subnet 1, Availability Zone 1]
  • 24. Solution Step 3: Load Balancing • HAProxy vs Amazon ELB • Custom programs to Auto Scale HAProxy • HAProxy Elastic -> Attach / Detach from Route 53 • HAProxy IP whitelisting in 3rd party gateway • 16 HAProxy instances, 2 AZs, 2 subnets • RR load balancing algorithm
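The round-robin idea with attach/detach can be illustrated with a minimal in-memory sketch. This is not HAProxy or Route 53 code; `RoundRobinPool` and the backend names are hypothetical and only show how rotation continues as members join and leave:

```python
class RoundRobinPool:
    """Rotates through attached backends in order; members can come and go."""

    def __init__(self, backends=()):
        self._backends = list(backends)
        self._next = 0  # index of the backend to hand out next

    def attach(self, backend):
        """Add a backend; it joins the rotation at the end of the list."""
        self._backends.append(backend)

    def detach(self, backend):
        """Remove a backend and keep the rotation index within bounds."""
        self._backends.remove(backend)
        self._next = self._next % len(self._backends) if self._backends else 0

    def pick(self):
        """Return the next backend in rotation."""
        if not self._backends:
            raise LookupError("no backends attached")
        backend = self._backends[self._next % len(self._backends)]
        self._next = (self._next + 1) % len(self._backends)
        return backend


pool = RoundRobinPool(["haproxy-1", "haproxy-2"])
print([pool.pick() for _ in range(4)])   # alternates between the two
pool.attach("haproxy-3")                 # scale out
print([pool.pick() for _ in range(3)])   # all three now share the traffic
```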
  • 25. Solution Step 4: Web/App Servers • [Diagram: Users -> Amazon Route 53 -> HAProxy EC2 Instance 1 (Round Robin algorithm) -> Web/App EC2 Instance 1, Web/App 2 and Web/App 3 inside the Amazon Virtual Private Cloud; C1.Xlarge with Elastic IP and EBS volumes; VPC Subnet 1, Availability Zone 1]
  • 26. Solution Step 4: Web/App Servers • 3 Web/App instances under every HAProxy • C1.Xlarge instance type for Web/App instances • Custom programs to Auto Scale C1.Xlarge • Automatic Attach / Detach from HAProxy • Every Web/App instance with EIP for IP whitelisting • 48 Web/App EC2 instances spread across 2 AZs
  • 27. Solution Step 5: Queue Servers • [Diagram: Users -> Amazon Route 53 -> HAProxy EC2 Instance 1 (Round Robin algorithm) -> Web/App 1, Web/App 2, Web/App 3 -> RabbitMQ (m1.large with EBS volumes) inside the Amazon Virtual Private Cloud; VPC Subnet 1, Availability Zone 1]
  • 28. Solution Step 5: Queue Servers • RabbitMQ vs Amazon SQS • FIFO / Concurrency / No duplicate messages • 1 RabbitMQ instance for queuing in every sector • M1.large instance type • 16 RabbitMQ instances overall
  • 29. Solution Step 6: Processors/Redis • Single Sector View [Diagram: Amazon Route 53 -> HAProxy (Round Robin algorithm) -> Web/App 1, Web/App 2, Web/App 3 -> RabbitMQ -> Processors with Redis Master and Redis Slave -> Booking Engine] • Components of a single sector: 1. One HAProxy 2. Three Web/App 3. One RabbitMQ 4. One BG Processor Node 5. Two Redis • Sector is not an AWS term; it is the 8KMiles term for logical EC2 instance groups for this use case
  • 30. Solution Step 6: Redis • Redis vs Amazon DynamoDB • Redis: NoSQL KV data store • Visitors are shown their current queue position every X seconds from Redis • 1 Redis Master-Slave pair for every sector • M1.large instance type for Redis • 32 Redis instances overall
  • 31. Solution Step 6: Processors • BG Processors: Java programs to • RabbitMQ -> Redis: allot queue numbers to visitor requests and insert them into Redis • Redis -> Booking Engine: moderate the movement of queued visitors from Redis to the Booking Engine • Process the Response Status / Booking Status / Inactive Visitors / Timeouts • 2 BG Processor nodes per sector • CPU intensive: C1.Xlarge instance type • 32 BG Processor instances overall
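The two processor steps can be sketched with in-memory stand-ins. This is an illustration only, not the Java programs from the talk: a `deque` plays the role of the RabbitMQ FIFO queue, a `dict` plays the role of Redis, and `SectorQueue` and its method names are hypothetical:

```python
from collections import deque
from itertools import count, islice


class SectorQueue:
    """In-memory stand-in for one sector's RabbitMQ -> Redis -> Booking Engine flow."""

    def __init__(self):
        self.incoming = deque()    # stand-in for the RabbitMQ FIFO queue
        self.positions = {}        # stand-in for Redis: visitor id -> queue number
        self._numbers = count(1)   # unique, strictly increasing queue numbers
        self._released = 0         # highest queue number already moved onward

    def visit(self, visitor_id):
        """A visitor's HTTP request lands in the FIFO queue."""
        self.incoming.append(visitor_id)

    def allot_numbers(self):
        """RabbitMQ -> Redis step: drain arrivals in order, one number per visitor."""
        while self.incoming:
            visitor_id = self.incoming.popleft()
            if visitor_id not in self.positions:  # a repeat visit keeps its number
                self.positions[visitor_id] = next(self._numbers)

    def ahead_of(self, visitor_id):
        """What a periodic position check would read: visitors still ahead."""
        return self.positions[visitor_id] - self._released - 1

    def release(self, batch_size):
        """Redis -> Booking Engine step: move at most batch_size visitors onward."""
        # Numbers are allotted in insertion order, so the dict's first entries
        # are always the lowest outstanding queue numbers.
        batch = list(islice(self.positions.items(), batch_size))
        for visitor_id, number in batch:
            del self.positions[visitor_id]
            self._released = number
        return [visitor_id for visitor_id, _ in batch]


queue = SectorQueue()
for visitor in ["v1", "v2", "v3", "v1"]:  # v1 refreshes the page
    queue.visit(visitor)
queue.allot_numbers()
print(queue.ahead_of("v3"))   # two visitors ahead
print(queue.release(2))       # the moderated batch sent to the Booking Engine
print(queue.ahead_of("v3"))   # now at the front
```

Capping `batch_size` per release is what keeps the flow into the Booking Engine within its limited capacity, whatever the arrival rate at the front of the queue.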
  • 32. Overall Solution Architecture • [Diagram: Amazon Route 53 -> Sectors 1, 2, 3, 4, 5 .. 16, each with HAProxy, Web/App, RabbitMQ, Redis and BG Programs -> Booking Engine] • Sector is not an AWS term; it is the 8KMiles term for logical EC2 instance groups for this use case
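The per-tier totals quoted on slides 24 to 31 can be cross-checked against the 16-sector layout. The per-sector counts below are taken from those slides, using the "2 BG Processor nodes per sector" figure from slide 31:

```python
# Instances per sector, as stated on slides 24-31.
SECTORS = 16
PER_SECTOR = {
    "HAProxy": 1,
    "Web/App": 3,
    "RabbitMQ": 1,
    "Redis": 2,          # master + slave
    "BG Processor": 2,
}

totals = {role: n * SECTORS for role, n in PER_SECTOR.items()}
grand_total = sum(totals.values())

for role, n in totals.items():
    print(f"{role}: {n}")
print(f"total across sectors: {grand_total}")
```

The per-tier totals match the slides (16, 48, 16, 32 and 32). The sector total is somewhat below the "150+ EC2 instances" quoted on slide 44; the deck does not break down the difference, which presumably covers supporting instances outside the sectors.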
  • 33. Scalability • [Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud spanning AZ-1 (VPC Subnet 1, Availability Zone 1) and AZ-2 (VPC Subnet 2, Availability Zone 2), with EC2 instances grouped into Sector-1 to Sector-4]
  • 34. Scalability • New sectors containing the LB, Web, Queue, NoSQL and BG stack are created automatically depending on the load • Same AZ or multi-AZ can be specified for the creation • CloudWatch custom parameters used • Automated Java programs were used for sector creation • No manual intervention needed
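A load-driven sector count could be computed along these lines. This is a sketch under stated assumptions: the per-sector capacity of 5,000 requests/sec is hypothetical (the 80k peak divided evenly over 16 sectors), the 25% headroom is arbitrary, and the load figure would come from a custom metric rather than being passed in by hand:

```python
import math


def sectors_needed(load_rps, per_sector_rps, headroom=0.25, minimum=1):
    """Sectors required to serve load_rps with a safety margin on top."""
    if per_sector_rps <= 0:
        raise ValueError("per-sector capacity must be positive")
    target = load_rps * (1 + headroom)
    return max(minimum, math.ceil(target / per_sector_rps))


def sectors_to_add(current_sectors, load_rps, per_sector_rps, headroom=0.25):
    """Positive: create this many sectors. Zero or negative: none needed."""
    return sectors_needed(load_rps, per_sector_rps, headroom) - current_sectors


# Hypothetical capacity of 5,000 requests/sec per sector.
print(sectors_needed(23_000, 5_000))               # around the average load
print(sectors_needed(80_000, 5_000, headroom=0))   # peak with no margin
print(sectors_to_add(16, 80_000, 5_000))           # extra sectors for 25% headroom
```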
  • 35. High Availability @ Instance level • [Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud spanning AZ-1 (VPC Subnet 1, Availability Zone 1) and AZ-2 (VPC Subnet 2, Availability Zone 2), with groups of EC2 instances in each]
  • 36. High Availability @ Instance level • HA built at the Web/App, Redis and BG Processor instances • Any failed / non-responsive EC2 instances are automatically detected and replaced by Java programs • No manual intervention needed
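The detect-and-replace loop can be sketched as a single reconciliation pass. The health check and the launch call are injected as plain functions because the deck does not describe how either was implemented; `reconcile` and the instance ids are hypothetical:

```python
def reconcile(instances, is_healthy, launch_replacement):
    """One pass of failure detection and replacement.

    instances: dict mapping instance id -> role (for example "web" or "redis").
    is_healthy: callable(instance_id) -> bool, the health probe (assumed).
    launch_replacement: callable(role) -> new instance id (assumed).
    Returns the updated instance map and the ids that were replaced.
    """
    updated, replaced = {}, []
    for instance_id, role in instances.items():
        if is_healthy(instance_id):
            updated[instance_id] = role
        else:
            updated[launch_replacement(role)] = role
            replaced.append(instance_id)
    return updated, replaced


# Fakes standing in for real health checks and instance launches.
fleet = {"i-web-1": "web", "i-redis-1": "redis", "i-bg-1": "bg"}
down = {"i-redis-1"}
fleet, replaced = reconcile(
    fleet,
    is_healthy=lambda instance_id: instance_id not in down,
    launch_replacement=lambda role: f"i-{role}-new",
)
print(replaced)
print(fleet)
```

Running such a pass on a timer is what removes the need for manual intervention: each failed instance is swapped for a fresh one with the same role.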
  • 37. High Availability @ Sector level • [Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud spanning AZ-1 (VPC Subnet 1, Availability Zone 1) and AZ-2 (VPC Subnet 2, Availability Zone 2), with EC2 instances grouped into Sector-1 to Sector-6]
  • 38. High Availability @ Sector level • Any failed / non-responsive instances inside sectors are automatically detected and replaced by Java programs • If Sector-3 fails, the other sectors remain active and can still take requests
  • 39. High Availability @ AZ level • [Diagram: Amazon Route 53 in front of an Amazon Virtual Private Cloud spanning AZ-1 (VPC Subnet 1, Availability Zone 1) and AZ-2 (VPC Subnet 2, Availability Zone 2), with groups of EC2 instances in each]
  • 40. High Availability @ AZ level • If the entire AZ-2 fails, the load is balanced to instances in AZ-1 • Automated programs create new sectors inside AZ-1 to handle the load
  • 41. Log Analytics • [Diagram: EC2 instances with logs, S3 bucket, Elastic MapReduce jobs on an HDFS cluster, RDS MySQL] • Redis, Web/App, HAProxy and RabbitMQ logs synced to S3 • Elastic MapReduce jobs to process / analyze the logs • Processed results moved to RDS MySQL for reports / visualizations
  • 42. Monitoring • Nagios + Puppet (combined) for auto-scaled monitoring infra and deployment • CloudWatch custom metrics / Tomcat Valve / automated Java programs for EC2
  • 43. Backup • No backups -> only syncs to S3 • Golden AMI snapshots to S3 • Periodic sync of data between EC2 and S3 • Periodic log sync from Web/App to S3
  • 44. Infrastructure • Amazon Route 53 • Amazon VPC: public and private subnets • 150+ EC2 instances, 2 AZs, 1 Region • 70+ Elastic IPs • 200+ EBS volumes • S3 buckets • Suite of monitoring tools • 1 Puppet Server • Amazon CloudWatch • Amazon CloudFront
  • 45. Infrastructure Elasticity • Entire infra created 2 hrs before the promo • Infra torn down 2 hrs after the promo • ~30 mins to launch the infra in AWS • ~45 mins to tear down • Automated failure detection / rectification • Automated programs for infra creation
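How little of the year this infrastructure actually runs can be estimated from figures elsewhere in the deck. The sketch assumes four promos a year ("every quarter", slide 4), a promo lasting the 6 hours quoted on slide 10, and a 365-day year; none of these is stated as an exact schedule:

```python
# Rough estimate of how long the promo infrastructure exists per year.
PROMOS_PER_YEAR = 4        # "every quarter" (slide 4), assumed to mean four
PROMO_HOURS = 6            # the 6-hour window quoted on slide 10
LEAD_HOURS = 2             # created 2 hrs before the promo
TRAIL_HOURS = 2            # torn down 2 hrs after the promo
HOURS_PER_YEAR = 365 * 24

hours_per_promo = LEAD_HOURS + PROMO_HOURS + TRAIL_HOURS
hours_per_year = PROMOS_PER_YEAR * hours_per_promo
share_of_year = hours_per_year / HOURS_PER_YEAR

print(f"{hours_per_promo} h per promo, {hours_per_year} h per year")
print(f"{share_of_year:.2%} of the year")
```

Under these assumptions the fleet exists for well under one percent of the year, which is the wasted capacity from the Load Volatility slide that an always-on deployment would be paying for.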
  • 46. Infrastructure Cost • ~10K USD per promo • Not inclusive of data charges • Unthinkable savings • Visitor experience was good • More bookings per promo • Power of Elasticity is simply priceless • AWS is “AWSome”
  • 47. Need help architecting highly elastic solutions on AWS?
  • 48. Leave it to the experts, we will handle this • Cloud Architecture Consulting • Cloud Application Development • Cloud Migration & Implementation • Cloud Adoption Strategy • “Let's get the job done”