Scaling web application in the Cloud
SCALING WEB APPLICATIONS
ON THE CLOUD
Federico Feroldi - firstname.lastname@example.org
1/3 of your
125M PVs cost
[Chart: user demand vs. server capacity, by month from Jan to Dec]
TOPICS FOR TODAY
‣ What is Scalability?
‣ Amazon Web Services
‣ Building a web application on AWS
‣ Tools and Best Practices
Geek by nature, working on the web since 1996
WHAT IS SCALABILITY?
A desirable property of a system, which indicates its ability to:
‣ Gracefully handle increasing demand.
‣ Increase its capacity (transactions, processing, storage,
throughput) proportionally with the addition of more
resources (nodes, CPU, storage, bandwidth).
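One way to make "proportionally" concrete is scaling efficiency: the capacity measured with N nodes divided by N times the capacity of a single node. The sketch below is only an illustration; the function name and the sample numbers are assumptions, not figures from the talk.

```python
def scaling_efficiency(single_node_capacity, nodes, measured_capacity):
    """Measured capacity as a fraction of perfectly linear scaling."""
    if nodes < 1 or single_node_capacity <= 0:
        raise ValueError("need at least one node with positive capacity")
    ideal_capacity = single_node_capacity * nodes
    return measured_capacity / ideal_capacity


# Hypothetical measurement: 1 node serves 500 req/s, 4 nodes serve 1800 req/s.
print(f"{scaling_efficiency(500, 4, 1800):.0%} of linear scaling")
```

A value close to 1.0 means the added resources translate almost entirely into added capacity.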
Scaling up (vertical): add resources to a single node in the system.
Pros: transparent to the software, almost “linear” scaling,
quickly fixes performance issues.
Cons: risky, may impact the service, can be very expensive,
cannot scale “infinitely”.
Scaling out (horizontal): add more nodes to the system.
Pros: cheap, almost no impact on service, can scale “infinitely”.
Cons: the system must be properly designed, could be
complex to operate.
SCALE FAST OR DIE!
If your application is accessible from the Internet you can't
decide how many people are going to use it.
The more time it takes for your systems to adapt to user
demand, the further in advance you have to provision
your system to handle future usage (and
most of the time your forecast will be wrong).
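The cost of slow provisioning can be sketched with a toy model. If new capacity takes `lead_time` periods to arrive, each period has to hold enough capacity for the peak demand inside that window. The demand series and the function names are made up for illustration, and the model generously assumes a perfect forecast.

```python
def capacity_plan(demand, lead_time):
    """Capacity to hold in each period when new servers take `lead_time`
    periods to arrive: the peak demand inside that window."""
    return [max(demand[i:i + lead_time + 1]) for i in range(len(demand))]


def idle_capacity(demand, lead_time):
    """Total capacity paid for but not used over the whole series."""
    plan = capacity_plan(demand, lead_time)
    return sum(capacity - used for capacity, used in zip(plan, demand))


# Hypothetical monthly demand, in arbitrary "server" units.
monthly_demand = [10, 12, 30, 14, 11, 40, 15, 12]
for lead_time in (0, 1, 2):
    print(lead_time, idle_capacity(monthly_demand, lead_time))
```

With a lead time of zero nothing sits idle; every extra period of lead time adds capacity that is paid for and unused, even before forecast errors are taken into account.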
The ability to quickly and
gracefully increase capacity
by adding more resources and
also to quickly and gracefully
release resources when the
required capacity decreases.
In a few words: the ability to
quickly scale a system's
capacity up and down based
on the demand in an almost
Amazon launched Amazon Web
Services in 2002 (S3 and EC2 followed in 2006)
Currently the most mature,
feature-rich and flexible IaaS
In a few words: AWS is the Data
Center in the Cloud (and much
5 REASONS TO USE AWS
1. It’s cheap: pay per use, simple cost model.
2. It’s efficient: VMs, DBs, NAS, storage, messaging,
CDNs, LBs, etc...
3. It’s safe: proven infrastructure, Amazon itself builds its
services on the same technologies.
4. It’s flexible: everything can be managed with an API
call, you can build your own tools or use one of the
5. It’s unlimited: virtually no limit on VM instances or
storage you can use.
WEB APPS COMPONENTS
Web/Application servers: to serve dynamic pages
Databases: to store user data
File Storage: to store images, videos, documents
Computing nodes: to execute background tasks like
image conversion, video transcoding, email delivery
Messaging infrastructure: asynchronous and reliable
communication between nodes
Load balancers: distribute requests
Monitoring infrastructure: to check that everything is
working well and track the system usage
BUILDING ON AWS
[Diagram labels: Serve Web Application; Load balancers: distribute requests; Store User Data; Process Submissions]
Coderloop is a community where programmers can practice their
skills and compete against each other to solve complex
puzzles.
Users send computer programs to solve certain
puzzles; Coderloop executes the
programs, verifies their
correctness and gives a
FIRST A WEB SERVER
EC2 is the Elastic Compute Cloud.
Users can launch virtual machines (called instances) with a
CLI tool or an API.
Instances have dynamic IPs, but you can reserve fixed IPs
and associate them with an instance.
Instances are created from customizable “images” and
they come in many sizes (memory and CPU).
You pay for the CPU time and the bandwidth.
When an instance is turned off, you lose the data in it
PERSISTENT USERS’ DATA
RDS is the Relational Database Service.
It’s a fully managed MySQL database.
You can scale it as needed (CPU and available storage).
It’s automatically patched when needed.
Supports Multi-AZ deployments and replication.
You can create backups periodically or with a single API call.
You pay for CPU usage, dedicated primary storage,
backup storage (optional) and bandwidth used.
THE NOSQL ALTERNATIVE
SimpleDB is the non-relational data store.
It’s a fully managed, scalable “data store”.
Simple web service API, no SQL.
No need to define schema in advance.
Scales automatically, you just keep adding data.
You pay for CPU time, used storage and bandwidth.
25 hours of CPU and 1GB of storage for free each month.
STORING FILES AND IMAGES
S3 is the Simple Storage Service.
Simple API (REST and SOAP) to write objects (files) in
buckets (similar to directories).
Objects can be from 1 byte to 5GB in size.
Standard (99.999999999%) or reduced (99.99%) durability.
You pay for the storage used and the bandwidth.
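Reading the two percentages above as annual object durability (an interpretation assumed here), the difference can be turned into an expected number of lost objects per year. The function name and the object count are illustrative assumptions.

```python
from decimal import Decimal


def expected_annual_loss(object_count, durability_percent):
    """Expected number of objects lost per year at a given durability.

    durability_percent is passed as a string such as "99.99" so that
    Decimal keeps every digit exactly (a float would round eleven nines).
    """
    loss_rate = 1 - Decimal(durability_percent) / 100
    return object_count * loss_rate


# Hypothetical bucket holding 10,000 objects.
for label, percent in (("standard", "99.999999999"), ("reduced", "99.99")):
    print(label, expected_annual_loss(10_000, percent))
```

Under that reading, reduced storage loses on average about one object per year out of 10,000, while standard storage loses a tiny fraction of one.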
We use EC2 with SQS (Simple Queue Service)
Reliable, highly scalable message queue.
Web servers queue “job requests” that are picked up by EC2
instances and processed.
EC2 instances can be added or removed based on the
amount of requests to be processed.
You pay for the amount of requests made to the service
and the amount of data transferred.
The first 100K requests are free each month.
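The queue-and-workers pattern described above can be sketched in a single process. Note the assumptions: Python's `queue.Queue` stands in for SQS and threads stand in for EC2 instances, so this shows the shape of the pattern, not the SQS API. `run_workers` and the squaring handler are hypothetical names.

```python
import queue
import threading


def run_workers(jobs, worker_count, handler):
    """Queue every job, then let `worker_count` workers drain the queue."""
    job_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = job_queue.get()
            if job is None:  # sentinel: no more work for this worker
                job_queue.task_done()
                return
            outcome = handler(job)
            with results_lock:
                results.append(outcome)
            job_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:  # the "web server" side: enqueue job requests
        job_queue.put(job)
    for _ in threads:  # one sentinel per worker so each one stops
        job_queue.put(None)
    job_queue.join()
    for thread in threads:
        thread.join()
    return results


print(sorted(run_workers(range(5), worker_count=3, handler=lambda n: n * n)))
```

Changing `worker_count` changes how fast the queue drains without touching the producer, which is what makes it easy to add or remove workers based on the backlog.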
SCALING OUT THE APP
We can distribute the requests with the Elastic Load Balancer.
It monitors the available instances and routes incoming
requests to “healthy” instances.
Supports sticky sessions and SSL termination.
You pay for the running time and the data transferred.
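The routing behaviour can be illustrated with a toy round-robin balancer that skips instances marked unhealthy. This is a sketch of the idea only; the class and its methods are hypothetical, and the real service's algorithm and health checks are not described in the talk.

```python
class RoundRobinBalancer:
    """Hands out instances in turn, skipping any marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._next = 0

    def mark(self, instance, is_healthy):
        """Record the result of a (hypothetical) health check."""
        if is_healthy:
            self.healthy.add(instance)
        else:
            self.healthy.discard(instance)

    def route(self):
        """Return the next healthy instance, or fail if there is none."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
balancer.mark("web-2", is_healthy=False)
print([balancer.route() for _ in range(4)])
```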
With RDS we can easily set up read replicas for our
database to scale the read capacity.
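On the application side, read replicas only help if reads are actually sent to them. A minimal sketch of that routing, assuming a naive rule (statements starting with SELECT are reads) and hypothetical endpoint names; replicas may lag slightly behind the primary, so reads that must see the latest write should still go to the primary.

```python
import itertools


class ReplicaRouter:
    """Sends reads to replicas in turn and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        # Naive classification, good enough for a sketch.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReplicaRouter("primary-db", ["replica-1", "replica-2"])
print(router.endpoint_for("SELECT * FROM users"))
print(router.endpoint_for("INSERT INTO users VALUES (1)"))
```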
With CloudWatch we can monitor our servers’ usage in
real time: CPU, Disk and Network for EC2 instances and
This information can be used to enable Auto Scaling:
automatically scale your Amazon EC2 capacity up or down
according to conditions you define (based on average CPU
utilization, network activity or disk utilization).
New EC2 instances get automatically added to or
removed from the Elastic Load Balancer
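A scaling condition based on average CPU utilization can be sketched as a small decision function. The thresholds, the limits and the one-instance step are illustrative assumptions, not Auto Scaling defaults.

```python
def desired_instances(current, cpu_samples, scale_up_at=70.0,
                      scale_down_at=30.0, minimum=1, maximum=10):
    """Return the instance count to aim for, given recent CPU samples (%)."""
    average = sum(cpu_samples) / len(cpu_samples)
    if average > scale_up_at:
        target = current + 1  # busy: add one instance
    elif average < scale_down_at:
        target = current - 1  # mostly idle: remove one instance
    else:
        target = current      # inside the comfortable band: do nothing
    return max(minimum, min(maximum, target))


print(desired_instances(current=2, cpu_samples=[80.0, 90.0, 85.0]))
print(desired_instances(current=2, cpu_samples=[10.0, 20.0, 15.0]))
```

Keeping a gap between the two thresholds avoids adding and removing instances back and forth when the load hovers around a single value.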
AWS is deployed on multiple Regions and Availability Zones (AZs).
AZs are distinct locations, insulated from failures in other
AZs, and provide inexpensive, low-latency network
connectivity to other AZs in the same Region.
Regions consist of one or more AZs, are geographically
dispersed, and will be in separate geographic areas or
countries: currently Northern Virginia, Northern California,
Ireland, and Singapore.
Centralized configuration management
Rapid creation of an instance of a pre-defined type: web
server, email server, etc...
Ensuring uniform configuration of the entire set of instances
of the same type at all times.
A Puppet Master, guardian of configurations.
Multiple Puppet clients installed on EC2 instances.
Application deployment and parallel execution of
Scripting a certain number of tasks, whether complex or not
(deliveries, backups, site publication/maintenance, etc.)
executing them rapidly in parallel on X instances with a
Webistrano (web interface) for 1-click deployments
MONITORING AND LOGS
Supervision verifies the state of a host or a service and
sends out an alarm upon detecting any abnormal behavior:
Metrology enables instrumentation data to be archived, and
if necessary, processed or filtered, before it is presented in
the form of graphs or reports: Cacti
The more instances you have, the more your logs are
scattered across them. Implement centralized
log collection: Syslog-NG
TIPS & BEST PRACTICES
6 TIPS TO REDUCE COSTS
Keep machines in the same availability zone
Use spot instances (can cost up to 3 times less)
Choose your instance types wisely (c1.medium costs 2x
m1.small but offers 5x the computing power)
Choose the smallest possible storage (it’s very easy to
expand the capacity of an RDS instance)
Use Autoscaling (always run the minimum number of
instances needed to handle the traffic)
Reserve your instances (you can reduce your costs
significantly by reserving a number of instances for a year or
for three years).
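The instance-type tip comes down to comparing price per unit of compute rather than price per hour. A small sketch using only the relative figures from the tip (2x the price, 5x the computing power), with m1.small as the baseline; the function name is hypothetical and no real prices are used.

```python
def cost_per_compute_unit(hourly_price, compute_units):
    """Price paid for one unit of computing power."""
    if compute_units <= 0:
        raise ValueError("compute_units must be positive")
    return hourly_price / compute_units


# Relative figures from the tip above: m1.small is the 1x/1x baseline.
m1_small = cost_per_compute_unit(hourly_price=1.0, compute_units=1.0)
c1_medium = cost_per_compute_unit(hourly_price=2.0, compute_units=5.0)
print(f"c1.medium costs {c1_medium / m1_small:.0%} of m1.small per unit of compute")
```

The comparison only holds for CPU-bound work; a workload limited by memory or I/O would need a different measure of "power".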
Amazon Web Services is an excellent platform to build
your first prototype or even run your production
It has a very flexible and proven API that lets you
manage your infrastructure in ways you never
thought were possible before.
It provides you with a lot of building blocks that scale
well and that you can trust, so you don’t have to
waste time building your infrastructure and you can
focus on making your service the best in the world.
And when the day comes that you hit the TechCrunch
homepage, your infrastructure is ready to scale in
CODERLOOP IS HIRING!
Check out http://www.coderloop.com/jobs