This document provides best practices for building powerful web applications on AWS. It outlines six rules; the first four are:
1. Service all web requests by ensuring requests reach applications and that applications have the necessary data.
2. Service requests as fast as possible by choosing the fastest routing, offloading to services like CloudFront, caching frequently requested data, and using low latency services like DynamoDB.
3. Handle requests at any scale by vertically and horizontally scaling as needed using auto-scaling, and by provisioning for high performance using services like EBS and DynamoDB.
4. Simplify architecture with AWS services that handle "undifferentiated heavy lifting" such as databases, queues, workflows, and search.
5. What your users want…
Always on, accessible anywhere
Fast, performant experience
Personalized and rich application
Lots of new features all of the time
Powerful web applications
11. Rule 1: Service all web requests
Rule 2: Service requests as fast as possible
Rule 3: Handle requests at any scale
Rule 4: Simplify architecture with services
Rule 5: Automate operational management
Rule 6: Leverage unique cloud properties
12. Rule 1: Service all web requests
Or “you aren’t doing anything if you don’t answer the door…”
13. Rule 1: Service all web requests
a) Make sure requests get to your ‘front door’
Request DNS Application Data
Clients can’t resolve you? …then this is irrelevant
17. Rule 1: Service all web requests
a) Make sure requests get to your ‘front door’
Request DNS Application Data
Route53: “100% Available” SLA (http://aws.amazon.com/route53/sla)
Feature: Details
Global: Supported from AWS global edge locations for fast and reliable domain name resolution
Scalable: Automatically scales based upon query volumes
Latency based routing: Supports resolution of endpoints based upon latency, enabling multi-region application delivery
Integrated: Integrates with other AWS services allowing Route 53 to front load balancers, S3 and EC2
Secure: Integrates with IAM giving fine grained control over DNS record access
18. Rule 1: Service all web requests
a) Make sure requests get to your ‘front door’
b) Make sure you open the door when they arrive
Request DNS Application Data
Route53
19. Rule 1: Service all web requests
a) Make sure requests get to your ‘front door’
b) Make sure you open the door when they arrive
Request DNS Application Data
Route53
Elastic load balancing: Multi-availability zone, Multi-region
[Diagram: Route53, Elastic Load Balancer, two Regions, four Availability Zones]
20. Rule 1: Service all web requests
a) Make sure requests get to your ‘front door’
b) Make sure you open the door when they arrive
c) Have the data to form a response
Request DNS Application Data
[Diagram: Route53, Elastic Load Balancer, two Regions, four Availability Zones]
21. Rule 1: Service all web requests
a) Make sure requests get to your ‘front door’
b) Make sure you open the door when they arrive
c) Have the data to form a response
Request DNS Application Data
Multi-AZ RDS (Master-slave)
Inter-region replication
Read-replicas
[Diagram: Route53, Elastic Load Balancer, two Regions, four Availability Zones]
23. Rule 1: Service all web requests
Rule 2: Service requests as fast as possible
Or “I’m not hanging around, so be quick…”
25. Rule 2: Service requests as fast as possible
a) Choose the fastest route
Request Route53
Latency: Region A 16ms, Region B 92ms
Result: Region A DNS entry (16ms)
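The routing choice on these slides can be sketched in a few lines. This is only an illustration of "pick the region with the lowest latency", not how Route 53 is implemented; the function name and the dictionary layout are assumptions made for the example, while the 16ms and 92ms figures come from the slide.

```python
from typing import Dict


def pick_lowest_latency_region(latencies_ms: Dict[str, float]) -> str:
    """Return the region whose latency is smallest (hypothetical helper)."""
    if not latencies_ms:
        raise ValueError("at least one region latency is required")
    # min() over the keys, ordered by each key's latency value.
    return min(latencies_ms, key=latencies_ms.get)


# Figures from the slide: 16ms to Region A, 92ms to Region B.
measured = {"Region A": 16.0, "Region B": 92.0}
chosen = pick_lowest_latency_region(measured)
print(chosen)
```

With the slide's figures the helper selects Region A, matching the DNS entry shown in the final build of the slide.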
29. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
CloudFront
World-wide content distribution network
Easily distribute content to end users with low latency, high data transfer speeds, and no commitments.
1 Single CNAME: www.mysite.com
2 Served from EC2: *.php
3 Served from S3: /images/*
Locations shown: London, Paris, NY
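The split between the two origins on this slide (/images/* served from S3, *.php served from EC2) can be sketched as a small pattern table. This is an illustration of the idea only, not CloudFront's configuration format; the function name, the use of Python's fnmatch for matching, and the default origin for unmatched paths are assumptions.

```python
from fnmatch import fnmatchcase

# Path patterns and origins taken from the slide, checked in order.
ROUTES = [
    ("/images/*", "S3"),
    ("*.php", "EC2"),
]

# Assumption: the slide does not say where unmatched paths go.
DEFAULT_ORIGIN = "EC2"


def origin_for(path: str) -> str:
    """Return the origin that should serve the given request path."""
    for pattern, origin in ROUTES:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN


print(origin_for("/images/logo.png"))
print(origin_for("/checkout.php"))
```

Because the first matching pattern wins, static images never reach the application servers, which is the offloading effect the slide describes.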
30. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
Without CloudFront
EC2 webservers/app servers loaded by user requests
31. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
With CloudFront
Load of user requests pushed into CloudFront, EC2 cluster can scale down
Labels: Offload, Scale Down
32. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
[Chart: Response Time and Server Load compared for No CDN, CDN for Static Content, and CDN for Static & Dynamic Content. Labels: Offload, Scale Down]
33. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
c) Cache it if you can
ElastiCache
Memcached compatible caching layer
Serve frequently requested & slow changing data from scalable cache clusters
Reduce load on database and other servers
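A common way to apply "cache it if you can" is the cache-aside pattern: look in the cache first and fall back to the database only on a miss. The sketch below uses an in-process dictionary as a stand-in for a Memcached-compatible cluster, so it does not call ElastiCache at all; the class name, the TTL parameter, and the loader function are all assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Minimal cache-aside sketch with per-entry expiry (hypothetical)."""

    def __init__(self, load: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # slow lookup, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.misses = 0

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: the backing store is not touched
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value


db_calls = []


def slow_lookup(key: str) -> str:
    db_calls.append(key)           # records every trip to the "database"
    return key.upper()


cache = CacheAside(slow_lookup, ttl_seconds=60.0)
first = cache.get("profile:42")
second = cache.get("profile:42")
```

The second read is served from the cache, so the loader runs only once. That is the reduction in database load the slide refers to.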
34. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
c) Cache it if you can
d) Single digit latencies where it matters
Database Query Performance
[Chart: performance against Scale. Desired: consistency, predictability. Actual: degraded performance with scale]
Management problems: Data sharding, Data caching, Provisioning, Cluster management, Fault management
37. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
c) Cache it if you can
d) Single digit latencies where it matters
Database Query Performance
[Chart: DynamoDB Query Performance and Relational Database Query Performance plotted against Scale]
DynamoDB: Low latency, Large scale, Zero admin, Predictable performance
38. Rule 2: Service requests as fast as possible
a) Choose the fastest route
b) Offload your application servers
c) Cache it if you can
d) Single digit latencies where it matters
Database Query Performance
[Chart: DynamoDB Query Performance plotted against Scale]
DynamoDB: Low latency, Large scale, Zero admin, Predictable performance
Average single-digit milliseconds server side latencies
Runs on solid state drives, and is built to maintain consistent, fast latencies at any scale
40. Rule 1: Service all web requests
Rule 2: Service requests as fast as possible
Rule 3: Handle requests at any scale
Or “When they come, they REALLY come…”
41. Rule 3: Handle requests at any scale
a) Scale up
Vertical Scaling
From $0.02/hr
Scale up with Elastic Compute Cloud (EC2)
Basic unit of compute capacity
Range of CPU, memory & local disk options
14 instance types available, from micro through cluster compute to SSD backed
42. Rule 3: Handle requests at any scale
a) Scale up
b) Scale out
Auto-scaling
Automatic re-sizing of compute clusters based upon demand
Trigger auto-scaling policy
as-create-auto-scaling-group MyGroup --launch-configuration MyConfig --availability-zones eu-west-1a --min-size 4 --max-size 200
43. Rule 3: Handle requests at any scale
a) Scale up
b) Scale out
Manually: Send an API call or use CLI to launch/terminate instances; only need to specify capacity change (+/-)
By Schedule: Scale up/down based on date and time
By Policy: Scale in response to changing conditions, based on user configured real-time monitoring and alerts
Auto-Rebalance: Instances are automatically launched/terminated to ensure the application is balanced across multiple AZs
44. Rule 3: Handle requests at any scale
a) Scale up
b) Scale out
Manually: Preemptive manual scaling of capacity, e.g. before a marketing event add 10 more instances
By Schedule: Regular scaling up and down of instances, e.g. scale from 0 to 2 to process SQS messages every night or double capacity on a Friday night
By Policy: Dynamic scale based upon custom metrics, e.g. SQS queue depth, Average CPU load, ELB latency
Auto-Rebalance: Maintain capacity across availability zones, e.g. Instance availability maintained in event of AZ becoming unavailable
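Scaling "by policy" boils down to turning a metric into a capacity change and then clamping the result to the group's limits. The sketch below shows that logic in isolation, without calling any AWS API. The minimum of 4 and maximum of 200 echo the as-create-auto-scaling-group example earlier in the deck; the CPU thresholds, the step size, and the function name are hypothetical.

```python
def desired_capacity(current: int, avg_cpu_percent: float, *,
                     min_size: int = 4, max_size: int = 200,
                     scale_out_above: float = 70.0,
                     scale_in_below: float = 30.0,
                     step: int = 2) -> int:
    """Return the new instance count for a simple threshold policy."""
    if avg_cpu_percent > scale_out_above:
        target = current + step      # busy: add capacity
    elif avg_cpu_percent < scale_in_below:
        target = current - step      # idle: remove capacity
    else:
        target = current             # within the band: leave as is
    # Never go below the minimum pool or above the maximum.
    return max(min_size, min(max_size, target))


print(desired_capacity(10, 85.0))
print(desired_capacity(4, 10.0))
```

The clamp is what keeps a minimum pool alive during quiet periods and caps spend during a spike, whatever the metric reports.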
45. Rule 3: Handle requests at any scale
a) Scale up
b) Scale out
c) Dial it up
Elastic Block Store: Provisioned IOPS up to 1000 per EBS volume. Predictable performance for demanding workloads such as databases
DynamoDB: Provisioned read/write performance per table. Predictable high performance scaled via console or API
47. “AWS gave us the flexibility to bring a massive amount of capacity online in a short period of time and allowed us to do so in an operationally straightforward way. AWS is now Shazam’s cloud provider of choice,”
Jason Titus, CTO
DynamoDB: over 500,000 writes per second
Amazon EMR: more than 1 million writes per second
49. Rule 1: Service all web requests
Rule 2: Service requests as fast as possible
Rule 3: Handle requests at any scale
Rule 4: Simplify architecture with services
Or “Building new stuff is fun, but other people’s software is a drag”
50. Rule 4: Simplify architecture with services
On-Premise Infrastructure: 30% Your Business, 70% Managing All of the “Undifferentiated Heavy Lifting”
51. Rule 4: Simplify architecture with services
On-Premise Infrastructure: 30% Your Business, 70% Managing All of the “Undifferentiated Heavy Lifting”
AWS Cloud-Based Infrastructure: 70% More Time to Focus on Your Business, 30% Configuring Your Cloud Assets
52. Rule 4: Simplify architecture with services
Use RDS for databases
Relational Database Service: Database-as-a-Service. No need to install or manage database instances. Scalable and fault tolerant configurations
Use DynamoDB for high performance key-value DB
DynamoDB: Provisioned throughput NoSQL database. Fast, predictable performance. Fully distributed, fault tolerant architecture
53. Rule 4: Simplify architecture with services
Reliable message queuing without additional software
Amazon SQS: Reliable, highly scalable, queue service for storing messages as they travel between instances
[Diagram: processing task/processing trigger, Amazon SQS, processing (Auto-scaling), processing results; steps numbered 1 to 3]
Push inter-process workflows into the cloud with SWF
Simple Workflow: Reliably coordinate processing steps across applications. Integrate AWS and non-AWS resources. Manage distributed state in complex systems
[Diagram: Task A, Task B, Task C]
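The queue pattern on this slide (a trigger enqueues tasks, a pool of workers drains the queue, results are collected) can be sketched locally. The example below uses Python's in-process queue.Queue purely as a stand-in for Amazon SQS and threads as a stand-in for worker instances; it makes no SQS calls, and the function name and worker count are assumptions.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List


def run_workers(tasks: Iterable[Any], handler: Callable[[Any], Any],
                worker_count: int = 2) -> List[Any]:
    """Process tasks through a shared queue and return sorted results."""
    task_queue: "queue.Queue[Any]" = queue.Queue()
    results: "queue.Queue[Any]" = queue.Queue()

    def worker() -> None:
        while True:
            item = task_queue.get()
            if item is None:              # sentinel: no more work
                task_queue.task_done()
                break
            results.put(handler(item))
            task_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for task in tasks:
        task_queue.put(task)
    for _ in threads:
        task_queue.put(None)              # one sentinel per worker
    task_queue.join()
    for thread in threads:
        thread.join()

    collected = []
    while not results.empty():
        collected.append(results.get())
    return sorted(collected)


squares = run_workers([1, 2, 3, 4], lambda n: n * n)
```

Because producers and workers only share the queue, the worker pool can grow or shrink independently of whatever creates the tasks, which is what makes the pattern a natural fit for auto-scaling.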
54. Rule 4: Simplify architecture with services
Don’t install search software, use CloudSearch
Cloud Search: Elastic search engine based upon Amazon A9 search engine. Fully managed service with sophisticated feature set. Scales automatically
[Diagram: Document Server, Search Server, Results]
Process large volumes of data cost effectively with EMR
Elastic MapReduce: Elastic Hadoop cluster. Integrates with S3 & DynamoDB. Leverage Hive & Pig analytics scripts. Integrates with instance types such as spot
56. “Amazon CloudSearch is a game-changing product that has allowed us to deliver powerful new search capabilities. Our customers can now find what they are looking for faster and more easily than ever before… ….We saved many months of re-architecture and development time by going with Amazon CloudSearch”
Don MacAskill, CEO & Chief Geek, SmugMug
60. Rule 5: Automate operational management
Or “Run it from the iPhone…”
61. Rule 5: Automate operational management
a) Everything is programmable
Access everything via CLI, API or Console
Achieve the highest levels of automation sophistication with ease
Compute, Security, Scaling, CDN, Backup, DNS, Database, Storage, Load Balancing, Workflow, Monitoring, Networking, Messaging
62. Rule 5: Automate operational management
a) Everything is programmable
b) Think disposable, one click deployments
Cloud Formation
Automate creation of ‘stacks’ in a repeatable way
Scripting framework for AWS resource creation
Feature: Details
Platform support: Support for AWS resources from EC2 to IAM
Resource creation: Creates AWS resources behind the scenes and reports on progress
Declarative: Specify stacks in JSON format and source control your environments
Customizable: Drive stack creation with parameters
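To make "declarative" and "customizable" concrete, the sketch below builds a very small CloudFormation-style template as a Python dictionary and serializes it to JSON. The top-level keys (AWSTemplateFormatVersion, Parameters, Resources) and the AWS::EC2::Instance resource type are standard CloudFormation; the logical names, the default instance type, and the image ID are placeholders invented for illustration and would need real values before the template could create a stack.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative single-instance stack (placeholder values).",
    "Parameters": {
        # Hypothetical parameter: lets one template drive different stacks.
        "InstanceTypeParam": {
            "Type": "String",
            "Default": "t1.micro",
        }
    },
    "Resources": {
        # Hypothetical logical name for the instance.
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": {"Ref": "InstanceTypeParam"},
                "ImageId": "ami-00000000",  # placeholder, not a real image
            },
        }
    },
}

# Stable key order makes the file friendlier to source control diffs.
template_json = json.dumps(template, indent=2, sort_keys=True)
print(template_json)
```

Keeping the JSON under source control, as the slide suggests, means an environment can be recreated from a known revision rather than rebuilt by hand.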
63. Rule 5: Automate operational management
a) Everything is programmable
b) Think disposable, one click deployments
c) Design for failure, implement self healing
Bootstrapping: Customize instance startup. Get instances to ask ‘who am I?’ question on startup and be configured dynamically upon being answered
Auto-scaling: Maintain capacity of instances. Using a minimum pool size will maintain capacity in the event of instance failures
Cloud Watch: Know what’s going on, take automated actions. Use CloudWatch standard and custom metrics to create alarms. Respond with automated administration actions
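Two of the self-healing ideas on this slide reduce to simple checks: how many instances must be replaced to restore the minimum pool, and whether a metric has breached a threshold for long enough to raise an alarm. The sketch below shows both as plain functions. It does not use the Auto Scaling or CloudWatch APIs, and the function names, instance IDs, and the "N consecutive datapoints" rule are assumptions chosen for illustration.

```python
from typing import Dict, Sequence


def replacements_needed(instance_health: Dict[str, bool], min_size: int) -> int:
    """Return how many instances to launch to get back to min_size healthy."""
    healthy = sum(1 for is_healthy in instance_health.values() if is_healthy)
    return max(0, min_size - healthy)


def alarm_fires(datapoints: Sequence[float], threshold: float, periods: int) -> bool:
    """Return True if the last `periods` datapoints all exceed the threshold."""
    if periods <= 0:
        raise ValueError("periods must be positive")
    if len(datapoints) < periods:
        return False                      # not enough data to decide yet
    return all(value > threshold for value in datapoints[-periods:])


# Hypothetical fleet: one of three instances has failed, minimum pool is 4.
fleet = {"i-aaa": True, "i-bbb": False, "i-ccc": True}
to_launch = replacements_needed(fleet, min_size=4)
cpu_alarm = alarm_fires([50.0, 91.0, 95.0, 97.0], threshold=90.0, periods=3)
```

Requiring several consecutive breaches before acting is one way to avoid reacting to a single noisy datapoint.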
68. Rule 6: Leverage unique cloud properties
a) Optimize costs with instance types
On-demand instances: Unix/Linux instances start at $0.02/hour. Pay as you go for compute power. Low cost and flexibility. Pay only for what you use, no up-front commitments or long-term contracts. Use Cases: Applications with short term, spiky, or unpredictable workloads; Application development or testing
Reserved instances: 1- or 3-year terms. Pay low up-front fee, receive significant hourly discount. Low Cost / Predictability. Helps ensure compute capacity is available when needed. Use Cases: Applications with steady state or predictable usage; Applications that require reserved capacity, including disaster recovery
Spot instances: Bid on unused EC2 capacity. Spot Price based on supply/demand, determined automatically. Cost / Large Scale, dynamic workload handling. Use Cases: Applications with flexible start and end times; Applications only feasible at very low compute prices
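The trade-off between on-demand and reserved pricing is a break-even calculation: the up-front fee pays for itself once enough discounted hours have been used. The sketch below works through that arithmetic. Only the $0.02/hour on-demand rate comes from the slide; the reserved up-front fee and reserved hourly rate are hypothetical numbers chosen to make the example easy to follow, not real AWS prices.

```python
ON_DEMAND_HOURLY = 0.02   # from the slide
RESERVED_UPFRONT = 50.0   # hypothetical up-front fee
RESERVED_HOURLY = 0.01    # hypothetical discounted hourly rate


def on_demand_cost(hours: float) -> float:
    """Total cost of running one on-demand instance for `hours`."""
    return hours * ON_DEMAND_HOURLY


def reserved_cost(hours: float) -> float:
    """Total cost of one reserved instance: up-front fee plus hourly usage."""
    return RESERVED_UPFRONT + hours * RESERVED_HOURLY


def break_even_hours() -> float:
    """Hours of usage after which the reserved instance becomes cheaper."""
    saving_per_hour = ON_DEMAND_HOURLY - RESERVED_HOURLY
    if saving_per_hour <= 0:
        raise ValueError("reserved rate must be below the on-demand rate")
    return RESERVED_UPFRONT / saving_per_hour


hours_in_year = 24 * 365
print(break_even_hours())
print(on_demand_cost(hours_in_year), reserved_cost(hours_in_year))
```

Under these assumed numbers a steady, always-on workload favors the reserved instance, while a workload that runs only occasionally stays cheaper on demand, which matches the use cases listed on the slide.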
69. Rule 6: Leverage unique cloud properties
a) Optimize costs with instance types
[Chart: stacked usage with Reserved Instances at the base, On Demand above, and Spot on top; vertical axis 0 to 7000]
70. Rule 6: Leverage unique cloud properties
a) Optimize costs with instance types
b) Get insight fast with Elastic MapReduce
Elastic MapReduce
Managed, elastic Hadoop cluster
Integrates with S3 & DynamoDB
Leverage Hive & Pig analytics scripts
Integrates with instance types such as spot
Feature: Details
Scalable: Use as many or as few compute instances running Hadoop as you want. Modify the number of instances while your job flow is running
Integrated with other services: Works seamlessly with S3 as origin and output. Integrates with DynamoDB
Comprehensive: Supports languages such as Hive and Pig for defining analytics, and allows complex definitions in Cascading, Java, Ruby, Perl, Python, PHP, R, or C++
Cost effective: Works with Spot instance types
Monitoring: Monitor job flows from within the management console
71. Rule 6: Leverage unique cloud properties
a) Optimize costs with instance types
b) Get insight fast with Elastic MapReduce
[Diagram: Input data (S3 + DynamoDB) and Code feed Elastic MapReduce; elastic cluster with Name node and HDFS; Output to S3 + SimpleDB; Queries + BI via JDBC, Pig, Hive]
72. Features powered by Amazon Elastic MapReduce:
People Who Viewed this Also Viewed
Review highlights
Auto complete as you type on search
Search spelling suggestions
Top searches
Ads
200 Elastic MapReduce jobs per day
Processing 3TB of data
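A feature such as "Top searches" is a classic counting job, and its shape can be shown with a toy map, shuffle, and reduce in plain Python. This is a single-process sketch of the programming model only; it does not use Hadoop or Elastic MapReduce, and the function names and sample queries are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(records: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (query, 1) for every non-empty search log line."""
    for line in records:
        query = line.strip().lower()
        if query:
            yield query, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework would between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the counts for each query."""
    return {key: sum(values) for key, values in groups.items()}


def top_searches(records: Iterable[str], n: int) -> List[Tuple[str, int]]:
    counts = reduce_phase(shuffle(map_phase(records)))
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


log = ["pizza", "Pizza ", "tacos", "", "sushi", "tacos", "pizza"]
print(top_searches(log, 2))
```

On a real cluster the map and reduce steps run in parallel across many instances, which is what lets the same logic scale to the data volumes quoted on the slide.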
73. “With AWS, our developers can now do things they couldn’t before… …Our systems team can focus their energies on other challenges.”
Dave Marin, Search and data-mining engineer
74. Rule 6: Leverage unique cloud properties
a) Optimize costs with instance types
b) Get insight fast with Elastic MapReduce
c) Create a supercomputer backend when you need it
Cluster compute instances: Implement HVM process execution. Intel® Xeon® E5-2670 processors. 10 Gigabit Ethernet. 80 EC2 Compute Units, 60GB RAM, 3TB Local Disk
Network placement groups: Cluster instances deployed in a ‘Placement Group’ enjoy low latency, full bisection 10 Gbps bandwidth
76. What your users want…
Always on, accessible anywhere
Fast, performant experience
Personalized and rich application
Lots of new features all of the time
77. With AWS
✔ Elastic utility capacity
✔ Highly available global coverage
✔ Agility & automated operations
✔ Cost effective storage, big data & analytics