AWS provides a platform well suited to deploying highly available, reliable systems that scale with minimal human intervention. This talk describes a set of architectural patterns for services that are highly available, scalable, low cost and low latency, and that let you take your application global with the click of a button. We walk through the architectural decisions taken to achieve high scale and address a global audience.
2. 503
Service Temporarily Unavailable
The server is temporarily unable to service
your request due to maintenance downtime or
capacity problems. Please try again later.
10. Database Options
Self-Managed
• Database Server on Amazon EC2: your choice of database running on Amazon EC2; Bring Your Own License (BYOL)
Fully-Managed
• Amazon DynamoDB: managed NoSQL database service using SSD storage; seamless scalability; zero administration
• Amazon RDS: relational database as a managed service; flexible licensing (BYOL or License Included)
11. But how do I choose what
DB technology I need?
SQL? NoSQL?
17. Why SQL?
Established and well-worn technology
Lots of existing code, communities, books, tools, etc.
Clear patterns to scalability
You aren’t going to break SQL DBs in your first 10 million users. No really, you won’t
18. Amazon Relational Database Service (RDS)
• Database-as-a-Service
• No need to install or manage database instances
• Scalable and fault tolerant configurations
Feature | Details
Platform support | Create MySQL, SQL Server and Oracle
Preconfigured | Get started instantly with sensible default settings
Automated patching | Keep your database platform up to date automatically
Backups | Automatic backups and point in time recovery using snapshots; manual DB snapshots
Failover | Automated failover to slave hosts in event of a failure
Replication | Easily create read-replicas of your data and seamlessly replicate data across availability zones
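The read-replica feature listed for RDS implies that the application sends writes to the primary and spreads reads across the replicas. A minimal sketch of that routing follows. It is not an RDS API: the endpoint names are hypothetical, and the rule that only statements starting with SELECT count as reads is a simplifying assumption.

```python
import itertools


class ReplicaRouter:
    """Route writes to the primary and rotate reads across read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Assumption: only plain SELECT statements are safe to send to a replica.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReplicaRouter(
    primary="primary.example.internal",  # hypothetical endpoint names
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM users WHERE id = 1"))
print(router.endpoint_for("UPDATE users SET name = 'a' WHERE id = 1"))
```

Replicas can lag behind the primary, so a read that must see a value written a moment earlier would still be sent to the primary.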
19. Auto-Scaling
Automatic resizing of compute clusters based on demand
[Diagram labels: Amazon CloudWatch, Trigger auto-scaling policy]
Feature | Details
Control | Define minimum and maximum instance pool sizes and when scaling and cooldown occur.
Integrated with Amazon CloudWatch | Use metrics gathered by CloudWatch to drive scaling.
Instance types | Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.
as-create-auto-scaling-group MyGroup \
    --launch-configuration MyConfig \
    --availability-zones us-east-1a \
    --min-size 4 \
    --max-size 200
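The control settings on this slide (pool bounds, a metric that drives scaling, and a cooldown) can be sketched as one policy evaluation. This is an illustration only, not the Auto Scaling API: the bounds of 4 and 200 come from the command on the slide, while the CPU thresholds, step size and cooldown length are assumed values.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    min_size: int
    max_size: int
    scale_out_above: float  # average CPU % that triggers adding instances
    scale_in_below: float   # average CPU % that triggers removing instances
    step: int               # instances added or removed per action
    cooldown_s: int         # seconds to wait between scaling actions


def desired_capacity(policy, current, avg_cpu, seconds_since_last_action):
    """Return the new instance count for one evaluation of the policy."""
    if seconds_since_last_action < policy.cooldown_s:
        return current  # still cooling down: ignore the metric
    if avg_cpu > policy.scale_out_above:
        target = current + policy.step
    elif avg_cpu < policy.scale_in_below:
        target = current - policy.step
    else:
        target = current
    # Never leave the configured pool bounds.
    return max(policy.min_size, min(policy.max_size, target))


# min/max taken from the slide; the remaining values are assumptions.
policy = ScalingPolicy(min_size=4, max_size=200, scale_out_above=70.0,
                       scale_in_below=30.0, step=2, cooldown_s=300)
print(desired_capacity(policy, current=4, avg_cpu=85.0,
                       seconds_since_last_action=600))
```

The cooldown check comes first so that a metric spike right after a scaling action does not trigger a second action before the new instances have taken load.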
30. Production 1.0 Architecture
Well-designed, 2-tier architecture
Highly available due to multiple Availability Zones
Load Balancing & Auto-Scaling for full scalability
Fully managed Database included
Capable of serving >10K to 100Ks of users
32. Production 1.0 Architecture
Wasted server capacity for static content
Reliability and durability are not yet optimal
DRY – Don’t Repeat Yourself
End-user experience could be improved through
offloading & caching
35. CloudFront
• World-wide content distribution network
• Easily distribute content to end users with low latency, high data transfer speeds, and no commitments
Feature | Details
Fast | Multiple world-wide edge locations to serve content as close to your users as possible
Integrated with other services | Works seamlessly with S3 and EC2 origin servers
Dynamic content | Supports static and dynamic content from origin servers
Streaming | Supports RTMP from S3 and includes support for live streaming from Adobe FMS and Microsoft Media Server
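The idea of serving content from an edge location close to the user, and going back to the origin only when needed, can be sketched as a small cache with a time-to-live. This is a toy model rather than CloudFront behaviour: the class name, the 60-second TTL and the `origin` function are all hypothetical.

```python
import time


class EdgeCache:
    """Toy edge location: serve cached objects, fetch from origin on a miss."""

    def __init__(self, fetch_from_origin, ttl_s, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl_s = ttl_s
        self._clock = clock
        self._store = {}  # path -> (body, expires_at)

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            return entry[0], "hit"
        body = self._fetch(path)  # miss or expired: go back to the origin
        self._store[path] = (body, now + self._ttl_s)
        return body, "miss"


origin_requests = []


def origin(path):
    # Stand-in for an S3 bucket or EC2 origin server.
    origin_requests.append(path)
    return f"<contents of {path}>"


edge = EdgeCache(origin, ttl_s=60)
print(edge.get("/img/logo.png")[1])
print(edge.get("/img/logo.png")[1])
print(len(origin_requests))
```

Repeated requests for the same path within the TTL never reach the origin, which is the offloading effect the architecture slides refer to.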
37. Production 1.2 Architecture
Well-designed, 2-tier architecture
Highly available due to multiple Availability Zones
Load Balancing & Auto-Scaling for full scalability
Fully managed Database included
Static content stored in durable, consistent way
Improved end-user experience through CDN
Capable of serving >100K-1M+ users
43. DynamoDB
Provisioned throughput NoSQL database
Fast, predictable performance
Fully distributed, fault tolerant architecture
Feature | Details
Provisioned throughput | Dial up or down provisioned read/write capacity
Predictable performance | Average single digit millisecond latencies from SSD backed infrastructure
Strong consistency | Be sure you are reading the most up to date values
Fault tolerant | Data replicated across availability zones
Monitoring | Integrated with CloudWatch
Secure | Integrates with AWS Identity and Access Management (IAM)
Elastic MapReduce | Integrates with Elastic MapReduce for complex analytics on large datasets
44. • Scalability to deal with 1.2TB / 100GB of metadata per customer
• Needed a DB engine to support 12K reads + 12K writes per sec.
• “…a very smooth experience. Thank you AWS for a great offering”
druva.com/blog/2012/05/23/insync-makes-to-aws-dynamodb/
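A requirement such as 12K reads plus 12K writes per second translates into provisioned capacity units. The sketch below works through that arithmetic under two assumptions: it uses the sizing rules from current DynamoDB documentation (4 KB per read unit, 1 KB per write unit, eventually consistent reads at half cost), which may differ from the rules in force when this talk was given, and it uses a hypothetical 2 KB average item size that does not come from the case study.

```python
import math

READ_UNIT_KB = 4   # one read unit: a strongly consistent read of up to 4 KB
WRITE_UNIT_KB = 1  # one write unit: a write of up to 1 KB


def read_capacity_units(reads_per_s, item_kb, strongly_consistent=True):
    """Read units needed per second; item sizes round up to whole units."""
    units_per_read = math.ceil(item_kb / READ_UNIT_KB)
    total = reads_per_s * units_per_read
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(writes_per_s, item_kb):
    """Write units needed per second; item sizes round up to whole units."""
    return writes_per_s * math.ceil(item_kb / WRITE_UNIT_KB)


item_kb = 2  # hypothetical average item size, not from the case study
print(read_capacity_units(12_000, item_kb))
print(write_capacity_units(12_000, item_kb))
```

Because sizes round up per item, keeping items small or relaxing reads to eventual consistency lowers the capacity that has to be provisioned.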
45. Elastic MapReduce (EMR)
• Managed, elastic Hadoop cluster
• Integrates with S3 & DynamoDB
• Leverage Hive & Pig analytics scripts
Feature | Details
Scalable | Use as many or as few compute instances running Hadoop as you want. Modify the number of instances while your job flow is running
Integrated with other services | Works seamlessly with S3 as origin and output. Integrates with DynamoDB
Comprehensive | Supports languages such as Hive and Pig for defining analytics, and allows complex definitions in Cascading, Java, Ruby, Perl, Python, PHP, R, or C++
Cost effective | Works with Spot instance types
Monitoring | Monitor job flows from within the management console
46. Foursquare…
Founded in 2009
112M in Venture Capital
33 million users
1.3 million businesses using the service
…generates a lot of Data
3.5 billion check-ins
15M+ venues
Terabytes of log data
47. Uses EMR for
Evaluation of new features
Machine learning
Exploratory analysis
Daily customer usage reporting
Long-term trend analysis
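A job such as daily customer usage reporting follows the map, shuffle and reduce shape that Hadoop runs at scale. The sketch below shows that shape in plain Python on a few in-memory lines. It is not Hive, Pig or the EMR API, and the log format and sample records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical log format: "<date> <user_id> <venue_id>"
LOG_LINES = [
    "2012-05-01 u1 v9",
    "2012-05-01 u2 v9",
    "2012-05-01 u1 v3",
    "2012-05-02 u3 v9",
]


def map_phase(line):
    """Emit one (date, user_id) pair per check-in line."""
    date, user_id, _venue = line.split()
    yield date, user_id


def shuffle(pairs):
    """Group mapped values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(date, user_ids):
    """Summarise one day: total check-ins and distinct active users."""
    return date, {"check_ins": len(user_ids),
                  "active_users": len(set(user_ids))}


def daily_usage(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(daily_usage(LOG_LINES))
```

Each phase only sees one line or one key at a time, which is what lets a cluster split the same job across as many instances as the data volume requires.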
48. Benefits of EMR
Ease-of-Use
“We have decreased the processing time for urgent data-analysis”
Flexibility
To deal with changing requirements & dynamically expand reporting clusters
Costs
“We have reduced our analytics costs by over 50%”
50. Production 1.3 Architecture
Well-designed, 2-tier architecture
Highly available due to multiple Availability Zones
Load Balancing & Auto-Scaling for full scalability
Static content stored in durable, consistent way
Improved end-user experience through CDN
Big Data analytics built in for continuous optimization
Capable of serving >1M-10M+ users