Scaling on AWS for the First 10 Million Users
Craig S. Dickson, Solutions Architect, Amazon Web Services
AWS Summit 2014 Melbourne - Breakout 5

Cloud computing gives you a number of advantages, such as being able to scale your application on demand. As a new business looking to use the cloud, you inevitably ask yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We will show you how to best combine different AWS services, make smarter decisions when architecting your application, and apply best practices for scaling your infrastructure in the cloud.

Presenter: Craig Dickson, Solutions Architect, Amazon Web Services

Scaling on AWS for the First 10 Million Users

  1. Scaling on AWS for the First 10 Million Users. Craig S. Dickson, Solutions Architect, Amazon Web Services. © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  2. • ME: Craig S. Dickson – Solutions Architect – AWS
     • YOU: Here to learn more about scaling infrastructure on AWS and just generally being more awesome
     • TODAY: About best practices and things to think about when building for large scale
  3. So how do we scale?
  4. Hi, I have NO IDEA what I am doing!!
  6. A lot of things to read: not where we want to start
  7. If not Auto Scaling, then what do we need first?
  8. Let’s start from day one, user one (i.e. you)
  9. Day One, User One
     • A single EC2 instance
       – With full stack on this host
         • Web app
         • Database
         • Management
         • Etc.
     • A single Elastic IP
     • Route 53 for DNS
     [Diagram labels: User, Amazon Route 53, Elastic IP, EC2 Instance]
  10. “We’re gonna need a bigger box”
      • Simplest approach
      • Can leverage EBS Provisioned IOPS (PIOPS)
      • High I/O instances
      • High memory instances
      • High CPU instances
      • High storage instances
      • Easy to change instance sizes
      • Will hit a limit eventually
      [Diagram labels: m1.small, m2.4xlarge, hi1.4xlarge]
  12. Day One, User One
      • We could potentially get to a few hundred to a few thousand users, depending on application complexity and traffic
      • No failover
      • No redundancy
      • Too many eggs in one basket
      [Diagram labels: User, Amazon Route 53, Elastic IP, EC2 Instance]
  14. Day Two, User >1
      First let’s separate out our single host into more than one.
      • Web
      • Database
        – Make use of a database service?
      [Diagram labels: User, Amazon Route 53, Elastic IP, Web Instance, Database Instance]
  15. Database Options
      Self-managed:
      • Database Server on Amazon EC2: your choice of database running on Amazon EC2; Bring Your Own License (BYOL)
      Fully managed:
      • Amazon DynamoDB: managed NoSQL database service using SSD storage; seamless scalability; zero administration
      • Amazon RDS: Microsoft SQL Server, Oracle, PostgreSQL, or MySQL as a managed service; flexible licensing (BYOL or license included)
      • Amazon Redshift: massively parallel, petabyte-scale data warehouse service; fast, powerful, and easy to scale
  16. But how do I choose what DB technology I need? SQL? NoSQL?
  17. Not a binary decision!
  18. A blended approach can reduce technical debt
  19. Start with SQL databases where it makes sense
  20. Why start with SQL?
      • Established and well-worn technology
      • Lots of existing code, communities, books, background, tools, etc.
      • You aren’t going to break SQL DBs in your first 10 million users
      • But you might break parts of it (hence the blended approach)
      • Clear patterns to scalability
  21. If your usage is such that you will be generating several TB (>5) of data in the first year OR have an incredibly data-intensive workload, you might need NoSQL
  22. Why might you need NoSQL?
      • Super low latency applications
      • Metadata-driven datasets
      • Highly unrelational data
      • Need schema-less data constructs*
      • Massive amounts of data (again, in the TB range)
      • Rapid ingest of data (thousands of records/sec)
      *Need != “it’s easier to do dev without schemas”
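The thresholds on slides 21 and 22 (several TB in the first year, thousands of records per second) can be sanity-checked with quick arithmetic. The sketch below is only an illustration: the rate of 1,000 records/sec and the 1,000-byte record size are assumptions, not figures from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

def first_year_terabytes(records_per_sec: float, bytes_per_record: float) -> float:
    """Raw data generated in one year, in decimal terabytes (10**12 bytes)."""
    return records_per_sec * bytes_per_record * SECONDS_PER_YEAR / 1e12

# Assumed workload: 1,000 records/sec at 1,000 bytes each.
tb = first_year_terabytes(1_000, 1_000)
print(f"{tb:.1f} TB in the first year")  # 31.5 TB, well past the >5 TB mark
```

A workload with a tenth of that ingest rate would still land near 3 TB, so it is worth running this estimate with your own numbers before choosing a data store.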
  23. So decide wisely. Look for the key points of scale.
  24. Users > 100
      First let’s separate out our single host into more than one
      • Web
      • Database
        – Use RDS to make your life easier
      [Diagram labels: User, Amazon Route 53, Elastic IP, Web Instance, RDS DB Instance]
  25. Users > 1000
      Next let’s address our lack of failover and redundancy issues
      • Elastic Load Balancing
      • Another web instance
        – In another Availability Zone
      • Enable Amazon RDS Multi-AZ
      [Diagram labels: User, Amazon Route 53, Elastic Load Balancing, two Availability Zones each with a Web Instance, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ)]
  26. Scaling this horizontally and vertically will get us pretty far (10s-100s of thousands)
  27. Users > 10ks – 100ks
      [Diagram labels: User, Amazon Route 53, Elastic Load Balancing, eight Web Instances across two Availability Zones, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ), four RDS DB Instance Read Replicas]
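The read replicas on slide 27 only help if the application sends reads to them. A minimal sketch of read/write splitting, assuming hypothetical endpoint names and a deliberately naive "starts with SELECT" check; neither comes from the talk.

```python
import itertools

class ReadWriteRouter:
    """Send writes to the primary and spread reads across read replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary

# Hypothetical endpoint names, for illustration only.
router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.endpoint_for("SELECT * FROM users"))    # db-replica-1
print(router.endpoint_for("UPDATE users SET ..."))   # db-primary
print(router.endpoint_for("SELECT * FROM orders"))   # db-replica-2
```

Replicas are typically updated asynchronously, so a read that must see a write made moments earlier is usually safer against the primary.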
  28. This scales, but it can be much cleaner
  29. Shift Some Load Around
      Let’s lighten the load on our web and database instances:
      • Move static content from the web instance to Amazon S3 and CloudFront
      • Move dynamic content from Elastic Load Balancing to CloudFront
      • Move session/state and DB caching to ElastiCache or Amazon DynamoDB
      [Diagram labels: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancer, Web Instance, ElastiCache, Amazon DynamoDB, RDS DB Instance Active (Multi-AZ), Availability Zone]
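Moving DB caching off the database usually means a cache-aside read path: look in the cache, and only query the database on a miss. The sketch below uses an in-process dict as a stand-in for ElastiCache and a fake lookup function as a stand-in for the database; both are assumptions made so the example runs anywhere.

```python
import time

class CacheAside:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db, ttl_seconds: float = 60.0):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value); stand-in for ElastiCache

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]                      # cache hit
        value = self._load(key)                  # cache miss: query the database
        self._store[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key):
        self._store.pop(key, None)               # call after writing to the database

db_calls = []

def fake_db_lookup(user_id):
    """Hypothetical database query; records each call so misses can be counted."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}

cache = CacheAside(fake_db_lookup)
cache.get(42)
cache.get(42)
print(len(db_calls))  # 1: the second read was served from the cache
```

Keeping session state in a shared store like this is also what lets web instances come and go freely once Auto Scaling is introduced.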
  32. Now that our web tier is much more lightweight, we can revisit …
  33. Auto Scaling!
  34. [Chart: typical weekly traffic to Amazon.com, Sunday through Saturday]
  35. [Chart: typical weekly traffic to Amazon.com, with provisioned capacity]
  36. [Chart: November traffic to Amazon.com]
  37. [Chart: November traffic to Amazon.com, with provisioned capacity]
  38. [Chart: November traffic to Amazon.com, with provisioned capacity, annotated 76% and 24%]
  39. [Chart: November traffic to Amazon.com]
  40. Auto Scaling lets you do this!
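The charts contrast capacity provisioned for the peak with capacity that follows demand. A rough sketch of that comparison, using made-up demand figures and an assumed 25% headroom (none of these numbers come from the slides):

```python
import math

def fixed_fleet_utilization(demand: list[int]) -> float:
    """Fraction of capacity used when provisioning for the peak at all times."""
    return sum(demand) / (max(demand) * len(demand))

def scaled_fleet_sizes(demand: list[int], per_instance: int, headroom: float = 0.25) -> list[int]:
    """Instances needed per period, keeping `headroom` spare capacity."""
    return [max(1, math.ceil(d * (1 + headroom) / per_instance)) for d in demand]

# Hypothetical requests/sec per period, with one large spike.
demand = [100, 120, 150, 400, 130, 100]
print(f"{fixed_fleet_utilization(demand):.0%}")     # 42% used, so 58% sits idle
print(scaled_fleet_sizes(demand, per_instance=50))  # [3, 3, 4, 10, 4, 3]
```

With a fixed fleet sized for the spike, ten instances run in every period; the scaled fleet only reaches ten during the spike itself.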
  41. Users > 500k+
      [Diagram labels: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, DynamoDB, two Availability Zones each with an Auto Scaling group of three Web Instances, an ElastiCache node, and an RDS DB Instance Read Replica; RDS DB Instance Active (Multi-AZ); RDS DB Instance Standby (Multi-AZ)]
  42. This looks impressive. But what is missing?
  43. “Give me six hours to chop down a tree and I will spend the first four sharpening the axe.” – Abraham Lincoln
  44. A World of Hurt If You Are Missing These
      • Metrics & alarming
      • Automated builds
      • Automated deployment
      • Centralized logging
  45. Spend Your Time Wisely
      Managing your infrastructure will take up an increasingly large part of your time. Use tools to automate repetitive tasks:
      • Tools to manage AWS resources
      • Tools to manage the software on and configuration of your instances
      • Automated data analysis of logs and user actions
  46. • Host-level metrics
      • Aggregate-level metrics
      • Log analysis
      • External site performance
  47. AWS Marketplace & Partners Can Help
      • Customers can find, research, and buy software
      • Simple pricing aligns with the EC2 usage model
      • Launch in minutes
      • Marketplace billing integrated into your AWS account
      • 1,000+ products across 20+ categories
      Learn more at: aws.amazon.com/marketplace
  48. AWS Application Management Solutions
      From higher-level services (convenience) to do-it-yourself (control): Elastic Beanstalk, AWS OpsWorks, AWS CloudFormation, EC2
  49. Host-based Configuration Management
      Popular offerings:
        – Opscode Chef
        – PuppetLabs Puppet
        – Ansible
      • Do similar things in slightly different ways
      • Work well with tools from the previous slide
      • Require some learning time
      • Can’t scale easily without this kind of capability
  50. OK, we avoided that world of hurt, what’s next?
  51. From 500k to 1 Million Users
      • Getting serious now
      • Significant user base
      • Plenty of attention if things go wrong
      • Interesting phase for startups with funding rounds
  52. Time to make some radical improvements at the web & app layers
  53. SOA = Service-oriented Architecture
  54. SOAing
      Move services into their own tiers or modules. Treat each of these as 100% separate pieces of your infrastructure and scale them independently. Amazon.com and AWS do this extensively! It offers flexibility and greater understanding of each component.
  55. Loose Coupling Sets You Free!
      • The looser they're coupled, the bigger they scale
        – Use independent components
        – Design everything as a black box
        – Decouple interactions
        – Favor services with built-in redundancy and scalability rather than building your own
      [Diagram: under "Tight Coupling", Controller A and Controller B are connected directly; under "Loose Coupling", queues (Q) sit between them. Caption: Use Amazon SQS as Buffers]
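The queue-as-buffer idea on slide 55 can be tried out in a single process. In this sketch `queue.Queue` stands in for an SQS queue, and the controller functions and job names are hypothetical; the point is only that the producer finishes without the consumer being available.

```python
import queue
import threading

def controller_a(q: queue.Queue, jobs: list[str]) -> None:
    """Producer: enqueue work and return immediately, without calling B."""
    for job in jobs:
        q.put(job)

def controller_b(q: queue.Queue, done: list[str]) -> None:
    """Consumer: pull work at its own pace until it sees the stop marker."""
    while True:
        job = q.get()
        if job is None:
            break
        done.append(job.upper())  # placeholder for real processing

buffer = queue.Queue()  # in-process stand-in for an SQS queue
processed: list[str] = []
worker = threading.Thread(target=controller_b, args=(buffer, processed))

controller_a(buffer, ["resize", "email", "index"])  # B is not even running yet
worker.start()
buffer.put(None)  # stop marker
worker.join()
print(processed)  # ['RESIZE', 'EMAIL', 'INDEX']
```

Because the two sides only share the queue, either one can be restarted, slowed down, or scaled out without the other noticing.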
  56. Loose Coupling + SOA = Winning!
      In the early days, if someone has a service for it already, use that instead of building it yourself. Don’t reinvent the wheel!
      Examples:
      • Email
      • Queuing
      • Transcoding
      • Search
      • Databases
      • Monitoring
      • Metrics
      • Logging
      [Service icons: Amazon SNS, Amazon SQS, Amazon CloudSearch, Amazon Elastic Transcoder, Amazon SES, Amazon SWF]
  57. An example: Imagine we let our users upload photos
  58. [Diagram: photo upload pipeline. Labels: User; Amazon S3 bucket for ingest; Amazon SNS topic; three SQS queues (size for thumbnail, size image for mobile, size image for web), each with instances in an Auto Scaling group; Amazon S3 bucket for originals; Amazon S3 (RRS) bucket to serve content to CloudFront; CloudFront download distribution]
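One way to read the diagram on slide 58 is as a fan-out: a single upload notification is published once and copied to a queue per resize job. The toy classes below are in-memory stand-ins, not the SNS or SQS APIs, and the bucket and key names are made up.

```python
from collections import deque

class Topic:
    """Toy publish/subscribe topic: every subscribed queue gets each message."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, name: str) -> deque:
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message: dict) -> None:
        for q in self.queues.values():
            q.append(message)

uploads = Topic()
for name in ("thumbnail", "mobile", "web"):
    uploads.subscribe(name)

# Hypothetical upload event.
uploads.publish({"bucket": "ingest", "key": "photos/cat.jpg"})
print({name: len(q) for name, q in uploads.queues.items()})
# {'thumbnail': 1, 'mobile': 1, 'web': 1}
```

Each queue can then be drained by its own group of workers, so a backlog of one image size does not hold up the others.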
  59. Simple Workflow Service (SWF)
      • Provides an orchestration tool across your infrastructure
      • Can act as a middle layer to pass messages and set up tasks
      • Lets you break down individual tasks into different workers
      • Lets you define logic between workers
      • Lets you make a worker task from anything that can be scripted
      • Includes built-in retries, timeouts, logging
      • Features built-in reliability, scalability, and low cost
      [Diagram: Your code = Deciders & Workers]
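The decider/worker split can be illustrated without the service itself. This is a toy, in-process sketch of the pattern (ordered steps, retries, a history of events); it does not use or imitate the SWF API, and the step names are hypothetical.

```python
def run_workflow(steps, max_attempts=3):
    """Toy decider: run each worker in order, retrying failures up to a limit."""
    history = []
    for name, worker in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                worker()
                history.append((name, attempt, "completed"))
                break
            except Exception:
                history.append((name, attempt, "failed"))
        else:
            # Every attempt failed, so the decider gives up on the workflow.
            return history, "workflow failed"
    return history, "workflow completed"

attempts = {"count": 0}

def flaky_resize():
    """Hypothetical worker that fails once before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 2:
        raise RuntimeError("transient error")

history, status = run_workflow([("store_original", lambda: None), ("resize", flaky_resize)])
print(status)   # workflow completed
print(history)  # [('store_original', 1, 'completed'), ('resize', 1, 'failed'), ('resize', 2, 'completed')]
```

The workers know nothing about ordering or retries; that logic lives entirely in the decider, which is the separation the slide describes.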
  61. [Diagram: the photo upload pipeline with SWF. Labels: User; Amazon S3 bucket for ingest; SWF; instance running the decider; three groups of instances, each in an Auto Scaling group; Amazon S3 bucket for originals; Amazon S3 (RRS) bucket to serve content to CloudFront; CloudFront download distribution]
  62. Users > 1 Million
      Reaching a million and above is going to require some of all the previous things:
      • Multi-AZ
      • Elastic Load Balancing between tiers
      • Auto Scaling
      • Service-oriented architecture
      • Serving content smartly (S3/CloudFront)
      • Caching off DB
      • Moving state off tiers that auto-scale
  63. Users > 1 Million
      [Diagram labels: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancer, four Web Instances, two Internal App Instances, two Worker Instances, Amazon SQS, Amazon SES, Amazon CloudWatch, Amazon DynamoDB, ElastiCache, RDS DB Instance Active (Multi-AZ), two RDS DB Instance Read Replicas, Availability Zone]
  64. The next big steps
  65. From 5 to 10 Million Users
      You may start to run into issues with your database around contention on the write master. How can you solve it?
      • Federation (splitting into multiple DBs based on function)
      • Sharding (splitting one data set up across multiple hosts)
      • Moving some functionality to other types of DBs (NoSQL)
  66. Database Federation
      • Split up databases by function or purpose
      • Harder to do cross-function queries
      • Essentially delays the need for something like sharding or NoSQL until much further down the line
      • Won’t help with single huge functions or tables
      [Diagram labels: ForumsDB, UsersDB, ProductsDB]
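At the application layer, federation amounts to picking a database by functional area. A minimal sketch, assuming hypothetical endpoint names for the three databases shown on the slide:

```python
# Hypothetical mapping from functional area to its own database endpoint.
DATABASES = {
    "forums": "forums-db.internal",
    "users": "users-db.internal",
    "products": "products-db.internal",
}

def database_for(function: str) -> str:
    """Pick the database that owns a functional area."""
    try:
        return DATABASES[function]
    except KeyError:
        raise ValueError(f"no database registered for {function!r}") from None

print(database_for("users"))  # users-db.internal
```

A query that needs both forum posts and user profiles now touches two databases, which is why cross-function queries get harder and tend to become joins in application code.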
  67. Sharded Horizontal Scaling
      • More complex at the application layer
      • ORM support can help
      • No practical limit on scalability
      • Operational complexity and sophistication
      • Shard by function or key space
      • RDBMS or NoSQL

      User     ShardID
      002345   A
      002346   B
      002347   C
      002348   B
      002349   A

      [Diagram labels: shards A, B, C]
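The User/ShardID table on slide 67 is a directory-style lookup; sharding by key space is often done with a hash instead. The sketch below shows both. The SHA-256-modulo scheme is an assumption chosen for illustration, not something the talk prescribes.

```python
import hashlib

SHARDS = ["A", "B", "C"]

# Directory-based lookup, mirroring the User/ShardID table on the slide.
DIRECTORY = {"002345": "A", "002346": "B", "002347": "C", "002348": "B", "002349": "A"}

def shard_by_directory(user_id: str) -> str:
    """Look the shard up in an explicit mapping table."""
    return DIRECTORY[user_id]

def shard_by_hash(user_id: str) -> str:
    """Key-space sharding: a stable hash of the key picks the shard."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

print(shard_by_directory("002348"))       # B
print(shard_by_hash("002348") in SHARDS)  # True
```

A directory lets individual users be moved between shards at the cost of an extra lookup; a plain modulo hash needs no lookup, but changing the number of shards remaps most keys.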
  68. Shifting Functionality to NoSQL
      • Similar in a sense to federation
      • Again, think about the earlier points for when you need NoSQL vs. SQL
      • Leverage hosted services like DynamoDB
      • Consider these use cases:
        – Leaderboards and scoring
        – Rapid ingest of clickstream or log data
        – Temporary data needs (cart data)
        – Hot tables
        – Metadata or lookup tables
      [Diagram label: Amazon DynamoDB]
  69. A quick review
  70. • Use Multi-AZ for your infrastructure
      • Make use of self-scaling services
        – ELB, S3, SNS, SQS, SES, etc.
      • Build in redundancy at every level
      • Blend SQL & NoSQL wisely
      • Cache data both inside and outside your infrastructure
      • Split tiers into individual services (SOA)
      • Use Auto Scaling once you’re ready for it
      • Use automation tools in your infrastructure
      • Make sure you have good metrics, monitoring, and logging tools in place
      • Don’t reinvent the wheel
  71. Putting all this together means we should now easily be able to handle 10+ million users!
  72. Users > 10 Million
      Iterating on top of the patterns seen here will get you up and over 100 million users.
  73. Users > 10 Million
      • More fine tuning of your application
      • More SOA of features and functionality
      • Going from multi-AZ to multi-Region
      • Possibly start building custom solutions
      • Deep analysis of your whole stack
  74. One More Thing!
  75. Don’t forget about scaling your bill
      • A fantastic amount of FINANCIAL ENGINEERING to do as well
      • Reserved Instances
      • Spot Instances
      • Correct use of storage
      • Scaling driven by queues
      • Correct instance sizes
      • Etc…
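"Scaling driven by queues" can be reduced to a small calculation: size the worker fleet from the backlog rather than from CPU. A sketch under assumed numbers (60 messages per worker per minute, a 5-minute drain target, and fleet limits of 1 and 20, none of which come from the talk):

```python
import math

def desired_workers(queue_depth: int, msgs_per_worker_per_min: int,
                    target_drain_minutes: int, min_workers: int = 1,
                    max_workers: int = 20) -> int:
    """Workers needed to drain the current backlog within the target time."""
    needed = math.ceil(queue_depth / (msgs_per_worker_per_min * target_drain_minutes))
    return max(min_workers, min(max_workers, needed))

print(desired_workers(0, 60, 5))       # 1  (floor keeps one worker running)
print(desired_workers(4_500, 60, 5))   # 15
print(desired_workers(90_000, 60, 5))  # 20 (capped at max_workers)
```

The cap is what protects the bill: a runaway backlog raises the drain time instead of the instance count.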
  76. Next steps?
      Read!
      • aws.amazon.com/documentation
      • aws.amazon.com/architecture
      • aws.amazon.com/start-ups
      Listen!
      • aws.amazon.com/podcast
  77. Next steps?
      Ask for help!
      • forums.aws.amazon.com
      • aws.amazon.com/support
      • Your friendly local Account Manager & Solutions Architects
  78. Expand your skills with AWS
      • Certification (Exams): Validate your proven technical expertise with the AWS platform. aws.amazon.com/certification
      • On-Demand Resources (Videos & Labs): Get hands-on practice working with AWS technologies in a live environment. aws.amazon.com/training/self-paced-labs
      • Instructor-Led Courses (Training Classes): Expand your technical expertise to design, deploy, and operate scalable, efficient applications on AWS. aws.amazon.com/training