 AWS Summit Auckland 2014 | Scaling on AWS for the First 10 Million Users

You have attended AWS training and gathered all the relevant information about AWS services, but how do you now show the value of the AWS Cloud to your business? This session runs through how you would build a business case for the cloud, including TCO and cost comparisons.

Published in: Technology, Business

  1. Scaling on AWS for the First 10 Million Users
     Simon Elisha, Principal Solutions Architect, Amazon Web Services
     © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  2. •  ME: Simon Elisha – Principal Solutions Architect – Amazon Web Services – @simon_elisha
     •  YOU: Here to learn more about scaling infrastructure on AWS
     •  TODAY: About best practices and things to think about when building for large scale
  3. So how do we scale?
  4. Hi, I have NO IDEA what I am doing!!
  5. a lot of things to read
  6. not where we want to start
     a lot of things to read
  7. What do we need first?
  8. So let’s start from day one, user one (you)
  9. Day One, User One
     •  A single EC2 instance
        –  With full stack on this host
           •  Web app
           •  Database
           •  Management
           •  Etc.
     •  A single Elastic IP
     •  Route 53 for DNS
     [Diagram: User, Amazon Route 53, Elastic IP, EC2 Instance]
  10. “We’re gonna need a bigger box”
     •  Simplest approach
     •  Can now leverage PIOPS (Provisioned IOPS)
     •  High I/O instances
     •  High memory instances
     •  High CPU instances
     •  High storage instances
     •  Easy to change instance sizes
     •  Will hit an endpoint eventually
     [Diagram: instance sizes m1.small, m2.4xlarge, hi1.4xlarge]
  11. “We’re gonna need a bigger box”
     •  Simplest approach
     •  Can now leverage PIOPS (Provisioned IOPS)
     •  High I/O instances
     •  High memory instances
     •  High CPU instances
     •  High storage instances
     •  Easy to change instance sizes
     •  Will hit an endpoint eventually
     [Diagram: instance sizes m1.small, m2.4xlarge, hi1.4xlarge]
  12. Day One, User One
     •  We could potentially get to a few hundred to a few thousand users, depending on application complexity and traffic
     •  No failover
     •  No redundancy
     •  Too many eggs in one basket
     [Diagram: User, Amazon Route 53, Elastic IP, EC2 Instance]
  13. Day One, User One
     •  We could potentially get to a few hundred to a few thousand users, depending on application complexity and traffic
     •  No failover
     •  No redundancy
     •  Too many eggs in one basket
     [Diagram: User, Amazon Route 53, Elastic IP, EC2 Instance]
  14. Day Two, User > 1
     First let’s separate out our single host into more than one.
     •  Web
     •  Database
        –  Make use of a database service?
     [Diagram: User, Amazon Route 53, Elastic IP, Web Instance, Database Instance]
  15. Database Options
     Self-managed
     •  Database Server on Amazon EC2: your choice of database running on Amazon EC2; Bring Your Own License (BYOL)
     Fully managed
     •  Amazon DynamoDB: managed NoSQL database service using SSD storage; seamless scalability; zero administration
     •  Amazon RDS: Microsoft SQL Server, Oracle, PostgreSQL or MySQL as a managed service; flexible licensing (BYOL or license included)
     •  Amazon Redshift: massively parallel, petabyte-scale data warehouse service; fast, powerful and easy to scale
  16. But how do I choose what DB technology I need? SQL? NoSQL?
  17. Not a binary decision!
  18. Blended approach can reduce technical debt
  19. Start with SQL databases where it makes sense
  20. Why start with SQL?
     •  Established and well-worn technology
     •  Lots of existing code, communities, books, background, tools, etc.
     •  You aren’t going to break SQL DBs in your first 10 million users. But you might break parts of them (hence the blended approach)
     •  Clear patterns to scalability
  21. If your usage is such that you will be generating several TB (>5) of data in the first year OR you have an incredibly data-intensive workload, you might need NoSQL
  22. Why else might you need NoSQL?
     •  Super low latency applications
     •  Metadata-driven datasets
     •  Highly unrelational data
     •  Need schema-less data constructs*
     •  Massive amounts of data (again, in the TB range)
     •  Rapid ingest of data (thousands of records/sec)
     *Need != “it’s easier to do dev without schemas”
  23. So decide wisely. Look for the key points of scale.
  24. User > 100
     First let’s separate out our single host into more than one
     •  Web
     •  Database
        –  Use RDS to make your life easier
     [Diagram: User, Amazon Route 53, Elastic IP, Web Instance, RDS DB Instance]
  25. User > 1000
     Next let’s address our lack of failover and redundancy
     •  Elastic Load Balancing
     •  Another web instance
        –  In another Availability Zone
     •  Enable Amazon RDS Multi-AZ
     [Diagram: User, Amazon Route 53, Elastic Load Balancing; two Availability Zones, each with a Web Instance; RDS DB Instance Active (Multi-AZ) in one zone and RDS DB Instance Standby (Multi-AZ) in the other]
  26. Scaling this horizontally and vertically will get us pretty far (10s–100s of thousands of users)
  27. User > 10ks–100ks
     [Diagram: User, Amazon Route 53, Elastic Load Balancing; two Availability Zones with eight Web Instances, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ), and four RDS DB Instance Read Replicas]
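
The read replicas in the diagram above only help if the application sends reads to them. A minimal sketch of that routing follows, assuming a hypothetical `ReadWriteRouter` class and made-up endpoint names; neither appears in the talk. Read replicas typically replicate asynchronously, so reads routed this way can be slightly stale.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Simplification: anything that is not a plain SELECT counts as a write.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._readers) if is_read else self.primary


# Hypothetical endpoints, for illustration only.
router = ReadWriteRouter(
    primary="primary.db.example.internal",
    replicas=[
        "replica-1.db.example.internal",
        "replica-2.db.example.internal",
    ],
)
```

Calling `router.endpoint_for("SELECT ...")` repeatedly walks through the replicas in turn, while any other statement goes to the primary.
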
  28. This scales – but can be much cleaner
  29. Shift Some Load Around
     Let’s lighten the load on our web and database instances:
     •  Move static content from the web instance to Amazon S3 and CloudFront
     •  Move dynamic content from Elastic Load Balancing to CloudFront
     •  Move session/state and DB caching to ElastiCache or Amazon DynamoDB
     [Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing; one Availability Zone with Web Instance and RDS DB Instance Active (Multi-AZ); ElastiCache, Amazon DynamoDB]
  30. Shift Some Load Around
     Let’s lighten the load on our web and database instances:
     •  Move static content from the web instance to Amazon S3 and CloudFront
     •  Move dynamic content from Elastic Load Balancing to CloudFront
     •  Move session/state and DB caching to ElastiCache or Amazon DynamoDB
     [Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing; one Availability Zone with Web Instance and RDS DB Instance Active (Multi-AZ); ElastiCache, Amazon DynamoDB]
  31. Shift Some Load Around
     Let’s lighten the load on our web and database instances:
     •  Move static content from the web instance to Amazon S3 and CloudFront
     •  Move dynamic content from Elastic Load Balancing to CloudFront
     •  Move session/state and DB caching to ElastiCache or Amazon DynamoDB
     [Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing; one Availability Zone with Web Instance and RDS DB Instance Active (Multi-AZ); ElastiCache, Amazon DynamoDB]
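
One common way to move DB caching off the database is the cache-aside pattern: check the cache first and query the database only on a miss. The sketch below illustrates that pattern and is not taken from the talk. The `CacheAside` class is hypothetical, and a plain dictionary stands in for a cache cluster such as ElastiCache.

```python
import time


class CacheAside:
    """Cache-aside reads with a time-to-live, backed by an in-process dict."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # function that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}               # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: the database is not touched
        value = self._load(key)        # cache miss or expired: query the database
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self._store.pop(key, None)
```

Every hit is one query the database does not have to serve, which is where the load reduction comes from. The TTL bounds how stale a cached value can get.
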
  32. Now that our web tier is much more lightweight, we can revisit the beginning of our talk…
  33. Auto Scaling!
  34. [Chart: Typical Weekly Traffic to Amazon.com, Sunday through Saturday]
  35. [Chart: Typical Weekly Traffic to Amazon.com, Sunday through Saturday, with Provisioned Capacity]
  36. [Chart: November Traffic to Amazon.com]
  37. [Chart: November Traffic to Amazon.com, with Provisioned Capacity]
  38. [Chart: November Traffic to Amazon.com, with Provisioned Capacity; labels 76% and 24%]
  39. [Chart: November Traffic to Amazon.com]
  40. Auto Scaling lets you do this!
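
The charts contrast a flat provisioned-capacity line with traffic that rises and falls. The sketch below works through that comparison. The demand figures, the per-instance throughput and the 20% headroom are made-up assumptions, not numbers from the slides, and the function names are hypothetical.

```python
def fixed_utilization(demand):
    """Fraction of capacity actually used when sized for the peak at all times."""
    peak = max(demand)
    return sum(demand) / (peak * len(demand))


def scaled_capacity(demand, per_instance, headroom_pct=20):
    """Instances needed in each period to serve demand plus a safety margin."""
    plan = []
    for d in demand:
        needed = d * (100 + headroom_pct)
        divisor = 100 * per_instance
        instances = -(-needed // divisor)   # integer ceiling division
        plan.append(max(1, instances))      # never scale below one instance
    return plan


# Made-up requests per second for seven periods, with one spike.
demand = [200, 250, 300, 1000, 300, 250, 200]
plan = scaled_capacity(demand, per_instance=100)
```

With a fixed fleet sized for the spike, most of the capacity sits idle in the other periods. A plan that follows demand keeps the fleet small until the spike arrives.
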
  41. User > 500k+
     [Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, DynamoDB; two Availability Zones, each with an Auto Scaling Group of three Web Instances, ElastiCache, and an RDS DB Instance Read Replica; RDS DB Instance Active (Multi-AZ) in one zone and RDS DB Instance Standby (Multi-AZ) in the other]
  42. A Pause to Think
  43. “Give me six hours to chop down a tree and I will spend the first four sharpening the axe.” – Abraham Lincoln
  44. “World of Hurt” If You Are Missing These
     •  Metrics & alarming
     •  Automated builds
     •  Automated deployment
     •  Centralized logging
  45. [Diagram: Host-level Metrics, Aggregate-level Metrics, Log Analysis, External Site Performance]
  46. AWS Marketplace & Partners Can Help
     •  Customers can find, research, and buy software
     •  Simple pricing aligns with EC2 usage model
     •  Launch in minutes
     •  Marketplace billing integrated into your AWS account
     •  1,000+ products across 20+ categories
     Learn more at: aws.amazon.com/marketplace
  47. Spend Your Time Wisely
     •  Managing your infrastructure will take up an increasingly large part of your time. Use tools to automate repetitive tasks
     •  Tools to manage AWS resources
     •  Tools to manage software on and configuration of your instances
     •  Automated data analysis of logs and user actions
  48. AWS Application Management Solutions
     [Diagram: spectrum from Convenience (higher-level services) to Control (do it yourself): Elastic Beanstalk, AWS OpsWorks, AWS CloudFormation, EC2]
  49. Host-based Configuration Management
     •  Popular offerings
        –  Opscode Chef
        –  PuppetLabs Puppet
        –  Ansible
     •  Do similar things in slightly different ways
     •  Work well with tools from the previous slide
     •  Require some learning time
     •  Can’t scale easily without this kind of capability
  50. From 500K to 1 Million Users
     •  Getting serious now
     •  Significant user base
     •  Plenty of attention if things go wrong
     •  Interesting phase for startups with funding rounds
  51. Time to make some radical improvements at the web & app layers
  52. SOA = Service-oriented Architecture
  53. SOAing
     Move services into their own tiers or modules. Treat each of these as 100% separate pieces of your infrastructure and scale them independently.
     Amazon.com and AWS do this extensively! It offers flexibility and greater understanding of each component.
  54. Loose Coupling Sets You Free!
     •  The looser they're coupled, the bigger they scale
        –  Use independent components
        –  Design everything as a black box
        –  Decouple interactions
        –  Favor services with built-in redundancy and scalability over building your own
     [Diagram: Tight Coupling: Controller A, Controller B. Loose Coupling: Controller A, Q, Controller B, Q. Use Amazon SQS as buffers]
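
The tight-versus-loose contrast above can be sketched in a few lines. This is an illustration under stated assumptions, not code from the talk: Python's standard `queue.Queue` stands in for an Amazon SQS queue, and `producer`, `consumer` and `run` are hypothetical names. The point is that the producer only ever talks to the buffer, never to the consumer.

```python
import queue
import threading


def producer(buffer, jobs):
    for job in jobs:
        buffer.put(job)       # returns once the job is buffered
    buffer.put(None)          # sentinel: no more work


def consumer(buffer, results):
    while True:
        job = buffer.get()    # blocks until a job is available
        if job is None:
            break
        results.append(job * 2)   # placeholder for the real work


def run(jobs):
    buffer = queue.Queue()    # in-process stand-in for a message queue
    results = []
    worker = threading.Thread(target=consumer, args=(buffer, results))
    worker.start()
    producer(buffer, jobs)
    worker.join()
    return results
```

Because neither side calls the other directly, the consumer can be slow, restarted, or scaled out to several workers without the producer changing.
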
  55. Loose Coupling + SOA = Winning
     Examples:
     •  Email
     •  Queuing
     •  Transcoding
     •  Search
     •  Databases
     •  Monitoring
     •  Metrics
     •  Logging
     In the early days, if someone has a service for it already, use that instead of building it yourself.
     Don’t reinvent the wheel.
     [Diagram: Amazon CloudSearch, Amazon SQS, Amazon SNS, Amazon Elastic Transcoder, Amazon SWF, Amazon SES]
  56. Imagine we let our users upload photos
  57. [Diagram: User; Amazon S3 Bucket for Ingest; Amazon SNS Topic; SQS Queue: Size for Thumbnail; SQS Queue: Size Image for Mobile; SQS Queue: Size Image for Web; three sets of Auto Scaling Group Instances; Amazon S3 Bucket for Originals; Amazon S3 Bucket (RRS) to Serve Content to CloudFront; CloudFront Download Distribution]
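
The photo pipeline in the diagram is a fan-out: one upload notification goes to a topic, and every subscribed queue receives its own copy for a separate resize job. The sketch below models that flow in memory. The `Topic` class, the `deque` queues and the pixel sizes are assumptions made for illustration; they stand in for Amazon SNS and SQS and are not their APIs.

```python
from collections import deque

# Assumed target sizes; the talk does not give dimensions.
RESIZE_TARGETS = {
    "thumbnail": (128, 128),
    "mobile": (640, 480),
    "web": (1280, 960),
}


class Topic:
    """In-memory stand-in for a publish/subscribe topic."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, q):
        self._subscribers.append(q)

    def publish(self, message):
        for q in self._subscribers:
            q.append(message)      # every queue gets its own copy


def build_pipeline():
    topic = Topic()
    queues = {name: deque() for name in RESIZE_TARGETS}
    for q in queues.values():
        topic.subscribe(q)
    return topic, queues


def drain(name, q):
    """Worker for one queue: 'resize' each uploaded object key."""
    width, height = RESIZE_TARGETS[name]
    done = []
    while q:
        key = q.popleft()
        done.append(f"{name}/{width}x{height}/{key}")
    return done
```

Each queue can be drained by its own group of workers, so a backlog of web-sized images does not delay thumbnails.
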
  58. Amazon Simple Workflow Service (SWF)
     •  Provides an orchestration tool across your infrastructure
     •  Can act as a middle layer to pass messages and set up tasks
     •  Lets you break down individual tasks into different workers
     •  Lets you define logic between workers
     •  Lets you make a worker task from anything that can be scripted
     •  Includes built-in retries, timeouts, logging
     •  Features built-in reliability, scalability, and low cost
     [Diagram: Your code = Deciders & Workers]
  59. [Diagram: User; Amazon S3 Bucket for Ingest; Amazon SNS Topic; SQS Queue: Size for Thumbnail; SQS Queue: Size Image for Mobile; SQS Queue: Size Image for Web; three sets of Auto Scaling Group Instances; Amazon S3 Bucket for Originals; Amazon S3 Bucket (RRS) to Serve Content to CloudFront; CloudFront Download Distribution]
  60. [Diagram: User; Amazon S3 Bucket for Ingest; SWF; Instance Running Decider; three sets of Auto Scaling Group Instances; Amazon S3 Bucket for Originals; Amazon S3 Bucket (RRS) to Serve Content to CloudFront; CloudFront Download Distribution]
  61. Users > 1 Million
     Reaching a million and above is going to require some of all the previous things:
     •  Multi-AZ
     •  Elastic Load Balancing between tiers
     •  Auto Scaling
     •  Service-oriented architecture
     •  Serving content smartly (S3/CloudFront)
     •  Caching off DB
     •  Moving state off tiers that autoscale
  62. Users > 1 Million
     [Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancer; Availability Zone with four Web Instances, two Internal App Instances, two Worker Instances, RDS DB Instance Active (Multi-AZ), two RDS DB Instance Read Replicas; ElastiCache, Amazon DynamoDB, Amazon SQS, Amazon SES, Amazon CloudWatch]
  63. The next big steps
  64. From 5 to 10 Million Users
     You may start to run into issues with your database around contention on the write master.
     •  How can you solve it?
     •  Federation (splitting into multiple DBs based on function)
     •  Sharding (splitting one data set up across multiple hosts)
     •  Moving some functionality to other types of DBs (NoSQL)
  65. Database Federation
     •  Split up databases by function or purpose
     •  Harder to do cross-function queries
     •  Essentially delays the need for something like sharding or NoSQL until much further down the line
     •  Won’t help with single huge functions or tables
     [Diagram: ForumsDB, UsersDB, ProductsDB]
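
Federation amounts to a lookup from application function to database. The sketch below uses the three database names from the diagram. The mapping keys, the function names and the record layout are assumptions made for illustration. It also shows why cross-function queries get harder: a join across two databases has to happen in application code.

```python
# Function-to-database mapping; database names follow the slide's diagram.
FEDERATION = {
    "forums": "ForumsDB",
    "users": "UsersDB",
    "products": "ProductsDB",
}


def database_for(function):
    """Return the database that owns the given application function."""
    try:
        return FEDERATION[function]
    except KeyError:
        raise ValueError(f"no database registered for {function!r}") from None


def posts_with_authors(posts, users_by_id):
    """Application-level join: posts come from one DB, users from another."""
    return [
        {**post, "author": users_by_id[post["user_id"]]["name"]}
        for post in posts
    ]
```

In a single database the second function would be one SQL `JOIN`. After federation it becomes two queries plus a merge in the application.
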
  66. Sharded Horizontal Scaling
     •  More complex at the application layer
     •  ORM support can help
     •  No practical limit on scalability
     •  Operational complexity and sophistication
     •  Shard by function or key space
     •  RDBMS or NoSQL
     User     ShardID
     002345   A
     002346   B
     002347   C
     002348   B
     002349   A
     [Diagram: shards A, B, C]
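
The User-to-ShardID table above is a directory: the application looks up which shard holds a user before it connects. The sketch below shows that lookup next to a hash-based alternative for sharding by key space. The function names are hypothetical and the hashing scheme is one assumed option, not something the talk specifies.

```python
import hashlib

SHARDS = ["A", "B", "C"]

# Directory taken from the slide's User / ShardID table.
DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_by_directory(user_id):
    """Look the user up in an explicit mapping; raises KeyError if unknown."""
    return DIRECTORY[user_id]


def shard_by_hash(user_id, shards=SHARDS):
    """Derive the shard from a stable hash of the key, with no lookup table."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]
```

A directory lets individual users be moved between shards by editing one entry, at the cost of maintaining the mapping. Plain modulo hashing needs no mapping, but changing the number of shards reassigns most keys.
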
  67. Shifting Functionality to NoSQL
     •  Similar in a sense to federation
     •  Again, think about the earlier points for when you need NoSQL vs SQL
     •  Leverage hosted services like Amazon DynamoDB
     •  Consider these use cases:
        –  Leaderboards and scoring
        –  Rapid ingest of clickstream or log data
        –  Temporary data needs (cart data)
        –  “Hot” tables
        –  Metadata or lookup tables
  68. A quick review
  69. •  Use Multi-AZ for your infrastructure
     •  Make use of self-scaling services (Elastic Load Balancing, Amazon S3, Amazon SNS, Amazon SQS, Amazon SES, etc.)
     •  Build in redundancy at every level
     •  Blend SQL & NoSQL wisely
     •  Cache data both inside and outside your infrastructure
     •  Split tiers into individual services (SOA)
     •  Use Auto Scaling once you’re ready for it
     •  Use automation tools in your infrastructure
     •  Make sure you have good metrics, monitoring, and logging tools in place
     •  Don’t reinvent the wheel
  70. Putting all this together means we should now easily be able to handle 10+ million users!
  71. Users > 10 Million
     Iterating on top of the patterns seen here will get you up and over 100 million users.
  72. Users > 10 Million
     •  More fine-tuning of your application
     •  More SOA of features and functionality
     •  Going from Multi-AZ to multi-region
     •  Needing to start building custom solutions
     •  Deep analysis of your whole stack
  73. One More Thing
     •  A fantastic amount of FINANCIAL ENGINEERING to do as well
     •  Reserved Instances
     •  Spot Instances
     •  Correct use of storage
     •  Scaling driven by queues
     •  Correct instance sizes
     •  Etc…
  74. Next steps?
     Read!
     •  aws.amazon.com/documentation
     •  aws.amazon.com/architecture
     •  aws.amazon.com/start-ups
     Listen!
     •  aws.amazon.com/podcast
  75. Next steps?
     Ask for help!
     •  forums.aws.amazon.com
     •  aws.amazon.com/support
     •  Your local account manager & solution architect
