Scaling on AWS for the First 10 Million Users

AWS Summit 2014 Brisbane - Breakout 3

Cloud computing gives you a number of advantages, such as being able to scale your application on demand. As a new business looking to use the cloud, you inevitably ask yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We will show you how to combine different AWS services, make smarter decisions when architecting your application, and apply best practices for scaling your infrastructure in the cloud.

Presenter: Craig Dickson, Solutions Architect, Amazon Web Services

Transcript

  • 1. Scaling on AWS for the First 10 Million Users. Craig S. Dickson, Solutions Architect, Amazon Web Services. © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 2. Introductions
      • ME: Craig S. Dickson – Solutions Architect – Amazon Web Services – @craigsdickson
      • YOU: Here to learn more about scaling infrastructure on AWS
      • TODAY: About best practices and things to think about when building for large scale
  • 3. So how do we scale?
  • 4. Hi, I have NO IDEA what I am doing!!
  • 5. a lot of things to read
  • 6. a lot of things to read: not where we want to start
  • 7. What do we need first?
  • 8. So let’s start from day one, user one (i.e. you)
  • 9. Day One, User One
      • A single EC2 instance
          – With the full stack on this host: web app, database, management, etc.
      • A single Elastic IP
      • Route 53 for DNS
      Diagram labels: User, Amazon Route 53, Elastic IP, EC2 Instance
  • 10. “We’re gonna need a bigger box”
      • Simplest approach
      • Can now leverage Provisioned IOPS (PIOPS)
      • High I/O instances
      • High memory instances
      • High CPU instances
      • High storage instances
      • Easy to change instance sizes
      • Will hit a ceiling eventually
      Instance types shown: m1.small, m2.4xlarge, hi1.4xlarge
  • 12. Day One, User One
      • We could potentially get to a few hundred to a few thousand users, depending on application complexity and traffic
      • No failover
      • No redundancy
      • Too many eggs in one basket
      Diagram labels: User, Amazon Route 53, Elastic IP, EC2 Instance
  • 14. Day Two, User >1
      First let’s separate out our single host into more than one:
      • Web
      • Database
          – Make use of a database service?
      Diagram labels: User, Amazon Route 53, Elastic IP, Web Instance, Database Instance
  • 15. Database Options
      Self-managed:
      • Database server on Amazon EC2: your choice of database running on Amazon EC2; Bring Your Own License (BYOL)
      Fully managed:
      • Amazon DynamoDB: managed NoSQL database service using SSD storage; seamless scalability; zero administration
      • Amazon RDS: Microsoft SQL Server, Oracle, PostgreSQL or MySQL as a managed service; flexible licensing (BYOL or license included)
      • Amazon Redshift: massively parallel, petabyte-scale data warehouse service; fast, powerful and easy to scale
  • 16. But how do I choose what DB technology I need? SQL? NOSQL?
  • 17. Not a binary decision!
  • 18. Blended approach can reduce technical debt
  • 19. Start with SQL databases where it makes sense
  • 20. Why start with SQL?
      • Established and well-worn technology
      • Lots of existing code, communities, books, background, tools, etc.
      • You aren’t going to break SQL DBs in your first 10 million users. But you might break parts of them (hence the blended approach)
      • Clear patterns to scalability
  • 21. If your usage is such that you will be generating several TB (>5) of data in the first year, OR you have an incredibly data-intensive workload, you might need NoSQL
  • 22. Why might you need NOSQL?
      • Super low latency applications
      • Metadata-driven datasets
      • Highly unrelational data
      • Need schema-less data constructs*
      • Massive amounts of data (again, in the TB range)
      • Rapid ingest of data (thousands of records/sec)
      *Need != “it’s easier to do dev without schemas”
  • 23. So decide wisely. Look for the key points of scale.
  • 24. Users > 100
      First let’s separate out our single host into more than one:
      • Web
      • Database
          – Use RDS to make your life easier
      Diagram labels: User, Amazon Route 53, Elastic IP, Web Instance, RDS DB Instance
  • 25. Users > 1000
      Next let’s address our lack of failover and redundancy:
      • Elastic Load Balancing
      • Another web instance
          – In another Availability Zone
      • Enable Amazon RDS Multi-AZ
      Diagram labels: User, Amazon Route 53, Elastic Load Balancing, two Availability Zones, two Web Instances, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ)
  • 26. Scaling this horizontally and vertically will get us pretty far (10s-100s of thousands)
  • 27. Users > 10ks – 100ks
      Diagram labels: User, Amazon Route 53, Elastic Load Balancing, two Availability Zones, eight Web Instances, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ), four RDS DB Instance Read Replicas
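One way an application can make use of the read replicas on this slide is to send writes to the primary and spread reads across the replicas. The following is a minimal sketch of that idea, not something shown in the talk: the `ReadWriteRouter` class and the endpoint host names are hypothetical, and it only picks an endpoint rather than opening real database connections.

```python
import itertools


class ReadWriteRouter:
    """Pick a database endpoint: writes go to the primary, reads rotate over replicas."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads simply fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in self.WRITE_VERBS:
            return self.primary
        # Round-robin across the read replicas.
        return next(self._readers)


# Hypothetical endpoint names, for illustration only.
router = ReadWriteRouter(
    "primary.db.example.internal",
    ["replica-1.db.example.internal", "replica-2.db.example.internal"],
)
```

Replicas are typically updated asynchronously, so a read that must see a write made a moment ago may still need to go to the primary; the sketch leaves that decision to the caller.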
  • 28. This scales – but can be much cleaner
  • 29. Shift Some Load Around
      Let’s lighten the load on our web and database instances:
      • Move static content from the web instance to Amazon S3 and CloudFront
      • Move delivery of dynamic content from Elastic Load Balancing to CloudFront
      • Move session/state and DB caching to ElastiCache or Amazon DynamoDB
      Diagram labels: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, Availability Zone, Web Instance, RDS DB Instance Active (Multi-AZ), ElastiCache, Amazon DynamoDB
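The "DB caching" bullet is commonly implemented with a cache-aside pattern: look in the cache first, and only query the database on a miss. The sketch below illustrates that pattern under stated assumptions: an in-process dictionary stands in for a cache service such as ElastiCache, and the `CacheAside` class, `load_from_db` callback and TTL value are hypothetical names chosen for the example.

```python
import time


class CacheAside:
    """Cache-aside lookup: serve from the cache, fall back to the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expires_at, value); a dict stands in for the cache cluster.
        self._store = {}

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is not touched
        value = self._load(key)  # cache miss or expired entry: query the database
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._store.pop(key, None)
```

Keeping session state in a store like this, outside the web instances, is also what lets those instances be added and removed freely later on.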
  • 32. Now that our Web tier is much more lightweight, we can revisit the beginning of our talk…
  • 33. Auto Scaling!
  • 34. Chart: Typical weekly traffic to Amazon.com (Sunday through Saturday)
  • 35. Chart: Typical weekly traffic to Amazon.com (Sunday through Saturday), with provisioned capacity
  • 36. Chart: November traffic to Amazon.com
  • 37. Chart: November traffic to Amazon.com, with provisioned capacity
  • 38. Chart: November traffic to Amazon.com, with provisioned capacity (labels: 76%, 24%)
  • 39. Chart: November traffic to Amazon.com
  • 40. Auto Scaling lets you do this!
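The point of the traffic charts can be illustrated with a little arithmetic: a fleet sized for the peak sits partly idle the rest of the time, while a fleet that follows demand stays busy. The sketch below is only an illustration; the load figures and the per-instance capacity of 100 requests per second are invented, and `desired_instances` and `utilisation` are hypothetical helpers, not Auto Scaling APIs.

```python
import math


def desired_instances(load, per_instance_capacity, minimum=2):
    """Smallest fleet that can serve `load`, never dropping below `minimum`."""
    return max(minimum, math.ceil(load / per_instance_capacity))


def utilisation(loads, per_instance_capacity, fleet_sizes):
    """Fraction of the provisioned capacity that was actually used."""
    used = sum(loads)
    provisioned = sum(n * per_instance_capacity for n in fleet_sizes)
    return used / provisioned


# Invented sample: requests per second in four consecutive periods.
loads = [200, 400, 1000, 300]
capacity = 100  # assumed requests per second one instance can handle

# Static provisioning: size the fleet for the peak and leave it there.
static_fleet = [desired_instances(max(loads), capacity)] * len(loads)
# Demand-following provisioning: resize the fleet every period.
dynamic_fleet = [desired_instances(load, capacity) for load in loads]

static_use = utilisation(loads, capacity, static_fleet)
dynamic_use = utilisation(loads, capacity, dynamic_fleet)
```

With these made-up numbers the peak-sized fleet uses less than half of what it pays for, while the demand-following fleet uses all of it. Real scaling policies react with some delay and keep headroom, so actual utilisation lands somewhere in between.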
  • 41. Users > 500k+
      Diagram labels: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, DynamoDB, two Availability Zones, two Auto Scaling groups (three Web Instances each), two ElastiCache nodes, two RDS DB Instance Read Replicas, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ)
  • 42. A Pause to Think
  • 43. “Give me six hours to chop down a tree and I will spend the first four sharpening the axe.” – Abraham Lincoln
  • 44. “World of Hurt” If You Are Missing These
      • Metrics & alarming
      • Automated builds
      • Automated deployment
      • Centralized logging
  • 45. Host-level Metrics; Aggregate-level Metrics; Log Analysis; External Site Performance
  • 46. AWS Marketplace & Partners Can Help
      • Customers can find, research, and buy software
      • Simple pricing aligns with the EC2 usage model
      • Launch in minutes
      • Marketplace billing integrated into your AWS account
      • 1,000+ products across 20+ categories
      Learn more at: aws.amazon.com/marketplace
  • 47. Spend Your Time Wisely
      Managing your infrastructure will take up an increasingly large part of your time. Use tools to automate repetitive tasks:
      • Tools to manage AWS resources
      • Tools to manage the software on and configuration of your instances
      • Automated data analysis of logs and user actions
  • 48. AWS Application Management Solutions
      From convenience (higher-level services) to control (do it yourself): Elastic Beanstalk, AWS OpsWorks, AWS CloudFormation, EC2
  • 49. Host-based Configuration Management
      Popular offerings:
          – Opscode Chef
          – PuppetLabs Puppet
          – Ansible
      • Do similar things in slightly different ways
      • Work well with tools from the previous slide
      • Require some learning time
      • Can’t scale easily without this kind of capability
  • 50. From 500k to 1 Million Users •  Getting serious now •  Significant user base •  Plenty of attention if things go wrong •  Interesting phase for startups with funding rounds
  • 51. Time to make some radical improvements at the web & app layers
  • 52. SOA = Service-oriented Architecture
  • 53. SOAing Move services into their own tiers or modules. Treat each of these as 100% separate pieces of your infrastructure and scale them independently. Amazon.com and AWS do this extensively! It offers flexibility and greater understanding of each component.
  • 54. Loose Coupling Sets You Free!
      • The looser they're coupled, the bigger they scale
          – Use independent components
          – Design everything as a black box
          – Decouple interactions
          – Favor services with built-in redundancy and scalability over building your own
      Diagram: tight coupling (Controller A and Controller B connected directly) vs. loose coupling (Controller A and Controller B connected through queues, using Amazon SQS as buffers)
  • 55. Loose Coupling + SOA = Winning
      In the early days, if someone has a service for it already, use that instead of building it yourself. Don’t reinvent the wheel.
      Examples: email, queuing, transcoding, search, databases, monitoring, metrics, logging
      Services shown: Amazon CloudSearch, Amazon SQS, Amazon SNS, Amazon Elastic Transcoder, Amazon SWF, Amazon SES
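The queue-as-buffer idea can be sketched in a few lines. In the example below, Python's in-process `queue.Queue` stands in for Amazon SQS (an assumption made so the example runs anywhere), and the `controller_a` and `controller_b` functions are hypothetical stand-ins for the two controllers in the diagram. The producer never calls the consumer directly; it only knows about the queue.

```python
import queue
import threading


def controller_a(buffer, jobs):
    """Producer: hands work to the queue and moves on without waiting for B."""
    for job in jobs:
        buffer.put(job)
    buffer.put(None)  # sentinel telling the consumer there is no more work


def controller_b(buffer, results):
    """Consumer: drains the queue at its own pace."""
    while True:
        job = buffer.get()
        if job is None:
            break
        results.append(job * 2)  # placeholder for the real processing step


buffer = queue.Queue()  # stand-in for an SQS queue
results = []

consumer = threading.Thread(target=controller_b, args=(buffer, results))
consumer.start()
controller_a(buffer, [1, 2, 3])
consumer.join()
```

Because the only shared contract is the message format, either side can be scaled, restarted or replaced without the other noticing, which is the property the slide is after.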
  • 56. Imagine we let our users upload photos
  • 57. Diagram labels: User, Amazon S3 Bucket for Ingest, Amazon SNS Topic, SQS Queue (Size for Thumbnail), SQS Queue (Size Image for Mobile), SQS Queue (Size Image for Web), three Auto Scaling groups of instances, Amazon S3 Bucket for Originals, RRS Amazon S3 Bucket to Serve Content to CloudFront, CloudFront Download Distribution
  • 58. Amazon Simple Workflow Service (SWF)
      • Provides an orchestration tool across your infrastructure
      • Can act as a middle layer to pass messages and set up tasks
      • Lets you break down individual tasks into different workers
      • Lets you define logic between workers
      • Lets you make a worker task from anything that can be scripted
      • Includes built-in retries, timeouts, logging
      • Features built-in reliability, scalability, and low cost
      Your code = Deciders & Workers
  • 59. Diagram labels: User, Amazon S3 Bucket for Ingest, Amazon SNS Topic, SQS Queue (Size for Thumbnail), SQS Queue (Size Image for Mobile), SQS Queue (Size Image for Web), three Auto Scaling groups of instances, Amazon S3 Bucket for Originals, RRS Amazon S3 Bucket to Serve Content to CloudFront, CloudFront Download Distribution
  • 60. Diagram labels: User, Amazon S3 Bucket for Ingest, SWF, Instance Running Decider, three Auto Scaling groups of instances, Amazon S3 Bucket for Originals, RRS Amazon S3 Bucket to Serve Content to CloudFront, CloudFront Download Distribution
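One reading of this diagram is a fan-out: each upload notification is published once to a topic and copied into one queue per image size, so each resizing fleet works through its own backlog. The sketch below models that reading with in-memory objects; the `Topic` class, the queue names, and the message fields are hypothetical stand-ins for Amazon SNS and SQS, not their APIs.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a publish/subscribe topic."""

    def __init__(self):
        self._queues = []

    def subscribe(self, q):
        self._queues.append(q)

    def publish(self, message):
        # Every subscribed queue receives its own copy of the message.
        for q in self._queues:
            q.append(dict(message))


# One queue per output size, each drained by its own group of workers.
thumbnail_queue, mobile_queue, web_queue = deque(), deque(), deque()

uploads = Topic()
for q in (thumbnail_queue, mobile_queue, web_queue):
    uploads.subscribe(q)

# Hypothetical notification for a newly uploaded photo.
uploads.publish({"bucket": "ingest", "key": "photos/cat.jpg"})
```

Adding a new output size then means subscribing one more queue and worker group, with no change to the upload path.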
  • 61. Users > 1 Million
      Reaching a million and above is going to require some of all the previous things:
      • Multi-AZ
      • Elastic Load Balancing between tiers
      • Auto Scaling
      • Service-oriented architecture
      • Serving content smartly (S3/CloudFront)
      • Caching off DB
      • Moving state off tiers that autoscale
  • 62. Users > 1 Million
      Diagram labels: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancer, Availability Zone, four Web Instances, two Internal App Instances, two Worker Instances, RDS DB Instance Active (Multi-AZ), two RDS DB Instance Read Replicas, ElastiCache, Amazon DynamoDB, Amazon SQS, Amazon SES, Amazon CloudWatch
  • 63. The next big steps
  • 64. From 5 to 10 Million Users
      You may start to run into issues with your database around contention on the write master. How can you solve it?
      • Federation (splitting into multiple DBs based on function)
      • Sharding (splitting one data set up across multiple hosts)
      • Moving some functionality to other types of DBs (NOSQL)
  • 65. Database Federation
      • Split up databases by function or purpose
      • Harder to do cross-function queries
      • Essentially delays the need for something like sharding or NOSQL until much further down the line
      • Won’t help with single huge functions or tables
      Example databases: ForumsDB, UsersDB, ProductsDB
  • 66. Sharded Horizontal Scaling
      • More complex at the application layer
      • ORM support can help
      • No practical limit on scalability
      • Operational complexity and sophistication
      • Shard by function or key space
      • RDBMS or NOSQL
      Example mapping (User: ShardID) across shards A, B, C: 002345: A; 002346: B; 002347: C; 002348: B; 002349: A
  • 67. Shifting Functionality to NOSQL
      • Similar in a sense to federation
      • Again, think about the earlier points for when you need NOSQL vs. SQL
      • Leverage hosted services like Amazon DynamoDB
      • Consider these use cases:
          – Leaderboards and scoring
          – Rapid ingest of clickstream or log data
          – Temporary data needs (cart data)
          – “Hot” tables
          – Metadata or lookup tables
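The user-to-shard table on this slide suggests a lookup directory at the application layer. A minimal sketch of that follows: the directory entries are the ones shown on the slide, while the hash-based placement of users not yet in the directory is an assumption added for illustration, as are the `shard_for` function and the in-memory dictionary (a real directory would live in a shared store).

```python
import hashlib

SHARDS = ["A", "B", "C"]

# User ID -> shard ID, as shown on the slide.
directory = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_for(user_id):
    """Return the shard holding this user's data, assigning one if the user is new."""
    if user_id in directory:
        return directory[user_id]
    # Assumed placement rule for new users: a stable hash of the ID picks the shard.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    shard = SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]
    # Record the choice so the user stays put even if SHARDS grows later.
    directory[user_id] = shard
    return shard
```

Recording each assignment in the directory, rather than recomputing the hash on every request, is what allows shards to be added later without moving existing users.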
  • 68. A quick review
  • 69. A quick review
      • Use Multi-AZ for your infrastructure
      • Make use of self-scaling services (Elastic Load Balancing, Amazon S3, Amazon SNS, Amazon SQS, Amazon SES, etc.)
      • Build in redundancy at every level
      • Blend SQL & NOSQL wisely
      • Cache data both inside and outside your infrastructure
      • Split tiers into individual services (SOA)
      • Use Auto Scaling once you’re ready for it
      • Use automation tools in your infrastructure
      • Make sure you have good metrics, monitoring, and logging tools in place
      • Don’t reinvent the wheel
  • 70. Putting all this together means we should now easily be able to handle 10+ million users!
  • 71. Users > 10 Million Iterating on top of the patterns seen here will get you up and over 100 million users.
  • 72. Users > 10 Million
      • More fine tuning of your application
      • More SOA of features and functionality
      • Going from multi-AZ to multi-region
      • Needing to start building custom solutions
      • Deep analysis of your whole stack
  • 73. One More Thing
      • A fantastic amount of FINANCIAL ENGINEERING to do as well
      • Reserved Instances
      • Spot Instances
      • Correct use of storage
      • Scaling driven by queues
      • Correct instance sizes
      • Etc.
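"Scaling driven by queues" usually means sizing a worker fleet from the backlog rather than from CPU load. The sketch below shows one simple way to turn a queue depth into a worker count; the `workers_needed` function, its parameters and the default limits are all hypothetical, chosen only to illustrate the calculation.

```python
import math


def workers_needed(backlog, seconds_per_message, target_drain_seconds,
                   min_workers=1, max_workers=20):
    """Workers required to clear `backlog` messages within the target time."""
    if backlog <= 0:
        return min_workers
    # Total work in seconds, divided by the time we are willing to wait.
    needed = math.ceil(backlog * seconds_per_message / target_drain_seconds)
    # Clamp to the fleet limits so cost and capacity stay bounded.
    return max(min_workers, min(max_workers, needed))
```

For example, 300 queued messages at 2 seconds each with a 60 second drain target works out to 10 workers, while an empty queue falls back to the minimum, which is where the cost saving comes from.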
  • 74. Next steps?
      Read!
      • aws.amazon.com/documentation
      • aws.amazon.com/architecture
      • aws.amazon.com/start-ups
      Listen!
      • aws.amazon.com/podcast
  • 75. Next steps?
      Ask for help!
      • forums.aws.amazon.com
      • aws.amazon.com/support
      • Your friendly local account manager & solutions architect
  • 76. Expand your skills with AWS
      • Certification (aws.amazon.com/certification): Exams. Validate your proven technical expertise with the AWS platform
      • On-Demand Resources (aws.amazon.com/training/self-paced-labs): Videos & Labs. Get hands-on practice working with AWS technologies in a live environment
      • Instructor-Led Courses (aws.amazon.com/training): Training Classes. Expand your technical expertise to design, deploy, and operate scalable, efficient applications on AWS