Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

ENT309 Scaling Up to Your First 10 Million Users

19,903 views

Published on

Cloud computing gives you a number of advantages, such as the ability to scale your web application or website on demand. If you have a new web application and want to use cloud computing, you might be asking yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We show you how to best combine different AWS services, how to make smarter decisions for architecting your application, and how to scale your infrastructure in the cloud.

Published in: Technology

ENT309 Scaling Up to Your First 10 Million Users

  1. 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. John Pignata, Startup Solutions Architect August 14, 2017 Scaling Up to Your First 10 Million Users
  2. 2. http://i.telegraph.co.uk/multimedia/archive/02674/CLIMBER_2674482b.jpg
  3. 3. Now that’s a lot of things to read! This is NOT where we want to start!
  4. 4. It’s not the single thing that fixes everything.
  5. 5. What do we need first?
  6. 6. Some basics…
  7. 7. AWS global infrastructure Region, # of zones Coming Soon!
  8. 8. AWS global infrastructure
  9. 9. So let’s start from…
  10. 10. You
  11. 11. 1 User Amazon EC2 instance Elastic IP User Amazon Route 53
  12. 12. “We’re gonna need a bigger box” • Simplest approach • Can now leverage PIOPS • High I/O instances • High memory instances • High CPU instances • High storage instances • Easy to change instance sizes • Will hit an endpoint eventually c4.8xlarge m4.2xlarge t2.micro
  13. 13. “We’re gonna need a bigger box” • Simplest approach • Can now leverage PIOPS • High I/O instances • High memory instances • High CPU instances • High storage instances • Easy to change instance sizes • Will hit an endpoint eventually c4.8xlarge m4.2xlarge t2.micro
  14. 14. 1 User • No failover • No redundancy • Too many eggs in one basket EC2 Instance Elastic IP User Amazon Route 53
  15. 15. Users >1
  16. 16. Users > 1 Web Instance Database Instance Elastic IP User Amazon Route 53
  17. 17. Self-managed Fully managed Amazon EC2 Amazon DynamoDB Amazon RDS Amazon Redshift Database options
  18. 18. • MySQL or Postgres compatible • Automatic storage scaling (up to 64 TB) • Up to 15 read replicas • Continuous (incremental) backups to Amazon S3 • 6-way replication across 3 zones Amazon Aurora
  19. 19. To NoSQL, or not to NoSQL?
  20. 20. Start with SQL databases
  21. 21. Why start with SQL? • Established and well-worn technology. • Lots of existing code, communities, books, and tools. • You aren’t going to break SQL DBs in your first millions of users. No, really, you won’t.* • Clear patterns to scalability. *Unless you are doing something SUPER peculiar with the data or you have MASSIVE amounts of it. …but even then SQL will have a place in your stack.
  22. 22. AH HA! You said “massive amounts”
  23. 23. > 5 TB in year one? Incredibly data intensive workload? OK! You might need NoSQL.
  24. 24. Why else might you need NoSQL? • Super low-latency applications • Metadata-driven datasets • Highly nonrelational data • Need schema-less data constructs* • Rapid ingest of data (thousands of records/sec) • Massive amounts of data (again, in the TB range) *Need!= “It’s easier to do dev without schemas”
  25. 25. Users >100
  26. 26. Users >100 Web instance Elastic IP RDS DB instance User Amazon Route 53
  27. 27. Users >1000
  28. 28. Users >1000 Web Instance RDS DB Instance Active (Multi-AZ) Availability Zone Availability Zone Web Instance RDS DB Instance Standby (Multi-AZ) Load balancer User Amazon Route 53
  29. 29. Elastic Load Balancing • Highly available • 1–65535 • Health checks • Session stickiness • Monitoring/logging • Secure Sockets Layer
  30. 30. Application Load Balancer • Highly available • 1–65535 • Health checks • Session stickiness • Monitoring/logging • Content-based routing • Container-based apps • WebSockets • HTTP/2
  31. 31. Horizontally Vertically
  32. 32. Users >100,000
  33. 33. Users > 10,000s–100,000s RDS DB Instance Active (Multi-AZ) Availability Zone Availability Zone RDS DB Instance Standby (Multi-AZ) Load balancer RDS DB Instance Read Replica RDS DB Instance Read Replica RDS DB Instance Read Replica RDS DB Instance Read Replica Web Instance Web Instance Web Instance Web Instance Web Instance Web Instance Web Instance Web Instance Amazon Route 53User
  34. 34. What about performance and efficiency?
  35. 35. RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFront Amazon Route 53 User Shift some load around Web Instances Availability Zone Web Instances RDS DB Instance Standby (Multi-AZ)
  36. 36. RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFront Amazon Route 53 User Shift some load around Web Instances
  37. 37. • Object-based storage • Highly durable • Great for static assets • “Infinitely scalable” • Objects up to 5 TB in size • Optional encryption Amazon S3
  38. 38. • Cache content for faster delivery • Lower load on origin • Dynamic and static content • Streaming video • Custom SSL certificates • Low TTLs (as short as 0 seconds) • Free origin fetches? • Optimized for AWS Amazon CloudFront
  39. 39. Shift some load around RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFrontUser ElastiCache DynamoDB Web Instances Amazon Route 53
  40. 40. Amazon CloudFront ResponseTime ServerLoad Response Time Server Load Response Time Server Load No CDN CDN for Static Content CDN for Static & Dynamic Content 0 20 40 60 80 8:00AM 9:00AM 10:00AM 11:00AM 12:00PM 1:00PM 2:00PM 3:00PM 4:00PM 5:00PM 6:00PM 7:00PM 8:00PM 9:00PM VolumeofData Delivered(Gbps)
  41. 41. Shift some load around RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFront Amazon Route 53 User ElastiCache DynamoDB Web Instances
  42. 42. Amazon DynamoDB • Managed NoSQL database • Provisioned throughput • Fast, predictable performance • Fully distributed, fault tolerant • JSON support • Items up to 400 KB
  43. 43. Amazon ElastiCache • Managed Memcached or Redis • Scale from one to many nodes • Self-healing (replaces dead instance) • Single digit ms speeds (usually) • Local to a single AZ for Memcached • Multi-AZ possible with Redis
  44. 44. Now that our web tier is much more lightweight, we can revisit the beginning of our talk…
  45. 45. Auto Scaling!
  46. 46. Automatic resizing of compute clusters Define min/max pool sizes CloudWatch metrics drive scaling On-Demand or Spot instances aws autoscaling create-auto-scaling-group --auto-scaling-group-name MyGroup --launch-configuration-name MyConfig --min-size 4 --max-size 200 --availability-zones us-west-2c, us-west-2b Auto Scaling
  47. 47. Sunday Monday Tuesday Wednesday Thursday Friday Saturday Typical weekly traffic to Amazon.com
  48. 48. Sunday Monday Tuesday Wednesday Thursday Friday Saturday Typical weekly traffic to Amazon.com Provisioned capacity
  49. 49. November November traffic to Amazon.com
  50. 50. Provisioned capacity November November traffic to Amazon.com
  51. 51. November traffic to Amazon.com 76% 24% November Provisioned capacity
  52. 52. November traffic to Amazon.com November
  53. 53. Auto Scaling lets you do this!
  54. 54. Users > 500,000+ Availability Zone Amazon Route 53 User Amazon S3 Amazon CloudFront Availability Zone Load balancer DynamoDB RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCache RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCacheRDS DB Instance Standby (Multi-AZ) RDS DB Instance Active (Multi-AZ)
  55. 55. Users > 500,000+ Availability Zone Amazon Route 53 User Amazon S3 Amazon CloudFront Availability Zone Load balancer DynamoDB RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCache RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCacheRDS DB Instance Standby (Multi-AZ) RDS DB Instance Active (Multi-AZ)
  56. 56. Use automation
  57. 57. AWS application management solutions Convenience Control Higher-level services Do it yourself AWS Elastic Beanstalk AWS OpsWorks AWS CloudFormation Amazon EC2
  58. 58. AWS CodeDeploy • Deploys your code to a “fleet” of EC2 instances • 1–10,000s of instances • Automatically schedules updates (multiple zones) • Application and deployment groups described in YAML-formatted files • Can reference Auto Scaling groups • AWS Management Console, AWS CLI, or APIs • Can be used with Chef recipes or Puppet scripts
  59. 59. Users >500,000+ • Monitoring, metrics, and logging • If you can’t build it internally, outsource it! (third-party SaaS) • What are customers saying? • Try to squeeze as much performance out of each service/component
  60. 60. AGGREGATE LEVEL METRICS LOG ANALYSIS EXTERNAL SITE PERFORMANCE HOST LEVEL METRICS
  61. 61. There are further improvements to be made in breaking apart our web/app layer
  62. 62. SOA What does this mean?
  63. 63. Now that’s a lot of things to read! This is NOT where we want to start!
  64. 64. This is NOT where we want to start! This IS where we want to start! Now that’s a lot of things to read!
  65. 65. SOAing Move services into their own tiers • Treat them separately • Scale them independently It offers flexibility and greater understanding of each component
  66. 66. Loose coupling + SOA = winning DON’T REINVENT THE WHEEL • Email • Queuing • Transcoding • Search • Databases • Monitoring • Metrics • Logging • Compute Amazon CloudSearch Amazon SQSAmazon SNS Amazon Elastic Transcoder Amazon SWFAmazon SES AWS Lambda
  67. 67. • Reliable (Multi-AZ) • Scalable (unlimited messages) • Secure (queue authentication) • Simple (simple APIs) Application services – Amazon SQS SQS messages Get message Instance Put message Instance Amazon SNS Topic Publish notification Queue is subscribed to topic
  68. 68. Compute/platform – AWS Lambda • Functions triggered by events • JavaScript, Java, Python, and C# • Managed • Implicit scaling S3 bucket Lambda Push: event notification DynamoDB Pull: DynamoDB Stream Amazon Kinesis Pull: Amazon Kinesis stream
  69. 69. Loose coupling sets you free! The looser they're coupled, the bigger they scale • Independent components • Design everything as a black box • Decouple interactions • Favor services with built-in redundancy and scalability • Don’t build your own! S3 bucket Lambda Push: event notification DynamoDB Pull: DynamoDB Stream Amazon Kinesis SQS messages Get message Instance Put message Instance Amazon SNS Topic Publish notification Queue is subscribed to topic Pull: Amazon Kinesis stream
  70. 70. Users >1,000,000
  71. 71. Users >1 million+ Reaching a million and above is going to require some bit of all the previous things: • Multi-AZ • Elastic Load Balancing between tiers • Auto Scaling • Service oriented architecture (SOA) • Serving content smartly (Amazon S3/CloudFront) • Caching off DB • Moving state off tiers that auto scale
  72. 72. Users >1 million+ RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer RDS DB Instance Read Replica RDS DB Instance Read Replica Web Instance Web Instance Web Instance Web Instance Amazon Route 53 User Amazon S3 Amazon CloudFront DynamoDB Amazon SQS ElastiCache Worker Instance Worker Instance Amazon CloudWatch Internal App Instance Internal App Instance Amazon SES Lambda
  73. 73. The next big steps
  74. 74. Users >5 million–10 million Database issues? How can you solve? • Federation – splitting into multiple DBs based on function • Sharding – splitting one dataset up across multiple hosts • Moving some functionality to other types of DBs (NoSQL, Graph)
  75. 75. Database federation • Split up databases by function/purpose • Harder to do cross-function queries • Essentially delays sharding/NoSQL • Won’t help with single huge functions/tables Forums DB Users DB Products DB
  76. 76. Sharded horizontal scaling • More complex at the application layer • No practical limit on scalability • Operation complexity/sophistication • Shard by function or key space • RDBMS or NoSQL User ShardID 002345 A 002346 B 002347 C 002348 B 002349 A CBA
  77. 77. Shifting functionality to NoSQL • Similar in a sense to federation • NoSQL vs. SQL • Leverage managed services like DynamoDB Some use cases: • Leaderboards/scoring • Rapid ingest of clickstream/log data • Temporary data needs (cart data) • “Hot” tables • Metadata/lookup tables DynamoDB
  78. 78. A quick review
  79. 79. A quick review • Multi-AZ your infrastructure. • Make use of self-scaling services – ELB, Amazon S3, Amazon SNS, Amazon SQS, Amazon SWF, Amazon SES, etc. • Build in redundancy at every level. • Start with SQL. Seriously. • Cache data both inside and outside your infrastructure. • Use automation tools in your infrastructure.
  80. 80. A quick review continued • Make sure you have good metrics/monitoring/logging • Split tiers into individual services (SOA) • Use Auto Scaling once you’re ready for it • Don’t reinvent the wheel • Move to NoSQL if and when it makes sense
  81. 81. 10+ million users!
  82. 82. To infinity...
  83. 83. User >10 million • More fine-tuning of your application • More SOA of features/functionality • Going from Multi-AZ to multi-region • Possibly start to build custom solutions • Deep analysis of your entire stack • Amazon EC2 Container Service • AWS Lambda
  84. 84. Next steps? READ! aws.amazon.com/documentation aws.amazon.com/architecture aws.amazon.com/start-ups START USING AWS: aws.amazon.com/free/
  85. 85. Ask for help! forums.aws.amazon.com aws.amazon.com/premiumsupport/ Your account manager A solutions architect
  86. 86. Thank you!

×