AWS re:Invent 2016: Scaling Up to Your First 10 Million Users (ARC201)

Cloud computing gives you a number of advantages, such as the ability to scale your web application or website on demand. If you have a new web application and want to use cloud computing, you might be asking yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We show you how to best combine different AWS services, how to make smarter decisions for architecting your application, and how to scale your infrastructure in the cloud.

Published in: Technology

  1. 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Joel Williams, AWS Solutions Architect November 29, 2016 Scaling Up to Your First 10 Million Users ARC201
  2. 2. Joel Williams • AWS Solutions Architect (July 2012) AWS re:Invent 2016
  3. 3. AWS re:Invent 2016 WHO ARE YOU?
  4. 4. http://i.telegraph.co.uk/multimedia/archive/02674/CLIMBER_2674482b.jpg
  5. 5. Now that’s a lot of things to read! This is NOT where we want to start!
  6. 6. It’s not the single thing that fixes everything.
  7. 7. What do we need first?
  8. 8. Some basics…
  9. 9. AWS Global Infrastructure
  10. 10. TECHNICAL & BUSINESS SUPPORT Account Management Support Professional Services Solutions Architects Training & Certification Security & Pricing Reports Partner Ecosystem AWS MARKETPLACE Backup Big Data & HPC Business Apps Databases Development Industry Solutions Security APPLICATION SERVICES Queuing Notifications Search Orchestration Email ENTERPRISE APPS Virtual Desktops Storage Gateway Sharing & Collaboration Email & Calendaring Directories HYBRID CLOUD MANAGEMENT Backups Deployment Direct Connect Identity Federation Integrated Management SECURITY & MANAGEMENT Virtual Private Networks Identity & Access Encryption Keys Configuration Monitoring Dedicated INFRASTRUCTURE SERVICES Regions Availability Zones Compute Storage Databases SQL, NoSQL, Caching CDN Networking PLATFORM SERVICES App Mobile & Web Front-end Functions Identity Data Store Real-time Development Containers Source Code Build Tools Deployment DevOps Mobile Sync Identity Push Notifications Mobile Analytics Mobile Backend Analytics Data Warehousing Hadoop Streaming Data Pipelines Machine Learning
  11. 11. TECHNICAL & BUSINESS SUPPORT Account Management Support Professional Services Solutions Architects Training & Certification Security & Pricing Reports Partner Ecosystem AWS MARKETPLACE Backup Big Data & HPC Business Apps Databases Development Industry Solutions Security APPLICATION SERVICES Queuing Notifications Search Orchestration Email ENTERPRISE APPS Virtual Desktops Storage Gateway Sharing & Collaboration Email & Calendaring Directories HYBRID CLOUD MANAGEMENT Backups Deployment Direct Connect Identity Federation Integrated Management SECURITY & MANAGEMENT Virtual Private Networks Identity & Access Encryption Keys Configuration Monitoring Dedicated INFRASTRUCTURE SERVICES Regions Availability Zones Compute Storage Databases SQL, NoSQL, Caching CDN Networking PLATFORM SERVICES App Mobile & Web Front-end Functions Identity Data Store Real-time Development Containers Source Code Build Tools Deployment DevOps Mobile Sync Identity Push Notifications Mobile Analytics Mobile Backend Analytics Data Warehousing Hadoop Streaming Data Pipelines Machine Learning
  12. 12. Solutions Architects
  13. 13. Solutions Architects
  14. 14. APPLICATION SERVICES Queuing Notifications Search Orchestration Email SECURITY & MANAGEMENT Virtual Private Networks Identity & Access Encryption Keys Configuration Monitoring Dedicated INFRASTRUCTURE SERVICES Regions Availability Zones Compute Storage Databases SQL, NoSQL, Caching CDN Networking PLATFORM SERVICES App Mobile & Web Front-end Functions Identity Data Store Real-time Development Containers Source Code Build Tools Deployment DevOps Mobile Sync Identity Push Notifications Mobile Analytics Mobile Backend Analytics Data Warehousing Hadoop Streaming Data Pipelines Machine Learning
  15. 15. AWS building blocks Inherently highly available and fault-tolerant services: • Amazon CloudFront • Amazon Route 53 • Amazon S3 • Amazon DynamoDB • Elastic Load Balancing • Amazon EFS • AWS Lambda • Amazon SQS • Amazon SNS • Amazon SES • Amazon SWF • … Highly available with the right architecture: • Amazon EC2 • Amazon EBS • Amazon RDS • Amazon VPC
  16. 16. So let’s start from…
  17. 17. You
  18. 18. 1 User Amazon EC2 instance Elastic IP User Amazon Route 53
  19. 19. “We’re gonna need a bigger box” • Simplest approach • Can now leverage PIOPS • High I/O instances • High memory instances • High CPU instances • High storage instances • Easy to change instance sizes • Will hit an endpoint eventually c4.8xlarge m3.2xlarge t2.micro
  20. 20. “We’re gonna need a bigger box” • Simplest approach • Can now leverage PIOPS • High I/O instances • High memory instances • High CPU instances • High storage instances • Easy to change instance sizes • Will hit an endpoint eventually c4.8xlarge m3.2xlarge t2.micro
  21. 21. 1 User • 100s? 1000s? • No failover • No redundancy • Too many eggs in one basket EC2 Instance Elastic IP User Amazon Route 53
  22. 22. Users >1
  23. 23. Users > 1 Web Instance Database Instance Elastic IP User Amazon Route 53
  24. 24. Database options • Self-managed: Amazon EC2 • Fully managed: Amazon DynamoDB, Amazon RDS, Amazon Redshift
  25. 25. Amazon Aurora • Automatic storage scaling (up to 64 TB) • Up to 15 read-replicas • Continuous (incremental) backups to Amazon S3 • 6-way replication across 3 AZs • MySQL Compatible
  26. 26. To NoSQL, or not to NoSQL?
  27. 27. Start with SQL databases
  28. 28. Why start with SQL? • Established and well-worn technology. • Lots of existing code, communities, books, and tools. • You aren’t going to break SQL DBs in your first 10 million users. No, really, you won’t.* • Clear patterns to scalability. *Unless you are doing something SUPER peculiar with the data or you have MASSIVE amounts of it. …but even then SQL will have a place in your stack.
  29. 29. AH HA! You said “massive amounts”
  30. 30. > 5 TB in year one? Incredibly data intensive workload? OK! You might need NoSQL.
  31. 31. Why else might you need NoSQL? • Super low-latency applications • Metadata-driven datasets • Highly nonrelational data • Need schema-less data constructs* • Rapid ingest of data (thousands of records/sec) • Massive amounts of data (again, in the TB range) *Need != “It’s easier to do dev without schemas”
  32. 32. Users >100
  33. 33. Users >100 Web instance Elastic IP RDS DB instance User Amazon Route 53
  34. 34. Users >1000
  35. 35. Users >1000 Web Instance RDS DB Instance Active (Multi-AZ) Availability Zone Availability Zone Web Instance RDS DB Instance Standby (Multi-AZ) Load balancer User Amazon Route 53
  36. 36. Elastic Load Balancing • Highly available • Ports 1 - 65535 • Health checks • Session stickiness • Monitoring / logging • Secure Sockets Layer (SSL)
  37. 37. Application Load Balancer • Highly available • Ports 1 - 65535 • Health checks • Session stickiness • Monitoring / logging • Content-based routing • Container-based apps • WebSockets • HTTP/2
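
The health-check behavior listed on the load balancer slides can be illustrated with a minimal sketch. This is not how Elastic Load Balancing is implemented; it is a toy round-robin picker that skips targets marked unhealthy. The class name `RoundRobinBalancer` and the target names `web-1` to `web-3` are hypothetical.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips targets marked unhealthy."""

    def __init__(self, targets):
        self._targets = list(targets)
        self._healthy = set(self._targets)  # assume healthy until a check fails
        self._next = 0

    def mark(self, target, healthy):
        # Record the outcome of a (hypothetical) health check.
        if healthy:
            self._healthy.add(target)
        else:
            self._healthy.discard(target)

    def pick(self):
        # Try each target at most once per call, starting at the cursor.
        for _ in range(len(self._targets)):
            target = self._targets[self._next]
            self._next = (self._next + 1) % len(self._targets)
            if target in self._healthy:
                return target
        raise RuntimeError("no healthy targets")


lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
lb.mark("web-2", healthy=False)  # a failed check takes web-2 out of rotation
print([lb.pick() for _ in range(3)])
```

With `web-2` out of rotation, requests alternate between the two remaining instances until a later check marks it healthy again.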
  38. 38. horizontally vertically
  39. 39. Users >100,000
  40. 40. Users > 10,000s–100,000s RDS DB Instance Active (Multi-AZ) Availability Zone Availability Zone RDS DB Instance Standby (Multi-AZ) Load balancer RDS DB Instance Read Replica RDS DB Instance Read Replica RDS DB Instance Read Replica RDS DB Instance Read Replica Web Instance Web Instance Web Instance Web Instance Web Instance Web Instance Web Instance Web Instance Amazon Route 53 User
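
The read-replica layout in this diagram implies that the application sends writes to the active instance and spreads reads across replicas. A minimal sketch of that split follows, assuming a simple "SELECT means read" rule; the class name `ReadWriteRouter` and the endpoint hostnames are hypothetical, and a real application would hand the chosen endpoint to its database driver.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas exist yet.
        self._readers = itertools.cycle(list(replicas) or [primary])

    def endpoint_for(self, sql):
        # Simplification: anything that is not a plain SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter(
    primary="primary.db.example.internal",
    replicas=["replica-1.db.example.internal", "replica-2.db.example.internal"],
)
print(router.endpoint_for("SELECT * FROM users WHERE id = 1"))
print(router.endpoint_for("UPDATE users SET name = 'x' WHERE id = 1"))
```

Replicas are updated asynchronously, so a read issued right after a write may return stale data; code that needs read-your-writes behavior would send those reads to the primary.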
  41. 41. What about performance and efficiency?
  42. 42. Lighten the Load
  43. 43. RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFront Amazon Route 53 User Shift some load around Web Instances Availability Zone Web Instances RDS DB Instance Standby (Multi-AZ)
  44. 44. RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFront Amazon Route 53 User Shift some load around Web Instances
  45. 45. Amazon Simple Storage Service (S3) • Object-based storage • Highly durable • Great for static assets • “Infinitely scalable” • Objects up to 5 TB in size • Optional encryption
  46. 46. Amazon CloudFront • Cache content for faster delivery • Lower load on origin • Dynamic and static content • Streaming video • Custom SSL certificates • Low TTLs (as short as 0 seconds) • Free origin fetches? • Optimized for AWS
  47. 47. Amazon CloudFront [Charts: response time and server load compared for No CDN, CDN for Static Content, and CDN for Static & Dynamic Content; volume of data delivered (Gbps) by hour, 8:00 AM to 9:00 PM]
  48. 48. Shift some load around RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFront Amazon Route 53 User ElastiCache DynamoDB Web Instances
  49. 49. Amazon DynamoDB • Managed NoSQL database • Provisioned throughput • Fast, predictable performance • Fully distributed, fault tolerant • JSON support • Items up to 400 KB
  50. 50. Amazon ElastiCache • Managed Memcached or Redis • Scale from one to many nodes • Self-healing (replaces dead instance) • Single digit ms speeds (usually) • Local to a single AZ for Memcached • Multi-AZ possible with Redis
  51. 51. Shift some load around RDS DB Instance Active (Multi-AZ) Availability Zone Load balancer Amazon S3 Amazon CloudFront User ElastiCache DynamoDB Web Instances Amazon Route 53
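
Placing ElastiCache in front of the database is commonly done with a cache-aside pattern: check the cache, fall back to the database on a miss, then store the result with a TTL. The sketch below uses an in-process dictionary as a stand-in for Memcached or Redis; the class name `CacheAside`, the `load_from_db` callback, and the 60-second TTL are assumptions for illustration.

```python
import time


class CacheAside:
    """Cache-aside lookup with a TTL, using a dict in place of a cache node."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # called only on a miss or expiry
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]            # cache hit: the database is not touched
        value = self._load(key)        # miss or expired: read from the database
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self._entries.pop(key, None)


def slow_lookup(user_id):
    # Hypothetical stand-in for a database query.
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(slow_lookup)
cache.get(42)   # loads from the "database"
cache.get(42)   # served from the cache
```

The TTL bounds how stale a cached value can get; explicit invalidation after writes tightens that further at the cost of more cache misses.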
  52. 52. Now that our web tier is much more lightweight, we can revisit the beginning of our talk…
  53. 53. Auto Scaling!
  54. 54. Auto Scaling • Automatic resizing of compute clusters • Define min/max pool sizes • CloudWatch metrics drive scaling • On-Demand or Spot Instances aws autoscaling create-auto-scaling-group --auto-scaling-group-name MyGroup --launch-configuration-name MyConfig --min-size 4 --max-size 200 --availability-zones us-west-2c us-west-2b
  55. 55. Typical weekly traffic to Amazon.com [Chart: traffic by day, Sunday through Saturday]
  56. 56. Typical weekly traffic to Amazon.com [Chart: traffic by day, Sunday through Saturday, with provisioned capacity]
  57. 57. November traffic to Amazon.com [Chart: traffic across November]
  58. 58. November traffic to Amazon.com [Chart: traffic across November, with provisioned capacity]
  59. 59. November traffic to Amazon.com [Chart: traffic across November, with provisioned capacity; areas labeled 76% and 24%]
  60. 60. November traffic to Amazon.com [Chart: traffic across November]
  61. 61. Auto Scaling lets you do this!
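
To make "CloudWatch metrics drive scaling" concrete, here is a minimal sketch of a proportional sizing rule clamped to the group's minimum and maximum. It is an illustration only, not the algorithm the Auto Scaling service uses; the function name `desired_capacity` and the CPU target of 60 are assumptions, while the bounds 4 and 200 come from the CLI example on the Auto Scaling slide.

```python
import math


def desired_capacity(current, metric_value, target_value, min_size=4, max_size=200):
    """Size the fleet so the per-instance metric moves toward the target."""
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value must be > 0")
    # If the fleet runs at 90% CPU with a 60% target, it needs 1.5x the instances.
    wanted = math.ceil(current * metric_value / target_value)
    # Never leave the configured pool bounds.
    return max(min_size, min(max_size, wanted))


print(desired_capacity(current=10, metric_value=90, target_value=60))
print(desired_capacity(current=10, metric_value=20, target_value=60))
```

The clamp is what keeps a quiet period from shrinking the group below a safe floor and a traffic spike from growing it past a cost ceiling.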
  62. 62. Users > 500,000+ Availability Zone Amazon Route 53 User Amazon S3 Amazon CloudFront Availability Zone Load balancer DynamoDB RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCache RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCache RDS DB Instance Standby (Multi-AZ) RDS DB Instance Active (Multi-AZ)
  63. 63. Users > 500,000+ Availability Zone Amazon Route 53 User Amazon S3 Amazon CloudFront Availability Zone Load balancer DynamoDB RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCache RDS DB Instance Read Replica Web Instance Web Instance Web Instance ElastiCache RDS DB Instance Standby (Multi-AZ) RDS DB Instance Active (Multi-AZ)
  64. 64. Use automation
  65. 65. AWS application management solutions, ordered from convenience (higher-level services) to control (do it yourself): AWS Elastic Beanstalk, AWS OpsWorks, AWS CloudFormation, Amazon EC2
  66. 66. AWS CodeDeploy • Deploys your code to a “fleet” of EC2 instances • 1 – 10,000s of instances • Automatically schedules updates (multiple AZs) • Application and Deployment groups described in YAML-formatted files • Can reference Auto Scaling groups • AWS Management Console, CLI, or APIs • Can be used with Chef recipes or Puppet scripts
  67. 67. Users >500,000+ • Monitoring, metrics, and logging • If you can’t build it internally, outsource it! (third-party SaaS) • What are customers saying? • Try to squeeze as much performance out of each service/component
  68. 68. AGGREGATE LEVEL METRICS LOG ANALYSIS EXTERNAL SITE PERFORMANCE HOST LEVEL METRICS
  69. 69. There are further improvements to be made in breaking apart our web/app layer
  70. 70. SOA What does this mean?
  71. 71. Now that’s a lot of things to read! This is NOT where we want to start!
  72. 72. This is NOT where we want to start! This IS where we want to start! Now that’s a lot of things to read!
  73. 73. SOAing Move services into their own tiers. • Treat them separately • Scale them independently. It offers flexibility and greater understanding of each component
  74. 74. Loose coupling + SOA = winning DON’T REINVENT THE WHEEL • Email • Queuing • Transcoding • Search • Databases • Monitoring • Metrics • Logging • Compute Amazon CloudSearch Amazon SQS Amazon SNS Amazon Elastic Transcoder Amazon SWF Amazon SES AWS Lambda
  75. 75. • Reliable (Multi-AZ) • Scalable (unlimited messages) • Secure (queue authentication) • Simple (simple APIs) Application Services – Amazon SQS SQS messages Get Message Instance Put Message Instance Amazon SNS Topic Publish Notification Queue Is Subscribed to Topic
  76. 76. Compute / Platform – AWS Lambda • Functions triggered by events • JavaScript, Java, and Python • Managed • Implicit scaling S3 Bucket Lambda Push: Event Notification DynamoDB Pull: DynamoDB Stream Amazon Kinesis Pull: Amazon Kinesis Stream
  77. 77. Loose coupling sets you free! The looser they're coupled, the bigger they scale • Independent components • Design everything as a black box • Decouple interactions • Favor services with built-in redundancy and scalability • Don’t build your own! S3 Bucket Lambda Push: Event Notification DynamoDB Pull: DynamoDB Stream Amazon Kinesis Pull: Amazon Kinesis Stream SQS messages Get Message Instance Put Message Instance Amazon SNS Topic Publish Notification Queue Is Subscribed to Topic
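
The put-message / get-message flow on the SQS slides can be sketched with Python's standard-library `queue.Queue` standing in for an SQS queue. The producer and the workers share only the queue, so either side can be scaled or replaced independently. The function names and the "thumbnail" job are hypothetical, and a real SQS consumer would also have to handle visibility timeouts and duplicate deliveries, which this sketch omits.

```python
import queue
import threading


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:                       # sentinel: stop this worker
            jobs.task_done()
            break
        results.append(f"thumbnail:{job}")    # stand-in for real processing
        jobs.task_done()


def run_pipeline(uploads, worker_count=2):
    jobs = queue.Queue()                      # stand-in for the message queue
    results = []
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for name in uploads:                      # the web tier only enqueues
        jobs.put(name)
    for _ in threads:                         # one sentinel per worker
        jobs.put(None)
    jobs.join()                               # wait until every item is handled
    for thread in threads:
        thread.join()
    return results


print(sorted(run_pipeline(["a.jpg", "b.jpg", "c.jpg"])))
```

Changing `worker_count` requires no change to the producer, which is the independence the loose-coupling slides argue for.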
  78. 78. Users >1,000,000
  79. 79. Users >1 million+ Reaching a million and above is going to require some of each of the previous techniques: • Multi-AZ • Elastic Load Balancing between tiers • Auto Scaling • Service-oriented architecture (SOA) • Serving content smartly (Amazon S3/CloudFront) • Caching off DB • Moving state off tiers that auto scale
  80. 80. Users >1 million+ RDS DB Instance Active (Multi-AZ) Availability Zone load balancer RDS DB Instance Read Replica RDS DB Instance Read Replica Web Instance Web Instance Web Instance Web Instance Amazon Route 53 User Amazon S3 Amazon CloudFront DynamoDB Amazon SQS ElastiCache Worker Instance Worker Instance Amazon CloudWatch Internal App Instance Internal App Instance Amazon SES Lambda
  81. 81. The next big steps
  82. 82. Users >10,000,000
  83. 83. Users >5 million - 10 million Database Issues? How can you solve it? • Federation—splitting into multiple DBs based on function • Sharding—splitting one dataset up across multiple hosts • Moving some functionality to other types of DBs (NoSQL, Graph)
  84. 84. Database federation • Split up databases by function/purpose • Harder to do cross-function queries • Essentially delays sharding/NoSQL • Won’t help with single huge functions/tables Forums DB Users DB Products DB
  85. 85. Sharded horizontal scaling • More complex at the application layer • No practical limit on scalability • Operation complexity/sophistication • Shard by function or key space • RDBMS or NoSQL User → ShardID: 002345 → A, 002346 → B, 002347 → C, 002348 → B, 002349 → A (shards A, B, C)
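
The user-to-shard table on this slide is a lookup directory. A minimal sketch of that lookup follows, with one addition that is an assumption rather than part of the talk: users missing from the directory are placed by hashing their ID. The names `DIRECTORY`, `SHARDS`, and `shard_for` are hypothetical.

```python
import hashlib

SHARDS = ("A", "B", "C")

# Directory entries copied from the slide's example table.
DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_for(user_id, directory=DIRECTORY, shards=SHARDS):
    """Return the shard holding a user's data."""
    if user_id in directory:
        return directory[user_id]
    # Assumption: unknown users are placed by a stable hash of their ID.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


print(shard_for("002347"))
```

Modulo hashing moves most keys when the shard count changes, which is one reason a directory (or consistent hashing) is often preferred once resharding becomes likely.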
  86. 86. Shifting functionality to NoSQL • Similar in a sense to federation • NoSQL vs. SQL • Leverage managed services like DynamoDB Some use cases: • Leaderboards/scoring • Rapid ingest of clickstream/log data • Temporary data needs (cart data) • “Hot” tables • Metadata/lookup tables DynamoDB
  87. 87. A quick review
  88. 88. A quick review • Multi-AZ your infrastructure. • Make use of self-scaling services—ELB, Amazon S3, Amazon SNS, Amazon SQS, Amazon SWF, Amazon SES, etc. • Build in redundancy at every level. • Start with SQL. Seriously. • Cache data both inside and outside your infrastructure. • Use automation tools in your infrastructure.
  89. 89. A quick review continued • Make sure you have good metrics/monitoring/logging • Split tiers into individual services (SOA) • Use Auto Scaling once you’re ready for it • Don’t reinvent the wheel • Move to NoSQL if and when it makes sense
  90. 90. 11+ million users!
  91. 91. To infinity...
  92. 92. Users >11 million • More fine-tuning of your application • More SOA of features/functionality • Going from Multi-AZ to multi-region • Possibly start to build custom solutions • Deep analysis of your entire stack • Amazon EC2 Container Service • AWS Lambda
  93. 93. Next steps? READ! aws.amazon.com/documentation aws.amazon.com/architecture aws.amazon.com/start-ups START USING AWS: aws.amazon.com/free/
  94. 94. Ask for Help! forums.aws.amazon.com aws.amazon.com/premiumsupport/ Your Account Manager A Solutions Architect
  95. 95. Thank you! Joel Williams
  96. 96. Remember to complete your evaluations!
  97. 97. Related Sessions • DEV206 – Scaling Your Web Applications with AWS Elastic Beanstalk (Tue, Nov 29, 11:00 AM – 12:00 PM, Venetian, Level 2, Opaline Theatre; Thurs, Dec 1, 2:00 PM – 3:00 PM, Venetian, Level 2, Titan 2305) • ARC305 – From Monolithic to Microservices: Evolving Architecture Patterns in the Cloud (Wed, Nov 30, 2:00 PM – 3:00 PM, Venetian, Level 2, Venetian Theatre; Fri, Dec 2, 9:00 AM – 10:00 AM, Venetian, Level 4, Lando 4205) • CMP201 – Auto Scaling – The Fleet Management Solution for Planet Earth (Thurs, Dec 1, 5:00 PM – 6:00 PM, Venetian, Level 2, Opaline Theatre)
