
(ARC301) Scaling Up to Your First 10 Million Users

Cloud computing gives you a number of advantages, such as the ability to scale your web application or website on demand. If you have a new web application and want to use cloud computing, you might be asking yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We show you how to best combine different AWS services, how to make smarter decisions for architecting your application, and how to scale your infrastructure in the cloud.

Published in: Technology

  1. 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Joel Williams, AWS Solutions Architect October 8, 2015 Scaling Up to Your First 10 Million Users (ARC301)
  2. 2. Joel Williams • Amazon Web Services Solutions Architect (July 2012) AWS re:Invent 2015
  3. 3. WHO ARE YOU?
  4. 4. So how do we scale?
  5. 5. http://i.telegraph.co.uk/multimedia/archive/02674/CLIMBER_2674482b.jpg
  6. 6. Now that’s a lot of things to read! This is NOT where we want to start!
  7. 7. Auto Scaling …is a tool and a destination.
  8. 8. It’s not the single thing that fixes everything.
  9. 9. What do we need first?
  10. 10. Some basics…
  11. 11. Regions: US-EAST (N. Virginia), US-WEST (Oregon), US-WEST (N. California), AWS GOVCLOUD (US), SOUTH AMERICA (Sao Paulo), EU (Ireland), EU (Frankfurt), ASIA PACIFIC (Tokyo), ASIA PACIFIC (Sydney), ASIA PACIFIC (Singapore), CHINA (Beijing), INDIA (2016)
  12. 12. Availability Zones (shown on the same map of regions)
  13. 13. Edge locations
  14. 14. TECHNICAL & BUSINESS SUPPORT Account Management Support Professional Services Solutions Architects Training & Certification Security & Pricing Reports Partner Ecosystem AWS MARKETPLACE Backup Big Data & HPC Business Apps Databases Development Industry Solutions Security APPLICATION SERVICES Queuing Notifications Search Orchestration Email ENTERPRISE APPS Virtual Desktops Storage Gateway Sharing & Collaboration Email & Calendaring Directories HYBRID CLOUD MANAGEMENT Backups Deployment Direct Connect Identity Federation Integrated Management SECURITY & MANAGEMENT Virtual Private Networks Identity & Access Encryption Keys Configuration Monitoring Dedicated INFRASTRUCTURE SERVICES Regions Availability Zones Compute Storage Databases SQL, NoSQL, Caching CDN Networking PLATFORM SERVICES App Mobile & Web Front-end Functions Identity Data Store Real-time Development Containers Source Code Build Tools Deployment DevOps Mobile Sync Identity Push Notifications Mobile Analytics Mobile Backend Analytics Data Warehousing Hadoop Streaming Data Pipelines Machine Learning
  16. 16. Solutions Architects
  18. 18. APPLICATION SERVICES Queuing Notifications Search Orchestration Email SECURITY & MANAGEMENT Virtual Private Networks Identity & Access Encryption Keys Configuration Monitoring Dedicated INFRASTRUCTURE SERVICES Regions Availability Zones Compute Storage Databases SQL, NoSQL, Caching CDN Networking PLATFORM SERVICES App Mobile & Web Front-end Functions Identity Data Store Real-time Development Containers Source Code Build Tools Deployment DevOps Mobile Sync Identity Push Notifications Mobile Analytics Mobile Backend Analytics Data Warehousing Hadoop Streaming Data Pipelines Machine Learning
  19. 19. AWS building blocks • Inherently highly available and fault-tolerant services: Amazon CloudFront, Amazon Route 53, Amazon S3, Amazon DynamoDB, Elastic Load Balancing, Amazon EFS, AWS Lambda, Amazon SQS, Amazon SNS, Amazon SES, Amazon SWF, … • Highly available with the right architecture: Amazon EC2, Amazon EBS, Amazon RDS, Amazon VPC
  20. 20. So let’s start from…
  21. 21. 1 user You
  22. 22. 1 User • Amazon Route 53 for DNS • A single Elastic IP • A single Amazon EC2 instance • With full stack on this host • Web app • Database • Management • And so on… (Diagram: User, Amazon Route 53, Elastic IP, Amazon EC2 instance)
  23. 23. “We’re gonna need a bigger box” • Simplest approach • Can now leverage Provisioned IOPS (PIOPS) • High I/O instances • High memory instances • High CPU instances • High storage instances • Easy to change instance sizes • Will hit a ceiling eventually (Example instance sizes: t2.micro, m3.2xlarge, c4.8xlarge)
  25. 25. 1 User • We could potentially get to a few hundred to a few thousand depending on application complexity and traffic • No failover • No redundancy • Too many eggs in one basket EC2 Instance Elastic IP User Amazon Route 53
  27. 27. Users >1
  28. 28. Users > 1 First, let’s separate out our single host into more than one. • Web • Database • Make use of a database service? (Diagram: User, Amazon Route 53, Elastic IP, Web Instance, Database Instance)
  29. 29. Database options • Self-managed: database server on Amazon EC2 (your choice of database running on Amazon EC2; Bring Your Own License (BYOL)) • Fully managed: Amazon RDS (Microsoft SQL Server, Oracle, MySQL, PostgreSQL, MariaDB, Amazon Aurora; BYOL or license included); Amazon DynamoDB (managed NoSQL database service using SSD storage; seamless scalability; zero administration); Amazon Redshift (massively parallel, petabyte-scale data warehouse service; fast, powerful, and easy to scale)
  31. 31. Amazon Aurora • Automatic storage scaling (up to 64 TB) • Up to 15 read-replicas • Continuous (incremental) backups to Amazon S3 • 6-way replication across 3 AZs • MySQL Compatible
  32. 32. To NoSQL, or not to NoSQL?
  33. 33. Some folks won’t like this, but…
  34. 34. Start with SQL databases
  35. 35. Why start with SQL? • Established and well-worn technology. • Lots of existing code, communities, books, and tools. • You aren’t going to break SQL DBs in your first 10 million users. No, really, you won’t.* • Clear patterns to scalability. *Unless you are doing something SUPER peculiar with the data or you have MASSIVE amounts of it. …but even then SQL will have a place in your stack.
  36. 36. AH HA! You said “massive amounts,” and I will have massive amounts!
  37. 37. > 5 TB in year one? Incredibly data intensive workload? OK! You might need NoSQL.
  38. 38. Why else might you need NoSQL? • Super low-latency applications • Metadata-driven datasets • Highly nonrelational data • Need schema-less data constructs* • Massive amounts of data (again, in the TB range) • Rapid ingest of data (thousands of records/sec) *Need != “It’s easier to do dev without schemas”
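The rules of thumb on the last few slides can be written down as a rough checklist. The sketch below is only illustrative: the `Workload` fields and the function name are hypothetical, and the thresholds (5 TB in year one, thousands of records per second) are taken from the slides as heuristics rather than hard limits.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """Hypothetical description of an application's data needs."""
    data_tb_year_one: float          # expected data volume in the first year, in TB
    ingest_records_per_sec: int      # sustained write rate
    needs_schemaless: bool = False   # a real requirement, not developer convenience
    highly_nonrelational: bool = False


def reasons_for_nosql(w: Workload) -> List[str]:
    """Return the slide criteria this workload meets; an empty list means start with SQL."""
    reasons = []
    if w.data_tb_year_one > 5:
        reasons.append("more than 5 TB in year one")
    if w.ingest_records_per_sec >= 1000:
        reasons.append("rapid ingest (thousands of records/sec)")
    if w.needs_schemaless:
        reasons.append("needs schema-less data constructs")
    if w.highly_nonrelational:
        reasons.append("highly nonrelational data")
    return reasons


# Two made-up workloads for illustration.
typical_startup = Workload(data_tb_year_one=0.2, ingest_records_per_sec=50)
clickstream = Workload(data_tb_year_one=12, ingest_records_per_sec=5000)
```

An empty result for `typical_startup` matches the talk's advice: unless one of these conditions genuinely applies, start with SQL.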
  39. 39. Users >100
  40. 40. Users >100 First, let’s separate out our single host into more than one: • Web • Database • Use Amazon RDS to make your life easier (Diagram: User, Amazon Route 53, Elastic IP, Web instance, RDS DB instance)
  41. 41. Users >1000
  42. 42. Users >1000 Next, let’s address our lack of failover and redundancy: • Another web instance, in another Availability Zone • RDS Multi-AZ • Elastic Load Balancing (ELB) (Diagram: User, Amazon Route 53, ELB; one Web Instance in each of two Availability Zones; RDS DB Instance Active (Multi-AZ) in one zone and RDS DB Instance Standby (Multi-AZ) in the other)
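A quick piece of arithmetic shows why a second instance in another Availability Zone matters. The sketch below assumes failures are independent and uses a made-up 99% figure for a single instance; neither assumption comes from the talk, and real failures are often correlated.

```python
def parallel_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures.

    The group is down only when every copy is down at the same time,
    which happens with probability (1 - single) ** copies.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    return 1.0 - (1.0 - single) ** copies


# Hypothetical per-instance availability of 99%.
one_instance = parallel_availability(0.99, 1)
two_instances = parallel_availability(0.99, 2)
```

Under these assumptions, going from one instance to two moves the tier from roughly 99% to roughly 99.99% availability, provided something in front (here, the load balancer) stops sending traffic to the failed copy.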
  43. 43. Elastic Load Balancing • Highly available • Ports 1–65535 • Health checks • Session stickiness • Secure Sockets Layer (SSL) • Monitoring • Logging
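To make the "health checks" bullet concrete, here is a toy round-robin balancer that skips instances marked unhealthy. It is a teaching sketch only and not how Elastic Load Balancing is implemented; the class name and instance names are hypothetical.

```python
class RoundRobinBalancer:
    """Toy balancer: rotate through instances, skipping unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._next = 0  # index of the next instance to try

    def mark(self, instance, is_healthy):
        """Record the result of a health check for one instance."""
        if is_healthy:
            self.healthy.add(instance)
        else:
            self.healthy.discard(instance)

    def route(self):
        """Return the next healthy instance in round-robin order."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = RoundRobinBalancer(["web-a", "web-b"])
first, second = balancer.route(), balancer.route()
balancer.mark("web-a", False)   # pretend web-a failed its health check
after_failure = [balancer.route() for _ in range(3)]
```

Once `web-a` is marked unhealthy, every request lands on `web-b`, which is the failover behavior the two-zone design relies on.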
  44. 44. horizontally vertically
  45. 45. Users > 10,000s–100,000s (Diagram: User, Amazon Route 53, ELB; eight Web Instances and four RDS DB Instance Read Replicas spread across two Availability Zones; RDS DB Instance Active (Multi-AZ) in one zone and RDS DB Instance Standby (Multi-AZ) in the other)
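Read replicas only help if the application sends reads to them. One common way to do that is a small router in the data-access layer, sketched below. The endpoint names are hypothetical, and classifying statements by their first keyword is a simplification. Replicas are typically updated asynchronously, so reads that must see a just-committed write may still need to go to the primary.

```python
import itertools


class ReadWriteRouter:
    """Send SELECT statements to replicas in rotation, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() rotates through replicas forever; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter(
    primary="primary.db.example.internal",
    replicas=["replica-1.db.example.internal", "replica-2.db.example.internal"],
)
```

A call such as `router.endpoint_for("SELECT * FROM users")` returns a replica, while an `INSERT` or `UPDATE` always returns the primary endpoint.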
  46. 46. What about performance and efficiency?
  47. 47. Lighten the Load
  48. 48. Shift some load around Move… • static content to Amazon S3 and Amazon CloudFront (Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, ELB, Web Instances, RDS DB Instance Active (Multi-AZ), Availability Zone)
  49. 49. Amazon Simple Storage Service (S3) • Object-based storage • Highly durable • Great for static assets • “Infinitely scalable” • Objects up to 5 TB in size • Optional encryption
  50. 50. Amazon CloudFront • Cache content for faster delivery • Lower load on origin • Dynamic and static content • Streaming video • Custom SSL certificates • Low TTLs (as short as 0 seconds) • Free origin fetches? • Optimized for AWS
  51. 51. Amazon CloudFront (Charts: response time and server load with no CDN, with a CDN for static content, and with a CDN for static and dynamic content; volume of data delivered in Gbps, 8:00 AM to 9:00 PM)
  52. 52. Shift some load around Move… • static content to Amazon S3 and Amazon CloudFront • session/state to Amazon DynamoDB • DB caching to Amazon ElastiCache (Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, ELB, Web Instances, ElastiCache, DynamoDB, RDS DB Instance Active (Multi-AZ), Availability Zone)
  53. 53. Amazon DynamoDB • Managed NoSQL database • Provisioned throughput • Fast, predictable performance • Fully distributed, fault tolerant • JSON support • Items up to 400 KB
  54. 54. Amazon ElastiCache • Managed Memcached or Redis • Scale from one to many nodes • Self-healing (replaces dead instance) • Single-digit ms speeds (usually) • Local to a single AZ for Memcached • Multi-AZ possible with Redis
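"DB caching" usually means the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result with a TTL. The sketch below uses an in-process dictionary as a stand-in for Memcached or Redis so it runs anywhere; the class name, the 60-second default TTL, and the `fake_db` loader are assumptions for illustration.

```python
import time


class CacheAside:
    """Cache-aside reads with a per-entry TTL, backed by a plain dict."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # called only on a cache miss
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # hit: the database is not touched
        value = self._load(key)        # miss or expired: go to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row is written, so readers refetch it."""
        self._entries.pop(key, None)


db_calls = []


def fake_db(key):
    """Hypothetical slow database lookup that records each call."""
    db_calls.append(key)
    return key.upper()


cache = CacheAside(fake_db)
values = [cache.get("user:1"), cache.get("user:1"), cache.get("user:1")]
```

Three reads of the same key cost one database call here. With a shared cache service instead of a local dict, that saving applies across every web instance.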
  55. 55. Shift some load around Move… • static content to Amazon S3 and Amazon CloudFront • session/state to Amazon DynamoDB • DB caching to Amazon ElastiCache • dynamic content to Amazon CloudFront (Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, ELB, Web Instances, ElastiCache, DynamoDB, RDS DB Instance Active (Multi-AZ), Availability Zone)
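Moving session state off the web tier is what makes web instances interchangeable. In the sketch below, an in-memory `SessionStore` stands in for an external table such as one in DynamoDB; the class and method names are hypothetical and no AWS API is called.

```python
import json
import uuid


class SessionStore:
    """In-memory stand-in for an external session table keyed by session ID."""

    def __init__(self):
        self._items = {}

    def put(self, session_id, data):
        # Store a serialized copy, as a remote store would.
        self._items[session_id] = json.dumps(data)

    def get(self, session_id):
        raw = self._items.get(session_id)
        return json.loads(raw) if raw is not None else {}


class WebInstance:
    """A web server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        session.setdefault("cart", []).append(item)
        self.store.put(session_id, session)
        return session["cart"]


store = SessionStore()
web_a = WebInstance("web-a", store)
web_b = WebInstance("web-b", store)

session_id = str(uuid.uuid4())
web_a.add_to_cart(session_id, "book")         # first request lands on web-a
cart = web_b.add_to_cart(session_id, "lamp")  # next request lands on web-b
```

Because the cart lives in the shared store, the second request sees the first item even though a different instance handled it, so instances can be added or terminated without losing user state.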
  56. 56. Now that our web tier is much more lightweight, we can revisit the beginning of our talk…
  57. 57. Auto Scaling!
  58. 58. Auto Scaling • Automatic resizing of compute clusters • Define min/max pool sizes • CloudWatch metrics drive scaling • On-demand or Spot instances • Example: aws autoscaling create-auto-scaling-group --auto-scaling-group-name MyGroup --launch-configuration-name MyConfig --min-size 4 --max-size 200 --availability-zones us-west-2c us-west-2b
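The idea that "metrics drive scaling" within min/max bounds can be illustrated with a simple proportional rule: resize the group so that average CPU moves toward a target, then clamp to the pool limits. This is a sketch of the general idea, not the Auto Scaling service's actual algorithm; the 50% target is an assumption, while the bounds of 4 and 200 mirror the CLI example on the slide.

```python
import math


def desired_capacity(current, avg_cpu, target_cpu=50.0, min_size=4, max_size=200):
    """Instance count that would bring average CPU near the target, clamped to the pool.

    current:    instances running now
    avg_cpu:    average CPU utilization across the group, in percent
    target_cpu: utilization we would like to run at, in percent (assumed value)
    """
    if current < 1 or target_cpu <= 0:
        raise ValueError("need at least one instance and a positive target")
    wanted = math.ceil(current * avg_cpu / target_cpu)
    return max(min_size, min(max_size, wanted))


scale_out = desired_capacity(current=10, avg_cpu=80.0)   # busy: grow the group
scale_in = desired_capacity(current=10, avg_cpu=10.0)    # idle: shrink, but not below min
capped = desired_capacity(current=150, avg_cpu=100.0)    # overloaded: stop at max
```

In practice a cooldown period between adjustments is also needed so the group does not oscillate; that is left out here for brevity.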
  59. 59. Typical weekly traffic to Amazon.com (Chart: traffic by day, Sunday to Saturday)
  60. 60. Typical weekly traffic to Amazon.com (Chart: traffic by day, Sunday to Saturday, with provisioned capacity overlaid)
  61. 61. November traffic to Amazon.com (Chart: traffic across November)
  62. 62. November traffic to Amazon.com (Chart: traffic across November, with provisioned capacity overlaid)
  63. 63. November traffic to Amazon.com (Chart: traffic across November against provisioned capacity, with areas labeled 76% and 24%)
  64. 64. November traffic to Amazon.com (Chart: traffic across November)
  65. 65. Auto Scaling lets you do this!
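The point of the traffic charts is utilization: capacity provisioned for the peak sits idle most of the time. The arithmetic below uses made-up daily load figures (not Amazon.com data) and an assumed 25% headroom to compare fixed provisioning with capacity that follows demand.

```python
def utilization(demand, capacity):
    """Fraction of provisioned capacity actually used over the whole period."""
    if len(demand) != len(capacity):
        raise ValueError("demand and capacity must cover the same periods")
    return sum(demand) / sum(capacity)


# Hypothetical load for one week, in arbitrary units, with one spike.
demand = [20, 25, 30, 100, 40, 25, 20]

# Fixed provisioning: enough for the peak, every day.
fixed = [max(demand)] * len(demand)

# Scaled provisioning: follow demand with 25% headroom (assumed figure).
scaled = [d * 1.25 for d in demand]

fixed_util = utilization(demand, fixed)     # 260 / 700
scaled_util = utilization(demand, scaled)   # 260 / 325
```

With these numbers, fixed provisioning uses about 37% of what is paid for, while the scaled fleet uses 80%. The exact figures depend entirely on the shape of the traffic.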
  66. 66. (Legend: one user, 100,000 users, 1,000,000 users)
  67. 67. Users >500,000
  68. 68. Users > 500,000+ (Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, ELB, DynamoDB; in each of two Availability Zones, three Web Instances, ElastiCache, and an RDS DB Instance Read Replica; RDS DB Instance Active (Multi-AZ) in one zone and RDS DB Instance Standby (Multi-AZ) in the other)
  70. 70. Use automation
  71. 71. AWS application management solutions, from convenience (higher-level services) to control (do it yourself): AWS Elastic Beanstalk, AWS OpsWorks, AWS CloudFormation, Amazon EC2
  72. 72. AWS CodeDeploy • Deploys your code to a “fleet” of EC2 instances • 1 – 10,000s of instances • Automatically schedules updates (multiple AZs) • Application and Deployment groups described in YAML-formatted files • Can reference Auto Scaling Groups • AWS Management Console, CLI, or APIs • Can be used with Chef recipes or Puppet scripts
  73. 73. Users >500,000+ • Monitoring, metrics, and logging • If you can’t build it internally, outsource it! (third-party SaaS) • What are customers saying? • Try to squeeze as much performance out of each service/component
  74. 74. AGGREGATE LEVEL METRICS LOG ANALYSIS EXTERNAL SITE PERFORMANCE HOST LEVEL METRICS
  75. 75. There are further improvements to be made in breaking apart our web/app layer
  76. 76. SOA What does this mean?
  78. 78. This is NOT where we want to start! This IS where we want to start! Now that’s a lot of things to read!
  79. 79. SOAing Move services into their own tiers. • Treat them separately and scale them independently. Amazon and AWS do this extensively! It offers flexibility and greater understanding of each component
  80. 80. Loose coupling + SOA = winning DON’T REINVENT THE WHEEL • Email • Queuing • Transcoding • Search • Databases • Monitoring • Metrics • Logging • Compute (Services shown: Amazon CloudSearch, Amazon SQS, Amazon SNS, Amazon Elastic Transcoder, Amazon SWF, Amazon SES, AWS Lambda)
  81. 81. Application Services – Amazon SQS • Reliable (Multi-AZ) • Scalable (unlimited messages) • Secure (queue authentication) • Simple (simple APIs) (Diagram labels: Put Message, Get Message, Instance, SQS messages, Amazon SNS Topic, Publish Notification, Queue Is Subscribed to Topic)
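The put-message/get-message pattern decouples the instance that accepts work from the instance that performs it. The sketch below uses Python's standard `queue.Queue` as a local stand-in for an SQS queue; the function names and the thumbnail example are hypothetical. A real queue service may deliver a message more than once, so handlers are best written to be idempotent.

```python
import queue

work_queue = queue.Queue()   # local stand-in for a message queue service


def put_message(q, body):
    """Producer side: hand off work and return immediately."""
    q.put(body)


def drain(q, handler):
    """Consumer side: process every message currently queued; return how many."""
    handled = 0
    while True:
        try:
            body = q.get_nowait()
        except queue.Empty:
            return handled
        handler(body)
        q.task_done()
        handled += 1


# The web tier enqueues uploads; a worker tier resizes them later.
resized = []
for name in ["a.jpg", "b.jpg", "c.jpg"]:
    put_message(work_queue, name)

handled_count = drain(work_queue, lambda body: resized.append("thumb-" + body))
```

Because the producer never calls the worker directly, each side can be scaled, deployed, or restarted independently, and the queue absorbs bursts.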
  82. 82. Compute / Platform – AWS Lambda • Functions triggered by events • JavaScript, Java… and Python • Managed • Implicit scaling S3 Bucket Lambda Push: Event Notification DynamoDB Pull: DynamoDB Stream Kinesis Pull: Kinesis Stream
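As an illustration of "functions triggered by events", here is a minimal Python handler for the S3 push case. It follows the general shape of an S3 event notification (a `Records` list with bucket name and object key), but the sample event is trimmed to just those fields and the bucket and key are invented. Object keys in these events are URL-encoded, hence the decoding step.

```python
import urllib.parse


def handler(event, context):
    """Collect (bucket, key) pairs from an S3 event notification.

    `event` is the notification payload; `context` is unused in this sketch.
    """
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded (for example, spaces as '+'), so decode them.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    return uploaded


# Trimmed, made-up event for local testing.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "photos/cat+1.jpg"}}}
    ]
}

result = handler(sample_event, None)
```

Keeping the handler a plain function of its input makes it easy to exercise locally with a sample event before wiring it to a real bucket.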
  83. 83. Loose coupling sets you free! The looser they're coupled, the bigger they scale • Independent components • Design everything as a black box • Decouple interactions • Favor services with built-in redundancy and scalability rather than building your own (Diagram labels: S3 Bucket, Push: Event Notification; DynamoDB, Pull: DynamoDB Stream; Amazon Kinesis, Pull: Kinesis Stream; Lambda; SQS messages, Put Message, Get Message, Instance; Amazon SNS Topic, Publish Notification, Queue Is Subscribed to Topic)
  84. 84. Users >1,000,000
  85. 85. Users >1 million+ Reaching a million and above is going to require some bit of all the previous things: • Multi-AZ • Elastic Load Balancing between tiers • Auto Scaling • Service Oriented Architecture • Serving content smartly (Amazon S3/CloudFront ) • Caching off DB • Moving state off tiers that auto scale
  86. 86. Users >1 million+ (Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, ELB, four Web Instances, two Internal App Instances, two Worker Instances, Amazon SQS, Amazon SES, Lambda, DynamoDB, ElastiCache, Amazon CloudWatch, RDS DB Instance Active (Multi-AZ), two RDS DB Instance Read Replicas, Availability Zone)
  87. 87. The next big steps
  88. 88. Users >10,000,000
  89. 89. Users >5 million - 10 million You’ll potentially start to run into issues with your database around contention on the write master. How can you solve it? • Federation—splitting into multiple DBs based on function • Sharding—splitting one dataset up across multiple hosts • Moving some functionality to other types of DBs (NoSQL, Graph)
  90. 90. Database federation • Split up databases by function/purpose • Harder to do cross-function queries • Essentially delays sharding/NoSQL • Won’t help with single huge functions/tables Forums DB Users DB Products DB
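Federation can be as simple as a lookup from application function to database endpoint. The sketch below uses the three databases named on the slide with invented hostnames. The `fetch` callable is a hypothetical data-access helper, included to show why cross-function queries get harder: the join moves into application code.

```python
# Hypothetical endpoints, one database per function as on the slide.
DATABASE_FOR_FUNCTION = {
    "forums": "forums-db.example.internal",
    "users": "users-db.example.internal",
    "products": "products-db.example.internal",
}


def database_for(function):
    """Return the endpoint that owns the given application function."""
    try:
        return DATABASE_FOR_FUNCTION[function]
    except KeyError:
        raise ValueError("no database registered for function %r" % function) from None


def user_with_posts(user_id, fetch):
    """Cross-function 'join' done in the application: two queries, two databases.

    `fetch(endpoint, query_name, user_id)` is an assumed helper, not a real API.
    """
    user = fetch(database_for("users"), "user_by_id", user_id)
    posts = fetch(database_for("forums"), "posts_by_user", user_id)
    return {"user": user, "posts": posts}


calls = []


def fake_fetch(endpoint, query_name, user_id):
    """Test double that records which endpoint each query was sent to."""
    calls.append((endpoint, query_name))
    return "%s(%s)" % (query_name, user_id)


profile = user_with_posts(42, fake_fetch)
```

Each database now carries only its own function's load, but a single huge table (say, all forum posts) still lives on one host, which is where sharding comes in.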
  91. 91. Sharded horizontal scaling • More complex at the application layer • No practical limit on scalability • Operation complexity/sophistication • Shard by function or key space • RDBMS or NoSQL (Example mapping of User to ShardID: 002345 → A, 002346 → B, 002347 → C, 002348 → B, 002349 → A; shards A, B, C)
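Two common ways to pick a shard are a directory (an explicit user-to-shard table, as in the slide's example) and a hash of the key. The sketch below shows both. The directory contents are the five rows from the slide; the choice of SHA-256 and the function names are assumptions for illustration.

```python
import hashlib

SHARDS = ["A", "B", "C"]

# Directory-based sharding: the mapping from the slide's example table.
SHARD_DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_by_directory(user_id):
    """Look the user up in an explicit table; flexible, but the table must be stored."""
    return SHARD_DIRECTORY[user_id]


def shard_by_hash(user_id, shards=SHARDS):
    """Derive the shard from a stable hash of the key; no table, but harder to rebalance."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


directory_choice = shard_by_directory("002347")
hash_choice = shard_by_hash("002347")
```

A directory lets individual users be moved between shards by editing one row. Plain modulo hashing reassigns most keys when the shard count changes, which is one source of the operational complexity the slide mentions.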
  92. 92. Shifting functionality to NoSQL • Similar in a sense to federation • Again, think about the earlier points for when you need NoSQL vs. SQL • Leverage managed services like DynamoDB Some use cases: • Leaderboards/scoring • Rapid ingest of clickstream/log data • Temporary data needs (cart data) • “Hot” tables • Metadata/lookup tables (Service shown: DynamoDB)
  93. 93. A quick review
  94. 94. A quick review • Multi-AZ your infrastructure. • Make use of self-scaling services—ELB, Amazon S3, Amazon SNS, Amazon SQS, Amazon SWF, Amazon SES, and more. • Build in redundancy at every level. • Start with SQL. Seriously. • Cache data both inside and outside your infrastructure. • Use automation tools in your infrastructure.
  95. 95. A quick review continued • Make sure you have good metrics/monitoring/logging tools in place • Split tiers into individual services (SOA) • Use Auto Scaling once you’re ready for it • Don’t reinvent the wheel • Move to NoSQL if and when it makes sense
  96. 96. Putting all this together means we should now easily be able to handle 11+ million users!
  97. 97. To infinity...
  98. 98. User >11 million Iterating on top of the patterns seen here will get you up and over 100 million users
  99. 99. • More fine-tuning of your application • More SOA of features/functionality • Going from Multi-AZ to multi-region • Possibly start to build custom solutions • Deep analysis of your entire stack User >11 million
  100. 100. Next steps? READ! aws.amazon.com/documentation aws.amazon.com/architecture aws.amazon.com/start-ups START USING AWS: aws.amazon.com/free/
  101. 101. Ask for Help! forums.aws.amazon.com aws.amazon.com/premiumsupport/ Your Account Manager A Solutions Architect
  102. 102. Thank you! Joel Williams
  103. 103. Remember to complete your evaluations!
  104. 104. Related Sessions • DVO303 - Scaling Infrastructure Operations with AWS Service Catalog, AWS Config, and AWS CloudTrail: Friday, Oct 9, 9:00 AM - 10:00 AM – Lido 3001B • DVO201 - Scaling Your Web Applications with AWS Elastic Beanstalk: Friday, Oct 9, 9:00 AM - 10:00 AM – Titan 2306 • CMP201 - All You Need To Know About Auto Scaling: Friday, Oct 9, 10:15 AM - 11:15 AM – San Polo 3506
